03. Satisfaction Prediction Model

Expert · 2 hours

1. Overview and Scenario

Situation: Waiting for customers to respond to surveys is too late; by then the customer may have already left. What if, the moment a support interaction ends, we could predict “This customer’s satisfaction will be 2 out of 5”? A manager could intervene immediately and send an apology coupon.

This is the core of a Churn Prevention System.


2. Data Preparation and Feature Engineering

Select the variables (Features) to use for prediction.

  • Target: csat_score (or binary classification of satisfied/dissatisfied)
  • Features: first response time (response_hours), resolution time (resolution_hours), issue_type, priority, channel

    # With BigQuery ML, you can model using SQL only
    from google.cloud import bigquery
    client = bigquery.Client()
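
As a rough illustration of the feature engineering step, the sketch below derives the hour-based features and the binary label for one ticket in plain Python. The column names follow the tables queried in this lesson; the ticket values are hypothetical, and the helper names (whole_hours_between, satisfaction_label) are made up for this example. It assumes hour differences are truncated to whole hours.

```python
from datetime import datetime


def whole_hours_between(later: datetime, earlier: datetime) -> int:
    """Elapsed whole hours, truncated toward zero (90 minutes counts as 1)."""
    return int((later - earlier).total_seconds() / 3600)


def satisfaction_label(csat_score: int) -> str:
    """Binary target: 'High' for a score of 4 or more, otherwise 'Low'."""
    return "High" if csat_score >= 4 else "Low"


# Hypothetical ticket, for illustration only.
ticket = {
    "opened_at": datetime(2024, 1, 5, 9, 0),
    "first_response_at": datetime(2024, 1, 5, 10, 30),
    "resolved_at": datetime(2024, 1, 6, 9, 15),
    "issue_type": "shipping",
    "priority": "high",
    "csat_score": 2,
}

# Numeric features are measured from the moment the ticket was opened.
features = {
    "response_hours": whole_hours_between(
        ticket["first_response_at"], ticket["opened_at"]
    ),
    "resolution_hours": whole_hours_between(
        ticket["resolved_at"], ticket["opened_at"]
    ),
    "issue_type": ticket["issue_type"],
    "priority": ticket["priority"],
}
label = satisfaction_label(ticket["csat_score"])

print(features, label)
```

Measuring both durations from opened_at keeps the two features on the same clock, so a slow first response also shows up in the resolution time.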

3. Model Training

❓ Problem 1: Train Classification Model

Q. Create a classification model that predicts High if csat_score is 4 or higher, and Low if 3 or lower.

💡 Hint: Use CREATE OR REPLACE MODEL ... OPTIONS(model_type='LOGISTIC_REG').

Solution:

    CREATE OR REPLACE MODEL `your-project-id.retail_analytics_us.csat_classifier`
    OPTIONS(
      model_type='LOGISTIC_REG',
      input_label_cols=['satisfaction_label']
    ) AS
    SELECT
      -- Target: High for scores of 4 or 5, Low otherwise
      IF(s.csat_score >= 4, 'High', 'Low') AS satisfaction_label,
      -- Numeric features: whole hours elapsed since the ticket was opened
      TIMESTAMP_DIFF(t.first_response_at, t.opened_at, HOUR) AS response_hours,
      TIMESTAMP_DIFF(t.resolved_at, t.opened_at, HOUR) AS resolution_hours,
      -- Categorical features
      t.issue_type,
      t.priority,
      t.channel
    FROM `your-project-id.retail_analytics_us.cs_tickets_dummy` t
    JOIN `your-project-id.retail_analytics_us.survey_cs_dummy` s
      ON t.ticket_id = s.related_ticket_id
    -- Train only on solved tickets that have a survey response
    WHERE t.status = 'solved'
      AND s.csat_score IS NOT NULL;
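
To make the model type less of a black box, here is a minimal sketch of how a logistic regression turns features into a probability and then a label. The intercept and weights are hypothetical numbers chosen for illustration, not values from a trained model, and only the two numeric features are used; categorical inputs such as issue_type are left out for brevity.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, NOT taken from a trained model.
INTERCEPT = 2.0
WEIGHTS = {"response_hours": -0.08, "resolution_hours": -0.03}


def prob_high(response_hours: float, resolution_hours: float) -> float:
    """P(satisfaction_label = 'High') under the hypothetical coefficients."""
    z = (
        INTERCEPT
        + WEIGHTS["response_hours"] * response_hours
        + WEIGHTS["resolution_hours"] * resolution_hours
    )
    return sigmoid(z)


def predict_label(p_high: float, threshold: float = 0.5) -> str:
    """Turn a probability into a class label using a decision threshold."""
    return "High" if p_high >= threshold else "Low"


fast = prob_high(response_hours=1, resolution_hours=2)
slow = prob_high(response_hours=24, resolution_hours=48)

print(round(fast, 3), predict_label(fast))
print(round(slow, 3), predict_label(slow))
```

With negative weights on both durations, longer waits push the score down and the predicted probability of High falls with it.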

4. Model Evaluation and Prediction

Let’s check how accurate the model is and apply it to real data.

❓ Problem 2: Model Evaluation

Q. Check the Accuracy and ROC-AUC of the trained model.

SELECT * FROM ML.EVALUATE(MODEL `your-project-id.retail_analytics_us.csat_classifier`);

Example Results:

precision  recall  accuracy  roc_auc
0.85       0.82    0.84      0.91
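
For readers who want to see what these four numbers mean, the sketch below computes them by hand on a tiny labelled sample. The labels and scores are hypothetical, "High" is treated as the positive class (an assumption for this example), and the function names are illustrative only.

```python
def classification_metrics(y_true, y_pred, positive="High"):
    """Precision, recall and accuracy from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / len(pairs),
    }


def roc_auc(y_true, scores, positive="High"):
    """Chance that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical evaluation sample: true labels and predicted P(High).
y_true = ["High", "High", "High", "Low", "Low", "High", "Low", "Low"]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.7, 0.3, 0.1]
y_pred = ["High" if s >= 0.5 else "Low" for s in scores]

metrics = classification_metrics(y_true, y_pred)
metrics["roc_auc"] = roc_auc(y_true, scores)
print(metrics)
```

Precision, recall and accuracy depend on the 0.5 cutoff, while ROC-AUC only looks at how well the scores rank satisfied customers above dissatisfied ones.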

❓ Problem 3: Real-Time Prediction (What-If)

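Q. Score customers who haven’t responded to the survey yet (or hypothetical scenarios). How does the predicted satisfaction probability change when the response time is 1 hour versus 24 hours? Note that the two scenarios in the query also differ in resolution time (2 hours versus 48 hours), so the comparison reflects both factors together.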

    SELECT
      response_hours,
      predicted_satisfaction_label,
      predicted_satisfaction_label_probs
    FROM ML.PREDICT(
      MODEL `your-project-id.retail_analytics_us.csat_classifier`,
      (
        SELECT 1 AS response_hours, 2 AS resolution_hours,
               'shipping' AS issue_type, 'high' AS priority, 'email' AS channel
        UNION ALL
        SELECT 24 AS response_hours, 48 AS resolution_hours,
               'shipping' AS issue_type, 'high' AS priority, 'email' AS channel
      )
    );
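
One way to act on these predictions, tying back to the apology-coupon scenario, is to read the per-label probabilities and flag tickets that are likely to end up dissatisfied. The sketch below assumes each row's predicted_satisfaction_label_probs is a list of label/prob pairs; the rows, the probabilities, and the 0.7 alert threshold are all hypothetical.

```python
def prob_of(label, probs):
    """Return the probability attached to `label` in a list of label/prob pairs."""
    for entry in probs:
        if entry["label"] == label:
            return entry["prob"]
    raise KeyError(f"label {label!r} not found")


# Hypothetical rows shaped like the prediction output; probabilities are invented.
rows = [
    {
        "response_hours": 1,
        "predicted_satisfaction_label": "High",
        "predicted_satisfaction_label_probs": [
            {"label": "High", "prob": 0.86},
            {"label": "Low", "prob": 0.14},
        ],
    },
    {
        "response_hours": 24,
        "predicted_satisfaction_label": "Low",
        "predicted_satisfaction_label_probs": [
            {"label": "High", "prob": 0.20},
            {"label": "Low", "prob": 0.80},
        ],
    },
]

ALERT_THRESHOLD = 0.7  # hypothetical cutoff for manager intervention


def needs_intervention(row, threshold=ALERT_THRESHOLD):
    """Flag a ticket when the predicted chance of 'Low' reaches the threshold."""
    return prob_of("Low", row["predicted_satisfaction_label_probs"]) >= threshold


flagged = [row["response_hours"] for row in rows if needs_intervention(row)]
print(flagged)
```

Using a threshold on the probability rather than the predicted label lets the team tune how many tickets get escalated.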

5. Feature Importance

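Which factors had the greatest impact on satisfaction? BigQuery ML provides the ML.FEATURE_IMPORTANCE function, but it is available only for tree-based models (boosted tree and random forest), so it does not work on a LOGISTIC_REG model such as csat_classifier. For a logistic regression model, inspect the learned coefficients with ML.WEIGHTS instead; standardizing them makes the numeric features comparable.

    SELECT *
    FROM ML.WEIGHTS(
      MODEL `your-project-id.retail_analytics_us.csat_classifier`,
      STRUCT(TRUE AS standardize)
    );

  • Results: response_hours (response speed) or resolution_hours (resolution speed) often carry the largest weights in absolute value.
  • Insight: If so, you might conclude that “responding quickly is a strong driver of satisfaction”. The model has no feature for conversation tone, so it cannot tell you whether speed matters more than politeness.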
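
For a logistic regression model, one common way to rank features is by the absolute size of their standardized coefficients. The sketch below ranks a handful of hypothetical rows laid out like ML.WEIGHTS output (processed_input, weight, category_weights); the numbers are invented, and the "__INTERCEPT__" row name is an assumption about how the intercept is reported.

```python
def rank_by_magnitude(weight_rows):
    """Rank features by absolute coefficient size, largest first."""
    ranked = []
    for row in weight_rows:
        if row["processed_input"] == "__INTERCEPT__":
            continue  # the intercept is not a feature
        if row["weight"] is not None:
            # Numeric feature: a single coefficient.
            magnitude = abs(row["weight"])
        else:
            # Categorical feature: use its largest per-category coefficient.
            magnitude = max(
                (abs(c["weight"]) for c in row["category_weights"]),
                default=0.0,
            )
        ranked.append((row["processed_input"], magnitude))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical standardized weights, for illustration only.
weight_rows = [
    {"processed_input": "response_hours", "weight": -0.9, "category_weights": []},
    {"processed_input": "resolution_hours", "weight": -0.6, "category_weights": []},
    {
        "processed_input": "priority",
        "weight": None,
        "category_weights": [
            {"category": "high", "weight": -0.3},
            {"category": "low", "weight": 0.2},
        ],
    },
    {
        "processed_input": "issue_type",
        "weight": None,
        "category_weights": [
            {"category": "shipping", "weight": -0.1},
            {"category": "billing", "weight": 0.15},
        ],
    },
    {"processed_input": "__INTERCEPT__", "weight": 1.2, "category_weights": []},
]

ranking = rank_by_magnitude(weight_rows)
for name, magnitude in ranking:
    print(f"{name}: {magnitude:.2f}")
```

Magnitude shows how strongly a feature moves the prediction; the sign of each weight still has to be read separately to see the direction of the effect.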

💡 Summary

  • Classification: Predict satisfied/dissatisfied, churn/retention, etc.
  • BigQuery ML: Train, evaluate, and predict in one place using only SQL.
  • Actionable Insight: Use modeling results (Feature Importance) to develop business strategies (e.g., improve response speed).

Project 1 Complete! You are now a Full-Stack Data Analyst capable of analyzing, validating, and predicting from data. The next step is to turn all of this into a Streamlit dashboard to share with your team.
