Customer Churn
A Data Science Use Case in Telecom
Chris Chen - Data Analyst @ Shaw Communications
The Problem
Who?
Why?
How?
CRISP-DM: Cross Industry Standard Process for Data Mining
Business Understanding
Data Understanding
Data Preparation
Modeling
Evaluation
Deployment
Business Understanding
Business objectives:
• Reduce customer churn
• Minimize the cost (effort) of retention
• Generate actionable insights
Success criteria:
• Metrics
• Non-metrics
Data Understanding
Data sources
Internal: customer data, product data, transactions, and customer interactions
External
Data quality: missing values, duplicates, outliers, etc.
First insights: a binary classification problem with skewed (imbalanced) classes
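A first pass at this slide's checks might look like the sketch below (pandas; the customers.csv file and the churn column name are hypothetical stand-ins):

import pandas as pd

# Assumed: a flat export with a binary "churn" column (names are hypothetical).
df = pd.read_csv("customers.csv")

# Data quality: missing-value ratios and duplicate rows.
print(df.isna().mean().sort_values(ascending=False).head(10))
print("duplicate rows:", df.duplicated().sum())

# First insight: how skewed is the target?
print(df["churn"].value_counts(normalize=True))  # e.g. ~0.99 non-churn vs ~0.01 churn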
Data Preparation
ETL
Feature selection
Feature engineering
Train / validation / test split (stratified split sketch below)
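A minimal sketch of the stratified train / validation / test split, using scikit-learn on a synthetic stand-in for the prepared data (all names and proportions illustrative):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared feature matrix and churn label (~1% positives).
X, y = make_classification(n_samples=10_000, weights=[0.99], random_state=42)

# 70% train, 15% validation, 15% test; stratify to preserve the imbalance in every split.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=42)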
Feature Selection (subtraction)
• Expert voting system
• Engage SMEs from various backgrounds: tech vs non-tech, marketing vs customer care, management vs frontline sales
• Shortlist the 10-15 features most likely to impact customer churn / retention
Feature Selection (subtraction) - science
Wrapper-based methods
• Random Forest / Boosting Tree - also good hints for feature engineering
Filter methods
• Missing Values Ratio
• Low Variance Filter (less informative features)
• High Correlation Filter (similar features); correlated pairs can be good candidates for interactions, e.g. Age vs Income
• PCA
(scikit-learn sketch below)
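The filter and wrapper ideas above might look like this in scikit-learn; the data, thresholds, and column names are all illustrative, not the author's pipeline:

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical numeric feature frame and churn label for illustration.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(1000, 20)), columns=[f"f{i}" for i in range(20)])
df["f19"] = df["f0"] + rng.normal(scale=0.01, size=1000)   # near-duplicate feature
y = (df["f0"] + rng.normal(size=1000) > 2).astype(int)     # skewed binary label

# Filter: drop features with too many missing values.
df = df.loc[:, df.isna().mean() < 0.5]

# Filter: drop low-variance (barely informative) features.
df = df.loc[:, df.var() > 1e-3]

# Filter: drop one of each highly correlated pair (also interaction candidates).
corr = df.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
df = df.drop(columns=[c for c in upper.columns if (upper[c] > 0.95).any()])

# Wrapper/embedded: Random Forest importances hint at what to keep or engineer.
rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(df, y)
print(pd.Series(rf.feature_importances_, index=df.columns).nlargest(15))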
Feature Engineering (addition)
Business acumen, combined with domain knowledge and model understanding
Ordinal vs Nominal: label encoding, one-hot encoding
Transformation: normalization, log, and so on
Imputation: missing values
Feature interactions
Time series
(pandas sketch below)
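A sketch of the encoding, imputation, transformation, and interaction steps in pandas; the columns (plan_tier, province, monthly_spend, tenure_months) are invented for illustration:

import numpy as np
import pandas as pd

# Hypothetical raw columns for illustration.
df = pd.DataFrame({
    "plan_tier": ["basic", "plus", "premium", "plus"],   # ordinal
    "province": ["AB", "BC", "AB", "ON"],                # nominal
    "monthly_spend": [40.0, np.nan, 120.0, 65.0],
    "tenure_months": [3, 48, 12, 24],
})

# Ordinal -> label encoding that respects the order.
df["plan_tier"] = df["plan_tier"].map({"basic": 0, "plus": 1, "premium": 2})

# Nominal -> one-hot encoding.
df = pd.get_dummies(df, columns=["province"])

# Imputation: fill missing spend with the median.
df["monthly_spend"] = df["monthly_spend"].fillna(df["monthly_spend"].median())

# Transformation: log to tame a skewed dollar amount.
df["log_spend"] = np.log1p(df["monthly_spend"])

# Feature interaction: spend per month of tenure.
df["spend_per_tenure"] = df["monthly_spend"] / df["tenure_months"]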
Modeling
Classification:
Gradient Boosting Tree (GBT) - I'm a big fan of XGBoost
Random Forest (RF)
Logistic Regression (LR) or Elastic Net (EN)
Neural Network (NN)
Support Vector Machine (SVM)
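A minimal XGBoost sketch, assuming the stratified splits from Data Preparation; the hyperparameters are illustrative, and scale_pos_weight is one common way to account for the class imbalance:

from xgboost import XGBClassifier

# Assumed: X_train, y_train, X_val, y_val from the stratified split above.
# Up-weight the rare churn class (ratio of negatives to positives).
spw = (y_train == 0).sum() / max((y_train == 1).sum(), 1)

model = XGBClassifier(
    n_estimators=500,
    learning_rate=0.05,
    max_depth=4,
    scale_pos_weight=spw,
    eval_metric="auc",
)
model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=False)
churn_prob = model.predict_proba(X_val)[:, 1]   # ranked list of retention targets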
Modeling - Metrics
• Precision = TP / (TP + FP)
• Recall = TP / (TP + FN)
• Accuracy = (TP + TN) / (TP + TN + FP + FN)
For a dataset with 99% non-churn customers and 1% churn customers, predicting every customer as non-churn yields 99% accuracy while catching zero churners.
Area Under Curve (AUC) captures the precision vs recall trade-off and is better suited to skewed classification.
Confusion matrix (Predictions vs Actuals):
• True Positive: churn customers correctly predicted as churn
• False Positive: non-churn customers incorrectly predicted as churn
• False Negative: churn customers incorrectly predicted as non-churn
• True Negative: non-churn customers correctly predicted as non-churn
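The same metrics in scikit-learn, illustrating the accuracy pitfall on a 99/1 split (the lazy all-non-churn model from the slide above):

import numpy as np
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, confusion_matrix, roc_auc_score)

# 99% non-churn, 1% churn; a lazy model predicts "non-churn" for everyone.
y_true = np.array([0] * 990 + [1] * 10)
y_lazy = np.zeros_like(y_true)

print(accuracy_score(y_true, y_lazy))                     # 0.99 - looks great, is useless
print(precision_score(y_true, y_lazy, zero_division=0))   # 0.0 - no positives predicted
print(recall_score(y_true, y_lazy))                       # 0.0 - catches no churners
print(confusion_matrix(y_true, y_lazy))                   # [[TN FP], [FN TP]] = [[990 0], [10 0]]

# AUC needs scores, not labels; a constant score is no better than chance.
print(roc_auc_score(y_true, np.zeros_like(y_true, dtype=float)))  # 0.5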
Modeling - Ensemble
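The slide is title-only, so this is just one plausible reading: a soft-voting ensemble over the classifiers listed under Modeling (a sketch, not the author's exact approach):

from sklearn.ensemble import (RandomForestClassifier, GradientBoostingClassifier,
                              VotingClassifier)
from sklearn.linear_model import LogisticRegression

# Average predicted churn probabilities across diverse base models.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=300, random_state=42)),
        ("gbt", GradientBoostingClassifier(random_state=42)),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)   # assumed splits from Data Preparation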
Evaluation - Model excellence vs Business excellence
[Chart: models plotted by accuracy (low to high) vs interpretability (low to high): Random Forest, Boosting, Deep Learning / Neural Network, SVM, Nearest Neighbours, Naive Bayes, Decision Trees, Linear / Logistic Regression]
Deployment
Rule of Thumb:
Business Engagement
Thank you!
