Analyzing Churn Rate with Classification Models


Objective: Reduce a telecommunications company's churn rate (originally 14.49%) by building predictive models from past customers' churn data and by identifying churners' characteristics with machine learning techniques.

Tools used:
> RStudio (ggplot2, dplyr)
> Weka (for model prototyping)
> MS Excel, MS PowerPoint

Techniques used:
> Decision tree
> Naive Bayes
> Random forest
> K-means clustering analysis
> Recommender system


Colleen Bobbie, Stephen De Medicis, Iris Huang, Krean Naidoo, Calvin Reid
CIND 119 Class Project
Group Project: Analyzing Churn Rate

Data Preparation
• Derived attributes
• Categorized attributes
• Summary statistics
• Distribution

Data Preparation (continued)
• Correlation plot
• Simple logistic regression (to find the attributes most linked to churn)
• Significant imbalance in the churn attribute
• Derived values: NA replaced with 0
• Area code: discretized
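The last two preparation steps can be illustrated with a minimal sketch. The deck does not say which derived attributes produced NA values or how area codes were binned, so the example assumes a hypothetical "price per call" attribute that is undefined when a customer made no calls, and treats each area code as a categorical label. All field names and sample values are made up for illustration.

```python
from typing import Optional


def price_per_call(charge: float, calls: int) -> Optional[float]:
    """Derived attribute: average charge per call, or None (NA) if there were no calls."""
    if calls == 0:
        return None  # assumption: this is where NA values come from
    return charge / calls


def na_to_zero(value: Optional[float]) -> float:
    """Replace a missing derived value with 0, as the slide describes."""
    return 0.0 if value is None else value


def discretize_area_code(code: int) -> str:
    """Treat the numeric area code as a category label, not a quantity."""
    return f"area_{code}"


# Hypothetical customer records; field names are assumptions, not the project's schema.
customers = [
    {"area_code": 415, "day_charge": 45.07, "day_calls": 110},
    {"area_code": 408, "day_charge": 0.0, "day_calls": 0},
]

prepared = [
    {
        "area_code": discretize_area_code(row["area_code"]),
        "day_price_per_call": na_to_zero(
            price_per_call(row["day_charge"], row["day_calls"])
        ),
    }
    for row in customers
]
```

Treating the area code as a label keeps a classifier from reading meaning into its numeric size.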

Predictive Modeling: Classification
Our predictive modeling covers three classification algorithms:
• Decision Tree
• Naïve Bayes
• Random Forest
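Each classifier is reported both with and without SMOTE (Synthetic Minority Over-sampling Technique), which addresses class imbalance by creating synthetic minority-class records. The deck does not show how SMOTE was applied, so the following is only a minimal pure-Python sketch of the general idea: pick a minority record, pick one of its k nearest minority neighbours, and interpolate between them. The function name, parameters, and sample data are assumptions for illustration.

```python
import math
import random
from typing import List, Sequence


def smote(minority: List[Sequence[float]], n_synthetic: int,
          k: int = 5, seed: int = 0) -> List[List[float]]:
    """Create n_synthetic points by interpolating between minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical churner records with two numeric features.
churners = [[1.0, 5.0], [2.0, 4.0], [1.5, 6.0], [3.0, 5.5]]
new_points = smote(churners, n_synthetic=6, k=2)
```

In practice, oversampling is usually applied to the training portion only, so that synthetic records do not leak into the data used for evaluation.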

K-Means Clustering our ‘Churners’

“Cathy Complainers”
• Total charges
• Day charges
• Day price per call
• Price per call
• Customer service calls
• Voicemail messages

“Danny Daytimes”
• Voicemail messages
• Customer service calls
• Total charges
• Day charges
• Day price per call
• Total price per minute

“Irene Internationals”
• Customer service calls
• Voicemail messages
• International charges
• International price per call

*Comparisons amongst ‘churners’ only
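To illustrate how churners can be grouped into segments like these, here is a minimal pure-Python k-means sketch. It is not the project's actual clustering run: the three features, the sample values, and the farthest-first choice of starting centroids are all assumptions made to keep the example small and deterministic.

```python
import math
from typing import List, Sequence, Tuple


def farthest_first(points: List[Sequence[float]], k: int) -> List[List[float]]:
    """Deterministic start: first point, then repeatedly the point farthest from all chosen."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points: List[Sequence[float]], k: int,
           max_iter: int = 100) -> Tuple[List[List[float]], List[int]]:
    """Plain k-means: alternate assignment and centroid update until labels stop changing."""
    centroids = farthest_first(points, k)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical churners: [customer service calls, day charge, international charge].
churner_features = [
    [5, 30, 2], [6, 28, 3],    # many service calls
    [1, 60, 2], [2, 62, 3],    # high daytime charges
    [1, 30, 9], [2, 29, 10],   # high international charges
]
centroids, labels = kmeans(churner_features, k=3)
```

With real data the features would normally be scaled first, because an attribute with a large range (such as charges) otherwise dominates the distance calculation.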

Recommendations
1. Tailored Phone Plans
2. Implementation Tool
3. Data Collection Improvements

1. Retention Recommendations

“Cathy Complainers”
• Lower-cost retention plan with free voicemail
• Offer to customers who call the call centre frequently

“Danny Daytimes”
• Retention plan with discounted or additional daytime/evening minutes
• Offer at early contact, or offer proactively

“Irene Internationals”
• Retention plan with discounted or additional international minutes, or a better international plan
• Offer at early contact, or offer proactively

Accuracy (%)

Classifier     Evaluation      Unsmoted   Smoted    Smoted Pruned   Unsmoted Pruned
Decision Tree  66%/33% split   97.4404    81.182    93.831          97.7052
Decision Tree  3-fold          97.4197    81.5154   93.9031         97.5998
Decision Tree  10-fold         97.5398    81.4273   94.2379         97.5098
Naïve Bayes    66%/33% split   84.8191    81.0783   72.1617         85.7899
Naïve Bayes    3-fold          84.9385    81.7621   72.3348         85.9886
Naïve Bayes    10-fold         84.3684    82.5198   72.3348         86.1386
Random Forest  66%/33% split   86.496     97.0969   93.9347         97.3522
Random Forest  3-fold          88.8689    95.9648   93.4449         97.4497
Random Forest  10-fold         89.679     96.1938   94.6079         97.5998
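Because the churn attribute is imbalanced, accuracy figures are easier to read against a majority-class baseline. The short calculation below assumes the evaluation data has the same 14.49% churn rate stated in the project objective (the deck does not confirm this) and compares the 10-fold "Unsmoted" accuracies with a model that always predicts "no churn".

```python
# Churn rate from the project objective; a model that always predicts
# "no churn" is right for every non-churner.
churn_rate_pct = 14.49
baseline_accuracy_pct = round(100.0 - churn_rate_pct, 2)

# 10-fold "Unsmoted" accuracies as reported in the accuracy table.
reported = {
    "Decision Tree": 97.5398,
    "Naive Bayes": 84.3684,
    "Random Forest": 89.679,
}

# Percentage points gained (or lost) relative to the majority-class baseline.
lift = {name: round(acc - baseline_accuracy_pct, 2) for name, acc in reported.items()}

for name, points in lift.items():
    print(f"{name}: {points:+.2f} points vs. baseline {baseline_accuracy_pct}%")
```

Under that assumption, a classifier needs to exceed roughly 85.5% accuracy before it adds anything over the trivial rule, which is one reason the deck also reports a false positive rate.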

False Positive Rate (FP rate of class 'False')

Classifier     Evaluation      Unsmoted   Smoted   Smoted Pruned   Unsmoted Pruned
Decision Tree  66%/33% split   0.14       0.368    0.078           0.124
Decision Tree  3-fold          0.164      0.364    0.081           0.149
Decision Tree  10-fold         0.155      0.366    0.068           0.157
Naïve Bayes    66%/33% split   0.446      0.13     0.309           0.5
Naïve Bayes    3-fold          0.482      0.117    0.303           0.516
Naïve Bayes    10-fold         0.487      0.104    0.303           0.511
Random Forest  66%/33% split   0.823      0.036    0.109           0.161
Random Forest  3-fold          0.768      0.033    0.113           0.174
Random Forest  10-fold         0.712      0.032    0.094           0.164
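The "FP rate of class 'False'" label reads like a per-class metric of the kind Weka prints. Assuming that convention (the deck does not define it), the figure is the share of actual churners that a model wrongly labels as non-churners. The sketch below computes accuracy and a per-class false positive rate from hypothetical predictions; the function name and data are illustrative only.

```python
from typing import Dict, List


def evaluate(actual: List[bool], predicted: List[bool],
             target_class: bool) -> Dict[str, float]:
    """Accuracy plus the false positive rate measured for target_class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == target_class:
            if a == target_class:
                tp += 1
            else:
                fp += 1  # predicted as target_class but actually the other class
        else:
            if a == target_class:
                fn += 1
            else:
                tn += 1
    negatives = fp + tn  # all records that are NOT target_class
    return {
        "accuracy": (tp + tn) / len(actual),
        "fp_rate": fp / negatives if negatives else 0.0,
    }


# Hypothetical data: 10 customers, 2 churners (True = churned).
actual = [False] * 8 + [True] * 2
# A model that never predicts churn.
predicted = [False] * 10

result = evaluate(actual, predicted, target_class=False)
```

Here the model scores 80% accuracy yet has an FP rate of 1.0 for class 'False', because every churner was labelled a non-churner. Under this reading, a lower value in the table means fewer missed churners.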


Slide outline

1. Title slide: Group Project, Analyzing Churn Rate (CIND 119 Class Project)
2. Project Description: Churn
3. Data Preparation
4. Data Preparation: derived attributes, categorized attributes, summary statistics, distribution
5. Data Preparation: correlation plot, simple logistic regression, churn imbalance, NA handling, area code discretization
6. Data Preparation (same text as slide 5)
7. Predictive Analysis
8. SMOTE and Cross-Validation
9. Predictive Modeling: Classification
10. Predictive Modeling Interpretation
11. Post-Predictive Analysis
12. K-Means Clustering our ‘Churners’
13. Recommendations
14. 1. Retention Recommendations
15. 2. Implementation Tools
16. 3. Data Collection Improvements
17. Thank You! Any Questions?
18. Accuracy
19. False Positive (where FP rate of Class 'False')
20. Random Forest
