Problem
´ Explore the dataset to determine its quality and specify data quality issues
identified. Apply the pre-processing and visualisation techniques used in
completing the Day 2 Group work.
´ Describe how you dealt with the data quality issues encountered
´ Modify the R code provided to fit a logistic regression model to predict
churn
´ Identify the variables that contribute most to predicting churn
´ What business insights can be derived from the analysis?
Preprocessing Data
Data cleaning:
´ Dropped four unwanted columns.
´ Converted categorical variables to numerical variables.
´ No NULL records were found.
Exploratory Data Analysis:
´ Graphs of the numerical variables show a normal distribution.
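The cleaning steps above can be sketched in a few lines. This is a minimal illustration in Python, not the R code the assignment refers to (which is not shown in the slides). The records, the `count_missing` and `dummy_encode` helpers, and the choice to drop the first level as the reference category are all assumptions made for the example; only the column names are taken from the slide titles.

```python
# Hypothetical records: column names mirror the slide titles, values are made up.
rows = [
    {"gender": "Female", "Contract": "Month-to-month", "Churn": "Yes"},
    {"gender": "Male", "Contract": "Two year", "Churn": "No"},
    {"gender": "Male", "Contract": "One year", "Churn": "No"},
]


def count_missing(rows):
    """Count None or empty-string values per column (the NULL check)."""
    missing = {}
    for row in rows:
        for col, value in row.items():
            if value is None or value == "":
                missing[col] = missing.get(col, 0) + 1
    return missing


def dummy_encode(rows, column):
    """Replace one categorical column with 0/1 dummy columns.

    The first level in sorted order is dropped and acts as the reference
    category (an assumption; the slides do not say which level was dropped).
    """
    levels = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for level in levels[1:]:
            new_row[f"{column}_{level}"] = 1 if row[column] == level else 0
        encoded.append(new_row)
    return encoded


print(count_missing(rows))  # empty dict when no values are missing

encoded = dummy_encode(dummy_encode(rows, "gender"), "Contract")
for row in encoded:
    row["Churn"] = 1 if row["Churn"] == "Yes" else 0  # target as 0/1
print(encoded[0])
```

Dropping one level per variable avoids perfectly collinear dummy columns once an intercept is included in the model.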
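´ Gender count comparison for churn = 0 and churn = 1: the male and
female counts are both high, at around 2500 each, for churn = 0,
compared with fewer than 1000 each for churn = 1. The gap between
the two churn classes reflects class imbalance. Because the male
and female counts are nearly equal within each class, gender on
its own is unlikely to be a strong predictor of churn.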
[Figure: Gender count, churn-0 vs churn-1. Bar chart of customer counts (0 to 3000) by churn class (0, 1), grouped by gender (Female, Male).]
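One way to judge whether a categorical variable such as gender separates churners from non-churners is to compare the churn rate within each category rather than the raw counts. The sketch below assumes a small cross-tabulation of counts. The numbers are illustrative placeholders on the same scale as the chart, not values read from the dataset, and `churn_rate_by_category` is a hypothetical helper.

```python
# Illustrative counts only: (category, churn class) -> number of customers.
# Placeholders on the chart's scale, not values taken from the dataset.
counts = {
    ("Female", 0): 2500, ("Female", 1): 900,
    ("Male", 0): 2600, ("Male", 1): 900,
}


def churn_rate_by_category(counts):
    """Return the share of churn = 1 customers within each category."""
    totals, churned = {}, {}
    for (category, churn), n in counts.items():
        totals[category] = totals.get(category, 0) + n
        if churn == 1:
            churned[category] = churned.get(category, 0) + n
    return {c: churned.get(c, 0) / totals[c] for c in totals}


rates = churn_rate_by_category(counts)
for category, rate in rates.items():
    print(f"{category}: churn rate = {rate:.3f}")
```

Similar rates across categories suggest the variable adds little on its own, while clearly different rates suggest it may be worth keeping in the model.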
Gender & Partner Count
[Figure: two bar charts of customer counts (0 to 3000) by churn class (0, 1). Gender panel legend: Female, Male. Partner panel legend: No, Yes.]
Dependent & PhoneService Count
[Figure: two bar charts of customer counts by churn class (0, 1). Dependent panel: counts 0 to 4000, legend No, Yes. PhoneService panel: counts 0 to 5000, legend No, Yes.]
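MultipleLines & InternetService Count
[Figure: two bar charts of customer counts by churn class (0, 1). MultipleLines panel: counts 0 to 3000, legend No, No phone service, Yes. InternetService panel: counts 0 to 2500, legend DSL, Fiber optic, No.]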
OnlineSecurity & OnlineBackup
[Figure: two bar charts of customer counts (0 to 2500) by churn class (0, 1). Both panels use the legend No, No internet service, Yes.]
Payment Method and PaperlessBilling Count
[Figure: two bar charts of customer counts by churn class (0, 1). Payment Method panel: counts 0 to 1400, legend Bank transfer (automatic), Credit card (automatic), Electronic check, Mailed check. PaperlessBilling panel: counts 0 to 3000, legend No, Yes.]
Contract & StreamingMovie Count
[Figure: two bar charts of customer counts (0 to 2500) by churn class (0, 1). Contract panel legend: Month-to-month, One year, Two year. StreamingMovie panel legend: No, No internet service, Yes.]
StreamingTV & TechSupport Count
[Figure: two bar charts of customer counts by churn class (0, 1). StreamingTV panel: counts 0 to 2000. TechSupport panel: counts 0 to 2500. Both panels use the legend No, No internet service, Yes.]
Logistic Regression Model
´ A binary LR model works on a categorical (qualitative) dependent variable
that can take two values, e.g. Yes/No.
´ Multinomial LR models handle dependent variables with three or more possible categories.
´ LR models estimate the probability of occurrence of an event by maximizing
the log-likelihood function, not by the ordinary least squares method used by
linear regression models.
´ A binary logistic regression model estimates the probability of occurrence of
the dependent variable Y, which presents itself in dichotomous form (0/1).
´ $z = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$
´ $z$ is known as the logit; $\alpha, \beta_1, \beta_2, \dots, \beta_k$ are the estimated parameters for
the explanatory variables $X_1, X_2, \dots, X_k$.
LR model
´ Binary logistic regression defines the logit $z$ as the natural logarithm of the odds, such
that:
´ $\ln\left(\frac{p_i}{1 - p_i}\right) = z_i$
´ $p_i = \frac{1}{1 + e^{-z_i}}$, or $p = f(z)$
´ Substitute categorical variables with dummy
variables, then maximize the log-likelihood function, whose term
for observation $i$ is $LL_i = y_i \ln(p_i) + (1 - y_i) \ln(1 - p_i)$.
Starting from $\alpha = \beta_1 = \beta_2 = \beta_3 = \dots = \beta_k = 0$, calculate $z_i$, $p_i$, and $LL_i$.
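The starting point described above can be checked with a short Python sketch. It assumes a tiny made-up dataset with two dummy-coded predictors; the function names are hypothetical. With every parameter set to 0, each $z_i$ is 0, each $p_i$ is 0.5, and each $LL_i$ equals $\ln(0.5)$.

```python
import math


def sigmoid(z):
    """p = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def logit_z(alpha, betas, x):
    """z = alpha + beta_1 * x_1 + ... + beta_k * x_k for one observation."""
    return alpha + sum(b * xj for b, xj in zip(betas, x))


def log_likelihood_terms(alpha, betas, X, y):
    """Per-observation terms LL_i = y_i ln(p_i) + (1 - y_i) ln(1 - p_i)."""
    terms = []
    for x_i, y_i in zip(X, y):
        p_i = sigmoid(logit_z(alpha, betas, x_i))
        terms.append(y_i * math.log(p_i) + (1 - y_i) * math.log(1 - p_i))
    return terms


# Hypothetical data: two dummy-coded predictors, four observations.
X = [[1, 0], [0, 1], [1, 1], [0, 0]]
y = [1, 0, 1, 0]

# All parameters start at zero, so every p_i is 0.5.
ll_terms = log_likelihood_terms(0.0, [0.0, 0.0], X, y)
total_ll = sum(ll_terms)
print(ll_terms)
print(total_ll)  # equals 4 * ln(0.5) for these four observations
```

Any parameter values that fit the data better than a coin flip raise the sum above this baseline, which is what the maximization step searches for.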
Logistic Regression
´ Cells M3, M5, M7, and M9 contain the respective estimated parameters. Use Solver to
adjust them so that the sum of the $LL_i$, computed from $z_i$ and $p_i$ over the data file, is maximized.
´ Unlike traditional regression models estimated by ordinary least squares, there is no
$R^2$ giving the percentage of variance explained by the predictor variables. A more
adequate criterion for choosing the best model is the ROC (receiver operating
characteristic) curve.
´ The $\chi^2$ test is used to verify the significance of the model. Its hypotheses are:
´ $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$
´ $H_1$: there is at least one $\beta_j \neq 0$
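The maximization and the significance test can be sketched outside a spreadsheet. The Python example below is a stand-in under stated assumptions, not the Solver setup from the slides: it uses plain gradient ascent instead of Solver, a made-up dataset with a single predictor, and it assumes the $\chi^2$ test is the likelihood-ratio test that compares the fitted model with an intercept-only model. With one predictor the test has one degree of freedom, so the p-value can be computed with `math.erfc`.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(alpha, beta, xs, ys):
    """Sum of LL_i = y_i ln(p_i) + (1 - y_i) ln(1 - p_i)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(alpha + beta * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total


def fit(xs, ys, lr=0.1, steps=20000):
    """Maximize the log-likelihood by gradient ascent, starting from zeros."""
    alpha, beta = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [y - sigmoid(alpha + beta * x) for x, y in zip(xs, ys)]
        grad_alpha = sum(residuals) / n
        grad_beta = sum(r * x for r, x in zip(residuals, xs)) / n
        alpha += lr * grad_alpha
        beta += lr * grad_beta
    return alpha, beta


# Hypothetical data: one predictor, classes overlap so the fit is finite.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 1, 0, 1, 1, 0, 1]

alpha, beta = fit(xs, ys)
ll_model = log_likelihood(alpha, beta, xs, ys)

# Intercept-only model: every p_i equals the observed share of events.
p_bar = sum(ys) / len(ys)
ll_null = sum(y * math.log(p_bar) + (1 - y) * math.log(1 - p_bar) for y in ys)

# Likelihood-ratio chi-square statistic with 1 degree of freedom.
chi2_stat = max(2.0 * (ll_model - ll_null), 0.0)
p_value = math.erfc(math.sqrt(chi2_stat / 2.0))

print(f"alpha={alpha:.3f} beta={beta:.3f}")
print(f"LL model={ll_model:.3f} LL null={ll_null:.3f}")
print(f"chi2={chi2_stat:.3f} p-value={p_value:.3f}")
```

A small p-value would lead to rejecting $H_0$, meaning at least one coefficient differs from zero. With more than one predictor the degrees of freedom equal the number of coefficients tested, and a general chi-square distribution function is needed instead of `math.erfc`.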
Logistic Regression
´ Confusion matrix based on cutoff = 0.5, giving the number of FP, TP, FN, and
TN
´ OME (overall model efficiency) = (TP + TN) / total number of observations
´ Sensitivity = % of hits, for a given cutoff, among observations that
are in fact events: TP / (TP + FN)
´ Specificity = % of hits, for a given cutoff, among observations that
are not events: TN / (TN + FP)
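These measures can be computed directly from predicted probabilities. The following Python sketch assumes made-up labels and probabilities, and it treats a probability equal to the cutoff as a predicted event, which is a convention chosen for the example rather than something the slides state.

```python
def confusion_counts(y_true, p_hat, cutoff=0.5):
    """Return (TP, FP, FN, TN) for a given classification cutoff."""
    tp = fp = fn = tn = 0
    for y, p in zip(y_true, p_hat):
        predicted = 1 if p >= cutoff else 0  # p == cutoff counts as an event
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical observed outcomes and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
p_hat = [0.9, 0.6, 0.3, 0.2, 0.7, 0.1, 0.4, 0.8]

tp, fp, fn, tn = confusion_counts(y_true, p_hat, cutoff=0.5)
ome = (tp + tn) / (tp + fp + fn + tn)   # overall model efficiency
sensitivity = tp / (tp + fn)            # hits among actual events
specificity = tn / (tn + fp)            # hits among actual non-events

print(tp, fp, fn, tn)                   # 3 1 1 3
print(ome, sensitivity, specificity)    # 0.75 0.75 0.75
```

Lowering the cutoff generally raises sensitivity at the cost of specificity, which is the trade-off the ROC curve traces across all cutoffs.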
Logistic Regression Workflow
LR Results
´ Overall Accuracy: 0.80
´ Avg_AUC: 0.84
´ Avg_F1: 0.58
Precision Vs Recall Graph
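The quantities behind the reported AUC and F1 values and the precision vs recall graph can be sketched as follows. This Python example assumes made-up labels and probabilities and does not reproduce the reported results. The helper names are hypothetical, and returning a precision of 1.0 when nothing is predicted positive is a convention chosen for the example.

```python
def precision_recall_f1(y_true, p_hat, cutoff):
    """Precision, recall and F1 when p >= cutoff is classified as an event."""
    pairs = list(zip(y_true, p_hat))
    tp = sum(1 for y, p in pairs if p >= cutoff and y == 1)
    fp = sum(1 for y, p in pairs if p >= cutoff and y == 0)
    fn = sum(1 for y, p in pairs if p < cutoff and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0  # convention when no positives
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def auc(y_true, p_hat):
    """Share of (event, non-event) pairs ranked correctly; ties count half."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    score = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                score += 1.0
            elif pp == pn:
                score += 0.5
    return score / (len(pos) * len(neg))


# Hypothetical observed outcomes and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
p_hat = [0.9, 0.6, 0.3, 0.2, 0.7, 0.1, 0.4, 0.8]

# Each cutoff gives one point on a precision vs recall curve.
for cutoff in (0.25, 0.5, 0.75):
    precision, recall, f1 = precision_recall_f1(y_true, p_hat, cutoff)
    print(f"cutoff={cutoff}: precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")

print(f"AUC={auc(y_true, p_hat)}")  # 0.8125 for this toy data
```

Because F1 depends on the cutoff while AUC does not, a model can show a reasonably high AUC together with a modest F1 at cutoff 0.5, especially when the event class is the minority.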

Churn Prediction on customer data
