Name – Sanket V. Butoliya
UID – U95365115
Major - Business Analytics & Information Systems
Predicting Customer Churn Using WEKA 3.8
Introduction:
Data set Analysis Weka 3.8 Classification Accuracy Accuracy Plot
Problem Statement:
A cellular service provider wants to analyze customer data to predict whether a
customer is going to churn, and to identify the critical factors that cause customers to
churn so that preventive action can be taken based on those factors.
Flow of Presentation:
• Introduction
• Problem Statement
• Dataset Overview
• Data Analysis and Baseline deduction
• ZeroR
• Predictive model
• Decision Tree
• Neural networks
• Naive Bayes
• IBk (KNN = 50)
• Visualization in Excel
• Conclusion
Attributes Information

No.  Attribute            Values
1    College?             Zero, One
2    Income               Numeric
3    Overage              Numeric
4    Leftover             Numeric
5    House value          Numeric
6    Handset price        Numeric
7    Avg long calls       Numeric
8    Avg duration         Numeric
9    Satisfaction         Avg, Sat, Unsat, Very_sat, Very_unsat
10   Usage level          Avg, high, little, very_high, Very_little
11   Considering change?  No, considering, perhaps, never_thought, actively_looking_into_it
12   Retained?            STAY, LEAVE

Predicted attribute = Customer Retained?
Number of Instances = 5000
Number of Attributes = 12
Missing Attribute Values = None
Variables Worth
ZeroR
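ZeroR ignores every predictor and always outputs the most frequent class seen in the training data, so its accuracy is the floor any real model has to beat. The sketch below is a minimal plain-Python illustration of that idea, not Weka's implementation; the toy labels are hypothetical and are not taken from the churn dataset. It also shows how a baseline can fall below 50 % on a two-class problem when the majority class in the held-out portion differs from the one in the training portion.

```python
from collections import Counter


def zero_r_fit(train_labels):
    """Return the most frequent class label in the training data."""
    return Counter(train_labels).most_common(1)[0][0]


def constant_accuracy(predicted_label, test_labels):
    """Fraction of test instances whose label equals the constant prediction."""
    hits = sum(1 for label in test_labels if label == predicted_label)
    return hits / len(test_labels)


# Hypothetical toy labels (assumption for illustration only).
train = ["STAY", "STAY", "LEAVE", "STAY", "LEAVE"]
test = ["LEAVE", "STAY", "LEAVE", "LEAVE"]

majority = zero_r_fit(train)                  # majority class of the training part
baseline = constant_accuracy(majority, test)  # accuracy of always predicting it
print(majority, baseline)
```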
Decision Tree
Panels: 60-40 Split, 70-30 Split, 80-20 Split, 90-10 Split
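As a rough guide to what these split labels mean for a dataset of 5000 instances, the sketch below computes training and test sizes. It assumes the first number in each label is the training percentage and that the training size is rounded to the nearest integer; the helper name `split_sizes` is hypothetical.

```python
def split_sizes(n_instances, train_percent):
    """Return (train_size, test_size) for a percentage split."""
    train_size = round(n_instances * train_percent / 100)
    return train_size, n_instances - train_size


N_INSTANCES = 5000  # number of instances stated in the dataset overview

sizes = {pct: split_sizes(N_INSTANCES, pct) for pct in (60, 70, 80, 90)}
for pct, (train_size, test_size) in sizes.items():
    print(f"{pct}-{100 - pct} split: {train_size} train, {test_size} test")
```

Under these assumptions the 90-10 split evaluates on only 500 instances, so its accuracy estimate is the noisiest of the four.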
Neural Networks
Panels: 60-40 Split, 70-30 Split, 80-20 Split, 90-10 Split
Naive Bayes
Panels: 60-40 Split, 70-30 Split, 80-20 Split, 90-10 Split
IBk - KNN = 50
Panels: 60-40 Split, 70-30 Split, 80-20 Split, 90-10 Split
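IBk is Weka's k-nearest-neighbour classifier; with KNN = 50 each prediction is a vote among the 50 closest training instances. The sketch below illustrates the voting idea in plain Python with unnormalized Euclidean distance and a tiny hypothetical dataset. It is an assumption-laden illustration, not IBk itself, and it uses k = 3 only because the toy data has five rows.

```python
import math
from collections import Counter


def knn_predict(train_rows, train_labels, query, k):
    """Predict the majority label among the k training rows nearest to query."""
    # Pair each training row's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature toy data (assumption for illustration only).
rows = [(0.0, 0.0), (0.1, 0.2), (0.9, 1.0), (1.0, 0.8), (0.2, 0.1)]
labels = ["STAY", "STAY", "LEAVE", "LEAVE", "STAY"]

near_origin = knn_predict(rows, labels, (0.15, 0.15), k=3)
near_corner = knn_predict(rows, labels, (0.95, 0.90), k=3)
print(near_origin, near_corner)
```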
Overage Attribute - Tableau
Accuracy Table (accuracy in %)

Percentage Split   Decision Tree   Neural networks   Naive Bayes   IBk (KNN = 50)
60-40              68.2            62.55             63.4          62.2
70-30              68.8            62.8              64.13         64.2
80-20              70.8            63.8              63.8          63.7
90-10              67.8            63.6              62.4          62.6

ZeroR baseline: 47.8 %
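To make the comparison in the accuracy table easier to read, the sketch below averages each model's accuracy over the four splits and reports the gain over the ZeroR baseline. The figures are copied from the table; the function name `summarize` and the choice of a simple unweighted mean are assumptions made for illustration.

```python
# Accuracy (%) per model for the 60-40, 70-30, 80-20 and 90-10 splits.
ACCURACY = {
    "Decision Tree": [68.2, 68.8, 70.8, 67.8],
    "Neural networks": [62.55, 62.8, 63.8, 63.6],
    "Naive Bayes": [63.4, 64.13, 63.8, 62.4],
    "IBk (KNN = 50)": [62.2, 64.2, 63.7, 62.6],
}
ZERO_R = 47.8  # baseline accuracy (%) reported in the table


def summarize(table, baseline):
    """Return (model, mean accuracy, gain over baseline), best model first."""
    rows = []
    for model, scores in table.items():
        mean = sum(scores) / len(scores)
        rows.append((model, mean, mean - baseline))
    return sorted(rows, key=lambda row: row[1], reverse=True)


summary = summarize(ACCURACY, ZERO_R)
for model, mean, gain in summary:
    print(f"{model:<16} mean {mean:6.2f} %   gain over ZeroR {gain:+6.2f}")
```

The decision tree has the highest accuracy in every split, and all four models clear the baseline.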
Accuracy Analysis
(Chart: accuracy from 40 to 75 across the 60-40, 70-30, 80-20, and 90-10 splits; series: Decision Tree, Neural networks, Naive Bayes, IBk, ZeroR)
Model Comparison - Excel
Thank You…
