Predicting Customer Churn in Banking Industry
Sonali Gupta
X01527245
MSc in Data Analytics
National College of Ireland
Abstract-- The aim of this project is to predict customer churn in the banking industry using different data mining techniques.
Data mining turns large sets of data into useful information by means of different algorithms. It also helps to explain
banking problems by finding relations, correlations and causal links in corporate data that are not visible because they are
concealed in a large amount of data. In this paper, we use Logistic Regression, Decision Tree, Support Vector Machine (SVM),
K-Nearest Neighbours (KNN) and Artificial Neural Network (ANN). We also compare the accuracy of the models to show their
performance.
Keywords: Data mining, Support vector machine, logistic regression, Artificial neural network.
I. INTRODUCTION
Customer churn is the tendency of customers to stop doing
business with an organization within a certain period of
time. It is a critical concern for every company, and many
researchers have approached the problem from their own
perspective to find a solution for churners. Many banks face
a churn problem. All types of churn lead to losses, and the
loss of loyal, high-value customers creates a problem for an
organization.
Customers have always been a significant part of the growth of
any business. With intense competition in every market, it
is critical to retain loyal, long-term customers. [1]
Customer churn is a key factor in the success or failure of
any industry. Consequently, banks now direct retention
efforts at those customers who are valuable to the company in
order to prevent churn.
Customer churn prediction is an important tool for identifying
those customers who are more likely to leave, and data mining
plays a crucial role in it. Data mining techniques such as
Logistic Regression (LR), Decision Tree (DT), K-Nearest
Neighbor (KNN), Support Vector Machine (SVM) and Artificial
Neural Network (ANN) may be used to predict churners.
II. TYPES OF CHURNERS
Churners are categorized into two types: voluntary and
involuntary. Voluntary churn occurs when a customer decides
to stop doing business with the company, and it is divided
into two parts, deliberate and incidental. Incidental churn
occurs when customers leave because of changes in their own
lives, such as a change in financial condition or a change of
location. Deliberate churn occurs when a customer decides to
leave because, for example, the customer wants a new service
or better quality, or because of social factors. Involuntary
churn is the easiest to identify: the organization decides to
remove customers, for example those who are fraudulent or
non-paying.
Fig-1 Churn Type
III. PROPOSED MODEL
The proposed model consists of five steps: first, identify the
problem; second, select the required dataset; third,
investigate the dataset; fourth, apply the data mining
techniques; and fifth, evaluate and interpret the result.
Fig-2
IV. HYPOTHESIS
Which customers are at higher risk of leaving the bank?
V. BACKGROUND OF THE DATASET
This dataset is of the big international banks and has been
taken from the Kaggle website. Dataset contain 10,000
records and 13 attributes of the customers.
The attributes of this dataset are explained below:
1.CustomerID: This is unique ID of the customer provided by
the bank.
[Fig-2 diagram: Identify the Problem -> Data Selection -> Investigate Dataset -> Data Mining Techniques (KNN, ANN, Decision Tree, SVM) -> Interpret & Evaluate Result]
2.Surname: The surname of the customer, used to identify the
customer.
3.CreditScore: A credit score is a number that reflects the
likelihood that a customer will pay back a debt. Lenders such
as banks and credit card companies look at credit history and
calculate a credit score, which shows them the level of risk.
4.Geography: Location of the bank (France, Spain, Germany).
5.Gender: Male or Female.
6.Age: Age of the customer.
7.Tenure: Number of years of the customer's relationship with
the bank.
8.Balance: The customer’s account balance.
9.NoOfProducts: The number of products provided by the bank
that the customer holds.
10.HasCrCard: Whether the customer has a credit card.
11.IsActiveMember: Whether the customer is an active member
of the bank.
12.EstimatedSalary: The customer’s estimated salary.
13.Exited: Whether the customer has exited from the bank.
VI. DATA PREPARATION DETAILS
First, we checked the dataset for missing values and found
that there were none.
Second, we checked which columns are useful. Some columns,
such as CustomerID and Surname, were not useful, so we
excluded them from the dataset using R.
Third, Geography and Gender were stored in character form in
the dataset, so they were converted into numeric values.
Last, we chose the outcome variable (dependent variable) that
answers the hypothesis. The Exited attribute was selected as
the dependent variable, where “0” means “Non Exit” and “1”
means “Exit”. All columns were then normalized. The dataset
was now ready for applying the techniques for predicting
customer churn.
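The preparation steps above can be sketched as follows. This is an illustrative Python sketch, not the R code used in the project. The sample rows, the integer codes for Geography and Gender, and the choice of min-max scaling are assumptions, since the text does not state which encoding or normalization was applied.

```python
# Hypothetical sample rows; column names follow the attribute list in Section V.
rows = [
    {"CustomerID": 1, "Surname": "A", "CreditScore": 600, "Geography": "France",
     "Gender": "Female", "Age": 40, "Exited": 1},
    {"CustomerID": 2, "Surname": "B", "CreditScore": 800, "Geography": "Spain",
     "Gender": "Male", "Age": 30, "Exited": 0},
    {"CustomerID": 3, "Surname": "C", "CreditScore": 700, "Geography": "Germany",
     "Gender": "Male", "Age": 50, "Exited": 0},
]

DROP = ("CustomerID", "Surname")  # identifier columns excluded from modelling
# Assumed integer codes; the paper does not say which codes were used.
GEOGRAPHY_CODES = {"France": 0, "Spain": 1, "Germany": 2}
GENDER_CODES = {"Female": 0, "Male": 1}


def prepare(rows):
    """Check for missing values, drop identifiers, encode categoricals, min-max scale."""
    cleaned = []
    for row in rows:
        if any(value is None for value in row.values()):
            raise ValueError("missing value found")
        rec = {key: value for key, value in row.items() if key not in DROP}
        rec["Geography"] = GEOGRAPHY_CODES[rec["Geography"]]
        rec["Gender"] = GENDER_CODES[rec["Gender"]]
        cleaned.append(rec)

    # Min-max normalization per column: (x - min) / (max - min).
    for col in list(cleaned[0]):
        values = [rec[col] for rec in cleaned]
        lo, hi = min(values), max(values)
        span = hi - lo
        for rec in cleaned:
            rec[col] = (rec[col] - lo) / span if span else 0.0
    return cleaned


prepared = prepare(rows)
print(prepared[0])
```

With min-max scaling, the 0/1 coding of Exited is left unchanged as long as both classes are present.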
VII. RELATED WORK
T. Vafeiadis et al. [2] predicted customer churn in the
telecom industry using cross-validation and compared the
accuracy of boosted versions of the methods with the
non-boosted versions. Semrl et al. [3] proposed churn
prediction for gym customer retention using Logistic
Regression and a neural network. Shaaban et al. [4] proposed
a churn prediction model using SVM and clustering with the
WEKA software. Oyeniyi et al. [5] proposed churn prediction
in the banking sector using the k-means clustering algorithm
to determine patterns and develop a customer retention
service. Zoric et al. [6] presented a case study of churn
analysis in the banking industry using a neural network with
the help of Alyuda NeuroIntelligence and concluded that
customers who use more services are less likely to leave,
whereas clients who use fewer services are more likely to
leave the bank.
VIII. METHODOLOGIES
Data mining plays a significant role in every Customer
Relationship Management (CRM) framework. It makes it easier
to detect customer behavior, to build and evaluate answers to
the business problem, and to reduce the churn rate in the
banking industry.
In this project, five different techniques have been used to
predict churn, and their accuracy is compared to determine
the best-fitting model. The main aim is to predict those
customers who are likely to leave the bank, based on the
attributes of the dataset.
A. DECISION TREE
The decision tree is a supervised learning technique with a
tree-like structure consisting of a root and nodes. Its
output is easy to understand, and it is commonly used in
CRM-related problems.
In this illustration, all attributes are used as predictors
of the Exited attribute. The tree shows how the response
attribute (Exited) relates to the independent attributes.
80% of the data is used as the training set and 20% as the
test set. The code of the decision tree is given in Fig-3.
Fig-3
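The 80:20 split used for every technique in this section can be sketched as below. This is a minimal Python illustration rather than the project's R code. The function name `train_test_split`, the shuffling step and the seed value are assumptions, as the text does not say how the rows were assigned to each set.

```python
import random


def train_test_split(records, test_fraction=0.2, seed=42):
    """Shuffle a copy of the records and cut it into training and test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seed is an assumed value, for repeatability
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


records = list(range(10_000))  # stand-in for the 10,000 customer rows
train, test = train_test_split(records)
print(len(train), len(test))
```

Fixing the seed keeps the same rows in the test set across runs, which makes the accuracies of different techniques comparable.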
From Fig-4, we can see that the accuracy on the test data is
87%. The confusion matrix table also shows that 160 customers
who exited and 1575 customers who did not exit were predicted
correctly.
Fig- 4
From the decision tree in Fig-5, we can understand the
following:
1.If the customer's age is less than or equal to 42, there is
a 71% chance of exit; if these customers have more than 2.5
products, there is a 2% chance of exit, and if they have fewer
than 2.5 products, there is a 69% chance that they will not
exit from the bank.
2.If the customer's age is greater than 42, there is a 29%
chance that the customer will not exit; if the customer is an
active member, there is a 13% chance of exit, and if the
customer is not active, there is a 16% chance of staying.
Fig- 5 Decision Tree
B. SVM (SUPPORT VECTOR MACHINE)
In this technique, all attributes are compared with respect
to the Exited attribute, and the data is split into training
and testing sets in an 80:20 ratio. Different kernels, namely
rbfdot, laplacedot, besseldot and splinedot, are used to
check the accuracy of the model.
Fig-6
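For reference, two of the kernels named above can be written out directly. The sketch below is in Python, not the R code of the project. It assumes the usual definitions of the Gaussian RBF kernel, exp(-sigma * ||x - y||^2), and the Laplacian kernel, exp(-sigma * ||x - y||). The value of sigma is a placeholder, since the text does not report the value used.

```python
import math


def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel: exp(-sigma * ||x - y||^2)."""
    return math.exp(-sigma * squared_distance(x, y))


def laplace_kernel(x, y, sigma=1.0):
    """Laplacian kernel: exp(-sigma * ||x - y||)."""
    return math.exp(-sigma * math.sqrt(squared_distance(x, y)))


a, b = [0.0, 0.0], [3.0, 4.0]  # hypothetical feature vectors
print(rbf_kernel(a, b, sigma=0.1), laplace_kernel(a, b, sigma=0.1))
```

Both kernels equal 1 for identical points and decay toward 0 as the points move apart; the Laplacian kernel decays more slowly at large distances.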
From Fig-7, the accuracy on the test data using the “rbfdot”
kernel is 85.1%. The model correctly predicts 162 customers
who exited and 1541 customers who did not exit.
Fig-7
C. KNN (K- NEAREST NEIGHBORS)
In this technique, all the independent attributes are
compared with the response variable (Exited). 80% of the data
is used as the training set and 20% as the testing set. Two
values of the number of nearest neighbors are checked, k=3
and k=9. The code of this technique is given below.
Fig-8
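The idea behind the k-nearest-neighbors classifier can be illustrated with a short Python sketch. It is not the code from Fig-8. The helper `knn_predict` and the toy points are hypothetical, and Euclidean distance with majority voting is assumed.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Label a query point by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical normalized feature pairs; label 1 = Exit, 0 = Non Exit.
train_x = [(0.1, 0.1), (0.2, 0.1), (0.15, 0.3), (0.8, 0.9), (0.9, 0.8), (0.7, 0.7)]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_x, train_y, (0.75, 0.8), k=3))
```

An odd k avoids tied votes in a two-class problem, which is one reason values such as 3 and 9 are convenient. Because the method relies on distances, it benefits from the normalization step described in Section VI.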
From Fig-9, the accuracy on the test data when k=3 is 81.1%,
with 144 customers correctly predicted as Exit and 1478
correctly predicted as Not-Exited. When k=9, the accuracy is
82.1%, with 98 customers correctly predicted as likely to
leave and 1544 customers predicted as Not-Exited.
Fig-9
D. ANN (ARTIFICIAL NEURAL NETWORK)
In this technique, all the independent attributes are
compared with respect to the response variable (Exited). 80%
of the data is used as training data and 20% as testing data.
The code of this technique is given in Fig-10.
Fig-10
Fig-11
From Fig-11, the accuracy on the test data when hidden=1 is
83.2%. The model correctly predicted 107 customers as likely
to exit and 1544 customers as Not-Exited.
Fig-12 Neural Network
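A network with a single hidden unit (hidden=1) computes its prediction in two steps, which the Python sketch below illustrates. It is not the code from Fig-10. The logistic activation on both layers, the weights and the 0.5 decision threshold are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_weights, hidden_bias, output_weight, output_bias):
    """Forward pass through a network with one hidden unit and one output unit."""
    hidden = sigmoid(sum(w * xi for w, xi in zip(hidden_weights, x)) + hidden_bias)
    return sigmoid(output_weight * hidden + output_bias)


def classify(probability, threshold=0.5):
    """Turn the network output into a 0/1 label (1 = Exit)."""
    return 1 if probability >= threshold else 0


# Hypothetical weights for two normalized inputs.
p = forward([0.9, 0.2], hidden_weights=[2.0, -1.0], hidden_bias=-0.5,
            output_weight=3.0, output_bias=-1.5)
print(p, classify(p))
```

With only one hidden unit, the decision boundary remains a single linear cut through the inputs, so such a network behaves much like logistic regression.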
E. LOGISTIC REGRESSION
In this technique, we have compared all the independent
variables with the dependent variable and divided the data
into training and testing sets in an 80:20 ratio. The code of
this technique is given in Fig-13.
Fig-13
Fig-14
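To show what fitting a logistic regression involves, the following Python sketch trains a model on a toy one-feature dataset. It is not the code from the figures. Batch gradient descent is used here as a simple stand-in for whatever fitting routine the R code relied on, and the data, learning rate and number of epochs are assumed values.

```python
import math


def sigmoid(z):
    """Logistic function, mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weights, bias):
    """Probability of class 1 (Exit) for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fit_logistic(xs, ys, learning_rate=0.5, epochs=2000):
    """Fit weights and intercept by batch gradient descent on the log-loss."""
    weights = [0.0] * len(xs[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = predict_proba(x, weights, bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / len(xs) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(xs)
    return weights, bias


# Hypothetical data: one normalized feature, label 1 = Exit.
xs = [[0.1], [0.2], [0.3], [0.7], [0.8], [0.9]]
ys = [0, 0, 0, 1, 1, 1]
weights, bias = fit_logistic(xs, ys)
print(predict_proba([0.9], weights, bias), predict_proba([0.1], weights, bias))
```

A customer is then labelled as Exit when the predicted probability is at least 0.5, which is the usual default cut-off.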
From Fig-14, the accuracy on the test data is 83.3%, with 154
customers correctly predicted as exit customers and 1512
customers predicted as Non-Exited.
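Accuracy is the number of correct predictions divided by the size of the test set. The Python sketch below recomputes it from the correctly predicted counts quoted in the text for each technique, so that the results can be compared with the stated accuracies. It assumes a test set of 2,000 rows, that is, 20% of the 10,000 records.

```python
TEST_SIZE = 2000  # assumed: 20% of the 10,000 records

# (correctly predicted Exit, correctly predicted Not-Exit), as quoted in the text
reported = {
    "Decision Tree": (160, 1575),
    "SVM (rbfdot)": (162, 1541),
    "KNN (k=3)": (144, 1478),
    "KNN (k=9)": (98, 1544),
    "ANN (hidden=1)": (107, 1544),
    "Logistic Regression": (154, 1512),
}


def accuracy(correct_exit, correct_not_exit, total=TEST_SIZE):
    """Share of test rows whose class was predicted correctly."""
    return (correct_exit + correct_not_exit) / total


for method, counts in reported.items():
    print(f"{method}: {accuracy(*counts):.4f}")
```

Because only about one customer in five is an actual Exit case in the reported confusion matrices, accuracy is dominated by the Not-Exit class; the two counts are therefore worth reading separately as well.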
IX. INTERPRETATION OF RESULT
A. Confusion Matrix table result
B. Comparison of DT, SVM, KNN, LR, ANN techniques
Method                             Accuracy
Decision Tree                      86.4 %
SVM (Support Vector Machine)       85.7 %
KNN (K Nearest Neighbour)          82.1 %
Logistic Regression                83.3 %
ANN (Artificial Neural Network)    79.1 %
X. CUSTOMER RETENTION SOLUTION
In today’s banking industry, customer retention is a very
important task, because profit is based on transaction volume
rather than on margin: there is not much profit margin on a
single transaction in banking products. Because the industry
relies on volume, it is very important for banks to have a
large number of customers to work with and so increase their
profit base. The banking industry employs many customer
retention techniques, which are as follows:
1.Customizing the product to customer needs and demand.
2.Extending credit to high-end customers according to their
requirements, after analyzing their credit history and income.
3.Conducting surveys to establish customer expectations.
4.Setting up an R&D division to look for solutions to the
problems of today and tomorrow.
5.Building a relationship with the customer based on trust
and understanding.
6.The banking industry runs on customers who return to the
bank. For this to happen flawlessly, banks need to be
customer friendly and ready to go the extra mile (within
legal boundaries) with the customer. Today the banking
scenario is changing around the globe, and so are the
customers and their needs. Earlier, banking was all about
deposits and withdrawals, but over time banks moved into
lending, insurance, currency exchange, business development,
investment and many more areas.
So, banks should keep an eye out for any new sector opening
up in order to retain customers within their banking system,
because it is a very competitive market and everyone is
fighting for a piece of the pie.
XI. CONCLUSION
In this customer churn prediction study, we compared the
accuracy of different supervised learning techniques:
Decision Tree, SVM, KNN, ANN and Logistic Regression.
The decision tree gives the best accuracy, which suggests
that, among the techniques compared, this model is best
suited to predicting customer churn in the banking industry.
REFERENCES
[1] Ahn, “Customer churn analysis: churn determinants and
mediation effects of partial defection in the Korean mobile
telecommunication service industry”.
[2] T. Vafeiadis, “A Comparison of machine learning
techniques for customer churn prediction”.
[3] Semrl, “Churn Prediction Model for Effective Gym
Customer Retention”.
[4] Shaaban, “A Proposed Churn Prediction Model”.
[5] Oyeniyi, “Customer Churn Analysis in Banking Sector
using Data Mining Techniques”.
[6] Zoric, “Predicting Customer Churn in Banking Industry
using Neural Network”.
Confusion matrix table (axis labels: Actual Class, Actual Prediction)
Method                 Class       Not Exit   Exit
Decision Tree          Not Exit    1575       222
                       Exit        43         165
SVM                    Not Exit    1541       271
                       Exit        26         162
KNN                    Not Exit    1478       132
                       Exit        292        144
Logistic Regression    Not Exit    1512       69
                       Exit        265        154
ANN                    Not Exit    1557       47
                       Exit        289        107

一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 

Project crm submission sonali

Predicting Customer Churn in Banking Industry
Sonali Gupta
X01527245
MSc in Data Analytics
National College of Ireland

Abstract-- The aim of this project is to predict customer churn in the banking industry using different data mining techniques. Data mining turns large data sets into useful information by applying different algorithms. It also helps to explain banking problems by finding relations, correlations and causal links in corporate data that are not visible because they are concealed in a large amount of data. In this paper, we use data mining techniques such as Logistic Regression, Support Vector Machine (SVM), K-Nearest Neighbours (KNN) and Artificial Neural Network (ANN). We also compare the accuracy of the models to show their performance.

Keywords: Data mining, Support vector machine, logistic regression, Artificial neural network.

I. INTRODUCTION
Customer churn is the tendency of customers to stop doing business with an organization in a certain period of time. It is a critical concern for every company, and many researchers have approached the problem from their own perspective to find a solution for churners. Many banks face a churn problem. All types of churn lead to losses, and losing loyal, high-value customers creates a problem for an organization. Customers have always been a significant part of the growth of any business. With strong competition in every market, it is critical to retain loyal, long-term customers [1]. Customer churn is a key factor in the success or failure of any industry. Consequently, banks now direct their retention efforts at the customers who are worth the most to the company in order to prevent churn.
Customer churn prediction is an important tool for identifying the customers who are more likely to leave, and data mining plays a crucial role in predicting customer churn.
These data mining techniques include Logistic Regression (LR), Decision Tree (DT), K-Nearest Neighbor (KNN), Support Vector Machine (SVM) and Artificial Neural Network (ANN), which are used to predict churners.

II. TYPES OF CHURNERS
Churners fall into two categories: voluntary and involuntary. Voluntary churn occurs when a customer decides to cease doing business with the company, and is divided into two parts: deliberate and incidental. Incidental churn occurs when customers face changes in their own lives, such as a change in financial condition or a change of location. Deliberate churn occurs when a customer decides to leave, for example because the customer wants a new service or better quality, or because of social factors. Involuntary churn is the easiest to identify: the organization decides to remove the customers, for example because of fraud or non-payment.

Fig-1 Churn Type

III. PROPOSED MODEL
The proposed model consists of five steps: first, identify the problem; second, select the required dataset; third, investigate the dataset; fourth, apply the data mining techniques; and fifth, interpret and evaluate the result.

Fig-2: Identify the Problem -> Data Selection -> Investigate Dataset -> Data Mining Techniques (KNN, ANN, Decision Tree, SVM) -> Interpret & Evaluate Result

IV. HYPOTHESIS
Which customers are at higher risk of leaving the bank?

V. BACKGROUND OF THE DATASET
This dataset covers customers of a large international bank and was taken from the Kaggle website. It contains 10,000 records and 13 attributes of the customers. The attributes are explained below:
1. CustomerID: The unique ID of the customer, provided by the bank.
2. Surname: The surname of the customer, used to identify the customer.
3. CreditScore: A credit score is a number that reflects the likelihood of a customer paying back a loan. Lenders such as banks and credit card companies look at the credit history and calculate a credit score, which shows them the level of risk.
4. Geography: Location of the bank (France, Spain, Germany).
5. Gender: Male or Female.
6. Age: Age of the customer.
7. Tenure: Number of years the customer has had a relationship with the bank.
8. Balance: The customer's account balance.
9. NoOfProducts: The number of bank products the customer holds.
10. HasCrCard: Whether the customer has a credit card.
11. IsActiveMember: Whether the customer is an active member of the bank.
12. EstimatedSalary: The customer's estimated salary.
13. Exited: Whether the customer has exited from the bank.

VI. DATA PREPARATION DETAILS
First, we checked the dataset for missing values and found none. Second, we checked which columns are useful; some columns, such as CustomerID and Surname, were not, so we excluded them from the dataset using R. Third, Geography and Gender were stored as character values, which were converted to numeric values. Finally, the outcome (dependent) variable was chosen so that it answers the hypothesis: the Exited attribute was selected, where "0" denotes "Non Exit" and "1" denotes "Exit". All columns were then normalized. The dataset was now ready for applying the techniques for predicting customer churn.

VII. RELATED WORK
T. Vafeiadis et al. [2] predicted customer churn in the telecom industry using cross-validation and compared the accuracy of boosted and non-boosted versions of the methods. Semrl et al. [3] proposed churn prediction to improve gym member retention using Logistic Regression and a Neural Network. Shaaban et al. [4] proposed a churn prediction model using SVM and clustering with the WEKA software. Oyeniyi et al. [5] proposed churn prediction in the banking sector using the k-means clustering algorithm to determine patterns and develop a customer retention service. Zoric et al. [6] presented a case study of churn analysis in the banking industry using a neural network with the help of Alyuda NeuroIntelligence, and concluded that customers who use more services are less likely to leave, while clients who use fewer services are more likely to leave the bank.

VIII. METHODOLOGIES
Data mining plays a significant role in every Customer Relationship Management (CRM) framework: it makes it easier to detect customer behavior, to build and evaluate answers to business problems, and to reduce the churn rate in the banking industry. In this project, five different techniques were used to predict churn, and their accuracies were compared to determine the best-fit model. The main aim is to predict the customers who are likely to leave the bank, based on the diverse attributes of the dataset.

A. DECISION TREE
The decision tree is a supervised learning technique with a tree-like structure consisting of a root and nodes. Its output is easy to understand, and it is commonly used in CRM-related problems. Here, all independent attributes are used to predict the response attribute (Exited). 80% of the data is used as the training set and 20% as the test set. The code of the decision tree is shown in Fig-3.

Fig-3

From Fig-4, we can see that the accuracy on the test data is 87%. The confusion matrix also shows that 160 customers who Exited and 1575 customers who were Non-Exit were correctly predicted.

Fig-4
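The preparation steps from Section VI and the 80:20 split used for the decision tree can be sketched as follows. This is a minimal Python illustration, not the R code shown in the figures: the sample records, the integer codes for Geography and Gender, the min-max scaling and the random seed are all assumptions made for the example.

```python
import random

# Hypothetical sample records that reuse a few column names from Section V.
GEOS = ["France", "Spain", "Germany"]
records = [
    {"CustomerID": 1000 + i, "Surname": f"Name{i}", "CreditScore": 500 + 30 * i,
     "Geography": GEOS[i % 3], "Gender": "Male" if i % 2 else "Female",
     "Age": 25 + 4 * i, "Exited": 1 if i % 5 == 0 else 0}
    for i in range(10)
]

DROP = {"CustomerID", "Surname"}  # identifier columns excluded in Section VI
CODES = {  # assumed integer codes for the character columns
    "Geography": {"France": 0, "Spain": 1, "Germany": 2},
    "Gender": {"Female": 0, "Male": 1},
}

def prepare(rows):
    """Drop identifier columns, encode categories, min-max scale every column."""
    cleaned = []
    for row in rows:
        out = {}
        for key, value in row.items():
            if key in DROP:
                continue
            out[key] = float(CODES[key][value]) if key in CODES else float(value)
        cleaned.append(out)
    for key in cleaned[0]:
        values = [r[key] for r in cleaned]
        low, span = min(values), max(values) - min(values)
        for r in cleaned:
            r[key] = (r[key] - low) / span if span else 0.0
    return cleaned

def split(rows, train_fraction=0.8, seed=42):
    """Shuffle reproducibly, then cut into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

train, test = split(prepare(records))
print(len(train), "training rows,", len(test), "test rows")
```

Scaling matters most for distance-based methods such as KNN, where an unscaled attribute like Balance would otherwise dominate the distance.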
From the decision tree in Fig-5, we can understand the following:
1. If the customer's age is less than or equal to the split age, there is a 71% chance of exit; if these customers have more than 2.5 products, there is a 2% chance of exit, and if they have fewer than 2.5 products, there is a 69% chance that they will not exit from the bank.
2. If the customer's age is greater than 42, there is a 29% chance that the customer will not exit; if the customer is an active member, there is a 13% chance of exit, and if the customer is not active, there is a 16% chance of staying.

Fig-5 Decision Tree

B. SVM (SUPPORT VECTOR MACHINE)
In this technique, all attributes are used to predict the Exited attribute, and the data is split into training and testing sets in an 80:20 ratio. Different kernels (rbfdot, laplacedot, besseldot and splinedot) were used to check the accuracy of the model.

Fig-6

From Fig-7, the accuracy on the test data using the "rbfdot" kernel is 85.1%. The model correctly predicts 162 customers who Exited and 1541 customers who are Not-Exit.

Fig-7

C. KNN (K-NEAREST NEIGHBORS)
In this technique, all independent attributes are used to predict the response variable (Exited). 80% of the data is used as the training set and 20% as the testing set, and two neighborhood sizes are checked: k=3 and k=9. The code of this technique is given in Fig-8.

Fig-8

From Fig-9, the accuracy on the test data when k=3 is 81.1%, with 144 correctly predicted Exit customers and 1478 correctly predicted Not-Exited customers. When k=9, the accuracy is 82.1%, with 98 customers correctly predicted as likely to leave and 1544 customers correctly predicted as Not-Exited.
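The k-nearest-neighbour idea behind this section can be illustrated in a few lines. The sketch below is plain Python rather than the R code of Fig-8, and the two-feature training points are hypothetical values chosen for the example; it only shows how the choice of k (3 versus 9) can change a prediction.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to `query`."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical normalized features; label 0 = Not-Exit, 1 = Exit.
train_X = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.30), (0.30, 0.25), (0.25, 0.15),
    (0.80, 0.90), (0.90, 0.80), (0.85, 0.70), (0.70, 0.95),
]
train_y = [0, 0, 0, 0, 0, 1, 1, 1, 1]

query = (0.80, 0.80)
for k in (3, 9):
    print(f"k={k}: predicted class {knn_predict(train_X, train_y, query, k)}")
```

With k=3 only the nearby Exit points vote; with k=9 every training point votes, so the larger Not-Exit class wins. In this toy setting, a larger k pulls predictions toward the majority class.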
Fig-9

D. ANN (ARTIFICIAL NEURAL NETWORK)
In this technique, all independent attributes are used to predict the response variable (Exited); 80% of the data is used for training and 20% for testing. The code of this technique is given in Fig-10.

Fig-10

Fig-11

From Fig-11, the accuracy on the test data when hidden=1 is 83.2%. The model correctly predicted 107 customers as Exited and 1544 customers as Not-Exited.

Fig-12 Neural Network

E. LOGISTIC REGRESSION
In this technique, all independent variables are used to predict the dependent variable, and the data is divided into training and testing sets in an 80:20 ratio. The code of this technique is given in Fig-13.

Fig-13

Fig-14

From Fig-14, the accuracy on the test data is 83.3%, with 154 customers correctly predicted as Exit customers and 1512 customers correctly predicted as Non-Exited.
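Accuracy here is the share of test records that were classified correctly. The short Python sketch below reproduces that arithmetic for the logistic regression and KNN counts quoted above; it assumes the test set holds 2,000 records (20% of the 10,000 records), and the function name is illustrative only.

```python
TEST_SIZE = 2000  # assumption: 20% of the 10,000 records

def accuracy(correct_exit, correct_not_exit, total=TEST_SIZE):
    """Fraction of test records whose class was predicted correctly."""
    return (correct_exit + correct_not_exit) / total

# Counts of correctly predicted customers as quoted in the text.
quoted = {
    "Logistic Regression": (154, 1512),
    "KNN (k=3)": (144, 1478),
    "KNN (k=9)": (98, 1544),
}

for method, (exit_ok, not_exit_ok) in quoted.items():
    print(f"{method}: {accuracy(exit_ok, not_exit_ok):.1%}")
```

Because most customers do not exit, accuracy alone says little about how many churners a model finds, which is why the count of correctly predicted Exit customers is reported alongside it.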
IX. INTERPRETATION OF RESULT

A. Confusion Matrix table result

                                      Actual Class
Method                Prediction      Not Exit    Exit
Decision Tree         Not Exit        1575        222
                      Exit            43          165
SVM                   Not Exit        1541        271
                      Exit            26          162
KNN                   Not Exit        1478        132
                      Exit            292         144
Logistic Regression   Not Exit        1512        69
                      Exit            265         154
ANN                   Not Exit        1557        47
                      Exit            289         107

B. Comparison of DT, SVM, KNN, LR, ANN techniques

Method                              Accuracy
Decision Tree                       86.4 %
SVM (Support Vector Machine)        85.7 %
KNN (K Nearest Neighbour)           82.1 %
Logistic Regression                 83.3 %
ANN (Artificial Neural Network)     79.1 %

X. CUSTOMER RETENTION SOLUTION
In today's banking industry, customer retention is a very important task, because the industry's profit is based on transaction volume rather than margin: there is not much profit margin on a single transaction on banking products. It is therefore very important for banks to have a large number of customers to work with and to increase their profit base. The banking industry employs many customer retention techniques, including the following:
1. Customizing the product according to customer need and demand.
2. Extending credit to high-end customers according to their requirements, after analyzing their credit history and income.
3. Conducting surveys to set customer expectations.
4. Setting up an R&D division to look for solutions to the problems of today and tomorrow.
5. Building a relationship with the customer based on trust and understanding.
6. The banking industry runs on customers who return to the bank. For this to happen flawlessly, banks need to be customer friendly and ready to go the extra mile (within legal boundaries) for the customer.
Today the banking scenario is changing around the globe, and so are the customers and their needs. Earlier, banking was all about deposits and withdrawals, but over time banks moved into lending, insurance, currency exchange, business development, investment and many more areas. Banks should therefore keep an eye out for any new sector opening up in order to retain their customers, because it is a very competitive market and everyone is fighting for a piece of the pie.

XI. CONCLUSION
In this customer churn prediction project, we compared the accuracy of different supervised learning techniques: Decision Tree, SVM, KNN, ANN and Logistic Regression. The decision tree gives the best accuracy, which means it is the best of these models for predicting customer churn in the banking industry.

REFERENCES
[1] Ahn, "Customer churn analysis: churn determinants and mediation effects of partial defection in the Korean mobile telecommunication service industry".
[2] T. Vafeiadis, "A Comparison of machine learning techniques for customer churn prediction".
[3] Semrl, "Churn Prediction Model for Effective Gym Customer Retention".
[4] Shaaban, "A Proposed Churn Prediction Model".
[5] Oyeniyi, "Customer Churn Analysis in Banking Sector using Data Mining Techniques".
[6] Zoric, "Predicting Customer Churn in Banking Industry using Neural Network".