Explore the cutting-edge presentations by students from the Boston Institute of Analytics as they delve into the realm of employee churn prediction. Gain valuable insights into the methodologies, techniques, and predictive models used to anticipate and mitigate employee turnover. Dive into real-world case studies and discover how businesses can leverage predictive analytics to retain top talent and foster organizational success. Enroll in our comprehensive course to master the skills needed for predictive analytics in the workplace. Visit https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/ for more data science insights.
3. PROJECT CONTENTS
1. INTRODUCTION
2. PROBLEM IDENTIFICATION
3. ATTRIBUTE/FEATURE DESCRIPTION
4. EXPLORATORY DATA ANALYSIS
5. MODEL BUILDING
6. BUILDING CLASSIFICATION MODEL
7. RESULT AND CONCLUSION
4. INTRODUCTION
• In the dynamic landscape of the banking industry,
retaining customers is paramount for sustained success.
• Customer churn, or the loss of customers, poses
challenges that this project aims to address through
data-driven insights and proactive strategies.
• This presentation outlines our approach to identifying,
predicting, and mitigating customer churn for the
benefit of our bank and its valued customers.
5. PROBLEM IDENTIFICATION
• Inadequate Customer Insights
• Data Quality Issues
• Dynamic Market Conditions
• Resource Allocation
• Limited Personalization
• Customer Communication Gaps
6. ATTRIBUTE/FEATURE DESCRIPTION
• CustomerId: ID assigned to the customer
• Surname: Customer's last name
• Geography: Place where the customer is located
• Gender: Customer's gender
• Age: Customer's age
• Tenure: How long the customer has been with the bank
• Balance: Amount remaining in the account
7. EXPLORATORY DATA ANALYSIS
• IMPORT DATA:
import pandas as pd
df = pd.read_csv('/content/drive/MyDrive/Classroom/BIA/ML/Churn_Modelling.csv')
• FIND MISSING VALUES:
No missing values
• FINDING FEATURES WITH ONE VALUE:
No features with a single value
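The two checks above can be sketched in plain Python. This is only an illustration on a made-up three-row sample, not the notebook used for the slides; the helper names `load_rows`, `missing_counts`, and `constant_columns` are hypothetical. With the pandas DataFrame loaded above, `df.isnull().sum()` and `df.nunique()` give the same information.

```python
import csv
import io

# Hypothetical three-row sample; the column names follow the slides' dataset.
SAMPLE_CSV = """CustomerId,Geography,Age,Balance
1,France,42,0.0
2,Spain,41,83807.86
3,France,39,159660.8
"""

def load_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))

def missing_counts(rows):
    """Count empty cells per column."""
    columns = list(rows[0].keys())
    return {col: sum(1 for row in rows if row[col] in ("", None)) for col in columns}

def constant_columns(rows):
    """Return the columns that hold a single distinct value."""
    columns = list(rows[0].keys())
    return [col for col in columns if len({row[col] for row in rows}) == 1]

rows = load_rows(SAMPLE_CSV)
print(missing_counts(rows))    # every count is 0 for this sample
print(constant_columns(rows))  # empty list: no single-valued column
```

A column with a single value cannot help a model separate churners from non-churners, which is why it is checked before modelling.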
8. • CHECKING WHETHER THE TARGET IS BALANCED:
• The data is highly imbalanced.
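One simple way to check balance is to compute each class's share of the target column. The sketch below uses made-up labels, so its numbers are not the project's actual class ratio; `class_shares` is a hypothetical helper.

```python
from collections import Counter

def class_shares(labels):
    """Return each class's share of the rows as a fraction of 1."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: count / total for cls, count in counts.items()}

# Hypothetical target values: 1 = customer left, 0 = customer stayed.
exited = [0, 0, 0, 0, 1, 0, 0, 1, 0, 0]
shares = class_shares(exited)
print(shares)
```

When one class dominates like this, plain accuracy can look good even for a model that never predicts churn, so the imbalance has to be handled before training.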
12. • DROP UNWANTED COLUMNS:
data = data.drop(['CustomerId', 'Surname', 'Exited', 'RowNumber'], axis=1)
We dropped CustomerId, Surname, and RowNumber because they have little
impact on model building, and dropped Exited because it is the target variable.
• STANDARDIZATION:
Standardization is a preprocessing method used to transform numerical
data by scaling it to have a mean of zero and a standard deviation of one.
This transformation is applied to all features ensuring that they have the
same scale, thus preventing features with larger magnitudes from
dominating the learning algorithm.
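The transformation is z = (x - mean) / standard deviation, applied column by column. The sketch below writes it out in plain Python so the arithmetic is visible; the sample values and the `standardize` helper are hypothetical. scikit-learn's `StandardScaler` applies the same formula.

```python
from statistics import fmean, pstdev

def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation
    if std == 0:
        # A constant column has no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

# Hypothetical feature columns on very different scales.
ages = [25, 35, 45, 55]
balances = [0.0, 50_000.0, 100_000.0, 150_000.0]

print(standardize(ages))
print(standardize(balances))
```

Both columns end up with the same z-scores even though the raw balances are thousands of times larger than the ages, which is the point of putting features on a common scale.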
• LABEL ENCODER:
As found in the EDA, the dataset has three categorical features in total,
including the target column. Before model building, we convert them into
numerical features with the help of a label encoder.
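Label encoding replaces each distinct category with an integer. A minimal sketch, assuming codes are assigned in sorted order (which is also what scikit-learn's `LabelEncoder` does); the sample values and the helper names `fit_label_encoder` and `transform` are hypothetical.

```python
def fit_label_encoder(values):
    """Map each distinct category to an integer, in sorted order."""
    return {category: code for code, category in enumerate(sorted(set(values)))}

def transform(values, mapping):
    """Replace each category with its integer code."""
    return [mapping[v] for v in values]

# Hypothetical sample of the Geography column.
geography = ["France", "Spain", "Germany", "France"]
mapping = fit_label_encoder(geography)
print(mapping)
print(transform(geography, mapping))
```

The integer codes imply an order between categories that does not really exist. Tree-based models such as the ones used later are usually tolerant of this, but it is worth keeping in mind for other algorithms.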
13. MODEL BUILDING
• THE DATA IS HIGHLY IMBALANCED, SO WE USED OVERSAMPLING.
• SPLITTING DATASET:
We split the dataset into training and test sets in an 80%-20% ratio,
where x = independent variables (features)
y = dependent variable (target)
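The slides do not name the oversampling method or show whether it ran before or after the split. The sketch below is therefore an assumption on both counts: it uses random oversampling (duplicating minority rows) and applies it to the training portion only, so duplicated rows cannot end up in the test set. The helpers `split_train_test` and `random_oversample` and the toy data are hypothetical.

```python
import random

def split_train_test(X, y, test_size=0.2, seed=42):
    """Shuffle row indices and hold out `test_size` of them for testing."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(X) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test

def random_oversample(X, y, seed=42):
    """Duplicate random rows of smaller classes until each class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out

# Hypothetical data: 20 rows, 4 of them churned (label 1).
X = [[i] for i in range(20)]
y = [1 if i % 5 == 0 else 0 for i in range(20)]

X_train, X_test, y_train, y_test = split_train_test(X, y)
X_bal, y_bal = random_oversample(X_train, y_train)
print(len(X_train), len(X_test))
print(y_bal.count(0), y_bal.count(1))
```

The test set keeps its original class ratio, so scores measured on it reflect the imbalance the model will see in practice.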
14. BUILDING CLASSIFICATION MODEL
• WE USED 3 ALGORITHMS TO FIND THE BEST
ACCURACY:
• DECISION TREE
• XGBOOST CLASSIFIER
• RANDOM FOREST CLASSIFIER
15. • DECISION TREE:
A decision tree is a supervised learning technique that can be used for both
classification and regression problems, though it is mostly preferred for
classification problems.
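A decision tree is grown by repeatedly choosing the split that best separates the classes. The sketch below shows one such step on a single numeric feature, assuming Gini impurity as the split criterion (a common default; the slides do not say which criterion was used). The data and the helpers `gini` and `best_threshold` are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure node, 0.5 for an even two-class mix."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def best_threshold(values, labels):
    """Find the threshold on one feature that minimizes weighted Gini impurity."""
    best = (None, float("inf"))
    distinct = sorted(set(values))
    # Candidate thresholds sit midway between neighboring distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best

# Hypothetical toy sample in which older customers churn (label 1).
ages = [22, 25, 31, 38, 47, 52, 58, 63]
exited = [0, 0, 0, 0, 1, 1, 1, 1]
print(best_threshold(ages, exited))  # (42.5, 0.0): a perfect split for this toy data
```

A full tree applies the same search to every feature, picks the best one, and then recurses into the left and right subsets until a stopping rule is met.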
16. • RANDOM FOREST CLASSIFIER:
Random forest is a popular supervised machine learning algorithm. It can be
used for both classification and regression problems. It is based on the
concept of ensemble learning: it combines the predictions of many decision
trees.
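The ensemble idea can be sketched with very small trees. This is a deliberately simplified illustration, not the project's model: each "tree" is a one-split stump trained on a bootstrap sample and one randomly chosen feature, and the forest predicts by majority vote. Real random forests grow deeper trees and sample a feature subset at every split. All names and data here are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    """Gini impurity of a list of labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

def fit_stump(X, y, feature):
    """One-split tree on a single feature: (feature, threshold, left_label, right_label)."""
    values = sorted({row[feature] for row in X})
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [lab for row, lab in zip(X, y) if row[feature] <= threshold]
        right = [lab for row, lab in zip(X, y) if row[feature] > threshold]
        score = len(left) * gini(left) + len(right) * gini(right)
        if best is None or score < best[0]:
            best = (score, threshold, majority(left), majority(right))
    if best is None:
        # The sample had one distinct value for this feature; always predict the majority.
        label = majority(y)
        return (feature, 0.0, label, label)
    return (feature, best[1], best[2], best[3])

def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label

def fit_forest(X, y, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # rows sampled with replacement
        feature = rng.randrange(len(X[0]))                    # random feature for this tree
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """Majority vote across all trees."""
    return majority([predict_stump(stump, row) for stump in forest])

# Hypothetical rows: [age, number of products]; label 1 = churned.
X = [[22, 2], [25, 2], [31, 1], [38, 2], [47, 1], [52, 1], [58, 3], [63, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(X, y)
print(predict_forest(forest, [60, 1]), predict_forest(forest, [24, 2]))
```

Because every tree sees a different sample and feature, their individual mistakes tend to differ, and the vote averages them out.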
17. • XGBOOST CLASSIFIER:
XGBoost is an optimized distributed gradient boosting library designed
for efficient and scalable training of machine learning models. It is an
ensemble learning method that combines the predictions of multiple weak
models to produce a stronger prediction.
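The idea of combining weak models can be illustrated with a bare-bones gradient boosting loop. This is not XGBoost itself, which adds regularization, second-order gradient information, and many engineering optimizations. The sketch assumes squared-error loss on 0/1 labels and one-split regression stumps on a single feature; all names and data are hypothetical.

```python
def fit_regression_stump(xs, residuals):
    """Best single split on one feature, minimizing squared error of the residuals."""
    best = None
    distinct = sorted(set(xs))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

def fit_boosting(xs, ys, n_rounds=20, learning_rate=0.3):
    """Add stumps one at a time, each fitted to what the ensemble still gets wrong."""
    base = sum(ys) / len(ys)          # start from the mean label
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_regression_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, learning_rate

def predict_score(model, x):
    """Sum the base value and every stump's scaled contribution."""
    base, stumps, learning_rate = model
    return base + sum(
        learning_rate * (left if x <= threshold else right)
        for threshold, left, right in stumps
    )

# Hypothetical toy sample: age as the only feature, label 1 = churned.
ages = [22, 25, 31, 38, 47, 52, 58, 63]
exited = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_boosting(ages, exited)
print([round(predict_score(model, a), 2) for a in ages])
```

Each round shrinks the remaining error a little, so scores for churned rows move toward 1 and the others toward 0; a score above 0.5 would be read as a churn prediction in this sketch.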