Capstone Project
By Surendra Upadhyay
Introduction
• The capstone project is a machine learning application that builds models for a well-known bank in New Jersey.
• It analyzes the clients who took loans from the bank, based on various parameters.
Problem Statement
A bank in New Jersey wants to analyze different aspects of its client base. It also wants built-in models that segregate the clients based on various parameters.
The models must be built to understand the loan status and to identify different groups of customers.
Applied Strategy
Build a machine learning classification model that predicts the loan status of clients from Duration, Amount and Payments, using the Random Forest Classifier algorithm.
Cluster clients into different groups based on Amount and Balance, using the K-Means clustering algorithm.
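The two models in this strategy can be sketched as follows. This is a minimal illustration that assumes scikit-learn and NumPy, which the slides do not name. The data is synthetic, and the variable names are hypothetical stand-ins for the master table columns.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical stand-in for the master table: synthetic values, not the bank's data.
n = 200
duration = rng.choice([12, 24, 36, 48, 60], size=n)   # loan duration in months
amount = rng.uniform(10_000, 250_000, size=n)          # loan amount
payments = amount / duration                            # monthly payment
balance = rng.uniform(10_000, 60_000, size=n)          # account balance
status = rng.choice(["A", "B", "C", "D"], size=n, p=[0.6, 0.15, 0.2, 0.05])

# Classification: predict loan status from Duration, Amount and Payments.
X_cls = np.column_stack([duration, amount, payments])
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_cls, status)
predicted_status = clf.predict(X_cls[:5])

# Clustering: group clients by Amount and Balance.
# In practice the features would usually be scaled first; omitted here for brevity.
X_clu = np.column_stack([amount, balance])
km = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_labels = km.fit_predict(X_clu)
```

Because the labels here are random, the predictions carry no meaning. The sketch only shows which columns feed which model.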
Data Set
Data used:
• 868 rows, 56 columns
• Master table built by joining 8 tables
• Categorical and numerical data
Proposed method and Architecture
Flow diagram: Raw Data → Data Processing → Sampling → Training Set / Validation Set → Build Model → Validate → New Data
Architecture
• Raw Data: Data is collected in the form of Excel files.
• Data Processing: The input files are joined on common key fields to create a master table.
• Sampling: The processed master table is split into train and test datasets. One is used to train the model and the other to validate it.
• Algorithm: K-Means clustering groups the clients based on Amount and Balance, and a Random Forest Classifier classifies them based on Duration, Amount and Payments.
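The data processing and sampling steps can be illustrated with a small library-free sketch. The table names, field names and the 25% test fraction are assumptions for illustration. The slides do not state the actual keys or split ratio.

```python
import random

# Hypothetical input tables; the field names are illustrative only.
loans = [
    {"account_id": 1, "amount": 80000, "duration": 24, "status": "A"},
    {"account_id": 2, "amount": 200000, "duration": 60, "status": "B"},
    {"account_id": 3, "amount": 50000, "duration": 12, "status": "A"},
    {"account_id": 4, "amount": 175000, "duration": 48, "status": "C"},
]
accounts = [
    {"account_id": 1, "balance": 30000},
    {"account_id": 2, "balance": 55000},
    {"account_id": 3, "balance": 12000},
    {"account_id": 4, "balance": 41000},
    {"account_id": 5, "balance": 20000},  # no loan, dropped by the inner join
]

def inner_join(left, right, key):
    """Join two lists of dicts on a shared key, keeping only matching rows.

    Assumes the key is unique in the right-hand table.
    """
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

master = inner_join(loans, accounts, "account_id")
train, test = train_test_split(master)
```

An inner join is only one option. Which join type fits depends on whether clients without a matching row in another table should be kept.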
Architecture
Build Model: K-Means is used for clustering. The number of clusters (n-clusters) was tuned with the Elbow method, which gave an optimum of 2 clusters.
A Random Forest Classifier is used to classify the clients.
Validate: The model was validated on the test dataset and reached the same accuracy as on the train dataset.
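The Elbow method compares the within-cluster sum of squares (inertia) across candidate cluster counts and looks for the point where the curve flattens. The sketch below is a plain-Python illustration on made-up points. The deterministic farthest-point initialization is an assumption chosen for reproducibility, not the setup used in the project.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Pick k starting centers: the first point, then repeatedly the farthest point."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans_inertia(points, k, iterations=20):
    """Run basic Lloyd's k-means and return the within-cluster sum of squares."""
    centers = farthest_point_init(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its members; keep empty clusters in place.
        centers = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

# Hypothetical scaled (amount, balance) points forming two well-separated groups.
blob_low = [(1.0 + 0.1 * i, 2.0 + 0.1 * j) for i in range(3) for j in range(3)]
blob_high = [(8.0 + 0.1 * i, 2.0 + 0.1 * j) for i in range(3) for j in range(3)]
points = blob_low + blob_high

inertias = {k: kmeans_inertia(points, k) for k in range(1, 6)}
for k, value in inertias.items():
    print(k, round(value, 3))
```

On this toy data the inertia drops sharply from 1 to 2 clusters and barely changes afterwards, which is the elbow shape that points to 2 clusters.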
Methodology
• Create a rough idea of how to approach the solution.
• Understand the relations among the input files and identify the keys to join them.
• Join the files using the type of join that best fits the problem statement.
• Clean the data: check for missing values and remove duplicates.
• Obtain useful insights from the pre-processed data.
• Create graphs to obtain important information about the data.
• Apply scaling, splitting and feature extraction to make the data ready to fit the model.
• Build the model.
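The cleaning and scaling steps can be sketched in plain Python as follows. The rows are made up, and dropping rows with missing values is an assumption. The slides say only that missing values were checked, not how they were handled.

```python
from statistics import mean, pstdev

# Hypothetical raw rows; None marks a missing value.
rows = [
    {"account_id": 1, "amount": 80000, "balance": 30000},
    {"account_id": 2, "amount": 200000, "balance": None},   # missing balance
    {"account_id": 3, "amount": 50000, "balance": 12000},
    {"account_id": 3, "amount": 50000, "balance": 12000},   # exact duplicate
    {"account_id": 4, "amount": 170000, "balance": 42000},
]

def drop_missing(rows):
    """Keep only rows in which every field has a value."""
    return [r for r in rows if all(v is not None for v in r.values())]

def drop_duplicates(rows):
    """Remove exact duplicate rows, keeping the first occurrence."""
    seen, unique = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique

def standardize(rows, column):
    """Scale a column to zero mean and unit variance (assumes non-constant values)."""
    values = [r[column] for r in rows]
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

clean = drop_duplicates(drop_missing(rows))
amount_scaled = standardize(clean, "amount")
```

Scaling matters most for K-Means, because a feature with a large numeric range such as the loan amount would otherwise dominate the distance calculation.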
Analysis and Implementation
Loan Amount vs Duration
The figure shows that clients with smaller loan amounts repaid within a shorter time, while clients with larger loan amounts took longer to pay back the debt.
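A relationship like this can also be checked numerically with a correlation coefficient. The sketch below computes Pearson's r on made-up amount and duration values. The numbers are hypothetical and are not taken from the project's data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical loans: amount and duration in months.
amounts = [30000, 60000, 90000, 150000, 220000]
durations = [12, 24, 24, 48, 60]

r = pearson(amounts, durations)
print(round(r, 3))
```

A value close to 1 would support the reading of the scatter plot that larger loans go with longer durations.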
Loan Amount vs Balance
The figure shows that most clients fall in the loan amount ranges of 50K-1L and 1.5L-2.25L.
Most of them have a balance in the range of 10K-60K.
Loan Amount vs Status
• Status A clients fall in the range of 10K-1.2 in the figure.
• Most Status B and C clients fall in the range of 1.7L-2.2L.
• Status D clients are very few in number.
Status Wise Plot
The figure shows that Status A has the highest number of clients and Status D the lowest.
Status B and C have a moderate number of clients.
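The per-status counts and loan amount ranges read from the plots can be computed directly from the data. The sketch below uses a handful of made-up (status, amount) pairs, so the numbers are illustrative only.

```python
from collections import Counter, defaultdict

# Hypothetical (status, loan amount) pairs, not the bank's data.
clients = [
    ("A", 20000), ("A", 40000), ("A", 90000), ("A", 110000),
    ("B", 180000), ("B", 195000),
    ("C", 200000), ("C", 215000),
    ("D", 150000),
]

# Number of clients per status, as in the status-wise plot.
status_counts = Counter(status for status, _ in clients)

# Minimum and maximum loan amount per status, as in the amount vs status plot.
amounts_by_status = defaultdict(list)
for status, amount in clients:
    amounts_by_status[status].append(amount)
amount_range = {s: (min(a), max(a)) for s, a in sorted(amounts_by_status.items())}

print(status_counts.most_common())
print(amount_range)
```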
Conclusion
The figure shows that all the clients were classified into four segments.
The classification is based on the historical behavior captured in the train dataset that was used to train the model.
Amount vs Balance Clustering
The figure shows that clients were clustered (grouped) into two categories, 0 and 1.
The first cluster contains the customers with a low loan amount and a balance ranging from low to high.
The second cluster comprises the clients with a high loan amount and a balance ranging from low to high.
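A cluster description of this kind can be derived by summarizing each feature per cluster label. The sketch below assumes cluster labels have already been assigned. The rows and field names are hypothetical.

```python
# Hypothetical clustered rows; "cluster" holds the label assigned by K-Means.
clustered = [
    {"amount": 40000, "balance": 15000, "cluster": 0},
    {"amount": 70000, "balance": 58000, "cluster": 0},
    {"amount": 95000, "balance": 32000, "cluster": 0},
    {"amount": 160000, "balance": 12000, "cluster": 1},
    {"amount": 210000, "balance": 55000, "cluster": 1},
    {"amount": 225000, "balance": 30000, "cluster": 1},
]

def profile(rows, cluster, columns=("amount", "balance")):
    """Return the (min, max) of each column for one cluster."""
    members = [r for r in rows if r["cluster"] == cluster]
    return {
        col: (min(r[col] for r in members), max(r[col] for r in members))
        for col in columns
    }

profiles = {c: profile(clustered, c) for c in (0, 1)}
print(profiles)
```

In this toy example the amount ranges of the two clusters do not overlap while the balance ranges do, which mirrors the pattern described on the slide: the split is driven by loan amount rather than balance.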
Thank you

