THE CREDIT RISK ANALYTICS
EDA Case Study By,
• Mr. Prathmesh Pise
• Mr. Vishal Patil
 CONTENTS
 Problem statement
 Flow Chart
 Importing and Cleaning1
 Importing and Cleaning2
 Approach
 Data Visualization
 Significant Insights
 PROBLEM STATEMENT:
1. The aim is to identify patterns that indicate whether a client has difficulty paying their installments. These patterns will help the bank decide on actions such as:
• Denying the loan
• Reducing the amount of the loan
• Lending (to risky applicants) at a higher interest rate, etc.
2. Identify the correlation between the independent variables and the target variable
3. Ensure that consumers capable of repaying the loan are not rejected
 FLOW CHART
DATA LOADING → DETECT TARGET VARIABLE → DATA CLEANING → HANDLING MISSING VALUES → UNIVARIATE ANALYSIS → BIVARIATE ANALYSIS → MULTIVARIATE ANALYSIS → DATA VISUALIZATION → DATA INSIGHTS
EDA Process Followed
 IMPORTING AND CLEANING1:
1. Imported the pandas, matplotlib and seaborn libraries for loading and visualizing the data
2. The target variable is a flag indicating whether a client pays installments on time or not
3. Two data frames were created from CSV files, namely:
• Application data - contains all the information about the client at the time of application
• Previous application data - contains information about the client's previous loans
4. Dropped unnecessary columns, such as those describing the dimensions of the client's house
5. Achieved a 40% reduction in memory usage by changing the data types of categorical variables from object to category
 IMPORTING AND CLEANING2:
1. Imported the previous application data set as previous_app
2. Cleaned the data by removing columns that were less significant for the analysis and prone to containing erroneous data, namely:
• WEEKDAY_APPR_PROCESS_START
• HOUR_APPR_PROCESS_START, etc.
3. Achieved a 40% reduction in memory usage by changing the data types of categorical variables from object to category and dropping unnecessary columns
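The memory reduction described above can be sketched roughly as follows. This is a minimal illustration on a made-up toy frame, not the authors' notebook: the helper name `to_category` and the sample values are assumptions, and the saving on the real data depends on how many distinct values each column holds.

```python
import pandas as pd


def to_category(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which every text (object/string) column is a category."""
    out = df.copy()
    for col in out.columns:
        is_text = (pd.api.types.is_object_dtype(out[col])
                   or pd.api.types.is_string_dtype(out[col]))
        if is_text:
            out[col] = out[col].astype("category")
    return out


# Toy stand-in for the application data (values are invented for illustration).
toy = pd.DataFrame({
    "NAME_CONTRACT_TYPE": ["Cash loans", "Revolving loans"] * 500,
    "WEEKDAY_APPR_PROCESS_START": ["MONDAY", "TUESDAY"] * 500,
    "AMT_INCOME_TOTAL": [100000.0, 250000.0] * 500,
})

before = toy.memory_usage(deep=True).sum()

# Drop a column judged unnecessary, then convert the remaining text columns.
slim = to_category(toy.drop(columns=["WEEKDAY_APPR_PROCESS_START"]))
after = slim.memory_usage(deep=True).sum()

print(f"memory usage: {before} bytes before, {after} bytes after")
```

Passing `deep=True` makes pandas count the actual string payload, which is what shrinks when repeated labels are stored once as categories.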
HANDLING DATA AND MISSING VALUES:
1. Checked for null values in application_data and found that:
• OWN_CAR_AGE had 65.99%, OCCUPATION_TYPE had 31.35% and EXT_SOURCE_1 had 56.38% missing values
• Hence we decided to drop these columns
2. Checked for null values in previous_app and found that:
• RATE_INTEREST_PRIMARY had 99.64% null values
• RATE_INTEREST_PRIVILEGED had 99.64% null values
• Hence we dropped them
3. The external source columns had some missing values. We imputed them with zero, because the external agencies had not provided a score for these customers, which we took to mean the client's account was not prone to default. Hence the score was assumed to be zero.
4. Took the average of the EXT_SOURCE_1, EXT_SOURCE_2 and EXT_SOURCE_3 columns to create the ext_sources column.
5. In previous_app, NAME_TYPE_SUITE had 49% missing values and does not affect whether the client will default or not. Hence we dropped this column.
6. Defined a function null_percentage to calculate the percentage of null values in the columns of both data sets.
7. Since the data is imbalanced, we compared the proportions within each category rather than raw counts, and used stacked bar plots to make the comparison easier to read.
8. Defined a function called stacker. It compares a categorical column with the target variable, accounts for data imbalance by converting each category into percentages, and plots a stacked chart of the proportions.
9. Merged the previous_app data set with the application data set to compare it with the target variable.
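The null check, the zero imputation and the averaged score described in the steps above could look roughly like this. It is a sketch under assumptions: the body of `null_percentage` is a guess at what the authors' function does, the 60% drop threshold is hypothetical (the slides do not state one), and the toy values are invented.

```python
import numpy as np
import pandas as pd


def null_percentage(df: pd.DataFrame) -> pd.Series:
    """Percentage of missing values per column, largest first."""
    return (df.isna().mean() * 100).round(2).sort_values(ascending=False)


# Toy stand-in for application_data (values are invented).
toy = pd.DataFrame({
    "OWN_CAR_AGE": [np.nan, np.nan, np.nan, 4.0],
    "EXT_SOURCE_1": [0.5, np.nan, 0.3, np.nan],
    "EXT_SOURCE_2": [0.6, 0.2, np.nan, 0.4],
    "EXT_SOURCE_3": [0.7, 0.4, 0.3, np.nan],
})

pct = null_percentage(toy)

# Hypothetical cut-off: drop any column with more than 60% missing values.
THRESHOLD = 60.0
cleaned = toy.drop(columns=pct[pct > THRESHOLD].index)

# Impute missing external scores with zero, then average the three sources.
ext_cols = ["EXT_SOURCE_1", "EXT_SOURCE_2", "EXT_SOURCE_3"]
cleaned[ext_cols] = cleaned[ext_cols].fillna(0)
cleaned["ext_sources"] = cleaned[ext_cols].mean(axis=1)

print(pct)
print(cleaned)
```

One consequence of the zero fill is worth keeping in mind: `mean(axis=1)` would skip missing values by default, so filling with zero first pulls the averaged score down for clients with a missing source.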
 DATA VISUALIZATION
• Univariate analysis on following variables,
1. Target
2. Income
3. Children count
• Bi-variate analysis on Target variable against the following,
1. Gender & age
2. Contract type
3. Average external score
4. Income & occupation type
5. Education type, etc.
• Multi-variate analysis on Target variable against the following,
1. Income and education type
2. Income and previous application status
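For the bivariate comparisons against the target, the proportion table behind a stacked bar chart can be sketched as below. The function name `stacker_table` is hypothetical, the toy data is invented, and treating `TARGET == 1` as "pays late" is an assumption about how the flag is encoded.

```python
import pandas as pd


def stacker_table(df: pd.DataFrame, column: str, target: str = "TARGET") -> pd.DataFrame:
    """Percentage of each target class within every category of `column`.

    Normalising by row removes the effect of unequal category sizes,
    so categories can be compared even when the data is imbalanced.
    """
    return pd.crosstab(df[column], df[target], normalize="index") * 100


# Toy data: 4 male and 6 female clients (values are invented).
toy = pd.DataFrame({
    "CODE_GENDER": ["M", "M", "M", "M", "F", "F", "F", "F", "F", "F"],
    "TARGET":      [1,   0,   0,   0,   0,   0,   0,   0,   0,   1],
})

table = stacker_table(toy, "CODE_GENDER")
print(table.round(1))
```

With matplotlib installed, `table.plot(kind="bar", stacked=True)` draws the corresponding stacked chart, each bar summing to 100%.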
 TARGET V/S GENDER
Inference:
• A higher percentage of males than females pay their installments late.
• Correspondingly, a higher percentage of females than males pay on time.
 TARGET V/S CONTRACT TYPE
Inference:
• Clients with Cash loans are more likely to pay late than clients with Revolving loans.
 TARGET V/S CAR
Inference:
• The percentage of clients without a car who pay installments late is slightly higher than that of clients with a car.
 TARGET V/S AVG_EXT_SCORE
Inference:
• About 50% of the clients who delay their installment payments have a low average external score, ranging from approximately 0.2 to 0.4.
• Clients who pay their installments on time have a moderate average score, ranging from approximately 0.3 to 0.5.
• Some clients who received a very high score still delay their installments.
 TARGET V/S AMT INCOME
Inference:
• Among the income classes, clients with an income below 2 lakhs per annum tend to pay installments late.
• Clients with an income above 6 lakhs per annum, i.e. the rich class, are more likely to pay on time than the other classes.
 TARGET V/S INCOME TYPE
Inference:
• Among all income types, the Others group (maternity leave, students, unemployed clients, etc.) is the one that tends to pay installments late.
• Clients with the Businessman income type do not pay installments late.
• The Working class also has a relatively high share of late payers, at 10%.
 TARGET V/S FAMILY STATUS
Inference:
• Clients who are Single/not married and those in a Civil marriage tend to pay installments late.
 TARGET V/S HOUSING TYPE
Inference:
• Clients who live in rented apartments or with parents tend to pay installments late.
• Clients who stay in office apartments pay installments on time.
 TARGET V/S DOCUMENT 2
Inference:
• Clients who do not provide Document 2 tend to pay installments late. Hence it is advisable to make this document mandatory.
 TARGET V/S CLIENTS PROVIDING MOBILE NUMBERS
Inference:
• Clients who provide a mobile number tend to pay installments on time.
• Hence it is advisable to collect the mobile number of every client.
 TARGET V/S AGE
Inference:
• Clients below age 25 tend to pay installments late.
• Clients aged 65 and above pay their installments on time.
• A possible reason is that clients below age 25 are less financially stable than those above 65.
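The age comparison above needs ages grouped into bands. A rough sketch follows, assuming that `DAYS_BIRTH` stores age as a negative number of days relative to the application date and that `TARGET == 1` marks a late payer; the band edges and the toy rows are illustrative choices, not the authors'.

```python
import pandas as pd

# Toy stand-in for the application data (values are invented).
toy = pd.DataFrame({
    "DAYS_BIRTH": [-8000, -9000, -15000, -20000, -24500, -25000],
    "TARGET":     [1, 0, 0, 0, 0, 0],
})

# Assumption: DAYS_BIRTH is negative days, so flip the sign and convert to years.
toy["AGE_YEARS"] = -toy["DAYS_BIRTH"] / 365.25

# Left-closed bands: [0, 25), [25, 35), ... so "65+" starts at exactly 65.
bins = [0, 25, 35, 45, 55, 65, 120]
labels = ["<25", "25-34", "35-44", "45-54", "55-64", "65+"]
toy["AGE_BAND"] = pd.cut(toy["AGE_YEARS"], bins=bins, labels=labels, right=False)

# Share of late payers (in percent) within each observed age band.
late_rate = toy.groupby("AGE_BAND", observed=True)["TARGET"].mean() * 100
print(late_rate)
```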
 TARGET V/S OCCUPATION TYPE
Inference:
• Low-skill laborers, waiters/barmen staff, security staff, cooking staff, cleaning staff, drivers and laborers tend to pay installments late.
• Most accountants, high-skill tech staff and HR staff pay their installments on time.
• A likely reason is that these occupations belong to sectors with higher salaries.
 TARGET V/S CNT_CHILDREN
Inference:
• Clients with more than 5 children tend to pay installments late.
• Most clients with 2 or 3 children pay installments on time.
 TARGET V/S NAME_EDUCATION_TYPE
Inference:
• Clients with an academic degree pay installments on time.
• Clients with lower secondary education pay installments late.
 MULTIVARIATE ANALYSIS ON NUMERIC VARIABLES
Inference:
• A high positive correlation is seen between goods price and credit amount
• A high positive correlation is seen between annuity amount and credit amount
• A high positive correlation is seen between annuity amount and goods price
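Pairwise correlations like these are typically read off a correlation matrix. The sketch below assumes the columns are named `AMT_CREDIT`, `AMT_GOODS_PRICE` and `AMT_ANNUITY` (the slides only name them in words), uses invented values, and applies a hypothetical 0.8 cut-off for "high".

```python
import pandas as pd

# Toy numeric columns (values are invented; column names are assumptions).
toy = pd.DataFrame({
    "AMT_CREDIT":      [100.0, 200.0, 300.0, 400.0, 500.0],
    "AMT_GOODS_PRICE": [90.0, 180.0, 275.0, 360.0, 450.0],
    "AMT_ANNUITY":     [12.0, 20.0, 33.0, 41.0, 52.0],
    "CNT_CHILDREN":    [2, 0, 1, 0, 2],
})

corr = toy.corr(method="pearson")

# Hypothetical cut-off for calling a correlation "high".
THRESHOLD = 0.8

# Walk the upper triangle so every pair is listed once.
strong = [
    (a, b, round(float(corr.loc[a, b]), 3))
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if abs(corr.loc[a, b]) >= THRESHOLD
]

for a, b, r in strong:
    print(f"{a} vs {b}: r = {r}")
```

Since seaborn is among the imported libraries, `seaborn.heatmap(corr, annot=True)` is one way to show the same matrix as a chart.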
 PROPORTIONS OF CLIENTS BASED ON PREVIOUS APPLICATION STATUS
Inference:
• Out of the total loan applications, only 63% were approved.
• 17% of the applications were refused and 19% were cancelled by the clients.
 HANDLING OUTLIERS
Inference:
• Outliers were observed in the annual income variable.
• 99% of clients had an income below 4.75 LPA
• Hence the analysis of annual income was limited to clients with an annual income below 4.75 LPA
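Limiting the analysis to incomes below the 99th percentile can be sketched as follows. The column name `AMT_INCOME_TOTAL` appears later in the deck; the values here are invented, so the cut-off printed will not match the 4.75 LPA figure from the real data.

```python
import pandas as pd

# Toy annual incomes with one extreme value (all values are invented).
income = pd.Series(
    [90_000, 120_000, 135_000, 150_000, 180_000,
     202_500, 225_000, 270_000, 315_000, 117_000_000],
    name="AMT_INCOME_TOTAL",
    dtype="float64",
)

# 99th percentile used as the cut-off for the income analysis.
cap = income.quantile(0.99)

# Keep only clients strictly below the cut-off.
trimmed = income[income < cap]

print(f"cut-off: {cap:,.0f}")
print(f"kept {len(trimmed)} of {len(income)} rows")
```

Filtering for the plot only, rather than deleting the rows from the data set, keeps the extreme clients available for the other analyses.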
 TARGET V/S INCOME V/S EDUCATION TYPE
Inference:
• Clients with an academic degree and an income in the range of 3-3.6 lakhs pay installments late more often than those with a low income
 TARGET V/S NAME_CASH_LOAN_PURPOSE
Inference:
• Clients who previously took a loan to make payments on another loan pay installments late.
• They are followed by clients with Home/Office/Land loans and loans for personal household expenses, who also pay installments late
 TARGET V/S INCOME V/S PREVIOUS APPLICATION STATUS
Inference:
• Clients who took a loan for Business Development and have an annual income above 2.6 LPA pay installments late.
 TARGET V/S PREVIOUS LOAN STATUS
Inference:
• Clients whose previous loan was Refused pay their installments late
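Comparing previous loan status with the target requires the merged data set. A minimal sketch is shown below, assuming the two frames share a client key called `SK_ID_CURR`, that the status column is `NAME_CONTRACT_STATUS`, and that `TARGET == 1` marks a late payer; none of these names is confirmed by the slides, and the rows are invented.

```python
import pandas as pd

# Toy stand-ins for the two data sets (names and values are assumptions).
application = pd.DataFrame({
    "SK_ID_CURR": [1, 2, 3, 4],
    "TARGET":     [0, 1, 0, 1],
})
previous_app = pd.DataFrame({
    "SK_ID_CURR": [1, 1, 2, 3, 4, 4],
    "NAME_CONTRACT_STATUS": [
        "Approved", "Approved", "Refused", "Approved", "Refused", "Canceled",
    ],
})

# One client can have several previous applications, so this is a
# many-to-one merge: each previous application inherits the client's TARGET.
merged = previous_app.merge(application, on="SK_ID_CURR", how="inner")

# Share of late payers (in percent) for each previous application status.
late_by_status = merged.groupby("NAME_CONTRACT_STATUS")["TARGET"].mean() * 100
print(late_by_status)
```

Because clients with many previous applications contribute several rows, the percentages describe previous applications rather than distinct clients.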
 KEY INSIGHTS
• The following are strong indicators of default
1. NAME_HOUSING_TYPE : Clients living in rented apartments
2. NAME_FAMILY_STATUS : Clients in a Civil marriage and those who are Single/not married
3. NAME_INCOME_TYPE : Maternity leave, students, unemployed clients
4. FLAG_DOCUMENT_2 : Clients who do not provide Document 2
5. FLAG_MOBIL : Clients who do not provide a mobile number
6. OCCUPATION_TYPE : Low-skill laborers, laborers, waiters/barmen, security staff
7. CNT_CHILDREN : Positive correlation between the number of children and the chance of the client being a defaulter
8. NAME_EDUCATION_TYPE : Clients with lower secondary, secondary/secondary special and incomplete higher education
9. NAME_EDUCATION_TYPE : Clients with an academic degree and an annual income between 3 and 3.6 lakhs
10. NAME_CASH_LOAN_PURPOSE : Clients whose previous loan purpose was payment on other loans
• The following clients should be targeted
1. CODE_GENDER : Females
2. NAME_CONTRACT_TYPE : Clients with revolving loans
3. FLAG_CODE_CAR : Clients with a car
4. AVG_EXT_SCORE : Clients with a moderate external score
5. AMT_INCOME_TOTAL : Clients with an annual income greater than 6 lakhs
6. NAME_INCOME_TYPE : Businessmen and pensioners
7. FLAG_MOBIL : Clients who provide a mobile number
8. DAYS_BIRTH : Clients aged 65 and above
9. OCCUPATION_TYPE : Accountants, high-skill tech staff and HR staff
10. NAME_EDUCATION_TYPE : Clients with an academic degree
 CONCLUSION
• Based on the inferences obtained, a credit score can be defined
• Variables that increase the chance of a client being a defaulter will be given a low score
• Variables that increase the chance of a client paying installments on time will be given a high score
• Based on the final credit score, the bank can take one of the following decisions:
1. Grant the loan to clients with a healthy overall credit score
2. Grant the loan at a higher interest rate to clients with comparatively low credit scores
3. Reject the loan for clients with an extremely low credit score
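One way such a score could be assembled is an additive rule table, sketched below. Everything numeric here is hypothetical: the slides give no weights or thresholds, the category labels are assumed spellings, and only a handful of the listed indicators are included.

```python
# Hypothetical rule table: (column, value, points). Negative points mark
# indicators of default, positive points mark indicators of timely payment.
RULES = [
    ("NAME_HOUSING_TYPE", "Rented apartment", -1),
    ("NAME_EDUCATION_TYPE", "Lower secondary", -1),
    ("NAME_INCOME_TYPE", "Unemployed", -1),
    ("NAME_CONTRACT_TYPE", "Revolving loans", +1),
    ("NAME_EDUCATION_TYPE", "Academic degree", +1),
]


def credit_score(client: dict) -> int:
    """Sum the points of every rule the client matches."""
    return sum(points for column, value, points in RULES
               if client.get(column) == value)


def decision(score: int, approve_at: int = 1, reject_below: int = -2) -> str:
    """Map a score to one of the three actions; thresholds are hypothetical."""
    if score >= approve_at:
        return "grant"
    if score < reject_below:
        return "reject"
    return "grant at higher interest rate"


# Illustrative client (values are invented).
client = {
    "NAME_HOUSING_TYPE": "Rented apartment",
    "NAME_EDUCATION_TYPE": "Lower secondary",
    "NAME_CONTRACT_TYPE": "Cash loans",
}
score = credit_score(client)
print(score, decision(score))
```

In practice the points would be derived from the observed late-payment rates rather than set by hand, and the thresholds would be chosen so that clients capable of repaying are not rejected.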
THANK YOU!

Exploratory Data Analysis For Credit Risk Assesment

  • 1. THE CREDIT RISK ANALYTICS EDA Case Study By, • Mr. Prathmesh Pise • Mr. Vishal Patil
  • 2.  CONTENTS  Problem statement  Flow Chart  Importing and Cleaning1  Importing and Cleaning2  Approach  Data Visualization  Significant Insights
  • 3.  PROBLEM STATEMENT: 1. Aim is to identify patterns which indicate if a client had difficulty paying their installments which will help the bank in taking following actions: • Denying the loan • Reducing the amount of loan • Lending (to risky applicants) at a higher interest rate, etc. 2. Identifying the co-relation between dependent variables with target variable 3. To ensure that the consumers capable of repaying the loan are not rejected
  • 5. 1. Imported pandas, matplotlib and seaborn library for loading the data and data visualization 2. Target variable is flag variable weather a clients pays instalments on time or not 3. Two data frames were created from csv files namely, • Application data- Contains all the information of the client at the time of application • Previous application data - contains information about the client’s previous loan data 4. Dropped unnecessary columns like the one belonging to client’s house dimensions 5. Achieved 40% memory usage reduction by changing the data types of categorical variables from object to category.  IMPORTING AND CLEANING1:
  • 6. IMPORTING AND CLEANING 2:
    1. Imported the previous application data set as previous_app.
    2. Cleaned the data by removing columns that were less significant for the analysis and prone to containing erroneous data, namely:
       • WEEKDAY_APPR_PROCESS_START
       • HOUR_APPR_PROCESS_START, etc.
    3. Achieved a 40% reduction in memory usage by changing the data type of categorical variables from object to category and dropping unnecessary columns.
  • 7. HANDLING DATA AND MISSING VALUES:
    1. Checked for null values in application_data and found that:
       • OWN_CAR_AGE had 65.99%, OCCUPATION_TYPE had 31.35% and EXT_SOURCE_1 had 56.38% missing values
       • Hence these columns were dropped
    2. Checked for null values in previous_app and found that:
       • RATE_INTEREST_PRIMARY had 99.64% null values
       • RATE_INTEREST_PRIVILEGED had 99.64% null values
       • Hence these columns were dropped
    3. The external source columns had some missing values. These were imputed with zero, on the assumption that the external agencies had not provided a score for these customers because the client's account was not prone to default.
    4. Averaged the EXT_SOURCE_1, EXT_SOURCE_2 and EXT_SOURCE_3 columns to create the ext_sources column.
    5. In previous_app, NAME_TYPE_SUITE had 49% missing values and was judged not to affect whether the client will default. Hence this column was dropped.
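
The missing-value steps above might look roughly like this in pandas. It is a sketch on invented toy values: the body of `null_percentage` and the `DROP_ABOVE` cutoff are assumptions, since the slides name the function but state neither its code nor a threshold.

```python
import numpy as np
import pandas as pd

def null_percentage(df: pd.DataFrame) -> pd.Series:
    """Percentage of missing values per column, highest first."""
    return (df.isna().mean() * 100).round(2).sort_values(ascending=False)

# Toy stand-in for application_data (values are made up).
application_data = pd.DataFrame({
    "OWN_CAR_AGE": [np.nan, np.nan, 3.0, np.nan],
    "EXT_SOURCE_1": [np.nan, 0.4, 0.2, 0.6],
    "EXT_SOURCE_2": [0.6, 0.2, np.nan, 0.6],
    "EXT_SOURCE_3": [0.3, np.nan, 0.4, 0.6],
    "AMT_CREDIT": [1.0, 2.0, 3.0, 4.0],
})

pct = null_percentage(application_data)

# Hypothetical cutoff: drop columns with more than 50% missing values.
DROP_ABOVE = 50.0
to_drop = pct[pct > DROP_ABOVE].index
cleaned = application_data.drop(columns=to_drop)

# Impute missing external scores with zero, then average the three columns.
ext_cols = ["EXT_SOURCE_1", "EXT_SOURCE_2", "EXT_SOURCE_3"]
cleaned["ext_sources"] = cleaned[ext_cols].fillna(0).mean(axis=1)
print(cleaned)
```

Under zero imputation a missing score lowers the row average. Calling `mean(axis=1)` without `fillna(0)` would instead skip missing scores, which is a different modelling choice from the one the slide describes.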
  • 8. 6. Defined a function, null_percentage, to calculate the percentage of null values in the columns of both data sets.
    7. Since the data is imbalanced, the analysis uses the proportion of each category rather than raw counts, shown as stacked bar plots for easier comparison.
    8. Defined a function called stacker. It compares a categorical column with the Target variable, accounts for the data imbalance by converting each category into percentages, and plots a stacked chart of the proportions.
    9. Merged the previous_app data set with the application data set in order to compare it with the Target variable.
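
One way a function like stacker could compute its per-category percentages is with a row-normalized cross-tabulation. The sketch below is an assumption about its internals, not the authors' implementation, and it returns the table instead of drawing it so that it runs without a plotting backend.

```python
import pandas as pd

def stacker(df: pd.DataFrame, column: str, target: str = "TARGET") -> pd.DataFrame:
    """Percentage of each target value within every category of `column`.

    Normalizing per row (per category) removes the effect of class
    imbalance: every category sums to 100 regardless of its size.
    """
    return pd.crosstab(df[column], df[target], normalize="index") * 100

# Toy data (made up): 4 female and 2 male clients.
clients = pd.DataFrame({
    "CODE_GENDER": ["F", "F", "F", "F", "M", "M"],
    "TARGET": [0, 0, 0, 1, 0, 1],
})

proportions = stacker(clients, "CODE_GENDER")
print(proportions)
```

With matplotlib installed, `proportions.plot(kind="bar", stacked=True)` would draw the corresponding stacked bar chart.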
  • 9. DATA VISUALIZATION
    • Univariate analysis of the following variables:
       1. Target
       2. Income
       3. Children count
    • Bivariate analysis of the Target variable against the following:
       1. Gender & age
       2. Contract type
       3. Average external score
       4. Income & occupation type
       5. Education type, etc.
    • Multivariate analysis of the Target variable against the following:
       1. Income and education type
       2. Income and previous application status
  • 10. TARGET V/S GENDER
    Inference:
    • The percentage of males who pay installments late is higher than that of females.
    • The percentage of females who pay on time is higher than that of males.
  • 11. TARGET V/S CONTRACT TYPE
    Inference:
    • Clients with Cash loans tend to pay late more often than clients with Revolving loans.
  • 12. TARGET V/S CAR
    Inference:
    • The percentage of clients without a car who pay installments late is slightly higher than that of clients with a car.
  • 13. TARGET V/S AVG_EXT_SCORE
    Inference:
    • About 50% of the clients who delay their installment payments have a low average external score, ranging from approximately 0.2 to 0.4.
    • The clients who pay their installments on time have a moderate average score, ranging from approximately 0.3 to 0.5.
    • Some clients who received a very high score still delay their installments.
  • 14. TARGET V/S AMT INCOME
    Inference:
    • Among the income classes, clients with an income of less than 2 lakhs per annum tend to pay installments late.
    • Clients with an income of more than 6 lakhs per annum (the rich class) are more likely to pay on time than the other classes.
  • 15. TARGET V/S INCOME TYPE
    Inference:
    • Among all the income types, the Others group (maternity leave, students, unemployed clients, etc.) tends to pay installments late.
    • Clients with the Businessman income type do not pay installments late.
    • The Working class also has a relatively high percentage of late payers, at 10%.
  • 16. TARGET V/S FAMILY STATUS
    Inference:
    • Clients who are Single/not married and those in the Civil marriage class tend to pay installments late.
  • 17. TARGET V/S HOUSING TYPE
    Inference:
    • Clients who live in rented apartments or with parents tend to pay installments late.
    • Clients who stay in office apartments pay installments on time.
  • 18. TARGET V/S DOCUMENT 2
    Inference:
    • Clients who do not provide Document 2 tend to pay installments late. Hence it is advisable to make this document mandatory.
  • 19. TARGET V/S CLIENTS PROVIDING MOBILE NUMBERS
    Inference:
    • Clients who provide a mobile number tend to pay installments on time.
    • Hence it is advisable to collect the mobile number of every client.
  • 20. TARGET V/S AGE
    Inference:
    • Clients below the age of 25 tend to pay installments late.
    • Clients aged 65 and above pay their installments on time.
    • A possible reason is that clients below 25 are less financially stable than those above 65.
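
An age comparison like the one above requires grouping clients into age bands. The sketch below assumes that age is stored in a DAYS_BIRTH column as a negative day count relative to the application date, and the band edges are illustrative. Neither detail is stated in the slides.

```python
import pandas as pd

# Toy data (made up). Assumption: DAYS_BIRTH holds negative day counts.
clients = pd.DataFrame({
    "DAYS_BIRTH": [-8000, -12000, -20000, -24500],
    "TARGET": [1, 0, 0, 0],
})

clients["AGE_YEARS"] = -clients["DAYS_BIRTH"] / 365.25

# Hypothetical age bands; right=False makes each band [lower, upper).
bins = [0, 25, 35, 45, 55, 65, 120]
labels = ["<25", "25-34", "35-44", "45-54", "55-64", "65+"]
clients["AGE_GROUP"] = pd.cut(clients["AGE_YEARS"], bins=bins, labels=labels, right=False)

# Share of late payers (TARGET == 1) within each observed age band.
late_rate = clients.groupby("AGE_GROUP", observed=True)["TARGET"].mean() * 100
print(late_rate)
```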
  • 21. TARGET V/S OCCUPATION TYPE
    Inference:
    • Low-skill laborers, waiters/barmen staff, security staff, cooking staff, cleaning staff, drivers and laborers tend to pay installments late.
    • Most accountants, high-skill tech staff and HR staff pay their installments on time.
    • A likely reason is that these occupations belong to sectors with higher salaries.
  • 22. TARGET V/S CNT_CHILDREN
    Inference:
    • Clients with more than 5 children tend to pay installments late.
    • Most clients with 2 or 3 children pay installments on time.
  • 23. TARGET V/S NAME_EDUCATION_TYPE
    Inference:
    • Clients with an academic degree pay installments on time.
    • Clients with lower secondary education pay installments late.
  • 24. MULTIVARIATE ANALYSIS ON NUMERIC VARIABLES
    Inference:
    • A high positive correlation is seen between goods price and credit amount.
    • A high positive correlation is seen between annuity amount and credit amount.
    • A high positive correlation is seen between annuity amount and goods price.
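
Correlations of this kind can be read off a pairwise correlation matrix. The following is a minimal sketch on invented numbers. The column names AMT_CREDIT, AMT_GOODS_PRICE and AMT_ANNUITY are assumed to correspond to the credit amount, goods price and annuity amount, and the 0.9 cutoff for "high" is likewise an assumption.

```python
from itertools import combinations

import pandas as pd

# Toy data (made up) in which the three amounts move together.
loans = pd.DataFrame({
    "AMT_CREDIT": [100000, 200000, 300000, 450000, 600000],
    "AMT_GOODS_PRICE": [90000, 180000, 270000, 400000, 540000],
    "AMT_ANNUITY": [10000, 16000, 21000, 30000, 38000],
    "CNT_CHILDREN": [0, 2, 1, 0, 3],
})

corr = loans.corr(method="pearson")

# Keep each unordered pair once and report only the strongly correlated ones.
THRESHOLD = 0.9  # hypothetical cutoff for a "high" correlation
strong = {
    (a, b): corr.loc[a, b]
    for a, b in combinations(corr.columns, 2)
    if abs(corr.loc[a, b]) > THRESHOLD
}
for pair, value in strong.items():
    print(pair, round(value, 3))
```

With seaborn installed, `sns.heatmap(corr, annot=True)` would show the same matrix as an annotated heatmap.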
  • 25. PROPORTIONS OF CLIENTS BASED ON PREVIOUS APPLICATION STATUS
    Inference:
    • Only 63% of all loan applications were Approved.
    • 17% of applications were Refused and 19% were cancelled by the clients.
  • 26. HANDLING OUTLIERS
    Inference:
    • Outliers were observed in the annual income variable.
    • 99% of clients had an income of less than 4.75 LPA.
    • Hence the analysis of annual income was limited to clients with an annual income of less than 4.75 LPA.
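
Limiting the analysis to incomes below the 99th percentile can be sketched as follows. The numbers are invented, and the column name AMT_INCOME_TOTAL is taken from the key insights slide.

```python
import pandas as pd

# Toy incomes (made up): 99 ordinary values plus one extreme outlier.
income = pd.Series(
    list(range(100_000, 199_000, 1_000)) + [10_000_000],
    name="AMT_INCOME_TOTAL",
)

# The 99th percentile serves as the cutoff; only values below it are kept.
cap = income.quantile(0.99)
trimmed = income[income < cap]

print(f"cutoff: {cap:,.0f}, kept {len(trimmed)} of {len(income)} rows")
```

Filtering only restricts the rows used for the income plots. The outlier rows remain in the original series for any other analysis.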
  • 27. TARGET V/S INCOME V/S EDUCATION TYPE
    Inference:
    • Clients with an academic degree and an income in the range of 3 to 3.6 lakhs tend to pay installments late compared with those with a lower income.
  • 28. TARGET V/S NAME_CASH_LOAN_PURPOSE
    Inference:
    • Clients who previously took a loan to make payments on another loan pay installments late.
    • They are followed by clients with Home/Office/Land loans and loans for personal household expenses, who also pay installments late.
  • 29. TARGET V/S INCOME V/S PREVIOUS APPLICATION STATUS
    Inference:
    • Clients who took a loan for Business Development and have an annual income above 2.6 LPA pay installments late.
  • 30. TARGET V/S PREVIOUS LOAN STATUS
    Inference:
    • Clients whose previous loan was Refused pay their installments late.
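
Comparing previous loan status with the Target variable relies on the merge of the two data sets mentioned earlier. The sketch below uses invented rows. The join key SK_ID_CURR and the status column NAME_CONTRACT_STATUS are assumed names that do not appear in the slides.

```python
import pandas as pd

# Toy frames (made up). One client can have several previous applications.
application_data = pd.DataFrame({
    "SK_ID_CURR": [1, 2, 3, 4],
    "TARGET": [0, 1, 0, 1],
})
previous_app = pd.DataFrame({
    "SK_ID_CURR": [1, 1, 2, 2, 3, 4, 4],
    "NAME_CONTRACT_STATUS": [
        "Approved", "Approved", "Approved", "Refused",
        "Approved", "Refused", "Canceled",
    ],
})

# Attach the current TARGET flag to every previous application of a client.
merged = previous_app.merge(
    application_data[["SK_ID_CURR", "TARGET"]],
    on="SK_ID_CURR",
    how="inner",
)

# Share of late payers (TARGET == 1) for each previous application status.
late_by_status = merged.groupby("NAME_CONTRACT_STATUS")["TARGET"].mean() * 100
print(late_by_status)
```

Because the merge is one-to-many, clients with many previous applications are counted once per application, which is worth keeping in mind when reading the percentages.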
  • 31. KEY INSIGHTS
    • The following are strong indicators of default:
       1. NAME_HOUSING_TYPE: Clients living in rented apartments
       2. NAME_FAMILY_STATUS: Clients in a civil marriage and those who are single/not married
       3. NAME_INCOME_TYPE: Maternity leave, students, unemployed clients
       4. FLAG_DOCUMENT_2: Clients who do not provide document 2
       5. FLAG_MOBIL: Clients who do not provide a mobile number
       6. OCCUPATION_TYPE: Low-skill laborers, laborers, waiters, barmen, security staff
       7. CNT_CHILDREN: Positive correlation between the number of children and the chance of the client being a defaulter
       8. NAME_EDUCATION_TYPE: Clients with lower secondary, secondary/secondary special and incomplete higher education
       9. EDUCATION_TYPE: Clients with an academic degree and an annual income between 3 and 3.6 lakhs
       10. CASH_LOAN_PURPOSE: Clients whose previous loan purpose was payment on other loans
    • The following clients should be targeted:
       1. CODE_GENDER: Females
       2. NAME_CONTRACT_TYPE: Clients with revolving loans
       3. FLAG_CODE_CAR: Clients with a car
       4. AVG_EXT_SCORE: Clients with a moderate external score
       5. AMT_INCOME_TOTAL: Clients with an annual income greater than 6 lakhs
       6. NAME_INCOME_TYPE: Businessmen and pensioners
       7. FLAG_MOBIL: Clients who provide a mobile number
       8. DAYS_BIRTH: Clients aged 65 and above
       9. OCCUPATION_TYPE: Accountants, high-skill tech staff and HR staff
       10. NAME_EDUCATION_TYPE: Clients with an academic degree
  • 32. CONCLUSION
    • Based on the inferences obtained, a credit score can be set.
    • Variables that increase the chance of a client being a defaulter will be given a low score.
    • Variables that increase the chance of a client paying installments on time will be given a high score.
    • Based on the final credit score, the bank can take one of the following decisions:
       1. Grant the loan to clients with a healthy overall credit score
       2. Grant the loan at a higher interest rate to clients with a comparatively low credit score
       3. Reject the loan for clients with an extremely low credit score
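
The scoring idea in the conclusion could be prototyped as a simple additive points table. Everything numeric below is hypothetical: the slides propose the approach but give no points or cutoffs, and the category labels are illustrative.

```python
# Hypothetical points per (column, value): negative for indicators of
# default, positive for indicators of on-time payment.
POINTS = {
    ("NAME_HOUSING_TYPE", "Rented apartment"): -2,
    ("NAME_INCOME_TYPE", "Unemployed"): -2,
    ("NAME_EDUCATION_TYPE", "Academic degree"): 2,
    ("NAME_CONTRACT_TYPE", "Revolving loans"): 1,
}

def credit_score(client: dict) -> int:
    """Sum the points of every indicator that matches the client."""
    return sum(
        points
        for (column, value), points in POINTS.items()
        if client.get(column) == value
    )

def decision(score: int, approve_at: int = 1, reject_below: int = -2) -> str:
    """Map a score to one of the three actions; both cutoffs are assumptions."""
    if score >= approve_at:
        return "grant"
    if score < reject_below:
        return "reject"
    return "grant at higher interest rate"

applicant = {
    "NAME_HOUSING_TYPE": "Rented apartment",
    "NAME_CONTRACT_TYPE": "Revolving loans",
}
score = credit_score(applicant)
print(score, decision(score))
```

In practice the points would be derived from the observed default rates rather than chosen by hand, for example by fitting a model on the cleaned data.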