Estimating the probability
of default: Credit Risk
Mohamed Arsalan Qadri
Sarvesh Saurabh
Mohit Ravi
Summary
• Credit risk – The probability of default
• Data Cleansing
• Logistic Regression
• Linear Discriminant Analysis
• Comparison of the LR and LDA
• Factor Analysis
Credit Risk
What is it?
• The risk of default on a debt that may arise from a borrower failing to make
required payments.
Impact on the lender?
• Lost principal and interest, disruption to cash flows, and increased collection
costs.
How to estimate it?
• Credit risk arises from the potential that a borrower or counterparty will fail to perform
on an obligation
Sources of risk?
• For most banks, loans are the largest and most obvious source of credit risk.
• There are other sources of credit risk both on and off the balance sheet, including
letters of credit, unfunded loan commitments, and lines of credit.
• Other products, activities, and services that expose a bank to credit risk are credit
derivatives, foreign exchange, and cash management services.
Credit Risk
Credit Scoring vs Risk
Estimation of risk?
• The risk posed by the borrower is inversely related to the credit score.
• A credit score is a statistically derived numeric expression of a person's creditworthiness that is used by
lenders to assess the likelihood that a person will repay his or her debts.
• A credit score is based on, among other things, a person's past credit history (range: 300-850)
Credit Scoring
• Consumers can typically keep their credit scores high by maintaining a long history of
always paying their bills on time and not having too much debt.
• A FICO score is the most widely used credit scoring system.
• A credit score is primarily based on credit report information, typically sourced from
credit bureaus.
Data Cleaning
• Serious Delinquency in two years. (Make a pie chart for this)
• Revolving Utilization Of Unsecured Lines
Data Cleaning
• Age
Data Cleaning
• Number Of Time 30-59 Days Past Due Not Worse
Data Cleaning
• Number Of Time 60-89 Days Past Due Not Worse
Data Cleaning
• Number Of Times 90 Days Late
Data Cleaning
• Monthly Income
• Missing values replaced with the mean
Data Cleaning
• Monthly Income
• Ran Multiple Linear Regression
to predict the Missing Values
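The regression-based imputation can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the predictor columns (age, number of dependents), the toy records, and the helper names `solve_linear_system` and `fit_ols` are all assumptions made for the example.

```python
def solve_linear_system(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical records: (age, dependents, monthly_income); None marks a missing income.
records = [
    (30, 0, 3000.0), (40, 1, 4100.0), (50, 2, 5200.0),
    (35, 2, 3700.0), (45, 0, 4500.0), (42, 1, None),
]

# Fit only on records where the income is known.
known = [r for r in records if r[2] is not None]
coef = fit_ols([r[:2] for r in known], [r[2] for r in known])

# Replace each missing income with the regression prediction.
imputed = [
    (age, dep, inc if inc is not None else coef[0] + coef[1] * age + coef[2] * dep)
    for age, dep, inc in records
]
```

Compared with mean replacement, this keeps the imputed incomes spread out according to the other variables instead of piling them up at a single value.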
Data Cleaning
• Monthly Income
• The Histogram after running
Multiple Linear Regression
on Missing Values
Data Cleaning
• Debt Ratio
• We found that the Debt Ratio was extremely high in many cases.
• Upon closer inspection, we found that the high debt ratios occurred in records
whose Monthly Income was unknown.
• From this we inferred that, for these records, the Debt Ratio field most probably holds the debt amount rather than a ratio.
Data Cleaning
• Debt Ratio
• We corrected the high debt ratio values by dividing them by the predicted values of the
monthly income.
• The new mean after replacement was 0.67
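A minimal sketch of this correction is shown below. The slides do not state the exact rule for what counts as a "high" value, so the sketch assumes that only records whose monthly income was originally missing are rescaled; the function name `correct_debt_ratio` and the sample records are hypothetical.

```python
def correct_debt_ratio(debt_ratio, income_was_missing, predicted_income):
    """Rescale a debt-ratio field that appears to hold an absolute debt amount.

    Assumption: only records whose monthly income was originally missing are
    rescaled; the slides do not give the exact rule for "high" values.
    """
    if income_was_missing and predicted_income > 0:
        return debt_ratio / predicted_income
    return debt_ratio


# Hypothetical records: (debt_ratio, income_was_missing, predicted_income).
records = [
    (0.35, False, 5200.0),   # ordinary ratio, left unchanged
    (1800.0, True, 4500.0),  # looks like a debt amount, rescaled to 1800 / 4500
]
corrected = [correct_debt_ratio(*r) for r in records]
```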
Data Cleaning
• Number of Dependents
Data Modelling
• Split the dataset into Training data (70%) and Test data (30%).
• Computed the Correlation Matrix among the Independent variables.
• The variables had very little Correlation amongst themselves.
• Ran Logistic Regression using Stepwise selection.
• Ran Linear Discriminant Analysis.
• Compared the two models by measuring their prediction accuracy.
• Ran both models on the significant Factors from Factor Analysis.
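The first two steps, the 70/30 split and the correlation matrix, could look roughly like this. It is a sketch under stated assumptions: the toy rows, the column names, the fixed seed, and the helper names `train_test_split`, `pearson` and `correlation_matrix` are illustrative and do not come from the slides.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and split them into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of name -> list of values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical toy data: ten records with two predictors (age, dependents).
rows = [(25 + 3 * i, (i * 7) % 5) for i in range(10)]
train, test = train_test_split(rows)
matrix = correlation_matrix({
    "age": [r[0] for r in train],
    "dependents": [r[1] for r in train],
})
```

Off-diagonal entries close to zero are what "very little correlation" refers to; large absolute values would point to multicollinearity among the predictors.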
Logistic Regression
• Ran Logistic Regression separately for each variable.
• Computed the ROC curve for each variable and compared the AUC value.
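The per-variable AUC comparison can be illustrated with a rank-based calculation. A single-variable logistic model is a monotone function of its input, so ranking cases by the raw variable gives the same AUC as ranking by the fitted probability when the fitted coefficient is positive (and one minus that value when it is negative). The sketch below therefore works on the raw values; the variable names and toy data are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = defaulted, 0 = repaid.
labels = [0, 0, 0, 0, 1, 1]
variables = {
    "times_90_days_late": [0, 0, 1, 0, 2, 1],
    "number_of_dependents": [1, 0, 2, 3, 1, 2],
}

# One AUC per candidate variable, then pick the most discriminating one.
auc_by_variable = {name: auc(values, labels) for name, values in variables.items()}
best = max(auc_by_variable, key=auc_by_variable.get)
```

The pairwise loop is quadratic in the number of records, which is fine for illustration but too slow for a dataset of this size; a sort-based implementation would be the usual choice there.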
Stepwise Selection
• Overall Model was Significant.
• All the variables were included in the
model.
• The model built on the Training data
was tested on the Test data.
• Probability of default > 0.7 was coded
as 1, and Probability of default <0.7
was coded as 0.
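The coding rule can be written as a small function. In this sketch the intercept and coefficients are made-up numbers, not the fitted model, and a probability of exactly 0.7 is assumed to be coded as 0, since the slide leaves that case unspecified.

```python
import math


def predict_probability(intercept, coefficients, features):
    """Logistic model: probability of default for one record."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


def classify(probability, cutoff=0.7):
    """Code a predicted probability as 1 (default) or 0 (no default)."""
    return 1 if probability > cutoff else 0


# Hypothetical coefficients and records, for illustration only.
intercept, coefficients = -3.0, [1.5, 0.8]
low_risk = classify(predict_probability(intercept, coefficients, [2, 1]))
high_risk = classify(predict_probability(intercept, coefficients, [3, 1]))
```

A cutoff as high as 0.7 labels very few records as defaults, which trades a low true positive rate for a high true negative rate.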
Logistic Regression on Test Data
Confusion Matrix (actual vs. predicted values): TN = 41374, FP = 175, FN = 2661, TP = 291
Overall Accuracy = (41374 + 291) / (41374 + 291 + 175 + 2661) = 93.6%
True Positive Rate = TP / (TP + FN) = 9.85%
True Negative Rate = TN / (TN + FP) = 99.5%
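The three rates above follow directly from the four confusion-matrix counts, as this short sketch shows. The counts are the ones in the accuracy formula; which of 175 and 2661 is the false-positive count is inferred from the reported true positive rate, and the function name `classification_metrics` is an assumption.

```python
def classification_metrics(tn, fp, fn, tp):
    """Accuracy, true positive rate and true negative rate from raw counts."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "true_positive_rate": tp / (tp + fn),
        "true_negative_rate": tn / (tn + fp),
    }


# Counts from the logistic regression results on the test data.
lr = classification_metrics(tn=41374, fp=175, fn=2661, tp=291)
```

Because defaults are rare in this dataset, a model can reach a high overall accuracy while still missing most actual defaults, so the true positive rate is worth reading alongside the accuracy.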
ROC curve for Test Data
• AUC Value = 0.8557
Discriminant Analysis
Overall Accuracy = (38134 + 1717) / Total = 89.5%
True Positive Rate = TP / (TP + FN) = 58%
True Negative Rate = TN / (TN + FP) = 91.7%
Confusion Matrix (Serious Delinquency):
            Predicted 0   Predicted 1
Actual 0    38134         3415
Actual 1    1235          1717
Comparison of Models
Linear Discriminant Analysis
Overall Accuracy = 89.5%
Confusion Matrix (Serious Delinquency):
            Predicted 0   Predicted 1
Actual 0    38134         3415
Actual 1    1235          1717
Logistic Regression
Overall Accuracy = 93.6%
Normality of variables
Factor Analysis
Factor Pattern
Group     Variable                         Factor1   Factor2   Factor3   Factor4
Factor 1  NumberOfTimes90DaysLate          0.54684   0.28062   0.26286   -0.0429
Factor 1  NumberOfTime60_89DaysPastDueNot  0.50016   0.3943    0.37949   -0.0015
Factor 1  RevolvingUtilizationOfUnsecured  0.60945   0.24942   -0.1861   -0.0285
Factor 2  NumberOfOpenCreditLinesAndLoans  -0.5203   0.5275    0.1922    0.15051
Factor 2  NumberRealEstateLoansOrLines     -0.4698   0.61529   -0.0292   0.09694
Factor 2  NumberOfDependents_num           0.03058   0.46357   -0.6034   -0.008
Factor 2  Monthlyincome_debt               -0.4298   0.5044    -0.09     -0.1628
Factor 2  NumberOfTime30_59DaysPastDueNot  0.40861   0.49901   0.31943   0.05977
Factor 3  age                              -0.4301   -0.1476   0.65733   -0.0396
Factor 4  DebtRatio                        0.05584   -0.0712   -0.0331   0.97112
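One simple way to read a factor pattern is to assign each variable to the factor on which it has the largest absolute loading. The sketch below applies that rule to the loadings above; the rule itself is an assumption for illustration and is not necessarily how the authors formed their groups.

```python
# Loadings copied from the factor pattern (Factor1 .. Factor4).
loadings = {
    "NumberOfTimes90DaysLate": [0.54684, 0.28062, 0.26286, -0.0429],
    "NumberOfTime60_89DaysPastDueNot": [0.50016, 0.3943, 0.37949, -0.0015],
    "RevolvingUtilizationOfUnsecured": [0.60945, 0.24942, -0.1861, -0.0285],
    "NumberOfOpenCreditLinesAndLoans": [-0.5203, 0.5275, 0.1922, 0.15051],
    "NumberRealEstateLoansOrLines": [-0.4698, 0.61529, -0.0292, 0.09694],
    "NumberOfDependents_num": [0.03058, 0.46357, -0.6034, -0.008],
    "Monthlyincome_debt": [-0.4298, 0.5044, -0.09, -0.1628],
    "NumberOfTime30_59DaysPastDueNot": [0.40861, 0.49901, 0.31943, 0.05977],
    "age": [-0.4301, -0.1476, 0.65733, -0.0396],
    "DebtRatio": [0.05584, -0.0712, -0.0331, 0.97112],
}


def dominant_factor(row):
    """Name of the factor with the largest absolute loading in this row."""
    index = max(range(len(row)), key=lambda i: abs(row[i]))
    return f"Factor{index + 1}"


assignment = {name: dominant_factor(row) for name, row in loadings.items()}
```

Under this rule NumberOfDependents_num lands on Factor3, because its strongest loading is the negative one (-0.6034), whereas the slide appears to list it with the Factor 2 group. Differences like this are common when cross-loadings are of similar size.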
Conclusion
• About 80% of the time was spent on data cleaning
• Logistic Regression gives higher overall accuracy than LDA (93.6% vs 89.5%) when the data is not normally distributed
• Factors can be grouped for a logical understanding, with Debt Ratio and age explaining high
variance.
Thank you

Editor's Notes

  1. The ROC curve measures how well a binary classifier performs by plotting the true positive rate against the false positive rate as the classification threshold varies. The diagonal line in the middle represents a classifier that guesses at random, for which the AUC is 0.5. Here we have the ROC curve for Monthly Income, and the area under this curve (AUC) is 0.8508. The higher the AUC value, the better the model. On the right, we have the AUC values for all the variables: Monthly Income has the best AUC value of 0.8508, most of the other variables fall below 0.7, and Debt Ratio does worse than 0.5.