SlideShare a Scribd company logo
1 of 15
Statistical Learning
on Credit Data
BUILD MODELS TO PREDICT LOAN DEFAULTS
Firas Obeid
Ex Treasury Dealer
Problem
Identification
Lending Club Data
Loan Data
Goal: Predict Loan Defaults using set of variables
Process : Statistical Learning Models
Data Preparation
 Identify data types
 Fill missing values
 Convert nonnumeric variables
 Drop unnecessary variables
Statistical Models
 “As far as the laws of mathematics refer
to reality, they are not certain; and as far
as they are certain, they do not refer to
reality.”~ Einstein
 Many machine learning algorithms are
stochastic because they explicitly use
randomness during optimization or
learning.
Model Interpretation
ARE WE USING BLACKBOXES?
Tree Based Model - GBM
Tree Based
Model -
GBM
 LIME (Local Interpretable Model Agnostic
Explanations)
Neural
Network
FITTED ON IMPORTANT IDENTIFIED
VARIABLES BY PREVIOUS MODEL
Logistic
Regression –
NN
Interpreter
 Number of Iterations/Epochs: * Solution: Make
epochs = max_iter
 Optimizer : * Solution: Optimizer = 'sgd' (stochastic
gradient decsent Solver = 'sag' (stochastic
average gradient descent)
 Regularization: * Solution: l2 in both
 The L2 regularization (also called Ridge): For l2 /
Ridge, as the penalisation increases, the coefficients
approach but do not equal zero, hence no variable
is ever excluded!
Model Stability
 For high-bias models, the
performance of the model on
the validation set is similar to
the performance on the training
set.
 For high-variance models, the
performance of the model on
the validation set is far worse
than the performance on the
training set.
Model
Deployment
 Probability of Default from
model predictions
 Use Probabilities of
observations to assign buckets
that group customers according
to default rate
Scorecard (Segmentation & Pricing)
Scorecard
(Segmentation &
Pricing)
 Mean Loan amount per bucket
 Interest Rate Proposed per
Bucket
Bucket NPV
Conclusion
Using statistical models that assign probability to classified classes, help allocate operational risk capital by identifying level of
customer riskiness. The method of segmenting customer into buckets using the scorecard approach, identifies:
Level of operational and default risk a financial entity is willing to take,
The customer base to choose
The appropriate interest rate to charge and price loan/credit product repayments.

More Related Content

What's hot

Linear Regression Ex
Linear Regression ExLinear Regression Ex
Linear Regression Ex
mailund
 

What's hot (20)

Step by Step guide to executing an analytics project
Step by Step guide to executing an analytics projectStep by Step guide to executing an analytics project
Step by Step guide to executing an analytics project
 
Housing price prediction
Housing price predictionHousing price prediction
Housing price prediction
 
Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning
 
Risk Based Loan Approval Framework
Risk Based Loan Approval FrameworkRisk Based Loan Approval Framework
Risk Based Loan Approval Framework
 
Ia for fintech
Ia for fintechIa for fintech
Ia for fintech
 
Classification
ClassificationClassification
Classification
 
Prediction of customer propensity to churn - Telecom Industry
Prediction of customer propensity to churn - Telecom IndustryPrediction of customer propensity to churn - Telecom Industry
Prediction of customer propensity to churn - Telecom Industry
 
Stock Market Analysis
Stock Market AnalysisStock Market Analysis
Stock Market Analysis
 
What makes a good decision tree?
What makes a good decision tree?What makes a good decision tree?
What makes a good decision tree?
 
Cmpe 255 cross validation
Cmpe 255 cross validationCmpe 255 cross validation
Cmpe 255 cross validation
 
Classes of Model
Classes of ModelClasses of Model
Classes of Model
 
Machine learning
Machine learningMachine learning
Machine learning
 
Supervised learning
  Supervised learning  Supervised learning
Supervised learning
 
Fraud Detection with Ensemble Learning Technique
Fraud Detection with Ensemble Learning TechniqueFraud Detection with Ensemble Learning Technique
Fraud Detection with Ensemble Learning Technique
 
Ew36913917
Ew36913917Ew36913917
Ew36913917
 
Stock market analysis
Stock market analysisStock market analysis
Stock market analysis
 
Presentation machine learning
Presentation machine learningPresentation machine learning
Presentation machine learning
 
Heart disease classification
Heart disease classificationHeart disease classification
Heart disease classification
 
Model Selection Techniques
Model Selection TechniquesModel Selection Techniques
Model Selection Techniques
 
Linear Regression Ex
Linear Regression ExLinear Regression Ex
Linear Regression Ex
 

Similar to Statistical Learning on Credit Data

Presentation Title
Presentation TitlePresentation Title
Presentation Title
butest
 
Neural Network Model
Neural Network ModelNeural Network Model
Neural Network Model
Eric Esajian
 

Similar to Statistical Learning on Credit Data (20)

Decoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in ActionDecoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in Action
 
LOAN APPROVAL PRDICTION SYSTEM USING MACHINE LEARNING.
LOAN APPROVAL PRDICTION SYSTEM USING MACHINE LEARNING.LOAN APPROVAL PRDICTION SYSTEM USING MACHINE LEARNING.
LOAN APPROVAL PRDICTION SYSTEM USING MACHINE LEARNING.
 
Unveiling the Market: Predicting House Prices with Data Science
Unveiling the Market: Predicting House Prices with Data ScienceUnveiling the Market: Predicting House Prices with Data Science
Unveiling the Market: Predicting House Prices with Data Science
 
Predicting House Prices: A Machine Learning Approach
Predicting House Prices: A Machine Learning ApproachPredicting House Prices: A Machine Learning Approach
Predicting House Prices: A Machine Learning Approach
 
Machine Learning part 3 - Introduction to data science
Machine Learning part 3 - Introduction to data science Machine Learning part 3 - Introduction to data science
Machine Learning part 3 - Introduction to data science
 
Presentation Title
Presentation TitlePresentation Title
Presentation Title
 
Credit scorecard
Credit scorecardCredit scorecard
Credit scorecard
 
Personal Loan Risk Assessment
Personal Loan Risk Assessment Personal Loan Risk Assessment
Personal Loan Risk Assessment
 
Dmml report final
Dmml report finalDmml report final
Dmml report final
 
Bank Customer Churn Prediction- Saurav Singh.pptx
Bank Customer Churn Prediction- Saurav Singh.pptxBank Customer Churn Prediction- Saurav Singh.pptx
Bank Customer Churn Prediction- Saurav Singh.pptx
 
Machine_Learning.pptx
Machine_Learning.pptxMachine_Learning.pptx
Machine_Learning.pptx
 
Default Prediction & Analysis on Lending Club Loan Data
Default Prediction & Analysis on Lending Club Loan DataDefault Prediction & Analysis on Lending Club Loan Data
Default Prediction & Analysis on Lending Club Loan Data
 
Creating an Explainable Machine Learning Algorithm
Creating an Explainable Machine Learning AlgorithmCreating an Explainable Machine Learning Algorithm
Creating an Explainable Machine Learning Algorithm
 
Explainable Machine Learning
Explainable Machine LearningExplainable Machine Learning
Explainable Machine Learning
 
DSO530 Group project
DSO530 Group projectDSO530 Group project
DSO530 Group project
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
Machine Learning in the Financial Industry
Machine Learning in the Financial IndustryMachine Learning in the Financial Industry
Machine Learning in the Financial Industry
 
Neural Network Model
Neural Network ModelNeural Network Model
Neural Network Model
 
Credit iconip
Credit iconipCredit iconip
Credit iconip
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 

Recently uploaded

Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get CytotecAbortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Abortion pills in Jeddah |+966572737505 | get cytotec
Abortion pills in Jeddah |+966572737505 | get cytotecAbortion pills in Jeddah |+966572737505 | get cytotec
Abortion pills in Jeddah |+966572737505 | get cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...
Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...
Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...
mikehavy0
 
如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证
acoha1
 
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
jk0tkvfv
 
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
pwgnohujw
 
Abortion pills in Riyadh Saudi Arabia| +966572737505 | Get Cytotec, Unwanted Kit
Abortion pills in Riyadh Saudi Arabia| +966572737505 | Get Cytotec, Unwanted KitAbortion pills in Riyadh Saudi Arabia| +966572737505 | Get Cytotec, Unwanted Kit
Abortion pills in Riyadh Saudi Arabia| +966572737505 | Get Cytotec, Unwanted Kit
Abortion pills in Riyadh +966572737505 get cytotec
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 

Recently uploaded (20)

Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get CytotecAbortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
 
Abortion pills in Jeddah |+966572737505 | get cytotec
Abortion pills in Jeddah |+966572737505 | get cytotecAbortion pills in Jeddah |+966572737505 | get cytotec
Abortion pills in Jeddah |+966572737505 | get cytotec
 
Pentesting_AI and security challenges of AI
Pentesting_AI and security challenges of AIPentesting_AI and security challenges of AI
Pentesting_AI and security challenges of AI
 
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
 
Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
社内勉強会資料_Object Recognition as Next Token Prediction
社内勉強会資料_Object Recognition as Next Token Prediction社内勉強会資料_Object Recognition as Next Token Prediction
社内勉強会資料_Object Recognition as Next Token Prediction
 
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
 
Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...
Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...
Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...
 
如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证
 
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
 
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
 
DBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTS
DBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTSDBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTS
DBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTS
 
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
 
Abortion pills in Riyadh Saudi Arabia| +966572737505 | Get Cytotec, Unwanted Kit
Abortion pills in Riyadh Saudi Arabia| +966572737505 | Get Cytotec, Unwanted KitAbortion pills in Riyadh Saudi Arabia| +966572737505 | Get Cytotec, Unwanted Kit
Abortion pills in Riyadh Saudi Arabia| +966572737505 | Get Cytotec, Unwanted Kit
 
Bios of leading Astrologers & Researchers
Bios of leading Astrologers & ResearchersBios of leading Astrologers & Researchers
Bios of leading Astrologers & Researchers
 
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
 
Predictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting TechniquesPredictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting Techniques
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 

Statistical Learning on Credit Data

  • 1. Statistical Learning on Credit Data BUILD MODELS TO PREDICT LOAN DEFAULTS Firas Obeid Ex Treasury Dealer
  • 2. Problem Identification Lending Club Data Loan Data Goal: Predict Loan Defaults using set of variables Process : Statistical Learning Models
  • 3. Data Preparation  Identify data types  Fill missing values  Convert nonnumeric variables  Drop unnecessary variables
  • 4. Statistical Models  “As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality.”~ Einstein  Many machine learning algorithms are stochastic because they explicitly use randomness during optimization or learning.
  • 5. Model Interpretation ARE WE USING BLACKBOXES?
  • 7. Tree Based Model - GBM  LIME (Local Interpretable Model Agnostic Explanations)
  • 8. Neural Network FITTED ON IMPORTANT IDENTIFIED VARIABLES BY PREVIOUS MODEL
  • 9. Logistic Regression – NN Interpreter  Number of Iterations/Epochs: * Solution: Make epochs = max_iter  Optimizer : * Solution: Optimizer = 'sgd' (stochastic gradient decsent Solver = 'sag' (stochastic average gradient descent)  Regularization: * Solution: l2 in both  The L2 regularization (also called Ridge): For l2 / Ridge, as the penalisation increases, the coefficients approach but do not equal zero, hence no variable is ever excluded!
  • 10. Model Stability  For high-bias models, the performance of the model on the validation set is similar to the performance on the training set.  For high-variance models, the performance of the model on the validation set is far worse than the performance on the training set.
  • 11. Model Deployment  Probability of Default from model predictions  Use Probabilities of observations to assign buckets that group customers according to default rate
  • 13. Scorecard (Segmentation & Pricing)  Mean Loan amount per bucket  Interest Rate Proposed per Bucket
  • 15. Conclusion Using statistical models that assign probability to classified classes, help allocate operational risk capital by identifying level of customer riskiness. The method of segmenting customer into buckets using the scorecard approach, identifies: Level of operational and default risk a financial entity is willing to take, The customer base to choose The appropriate interest rate to charge and price loan/credit product repayments.