Credit Limit
Debt to Income
Credit Report
Credit History
A low score leads to rejection of the application or to approval at a higher interest rate.
Identify creditworthy customers
A high credit score indicates a highly creditworthy customer, so the approval process is quick and comes with the lowest interest rates.
Practical Observations and growing need of
Scorecard Model
-By Mary Suma Thiburtius
Using historical data to assess the creditworthiness of a borrower
http://www.linkedin.com/in/sumayyappan | https://twitter.com/tmarysuma
Scorecard-Based Model and Probability of Default
• A credit score assesses the likelihood that a borrower will exhibit characteristics associated with poor performance, based on the historical default rate of borrowers with similar characteristics.
• Credit scores are good for identifying the relative risk of a borrower within a population.
• Once the credit scoring model is built, it can be used to decide whether a credit application should be accepted or rejected, or to derive the probability of a future default.
• Probability of default takes economic conditions into account when predicting the future default rate.
• Probability of default expresses risk as the percentage likelihood that the borrower will fail to make a payment on the loan.
• Default probability is the likelihood, over a specified period (usually one year), that a borrower will not be able to make scheduled repayments. Default probability (PD) depends not only on the borrower's characteristics but also on economic conditions.
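The link between a score and a probability of default can be illustrated with the log-odds scaling that is commonly used for scorecards. This is a minimal sketch, not the scaling used in this deck: the constants PDO, BASE_SCORE and BASE_ODDS and both function names are assumptions chosen for illustration.

```python
import math

# Assumed scaling (not from the slides): a score of 600 corresponds to
# odds of 50:1 (non-default : default), and the odds double every 20 points.
PDO = 20.0
BASE_SCORE = 600.0
BASE_ODDS = 50.0

FACTOR = PDO / math.log(2)
OFFSET = BASE_SCORE - FACTOR * math.log(BASE_ODDS)


def score_to_pd(score):
    """Convert a score to a probability of default under the assumed scaling."""
    odds_good = math.exp((score - OFFSET) / FACTOR)
    return 1.0 / (1.0 + odds_good)


def pd_to_score(p_default):
    """Inverse mapping: probability of default back to a score."""
    return OFFSET + FACTOR * math.log((1.0 - p_default) / p_default)


pd_at_base = score_to_pd(600)      # odds 50:1, so PD = 1/51
pd_plus_pdo = score_to_pd(620)     # odds 100:1, so PD = 1/101
```

Under this kind of scaling the score only ranks borrowers, while the PD attaches a likelihood to each rank; a PD model that also reflects economic conditions would need additional inputs that this sketch does not cover.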
1. Scorecard based model
2. Probability of Default
Note: In general, defaults may come from 10% to 20% of the lending segments, so profiling the risky segments is important for revealing useful information.
1. Develop a scorecard model using logistic regression
Find scores for the variables based on weight-of-evidence (WoE) bins and logistic regression.
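The step above can be sketched as follows, assuming the common convention that WoE is the log of the share of non-defaults over the share of defaults in a bin, and that the logistic regression models the log-odds of default on the WoE-coded variables. The bin counts, the coefficients and the scaling constants are made up for illustration and are not taken from this deck.

```python
import math


def weight_of_evidence(bins):
    """bins maps a bin label to (non_default_count, default_count).

    Every bin is assumed to contain at least one case of each kind;
    empty cells would need smoothing before taking the logarithm.
    """
    total_good = sum(good for good, _ in bins.values())
    total_bad = sum(bad for _, bad in bins.values())
    return {
        label: math.log((good / total_good) / (bad / total_bad))
        for label, (good, bad) in bins.items()
    }


def bin_points(woe, beta, alpha, n_vars, factor, offset):
    """Points for one bin when the regression predicts the log-odds of default."""
    return -(beta * woe + alpha / n_vars) * factor + offset / n_vars


# Hypothetical counts for a binned debt-to-income variable.
dti_bins = {"low": (400, 20), "medium": (300, 30), "high": (100, 50)}
woe = weight_of_evidence(dti_bins)

# Assumed scaling: 600 points at odds of 50:1, odds double every 20 points.
factor = 20 / math.log(2)
offset = 600 - factor * math.log(50)

# Hypothetical fitted intercept and coefficient, for a model with 4 variables.
alpha, beta, n_vars = -2.0, -1.0, 4
points = {
    label: bin_points(w, beta, alpha, n_vars, factor, offset)
    for label, w in woe.items()
}
```

An applicant's score is then the sum of the points of the bins they fall into, one bin per variable.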
Probability Distribution of Score for Train Data
Find the scorecard points for all observations in the train data.
In the example:
• For default cases, the score ranges from 65 to 753, with a mean of 374
• For non-default cases, the score ranges from 186 to 814, with a mean of 591
Inference:
• < 600: poor score, more likely to default
• 600-700: lower default rate, more likely to be approved for a loan
• > 700: excellent score
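To check an inference like this against data, one can group observations into the three bands and compare default rates. The sketch below uses made-up scores and outcomes, and the helper names are hypothetical; a score of exactly 700 is placed in the 600-700 band, which is one reading of the boundaries above.

```python
def score_band(score):
    """Assign a score to one of the three bands used in the inference."""
    if score < 600:
        return "<600"
    if score <= 700:
        return "600-700"
    return ">700"


def default_rate_by_band(scores, defaulted):
    """Return the share of defaults within each band that occurs in the data."""
    counts = {}
    for score, is_default in zip(scores, defaulted):
        band = score_band(score)
        total, bad = counts.get(band, (0, 0))
        counts[band] = (total + 1, bad + int(is_default))
    return {band: bad / total for band, (total, bad) in counts.items()}


# Illustrative observations only, not the deck's data set.
scores = [374, 520, 591, 610, 650, 700, 720, 814]
defaulted = [True, True, False, True, False, False, False, False]
rates = default_rate_by_band(scores, defaulted)
```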
Probability Distribution of Score for Test Data
Scorecard points for all observations in the test data set:
• For default cases, the score ranges from 19 to 797, with a mean of 381
• For non-default cases, the score ranges from 106 to 814, with a mean of 587
[Chart: Accuracy, Default, NonDefault, Kappa, Sensitivity and Specificity (y-axis, 0 to 0.9) plotted against score thresholds from 500 to 700 in steps of 10 (x-axis); labelled values: 0.70, 0.87, 0.20]
How to find the optimal score threshold that maximizes accuracy?
Scorecard performance on test data
Overall accuracy reaches its maximum of 0.70 at a score threshold of 700.
• For default cases, accuracy reaches its maximum of 0.87 at a threshold of 700
• For non-default cases, accuracy reaches its maximum of 0.20 at a threshold of 500
Accuracy Rate for Different Thresholds (for Score Card)
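The measures named in the chart legend can be computed from a confusion matrix at a single score threshold. This is a minimal sketch under two assumptions that the slides do not state: a score below the threshold is classified as a default, and default is treated as the positive class. The function name and the data are illustrative.

```python
def threshold_metrics(scores, defaulted, threshold):
    """Accuracy, sensitivity, specificity and Cohen's kappa at one threshold.

    Assumes both classes are present and the predictions are not all
    identical, otherwise some of the ratios below are undefined.
    """
    tp = fp = tn = fn = 0
    for score, is_default in zip(scores, defaulted):
        predicted_default = score < threshold
        if predicted_default and is_default:
            tp += 1
        elif predicted_default:
            fp += 1
        elif is_default:
            fn += 1
        else:
            tn += 1
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return {
        "accuracy": accuracy,
        "sensitivity": tp / (tp + fn),   # share of defaults caught
        "specificity": tn / (tn + fp),   # share of non-defaults kept
        "kappa": (accuracy - expected) / (1 - expected),
    }


# Illustrative data only.
scores = [350, 420, 580, 610, 640, 690, 720, 760]
defaulted = [True, True, True, False, True, False, False, False]
metrics = threshold_metrics(scores, defaulted, threshold=600)
```

Repeating the call for each threshold from 500 to 700 would give the kind of curves the chart describes.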
Calculate the optimal probability threshold that maximizes accuracy.
Probability of default provides more information than the scores.
• For default cases, the mean probability of default is 0.52
• For non-default cases, the mean probability of default is 0.12
Accuracy Rate for Different Thresholds (for Probability)
Overall accuracy reaches its maximum of 0.87 at a probability threshold of 0.5.
• For default cases, accuracy reaches its maximum of 0.96 at a threshold of 0.9
• For non-default cases, accuracy reaches its maximum of 0.87 at a threshold of 0.1
Distribution of Probability of Default
[Chart: Accuracy, Default, NonDefault, Sensitivity, Specificity and Kappa (y-axis, 0 to 1) plotted against probability thresholds from 0.1 to 0.9 in steps of 0.05 (x-axis); labelled values: 0.87, 0.996, 0.87]
2. Probabilities of Default on test observations
Accuracy is used to evaluate model performance, and it depends on the selected threshold. Below is an approach to find the optimal threshold that maximizes accuracy.
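One simple way to do this is a grid search: evaluate accuracy at each candidate threshold and keep the best one. The sketch below assumes that an observation is classified as a default when its predicted probability is at or above the threshold, and it uses the same grid as the chart (0.10 to 0.90 in steps of 0.05). The function names and the data are illustrative, not the deck's.

```python
def accuracy_at(probabilities, defaulted, threshold):
    """Share of observations classified correctly at the given threshold."""
    correct = sum(
        (p >= threshold) == bool(is_default)
        for p, is_default in zip(probabilities, defaulted)
    )
    return correct / len(probabilities)


def best_threshold(probabilities, defaulted, thresholds):
    """Return the threshold with the highest accuracy (lowest one on ties)."""
    return max(thresholds, key=lambda t: accuracy_at(probabilities, defaulted, t))


# Candidate thresholds: 0.10, 0.15, ..., 0.90
thresholds = [round(0.10 + 0.05 * i, 2) for i in range(17)]

# Illustrative predicted probabilities of default and observed outcomes.
pd_hat = [0.05, 0.08, 0.12, 0.20, 0.35, 0.48, 0.55, 0.70, 0.91]
defaulted = [False, False, False, False, True, False, True, True, True]

best = best_threshold(pd_hat, defaulted, thresholds)
best_accuracy = accuracy_at(pd_hat, defaulted, best)
```

Because accuracy weights both classes by their frequency, a threshold chosen this way can favour the majority class when defaults are rare, so it is worth reading sensitivity and specificity alongside it.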