1. Chapter 5
Consumer Lending
The Presentation Slides for Teaching Regulatory Technology
Website : https://sites.google.com/site/quanrisk
E-mail : quanrisk@gmail.com
Copyright © 2021 Dr. LAM Yat-fai
2. Declaration
Copyright © 2021 Dr. LAM Yat-fai
All rights reserved. No part of this presentation file may be
reproduced, in any form or by any means, without written
permission from Dr. LAM Yat-fai.
Authored by Dr. LAM Yat-fai (林日辉),
Chief Data Scientist, CapitaLogic Limited,
Adjunct Professor of Finance, City University of Hong Kong,
Doctor of Business Administration,
CFA, CAIA, CAMS, CFE, FRM, PRM, MCSE, MCNE.
3. Outline
Sample data set
Regulatory requirements
Model development
Default prediction model
Overdue prediction model
4. Two class sampling techniques
For machine learning theories
50% to 50%
Real full data set
30% to 70%
20% to 80%
Under sampling
Reduce the majority class to match the minority class
Over sampling
Increase the minority class to match the majority class
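The two techniques can be sketched in a few lines of Python. This is a minimal illustration using simple random sampling, not the tool used in the slides; the record layout (`id`, `y`) and the 80/20 split are hypothetical, and both functions assume the majority class is at least as large as the minority class.

```python
import random

def undersample(majority, minority, seed=0):
    """Randomly drop majority records until both classes have the same size."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority)), list(minority)

def oversample(majority, minority, seed=0):
    """Randomly repeat minority records until both classes have the same size."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra

# Hypothetical imbalanced data set: 80 survivors (y = 0), 20 defaulters (y = 1)
survivors = [{"id": i, "y": 0} for i in range(80)]
defaulters = [{"id": i, "y": 1} for i in range(80, 100)]

maj_u, min_u = undersample(survivors, defaulters)
maj_o, min_o = oversample(survivors, defaulters)
print(len(maj_u), len(min_u))  # 20 20
print(len(maj_o), len(min_o))  # 80 80
```

Under sampling discards information from the majority class, while plain over sampling only repeats existing minority records.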
8. SMOTE
Synthetic Minority Oversampling Technique
Create new synthetic records from existing minority
class records
For each record of the minority class
Pick a nearby record of the minority class
Set each feature to a value between the values of
the two records
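A minimal sketch of the standard SMOTE idea follows: each synthetic record is placed at a random point on the line between a minority record and one of its nearest minority neighbours. The function name `smote`, the parameter `k`, and the two numeric features in the example are assumptions made for illustration; the sketch needs at least two minority records and numeric features only.

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic records interpolated between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority records to the base record (Euclidean distance)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Random position between the base record (0) and the neighbour (1)
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# Hypothetical minority class records with two numeric features
minority = [(30.0, 2.0), (32.0, 3.0), (35.0, 2.0), (50.0, 6.0)]
new_records = smote(minority, n_new=3)
print(len(new_records))  # 3
```

Because features with different scales distort the distance, a practical implementation would normally standardize the features first.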
9. Feature type
Numeric
Continuous feature: Annual income
Discrete feature: No. of children
Non-numeric
Binary feature: Gender
Categorical feature: Education
11. Categorical feature
Education
Postgraduate
Undergraduate
Secondary school
Primary school
No education
Country
United States
United Kingdom
Canada
Australia
China
12. Assign ranks to categorical variable
Country          Label average   Rank
Australia        0.21            5
Canada           0.63            1
China            0.49            3
United Kingdom   0.38            4
United States    0.52            2
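In the table, rank 1 goes to the category with the highest label average and rank 5 to the lowest. A minimal sketch of that ranking rule is shown here; the helper names `label_averages` and `ranks_from_averages` are hypothetical, and the averages are copied from the table.

```python
from collections import defaultdict

def label_averages(records, feature, label):
    """Average the 0/1 label within each category of a feature."""
    totals = defaultdict(lambda: [0, 0])  # category -> [label sum, record count]
    for r in records:
        totals[r[feature]][0] += r[label]
        totals[r[feature]][1] += 1
    return {cat: s / n for cat, (s, n) in totals.items()}

def ranks_from_averages(averages):
    """Rank 1 = highest label average, as in the table above."""
    ordered = sorted(averages, key=averages.get, reverse=True)
    return {cat: i + 1 for i, cat in enumerate(ordered)}

# Label averages taken from the slide
averages = {
    "Australia": 0.21,
    "Canada": 0.63,
    "China": 0.49,
    "United Kingdom": 0.38,
    "United States": 0.52,
}
ranks = ranks_from_averages(averages)
print(ranks["Canada"], ranks["Australia"])  # 1 5
```

Replacing a category by its rank turns a non-numeric feature into an ordered numeric one, which is what makes a monotonic relationship with the label possible.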
13. Chapter 5a2 – Sample data set (1)
Datasets
Chapter 5a1 – Full data set.csv
Chapter 5a3 – Country rank
SMOTE
Label column y
SMOTE percentage 58
Edit Metadata
Column Gender, Country
14. Chapter 5a2 – Sample data set (2)
Convert to Indicator Values
Categorical columns Gender
Join Data
Join key columns for Left Country
Join key columns for Right Country
Select Columns in Dataset
All columns exclude Gender, Gender-Male,
Country, Country(2)
Convert to CSV
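The same data preparation steps can be approximated outside the visual designer. The sketch below is an assumption-laden illustration rather than the experiment itself: the indicator column name `Gender-Female`, the joined column name `Rank`, and the sample record values are hypothetical. The duplicate key column `Country(2)` does not arise here because the join is a dictionary lookup.

```python
# Country rank table from the earlier slide
country_rank = {
    "Australia": 5, "Canada": 1, "China": 3,
    "United Kingdom": 4, "United States": 2,
}

def prepare(record, country_rank):
    """Indicator-encode Gender, join the Country rank, drop redundant columns."""
    out = dict(record)
    # Convert to Indicator Values: one 0/1 column per Gender category
    out["Gender-Female"] = 1 if record["Gender"] == "Female" else 0
    out["Gender-Male"] = 1 if record["Gender"] == "Male" else 0
    # Join Data on Country: attach the rank of the borrower's country
    out["Rank"] = country_rank[record["Country"]]
    # Select Columns in Dataset: exclude the original and redundant columns
    for col in ("Gender", "Gender-Male", "Country"):
        del out[col]
    return out

# Hypothetical borrower record
sample = {"Gender": "Female", "Country": "Canada", "Annual income": 50000, "y": 0}
result = prepare(sample, country_rank)
print(result)
# {'Annual income': 50000, 'y': 0, 'Gender-Female': 1, 'Rank': 1}
```

Only one of the two gender indicator columns is kept because the second carries no additional information for a binary feature.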
15. Outline
Sample data set
Regulatory requirements
Model development
Default prediction model
Overdue prediction model
17. Major retail lending products
Term loan
Fixed loan principal
No collateral
Credit card
Subject to a credit limit
No collateral
Residential mortgage
Principal amortization
Property as collateral
18. HKMA SPM CR-G-1
General Principles of Credit Risk Management
4.1.4 Credit decisions should be supported by
adequate evaluation of the borrower's
repayment ability based on reliable
information. Sufficient and up-to-date
information should continue to be available to
enable effective monitoring of the account.
19. Commercial Banks Law, PRC
Article 35. Before granting a loan,
commercial banks shall strictly examine the
borrower's purpose for the loan, ability to
repay the loan, method of repayment, etc.
20. Default
Generic definition
A borrower fails to pay to the lender the interest
and/or principal in full on schedule
Regulatory definition
A borrower fails to pay to the lender the interest
and/or principal within 90 days of the due date
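The difference between the two definitions is easy to see in code. The sketch below is an illustrative reading of the slide, not a regulatory implementation: the function names, the `paid` and `today` parameters, and the example dates are hypothetical, and a payment is assumed to be either made in full or not made at all.

```python
from datetime import date

def is_default_generic(due, paid, today):
    """Generic definition: the payment was not made in full on schedule."""
    if paid is None:
        return today > due
    return paid > due

def is_default_regulatory(due, paid, today, limit_days=90):
    """Regulatory definition: still unpaid more than 90 days after the due date."""
    settled = paid if paid is not None else today
    return (settled - due).days > limit_days

due = date(2021, 1, 31)
today = date(2021, 6, 30)

# Paid 29 days late: a default in the generic sense only
print(is_default_generic(due, date(2021, 3, 1), today))     # True
print(is_default_regulatory(due, date(2021, 3, 1), today))  # False

# Still unpaid 150 days after the due date: a default under both definitions
print(is_default_generic(due, None, today))     # True
print(is_default_regulatory(due, None, today))  # True
```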
22. Risk management objectives
To predict whether a borrower will default in
the following one year
Good, bad
Good, moderate, bad
Good, good to moderate, moderate to bad, bad
Probability of default
23. Regulatory and legal constraints
Interpretability
The relationship between the label and features can
be well established by experience and theory
No violation of common sense unless the
common sense is inapplicable
Equal opportunity
Religion, race, disability … cannot be used
directly as a feature, as mandated by anti-
discrimination law
24. Outline
Sample data set
Regulatory requirements
Model development
Default prediction model
Overdue prediction model
25. Standard procedure (1)
Analyze regulatory requirements
Collect historical records
Prepare sample data set
Over sampling
Convert binary features into [0,1]
Convert categorical features into ranks
26. Standard procedure (2)
Assess monotonicity and principal components
Develop machine learning model
Evaluate model performance
Create prediction model
Assess sample, positive class and negative
class accuracies
Conduct prediction
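The three accuracies in the assessment step can be computed as follows. This is a minimal sketch that assumes the positive class is default (label 1) and the negative class is survive (label 0); the actual and predicted labels are made up for illustration.

```python
def accuracies(actual, predicted):
    """Sample, positive class and negative class accuracies for 0/1 labels."""
    pairs = list(zip(actual, predicted))

    def share_correct(subset):
        return sum(a == p for a, p in subset) / len(subset)

    return {
        "sample": share_correct(pairs),
        "positive_class": share_correct([(a, p) for a, p in pairs if a == 1]),
        "negative_class": share_correct([(a, p) for a, p in pairs if a == 0]),
    }

# Hypothetical results: 3 defaulters followed by 5 survivors
actual    = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 1]
result = accuracies(actual, predicted)
print(result["sample"])          # 0.75
print(result["negative_class"])  # 0.8
```

On an imbalanced data set the sample accuracy is dominated by the majority class, which is why the two class accuracies are reported separately.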
27. Modelling assumptions
Many historical borrower records
Many borrowers survive in one year
Some borrowers default in one year
There is a monotonic relationship between
Label: whether a borrower defaults in one year
Features: borrower information collected one year
ago
28. Label
0
The borrower survives in one year
1
The borrower defaults in one year
29. Features
From application form
Personal data
Employment data
Income data
Asset data
Liability data
30. Expected monotonicity
Strong
Annual income, outstanding loan, equity investments
To be verified by data
Weak or unknown
Education, gender, marital status
To be determined by data
No
ID card no., telephone no., application date
To be ignored
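The slides do not state how monotonicity is verified by data. One common choice, used here purely as an assumption, is the Spearman rank correlation between a feature and the label: a value near +1 or -1 suggests strong monotonicity, and a value near 0 suggests weak or none. The income figures below are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical sample: annual income (thousands) and default label
income  = [20, 35, 50, 65, 80, 95]
default = [1, 1, 1, 0, 0, 0]
rho = spearman(income, default)
print(rho < 0)  # True: higher income goes with fewer defaults
```

A feature such as an ID card number would be expected to give a value close to zero, consistent with the "To be ignored" group.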
31. Data dictionary
Label
Survive (0) or Default (1)
Description
The default status of a borrower in one year after
his features were collected
Label type
Binary
32. Data dictionary
Feature
Income
Description
The total income during the last 12 months, including
fixed salary, commission and bonus, after income tax
and pension contribution have been deducted
Feature type
Continuous value
Expected impact on default
Strong
Negative
33. Data dictionary
Feature
Loan amount
Description
The outstanding loan amount
Feature type
Continuous value
Expected impact on default
Strong
Positive
34. Outline
Sample data set
Regulatory requirements
Model development
Default prediction model
Overdue prediction model
35. Two class PD model
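\[ PD = F(x_1, x_2, x_3, \ldots, x_N) \]
Maximize
\[ L = PD^{Default}_1 \times PD^{Default}_2 \times PD^{Default}_3 \times \cdots \times PD^{Default}_U \times \left(1 - PD^{Survive}_1\right) \times \left(1 - PD^{Survive}_2\right) \times \left(1 - PD^{Survive}_3\right) \times \cdots \times \left(1 - PD^{Survive}_V\right) \]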
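The likelihood L multiplies the predicted PD of every borrower who defaulted by one minus the predicted PD of every borrower who survived. In practice the logarithm of L is maximized, since a product of many small probabilities underflows. The sketch below only evaluates the objective; the function name and the PD values are hypothetical.

```python
import math

def log_likelihood(pd_defaulted, pd_survived):
    """log L = sum of log(PD) over defaulters + sum of log(1 - PD) over survivors."""
    return (sum(math.log(p) for p in pd_defaulted)
            + sum(math.log(1 - p) for p in pd_survived))

# Hypothetical PDs from a model that separates the two classes well
good = log_likelihood(pd_defaulted=[0.8, 0.7], pd_survived=[0.1, 0.2, 0.1])

# The same borrowers scored by a model that gives everyone PD = 0.4
flat = log_likelihood(pd_defaulted=[0.4, 0.4], pd_survived=[0.4, 0.4, 0.4])

print(good > flat)  # True: the better model has the higher likelihood
```

Fitting the model means choosing the parameters of F so that this value is as large as possible.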
Copyright © 2020 CapitaLogic Limited
36. Full data set
Feature
Data on existing mortgage borrowers one year ago
Label
Either survives or defaults in one year
Several thousand records
Imbalanced
Some feature values missing
Some records duplicated
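Before sampling, the three data quality issues listed above (imbalance, missing values, duplicates) can be measured with a short profile. This is a minimal sketch: the function name `profile`, the use of `None` to mark a missing value, and the four sample records are assumptions for illustration.

```python
from collections import Counter

def profile(records, label):
    """Summarise duplicates, missing values and class balance of a data set."""
    seen = set()
    unique = []
    for r in records:
        # Two records are duplicates when every column holds the same value
        key = tuple(sorted(r.items(), key=lambda kv: kv[0]))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    missing = Counter(k for r in unique for k, v in r.items() if v is None)
    classes = Counter(r[label] for r in unique)
    return {
        "duplicates": len(records) - len(unique),
        "missing": dict(missing),
        "classes": dict(classes),
        "unique": unique,
    }

# Hypothetical records
records = [
    {"Loan amount": 100000, "Debt-to-income ratio": 0.35, "Default": 0},
    {"Loan amount": 100000, "Debt-to-income ratio": 0.35, "Default": 0},  # duplicate
    {"Loan amount": 250000, "Debt-to-income ratio": None, "Default": 1},  # missing value
    {"Loan amount": 80000, "Debt-to-income ratio": 0.20, "Default": 0},
]
summary = profile(records, label="Default")
print(summary["duplicates"])  # 1
print(summary["missing"])     # {'Debt-to-income ratio': 1}
print(summary["classes"])     # {0: 2, 1: 1}
```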
37. Label and all features
Label
0: Survives in the last year
1: Defaults in the last year
Loan amount
Loan purpose
Debt consolidation
Home improvement
Job title
No. of years at current job
No. of major derogatory reports
No. of delinquent credit lines
Age of oldest credit line in
months
No. of credit enquiries in the
most recent 30 days
No. of credit lines
Debt-to-income ratio
38. Chapter 5b2 – Sample data set (1)
Datasets
Chapter 5b1 – Full data set.csv
Chapter 5b3 – Job title rank
SMOTE
Label column Default
SMOTE percentage 57
Edit Metadata
Column Loan purpose, Job title
39. Chapter 5b2 – Sample data set (2)
Convert to Indicator Values
Categorical columns Loan purpose
Join Data
Join key columns for Left Job title
Join key columns for Right Job title
Save as Dataset
Select Columns in Dataset
All columns exclude Loan purpose, Loan purpose-
Loan Consolidation, Job title, Job title(2)
Convert to CSV
40. Feature selection
Weak monotonicity
Mortgage amount
No. of credit lines
Loan purpose-Home
No. of principal components
8
41. Chapter 5b4 – Prediction model
Datasets
Chapter 5b2 – Sample data set
Chapter 5b3 – Job title rank
Edit Metadata
Column Job title
Join Data
Join key columns for Left Job title
Join key columns for Right Job title
Select Columns in Dataset
All columns exclude Loan purpose, Job title, Job
title(2)
42. Other standard procedures
Run training experiment
Create predictive experiment
Modify Web service input and Web service output
Run predictive experiment
Deploy web service
Download Excel prediction model
Upload Excel prediction model to OneDrive
Conduct assessment and prediction
43. Excel prediction model
5 samples test
All samples accuracy
Overall accuracy
Survive class accuracy, full data set
Upper cutoff score
Defaulted class accuracy, full data set
Lower cutoff score
Prediction
44. Outline
Sample data set
Regulatory requirements
Model development
Default prediction model
Overdue prediction model
45. Label and all features, three class
Label
Survive
Overdue
Default
Loan amount
Loan purpose
Debt consolidation
Home improvement
Job title
No. of years at current job
No. of major derogatory reports
No. of delinquent credit lines
Age of oldest credit line in
months
No. of credit enquiries in the
most recent 30 days
No. of credit lines
Debt-to-income ratio
46. Label and all features, two class
Label
0: Survive
1: Overdue or default
Loan amount
Loan purpose
Debt consolidation
Home improvement
Job title
No. of years at current job
No. of major derogatory reports
No. of delinquent credit lines
Age of oldest credit line in months
No. of credit enquiries in the most recent 30 days
No. of credit lines
Debt-to-income ratio
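Collapsing the three class label into the two class label is a simple mapping in which overdue and default share the value 1. The sketch below assumes the three class labels are stored as the strings shown on the previous slide; the function name is hypothetical.

```python
# Three class label -> two class label (0: survive, 1: overdue or default)
TWO_CLASS = {"Survive": 0, "Overdue": 1, "Default": 1}

def to_two_class(labels):
    """Map three class labels to the two class label used by the model."""
    return [TWO_CLASS[label] for label in labels]

three_class = ["Survive", "Overdue", "Default", "Survive"]
two_class = to_two_class(three_class)
print(two_class)  # [0, 1, 1, 0]
```

Merging overdue into the positive class enlarges the minority class, so the amount of over sampling needed to balance the data differs from the default-only model.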
47. Chapter 5c2 – Sample data set (1)
Datasets
Chapter 5c1 – Full data set.csv
Chapter 5c3 – Job title rank
SMOTE
Label column Default
SMOTE percentage 124
Edit Metadata
Column Loan purpose, Job title
48. Chapter 5c2 – Sample data set (2)
Convert to Indicator Values
Categorical columns Loan purpose
Join Data
Join key columns for Left Job title
Join key columns for Right Job title
Select Columns in Dataset
All columns exclude Loan purpose, Loan purpose-
Loan Consolidation, Job title, Job title(2)
Convert to CSV
49. Feature selection
Weak monotonicity
Mortgage amount
No. of credit lines
Binary feature
Loan purpose-Home improvement
PCA cannot be applied
No. of principal components
9
50. Chapter 5c4 – Prediction model (1)
Datasets
Chapter 5b2 – Sample data set
Chapter 5b3 – Job title rank
Edit Metadata
Column Loan purpose, Job title
Convert to Indicator Values
Categorical columns Loan purpose
51. Chapter 5c4 – Prediction model (2)
Join Data
Join key columns for Left Job title
Join key columns for Right Job title
Select Columns in Dataset
All columns exclude Loan purpose, Loan purpose-
Loan Consolidation, Job title, Job title(2)
52. Is overdue a weak form of default?
Default
Loan purpose
No. of credit lines
Overdue
Loan amount
No. of credit lines
53. Creation of theory
Hypothesis
Overdue is a weak form of default
Testing
H0: Overdue and default are caused by the same
set of features
H1: Overdue and default are NOT caused by the
same set of features
To be researched