The objective of this project is to analyze the ability of the developed Artificial Neural Network model
to forecast the credit risk profile of retail banking loan consumers and credit card
customers.
From a theoretical point of view, this project presents a literature review on the detailed
working of Artificial Neural Networks and their application to credit risk management.
Practically, the aim of this project is to present a model for estimating the Probability of Default
using an Artificial Neural Network, thereby accruing the benefits of non-linear models.
2. Regulatory Environment
• Basel Accords refer to the banking supervision accords (recommendations on banking
regulations) Basel I, Basel II and Basel III—issued by the Basel Committee on Banking
Supervision (BCBS).
• The Basel Committee does not have the authority to enforce its recommendations, although most
member countries, as well as some other countries, tend to implement the Committee's policies;
this includes the G-20 members and other major banking centres such as Hong Kong and Singapore.
• Measurement of Credit Risk is an important exercise for financial institutions, more so because
of regulatory requirements.
• Credit Risk can be classified under two categories – Issuer Risk and Counterparty Risk.
• Issuer Risk is the risk that the issuer/obligor defaults and is unable to fulfil payment obligations.
• Counterparty Risk includes Default Risk – the risk that the counterparty defaults with no
payment or incomplete payment on the transaction; Replacement Risk – the risk that, after a default occurs,
replacing the deal under the same conditions is not possible; and Settlement Risk – the risk that a party involved in
the settlement fails before the transaction is fully settled.
3. • The integral components of credit risk, as recognized by the Bank for International Settlements
(BIS), are
• Probability of Default (PD): Probability that the obligor will default within a given time horizon
• Exposure at Default (EAD): Amount outstanding with the obligor at the time of default
• Loss given Default (LGD): Percentage loss incurred relative to the EAD
• Maturity (M): Effective maturity of the exposure
[Diagram: Components of Credit Risk – Probability of Default, Exposure at Default, Loss given Default, Maturity]
4. • Basel I, the 1988 Accord, is primarily focused on Credit Risk and appropriate Risk Weighting of
Assets.
• Assets of banks were classified and grouped in five categories according to credit risk, carrying
risk weights of 0% (for example cash, bullion, home country debt like Treasuries), 20%
(securitizations such as mortgage-backed securities (MBS) with the highest AAA rating), 50%
(municipal revenue bonds, residential mortgages), 100% (for example, most corporate debt), and
some assets given No Rating.
• Banks are also required to report off-balance-sheet items such as letters of credit, unused
commitments, and derivatives. These all factor into the risk weighted assets.
• Under Basel I, the risk weights depend on the categorization of obligors. They do not consider
the actual obligor risk rating or the tenor of the facility and do not recognize any form of
collateral. Therefore, the credit risk capital required as a percentage of exposure will be a
constant 8% across all facility ratings.
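• As a worked illustration with hypothetical figures: a $100 residential mortgage carries a 50% risk weight, contributing $50 of risk-weighted assets and requiring 8% × $50 = $4 of regulatory capital.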
Basel I
5. THREE PILLARS CONCEPT
[Diagram: the three pillars of Basel II – Minimum Capital Requirements, Supervisory Review, Market Discipline]
• The First Pillar: Minimum Capital Requirement deals with maintenance of regulatory capital
calculated for three major components of risk that a bank faces: CREDIT
RISK, OPERATIONAL RISK, AND MARKET RISK.
• The Second Pillar: Supervisory Review is a regulatory response to the first pillar,
giving regulators better 'tools' than those previously available. It also provides a framework
for dealing with systemic risk, pension risk, concentration risk, strategic risk, reputational
risk, liquidity risk and legal risk, which the accord combines under the title of residual risk.
• The Third Pillar: Market Discipline aims to complement the minimum capital
requirements and supervisory review process by developing a set of disclosure requirements
which will allow the market participants to gauge the capital adequacy of an institution.
Basel II
6. • Under Basel II, the Credit Risk Measurement techniques proposed under Capital Adequacy rules can be
classified under
• Standardized Approach: This approach uses a simplistic categorization of obligors, without considering
their actual credit risks; external credit ratings are used
• Internal Ratings-Based (IRB) Approach: In this approach, banks that meet certain criteria are permitted to
use their own estimated risk parameters to calculate regulatory capital required for credit risk
• The IRB approach can be further classified into
• Foundation-IRB: Banks are allowed to calculate the Probability of Default (PD) for each asset, while the regulator will
determine Loss Given Default (LGD) and Exposure at Default (EAD). Maturity (M) can be assigned by either the bank or the regulator
• Advanced-IRB: Banks are allowed to use their internal models to calculate PD, LGD, EAD, and M. The primary
objective of employing these models is to arrive at the Total Risk Weighted Assets (RWA), which is used to calculate
the regulatory capital.
• The RWA calculation is based on either Standardized or IRB approach.
Basel II
8. Internal Ratings Based Approach
• The Internal Ratings-Based (IRB) Approach of Basel II allows banks to use their internal estimates of
risk parameters to calculate the required capital related to the exposure.
• From the bank's perspective, the IRB approach allows it to use internal models to calculate credit capital,
enabling more sensitivity to the credit risk in the bank's portfolio. Furthermore, better risk management
of the portfolio is rewarded through a lower regulatory capital requirement.
• Under the IRB approach, banks are required to categorize their banking book exposures into the following
asset classes
• Corporate
• Sovereign
• Bank
• Retail
• Equity
• Basel provides risk weight formulas for the IRB approach; the PD, LGD, and M are inputs to these
formulas. The formula varies depending on the exposure category. Under the IRB–Foundation, the
formula assumes a value for LGD and M, while under the IRB–Advanced, all parameters are estimated
using internal models.
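For reference, the Basel II risk-weight function for corporate exposures takes the form below (a sketch of the published BCBS formula, with N the standard normal CDF and G its inverse; retail and other asset classes use different correlation parameters):

\[
R = 0.12\,\frac{1 - e^{-50\,PD}}{1 - e^{-50}} + 0.24\left(1 - \frac{1 - e^{-50\,PD}}{1 - e^{-50}}\right),
\qquad
b = \left(0.11852 - 0.05478 \ln PD\right)^2
\]
\[
K = LGD \left[ N\!\left( \frac{G(PD) + \sqrt{R}\, G(0.999)}{\sqrt{1 - R}} \right) - PD \right] \cdot \frac{1 + (M - 2.5)\, b}{1 - 1.5\, b},
\qquad
RWA = 12.5 \cdot K \cdot EAD
\]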
9. Important Characteristics of IRB
• To get a go-ahead from the regulators, banks' internal estimation techniques should
meet some stringent quantitative and qualitative requirements, as follows
• Internal models are expected to be Risk-Sensitive to the Portfolio of the bank
• The Internal Model should be able to capture obligor characteristics and should have
sufficient information to estimate the key risk parameters within statistical confidence levels
• There should be proper Corporate Governance and internal controls
• The modelling and capital estimation framework should be linked to the day-to-day
operations of the bank
• There should be an appropriate validation and testing process, ensuring precise
PD, LGD, EAD, and capital estimates for credit risk
10. Probability of Default (PD)
• Probability of Default (PD) is an estimate of the likelihood that the obligor will be unable to meet its debt obligation
over a certain time horizon.
• PD is integral to estimating credit risk and its associated economic capital/regulatory capital.
• According to Basel II, a default event on an obligation occurs if either or both of the following conditions are met:
• The obligor is unlikely to be able to repay its debt in full without giving up any pledged collateral
• The obligor is more than 90 days past due on a material credit obligation
• Estimation of PD depends on two broad categories of information
• Macroeconomic – Unemployment, GDP Growth Rate, Interest Rate etc.
• Obligor Specific – Financial Ratios/Growth (Corporate), Demographic Information (Retail)
• PD can be categorized as Unstressed/Stressed PD and Through-the-Cycle/Point-in-Time PD.
• If the PD is estimated considering the current macroeconomic and obligor-specific information, it is known as
Unstressed PD. Stressed PD is estimated using current obligor-specific information and “stressed”
macroeconomic factors (independent of the current state of the economy).
• Point-in-Time PD estimates incorporate macroeconomic conditions and the obligor's own credit quality, whereas
Through-the-Cycle PD estimates are mainly determined by factors affecting the obligor's long-run credit quality trends.
11. PD Estimation Techniques
• Any of the following four modeling techniques can be used to estimate PD
• Pooling – estimated empirically using historical default data of a large universe of obligors
• Statistical – estimated using statistical techniques through macro and obligor-specific data
• Reduced-Form – estimated from the observable prices of CDSs, bonds, and equity options
• Structural – estimated using company-level information
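As a minimal illustration of the Statistical technique above, the sketch below fits a logistic regression of default on obligor-specific variables; the toy data frame and its column names are hypothetical:

```r
# Statistical PD estimation sketch: logistic regression mapping
# obligor-specific ratios to a probability of default.
obligors <- data.frame(
  debt_ratio    = c(0.2, 0.8, 0.6, 0.9, 0.3, 0.5),
  income_growth = c(0.05, -0.02, 0.01, -0.05, 0.08, 0.00),
  defaulted     = c(0, 1, 0, 1, 1, 0)
)

# Fit P(default) = logistic(b0 + b1*debt_ratio + b2*income_growth)
pd_model <- glm(defaulted ~ debt_ratio + income_growth,
                data = obligors, family = binomial)

# Estimated PD for a new obligor
new_obligor <- data.frame(debt_ratio = 0.6, income_growth = 0.02)
predict(pd_model, newdata = new_obligor, type = "response")
```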
12. Artificial Neural Networks
• Neural Networks are learning systems which can model the relation between a set of inputs and a set of
outputs, under the assumption that the relationship is non-linear
• Black Box Nature: they are considered black boxes since it is not possible to extract symbolic information
from their internal configuration. To overcome this, Fuzzy Logic can be used.
• Artificial Neural Networks modify their internal parameters in order to perform a given computational task
13. Network Topology Structure
• Composed of a set of neurons connected in a predefined topology
• In some cases, the topology is also dynamic
• The connections among the neurons have an associated weight which determines the type and intensity of the
information exchanged
• In a given topology, the weights define the network's behavior
• The most commonly used network topologies are Layered and Completely Connected
• In Layered networks, the neurons are subdivided into layers. If the connections run in only one
direction, the network is called a Feedforward Network; if loops are also allowed, the network is called a
Recurrent Network
• In Completely Connected networks, every neuron is connected to every other neuron.
14. Working of ANN
• Neurons are elementary computational units of the network
• A neuron receives inputs from other neurons and produces an
output which is transmitted to other destination neurons
• The generation of output is divided in two steps: In the first step,
the weighted sum of inputs is evaluated, i.e., every single input is
multiplied by the weight on the corresponding link and all these
values are summed up. In the second step, the activation is
evaluated by applying a particular activation function to the
weighted sum of inputs.
• Commonly used activation functions include the Linear Function, Step
Function, and Sigmoid Function.
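As a minimal sketch of the two-step computation described above (all values are illustrative):

```r
# A single artificial neuron: weighted sum of inputs, then a
# sigmoid activation applied to that sum.
sigmoid <- function(x) 1 / (1 + exp(-x))

neuron <- function(inputs, weights, bias = 0) {
  z <- sum(inputs * weights) + bias   # step 1: weighted sum of inputs
  sigmoid(z)                          # step 2: apply activation function
}

neuron(inputs = c(0.5, 0.8), weights = c(0.4, -0.7))
```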
15. Learning Algorithm
• Before the neural network can be applied to the problem at hand, a specific tuning of its weights has to be done.
• This task is accomplished by the learning algorithm which trains the network and iteratively modifies the weights until
a specific condition is verified.
• In most applications, the learning algorithm stops as soon as the discrepancy (error) between desired output and the
output produced by the network falls below a predefined threshold.
• There are three types of learning mechanisms for neural networks
Supervised Learning
Supervised Learning is characterized by a training set, a set of correct examples used to train the network. The training set is composed of pairs of inputs and corresponding desired outputs. The error produced by the network is then used to change the weights.
Unsupervised Learning
In Unsupervised Learning algorithms, the network is only provided with a set of inputs and no desired output is given. The algorithm guides the network to self-organize and adapt its weights.
Reinforcement Learning
Reinforcement Learning trains the network by introducing prizes and penalties as a function of the network response. Prizes and penalties are then used to modify the weights. Reinforcement learning algorithms are applied to train adaptive systems which perform a task composed of a sequence of actions.
16. Learning Algorithm – Backpropagation
• In the case of the BP learning algorithm, the network begins its
training with a random set of weights and a set of input-output
relations (represented by data sets).
• During the feedforward process, each input unit (X) receives an
input signal and sends the signal to each of the hidden nodes.
• Each hidden node then computes its activation and sends its
signal Z to the output nodes.
• Each output node computes its activation Y* to form the
response of the network for the given input pattern.
• These estimates are then compared to the desired output, and an
error is computed for each given observation.
• This error is then transmitted backward from the output layer to
each node in the hidden layer.
• Each of the hidden nodes receives only a portion of the error,
based on the relative contribution of each hidden node to the
given estimate.
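The following is a minimal sketch of these steps for a tiny 2-input, 3-hidden-node, 1-output network with sigmoid activations, trained on the XOR pattern; the data, dimensions, and learning rate are illustrative assumptions, not the project's actual configuration:

```r
# Backpropagation sketch: feedforward, output error, error shares
# propagated back to the hidden layer, then weight updates.
sigmoid <- function(x) 1 / (1 + exp(-x))

X <- matrix(c(0, 0, 1, 1,
              0, 1, 0, 1), ncol = 2)     # input patterns (one per row)
y <- c(0, 1, 1, 0)                       # desired outputs

set.seed(1)
W1 <- matrix(runif(9, -1, 1), nrow = 3)  # weights: bias+inputs -> 3 hidden nodes
W2 <- runif(4, -1, 1)                    # weights: bias+hidden -> output
lr <- 0.5                                # learning rate

for (epoch in 1:20000) {
  for (i in 1:nrow(X)) {
    x     <- c(1, X[i, ])                    # input signals (with bias)
    Z     <- as.vector(sigmoid(x %*% W1))    # hidden-node signals Z
    zb    <- c(1, Z)
    Ystar <- sigmoid(sum(zb * W2))           # network response Y*

    # Error at the output, then each hidden node's share of it
    d_out <- (Ystar - y[i]) * Ystar * (1 - Ystar)
    d_hid <- d_out * W2[-1] * Z * (1 - Z)

    # Update weights in proportion to each node's contribution
    W2 <- W2 - lr * d_out * zb
    W1 <- W1 - lr * outer(x, d_hid)
  }
}

# Responses after training; should approach 0, 1, 1, 0 (more
# iterations may be needed depending on the random start)
sapply(1:nrow(X), function(i) {
  Z <- as.vector(sigmoid(c(1, X[i, ]) %*% W1))
  sigmoid(sum(c(1, Z) * W2))
})
```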
17. PREPARING THE DATASET
Variables
• For assessing the creditworthiness of customers, various variables related to their profitability, growth, expenses, etc. can be examined
• These variables can also be ratios
• Variable names should be properly defined
Cleaning the Dataset
• Missing and wrong values: directly discarding the entries for the corresponding firm/customer may delete other important information
• To avoid this, where possible, replace these values with the average of that variable, or obtain a regression equation from the available variables and fill in the missing values using this equation
• Factor Analysis
Normalization
• This is performed to feed the net with data ranging over the same interval for each input node
• A min-max linear transformation can be used to map the data into [0,1], but a lot of useful information may be lost, and several fields would have almost all data close to one of the limits of the normalization
• Alternatively, a logarithmic formula can be used to normalize the data
Correlation Analysis
• This step is performed to analyze the correlation between the set of variables
• If high correlation exists between any pair of variables, adjustments should be made accordingly
Training & Testing Data
• The data set should be divided in a fixed ratio for training and testing purposes
• The network should be trained on one portion of the dataset and tested on the remainder
• Cross-Validation Techniques (a sketch of these preparation steps follows below)
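A minimal sketch of the preparation steps above, assuming a hypothetical customers.csv file with income, age, and debt_ratio columns:

```r
# Clean, normalize, inspect correlations, and split the data set.
raw <- read.csv("customers.csv")

# Missing values: replace with the variable's average (one option above)
raw$income[is.na(raw$income)] <- mean(raw$income, na.rm = TRUE)

# Min-max linear transformation of each input column into [0, 1]
minmax <- function(v) (v - min(v)) / (max(v) - min(v))
inputs <- as.data.frame(lapply(raw[, c("income", "age", "debt_ratio")], minmax))

# Correlation analysis across the candidate variables
round(cor(inputs), 2)

# 70/30 split into training and test sets
set.seed(42)
idx   <- sample(nrow(inputs), size = floor(0.7 * nrow(inputs)))
train <- inputs[idx, ]
test  <- inputs[-idx, ]
```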
18. Data Cross-Validation
Data Split
Data splitting involves partitioning the data into an explicit training dataset used to prepare the model and an unseen test dataset used to evaluate the model's performance on unseen data.
Bootstrap Resampling
Bootstrap resampling involves taking random samples from the dataset (with replacement) against which the model is evaluated. In aggregate, the results provide an indication of the variance of the model's performance. Typically, a large number of resampling iterations are performed (thousands or tens of thousands).
K-fold Cross-Validation
The k-fold cross-validation method involves splitting the dataset into k subsets. Each subset is held out in turn while the model is trained on all the other subsets. This process is repeated until an accuracy estimate is determined for each instance in the dataset, and an overall accuracy estimate is provided.
Repeated K-fold Cross-Validation
The process of splitting the data into k folds can be repeated a number of times; this is called repeated k-fold cross-validation. The final model accuracy is taken as the mean over the repeats.
Leave One Out Cross-Validation
In leave-one-out cross-validation (LOOCV), one data instance is left out and a model is constructed on all the other data instances in the training set. This is repeated for all data instances.
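A minimal sketch of k-fold cross-validation (k = 5) in R, assuming a hypothetical data frame credit_data with a binary defaulted column:

```r
# Each of the k folds is held out once while a logistic model is
# trained on the remaining folds; fold accuracies are then averaged.
k <- 5
set.seed(7)
folds <- sample(rep(1:k, length.out = nrow(credit_data)))

fold_accuracy <- sapply(1:k, function(f) {
  train <- credit_data[folds != f, ]   # train on the other k-1 folds
  held  <- credit_data[folds == f, ]   # hold this fold out
  fit   <- glm(defaulted ~ ., data = train, family = binomial)
  pred  <- predict(fit, newdata = held, type = "response") > 0.5
  mean(pred == (held$defaulted == 1))  # accuracy on the held-out fold
})

mean(fold_accuracy)  # overall accuracy estimate
```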
19. DISCRIMINANT ANALYSIS & ANN IN CREDIT
CARD CUSTOMERS
• Paper by Mehmat Yazici of Istanbul Arel University addressing the problem of credit card defaults
• Assesses the credibility of credit card customers using Discriminant Analysis (to find the statistically
significant inputs) and then uses the data in an Artificial Neural Network (ANN) analysis
• Customers failing to make payments for 3 months were marked problematic (0) in the data set
• 133 customer records (97 for training) & 23 independent variables (11 quantitative)
• Discriminant Analysis done using the Altman Z-score; significant variables were then subjected to ANN analysis
• Eigenvalue, Wilks' Lambda (40%) & Canonical Correlation (59%) were used to explain the variation
in the dependent variable
• 9 variables were found significant and used in the ANN framework
• ANN supplemented with discriminant analysis provides more significant results
• Only the statistically significant variables were used to get results through ANN
Prediction results (rows: true class; columns: predicted class):

Sample     Observation   Predicted 0   Predicted 1   True Separation %
Training   0             14            0             100
Training   1             0             83            100
Training   Total %       12.4%         85.6%         100
Test       0             6             2             75.0
Test       1             1             27            96.4
Test       Total %       19.4%         80.6%         91.7
EVALUATING CUSTOMER LOANS USING
NEURAL NETWORKS
• Paper by Rashmi Malhotra, Philadelphia University
• Evaluates the effectiveness of neural networks in assisting loan officers to screen
potential loan defaulters, comparing results with Discriminant Analysis
• 1078 observations (700 for training) with both quantitative & qualitative variables
• Categorization: Group 1: good credit; Group 2: defaults
• Discriminant Analysis done using the Altman Z-score; significant variables were then
subjected to ANN analysis
• Backpropagation is used to minimize the error between the desired & actual output
• Cross-Validation: models run on 7 sample data sets to evaluate consistency
in performance
Results: the Neural Network model performed consistently better in identifying problem
loans
• Accuracy of 70-71% using the Discriminant Model
• 72-85% accuracy for good loans using ANN
• A paired t-test shows a statistically significant superiority of the
Artificial Neural Network model over Discriminant Analysis in identifying
problem loans
20. NEURAL NETWORK FOR CREDIT RISK
EVALUATION
• Paper by Adnan Khashman, Turkey
• Usefulness of neural networks in learning & estimating the default
tendency of borrowers
• Addresses the problems in credit risk evaluation using ANN:
• High ratio of training to validation datasets
• Normalization of input data
• Computational data
• German credit dataset of 1000 (700 good, 300 bad) loan applications &
20 categorical & numerical attributes
• Data processing phase: each numerical value is normalized separately
• Evaluation phase: deciding whether to accept or reject the application
using applicant attributes
• Investigated the ideal learning ratio under 9 learning scheme scenarios (results tabulated below)
• The ideal ratio of 400:600 was arrived at after 18652 iterations, having the least
computational cost & training time with the highest accuracy rate of 99.25%
NEURAL NETWORK APPROACH FOR CREDIT RISK EVALUATION
• Paper by Eliana Angelini, Italy
• Estimates the Probability of Default of a given borrower using a neural network
• Applies the neural network method to 76 small businesses from a bank in Italy
• Training set of 53 firms, 11 independent variables (8 of them financial ratios)
• Categorization: in bonis: good credit; default: not repaying the loan obligation
• Cleaning and normalization of datasets with a min-max linear transformation
• Output range [0,1], interpreted with y > 0.5 as in bonis and the rest as default
• Two architectures: a feedforward network with inputs given as an ordered array, and an
ad hoc network whose inputs are grouped in threes, corresponding to the values of the
attributes over three years
• Average total errors between 11% & 14% for the feedforward network & about 7%
for the ad hoc network
Learning scheme results (Khashman study; learning ratio = training:validation split of the 1000 applications):

Learning Scheme   Learning Ratio   Error      Training Time   Overall Accuracy
LS1               100:900          0.020122   102.11          73.9
LS3               300:700          0.010148   206.61          79.3
LS4               400:600          0.008000   183.88          83.6
LS7               700:300          0.008531   414.16          85.9
21. BANKRUPTCY PREDICTION OF FINANCIALLY STRESSED FIRMS
• Developed financial stress prediction models including quantitative & qualitative measures
• Uses data on financially stressed firms that didn't file for bankruptcy, making the decision-making process more realistic
• Two types of neural networks, Backpropagation (ANN-BP) & Genetic Algorithms (ANN-GA), compared with Multiple
Discriminant Analysis (MDA)
• Sample of 522 firms (418 for training): 319 with loan defaults, 91 firms restructuring their debt & 104 that filed for bankruptcy
• ANN-GA achieved the highest accuracy in predicting the financial status of a company, followed by ANN-BP
• ANN-GA had the lowest error at 5%, ANN-BP 11%, and MDA 44%
• ANN-GA used the Zmijewski score instead of the Altman Z-score
• Can be applied universally across industries
• Lowest expected loss for all cost ratios
22. Developing ANN in R
• Import the dataset into R
• Normalize the data
• Split the data into training and test data sets
• Generate the formula for the ANN
• Develop the Artificial Neural Network
• Plot the Artificial Neural Network
• Determine the Mean Square Error, Sensitivity and Specificity, and compare with a logit model
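A hedged end-to-end sketch of these steps using the neuralnet package; the file name credit.csv, the binary default column, and the network size are assumptions, not the project's actual configuration:

```r
library(neuralnet)

data <- read.csv("credit.csv")                      # 1. import the dataset

minmax <- function(v) (v - min(v)) / (max(v) - min(v))
scaled <- as.data.frame(lapply(data, minmax))       # 2. normalize

set.seed(1)                                         # 3. train/test split
idx   <- sample(nrow(scaled), size = floor(0.7 * nrow(scaled)))
train <- scaled[idx, ]
test  <- scaled[-idx, ]

# 4. generate the formula explicitly: default ~ x1 + x2 + ...
#    (neuralnet does not expand the "." shorthand)
predictors <- setdiff(names(train), "default")
f <- as.formula(paste("default ~", paste(predictors, collapse = " + ")))

# 5. develop the network (one hidden layer of 5 neurons, assumed)
nn <- neuralnet(f, data = train, hidden = 5, linear.output = FALSE)

plot(nn)                                            # 6. plot the network

# 7. MSE, sensitivity, specificity; compare with a logit model
pred <- as.vector(compute(nn, test[, predictors])$net.result)
mse  <- mean((test$default - pred)^2)
hit  <- pred > 0.5
sens <- sum(hit & test$default == 1) / sum(test$default == 1)
spec <- sum(!hit & test$default == 0) / sum(test$default == 0)

logit <- glm(f, data = train, family = binomial)    # logit benchmark
logit_pred <- predict(logit, newdata = test, type = "response") > 0.5
c(nn_accuracy    = mean(hit == (test$default == 1)),
  logit_accuracy = mean(logit_pred == (test$default == 1)))
```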
29. Accuracy and Consistency of Models
• After the model is developed and the risk framework – data integration, stress models, stress
scenarios – is implemented, a thorough validation and independent review of the relevant models is carried out.
• Also, all methodologies, processes, system information, key assumptions, and suggested actions require
proper documentation.
• From a regulatory perspective, model risk management is a key component.
• Banks are expected to have a robust system to validate the accuracy and consistency of the various models used for
capital and risk estimations. They should establish internal validation processes to assess the performance of such
models.
• The suggested process cycle for model validation includes the following
• Regular monitoring of model performance, which includes evaluation and rigorous statistical testing of the stability of
the model and its key coefficients
• Identifying and documenting individual fixed relationships in the model that are no longer appropriate
• Periodic testing of model outputs against outcomes, at least on an annual basis
• A demanding change control process, which specifies the procedures to be followed before making changes to the
model in response to validation outcomes