© CHAPPUIS HALDER & CO
EAD Parameter: A stochastic
way to model the Credit
Conversion Factor
By Leonard Brie and Yousra Belmajdoub
Global Research & Analytics¹

¹ This work was supported by the Global Research & Analytics Dept. of Chappuis Halder & Co. Many thanks to Simon Corcos for his involvement and work on this White Paper as well as his time spent on the writing, and thanks to Helene Freon for her help on the translation of this White Paper.
Executive Summary
Following the 2007-2008 financial crisis, the banking industry faced an overall strengthening of regulatory requirements, especially in the EU. The main goal was to ensure an adequate level of capitalisation to guarantee financial robustness. Basel II defined a prudential framework where banks are required to hold a minimum amount of capital to cover all their risks.
Banks are required to precisely estimate their risks – notably credit risk, which accounts for more than 75% of banks' overall RWA – in order to hold an optimal level of capital while respecting regulatory limits.
This white paper aims at estimating credit risk by modelling the Credit Conversion Factor (CCF) parameter related to the Exposure-at-Default (EAD). It has been decided to perform the estimation using stochastic processes instead of the usual statistical methodologies (such as classification trees or GLMs).
Our paper will focus on two types of model: the Ornstein Uhlenbeck (OU) model – part of the ARMA family of models – and the Geometric Brownian Motion (GBM) model. First, we will describe, then implement and calibrate each model to ensure the relevance and robustness of our results. Then, we will focus on the GBM model to model the CCF.
Finally, it has been observed that the stochastic methods provide satisfactory, robust and accurate results, where the delta between observed and expected CCF is negligible (0.69% relative deviation). Furthermore, these methodologies enable the capture of the EAD monthly evolution, which is a significant addition of information in comparison to the usual statistical models.
Keywords: EAD, CCF, Basel III, Credit Risk, Stochastic process
JEL Classification: C02, C22, C63, G01, G17, G21
Table of Contents
Executive Summary
Table of Contents
1. Introduction
2. Context
3. The CCF Parameter
Definition
Objectives of our White Paper
4. Benchmark of existing methodologies
5. Modelling the CCF parameter
Creation of the modelling database
5.1.1. Features of the modelling portfolio
5.1.2. Data quality tests
5.1.3. Application and calibration data
5.1.4. Modelling principle
Ornstein Uhlenbeck model (OU)
5.2.1. Model's presentation
5.2.2. Tenor Analysis – Check of the model assumptions
5.2.3. Implementation and calibration of the model
5.2.4. Results
Geometric Brownian Motion model (GBM)
5.3.1. Model presentation
5.3.2. Tenor Analysis – Check of the model assumptions
5.3.3. Implementing and calibrating the model
5.3.4. Results
6. Application of the methodology to model the CCF parameter
7. Conclusion
8. References
9. Table of Figures
1. Introduction
Following the 2007-2008 financial crisis, the banking industry faced an overall strengthening of regulatory requirements, especially in the EU. The main goal was to ensure an adequate level of capitalisation to guarantee financial robustness. Basel II defined a prudential framework where banks are required to hold a minimum amount of capital to cover all their risks. Its fundamentals are three-fold:
• Pillar 1 | Regulatory capital requirements, reflected through the calculation of the McDonough solvency ratio;
• Pillar 2 | A surveillance mechanism to monitor capital management;
• Pillar 3 | Market discipline through transparent communication between financial institutions.
Pillar 1 defines the McDonough ratio as capital divided by the sum of Risk-Weighted Assets (RWA) for credit, market and operational risks. Basel requires a minimum ratio of 8%.
Banks seek to optimise their capital levels while adhering to Basel requirements. To that end, they need to precisely estimate their risks, particularly credit risk as it represents on average more than 75% of banks' RWA in Europe.
Credit risk estimation is based upon the measure of three parameters: the Probability of Default (PD), the Loss Given Default (LGD) and the Exposure-At-Default (EAD). There are two types of approach to quantify these parameters: a standard approach, where PD and LGD are estimated through external parties' rating systems; and internal approaches, where banks use their own internal models, validated by the supervisor, to calculate each parameter. In the latter case, Credit Risk Modelling teams primarily aim at optimising the precision and robustness of their models and methodologies.
Most methods focus on PD or LGD modelling, rather than EAD or CCF modelling.
This paper will focus on an innovative methodology to model the CCF parameter for a portfolio of non-defaulted contracts, developed by CH&Co's Global Research & Analytics (GRA) team.
2. Context
The CCF parameter is usually estimated through statistical segmentation methodologies, where each segment j is assigned a CCFj value (a weighted average of observed ratios) supplemented by prudential adjustments. This method leads to CCFs representing the EAD over a one-year horizon and therefore does not account for the evolution of the exposure amount over the pre-default period. The challenge for banks is to develop CCF estimation methods that are both accurate and minimise the volatility of the resulting estimators.
This paper will focus on an innovative methodology to model the CCF parameter for a portfolio of non-defaulted contracts, developed by CH&Co's Global Research & Analytics (GRA) team. We will follow the steps below:
 Estimate the worsening cashflows of a client exposure through a stochastic process
 Challenge the results using distinctive stochastic dynamics, which simulate different
market conditions
 Implement validation and sensitivity tests to challenge the proposed model and ensure
its accuracy and robustness
3. The CCF Parameter
Definition
The CCF is a credit risk parameter derived from the EAD, being its equivalent for off-balance-sheet contracts.
The EAD parameter measures the total amount (i.e. the sum of capital, interest, commissions and fees) due to the bank by a given defaulting client. For on-balance positions, the EAD is the total amount due as of the date of capital computation. For off-balance positions (e.g. an overdraft facility), the EAD is the sum of the on-balance exposure plus the off-balance position multiplied by the CCF parameter as of the date of capital computation.
To ensure a rigorous and exhaustive definition of this parameter, one would need to list all specific cases per type of credit contract and to define complex financial concepts. Hence, we will choose a simpler and intuitive definition, applicable to most cases, where the CCF is defined as the evolution of the credit exposure over one year:

$$CCF(t) = \frac{EAD(t)}{EAD(t - 1\ \text{year})}$$
From a regulatory standpoint, the issue with estimating the EAD lies in the calculation of the CCF for contracts with an off-balance position.
Regarding the regulatory treatment of post-default drawdowns, it should be noted that the regulations allow institutions to take these into account either in their CCF or in their LGD estimates. Since it is customary to take these drawdowns into account in LGD estimates, it was decided not to include them in the modelling of the CCF parameter.
Objectives of our White Paper
The current known methodologies put in place by financial institutions to estimate the CCF parameter rely on statistical methodologies such as generalised linear models (linear or logistic regressions) combined with clustering methodologies, in order to identify homogeneous clusters of current accounts that will have the same CCF parameter on average.
Nevertheless, these methodologies have shown weaknesses. Indeed, a classic linear regression based on least-squares optimisation can lead to unrealistic predicted values, as they could be lower than 0 or significantly greater than 1. Similarly, the use of a logistic regression model may not be the perfect modelling solution, as the CCF is a non-binary continuous parameter. This means that the variable to explain has to be transformed, which in many cases leads to a loss of information or a bias in the interpretation or analysis of the results.
Overall, it has to be noted that most banks aim at predicting the final EAD with statistical models, without taking into account the evolution of the exposure during the period preceding the default. This finding led us to look for a new way to model the CCF parameter relying on a time-dependent process, thus being able to compute the exposure at any time over a regulatory 1-year horizon. This 1-year horizon is aligned with the probability of default, which corresponds to the probability of a client defaulting over a 1-year horizon.
In addition to the mathematical interest and challenge of developing a new CCF methodology, the probabilistic rather than statistical approach not only allows us to obtain more precise estimations but also gives a forecast of the exposure at any time (daily, weekly or monthly frequency) over a 1-year period. In other words, it becomes possible to measure the EAD over a 1-year period instead of only 1 year later.
4. Benchmark of existing methodologies
Existing methodologies benchmarked among banking institutions rely on various statistical methodologies such as generalised linear models (linear or logistic regressions), or segmentation models where homogeneous groups of CCF behaviour are identified and then an average rate specific to each segment is applied.
Over the years, five major methodologies have been identified to estimate the CCF parameter; they are briefly described in this section and are as follows:
- Model 1 | Historical mean per cluster
- Model 2 | Linear regression function per cluster
- Model 3 | Logistic regression function per cluster
- Model 4 | CART regression tree to identify clusters, then a combination of bootstrap and margin of conservatism to estimate the CCF per cluster
- Model 5 | Number-weighted average of historical CCF per cluster plus a fixed margin of conservatism
Regarding Model 1, the CCF modelling approach relies on the estimation of the parameter based on the historical defaulting contracts, where the CCF parameter estimated each month is computed as follows:

$$CCF = \frac{\max\left(m - M|n;\ 0\right)}{D|n}$$

with $\max(m - M|n;\ 0)$ estimated for each current account and for a given month, and where:
N: number of non-defaulting current accounts
M: on-balance sheet exposure of non-defaulting current accounts
D: overdraft facility of non-defaulting current accounts
M|n: on-balance sheet exposure of non-defaulting current accounts that defaulted in the next 12 months
D|n: overdraft facility of non-defaulting current accounts that defaulted in the next 12 months
n: number of defaulting current accounts in the next 12 months
m: exposure of defaulting current accounts in the next 12 months
Regarding Model 2, in order to compute the regulatory capital requirement under the internal approach, a Bank must determine the exposure at default (EAD) of each homogeneous risk class. The economic EAD per contract is first computed thanks to a scoring function; it corresponds to the expected exposure remaining to be paid if the contract were to default within a 12-month horizon.
In order to estimate the EAD, a scoring function has been put in place, which is a linear regression per cluster. The linear regression is based on the EAD information observed in the past and relies on the existence of a linear link between this variable to explain and a selection of explanatory variables. The performance of the models obtained is then checked using two indicators: the coefficient of determination (R²), which corresponds to the ratio between the sum of the squared deviations of the estimated EAD and the mean of the observed one; and the mean absolute deviation (MAD), which differs noticeably from the R² indicator since it corresponds to the average of the absolute deviations between the observed and estimated EAD.
Regarding Model 3, it is similar to the previous one in its principle, except that it has been decided to dichotomise the variable to explain (as a modelling assumption) into two modalities and to apply a logistic regression function to model it. The modelled variable is designed as follows:
• It is equal to 1 if the observed CCF parameter is less than or equal to 1;
• It is equal to 0 otherwise.
For the contracts whose overdraft facility is non-null, modality 1 identifies current accounts that did not exceed it before or upon defaulting within a 12-month period. For the contracts whose overdraft facility is null, modality 1 describes a decreasing trend of the exposure between the observation and the defaulting date.
Regarding Model 4, the estimation of the CCF parameter is based on statistical studies which are completed by expert-judgement elements, in particular when the quantity of available data may impact the robustness and relevance of the statistical model retained. The modelling phase is split into three steps:
• Step 1: Estimation of the historical observed CCF (based on the definition of the 26 June 2013 CRR);
• Step 2: Clustering analysis whose objective is to define homogeneous clusters of historical CCF based on risk data. This step creates current account buckets whose behaviour is considered similar when close to default;
• Step 3: Prudent estimation of the CCF level for each homogeneous cluster, taking into account margins of conservatism that are regulatory compliant.
Step 2 is performed using a CART (Classification And Regression Tree) decision tree giving homogeneous CCF clusters. This leads to a relevant approach to risk measurement and allows us to compute exposure buckets whose average CCF value is compliant with Article 133.a of the 26 June 2013 CRR. The regression tree approach consists in successively dividing the population into homogeneous clusters of CCF. These clusters are determined from the modalities of the most discriminatory explanatory variable. The clustering process is repeated until no more explanatory variables can be used. The algorithm tests all the potential explanatory variables and retains only the optimal one, which maximises the following variance criteria:
• The variable to explain must have a lower variance in the child node than in the parent one;
• The variable to explain must have the most distinct average from one child node to the other.
In other words, the regression tree tends, at each step of the algorithm, to minimise the intra-cluster variance and maximise the inter-cluster variance.
With this methodology, it is more frequent to use a binary regression tree (the CCF being a continuous variable): each parent node has a maximum of two child nodes, but the size of the tree is limitless. This type of tree has the ability to quickly identify specific risk profiles. For this benchmark model, the retained settings are the following:
• A node is split when the Fisher statistic's 20% threshold is reached (this value is set by default);
• A minimum of 22 contracts (10% of the initial size) is needed to split a node;
• A minimum of 5 contracts is required per node.
The choice of the above thresholds allows us to obtain a smaller number of final CCF classes, but more relevant ones. Indeed, too many clusters may increase the instability of the regression tree without offering a significant enhancement of the model performance.
Regarding Model 5, in this approach the estimated CCF of a given contract at a date T0 corresponds to the best estimator of the share of the off-balance sheet exposure that will be drawn should a default occur during the 12 months following T0. The computation of the CCF estimators relies on the two following principles:
• On one side, it is based on the observed historical CCF of past generations, computed by homogeneous cluster of contracts;
• On the other side, their measurement is done as a ratio of the T0 off-balance sheet exposure.
For a given cluster, the computation of the estimated CCF is as follows:
• First, the Bank determines the historical CCF average weighted by the number of contracts per cluster;
• Then, a 0% floor is applied on each cluster;
• Finally, a third and last step consists in applying a 20% expert adjusting factor to the non-withdrawn overdraft facility of the contracts.
Finally, for the contracts of a specific cluster, the EAD at T0 is equal to:

$$EAD = \text{On-balance sheet amount of the contract at } T_0 + CCF \text{ of the cluster} \times \text{Off-balance sheet amount of the contract at } T_0$$
By design, the EAD of contracts without an off-balance sheet amount, or with a null amount, is thus equal to the observed on-balance sheet exposure. In particular, creditor accounts without an off-balance sheet amount will have a null EAD. Similarly, debtor accounts which do not have any off-balance sheet exposure have their EAD equal to the debtor amount at T0.
5. Modelling the CCF parameter
Creation of the modelling database
The available data consists of current account portfolios from a Tier 1 French Bank. The study conducted in this White Paper focuses on the modelling of a raw CCF parameter, i.e. without considering any prior segmentation of the client database. Thus, it is assumed that all the available current accounts come from a single segment; therefore, a single CCF value will be considered when applying the parameter.
It is important to note that a specific feature of the working database is that it only contains current accounts defaulting within a year. Thus, for each current account, its end-of-month exposure is gathered over a 12-month horizon before its defaulting date (called tdef). The initial date (the one starting 12 months before tdef) is noted t0. In addition to this feature, it must be remembered that for the considered portfolios almost all the current accounts see their balance getting worse over the period; consequently, the majority of the computed CCFs are greater than 1. Finally, in the case of our research studies, the default of a current account is declared when the exposure remains above the overdraft facility for more than 90 days.
The first step in modelling the CCF parameter relies on the exposure forecast of a current account at each Tenor (1 month, 2 months, …, 12 months) until the defaulting date. Thus, it is necessary to extract and work on the data as time series, where each series corresponds to a studied Tenor.
5.1.1. Features of the modelling portfolio
The available portfolio contains 39,932 current accounts defaulting over the period from June
2009 to December 2012. The initial dates of each current account are extracted on a daily basis.
Nevertheless, in order to avoid the existence of missing data during the design of the time series,
it has been decided to aggregate the information on a monthly basis. This ensured that the
portfolio contains at least 1 current account per monthly initial date. The estimation of the CCF parameter, the construction of the database and the various analyses performed follow a monthly granularity.
To summarise, the features of the database are the following:
• The data spans three and a half years, from June 2009 to December 2012.
• There are 39,932 rows, one for each current account.
• There are 14 columns corresponding to:
 t0: the initial date, 12 months before the defaulting date.
 Exposure(t0): the current account exposure seen at the initial date t0.
 The 12 monthly cashflows worsening the initial exposure. For instance, the column Flow 1 corresponds to the amount to be added to Exposure(t0) in order to obtain the exposure at t0 + 1 month. The EAD of the current account is thus:

$$EAD = Exposure_{t_0} + \sum_{i=1}^{12} Flow_i$$
In order to obtain suitable time series, the initial exposure is aggregated and corresponds to the total exposure amount of all the current accounts available for a specific month. Similarly, all the cashflows are summed for each Tenor and divided by the total initial exposure. The result is contained in a database of 43 rows (one for each observed month) and 14 columns (date, total initial exposure and the 12 monthly cashflows).
The data used is historical data, which means that the information on the EAD and monthly exposure is available over the whole modelling period (use of a square matrix).
As an illustration, the modelling database looks as follows:
Figure 1 - Illustration of the modelling database
The figures displayed below allow us to visualise the information from the modelling database. The first figure shows the evolution of the monthly total initial exposure (in k€):
Figure 2 - Monthly evolution of the total EAD - June 2009 to December 2012
In the above figure, it is observed that the initial exposure is quite steady over the period from June 2009 to June 2011, with an average exposure between 500 k€ and 1,500 k€. After June 2011, the total initial exposure keeps growing up to a maximum value of 3,083 k€; this can be explained by the evolution of the Bank's activity and the size of its total assets, which grew as time went on. At the start of its activity, the Bank was cautious and constrained credit granting in order to contain its risk. Later, the Bank opened more current accounts with fewer limitations, thus increasing the density and probability of observing defaults over the last years.
The second figure, below, shows the monthly evolution of the accounts' cashflows:
Figure 3 - Monthly evolution of the accounts cashflows - Tenor 5M
The evolution of the 5M exposure worsening cashflows shows a trend similar to the one observed in Figure 2, i.e. the evolution seems to be correlated with the Bank's activity and is proportional to the growth of its total assets. Regarding the cashflow evolution displayed as a ratio of the EAD, it is steadier and lies within [0.52%; 1.95%].
Finally, the last figure, below, shows descriptive statistics of the 5M exposure worsening cashflows:
Figure 4 - Descriptive statistics of worsening cashflows - Tenor 5M
The statistics displayed in the above figures allow us to better understand the data to be modelled. The analysis of this information gives us a first appreciation of the parameters that will be used in the stochastic diffusion model. These descriptive statistics have been computed for each Tenor to be modelled.
5.1.2. Data quality tests
The data quality tests are performed on the monthly time series corresponding to the aggregated worsening cashflows. The stochastic model that will be developed thereafter relies on the modelling and forecasting of the time series of worsening cashflows of each Tenor (the columns of the matrix displayed in Figure 1), and the model will be applied to the last known observation of total exposure of the non-defaulted population. As such, it is important to use data that is clean and comprehensive. Several tests are performed in order to assess the data quality and ensure that there is an adequate quantity of data to estimate the CCF parameter. The tests performed on the time series of each Tenor are presented below:
 Check for missing values: This test consists in checking that the time series have no missing values between the first (June 2009) and last (December 2012) monthly observation dates. Furthermore, it also ensures the continuity of the time series, i.e. for each current account it is checked that all the data is available between the initial and default dates.
 Number of observations and modelling assumptions: The calibration of the CCF model relies on observed data. The stochastic model is used to forecast the last known exposure of non-defaulted current accounts as if they were to default in the next 12 months. As such, it is necessary to ensure the quality of the available data as well as its exhaustiveness. The calibration of the stochastic model relies on the estimation of parameters such as the mean and the standard deviation of the time series. The convergence of the estimators is ensured by the Central Limit Theorem, whose main assumption requires at least 30 observations when estimating a parameter. Thus, several checks are conducted in order to ensure that there is enough data and that the upcoming estimations will be relevant.
First, the number of observations is counted and checked to ensure that it is above 30 per Tenor. Given that the modelling period covers June 2009 to December 2012 on a monthly basis, 42 months of observations are available, which satisfies the completeness criteria.
In addition, in order to have continuous time series without missing observations, it has been decided to aggregate the current account information. During the creation of those time series, a representativeness criterion has been defined, ensuring that a minimum of 30 observations was used each month when computing the total exposure and the average cashflows. This test is valid for the 42 months of the modelling database.
In practice, a bank having less data, which does not satisfy the criteria of the previous check, can still reinforce its modelling database with external data as long as it remains relevant. Otherwise, a bootstrap approach can be put in place in order to expand the modelling sample. In addition, a Bank can also define a specific margin of conservatism related to data quality in order to cover the uncertainty around the estimation of the CCF parameter.
 Evidencing outliers for each Tenor: Before modelling, it is mandatory to ensure that the time series don't show any outliers which could bias the parameter estimation. If such values are observed, they should be treated. An outlier is defined as a value exceeding the MAD (mean absolute deviation) of its associated Tenor.
For a Tenor i, an outlier of the time series $Flow_i$ is defined as follows: if $Flow_i(t) - Flow_i(t-1) > MAD[Flow_i]$, where t is the observation date, then $Flow_i(t)$ is an outlier of the time series $Flow_i$.
The tests performed on the time series show fairly low peaks, for which a treatment does not appear to be necessary. Should a significant outlier be observed, a possible treatment would be to assign it the MAD value of its Tenor.
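As a minimal sketch of this test (assuming, as in the text, that the MAD is the mean absolute deviation of the whole Tenor series; the helper name is ours):

```python
import numpy as np

def flag_outliers(flow: np.ndarray) -> np.ndarray:
    """Flag Flow_i(t) when the month-on-month jump exceeds the mean
    absolute deviation (MAD) of the Tenor series `flow`."""
    mad = np.mean(np.abs(flow - flow.mean()))  # mean absolute deviation
    flags = np.zeros(flow.shape, dtype=bool)
    flags[1:] = np.diff(flow) > mad            # Flow_i(t) - Flow_i(t-1) > MAD
    return flags
```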
5.1.3. Application and calibration data
At this stage, the data corresponds to exhaustive time series representing the average ratio of worsening cashflows. Nevertheless, when applying the model on non-defaulted exposures in order to calibrate the CCF and estimate their final EAD, the worsening cashflows are not all known. That's why it is necessary to forecast them, based on the historical data. The figure below shows the problem related to the EAD computation, if it must be computed based on non-defaulted exposure reported at C.O.B. December 2012:
Figure 5 – Matrix of worsening cashflows – Application data – C.O.B. December 2012
The above figure shows the shape of the matrix when the EAD must be estimated on non-defaulted exposure at end of December 2012. The values in the orange cells correspond to observed, historical worsening cashflows, whereas the orange values in the light grey cells correspond to non-observed worsening cashflows that have been forecasted thanks to the stochastic model. The forecasting of those values up to the 2012 generation gives the simulated worsening cashflows and allows us to compute the final EAD amount for December 2012; in the above figure it is obtained via the values presented in the black frame. The CCF parameter then corresponds to the ratio between the final EAD and the initial exposure of December 2012. The evolution of the cumulated exposure until the EAD is the following:
Figure 6 – Evolution of the cumulated exposure (EAD)
The following sections will now introduce and present the stochastic models that will be used
to forecast the level of worsening cashflows for each Tenor. First, the time series designed and
studied in the previous sections are analysed in order to ensure that they fit the stochastic model
that will be used. Finally, the results of the forecast and the application to an EAD computation
are presented.
5.1.4. Modelling principle
Following the definition of the CCF parameter, the Tenors to model correspond to the 12
months following the initial date or the 12 months preceding the default date. For each Tenor,
there exists a time series of worsening cashflows corresponding to the rates observed by the
Bank each month. The purpose of the approach is to forecast the incomplete time series in order
to predict the future worsening rate associated to the generation used when computing the final
CCF. For each month of the modelling period, the time series of worsening rates are assimilated
to forward-rate curves which will give, for a specified maturity (or Tenor), the marginal
worsening rate. By construction, the granularity of the data is monthly and in order to model
the dynamic of each Tenor, two approaches have been considered:
 A modelling of the current account behaviour via a mean-reverting model using an Ornstein-Uhlenbeck process. The purpose is to calibrate the process on observed data and to forecast the time series thanks to the estimated model parameters, thus giving the final worsening rate necessary to compute the EAD of the non-defaulted exposure.
 Use of a Geometric Brownian Motion, which does not rely on any stationarity assumption and may be better suited to our data. In order to estimate the parameters of each process, Brownian motions corresponding to each Tenor are simulated.
Nevertheless, it is not optimal to simulate the Brownian motions in an independent way, as it does not allow us to capture the correlation between the Tenors. That's why a first step consists in reducing the dimension of the matrix (where each column is a Tenor) via a PCA (Principal Component Analysis), which allows us to capture the correlation structure of the Tenors while simulating a reduced number of Brownian motions. Another solution would be to simulate 12 correlated Brownian motions using the time series correlation matrix and the extraction of its "square root" matrix via a Cholesky decomposition; however, this would have been time-consuming (in terms of computing time) due to the large number of time series to correlate and forecast.
Once the correlated Brownian motions are generated thanks to the rotation matrix derived from the PCA, the parameters of the Ornstein-Uhlenbeck process are estimated via a linear regression giving a closed formula for each parameter of the model. Regarding the parameters of the GBM, they are empirically obtained from the historical values of the time series by Tenor. The worsening cashflows are then extrapolated thanks to these models, where the starting point corresponds to the last observed data point of each time series.
Ornstein Uhlenbeck model (OU)
In this section we present the first alternative methodology introduced by the Global Research & Analytics (GRA) department of Chappuis Halder & Co.
5.2.1. Model’s presentation
The Ornstein Uhlenbeck model (OU) is a stochastic model frequently used in market finance to model time series with a mean-reverting process; the discrete form of the OU is an AR(1) and is part of the "Auto Regressive Moving Average" (ARMA) family of processes.
The OU process is a mean-reverting process, solution of the following stochastic differential equation:

$$dX_t = \lambda\,(\mu - X_t)\,dt + \sigma\,dW_t$$
Where:
• λ>0: Mean reverting speed
• µ: Mean value
• σ>0: Volatility
• Wt: Wiener process
Applying Itô's lemma to the process $f(t, X_t) = X_t e^{\lambda t}$ gives the following discrete equation:

$$X_{t+1} = \mu\left(1 - e^{-\lambda}\right) + e^{-\lambda} X_t + \sigma \left(\frac{1 - e^{-2\lambda}}{2\lambda}\right)^{\frac{1}{2}} N(0,1)$$
This formula shows that the OU model is an AR(1) process.
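As an illustration (a sketch of our own, not the paper's code), the exact discretisation above can be simulated on a unit monthly time step as follows:

```python
import numpy as np

def simulate_ou(x0: float, lam: float, mu: float, sigma: float,
                n_steps: int, n_paths: int, seed: int = 0) -> np.ndarray:
    """Simulate OU paths with the exact AR(1) discretisation
    X_{t+1} = mu*(1 - exp(-lam)) + exp(-lam)*X_t + vol*N(0,1)."""
    rng = np.random.default_rng(seed)
    a = np.exp(-lam)
    vol = sigma * np.sqrt((1.0 - np.exp(-2.0 * lam)) / (2.0 * lam))
    paths = np.empty((n_steps + 1, n_paths))
    paths[0] = x0
    for t in range(n_steps):
        paths[t + 1] = mu * (1.0 - a) + a * paths[t] + vol * rng.standard_normal(n_paths)
    return paths
```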
5.2.2. Tenor Analysis – Check of the model assumptions
Before implementing the model, it's necessary to check that the Tenors can be considered as OU processes. First, stationarity and normality tests are run on the time series, and then their autocorrelation plots are analysed.
Normality test
The OU model is a gaussian process; hence, to use it, it must be checked that the Tenors follow a normal distribution.
Thus, the QQ-plot is drawn for each Tenor. It's a graphical tool to assess whether a set of data fits a theoretical assumed distribution (here, the normal distribution). It is performed by plotting the quantiles of the two distributions against each other. The plot below shows the QQ-plot of the 9M Tenor:
Figure 7 - QQ-plot of the 9M Tenor
The QQ-plots for the 12 series show satisfactory results, meaning that it can be assumed that the time series follow a normal distribution.
In addition, in order to validate those normality results, the Jarque-Bera test has been run on the 12 series to test the following null hypothesis H0: the studied data follows a normal distribution. For the 12 series, the Jarque-Bera test doesn't reject H0 at a significance level of 5%. This strengthens the hypothesis that the time series follow a normal distribution, and hence the use of a gaussian process to model them.
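For reference, a minimal sketch of this check in Python (helper name ours); scipy's Jarque-Bera test returns the statistic and its p-value:

```python
import numpy as np
from scipy import stats

def is_normal(series: np.ndarray, alpha: float = 0.05) -> bool:
    """Jarque-Bera test of H0 'the data follows a normal distribution';
    True means H0 is not rejected at the given significance level."""
    _, p_value = stats.jarque_bera(series)
    return p_value > alpha
```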
Stationarity test
The time series modelling is based on the stationarity hypothesis; to test this hypothesis an
Augmented Dickey-Fuller (ADF) test is conducted on all the Tenors.
The ADF test results show that only one third of the Tenors are indeed stationary. To get stationary time series allowing us to model the processes, a first-order differencing has been applied to the time series.
The first-order differencing consists in computing the differences between consecutive terms of the time series, i.e. if $(X_t)_{t \geq 0}$ is a time series then the first-order differenced series $(S_t)_{t \geq 1}$ verifies:

$$S_t = X_t - X_{t-1}$$
Differencing a time series allows us to remove the deterministic polynomial trend as well as the periodicity. This transformation ensures we obtain a "pure" stochastic process, i.e. one without a deterministic component, and increases the odds of having a stationary series.
This is verified for 10 out of the 12 time series, for which the differenced series passes the Augmented Dickey-Fuller (ADF) stationarity test. It's possible to difference the series a second time to make all 12 series stationary, but this will not be done as it can alter the quality and interpretation of the results. Therefore, it was decided to model and calibrate on the first-order differenced series.
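A minimal sketch of this step (our own helper, using statsmodels' ADF test, whose second return value is the p-value):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

def stationarise(series: np.ndarray, alpha: float = 0.05):
    """Return (series, 0) if the ADF test rejects a unit root, otherwise
    the first-order differenced series S_t = X_t - X_{t-1} and order 1."""
    p_value = adfuller(series)[1]
    if p_value < alpha:
        return series, 0          # already stationary
    return np.diff(series), 1     # differenced once
```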
Autocorrelation study
The study of the autocorrelation functions (ACF) and the partial autocorrelation functions (PACF) allows us to validate the use of an autoregressive process AR(1). The plot of the ACF of an AR(1) must show a unique significant autocorrelation order: the first one. This would justify the correlation between Xt and Xt−1 and hence the use of a first-order autoregressive process. Similarly, the PACF must point out the first autocorrelation order only as significant. For example, here are the ACF and PACF plots for the differentiated 8M Tenor:
Figure 8 - ACF and PACF plots – Differentiated 8M Tenor
All the Tenors show similar graphs, with a significant peak on the first order for the two types of plots, which highlights an autocorrelation of order 1. Nonetheless, some Tenors also show peaks at higher orders, which suggests that they could be AR processes of an order p>1.
It’s also noted that the ACF peaks don’t decrease: The peak’s heights are irregular. This
behavior is usually associated with the existence of a moving average term (MA). But since
none of those peaks are significant, it’s not wrong to not consider them, but it must be kept in
mind when interpreting and analysing the results.
Overall, the results are satisfactory and it appears consistent to use an AR(1) process to model
the differentiated series.
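For reference, a minimal plotting sketch of this diagnostic with statsmodels (helper name ours):

```python
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

def plot_autocorrelations(series, lags: int = 12) -> None:
    """Plot the ACF and PACF used to check the AR(1) signature:
    a single significant spike at lag 1 on both plots."""
    fig, (ax_acf, ax_pacf) = plt.subplots(1, 2, figsize=(10, 3))
    plot_acf(series, lags=lags, ax=ax_acf)
    plot_pacf(series, lags=lags, ax=ax_pacf)
    plt.show()
```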
Finally, the list below summarises the retained hypotheses and their validation:
- The normality hypothesis is validated, hence the choice of a gaussian model is relevant.
- To obtain stationarity, which is one of the main hypotheses, a first-order differencing has been applied to the series. The transformed series have been proven to be stationary and thus it seems relevant to assume they are OU processes. Once the forecast of the differentiated series is finalised, it will be possible to retrieve the values for each Tenor. Even though this transformation is not an issue from a theoretical point of view (in terms of calculations and methodology), it may still have an impact on the final predictions' quality when the Tenors are retrieved.
- The order 1 autocorrelation is verified in most cases. Nonetheless, for some Tenors a higher order, and for some an MA term, could be considered.
5.2.3. Implementation and calibration of the model
Construction of correlated Brownian motions
The OU processes are calibrated on the time series of the average worsening cashflows in order to then forecast each Tenor. To do so, Brownian motions are simulated to obtain the Wiener processes needed in the OU model. However, to design a model accurate enough and fitting reality, the Brownian motions need to keep and transcribe the correlation structure between the differentiated Tenors. Therefore, it has been decided to build them with a methodology relying on the dimension reduction of the modelling database. It should be noted that this methodology also allows us to decrease the number of simulations.
The approach follows the steps detailed below (a code sketch is given after Figure 9):
- First, a Principal Component Analysis (PCA) is performed on the set of time series. It highlights the most explicative vectors of the initial structure. The number of eigenvectors is chosen based on the elbow criterion and the cumulative variance analysis, which must reach at least 80%. These vectors constitute the rotation matrix. The PCA shows that the first 5 eigenvectors explain approximately 80% of the variance.
- Then, the five vectors are combined to obtain the rotation matrix of size 12x5.
- The next step is to simulate 12 matrices of size 5xN of independent and identically distributed N(0,1) draws, where N=1,000 is the number of scenarios of the model. This corresponds to the simulation of 12x5x1,000 = 60,000 standard normal draws.
- Those matrices are then multiplied by the rotation matrix, which gives 12 matrices of size 12x1,000.
- Each matrix is used to predict the points of the 12 differentiated series; since not all the series need to be forecasted by the same number of points, not all the lines of all the matrices are used:
• For the first matrix, i = 1 to 12, line i is used to prolong the first point of the i-th series.
• For the second matrix, i = 2 to 12, line i is used to prolong the second point of the i-th series.
• And so on, the last matrix serving only to prolong the last point of the 12th series.
The above steps are described in the figure below:
Figure 9 - Construction process of the correlated Brownian motions
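As a minimal sketch of one simulation step under these assumptions (the helper name and the handling of the 80% threshold are ours, not the paper's code), the correlated Gaussian draws can be built from the PCA rotation matrix as follows; this step would be repeated for each of the 12 matrices:

```python
import numpy as np

def correlated_gaussians(diff_tenors: np.ndarray, n_scenarios: int = 1000,
                         var_target: float = 0.80, seed: int = 0) -> np.ndarray:
    """Build one 12 x n_scenarios matrix of correlated N(0,1) shocks from
    the PCA of the differentiated Tenor matrix (n_months x 12)."""
    rng = np.random.default_rng(seed)
    cov = np.cov(diff_tenors, rowvar=False)              # 12 x 12 covariance
    eigval, eigvec = np.linalg.eigh(cov)
    order = np.argsort(eigval)[::-1]                     # sort by explained variance
    cum_ratio = np.cumsum(eigval[order]) / eigval.sum()
    k = int(np.searchsorted(cum_ratio, var_target)) + 1  # e.g. 5 components
    rotation = eigvec[:, order[:k]]                      # 12 x k rotation matrix
    z = rng.standard_normal((k, n_scenarios))            # k x N iid N(0,1) draws
    return rotation @ z                                  # 12 x N correlated shocks
```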
Model calibration
Once the calculation and simulation of the Brownian motions is settled, the best calibration parameters need to be found for the models of each Tenor. As presented above, the discrete form of the OU process corresponds to the equation of the linear regression of Xt+1 on Xt:

$$X_{t+1} = a + b\,X_t + \text{gaussian noise}$$

where a and b are easily related to λ and µ by identification.
This regression, resolved by the maximum likelihood method, gives estimators of those parameters:

$$\mu = \frac{S_y S_{xx} - S_x S_{xy}}{n\left(S_{xx} - S_{xy}\right) - \left(S_x^2 - S_x S_y\right)}$$

$$\lambda = -\ln\left(\frac{S_{xy} - \mu\left(S_x + S_y\right) + n\mu^2}{S_{xx} - 2\mu S_x + n\mu^2}\right)$$

$$\sigma^2 = \frac{2\lambda\left[S_{yy} - 2\alpha S_{xy} + \alpha^2 S_{xx} - 2\mu(1-\alpha)\left(S_y - \alpha S_x\right) + n\mu^2(1-\alpha)^2\right]}{n\left(1 - \alpha^2\right)}$$

where $\alpha = e^{-\lambda}$, $S_x = \sum_{i=1}^{n} S_{i-1}$, $S_y = \sum_{i=1}^{n} S_i$, $S_{xx} = \sum_{i=1}^{n} S_{i-1}^2$, $S_{yy} = \sum_{i=1}^{n} S_i^2$, $S_{xy} = \sum_{i=1}^{n} S_{i-1} S_i$, n is the number of observed data points and $(S_i)$ a differentiated Tenor.
The model parameters are estimated for each differentiated series; then, thanks to the matrices of simulated Brownian motions, it is possible to forecast the 12 differentiated series. It has been decided to retain 1,000 scenarios for each model in order to optimise the computation time.
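A minimal sketch of these closed-form estimators on a unit (monthly) time step, assuming a regression of S_{t+1} on S_t (helper name ours):

```python
import numpy as np

def calibrate_ou(s: np.ndarray):
    """Closed-form estimators of (lambda, mu, sigma) for an OU process
    observed on a unit time step, following the formulas above."""
    x, y = s[:-1], s[1:]
    n = len(x)
    sx, sy = x.sum(), y.sum()
    sxx, syy, sxy = (x * x).sum(), (y * y).sum(), (x * y).sum()
    mu = (sy * sxx - sx * sxy) / (n * (sxx - sxy) - (sx ** 2 - sx * sy))
    lam = -np.log((sxy - mu * (sx + sy) + n * mu ** 2)
                  / (sxx - 2 * mu * sx + n * mu ** 2))
    a = np.exp(-lam)
    sigma2 = (2 * lam / (n * (1 - a ** 2))) * (
        syy - 2 * a * sxy + a ** 2 * sxx
        - 2 * mu * (1 - a) * (sy - a * sx) + n * mu ** 2 * (1 - a) ** 2)
    return lam, mu, float(np.sqrt(sigma2))
```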
5.2.4. Results
To judge the forecasting quality and accuracy of the models, it must first be recalled that the models were calibrated on a portion of the historical dataset (from June 2009 to February 2012 for the 9M Tenor, for instance). Then, an out-of-sample validation test has been conducted on the part of the dataset which wasn't used for calibration (March 2012 to December 2012 for the 9M Tenor) in order to assess the model's accuracy.
Indeed, a CCF model doesn’t only need to predict the most conservative EAD but also needs
to be robust through the estimations. It is then necessary to check that all the calculation
methods used are robust over time. In order to ensure this, a cross validation method has been
implemented, it’s the out-of-the-sample validation. It relies on the estimation of the model’s
parameters on a part of the dataset and their validation on another part of the dataset not used
during modelling. The aim is to split the data by removing the most recent observations and
calibrate the model’s parameters to estimate worsening cashflows. The purpose is to compare
the model’s predictions to the observations on data that hasn’t been used for calibration.
The results are displayed in a single figure (see below) presenting the observed differentiated time series, the average predictions from the 1,000 simulations, and the 10% and 90% quantiles associated with the prediction. Then, in a second plot, the initial time series is presented after applying an inverse differencing, which allows us to have the final representation of the estimated worsening cashflows.
To present and discuss the results, a focus was undertaken on the 9M and 10M Tenors:
Figure 10 - 9M Differentiated Tenor – Worsening cashflows - Estimated vs. Observed
Figure 11 - 9M Tenor – Worsening cashflows - Estimated vs. Observed
Figure 12 - 10M Differentiated Tenor – Worsening cashflows - Estimated vs. Observed
Figure 13 - 10M Tenor – Worsening cashflows - Estimated vs. Observed
For the studied Tenors, the analysis of Figure 10 and Figure 12 shows that the Ornstein Uhlenbeck model allows us to capture the evolution of the series over the out-of-sample validation period. Indeed, the simulation envelope represented by the 10% and 90% percentiles covers almost all the forecasted values. As illustrated in the figures above, some excesses appear on the 12 differentiated series; they may correspond to extreme variations related to an economic event of the Bank which was not predictable. Overall, the model correctly predicts the evolution of the differentiated Tenors.
Nonetheless, once the inverse differencing is applied to retrieve the final predicted worsening cashflows, one can observe in Figure 11 and Figure 13 more excesses of the simulation envelope. In addition, for all the series, the average predictions are below the observed points, which means that the model does not seem able to capture the drift existing in the recent evolution of the time series, a drift that appears to increase over time. Consequently, one may wonder whether a mean-reverting process is adequate to fit the data.
Even though the predictions on the differentiated series were satisfactory, it appears that the final series are not accurate enough to be exploitable. The principal reasons appear to be the following:
• The inverse differencing operation makes the forecasts less accurate. This questions the use of a model which needs a stationarity hypothesis.
• It has also been observed that the predicted series didn't consider the drift observed during the last years. This issue can be reduced by introducing a moving average term in the model. During the analysis of the model assumptions, it was already observed that such a term could be considered.
Geometric Brownian Motion model (GBM)
The following sections present the second alternative modelling method introduced by the Global Research & Analytics (GRA) department of Chappuis Halder & Co.
5.3.1. Model presentation
Following the observations made during the analysis of the OU model and its results (presented in section 5.2), it was decided to test another stochastic model which doesn't require a stationarity hypothesis and contains a drift term. The Geometric Brownian Motion (GBM) model meets those two conditions. It's a standard process used in modelling and mathematical forecasts, particularly in finance where it's used to model the evolution of stock prices. Furthermore, its implementation cost remains relatively minimal.
The GBM is a positive process, solution of the stochastic differential equation:

$$dX_t = \mu\,X_t\,dt + \sigma\,X_t\,dW_t$$
Where:
• µ: Drift,
• σ: Volatility
• Wt: Wiener process
Applying Itô's lemma to the process gives the following discrete equation, which will be the formula used for the model implementation:

$$X_{t+1} = X_t\, e^{\left(\mu - \frac{\sigma^2}{2}\right) + \sigma N(0,1)}$$
5.3.2. Tenor Analysis – Check of the model assumptions
Normality tests
The use of a GBM and its discrete form indicates that:

$$\forall t \geq 0,\quad \log(X_{t+1}) - \log(X_t) \sim N\left(\mu - \frac{\sigma^2}{2},\ \sigma^2\right)$$
It's therefore necessary to check that the increments of the log-series are gaussian. As seen previously, this verification is done graphically via the use of QQ-plots. An example is shown below:
Figure 14 - QQ plot – Log-increments of the 12M Tenor
This analysis has been done for all the Tenors and has shown satisfactory results; the normality
hypothesis is then verified for all the series.
Independence test
The use of the GBM requires the (Xi) to be independent for all the Tenors. This hypothesis is verified through the study of the ACF and PACF plots (introduced in the section on the OU model), the aim being to show that there is no correlation between the observations from one date to another.
For example, here are the ACF and PACF plots for the 10M Tenor:
Figure 15 - ACF & PACF – 10M Tenor
It can be observed in Figure 15 above that the hypothesis seems verified, as the correlations are low. Nonetheless, the levels of correlation are not completely negligible; hence the independence hypothesis needs to be considered with caution. This analysis has been done for all the Tenors and all the ACF & PACF plots appear similar; the conclusions presented for the 10M Tenor can be extended to all the time series.
5.3.3. Implementing and calibrating the model
Once the parameters are calibrated, the implementation is done recursively. The model calibration is based on the history of the time series, the parameters being deduced empirically. The estimators of the two model parameters are:
• σ² = the empirical variance of the increments of the log-series;
• µ = the empirical mean of the increments of the log-series + σ²/2.
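A minimal sketch of these empirical estimators (helper name ours):

```python
import numpy as np

def calibrate_gbm(series: np.ndarray):
    """Empirical GBM estimators from the log-increments of one Tenor:
    sigma^2 = variance of the increments, mu = their mean + sigma^2 / 2."""
    increments = np.diff(np.log(series))
    sigma2 = increments.var(ddof=1)
    mu = increments.mean() + 0.5 * sigma2
    return mu, float(np.sqrt(sigma2))
```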
5.3.4. Results
As mentioned during the presentation of the GBM model, it has been decided to test this model since it doesn't need a stationarity hypothesis and contains a drift term, allowing us to address the weaknesses of the Ornstein Uhlenbeck model observed during the analysis of the results in the previous section. Presented below in the same graphs (for the 9M and 10M Tenors) are the observed time series, the forecasted average of the 1,000 scenarios and the 10% and 90% quantiles associated with the simulations:
Figure 16 - 9M Tenor – Worsening cashflows - Estimated vs. Observed
Figure 17 - 10M Tenor – Worsening cashflows - Estimated vs. Observed
Figure 16 and Figure 17 show more satisfactory results than those found with the Ornstein Uhlenbeck model; taking the drift term into account in the model allows the average predictions to be closer to the observations (residuals are minimised) and the observations fit within the quantiles' envelope (no excess observed) regardless of the estimated Tenor.
The fact that the quantiles include the observations is a non-negligible argument for selecting the GBM model, as considering the results of the predicted 90% quantiles will be conservative enough overall.
Nonetheless, it is noted that the simulation envelope is relatively broad, as there are situations where the 90% quantile values are twice as big as the observations. The size of the envelope being directly related to the volatility σ of the model, adjustments to its calibration could be considered. The caution mentioned when looking at the independence hypothesis could be one of the reasons behind this lack of accuracy. In addition, the main driver behind the introduction of this model was to take a drift term into account in the modelling. Both figures above show the average prediction evolving as a linear function, indicating that the trend is captured adequately.
Even though the size of the envelope might be problematic during the modelling of the CCF parameter and the presentation of the results, the ability of the model to capture the drift term makes its use relevant.
This model can be improved but seems more appropriate than the previously tested model (OU); it has thus been selected for the final modelling of the CCF parameter.
6. Application of the methodology to model the CCF parameter
Remember that the purpose of this paper is to predict the CCF parameter by forecasting the worsening cashflows. At the date of calculation of the non-defaulted exposures (December 2012 in the examples below), thanks to the retained stochastic diffusion model, it's possible to determine the evolution of the non-defaulted exposure over a one-year horizon. It's the last line of the matrix presented in Figure 5. The aim is to retrieve the last predicted point for each Tenor; those points represent the evolution of the cashflows of the EAD from December 2012 and are obtained with the GBM model. The plot below shows the monthly evolution of the exposure of non-defaulted current accounts over a one-year horizon (forecasts vs. observations):
Figure 18 – Evolution of the bank exposure cashflow over a one-year horizon - December 2012
In order to assess the model's accuracy, the prediction error is calculated as the ratio of the residual to the estimated value:

$$\text{Prediction Error of a point } i = \frac{\left|\,\text{Observation}(i) - \text{Estimation}(i)\,\right|}{\text{Estimation}(i)}$$

$$\text{Mean Prediction Error} = \text{mean of the prediction errors weighted by the relative values}$$
An average prediction error of 16.7% is found; the precision of the model is therefore
considered satisfactory.
The major advantage of the proposed model is its ability to give the exposure information month by month. In particular, the CCF can be considered as the average exposure over the 12 months divided by the initial exposure. The average is taken since there is no reason that the current account will default exactly on the 12th month; the default can occur at any time over the year. Furthermore, in the calculation of the RWA or the EL, the probability of default used corresponds to the probability that the default occurs over the year, and not exactly at the end of the year. The proposed formula of our final CCF parameter is:

$$CCF_{predicted} = \frac{\text{Average Exposure over } [t_0;\ t_0 + 12\ \text{months}]}{\text{Exposure}(t_0)}$$
In this research study, the database used for the modelling only considered clients whose
exposures deteriorate over time; hence the EAD curve is increasing. Nonetheless, in the event
of positive cashflows being observed, the evolution of the exposure will not necessarily be
monotonic. Thus, to remain as conservative as possible, it has been decided to put in place
an « Effectiveness » computation in order to consider the Effective Exposure of the current
account. It is computed as follows:

$$\text{EffExposure}(t) = \max\big(\text{Exposure}(t),\ \text{EffExposure}(t-1)\big)$$
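Before illustrating this graphically, here is a minimal sketch of the Effectiveness computation and of the resulting average-based CCF, assuming a monthly exposure path such as the one produced by the diffusion above (function names are illustrative):

```python
import numpy as np

def effective_exposure(exposure_path):
    """Running maximum: EffExposure(t) = max(Exposure(t), EffExposure(t-1)).

    Forces the curve used for averaging to be non-decreasing, which is the
    conservative choice when positive cashflows may occur."""
    return np.maximum.accumulate(np.asarray(exposure_path, dtype=float))

def ccf_predicted(exposure_path, exposure_t0):
    """Average effective exposure over the 12 forecast months divided by the
    initial exposure, as in the CCF formula above."""
    return effective_exposure(exposure_path).mean() / exposure_t0
```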
As an illustration of this principle, the simulated exposure corresponds to the solid curve in the
figure below, whereas the Effective Exposure, used to calculate the final average exposure,
corresponds to the dashed curve:
Figure 19 - Illustration of the Effective Exposure
Going back to the initial exposure forecast, on the basis of the data appearing in Figure 18
and the known exposure value of December 2012, it is now possible to plot the evolution of the
EAD by summing the cumulated cashflows. The predicted results compared to the observed
ones are presented in the graph below:
Figure 20 - Evolution of the exposure over a one-year horizon – December 2012
In addition to Figure 18, Figure 20 also shows the good fit of the model and its ability to
predict the exposure at each date over a one-year horizon. Usually, the CCF would be
calculated by dividing the last point observed in Figure 20 by the first one:
$$CCF_{\text{Observed}} = \frac{\text{Exposure}(t_0 + 12\ \text{months})}{\text{Exposure}(t_0)}$$

Which would give:

$$CCF_{obs} = 1.1854$$
While the CCF value based on the stochastic diffusion model of the exposure, using the
average exposure, is:

$$CCF_{\text{Average pred}} = 1.1772$$

Finally, the average prediction of the CCF is relatively close to the observations, since the
relative error between the two values is equal to 0.69%:

$$\left|\,1 - \frac{CCF_{\text{Average pred}}}{CCF_{obs}}\,\right| = \left|\,1 - \frac{1.1772}{1.1854}\,\right| = 0.69\%$$
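This closing figure can be checked in one line:

```python
ccf_observed = 1.1854  # EAD(t0 + 12 months) / EAD(t0), from Figure 20
ccf_avg_pred = 1.1772  # average-exposure CCF from the GBM diffusion

print(f"{abs(1 - ccf_avg_pred / ccf_observed):.2%}")  # -> 0.69%
```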
7. Conclusion
This white paper presented an alternative modelling methodology for the CCF parameter,
developed by the Global Research & Analytics (GRA) team of Chappuis Halder & Co, which
is based on the stochastic modelling of the worsening cashflows of the exposures. After
presenting and testing several methods, one was retained: the one giving the most accurate
estimation of the CCF parameter while providing a significant addition of information
compared to the usual statistical models, as it captures the monthly evolution of the EAD.
Our study was performed on time series, hence the initial choice of the Ornstein Uhlenbeck
(OU) model, a mean-reverting stochastic process frequently used in market finance. After
calibrating the model, and in order to assess its accuracy, a cross-validation exercise was
carried out on a database that had not been used during the calibration step. Indeed, an
appropriate forecasting model of the CCF should not only predict the final, most conservative
EAD level but also remain robust across estimations. In the end, the results, even though
satisfactory, were not usable: the differentiation made them less accurate and failed to take
account of the drift term observed in our portfolio over the last years.
To address those issues, a second stochastic model was considered: the Geometric Brownian
Motion (GBM) model. Its most notable differences are that it does not require a stationarity
hypothesis and that it introduces a drift term. Although the size of the estimation envelope
could be problematic during the final CCF modelling, the results were better than those
obtained with the OU model, as the drift was now considered. This model can be improved
but seems more relevant than the Ornstein Uhlenbeck model; hence it was chosen for the
final modelling of the CCF parameter.
In the last part, the CCF parameter was modelled thanks to the forecast of the worsening
cashflows; the main advantage of this methodology is that it enables us to retrieve the exposure
month by month. Finally, the observed CCF was computed and then compared to the CCF
predicted by the GBM model. To conclude, the average prediction of the CCF is very close to
the observed one, with a relative error of 0.69%.
8. References
− Basel Committee on Banking Supervision [2016]
− Benchmarking regression algorithms for loss given default modelling, G. Loterman, I. Brown, D. Martens, C. Mues & B. Baesens [2012]
− Using a transactor/revolver scorecard to make credit and pricing decisions, M.C. So, L.C. Thomas, H-V. Seow & Mues [2014]
− Estimating EAD for retail exposures for Basel II purposes, Valvonis [2008]
− An empirical study of exposure at default, Jacobs [2010]
− Modelling exposure at default, credit conversion factors and the Basel II accord, Taplin, Minh To & Hee [2007]
− Modelling exposure at default and loss given default: empirical approaches and technical implementation, Yang & Tkachenko [2012]
− Loss given default models incorporating macro-economic variables for credit cards, Bellotti & Crook [2012]
− Regression model development for credit card exposure at default (EAD) using SAS/STAT and SAS Enterprise Miner 5.3, Brown [2011]
− Exposure at default of unsecured credit cards, Qi [2009]
− A new mixture model for the estimation of credit card exposure at default, Leow & Crook [2015]
− Modelling credit risk of portfolio of consumer loans, Malik & Thomas [2010]
− Apprentissage Statistique : modélisation, prévision et data mining, P. Besse & B. Laurent
− Calibrating the Ornstein-Uhlenbeck (Vasicek) model, T. van de Berg [2011]
− Review of Statistical Arbitrage, Cointegration and Multivariate Ornstein-Uhlenbeck, Chapter 1, A. Meucci [2010]
− Consultative document – Fundamental Review of the Trading Book: A revised market risk framework, Basel Committee on Banking Supervision, January 2014
9. Table of Figures
Figure 1 – Illustration of the modelling database .......................................................... 12
Figure 2 – Monthly evolution of the total EAD – June 2009 to December 2012 .......... 13
Figure 3 – Monthly evolution of the accounts cashflows – Tenor 5M .......................... 14
Figure 4 – Descriptive statistics of worsening cashflows – Tenor 5M .......................... 15
Figure 5 – Matrix of worsening cashflows – Application data – C.O.B. December 2012 .... 17
Figure 6 – Evolution of the cumulated exposure (EAD) ................................................ 18
Figure 7 – QQ-plot of the 9M Tenor ............................................................................... 20
Figure 8 – ACF and PACF plots – Differentiated 8M Tenor ......................................... 21
Figure 9 – Explicative figure showing the construction process of correlated Brownian motions .... 23
Figure 10 – 9M Differentiated Tenor – Worsening cashflow – Estimated vs. Observed .... 25
Figure 11 – 9M Tenor – Worsening cashflow – Estimated vs. Observed ...................... 25
Figure 12 – 10M Differentiated Tenor – Worsening cashflow – Estimated vs. Observed .... 25
Figure 13 – 10M Tenor – Worsening cashflow – Estimated vs. Observed .................... 25
Figure 14 – QQ-plot – Log of the increments of the 12M Tenor ................................... 27
Figure 15 – ACF & PACF – 10M Tenor ......................................................................... 28
Figure 16 – 9M Tenor – Worsening cashflows – Estimated vs. Observed .................... 29
Figure 17 – 10M Tenor – Worsening cashflows – Estimated vs. Observed .................. 29
Figure 18 – Evolution of the bank exposure cashflow over a one-year horizon – December 2012 .... 30
Figure 19 – Illustration of the Effective Exposure ......................................................... 31
Figure 20 – Evolution of the exposure over a one-year horizon – December 2012 ...... 32
© Global Research & Analytics Dept.| 2020 | All rights reserved
36

More Related Content

What's hot

Basel introduction
Basel introductionBasel introduction
Basel introduction
asfhaq
 
Transition matrices and PD’s term structure - Anna Cornaglia
Transition matrices and PD’s term structure - Anna CornagliaTransition matrices and PD’s term structure - Anna Cornaglia
Transition matrices and PD’s term structure - Anna Cornaglia
László Árvai
 
Counterparty credit risk. general review
Counterparty credit risk. general reviewCounterparty credit risk. general review
Counterparty credit risk. general reviewRoman Kornyliuk
 
IRB and IFRS 9 credit risk models - consistent (re)implementation
IRB and IFRS 9 credit risk models - consistent (re)implementationIRB and IFRS 9 credit risk models - consistent (re)implementation
IRB and IFRS 9 credit risk models - consistent (re)implementation
Kristi Rohtsalu
 
Basel iii capital adequacy accord
Basel iii capital adequacy accordBasel iii capital adequacy accord
Basel iii capital adequacy accordPankaj Baid
 
Shadow banking
Shadow bankingShadow banking
Shadow banking
Tejas Soman
 
Security Analysts’ Views of the Financial Ratios of Manufacturers and Retailers
Security Analysts’ Views of the Financial Ratios of Manufacturers and Retailers Security Analysts’ Views of the Financial Ratios of Manufacturers and Retailers
Security Analysts’ Views of the Financial Ratios of Manufacturers and Retailers
Raju Basnet Chhetri
 
CDO-Squared Demystified
CDO-Squared DemystifiedCDO-Squared Demystified
CDO-Squared Demystifiedfinancedude
 
Counterparty Credit Risk and CVA under Basel III
Counterparty Credit Risk and CVA under Basel IIICounterparty Credit Risk and CVA under Basel III
Counterparty Credit Risk and CVA under Basel III
Häner Consulting
 
"Credit Risk-Probabilities Of Default"
"Credit Risk-Probabilities Of Default""Credit Risk-Probabilities Of Default"
"Credit Risk-Probabilities Of Default"
Arun Singh
 
Banking credit concentration management -limiting setting
Banking credit concentration management -limiting setting Banking credit concentration management -limiting setting
Banking credit concentration management -limiting setting Eric Kuo
 
An Empirical Study of Exposure at Default
An Empirical Study of Exposure at DefaultAn Empirical Study of Exposure at Default
An Empirical Study of Exposure at DefaultMichael Jacobs, Jr.
 
Toward Credit Portfolio Management
Toward Credit Portfolio Management Toward Credit Portfolio Management
Toward Credit Portfolio Management
Eric Kuo
 
CAPM Fama French
CAPM Fama FrenchCAPM Fama French
CAPM Fama FrenchAhmedSaba
 
Credit management
Credit managementCredit management
Credit managementKhalid Aziz
 
Interest Rate Risk And Management
Interest Rate Risk And ManagementInterest Rate Risk And Management
Interest Rate Risk And Management
catelong
 
Liquidity Risk
Liquidity RiskLiquidity Risk
Liquidity Risk
nikatmalik
 
Secured Overnight Financing Rate and Beyond: The New Benchmark - Expectation...
 Secured Overnight Financing Rate and Beyond: The New Benchmark - Expectation... Secured Overnight Financing Rate and Beyond: The New Benchmark - Expectation...
Secured Overnight Financing Rate and Beyond: The New Benchmark - Expectation...
accenture
 

What's hot (20)

Basel introduction
Basel introductionBasel introduction
Basel introduction
 
Transition matrices and PD’s term structure - Anna Cornaglia
Transition matrices and PD’s term structure - Anna CornagliaTransition matrices and PD’s term structure - Anna Cornaglia
Transition matrices and PD’s term structure - Anna Cornaglia
 
Counterparty credit risk. general review
Counterparty credit risk. general reviewCounterparty credit risk. general review
Counterparty credit risk. general review
 
IRB and IFRS 9 credit risk models - consistent (re)implementation
IRB and IFRS 9 credit risk models - consistent (re)implementationIRB and IFRS 9 credit risk models - consistent (re)implementation
IRB and IFRS 9 credit risk models - consistent (re)implementation
 
Basel iii capital adequacy accord
Basel iii capital adequacy accordBasel iii capital adequacy accord
Basel iii capital adequacy accord
 
Shadow banking
Shadow bankingShadow banking
Shadow banking
 
Security Analysts’ Views of the Financial Ratios of Manufacturers and Retailers
Security Analysts’ Views of the Financial Ratios of Manufacturers and Retailers Security Analysts’ Views of the Financial Ratios of Manufacturers and Retailers
Security Analysts’ Views of the Financial Ratios of Manufacturers and Retailers
 
CDO-Squared Demystified
CDO-Squared DemystifiedCDO-Squared Demystified
CDO-Squared Demystified
 
Counterparty Credit Risk and CVA under Basel III
Counterparty Credit Risk and CVA under Basel IIICounterparty Credit Risk and CVA under Basel III
Counterparty Credit Risk and CVA under Basel III
 
"Credit Risk-Probabilities Of Default"
"Credit Risk-Probabilities Of Default""Credit Risk-Probabilities Of Default"
"Credit Risk-Probabilities Of Default"
 
Banking credit concentration management -limiting setting
Banking credit concentration management -limiting setting Banking credit concentration management -limiting setting
Banking credit concentration management -limiting setting
 
An Empirical Study of Exposure at Default
An Empirical Study of Exposure at DefaultAn Empirical Study of Exposure at Default
An Empirical Study of Exposure at Default
 
Toward Credit Portfolio Management
Toward Credit Portfolio Management Toward Credit Portfolio Management
Toward Credit Portfolio Management
 
CAPM Fama French
CAPM Fama FrenchCAPM Fama French
CAPM Fama French
 
Credit management
Credit managementCredit management
Credit management
 
Interest Rate Risk And Management
Interest Rate Risk And ManagementInterest Rate Risk And Management
Interest Rate Risk And Management
 
9_Advanced Credit Risk Management Methods
9_Advanced Credit Risk Management Methods9_Advanced Credit Risk Management Methods
9_Advanced Credit Risk Management Methods
 
Credit risk models
Credit risk modelsCredit risk models
Credit risk models
 
Liquidity Risk
Liquidity RiskLiquidity Risk
Liquidity Risk
 
Secured Overnight Financing Rate and Beyond: The New Benchmark - Expectation...
 Secured Overnight Financing Rate and Beyond: The New Benchmark - Expectation... Secured Overnight Financing Rate and Beyond: The New Benchmark - Expectation...
Secured Overnight Financing Rate and Beyond: The New Benchmark - Expectation...
 

Similar to EAD Parameter : A stochastic way to model the Credit Conversion Factor

Stochastic modelling of the loss given default (LGD) for non-defaulted assets
Stochastic modelling of the loss given default (LGD) for non-defaulted assetsStochastic modelling of the loss given default (LGD) for non-defaulted assets
Stochastic modelling of the loss given default (LGD) for non-defaulted assets
GRATeam
 
Counterparty Credit RISK | Evolution of standardised approach
Counterparty Credit RISK | Evolution of standardised approachCounterparty Credit RISK | Evolution of standardised approach
Counterparty Credit RISK | Evolution of standardised approach
GRATeam
 
CDO rating methodology
CDO rating methodologyCDO rating methodology
CDO rating methodologyfinancedude
 
SMA | Comments on BCBS (June 2016) consultation (Standardized Measurement App...
SMA | Comments on BCBS (June 2016) consultation (Standardized Measurement App...SMA | Comments on BCBS (June 2016) consultation (Standardized Measurement App...
SMA | Comments on BCBS (June 2016) consultation (Standardized Measurement App...
GRATeam
 
Back-testing of Expected Shortfall : Main challenges and methodologies
Back-testing of Expected Shortfall : Main challenges and methodologies Back-testing of Expected Shortfall : Main challenges and methodologies
Back-testing of Expected Shortfall : Main challenges and methodologies
GRATeam
 
Expected shortfall-back testing
Expected shortfall-back testingExpected shortfall-back testing
Expected shortfall-back testing
Genest Benoit
 
Basel II IRB Risk Weight Functions
Basel II IRB Risk Weight FunctionsBasel II IRB Risk Weight Functions
Basel II IRB Risk Weight Functions
GRATeam
 
Basel II IRB Risk Weight Functions : Demonstration and Analysis
Basel II IRB Risk Weight Functions : Demonstration and AnalysisBasel II IRB Risk Weight Functions : Demonstration and Analysis
Basel II IRB Risk Weight Functions : Demonstration and Analysis
Genest Benoit
 
FRTB Technology Overview
FRTB Technology OverviewFRTB Technology Overview
FRTB Technology Overview
Steve Hicks, FRM
 
Comments on Basel Op Risk proposal finally published ...
Comments on Basel Op Risk proposal finally published ...Comments on Basel Op Risk proposal finally published ...
Comments on Basel Op Risk proposal finally published ...
Genest Benoit
 
Pillar III presentation 2 27-15 - redacted version
Pillar III presentation 2 27-15 - redacted versionPillar III presentation 2 27-15 - redacted version
Pillar III presentation 2 27-15 - redacted version
Benjamin Huston
 
LCR Portfoliooptimisation
LCR PortfoliooptimisationLCR Portfoliooptimisation
LCR Portfoliooptimisation
Peter Bichl
 
Time series models with discrete wavelet transform
Time series models with discrete wavelet transformTime series models with discrete wavelet transform
Time series models with discrete wavelet transform
Akash Raj
 
A0067927 amit sinha_thesis
A0067927 amit sinha_thesisA0067927 amit sinha_thesis
A0067927 amit sinha_thesis
Amit Sinha
 
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-461506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4Alexander Hamilton, PhD
 
Bank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim PredictionBank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim Prediction
IRJET Journal
 
Market Practice Series (Credit Losses Modeling)
Market Practice Series (Credit Losses Modeling)Market Practice Series (Credit Losses Modeling)
Market Practice Series (Credit Losses Modeling)
Yahya Kamel
 
The Effectiveness of interest rate swaps
The Effectiveness of interest rate swapsThe Effectiveness of interest rate swaps
The Effectiveness of interest rate swapsRoy Meekel
 
A whole Risk models whitepaper
A whole Risk models whitepaperA whole Risk models whitepaper
A whole Risk models whitepaper
C Louiza
 
CHCie - Booklet GRA.compressed
CHCie - Booklet GRA.compressedCHCie - Booklet GRA.compressed
CHCie - Booklet GRA.compressedGenest Benoit
 

Similar to EAD Parameter : A stochastic way to model the Credit Conversion Factor (20)

Stochastic modelling of the loss given default (LGD) for non-defaulted assets
Stochastic modelling of the loss given default (LGD) for non-defaulted assetsStochastic modelling of the loss given default (LGD) for non-defaulted assets
Stochastic modelling of the loss given default (LGD) for non-defaulted assets
 
Counterparty Credit RISK | Evolution of standardised approach
Counterparty Credit RISK | Evolution of standardised approachCounterparty Credit RISK | Evolution of standardised approach
Counterparty Credit RISK | Evolution of standardised approach
 
CDO rating methodology
CDO rating methodologyCDO rating methodology
CDO rating methodology
 
SMA | Comments on BCBS (June 2016) consultation (Standardized Measurement App...
SMA | Comments on BCBS (June 2016) consultation (Standardized Measurement App...SMA | Comments on BCBS (June 2016) consultation (Standardized Measurement App...
SMA | Comments on BCBS (June 2016) consultation (Standardized Measurement App...
 
Back-testing of Expected Shortfall : Main challenges and methodologies
Back-testing of Expected Shortfall : Main challenges and methodologies Back-testing of Expected Shortfall : Main challenges and methodologies
Back-testing of Expected Shortfall : Main challenges and methodologies
 
Expected shortfall-back testing
Expected shortfall-back testingExpected shortfall-back testing
Expected shortfall-back testing
 
Basel II IRB Risk Weight Functions
Basel II IRB Risk Weight FunctionsBasel II IRB Risk Weight Functions
Basel II IRB Risk Weight Functions
 
Basel II IRB Risk Weight Functions : Demonstration and Analysis
Basel II IRB Risk Weight Functions : Demonstration and AnalysisBasel II IRB Risk Weight Functions : Demonstration and Analysis
Basel II IRB Risk Weight Functions : Demonstration and Analysis
 
FRTB Technology Overview
FRTB Technology OverviewFRTB Technology Overview
FRTB Technology Overview
 
Comments on Basel Op Risk proposal finally published ...
Comments on Basel Op Risk proposal finally published ...Comments on Basel Op Risk proposal finally published ...
Comments on Basel Op Risk proposal finally published ...
 
Pillar III presentation 2 27-15 - redacted version
Pillar III presentation 2 27-15 - redacted versionPillar III presentation 2 27-15 - redacted version
Pillar III presentation 2 27-15 - redacted version
 
LCR Portfoliooptimisation
LCR PortfoliooptimisationLCR Portfoliooptimisation
LCR Portfoliooptimisation
 
Time series models with discrete wavelet transform
Time series models with discrete wavelet transformTime series models with discrete wavelet transform
Time series models with discrete wavelet transform
 
A0067927 amit sinha_thesis
A0067927 amit sinha_thesisA0067927 amit sinha_thesis
A0067927 amit sinha_thesis
 
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-461506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
 
Bank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim PredictionBank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim Prediction
 
Market Practice Series (Credit Losses Modeling)
Market Practice Series (Credit Losses Modeling)Market Practice Series (Credit Losses Modeling)
Market Practice Series (Credit Losses Modeling)
 
The Effectiveness of interest rate swaps
The Effectiveness of interest rate swapsThe Effectiveness of interest rate swaps
The Effectiveness of interest rate swaps
 
A whole Risk models whitepaper
A whole Risk models whitepaperA whole Risk models whitepaper
A whole Risk models whitepaper
 
CHCie - Booklet GRA.compressed
CHCie - Booklet GRA.compressedCHCie - Booklet GRA.compressed
CHCie - Booklet GRA.compressed
 

More from Genest Benoit

climate risk 7 proposals
climate risk 7 proposalsclimate risk 7 proposals
climate risk 7 proposals
Genest Benoit
 
Model Risk Management | How to measure and quantify model risk?
Model Risk Management | How to measure and quantify model risk?Model Risk Management | How to measure and quantify model risk?
Model Risk Management | How to measure and quantify model risk?
Genest Benoit
 
Risk modelling hot topics
Risk modelling hot topicsRisk modelling hot topics
Risk modelling hot topics
Genest Benoit
 
Ch gra wp_cat_bonds_2017
Ch gra wp_cat_bonds_2017Ch gra wp_cat_bonds_2017
Ch gra wp_cat_bonds_2017
Genest Benoit
 
Gra wp modelling perspectives
Gra wp modelling perspectivesGra wp modelling perspectives
Gra wp modelling perspectives
Genest Benoit
 
Data Science by Chappuis Halder & Co.
Data Science by Chappuis Halder & Co.Data Science by Chappuis Halder & Co.
Data Science by Chappuis Halder & Co.
Genest Benoit
 
Booklet_GRA_RISK MODELLING_Second Edition (002).compressed
Booklet_GRA_RISK  MODELLING_Second Edition (002).compressedBooklet_GRA_RISK  MODELLING_Second Edition (002).compressed
Booklet_GRA_RISK MODELLING_Second Edition (002).compressedGenest Benoit
 
CH&Co. latest white paper on VaR
CH&Co. latest white paper on VaRCH&Co. latest white paper on VaR
CH&Co. latest white paper on VaR
Genest Benoit
 
CH&Co Latest White Paper on VaR
CH&Co Latest White Paper on VaRCH&Co Latest White Paper on VaR
CH&Co Latest White Paper on VaR
Genest Benoit
 

More from Genest Benoit (9)

climate risk 7 proposals
climate risk 7 proposalsclimate risk 7 proposals
climate risk 7 proposals
 
Model Risk Management | How to measure and quantify model risk?
Model Risk Management | How to measure and quantify model risk?Model Risk Management | How to measure and quantify model risk?
Model Risk Management | How to measure and quantify model risk?
 
Risk modelling hot topics
Risk modelling hot topicsRisk modelling hot topics
Risk modelling hot topics
 
Ch gra wp_cat_bonds_2017
Ch gra wp_cat_bonds_2017Ch gra wp_cat_bonds_2017
Ch gra wp_cat_bonds_2017
 
Gra wp modelling perspectives
Gra wp modelling perspectivesGra wp modelling perspectives
Gra wp modelling perspectives
 
Data Science by Chappuis Halder & Co.
Data Science by Chappuis Halder & Co.Data Science by Chappuis Halder & Co.
Data Science by Chappuis Halder & Co.
 
Booklet_GRA_RISK MODELLING_Second Edition (002).compressed
Booklet_GRA_RISK  MODELLING_Second Edition (002).compressedBooklet_GRA_RISK  MODELLING_Second Edition (002).compressed
Booklet_GRA_RISK MODELLING_Second Edition (002).compressed
 
CH&Co. latest white paper on VaR
CH&Co. latest white paper on VaRCH&Co. latest white paper on VaR
CH&Co. latest white paper on VaR
 
CH&Co Latest White Paper on VaR
CH&Co Latest White Paper on VaRCH&Co Latest White Paper on VaR
CH&Co Latest White Paper on VaR
 

Recently uploaded

USDA Loans in California: A Comprehensive Overview.pptx
USDA Loans in California: A Comprehensive Overview.pptxUSDA Loans in California: A Comprehensive Overview.pptx
USDA Loans in California: A Comprehensive Overview.pptx
marketing367770
 
Chương 6. Ancol - phenol - ether (1).pdf
Chương 6. Ancol - phenol - ether (1).pdfChương 6. Ancol - phenol - ether (1).pdf
Chương 6. Ancol - phenol - ether (1).pdf
va2132004
 
What website can I sell pi coins securely.
What website can I sell pi coins securely.What website can I sell pi coins securely.
What website can I sell pi coins securely.
DOT TECH
 
Turin Startup Ecosystem 2024 - Ricerca sulle Startup e il Sistema dell'Innov...
Turin Startup Ecosystem 2024  - Ricerca sulle Startup e il Sistema dell'Innov...Turin Startup Ecosystem 2024  - Ricerca sulle Startup e il Sistema dell'Innov...
Turin Startup Ecosystem 2024 - Ricerca sulle Startup e il Sistema dell'Innov...
Quotidiano Piemontese
 
NO1 Uk Black Magic Specialist Expert In Sahiwal, Okara, Hafizabad, Mandi Bah...
NO1 Uk Black Magic Specialist Expert In Sahiwal, Okara, Hafizabad,  Mandi Bah...NO1 Uk Black Magic Specialist Expert In Sahiwal, Okara, Hafizabad,  Mandi Bah...
NO1 Uk Black Magic Specialist Expert In Sahiwal, Okara, Hafizabad, Mandi Bah...
Amil Baba Dawood bangali
 
PF-Wagner's Theory of Public Expenditure.pptx
PF-Wagner's Theory of Public Expenditure.pptxPF-Wagner's Theory of Public Expenditure.pptx
PF-Wagner's Theory of Public Expenditure.pptx
GunjanSharma28848
 
innovative-invoice-discounting-platforms-in-india-empowering-retail-investors...
innovative-invoice-discounting-platforms-in-india-empowering-retail-investors...innovative-invoice-discounting-platforms-in-india-empowering-retail-investors...
innovative-invoice-discounting-platforms-in-india-empowering-retail-investors...
Falcon Invoice Discounting
 
US Economic Outlook - Being Decided - M Capital Group August 2021.pdf
US Economic Outlook - Being Decided - M Capital Group August 2021.pdfUS Economic Outlook - Being Decided - M Capital Group August 2021.pdf
US Economic Outlook - Being Decided - M Capital Group August 2021.pdf
pchutichetpong
 
Greek trade a pillar of dynamic economic growth - European Business Review
Greek trade a pillar of dynamic economic growth - European Business ReviewGreek trade a pillar of dynamic economic growth - European Business Review
Greek trade a pillar of dynamic economic growth - European Business Review
Antonis Zairis
 
Introduction to Indian Financial System ()
Introduction to Indian Financial System ()Introduction to Indian Financial System ()
Introduction to Indian Financial System ()
Avanish Goel
 
when will pi network coin be available on crypto exchange.
when will pi network coin be available on crypto exchange.when will pi network coin be available on crypto exchange.
when will pi network coin be available on crypto exchange.
DOT TECH
 
how to sell pi coins on Bitmart crypto exchange
how to sell pi coins on Bitmart crypto exchangehow to sell pi coins on Bitmart crypto exchange
how to sell pi coins on Bitmart crypto exchange
DOT TECH
 
Intro_Economics_ GPresentation Week 4.pptx
Intro_Economics_ GPresentation Week 4.pptxIntro_Economics_ GPresentation Week 4.pptx
Intro_Economics_ GPresentation Week 4.pptx
shetivia
 
how can i use my minded pi coins I need some funds.
how can i use my minded pi coins I need some funds.how can i use my minded pi coins I need some funds.
how can i use my minded pi coins I need some funds.
DOT TECH
 
Summary of financial results for 1Q2024
Summary of financial  results for 1Q2024Summary of financial  results for 1Q2024
Summary of financial results for 1Q2024
InterCars
 
what is a pi whale and how to access one.
what is a pi whale and how to access one.what is a pi whale and how to access one.
what is a pi whale and how to access one.
DOT TECH
 
how to sell pi coins at high rate quickly.
how to sell pi coins at high rate quickly.how to sell pi coins at high rate quickly.
how to sell pi coins at high rate quickly.
DOT TECH
 
Isios-2024-Professional-Independent-Trustee-Survey.pdf
Isios-2024-Professional-Independent-Trustee-Survey.pdfIsios-2024-Professional-Independent-Trustee-Survey.pdf
Isios-2024-Professional-Independent-Trustee-Survey.pdf
Henry Tapper
 
NO1 Uk Divorce problem uk all amil baba in karachi,lahore,pakistan talaq ka m...
NO1 Uk Divorce problem uk all amil baba in karachi,lahore,pakistan talaq ka m...NO1 Uk Divorce problem uk all amil baba in karachi,lahore,pakistan talaq ka m...
NO1 Uk Divorce problem uk all amil baba in karachi,lahore,pakistan talaq ka m...
Amil Baba Dawood bangali
 
MERCHANTBANKING-PDF complete picture.pdf
MERCHANTBANKING-PDF complete picture.pdfMERCHANTBANKING-PDF complete picture.pdf
MERCHANTBANKING-PDF complete picture.pdf
Sudarshan Dakuru
 

Recently uploaded (20)

USDA Loans in California: A Comprehensive Overview.pptx
USDA Loans in California: A Comprehensive Overview.pptxUSDA Loans in California: A Comprehensive Overview.pptx
USDA Loans in California: A Comprehensive Overview.pptx
 
Chương 6. Ancol - phenol - ether (1).pdf
Chương 6. Ancol - phenol - ether (1).pdfChương 6. Ancol - phenol - ether (1).pdf
Chương 6. Ancol - phenol - ether (1).pdf
 
What website can I sell pi coins securely.
What website can I sell pi coins securely.What website can I sell pi coins securely.
What website can I sell pi coins securely.
 
Turin Startup Ecosystem 2024 - Ricerca sulle Startup e il Sistema dell'Innov...
Turin Startup Ecosystem 2024  - Ricerca sulle Startup e il Sistema dell'Innov...Turin Startup Ecosystem 2024  - Ricerca sulle Startup e il Sistema dell'Innov...
Turin Startup Ecosystem 2024 - Ricerca sulle Startup e il Sistema dell'Innov...
 
NO1 Uk Black Magic Specialist Expert In Sahiwal, Okara, Hafizabad, Mandi Bah...
NO1 Uk Black Magic Specialist Expert In Sahiwal, Okara, Hafizabad,  Mandi Bah...NO1 Uk Black Magic Specialist Expert In Sahiwal, Okara, Hafizabad,  Mandi Bah...
NO1 Uk Black Magic Specialist Expert In Sahiwal, Okara, Hafizabad, Mandi Bah...
 
PF-Wagner's Theory of Public Expenditure.pptx
PF-Wagner's Theory of Public Expenditure.pptxPF-Wagner's Theory of Public Expenditure.pptx
PF-Wagner's Theory of Public Expenditure.pptx
 
innovative-invoice-discounting-platforms-in-india-empowering-retail-investors...
innovative-invoice-discounting-platforms-in-india-empowering-retail-investors...innovative-invoice-discounting-platforms-in-india-empowering-retail-investors...
innovative-invoice-discounting-platforms-in-india-empowering-retail-investors...
 
US Economic Outlook - Being Decided - M Capital Group August 2021.pdf
US Economic Outlook - Being Decided - M Capital Group August 2021.pdfUS Economic Outlook - Being Decided - M Capital Group August 2021.pdf
US Economic Outlook - Being Decided - M Capital Group August 2021.pdf
 
Greek trade a pillar of dynamic economic growth - European Business Review
Greek trade a pillar of dynamic economic growth - European Business ReviewGreek trade a pillar of dynamic economic growth - European Business Review
Greek trade a pillar of dynamic economic growth - European Business Review
 
Introduction to Indian Financial System ()
Introduction to Indian Financial System ()Introduction to Indian Financial System ()
Introduction to Indian Financial System ()
 
when will pi network coin be available on crypto exchange.
when will pi network coin be available on crypto exchange.when will pi network coin be available on crypto exchange.
when will pi network coin be available on crypto exchange.
 
how to sell pi coins on Bitmart crypto exchange
how to sell pi coins on Bitmart crypto exchangehow to sell pi coins on Bitmart crypto exchange
how to sell pi coins on Bitmart crypto exchange
 
Intro_Economics_ GPresentation Week 4.pptx
Intro_Economics_ GPresentation Week 4.pptxIntro_Economics_ GPresentation Week 4.pptx
Intro_Economics_ GPresentation Week 4.pptx
 
how can i use my minded pi coins I need some funds.
how can i use my minded pi coins I need some funds.how can i use my minded pi coins I need some funds.
how can i use my minded pi coins I need some funds.
 
Summary of financial results for 1Q2024
Summary of financial  results for 1Q2024Summary of financial  results for 1Q2024
Summary of financial results for 1Q2024
 
what is a pi whale and how to access one.
what is a pi whale and how to access one.what is a pi whale and how to access one.
what is a pi whale and how to access one.
 
how to sell pi coins at high rate quickly.
how to sell pi coins at high rate quickly.how to sell pi coins at high rate quickly.
how to sell pi coins at high rate quickly.
 
Isios-2024-Professional-Independent-Trustee-Survey.pdf
Isios-2024-Professional-Independent-Trustee-Survey.pdfIsios-2024-Professional-Independent-Trustee-Survey.pdf
Isios-2024-Professional-Independent-Trustee-Survey.pdf
 
NO1 Uk Divorce problem uk all amil baba in karachi,lahore,pakistan talaq ka m...
NO1 Uk Divorce problem uk all amil baba in karachi,lahore,pakistan talaq ka m...NO1 Uk Divorce problem uk all amil baba in karachi,lahore,pakistan talaq ka m...
NO1 Uk Divorce problem uk all amil baba in karachi,lahore,pakistan talaq ka m...
 
MERCHANTBANKING-PDF complete picture.pdf
MERCHANTBANKING-PDF complete picture.pdfMERCHANTBANKING-PDF complete picture.pdf
MERCHANTBANKING-PDF complete picture.pdf
 

EAD Parameter : A stochastic way to model the Credit Conversion Factor

  • 1. © CHAPPUIS HALDER & CO EAD Parameter : A stochastic way to model the Credit Conversion Factor By Leonard Brie and Yousra Belmajdoub Global Research & Analytics1 1 This work was supported by the Global Research & Analytics Dept. of Chappuis Halder & Co. Many thanks to Simon Corcos for his involvement and work on this White paper as well as his time spent on the writing and thanks to Helene Freon for her help on the translation of this White Paper.
  • 2. © Global Research & Analytics Dept.| 2020 | All rights reserved 2 Executive Summary Following the 2007-2008 financial crisis, the banking industry faced an overall strengthening of regulatory requirements, especially in the EU. The main goal being to ensure an adequate level of capitalisation to guarantee financial robustness. Basel II defined a prudential framework where banks are required to hold minimum capital amount to cover all their risks. Banks are required to precisely estimate their risks – notably credit risk, which account for more than 75% of banks overall RWA – in order to hold an optimal level of capital while respecting regulatory limits. This white paper aims at estimating credit risk by modelling the Credit Conversion Factor (CCF) parameter related to the Exposure-at-Default (EAD). It has been decided to perform the estimation thanks to stochastic processes instead of usual statistical methodologies (such as classification tree or GLM). Our paper will focus on two types of model: the Ornstein Uhlenbeck (OU) model – part of ARMA model types – and the Geometric Brownian Movement (GBM) model. First, we will describe, then implement and calibrate each model to ensure relevance and robustness of our results. Then, we will focus on GBM model to model CCF. Finally, it has been observed that the stochastic methods provide satisfying, robust and accurate results where the delta between observed and expected CCF is negligible (0.69% relative deviation). Furthermore, these methodologies enable the capture of the EAD monthly evolution, which is a significant addition of information in comparison to the usual statistical models. Keywords : EAD, CCF, Basel III, Credit Risk, Stochastic process Classification JEL : C02, C22, C63, G01, G17, G21
  • 3. © Global Research & Analytics Dept.| 2020 | All rights reserved 3 Table of Contents Executive Summary ................................................................................................................... 2 Table of Contents ....................................................................................................................... 3 1. Introduction........................................................................................................................ 5 2. Context ............................................................................................................................... 5 3. The CCF Parameter............................................................................................................ 6 Definition..................................................................................................................... 6 Objectives of our White Paper..................................................................................... 6 4. Benchmark of existing methodologies............................................................................... 7 5. Modelling the CCF parameter.......................................................................................... 11 Creation of the modelling database ........................................................................... 11 5.1.1. Features of the modelling portfolio.................................................................... 11 5.1.2. Data quality tests ................................................................................................ 15 5.1.3. Application and calibration data ........................................................................ 17 5.1.4. Modelling principle............................................................................................ 18 Ornstein Uhlenbeck model (OU)............................................................................... 19 5.2.1. Model’s presentation.......................................................................................... 19 5.2.2. Tenor Analysis – Check of the model assumptions ........................................... 20 5.2.3. Implementation and calibration of the model .................................................... 22 5.2.4. Results................................................................................................................ 24 Geometric Brownian Motion model (GBM)............................................................. 26 5.3.1. Model presentation............................................................................................. 26 5.3.2. Tenor Analysis – Check of the model assumptions ........................................... 27 5.3.3. Implementing and calibrating the model............................................................ 28
  • 4. © Global Research & Analytics Dept.| 2020 | All rights reserved 4 5.3.4. Results................................................................................................................ 28 6. Application of the methodology to model the CCF parameter........................................ 30 7. Conclusion........................................................................................................................ 33 8. References ........................................................................................................................ 34 9. Table of Figures ............................................................................................................... 35
  • 5. © Global Research & Analytics Dept.| 2020 | All rights reserved 5 1. Introduction Following the 2007-2008 financial crisis, the banking industry faced an overall strengthening of regulatory requirements, especially in the EU. The main goal being to ensure an adequate level of capitalisation to guarantee financial robustness. Basel II defined a prudential framework where banks are required to hold minimum capital amount to cover all their risks. Its fundamentals are three-fold: • Pilar 1 | Regulatory capital requirements, reflected through the calculation of McDonough’s solvency ratio; • Pilar 2 | Surveillance mechanism to monitor capital management; • Pilar 3 | Market discipline through transparent communication between financial institutions. Pilar 1 defines the McDonough’s ratio as the ratio between capital divided by the sum of Risk Weighted Assets (RWA) for credit, market and operational risks. Basel requires a minimum ratio of 8%. Banks seek to optimise their capital levels while adhering to Basel requirements. To that end, they need to precisely estimate their risks, particularly credit risk as it represents on average more than 75% of banks RWA in Europe. Credit risk estimation is based upon the measure of 3 parameters: The Probability of Default (PD), the Loss Given Default (LGD) and the Exposure-At-Default (EAD). There are two types of approach to quantify these parameters: a standard approach where PD and LGD are estimated through external parties’ rating systems; internal approaches where banks use their own internal models, validated by the supervisor, to calculate each parameter. In the latter case, Credit Risk Modelling teams are primarily aiming at optimising the precision and robustness of their models and methodologies. Most methods focus on PD or LGD modelling, rather than EAD or CCF modelling. This paper will focus on an innovative methodology to model CCF parameter for a portfolio of non-defaulted contracts, developed by CH&Co’s Global Research & Analytics (GRA) team. 2. Context The CCF parameter is usually estimated through statistical segmentation methodologies, where each segment is assigned a CCFj value (weighted averages of observed ratios) supplemented by prudential adjustments. This method leads to CCFs representing the EAD over a one-year horizon and therefore does not account for the evolution of the exposure amount over the pre- default period. The challenge for the Banks being to develop CCF estimating methods that are both accurate and minimise the volatility of the resulting estimators.
  • 6. © Global Research & Analytics Dept.| 2020 | All rights reserved 6 This paper will focus on an innovative methodology to model CCF parameters for a portfolio of non-defaulted contracts, developed by CH&Co’s Global Research & Analytics (GRA) team. We will follow the following steps.  Estimate the worsening cashflows of a client exposure through stochastic process  Challenge the results using distinctive stochastic dynamics, which simulate different market conditions  Implement validation and sensitivity tests to challenge the proposed model and ensure its accuracy and robustness 3. The CCF Parameter Definition CCF is a credit risk parameter, derived from EAD as it is its equivalent for Off Balance-Sheet contracts. EAD parameter measures the total amount (i.e. sum of capital, interest, commissions and fees) due to the bank by a given defaulting client. For on-balance positions, EAD is the total amount due as of the date of capital computation. For off-balance positions (overdraft facility), EAD is the sum of on-balance exposure plus the off-balance position multiplied by the CCF parameter as of the date of capital computation. To ensure a rigorous and exhaustive definition of this parameter, we’d need to list all specific cases per type of credit contract and to define complex financial concepts. Hence, we will choose a simpler and intuitive definition, applicable to most cases, where CCF is defined as the evolution of the credit exposure over one year: 𝐶𝐶𝐶𝐶𝐶𝐶(𝑡𝑡) = 𝐸𝐸𝐸𝐸𝐸𝐸(𝑡𝑡) 𝐸𝐸𝐸𝐸𝐸𝐸(𝑡𝑡 − 1 year) From a regulatory standpoint, the issue with estimating EAD exists in the calculation of the CCF for contracts with an off-balance position. Regarding the regulatory aspect of post-default drawdowns, it should be noted that the regulations allow institutions to take these into account either in their CCF or in their LGD estimates. Since it is customary to take these withdrawals into account in LGD estimates, it was therefore decided not to include them in the modelling of the CCF parameter. Objectives of our White Paper The current known methodologies put in place by financial institutions to estimate the CCF parameter rely on statistical methodologies as the use of generalized linear model (linear or
  • 7. © Global Research & Analytics Dept.| 2020 | All rights reserved 7 logistic regression) combined with clustering methodologies in order to identify homogeneous cluster of current accounts that will have the same CCF parameter on average. Nevertheless, these methodologies have shown weaknesses, indeed a classic linear regression based on the least square optimisation can lead to unrealistic predicted values as they could be lower than 0 or significantlygreater than 1. Similarly, the use of a logistical regression model may not be the perfect modelling solution as the CCF is a non-binary continuous parameter. This means that the variable used to explain should be transformed which in many cases leads to a loss of information or a bias in the interpretation or analysis of the results. Overall, it has to be noted that most of the Banks aim at predicting the final EAD with statistic models without taking into account the evolution of the exposure during the period preceeding the default. This finding leads us to find a new way to model the CCF parameter relying on time dependant process and thus being able to compute the exposure at any time over a regulatory 1-year horizon. This 1-year horizon being aligned on the probability of default which corresponds to the probability of a client to default over a 1-year horizon. In addition, to the mathematical interest and challenge to develop a new CCF methodology, the probabilistic rather than statistic approach will allow to obtain more precise estimations but could also give a forecast of the exposure at any time (daily, weekly or monthly frequency) over a 1-year period. In other words, now it would be possible to measure the EAD over a period of 1-year instead of 1-year later. 4. Benchmark of existing methodologies Existing methodologies benchmarked among banking institutions rely on various statistical methodologies such as generalised linear models (linear or logistic regressions), or segmentation models where homogenous groups of CCF behaviour are identified and then an average rate specific to each segment is applied. Over the years, five major methodologies have been identified to estimate the CCF parameter; thee are quickly described in this section and are as follows: - Model 1 | Historical mean per cluster - Model 2 | Linear regression function per cluster - Model 3 | Logistic regression function per cluster - Model 4 | CART regression tree to identify clusters then combination of bootstrap and margin of conservatism to estimate the CCF per cluster - Model 5 | Number weighted average of historical CCF per clusted and addition of fixed margin of conservatism
  • 8. © Global Research & Analytics Dept.| 2020 | All rights reserved 8 Regarding Model 1, the CCF modelling approach relies on the estimation of the parameter based on the historical defaulting contracts. Where the calculus formula of the CCF parameter estimated each month is as follows: 𝐶𝐶𝐶𝐶𝐶𝐶 = max�𝑀𝑀 − 𝑀𝑀|𝑛𝑛; 0� 𝐷𝐷|𝑛𝑛 With max�𝑚𝑚 − 𝑀𝑀|𝑛𝑛; 0� estimated for each current account and for a given month: N : Number of non-defaulting current accounts M : On-balance sheet exposure of non-defaulting current accounts D : overdraft facility of non-defaulting current accounts M|n : On-balance sheet exposure of non-defaulting current accounts that defaulted in the next 12 months D|n : overdraft facility of non-defaulting current accounts that defaulted in the next 12 months n : Number of defaulting current accounts in the next 12 months m : Exposure of defaulting current accounts in the next 12 months Regarding Model 2, in order to compute regulatory capital requirement under internal approach, a Bank must determine the exposure at default (EAD) of each homogeneous risk class. First computed, thanks to a scoring function,is the economic EAD per contract which corresponds to the expected exposure remaining to be paid if the contract has to default in a 12 months horizon. In order to estimate the EAD, a scoring function has been put in place which is a linear regression per cluster. The linear regression is based on the information of EAD observed in the past and relies on the existence of a liner link between this variable to explain and a selection of explanatory variables. The performance of the models obtained is then checked using two indicators. The coefficient of determination (R²); which corresponds to the ratio between the sum of the squared deviation of the estimated EAD and the mean of the observed one. And the mean absolute deviation (MAD); which sensibly differs from the R² indicator, since it corresponds to the average of the absolute deviation between the observed and estimated EAD. Regarding Model 3, it is similar to the previous one in its principle; except that it has been decided to dichotomize the variable to explain (as a modelling assumption) in two modalities and to apply a logistic regression function to model it. The modeled variable is designed as follows:
  • 9. © Global Research & Analytics Dept.| 2020 | All rights reserved 9 • It is equal to 1 if the observed CCF parameter is less than or equal to 1; • 0 either. For the contracts whose overdraft facility is non-null, the modality 1 identifies current accounts that didn’t exceeded it before or when defaulting in a 12 months period. For the contracts whose overdraft facility is null, the modality 1 describes a decreasing trend of the exposure between the observation and the defaulting date. Regarding Model 4, the estimation of the CCF parameter is based on statiscal studies wich are completed by expert judgement elements, in particular when the quantity of available data may impact the robustness and relevance of the statistical model retained. The modelling phase is split in three steps: • Step 1: Estimation of the historical observed CCF (based on the definition of the 26 June 2013 CRR); • Step 2: Clustering analysis whose objective is to define homogeneous clusters of historical CCF and based on risk data. This step creates a current account bucket whose behavior is considered as similar as when close to default; • Step 3: Prudent estimation of the CCF level for each homogeneous cluster, that takes into account margin of conservartism that are regulatory compliant. Step 2 is performed using a segmentation CART (Classification And Regression Tree) decision tree giving homogeneous CCF cluster. This leads to a relevant approach of the risk measurement and allows us to compute an exposure bucket whose average CCF value is compliant whith the article 133.a. of the 26 June 2013 CRR. The regression tree approach consists in sucessively dividing the population in homogeneous cluster of CCF. These clusters are determined thanks to the modalities of the more discriminatory explanatory variable. The process of clustering is repeated until no more explanatory variables can be used. The algorithm also tests all the potential explanatory variables and retains only the optimal one which maximises the following variance criterion: • The variable to explain must have a variance lesser in the subset node than in the parent one; • The variable to explain must have the more distinct average between one subset node to the other. In other words, the regression tree tends, at each step of the algortithm, to minimise the variance intra-cluster and maximise the variance inter-cluster. With this methodology, it is more frequent to use a binary regression tree (the CCF being a continuous variable): it means that each parent node only has a maximum of two sub-set nodes but the size of the tree is limitless. This type of tree does have the ability to quickly identify specific risk profiles. For this benchmark model, the retained settings are the following:
  • 10. © Global Research & Analytics Dept.| 2020 | All rights reserved 10 • A node is split when the Fisher’s statistic 20% threshold is reached (this value is set by default); • The needed number of contracts to split a node is a minimum of 22 (10% of the initial size); • A minimum of 5 contracts is required by node. The choice of the above threshold allows us to obtain a smaller number of final CCF classes but more relevant ones. Indeed, too many clusters may strenghten the instability of the regression tree without offering a significant enhancement of the model performance. Regarding model 5, in this approach the estimated CCF of a given contract at a date T0 corresponds the best estimator of the part of off-balance sheet that will be consumed in the case of default occurring during the 12 months following T0. The computation of the CCF estimators relies on the two following principles: • On one side, it is based on the observed historical CCF of past generations computed by homogeneous cluster of contracts; • On the other side, their measurement is done as a ratio of the T0 off-balance sheet. For a given cluster, the computation of the estimated CCF is as follows : • First, the Bank determines the historical CCF average weighted by the number of contracts per cluster; • Then, a 0% floor is applied on each cluster; • Finally, a third and last step consists in applying a 20% expert adjusting factor to the non-withdrawn overdraft facility of the contracts. Finally, for the contracts of a specific cluster, the EAD at T0 is equal to: 𝐸𝐸𝐸𝐸𝐸𝐸 = 𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂 𝑠𝑠ℎ𝑒𝑒𝑒𝑒𝑒𝑒 𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎 𝑜𝑜𝑜𝑜 𝑡𝑡ℎ𝑒𝑒 𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐 𝑎𝑎𝑎𝑎 𝑇𝑇0 + 𝐶𝐶𝐶𝐶𝐶𝐶 𝑜𝑜𝑜𝑜 𝑡𝑡ℎ𝑒𝑒 𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐 ∗ 𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂𝑂 𝑠𝑠ℎ𝑒𝑒𝑒𝑒𝑒𝑒 𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑎 𝑜𝑜𝑜𝑜 𝑡𝑡ℎ𝑒𝑒 𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐𝑐 𝑎𝑎𝑎𝑎 𝑇𝑇0 By design, the EAD of contracts without the Off-balance sheet amount or with a null amount is thus equal to the observed On-balance sheet exposure. In particular, creditors accounts without an Off-balance sheet amount will have a null EAD. Similarly, debtor accounts which do not have any Off-balance sheet exposure have their EAD equal to the debtor amount at T0.
  • 11. © Global Research & Analytics Dept.| 2020 | All rights reserved 11 5. Modelling the CCF parameter Creation of the modelling database The available data corresponds of current account portfolios from a Tier 1 French Bank. The study conducted in this White Paper focuses on the modelling of a raw CCF parameter, i.e. without considering any prior segmentation of the client database. Thus, it is assumed that all the available current accounts come from a single segment therefore, during the application of the parameter a single value of CCF will be considered. It is important to note that a specific feature of the working database is that it only contains current accounts defaulting in a year. Thus, for each current account, its end of month exposure is gathered over a 12-month horizon before its defaulting date (called tdef). The initial date (the one starting 12 months before tdef) being noted t0. In addition to this feature, it must be remembered that for the considered portfolios almost all the current accounts see their balance getting worse over the period; consequently, the majority of the computed CCF are greater than 1. Finally, in the case of our research studies, the default of a current account is declared when the exposure remains above the overdraft facility for more than 90 days. The first step in modelling of the CCF parameter relies on the exposure forecast of a current account at each Tenor (1 month, 2 months, …, 12 months) until the defaulting date. Thus, it is necessary to extract and work on the data as time series ; where each series corresponds to a studied Tenor. 5.1.1. Features of the modelling portfolio The available portfolio contains 39,932 current accounts defaulting over the period from June 2009 to December 2012. The initial dates of each current account are extracted on a daily basis. Nevertheless, in order to avoid the existence of missing data during the design of the time series, it has been decided to aggregate the information on a monthly basis. This ensured that the portfolio contains at least 1 current account per monthly initial date. The estimation of the CCF parameter, the construction of the database and the several analyses performed are done following a monthly granularity. To summarize, the features of the database are the following : • The data span over 3 years and half from June 2009 to December 2012. • There are 39,932 rows corresponding to each current account • There are 14 columns corresponding to :  t0: the initial date being 12 months before the defaulting date.  𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝑡𝑡0 : it is the current account exposure seen at the initial date t0.  The 12 monthly cashflows worsening the initial exposure. For instance, the column Flow 1, corresponds to the amount to be summed to the 𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝑡𝑡0 in
The EAD of the current account is thus:

EAD = Exposure_{t_0} + \sum_{i=1}^{12} Flow_i

In order to obtain suitable time series, the initial exposure is aggregated and corresponds to the total exposure amount of all the current accounts available for a specific month. Similarly, all the cashflows are summed for each Tenor and divided by the total initial exposure. The result is contained in a database of 43 rows (one for each observed month) and 14 columns (date, total initial exposure and the 12 monthly cashflows). The data used is historical, which means that the information on the EAD and the monthly exposures is available over the whole modelling period (use of a square matrix). As an illustration, the modelling database looks as follows:

Figure 1 - Illustration of the modelling database
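As an illustration of how such a matrix can be built, here is a minimal pandas sketch. The contract-level column names (t0, exposure_t0, flow_1 ... flow_12) are hypothetical; only the aggregation logic follows the description above.

```python
import pandas as pd

# Hypothetical contract-level table: one row per defaulted current account,
# with its initial date t0, initial exposure and 12 monthly worsening flows.
contracts = pd.read_csv("contracts.csv", parse_dates=["t0"])

flow_cols = [f"flow_{i}" for i in range(1, 13)]

# Aggregate initial exposures and flows by monthly initial date.
monthly = (
    contracts
    .assign(month=contracts["t0"].dt.to_period("M"))
    .groupby("month")[["exposure_t0", *flow_cols]]
    .sum()
)

# Each tenor's aggregated flow is expressed as a ratio of the total initial
# exposure, yielding the modelling database (one row per observed month).
monthly[flow_cols] = monthly[flow_cols].div(monthly["exposure_t0"], axis=0)
```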
The figures displayed below allow us to visualize the information from the modelling database. The first figure shows the evolution of the monthly total initial exposure (in k€):

Figure 2 - Monthly evolution of the total EAD - June 2009 to December 2012

In the above figure, it is observed that the initial exposure is quite steady over the period from June 2009 to June 2011, with an average exposure between 500 k€ and 1,500 k€. After June 2011, the total initial exposure keeps growing, up to a maximum value of 3,083 k€. This can be explained by the evolution of the Bank's activity and the size of its total assets, which grew as time went on. At the start of its activity, the Bank was cautious and constrained credit granting in order to contain its risk. Later, the Bank opened more current accounts with fewer limitations, thus increasing the density and probability of observing defaults over the last years.

The second figure, below, shows the monthly evolution of the accounts' cashflows:
Figure 3 - Monthly evolution of the accounts cashflows - Tenor 5M

The evolution of the 5M exposure worsening cashflows shows a similar trend to the one observed in Figure 2, i.e. the evolution seems to be correlated with the Bank's activity and proportional to the growth of its total assets. Displayed as a ratio of the EAD, the cashflow evolution is steadier and ranges between 0.52% and 1.95%. Finally, the last figure, below, shows descriptive statistics of the 5M exposure worsening cashflows:
Figure 4 - Descriptive statistics of worsening cashflows - Tenor 5M

The statistics displayed in the above figure allow us to better understand the data to model. The analysis of this information gives us a first appreciation of the parameters that will be used in the stochastic diffusion model. Those descriptive statistics have been computed for each Tenor to be modelled.

5.1.2. Data quality tests

The data quality tests are realised on the monthly time series corresponding to the worsening aggregated cashflows. The stochastic model that will be developed thereafter relies on the modelling and forecasting of the time series of worsening cashflows of each Tenor (the columns of the matrix displayed in Figure 1), and the model will be applied to the last known observation of total exposure of the non-defaulted population. As such, it is important to use clean and comprehensive data. Several tests are realised in order to assess the data quality and ensure that there is an adequate quantity of data to estimate the CCF parameter. The tests realised on the time series of each Tenor are presented below:

• Check for the presence of missing values: this test consists in checking that the time series have no missing values between the first (June 2009) and last (December 2012) monthly observation dates.
Furthermore, it also ensures the continuity of the time series, i.e. for each current account it is checked that all data is available between the initial and default dates.

• Number of observations and modelling assumptions: the calibration of the CCF model relies on observed data. The stochastic model is used to forecast the last known exposure of non-defaulted current accounts as if they were to default in the next 12 months. As such, it is necessary to ensure the quality of the available data as well as its exhaustiveness. The calibration of the stochastic model relies on the estimation of parameters such as the mean and the standard deviation of the time series, the convergence of the estimators being ensured by the Central Limit Theorem, whose main assumption is the availability of at least 30 observations when estimating a parameter. Thus, several checks are conducted in order to ensure that there is enough data and that the upcoming estimations will be relevant. First, the number of observations is counted and checked to be above 30 per Tenor. Given that the modelling period covers June 2009 to December 2012 on a monthly basis, 42 months of observations are available, which satisfies the completeness criterion. In addition, in order to have continuous time series without missing observations, it has been decided to aggregate the current account information. During the creation of those time series, a representativeness criterion has been defined, ensuring that a minimum of 30 observations was used each month when computing the total exposure and the average cashflows. This test is valid for the 42 months of the modelling database. In practice, a bank having less data, not satisfying the criteria of the previous check, can still reinforce its modelling database with external data as long as it remains relevant. Otherwise, a bootstrap approach can be put in place in order to expand the modelling sample. In addition, a bank can also define a specific margin of conservatism related to data quality in order to cover the uncertainty around the estimation of the CCF parameter.

• Evidencing outliers for each Tenor: before modelling, it is mandatory to ensure that the time series do not show any outliers, which could bias the parameter estimation. If such values are observed, they should be treated. An outlier is defined as a value exceeding the MAD (mean absolute deviation) of its associated Tenor. For a Tenor i, an outlier of the time series Flow_i is defined as follows: if Flow_i(t) - Flow_i(t-1) > MAD[Flow_i], where t is the initial date, then Flow_i(t) is an outlier of the time series Flow_i. The tests performed on the time series show rather low peaks, so that a treatment does not appear necessary. Should a significant outlier be observed, a possible treatment would be to assign it the MAD value. A minimal sketch of this check is given below.
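The sketch below assumes, as stated in the text, that the MAD is the mean absolute deviation around the series mean; the function names are illustrative.

```python
import numpy as np

def mad(series):
    """Mean absolute deviation around the mean, as used in the text
    (not the median-based version)."""
    series = np.asarray(series, dtype=float)
    return np.mean(np.abs(series - series.mean()))

def outlier_indices(flow):
    """Flag Flow_i(t) as an outlier when the month-on-month variation
    Flow_i(t) - Flow_i(t-1) exceeds MAD[Flow_i]."""
    flow = np.asarray(flow, dtype=float)
    jumps = np.diff(flow)
    return np.where(jumps > mad(flow))[0] + 1  # +1: index of the later point
```

A flagged point could then be capped at the MAD value, in line with the treatment suggested above.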
5.1.3. Application and calibration data

At this stage, the data corresponds to exhaustive time series representing the average ratio of worsening cashflows. Nevertheless, when applying the model to non-defaulted exposures in order to calibrate the CCF and estimate their final EAD, the worsening cashflows are not all known. That is why it is necessary to forecast them, based on the historical data. The figure below shows the problem related to the EAD computation when it must be computed on non-defaulted exposures reported at C.O.B. December 2012:

Figure 5 - Matrix of worsening cashflows - Application data - C.O.B. December 2012

The above figure shows the shape of the matrix when the EAD must be estimated on non-defaulted exposures at end of December 2012. The values in the orange cells correspond to observed, historical worsening cashflows, whereas the orange values in the light grey cells correspond to non-observed worsening cashflows that have been forecasted thanks to the stochastic model. Forecasting those values until the 2012 generation gives the simulated worsening cashflows and allows us to compute the final EAD amount for December 2012; in the above figure, it is obtained via the values presented in the black frame. The CCF parameter then corresponds to the ratio between the final EAD and the initial exposure of December 2012. The evolution of the cumulated exposure until the EAD is the following:
Figure 6 - Evolution of the cumulated exposure (EAD)

The following sections introduce the stochastic models that will be used to forecast the level of worsening cashflows for each Tenor. First, the time series designed and studied in the previous sections are analysed in order to ensure that they fit the stochastic model that will be used. Finally, the results of the forecast and the application to an EAD computation are presented.

5.1.4. Modelling principle

Following the definition of the CCF parameter, the Tenors to model correspond to the 12 months following the initial date, or equivalently the 12 months preceding the default date. For each Tenor, there exists a time series of worsening cashflows corresponding to the rates observed by the Bank each month. The purpose of the approach is to forecast the incomplete time series in order to predict the future worsening rates associated to the generation used when computing the final CCF. For each month of the modelling period, the time series of worsening rates are assimilated to forward-rate curves which give, for a specified maturity (or Tenor), the marginal worsening rate. By construction, the granularity of the data is monthly, and in order to model the dynamics of each Tenor, two approaches have been considered:

• A modelling of the current account behaviour via a mean-reverting model using an Ornstein-Uhlenbeck process. The purpose is to calibrate the process on observed data and to forecast the time series thanks to the estimated model parameters, thus giving the final worsening rate necessary to compute the EAD of the non-defaulted exposures.
• The use of a geometric Brownian motion, which does not rely on any stationarity assumption and may be better fitted to our data.

In order to estimate the parameters of each process, Brownian motions corresponding to each Tenor are simulated. Nevertheless, it is not optimal to simulate the Brownian motions in an independent way, as this does not capture the correlation between Tenors. That is why a first step consists in reducing the dimension of the matrix (where each column is a Tenor) via a PCA (Principal Component Analysis), which allows us to capture the correlation structure of the Tenors while simulating a reduced number of Brownian motions.
Another solution would be to simulate 12 correlated Brownian motions thanks to the time series correlation matrix and the extraction of its "square-root matrix" using a Cholesky decomposition; however, this would have been time consuming (in terms of computing time) due to the large number of time series to correlate and forecast. Once the correlated Brownian motions are generated thanks to the rotation matrix deriving from the PCA, the parameters of the Ornstein-Uhlenbeck model are estimated via a linear regression giving a closed formula for each parameter of the model. Regarding the parameters of the GBM, they are empirically obtained from the historical values of the time series by Tenor. The worsening cashflows are then extrapolated thanks to these models, where the starting point corresponds to the last observed data of each time series.

Ornstein Uhlenbeck model (OU)

In this section we present the first alternative methodology introduced by the Global Research & Analytics (GRA) department of Chappuis Halder & Co.

5.2.1. Model's presentation

The Ornstein Uhlenbeck (OU) model is a stochastic model frequently used in market finance to model time series with a mean-reverting process; its discrete form is an AR(1), part of the "Auto Regressive Moving Average" (ARMA) family of processes. The OU process is a mean-reverting process, solution of the following stochastic differential equation:

dX_t = \lambda (\mu - X_t) \, dt + \sigma \, dW_t

Where:
• λ > 0: mean-reversion speed
• µ: mean value
• σ > 0: volatility
• W_t: Wiener process

Applying Itô's lemma to the process f(t, X_t) = X_t e^{\lambda t} gives the following discrete equation:

X_{t+1} = \mu \left(1 - e^{-\lambda}\right) + e^{-\lambda} X_t + \sigma \left( \frac{1 - e^{-2\lambda}}{2\lambda} \right)^{1/2} N(0,1)

This formula shows that the OU model is an AR(1) process.
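As a minimal illustration of this exact discretisation (unit monthly time step; the gaussians argument anticipates the correlated draws built in section 5.2.3, and the function name is illustrative):

```python
import numpy as np

def simulate_ou(x0, lam, mu, sigma, n_steps, gaussians=None, rng=None):
    """Simulate one OU path with the exact AR(1) discretisation above."""
    if gaussians is None:
        rng = rng or np.random.default_rng()
        gaussians = rng.standard_normal(n_steps)
    a = np.exp(-lam)                                           # e^(-lambda)
    vol = sigma * np.sqrt((1 - np.exp(-2 * lam)) / (2 * lam))  # one-step std dev
    path = [x0]
    for eps in gaussians:
        path.append(mu * (1 - a) + a * path[-1] + vol * eps)
    return np.array(path)
```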
5.2.2. Tenor Analysis - Check of the model assumptions

Before implementing the model, it is necessary to check that the Tenors can be considered as OU processes. First, stationarity and normality tests are run on the time series; then, their autocorrelation plots are analysed.

Normality test

The OU model is a Gaussian process; hence, to use it, it must be checked that the Tenors follow a normal distribution. Thus, a QQ-plot is drawn for each Tenor. It is a graphical tool to assess whether a set of data fits a theoretical distribution (here the normal distribution), performed by plotting the quantiles of the two distributions against each other. The plot below shows the QQ-plot of the 9M Tenor:

Figure 7 - QQ-plot of the 9M Tenor

The QQ-plots of the 12 series show satisfactory results, meaning that it can be assumed that the time series follow a normal distribution. In addition, in order to validate those normality results, the Jarque-Bera test has been run on the 12 series, testing the null hypothesis H0: the studied data follows a normal distribution. For the 12 series, the Jarque-Bera test does not reject H0 at a significance level of 5%. This strengthens the hypothesis that the time series follow a normal distribution, and hence the use of a Gaussian process to model them.

Stationarity test

The time series modelling is based on the stationarity hypothesis; to test this hypothesis, an Augmented Dickey-Fuller (ADF) test is conducted on all the Tenors. A minimal sketch of both tests is given below.
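The sketch uses standard scipy and statsmodels tests; the 5% significance level follows the text, and the function name is illustrative.

```python
from scipy.stats import jarque_bera
from statsmodels.tsa.stattools import adfuller

def check_tenor(series, alpha=0.05):
    """Jarque-Bera normality test and ADF stationarity test on one Tenor."""
    jb_stat, jb_pvalue = jarque_bera(series)
    adf_stat, adf_pvalue = adfuller(series)[:2]
    return {
        "normality_not_rejected": jb_pvalue > alpha,  # H0: normal distribution
        "stationary": adf_pvalue < alpha,             # H0 (unit root) rejected
    }
```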
The ADF test results show that only one third of the Tenors are stationary. To obtain stationary time series allowing us to model the processes, a first-order differencing has been applied to the time series. First-order differencing consists in computing the differences between consecutive terms of the time series, i.e. if (X_t)_{t \geq 0} is a time series, then the first-order differentiated series (S_t)_{t \geq 1} verifies:

S_t = X_t - X_{t-1}

Differencing a time series removes the deterministic polynomial trend as well as the periodicity. This transformation helps to obtain a "pure" stochastic process, i.e. without a deterministic component, and increases the odds of having a stationary series. This is verified for 10 out of the 12 time series, for which the differentiated series passes the Augmented Dickey-Fuller (ADF) stationarity test. It would be possible to difference the series a second time to make all 12 series stationary, but this has not been done as it can alter the quality and interpretation of the results. Therefore, it was decided to model and calibrate on the first-order differentiated series.

Autocorrelation study

The study of the autocorrelation functions (ACF) and the partial autocorrelation functions (PACF) allows us to validate the use of an autoregressive process AR(1). The ACF plot of an AR(1) must show a unique significant autocorrelation order, the first one; this justifies the correlation between X_t and X_{t-1} and hence the use of a first-order autoregressive process. Similarly, the PACF must point out the first autocorrelation order only as significant. For example, here are the ACF and PACF plots for the differentiated 8M Tenor:

Figure 8 - ACF and PACF plots - Differentiated 8M Tenor

All the Tenors show similar graphs, with a significant peak at the first order in the two types of plots, which highlights an autocorrelation of order 1. Nonetheless, some Tenors also show peaks at higher orders, which suggests that they could be AR processes of an order p > 1.
It is also noted that the ACF peaks do not decrease: the peaks' heights are irregular. This behaviour is usually associated with the existence of a moving average (MA) term. But since none of those peaks are significant, it is acceptable not to consider them, although this must be kept in mind when interpreting and analysing the results. Overall, the results are satisfactory and it appears consistent to use an AR(1) process to model the differentiated series.

Finally, the retained hypotheses and their validations are summarized below:
- The normality hypothesis is validated, hence the choice of a Gaussian model is relevant.
- To obtain stationarity, which is one of the main hypotheses, a first-order differentiation has been applied to the series. The transformed series have been proven to be stationary and it thus seems relevant to model them as OU processes. Once the forecast of the differentiated series is finalised, it will be possible to retrieve the values of each Tenor. Even though this transformation is not an issue from a theoretical point of view (in terms of calculations and methodology), it may still have an impact on the final predictions' quality when the Tenors are retrieved.
- The order-1 autocorrelation is verified in most cases. Nonetheless, for some Tenors, a higher order and an MA term could be considered.

5.2.3. Implementation and calibration of the model

Construction of correlated Brownian motions

The OU processes are calibrated from the time series of the average worsening cashflows in order to forecast each Tenor. To this end, Brownian motions are simulated to obtain the Wiener processes needed in the OU model. However, to design a model accurate enough and fitting reality, the Brownian motions need to keep and transcribe the correlation structure between the differentiated Tenors. Therefore, it has been decided to build them with a methodology relying on the dimension reduction of the modelling database; this methodology also allows us to decrease the number of simulations. The approach follows the steps detailed below:

- First, a Principal Component Analysis (PCA) is performed on the time series. It highlights the vectors that best explain the initial structure. The number of eigenvectors is chosen based on the elbow criterion and the cumulated variance, which must reach at least 80%. These vectors constitute the rotation matrix. The PCA shows that the first 5 eigenvectors explain approximately 80% of the variance.
- Then, the five vectors are gathered to obtain the rotation matrix of size 12x5.
- The next step is to simulate 12 matrices of size 5xN of independently and identically distributed N(0,1) draws, where N = 1,000 is the number of scenarios of the model. This corresponds to the simulation of 12x5x1,000 = 60,000 standard normal draws.
- Those matrices are then multiplied by the rotation matrix, which gives 12 matrices of size 12x1,000.
- Each matrix is used to predict one point of the 12 differentiated series; since not all the series need to be forecasted by the same number of points, not all the lines of all the matrices are used:
 • For the first matrix, i = 1 to 12, line i is used to prolong the first point of the i-th series;
 • For the second matrix, i = 2 to 12, line i is used to prolong the second point of the i-th series;
 • And so on, the last matrix serving only to prolong the last point of the 12th series.

The above steps are described in the figure below, and a sketch of the construction follows:

Figure 9 - Explicative figure showing the construction process of correlated Brownian motions
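A minimal numpy sketch of this construction is given below. The paper only states that the independent draws are "multiplied by the rotation matrix"; scaling the kept eigenvectors by the square root of their eigenvalues, so that the rotated draws approximately reproduce the Tenors' correlation structure, is an assumption of this sketch.

```python
import numpy as np

def pca_rotation(diff_tenors, var_threshold=0.80):
    """Rotation matrix from the differentiated Tenor series (one column per
    Tenor): eigen-decompose the correlation matrix and keep enough
    eigenvectors to explain ~80% of the variance (5 in the study)."""
    corr = np.corrcoef(diff_tenors, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]                    # decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    kept = np.searchsorted(np.cumsum(eigvals) / eigvals.sum(), var_threshold) + 1
    return eigvecs[:, :kept] * np.sqrt(eigvals[:kept])   # assumed scaling

def correlated_draws(rotation, n_scenarios=1000, rng=None):
    """One 12 x n_scenarios matrix of correlated gaussian draws, obtained by
    rotating independent N(0,1) simulations; repeated 12 times to prolong
    each differentiated series, as in the steps above."""
    rng = rng or np.random.default_rng()
    iid = rng.standard_normal((rotation.shape[1], n_scenarios))  # 5 x N draws
    return rotation @ iid                                        # 12 x N draws
```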
Model calibration

Once the calculation and simulation of the Brownian motions is set, the best calibration parameters need to be found for the model of each Tenor. As presented above, the discrete form of the OU process corresponds to the equation of the linear regression of X_{t+1} on X_t:

X_{t+1} = a + b X_t + \text{gaussian noise}

where a and b are easily related to λ and µ by identification. This regression, solved by the maximum likelihood method, gives the following estimators of those parameters:

\mu = \frac{S_y S_{xx} - S_x S_{xy}}{n \left(S_{xx} - S_{xy}\right) - \left(S_x^2 - S_x S_y\right)}

\lambda = -\ln \left( \frac{S_{xy} - \mu \left(S_x + S_y\right) + n \mu^2}{S_{xx} - 2 \mu S_x + n \mu^2} \right)

\sigma^2 = \frac{2 \lambda}{n \left(1 - \alpha^2\right)} \left[ S_{yy} - 2 \alpha S_{xy} + \alpha^2 S_{xx} - 2 \mu (1 - \alpha) \left(S_y - \alpha S_x\right) + n \mu^2 (1 - \alpha)^2 \right]

where \alpha = e^{-\lambda}, S_x = \sum_{i=1}^{n} S_{i-1}, S_y = \sum_{i=1}^{n} S_i, S_{xx} = \sum_{i=1}^{n} S_{i-1}^2, S_{yy} = \sum_{i=1}^{n} S_i^2, S_{xy} = \sum_{i=1}^{n} S_{i-1} S_i, n is the number of observed data points and (S_i) a differentiated Tenor.

The model parameters are estimated for each differentiated series; then, thanks to the matrices of simulated Brownian motions, it is possible to forecast the 12 differentiated series. It has been decided to retain 1,000 scenarios for each model in order to optimize the computation time. A minimal implementation of these closed-form estimators is sketched below.
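The sketch follows the least-squares closed forms of van den Berg [2011], cited in the references, with a unit time step; the formulas above were reconstructed against that source, so the code mirrors that reading.

```python
import numpy as np

def calibrate_ou(s):
    """Closed-form calibration of (mu, lambda, sigma) on one differentiated
    Tenor series s, using the sums S_x, S_y, S_xx, S_yy, S_xy defined above."""
    s = np.asarray(s, dtype=float)
    x, y = s[:-1], s[1:]                    # (S_{i-1}, S_i) pairs
    n = len(x)
    sx, sy = x.sum(), y.sum()
    sxx, syy, sxy = (x * x).sum(), (y * y).sum(), (x * y).sum()

    mu = (sy * sxx - sx * sxy) / (n * (sxx - sxy) - (sx**2 - sx * sy))
    lam = -np.log((sxy - mu * (sx + sy) + n * mu**2)
                  / (sxx - 2 * mu * sx + n * mu**2))
    a = np.exp(-lam)
    sigma2 = (2 * lam / (n * (1 - a**2))) * (
        syy - 2 * a * sxy + a**2 * sxx
        - 2 * mu * (1 - a) * (sy - a * sx)
        + n * mu**2 * (1 - a)**2
    )
    return mu, lam, np.sqrt(sigma2)
```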
5.2.4. Results

To judge the forecasting quality and accuracy of the models, it must first be recalled that the models were calibrated on a portion of the historical dataset (from June 2009 to February 2012 for the 9M Tenor, for instance). Then, an out-of-sample validation test has been conducted on the part of the dataset which was not used for calibration (March 2012 to December 2012 for the 9M Tenor) in order to assess the model's accuracy. Indeed, a CCF model does not only need to predict the most conservative EAD, it also needs to be robust across the estimations; it is therefore necessary to check that all the calculation methods used are robust over time. To ensure this, a cross-validation method has been implemented: the out-of-sample validation. It relies on the estimation of the model's parameters on a part of the dataset and their validation on another part not used during modelling. The aim is to split the data by removing the most recent observations, calibrate the model's parameters, and estimate the worsening cashflows; the model's predictions are then compared to observations that were not used for calibration.

The results are displayed in a single figure (see below) presenting the observed differentiated time series, the average predictions over the 1,000 simulations, and the 10% and 90% quantiles associated to the predictions. Then, in a second plot, the initial time series is presented after proceeding to an inverse differentiation, which gives the final representation of the estimated worsening cashflows. To present and discuss the results, a focus was undertaken on the 9M and 10M Tenors:

Figure 10 - 9M Differentiated Tenor - worsening cashflow - Estimated vs. Observed
Figure 11 - 9M Tenor - worsening cashflow - Estimated vs. Observed
Figure 12 - 10M Differentiated Tenor - worsening cashflow - Estimated vs. Observed
Figure 13 - 10M Tenor - worsening cashflow - Estimated vs. Observed

For the studied Tenors, the analysis of Figure 10 and Figure 12 shows that the Ornstein Uhlenbeck model captures the evolution of the series over the out-of-sample validation period. Indeed, the simulation envelope represented by the 10% and 90% percentiles covers almost all the observed values.
As illustrated in the above figures, some excesses appear on the 12 differentiated series; they may correspond to extreme variations related to a Bank economic event which was not predictable. Overall, the model correctly predicts the evolution of the differentiated Tenors. Nonetheless, once the inverse differentiation is applied to retrieve the final predicted worsening cashflows, more excesses of the simulation envelope can be observed in Figure 11 and Figure 13. In addition, for all the series, the average predictions lie below the observed points, which means that the model does not seem able to capture the drift existing in the recent evolution of the time series, a drift which appears to increase over time. Consequently, one may wonder whether a mean-reverting process is adequate to fit the data. Even though the predictions on the differentiated series were satisfactory, it appears that the final series are not accurate enough to be exploitable. The principal reasons appear to be the following:
• The inverse differentiation operation makes the forecasts less accurate. This questions the use of a model which needs a stationarity hypothesis.
• The predicted series did not consider the drift observed during the last years. This issue can be reduced by introducing a moving average term in the model; during the analysis of the model assumptions, it was already observed that such a term could be considered.

Geometric Brownian Motion model (GBM)

The following sections present the second alternative modelling method introduced by the Global Research & Analytics (GRA) department of Chappuis Halder & Co.

5.3.1. Model presentation

Following the observations made during the analysis of the OU model and its results (presented in section 5.2), it was decided to test another stochastic model which does not require a stationarity hypothesis and contains a drift term. The geometric Brownian motion (GBM) model meets those two conditions. It is a standard process used in modelling and mathematical forecasting, particularly in finance where it is used to model the evolution of stock prices. Furthermore, its implementation cost remains relatively low. The GBM is a positive process, solution of the stochastic differential equation:

dX_t = \mu X_t \, dt + \sigma X_t \, dW_t

Where:
• µ: drift
• σ: volatility
• W_t: Wiener process

Applying Itô's lemma to the process gives the following discrete equation (with a unit monthly time step), which will be the formula used for the model implementation:

X_{t+1} = X_t \, e^{\left(\mu - \frac{\sigma^2}{2}\right) + \sigma N(0,1)}
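A minimal sketch of the GBM forecast with this discrete form is given below. The empirical calibration anticipates section 5.3.3, correlated draws from the PCA construction of section 5.2.3 can be passed through the gaussians argument, and the function names are illustrative.

```python
import numpy as np

def calibrate_gbm(series):
    """Empirical estimators of section 5.3.3: sigma^2 is the variance of the
    log-increments, mu their mean plus sigma^2 / 2.
    Assumes a strictly positive series (the log must be defined)."""
    log_inc = np.diff(np.log(np.asarray(series, dtype=float)))
    sigma2 = log_inc.var()
    return log_inc.mean() + sigma2 / 2, np.sqrt(sigma2)

def simulate_gbm(x0, mu, sigma, n_steps, gaussians=None, rng=None):
    """Simulate one GBM path via the discrete form above (unit monthly step)."""
    if gaussians is None:
        rng = rng or np.random.default_rng()
        gaussians = rng.standard_normal(n_steps)
    log_inc = (mu - sigma**2 / 2) + sigma * np.asarray(gaussians)
    return x0 * np.exp(np.concatenate(([0.0], np.cumsum(log_inc))))
```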
5.3.2. Tenor Analysis - Check of the model assumptions

Normality tests

The use of a GBM and its discrete form implies that:

\forall t \geq 0, \quad \log(X_{t+1}) - \log(X_t) \sim N\left(\mu - \frac{\sigma^2}{2}, \, \sigma^2\right)

It is therefore necessary to check that the increments of the log-series are Gaussian. As seen previously, this verification is done graphically via QQ-plots. An example is shown below:

Figure 14 - QQ plot - Log of the increments of the 12M Tenor

This analysis has been done for all the Tenors and has shown satisfactory results; the normality hypothesis is thus verified for all the series.

Independence test

The use of the GBM requires the increments of the (X_t) series to be independent for all the Tenors. This hypothesis is verified through the study of the ACF and PACF plots (introduced in the section on the OU model), the aim being to show that there is no correlation between the observations from one date to another. For example, here are the ACF and PACF plots for the 10M Tenor:

Figure 15 - ACF & PACF - 10M Tenor

It can be observed in the above figure that the hypothesis is acceptable, as the correlations are low. Nonetheless, the levels of correlation are not completely negligible; hence the independence hypothesis needs to be considered with caution. This analysis has been done for all the Tenors and all the ACF & PACF plots are similar; the conclusions presented for the 10M Tenor can be extended to all the time series.

5.3.3. Implementing and calibrating the model

Once the parameters are calibrated, the implementation is done recursively. The model calibration is based on the history of the time series and deduced empirically. The estimators of the two model parameters are:
• σ² = empirical variance of the increments of the log-series;
• µ = empirical mean of the increments of the log-series + σ²/2.

5.3.4. Results

As mentioned during the presentation of the GBM model, it has been decided to test this model since it does not need a stationarity hypothesis and contains a drift term, allowing us to deal with the Ornstein Uhlenbeck model weaknesses observed in the previous section. Presented below in a single graph (for the 9M and 10M Tenors) are the observed time series, the forecasted average of the 1,000 scenarios, and the 10% and 90% quantiles associated to the simulations.
Figure 16 - 9M Tenor - Worsening cashflows - Estimated vs. Observed
Figure 17 - 10M Tenor - Worsening cashflows - Estimated vs. Observed

Figure 16 and Figure 17 show more satisfactory results than those obtained with the Ornstein Uhlenbeck model: taking the drift term into account allows the average predictions to be closer to the observations (residuals are minimised), and the observations fit within the quantiles' envelope (no excess observed) regardless of the estimated Tenor. The fact that the quantiles include the observations is a non-negligible argument for selecting the GBM model, as considering the predicted 90% quantiles will be conservative enough overall. Nonetheless, it is noted that the simulation envelope is relatively broad, with situations where the 90% quantile values are twice as big as the observations. The size of the envelope being directly related to the volatility σ of the model, adjustments to its calibration could be considered; the caution mentioned when looking at the independence hypothesis could be one of the reasons behind this lack of accuracy. In addition, the main driver behind the introduction of this model was to take a drift term into account in the modelling: both figures show that the evolution of the average prediction as a linear function indicates that the trend is captured adequately. Even though the size of the envelope might be problematic during the modelling of the CCF parameter and the presentation of the results, the ability of the model to capture the drift makes its use relevant. This model can be improved but seems more appropriate than the previously tested model (OU); it has thus been selected for the final modelling of the CCF parameter.
6. Application of the methodology to model the CCF parameter

Remember that the purpose of this paper is to predict the CCF parameter by forecasting the worsening cashflows. At the date of the calculation of the non-defaulted exposures (December 2012 in the examples below), thanks to the retained stochastic diffusion model, it is possible to determine the evolution of the non-defaulted exposure over a one-year horizon; this is the last line of the matrix presented in Figure 5. The aim is to retrieve the last predicted point for each Tenor; those points represent the evolution of the cashflows of the EAD from December 2012 onwards and are obtained with the GBM model. The plot below shows the monthly evolution of the exposure of non-defaulted current accounts over a one-year horizon (forecasts vs. observations):

Figure 18 - Evolution of the bank exposure cashflow over a one-year horizon - December 2012

In order to assess the model's accuracy, the prediction error is calculated as the ratio between the residuals and the estimated values:

\text{Prediction Error of a point } i = \frac{\left| \text{Observations}(i) - \text{Estimations}(i) \right|}{\text{Estimations}(i)}

\text{Mean Prediction Error} = \text{mean of the prediction errors weighted by the realised values}

An average prediction error of 16.7% is found; the precision of the model is therefore considered satisfactory. The major advantage of the proposed model is the ability to get the exposure information month by month. In particular, the CCF can be considered as the average exposure over the 12 months divided by the initial exposure. The average is taken since there is no reason that the current account will default on the 12th month exactly; the default can occur at any time over the year.
Furthermore, in the calculation of the RWA or the EL, the probability of default used corresponds to the probability that the default occurs over the year, not exactly at the end of the year. The proposed formula of our final CCF parameter is:

CCF_{predicted} = \frac{\text{Average Exposure over } [t_0; \, t_0 + 12 \text{ months}]}{\text{Exposure}(t_0)}

During this research study, the database used for the modelling only considered clients whose exposures deteriorate over time, hence the EAD curve is increasing. Nonetheless, in the event of positive cashflows being observed, the evolution of the exposure would not necessarily be monotonic. Thus, to remain as conservative as possible, it has been decided to put in place an "Effectiveness" computation in order to consider the Effective Exposure of the current account. It is computed as follows:

\text{EffectiveExposure}(t) = \max\left(\text{Exposure}(t), \, \text{EffectiveExposure}(t-1)\right)

As an illustration of this principle, the simulated exposure corresponds to the solid curve in the figure below, whereas the Effective Exposure, used to calculate the final average exposure, corresponds to the dashed curve:

Figure 19 - Illustration of the Effective Exposure

Going back to the initial exposure forecast, on the basis of the data appearing in Figure 18 and the known exposure value of December 2012, it is now possible to plot the evolution of the EAD by summing the cumulated cashflows. The predicted results compared to the observed ones are presented in Figure 20.
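A minimal sketch of the Effectiveness computation and the resulting CCF is given below. It assumes exposure_path holds the 13 monthly exposure levels from t0 to t0 + 12 months; whether t0 itself enters the average is a convention choice, and this sketch reads the formula above as including the full window.

```python
import numpy as np

def predicted_ccf(exposure_path):
    """Effective exposure = running maximum of the simulated exposure, which
    keeps the estimate conservative when positive cashflows make the path
    non-monotonic; CCF = average effective exposure / initial exposure."""
    exposure_path = np.asarray(exposure_path, dtype=float)
    effective = np.maximum.accumulate(exposure_path)
    return effective.mean() / exposure_path[0]
```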
Figure 20 - Evolution of the exposure over a one-year horizon - December 2012 (monthly EAD starting from 12/2012, observations vs. predictions)

In addition to Figure 18, Figure 20 also shows the good fit of the model and its ability to predict the exposure at each date over a one-year horizon. Usually, the CCF would be calculated by dividing the last point by the first point observed in Figure 20:

CCF_{Observed} = \frac{\text{Exposure}(t_0 + 12 \text{ months})}{\text{Exposure}(t_0)}

Which would give:

CCF_{obs} = 1.1854

While the CCF value based on the stochastic diffusion model of the exposure, using the average exposure, is:

CCF_{Average\ pred} = 1.1772

Finally, the average prediction of the CCF is relatively close to the observations, since the relative error between the two values is equal to 0.69%:

\left| 1 - \frac{CCF_{Average\ pred}}{CCF_{obs}} \right| = 0.69\%
7. Conclusion

This white paper presented an alternative modelling methodology for the CCF parameter, developed by the Global Research & Analytics (GRA) team of Chappuis Halder & Co, based on the stochastic modelling of the worsening cashflows of the exposures. After presenting and testing several methods, one has been retained: the one giving the most accurate estimation of the CCF parameter while bringing a significant addition of information in comparison with the usual statistical models, as it captures the monthly evolution of the EAD.

Our study was conducted on time series, hence the initial choice of the Ornstein Uhlenbeck (OU) model, a mean-reverting stochastic process frequently used in market finance. After calibrating the model, and in order to assess its accuracy, a cross-validation exercise was carried out on data that had not been used during the calibration step. Indeed, an appropriate forecasting model of the CCF should not only predict the most conservative final EAD level but also remain robust across estimations. In the end, the results, even though satisfactory, were not usable: the differentiation made them less accurate and did not account for the drift observed in our portfolio over the last years.

To address those issues, a second stochastic model was considered, whose most notable differences are that it does not need a stationarity hypothesis and introduces a drift term: the Geometric Brownian Motion (GBM) model. Although the size of the estimation envelope could be problematic during the final CCF modelling, the results were better than those obtained with the OU model as the drift was now considered. This model can be improved but seems more relevant than the Ornstein Uhlenbeck model, and it was hence chosen for the final modelling of the CCF parameter.

In the last part, the CCF parameter has been modelled thanks to the forecast of the worsening cashflows; the main advantage of this methodology is that it retrieves the exposure month by month. Finally, the observed CCF has been computed and compared to the CCF predicted by the GBM model. To conclude, the average prediction of the CCF is very close to the observed one, with a relative error of 0.69%.
8. References

- Basel Committee on Banking Supervision [2016]
- Benchmarking regression algorithms for loss given default modelling, Gert Loterman, Iain Brown, David Martens, Christophe Mues and Bart Baesens [2012]
- Using a transactor/revolver scorecard to make credit and pricing decisions, So, M.C., Thomas, L.C., Seow, H-V and Mues [2014]
- Estimating EAD for retail exposures for Basel II purposes, Valvonis [2008]
- An empirical study of exposure at default, Jacobs [2010]
- Modelling exposure at default, credit conversion factors and the Basel II accord, Taplin, Minh To & Hee [2007]
- Modelling exposure at default and loss given default: Empirical approaches and technical implementation, Yang & Tkachenko [2012]
- Loss given default models incorporating macro-economic variables for credit cards, Bellotti & Crook [2012]
- Regression model development for credit card exposure at default (EAD) using SAS/STAT and SAS Enterprise Miner TM 5.3, Brown [2011]
- Exposure at default of unsecured credit cards, Qi [2009]
- A new mixture model for the estimation of credit card exposure at default, Leow & Crook [2015]
- Modelling credit risk of portfolio of consumer loans, Malik & Thomas [2010]
- Apprentissage Statistique : modélisation, prévision et data mining, P. Besse & B. Laurent
- Calibrating the Ornstein-Uhlenbeck (Vasicek) model, T. van den Berg [2011]
- Review of Statistical Arbitrage, Cointegration and Multivariate Ornstein-Uhlenbeck - Chapter 1, A. Meucci [2010]
- Consultative document - Fundamental Review of the Trading Book: A revised market risk framework, Basel Committee on Banking Supervision, January 2014
Table of Figures

Figure 1 - Illustration of the modelling database
Figure 2 - Monthly evolution of the total EAD - June 2009 to December 2012
Figure 3 - Monthly evolution of the accounts cashflows - Tenor 5M
Figure 4 - Descriptive statistics of worsening cashflows - Tenor 5M
Figure 5 - Matrix of worsening cashflows - Application data - C.O.B. December 2012
Figure 6 - Evolution of the cumulated exposure (EAD)
Figure 7 - QQ-plot of the 9M Tenor
Figure 8 - ACF and PACF plots - Differentiated 8M Tenor
Figure 9 - Explicative figure showing the construction process of correlated Brownian motions
Figure 10 - 9M Differentiated Tenor - worsening cashflow - Estimated vs. Observed
Figure 11 - 9M Tenor - worsening cashflow - Estimated vs. Observed
Figure 12 - 10M Differentiated Tenor - worsening cashflow - Estimated vs. Observed
Figure 13 - 10M Tenor - worsening cashflow - Estimated vs. Observed
Figure 14 - QQ plot - Log of the increments of the 12M Tenor
Figure 15 - ACF & PACF - 10M Tenor
Figure 16 - 9M Tenor - Worsening cashflows - Estimated vs. Observed
Figure 17 - 10M Tenor - Worsening cashflows - Estimated vs. Observed
Figure 18 - Evolution of the bank exposure cashflow over a one-year horizon - December 2012
Figure 19 - Illustration of the Effective Exposure
Figure 20 - Evolution of the exposure over a one-year horizon - December 2012