1. Financial institutions need to construct proxy CDS rates for counterparties lacking liquid CDS quotes; these are required for CVA pricing, the CVA risk charge calculation, etc.
2. Existing CDS Proxy Methods do not meet regulatory requirements and are vulnerable to arbitrage.
3. After investigating the 8 most popular Machine Learning algorithms, we show that Machine Learning techniques can be used to construct reliable CDS proxies that meet regulatory requirements while being free from the above problems.
4. Feature variable selection can be critical for the performance of CDS-proxy construction methods.
5. The effects of feature variable correlations on classification performance have to be investigated for financial data.
Credit Default Swap (CDS) Rate Construction by Machine Learning Techniques
Electronic copy available at: https://ssrn.com/abstract=3142258
CDS Rate Construction Methods
by Machine Learning Techniques
Zhongmin Luo
7-Mar-2018 * **
Independent Consultant and Researcher
Department of Economics, Mathematics and Statistics,
Birkbeck, University of London
** A presentation at Risk's Quant Summit Europe Conference on 7 March 2018 in London, UK. Disclaimer: thanks for feedback from participants; the views and
opinions expressed in the presentation are those of the authors and do not necessarily reflect those of the above affiliated institutions.
* 2018 Call for Papers Winner: based on a joint work with Raymond Brummelhuis, University of Reims, on the paper titled
CDS Rate Construction Methods by Machine Learning Techniques. Available at SSRN: https://ssrn.com/abstract=2967184
Agenda
1. Motivations
2. Machine Learning Techniques based CDS Proxy Method
1) The Method
2) Top 3 classifiers: Neural Network, Support Vector Machine,
Ensemble/Bagged Trees
3) Empirical Results (Summary, Parameter Tuning, Regularization and
Correlation impacts)
3. Conclusions
4. Reference and Q&A
Question?
Besides all belonging to the European Region/Banking Sector, what else did they have
in common on 15 Sep 2008?
Lehman EU, Fortis, AIB, Northern Rock
Commerzbank, Credit Suisse, Macquarie UK, Wachovia EU, Standard Chartered, UniCredit
Motivations
• After the 2007-09 Financial Crisis, financial institutions have had to answer two questions:
1. How much Value Adjustment (CVA/XVA) is needed for the derivative book's MtM to
reflect counterparty default risk?
2. How much capital do banks need to hold against the volatility of CVA?
• One core input is the risk-neutral Counterparty Default Probability PD(s, t) used to calculate
CVA or CVA Risk Capital [Basel 4 (2017)][Pykhtin and Zhu (2007)][Brigo et al (2013)]:
CVA(t) = (1 − R) ∫_t^T E[ (B_0 / B_s) · E(s) ] dPD(t, s)
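The CVA integral above is typically evaluated on a discrete time grid. A minimal sketch (illustrative numbers only; `recovery`, `df`, `ee` and `d_pd` are assumed inputs: the recovery rate R, discount factors B_0/B_t, expected exposures E(t) and marginal default probabilities per period):

```python
def cva(recovery, df, ee, d_pd):
    """Discretized CVA: (1 - R) * sum_k df[k] * EE[k] * dPD[k]."""
    return (1.0 - recovery) * sum(d * e * p for d, e, p in zip(df, ee, d_pd))

# Hypothetical 4-period grid
print(cva(recovery=0.4,
          df=[0.99, 0.97, 0.95, 0.93],         # B_0 / B_t per period
          ee=[1.2, 1.5, 1.4, 1.0],             # expected positive exposure
          d_pd=[0.010, 0.012, 0.011, 0.009]))  # PD(t_{k-1}, t_k)
```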
• Financial regulators and accounting standard bodies require us to derive the risk-neutral
PD(t, s) from the counterparty's liquid CDS quotes if available; otherwise, a so-called CDS
Proxy Method has to be applied.
1.1 A Shortage of Liquidity Problem
• Shortage-of-liquidity problem: in reality, the vast majority of FIs' counterparties don't have
liquid CDS quotes; thus, a CDS Proxy Method has to be used to construct proxy CDS rates.
[Figure: A Typical European Bank Counterparty Distribution by Regions/Sectors (Overall: 84.4%; EBA Survey: >75%) — bar chart of the % of counterparties in each region/sector bucket, split into Observables vs Nonobservables; nonobservable shares range from 63% to 99% per bucket.]
1.2 Regulatory Criteria and Two Existing CDS Proxy Methods
Two Existing CDS Proxy Methods (both violate criterion #3)
1. Credit Curve Mapping: proxies CDS rates by the mean/median of CDS rates within a
Region/Sector/Credit Quality (Rating) bucket.
2. Cross-sectional Regression: explains a term-specific CDS rate for counterparty i (denoted
by S_i) by its responses (β) to whether the counterparty belongs to a region (r),
sector (s), rating (q) or seniority (sen), indicated by the respective indicator functions I,
estimated by running a cross-sectional regression for each CDS term.
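An illustrative form of such a regression (the notation here is a reconstruction, as the slide's equation did not survive extraction; the exact specification is in Brummelhuis and Luo (2017)):

```latex
\log S_i \;=\; \sum_{r} \beta_r\, I_r(i) \;+\; \sum_{s} \beta_s\, I_s(i)
\;+\; \sum_{q} \beta_q\, I_q(i) \;+\; \sum_{sen} \beta_{sen}\, I_{sen}(i) \;+\; \varepsilon_i
```

where, e.g., I_r(i) = 1 if counterparty i belongs to region r and 0 otherwise.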
Regulatory Criteria [Basel 2015, 2017 and EBA 2015]
1. The CDS Proxy Method has to be based on an algorithm that discriminates at least 3 types
of variables: Region, Sector and Credit Quality (e.g., rating or PDs).
2. Both the observable and the nonobservable counterparties come from the same peer
group defined by the above 3 variables.
3. The appropriateness of a Proxy CDS Spread should be determined by its CDS spread
volatility across the constituents within the bucket and not by its level; i.e., any CDS Proxy
Method should reflect the idiosyncratic component of a counterparty's default risk.
1.3 Research Gaps and Research Objectives
Two Types of Research Gaps
1. As a CDS Curve Proxy Method:
• Credit Curve Mapping: fails criterion #3.
• Cross-sectional Regression: fails criterion #3; it can also introduce arbitrage into CDS
curves [Brummelhuis and Luo (2018)].
• Bond spreads include significant liquidity premiums [Longstaff et al, 2005] and are therefore
not a good choice for CDS proxies.
• Rating-implied PDs: PDs implied from (Credit/IRB) ratings are real-world, not risk-neutral, and hence unsuitable for CVA.
2. As a Classifier Performance Comparison study based on financial market
data:
• Existing classifier performance comparison studies [Delgado et al (2014), King et al (1995)]
are based on non-financial market data;
• our study is a cross-classifier performance comparison on financial market
data.
Research Objectives: fill the gaps identified above with Machine Learning
(ML)-based techniques.
1.4 Criteria for a Sound CDS Proxy Method
• Meet regulatory requirements (Region/Sector/Credit Quality) while
accounting for the idiosyncratic part of counterparty default risk based on
liquid CDS quotes; i.e., avoid …
• No model-induced CDS Curve Arbitrage [Brummelhuis and Luo 2018].
• Training and cross-validation based on established statistical principles.
2.1 First Attempt
Extended the cross-sectional regression by including PD(0, t).
• LEFT: but the regression line doesn't fit the data well due to nonlinearity/outliers (a 1.83% rate is estimated as 4.98%).
• RIGHT: we cannot use real-world PDs directly for pricing; risk-neutral PDs > real-world ones (explained next)!
2.2 What is the Real-world PD?
But we have other information about illiquid counterparties, e.g.:
1. Public firms: we have equity prices and the firms' balance sheet information.
2. Firms with equity option prices: we have implied vols that are
explanatory for PD(0, t) [Berndt et al, 2005].
From #1, based on the first-passage-time structural model [Black and Cox 1976], we can get the
real-world PD(0, t) after an empirical transformation from the risk-neutral
Distance to Default (DD), which is a common practice followed by
Bloomberg™ [BBG, 2014] and Moody's KMV™ [MoodysKMV].
Equity + Balance Sheet Data → [Structural Model] → Risk-neutral DD → [Empirical Transformation] → Real-world PDs
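The distance-to-default step can be sketched as follows. This is a minimal Merton-style illustration (not the Black-Cox first-passage formula used in practice); the asset value, drift and volatility inputs are assumed given, and all numbers are hypothetical:

```python
import math

def distance_to_default(asset_value, debt, mu, sigma, horizon):
    """Merton-style distance to default over `horizon` years."""
    return ((math.log(asset_value / debt) + (mu - 0.5 * sigma**2) * horizon)
            / (sigma * math.sqrt(horizon)))

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# PD(0, t) = Phi(-DD) under the model; vendors then map DD to
# real-world PDs via a proprietary empirical transformation.
dd = distance_to_default(asset_value=120.0, debt=100.0, mu=0.05, sigma=0.25, horizon=1.0)
print(dd, norm_cdf(-dd))
```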
2.3 The ML-Technique based CDS Proxy Method
Given a Training Set D^n with y_i as the class label and x_i as the feature vector,
ML techniques construct a mapping F_θ(x); θ is learned from D^n
based on an algorithm called a Classifier Family.
A Classifier Family with a parameterization choice is called a Classifier.
In this paper, we studied 8 Classifier Families and presented 156
Classifiers. ML also has a large number of regression techniques; one
application in finance is [Brummelhuis and Luo, 2018].
2.4 List of Eight Classifier Families and 156 Classifiers
1. Neural Network (NN): e.g., activation functions, # of hidden units.
2. Support Vector Machine (SVM): e.g., kernel functions.
3. Ensemble Bagged Tree (BT): e.g., # of learning cycles.
4. Discriminant Analysis (DA): e.g., linear/quadratic; regularization.
5. Naïve Bayes (NB): e.g., kernel choices; bandwidth selections.
6. k Nearest Neighbours (kNN): e.g., distance metrics; k in kNN.
7. Logistic Regression (LR).
8. Decision Tree (DT): e.g., impurity measure choices; tree sizes.
2.10 SVM for Linearly Separable Data
Maximizing the margin
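The optimization behind this slide can be written as follows (the standard hard-margin SVM primal, reconstructed here since the slide's formula did not survive extraction):

```latex
\min_{\beta,\,\beta_0}\ \tfrac{1}{2}\lVert \beta \rVert^2
\quad \text{subject to} \quad
y_i\,(\beta^{T} x_i + \beta_0) \ge 1, \quad i = 1, \dots, n
```

The margin between the two supporting hyperplanes is 2/‖β‖, so minimizing ‖β‖ maximizes the margin.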
2.11 SVM for Nonlinearly Separable Data
• For non-linearly separable data, transform it into a linearly separable one
first; by limiting ourselves to β = Σ_i α_i y_i x_i, the previous optimization problem
becomes the dual form shown on the Left.
• Then, one can replace the inner product x_i^T x_j with a kernel function K(x_i, x_j);
this substitution is the so-called "kernel trick", as indicated on the Right.
• Kernel choices: linear kernel; polynomial kernel; Gaussian kernel.
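The three kernels mentioned can be sketched in a few lines (a minimal illustration; the degree, offset and bandwidth values are arbitrary choices, not the paper's tuned parameters):

```python
import math

def linear_kernel(x, z):
    """Plain inner product x^T z."""
    return sum(a * b for a, b in zip(x, z))

def poly_kernel(x, z, degree=2, c=1.0):
    """Polynomial kernel (x^T z + c)^degree."""
    return (linear_kernel(x, z) + c) ** degree

def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel exp(-||x - z||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma**2))

# The "kernel trick": each K(x_i, x_j) replaces the inner product x_i^T x_j
x, z = [1.0, 2.0], [2.0, 0.5]
print(linear_kernel(x, z), poly_kernel(x, z), gaussian_kernel(x, z))
```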
2.13 Bagged Tree / Ensemble
• An ensemble is based on a committee of learning algorithms; e.g., Bagged Trees
are based on bootstrapping.
• The learning outcome is determined by a majority-vote rule over a sequence
of Decision Tree classification results.
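The two ingredients, bootstrap resampling and majority voting, can be sketched as follows (a toy illustration; the base learners themselves, e.g. decision trees, are left abstract):

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Sample len(data) points with replacement (the 'bagging' resample)."""
    return [rng.choice(data) for _ in data]

def majority_vote(predictions):
    """Aggregate one label per base classifier into the ensemble decision."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)
data = [(0.1, "A"), (0.4, "A"), (0.6, "B"), (0.9, "B")]
print(bootstrap_sample(data, rng))          # one bootstrap training set
print(majority_vote(["A", "B", "A", "A"]))  # ensemble decision from 4 votes
```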
2.14 Model Assessments: K-fold Cross Validation
• First, we split the observable data D^O into K folds, typically of equal size:
D^O = D^(1) ∪ D^(2) ∪ … ∪ D^(K)
• Second, for k = 1, 2, …, K, define the holdout sample D^H(k) = D^(k) and define
the k-th Training Set by
D^T(k) = D^O \ D^H(k)
• Third, for k = 1, 2, …, K, we apply the classifier trained on the Training Set
D^T(k) to estimate y for each (x, y) in D^H(k) and calculate the expected
misclassification rate as:
err^H(k) = (1 / #D^H(k)) · Σ_{(x, y) ∈ D^H(k)} I(y ≠ ŷ(x))
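The three steps above can be sketched directly (a minimal illustration; the hypothetical `one_nn` base learner and the toy data stand in for the paper's trained classifiers):

```python
def kfold_misclassification(data, k, train_and_classify):
    """Average holdout misclassification rate over K folds.

    data               -- list of (x, y) pairs, the observable set D^O
    train_and_classify -- function(train_set) -> classifier(x) -> label
    """
    folds = [data[i::k] for i in range(k)]  # K roughly equal folds
    rates = []
    for i in range(k):
        holdout = folds[i]  # D^H(i)
        train = [p for j, f in enumerate(folds) if j != i for p in f]  # D^T(i)
        classify = train_and_classify(train)
        errors = sum(1 for x, y in holdout if classify(x) != y)
        rates.append(errors / len(holdout))
    return sum(rates) / k

# Toy base learner: 1-nearest-neighbour on a single feature
def one_nn(train):
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

data = [(0.1, "A"), (0.2, "A"), (0.3, "A"), (0.7, "B"), (0.8, "B"), (0.9, "B")]
print(kfold_misclassification(data, 3, one_nn))
```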
3.1 Summary of Cross-classifier Performances
• The other 5 classifier families: Discriminant Analysis (DA), Naïve Bayes (NB), k Nearest
Neighbours (kNN), Logistic Regression (LR) and Decision Tree (DT).
• The ranking of the top-performing classifier families is in line with those reported in the
classifier performance comparison literature based on non-financial data [Delgado et al 2014, King
et al 1995].
3.2 Conclusions
1. The ML-technique based CDS Proxy Method satisfies regulatory requirements,
accounts for counterparty-specific default risk, and is appropriate for CVA pricing
and counterparty credit risk management (Success Criterion #1).
2. No arbitrage is introduced by the model (Success Criterion #2) [Brummelhuis and Luo 2018].
3. Model assessment is based on statistical/machine learning theory and
produces satisfactory results under the cross-validation procedure (Success
Criterion #3).
• Based on studies of 156 classifiers across 8 algorithms, Neural Network [99.3% (0.6%)], SVM [96.8% (1.6%)]
and Ensemble/Bagged Tree [96.0% (2.2%)] are the top 3 performers; the ranking is in line with the classifier
comparison literature.
4. To the best of our knowledge, this study is:
• The 1st Machine Learning technique based CDS Proxy Method.
• The 1st classifier performance comparison research based on financial market data.
• The 1st research effort to look at correlation impacts on cross-classifier performance.
References
1. Berndt, A., Duffie, D., Douglas, R., Ferguson, M., Schranz, D., 2005, Measuring Default Risk Premia from Default
Swap Rates and EDFs.
2. BCBS, July 2015, Review of the Credit Valuation Adjustment Risk Framework, Consultative Document, Bank
for International Settlements.
3. BCBS, Basel III, Finalizing post-crisis reforms, December 2017.
4. Bloomberg, Bloomberg Credit Risk, Framework, Usage and Methodology, 2014
5. Brigo, D., Morini M. and Pallavicini A., 2013, Counterparty Credit Risk, Collateral and Funding: With Pricing
Cases for All Asset Classes, John Wiley and Sons Ltd.
6. Brummelhuis, Raymond and Luo, Zhongmin, CDS Rate Construction Methods by Machine Learning
Techniques (May 12, 2017). SSRN: https://ssrn.com/abstract=2967184
7. Brummelhuis, Raymond and Luo, Zhongmin, A Note on No-arbitrage Restrictions on CDS Curve, 2018.
8. Brummelhuis, Raymond and Luo, Zhongmin, Bank Capital, Net Interest Margin and Stress Testing by
Machine Learning Techniques, 2018
9. EBA Report, 22 February 2015, On Credit Valuation Adjustment (CVA) under Article 456(2) of Regulation (EU)
No 575/2013 (Capital Requirements Regulation).
10. King, R., Feng, C., and Sutherland, A., Statlog: comparison of classification algorithms on large real-world
problems, Applied Artificial Intelligence, 1995, 9(3), 289-333.
11. Pykhtin M. and Zhu S., 2007, A Guide to Modelling Counterparty Credit Risk.
12. Wu, X., Kumar, V., Quinlan, J., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G., Ng, A., Liu, B., Yu, P., Zhou, Z.,
Steinbach, M., Hand, D., Steinberg, D., 2008, Top 10 algorithms in data mining, Knowl Inf Syst (2008).