Using SkLearn to Improve Existing Risk
Models
PJ Fitzpatrick
opensourcepj@gmail.com
Definitions and Assumptions
●Market Risk rather than Credit Risk
●Linear instruments on stock indices
●Assume we have 6 indices:
● Dow Jones Ind/Transport/Utilities
● S&P 500
● Nasdaq
● Russell 2000
Definitions and Assumptions
●Risk Measure = VaR
●Expressed in the currency of risk
●Defined by a confidence interval and a holding
period. E.g. a 1-day VaR of 100 at 95%
confidence means that 95% of the time the
1-day PnL will be above -100
●Used to measure disparate products in
disparate markets consistently
●The measure of risk changes frequently as
volatility changes
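As a concrete sketch of this definition (using simulated PnL, not data from the deck), the percentile reading of VaR can be checked directly with numpy:

```python
import numpy as np

# Hypothetical daily PnL series for illustration only
rng = np.random.default_rng(0)
pnl = rng.normal(loc=0.0, scale=100.0, size=250)

# 1-day 95% VaR: quoted as a positive number, it is the negated
# 5th percentile of the daily PnL distribution
var_95 = -np.percentile(pnl, 5)

# By construction, roughly 95% of observed PnL lies above -var_95
coverage = np.mean(pnl > -var_95)
```

With 250 observations the empirical coverage will not be exactly 95%, which is precisely why the backtests later in the deck are needed.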
Historical VaR
●Simulation-based risk measure
●Simulation scenarios are taken as actual
historical returns
●Usually the most recent 1 year's data
●Tradeoff between relevance and number of
scenarios
Historical VaR
●Easy to explain
●Easy to implement
●Implementations are much more consistent
across institutions
●Deals very well with fat-tailed distributions
and with correlation behaviour under extreme
movements
import os

import pandas as pd

from settings import data_dir, scenario_dir

indices = pd.read_csv(os.path.join(data_dir, "prel_indices.csv"))
stock_names = ['DJI', 'DJT', 'DJU', 'GSPC', 'IXIC', 'RUT']
for stock_name in stock_names:
    chg_name = '{0}_Chge'.format(stock_name)
    indices[chg_name] = indices[stock_name].pct_change()
indices.to_csv(os.path.join(data_dir, "indices.csv"))
Historical VaR – Pre-Processing
import numpy as np

indices = pd.read_csv(os.path.join(data_dir, 'indices.csv'))
position = {'DJI': 1, 'DJT': 1, 'DJU': 1, 'GSPC': 1, 'IXIC': 1, 'RUT': 1}
idx = 1000
var_periods = 250
series_historical_var = sum(map(
    lambda x: indices.iloc[idx-var_periods-1:idx-1]['{0}_Chge'.format(x)]*position[x],
    position.keys()))
historical_var = np.percentile(series_historical_var, 5)
hypothetical_pnl = sum(map(
    lambda x: indices.iloc[idx]['{0}_Chge'.format(x)]*position[x],
    position.keys()))
Historical VaR – Implementation
VaR Testing
● Hypothetical PnL as opposed to actual PnL
● Hypothetical PnL is compared to VaR each day,
and a breach (1) or no breach (0) is recorded
● The model is usually tested every year
● Kupiec test – checks that the number of
breaches is in line with the chosen percentile,
given the number of observations
● Christoffersen test – checks for runs of
breaches
● Test portfolios used to improve test coverage
def kupiec(var_results, per):
    n = len(var_results)
    m = sum(var_results)
    return (2*np.log(pow(1-m/n, n-m)*pow(m/n, m)) -
            2*np.log(pow(1-per, n-m)*pow(per, m)))

def christoffersen_serial_ind(var_results):
    n00 = 0
    n01 = 0
    n10 = 0
    n11 = 0
    for idx, result in enumerate(var_results[:-1]):
        if result == 0:
            if var_results[idx+1] == 0:
                n00 += 1
            else:
                n01 += 1
        if result != 0:
            if var_results[idx+1] == 0:
                n10 += 1
            else:
                n11 += 1
    pi01 = n01 / (n00 + n01)
    pi11 = n11 / (n10 + n11)
    pi = (n01 + n11) / (n00 + n01 + n10 + n11)
    return (2*np.log(pow(1-pi01, n00)*pow(pi01, n01)*pow(1-pi11, n10)*pow(pi11, n11)) -
            2*np.log(pow(1-pi, n00+n10)*pow(pi, n01+n11)))
VaR Testing
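To turn the Kupiec statistic into a pass/fail decision, it is compared to a chi-square distribution with one degree of freedom. A minimal sketch, using an invented breach series (the seven breach days below are assumed, not real results) and recomputing the same statistic as the kupiec() function above:

```python
from math import erfc, log, sqrt

# Hypothetical backtest: 250 days, 7 breach days (assumed data)
var_results = [0]*250
for day in (10, 60, 110, 160, 210, 230, 240):
    var_results[day] = 1

n, m, per = len(var_results), sum(var_results), 0.05
# Kupiec likelihood-ratio statistic, rearranged with logs
lr_uc = (2*((n-m)*log(1-m/n) + m*log(m/n)) -
         2*((n-m)*log(1-per) + m*log(per)))
# Chi-square(1) survival function via the complementary error function
p_value = erfc(sqrt(lr_uc/2))
```

The model is rejected at the 5% level when p_value falls below 0.05; here 7 breaches against an expected 12.5 gives a p-value around 0.08, so it narrowly passes.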
Improving Historical VaR
●Historical VaR selects scenarios from a fixed
window of the most recent data
●An alternative is to use a larger window,
cluster within it, and select scenarios only
from the cluster that the current observation
belongs to
●This assumes that the market alternates between
different states
Improving Historical VaR
●To cluster we need attributes. These can be:
●Derived from risk factors
●External/calendar based
●Only attributes derived from risk factors are
used here
●External/calendar attributes are domain specific,
but those derived from risk factors can be used
for any asset
Improving Historical VaR
●For each risk factor calculate:
●Ratio of the index to its 30, 50 and 200 day moving
average, e.g. DJI_Avg_R_30
●Ratio of 2 averages to each other, e.g. DJI_Avg_R2_30_50
●Ratio of 2 standard deviations to each other, e.g.
DJI_Std_R_30_50
●Number of standard deviations from the average, e.g.
DJI_NumStd_30
import os

import numpy as np
import pandas as pd

from settings import data_dir, scenario_dir

stock_names = ['DJI', 'DJT', 'DJU', 'GSPC', 'IXIC', 'RUT']
indices = pd.read_csv(os.path.join(data_dir, "prel_indices.csv"))
for stock_name in stock_names:
    chg_name = '{0}_Chge'.format(stock_name)
    indices[chg_name] = indices[stock_name].pct_change()
    for num_days in [30, 50, 200]:
        avg_name = '{0}_{1}_Avg'.format(stock_name, num_days)
        std_name = '{0}_{1}_Std'.format(stock_name, num_days)
        indices[avg_name] = indices[stock_name].rolling(window=num_days, center=False).mean()
        indices[std_name] = indices[chg_name].rolling(window=num_days, center=False).std()
        ratio_name = '{0}_Avg_R_{1}'.format(stock_name, num_days)
        indices[ratio_name] = indices[stock_name] / indices[avg_name]
Adding Attributes
for stock_name in stock_names:
    ratio_name = '{0}_Std_R_{1}_{2}'.format(stock_name, 30, 50)
    indices[ratio_name] = (indices['{0}_{1}_Std'.format(stock_name, 30)] /
                           indices['{0}_{1}_Std'.format(stock_name, 50)])
    avg_ratio_name = '{0}_Avg_R2_{1}_{2}'.format(stock_name, 30, 50)
    indices[avg_ratio_name] = (indices['{0}_{1}_Avg'.format(stock_name, 30)] /
                               indices['{0}_{1}_Avg'.format(stock_name, 50)])
    ratio_name = '{0}_Std_R_{1}_{2}'.format(stock_name, 50, 200)
    indices[ratio_name] = (indices['{0}_{1}_Std'.format(stock_name, 50)] /
                           indices['{0}_{1}_Std'.format(stock_name, 200)])
    avg_ratio_name = '{0}_Avg_R2_{1}_{2}'.format(stock_name, 50, 200)
    indices[avg_ratio_name] = (indices['{0}_{1}_Avg'.format(stock_name, 50)] /
                               indices['{0}_{1}_Avg'.format(stock_name, 200)])

for stock_name in stock_names:
    for num_days in [30, 50, 200]:
        avg_name = '{0}_{1}_Avg'.format(stock_name, num_days)
        std_name = '{0}_{1}_Std'.format(stock_name, num_days)
        numstd_name = '{0}_NumStd_{1}'.format(stock_name, num_days)
        indices[numstd_name] = (indices[stock_name] - indices[avg_name]) / indices[std_name]
Adding Attributes
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

start = idx - kmeans_horizon
end = idx - 1
idxs = indices.loc[start:end].index
X_prel = indices.loc[start:end, cluster_attributes].values
scaler = StandardScaler()
scaler.fit(X_prel)
X = scaler.transform(X_prel)
kmeans = KMeans(n_clusters=3, random_state=0).fit(X)
selected_cluster_label = kmeans.labels_[-1]
# Take the most recent var_periods scenarios from the current cluster
hist_var_scen_from_cluster = [idxs[i] for i, item in enumerate(kmeans.labels_)
                              if item == selected_cluster_label][var_periods*-1:]
Improving Historical VaR
Comparing Results to HistVar
●Compare the absolute deviation from the target
percentile against the same number for historical
VaR
●Perform this for a number of different portfolio
types based on trading styles
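The comparison metric itself can be sketched in a few lines. The two breach series below are invented for illustration (both assume a 250-day backtest); in practice they would come from the backtests above:

```python
import numpy as np

def breach_deviation(var_results, per=0.05):
    # Absolute deviation of the observed breach rate from the target percentile
    return abs(np.mean(var_results) - per)

# Hypothetical results: 18 vs 13 breaches out of 250 days
hist_var_results = [1]*18 + [0]*232      # plain HistVar: 7.2% breach rate
cluster_var_results = [1]*13 + [0]*237   # clustered variant: 5.2% breach rate
cluster_is_better = (breach_deviation(cluster_var_results) <
                     breach_deviation(hist_var_results))
```

The model with the smaller deviation from the 5% target is marked as the better one for that year and portfolio, which is how the Y/N table later in the deck is built.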
indices = pd.read_csv(os.path.join(data_dir, "indices.csv"))
stock_names = ['DJI', 'DJT', 'DJU', 'GSPC', 'IXIC', 'RUT']
positions = {'long': {}, 'short': {}, 'spread1': {}, 'spread2': {}}
for idx in range(start, end):
    prev = idx - 1
    prel_list = []
    for stock_name in stock_names:
        # Position of the latest price within its 90-day range
        calc_ratio = ((indices.iloc[prev][stock_name] -
                       indices.iloc[prev-90:prev][stock_name].min()) /
                      (indices.iloc[prev-90:prev][stock_name].max() -
                       indices.iloc[prev-90:prev][stock_name].min()))
        prel_list.append((calc_ratio, stock_name))
    s_list = sorted(prel_list)
    positions['spread1'][idx] = {
        s_list[0][1]: 1, s_list[1][1]: .5, s_list[2][1]: 0,
        s_list[3][1]: 0, s_list[4][1]: -.5, s_list[5][1]: -1}
    positions['spread2'][idx] = {
        s_list[0][1]: -1, s_list[1][1]: -.5, s_list[2][1]: 0,
        s_list[3][1]: 0, s_list[4][1]: .5, s_list[5][1]: 1}
    positions['short'][idx] = {
        s_list[0][1]: -1, s_list[1][1]: -.5, s_list[2][1]: 0,
        s_list[3][1]: 0, s_list[4][1]: 0, s_list[5][1]: 0}
    positions['long'][idx] = {
        s_list[0][1]: 1, s_list[1][1]: .5, s_list[2][1]: 0,
        s_list[3][1]: 0, s_list[4][1]: 0, s_list[5][1]: 0}
Calculating Test Portfolios
Results for GSPC_Avg_R_200 - IXIC_Std_R_50_200
(net = number of Y minus number of N)

year  long  short  spread1  spread2  net
2002   Y     Y       N        Y       2
2003   Y     Y       Y        Y       4
2004   N     N       Y        Y       0
2005   N     N       Y        N      -2
2006   Y     N       N        N      -2
2007   Y     Y       Y        Y       4
2008   Y     Y       Y        Y       4
2009   Y     Y       Y        Y       4
2010   N     N       N        N      -4
2011   Y     Y       Y        Y       4
2012   Y     Y       N        N       0
2013   N     N       Y        Y       0
2014   Y     Y       N        N       0
2015   N     Y       N        N      -2
2016   N     N       N        N      -4
net    3     3       1        1
Summary
●Results vary by portfolio type
●Allows for model diversification
●Less frequent but larger changes in risk measures
●In practice, much more sophisticated selection
criteria are required, including a comprehensive
measure of VaR model quality
●Include static portfolios
●Guard against overfitting
●Consider combining models
Using Decision Trees to Explain VaR
Breaches
●Pre-process announcement dates as features
●Used to explain rather than predict
●Raw risk factors would not usually be expected to
be significant; transformed risk factors might be
●Produces useful results for non-technical users
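A minimal sketch of the idea with sklearn's DecisionTreeClassifier. The attribute names, the announcement flag and the breach rule below are all invented for illustration; real inputs would be the breach indicator together with the derived attributes and announcement-date flags described above:

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
features = pd.DataFrame({
    'DJI_NumStd_30': rng.normal(size=500),       # assumed derived attribute
    'announcement_day': rng.integers(0, 2, size=500),  # assumed calendar flag
})
# Assumed pattern: breaches occur on announcement days with a stretched index
breach = ((features['announcement_day'] == 1) &
          (features['DJI_NumStd_30'] > 1)).astype(int)

# A shallow tree keeps the explanation readable for non-technical users
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(features, breach)
rules = export_text(tree, feature_names=list(features.columns))
```

export_text renders the fitted splits as plain-text if/else rules, which is the part that can be shown to a risk committee directly.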
Using Regression to Identify Positions In
Portfolio
● Regress the current portfolio PnL by scenario
against instrument PnL by scenario
● Use Lasso/Elastic Net regression to let the
model determine the positions
● Very useful for reducing disparate portfolios
to a comprehensible number of positions
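A sketch of the regression with simulated data: instrument_pnl plays the role of the per-instrument scenario PnL matrix, and the "true" book is assumed sparse. The scenario count, instrument count and alpha are illustrative choices, not values from the deck:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
instrument_pnl = rng.normal(size=(250, 6))       # 250 scenarios x 6 instruments
true_positions = np.array([2.0, 0.0, 0.0, -1.5, 0.0, 0.0])
portfolio_pnl = instrument_pnl @ true_positions  # observed portfolio PnL

# The L1 penalty zeroes negligible positions, leaving a readable book
lasso = Lasso(alpha=0.01, fit_intercept=False).fit(instrument_pnl, portfolio_pnl)
estimated_positions = lasso.coef_
```

With a small alpha the two real positions are recovered almost exactly while the other coefficients are driven to zero; larger alphas trade fit for an even sparser, more comprehensible summary.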
Stress Testing
Specification is usually on a small subset of risk
factors, e.g. INDU changes by -5%
The objective is to fill out the other risk factors
in a manner that is consistent and coherent
Issues:
● Correlation is very different under extreme moves
● Applying changes can result in impossible risk
factor levels
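One common way to fill out the unspecified factors (an assumption here, not a method from the slides) is to fit a joint normal to historical returns and set each remaining factor to its conditional expectation given the shock. The two-factor returns below are simulated, and the 0.8 correlation and factor pairing are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated correlated daily returns: column 0 = shocked factor (e.g. INDU)
cov_true = 0.0001 * np.array([[1.0, 0.8],
                              [0.8, 1.0]])
returns = rng.multivariate_normal([0.0, 0.0], cov_true, size=1000)

cov = np.cov(returns, rowvar=False)
shock = -0.05                        # INDU down 5%
beta = cov[1, 0] / cov[0, 0]         # regression of factor 1 on factor 0
implied_move = beta * shock          # E[factor 1 | factor 0 = shock]
```

This sketch inherits the first issue above: a covariance estimated from ordinary history may badly understate how correlated the factors become in an extreme move, so the implied fill-in should be sanity-checked against crisis episodes.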