The article reaffirms investor faith in the CAPM and discusses how closely the model is coupled to actual investment practice on the ground. It also addresses the general criticism that the CAPM does not fit empirical asset pricing well, a criticism that has cast doubt on the model's validity.
This lecture series covers the use of the R language, its interface, and the functions required to evaluate financial risk models. It further covers R applications for financial market data, risk measurement, modern portfolio theory, risk modeling of returns with generalized hyperbolic and lambda distributions, Value at Risk (VaR) modelling, extreme value methods and models, the class of ARCH models, GARCH risk models, and portfolio optimization approaches.
Since the CAPM of Sharpe (1965) and the first “fundamental” model of King (1966), the use of “factors” in alpha generation and risk modeling has become mainstream. However, the types of factors we employ and the techniques we use to model relationships have, in general, not progressed much since. In addition, many of our favorite techniques assume that the world is static, whereas markets of course evolve and change dramatically, as has been so vividly illustrated over the last few years.
We review fundamental, macro-economic, and statistical factors, describing the advantages and disadvantages of each, and review some newer techniques that explicitly allow for evolving relationships in data sets and harness emerging technologies that can capture much more nuanced relationships than simple correlation: “flexible” least-squares regression, artificial immune systems, single-pass clustering, semantic clustering, social network influence measurement, layer-embedded networks, block-modeling, and more.
Determinants of the implied equity risk premium in Brazil (FGV Brazil)
This paper proposes and tests market determinants of the equity risk premium (ERP) in Brazil. We use the implied ERP, based on the Elton (1999) critique. The ultimate goal of this exercise is to demonstrate that calculating an implied, as opposed to historical, ERP makes sense, because it varies, in the expected direction, with changes in fundamental market indicators. The ERP for Brazil is calculated as a mean over large samples of individual stock prices in each month of the January 1995 to September 2015 period, using the “implied risk premium” approach. As determinants of changes in the ERP we obtain, as significant and in the expected direction: changes in the CDI rate, in the country debt risk spread, in the US market liquidity premium, and in the level of the S&P 500. The influence of the proposed determining factors is tested with time series regression analysis. The possibility of a change in that relationship with the 2008 crisis was also tested, and the results indicate that the global financial crisis had no significant impact on the nature of the relationship between the ERP and its determining factors. For comparison purposes, we also consider the same variables as determinants of the ERP calculated with average historical returns, as is common in professional practice. First, the series constructed in that way does not exhibit any relationship to known market events. Second, the variables found to be significantly associated with the historical ERP do not exhibit any intuitive relationship with compensation for market risk.
Authors:
Sanvicente, Antonio Zoratto
Carvalho, Mauricio Rocha de
FGV's Sao Paulo School of Economics (EESP)
A review of the assumptions behind fundamental, macro, and statistical risk models. Pros and cons of each approach. Introducing adaptive hybrid risk models.
Training is anchored in the past. It cannot prepare for a disruptive future. ROI on training is easy to justify because it should be able to prove greater efficiency. We need a different kind of learning to prepare us for disruption. Let's call it x-learning. How does x-learning differ from training? And how do your programs compare?
Within the larger category of leaders, there is an important sub-category, adaptive leaders who can read the signs of the times and help their teams and organizations to adapt to change. What do they look like and how can we test for them?
Author: Mr Di Chen, École Polytechnique Fédérale de Lausanne, Financial Engineering Section
This paper shows that complexity influences stock returns. By constructing complexity and resilience measures for common stocks and analyzing the relationship between return, momentum, size, complexity, book-to-market ratio, and resilience, three measures (size, complexity, and momentum) stand out as factors that can influence stock returns.
Hedge Fund Predictability Under the Magnifying Glass: The Economic Value of Fo... (Ryan Renicker, CFA)
The recent financial crisis has highlighted the need to search for suitable models forecasting hedge fund performance.
This paper develops and applies a framework in which to assess return predictability on a fund-by-fund basis.
Using a comprehensive sample of hedge funds during the 1994-2008 period, we identify the fraction of funds in each style that are truly predictable, positively or negatively, by macro variables.
Out-of-sample, exploiting predictability can be difficult, as estimation risk and model uncertainty lead to imprecise fund forecasts.
Moreover, in our multi-fund setting, investors face a trade-off between unconditional and predictable performance, as strongly predictable funds may exhibit low unconditional means.
Nevertheless, a strategy that combines forecasts across predictors circumvents all these challenges and delivers superior performance.
We highlight the statistical and economic drivers of this performance, especially in periods when predictor values strongly depart from their long run means.
Finally, we use one such period, the 2008 crisis, as a natural out-of-sample experiment to validate the robustness of our findings.
This paper studies an optimal investment and reinsurance problem for a jump-diffusion risk model with a short-selling constraint under the mean-variance criterion. We assume that the insurer is allowed to purchase proportional reinsurance from the reinsurer and to invest in a risk-free asset and a risky asset whose price follows a geometric Brownian motion. In particular, both the insurance and reinsurance premiums are assumed to be calculated via the variance principle with different p…
Discussion of “Systemic and Systematic risk” by Billio et al. and “CDS based indicators for systemic risk of Euro area sovereigns and for Euro area financial firms” by Lucas et al. - Carsten Detken.
SYRTO Code Workshop
Workshop on Systemic Risk Policy Issues for SYRTO (Bundesbank-ECB-ESRB)
Head Office of Deutsche Bundesbank, Guest House
Frankfurt am Main, July 2, 2014
Motivated by the problem of computing investment portfolio weightings we investigate various methods of clustering as alternatives to traditional mean-variance approaches. Such methods can have significant benefits from a practical point of view since they remove the need to invert a sample covariance matrix, which can suffer from estimation error and will almost certainly be non-stationary. The general idea is to find groups of assets which share similar return
characteristics over time and treat each group as a single composite asset. We then apply inverse volatility weightings to these new composite assets. In the course of our investigation we devise a method of clustering based on triangular potentials and we present associated theoretical results as well as various examples based on synthetic data.
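As a rough base-R sketch of the general idea (an illustration, not the authors' triangular-potential method), one can group assets by a correlation-based distance, weight the members of each cluster by inverse volatility to form composite assets, and then weight the composites by their own inverse volatility; `returns` is a hypothetical matrix of asset returns.

# Minimal sketch: cluster assets by return correlation, then apply inverse
# volatility weights within and across the resulting composite assets.
set.seed(1)
returns <- matrix(rnorm(250 * 8, sd = 0.01), ncol = 8,
                  dimnames = list(NULL, paste0("A", 1:8)))   # hypothetical T x N returns

corr_dist <- as.dist(sqrt(2 * (1 - cor(returns))))    # correlation-based distance
clusters  <- cutree(hclust(corr_dist), k = 3)         # 3 composite "assets"

vols <- apply(returns, 2, sd)
w    <- setNames(numeric(ncol(returns)), colnames(returns))
for (k in unique(clusters)) {
  idx      <- clusters == k
  within_w <- (1 / vols[idx]) / sum(1 / vols[idx])    # inverse-vol weights inside the cluster
  comp_ret <- as.vector(returns[, idx, drop = FALSE] %*% within_w)
  w[idx]   <- within_w / sd(comp_ret)                 # scale by inverse composite-asset vol
}
w <- w / sum(w)                                       # final portfolio weights
round(w, 3)

Because only cluster memberships and volatilities are needed, no sample covariance matrix has to be inverted, which is the practical benefit the abstract points to.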
Statistical Arbitrage
Pairs Trading, Long-Short Strategy
Cyrille BEN LEMRID

Contents

1 Pairs Trading Model
  1.1 General discussion
  1.2 Cointegration
  1.3 Spread dynamics
2 State of the art and model overview
  2.1 Stochastic Dependencies in Financial Time Series
  2.2 Cointegration-based trading strategies
  2.3 Formulation as a Stochastic Control Problem
  2.4 Fundamental analysis
3 Strategies Analysis
  3.1 Roadmap for strategy design
  3.2 Identification of potential pairs
  3.3 Testing cointegration
  3.4 Risk control and feasibility
4 Results

Introduction
This report presents my research work carried out at Credit Suisse from May to September 2012. This study has been pursued in collaboration with the Global Arbitrage Strategies team.
Quantitative strategy developers use sophisticated statistical and optimization techniques to discover and construct new algorithms. These algorithms take advantage of short-term deviations from securities' “fair” prices. Pairs trading is one such quantitative strategy: it is a process of identifying securities that generally move together but are currently “drifting away”.
Pairs trading is a common strategy among many hedge funds and banks. However, there is not a significant amount of academic literature devoted to it due to its proprietary nature. For a review of some of the existing academic models, see [6], [8], [11] .
Our focus in this analysis is the study of two quantitative approaches to the problem of pairs trading. The first uses the properties of co-integrated financial time series as the basis for a trading strategy; in the second, we model the log-relationship between a pair of stock prices as an Ornstein-Uhlenbeck process and use it to formulate a portfolio-optimization-based stochastic control problem.
This study was performed to show that under certain assumptions the two approaches are equivalent.
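As a rough base-R sketch of the first, cointegration-based approach (an illustration under simplified assumptions, not the report's implementation), one can regress one log-price series on the other, treat the regression residual as the spread, and generate long/short signals from its z-score; `pa` and `pb` below are simulated stand-ins for a cointegrated pair.

# Minimal sketch of a cointegration-style pairs signal on simulated prices.
set.seed(1)
n  <- 500
x  <- cumsum(rnorm(n, 0, 0.01))              # common stochastic trend
pa <- exp(3 + 1.0 * x + rnorm(n, 0, 0.005))  # hypothetical price series A
pb <- exp(2 + 0.9 * x + rnorm(n, 0, 0.005))  # hypothetical price series B

hedge  <- lm(log(pa) ~ log(pb))              # estimate the cointegrating relation
spread <- residuals(hedge)                   # stationary spread if the pair cointegrates
z      <- (spread - mean(spread)) / sd(spread)

# Enter when the spread is stretched, stay flat otherwise
position <- ifelse(z >  2, -1,               # short A / long B
            ifelse(z < -2,  1, 0))           # long A / short B
table(position)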
Practitioners most often use a fundamentally driven approach, analyzing the performance of stocks around a market event and implementing strategies using back-tested trading levels.
We also study an example of a fundamentally driven strategy, using market reaction to a stock being dropped or added to the MSCI World Standard, as a signal for a pair trading strategy on those stocks once their inclusion/exclusion has been made effective.
This report is organized as follows. Section 1 provides some background on the pairs trading strategy. The theoretical results are described in Section 2. Section 3 …
Understanding Greek government bond spreads (Ilias Lekkos)
The aim of our research is to enhance our understanding of the fundamental behavior of Greek Government Bond spreads, both before and after the Greek economic crisis. We do so by trying to address the following issues:
Which are the fundamental drivers of GGB spreads?
Is this set of driving factors constant, or does it vary with the situation in the Greek bond market and the size of the spreads?
Even for factors that are significant across the distribution of spreads, do they maintain a constant influence (constant betas) on the spreads, or does this vary as well?
Are Greek Government Bonds fairly valued given the levels of their fundamentals?
Are the risks around their fair value always symmetric or can we identify periods of positive (i.e. increased probability for narrower spreads) and negative (i.e. increased probability for wider spreads) risks?
Can we use the enhanced flexibility of the model for risk management purposes? That is, can we produce more accurate Value-at-Risk analysis for GGB spreads?
Finally, can we employ our model to explore the behavior of GGB spreads under various macroeconomic scenarios?
Exploring Policyholder Behavior in the Extreme Tail
A Process of Modeling and Discovery with Extreme Value Theory
By Yuhong (Jason) Xue¹
ABSTRACT
Policyholder behavior risk should be treated as a long term strategic risk, not only because its trend emerges slowly and its impact can only be recognized far in the future, but more importantly because failing to recognize a change in trend can lead to inadequate capitalization or missed opportunities of strategic importance. Actuaries and risk officers have an obligation to employ more forward looking techniques
to better understand and mitigate this risk. This paper demonstrates that Extreme Value Theory (EVT)
can be used as such a tool to model policyholder behavior in the extreme tail, an area where judgment
has been constantly applied. EVT is a mathematical theory that explores the relationship of random
variables in the extremes. It is capable of predicting behavior in the extremes based on data in the not
so extreme portions of the distribution. This paper applies EVT to the study of variable annuity dynamic
lapse behavior in the extreme tail as an example. It illustrates a process whereby a company can fully
model policyholder behavior including the extreme tail with existing data. No judgment about the
extremal behavior is necessary. To this end, a variety of copulas are fitted to find the best model that
describes the extremal dependency between ITM and lapse rate. The actual data combined with data
simulated by the model in the extreme tail is then used to fit a dynamic lapse formula which displays
different characteristics compared to the traditional methods.
1 Introduction
Firms face both short term and long term risks. During a time of uncertainty, business leaders will
naturally focus more on the immediate threats: perhaps the loss of certain investments in the euro
zone, the weakened market demand in a recession, or the urgent capital need in a volatile environment.
Whatever the challenges may be, thanks to sound risk management, companies are able to adapt
quickly and steer away from the many clear and present dangers. But is merely tactically managing
short term risks enough to ensure long term sustainable success? The answer is clearly no. For companies that ultimately failed, the root cause was often not a failure to mitigate a short term threat; it was a failure to identify a long term strategic risk, such as a change in customers' behavior or a margin squeeze that threatens the whole industry.
¹ Yuhong (Jason) Xue, FSA, MAAA, Retirement Solutions, Guardian Life Insurance Company of America, New York, NY 10004
Email: yuhong_xue@glic.com
According to an online survey conducted by The Economist Intelligence Unit in 2011 on how companies
are managing long term strategic risks, although more companies are now recognizing the importance
of long term risk management, few claimed to be good at it. This is not at all surprising. Long term risk
is inherently unknown, slow emerging, and does not easily lend itself to available analytical tools. For
life insurance companies, one such long term risk is how policyholders will ultimately behave in extreme
market conditions.
For years, actuaries have observed that policyholders behave differently in different markets. Holders of
variable annuity policies, for example, tend to surrender less frequently in a prolonged down market
than they would in a normal market. To account for this, dynamic behavior is assumed in a range of
applications such as pricing, valuation, hedging and capital determination, where an inverse link is
established between a market indicator, i.e. In The Moneyness (ITM), and a behavior indicator, i.e. full
lapse rate. However, this link is only established based on past experience; in many cases, a block of
business either has never been in a severe market or there is not enough data in a severe market to
draw any credible conclusions. In other words, how policyholders would behave under extreme market
conditions is largely unknown at this point. Nevertheless, assumptions about it are made everywhere,
from pricing and reserving to economic and regulatory capital.
Assumptions or opinions differ widely from company to company in terms of how efficient policyholders
would eventually be in severe markets. The price of new products and the amount of reserve and
capital companies are holding are, to some extent, reflections of their opinions about the behavior in
the extremes. Needless to say, getting it right in the long run is of strategic importance to companies,
regulators and the industry as a whole. According to the same survey, one of the most important roles
that the risk function should play is to question strategic assumptions and challenge management’s
entrenched view about the future. For actuaries and risk officers, what can we do to fulfill this
important obligation with respect to setting policyholder behavior assumptions?
Thanks to the recent market volatility, some behavior data under stressed market conditions is
beginning to emerge. In terms of existing analytical tools, however, we still don’t have enough data to
conclude anything about the behavior under extreme conditions. But is the newly collected data in the
past few years dropping hints towards the extremes? Can we apply other tools to gain more insights?
The answer is a definite yes. In this paper, we will illustrate how to apply Extreme Value Theory to shed
some light onto policyholder behavior in the extremes.
Distributions in the tail are of strong interest to people not only because of the significant economic
consequences that tail events can inflict on financial institutions and society as a whole, but also
because random events tend to correlate with each other very differently in the tail than in the normal
range of distribution, which can greatly soften or exacerbate the financial impact. Understanding tail
distribution seems like a daunting task since there is only limited data in the tail and even fewer or none
in the extreme tail. However, mathematicians have discovered an interesting fact: for all random
variables, as long as they satisfy certain conditions, the conditional distribution beyond a large threshold
can be approximated by a family of parametric distributions. The dependence structure of two or more
random variables once they are all above a large threshold can also be approximated by a family of
copulas. This is called Extreme Value Theory. It suggests that if you can define a large enough
threshold and still have enough data to find a good fit for a distribution of the EVT family, then the
extreme tail can be described by the fitted distribution. In other words, EVT has the power of predicting
the extreme tail based on observations in the not so extreme range.
Researchers started to apply EVT to solving insurance problems two decades ago. Embrechts, Resnick,
and Samorodnitsky (1999) introduced the basic EVT theory with examples of industrial fire insurance.
Frees, Carrière, and Valdez (1996) studied dependence of human mortality between the two lives of a
joint annuity policy. Dupuis and Jones (2006) explored the dependence structures in the extreme tail
including large P&C losses and expenses they incur, hurricane losses in two different regions, and sharp
declines of two stocks.
In this paper, we will illustrate a process of using EVT in studying the behavior data by exploring the
relationship of ITM and lapse rate of a block of variable annuity business when ITM is extremely large, or
in the extreme tail. By fitting a bivariate distribution only to the data that is large enough, we will take
advantage of the predictive power of EVT and gain understanding of the lapse experience in the
extremes.
The paper is organized as follows. Section 2 gives a brief introduction to copulas and EVT. Section 3
discusses the raw data used in this analysis. Section 4 presents the result of the EVT analysis including
the fitted marginal distributions, the fitted copula that described the dependence structure, the
simulated data, and the regression analysis. Finally, concluding remarks are made in section 5. All data
analysis in this paper is performed using the R statistical language.
2 Introduction to Copulas and EVT
We present only the foundations of the theories here which are sufficient for the subsequent analysis.
Readers can refer to Dupuis and Jones (2006) and the book “An Introduction to Copulas” by Roger Nelsen for more details.
2.1 Introduction to Copulas
A copula is defined to be a joint distribution function of standard uniform random variables:

C(u1, …, up) = Pr(U1 ≤ u1, …, Up ≤ up),

where Ui, i = 1, …, p, are uniformly distributed random variables and each ui is a real number between 0 and 1.

For any p random variables X1, …, Xp with distribution functions F1, …, Fp, we can construct a multivariate distribution F by the following:

F(x1, …, xp) = C(F1(x1), …, Fp(xp)).

Sklar (1959) showed that the inverse is true also: for any multivariate distribution function F, there exists a copula function C such that the above formula holds.
As a result of Sklar's theorem, for any multivariate distribution, we can now write it in the form of its
marginal distributions and a copula. In other words, the dependence structure of the multivariate
distribution is fully captured in the copula and independent of the marginal distributions.
2.2 Extreme Value Theory
When studying the distribution of univariate large values, Pickands (1975) suggested using the
Generalized Pareto (GP) distribution to approximate the conditional distribution of excesses above a
sufficiently large threshold. That is, the conditional excess distribution Pr(X ≤ u + y | X > u), where y > 0 and u is sufficiently large, can be approximated by the GP distribution

G(y) = 1 − (1 + ε (y − µ) / σ)^(−1/ε),  defined where 1 + ε (y − µ) / σ > 0,

where µ is called the location parameter, σ is called the scale parameter, and ε is called the shape parameter.
A similar result can be extended to the multivariate case whereby the joint excesses can be
approximated by a combination of GP distributions for their marginals and a copula that belongs to the
Extreme Value copula family.
One important consideration of applying EVT is the choice of the threshold. It should be sufficiently
large so that the EVT distribution and copula converge to the real multivariate distribution. But it
cannot be so large either that there is not enough data to provide a good fit. A reasonable choice of the
threshold is a tradeoff between these two conflicting requirements.
In practice, three families of copulas are commonly used when studying extremal dependence. They are
Gumbel (1960), Frank (1979) and Clayton (1978) copulas. Even though they don’t belong to the EVT
copula family, for dealing with very large but finite data, these copulas provide a good representation of
the range of possible dependence structures. They are also easier to fit as they are represented by only
one dependence parameter.
An interesting fact about extremal dependence is that it can disappear when the threshold goes to
infinity. This is called asymptotic independence. The Frank and Clayton copulas both exhibit this
characteristic but the Gumbel copula does not, meaning it still has strong dependence even when all
random variables exceed huge numbers.
Table 1 summarizes the three copulas.
Family    Dependence parameter α   Mathematical representation (bivariate)                              Asymptotically independent
Gumbel    α > 1                    C(u, v) = exp(−[(−ln u)^α + (−ln v)^α]^(1/α))                        No
Frank     α ≥ 1                    C(u, v) = −(1/α) ln[1 + (e^(−αu) − 1)(e^(−αv) − 1) / (e^(−α) − 1)]   Yes
Clayton   −∞ < α < ∞               C(u, v) = (u^(−α) + v^(−α) − 1)^(−1/α)                               Yes

Table 1: Summary of copulas commonly used in extremal dependence studies
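As a small base-R illustration (not from the paper), the bivariate forms in Table 1 can be written directly as functions of the dependence parameter α:

# Minimal sketch: the three bivariate copula CDFs from Table 1 (standard textbook forms).
gumbel_copula <- function(u, v, alpha) {
  s <- (-log(u))^alpha + (-log(v))^alpha
  exp(-s^(1 / alpha))
}
frank_copula <- function(u, v, alpha) {
  -(1 / alpha) * log(1 + (exp(-alpha * u) - 1) * (exp(-alpha * v) - 1) / (exp(-alpha) - 1))
}
clayton_copula <- function(u, v, alpha) {
  (u^(-alpha) + v^(-alpha) - 1)^(-1 / alpha)
}

# Example: joint probability that both uniforms fall below 0.9
gumbel_copula(0.9, 0.9, alpha = 1.7)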
3 Raw Data
The author constructed a hypothetical in-force block of lifetime GMWB business. Although it is not
based on real company data, it resembles common product designs as well as general patterns of ITM
and lapse experience observed by many industry surveys and studies.
The block consists mostly of L-share lifetime GMWB business issued from Jan 1999 to June 2006. The
observation period is the 5 years from Jan 2006 to Dec 2011. For simplicity, we observe only the lapse
behavior in the first year after the surrender charge period, or the shock lapse behavior.
There are two variations of the product designs. The first offers only annual ratchet on the lifetime
withdrawal guarantee, while the second also has a roll up rate of 6% guaranteed for 10 years. ITM is
defined to be:
Based on the product type and their issue week, we sub-divide the block into 1,452 distinct cohorts. We
observe the ITM and lapse rate in the shock lapse period for every cohort. The result is 1,452 distinct
pairs of ITM and lapse rate data. In order to better analyze the data using EVT, we further process the
data pair by converting the lapse rate to 1/lapse rate. For example, 10% lapse rate would be converted
to 10. The ending data set is plotted in Figure 1.
Pearson's correlation is 0.7 and Kendall's correlation is 0.5, both suggesting strong correlation
between ITM and lapse rate. This is consistent with the visual in Figure 1.
However, our data is concentrated in the +/- 20% ITM range and turns sparse when ITM goes beyond 20%. Perhaps it is warranted to look at the data in the upper tail alone. We ranked the ITM and the reciprocal of lapse rate independently and plotted the observations for which both variables exceed their 85th and 90th percentiles in Figure 2. Interestingly, the correlation is less obvious beyond these thresholds.
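A minimal base-R sketch of this data-preparation step, using simulated stand-ins for the 1,452 cohort observations (`itm` and `lapse` are hypothetical, not the author's data):

# Minimal sketch: reciprocal transform, correlation measures, joint tail filter.
set.seed(5)
itm      <- rnorm(1452, 0, 0.15)                          # hypothetical cohort ITM values
lapse    <- plogis(-1 - 2 * itm + rnorm(1452, 0, 0.3))    # hypothetical lapse rates (inverse link to ITM)
invlapse <- 1 / lapse                                     # e.g. a 10% lapse rate becomes 10

cor(itm, invlapse)                                        # Pearson correlation
cor(itm, invlapse, method = "kendall")                    # Kendall correlation

# Joint exceedances: keep observations where both variables exceed their 85th percentiles
u85 <- (itm > quantile(itm, 0.85)) & (invlapse > quantile(invlapse, 0.85))
plot(itm[u85], invlapse[u85], xlab = "ITM", ylab = "1 / lapse rate")   # analogue of Figure 2, left panel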
Figure 1: Scatter plot of raw data

Figure 2: Scatter plot of reciprocal of lapse rate and ITM. Left: both exceed the 85th percentile. Right: both exceed the 90th percentile.
The distributions of ITM and lapse rate, or the marginal distributions, are also important. In figure 3 we
plotted histograms of the two variables.
Figure 3: Left: histogram of reciprocal of lapse rate. Right: histogram of ITM
4 Result of EVT analysis
The purpose of this section is to illustrate the process of using EVT to model and explore the potential
dependencies in the extremes between policyholder behavior and economic indicators such as ITM. We
use shock lapse data from the hypothetical variable annuity block as an example. The goal is not to
draw any definitive conclusions about the dependency. Rather, we will demonstrate the process by
which one can model and discover insights into one’s own data.
The process generally works as follows. We first analyze the marginal empirical distributions and define
a threshold usually in terms of percentiles of the empirical marginals. Normally we would want to select
a few such thresholds in order to find the largest that still leaves enough data for a good fit. We fit GP
only to the marginal data that exceeds the threshold. Then using the same thresholds, we fit a number
of copulas to the data exceeding the thresholds. Hopefully after this step, we can find a threshold and
copula combination that provides a good fit to the tail. Now with the tail completely specified by the GP
marginals and the copula, we can simulate the extreme tail data using the Monte-Carlo method. With
the simulated data, we can perform analysis familiar to actuaries such as regression and stochastic
calculation.
4.1 Marginal fitting
Here we fit GP to the excesses of ITM and the reciprocal of lapse rate data, using the 55th, 85th, and 90th percentiles of their empirical distributions as thresholds. Table 2 summarizes the result:
The fits were very good judging by the closeness to the empirical distributions. Figure 4 overlays the
empirical density with the fitted GP density at 55th percentile threshold.
Threshold   Variable   Location   Scale   Shape
55th        ITM        -0.005     0.197   -0.193
            1/lapse     3.448     0.282    1.387
85th        ITM         0.161     0.259   -0.446
            1/lapse     4.545     1.986   -0.156
90th        ITM         0.223     0.245   -0.476
            1/lapse     5.000     2.222   -0.217

Table 2: Result of fitting GP to the excesses of ITM and reciprocal of lapse rate.
Figure 4: Density functions of fitted GP (red) vs. empirical (black) for data exceeding the 55th percentile. Left: ITM density. Right: reciprocal of lapse rate density.
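The paper's analysis is performed in R but no code is shown. As a minimal sketch of this marginal-fitting step (an illustration, not the author's code), a GP distribution can be fitted to the exceedances above a percentile threshold by maximum likelihood with base R's optim; `itm` below is a simulated stand-in for the ITM series, and the ε → 0 limiting case is ignored for brevity.

# Minimal sketch: GP fit to exceedances above a percentile threshold via optim.
gp_nloglik <- function(par, x, u) {
  sigma <- par[1]; shape <- par[2]          # scale and shape (location fixed at the threshold u)
  if (sigma <= 0) return(1e10)
  z <- 1 + shape * (x - u) / sigma
  if (any(z <= 0)) return(1e10)             # outside the GP support
  sum(log(sigma) + (1 / shape + 1) * log(z))  # negative log-likelihood
}

fit_gp <- function(x, p) {
  u   <- unname(quantile(x, p))             # threshold at the p-th empirical percentile
  exc <- x[x > u]                           # exceedances above the threshold
  fit <- optim(c(sd(exc), 0.1), gp_nloglik, x = exc, u = u)
  c(location = u, scale = fit$par[1], shape = fit$par[2])
}

set.seed(1)
itm <- rnorm(1452, 0, 0.15)                 # hypothetical stand-in for the ITM series
fit_gp(itm, 0.55)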
4.2 Copula fitting
Now that the marginal distributions are fully specified with the three thresholds, we turn to model the
dependence structure between ITM and lapse rate in the tail. We fit the Gumbel, Frank and Clayton
copulas to data exceeding the thresholds. Other copulas are also possible. But these three capture a
range of dependencies and are sufficient for our purpose.
The method of fitting is canonical maximum likelihood (CML). This approach uses the empirical marginal
distributions instead of the fitted GP marginal distributions in the fitting process. It estimates the copula
dependence parameter more consistently without depending on the fitted marginal distributions. Table
3 shows the results.
At the 55th percentile threshold, the Gumbel copula provides the best fit. However, at the 85th percentile, the Clayton copula fits better. At the 90th percentile, there are fewer than 100 data points, which is not enough to fit the copulas properly. Therefore, we chose the 85th percentile as our threshold and the Clayton copula to model the tail. The distribution of ITM and lapse is fully specified by the fitted Clayton copula and the GP-fitted marginals for ITM and lapse exceeding the 85th percentile.
Threshold                 55th                    85th                    90th
Number of data pairs      560                     145                     95

Copula    Parameter   Pseudo max loglik.   Parameter   Pseudo max loglik.   Parameter   Pseudo max loglik.
Gumbel    1.715       140.869              1.278       8.893                1.106       1.236
Frank     4.736       134.379              2.420       10.678               0.912       1.043
Clayton   0.801       69.952               0.601       10.881               0.148       0.531

Table 3: Result of fitting copulas to the empirical data.
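A minimal base-R sketch of the CML fitting step (an illustration, not the author's code): the joint exceedances are converted to pseudo-observations via their empirical ranks, and the Clayton log-likelihood is maximized over its single dependence parameter; `itm_exc` and `invlapse_exc` are hypothetical stand-ins for the paired exceedances.

# Minimal sketch: canonical maximum likelihood fit of a Clayton copula.
clayton_nloglik <- function(alpha, u, v) {
  # Clayton density: c(u,v) = (1+a) (u v)^(-a-1) (u^-a + v^-a - 1)^(-1/a - 2)
  -sum(log(1 + alpha) - (alpha + 1) * (log(u) + log(v)) -
       (1 / alpha + 2) * log(u^(-alpha) + v^(-alpha) - 1))
}

fit_clayton_cml <- function(x, y) {
  n <- length(x)
  u <- rank(x) / (n + 1)                    # pseudo-observations from empirical ranks
  v <- rank(y) / (n + 1)
  optimize(clayton_nloglik, interval = c(1e-4, 20), u = u, v = v)$minimum
}

set.seed(2)
itm_exc      <- rexp(145)                   # hypothetical paired exceedances
invlapse_exc <- itm_exc + rexp(145)
fit_clayton_cml(itm_exc, invlapse_exc)      # estimated Clayton dependence parameter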
4.3 Simulation
With the bivariate distribution fully specified above the 85th percentile, we performed a Monte-Carlo
simulation and generated 500 ITM and lapse data pairs. We also did the same using the fitted Gumbel
copula as a comparison. Figure 5 shows the scatter plots of the simulated data.
Figure 5: Simulated data above the joint threshold at the 85th percentile from the fitted bivariate distribution. Left: data simulated by the fitted Clayton copula. Right: data simulated by the fitted Gumbel copula.
The plot on the left (Clayton) is more scattered than the one on the right (Gumbel). This is consistent with the asymptotic dependence characteristics of the two copulas. Recall that the Clayton copula is asymptotically independent, which means the dependency between the two random variables becomes weaker in the extreme tail. The Gumbel copula, on the contrary, still shows dependency in the extreme tail.
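A minimal base-R sketch of one way to carry out this simulation step (an illustration, not the author's code): uniform pairs are drawn from a Clayton copula via the Marshall-Olkin mixture construction and mapped through GP quantile functions for the two marginals; the parameter values below are illustrative, chosen close to those reported in Tables 2 and 3.

# Minimal sketch: simulate tail pairs from a Clayton copula with GP marginals.
rclayton <- function(n, alpha) {
  # Marshall-Olkin construction: V ~ Gamma(1/alpha), U = (1 + E/V)^(-1/alpha), E ~ Exp(1)
  v  <- rgamma(n, shape = 1 / alpha)
  e1 <- rexp(n); e2 <- rexp(n)
  cbind((1 + e1 / v)^(-1 / alpha),
        (1 + e2 / v)^(-1 / alpha))
}

qgp <- function(p, mu, sigma, shape) {
  # Generalized Pareto quantile function
  mu + sigma * ((1 - p)^(-shape) - 1) / shape
}

set.seed(3)
uv <- rclayton(500, alpha = 0.6)                                       # illustrative dependence parameter
itm_sim      <- qgp(uv[, 1], mu = 0.16, sigma = 0.26, shape = -0.45)   # illustrative GP parameters (ITM)
invlapse_sim <- qgp(uv[, 2], mu = 4.55, sigma = 1.99, shape = -0.16)   # illustrative GP parameters (1/lapse)
plot(itm_sim, invlapse_sim, xlab = "ITM", ylab = "1 / lapse rate")     # scatter of simulated tail pairs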
4.4 Regression analysis
Of all the statistical techniques that actuaries use, regression is one of the favorites. It explains
dependence between variables reasonably well when enough data is distributed around the mean. It
breaks down where data is sparse or randomness is too high. Nonetheless, the regressed function
offers a very simple way of describing a dependence structure. In fact, it is commonly used in actuarial
modeling such as in deriving dynamic lapse functions used to link a market variable to a multiplicative
factor of the lapse rate.
We applied a common regression technique called Generalized Linear Models (GLM) to regress the
combined simulated data above the 85th percentile threshold and the raw data. We did the same for
the raw data alone as a comparison.
For the GLM regression, we chose the Poisson family with a log link, so that the regressed function takes the form

multiplicative factor = exp(α + β · AV/Guar),

where α and β are parameters to estimate, the multiplicative factor is applied to the base lapse rate, and AV/Guar is the ratio of account value to the perceived guarantee value, defined as the present value of all future payments. The regressed functions are plotted in Figure 6.
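A minimal base-R sketch of this regression step (an illustration, not the author's code); `av_guar` and `lapse_mult` are hypothetical stand-ins for the combined raw-plus-simulated data. A quasi-Poisson family is used here because the multiplier is not integer-valued; it fits the same log-link mean structure exp(α + β · AV/Guar).

# Minimal sketch: log-link GLM for the dynamic lapse multiplier.
set.seed(4)
av_guar    <- runif(600, 0.2, 1.2)                               # hypothetical AV/Guar ratios
lapse_mult <- exp(-1 + 1.5 * av_guar + rnorm(600, 0, 0.2))       # hypothetical lapse-rate multipliers

fit <- glm(lapse_mult ~ av_guar, family = quasipoisson(link = "log"))
coef(fit)                                                        # alpha (intercept) and beta (slope)

# Fitted dynamic lapse function: multiplier = exp(alpha + beta * AV/Guar)
dynamic_lapse <- function(r) exp(coef(fit)[1] + coef(fit)[2] * r)
dynamic_lapse(0.25)      # e.g. multiplier when account value is 25% of the guarantee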
Figure 6: regression analysis using simulated data
When the raw data is combined with the simulated data generated by our chosen Clayton copula, the
resulting dynamic lapse function has the highest value, suggesting it will reduce the base lapse rate the
least. This implies, for example, when account value is only 25% of the guaranteed value, the lapse rate
would be about 25% of the base. Repeating the same for the combined raw and Gumbel copula
simulated data results in a slightly lower dynamic lapse function. This is consistent with the asymptotic
dependence characteristic of the two copulas: the Clayton copula is predicting less dependence
between ITM and lapse rate in the extreme tail.
If we analyze only the raw data, a dynamic lapse function that produces a lower multiplier would
emerge. Because the data in the extreme tail is scarce, the regression relies on the data in the +/- 20%
ITM range to define the influence of ITM over lapse rate. The low dynamic lapse function in the extreme
tail is just an extension of that strong dependence observed in the +/- 20% ITM data range. However,
our modeling using EVT suggests that the dependence is not necessarily that strong in the extreme tail.
Perhaps it is consistent with the belief that no matter how great the perceived value of the guarantee,
there will always be some policyholders who elect to surrender the policy due to other considerations.
Using the simulated data, a more accurate approach would be to model lapse stochastically. Interested
readers can test if that will yield materially different results than the regression approach.
5 Summary and Conclusion
To the insurance industry, how policyholders behave under extreme market conditions has remained
largely unknown. Yet companies have to make assumptions about this behavior in order to price
products and determine reserves and capital. These assumptions vary widely from company to
company depending on how efficient they believe their policyholders would eventually be in prolonged
severe markets. In the end, only one future will emerge and getting it wrong can hurt companies’
ability to execute their business strategies. A long term risk such as this one often emerges very slowly,
which means that if we only monitor it in the rear view mirror, by the time we notice a changing trend, it may already be too late.
Actuaries and risk officers have an obligation to not only monitor the experience as it happens, but also
to employ forward looking tools to gain insight into an emerging trend and more importantly to advise
business leaders on how to mitigate the risk. This paper explores the extremal relationship of variable
annuity lapse rate and ITM as an example to illustrate the process of applying EVT to model the extremal
dependency between policyholder behavior and market variables. It also demonstrates the predictive
power of EVT by simulating the dependency in the extreme tail. The example suggests that the
dependency between ITM and lapse rate may not be as strong in the extreme tail as people might
expect. Although the conclusion of this analysis should not be generalized as it can be highly data
dependent, this paper shows EVT could reveal a different dynamic in the extreme tail than traditional
techniques.
With the newly collected data through the recent market cycles, the industry is in a position to
reexamine its understanding of extremal policyholder behavior. There could be profound implications
to a wide range of applications ranging from capital determination to pricing. This process need not be
limited to VA dynamic lapse study; it can be used in other behavior studies such as exploring VA
withdrawal behavior. However, as illustrated in section 4, the choice of the threshold is an important
consideration. One should exercise caution when applying EVT if the underlying data does not allow for
a good threshold selection.
ACKNOWLEDGMENT
The author wishes to thank Elizabeth Rogalin and Michael Slipowitz for providing valuable comments for
this paper.
REFERENCES
[1] Clayton, D. G. 1978. “A Model for Association in Bivariate Life Tables and its Application in Epidemiological Studies of Familial Tendency in Chronic Disease Incidence”, Biometrika 65: 141–51.
[2] Dupuis, D. J., and B. L. Jones. 2006. “Multivariate Extreme Value Theory and Its Usefulness in Understanding Risk”, North American Actuarial Journal 10(4).
[3] Embrechts, P., S. I. Resnick, and G. Samorodnitsky. 1999. “Extreme Value Theory as a Risk Management Tool”, North American Actuarial Journal 3(2): 30–41.
[4] Frank, M. J. 1979. “On the Simultaneous Associativity of F(x, y) and x + y − F(x, y)”, Aequationes Mathematicae 19: 194–226.
[5] Frees, E. W., J. F. Carrière, and E. Valdez. 1996. “Annuity Valuation with Dependent Mortality”, Journal of Risk and Insurance 63: 229–61.
[6] Gumbel, E. 1960. “Distributions des valeurs extrêmes en plusieurs dimensions”, Publications de l'Institut de statistique de l'Université de Paris 9: 171–73.
[7] Nelsen, R. 1999. “An Introduction to Copulas”, New York: Springer.
[8] Pickands, J. 1975. “Statistical Inference Using Extreme Order Statistics”, Annals of Statistics 3(1): 119–31.
[9] Sklar, A. 1959. “Fonctions de répartition à n dimensions et leurs marges”, Publ. Inst. Stat. Univ. Paris 8: 229–231.
[10] The Economist Intelligence Unit. 2011. “The Long View – Getting New Perspective on Strategic Risk”, http://www.businessresearch.eiu.com/long-view.html