Applications of optimal portfolio management
by
Dimitrios Bisias
Submitted to the Sloan School of Management
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Operations Research
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
September 2015
© Massachusetts Institute of Technology 2015. All rights reserved.
Author
Sloan School of Management
June 22, 2015
Certified by
Andrew W. Lo
Charles E. and Susan T. Harris Professor of Finance
Thesis Supervisor
Accepted by
Patrick Jaillet
Dugald C. Jackson Professor, Department of Electrical Engineering
and Computer Science
Co-director, Operations Research Center
Applications of optimal portfolio management
by
Dimitrios Bisias
Submitted to the Sloan School of Management
on June 22, 2015, in partial fulfillment of the
requirements for the degree of
Doctor of Philosophy in Operations Research
Abstract
This thesis revolves around applications of optimal portfolio theory.
In the first essay, we study the optimal portfolio allocation among convergence
trades and mean reversion trading strategies for a risk-averse investor who faces Value-
at-Risk and collateral constraints with and without fear of model misspecification.
We investigate the properties of the optimal trading strategy when the investor fully
trusts his model dynamics. Subsequently, we investigate how the optimal trading
strategy of the investor changes when he mistrusts the model. In particular, we
assume that the investor believes that the data will come from an unknown member
of a set of unspecified alternative models near his approximating model. The investor
believes that his model is a good approximation in the sense that the relative
entropy of the alternative models with respect to his nominal model is small. Concern
about model misspecification leads the investor to choose a robust optimal portfolio
allocation that works well over that set of alternative models.
In the second essay, we study how portfolio theory can be used as a framework for
making biomedical funding allocation decisions, focusing on the National Institutes
of Health (NIH). Prioritizing research efforts is analogous to managing an investment
portfolio. In both cases, there are competing opportunities to invest limited
resources, and expected returns, risk, correlations, and the cost of lost opportunities
are important factors in determining the return of those investments. Can we apply
portfolio theory as a systematic framework for making biomedical funding allocation
decisions? Does NIH manage its research risk in an efficient way? What are the
challenges and limitations of portfolio theory as a way of making biomedical funding
allocation decisions?
Finally, in the third essay, we investigate how risk constraints in portfolio
optimization and fear of model misspecification affect the statistical properties of the
market returns. Risk-sensitive regulation has become the cornerstone of international
financial regulations. How does this kind of regulation affect the statistical properties
of the financial market? Does it affect the risk premium of the market? What about
the volatility or the liquidity of the market?
Thesis Supervisor: Andrew W. Lo
Title: Charles E. and Susan T. Harris Professor of Finance
Acknowledgments
I would like to express my gratitude to my advisor and mentor, Professor Andrew
W. Lo, for his continuing support and advice over all the years I spent at MIT. His
immense knowledge in diverse research areas, enthusiasm, hard work, outstanding
leadership and motivation have been a source of inspiration. Working with him has
been an honor and privilege and I could not have imagined having a better advisor
and mentor for my Ph.D. study.
I would also like to thank the rest of my thesis committee: Professor Dimitri P.
Bertsekas, for comments that greatly improved this thesis and for his great books that
made me love the field of optimization in the first place, and Professor Leonid Kogan,
who provided insight and expertise that greatly assisted this research.
In addition, I would like to thank Dr. James F. Watkins, MD, for his invaluable
help, insights and contribution to the second part of this research.
Moreover, I would like to thank Dr. Paul Mende, Dr. Saman Majd and Dr. Eric
Rosenfeld, for whom I had the good fortune of serving as a teaching assistant in finance classes.
Paul’s experience in quantitative trading made me realize what career I would like to
follow and I am grateful for this.
Being part of MIT and in particular the ORC and LFE communities has been a
blessing and I consider myself very fortunate to be among very interesting and smart
people. I will always remember my years at MIT with nostalgia and joy, and I hope
that I will be able to express my gratitude many times in the future.
My life at MIT would not have been so complete and joyful without good lifelong
friends to spend time and have productive discussions with. In particular, I would
like to thank Nick Trichakis and his wife Lena, Christos and Elli Nicolaides, Markos
and Sophia Trichas, Thomas and Anastasia Trikalinos, the golden coach George
Papachristoudis and Gerry Tsoukalas.
Last but not least, I would like to thank my parents Giorgo and Roula and my
sister Katerina for their unconditional love and support. I owe them everything
and this thesis is dedicated to them.
Contents
1 Optimal trading of arbitrage opportunities under constraints 29
1.1 Literature review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.2 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
1.2.1 Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
1.2.2 Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.2.3 Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
1.2.4 Connection with Ridge and Lasso regression . . . . . . . . . . 46
1.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
1.3.1 Convergence trades . . . . . . . . . . . . . . . . . . . . . . . . 47
1.3.2 Mean reversion trading opportunities . . . . . . . . . . . . . . 56
1.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
2 Optimal trading of arbitrage opportunities under model misspecification 57
2.1 Literature review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.2 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
2.2.1 Alternative models representation . . . . . . . . . . . . . . . . 61
2.2.2 Model setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.3 Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
2.3.1 No fear of model misspecification . . . . . . . . . . . . . . . . 65
2.3.2 Fear of model misspecification no constraints . . . . . . . . . . 67
2.3.3 Fear of model misspecification with VaR and margin constraints 70
2.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
2.4.1 Convergence trades without constraints . . . . . . . . . . . . . 73
2.4.2 Mean reversion trading strategies without constraints . . . . . 78
2.4.3 Convergence trades with constraints . . . . . . . . . . . . . . . 92
2.4.4 Mean reversion trading strategies with constraints . . . . . . . 111
2.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
3 Estimating the NIH Efficient Frontier 131
3.1 NIH Background and Literature Review . . . . . . . . . . . . . . . . 132
3.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
3.2.1 Funding Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
3.2.2 Burden of Disease Data . . . . . . . . . . . . . . . . . . . . . 139
3.2.3 Applying Portfolio Theory . . . . . . . . . . . . . . . . . . . . 142
3.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
3.3.1 Summary Statistics . . . . . . . . . . . . . . . . . . . . . . . . 147
3.3.2 Efficient Frontiers . . . . . . . . . . . . . . . . . . . . . . . . . 148
3.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
4 Impact of model misspecification and risk constraints on market 157
4.1 Literature review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
4.2 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
4.2.1 Model setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
4.2.2 Varying constraints . . . . . . . . . . . . . . . . . . . . . . . . 161
4.2.3 Varying risk aversions . . . . . . . . . . . . . . . . . . . . . . 165
4.2.4 Varying constraints and risk aversions . . . . . . . . . . . . . . 168
4.2.5 Varying fear of model misspecification . . . . . . . . . . . . . 168
4.3 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
A Technical Notes 173
List of Figures
1-1 Ellipsoids. Ellipsoids of poor investment opportunities for N=2 con-
vergence trades at times t = 0.3, 0.6, 0.9. . . . . . . . . . . . . . . . . 42
1-2 Weights for the case of uncorrelated spreads and collateral
constraint. Weights for the case of uncorrelated spreads. . . . . . . 45
1-3 VaR constraints, positive correlations. Wealth distribution at t =
0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated
(ρ = 0.5) convergence trades, while facing VaR constraints (K=1).
Initial wealth is $100. . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
1-4 VaR constraints, negative correlations. Wealth distribution at
t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively cor-
related (ρ = −0.5) convergence trades, while facing VaR constraints
(K=1). Initial wealth is $100. . . . . . . . . . . . . . . . . . . . . . . 48
1-5 VaR constraints, positive correlations, tight constraints. Wealth
distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two
positively correlated (ρ = 0.5) convergence trades, while facing VaR
constraints (K=0.25). Initial wealth is $100. . . . . . . . . . . . . . . 49
1-6 VaR constraints, negative correlations, tight constraints. Wealth
distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two
negatively correlated (ρ = −0.5) convergence trades, while facing VaR
constraints (K=0.25). Initial wealth is $100. . . . . . . . . . . . . . . 49
1-7 Wealth evolution under VaR constraint. Typical path of the
wealth evolution for an investor investing in two convergence trades
using the same noise process for positive and negative correlation under
the VaR constraint. Initial wealth is $100. . . . . . . . . . . . . . . . 50
1-8 Relation between final wealth and frequency the VaR con-
straint binds. Final wealth is negatively correlated to the percentage
of time the constraints bind when the initial values of the convergence
trades are low. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
1-9 Margin constraints, positive correlations. Wealth distribution at
t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively cor-
related (ρ = 0.5) convergence trades, while facing margin constraints
(Collateral = 1). Initial wealth is $100. . . . . . . . . . . . . . . . . . 52
1-10 Margin constraints, negative correlations. Wealth distribution
at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively cor-
related (ρ = −0.5) convergence trades, while facing margin constraints
(Collateral = 1). Initial wealth is $100. . . . . . . . . . . . . . . . . . 52
1-11 Margin constraints, positive correlations, more collateral needed.
Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests
in two positively correlated (ρ = 0.5) convergence trades, while facing
margin constraints (Collateral = 2). Initial wealth is $100. . . . . . . 53
1-12 Margin constraints, negative correlations, more collateral needed.
Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests
in two negatively correlated (ρ = −0.5) convergence trades, while fac-
ing margin constraints (Collateral = 2). Initial wealth is $100. . . . . 53
1-13 Wealth evolution under margin constraint. Typical path of the
wealth evolution for an investor investing in two convergence trades
using the same noise process for positive and negative correlation under
the margin constraint. Initial wealth is $100. . . . . . . . . . . . . . . 54
1-14 Relation between final wealth and frequency the margin con-
straint binds. Final wealth is negatively correlated to the percentage
of time the constraints bind when the initial values of the convergence
trades are low. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
1-15 Positions evolution under VaR constraints. Typical path of the
positions in two convergence trading opportunities under VaR con-
straints. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
1-16 Positions evolution under margin constraints. Typical path of
the positions in two convergence trading opportunities under margin
constraints. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2-1 Partial derivative of the value function with respect to S for
a single convergence trade. VS as a function of time at S = 1
for different values of the robustness multiplier for a single convergence
trade. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
2-2 Distortion drift for a single convergence trade. Distortion drift
as a function of time at S = 1 for different values of the robustness
multiplier for a single convergence trade. . . . . . . . . . . . . . . . . 76
2-3 Distortion drift terms for a single convergence trade. Distor-
tion drift terms as a function of time at S = 1 for ν = 1 for a single
convergence trade. The first term corresponds to a positive distor-
tion drift that reduces the wealth of the investor since the investor is
shorting the spread, while the second term corresponds to a negative
distortion drift that points to worse investment opportunities. . . . . 77
2-4 Optimal weight of a single convergence trade. Weight of the
convergence trading strategy as a function of time at S = 1 for different
values of the robustness multiplier. . . . . . . . . . . . . . . . . . . . 77
2-5 Partial derivative of the value function with respect to S for
a single mean reversion trading strategy. VS as a function of
time at S = 1 for different values of the robustness multiplier. . . . . 80
2-6 Distortion drift for a single mean reversion trading strategy.
Distortion drift as a function of time at S = 1 for different values of
the robustness multiplier. . . . . . . . . . . . . . . . . . . . . . . . . . 81
2-7 Distortion drift terms for a single mean reversion trading
strategy. Distortion drift terms as a function of time at S = 1
for ν = 1. The first term corresponds to a positive distortion drift
that reduces the wealth of the investor, since the investor is shorting
the spread, while the second term corresponds to a negative distortion
drift that points to worse investment opportunities. . . . . . . . . . . 81
2-8 Optimal weight of a single mean reversion trading strategy.
Weight of the mean reversion trading strategy as a function of time at
S = 1 for different values of the robustness multiplier. . . . . . . . . . 82
2-9 Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 2. Weights of the mean reversion
trading strategies as a function of time at S1 = 1 and S2 = 2 for
different values of the robustness multiplier. The correlation coefficient
is ρ = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
2-10 Ratio of the optimal weights. Ratio of the optimal weights of the
mean reversion trading strategies as a function of time at S1 = 1 and
S2 = 2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
2-11 Partial derivative of the value function with respect to S1 and
S2 at S1 = 1 and S2 = 2 when ρ = 0. Partial derivative of the value
function with respect to S1 and S2 as a function of time at S1 = 1 and
S2 = 2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
2-12 Optimal weights of two positively correlated mean reversion
trading strategies for S1 = 1 and S2 = 2. Weights of the mean
reversion trading strategies as a function of time at S1 = 1 and S2 =
2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0.5. . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
2-13 Optimal weights of two negatively correlated mean reversion
trading strategies for S1 = 1 and S2 = 2. Weights of the mean
reversion trading strategies as a function of time at S1 = 1 and S2 =
2 for different values of the robustness multiplier. The correlation
coefficient is ρ = −0.5. . . . . . . . . . . . . . . . . . . . . . . . . . . 86
2-14 Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 1. Weights of the mean reversion
trading strategies as a function of time at S1 = 1 and S2 = 1 for
different values of the robustness multiplier. The correlation coefficient
is ρ = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
2-15 Ratio of the optimal weights at S1 = 1 and S2 = 1 when ρ = 0.
Ratio of the optimal weights of the mean reversion trading strategies
as a function of time at S1 = 1 and S2 = 1 for different values of the
robustness multiplier. The correlation coefficient is ρ = 0. . . . . . . 88
2-16 Partial derivative of the value function with respect to S1 and
S2 at S1 = 1 and S2 = 1 when ρ = 0. Partial derivative of the value
function with respect to S1 and S2 as a function of time at S1 = 1 and
S2 = 1 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
2-17 Optimal weights of two positively correlated mean reversion
trading strategies for S1 = 1 and S2 = 1. Weights of the mean
reversion trading strategies as a function of time at S1 = 1 and S2 =
1 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0.9. . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
2-18 Partial derivative of the value function with respect to S1 and
S2 at S1 = 1 and S2 = 1 when ρ = 0.9. Partial derivative of
the value function with respect to S1 and S2 as a function of time at
S1 = 1 and S2 = 1 for different values of the robustness multiplier.
The correlation coefficient is ρ = 0.9. . . . . . . . . . . . . . . . . . . 90
2-19 Ratio of the optimal weights at S1 = 1 and S2 = 1 when ρ = 0.9.
Ratio of the magnitude of the optimal weights of the mean reversion
trading strategies as a function of time at S1 = 1 and S2 = 1 for
different values of the robustness multiplier. The correlation coefficient
is ρ = 0.9. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
2-20 Optimal weights of two negatively correlated mean reversion
trading strategies for S1 = 1 and S2 = 1. Weights of the mean
reversion trading strategies as a function of time at S1 = 1 and S2 =
1 for different values of the robustness multiplier. The correlation
coefficient is ρ = −0.8. . . . . . . . . . . . . . . . . . . . . . . . . . . 91
2-21 Partial derivative of the value function with respect to S for
a single convergence trade when L = 0.1 and L = 100. VS
as a function of time at S = 1 for different values of the robustness
multiplier. The solid line is when L = 100 and the dotted line is for
L = 0.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
2-22 Partial derivative of the value function with respect to S for
a single convergence trade when L = 0.1. VS as a function of
time at S = 1 for different values of the robustness multiplier. The
collateral constraint is |F| ≤ 0.1. . . . . . . . . . . . . . . . . . . . . . 95
2-23 Optimal weight of a single convergence trade when L = 0.1.
Weight of the convergence trading strategy as a function of time at
S = 1 for different values of the robustness multiplier. The collateral
constraint is |F| ≤ 0.1. . . . . . . . . . . . . . . . . . . . . . . . . . . 95
2-24 Optimal weight of a single convergence trade when L = 1.
Weight of the convergence trading strategy as a function of time at
S = 1 for different values of the robustness multiplier. The collateral
constraint is |F| ≤ 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
2-25 Distortion drift for a single convergence trade when L = 0.1.
Distortion drift as a function of time at S = 1 for different values of
the robustness multiplier. The collateral constraint is |F| ≤ 0.1. . . . 96
2-26 Distortion drift for a single convergence trade when L = 1.
Distortion drift as a function of time at S = 1 for different values of
the robustness multiplier. The collateral constraint is |F| ≤ 1. . . . . 97
2-27 Distortion drift terms for a single convergence trade when
L = 0.1. Distortion drift terms as a function of time at S = 1 for
ν = 1 and L = 0.1. The first term corresponds to a positive distortion
drift that reduces the wealth of the investor and it is bounded above
due to the collateral constraint, while the second term corresponds to a
negative distortion drift that points to worse investment opportunities. 97
2-28 Distortion drift terms for a single convergence trade when
L = 1. Distortion drift terms as a function of time at S = 1 for ν = 1
and L = 1. The first term corresponds to a positive distortion drift
that reduces the wealth of the investor and it is bounded above due
to the collateral constraint, while the second term corresponds to a
negative distortion drift that points to worse investment opportunities. 98
2-29 Optimal weight of a single convergence trade when L = 0.1 and
L = 100. Weight of the convergence trading strategy as a function of
time at S = 1 for different values of the robustness multiplier. The
solid line is when L = 100 and the dotted line is for L = 0.1. . . . . . 98
2-30 Optimal weights of two uncorrelated convergence trades for
S1 = 1 and S2 = 2 when L = 0.5. Weights of the convergence trades
as a function of time at S1 = 1 and S2 = 2 for different values of the
robustness multiplier. The correlation coefficient is ρ = 0 and the rhs
of the VaR constraint is L = 0.5. . . . . . . . . . . . . . . . . . . . . 100
2-31 Value of the normalized wealth variance for two uncorrelated
convergence trades at S1 = 1 and S2 = 2 when L = 0.5. Value
of the normalized wealth variance for two uncorrelated convergence
trades as a function of time at S1 = 1 and S2 = 2 for different values
of the robustness multiplier. The correlation coefficient is ρ = 0 and
the rhs of the VaR constraint is L = 0.5. . . . . . . . . . . . . . . . . 101
2-32 Optimal weights of two uncorrelated convergence trades for
S1 = 1 and S2 = 2 when L = 0.05. Weights of the convergence trades
as a function of time at S1 = 1 and S2 = 2 for different values of the
robustness multiplier. The correlation coefficient is ρ = 0 and the rhs
of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . . . . . . 101
2-33 Value of the normalized wealth variance for two uncorrelated
convergence trades at S1 = 1 and S2 = 2 when L = 0.05. Value
of the normalized wealth variance for two uncorrelated convergence
trades as a function of time at S1 = 1 and S2 = 2 for different values
of the robustness multiplier. The correlation coefficient is ρ = 0 and
the rhs of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . 102
2-34 Optimal weights of two positively correlated convergence trades
for S1 = 1 and S2 = 2 when L = 0.05. Weights of the convergence
trades as a function of time at S1 = 1 and S2 = 2 for different values
of the robustness multiplier. The correlation coefficient is ρ = 0.5 and
the rhs of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . 103
2-35 Value of the normalized wealth variance for two positively cor-
related convergence trades at S1 = 1 and S2 = 2 when L = 0.05.
Value of the normalized wealth variance for two positively correlated
convergence trades as a function of time at S1 = 1 and S2 = 2 for dif-
ferent values of the robustness multiplier. The correlation coefficient
is ρ = 0.5 and the rhs of the VaR constraint is L = 0.05. . . . . . . . 104
2-36 Optimal weights of two negatively correlated convergence trades
for S1 = 1 and S2 = 2 when L = 0.05. Weights of the convergence
trades as a function of time at S1 = 1 and S2 = 2 for different values of
the robustness multiplier. The correlation coefficient is ρ = −0.5 and
the rhs of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . 104
2-37 Value of the normalized wealth variance for two negatively
correlated convergence trades at S1 = 1 and S2 = 2 when
L = 0.05. Value of the normalized wealth variance for two nega-
tively correlated convergence trades as a function of time at S1 = 1
and S2 = 2 for different values of the robustness multiplier. The cor-
relation coefficient is ρ = −0.5 and the rhs of the VaR constraint is
L = 0.05. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
2-38 Optimal weights of two uncorrelated convergence trades for
S1 = 1 and S2 = 1 when L = 0.05. Weights of the convergence trades
as a function of time at S1 = 1 and S2 = 1 for different values of the
robustness multiplier. The correlation coefficient is ρ = 0 and the rhs
of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . . . . . . 107
2-39 Value of the normalized wealth variance for two uncorrelated
convergence trades at S1 = 1 and S2 = 1 when L = 0.05. Value
of the normalized wealth variance for two uncorrelated convergence
trades as a function of time at S1 = 1 and S2 = 1 for different values
of the robustness multiplier. The correlation coefficient is ρ = 0 and
the rhs of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . 107
2-40 Optimal weights of two positively correlated convergence trades
for S1 = 1 and S2 = 1 when L = 0.05. Weights of the convergence
trades as a function of time at S1 = 1 and S2 = 1 for different values
of the robustness multiplier. The correlation coefficient is ρ = 0.8 and
the rhs of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . 108
2-41 Value of the normalized wealth variance for two positively cor-
related convergence trades at S1 = 1 and S2 = 1 when L = 0.05.
Value of the normalized wealth variance for two positively correlated
convergence trades as a function of time at S1 = 1 and S2 = 1 for dif-
ferent values of the robustness multiplier. The correlation coefficient
is ρ = 0.8 and the rhs of the VaR constraint is L = 0.05. . . . . . . . 109
2-42 Optimal weights of two negatively correlated convergence trades
for S1 = 1 and S2 = 1 when L = 0.05. Weights of the convergence
trades as a function of time at S1 = 1 and S2 = 1 for different values of
the robustness multiplier. The correlation coefficient is ρ = −0.8 and
the rhs of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . 109
2-43 Value of the normalized wealth variance for two negatively
correlated convergence trades at S1 = 1 and S2 = 1 when
L = 0.05. Value of the normalized wealth variance for two nega-
tively correlated convergence trades as a function of time at S1 = 1
and S2 = 1 for different values of the robustness multiplier. The cor-
relation coefficient is ρ = −0.8 and the rhs of the VaR constraint is
L = 0.05. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
2-44 Partial derivative of the value function with respect to S for
a single mean reversion trading strategy and a collateral con-
straint with L = 0.7. VS as a function of time at S = 1 for different
values of the robustness multiplier for L = 0.7. . . . . . . . . . . . . . 113
2-45 Distortion drift terms for a single mean reversion trading
strategy and a collateral constraint with L = 0.7. Distortion
drift terms as a function of time at S = 1 for ν = 2 and for L = 0.7.
The first term corresponds to a positive distortion drift that reduces
the wealth of the investor, since the investor is shorting the spread,
while the second term corresponds to a negative distortion drift that
points to worse investment opportunities. The first term is bounded
above due to the collateral constraint. . . . . . . . . . . . . . . . . . . 113
2-46 Optimal weight of a single mean reversion trading strategy
with a collateral constraint with L = 0.7. Weight of the mean
reversion trading strategy as a function of time at S = 1 for different
values of the robustness multiplier and for L = 0.7. . . . . . . . . . . 114
2-47 Partial derivative of the value function with respect to S for
a single mean reversion trading strategy with different collat-
eral constraints. VS as a function of time at S = 1 for different
values of the robustness multiplier and different collateral constraints.
The solid line is for L = 70 and the dotted line for L = 0.7. . . . . . . 114
2-48 Optimal weight of a single mean reversion trading strategy
with different collateral constraints. Weight of the mean reversion
trading strategy as a function of time at S = 1 for different values of
the robustness multiplier and different collateral constraints. The solid
line is for L = 70 and the dotted line for L = 0.7. . . . . . . . . . . . 115
2-49 Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 2 when L = 3. Weights of the
mean reversion trading strategies as a function of time at S1 = 1 and
S2 = 2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0 and the rhs of the VaR constraint is L = 3. . . . . 117
2-50 Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 2 when
L = 3. Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies as a function of time at S1 = 1 and
S2 = 2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0 and the rhs of the VaR constraint is L = 3. . . . . 118
2-51 Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 2 when L = 2. Weights of the
mean reversion trading strategies as a function of time at S1 = 1 and
S2 = 2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0 and the rhs of the VaR constraint is L = 2. . . . . 118
2-52 Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 2 when
L = 2. Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies as a function of time at S1 = 1 and
S2 = 2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0 and the rhs of the VaR constraint is L = 2. . . . . 119
2-53 Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 2 when L = 7. Weights of the
mean reversion trading strategies as a function of time at S1 = 1 and
S2 = 2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0 and the rhs of the VaR constraint is L = 7. . . . . 119
2-54 Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 2 when
L = 7. Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies as a function of time at S1 = 1 and
S2 = 2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0 and the rhs of the VaR constraint is L = 7.
2-55 Optimal weights of two positively correlated mean reversion
trading strategies for S1 = 1 and S2 = 2 when L = 7. Weights
of the mean reversion trading strategies as a function of time at S1 =
1 and S2 = 2 for different values of the robustness multiplier. The
correlation coefficient is ρ = 0.5 and the rhs of the VaR constraint is
L = 7.
2-56 Value of the normalized wealth variance for two positively
correlated mean reversion trading strategies at S1 = 1 and
S2 = 2 when L = 7. Value of the normalized wealth variance for two
positively correlated mean reversion trading strategies as a function
of time at S1 = 1 and S2 = 2 for different values of the robustness
multiplier. The correlation coefficient is ρ = 0.5 and the rhs of the
VaR constraint is L = 7.
2-57 Optimal weights of two negatively correlated mean reversion
trading strategies for S1 = 1 and S2 = 2 when L = 7. Weights
of the mean reversion trading strategies as a function of time at S1 =
1 and S2 = 2 for different values of the robustness multiplier. The
correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint is
L = 7.
2-58 Value of the normalized wealth variance for two negatively
correlated mean reversion trading strategies at S1 = 1 and
S2 = 2 when L = 7. Value of the normalized wealth variance for two
negatively correlated mean reversion trading strategies as a function
of time at S1 = 1 and S2 = 2 for different values of the robustness
multiplier. The correlation coefficient is ρ = −0.5 and the rhs of the
VaR constraint is L = 7.
2-59 Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 1 when L = 2. Weights of the
mean reversion trading strategies as a function of time at S1 = 1 and
S2 = 1 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0 and the rhs of the VaR constraint is L = 2.
2-60 Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 1 when
L = 2. Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies as a function of time at
S1 = 1 and S2 = 1 for different values of the robustness multiplier.
The correlation coefficient is ρ = 0 and the rhs of the VaR constraint
is L = 2.
2-61 Optimal weights of two positively correlated mean reversion
trading strategies for S1 = 1 and S2 = 1 when L = 2. Weights
of the mean reversion trading strategies as a function of time at S1 =
1 and S2 = 1 for different values of the robustness multiplier. The
correlation coefficient is ρ = 0.9 and the rhs of the VaR constraint is
L = 2.
2-62 Value of the normalized wealth variance for two positively
correlated mean reversion trading strategies at S1 = 1 and
S2 = 1 when L = 2. Value of the normalized wealth variance for two
positively correlated mean reversion trading strategies as a function
of time at S1 = 1 and S2 = 1 for different values of the robustness
multiplier. The correlation coefficient is ρ = 0.9 and the rhs of the
VaR constraint is L = 2.
2-63 Optimal weights of two negatively correlated mean reversion
trading strategies for S1 = 1 and S2 = 1 when L = 8. Weights
of the mean reversion trading strategies as a function of time at S1 =
1 and S2 = 1 for different values of the robustness multiplier. The
correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint is
L = 8.
2-64 Value of the normalized wealth variance for two negatively
correlated mean reversion trading strategies at S1 = 1 and
S2 = 1 when L = 8. Value of the normalized wealth variance for two
negatively correlated mean reversion trading strategies as a function
of time at S1 = 1 and S2 = 1 for different values of the robustness
multiplier. The correlation coefficient is ρ = −0.5 and the rhs of the
VaR constraint is L = 8.
3-1 NIH time series flowchart. Flowchart for the construction of NIH
appropriations time series. “NIH Approp.” denotes NIH appropria-
tions; “PHS Gaps” denotes Institute funding by the U.S. Public Health
Service; “Complete Approp.” denotes the union of these two series;
“FY Change” allows for the change in government fiscal years; “4Q
FY” time series refers to the resulting series in which all years are
treated as having four quarters of three months each.
3-2 Appropriations data. NIH appropriations in real (2005) dollars,
categorized by disease group.
3-3 YLL time series flowchart. Flowchart for the construction of years
of life lost (YLL) time series. “WONDER Chapter Age Group” refers
to a query to the CDC WONDER database at the chapter level, strati-
fied by age group at death; “US Pop.” is the United States population
from census data as expressed in the WONDER dataset; and “US
GDP” denotes U.S. gross domestic product.
3-4 YLL data. Panel (a): Raw YLL categorized by disease group. Panel
(b): Population-normalized YLL (with base year of 2005), categorized
by disease group. Both panels are based on data from 1979 to 2007.
3-5 Efficient frontiers. Efficient frontiers for (a) all groups except HIV
and AMS, γ = 0; (b) all groups except HIV and AMS, γ = 5; (c) all
groups except HIV and AMS without the dementia effect, γ = 0; and
(d) all groups except HIV and AMS without the dementia effect, γ = 5;
based on historical ROI from 1980 to 2003.
4-1 Price of the risky asset as a function of the aggregate market
supply under varying constraints. We assume that we have 5
agents with the same risk aversion coefficients. The red plot assumes
the same L = 30 for all the agents, while the blue assumes L to be
different across the agents L1 = 10, L2 = 20, L3 = 30, L4 = 40, L5 = 50.
4-2 Price of the risky asset as a function of the aggregate market
supply under tightening constraints. We assume that we have 5
agents with the same risk aversion coefficients. The blue plot assumes
L to be different across the agents L1 = 10, L2 = 20, L3 = 30, L4 =
40, L5 = 50 and the red assumes that each Li is reduced by 20%.
4-3 Price of the risky asset as a function of the aggregate market
supply with less variable constraints. We assume that we have 5
agents with the same risk aversion coefficients. The blue plot assumes
L to be different across the agents L1 = 10, L2 = 20, L3 = 30, L4 =
40, L5 = 50 and the red assumes that L1 = 20, L2 = 25, L3 = 30, L4 =
35, L5 = 40.
4-4 Price of the risky asset as a function of the aggregate market
supply with constraints and varying risk aversions. We assume
that we have 5 agents with same constraints but different risk aversion
coefficients. The blue plot assumes L = 30 for each agent, while the
red line assumes that the agents are unconstrained.
4-5 Price of the risky asset as a function of the aggregate market
supply with tightening constraints and varying risk aversions.
We assume that we have 5 agents with same constraints but different
risk aversion coefficients. The blue plot assumes L = 30 for each agent,
while the red line assumes that L = 20 for each agent.
List of Tables
3.1 IoM recommendations. 12 major recommendations of the 1998
Institute of Medicine panel in four large areas for improving the process
of allocating research funds.
3.2 ICD mapping. Classification of ICD-9 (1978–1998) and ICD-10 (1999–
2007) Chapters and NIH appropriations by Institute and Center to 7
disease groups: oncology (ONC); heart lung and blood (HLB); diges-
tive, renal and endocrine (DDK); central nervous system and sensory
(CNS) into which we placed dementia and unspecified psychoses to
create comparable series as there was a clear, ongoing migration noted
from NMH to CNS after the change to ICD-10 in 1999; psychiatric and
substance abuse (NMH); infectious disease, subdivided into estimated
HIV (HIV) and other (AID); maternal, fetal, congenital and pediatric
(CHD). The categories LAB and EXT are omitted from our analysis.
3.3 Return summary statistics. Summary statistics for the ROI of
disease groups, in units of years (for the lag length) and per-capita-
GDP-denominated reductions in YLL between years t and t+4 per
dollar of research funding in year t−q, based on historical ROI from
1980 to 2003.
3.4 ROI example. An example of the ROI calculation for HLB from 1986.
3.5 Portfolio weights. Benchmark, single- and dual-objective optimal
portfolio weights (in percent), based on historical ROI from 1980 to
2003.
Chapter 1
Optimal trading of arbitrage
opportunities under constraints
In financial economics, an arbitrage is an investment opportunity that is too good
to be true when there are no market frictions. In actual financial markets, however,
there are frictions, and even if arbitrage opportunities exist, investors may not
be able to fully exploit them because of the constraints they face.
We will explore two kinds of risky arbitrage opportunities when there are market
frictions. The first one is a case of a textbook arbitrage, a convergence trade strategy.
The second is a case of a statistical arbitrage, a mean reversion trading strategy. These
two strategies are two of the most popular trading strategies that hedge funds follow,
so studying them in detail when there are market frictions is a valuable exercise.
A convergence trade is a trading strategy consisting of long/short positions in two
similar assets: we buy the cheap asset, short the expensive asset, and wait
until the prices of the two assets converge, which we know will happen with certainty
at some particular time in the future. An example of this trade involves the difference
in price between the on-the-run and the most recent off-the-run security. An on-
the-run security is the most recently issued, and hence most liquid, of a periodically
issued security. Since an on-the-run security is more liquid it trades at a premium
to off-the-run securities [29]. A convergence trade involves taking a long position in
the most recent off-the-run security and shorting the on-the-run security. The on-
the-run will become off-the-run upon the issue of a newer security and then there
will be almost no difference between the two securities in our trade so their prices
will converge. Another example involves investing in Treasury STRIPS with identical
maturity dates but different prices.
A mean reversion trading strategy involves investing in an asset or a portfolio of
assets whose value is a mean reverting process. Since most price series in the equity
space follow a random walk, this strategy most commonly involves investing in a
portfolio of non-mean-reverting assets whose value is a stationary mean reverting series.
Price series that can be combined in such a way are called cointegrated. A
classic statistical arbitrage example is pairs trading, which is the first type of
algorithmic mean reversion trading strategy invented by institutional traders,
reportedly by the trading desk of Nunzio Tartaglia at Morgan Stanley [64]. The statistical
arbitrage pairs trading strategy bets on the convergence of the prices of two similar
assets whose prices have diverged without a fundamental reason for this.
These arbitrage opportunities are risky under market frictions. In particular the
first case is exposed to the “divergence risk”, i.e. the fact that the pricing differential
between the two similar assets can diverge arbitrarily far from 0 prior to its conver-
gence at some particular time in the future. The second case is exposed both to the
“divergence risk” and to the “horizon risk”, in other words the fact that the times at
which the spread will converge to its long run mean are uncertain.
We will explore the optimal portfolio allocation of a risk averse investor who
invests in N convergence trades or mean reverting trading strategies, while facing
constraints. In particular, we will study the optimal trading strategy when he faces
VaR constraints or collateral constraints. Risk sensitive regulation, such as the VaR
constraint, has lately become a central component of international financial regulations.
Collateral or margin constraints, where the investor has to have sufficient
wealth to secure the liabilities taken by short positions, have been ubiquitous in
financial transactions for centuries, and margin calls have been behind several crises,
including the LTCM debacle [54].
In the rest of this chapter, we first review the relevant literature. Then we
discuss the setup of the model and the constraints, find the optimal
trading strategy of the investor and finally explore the characteristics of this
optimal strategy.
1.1 Literature review
Merton studied the problem of optimal portfolio allocation in a continuous time set-
ting without any market frictions [59]. The optimal portfolio involves two terms: a
market timing term and a hedging demand term. The first term is a myopic term that
represents the optimal allocation if the investor cared at each time instant t only about
a horizon dt ahead. The second term represents the investor’s additional demand
due to the covariance of the wealth process with the attractiveness of the available
investment opportunities. Although Merton gives a general analytical solution, it
is expressed in terms of the partial derivatives of the value function, and additional
work is needed to derive the solution in terms of the model parameters. Additionally,
his analysis assumes that there are no market frictions.
Optimal trading of mean reversion trading strategies has been studied by both
Boguslavsky and Boguslavskaya [15] and Jurek and Yang [48]. They have found
analytical solutions for the optimal weight of a single mean reverting trading strategy
for risk averse CRRA investors. Their analysis is similar to the one in Kim and
Omberg [49], who assume that there is a risk free asset with a constant risk-free
rate and a single risky asset with a mean reverting risk premium, which implies
a mean reverting instantaneous Sharpe ratio. In all these cases the authors assume that
there are no market frictions whatsoever.
Longstaff and Liu [53] have studied the problem of optimal trading of a single
convergence trade under a margin constraint. For the single convergence trade case,
both VaR constraints and margin constraints collapse into the same constraint and the
problem is significantly easier. In addition, by studying only convergence trades they
have taken out one important dimension of risk, the horizon risk, keeping only the
divergence risk. Brennan and Schwartz [17] have also studied the problem of optimal
trading of a single convergence trade including transaction costs when the arbitrage
potential is restricted by position limits.
The literature is rich with papers that study the existence of an equilibrium in which
mispricings exist. This persistence of mispricings is typically attributed to
agency problems, frictions or some kind of risk. Unlike textbook arbitrages, which
generate riskless profits and require no capital commitments, exploiting real-world
mispricings requires the assumption of some kind of risk. Shleifer and Vishny [74]
emphasized that risks such as the uncertainty about when the pricing differential
will converge to 0 and the possibility of a divergence of the mispricing prior to its
elimination may play a role in limiting the size of positions that arbitrageurs are
willing to take, contributing to the persistence of the arbitrage in equilibrium. Basak
and Croitoru [6] also showed that arbitrage can persist in equilibrium when there are
frictions. They study dynamic models with log utility and heterogeneous beliefs in
the presence of margin requirements and other portfolio constraints.
With respect to the constraints, Basak and Shapiro [7] study the optimal
trading strategy of a risk averse investor who faces finite horizon VaR constraints
in a complete markets setting using the martingale representation approach [4]. Here
again there are no constraints on the optimal portfolio allocation at each time t;
there is only one constraint on the wealth at some finite horizon. Finally, Geanakoplos
[33] studies collateral constraints, how these determine an equilibrium leverage
and how this leverage changes over time, the so-called leverage cycles.
Let us now discuss the setup of the model and the constraints and find the
optimal trading strategy of the investor.
1.2 Analysis
We assume we have a risk averse investor maximizing the expected continuously
compounded rate of return or equivalently the expected logarithm of his final wealth
E(lnWT ). There are two cases to consider. In the first case, the investor can invest
in a risk-free asset and N non-redundant convergence trades, modeled as correlated
Brownian bridges. In the second case, the investor can invest in a risk-free asset
and N non-redundant mean reversion trading strategies, modeled as a multivariate
Ornstein-Uhlenbeck (OU) process. The investor faces two kinds of constraints: VaR
constraints or collateral constraints. We determine the optimal trading strategy and
its characteristics in both cases.
1.2.1 Models
As we mentioned already, a convergence trade is a trading strategy consisting of
long/short positions in two similar assets: we buy the cheap asset, short the
expensive asset, and wait until the prices of the two assets converge, which we
know will happen with certainty at some time in the future. The spread of the convergence
trade can be modeled as a Brownian bridge driven by K Brownian motions, which
has the property that the spread will converge to 0 almost surely at some determined
time in the future. The stochastic differential equation governing the spread of the
trade is given by:
\[
dS_t = -\frac{a S_t}{T - t}\,dt + \sum_{k=1}^{K} \sigma_k\,dZ_{kt} \tag{1.1}
\]
where $S_t$ is the spread of the trade, $a$ is a parameter controlling the rate of the mean
reversion to 0, $T$ is the horizon of the investor, which is also the time at which the
spread goes to 0 with probability 1, and $Z_t$ is a Brownian motion in $\mathbb{R}^K$. We can
see that the reversion to 0 grows stronger as $t \to T$. Therefore, the investment
opportunities get better as the spread gets larger and $t \to T$, since then the drift
term pushing the spread towards 0 gets larger.
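To make the behaviour of (1.1) concrete, the following is a minimal simulation sketch. It assumes an Euler-Maruyama discretization, which is a choice made for this illustration only; the function name `simulate_bridge_spread` and all parameter values are hypothetical and do not come from the thesis.

```python
import math
import random


def simulate_bridge_spread(s0, a, T, sigmas, n_steps, seed=0):
    """Euler-Maruyama path of dS = -a*S/(T - t) dt + sum_k sigma_k dZ_k.

    Illustrative sketch only: the discretization scheme, the function name
    and the parameter values are assumptions, not the thesis's code.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    s = s0
    path = [s]
    for j in range(n_steps):
        remaining = (n_steps - j) * dt  # T - t, never smaller than dt
        drift = -a * s / remaining
        # One independent Gaussian increment per driving Brownian motion.
        shock = math.sqrt(dt) * sum(sig * rng.gauss(0.0, 1.0) for sig in sigmas)
        s = s + drift * dt + shock
        path.append(s)
    return path


# Noise-free check with a = 1: the spread decays linearly to 0 at T.
path = simulate_bridge_spread(s0=1.0, a=1.0, T=1.0, sigmas=[0.0], n_steps=1000)
print(path[500], path[-1])
```

With the noise switched off and a = 1, the discretized spread decays linearly to 0 at T, which illustrates how the pull towards 0 strengthens as t approaches T. With noise and a = 1, the discretized path ends at 0 only up to the final shock.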
A mean reversion trading strategy involves investing in a stationary portfolio of
non-mean reverting assets, whose value is a mean reverting process. The value of
the portfolio can be modeled as an Ornstein-Uhlenbeck (OU) process. The stochastic
differential equation governing it is given by:
\[
dS_t = -\phi\,(S_t - \bar{S})\,dt + \sum_{k=1}^{K} \sigma_k\,dZ_{kt}
\]
In our case we have N of these mean reverting processes and we assume that they
are modeled as a multivariate Ornstein-Uhlenbeck process, which is defined by the
following stochastic differential equation:
\[
dS_t = -\Phi\,(S_t - \bar{S})\,dt + \sigma\,dZ_t \tag{1.2}
\]
Above, $\Phi$ is an N-by-N square transition matrix that characterizes the deterministic
portion of the evolution of the process, $\bar{S}$ is the vector representing the unconditional
mean of the process, $\sigma$ is an N-by-K matrix that drives the dispersion of the process
and $Z_t$ is a Brownian motion in $\mathbb{R}^K$.
The Ornstein-Uhlenbeck process has the nice property that its conditional
distribution is normal at all times, with mean equal to
\[
E_t[S_{t+\tau}] = \bar{S} + e^{-\Phi\tau}(S_t - \bar{S})
\]
and covariance matrix independent of $S_t$ [60]. We assume that $\Phi$ has eigenvalues with
positive real part, so that the conditional expectation approaches $\bar{S}$ as $\tau \to \infty$.
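The conditional mean above can be evaluated numerically. The sketch below is illustrative only: it assumes that $\Phi$ is diagonalizable so that the matrix exponential can be built from an eigendecomposition, and the function name `ou_conditional_mean` and the numerical values are hypothetical.

```python
import numpy as np


def ou_conditional_mean(S_t, S_bar, Phi, tau):
    """E_t[S_{t+tau}] = S_bar + exp(-Phi * tau) (S_t - S_bar).

    Assumes Phi is diagonalizable so that the matrix exponential can be
    built from its eigendecomposition (an assumption of this sketch).
    """
    S_t = np.asarray(S_t, dtype=float)
    S_bar = np.asarray(S_bar, dtype=float)
    eigvals, V = np.linalg.eig(np.asarray(Phi, dtype=float))
    # exp(-Phi*tau) = V diag(exp(-eigvals*tau)) V^{-1}
    decay = (V * np.exp(-eigvals * tau)) @ np.linalg.inv(V)
    return S_bar + np.real(decay @ (S_t - S_bar))


Phi = np.array([[1.0, 0.0], [0.0, 2.0]])   # hypothetical transition matrix
S_bar = np.array([0.5, -0.5])              # hypothetical long-run mean
S_now = np.array([1.5, 0.5])               # hypothetical current value
for tau in (0.0, 1.0, 10.0):
    print(tau, ou_conditional_mean(S_now, S_bar, Phi, tau))
```

Because both eigenvalues of this hypothetical $\Phi$ are positive, the printed means move from the current value towards the long-run mean as the horizon grows.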
The Ornstein-Uhlenbeck process captures the two important dimensions of risk
in all relative value trades: the “horizon risk”, in other words the fact that the
times at which the spread will converge to its long run mean are uncertain and the
“divergence risk”, i.e. the fact that the pricing differential can diverge arbitrarily far
from its long run mean prior to its convergence. The Brownian bridge captures only
the “divergence” risk, since by its definition we assume that the investor has perfect
information about the magnitude of the mispricing at some future date T, i.e. we
assume that the date T on which the mispricing will be eliminated is known ahead
with certainty.
1.2.2 Constraints
We consider two kinds of constraints: VaR and collateral constraints. Value-at-Risk
is a widely used statistical risk measure, adopted both by regulators
and the private sector. It is the cornerstone of the capital requirements in the
Basel regulations. Both the 1996 market risk amendment of the original 1988 Basel
accord and the Basel II regulations have been built on the notion of Value-at-Risk
[47]. The Value-at-Risk (VaR) at level α is defined as the threshold value such that the
probability of losses greater than the threshold is less than α. In our case we consider
instantaneous VaR constraints, which amount to imposing an upper bound on
the wealth volatility, since locally the diffusion processes have normal distributions.
Therefore, the instantaneous VaR constraints are given by:
\[
\theta^T \Sigma\,\theta \le L W^2
\]
where $\theta$ is an N-by-1 vector of positions, $\Sigma$ is the instantaneous covariance matrix of
the spreads, $L$ is some proportionality constant that determines the tightness of the
constraint and $W$ is the investor’s wealth.
Collateral or margin constraints have been ubiquitous in financial transactions for
centuries. Even Shakespeare in “The Merchant of Venice” points out the importance
of collateral, as Shylock charged Antonio no interest but asked for a
pound of flesh as collateral. The collateral constraints provide protection against
mark-to-market losses whenever an investor generates a liability by shorting an asset.
Therefore, they require that the investor’s wealth is bounded below by the collateral
necessary to secure the liabilities. They are given by:
\[
\sum_{i=1}^{N} \lambda_i\,|\theta_i| \le W
\]
where λi is the collateral necessary to secure the liability in spread i. In our work, each
unit of arbitrage should be understood as being relative to a fixed face or notional
amount and therefore each λi is a percentage of this fixed face value or notional
amount.
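Both constraints are simple to check for a given position vector. The sketch below is illustrative: the function names `satisfies_var` and `satisfies_margin` and all numbers are hypothetical, chosen only to show one feasible and one infeasible case.

```python
import numpy as np


def satisfies_var(theta, Sigma, L, W):
    """Instantaneous VaR constraint: theta' Sigma theta <= L * W**2."""
    theta = np.asarray(theta, dtype=float)
    return float(theta @ np.asarray(Sigma, dtype=float) @ theta) <= L * W ** 2


def satisfies_margin(theta, lam, W):
    """Collateral constraint: sum_i lam_i * |theta_i| <= W."""
    theta = np.asarray(theta, dtype=float)
    return float(np.sum(np.asarray(lam, dtype=float) * np.abs(theta))) <= W


# Hypothetical numbers: long one unit of spread 1, short one unit of spread 2.
theta = [1.0, -1.0]
Sigma = np.eye(2)
print(satisfies_var(theta, Sigma, L=3.0, W=1.0))
print(satisfies_margin(theta, lam=[0.4, 0.4], W=1.0))
print(satisfies_margin(theta, lam=[0.6, 0.6], W=1.0))
```

Note that the margin check uses absolute positions, so long and short legs both consume collateral, whereas the VaR check depends on the covariance between the spreads.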
1.2.3 Solution
Let us now find the optimal trading strategy of a risk averse investor who maximizes
the expected logarithm of his final wealth E(lnWT ). We consider two cases:
• The investor invests in the risk free asset and in N correlated convergence trades.
• The investor invests in the risk free asset and in N correlated mean reversion
trading strategies.
For both cases our analysis is similar. For both cases we have:
\[
W_t = \sum_{i=1}^{N} \theta_{it} S_{it} + \theta_{0t} B_{0t} \quad \forall t \in [0, T] \tag{1.3}
\]
where $\theta_{it}$ is the investor’s position in opportunity $i$ for $i = 1, \cdots, N$, $\theta_{0t}$ is the
investor’s position in the risk free asset, $S_{it}$ is the spread of the convergence trade or
the value of the mean reverting portfolio and $B_{0t}$ is the price of the risk free asset.
The process $\theta_t$ is adapted to the filtration generated by the Brownian motion $Z_t$.
The investor solves the following problem:
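\[
\begin{aligned}
\underset{\theta \in \Theta}{\text{maximize}} \quad & E(\ln W_T) \\
\text{subject to} \quad & dW_t = \sum_{i=1}^{N} \theta_{it}\,dS_{it} + \theta_{0t}\,dB_{0t} \\
& dS_t = \mu(S, t)\,dt + \sigma(S, t)\,dZ_t
\end{aligned} \tag{1.4}
\]
where $\Theta$ is the set of admissible trading strategies. Let us first define
$F_t = \theta_t / W_t \in \mathbb{R}^N$ for all $t \in [0, T]$.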
For the convergence trades case, investor’s wealth satisfies the following stochastic
differential equation:
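\[
dW_t = W_t r\,dt + \sum_{i=1}^{N} \theta_{it} S_{it}\left(-\frac{a_i}{T - t} - r\right)dt + \theta_t^T \sigma\,dZ_t
\]
\[
\frac{dW_t}{W_t} = r\,dt + \sum_{i=1}^{N} F_{it} S_{it}\left(-\frac{a_i}{T - t} - r\right)dt + F_t^T \sigma\,dZ_t
\]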
By applying Ito’s Lemma we have that:
\[
d(\ln W_t) = r\,dt + \sum_{i=1}^{N} F_{it} S_{it}\left(-\frac{a_i}{T - t} - r\right)dt - \frac{1}{2} F_t^T \Sigma F_t\,dt + F_t^T \sigma\,dZ_t
\]
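Therefore we have:
\[
\begin{aligned}
\ln W_T = \ln W_t &+ \int_t^T r_s\,ds \\
&+ \int_t^T \left( \sum_{i=1}^{N} F_{is} S_{is}\left(-\frac{a_i}{T - s} - r_s\right) - \frac{1}{2} F_s^T \Sigma F_s \right) ds \\
&+ \int_t^T F_s^T \sigma\,dZ_s
\end{aligned} \tag{1.5}
\]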
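Assuming a constant interest rate, we have:
\[
\begin{aligned}
E_t(\ln W_T) = \ln W_t &+ r(T - t) \\
&+ E_t\left( \int_t^T \left( \sum_{i=1}^{N} F_{is} S_{is}\left(-\frac{a_i}{T - s} - r\right) - \frac{1}{2} F_s^T \Sigma F_s \right) ds \right) \\
&+ E_t\left( \int_t^T F_s^T \sigma\,dZ_s \right)
\end{aligned} \tag{1.6}
\]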
For the mean reversion trading strategies case, investor’s wealth satisfies the following
stochastic differential equation:
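\[
dW_t = W_t r\,dt + \sum_{i=1}^{N} \theta_{it}\left(-\Phi_i^T (S_t - \bar{S}) - r S_{it}\right)dt + \theta_t^T \sigma\,dZ_t
\]
\[
\frac{dW_t}{W_t} = r\,dt + \sum_{i=1}^{N} F_{it}\left(-\Phi_i^T (S_t - \bar{S}) - r S_{it}\right)dt + F_t^T \sigma\,dZ_t
\]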
where $\Phi_i$ is the $i$-th row of the transition matrix $\Phi$. By applying Ito’s Lemma we
have that:
\[
d(\ln W_t) = r\,dt + \sum_{i=1}^{N} F_{it}\left(-\Phi_i^T (S_t - \bar{S}) - r S_{it}\right)dt - \frac{1}{2} F_t^T \Sigma F_t\,dt + F_t^T \sigma\,dZ_t
\]
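Therefore we have:
\[
\begin{aligned}
\ln W_T = \ln W_t &+ \int_t^T r_s\,ds \\
&+ \int_t^T \left( \sum_{i=1}^{N} F_{is}\left(-\Phi_i^T (S_s - \bar{S}) - r_s S_{is}\right) - \frac{1}{2} F_s^T \Sigma F_s \right) ds \\
&+ \int_t^T F_s^T \sigma\,dZ_s
\end{aligned} \tag{1.7}
\]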
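Assuming a constant interest rate, we have:
\[
\begin{aligned}
E_t(\ln W_T) = \ln W_t &+ r(T - t) \\
&+ E_t\left( \int_t^T \left( \sum_{i=1}^{N} F_{is}\left(-\Phi_i^T (S_s - \bar{S}) - r S_{is}\right) - \frac{1}{2} F_s^T \Sigma F_s \right) ds \right) \\
&+ E_t\left( \int_t^T F_s^T \sigma\,dZ_s \right)
\end{aligned} \tag{1.8}
\]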
Under the VaR constraints we have:
\[
F_t^T \Sigma F_t \le L < \infty \quad \forall t
\]
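Under the margin constraints we have:
\[
\sum_{i=1}^{N} \lambda_i\,|F_{it}| \le 1 \quad \forall t
\]
and therefore
\[
F_t^T \Sigma F_t = \sum_{i=1}^{N} \sum_{j=1}^{N} F_{it} F_{jt}\,\sigma_{ij}
\le \sum_{i=1}^{N} \sum_{j=1}^{N} \lambda_i \lambda_j\,|F_{it}|\,|F_{jt}|\,\frac{|\sigma_{ij}|}{\lambda_i \lambda_j} < C < \infty \quad \forall t
\]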
Therefore, for both cases and both constraints the integrand of the stochastic
integral belongs to $\mathcal{H}^2$, which is a sufficient condition for the stochastic integral to be
a martingale. Consequently, $E_t\left(\int_t^T F_s^T \sigma\,dZ_s\right)$ is equal to 0.
Maximizing $E_t(\ln W_T)$ is equivalent to maximizing the third term in equations
(1.6) and (1.8) for the two cases respectively. Let us now study in detail the solution for
both cases under both constraints.
VaR constraint
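Maximizing $E_t(\ln W_T)$ under the VaR constraint is equivalent to solving, for all $t$, the
following QCQP:
\[
\begin{aligned}
\text{minimize} \quad & F_t^T \mu_t + \frac{1}{2} F_t^T \Sigma F_t \\
\text{subject to} \quad & F_t^T \Sigma F_t \le L
\end{aligned} \tag{1.9}
\]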
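where
\[
\mu_t = \begin{pmatrix}
S_{1t}\left(\frac{a_1}{T - t} + r\right) \\
\vdots \\
S_{Nt}\left(\frac{a_N}{T - t} + r\right)
\end{pmatrix} \tag{1.10}
\]
for the convergence trades case and
\[
\mu_t = \begin{pmatrix}
\Phi_1^T (S_t - \bar{S}) + r S_{1t} \\
\vdots \\
\Phi_N^T (S_t - \bar{S}) + r S_{Nt}
\end{pmatrix} \tag{1.11}
\]
for the mean reversion trading strategies case.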
We can easily solve problem (1.9) by applying the KKT conditions or by geometry
(see Appendix). $F_t^{opt}$, $\lambda_t^{opt}$ are optimal iff they satisfy the following KKT
conditions ([10]):
• Primal feasibility: $(F_t^{opt})^T \Sigma F_t^{opt} \le L$
• Dual feasibility: $\lambda_t^{opt} \ge 0$
• Complementary slackness: $\lambda_t^{opt}\left((F_t^{opt})^T \Sigma F_t^{opt} - L\right) = 0$
• Minimization of the Lagrangean: $F_t^{opt} = \arg\min_{F_t} \mathcal{L}(F_t, \lambda_t^{opt})$
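By solving the KKT conditions (see Appendix for details) we find that:
\[
\theta_t^{opt} =
\begin{cases}
-\Sigma^{-1} \mu_t W_t & \text{if } \mu_t^T \Sigma^{-1} \mu_t \le L \\[6pt]
-\dfrac{\Sigma^{-1} \mu_t W_t}{\sqrt{\dfrac{\mu_t^T \Sigma^{-1} \mu_t}{L}}} & \text{if } \mu_t^T \Sigma^{-1} \mu_t \ge L
\end{cases}
\]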
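This is equivalently written as:
\[
\theta_t^{opt} = -\frac{\Sigma^{-1} \mu_t W_t}{1 + \lambda_t^{opt}}
\quad \text{where} \quad
1 + \lambda_t^{opt} = \max\left(1, \sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}}\right)
\]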
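The closed-form solution can be evaluated directly. The following sketch is illustrative only: the function name `optimal_var_position` and the inputs are hypothetical. It computes the position for one slack and one binding value of $L$ and reports the fraction of the VaR budget $L W^2$ that is used.

```python
import numpy as np


def optimal_var_position(mu, Sigma, L, W):
    """theta_opt = -Sigma^{-1} mu W / max(1, sqrt(mu' Sigma^{-1} mu / L))."""
    mu = np.asarray(mu, dtype=float)
    x = np.linalg.solve(np.asarray(Sigma, dtype=float), mu)  # Sigma^{-1} mu
    q = float(mu @ x)                                        # mu' Sigma^{-1} mu
    scale = max(1.0, np.sqrt(q / L))                         # equals 1 + lambda_opt
    return -x * W / scale


# Hypothetical inputs for N = 2 opportunities.
mu = np.array([1.0, 0.5])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
W = 2.0
for L in (4.0, 0.25):
    theta = optimal_var_position(mu, Sigma, L, W)
    used = float(theta @ Sigma @ theta) / (L * W ** 2)  # fraction of the VaR budget used
    print(L, theta, used)
```

When the constraint binds, the position is the unconstrained position scaled down by a common factor, so the direction of the trade is unchanged and only its size is reduced.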
Let us now discuss the properties of the solution in more detail. The investor has logarithmic
preferences. Therefore, he is a myopic optimizer: there is no hedging demand [59].
At each time t he looks dt ahead and decides how to trade in an optimal way. There
are two cases to consider:
• Case 1: At time $t$: $\mu_t^T \Sigma^{-1} \mu_t \le L$. In this case, the optimal solution is the
unconstrained myopic optimal solution, since it satisfies the VaR constraint.
For the convergence trades case, this is equivalent to the spread $S_t$ being in the
ellipsoid $E_t = \{S \mid S^T (A_t \Sigma^{-1} A_t) S \le L\}$ where
$A_t = \mathrm{diag}\left(\frac{a_1}{T - t} + r, \cdots, \frac{a_N}{T - t} + r\right)$.
The volume of the ellipsoid $E_t$ is shrinking as $t \to T$, since
$\mathrm{vol}(E_t) = \prod_{i=1}^{N} \left(\frac{a_i}{T - t} + r\right)^{-1} \sqrt{\det(\Sigma)}\;\mathrm{vol}(B(0, 1))$,
where $B(0, 1)$ is the unit sphere. Figure 1-1 shows this
shrinking ellipsoid at three time instants.
For the mean reversion trading strategies case, this is equivalent to the spread or
value of the trade being inside the convex set
$C = \{S \mid (S - \bar{S})^T \left((\Phi + rI)^T \Sigma^{-1} (\Phi + rI)\right)(S - \bar{S}) + 2 r \bar{S}^T \Sigma^{-1} (\Phi + rI)(S - \bar{S}) \le L - r^2 \bar{S}^T \Sigma^{-1} \bar{S}\}$,
which in the case of $r = 0$ is the ellipsoid
$C = \{S \mid (S - \bar{S})^T (\Phi^T \Sigma^{-1} \Phi)(S - \bar{S}) \le L\}$. If $\bar{S} = 0$ this
convex set is also an ellipsoid.
These ellipsoids characterize poor opportunities where the constraints are not
active. What constitutes poor investment opportunities changes over time for
the case of convergence trades, while it remains invariant for the mean reversion
trades case. For the case of convergence trades, the same spreads can initially be
considered poor investment opportunities, where the constraint does not bind and
the investor is more conservative, but after some time they can be considered
good opportunities, where the investor becomes more aggressive and the constraint
binds.
Informally, when the investment opportunities are poor, the spreads are more
likely to widen, which would lead to mark-to-market losses, and the investor
would not have sufficient wealth to take advantage of the better investment
opportunities and simultaneously satisfy the VaR constraints. Therefore, the investor
is more conservative.
• Case 2: At time $t$: $\mu_t^T \Sigma^{-1} \mu_t > L$. Now the unconstrained myopic optimal solution
does not satisfy the VaR constraint. This case is equivalent to the spread
$S_t$ being outside the shrinking ellipsoid $E_t$ for the convergence trades case or the
set $C$ for the mean reversion trades case. Now the investment opportunities are
good. The investor wants to invest in the unconstrained optimal trading strategy,
but due to the VaR constraint invests in the proportion of this optimal trading
strategy necessary to satisfy the VaR constraint.
[Figure: ellipsoids of poor investment opportunities for t = 0.3, 0.6, 0.9; horizontal axis: Spread 1, vertical axis: Spread 2.]
Figure 1-1: Ellipsoids. Ellipsoids of poor investment opportunities for N=2 convergence
trades at times t = 0.3, 0.6, 0.9.
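The shrinking of the ellipsoid $E_t$ can be checked for a fixed spread. The sketch below is illustrative: the function name `in_poor_opportunity_ellipsoid` and all parameter values are hypothetical and are not the ones behind Figure 1-1.

```python
import numpy as np


def in_poor_opportunity_ellipsoid(S, a, Sigma, L, r, t, T):
    """True when S' (A_t Sigma^{-1} A_t) S <= L, A_t = diag(a_i/(T - t) + r)."""
    S = np.asarray(S, dtype=float)
    A_diag = np.asarray(a, dtype=float) / (T - t) + r
    mu = A_diag * S                       # mu_t = A_t S for convergence trades
    return float(mu @ np.linalg.solve(np.asarray(Sigma, dtype=float), mu)) <= L


# Hypothetical parameters; the spread is held fixed while time advances.
S = [0.1, -0.05]
a, Sigma, L, r, T = [1.0, 1.0], np.eye(2), 0.5, 0.0, 1.0
for t in (0.3, 0.6, 0.9):
    print(t, in_poor_opportunity_ellipsoid(S, a, Sigma, L, r, t, T))
```

With these numbers the same spread lies inside the ellipsoid at the two earlier dates and outside it at the last date, which is the transition from a slack to a binding VaR constraint described in the two cases.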
Margin constraint
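Maximizing $E_t(\ln W_T)$ under the margin constraint is equivalent to solving, for all $t$, the
following convex program:
\[
\begin{aligned}
\text{minimize} \quad & F_t^T \mu_t + \frac{1}{2} F_t^T \Sigma F_t \\
\text{subject to} \quad & \sum_{i=1}^{N} \lambda_i\,|F_{it}| \le 1
\end{aligned} \tag{1.12}
\]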
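where
\[
\mu_t = \begin{pmatrix}
S_{1t}\left(\frac{a_1}{T - t} + r\right) \\
\vdots \\
S_{Nt}\left(\frac{a_N}{T - t} + r\right)
\end{pmatrix}
\]
for the convergence trades case and
\[
\mu_t = \begin{pmatrix}
\Phi_1^T (S_t - \bar{S}) + r S_{1t} \\
\vdots \\
\Phi_N^T (S_t - \bar{S}) + r S_{Nt}
\end{pmatrix}
\]
for the mean reversion trading strategies case.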
Let us apply the KKT conditions. $F_t^{opt}$, $\nu_t^{opt}$ are optimal iff they satisfy the KKT
conditions:
• Primal feasibility: $\sum_{i=1}^{N} \lambda_i\,|F_{it}^{opt}| \le 1$
• Dual feasibility: $\nu_t^{opt} \ge 0$
• Complementary slackness: $\nu_t^{opt}\left(\sum_{i=1}^{N} \lambda_i\,|F_{it}^{opt}| - 1\right) = 0$
• Minimization of the Lagrangean: $F_t^{opt} = \arg\min_{F_t} \mathcal{L}(F_t, \nu_t^{opt})$
This program cannot be solved analytically in general. Again there are two cases to
consider.
• Case 1: At time t, $\|\Lambda \Sigma^{-1} \mu_t\|_1 \le 1$, where $\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_N)$. In this case, the optimal solution is the unconstrained myopic optimal solution, since it satisfies the margin constraint.
For the convergence trades, this is equivalent to having at time t: $\|\Lambda \Sigma^{-1} A_t S_t\|_1 \le 1$, where $A_t = \mathrm{diag}\left(\frac{a_1}{T-t} + r, \cdots, \frac{a_N}{T-t} + r\right)$. In this case $S_t$ lies inside a "diamond" in N-dimensional space, which shrinks as t → T.
For the mean reversion trades, this is equivalent to having at time t: $\|\Lambda \Sigma^{-1} (\Phi(S_t - \bar{S}) + r S_t)\|_1 \le 1$.
Informally again, when the investment opportunities are poor, the spreads are more likely to widen, which would lead to mark-to-market losses. The investor would then not have sufficient wealth both to take advantage of the better investment opportunities and to post the collateral necessary to secure the liabilities.
• Case 2: At time t, $\|\Lambda \Sigma^{-1} \mu_t\|_1 > 1$, where $\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_N)$. Now the investment opportunities are good, the unconstrained myopic optimal solution does not satisfy the collateral constraint, and the constraint binds at the optimal solution.
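The two cases can be told apart numerically by evaluating the collateral used by the unconstrained myopic solution. The following is a minimal Python sketch for N = 2; the function names and the numerical inputs are illustrative assumptions, not values taken from the simulations in this chapter.

```python
def solve_2x2(Sigma, b):
    """Solve Sigma x = b for a 2x2 matrix using Cramer's rule."""
    (s11, s12), (s21, s22) = Sigma
    det = s11 * s22 - s12 * s21
    return [(s22 * b[0] - s12 * b[1]) / det,
            (s11 * b[1] - s21 * b[0]) / det]


def margin_usage(mu, Sigma, lam):
    """Return the unconstrained myopic weights F = -Sigma^{-1} mu and the
    collateral they would use, sum_i lam_i |F_i| = ||Lambda Sigma^{-1} mu||_1."""
    F = [-x for x in solve_2x2(Sigma, mu)]
    usage = sum(l * abs(f) for l, f in zip(lam, F))
    return F, usage


# Hypothetical inputs: two spreads with correlation 0.5 and unit volatilities.
Sigma = [[1.0, 0.5], [0.5, 1.0]]
lam = [1.0, 1.0]

_, usage_poor = margin_usage([0.3, 0.2], Sigma, lam)   # small |mu_t|
_, usage_good = margin_usage([2.0, -1.0], Sigma, lam)  # large |mu_t|

# usage <= 1: Case 1, the myopic solution is feasible and optimal.
# usage  > 1: Case 2, the margin constraint binds at the optimum.
print(usage_poor <= 1.0, usage_good > 1.0)
```

In Case 2 the sketch only detects that the constraint binds; finding the constrained optimum still requires solving the convex program (1.12).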
Uncorrelated opportunities
There is a special case when the trading opportunities are uncorrelated, where we
can solve analytically the KKT conditions (see Appendix for details). In that case
the optimal positions are given by:
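\[
\theta^{opt}_{it} = \frac{\mathrm{sign}(-\mu_{it}) \left( \left| \frac{\mu_{it}}{\lambda_i} \right| - \nu^{opt}_t \right)^{+}}{\sigma_i^2 / \lambda_i} \, W_t
\tag{1.13}
\]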
We observe the following:
• First, for the convergence trades, we short the spread when it is positive, as we would expect, and we are long the spread when it is negative. For the mean reversion trades, the sign of the position is the opposite of the sign of $\Phi_i^T (S_t - \bar{S}) + r S_{it}$.
• Second, if $|\mu_{it}|$ is high relative to the collateral $\lambda_i$, then the magnitude of the position is higher.
• Third, if the variability of the opportunity is high, the magnitude of the corresponding position is low.
• Finally, the most interesting property of the solution is that it has a cutoff value, the dual variable. If the absolute value of $\mu_{it}$ over the collateral $\lambda_i$ is greater than the dual variable, the position is different from zero; otherwise the position is 0.
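It is:
\[
\nu^{opt}_t = 0 \quad \text{if} \quad \sum_{i=1}^{N} \frac{|\lambda_i \mu_{it}|}{\sigma_i^2} \le 1
\]
and
\[
\nu^{opt}_t > 0 \quad \text{if} \quad \sum_{i=1}^{N} \frac{|\lambda_i \mu_{it}|}{\sigma_i^2} > 1
\]
The dual variable is 0 when the investment opportunities are poor. It is easy to see that when the margin constraint binds we have:
\[
\tilde{F}^{opt}_{it} = \frac{\mathrm{sign}(-\mu_{it}) \left( \left| \frac{\mu_{it}}{\lambda_i} \right| - \nu^{opt}_t \right)^{+}}{\sigma_i^2 / \lambda_i^2}
\quad \text{and} \quad \|\tilde{F}\|_1 = 1
\tag{1.14}
\]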
[Figure 1-2 image: bar chart titled "Weights in different arbitrage opportunities" for opportunities 1 to 6; vertical axis "Weights", from 0 to 1.]
Figure 1-2: Weights for the case of uncorrelated spreads and collateral constraint. Weights for the case of uncorrelated spreads.
In Figure 1-2 we can see an example of how we invest in different convergence trades when there is no correlation among them, with λ = 1 and volatilities equal to 1 for all the opportunities. The height of each bar is the absolute value of $\mu_{it}$, and we invest only in those spreads where $|\mu_{it}|$ is larger than the dual variable. If $\sum_{i=1}^{N} |\mu_{it}| < 1$, then the dual variable is 0, we invest in all the opportunities and the collateral constraint does not bind. If $\sum_{i=1}^{N} |\mu_{it}| > 1$, as in the figure, then the margin constraint binds, the dual variable is positive and we can find it as follows. We start with ν at the maximum $|\mu_{it}|$ and then reduce it until the sum of the weights is equal to 1, where each weight is the distance between the absolute value of $\mu_{it}$ and ν whenever $|\mu_{it}|$ exceeds ν, and 0 otherwise.
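The search for the dual variable described above can be sketched in a few lines of Python. This is a minimal illustration of equation (1.13) for uncorrelated opportunities, not the code used for the figures: the function name, the use of bisection in place of the "lower ν from the top" description, and the example inputs are all assumptions.

```python
def margin_weights(mu, lam, sigma2, iters=200):
    """Weights F_i = theta_i / W for uncorrelated opportunities under the
    margin constraint sum_i lam_i |F_i| <= 1, following equation (1.13)."""

    def collateral_used(nu):
        # sum_i lam_i |F_i| when the cutoff (dual variable) is nu
        return sum(l * l / s * max(abs(m) / l - nu, 0.0)
                   for m, l, s in zip(mu, lam, sigma2))

    if collateral_used(0.0) <= 1.0:
        nu = 0.0                      # constraint is slack
    else:
        # collateral_used is decreasing in nu and equals 0 at the largest
        # |mu_i / lam_i|, so bisection finds the level where it equals 1.
        lo, hi = 0.0, max(abs(m) / l for m, l in zip(mu, lam))
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if collateral_used(mid) > 1.0:
                lo = mid
            else:
                hi = mid
        nu = hi

    F = [(-1.0 if m > 0 else 1.0) * l / s * max(abs(m) / l - nu, 0.0)
         for m, l, s in zip(mu, lam, sigma2)]
    return F, nu


# Hypothetical example with lam_i = 1 and unit variances, as in Figure 1-2.
F, nu = margin_weights([0.9, -0.5, 0.1], [1.0] * 3, [1.0] * 3)
print(F, nu)
```

With these inputs the sum of $|\mu_{it}|$ is 1.5, so the constraint binds; the third opportunity falls below the cutoff and receives a zero weight.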
1.2.4 Connection with Ridge and Lasso regression
Before we explore further the properties and the results of the optimal trading strate-
gies, it would be interesting to digress for a while and see what connection there is
between our problems and the regularized regressions.
In the basic form of regularized regression, the goal is not only to have a good fit,
but also regression coefficients that are “small”. Two of the most common forms of
regularized regressions are the Ridge and Lasso regression.
Ridge regression shrinks the regression coefficients by imposing a penalty on their
size [42]. Equation 1.15 is one of the ways to write the Ridge problem.
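\[
\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2 \\
\text{subject to} \quad & \sum_{j=1}^{p} \beta_j^2 \le t
\end{aligned}
\tag{1.15}
\]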
The Ridge regression coefficients solution is similar to the optimal trading strategy followed by a risk averse investor with logarithmic preferences, who can choose among N diffusion processes and faces VaR constraints. In both cases we have proportional shrinkage, where all the weights are scaled down by a common factor.
Lasso regression is another common form of a regularized regression. It can be
used as a heuristic for finding a sparse solution. It does a kind of continuous subset
selection [16]. Equation 1.16 is one of the ways to write the Lasso problem.
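\[
\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2 \\
\text{subject to} \quad & \sum_{j=1}^{p} |\beta_j| \le t
\end{aligned}
\tag{1.16}
\]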
The Lasso regression coefficients solution is similar to the optimal trading strategy
followed by a risk averse investor with logarithmic preferences, who can choose among
N diffusion processes and faces margin constraints. Therefore, we can expect that in
this case we will have a sparse solution where the weights of several of the opportu-
nities will be 0.
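The contrast between the two kinds of shrinkage can be seen on a small example. The Python sketch below assumes an identity covariance matrix and $\lambda_i = 1$, so that the VaR-type program reduces to minimizing $F^T\mu + \frac{1}{2}F^TF$ subject to $F^TF \le L$ and the margin-type solution reduces to soft-thresholding at the dual variable ν. The function names, the value of L and the value of ν are illustrative assumptions.

```python
import math


def var_type_weights(mu, L):
    """Minimize F.mu + 0.5 F.F subject to F.F <= L (identity covariance).
    The solution rescales the unconstrained weights -mu by one factor."""
    norm = math.sqrt(sum(m * m for m in mu))
    scale = 1.0 if norm * norm <= L else math.sqrt(L) / norm
    return [-scale * m for m in mu]


def margin_type_weights(mu, nu):
    """Soft-thresholding at a given dual variable nu
    (identity covariance, lam_i = 1)."""
    return [(-1.0 if m > 0 else 1.0) * max(abs(m) - nu, 0.0) for m in mu]


mu = [0.9, -0.5, 0.1]                       # hypothetical drift terms
ridge_like = var_type_weights(mu, L=0.25)   # every weight stays non-zero
lasso_like = margin_type_weights(mu, nu=0.2)  # smallest weight is set to 0

print(ridge_like)
print(lasso_like)
```

The first vector is a multiple of the unconstrained weights, which mirrors the Ridge behaviour, while the second one drops the weakest opportunity entirely, which mirrors the Lasso behaviour.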
1.3 Results
Let us move on now to the results first for the convergence trades and then for the
mean reversion trading strategies.
1.3.1 Convergence trades
VaR constraints. We have simulated the optimal trading strategy for N = 2
correlated convergence trading opportunities under VaR constraints. We find the
following:
• It is often optimal for the investor to underinvest, i.e. not to bind the constraint.
• The investor typically experiences losses early before locking in a profit, as we can see in Figures 1-3, 1-4, 1-5, 1-6.
• Tighter constraints lead to less variability and less skewness in the distribution of wealth. They also lead to less final wealth, as we can see in Figures 1-5, 1-6.
• The wealth is higher when the opportunities hedge each other, as we can see in Figures 1-4, 1-6. This makes sense because when the constraints are binding we care more about losing money, which would surely lead to liquidation just when the investment opportunities are better, and therefore we prefer the opportunities to hedge each other. Figure 1-7 shows a typical path for the wealth evolution using the same noise process for positive and negative correlation under the VaR constraint. We clearly see this hedging effect, where negative correlation leads to more wealth.
• When the initial values of the convergence trades are low, the constraints bind for a small percentage of time and final wealth is negatively correlated to the percentage of time the constraints bind. Figure 1-8 shows this effect.
• The final portfolio wealth is highly positively skewed, as is evident in Figures 1-3, 1-4, 1-5, 1-6.
For all the simulations we used: σ1 = σ2 = 1, a1 = a2 = 1, S[0] = [1; 1], rf =
0.06, number of steps = 1000.
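A simulation of this kind can be sketched as follows. This is a minimal Euler discretization written for illustration, not the code behind the figures. It assumes that the VaR-constrained weights are obtained by rescaling the unconstrained log-optimal weights $-\Sigma^{-1}\mu_t$ whenever $F^T\Sigma F$ exceeds a bound L, and the value of L, the seed and the function name are hypothetical (the mapping between L and the parameter K in the figure captions is not specified here).

```python
import math
import random


def simulate_var_wealth(rho, L=1.0, a=1.0, sigma=1.0, r=0.06,
                        s0=(1.0, 1.0), w0=100.0, T=1.0,
                        n_steps=1000, seed=0):
    """Euler simulation of wealth for N = 2 convergence trades,
    dS_i = -a S_i / (T - t) dt + shocks with correlation rho,
    under the constraint F' Sigma F <= L, where F = theta / W."""
    rng = random.Random(seed)
    dt = T / n_steps
    s11 = s22 = sigma * sigma
    s12 = rho * sigma * sigma
    det = s11 * s22 - s12 * s12
    S, W, path = list(s0), w0, [w0]
    for k in range(n_steps):
        t = k * dt                      # drift evaluated at the step start
        A = a / (T - t) + r
        mu = [A * S[0], A * S[1]]       # mu_t = A_t S_t
        # Unconstrained log-optimal weights F = -Sigma^{-1} mu_t
        F = [-(s22 * mu[0] - s12 * mu[1]) / det,
             -(s11 * mu[1] - s12 * mu[0]) / det]
        # Rescale if the (assumed) VaR bound is violated
        q = s11 * F[0] ** 2 + 2 * s12 * F[0] * F[1] + s22 * F[1] ** 2
        if q > L:
            c = math.sqrt(L / q)
            F = [c * F[0], c * F[1]]
        # Correlated Gaussian shocks
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        dZ = [sigma * math.sqrt(dt) * z1,
              sigma * math.sqrt(dt) * (rho * z1 + math.sqrt(1 - rho * rho) * z2)]
        dS = [-a * S[i] / (T - t) * dt + dZ[i] for i in range(2)]
        # dW / W = r dt + F . (dS - r S dt)
        ret = r * dt + sum(F[i] * (dS[i] - r * S[i] * dt) for i in range(2))
        W *= 1.0 + ret
        S = [S[i] + dS[i] for i in range(2)]
        path.append(W)
    return path


# Same seed, hence the same underlying noise, for both correlations.
path_pos = simulate_var_wealth(rho=0.5)
path_neg = simulate_var_wealth(rho=-0.5)
print(path_pos[-1], path_neg[-1])
```

Repeating the call over many seeds and collecting the wealth at t = 0.25, 0.5, 0.75, 1 would give wealth distributions of the kind shown in the figures that follow, under the assumptions stated above.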
[Figure 1-3 image: four panels titled "Distribution of wealth" at Time 0.25, 0.5, 0.75 and 1, rho 0.5.]
Figure 1-3: VaR constraints, positive correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing VaR constraints (K=1). Initial wealth is $100.
[Figure 1-4 image: four panels at Time 0.25, 0.5, 0.75 and 1, rho −0.5, K 1.]
Figure 1-4: VaR constraints, negative correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5) convergence trades, while facing VaR constraints (K=1). Initial wealth is $100.
[Figure 1-5 image: four panels at Time 0.25, 0.5, 0.75 and 1, rho 0.5, K 0.25.]
Figure 1-5: VaR constraints, positive correlations, tight constraints. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing VaR constraints (K=0.25). Initial wealth is $100.
[Figure 1-6 image: four panels at Time 0.25, 0.5, 0.75 and 1, rho −0.5, K 0.25.]
Figure 1-6: VaR constraints, negative correlations, tight constraints. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5) convergence trades, while facing VaR constraints (K=0.25). Initial wealth is $100.
[Figure 1-7 image: wealth paths for rho 0.5 and rho −0.5; horizontal axis "Simulation step", vertical axis "Final wealth".]
Figure 1-7: Wealth evolution under VaR constraint. Typical path of the wealth evolution for an investor investing in two convergence trades using the same noise process for positive and negative correlation under the VaR constraint. Initial wealth is $100.
[Figure 1-8 image: horizontal axis "Frequency the constraint binds", vertical axis "Final wealth".]
Figure 1-8: Relation between final wealth and frequency the VaR constraint binds. Final wealth is negatively correlated to the percentage of time the constraints bind when the initial values of the convergence trades are low.
Margin constraints. We have also simulated the optimal trading strategy for N = 2 correlated convergence trading opportunities under margin constraints, using the same noise process as with the VaR constraints. The results are similar to the case of VaR constraints, as we see in Figures 1-9, 1-10, 1-11, 1-12, 1-13, with the following important differences:
• When the constraints bind, it is often the case that the position in one of the convergence trades is 0, i.e. we have less diversification and a sparse solution. Figure 1-15 shows a typical path of the positions in two convergence trading opportunities under VaR constraints, where we see that they tend to be different from 0. Figure 1-16 shows the evolution of the positions in two convergence trading opportunities under margin constraints for exactly the same asset processes as before. We clearly see that we often invest in only one position, as we expected due to the similarity of the positions with the Lasso regression coefficients.
• The final wealth is less skewed and smaller than in the case of VaR constraints.
[Figure 1-9 image: four panels at Time 0.25, 0.5, 0.75 and 1, rho 0.5, Collateral 1.]
Figure 1-9: Margin constraints, positive correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing margin constraints (Collateral = 1). Initial wealth is $100.
[Figure 1-10 image: four panels at Time 0.25, 0.5, 0.75 and 1, rho −0.5, Collateral 1.]
Figure 1-10: Margin constraints, negative correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5) convergence trades, while facing margin constraints (Collateral = 1). Initial wealth is $100.
[Figure 1-11 image: four panels at Time 0.25, 0.5, 0.75 and 1, rho 0.5, Collateral 2.]
Figure 1-11: Margin constraints, positive correlations, more collateral needed. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing margin constraints (Collateral = 2). Initial wealth is $100.
[Figure 1-12 image: four panels at Time 0.25, 0.5, 0.75 and 1, rho −0.5, Collateral 2.]
Figure 1-12: Margin constraints, negative correlations, more collateral needed. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5) convergence trades, while facing margin constraints (Collateral = 2). Initial wealth is $100.
[Figure 1-13 image: wealth paths for rho 0.5 and rho −0.5; horizontal axis "Simulation step", vertical axis "Final wealth".]
Figure 1-13: Wealth evolution under margin constraint. Typical path of the wealth evolution for an investor investing in two convergence trades using the same noise process for positive and negative correlation under the margin constraint. Initial wealth is $100.
[Figure 1-14 image: horizontal axis "Frequency the constraint binds", vertical axis "Final wealth".]
Figure 1-14: Relation between final wealth and frequency the margin constraint binds. Final wealth is negatively correlated to the percentage of time the constraints bind when the initial values of the convergence trades are low.
[Figure 1-15 image: positions in "Convergence trade 1" and "Convergence trade 2"; horizontal axis "Simulation step", vertical axis "Position".]
Figure 1-15: Positions evolution under VaR constraints. Typical path of the positions in two convergence trading opportunities under VaR constraints.
[Figure 1-16 image: positions in "Convergence trade 1" and "Convergence trade 2"; horizontal axis "Simulation step", vertical axis "Position".]
Figure 1-16: Positions evolution under margin constraints. Typical path of the positions in two convergence trading opportunities under margin constraints.
1.3.2 Mean reversion trading opportunities
We obtain similar results by simulating the optimal trading strategy for N = 2 correlated mean reversion trading opportunities under VaR and margin constraints. This makes sense since, without loss of generality, we have assumed that S̄ = 0 and that Φ is a diagonal matrix, which makes the mean reversion trades case similar to the convergence trades case. They differ in that the drift term of the convergence trades for the same spreads S gets better and better as t → T, while for the mean reversion trades it remains constant.
1.4 Conclusions
We explored the optimal portfolio allocation of a risk averse investor who invests in N convergence trades or mean reverting trading strategies while facing constraints. In particular, we studied the optimal trading strategy when he faces VaR constraints or collateral constraints. In both cases the optimal trading strategy is found by solving a convex program at each time t. We characterized the solution of this convex program and derived the properties of the optimal trading strategy. Throughout the chapter, we have assumed that the investor completely trusts his model and is certain about the dynamics of the opportunities he faces. What happens if the model is just an approximation? What happens if the investor believes that the opportunities' dynamics come from an unknown member of a set of unspecified models near an approximating model? Concern about model misspecification will change the optimal trading strategy of the investor, and this is the topic of the next chapter.
Chapter 2
Optimal trading of arbitrage opportunities under model misspecification
A decision maker maximizes a utility function subject to a model. Standard control
theory helps a decision maker to make optimal decisions when his model is correct.
Robust control theory helps him to make good decisions when his model is approxi-
mately correct. In this chapter we will use methods of robust control theory to find
the optimal portfolio allocation of a risk averse investor, who invests in convergence
trades or mean reverting trading strategies, and is not completely confident about
the dynamics of his models.
In particular, we assume that the investor believes that the data comes from
an unknown member of a set of alternative models near his nominal model. These
alternative models are statistically difficult to distinguish from the nominal model.
The investor believes that his model is a pretty good approximation in the sense that
the discrepancies between the alternative models and his nominal model are small.
We will use the relative entropy to characterize the discrepancies between different
models. Concern about model misspecification leads the investor to choose a trading
strategy that is robust over the alternative models.
Three questions come naturally at this point:
• What does it mean to have a robust trading strategy?
A robust trading strategy is a strategy that works well over the set of alternative
probability models. We evaluate the worst performance of a given strategy over
that set of alternative probability models and we pick the one that maximizes
this worst case performance. It is essentially a “max-min” problem, a two-player
game in which a maximizing player chooses the best response to a malevolent
player who can disturb the stochastic model within limits.
• Why would we be interested in a robust decision rule over alternative models?
Why don’t we take a Bayesian approach, where we put a prior distribution over
the set of alternative probability models?
This could be another approach, but this set of alternative models may be too
large or too difficult for the investor to come up with a well behaved, plausible
prior distribution. In addition, we might want our solution to work well over
any kind of prior distribution [41].
• Why do we use the relative entropy to measure the discrepancy between an
alternative and the nominal model?
There are other ways to measure discrepancies between alternative probability
models, like Prokhorov distance [9] but the relative entropy with respect to a
measure P has nice properties and it is more tractable. It is given by:
\[
D(Q) = \int \log\left( \frac{dQ}{dP} \right) dQ
\]
and it is a convex function of the measure Q.
In the rest of this chapter, we will first review the relevant literature. Then we will discuss the set of alternative models, the relative entropy and equivalent ways to formulate our problem of the optimal robust portfolio allocation of the investor. Subsequently, we will find the optimal robust trading strategy of the investor, and finally we will explore the characteristics of this robust strategy.
2.1 Literature review
Whittle [79], [80] has studied mathematical methods for answering the question of
how to make decisions when you don’t fully trust your model.
Lars Hansen and Thomas Sargent in [41] have studied how to make economic
decisions in the face of model misspecification by modifying and extending aspects
of robust control theory. Their work revolves mostly around the linear-quadratic
regulator framework, where there is a certainty equivalence principle that allows a
deterministic presentation of the control theory.
Gilboa and Schmeidler in [34] have studied the max-min expected utility problem
where the decision maker has multiple priors and maximizes his expected utility as-
suming that nature chooses a probability measure to minimize his expected utility.
The minimization is over a closed and convex set of finitely additive probability mea-
sures. Their axiomatic treatment views this set of non-unique priors as an expression
of the agent’s preferences and the priors are not cast as distortions to a nominal
model.
Lars Hansen et al. [40] have studied robust decision rules when the agent fears that
the data are generated by a statistical perturbation of an approximating model that
is either a controlled diffusion process or a control measure over continuous functions
of time. They describe how stochastic formulations of robust control “constraint
problems” can be viewed in terms of Gilboa and Schmeidler’s max-min expected
utility model. They show the connection between the penalty robust control problem
and the constraint robust control problem, two closely related problems and formulate
the Hamilton Jacobi Bellman equations for various two-player zero sum continuous
time games that are defined in terms of a Markov diffusion process. We extend their
framework to the problem of optimal robust trading rules for a risk averse investor
who does not trust his model dynamics, believes that his nominal model is a good
approximation to the real model and invests in arbitrage opportunities.
Fleming and Souganidis [77] present how the Bellman-Isaacs condition defines a
Bellman equation for a two-player zero-sum game in which both players decide at time
0 or recursively. In other words, they show that the freedom to exchange orders of
maximization and minimization guarantees that equilibria of games where the choices
are done under mutual commitment at time 0 and of games where the choices are
done sequentially by both agents coincide.
Anderson et al. [3] show how the set of perturbed models in our formulations
is difficult to distinguish statistically from the approximating model given a finite
sample of timeseries observations.
Jacobson [44] and Whittle [78] studied risk sensitive optimal control in the context
of discrete-time linear quadratic regulator decision problems. They showed how the
risk-sensitive control law can be computed by equivalently solving a robust penalty
problem.
We will now discuss first how to represent the alternative probability models over
which we want our decision rules to be robust and how relative entropy can be used
to describe their discrepancies from the nominal model. We will formulate two closely
related nonsequential problems and the corresponding recursive HJB equations and
finally we will find the optimal robust portfolio allocation for a risk averse investor who
is not confident about the dynamics of his models and wants to invest in convergence
trades or mean reversion trading strategies.
2.2 Analysis
In Chapter 1 we saw that in the case when there is no model misspecification, the
investor wants to find the optimal portfolio allocation that solves the following prob-
lem:
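\[
\begin{aligned}
\underset{\theta \in \Theta}{\text{maximize}} \quad & E(\ln W_T) \\
\text{subject to} \quad & dW_t = \theta_t \, dS_t + \theta_{0t} \, dB_t \\
& dS_t = \mu(S, t) \, dt + \sigma(S, t) \, dZ_t \\
& dB_t = rB \, dt
\end{aligned}
\tag{2.1}
\]
Here we have $S_t \in \mathbb{R}^N$ and we have studied the following special cases:
\[
dS_{it} = -\frac{a_i S_{it}}{T - t} \, dt + \sum_{k=1}^{K} \sigma_{ik} \, dZ_{kt}
\]
\[
dS_t = -\Phi(S_t - \bar{S}) \, dt + \sigma \, dZ_t
\]
and Θ is the set of admissible trading strategies:
\[
\Theta = \left\{ \theta \mid \theta^T \Sigma \theta \le L W^2 \right\}
\]
for the case of VaR constraints and
\[
\Theta = \left\{ \theta \;\Big|\; \sum_{i=1}^{N} \lambda_i |\theta_i| \le W \right\}
\]
for the case of the margin constraints.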
In this Chapter, the investor doubts his model dSt = µ(S, t)dt + σ(S, t)dZt. To
capture this doubt of the investor, we surround the approximating model with a cloud
of models that are statistically difficult to distinguish and we add a malevolent agent
who picks the worst possible model. The investor wants to find the optimal trading
strategy that solves the following problem:
\[
\max_{\theta \in \Theta} \; \min_{Q \in \mathcal{Q}} \; E_Q(\ln W_T)
\tag{2.2}
\]
where Θ is the set of admissible trading strategies and $\mathcal{Q}$ is the set of alternative probability models. Problem 2.2 fits the max-min expected utility model of Gilboa and Schmeidler [34], where $\mathcal{Q}$ is a set of multiple different priors. Let's now discuss how we represent the set of alternative probability models.
2.2.1 Alternative models representation
We use martingales to represent perturbations to the probability models and relative entropy to measure the discrepancy between our nominal model and the alternative models. To better understand our continuous time formulations, we digress for a while by borrowing an example from [41].
Let's consider a discrete time approximating model and its innovations $\epsilon_t$, which are i.i.d. Gaussian shocks. An alternative model alters the distribution of these shocks. We use martingales to represent distortions to the probabilities. Let $\hat{\pi}_t(\epsilon)$ be the alternative density of the shock $\epsilon_{t+1}$ based on date t information. Then the random variable $M_t = \prod_{j=1}^{t} m_j$, where $m_j = \frac{\hat{\pi}_{j-1}(\epsilon)}{\pi(\epsilon)}$ and $M_0 = 1$, is a martingale and is the ratio of the joint alternative density to the joint nominal density. We define the entropy of the alternative distribution associated with $M_t$ as the expected log-likelihood ratio with respect to the distorted distribution, $E(M_t \log M_t)$. It is always non-negative, and it is equal to 0 only when there is no distortion to the nominal distribution.
Similarly, in our continuous time formulations we will use martingales to represent distortions to the nominal probability model. We will construct an alternative model by replacing $Z_t$ in our model by $\hat{Z}_t + \int_0^t h_s \, ds$, where $\hat{Z}_t$ is a Brownian motion under the alternative measure Q and $h_t$ is an adapted process that models the distortion, such that the process
\[
\xi_t = \exp\left( \int_0^t h_s^T \, dZ_s - \frac{1}{2} \int_0^t h_s^T h_s \, ds \right)
\]
is a martingale. Therefore, the nominal model is misspecified by allowing the conditional mean of the shock vector in the alternative models to feed back arbitrarily on the history up to date t. Since $\xi_0 = 1$ we have that $E(\xi_t) = 1$. Since in addition $\xi_t > 0$, we can define a probability measure Q such that $Q(A) = E[1_A \xi_T]$; in other words, $\xi_T = \frac{dQ}{dP}$ is the Radon-Nikodym derivative of Q with respect to P, where the measures Q and P are equivalent. In fact, for any measure Q equivalent to P one can define a process $h_t$ so that the Radon-Nikodym derivative of Q with respect to P, $\frac{dQ}{dP}$, is given by the exponential martingale $\xi_T$. In this way, our distorted models are:
• For the convergence trades,
\[
dS_{it} = -\frac{a_i S_{it}}{T - t} \, dt + \sum_{k=1}^{K} \sigma_{ik} \left( d\hat{Z}_{kt} + h_{kt} \, dt \right)
\tag{2.3}
\]
• For the mean reversion trades,
\[
dS_t = -\Phi(S_t - \bar{S}) \, dt + \sigma \left( d\hat{Z}_t + h_t \, dt \right)
\tag{2.4}
\]
where $\hat{Z}_t$ is a Brownian motion under Q. Why is it a Brownian motion under Q? The answer lies in the Girsanov theorem [30], which states that if a process $h_t$ is such that $\xi_t$ is a martingale and $\xi_T = \frac{dQ}{dP}$ is the Radon-Nikodym derivative of Q with respect to P, then the process $\hat{Z}_t = Z_t - \int_0^t h_s \, ds$ is a Brownian motion under the measure Q. Therefore, we parameterize Q by the choice of the adapted drift distortion process $h_t$.
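The change of measure can be checked by simulation in the simplest setting. The Python sketch below assumes a scalar Brownian motion and a constant distortion h, in which case $\xi_T = \exp(h Z_T - \frac{1}{2} h^2 T)$. It estimates $E(\xi_T)$, which should be close to 1, and $E_Q(Z_T) = E(\xi_T Z_T)$, which should be close to $hT$, so that $Z_T - hT$ has mean zero under Q. The function name, the value of h, the sample size and the seed are illustrative assumptions.

```python
import math
import random


def girsanov_check(h=0.5, T=1.0, n_samples=50000, seed=1):
    """Monte Carlo estimates of E[xi_T] and E[xi_T * Z_T] under the
    nominal measure P, for a constant scalar drift distortion h."""
    rng = random.Random(seed)
    sum_xi = 0.0
    sum_xi_z = 0.0
    for _ in range(n_samples):
        z = rng.gauss(0.0, math.sqrt(T))         # Z_T under P
        xi = math.exp(h * z - 0.5 * h * h * T)   # exponential martingale at T
        sum_xi += xi
        sum_xi_z += xi * z
    return sum_xi / n_samples, sum_xi_z / n_samples


mean_xi, mean_z_under_q = girsanov_check()
print(mean_xi, mean_z_under_q)   # compare with 1 and h * T
```

Both estimates carry Monte Carlo error, so they match the theoretical values only approximately.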
As in the discrete time case, we measure the discrepancy between the measures Q and P by the relative entropy D(Q) (see Appendix for derivation),
\[
D(Q) = \int_0^T \frac{1}{2} E_Q\left[ h_t^T h_t \right] dt.
\tag{2.5}
\]
This is to be expected, since the relative entropy between a multivariate Gaussian distribution $N(\mu, I)$ and the multivariate standard normal distribution is $D(Q) = \frac{1}{2} \mu^T \mu$ (see Appendix for derivation) and $h_t \, dt$ is the conditional mean of the process $dZ_t$ under the alternative probability measure Q.
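For the reader's convenience, the Gaussian case can be verified in two lines. This is a short sketch, with $x$ denoting a draw from $Q = N(\mu, I)$ and $P = N(0, I)$ the standard normal distribution; the Appendix remains the reference for the full derivation.

```latex
\[
\log \frac{dQ}{dP}(x)
  = -\tfrac{1}{2}(x - \mu)^T (x - \mu) + \tfrac{1}{2} x^T x
  = \mu^T x - \tfrac{1}{2} \mu^T \mu ,
\]
\[
D(Q) = E_Q\!\left[ \log \frac{dQ}{dP} \right]
     = \mu^T E_Q[x] - \tfrac{1}{2} \mu^T \mu
     = \mu^T \mu - \tfrac{1}{2} \mu^T \mu
     = \tfrac{1}{2} \mu^T \mu .
\]
```

Applying this to the increment $dZ_t$, whose conditional mean under Q is $h_t \, dt$ and whose covariance is $I \, dt$, gives the integrand $\frac{1}{2} h_t^T h_t \, dt$ that appears in the expression for D(Q).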
To express the notion that the nominal model is a good approximation to the real model that generates the spread dynamics, we either restrain the alternative models by D(Q) ≤ η or we penalize them with the magnitude of the entropy.
2.2.2 Model setup
Having described the set of alternative distributions, we are ready to formulate the
problem a risk averse investor faces who distrusts his model dynamics. As in [40] we
define two closely related problems:
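• A multiplier robust control problem.
\[
\begin{aligned}
\max_{\theta \in \Theta} \min_{Q} \quad & E_Q(\ln W_T) + \nu D(Q) \\
\text{subject to} \quad & dW_t = \theta_t \, dS_t + \theta_{0t} \, dB_t \\
& dS_t = \mu(S, t) \, dt + \sigma(S, t) \left( d\hat{Z}_t + h_t \, dt \right) \\
& dB_t = rB \, dt \\
& d\xi_t = \xi_t h_t^T \, dZ_t
\end{aligned}
\tag{2.6}
\]
where $\xi_T = \frac{dQ}{dP}$ and D(Q) is given by equation 2.5. Here in essence there is an implicit restriction manifested by the nonnegative penalty parameter ν.
• A constrained robust control problem.
\[
\begin{aligned}
\max_{\theta \in \Theta} \min_{Q} \quad & E_Q(\ln W_T) \\
\text{subject to} \quad & dW_t = \theta_t \, dS_t + \theta_{0t} \, dB_t \\
& dS_t = \mu(S, t) \, dt + \sigma(S, t) \left( d\hat{Z}_t + h_t \, dt \right) \\
& dB_t = rB \, dt \\
& d\xi_t = \xi_t h_t^T \, dZ_t \\
& D(Q) \le \eta
\end{aligned}
\tag{2.7}
\]
where $\xi_T = \frac{dQ}{dP}$ and D(Q) is given by equation 2.5.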
In both cases the minimizing malevolent agent chooses the distortion process $h_t$ taking θ as given, and the maximizing investor chooses the optimal strategy taking $h_t$ as given. We index the family of multiplier robust control problems by ν and the family of constrained robust control problems by η. The two problems are related, since the robustness parameter ν can be interpreted as the Lagrange multiplier on the constraint D(Q) ≤ η. In fact, we can show that if V(ν) is the optimal value of the multiplier robust problem and K(η) is the optimal value of the constrained robust problem, then $K(\eta) = \max_{\nu \ge 0} \left[ V(\nu) - \nu \eta \right]$ [40]. Therefore we will only be interested in finding V(ν).
2.3 Solution
We will solve 2.6 by solving the corresponding Hamilton Jacobi Bellman (HJB) equa-
tion. We will solve the HJB equation for the case that there are no constraints in
the admissible trading strategies and when the trading strategies are constrained by
VaR or collateral considerations. But first let’s digress for a while and solve the HJB
equation for the case when there is no fear of model misspecification.
2.3.1 No fear of model misspecification
For now we assume that the investor completely trusts the dynamics of his models
dSt = µ(S, t)dt + σ(S, t)dZt. He chooses the trading strategy θ ∈ Θ that solves the
problem 2.1 where Θ is the set of admissible trading strategies:
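\[
\Theta = \left\{ \theta \mid \theta^T \Sigma \theta \le L W^2 \right\}
\]
for the case of VaR constraints and
\[
\Theta = \left\{ \theta \;\Big|\; \sum_{i=1}^{N} \lambda_i |\theta_i| \le W \right\}
\]
for the case of the margin constraints. In this case the HJB equation is:
\[
\max_{\theta \in \Theta} \Big\{ V_t + V_W \left( Wr + \theta^T (\mu(S, t) - r S_t) \right) + V_S^T \mu(S, t)
+ \tfrac{1}{2} V_{WW} \theta^T \Sigma \theta + V_{WS} \Sigma \theta + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) \Big\} = 0
\]
where $\Sigma = \sigma \sigma^T$ and V(W, S, t) is the value function of the investor, subject to the terminal condition V(W, S, T) = ln(W).
Due to the logarithmic preferences of the investor it is V(W, S, t) = ln(W) + H(S, t), therefore $V_{WS} = 0$, $V_W = \frac{1}{W}$ and $V_{WW} = -\frac{1}{W^2}$. We also define, for all t ∈ [0, T], $F_t = \theta_t / W_t \in \mathbb{R}^N$.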
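The HJB equation becomes:
\[
\max_{F \in \mathcal{F}} \Big\{ V_t + \left( r + F^T (\mu(S, t) - r S_t) \right) + V_S^T \mu(S, t)
- \tfrac{1}{2} F^T \Sigma F + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) \Big\} = 0
\]
where $\mathcal{F}$ is the set of admissible trading strategies:
\[
\mathcal{F} = \left\{ F \mid F^T \Sigma F \le L \right\}
\]
for the case of VaR constraints and
\[
\mathcal{F} = \left\{ F \;\Big|\; \sum_{i=1}^{N} \lambda_i |F_i| \le 1 \right\}
\]
for the case of the margin constraints.
The optimal trading strategy is the solution to the following convex problem
\[
\min_{F \in \mathcal{F}} \; F^T \left( -\mu(S, t) + r S_t \right) + \tfrac{1}{2} F^T \Sigma F
\tag{2.8}
\]
as we also proved with a different method in Chapter 1, where $\mu_t = -\mu(S, t) + r S_t$ and in particular it is:
\[
\mu_t = \begin{pmatrix} S_{1t}\left(\frac{a_1}{T-t} + r\right) \\ \vdots \\ S_{Nt}\left(\frac{a_N}{T-t} + r\right) \end{pmatrix}
\]
for the convergence trades case and
\[
\mu_t = \begin{pmatrix} \Phi_1^T (S_t - \bar{S}) + r S_{1t} \\ \vdots \\ \Phi_N^T (S_t - \bar{S}) + r S_{Nt} \end{pmatrix}
\]
for the mean reversion trading strategies case.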
2.3.2 Fear of model misspecification no constraints
In this section we assume that there are no constraints in the trading strategies fol-
lowed by the risk averse investor and the investor is not confident about the dynamics
of his models. The Hamilton Jacobi Bellman equation for the problem 2.6 is given
by:
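\[
\max_{\theta} \min_{h} \Big\{ V_t + V_W \left( Wr + \theta^T (\mu(S, t) - r S_t) \right) + V_S^T \mu(S, t)
+ \tfrac{1}{2} V_{WW} \theta^T \Sigma \theta + V_{WS} \Sigma \theta + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS})
+ V_W \theta^T \sigma h + V_S^T \sigma h + \frac{\nu}{2} h^T h \Big\} = 0
\]
where $\Sigma = \sigma \sigma^T$ and V(W, S, t) is the value function of the investor, subject to the terminal condition V(W, S, T) = ln(W). The malevolent agent picks the worst case distortion drift process $h_t$ and the investor maximizes against the worst case scenario. After defining, for all t ∈ [0, T], $F_t = \theta_t / W_t \in \mathbb{R}^N$, the HJB equation becomes:
\[
\max_{F} \min_{h} \Big\{ V_t + W V_W \left( r + F^T (\mu(S, t) - r S_t) \right) + V_S^T \mu(S, t)
+ \tfrac{1}{2} W^2 V_{WW} F^T \Sigma F + W V_{WS} \Sigma F + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS})
+ W V_W F^T \sigma h + V_S^T \sigma h + \frac{\nu}{2} h^T h \Big\} = 0
\]
The inner minimization problem is a convex quadratic problem. The first order conditions are:
\[
W V_W \sigma^T F + \sigma^T V_S + \nu h = 0
\]
\[
h = -\frac{\sigma^T (W V_W F + V_S)}{\nu}
\]
The optimal value of the inner minimization problem is:
\[
g(F) = -\frac{W^2 V_W^2 F^T \Sigma F + V_S^T \Sigma V_S + 2 W V_W F^T \Sigma V_S}{2\nu}
\]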
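Plugging this back into the HJB equation we have:
\[
\max_{F} \Big\{ V_t + W V_W \left( r + F^T (\mu(S, t) - r S_t) \right) + V_S^T \mu(S, t)
+ \tfrac{1}{2} W^2 V_{WW} F^T \Sigma F + W V_{WS} \Sigma F + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS})
- \frac{W^2 V_W^2 F^T \Sigma F + V_S^T \Sigma V_S + 2 W V_W F^T \Sigma V_S}{2\nu} \Big\} = 0
\]
Due to the logarithmic preferences of the investor it is V(W, S, t) = ln W + H(S, t), and in that case $V_W = \frac{1}{W}$, $V_{WW} = -\frac{1}{W^2}$, $V_{WS} = 0$, $V_S(W, S, t) = H_S(S, t)$, and the minimizing drift distortion $h = -\frac{\sigma^T (F + H_S)}{\nu}$ is independent of the wealth.
The HJB equation now becomes:
\[
\max_{F} \Big\{ V_t + r + F^T (\mu(S, t) - r S_t) + V_S^T \mu(S, t)
- \tfrac{1}{2} F^T \Sigma F + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS})
- \frac{F^T \Sigma F + V_S^T \Sigma V_S + 2 F^T \Sigma V_S}{2\nu} \Big\} = 0
\]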
The optimal trading strategy is the solution to the following convex quadratic problem:
\[
\text{maximize}\quad F^T\Big(\mu(S,t) - r S_t - \frac{\Sigma V_S}{\nu}\Big)
 - \frac{1}{2}\Big(1 + \frac{1}{\nu}\Big) F^T \Sigma F
\tag{2.9}
\]
The first order conditions are:
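\[
\begin{aligned}
\mu(S,t) - r S_t - \frac{\Sigma V_S(S_t,t)}{\nu} &= \Big(1 + \frac{1}{\nu}\Big) \Sigma F_t^{opt} \\
F_t^{opt} &= \frac{1}{1 + \frac{1}{\nu}}\, \Sigma^{-1}\Big(\mu(S,t) - r S_t - \frac{\Sigma V_S(S_t,t)}{\nu}\Big) \\
F_t^{opt} &= \frac{\nu}{\nu + 1}\, \Sigma^{-1}\big(\mu(S,t) - r S_t\big) - \frac{V_S(S_t,t)}{\nu + 1}
\end{aligned}
\]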
We clearly see that as ν → ∞ the optimal trading strategy converges to the one where
we have no fear of model misspecification. This is to be expected, since in this case the
problems 2.1 and 2.6 are equivalent. It is interesting to find the conditions under which
these weights are equal to the weights when there is no fear of model misspecification.
When there is a fear of model misspecification, the optimal weights are a convex
combination of $\Sigma^{-1}(\mu(S,t) - r S_t)$, i.e. the weights without model misspecification,
and $-V_S$. Therefore these weights are equal to the weights when there is no fear of model
misspecification when $V_S + F^{opt} = 0$, which is equivalent to $h_{min} = 0$. Of course this
is expected, since in that case there would be no distortion drift and the HJB equation
would be the same as in the benchmark case of no model misspecification.
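These properties are easy to verify numerically. The following sketch assumes
hypothetical values for `sigma`, the excess drift µ(S, t) − rS_t and V_S (in the thesis
V_S comes from the HJB equation, not from a fixed vector), and the function names are
ours. It checks the ν → ∞ limit and the condition V_S + F_opt = 0.

```python
import numpy as np

def robust_weights(Sigma, excess_drift, V_S, nu):
    """F_opt = nu/(nu+1) * Sigma^{-1} (mu - r S) - V_S / (nu+1)."""
    benchmark = np.linalg.solve(Sigma, excess_drift)
    return nu / (nu + 1.0) * benchmark - V_S / (nu + 1.0)

def distortion(sigma, F, V_S, nu):
    """Minimizing drift distortion h = -sigma^T (F + V_S) / nu (log utility)."""
    return -sigma.T @ (F + V_S) / nu

# Hypothetical inputs, for illustration only.
sigma = np.array([[1.0, 0.0], [0.5, 1.0]])
Sigma = sigma @ sigma.T
excess_drift = np.array([-2.0, -1.0])   # mu(S, t) - r S_t
V_S = np.array([0.4, 0.2])

benchmark = np.linalg.solve(Sigma, excess_drift)   # weights without fear of misspecification

F_robust = robust_weights(Sigma, excess_drift, V_S, nu=1.0)
F_limit = robust_weights(Sigma, excess_drift, V_S, nu=1e9)

# If V_S happens to equal -benchmark, the robust and benchmark weights coincide
# and the distortion drift vanishes.
F_equal = robust_weights(Sigma, excess_drift, -benchmark, nu=1.0)
h_equal = distortion(sigma, F_equal, -benchmark, nu=1.0)

print(F_robust, F_limit, F_equal, h_equal)
```

For ν = 1 the robust weights are the midpoint between the benchmark weights and −V_S,
which is the convex combination described above with both weights equal to 1/2.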
After plugging in the optimal trading strategy to the HJB equation, it becomes:
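\[
\begin{aligned}
& V_t + r + \tfrac{1}{2}\operatorname{trace}(\Sigma V_{SS})
  + \tfrac{1}{2}\,\frac{1}{1 + \frac{1}{\nu}}\,(\mu(S,t) - r S_t)^T \Sigma^{-1} (\mu(S,t) - r S_t) \\
& \quad + V_S^T\Big(\mu(S,t) - \frac{\mu(S,t) - r S}{\nu + 1}\Big)
  - \tfrac{1}{2}\,\frac{1}{\nu + 1}\, V_S^T \Sigma V_S = 0
\end{aligned}
\]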
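We can plug the optimal trading strategy into $h = -\sigma^T (W V_W F + V_S)/\nu$ to find that:
\[
\begin{aligned}
h_{min} &= -\frac{\sigma^T F^{opt} + \sigma^T V_S}{\nu} \\
h_{min} &= -\frac{\sigma^T \frac{1}{1 + \frac{1}{\nu}} \Sigma^{-1}\big(\mu(S,t) - r S_t - \frac{\Sigma V_S}{\nu}\big) + \sigma^T V_S}{\nu} \\
h_{min} &= -\frac{\sigma^T \big(\Sigma^{-1}(\mu(S,t) - r S_t) + V_S\big)}{\nu + 1}
\end{aligned}
\]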
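The last simplification can be made explicit. The short derivation below uses only the
closed form of the optimal weights and the logarithmic-utility identity W V_W = 1; it is
written out here as a worked step and shows where the factor ν + 1 comes from.

```latex
\begin{aligned}
F^{opt} + V_S
  &= \frac{\nu}{\nu+1}\,\Sigma^{-1}\big(\mu(S,t) - r S_t\big) - \frac{V_S}{\nu+1} + V_S
   = \frac{\nu}{\nu+1}\,\Big(\Sigma^{-1}\big(\mu(S,t) - r S_t\big) + V_S\Big), \\
h_{min}
  &= -\frac{\sigma^T\big(F^{opt} + V_S\big)}{\nu}
   = -\frac{\sigma^T\big(\Sigma^{-1}(\mu(S,t) - r S_t) + V_S\big)}{\nu+1}.
\end{aligned}
```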
We consider two cases:
• Convergence trades. The optimal trading strategy is given by:
\[
\theta_t^{opt} = -\Big[\frac{\nu}{\nu + 1}\, \Sigma^{-1} A_t S_t + \frac{V_S}{\nu + 1}\Big] W_t
\tag{2.10}
\]
where $A_t = \operatorname{diag}\big(\frac{a_1}{T-t} + r, \cdots, \frac{a_N}{T-t} + r\big)$. The optimal trading
strategy is a convex combination of the strategy without fear of model misspecification
and $-V_S$ with weights $\frac{\nu}{\nu+1}$ and $\frac{1}{\nu+1}$. From the symmetry of the problem we
have $H(S, t) = H(-S, t)$, from which we get $V_S(S_t, t) = -V_S(-S_t, t)$.
• Mean reversion trades. The optimal trading strategy is given by:
\[
\theta_t^{opt} = -\Big[\frac{\nu}{\nu + 1}\, \Sigma^{-1}\big(\Phi(S_t - \bar{S}) + r S_t\big) + \frac{V_S}{\nu + 1}\Big] W_t
\tag{2.11}
\]
The optimal trading strategy is again a convex combination of the strategy
without fear of model misspecification and $-V_S$ with weights $\frac{\nu}{\nu+1}$ and $\frac{1}{\nu+1}$.
For the special case where $\bar{S} = 0$ we have:
\[
\theta_t^{opt} = -\Big[\frac{\nu}{\nu + 1}\, \Sigma^{-1}\big((\Phi + rI) S_t\big) + \frac{V_S}{\nu + 1}\Big] W_t
\tag{2.12}
\]
2.3.3 Fear of model misspecification with VaR and margin constraints
In this section we assume that the investor is not confident about the dynamics of his
models and he faces either VaR or margin constraints. The Hamilton Jacobi Bellman
equation for the problem 2.6 is given by:
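\[
\begin{aligned}
\max_{\theta \in \Theta}\,\min_{h}\;\Big\{\, & V_t + V_W\big(W r + \theta^T(\mu(S,t) - r S_t)\big) + V_S^T \mu(S,t) \\
& + \tfrac{1}{2} V_{WW}\,\theta^T \Sigma \theta + V_{WS} \Sigma \theta
  + \tfrac{1}{2}\operatorname{trace}(\Sigma V_{SS})
  + V_W\,\theta^T \sigma h + V_S^T \sigma h + \tfrac{\nu}{2}\, h^T h \,\Big\} = 0
\end{aligned}
\]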
where $\Sigma = \sigma\sigma^T$ and $V(W, S, t)$ is the value function of the investor subject to
the terminal condition $V(W, S, T) = \ln(W)$. As previously, $\Theta$ is the set of admissible
trading strategies:
\[
\Theta = \big\{\theta \;\big|\; \theta^T \Sigma \theta \le L W^2 \big\}
\]
for the case of VaR constraints and
\[
\Theta = \Big\{\theta \;\Big|\; \sum_{i=1}^{N} \lambda_i |\theta_i| \le W \Big\}
\]
for the case of the margin constraints.
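Proceeding as in the previous case without constraints, we obtain the following HJB
equation:
\[
\begin{aligned}
\max_{F \in \mathcal{F}}\;\Big\{\, & V_t + \big(r + F^T(\mu(S,t) - r S_t)\big) + V_S^T \mu(S,t) \\
& - \tfrac{1}{2} F^T \Sigma F + \tfrac{1}{2}\operatorname{trace}(\Sigma V_{SS})
  - \frac{F^T \Sigma F + V_S^T \Sigma V_S + 2 F^T \Sigma V_S}{2\nu} \,\Big\} = 0
\end{aligned}
\]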
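where $\mathcal{F}$ is the set of admissible trading strategies:
\[
\mathcal{F} = \big\{F \;\big|\; F^T \Sigma F \le L \big\}
\]
for the case of VaR constraints and
\[
\mathcal{F} = \Big\{F \;\Big|\; \sum_{i=1}^{N} \lambda_i |F_i| \le 1 \Big\}
\]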
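for the case of the margin constraints. The optimal trading strategy is the solution
to the following convex problem:
\[
\begin{aligned}
\text{maximize}\quad & F^T\Big(\mu(S,t) - r S_t - \frac{\Sigma V_S}{\nu}\Big) - \frac{1}{2}\Big(1 + \frac{1}{\nu}\Big) F^T \Sigma F \\
\text{subject to}\quad & F \in \mathcal{F}
\end{aligned}
\tag{2.13}
\]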
We clearly see again that as ν → ∞ the optimal trading strategy converges to the one
where we have no fear of model misspecification as expected. Using a similar proof
as in Chapter 1, we can show (see Appendix) that in the case the investor faces VaR
constraints the optimal portfolio is given by:
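\[
F_t^{opt} =
\begin{cases}
\dfrac{1}{1 + \frac{1}{\nu}}\, \Sigma^{-1} \mu_t
  & \text{if } \mu_t^T \Sigma^{-1} \mu_t \le L\big(1 + \frac{1}{\nu}\big)^2 \\[2ex]
\dfrac{\Sigma^{-1} \mu_t}{\sqrt{\mu_t^T \Sigma^{-1} \mu_t / L}}
  & \text{if } \mu_t^T \Sigma^{-1} \mu_t \ge L\big(1 + \frac{1}{\nu}\big)^2
\end{cases}
\]
where $\mu_t = \mu(S,t) - r S_t - \frac{\Sigma V_S(S_t,t)}{\nu}$.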
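This is also equivalently written as:
\[
F_t^{opt} = \frac{1}{1 + \frac{1}{\nu} + \lambda}\, \Sigma^{-1} \mu_t
\]
where $1 + \frac{1}{\nu} + \lambda = \max\Big(1 + \frac{1}{\nu},\; \sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}}\Big)$.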
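The VaR-constrained solution amounts to rescaling the direction Σ⁻¹µ_t. As a minimal
sketch, the function below implements the max(·) form directly; the inputs `Sigma`,
`mu_adj` (standing for µ_t, which in the thesis already contains the −ΣV_S/ν adjustment),
`nu` and `L` are hypothetical values chosen for illustration.

```python
import numpy as np

def var_constrained_weights(Sigma, mu_adj, nu, L):
    """F_opt = Sigma^{-1} mu_t / max(1 + 1/nu, sqrt(mu_t^T Sigma^{-1} mu_t / L))."""
    x = np.linalg.solve(Sigma, mu_adj)   # Sigma^{-1} mu_t
    q = mu_adj @ x                       # mu_t^T Sigma^{-1} mu_t
    scale = max(1.0 + 1.0 / nu, np.sqrt(q / L))
    return x / scale

# Hypothetical inputs, for illustration only.
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
mu_adj = np.array([-2.0, -1.0])
nu = 1.0

F_loose = var_constrained_weights(Sigma, mu_adj, nu, L=100.0)  # constraint slack
F_tight = var_constrained_weights(Sigma, mu_adj, nu, L=0.25)   # constraint binds

print(F_loose, F_loose @ Sigma @ F_loose)
print(F_tight, F_tight @ Sigma @ F_tight)
```

When the constraint is slack the weights coincide with the unconstrained robust
solution; when it binds, the portfolio variance F'ΣF equals L and only the direction of
Σ⁻¹µ_t is retained.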
We consider two cases:
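• Convergence trades. The optimal trading strategy is the solution to the following
convex problem:
\[
\begin{aligned}
\text{minimize}\quad & F^T\Big(A_t S_t + \frac{\Sigma V_S}{\nu}\Big) + \frac{1}{2}\Big(1 + \frac{1}{\nu}\Big) F^T \Sigma F \\
\text{subject to}\quad & F \in \mathcal{F}
\end{aligned}
\tag{2.14}
\]
where $A_t = \operatorname{diag}\big(\frac{a_1}{T-t} + r, \cdots, \frac{a_N}{T-t} + r\big)$.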
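• Mean reversion trades. The optimal trading strategy is the solution to the
following convex problem:
\[
\begin{aligned}
\text{minimize}\quad & F^T\Big(\Phi(S_t - \bar{S}) + r S_t + \frac{\Sigma V_S}{\nu}\Big) + \frac{1}{2}\Big(1 + \frac{1}{\nu}\Big) F^T \Sigma F \\
\text{subject to}\quad & F \in \mathcal{F}
\end{aligned}
\tag{2.15}
\]
For the special case where $\bar{S} = 0$, we have the following problem:
\[
\begin{aligned}
\text{minimize}\quad & F^T\Big((\Phi + rI) S_t + \frac{\Sigma V_S}{\nu}\Big) + \frac{1}{2}\Big(1 + \frac{1}{\nu}\Big) F^T \Sigma F \\
\text{subject to}\quad & F \in \mathcal{F}
\end{aligned}
\tag{2.16}
\]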
2.4 Results
We will investigate how the optimal trading strategy changes as a result of mistrust
of the model dynamics. We will first study the case where we have no constraints
and then the case where we have VaR constraints.
2.4.1 Convergence trades without constraints
We consider the case where we have N = 1 arbitrage opportunity and there are no
constraints. We will study the case where N = 2 when we have constraints. Due to
the symmetry of S around 0 it suffices to study only what happens when S ≥ 0, since
the symmetry implies that the value function is an even function of the spread S and
its partial derivative with respect to the spread is an odd function of S for each t.
The optimal weight in the arbitrage opportunity is given by:
\[
F_t^{opt} = -\frac{1}{\sigma^2\big(1 + \frac{1}{\nu}\big)}\Big(\Big(\frac{a}{T - t} + r\Big) S_t + \frac{\sigma^2 V_S(S_t,t)}{\nu}\Big)
\tag{2.17}
\]
and the minimizing distortion drift is given by $h_{min} = -\sigma(F^{opt} + V_S)/\nu$. Comparing
to the case where there is no fear of model misspecification, we see that now the variance
is multiplied by a factor of $(1 + \frac{1}{\nu})$ and the drift increases by $\frac{\sigma^2 V_S(S_t,t)}{\nu}$.
When S is positive, one would think that there are three cases to consider, where
$A_t = \frac{a}{T-t} + r$:
• If $V_S > 0$ then $F < 0$. In this case there is a tradeoff between the two terms
in $h_{min}$. The first term $-\sigma F^{opt}/\nu$ corresponds to a positive distortion drift that
reduces the wealth of the investor since the investor is shorting the spread, while
the second term $-\sigma V_S/\nu$ corresponds to a negative distortion drift that points to
worse investment opportunities.
• If $-A_t S_t \nu/\sigma^2 < V_S < 0$ then $F < 0$ and both terms in $h_{min}$ correspond to
positive distortion drifts, with the first one reducing the wealth of the investor
and the second one pointing to worse investment opportunities.
• If $V_S < -A_t S_t \nu/\sigma^2$ then $F > 0$. There is now again a tradeoff between the two
terms in $h_{min}$. Now the first term corresponds to a negative distortion drift
reducing the wealth of the investor, since in this case the investor is long the
spread, and the second term corresponds to a positive distortion pointing to
worse investment opportunities.
A little more thought, though, excludes the last two cases: we expect the value
function V to be a non-decreasing function of S for nonnegative values of the spread,
since higher values of the spread S correspond to better investment opportunities.
That leads to non-negative values of $V_S$ for S ≥ 0.
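The sign pattern of the three cases can be reproduced with a few lines of code. The
sketch below evaluates the single-trade weight (2.17) and the two terms of the distortion
drift; the values of V_S passed in are hypothetical, picked only so that each case is hit
once, and the function name is ours.

```python
def convergence_trade_terms(S, V_S, t, a=0.01, r=0.0, sigma=1.0, nu=1.0, T=1.0):
    """Return (F_opt, h1, h2) for one convergence trade, with h_min = h1 + h2."""
    A = a / (T - t) + r
    F = -(A * S + sigma**2 * V_S / nu) / (sigma**2 * (1.0 + 1.0 / nu))
    h1 = -sigma * F / nu      # term driven by the position in the spread
    h2 = -sigma * V_S / nu    # term driven by the investment opportunities
    return F, h1, h2

# With S = 1, t = 0.5, nu = 1 the threshold -A*S*nu/sigma^2 equals -0.02.
case1 = convergence_trade_terms(S=1.0, V_S=0.10, t=0.5)    # V_S > 0
case2 = convergence_trade_terms(S=1.0, V_S=-0.01, t=0.5)   # -0.02 < V_S < 0
case3 = convergence_trade_terms(S=1.0, V_S=-0.05, t=0.5)   # V_S < -0.02

for name, (F, h1, h2) in (("case 1", case1), ("case 2", case2), ("case 3", case3)):
    print(name, F, h1, h2, h1 + h2)
```

Only the first case is consistent with V_S ≥ 0 for S ≥ 0; the other two are included
solely to confirm the signs stated in the list.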
The HJB equation is given by:
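\[
\begin{aligned}
& V_t + r + \tfrac{1}{2}\sigma^2 V_{SS}
  + \tfrac{1}{2}\,\frac{1}{1 + \frac{1}{\nu}}\Big(\frac{a}{T - t} + r\Big)^2 \frac{S^2}{\sigma^2} \\
& \quad + V_S\Big(-\frac{a}{T - t}\, S + \frac{\big(\frac{a}{T-t} + r\big) S}{\nu + 1}\Big)
  - \tfrac{1}{2}\,\frac{1}{\nu + 1}\, V_S^2 \sigma^2 = 0
\end{aligned}
\]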
We solve the HJB equation numerically using the method of finite differences [75].
In the following figures we have assumed that rf = 0, σ = 1, a = 0.01 and T = 1.
We observe the following:
• VS becomes larger and larger as t → T for each value of ν as we see in Figure 2-1
until some value close to the horizon where it starts going down. In addition,
VS is higher for higher values of ν.
• For very low values of ν the drift distortion hmin is positive and becomes larger
as t → T as we see in Figure 2-2. For higher values of ν the drift distortion
starts negative and after some point increases to positive values as t → T. As we
showed in the previous section, when hmin = 0 the optimal weight is equal to
the optimal weight in the case where there is no fear of model misspecification.
We can see this in Figure 2-4, where very close to the time at which hmin crosses
0, the optimal weight graph crosses the one for ν = 100.
• Figure 2-3 shows the two terms of the distortion drift for ν = 1. The first term
corresponds to a positive distortion drift that reduces the wealth of the investor
since the investor is shorting the spread, while the second term corresponds to
a negative distortion drift that points to worse investment opportunities. In
this tradeoff the first term is losing at the beginning, which makes the drift
distortion negative, but as t → T it increases at a fast rate, making the drift
distortion positive.
• The fact that VS becomes larger as t → T for each value of ν until a point close
to the horizon, in combination with the fact that the drift term a/(T − t) increases
in a hyperbolic way, leads to F becoming larger (in absolute value) as t → T
(Figure 2-4). The risk averse investor becomes more aggressive as the time to
T becomes smaller, despite the fact that as t → T the malevolent agent picks
a more adverse distortion drift (Figure 2-2). This is because the
improvement in the investment opportunities is so substantial that it dominates
the increase in the distortion drift.
• For very low values of ν the investor is more conservative than in the case without
model misspecification for all t. For higher values of ν the investor is more
aggressive at the beginning and becomes more conservative as t → T, compared
with the case without model misspecification (Figure 2-4). This is because at the
beginning the drift term a/(T − t) is very low compared with VS, making the total
drift term lower for large values of ν than for smaller values of ν, which leads to a
lower magnitude of weight. This situation changes as the time to the horizon T gets
smaller.
• As ν → ∞ the optimal weight in the strategy converges to the optimal weight
when there is no fear of model misspecification as we have argued before.
[Plot omitted. Title: "Vs as a function of time for S = 1"; horizontal axis: Time;
vertical axis: Vs; curves for ν = 0.01, 0.1, 1, 10, 100.]
Figure 2-1: Partial derivative of the value function with respect to S for a
single convergence trade. VS as a function of time at S = 1 for different values
of the robustness multiplier for a single convergence trade.

[Plot omitted. Title: "hmin as a function of time for S = 1"; horizontal axis: Time;
vertical axis: hmin; curves for ν = 0.01, 0.1, 1, 10, 100.]
Figure 2-2: Distortion drift for a single convergence trade. Distortion drift
as a function of time at S = 1 for different values of the robustness multiplier for a
single convergence trade.

[Plot omitted. Title: "Drift distortion and its two terms as a function of time for S = 1";
horizontal axis: Time; vertical axis: Distortion; curves: h, h1, h2.]
Figure 2-3: Distortion drift terms for a single convergence trade. Distortion
drift terms as a function of time at S = 1 for ν = 1 for a single convergence trade.
The first term corresponds to a positive distortion drift that reduces the wealth of the
investor since the investor is shorting the spread, while the second term corresponds
to a negative distortion drift that points to worse investment opportunities.

[Plot omitted. Title: "Optimal weights as a function of time for S = 1"; horizontal axis:
Time; vertical axis: Weights; curves for ν = 0.01, 0.1, 1, 10, 100.]
Figure 2-4: Optimal weight of a single convergence trade. Weight of the
convergence trading strategy as a function of time at S = 1 for different values of the
robustness multiplier.
2.4.2 Mean reversion trading strategies without constraints
We consider first the case where we have N = 1 mean reversion trading strategy and
$\bar{S} = 0$, i.e.
\[
dS_t = -\phi S_t\, dt + \sigma\, dZ_t
\]
Due to the symmetry of S around 0 it suffices to study only what happens when
S ≥ 0, since the symmetry implies that the value function is an even function of the
spread S and its partial derivative with respect to the spread is an odd function of S
for each t.
The optimal weight in the trading strategy is given by:
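\[
\begin{aligned}
F_t^{opt} &= -\frac{1}{\sigma^2\big(1 + \frac{1}{\nu}\big)}\Big((\phi + r) S_t + \frac{\sigma^2 V_S(S_t,t)}{\nu}\Big) \\
&= -\frac{\nu}{\sigma^2(\nu + 1)}\,(\phi + r) S_t - \frac{V_S(S_t,t)}{\nu + 1}
\end{aligned}
\]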
and the minimizing distortion drift is given by:
\[
h_{min} = -\frac{\sigma(F^{opt} + V_S)}{\nu} = \frac{(\phi + r) S_t}{\sigma(\nu + 1)} - \frac{\sigma V_S}{\nu + 1}
\]
Comparing to the case where there is no fear of model misspecification, we see that
when the investor does not trust the model dynamics, the variance is multiplied by a
factor of $(1 + \frac{1}{\nu})$ and the drift increases by $\frac{\sigma^2 V_S(S_t,t)}{\nu}$. If $V_S$
is non-negative and decreases as a function of time t, then the increase in the drift
gets smaller and smaller and the investor gets more and more conservative as time
passes. In this case the distortion drift also gets larger and larger as time passes,
which explains why the investor gets more and more conservative.
When S is positive, we have $V_S \ge 0$, since higher values of S correspond to better
investment opportunities. If $V_S \ge 0$ then $F^{opt} \le 0$ for $S \ge 0$. Therefore we see that
there is a tradeoff between the two terms in $h_{min}$. The first term $-\sigma F^{opt}/\nu$ corresponds
to a positive distortion drift that reduces the wealth of the investor since the investor is
shorting the spread, while the second term $-\sigma V_S/\nu$ corresponds to a negative distortion
drift that points to worse investment opportunities.
The HJB equation is given by:
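\[
V_t + r + \tfrac{1}{2}\sigma^2 V_{SS}
 + \tfrac{1}{2}\,\frac{1}{1 + \frac{1}{\nu}}\,(\phi + r)^2 \frac{S^2}{\sigma^2}
 + V_S\Big(-\phi S + \frac{(\phi + r) S}{\nu + 1}\Big)
 - \tfrac{1}{2}\,\frac{1}{\nu + 1}\, V_S^2 \sigma^2 = 0
\]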
We solve the HJB equation numerically using the method of finite differences. In the
following figures we have assumed that rf = 0, σ = 1, φ = 1 and T = 1. We observe
the following:
• VS becomes smaller and smaller as t → T for each value of ν as we see in Figure
2-5. This makes sense, since as t → T there is less time to take advantage of
the mean reversion trading strategy. In addition, VS is higher for higher values
of ν.
• The drift distortion hmin is positive and becomes larger and larger as t → T
for each value of the robustness multiplier as we see in Figure 2-6. Figure 2-7
shows the two terms of the distortion drift for ν = 1. The first term corresponds
to a positive distortion drift that reduces the wealth of the investor since the
investor is shorting the spread, while the second term corresponds to a negative
distortion drift that points to worse investment opportunities. In this tradeoff
the first term wins making the drift distortion positive.
• The fact that VS becomes smaller and smaller as t → T for each value of ν leads
to F becoming smaller and smaller (in absolute value) as t → T (Figure 2-8).
In other words the risk averse investor becomes more conservative as the time
to T becomes smaller. This is to be expected, since as t → T the malevolent
agent picks a more adverse distortion drift (Figure 2-6) causing the investor to
be more cautious.
• The lower the value of ν the more conservative the investor is, as we see in Figure
2-8. This is because ν is the robustness multiplier and lower values of it put
less penalty on the distorting alternative distribution, leading to higher positive
drift distortions (Figure 2-6).
• As ν → ∞ the optimal weight in the strategy converges to the optimal weight
when there is no fear of model misspecification as we have argued before.
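The numerical solution can be sketched with an explicit finite-difference scheme that
marches backward from the terminal condition H(S, T) = 0, where V = ln W + H. The
discretization is not described in the text, so everything about the scheme below is an
assumption made for illustration: the grid sizes, the truncation of the spread to
[−4, 4], and the quadratic extrapolation at the two boundaries. The parameters φ = 1,
σ = 1, r = 0 and T = 1 are the ones stated above.

```python
import numpy as np

def solve_hjb_mean_reversion(nu, phi=1.0, r=0.0, sigma=1.0, T=1.0,
                             s_max=4.0, n_s=81, n_t=400):
    """Return (S grid, H(S, 0)) for the N = 1 mean reversion HJB equation."""
    S = np.linspace(-s_max, s_max, n_s)
    dS = S[1] - S[0]
    dt = T / n_t
    # Usual stability restriction of an explicit scheme for the diffusion term.
    assert sigma**2 * dt / dS**2 <= 0.5
    H = np.zeros(n_s)                                     # terminal condition
    Si = S[1:-1]
    for _ in range(n_t):
        H_S = (H[2:] - H[:-2]) / (2.0 * dS)               # central first derivative
        H_SS = (H[2:] - 2.0 * H[1:-1] + H[:-2]) / dS**2   # central second derivative
        rhs = (r + 0.5 * sigma**2 * H_SS
               + 0.5 * nu / (nu + 1.0) * (phi + r)**2 * Si**2 / sigma**2
               + H_S * (-phi * Si + (phi + r) * Si / (nu + 1.0))
               - 0.5 / (nu + 1.0) * sigma**2 * H_S**2)
        H[1:-1] = H[1:-1] + dt * rhs                      # H_t = -rhs, step t -> t - dt
        # Assumed boundary treatment: quadratic extrapolation from the interior.
        H[0] = 3.0 * H[1] - 3.0 * H[2] + H[3]
        H[-1] = 3.0 * H[-2] - 3.0 * H[-3] + H[-4]
    return S, H

def weight_at(S, H, s0, nu, phi=1.0, r=0.0, sigma=1.0):
    """V_S and the optimal weight at the grid point closest to s0."""
    V_S = np.gradient(H, S[1] - S[0])
    i = int(np.argmin(np.abs(S - s0)))
    F = -nu / (sigma**2 * (nu + 1.0)) * (phi + r) * S[i] - V_S[i] / (nu + 1.0)
    return V_S[i], F

for nu in (1.0, 10.0, 100.0):
    S, H = solve_hjb_mean_reversion(nu)
    V_S1, F1 = weight_at(S, H, 1.0, nu)
    print(f"nu = {nu:6.1f}:  V_S(1, 0) = {V_S1:.3f},  F_opt(1, 0) = {F1:.3f}")
```

An explicit scheme is chosen here only because it is short; an implicit or
Crank-Nicolson scheme would remove the time-step restriction. Under these assumptions
the sketch should reproduce the qualitative ordering reported above, with VS at t = 0
increasing in ν and the weight at S = 1 lying between −1 and −0.5.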
[Plot omitted. Title: "Vs as a function of time for S = 1"; horizontal axis: Time;
vertical axis: Vs; curves for ν = 1, 10, 100.]
Figure 2-5: Partial derivative of the value function with respect to S for a
single mean reversion trading strategy. VS as a function of time at S = 1 for
different values of the robustness multiplier.

[Plot omitted. Title: "hmin as a function of time for S = 1"; horizontal axis: Time;
vertical axis: hmin; curves for ν = 1, 10, 100.]
Figure 2-6: Distortion drift for a single mean reversion trading strategy.
Distortion drift as a function of time at S = 1 for different values of the robustness
multiplier.

[Plot omitted. Title: "Drift distortion and its two terms as a function of time for S = 1";
horizontal axis: Time; vertical axis: Distortion; curves: h, h1, h2.]
Figure 2-7: Distortion drift terms for a single mean reversion trading
strategy. Distortion drift terms as a function of time at S = 1 for ν = 1. The first term
corresponds to a positive distortion drift that reduces the wealth of the investor, since
the investor is shorting the spread, while the second term corresponds to a negative
distortion drift that points to worse investment opportunities.

[Plot omitted. Title: "Optimal weights as a function of time for S = 1"; horizontal axis:
Time; vertical axis: Weights; curves for ν = 1, 10, 100.]
Figure 2-8: Optimal weight of a single mean reversion trading strategy.
Weight of the mean reversion trading strategy as a function of time at S = 1 for
different values of the robustness multiplier.
Let us now consider the case where we have N = 2 mean reversion trading strategies
and again $\bar{S} = 0$. We solve the HJB equation numerically using the method of
finite differences. We have assumed that rf = 0, T = 1,
\[
\Phi = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}
\quad \text{and} \quad
\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}
\]
In Figure 2-9 we plot the weights of the mean reversion trading strategies as a
function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier.
We have chosen S1 = 1 and S2 = 2, since for these values the drift is the same for
both the strategies. We have assumed that there is no correlation between the two
trading strategies. We observe the following:
• First of all when there is no fear of model misspecification the weights of the
two strategies are the same and they do not change over time.
• When there is a fear of model misspecification the investor becomes more and
more conservative over time, just like in the N = 1 case. It is interesting to note
that the weight is higher for the first trading strategy, where the φ coefficient is
higher. That makes sense since, ceteris paribus, we would expect VS to be
higher for the strategy with the stronger rate of mean reversion (φ coefficient).
This is shown in Figure 2-11.
• Figure 2-10 shows the ratio of the weights of the two trading strategies. We
observe that this is higher for smaller values of the robustness multiplier and it
is reduced to 1 as t → T.
In Figure 2-12 we have assumed that there is a correlation ρ = 0.5 between the
two trading strategies. Now the weights are smaller than before due to the positive
correlation, but they have the same properties as before. When there is a
negative correlation (Figure 2-13) the weights are larger, since now the opportunities
hedge each other; otherwise the properties remain the same.

[Plot omitted. Title: "Optimal weights as a function of time for S1 = 1 and S2 = 2";
horizontal axis: Time; vertical axis: Weights; curves: Spread 1 and Spread 2 for
ν = 1, 10, 100.]
Figure 2-9: Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 2. Weights of the mean reversion trading strategies
as a function of time at S1 = 1 and S2 = 2 for different values of the robustness
multiplier. The correlation coefficient is ρ = 0.

[Plot omitted. Title: "Ratio of optimal weights as a function of time for S1 = 1 and
S2 = 2"; horizontal axis: Time; vertical axis: Weights; curves for ν = 1, 10, 100.]
Figure 2-10: Ratio of the optimal weights. Ratio of the optimal weights of the
mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for
different values of the robustness multiplier. The correlation coefficient is ρ = 0.

[Plot omitted. Title: "Vs1 Vs2 as a function of time for S1 = 1 and S2 = 2";
horizontal axis: Time; vertical axis: Vs; curves: Spread 1 and Spread 2 for
ν = 1, 10, 100.]
Figure 2-11: Partial derivative of the value function with respect to S1 and
S2 at S1 = 1 and S2 = 2 when ρ = 0. Partial derivative of the value function with
respect to S1 and S2 as a function of time at S1 = 1 and S2 = 2 for different values
of the robustness multiplier. The correlation coefficient is ρ = 0.

[Plot omitted. Title: "Optimal weights as a function of time for S1 = 1 and S2 = 2";
horizontal axis: Time; vertical axis: Weights; curves: Spread 1 and Spread 2 for
ν = 1, 10, 100.]
Figure 2-12: Optimal weights of two positively correlated mean reversion
trading strategies for S1 = 1 and S2 = 2. Weights of the mean reversion trading
strategies as a function of time at S1 = 1 and S2 = 2 for different values of the
robustness multiplier. The correlation coefficient is ρ = 0.5.

[Plot omitted. Title: "Optimal weights as a function of time for S1 = 1 and S2 = 2";
horizontal axis: Time; vertical axis: Weights; curves: Spread 1 and Spread 2 for
ν = 1, 10, 100.]
Figure 2-13: Optimal weights of two negatively correlated mean reversion
trading strategies for S1 = 1 and S2 = 2. Weights of the mean reversion trading
strategies as a function of time at S1 = 1 and S2 = 2 for different values of the
robustness multiplier. The correlation coefficient is ρ = −0.5.
So far we have examined the two trading strategies at values where the drifts are
the same. We now choose S1 = 1 and S2 = 1, since for these values the drifts differ
for the two strategies. Figure 2-14 shows the weights of the strategies over time for
different values of the robustness multiplier when there is no correlation between the
two trading strategies. We now observe the following:
• First of all when there is no fear of model misspecification the ratio of the
weights of the two strategies is the same as the ratio of their drifts normalized
by their variances and it does not change over time.
• When there is a fear of model misspecification the investor becomes more and
more conservative over time, just like in the N = 1 case, as we see in Figure 2-14.
• VS1 is higher than VS2. That makes sense since, ceteris paribus, we would
expect VS to be higher for the strategy with the stronger rate of mean
reversion (φ coefficient). This is shown in Figure 2-16.
• Figure 2-15 shows the ratio of the weights of the two trading strategies. We
observe that this is higher for smaller values of the robustness multiplier and
after initially increasing it finally converges to 2, the ratio of the drifts, as t → T.
The reason that initially the ratio is less than the one for the case where there is
no fear of model misspecification is that initially the ratio VS1/VS2 is less than the
ratio of the drifts.
A very interesting case arises when there is a high positive correlation like ρ = 0.9
between the two trading strategies. In this case the investor uses the second trading
strategy with the lower drift as a hedge for the first strategy, as shown in Figure 2-17,
where the investor is shorting the first strategy while he is long the second strategy.
Now VS2 is negative (Figure 2-18), which is to be expected since the investor is long
the asset and higher values of S2 lead to worse investment opportunities for a long
investor. Figure 2-19 shows the ratio of the magnitudes of the optimal weights as a
function of time.
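The role of the correlation is already visible in the benchmark without fear of model
misspecification, i.e. the ν → ∞ limit of (2.12) with weights −Σ⁻¹(Φ + rI)S. The sketch
below evaluates only this limit for the Φ and Σ given above and rf = 0; it does not
include the VS term of the robust solution, which requires solving the HJB equation.
The function name is ours.

```python
import numpy as np

Phi = np.diag([2.0, 1.0])   # mean reversion coefficients used in this section

def benchmark_weights(S, rho, r=0.0):
    """Weights -Sigma^{-1} (Phi + r I) S without fear of model misspecification."""
    Sigma = np.array([[1.0, rho], [rho, 1.0]])
    drift = (Phi + r * np.eye(2)) @ np.asarray(S, dtype=float)
    return -np.linalg.solve(Sigma, drift)

cases = [((1, 2), 0.0), ((1, 2), 0.5), ((1, 2), -0.5),
         ((1, 1), 0.0), ((1, 1), 0.9), ((1, 1), -0.8)]
for S, rho in cases:
    print(f"S = {S}, rho = {rho:+.1f}: weights = {benchmark_weights(S, rho)}")
```

For S1 = S2 = 1 and ρ = 0.9 the benchmark weight of the second strategy is positive
while the first is negative, so the hedging pattern described above is present even
before the robustness adjustment is applied.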
Finally, Figure 2-20 shows the weights when there is negative correlation ρ = −0.8.
The results are similar to the case of no correlation, although now the weights are
higher due to the negative correlation, which makes the two strategies good hedges.

[Plot omitted. Title: "Optimal weights as a function of time for S1 = 1 and S2 = 1";
horizontal axis: Time; vertical axis: Weights; curves: Spread 1 and Spread 2 for
ν = 1, 10, 100.]
Figure 2-14: Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 1. Weights of the mean reversion trading strategies
as a function of time at S1 = 1 and S2 = 1 for different values of the robustness
multiplier. The correlation coefficient is ρ = 0.

[Plot omitted. Title: "Ratio of optimal weights as a function of time for S1 = 1 and
S2 = 1"; horizontal axis: Time; vertical axis: Weights; curves for ν = 1, 10, 100.]
Figure 2-15: Ratio of the optimal weights at S1 = 1 and S2 = 1 when ρ = 0.
Ratio of the optimal weights of the mean reversion trading strategies as a function
of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The
correlation coefficient is ρ = 0.

[Plot omitted. Title: "Vs1 Vs2 as a function of time for S1 = 1 and S2 = 1";
horizontal axis: Time; vertical axis: Vs; curves: Spread 1 and Spread 2 for
ν = 1, 10, 100.]
Figure 2-16: Partial derivative of the value function with respect to S1 and
S2 at S1 = 1 and S2 = 1 when ρ = 0. Partial derivative of the value function with
respect to S1 and S2 as a function of time at S1 = 1 and S2 = 1 for different values
of the robustness multiplier. The correlation coefficient is ρ = 0.

[Plot omitted. Title: "Optimal weights as a function of time for S1 = 1 and S2 = 1";
horizontal axis: Time; vertical axis: Weights; curves: Spread 1 and Spread 2 for
ν = 0.1, 1, 10.]
Figure 2-17: Optimal weights of two positively correlated mean reversion
trading strategies for S1 = 1 and S2 = 1. Weights of the mean reversion trading
strategies as a function of time at S1 = 1 and S2 = 1 for different values of the
robustness multiplier. The correlation coefficient is ρ = 0.9.

[Plot omitted. Title: "Vs1 Vs2 as a function of time for S1 = 1 and S2 = 1";
horizontal axis: Time; vertical axis: Vs; curves: VS1 and VS2 for ν = 0.1, 1, 10.]
Figure 2-18: Partial derivative of the value function with respect to S1 and
S2 at S1 = 1 and S2 = 1 when ρ = 0.9. Partial derivative of the value function
with respect to S1 and S2 as a function of time at S1 = 1 and S2 = 1 for different
values of the robustness multiplier. The correlation coefficient is ρ = 0.9.

[Plot omitted. Title: "Ratio of optimal weights as a function of time for S1 = 1 and
S2 = 1"; horizontal axis: Time; vertical axis: Weights; curves for ν = 1, 10, 100.]
Figure 2-19: Ratio of the optimal weights at S1 = 1 and S2 = 1 when ρ = 0.9.
Ratio of the magnitude of the optimal weights of the mean reversion trading strategies
as a function of time at S1 = 1 and S2 = 1 for different values of the robustness
multiplier. The correlation coefficient is ρ = 0.9.

[Plot omitted. Title: "Optimal weights as a function of time for S1 = 1 and S2 = 1";
horizontal axis: Time; vertical axis: Weights; curves: Spread 1 and Spread 2 for
ν = 1, 10, 100.]
Figure 2-20: Optimal weights of two negatively correlated mean reversion
trading strategies for S1 = 1 and S2 = 1. Weights of the mean reversion trading
strategies as a function of time at S1 = 1 and S2 = 1 for different values of the
robustness multiplier. The correlation coefficient is ρ = −0.8.
2.4.3 Convergence trades with constraints
We consider first the case where we have N = 1 arbitrage opportunity and there are
collateral constraints i.e. |F| ≤ L. Due to the symmetry of the constraint and the
spread S around 0 it suffices to study only what happens when S ≥ 0, since the
symmetry implies that the value function is an even function of the spread S and its
partial derivative with respect to the spread is an odd function of S for each t.
The optimal weight in the arbitrage opportunity is given by:
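\[
F_t^{opt} =
\begin{cases}
-\dfrac{1}{\sigma^2\big(1 + \frac{1}{\nu}\big)}\Big(\big(\tfrac{a}{T-t} + r\big) S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu}\Big)
  & \text{if } \Big|\big(\tfrac{a}{T-t} + r\big) S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu}\Big| \le L\sigma^2\big(1 + \tfrac{1}{\nu}\big) \\[2ex]
-L
  & \text{if } \big(\tfrac{a}{T-t} + r\big) S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu} \ge L\sigma^2\big(1 + \tfrac{1}{\nu}\big) \\[2ex]
L
  & \text{if } \big(\tfrac{a}{T-t} + r\big) S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu} \le -L\sigma^2\big(1 + \tfrac{1}{\nu}\big)
\end{cases}
\]
and the minimizing distortion drift is given by $h_{min} = -\sigma(F^{opt} + V_S)/\nu$.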
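For a single trade the piecewise formula is simply the unconstrained robust weight
clipped to the interval [−L, L]. The sketch below makes this explicit; the value of VS
is a hypothetical input (in the thesis it comes from the HJB equation), the default
parameters are the ones used in the figures, and the function name is ours.

```python
def constrained_convergence_weight(S, V_S, t, L, a=0.01, r=0.0, sigma=1.0,
                                   nu=1.0, T=1.0):
    """Robust weight of one convergence trade under the constraint |F| <= L."""
    drift = (a / (T - t) + r) * S + sigma**2 * V_S / nu
    F_unconstrained = -drift / (sigma**2 * (1.0 + 1.0 / nu))
    return max(-L, min(L, F_unconstrained))   # clip to [-L, L]

# Hypothetical V_S = 0.004 at S = 1; only the time to the horizon changes.
early = constrained_convergence_weight(S=1.0, V_S=0.004, t=0.5, L=0.1)
late = constrained_convergence_weight(S=1.0, V_S=0.004, t=0.99, L=0.1)
mirrored = constrained_convergence_weight(S=-1.0, V_S=-0.004, t=0.99, L=0.1)

print(early, late, mirrored)
```

Far from the horizon the constraint is slack, while close to T the hyperbolic drift
term a/(T − t) pushes the unconstrained weight beyond the bound and the constraint
binds. The mirrored call illustrates the symmetry in S.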
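The HJB equation is given by:
\[
\begin{cases}
V_t + r + \tfrac{1}{2}\sigma^2 V_{SS}
 + \tfrac{1}{2}\,\dfrac{1}{1 + \frac{1}{\nu}}\big(\tfrac{a}{T-t} + r\big)^2 \dfrac{S^2}{\sigma^2}
 + V_S\Big(-\tfrac{a}{T-t}\, S + \dfrac{\big(\frac{a}{T-t} + r\big) S}{\nu + 1}\Big)
 - \tfrac{1}{2}\,\dfrac{1}{\nu + 1}\, V_S^2 \sigma^2 = 0 \\[1ex]
\qquad \text{if } \Big|\big(\tfrac{a}{T-t} + r\big) S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu}\Big| \le L\sigma^2\big(1 + \tfrac{1}{\nu}\big) \\[2ex]
V_t + r + \tfrac{1}{2}\sigma^2 V_{SS} - \dfrac{\sigma^2}{2\nu}\, V_S^2
 - V_S\Big(\tfrac{a}{T-t}\, S - \dfrac{L\sigma^2}{\nu}\Big)
 - \tfrac{1}{2}\big(1 + \tfrac{1}{\nu}\big)\sigma^2 L^2
 + L\big(\tfrac{a}{T-t} + r\big) S = 0 \\[1ex]
\qquad \text{if } \big(\tfrac{a}{T-t} + r\big) S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu} \ge L\sigma^2\big(1 + \tfrac{1}{\nu}\big) \\[2ex]
V_t + r + \tfrac{1}{2}\sigma^2 V_{SS} - \dfrac{\sigma^2}{2\nu}\, V_S^2
 - V_S\Big(\tfrac{a}{T-t}\, S + \dfrac{L\sigma^2}{\nu}\Big)
 - \tfrac{1}{2}\big(1 + \tfrac{1}{\nu}\big)\sigma^2 L^2
 - L\big(\tfrac{a}{T-t} + r\big) S = 0 \\[1ex]
\qquad \text{if } \big(\tfrac{a}{T-t} + r\big) S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu} \le -L\sigma^2\big(1 + \tfrac{1}{\nu}\big)
\end{cases}
\]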
We solve the HJB equation numerically using the method of finite differences. In
the following figures we have assumed that rf = 0, σ = 1, a = 0.01 and T = 1. We
observe the following:
• VS is lower when the constraint is tighter (Figure 2-21). Figure 2-22 shows a
typical behaviour of VS over time at S = 1 and L = 0.1 for different values of
the robustness multiplier.
• The lower the value of the robustness multiplier, the more time it takes for the
constraint to bind. For very low values of ν the investor is more conservative
than in the case without model misspecification for all t. When the constraint is
relatively tight, the investor is more conservative when the
robustness multiplier is lower (Figure 2-23). When the constraint is relatively
loose (L is higher), it might be the case that for not very low values of ν the
investor is initially more aggressive and becomes more conservative as t → T,
compared with the case without model misspecification (Figure 2-24). This is
because at the beginning the drift term a/(T − t) is very low compared with VS, making
the total drift term lower for large values of ν than for smaller values of ν, which
leads to a lower magnitude of weight. This situation changes as the time to the
horizon T gets smaller.
• The fact that typically VS becomes larger as t → T for each value of ν until a
point close to the horizon, in combination with the fact that the drift term a/(T − t)
increases in a hyperbolic way, leads to F becoming larger (in absolute value) as
t → T (Figures 2-23 and 2-24) until the collateral constraint binds. The risk averse
investor becomes more aggressive as the time to T becomes smaller, despite the
fact that as t → T the malevolent agent picks a more adverse distortion drift.
This is because the improvement in the investment opportunities is
so substantial that it dominates the increase in the distortion drift.
• For very low values of ν the drift distortion hmin is always positive. When the
constraint is tight enough the drift distortion is positive for all values of the
robustness multiplier (see Figure 2-25). When the constraint is not very tight
for not very low values of ν the drift distortion starts negative and after some
point increases as t → T to positive values (see Figure 2-26).
• Figure 2-27 shows the two terms of the distortion drift for ν = 1 when L = 0.1.
The first term corresponds to a positive distortion drift that reduces the wealth
of the investor since the investor is shorting the spread and it is bounded above
due to the collateral constraint, while the second term corresponds to a negative
distortion drift that points to worse investment opportunities. In this tradeoff
the first term wins and the final distortion drift is positive. For higher values of
L (see Figure 2-28) we see that the first term is losing at the beginning which
makes the drift distortion negative, explaining why initially the investor is more
aggressive than the case when there is no fear of model misspecification, but
as t → T it increases at a fast rate making the drift distortion finally positive,
which explains the fact that after a while the investor becomes more conservative
comparing to the case without fear of model misspecification. It is interesting
to note that after the collateral constraint binds the distortion drift evolution
is determined by the evolution of VS and therefore it might also undergo some
initial reduction before increasing to its upper bound dictated by the constraint.
• The tighter the collateral constraint, the more conservative the investor is, even
when the constraint does not bind (Figure 2-29).
• As ν → ∞ the optimal weight in the strategy converges to the optimal weight
when there is no fear of model misspecification as we have argued before.
[Plot: "Vs as a function of time for S = 1". Horizontal axis: Time; vertical axis: Vs.]
Figure 2-21: Partial derivative of the value function with respect to S for
a single convergence trade when L = 0.1 and L = 100. V_S as a function of
time at S = 1 for different values of the robustness multiplier. The solid line is for
L = 100 and the dotted line is for L = 0.1.
[Plot: "Vs as a function of time for S = 1". Horizontal axis: Time; vertical axis: Vs (×10^-3). Legend: ν = 0.01, 0.1, 1, 10, 100, all with L = 0.1.]
Figure 2-22: Partial derivative of the value function with respect to S for a
single convergence trade when L = 0.1. V_S as a function of time at S = 1 for
different values of the robustness multiplier. The collateral constraint is |F| ≤ 0.1.
[Plot: "Optimal weights as a function of time for S = 1". Horizontal axis: Time; vertical axis: Weights. Legend: ν = 0.01, 0.1, 1, 10, 100, all with L = 0.1.]
Figure 2-23: Optimal weight of a single convergence trade when L = 0.1.
Weight of the convergence trading strategy as a function of time at S = 1 for different
values of the robustness multiplier. The collateral constraint is |F| ≤ 0.1.
[Plot: "Optimal weights as a function of time for S = 1". Horizontal axis: Time; vertical axis: Weights. Legend: ν = 0.01, 0.1, 1, 10, 100, all with L = 1.]
Figure 2-24: Optimal weight of a single convergence trade when L = 1. Weight
of the convergence trading strategy as a function of time at S = 1 for different values
of the robustness multiplier. The collateral constraint is |F| ≤ 1.
[Plot: "hmin as a function of time for S = 1". Horizontal axis: Time; vertical axis: hmin. Legend: ν = 0.01, 0.1, 1, 10, 100, all with L = 0.1.]
Figure 2-25: Distortion drift for a single convergence trade when L = 0.1.
Distortion drift as a function of time at S = 1 for different values of the robustness
multiplier. The collateral constraint is |F| ≤ 0.1.
[Plot: "hmin as a function of time for S = 1". Horizontal axis: Time; vertical axis: hmin. Legend: ν = 0.01, 0.1, 1, 10, 100, all with L = 1.]
Figure 2-26: Distortion drift for a single convergence trade when L = 1.
Distortion drift as a function of time at S = 1 for different values of the robustness
multiplier. The collateral constraint is |F| ≤ 1.
[Plot: "Drift distortion and its two terms as a function of time for S = 1". Horizontal axis: Time; vertical axis: Distortion. Legend: h, h1, h2.]
Figure 2-27: Distortion drift terms for a single convergence trade when
L = 0.1. Distortion drift terms as a function of time at S = 1 for ν = 1 and L =
0.1. The first term corresponds to a positive distortion drift that reduces the wealth
of the investor and it is bounded above due to the collateral constraint, while the
second term corresponds to a negative distortion drift that points to worse investment
opportunities.
[Plot: "Drift distortion and its two terms as a function of time for S = 1". Horizontal axis: Time; vertical axis: Distortion. Legend: h, h1, h2.]
Figure 2-28: Distortion drift terms for a single convergence trade when
L = 1. Distortion drift terms as a function of time at S = 1 for ν = 1 and L =
1. The first term corresponds to a positive distortion drift that reduces the wealth
of the investor and it is bounded above due to the collateral constraint, while the
second term corresponds to a negative distortion drift that points to worse investment
opportunities.
[Plot: "Optimal weights as a function of time for S = 1". Horizontal axis: Time; vertical axis: Weights.]
Figure 2-29: Optimal weight of a single convergence trade when L = 0.1 and
L = 100. Weight of the convergence trading strategy as a function of time at S = 1
for different values of the robustness multiplier. The solid line is when L = 100 and
the dotted line is for L = 0.1.
Let us now consider the case where we have N = 2 convergence trades. We
solve the HJB equation numerically using the method of finite differences. We have
assumed that rf = 0, T = 1,
\[ a = \begin{pmatrix} 0.04 \\ 0.02 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}. \]
Additionally we assume that we have a VaR constraint F^T Σ F ≤ L. In Figures
2-30 and 2-32 we plot the weights of the convergence trading strategies as a function
of time at S1 = 1 and S2 = 2 for different values of L and different values of the
robustness multiplier. At these values of S1 and S2 the drift is the same for both
strategies. We have assumed that there is no correlation between the two trading
strategies. Figures 2-31 and 2-33 show F^T Σ F, the normalized wealth variance, as a
function of time for the different values of the robustness multiplier. We observe the
following:
• When the VaR constraint binds and there is fear of model misspecification,
we invest more in the spread with the higher rate of mean reversion, due to its
higher V_S. The asymmetry between the two convergence trades goes down as
t → T, which is why, when the VaR constraint binds, the difference in the
weights of the two strategies has to go down, as we see in Figures 2-30 and
2-32, where L = 0.5 and L = 0.05 respectively. Moreover, the investment in the
spread with the higher rate of mean reversion is higher than the corresponding
investment when there is no fear of model misspecification. This is because
in both cases we have the same binding VaR constraint F1² + F2² = L, but in
the first case there is an asymmetry causing F1 to be higher and F2 to be lower
than the corresponding weights in the second case.
• The lower the value of the robustness multiplier, the more time it takes for
the constraint to bind, just like in the N = 1 case.
• When the VaR constraint does not bind, the weights of both strategies increase
as t → T due to the improvement of the investment opportunities. When
the constraint is relatively loose (L is higher), it might be the case that for not
very low values of ν the investor is initially more aggressive and, as t → T, becomes
more conservative compared to the case without model misspecification
(Figures 2-30 and 2-32). This is because at the beginning the drift term a/(T − t) is
very low compared to V_S, making the total drift term lower for large values of
ν than for small values of ν, which leads to a lower magnitude of the weight.
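As a small illustration of the quantities used in the observations above, the following sketch evaluates the normalized wealth variance F^T Σ F for two strategies with unit variances and correlation ρ, and checks whether a pair of weights sits on the VaR boundary. The function names are hypothetical, and the drift magnitude a·S/(T − t) is an assumption based on the drift term a/(T − t) mentioned in the text, not a formula quoted from it.

```python
def normalized_wealth_variance(f1, f2, rho):
    """F^T Sigma F for Sigma = [[1, rho], [rho, 1]]."""
    return f1 * f1 + 2.0 * rho * f1 * f2 + f2 * f2


def var_constraint_binds(f1, f2, rho, limit, tol=1e-12):
    """True when the weights sit on the boundary F^T Sigma F = L."""
    return abs(normalized_wealth_variance(f1, f2, rho) - limit) <= tol


def drift_magnitude(a, s, t, horizon=1.0):
    """Assumed magnitude of the spread drift, a * S / (T - t)."""
    return a * s / (horizon - t)


# With rho = 0 the constraint reduces to F1^2 + F2^2 <= L.
print(normalized_wealth_variance(-0.15, -0.1, 0.0))
# At S1 = 1, S2 = 2 the two assumed drifts coincide for a = (0.04, 0.02).
print(drift_magnitude(0.04, 1.0, 0.5), drift_magnitude(0.02, 2.0, 0.5))
```

The weights passed in are made-up inputs chosen for illustration; they are not read off the figures.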
[Plot: "Optimal weights as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis: Weights. Legend: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-30: Optimal weights of two uncorrelated convergence trades for
S1 = 1 and S2 = 2 when L = 0.5. Weights of the convergence trades as a function
of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The
correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.5.
[Plot: "VaR Constraint as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis labeled "Weights" in the original plot. Legend: ν = 1, 10, 100.]
Figure 2-31: Value of the normalized wealth variance for two uncorrelated
convergence trades at S1 = 1 and S2 = 2 when L = 0.5. Value of the normalized
wealth variance for two uncorrelated convergence trades as a function of time at
S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.5.
[Plot: "Optimal weights as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis: Weights. Legend: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-32: Optimal weights of two uncorrelated convergence trades for
S1 = 1 and S2 = 2 when L = 0.05. Weights of the convergence trades as a function
of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The
correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05.
[Plot: "VaR Constraint as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis labeled "Weights" in the original plot. Legend: ν = 1, 10, 100.]
Figure 2-33: Value of the normalized wealth variance for two uncorrelated
convergence trades at S1 = 1 and S2 = 2 when L = 0.05. Value of the normalized
wealth variance for two uncorrelated convergence trades as a function of time at S1 = 1
and S2 = 2 for different values of the robustness multiplier. The correlation coefficient
is ρ = 0 and the rhs of the VaR constraint is L = 0.05.
Figures 2-34 and 2-35 show the weights and the lhs of the constraint for the
case of positive correlation, while Figures 2-36 and 2-37 cover the case of negative
correlation. The properties are similar to those for the uncorrelated case.
[Plot: "Optimal weights as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis: Weights. Legend: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-34: Optimal weights of two positively correlated convergence trades
for S1 = 1 and S2 = 2 when L = 0.05. Weights of the convergence trades as a
function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier.
The correlation coefficient is ρ = 0.5 and the rhs of the VaR constraint is L = 0.05.
[Plot: "VaR Constraint as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis labeled "Weights" in the original plot. Legend: ν = 1, 10, 100.]
Figure 2-35: Value of the normalized wealth variance for two positively
correlated convergence trades at S1 = 1 and S2 = 2 when L = 0.05. Value of
the normalized wealth variance for two positively correlated convergence trades as a
function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier.
The correlation coefficient is ρ = 0.5 and the rhs of the VaR constraint is L = 0.05.
[Plot: "Optimal weights as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis: Weights. Legend: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-36: Optimal weights of two negatively correlated convergence
trades for S1 = 1 and S2 = 2 when L = 0.05. Weights of the convergence trades
as a function of time at S1 = 1 and S2 = 2 for different values of the robustness
multiplier. The correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint
is L = 0.05.
[Plot: "VaR Constraint as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis labeled "Weights" in the original plot. Legend: ν = 1, 10, 100.]
Figure 2-37: Value of the normalized wealth variance for two negatively
correlated convergence trades at S1 = 1 and S2 = 2 when L = 0.05. Value of
the normalized wealth variance for two negatively correlated convergence trades as a
function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier.
The correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint is L = 0.05.
So far we have examined the two trading strategies at values where the drifts are
the same. We now choose S1 = 1 and S2 = 1, since for these values the drifts differ
for the two strategies. Figure 2-38 shows the weights of the strategies over time for
different values of the robustness multiplier when there is no correlation between the
two trading strategies and when L = 0.05, while Figure 2-39 shows the lhs of the VaR
constraint, the normalized wealth variance, over time. We now observe the following:
• First of all, when there is no fear of model misspecification, the ratio of the
weights of the two strategies is the same as the ratio of their drifts normalized
by their variances, and it does not change over time regardless of whether the
VaR constraint binds.
• When the VaR constraint binds and there is fear of model misspecification, the
investor invests more in the spread with the higher drift. Since the binding constraint is
F1² + F2² = L, if F1 increases (decreases) in magnitude over time, then F2 decreases
(increases) in magnitude over time. In Figure 2-38 F1 decreases over time (in magnitude), because the
asymmetry between the two strategies goes down as t → T. This
asymmetry is also the reason why the difference in the weights of the two strategies
is larger when the robustness multiplier is lower than when the robustness
multiplier is higher.
• The lower the value of the robustness multiplier, the more time it takes to bind
the VaR constraint.
• When the VaR constraint does not bind then the investor becomes more and
more aggressive, due to the improvement of the investment opportunities.
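The first observation in the list above can be motivated by a short derivation. As a sketch, assume that without model misspecification the investor's pointwise problem is the mean-variance program below, where μ denotes the vector of instantaneous excess drifts of the strategies and λ ≥ 0 is a multiplier on the VaR constraint; this objective and these symbols are assumptions made for illustration rather than formulas taken from the text.

```latex
\max_{F}\; \mu^{\top} F - \tfrac{1}{2} F^{\top}\Sigma F
\quad \text{s.t.} \quad F^{\top}\Sigma F \le L,
\qquad
\mu - (1 + 2\lambda)\,\Sigma F = 0
\;\Longrightarrow\;
F = \frac{1}{1 + 2\lambda}\,\Sigma^{-1}\mu .
```

When the constraint is slack, λ = 0. When it binds, λ > 0 only rescales Σ^{-1}μ, so the ratio of the two weights is unchanged. If both drifts are proportional to 1/(T − t), that common factor also cancels in the ratio, which is consistent with the ratio not changing over time.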
[Plot: "Optimal weights as a function of time for S1 = 1 and S2 = 1". Horizontal axis: Time; vertical axis: Weights. Legend: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-38: Optimal weights of two uncorrelated convergence trades for
S1 = 1 and S2 = 1 when L = 0.05. Weights of the convergence trades as a function
of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The
correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05.
[Plot: "VaR Constraint as a function of time for S1 = 1 and S2 = 1". Horizontal axis: Time; vertical axis labeled "Weights" in the original plot. Legend: ν = 1, 10, 100.]
Figure 2-39: Value of the normalized wealth variance for two uncorrelated
convergence trades at S1 = 1 and S2 = 1 when L = 0.05. Value of the normalized
wealth variance for two uncorrelated convergence trades as a function of time at S1 = 1
and S2 = 1 for different values of the robustness multiplier. The correlation coefficient
is ρ = 0 and the rhs of the VaR constraint is L = 0.05.
It is interesting to look at the weights when there is a positive correlation high enough
for the investor to be long the second spread and use it as a hedge for the first one. In Figure 2-40 we
plot the weights of the two trading strategies and we observe that they have different
signs. Figure 2-41 shows the normalized wealth variance as a function of time. When
the VaR constraint does not bind, both weights increase in magnitude due to
the improvement in the investment opportunities. When the VaR constraint binds,
both weights become larger in magnitude. This is because the
asymmetry between the two strategies is reduced towards the asymmetry in the case
without fear of model misspecification.
Finally, Figures 2-42 and 2-43 show what happens when there is a negative correlation.
The results are similar to the case of no correlation.
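The sign pattern of the hedged position can be checked in the benchmark without model misspecification. The sketch below is only an illustration under stated assumptions: the spread drifts are taken to be −a_i S_i/(T − t), the unconstrained weights are taken to be Σ^{-1}μ (a log-utility style benchmark), and the VaR constraint is imposed by proportional scaling. The function names are hypothetical and the robust case with finite ν is not covered.

```python
def unconstrained_weights(mu1, mu2, rho):
    """Solve Sigma F = mu for Sigma = [[1, rho], [rho, 1]] with |rho| < 1."""
    det = 1.0 - rho * rho
    return (mu1 - rho * mu2) / det, (mu2 - rho * mu1) / det


def scale_to_var_limit(f1, f2, rho, limit):
    """Shrink both weights proportionally if F^T Sigma F exceeds the limit."""
    variance = f1 * f1 + 2.0 * rho * f1 * f2 + f2 * f2
    if variance <= limit:
        return f1, f2
    k = (limit / variance) ** 0.5
    return k * f1, k * f2


a1, a2, s1, s2, t, horizon = 0.04, 0.02, 1.0, 1.0, 0.99, 1.0
mu1 = -a1 * s1 / (horizon - t)   # assumed drift of spread 1
mu2 = -a2 * s2 / (horizon - t)   # assumed drift of spread 2
rho, limit = 0.8, 0.05
f1, f2 = scale_to_var_limit(*unconstrained_weights(mu1, mu2, rho), rho, limit)
print(f1, f2)   # short spread 1, long spread 2
```

Under these assumptions the second weight turns positive once ρ exceeds a2·S2/(a1·S1), which equals 0.5 for the parameter values above, so ρ = 0.8 produces weights of opposite signs.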
[Plot: "Optimal weights as a function of time for S1 = 1 and S2 = 1". Horizontal axis: Time; vertical axis: Weights. Legend: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-40: Optimal weights of two positively correlated convergence trades
for S1 = 1 and S2 = 1 when L = 0.05. Weights of the convergence trades as a
function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier.
The correlation coefficient is ρ = 0.8 and the rhs of the VaR constraint is L = 0.05.
[Plot: "VaR Constraint as a function of time for S1 = 1 and S2 = 1". Horizontal axis: Time; vertical axis labeled "Weights" in the original plot. Legend: ν = 1, 10, 100.]
Figure 2-41: Value of the normalized wealth variance for two positively
correlated convergence trades at S1 = 1 and S2 = 1 when L = 0.05. Value of
the normalized wealth variance for two positively correlated convergence trades as a
function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier.
The correlation coefficient is ρ = 0.8 and the rhs of the VaR constraint is L = 0.05.
[Plot: "Optimal weights as a function of time for S1 = 1 and S2 = 1". Horizontal axis: Time; vertical axis: Weights. Legend: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-42: Optimal weights of two negatively correlated convergence
trades for S1 = 1 and S2 = 1 when L = 0.05. Weights of the convergence trades as a
function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier.
The correlation coefficient is ρ = −0.8 and the rhs of the VaR constraint is L = 0.05.
[Plot: "VaR Constraint as a function of time for S1 = 1 and S2 = 1". Horizontal axis: Time; vertical axis labeled "Weights" in the original plot. Legend: ν = 1, 10, 100.]
Figure 2-43: Value of the normalized wealth variance for two negatively
correlated convergence trades at S1 = 1 and S2 = 1 when L = 0.05. Value of
the normalized wealth variance for two negatively correlated convergence trades as a
function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier.
The correlation coefficient is ρ = −0.8 and the rhs of the VaR constraint is L = 0.05.
2.4.4 Mean reversion trading strategies with constraints
We consider first the case where we have N = 1 mean reversion trading strategy with
S̄ = 0 and a collateral constraint |F| ≤ L. Again, due to the symmetry of
S around 0, it suffices to study only what happens when S ≥ 0, since the symmetric
behaviour of the spread dynamics combined with the symmetry of the constraint
implies that the value function is an even function of the spread S and its partial
derivative with respect to the spread is an odd function of S for each t.
The optimal weight in the trading strategy is given by:
\[
F^{opt}_t =
\begin{cases}
-\dfrac{1}{\sigma^2\left(1+\frac{1}{\nu}\right)}\left((\phi + r)S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu}\right)
 & \text{if } \left|(\phi + r)S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu}\right| \le L\sigma^2\left(1+\frac{1}{\nu}\right) \\[8pt]
-L & \text{if } (\phi + r)S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu} \ge L\sigma^2\left(1+\frac{1}{\nu}\right) \\[8pt]
L & \text{if } (\phi + r)S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu} \le -L\sigma^2\left(1+\frac{1}{\nu}\right)
\end{cases}
\]
and the minimizing distortion drift is given by:
\[ h_{min} = -\frac{\sigma\,(F^{opt} + V_S)}{\nu}. \]
When S is positive, we have V_S ≥ 0, since higher values of S correspond to better investment
opportunities. If V_S ≥ 0 then −L ≤ F^opt ≤ 0 for S ≥ 0. Therefore we see that again
there is a tradeoff between the two terms in h_min. The first term, −σF^opt/ν, corresponds to
a positive distortion drift that reduces the wealth of the investor, since the investor is
shorting the spread, and it is bounded above, while the second term, −σV_S/ν, corresponds
to a negative distortion drift that points to worse investment opportunities.
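The piecewise rule and the two-term decomposition of the distortion drift can be written compactly in code. The sketch below is an illustration with hypothetical function names: it implements the three cases by clipping the interior solution to [−L, L], which is equivalent to the case distinction above, and then splits h_min into the terms −σF^opt/ν and −σV_S/ν. The input V_S = 0.2 is a made-up value, not a solution of the HJB equation.

```python
def optimal_weight(s, v_s, phi, r, sigma, nu, limit):
    """Piecewise optimal weight: interior solution clipped to [-L, L]."""
    interior = -((phi + r) * s + sigma**2 * v_s / nu) / (sigma**2 * (1.0 + 1.0 / nu))
    return max(-limit, min(limit, interior))


def distortion_terms(f_opt, v_s, sigma, nu):
    """Return (h1, h2, h_min) with h1 = -sigma*F/nu and h2 = -sigma*V_S/nu."""
    h1 = -sigma * f_opt / nu
    h2 = -sigma * v_s / nu
    return h1, h2, h1 + h2


# Illustrative inputs only: V_S = 0.2 is a made-up value, not a solved one.
f_opt = optimal_weight(s=1.0, v_s=0.2, phi=1.0, r=0.0, sigma=1.0, nu=2.0, limit=0.7)
h1, h2, h_min = distortion_terms(f_opt, v_s=0.2, sigma=1.0, nu=2.0)
print(f_opt, h1, h2, h_min)
```

With these inputs the interior weight would exceed the collateral limit, so the weight is clipped at −L, the first term is positive, the second is negative, and their sum is positive.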
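The HJB equation is given by:
\[
\begin{cases}
V_t + r + \tfrac{1}{2}\sigma^2 V_{SS} + \tfrac{1}{2}\,\dfrac{1}{1+\frac{1}{\nu}}\,\dfrac{(\phi + r)^2 S^2}{\sigma^2}
 + V_S\left(-\phi S + \dfrac{(\phi + r)S}{\nu + 1}\right) - \tfrac{1}{2}\,\dfrac{1}{\nu + 1}\,V_S^2\,\sigma^2 = 0 \\
\qquad \text{if } \left|(\phi + r)S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu}\right| \le L\sigma^2\left(1+\frac{1}{\nu}\right) \\[8pt]
V_t + r + \tfrac{1}{2}\sigma^2 V_{SS} - \dfrac{\sigma^2}{2\nu}V_S^2 - V_S\left(\phi S - \dfrac{L\sigma^2}{\nu}\right)
 - \tfrac{1}{2}\left(1+\frac{1}{\nu}\right)\sigma^2 L^2 + L(\phi + r)S = 0 \\
\qquad \text{if } (\phi + r)S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu} \ge L\sigma^2\left(1+\frac{1}{\nu}\right) \\[8pt]
V_t + r + \tfrac{1}{2}\sigma^2 V_{SS} - \dfrac{\sigma^2}{2\nu}V_S^2 - V_S\left(\phi S + \dfrac{L\sigma^2}{\nu}\right)
 - \tfrac{1}{2}\left(1+\frac{1}{\nu}\right)\sigma^2 L^2 - L(\phi + r)S = 0 \\
\qquad \text{if } (\phi + r)S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu} \le -L\sigma^2\left(1+\frac{1}{\nu}\right)
\end{cases}
\]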
We solve the HJB equation numerically using the method of finite differences. In the
following figures we have assumed that rf = 0, σ = 1, φ = 1 and T = 1 for two
different constraints, one tight and one very loose. The results are similar to those for the
case where there are no constraints. In addition we observe the following:
• V_S becomes higher the less tight the constraint is, for each value of the robustness
multiplier (Figure 2-47). This is to be expected, since the tighter the constraint,
the less we can take advantage of the better investment opportunities when S gets
higher.
• Figure 2-45 shows the two terms of the distortion drift for ν = 2 and L = 0.7.
The first term corresponds to a positive distortion drift that reduces the wealth
of the investor, since the investor is shorting the spread, and it is bounded above
due to the constraint, while the second term corresponds to a negative distortion
drift that points to worse investment opportunities. In this tradeoff the first
term wins, making the drift distortion positive.
• Figures 2-46 and 2-48 show the optimal weight at S = 1 over time for different
values of ν. For L = 0.7 we see that for high values of ν the constraint binds over
the whole time period, while for lower values of ν the constraint binds initially and then
the magnitude of F is reduced after some time t. Finally, for even lower values of ν the constraint
does not bind at all.
• Figure 2-48 shows that when the constraint is tighter the investor is more conservative
even when the constraint does not bind. This is expected, since the
tighter the constraint, the smaller V_S is. This difference in the weights
becomes smaller as t → T.
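To make the numerical procedure more concrete, here is a minimal explicit finite-difference sketch for the single mean reversion strategy with a collateral constraint. It is not the scheme used to produce the figures, whose details are not given in the text. The assumptions are: the terminal condition V(S, T) = 0, linear extrapolation of V at the two grid boundaries, central differences in S, and a compact form of the HJB equation in which the maximization over F is carried out by clipping the interior weight to [−L, L]. All names are hypothetical.

```python
def solve_hjb(phi=1.0, r=0.0, sigma=1.0, nu=2.0, limit=0.7, horizon=1.0,
              s_max=3.0, n_s=61):
    """Explicit backward finite-difference sketch for V(S, t) on [-s_max, s_max].

    Returns the S grid and V(., 0). Assumes V(S, T) = 0 and linear
    extrapolation at both boundaries (both are assumptions).
    """
    ds = 2.0 * s_max / (n_s - 1)
    grid = [-s_max + i * ds for i in range(n_s)]
    # Time step chosen so that sigma^2 * dt / ds^2 <= 0.4 (explicit stability).
    n_t = int(horizon / (0.4 * ds * ds / sigma**2)) + 1
    dt = horizon / n_t
    v = [0.0] * n_s                      # assumed terminal condition
    for _ in range(n_t):                 # march backward from T to 0
        new = v[:]
        for i in range(1, n_s - 1):
            s = grid[i]
            v_s = (v[i + 1] - v[i - 1]) / (2.0 * ds)
            v_ss = (v[i + 1] - 2.0 * v[i] + v[i - 1]) / (ds * ds)
            # Optimal weight: interior solution clipped to the collateral limit.
            f = -((phi + r) * s + sigma**2 * v_s / nu) / (sigma**2 * (1.0 + 1.0 / nu))
            f = max(-limit, min(limit, f))
            # Terms of the HJB that depend on F, after minimizing over h.
            gain = (-f * (phi + r) * s - 0.5 * sigma**2 * f * f
                    - sigma**2 * (f + v_s) ** 2 / (2.0 * nu))
            rhs = r + 0.5 * sigma**2 * v_ss - phi * s * v_s + gain
            new[i] = v[i] + dt * rhs     # V(t - dt) = V(t) + dt * rhs
        new[0] = 2.0 * new[1] - new[2]   # linear extrapolation (assumption)
        new[-1] = 2.0 * new[-2] - new[-3]
        v = new
    return grid, v


grid, v0 = solve_hjb()
i = min(range(len(grid)), key=lambda k: abs(grid[k] - 1.0))
print((v0[i + 1] - v0[i - 1]) / (grid[i + 1] - grid[i - 1]))  # V_S at S = 1, t = 0
```

An explicit scheme is used here only because it is short; an implicit or semi-implicit scheme would allow larger time steps. Calling `solve_hjb(limit=7.0)` gives a loose-constraint run that can be compared with the tight one.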
[Plot: "Vs as a function of time for S = 1". Horizontal axis: Time; vertical axis: Vs. Legend: ν = 1, 2, 3, all with L = 0.7.]
Figure 2-44: Partial derivative of the value function with respect to S for
a single mean reversion trading strategy and a collateral constraint with
L = 0.7. V_S as a function of time at S = 1 for different values of the robustness
multiplier for L = 0.7.
[Plot: "Drift distortion and its two terms as a function of time for S = 1". Horizontal axis: Time; vertical axis: Distortion. Legend: h, h1, h2.]
Figure 2-45: Distortion drift terms for a single mean reversion trading strategy
and a collateral constraint with L = 0.7. Distortion drift terms as a function
of time at S = 1 for ν = 2 and for L = 0.7. The first term corresponds to a positive
distortion drift that reduces the wealth of the investor, since the investor is shorting
the spread, while the second term corresponds to a negative distortion drift that
points to worse investment opportunities. The first term is bounded above due to the
collateral constraint.
[Plot: "Optimal weights as a function of time for S = 1". Horizontal axis: Time; vertical axis: Weights. Legend: ν = 1, 2, 3, all with L = 0.7.]
Figure 2-46: Optimal weight of a single mean reversion trading strategy
with a collateral constraint with L = 0.7. Weight of the mean reversion trading
strategy as a function of time at S = 1 for different values of the robustness multiplier
and for L = 0.7.
[Plot: "Vs as a function of time for S = 1". Horizontal axis: Time; vertical axis: Vs.]
Figure 2-47: Partial derivative of the value function with respect to S for a
single mean reversion trading strategy with different collateral constraints.
V_S as a function of time at S = 1 for different values of the robustness multiplier and
different collateral constraints. The solid line is for L = 7 and the dotted line for
L = 0.7.
[Plot: "Optimal weights as a function of time for S = 1". Horizontal axis: Time; vertical axis: Weights.]
Figure 2-48: Optimal weight of a single mean reversion trading strategy with
different collateral constraints. Weight of the mean reversion trading strategy
as a function of time at S = 1 for different values of the robustness multiplier and
different collateral constraints. The solid line is for L = 7 and the dotted line for
L = 0.7.
Let us now consider the case where we have N = 2 mean reversion trading strategies
and again S̄ = 0. We solve the HJB equation numerically using the method of
finite differences. We have assumed that rf = 0, T = 1,
\[ \Phi = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}. \]
Additionally we assume that we have a VaR constraint F^T Σ F ≤ L. In Figures
2-49, 2-51 and 2-53 we plot the weights of the mean reversion trading strategies as a
function of time at S1 = 1 and S2 = 2 for different values of L and different values
of the robustness multiplier. At these values of S1 and S2 the drift is the same for
both strategies. We have assumed that there is no correlation between the two
trading strategies. Figures 2-50, 2-52 and 2-54 show F^T Σ F as a function of time for
the different values of the robustness multiplier. We observe the following:
• First of all, when there is no fear of model misspecification, the weights of the
two strategies are the same, since there is no asymmetry between them, and
they do not change over time regardless of whether the VaR constraint binds.
• When the VaR constraint binds and there is fear of model misspecification, we
invest more in the spread with the higher φ coefficient, due to its higher V_S. As we
saw in the unconstrained case (Figure 2-10), the ratio of the weights is reduced
over time, since the asymmetry between them is reduced over time as V_S1 and V_S2
decrease. Therefore, when the VaR constraint binds, the difference
in the weights of the two strategies has to go down, as we see in Figures 2-49
and 2-51, where L = 3 and L = 2 respectively. Moreover, the investment in the
spread with the higher φ coefficient is higher than the corresponding investment
when there is no fear of model misspecification. This is because
in both cases we have the same binding VaR constraint F1² + F2² = L, but in the first
case there is an asymmetry causing F1 to be higher and F2 to be lower than the
corresponding weights in the second case.
• When the VaR constraint does not bind the weights of both of the strategies
go down just like in the unconstrained case as we see in Figures 2-49 and 2-53.
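The benchmark weights without model misspecification can be computed by hand for the parameter values above, which gives a reference level for the curves with large ν. The sketch below is illustrative and rests on assumptions: the unconstrained weights are taken as F_i = −φ_i S_i (the ν → ∞ limit of the single-strategy formula with r = 0 and σ = 1, applied per strategy because ρ = 0), and the VaR constraint is imposed by proportional scaling. The function name is hypothetical.

```python
import math


def benchmark_weights(phi1, phi2, s1, s2, limit):
    """No-misspecification weights for rho = 0, r = 0 and unit variances.

    The unconstrained weights F_i = -phi_i * S_i are an assumption; they are
    rescaled when F1^2 + F2^2 would exceed the VaR limit.
    """
    f1, f2 = -phi1 * s1, -phi2 * s2
    variance = f1 * f1 + f2 * f2
    if variance > limit:
        k = math.sqrt(limit / variance)
        f1, f2 = k * f1, k * f2
    return f1, f2


# Phi = diag(2, 1) evaluated at S1 = 1, S2 = 2, for the three limits in the text.
for limit in (2.0, 3.0, 7.0):
    print(limit, benchmark_weights(2.0, 1.0, 1.0, 2.0, limit))
```

At S1 = 1 and S2 = 2 the two unconstrained weights coincide, so the scaled weights are equal and do not depend on time, in line with the first observation above.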
[Plot: "Optimal weights as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis: Weights. Legend: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-49: Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 2 when L = 3. Weights of the mean reversion
trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of
the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR
constraint is L = 3.
[Plot: "VaR Constraint as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis labeled "Weights" in the original plot. Legend: ν = 1, 10, 100.]
Figure 2-50: Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 3. Value of
the normalized wealth variance for two uncorrelated mean reversion trading strategies
as a function of time at S1 = 1 and S2 = 2 for different values of the robustness
multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is
L = 3.
[Plot: "Optimal weights as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis: Weights. Legend: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-51: Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 2 when L = 2. Weights of the mean reversion
trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of
the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR
constraint is L = 2.
[Plot: "VaR Constraint as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis labeled "Weights" in the original plot. Legend: ν = 1, 10, 100.]
Figure 2-52: Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 2. Value of
the normalized wealth variance for two uncorrelated mean reversion trading strategies
as a function of time at S1 = 1 and S2 = 2 for different values of the robustness
multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is
L = 2.
[Plot: "Optimal weights as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis: Weights. Legend: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-53: Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 2 when L = 7. Weights of the mean reversion
trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of
the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR
constraint is L = 7.
[Plot: "VaR Constraint as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis labeled "Weights" in the original plot. Legend: ν = 1, 10, 100.]
Figure 2-54: Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 7. Value of
the normalized wealth variance for two uncorrelated mean reversion trading strategies
as a function of time at S1 = 1 and S2 = 2 for different values of the robustness
multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is
L = 7.
Figures 2-55 and 2-56 show the weights and the constraint for the case of positive
correlation, while Figures 2-57 and 2-58 cover the case of negative correlation. The
properties are similar to those for the uncorrelated case.
[Plot: "Optimal weights as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis: Weights. Legend: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-55: Optimal weights of two positively correlated mean reversion
trading strategies for S1 = 1 and S2 = 2 when L = 7. Weights of the mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different
values of the robustness multiplier. The correlation coefficient is ρ = 0.5 and the rhs
of the VaR constraint is L = 7.
[Plot: "VaR Constraint as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis labeled "Weights" in the original plot. Legend: ν = 1, 10, 100.]
Figure 2-56: Value of the normalized wealth variance for two positively
correlated mean reversion trading strategies at S1 = 1 and S2 = 2 when
L = 7. Value of the normalized wealth variance for two positively correlated mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different
values of the robustness multiplier. The correlation coefficient is ρ = 0.5 and the rhs
of the VaR constraint is L = 7.
[Plot: "Optimal weights as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time; vertical axis: Weights. Legend: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-57: Optimal weights of two negatively correlated mean reversion
trading strategies for S1 = 1 and S2 = 2 when L = 7. Weights of the mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different
values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the
rhs of the VaR constraint is L = 7.
[Plot: VaR Constraint as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time (0 to 1). Legend: nu = 1, 10, 100.]
Figure 2-58: Value of the normalized wealth variance for two negatively
correlated mean reversion trading strategies at S1 = 1 and S2 = 2 when
L = 7. Value of the normalized wealth variance for two negatively correlated mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different
values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the
rhs of the VaR constraint is L = 7.
So far we have examined the two trading strategies at values where the drifts are
the same. We now choose S1 = 1 and S2 = 1 just like in the unconstrained case, since
for these values the drifts differ for the two strategies. Figure 2-59 shows the weights
of the strategies over time for different values of the robustness multiplier when there
is no correlation between the two trading strategies and when L = 2, while Figure
2-60 shows the VaR constraint over time. We now observe the following:
• First, when there is no fear of model misspecification, the ratio of the
weights of the two strategies equals the ratio of their drifts normalized
by their variances, and it does not change over time regardless of whether the
VaR constraint binds.
• When the VaR constraint binds and there is fear of model misspecification, the
investor invests more in the spread with the higher drift. Since the constraint is
$F_1^2 + F_2^2 = L$, if $F_1$ increases (decreases) over time, then $F_2$ decreases (increases)
over time. In Figure 2-59 $F_1$ increases over time because the asymmetry between
the two strategies grows larger, as we can see in the unconstrained case
in Figure 2-15.
• When the VaR constraint does not bind, the investor becomes more and
more conservative in both strategies over time, just as in the N = 1 case.
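The first observation can be illustrated with a static, one-period analogue. This is only a sketch under stated assumptions and not the dynamic solution derived in this chapter: we assume a mean-variance objective mu'F - (gamma/2) F'Sigma F with the variance cap F'Sigma F <= L, and the function name, drifts, and parameter values below are hypothetical. In this setting a binding cap rescales the unconstrained weights without changing their ratio.

```python
import numpy as np

def constrained_weights(mu, cov, gamma, L):
    """Static analogue (an assumption, not the thesis model):
    maximize mu'F - (gamma/2) F'cov F subject to F'cov F <= L.
    Returns the weights and a flag telling whether the cap binds."""
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    f_free = np.linalg.solve(cov, mu) / gamma   # unconstrained optimum
    variance = f_free @ cov @ f_free
    if variance <= L:
        return f_free, False
    # A binding cap only scales the unconstrained solution.
    return f_free * np.sqrt(L / variance), True

# Hypothetical drifts for two uncorrelated strategies with unit variances.
mu = np.array([-1.0, -0.5])
cov = np.eye(2)

f_loose, binds_loose = constrained_weights(mu, cov, gamma=1.0, L=7.0)
f_tight, binds_tight = constrained_weights(mu, cov, gamma=1.0, L=0.5)
print(f_loose, binds_loose)
print(f_tight, binds_tight)
```

In both calls the ratio of the two weights equals the ratio of the drifts divided by the variances; only the overall scale differs when the cap binds.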
[Plot: Optimal weights as a function of time for S1 = 1 and S2 = 1. Axes: Time (0 to 1) vs. Weights. Legend: Spread 1 and Spread 2 for nu = 1, 10, 100.]
Figure 2-59: Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 1 when L = 2. Weights of the mean reversion
trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of
the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR
constraint is L = 2.
[Plot: VaR Constraint as a function of time for S1 = 1 and S2 = 1. Horizontal axis: Time (0 to 1). Legend: nu = 1, 10, 100.]
Figure 2-60: Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 1 when L = 2. Value of
the normalized wealth variance for two uncorrelated mean reversion trading
strategies as a function of time at S1 = 1 and S2 = 1 for different values of the
robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR
constraint is L = 2.
It is interesting to examine the weights when the positive correlation is high
enough that the investor is long the second spread and uses it as a hedge for the
first one. In Figure 2-61 we plot the weights of the two trading strategies and observe
that they have different signs. Figure 2-62 shows the VaR value as a function of time.
When the VaR constraint does not bind, both weights decrease in magnitude, as
in the unconstrained case. When the VaR constraint binds, both weights
become larger in magnitude. This is because the asymmetry between the
two strategies is reduced towards the asymmetry in the case without fear of model
misspecification, as we see in the unconstrained case (Figure 2-19).
Finally, Figures 2-63 and 2-64 show what happens when there is a negative
correlation. The results are similar to those in the case of no correlation.
[Plot: Optimal weights as a function of time for S1 = 1 and S2 = 1. Axes: Time (0 to 1) vs. Weights. Legend: Spread 1 and Spread 2 for nu = 1, 10, 100.]
Figure 2-61: Optimal weights of two positively correlated mean reversion
trading strategies for S1 = 1 and S2 = 1 when L = 2. Weights of the mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different
values of the robustness multiplier. The correlation coefficient is ρ = 0.9 and the rhs
of the VaR constraint is L = 2.
[Plot: VaR Constraint as a function of time for S1 = 1 and S2 = 1. Horizontal axis: Time (0 to 1). Legend: nu = 1, 10, 100.]
Figure 2-62: Value of the normalized wealth variance for two positively
correlated mean reversion trading strategies at S1 = 1 and S2 = 1 when
L = 2. Value of the normalized wealth variance for two positively correlated mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different
values of the robustness multiplier. The correlation coefficient is ρ = 0.9 and the rhs
of the VaR constraint is L = 2.
[Plot: Optimal weights as a function of time for S1 = 1 and S2 = 1. Axes: Time (0 to 1) vs. Weights. Legend: Spread 1 and Spread 2 for nu = 1, 10, 100.]
Figure 2-63: Optimal weights of two negatively correlated mean reversion
trading strategies for S1 = 1 and S2 = 1 when L = 8. Weights of the mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different
values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the
rhs of the VaR constraint is L = 8.
[Plot: VaR Constraint as a function of time for S1 = 1 and S2 = 1. Horizontal axis: Time (0 to 1). Legend: nu = 1, 10, 100.]
Figure 2-64: Value of the normalized wealth variance for two negatively
correlated mean reversion trading strategies at S1 = 1 and S2 = 1 when
L = 8. Value of the normalized wealth variance for two negatively correlated mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different
values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the
rhs of the VaR constraint is L = 8.
2.5 Conclusions
We investigated how the optimal trading strategy of a risk averse investor, facing
arbitrage opportunities and risk constraints, changes when he is not confident of
his model dynamics. In particular, we assumed that the investor believes that the
data come from an unknown member of a set of unspecified alternative models near
his approximating model. The investor believes that his model is a pretty good
approximation in the sense that the relative entropy of the alternative models with
respect to his nominal model is small. Concern about model misspecification leads the
investor to choose a robust trading strategy that works well over that set of alternative
models. We derived the optimal trading strategy for convergence
trades and mean reversion trading strategies, with and without constraints, by solving
the corresponding Hamilton-Jacobi-Bellman equation.
In all of our cases we dealt with diffusion processes and the alternative models
distorted the conditional mean of the Brownian motion. An interesting extension of
our work would be to assume that we have jump processes, where the misspecification
is now about the dynamics of the jumps. This will be the topic of future work.
Chapter 3
Estimating the NIH Efficient
Frontier
The National Institutes of Health (NIH) is among the world’s largest and most im-
portant investors in biomedical research. Its stated mission is to “seek fundamental
knowledge about the nature and behavior of living systems and the application of
that knowledge to enhance health, lengthen life, and reduce the burdens of illness
and disability” (http://www.nih.gov/about/mission.htm). Some have criticized the
NIH funding process as not being sufficiently focused on disease burden[69, 45, 43].
We consider a framework in which biomedical research allocation decisions are
more directly tied to the risk/reward trade-off of burden-of-disease outcomes. Pri-
oritizing research efforts is analogous to managing an investment portfolio—in both
cases, there are competing opportunities to invest limited resources, and expected
returns, risk, correlations, and the cost of lost opportunities are important factors in
determining the return of those investments.
Financial decisions are commonly made according to portfolio theory[55], in which
the optimal trade-off between risk and reward among a collection of competing
investments—known as the “efficient frontier”—is constructed via quadratic opti-
mization, and a point on this frontier is selected based on an investor’s risk/reward
preferences. Given a measure of “return on investment” (ROI), an “efficient portfo-
lio” is defined to be the investment allocation that yields the highest expected return
for a given and fixed level of risk (as measured by return volatility), and the locus of
efficient portfolios across all levels of risk is the efficient frontier.
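As a minimal sketch of how such a frontier can be traced, the code below solves the equivalent problem of minimizing variance for a given target return, which yields the same upper frontier. It assumes only a full-investment constraint (weights sum to one) and allows negative weights, so it uses the closed-form Lagrangian solution rather than a general quadratic programming solver. The function name and the three-asset inputs are hypothetical and purely illustrative.

```python
import numpy as np

def frontier_weights(mu, cov, target_return):
    """Minimum-variance weights that sum to 1 and earn target_return.
    Assumption: no sign or box constraints, so the closed form applies."""
    mu = np.asarray(mu, dtype=float)
    ones = np.ones_like(mu)
    inv_ones = np.linalg.solve(cov, ones)   # Sigma^{-1} 1
    inv_mu = np.linalg.solve(cov, mu)       # Sigma^{-1} mu
    a = ones @ inv_ones
    b = ones @ inv_mu
    c = mu @ inv_mu
    d = a * c - b ** 2
    return ((c - b * target_return) * inv_ones
            + (a * target_return - b) * inv_mu) / d

# Hypothetical expected returns and covariance matrix for three investments.
mu = np.array([0.05, 0.08, 0.12])
cov = np.array([[0.010, 0.002, 0.001],
                [0.002, 0.030, 0.004],
                [0.001, 0.004, 0.060]])

# Each frontier point is (volatility, expected return).
frontier = []
for m in np.linspace(0.05, 0.12, 8):
    w = frontier_weights(mu, cov, m)
    frontier.append((float(np.sqrt(w @ cov @ w)), float(m)))

for risk, ret in frontier:
    print(f"volatility={risk:.4f}  expected return={ret:.4f}")
```

Adding non-negativity or other real-world constraints removes the closed form, and a numerical quadratic programming routine would be needed instead.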
We recast the NIH funding allocation decision as a portfolio-optimization problem
in which the objective is to allocate a fixed amount of funds across a set of disease
groups to maximize the expected “return on investment” (ROI) for a given level of
volatility. We define ROI as the subsequent improvements in years of life lost (YLL).
We use historical time series data provided by the NIH and the Centers for Disease
Control for each of 7 disease groups and we estimate the means, variances, and co-
variances among these time series. These serve as inputs to the portfolio-optimization
problem. Such an approach provides objective, systematic, transparent, and repeat-
able metrics that can incorporate “real-world” constraints, and yields well-defined
optimal risk-sensitive biomedical research funding allocations expressly designed to
reduce the burden of disease.
In the rest of this chapter, we first review the relevant literature. We then
describe the data and solution methods used, present our results, and
conclude with a discussion of our findings.
3.1 NIH Background and Literature Review
The National Institutes of Health (NIH) was established in 1938 and has a budget
of over $31 billion, of which 80% is awarded in competitive research grants to more
than 325,000 researchers through nearly 50,000 competitive grants at over 3,000 uni-
versities, medical schools, and other research institutions (http://www.nih.gov). The
NIH allocates funding among competing priorities by assessing such priorities with re-
spect to five major criteria[65]: (a) public needs; (b) scientific quality of the research;
(c) potential for scientific progress (the existence of promising pathways and qualified
investigators); (d) portfolio diversification; and (e) adequate support of infrastruc-
ture (human capital, equipment, instrumentation, and facilities). This framework
was supported, with some additional recommendations, by an Institute of Medicine
(IoM) blue-ribbon panel in 1998 (see Table 3.1)[1].
Criteria
Recommendation 1. The committee generally supports the criteria that NIH uses for priority setting and recommends that NIH continue to use these criteria in a balanced way to cover the full spectrum of research related to human health.
Recommendation 2. NIH should make clear its mechanisms for implementing its criteria for setting priorities and should evaluate their use and effectiveness.
Recommendation 3. In setting priorities, NIH should strengthen its analysis and use of health data, such as burdens and costs of diseases, and of data on the impact of research on the health of the public.
Recommendation 4. NIH should improve the quality and analysis of its data on funding by disease and should include direct and related expenditures.

Processes
Recommendation 5. In exercising the overall authority to oversee and coordinate the priority-setting process, the NIH director should receive from the directors of all the institutes and centers multiyear strategic plans, including budget scenarios, in a standard format on an annual basis.
Recommendation 6. The director of NIH should increase the involvement of the Advisory Committee to the Director in the priority-setting process. The diversity of the committee's membership should be increased, particularly with respect to its public members.

Public
Recommendation 7. NIH should establish an Office of Public Liaison in the Office of the Director, and, where offices performing such a function are not already in place, in each institute. These offices should document, in a standard format, their public outreach, input, and response mechanisms. The director's Office of Public Liaison should review and evaluate these mechanisms and identify best practices.
Recommendation 8. The director of NIH should establish and appropriately staff a Director's Council of Public Representatives, chaired by the NIH director, to facilitate interactions between NIH and the general public.
Recommendation 9. The public membership of NIH policy and program advisory groups should be selected to represent a broad range of public constituencies.

Congress
Recommendation 10. The U.S. Congress should use its authority to mandate specific research programs, establish levels of funding for them, and implement new organizational entities only when other approaches have proven inadequate. NIH should provide Congress with analyses of how NIH is responding to requests for such major changes and whether these requests can be addressed within existing mechanisms.
Recommendation 11. The director of NIH should periodically review and report on the organizational structure of NIH, in light of changes in science and the health needs of the public.
Recommendation 12. Congress should adjust the levels of funding for research management and support so that the NIH can implement improvements in the priority-setting process, including stronger analytical, planning, and public interface capabilities.
Table 3.1: IoM recommendations. 12 major recommendations of the 1998 In-
stitute of Medicine panel in four large areas for improving the process of allocating
research funds.
Despite this framework and the IoM endorsement, NIH funding has been criticized
as not being aligned to disease burden and insufficiently effective[69, 45, 43]. For
example, the impact of cancer has been estimated as only 5% of total direct cost but
23% of all deaths[20], while extramural spending by the National Cancer Institute
(NCI) is about 15% of the total (http://report.nih.gov/). Sandler et al.[70] suggested
that digestive diseases were relatively underfunded based on comparisons of disease
burden as measured by direct and indirect cost. Gross[38] noted that NIH funding
is reasonably predicted by some burden-of-disease metrics (disability-adjusted life-
years or DALY, which are unavailable in time-series form)[57]. Earmarks or target
funding levels for specific diseases and programs have been suggested by a number of
policymakers[2].
Funding allocation decisions are not unique to the NIH; in a study similar to
that of Gross et al., Curry et al. [27] questioned the allocations of the Centers
for Disease Control. NIH leaders have noted that funding basic research is itself a
risky endeavor, involving trade-offs among all five of their funding criteria, and may
also include unstated secondary objectives, e.g., actively “balancing out” spending by
other agencies, charities, and the private sector[76]. Collectively, these factors impose
significant challenges to determining an ideal allocation of research funds.
Although the economic impact of biomedical research has been considered[23],
the main focus has been on measuring value-added rather than determining optimal
funding allocations. Murphy and Topel [63] estimate U.S. economic surplus from
improved health on the order of $2.6 trillion annually, with benefits distributed un-
equally across age and gender, and suggest that in some cases, incremental benefits
may not exceed the cost of achieving them. Johnston et al.[46] found a return to soci-
ety in the form of averted treatment costs and public health benefits divided by cost
of trial expenditures of 46% for clinical trials at the National Institute of Neurological
Disorders and Stroke (NINDS), where the returns or net savings were generated by
four of the 28 trials examined, and collectively exceeded the costs of not only the
clinical, but the entire program of research at NINDS during the study period. Cut-
ler and McClellan[28] computed returns of technological advances for five conditions
and found net benefit for four and costs equal to benefits in the fifth. Fleurence and
Torgerson[31] suggested that research should be allocated to provide the most health
benefits to the population, subject to equity considerations, and observed that
subjective, burden-of-disease, and payback methods all failed this test to some degree.
Instead, they argued that a method of information valuation is superior.
Modern financial portfolio theory—in which the expected return, risk (as measured
by volatility), and correlations of a collection of investment opportunities are taken
as inputs, and the set of all portfolio weights with the highest expected return for
a given level of risk is the output—produces rational allocations of limited resources
among competing priorities. For developing this method in 1952, Markowitz shared
the Nobel Memorial Prize in Economic Sciences in 1990. The theory has had extensive
applications among mutual funds, pension funds, endowments, and sovereign wealth
funds[55, 56, 24].
More recently, portfolio theory has been proposed as a means for conducting risk-
sensitive cost-benefit analysis for health-care budgeting decisions[11, 66, 19, 18, 71,
72]. The motivation for these studies is the observation that typical cost-effectiveness
studies of healthcare programs ignore the uncertainty of realized costs, which can be
addressed by applying portfolio theory to balance the risks against the rewards of spe-
cific budget allocations. These studies present simplified frameworks for incorporating
risk into the healthcare budgeting process, e.g., two-security examples (although[71]
does contain 11 hypothetical cost/effect distributions) and do not contain full-scale
empirical applications to realistic budgeting tasks. As the authors note, applying
portfolio theory to large public healthcare reimbursement problems can be challeng-
ing. Patients may have differing and non-constant utility functions, and some argue
that the manager/administrator should only consider expected returns, allowing the
patient and physician to consider risk trade-offs at individual treatment levels, in
which case the aggregate utility function is implicit.
Despite the growing interest in measuring the return on biomedical research[67,
51], and the fact that portfolio theory has already been applied to healthcare bud-
geting decisions, some sceptics continue to argue against the use of any quantita-
tive metrics in this domain. For example, Black[14] states categorically that “[t]he
biomedical ‘payback’ approach is certainly inappropriate and attempts to impose it
should be strenuously resisted. Instead, a qualitative approach should be applied
that takes into account the ‘slow-burning fuse’ and avoids simple attribution of cause
and effect”. While such a response may be acceptable for certain types of funding,
it is becoming increasingly untenable with respect to public funds and government
support, which, by law, almost always require some form of cost/benefit analysis,
performance attribution, and oversight.
3.2 Methods
3.2.1 Funding Data
The NIH has 27 Institutes and Centers, of which we identified 10 with research
missions clearly tied to specific disease states, and which account for $21 billion of
funding in 2005 or 74% of the total (see Table 3.2 for the disease classification scheme
used and Figure 3-1 for the procedure for constructing the appropriation time series).
The National Institute of Allergy and Infectious Diseases (NIAID) spending has
been split to account for HIV, which is presented separately (see HIV discussion
below).
These Institutes and the basic research they fund have inevitable overlap and
effect beyond their charter; we treat all spending for any given Institute as being
directed toward the corresponding disease states, and account for spillover effects by
considering the correlations in the lessening of the burden of disease in other groups.
For example, molecular biology funded by the NCI may be relevant to infectious
diseases but, like the entire NCI budget, would be assumed for modeling purposes to
be directed at cancer; the hypothetical infectious-disease improvement would appear
in the correlation between the decrease in years of life lost for cancer and that of
infectious diseases.
Columns: Analytic Group; ICD-9 Chapter(s) (Codes); ICD-10 Chapter(s) (Blocks); NIH Institute(s)

AID
  ICD-9: Infectious and Parasitic Diseases (001-139)
  ICD-10: Certain infectious and parasitic diseases (A00-B99)
  NIH Institute(s): NIAID
NCI
  ICD-9: Neoplasms (140-239)
  ICD-10: Neoplasms (C00-D48)
  NIH Institute(s): NCI
DDK
  ICD-9: Endocrine, nutritional and metabolic diseases, and immunity disorders (240-279); Diseases of the digestive system (520-579); Diseases of the genitourinary system (580-629)
  ICD-10: Endocrine, nutritional and metabolic diseases (E00-E88); Diseases of the digestive system (K00-K92); Diseases of the genitourinary system (N00-N98)
  NIH Institute(s): NIDDK
HLB
  ICD-9: Diseases of the blood and blood-forming organs (280-289); Diseases of the circulatory system (390-459); Diseases of the respiratory system (460-519)
  ICD-10: Diseases of the blood and blood-forming organs and certain disorders involving the immune mechanism (D50-D89); Diseases of the circulatory system (I00-I99); Diseases of the respiratory system (J00-J98)
  NIH Institute(s): NHLBI
NMH
  ICD-9: Mental disorders (290-319)
  ICD-10: Mental and behavioural disorders (F01-F99)
  NIH Institute(s): NIMH
CNS
  ICD-9: Diseases of the nervous system and sense organs (320-389)
  ICD-10: Diseases of the nervous system (G00-G98); Diseases of the eye and adnexa (H00-H57); Diseases of the ear and mastoid process (H60-H93)
  NIH Institute(s): NINDS, NEI, NIDCD
CHD
  ICD-9: Complications of pregnancy, childbirth, and the puerperium (630-676); Congenital anomalies (740-759); Certain conditions originating in the perinatal period (760-779)
  ICD-10: Pregnancy, childbirth and the puerperium (O00-O99); Certain conditions originating in the perinatal period (P00-P96); Congenital malformations, deformations and chromosomal abnormalities (Q00-Q99)
  NIH Institute(s): NICHD
AMS
  ICD-9: Diseases of the skin and subcutaneous tissue (680-709); Diseases of the musculoskeletal system and connective tissue (710-739)
  ICD-10: Diseases of the skin and subcutaneous tissue (L00-L98); Diseases of the musculoskeletal system and connective tissue (M00-M99)
  NIH Institute(s): NIAMS
LAB
  ICD-9: Symptoms, signs, and ill-defined conditions (780-799)
  ICD-10: Symptoms, signs and abnormal clinical and laboratory findings, not elsewhere classified (R00-R99)
  NIH Institute(s): none listed
EXT
  ICD-9: External causes of injury and poisoning (E800-E999)
  ICD-10: Codes for special purposes (U00-U99); External causes of morbidity and mortality (V01-Y89)
  NIH Institute(s): none listed
Table 3.2: ICD mapping. Classification of ICD-9 (1978–1998) and ICD-10 (1999–
2007) Chapters and NIH appropriations by Institute and Center to 7 disease groups:
oncology (ONC); heart lung and blood (HLB); digestive, renal and endocrine (DDK);
central nervous system and sensory (CNS) into which we placed dementia and un-
specified psychoses to create comparable series as there was a clear, ongoing migration
noted from NMH to CNS after the change to ICD-10 in 1999; psychiatric and sub-
stance abuse (NMH); infectious disease, subdivided into estimated HIV (HIV) and
other (AID); maternal, fetal, congenital and pediatric (CHD). The categories LAB
and EXT are omitted from our analysis.
Figure 3-1: NIH time series flowchart. Flowchart for the construction of NIH ap-
propriations time series. “NIH Approp.” denotes NIH appropriations; “PHS Gaps”
denotes Institute funding by the U.S. Public Health Service; “Complete Approp.”
denotes the union of these two series; “FY Change” allows for the change in govern-
ment fiscal years; “4Q FY” time series refers to the resulting series in which all years
are treated as having four quarters of three months each.
[Plot: Funding (in billions of $) vs. Year, 1940 to 2000. Series: AID, HIV, AMS, CHD, CNS, DDK, HLB, ONC, NMH.]
Figure 3-2: Appropriations data. NIH appropriations in real (2005) dollars, cate-
gorized by disease group.
3.2.2 Burden of Disease Data
Because of its simplicity, availability, breadth, and long history, years of life lost (YLL)
was chosen as the measure of burden of disease to be used in constructing the esti-
mated return on investment from NIH-funded research. The CDC Wide-ranging On-
line Data for Epidemiologic Research (WONDER) database (http://wonder.cdc.gov/)
was queried for the underlying cause of death at the Chapter level (except for mental
disorders, where dementia and unspecified psychoses were all placed in CNS for consis-
tency with CDC coding after 1998) for International Classification of Diseases (ICD)
categories ICD-9 (for 1979–1998) and ICD-10 (for 1999–2007). The two datasets for
pre- and post-1998 were joined into one continuous series, data were stratified into
groups by age at death, and YLL were computed by comparing the midpoint of the
age ranges with the World Health Organization’s (WHO) year-2000 U.S. life table
(http://www.who.int/whosis/en/). Years of life lost were then tabulated by Chapter
annually, and adjusted for population growth to remove what would otherwise be a
systematic downward bias in realized health improvements. This process yielded YLL
series for 9 distinct disease groups.
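The age-stratified YLL computation can be sketched as follows. This is a minimal illustration, assuming that each death in an age range is credited with the remaining life expectancy at the midpoint of that range. The life-table values, the linear interpolation, the age ranges, and the death counts are all hypothetical; they are not the WHO year-2000 table or WONDER data.

```python
import numpy as np

# Hypothetical life table: remaining life expectancy (years) at selected ages.
# These values are made up for illustration; they are NOT the WHO table.
LIFE_TABLE_AGES = np.array([0.0, 20.0, 40.0, 60.0, 80.0])
LIFE_TABLE_REMAINING = np.array([77.0, 58.0, 39.0, 22.0, 9.0])

def remaining_years(age):
    """Remaining life expectancy at a given age (linear interpolation,
    an assumption made only for this sketch)."""
    return float(np.interp(age, LIFE_TABLE_AGES, LIFE_TABLE_REMAINING))

def years_of_life_lost(deaths_by_age_range):
    """Sum deaths times remaining life expectancy at each range midpoint."""
    total = 0.0
    for (low, high), deaths in deaths_by_age_range.items():
        midpoint = (low + high) / 2.0
        total += deaths * remaining_years(midpoint)
    return total

# Hypothetical death counts for one disease group in one year.
deaths = {(0, 40): 10, (40, 80): 100}
yll = years_of_life_lost(deaths)
print(yll)
```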
Using 2005 as the base year, the raw YLL observations were adjusted in other
years to be comparable to the 2005 population:
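$$\mathrm{YLL}_t \equiv \mathrm{YLL}^{\mathrm{raw}}_t \times \frac{\mathrm{POP}_{2005}}{\mathrm{POP}_t}, \qquad \mathrm{POP}_t \equiv \text{U.S. population in year } t. \tag{3.1}$$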
The procedure for assembling the YLL time series is summarized in Figure 3-3,
and the resulting series, both raw and normalized for population growth, are shown
in Figure 3-4.
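The population adjustment in (3.1) amounts to a single rescaling per year, as the short sketch below shows. The function name, the dictionary layout, and the numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def normalize_yll(raw_yll, population, base_year=2005):
    """Rescale raw YLL to the base-year population, as in equation (3.1)."""
    base_population = population[base_year]
    return {year: value * base_population / population[year]
            for year, value in raw_yll.items()}

# Hypothetical inputs: raw YLL and population (in millions) by year.
raw_yll = {2000: 1000.0, 2005: 1100.0}
population = {2000: 280.0, 2005: 295.0}

normalized = normalize_yll(raw_yll, population)
print(normalized)
```

Years with a smaller population than the base year are scaled up, and the base year itself is left unchanged.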
The change in burden of disease was measured by taking first differences. These
first differences were used to compute the “return on investment” on which the mean-
variance optimizations were based (see the “Methods” section below).
Three disease areas required special consideration: HIV, AMS, and dementia.
AMS and HIV have shorter histories, which is problematic for estimating parameters
based on historical returns that are lagged by typical FDA approval times plus 4 years.
Figure 3-3: YLL time series flowchart. Flowchart for the construction of years
of life lost (YLL) time series. “WONDER Chapter Age Group” refers to a query to
the CDC WONDER database at the chapter level, stratified by age group at death;
“US Pop.” is the United States population from census data as expressed in the
WONDER dataset; and “US GDP” denotes U.S. gross domestic product.
(a) YLL Gross
(b) YLL Normalized
Figure 3-4: YLL data. Panel (a): Raw YLL categorized by disease group. Panel (b):
Population-normalized YLL (with base year of 2005), categorized by disease group.
Both panels are based on data from 1979 to 2007.
Dementia, including Alzheimer’s disease and unspecified psychoses, was reclassified
with the change from ICD-9 to ICD-10 from mental and behavioral disorders to
diseases of the nervous system; we placed all dementia YLL in the CNS group to
avoid a transition-point artifact at the juncture between ICD-9 and ICD-10, and
then performed a sensitivity analysis with and without the dementia YLL.
HIV poses a special challenge given its extreme returns after the introduction of
protease inhibitors, which are outliers that are likely to be non-stationary and would
heavily bias the parameter estimates on which the portfolio optimization is based.
To address this outlier, HIV spending and its corresponding YLL were omitted from
those of other infectious diseases—the component of NIAID spending directed at
HIV was estimated by straight-line interpolation from published figures, and this
HIV spending was treated as a separate entity and subtracted from reported NIAID
appropriations; a similar procedure was followed for the estimation of HIV-related
YLL, and WONDER was queried at the subchapter level to implement this separation.
Because of their unique characteristics, these two groups (HIV and AMS) are omitted
from our main empirical results.
3.2.3 Applying Portfolio Theory
To apply portfolio theory, the concept of a “return on investment” (ROI) must first be
defined. Although YLL has already been chosen as the metric by which the impact
of research funding is to be gauged, there are at least two issues in determining
the relation between research expenditures and YLL that must be considered. The
first is whether or not any relation exists between the two quantities. While the
objectives of pure science do not always include practical applications that impact
YLL, the fact that part of the NIH mission is to “enhance health, lengthen life, and
reduce the burdens of illness and disability” suggests the presumption—at least by
the NIH—that there is indeed a non-trivial relation between NIH-funded research and
burden of disease. For the purposes of this study, and as a first approximation, we
assume that YLL improvements are proportional to research expenditures. Of course,
factors other than NIH research expenditures also affect YLL, including research
from other domestic and international medical centers and institutes, spending in
the pharmaceutical and biotechnology industries, public health policy, behavioral
patterns, prosperity level and environmental conditions. Therefore, the YLL/NIH-
funding relation is likely to be noisy, with confounding effects that may not be easily
disentangled. The Discussion section contains a more detailed discussion of this
assumption and some possible alternatives.
The second issue is the significant time lag between research expenditures and
observable impact on YLL. For example, Mosteller[62] cites a lag of 264 years, starting
in 1601, for the adoption of citrus to prevent scurvy by the British merchant marine.
More contemporary examples[26, 35, 39] cite lags of 17 to 20 years. We use shorter
lags in this study both because of data limitations (our entire dataset spans only
29 years), and also to reduce the impact of factors other than research expenditures
on our measure of burden of disease (YLL). Any attempt to optimize appropriations
to achieve YLL-related objectives must take this lag into account, otherwise the
resulting optimized appropriations may not have the intended effects on subsequent
YLL outcomes.
The impact of NIH-funded research on disease burden is likely to be spread out
over several years after this intervening lag, given the diffusion-like process in which
research results are shared in the scientific community. For simplicity, we
hypothesize the same duration (p = 5 years) of the diffusion-like impact for all
disease groups. The lag q for each disease group was estimated by running linear
regressions that relate improvements in YLL over p = 5 years to NIH funding q
years earlier and to real income, and by choosing the lag between 9 and 16 years (beyond
which data limitations and other factors make it impossible to distinguish the impact
of research funding from other confounding factors affecting YLL) that maximizes
the $R^2$. The corresponding lags are shown in Table 3.3.
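The lag-selection heuristic described above can be sketched in a few lines. This is a simplified illustration, not the regression actually run for Table 3.3: it uses a single regressor (lagged funding) and omits real income, the series are synthetic and constructed so that the true lag is 12 years, and the names `r_squared` and `best_lag` are hypothetical. Each entry `improvement[t]` is assumed to already hold the 5-year YLL improvement beginning in year t.

```python
def r_squared(x, y):
    """R^2 of a one-regressor least-squares fit of y on x (squared correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def best_lag(funding, improvement, lags=range(9, 17)):
    """Return (lag, R^2) maximizing R^2 of improvement[t] on funding[t - lag]."""
    best = None
    for q in lags:
        years = [t for t in improvement if t - q in funding]
        if len(years) < 3:
            continue  # too few observations to fit a line meaningfully
        x = [funding[t - q] for t in years]
        y = [improvement[t] for t in years]
        score = r_squared(x, y)
        if best is None or score > best[1]:
            best = (q, score)
    return best


# Synthetic, purely illustrative series (not NIH data): improvements are built
# to respond to funding exactly 12 years earlier.
funding = {year: 100 + 7 * ((4 * year) % 11) for year in range(1960, 2004)}
improvement = {year: 2.0 * funding[year - 12] for year in range(1980, 2004)}

lag, score = best_lag(funding, improvement)
print(lag, round(score, 3))
```

With a second regressor such as real income, the same loop would apply, with `r_squared` replaced by the fit statistic of a multiple regression.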
This procedure is, of course, a crude but systematic heuristic for relating research
funding to YLL outcomes. Alternatives include using a single fixed lag across all
groups, simply assuming particular values for group-specific lags based on NIH man-
dates and experience, computing a time-weighted average YLL for each group with
a weighting scheme corresponding to an assumed or estimated knowledge-diffusion
rate for that group, or constructing a more accurate YLL return series by tracking
individual NIH grants within each group to determine the specific impact on YLL
(through new drugs, protocols, and other improvements in morbidity and mortality)
from the award dates to the present. While the choice of lag is critical in determining
the characteristics of the YLL return series and deserves further research, it does
not affect the applicability of the overall analytical framework. While our procedure
is surely imperfect, it is a plausible starting point from which improvements can be
made.
Assuming constant impact of research funding on YLL over the duration of p
years, the measure of the ROI that accrues to funds allocated in year t is then given
by:
\[
R_{t+q} \equiv -\,\frac{\dfrac{1}{p}\displaystyle\sum_{i=0}^{p-1}\left(\mathrm{YLL}_{t+q+i}-\mathrm{YLL}_{t+q+i-1}\right)\times \mathrm{GDP}_{t+q+i}}{\mathrm{Appropriation}_{t}} \tag{3.2}
\]
where the minus sign reflects the focus on decreases in YLL, and the multiplier
$\mathrm{GDP}_{t+q+i}$ is per capita real gross domestic product (GDP) in year $t+q+i$, which is
included to convert the numerator to a dollar-denominated quantity to match the
denominator. This ratio’s units are then comparable to those of typical investment
returns: date-(t+q) dollars of return per date-t dollars of investment.
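As a worked check of equation (3.2), the following sketch recomputes the HLB example of Table 3.4 from the YLL, per-capita GDP and appropriation figures reported there. The function name `roi` and the dictionary layout are illustrative assumptions; GDP is indexed by year t+q+i, following the summation in (3.2).

```python
def roi(yll, gdp_per_capita, appropriation, t, q, p=5):
    """ROI of funds appropriated in year t, following equation (3.2).

    yll and gdp_per_capita are dicts keyed by calendar year.
    """
    total = 0.0
    for i in range(p):
        year = t + q + i
        # Change in YLL over the year, converted to dollars by per-capita real GDP.
        total += (yll[year] - yll[year - 1]) * gdp_per_capita[year]
    # Minus sign: a decrease in YLL counts as a positive return.
    return -(total / p) / appropriation


# Inputs copied from Table 3.4 (HLB, funding year 1970, lag q = 16).
yll = {1985: 19_741_993, 1986: 19_380_387, 1987: 19_015_838,
       1988: 18_951_220, 1989: 18_086_872, 1990: 17_670_665}
gdp = {1986: 29_443, 1987: 30_115, 1988: 31_069, 1989: 31_877, 1990: 32_112}

hlb_roi = roi(yll, gdp, appropriation=695_809_705, t=1970, q=16)
print(round(hlb_roi, 1))  # prints 18.6, matching Table 3.4
```

Because the table entries are rounded, the intermediate dollar amounts differ slightly from those printed in Table 3.4, but the ROI agrees to the reported precision.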
Given the definition in equation (3.2) for the ROI of each of the disease groups,
the “optimal” appropriation of funds among those groups must be determined, i.e.,
the appropriation that produces the best possible aggregate expected return on total
research funding per unit risk. Denote by $\mathbf{R} \equiv [\,R_1 \;\; R_2 \;\; \cdots \;\; R_n\,]'$ the vector
of returns of all n groups for a given appropriation date t (where time subscripts
have been suppressed for notational simplicity), and denote by $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ the vector
of expected returns and the covariance matrix, respectively. If the weights of the
budget allocation among the groups are $\boldsymbol{\omega}$, the ROI for the entire portfolio of grants,
denoted by $R_p$, is given by $R_p = \boldsymbol{\omega}'\mathbf{R}$, and its expected value and variance are
$\boldsymbol{\omega}'\boldsymbol{\mu}$ and $\boldsymbol{\omega}'\boldsymbol{\Sigma}\boldsymbol{\omega}$, respectively. The objective function to be optimized is then given by
the expected value minus some multiple of the variance which reflects risk tolerance,
and this quadratic function of ω is maximized using standard quadratic optimization
techniques, subject to the constraint that the weights sum to 1.
In the mean-variance framework, we seek to find the best trade-offs between risk
and expected return by varying the portfolio weights ω to trace out the locus of
mean-variance combinations that cannot be improved upon, i.e., that are “efficient”.
This set of efficient portfolios, also known as the “efficient frontier”, is formally defined
as the curve in mean-variance (or mean-standard deviation) space corresponding to
all portfolios with the highest level of expected return for a given level of variance.
This efficient frontier defines the set of allocations that cannot be improved upon
from a mean-variance perspective, and the optimal allocation is a single point on this
frontier that is determined by the investor’s desired volatility level or risk tolerance.
More formally, an investor with standard mean-variance preferences is assumed to
prefer portfolios with greater expected return and lower variance, with diminishing
returns in each (so that, as the level of risk rises, progressively greater increments
of expected return must be offered to the investor to induce him to accept the same
increment of risk). Preferences of this type generate so-called “indifference
curves” (non-intersecting curves in mean-variance space that trace out combinations
of mean and variance among which an individual is indifferent) that are upward sloping
and convex. The optimal portfolio for a given set of indifference curves is the tangency
point T of the efficient frontier with the upper-leftmost attainable indifference curve.
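To compute the efficient frontier, the following optimization problem must be
solved (we maintain the following notational conventions: (1) all vectors are column
vectors unless otherwise indicated; (2) matrix transposes are indicated by a prime
superscript, hence $\boldsymbol{\omega}'$ is the transpose of $\boldsymbol{\omega}$; and (3) vectors and matrices are always
typeset in boldface, i.e., $X$ and $\mu$ are scalars and $\mathbf{X}$ and $\boldsymbol{\mu}$ are vectors or matrices):
\[
\min_{\boldsymbol{\omega}} \;\; \boldsymbol{\omega}'\boldsymbol{\Sigma}\boldsymbol{\omega}
\quad \text{subject to} \quad
\boldsymbol{\omega}'\boldsymbol{\mu} \ge \mu_o \,, \;\; \boldsymbol{\omega}'\boldsymbol{\iota} = 1 \,, \;\; \boldsymbol{\omega} \ge \mathbf{0}
\tag{3.3}
\]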
where $\boldsymbol{\iota}$ is an (n × 1)-vector of 1’s and $\mu_o$ is an arbitrary fixed level of expected
return. By varying $\mu_o$ over a range of values and solving the optimization problem
for each value, all the efficient allocations $\boldsymbol{\omega}^*$ may be tabulated, and the locus of
points in mean-standard-deviation space corresponding to these efficient allocations
is the efficient frontier. This so-called Markowitz portfolio optimization problem
involves minimizing a quadratic objective function with linear constraints, which is a
standard quadratic programming (QP) problem that can easily be solved analytically
in some cases[58], and numerically in all other cases by a variety of efficient and stable
solvers[37, 36].
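To make the frontier-tracing procedure concrete, the sketch below solves problem (3.3) approximately for three groups. It deliberately swaps the QP solver for a brute-force search over a grid of weights, which needs only the standard library and is practical only for very small n. The inputs `mu` and `sigma` are hypothetical numbers chosen for illustration, not estimates from this chapter, and all function names are assumptions.

```python
from itertools import product


def portfolio_mean(w, mu):
    return sum(wi * mi for wi, mi in zip(w, mu))


def portfolio_variance(w, sigma):
    n = len(w)
    return sum(w[i] * sigma[i][j] * w[j] for i in range(n) for j in range(n))


def simplex_grid(n, steps):
    """Yield all weight vectors with entries k/steps that are >= 0 and sum to 1."""
    for ks in product(range(steps + 1), repeat=n - 1):
        last = steps - sum(ks)
        if last >= 0:
            yield [k / steps for k in ks] + [last / steps]


def min_variance_weights(mu, sigma, mu_o, steps=100):
    """Approximate solution of (3.3) by exhaustive search over a weight grid."""
    best_w, best_var = None, float("inf")
    for w in simplex_grid(len(mu), steps):
        if portfolio_mean(w, mu) < mu_o - 1e-12:
            continue  # violates the expected-return constraint
        var = portfolio_variance(w, sigma)
        if var < best_var:
            best_w, best_var = w, var
    return best_w, best_var


# Hypothetical inputs for three groups (illustrative only).
mu = [1.0, 5.0, 10.0]
sigma = [[1.0, 0.2, 0.0],
         [0.2, 9.0, 3.0],
         [0.0, 3.0, 16.0]]

# Trace the frontier: one efficient allocation per target return mu_o.
frontier = []
for mu_o in [1.0, 3.0, 5.0, 7.0, 9.0]:
    w, var = min_variance_weights(mu, sigma, mu_o)
    frontier.append((mu_o, var ** 0.5, w))
    print(mu_o, round(var ** 0.5, 3), w)
```

Because raising the target return shrinks the feasible set, the minimized standard deviation can only stay the same or increase along the list, which is the upward-sloping shape of the efficient frontier. For realistic n, a proper QP solver would replace the grid search.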
One additional refinement is proposed to address the well-known issue of “corner
solutions” (in which several components of $\boldsymbol{\omega}^*$ are 0) that often arise in the standard
portfolio-optimization framework. While such extreme allocations may, indeed, be
optimal with respect to the mean-variance criterion, they are more often the result
of estimation error and outliers in the data[8]. Moreover, even in the absence of es-
timation error, mean-variance optimality may not adequately reflect other objectives
such as social equity across disease groups or distance from current status quo in al-
location. To incorporate such considerations, a “regularization” technique is applied
in which the objective function is penalized for allocations that are far away from the
average allocation policy. Specifically, we consider the following regularized version
of the standard portfolio-optimization problem:
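\[
\min_{\boldsymbol{\omega}} \;\; \boldsymbol{\omega}'\boldsymbol{\Sigma}\boldsymbol{\omega} + \gamma \,\lVert \boldsymbol{\omega} - \boldsymbol{\omega}_{\mathrm{NIH}} \rVert^2
\quad \text{subject to} \quad
\boldsymbol{\omega}'\boldsymbol{\mu} \ge \mu_o \,, \;\; \boldsymbol{\omega}'\boldsymbol{\iota} = 1 \,, \;\; \boldsymbol{\omega} \ge \mathbf{0}
\tag{3.4}
\]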
This formulation is essentially a dual-objective optimization problem in which the
first objective is to minimize the portfolio’s variance ($\boldsymbol{\omega}'\boldsymbol{\Sigma}\boldsymbol{\omega}$), the second objective
is to minimize the difference from the average NIH allocation policy ($\lVert \boldsymbol{\omega} - \boldsymbol{\omega}_{\mathrm{NIH}} \rVert^2$),
and the non-negative parameter γ determines the relative importance of these two
objectives. Larger values of γ yield optimal weights that are closer to average NIH
allocation but which correspond to portfolios with greater volatility, and smaller values
of γ yield optimal weights that may be more concentrated among a smaller subset of
groups, but which imply lower portfolio volatility.

Group   Lag   Mean    SD    Min    Med    Max   Skewness   Kurtosis
AID      10   -0.9   1.3   -4.3   -0.8    2.4       -0.3        5.2
CHD      12    5.1   3.8    0.2    4.4   11.1        0.1        1.4
CNS      10   -1.7   0.8   -3.1   -1.7   -0.4        0.1        1.9
DDK      11   -0.8   1.5   -3.8   -0.8    2.9        0.2        3.6
HLB      16    9.8   3.4    4.3    9.6   18.6        0.7        3.6
ONC      16    0.5   1.3   -2.3    1.2    2.1       -0.7        2.2
NMH       9    0.0   0.3   -0.8    0.0    0.6       -0.1        3.2
Table 3.3: Return summary statistics. Summary statistics for the ROI of disease
groups, in units of years (for the lag length) and per-capita-GDP-denominated
reductions in YLL between years t and t+4 per dollar of research funding in year t−q,
based on historical ROI from 1980 to 2003.

Year   YLL          ∆         GDP/Capita ($)   GDP-Weighted ∆ ($)
1985   19,741,993
1986   19,380,387   361,605   29,443           10,646,748,753
1987   19,015,838   364,549   30,115           10,978,408,090
1988   18,951,220    64,617   31,069            2,007,589,676
1989   18,086,872   864,348   31,877           27,552,827,900
1990   17,670,665   416,207   32,112           13,365,233,032

Mean GDP-Weighted YLL ∆ ($)   12,910,161,490
Lag (years)                   16
Funding year                  1970
Appropriation ($)             695,809,705
ROI                           18.6
Table 3.4: ROI example. An example of the ROI calculation for HLB from 1986.
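The trade-off governed by γ can be illustrated with a two-group special case of (3.4), for which the optimum has a closed form. The sketch below assumes that the expected-return constraint does not bind, writes the weights as (w, 1 − w), and uses hypothetical variances, covariance and reference weight; the function names are illustrative. Setting the derivative of the one-dimensional quadratic objective to zero gives w = (σ₂² − σ₁₂ + 2γ·w_ref) / (σ₁² − 2σ₁₂ + σ₂² + 2γ), which is then clipped to [0, 1].

```python
def regularized_min_variance(var1, var2, cov12, w_ref, gamma):
    """Weight on group 1 minimizing variance + gamma * ||w - w_ref||^2 for n = 2.

    Weights are (w, 1 - w); the expected-return constraint of (3.4) is assumed
    not to bind. w_ref is the reference (e.g. status-quo) weight on group 1.
    """
    # First-order condition of the one-dimensional quadratic in w.
    w = (var2 - cov12 + 2.0 * gamma * w_ref) / (var1 - 2.0 * cov12 + var2 + 2.0 * gamma)
    return min(1.0, max(0.0, w))  # enforce 0 <= w <= 1


def variance(w, var1, var2, cov12):
    return var1 * w * w + 2.0 * cov12 * w * (1.0 - w) + var2 * (1.0 - w) ** 2


# Hypothetical inputs (illustrative only).
var1, var2, cov12, w_ref = 1.0, 9.0, 0.5, 0.3

results = []
for gamma in [0.0, 1.0, 5.0, 50.0]:
    w = regularized_min_variance(var1, var2, cov12, w_ref, gamma)
    results.append((gamma, w, abs(w - w_ref), variance(w, var1, var2, cov12)))
    print(gamma, round(w, 3), round(abs(w - w_ref), 3),
          round(variance(w, var1, var2, cov12), 3))
```

In this example the distance to the reference weight shrinks and the portfolio variance rises as γ increases, which is the behavior described in the text.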
3.3 Results
3.3.1 Summary Statistics
Summary statistics of the ROI for the period 1980–2003 are presented in Table 3.3.
In Table 3.4 we provide an example of the ROI calculation for HLB for 1986, when
the return was 18.6.
Large differences in mean ROI for different Institutes are evident in Table 3.3,
ranging from small negative values (e.g., −1.7 for CNS) to large positive values (e.g.,
9.8 for HLB). Large differences in standard deviation also exist, ranging from 0.3 for
NMH to 3.8 for CHD.
[Figure 3-5 (image not reproduced): four panels, each with a horizontal axis from 0 to 4 and a vertical axis from −2 to 10. Panel titles: (a) With Alzheimer effect, gamma = 0; (b) With Alzheimer effect, gamma = 5; (c) Without Alzheimer effect, gamma = 0; (d) Without Alzheimer effect, gamma = 5. Legend: Efficient, AID, CHD, CNS, DDK, HLB, ONC, NMH, NIH Avg, 1/n, NIH−Var, NIH−Mean, Min−Var, Eff−25%, Eff−50%, Eff−75%.]
Figure 3-5: Efficient frontiers. Efficient frontiers for (a) all groups except HIV and
AMS, γ = 0; (b) all groups except HIV and AMS, γ = 5; (c) all groups except HIV
and AMS without the dementia effect, γ = 0; and (d) all groups except HIV and AMS
without the dementia effect, γ = 5; based on historical ROI from 1980 to 2003.
3.3.2 Efficient Frontiers
In Figure 3-5, efficient frontiers for the single- and dual-objective optimization
problems are plotted in mean-standard deviation space for the 7-group cases with and
without taking into account the dementia effect.
For each of these frontiers, the mean-standard deviation points for the following
funding allocations are also plotted:
(i) historical average NIH allocation for years 1996–2005;
(ii) equal-weighted (1/n) allocation;
(iii) minimum-variance allocation;
(iv) the allocation on the efficient frontier that has the same mean as the average
NIH allocation (the “NIH-mean” allocation);
(v) the allocation on the efficient frontier which has the same variance as the average
NIH allocation (the “NIH-var” allocation);
(vi) the allocation on the efficient frontier that is 25% of the distance from the
minimum variance allocation to the maximum expected-return allocation;
(vii) the allocation on the efficient frontier that is 50% of the distance from the
minimum variance allocation to the maximum expected-return allocation;
(viii) the allocation on the efficient frontier that is 75% of the distance from the
minimum variance allocation to the maximum expected-return allocation.
The region bounded by (i), (iv), (v) and the efficient frontier is of special interest
because all portfolios in this region offer lower variance, higher expected return, or
both when compared to the average NIH allocation, hence from a mean-variance
perspective such allocations are unambiguously preferable. These allocations are
called “dominating” portfolios relative to the average NIH allocation (i).
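The notion of a dominating portfolio can be stated as a simple predicate on (mean, variance) pairs. The sketch below is one possible reading of the definition above: a candidate dominates the benchmark if it is no worse in both dimensions and strictly better in at least one. The function name and the numerical pairs are hypothetical.

```python
def dominates(candidate, benchmark, tol=1e-12):
    """True if candidate is at least as good as benchmark in both mean and
    variance, and strictly better in at least one of the two."""
    mean_c, var_c = candidate
    mean_b, var_b = benchmark
    no_worse = mean_c >= mean_b - tol and var_c <= var_b + tol
    strictly_better = mean_c > mean_b + tol or var_c < var_b - tol
    return no_worse and strictly_better


# Hypothetical (mean, variance) pairs, for illustration only.
benchmark = (2.0, 1.5)   # stands in for the average NIH allocation
same_var = (3.1, 1.5)    # higher mean at the same variance
same_mean = (2.0, 0.9)   # same mean at lower variance
riskier = (4.0, 2.5)     # higher mean but also higher variance

print(dominates(same_var, benchmark),
      dominates(same_mean, benchmark),
      dominates(riskier, benchmark))
```

The third portfolio is not dominating: whether its extra expected return justifies its extra variance depends on risk tolerance, which is exactly the choice of a point on the frontier.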
Figure 3-5a shows that a number of the disease groups appear to be concentrated
in a relatively low-risk sector of the risk/reward universe, which may be evidence of
active variance-minimization strategies by various stakeholders.
A sensitivity analysis is conducted by estimating the efficient frontier with (Fig-
ure 3-5a) and without the dementia effect (Figure 3-5c). Table 3.5 contains the
portfolio weights corresponding to Figures 3-5a and 3-5c respectively.
                Benchmarks        Single-Objective Portfolios (in %)                          Dual-Objective Portfolios (γ = 5) (in %)
Group     NIH Avg      1/n    NIH-Var  NIH-Mean   Min-Var   Eff-25%   Eff-50%   Eff-75%    NIH-Var  NIH-Mean   Min-Var   Eff-25%   Eff-50%   Eff-75%
All Groups:
AID             8       14          0         0         0         0         0         0          0         3         5         0         0         0
CHD             7       14         24        11         0        13        27        36         18        11         7        18        28        34
CNS            14       14          0         0        25         0         0         0          7        15        19         8         0         0
DDK            10       14          0         0         0         0         0         0          0         5         8         0         0         0
HLB            17       14         23        11         0        13        32        55         24        14         9        23        39        59
ONC            27       14         53        28        16        33        42         9         34        33        32        34        31         7
NMH            16       14          0        50        58        41         0         0         17        20        21        17         2         0
Without Dementia:
AID             8       14          0         0         0         0         0         0          0         3         5         0         0         0
CHD             7       14         23        11         0        14        28        36         17        11         8        18        28        34
CNS            14       14          2        32        41        27         0         0         12        18        19        11         0         0
DDK            10       14          0         0         0         0         0         0          0         5         8         0         0         0
HLB            17       14         23        12         0        16        34        55         23        13         9        24        40        60
ONC            27       14         52        37        17        43        39         9         33        32        31        33        31         6
NMH            16       14          0         8        41         0         0         0         15        19        20        14         1         0
Table 3.5: Portfolio weights. Benchmark, single- and dual-objective optimal
portfolio weights (in percent), based on historical ROI from 1980 to 2003.
The top left sub-panel of Table 3.5 shows that the single-objective optimization
does yield sparse weights as expected. For example, the minimum-variance portfolio
allocates to only three groups: 58% to NMH, 25% to CNS, and 16% to ONC. By min-
imizing variance, irrespective of the mean, this portfolio allocates funding to groups
with least variability in YLL improvements. The efficient-25% portfolio allocates
non-zero weights to four groups (41% to NMH, 33% to ONC, 13% to HLB, and 13%
to CHD), and yields 26% better expected return with 28% less risk. With still more
emphasis on expected return, the efficient-50% portfolio gives non-zero weights only
to three successful groups: 42% to ONC, 32% to the higher risk, higher expected-
return HLB, and 27% to CHD. This portfolio has 172% higher expected return but
only 27% more risk than the NIH portfolio. The efficient-75% portfolio gives an even
higher weight of 55% to HLB, 36% to CHD, and 9% to ONC, yielding 318% higher
expected return and 148% more risk, a diminishing risk-adjusted expected return as
compared to portfolios with lower volatility. Given the greater emphasis on expected
return for this portfolio, it is not surprising to see HLB getting a bigger role due to
its apparent historical success in reducing YLL. Of course, whether or not past
success is indicative of comparable future success hinges on the science and associated
translational efforts underlying the diseases covered by HLB. This underscores the
importance of incorporating research and clinical insights into the funding allocation
process, especially within a systematic framework such as portfolio theory.
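The statement about diminishing risk-adjusted expected return can be checked with simple arithmetic on the percentages quoted above. The sketch assumes that all three pairs of percentages are measured relative to the average NIH allocation, that "risk" means standard deviation, and that the benchmark's expected return is positive; under those assumptions, the ratio of a portfolio's mean-to-risk ratio to the benchmark's is (1 + Δmean)/(1 + Δrisk). The dictionary and function names are illustrative.

```python
# Percentage changes relative to the average NIH allocation, as quoted in the
# text: (change in expected return, change in risk).
changes = {
    "Eff-25%": (0.26, -0.28),
    "Eff-50%": (1.72, 0.27),
    "Eff-75%": (3.18, 1.48),
}


def relative_return_per_risk(d_mean, d_risk):
    """(mean / risk) of the portfolio divided by (mean / risk) of the benchmark."""
    return (1.0 + d_mean) / (1.0 + d_risk)


ratios = {name: relative_return_per_risk(*c) for name, c in changes.items()}
for name, r in ratios.items():
    print(name, round(r, 2))
```

Under these assumptions the ratios are about 1.75, 2.14 and 1.69 respectively, so the efficient-75% portfolio offers the least expected return per unit of risk of the three, consistent with the text.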
However, the dementia effect may underestimate the performance of the CNS
disease group, hence the lower panel of Table 3.5 reports corresponding optimal-
portfolio results without the dementia effect. In the single-objective case, the efficient-
50% and 75% portfolios are still sparse, with non-zero weights in 3 groups, while the
lower risk efficient-25% portfolio is less concentrated with non-zero weights to 4 groups
and significant weight (27%) to the CNS group.
Table 3.5 also contains the optimal portfolios for the dual-objective case (with
γ = 5) in the right sub-panels (see Figures 3-5b and 3-5d). These cases correspond
to portfolios that trade off closeness to the average NIH allocation policy with better
risk-adjusted expected returns. Now we observe that, in both the upper and lower
sub-panels (the 7-group optimization with and without the dementia effect,
respectively), the weights are less concentrated than in the single-objective case. For
example, the minimum-variance portfolio without the dementia effect now allocates
funding to all the groups, with weights ranging from 5% to 31%. However, even in this
case, the efficient-75% portfolio is still extreme, allocating weights only to HLB, CHD
and ONC. Therefore, special care must be exercised in selecting the appropriate point
on the efficient frontier. We also observe from the NIH-var or NIH-mean portfolios
that slight changes to the average NIH policy apparently yield superior performance
in mean-standard deviation space (28% to 89% relative improvement, depending on
the assumptions).
3.4 Discussion
Portfolio theory provides a systematic framework for determining optimal research
funding allocations based on historical return on investment, variance, and correlation
between appropriations and reductions in disease burden. The optimization results
suggest that significant YLL improvements with respect to a mean-variance criterion
may be possible through funding re-allocation. To our knowledge, this is the first
time such an approach has been empirically implemented in this domain.
However, our findings must be qualified in at least three respects: (1) YLL as a
measure of burden of disease, which is clearly incomplete and less than ideal; (2) the
definition of ROI and the challenges of relating research expenditures to subsequent
outcomes such as burden of disease; and (3) the known limitations of portfolio the-
ory. While each of these qualifications can be addressed to varying degrees through
additional data and analysis, the empirical conclusions are likely to depend critically
on the nature of their resolution. In this section, we provide a short synopsis of these
qualifications, and also consider other objections to this framework and directions for
future research.
YLL captures only the most extreme form of disease burden, and other measures
such as disability-adjusted or quality-adjusted life years are clearly preferable. How-
ever, time series histories for such measures are currently unavailable; hence YLL is
the most natural starting point for gauging the impact of biomedical research funding,
and is directly aligned with the NIH mission to “lengthen life”.
As a measure of disease burden, YLL captures only lethal illness by definition;
chronic illness enters the optimization process only indirectly, mortality in the young
is more heavily weighted than that of the elderly, and quality-of-life is not captured at
all. The choice of YLL is motivated by several factors: long time-series observations
of YLL are readily available, they cover a large population, and they address the
entire spectrum of diagnoses categorized under the ICD. Broader measures of burden
of disease such as disability-adjusted life years (DALY)[38] and quality-adjusted life
years (QALY)[68, 22] have been proposed, but historical time series for such measures
are not yet available. As better measures are developed (e.g., incidence, prevalence,
physician visits, hospitalization, DALY, QALY), portfolio-optimization methods may
be applied to them as well through appropriately defined “returns”. Should datasets
covering not only age and cause of death but also ante-mortem symptoms become
available, mean-variance-efficient allocations would likely place significant weight on
improvements in the care of less-lethal chronic diseases.
Even if YLL is an appropriate measure of disease burden, our definition of ROI
can also be challenged as being imprecise and ad hoc in several respects. NIH funding
is typically focused on basic research rather than translational efforts; therefore, NIH
spending may not be as directly related to subsequent YLL improvements. We have
not accounted for other expenditures that may also affect YLL, and to the extent
that NIH appropriations are systematically used to complement private spending
to allocate total funding across diseases more fairly[76], the relation between NIH
funding and subsequent YLL improvements may be even noisier, and may require
modelling private-sector expenditures as a separate but complementary portfolio-
optimization problem with an objective function and constraints that are linked to
those of the NIH. Also, the standard portfolio-optimization framework implicitly as-
sumes a constant multiplicative relation between dollars invested today and dollars
returned tomorrow (so that doubling the investment will typically double the ROI of
that investment), whereas the return to biomedical investments may be non-linear.
In addition, translational research takes time and significant non-NIH resources, fur-
ther blurring the relation between NIH allocations and subsequent changes in YLL.
Finally, other factors may contribute to YLL improvements, including changes in cul-
tural norms (including consumption of alcohol and cigarettes), economic conditions
(such as recessions vs. expansions), and public policy (such as vaccine programs and
mandates for automobile, home, and workplace safety). While all of these qualifica-
tions have merit, they are not insurmountable obstacles and can likely be addressed
through additional data collection and more sophisticated metrics, perhaps along the
lines of Porter[67] or Lane and Bertuzzi[51]. Moreover, the portfolio-optimization
approach provides a useful conceptual framework for formulating funding allocation
decisions systematically, even if its empirical implications are imprecise.
The estimates of q were an initial attempt to link appropriation with outcome in
a systematic and non-discretionary manner, but they were derived heuristically from
regulatory, appropriation, and epidemiological data which may not be stationary or
predictive. For example, if the Food and Drug Administration’s capacity for reviewing
new-drug applications is held constant and applications double, substantial increases
in regulatory queuing would be expected, even with the added resources generated by
the Prescription Drug User Fee Act. Finally, in converting changes in YLL to dollar
amounts, per-capita real GDP was used as the “conversion factor” irrespective of age,
despite the fact that children and retired individuals are economically less active.
While these caveats highlight the imprecision with which the impact of research
spending is measured, they also provide direction for developing better metrics. In
particular, the underlying science of each grant implies a particular set of dynamics for
translation and YLL impact, and with more sophisticated models of such dynamics,
the returns to fundamental research should be measurable with greater accuracy.
Even within the exact domain for which it was developed, portfolio theory has
several well-known limitations, of which the most obvious is the possibility that the
mean-variance criterion may not, in fact, be the appropriate objective function to
be optimized. While there is little disagreement that higher expected ROI is prefer-
able to its alternative, the trade-off between expected ROI and risk is fraught with
subtleties involving specific psychological, perceptual, and behavioural mechanisms
of individuals and groups. Because of these considerations, mean-variance analysis is
often considered an approximation to a much more complicated reality—a starting
point for investment allocation decisions, not the final answer.
Another known limitation of portfolio theory is the fact that the input parameters
(µ, Σ) must be estimated from historical data, and estimation error in these param-
eter estimates can lead to portfolios that are unstable and sub-optimal[55]. One
common approach to addressing this problem in the financial context is to employ
prior information regarding the input parameters, thereby reducing the dependence
on historical data. Using Bayesian methods, expert opinions regarding the statisti-
cal properties of the individual asset returns can be incorporated into the portfolio
optimization process[5, 12, 13].
One limitation that is unique to the current application is the fact that portfolio
theory is silent on which mean-variance-optimal portfolio to select. In the financial
context, the existence of a riskless investment (e.g., U.S. Treasury bills) implies that
one unique portfolio on the efficient frontier will be desired by all investors—the so-
called “tangency” portfolio[73]. Because there is no analog to a riskless investment in
biomedical research, the notion of a tangency portfolio does not exist in this context.
Therefore, decision makers must first determine society’s collective preferences for risk
and return with respect to changes in YLL before a unique solution to the portfolio-
optimization problem can be obtained, i.e., they must agree on a societal “utility
function” for trading off the risks and rewards of biomedical research.
This critical step is a pre-requisite to any formal analysis of funding allocation
decisions, and underscores the need for integration of basic science with biomedical
investment performance analysis and science policy. Such integration will require
close and ongoing collaboration between scientists and policymakers to determine the
appropriate parameters for the funding allocation process, and to incorporate prior
information and qualitative judgments[14] regarding likely research successes, social
priorities, policy objectives and constraints, and hidden correlations due to non-linear
dependencies not captured by the data. In particular, it is easy to imagine contexts
in which funding objectives can and should change quickly in response to new envi-
ronmental threats or public-policy concerns. However, such pressing needs must be
balanced against the disruptions—which can be severe due to the significant adjust-
ment costs implicit in biomedical research[32]—caused by large unanticipated positive
or negative shifts in research funding. Although the end result of collaborative discus-
sion may fall short of a well-defined objective function that yields a clear-cut optimal
portfolio allocation, the portfolio-optimization process provides a transparent and ra-
tional starting point for such discussions, from which several insights regarding the
complex relation between research funding and social outcomes are likely to emerge.
Any repeatable and transparent process for making funding allocation decisions—
especially one that involves criteria other than peer-review-based academic excellence—
will, understandably, be viewed with some degree of suspicion and contempt by the
scientific community. However, if one of the goals of biomedical research is to reduce
the burden of disease, some tension between academics and public policy may be
unavoidable. Moreover, in the absence of a common framework for evaluating the
trade-offs between academic excellence and therapeutic potential, other approaches
such as political earmarking[2] are being proposed, which may be even less palatable
from the scientific perspective.
In an environment of tightening budgets and increasing oversight of appropria-
tions, portfolio theory offers scientists, policymakers, and regulators—all of whom
are, in effect, research portfolio managers—a rational, systematic, transparent, and
reproducible framework in which to explicitly balance and trade off expected benefits
with potential risks while accounting for correlation among multiple research agen-
das and real-world constraints in allocating scarce resources. Most funding agencies
and scientists have already been making such trade-offs informally and heuristically;
there may be additional benefits to making such decisions within an explicit frame-
work based on standardized and objective metrics.
One of the most significant benefits from adopting such a framework may be the
reduction of uncertainty surrounding future funding-allocation decisions, which would
greatly enhance the ability of funding agencies and scientists to plan for the future and
better manage their respective budgets, research agendas, and careers. By approach-
ing funding decisions in a more analytical fashion, it may be possible to improve their
ultimate outcomes while reducing the chances of unintended consequences.
Chapter 4
Impact of model misspecification
and risk constraints on market
In Chapters 1 and 2 we studied the optimal trading strategy of a risk averse investor who
faces risk constraints and model misspecification. In this chapter we will study how
risk constraints and fear of model misspecification affect the statistical properties of
the market returns. In particular, we will study their effect on the risk premium, the
volatility and liquidity of the market.
We find that the statistical properties of the market change. In particular,
variability of the risk constraints leads to increasing risk premium, increasing volatility
and increasing illiquidity of the market. In addition, tightening of these constraints
also leads to increasing risk premium, increasing volatility and increasing illiquidity.
Moreover, we find that variability in risk aversions along with risk constraints also
leads to a more concave pricing function of the aggregate supply for the market,
implying increasing risk premium, increasing volatility and increasing illiquidity. Finally,
we explore how the properties of the asset returns change when the investors do not
completely trust their models. We find that model misspecification is another source
of increasing risk premium, endogenous volatility and increasing illiquidity.
In the rest of this chapter, we first review the relevant literature. We then
describe the setup of the model and analyze the impact on the market returns of
varying risk constraints, varying risk aversions and varying degrees of fear of model
misspecification across agents. Finally, we conclude with a summary of our results.
4.1 Literature review
In the literature there are papers that assume heterogeneity along three dimensions:
risk aversion coefficients of the agents, constraints the agents face and beliefs of the
agents.
Danielsson and Zigrand [81] assume that the agents have the same beliefs and face
the same constraints but they differ in their risk aversion coefficients. They study
the economic implications of a Value-at-Risk-based regulatory system by analyzing a
two-period multi-asset general equilibrium model with agents heterogeneous in risk
preferences and wealth. They assume that the agents have CARA utilities and they
argue that there will be endogenous volatility and increasing risk premium due to the
fact that “... risk will have to be transferred from the more risk-tolerant to the more
risk-averse”. As we will prove, this is neither true nor necessary for increasing
risk premium, endogenous volatility and increasing illiquidity.
Kogan and Uppal [50] show how to analyze the equilibrium prices and policies in
an economy with incomplete financial markets and stochastic investment opportunity
set, where the agents face portfolio constraints. They study a general equilibrium
exchange economy with multiple agents, who differ in the risk aversion coefficients
and face borrowing constraints, while having the same beliefs.
Brumm et al. [21] consider a general equilibrium infinite-horizon economy with
the agents having heterogeneous risk preferences and facing the same constraints,
while having the same beliefs. They find that the presence of collateral constraints
leads to strong excess volatility and that regulation of margin requirements potentially
has stabilizing effects.
Then, there are papers with different prior beliefs not due to asymmetric infor-
mation among the agents. Geanakoplos [33] assumes that the agents have different
priors (optimists, pessimists) but same risk aversion coefficients and wealth and they
face identical collateral constraints. He studies how these constraints determine an
equilibrium leverage and how this leverage changes over time leading to crashes and
boom periods, the so-called leverage cycles.
Chen, Hong and Stein [25] study what happens to the price of a risky asset, when
there are investors with heterogeneous priors who face short sales constraints. The
idea that short sales constraints increase the prices of risky assets when the investors
have heterogeneous beliefs is due to Lintner [52] and Miller [61]. Chen, Hong and
Stein show that greater dispersion of beliefs leads to even higher prices.
Finally, Hansen and Sargent [41] study a framework where the agents have a
common approximating model, but they differ in the degree of mistrust of the model.
They find that agents’ caution in responding to concerns about model misspecification
can raise the prices assigned to macroeconomic risks.
We will see how risk constraints affect the statistical properties of the market,
in particular the risk premium, volatility and liquidity of the market. We will first
study the case where the investors differ in the constraints they face and/or their risk
aversion coefficients and then the case where they mistrust the model of asset payoffs
and the mistrust varies among the investors.
4.2 Analysis
4.2.1 Model setup
We assume we have H mean-variance single-period optimizers with heterogeneous
risk aversions, risk constraints and wealth. Each agent can invest in the market and
the risk free rate at t = 0. We assume that the risk-free rate is exogenously given
and the market is modeled as a risky asset with stochastic payoff at t = 1. There are
also noise traders. We do not model their utility explicitly, we only assume that they
are hit by random liquidity shocks and they submit random market orders at time
t = 0. Equivalently, the supply of the risky asset is stochastic. Each agent faces a
risk constraint, a constraint in his wealth volatility of the form |θh| ≤ LhWh0, where
θh is the position of the h agent in the market, Wh0 is the initial wealth of agent h
and Lh determines the tightness of the risk constraint that the h agent faces.
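Each agent’s wealth at time t = 1 is given by:
\[ W_{h1} = d\,\theta_h + (W_{h0} - q\,\theta_h) R_f \]
where $d$ is the payoff of the risky asset (market), $\theta_h$ is the number of shares, $q$ is the
price of the risky asset and $R_f$ is the risk-free rate. The expected return on wealth, $R = W_{h1}/W_{h0}$, is:
\[ E(R) = \frac{\hat{\mu}\,\theta_h + (W_{h0} - q\,\theta_h) R_f}{W_{h0}} \]
where $\hat{\mu}$ is the expected payoff of the risky asset. We also have:
\[ \mathrm{var}(R) = \frac{\hat{\sigma}^2 \theta_h^2}{W_{h0}^2} \]
where $\hat{\sigma}$ is the volatility of the payoff of the risky asset.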
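Each agent solves the following optimization problem:
\[
\begin{aligned}
\underset{\theta_h}{\text{maximize}} \quad & (\hat{\mu} - q R_f)\,\frac{\theta_h}{W_{h0}} - \frac{1}{2}\,\gamma_h \hat{\sigma}^2 \left(\frac{\theta_h}{W_{h0}}\right)^2 \\
\text{subject to} \quad & |\theta_h| \le L_h W_{h0}
\end{aligned}
\]
where $\gamma_h$ is the agent’s risk aversion coefficient.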
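By solving the KKT conditions we find:
\[ \theta_h^{opt} = \frac{\hat{\mu} - q R_f}{\gamma_h}\,\frac{W_{h0}}{\hat{\sigma}^2 (1 + \lambda_h)} \tag{4.1} \]
where $1 + \lambda_h = \max\left(1, \dfrac{|\hat{\mu} - q R_f|}{L_h \gamma_h \hat{\sigma}^2}\right)$.
The solution can also be written as:
\[
\theta_h^{opt} =
\begin{cases}
\dfrac{\hat{\mu} - q R_f}{\gamma_h}\,\dfrac{W_{h0}}{\hat{\sigma}^2} & \text{if } \dfrac{\hat{\mu} - L_h \gamma_h \hat{\sigma}^2}{R_f} \le q \le \dfrac{\hat{\mu} + L_h \gamma_h \hat{\sigma}^2}{R_f} \\[2ex]
L_h W_{h0} & \text{if } q \le \dfrac{\hat{\mu} - L_h \gamma_h \hat{\sigma}^2}{R_f} \\[2ex]
-L_h W_{h0} & \text{if } q \ge \dfrac{\hat{\mu} + L_h \gamma_h \hat{\sigma}^2}{R_f}
\end{cases}
\tag{4.2}
\]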
A competitive equilibrium is a set of portfolios (θ1, · · · , θH) and a price q such
that:
• Markets clear
• Each agent’s portfolio is optimal
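The market clearing condition implies that:
\[ \sum_{h \in H} \theta_h^{opt} = \theta_\alpha \;\Rightarrow\; q = \frac{\hat{\mu} - \Psi \hat{\sigma}^2 \theta_\alpha}{R_f} \]
where $\theta_\alpha$ is the aggregate supply of the risky asset and $\dfrac{1}{\Psi} \equiv \sum_{h \in H} \dfrac{1}{\gamma_h}\,\dfrac{W_{h0}}{1 + \lambda_h}$.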
We will consider two special cases:
• Constraints vary across the agents
• Risk aversion relative to wealth varies across the agents
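As an illustration, the equilibrium defined above can be computed numerically. The following minimal Python sketch clips each agent's unconstrained demand as in equation 4.2 and bisects on the price q until the market clears. The class name `Agent`, the function names and all parameter values in the usage note are assumptions made for illustration; they do not come from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    gamma: float   # risk aversion coefficient gamma_h
    wealth: float  # initial wealth W_h0
    L: float       # tightness L_h of the risk constraint


def demand(agent, q, mu, sigma2, Rf):
    """Position from equation 4.2: unconstrained demand clipped to +/- L_h * W_h0."""
    cap = agent.L * agent.wealth
    unconstrained = (mu - q * Rf) * agent.wealth / (agent.gamma * sigma2)
    return max(-cap, min(cap, unconstrained))


def clearing_price(agents, supply, mu, sigma2, Rf, tol=1e-12):
    """Bisect on q until total demand equals the aggregate supply."""
    total_cap = sum(a.L * a.wealth for a in agents)
    if abs(supply) >= total_cap:
        raise ValueError("no equilibrium: supply is not below the total position cap")
    # At `lo` every agent is at +cap, at `hi` every agent is at -cap.
    widest = max(a.L * a.gamma * sigma2 for a in agents)
    lo, hi = (mu - widest) / Rf, (mu + widest) / Rf
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        excess = sum(demand(a, mid, mu, sigma2, Rf) for a in agents) - supply
        if excess > 0:
            lo = mid   # demand exceeds supply, so the price must rise
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With five agents of unit wealth and unit risk aversion, caps L = 10, 20, 30, 40, 50, and the assumed values mu = 1.5, sigma2 = 0.01, Rf = 1, a supply of 40 leaves every agent unconstrained and gives a price of approximately 1.42, while a supply of 100 constrains the first two agents and gives approximately 1.2667.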
4.2.2 Varying constraints
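In the first case, we assume that the agents face heterogeneous constraints. In particular,
we assume that the parameter $\hat{L}_h \equiv L_h W_{h0}$ varies across the agents, while $\gamma_h / W_{h0}$
is constant and equal to $\gamma$. Without loss of generality, we assume that $\hat{L}_1 \le \hat{L}_2 \le \cdots \le \hat{L}_H$.
From the market clearing condition, we have that $\Psi\theta_\alpha = \dfrac{\hat{\mu} - q R_f}{\hat{\sigma}^2}$. In addition, from
equation 4.1, we have that:
\[ \frac{\hat{\mu} - q R_f}{\hat{\sigma}^2} = \theta_h^{opt}\,\frac{\gamma_h}{W_{h0}}\,(1 + \lambda_h) = \theta_h^{opt}\,\gamma\,(1 + \lambda_h) \]
Therefore, it is: $\Psi\theta_\alpha = \theta_h^{opt}\,\gamma\,(1 + \lambda_h) \;\; \forall h \in H$.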
We perform a sensitivity analysis for two cases:
• Keep $\hat{L}_h$ constant and change $\theta_\alpha$. By changing the aggregate supply, we find
that:
– $\Psi\theta_\alpha$ is a piecewise linear convex increasing function of the aggregate supply.
– Its slope is given by $\dfrac{\gamma}{H - i + 1}$ until the constraint binds for agent $i$. In other
words, at each point it is equal to $\gamma$ over the number of agents for whom
the constraint is not binding yet.
– The constraint binds for agent 1 when $\theta_{\alpha,1} = H\hat{L}_1$ and for agent $i$ when
$\theta_{\alpha,i} = \theta_{\alpha,i-1} + (H - i + 1)(\hat{L}_i - \hat{L}_{i-1})$. These are the kink points in Figures
4-1, 4-2, 4-3.
– When the aggregate supply is greater than $\theta_\alpha = \sum_{h \in H} \hat{L}_h$ there is no equilibrium,
since in that case all the agents are constrained in their positions
and they cannot buy any more shares and therefore the market cannot
clear.
– Since $q = \dfrac{\hat{\mu} - \Psi\hat{\sigma}^2\theta_\alpha}{R_f}$, we see that the pricing function is a piecewise linear
concave decreasing function of the aggregate supply, as we see in Figures
4-1, 4-2, 4-3.
– Therefore, variability of the constraints leads to increasing risk premium,
increasing volatility and increasing illiquidity, since a small change in the
aggregate supply (a small liquidity shock by the noise traders) leads to a
larger change in the price of the risky asset compared with the case where
there is no variability in the constraints the different agents face. Figure
4-1 shows the pricing function when the agents face the same constraints
and when they face variable constraints. Figure 4-3 also shows the pricing
function of the risky asset when the agents face two sets of constraints with
the same mean but with different variability.
[Figure 4-1 plot: "Price of the risky asset". Horizontal axis: Aggregate supply (0 to 150). Vertical axis: Price. Series: Variability in constraints; Same constraints.]
Figure 4-1: Price of the risky asset as a function of the aggregate market
supply under varying constraints. We assume that we have 5 agents with the
same risk aversion coefficients. The red plot assumes the same L = 30 for all the
agents, while the blue assumes L to be different across the agents L1 = 10, L2 =
20, L3 = 30, L4 = 40, L5 = 50.
• Keep θα constant and change L̂h. As we see in Figure 4-2 as we tighten the
constraints, the pricing function becomes more concave. Therefore, tightening
of the constraints leads to increasing risk premium, increasing volatility and
increasing illiquidity.
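The kink points and slopes listed above can be tabulated directly from the recursion. The sketch below is a minimal illustration, assuming a common normalized risk aversion and positive supply; the function names `kink_points` and `segment_slopes` are hypothetical and not part of the thesis.

```python
def kink_points(L_hat):
    """Aggregate supply levels at which each agent's constraint starts to bind.

    Implements theta_{a,1} = H * L_1 and
    theta_{a,i} = theta_{a,i-1} + (H - i + 1) * (L_i - L_{i-1}).
    """
    caps = sorted(L_hat)
    H = len(caps)
    kinks = [H * caps[0]]
    for j in range(1, H):  # j is zero-based, so H - j equals H - i + 1 for i = j + 1
        kinks.append(kinks[-1] + (H - j) * (caps[j] - caps[j - 1]))
    return kinks


def segment_slopes(gamma, H):
    """Slope of Psi * theta_alpha on each segment: gamma over the number of unconstrained agents."""
    return [gamma / (H - j) for j in range(H)]
```

For the caps used in Figure 4-1, `kink_points([10, 20, 30, 40, 50])` returns `[50, 90, 120, 140, 150]`; the last kink coincides with the sum of the caps, beyond which no equilibrium exists. With identical caps of 30 all kinks collapse to 150, so the pricing function has a single linear segment.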
[Figure 4-2 plot: "Price of the risky asset". Horizontal axis: Aggregate supply (0 to 150). Vertical axis: Price. Series: Different constraints Lh; Constraints reduced by a factor 1/5.]
Figure 4-2: Price of the risky asset as a function of the aggregate market
supply under tightening constraints. We assume that we have 5 agents with the
same risk aversion coefficients. The blue plot assumes L to be different across the
agents L1 = 10, L2 = 20, L3 = 30, L4 = 40, L5 = 50 and the red assumes that each Li
is reduced by 20%.
[Figure 4-3 plot: "Price of the risky asset". Horizontal axis: Aggregate supply (0 to 150). Vertical axis: Price. Series: Different constraints Lh; Less variable constraints by a factor of 2.]
Figure 4-3: Price of the risky asset as a function of the aggregate market
supply with less variable constraints. We assume that we have 5 agents with
the same risk aversion coefficients. The blue plot assumes L to be different across
the agents L1 = 10, L2 = 20, L3 = 30, L4 = 40, L5 = 50 and the red assumes that
L1 = 20, L2 = 25, L3 = 30, L4 = 35, L5 = 40.
4.2.3 Varying risk aversions
So far we have assumed that the agents have heterogeneous constraints but the same
normalized risk aversions. Now, we assume that the risk aversion relative to wealth
$\hat{\gamma}_h \equiv \gamma_h / W_{h0}$ varies across the agents, while the parameters $\hat{L}_h$ are constant and equal to $L$.
Without loss of generality, we assume that $\hat{\gamma}_1 \le \hat{\gamma}_2 \le \cdots \le \hat{\gamma}_H$. As we proved in the
previous section, we have: $\Psi\theta_\alpha = \theta_h^{opt}\,\hat{\gamma}_h\,(1 + \lambda_h) \;\; \forall h \in H$.
We perform a sensitivity analysis for two cases:
• Keep $\hat{\gamma}_h$ constant and change $\theta_\alpha$. By changing the aggregate supply, we find
that:
– $\Psi\theta_\alpha$ is a piecewise linear convex increasing function of $\theta_\alpha$.
– Its slope is $\left(\sum_{h=i}^{H} \dfrac{1}{\hat{\gamma}_h}\right)^{-1}$ until the constraint binds for agent $i$. In other words,
at each point it is equal to the aggregate risk aversion of the agents whose
constraint is not binding yet.
– The constraint binds for agent $i$ when $\theta_{\alpha,i} = iL + \sum_{k=i+1}^{H} L\,\dfrac{\hat{\gamma}_i}{\hat{\gamma}_k}$. These are
the kink points in Figures 4-4, 4-5.
– When the aggregate supply is greater than $\theta_\alpha = HL$ there is no equilibrium,
since in that case all the agents are constrained in their positions,
they cannot buy any more shares and therefore the market cannot clear.
– Since $q = \dfrac{\hat{\mu} - \Psi\hat{\sigma}^2\theta_\alpha}{R_f}$, we see that the pricing function is a piecewise linear
concave decreasing function of the aggregate supply, as we see in Figures
4-4, 4-5.
– Therefore, variability of the risk aversion coefficients and constraints leads
to increasing risk premium, increasing volatility and increasing illiquidity,
since a small change in the aggregate supply (a small liquidity shock by
the noise traders) leads to a larger change in the price of the risky asset
compared with the case where there is variability in the risk aversion
coefficients but no constraints. Figure 4-4 shows the pricing function of the
risky asset when the agents with different risk aversion coefficients face
constraints and when they do not face any constraints.
[Figure 4-4 plot: "Price of the risky asset". Horizontal axis: Aggregate supply (0 to 150). Vertical axis: Price. Series: Variability in risk aversions; No constraints.]
Figure 4-4: Price of the risky asset as a function of the aggregate market
supply with constraints and varying risk aversions. We assume that we
have 5 agents with same constraints but different risk aversion coefficients. The blue
plot assumes L = 30 for each agent, while the red line assumes that the agents are
unconstrained.
• Keep θα constant and change L. As we see in Figure 4-5 as we tighten the
constraints, the pricing function becomes more concave. Therefore, tightening
of the constraints leads to increasing risk premium, increasing volatility and
increasing illiquidity.
[Figure 4-5 plot: "Price of the risky asset". Horizontal axis: Aggregate supply (0 to 150). Vertical axis: Price. Series: Variability in risk aversions; Tighter constraints.]
Figure 4-5: Price of the risky asset as a function of the aggregate market
supply with tightening constraints and varying risk aversions. We assume
that we have 5 agents with same constraints but different risk aversion coefficients.
The blue plot assumes L = 30 for each agent, while the red line assumes that L = 20
for each agent.
4.2.4 Varying constraints and risk aversions
Now we assume that both the risk aversion relative to wealth $\hat{\gamma}_h \equiv \gamma_h / W_{h0}$ and the
parameters $\hat{L}_h$ vary across the agents. Without loss of generality we assume that
$\hat{\gamma}_1\hat{L}_1 \le \hat{\gamma}_2\hat{L}_2 \le \cdots \le \hat{\gamma}_H\hat{L}_H$. By changing the aggregate supply, we find that:
• $\Psi\theta_\alpha$ is a piecewise linear convex increasing function of $\theta_\alpha$.
• Its slope is $\left(\sum_{h=i}^{H} \dfrac{1}{\hat{\gamma}_h}\right)^{-1}$ until the constraint binds for agent $i$. In other words, at each
point it is equal to the aggregate risk aversion of the agents whose constraint is
not binding yet.
• The constraints bind in increasing order of $\hat{\gamma}_i\hat{L}_i = \gamma_i L_i$. The constraint binds for agent
$i$ when $\theta_{\alpha,i} = \sum_{k=1}^{i} \hat{L}_k + \sum_{k=i+1}^{H} \hat{L}_i\,\dfrac{\hat{\gamma}_i}{\hat{\gamma}_k}$.
• When the aggregate supply is greater than $\theta_\alpha = \sum_{h \in H} \hat{L}_h$ there is no equilibrium,
since in that case all the agents are constrained in their positions, they
cannot buy any more shares and therefore the market cannot clear.
• The pricing function of the risky asset is a piecewise linear concave decreas-
ing function of the aggregate supply. That means increasing risk premium,
increasing volatility and increasing illiquidity.
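The general case can be sketched in a few lines of Python by walking through the segments in the order in which the constraints bind. This is an illustrative sketch for positive aggregate supply only; the function names `psi_theta` and `price` and the numerical values in the usage note are assumptions, not part of the thesis.

```python
def psi_theta(supply, gamma_hat, L_hat):
    """Return Psi * theta_alpha for a supply between 0 and the sum of the caps."""
    # Constraints bind in increasing order of gamma_hat_i * L_hat_i.
    agents = sorted(zip(gamma_hat, L_hat), key=lambda a: a[0] * a[1])
    if supply > sum(cap for _, cap in agents):
        raise ValueError("no equilibrium: supply exceeds the sum of the position caps")
    bound = 0.0  # shares held by agents whose constraint already binds
    for i, (g, cap) in enumerate(agents):
        # An unconstrained agent k holds x / gamma_hat_k, where x = Psi * theta_alpha.
        inv_sum = sum(1.0 / gk for gk, _ in agents[i:])
        x = (supply - bound) / inv_sum
        if x <= g * cap:   # agent i is still unconstrained, so x solves market clearing
            return x
        bound += cap       # agent i sits at its cap; move to the next segment
    return agents[-1][0] * agents[-1][1]  # supply equals the total cap up to rounding


def price(supply, gamma_hat, L_hat, mu, sigma2, Rf):
    """Pricing function q = (mu - sigma2 * Psi * theta_alpha) / Rf."""
    return (mu - sigma2 * psi_theta(supply, gamma_hat, L_hat)) / Rf
```

For two agents with normalized risk aversions 1 and 2 and a common cap of 10, the first agent's constraint binds at a supply of 15, where `psi_theta` equals 10; at a supply of 18 it equals 16, so the slope on the second segment is 2, the risk aversion of the one agent who is still unconstrained.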
4.2.5 Varying fear of model misspecification
We now study how fear of model misspecification across the agents affects the
statistical properties of risky assets. We slightly modify our initial model setup. In
particular, we assume we have H investors with CARA utilities, heterogeneous risk
aversions and wealth. Each investor can invest in N risky assets and the risk-free
rate at t = 0. Each agent mistrusts the model that describes the payoff distribution
of the risky assets. All the agents have a common nominal model but they have
heterogeneous fears of model misspecification. We assume that the risk-free rate is
exogenously given and the common approximating model describing the risky assets’
payoff distribution at t = 1 is $N(\hat{\mu}, \hat{\Sigma})$. There are also noise traders. We do not
model their utility explicitly; we only assume that they are hit by random liquidity
shocks and that they submit random market orders at time t = 0. Equivalently, the supply
of the risky assets is stochastic.
Each agent’s wealth at time t = 1 is given by:
\[ W_{h1} = d^T\theta_h + (W_{h0} - q^T\theta_h) R_f \]
where $d \in \mathbb{R}^N$ describes the payoff of the N risky assets, $\theta_h$ describes the number of
shares of the assets, $q \in \mathbb{R}^N$ is the vector of prices of the risky assets and $R_f$ is the
risk-free rate.
The common approximating model of the payoff distribution is $d = \hat{\mu} + \hat{\Sigma}^{1/2}\epsilon$, where
$\epsilon$ follows the standard multivariate Gaussian distribution. The alternative models
alter the distribution of $\epsilon$. In particular, similarly to the framework described in
Chapter 2, they change the mean of the shocks and they assume that the distribution
of the shock is $N(h, I)$. Therefore, under the alternative models the payoff is
$d = \hat{\mu} + \hat{\Sigma}^{1/2}(\epsilon + h)$, i.e. $d$ follows $N(\hat{\mu} + \hat{\Sigma}^{1/2}h, \hat{\Sigma})$. The relative entropy of the alternative
distributions with respect to the nominal distribution is given by $D(Q) = \frac{1}{2}h^T h$,
as we have shown in Chapter 2 and proved in the Appendix. Under the alternative
distributions the certainty equivalent of the investors with CARA utilities is
\[ \gamma(W_0 - q^T\theta)R_f + \gamma\hat{\mu}^T\theta - \frac{1}{2}\gamma^2\theta^T\hat{\Sigma}\theta + \gamma\theta^T\hat{\Sigma}^{1/2}h. \]
We assume each investor solves the following optimization problem:
\[ \max_{\theta}\,\min_{h}\;\; \gamma(W_0 - q^T\theta)R_f + \gamma\hat{\mu}^T\theta - \frac{1}{2}\gamma^2\theta^T\hat{\Sigma}\theta + \gamma\theta^T\hat{\Sigma}^{1/2}h + \lambda\,\frac{1}{2}h^T h \tag{4.3} \]
where $\gamma$ is the risk aversion coefficient of the investor and $\lambda$ is the multiplier that
penalizes the relative entropy of the alternative distribution.
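By solving this optimization problem (see Appendix) we find that:
\[ \theta_h^{opt} = \frac{\hat{\Sigma}^{-1}(\hat{\mu} - R_f q)}{\gamma_h\left(1 + \frac{1}{\lambda_h}\right)} \tag{4.4} \]
The market clearing condition for the risky assets implies that (see Appendix):
\[ \sum_{h \in H} \theta_h^{opt} = \theta_\alpha \;\Rightarrow\; q = \frac{\hat{\mu} - \Psi\hat{\Sigma}\theta_\alpha}{R_f} \]
where $\theta_\alpha \in \mathbb{R}^N$ is the aggregate supply of the risky assets and $\dfrac{1}{\Psi} \equiv \sum_{h \in H} \dfrac{1}{\gamma_h\left(1 + \frac{1}{\lambda_h}\right)}$.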
If the investors fully trusted their model dynamics, we would have
\[ q = \frac{\hat{\mu} - \Psi_{nr}\hat{\Sigma}\theta_\alpha}{R_f} \tag{4.5} \]
where now $\dfrac{1}{\Psi_{nr}} \equiv \sum_{h \in H} \dfrac{1}{\gamma_h}$.
So we see that the case where the agents mistrust their models is equivalent to the
case where they fully trust their models but have a higher effective risk aversion
$\gamma_{eff} = \gamma\left(1 + \frac{1}{\lambda}\right)$.
By changing the aggregate supply, we find that when the investors mistrust their
models and follow a robust decision rule, the slope of the pricing function of
the assets is larger than when the investors fully trust their models’
dynamics, since $\Psi > \Psi_{nr}$. If we consider the case where N = 1 and the risky asset
is the market, then we see that the risk premium, the volatility and the illiquidity go
up, since a small change in the aggregate supply (a small liquidity shock by the noise
traders) leads to a larger change in the price of the risky asset compared with the case
where the investors fully trust their models.
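The comparison between the robust and the fully trusting economy can be checked numerically for the single-asset case N = 1. The sketch below is only illustrative: the function names `aggregate_psi` and `market_price` and all numerical values are assumptions chosen for the example.

```python
def aggregate_psi(gammas, lambdas=None):
    """Aggregate coefficient Psi, with 1/Psi = sum_h 1 / (gamma_h * (1 + 1/lambda_h)).

    Passing lambdas=None corresponds to full trust in the model (Psi_nr).
    """
    if lambdas is None:
        effective = list(gammas)
    else:
        # Effective risk aversion gamma_eff = gamma * (1 + 1/lambda).
        effective = [g * (1.0 + 1.0 / lam) for g, lam in zip(gammas, lambdas)]
    return 1.0 / sum(1.0 / g for g in effective)


def market_price(mu, sigma2, supply, psi, Rf):
    """Single-asset pricing function q = (mu - psi * sigma2 * supply) / Rf."""
    return (mu - psi * sigma2 * supply) / Rf


gammas, lambdas = [2.0, 4.0], [1.0, 1.0]      # assumed illustrative values
psi_robust = aggregate_psi(gammas, lambdas)   # 8/3
psi_trust = aggregate_psi(gammas)             # 4/3
q_robust = market_price(1.0, 0.04, 1.0, psi_robust, 1.0)
q_trust = market_price(1.0, 0.04, 1.0, psi_trust, 1.0)
```

In this example the robust coefficient is twice the fully trusting one, so the same unit of supply depresses the price twice as much and the pricing function is twice as steep.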
4.3 Conclusions
Risk sensitive regulations have become the cornerstone of international financial reg-
ulations. They imply an upper bound on the wealth volatility for each investor. In
this chapter we studied how risk constraints and model misspecification affect the
statistical properties of the market returns. In particular, we studied their effect on
the risk premium, the volatility and liquidity of the market.
We studied the following cases:
• Variability of the risk constraints: This is the case where the agents face different
risk constraints. The more variability there is in the risk constraints, the
larger the risk premium, volatility and illiquidity of the market are. In addition,
tightening the constraints also leads to a higher risk premium, volatility and
illiquidity of the market.
• Variability in risk aversions: This is the case where the agents face the same
risk constraint but have different aversions to risk. Variability in risk aversions
along with the risk constraints also leads to a more concave pricing function of the
aggregate supply for the market, implying increasing risk premium, increasing
volatility and increasing illiquidity. An interesting question here is the following:
do we have a concave pricing function due to the fact that “...risk will have to be
transferred from the more risk-tolerant to the more risk-averse”? As we saw,
this is not the case. Any constraint that binds for an agent forces the discount
and the slope of the pricing function to be larger so that the other agents are
induced to absorb the excess supply, and this is the mechanism that leads to a
more concave decreasing pricing function with respect to the aggregate supply.
• Model misspecification: This is the case where the agents do not fully trust their
model dynamics, they believe that the real model is an unknown member of a set
of alternative models near their nominal model and they make robust decision
rules. We find that model misspecification is another source of increasing risk
premium, endogenous volatility and increasing illiquidity.
Appendix A
Technical Notes
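Proposition 1. The following QCQP:
\[
\begin{aligned}
\text{minimize} \quad & F_t^T\mu_t + \frac{1}{2}F_t^T\Sigma F_t \\
\text{subject to} \quad & F_t^T\Sigma F_t \le L
\end{aligned}
\]
has a solution given by:
\[
F_t^{opt} =
\begin{cases}
-\Sigma^{-1}\mu_t & \text{if } \mu_t^T\Sigma^{-1}\mu_t \le L \\[1ex]
-\dfrac{\Sigma^{-1}\mu_t}{\sqrt{\mu_t^T\Sigma^{-1}\mu_t / L}} & \text{if } \mu_t^T\Sigma^{-1}\mu_t \ge L
\end{cases}
\]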
Proof. Let us apply the Karush Kuhn Tucker (KKT) conditions. $F_t^{opt}$, $\lambda_t^{opt}$ are optimal
iff they satisfy the following KKT conditions:
• Primal feasibility: $F_t^{opt\,T}\Sigma F_t^{opt} \le L$
• Dual feasibility: $\lambda_t^{opt} \ge 0$
• Complementary slackness: $\lambda_t^{opt}\left(F_t^{opt\,T}\Sigma F_t^{opt} - L\right) = 0$
• Minimization of the Lagrangean: $F_t^{opt} = \arg\min \mathcal{L}(F_t, \lambda_t^{opt})$
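The Lagrangean is given by:
\[ \mathcal{L}(F, \lambda) = F_t^T\mu_t + \frac{1}{2}F_t^T\Sigma F_t + \lambda\left(\frac{1}{2}F_t^T\Sigma F_t - \frac{1}{2}L\right) \]
The first order conditions are:
\[ \mu_t + \Sigma F_t^{opt} + \lambda_t^{opt}\Sigma F_t^{opt} = 0 \;\Rightarrow\; F_t^{opt} = -\frac{\Sigma^{-1}\mu_t}{1 + \lambda_t^{opt}} \]
If $\mu_t^T\Sigma^{-1}\mu_t > L$, then we cannot have $\lambda_t^{opt} = 0$, since in that case $F_t^{opt\,T}\Sigma F_t^{opt} > L$.
Therefore $\lambda_t^{opt} > 0$ and from the CS condition
\[ F_t^{opt\,T}\Sigma F_t^{opt} = L \;\Rightarrow\; \frac{\mu_t^T\Sigma^{-1}\mu_t}{(1 + \lambda_t^{opt})^2} = L \;\Rightarrow\; 1 + \lambda_t^{opt} = \sqrt{\frac{\mu_t^T\Sigma^{-1}\mu_t}{L}} > 1 \]
Therefore, if $\mu_t^T\Sigma^{-1}\mu_t > L$, then $F_t^{opt} = -\dfrac{\Sigma^{-1}\mu_t}{\sqrt{\mu_t^T\Sigma^{-1}\mu_t / L}}$.
If $\mu_t^T\Sigma^{-1}\mu_t < L$, then $F_t^{opt\,T}\Sigma F_t^{opt} < L$, therefore $\lambda_t^{opt} = 0$ and $F_t^{opt} = -\Sigma^{-1}\mu_t$.
Finally, if $\mu_t^T\Sigma^{-1}\mu_t = L$, then we cannot have $\lambda_t^{opt} > 0$, since in that case we would
have $F_t^{opt\,T}\Sigma F_t^{opt} < L$ and $\lambda_t^{opt} > 0$ and the CS condition would be violated. Therefore,
$\lambda_t^{opt} = 0$ and $F_t^{opt} = -\Sigma^{-1}\mu_t$.
Therefore we proved that:
\[
F_t^{opt} =
\begin{cases}
-\Sigma^{-1}\mu_t & \text{if } \mu_t^T\Sigma^{-1}\mu_t \le L \\[1ex]
-\dfrac{\Sigma^{-1}\mu_t}{\sqrt{\mu_t^T\Sigma^{-1}\mu_t / L}} & \text{if } \mu_t^T\Sigma^{-1}\mu_t \ge L
\end{cases}
\]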
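We could also prove this result in another way. Let us make a change of variables
where:
\[ y = \Sigma^{1/2}F_t, \qquad F_t = \Sigma^{-1/2}y \]
Then our problem becomes:
\[
\begin{aligned}
\text{minimize} \quad & y^T\Sigma^{-1/2}\mu_t + \frac{1}{2}y^T y \\
\text{subject to} \quad & y^T y \le L
\end{aligned}
\]
The optimal solution is given by finding the projection of $\tilde{\mu} = -\Sigma^{-1/2}\mu_t$ on the
Euclidean ball $y^T y \le L$. This projection is given by:
\[ y^{opt} = \frac{\tilde{\mu}}{\max\left(1, \sqrt{\tilde{\mu}^T\tilde{\mu} / L}\right)} \]
Therefore the optimal solution for the original problem is given by:
\[ F_t^{opt} = \Sigma^{-1/2}y^{opt} = -\frac{\Sigma^{-1}\mu_t}{\max\left(1, \sqrt{\mu_t^T\Sigma^{-1}\mu_t / L}\right)} \]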
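A quick numerical sanity check of Proposition 1 can be written in plain Python. To keep the sketch free of linear algebra libraries it assumes a diagonal $\Sigma$, which is a simplification not made in the proposition; the function names and the numbers are illustrative assumptions.

```python
import math


def qcqp_solution(mu, sigma2, L):
    """Closed form of Proposition 1 for a diagonal Sigma = diag(sigma2)."""
    direction = [-m / s for m, s in zip(mu, sigma2)]      # -Sigma^{-1} mu
    quad = sum(m * m / s for m, s in zip(mu, sigma2))     # mu^T Sigma^{-1} mu
    scale = max(1.0, math.sqrt(quad / L))                 # equals 1 + lambda_opt
    return [x / scale for x in direction]


def objective(F, mu, sigma2):
    """F^T mu + (1/2) F^T Sigma F for diagonal Sigma."""
    return sum(f * m + 0.5 * s * f * f for f, m, s in zip(F, mu, sigma2))


def risk(F, sigma2):
    """F^T Sigma F for diagonal Sigma."""
    return sum(s * f * f for f, s in zip(F, sigma2))


mu, sigma2 = [1.0, -2.0], [1.0, 4.0]
F_tight = qcqp_solution(mu, sigma2, L=0.5)   # constraint binds: [-0.5, 0.25]
F_loose = qcqp_solution(mu, sigma2, L=8.0)   # constraint is slack: [-1.0, 0.5]
```

Here $\mu^T\Sigma^{-1}\mu = 2$, so with L = 0.5 the unconstrained solution is scaled down by a factor of 2 and the constraint holds with equality, while with L = 8 the unconstrained solution is returned unchanged.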
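Proposition 2. When $\Sigma$ is a diagonal matrix, the following convex program:
\[
\begin{aligned}
\text{minimize} \quad & F_t^T\mu_t + \frac{1}{2}F_t^T\Sigma F_t \\
\text{subject to} \quad & \sum_{i=1}^{N}\lambda_i|F_{it}| \le 1
\end{aligned}
\]
has a solution given by:
\[ F_{it}^{opt} = \frac{\mathrm{sign}(-\mu_{it})\left(\left|\frac{\mu_{it}}{\lambda_i}\right| - \nu_t^{opt}\right)_+}{\sigma_i^2 / \lambda_i} \]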
Proof. Let us apply the Karush Kuhn Tucker (KKT) conditions. $F_t^{opt}$, $\nu_t^{opt}$ are optimal
iff they satisfy the KKT conditions:
• Primal feasibility: $\sum_{i=1}^{N}\lambda_i|F_{it}| \le 1$
• Dual feasibility: $\nu_t^{opt} \ge 0$
• Complementary slackness: $\nu_t^{opt}\left(\sum_{i=1}^{N}\lambda_i|F_{it}| - 1\right) = 0$
• Minimization of the Lagrangean: $F_t^{opt} = \arg\min \mathcal{L}(F_t, \nu_t^{opt})$
The Lagrangean is given by:
\[ \mathcal{L}(F, \nu) = F_t^T\mu_t + \frac{1}{2}F_t^T\Sigma F_t + \nu\left(\sum_{i=1}^{N}\lambda_i|F_{it}| - 1\right) \]
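\[
\begin{aligned}
F_t^{opt} &= \arg\min \mathcal{L}(F_t, \nu_t^{opt}) \\
&= \arg\min\; \frac{1}{2}\sum_{i=1}^{N}\sigma_i^2 F_{it}^2 + \sum_{i=1}^{N}F_{it}\mu_{it} + \nu_t^{opt}\left(\sum_{i=1}^{N}\lambda_i|F_{it}| - 1\right) \\
&= \arg\min\; \frac{1}{2}\sum_{i=1}^{N}\sigma_i^2 |F_{it}|^2 + \sum_{i=1}^{N}|F_{it}|\,\mathrm{sign}(-\mu_{it})\mu_{it} + \nu_t^{opt}\left(\sum_{i=1}^{N}\lambda_i|F_{it}| - 1\right) \\
&= \arg\min\; \frac{1}{2}\sum_{i=1}^{N}\sigma_i^2 |F_{it}|^2 + \sum_{i=1}^{N}|F_{it}|\left(\mathrm{sign}(-\mu_{it})\mu_{it} + \lambda_i\nu_t^{opt}\right)
\end{aligned}
\]
where $F_{it}^{opt} = |F_{it}^{opt}|\,\mathrm{sign}(-\mu_{it})$.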
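We have:
\[
|F_{it}^{opt}| =
\begin{cases}
0 & \text{if } \mathrm{sign}(-\mu_{it})\mu_{it} + \lambda_i\nu_t^{opt} \ge 0 \\[1ex]
-\dfrac{\mathrm{sign}(-\mu_{it})\mu_{it} + \lambda_i\nu_t^{opt}}{\sigma_i^2} & \text{if } \mathrm{sign}(-\mu_{it})\mu_{it} + \lambda_i\nu_t^{opt} \le 0
\end{cases}
\]
Therefore, multiplying by $\mathrm{sign}(-\mu_{it})$:
\[
F_{it}^{opt} =
\begin{cases}
0 & \text{if } \mathrm{sign}(-\mu_{it})\mu_{it} + \lambda_i\nu_t^{opt} \ge 0 \\[1ex]
-\dfrac{\frac{\mu_{it}}{\lambda_i} + \mathrm{sign}(-\mu_{it})\nu_t^{opt}}{\sigma_i^2 / \lambda_i} & \text{if } \mathrm{sign}(-\mu_{it})\mu_{it} + \lambda_i\nu_t^{opt} \le 0
\end{cases}
\]
which is equivalent to:
\[
F_{it}^{opt} =
\begin{cases}
0 & \text{if } \mathrm{sign}(-\mu_{it})\mu_{it} + \lambda_i\nu_t^{opt} \ge 0 \\[1ex]
\mathrm{sign}(-\mu_{it})\,\dfrac{\frac{|\mu_{it}|}{\lambda_i} - \nu_t^{opt}}{\sigma_i^2 / \lambda_i} & \text{if } \mathrm{sign}(-\mu_{it})\mu_{it} + \lambda_i\nu_t^{opt} \le 0
\end{cases}
\]
Since it is:
\[ \mathrm{sign}(-\mu_{it})\mu_{it} + \lambda_i\nu_t^{opt} \le 0 \;\Rightarrow\; -|\mu_{it}| + \lambda_i\nu_t^{opt} \le 0 \;\Rightarrow\; \left|\frac{\mu_{it}}{\lambda_i}\right| - \nu_t^{opt} \ge 0 \]
we have proved that:
\[ F_{it}^{opt} = \frac{\mathrm{sign}(-\mu_{it})\left(\left|\frac{\mu_{it}}{\lambda_i}\right| - \nu_t^{opt}\right)_+}{\sigma_i^2 / \lambda_i} \]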
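Proposition 2 gives the solution in terms of the multiplier $\nu_t^{opt}$ but does not say how to compute it. One possible approach, sketched below under the assumption that all $\lambda_i$ and $\sigma_i^2$ are strictly positive, is to bisect on $\nu$ until the constraint holds with equality whenever the unconstrained solution violates it. The function name and the numerical inputs are hypothetical.

```python
def l1_constrained_solution(mu, sigma2, lam, iters=200):
    """Return (F, nu) following Proposition 2, finding nu by bisection."""

    def position(nu):
        # F_i = sign(-mu_i) * (|mu_i| / lam_i - nu)_+ / (sigma_i^2 / lam_i)
        return [(-1.0 if m > 0 else 1.0) * max(abs(m) / l - nu, 0.0) / (s / l)
                for m, s, l in zip(mu, sigma2, lam)]

    def usage(nu):
        # Left-hand side of the constraint, sum_i lam_i * |F_i|.
        return sum(l * abs(f) for l, f in zip(lam, position(nu)))

    if usage(0.0) <= 1.0:          # constraint is slack, so nu = 0
        return position(0.0), 0.0
    lo, hi = 0.0, max(abs(m) / l for m, l in zip(mu, lam))  # usage(hi) = 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if usage(mid) > 1.0:
            lo = mid               # still infeasible: increase the multiplier
        else:
            hi = mid
    return position(hi), hi


F, nu = l1_constrained_solution(mu=[-2.0, 1.0], sigma2=[1.0, 1.0], lam=[1.0, 1.0])
```

In this example the unconstrained solution (2, -1) uses three times the available budget; the bisection converges to a multiplier of approximately 1 and a solution of approximately (1, 0), so the second position is thresholded to zero.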
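Proposition 3. The relative entropy of a multivariate Gaussian distribution $N(\mu, I)$
with respect to the multivariate standard Gaussian distribution is $D(Q) = \frac{1}{2}\mu^T\mu$.
Proof. It is:
\[ D(Q) = \int_{-\infty}^{+\infty} \log\left(\frac{f(x)}{g(x)}\right) f(x)\,dx \]
where $f(x) = \dfrac{1}{(2\pi)^{N/2}}\,e^{-\frac{1}{2}(x - \mu)^T(x - \mu)}$ and $g(x) = \dfrac{1}{(2\pi)^{N/2}}\,e^{-\frac{1}{2}x^T x}$. Therefore:
\[ D(Q) = -\int_{-\infty}^{+\infty} \frac{1}{2}(x - \mu)^T(x - \mu) f(x)\,dx + \int_{-\infty}^{+\infty} \frac{1}{2}x^T x\, f(x)\,dx \]
The first term is:
\[
\begin{aligned}
-\frac{1}{2}E[(X - \mu)^T(X - \mu)] &= -\frac{1}{2}E[\mathrm{trace}((X - \mu)^T(X - \mu))] \\
&= -\frac{1}{2}E[\mathrm{trace}((X - \mu)(X - \mu)^T)] \\
&= -\frac{1}{2}\mathrm{trace}(I) \\
&= -N/2
\end{aligned}
\]
The second term is:
\[
\begin{aligned}
\frac{1}{2}E[X^T X] &= \frac{1}{2}E[\mathrm{trace}((X - \mu)^T(X - \mu))] + \frac{1}{2}E[X]^T E[X] \\
&= N/2 + \frac{1}{2}\mu^T\mu
\end{aligned}
\]
Therefore, we have $D(Q) = \frac{1}{2}\mu^T\mu$.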
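The closed form of Proposition 3 can also be checked by simulation. The sketch below estimates $E_f[\log f(X) - \log g(X)]$ by Monte Carlo, using the identity $\log f(x) - \log g(x) = x^T\mu - \frac{1}{2}\mu^T\mu$ that follows from the two densities above. The function name, sample size and seed are arbitrary illustrative choices.

```python
import random


def relative_entropy_mc(mu, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the relative entropy of N(mu, I) w.r.t. N(0, I)."""
    rng = random.Random(seed)
    half_norm = 0.5 * sum(m * m for m in mu)
    total = 0.0
    for _ in range(n_samples):
        x = [rng.gauss(m, 1.0) for m in mu]          # draw X from N(mu, I)
        # log f(x) - log g(x) = x^T mu - (1/2) mu^T mu
        total += sum(xi * m for xi, m in zip(x, mu)) - half_norm
    return total / n_samples


mu = [1.0, -0.5]
estimate = relative_entropy_mc(mu)
closed_form = 0.5 * sum(m * m for m in mu)   # (1/2) mu^T mu = 0.625
```

The estimate should agree with the closed form up to sampling error, which for this sample size is of the order of a few thousandths.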
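Proposition 4. The relative entropy of probability measure $Q$ with respect to $P$,
where $\dfrac{dQ}{dP} = \xi_T$ and $\xi_t = \exp\left(\int_0^t h_s^T\,dZ_s - \frac{1}{2}\int_0^t h_s^T h_s\,ds\right)$, is given by:
\[ D(Q) = \int_0^T \frac{1}{2}E_Q[h_t^T h_t]\,dt \]
Proof. The relative entropy of $Q$ with respect to $P$ is:
\[
\begin{aligned}
D(Q) &= E_Q\left[\log\left(\frac{dQ}{dP}\right)\right] \\
&= E_Q[\log(\xi_T)] \\
&= E_Q\left[\int_0^T h_s^T\,dZ_s - \frac{1}{2}\int_0^T h_s^T h_s\,ds\right] \\
&= \int_0^T E_Q[h_s^T\,dZ_s] - \frac{1}{2}\int_0^T E_Q[h_s^T h_s]\,ds \\
&= \int_0^T E_Q\left[E_Q[h_s^T\,dZ_s \mid \mathcal{F}_s]\right] - \frac{1}{2}\int_0^T E_Q[h_s^T h_s]\,ds \\
&= \int_0^T \frac{1}{2}E_Q[h_t^T h_t]\,dt
\end{aligned}
\]
since $E_Q[dZ_t \mid \mathcal{F}_t] = h_t\,dt$ from Girsanov’s theorem.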
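Proposition 5. The following QCQP:
\[
\begin{aligned}
\text{minimize} \quad & -F_t^T\left(\mu(S, t) - rS_t - \frac{\Sigma H_S}{\nu}\right) + \frac{1}{2}\left(1 + \frac{1}{\nu}\right)F_t^T\Sigma F_t \\
\text{subject to} \quad & F_t^T\Sigma F_t \le L
\end{aligned}
\]
has a solution given by:
\[
F_t^{opt} =
\begin{cases}
\dfrac{1}{1 + \frac{1}{\nu}}\,\Sigma^{-1}\mu_t & \text{if } \mu_t^T\Sigma^{-1}\mu_t \le L\left(1 + \frac{1}{\nu}\right)^2 \\[2ex]
\dfrac{\Sigma^{-1}\mu_t}{\sqrt{\mu_t^T\Sigma^{-1}\mu_t / L}} & \text{if } \mu_t^T\Sigma^{-1}\mu_t \ge L\left(1 + \frac{1}{\nu}\right)^2
\end{cases}
\]
where $\mu_t = \mu(S, t) - rS_t - \dfrac{\Sigma H_S(S_t, t)}{\nu}$.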
Proof. Let us apply the Karush Kuhn Tucker (KKT) conditions. $F_t^{opt}$, $\lambda_t^{opt}$ are optimal
iff they satisfy the following KKT conditions:
• Primal feasibility: $F_t^{opt\,T}\Sigma F_t^{opt} \le L$
• Dual feasibility: $\lambda_t^{opt} \ge 0$
• Complementary slackness: $\lambda_t^{opt}\left(F_t^{opt\,T}\Sigma F_t^{opt} - L\right) = 0$
• Minimization of the Lagrangean: $F_t^{opt} = \arg\min \mathcal{L}(F_t, \lambda_t^{opt})$
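The Lagrangean is given by:
\[ \mathcal{L}(F, \lambda) = -F_t^T\mu_t + \frac{1}{2}\left(1 + \frac{1}{\nu}\right)F_t^T\Sigma F_t + \lambda\left(\frac{1}{2}F_t^T\Sigma F_t - \frac{1}{2}L\right) \]
The first order conditions are:
\[ -\mu_t + \left(1 + \frac{1}{\nu}\right)\Sigma F_t^{opt} + \lambda_t^{opt}\Sigma F_t^{opt} = 0 \;\Rightarrow\; F_t^{opt} = \frac{\Sigma^{-1}\mu_t}{1 + \frac{1}{\nu} + \lambda_t^{opt}} \]
If $\mu_t^T\Sigma^{-1}\mu_t > L\left(1 + \frac{1}{\nu}\right)^2$, then we cannot have $\lambda_t^{opt} = 0$, since in that case $F_t^{opt\,T}\Sigma F_t^{opt} > L$.
Therefore $\lambda_t^{opt} > 0$ and from the CS condition
\[ F_t^{opt\,T}\Sigma F_t^{opt} = L \;\Rightarrow\; \frac{\mu_t^T\Sigma^{-1}\mu_t}{\left(1 + \frac{1}{\nu} + \lambda_t^{opt}\right)^2} = L \;\Rightarrow\; 1 + \frac{1}{\nu} + \lambda_t^{opt} = \sqrt{\frac{\mu_t^T\Sigma^{-1}\mu_t}{L}} > 1 + \frac{1}{\nu} \]
Therefore, if $\mu_t^T\Sigma^{-1}\mu_t > L\left(1 + \frac{1}{\nu}\right)^2$, then $F_t^{opt} = \dfrac{\Sigma^{-1}\mu_t}{\sqrt{\mu_t^T\Sigma^{-1}\mu_t / L}}$.
If $\mu_t^T\Sigma^{-1}\mu_t < L\left(1 + \frac{1}{\nu}\right)^2$, then $F_t^{opt\,T}\Sigma F_t^{opt} < L$, therefore $\lambda_t^{opt} = 0$ and $F_t^{opt} = \dfrac{\Sigma^{-1}\mu_t}{1 + \frac{1}{\nu}}$.
Finally, if $\mu_t^T\Sigma^{-1}\mu_t = L\left(1 + \frac{1}{\nu}\right)^2$, then we cannot have $\lambda_t^{opt} > 0$, since in that case we
would have $F_t^{opt\,T}\Sigma F_t^{opt} < L$ and $\lambda_t^{opt} > 0$ and the CS condition would be violated.
Therefore, $\lambda_t^{opt} = 0$ and $F_t^{opt} = \dfrac{\Sigma^{-1}\mu_t}{1 + \frac{1}{\nu}}$.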
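Proposition 6. The following convex program:
\[ \max_{\theta}\,\min_{h}\;\; \gamma(W_0 - q^T\theta)R_f + \gamma\hat{\mu}^T\theta - \frac{1}{2}\gamma^2\theta^T\hat{\Sigma}\theta + \gamma\theta^T\hat{\Sigma}^{1/2}h + \lambda\,\frac{1}{2}h^T h \tag{A.1} \]
has a solution given by:
\[ \theta^{opt} = \frac{\hat{\Sigma}^{-1}(\hat{\mu} - R_f q)}{\gamma\left(1 + \frac{1}{\lambda}\right)} \]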
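Proof. For each $h$ the objective function is concave in $\theta$. Therefore, when we take
the minimum over $h$, we take the minimum of concave functions, which leads to a
concave function. Therefore this problem maximizes a concave function of $\theta$.
The inner minimization problem is:
\[ \min_{h}\;\; \gamma\theta^T\hat{\Sigma}^{1/2}h + \lambda\,\frac{1}{2}h^T h \]
This is a convex problem where we minimize a quadratic function of $h$. The first
order conditions are:
\[ \gamma\hat{\Sigma}^{1/2}\theta + \lambda h = 0 \;\Rightarrow\; h = -\frac{\gamma\hat{\Sigma}^{1/2}\theta}{\lambda} \]
The optimal value of the inner optimization problem is:
\[ V(\theta) = -\gamma\theta^T\hat{\Sigma}^{1/2}\,\frac{\gamma\hat{\Sigma}^{1/2}\theta}{\lambda} + \lambda\,\frac{1}{2}\,\frac{\gamma^2\theta^T\hat{\Sigma}\theta}{\lambda^2} = -\frac{\gamma^2\theta^T\hat{\Sigma}\theta}{2\lambda} \]
The problem A.1 then becomes:
\[ \max_{\theta}\;\; \gamma(W_0 - q^T\theta)R_f + \gamma\hat{\mu}^T\theta - \frac{1}{2}\gamma^2\theta^T\hat{\Sigma}\theta - \frac{\gamma^2\theta^T\hat{\Sigma}\theta}{2\lambda} \]
The first order conditions of this concave problem are:
\[ \gamma(\hat{\mu} - R_f q) - \gamma^2\hat{\Sigma}\theta^{opt}\left(1 + \frac{1}{\lambda}\right) = 0 \;\Rightarrow\; \theta^{opt} = \frac{\hat{\Sigma}^{-1}(\hat{\mu} - R_f q)}{\gamma\left(1 + \frac{1}{\lambda}\right)} \]
Therefore we proved our proposition.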
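Proposition 7. The market clearing condition for the risky assets implies that
\[ \sum_{h \in H}\theta_h^{opt} = \theta_\alpha \;\Rightarrow\; q = \frac{\hat{\mu} - \Psi\hat{\Sigma}\theta_\alpha}{R_f} \]
where $\theta_\alpha \in \mathbb{R}^N$ is the aggregate supply of the risky assets and $\dfrac{1}{\Psi} \equiv \sum_{h \in H}\dfrac{1}{\gamma_h\left(1 + \frac{1}{\lambda_h}\right)}$.
Proof. From the proposition above we have, for each agent $h$:
\[ \theta_h = \frac{\hat{\Sigma}^{-1}(\hat{\mu} - R_f q)}{\gamma_h\left(1 + \frac{1}{\lambda_h}\right)} \;\Rightarrow\; \hat{\Sigma}\theta_h = \frac{\hat{\mu} - R_f q}{\gamma_h\left(1 + \frac{1}{\lambda_h}\right)} \]
By adding across agents we have:
\[
\begin{aligned}
\sum_{h \in H}\hat{\Sigma}\theta_h &= \sum_{h \in H}\frac{\hat{\mu} - R_f q}{\gamma_h\left(1 + \frac{1}{\lambda_h}\right)} \\
\hat{\Sigma}\theta_\alpha &= (\hat{\mu} - R_f q)\sum_{h \in H}\frac{1}{\gamma_h\left(1 + \frac{1}{\lambda_h}\right)} \\
\Psi\hat{\Sigma}\theta_\alpha &= \hat{\mu} - R_f q \\
q &= \frac{\hat{\mu} - \Psi\hat{\Sigma}\theta_\alpha}{R_f}
\end{aligned}
\]
where $\dfrac{1}{\Psi} \equiv \sum_{h \in H}\dfrac{1}{\gamma_h\left(1 + \frac{1}{\lambda_h}\right)}$.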
Bibliography
[1] Committee on the NIH Research Priority-Setting Process, Scientific Opportu-
nities and Public Needs: Improving Priority Setting and Public Input at the
National Institutes of Health, pp. 11–12. Washington, D.C.: National Academy
Press. (1998).
[2] C. Anderson. A new kind of earmarking. Science, 260(5107):483, Apr. 23, 1993.
[3] Evan W. Anderson, Lars Peter Hansen, and Thomas J. Sargent. Robustness,
detection and the price of risk, 2000.
[4] Kerry E. Back. Asset Pricing and Portfolio Choice Theory. Oxford University
Press, 2010.
[5] Alexander Bade, Gabriel Frahm, and Uwe Jaekel. A general approach to Bayesian
portfolio optimization. Mathematical Methods of Operations Research, 70(2):337–
356, 2009.
[6] Suleyman Basak and B. Croitoru. Equilibrium mispricing in a capital market
with portfolio constraints. The Review of financial studies, 13:715–748, 2000.
[7] Suleyman Basak and Alexander Shapiro. Value-at-risk-based risk management:
Optimal policies and asset prices. The Review of financial studies, 14:371–405,
2001.
[8] V. Bawa, S. Brown, and R. Klein. Estimation Risk and Optimal Portfolio Choice.
North-Holland, Amsterdam, 1979.
[9] Dirk Bergemann and Karl Schlag. Robust monopoly pricing. Cowles Foundation,
Yale University, 2005.
[10] Dimitri P. Bertsekas. Convex Optimization Theory. Athena Scientific, 2009.
[11] S. Birch and A. Gafni. Cost effectiveness/utility analyses. Do current decision
rules lead us to where we want to be? Journal of Health Economics, 11:279–296,
1992.
[12] Dimitrios Bisias, Andrew W. Lo, and James F. Watkins. Estimating the NIH
efficient frontier. PLOS One, 2012.
[13] F. Black and R. Litterman. Global portfolio optimization. Financial Analysts
Journal, 48(5):28–43, 1992.
[14] Nick Black. Health services research: the gradual encroachment of ideas. Journal
of Health Services Research & Policy, 14:120–123, 2009.
[15] Michael Boguslavsky and Elena Boguslavskaya. Arbitrage under power. Risk,
pages 69–73, June 2004.
[16] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University
Press, Cambridge, UK, 2004.
[17] Michael J. Brennan and Eduardo S. Schwartz. Arbitrage in stock index futures.
The Journal of Business, 63, 1990.
[18] J. F. Bridges and D. D. Terris. Portfolio evaluation of health programs: A reply
to Sendi et al. Social Science & Medicine, 58:1849–1851, 2004.
[19] J. F. B. Bridges, M. Stewart, M. T. King, and K. van Gool. Adapting portfolio
theory for the evaluation of multiple investments in health with a multiplicative
extension for treatment synergies. European Journal of Health Economics, 3:47–
53, 2002.
[20] M. L. Brown, J. Lipscomb, and C. Snyder. The burden of illness of cancer:
economic cost and quality of life. Annual Review of Public Health, 22:91–113,
2001.
[21] Johannes Brumm, Felix Kubler, Michael Grill, and Karl Schmedders. Margin
regulation and volatility. ECB working paper, 1698, 2014.
[22] Carol S. Burckhardt and Kathryn L. Anderson. The quality of life scale (QOLS):
Reliability, validity, and utilization. Health and Quality of Life Outcomes, 1:60–
66, 2003.
[23] M. Buxton, S. Hanney, and T. Jones. Estimating the economic value to societies
of the impact of health research: a critical review. Bulletin of the World Health
Organization, 82(10):733–739, Oct 2004.
[24] S. Chandra. Regional economy size and the growth/instability frontier: Evidence
from Europe. Journal of Regional Science, 43(1):95–122, 2003.
[25] Joseph Chen, Harrison Hong, and Jeremy C. Stein. Breadth of ownership and
stock returns. Journal of financial economics, 66:171–205, 2002.
[26] Julius H. Comroe Jr. and Robert D. Dripps. Scientific basis for the support of
biomedical science. Science, 192(4235):105–111, 1976.
[27] C. W. Curry, A. K. De, R. M. Ikeda, and S. B. Thacker. Health burden and
funding at the Centers for Disease Control and Prevention. American Journal
of Preventive Medicine, 30(3):269–276, MAR 2006.
[28] D. M. Cutler and M. McClellan. Is technological change in medicine worth it?
Health Affairs, 20(5):11–29, Sep–Oct 2001.
[29] Darrell Duffie. Special repo rates. Journal of Finance, 51:493–526, 1996.
[30] Darrell Duffie. Dynamic Asset Pricing Theory. Princeton Series in Finance, 3
edition, 2001.
[31] R. L. Fleurence and D. J. Torgerson. Setting priorities for research. Health
Policy, 69(1):1–10, JUL 2004.
[32] Richard Freeman and John Van Reenen. What if Congress doubled R&D spending
on the physical sciences? Technical Report 931, Center for Economic Performance,
May 2009.
[33] John Geanakoplos. The leverage cycle. Cowles Foundation Discussion Paper,
1715R, 2009.
[34] Itzhak Gilboa and David Schmeidler. Maxmin expected utility with non-unique
prior. Journal of Mathematical Economics, 18:141–153, 1989.
[35] J. Grant. Evaluating “payback” on biomedical research from papers cited in
clinical guidelines: applied bibliometric study. BMJ, 320(7242):1107–1111, 2000.
[36] M. Grant and S. Boyd. Graph implementations for nonsmooth convex programs.
In V. Blondel, S. Boyd, and H. Kimura, editors, Recent Advances in Learning and
Control (a tribute to M. Vidyasagar), Lecture Notes in Control and Information
Sciences, pages 95–110. Springer, Berlin / Heidelberg, 2008.
[37] M. Grant and S. Boyd. CVX: Matlab software for disciplined convex programming,
June 2009.
[38] C. P. Gross, G. F. Anderson, and N. R. Rowe. The relation between funding
by the National Institutes of Health and the burden of disease. New England
Journal of Medicine, 340(24):1881–1887, JUN 17 1999.
[39] Steve Hanney, Iain Frame, Jonathan Grant, Philip Green, and Martin Buxton.
From Bench to Bedside: Tracing the Payback Forwards from Basic or Early
Clinical Research A Preliminary Exercise and Proposals for a Future Study.
Health Economics Research Group, Brunel University, Uxbridge, UK, 2003.
[40] Lars Hansen, Thomas Sargent, G. Turmuhambetova, and N. Williams. Robust
control, min-max expected utility and model misspecification. Journal of Eco-
nomic Theory, 128:45–90, 2006.
[41] Lars P. Hansen and Thomas Sargent. Robustness. Princeton University Press,
2008.
[42] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Sta-
tistical Learning: Data Mining, Inference, and Prediction. Springer, 2013.
[43] Ernest Istook. Research funding on major diseases is not proportionate to tax-
payer’s needs. Journal of NIH Research, 9(8):26–28, 1997.
[44] D.H. Jacobson. Optimal stochastic linear systems with exponential performance
criteria and their relation to deterministic differential games. IEEE Transactions
Automatic Control, 18:124–131, 1973.
[45] S. C. Johnston and S. L. Hauser. Basic and clinical research: what is the most
appropriate weighting in a public investment portfolio? Annals of Neurology,
60(1):9A–11A, Jul 2006.
[46] S. C. Johnston, J. D. Rootenberg, S. Katrak, W. S. Smith, and J. S. Elkins.
Effect of a US national institutes of health programme of clinical trials on public
health and costs. Lancet, 367(9519):1319–1327, APR 22 2006.
[47] Philippe Jorion. Value at Risk: The New Benchmark for Managing Financial
Risk. McGraw-Hill, 2006.
[48] Jakub Jurek and Halla Yang. Dynamic portfolio selection in arbitrage. EFA
Meeting, 2006.
[49] Tong Suk Kim and Edward Omberg. Dynamic nonmyopic portfolio behavior.
The Review of Financial Studies, 9:141–161, 1996.
[50] Leonid Kogan and Raman Uppal. Risk aversion and optimal portfolio policies
in partial and general equilibrium economies. NBER working paper, 8609, 2001.
[51] Julia Lane and Stefano Bertuzzi. Measuring the results of science investments.
Science, 331:678–680, 2011.
[52] John Lintner. The aggregation of investor’s diverse judgements and preferences
in purely competitive security markets. Journal of Financial and Quantitative
Analysis, 4:347–400, 1969.
[53] Jun Liu and Francis A. Longstaff. Losing money on arbitrage: Optimal dynamic
portfolio choice in markets with arbitrage opportunities. The Review of Financial
Studies, 17, 2004.
[54] Roger Lowenstein. When Genius Failed: The Rise and Fall of Long-Term Capital
Management. Random House Trade Paperbacks, 2001.
[55] H. M. Markowitz. Portfolio selection. Journal of Finance, 7(1):77–91, March
1952.
[56] F. W. McFarlan. Portfolio approach to information systems. Harvard Business
Review, 59(4):142–150, 1981.
[57] M. T. McKenna, C. M. Michaud, C. J. L. Murray, and J. S. Marks. Assessing
the burden of disease in the United States using disability-adjusted life years.
American Journal of Preventive Medicine, 28(5):415–423, Jun 2005.
[58] Robert C. Merton. An analytic derivation of the efficient portfolio frontier.
Journal of Financial and Quantitative Analysis, 7:1851–1872, 1972.
[59] Robert C. Merton. Continuous-Time Finance. Wiley-Blackwell, 1992.
[60] Attilio Meucci. Review of statistical arbitrage, cointegration and multivariate
Ornstein-Uhlenbeck, 2010.
[61] Edward M. Miller. Risk, uncertainty and divergence of opinion. Journal of
Finance, 32:1151–1168, 1977.
[62] F. Mosteller. Innovation and evaluation. Science, 211(4485):881–886, 1981.
[63] Kevin Murphy and Richard Topel. Diminishing returns? The costs and benefits
of improving health. Perspectives in Biology and Medicine, 46(3):S108–S128,
2003.
[64] Rishi K. Narang. Inside the Black Box: A Simple Guide to Quantitative and
High Frequency Trading. Wiley, 2013.
[65] National Institutes of Health. Setting Research Priorities at the National
Institutes of Health. National Institutes of Health, Bethesda, MD, 1997.
[66] B. J. O’Brien and M. J. Sculpher. Building uncertainty into cost-effectiveness
rankings: Portfolio risk-return tradeoffs and implications for decision rules.
Medical Care, 38:460–468, 2000.
[67] M. E. Porter. What is value in health care? New England Journal of Medicine,
363(26):2477–2481, 2010.
[68] Luis Prieto and Jose A. Sacristan. Problems and solutions in calculating quality-
adjusted life years (QALYs). Health and Quality of Life Outcomes, 1:80–87, 2003.
[69] S. J. Rangel, B. Efron, and R. L. Moss. Recent trends in National Institutes
of Health funding of surgical research. Annals of Surgery, 236(3):277–287, Sep
2002.
[70] R. S. Sandler, J. E. Everhart, M. Donowitz, E. Adams, K. Cronin, C. Goodman,
E. Gemmen, S. Shah, A. Avdic, and R. Rubin. The burden of selected digestive
diseases in the United States. Gastroenterology, 122(5):1500–1511, May 2002.
[71] P. Sendi, M. J. Al, A. Gafni, and S. Birch. Optimizing a portfolio of health
care programs in the presence of uncertainty and constrained resources. Social
Science & Medicine, 57:2207–2215, 2003.
[72] Pedram Sendi, Maiwenn J. Al, and Frans F. H. Rutten. Portfolio theory and
cost-effectiveness analysis: A further discussion. Value In Health, 7:595–601,
2004.
[73] William F. Sharpe. Capital asset prices: a theory of market equilibrium under
conditions of risk. Journal of Finance, 19:425–442, 1964.
[74] Andrei Shleifer and Robert W. Vishny. The limits of arbitrage. The Journal of
Finance, 52:35–55, 1997.
[75] Gilbert Strang. Computational Science and Engineering. Wellesley Cambridge
Press, 2007.
[76] H. Varmus. Evaluating the burden of disease and spending the research
dollars of the National Institutes of Health. New England Journal of Medicine,
340(24):1914–1915, Jun 17, 1999.
[77] W. H. Fleming and P. E. Souganidis. On the existence of value functions of
two-player zero-sum stochastic differential games. Indiana University Mathematics
Journal, 38:293–314, 1989.
[78] P. Whittle. Risk sensitive linear quadratic Gaussian control. Advances in Applied
Probability, 13:776–777, 1981.
[79] P. Whittle. Risk sensitive optimal control. Wiley, 1990.
[80] P. Whittle. Optimal control: basics and beyond. Wiley, 1996.
[81] Jean-Pierre Zigrand and Jon Danielsson. What happens when you regulate risk?:
evidence from a simple equilibrium model. LSE Research Online Documents on
Economics, London School of Economics and Political Science, LSE Library, 2001.

938838223-MIT.pdf

  • 1.
    Applications of optimalportfolio management by Dimitrios Bisias Submitted to the Sloan School of Management in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Operations Research at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY September 2015 c Massachusetts Institute of Technology 2015. All rights reserved. Author .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sloan School of Management June 22, 2015 Certified by. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Andrew W. Lo Charles E. and Susan T. Harris Professor of Finance Thesis Supervisor Accepted by . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Patrick Jaillet Dugald C. Jackson Professor, Department of Electrical Engineering and Computer Science Co-director, Operations Research Center
  • 2.
  • 3.
    Applications of optimalportfolio management by Dimitrios Bisias Submitted to the Sloan School of Management on June 22, 2015, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Operations Research Abstract This thesis revolves around applications of optimal portfolio theory. In the first essay, we study the optimal portfolio allocation among convergence trades and mean reversion trading strategies for a risk averse investor who faces Value- at-Risk and collateral constraints with and without fear of model misspecification. We investigate the properties of the optimal trading strategy, when the investor fully trusts his model dynamics. Subsequently, we investigate how the optimal trading strategy of the investor changes when he mistrusts the model. In particular, we assume that the investor believes that the data will come from an unknown member of a set of unspecified alternative models near his approximating model. The investor believes that his model is a pretty good approximation in the sense that the relative entropy of the alternative models with respect to his nominal model is small. Concern about model misspecification leads the investor to choose a robust optimal portfolio allocation that works well over that set of alternative models. In the second essay, we study how portfolio theory can be used as a framework for making biomedical funding allocation decisions focusing on the National Institutes of Health (NIH). Prioritizing research efforts is analogous to managing an invest- ment portfolio. In both cases, there are competing opportunities to invest limited resources, and expected returns, risk, correlations, and the cost of lost opportunities are important factors in determining the return of those investments. Can we apply portfolio theory as a systematic framework of making biomedical funding allocation decisions? Does NIH manage its research risk in an efficient way? 
What are the challenges and limitations of portfolio theory as a way of making biomedical funding allocation decisions? Finally in the third essay, we investigate how risk constraints in portfolio opti- mization and fear of model misspecification affect the statistical properties of the market returns. Risk sensitive regulation has become the cornerstone of international financial regulations. How does this kind of regulation affect the statistical properties of the financial market? Does it affect the risk premium of the market? What about the volatility or the liquidity of the market? 3
  • 4.
    Thesis Supervisor: AndrewW. Lo Title: Charles E. and Susan T. Harris Professor of Finance 4
  • 5.
    Acknowledgments I would liketo express my gratitude to my advisor and mentor, Professor Andrew W. Lo, for his continuing support and advice over all the years I spent at MIT. His immense knowledge in diverse research areas, enthusiasm, hard work, outstanding leadership and motivation have been a source of inspiration. Working with him has been an honor and privilege and I could not have imagined having a better advisor and mentor for my Ph.D study. I would also like to thank the rest of my thesis committee: Professor Dimitri P. Bertsekas for comments that greatly improved this thesis and for his great books that made me love the field of optimization in the first place and Professor Leonid Kogan who provided his insight and expertise that greaty assisted this research. In addition I would like to thank Dr. James F. Watkins, MD for his invaluable help, insights and contribution to the second part of this research. Moreover, I would like to thank Dr. Paul Mende, Dr. Saman Majd and Dr. Eric Rosenfeld whom I had the fortune of being their teaching assistant in finance classes. Paul’s experience in quantitative trading made me realize what career I would like to follow and I am grateful for this. Being part of MIT and in particular the ORC and LFE communities has been a blessing and I consider myself very fortunate to be among very interesting and smart people. I will always remember my years at MIT with nostalgia and joy and I hope that I ’ll be able to express my gratitude in the future several times. My life at MIT would not be so complete and joyful if I didn’t have good lifelong friends to spend time and have productive discussions with. In particular, I would like to thank Nick Trichakis and his wife Lena, Christos and Elli Nicolaides, Markos and Sophia Trichas, Thomas and Anastasia Trikalinos, the golden coach George Pa- pachristoudis and Gerry Tsoukalas. 
Last but not least I would like to thank my parents Giorgo and Roula and my sister Katerina for their unconditional love and support. I owe to them everything and this thesis is dedicated to them. 5
  • 6.
  • 7.
    Contents 1 Optimal tradingof arbitrage opportunities under constraints 29 1.1 Literature review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 1.2 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 1.2.1 Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 1.2.2 Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 1.2.3 Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 1.2.4 Connection with Ridge and Lasso regression . . . . . . . . . . 46 1.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 1.3.1 Convergence trades . . . . . . . . . . . . . . . . . . . . . . . . 47 1.3.2 Mean reversion trading opportunities . . . . . . . . . . . . . . 56 1.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 2 Optimal trading of arbitrage opportunities under model misspecifi- cation 57 2.1 Literature review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 2.2 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 2.2.1 Alternative models representation . . . . . . . . . . . . . . . . 61 2.2.2 Model setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 2.3 Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 2.3.1 No fear of model misspecification . . . . . . . . . . . . . . . . 65 2.3.2 Fear of model misspecification no constraints . . . . . . . . . . 67 2.3.3 Fear of model misspecification with VaR and margin constraints 70 2.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 7
  • 8.
    2.4.1 Convergence tradeswithout constraints . . . . . . . . . . . . . 73 2.4.2 Mean reversion trading strategies without constraints . . . . . 78 2.4.3 Convergence trades with constraints . . . . . . . . . . . . . . . 92 2.4.4 Mean reversion trading strategies with constraints . . . . . . . 111 2.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 3 Estimating the NIH Efficient Frontier 131 3.1 NIH Background and Literature Review . . . . . . . . . . . . . . . . 132 3.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 3.2.1 Funding Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 3.2.2 Burden of Disease Data . . . . . . . . . . . . . . . . . . . . . 139 3.2.3 Applying Portfolio Theory . . . . . . . . . . . . . . . . . . . . 142 3.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147 3.3.1 Summary Statistics . . . . . . . . . . . . . . . . . . . . . . . . 147 3.3.2 Efficient Frontiers . . . . . . . . . . . . . . . . . . . . . . . . . 148 3.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151 4 Impact of model misspecification and risk constraints on market 157 4.1 Literature review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158 4.2 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 4.2.1 Model setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 4.2.2 Varying constraints . . . . . . . . . . . . . . . . . . . . . . . . 161 4.2.3 Varying risk aversions . . . . . . . . . . . . . . . . . . . . . . 165 4.2.4 Varying constraints and risk aversions . . . . . . . . . . . . . . 168 4.2.5 Varying fear of model misspecification . . . . . . . . . . . . . 168 4.3 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 A Technical Notes 173 8
  • 9.
    List of Figures 1-1Ellipsoids. Ellipsoids of poor investment opportunities for N=2 con- vergence trades at times t = 0.3, 0.6, 0.9. . . . . . . . . . . . . . . . . 42 1-2 Weights for the case of uncorrelated spreads and collateral constraint. Weights for the case of uncorrelated spreads. . . . . . . 45 1-3 VaR constraints, positive correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing VaR constraints (K=1). Initial wealth is $100. . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 1-4 VaR constraints, negative correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively cor- related (ρ = −0.5) convergence trades, while facing VaR constraints (K=1). Initial wealth is $100. . . . . . . . . . . . . . . . . . . . . . . 48 1-5 VaR constraints, positive correlations, tight constraints. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing VaR constraints (K=0.25). Initial wealth is $100. . . . . . . . . . . . . . . 49 1-6 VaR constraints, negative correlations, tight constraints. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5) convergence trades, while facing VaR constraints (K=0.25). Initial wealth is $100. . . . . . . . . . . . . . . 49 9
  • 10.
    1-7 Wealth evolutionunder VaR constraint. Typical path of the wealth evolution for an investor investing in two convergence trades using the same noise process for positive and negative correlation under the VaR constraint. Initial wealth is $100. . . . . . . . . . . . . . . . 50 1-8 Relation between final wealth and frequency the VaR con- straint binds. Final wealth is negatively correlated to the percentage of time the constraints bind when the initial values of the convergence trades are low. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 1-9 Margin constraints, positive correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively cor- related (ρ = 0.5) convergence trades, while facing margin constraints (Collateral = 1). Initial wealth is $100. . . . . . . . . . . . . . . . . . 52 1-10 Margin constraints, negative correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively cor- related (ρ = −0.5) convergence trades, while facing margin constraints (Collateral = 1). Initial wealth is $100. . . . . . . . . . . . . . . . . . 52 1-11 Margin constraints, positive correlations, more collateral needed. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing margin constraints (Collateral = 2). Initial wealth is $100. . . . . . . 53 1-12 Margin constraints, negative correlations, more collateral needed. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5) convergence trades, while fac- ing margin constraints (Collateral = 2). Initial wealth is $100. . . . . 53 1-13 Wealth evolution under margin constraint. Typical path of the wealth evolution for an investor investing in two convergence trades using the same noise process for positive and negative correlation under the margin constraint. 
Initial wealth is $100. . . . . . . . . . . . . . . 54 10
  • 11.
    1-14 Relation betweenfinal wealth and frequency the margin con- straint binds. Final wealth is negatively correlated to the percentage of time the constraints bind when the initial values of the convergence trades are low. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 1-15 Positions evolution under VaR constraints. Typical path of the positions in two convergence trading opportunities under VaR con- straints. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 1-16 Positions evolution under margin constraints. Typical path of the positions in two convergence trading opportunities under margin constraints. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 2-1 Partial derivative of the value function with respect to S for a single convergence trade. VS as a function of time at S = 1 for different values of the robustness multiplier for a single convergence trade. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 2-2 Distortion drift for a single convergence trade. Distortion drift as a function of time at S = 1 for different values of the robustness multiplier for a single convergence trade. . . . . . . . . . . . . . . . . 76 2-3 Distortion drift terms for a single convergence trade. Distor- tion drift terms as a function of time at S = 1 for ν = 1 for a single convergence trade. The first term corresponds to a positive distor- tion drift that reduces the wealth of the investor since the investor is shorting the spread, while the second term corresponds to a negative distortion drift that points to worse investment opportunities. . . . . 77 2-4 Optimal weight of a single convergence trade. Weight of the convergence trading strategy as a function of time at S = 1 for different values of the robustness multiplier. . . . . . . . . . . . . . . . . . . . 77 2-5 Partial derivative of the value function with respect to S for a single mean reversion trading strategy. 
VS as a function of time at S = 1 for different values of the robustness multiplier. . . . . 80 11
  • 12.
    2-6 Distortion driftfor a single mean reversion trading strategy. Distortion drift as a function of time at S = 1 for different values of the robustness multiplier. . . . . . . . . . . . . . . . . . . . . . . . . . 81 2-7 Distortion drift terms for a single mean reversion trading strategy. Distortion drift terms as a function of time at S = 1 for ν = 1. The first term corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, while the second term corresponds to a negative distortion drift that points to worse investment opportunities. . . . . . . . . . . 81 2-8 Optimal weight of a single mean reversion trading strategy. Weight of the mean reversion trading strategy as a function of time at S = 1 for different values of the robustness multiplier. . . . . . . . . . 82 2-9 Optimal weights of two uncorrelated mean reversion trading strategies for S1 = 1 and S2 = 2. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 2-10 Ratio of the optimal weights. Ratio of the optimal weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 2-11 Partial derivative of the value function with respect to S1 and S2 at S1 = 1 and S2 = 2 when ρ = 0. Partial derivative of the value function with respect to S1 and S2 as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 12
  • 13.
    2-12 Optimal weightsof two positively correlated mean reversion trading strategies for S1 = 1 and S2 = 2. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.5. . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 2-13 Optimal weights of two negatively correlated mean reversion trading strategies for S1 = 1 and S2 = 2. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.5. . . . . . . . . . . . . . . . . . . . . . . . . . . 86 2-14 Optimal weights of two uncorrelated mean reversion trading strategies for S1 = 1 and S2 = 1. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 2-15 Ratio of the optimal weights at S1 = 1 and S2 = 1 when ρ = 0. Ratio of the optimal weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0. . . . . . . 88 2-16 Partial derivative of the value function with respect to S1 and S2 at S1 = 1 and S2 = 1 when ρ = 0. Partial derivative of the value function with respect to S1 and S2 as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 2-17 Optimal weights of two positively correlated mean reversion trading strategies for S1 = 1 and S2 = 1. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.9. . . . . . . . . . 
. . . . . . . . . . . . . . . . . . 89 13
  • 14.
    2-18 Partial derivativeof the value function with respect to S1 and S2 at S1 = 1 and S2 = 1 when ρ = 0.9. Partial derivative of the value function with respect to S1 and S2 as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.9. . . . . . . . . . . . . . . . . . . 90 2-19 Ratio of the optimal weights at S1 = 1 and S2 = 1 when ρ = 0.9. Ratio of the magnitude of the optimal weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.9. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 2-20 Optimal weights of two negatively correlated mean reversion trading strategies for S1 = 1 and S2 = 1. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.8. . . . . . . . . . . . . . . . . . . . . . . . . . . 91 2-21 Partial derivative of the value function with respect to S for a single convergence trade when L = 0.1 and L = 100. VS as a function of time at S = 1 for different values of the robustness multiplier. The solid line is when L = 100 and the dotted line is for L = 0.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 2-22 Partial derivative of the value function with respect to S for a single convergence trade when L = 0.1. VS as a function of time at S = 1 for different values of the robustness multiplier. The collateral constraint is |F| ≤ 0.1. . . . . . . . . . . . . . . . . . . . . . 95 2-23 Optimal weight of a single convergence trade when L = 0.1. Weight of the convergence trading strategy as a function of time at S = 1 for different values of the robustness multiplier. The collateral constraint is |F| ≤ 0.1. . . . . . . . . . . . . . . . . . . . . . . . . . . 95 14
  • 15.
    2-24 Optimal weightof a single convergence trade when L = 1. Weight of the convergence trading strategy as a function of time at S = 1 for different values of the robustness multiplier. The collateral constraint is |F| ≤ 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 2-25 Distortion drift for a single convergence trade when L = 0.1. Distortion drift as a function of time at S = 1 for different values of the robustness multiplier. The collateral constraint is |F| ≤ 0.1. . . . 96 2-26 Distortion drift for a single convergence trade when L = 1. Distortion drift as a function of time at S = 1 for different values of the robustness multiplier. The collateral constraint is |F| ≤ 1. . . . . 97 2-27 Distortion drift terms for a single convergence trade when L = 0.1. Distortion drift terms as a function of time at S = 1 for ν = 1 and L = 0.1. The first term corresponds to a positive distortion drift that reduces the wealth of the investor and it is bounded above due to the collateral constraint, while the second term corresponds to a negative distortion drift that points to worse investment opportunities. 97 2-28 Distortion drift terms for a single convergence trade when L = 1. Distortion drift terms as a function of time at S = 1 for ν = 1 and L = 1. The first term corresponds to a positive distortion drift that reduces the wealth of the investor and it is bounded above due to the collateral constraint, while the second term corresponds to a negative distortion drift that points to worse investment opportunities. 98 2-29 Optimal weight of a single convergence trade when L = 0.1 and L = 100. Weight of the convergence trading strategy as a function of time at S = 1 for different values of the robustness multiplier. The solid line is when L = 100 and the dotted line is for L = 0.1. . . . . . 98 15
  • 16.
    2-30 Optimal weightsof two uncorrelated convergence trades for S1 = 1 and S2 = 2 when L = 0.5. Weights of the convergence trades as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.5. . . . . . . . . . . . . . . . . . . . . 100 2-31 Value of the normalized wealth variance for two uncorrelated convergence trades at S1 = 1 and S2 = 2 when L = 0.5. Value of the normalized wealth variance for two uncorrelated convergence trades as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.5. . . . . . . . . . . . . . . . . 101 2-32 Optimal weights of two uncorrelated convergence trades for S1 = 1 and S2 = 2 when L = 0.05. Weights of the convergence trades as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . . . . . . 101 2-33 Value of the normalized wealth variance for two uncorrelated convergence trades at S1 = 1 and S2 = 2 when L = 0.05. Value of the normalized wealth variance for two uncorrelated convergence trades as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . 102 2-34 Optimal weights of two positively correlated convergence trades for S1 = 1 and S2 = 2 when L = 0.05. Weights of the convergence trades as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.5 and the rhs of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . 103 16
  • 17.
    2-35 Value ofthe normalized wealth variance for two positively cor- related convergence trades at S1 = 1 and S2 = 2 when L = 0.05. Value of the normalized wealth variance for two positively correlated convergence trades as a function of time at S1 = 1 and S2 = 2 for dif- ferent values of the robustness multiplier. The correlation coefficient is ρ = 0.5 and the rhs of the VaR constraint is L = 0.05. . . . . . . . 104 2-36 Optimal weights of two negatively correlated convergence trades for S1 = 1 and S2 = 2 when L = 0.05. Weights of the convergence trades as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . 104 2-37 Value of the normalized wealth variance for two negatively correlated convergence trades at S1 = 1 and S2 = 2 when L = 0.05. Value of the normalized wealth variance for two nega- tively correlated convergence trades as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The cor- relation coefficient is ρ = −0.5 and the rhs of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 2-38 Optimal weights of two uncorrelated convergence trades for S1 = 1 and S2 = 1 when L = 0.05. Weights of the convergence trades as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . . . . . . 107 2-39 Value of the normalized wealth variance for two uncorrelated convergence trades at S1 = 1 and S2 = 1 when L = 0.05. Value of the normalized wealth variance for two uncorrelated convergence trades as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05. . . . . . . 
. . . . . . . . . 107 17
  • 18.
    2-40 Optimal weightsof two positively correlated convergence trades for S1 = 1 and S2 = 1 when L = 0.05. Weights of the convergence trades as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.8 and the rhs of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . 108 2-41 Value of the normalized wealth variance for two positively cor- related convergence trades at S1 = 1 and S2 = 1 when L = 0.05. Value of the normalized wealth variance for two positively correlated convergence trades as a function of time at S1 = 1 and S2 = 1 for dif- ferent values of the robustness multiplier. The correlation coefficient is ρ = 0.8 and the rhs of the VaR constraint is L = 0.05. . . . . . . . 109 2-42 Optimal weights of two negatively correlated convergence trades for S1 = 1 and S2 = 1 when L = 8. Weights of the convergence trades as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.8 and the rhs of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . 109 2-43 Value of the normalized wealth variance for two negatively correlated convergence trades at S1 = 1 and S2 = 1 when L = 0.05. Value of the normalized wealth variance for two nega- tively correlated convergence trades as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The cor- relation coefficient is ρ = −0.8 and the rhs of the VaR constraint is L = 0.05. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110 2-44 Partial derivative of the value function with respect to S for a single mean reversion trading strategy and a collateral con- straint with L = 0.7. VS as a function of time at S = 1 for different values of the robustness multiplier for L = 0.7. . . . . . . . . . . . . . 113 18
2-45 Distortion drift terms for a single mean reversion trading strategy and a collateral constraint with L = 0.7. Distortion drift terms as a function of time at S = 1 for ν = 2 and for L = 0.7. The first term corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, while the second term corresponds to a negative distortion drift that points to worse investment opportunities. The first term is bounded above due to the collateral constraint.
2-46 Optimal weight of a single mean reversion trading strategy with a collateral constraint with L = 0.7. Weight of the mean reversion trading strategy as a function of time at S = 1 for different values of the robustness multiplier and for L = 0.7.
2-47 Partial derivative of the value function with respect to S for a single mean reversion trading strategy with different collateral constraints. VS as a function of time at S = 1 for different values of the robustness multiplier and different collateral constraints. The solid line is for L = 70 and the dotted line for L = 0.7.
2-48 Optimal weight of a single mean reversion trading strategy with different collateral constraints. Weight of the mean reversion trading strategy as a function of time at S = 1 for different values of the robustness multiplier and different collateral constraints. The solid line is for L = 70 and the dotted line for L = 0.7.
2-49 Optimal weights of two uncorrelated mean reversion trading strategies for S1 = 1 and S2 = 2 when L = 3. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 3.
2-50 Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 3. Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 3.
2-51 Optimal weights of two uncorrelated mean reversion trading strategies for S1 = 1 and S2 = 2 when L = 2. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 2.
2-52 Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 2. Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 2.
2-53 Optimal weights of two uncorrelated mean reversion trading strategies for S1 = 1 and S2 = 2 when L = 7. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 7.
2-54 Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 7. Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 7.
2-55 Optimal weights of two positively correlated mean reversion trading strategies for S1 = 1 and S2 = 2 when L = 7. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.5 and the rhs of the VaR constraint is L = 7.
2-56 Value of the normalized wealth variance for two positively correlated mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 7. Value of the normalized wealth variance for two positively correlated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.5 and the rhs of the VaR constraint is L = 7.
2-57 Optimal weights of two negatively correlated mean reversion trading strategies for S1 = 1 and S2 = 2 when L = 7. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint is L = 7.
2-58 Value of the normalized wealth variance for two negatively correlated mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 7. Value of the normalized wealth variance for two negatively correlated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint is L = 7.
2-59 Optimal weights of two uncorrelated mean reversion trading strategies for S1 = 1 and S2 = 1 when L = 2. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 2.
2-60 Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies at S1 = 1 and S2 = 1 when L = 2. Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 2.
2-61 Optimal weights of two positively correlated mean reversion trading strategies for S1 = 1 and S2 = 1 when L = 2. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.9 and the rhs of the VaR constraint is L = 2.
2-62 Value of the normalized wealth variance for two positively correlated mean reversion trading strategies at S1 = 1 and S2 = 1 when L = 2. Value of the normalized wealth variance for two positively correlated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.9 and the rhs of the VaR constraint is L = 2.
2-63 Optimal weights of two negatively correlated mean reversion trading strategies for S1 = 1 and S2 = 1 when L = 8. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint is L = 8.
2-64 Value of the normalized wealth variance for two negatively correlated mean reversion trading strategies at S1 = 1 and S2 = 1 when L = 8. Value of the normalized wealth variance for two negatively correlated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint is L = 8.
3-1 NIH time series flowchart. Flowchart for the construction of NIH appropriations time series. “NIH Approp.” denotes NIH appropriations; “PHS Gaps” denotes Institute funding by the U.S. Public Health Service; “Complete Approp.” denotes the union of these two series; “FY Change” allows for the change in government fiscal years; “4Q FY” time series refers to the resulting series in which all years are treated as having four quarters of three months each.
3-2 Appropriations data. NIH appropriations in real (2005) dollars, categorized by disease group.
3-3 YLL time series flowchart. Flowchart for the construction of years of life lost (YLL) time series. “WONDER Chapter Age Group” refers to a query to the CDC WONDER database at the chapter level, stratified by age group at death; “US Pop.” is the United States population from census data as expressed in the WONDER dataset; and “US GDP” denotes U.S. gross domestic product.
3-4 YLL data. Panel (a): Raw YLL categorized by disease group. Panel (b): Population-normalized YLL (with base year of 2005), categorized by disease group. Both panels are based on data from 1979 to 2007.
3-5 Efficient frontiers. Efficient frontiers for (a) all groups except HIV and AMS, γ = 0; (b) all groups except HIV and AMS, γ = 5; (c) all groups except HIV and AMS without the dementia effect, γ = 0; and (d) all groups except HIV and AMS without the dementia effect, γ = 5; based on historical ROI from 1980 to 2003.
4-1 Price of the risky asset as a function of the aggregate market supply under varying constraints. We assume that we have 5 agents with the same risk aversion coefficients. The red plot assumes the same L = 30 for all the agents, while the blue assumes L to be different across the agents: L1 = 10, L2 = 20, L3 = 30, L4 = 40, L5 = 50.
4-2 Price of the risky asset as a function of the aggregate market supply under tightening constraints. We assume that we have 5 agents with the same risk aversion coefficients. The blue plot assumes L to be different across the agents (L1 = 10, L2 = 20, L3 = 30, L4 = 40, L5 = 50) and the red assumes that each Li is reduced by 20%.
4-3 Price of the risky asset as a function of the aggregate market supply with less variable constraints. We assume that we have 5 agents with the same risk aversion coefficients. The blue plot assumes L to be different across the agents (L1 = 10, L2 = 20, L3 = 30, L4 = 40, L5 = 50) and the red assumes that L1 = 20, L2 = 25, L3 = 30, L4 = 35, L5 = 40.
4-4 Price of the risky asset as a function of the aggregate market supply with constraints and varying risk aversions. We assume that we have 5 agents with the same constraints but different risk aversion coefficients. The blue plot assumes L = 30 for each agent, while the red line assumes that the agents are unconstrained.
4-5 Price of the risky asset as a function of the aggregate market supply with tightening constraints and varying risk aversions. We assume that we have 5 agents with the same constraints but different risk aversion coefficients. The blue plot assumes L = 30 for each agent, while the red line assumes that L = 20 for each agent.
List of Tables

3.1 IoM recommendations. 12 major recommendations of the 1998 Institute of Medicine panel in four large areas for improving the process of allocating research funds.
3.2 ICD mapping. Classification of ICD-9 (1978–1998) and ICD-10 (1999–2007) Chapters and NIH appropriations by Institute and Center to 7 disease groups: oncology (ONC); heart lung and blood (HLB); digestive, renal and endocrine (DDK); central nervous system and sensory (CNS) into which we placed dementia and unspecified psychoses to create comparable series as there was a clear, ongoing migration noted from NMH to CNS after the change to ICD-10 in 1999; psychiatric and substance abuse (NMH); infectious disease, subdivided into estimated HIV (HIV) and other (AID); maternal, fetal, congenital and pediatric (CHD). The categories LAB and EXT are omitted from our analysis.
3.3 Return summary statistics. Summary statistics for the ROI of disease groups, in units of years (for the lag length) and per-capita-GDP-denominated reductions in YLL between years t and t+4 per dollar of research funding in year t−q, based on historical ROI from 1980 to 2003.
3.4 ROI example. An example of the ROI calculation for HLB from 1986.
3.5 Portfolio weights. Benchmark, single- and dual-objective optimal portfolio weights (in percent), based on historical ROI from 1980 to 2003.
Chapter 1

Optimal trading of arbitrage opportunities under constraints

In financial economics, an arbitrage is an investment opportunity that is too good to be true when there are no market frictions. In actual financial markets, however, there are frictions, and even when arbitrage opportunities exist, investors may not be able to fully exploit them because of the constraints they face. We will explore two kinds of risky arbitrage opportunities when there are market frictions. The first is a textbook arbitrage, a convergence trade. The second is a statistical arbitrage, a mean reversion trading strategy. These are two of the most popular trading strategies that hedge funds follow, so studying them in detail under market frictions is a valuable exercise.

A convergence trade is a trading strategy consisting of long/short positions in two similar assets: we buy the cheap asset, short the expensive asset, and wait until the prices of the two assets converge, which we know will happen with certainty at some particular time in the future. An example of this trade involves the difference in price between the on-the-run and the most recent off-the-run security. An on-the-run security is the most recently issued, and hence most liquid, of a periodically issued security. Since an on-the-run security is more liquid, it trades at a premium to off-the-run securities [29]. A convergence trade involves taking a long position in the most recent off-the-run security and shorting the on-the-run security. The on-the-run security will become off-the-run upon the issue of a newer security, and then there will be almost no difference between the two securities in our trade, so their prices will converge. Another example involves investing in Treasury STRIPS with identical maturity dates but different prices.

A mean reversion trading strategy involves investing in an asset or a portfolio of assets whose value is a mean reverting process. Since most price series in the equity space follow a random walk, this strategy most commonly involves investing in a portfolio of non-mean-reverting assets whose value is a stationary mean reverting series. Price series that can be combined in this way are called cointegrating. A classic statistical arbitrage example is pairs trading, the first type of algorithmic mean reversion trading strategy invented by institutional traders, reportedly by the trading desk of Nunzio Tartaglia at Morgan Stanley [64]. The pairs trading strategy bets on the convergence of the prices of two similar assets whose prices have diverged without a fundamental reason.

These arbitrage opportunities are risky under market frictions. In particular, the first case is exposed to “divergence risk”, i.e. the fact that the pricing differential between the two similar assets can diverge arbitrarily far from 0 prior to its convergence at some particular time in the future. The second case is exposed both to “divergence risk” and to “horizon risk”, i.e. the fact that the times at which the spread will converge to its long-run mean are uncertain.

We will explore the optimal portfolio allocation of a risk averse investor who invests in N convergence trades or mean reversion trading strategies while facing constraints. In particular, we will study the optimal trading strategy when he faces VaR constraints or collateral constraints.
Risk-sensitive regulation, such as the VaR constraint, has lately become a central component of international financial regulations. Collateral or margin constraints, under which the investor must have sufficient wealth to secure the liabilities created by short positions, have been ubiquitous in financial transactions for centuries, and margin calls have been behind several crises, including the LTCM debacle [54].

In the rest of this chapter, we will first review the relevant literature. Then we
will discuss the setup of the model and the constraints, find the optimal trading strategy of the investor, and finally explore the characteristics of this optimal strategy.

1.1 Literature review

Merton studied the problem of optimal portfolio allocation in a continuous time setting without any market frictions [59]. The optimal portfolio involves two terms: a market timing term and a hedging demand term. The first is a myopic term that represents the optimal allocation of an investor who, at each time instant t, cares only about a horizon dt ahead. The second represents the investor’s additional demand due to the covariance of the wealth process with the attractiveness of the available investment opportunities. Although Merton gives a general analytical solution, it is expressed in terms of the partial derivatives of the value function, and additional work is needed to derive the solution in terms of the model parameters. Additionally, it assumes that there are no market frictions.

Optimal trading of mean reversion trading strategies has been studied by both Boguslavsky and Boguslavskaya [15] and Jurek and Yang [48]. They found analytical solutions for the optimal weight of a single mean reverting trading strategy for risk averse CRRA investors. Their analysis is similar to the one in Kim and Omberg [49], who assume that there is a risk-free asset with a constant risk-free rate and a single risky asset with a mean reverting risk premium, which implies a mean reverting instantaneous Sharpe ratio. In all these cases it is assumed that there are no market frictions whatsoever.

Longstaff and Liu [53] studied the problem of optimal trading of a single convergence trade under a margin constraint. For a single convergence trade, VaR constraints and margin constraints collapse into the same constraint and the problem is significantly easier. In addition, by studying only convergence trades they removed one important dimension of risk, the horizon risk, keeping only the divergence risk. Brennan and Schwarz [17] have also studied the problem of optimal
trading of a single convergence trade including transaction costs when the arbitrage potential is restricted by position limits.

The literature is rich with papers that study the existence of an equilibrium in which mispricings exist. The persistence of mispricings is typically attributed to agency problems, frictions or some kind of risk. Unlike textbook arbitrages, which generate riskless profits and require no capital commitments, exploiting real-world mispricings requires the assumption of some kind of risk. Shleifer and Vishny [74] emphasized that risks such as the uncertainty about when the pricing differential will converge to 0 and the possibility of a divergence of the mispricing prior to its elimination may play a role in limiting the size of the positions that arbitrageurs are willing to take, contributing to the persistence of the arbitrage in equilibrium. Basak and Choitoru [6] also showed that arbitrage can persist in equilibrium when there are frictions. They study dynamic models with log utility and heterogeneous beliefs in the presence of margin requirements and other portfolio constraints.

With respect to the constraints, Basak and Shapiro [7] study the optimal trading strategy of a risk averse investor who faces finite horizon VaR constraints in a complete markets setting, using the martingale representation approach [4]. Here again there are no constraints on the optimal portfolio allocation at each time t; there is only one constraint on the wealth at some finite horizon. Finally, Geanakoplos [33] studies collateral constraints, how they determine an equilibrium leverage, and how this leverage changes over time, the so-called leverage cycles.

Let us now discuss the setup of the model and the constraints and find the optimal trading strategy of the investor.
1.2 Analysis

We assume we have a risk averse investor maximizing the expected continuously compounded rate of return or, equivalently, the expected logarithm of his final wealth $E(\ln W_T)$. There are two cases to consider. In the first case, the investor can invest in a risk-free asset and N non-redundant convergence trades, modeled as correlated
Brownian bridges. In the second case, the investor can invest in a risk-free asset and N non-redundant mean reversion trading strategies, modeled as a multivariate Ornstein-Uhlenbeck (OU) process. The investor faces two kinds of constraints: VaR constraints or collateral constraints. We determine the optimal trading strategy and its characteristics in both cases.

1.2.1 Models

As already mentioned, a convergence trade is a trading strategy consisting of long/short positions in two similar assets: we buy the cheap asset, short the expensive asset, and wait until the prices of the two assets converge, which we know will happen with certainty at some time in the future. The spread of the convergence trade can be modeled as a Brownian bridge driven by K Brownian motions, which has the property that the spread converges to 0 almost surely at a predetermined time in the future. The stochastic differential equation governing the spread of the trade is given by:

$$dS_t = -\frac{a S_t}{T - t}\,dt + \sum_{k=1}^{K} \sigma_k\,dZ_{kt} \tag{1.1}$$

where $S_t$ is the spread of the trade, $a$ is a parameter controlling the rate of the mean reversion to 0, $T$ is the horizon of the investor, which is also the time at which the spread goes to 0 with probability 1, and $Z_t$ is a Brownian motion in $\mathbb{R}^K$. We can see that the reversion to 0 grows stronger as $t \to T$. Therefore, the investment opportunities get better as the spread gets larger and $t \to T$, since then the drift term pushing the spread towards 0 gets larger.

A mean reversion trading strategy involves investing in a stationary portfolio of non-mean-reverting assets, whose value is a mean reverting process. The value of the portfolio can be modeled as an Ornstein-Uhlenbeck (OU) process. The stochastic differential equation governing it is given by:

$$dS_t = -\phi\,(S_t - \bar{S})\,dt + \sum_{k=1}^{K} \sigma_k\,dZ_{kt}$$
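The two scalar spread dynamics above can be explored numerically. The following is a minimal sketch, not part of the original analysis: it assumes a single driving Brownian motion (K = 1) and a plain Euler-Maruyama discretization, and the function names and parameter values are hypothetical choices made for illustration.

```python
import math
import random


def simulate_bridge_spread(s0, a, sigma, T, n_steps, rng):
    """Euler-Maruyama path of dS = -a*S/(T - t) dt + sigma dZ (equation 1.1, K = 1).

    The drift is singular at t = T, so the path stops one step before T.
    """
    dt = T / n_steps
    s = s0
    path = [s]
    for k in range(n_steps - 1):
        t = k * dt
        s += -a * s / (T - t) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(s)
    return path


def simulate_ou_spread(s0, phi, s_bar, sigma, T, n_steps, rng):
    """Euler-Maruyama path of dS = -phi*(S - s_bar) dt + sigma dZ (scalar OU, K = 1)."""
    dt = T / n_steps
    s = s0
    path = [s]
    for _ in range(n_steps):
        s += -phi * (s - s_bar) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(s)
    return path


if __name__ == "__main__":
    rng = random.Random(42)  # illustrative seed
    bridge = simulate_bridge_spread(s0=1.0, a=1.0, sigma=0.2, T=1.0, n_steps=500, rng=rng)
    ou = simulate_ou_spread(s0=1.0, phi=2.0, s_bar=0.0, sigma=0.2, T=1.0, n_steps=500, rng=rng)
    print("bridge spread one step before T:", bridge[-1])
    print("OU spread at T:", ou[-1])
```

In this sketch the bridge path is pulled towards 0 with a force that grows as t approaches T, whereas the OU path is pulled towards its long-run mean at a constant rate and carries no guarantee of having converged by time T.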
In our case we have N of these mean reverting processes and we assume that they are modeled as a multivariate Ornstein-Uhlenbeck process, which is defined by the following stochastic differential equation:

$$dS_t = -\Phi\,(S_t - \bar{S})\,dt + \sigma\,dZ_t \tag{1.2}$$

Above, $\Phi$ is an N-by-N square transition matrix that characterizes the deterministic portion of the evolution of the process, $\bar{S}$ is the vector representing the unconditional mean of the process, $\sigma$ is an N-by-K matrix that drives the dispersion of the process and $Z_t$ is a Brownian motion in $\mathbb{R}^K$.

The Ornstein-Uhlenbeck process has the nice property that its conditional distribution is normal at all times, with mean equal to

$$E_t[S_{t+\tau}] = \bar{S} + e^{-\Phi\tau}(S_t - \bar{S})$$

and covariance matrix independent of $S_t$ [60]. We assume that $\Phi$ has eigenvalues with positive real part, so that the conditional expectation approaches $\bar{S}$ as $\tau \to \infty$.

The Ornstein-Uhlenbeck process captures the two important dimensions of risk in all relative value trades: the “horizon risk”, i.e. the fact that the times at which the spread will converge to its long-run mean are uncertain, and the “divergence risk”, i.e. the fact that the pricing differential can diverge arbitrarily far from its long-run mean prior to its convergence. The Brownian bridge captures only the “divergence risk”, since by its definition we assume that the investor has perfect information about the magnitude of the mispricing at some future date T, i.e. we assume that the date T on which the mispricing will be eliminated is known in advance with certainty.

1.2.2 Constraints

We consider two kinds of constraints: VaR and collateral constraints. The VaR constraint is a widely used statistical risk measure, adopted both by the regulators
and the private sector. It is the cornerstone of the Basel capital regulations. Both the 1996 market risk amendment of the original 1988 Basel accord and the Basel II regulations have been built on the notion of Value-at-Risk [47].

The Value-at-Risk (VaR) at level α is defined as the threshold value such that the probability of losses greater than the threshold is less than α. In our case we consider instantaneous VaR constraints, which amount to imposing an upper bound on the wealth volatility, since locally the diffusion processes have normal distributions. Therefore, the instantaneous VaR constraints are given by:

$$\theta^T \Sigma\,\theta \le L W^2$$

where $\theta$ is an N-by-1 vector of positions, $\Sigma$ is the instantaneous covariance matrix of the spreads, $L$ is a proportionality constant that determines the tightness of the constraint and $W$ is the investor’s wealth.

Collateral or margin constraints have been ubiquitous in financial transactions for centuries. Even Shakespeare, in “The Merchant of Venice”, points out the importance of collateral: Shylock charged Antonio no interest but asked for a pound of flesh as collateral. Collateral constraints provide protection against mark-to-market losses whenever an investor generates a liability by shorting an asset. Therefore, they require that the investor’s wealth is bounded below by the collateral necessary to secure the liabilities. They are given by:

$$\sum_{i=1}^{N} \lambda_i\,|\theta_i| \le W$$

where $\lambda_i$ is the collateral necessary to secure the liability in spread i. In our work, each unit of arbitrage should be understood as being relative to a fixed face or notional amount, and therefore each $\lambda_i$ is a percentage of this fixed face value or notional amount.
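As a concrete reading of the two constraints, the sketch below checks whether a given position vector is admissible. It is only an illustration under stated assumptions: the helper names `var_constraint_ok` and `collateral_constraint_ok` and all numerical values are hypothetical and do not come from the thesis.

```python
def var_constraint_ok(theta, cov, L, wealth):
    """Instantaneous VaR constraint: theta^T Sigma theta <= L * W^2."""
    n = len(theta)
    quad = sum(theta[i] * cov[i][j] * theta[j] for i in range(n) for j in range(n))
    return quad <= L * wealth ** 2


def collateral_constraint_ok(theta, lam, wealth):
    """Collateral constraint: sum_i lambda_i * |theta_i| <= W."""
    return sum(l * abs(x) for l, x in zip(lam, theta)) <= wealth


if __name__ == "__main__":
    theta = [2.0, -1.0]                  # illustrative positions in two spreads
    cov = [[0.04, 0.01], [0.01, 0.09]]   # illustrative instantaneous covariance
    print(var_constraint_ok(theta, cov, L=0.05, wealth=10.0))
    print(collateral_constraint_ok(theta, lam=[3.0, 5.0], wealth=10.0))
```

With these made-up numbers the position satisfies the VaR bound (0.21 against 0.05 × 10² = 5) but violates the collateral bound (11 against 10), which shows that the two constraints restrict different features of the position: the first its variance, the second its gross size.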
1.2.3 Solution

Let us now find the optimal trading strategy of a risk averse investor who maximizes the expected logarithm of his final wealth $E(\ln W_T)$. We consider two cases:

• The investor invests in the risk-free asset and in N correlated convergence trades.
• The investor invests in the risk-free asset and in N correlated mean reversion trading strategies.

The analysis is similar for both cases. In both we have:

$$W_t = \sum_{i=1}^{N} \theta_{it} S_{it} + \theta_{0t} B_{0t} \qquad \forall t \in [0, T] \tag{1.3}$$

where $\theta_{it}$ is the investor’s position in opportunity i for $i = 1, \cdots, N$, $\theta_{0t}$ is the investor’s position in the risk-free asset, $S_{it}$ is the spread of the convergence trade or the value of the mean reverting portfolio and $B_{0t}$ is the price of the risk-free asset. The process $\theta_t$ is adapted to the filtration generated by the Brownian motion $Z_t$.

The investor solves the following problem:

$$\begin{aligned} \underset{\theta \in \Theta}{\text{maximize}} \quad & E(\ln W_T) \\ \text{subject to} \quad & dW_t = \sum_{i=1}^{N} \theta_{it}\,dS_{it} + \theta_{0t}\,dB_{0t} \\ & dS_t = \mu(S, t)\,dt + \sigma(S, t)\,dZ_t \end{aligned} \tag{1.4}$$

where $\Theta$ is the set of admissible trading strategies.

Let us first define, $\forall t \in [0, T]$, $F_t = \theta_t / W_t \in \mathbb{R}^N$.
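For the convergence trades case, the investor’s wealth satisfies the following stochastic differential equation:

$$dW_t = W_t\,r\,dt + \sum_{i=1}^{N} \theta_{it} S_{it}\left(-\frac{a_i}{T - t} - r\right)dt + \theta_t^T \sigma\,dZ_t$$

$$\frac{dW_t}{W_t} = r\,dt + \sum_{i=1}^{N} F_{it} S_{it}\left(-\frac{a_i}{T - t} - r\right)dt + F_t^T \sigma\,dZ_t$$

By applying Ito’s Lemma we have that:

$$d(\ln W_t) = r\,dt + \sum_{i=1}^{N} F_{it} S_{it}\left(-\frac{a_i}{T - t} - r\right)dt - \frac{1}{2} F_t^T \Sigma F_t\,dt + F_t^T \sigma\,dZ_t$$

Therefore:

$$\ln W_T = \ln W_t + \int_t^T r_s\,ds + \int_t^T \left(\sum_{i=1}^{N} F_{is} S_{is}\left(-\frac{a_i}{T - s} - r_s\right) - \frac{1}{2} F_s^T \Sigma F_s\right)ds + \int_t^T F_s^T \sigma\,dZ_s \tag{1.5}$$

Assuming a constant interest rate, we have:

$$E_t(\ln W_T) = \ln W_t + r(T - t) + E_t\left(\int_t^T \left(\sum_{i=1}^{N} F_{is} S_{is}\left(-\frac{a_i}{T - s} - r\right) - \frac{1}{2} F_s^T \Sigma F_s\right)ds\right) + E_t\left(\int_t^T F_s^T \sigma\,dZ_s\right) \tag{1.6}$$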
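For the mean reversion trading strategies case, the investor’s wealth satisfies the following stochastic differential equation:

$$dW_t = W_t\,r\,dt + \sum_{i=1}^{N} \theta_{it}\left(-\Phi_i^T (S_t - \bar{S}) - r S_{it}\right)dt + \theta_t^T \sigma\,dZ_t$$

$$\frac{dW_t}{W_t} = r\,dt + \sum_{i=1}^{N} F_{it}\left(-\Phi_i^T (S_t - \bar{S}) - r S_{it}\right)dt + F_t^T \sigma\,dZ_t$$

where $\Phi_i$ is the i-th row of the transition matrix $\Phi$. By applying Ito’s Lemma we have that:

$$d(\ln W_t) = r\,dt + \sum_{i=1}^{N} F_{it}\left(-\Phi_i^T (S_t - \bar{S}) - r S_{it}\right)dt - \frac{1}{2} F_t^T \Sigma F_t\,dt + F_t^T \sigma\,dZ_t$$

Therefore:

$$\ln W_T = \ln W_t + \int_t^T r_s\,ds + \int_t^T \left(\sum_{i=1}^{N} F_{is}\left(-\Phi_i^T (S_s - \bar{S}) - r_s S_{is}\right) - \frac{1}{2} F_s^T \Sigma F_s\right)ds + \int_t^T F_s^T \sigma\,dZ_s \tag{1.7}$$

Assuming a constant interest rate, we have:

$$E_t(\ln W_T) = \ln W_t + r(T - t) + E_t\left(\int_t^T \left(\sum_{i=1}^{N} F_{is}\left(-\Phi_i^T (S_s - \bar{S}) - r S_{is}\right) - \frac{1}{2} F_s^T \Sigma F_s\right)ds\right) + E_t\left(\int_t^T F_s^T \sigma\,dZ_s\right) \tag{1.8}$$

Under the VaR constraints we have:

$$F_t^T \Sigma F_t \le L < \infty \qquad \forall t$$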
Under the margin constraints we have:

$$\sum_{i=1}^{N} \lambda_i\,|F_{it}| \le 1 \qquad \forall t$$

$$F_t^T \Sigma F_t = \sum_{i=1}^{N} \sum_{j=1}^{N} F_{it} F_{jt}\,\sigma_{ij} \le \sum_{i=1}^{N} \sum_{j=1}^{N} \lambda_i \lambda_j\,|F_{it}|\,|F_{jt}|\,\frac{|\sigma_{ij}|}{\lambda_i \lambda_j} < C < \infty \qquad \forall t$$

Therefore, for both cases and both constraints the integrand of the stochastic integral belongs to $H^2$, which is a sufficient condition for the stochastic integral to be a martingale. Consequently, $E_t\left(\int_t^T F_s^T \sigma\,dZ_s\right)$ is equal to 0. Maximizing $E_t(\ln W_T)$ is therefore equivalent to maximizing the third term in equations (1.6) and (1.8) for the two cases respectively. Let us now study in detail the solution for both cases and both constraints.

VaR constraint

Maximizing $E_t(\ln W_T)$ under the VaR constraint is equivalent to solving, $\forall t$, the following QCQP:

$$\begin{aligned} \text{minimize} \quad & F_t^T \mu_t + \frac{1}{2} F_t^T \Sigma F_t \\ \text{subject to} \quad & F_t^T \Sigma F_t \le L \end{aligned} \tag{1.9}$$

where

$$\mu_t = \begin{pmatrix} S_{1t}\left(\frac{a_1}{T - t} + r\right) \\ \vdots \\ S_{Nt}\left(\frac{a_N}{T - t} + r\right) \end{pmatrix} \tag{1.10}$$

for the convergence trades case and

$$\mu_t = \begin{pmatrix} \Phi_1^T (S_t - \bar{S}) + r S_{1t} \\ \vdots \\ \Phi_N^T (S_t - \bar{S}) + r S_{Nt} \end{pmatrix} \tag{1.11}$$

for the mean reversion trading strategies case.
We can easily solve problem (1.9) by applying the KKT conditions or by geometry (see Appendix). $F_t^{opt}$, $\lambda_t^{opt}$ are optimal iff they satisfy the following KKT conditions [10]:

• Primal feasibility: $(F_t^{opt})^T \Sigma F_t^{opt} \le L$
• Dual feasibility: $\lambda_t^{opt} \ge 0$
• Complementary slackness: $\lambda_t^{opt}\left((F_t^{opt})^T \Sigma F_t^{opt} - L\right) = 0$
• Minimization of the Lagrangean: $F_t^{opt} = \operatorname{argmin}\ \mathcal{L}(F_t, \lambda_t^{opt})$

By solving the KKT conditions (see Appendix for details) we find that:

$$\theta_t^{opt} = \begin{cases} -\Sigma^{-1} \mu_t W_t & \text{if } \mu_t^T \Sigma^{-1} \mu_t \le L \\[6pt] -\dfrac{\Sigma^{-1} \mu_t W_t}{\sqrt{\dfrac{\mu_t^T \Sigma^{-1} \mu_t}{L}}} & \text{if } \mu_t^T \Sigma^{-1} \mu_t \ge L \end{cases}$$

This is equivalently written as:

$$\theta_t^{opt} = -\frac{\Sigma^{-1} \mu_t W_t}{1 + \lambda_t^{opt}} \qquad \text{where} \qquad 1 + \lambda_t^{opt} = \max\left(1, \sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}}\right)$$

Let us now discuss the properties of the solution. The investor has logarithmic preferences. Therefore, he is a myopic optimizer and there is no hedging demand [59]. At each time t he looks dt ahead and decides how to trade in an optimal way. There are two cases to consider:

• Case 1: At time t, $\mu_t^T \Sigma^{-1} \mu_t \le L$.

In this case, the optimal solution is the unconstrained myopic optimal solution, since it satisfies the VaR constraint.

For the convergence trades case, this is equivalent to the spread $S_t$ being in the ellipsoid $E_t = \{S \mid S^T (A_t \Sigma^{-1} A_t) S \le L\}$ where $A_t = \operatorname{diag}\left(\frac{a_1}{T - t} + r, \cdots, \frac{a_N}{T - t} + r\right)$.
The volume of the ellipsoid $E_t$ shrinks as $t \to T$, since
\[
\mathrm{vol}(E_t) = L^{N/2} \prod_{i=1}^N \left(\frac{a_i}{T-t} + r\right)^{-1} \sqrt{\det(\Sigma)}\; \mathrm{vol}(B(0,1))
\]
where $B(0,1)$ is the unit ball. Figure 1-1 shows this shrinking ellipsoid at three time instants.
For the mean reversion trading strategies case, this is equivalent to the spread or value of the trade being inside the convex set
\[
C = \{S \mid (S-\bar{S})^T ((\Phi + rI)^T \Sigma^{-1} (\Phi + rI)) (S-\bar{S}) + 2r\bar{S}^T \Sigma^{-1} (\Phi + rI)(S-\bar{S}) \le L - r^2 \bar{S}^T \Sigma^{-1} \bar{S}\},
\]
which in the case of $r = 0$ is the ellipsoid $C = \{S \mid (S-\bar{S})^T (\Phi^T \Sigma^{-1} \Phi)(S-\bar{S}) \le L\}$. If $\bar{S} = 0$ this convex set is also an ellipsoid.
These ellipsoids characterize poor opportunities, where the constraints are not active. What constitutes a poor investment opportunity changes over time for convergence trades, while it remains invariant for mean reversion trades. For convergence trades, the same spreads may initially be poor investment opportunities, for which the investor is more conservative and the constraint does not bind, but after some time they may become good opportunities, for which the investor becomes more aggressive and the constraint binds.
Informally, when the investment opportunities are poor, the spreads are more likely to widen, which would lead to mark-to-market losses, and the investor would then not have sufficient wealth to take advantage of the better investment opportunities and simultaneously satisfy the VaR constraints. Therefore, the investor is more conservative.
• Case 2: At time $t$: $\mu_t^T \Sigma^{-1} \mu_t > L$. Now the unconstrained myopic optimal solution does not satisfy the VaR constraint. This case is equivalent to the spread $S_t$ being outside the shrinking ellipsoid $E_t$ for the convergence trades case, or outside the set $C$ for the mean reversion trades case. Now the investment opportunities are good. The investor would like to follow the unconstrained optimal trading strategy but, because of the VaR constraint, invests only the fraction of it that is needed to satisfy the constraint.
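The two cases can be collected in a few lines of code. The following is a minimal sketch, not the thesis implementation: it assumes NumPy, works with the wealth fractions $F_t = \theta_t / W_t$, and uses a hypothetical function name and made-up inputs ($\Sigma$, $\mu_t$, $L$) chosen only for illustration.

```python
import numpy as np

def var_constrained_weights(mu, Sigma, L):
    """Minimize F'mu + 0.5 F'Sigma F subject to F'Sigma F <= L.

    Returns the wealth fractions F = theta / W and the multiplier lambda.
    (Hypothetical helper; the name and signature are illustrative.)
    """
    unconstrained = -np.linalg.solve(Sigma, mu)       # -Sigma^{-1} mu
    risk = float(mu @ np.linalg.solve(Sigma, mu))     # mu' Sigma^{-1} mu
    scale = max(1.0, np.sqrt(risk / L))               # equals 1 + lambda_opt
    return unconstrained / scale, scale - 1.0

# Illustrative inputs only (not the thesis parameters).
Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
mu_poor = np.array([0.3, 0.2])    # Case 1: inside the ellipsoid
mu_good = np.array([3.0, -1.0])   # Case 2: outside the ellipsoid

F_poor, lam_poor = var_constrained_weights(mu_poor, Sigma, L=1.0)
F_good, lam_good = var_constrained_weights(mu_good, Sigma, L=1.0)
```

In the first call the constraint is slack and the multiplier is zero; in the second the unconstrained position is scaled down until $F^T \Sigma F = L$.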
Figure 1-1: Ellipsoids. Ellipsoids of poor investment opportunities for N = 2 convergence trades at times t = 0.3, 0.6, 0.9 (axes: Spread 1, Spread 2).

Margin constraint

Maximizing $E_t(\ln(W_T))$ under the margin constraint is equivalent to solving, for all $t$, the following convex program:
\[
\begin{aligned}
\text{minimize} \quad & F_t^T \mu_t + \tfrac{1}{2} F_t^T \Sigma F_t \\
\text{subject to} \quad & \sum_{i=1}^N \lambda_i |F_{it}| \le 1
\end{aligned}
\tag{1.12}
\]
with
\[
\mu_t = \begin{pmatrix} S_{1t}\left(\frac{a_1}{T-t} + r\right) \\ \vdots \\ S_{Nt}\left(\frac{a_N}{T-t} + r\right) \end{pmatrix}
\]
for the convergence trades case and
\[
\mu_t = \begin{pmatrix} \Phi_1^T (S_t - \bar{S}) + r S_{1t} \\ \vdots \\ \Phi_N^T (S_t - \bar{S}) + r S_{Nt} \end{pmatrix}
\]
for the mean reversion trading strategies case.
Let us apply the KKT conditions. $F_t^{opt}$, $\nu_t^{opt}$ are optimal iff they satisfy the KKT conditions:
• Primal feasibility: $\sum_{i=1}^N \lambda_i |F_{it}^{opt}| \le 1$
• Dual feasibility: $\nu_t^{opt} \ge 0$
• Complementary slackness: $\nu_t^{opt} \left(\sum_{i=1}^N \lambda_i |F_{it}^{opt}| - 1\right) = 0$
• Minimization of the Lagrangean: $F_t^{opt} = \arg\min_{F_t} \mathcal{L}(F_t, \nu_t^{opt})$
This program cannot be solved analytically in general. Again there are two cases to consider.
• Case 1: At time $t$: $\|\Lambda \Sigma^{-1} \mu_t\|_1 \le 1$, where $\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_N)$. In this case, the optimal solution is the unconstrained myopic optimal solution, since it satisfies the margin constraint.
For the convergence trades, this is equivalent to having at time $t$: $\|\Lambda \Sigma^{-1} A_t S_t\|_1 \le 1$, where $\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_N)$ and $A_t = \mathrm{diag}\left(\frac{a_1}{T-t} + r, \cdots, \frac{a_N}{T-t} + r\right)$. In this case $S_t$ is inside a "diamond" in $N$-dimensional space, which shrinks as $t \to T$.
For the mean reversion trades, this is equivalent to having at time $t$: $\|\Lambda \Sigma^{-1} (\Phi(S_t - \bar{S}) + r S_t)\|_1 \le 1$, where $\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_N)$.
Informally again, when the investment opportunities are poor, the spreads are more likely to widen, which would lead to mark-to-market losses, and the investor would then not have sufficient wealth to take advantage of the better investment opportunities and also post the collateral necessary to secure the liabilities.
• Case 2: At time $t$: $\|\Lambda \Sigma^{-1} \mu_t\|_1 > 1$, where $\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_N)$. Now the investment opportunities are good, the unconstrained myopic optimal solution does not satisfy the collateral constraint, and the constraint binds at the optimal solution.

Uncorrelated opportunities

In the special case where the trading opportunities are uncorrelated, we can solve the KKT conditions analytically (see Appendix for details). In that case the optimal positions are given by:
\[
\theta_{it}^{opt} = \mathrm{sign}(-\mu_{it}) \, \frac{\left(\left|\frac{\mu_{it}}{\lambda_i}\right| - \nu_t^{opt}\right)^+}{\sigma_i^2 / \lambda_i} \, W_t
\tag{1.13}
\]
We observe the following:
• First, for the convergence trades, if the spread is positive we short the spread, as we would expect, and if it is negative we are long the spread. For the mean reversion trades, the sign of the position is the opposite of the sign of $\Phi_i^T (S_t - \bar{S}) + r S_{it}$.
• Second, if $|\mu_{it}|$ is high relative to the collateral, then the magnitude of the position is higher.
• Third, if the variability of the opportunity is high, the magnitude of the corresponding position is low.
• Finally, the most interesting property of the solution is that it has a cutoff value, the dual variable: if the absolute value of $\mu_{it}$ over the collateral $\lambda_i$ is greater than the dual variable, the position is different from zero; otherwise the position is 0. It is:
\[
\nu_t^{opt} = 0 \quad \text{if} \quad \sum_{i=1}^N \frac{|\lambda_i \mu_{it}|}{\sigma_i^2} \le 1
\]
and
\[
\nu_t^{opt} > 0 \quad \text{if} \quad \sum_{i=1}^N \frac{|\lambda_i \mu_{it}|}{\sigma_i^2} > 1
\]
The dual variable is 0 when the investment opportunities are poor. It is easy to see that when the margin constraint binds we have:
\[
\tilde{F}_{it}^{opt} = \mathrm{sign}(-\mu_{it}) \, \frac{\left(\left|\frac{\mu_{it}}{\lambda_i}\right| - \nu_t^{opt}\right)^+}{\sigma_i^2 / \lambda_i^2} \quad \text{and} \quad \|\tilde{F}\|_1 = 1
\tag{1.14}
\]

Figure 1-2: Weights for the case of uncorrelated spreads and collateral constraint. Weights for the case of uncorrelated spreads (panel title: Weights in different arbitrage opportunities; vertical axis: Weights).

In Figure 1-2 we can see an example of how we invest in different convergence trades when there is no correlation among them, with $\lambda = 1$ and volatilities equal to 1 for all the opportunities. The height of each bar is the absolute value of $\mu_{it}$, and we invest only in those spreads where $|\mu_{it}|$ is larger than the dual variable. If $\sum_{i=1}^N |\mu_{it}| < 1$, then the dual variable is 0, we invest in all the opportunities and the collateral constraint does not bind. If $\sum_{i=1}^N |\mu_{it}| > 1$, as in the figure, then the margin constraint binds, the dual variable is positive and we can find it as follows. We start with the cutoff $\nu$ at the maximum $|\mu_{it}|$ and then lower it until the sum of the weights is equal to 1, where each weight is the distance between $|\mu_{it}|$ and $\nu$ for the opportunities with $|\mu_{it}|$ above the cutoff.
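The search for the cutoff described above can be sketched as a one-dimensional root search. The code below is an illustrative sketch under stated assumptions, not the procedure used in the thesis: it assumes NumPy, uses bisection on $\nu$ instead of lowering the cutoff step by step, and the function name and the six values of $\mu_{it}$ are hypothetical.

```python
import numpy as np

def margin_dual_variable(mu, sigma, lam, tol=1e-12):
    """Dual variable nu and fractions F for uncorrelated opportunities.

    Solves sum_i lam_i * |F_i| = 1 when the margin constraint binds,
    using the closed form (1.14) for lam_i * F_i as a function of nu.
    (Hypothetical helper; bisection is an assumption of this sketch.)
    """
    mu, sigma, lam = (np.asarray(x, dtype=float) for x in (mu, sigma, lam))

    def scaled_weights(nu):
        # tilde F_i = sign(-mu_i) (|mu_i / lam_i| - nu)^+ lam_i^2 / sigma_i^2
        excess = np.maximum(np.abs(mu / lam) - nu, 0.0)
        return np.sign(-mu) * excess * lam**2 / sigma**2

    if np.abs(scaled_weights(0.0)).sum() <= 1.0:
        nu = 0.0                                   # constraint does not bind
    else:
        lo, hi = 0.0, np.max(np.abs(mu / lam))     # collateral use: >1 at lo, 0 at hi
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if np.abs(scaled_weights(mid)).sum() > 1.0:
                lo = mid
            else:
                hi = mid
        nu = hi
    return nu, scaled_weights(nu) / lam            # F_i = tilde F_i / lam_i

# Illustrative values with lam = 1 and unit volatilities, as in Figure 1-2.
mu = np.array([0.9, -0.6, 0.5, 0.2, -0.1, 0.05])
nu, F = margin_dual_variable(mu, sigma=np.ones(6), lam=np.ones(6))
```

For these inputs only the three opportunities with the largest $|\mu_{it}|$ receive a nonzero weight, which is the sparsity pattern discussed in the text.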
1.2.4 Connection with Ridge and Lasso regression

Before we explore further the properties and the results of the optimal trading strategies, it is interesting to digress for a while and see how our problems are connected to regularized regressions.
In the basic form of regularized regression, the goal is not only to have a good fit, but also regression coefficients that are "small". Two of the most common forms of regularized regression are the Ridge and Lasso regressions.
Ridge regression shrinks the regression coefficients by imposing a penalty on their size [42]. Equation 1.15 is one of the ways to write the Ridge problem.
\[
\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^N \Big(y_i - \beta_0 - \sum_{j=1}^p x_{ij}\beta_j\Big)^2 \\
\text{subject to} \quad & \sum_{j=1}^p \beta_j^2 \le t
\end{aligned}
\tag{1.15}
\]
The Ridge regression coefficients solution is similar to the optimal trading strategy followed by a risk averse investor with logarithmic preferences, who can choose among N diffusion processes and faces VaR constraints. In both cases we have proportional shrinkage: all the weights are scaled down by a common factor.
Lasso regression is another common form of regularized regression. It can be used as a heuristic for finding a sparse solution. It does a kind of continuous subset selection [16]. Equation 1.16 is one of the ways to write the Lasso problem.
\[
\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^N \Big(y_i - \beta_0 - \sum_{j=1}^p x_{ij}\beta_j\Big)^2 \\
\text{subject to} \quad & \sum_{j=1}^p |\beta_j| \le t
\end{aligned}
\tag{1.16}
\]
The Lasso regression coefficients solution is similar to the optimal trading strategy followed by a risk averse investor with logarithmic preferences, who can choose among N diffusion processes and faces margin constraints. Therefore, we can expect that in this case we will have a sparse solution, where the weights of several of the opportunities will be 0.
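The contrast between the two kinds of shrinkage is easiest to see in a textbook special case that is not part of the thesis: an orthonormal design matrix, with the problems written in penalized rather than constrained form. Under those assumptions Ridge rescales the least squares coefficients by a common factor, while Lasso soft-thresholds them. The sketch below assumes NumPy; the function names, coefficients and penalty values are hypothetical.

```python
import numpy as np

def ridge_orthonormal(b_ols, penalty):
    """Ridge solution of ||y - X b||^2 + penalty * ||b||^2 when X'X = I."""
    return b_ols / (1.0 + penalty)          # proportional shrinkage

def lasso_orthonormal(b_ols, penalty):
    """Lasso solution of 0.5 * ||y - X b||^2 + penalty * ||b||_1 when X'X = I."""
    return np.sign(b_ols) * np.maximum(np.abs(b_ols) - penalty, 0.0)  # soft threshold

# Hypothetical least squares coefficients, for illustration only.
b_ols = np.array([0.9, -0.6, 0.5, 0.2, -0.1])
b_ridge = ridge_orthonormal(b_ols, penalty=0.5)
b_lasso = lasso_orthonormal(b_ols, penalty=0.3)
```

The Ridge output keeps every coefficient nonzero, as the VaR-constrained positions do, while the Lasso output sets the small coefficients exactly to zero, mirroring the cutoff in equation 1.13.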
1.3 Results

Let us now move on to the results, first for the convergence trades and then for the mean reversion trading strategies.

1.3.1 Convergence trades

VaR constraints. We have simulated the optimal trading strategy for N = 2 correlated convergence trading opportunities under VaR constraints. We find the following:
• It is often optimal for an investor to underinvest, i.e. not to bind the constraint.
• The investor typically experiences losses early before locking in a profit, as we can see in Figures 1-3, 1-4, 1-5, 1-6.
• Tighter constraints lead to less variability and less skewness in the distribution of wealth. They also lead to less final wealth, as we can see in Figures 1-5, 1-6.
• The wealth is higher when the opportunities hedge each other, as we can see in Figures 1-4, 1-6. This makes sense: when the constraints bind, losses would force liquidation precisely when the investment opportunities are better, so we prefer the opportunities to hedge each other. Figure 1-7 shows a typical path for the wealth evolution using the same noise process for positive and negative correlation under the VaR constraint. We clearly see this hedging effect, where negative correlation leads to more wealth.
• When the initial values of the convergence trades are low, the constraints bind for a small percentage of time and final wealth is negatively correlated with the percentage of time the constraints bind. Figure 1-8 shows this effect.
• The final portfolio wealth is highly positively skewed, as is evident in Figures 1-3, 1-4, 1-5, 1-6.
For all the simulations we used: σ1 = σ2 = 1, a1 = a2 = 1, S[0] = [1; 1], rf = 0.06, number of steps = 1000.

Figure 1-3: VaR constraints, positive correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing VaR constraints (K=1). Initial wealth is $100.

Figure 1-4: VaR constraints, negative correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5) convergence trades, while facing VaR constraints (K=1). Initial wealth is $100.

Figure 1-5: VaR constraints, positive correlations, tight constraints. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing VaR constraints (K=0.25). Initial wealth is $100.

Figure 1-6: VaR constraints, negative correlations, tight constraints. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5) convergence trades, while facing VaR constraints (K=0.25). Initial wealth is $100.

Figure 1-7: Wealth evolution under VaR constraint. Typical path of the wealth evolution for an investor investing in two convergence trades using the same noise process for positive and negative correlation under the VaR constraint. Initial wealth is $100. (Horizontal axis: Simulation step; vertical axis: Final wealth; legend: rho 0.5, rho −0.5.)

Figure 1-8: Relation between final wealth and frequency the VaR constraint binds. Final wealth is negatively correlated to the percentage of time the constraints bind when the initial values of the convergence trades are low. (Horizontal axis: Frequency the constraint binds; vertical axis: Final wealth.)
Margin constraints. We have also simulated the optimal trading strategy for N = 2 correlated convergence trading opportunities under margin constraints, using the same noise process as with the VaR constraints. The results are similar to those for VaR constraints, as we see in Figures 1-9, 1-10, 1-11, 1-12, 1-13, with the following important differences:
• When the constraints bind, the position in one of the convergence trades is often 0, i.e. we have less diversification and a sparse solution. Figure 1-15 shows a typical path of the positions in two convergence trading opportunities under VaR constraints, where we see that they tend to be different from 0. Figure 1-16 shows the evolution of the positions in two convergence trading opportunities under margin constraints for exactly the same asset processes as before. We clearly see that we often invest in only one position, as we expected given the similarity of the positions to the Lasso regression coefficients.
• The final wealth is less skewed and smaller than in the case of VaR constraints.
Figure 1-9: Margin constraints, positive correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing margin constraints (Collateral = 1). Initial wealth is $100.

Figure 1-10: Margin constraints, negative correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5) convergence trades, while facing margin constraints (Collateral = 1). Initial wealth is $100.

Figure 1-11: Margin constraints, positive correlations, more collateral needed. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing margin constraints (Collateral = 2). Initial wealth is $100.

Figure 1-12: Margin constraints, negative correlations, more collateral needed. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5) convergence trades, while facing margin constraints (Collateral = 2). Initial wealth is $100.

Figure 1-13: Wealth evolution under margin constraint. Typical path of the wealth evolution for an investor investing in two convergence trades using the same noise process for positive and negative correlation under the margin constraint. Initial wealth is $100. (Horizontal axis: Simulation step; vertical axis: Final wealth; legend: rho 0.5, rho −0.5.)

Figure 1-14: Relation between final wealth and frequency the margin constraint binds. Final wealth is negatively correlated to the percentage of time the constraints bind when the initial values of the convergence trades are low. (Horizontal axis: Frequency the constraint binds; vertical axis: Final wealth.)

Figure 1-15: Positions evolution under VaR constraints. Typical path of the positions in two convergence trading opportunities under VaR constraints. (Horizontal axis: Simulation step; vertical axis: Position; legend: Convergence trade 1, Convergence trade 2.)

Figure 1-16: Positions evolution under margin constraints. Typical path of the positions in two convergence trading opportunities under margin constraints. (Horizontal axis: Simulation step; vertical axis: Position; legend: Convergence trade 1, Convergence trade 2.)
1.3.2 Mean reversion trading opportunities

We get similar results by simulating the optimal trading strategy for N = 2 correlated mean reversion trading opportunities under VaR and margin constraints. This makes sense since, without loss of generality, we have assumed that $\bar{S} = 0$ and that $\Phi$ is a diagonal matrix, which makes the mean reversion trades case similar to the convergence trades case. The two cases differ in that the drift term of the convergence trades, for the same spreads $S$, gets better and better as $t \to T$, while for the mean reversion trades it remains constant.

1.4 Conclusions

We explored the optimal portfolio allocation of a risk averse investor who invests in N convergence trades or mean reverting trading strategies while facing constraints. In particular, we studied the optimal trading strategy when he faces VaR constraints or collateral constraints. In both cases the optimal trading strategy is found by solving a convex program at each time $t$; we characterized the solution of this convex program and derived the properties of the optimal trading strategy.
Throughout the chapter, we have assumed that the investor completely trusts his model and is certain about the dynamics of the opportunities he faces. What happens if the model is just an approximation? What happens if the investor believes that the opportunities' dynamics come from an unknown member of a set of unspecified models near an approximating model? Concern about model misspecification will change the optimal trading strategy of the investor, and this is the topic of the next chapter.
Chapter 2

Optimal trading of arbitrage opportunities under model misspecification

A decision maker maximizes a utility function subject to a model. Standard control theory helps a decision maker to make optimal decisions when his model is correct. Robust control theory helps him to make good decisions when his model is approximately correct. In this chapter we will use methods of robust control theory to find the optimal portfolio allocation of a risk averse investor who invests in convergence trades or mean reverting trading strategies and is not completely confident about the dynamics of his models.
In particular, we assume that the investor believes that the data come from an unknown member of a set of alternative models near his nominal model. These alternative models are statistically difficult to distinguish from the nominal model. The investor believes that his model is a pretty good approximation, in the sense that the discrepancies between the alternative models and his nominal model are small. We will use the relative entropy to characterize the discrepancies between different models. Concern about model misspecification leads the investor to choose a trading strategy that is robust over the alternative models.
Three questions arise naturally at this point:
• What does it mean to have a robust trading strategy?
A robust trading strategy is a strategy that works well over the set of alternative probability models. We evaluate the worst performance of a given strategy over that set of alternative probability models and we pick the strategy that maximizes this worst case performance. It is essentially a "max-min" problem, a two-player game in which a maximizing player chooses the best response to a malevolent player who can disturb the stochastic model within limits.
• Why would we be interested in a robust decision rule over alternative models? Why don't we take a Bayesian approach, where we put a prior distribution over the set of alternative probability models?
This could be another approach, but the set of alternative models may be too large or too complicated for the investor to come up with a well behaved, plausible prior distribution. In addition, we might want our solution to work well over any kind of prior distribution [41].
• Why do we use the relative entropy to measure the discrepancy between an alternative and the nominal model?
There are other ways to measure discrepancies between alternative probability models, like the Prokhorov distance [9], but the relative entropy with respect to a measure $P$ has nice properties and is more tractable. It is given by:
\[
D(Q) = \int \log\left(\frac{dQ}{dP}\right) dQ
\]
and it is a convex function of the measure $Q$.
In the rest of this chapter, we will review the relevant literature. Then we will discuss the set of alternative models, the relative entropy and equivalent ways to formulate our problem of the optimal robust portfolio allocation of the investor. Subsequently, we will find the optimal robust trading strategy of the investor and finally we will explore the characteristics of this robust strategy.
2.1 Literature review

Whittle [79], [80] has studied mathematical methods for answering the question of how to make decisions when you don't fully trust your model.
Lars Hansen and Thomas Sargent in [41] have studied how to make economic decisions in the face of model misspecification by modifying and extending aspects of robust control theory. Their work revolves mostly around the linear-quadratic regulator framework, where there is a certainty equivalence principle that allows a deterministic presentation of the control theory.
Gilboa and Schmeidler in [34] have studied the max-min expected utility problem, where the decision maker has multiple priors and maximizes his expected utility assuming that nature chooses a probability measure to minimize his expected utility. The minimization is over a closed and convex set of finitely additive probability measures. Their axiomatic treatment views this set of non-unique priors as an expression of the agent's preferences, and the priors are not cast as distortions to a nominal model.
Lars Hansen et al. [40] have studied robust decision rules when the agent fears that the data are generated by a statistical perturbation of an approximating model that is either a controlled diffusion process or a control measure over continuous functions of time. They describe how stochastic formulations of robust control "constraint problems" can be viewed in terms of Gilboa and Schmeidler's max-min expected utility model. They show the connection between two closely related problems, the penalty robust control problem and the constraint robust control problem, and formulate the Hamilton-Jacobi-Bellman equations for various two-player zero-sum continuous time games that are defined in terms of a Markov diffusion process. We extend their framework to the problem of optimal robust trading rules for a risk averse investor who does not trust his model dynamics, believes that his nominal model is a good approximation to the real model and invests in arbitrage opportunities.
Fleming and Souganidis [77] present how the Bellman-Isaacs condition defines a Bellman equation for a two-player zero-sum game in which both players decide at time
0 or recursively. In other words, they show that the freedom to exchange the order of maximization and minimization guarantees that the equilibria of games where the choices are made under mutual commitment at time 0 coincide with the equilibria of games where the choices are made sequentially by both agents.
Anderson et al. [3] show how the set of perturbed models in our formulations is difficult to distinguish statistically from the approximating model given a finite sample of time series observations.
Jacobson [44] and Whittle [78] studied risk sensitive optimal control in the context of discrete-time linear quadratic regulator decision problems. They showed how the risk-sensitive control law can be computed by equivalently solving a robust penalty problem.
We will now discuss how to represent the alternative probability models over which we want our decision rules to be robust, and how relative entropy can be used to describe their discrepancies from the nominal model. We will formulate two closely related nonsequential problems and the corresponding recursive HJB equations, and finally we will find the optimal robust portfolio allocation for a risk averse investor who is not confident about the dynamics of his models and wants to invest in convergence trades or mean reversion trading strategies.

2.2 Analysis

In Chapter 1 we saw that when there is no model misspecification, the investor wants to find the optimal portfolio allocation that solves the following problem:
\[
\begin{aligned}
\text{maximize}_{\theta \in \Theta} \quad & E(\ln W_T) \\
\text{subject to} \quad & dW_t = \theta_t \, dS_t + \theta_{0t} \, dB_t \\
& dS_t = \mu(S,t)\,dt + \sigma(S,t)\,dZ_t \\
& dB_t = rB\,dt
\end{aligned}
\tag{2.1}
\]
Here we have $S_t \in \mathbb{R}^N$ and we have studied the following special cases:
\[
dS_{it} = -\frac{a_i S_{it}}{T-t}\,dt + \sum_{k=1}^K \sigma_{ik}\,dZ_{kt}
\]
\[
dS_t = -\Phi(S_t - \bar{S})\,dt + \sigma\,dZ_t
\]
and $\Theta$ is the set of admissible trading strategies:
\[
\Theta = \{\theta \mid \theta^T \Sigma \theta \le L W^2\}
\]
for the case of VaR constraints and
\[
\Theta = \Big\{\theta \;\Big|\; \sum_{i=1}^N \lambda_i |\theta_i| \le W\Big\}
\]
for the case of the margin constraints.
In this chapter, the investor doubts his model $dS_t = \mu(S,t)\,dt + \sigma(S,t)\,dZ_t$. To capture this doubt, we surround the approximating model with a cloud of models that are statistically difficult to distinguish from it, and we add a malevolent agent who picks the worst possible model. The investor wants to find the optimal trading strategy that solves the following problem:
\[
\max_{\theta \in \Theta} \min_{Q \in \mathcal{Q}} E_Q(\ln W_T)
\tag{2.2}
\]
where $\Theta$ is the set of admissible trading strategies and $\mathcal{Q}$ is the set of alternative probability models. Problem 2.2 fits the max-min expected utility model of Gilboa and Schmeidler [34], where $\mathcal{Q}$ is a set of multiple different priors.
Let us now discuss how we represent the set of alternative probability models.

2.2.1 Alternative models representation

We use martingales to represent perturbations to the probability models and relative entropy to measure the discrepancy between our nominal model and the alternative
models. To better understand our continuous time formulations, we digress for a while by borrowing an example from [41].
Let us consider a discrete time approximating model and its innovations $\epsilon_t$, which are i.i.d. Gaussian shocks. An alternative model alters the distribution of these shocks. We use martingales to represent distortions to the probabilities. Let $\hat{\pi}_t(\epsilon)$ be the alternative density of the shock $\epsilon_{t+1}$ based on date $t$ information. Then the random variable $M_t = \prod_{j=1}^t m_j$, where $m_j = \frac{\hat{\pi}_{j-1}(\epsilon_j)}{\pi(\epsilon_j)}$ and $M_0 = 1$, is a martingale and is the ratio of the joint alternative density to the joint nominal density. We define the entropy of the alternative distribution associated with $M_t$ as the expected log-likelihood ratio with respect to the distorted distribution, $E(M_t \log(M_t))$. It is always non-negative and it is equal to 0 only when there is no distortion to the nominal distribution.
Similarly, in our continuous time formulations we will use martingales to represent distortions to the nominal probability model. We will construct an alternative model by replacing $Z_t$ in our model by $\hat{Z}_t + \int_0^t h_s\,ds$, where $\hat{Z}_t$ is a Brownian motion under the alternative measure $Q$ and $h_t$ is an adapted process that models the distortion, such that the process
\[
\xi_t = \exp\left(\int_0^t h_s^T\,dZ_s - \frac{1}{2}\int_0^t h_s^T h_s\,ds\right)
\]
is a martingale. Therefore, the nominal model is misspecified by allowing the conditional mean of the shock vector in the alternative models to feed back arbitrarily on the history up to date $t$.
Since $\xi_0 = 1$ we have that $E(\xi_t) = 1$. Since in addition $\xi_t > 0$, we can define a probability measure $Q$ such that $Q(A) = E[1_A \xi_T]$; in other words, $\xi_T = \frac{dQ}{dP}$ is the Radon-Nikodym derivative of $Q$ with respect to $P$, where the measures $Q$ and $P$ are equivalent. In fact, for any such measure $Q$ one can always define a process $h_t$ so that the Radon-Nikodym derivative of $Q$ with respect to $P$, $\frac{dQ}{dP}$, is given by the exponential martingale $\xi_T$.
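The discrete-time entropy can be checked numerically for a single Gaussian shock. The sketch below is illustrative only: it assumes NumPy, takes the alternative model to be a pure mean shift of size $h$ (a hypothetical value) so that the one-step likelihood ratio is $m(\epsilon) = \exp(h\epsilon - h^2/2)$, and evaluates the expectations under the nominal model by Gauss-Hermite quadrature.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def expectation_std_normal(f, n_nodes=60):
    """E[f(eps)] for eps ~ N(0, 1), computed by Gauss-Hermite quadrature."""
    x, w = hermegauss(n_nodes)              # nodes/weights for weight exp(-x^2 / 2)
    return float(np.sum(w * f(x)) / np.sqrt(2.0 * np.pi))

h = 0.7  # hypothetical mean shift of the shock under the alternative model

def m(eps):
    """Likelihood ratio of N(h, 1) to N(0, 1): a one-step factor m_j."""
    return np.exp(h * eps - 0.5 * h**2)

mean_m = expectation_std_normal(m)                                # E[m] = 1
entropy = expectation_std_normal(lambda e: m(e) * np.log(m(e)))   # E[m log m] = h^2 / 2
```

The unit mean is what makes $M_t$ a martingale, and the entropy $h^2/2$ is zero only when $h = 0$, in line with the properties stated above.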
In this way, our distorted models are:
• For the convergence trades,
\[
dS_{it} = -\frac{a_i S_{it}}{T-t}\,dt + \sum_{k=1}^K \sigma_{ik}(d\hat{Z}_{kt} + h_{kt}\,dt)
\tag{2.3}
\]
• For the mean reversion trades,
\[
dS_t = -\Phi(S_t - \bar{S})\,dt + \sigma(d\hat{Z}_t + h_t\,dt)
\tag{2.4}
\]
where $\hat{Z}_t$ is a Brownian motion under $Q$. Why is it a Brownian motion under $Q$? The answer lies in the Girsanov theorem [30], which states that if a process $h_t$ is such that $\xi_t$ is a martingale and $\xi_T = \frac{dQ}{dP}$ is the Radon-Nikodym derivative of $Q$ with respect to $P$, then the process $\hat{Z}_t = Z_t - \int_0^t h_s\,ds$ is a Brownian motion under the measure $Q$.
Therefore, we parameterize $Q$ by the choice of the adapted drift distortion process $h_t$.
Similarly to the discrete time case, we measure the discrepancy between the measures $Q$ and $P$ by the relative entropy $D(Q)$ (see Appendix for derivation):
\[
D(Q) = \int_0^T \frac{1}{2} E_Q[h_t^T h_t]\,dt
\tag{2.5}
\]
This is to be expected, since the relative entropy between a multivariate Gaussian distribution $N(\mu, I)$ and the multivariate standard normal distribution is $D(Q) = \frac{1}{2}\mu^T\mu$ (see Appendix for derivation) and $h_t\,dt$ is the conditional mean of the process $dZ_t$ under the alternative probability measure $Q$.
To express the notion that the nominal model is a good approximation to the real model that generates the spread dynamics, we either restrain the alternative models by $D(Q) \le \eta$ or we penalize them with the magnitude of the entropy.

2.2.2 Model setup

Having described the set of alternative distributions, we are ready to formulate the problem faced by a risk averse investor who distrusts his model dynamics. As in [40] we define two closely related problems:
• A multiplier robust control problem.
\[
\begin{aligned}
\max_{\theta \in \Theta} \min_{Q} \quad & E_Q(\ln W_T) + \nu D(Q) \\
\text{subject to} \quad & dW_t = \theta_t\,dS_t + \theta_{0t}\,dB_t \\
& dS_t = \mu(S,t)\,dt + \sigma(S,t)(d\hat{Z}_t + h_t\,dt) \\
& dB_t = rB\,dt \\
& d\xi_t = \xi_t h_t^T\,dZ_t
\end{aligned}
\tag{2.6}
\]
where $\xi_T = \frac{dQ}{dP}$ and $D(Q)$ is given by equation 2.5. Here the restriction on the alternative models is implicit, imposed through the nonnegative penalty parameter $\nu$.
• A constrained robust control problem.
\[
\begin{aligned}
\max_{\theta \in \Theta} \min_{Q} \quad & E_Q(\ln W_T) \\
\text{subject to} \quad & dW_t = \theta_t\,dS_t + \theta_{0t}\,dB_t \\
& dS_t = \mu(S,t)\,dt + \sigma(S,t)(d\hat{Z}_t + h_t\,dt) \\
& dB_t = rB\,dt \\
& d\xi_t = \xi_t h_t^T\,dZ_t \\
& D(Q) \le \eta
\end{aligned}
\tag{2.7}
\]
where $\xi_T = \frac{dQ}{dP}$ and $D(Q)$ is given by equation 2.5.
In both cases the minimizing malevolent agent chooses the distortion process $h_t$ taking $\theta$ as given, and the maximizing investor chooses the optimal strategy taking $h_t$ as given. We index the family of multiplier robust control problems by $\nu$ and the family of constrained robust control problems by $\eta$. The two problems are related, since the robustness parameter $\nu$ can be interpreted as the Lagrange multiplier on the constraint $D(Q) \le \eta$. In fact, we can show that if $V(\nu)$ is the optimal value of the multiplier robust problem and $K(\eta)$ is the optimal value of the constrained robust problem, then we have $K(\eta) = \max_{\nu \ge 0} V(\nu) - \nu\eta$ [40]. Therefore we will only be interested in finding $V(\nu)$.
2.3 Solution

We will solve 2.6 by solving the corresponding Hamilton-Jacobi-Bellman (HJB) equation. We will solve the HJB equation both for the case where there are no constraints on the admissible trading strategies and for the case where the trading strategies are constrained by VaR or collateral considerations. But first let us digress for a while and solve the HJB equation for the case when there is no fear of model misspecification.

2.3.1 No fear of model misspecification

For now we assume that the investor completely trusts the dynamics of his models $dS_t = \mu(S,t)\,dt + \sigma(S,t)\,dZ_t$. He chooses the trading strategy $\theta \in \Theta$ that solves problem 2.1, where $\Theta$ is the set of admissible trading strategies:
\[
\Theta = \{\theta \mid \theta^T \Sigma \theta \le L W^2\}
\]
for the case of VaR constraints and
\[
\Theta = \Big\{\theta \;\Big|\; \sum_{i=1}^N \lambda_i |\theta_i| \le W\Big\}
\]
for the case of the margin constraints.
In this case the HJB equation is:
\[
\max_{\theta \in \Theta} \; V_t + V_W(Wr + \theta^T(\mu(S,t) - rS_t)) + V_S^T \mu(S,t) + \tfrac{1}{2} V_{WW}\,\theta^T \Sigma \theta + V_{WS}^T \Sigma \theta + \tfrac{1}{2}\mathrm{trace}(\Sigma V_{SS}) = 0
\]
where $\Sigma = \sigma\sigma^T$ and $V(W,S,t)$ is the value function of the investor, subject to the terminal condition $V(W,S,T) = \ln(W)$.
Due to the logarithmic preferences of the investor it is $V(W,S,t) = \ln(W) + H(S,t)$, therefore $V_{WS} = 0$, $V_W = \frac{1}{W}$ and $V_{WW} = -\frac{1}{W^2}$. We also define, for all $t \in [0,T]$, $F_t = \theta_t / W_t \in \mathbb{R}^N$.
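The HJB equation becomes:

\[
\max_{F \in \mathcal{F}} \; V_t + \left( r + F^T (\mu(S,t) - r S_t) \right) + V_S^T \mu(S,t) - \tfrac{1}{2} F^T \Sigma F + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) = 0
\]

where $\mathcal{F}$ is the set of admissible trading strategies:

\[ \mathcal{F} = \{ F \mid F^T \Sigma F \le L \} \]

for the case of VaR constraints and

\[ \mathcal{F} = \Big\{ F \;\Big|\; \sum_{i=1}^{N} \lambda_i |F_i| \le 1 \Big\} \]

for the case of the margin constraints. The optimal trading strategy is the solution to the following convex problem:

\[
\min_{F \in \mathcal{F}} \; F^T \left( -\mu(S,t) + r S_t \right) + \tfrac{1}{2} F^T \Sigma F
\tag{2.8}
\]

as we also proved with a different method in Chapter 1, where $\mu_t = -\mu(S,t) + r S_t$. In particular,

\[
\mu_t = \begin{pmatrix} S_{1t} \left( \frac{a_1}{T-t} + r \right) \\ \vdots \\ S_{Nt} \left( \frac{a_N}{T-t} + r \right) \end{pmatrix}
\]

for the convergence trades case and

\[
\mu_t = \begin{pmatrix} \Phi_1^T (S_t - \bar{S}) + r S_{1t} \\ \vdots \\ \Phi_N^T (S_t - \bar{S}) + r S_{Nt} \end{pmatrix}
\]

for the mean reversion trading strategies case.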
2.3.2 Fear of model misspecification, no constraints

In this section we assume that there are no constraints on the trading strategies followed by the risk averse investor and that the investor is not confident about the dynamics of his model. The Hamilton-Jacobi-Bellman equation for problem 2.6 is given by:

\[
\begin{aligned}
\max_{\theta} \min_{h} \; & V_t + V_W \left( W r + \theta^T (\mu(S,t) - r S_t) \right) + V_S^T \mu(S,t) + \tfrac{1}{2} V_{WW}\, \theta^T \Sigma \theta + V_{WS} \Sigma \theta \\
& + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) + V_W\, \theta^T \sigma h + V_S^T \sigma h + \frac{\nu}{2} h^T h = 0
\end{aligned}
\]

where $\Sigma = \sigma\sigma^T$ and $V(W,S,t)$ is the value function of the investor, subject to the terminal condition $V(W,S,T) = \ln(W)$. The malevolent agent picks the worst case distortion drift process $h_t$ and the investor maximizes against the worst case scenario. After defining, for all $t \in [0,T]$, $F_t = \theta_t / W_t \in \mathbb{R}^N$, the HJB equation becomes:

\[
\begin{aligned}
\max_{F} \min_{h} \; & V_t + W V_W \left( r + F^T (\mu(S,t) - r S_t) \right) + V_S^T \mu(S,t) + \tfrac{1}{2} W^2 V_{WW}\, F^T \Sigma F + W V_{WS} \Sigma F \\
& + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) + W V_W\, F^T \sigma h + V_S^T \sigma h + \frac{\nu}{2} h^T h = 0
\end{aligned}
\]

The inner minimization problem is a convex quadratic problem. The first order conditions are:

\[
W V_W\, \sigma^T F + \sigma^T V_S + \nu h = 0
\quad \Longrightarrow \quad
h = -\frac{\sigma^T (W V_W F + V_S)}{\nu}
\]

The optimal value of the inner minimization problem is:

\[
g(F) = -\frac{W^2 V_W^2\, F^T \Sigma F + V_S^T \Sigma V_S + 2 W V_W\, F^T \Sigma V_S}{2\nu}
\]
Plugging this back into the HJB equation we have:

\[
\begin{aligned}
\max_{F} \; & V_t + W V_W \left( r + F^T (\mu(S,t) - r S_t) \right) + V_S^T \mu(S,t) + \tfrac{1}{2} W^2 V_{WW}\, F^T \Sigma F + W V_{WS} \Sigma F \\
& + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) - \frac{W^2 V_W^2\, F^T \Sigma F + V_S^T \Sigma V_S + 2 W V_W\, F^T \Sigma V_S}{2\nu} = 0
\end{aligned}
\]

Due to the logarithmic preferences of the investor, $V(W,S,t) = \ln W + H(S,t)$, and in that case $V_W = \frac{1}{W}$, $V_{WW} = -\frac{1}{W^2}$, $V_{WS} = 0$, $V_S(W,S,t) = H_S(S,t)$, and the minimizing drift distortion $h = -\frac{\sigma^T (F + H_S)}{\nu}$ is independent of the wealth. The HJB equation now becomes:

\[
\max_{F} \; V_t + r + F^T (\mu(S,t) - r S_t) + V_S^T \mu(S,t) - \tfrac{1}{2} F^T \Sigma F + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) - \frac{F^T \Sigma F + V_S^T \Sigma V_S + 2 F^T \Sigma V_S}{2\nu} = 0
\]

The optimal trading strategy is the solution to the following convex quadratic problem:

\[
\text{maximize} \quad F^T \left( \mu(S,t) - r S_t - \frac{\Sigma V_S}{\nu} \right) - \frac{1}{2} \left( 1 + \frac{1}{\nu} \right) F^T \Sigma F
\tag{2.9}
\]

The first order conditions are:

\[
\begin{aligned}
\mu(S,t) - r S_t - \frac{\Sigma V_S(S_t,t)}{\nu} &= \left( 1 + \frac{1}{\nu} \right) \Sigma F^{opt}_t \\
F^{opt}_t &= \frac{1}{1 + \frac{1}{\nu}}\, \Sigma^{-1} \left( \mu(S,t) - r S_t - \frac{\Sigma V_S(S_t,t)}{\nu} \right) \\
F^{opt}_t &= \frac{\nu}{\nu + 1}\, \Sigma^{-1} (\mu(S,t) - r S_t) - \frac{V_S(S_t,t)}{\nu + 1}
\end{aligned}
\]

We clearly see that as $\nu \to \infty$ the optimal trading strategy converges to the one where we have no fear of model misspecification. This is to be expected, since in this case problems 2.1 and 2.6 are equivalent.

It is interesting to find the conditions under which these weights are equal to the weights when there is no fear of model misspecification. When there is a fear of model misspecification, the optimal weights are a convex
combination of $\Sigma^{-1} (\mu(S,t) - r S_t)$, i.e. the weights without model misspecification, and $-V_S$. Therefore these weights are equal to the weights when there is no fear of model misspecification exactly when $V_S + F^{opt} = 0$, which is equivalent to $h_{min} = 0$. This is expected, since in that case there would be no distortion drift and the HJB equation would be the same as in the benchmark case of no model misspecification.

After plugging the optimal trading strategy into the HJB equation, it becomes:

\[
\begin{aligned}
& V_t + r + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) + \frac{1}{2} \frac{1}{1 + \frac{1}{\nu}} (\mu(S,t) - r S_t)^T \Sigma^{-1} (\mu(S,t) - r S_t) \\
& \quad + V_S^T \left( \mu(S,t) - \frac{\mu(S,t) - r S_t}{\nu + 1} \right) - \frac{1}{2} \frac{1}{\nu + 1} V_S^T \Sigma V_S = 0
\end{aligned}
\]

We can plug the optimal trading strategy into $h = -\frac{\sigma^T (W V_W F + V_S)}{\nu}$ to find that:

\[
\begin{aligned}
h_{min} &= -\frac{\sigma^T F^{opt} + \sigma^T V_S}{\nu} \\
h_{min} &= -\frac{\sigma^T \frac{1}{1 + \frac{1}{\nu}} \Sigma^{-1} \left( \mu(S,t) - r S_t - \frac{\Sigma V_S}{\nu} \right) + \sigma^T V_S}{\nu} \\
h_{min} &= -\frac{\sigma^T \left( \Sigma^{-1} (\mu(S,t) - r S_t) + V_S \right)}{\nu + 1}
\end{aligned}
\]

We consider two cases:

• Convergence trades. The optimal trading strategy is given by:

\[
\theta^{opt}_t = -\left( \frac{\nu}{\nu + 1} \Sigma^{-1} A_t S_t + \frac{V_S}{\nu + 1} \right) W_t
\tag{2.10}
\]

where $A_t = \mathrm{diag}\left( \frac{a_1}{T-t} + r, \cdots, \frac{a_N}{T-t} + r \right)$. The optimal trading strategy is a convex combination of the strategy without fear of model misspecification and $-V_S$, with weights $\frac{\nu}{\nu+1}$ and $\frac{1}{\nu+1}$. From the symmetry of the problem we have $H(S,t) = H(-S,t)$, from which we get $V_S(S_t,t) = -V_S(-S_t,t)$.

• Mean reversion trades. The optimal trading strategy is given by:

\[
\theta^{opt}_t = -\left( \frac{\nu}{\nu + 1} \Sigma^{-1} \left( \Phi (S_t - \bar{S}) + r S_t \right) + \frac{V_S}{\nu + 1} \right) W_t
\tag{2.11}
\]
The optimal trading strategy is again a convex combination of the strategy without fear of model misspecification and $-V_S$, with weights $\frac{\nu}{\nu+1}$ and $\frac{1}{\nu+1}$. For the special case where $\bar{S} = 0$ we have:

\[
\theta^{opt}_t = -\left( \frac{\nu}{\nu + 1} \Sigma^{-1} (\Phi + r I) S_t + \frac{V_S}{\nu + 1} \right) W_t
\tag{2.12}
\]

2.3.3 Fear of model misspecification with VaR and margin constraints

In this section we assume that the investor is not confident about the dynamics of his model and that he faces either VaR or margin constraints. The Hamilton-Jacobi-Bellman equation for problem 2.6 is given by:

\[
\begin{aligned}
\max_{\theta \in \Theta} \min_{h} \; & V_t + V_W \left( W r + \theta^T (\mu(S,t) - r S_t) \right) + V_S^T \mu(S,t) + \tfrac{1}{2} V_{WW}\, \theta^T \Sigma \theta + V_{WS} \Sigma \theta \\
& + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) + V_W\, \theta^T \sigma h + V_S^T \sigma h + \frac{\nu}{2} h^T h = 0
\end{aligned}
\]

where $\Sigma = \sigma\sigma^T$ and $V(W,S,t)$ is the value function of the investor, subject to the terminal condition $V(W,S,T) = \ln(W)$. As previously, $\Theta$ is the set of admissible trading strategies:

\[ \Theta = \{ \theta \mid \theta^T \Sigma \theta \le L W^2 \} \]

for the case of VaR constraints and

\[ \Theta = \Big\{ \theta \;\Big|\; \sum_{i=1}^{N} \lambda_i |\theta_i| \le W \Big\} \]

for the case of the margin constraints.

We can proceed as in the previous case where we had no constraints and we will get
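the following HJB equation:

\[
\max_{F \in \mathcal{F}} \; V_t + \left( r + F^T (\mu(S,t) - r S_t) \right) + V_S^T \mu(S,t) - \tfrac{1}{2} F^T \Sigma F + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) - \frac{F^T \Sigma F + V_S^T \Sigma V_S + 2 F^T \Sigma V_S}{2\nu} = 0
\]

where $\mathcal{F}$ is the set of admissible trading strategies:

\[ \mathcal{F} = \{ F \mid F^T \Sigma F \le L \} \]

for the case of VaR constraints and

\[ \mathcal{F} = \Big\{ F \;\Big|\; \sum_{i=1}^{N} \lambda_i |F_i| \le 1 \Big\} \]

for the case of the margin constraints. The optimal trading strategy is the solution to the following convex problem:

\[
\begin{aligned}
\text{maximize} \quad & F^T \left( \mu(S,t) - r S_t - \frac{\Sigma V_S}{\nu} \right) - \frac{1}{2} \left( 1 + \frac{1}{\nu} \right) F^T \Sigma F \\
\text{subject to} \quad & F \in \mathcal{F}
\end{aligned}
\tag{2.13}
\]

We clearly see again that as $\nu \to \infty$ the optimal trading strategy converges to the one where we have no fear of model misspecification, as expected. Using a proof similar to the one in Chapter 1, we can show (see Appendix) that when the investor faces VaR constraints the optimal portfolio is given by:

\[
F^{opt}_t =
\begin{cases}
\dfrac{1}{1 + \frac{1}{\nu}}\, \Sigma^{-1} \mu_t & \text{if } \mu_t^T \Sigma^{-1} \mu_t \le L \left( 1 + \frac{1}{\nu} \right)^2 \\[2ex]
\dfrac{\Sigma^{-1} \mu_t}{\sqrt{\dfrac{\mu_t^T \Sigma^{-1} \mu_t}{L}}} & \text{if } \mu_t^T \Sigma^{-1} \mu_t \ge L \left( 1 + \frac{1}{\nu} \right)^2
\end{cases}
\]

where $\mu_t = \mu(S,t) - r S_t - \frac{\Sigma V_S(S_t,t)}{\nu}$.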
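The closed form for the VaR-constrained portfolio can be evaluated directly once the vector $\mu_t$ is known. The following is a minimal sketch for N = 2, assuming that $\mu_t$ (which already contains the $-\Sigma V_S/\nu$ correction) is supplied by the caller. The function names `solve_2x2` and `var_constrained_weights` are hypothetical and are not part of the original text.

```python
import math

def solve_2x2(Sigma, b):
    """Return Sigma^{-1} b for a 2x2 matrix, using the explicit inverse."""
    (s11, s12), (s21, s22) = Sigma
    det = s11 * s22 - s12 * s21
    return [(s22 * b[0] - s12 * b[1]) / det,
            (-s21 * b[0] + s11 * b[1]) / det]

def var_constrained_weights(mu_t, Sigma, L, nu):
    """Optimal weights under the VaR constraint F^T Sigma F <= L.

    mu_t is assumed to be mu(S,t) - r*S_t - Sigma*V_S/nu, computed elsewhere.
    """
    x = solve_2x2(Sigma, mu_t)                 # Sigma^{-1} mu_t
    q = mu_t[0] * x[0] + mu_t[1] * x[1]        # mu_t^T Sigma^{-1} mu_t
    # Interior solution divides by (1 + 1/nu); when the constraint binds,
    # the divisor sqrt(q / L) is larger and rescales F onto the boundary.
    scale = max(1.0 + 1.0 / nu, math.sqrt(q / L))
    return [xi / scale for xi in x]

# Binding constraint: q = 25 exceeds L * (1 + 1/nu)^2 = 4
print(var_constrained_weights([3.0, 4.0], [[1.0, 0.0], [0.0, 1.0]], 1.0, 1.0))
# Slack constraint: the weights are Sigma^{-1} mu_t / (1 + 1/nu)
print(var_constrained_weights([3.0, 4.0], [[1.0, 0.0], [0.0, 1.0]], 100.0, 1.0))
```

Taking the maximum of the two divisors selects the correct branch of the case distinction without an explicit `if`.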
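The VaR-constrained optimal portfolio can also be written equivalently as:

\[
F^{opt}_t = \frac{1}{1 + \frac{1}{\nu} + \lambda}\, \Sigma^{-1} \mu_t
\]

where $1 + \frac{1}{\nu} + \lambda = \max\left( 1 + \frac{1}{\nu},\; \sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}} \right)$.

We consider two cases:

• Convergence trades. The optimal trading strategy is the solution to the following convex problem:

\[
\begin{aligned}
\text{minimize} \quad & F^T \left( A_t S_t + \frac{\Sigma V_S}{\nu} \right) + \frac{1}{2} \left( 1 + \frac{1}{\nu} \right) F^T \Sigma F \\
\text{subject to} \quad & F \in \mathcal{F}
\end{aligned}
\tag{2.14}
\]

where $A_t = \mathrm{diag}\left( \frac{a_1}{T-t} + r, \cdots, \frac{a_N}{T-t} + r \right)$.

• Mean reversion trades. The optimal trading strategy is the solution to the following convex problem:

\[
\begin{aligned}
\text{minimize} \quad & F^T \left( \Phi (S_t - \bar{S}) + r S_t + \frac{\Sigma V_S}{\nu} \right) + \frac{1}{2} \left( 1 + \frac{1}{\nu} \right) F^T \Sigma F \\
\text{subject to} \quad & F \in \mathcal{F}
\end{aligned}
\tag{2.15}
\]

For the special case where $\bar{S} = 0$, we have the following problem:

\[
\begin{aligned}
\text{minimize} \quad & F^T \left( (\Phi + r I) S_t + \frac{\Sigma V_S}{\nu} \right) + \frac{1}{2} \left( 1 + \frac{1}{\nu} \right) F^T \Sigma F \\
\text{subject to} \quad & F \in \mathcal{F}
\end{aligned}
\tag{2.16}
\]

2.4 Results

We will investigate how the optimal trading strategy changes as a result of mistrust of the model dynamics. We will first study the case where we have no constraints and then the case where we have VaR constraints.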
2.4.1 Convergence trades without constraints

We consider the case where we have N = 1 arbitrage opportunity and there are no constraints. We will study the case N = 2 when we have constraints. Due to the symmetry of S around 0, it suffices to study only what happens when $S \ge 0$: the symmetry implies that the value function is an even function of the spread S and that its partial derivative with respect to the spread is an odd function of S for each t. The optimal weight in the arbitrage opportunity is given by:

\[
F^{opt}_t = -\frac{1}{\sigma^2 \left( 1 + \frac{1}{\nu} \right)} \left( \left( \frac{a}{T-t} + r \right) S_t + \frac{\sigma^2 V_S(S_t,t)}{\nu} \right)
\tag{2.17}
\]

and the minimizing distortion drift is given by $h_{min} = -\frac{\sigma (F^{opt} + V_S)}{\nu}$. Compared to the case where there is no fear of model misspecification, the variance is now multiplied by a factor of $\left( 1 + \frac{1}{\nu} \right)$ and the term $\frac{\sigma^2 V_S(S_t,t)}{\nu}$ is added to the drift.

When S is positive, one would think that there are three cases to consider (here $A_t = \frac{a}{T-t} + r$):

• If $V_S > 0$ then $F < 0$. In this case there is a tradeoff between the two terms in $h_{min}$. The first term, $-\frac{\sigma F^{opt}}{\nu}$, corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, while the second term, $-\frac{\sigma V_S}{\nu}$, corresponds to a negative distortion drift that points to worse investment opportunities.

• If $-\frac{A_t S_t \nu}{\sigma^2} < V_S < 0$ then $F < 0$ and both terms in $h_{min}$ correspond to positive distortion drifts, with the first one reducing the wealth of the investor and the second one pointing to worse investment opportunities.

• If $V_S < -\frac{A_t S_t \nu}{\sigma^2}$ then $F > 0$. There is again a tradeoff between the two terms in $h_{min}$. Now the first term corresponds to a negative distortion drift that reduces the wealth of the investor, since in this case the investor is long the spread, and the second term corresponds to a positive distortion drift that points to worse investment opportunities.
A little more thought, though, excludes the last two cases: we expect the value function V to be a non-decreasing function of S for nonnegative values of the spread, since higher values of the spread S correspond to better investment opportunities. This leads to non-negative values of $V_S$ for $S \ge 0$. The HJB equation is given by:

\[
V_t + r + \tfrac{1}{2} \sigma^2 V_{SS} + \frac{1}{2} \frac{1}{1 + \frac{1}{\nu}} \left( \frac{a}{T-t} + r \right)^2 \frac{S^2}{\sigma^2} + V_S \left( -\frac{a}{T-t} S + \frac{\left( \frac{a}{T-t} + r \right) S}{\nu + 1} \right) - \frac{1}{2} \frac{1}{\nu + 1} V_S^2 \sigma^2 = 0
\]

We solve the HJB equation numerically using the method of finite differences [75]. In the following figures we have assumed that $r_f = 0$, $\sigma = 1$, $a = 0.01$ and $T = 1$. We observe the following:

• $V_S$ becomes larger and larger as $t \to T$ for each value of $\nu$, as we see in Figure 2-1, until some point close to the horizon where it starts going down. In addition, $V_S$ is higher for higher values of $\nu$.

• For very low values of $\nu$ the drift distortion $h_{min}$ is positive and becomes larger as $t \to T$, as we see in Figure 2-2. For higher values of $\nu$ the drift distortion starts negative and after some point increases to positive values as $t \to T$. As we showed in the previous section, when $h_{min} = 0$ the optimal weight is equal to the optimal weight in the case where there is no fear of model misspecification. We can see this in Figure 2-4, where very close to the time at which $h_{min}$ crosses 0, the optimal weight curve crosses the one for $\nu = 100$.

• Figure 2-3 shows the two terms of the distortion drift for $\nu = 1$. The first term corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, while the second term corresponds to a negative distortion drift that points to worse investment opportunities. In this tradeoff the first term is dominated at the beginning, which makes the drift distortion negative, but as $t \to T$ it increases at a fast rate, making the drift distortion positive.
• The fact that $V_S$ becomes larger as $t \to T$ for each value of $\nu$ until a point close to the horizon, in combination with the fact that the drift term $\frac{a}{T-t}$ increases hyperbolically, leads to F becoming larger (in absolute value) as $t \to T$ (Figure 2-4). The risk averse investor becomes more aggressive as the time to T becomes smaller, despite the fact that as $t \to T$ the malevolent agent picks a more adverse distortion drift (Figure 2-2). This is because the improvement in the investment opportunities is so substantial that it dominates the increase in the distortion drift.

• For very low values of $\nu$ the investor is more conservative than in the case without model misspecification for all t. For higher values of $\nu$ the investor is more aggressive at the beginning and becomes more conservative as $t \to T$, compared to the case without model misspecification (Figure 2-4). This is because at the beginning the drift term $\frac{a}{T-t}$ is very low compared to $V_S$, making the total drift term lower for large values of $\nu$ than for smaller values of $\nu$, which leads to a lower magnitude of the weight. This situation changes as the time to the horizon T gets smaller.

• As $\nu \to \infty$ the optimal weight in the strategy converges to the optimal weight when there is no fear of model misspecification, as we have argued before.
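The decomposition of the distortion drift used in these observations follows directly from equation 2.17. The sketch below evaluates the optimal weight, the two terms of $h_{min}$ and the benchmark weight for a given value of $V_S$. It assumes that $V_S$ is supplied by the caller (in the text it comes from the finite-difference solution); the numerical inputs in the usage lines are illustrative values, not results from the figures, and the function name is hypothetical.

```python
def convergence_weight_and_distortion(S, t, V_S, nu, a=0.01, r=0.0, sigma=1.0, T=1.0):
    """Single convergence trade: optimal weight (eq. 2.17) and distortion drift.

    V_S is an input here; in the text it is obtained by solving the HJB
    equation numerically.
    """
    A = a / (T - t) + r                                    # a/(T-t) + r
    F = -(A * S + sigma**2 * V_S / nu) / (sigma**2 * (1.0 + 1.0 / nu))
    h1 = -sigma * F / nu        # term that reduces the investor's wealth
    h2 = -sigma * V_S / nu      # term that worsens investment opportunities
    F_benchmark = -A * S / sigma**2   # weight with no fear of misspecification
    return F, h1, h2, h1 + h2, F_benchmark

# Illustrative input with V_S = 0.01: the two terms cancel, h_min = 0,
# and the robust weight equals the benchmark weight.
print(convergence_weight_and_distortion(S=1.0, t=0.0, V_S=0.01, nu=1.0))
# Illustrative input with a larger V_S: h_min is negative and the
# investor is more aggressive than the benchmark.
print(convergence_weight_and_distortion(S=1.0, t=0.0, V_S=0.1, nu=1.0))
```

The first call illustrates the statement that the robust and benchmark weights coincide exactly when $h_{min} = 0$, i.e. when $F^{opt} + V_S = 0$.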
[Plot: $V_S$ versus time at S = 1, one curve for each of $\nu$ = 0.01, 0.1, 1, 10, 100.]

Figure 2-1: Partial derivative of the value function with respect to S for a single convergence trade. $V_S$ as a function of time at S = 1 for different values of the robustness multiplier for a single convergence trade.

[Plot: $h_{min}$ versus time at S = 1, one curve for each of $\nu$ = 0.01, 0.1, 1, 10, 100.]

Figure 2-2: Distortion drift for a single convergence trade. Distortion drift as a function of time at S = 1 for different values of the robustness multiplier for a single convergence trade.
[Plot: drift distortion h and its two terms h1 and h2 versus time at S = 1.]

Figure 2-3: Distortion drift terms for a single convergence trade. Distortion drift terms as a function of time at S = 1 for $\nu = 1$ for a single convergence trade. The first term corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, while the second term corresponds to a negative distortion drift that points to worse investment opportunities.

[Plot: optimal weights versus time at S = 1, one curve for each of $\nu$ = 0.01, 0.1, 1, 10, 100.]

Figure 2-4: Optimal weight of a single convergence trade. Weight of the convergence trading strategy as a function of time at S = 1 for different values of the robustness multiplier.
2.4.2 Mean reversion trading strategies without constraints

We consider first the case where we have N = 1 mean reversion trading strategy and $\bar{S} = 0$, i.e.

\[ dS_t = -\phi S_t\,dt + \sigma\,dZ_t \]

Due to the symmetry of S around 0, it suffices to study only what happens when $S \ge 0$: the symmetry implies that the value function is an even function of the spread S and that its partial derivative with respect to the spread is an odd function of S for each t. The optimal weight in the trading strategy is given by:

\[
F^{opt}_t = -\frac{1}{\sigma^2 \left( 1 + \frac{1}{\nu} \right)} \left( (\phi + r) S_t + \frac{\sigma^2 V_S(S_t,t)}{\nu} \right) = -\frac{\nu}{\sigma^2 (\nu + 1)} (\phi + r) S_t - \frac{V_S(S_t,t)}{\nu + 1}
\]

and the minimizing distortion drift is given by:

\[
h_{min} = -\frac{\sigma (F^{opt} + V_S)}{\nu} = \frac{(\phi + r) S_t}{\sigma (\nu + 1)} - \frac{\sigma V_S}{\nu + 1}
\]

Compared to the case where there is no fear of model misspecification, when the investor does not trust the model dynamics the variance is multiplied by a factor of $\left( 1 + \frac{1}{\nu} \right)$ and the term $\frac{\sigma^2 V_S(S_t,t)}{\nu}$ is added to the drift. If $V_S$ is non-negative and decreases as a function of time t, then the increase in the drift gets smaller and smaller and the investor gets more and more conservative as time passes. In this case the distortion drift also gets larger and larger as time passes, which explains why the investor gets more and more conservative.

When S is positive, we have $V_S \ge 0$, since higher values of S correspond to better investment opportunities. If $V_S \ge 0$ then $F^{opt} \le 0$ for $S \ge 0$. Therefore we see that there is a tradeoff between the two terms in $h_{min}$. The first term, $-\frac{\sigma F^{opt}}{\nu}$, corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is
shorting the spread, while the second term, $-\frac{\sigma V_S}{\nu}$, corresponds to a negative distortion drift that points to worse investment opportunities. The HJB equation is given by:

\[
V_t + r + \tfrac{1}{2} \sigma^2 V_{SS} + \frac{1}{2} \frac{1}{1 + \frac{1}{\nu}} (\phi + r)^2 \frac{S^2}{\sigma^2} + V_S \left( -\phi S + \frac{(\phi + r) S}{\nu + 1} \right) - \frac{1}{2} \frac{1}{\nu + 1} V_S^2 \sigma^2 = 0
\]

We solve the HJB equation numerically using the method of finite differences. In the following figures we have assumed that $r_f = 0$, $\sigma = 1$, $\phi = 1$ and $T = 1$. We observe the following:

• $V_S$ becomes smaller and smaller as $t \to T$ for each value of $\nu$, as we see in Figure 2-5. This makes sense, since as $t \to T$ there is less time to take advantage of the mean reversion trading strategy. In addition, $V_S$ is higher for higher values of $\nu$.

• The drift distortion $h_{min}$ is positive and becomes larger and larger as $t \to T$ for each value of the robustness multiplier, as we see in Figure 2-6. Figure 2-7 shows the two terms of the distortion drift for $\nu = 1$. The first term corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, while the second term corresponds to a negative distortion drift that points to worse investment opportunities. In this tradeoff the first term wins, making the drift distortion positive.

• The fact that $V_S$ becomes smaller and smaller as $t \to T$ for each value of $\nu$ leads to F becoming smaller and smaller (in absolute value) as $t \to T$ (Figure 2-8). In other words, the risk averse investor becomes more conservative as the time to T becomes smaller. This is to be expected, since as $t \to T$ the malevolent agent picks a more adverse distortion drift (Figure 2-6), causing the investor to be more cautious.

• The lower the value of $\nu$, the more conservative the investor is, as we see in Figure 2-8. This is because $\nu$ is the robustness multiplier and lower values of it put
less penalty on the distorting alternative distribution, leading to higher positive drift distortions (Figure 2-6).

• As $\nu \to \infty$ the optimal weight in the strategy converges to the optimal weight when there is no fear of model misspecification, as we have argued before.

[Plot: $V_S$ versus time at S = 1, one curve for each of $\nu$ = 1, 10, 100.]

Figure 2-5: Partial derivative of the value function with respect to S for a single mean reversion trading strategy. $V_S$ as a function of time at S = 1 for different values of the robustness multiplier.
[Plot: $h_{min}$ versus time at S = 1, one curve for each of $\nu$ = 1, 10, 100.]

Figure 2-6: Distortion drift for a single mean reversion trading strategy. Distortion drift as a function of time at S = 1 for different values of the robustness multiplier.

[Plot: drift distortion h and its two terms h1 and h2 versus time at S = 1.]

Figure 2-7: Distortion drift terms for a single mean reversion trading strategy. Distortion drift terms as a function of time at S = 1 for $\nu = 1$. The first term corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, while the second term corresponds to a negative distortion drift that points to worse investment opportunities.
[Plot: optimal weights versus time at S = 1, one curve for each of $\nu$ = 1, 10, 100.]

Figure 2-8: Optimal weight of a single mean reversion trading strategy. Weight of the mean reversion trading strategy as a function of time at S = 1 for different values of the robustness multiplier.
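For the single mean reversion strategy with $\bar{S} = 0$, the qualitative behaviour described above can be cross-checked without a PDE solver. Substituting the quadratic guess $H(S,t) = A(\tau) S^2 + B(\tau)$ with $\tau = T - t$ into the HJB equation of this subsection gives the scalar equation $A' = \frac{1}{2} \frac{\nu}{\nu+1} \frac{(\phi+r)^2}{\sigma^2} + 2A\left(-\phi + \frac{\phi+r}{\nu+1}\right) - \frac{2\sigma^2 A^2}{\nu+1}$ with $A(0) = 0$, so that $V_S = 2 A S$. The sketch below integrates this equation with a classical Runge-Kutta scheme. The quadratic guess and the function names are assumptions of this sketch and are not the finite-difference method used for the figures.

```python
def riccati_A(tau, nu, phi=1.0, r=0.0, sigma=1.0, steps=2000):
    """Integrate the ODE for A from 0 to tau with classical RK4, A(0) = 0."""
    def rhs(A):
        return (0.5 * nu / (nu + 1.0) * (phi + r) ** 2 / sigma ** 2
                + 2.0 * A * (-phi + (phi + r) / (nu + 1.0))
                - 2.0 * sigma ** 2 * A ** 2 / (nu + 1.0))
    A, d = 0.0, tau / steps
    for _ in range(steps):
        k1 = rhs(A)
        k2 = rhs(A + 0.5 * d * k1)
        k3 = rhs(A + 0.5 * d * k2)
        k4 = rhs(A + d * k3)
        A += d * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return A

def mean_reversion_policy(S, t, nu, phi=1.0, r=0.0, sigma=1.0, T=1.0):
    """Return (V_S, optimal weight F, distortion drift h_min) at (S, t)."""
    V_S = 2.0 * riccati_A(T - t, nu, phi, r, sigma) * S
    F = -nu / (sigma ** 2 * (nu + 1.0)) * (phi + r) * S - V_S / (nu + 1.0)
    h = (phi + r) * S / (sigma * (nu + 1.0)) - sigma * V_S / (nu + 1.0)
    return V_S, F, h

# Parameters of the text: r = 0, sigma = 1, phi = 1, T = 1, evaluated at S = 1
for nu in (1.0, 10.0, 100.0):
    for t in (0.0, 0.5, 1.0):
        print(nu, t, mean_reversion_policy(1.0, t, nu))
```

Under this guess, $V_S$ shrinks to zero as $t \to T$, is larger for larger $\nu$, and the distortion drift is positive and increasing in t, in line with the observations listed above.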
Let us now consider the case where we have N = 2 mean reversion trading strategies and again $\bar{S} = 0$. We solve the HJB equation numerically using the method of finite differences. We have assumed that $r_f = 0$, $T = 1$,

\[
\Phi = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}
\quad \text{and} \quad
\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}
\]

In Figure 2-9 we plot the weights of the mean reversion trading strategies as a function of time at $S_1 = 1$ and $S_2 = 2$ for different values of the robustness multiplier. We have chosen $S_1 = 1$ and $S_2 = 2$ because for these values the drift is the same for both strategies. We have assumed that there is no correlation between the two trading strategies. We observe the following:

• First of all, when there is no fear of model misspecification the weights of the two strategies are the same and they do not change over time.

• When there is a fear of model misspecification the investor becomes more and more conservative over time, just as in the N = 1 case. It is interesting to note that the weight is higher for the first trading strategy, where the $\phi$ coefficient is higher. That makes sense, since ceteris paribus we would expect $V_S$ to be higher for the strategy with the stronger rate of mean reversion ($\phi$ coefficient). This is shown in Figure 2-11.

• Figure 2-10 shows the ratio of the weights of the two trading strategies. We observe that this ratio is higher for smaller values of the robustness multiplier and that it decreases to 1 as $t \to T$.

In Figure 2-12 we have assumed that there is a correlation $\rho = 0.5$ between the two trading strategies. Now the weights are smaller than before due to the positive correlation, but they have the same properties as before. In the case when there is a
negative correlation (Figure 2-13) the weights are larger, since now the opportunities hedge each other; otherwise the properties remain the same.

[Plot: optimal weights of Spread 1 and Spread 2 versus time at $S_1 = 1$ and $S_2 = 2$, for $\nu$ = 1, 10, 100.]

Figure 2-9: Optimal weights of two uncorrelated mean reversion trading strategies for $S_1 = 1$ and $S_2 = 2$. Weights of the mean reversion trading strategies as a function of time at $S_1 = 1$ and $S_2 = 2$ for different values of the robustness multiplier. The correlation coefficient is $\rho = 0$.

[Plot: ratio of optimal weights versus time at $S_1 = 1$ and $S_2 = 2$, for $\nu$ = 1, 10, 100.]

Figure 2-10: Ratio of the optimal weights. Ratio of the optimal weights of the mean reversion trading strategies as a function of time at $S_1 = 1$ and $S_2 = 2$ for different values of the robustness multiplier. The correlation coefficient is $\rho = 0$.
[Plot: $V_{S_1}$ and $V_{S_2}$ versus time at $S_1 = 1$ and $S_2 = 2$, for $\nu$ = 1, 10, 100.]

Figure 2-11: Partial derivative of the value function with respect to $S_1$ and $S_2$ at $S_1 = 1$ and $S_2 = 2$ when $\rho = 0$. Partial derivative of the value function with respect to $S_1$ and $S_2$ as a function of time at $S_1 = 1$ and $S_2 = 2$ for different values of the robustness multiplier. The correlation coefficient is $\rho = 0$.

[Plot: optimal weights of Spread 1 and Spread 2 versus time at $S_1 = 1$ and $S_2 = 2$, for $\nu$ = 1, 10, 100.]

Figure 2-12: Optimal weights of two positively correlated mean reversion trading strategies for $S_1 = 1$ and $S_2 = 2$. Weights of the mean reversion trading strategies as a function of time at $S_1 = 1$ and $S_2 = 2$ for different values of the robustness multiplier. The correlation coefficient is $\rho = 0.5$.
[Plot: optimal weights of Spread 1 and Spread 2 versus time at $S_1 = 1$ and $S_2 = 2$, for $\nu$ = 1, 10, 100.]

Figure 2-13: Optimal weights of two negatively correlated mean reversion trading strategies for $S_1 = 1$ and $S_2 = 2$. Weights of the mean reversion trading strategies as a function of time at $S_1 = 1$ and $S_2 = 2$ for different values of the robustness multiplier. The correlation coefficient is $\rho = -0.5$.
So far we have examined the two trading strategies at values where the drifts are the same. We now choose $S_1 = 1$ and $S_2 = 1$, since for these values the drifts differ for the two strategies. Figure 2-14 shows the weights of the strategies over time for different values of the robustness multiplier when there is no correlation between the two trading strategies. We now observe the following:

• First of all, when there is no fear of model misspecification the ratio of the weights of the two strategies is the same as the ratio of their drifts normalized by their variances, and it does not change over time.

• When there is a fear of model misspecification the investor becomes more and more conservative over time, just as in the N = 1 case, as we see in Figure 2-14.

• $V_{S_1}$ is higher than $V_{S_2}$. That makes sense, since ceteris paribus we would expect $V_S$ to be higher for the strategy with the stronger rate of mean reversion ($\phi$ coefficient). This is shown in Figure 2-16.

• Figure 2-15 shows the ratio of the weights of the two trading strategies. We observe that this ratio is higher for smaller values of the robustness multiplier and that, after initially increasing, it finally converges to 1 as $t \to T$. The reason that the ratio is initially less than the one for the case where there is no fear of model misspecification is that initially the ratio $\frac{V_{S_1}}{V_{S_2}}$ is less than the ratio of the drifts.

A very interesting case arises when there is a high positive correlation, such as $\rho = 0.9$, between the two trading strategies. In this case the investor uses the second trading strategy, with the lower drift, as a hedge for the first strategy, as shown in Figure 2-17, where the investor is shorting the first strategy while he is long the second strategy. Now $V_{S_2}$ is negative (Figure 2-18), which is to be expected, since the investor is long the asset and higher values of $S_2$ lead to worse investment opportunities for a long investor.

Figure 2-19 shows the ratio of the magnitudes of the optimal weights as a function of time.
Finally, Figure 2-20 shows the weights when there is negative correlation $\rho = -0.8$. The results are similar to the case of no correlation, although now the weights are higher due to the negative correlation, which makes the two strategies good hedges.

[Plot: optimal weights of Spread 1 and Spread 2 versus time at $S_1 = 1$ and $S_2 = 1$, for $\nu$ = 1, 10, 100.]

Figure 2-14: Optimal weights of two uncorrelated mean reversion trading strategies for $S_1 = 1$ and $S_2 = 1$. Weights of the mean reversion trading strategies as a function of time at $S_1 = 1$ and $S_2 = 1$ for different values of the robustness multiplier. The correlation coefficient is $\rho = 0$.

[Plot: ratio of optimal weights versus time at $S_1 = 1$ and $S_2 = 1$, for $\nu$ = 1, 10, 100.]

Figure 2-15: Ratio of the optimal weights at $S_1 = 1$ and $S_2 = 1$ when $\rho = 0$. Ratio of the optimal weights of the mean reversion trading strategies as a function of time at $S_1 = 1$ and $S_2 = 1$ for different values of the robustness multiplier. The correlation coefficient is $\rho = 0$.
[Plot: $V_{S_1}$ and $V_{S_2}$ versus time at $S_1 = 1$ and $S_2 = 1$, for $\nu$ = 1, 10, 100.]

Figure 2-16: Partial derivative of the value function with respect to $S_1$ and $S_2$ at $S_1 = 1$ and $S_2 = 1$ when $\rho = 0$. Partial derivative of the value function with respect to $S_1$ and $S_2$ as a function of time at $S_1 = 1$ and $S_2 = 1$ for different values of the robustness multiplier. The correlation coefficient is $\rho = 0$.

[Plot: optimal weights of Spread 1 and Spread 2 versus time at $S_1 = 1$ and $S_2 = 1$, for $\nu$ = 0.1, 1, 10.]

Figure 2-17: Optimal weights of two positively correlated mean reversion trading strategies for $S_1 = 1$ and $S_2 = 1$. Weights of the mean reversion trading strategies as a function of time at $S_1 = 1$ and $S_2 = 1$ for different values of the robustness multiplier. The correlation coefficient is $\rho = 0.9$.
[Plot: $V_{S_1}$ and $V_{S_2}$ versus time at $S_1 = 1$ and $S_2 = 1$, for $\nu$ = 0.1, 1, 10.]

Figure 2-18: Partial derivative of the value function with respect to $S_1$ and $S_2$ at $S_1 = 1$ and $S_2 = 1$ when $\rho = 0.9$. Partial derivative of the value function with respect to $S_1$ and $S_2$ as a function of time at $S_1 = 1$ and $S_2 = 1$ for different values of the robustness multiplier. The correlation coefficient is $\rho = 0.9$.

[Plot: ratio of optimal weights versus time at $S_1 = 1$ and $S_2 = 1$, for $\nu$ = 1, 10, 100.]

Figure 2-19: Ratio of the optimal weights at $S_1 = 1$ and $S_2 = 1$ when $\rho = 0.9$. Ratio of the magnitude of the optimal weights of the mean reversion trading strategies as a function of time at $S_1 = 1$ and $S_2 = 1$ for different values of the robustness multiplier. The correlation coefficient is $\rho = 0.9$.
[Plot: optimal weights of Spread 1 and Spread 2 versus time at $S_1 = 1$ and $S_2 = 1$, for $\nu$ = 1, 10, 100.]

Figure 2-20: Optimal weights of two negatively correlated mean reversion trading strategies for $S_1 = 1$ and $S_2 = 1$. Weights of the mean reversion trading strategies as a function of time at $S_1 = 1$ and $S_2 = 1$ for different values of the robustness multiplier. The correlation coefficient is $\rho = -0.8$.
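The benchmark weights that the robust weights approach as $\nu \to \infty$ follow from equation 2.12 with the $V_S$ term dropped, i.e. $F = -\Sigma^{-1} (\Phi + r I) S$. The sketch below evaluates them for the two-strategy setting of this subsection, assuming unit variances and correlation $\rho$ as in the text. The function name `benchmark_weights` is hypothetical.

```python
def benchmark_weights(S, rho, Phi=((2.0, 0.0), (0.0, 1.0)), r=0.0):
    """Weights F = -Sigma^{-1} (Phi + r I) S for Sigma = [[1, rho], [rho, 1]].

    This is the no-misspecification limit (nu -> infinity) only; the robust
    weights additionally need V_S from the numerical HJB solution.
    """
    # Drift vector (Phi + r I) S
    d1 = (Phi[0][0] + r) * S[0] + Phi[0][1] * S[1]
    d2 = Phi[1][0] * S[0] + (Phi[1][1] + r) * S[1]
    # Explicit inverse: Sigma^{-1} = [[1, -rho], [-rho, 1]] / (1 - rho^2)
    det = 1.0 - rho ** 2
    return (-(d1 - rho * d2) / det, -(-rho * d1 + d2) / det)

print(benchmark_weights((1.0, 2.0), 0.0))    # equal drifts, no correlation
print(benchmark_weights((1.0, 1.0), 0.0))    # different drifts, no correlation
print(benchmark_weights((1.0, 1.0), 0.9))    # high positive correlation
print(benchmark_weights((1.0, 1.0), -0.8))   # negative correlation
```

With $\rho = 0.9$ the first weight is negative and the second is positive, which is the hedging pattern described above for the highly correlated case.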
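2.4.3 Convergence trades with constraints

We consider first the case where we have N = 1 arbitrage opportunity and there are collateral constraints, i.e. $|F| \le L$. Due to the symmetry of the constraint and of the spread S around 0, it suffices to study only what happens when $S \ge 0$: the symmetry implies that the value function is an even function of the spread S and that its partial derivative with respect to the spread is an odd function of S for each t. The optimal weight in the arbitrage opportunity is given by:

\[
F^{opt}_t =
\begin{cases}
-\dfrac{1}{\sigma^2 \left( 1 + \frac{1}{\nu} \right)} \left( \left( \frac{a}{T-t} + r \right) S_t + \frac{\sigma^2 V_S(S_t,t)}{\nu} \right) & \text{if } \left| \left( \frac{a}{T-t} + r \right) S_t + \frac{\sigma^2 V_S(S_t,t)}{\nu} \right| \le L \sigma^2 \left( 1 + \frac{1}{\nu} \right) \\[2ex]
-L & \text{if } \left( \frac{a}{T-t} + r \right) S_t + \frac{\sigma^2 V_S(S_t,t)}{\nu} \ge L \sigma^2 \left( 1 + \frac{1}{\nu} \right) \\[2ex]
L & \text{if } \left( \frac{a}{T-t} + r \right) S_t + \frac{\sigma^2 V_S(S_t,t)}{\nu} \le -L \sigma^2 \left( 1 + \frac{1}{\nu} \right)
\end{cases}
\]

and the minimizing distortion drift is given by $h_{min} = -\frac{\sigma (F^{opt} + V_S)}{\nu}$. The HJB equation is given by the following three cases.

If $\left| \left( \frac{a}{T-t} + r \right) S_t + \frac{\sigma^2 V_S(S_t,t)}{\nu} \right| \le L \sigma^2 \left( 1 + \frac{1}{\nu} \right)$:

\[
V_t + r + \tfrac{1}{2} \sigma^2 V_{SS} + \frac{1}{2} \frac{1}{1 + \frac{1}{\nu}} \left( \frac{a}{T-t} + r \right)^2 \frac{S^2}{\sigma^2} + V_S \left( -\frac{a}{T-t} S + \frac{\left( \frac{a}{T-t} + r \right) S}{\nu + 1} \right) - \frac{1}{2} \frac{1}{\nu + 1} V_S^2 \sigma^2 = 0
\]

If $\left( \frac{a}{T-t} + r \right) S_t + \frac{\sigma^2 V_S(S_t,t)}{\nu} \ge L \sigma^2 \left( 1 + \frac{1}{\nu} \right)$:

\[
V_t + r + \tfrac{1}{2} \sigma^2 V_{SS} - \frac{\sigma^2}{2\nu} V_S^2 - V_S \left( \frac{a}{T-t} S - \frac{L \sigma^2}{\nu} \right) - \frac{1}{2} \left( 1 + \frac{1}{\nu} \right) \sigma^2 L^2 + L \left( \frac{a}{T-t} + r \right) S = 0
\]

If $\left( \frac{a}{T-t} + r \right) S_t + \frac{\sigma^2 V_S(S_t,t)}{\nu} \le -L \sigma^2 \left( 1 + \frac{1}{\nu} \right)$:

\[
V_t + r + \tfrac{1}{2} \sigma^2 V_{SS} - \frac{\sigma^2}{2\nu} V_S^2 - V_S \left( \frac{a}{T-t} S + \frac{L \sigma^2}{\nu} \right) - \frac{1}{2} \left( 1 + \frac{1}{\nu} \right) \sigma^2 L^2 - L \left( \frac{a}{T-t} + r \right) S = 0
\]

We solve the HJB equation numerically using the method of finite differences. In the following figures we have assumed that $r_f = 0$, $\sigma = 1$, $a = 0.01$ and $T = 1$. We observe the following:

• $V_S$ is lower when the constraint is tighter (Figure 2-21). Figure 2-22 shows a typical behaviour of $V_S$ over time at S = 1 and L = 0.1 for different values of the robustness multiplier.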
• The lower the value of the robustness multiplier, the longer it takes for the constraint to bind. For very low values of ν the investor is more conservative than in the case without model misspecification for all t. When the constraint is relatively tight, the investor is more conservative the lower the robustness multiplier (Figure 2-23). When the constraint is relatively loose (L is higher), it may happen that for values of ν that are not very low the investor is initially more aggressive and, as t → T, becomes more conservative compared with the case without model misspecification (Figure 2-24). This is because at the beginning the drift term a/(T−t) is very low compared with V_S, which makes the total drift term lower for large values of ν than for smaller values of ν and leads to a lower magnitude of the weight. This situation changes as the time to the horizon T gets smaller.

• V_S typically becomes larger as t → T for each value of ν until a point close to the horizon. Combined with the hyperbolic increase of the drift term a/(T−t), this leads to F becoming larger in absolute value as t → T (Figures 2-23 and 2-24) until the collateral constraint binds. The risk averse investor becomes more aggressive as the time to T becomes smaller, even though the malevolent agent picks a more adverse distortion drift as t → T. The reason is that the improvement in the investment opportunities is so substantial that it dominates the increase in the distortion drift.

• For very low values of ν the drift distortion h_min is always positive. When the constraint is tight enough, the drift distortion is positive for all values of the robustness multiplier (see Figure 2-25). When the constraint is not very tight, for values of ν that are not very low the drift distortion starts negative and after some point increases to positive values as t → T (see Figure 2-26).
• Figure 2-27 shows the two terms of the distortion drift for ν = 1 when L = 0.1. The first term corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, and it is bounded above due to the collateral constraint, while the second term corresponds to a negative distortion drift that points to worse investment opportunities. In this tradeoff the first term wins and the final distortion drift is positive. For higher values of L (see Figure 2-28) the first term loses at the beginning, which makes the drift distortion negative and explains why the investor is initially more aggressive than in the case without fear of model misspecification. As t → T, however, the first term increases at a fast rate and eventually makes the drift distortion positive, which explains why after a while the investor becomes more conservative compared with the case without fear of model misspecification. It is interesting to note that after the collateral constraint binds, the evolution of the distortion drift is determined by the evolution of V_S, and therefore it might also undergo some initial reduction before increasing to its upper bound dictated by the constraint.

• The tighter the collateral constraint, the more conservative the investor is, even when the constraint does not bind (Figure 2-29).

• As ν → ∞ the optimal weight in the strategy converges to the optimal weight when there is no fear of model misspecification, as we have argued before.

Figure 2-21: Partial derivative of the value function with respect to S for a single convergence trade when L = 0.1 and L = 100. V_S as a function of time at S = 1 for different values of the robustness multiplier. The solid line is when L = 100 and the dotted line is for L = 0.1.
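The piecewise rule for the optimal weight and the split of the distortion drift into its two terms can be illustrated with a short numerical sketch. The function name `constrained_weight` and the value of V_S passed to it are hypothetical: in the text V_S comes from the finite-difference solution of the HJB equation, so the numbers below are illustrative only.

```python
def constrained_weight(S, V_S, t, a=0.01, T=1.0, r=0.0, sigma=1.0, nu=1.0, L=0.1):
    """Clipped optimal weight and the two terms of the distortion drift.

    V_S is an assumed input here; in the text it is obtained numerically.
    """
    # Total drift term that enters the first-order condition.
    g = (a / (T - t) + r) * S + sigma**2 * V_S / nu
    # Unconstrained weight, then projection onto the collateral constraint |F| <= L.
    F = -g / (sigma**2 * (1.0 + 1.0 / nu))
    F = max(-L, min(L, F))
    h1 = -sigma * F / nu      # wealth term, at most sigma * L / nu in magnitude
    h2 = -sigma * V_S / nu    # investment-opportunity term
    return F, h1, h2, h1 + h2


# Illustrative value of V_S (hypothetical, not read off the thesis figures).
print(constrained_weight(S=1.0, V_S=0.003, t=0.5))    # constraint slack
print(constrained_weight(S=1.0, V_S=0.003, t=0.99))   # constraint binds, F = -L
```

Clipping the unconstrained weight to [−L, L] is equivalent to the three-case rule above, because the case conditions are exactly the statements that the unconstrained weight lies inside, below or above that interval.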
Figure 2-22: Partial derivative of the value function with respect to S for a single convergence trade when L = 0.1. V_S as a function of time at S = 1 for different values of the robustness multiplier (ν = 0.01, 0.1, 1, 10 and 100). The collateral constraint is |F| ≤ 0.1.

Figure 2-23: Optimal weight of a single convergence trade when L = 0.1. Weight of the convergence trading strategy as a function of time at S = 1 for different values of the robustness multiplier (ν = 0.01, 0.1, 1, 10 and 100). The collateral constraint is |F| ≤ 0.1.

Figure 2-24: Optimal weight of a single convergence trade when L = 1. Weight of the convergence trading strategy as a function of time at S = 1 for different values of the robustness multiplier (ν = 0.01, 0.1, 1, 10 and 100). The collateral constraint is |F| ≤ 1.

Figure 2-25: Distortion drift for a single convergence trade when L = 0.1. Distortion drift as a function of time at S = 1 for different values of the robustness multiplier (ν = 0.01, 0.1, 1, 10 and 100). The collateral constraint is |F| ≤ 0.1.

Figure 2-26: Distortion drift for a single convergence trade when L = 1. Distortion drift as a function of time at S = 1 for different values of the robustness multiplier (ν = 0.01, 0.1, 1, 10 and 100). The collateral constraint is |F| ≤ 1.

Figure 2-27: Distortion drift terms for a single convergence trade when L = 0.1. Distortion drift h and its two terms h1 and h2 as a function of time at S = 1 for ν = 1 and L = 0.1. The first term corresponds to a positive distortion drift that reduces the wealth of the investor and it is bounded above due to the collateral constraint, while the second term corresponds to a negative distortion drift that points to worse investment opportunities.

Figure 2-28: Distortion drift terms for a single convergence trade when L = 1. Distortion drift h and its two terms h1 and h2 as a function of time at S = 1 for ν = 1 and L = 1. The first term corresponds to a positive distortion drift that reduces the wealth of the investor and it is bounded above due to the collateral constraint, while the second term corresponds to a negative distortion drift that points to worse investment opportunities.

Figure 2-29: Optimal weight of a single convergence trade when L = 0.1 and L = 100. Weight of the convergence trading strategy as a function of time at S = 1 for different values of the robustness multiplier. The solid line is when L = 100 and the dotted line is for L = 0.1.
Let us now consider the case where we have N = 2 convergence trades. We solve the HJB equation numerically using the method of finite differences. We have assumed that r_f = 0, T = 1,

\[
a = \begin{pmatrix} 0.04 \\ 0.02 \end{pmatrix}
\quad\text{and}\quad
\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.
\]

Additionally we assume that we have a VaR constraint F^T Σ F ≤ L. In Figures 2-30 and 2-32 we plot the weights of the convergence trading strategies as a function of time at S_1 = 1 and S_2 = 2 for different values of L and different values of the robustness multiplier. At these values of S_1 and S_2 the drift is the same for both strategies. We have assumed that there is no correlation between the two trading strategies. Figures 2-31 and 2-33 show F^T Σ F, the normalized wealth variance, as a function of time for the different values of the robustness multiplier. We observe the following:

• When the VaR constraint binds and there is a fear of model misspecification, we invest more in the spread with the higher rate of mean reversion, due to its higher V_S. The asymmetry between the two convergence trades goes down as t → T, which is why, when the VaR constraint binds, the difference in the weights of the two strategies has to go down, as we see in Figures 2-30 and 2-32 where L = 0.5 and L = 0.05 respectively. Moreover, the investment in the spread with the higher rate of mean reversion is higher than the corresponding investment when there is no fear of model misspecification. This is because in both cases we have the same VaR constraint F_1^2 + F_2^2 = L, but in the first case there is an asymmetry that causes F_1 to be higher and F_2 to be lower than the corresponding weights in the second case.

• The lower the value of the robustness multiplier, the longer it takes for the constraint to bind, just as in the N = 1 case.
• When the VaR constraint does not bind, the weights of both strategies increase as t → T due to the improvement of the investment opportunities. When the constraint is relatively loose (L is higher), it may happen that for values of ν that are not very low the investor is initially more aggressive and, as t → T, becomes more conservative compared with the case without model misspecification (Figures 2-30 and 2-32). This is because at the beginning the drift term a/(T−t) is very low compared with V_S, which makes the total drift term lower for large values of ν than for smaller values of ν and leads to a lower magnitude of the weight.

Figure 2-30: Optimal weights of two uncorrelated convergence trades for S_1 = 1 and S_2 = 2 when L = 0.5. Weights of the convergence trades as a function of time at S_1 = 1 and S_2 = 2 for different values of the robustness multiplier (ν = 1, 10 and 100). The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.5.
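The tilt towards the spread with the higher V_S under a binding VaR constraint can be sketched numerically. The sketch assumes that the two-trade first-order condition is the vector analogue of the single-trade rule, F = −(1 + 1/ν)^(-1) (Σ^(-1) m + V_S/ν) with m_i = (a_i/(T−t) + r) S_i, and that a binding constraint rescales this vector onto F^T Σ F = L. The function name `robust_var_weights` and the values of V_S are hypothetical and are not taken from the thesis.

```python
import numpy as np


def robust_var_weights(m, V_S, Sigma, nu, L):
    """Assumed vector analogue of the single-trade rule with a VaR constraint."""
    m = np.asarray(m, dtype=float)
    V_S = np.asarray(V_S, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    # Unconstrained robust weights.
    F = -(np.linalg.solve(Sigma, m) + V_S / nu) / (1.0 + 1.0 / nu)
    # If F' Sigma F exceeds L, rescale onto the boundary of the constraint.
    var = F @ Sigma @ F
    if var > L:
        F = F * np.sqrt(L / var)
    return F


t, T = 0.9, 1.0
m = np.array([0.04 * 1.0, 0.02 * 2.0]) / (T - t)   # equal drifts at S1 = 1, S2 = 2
V_S = [0.05, 0.02]                                 # hypothetical, higher for spread 1
Sigma = np.eye(2)

robust = robust_var_weights(m, V_S, Sigma, nu=1.0, L=0.05)
benchmark = robust_var_weights(m, V_S, Sigma, nu=np.inf, L=0.05)
print(robust, benchmark)
```

Under these assumptions both weight vectors sit on the same circle F_1^2 + F_2^2 = L, so a larger |F_1| in the robust case necessarily comes with a smaller |F_2|.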
Figure 2-31: Value of the normalized wealth variance for two uncorrelated convergence trades at S_1 = 1 and S_2 = 2 when L = 0.5. Value of the normalized wealth variance for two uncorrelated convergence trades as a function of time at S_1 = 1 and S_2 = 2 for different values of the robustness multiplier (ν = 1, 10 and 100). The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.5.

Figure 2-32: Optimal weights of two uncorrelated convergence trades for S_1 = 1 and S_2 = 2 when L = 0.05. Weights of the convergence trades as a function of time at S_1 = 1 and S_2 = 2 for different values of the robustness multiplier (ν = 1, 10 and 100). The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05.

Figure 2-33: Value of the normalized wealth variance for two uncorrelated convergence trades at S_1 = 1 and S_2 = 2 when L = 0.05. Value of the normalized wealth variance for two uncorrelated convergence trades as a function of time at S_1 = 1 and S_2 = 2 for different values of the robustness multiplier (ν = 1, 10 and 100). The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05.
Figures 2-34 and 2-35 show the weights and the lhs of the constraint for the case of positive correlation, while Figures 2-36 and 2-37 cover the case of negative correlation. The properties are similar to those of the uncorrelated case.

Figure 2-34: Optimal weights of two positively correlated convergence trades for S_1 = 1 and S_2 = 2 when L = 0.05. Weights of the convergence trades as a function of time at S_1 = 1 and S_2 = 2 for different values of the robustness multiplier (ν = 1, 10 and 100). The correlation coefficient is ρ = 0.5 and the rhs of the VaR constraint is L = 0.05.
Figure 2-35: Value of the normalized wealth variance for two positively correlated convergence trades at S_1 = 1 and S_2 = 2 when L = 0.05. Value of the normalized wealth variance for two positively correlated convergence trades as a function of time at S_1 = 1 and S_2 = 2 for different values of the robustness multiplier (ν = 1, 10 and 100). The correlation coefficient is ρ = 0.5 and the rhs of the VaR constraint is L = 0.05.

Figure 2-36: Optimal weights of two negatively correlated convergence trades for S_1 = 1 and S_2 = 2 when L = 0.05. Weights of the convergence trades as a function of time at S_1 = 1 and S_2 = 2 for different values of the robustness multiplier (ν = 1, 10 and 100). The correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint is L = 0.05.

Figure 2-37: Value of the normalized wealth variance for two negatively correlated convergence trades at S_1 = 1 and S_2 = 2 when L = 0.05. Value of the normalized wealth variance for two negatively correlated convergence trades as a function of time at S_1 = 1 and S_2 = 2 for different values of the robustness multiplier (ν = 1, 10 and 100). The correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint is L = 0.05.
So far we have examined the two trading strategies at values where the drifts are the same. We now choose S_1 = 1 and S_2 = 1, since for these values the drifts of the two strategies differ. Figure 2-38 shows the weights of the strategies over time for different values of the robustness multiplier when there is no correlation between the two trading strategies and L = 0.05, while Figure 2-39 shows the lhs of the VaR constraint, the normalized wealth variance, over time. We now observe the following:

• When there is no fear of model misspecification, the ratio of the weights of the two strategies equals the ratio of their drifts normalized by their variances, and it does not change over time regardless of whether the VaR constraint binds or not.

• When the VaR constraint binds and there is a fear of model misspecification, the investor invests more in the spread with the higher drift. Since the constraint is F_1^2 + F_2^2 = L, if F_1 increases (decreases) over time, then F_2 decreases (increases) over time. In Figure 2-38 F_1 decreases over time (in magnitude), because the asymmetry between the two strategies goes down as t → T. This asymmetry is also the reason why the difference in the weights of the two strategies is larger when the robustness multiplier is lower than when it is higher.

• The lower the value of the robustness multiplier, the longer it takes for the VaR constraint to bind.

• When the VaR constraint does not bind, the investor becomes more and more aggressive, due to the improvement of the investment opportunities.
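The first observation above can be checked with a short Lagrangian argument. The sketch assumes that, without fear of model misspecification, the investor's instantaneous problem is the mean-variance problem obtained from the single-trade rule as ν → ∞; the vector m and the multiplier λ are notation introduced here.

```latex
\max_{F}\; -F^{T}m-\tfrac12 F^{T}\Sigma F
\quad\text{s.t.}\quad F^{T}\Sigma F\le L,
\qquad m_i=\Big(\frac{a_i}{T-t}+r\Big)S_i .

% First-order condition with multiplier \lambda \ge 0 on the VaR constraint:
-m-(1+2\lambda)\,\Sigma F=0
\;\Longrightarrow\;
F=-\frac{1}{1+2\lambda}\,\Sigma^{-1}m .

% Uncorrelated case with r_f = 0 at S_1 = S_2 = 1:
\rho=0,\ r=0:\qquad
\frac{F_1}{F_2}=\frac{a_1S_1}{a_2S_2}=\frac{0.04\cdot 1}{0.02\cdot 1}=2 .
```

Under this assumption a binding constraint (λ > 0) only rescales both weights by the same factor and the common factor 1/(T−t) cancels in the ratio, which is why the ratio stays constant over time.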
Figure 2-38: Optimal weights of two uncorrelated convergence trades for S_1 = 1 and S_2 = 1 when L = 0.05. Weights of the convergence trades as a function of time at S_1 = 1 and S_2 = 1 for different values of the robustness multiplier (ν = 1, 10 and 100). The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05.

Figure 2-39: Value of the normalized wealth variance for two uncorrelated convergence trades at S_1 = 1 and S_2 = 1 when L = 0.05. Value of the normalized wealth variance for two uncorrelated convergence trades as a function of time at S_1 = 1 and S_2 = 1 for different values of the robustness multiplier (ν = 1, 10 and 100). The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05.
It is interesting to examine the weights when there is a positive correlation high enough for the investor to be long the second spread and use it as a hedge for the first one. In Figure 2-40 we plot the weights of the two trading strategies and we observe that they have different signs. Figure 2-41 shows the normalized wealth variance as a function of time. When the VaR constraint does not bind, both weights increase in magnitude due to the improvement in the investment opportunities. When the VaR constraint binds, both weights also become larger in magnitude. This is because the asymmetry between the two strategies is reduced towards the asymmetry in the case without fear of model misspecification.

Finally, Figures 2-42 and 2-43 show what happens when there is a negative correlation. The results are similar to those for the case of no correlation.

Figure 2-40: Optimal weights of two positively correlated convergence trades for S_1 = 1 and S_2 = 1 when L = 0.05. Weights of the convergence trades as a function of time at S_1 = 1 and S_2 = 1 for different values of the robustness multiplier (ν = 1, 10 and 100). The correlation coefficient is ρ = 0.8 and the rhs of the VaR constraint is L = 0.05.
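The sign change of the second weight can be illustrated in the benchmark without model misspecification. The sketch assumes that the benchmark weights are proportional to −Σ^(-1) m with m_i = a_i S_i/(T−t) and r_f = 0; the function name `benchmark_direction` is hypothetical. The robust weights shown in Figure 2-40 also depend on V_S, which this sketch ignores.

```python
import numpy as np


def benchmark_direction(rho, a=(0.04, 0.02), S=(1.0, 1.0)):
    """Direction of the no-misspecification weights, up to a positive scale factor."""
    m = np.asarray(a) * np.asarray(S)            # drifts, common factor 1/(T - t) dropped
    Sigma = np.array([[1.0, rho], [rho, 1.0]])
    return -np.linalg.solve(Sigma, m)


for rho in (0.0, 0.5, 0.8):
    print(rho, benchmark_direction(rho))
```

In this benchmark the second component changes sign once ρ exceeds a_2 S_2/(a_1 S_1) = 0.5, so at ρ = 0.8 the investor is short the first spread and long the second. At S_1 = 1 and S_2 = 2 the two drifts coincide and both components keep the same sign for every ρ < 1.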
Figure 2-41: Value of the normalized wealth variance for two positively correlated convergence trades at S_1 = 1 and S_2 = 1 when L = 0.05. Value of the normalized wealth variance for two positively correlated convergence trades as a function of time at S_1 = 1 and S_2 = 1 for different values of the robustness multiplier (ν = 1, 10 and 100). The correlation coefficient is ρ = 0.8 and the rhs of the VaR constraint is L = 0.05.

Figure 2-42: Optimal weights of two negatively correlated convergence trades for S_1 = 1 and S_2 = 1 when L = 0.05. Weights of the convergence trades as a function of time at S_1 = 1 and S_2 = 1 for different values of the robustness multiplier (ν = 1, 10 and 100). The correlation coefficient is ρ = −0.8 and the rhs of the VaR constraint is L = 0.05.

Figure 2-43: Value of the normalized wealth variance for two negatively correlated convergence trades at S_1 = 1 and S_2 = 1 when L = 0.05. Value of the normalized wealth variance for two negatively correlated convergence trades as a function of time at S_1 = 1 and S_2 = 1 for different values of the robustness multiplier (ν = 1, 10 and 100). The correlation coefficient is ρ = −0.8 and the rhs of the VaR constraint is L = 0.05.
2.4.4 Mean reversion trading strategies with constraints

We first consider the case of N = 1 mean reversion trading strategy with S̄ = 0 and a collateral constraint |F| ≤ L. Again, due to the symmetry of S around 0, it suffices to study only what happens when S ≥ 0: the symmetric behaviour of the spread dynamics combined with the symmetry of the constraint implies that the value function is an even function of the spread S and that its partial derivative with respect to the spread is an odd function of S for each t. The optimal weight in the trading strategy is given by:

\[
F^{opt}_t =
\begin{cases}
-\dfrac{1}{\sigma^2\left(1+\frac{1}{\nu}\right)}\left((\phi+r)S_t+\dfrac{\sigma^2 V_S(S_t,t)}{\nu}\right) & \text{if } \left|(\phi+r)S_t+\dfrac{\sigma^2 V_S(S_t,t)}{\nu}\right| \le L\sigma^2\left(1+\dfrac{1}{\nu}\right) \\[2ex]
-L & \text{if } (\phi+r)S_t+\dfrac{\sigma^2 V_S(S_t,t)}{\nu} \ge L\sigma^2\left(1+\dfrac{1}{\nu}\right) \\[2ex]
L & \text{if } (\phi+r)S_t+\dfrac{\sigma^2 V_S(S_t,t)}{\nu} \le -L\sigma^2\left(1+\dfrac{1}{\nu}\right)
\end{cases}
\]

and the minimizing distortion drift is given by:

\[
h_{min} = -\frac{\sigma\,(F^{opt}+V_S)}{\nu}.
\]

When S is positive, we have V_S ≥ 0, since higher values of S correspond to better investment opportunities. If V_S ≥ 0 then −L ≤ F^opt ≤ 0 for S ≥ 0. Therefore we see that again there is a tradeoff between the two terms in h_min. The first term, −σF^opt/ν, corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, and it is bounded above, while the second term, −σV_S/ν, corresponds to a negative distortion drift that points to worse investment opportunities. The HJB equation is given by:

\[
\begin{cases}
V_t + r + \tfrac12\sigma^2 V_{SS} + \tfrac12\,\dfrac{1}{1+\frac{1}{\nu}}\,\dfrac{(\phi+r)^2 S^2}{\sigma^2} + V_S\left(-\phi S+\dfrac{(\phi+r)S}{\nu+1}\right) - \tfrac12\,\dfrac{1}{\nu+1}\,V_S^2\,\sigma^2 = 0 \\[1ex]
\qquad \text{if } \left|(\phi+r)S_t+\dfrac{\sigma^2 V_S(S_t,t)}{\nu}\right| \le L\sigma^2\left(1+\dfrac{1}{\nu}\right) \\[2ex]
V_t + r + \tfrac12\sigma^2 V_{SS} - \dfrac{1}{2\nu}V_S^2 - V_S\left(\phi S-\dfrac{L\sigma^2}{\nu}\right) - \tfrac12\left(1+\dfrac{1}{\nu}\right)\sigma^2L^2 + L(\phi+r)S = 0 \\[1ex]
\qquad \text{if } (\phi+r)S_t+\dfrac{\sigma^2 V_S(S_t,t)}{\nu} \ge L\sigma^2\left(1+\dfrac{1}{\nu}\right) \\[2ex]
V_t + r + \tfrac12\sigma^2 V_{SS} - \dfrac{1}{2\nu}V_S^2 - V_S\left(\phi S+\dfrac{L\sigma^2}{\nu}\right) - \tfrac12\left(1+\dfrac{1}{\nu}\right)\sigma^2L^2 - L(\phi+r)S = 0 \\[1ex]
\qquad \text{if } (\phi+r)S_t+\dfrac{\sigma^2 V_S(S_t,t)}{\nu} \le -L\sigma^2\left(1+\dfrac{1}{\nu}\right)
\end{cases}
\]
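As a consistency check, the unconstrained branch of the HJB equation can be recovered from the constrained branches by optimizing over the weight F. The sketch below sets σ = 1, as in the numerical experiments, and writes g for the bracketed drift term; g is a symbol introduced here for brevity.

```latex
% Common form of the constrained branches for a fixed weight F (sigma = 1):
V_t + r + \tfrac12 V_{SS} - \phi S V_S - \frac{V_S^2}{2\nu}
  - \tfrac12\Big(1+\frac1\nu\Big)F^2 - F\,g = 0,
\qquad g = (\phi+r)S + \frac{V_S}{\nu}.

% Setting F = -L or F = L reproduces the two constrained branches.
% Maximizing over F when the constraint is slack:
F^{opt} = -\frac{g}{1+\frac1\nu},
\qquad
-\tfrac12\Big(1+\frac1\nu\Big)(F^{opt})^2 - F^{opt} g
  = \frac{\nu\, g^2}{2(\nu+1)}.

% Expanding g^2 and collecting the terms in V_S:
\frac{\nu\, g^2}{2(\nu+1)}
  = \tfrac12\,\frac{(\phi+r)^2S^2}{1+\frac1\nu}
  + \frac{(\phi+r)S\,V_S}{\nu+1}
  + \frac{V_S^2}{2\nu(\nu+1)},
\qquad
-\frac{V_S^2}{2\nu}+\frac{V_S^2}{2\nu(\nu+1)} = -\frac{V_S^2}{2(\nu+1)}.
```

Substituting these expressions back gives the first branch of the HJB equation with σ = 1, so the three branches describe one equation evaluated at the clipped optimal weight.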
We solve the HJB equation numerically using the method of finite differences. In the following figures we have assumed that r_f = 0, σ = 1, φ = 1 and T = 1 for two different constraints, one tight and one very loose. The results are similar to those for the case without constraints. In addition we observe the following:

• V_S becomes higher the less tight the constraint is, for each value of the robustness multiplier (Figure 2-47). This is to be expected, since the tighter the constraint, the less we can take advantage of the better investment opportunities when S gets higher.

• Figure 2-45 shows the two terms of the distortion drift for ν = 2 and L = 0.7. The first term corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, and it is bounded above due to the constraint, while the second term corresponds to a negative distortion drift that points to worse investment opportunities. In this tradeoff the first term wins, making the drift distortion positive.

• Figures 2-46 and 2-48 show the optimal weight at S = 1 over time for different values of ν. For L = 0.7 we see that for high values of ν the constraint binds for the whole time period, while for lower values of ν the constraint binds initially and then F is reduced after some time t. Finally, for even lower values of ν the constraint does not bind at all.

• Figure 2-48 shows that when the constraint is tighter the investor is more conservative even when the constraint does not bind. This is expected, since the tighter the constraint, the smaller V_S is. This difference in the weights becomes smaller as t → T.
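A minimal sketch of one way to set up such a finite-difference solution is given below. It is not the scheme used in the thesis: the explicit Euler time stepping, the terminal condition V(S, T) = 0, the boundary treatment V_SS = 0 at the edges of the grid, the grid sizes and the function name `solve_hjb` are all assumptions made for illustration, and σ is fixed at 1 as in the figures.

```python
import numpy as np


def solve_hjb(nu, L, phi=1.0, r=0.0, T=1.0, s_max=4.0, n_s=81):
    """Explicit finite-difference sketch for a single mean reversion strategy.

    Assumptions (not from the text): sigma = 1, terminal condition V(S, T) = 0,
    V_SS = 0 at the grid edges, explicit Euler stepping backwards in time.
    """
    s = np.linspace(-s_max, s_max, n_s)
    ds = s[1] - s[0]
    n_t = int(np.ceil(T / (0.25 * ds**2)))   # dt <= 0.25 * ds^2 keeps the scheme stable
    dt = T / n_t

    def policy(v):
        vs = np.gradient(v, ds)                       # central differences inside the grid
        g = (phi + r) * s + vs / nu                   # total drift term
        f = np.clip(-g / (1.0 + 1.0 / nu), -L, L)     # collateral constraint |F| <= L
        return vs, g, f

    v = np.zeros(n_s)
    for _ in range(n_t):
        vs, g, f = policy(v)
        vss = np.zeros(n_s)                           # stays zero at the two edges
        vss[1:-1] = (v[2:] - 2.0 * v[1:-1] + v[:-2]) / ds**2
        rhs = (r + 0.5 * vss - phi * s * vs - vs**2 / (2.0 * nu)
               - 0.5 * (1.0 + 1.0 / nu) * f**2 - f * g)
        v = v + dt * rhs                              # V(t - dt) = V(t) + dt * rhs
    vs, _, f = policy(v)
    return s, v, vs, f


s, v, vs, f = solve_hjb(nu=2.0, L=0.7)
i = int(np.argmin(np.abs(s - 1.0)))
print("V_S and weight at S = 1, t = 0:", vs[i], f[i])
```

Evaluating the equation at the clipped weight covers all three branches at once. An implicit or upwind scheme would allow larger time steps; the explicit form is used here only because it is short.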
Figure 2-44: Partial derivative of the value function with respect to S for a single mean reversion trading strategy and a collateral constraint with L = 0.7. V_S as a function of time at S = 1 for different values of the robustness multiplier (ν = 1, 2 and 3) for L = 0.7.

Figure 2-45: Distortion drift terms for a single mean reversion trading strategy and a collateral constraint with L = 0.7. Distortion drift h and its two terms h1 and h2 as a function of time at S = 1 for ν = 2 and for L = 0.7. The first term corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, while the second term corresponds to a negative distortion drift that points to worse investment opportunities. The first term is bounded above due to the collateral constraint.

Figure 2-46: Optimal weight of a single mean reversion trading strategy with a collateral constraint with L = 0.7. Weight of the mean reversion trading strategy as a function of time at S = 1 for different values of the robustness multiplier (ν = 1, 2 and 3) and for L = 0.7.

Figure 2-47: Partial derivative of the value function with respect to S for a single mean reversion trading strategy with different collateral constraints. V_S as a function of time at S = 1 for different values of the robustness multiplier and different collateral constraints. The solid line is for L = 7 and the dotted line for L = 0.7.

Figure 2-48: Optimal weight of a single mean reversion trading strategy with different collateral constraints. Weight of the mean reversion trading strategy as a function of time at S = 1 for different values of the robustness multiplier and different collateral constraints. The solid line is for L = 7 and the dotted line for L = 0.7.
Let us now consider the case where we have N = 2 mean reversion trading strategies and again S̄ = 0. We solve the HJB equation numerically using the method of finite differences. We have assumed that r_f = 0, T = 1,

\[
\Phi = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}
\quad\text{and}\quad
\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.
\]

Additionally we assume that we have a VaR constraint F^T Σ F ≤ L. In Figures 2-49, 2-51 and 2-53 we plot the weights of the mean reversion trading strategies as a function of time at S_1 = 1 and S_2 = 2 for different values of L and different values of the robustness multiplier. At these values of S_1 and S_2 the drift is the same for both strategies. We have assumed that there is no correlation between the two trading strategies. Figures 2-50, 2-52 and 2-54 show F^T Σ F as a function of time for the different values of the robustness multiplier. We observe the following:

• When there is no fear of model misspecification, the weights of the two strategies are the same, since there is no asymmetry between them, and they do not change over time regardless of whether the VaR constraint binds or not.

• When the VaR constraint binds and there is a fear of model misspecification, we invest more in the spread with the higher φ coefficient, due to its higher V_S. As we saw in the unconstrained case (Figure 2-10), the ratio of the weights is reduced over time, since the asymmetry between them shrinks as V_S1 and V_S2 decrease over time. Therefore, when the VaR constraint binds, the difference in the weights of the two strategies has to go down, as we see in Figures 2-49 and 2-51 where L = 3 and L = 2 respectively. Moreover, the investment in the spread with the higher φ coefficient is higher than the corresponding investment when there is no fear of model misspecification. This is because in both cases we have the same VaR constraint F_1^2 + F_2^2 = L, but in the first case there is an asymmetry that causes F_1 to be higher and F_2 to be lower than the corresponding weights in the second case.

• When the VaR constraint does not bind, the weights of both strategies go down just as in the unconstrained case, as we see in Figures 2-49 and 2-53.

Figure 2-49: Optimal weights of two uncorrelated mean reversion trading strategies for S_1 = 1 and S_2 = 2 when L = 3. Weights of the mean reversion trading strategies as a function of time at S_1 = 1 and S_2 = 2 for different values of the robustness multiplier (ν = 1, 10 and 100). The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 3.
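The benchmark weights without fear of model misspecification can be computed directly for the three values of L used in the figures. The sketch assumes that these weights are −Σ^(-1)(Φ + rI)S, rescaled onto F^T Σ F = L when the VaR constraint binds, which is the vector analogue of the single-strategy rule as ν → ∞. The function name `benchmark_weights` is hypothetical.

```python
import numpy as np


def benchmark_weights(S, Phi, Sigma, L, r=0.0):
    """Assumed no-misspecification weights under the VaR constraint F' Sigma F <= L."""
    S = np.asarray(S, dtype=float)
    drift = (Phi + r * np.eye(len(S))) @ S       # (phi_i + r) * S_i for diagonal Phi
    F = -np.linalg.solve(Sigma, drift)           # unconstrained weights
    var = F @ Sigma @ F
    # Rescale onto the boundary when the unconstrained weights violate the constraint.
    return F * np.sqrt(L / var) if var > L else F


Phi = np.diag([2.0, 1.0])
Sigma = np.eye(2)
for L in (2.0, 3.0, 7.0):
    print(L, benchmark_weights([1.0, 2.0], Phi, Sigma, L))
```

At S_1 = 1 and S_2 = 2 both drifts equal 2, so under this assumption the two weights coincide, the unconstrained value of F^T Σ F is 8, and the constraint binds for all three values of L. The resulting weights are about −1, −1.22 and −1.87 for L = 2, 3 and 7, with no dependence on t.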
  • 118.
    Time 0 0.1 0.20.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Weights 2 2.2 2.4 2.6 2.8 3 3.2 VaR Constraint as a function of time for S1 = 1 and S2 = 2 nu is: 1 nu is: 10 nu is: 100 Figure 2-50: Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 3. Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 3. Time 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Weights -1.15 -1.1 -1.05 -1 -0.95 -0.9 -0.85 Optimal weights as a function of time for S1 = 1 and S2 = 2 Spread 1 nu is: 1 Spread 2 nu is: 1 Spread 1 nu is: 10 Spread 2 nu is: 10 Spread 1 nu is: 100 Spread 2 nu is: 100 Figure 2-51: Optimal weights of two uncorrelated mean reversion trading strategies for S1 = 1 and S2 = 2 when L = 2. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 2. 118
  • 119.
    Time 0 0.1 0.20.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Weights 1.9 1.92 1.94 1.96 1.98 2 2.02 2.04 2.06 2.08 2.1 VaR Constraint as a function of time for S1 = 1 and S2 = 2 nu is: 1 nu is: 10 nu is: 100 Figure 2-52: Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 2. Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 2. Time 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Weights -2 -1.9 -1.8 -1.7 -1.6 -1.5 -1.4 -1.3 -1.2 -1.1 -1 Optimal weights as a function of time for S1 = 1 and S2 = 2 Spread 1 nu is: 1 Spread 2 nu is: 1 Spread 1 nu is: 10 Spread 2 nu is: 10 Spread 1 nu is: 100 Spread 2 nu is: 100 Figure 2-53: Optimal weights of two uncorrelated mean reversion trading strategies for S1 = 1 and S2 = 2 when L = 7. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 7. 119
[Plot "VaR Constraint as a function of time for S1 = 1 and S2 = 2": x-axis Time, y-axis Weights; curves for nu = 1, 10, 100.]
Figure 2-54: Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 7. Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the right-hand side of the VaR constraint is L = 7.
Figures 2-55 and 2-56 show the weights and the constraint for the case of positive correlation, while Figures 2-57 and 2-58 cover the case of negative correlation. The properties are similar to those of the uncorrelated case.

[Plot "Optimal weights as a function of time for S1 = 1 and S2 = 2": x-axis Time, y-axis Weights; curves for Spread 1 and Spread 2 at nu = 1, 10, 100.]
Figure 2-55: Optimal weights of two positively correlated mean reversion trading strategies for S1 = 1 and S2 = 2 when L = 7. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.5 and the right-hand side of the VaR constraint is L = 7.
[Plot "VaR Constraint as a function of time for S1 = 1 and S2 = 2": x-axis Time, y-axis Weights; curves for nu = 1, 10, 100.]
Figure 2-56: Value of the normalized wealth variance for two positively correlated mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 7. Value of the normalized wealth variance for two positively correlated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.5 and the right-hand side of the VaR constraint is L = 7.

[Plot "Optimal weights as a function of time for S1 = 1 and S2 = 2": x-axis Time, y-axis Weights; curves for Spread 1 and Spread 2 at nu = 1, 10, 100.]
Figure 2-57: Optimal weights of two negatively correlated mean reversion trading strategies for S1 = 1 and S2 = 2 when L = 7. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the right-hand side of the VaR constraint is L = 7.
[Plot "VaR Constraint as a function of time for S1 = 1 and S2 = 2": x-axis Time, y-axis Weights; curves for nu = 1, 10, 100.]
Figure 2-58: Value of the normalized wealth variance for two negatively correlated mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 7. Value of the normalized wealth variance for two negatively correlated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the right-hand side of the VaR constraint is L = 7.
So far we have examined the two trading strategies at values where the drifts are the same. We now choose S1 = 1 and S2 = 1, just as in the unconstrained case, since for these values the drifts of the two strategies differ. Figure 2-59 shows the weights of the strategies over time for different values of the robustness multiplier when there is no correlation between the two trading strategies and L = 2, while Figure 2-60 shows the VaR constraint over time. We observe the following:

• First, when there is no fear of model misspecification, the ratio of the weights of the two strategies equals the ratio of their drifts normalized by their variances, and it does not change over time regardless of whether the VaR constraint binds.

• When the VaR constraint binds and there is fear of model misspecification, the investor invests more in the spread with the higher drift. Since the constraint is $F_1^2 + F_2^2 = L$, if $F_1$ increases (decreases) over time, then $F_2$ decreases (increases) over time. In Figure 2-59, $F_1$ increases over time because the asymmetry between the two strategies grows larger, as we can see in the unconstrained case in Figure 2-15.

• When the VaR constraint does not bind, the investor becomes more and more conservative in both strategies over time, just as in the N = 1 case.
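The first observation above can be illustrated with a small numerical sketch. It assumes ρ = 0 and no fear of model misspecification, takes F1 and F2 as they appear in the constraint above, and further assumes that the constrained weights are simply the unconstrained ones rescaled onto the boundary F1² + F2² = L, so that their ratio is preserved. The function name and the input values are hypothetical and are not taken from the thesis.

```python
import math

def scale_to_var_limit(f1, f2, L):
    """Rescale (f1, f2) onto f1^2 + f2^2 = L when the pair exceeds the limit.

    Assumption: the binding solution keeps the unconstrained ratio f1/f2,
    which matches the no-misspecification case described in the text.
    """
    variance = f1 * f1 + f2 * f2
    if variance <= L:
        # Constraint does not bind: keep the unconstrained weights.
        return f1, f2
    k = math.sqrt(L / variance)
    return k * f1, k * f2

# Hypothetical unconstrained weights with ratio 2:1 and limit L = 2.
w1, w2 = scale_to_var_limit(-2.0, -1.0, L=2.0)
```

After rescaling, the pair sits on the constraint boundary and the 2:1 ratio is unchanged. With fear of model misspecification the ratio itself moves over time, so this rescaling would no longer describe the solution.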
[Plot "Optimal weights as a function of time for S1 = 1 and S2 = 1": x-axis Time, y-axis Weights; curves for Spread 1 and Spread 2 at nu = 1, 10, 100.]
Figure 2-59: Optimal weights of two uncorrelated mean reversion trading strategies for S1 = 1 and S2 = 1 when L = 2. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the right-hand side of the VaR constraint is L = 2.

[Plot "VaR Constraint as a function of time for S1 = 1 and S2 = 1": x-axis Time, y-axis Weights; curves for nu = 1, 10, 100.]
Figure 2-60: Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies at S1 = 1 and S2 = 1 when L = 2. Value of the normalized wealth variance for two uncorrelated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the right-hand side of the VaR constraint is L = 2.
It is interesting to examine the weights when the positive correlation is high enough that the investor goes long the second spread and uses it as a hedge for the first. In Figure 2-61 we plot the weights of the two trading strategies and observe that they have different signs. Figure 2-62 shows the VaR value as a function of time. When the VaR constraint does not bind, both weights shrink in magnitude, as in the unconstrained case. When the VaR constraint binds, both weights become larger in magnitude. This is because the asymmetry between the two strategies is reduced towards the asymmetry in the case without fear of model misspecification, as we see in the unconstrained case (Figure 2-19).

Finally, Figures 2-63 and 2-64 show what happens when there is a negative correlation. The results are similar to those for the case of no correlation.

[Plot "Optimal weights as a function of time for S1 = 1 and S2 = 1": x-axis Time, y-axis Weights; curves for Spread 1 and Spread 2 at nu = 1, 10, 100.]
Figure 2-61: Optimal weights of two positively correlated mean reversion trading strategies for S1 = 1 and S2 = 1 when L = 2. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.9 and the right-hand side of the VaR constraint is L = 2.
[Plot "VaR Constraint as a function of time for S1 = 1 and S2 = 1": x-axis Time, y-axis Weights; curves for nu = 1, 10, 100.]
Figure 2-62: Value of the normalized wealth variance for two positively correlated mean reversion trading strategies at S1 = 1 and S2 = 1 when L = 2. Value of the normalized wealth variance for two positively correlated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.9 and the right-hand side of the VaR constraint is L = 2.

[Plot "Optimal weights as a function of time for S1 = 1 and S2 = 1": x-axis Time, y-axis Weights; curves for Spread 1 and Spread 2 at nu = 1, 10, 100.]
Figure 2-63: Optimal weights of two negatively correlated mean reversion trading strategies for S1 = 1 and S2 = 1 when L = 8. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the right-hand side of the VaR constraint is L = 8.
[Plot "VaR Constraint as a function of time for S1 = 1 and S2 = 1": x-axis Time, y-axis Weights; curves for nu = 1, 10, 100.]
Figure 2-64: Value of the normalized wealth variance for two negatively correlated mean reversion trading strategies at S1 = 1 and S2 = 1 when L = 8. Value of the normalized wealth variance for two negatively correlated mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the right-hand side of the VaR constraint is L = 8.
2.5 Conclusions

We investigated how the optimal trading strategy of a risk averse investor, facing arbitrage opportunities and risk constraints, changes when he is not confident of his model dynamics. In particular, we assumed that the investor believes that the data come from an unknown member of a set of unspecified alternative models near his approximating model. The investor believes that his model is a pretty good approximation in the sense that the relative entropy of the alternative models with respect to his nominal model is small. Concern about model misspecification leads the investor to choose a robust trading strategy that works well over that set of alternative models.

We derived the optimal trading strategy for convergence trades and mean reversion trading strategies, with and without constraints, by solving the corresponding Hamilton-Jacobi-Bellman equation.

In all of our cases we dealt with diffusion processes, and the alternative models distorted the conditional mean of the Brownian motion. An interesting extension of our work would be to consider jump processes, where the misspecification concerns the dynamics of the jumps. This will be the topic of future work.
Chapter 3

Estimating the NIH Efficient Frontier

The National Institutes of Health (NIH) is among the world’s largest and most important investors in biomedical research. Its stated mission is to “seek fundamental knowledge about the nature and behavior of living systems and the application of that knowledge to enhance health, lengthen life, and reduce the burdens of illness and disability” (http://www.nih.gov/about/mission.htm). Some have criticized the NIH funding process as not being sufficiently focused on disease burden[69, 45, 43].

We consider a framework in which biomedical research allocation decisions are more directly tied to the risk/reward trade-off of burden-of-disease outcomes. Prioritizing research efforts is analogous to managing an investment portfolio: in both cases, there are competing opportunities to invest limited resources, and expected returns, risk, correlations, and the cost of lost opportunities are important factors in determining the return on those investments.

Financial decisions are commonly made according to portfolio theory[55], in which the optimal trade-off between risk and reward among a collection of competing investments, known as the “efficient frontier”, is constructed via quadratic optimization, and a point on this frontier is selected based on an investor’s risk/reward preferences. Given a measure of “return on investment” (ROI), an “efficient portfolio” is defined to be the investment allocation that yields the highest expected return
for a given and fixed level of risk (as measured by return volatility), and the locus of efficient portfolios across all levels of risk is the efficient frontier.

We recast the NIH funding allocation decision as a portfolio-optimization problem in which the objective is to allocate a fixed amount of funds across a set of disease groups to maximize the expected “return on investment” (ROI) for a given level of volatility. We define ROI as the subsequent improvement in years of life lost (YLL). We use historical time series data provided by the NIH and the Centers for Disease Control for each of 7 disease groups, and we estimate the means, variances, and covariances among these time series. These serve as inputs to the portfolio-optimization problem. Such an approach provides objective, systematic, transparent, and repeatable metrics that can incorporate “real-world” constraints, and yields well-defined optimal risk-sensitive biomedical research funding allocations expressly designed to reduce the burden of disease.

In the rest of this chapter, we first review the relevant literature. We then describe the data and solution methods used, present our results, and conclude with a discussion of our findings.

3.1 NIH Background and Literature Review

The National Institutes of Health (NIH) was established in 1938 and has a budget of over $31 billion, of which 80% is awarded to more than 325,000 researchers through nearly 50,000 competitive grants at over 3,000 universities, medical schools, and other research institutions (http://www.nih.gov).
The NIH allocates funding among competing priorities by assessing such priorities with respect to five major criteria[65]: (a) public needs; (b) scientific quality of the research; (c) potential for scientific progress (the existence of promising pathways and qualified investigators); (d) portfolio diversification; and (e) adequate support of infrastructure (human capital, equipment, instrumentation, and facilities). This framework was supported, with some additional recommendations, by an Institute of Medicine (IoM) blue-ribbon panel in 1998 (see Table 3.1)[1].
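The efficient-frontier construction described in the chapter introduction can be made concrete with a small numerical sketch. It uses the textbook closed-form solution for minimum-variance weights under a budget constraint and a target expected return, with short positions allowed. The three-asset inputs are hypothetical and are not estimates from NIH or CDC data. A funding application would also need non-negativity and other "real-world" constraints, which require a quadratic-programming solver and are omitted here.

```python
import numpy as np

def frontier_weights(mu, cov, target):
    """Minimum-variance weights with sum(w) = 1 and w @ mu = target.

    Closed-form Lagrangian solution; no sign constraints on the weights.
    """
    ones = np.ones_like(mu)
    inv_ones = np.linalg.solve(cov, ones)  # cov^{-1} 1
    inv_mu = np.linalg.solve(cov, mu)      # cov^{-1} mu
    a = ones @ inv_ones
    b = ones @ inv_mu
    c = mu @ inv_mu
    d = a * c - b * b
    return ((c - b * target) * inv_ones + (a * target - b) * inv_mu) / d

# Hypothetical expected returns and covariance matrix for three investments.
mu = np.array([0.05, 0.08, 0.12])
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])

# Trace (volatility, expected return) pairs along the upper branch,
# i.e. targets above the global minimum-variance portfolio's return.
frontier = []
for target in np.linspace(0.07, 0.12, 6):
    w = frontier_weights(mu, cov, target)
    frontier.append((float(np.sqrt(w @ cov @ w)), float(target)))
```

Each pair in `frontier` gives the lowest volatility attainable for that expected return. Read in the other direction, it gives the highest expected return for that level of volatility, which is the definition of an efficient portfolio used in this chapter.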
Criteria

Recommendation 1. The committee generally supports the criteria that NIH uses for priority setting and recommends that NIH continue to use these criteria in a balanced way to cover the full spectrum of research related to human health.
Recommendation 2. NIH should make clear its mechanisms for implementing its criteria for setting priorities and should evaluate their use and effectiveness.
Recommendation 3. In setting priorities, NIH should strengthen its analysis and use of health data, such as burdens and costs of diseases, and of data on the impact of research on the health of the public.
Recommendation 4. NIH should improve the quality and analysis of its data on funding by disease and should include direct and related expenditures.

Processes

Recommendation 5. In exercising the overall authority to oversee and coordinate the priority-setting process, the NIH director should receive from the directors of all the institutes and centers multiyear strategic plans, including budget scenarios, in a standard format on an annual basis.
Recommendation 6. The director of NIH should increase the involvement of the Advisory Committee to the Director in the priority-setting process. The diversity of the committee's membership should be increased, particularly with respect to its public members.

Public

Recommendation 7. NIH should establish an Office of Public Liaison in the Office of the Director, and, where offices performing such a function are not already in place, in each institute. These offices should document, in a standard format, their public outreach, input, and response mechanisms. The director's Office of Public Liaison should review and evaluate these mechanisms and identify best practices.
Recommendation 8. The director of NIH should establish and appropriately staff a Director's Council of Public Representatives, chaired by the NIH director, to facilitate interactions between NIH and the general public.
Recommendation 9. The public membership of NIH policy and program advisory groups should be selected to represent a broad range of public constituencies.

Congress

Recommendation 10. The U.S. Congress should use its authority to mandate specific research programs, establish levels of funding for them, and implement new organizational entities only when other approaches have proven inadequate. NIH should provide Congress with analyses of how NIH is responding to requests for such major changes and whether these requests can be addressed within existing mechanisms.
Recommendation 11. The director of NIH should periodically review and report on the organizational structure of NIH, in light of changes in science and the health needs of the public.
Recommendation 12. Congress should adjust the levels of funding for research management and support so that the NIH can implement improvements in the priority-setting process, including stronger analytical, planning, and public interface capabilities.

Table 3.1: IoM recommendations. The 12 major recommendations of the 1998 Institute of Medicine panel, in four large areas, for improving the process of allocating research funds.
Despite this framework and the IoM endorsement, NIH funding has been criticized as not being aligned with disease burden and as insufficiently effective[69, 45, 43]. For example, the impact of cancer has been estimated as only 5% of total direct cost but 23% of all deaths[20], while extramural spending by the National Cancer Institute (NCI) is about 15% of the total (http://report.nih.gov/). Sandler et al.[70] suggested that digestive diseases were relatively underfunded based on comparisons of disease burden as measured by direct and indirect cost. Gross[38] noted that NIH funding is reasonably predicted by some burden-of-disease metrics (disability-adjusted life-years or DALY, which are unavailable in time-series form)[57]. Earmarks or target funding levels for specific diseases and programs have been suggested by a number of policymakers[2]. Funding allocation decisions are not unique to the NIH; in a study similar to that of Gross et al., Curry et al.[27] questioned the allocations of the Centers for Disease Control.

NIH leaders have noted that funding basic research is itself a risky endeavor, involving trade-offs among all five of their funding criteria, and may also include unstated secondary objectives, e.g., actively “balancing out” spending by other agencies, charities, and the private sector[76]. Collectively, these factors impose significant challenges to determining an ideal allocation of research funds.

Although the economic impact of biomedical research has been considered[23], the main focus has been on measuring value-added rather than determining optimal funding allocations. Murphy and Topel[63] estimate U.S. economic surplus from improved health on the order of $2.6 trillion annually, with benefits distributed unequally across age and gender, and suggest that in some cases, incremental benefits may not exceed the cost of achieving them.
Johnston et al.[46] found a return to society, in the form of averted treatment costs and public health benefits divided by the cost of trial expenditures, of 46% for clinical trials at the National Institute of Neurological Disorders and Stroke (NINDS). The returns, or net savings, were generated by four of the 28 trials examined, and collectively exceeded the costs of not only the clinical program but the entire program of research at NINDS during the study period. Cutler and McClellan[28] computed returns of technological advances for five conditions
and found a net benefit for four and costs equal to benefits in the fifth. Fleurence and Torgerson[31] suggested that research should be allocated to provide the most health benefits to the population, subject to equity considerations, and observed that subjective, burden-of-disease, and payback methods all failed this test to some degree. Instead, they argue that a method of information valuation is superior.

Modern financial portfolio theory, in which the expected return, risk (as measured by volatility), and correlations of a collection of investment opportunities are taken as inputs, and the set of all portfolio weights with the highest expected return for a given level of risk is the output, produces rational allocations of limited resources among competing priorities. For developing this method in 1952, Markowitz shared the Nobel Memorial Prize in Economic Sciences in 1990. The theory has had extensive applications among mutual funds, pension funds, endowments, and sovereign wealth funds[55, 56, 24].

More recently, portfolio theory has been proposed as a means for conducting risk-sensitive cost-benefit analysis for health-care budgeting decisions[11, 66, 19, 18, 71, 72]. The motivation for these studies is the observation that typical cost-effectiveness studies of healthcare programs ignore the uncertainty of realized costs, which can be addressed by applying portfolio theory to balance the risks against the rewards of specific budget allocations. These studies present simplified frameworks for incorporating risk into the healthcare budgeting process, e.g., two-security examples (although[71] does contain 11 hypothetical cost/effect distributions), and do not contain full-scale empirical applications to realistic budgeting tasks. As the authors note, applying portfolio theory to large public healthcare reimbursement problems can be challenging.
Patients may have differing and non-constant utility functions, and some argue that the manager/administrator should only consider expected returns, allowing the patient and physician to consider risk trade-offs at individual treatment levels, in which case the aggregate utility function is implicit.

Despite the growing interest in measuring the return on biomedical research[67, 51], and the fact that portfolio theory has already been applied to healthcare budgeting decisions, some sceptics continue to argue against the use of any quantitative metrics in this domain. For example, Black[14] states categorically that “[t]he biomedical ‘payback’ approach is certainly inappropriate and attempts to impose it should be strenuously resisted. Instead, a qualitative approach should be applied that takes into account the ‘slow-burning fuse’ and avoids simple attribution of cause and effect”. While such a response may be acceptable for certain types of funding, it is becoming increasingly untenable with respect to public funds and government support, which, by law, almost always require some form of cost/benefit analysis, performance attribution, and oversight.

3.2 Methods

3.2.1 Funding Data

The NIH has 27 Institutes and Centers, of which we identified 10 with research missions clearly tied to specific disease states, and which account for $21 billion of funding in 2005, or 74% of the total (see Table 3.2 for the disease classification scheme used and Figure 3-1 for the procedure for constructing the appropriation time series). The National Institute of Allergies and Infectious Diseases (NIAID) spending has been split to account for HIV, which is presented separately (see HIV discussion below).

These Institutes and the basic research they fund inevitably overlap and have effects beyond their charter; we treat all spending for any given Institute as being directed toward the corresponding disease states, and account for spillover effects by considering the correlations in the lessening of the burden of disease in other groups. For example, molecular biology funded by the NCI may be relevant to infectious diseases but, like the entire NCI budget, would be assumed for modeling purposes to be directed at cancer; the hypothetical infectious-disease improvement would appear in the correlation between the decrease in years of life lost for cancer and that of infectious diseases.
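The spillover argument above rests on the correlation between year-over-year decreases in YLL across disease groups. A minimal sketch of that calculation follows. The two series are hypothetical numbers in arbitrary units, not CDC data, and the group labels are used only for illustration.

```python
import numpy as np

# Hypothetical population-normalized YLL series (arbitrary units).
yll = {
    "ONC": np.array([100.0, 98.0, 97.0, 95.0, 94.0, 91.0]),
    "AID": np.array([40.0, 39.5, 39.0, 38.0, 37.8, 36.9]),
}

# Improvement in a year = decrease in YLL = negated first difference.
improvement = {group: -np.diff(series) for group, series in yll.items()}

# Sample correlation between the two improvement series.
corr = np.corrcoef(improvement["ONC"], improvement["AID"])[0, 1]
```

A positive value of `corr` is what the text describes as a spillover: years in which cancer YLL falls more than usual also tend to be years in which infectious-disease YLL falls more than usual, whichever Institute funded the underlying research.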
Analytic Group | ICD-9 Chapter(s) | ICD-9 Codes | ICD-10 Chapter(s) | ICD-10 Blocks | NIH Institute(s)
AID | Infectious and Parasitic Diseases | 001-139 | Certain infectious and parasitic diseases | A00-B99 | NIAID
NCI | Neoplasms | 140-239 | Neoplasms | C00-D48 | NCI
DDK | Endocrine, nutritional and metabolic diseases, and immunity disorders; Diseases of the digestive system; Diseases of the genitourinary system | 240-279; 520-579; 580-629 | Endocrine, nutritional and metabolic diseases; Diseases of the digestive system; Diseases of the genitourinary system | E00-E88; K00-K92; N00-N98 | NIDDK
HLB | Diseases of the blood and blood-forming organs; Diseases of the circulatory system; Diseases of the respiratory system | 280-289; 390-459; 460-519 | Diseases of the blood and blood-forming organs and certain disorders involving the immune mechanism; Diseases of the circulatory system; Diseases of the respiratory system | D50-D89; I00-I99; J00-J98 | NHLBI
NMH | Mental disorders | 290-319 | Mental and behavioural disorders | F01-F99 | NIMH
CNS | Diseases of the nervous system and sense organs | 320-389 | Diseases of the nervous system; Diseases of the eye and adnexa; Diseases of the ear and mastoid process | G00-G98; H00-H57; H60-H93 | NINDS; NEI; NIDCD
CHD | Complications of pregnancy, childbirth, and the puerperium; Congenital anomalies; Certain conditions originating in the perinatal period | 630-676; 740-759; 760-779 | Pregnancy, childbirth and the puerperium; Certain conditions originating in the perinatal period; Congenital malformations, deformations and chromosomal abnormalities | O00-O99; P00-P96; Q00-Q99 | NICHD
AMS | Diseases of the skin and subcutaneous tissue; Diseases of the musculoskeletal system and connective tissue | 680-709; 710-739 | Diseases of the skin and subcutaneous tissue; Diseases of the musculoskeletal system and connective tissue | L00-L98; M00-M99 | NIAMS
LAB | Symptoms, signs, and ill-defined conditions | 780-799 | Symptoms, signs and abnormal clinical and laboratory findings, not elsewhere classified | R00-R99 | (none)
EXT | External causes of injury and poisoning | E800-E999 | Codes for special purposes; External causes of morbidity and mortality | U00-U99; V01-Y89 | (none)

Table 3.2: ICD mapping. Classification of ICD-9 (1979–1998) and ICD-10 (1999–2007) Chapters and NIH appropriations by Institute and Center to 7 disease groups: oncology (ONC); heart, lung and blood (HLB); digestive, renal and endocrine (DDK); central nervous system and sensory (CNS), into which we placed dementia and unspecified psychoses to create comparable series, as there was a clear, ongoing migration from NMH to CNS after the change to ICD-10 in 1999; psychiatric and substance abuse (NMH); infectious disease, subdivided into estimated HIV (HIV) and other (AID); and maternal, fetal, congenital and pediatric (CHD). The categories LAB and EXT are omitted from our analysis.
Figure 3-1: NIH time series flowchart. Flowchart for the construction of NIH appropriations time series. “NIH Approp.” denotes NIH appropriations; “PHS Gaps” denotes Institute funding by the U.S. Public Health Service; “Complete Approp.” denotes the union of these two series; “FY Change” allows for the change in government fiscal years; “4Q FY” time series refers to the resulting series in which all years are treated as having four quarters of three months each.

[Plot: x-axis Year (1940 to 2000), y-axis Funding (in billions of $); series AID, HIV, AMS, CHD, CNS, DDK, HLB, ONC, NMH.]
Figure 3-2: Appropriations data. NIH appropriations in real (2005) dollars, categorized by disease group.
3.2.2 Burden of Disease Data

Because of its simplicity, availability, breadth, and long history, years of life lost (YLL) was chosen as the measure of burden of disease to be used in constructing the estimated return on investment from NIH-funded research. The CDC Wide-ranging Online Data for Epidemiologic Research (WONDER) database (http://wonder.cdc.gov/) was queried for the underlying cause of death at the Chapter level (except for mental disorders, where dementia and unspecified psychoses were all placed in CNS for consistency with CDC coding after 1998) for International Classification of Diseases (ICD) categories ICD-9 (for 1979–1998) and ICD-10 (for 1999–2007). The two datasets for pre- and post-1998 were joined into one continuous series, data were stratified into groups by age at death, and YLL were computed by comparing the midpoint of the age ranges with the World Health Organization’s (WHO) year-2000 U.S. life table (http://www.who.int/whosis/en/). Years of life lost were then tabulated by Chapter annually, and adjusted for population growth to remove what would otherwise be a systematic downward bias in realized health improvements. This process yielded YLL series for 9 distinct disease groups. Using 2005 as the base year, the raw YLL observations were adjusted in other years to be comparable to the 2005 population:

\[
\mathrm{YLL}_t \equiv \mathrm{YLL}^{\mathrm{raw}}_t \times \frac{\mathrm{POP}_{2005}}{\mathrm{POP}_t}, \qquad \mathrm{POP}_t \equiv \text{U.S. population in year } t. \tag{3.1}
\]

The procedure for assembling the YLL time series is summarized in Figure 3-3, and the resulting series, both raw and normalized for population growth, are shown in Figure 3-4.

The change in burden of disease was measured by taking first differences. These first differences were used to compute the “return on investment” on which the mean-variance optimizations were based (see the “Methods” section below).

Three disease areas required special consideration: HIV, AMS, and dementia.
AMS and HIV have shorter histories, which is problematic for estimating parameters based on historical returns that are lagged by typical FDA approval times plus 4 years.
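The population adjustment in Equation (3.1) and the first-difference step can be sketched in a few lines. The function names and all numbers below are hypothetical and are not actual CDC or census figures. The inputs are chosen so that raw YLL rises purely because the population grows.

```python
def normalize_yll(yll_raw, pop, base_year=2005):
    """Rescale raw YLL in each year to the base-year population, as in Eq. (3.1)."""
    base_pop = pop[base_year]
    return {year: yll_raw[year] * base_pop / pop[year] for year in yll_raw}

def first_differences(series):
    """Year-over-year change in a {year: value} series, keyed by the later year."""
    years = sorted(series)
    return {later: series[later] - series[earlier]
            for earlier, later in zip(years, years[1:])}

# Hypothetical inputs: YLL in millions of years, population in millions.
yll_raw = {2003: 9.0, 2004: 9.6, 2005: 10.0}
pop = {2003: 270.0, 2004: 288.0, 2005: 300.0}

yll = normalize_yll(yll_raw, pop)
change = first_differences(yll)
```

In this toy case the normalized series is flat at the base-year level, so the first differences are zero up to rounding. The apparent increase in raw YLL came entirely from population growth, which is the distortion the adjustment is meant to remove.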
Figure 3-3: YLL time series flowchart. Flowchart for the construction of years of life lost (YLL) time series. “WONDER Chapter Age Group” refers to a query to the CDC WONDER database at the chapter level, stratified by age group at death; “US Pop.” is the United States population from census data as expressed in the WONDER dataset; and “US GDP” denotes U.S. gross domestic product.
[Panels: (a) YLL Gross; (b) YLL Normalized.]
Figure 3-4: YLL data. Panel (a): Raw YLL categorized by disease group. Panel (b): Population-normalized YLL (with base year of 2005), categorized by disease group. Both panels are based on data from 1979 to 2007.
Dementia, including Alzheimer’s disease and unspecified psychoses, was reclassified with the change from ICD-9 to ICD-10 from mental and behavioral disorders to diseases of the nervous system; we placed all dementia YLL in the CNS group to avoid a transition-point artifact at the juncture between ICD-9 and ICD-10, and then performed a sensitivity analysis with and without the dementia YLL.

HIV poses a special challenge given its extreme returns after the introduction of protease inhibitors, which are outliers that are likely to be non-stationary and would heavily bias the parameter estimates on which the portfolio optimization is based. To address this outlier, HIV spending and its corresponding YLL were separated from those of other infectious diseases. The component of NIAID spending directed at HIV was estimated by straight-line interpolation from published figures, and this HIV spending was treated as a separate entity and subtracted from reported NIAID appropriations. A similar procedure was followed for the estimation of HIV-related YLL, and WONDER was queried at the subchapter level to implement this separation. Because of their unique characteristics, these two groups are omitted from our main empirical results.

3.2.3 Applying Portfolio Theory

To apply portfolio theory, the concept of a “return on investment” (ROI) must first be defined. Although YLL has already been chosen as the metric by which the impact of research funding is to be gauged, there are at least two issues in determining the relation between research expenditures and YLL that must be considered.

The first is whether or not any relation exists between the two quantities.
While the objectives of pure science do not always include practical applications that impact YLL, the fact that part of the NIH mission is to “enhance health, lengthen life, and reduce the burdens of illness and disability” suggests the presumption, at least by the NIH, that there is indeed a non-trivial relation between NIH-funded research and burden of disease. For the purposes of this study, and as a first approximation, we assume that YLL improvements are proportional to research expenditures. Of course, factors other than NIH research expenditures also affect YLL, including research from other domestic and international medical centers and institutes, spending in the pharmaceutical and biotechnology industries, public health policy, behavioral patterns, prosperity level and environmental conditions. Therefore, the YLL/NIH-funding relation is likely to be noisy, with confounding effects that may not be easily disentangled. The Discussion section examines this assumption in more detail and considers some possible alternatives.

The second issue is the significant time lag between research expenditures and observable impact on YLL. For example, Mosteller[62] cites a lag of 264 years, starting in 1601, for the adoption of citrus to prevent scurvy by the British merchant marine. More contemporary examples[26, 35, 39] cite lags of 17 to 20 years. We use shorter lags in this study both because of data limitations (our entire dataset spans only 29 years) and to reduce the impact of factors other than research expenditures on our measure of burden of disease (YLL). Any attempt to optimize appropriations to achieve YLL-related objectives must take this lag into account; otherwise the resulting optimized appropriations may not have the intended effects on subsequent YLL outcomes.

The impact of NIH-funded research on disease burden is likely to be spread out over several years after this intervening lag, given the diffusion-like process in which research results are shared in the scientific community. For simplicity, we hypothesized the same duration (p = 5 years) of this diffusion-like impact for all disease groups.
The lag q for each disease group was estimated by running linear regressions relating improvements in YLL over p = 5 years to NIH funding q years earlier and to real income, and then choosing the lag between 9 and 16 years that maximizes the R². Beyond this range, data limitations and other factors make it impossible to distinguish the impact of research funding from other confounding factors affecting YLL. The resulting lags are shown in Table 3.3.

This procedure is, of course, a crude but systematic heuristic for relating research funding to YLL outcomes. Alternatives include using a single fixed lag across all groups, simply assuming particular values for group-specific lags based on NIH mandates and experience, computing a time-weighted average YLL for each group with a weighting scheme corresponding to an assumed or estimated knowledge-diffusion rate for that group, or constructing a more accurate YLL return series by tracking individual NIH grants within each group to determine the specific impact on YLL (through new drugs, protocols, and other improvements in morbidity and mortality) from the award dates to the present. While the choice of lag is critical in determining the characteristics of the YLL return series and deserves further research, it does not affect the applicability of the overall analytical framework. While our procedure is surely imperfect, it is a plausible starting point from which improvements can be made.

Assuming constant impact of research funding on YLL over the duration of p years, the measure of the ROI that accrues to funds allocated in year t is then given by:

$$ R_{t+q} \equiv -\,\frac{\dfrac{1}{p}\displaystyle\sum_{i=0}^{p-1}\left(\mathrm{YLL}_{t+q+i} - \mathrm{YLL}_{t+q+i-1}\right)\times \mathrm{GDP}_{t+q+i}}{\mathrm{Appropriation}_{t}} \tag{3.2} $$

where the minus sign reflects the focus on decreases in YLL, and the multiplier $\mathrm{GDP}_{t+q+i}$ is per capita real gross domestic product (GDP) in year t+q+i, which is included to convert the numerator to a dollar-denominated quantity to match the denominator. This ratio’s units are then comparable to those of typical investment returns: date-(t+q) dollars of return per date-t dollars of investment.

Given the definition in equation (3.2) for the ROI of each of the disease groups, the “optimal” appropriation of funds among those groups must be determined, i.e., the appropriation that produces the best possible aggregate expected return on total research funding per unit risk. Denote by $\mathbf{R} \equiv [\,R_1\ R_2\ \cdots\ R_n\,]'$ the vector of returns of all n groups for a given appropriation date t (where time subscripts have been suppressed for notational simplicity), and denote by µ and Σ the vector of expected returns and the covariance matrix, respectively.
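As a worked illustration of equation (3.2), the following sketch recomputes the HLB return for 1986 from the figures reported in Table 3.4 (YLL for 1985 to 1990, per-capita GDP for 1986 to 1990, and the 1970 appropriation). The function and variable names are hypothetical and chosen only for this example; the alignment of each YLL change with the GDP figure of the later year follows the index t+q+i in the equation.

```python
# Worked example of equation (3.2) using the HLB figures reported in Table 3.4.
# All names here are illustrative; they do not come from the thesis.

P = 5  # duration of impact in years (p in the text)

# YLL for 1985..1990 and per-capita real GDP for 1986..1990 (Table 3.4)
yll = [19_741_993, 19_380_387, 19_015_838, 18_951_220, 18_086_872, 17_670_665]
gdp_per_capita = [29_443, 30_115, 31_069, 31_877, 32_112]
appropriation = 695_809_705  # HLB appropriation in funding year t = 1970 (lag q = 16)


def roi(yll_series, gdp_series, appropriation_t, p=P):
    """ROI per equation (3.2): minus the mean GDP-weighted change in YLL,
    divided by the appropriation made q years before the first change."""
    if len(yll_series) != p + 1 or len(gdp_series) != p:
        raise ValueError("need p + 1 YLL values and p GDP values")
    weighted_changes = [
        (yll_series[i + 1] - yll_series[i]) * gdp_series[i] for i in range(p)
    ]
    return -(sum(weighted_changes) / p) / appropriation_t


hlb_roi_1986 = roi(yll, gdp_per_capita, appropriation)
print(f"HLB ROI for 1986: {hlb_roi_1986:.1f}")
```

The result agrees with the ROI of 18.6 listed in Table 3.4 after rounding; small differences in the intermediate products relative to the table are presumably due to rounding of the published GDP figures.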
If the weights of the budget allocation among the groups are ω, the ROI for the entire portfolio of grants, denoted by $R_p$, is given by $R_p = \boldsymbol{\omega}'\mathbf{R}$, and its expected value and variance are ω′µ and ω′Σω, respectively. The objective function to be optimized is then given by the expected value minus some multiple of the variance, where the multiple reflects risk tolerance, and this quadratic function of ω is maximized using standard quadratic optimization techniques, subject to the constraint that the weights sum to 1.

In the mean-variance framework, we seek to find the best trade-offs between risk and expected return by varying the portfolio weights ω to trace out the locus of mean-variance combinations that cannot be improved upon, i.e., that are “efficient”. This set of efficient portfolios, also known as the “efficient frontier”, is formally defined as the curve in mean-variance (or mean-standard deviation) space corresponding to all portfolios with the highest level of expected return for a given level of variance. This efficient frontier defines the set of allocations that cannot be improved upon from a mean-variance perspective, and the optimal allocation is a single point on this frontier that is determined by the investor’s desired volatility level or risk tolerance.

More formally, an investor with standard mean-variance preferences is assumed to prefer portfolios with greater expected return and lower variance, with diminishing returns in each (so that progressively greater increments of expected return must be offered to the investor to induce him to accept the same increment of risk as the level of risk rises). Preferences of this type generate so-called “indifference curves” (non-intersecting curves in mean-variance space that trace out combinations of mean and variance among which an individual is indifferent) that are upward sloping and convex. The optimal portfolio for a given set of indifference curves is the tangency point T of the efficient frontier with the upper-leftmost indifference curve.
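The objective described in words at the start of this passage (expected value minus a multiple of the variance, with weights summing to 1) can be written out explicitly. The following is a sketch under stated assumptions: the symbol λ > 0 for the risk-aversion multiple and η for the Lagrange multiplier are introduced here for illustration and do not appear in the original text, and the non-negativity constraint used later in problem (3.3) is ignored.

```latex
% Mean-variance objective with a budget constraint (lambda, eta are assumed symbols)
\max_{\boldsymbol{\omega}} \;\; \boldsymbol{\omega}'\boldsymbol{\mu}
  - \frac{\lambda}{2}\,\boldsymbol{\omega}'\boldsymbol{\Sigma}\boldsymbol{\omega}
\qquad \text{subject to} \qquad \boldsymbol{\omega}'\boldsymbol{\iota} = 1 .

% First-order condition of the Lagrangian, with multiplier eta on the budget constraint
\boldsymbol{\mu} - \lambda\,\boldsymbol{\Sigma}\boldsymbol{\omega} - \eta\,\boldsymbol{\iota} = \mathbf{0}
\quad\Longrightarrow\quad
\boldsymbol{\omega}^{*} = \frac{1}{\lambda}\,\boldsymbol{\Sigma}^{-1}\left(\boldsymbol{\mu} - \eta\,\boldsymbol{\iota}\right),
\qquad
\eta = \frac{\boldsymbol{\iota}'\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu} - \lambda}
            {\boldsymbol{\iota}'\boldsymbol{\Sigma}^{-1}\boldsymbol{\iota}} .
```

Here ι denotes a vector of ones and Σ is assumed to be invertible. As λ grows the solution approaches the minimum-variance portfolio, which is proportional to Σ⁻¹ι; smaller values of λ tilt the weights toward groups with higher expected return.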
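To compute the efficient frontier, the following optimization problem must be solved. We maintain the following notational conventions: (1) all vectors are column vectors unless otherwise indicated; (2) matrix transposes are indicated by a prime superscript, hence ω′ is the transpose of ω; and (3) vectors and matrices are always typeset in boldface, i.e., $X$ and $\mu$ are scalars and $\mathbf{X}$ and $\boldsymbol{\mu}$ are vectors or matrices.

$$ \min_{\boldsymbol{\omega}} \;\; \boldsymbol{\omega}'\boldsymbol{\Sigma}\boldsymbol{\omega} \quad \text{subject to} \quad \boldsymbol{\omega}'\boldsymbol{\mu} \ge \mu_o , \quad \boldsymbol{\omega}'\boldsymbol{\iota} = 1 , \quad \boldsymbol{\omega} \ge \mathbf{0} \tag{3.3} $$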
where ι is an (n × 1)-vector of 1’s and $\mu_o$ is an arbitrary fixed level of expected return. By varying $\mu_o$ over a range of values and solving the optimization problem for each value, all the efficient allocations ω∗ may be tabulated, and the locus of points in mean-standard-deviation space corresponding to these efficient allocations is the efficient frontier. This so-called Markowitz portfolio optimization problem involves minimizing a quadratic objective function with linear constraints, which is a standard quadratic programming (QP) problem that can easily be solved analytically in some cases[58], and numerically in all other cases by a variety of efficient and stable solvers[37, 36].

One additional refinement is proposed to address the well-known issue of “corner solutions” (in which several components of ω∗ are 0) that often arise in the standard portfolio-optimization framework. While such extreme allocations may, indeed, be optimal with respect to the mean-variance criterion, they are more often the result of estimation error and outliers in the data[8]. Moreover, even in the absence of estimation error, mean-variance optimality may not adequately reflect other objectives such as social equity across disease groups or distance from the current status quo in allocation. To incorporate such considerations, a “regularization” technique is applied in which the objective function is penalized for allocations that are far away from the average allocation policy.
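The frontier-tracing procedure described above (solve the QP for a grid of target returns) can be sketched numerically as follows. This is a minimal illustration, not the thesis's implementation: it uses SciPy's general-purpose SLSQP routine rather than a dedicated QP solver, the function name `efficient_weights` is hypothetical, and the covariance matrix is built from the standard deviations in Table 3.3 under the strong, purely illustrative assumption of zero correlation between groups, so the weights it prints will not match Table 3.5. The optional `gamma` and `w_ref` arguments add a squared-Euclidean-distance penalty toward a reference allocation, in the spirit of the regularization just described.

```python
import numpy as np
from scipy.optimize import minimize


def efficient_weights(mu, sigma, mu_o, gamma=0.0, w_ref=None):
    """Minimise w' Sigma w + gamma * ||w - w_ref||^2
    subject to w' mu >= mu_o, sum(w) = 1 and w >= 0."""
    n = len(mu)
    # Default reference allocation is equal weights (an assumption of this sketch).
    w_ref = np.full(n, 1.0 / n) if w_ref is None else np.asarray(w_ref, dtype=float)

    def objective(w):
        return w @ sigma @ w + gamma * np.sum((w - w_ref) ** 2)

    constraints = [
        {"type": "eq", "fun": lambda w: np.sum(w) - 1.0},   # weights sum to 1
        {"type": "ineq", "fun": lambda w: w @ mu - mu_o},   # expected return >= mu_o
    ]
    result = minimize(objective, x0=np.full(n, 1.0 / n), method="SLSQP",
                      bounds=[(0.0, 1.0)] * n, constraints=constraints,
                      options={"ftol": 1e-9, "maxiter": 500})
    # Real use should inspect result.success and result.message before trusting x.
    return result.x


# Illustrative inputs: means and SDs from Table 3.3, ASSUMING zero correlations.
groups = ["AID", "CHD", "CNS", "DDK", "HLB", "ONC", "NMH"]
mu = np.array([-0.9, 5.1, -1.7, -0.8, 9.8, 0.5, 0.0])
sd = np.array([1.3, 3.8, 0.8, 1.5, 3.4, 1.3, 0.3])
sigma = np.diag(sd ** 2)

# Sweep the target return mu_o and record (target, portfolio SD, weights).
frontier = []
for mu_o in np.linspace(1.0, 9.0, 5):
    w = efficient_weights(mu, sigma, mu_o)
    frontier.append((float(mu_o), float(np.sqrt(w @ sigma @ w)), w))
    print(f"target {mu_o:4.1f}  sd {frontier[-1][1]:5.2f}  weights {np.round(w, 2)}")
```

Plotting the recorded standard deviations against the target returns gives an approximation to the efficient frontier for these illustrative inputs; passing `gamma > 0` with a reference allocation pulls each solution toward that allocation.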
Specifically, we consider the following regularized version of the standard portfolio-optimization problem:

$$ \min_{\boldsymbol{\omega}} \;\; \boldsymbol{\omega}'\boldsymbol{\Sigma}\boldsymbol{\omega} + \gamma\,\lVert \boldsymbol{\omega} - \boldsymbol{\omega}_{\mathrm{NIH}} \rVert^{2} \quad \text{subject to} \quad \boldsymbol{\omega}'\boldsymbol{\mu} \ge \mu_o , \quad \boldsymbol{\omega}'\boldsymbol{\iota} = 1 , \quad \boldsymbol{\omega} \ge \mathbf{0} \tag{3.4} $$

This formulation is essentially a dual-objective optimization problem in which the first objective is to minimize the portfolio’s variance (ω′Σω), the second objective is to minimize the difference from the average NIH allocation policy ($\lVert \boldsymbol{\omega} - \boldsymbol{\omega}_{\mathrm{NIH}} \rVert^{2}$), and the non-negative parameter γ determines the relative importance of these two objectives. Larger values of γ yield optimal weights that are closer to the average NIH allocation but which correspond to portfolios with greater volatility, and smaller values of γ yield optimal weights that may be more concentrated among a smaller subset of groups, but which imply lower portfolio volatility.

Group  Lag   Mean    SD    Min    Med    Max  Skewness  Kurtosis
AID     10   -0.9   1.3   -4.3   -0.8    2.4      -0.3       5.2
CHD     12    5.1   3.8    0.2    4.4   11.1       0.1       1.4
CNS     10   -1.7   0.8   -3.1   -1.7   -0.4       0.1       1.9
DDK     11   -0.8   1.5   -3.8   -0.8    2.9       0.2       3.6
HLB     16    9.8   3.4    4.3    9.6   18.6       0.7       3.6
ONC     16    0.5   1.3   -2.3    1.2    2.1      -0.7       2.2
NMH      9    0.0   0.3   -0.8    0.0    0.6      -0.1       3.2

Table 3.3: Return summary statistics. Summary statistics for the ROI of disease groups, in units of years (for the lag length) and per-capita-GDP-denominated reductions in YLL between years t and t+4 per dollar of research funding in year t−q, based on historical ROI from 1980 to 2003.

Year                                   1985            1986            1987            1988            1989            1990
YLL                              19,741,993      19,380,387      19,015,838      18,951,220      18,086,872      17,670,665
∆                                                   361,605         364,549          64,617         864,348         416,207
GDP/Capita ($)                                       29,443          30,115          31,069          31,877          32,112
GDP-Weighted ∆ ($)                           10,646,748,753  10,978,408,090   2,007,589,676  27,552,827,900  13,365,233,032

Mean GDP-Weighted YLL ∆ ($)      12,910,161,490
Lag (years)                      16
Funding year                     1970
Appropriation ($)                695,809,705
ROI                              18.6

Table 3.4: ROI example. An example of the ROI calculation for HLB from 1986.

3.3 Results

3.3.1 Summary Statistics

Summary statistics of the ROI for the period 1980–2003 are presented in Table 3.3. In Table 3.4 we provide an example of the ROI calculation for HLB for 1986, when the return was 18.6. Large differences in mean ROI for different Institutes are evident in Table 3.3, ranging from small negative values (e.g., −1.7 for CNS) to large positive values (e.g., 9.8 for HLB). Large differences in standard deviation also exist, ranging from 0.3 for NMH to 3.8 for CHD.

[Figure 3-5, four panels in mean-standard deviation space: (a) With Alzheimer effect, gamma = 0; (b) With Alzheimer effect, gamma = 5; (c) Without Alzheimer effect, gamma = 0; (d) Without Alzheimer effect, gamma = 5. Legend: Efficient, AID, CHD, CNS, DDK, HLB, ONC, NMH, NIH Avg, 1/n, NIH-Var, NIH-Mean, Min-Var, Eff-25%, Eff-50%, Eff-75%.]

Figure 3-5: Efficient frontiers. Efficient frontiers for (a) all groups except HIV and AMS, γ = 0; (b) all groups except HIV and AMS, γ = 5; (c) all groups except HIV and AMS without the dementia effect, γ = 0; and (d) all groups except HIV and AMS without the dementia effect, γ = 5; based on historical ROI from 1980 to 2003.

3.3.2 Efficient Frontiers

In Figure 3-5, efficient frontiers for the single- and dual-objective optimization problems are plotted in mean-standard deviation space for the 7-group cases with and without taking into account the dementia effect.
For each of these frontiers, the mean-standard deviation points for the following funding allocations are also plotted:
(i) the historical average NIH allocation for years 1996–2005;
(ii) the equal-weighted (1/n) allocation;
(iii) the minimum-variance allocation;
(iv) the allocation on the efficient frontier that has the same mean as the average NIH allocation (the “NIH-mean” allocation);
(v) the allocation on the efficient frontier that has the same variance as the average NIH allocation (the “NIH-var” allocation);
(vi) the allocation on the efficient frontier that is 25% of the distance from the minimum-variance allocation to the maximum expected-return allocation;
(vii) the allocation on the efficient frontier that is 50% of the distance from the minimum-variance allocation to the maximum expected-return allocation;
(viii) the allocation on the efficient frontier that is 75% of the distance from the minimum-variance allocation to the maximum expected-return allocation.

The region bounded by (i), (iv), (v) and the efficient frontier is of special interest because all portfolios in this region offer lower variance, higher expected return, or both when compared to the average NIH allocation; hence, from a mean-variance perspective, such allocations are unambiguously preferable. These allocations are called “dominating” portfolios relative to the average NIH allocation (i). Figure 3-5a shows that a number of the disease groups appear to be concentrated in a relatively low-risk sector of the risk/reward universe, which may be evidence of active variance-minimization strategies by various stakeholders.

A sensitivity analysis is conducted by estimating the efficient frontier with (Figure 3-5a) and without the dementia effect (Figure 3-5c). Table 3.5 contains the portfolio weights corresponding to Figures 3-5a and 3-5c, respectively.
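The notion of a “dominating” portfolio used above can be made concrete with a small check in mean-standard deviation space. The sketch below is illustrative only: the helper names `mean_sd` and `dominates` are hypothetical, the benchmark and candidate points are made-up numbers, and the 1/n example uses the Table 3.3 means and standard deviations under the assumption of zero correlation between groups, which the thesis does not make.

```python
import numpy as np


def mean_sd(weights, mu, sigma):
    """Expected return and standard deviation of a weighted portfolio."""
    w = np.asarray(weights, dtype=float)
    return float(w @ mu), float(np.sqrt(w @ sigma @ w))


def dominates(candidate, benchmark, tol=1e-12):
    """True if `candidate` (mean, sd) has no lower mean and no higher sd than
    `benchmark`, and is strictly better on at least one of the two."""
    c_mean, c_sd = candidate
    b_mean, b_sd = benchmark
    no_worse = c_mean >= b_mean - tol and c_sd <= b_sd + tol
    strictly_better = c_mean > b_mean + tol or c_sd < b_sd - tol
    return no_worse and strictly_better


# Equal-weighted (1/n) point from Table 3.3 inputs, ASSUMING zero correlations.
mu = np.array([-0.9, 5.1, -1.7, -0.8, 9.8, 0.5, 0.0])
sd = np.array([1.3, 3.8, 0.8, 1.5, 3.4, 1.3, 0.3])
sigma = np.diag(sd ** 2)
equal_weight_point = mean_sd(np.full(7, 1.0 / 7), mu, sigma)

# Hypothetical (mean, sd) points, used only to exercise the comparison.
benchmark = (1.8, 1.0)
higher_mean_lower_risk = (2.0, 0.9)   # dominates the benchmark
higher_mean_higher_risk = (2.5, 1.4)  # a trade-off, so it does not dominate

print(equal_weight_point)
print(dominates(higher_mean_lower_risk, benchmark))
print(dominates(higher_mean_higher_risk, benchmark))
```

A candidate that offers more return only at the cost of more risk is not dominating in this sense; choosing between such points requires a view on risk tolerance.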
                Benchmarks    Single-Objective Portfolios (in %)                     Dual-Objective Portfolios (γ = 5) (in %)
Group     NIH Avg      1/n    NIH-Var NIH-Mean  Min-Var  Eff-25%  Eff-50%  Eff-75%    NIH-Var NIH-Mean  Min-Var  Eff-25%  Eff-50%  Eff-75%

All Groups:
AID             8       14          0        0        0        0        0        0          0        3        5        0        0        0
CHD             7       14         24       11        0       13       27       36         18       11        7       18       28       34
CNS            14       14          0        0       25        0        0        0          7       15       19        8        0        0
DDK            10       14          0        0        0        0        0        0          0        5        8        0        0        0
HLB            17       14         23       11        0       13       32       55         24       14        9       23       39       59
ONC            27       14         53       28       16       33       42        9         34       33       32       34       31        7
NMH            16       14          0       50       58       41        0        0         17       20       21       17        2        0

Without Dementia:
AID             8       14          0        0        0        0        0        0          0        3        5        0        0        0
CHD             7       14         23       11        0       14       28       36         17       11        8       18       28       34
CNS            14       14          2       32       41       27        0        0         12       18       19       11        0        0
DDK            10       14          0        0        0        0        0        0          0        5        8        0        0        0
HLB            17       14         23       12        0       16       34       55         23       13        9       24       40       60
ONC            27       14         52       37       17       43       39        9         33       32       31       33       31        6
NMH            16       14          0        8       41        0        0        0         15       19       20       14        1        0

Table 3.5: Portfolio weights. Benchmark, single- and dual-objective optimal portfolio weights (in percent), based on historical ROI from 1980 to 2003.

The top left sub-panel of Table 3.5 shows that the single-objective optimization does yield sparse weights as expected. For example, the minimum-variance portfolio allocates to only three groups: 58% to NMH, 25% to CNS, and 16% to ONC. By minimizing variance, irrespective of the mean, this portfolio allocates funding to groups with the least variability in YLL improvements. The efficient-25% portfolio allocates non-zero weights to four groups (41% to NMH, 33% to ONC, 13% to HLB, and 13% to CHD), and yields 26% better expected return with 28% less risk. With still more emphasis on expected return, the efficient-50% portfolio gives non-zero weights only to three successful groups: 42% to ONC, 32% to the higher risk, higher expected-return HLB, and 27% to CHD. This portfolio has 172% higher expected return but only 27% more risk than the NIH portfolio.
The efficient-75% portfolio gives an even higher weight of 55% to HLB, 36% to CHD, and 9% to ONC, yielding 318% higher expected return and 148% more risk, a diminishing risk-adjusted expected return as compared to portfolios with lower volatility. Given the greater emphasis on expected return for this portfolio, it is not surprising to see HLB getting a bigger role due to its apparent historical success in reducing YLL. Of course, whether or not past success is indicative of comparable future success hinges on the science and associated translational efforts underlying the diseases covered by HLB. This underscores the importance of incorporating research and clinical insights into the funding allocation process, especially within a systematic framework such as portfolio theory.

However, the dementia effect may underestimate the performance of the CNS disease group, hence the lower panel of Table 3.5 reports corresponding optimal-portfolio results without the dementia effect. In the single-objective case, the efficient-50% and efficient-75% portfolios are still sparse, with non-zero weights in 3 groups, while the lower-risk efficient-25% portfolio is less concentrated, with non-zero weights in 4 groups and significant weight (27%) in the CNS group.

Table 3.5 also contains the optimal portfolios for the dual-objective case (with γ = 5) in the right sub-panels (see Figures 3-5b and 3-5d). These cases correspond to portfolios that trade off closeness to the average NIH allocation policy with better risk-adjusted expected returns. Now we observe that for both the upper and lower sub-panels, corresponding to the 7-group optimization with and without the dementia effect, respectively, the weights are less concentrated than in the single-objective case. For example, the minimum-variance portfolio without the dementia effect now allocates funding to all the groups, with weights ranging from 5% to 31%. However, even in this case, the efficient-75% portfolio is still extreme, allocating weights only to HLB, CHD and ONC. Therefore, special care must be exercised in selecting the appropriate point on the efficient frontier. We also observe from the NIH-var or NIH-mean portfolios that slight changes to the average NIH policy apparently yield superior performance in mean-standard deviation space (28% to 89% relative improvement, depending on the assumptions).
3.4 Discussion

Portfolio theory provides a systematic framework for determining optimal research funding allocations based on historical return on investment, variance, and correlation between appropriations and reductions in disease burden. The optimization results suggest that significant YLL improvements with respect to a mean-variance criterion may be possible through funding re-allocation. To our knowledge, this is the first time such an approach has been empirically implemented in this domain.

However, our findings must be qualified in at least three respects: (1) the use of YLL as a measure of burden of disease, which is clearly incomplete and less than ideal; (2) the definition of ROI and the challenges of relating research expenditures to subsequent outcomes such as burden of disease; and (3) the known limitations of portfolio theory. While each of these qualifications can be addressed to varying degrees through additional data and analysis, the empirical conclusions are likely to depend critically on the nature of their resolution. In this section, we provide a short synopsis of these qualifications, and also consider other objections to this framework and directions for future research.

YLL captures only the most extreme form of disease burden, and other measures such as disability-adjusted or quality-adjusted life years are clearly preferable. However, time series histories for such measures are currently unavailable; hence YLL is the most natural starting point for gauging the impact of biomedical research funding, and is directly aligned with the NIH mission to “lengthen life”.

As a measure of disease burden, YLL captures only lethal illness by definition; chronic illness enters the optimization process only indirectly, mortality in the young is more heavily weighted than that of the elderly, and quality of life is not captured at all. The choice of YLL is motivated by several factors: long time-series observations of YLL are readily available, they cover a large population, and they address the entire spectrum of diagnoses categorized under the ICD.
Broader measures of burden of disease such as disability-adjusted life years (DALY)[38] and quality-adjusted life years (QALY)[68, 22] have been proposed, but historical time series for such measures are not yet available. As better measures are developed (e.g., incidence, prevalence, physician visits, hospitalization, DALY, QALY), portfolio-optimization methods may be applied to them as well through appropriately defined “returns”. Should datasets covering not only age and cause of death but also ante-mortem symptoms become available, mean-variance-efficient allocations would likely place significant weight on improvements in the care of less-lethal chronic diseases.

Even if YLL is an appropriate measure of disease burden, our definition of ROI can also be challenged as being imprecise and ad hoc in several respects. NIH funding is typically focused on basic research rather than translational efforts; therefore, NIH spending may not be as directly related to subsequent YLL improvements. We have not accounted for other expenditures that may also affect YLL, and to the extent that NIH appropriations are systematically used to complement private spending to allocate total funding across diseases more fairly[76], the relation between NIH funding and subsequent YLL improvements may be even noisier, and may require modelling private-sector expenditures as a separate but complementary portfolio-optimization problem with an objective function and constraints that are linked to those of the NIH. Also, the standard portfolio-optimization framework implicitly assumes a constant multiplicative relation between dollars invested today and dollars returned tomorrow (so that doubling the investment will typically double the dollars returned by that investment), whereas the return to biomedical investments may be non-linear. In addition, translational research takes time and significant non-NIH resources, further blurring the relation between NIH allocations and subsequent changes in YLL. Finally, other factors may contribute to YLL improvements, including changes in cultural norms (including consumption of alcohol and cigarettes), economic conditions (such as recessions vs. expansions), and public policy (such as vaccine programs and mandates for automobile, home, and workplace safety). While all of these qualifications have merit, they are not insurmountable obstacles and can likely be addressed through additional data collection and more sophisticated metrics, perhaps along the lines of Porter[67] or Lane and Bertuzzi[51].
Moreover, the portfolio-optimization approach provides a useful conceptual framework for formulating funding allocation decisions systematically, even if its empirical implications are imprecise.

The estimates of q were an initial attempt to link appropriation with outcome in a systematic and non-discretionary manner, but they were derived heuristically from regulatory, appropriation, and epidemiological data which may not be stationary or predictive. For example, if the Food and Drug Administration’s capacity for reviewing new-drug applications is held constant and applications double, substantial increases in regulatory queuing would be expected, even with the added resources generated by the Prescription Drug User Fee Act. Finally, in converting changes in YLL to dollar amounts, per-capita real GDP was used as the “conversion factor” irrespective of age, despite the fact that children and retired individuals are economically less active. While these caveats highlight the imprecision with which the impact of research spending is measured, they also provide direction for developing better metrics. In particular, the underlying science of each grant implies a particular set of dynamics for translation and YLL impact, and with more sophisticated models of such dynamics, the returns to fundamental research should be measurable with greater accuracy.

Even within the exact domain for which it was developed, portfolio theory has several well-known limitations, of which the most obvious is the possibility that the mean-variance criterion may not, in fact, be the appropriate objective function to be optimized. While there is little disagreement that higher expected ROI is preferable to its alternative, the trade-off between expected ROI and risk is fraught with subtleties involving specific psychological, perceptual, and behavioral mechanisms of individuals and groups. Because of these considerations, mean-variance analysis is often considered an approximation to a much more complicated reality: a starting point for investment allocation decisions, not the final answer.

Another known limitation of portfolio theory is the fact that the input parameters (µ, Σ) must be estimated from historical data, and estimation error in these parameter estimates can lead to portfolios that are unstable and sub-optimal[55].
One common approach to addressing this problem in the financial context is to employ prior information regarding the input parameters, thereby reducing the dependence on historical data. Using Bayesian methods, expert opinions regarding the statistical properties of the individual asset returns can be incorporated into the portfolio optimization process[13, 5, 12].

One limitation that is unique to the current application is the fact that portfolio theory is silent on which mean-variance-optimal portfolio to select. In the financial context, the existence of a riskless investment (e.g., U.S. Treasury bills) implies that one unique portfolio on the efficient frontier will be desired by all investors, the so-called “tangency” portfolio[73]. Because there is no analog to a riskless investment in biomedical research, the notion of a tangency portfolio does not exist in this context. Therefore, decision makers must first determine society’s collective preferences for risk and return with respect to changes in YLL before a unique solution to the portfolio-optimization problem can be obtained, i.e., they must agree on a societal “utility function” for trading off the risks and rewards of biomedical research. This critical step is a pre-requisite to any formal analysis of funding allocation decisions, and underscores the need for integration of basic science with biomedical investment performance analysis and science policy.

Such integration will require close and ongoing collaboration between scientists and policymakers to determine the appropriate parameters for the funding allocation process, and to incorporate prior information and qualitative judgments[14] regarding likely research successes, social priorities, policy objectives and constraints, and hidden correlations due to non-linear dependencies not captured by the data. In particular, it is easy to imagine contexts in which funding objectives can and should change quickly in response to new environmental threats or public-policy concerns. However, such pressing needs must be balanced against the disruptions caused by large unanticipated positive or negative shifts in research funding, which can be severe due to the significant adjustment costs implicit in biomedical research[32].
Although the end result of collaborative discussion may fall short of a well-defined objective function that yields a clear-cut optimal portfolio allocation, the portfolio-optimization process provides a transparent and rational starting point for such discussions, from which several insights regarding the complex relation between research funding and social outcomes are likely to emerge.

Any repeatable and transparent process for making funding allocation decisions, especially one that involves criteria other than peer-review-based academic excellence, will, understandably, be viewed with some degree of suspicion and contempt by the scientific community. However, if one of the goals of biomedical research is to reduce the burden of disease, some tension between academics and public policy may be unavoidable. Moreover, in the absence of a common framework for evaluating the trade-offs between academic excellence and therapeutic potential, other approaches such as political earmarking[2] are being proposed, which may be even less palatable from the scientific perspective.

In an environment of tightening budgets and increasing oversight of appropriations, portfolio theory offers scientists, policymakers, and regulators (all of whom are, in effect, research portfolio managers) a rational, systematic, transparent, and reproducible framework in which to explicitly balance and trade off expected benefits with potential risks while accounting for correlation among multiple research agendas and real-world constraints in allocating scarce resources. Most funding agencies and scientists have already been making such trade-offs informally and heuristically; there may be additional benefits to making such decisions within an explicit framework based on standardized and objective metrics.

One of the most significant benefits from adopting such a framework may be the reduction of uncertainty surrounding future funding-allocation decisions, which would greatly enhance the ability of funding agencies and scientists to plan for the future and better manage their respective budgets, research agendas, and careers. By approaching funding decisions in a more analytical fashion, it may be possible to improve their ultimate outcomes while reducing the chances of unintended consequences.
Chapter 4

Impact of model misspecification and risk constraints on market

In Chapters 1 and 2 we studied the optimal trading strategy of a risk-averse investor who faces risk constraints and model misspecification. In this chapter we study how risk constraints and fear of model misspecification affect the statistical properties of the market returns. In particular, we study their effect on the risk premium, the volatility and the liquidity of the market.

We find that the statistical properties of the market change. In particular, variability of the risk constraints leads to increasing risk premium, increasing volatility and increasing illiquidity of the market. In addition, tightening of these constraints also leads to increasing risk premium, increasing volatility and increasing illiquidity. Moreover, we find that variability in risk aversions along with risk constraints also leads to a more concave pricing function of the aggregate supply for the market, implying increasing risk premium, increasing volatility and increasing illiquidity. Finally, we explore how the properties of the asset returns change when the investors do not completely trust their models. We find that model misspecification is another source of increasing risk premium, endogenous volatility and increasing illiquidity.

In the rest of this chapter, we first review the relevant literature. We then discuss the setup of the model, and we analyze the impact on the market returns of varying risk constraints across agents, varying risk aversions across agents and varying degrees of fear of model misspecification across agents. Finally, we summarize our results.

4.1 Literature review

In the literature there are papers that assume heterogeneity along three dimensions: the risk aversion coefficients of the agents, the constraints the agents face and the beliefs of the agents.

Danielsson and Zigrand [81] assume that the agents have the same beliefs and face the same constraints but differ in their risk aversion coefficients. They study the economic implications of a Value-at-Risk based regulatory system by analyzing a two-period multi-asset general equilibrium model with agents heterogeneous in risk preferences and wealth. They assume that the agents have CARA utilities and they argue that there will be endogenous volatility and increasing risk premium due to the fact that “... risk will have to be transferred from the more risk-tolerant to the more risk-averse”. As we will prove, this is not true, nor is it necessary for having increasing risk premium, endogenous volatility and increasing illiquidity.

Kogan and Uppal [50] show how to analyze the equilibrium prices and policies in an economy with incomplete financial markets and a stochastic investment opportunity set, where the agents face portfolio constraints. They study a general equilibrium exchange economy with multiple agents, who differ in their risk aversion coefficients and face borrowing constraints, while having the same beliefs.

Brumm et al. [21] consider a general equilibrium infinite-horizon economy in which the agents have heterogeneous risk preferences and face the same constraints, while having the same beliefs. They find that the presence of collateral constraints leads to strong excess volatility and that a regulation of margin requirements potentially has stabilizing effects.

Then, there are papers with different prior beliefs not due to asymmetric information among the agents.
Geanakoplos [33] assumes that the agents have different priors (optimists, pessimists) but the same risk aversion coefficients and wealth, and they
face identical collateral constraints. He studies how these constraints determine an equilibrium leverage and how this leverage changes over time, leading to crashes and boom periods, the so-called leverage cycles.

Chen, Hong and Stein [25] study what happens to the price of a risky asset when there are investors with heterogeneous priors who face short sales constraints. The idea that short sales constraints increase the prices of risky assets when the investors have heterogeneous beliefs is due to Lintner [52] and Miller [61]. Chen, Hong and Stein show that greater dispersion of beliefs leads to even higher prices.

Finally, Hansen and Sargent [41] study a framework where the agents have a common approximating model, but differ in the degree of mistrust of the model. They find that agents' caution in responding to concerns about model misspecification can raise the prices assigned to macroeconomic risks.

We will see how risk constraints affect the statistical properties of the market, in particular the risk premium, volatility and liquidity of the market. We will first study the case where the investors differ in the constraints they face and/or their risk aversion coefficients, and then the case where they mistrust the model of asset payoffs and the mistrust varies among the investors.

4.2 Analysis

4.2.1 Model setup

We assume we have $H$ mean-variance single-period optimizers with heterogeneous risk aversions, risk constraints and wealth. Each agent can invest in the market and at the risk-free rate at $t = 0$. We assume that the risk-free rate is exogenously given and that the market is modeled as a risky asset with stochastic payoff at $t = 1$. There are also noise traders. We do not model their utility explicitly; we only assume that they are hit by random liquidity shocks and that they submit random market orders at time $t = 0$. Equivalently, the supply of the risky asset is stochastic.
Each agent faces a risk constraint, that is, a constraint on his wealth volatility, of the form $|\theta_h| \le L_h W_{h0}$, where
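$\theta_h$ is the position of agent $h$ in the market, $W_{h0}$ is the initial wealth of agent $h$ and $L_h$ determines the tightness of the risk constraint that agent $h$ faces.

Each agent's wealth at time $t = 1$ is given by:

$$W_{h1} = d\,\theta_h + (W_{h0} - q\,\theta_h) R_f$$

where $d$ is the payoff of the risky asset (market), $\theta_h$ is the number of shares, $q$ is the price of the risky asset and $R_f$ is the risk-free rate. The expected return on wealth is:

$$E(R) = \frac{\hat{\mu}\,\theta_h + (W_{h0} - q\,\theta_h) R_f}{W_{h0}}$$

where $\hat{\mu}$ is the expected payoff of the risky asset. We also have:

$$\mathrm{var}(R) = \hat{\sigma}^2 \frac{\theta_h^2}{W_{h0}^2}$$

where $\hat{\sigma}$ is the volatility of the payoff of the risky asset. Each agent solves the following optimization problem:

$$\begin{aligned} \underset{\theta_h}{\text{maximize}} \quad & (\hat{\mu} - q R_f)\,\frac{\theta_h}{W_{h0}} - \frac{1}{2}\,\gamma_h \hat{\sigma}^2 \left(\frac{\theta_h}{W_{h0}}\right)^2 \\ \text{subject to} \quad & |\theta_h| \le L_h W_{h0} \end{aligned}$$

where $\gamma_h$ is the agent's risk aversion coefficient. By solving the KKT conditions we find:

$$\theta_h^{opt} = \frac{\hat{\mu} - q R_f}{\frac{\gamma_h}{W_{h0}}\,\hat{\sigma}^2 (1 + \lambda_h)} \tag{4.1}$$

where $1 + \lambda_h = \max\left(1, \frac{|\hat{\mu} - q R_f|}{L_h \gamma_h \hat{\sigma}^2}\right)$.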
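As an illustration of equation 4.1, the following minimal Python sketch evaluates the constrained position for a single agent. The function name and all parameter values are hypothetical and chosen only to show one case where the constraint is slack and one where it binds.

```python
def optimal_position(mu, q, rf, sigma, gamma, wealth, L):
    """Constrained mean-variance position following equation 4.1."""
    excess = mu - q * rf  # expected excess payoff per share
    # 1 + lambda_h exceeds 1 exactly when the risk constraint binds
    one_plus_lambda = max(1.0, abs(excess) / (L * gamma * sigma**2))
    return excess / ((gamma / wealth) * sigma**2 * one_plus_lambda)


# Hypothetical parameter values, for illustration only.
mu, rf, sigma, gamma, wealth, L = 1.5, 1.0, 0.5, 2.0, 10.0, 0.3

slack = optimal_position(mu, 1.4, rf, sigma, gamma, wealth, L)    # interior solution
binding = optimal_position(mu, 1.0, rf, sigma, gamma, wealth, L)  # capped at L * wealth
short = optimal_position(mu, 2.0, rf, sigma, gamma, wealth, L)    # capped at -L * wealth
```

With these numbers the position cap is $L_h W_{h0} = 3$ shares, so a sufficiently low price pushes the agent to the cap and a sufficiently high price pushes him to the opposite cap.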
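The solution can also be written as:

$$\theta_h^{opt} = \begin{cases} \dfrac{\hat{\mu} - q R_f}{\frac{\gamma_h}{W_{h0}}\,\hat{\sigma}^2} & \text{if } \dfrac{\hat{\mu} - L_h \gamma_h \hat{\sigma}^2}{R_f} \le q \le \dfrac{\hat{\mu} + L_h \gamma_h \hat{\sigma}^2}{R_f} \\[2ex] L_h W_{h0} & \text{if } q \le \dfrac{\hat{\mu} - L_h \gamma_h \hat{\sigma}^2}{R_f} \\[2ex] -L_h W_{h0} & \text{if } q \ge \dfrac{\hat{\mu} + L_h \gamma_h \hat{\sigma}^2}{R_f} \end{cases} \tag{4.2}$$

A competitive equilibrium is a set of portfolios $(\theta_1, \cdots, \theta_H)$ and a price $q$ such that:

• Markets clear
• Each agent's portfolio is optimal

The market clearing condition implies that:

$$\sum_{h \in H} \theta_h^{opt} = \theta_\alpha \;\Rightarrow\; q = \frac{\hat{\mu} - \Psi \hat{\sigma}^2 \theta_\alpha}{R_f}$$

where $\theta_\alpha$ is the aggregate supply of the risky asset and

$$\frac{1}{\Psi} \equiv \sum_{h \in H} \frac{1}{\frac{\gamma_h}{W_{h0}} (1 + \lambda_h)}$$

We will consider two special cases:

• Constraints vary across the agents
• Risk aversion relative to wealth varies across the agents

4.2.2 Varying constraints

In the first case, we assume that the agents face heterogeneous constraints. In particular we assume that the parameter $\hat{L}_h \equiv L_h W_{h0}$ varies across the agents, while $\frac{\gamma_h}{W_{h0}}$ is constant and equal to $\gamma$. Without loss of generality, we assume that $\hat{L}_1 \le \hat{L}_2 \le \cdots \le \hat{L}_H$.

From the market clearing condition, we have that $\Psi \theta_\alpha = \frac{\hat{\mu} - q R_f}{\hat{\sigma}^2}$. In addition from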
equation 4.1, we have that:

$$\frac{\hat{\mu} - q R_f}{\hat{\sigma}^2} = \theta_h^{opt}\,\frac{\gamma_h}{W_{h0}}\,(1 + \lambda_h) = \theta_h^{opt}\,\gamma\,(1 + \lambda_h)$$

Therefore, $\Psi \theta_\alpha = \theta_h^{opt}\,\gamma\,(1 + \lambda_h)$ for all $h \in H$. We perform a sensitivity analysis for two cases:

• Keep $\hat{L}_h$ constant and change $\theta_\alpha$. By changing the aggregate supply, we find that:
  – $\Psi \theta_\alpha$ is a piecewise linear convex increasing function of the aggregate supply.
  – Its slope is $\frac{\gamma}{H - i + 1}$ until the constraint binds for agent $i$. In other words, at each point it is equal to $\gamma$ over the number of agents for whom the constraint is not binding yet.
  – The constraint binds for agent 1 when $\theta_{\alpha,1} = H \hat{L}_1$ and for agent $i$ when $\theta_{\alpha,i} = \theta_{\alpha,i-1} + (H - i + 1)(\hat{L}_i - \hat{L}_{i-1})$. These are the kink points in Figures 4-1, 4-2, 4-3.
  – When the aggregate supply is greater than $\theta_\alpha = \sum_{h \in H} \hat{L}_h$ there is no equilibrium, since in that case all the agents are constrained in their positions, they cannot buy any more shares and therefore the market cannot clear.
  – Since $q = \frac{\hat{\mu} - \Psi \hat{\sigma}^2 \theta_\alpha}{R_f}$, we see that the pricing function is a piecewise linear concave decreasing function of the aggregate supply, as we see in Figures 4-1, 4-2, 4-3.
  – Therefore, variability of the constraints leads to increasing risk premium, increasing volatility and increasing illiquidity, since a small change in the aggregate supply (a small liquidity shock by the noise traders) leads to a larger change in the price of the risky asset compared to the case where there is no variability in the constraints the different agents face. Figure 4-1 shows the pricing function when the agents face the same constraints and when they face variable constraints. Figure 4-3 also shows the pricing function of the risky asset when the agents face two sets of constraints with the same mean but with different variability.

• Keep $\theta_\alpha$ constant and change $\hat{L}_h$. As we see in Figure 4-2, as we tighten the constraints, the pricing function becomes more concave. Therefore, tightening of the constraints leads to increasing risk premium, increasing volatility and increasing illiquidity.

Figure 4-1: Price of the risky asset as a function of the aggregate market supply under varying constraints. [Plot not reproduced; horizontal axis: aggregate supply, vertical axis: price; series: "Variability in constraints" and "Same constraints".] We assume that we have 5 agents with the same risk aversion coefficients. The red plot assumes the same L = 30 for all the agents, while the blue assumes L to be different across the agents: L1 = 10, L2 = 20, L3 = 30, L4 = 40, L5 = 50.
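The kink points and the piecewise linear pricing function described in this section can be sketched numerically as follows. This is a minimal illustration that assumes a common normalized risk aversion `gamma`; the values of `gamma`, `mu`, `sigma` and `rf` are hypothetical, and only the constraint levels are taken from the caption of Figure 4-1.

```python
import numpy as np


def kink_points(L_hat):
    """Aggregate supplies at which each agent's constraint starts to bind."""
    L_sorted = np.sort(np.asarray(L_hat, dtype=float))
    H = len(L_sorted)
    # Closed form of the recursion in the text: agents before i sit at their
    # caps, the remaining H - i agents (0-based i) all hold L_sorted[i].
    return np.array([L_sorted[:i].sum() + (H - i) * L_sorted[i] for i in range(H)])


def price_varying_constraints(theta_alpha, L_hat, gamma, mu, sigma, rf):
    """Equilibrium price q = (mu - Psi * sigma^2 * theta_alpha) / rf."""
    L_sorted = np.sort(np.asarray(L_hat, dtype=float))
    H = len(L_sorted)
    if not 0.0 <= theta_alpha < L_sorted.sum():
        raise ValueError("no equilibrium: supply is not below the total capacity")
    n_bound = int(np.searchsorted(kink_points(L_sorted), theta_alpha, side="right"))
    # Unconstrained agents share the residual supply equally; x = Psi * theta_alpha.
    x = gamma * (theta_alpha - L_sorted[:n_bound].sum()) / (H - n_bound)
    return (mu - x * sigma**2) / rf


# Constraint levels from Figure 4-1; the remaining values are assumptions.
varied, same = [10, 20, 30, 40, 50], [30] * 5
gamma, mu, sigma, rf = 0.01, 1.5, 1.0, 1.0

kinks = kink_points(varied)
q_varied = price_varying_constraints(100.0, varied, gamma, mu, sigma, rf)
q_same = price_varying_constraints(100.0, same, gamma, mu, sigma, rf)
```

For supplies below the first kink the two configurations give the same price; beyond it the price under heterogeneous constraints falls faster, which is the concavity discussed above.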
Figure 4-2: Price of the risky asset as a function of the aggregate market supply under tightening constraints. [Plot not reproduced; horizontal axis: aggregate supply, vertical axis: price; series: "Different constraints Lh" and "Constraints reduced by a factor 1/5".] We assume that we have 5 agents with the same risk aversion coefficients. The blue plot assumes L to be different across the agents: L1 = 10, L2 = 20, L3 = 30, L4 = 40, L5 = 50, and the red assumes that each Li is reduced by 20%.

Figure 4-3: Price of the risky asset as a function of the aggregate market supply with less variable constraints. [Plot not reproduced; horizontal axis: aggregate supply, vertical axis: price; series: "Different constraints Lh" and "Less variable constraints by a factor of 2".] We assume that we have 5 agents with the same risk aversion coefficients. The blue plot assumes L to be different across the agents: L1 = 10, L2 = 20, L3 = 30, L4 = 40, L5 = 50, and the red assumes that L1 = 20, L2 = 25, L3 = 30, L4 = 35, L5 = 40.
4.2.3 Varying risk aversions

So far we have assumed that the agents have heterogeneous constraints but the same normalized risk aversions. Now, we assume that the risk aversion relative to wealth $\hat{\gamma}_h \equiv \frac{\gamma_h}{W_{h0}}$ varies across the agents, while the parameters $\hat{L}_h$ are constant and equal to $L$. Without loss of generality, we assume that $\hat{\gamma}_1 \le \hat{\gamma}_2 \le \cdots \le \hat{\gamma}_H$.

As we proved in the previous section, we have $\Psi \theta_\alpha = \theta_h^{opt}\,\hat{\gamma}_h\,(1 + \lambda_h)$ for all $h \in H$. We perform a sensitivity analysis for two cases:

• Keep $\hat{\gamma}_h$ constant and change $\theta_\alpha$. By changing the aggregate supply, we find that:
  – $\Psi \theta_\alpha$ is a piecewise linear convex increasing function of $\theta_\alpha$.
  – Its slope is $\left(\sum_{h=i}^{H} \frac{1}{\hat{\gamma}_h}\right)^{-1}$ until the constraint binds for agent $i$. In other words, at each point it is equal to the aggregate risk aversion of the agents whose constraint is not binding yet.
  – The constraint binds for agent $i$ when $\theta_{\alpha,i} = iL + \sum_{k=i+1}^{H} L\,\frac{\hat{\gamma}_i}{\hat{\gamma}_k}$. These are the kink points in Figures 4-4, 4-5.
  – When the aggregate supply is greater than $\theta_\alpha = HL$ there is no equilibrium, since in that case all the agents are constrained in their positions, they cannot buy any more shares and therefore the market cannot clear.
  – Since $q = \frac{\hat{\mu} - \Psi \hat{\sigma}^2 \theta_\alpha}{R_f}$, we see that the pricing function is a piecewise linear concave decreasing function of the aggregate supply, as we see in Figures 4-4, 4-5.
  – Therefore, variability of the risk aversion coefficients together with constraints leads to increasing risk premium, increasing volatility and increasing illiquidity, since a small change in the aggregate supply (a small liquidity shock by the noise traders) leads to a larger change in the price of the risky asset compared to the case where there is variability in the risk aversion coefficients but no constraints. Figure 4-4 shows the pricing function of the risky asset when the agents with different risk aversion coefficients face constraints and when they do not face any constraints.

• Keep $\theta_\alpha$ constant and change $L$. As we see in Figure 4-5, as we tighten the constraints, the pricing function becomes more concave. Therefore, tightening of the constraints leads to increasing risk premium, increasing volatility and increasing illiquidity.

Figure 4-4: Price of the risky asset as a function of the aggregate market supply with constraints and varying risk aversions. [Plot not reproduced; horizontal axis: aggregate supply, vertical axis: price; series: "Variability in risk aversions" and "No constraints".] We assume that we have 5 agents with the same constraints but different risk aversion coefficients. The blue plot assumes L = 30 for each agent, while the red line assumes that the agents are unconstrained.
Figure 4-5: Price of the risky asset as a function of the aggregate market supply with tightening constraints and varying risk aversions. [Plot not reproduced; horizontal axis: aggregate supply, vertical axis: price; series: "Variability in risk aversions" and "Tighter constraints".] We assume that we have 5 agents with the same constraints but different risk aversion coefficients. The blue plot assumes L = 30 for each agent, while the red line assumes that L = 20 for each agent.
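The pricing function of Section 4.2.3 can be sketched in the same spirit. The sketch below assumes distinct normalized risk aversions and a common constraint level L, and inverts the piecewise linear relation between the aggregate supply and $\Psi \theta_\alpha$ by interpolating between the kink points. The risk aversion values and the payoff parameters are hypothetical, since the figures do not state them.

```python
import numpy as np


def kink_supplies(gamma_hat, L):
    """Supplies i*L + sum_{k>i} L*gamma_i/gamma_k at which agent i starts to bind."""
    g = np.sort(np.asarray(gamma_hat, dtype=float))
    return np.array(
        [(i + 1) * L + L * g[i] * np.sum(1.0 / g[i + 1:]) for i in range(len(g))]
    )


def price_varying_risk_aversion(theta_alpha, gamma_hat, L, mu, sigma, rf):
    """Equilibrium price with a common cap L (assumes distinct gamma_hat values)."""
    g = np.sort(np.asarray(gamma_hat, dtype=float))
    if not 0.0 <= theta_alpha <= len(g) * L:
        raise ValueError("no equilibrium: aggregate supply exceeds H * L")
    supplies = np.concatenate(([0.0], kink_supplies(g, L)))
    x_at_kinks = np.concatenate(([0.0], g * L))  # x = Psi * theta_alpha at each kink
    x = np.interp(theta_alpha, supplies, x_at_kinks)
    return (mu - x * sigma**2) / rf


def price_unconstrained(theta_alpha, gamma_hat, mu, sigma, rf):
    """Benchmark price when no agent faces a risk constraint."""
    x = theta_alpha / np.sum(1.0 / np.asarray(gamma_hat, dtype=float))
    return (mu - x * sigma**2) / rf


# Hypothetical values, for illustration only.
gamma_hat = [0.01, 0.02, 0.03, 0.04, 0.05]
L, mu, sigma, rf = 30.0, 1.5, 1.0, 1.0

kinks = kink_supplies(gamma_hat, L)
q_constrained = price_varying_risk_aversion(120.0, gamma_hat, L, mu, sigma, rf)
q_free = price_unconstrained(120.0, gamma_hat, mu, sigma, rf)
```

Below the first kink the constrained and unconstrained prices coincide; above it, each additional binding constraint raises the slope, so the constrained price lies below the unconstrained one.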
4.2.4 Varying constraints and risk aversions

Now we assume that both the risk aversion relative to wealth $\hat{\gamma}_h \equiv \frac{\gamma_h}{W_{h0}}$ and the parameters $\hat{L}_h$ vary across the agents. Without loss of generality we assume that $\hat{\gamma}_1 \hat{L}_1 \le \hat{\gamma}_2 \hat{L}_2 \le \cdots \le \hat{\gamma}_H \hat{L}_H$. By changing the aggregate supply, we find that:

• $\Psi \theta_\alpha$ is a piecewise linear convex increasing function of $\theta_\alpha$.
• Its slope is $\left(\sum_{h=i}^{H} \frac{1}{\hat{\gamma}_h}\right)^{-1}$ until the constraint binds for agent $i$. In other words, at each point it is equal to the aggregate risk aversion of the agents whose constraint is not binding yet.
• The constraints bind in increasing order of $\hat{\gamma}_i \hat{L}_i = \gamma_i L_i$. The constraint binds for agent $i$ when $\theta_{\alpha,i} = \sum_{k=1}^{i} \hat{L}_k + \sum_{k=i+1}^{H} \hat{L}_i\,\frac{\hat{\gamma}_i}{\hat{\gamma}_k}$.
• When the aggregate supply is greater than $\theta_\alpha = \sum_{h \in H} \hat{L}_h$ there is no equilibrium, since in that case all the agents are constrained in their positions, they cannot buy any more shares and therefore the market cannot clear.
• The pricing function of the risky asset is a piecewise linear concave decreasing function of the aggregate supply. That means increasing risk premium, increasing volatility and increasing illiquidity.

4.2.5 Varying fear of model misspecification

We now study how fear of model misspecification across the agents affects the statistical properties of risky assets. We slightly change our initial model setup. In particular, we assume we have $H$ investors with CARA utilities, heterogeneous risk aversions and wealth. Each investor can invest in $N$ risky assets and at the risk-free rate at $t = 0$. Each agent mistrusts the model that describes the payoff distribution of the risky assets. All the agents have a common nominal model but they have heterogeneous fears of model misspecification. We assume that the risk-free rate is exogenously given and that the common approximating model describing the risky assets' payoff distribution at $t = 1$ is $N(\hat{\mu}, \hat{\Sigma})$. There are also noise traders. We do not
model their utility explicitly; we only assume that they are hit by random liquidity shocks and that they submit random market orders at time $t = 0$. Equivalently, the supply of the risky assets is stochastic.

Each agent's wealth at time $t = 1$ is given by:

$$W_{h1} = d^T \theta_h + (W_{h0} - q^T \theta_h) R_f$$

where $d \in \mathbb{R}^N$ describes the payoff of the $N$ risky assets, $\theta_h$ describes the number of shares of the assets, $q \in \mathbb{R}^N$ is the vector of prices of the risky assets and $R_f$ is the risk-free rate.

The common approximating model of the payoff distribution is $d = \hat{\mu} + \hat{\Sigma}^{1/2} \epsilon$, where $\epsilon$ follows the standard multivariate Gaussian distribution. The alternative models alter the distribution of $\epsilon$. In particular, similarly to the framework described in Chapter 2, they change the mean of the shocks and they assume that the distribution of the shock is $N(h, I)$. Therefore, under the alternative models the payoff is $d = \hat{\mu} + \hat{\Sigma}^{1/2}(\epsilon + h)$, i.e. $d$ follows $N(\hat{\mu} + \hat{\Sigma}^{1/2} h, \hat{\Sigma})$. The relative entropy of the alternative distributions with respect to the nominal distribution is given by $D(Q) = \frac{1}{2} h^T h$, as we have shown in Chapter 2 and proved in the Appendix.

Under the alternative distributions the certainty equivalent of the investors with CARA utilities, scaled by $\gamma$, is $\gamma (W_0 - q^T \theta) R_f + \gamma \hat{\mu}^T \theta - \frac{1}{2} \gamma^2 \theta^T \hat{\Sigma} \theta + \gamma \theta^T \hat{\Sigma}^{1/2} h$. We assume each investor solves the following optimization problem:

$$\max_{\theta} \min_{h} \; \gamma (W_0 - q^T \theta) R_f + \gamma \hat{\mu}^T \theta - \frac{1}{2} \gamma^2 \theta^T \hat{\Sigma} \theta + \gamma \theta^T \hat{\Sigma}^{1/2} h + \lambda \frac{1}{2} h^T h \tag{4.3}$$

where $\gamma$ is the risk aversion coefficient of the investor and $\lambda$ is the multiplier that penalizes the relative entropy of the alternative distribution.

By solving this optimization problem (see Appendix) we find that:

$$\theta_h^{opt} = \frac{\hat{\Sigma}^{-1} (\hat{\mu} - R_f q)}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)} \tag{4.4}$$
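Equation 4.4 can be checked numerically with a short sketch. The code below computes the robust demand and the corresponding worst-case mean shift $h = -\gamma \hat{\Sigma}^{1/2} \theta / \lambda$ derived in the Appendix; the function names and all numerical values are hypothetical and serve only as an illustration.

```python
import numpy as np


def robust_portfolio(mu, Sigma, q, rf, gamma, lam):
    """Demand of equation 4.4 with entropy penalty multiplier lam."""
    gamma_eff = gamma * (1.0 + 1.0 / lam)  # effective risk aversion
    return np.linalg.solve(Sigma, mu - rf * q) / gamma_eff


def worst_case_shift(theta, Sigma, gamma, lam):
    """Minimizer h of the inner problem for a given portfolio theta."""
    eigvals, eigvecs = np.linalg.eigh(Sigma)
    sqrt_Sigma = (eigvecs * np.sqrt(eigvals)) @ eigvecs.T  # symmetric square root
    return -gamma * sqrt_Sigma @ theta / lam


# Hypothetical two-asset example.
mu = np.array([1.5, 1.2])
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])
q = np.array([1.3, 1.0])
rf, gamma, lam = 1.02, 2.0, 4.0

theta_robust = robust_portfolio(mu, Sigma, q, rf, gamma, lam)
theta_trusting = np.linalg.solve(Sigma, mu - rf * q) / gamma  # no model mistrust
h_star = worst_case_shift(theta_robust, Sigma, gamma, lam)
```

The robust position is the fully trusting position scaled down by $1/(1 + 1/\lambda)$, so a smaller $\lambda$ (more mistrust) means a smaller position in every asset.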
The market clearing condition for the risky assets implies that (see Appendix):

$$\sum_{h \in H} \theta_h^{opt} = \theta_\alpha \;\Rightarrow\; q = \frac{\hat{\mu} - \Psi \hat{\Sigma} \theta_\alpha}{R_f}$$

where $\theta_\alpha \in \mathbb{R}^N$ is the aggregate supply of the risky assets and $\frac{1}{\Psi} \equiv \sum_{h \in H} \frac{1}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)}$.

If the investors fully trusted their model dynamics we would have

$$q = \frac{\hat{\mu} - \Psi_{nr} \hat{\Sigma} \theta_\alpha}{R_f} \tag{4.5}$$

but now $\frac{1}{\Psi_{nr}} \equiv \sum_{h \in H} \frac{1}{\gamma_h}$.

So we see that the case where the agents mistrust their models is equivalent to the case where they fully trust their models but have a higher effective risk aversion $\gamma_{eff} = \gamma \left(1 + \frac{1}{\lambda}\right)$.

By changing the aggregate supply, we find that when the investors mistrust their models and follow a robust decision rule, the slope of the pricing function of the assets is larger than in the case where the investors fully trust their models' dynamics, since $\Psi > \Psi_{nr}$. If we consider the case where $N = 1$ and the risky asset is the market, then we see that the risk premium, the volatility and the illiquidity go up, since a small change in the aggregate supply (a small liquidity shock by the noise traders) leads to a larger change in the price of the risky asset compared to the case where the investors fully trust their models.

4.3 Conclusions

Risk sensitive regulations have become the cornerstone of international financial regulations. They imply an upper bound on the wealth volatility of each investor. In this chapter we studied how risk constraints and model misspecification affect the statistical properties of the market returns. In particular, we studied their effect on the risk premium, the volatility and the liquidity of the market.
We studied the following cases:

• Variability of the risk constraints: This is the case where the agents face different risk constraints. The more variability there is in the risk constraints, the larger the risk premium, volatility and illiquidity of the market are. In addition, tightening of the constraints also leads to a larger risk premium, volatility and illiquidity of the market.

• Variability in risk aversions: This is the case where the agents face the same risk constraint but have different aversions to risk. Variability in risk aversions along with the risk constraints also leads to a more concave pricing function of the aggregate supply for the market, implying increasing risk premium, increasing volatility and increasing illiquidity. An interesting question here is the following. Do we have a concave pricing function due to the fact that “...risk will have to be transferred from the more risk-tolerant to the more risk-averse”? As we saw, this is not the case. Any constraint that binds for an agent forces the discount and the slope of the pricing function to be larger, so that the other agents are induced to absorb the excess supply. This is the mechanism that leads to a more concave decreasing pricing function with respect to the aggregate supply.

• Model misspecification: This is the case where the agents do not fully trust their model dynamics: they believe that the real model is an unknown member of a set of alternative models near their nominal model and they follow robust decision rules. We find that model misspecification is another source of increasing risk premium, endogenous volatility and increasing illiquidity.
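Appendix A

Technical Notes

Proposition 1. The following QCQP:

$$\begin{aligned} \text{minimize} \quad & F_t^T \mu_t + \frac{1}{2} F_t^T \Sigma F_t \\ \text{subject to} \quad & F_t^T \Sigma F_t \le L \end{aligned}$$

has a solution given by:

$$F_t^{opt} = \begin{cases} -\Sigma^{-1} \mu_t & \text{if } \mu_t^T \Sigma^{-1} \mu_t \le L \\[1ex] -\dfrac{\Sigma^{-1} \mu_t}{\sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}}} & \text{if } \mu_t^T \Sigma^{-1} \mu_t \ge L \end{cases}$$

Proof. Let us apply the Karush Kuhn Tucker (KKT) conditions. $F_t^{opt}$, $\lambda_t^{opt}$ are optimal iff they satisfy the following KKT conditions:

• Primal feasibility: $(F_t^{opt})^T \Sigma F_t^{opt} \le L$
• Dual feasibility: $\lambda_t^{opt} \ge 0$
• Complementary slackness: $\lambda_t^{opt} \left((F_t^{opt})^T \Sigma F_t^{opt} - L\right) = 0$
• Minimization of the Lagrangean: $F_t^{opt} = \operatorname{argmin}\, \mathcal{L}(F_t, \lambda_t^{opt})$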
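The Lagrangean is given by:

$$\mathcal{L}(F, \lambda) = F_t^T \mu_t + \frac{1}{2} F_t^T \Sigma F_t + \lambda \left(\frac{1}{2} F_t^T \Sigma F_t - \frac{1}{2} L\right)$$

The first order conditions are:

$$\mu_t + \Sigma F_t^{opt} + \lambda_t^{opt} \Sigma F_t^{opt} = 0 \;\Rightarrow\; F_t^{opt} = -\frac{\Sigma^{-1} \mu_t}{1 + \lambda_t^{opt}}$$

If $\mu_t^T \Sigma^{-1} \mu_t > L$, then we cannot have $\lambda_t^{opt} = 0$, since in that case $(F_t^{opt})^T \Sigma F_t^{opt} > L$. Therefore $\lambda_t^{opt} > 0$ and from the CS condition

$$(F_t^{opt})^T \Sigma F_t^{opt} = L \;\Rightarrow\; \frac{\mu_t^T \Sigma^{-1} \mu_t}{(1 + \lambda_t^{opt})^2} = L \;\Rightarrow\; 1 + \lambda_t^{opt} = \sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}} > 1$$

Therefore, if $\mu_t^T \Sigma^{-1} \mu_t > L$, then $F_t^{opt} = -\dfrac{\Sigma^{-1} \mu_t}{\sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}}}$.

If $\mu_t^T \Sigma^{-1} \mu_t < L$, then $(F_t^{opt})^T \Sigma F_t^{opt} < L$, therefore $\lambda_t^{opt} = 0$ and $F_t^{opt} = -\Sigma^{-1} \mu_t$.

Finally, if $\mu_t^T \Sigma^{-1} \mu_t = L$, then we cannot have $\lambda_t^{opt} > 0$, since in that case we would have $(F_t^{opt})^T \Sigma F_t^{opt} < L$ and $\lambda_t^{opt} > 0$, and the CS condition would be violated. Therefore, $\lambda_t^{opt} = 0$ and $F_t^{opt} = -\Sigma^{-1} \mu_t$.

Therefore we proved that:

$$F_t^{opt} = \begin{cases} -\Sigma^{-1} \mu_t & \text{if } \mu_t^T \Sigma^{-1} \mu_t \le L \\[1ex] -\dfrac{\Sigma^{-1} \mu_t}{\sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}}} & \text{if } \mu_t^T \Sigma^{-1} \mu_t \ge L \end{cases}$$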
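We could also prove this result in another way. Let us make the change of variables:

$$y = \Sigma^{1/2} F_t, \qquad F_t = \Sigma^{-1/2} y$$

Then our problem becomes:

$$\begin{aligned} \text{minimize} \quad & y^T \Sigma^{-1/2} \mu_t + \frac{1}{2} y^T y \\ \text{subject to} \quad & y^T y \le L \end{aligned}$$

The optimal solution is given by finding the projection of $\tilde{\mu} = -\Sigma^{-1/2} \mu_t$ on the Euclidean ball $y^T y \le L$. This projection is given by:

$$y^{opt} = \frac{\tilde{\mu}}{\max\left(1, \sqrt{\frac{\tilde{\mu}^T \tilde{\mu}}{L}}\right)}$$

Therefore the optimal solution for the original problem is given by:

$$F_t^{opt} = \Sigma^{-1/2} y^{opt} = -\frac{\Sigma^{-1} \mu_t}{\max\left(1, \sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}}\right)}$$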
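The closed form of Proposition 1 is easy to check numerically. The following sketch implements the projection formula and compares it against randomly drawn feasible points; the function names and the example values of $\mu_t$, $\Sigma$ and $L$ are assumptions made purely for illustration.

```python
import numpy as np


def qcqp_solution(mu, Sigma, L):
    """Minimizer of F'mu + 0.5 F'Sigma F subject to F'Sigma F <= L."""
    unconstrained = -np.linalg.solve(Sigma, mu)  # -Sigma^{-1} mu
    scale = max(1.0, np.sqrt(mu @ np.linalg.solve(Sigma, mu) / L))
    return unconstrained / scale


def objective(F, mu, Sigma):
    return F @ mu + 0.5 * F @ Sigma @ F


# Hypothetical inputs.
mu = np.array([0.3, -0.2])
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])

F_tight = qcqp_solution(mu, Sigma, L=0.5)   # constraint binds
F_loose = qcqp_solution(mu, Sigma, L=10.0)  # constraint is slack

# Draw random points inside the ellipsoid F'Sigma F <= 0.5 for comparison.
rng = np.random.default_rng(0)
directions = rng.standard_normal((1000, 2))
norms = np.sqrt(np.einsum("ij,jk,ik->i", directions, Sigma, directions))
radii = np.sqrt(0.5) * rng.uniform(0.0, 1.0, size=1000)
candidates = directions / norms[:, None] * radii[:, None]
best_random = min(objective(F, mu, Sigma) for F in candidates)
```

In this example $\mu_t^T \Sigma^{-1} \mu_t$ is roughly 3.1, so the constraint binds for $L = 0.5$ and is slack for $L = 10$. No random feasible point should achieve a lower objective value than `F_tight`.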
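Proposition 2. When $\Sigma$ is a diagonal matrix, the following convex program:

$$\begin{aligned} \text{minimize} \quad & F_t^T \mu_t + \frac{1}{2} F_t^T \Sigma F_t \\ \text{subject to} \quad & \sum_{i=1}^{N} \lambda_i |F_{it}| \le 1 \end{aligned}$$

has a solution given by:

$$F_{it}^{opt} = \operatorname{sign}(-\mu_{it})\, \frac{\left(\left|\frac{\mu_{it}}{\lambda_i}\right| - \nu_t^{opt}\right)^+}{\frac{\sigma_i^2}{\lambda_i}}$$

Proof. Let us apply the Karush Kuhn Tucker (KKT) conditions. $F_t^{opt}$, $\nu_t^{opt}$ are optimal iff they satisfy the KKT conditions:

• Primal feasibility: $\sum_{i=1}^{N} \lambda_i |F_{it}| \le 1$
• Dual feasibility: $\nu_t^{opt} \ge 0$
• Complementary slackness: $\nu_t^{opt} \left(\sum_{i=1}^{N} \lambda_i |F_{it}| - 1\right) = 0$
• Minimization of the Lagrangean: $F_t^{opt} = \operatorname{argmin}\, \mathcal{L}(F_t, \nu_t^{opt})$

The Lagrangean is given by:

$$\mathcal{L}(F, \nu) = F_t^T \mu_t + \frac{1}{2} F_t^T \Sigma F_t + \nu \left(\sum_{i=1}^{N} \lambda_i |F_{it}| - 1\right)$$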
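$$\begin{aligned} F_t^{opt} &= \operatorname{argmin}\, \mathcal{L}(F_t, \nu_t^{opt}) \\ &= \operatorname{argmin}\; \frac{1}{2} \sum_{i=1}^{N} \sigma_i^2 F_{it}^2 + \sum_{i=1}^{N} F_{it} \mu_{it} + \nu_t^{opt} \left(\sum_{i=1}^{N} \lambda_i |F_{it}| - 1\right) \\ &= \operatorname{argmin}\; \frac{1}{2} \sum_{i=1}^{N} \sigma_i^2 |F_{it}|^2 + \sum_{i=1}^{N} |F_{it}| \operatorname{sign}(-\mu_{it}) \mu_{it} + \nu_t^{opt} \left(\sum_{i=1}^{N} \lambda_i |F_{it}| - 1\right) \\ &= \operatorname{argmin}\; \frac{1}{2} \sum_{i=1}^{N} \sigma_i^2 |F_{it}|^2 + \sum_{i=1}^{N} |F_{it}| \left(\operatorname{sign}(-\mu_{it}) \mu_{it} + \lambda_i \nu_t^{opt}\right) \end{aligned}$$

where $F_{it}^{opt} = |F_{it}^{opt}| \operatorname{sign}(-\mu_{it})$. We have:

$$|F_{it}^{opt}| = \begin{cases} 0 & \text{if } \operatorname{sign}(-\mu_{it}) \mu_{it} + \lambda_i \nu_t^{opt} \ge 0 \\[1ex] -\dfrac{\operatorname{sign}(-\mu_{it}) \mu_{it} + \lambda_i \nu_t^{opt}}{\sigma_i^2} & \text{if } \operatorname{sign}(-\mu_{it}) \mu_{it} + \lambda_i \nu_t^{opt} \le 0 \end{cases}$$

Therefore:

$$F_{it}^{opt} = \begin{cases} 0 & \text{if } \operatorname{sign}(-\mu_{it}) \mu_{it} + \lambda_i \nu_t^{opt} \ge 0 \\[1ex] \dfrac{-\frac{\mu_{it}}{\lambda_i} - \operatorname{sign}(-\mu_{it}) \nu_t^{opt}}{\sigma_i^2 / \lambda_i} & \text{if } \operatorname{sign}(-\mu_{it}) \mu_{it} + \lambda_i \nu_t^{opt} \le 0 \end{cases}$$

which is equivalent to:

$$F_{it}^{opt} = \begin{cases} 0 & \text{if } \operatorname{sign}(-\mu_{it}) \mu_{it} + \lambda_i \nu_t^{opt} \ge 0 \\[1ex] \operatorname{sign}(-\mu_{it})\, \dfrac{\frac{|\mu_{it}|}{\lambda_i} - \nu_t^{opt}}{\frac{\sigma_i^2}{\lambda_i}} & \text{if } \operatorname{sign}(-\mu_{it}) \mu_{it} + \lambda_i \nu_t^{opt} \le 0 \end{cases}$$

Since

$$\operatorname{sign}(-\mu_{it}) \mu_{it} + \lambda_i \nu_t^{opt} \le 0 \;\Rightarrow\; -|\mu_{it}| + \lambda_i \nu_t^{opt} \le 0 \;\Rightarrow\; \left|\frac{\mu_{it}}{\lambda_i}\right| - \nu_t^{opt} \ge 0$$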
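we have proved that:

$$F_{it}^{opt} = \operatorname{sign}(-\mu_{it})\, \frac{\left(\left|\frac{\mu_{it}}{\lambda_i}\right| - \nu_t^{opt}\right)^+}{\frac{\sigma_i^2}{\lambda_i}}$$

Proposition 3. The relative entropy of a multivariate Gaussian distribution $N(\mu, I)$ with respect to the multivariate standard Gaussian distribution is

$$D(Q) = \frac{1}{2} \mu^T \mu$$

Proof. We have:

$$D(Q) = \int_{-\infty}^{+\infty} \log\left(\frac{f(x)}{g(x)}\right) f(x)\,dx$$

where $f(x) = \frac{1}{(2\pi)^{N/2}} e^{-\frac{1}{2}(x - \mu)^T (x - \mu)}$ and $g(x) = \frac{1}{(2\pi)^{N/2}} e^{-\frac{1}{2} x^T x}$. Therefore:

$$D(Q) = -\int_{-\infty}^{+\infty} \frac{1}{2} (x - \mu)^T (x - \mu) f(x)\,dx + \int_{-\infty}^{+\infty} \frac{1}{2} x^T x\, f(x)\,dx$$

The first term is:

$$-\frac{1}{2} E[(X - \mu)^T (X - \mu)] = -\frac{1}{2} E[\operatorname{trace}((X - \mu)^T (X - \mu))] = -\frac{1}{2} E[\operatorname{trace}((X - \mu)(X - \mu)^T)] = -\frac{1}{2} \operatorname{trace}(I) = -\frac{N}{2}$$

The second term is:

$$\frac{1}{2} E[X^T X] = \frac{1}{2} E[\operatorname{trace}((X - \mu)^T (X - \mu))] + \frac{1}{2} E[X]^T E[X] = \frac{N}{2} + \frac{1}{2} \mu^T \mu$$

Therefore, we have $D(Q) = \frac{1}{2} \mu^T \mu$.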
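Proposition 3 can also be checked by simulation. The sketch below estimates the relative entropy as the sample average of the log density ratio under $N(\mu, I)$; the mean vector, the sample size and the seed are arbitrary assumptions, and the estimate only agrees with $\frac{1}{2} \mu^T \mu$ up to Monte Carlo error.

```python
import numpy as np


def relative_entropy_mc(mu, n_samples=200_000, seed=0):
    """Monte Carlo estimate of D(N(mu, I) || N(0, I))."""
    mu = np.asarray(mu, dtype=float)
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, len(mu))) + mu  # draws from N(mu, I)
    # log f(x) - log g(x); the Gaussian normalizing constants cancel
    log_ratio = -0.5 * ((x - mu) ** 2).sum(axis=1) + 0.5 * (x**2).sum(axis=1)
    return log_ratio.mean()


mu_example = np.array([0.5, -1.0, 2.0])  # hypothetical mean shift
estimate = relative_entropy_mc(mu_example)
closed_form = 0.5 * mu_example @ mu_example
```

The closed-form value here is 2.625, and the simulated estimate should be close to it, with a standard error of about $\sqrt{\mu^T \mu / n}$.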
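Proposition 4. The relative entropy of a probability measure $Q$ with respect to $P$, where $\frac{dQ}{dP} = \xi_T$ and $\xi_t = e^{\int_0^t h_s^T dZ_s - \frac{1}{2} \int_0^t h_s^T h_s\,ds}$, is given by:

$$D(Q) = \int_0^T \frac{1}{2} E_Q[h_t^T h_t]\,dt$$

Proof. The relative entropy of $Q$ with respect to $P$ is:

$$\begin{aligned} D(Q) &= E_Q\left[\log\left(\frac{dQ}{dP}\right)\right] = E_Q[\log(\xi_T)] \\ &= E_Q\left[\int_0^T h_s^T dZ_s - \frac{1}{2} \int_0^T h_s^T h_s\,ds\right] \\ &= \int_0^T E_Q[h_s^T dZ_s] - \frac{1}{2} \int_0^T E_Q[h_s^T h_s]\,ds \\ &= \int_0^T E_Q\left[E_Q[h_s^T dZ_s \mid \mathcal{F}_s]\right] - \frac{1}{2} \int_0^T E_Q[h_s^T h_s]\,ds \\ &= \int_0^T \frac{1}{2} E_Q[h_t^T h_t]\,dt \end{aligned}$$

since $E_Q[dZ_t \mid \mathcal{F}_t] = h_t\,dt$ from Girsanov's theorem.

Proposition 5. The following QCQP:

$$\begin{aligned} \text{minimize} \quad & -F_t^T \left(\mu(S, t) - r S_t - \frac{\Sigma H_S}{\nu}\right) + \frac{1}{2} \left(1 + \frac{1}{\nu}\right) F_t^T \Sigma F_t \\ \text{subject to} \quad & F_t^T \Sigma F_t \le L \end{aligned}$$

has a solution given by:

$$F_t^{opt} = \begin{cases} \dfrac{1}{1 + \frac{1}{\nu}}\, \Sigma^{-1} \mu_t & \text{if } \mu_t^T \Sigma^{-1} \mu_t \le L \left(1 + \frac{1}{\nu}\right)^2 \\[2ex] \dfrac{\Sigma^{-1} \mu_t}{\sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}}} & \text{if } \mu_t^T \Sigma^{-1} \mu_t \ge L \left(1 + \frac{1}{\nu}\right)^2 \end{cases}$$

where $\mu_t = \mu(S, t) - r S_t - \frac{\Sigma H_S(S_t, t)}{\nu}$.

Proof. Let us apply the Karush Kuhn Tucker (KKT) conditions. $F_t^{opt}$, $\lambda_t^{opt}$ are optimal iff they satisfy the following KKT conditions: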
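• Primal feasibility: $(F_t^{opt})^T \Sigma F_t^{opt} \le L$
• Dual feasibility: $\lambda_t^{opt} \ge 0$
• Complementary slackness: $\lambda_t^{opt} \left((F_t^{opt})^T \Sigma F_t^{opt} - L\right) = 0$
• Minimization of the Lagrangean: $F_t^{opt} = \operatorname{argmin}\, \mathcal{L}(F_t, \lambda_t^{opt})$

The Lagrangean is given by:

$$\mathcal{L}(F, \lambda) = -F_t^T \mu_t + \frac{1}{2} \left(1 + \frac{1}{\nu}\right) F_t^T \Sigma F_t + \lambda \left(\frac{1}{2} F_t^T \Sigma F_t - \frac{1}{2} L\right)$$

The first order conditions are:

$$-\mu_t + \left(1 + \frac{1}{\nu}\right) \Sigma F_t^{opt} + \lambda_t^{opt} \Sigma F_t^{opt} = 0 \;\Rightarrow\; F_t^{opt} = \frac{\Sigma^{-1} \mu_t}{1 + \frac{1}{\nu} + \lambda_t^{opt}}$$

If $\mu_t^T \Sigma^{-1} \mu_t > L \left(1 + \frac{1}{\nu}\right)^2$, then we cannot have $\lambda_t^{opt} = 0$, since in that case $(F_t^{opt})^T \Sigma F_t^{opt} > L$. Therefore $\lambda_t^{opt} > 0$ and from the CS condition

$$(F_t^{opt})^T \Sigma F_t^{opt} = L \;\Rightarrow\; \frac{\mu_t^T \Sigma^{-1} \mu_t}{\left(1 + \frac{1}{\nu} + \lambda_t^{opt}\right)^2} = L \;\Rightarrow\; 1 + \frac{1}{\nu} + \lambda_t^{opt} = \sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}} > 1 + \frac{1}{\nu}$$

Therefore, if $\mu_t^T \Sigma^{-1} \mu_t > L \left(1 + \frac{1}{\nu}\right)^2$, then $F_t^{opt} = \dfrac{\Sigma^{-1} \mu_t}{\sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}}}$.

If $\mu_t^T \Sigma^{-1} \mu_t < L \left(1 + \frac{1}{\nu}\right)^2$, then $(F_t^{opt})^T \Sigma F_t^{opt} < L$, therefore $\lambda_t^{opt} = 0$ and $F_t^{opt} = \dfrac{\Sigma^{-1} \mu_t}{1 + \frac{1}{\nu}}$.

Finally, if $\mu_t^T \Sigma^{-1} \mu_t = L \left(1 + \frac{1}{\nu}\right)^2$, then we cannot have $\lambda_t^{opt} > 0$, since in that case we would have $(F_t^{opt})^T \Sigma F_t^{opt} < L$ and $\lambda_t^{opt} > 0$, and the CS condition would be violated. Therefore, $\lambda_t^{opt} = 0$ and $F_t^{opt} = \dfrac{\Sigma^{-1} \mu_t}{1 + \frac{1}{\nu}}$.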
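Proposition 6. The following convex program:

$$\max_{\theta} \min_{h} \; \gamma (W_0 - q^T \theta) R_f + \gamma \hat{\mu}^T \theta - \frac{1}{2} \gamma^2 \theta^T \hat{\Sigma} \theta + \gamma \theta^T \hat{\Sigma}^{1/2} h + \lambda \frac{1}{2} h^T h \tag{A.1}$$

has a solution given by:

$$\theta^{opt} = \frac{\hat{\Sigma}^{-1} (\hat{\mu} - R_f q)}{\gamma \left(1 + \frac{1}{\lambda}\right)}$$

Proof. For each $h$ the objective function is concave in $\theta$. Therefore, when we take the minimum over $h$, we take the minimum of concave functions, which is a concave function. Therefore this problem maximizes a concave function of $\theta$.

The inner minimization problem is:

$$\min_{h} \; \gamma \theta^T \hat{\Sigma}^{1/2} h + \lambda \frac{1}{2} h^T h$$

This is a convex problem where we minimize a quadratic function of $h$. The first order conditions are:

$$\gamma \hat{\Sigma}^{1/2} \theta + \lambda h = 0 \;\Rightarrow\; h = -\frac{\gamma \hat{\Sigma}^{1/2} \theta}{\lambda}$$

The optimal value of the inner optimization problem is:

$$V(\theta) = -\gamma \theta^T \hat{\Sigma}^{1/2}\, \frac{\gamma \hat{\Sigma}^{1/2} \theta}{\lambda} + \lambda \frac{1}{2}\, \frac{\gamma^2 \theta^T \hat{\Sigma} \theta}{\lambda^2} = -\frac{\gamma^2 \theta^T \hat{\Sigma} \theta}{2 \lambda}$$

The problem A.1 then becomes:

$$\max_{\theta} \; \gamma (W_0 - q^T \theta) R_f + \gamma \hat{\mu}^T \theta - \frac{1}{2} \gamma^2 \theta^T \hat{\Sigma} \theta - \frac{\gamma^2 \theta^T \hat{\Sigma} \theta}{2 \lambda}$$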
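The first order conditions of this concave problem are:

$$\gamma (\hat{\mu} - R_f q) - \gamma^2 \hat{\Sigma} \theta^{opt} \left(1 + \frac{1}{\lambda}\right) = 0 \;\Rightarrow\; \theta^{opt} = \frac{\hat{\Sigma}^{-1} (\hat{\mu} - R_f q)}{\gamma \left(1 + \frac{1}{\lambda}\right)}$$

Therefore we proved our proposition.

Proposition 7. The market clearing condition for the risky assets implies that

$$\sum_{h \in H} \theta_h^{opt} = \theta_\alpha \;\Rightarrow\; q = \frac{\hat{\mu} - \Psi \hat{\Sigma} \theta_\alpha}{R_f}$$

where $\theta_\alpha \in \mathbb{R}^N$ is the aggregate supply of the risky assets and $\frac{1}{\Psi} \equiv \sum_{h \in H} \frac{1}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)}$.

Proof. From the proposition above we have, for each agent $h$:

$$\theta_h = \frac{\hat{\Sigma}^{-1} (\hat{\mu} - R_f q)}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)} \;\Rightarrow\; \hat{\Sigma} \theta_h = \frac{\hat{\mu} - R_f q}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)}$$

By adding across agents we have:

$$\begin{aligned} \sum_{h \in H} \hat{\Sigma} \theta_h &= \sum_{h \in H} \frac{\hat{\mu} - R_f q}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)} \\ \hat{\Sigma} \theta_\alpha &= (\hat{\mu} - R_f q) \sum_{h \in H} \frac{1}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)} \\ \Psi \hat{\Sigma} \theta_\alpha &= \hat{\mu} - R_f q \\ q &= \frac{\hat{\mu} - \Psi \hat{\Sigma} \theta_\alpha}{R_f} \end{aligned}$$

where $\frac{1}{\Psi} \equiv \sum_{h \in H} \frac{1}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)}$.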
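As a numerical illustration of Proposition 7, the sketch below computes $\Psi$ with and without model mistrust, the resulting equilibrium prices, and the individual demands at the robust price. All agent parameters and market inputs are hypothetical.

```python
import numpy as np


def psi(gammas, lams=None):
    """Aggregate coefficient Psi; lams=None means the agents fully trust the model."""
    gammas = np.asarray(gammas, dtype=float)
    if lams is None:
        effective = gammas
    else:
        effective = gammas * (1.0 + 1.0 / np.asarray(lams, dtype=float))
    return 1.0 / np.sum(1.0 / effective)


def equilibrium_price(mu, Sigma, theta_alpha, rf, gammas, lams=None):
    """Price vector q = (mu - Psi * Sigma * theta_alpha) / rf."""
    return (mu - psi(gammas, lams) * (Sigma @ theta_alpha)) / rf


# Hypothetical economy with three agents and two risky assets.
gammas = np.array([1.0, 2.0, 4.0])
lams = np.array([2.0, 5.0, 10.0])
mu = np.array([1.5, 1.2])
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])
theta_alpha = np.array([1.0, 0.5])
rf = 1.02

q_robust = equilibrium_price(mu, Sigma, theta_alpha, rf, gammas, lams)
q_trusting = equilibrium_price(mu, Sigma, theta_alpha, rf, gammas)

# Individual robust demands at the robust price; they should add up to the supply.
excess = np.linalg.solve(Sigma, mu - rf * q_robust)
demands = [excess / (g * (1.0 + 1.0 / l)) for g, l in zip(gammas, lams)]
total_demand = np.sum(demands, axis=0)
```

Because $\Psi > \Psi_{nr}$, the robust prices respond more strongly to the same aggregate supply than the prices under full trust.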
Bibliography

[1] Committee on the NIH Research Priority-Setting Process. Scientific Opportunities and Public Needs: Improving Priority Setting and Public Input at the National Institutes of Health, pp. 11–12. National Academy Press, Washington, D.C., 1998.
[2] C. Anderson. A new kind of earmarking. Science, 260(5107):483, Apr. 23, 1993.
[3] Evan W. Anderson, Lars Peter Hansen, and Thomas J. Sargent. Robustness, detection and the price of risk, 2000.
[4] Kerry E. Back. Asset Pricing and Portfolio Choice Theory. Oxford University Press, 2010.
[5] Alexander Bade, Gabriel Frahm, and Uwe Jaekel. A general approach to Bayesian portfolio optimization. Mathematical Methods of Operations Research, 70(2):337–356, 2009.
[6] Suleyman Basak and B. Croitoru. Equilibrium mispricing in a capital market with portfolio constraints. The Review of Financial Studies, 13:715–748, 2000.
[7] Suleyman Basak and Alexander Shapiro. Value-at-risk-based risk management: Optimal policies and asset prices. The Review of Financial Studies, 14:371–405, 2001.
[8] V. Bawa, S. Brown, and R. Klein. Estimation Risk and Optimal Portfolio Choice. North-Holland, Amsterdam, 1979.
[9] Dirk Bergemann and Karl Schlag. Robust monopoly pricing. Cowles Foundation, Yale University, 2005.
[10] Dimitri P. Bertsekas. Convex Optimization Theory. Athena Scientific, 2009.
[11] S. Birch and A. Gafni. Cost effectiveness/utility analyses. Do current decision rules lead us to where we want to be? Journal of Health Economics, 11:279–296, 1992.
[12] Dimitrios Bisias, Andrew W. Lo, and James F. Watkins. Estimating the NIH efficient frontier. PLOS One, 2012.
[13] F. Black and R. Litterman. Global portfolio optimization. Financial Analysts Journal, 48(5):28–43, 1992.
[14] Nick Black. Health services research: the gradual encroachment of ideas. Journal of Health Services Research & Policy, 14:120–123, 2009.
[15] Michael Boguslavsky and Elena Boguslavskaya. Arbitrage under power. Risk, pages 69–73, June 2004.
[16] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, UK, 2004.
[17] Michael J. Brennan and Eduardo S. Schwartz. Arbitrage in stock index futures. The Journal of Business, 63, 1990.
[18] J. F. Bridges and D. D. Terris. Portfolio evaluation of health programs: A reply to Sendi et al. Social Science & Medicine, 58:1849–1851, 2004.
[19] J. F. B. Bridges, M. Stewart, M. T. King, and K. van Gool. Adapting portfolio theory for the evaluation of multiple investments in health with a multiplicative extension for treatment synergies. European Journal of Health Economics, 3:47–53, 2002.
[20] M. L. Brown, J. Lipscomb, and C. Snyder. The burden of illness of cancer: economic cost and quality of life. Annual Review of Public Health, 22:91–113, 2001.
[21] Johannes Brumm, Felix Kubler, Michael Grill, and Karl Schmedders. Margin regulation and volatility. ECB working paper, 1698, 2014.
[22] Carol S. Burckhardt and Kathryn L. Anderson. The quality of life scale (QOLS): Reliability, validity, and utilization. Health and Quality of Life Outcomes, 1:60–66, 2003.
[23] M. Buxton, S. Hanney, and T. Jones. Estimating the economic value to societies of the impact of health research: a critical review. Bulletin of the World Health Organization, 82(10):733–739, Oct 2004.
[24] S. Chandra. Regional economy size and the growth/instability frontier: Evidence from Europe. Journal of Regional Science, 43(1):95–122, 2003.
[25] Joseph Chen, Harrison Hong, and Jeremy C. Stein. Breadth of ownership and stock returns. Journal of Financial Economics, 66:171–205, 2002.
[26] Julius H. Comroe Jr. and Robert D. Dripps. Scientific basis for the support of biomedical science. Science, 192(4235):105–111, 1976.
[27] C. W. Curry, A. K. De, R. M. Ikeda, and S. B. Thacker. Health burden and funding at the Centers for Disease Control and Prevention. American Journal of Preventive Medicine, 30(3):269–276, Mar 2006.
[28] D. M. Cutler and M. McClellan. Is technological change in medicine worth it? Health Affairs, 20(5):11–29, Sep–Oct 2001.
[29] Darrell Duffie. Special repo rates. Journal of Finance, 51:493–526, 1996.
[30] Darrell Duffie. Dynamic Asset Pricing Theory. Princeton Series in Finance, 3 edition, 2001.
[31] R. L. Fleurence and D. J. Torgerson. Setting priorities for research. Health Policy, 69(1):1–10, Jul 2004.
[32] Richard Freeman and John Van Reenen. What if Congress doubled R&D spending on the physical sciences? Technical Report 931, Center for Economic Performance, May 2009.
[33] John Geanakoplos. The leverage cycle. Cowles Foundation Discussion Paper, 1715R, 2009.
[34] Itzhak Gilboa and David Schmeidler. Maxmin expected utility with non-unique prior. Journal of Mathematical Economics, 18:141–153, 1989.
[35] J. Grant. Evaluating “payback” on biomedical research from papers cited in clinical guidelines: applied bibliometric study. BMJ, 320(7242):1107–1111, 2000.
[36] M. Grant and S. Boyd. Graph implementations for nonsmooth convex programs. In V. Blondel, S. Boyd, and H. Kimura, editors, Recent Advances in Learning and Control (a tribute to M. Vidyasagar), Lecture Notes in Control and Information Sciences, pages 95–110. Springer, Berlin / Heidelberg, 2008.
[37] M. Grant and S. Boyd. CVX: Matlab software for disciplined convex programming, June 2009.
[38] C. P. Gross, G. F. Anderson, and N. R. Rowe. The relation between funding by the National Institutes of Health and the burden of disease. New England Journal of Medicine, 340(24):1881–1887, Jun 17, 1999.
[39] Steve Hanney, Iain Frame, Jonathan Grant, Philip Green, and Martin Buxton. From Bench to Bedside: Tracing the Payback Forwards from Basic or Early Clinical Research A Preliminary Exercise and Proposals for a Future Study. Health Economics Research Group, Brunel University, Uxbridge, UK, 2003.
[40] Lars Hansen, Thomas Sargent, G. Turmuhambetova, and N. Williams. Robust control, min-max expected utility and model misspecification. Journal of Economic Theory, 128:45–90, 2006.
[41] Lars P. Hansen and Thomas Sargent. Robustness. Princeton University Press, 2008.
[42] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2013.
[43] Ernest Istook. Research funding on major diseases is not proportionate to taxpayer’s needs. Journal of NIH Research, 9(8):26–28, 1997.
[44] D. H. Jacobson. Optimal stochastic linear systems with exponential performance criteria and their relation to deterministic differential games. IEEE Transactions Automatic Control, 18:124–131, 1973.
[45] S. C. Johnston and S. L. Hauser. Basic and clinical research: what is the most appropriate weighting in a public investment portfolio? Annals of Neurology, 60(1):9A–11A, Jul 2006.
[46] S. C. Johnston, J. D. Rootenberg, S. Katrak, W. S. Smith, and J. S. Elkins. Effect of a US national institutes of health programme of clinical trials on public health and costs. Lancet, 367(9519):1319–1327, Apr 22, 2006.
[47] Philippe Jorion. Value at Risk: The New Benchmark for Managing Financial Risk. McGraw-Hill, 2006.
[48] Jakub Jurek and Halla Yang. Dynamic portfolio selection in arbitrage. EFA Meeting, 2006.
[49] Tong Suk Kim and Edward Omberg. Dynamic nonmyopic portfolio behavior. The Review of Financial Studies, 9:141–161, 1996.
[50] Leonid Kogan and Raman Uppal. Risk aversion and optimal portfolio policies in partial and general equilibrium economies. NBER working paper, 8609, 2001.
[51] Julia Lane and Stefano Bertuzzi. Measuring the results of science investments. Science, 331:678–680, 2011.
[52] John Lintner. The aggregation of investor’s diverse judgements and preferences in purely competitive security markets. Journal of Financial and Quantitative Analysis, 4:347–400, 1969.
[53] Jun Liu and Francis A. Longstaff. Losing money on arbitrage: Optimal dynamic portfolio choice in markets with arbitrage opportunities. The Review of Financial Studies, 17, 2004.
[54] Roger Lowenstein. When Genius Failed: The Rise and Fall of Long-Term Capital Management. Random House Trade Paperbacks, 2001.
[55] H. M. Markowitz. Portfolio selection. Journal of Finance, 7(1):77–91, March 1952.
[56] F. W. McFarlane. Portfolio approach to information systems. Harvard Business Review, 59(4):142–150, 1981.
[57] M. T. McKenna, C. M. Michaud, C. J. L. Murray, and J. S. Marks. Assessing the burden of disease in the United States using disability-adjusted life years. American Journal of Preventive Medicine, 28(5):415–423, Jun 2005.
[58] Robert C. Merton. An analytic derivation of the efficient portfolio frontier. Journal of Financial and Quantitative Analysis, 7:1851–1872, 1972.
[59] Robert C. Merton. Continuous-Time Finance. Wiley-Blackwell, 1992.
[60] Attilio Meucci. Review of statistical arbitrage, cointegration and multivariate Ornstein-Uhlenbeck, 2010.
[61] Edward M. Miller. Risk, uncertainty and divergence of opinion. Journal of Finance, 32:1151–1168, 1977.
[62] F. Mosteller. Innovation and evaluation. Science, 211(4485):881–886, 1981.
[63] Kevin Murphy and Richard Topel. Diminishing returns? The costs and benefits of improving health. Perspectives in Biology and Medicine, 43(3):S108–S128, 2003.
[64] Rishi K. Narang. Inside the Black Box: A Simple Guide to Quantitative and High Frequency Trading. Wiley, 2013.
[65] National Institutes of Health. Setting Research Priorities at the National Institutes of Health. National Institutes of Health, Bethesda, MD, 1997.
[66] B. J. O’Brien and M. J. Sculpher. Building uncertainty into cost-effectiveness rankings: Portfolio risk-return tradeoffs and implications for decision rules. Medical Care, 38:460–468, 2000.
[67] M. E. Porter. What is value in health care? New England Journal of Medicine, 363(26):2477–2481, 2010.
[68] Luis Prieto and Jose A. Sacristan. Problems and solutions in calculating quality-adjusted life years (QALYs). Health and Quality of Life Outcomes, 1:80–87, 2003.
[69] S. J. Rangel, B. Efron, and R. L. Moss. Recent trends in National Institutes of Health funding of surgical research. Annals of Surgery, 236(3):277–287, 2002.
[70] R. S. Sandler, J. E. Everhart, M. Donowitz, E. Adams, K. Cronin, C. Goodman, E. Gemmen, S. Shah, A. Avdic, and R. Rubin. The burden of selected digestive diseases in the United States. Gastroenterology, 122(5):1500–1511, 2002.
[71] P. Sendi, M. J. Al, A. Gafni, and S. Birch. Optimizing a portfolio of health care programs in the presence of uncertainty and constrained resources. Social Science & Medicine, 57:2207–2215, 2003.
[72] Pedram Sendi, Maiwenn J. Al, and Frans F. H. Rutten. Portfolio theory and cost-effectiveness analysis: A further discussion. Value in Health, 7:595–601, 2004.
[73] William F. Sharpe. Capital asset prices: a theory of market equilibrium under conditions of risk. Journal of Finance, 19:425–442, 1964.
[74] Andrei Shleifer and Robert W. Vishny. The limits of arbitrage. The Journal of Finance, 52:35–55, 1997.
[75] Gilbert Strang. Computational Science and Engineering. Wellesley Cambridge Press, 2007.
[76] H. Varmus. Evaluating the burden of disease and spending the research dollars of the National Institutes of Health. New England Journal of Medicine, 340(24):1914–1915, 1999.
[77] W. H. Fleming and P. E. Souganidis. On the existence of value functions of two-player zero-sum stochastic differential games. Indiana University Mathematics Journal, 38:293–314, 1989.
[78] P. Whittle. Risk sensitive linear quadratic Gaussian control. Advances in Applied Probability, 13:776–777, 1981.
[79] P. Whittle. Risk sensitive optimal control. Wiley, 1990.
[80] P. Whittle. Optimal control: basics and beyond. Wiley, 1996.
[81] Jean-Pierre Zigrand and Jon Danielsson. What happens when you regulate risk?: evidence from a simple equilibrium model. LSE Research Online Documents on Economics, London School of Economics and Political Science, LSE Library, 2001.