Statistical Arbitrage
Pairs Trading, Long-Short Strategy
Cyrille BEN LEMRID
Credit Suisse supervisor: Frédéric PECQUEUR
Academic supervisors: Olivier GUÉANT, Simone SCOTTI
Paris Diderot University, Paris VII
October 1, 2012
Contents
1 Pairs Trading Model
1.1 General discussion
1.2 Cointegration
1.3 Spread dynamics
2 State of the art and model overview
2.1 Stochastic Dependencies in Financial Time Series
2.2 Cointegration-based trading strategies
2.3 Formulation as a Stochastic Control Problem
2.4 Fundamental analysis
3 Strategies Analysis
3.1 Road map for strategy design
3.2 Identification of potential pairs
3.3 Testing cointegration
3.4 Risk control and feasibility
4 Results
Introduction
This report presents my research work carried out at Credit Suisse from May to September
2012. This study has been pursued in collaboration with the Global Arbitrage Strategies
team.
Quantitative strategy developers use sophisticated statistical and optimization
techniques to discover and construct new algorithms. These algorithms take advantage
of short-term deviations of securities' prices from their "fair" values. Pairs trading is one such
quantitative strategy: it is a process of identifying securities that generally move together
but are currently "drifting away".
Pairs trading is a common strategy among many hedge funds and banks. However, there is
not a significant amount of academic literature devoted to it due to its proprietary nature.
For a review of some of the existing academic models, see [6], [8], [11].
Our focus in this analysis is the study of two quantitative approaches to the problem of
pairs trading. The first uses the properties of cointegrated financial time series as the basis
for a trading strategy. In the second, we model the log-relationship between a pair of stock
prices as an Ornstein-Uhlenbeck process and use it to formulate a portfolio-optimization-based
stochastic control problem.
The aim of this study is to show that, under certain assumptions, the two approaches are
equivalent.
Practitioners most often use a fundamentally driven approach, analyzing the performance
of stocks around a market event and implementing strategies using back-tested trading levels.
We also study an example of a fundamentally driven strategy, using the market reaction to a
stock being dropped from or added to the MSCI World Standard as a signal for a pairs trading
strategy on those stocks once their inclusion/exclusion has been made effective.
This report is organized as follows. Chapter 1 provides some background on the pairs trading
strategy. The theoretical results are described in Chapter 2. Chapter 3 describes our
methodology for constructing pairs and calculating returns. Finally, in Chapter 4, the
results are illustrated with numerical examples involving a real pair of stocks.
Acknowledgements
This study has been pursued in collaboration with the Global Arbitrage Strategies team at
Credit Suisse. From my first day on the Credit Suisse premises, I have been thoroughly
assisted. Successful professionals were kind enough to answer my questions and to give their
opinion on my work throughout the internship. Without their advice, this study would
not have reached its current findings.
First, I would like to thank my supervisor Mr Frédéric Pecqueur, Managing Director of the
Index Arbitrage Team. I would also like to thank Alan Duffy and Reda Ighachan, who
greatly contributed to my project and gave me valuable feedback. I am also thankful to
Andrew Coman, Alex Nevill and Karl Bogoslavski for their contributions, and to the team
in general.
CHAPTER 1
Pairs Trading Model
1.1 General discussion
The fundamental idea of pairs trading is that, knowing that a pair of financial instruments
has historically moved together and kept a specific pattern for their spread, we can take
advantage of any disturbance to this historical trend. Pairs trading involves selling the
higher-priced security and buying the lower-priced security with the idea that the mispricing
will correct itself in the future. The mutual mispricing between the two securities is captured
by the notion of spread. The greater the spread, the higher the magnitude of mispricing and
the greater the profit potential.
There are generally two types of pairs trading: statistical arbitrage convergence/divergence
trades, and fundamentally driven valuation trades. In the former, the driving force for the
trade is an aberration in the long-term spread between the two securities; to capture the
mean-reversion back to the norm, you short one and go long the other. The trick is creating
a program to find the pairs, and for the relationship to hold. The other form of pairs trading
is a more fundamentally driven variation, which is the purview of most market-neutral
hedge funds: in essence they short the most overvalued stock and go long the undervalued
stock. However, it is possible to determine that a security is overvalued or undervalued only
if we also know the true value of the security in absolute terms.
A long-short position in the two securities is constructed such that it has a negligible beta and
therefore minimal exposure to the market. Hence, the returns from the trade are uncorrelated
to market returns, a feature typical of market neutral strategies. The key to success in pairs
trading lies in the identification of security pairs.
Based on the discussion so far, the truly crucial questions are: How do you identify "stocks
that move together"? Need they be in the same industry? Should they only be liquid stocks?
How far do they have to diverge before a position is put on? When is a position unwound?
The return on both securities is expected to be close over the time frames. In other words,
the increment to the logarithm of the prices at the current time must be about the same
for both the securities at all time instances in the future. This, of course, means that the
time series of the logarithm of the two prices must move together, and the spread calculation
formula is therefore based on the difference in the logarithm of the prices. Having explained
our approach, we now need to define in precise terms what we mean when we say that the
price series or the log price series of the two securities must move together.
The idea of comovement is not captured by correlation of two time series but by the notion
of cointegration that has been well developed in the field of statistics.
1.2 Cointegration
It is well documented that the correlation is a measure of the short-term linear dependencies
(see [4], Theorem 4.5.7). In contrast to correlation, cointegration is a measure of long-term
dependencies (see [9]).
We will briefly outline the notion of cointegration. In order to do so, we need to first define
stationary and integrated time series.
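Definition 1.2.1 A process $(X_t)_{t \in \mathbb{Z}}$ with values in $\mathbb{R}$ is said to be strictly stationary iff its
finite-dimensional distributions are invariant under any time translation, i.e.
\[
\forall \tau \in \mathbb{Z},\ \forall n \in \mathbb{N}^*,\ \forall (t_1, \dots, t_n) \in \mathbb{Z}^n, \quad
(X_{t_1}, \dots, X_{t_n}) \sim (X_{t_1 - \tau}, \dots, X_{t_n - \tau}).
\]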
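Definition 1.2.2 A stochastic process $(X_t)_{t \in \mathbb{Z}}$ with values in $\mathbb{R}$ is stationary if its first and
second moments are time invariant:
– $(X_t)_{t \in \mathbb{Z}} \in L^2(\mathbb{R})$, i.e. $\forall t \in \mathbb{Z},\ E[X_t^2] < \infty$
– $\forall t \in \mathbb{Z},\ E[X_t] = E[X_0] =: \mu_X$
– $\forall s, t \in \mathbb{Z},\ \gamma_X(t, s) := \operatorname{Cov}(X_t, X_s) = \operatorname{Cov}(X_0, X_{s-t}) =: \gamma(s - t)$
Such a process is known as integrated of order 0 and denoted by I(0).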
Definition 1.2.3 A univariate process is called integrated of order d, I(d), if in its original
form it is non-stationary but becomes stationary after differencing d times.
Definition 1.2.4 If all elements of the vector $X_t = (X_t^1, \dots, X_t^N)^T$, for $t = 1, \dots, T$, are I(1),
and there exists a non-null vector $\beta = (\beta^1, \dots, \beta^N)^T$ such that $\beta^T X_t$ is I(0), then the vector
process $X_t$ is said to be cointegrated, and $\beta$ is called the cointegrating vector. For example,
two time series X and Y are cointegrated if X and Y are I(1) and there exists a scalar $\beta$ such
that $X - \beta Y$ is I(0).
The explanation for cointegration dynamics is captured by the notion of error correction.
The idea behind error correction is that cointegrated systems have a long-run equilibrium;
that is, the long-run mean of the linear combination of the two time series. If there is a
deviation from the long-run mean, then one or both time series adjust themselves to restore
the long-run equilibrium.
1.3 Spread dynamics
The purpose of this section is to demonstrate that the modeling of the spread in parametric
terms could indeed get complex. This section formulates a stochastic model associated with
a cointegration relation. We begin by describing the asset, spread, and wealth dynamics.
We assume that a risk-free asset $S_0(t)$ exists with a risk-free rate $r$ compounded continuously.
Thus, $S_0(t)$ satisfies the dynamics
\[
dS_0(t) = r S_0(t)\,dt \tag{1.1}
\]
Let $S_1(t)$ and $S_2(t)$ denote respectively the prices of the pair of stocks $S_1$ and $S_2$ at time t.
We assume that stock $S_2$ follows a geometric Brownian motion
\[
dS_2(t) = \mu S_2(t)\,dt + \sigma S_2(t)\,dW_2(t) \tag{1.2}
\]
where $\mu$ is the drift, $\sigma$ is the volatility, and $W_2(t)$ is a standard Brownian motion. Let $X(t)$
denote the spread of the two cointegrated stocks at time t, defined as
\[
X(t) = \ln(S_1(t)) - \ln(S_2(t)) \tag{1.3}
\]
Since $S_1(t)$ and $S_2(t)$ are cointegrated, $X(t)$ is stationary. We assume that the spread follows
an Ornstein-Uhlenbeck process
\[
dX(t) = k(\theta - X(t))\,dt + \eta\,dW_X(t) \tag{1.4}
\]
where $k(\theta - X(t))$ is the drift term that represents the expected instantaneous change in the
spread at time t, and $\theta$ is the long-term equilibrium level to which the spread reverts. The
rate of reversion is represented by the parameter $k$, which has to be positive to ensure stability
around the equilibrium value. The standard deviation parameter, $\eta$, determines the volatility
of the spread. $W_X(t)$ is a standard Brownian motion, and $\rho$ denotes the instantaneous
correlation coefficient between $W_X(t)$ and $W_2(t)$ (i.e. $E[dW_X(t)\,dW_2(t)] = \rho\,dt$).
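To make the mean-reversion in (1.4) concrete, the following is a minimal sketch that simulates the spread using the exact Gaussian transition of an Ornstein-Uhlenbeck process. The function name `simulate_ou`, the daily step `dt = 1/252` and the parameter values are illustrative assumptions, not values taken from this study.

```python
import math
import random


def simulate_ou(x0, k, theta, eta, dt, n_steps, seed=0):
    """Simulate dX = k (theta - X) dt + eta dW on a regular grid.

    Uses the exact conditional law of the OU process:
    X(t+dt) | X(t) ~ N(theta + (X(t) - theta) e^{-k dt},
                       eta^2 (1 - e^{-2 k dt}) / (2 k)).
    """
    rng = random.Random(seed)
    decay = math.exp(-k * dt)
    step_std = eta * math.sqrt((1.0 - decay ** 2) / (2.0 * k))
    path = [x0]
    for _ in range(n_steps):
        mean = theta + (path[-1] - theta) * decay
        path.append(mean + step_std * rng.gauss(0.0, 1.0))
    return path


# Illustrative parameters (assumptions): ten years of daily steps.
path = simulate_ou(x0=0.5, k=2.0, theta=0.0, eta=0.1, dt=1 / 252, n_steps=2520)
```

With `eta = 0` the path decays deterministically towards `theta` at rate `k`, which is a quick way to check the role of the reversion parameter.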
By using (1.2), (1.3), (1.4) and applying Itô's lemma to $S_1(t) = e^{X(t)} S_2(t)$, we obtain the
dynamics of $S_1(t)$ as
\[
dS_1(t) = \left( \mu + k(\theta - X(t)) + \tfrac{1}{2}\eta^2 + \tfrac{1}{2}\beta(\beta - 1)\sigma^2 + \rho\sigma\eta \right) S_1(t)\,dt
+ \sigma S_1(t)\,dW_2(t) + \eta S_1(t)\,dW_X(t) \tag{1.5}
\]
Let $V(t)$ be the value of a self-financing pairs-trading portfolio and let $h(t)$ and $\tilde{h}(t)$ denote
respectively the portfolio weights for stocks $S_1$ and $S_2$ at time t. Additionally, we only allow
ourselves to trade stocks $S_1$ and $S_2$ as a cointegrated pair. Thus, we require that
\[
\tilde{h}(t) = -h(t) \tag{1.6}
\]
Finally, the wealth dynamics of the self-financed portfolio are given by
\[
dV(t) = V(t) \left( h(t)\,\frac{dS_1(t)}{S_1(t)} + \tilde{h}(t)\,\frac{dS_2(t)}{S_2(t)} + \frac{dS_0(t)}{S_0(t)} \right) \tag{1.7}
\]
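Using (1.1), (1.2), (1.5) and (1.6), we can rewrite (1.7) as
\[
dV(t) = V(t) \left\{ \left[ h(t) \left( k(\theta - X(t)) + \tfrac{1}{2}\eta^2 + \tfrac{1}{2}\beta(\beta - 1)\sigma^2 + \rho\sigma\eta \right) + r \right] dt
+ h(t)\,\eta\,dW_X(t) \right\} \tag{1.8}
\]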
We see that the dynamics of the self-financed portfolio depend on the spread dynamics only;
in theory, we therefore almost entirely avoid the systematic market risk modelled here by $W_2(t)$.
CHAPTER 2
State of the art and model overview
2.1 Stochastic Dependencies in Financial Time Series
Assume now that we have $N \geq 2$ cointegrated financial assets whose log-prices are I(1)
processes. It is widely assumed that stock returns are integrated of order 0, whereas
stock prices are integrated of order 1 (see [1]).
Denote the vector of the asset prices by $S_t = (S_t^1, \dots, S_t^N)^T$.
Each of its elements can be written as
\[
S_t^i = S_0^i \exp\Big( \sum_{j=0}^{t} r_j^i \Big),
\]
where $r_t = (r_t^1, \dots, r_t^N)^T$ are the continuously compounded asset returns, and $S_0^1, \dots, S_0^N$ are
the initial prices.
Then, the log-prices can be written as
\[
\ln S_t^i = \ln S_0^i + \sum_{j=0}^{t} r_j^i.
\]
Denote the corresponding cointegrating vector by $\beta = (\beta^1, \dots, \beta^N)^T$. By the definition of
cointegration, the resulting time series
\[
X_t = \sum_{i=1}^{N} \beta^i \ln S_t^i
\]
will be stationary and integrated of order 0.
The next two propositions lead to derivations of new properties of cointegrated time series
that we later use for the construction of a new trading strategy.
Proposition 2.1.1 Assume that the log-prices of N assets, $\ln S^i$, $i = 1, \dots, N$, are cointegrated
with a cointegrating vector $\beta$. Let $X_t = \sum_{i=1}^{N} \beta^i \ln S_t^i$ be the corresponding stationary
series, and $r_t = (r_t^1, \dots, r_t^N)^T$ be the continuously compounded asset returns at time $t > 0$.
Then the process $Z_t = \sum_{i=1}^{N} \beta^i r_t^i$ has the following properties:
\[
Z_t = X_t - X_{t-1} = \sum_{i=1}^{N} \beta^i r_t^i,
\]
and, if $\lim_{p \to \infty} \operatorname{Cov}[X_t, X_{t-p}] = 0$, then
\[
\sum_{p=1}^{\infty} p \operatorname{Cov}[Z_t, Z_{t-p}] = -\operatorname{Var} X_t,
\]
where
\[
\operatorname{Cov}[Z_t, Z_{t-p}] = \sum_{i=1}^{N} \sum_{j=1}^{N} \beta^i \beta^j \operatorname{Cov}\left[ r_t^i, r_{t-p}^j \right].
\]
Proof: See [10].
Proposition 2.1.1 is a technical result. Intuitively, it shows that the variance of the
cointegration process $(X_t)$ determines the auto-covariance structure of the asset returns.
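As an illustration of the identity $\sum_{p \geq 1} p \operatorname{Cov}[Z_t, Z_{t-p}] = -\operatorname{Var} X_t$, the following sketch checks it for one special case, assumed here purely for illustration: a stationary AR(1) spread $X_t = \phi X_{t-1} + \varepsilon_t$, whose autocovariances are known in closed form. The helper names and the values of `phi` and `noise_var` are hypothetical.

```python
def ar1_autocov(phi, noise_var, lag):
    """Autocovariance of a stationary AR(1) process at the given lag."""
    return noise_var / (1.0 - phi ** 2) * phi ** abs(lag)


def diff_autocov(phi, noise_var, p):
    """Cov[Z_t, Z_{t-p}] for Z_t = X_t - X_{t-1}, from the autocovariances of X."""
    def g(lag):
        return ar1_autocov(phi, noise_var, lag)
    return 2.0 * g(p) - g(p - 1) - g(p + 1)


phi, noise_var = 0.8, 1.0                      # illustrative values (assumptions)
var_x = ar1_autocov(phi, noise_var, 0)         # Var X_t
var_z = diff_autocov(phi, noise_var, 0)        # Var Z_t
# Truncate the infinite sums; the terms decay geometrically in p.
weighted_sum = sum(p * diff_autocov(phi, noise_var, p) for p in range(1, 2000))
plain_sum = sum(diff_autocov(phi, noise_var, p) for p in range(1, 2000))

print(weighted_sum, -var_x)       # the two numbers should agree
print(var_z, -2.0 * plain_sum)    # Var Z_t versus -2 * sum of autocovariances
```

In this special case the autocovariances of the returns combination $Z_t$ at positive lags are all negative, which is the mean-reversion that the trading strategy of the next section exploits.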
Proposition 2.1.2 Assume $\ln S^i$, $i = 1, \dots, N$, are the log-prices of N assets, and
$r_t = (r_t^1, \dots, r_t^N)^T$ are the continuously compounded asset returns at time $t > 0$. For some finite
vector $\beta$, the process $X_t = \sum_{i=1}^{N} \beta^i \ln S_t^i$ is stationary with $\lim_{p \to \infty} \operatorname{Cov}[X_t, X_{t-p}] = 0$,
and therefore the time series of the assets' log-prices are cointegrated, if and only if the
process $Z_t = \sum_{i=1}^{N} \beta^i r_t^i$ satisfies:
– $E[Z_t] = 0$
– $\operatorname{Var} Z_t = -2 \sum_{p=1}^{\infty} \operatorname{Cov}[Z_t, Z_{t-p}]$
– $\sum_{p=1}^{\infty} p \operatorname{Cov}[Z_t, Z_{t-p}] < \infty$
Proof: See [10].
As a result of proposition 2.1.2, it follows that cointegration is a property related to the
1st and 2nd moments of asset returns. In previous work, cointegration was viewed as a
property of asset prices. Here cointegration is defined by the stochastic relationships among
the returns.
2.2 Cointegration-based trading strategies
Next, we introduce a trading strategy by exploiting the theoretical results derived in the
previous sections.
Summarizing the results from the two propositions: for the process
$Z_t = X_t - X_{t-1} = \sum_{i=1}^{N} \beta^i r_t^i$ we have $E[Z_t] = 0$ and
$\operatorname{Var} Z_t = -2 \sum_{p=1}^{\infty} \operatorname{Cov}[Z_t, Z_{t-p}]$. Consider a strategy where each
time period we buy $-\beta^i C \sum_{p=1}^{\infty} Z_{t-p}$ worth of stock i, $i = 1, \dots, N$, where C is a positive scale
factor. The reason for including the constant C will become clear later.
At any point in time we can compute the profit of this strategy by multiplying the next
period return by the position purchased:
\[
V_t = \sum_{i=1}^{N} -\beta^i C \left[ \sum_{p=1}^{\infty} Z_{t-p} \right] r_t^i = -C \sum_{p=1}^{\infty} Z_{t-p} Z_t
\]
Given that $E[Z_t] = 0$ and $\operatorname{Cov}[Z_t, Z_{t-p}] = E[Z_t Z_{t-p}]$, $p > 0$, the expected profit of this
strategy is:
\[
E[V_t] = E\left[ -C \sum_{p=1}^{\infty} Z_{t-p} Z_t \right] = -C \sum_{p=1}^{\infty} \operatorname{Cov}[Z_{t-p}, Z_t] = 0.5\, C \operatorname{Var} Z_t
\]
Since $\operatorname{Var} Z_t$ and C are positive, the expected profit of the proposed strategy is always positive
and proportional to the scale factor C. The reasoning behind this strategy is fairly simple.
The cointegration relations between time series imply that the time series are bound together.
Over time the time series might drift apart for a short period, but they ought to
re-converge. The term
\[
\sum_{p=1}^{\infty} Z_{t-p} = \sum_{p=1}^{\infty} (X_{t-p} - X_{t-1-p}) = X_{t-1} - \lim_{u \to -\infty} X_u = X_{t-1} - \theta
\]
measures how far they diverge, and $\operatorname{sign}\left( -\beta^i C \sum_{p=1}^{\infty} Z_{t-p} \right)$ provides the direction of the
trade for stock i. Specifically, +1 stands for a long position, whereas −1 denotes a short trade.
This strategy relies on identifying spreads that have moved apart but are expected to mean
revert in the future. In a typical pairs-trading strategy, spreads are identified by using
correlation as a similarity measure and standard deviation as a spread measure. A trade, for
example, will be put in place if the assets are highly correlated but have moved apart by more
than 3 standard deviations. The trade is unwound when the assets converge or some time limit
is reached.
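The conventional entry rule described above (trade when the spread is more than 3 standard deviations from its mean) can be sketched as follows. This is only an illustration: the function name `spread_signal`, the use of a sample mean and sample standard deviation over a history window, and the sign convention of the returned value are assumptions.

```python
from statistics import mean, stdev


def spread_signal(spread_history, current_spread, entry_z=3.0):
    """Return +1 (buy the spread), -1 (sell the spread) or 0 (no trade).

    The spread is compared with its historical mean, in units of its
    historical standard deviation (a z-score).
    """
    mu = mean(spread_history)
    sd = stdev(spread_history)
    z = (current_spread - mu) / sd
    if z > entry_z:
        return -1   # spread unusually wide: bet on it narrowing
    if z < -entry_z:
        return 1    # spread unusually low: bet on it widening
    return 0


history = [0.0, 0.1, -0.1, 0.05, -0.05]   # hypothetical past spread values
print(spread_signal(history, 0.5), spread_signal(history, -0.5), spread_signal(history, 0.1))
```

A complete rule would also need an exit condition (convergence or a time limit), which is omitted here.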
This approach uses cointegration as a measure of similarity. Cointegration is the natural
answer to the question: how do we identify assets that move together? Proposition 2.1.2 provides
the answer to the question: how far do the assets have to diverge before a trade is placed?
As a result, the decision to execute a trade is driven by the cointegration properties of the assets.
Having a positive expected profit is excellent news for any strategy. The proposed strategy
nevertheless has some shortcomings. The initial amount of money needed each period is a random
variable, and the resulting portfolio is not dollar neutral (i.e. the total dollar value of the
long position is not equal to the total dollar value of the short position). To construct a
dollar-neutral long-short portfolio, we first partition the cointegrated time series into two
sets L and S:
\[
i \in L \Leftrightarrow \beta^i \geq 0, \qquad i \in S \Leftrightarrow \beta^i < 0.
\]
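Next, depending on which set a given asset belongs to, we purchase (with $P_t^i$ the price of
asset i at time t)
\[
\frac{-\beta^i\, C \operatorname{sign}\left( \sum_{p=1}^{P} \tilde{Z}_{t-p+1} \right)}{P_t^i \sum_{j \in L} \beta^j} \quad (i \in L),
\qquad
\frac{\beta^i\, C \operatorname{sign}\left( \sum_{p=1}^{P} \tilde{Z}_{t-p+1} \right)}{P_t^i \sum_{j \in S} \beta^j} \quad (i \in S).
\]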
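The return of this modified strategy is identical to that of the strategy proposed earlier.
Hence, the expected profit of the modified strategy is also positive. Indeed, assume (without
loss of generality) that $\operatorname{sign}\left( -\beta^i C \sum_{p=1}^{\infty} Z_{t-p} \right) = 1$. The long returns $R_t^L$ and the short
returns $R_t^S$ of our original strategy are
\[
R_t^L = \frac{\sum_{i \in L} \beta^i C \sum_{p=1}^{\infty} Z_{t-p} \frac{S_{t+1}^i}{S_t^i}}{\sum_{i \in L} \beta^i C \sum_{p=1}^{\infty} Z_{t-p}} - 1
= \frac{\sum_{i \in L} \beta^i \frac{S_{t+1}^i}{S_t^i}}{\sum_{i \in L} \beta^i} - 1
\]
\[
R_t^S = 1 - \frac{\sum_{i \in S} \beta^i C \sum_{p=1}^{\infty} Z_{t-p} \frac{S_{t+1}^i}{S_t^i}}{\sum_{i \in S} \beta^i C \sum_{p=1}^{\infty} Z_{t-p}}
= 1 - \frac{\sum_{i \in S} \beta^i \frac{S_{t+1}^i}{S_t^i}}{\sum_{i \in S} \beta^i}
\]
The modified strategy has the following returns from the long and short positions:
\[
R_t^L = \frac{\sum_{i \in L} \frac{\beta^i}{\sum_{j \in L} \beta^j}\, C \operatorname{sign}\left( \sum_{p=1}^{\infty} Z_{t-p} \right) \frac{S_{t+1}^i}{S_t^i}}
{\sum_{i \in L} \frac{\beta^i}{\sum_{j \in L} \beta^j}\, C \operatorname{sign}\left( \sum_{p=1}^{\infty} Z_{t-p} \right)} - 1
= \frac{\sum_{i \in L} \beta^i \frac{S_{t+1}^i}{S_t^i}}{\sum_{i \in L} \beta^i} - 1
\]
\[
R_t^S = 1 - \frac{\sum_{i \in S} \frac{\beta^i}{\sum_{j \in S} \beta^j}\, C \operatorname{sign}\left( \sum_{p=1}^{\infty} Z_{t-p} \right) \frac{S_{t+1}^i}{S_t^i}}
{\sum_{i \in S} \frac{\beta^i}{\sum_{j \in S} \beta^j}\, C \operatorname{sign}\left( \sum_{p=1}^{\infty} Z_{t-p} \right)}
= 1 - \frac{\sum_{i \in S} \beta^i \frac{S_{t+1}^i}{S_t^i}}{\sum_{i \in S} \beta^i}
\]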
The above derivations indicate that the return of the modified strategy is the same as that of
the original one; therefore its expected profit is positive (since we proved that the expected
return of the original strategy is positive). Now we can explain why we have included the
constant C. In the modified strategy, every time period the value C is invested in both the
short and the long positions. Hence, the money needed each time period to execute the new
strategy is constant, and the portfolio we obtain is dollar neutral.
In reality, we cannot compute the true values of $\sum_{p=1}^{\infty} Z_{t-p}$ and of the cointegration vector $\beta$. We
estimate them and, with the above theoretical results in mind, we propose the following
trading strategy:
– Step 1: using historical data, estimate the cointegration vector $\beta$.
– Step 2: using the estimated cointegration vector $\tilde{\beta}$ and historical data, construct $\tilde{Z}_t$,
realizations of the process $Z_t = \sum_{i=1}^{N} \beta^i r_t^i$.
– Step 3: compute the finite sum $\sum_{p=1}^{P} \tilde{Z}_{t-p+1}$, where P is a parameter.
– Step 4: partition the assets into two sets L and S (depending on the values of $\tilde{\beta}$).
– Step 5: buy (depending on which set the asset belongs to) the following number of shares
(rounded down to get an integer number of shares):
\[
\frac{-\tilde{\beta}^i}{S_t^i \sum_{j \in L} \tilde{\beta}^j}\, C \operatorname{sign}\left( \sum_{p=1}^{P} \tilde{Z}_{t-p+1} \right) \quad (i \in L),
\qquad
\frac{\tilde{\beta}^i}{S_t^i \sum_{j \in S} \tilde{\beta}^j}\, C \operatorname{sign}\left( \sum_{p=1}^{P} \tilde{Z}_{t-p+1} \right) \quad (i \in S)
\]
– Step 6: rebalance all the open positions the following trading day.
– Step 7: update the historical data set.
– Step 8: if it is time to re-estimate the cointegration vector (which happens every 22 trading
days), go to step 1; otherwise go to step 2.
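Steps 3 to 5 can be sketched in a few lines. The sketch below is a hedged illustration, not the production procedure: the function name `target_shares` and the example numbers are hypothetical, the sign convention (negative for assets in L, positive for assets in S, both multiplied by the sign of the summed $\tilde{Z}$) is our reading of the dollar-neutral rule, and rounding is done towards zero. It also assumes that both sets L and S have a non-zero total weight.

```python
import math


def target_shares(beta, prices, z_history, capital, window):
    """Signed share counts per asset (positive = long, negative = short).

    beta      : estimated cointegration vector
    prices    : current prices, in the same order as beta
    z_history : past realizations of Z, most recent last
    capital   : scale factor C, the amount invested on each side
    window    : parameter P, number of past Z values to sum
    """
    signal = sum(z_history[-window:])                 # Step 3
    direction = (signal > 0) - (signal < 0)           # sign of the sum
    long_total = sum(b for b in beta if b >= 0)       # Step 4: set L
    short_total = sum(b for b in beta if b < 0)       # Step 4: set S
    shares = []
    for b, price in zip(beta, prices):                # Step 5
        if b >= 0:
            value = -b / long_total * capital * direction
        else:
            value = b / short_total * capital * direction
        shares.append(math.trunc(value / price))      # integer share count
    return shares


# Hypothetical two-asset example.
print(target_shares([1.0, -1.5], [20.0, 7.5], [0.01, -0.05, -0.02], 10000.0, 2))
```

Whatever the sign of the signal, the two sides have equal and opposite dollar values (up to rounding), which is the dollar-neutrality property discussed above.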
In the next section we describe the procedures used to test the strategy and present the
numerical results.
2.3 Formulation as a Stochastic Control Problem
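We recall the wealth dynamics (1.8), restated here as (2.1):
\[
dV(t) = V(t) \left\{ \left[ h(t) \left( k(\theta - X(t)) + \tfrac{1}{2}\eta^2 + \tfrac{1}{2}\beta(\beta - 1)\sigma^2 + \rho\sigma\eta \right) + r \right] dt
+ h(t)\,\eta\,dW_X(t) \right\} \tag{2.1}
\]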
We formulate the portfolio optimization pairs-trading problem as a stochastic optimal control
problem. We assume that the investor's preferences can be represented by the utility function
$U(x) = \frac{1}{\gamma} x^{\gamma}$, with $x \geq 0$ and $\gamma < 1$. In this formulation, our objective is to maximize
expected utility at the final time T. Thus, we seek to solve
\[
\sup_{h(t)} E\left[ \frac{1}{\gamma} V(T)^{\gamma} \right]
\]
subject to
\[
V(0) = v_0, \quad X(0) = x_0,
\]
\[
dX(t) = k(\theta - X(t))\,dt + \eta\,dW_X(t),
\]
\[
\frac{dV(t)}{V(t)} = \left[ h(t) \left( k(\theta - X(t)) + \tfrac{1}{2}\eta^2 + \tfrac{1}{2}\beta(\beta - 1)\sigma^2 + \rho\sigma\eta \right) + r \right] dt + h(t)\,\eta\,dW_X(t),
\]
where the supremum is taken over strategies h(t) that are adapted to the filtration generated
by WX(t) and W2(t). (For a rigorous formulation in a related setting, see [13].) In this
optimal control problem, the first constraint just specifies the initial wealth of our portfolio
and the spread. The second and third constraints describe the spread and wealth dynamics
respectively.
In the following section, we show that a closed form solution to the above stochastic control
problem exists.
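Let $G(t, v, x)$ denote the value function
\[
G(t, v, x) = \sup_{h} E_{t,x,v}\left[ V(T)^{\gamma} \right].
\]
For any strategy $h(t)$, define the Dynkin operator
\[
L^h(t, x, v) = -k(x - \theta)\,\partial_x + \left[ h k(\theta - x) + \tfrac{1}{2} h \eta^2 + \tfrac{1}{2} h \beta(\beta - 1)\sigma^2 + h\sigma\eta\rho + r \right] v\,\partial_v
+ \tfrac{1}{2} \left[ \eta^2 \partial_{xx} + 2 h \eta^2 v\,\partial_{vx} + h^2 \eta^2 v^2 \partial_{vv} \right] \tag{2.2}
\]
The HJB equation can be written using the Dynkin operator as
\[
\partial_t G + \sup_{h} \left[ L^h G \right] = 0
\]
subject to the terminal condition
\[
G(T, v, x) = v^{\gamma}.
\]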
By standard arguments, one may show that the Hamilton-Jacobi-Bellman (HJB) equation
corresponding to our stochastic control problem is
\[
G_t + \sup_{h} \Big\{ \tfrac{1}{2} \left[ h^2 \eta^2 v^2 G_{vv} + \eta^2 G_{xx} + 2 h \eta^2 v G_{vx} \right]
+ \left[ h k(\theta - x) + \tfrac{1}{2} h \eta^2 + \tfrac{1}{2} h \beta(\beta - 1)\sigma^2 + h\sigma\eta\rho + r \right] v G_v
- k(x - \theta) G_x \Big\} = 0 \tag{2.3}
\]
where the subscripts on G denote partial derivatives.
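For notational ease we let $b = k(\theta - x) + \tfrac{1}{2}\eta^2 + \rho\sigma\eta + \tfrac{1}{2}\beta(\beta - 1)\sigma^2$ and rewrite 2.3 as
\[
G_t + \sup_{h} \Big\{ \tfrac{1}{2} \left[ h^2 \eta^2 v^2 G_{vv} + \eta^2 G_{xx} + 2 h \eta^2 v G_{vx} \right]
+ [h b + r]\, v G_v - k(x - \theta) G_x \Big\} = 0 \tag{2.4}
\]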
The first order condition for the maximization in 2.4 is
\[
h^* \eta^2 v G_{vv} + \eta^2 G_{vx} + b G_v = 0 \tag{2.5}
\]
Assuming $G_{vv} < 0$, the first order condition 2.5 is also sufficient, yielding
\[
h^* = -\frac{\eta^2 G_{vx} + b G_v}{\eta^2 v G_{vv}} \tag{2.6}
\]
Plugging 2.6 back into 2.4 yields
\[
\eta^2 G_t G_{vv} - \tfrac{1}{2}\eta^4 G_{vx}^2 - \tfrac{1}{2} b^2 G_v^2 - b \eta^2 G_v G_{vx} + \tfrac{1}{2}\eta^4 G_{vv} G_{xx} + r \eta^2 v G_v G_{vv}
- k(x - \theta)\eta^2 G_x G_{vv} = 0 \tag{2.7}
\]
Thus, we must solve the partial differential equation 2.7 in order to determine an optimal
strategy.
To obtain a closed-form solution, we consider the following separation ansatz, motivated by
[13], where a different portfolio optimization problem under Vasicek [18] term structure
dynamics was solved:
\[
G(t, v, x) = f(t, x)\, v^{\gamma}
\]
with the condition that
\[
f(T, x) = 1.
\]
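For this choice of ansatz, after expanding $b$ in the cross term, 2.7 becomes
\[
(\gamma - 1)\eta^2 f f_t - \tfrac{1}{2}\gamma\eta^4 f_x^2 - \tfrac{1}{2}\gamma b^2 f^2 - \tfrac{1}{2}\gamma\eta^4 f f_x - \gamma\rho\sigma\eta^3 f f_x
- \tfrac{1}{2}\gamma\beta(\beta - 1)\sigma^2\eta^2 f f_x + \tfrac{1}{2}(\gamma - 1)\eta^4 f f_{xx}
+ \gamma(\gamma - 1) r \eta^2 f^2 + k(x - \theta)\eta^2 f f_x = 0 \tag{2.8}
\]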
We then use the following ansatz for $f(t, x)$:
\[
f(t, x) = g(t) \exp\left( x B(t) + x^2 A(t) \right)
\]
with $g(T) = 1$, $B(T) = 0$, $A(T) = 0$.
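Plugging the ansatz into 2.8 and setting the coefficient of $x^2$ to zero yields an ordinary
differential equation for $A(t)$:
\[
\left[ (\gamma - 1)\eta^2 \right] A' - \left[ 2\eta^4 \right] A^2 + \left[ 2k\eta^2 \right] A - \tfrac{1}{2}\gamma k^2 = 0 \tag{2.9}
\]
Similarly, setting the coefficient of $x$ in 2.8 to zero yields an ordinary differential equation
for $B(t)$:
\[
\left[ (\gamma - 1)\eta^2 \right] B' + \left[ k\eta^2 - 2\eta^4 A \right] B
+ \Big[ -\gamma\eta^4 A - 2\gamma\rho\sigma\eta^3 A - \gamma\beta(\beta - 1)\sigma^2\eta^2 A - 2k\theta\eta^2 A
+ \gamma k^2 \theta + \tfrac{1}{2}\gamma k \eta^2 + \gamma k \rho\sigma\eta + \tfrac{1}{2}\gamma k \beta(\beta - 1)\sigma^2 \Big] = 0
\tag{2.10}
\]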
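Noting that 2.9 is a Riccati equation for $A(t)$ and 2.10 is a first-order linear ordinary
differential equation for $B(t)$, one may obtain the solution in closed form as
\[
A(t) = \frac{k\left( 1 - \sqrt{1 - \gamma} \right)}{2\eta^2}
\left\{ 1 + \frac{2\sqrt{1 - \gamma}}{1 - \sqrt{1 - \gamma} - \left( 1 + \sqrt{1 - \gamma} \right) \exp\left( \frac{2k(T - t)}{\sqrt{1 - \gamma}} \right)} \right\} \tag{2.11}
\]
\[
B(t) = \frac{1}{2\eta^2 \left[ \left( 1 - \sqrt{1 - \gamma} \right) - \left( 1 + \sqrt{1 - \gamma} \right) \exp\left( \frac{2k(T - t)}{\sqrt{1 - \gamma}} \right) \right]}
\Bigg[ \gamma\sqrt{1 - \gamma} \left( \eta^2 + 2\rho\sigma\eta + \beta(\beta - 1)\sigma^2 \right) \left[ 1 - \exp\left( \frac{k(T - t)}{\sqrt{1 - \gamma}} \right) \right]^2
- \gamma \left( \eta^2 + 2\rho\sigma\eta + \beta(\beta - 1)\sigma^2 + 2k\theta \right) \left[ 1 - \exp\left( \frac{2k(T - t)}{\sqrt{1 - \gamma}} \right) \right] \Bigg] \tag{2.12}
\]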
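Consequently, the optimal weight $h^*(t)$ can be obtained via 2.6:
\[
h^*(t, x) = \frac{1}{1 - \gamma} \left[ B(t) + 2A(t)x - \frac{k(x - \theta)}{\eta^2} + \frac{\rho\sigma}{\eta} + \frac{\beta(\beta - 1)\sigma^2}{2\eta^2} + \frac{1}{2} \right] \tag{2.13}
\]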
With the above closed-form solution in hand, we find that, as in the previous section, the
optimal weight $h^*(t)$ is linear in x and depends on the distance $x - \theta$ between the spread
process and its long-run mean.
The term $X_t - \theta$ in the optimal weight $h^*(t)$ plays the same role as the term $\sum_{p=1}^{\infty} Z_{t-p}$ found in
the standard cointegration strategy, so both approaches are consistent.
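The closed-form solution can be evaluated numerically with a short sketch such as the one below. It assumes the structure $h^* = \frac{1}{1-\gamma}\left[ B(t) + 2A(t)x + b/\eta^2 \right]$ with $b = k(\theta - x) + m$, where `m` is our own shorthand for the constant part of the drift $b$ (for instance $\tfrac{1}{2}\eta^2 + \rho\sigma\eta$ plus any further constant term). The function names and parameter values are illustrative assumptions. In this sketch the squared bracket of $B(t)$ uses $\exp\left( k(T-t)/\sqrt{1-\gamma} \right)$, the form for which $B$ satisfies its linear ODE numerically.

```python
import math


def riccati_A(t, T, k, eta, gamma):
    """Closed-form A(t), solution of the Riccati equation with A(T) = 0."""
    s = math.sqrt(1.0 - gamma)
    u = math.exp(2.0 * k * (T - t) / s)
    return k * (1.0 - s) / (2.0 * eta ** 2) * (1.0 + 2.0 * s / (1.0 - s - (1.0 + s) * u))


def linear_B(t, T, k, theta, eta, gamma, m):
    """Closed-form B(t) with B(T) = 0; m is the constant part of the drift b."""
    s = math.sqrt(1.0 - gamma)
    v = math.exp(k * (T - t) / s)
    denom = 2.0 * eta ** 2 * ((1.0 - s) - (1.0 + s) * v ** 2)
    num = (gamma * s * 2.0 * m * (1.0 - v) ** 2
           - gamma * (2.0 * m + 2.0 * k * theta) * (1.0 - v ** 2))
    return num / denom


def optimal_weight(t, x, T, k, theta, eta, gamma, m):
    """Optimal weight h*(t, x), linear in the spread level x."""
    a = riccati_A(t, T, k, eta, gamma)
    b_t = linear_B(t, T, k, theta, eta, gamma, m)
    drift_b = k * (theta - x) + m
    return (b_t + 2.0 * a * x + drift_b / eta ** 2) / (1.0 - gamma)


# Illustrative parameters (assumptions, not calibrated values).
params = dict(T=1.0, k=1.0, theta=0.1, eta=0.2, gamma=-1.0, m=0.05)
for x in (-0.1, 0.1, 0.3):
    print(x, optimal_weight(0.5, x, **params))
```

Because the weight is linear in x with a negative slope for these parameters, the position in $S_1$ grows as the spread falls below its long-run mean and shrinks or reverses as it rises above it.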
2.4 Fundamental analysis
Fundamental analysis of a business involves analyzing its financial data to get some insight
into whether it is overvalued or undervalued. This is done by analyzing historical and present
economic data to produce a financial forecast of the business. The intrinsic value of the business
is found through a fundamental analysis, which consists of three main steps: (I) economic
analysis, (II) industry analysis and (III) company analysis. If the intrinsic value is higher
than the market price, it is recommended to buy the stock; if it is equal to the market price,
it is best to hold the shares; and if it is less than the market price, it is a selling signal.
Fundamental analysis maintains that markets may misprice an asset in the short run but
that the "correct" price will eventually be reached. Profits can be made by trading the
mispriced security and then waiting for the market to recognize its "mistake" and reprice
the security.
In this section we study an example of a fundamentally driven strategy, using the market reaction
to a stock being dropped from or added to the MSCI World Standard as a signal for a pairs trading
strategy on those stocks once their inclusion/exclusion has been made effective.
Both FTSE and MSCI have their own set of criteria for including stocks into their respective
indices. These criteria include (but are not limited to) size, liquidity, free float and trade
history. Stocks not passing through these filters are not eligible to be a part of the index.
Since so much money throughout the world is passively managed (and therefore needs to
closely replicate the performance of the benchmarked index), it is reasonable to assume that
changes in index constituents can drive huge flows in and out of the stocks in play.
The way we constructed this fundamentally driven strategy was to look to "buy" stocks that
are announced to be included in the MSCI ACWI and "sell" stocks that are announced to
be dropped from the benchmark.
Figure 2.1: MSCI Rebalance Profit and Loss
The strategy sounds simple enough to track; however, there are a few practical barriers to
testing this hypothesis that we had to consider. In an ideal world, to extract maximum
benefit from the announcement, we would like to buy (sell) the additions (deletions) on the
night the reviews are announced (or develop a strategy to pre-empt those announcements)
and exit our positions at the close of business on the day when the changes become effective.
The first drawback is that, since the reviews are not made public until the markets have
ceased trading for the day, it is impossible to take positions at the closing levels of the day
unless we know in advance which changes the announcement contains.
We now test the "index effect" strategy historically. As mentioned earlier, our
interest covers "announcement" dates as well as "effective" dates. For this analysis, we
therefore decided to go long stocks that have been announced as soon to be added to the
MSCI Europe Index and to go short stocks that have been announced as soon to be
deleted from the index.
We go long (short) at the close of the day following the index change announcement and
keep the positions until the close of the effective date. At the close of the effective date (i.e.
when all index changes take place), we square off our positions and wait for the next
quarterly review.
Figure 2.2: MSCI longshort
We see that following the announcement of its addition to the benchmark index, a stock's
performance is usually positive on an absolute and relative basis. Conversely, an announcement
of a stock's deletion from the benchmark index is usually a negative trigger, on both an absolute
and a relative basis.
In this study, this fundamentally driven strategy generated impressive returns: 12.3%
annualised with a Sharpe ratio of 0.7 (before commissions, borrowing fees and transaction costs).
CHAPTER 3
Strategies Analysis
3.1 Road map for strategy design
In this section we provide a road map for the design and analysis of the pairs trading
strategy. The steps involved are as follows:
1. Identify stock pairs that could potentially be cointegrated. This process can be based
on the stock fundamentals or alternately on a pure statistical approach based on
historical data.
2. Once the potential pairs are identified, we verify the proposed hypothesis that the
stock pairs are indeed cointegrated based on statistical evidence from historical data.
This involves determining the cointegration coefficient and examining the spread time
series to ensure that it is stationary and mean reverting.
3. We then examine the cointegrated pairs to determine the delta. A feasible delta that
can be traded on will be substantially greater than the slippage encountered due to
the bid-ask spreads in the stocks.
3.2 Identification of potential pairs
The challenge in this strategy is identifying stocks that tend to move together and therefore
make potential pairs. Our aim is to identify pairs of stocks with mean-reverting relative
prices. To find out whether the relative price of two stocks is mean-reverting, we conduct a
Dickey-Fuller test on the log-ratio of the pair.
A Dickey-Fuller test consists in determining whether the log-ratio $x_t = \log S_t^1 - \log S_t^2$ of the share
prices $S_t^1$ and $S_t^2$ is indeed stationary.
Critical values of the cointegration test depend on the number of observations, so
we have to compute our own critical values for the Dickey-Fuller test for cointegration. The
procedure is as follows (the superscript (i) indexes the simulation run):
1. We simulate two time series of T error terms $(\varepsilon_t^{(i)}, \eta_t^{(i)})$, $t = 1, \dots, T$, distributed as
two independent N(0, 1) variables, and the associated independent random walks
\[
p_t^{(i)} = p_{t-1}^{(i)} + \varepsilon_t^{(i)} \quad \text{and} \quad d_t^{(i)} = d_{t-1}^{(i)} + \eta_t^{(i)}
\]
2. We estimate by regression the relation between the two time series
\[
p_t^{(i)} = a + b\, d_t^{(i)} + z_t^{(i)}
\]
Under the null of no cointegration, the residual series $z_t^{(i)}$ should be non-stationary.
We therefore perform a standard Dickey-Fuller test on $z_t^{(i)}$.
3. We fit an AR(1) model for the residuals, under the alternative hypothesis, i.e.
\[
\Delta z_t^{(i)} = \alpha^{(i)} + \beta^{(i)} z_{t-1}^{(i)} + u_t^{(i)}
\]
and compute the t-stat for $\beta^{(i)}$, denoted $t(\beta^{(i)})$.
4. The quantiles at 10%, 5% and 1% of the distribution of $t(\beta^{(i)})$ then give the 10%, 5%
and 1% critical values for the Dickey-Fuller test for cointegration. For T = 100,
the critical values are -3.07 at 10%, -3.37 at 5% and -3.96 at 1%.
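The four steps above can be sketched with the standard library only. This is a hedged illustration: the function names, the number of simulations and the seed are assumptions, the Dickey-Fuller regression is written on first differences with an intercept, and the quantiles obtained from a modest number of runs are only rough estimates of the critical values.

```python
import random


def ols(y, x):
    """Simple regression y = a + b x + e; returns (a, b, residuals, t-stat of b)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)          # residual variance
    return a, b, resid, b / (s2 / sxx) ** 0.5


def df_stat_no_cointegration(T, rng):
    """One draw of the Dickey-Fuller t-stat for two independent random walks."""
    p = d = 0.0
    ps, ds = [], []
    for _ in range(T):                                # step 1: two random walks
        p += rng.gauss(0.0, 1.0)
        d += rng.gauss(0.0, 1.0)
        ps.append(p)
        ds.append(d)
    _, _, z, _ = ols(ps, ds)                          # step 2: residuals z_t
    dz = [z[t] - z[t - 1] for t in range(1, T)]
    return ols(dz, z[:-1])[3]                         # step 3: t-stat on z_{t-1}


def critical_values(T=100, n_sims=2000, seed=0):
    """Step 4: empirical 10%, 5% and 1% quantiles of the simulated t-stats."""
    rng = random.Random(seed)
    stats = sorted(df_stat_no_cointegration(T, rng) for _ in range(n_sims))
    return {q: stats[int(q * n_sims)] for q in (0.10, 0.05, 0.01)}
```

Calling `critical_values(T=100)` returns a dictionary keyed by the quantile level; increasing `n_sims` reduces the Monte Carlo noise in the estimates at the cost of run time.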
3.3 Testing cointegration
We now test whether the stock prices are cointegrated. We proceed as follows.
1. Estimate the regression for the pair $s_t^1 = \log S_t^1$, $s_t^2 = \log S_t^2$:
\[
s_t^1 = a + b\, s_t^2 + x_t
\]
2. Use the Dickey-Fuller test for testing the null of a unit root in $x_t$. So, estimate the
regression
\[
\Delta x_t = \alpha + \beta x_{t-1} + u_t
\]
and test the null hypothesis $H_0: \beta = 0$ using the corresponding t-stat. Use the critical
values computed previously.
In other words, we are regressing $\Delta x_t$ on lagged values of $x_t$. The null hypothesis is that $\beta = 0$,
which means that the process is not mean reverting. If the null hypothesis can be rejected at
the 99% confidence level, the price ratio follows a weakly stationary process and is thereby
mean-reverting. Research has shown that if the confidence level is relaxed, the pairs do not
mean-revert well enough to generate satisfactory returns. This implies that a very large
number of regressions must be run to identify the pairs: with 200 stocks, we would have
to run 200 × 199 / 2 = 19,900 regressions, which makes this quite time consuming.
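The size of the search can be made explicit with a short sketch. It also illustrates one common way to shrink it, namely only testing pairs within the same sector; the universe, the ticker names and the sector labels below are entirely hypothetical.

```python
from itertools import combinations
from math import comb


def candidate_pairs(sector_of):
    """All unordered pairs of tickers that share a sector."""
    return [(a, b) for a, b in combinations(sorted(sector_of), 2)
            if sector_of[a] == sector_of[b]]


# Hypothetical universe: 200 tickers spread evenly over 10 sectors.
universe = {f"STOCK{i:03d}": f"SECTOR{i % 10}" for i in range(200)}

all_pairs = comb(len(universe), 2)       # every unordered pair
same_sector = candidate_pairs(universe)  # pairs restricted to one sector

print(all_pairs, len(same_sector))
```

Each remaining pair would then go through the two-step regression test described above.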
3.4 Risk control and feasibility
As already mentioned, in theory this strategy almost entirely avoids the
systematic market risk. The reason there is still some market risk exposure is that a minor
beta spread is allowed for. Industry risk can also be eliminated if we invest in pairs
belonging to the same industry.
The main risk we are being exposed to is then the risk of stock specific events, that is the risk
of fundamental changes implying that the prices may never mean revert again, or at least
not within the holding period. In order to control for this risk we use the rules of stop-loss
and maximum holding period.
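As an illustration of how the two rules can be combined, consider the sketch below. The thresholds (a 2% stop-loss and a 120-bar maximum holding period) and the function name are hypothetical; the report does not specify the values actually used.

```python
def exit_reason(pnl, bars_held, stop_loss=-0.02, max_bars=120):
    """Return why an open pair trade should be closed, or None to keep it open.

    pnl       : running return of the trade since entry
    bars_held : number of time steps since entry
    stop_loss, max_bars : hypothetical thresholds, for illustration only
    """
    if pnl <= stop_loss:
        return "stop_loss"
    if bars_held >= max_bars:
        return "max_holding_period"
    return None


print(exit_reason(-0.03, 10), exit_reason(0.01, 120), exit_reason(0.01, 10))
```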
We now study a simple trading strategy to assess the feasibility of such trades. Since stocks have a bid-ask spread, we incur trading slippage every time a trade is executed. Reducing the trading frequency reduces the effect of this slippage. Let us therefore consider the strategy in which trades are put on and unwound on a deviation of Δ in either direction from the long-run equilibrium μ. We buy the portfolio (long S1 and short S2) when the time series is Δ below the mean and sell the portfolio (sell S1 and buy S2) when the time series is Δ above the mean in i time steps.
The profit on the trade is the incremental change in the spread, 2Δ. Consider two stocks S1 and S2 that are cointegrated with the following data:
– Cointegration ratio γ = 1.5
– Delta used for trade signal Δ = 0.045
– Bid price of S1 at time t = $19.50
– Ask price of S2 at time t = $7.46
– Ask price of S1 at time t + i = $20.10
– Bid price of S2 at time t + i = $7.17
– Average bid-ask spread for S1 = 0.0005 (5 basis points)
– Average bid-ask spread for S2 = 0.0010 (10 basis points)
We first examine whether trading is feasible given the average bid-ask spreads.
– Average trading slippage = 0.0005 + 1.5 * 0.0010 = 0.002 (20 basis points)
This is smaller than the delta value of 0.045, so trading is feasible. At time t, buy shares of S1 and short shares of S2 in the ratio 1:1.5.
– Spread at time t = log(19.50) - 1.5 * log(7.46) ≈ -0.044
At time t + i, sell the shares of S1 and buy back the shares of S2.
– Spread at time t + i = log(20.10) - 1.5 * log(7.17) ≈ 0.046
– Total return = return on S1 + γ * return on the short position in S2
  = log(20.10) - log(19.50) + 1.5 * (log(7.46) - log(7.17))
  ≈ 0.03 + 1.5 * 0.04
  = 0.09 (9 percent)
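The arithmetic of this example can be checked with a few lines of Python. The variable names are ours, and log denotes the natural logarithm, which is the convention that reproduces the figures above.

```python
from math import log

ratio = 1.5                            # cointegration ratio
delta = 0.045                          # trade signal
bid_s1_t, ask_s2_t = 19.50, 7.46       # quotes at time t
ask_s1_ti, bid_s2_ti = 20.10, 7.17     # quotes at time t + i
ba_s1, ba_s2 = 0.0005, 0.0010          # average bid-ask spreads (5 and 10 bp)

slippage = ba_s1 + ratio * ba_s2       # 0.002, i.e. 20 basis points
feasible = slippage < delta

spread_t = log(bid_s1_t) - ratio * log(ask_s2_t)       # about -0.044
spread_ti = log(ask_s1_ti) - ratio * log(bid_s2_ti)    # about +0.046

# Return on the long S1 leg plus ratio times the return on the short S2 leg.
total_return = (log(ask_s1_ti) - log(bid_s1_t)) \
    + ratio * (log(ask_s2_t) - log(bid_s2_ti))         # about 0.09

print(slippage, feasible, spread_t, spread_ti, total_return)
```

The total return equals the change in the spread between t and t + i, which is why it is close to 2Δ = 0.09.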
CHAPTER 4
Results
We provide an example to illustrate the stochastic control trading strategy. On September 17, 2012, we collected minute-by-minute data on two stocks traded on the New York Stock Exchange: Goldman Sachs Group (ticker symbol GS) and JPMorgan Chase and Company (ticker symbol JPM). This gives a two-dimensional time series with 444 data points. We test for cointegration and estimate the parameters of our cointegration model.
Via Monte Carlo simulation with T = 444 and N = 10000, we obtain the following empirical distribution of $t(\beta)$.
Mean      Std      Skewness   Kurtosis   q10%      q5%       q1%
-2.0322   0.8159   0.21331    3.5245     -3.0345   -3.3405   -3.9153
Table 4.1: Statistics of $t(\beta)$ for the pair JP/GS
For T = 444, the histogram of $t(\beta)$ is shown in Figure 4.1.
The critical values of the Dickey-Fuller test for cointegration are the 10%, 5% and 1% quantiles of this empirical distribution: q10% = -3.0345, q5% = -3.3405 and q1% = -3.9153.
$\alpha$   $\beta$   $t(\beta)$
-0.3485    1.3739    -3.9247
Table 4.2: Results of the regression log(GS) = $\alpha$ + $\beta$ * log(JP)
First, since $t(\beta) = -3.9247 < q_{1\%} = -3.9153$, where $t(\beta)$ is the Dickey-Fuller statistic computed on the residuals of this regression, we reject the null of no cointegration, although narrowly: the residuals can be taken to be stationary and the data are cointegrated at the 99% confidence level. Indeed, the ACF of the spread $X_t$ (Figure 4.2) is typical of an AR process.
Figure 4.1: Empirical distribution of $t(\beta)$
Figure 4.2: Correlogram of Residual Spread
A plot of the two cointegrated series is shown in Figure 4.3.
We now estimate the mean-reversion behaviour of the pair GS/JP. First, the estimated coefficients are significant, supporting the Vasicek model of mean reversion in the residual spread. Table 4.3 reports the estimation results.
$\theta$   $k$      $t(\eta)$
-0.0009    6.3665   0.0433
Table 4.3: Estimation of $dX_t = k(\theta - X_t)\,dt + \eta\,dW_X(t)$
Figure 4.3: GS/JP cointegrated Time Series
Second, the level of mean reversion is strong, as reflected by the large value of $k$, around 6.4. This is also visible in the graphs, where the estimated state quickly reverts to its mean. The implication is twofold. On the one hand, mean reversion is ample, hence the non-convergence risk is mitigated. On the other hand, it may be too strong, so that profit opportunities vanish quickly for the selected pair. Third, the estimate of $\theta$ is not zero, albeit close to zero. This suggests that some residual risk remains over and above the beta risk.
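One common way to estimate the parameters of the Ornstein-Uhlenbeck spread is to fit an AR(1) regression to the sampled spread and map its coefficients back through the exact discretisation $X_{t+\Delta t} = \theta(1-e^{-k\Delta t}) + e^{-k\Delta t}X_t + \varepsilon_t$, with $\mathrm{Var}(\varepsilon_t) = \eta^2(1-e^{-2k\Delta t})/(2k)$. The sketch below assumes this approach, which may differ from the estimator used for Table 4.3; the function name, the time step and the simulated parameter values are illustrative choices of ours.

```python
import numpy as np


def fit_ou(x, dt):
    """Estimate (theta, k, eta) of dX = k (theta - X) dt + eta dW from samples x.

    Fits x[t+1] = a + b x[t] + eps by OLS, then uses b = exp(-k dt),
    a = theta (1 - b) and Var(eps) = eta^2 (1 - b^2) / (2 k).
    """
    X = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    (a, b), *_ = np.linalg.lstsq(X, x[1:], rcond=None)
    resid = x[1:] - (a + b * x[:-1])
    k = -np.log(b) / dt
    theta = a / (1.0 - b)
    eta = resid.std(ddof=2) * np.sqrt(2.0 * k / (1.0 - b ** 2))
    return theta, k, eta


# Simulated spread with known (hypothetical) parameters, exact discretisation.
rng = np.random.default_rng(2)
k_true, theta_true, eta_true, dt, n = 5.0, 0.1, 0.2, 0.01, 20000
b_true = np.exp(-k_true * dt)
sd_eps = eta_true * np.sqrt((1.0 - b_true ** 2) / (2.0 * k_true))
eps = sd_eps * rng.standard_normal(n)
x = np.empty(n)
x[0] = theta_true
for t in range(1, n):
    x[t] = theta_true + b_true * (x[t - 1] - theta_true) + eps[t]

theta_hat, k_hat, eta_hat = fit_ou(x, dt)
print(theta_hat, k_hat, eta_hat)
```

The value of k depends on the time unit chosen for dt, so estimates are only comparable across studies once the unit is fixed.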
Figure 4.4 plots the estimated AR(1) residual spread as implied by the observed return differential.
Figure 4.4: Estimation of Residual Spread
For illustration, we plot the stock prices and the optimal policies over a whole trading day. More precisely, Figure 4.5 shows the stock prices $S_1$, $S_2$, as well as the optimal policies $\pi_1 = h^*$ and $\pi_2 = -\beta\, h^*$.
Figure 4.6 then presents the cumulative profit and loss.
Figure 4.5: Stocks and optimal policies
Figure 4.6: Profit and Loss processes
As expected in a pairs trading setting, the controls are opposite in sign. We also notice that the positions, which are large during the first half of the day, are both progressively unwound in the second half, ending close to 0 by the end of the trading day. For this data set, a significant profit is realized almost instantly because of the leverage. The profit then fluctuates throughout the day but remains strongly positive, and by the end of the day it is approximately $1,348. Repeating this strategy on a daily basis gives a Sharpe ratio of 1.15 and an annual return of 15.48% (Table 4.4). Of course, these figures do not take into account the cost of borrowing or transaction costs, which are both assumed to be 0 in this model. This is unrealistic: in practice those costs can wipe out all the gains, since the pair is reweighted every minute. One way to limit the slippage is to enter the trade only when the spread is wide enough, for example around the earnings announcement of one of the constituents of the pair.
Annual Return    15.48%
Standard Dev.    13.46%
Sharpe Ratio     1.15
Sortino Ratio    0.89
Table 4.4: Performance Measures
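For reference, performance measures of this kind can be computed from a series of daily returns as sketched below. The conventions are assumptions of ours (252 trading days per year, zero risk-free rate, downside deviation measured against a zero target) and need not match those behind Table 4.4; in particular the Sortino ratio depends on the target chosen. The sample returns are made up for the example.

```python
import numpy as np


def performance(daily_returns, periods=252):
    """Annualised return, volatility, Sharpe and Sortino ratios.

    Assumptions (ours): zero risk-free rate, `periods` trading days per year,
    downside deviation computed against a target return of zero.
    """
    r = np.asarray(daily_returns, dtype=float)
    ann_return = r.mean() * periods
    ann_vol = r.std(ddof=1) * np.sqrt(periods)
    downside = np.sqrt(np.mean(np.minimum(r, 0.0) ** 2)) * np.sqrt(periods)
    return ann_return, ann_vol, ann_return / ann_vol, ann_return / downside


# Made-up daily returns, for illustration only.
ann_return, ann_vol, sharpe, sortino = performance([0.01, -0.01, 0.02, 0.0])

# Consistency check on the reported figures: annual return / standard deviation.
sharpe_reported = 0.1548 / 0.1346

print(ann_return, ann_vol, sharpe, sortino, sharpe_reported)
```

The ratio of the reported annual return to the reported standard deviation is consistent with the Sharpe ratio of 1.15 in Table 4.4.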
Conclusion and outlook
In this report, we studied two quantitative approaches to the problem of pairs trading. The first formulates the optimal trading of pairs as a stochastic control problem, for which we were able to derive a closed-form solution. The second relies on the properties of cointegrated financial time series to construct a strategy with a theoretically positive expected return. The aim of this study was to show that the two approaches are equivalent, in the sense that the portfolio weight in both cases depends only on the distance between the difference in the logarithms of the prices and its long-run mean.
The applicability of the method is illustrated with minute-by-minute historical stock data. In the model, the two stock processes are cointegrated, correlated and have constant volatility, and we ignore the costs associated with trading. The simplicity of the present formulation enables a feasible implementation of parameter calibration and the derivation of analytical formulae for the optimal trading strategies.
Further work could address the slippage issue. One solution could be to enter the trade only when the spread is wide enough, and possibly to combine this strategy with a fundamentally driven one using the MSCI rebalance signal.
Bibliography
[1] Alexander, C., Giblin, I. and W. Weddington, Cointegration and asset allocation: A new active hedge fund strategy, ISMA Centre Discussion Papers in Finance Series.
[2] Brockwell, P. J. and R. A. Davis, Introduction to Time Series and Forecasting, second edition, Springer-Verlag, New York, 2002.
[3] Brown, D. and R. Jennings, On technical analysis, The Review of Financial Studies, 527-551, 1989.
[4] Casella, G. and R. L. Berger, Statistical inference, 2012.
[5] Chen, Z. and P. Knez, Measurement of market integration and arbitrage, The Review of Financial Studies 8, 287-325, 1995.
[6] Do, B., Faff, R. and K. Hamza, A new approach to modeling and estimation for pairs trading, in Proceedings of the 2006 Financial Management Association European Conference, Stockholm, June 2006.
[7] Duan, J. C. and S. Pliska, Option valuation with co-integrated asset prices, Journal of Economic Dynamics and Control, 754, 2004.
[8] Elliott, R. J., van der Hoek, J. and W. P. Malcolm, Pairs trading, Quantitative Finance, 271-276, 2005.
[9] Engle, R. F. and C. W. J. Granger, Long-run economic relationships: Readings in cointegration, Oxford University Press, 1991.
[10] Galenko, A., Popova, E. and I. Popova, Trading in the Presence of Cointegration, Operations Research and Industrial Engineering, 78712.
[11] Gatev, E., Goetzmann, W. N. and K. G. Rouwenhorst, Pairs Trading: Performance of a Relative-Value Arbitrage Rule, Review of Financial Studies, 797-827, 2006.
[12] Johansen, S., Likelihood-based inference in cointegrated vector autoregressive models, Oxford University Press, 1995.
[13] Korn, R. and H. Kraft, A Stochastic Control Approach to Portfolio Problems with Stochastic Interest Rates, SIAM Journal on Control and Optimization, 1250-1269, 2002.
[14] Mudchanatongsuk, S., Primbs, J. A. and W. Wong, Optimal pairs trading: A stochastic control approach, Proceedings of the American Control Conference, 1035-1039, 2008.
[15] Dos Passos, W., Numerical methods, algorithms, and tools in C#, CRC Press, 2010.
[16] Phillips, P. C. B. and S. Ouliaris, Asymptotic properties of residual based tests for cointegration, Econometrica, 165-193, 1990.
[17] Tsay, R. S., Analysis of financial time series, Wiley, 2005.
[18] Vasicek, O., An Equilibrium Characterization of the Term Structure, Journal of Financial Economics, 177-188, 1977.

Statistical Arbitrage Pairs Trading, Long-Short Strategy

  • 1.
    Statistical Arbitrage Pairs Trading,Long-Short Strategy Cyrille BEN LEMRID Credit Suisse supervisor : Fr´ed´eric PECQUEUR Academic supervisors : Olivier GU´EANT, Simone SCOTTI Paris Diderot University, Paris VII October 1, 2012
  • 2.
    Contents 1 Pairs TradingModel 5 1.1 General discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.2 Cointegration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.3 Spread dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2 State of the art and model overview 9 2.1 Stochastic Dependencies in Financial Time Series . . . . . . . . . . . . . . . 9 2.2 Cointegration-based trading strategies . . . . . . . . . . . . . . . . . . . . . 10 2.3 Formulation as a Stochastic Control Problem . . . . . . . . . . . . . . . . . . 13 2.4 Fundamental analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 3 Strategies Analysis 19 3.1 Road map for strategy design . . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.2 Identification of potential pairs . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.3 Testing cointegration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 3.4 Risk control and feasibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 4 Results 22 2
  • 3.
    Introduction This report presentsmy research work carried out at Credit Suisse from May to September 2012. This study has been pursued in collaboration with the Global Arbitrage Strategies team. Quantitative analysis strategy developers use sophisticated statistical and optimization techniques to discover and construct new algorithms. These algorithms take advantage of the short term deviation from the ”fair” securities’ prices. Pairs trading is one such quantitative strategy - it is a process of identifying securities that generally move together but are currently ”drifting away”. Pairs trading is a common strategy among many hedge funds and banks. However, there is not a significant amount of academic literature devoted to it due to its proprietary nature. For a review of some of the existing academic models, see [6], [8], [11] . Our focus for this analysis is the study of two quantitative approaches to the problem of pairs trading, the first one uses the properties of co-integrated financial time series as a basis for trading strategy, in the second one we model the log-relationship between a pair of stock prices as an Ornstein-Uhlenbeck process and use this to formulate a portfolio optimization based stochastic control problem. This study was performed to show that under certain assumptions the two approaches are equivalent. Practitioners most often use a fundamentally driven approach, analyzing the performance of stocks around a market event and implement strategies using back-tested trading levels. We also study an example of a fundamentally driven strategy, using market reaction to a stock being dropped or added to the MSCI World Standard, as a signal for a pair trading strategy on those stocks once their inclusion/exclusion has been made e↵ective. This report is organized as follows. Section 1 provides some background on pairs trading strategy. The theoretical results are described in Section 2. Section 3 describes our 3
  • 4.
    methodology of constructingpairs and calculating returns. And finally in section 4, the results are illustrated with numerical examples involving a real pair of stocks. Acknowledgements This study has been pursued in collaboration with the Global Arbitrage Strategies team at Credit Suisse. Since my first arrival in the locals of Credit Suisse, I have been thoroughly assisted. Successful professionals were kind enough to answer my questions and to give their opinion on my work during the whole internship. Without their advices, this study would not have achieved its current findings. First, I would like to thank my supervisor Mr Fr´ed´eric Pecqueur, Managing Director of the Index Arbitrage Team. I would also like to thank Alan Du↵y and Reda Ighachan, they greatly contributed to my project and gave me valuable feedbacks. I am also thankful to Andrew Coman, Alex Nevill and Karl Bogoslavski for their contributions, and to the team in general. 4
  • 5.
    CHAPTER 1 Pairs TradingModel 1.1 General discussion The fundamental idea of pair trading is that knowing that a pair of financial instruments has historically moved together and kept a specific pattern for their spread, we could take advantage of any disturbance over this historic trend. Pairs trading involves selling the higher-priced security and buying the lower-priced security with the idea that the mispricing will correct itself in the future. The mutual mispricing between the two securities is captured by the notion of spread. The greater the spread, the higher the magnitude of mispricing and greater the profit potential. There are generally two types of pairs trading: statistical arbitrage convergence/divergence trades, and fundamentally-driven valuation trades. In the former, the driving force for the trade is a aberration in the long-term spread between the two securities, and to realize the mean-reversion back to the norm, you short one and go long the other. The trick is creating a program to find the pairs, and for the relationship to hold. The other form of pairs trading would be more fundamentally-driven variation, which is the purvey of most market-neutral hedge funds: in essence they short the most overvalued stock and go long the undervalued stock. However, it is possible to determine that a security is overvalued or undervalued only if we also know the true value of the security in absolute terms. A long-short position in the two securities is constructed such that it has a negligible beta and therefore minimal exposure to the market. Hence, the returns from the trade are uncorrelated to market returns, a feature typical of market neutral strategies. The key to success in pairs trading lies in the identification of security pairs. Based on the discussion so far, the truly crucial questions are : How do you identify ”stocks that move together?” Need they be in the same industry? Should they only be liquid stocks? 5
  • 6.
    How far dothey have to diverge before a position is put on? When is a position unwound? The return on both securities is expected to be close over the time frames. In other words, the increment to the logarithm of the prices at the current time must be about the same for both the securities at all time instances in the future. This, of course, means that the time series of the logarithm of the two prices must move together, and the spread calculation formula is therefore based on the di↵erence in the logarithm of the prices. Having explained our approach, we now need to define in precise terms what we mean when we say that the price series or the log price series of the two securities must move together. The idea of comovement is not captured by correlation of two time series but by the notion of cointegration that has been well developed in the field of statistics. 1.2 Cointegration It is well documented that the correlation is a measure of the short-term linear dependencies (see [4], Theorem 4.5.7). In contrast to correlation, cointegration is a measure of long-term dependencies (see [9]). We will briefly outline the notion of cointegration. In order to do so, we need to first define stationary and integrated time series. Definition 1.2.1 Xt 2 R, t 2 Z is said strictly stationary i↵ its finite dimensional distributions are invariant under any time translation, i.e: 8⌧ 2 Z,8n 2 N⇤ , 8(t1, .., tn) 2 Zn , (Xt1 , .., Xtn ) ⇠ (Xt1 ⌧ , .., Xtn ⌧ ) Definition 1.2.2 A stochastic process Xt 2 R, t 2 Z is stationary if its first and second moments are time invariant:: – (Xt)t2Z 2 L2 (R), i.e 8t 2 Z, E [X2 t ] < 1 – 8t 2 Z, E [Xt] = E [X0] := µX – 8s, t 2 Z, X(t, s) := Cov (Xt, Xs) = Cov (X0, Xs t) =: (s t) Such a process is known as integrated of order 0 and denoted by I(0). Definition 1.2.3 Univariate process is called integrated of order d, I(d), if in its original form it is non-stationary but becomes stationary after di↵erencing d times. 
Definition 1.2.4 If all elements of the vector Xt = (X1 t , ..., XN t ) T for t = 1, ..., N, are I(1), and there exists a non-null vector = ( 1 , ..., N ) T such that T X is I(0), then the vector process Xt is said to be cointegrated, and b is called the cointegrating vector. For example, two time series X and Y are cointegrated if X, Y are I(1), and there exists a scalar such that X ⇤ Y is I(0). 6
  • 7.
    The explanation forcointegration dynamics is captured by the notion of error correction. The idea behind error correction is that cointegrated systems have a long-run equilibrium; that is, the long-run mean of the linear combination of the two time series. If there is a deviation from the long-run mean, then one or both time series adjust themselves to restore the long-run equilibrium. 1.3 Spread dynamics The purpose of this section is to demonstrate that the modeling of the spread in parametric terms could indeed get complex. This section formulates a stochastic model associated with a cointegration relation. We begin by describing the asset, spread, and wealth dynamics. We assume that a risk-free asset S0(t) exists with a riskfree rate of r compounded continu- ously. Thus, S0(t) satisfies the dynamics dS0(t) = rS0(t)dt (1.1) Let S1(t) and S2(t) denote respectively the prices of the pair of stocks S1 and S2 at time t. We assume that stock S2 follows a geometric Brownian motion dS2(t) = µS2(t)dt + S2(t)dW2(t) (1.2) where µ is the drift, is the volatility, and W2(t) is a standard Brownian motion. Let X(t) denote the spread of the two co-integrated stocks at time t, defined as X(t) = ln(S1(t)) ln(S2(t)) (1.3) Since S1(t) and S2(t) are co-integrated X(t) is stationary. We assume that the spread follows an Ornstein-Uhlenbeck process dX(t) = k(✓ X(t))dt + ⌘dWX(t) (1.4) where k(✓ X(t)) is the drift term that represents the expected instantaneous change in the spread at time t, and ✓ is the long-term equilibrium level to which the spread reverts. The rate of reversion is represented by the parameter k, which has to be positive to ensure stability around the equilibrium value. The standard deviation parameter, ⌘, determines the volatility of the spread. WX(t) is a standard Brownian motion where denotes the instantaneous correlation coe cient between WX(t) and W2(t) (i.e. E [dWX(t)dW2(t)] = ⇢dt). 
By using (1.2), (1.3), (1.4) and applying Ito’s lemma to S1(t) = eX(t) S2(t) , we are able to obtain the dynamics of S1(t) as 7
  • 8.
    dS1(t) = ✓ µ +k(✓ X(t)) + 1 2 ⌘2 + 1 2 ( 1) 2 + ⇢ ⌘ ◆ S1(t) + S1(t)dW2(t) + ⌘S1(t)dWX(t) (1.5) Let V (t) be the value of a self-financing pairs-trading portfolio and let h(t) and ˜h(t) denote respectively the portfolio weights for stocks S1 and S2 at time t. Additionally, we only allow ourselves to trade stocks S1 and S2 as a cointegrated pair. Thus, we require that ˜h(t) = h(t) (1.6) Finally, the wealth dynamics of the self-financed portfolio value is given by dV (t) = V (t) ✓ h(t) dS1(t) S1(t) + ˜h(t) dS2(t) S2(t) + dS0(t) S0(t) ◆ (1.7) Using (1.1), (1.2), (2.2) and (1.6), we can rewrite (1.7) as dV (t) = V (t) ⇢ h(t) ✓ k(✓ X(t)) + 1 2 ⌘2 + 1 2 ( 1) 2 + ⌘⇢ ◆ + r dt + ⌘dWX(t) (1.8) We see that the self-financed portfolio dynamics depends on the spread dynamics only, thus in theory we do almost totally avoid the systematic market risk modelled here by W2(t). 8
  • 9.
    CHAPTER 2 State ofthe art and model overview 2.1 Stochastic Dependencies in Financial Time Series Assume now that we have N 2 cointegrated financial assets, and their log-prices are I(1) processes. It is widely assumed that stock returns are integrated of order 0, whereas the stock prices are integrated of order 1 (see [1]). Denote the vector of the asset prices by St = (S1 t , ..., SN t )T Each of its elements can be written as Si t = Si 0e tP j=0 ri j where rt = (r1 t , ..., rN t )T are the continuously compounded asset returns, and S1 0 , ..., SN 0 are the initial prices. Then, the log-prices can be written as ln Si t = ln Si 0 + tX j=0 ri j Denote the corresponding cointegrating vector by = ( 1 , ..., N ) T . By the definition of cointegration, the resulting time series Xt Xt = NX i=1 i ln Si t 9
  • 10.
    will be stationaryand integrated of order 0. The next two propositions lead to derivations of new properties of cointegrated time series that we later use for the construction of a new trading strategy. Proposition 2.1.1 Assume that the log prices of N assets, lnSi , i = 1, ..., N, are cointe- grated with a cointegrating vector . Let Xt = NP i=1 i ln Si t be the corresponding stationary series, and rt = (r1 t , ..., rN t )T be the continuously compounded asset returns at time t ¿ 0. Define Zt = NP i=1 bi ri t has the following three properties: Zt = Xt Xt 1 = NP i=1 i ri t. If limp!1 Cov [Xt, Xt p] = 0 , then 1P p=1 p Cov [Zt, Zt p] = Var Xt where Cov [Zt, Zt p] = NP i=1 NP j=1 bi bj Cov ⇥ ri t, rj t p ⇤ Proof : See [10]. Proposition 2.1.1 is a technical result. Intuitively, it shows that the variance of the cointegration process (Xt) inadvertently defines the auto-covariance of the asset returns. Proposition 2.1.2 Assume lnSi , i = 1, ..., N are the log-prices of N assets, and rt = (r1 t , ..., rN t )T are the continuously compounded asset returns at time t ¿ 0. For some finite vector the process limp!1 Cov [Xt, Xt p] = 0 is stationary, and therefore the time series of the assets’ log-prices are cointegrated, if and only if the process Zt = NP i=1 i ri t. – E [Zt] = 0 – var Zt = 2 1P p=1 Cov [Zt, Zt p] – 1P p=1 p Cov [Zt, Zt p] < 1 Proof : See [10]. As a result of proposition 2.1.2, it follows that cointegration is a property related to the 1st and 2nd moments of asset returns. In previous work, cointegration was viewed as a property of asset prices. Here cointegration is defined by the stochastic relationships among the returns. 2.2 Cointegration-based trading strategies Next, we introduce a trading strategy by exploiting the theoretical results derived in the previous sections. 10
  • 11.
    Summarizing the resultsfrom the two propositions: for process Zt = Xt Xt 1 = NP i=1 bi ri t we have that E [Zt] = 0 and Var Zt = 2 1P p=1 Cov [Zt, Zt p]. Consider a strategy where each time period we buy i C 1P p=1 Zt p value of stock i, i = 1, ..., N with C is a positive scale factor. The reason for which we include constant C will become clear later. At any point in time we can compute the profit of this strategy by multiplying the next period return by the shares purchased: Vt = NX i=1 i C " 1X p=1 Zt p # ri t = C 1X p=1 Zt pZt Given that E[Zt] = 0 and Cov[Zt, Zt p] = E[ZtZt p],p > 0, the expected profit of this strategy is: E [Vt] = E " C 1X p=1 Zt pZt # = C 1X p=1 Cov [Zt p, Zt] = 0.5C Var Zt Since Var Zt and C are positive, the expected profit of the proposed strategy is always positive and proportional to the scale factor C. The reasoning behind this strategy is fairly simple. The cointegration relations between time series imply that the time series are bound together. Over time the time series might drift apart for a short period of time, but they ought to re-converge. The term 1P p=1 Zt p = 1P p=1 (Xt p Xt 1 p) = Xt lim u!1 Xu = Xt ✓ measures how far they diverge, and sign ✓ i C 1P p=1 Zt p ◆ provides the direction of the trade for stock i. Specifically, +1 stands for a long position, whereas -1 denotes a short trade. This strategy relies on identifying spreads that have gone apart but are expected to mean revert in the future. The spreads of typical pairs-trading strategy get identified by using correlation as a similarity measure and standard deviation as a spread measure. A trade, for example, will be put in place if the assets are highly correlated but have gone apart for more than 3 standard deviations. The trade will unwind when the assets converge or some time limit is reached. This approach uses cointegration as a measure of similarity. Cointegration is the natural answer of the question: How do we identify assets that move together? 
Proposition 2 provides the answer of the question: How far do the assets have to diverge before a trade is placed? As a result, the decision to execute a trade is driven by cointegration properties of the assets. Having positive expected profit is excellent news for any strategy. The proposed strategy has some shortcomings. The initial amount of money needed each period is a random variable, and the resulting portfolio is not dollar neutral (i.e. the total dollar value of the long position is not equal to the total dollar value of the short position.) To construct a dollar neutral 11
  • 12.
    long-short portfolio, wewill first partition the cointegrated time series into two sets L and S: i 2 L $ i 0 i 2 S $ i < 0 Next, depending on what set a given asset belongs to, we purchase the value of iCsign PP p=1 ˜Zt p+1 ! Pi t P j2L j iCsign PP p=1 ˜Zt p+1 ! Pi t P j2S j The return of this modified strategy is identical to the proposed earlier. Hence, the expected profit for that strategy is also positive. Indeed (without loss of generality) assume that sign ✓ i C 1P p=1 Zt p ◆ = 1. The long returns RL t and the short returns RS t of our original strategy are RL t = P i2L i C P1 p=1 Zt pSi t+1 Si t P i2L iC P1 p=1 Zt p 1 = P i2L i Si t+1 Si t P i2L i 1 RS t = 1 P i2S i C P1 p=1 Zt pSi t+1 Si t P i2S iC P1 p=1 Zt p = 1 P i2S i Si t+1 Si t P i2S i The modified strategy has the following returns from the short and long positions: RL t = P i2L i P i2L i Csign ⇣P1 p=1 Zt p ⌘ Si t+1 Si t P i2L i P i2L i Csign ⇣P1 p=1 Zt p ⌘ 1 = P i2L i Si t+1 Si t P i2L i 1 RS t = 1 P i2L i P i2S i Csign ⇣P1 p=1 Zt p ⌘ Si t+1 Si t P i2L i P i2S i Csign ⇣P1 p=1 Zt p ⌘ = 1 P i2S i Si t+1 Si t P i2S i The above derivations indicate the return of the modified strategy is the same as the original one, therefore its expected profit is positive(since we proved that the expected return of the original strategy is positive). Now we can explain why we have included the constant C. In the modified strategy, every time period the value of C is invested in short and long positions. Hence, the money needed for each time period in order to execute the new strategy is a constant, and the portfolio we obtain is dollar neutral. In reality, we cannot compute the true value of ( P1 p=1 Zt p (the cointegration vector b.) We estimate them, and with the above theoretical results in mind, we propose the following trading strategy: – Step 1: using historical data, estimate the cointegration vector . 12
  • 13.
    – Step 2:using the estimated cointegration vector ˜ and historical data, construct ˜Zt realizations of the process Zt = NP i=1 i ri t – Step 3: compute the final sum PX p=1 ˜Zt p+1 , where P is a parameter. – Step 4: partition the assets into two sets L and S (depending on values of ˜.) – Step 5: buy (depending in which set the asset belongs to) the following number of shares (round down to get integer number of shares): ˜i Si t P i2L ˜i Csign PX p=1 ˜Zt p+1 ! ˜i Si t P i2S ˜i Csign PX p=1 ˜Zt p+1 ! – Step 5: buy (depending in which set the asset belongs to) the following number of shares (round down to get integer number of shares): – Step 6: rebalance all the open positions the following trading day. – Step 7: update the historical data set. – Step 8: If it is time to re-estimate the cointegration vector (which happens every 22 trading days), go to step 1, otherwise go to step 2. In the next section we describe the procedures used to test the strategy and present the numerical results. 2.3 Formulation as a Stochastic Control Problem We recall the wealth dynamics (2.1) dV (t) = V (t) ⇢ h(t) ✓ k(✓ X(t)) + 1 2 ⌘2 + 1 2 ( 1) 2 + ⌘⇢ ◆ + r dt + ⌘dWX(t) (2.1) We formulate the portfolio optimization pair-trading problem as a stochastic optimal control problem. We assume that an investor’s preference can be represented by the utility function U(x) = 1 x , with x 0 and x < 1. In this formulation, our objective is to maximize expected utility at the final time T. Thus, we seek to solve 13
  • 14.
    sup h(t) E  1 V (T) subject toV (0) = v0, X(0) = x0 dX(t) = k(✓ X(t))dt + ⌘dWX(t) dV (t) V (t) =  h(t) ✓ k(✓ X(t)) + 1 2 ⌘2 + 1 2 ( 1) 2 + ⌘⇢ ◆ + r dt + ⌘dWX(t) where the supremum is taken over strategies h(t) that are adapted to the filtration generated by WX(t) and W2(t). (For a rigorous formulation in a related setting, see [13].) In this optimal control problem, the first constraint just specifies the initial wealth of our portfolio and the spread. The second and third constraints describe the spread and wealth dynamics respectively. In the following section, we show that a closed form solution to the above stochastic control problem exists. Let G(t, v, x) denote the value function. G(t, v, x) = sup h Et,x,v [V (T) ] For any strategies h(t), define the Dynkin operator Lh (t, x, v) = k(x ✓)@x +  hk(✓ x) + 1 2 h⌘2 + 1 2 h ( 1) 2 + h ⌘⇢ + rh v@v + 1 2 ⇥ ⌘2 @xx + 2h⌘2 v@vx + h2 ⌘2 v2 @vv ⇤ (2.2) The HJB equation can be rewritten using the Dynkin operator @tG + sup h ⇥ Lh G ⇤ = 0 subject to the terminal condition G(T, v, x) = v By standard arguments, one may show that the Hamilton-Jacobi-Bellman (HJB) equation corresponding to our stochastic control problem is Gt + sup h { 1 2 ⇥ h2 ⌘2 v2 Gvv + ⌘2 Gxx + 2h⌘2 vGvx ⇤ +  hk(✓ x) + 1 2 h⌘2 + 1 2 h ( 1) 2 + h ⌘⇢ + rh vGv k(x ✓)Gx} = 0 (2.3) 14
  • 15.
    where the subscriptson G denote partial derivative. For notational ease we let b = k(✓ x) + 1 2 ⌘2 + ⇢⌘ + 1 2 h ( 1) 2 and rewrite 2.3 as Gt + sup h { 1 2 ⇥ h2 ⌘2 v2 Gvv + ⌘2 Gxx + 2h⌘2 vGvx ⇤ + [hb + r] vGv k(x ✓)Gx} = 0 (2.4) The first order condition for the maximization in 2.4 is h⇤ ⌘2 vGvv + ⌘2 Gvx + bGv = 0 (2.5) Assuming Gvv < 0, the first order condition 2.5 is also su cient, yielding h⇤ = ⌘2 Gvx + bGv ⌘2vGvv (2.6) Plugging 2.6 back into 2.4 yields ⌘2 GtGvv 1 2 ⌘4 G2 vx 1 2 b2 G2 v b⌘2 GvGvx + 1 2 ⌘4 GvvGxx + r⌘2 vGvGvv k(x ✓)⌘2 GxGvv = 0 (2.7) Thus, we must solve the partial di↵erential equation 2.7 in order to determine an optimal strategy. To obtain a closed form solution, we consider the following separation ansatz that was motivated by [13] where a di↵erent portfolio optimization problem under Vasicek [18] term structure dynamics was solved, G(t, v, x) = f(t, x)v with the condition that f(T, x) = 1 For this choice of ansatz, 2.7 becomes ( 1)⌘2 fft 1 2 ⌘4 f2 x 1 2 b2 f2 1 2 ⌘4 ffx ⇢ ⌘3 ffx + 1 2 ( 1)⌘4 ffxx + ( 1)r⌘2 f2 + k(x ✓)⌘2 ffx = 0 (2.8) We then use the following ansatz for f(t, x) f(t, x) = g(t)exB(t)+x2A(t) with g(T) = 1, B(T) = 0, A(T) = 0. 15
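Plugging the ansatz into (2.8) and setting the coefficient of $x^2$ to zero yields an ordinary differential equation for $A(t)$:

$$(\gamma - 1)\eta^2 A' - 2\eta^4 A^2 + 2k\eta^2 A - \tfrac{1}{2}k^2\gamma = 0 \tag{2.9}$$

Similarly, setting the coefficient of $x$ in (2.8) to zero yields an ordinary differential equation for $B(t)$:

$$(\gamma - 1)\eta^2 B' + \left( k\eta^2 - 2\eta^4 A \right) B - \gamma\eta^4 A - 2\gamma\beta\rho\sigma\eta^3 A - \gamma\beta(\beta-1)\sigma^2\eta^2 A - 2k\theta\eta^2 A + \gamma k^2\theta + \tfrac{1}{2}\gamma k\eta^2 + \gamma k\beta\rho\sigma\eta + \tfrac{1}{2}\gamma k\beta(\beta-1)\sigma^2 = 0 \tag{2.10}$$

Noting that (2.9) is a Riccati equation for $A(t)$ and (2.10) is a first-order linear ordinary differential equation for $B(t)$, one may obtain the solution in closed form as

$$A(t) = \frac{k\left(1 - \sqrt{1-\gamma}\right)}{2\eta^2} \left\{ 1 + \frac{2\sqrt{1-\gamma}}{1 - \sqrt{1-\gamma} - \left(1 + \sqrt{1-\gamma}\right)\exp\left( \frac{2k(T-t)}{\sqrt{1-\gamma}} \right)} \right\} \tag{2.11}$$

$$B(t) = \frac{1}{2\eta^2 \left[ \left(1 - \sqrt{1-\gamma}\right) - \left(1 + \sqrt{1-\gamma}\right)\exp\left( \frac{2k(T-t)}{\sqrt{1-\gamma}} \right) \right]} \left[ \gamma\sqrt{1-\gamma}\left( \eta^2 + 2\beta\rho\sigma\eta + \beta(\beta-1)\sigma^2 \right)\left( 1 - \exp\left( \frac{k(T-t)}{\sqrt{1-\gamma}} \right) \right)^2 - \gamma\left( \eta^2 + 2\beta\rho\sigma\eta + \beta(\beta-1)\sigma^2 + 2k\theta \right)\left( 1 - \exp\left( \frac{2k(T-t)}{\sqrt{1-\gamma}} \right) \right) \right] \tag{2.12}$$

Consequently, the optimal weight $h^*(t, x)$ can be obtained via (2.6):

$$h^*(t, x) = \frac{1}{1-\gamma}\left[ B(t) + 2A(t)\,x - \frac{k(x - \theta)}{\eta^2} + \frac{\beta\rho\sigma}{\eta} + \frac{\beta(\beta-1)\sigma^2}{2\eta^2} + \frac{1}{2} \right] \tag{2.13}$$

With this closed-form solution in hand, we find that, as in the previous section, the optimal weight $h^*(t, x)$ is linear in $x$, through the distance $x - \theta$ between the spread process and its long-run mean. The term $X_t - \theta$ in the optimal weight plays the same role as the term $\sum_{p=1}^{\infty} Z_{t-p}$ found in the standard cointegration strategy, so both approaches are consistent.

2.4 Fundamental analysis

Fundamental analysis of a business involves analyzing its financial data to get some insight on whether it is overvalued or undervalued. This is done by analyzing historical and present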
economic data to produce a financial forecast of the business. The intrinsic value of the business is found through a fundamental analysis consisting of three main steps: (I) economic analysis, (II) industry analysis and (III) company analysis. If the intrinsic value is higher than the market price, it is recommended to buy the stock; if it is equal to the market price, it is best to hold; and if it is lower than the market price, it is a selling signal. Fundamental analysis maintains that markets may misprice an asset in the short run but that the "correct" price will eventually be reached. Profits can be made by trading the mispriced security and then waiting for the market to recognize its "mistake" and reprice the security.

In this section we study an example of a fundamentally driven strategy, using the market reaction to a stock being dropped from or added to the MSCI World Standard as a signal for a pair trading strategy on those stocks once their inclusion or exclusion has been made effective. Both FTSE and MSCI have their own set of criteria for including stocks in their respective indices. These criteria include (but are not limited to) size, liquidity, free float and trade history. Stocks that do not pass these filters are not eligible to be part of the index. Since so much money throughout the world is passively managed (and therefore needs to closely replicate the performance of the benchmark index), it is reasonable to assume that changes in index constituents can drive large flows in and out of the stocks in play. We constructed this fundamentally driven strategy by looking to "buy" stocks that are announced to be included in the MSCI ACWI and "sell" stocks that are announced to be dropped from the benchmark.

Figure 2.1: MSCI Rebalance Profit and Loss

The strategy sounds simple enough to track; however, there are a few practical barriers to testing this hypothesis that we had to consider.
In an ideal world, to extract maximum benefit from the announcement, we would like to buy (sell) the additions (deletions) on the night the reviews are announced (or develop a strategy to pre-empt those announcements)
and exit our positions at the close of business on the day when the changes become effective. The first drawback is that, since the reviews are not made public until the markets have ceased trading for the day, it is impossible to take positions at the closing levels of the announcement day unless we know in advance which changes the announcement contains.

We now test the "index effect" strategy historically. As mentioned earlier, our interest covers "announcement" dates as well as "effective" dates. For this analysis, we therefore go long stocks that have been announced as upcoming additions to the MSCI Europe Index and short stocks that have been announced as upcoming deletions from the index. We go long (short) at the close of the day following the index change announcement and keep the positions until the close of the effective date. At the close of the effective date (i.e. when all index changes take place), we square off our positions and wait for the next quarterly review.

Figure 2.2: MSCI longshort

We see that, following the announcement of its addition to the benchmark index, a stock's performance is usually positive on both an absolute and a relative basis. Conversely, the announcement of a stock's deletion from the benchmark index is usually a negative trigger, again on both an absolute and a relative basis. In this test, the fundamentally driven strategy returned 12.3% annualised with a Sharpe ratio of 0.7 (before commissions, borrowing fees and transaction costs).
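The entry and exit rule described above can be sketched as follows. This is a minimal illustration, not the backtest used for Figure 2.2: the IndexEvent container, the equal weighting of the long and short legs, the tickers and the prices are all assumptions introduced here. Each position is opened at the close of the day after the announcement and closed at the close of the effective date.

```python
from dataclasses import dataclass


@dataclass
class IndexEvent:
    ticker: str
    is_addition: bool   # True: announced addition (go long); False: deletion (go short)
    announce_day: int   # position of the announcement day in the close series
    effective_day: int  # position of the effective date in the close series


def event_return(closes, event):
    """Signed simple return from the close after the announcement to the effective close."""
    entry = closes[event.announce_day + 1]
    exit_price = closes[event.effective_day]
    side = 1.0 if event.is_addition else -1.0
    return side * (exit_price / entry - 1.0)


def review_return(prices, events):
    """Equal-weighted long leg plus equal-weighted short leg for one index review."""
    longs = [event_return(prices[e.ticker], e) for e in events if e.is_addition]
    shorts = [event_return(prices[e.ticker], e) for e in events if not e.is_addition]

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(longs) + mean(shorts)


# Hypothetical daily closes and events for a single review.
prices = {"ADD": [100.0, 101.0, 103.0, 106.0], "DEL": [50.0, 49.0, 48.0, 47.0]}
events = [IndexEvent("ADD", True, 0, 3), IndexEvent("DEL", False, 0, 3)]
result = review_return(prices, events)
print(round(result, 4))
```

Entering one day after the announcement reflects the practical barrier noted above: the closing level of the announcement day itself cannot be traded.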
CHAPTER 3
Strategies Analysis

3.1 Road map for strategy design

In this section we provide a road map for the design and analysis of the pairs trading strategy. The steps involved are as follows:

1. Identify stock pairs that could potentially be cointegrated. This process can be based on the stock fundamentals or, alternatively, on a purely statistical approach based on historical data.

2. Once the potential pairs are identified, verify the hypothesis that the stock pairs are indeed cointegrated, based on statistical evidence from historical data. This involves determining the cointegration coefficient and examining the spread time series to ensure that it is stationary and mean reverting.

3. Examine the cointegrated pairs to determine the delta. A feasible delta that can be traded on will be substantially greater than the slippage encountered due to the bid-ask spreads in the stocks.

3.2 Identification of potential pairs

The challenge in this strategy is identifying stocks that tend to move together and therefore make potential pairs. Our aim is to identify pairs of stocks with mean-reverting relative prices. To find out whether the relative price of two stocks is mean-reverting, we apply the Dickey-Fuller test to the log ratio of the pair. The Dickey-Fuller test determines whether the log ratio $x_t = \log S^1_t - \log S^2_t$ of the share prices $S^1_t$ and $S^2_t$ is stationary.
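A minimal sketch of this screening step is given below, assuming the Dickey-Fuller regression with an intercept (the change in x_t regressed on x_{t-1}) and using only the Python standard library. The function names and the synthetic price series are illustrative assumptions: the two prices are constructed so that their log ratio is a stationary AR(1) process, while each log price on its own is a random walk.

```python
import math
import random


def slope_tstat(x, y):
    """t-statistic of the slope in the OLS regression y = a + b*x + u."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope / math.sqrt(rss / (n - 2) / sxx)


def dickey_fuller_tstat(series):
    """Regress the first difference of the series on its lagged level."""
    lagged = series[:-1]
    changes = [b - a for a, b in zip(series[:-1], series[1:])]
    return slope_tstat(lagged, changes)


def log_ratio(s1, s2):
    return [math.log(a) - math.log(b) for a, b in zip(s1, s2)]


# Synthetic prices (hypothetical): log S1 = log S2 + x, with x a stationary AR(1).
rng = random.Random(1)
x, log_s2 = 0.0, math.log(50.0)
s1, s2 = [], []
for _ in range(500):
    x = 0.5 * x + 0.01 * rng.gauss(0.0, 1.0)
    log_s2 += 0.01 * rng.gauss(0.0, 1.0)
    s2.append(math.exp(log_s2))
    s1.append(math.exp(log_s2 + x))

t_ratio = dickey_fuller_tstat(log_ratio(s1, s2))             # mean-reverting log ratio
t_single = dickey_fuller_tstat([math.log(p) for p in s2])    # random walk
print(t_ratio, t_single)
```

A strongly negative statistic for the log ratio, compared with a much weaker one for the individual log price, is the pattern that marks a candidate pair.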
Critical values of the cointegration test depend on the number of observations, so we have to compute our own critical values for the Dickey-Fuller test for cointegration. The procedure is as follows:

1. For each simulation $i$, we simulate two time series of $T$ error terms $(\varepsilon^{(i)}_t, \eta^{(i)}_t)$, $t = 1, \ldots, T$, distributed as two independent $N(0, 1)$ variables, and the associated independent random walks

$$p^{(i)}_t = p^{(i)}_{t-1} + \varepsilon^{(i)}_t \quad \text{and} \quad d^{(i)}_t = d^{(i)}_{t-1} + \eta^{(i)}_t.$$

2. We estimate by regression the relation between the two time series,

$$p^{(i)}_t = a + b\, d^{(i)}_t + z^{(i)}_t.$$

Under the null of no cointegration, the residual series $z^{(i)}_t$ should be non-stationary. We therefore perform a standard Dickey-Fuller test on $z^{(i)}_t$.

3. We fit an AR(1) model for the residuals under the alternative hypothesis, i.e.

$$\Delta z^{(i)}_t = \alpha^{(i)} + \phi^{(i)} z^{(i)}_{t-1} + u^{(i)}_t,$$

and compute the t-stat for $\phi^{(i)}$, denoted $t(\phi^{(i)})$.

4. The quantiles at 10%, 5% and 1% of the distribution of $t(\phi^{(i)})$ then give the 10%, 5% and 1% critical values for the Dickey-Fuller test for cointegration. For T = 100, the critical values are -3.07 at 10%, -3.37 at 5% and -3.96 at 1%.

3.3 Testing cointegration

We now test the null hypothesis that the stock prices are not cointegrated. We proceed as follows.

1. Estimate the regression for the pair $s^1_t = \log S^1_t$, $s^2_t = \log S^2_t$:

$$s^1_t = a + b\, s^2_t + x_t.$$

2. Use the Dickey-Fuller test to test the null of a unit root in $x_t$. That is, estimate the regression

$$\Delta x_t = \alpha + \phi\, x_{t-1} + u_t$$

and test the null hypothesis $H_0 : \phi = 0$ using the corresponding t-stat. Use the critical values computed previously.

In other words, we regress $\Delta x_t$ on lagged values of $x_t$. The null hypothesis is that $\phi = 0$, which means that the process is not mean reverting. If the null hypothesis can be rejected at the 99% confidence level, the spread $x_t$ follows a weakly stationary process and is thereby mean-reverting.
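The simulation of critical values described in steps 1 to 4 can be sketched as follows. This is a minimal standard-library illustration, assuming the Dickey-Fuller regression includes an intercept; the function names, the seed and the small number of simulations (400, chosen to keep the run short) are assumptions made here, and a production run would use a much larger number of simulations.

```python
import math
import random


def ols(x, y):
    """OLS of y on a constant and x: returns (intercept, slope, slope t-stat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, slope / math.sqrt(rss / (n - 2) / sxx)


def residual_df_tstat(p, d):
    """Step 2 and 3: regress p on d, then run the Dickey-Fuller regression on the residuals."""
    a, b, _ = ols(d, p)
    z = [pi - a - b * di for pi, di in zip(p, d)]
    dz = [z1 - z0 for z0, z1 in zip(z[:-1], z[1:])]
    return ols(z[:-1], dz)[2]


def simulate_critical_values(T, n_sims, seed=0):
    """Empirical 10%, 5% and 1% quantiles of the t-stat under the null of no cointegration."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_sims):
        p_level = d_level = 0.0
        p, d = [], []
        for _ in range(T):
            p_level += rng.gauss(0.0, 1.0)   # independent random walk p
            d_level += rng.gauss(0.0, 1.0)   # independent random walk d
            p.append(p_level)
            d.append(d_level)
        stats.append(residual_df_tstat(p, d))
    stats.sort()
    return {q: stats[int(q * n_sims)] for q in (0.10, 0.05, 0.01)}


critical = simulate_critical_values(T=100, n_sims=400)
print(critical)
```

The same residual_df_tstat function applied to the log prices of an actual pair gives the test statistic of Section 3.3, which is then compared with these simulated quantiles.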
Research has shown that if the confidence level is relaxed, the pairs do not mean-revert well enough to generate satisfactory returns. This implies that a very large number of regressions must be run to identify the pairs. With 200 stocks, we have to run $200 \times 199 / 2 = 19\,900$ regressions, which is quite time consuming.

3.4 Risk control and feasibility

As already mentioned, this strategy in theory almost entirely avoids systematic market risk. The reason there is still some market risk exposure is that a minor
beta spread is allowed for. The industry risk can also be eliminated if we invest in pairs belonging to the same industry. The main risk we are exposed to is then the risk of stock-specific events, that is, the risk of fundamental changes implying that the prices may never mean-revert again, or at least not within the holding period. To control this risk we use stop-loss rules and a maximum holding period.

We now study a simple trading strategy to assess the feasibility of such trades. Given that stocks have a bid-ask spread, we incur a trading slippage every time a trade is executed. Reducing the trading frequency reduces the effect of this slippage. Let us therefore consider the strategy where trades are put on and unwound on a deviation of $\Delta$ in either direction from the long-run equilibrium m. We buy the portfolio (long S1 and short S2) when the time series is $\Delta$ below the mean, and sell the portfolio (sell S1 and buy S2) when the time series is $\Delta$ above the mean, i time steps later. The profit on the trade is the incremental change in the spread, $2\Delta$.

Consider two stocks S1 and S2 that are cointegrated with the following data:
– Cointegration ratio = 1.5
– Delta used for trade signal = 0.045
– Bid price of S1 at time t = USD 19.50
– Ask price of S2 at time t = USD 7.46
– Ask price of S1 at time t + i = USD 20.10
– Bid price of S2 at time t + i = USD 7.17
– Average bid-ask spread for S1 = 0.0005 (5 basis points)
– Average bid-ask spread for S2 = 0.0010 (10 basis points)

We first examine whether trading is feasible given the average bid-ask spreads.
– Average trading slippage = 0.0005 + 1.5 * 0.0010 = 0.002 (20 basis points)

This is smaller than the delta value of 0.045. Trading is therefore feasible.

At time t, buy shares of S1 and short shares of S2 in the ratio 1:1.5.
– Spread at time t = log(19.50) - 1.5 * log(7.46) ≈ -0.044

At time t + i, sell the shares of S1 and buy back the shares of S2.
– Spread at time t + i = log(20.10) - 1.5 * log(7.17) ≈ 0.046
– Total return = return on S1 + cointegration ratio * return on the short position in S2
– = log(20.10) - log(19.50) + 1.5 * (log(7.46) - log(7.17))
– = 0.03 + 1.5 * 0.04
– = 0.09 (9 percent)
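The arithmetic of this worked example can be reproduced with a few lines of Python. The variable names are introduced here for illustration; the inputs are the figures listed above, with natural logarithms.

```python
import math

ratio = 1.5                      # cointegration ratio
delta = 0.045                    # deviation used as the trade signal
spread_cost_s1 = 0.0005          # average bid-ask spread of S1 (5 basis points)
spread_cost_s2 = 0.0010          # average bid-ask spread of S2 (10 basis points)

# Feasibility: the slippage must be small relative to the traded deviation.
slippage = spread_cost_s1 + ratio * spread_cost_s2
feasible = slippage < delta

# Spread when the trade is opened (time t) and closed (time t + i).
spread_open = math.log(19.50) - ratio * math.log(7.46)
spread_close = math.log(20.10) - ratio * math.log(7.17)

# Log return of the long S1 leg plus the ratio-weighted short S2 leg.
return_s1 = math.log(20.10) - math.log(19.50)
return_s2_short = math.log(7.46) - math.log(7.17)
total_return = return_s1 + ratio * return_s2_short

print(round(slippage, 4), feasible)
print(round(spread_open, 3), round(spread_close, 3), round(total_return, 3))
```

The total return equals the change in the spread between opening and closing the trade, which is why it is close to twice the delta.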
CHAPTER 4
Results

We provide an example to illustrate the stochastic control trading strategy. On September 17, 2012, we collected minute-by-minute data on two stocks traded on the New York Stock Exchange: Goldman Sachs Group, with ticker symbol GS, and JPMorgan Chase and Company, with ticker symbol JPM. This gives us a two-dimensional time series with 444 data points. We test for cointegration and estimate the parameters of our cointegration model. Via Monte Carlo simulation with T = 444 and N = 10000, we obtain the following empirical distribution of $t(\phi)$.

Mean      Std      Skewness   Kurtosis   q10%      q5%       q1%
-2.0322   0.8159   0.21331    3.5245     -3.0345   -3.3405   -3.9153

Table 4.1: Statistics of $t(\phi)$ for the pair JP/GS

For T = 444, we obtain the histogram of $t(\phi)$ shown in Figure 4.1. The critical values for the Dickey-Fuller test for cointegration correspond to the quantiles at 10%, 5% and 1% of our empirical distribution of $t(\phi)$. We get q10% = -3.0345, q5% = -3.3405 and q1% = -3.9153.

$\alpha$     $\beta$     $t(\phi)$
-0.3485   1.3739   -3.9247

Table 4.2: Results of the regression $\log(GS) = \alpha + \beta \log(JP)$

First, as $t(\phi) <$ q1%, we can take the residual spread to be stationary, and we therefore reject the null of no cointegration: the data are cointegrated at the 99% confidence level. Indeed, the ACF of the spread $X_t$ is typical of an AR process:
Figure 4.1: Empirical distribution of $t(\phi)$

Figure 4.2: Correlogram of Residual Spread

A plot of the two cointegrated series is shown in Figure 4.3. We now estimate the mean-reversion behaviour of the pair GS/JP. First, the estimated coefficients are significant, supporting the Vasicek model of mean reversion in the residual spread. Table 4.3 reports the estimation results.

$\theta$      $k$       $\eta$
-0.0009   6.3665   0.0433

Table 4.3: Estimation of $dX_t = k(\theta - X_t)\,dt + \eta\, dW_X(t)$

Second, the level of mean reversion is strong, as reflected by the large value of $k$, around 6.4. This value is also captured visually in the graphs, where the estimated state is shown to quickly
revert to its mean. The implication is twofold. On one hand, mean reversion is ample, hence the non-convergence risk is mitigated. On the other, it may be too strong, such that profit opportunities are quick to vanish for the selected pair. Third, the estimate of $\theta$ is not zero, albeit close to zero. This suggests there remains some residual risk over and above the beta risk.

Figure 4.3: GS/JP cointegrated Time Series

Figure 4.4 plots the estimated AR(1) residual spread as implied by the observed return differential.

Figure 4.4: Estimation of Residual Spread

For the purpose of illustration, we plot the stock prices and the optimal policies for a whole trading day. More precisely, we show the stock prices $S_1$ and $S_2$, as well as the optimal policies $\pi_1 = h^*$ and $\pi_2 = -\beta h^*$. We then present, in Figure 4.6, the cumulative profit and loss. As expected in a pairs trading setting, the controls are opposite in sign. We also notice that the positions, which are large during the first half of the day, are both progressively
unwound in the second half, ending close to 0 by the end of the trading day. For this data set, a significant profit is realized almost immediately because of the leverage. The profit then fluctuates throughout the day but remains strongly positive, and by the end of the day it is approximately $1,348. Repeating this strategy on a daily basis gives a Sharpe ratio of 1.15 and an annual return of 15.48%. Of course, these figures do not take into account the cost of borrowing or transaction costs, which are both assumed to be 0 in this model. This is unrealistic: in practice those costs can wipe out all the gains, since the pair is reweighted every minute. To avoid the slippage, one solution is to enter the trade only when the spread is wide enough, for example around the earnings announcement of one of the constituents of the pair.

Figure 4.5: Stocks and optimal policies

Figure 4.6: Profit and Loss processes

Annual Return    15.48%
Standard Dev.    13.46%
Sharpe Ratio     1.15
Sortino Ratio    0.89

Table 4.4: Performance Measures
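To make the optimal policy of this chapter concrete, the sketch below evaluates the closed-form expressions (2.11) to (2.13) in Python. It uses the estimates of Tables 4.2 and 4.3 for theta, k, eta and beta; the risk-aversion parameter gamma, the volatility sigma, the correlation rho and the time unit of tau = T - t are not reported here, so the values below are purely hypothetical. The squared exponential term in B uses k(T - t) rather than 2k(T - t) in the exponent, which is the form that satisfies the linear equation (2.10).

```python
import math


def riccati_A(tau, k, eta, gamma):
    """A as a function of the time to horizon tau = T - t, equation (2.11)."""
    s = math.sqrt(1.0 - gamma)
    e2 = math.exp(2.0 * k * tau / s)
    return k * (1.0 - s) / (2.0 * eta**2) * (1.0 + 2.0 * s / (1.0 - s - (1.0 + s) * e2))


def linear_B(tau, k, theta, eta, gamma, c):
    """B as a function of tau, equation (2.12), with
    c = eta^2 + 2*beta*rho*sigma*eta + beta*(beta - 1)*sigma^2."""
    s = math.sqrt(1.0 - gamma)
    e1 = math.exp(k * tau / s)
    e2 = e1 * e1
    numerator = (gamma * s * c * (1.0 - e1) ** 2
                 - gamma * (c + 2.0 * k * theta) * (1.0 - e2))
    return numerator / (2.0 * eta**2 * ((1.0 - s) - (1.0 + s) * e2))


def optimal_weight(tau, x, k, theta, eta, gamma, beta, sigma, rho):
    """Optimal weight h*(t, x) of equation (2.13); requires gamma < 1."""
    c = eta**2 + 2.0 * beta * rho * sigma * eta + beta * (beta - 1.0) * sigma**2
    a = riccati_A(tau, k, eta, gamma)
    b = linear_B(tau, k, theta, eta, gamma, c)
    return (b + 2.0 * a * x - k * (x - theta) / eta**2 + c / (2.0 * eta**2)) / (1.0 - gamma)


params = dict(
    k=6.3665, theta=-0.0009, eta=0.0433,  # Table 4.3
    beta=1.3739,                          # Table 4.2
    gamma=0.5, sigma=0.2, rho=0.3,        # hypothetical, not from the report
)
weights = [optimal_weight(0.5, x, **params) for x in (-0.01, 0.0, 0.01)]
print(weights)
```

The weight decreases linearly as the spread x rises above its long-run mean, which is the behaviour described after (2.13): the portfolio is long the spread when it is low and short when it is high.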
Conclusion and outlook

In this report, we studied two quantitative approaches to the problem of pairs trading. The first formulated the optimal trading of pairs as a stochastic control problem, for which we were able to derive a closed-form solution. The second relied on the properties of cointegrated financial time series to construct a strategy with a theoretically positive expected return. The study shows that the two approaches are equivalent, in the sense that the portfolio weight in both cases depends only on the distance between the difference in the logarithms of the prices and its long-run mean.

The applicability of the method is illustrated with minute-by-minute historical stock data. In the model, the two stock processes are cointegrated, correlated and have constant volatility, and we ignore the costs associated with trading. The simplicity of the present formulation enables a feasible implementation of parameter calibration and the derivation of analytical formulae for the optimal trading strategies.

Further work could address the slippage issue. One solution could be to enter the trade only when the spread is wide enough, and possibly to mix this strategy with a fundamentally driven one using the MSCI rebalance signal.
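As an illustration of how simple the parameter calibration mentioned above can be, the sketch below estimates the spread parameters from a sampled path. It is a minimal sketch under stated assumptions: the spread is observed at a fixed interval dt, and the estimator is an ordinary least squares fit of the exact AR(1) discretisation of the mean-reverting process. The report does not state which estimator produced Table 4.3, so this is one common choice rather than the method actually used; the function name, the seed and the simulated path are hypothetical.

```python
import math
import random


def calibrate_ou(x, dt):
    """Estimate (theta, k, eta) of dX = k(theta - X)dt + eta dW from samples spaced dt apart,
    using the AR(1) form X[t+dt] = theta*(1 - b) + b*X[t] + noise with b = exp(-k*dt)."""
    lag, lead = x[:-1], x[1:]
    n = len(lag)
    mean_lag, mean_lead = sum(lag) / n, sum(lead) / n
    sxx = sum((a - mean_lag) ** 2 for a in lag)
    sxy = sum((a - mean_lag) * (c - mean_lead) for a, c in zip(lag, lead))
    b = sxy / sxx
    if not 0.0 < b < 1.0:
        raise ValueError("no mean reversion detected: AR(1) coefficient outside (0, 1)")
    a = mean_lead - b * mean_lag
    resid_var = sum((c - a - b * xl) ** 2 for xl, c in zip(lag, lead)) / (n - 2)
    k = -math.log(b) / dt
    theta = a / (1.0 - b)
    eta = math.sqrt(resid_var * 2.0 * k / (1.0 - b * b))
    return theta, k, eta


# Simulated path (hypothetical) with the parameter values of Table 4.3.
true_theta, true_k, true_eta, dt = -0.0009, 6.3665, 0.0433, 1.0 / 444.0
b_true = math.exp(-true_k * dt)
noise_sd = true_eta * math.sqrt((1.0 - b_true**2) / (2.0 * true_k))
rng = random.Random(42)
path = [true_theta]
for _ in range(50000):
    path.append(true_theta + b_true * (path[-1] - true_theta) + noise_sd * rng.gauss(0.0, 1.0))

theta_hat, k_hat, eta_hat = calibrate_ou(path, dt)
print(theta_hat, k_hat, eta_hat)
```

The speed of mean reversion k is the hardest of the three parameters to pin down from a short sample, which is worth keeping in mind when calibrating on a single trading day.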
Bibliography

[1] Alexander, C., Giblin, I. and W. Weddington, Cointegration and asset allocation: A new active hedge fund strategy, ISMA Centre Discussion Papers in Finance Series.
[2] Brockwell, P. J. and R. A. Davis, Introduction to Time Series and Forecasting, second edition, Springer-Verlag, New York, 2002.
[3] Brown, D. and R. Jennings, On technical analysis, The Review of Financial Studies, 527-551, 1989.
[4] Casella, G. and R. L. Berger, Statistical inference, 2012.
[5] Chen, Z. and P. Knez, Measurement of market integration and arbitrage, The Review of Financial Studies 8, 287-325, 1995.
[6] Do, B., Faff, R. and K. Hamza, A new approach to modeling and estimation for pairs trading, In Proceedings of 2006 Financial Management Association European Conference, Stockholm, June 2006.
[7] Duan, J. C. and S. Pliska, Option valuation with co-integrated asset prices, Journal of Economic Dynamics and Control, 754, 2004.
[8] Elliott, R. J., van der Hoek, J. and W. P. Malcolm, Pairs trading, Quantitative Finance, 271-276, 2005.
[9] Engle, R. F. and C. W. J. Granger, Long-run economic relationships, readings in cointegration, Oxford University Press, 1991.
[10] Galenko, A., Popova, E. and I. Popova, Trading in the Presence of Cointegration, Operations Research and Industrial Engineering, 78712.
[11] Gatev, E., Goetzmann, W. N. and K. G. Rouwenhorst, Pairs Trading: Performance of a Relative-Value Arbitrage Rule, Review of Financial Studies, 797-827, 2006.
[12] Johansen, S., Likelihood-based inference in cointegrated vector autoregressive models, Oxford University Press, 1995.
[13] Korn, R. and H. Kraft, A Stochastic Control Approach to Portfolio Problems with Stochastic Interest Rates, SIAM Journal on Control and Optimization, 1250-1269, 2002.
[14] Mudchanatongsuk, S., Primbs, J. A. and W. Wong, Optimal pairs trading: A stochastic control approach, Proceedings of the American Control Conference, 1035-1039, 2008.
[15] Dos Passos, W., Numerical methods, algorithms, and tools in C#, CRC Press, 2010.
[16] Phillips, P. C. B. and S. Ouliaris, Asymptotic properties of residual based tests for cointegration, Econometrica, 165-193, 1990.
[17] Tsay, R. S., Analysis of financial time series, Wiley, 2005.
[18] Vasicek, O., An Equilibrium Characterization of the Term Structure, Journal of Financial Economics, 177-188, 1977.