
Final Project
A Statistical Arbitrage Strategy for SP500
Zhicheng Li/Sirui Zhang/Jian Wang
Stony Brook University
December 9, 2014

Theory of Strategy
Follow the paper of Avellaneda et al. and use Principal Component Analysis (PCA).
Form a dynamic market-neutral portfolio and use statistical arbitrage to trade stocks as a group.
Use a mean-reverting process to generate the trading signal.

Method of Strategy: 1
Parameters: in-window = M days, out-window = 1 day, with M = 60 and K = 15 (these values follow Avellaneda et al.).
Calculate each stock's log-return:
$$R_{it} = \log\left(\frac{P_{it}}{P_{i,t-1}}\right), \quad t = 1, 2, \dots, M, \quad i = 1, 2, \dots, N \tag{1}$$
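Standardized log-return:
$$Y_{it} = \frac{R_{it} - \bar{R}_i}{\sigma_i} \tag{2}$$
where
$$\bar{R}_i = \frac{1}{M}\sum_{t=1}^{M} R_{it}, \qquad \sigma_i^2 = \frac{1}{M-1}\sum_{t=1}^{M}\left(R_{it} - \bar{R}_i\right)^2 \tag{3}$$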
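As an illustration of equations (1) to (3), here is a minimal NumPy sketch. The function name `standardized_log_returns` and the toy price table are assumptions made for this example; they do not come from the original project code.

```python
import numpy as np

def standardized_log_returns(prices):
    """prices: array of shape (M + 1, N), one column per stock (hypothetical layout)."""
    prices = np.asarray(prices, dtype=float)
    log_ret = np.log(prices[1:] / prices[:-1])   # eq. (1), shape (M, N)
    mean = log_ret.mean(axis=0)                   # R-bar_i in eq. (3)
    sigma = log_ret.std(axis=0, ddof=1)           # sigma_i with the 1/(M-1) factor
    Y = (log_ret - mean) / sigma                  # eq. (2)
    return Y, log_ret, sigma

# Toy data (made up): 5 daily prices for 2 stocks, so M = 4 returns per stock.
toy_prices = np.array([[100.0, 50.0],
                       [101.0, 49.0],
                       [100.5, 49.5],
                       [102.0, 50.5],
                       [101.0, 50.0]])
Y, R, sigma = standardized_log_returns(toy_prices)
```

Each column of `Y` has sample mean 0 and sample standard deviation 1 by construction, which is what makes the matrix in equation (4) a correlation matrix.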

Calculate the empirical correlation matrix of the data:
$$\rho_{ij} = \frac{1}{M-1}\sum_{t=1}^{M} Y_{it} Y_{jt} \tag{4}$$
Calculate the principal components for each time window:
C = Cov(ρ); [V D] = eig(C); (5)
Choose the K most significant eigenvectors, i.e. those corresponding to the K largest eigenvalues:
V = V(:, NL − K + 1 : NL); (6)
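The eigenvector selection can be sketched in NumPy as follows. Note one deliberate simplification: this sketch decomposes the correlation matrix ρ of equation (4) directly, whereas equation (5) applies `eig` to `C = Cov(ρ)`. The function name and the random test data are assumptions for illustration only.

```python
import numpy as np

def top_k_eigenvectors(Y, K):
    """Y: (M, N) standardized returns. Returns rho and its K leading eigenpairs."""
    M = Y.shape[0]
    rho = Y.T @ Y / (M - 1)                    # eq. (4)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order,
    # so the last K columns play the role of V(:, NL-K+1:NL) in eq. (6).
    eigvals, eigvecs = np.linalg.eigh(rho)
    return rho, eigvals[-K:], eigvecs[:, -K:]

# Made-up data: 60 days, 5 stocks, standardized column by column.
rng = np.random.default_rng(0)
raw = rng.normal(size=(60, 5))
Y = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
rho, lam, V = top_k_eigenvectors(Y, K=2)
```

Using `eigh` rather than a general eigensolver keeps the eigenvalues real and sorted, which avoids an explicit sort before slicing.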

Project the log-return matrix onto these eigenvectors to form K market factors:
$$F_{jt} = \sum_{i=1}^{N} \frac{v_i^{(j)}}{\bar{\sigma}_i} R_{it}, \quad j = 1, 2, \dots, K \tag{7}$$
Regress each stock's returns on these market factors:
$$R_i = m_i + \sum_{j=1}^{K} \beta_{ij} F_j + \tilde{R}_i, \quad i = 1, 2, \dots, N \tag{8}$$
Since we can assume $E(\tilde{R}_i) = 0$, we autoregress each $\tilde{R}_i$ and find the residuals that have the highest negative autoregressive coefficient:
$$\tilde{R}_{it} = \rho_i \tilde{R}_{i,t-1} + \epsilon_{it} \tag{9}$$
Choose K + 1 (here 16) stocks as our portfolio members:
$$PT_i, \quad i = 1, 2, \dots, 16$$
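Equations (7) to (9) can be sketched in NumPy as below. This is an illustration under stated assumptions, not the project's code: the function names are hypothetical, the data are random, and "highest negative coefficient" is read here as "smallest (most negative) AR(1) slope", which is one possible interpretation.

```python
import numpy as np

def factor_residuals(R, V, sigma):
    """R: (M, N) log-returns; V: (N, K) eigenvectors; sigma: (N,) volatilities."""
    F = R @ (V / sigma[:, None])                        # eq. (7), shape (M, K)
    design = np.column_stack([np.ones(len(F)), F])      # intercept m_i plus K factors
    coef, *_ = np.linalg.lstsq(design, R, rcond=None)   # eq. (8) for all stocks at once
    m, beta = coef[0], coef[1:].T                       # beta has shape (N, K)
    resid = R - design @ coef                           # R-tilde, shape (M, N)
    return F, m, beta, resid

def ar1_slopes(resid):
    """No-intercept AR(1) slope rho_i for each residual column, eq. (9)."""
    lagged, current = resid[:-1], resid[1:]
    return (lagged * current).sum(axis=0) / (lagged ** 2).sum(axis=0)

# Made-up data: M = 60 days, N = 6 stocks, K = 2 factors.
rng = np.random.default_rng(1)
R = 0.01 * rng.normal(size=(60, 6))
sigma = R.std(axis=0, ddof=1)
_, eigvecs = np.linalg.eigh(np.corrcoef(R, rowvar=False))
K = 2
V = eigvecs[:, -K:]
F, m, beta, resid = factor_residuals(R, V, sigma)
rho_ar = ar1_slopes(resid)
members = np.argsort(rho_ar)[:K + 1]   # assumed rule: K + 1 most negative slopes
```

Because the regression includes an intercept, each residual column has mean zero, which is the property $E(\tilde{R}_i) = 0$ used in the text.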

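A market-neutral trading portfolio is one in which the dollar amounts $\{Q_i\}_{i=1}^{K+1}$ invested in each stock satisfy
$$\bar{\beta}_j = \sum_{i=1}^{K+1} \beta_{ij} Q_i = 0, \quad j = 1, 2, \dots, K \tag{10}$$
where $\beta_{ij}$ is the coefficient from regressing stock i on factor j. In code, we use the null space to solve this linear system:
$$Q = \mathrm{Null}\{\beta_{[K]\times[K+1]}\} \tag{11}$$
Then we have
$$\begin{aligned}
\sum_{i=1}^{K+1} Q_i R_i &= \sum_{i=1}^{K+1} Q_i m_i + \sum_{i=1}^{K+1} Q_i \sum_{j=1}^{K} \beta_{ij} F_j + \sum_{i=1}^{K+1} Q_i \tilde{R}_i \\
&= \sum_{i=1}^{K+1} Q_i m_i + \sum_{i=1}^{K+1} Q_i \tilde{R}_i + \sum_{j=1}^{K} \left(\sum_{i=1}^{K+1} \beta_{ij} Q_i\right) F_j
\end{aligned} \tag{12}$$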

which means, since the factor term vanishes by (10),
$$\sum_{i=1}^{K+1} Q_i R_i = \sum_{i=1}^{K+1} Q_i \left(m_i + \tilde{R}_i\right) \tag{13}$$
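The null-space step in (11) and the identity (13) can be checked numerically. The sketch below gets a null vector from the SVD; the function name `market_neutral_weights` and the random loadings are assumptions for illustration, and the original code may compute the null space differently.

```python
import numpy as np

def market_neutral_weights(beta):
    """beta: (K + 1, K) loadings, row i = stock i. Returns Q with beta.T @ Q = 0."""
    # beta.T is K x (K + 1); with full rank K, its null space is one-dimensional
    # and is spanned by the last right singular vector.
    _, _, vt = np.linalg.svd(beta.T)
    return vt[-1]

# Made-up example with K = 3 factors and K + 1 = 4 stocks.
rng = np.random.default_rng(2)
beta = rng.normal(size=(4, 3))
m = rng.normal(size=4)
F = rng.normal(size=3)
resid = rng.normal(size=4)
R = m + beta @ F + resid            # eq. (8) at a single date

Q = market_neutral_weights(beta)
lhs = Q @ R                         # left side of eq. (13)
rhs = Q @ (m + resid)               # right side of eq. (13)
```

The returned `Q` has unit Euclidean norm and an arbitrary overall sign; any rescaling of it still satisfies (10), so position sizing is a separate choice.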

Mean-Reverting Process: 1
Assume that stock returns satisfy the system of stochastic differential equations
$$\frac{dS_i(t)}{S_i(t)} = \alpha_i\,dt + \sum_{j=1}^{N} \beta_{ij}\frac{dI_j(t)}{I_j(t)} + dX_i(t) \tag{14}$$
Here, the idiosyncratic component of the return is given by
$$\alpha_i\,dt + dX_i(t) \tag{15}$$
Our model assumes (i) a drift that measures systematic deviations from the sector and (ii) a price fluctuation that is mean-reverting to the overall industry level.

Based on these considerations, we introduce a parametric model for $X_i(t)$ that can be estimated easily, namely the Ornstein-Uhlenbeck (OU) process:
$$dX_i(t) = \kappa_i\left(m_i - X_i(t)\right)dt + \sigma_i\,dW_i(t) \tag{16}$$
If we assume momentarily that the parameters of the model are constant, we can write
$$X_i(t_0 + \Delta t) = e^{-\kappa_i \Delta t} X_i(t_0) + \left(1 - e^{-\kappa_i \Delta t}\right) m_i + \sigma_i \int_{t_0}^{t_0 + \Delta t} e^{-\kappa_i (t_0 + \Delta t - s)}\,dW_i(s) \tag{17}$$
The equilibrium probability distribution of the process $X_i(t)$ is normal with
$$E\{X_i(t)\} = m_i \quad \text{and} \quad \mathrm{Var}\{X_i(t)\} = \frac{\sigma_i^2}{2\kappa_i} \tag{18}$$
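To connect (17) with (18): conditional on $X_i(t_0)$, the stochastic integral in (17) is Gaussian with zero mean and, by the Itô isometry, variance $\sigma_i^2 (1 - e^{-2\kappa_i \Delta t}) / (2\kappa_i)$. The sketch below evaluates the conditional mean and variance and shows that they approach the equilibrium values for large $\Delta t$. The function name and the parameter values are made up for illustration.

```python
import math

def ou_transition(x0, kappa, m, sigma, dt):
    """Conditional mean and variance of X(t0 + dt) given X(t0) = x0, from eq. (17)."""
    decay = math.exp(-kappa * dt)
    mean = decay * x0 + (1.0 - decay) * m
    var = sigma ** 2 * (1.0 - decay ** 2) / (2.0 * kappa)
    return mean, var

# Hypothetical parameters: annualized kappa = 10, m = 0.1, sigma = 0.2.
mean_now, var_now = ou_transition(x0=0.5, kappa=10.0, m=0.1, sigma=0.2, dt=0.0)
mean_day, var_day = ou_transition(x0=0.5, kappa=10.0, m=0.1, sigma=0.2, dt=1.0 / 252)
mean_long, var_long = ou_transition(x0=0.5, kappa=10.0, m=0.1, sigma=0.2, dt=100.0)
```

With `dt = 0` the process has not moved; after one day the mean has drifted slightly toward `m`; for very long horizons the mean and variance match (18).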

According to Equation (14), being long 1 dollar in the stock and short $\beta_{ij}$ dollars in the jth principal component has an expected 1-day return of
$$\alpha_i\,dt + \kappa_i\left(m_i - X_i(t)\right)dt \tag{19}$$
The second term is the model's prediction for the return based on the position of the stationary process $X_i(t)$: it forecasts a negative return if $X_i(t)$ is sufficiently high and a positive return if $X_i(t)$ is sufficiently low.

Signal Generation: 1
We focus only on the process $X_i(t)$, neglecting the drift $\alpha_i$. We know that the equilibrium standard deviation is
$$\sigma_{eq,i} = \frac{\sigma_i}{\sqrt{2\kappa_i}} \tag{20}$$
Accordingly, we define the dimensionless variable
$$s_i = \frac{X_i(t) - m_i}{\sigma_{eq,i}} \tag{21}$$
We call this variable the s-score. Our basic trading signal based on mean reversion is:
buy to open (buy one dollar of the corresponding stock and sell $\beta_{ij}$ dollars of its jth principal component) if $s_i < -1.25$;
sell to open (sell one dollar of the corresponding stock and buy $\beta_{ij}$ dollars of its jth principal component) if $s_i > 1.25$;
close short position (buy the stock and sell the principal components) if $s_i < 0.75$;
close long position (sell the stock and buy the principal components) if $s_i > -0.5$.
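The four rules can be read as a small state machine over the current position. The sketch below is one possible encoding, with positions coded as +1 (long), -1 (short) and 0 (flat). The function name, the position coding, and the choice not to reopen on the same step as a close are assumptions of this example.

```python
# Thresholds quoted in the text above.
OPEN_LONG = -1.25
OPEN_SHORT = 1.25
CLOSE_SHORT = 0.75
CLOSE_LONG = -0.5

def next_position(s, position):
    """Return the new position (+1, -1 or 0) given s-score s and current position."""
    if position == 0:
        if s < OPEN_LONG:
            return 1            # buy to open
        if s > OPEN_SHORT:
            return -1           # sell to open
        return 0
    if position == 1:
        return 0 if s > CLOSE_LONG else 1     # close long once s has recovered
    return 0 if s < CLOSE_SHORT else -1       # close short once s has fallen back

# Walk a made-up s-score path starting flat.
path = [0.2, -1.4, -0.9, -0.3, 1.5, 0.9, 0.6]
positions = []
pos = 0
for s in path:
    pos = next_position(s, pos)
    positions.append(pos)
```

For the path above the position goes flat, long, long, flat, short, short, flat. The gap between the opening and closing thresholds gives hysteresis, so small fluctuations of the s-score do not trigger repeated trades.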

Here, we use the solution in the appendix to estimate the residual process and generate the signal.
Estimate the regression
$$R^S_n = \beta_0 + \beta R^I_n + \epsilon_n, \quad n = 1, 2, \dots, 60 \tag{22}$$
We set
$$\alpha = \frac{\beta_0}{\Delta t} = \beta_0 \times 252 \tag{23}$$
Next, we define the auxiliary process
$$X_k = \sum_{j=1}^{k} \epsilon_j, \quad k = 1, 2, \dots, 60 \tag{24}$$
which can be viewed as a discrete version of $X(t)$, the OU process that we are estimating.

Notice that the regression "forces" the residuals to have mean zero, so we have $X_{60} = 0$.
The OU parameters are estimated by solving the 1-lag regression model
$$X_{n+1} = a + bX_n + \zeta_{n+1}, \quad n = 1, 2, \dots, 59 \tag{25}$$
According to (17), we have
$$a = m\left(1 - e^{-\kappa \Delta t}\right), \qquad b = e^{-\kappa \Delta t}, \qquad \mathrm{Variance}(\zeta) = \sigma^2\,\frac{1 - e^{-2\kappa \Delta t}}{2\kappa} \tag{26}$$

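Whence
$$\kappa = -\log(b) \times 252, \qquad m = \frac{a}{1 - b}, \qquad \sigma = \sqrt{\frac{\mathrm{Variance}(\zeta) \cdot 2\kappa}{1 - b^2}}, \qquad \sigma_{eq} = \sqrt{\frac{\mathrm{Variance}(\zeta)}{1 - b^2}} \tag{27}$$
Recall that the s-score is defined theoretically as
$$s = \frac{X(t) - m}{\sigma_{eq}} \tag{28}$$
Since $X(t) = X_{60} = 0$, this becomes
$$s = \frac{-m}{\sigma_{eq}} = \frac{-a\sqrt{1 - b^2}}{(1 - b)\sqrt{\mathrm{Variance}(\zeta)}} \tag{29}$$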
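Equations (25) to (29) translate into a short estimation routine. The sketch below uses only the standard library. The function names are hypothetical, the residual variance uses an n - 2 degrees-of-freedom convention (an assumption, since the text only writes Variance(ζ)), and the inputs are made-up numbers.

```python
import math

def fit_ar1(X):
    """Ordinary least squares fit of X[n+1] = a + b * X[n] + zeta, eq. (25)."""
    x, y = X[:-1], X[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, sse / (n - 2)          # assumed variance convention

def ou_parameters(a, b, var_zeta, periods_per_year=252):
    """Map the AR(1) estimates to OU parameters, eq. (27)."""
    if not 0.0 < b < 1.0:
        raise ValueError("mean reversion requires 0 < b < 1")
    kappa = -math.log(b) * periods_per_year
    m = a / (1.0 - b)
    sigma = math.sqrt(var_zeta * 2.0 * kappa / (1.0 - b * b))
    sigma_eq = math.sqrt(var_zeta / (1.0 - b * b))
    return kappa, m, sigma, sigma_eq

# Made-up AR(1) estimates for one stock.
kappa, m, sigma, sigma_eq = ou_parameters(a=0.002, b=0.9, var_zeta=0.0019)
s_score = (0.0 - m) / sigma_eq          # eq. (29), using X_60 = 0
```

In practice `fit_ar1` would be applied to the cumulative residual series of equation (24) and its output passed to `ou_parameters`. The guard on `b` rejects series that do not mean-revert, for which $\kappa$ would be undefined or negative.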

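The last caveat is that we found that centered means work better, so we set
$$m = \frac{a}{1 - b} - \left\langle \frac{a}{1 - b} \right\rangle \tag{30}$$
where the brackets denote averaging over different stocks. The s-score is therefore
$$s = \frac{-m}{\sigma_{eq}} = \frac{-a\sqrt{1 - b^2}}{(1 - b)\sqrt{\mathrm{Variance}(\zeta)}} + \left\langle \frac{a}{1 - b} \right\rangle \sqrt{\frac{1 - b^2}{\mathrm{Variance}(\zeta)}} \tag{31}$$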
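A small worked example of the centered s-score in (30) and (31), using only the standard library. The function name and the two-stock inputs are made up for illustration.

```python
import math

def centered_s_scores(a, b, var_zeta):
    """s-scores with cross-sectionally centered means, eqs. (30) and (31)."""
    m = [ai / (1.0 - bi) for ai, bi in zip(a, b)]
    m_avg = sum(m) / len(m)                                  # average over stocks
    sigma_eq = [math.sqrt(v / (1.0 - bi * bi)) for v, bi in zip(var_zeta, b)]
    return [-(mi - m_avg) / se for mi, se in zip(m, sigma_eq)]

# Two hypothetical stocks with the same b and residual variance.
scores = centered_s_scores(a=[0.01, 0.03], b=[0.9, 0.9], var_zeta=[0.0019, 0.0019])
```

Here the uncentered means are 0.1 and 0.3, their average is 0.2, and $\sigma_{eq} = 0.1$ for both stocks, so the centered s-scores are approximately 1 and -1.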

Since we cannot go long or short the principal components directly, we use the market-neutral construction to eliminate the principal-component part. According to (13), when we use the portfolio Q, we only need to go long or short the portfolio Q according to the signal. We therefore need the signal of the portfolio, where $S_i$ is the signal of the ith stock in portfolio Q:
$$S_Q = \sum_{i=1}^{K+1} Q_i S_i \tag{32}$$
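Equation (32) is a weighted sum, sketched below with made-up weights and signals. The function name is hypothetical, and whether $S_i$ is the raw s-score or a discrete buy/sell indicator is not specified in the text; the sketch works for either.

```python
def portfolio_signal(Q, S):
    """Weighted portfolio signal S_Q = sum_i Q_i * S_i, eq. (32)."""
    if len(Q) != len(S):
        raise ValueError("Q and S must have the same length")
    return sum(q * s for q, s in zip(Q, S))

# Hypothetical three-stock portfolio.
S_Q = portfolio_signal(Q=[0.5, -0.25, 0.25], S=[-2.0, 1.0, 0.0])
```

With these inputs the result is 0.5 × (-2.0) + (-0.25) × 1.0 + 0.25 × 0.0 = -1.25.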

Plot and Result: 1
Figure: signal plot for the first 40 time steps.

Plot and Result: 2
Figure: strategy for the first 40 time steps.
