Comparison of Different Methods in Forecasting
Stocks’ Returns or Prices
Zhicheng Li/Sirui Zhang/Haoran Jiang
Abstract
In this paper, four models are built to explain stock behavior, and the
corresponding methods are used to forecast stocks' returns or prices in the
S&P 500 universe. All forecasts are compared with the realized values. We
show that the traditional time series methods, both univariate (the AR model)
and multivariate (the VAR model), have little forecasting power. In contrast,
the methods based on statistical arbitrage, i.e., the Pair Trading and Market
Neutral models, perform much better. Along the way, we introduce some
statistical techniques, such as Principal Component Analysis (PCA) and the
concept of mean-reversion. Finally, we attempt to give a reasonable
interpretation using econometric and statistical analysis.
1 Introduction
Forecasting is an enduring topic not only in Economics but also in Finance.
In the stock market, the incentive to forecast well is particularly strong,
since people with better predictions make more money. Therefore, a great deal
of research has been done, and various models and methods have been proposed
and used. Before the age of computers, people traded stocks and commodities
mainly on intuition. As the level of trading and the technology grew, people
searched for tools and methods that would increase their gains while
minimizing their risk. Statistics, fundamental analysis, and
linear/non-linear regressions are all attempts to predict and benefit from
the market's direction [5]. In recent studies, newer techniques, such as
Neural Networks, Hidden Markov Models (HMM) and Genetic Algorithms (GA), have
been used to forecast stock activity [9][10][13]. None of these techniques
has proven to be as consistently correct as desired, and many skeptics
question the utility of these approaches. Nevertheless, these methods are
commonly used in practice.
In this paper, we present four models and apply them to the S&P 500 stock
market. For each model, we state the concrete forecasting method. Given a
particular time window in the S&P universe, we forecast the stocks' prices or
returns and then compare the forecasts with the realized values by
calculating correlations. Finally, we look at the performance of each method.
The first model is the Auto-Regression (AR) model, which is broadly used in
time series analysis [5][7]. It assumes that a stock behaves in an
autocorrelated and stochastic way and is not correlated with other stocks or
factors. Basically, this method models the series as a linear recurrence
relation on its past values. The recurrence relation can then be used to
predict new values in the time series, which hopefully will be good
approximations of the actual values. In the second model, we consider that
two stocks, especially in the same industry, tend to be correlated, i.e., the
prices of a pair of stocks may have a statistical relationship called
cointegration. We exploit this property and implement pair trading in the
second model. This model is the ancestor of statistical arbitrage, which is
now a widely used method in the investment area [16][14]. In the third model,
we extend the idea to the point where an individual stock is very likely
influenced by the whole market. We hope to find the common market factors
that each stock may depend on. Therefore, a statistical method called
Principal Component Analysis (PCA) is employed to extract these common market
factors [17], i.e., the Principal Components (PCs). By regressing each stock
on the PCs, we infer their relationship, and then we use a VAR model, which
is a multivariate time series model, to forecast how the PCs evolve. We then
put the predicted PCs back into the original regressions and forecast
individual stocks. The last model is the market neutral model, in which we
form a portfolio whose expected returns are unrelated to the market
fundamentals. Regardless of how the market fluctuates, the portfolio's return
is just a stationary mean-reverting process. By using mean-reversion, which
is a very important technique in statistical arbitrage [3], we look for the
opportunities that would give us large expected returns, and then compare
these returns with the realized values.
The paper is organized as follows. In Section 2, we introduce the S&P 500
data that we use, and we examine some properties of this data set. Section 3
is divided into four parts. Each part sets forth a model of stock behavior
and a method for forecasting stocks' prices/returns in our case. In Section
4, we show the results of these four methods and compare their performance,
and we attempt a detailed analysis. Finally, we conclude in the last section.
2 Data and Stylized Facts
In this paper, we use a database of the S&P 500 (Standard & Poor's 500) from
1989 to 2012. The data source is CRSP (Center for Research in Security
Prices), which is part of the University of Chicago and renowned for its
expertise in building and maintaining historical, academic research-quality
stock market databases. We choose the S&P 500 because it comprises about 500
common stocks issued by large-cap companies and covers about 75 percent of
the American equity market by capitalization. Moreover, the S&P 500 index is
one of the most commonly followed equity indices, and many consider it one of
the best representations of the U.S. stock market and a bellwether for the
U.S. economy [1] (see Figure 1).
Figure 1: Historical S&P 500 Earning and US Nominal GDP
The components of the S&P 500 are selected by a committee. This is similar to
the Dow Jones Industrial Average, but different from other indices such as
the Russell 1000, which are strictly rule-based. When considering the
eligibility of a new addition, the committee assesses the company's merit
using eight primary criteria: market capitalization, liquidity, domicile,
public float, sector classification, financial viability, length of time
publicly traded, and listing exchange [2]. The committee selects the
companies in the S&P 500 so that they are representative of the industries in
the United States economy. In order to be added to the index, a company must
satisfy these liquidity-based size requirements: i) market capitalization
greater than or equal to USD 4.0 billion; ii) a ratio of annual dollar value
traded to float-adjusted market capitalization greater than 1.0; iii) a
minimum monthly trading volume of 250,000 shares in each of the six months
leading up to the evaluation date. Therefore, the set of companies in the S&P
500 is not static: from time to time one company drops out of the list and a
new company enters. That is why our data contain the records of 1127 stocks.
The stock prices in this data set are end-of-day prices. As there are roughly
252 business days a year, there are 5799 time records. In addition, these
prices are adjusted to include dividends and changes in the number of shares.
Thus, the trend of a stock's price reflects the market value of the company.
Moreover, we normally think that a price's increment is proportional to the
price itself, so the trend of a stock's price is exponential (see Figure 2)
and the log-prices are an I(1) process, which means that the log-returns
(first differences of log-prices) are stationary (Stock and Watson (1988b)).
Table 1 gives the results of ADF tests for all the stocks, which show that
log-prices are essentially I(1) processes with a unit root and log-returns
are stationary processes.
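The relation between exponential price trends and log-returns can be illustrated with a short sketch. The function name `log_returns` and the toy price path are hypothetical and only serve to show that a price growing by a fixed percentage per day has a constant first difference in logs.

```python
import math

def log_returns(prices):
    """First differences of log-prices: r_t = ln(P_t) - ln(P_{t-1})."""
    logs = [math.log(p) for p in prices]
    return [b - a for a, b in zip(logs, logs[1:])]

# Hypothetical price path growing by 1% per day: an exponential trend in
# prices turns into a constant log-return.
prices = [100.0 * 1.01 ** t for t in range(5)]
returns = log_returns(prices)
```

Every entry of `returns` equals `math.log(1.01)` up to rounding, which is the sense in which differencing removes the trend from the log-price series.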
Figure 2: Five S&P 500 Stocks Prices’ Evolution
As we have long time series over a broad universe of U.S. equities, we can
use back-testing to compare different methods for forecasting stocks'
prices/returns. The principle is as follows: we set two parameters, the
historical window and the forecasting window. Given the data in the
historical window, we forecast the prices/returns in the forecasting window
and then compare them with the actual data. The historical window moves over
time, so we obtain a series of comparisons on which to base a judgment.
Another issue is that, within a particular historical window, some companies
do not belong to the S&P 500 or have no data, so we need to restrict the
dataset to those stocks that existed continuously in that period. Standard &
Poor's believes that turnover in index membership should be avoided whenever
possible. Hence, companies that were added to the index usually stay in the
index unless too many of the addition criteria have been violated or the
company no longer exists due to mergers and acquisitions [2]. Thus, even
though the data carry the selection bias mentioned before, within a
historical window that is not too long we can assume that stocks behave
naturally.
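The moving-window back-test described above can be sketched as follows. This is a minimal illustration under assumptions: the helper names `rolling_windows` and `complete_stocks`, the use of `None` for missing data, and the toy panel are all hypothetical and not taken from the original study.

```python
def rolling_windows(n_obs, hist_len, fcst_len=1, step=1):
    """Yield (hist_start, hist_end, fcst_end) index triples.

    The historical window is [hist_start, hist_end) and the forecasting
    window is [hist_end, fcst_end), both as half-open index ranges.
    """
    start = 0
    while start + hist_len + fcst_len <= n_obs:
        hist_end = start + hist_len
        yield start, hist_end, hist_end + fcst_len
        start += step

def complete_stocks(panel, start, end):
    """Keep only stocks with no missing value (None) inside [start, end)."""
    return [name for name, series in panel.items()
            if all(v is not None for v in series[start:end])]

# Hypothetical toy panel: stock "B" has a gap on day 2, so it drops out of
# every window that contains that day.
panel = {"A": [1.0, 1.1, 1.2, 1.3, 1.4],
         "B": [2.0, 2.1, None, 2.3, 2.4]}
windows = list(rolling_windows(n_obs=5, hist_len=3, fcst_len=1))
usable = complete_stocks(panel, 0, 3)
```

With a 1000-day historical window and a one-day forecasting window, the same generator would be called as `rolling_windows(n_obs, 1000, 1)`.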
Table 1: Results of ADF tests for log-prices and log-returns processes (H0: the series has a unit root; 5% level)

| Series | Ratio of stocks that accept H0 | Ratio of stocks that reject H0 |
|---|---|---|
| log-prices | 95.08% | 4.92% |
| log-returns | 0 | 100% |
3 Models and Methods
3.1 Simple Autoregression Model
We begin with a very simple model, the autoregression (AR) model, which is
widely used for single time series problems. Suppose we are interested in
forecasting the value of a variable $Y_{t+1}$ based on a set of variables
$X_t$ observed at date $t$. In this case, $X_t$ consists of a constant plus
$Y_t, Y_{t-1}, \ldots, Y_{t-m+1}$. The common methodology is to choose the
forecast $Y^{*}_{t+1|t}$ so as to minimize

$$ E\left(Y_{t+1} - Y^{*}_{t+1|t}\right)^{2} \tag{1} $$

which is the mean squared error. Since $Y^{*}_{t+1|t}$ has the functional
form $g(X_t)$ based on the current information, the problem is to find the
function $g(X_t)$ that minimizes

$$ E\left(Y_{t+1} - g(X_t)\right)^{2} \tag{2} $$

When we use linear projection, i.e., $g(X_t)$ is a linear combination of
$Y_t, \ldots, Y_{t-m+1}$, equation 2 becomes an AR model. In our paper, we
choose just two lags and have the regression model:

$$ Y_t - u = \phi_1 (Y_{t-1} - u) + \phi_2 (Y_{t-2} - u) + \varepsilon_t \tag{3} $$
The reason for using a two-lag linear projection rather than a lag-selection
criterion (AIC/BIC) [8] or a non-linear model is that we see a trade-off
between the sample size, the number of parameters to be estimated, and the
credibility of the model. Estimating many parameters might reduce precision.
And because we do not have a 'true' model governing stock prices/returns
(Black (1986)), we can use any model that is effective to the extent we
expect.
Returning to equation 3, if we can assume
$E(\varepsilon_t \mid Y_{t-1}, Y_{t-2}) = 0$ and the process
$\{Y_t, [Y_{t-1}, Y_{t-2}]\}$ is covariance-stationary and ergodic for second
moments, then the OLS regression yields a consistent estimate of the
coefficients (Hamilton (1994)). Alternatively, we can rewrite equation 3 in
the form:

$$ \phi(L)(Y_t - u) = \varepsilon_t \tag{4} $$

where the autoregressive operator is $\phi(L) = 1 - \phi_1 L - \phi_2 L^2$.
As long as all the roots of $\phi(z) = 0$ lie outside the unit circle, the
autoregression satisfies the stationarity condition.
In this AR model, we choose log-returns, which are already stationary
processes, as our forecasting object. Specifically, if we define $Y_{it}$ as
the log-return of stock $i$ at time $t$, then equation 3 becomes

$$ Y_{it} = \beta_{0i} + \beta_{1i} Y_{i,t-1} + \beta_{2i} Y_{i,t-2} + \varepsilon_{it} \tag{5} $$

If the previous assumptions hold, we can apply OLS to this regression and get
consistent estimators $\hat\beta_{ki}$ ($k = 0, \ldots, 2$;
$i = 1, \ldots, N$). Note that this is not a panel data regression: there is
a separate regression for each stock, and the coefficients vary between
stocks. Furthermore, we set the length of the moving historical window to
1000 days, and we want to forecast the next-day return $E(Y_{i,t+1})$ of
stock $i$, which is

$$ E(Y_{i,t+1}) = \hat\beta_{0i} + \hat\beta_{1i} Y_{it} + \hat\beta_{2i} Y_{i,t-1} \tag{6} $$

Finally, we compare the forecast returns with the realized returns; the
results are shown in the next section.
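The two-lag regression and the one-step forecast for a single stock can be sketched in pure Python. This is only an illustration, not the code used for the reported results: the helper names (`solve`, `fit_ar2`, `forecast_next`) and the noise-free toy series are hypothetical, and OLS is solved here through the normal equations.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ar2(y):
    """OLS for y_t = b0 + b1*y_{t-1} + b2*y_{t-2} + e_t via normal equations."""
    rows = [(1.0, y[t - 1], y[t - 2]) for t in range(2, len(y))]
    target = y[2:]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * v for r, v in zip(rows, target)) for i in range(3)]
    return solve(xtx, xty)

def forecast_next(y, beta):
    """One-step-ahead forecast: b0 + b1 * (latest value) + b2 * (previous value)."""
    return beta[0] + beta[1] * y[-1] + beta[2] * y[-2]

# Hypothetical noise-free series generated with known coefficients
# (0.1, 0.5, -0.3), so the fit should recover them.
y = [1.0, 0.0]
for _ in range(10):
    y.append(0.1 + 0.5 * y[-1] - 0.3 * y[-2])
beta = fit_ar2(y)
next_return = forecast_next(y, beta)
```

In a back-test, `fit_ar2` would be re-run for every stock on each historical window, since the coefficients differ between stocks and windows.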
3.2 Pair Trading Model
The assumptions in the previous model are very strong. It is unlikely that
stocks move by themselves, uncorrelated with others. It is more plausible
that stocks are correlated, especially within the same industry. Figure 3
shows an example: the price evolution of two stocks in the same industry,
'Petroleum Refining' (SIC: 2911), from 1989 to 1990, which appear to be
highly correlated. Hence, in this model, we analyze a relationship commonly
used in time series, namely cointegration. Rather than dealing with
log-returns, which are stationary processes, we consider the log-prices,
which are integrated of order 1. If stocks $i$ and $j$ are in the same
industry or have similar characteristics, one expects to obtain a positive
profit by hedging one stock with the other (see Pole (2008)). In particular,
denote by $P_{it}$ and $P_{jt}$ the corresponding price series, and suppose
we can model them as

$$ \ln(P_{it}) = \alpha t + \beta \ln(P_{jt}) + X_t \tag{7} $$

where $X_t$ is a stationary, or mean-reverting, process. Then the two
log-price series, which are I(1), are cointegrated. Taking the first
difference of equation 7, the log-returns
$\ln(R_{it}) = \ln(P_{it}) - \ln(P_{i,t-1})$ satisfy

$$ \ln(R_{it}) = \alpha\, dt + \beta \ln(R_{jt}) + dX_t \tag{8} $$

In many situations, the drift $\alpha$ is small compared to the fluctuations
of $X_t$ and can be neglected. The mean-reversion of $X_t$ then suggests
forming a long-short portfolio in which we go long 1 dollar of stock $i$ and
short $\beta$ dollars of stock $j$ if $X_t$ is small, and conversely go short
stock $i$ and long stock $j$ if $X_t$ is large. Both positions are expected
to earn positive returns. This mean-reversion paradigm is typically
associated with market over-reaction: assets are temporarily under- or
over-priced with respect to one or several reference securities (Lo and
MacKinlay (1990)).
For our dataset, the concrete method is as follows. First, within one
historical window, we find a pair of stocks in the same industry that are
cointegrated without a deterministic trend (in our data, we use the SIC code
to identify industries). Denoting them as stocks $i$ and $j$ and regressing
one on the other, we have:

$$ \ln(P_{it}) = \beta \ln(P_{jt}) + X_t \tag{9} $$

And correspondingly, for log-returns,

$$ \ln(R_{it}) = \beta \ln(R_{jt}) + dX_t \tag{10} $$
Figure 3: Prices of two stocks in ‘Petroleum Refining’ industry from 1989 to
1990
As $X_t$ is a stationary process and we expect it to be mean-reverting, we
use an AR(1) model to diagnose $X_t$:

$$ X_t = \beta_0 + \beta_1 X_{t-1} + \varepsilon_t \tag{11} $$

Subtracting $X_{t-1}$ from both sides, we get

$$ dX_t = \beta_0 + (\beta_1 - 1) X_{t-1} + \varepsilon_t \tag{12} $$

Mean-reversion requires $(\beta_1 - 1) < 0$, and the more negative this
coefficient, the stronger the mean-reversion. Therefore, the next step is to
search all the stocks within the particular historical window
($t = 1, \ldots, T$), find the top ten mean-reverting pairs, and denote them
as $\{i^{*}, j^{*}\}_{10}$. For these ten pair-trading portfolios, we need to
forecast the next-day returns. Evaluating equation 10 at $T+1$ gives

$$ \ln(R_{i^{*},T+1}) - \beta \ln(R_{j^{*},T+1}) = dX^{*}_{T+1} \tag{13} $$

which means that going long 1 dollar of stock $i^{*}$ and short $\beta$
dollars of stock $j^{*}$ gives an expected return $E_T(dX^{*}_{T+1})$.
Moreover, from equation 12,

$$ dX^{*}_{T+1} = \beta^{*}_0 + (\beta^{*}_1 - 1) X^{*}_T + \varepsilon^{*}_{T+1} \tag{14} $$

If the assumption $E_T(\varepsilon^{*}_{T+1}) = 0$ is valid, which is also
required for a consistent estimator in the AR(1) model, we can derive the
result:

$$ E_T(dX^{*}_{T+1}) = E_T\{\ln(R_{i^{*},T+1}) - \beta \ln(R_{j^{*},T+1})\} = \beta^{*}_0 + (\beta^{*}_1 - 1) X^{*}_T \tag{15} $$

showing that the expected next-day ($T+1$) return of this pair trade is just
$\beta^{*}_0 + (\beta^{*}_1 - 1) X^{*}_T$. We can then compare the forecast
returns with the realized pair-trading returns, which are just
$\ln(R_{i^{*},T+1}) - \beta \ln(R_{j^{*},T+1})$ in the forecasting window.
The comparison results are shown in the next section.
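The pair-trading forecast for one candidate pair can be sketched as below: a no-intercept regression of one log-price on the other, an AR(1) fit on the residual spread, and the expected next-day spread change. The function names and the toy data are hypothetical; a real run would also require a cointegration test before trusting the spread, which is not shown here.

```python
def ols_slope_no_intercept(x, y):
    """Slope of y = beta * x + residual, with no constant term."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def fit_ar1(x):
    """OLS for x_t = b0 + b1 * x_{t-1} + e_t; returns (b0, b1)."""
    lag, cur = x[:-1], x[1:]
    n = len(cur)
    mean_lag, mean_cur = sum(lag) / n, sum(cur) / n
    cov = sum((a - mean_lag) * (b - mean_cur) for a, b in zip(lag, cur))
    var = sum((a - mean_lag) ** 2 for a in lag)
    b1 = cov / var
    return mean_cur - b1 * mean_lag, b1

def expected_pair_return(log_p_i, log_p_j):
    """Return (beta, b1 - 1, expected spread change at T+1)."""
    beta = ols_slope_no_intercept(log_p_j, log_p_i)
    spread = [pi - beta * pj for pi, pj in zip(log_p_i, log_p_j)]
    b0, b1 = fit_ar1(spread)
    return beta, b1 - 1.0, b0 + (b1 - 1.0) * spread[-1]

# Hypothetical toy data: stock j trends slowly, and the spread alternates
# around zero while decaying, i.e. it is strongly mean-reverting.
log_p_j = [4.0 + 0.01 * t for t in range(20)]
spread_true = [0.05 * (-0.5) ** t for t in range(20)]
log_p_i = [2.0 * pj + x for pj, x in zip(log_p_j, spread_true)]
beta, speed, forecast = expected_pair_return(log_p_i, log_p_j)
```

Here `speed` plays the role of $\beta_1 - 1$: ranking candidate pairs by this value, most negative first, is one way to pick the most mean-reverting pairs.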
Moreover, if we want to form a strategy to make more money, then among the
ten chosen pairs we select the pair ($i^{**}$ and $j^{**}$) whose absolute
expected return equals $\max\{|\beta^{*}_0 + (\beta^{*}_1 - 1) X^{*}_T|\}$,
and trade only that pair. If the expected return is positive, we go long 1
dollar of stock $i^{**}$ and short $\beta$ dollars of stock $j^{**}$. If it
is negative, we instead go short 1 dollar of stock $i^{**}$ and long $\beta$
dollars of stock $j^{**}$. Both cases give a positive expected return, i.e.,
$\max\{|\beta^{*}_0 + (\beta^{*}_1 - 1) X^{*}_T|\}$.
3.3 VAR Model
From the previous model, we can see that cointegrated time series share at
least one common trend. Both casual observation and economic theory suggest
that many series might contain the same stochastic trend, so that they are
cointegrated. If each of $n$ series is integrated of order 1 and they can be
jointly characterized by $k < n$ stochastic trends, then the vector
representation of these series has $k$ I(1) processes and $n - k$ distinct
stationary linear combinations. Stock and Watson (1988a) propose a technique
for extracting common stochastic trends by Principal Components Analysis
(PCA). As we already know that log-prices are I(1) processes, we can regress
each log-price process on these cointegrated Principal Components (PCs), and
the residuals should then be stationary. Alternatively, we can directly use
log-returns, which are already stationary processes; then the principal
components and the regression residuals are all stationary.
Here we briefly introduce PCA. PCA is a statistical method that uses
an orthogonal transformation to convert a set of observations of possibly
correlated variables into a set of values of linearly uncorrelated variables
called principal components. This transformation is defined in such a way
that the first principal component has the largest possible variance, that
is, accounts for as much of the variability in the data as possible. And
each succeeding component in turn has the highest variance possible under
the constraint that it is orthogonal to the preceding components. Thus, we
can preserve most of the information in the original data while reducing the
dimension of the dataset, i.e., obtaining a small number of common stochastic
trends.
The detailed procedure in our case is as follows. Within one historical
window ($t = 1, \ldots, T$; $i = 1, \ldots, N$), we first standardize each
stock's log-prices $p_i$:

$$ Y_{it} = \frac{p_{it} - \bar p_i}{\bar\sigma_i} \tag{16} $$

where

$$ \bar p_i = \frac{1}{T} \sum_{t=1}^{T} p_{it}; \qquad \bar\sigma_i^2 = \frac{1}{T-1} \sum_{t=1}^{T} (p_{it} - \bar p_i)^2 $$

Then we calculate the covariance matrix of $Y_{it}$ (which is also the
correlation matrix of the log-prices). It is denoted $C$, with

$$ C_{ij} = \frac{1}{T-1} \sum_{t=1}^{T} Y_{it} Y_{jt} \tag{17} $$

which is symmetric and non-negative definite. Notice that, for any stock $i$,
we have $C_{ii} = 1$. The next step is to consider the eigenvectors and
eigenvalues of the covariance matrix. Define $V$ as the eigenvector matrix
and $\lambda$ as the corresponding eigenvalues, i.e.,

$$ [V, \lambda] = \mathrm{Eig}(C) \tag{18} $$

As $V_i$ ($i = 1, \ldots, N$) are the eigenvectors of the covariance matrix,
they are orthogonal to each other and form an orthogonal basis of a new
space. We rank the eigenvalues in decreasing order:

$$ N \ge \lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \ldots \ge \lambda_N \ge 0 $$

and define $V_1, V_2, V_3, \ldots, V_N$ as the corresponding eigenvectors.
The spectrum of eigenvalues contains only a few large eigenvalues (see Figure
4). We can then choose the top $K$ eigenvectors, which correspond to the $K$
largest eigenvalues. From Jolliffe (2005), we know that the projection of the
original data onto these top eigenvectors $V_1, V_2, V_3, \ldots, V_K$ (the
principal basis of the new space) preserves most of the information.
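The standardization, the correlation matrix, and the leading eigenpair can be sketched without external libraries. The function names and the three toy series are hypothetical, and power iteration is a substitute technique chosen only to keep the example dependency-free: it returns the largest eigenvalue and its eigenvector, whereas a full eigendecomposition routine returns all $N$ pairs.

```python
import math

def standardize(series):
    """Subtract the mean and divide by the sample standard deviation (T - 1)."""
    t = len(series)
    mean = sum(series) / t
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / (t - 1))
    return [(v - mean) / std for v in series]

def correlation_matrix(columns):
    """C_ij = sum_t Y_it * Y_jt / (T - 1) on the standardized series."""
    y = [standardize(c) for c in columns]
    t = len(y[0])
    return [[sum(a * b for a, b in zip(yi, yj)) / (t - 1) for yj in y]
            for yi in y]

def top_eigenpair(c, iters=500):
    """Largest eigenvalue and eigenvector of a symmetric PSD matrix.

    Assumes the uniform start vector is not orthogonal to the leading
    eigenvector, which holds for the positively correlated toy data below.
    """
    n = len(c)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(c[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(c[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v

# Hypothetical toy log-price series for three stocks over four days.
columns = [[1.0, 2.0, 3.0, 4.0],
           [2.0, 4.0, 6.0, 8.0],
           [1.0, 3.0, 2.0, 5.0]]
c = correlation_matrix(columns)
lam, v = top_eigenpair(c)
```

Because the diagonal of `c` is all ones, the eigenvalues sum to the number of stocks, so `lam / len(columns)` is the share of total variance carried by the first component.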
Figure 4: Eigenvalues of the correlation matrix of stocks’ log-prices computed
on the first historical window (t=1. . . 100)
Thus, we project the log-price data in the historical window onto these top
eigenvectors and get $K$ principal components ($F_k$, $k = 1, \ldots, K$):

$$ F_{tj} = \sum_{i=1}^{N} \frac{V_{ji}}{\bar\sigma_i}\, p_{ti}, \qquad t = 1, \ldots, T; \; j = 1, \ldots, K \tag{19} $$

For each stock's log-price process, we regress it on those common trends:

$$ p_i = \theta_{i0} + \sum_{j=1}^{K} \theta_{ij} F_j + \delta_i, \qquad i = 1, 2, \ldots, N \tag{20} $$

As they are cointegrated, and if we can claim that the disturbance term is
uncorrelated with the PCs, the OLS estimators $\hat\theta_{ij}$
($i = 1, \ldots, N$; $j = 0, \ldots, K$) are consistent. The next step is
that, rather than auto-regressing and forecasting each single log-price
process, we use a Vector Autoregression (VAR) model to forecast these common
trends (PCs) and then combine the forecasts to estimate each log-price
process by putting them back into the original regression, equation 20. A
VAR(p) model is written as a vector autoregression over the previous $p$
values of the series, in this case:

$$ \vec F_t = \vec c + \phi_1 \vec F_{t-1} + \cdots + \phi_p \vec F_{t-p} + \vec\varepsilon_t \tag{21} $$

where

$$ \vec F_t = \begin{pmatrix} F_{1t} \\ \vdots \\ F_{Kt} \end{pmatrix}; \quad \vec c = \begin{pmatrix} c_1 \\ \vdots \\ c_K \end{pmatrix}; \quad \vec\varepsilon_t = \begin{pmatrix} \varepsilon_{1t} \\ \vdots \\ \varepsilon_{Kt} \end{pmatrix}; \quad \phi_s = \{\phi^{s}_{ij}\}_{K \times K} \tag{22} $$

Putting the forecast value of $\vec F_{t+1}$ into equation 20, we have

$$ \hat p_{i,t+1} = \hat\theta_{i0} + \sum_{j=1}^{K} \hat\theta_{ij} F_{j,t+1} \tag{23} $$
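As a worked special case, suppose the lag order is $p = 1$. The lag order is not stated in the text, so this choice is only illustrative, and the hats on $\hat c_j$ and $\hat\phi^{1}_{jl}$ (the estimated VAR intercepts and first-lag coefficients) are notation introduced here. Taking the conditional expectation of equation 21 at time $t$ and substituting it into equation 23 gives the one-step forecast of stock $i$ in terms of the currently observed factors only:

$$ \hat p_{i,t+1} = \hat\theta_{i0} + \sum_{j=1}^{K} \hat\theta_{ij} \left( \hat c_j + \sum_{l=1}^{K} \hat\phi^{1}_{jl} F_{lt} \right) $$

Each stock's forecast is therefore a fixed linear function of the $K$ current factor values, which makes clear that all cross-stock information enters only through the common trends.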
The principle of this method is that, rather than treating the evolution of a
stock price as a spontaneous and endogenous process, we regard it as highly
correlated with the whole market. As it is impossible to regress each stock
on the whole set of other stocks, we extract a small number of common
stochastic trends that largely represent the whole market. Through the
evolution of these trends, we capture more of the information that influences
a single stock's behavior. Admittedly, we encounter econometric problems
similar to those of single-series autoregression, and it is hard to justify
the validity of the assumptions. However, as long as this model increases
forecasting power, it is effective to some extent.
3.4 Market Neutral Model
Stocks' prices and returns are clearly influenced by market fundamentals.
However, it is hard to build a model that takes all possible factors into
account to explain and forecast fundamentals. Therefore, in this section, we
consider a statistical arbitrage model in which the portfolio's return is not
affected by market fundamentals. The common features of statistical arbitrage
are (i) trading signals are systematic or rules-based, (ii) the trading
portfolio is market-neutral, in the sense that it has zero beta with the
market, and (iii) the mechanism for generating excess returns is statistical.
The idea is to make many bets with significantly positive expected returns at
the appropriate times and to produce a low-volatility investment strategy
that is uncorrelated with the market.
Here we follow the paper by Avellaneda and Lee (2010) to build the model.
First we form the principal components of the log-returns of S&P 500 stocks
over a certain period. For example, if we are at time $T$ and need to
forecast the next-period stock returns, we use the past 60 days of records,
i.e., the historical window is 60 days. Following the same principle as in
the last section, we choose the $K$ most significant eigenvectors, which
correspond to the $K$ largest eigenvalues. Define these vectors as $V_i$
($i = 1, \ldots, K$). Then we project the log-return matrix ($60 \times N$)
onto these eigenvectors and form $K$ market factors:

$$ F_{tj} = \sum_{i=1}^{N} \frac{V_{ji}}{\bar\sigma_i}\, R_{ti}, \qquad j = 1, \ldots, K; \; t = (T - 59), \ldots, T \tag{24} $$

where $F_{tj}$ is the $j$th market factor at time $t$. Note that these market
factors are dynamic: they change as the historical window moves forward.
Then we regress each stock's log-returns on these market factors:

$$ R_i = m_i + \sum_{j=1}^{K} \beta_{ij} F_j + \tilde R_i, \qquad i = 1, 2, \ldots, N \tag{25} $$

The returns, the principal components and the residuals are all stationary,
and we can assume $E(\tilde R_i) = 0$. The proposed strategy is to look for
the regression residuals that show the strongest mean-reversion. Thus, we
auto-regress each $\tilde R_i$ and find the residuals that have the most
negative autoregressive coefficients:

$$ \tilde R_{it} = \rho_i \tilde R_{i,t-1} + \epsilon_{it}, \qquad i = 1, 2, \ldots, N \tag{26} $$

Figure 5 shows the top five mean-reverting residuals in the first historical
window.
Figure 5: The top 5 mean-reverting residuals in the first historical window
A trading portfolio containing $n$ stocks is said to be market-neutral if the
dollar amounts $\{Q_i\}_{i=1}^{n}$ invested in the stocks of this portfolio
satisfy:

$$ \bar\beta_j = \sum_{i=1}^{n} \beta_{ij} Q_i = 0, \qquad j = 1, 2, \ldots, K \tag{27} $$

where $\beta_{ij}$ is the coefficient from regressing stock $i$ on factor
$j$. In code, we use the null space to solve this linear system:

$$ Q = \mathrm{Null}\{\beta_{[K] \times [n]}\} \tag{28} $$

In order to guarantee a non-zero solution for the portfolio, we need to
choose $n = K + 1$ stocks, namely those with the $K + 1$ smallest
autoregressive coefficients, as our portfolio members. Then we have

$$ \begin{aligned} \sum_{i=1}^{K+1} Q_i R_i &= \sum_{i=1}^{K+1} Q_i m_i + \sum_{i=1}^{K+1} Q_i \sum_{j=1}^{K} \beta_{ij} F_j + \sum_{i=1}^{K+1} Q_i \tilde R_i \\ &= \sum_{i=1}^{K+1} Q_i m_i + \sum_{i=1}^{K+1} Q_i \tilde R_i + \sum_{j=1}^{K} \left( \sum_{i=1}^{K+1} \beta_{ij} Q_i \right) F_j \\ &= \sum_{i=1}^{K+1} Q_i (m_i + \tilde R_i) \end{aligned} \tag{29} $$

This equation makes clear that the portfolio return does not depend on the
market environment. It depends only on the intrinsic factor $m_i$ and the
random variable $\tilde R_i$, which is a mean-zero, stationary,
mean-reverting process.
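The null-space step can be illustrated for the smallest non-trivial case, $K = 2$ factors and $n = K + 1 = 3$ stocks, where the null space of the $2 \times 3$ loading matrix is spanned by the cross product of its two rows. This is a sketch under that special-case assumption; the function names and the loadings are hypothetical, and a general $K$ would need a proper null-space routine instead.

```python
def cross(u, v):
    """Cross product of two 3-vectors; it is orthogonal to both u and v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def neutral_weights(beta_rows):
    """Dollar weights Q with sum_i beta_ij * Q_i = 0 for K = 2 factors.

    beta_rows[j][i] is the loading of stock i on factor j, so the matrix
    is K x n with n = K + 1 = 3.
    """
    q = cross(beta_rows[0], beta_rows[1])
    scale = sum(abs(x) for x in q)
    if scale == 0.0:
        raise ValueError("factor loadings are linearly dependent")
    return [x / scale for x in q]  # normalise gross exposure to 1 dollar

# Hypothetical loadings of three stocks on two market factors.
beta = [[1.0, 0.8, 1.2],
        [0.3, -0.1, 0.2]]
q = neutral_weights(beta)
exposures = [sum(b * w for b, w in zip(row, q)) for row in beta]
```

Both entries of `exposures` are zero up to rounding, so the portfolio carries no loading on either factor. Choosing $n = K + 1$ stocks is what makes the null space one-dimensional, so the weights are unique up to scale and sign.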
The next step is to generate signals for entering a trade. Substituting the
autoregression 26 into equation 29, we have:

$$ \sum_{i=1}^{K+1} Q_i R_{it} = \sum_{i=1}^{K+1} Q_i (m_i + \tilde R_{it}) = \sum_{i=1}^{K+1} Q_i (m_i + \rho_i \tilde R_{i,t-1} + \epsilon_{it}) \tag{30} $$

Suppose we are at the last period $T$ of the historical window. From the
above equation, the expected portfolio return at $T+1$ is

$$ \begin{aligned} E_T \sum_{i=1}^{K+1} Q_i R_{i,T+1} &= E_T \sum_{i=1}^{K+1} Q_i (m_i + \rho_i \tilde R_{iT} + \epsilon_{i,T+1}) \\ &= \sum_{i=1}^{K+1} Q_i (m_i + \rho_i \tilde R_{iT}) \end{aligned} \tag{31} $$

When $\sum_{i=1}^{K+1} Q_i (m_i + \rho_i \tilde R_{iT})$ is very high
(positive), we can buy this portfolio and expect a high return. When it is
sufficiently negative, we can short this portfolio and still expect a high
return. Thus, we can use $\sum_{i=1}^{K+1} Q_i (m_i + \rho_i \tilde R_{iT})$
directly as our trading signal, where $\rho_i$ is a negative coefficient.
Define the signal as $S_T$. In our strategy, the trade entry criteria are:
1. If $S_T - \mathrm{mean}\{S_t\} \ge 0.7\,(\max\{S_t\} - \mathrm{mean}\{S_t\})$, $t = (T - 59), \ldots, T$:
enter the trade, going long the stocks whose $Q_i$ are positive by the amount
$|Q_i|$ and short the stocks whose $Q_i$ are negative by the amount $|Q_i|$.
This gives the expected return $\left|\sum_{i=1}^{K+1} Q_i (m_i + \rho_i \tilde R_{iT})\right|$.
2. If $S_T - \mathrm{mean}\{S_t\} \le 0.7\,(\min\{S_t\} - \mathrm{mean}\{S_t\})$, $t = (T - 59), \ldots, T$:
enter the trade, going long the stocks whose $Q_i$ are negative by the amount
$|Q_i|$ and short the stocks whose $Q_i$ are positive by the amount $|Q_i|$.
This gives the expected return $\left|\sum_{i=1}^{K+1} Q_i (m_i + \rho_i \tilde R_{iT})\right|$.
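The two entry rules can be written as one small function. This is a sketch: the function name `entry_decision`, its return labels, and the handling of a completely flat signal history are assumptions made for the illustration.

```python
def entry_decision(signals, threshold=0.7):
    """Classify the latest signal S_T against its own window history.

    signals holds S_t for the historical window, with S_T as the last item.
    Returns "long" (rule 1: hold the portfolio as weighted by Q),
    "short" (rule 2: hold the reversed portfolio), or None (no trade).
    """
    mean = sum(signals) / len(signals)
    dev = signals[-1] - mean
    upper = threshold * (max(signals) - mean)
    lower = threshold * (min(signals) - mean)
    if upper == lower:  # flat history: nothing to compare against
        return None
    if dev >= upper:
        return "long"
    if dev <= lower:
        return "short"
    return None

# Hypothetical signal histories (the last value is the current signal).
high = entry_decision([0.0, 0.1, -0.1, 0.05, 0.3])
low = entry_decision([0.0, 0.1, -0.1, 0.05, -0.3])
quiet = entry_decision([0.3, -0.3, 0.0])
```

Because the thresholds are relative to the extremes inside the window, a trade is entered only when the current signal is close to the strongest value seen over the window, which is why this method produces fewer forecasts than the others.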
Finally, as the historical window moves forward, we compare these expected
portfolio returns with the realized portfolio returns and compute their
correlation.
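The comparison between forecasts and realized values can be sketched with a sample correlation. The text does not say which correlation measure is used, so the Pearson coefficient below is an assumption, and the function name and inputs are hypothetical.

```python
import math

def pearson(forecasts, realized):
    """Sample Pearson correlation between two equally long sequences."""
    n = len(forecasts)
    mf, mr = sum(forecasts) / n, sum(realized) / n
    cov = sum((f - mf) * (r - mr) for f, r in zip(forecasts, realized))
    sf = math.sqrt(sum((f - mf) ** 2 for f in forecasts))
    sr = math.sqrt(sum((r - mr) ** 2 for r in realized))
    return cov / (sf * sr)

# Hypothetical forecasts against realized values.
perfect = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
inverse = pearson([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

A value near zero means the forecasts carry almost no linear information about the realized returns, so a correlation of a few percent indicates weak forecasting power.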
4 Comparison and Analysis
Table 2 shows the comparison results of these four methods for forecasting
stocks' log-returns in the S&P 500 universe. Some parameters need
clarification. In the time series models (AR and VAR), we use a historical
window of 1000 days, while in the statistical arbitrage models, following
Avellaneda and Lee (2010), we use the past 60 days of records as our
information set for trading. 'Common factors' refers to the number of other
time series used in the forecast; for the PCA-based models (VAR and Market
Neutral), it is the number of PCs used. As the last model uses a signal to
decide whether to enter a trade, its number of forecasts is smaller than for
the others.
Table 2: Comparison between four types of forecasting methods

| Method | Length of historical window | Common factors used | Forecasting times | Correlation with real returns |
|---|---|---|---|---|
| AR | 1000 | NA | 1000 | 3.17% |
| Pair Trading | 60 | 2 | 1000 | 13.89% |
| VAR | 1000 | 5 | 1000 | 4.52% |
| Market Neutral | 60 | 15 | 503 | 17.2% |
From the table we can see that both the AR and VAR models exhibit little
forecasting power. In the AR model, we investigate only each stock's own
log-returns. Individual log-price processes are almost random walks, in the
sense that log-returns, i.e., the differences of log-prices, are almost white
noise. Even though they are stationary processes, it is still hard to
forecast their future behavior. In the VAR model, we want to capture more of
the market information that affects a stock's behavior. Thus we look instead
at how the common market factors (PCs) evolve. By putting the forecast values
of the PCs into the original regressions, we get the predicted values for
each stock. However, the effect on forecasting each individual stock's
log-returns is small, so it would not much increase the opportunity to earn
money. Moreover, when we extend the forecasting window to five days, the
accuracy of these two methods decreases as the forecasting period increases
(see Table 3). Overall, time series forecasting has at most modest value over
short horizons, and its accuracy diminishes sharply as the prediction horizon
increases.
Table 3: Time series methods to forecast different periods

| Method | Length of historical window | Days to forecast | Correlation with real returns |
|---|---|---|---|
| AR | 1000 | 1 | 3.17% |
| VAR | 1000 | 1 | 4.52% |
| AR | 1000 | 5 | 1.38% |
| VAR | 1000 | 5 | 1.90% |

Nonetheless, Table 2 shows that the second and last models improve
forecasting power considerably. The underlying methodology in both models is
mean-reversion, a mathematical concept sometimes used for stock investing.
This concept suggests that prices and returns eventually move back towards
the mean or average. Revisiting equation 9 in the Pair Trading model, the
pair of stocks' log-prices are cointegrated and the regression residuals are
supposed to move around their average. By mean-reversion, we expect $dX_t$ to
be negatively correlated with $X_t$. This is not only a property that we
infer from the data, but is also supported by a theoretical model, the
Ornstein-Uhlenbeck (O-U) process. In mathematics, the O-U process (see
Gardiner (1985)) is a stochastic process that describes the velocity of a
massive Brownian particle under the influence of friction. The process is
stationary, Gaussian, and Markovian. Over time, the process tends to drift
towards its long-term mean: such a process is called mean-reverting.
Moreover, another important and widely used assumption in Finance is that the
stochastic movement of stock prices follows geometric Brownian motion. Thus,
for the $X_t$ in equation 9, we can apply the O-U process and get:

$$ dX_t = \kappa (m - X_t)\, dt + \sigma\, dW_t, \qquad \kappa > 0 \tag{32} $$

where $m$ is the mean of $X_t$, $dW_t$ is the increment of Brownian motion
($W_t \sim N(0, t)$), $\sigma$ measures the volatility of the movement, and
the parameter $\kappa$ is called the speed of mean-reversion. This process is
stationary and auto-regressive with lag 1. In particular, the increment
$dX_t$ has unconditional mean zero and conditional mean equal to

$$ E\{dX_t \mid X_s, s \le t\} = \kappa (m - X_t)\, dt $$

When $X_t > m$, we expect $dX_t$ to be negative, and $X_t < m$ implies a
positive expected $dX_t$. Rearranging equation 32, we get:

$$ dX_t = \kappa m\, dt - \kappa\, dt\, X_t + \sigma\, dW_t \tag{33} $$

Comparing it with equation 12 in the Pair Trading model, we find that they
have the same form, with

$$ \beta_0 = \kappa m\, dt, \qquad \beta_1 - 1 = -\kappa\, dt, \qquad \varepsilon_t = \sigma\, dW_t \tag{34} $$

This in turn supports the AR(1) model that we used for the process $X_t$.
Finding the most negative coefficient $\beta_1 - 1$ is equivalent to finding
the process with the highest speed of mean-reversion.
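The correspondence between the AR(1) coefficients and the O-U parameters can be inverted numerically. The sketch below assumes a time step of one trading day (`dt=1.0`); the function name and the coefficient values are hypothetical.

```python
def ou_from_ar1(b0, b1, dt=1.0):
    """Recover (kappa, m) from b0 = kappa * m * dt and b1 - 1 = -kappa * dt."""
    kappa = (1.0 - b1) / dt
    if kappa <= 0.0:
        raise ValueError("no mean reversion: requires b1 < 1")
    m = b0 / (kappa * dt)
    return kappa, m

# Hypothetical AR(1) estimates for a spread.
kappa, m = ou_from_ar1(b0=0.02, b1=0.9, dt=1.0)
```

With these inputs the speed of mean-reversion is about 0.1 per day and the long-run mean of the spread is about 0.2; a smaller `b1` gives a larger `kappa`, that is, faster reversion.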
For the last model, i.e., the Market Neutral model, we used another method
to identify mean-reverting processes. Instead of studying cointegrated
log-prices, we directly regress log-returns, which are already stationary
processes, on the common market factors (PCs). The residuals after the
regression (which includes a constant term) have mean zero. But there is
no rigorous model to support the claim that the residual series are
mean-reverting around zero. The relationship in equation 26, i.e.,
$\tilde{R}_{it} = \rho_i \tilde{R}_{i,t-1} + \varepsilon_{it}$, where $\rho_i < 0$,
is basically an assumption. However, we searched over all the stocks and
found those that are most likely to obey this relationship (see Figure 5).
Therefore, for the stocks we have chosen, it is reasonable to assume that
the residuals $\tilde{R}_{it}$ after regression on the common market
factors oscillate near zero, so we can effectively apply the
mean-reversion method. Nevertheless, we need to keep in mind that not all
stationary processes are mean-reverting or can be used for mean-reversion.
Moreover, if a random walk, which is an I(1) process, has mean zero, the
probability that it crosses zero is one, but the mean time to crossing
zero is infinite. Thus, we cannot apply mean-reversion directly to a
random walk process either.
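The screening step described above, regressing log-returns on the PCs and
then looking for residuals with a negative lag-1 coefficient, can be
sketched as follows. This is an illustration under assumptions rather than
our exact procedure: the function name residual_ar1_coefficients, the use
of an SVD on standardized returns to obtain the PCs, the single factor and
the synthetic data are all hypothetical.

```python
import numpy as np


def residual_ar1_coefficients(returns, n_factors):
    """returns: (T, N) array of log-returns.
    Gives the lag-1 coefficient rho_i of each stock's regression residual."""
    z = (returns - returns.mean(axis=0)) / returns.std(axis=0, ddof=1)
    # Principal components from the SVD of the standardized returns.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    pcs = u[:, :n_factors] * s[:n_factors]            # (T, k) factor series
    design = np.column_stack([np.ones(len(pcs)), pcs])
    coef, *_ = np.linalg.lstsq(design, returns, rcond=None)
    resid = returns - design @ coef                   # mean-zero residuals
    # No-intercept OLS slope of resid_t on resid_{t-1}, stock by stock.
    lagged, current = resid[:-1], resid[1:]
    return (lagged * current).sum(axis=0) / (lagged ** 2).sum(axis=0)


# Synthetic example (assumed data): one common factor, ten stocks, and only
# stock 0 has an idiosyncratic part with a negative lag-1 coefficient.
rng = np.random.default_rng(1)
T, N = 3000, 10
factor = rng.standard_normal(T)
noise = rng.standard_normal((T, N))
idio = noise.copy()
for t in range(1, T):
    idio[t, 0] = -0.5 * idio[t - 1, 0] + noise[t, 0]
returns = 0.01 * (factor[:, None] + idio)

rho = residual_ar1_coefficients(returns, n_factors=1)
most_mean_reverting = int(np.argmin(rho))
print(most_mean_reverting, rho[most_mean_reverting])
```

The stocks with the most negative estimated ρi are the candidates for the
mean-reversion method. A negative estimate in one window is evidence, not
proof, that the relationship will hold in the forecasting window.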
The reason for the relatively good performance of the second and last
models is that, instead of focusing on forecasting the variables
themselves, we pay attention to the residuals. Using either existing
theory or econometric analysis, we extract more information about the
properties of the residuals, which exhibit more forecastability. As the
famous saying in finance goes: “Profit comes from residuals”. The other
lesson from our research is that there is no ‘real’ model explaining
stocks’ prices or returns in finance. All the existing theories are only
partially right, and all the models are valid only when their assumptions
are reasonable. For example, the fundamental assumption for the O-U
process or the famous Black-Scholes model is that the underlying stock
price St follows geometric Brownian motion
\[ S_{t+1} - S_t = (r - q) S_t\,dt + \sigma S_t\,dW_t \;\Longrightarrow\; \frac{S_{t+1}}{S_t} = 1 + (r - q)\,dt + \sigma\sqrt{dt}\cdot Z, \quad Z \sim N(0, 1) \tag{35} \]
which suggests that log-prices follow a self-contained auto-regressive
process that is not impacted by other variables. However, we have already
found that this is not appropriate most of the time. There are too many
variables and factors that can influence the stock markets. Even if one
model works well for a time, once many people begin to use it, their
trading and investment behavior will in turn impact the market and may
offset the utility of that model. Hence, unlike some problems in
economics, financial markets are almost full of noise (Black (1986)) and
hard to model. The job is to find a little bit of useful information in
this enormous environment, catch the opportunity and make money.
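For reference, the discretization in equation 35 can be simulated
directly. The sketch below is only illustrative: the function name
simulate_gbm_euler and the values of r, q, σ and dt are assumptions, not
quantities estimated in this paper. It builds a price path from the gross
returns 1 + (r − q)dt + σ√dt · Z and then recovers the one-step simple
returns.

```python
import numpy as np


def simulate_gbm_euler(s0, r, q, sigma, dt, n_steps, seed=0):
    """Price path from S_{t+1}/S_t = 1 + (r - q) dt + sigma sqrt(dt) Z."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_steps)                  # Z ~ N(0, 1)
    gross = 1.0 + (r - q) * dt + sigma * np.sqrt(dt) * z
    return s0 * np.concatenate([[1.0], np.cumprod(gross)])


# Hypothetical inputs: 5% rate, 2% dividend yield, 20% volatility, daily steps.
dt = 1.0 / 252.0
prices = simulate_gbm_euler(100.0, 0.05, 0.02, 0.2, dt, 100_000)
simple_returns = prices[1:] / prices[:-1] - 1.0
print(simple_returns.mean(), simple_returns.std())
```

In such a path each return depends only on its own shock Z, which is the
"not impacted by others" property discussed above and the reason the
assumption is too restrictive for real stocks.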
5 Conclusion
Practical experiments and back-testing results illustrate that the
traditional time series methods do not work well. The AR and VAR models,
which belong to univariate and multivariate time series analysis
respectively, achieve less than 5 percent accuracy. When the forecasting
period increases, the accuracy decreases significantly. This suggests that
it is hard to derive a true recurrence relation that can be used to
predict new values. However, the Pair Trading and Market Neutral models,
which are based on the statistical arbitrage principle, improve the
forecastability to more than 10 percent. The idea is to form a pair or a
portfolio whose returns depend only on the values of the residuals; by
further exploiting the mean-reversion property of these residuals, we
gain more forecastability.
References
(2012). Standard & Poor’s 500 index - S&P 500. Investopedia.
(2013). S&P Indice Methodology. Standard And Poor’s.
Avellaneda, M. and J.-H. Lee (2010). Statistical arbitrage in the US
equities market. Quantitative Finance 10(7), 761–782.
Black, F. (1986). Noise. The Journal of Finance 41(3), 529–543.
Box, G. E., G. M. Jenkins, and G. C. Reinsel (2013). Time series analysis:
forecasting and control. John Wiley & Sons.
Gardiner, C. (1985). Stochastic methods. Springer-Verlag,
Berlin–Heidelberg–New York–Tokyo.
Hamilton, J. D. (1994). Time series analysis, Volume 2. Princeton
University Press, Princeton.
Hannan, E. J. and B. G. Quinn (1979). The determination of the order
of an autoregression. Journal of the Royal Statistical Society. Series B
(Methodological), 190–195.
Hassan, M. R. and B. Nath (2005). Stock market forecasting using hidden
Markov model: a new approach. In Intelligent Systems Design and
Applications, 2005. ISDA’05. Proceedings. 5th International Conference
on, pp. 192–196. IEEE.
Hassan, M. R., B. Nath, and M. Kirley (2007). A fusion model of HMM,
ANN and GA for stock market forecasting. Expert Systems with
Applications 33(1), 171–180.
Hirsa, A. (2012). Computational methods in finance. CRC Press.
Jolliffe, I. (2005). Principal component analysis. Wiley Online Library.
Lawrence, R. (1997). Using neural networks to forecast stock market
prices. University of Manitoba.
Lo, A. W. and A. C. MacKinlay (1990). When are contrarian profits
due to stock market overreaction? Review of Financial Studies 3(2),
175–205.
Miller, M. H., J. Muthuswamy, and R. E. Whaley (1994). Mean reversion
of Standard & Poor’s 500 index basis changes: Arbitrage-induced or
statistical illusion? The Journal of Finance 49(2), 479–513.
Pole, A. (2008). Statistical arbitrage: algorithmic trading insights and
techniques, Volume 411. John Wiley & Sons.
Stock, J. H. and M. W. Watson (1988a). Testing for common trends.
Journal of the American Statistical Association 83(404), 1097–1107.
Stock, J. H. and M. W. Watson (1988b). Variable trends in economic time
series. The Journal of Economic Perspectives, 147–174.

  • 1. Comparison of Different Methods in Forecasting Stocks’ Returns or Prices Zhicheng Li/Sirui Zhang/Haoran Jiang Abstract In this paper, four models are built in order to explain stocks behav- ior, and the corresponding methods are used to forecast stocks’ returns or prices in S&P 500 universe. All the forecasting results are compared with the real values. It is shown that the traditional time series meth- ods, including univariate (in AR model) and mutivariate (in VAR model) methods, give little forecastability. On the contrary, the methods based on statistical arbitrage, i.e, the Pair Trading and Market Neurtral model, per- form much better. Meanwhile, we introduce some statistical techniques, such as Principle Components Analysis (PCA) and mean-reversion con- cept. Finally, Econometrics and statistic analysis are attempt to give a reasonable interpretation. 1 Introduction Forecasting is an everlasting topics not only in Economics but also in Fi- nance. In the stock market, the incentive to make a good forecasting is particularly strong, in the sense that people who have a better prediction would make more money. Therefore, a lot of researches have been done and various models and methods have been proposed and used. Before the age of computers, people traded stocks and commodities mainly on intu- ition. As the level of trading and the technology grew, people searched for tools and methods that would increase their gains meanwhile minimizing their risk. Statistics, fundamental analysis, and linear/non-liner regres- sions are all attempt to predict and benefit from the markets direction [5]. In recent studies, some new techniques, such as Neural Network, Hidden Markov Method(HMM) and Genetic Algorithms (GA), are used to fore- cast stocks’ activity [9][10][13]. None of these techniques has proven to be consistently correct as desired, and many skeptics argue about the utility of many of these approaches. However, these methods are commonly used in practice. 
In our paper, we present four models with the application to S&P 500 stocks market. In each model, we state the concrete method for forecast- ing. Given a particular time window in S&P universe, we forecast the stocks’ prices or returns, then we compare the forecasting results with the real values by calculating correlations. At last we look at the performance of each method. The first model we start with is Auto-Regression (AR) 1
  • 2. model, which is broadly used in time series analysis [5][7]. It assumes that stock behaves in an autocorrelated and stochastic way, and is not correlated with other stocks/factors. Basically, this method attempts to model a linear function by a recurrence relation derived from past values. The recurrence relation can then be used to predict new values in the time series, which hopefully will be good approximations of the actual values. While in the second model, we think that two stocks, especially in one common industry, are tent to be correlated, i.e., a pair of stocks’ prices are possibly to have a statistical relationship, called cointegration. We dig out this property and implement pair trading in the second model. This model is the ancestor of statistical arbitrage, which now is a widely used method in the investment area [16][14]. In the third model, we extend our idea to the point where individual stock is very possibly influenced by whole market. We hope to find those common market factors that each stock may depend on. Therefore, a statistical method, called Prin- ciple Component Analysis (PCA), is employed to extract these common market factors [17], i.e., Principle Components(PCs). By regressing each stock on PCs, we infer their relationship, and further by VAR model, which is a multivariate time series model, we forecast how PCs evolute. Then we put the predicted PCs back to the original regressions and fore- cast individual stocks. The last model we apply is market neutral model, in which we form a portfolio whose expected returns are nothing related with the market fundamentals. In spite of how the market fluctuates, the portfolio’ return is just a stationary mean-reverting process. By using mean-reversion, which is a very important technique in statistical arbi- trage [3], we look for the opportunities that would give us large expected returns, and then compare these returns with real values. The structure of our paper is organized as below. 
In Section 2, we introduce the data of S&P 500 stock market that we are using, and we further diagnose and discover some property of this data set. Section 3 are divided into four parts. Each part set forth a model of studying stocks’ behaviors and a method of how to forecast stocks’ prices/returns in our case. Then in Section 4, we show the results of these four methods and compare their performance. A detailed and reasonable analysis is also tried. At last, we make a conclusion in the final Section. 2 Data and Stylized fact In this paper, we use a database of S&P 500 (Standard & Poor’s 500) from year 1989 to 2012. The data source is from CRSP (Center Research Security Price), which is part of University of Chicago and renowned for its expertise in building and maintaining historical, academic research- quality stock market databases. The reason to choose S&P 500 is that it comprises nearly 500 common stocks issued by 500 large-cap companies, and covers about 75 percent of the American equity market by capitaliza- tion. Meanwhile, S&P 500 indice is one of the most commonly followed equity indices, and many consider it one of the best representations of the U.S. stock market, and a bellwether for the U.S. economy [1] (See Figure 1). 2
  • 3. Figure 1: Historical S&P 500 Earning and US Nominal GDP The components of the S&P 500 are selected by the committee. This is similar to the Dow Jones Industrial Average, but different from others such as the Russell 1000, which are strictly rule-based. When considering the eligibility of a new addition, the committee assesses the company’s merit using eight primary criteria: market capitalization, liquidity, domicile, public float, sector classification, financial viability, length of time publicly traded and listing exchange [2]. The committee selects the companies in the S&P 500 so they are representative of the industries in the United States economy. In order to be added to the index, a company must satisfy these liquidity-based size requirements: i) market capitalization is greater than or equal to US4.0 billion; ii) annual dollar value traded to float- adjusted market capitalization is greater than 1.0; iii)minimum monthly trading volume of 250,000 shares in each of the six months leading up to the evaluation date. Therefore, companies in S&P 500 are not static. Sometimes, one company may dropped out from the list, and sometimes another new company entered. That’s why we could see 1127 stocks’ records in our data. The stocks’ prices in this data set are End-of-Day prices. As we have roughly 252 business days a year, there are 5799 time records. In addition, these prices are adjusted for including dividends and expanding shares. Thus, the tendency of one stock prices can reflect the market value of that company. Moreover, we normally think price’s increment is proportional to itself, so the trend of one stock prices is exponential (See Figure 2) and the log-prices would be I(1) process, which means the log-returns (first differences of log-prices) are stationary (Stock and Watson (1988b)). 
Table 1 is the results of ADF tests for all the stocks, which evidently show that log-prices are basically I(1) process which have unit root and log-returns are stationary process. 3
  • 4. Figure 2: Five S&P 500 Stocks Prices’ Evolution As we have a long time series in broad universe of U.S equities, we could use back-testing method to compare different methods for forecast- ing stocks’ prices/returns. The principle is following: we set two param- eters, i.e., historical window and forecasting window. Given the data in historical window, we anticipate the prices/returns in forecasting window, and then compare them with the actual data. The historical window can move over time, so we can get a series of comparison results and make a judgment. Another issue is that within a particular historical window, some companies are not belong to S&P 500 or have no data, we need refine the dataset to those stocks who continuously existed in that period. Standard & Poor believes that turnover in index membership should be avoided whenever possible. Hence companies which were added to the index usually stays in the index unless too many of the addition criteria has been violated or if the company no longer exist due to mergers and acquisitions [2]. Thus even it has the selection base which we have men- tioned before, within the certain historical window that is not too long, we can think that stocks behave naturally. Table 1: Results of ADF tests for log-prices and log-returns processes H0: have a unit root (5% level) Ratio of stocks that accept Ratio of stocks that reject log-prices 95.08% 4.92% log-returns 0 100% 4
  • 5. 3 Models and Methods 3.1 Simple Autoregression Model At the beginning, let us use a very simple model, that is autoregres- sion(AR) model, which is widely used in single time series problem. Sup- pose we are interested in forecasting the value of a variable Yt+1 based on a set of variables Xt observed at date t. In this case, Xt consist of a constant plus Yt, Yt−1, . . . Yt−m+1. Common methodology is to choose the forecast Y ∗ t+1|t, so as to minimize E(Yt+1 − Y ∗ t+1|t)2 (1) which is mean squared error. Y ∗ t+1|t has a function form g(Xt) based on the current information, then the last equation is to find the function g(Xt) that minimize E(Yt+1 − g(Xt))2 (2) When we use linear projection, i.e, g(Xt) is a linear combination of Yt, . . . Yt−m+1, equation 2 becomes a AR model. In our papaer, we just choose two lags and have the regression model: Yt − u = φ1(Yt−1 − u) + φ2(Yt−2 − u) + εt (3) The reason for using two lags linear projection other than some other methods (AIC/BIC) in determining lags [8] or using non-linear models is that we think there is a trade-off between the size of samples, the numbers of parameters to be estimated, and the credibility of the model we have. Many parameters to be estimated might cause the lack of precision due to the estimation process. And because we don’t have a ‘true’ model governing stock prices/returns (Black (1986)), as long as what we have built is effective to some extend as we expect, we could use it. Back to the equation 3, if we could assume E(εt | Yt−1, Yt−2) = 0 and the process {Yt, [Yt−1, Yt−2]} is covariance-stationary and ergodic for second moments, then the OLS regression yields a consistent estimate for coefficients (Hamilton (1994)). Or, we transfer equation 3 to the form: φ(L)(Y − u) = εt (4) where the autoregressive operator φ(L) = (1−φ1L−φ2L2 ). As long as all the roots of φ(z) = 0 lie outside the unit circle, the autoregression satisfies the stationary condition. 
In this AR model, we choose log-returns which are already stationary process as our forecasting object. Specifically, if we define Yit as the log- return of stock i at time t, then equation 3 becomes Yit = β0i + β1iYit−1 + β2iYit−2 + εit (5) If the previous assumptions hold, we could apply OLS to this regression and get consistent estimator ˆβki, (k = 0 . . . 2, i = 1 . . . N). Here we should notice that this is not a panel data regression. They are different regres- sions for different stocks, and the coefficients vary between stocks. Further 5
  • 6. more, we set the length of the moving historical window as 1000 days, and we want to forecast the next day return E(Yit+1) of stock i, which is E(Yit+1) = ˆβ0i + ˆβ1iYit + ˆβ2iYit−1 (6) At last we compare the forecast returns with real returns, and the results are shown in next section. 3.2 Pair Trading Model The assumptions in the previous model are very strong. It is unlikely that stocks changes by themselves and are uncorrelated with others. In other words, it is more plausible to think that stocks are possibly corre- lated, especially in the same industry. Figure 3 shows a example that the prices’ evolutions of two stocks in the same industry ‘Petroleum Refining’ (SIC:2911) from year 1989 to 1990, and it seems that they are highly cor- related. Hence, in this model, we adopt one relationship which commonly used in time series, i.e., cointegration, to analysis. Other than dealing with log-returns, which are stationary process, we consider the log-prices that are integrated of order 1. If stocks i and j are in the same industry or have similar characteristics, one expects by hedging one stock on the other to get positive profit (see Pole (2008)). Particularly, denote Pit and Pjt as the corresponding price series, when we can model them like ln(Pit) = αt + βln(Pjt) + Xt (7) where Xt is a stationary, or a mean-reverting process. Then the relation between these two log-prices which are I(1) series is cointegration. By taking first difference of equation 7, log-returns should be satisfied ln(Rit) = αdt + βln(Rjt) + dXt (8) In many situation, the drift α is small compared to the fluctuations of Xt and can be neglected. Thus the mean-reversion of Xt suggests us that we could form a long-short portfolio in which we go long 1 dollar of stock i and short β dollars of stock j if Xt is small. And conversely, go short stock i and long j if Xt is large. Both situations are expected to get positive returns. 
This mean-reversion paradigm is typically associated with market over-reaction: assets are temporarily under or over priced with respect to one or several reference securities (Lo and MacKinlay (1990)). For our dataset, the concrete method is described as below. At first within one historical window, we find a pair of stocks which are cointe- grated without deterministic trend under certain industry (in our data, we use SIC code to identify the industries). Denote them as stock i and j, by regressing one on the other, we have: ln(Pit) = βln(Pjt) + Xt (9) And correspondingly, for log-returns, ln(Rit) = βln(Rjt) + dXt (10) 6
  • 7. Figure 3: Prices of two stocks in ‘Petroleum Refining’ industry from 1989 to 1990 As the Xt is stationary process and we expect to find mean-reverting property, we use AR(1) model to do diagnose Xt: Xt = β0 + β1Xt−1 + εt (11) Subtracting both sides by Xt−1, we get dXt = β0 + (β1 − 1)Xt−1 + εt (12) The mean-reversion requires (β1 − 1) < 0, and the more negative, the more mean-reverting. Therefore, the next step is to, within the particular historical window (t=1. . . T), search all the stocks, find the top ten mean- reverting pairs, and denote them as {i∗ , j∗ }10. Then for these ten pair- trading portfolio, we need forecast their next day returns. By putting T+1 to the equation 10, it becomes ln(Ri∗T +1) − βln(Rj∗T +1) = dX∗ T +1 (13) which means that long 1 dollar stock i∗ and short β dollars j∗ would give us a expected return ET(dX∗ T +1). What’s more, from equation 12, it is easy to see dX∗ T +1 = β∗ 0 + (β∗ 1 − 1)X∗ T + ε∗ T +1 (14) If we have the valid assumption ET(ε∗ T +1) = 0, which is also the require- ment for getting a consistent estimator in AR(1), we could derive the result: ET(dX∗ T +1) = ET{ln(Ri∗T +1) − βln(Rj∗T +1)} = β∗ 0 + (β∗ 1 − 1)X∗ T (15) showing that the expected returns in next day (T+1) of this pair trading are just β∗ 0 + (β∗ 1 − 1)X∗ T . Then we can compare the forecasting returns with the real returns by using pair trading, which is just ln(Ri∗T +1) − βln(Rj∗T +1) located in the forecasting window. The results of comparison will be shown in next part. 7
Moreover, to form a more profitable strategy, among the ten chosen pairs we select the pair ($i^{**}$ and $j^{**}$) whose absolute expected return equals $\max\{|\beta_0^* + (\beta_1^* - 1) X^*_T|\}$, and trade only that pair. If the expected return is positive, we go long 1 dollar of stock $i^{**}$ and short $\beta$ dollars of stock $j^{**}$. If it is negative, we instead short 1 dollar of stock $i^{**}$ and go long $\beta$ dollars of stock $j^{**}$. Both cases give a positive expected return, namely $\max\{|\beta_0^* + (\beta_1^* - 1) X^*_T|\}$.

3.3 VAR Model

From the previous model, we could see that cointegrated time series share at least one common trend. Both casual observation and economic theory suggest that many series might contain the same stochastic trend, so that they are cointegrated. If each of $n$ series is integrated of order 1 and the series can be jointly characterized by $k < n$ stochastic trends, then the vector representation of these series has $k$ I(1) processes and $n - k$ distinct stationary linear combinations. Stock and Watson (1988a) propose a technique for extracting common stochastic trends by Principal Components Analysis (PCA). As we already know that log-prices are I(1) processes, we can regress each log-price process on these cointegrated Principal Components (PCs); the resulting residuals should then be stationary. Alternatively, we can directly use log-returns, which are already stationary, in which case the principal components and the regression residuals are all stationary.

Here we briefly introduce PCA. PCA is a statistical method that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. This transformation is defined in such a way that the first principal component has the largest possible variance, that is, it accounts for as much of the variability in the data as possible.
Each succeeding component in turn has the highest variance possible under the constraint that it is orthogonal to the preceding components. Thus, we can preserve most of the information in the original data while reducing the dimension of the dataset, i.e., obtain a small number of common stochastic trends.

The detailed procedure for our case is as follows. Within one historical window ($t = 1, \dots, T$; $i = 1, \dots, N$), we first standardize each stock's log-prices $p_i$:

$$Y_{it} = \frac{p_{it} - \bar{p}_i}{\bar{\sigma}_i} \tag{16}$$

where

$$\bar{p}_i = \frac{1}{T} \sum_{t=1}^{T} p_{it}, \qquad \bar{\sigma}_i^2 = \frac{1}{T-1} \sum_{t=1}^{T} (p_{it} - \bar{p}_i)^2$$

Then we calculate the covariance matrix of $Y_{it}$ (which here is also the correlation matrix). It is denoted $C$, with

$$C_{ij} = \frac{1}{T-1} \sum_{t=1}^{T} Y_{it} Y_{jt} \tag{17}$$
which is symmetric and non-negative definite. Notice that, for any stock $i$, we have $C_{ii} = 1$. The next step is to consider the eigenvectors and eigenvalues of the covariance matrix. Define $V$ as the matrix of eigenvectors and $\lambda$ as the corresponding eigenvalues, i.e.,

$$[V, \lambda] = \mathrm{Eig}(C) \tag{18}$$

As $V_i$ ($i = 1, \dots, N$) are the eigenvectors of the covariance matrix, they are orthogonal to each other and form an orthogonal basis of a new space. We rank the eigenvalues in decreasing order,

$$N \ge \lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \dots \ge \lambda_N \ge 0$$

and define $V_1, V_2, V_3, \dots, V_N$ as the corresponding eigenvectors. The spectrum contains only a few large eigenvalues (see Figure 4). We can then choose the top $K$ eigenvectors, which correspond to the $K$ largest eigenvalues. From Jolliffe (2005), we know that the projection of the original data on these top eigenvectors $V_1, V_2, V_3, \dots, V_K$ (the principal basis of the new space) preserves most of the information.

Figure 4: Eigenvalues of the correlation matrix of stocks' log-prices computed on the first historical window ($t = 1, \dots, 100$)

Thus, we project the log-price data in the historical window on these top eigenvectors and get $K$ principal components ($F_k$, $k = 1, \dots, K$):

$$F_{tj} = \sum_{i=1}^{N} \frac{V_{ji}}{\bar{\sigma}_i} p_{ti}, \qquad t = 1, \dots, T; \quad j = 1, \dots, K \tag{19}$$

For each stock's log-price process, we regress it on those common trends:

$$p_i = \theta_{i0} + \sum_{j=1}^{K} \theta_{ij} F_j + \delta_i, \qquad i = 1, 2, \dots, N \tag{20}$$

As they are cointegrated, and if we can claim that the disturbance term is uncorrelated with the PCs, the OLS estimators $\hat{\theta}_{ij}$ ($i = 1, \dots, N$; $j = 0, \dots, K$)
are consistent. The next step is that, rather than auto-regressing and forecasting each log-price process separately, we use a Vector Autoregression (VAR) model to forecast these common trends (PCs) and then combine the forecasts to estimate each log-price process by substituting them back into the original regression, equation 20. A VAR(p) model is written as a vector autoregression over the previous $p$ values of the series, in this case:

$$\vec{F}_t = \vec{c} + \phi_1 \vec{F}_{t-1} + \dots + \phi_p \vec{F}_{t-p} + \vec{\varepsilon}_t \tag{21}$$

where

$$\vec{F}_t = \begin{pmatrix} F_{1t} \\ \vdots \\ F_{Kt} \end{pmatrix}; \quad \vec{c} = \begin{pmatrix} c_1 \\ \vdots \\ c_K \end{pmatrix}; \quad \vec{\varepsilon}_t = \begin{pmatrix} \varepsilon_{1t} \\ \vdots \\ \varepsilon_{Kt} \end{pmatrix}; \quad \phi_s = \{\phi^s_{ij}\}_{K \times K} \tag{22}$$

Substituting the forecast value of $\vec{F}_{t+1}$ into equation 20, we have

$$\hat{p}_{i,t+1} = \hat{\theta}_{i0} + \sum_{j=1}^{K} \hat{\theta}_{ij} F_{j,t+1} \tag{23}$$

The principle of this method is that, rather than treating the evolution of a stock price as a spontaneous and endogenous process, we regard it as highly correlated with the whole market. As it is impossible to regress each stock on the whole set of other stocks, we extract a small number of common stochastic trends that largely represent the whole market. Through the evolution of these trends, we capture more of the information that influences a single stock's behavior. Admittedly, we encounter econometric problems similar to those in single-series autoregression, and it is hard to justify the validity of those assumptions. However, as long as this model increases the forecastability, it is effective to some extent.

3.4 Market Neutral Model

Stocks' prices or returns are evidently influenced by market fundamentals. However, it is hard to build a model that takes all possible factors into account in order to explain and forecast fundamentals. Therefore, in this section, we consider a statistical arbitrage model, in which the portfolio's return is not affected by market fundamentals.
The common features of statistical arbitrage are (i) trading signals are systematic or rules-based, (ii) the trading portfolio is market-neutral, in the sense that it has zero beta with the market, and (iii) the mechanism for generating excess returns is statistical. The idea is to make many bets with significantly positive expected returns at the appropriate times, and so produce a low-volatility investment strategy that is uncorrelated with the market.

Here we follow Avellaneda and Lee (2010) in building this model. First we form principal components of the log-returns of S&P 500 stocks over a certain period. For example, if we are at time $T$ and need to forecast the next period's stock returns, we use the past 60 days of records, i.e., the historical window is 60 days. Following the same
principle as in the last section, we choose the $K$ most significant eigenvectors, which correspond to the $K$ largest eigenvalues. Define these vectors as $V_i$ ($i = 1, \dots, K$). Then we project the log-return matrix ($60 \times N$) on these eigenvectors to form $K$ market factors:

$$F_{tj} = \sum_{i=1}^{N} \frac{V_{ji}}{\bar{\sigma}_i} R_{ti}, \qquad j = 1, \dots, K; \quad t = (T - 59), \dots, T \tag{24}$$

where $F_{tj}$ is the $j$th market factor at time $t$. Note that these market factors are dynamic: they change as the historical window moves forward. Then we regress each stock's log-returns on these market factors:

$$R_i = m_i + \sum_{j=1}^{K} \beta_{ij} F_j + \tilde{R}_i, \qquad i = 1, 2, \dots, N \tag{25}$$

Returns, principal components and residuals are all stationary, and we can assume $E(\tilde{R}_i) = 0$. The proposed strategy is to look for the regression residuals with the most pronounced mean-reverting behavior. Thus, we auto-regress each $\tilde{R}_i$ and find the residuals with the most negative autoregressive coefficients:

$$\tilde{R}_{it} = \rho_i \tilde{R}_{i,t-1} + \varepsilon_{it}, \qquad i = 1, 2, \dots, N \tag{26}$$

Figure 5 shows the top five mean-reverting residuals in the first historical window.

Figure 5: The top 5 mean-reverting residuals in the first historical window

A trading portfolio containing $n$ stocks is said to be market-neutral if the dollar amounts $\{Q_i\}_{i=1}^{n}$ invested in the stocks of the portfolio satisfy

$$\bar{\beta}_j = \sum_{i=1}^{n} \beta_{ij} Q_i = 0, \qquad j = 1, 2, \dots, K \tag{27}$$
$\beta_{ij}$ is the coefficient from regressing stock $i$ on factor $j$. In code, we solve this linear system using the null space:

$$Q = \mathrm{Null}\{\beta_{[K] \times [n]}\} \tag{28}$$

To guarantee a non-zero solution for the portfolio, we need to choose $n = K + 1$ stocks, namely those with the smallest $K + 1$ autoregressive coefficients, as portfolio members. Then we have

$$\begin{aligned}
\sum_{i=1}^{K+1} Q_i R_i &= \sum_{i=1}^{K+1} Q_i m_i + \sum_{i=1}^{K+1} Q_i \sum_{j=1}^{K} \beta_{ij} F_j + \sum_{i=1}^{K+1} Q_i \tilde{R}_i \\
&= \sum_{i=1}^{K+1} Q_i m_i + \sum_{i=1}^{K+1} Q_i \tilde{R}_i + \sum_{j=1}^{K} \left( \sum_{i=1}^{K+1} \beta_{ij} Q_i \right) F_j \\
&= \sum_{i=1}^{K+1} Q_i (m_i + \tilde{R}_i)
\end{aligned} \tag{29}$$

From this equation it is clear that the portfolio return does not depend on the market factors. It depends only on the intrinsic terms $m_i$ and the random variables $\tilde{R}_i$, which are mean-zero, stationary and mean-reverting. The next step is to generate signals for entering trades. Substituting the autoregressive expression 26 into equation 29, we have:

$$\sum_{i=1}^{K+1} Q_i R_{it} = \sum_{i=1}^{K+1} Q_i (m_i + \tilde{R}_{it}) = \sum_{i=1}^{K+1} Q_i (m_i + \rho_i \tilde{R}_{i,t-1} + \varepsilon_{it}) \tag{30}$$

Suppose we are at the last period $T$ of the historical window. From the above equation, the expected portfolio return at $T+1$ is

$$E_T\left[ \sum_{i=1}^{K+1} Q_i R_{i,T+1} \right] = E_T\left[ \sum_{i=1}^{K+1} Q_i (m_i + \rho_i \tilde{R}_{iT} + \varepsilon_{i,T+1}) \right] = \sum_{i=1}^{K+1} Q_i (m_i + \rho_i \tilde{R}_{iT}) \tag{31}$$

When $\sum_{i=1}^{K+1} Q_i (m_i + \rho_i \tilde{R}_{iT})$ is strongly positive, we buy this portfolio and expect a high return. When it is sufficiently negative, we short the portfolio and again expect a high return. Thus, we can directly use $\sum_{i=1}^{K+1} Q_i (m_i + \rho_i \tilde{R}_{iT})$ as our trading signal, where each $\rho_i$ is a negative coefficient. Define the signal as $S_T$. In our strategy, the entry criteria are:

1. If $S_T - \mathrm{mean}\{S_t\} \ge 0.7\,(\max\{S_t\} - \mathrm{mean}\{S_t\})$, $t = (T - 59), \dots, T$: enter the trade, going long the stocks whose $Q_i$ are positive by the amount $|Q_i|$ and short the stocks whose $Q_i$ are negative by the amount $|Q_i|$. This gives the expected return $\left| \sum_{i=1}^{K+1} Q_i (m_i + \rho_i \tilde{R}_{iT}) \right|$.

2. If $S_T - \mathrm{mean}\{S_t\} \le 0.7\,(\min\{S_t\} - \mathrm{mean}\{S_t\})$, $t = (T - 59), \dots, T$: enter the trade, going long the stocks whose $Q_i$ are negative by the amount $|Q_i|$ and short the stocks whose $Q_i$ are positive by the amount $|Q_i|$. This gives the expected return $\left| \sum_{i=1}^{K+1} Q_i (m_i + \rho_i \tilde{R}_{iT}) \right|$.
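The steps in equations 24 to 31 and the two entry rules can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation. All function names are hypothetical. The null space is taken from an SVD, which is one of several ways to compute it. Scaling $Q$ to unit gross exposure is an added assumption, since the text does not say how $Q$ is normalized.

```python
import numpy as np


def pca_factors(returns, k):
    """Equation (24): project a (T x N) log-return matrix on the top-k eigenvectors."""
    sigma = returns.std(axis=0, ddof=1)
    standardized = (returns - returns.mean(axis=0)) / sigma
    corr = standardized.T @ standardized / (len(returns) - 1)
    _, eigvec = np.linalg.eigh(corr)         # eigenvalues are returned in ascending order
    top = eigvec[:, ::-1][:, :k]             # (N x k), largest eigenvalue first
    return returns @ (top / sigma[:, None])  # (T x k) factor series


def factor_regression(returns, factors):
    """Equation (25): OLS of every stock on the factors, with intercept m_i."""
    design = np.column_stack([np.ones(len(factors)), factors])
    coef, *_ = np.linalg.lstsq(design, returns, rcond=None)
    resid = returns - design @ coef
    return coef[0], coef[1:].T, resid        # m (N,), betas (N x k), residuals (T x N)


def ar1_coefficients(resid):
    """Equation (26): rho_i from regressing each residual on its own lag, no intercept."""
    lag, lead = resid[:-1], resid[1:]
    return (lag * lead).sum(axis=0) / (lag * lag).sum(axis=0)


def market_neutral_weights(betas):
    """Equation (28): Q in the null space of the (K x n) loading matrix, n = K + 1."""
    _, _, vt = np.linalg.svd(np.asarray(betas, dtype=float).T)
    q = vt[-1]                               # last right-singular vector spans the null space
    return q / np.abs(q).sum()               # assumption: scale to unit gross exposure


def expected_portfolio_return(q, m, rho, last_resid):
    """Equation (31): sum_i Q_i * (m_i + rho_i * R~_{iT})."""
    return float(np.sum(q * (m + rho * last_resid)))


def entry_decision(signal_history, threshold=0.7):
    """Entry rules 1 and 2; the last element of signal_history is S_T."""
    s = np.asarray(signal_history, dtype=float)
    deviation = s[-1] - s.mean()
    if deviation >= threshold * (s.max() - s.mean()):
        return "long"
    if deviation <= threshold * (s.min() - s.mean()):
        return "short"
    return "none"
```

In a rolling backtest one would call these functions once per 60-day window, pick the `K + 1` stocks with the smallest `ar1_coefficients`, and append each window's expected return to the signal history before calling `entry_decision`.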
Finally, as the historical window moves forward, we compare these expected portfolio returns with the realized portfolio returns and compute their correlation.

4 Comparison and Analysis

Table 2 shows the comparison results of these four methods for forecasting stocks' log-returns in the S&P 500 universe. Some parameters need clarification. In the time series models (AR and VAR), we use a historical window of 1000 days, while in the statistical arbitrage models, following Avellaneda and Lee (2010), we use the past 60 days' records as the information set for trading. 'Common factors' refers to the number of other time series used in forecasting. In the VAR and Market Neutral models, it is the number of PCs used. Because the last model uses a signal to decide whether to enter a trade, it makes fewer forecasts than the others.

Table 2: Comparison between four types of forecasting methods

Model            Length of historical window   Common factors used   Forecasting times   Correlation with real returns
AR               1000                          NA                    1000                3.17%
Pair Trading     60                            2                     1000                13.89%
VAR              1000                          5                     1000                4.52%
Market Neutral   60                            15                    503                 17.2%

From the table we can see that both the AR and VAR models exhibit little forecastability. In the AR model, we investigate only each stock's own log-returns. Individual log-price processes are close to random walks, in the sense that log-returns, i.e., the differences of log-prices, are almost white noise. Even though they are stationary processes, it is still hard to forecast their subsequent behavior. In the VAR model, we want to capture more of the market information that affects stocks' behavior, so we look at how the common market factors (PCs) evolve. By substituting the forecast values of the PCs into the original regressions, we obtain the predicted values for each stock. However, the improvement in forecasting each individual stock's log-returns is trivial.
Therefore, it would not noticeably increase the opportunity to earn money. Moreover, when we extend the forecasting horizon to five days, the accuracy of both methods decreases (see Table 3). Overall, time series forecasting provides some predictive value over short periods of time, but the accuracy diminishes sharply as the length of prediction increases.

Table 3: Time series methods to forecast different periods

Model   Length of historical window   Days to forecast   Correlation with real returns
AR      1000                          1                  3.17%
VAR     1000                          1                  4.52%
AR      1000                          5                  1.38%
VAR     1000                          5                  1.90%

Nonetheless, Table 2 shows that the second and last models improve forecastability considerably. The methodology underlying both models is mean reversion, a mathematical concept sometimes used in stock investing. This concept suggests that prices and returns eventually move back towards their mean. Revisiting equation 9 in the Pair Trading model, we see that the pair of stocks' log-prices are cointegrated and the regression residuals are expected to fluctuate around their average. By mean reversion, we expect $dX_t$ to be negatively correlated with $X_t$. This is not only a property inferred from the data, but is also supported by a theoretical model, the Ornstein-Uhlenbeck (O-U) process. In mathematics, the O-U process (see Gardiner (1985)) is a stochastic process that describes the velocity of a massive Brownian particle under the influence of friction. The process is stationary, Gaussian, and Markovian. Over time, it tends to drift towards its long-term mean; such a process is called mean-reverting. Moreover, another important and widely used assumption in finance is that stock prices follow geometric Brownian motion. Thus, for the $X_t$ in equation 9, we can apply the O-U process and get:

$$dX_t = \kappa (m - X_t)\,dt + \sigma\,dW_t, \qquad \kappa > 0 \tag{32}$$

where $m$ is the mean of $X_t$, $dW_t$ is the increment of Brownian motion ($W_t \sim N(0, t)$), $\sigma$ measures the volatility of the movement, and the parameter $\kappa$ is called the speed of mean reversion. This process is stationary and autoregressive with lag 1. In particular, the increment $dX_t$ has unconditional mean zero and conditional mean

$$E\{dX_t \mid X_s, s \le t\} = \kappa (m - X_t)\,dt$$

When $X_t > m$, we expect $dX_t$ to be negative, and $X_t < m$ implies a positive expected $dX_t$. Rearranging equation 32, we get:

$$dX_t = \kappa m\,dt - \kappa\,dt \cdot X_t + \sigma\,dW_t \tag{33}$$

Comparing it with equation 12 in the Pair Trading model, we find that they have the same form, with

$$\beta_0 = \kappa m\,dt, \qquad \beta_1 - 1 = -\kappa\,dt, \qquad \varepsilon_t = \sigma\,dW_t \tag{34}$$

This in turn supports the AR(1) model that we used for the process $X_t$, and finding the most negative coefficient $(\beta_1 - 1)$ is equivalent to finding the process with the highest speed of mean reversion.
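The correspondence in equation 34 can be inverted to read O-U parameters off a fitted AR(1) model. The sketch below is illustrative only: the function name `ou_from_ar1` and the default `dt = 1` (one trading day per step) are assumptions, and the mapping is the first-order one implied by equation 34, not the exact discretization of the O-U process.

```python
import math


def ou_from_ar1(beta0, beta1, resid_std, dt=1.0):
    """Invert equation (34): recover (kappa, m, sigma) from AR(1) estimates.

    Hypothetical helper. beta0 and beta1 are the AR(1) coefficients of
    equation (11); resid_std is the standard deviation of the AR(1)
    residuals; dt is the time step, assumed to be one period by default.
    """
    if not beta1 < 1.0:
        raise ValueError("mean reversion requires beta1 - 1 < 0")
    kappa = (1.0 - beta1) / dt           # from beta1 - 1 = -kappa * dt
    m = beta0 / (1.0 - beta1)            # from beta0 = kappa * m * dt
    sigma = resid_std / math.sqrt(dt)    # from eps_t = sigma * dW_t, Var(dW_t) = dt
    return kappa, m, sigma
```

For example, estimates `beta0 = 0.02` and `beta1 = 0.9` correspond to a speed of mean reversion of about 0.1 per period around a long-run mean of about 0.2.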
For the last model, i.e., the Market Neutral model, we use another method to identify mean-reverting processes. Instead of studying cointegrated log-prices, we directly regress log-returns, which are already stationary, on the common market factors (PCs). The residuals from a regression that includes a constant term have mean zero. But there is no rigorous model to support the claim that the residual series are mean-reverting around zero.
The relationship in equation 26, i.e., $\tilde{R}_{it} = \rho_i \tilde{R}_{i,t-1} + \varepsilon_{it}$ with $\rho_i < 0$, is basically an assumption. However, we searched over all the stocks and found those most likely to obey this relationship (see Figure 5). Therefore, for the chosen stocks, it is reasonable to assume that the residuals $\tilde{R}_{it}$ from the regression on the common market factors oscillate near zero, and we can effectively apply the mean-reversion method. Nevertheless, note that not all stationary processes are mean-reverting or usable for mean reversion. Moreover, if a random walk I(1) process has mean zero, the probability that it crosses zero is one, but the mean time to crossing zero is infinite. Thus, we cannot directly apply mean reversion to a random walk process either.

The reason for the relatively good performance of the second and last models is that, instead of focusing on forecasting the variables themselves, we pay attention to the residuals. Through either existing theory or econometric analysis, we extract more information on the properties of the residuals, which exhibit more forecastability. As the saying in finance goes: “Profit comes from residuals”. The other lesson from our research is that there is no 'true' model explaining stock prices or returns in finance. All existing theories are partially right, and every model is valid only when its assumptions are reasonable. For example, the fundamental assumption of the O-U process or the famous Black-Scholes model is that the underlying stock price $S_t$ follows geometric Brownian motion

$$S_{t+1} - S_t = (r - q) S_t\,dt + \sigma S_t\,dW_t \;\Longrightarrow\; \frac{S_{t+1}}{S_t} = 1 + (r - q)\,dt + \sigma \sqrt{dt} \cdot Z, \qquad Z \sim N(0, 1) \tag{35}$$

which suggests that the log-price is a self-contained autoregressive process that is not affected by other variables. However, we have already found that this is often not appropriate. There are too many variables and factors that can influence the stock markets.
Even if one model works well for a time, once many people begin to use it, their trading and investment behavior in turn affects the market and may offset the usefulness of that model. Hence, unlike some problems in economics, financial markets are almost full of noise (Black (1986)) and hard to model. The job is to find a little useful information in this enormous environment, catch the opportunity and make money.

5 Conclusion

Practical experiments and backtesting results illustrate that the traditional time series methods do not work well. The AR and VAR models, which belong to univariate and multivariate time series analysis respectively, achieve correlations of less than 5 percent with realized returns. When the forecasting period increases, the accuracy decreases significantly. This suggests that it is hard to derive a true recurrence relation that can be used to predict new values. However, the Pair Trading and Market Neutral models, which are based on the statistical arbitrage principle, improve the forecastability to more than 10 percent. The idea is to form a pair or a portfolio whose returns depend only on the values of residuals; by further exploiting the mean-reversion property of these residuals, we gain more forecastability.

References

(2012). Standard & Poor's 500 index - S&P 500. Investopedia.

(2013). S&P Indice Methodology. Standard And Poor's.

Avellaneda, M. and J.-H. Lee (2010). Statistical arbitrage in the US equities market. Quantitative Finance 10(7), 761–782.

Black, F. (1986). Noise. The Journal of Finance 41(3), 529–543.

Box, G. E., G. M. Jenkins, and G. C. Reinsel (2013). Time series analysis: forecasting and control. John Wiley & Sons.

Gardiner, C. (1985). Stochastic methods. Springer-Verlag, Berlin–Heidelberg–New York–Tokyo.

Hamilton, J. D. (1994). Time series analysis, Volume 2. Princeton University Press, Princeton.

Hannan, E. J. and B. G. Quinn (1979). The determination of the order of an autoregression. Journal of the Royal Statistical Society. Series B (Methodological), 190–195.

Hassan, M. R. and B. Nath (2005). Stock market forecasting using hidden Markov model: a new approach. In Intelligent Systems Design and Applications, 2005. ISDA'05. Proceedings. 5th International Conference on, pp. 192–196. IEEE.

Hassan, M. R., B. Nath, and M. Kirley (2007). A fusion model of HMM, ANN and GA for stock market forecasting. Expert Systems with Applications 33(1), 171–180.

Hirsa, A. (2012). Computational methods in finance. CRC Press.

Jolliffe, I. (2005). Principal component analysis. Wiley Online Library.

Lawrence, R. (1997). Using neural networks to forecast stock market prices. University of Manitoba.

Lo, A. W. and A. C. MacKinlay (1990). When are contrarian profits due to stock market overreaction? Review of Financial Studies 3(2), 175–205.

Miller, M. H., J. Muthuswamy, and R. E. Whaley (1994). Mean reversion of Standard & Poor's 500 index basis changes: Arbitrage-induced or statistical illusion? The Journal of Finance 49(2), 479–513.

Pole, A. (2008). Statistical arbitrage: algorithmic trading insights and techniques, Volume 411. John Wiley & Sons.

Stock, J. H. and M. W. Watson (1988a). Testing for common trends. Journal of the American Statistical Association 83(404), 1097–1107.

Stock, J. H. and M. W. Watson (1988b). Variable trends in economic time series. The Journal of Economic Perspectives, 147–174.