FE8513 Financial Time Series Analysis
Shanghai Stock Exchange Returns Model
Master of Science (Financial Engineering)
Academic Year 2014/15 Mini-Term 3
RAKTIM RAY
G1401584K
Ray | Shanghai Stock Exchange Returns Model
Table Of Contents
Introduction
Model Data and Elementary Analysis
Mean Model Construction and Choice
Residual Diagnostics: Modelling Volatility
Forecasting and Final Model Selection
Formal Statement of Optimal Model
Limitations
Conclusion
Acknowledgement
Bibliography
Introduction
Our main objective is to build a time series model that forecasts prospective
returns and the volatility of those returns. The investment opportunity is in
the Shanghai Pilot Free Trade Zone (SPFTZ). Since this is a newly launched initiative for
foreign companies to invest in, we use the Shanghai Stock Exchange returns as a proxy for
modelling the SPFTZ returns and volatility, and so provide return and risk insights for
SPFTZ index investing based on quantitative modelling.
This report aims to help the Senior Management of the firm reach a
quantitatively driven decision on whether or not to go ahead with this new venture,
based on historical proxy returns. Index values are themselves typically non-stationary,
and non-stationary processes are difficult to model. We therefore base our
model building on log returns, which are stationary. This removes the dependence on
the absolute index level and focuses on relative returns, which are the end goal of
any investment. The data tables used for general analyses and model building and
referred to in the report are self-contained. An analysis support folder is also
provided with all the tests and output from the software for the technical reader.
Model Data and Elementary Analysis
The index values of the Shanghai Stock Exchange from 03/Jan/2000 to
27/Nov/2014 are used, which is deemed a sufficient time period for our model building
purpose. We do not account for dividends paid on the stocks and assume that the price or
index level has already been adjusted for them. The purpose of the elementary
analysis is to understand the nature of the data and to check whether our
assumptions of stationarity and normality hold. Other characteristics of the data are
also tested here, namely heteroskedasticity and autocorrelation of the error terms in
our model. Checking the distribution of the price data (adjusted close price, henceforth
price), the histogram shows that the data is not normally distributed, and formal tests
confirm this. Non-normal variables are harder to work with, but the problem is less
severe with a large number of observations, because the central limit theorem makes
inference on the estimates approximately valid in large samples. A bigger problem with
price is non-stationarity, which our statistical tests confirm. A time series
relationship cannot be modelled reliably on non-stationary data. The variable is
therefore transformed to the log return. This also keeps the model parsimonious, since
the true focus is the return on investment and not the price level or the absolute value
of the investment. The log return is also not normally distributed, like most finance
data, but as mentioned above the large sample means that the estimates remain consistent
and inference remains approximately valid. The unit root and stationarity tests (ADF and
KPSS for the technical reader) conducted on the log return data show that it is
indeed stationary over time and does not explode. This variable may therefore be used to
construct a time series model. Stationarity also removes the possibility of spurious
correlation should this variable be used alongside others in some other forecasting model.
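As a minimal sketch of the transformation described above, log returns can be computed from a
list of index levels as follows. The function name `log_returns`, the sample levels and the
scaling to percent are illustrative assumptions, not details taken from the report.

```python
import math

def log_returns(prices, scale=100.0):
    """Return scaled log returns r_t = scale * ln(P_t / P_{t-1}).

    scale=100.0 (percent returns) is an assumption made for illustration.
    """
    if any(p <= 0 for p in prices):
        raise ValueError("index levels must be strictly positive")
    return [scale * math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

# Hypothetical index levels, not actual Shanghai Stock Exchange data.
levels = [2000.0, 2020.0, 1999.8, 2010.0]
returns = log_returns(levels)
print([round(r, 4) for r in returns])
```

A convenient property of log returns is that they add up over time: the sum of the daily values
equals the log return over the whole period.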
Now that we have a stationary series, we proceed to form linear models and to test for
the other desirable properties that make a model robust and a good predictor.
Mean Model Construction and Choice
The next step in the model building process is to determine the order of the
linear model. Since this is a time series estimation, we follow the Auto Regressive/Moving
Average (ARMA) framework for the linear model. To determine the
number of lag terms to include, the correlogram was initially examined for any
strong pattern that would identify the model. This is the Box-Jenkins approach. Technical readers
may be interested in this methodology, but for our purposes the description is restricted to its
use, which is limited at this point. As in most instances, it was difficult to pick the correct
lag terms for the AR and the MA parts by looking at the Auto Correlation (AC) and Partial
Auto Correlation (PAC) terms and their decay. Additionally, this is an informal method, and a
quantitative model demands a quantitative choice of the optimal model.
Focus therefore shifted to forming a cohort of linear models based on the ARMA
structure and then choosing the optimal model based on an information criterion,
namely the Akaike Information Criterion (AIC) and the Schwarz Bayesian
Information Criterion (SBIC). These two information criteria rank models in the same
way. One term is related to the residual sum of squares, which measures
the part of the dependent variable that is still unexplained by the model. The lower this
term, the better the fit. But parsimony is also an objective of the
model. Each information criterion therefore includes a penalty term that increases the value
of the AIC or SBIC when extra variables are added; SBIC uses a stricter penalty. Choosing the
model with the lowest information criterion value thus involves a trade-off between
adding more variables to lower the first term and using fewer variables to
lower the second. A more technical discussion regarding degrees of freedom and
residual sum of squares is beyond the scope of this paper and the technical reader may refer
to Brooks¹. To sum up, the models with the lowest AIC and SBIC values are selected as
candidates for the optimal model to explain the returns of the SSE.
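The trade-off described above can be illustrated with a short sketch. It assumes the textbook
forms AIC = ln(RSS/T) + 2k/T and SBIC = ln(RSS/T) + (k/T) ln T, where RSS is the residual sum
of squares, k the number of estimated parameters and T the sample size. The software used for
the report may compute the criteria from the log-likelihood instead, and the RSS values below
are made up for illustration.

```python
import math

def aic(rss, n_obs, n_params):
    """AIC = ln(RSS / T) + 2k / T (assumed textbook form)."""
    return math.log(rss / n_obs) + 2.0 * n_params / n_obs

def sbic(rss, n_obs, n_params):
    """SBIC = ln(RSS / T) + (k / T) * ln(T); the penalty is stricter once ln(T) > 2."""
    return math.log(rss / n_obs) + n_params * math.log(n_obs) / n_obs

# Hypothetical fits: (label, residual sum of squares, number of parameters).
T = 3210
candidates = [
    ("ARMA(1,1)", 8200.0, 3),
    ("ARMA(5,4)", 8120.0, 10),
]
for label, rss, k in candidates:
    print(label, round(aic(rss, T, k), 4), round(sbic(rss, T, k), 4))

# The larger model fits slightly better, but only AIC judges the gain worth 7 extra terms.
best_by_aic = min(candidates, key=lambda c: aic(c[1], T, c[2]))[0]
best_by_sbic = min(candidates, key=lambda c: sbic(c[1], T, c[2]))[0]
print("AIC picks", best_by_aic, "| SBIC picks", best_by_sbic)
```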
For the sake of parsimony, the AR and the MA orders are restricted to 5,
which implies that the model should not have more than 5 lagged values of the returns or the
error terms. Beyond 5 lags the model becomes very difficult to interpret. This restriction is
a minor cost compared to the major benefit of simplicity and interpretability. What we would
gain by adding a 6th term and so on is small compared to the loss in ease of interpretation and
implementation of the model.
¹ Chris Brooks, Introductory Econometrics for Finance, 2nd Edition (2008)
The data is split at this point into an in-sample and an out-of-sample portion. The most
important use of a time series model is its predictive power; if predictive power is low, the
model is of little use. The most recent data is therefore reserved as the test or out-of-sample
data, which is used to determine which model has the best predictive power and may therefore
be used as the optimal model. Of the 3776 observations, 3210 (about 85%) are kept in
sample, ending on 07/09/2012 (month/day/year), and the remaining 566 are held out as the
out-of-sample or holdout sample to be used for forecasting purposes.
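The split can be sketched as below. The point of the example is that a time series is cut
chronologically, never shuffled, so that the holdout sample is strictly later than the
estimation sample. The function name and the placeholder data are assumptions.

```python
def chronological_split(series, n_in_sample):
    """Split an ordered series into (in_sample, out_of_sample) without shuffling."""
    if not 0 < n_in_sample < len(series):
        raise ValueError("n_in_sample must be between 1 and len(series) - 1")
    return series[:n_in_sample], series[n_in_sample:]

# Placeholder values standing in for the 3776 daily log returns.
observations = list(range(3776))
in_sample, out_of_sample = chronological_split(observations, 3210)
share = len(in_sample) / len(observations)
print(len(in_sample), len(out_of_sample), round(share, 3))
```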
The AIC and SBIC table is provided for reference, with the choices
marked by circles. One model is selected based on the AIC. This is the ARMA (5,
4) model. The SBIC table returns the lowest IC value for ARMA (0, 0). This model is
dropped because it implies that the forecast return is the same for every
day and equal to the intercept of the model. This would lead to poor predictive power and
large errors in the estimates of future returns, and the volatility would simply equal the
variance of the error term, which is as good as not modelling at all. We therefore
proceed to the next lowest IC, which is given by the ARMA (1, 0) model. But the
regression reveals that its coefficients are insignificant, so this model is dropped as well.
We move on to choose ARMA (1, 1), as it is low on the SBIC list, gives significant
coefficient estimates and is parsimonious.
We make a third choice to broaden our search for the best model. The next best
model with significant coefficients is ARMA (4, 5) according to the AIC and
ARMA (2, 3) according to the SBIC; the latter is selected since it is more parsimonious.
To sum up, the three choices are ARMA (5, 4), ARMA (1, 1) and ARMA (2, 3).
[Table, AIC and SBIC panels: Information Criterion for ARMA models of the
log returns of the Shanghai Stock Exchange index]
Residual Diagnostics: Modelling Volatility
After formulating the mean model, our focus shifts to the residuals and their
diagnostics. Our assumption about the errors was that they are normally distributed and
uncorrelated, with a constant (zero) mean and constant variance. In statistical parlance,
this is Gaussian white noise. If these assumptions hold, the model behaves as expected,
with accurate interval estimates of the parameters. If they do not, we may obtain
misleading results and unreliable estimates, and the errors can no longer be treated as a
stationary white noise process. Since the errors themselves are unobservable, the residuals
are used as a proxy for the errors when testing the assumptions.
The ARCH effect, or the presence of heteroskedasticity, is tested. Our assumption for the
errors was that of constant volatility, but this rarely holds true for financial data. The
variance of the error terms changes over time (is heteroskedastic) and needs to be
modelled along with the mean, which was modelled above. At this point we leave
the realm of linear models and step into non-linear modelling.
The three chosen models were tested for the ARCH effect. It should be
noted that the purpose of the test is to detect the presence of heteroskedasticity and not to
model the variance. GARCH will be used to model the variance, since with ARCH it is difficult
to determine the correct number of lags. As long as an ARCH effect remains at a given lag
order, more lags have to be added, and this process may go on for many lags, which makes
ARCH ineffective for modelling purposes.
The analysis was conducted starting with a lag of two for the ARCH
heteroskedasticity test. All three models show an ARCH effect. Both the F test and the
Chi-Squared test were significant, which shows that the volatility of the residuals is indeed
not constant and needs to be modelled along with the mean. Results are documented in the
analysis folder for the technical reader.
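For readers who want to see the mechanics, the Chi-Squared version of the ARCH test can be
sketched as follows: regress the squared residuals on a constant and their own q lags, then
compare T·R² with a chi-squared critical value with q degrees of freedom. This is a
plain-Python illustration under that standard definition, not the software routine used for
the report; the helper names and the artificial residual series are assumptions.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def arch_lm_statistic(residuals, lags=2):
    """Return T * R^2 from regressing u_t^2 on a constant and `lags` lags of u^2."""
    sq = [u * u for u in residuals]
    y = sq[lags:]
    x = [[1.0] + [sq[t - j] for j in range(1, lags + 1)] for t in range(lags, len(sq))]
    k = lags + 1
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in x]
    mean_y = sum(y) / len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return len(y) * (1.0 - rss / tss)

# Artificial residuals with calm and turbulent blocks (volatility clustering).
residuals = []
for block in range(4):
    size = 0.1 if block % 2 == 0 else 3.0
    residuals += [size if t % 2 == 0 else -size for t in range(50)]

stat = arch_lm_statistic(residuals, lags=2)
CHI2_CRIT_5PCT_2DF = 5.991  # 5% critical value of chi-squared with 2 degrees of freedom
print("T*R^2 =", round(stat, 2), "| reject constant variance:", stat > CHI2_CRIT_5PCT_2DF)
```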
With the three mean models, the process of estimating the GARCH model for the
volatility was conducted. GARCH stands for Generalised Autoregressive Conditional
Heteroskedasticity, which means that the volatility of the log returns is modelled with
previous volatility terms and the squares of the error terms estimated from the mean equation.
Therefore, even if the assumption of constant variance of the error terms does not hold for the
mean equation, the variance is continually updated through this model. The analysis was
made thorough by also modelling the volatility with EGARCH, GJR or TGARCH, and GARCH in
Mean models. EGARCH accounts for leverage effects, which means that volatility
should rise more after a large price fall than after an equivalent price rise. TGARCH likewise
accounts for the fact that bad news affects volatility more than good news, and GARCH in Mean
puts the volatility term in the mean equation to account for the fact that the return is
directly influenced by the past volatility level. For a more detailed study of these models,
please refer to Brooks². The ARCH models were also tested at this point, but it should be noted
that the volatility of most finance data agrees with the GARCH model. Not all the models and
their tests are recorded; only the optimal volatility models for the three mean models are
recorded in the analysis folder. At each stage of the GARCH/ARCH modelling, diagnostic checks
are conducted, mainly to see whether the heteroskedasticity effect is still significant. In
almost all cases, the ARCH effect becomes insignificant once volatility is modelled.
Correlogram checks are also done to ensure that the residuals are
not autocorrelated. This applies to both the residuals and the squared residuals, since the
volatility model has a squared residual term. The ideal model should exhibit no remaining
heteroskedasticity and no autocorrelation in either the residuals or the squared residuals.
With these criteria in mind, the GARCH models were tested.
The focus now is to find the volatility model that leaves no heteroskedasticity and whose
residuals are not autocorrelated. Although AIC and SBIC values were considered, priority
was given to the residual criteria here. In order to maintain some level of parsimony, the
GARCH orders were limited to (5, 5), or five lags. The final decisions were made on the basis
of low AIC and SBIC values, fulfilment of the essential conditions mentioned above, and
parsimony, favouring a simple model. Some subjectivity is unavoidable in model
selection: the aim is not only mathematical optimisation but also a
model that is easy to understand and implement. Due to the large variety of options, the
comparisons of AIC and SBIC are not provided here, to keep the choices
straightforward. All volatility models were checked for violations of the standard coefficient
constraints during the selection process, so that the volatility process is stationary.
² Chris Brooks, Introductory Econometrics for Finance, 2nd Edition (2008)
For the ARMA (5, 4) model, it was determined that the volatility model most apt to fit
was the TGARCH (1, 1) with Threshold Order=1. This conclusion was arrived at after going
through the various combinations as mentioned above and conducting residual checks to see
which model provides the best results.
For the ARMA (1, 1) model, it was determined that the TGARCH (1, 1) in Mean
model with Threshold Order = 1 produced the optimal fit. It must be mentioned
that, since this is a market index model, the intuition is for a GARCH term to be present in the
mean model as well. Although the GARCH term in the mean equation is insignificant, the
model produced much reduced autocorrelation in the standardized residuals. Therefore
the above volatility model was selected for the ARMA (1, 1) model.
For the ARMA (2, 3) model the volatility prediction model is selected as the
TGARCH (1, 2) with threshold order=2. This model made most of the coefficients significant
as well as reduced the auto correlation between the error terms to a decent level so that the
assumptions behind the model may be matched with the data.
A summary of the Information criteria for the models selected is given below.
Information   ARMA(5,4)           ARMA(1,1)                  ARMA(2,3)
Criterion     TGARCH(1,1)(t=1)    TGARCH(1,1)-in M(t=1)      TGARCH(1,2)(t=2)
AIC           3.5600              3.5633                     3.5600
SBIC          3.5866              3.5784                     3.5827
HQIC          3.5696              3.5687                     3.5682
As can be seen from the table, the ICs are very close to one another and a choice for
the optimal model would involve some other criteria than the ICs and the correlogram.
Forecasting and Final Model Selection
The purpose of a good model is to forecast future values with the
greatest degree of accuracy. Some academics and statisticians are of the opinion that as
long as the forecasts are good, the model is good. The assumptions are secondary to the real
purpose of the model, which is to forecast the future value with a given level of
accuracy.
Therefore the criterion for the final model selection is the degree of
forecasting accuracy of the three models. Both dynamic and static forecasts are
used to assess forecasting strength, since both long run and short run predictions
are of importance in time series modelling. Forecasting both ways shows
whether the model is a good predictor in the short or the long run. The
forecasting errors are summarised below, with the forecasts of the chosen
model also represented in graphical form.
Static Forecast
Forecast      ARMA(5,4)           ARMA(1,1)                  ARMA(2,3)
Statistics    TGARCH(1,1)(t=1)    TGARCH(1,1)-in M(t=1)      TGARCH(1,2)(t=2)
RMSE          1.0212              1.0231                     1.0294
MAPE          107.4325            105.3958                   117.7223
THEIL U       0.9469              0.9724                     0.9157
Dynamic Forecast
Forecast      ARMA(5,4)           ARMA(1,1)                  ARMA(2,3)
Statistics    TGARCH(1,1)(t=1)    TGARCH(1,1)-in M(t=1)      TGARCH(1,2)(t=2)
RMSE          1.0238              1.0229                     1.0235
MAPE          103.6614            102.6099                   99.9349
THEIL U       0.9839              0.9780                     0.9954
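The three statistics in the tables above can be computed from paired actual and forecast
values as in the following sketch. It assumes the common definitions: RMSE as the root of the
mean squared error, MAPE as 100 times the mean of |(actual - forecast) / actual|, and the Theil
inequality coefficient as RMSE divided by the sum of the root mean squares of the forecast and
actual series, which lies between 0 and 1. The function names and sample numbers are
illustrative assumptions.

```python
import math

def rmse(actual, forecast):
    """Root mean squared forecast error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mape(actual, forecast):
    """Mean absolute percentage error; undefined if any actual value is zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def theil_u(actual, forecast):
    """Theil inequality coefficient: 0 is a perfect fit, 1 is the worst case."""
    rms_f = math.sqrt(sum(f * f for f in forecast) / len(forecast))
    rms_a = math.sqrt(sum(a * a for a in actual) / len(actual))
    return rmse(actual, forecast) / (rms_f + rms_a)

# Hypothetical daily returns (percent) and forecasts that stay close to zero.
actual = [0.8, -1.2, 0.3, -0.5, 1.1]
forecast = [0.05, -0.02, 0.01, 0.03, -0.01]
print(round(rmse(actual, forecast), 4),
      round(mape(actual, forecast), 2),
      round(theil_u(actual, forecast), 4))
```

Under these definitions, a forecast that is always near zero gives a MAPE close to 100 and a
Theil coefficient close to 1, which is the range seen in the tables.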
Static and Dynamic Forecast of ARMA(1,1) TGARCH(1,1)-in M(t=1)
[Figure: two forecast panels covering 2012 Q3 to 2014 Q4. Each panel plots the return forecast
DPRICEF with a ± 2 S.E. band, alongside a Forecast of Variance plot. The statistics box of the
second panel is truncated in the source.]
Statistics shown with the first panel:
Forecast: DPRICEF
Actual: DPRICE
Forecast sample: 7/10/2012 11/27/2014
Included observations: 566
Root Mean Squared Error 1.022977
Mean Absolute Error 0.746364
Mean Abs. Percent Error 102.6099
Theil Inequality Coefficient 0.978045
Bias Proportion 0.000154
Variance Proportion 0.987612
Covariance Proportion 0.012234
Since the objective is to predict returns in the long run, the dynamic forecast statistics
are used to select the optimal model. Static forecasts work well for one or two step ahead
predictions, because each forecast uses the actual observed values of the previous period.
Dynamic forecasts predict further into the future by feeding earlier forecasts back into the
model. Considering that index returns will be viewed as a long term investment as opposed to
a daily investment, the model is chosen based on its dynamic forecasting power.
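The difference between the two forecast types can be sketched for a plain ARMA(1,1) mean
equation. The GARCH-in-Mean term is left out for brevity, and the coefficients, starting
values and holdout returns are hypothetical, chosen only to show the mechanics.

```python
def static_forecasts(actuals, mu, a1, a2, y_prev, u_prev):
    """One-step-ahead forecasts: each step uses the actual previous return."""
    out = []
    for actual in actuals:
        f = mu + a1 * y_prev + a2 * u_prev
        out.append(f)
        u_prev = actual - f  # the realised forecast error becomes the next lagged shock
        y_prev = actual
    return out

def dynamic_forecasts(steps, mu, a1, a2, y_prev, u_prev):
    """Multi-step forecasts: earlier forecasts replace the unobserved returns."""
    out = []
    for _ in range(steps):
        f = mu + a1 * y_prev + a2 * u_prev
        out.append(f)
        y_prev = f    # feed the forecast back in
        u_prev = 0.0  # future shocks are replaced by their expected value of zero
    return out

# Hypothetical coefficients and starting values, for illustration only.
mu, a1, a2 = 0.02, 0.5, 0.2
holdout = [0.4, -0.3, 0.1, 0.6]
static = static_forecasts(holdout, mu, a1, a2, y_prev=0.2, u_prev=0.1)
dynamic = dynamic_forecasts(50, mu, a1, a2, y_prev=0.2, u_prev=0.1)
print([round(f, 4) for f in static])
print(round(dynamic[0], 4), round(dynamic[-1], 4))
```

The first forecast is identical in both cases. After that the static forecast keeps reacting to
new data, while the dynamic forecast settles towards the long-run mean mu / (1 - a1).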
The optimal model should have the minimum RMSE and THEIL U. Lower values for
these variables indicate lower deviation from the actual values.
The chosen model is therefore the ARMA (1, 1) TGARCH (1, 1) in Mean with
threshold order of one.
Formal Statement of Optimal Model
The general statement of the model is as follows:
$$y_t = \mu + a_1 y_{t-1} + a_2 u_{t-1} + b\,\sigma_{t-1} + u_t, \qquad u_t \sim N(0, \sigma_t^2)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma u_{t-1}^2 I_{t-1}$$
where $y_t$ is the value of the returns to be predicted
$\mu$ is the mean equation intercept
$a_1 y_{t-1}$ is the previous period return and its coefficient
$a_2 u_{t-1}$ is the previous period error term and its coefficient
$b\,\sigma_{t-1}$ is the previous period standard deviation and its coefficient
$u_t$ is the error term for the mean equation estimation
$\alpha_0$ is the volatility model intercept
$\alpha_1 u_{t-1}^2$ is the previous period error term squared and its coefficient
$\beta \sigma_{t-1}^2$ is the previous period variance and its coefficient
$\gamma u_{t-1}^2 I_{t-1}$ is the dummy variable term and its coefficient, where
$I_{t-1} = 1$ if $u_{t-1} < 0$
$I_{t-1} = 0$ otherwise
The TGARCH or GJR model places conditions on the coefficients for the
model to be valid, that is, for the conditional variance to remain non-negative:
$\alpha_0 \geq 0$ and $\alpha_1 + \gamma \geq 0$
The TGARCH model captures the leverage effect, which states that volatility
increases more with a price fall or bad news than it does with a price rise or good news. For
a leverage effect to exist in the model, the value of $\gamma$ should be > 0.
The model also has a GARCH in Mean term, which places a volatility term in the
mean equation. This is expected of financial data, since higher risk should be compensated
with higher return. The volatility term models risk, the dependent variable is the return,
and the expectation is that $b > 0$.
Stating the model with the coefficients:
$$y_t = -0.0931 - 0.7546\,y_{t-1} + 0.7702\,u_{t-1} + 0.0753\,\sigma_{t-1} + u_t$$ ³
$$\sigma_t^2 = 0.0304 + 0.0441\,u_{t-1}^2 + 0.9264\,\sigma_{t-1}^2 + 0.0341\,u_{t-1}^2 I_{t-1}$$
Forecast Band: Returns ± 3.208
³ The model is estimated with Bollerslev-Wooldridge robust standard errors and covariance,
to adjust the standard errors for the non-normality of the residuals.
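As a hedged illustration of how the stated equations produce a forecast, the sketch below
applies the reported coefficients to a single period. The lagged return, shock and variance
passed in are hypothetical inputs and the function name is an assumption; the report's own
forecasts come from the estimation software.

```python
import math

# Coefficients as reported in the fitted model.
MU, A1, A2, B = -0.0931, -0.7546, 0.7702, 0.0753
ALPHA0, ALPHA1, BETA, GAMMA = 0.0304, 0.0441, 0.9264, 0.0341

def one_step_forecast(y_prev, u_prev, var_prev):
    """Return (expected return, conditional variance) for the next period."""
    indicator = 1.0 if u_prev < 0 else 0.0  # I_{t-1}: equals 1 after a negative shock
    variance = (ALPHA0 + ALPHA1 * u_prev ** 2 + BETA * var_prev
                + GAMMA * u_prev ** 2 * indicator)
    # The mean equation uses the previous period standard deviation, as stated in the text.
    mean = MU + A1 * y_prev + A2 * u_prev + B * math.sqrt(var_prev)
    return mean, variance

# Hypothetical state: a shock of the same size, once negative and once positive.
mean_neg, var_neg = one_step_forecast(y_prev=-1.0, u_prev=-1.0, var_prev=2.0)
mean_pos, var_pos = one_step_forecast(y_prev=1.0, u_prev=1.0, var_prev=2.0)
print(round(var_neg, 4), round(var_pos, 4))
```

With these inputs the variance after the negative shock exceeds the variance after the positive
shock by exactly the threshold coefficient, which is the asymmetry the TGARCH term is meant to
capture.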
The model satisfies all the constraints. The mean equation has a negative intercept and
is negatively related to the previous period return, which indicates oscillation between
positive and negative values, as is common in financial return series. It is
positively linked with the error term of the previous period, which partly reverses the effect
of the prediction shortfall from the previous period. Both coefficients are below 1 in absolute
value, which means that the autoregressive part is stationary and the moving average part is
invertible. As mentioned above, the volatility term has a positive coefficient, which models
the fact that high volatility in the previous period must be compensated by a high return in
this period.
The volatility model has a positive intercept, which keeps the conditional variance
positive. It is a stationary volatility process. It is positively linked with the squared error
and variance terms of the previous period, which is to be expected since finance data exhibits
volatility clustering. The coefficient on the lagged variance is very high, indicating strong
persistence in volatility. The leverage coefficient is positive, indicating that this model
captures the fact that bad news affects volatility more than good news. The forecast values
and the error band (twice the standard error) are also indicated, in general agreement with
financial time series models. Technical
readers may browse the model specifics attached below.
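The statement that the fitted volatility process satisfies the constraints and is stationary
can be checked numerically. The sketch below assumes the usual GJR condition under
symmetrically distributed shocks, α1 + β + γ/2 < 1, and the corresponding long-run variance
α0 / (1 - α1 - β - γ/2). These two formulas are a standard textbook assumption rather than
something stated in the report.

```python
# Reported estimates of the volatility equation.
ALPHA0, ALPHA1, BETA, GAMMA = 0.0304, 0.0441, 0.9264, 0.0341

# Constraints listed in the report: non-negativity and a positive leverage coefficient.
non_negative = ALPHA0 >= 0 and ALPHA1 + GAMMA >= 0
leverage = GAMMA > 0

# Assumed stationarity condition: a negative shock occurs half of the time on average.
persistence = ALPHA1 + BETA + GAMMA / 2.0
stationary = persistence < 1.0
long_run_variance = ALPHA0 / (1.0 - persistence) if stationary else float("inf")

print("non-negativity:", non_negative, "| leverage:", leverage)
print("persistence:", round(persistence, 5), "| stationary:", stationary)
print("implied long-run variance:", round(long_run_variance, 3))
```

A persistence value just below 1 is in line with the description of a stationary but slowly
decaying volatility process.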
Limitations
Every optimisation exercise has drawbacks. Since there are constraints to be satisfied,
especially the model assumptions and parsimony, some criteria for a good model may not be met
or may be met only at a sub-optimal level. Listed below are some limitations faced in building
the model.
• Choosing the perfect model involves a large number of permutations and combinations. Testing
all of these and arriving at a metric that makes one model better than the rest is not an
easy or efficient task. Although the model chosen is optimal with respect to some
criteria, it is sub-optimal with respect to others.
• The choice of the optimal model finally comes down to subjectivity and is not
completely quantitative. Although the framework is purely mathematical and
statistical, the final choice depends a lot on judgement and experience in model
selection, since the number of choices is very large.
• The forecasting period is short due to the limited availability of data. Ideally, a longer
forecasting window gives more room for forecasting and for testing the validity of the
model in the long run. With the number of data points constrained, the
forecasting power of the model cannot be completely determined.
• The assumption that residual autocorrelation is absent is not completely satisfied,
and neither is normality of the residuals. These are difficult criteria to fulfil and
they are very rarely satisfied. Although autocorrelation has been removed for the first ten
lags, autocorrelation exists beyond that. The consequence is a loss of efficiency of the
coefficient estimates.
• Due to the adjustments made to fit the model assumptions on all sides, the explained sum of
squares may be quite low, which is a limitation of time series analysis.
Conclusion
Although there are several drawbacks, the model for estimating future
return levels of the Shanghai Stock Exchange index is a robust one and may be used
to predict a range of possible outcomes for the index in the future, on which
business and investment decisions can be based. The model is driven by business logic
and is designed quantitatively. It updates the volatility level at every time period to reflect
new changes in risk parameters and environment, which in turn provides better predictions for
the returns. The risk also directly influences the returns, which follows business logic.
Therefore, this model provides a good framework to estimate the returns on investments and
ventures in the Shanghai Pilot Free Trade Zone (SPFTZ), with the margin for error also
clearly defined. Overall risk exposures may therefore also be assessed when relying on
the model predictions to take investment decisions at the firm level.
Acknowledgement
I would like to thank Dr. Wu Yuan for giving me the opportunity to work on the
data he provided to design the time series model. My thanks extend to Nanyang Technological
University for providing me with the facilities needed to complete the assignment. In
particular, the Business Library and Financial Trading Room resources, including the modelling
software managed by the Nanyang Business School, were of utmost importance.
Bibliography
1. Brooks, Chris: Introductory Econometrics for Finance, 2nd Edition: Cambridge
University Press 2008.
2. Kutner Michael H., Nachtsheim Christopher J., Neter John, Li William: Applied
Linear Statistical Models: McGraw-Hill/Irwin 2005.
3. Shanghai Stock Exchange Index level Data: NTULearn Web Portal.
 
difference between dynamic programming and divide and conquer
difference between dynamic programming and divide and conquerdifference between dynamic programming and divide and conquer
difference between dynamic programming and divide and conquer
 
Regression Analysis of NBA Points Final
Regression Analysis of NBA Points  FinalRegression Analysis of NBA Points  Final
Regression Analysis of NBA Points Final
 
All that Glitters Is Not Gold_Comparing Backtest and Out-of-Sample Performanc...
All that Glitters Is Not Gold_Comparing Backtest and Out-of-Sample Performanc...All that Glitters Is Not Gold_Comparing Backtest and Out-of-Sample Performanc...
All that Glitters Is Not Gold_Comparing Backtest and Out-of-Sample Performanc...
 

Shanghai Stock Exchange Returns Model Selection

FE8513 Financial Time Series Analysis
Shanghai Stock Exchange Returns Model
Master of Science (Financial Engineering)
Academic Year 2014/15 Mini-Term 3
RAKTIM RAY
G1401584K

Table Of Contents
Introduction
Model Data and Elementary Analysis
Mean Model Construction and Choice
Residual Diagnostics: Modelling Volatility
Forecasting and Final Model Selection
Formal Statement of Optimal Model
Limitations
Conclusion
Acknowledgement
Bibliography
Introduction
Our main objective is to build a time series model that forecasts prospective returns and the volatility of those returns. The investment opportunity is in the Shanghai Pilot Free Trade Zone (SPFTZ). Since this is a newly launched initiative for foreign companies to invest in, we use Shanghai Stock Exchange returns as a proxy for modelling SPFTZ returns and volatility. Our objective is to provide return and risk insights for SPFTZ index investing based on quantitative modelling.

This report aims to help the Senior Management of the firm reach a quantitatively driven decision on whether or not to go ahead with this new venture, based on historical proxy returns. Index values are themselves non-stationary, and non-stationary processes are difficult to model. We therefore base our model building on log returns, which are stationary. This removes the subjective biases of absolute value and non-stationarity and gives an objective view of relative returns, which are the end goal of all investments. The data tables used for general analysis and model building and referred to in the report are self-contained. An analysis support folder with all the tests and software output is also provided for the technical reader.

Model Data and Elementary Analysis
The index values of the Shanghai Stock Exchange from 03/Jan/2000 to 27/Nov/2014 are used, which is deemed a sufficient period for our model-building purpose. We do not account for dividends paid on the stocks and assume that the index level has already been adjusted for them. The purpose of the elementary analysis is to understand the nature of the data and to check whether our assumptions of stationarity and normality hold. Other characteristics of the data are also tested here, namely heteroskedasticity and autocorrelation of the error terms in our model.

The histogram of the price data (adjusted close price, henceforth "price") shows that it is not normally distributed, and formal tests confirm this. Variables that are not normally distributed are difficult to work with, but the problem diminishes with a large number of observations, because by the central limit theorem the test statistics are approximately normal in large samples. A bigger problem with price is non-stationarity, and statistical tests verify that the price series is indeed non-stationary. We cannot work with non-stationary data when modelling a time series relationship. The variable is therefore transformed to the log return, which also keeps the model parsimonious, since predicting the return on investment is the true focus rather than the price level or the absolute value of the investment. Like most finance data, the log returns are not normally distributed either, but as mentioned before the large sample means this does not invalidate the estimates: they remain consistent, and inference on them is asymptotically valid.

The unit root and stationarity tests (ADF and KPSS, for the technical reader) conducted on the log return data show that it is indeed stationary over time and does not explode. This variable may therefore be used to construct a time series model. Stationarity also guards against spurious correlation if the variable is later combined with others in another forecasting model. Now that we have a stationary series, we proceed to form linear models and test for the other desirable properties that make a model robust and a good predictor.

Mean Model Construction and Choice
The next step in the model-building process is to determine the order of the linear model.
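Before any order can be chosen, the series itself has to be built. The transformation from index levels to log returns described in the data analysis above can be sketched as follows. This is a minimal illustration, not the report's own code: the function name `log_returns`, the percentage scaling and the sample index levels are all assumptions made for the example.

```python
import math

def log_returns(prices, scale=100.0):
    """Return scaled log returns r_t = scale * ln(P_t / P_{t-1}).

    scale=100.0 (percentage returns) is an assumption for illustration;
    the report does not state which scaling it uses.
    """
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [scale * math.log(curr / prev)
            for prev, curr in zip(prices, prices[1:])]

# Hypothetical index levels, not taken from the report's data set.
levels = [2000.0, 2020.0, 1999.8, 2040.0]
returns = log_returns(levels)
```

One convenient property of log returns is that they add up across periods: the sum of the daily values equals the log return over the whole window.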
Since this is a time series estimation, we follow the autoregressive moving average (ARMA) framework for the linear model. To determine the number of lag terms to include, the correlogram was first examined for any strong pattern that would determine the model. This is the Box-Jenkins approach. Technical readers may be interested in the methodology, but here the description is restricted to its use, which is limited at this point. As in most instances, it was difficult to pick the correct lag terms for the AR and MA parts from the autocorrelation (AC) and partial autocorrelation (PAC) terms and their decay. Moreover, this is an informal method, and a quantitative model demands a quantitative choice of optimal model.

Focus therefore shifted to forming a cohort of linear models based on the ARMA structure and then choosing the optimal model by an information criterion, namely the Akaike Information Criterion (AIC) and the Schwarz Bayesian Information Criterion (SBIC). Both criteria rank models in the same way. One term is related to the residual sum of squares, which measures the part of the dependent variable still unexplained by the model; the lower this term, the better the model. But parsimony is another objective, so each criterion adds a penalty term that raises the AIC or SBIC value when extra variables are added. SBIC uses a stricter penalty term than AIC. Choosing the model with the lowest information criterion value therefore involves a trade-off between adding variables to lower the first term and removing variables to lower the second. A more technical discussion of degrees of freedom and the residual sum of squares is beyond the scope of this paper; the technical reader may refer to Brooks¹. To sum up, the models with the lowest AIC and SBIC values are to be selected as the optimal models to explain the returns of the SSE.
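The trade-off between fit and penalty can be made concrete with a small sketch. The report does not state the formulas it uses, so the per-observation forms below (a common textbook version based on the residual sum of squares) are an assumption, and statistical software may report likelihood-based variants instead. The function names and the two residual sums of squares are hypothetical.

```python
import math

def aic(rss, n_obs, n_params):
    """Akaike criterion in per-observation form: ln(RSS / T) + 2k / T."""
    return math.log(rss / n_obs) + 2.0 * n_params / n_obs

def sbic(rss, n_obs, n_params):
    """Schwarz criterion in per-observation form: ln(RSS / T) + k ln(T) / T."""
    return math.log(rss / n_obs) + n_params * math.log(n_obs) / n_obs

# Hypothetical fits: the larger model lowers the residual sum of squares a little.
T = 1000
small = {"rss": 1050.0, "k": 2}
large = {"rss": 1020.0, "k": 10}

aic_prefers_large = aic(large["rss"], T, large["k"]) < aic(small["rss"], T, small["k"])
sbic_prefers_small = sbic(small["rss"], T, small["k"]) < sbic(large["rss"], T, large["k"])
```

With these made-up numbers the two criteria disagree: the improvement in fit is enough to offset the AIC penalty of 2k/T but not the heavier SBIC penalty of k ln(T)/T, which is why SBIC tends to favour smaller models.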
For the sake of parsimony, the AR and MA orders are restricted to 5, which means the model should have no more than 5 lagged values of the returns or of the error terms. Beyond 5 the model becomes very difficult to interpret. The restriction is a minor cost compared with the major benefit of simplicity: what we would gain by adding a 6th term and so on is small compared with the loss in interpretability and ease of implementation.

¹ Chris Brooks, Introductory Econometrics for Finance, 2nd Edition (2008)

The data is split at this point into in-sample and out-of-sample sets. The most important use of a time series model is its predictive power; if predictive power is low, the model is useless. The most recent data is therefore reserved as the test (out-of-sample) data, which is used to determine which model has the best predictive power and may therefore be used as the optimal model. Of the 3776 observations, 3210 (about 85%) are kept in sample, ending at the date 07/09/2012, and the remainder form the out-of-sample or holdout set used for forecasting.

The AIC and SBIC table is provided for reference, with the choices marked by circles. One model is selected on the AIC: the ARMA (5, 4) model. The SBIC table returns the lowest value for ARMA (0, 0). This choice is dropped because it implies that the predicted return for every future day equals the intercept of the model, which would give poor predictive power and large errors in the estimates of future returns. Volatility would equal the variance of the error term, which is as good as not modelling at all. We therefore proceed to the next-lowest SBIC value, which is given by the ARMA (1, 0) model, but the regression reveals that its coefficients are insignificant, so this model is dropped as well. We move on to ARMA (1, 1), as it is low on the SBIC list, gives significant coefficient estimates and is parsimonious. We make a third choice to broaden our search for the best model. The next-best model with significant coefficients is ARMA (4, 5) according to the AIC and ARMA (2, 3) according to the SBIC; the latter is selected since it is more parsimonious.
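The candidate grid and the chronological split described above can be sketched as follows. The observation counts and the lag cap of 5 come from the report; the helper name `chronological_split` and the placeholder series are assumptions made for illustration.

```python
from itertools import product

def chronological_split(series, n_in_sample):
    """Keep the earliest n_in_sample points for estimation, the rest for testing."""
    return series[:n_in_sample], series[n_in_sample:]

# All ARMA(p, q) orders with p and q between 0 and 5 inclusive.
MAX_LAG = 5
candidate_orders = list(product(range(MAX_LAG + 1), repeat=2))

# Placeholder for the 3776 daily observations (index positions only).
observations = list(range(3776))
in_sample, out_of_sample = chronological_split(observations, 3210)
share_in_sample = len(in_sample) / len(observations)
```

The split is by position rather than at random because the holdout set has to lie strictly after the estimation window for the forecast test to be meaningful.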
To sum up, the three choices are ARMA (5, 4), ARMA (1, 1) and ARMA (2, 3).

[Table: Information Criterion for ARMA models of the log returns of the Shanghai Stock Exchange index; panels: AIC, SBIC]

Residual Diagnostics: Modelling Volatility
Having formulated the mean model, we now turn to the residuals and their diagnostics. We assumed that the errors are normally distributed with a constant mean and variance. In statistical parlance, this is called white noise. If these assumptions hold, the model behaves as predicted and the interval estimates of the parameters are accurate. If they do not, the results may be misleading and our estimates may be poor, and the assumption that the errors form a stationary white noise process no longer holds. Residuals are used as a proxy for the errors in order to test the assumptions.

We test for the ARCH effect, that is, the presence of heteroskedasticity. We assumed constant volatility of the errors, but this rarely holds for financial data. The variance of the error terms changes over time (is heteroskedastic) and must be modelled alongside the mean, which was modelled above. At this point we leave the realm of linear models and step into non-linear modelling.

The three chosen models were tested for the ARCH effect. Note that the purpose of the test is to detect heteroskedasticity, not to model the variance. GARCH will be used to model the variance, since with ARCH it is difficult to determine the correct number of lags. The test works as long as an ARCH effect exists; used as a model, however, ARCH requires adding more and more lags until the effect is no longer found, which makes it ineffective for modelling purposes. The analysis started with a lag of two in the ARCH heteroskedasticity test. All three models show an ARCH effect: both the F test and the chi-squared test were significant, which shows that the volatility of the residuals is not constant and needs to be modelled along with the mean. Results are documented in the analysis folder for the technical reader.

GARCH models of volatility were then estimated for the three mean models. GARCH stands for Generalised Autoregressive Conditional Heteroskedasticity: the variance of the log returns is modelled with previous variance terms and the squares of the error terms estimated from the mean equations. Therefore, even if the assumption of constant error variance does not hold for the mean equation, the variance is continually updated through this model. The analysis was made thorough by also modelling volatility with EGARCH, GJR or TGARCH, and GARCH-in-Mean models. EGARCH accounts for leverage effects, which means that volatility should rise more after a large price fall than after an equivalent price rise.
TGARCH accounts for the fact that bad news affects volatility more than good news, and GARCH in Mean puts the volatility term in the mean equation to account for the fact that the index level is directly determined by the past volatility level. For a more detailed study of these models, please refer to Brooks². ARCH models were also tested at this point, but it is to be noted that the volatility of most finance data agrees with the GARCH model. Not all the models and their tests are recorded; only the optimal volatility models for the three mean models are recorded in the analysis folder.

² Chris Brooks, Introductory Econometrics for Finance, 2nd Edition (2008)

At each stage of the GARCH/ARCH modelling, diagnostic checks were conducted, mainly to see whether the heteroskedasticity effect was still significant. In almost all cases the ARCH effect became insignificant once volatility had been modelled. Correlogram checks were also done to ensure that the residuals are not autocorrelated. This applies to both the residuals and the squared residuals, since the volatility model has a squared residual term. The ideal model should show no heteroskedasticity and no autocorrelation in either the residuals or the squared residuals.

With these criteria in mind, the GARCH models were tested. The aim is to find a volatility model with no remaining heteroskedasticity and no autocorrelation in the residuals. AIC and SBIC values were looked at, but priority was given to the residual criteria here. To maintain some level of parsimony, the GARCH orders were limited to (5, 5), or five lags. The final decisions were made on the basis of low AIC and SBIC values together with the essential conditions mentioned above, while maintaining parsimony and selecting a simple model. Subjectivity matters in modelling: the focus should not always be on mathematical optimisation alone but also on choosing a model that is easy to understand and implement. Because of the large variety of options, the AIC and SBIC comparisons are not provided here, to keep the choices straightforward.

All volatility models were checked against the standard coefficient constraints during the selection process, so that the volatility process is stationary.
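The correlogram checks on the residuals and squared residuals can be illustrated with a Ljung-Box Q statistic. The report does not say which statistic accompanied its correlograms, so the choice of Ljung-Box is an assumption, as are the function names and the made-up residual series. When the statistic is applied to residuals of a fitted ARMA model, the degrees of freedom are usually reduced by the number of estimated ARMA terms; that adjustment is left out of this sketch.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom

def ljung_box_q(x, max_lag):
    """Ljung-Box Q = n (n + 2) * sum_k rho_k^2 / (n - k), for k = 1..max_lag."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )

# 5% critical value of a chi-squared distribution with 2 degrees of freedom.
CHI2_5PCT_2DF = 5.991

# Hypothetical standardised residuals: a calm stretch followed by a turbulent one.
residuals = [0.1, -0.2, 0.1, -0.1, 0.2, -0.1, 2.0, -2.5, 2.2, -1.9, 2.4, -2.1]
squared = [r * r for r in residuals]

q_levels = ljung_box_q(residuals, 2)
q_squares = ljung_box_q(squared, 2)
squares_autocorrelated = q_squares > CHI2_5PCT_2DF
```

Running the same statistic on the squared series is what links the check to volatility: autocorrelation in squared residuals means large shocks cluster together, which is the pattern a GARCH-type model is meant to absorb.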
For the ARMA (5, 4) model, the most apt volatility model was determined to be TGARCH (1, 1) with threshold order 1. This conclusion was reached after going through the combinations mentioned above and conducting residual checks to see which model gives the best results.

For the ARMA (1, 1) model, the TGARCH (1, 1) in Mean model with threshold order 1 produced the optimal fit. Since this is a market index model, the intuition is for a GARCH term to be present in the mean equation as well. Although the GARCH term in the mean equation is insignificant, the model produced much reduced autocorrelation in the standardised residuals. Therefore this volatility model was selected for the ARMA (1, 1) model.

For the ARMA (2, 3) model, TGARCH (1, 2) with threshold order 2 was selected as the volatility model. This model made most of the coefficients significant and reduced the autocorrelation between the error terms to a decent level, so that the assumptions behind the model may be matched with the data.

A summary of the information criteria for the selected models is given below.

Information Criterion   ARMA(5,4) TGARCH(1,1) (t=1)   ARMA(1,1) TGARCH(1,1)-in-M (t=1)   ARMA(2,3) TGARCH(1,2) (t=2)
AIC                     3.5600                        3.5633                             3.5600
SBIC                    3.5866                        3.5784                             3.5827
HQIC                    3.5696                        3.5687                             3.5682

As the table shows, the ICs are very close to one another, so choosing the optimal model requires some criterion other than the ICs and the correlogram.

Forecasting and Final Model Selection
The purpose of a good model is to forecast future values with the greatest degree of accuracy. Some academics and statisticians are of the opinion that as long as the forecasts are good, the model is good. The assumptions are secondary to the real purpose of the model, which is to forecast the future value with a given level of accuracy. The criterion for the final model selection is therefore the forecasting accuracy of the three models. Both dynamic and static forecasts are used to assess forecasting strength, since both long-run and short-run predictions matter in time series modelling. Forecasting both ways shows whether the model is a good predictor in the short run or the long run. The forecasting error statistics are summarised below, and the forecasts of the chosen model are shown in graphical form.

Static Forecast
Forecast Statistics     ARMA(5,4) TGARCH(1,1) (t=1)   ARMA(1,1) TGARCH(1,1)-in-M (t=1)   ARMA(2,3) TGARCH(1,2) (t=2)
RMSE                    1.0212                        1.0231                             1.0294
MAPE                    107.4325                      105.3958                           117.7223
THEIL U                 0.9469                        0.9724                             0.9157

Dynamic Forecast
Forecast Statistics     ARMA(5,4) TGARCH(1,1) (t=1)   ARMA(1,1) TGARCH(1,1)-in-M (t=1)   ARMA(2,3) TGARCH(1,2) (t=2)
RMSE                    1.0238                        1.0229                             1.0235
MAPE                    103.6614                      102.6099                           99.9349
THEIL U                 0.9839                        0.9780                             0.9954

Static and Dynamic Forecast of ARMA(1,1) TGARCH(1,1)-in-M (t=1)
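The three error statistics in the tables above can be computed along the following lines. The report does not give its formulas, so this sketch assumes the bounded form of Theil's inequality coefficient (which lies between 0 and 1) and a MAPE that skips observations whose actual value is zero. The function names and the sample data are hypothetical.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)

def mape(actual, forecast):
    """Mean absolute percentage error; zero actuals are skipped (assumption)."""
    terms = [abs((a - f) / a) for a, f in zip(actual, forecast) if a != 0]
    if not terms:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(terms) / len(terms)

def theil_u(actual, forecast):
    """Bounded Theil coefficient: RMSE / (rms(actual) + rms(forecast))."""
    def root_mean_square(xs):
        return math.sqrt(sum(x * x for x in xs) / len(xs))
    return rmse(actual, forecast) / (
        root_mean_square(actual) + root_mean_square(forecast)
    )

# Hypothetical daily returns and two forecasts of them.
actual = [0.8, -1.2, 0.5, -0.3]
flat = [0.0, 0.0, 0.0, 0.0]   # forecast of zero every period
perfect = list(actual)        # forecast equal to the outcome
```

Under these definitions a perfect forecast gives a Theil coefficient of 0, while a forecast of zero in every period gives a coefficient of 1 and a MAPE of 100, which is a useful reference point when reading values close to those levels.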
[Figure: two forecast panels for 2012 Q3 to 2014 Q4, each showing the return forecast DPRICEF with ± 2 S.E. bands and the corresponding Forecast of Variance]

Forecast evaluation reported with the first panel (Forecast: DPRICEF, Actual: DPRICE); the values match the dynamic forecast column for the chosen model:
Forecast sample                 7/10/2012 to 11/27/2014
Included observations           566
Root Mean Squared Error         1.022977
Mean Absolute Error             0.746364
Mean Abs. Percent Error         102.6099
Theil Inequality Coefficient    0.978045
  Bias Proportion               0.000154
  Variance Proportion           0.987612
  Covariance Proportion         0.012234

Since the objective is to predict returns in the long run, the dynamic forecast statistics are used to select the optimal model. Static forecasts work well for one- or two-step-ahead periods, whereas dynamic forecasts predict further into the future. Because index returns will be viewed as a long-term investment rather than a daily one, the model is chosen on its dynamic forecasting power. The optimal model should have the minimum RMSE and Theil U; lower values of these statistics indicate smaller deviations from the actual values. The chosen model is therefore the ARMA (1, 1) TGARCH (1, 1) in Mean with a threshold order of one.

Formal Statement of Optimal Model
The general statement of the model is as follows:

$$y_t = \mu + a_1 y_{t-1} + a_2 u_{t-1} + b\,\sigma_{t-1} + u_t, \qquad u_t \sim N(0, \sigma_t^2)$$

$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma u_{t-1}^2 I_{t-1}$$

where
$y_t$ is the value of the returns to be predicted
$\mu$ is the mean equation intercept
$a_1 y_{t-1}$ is the previous period return and its coefficient
$a_2 u_{t-1}$ is the previous period error term and its coefficient
$b\,\sigma_{t-1}$ is the previous period standard deviation and its coefficient
$u_t$ is the error term for the mean equation estimation
$\alpha_0$ is the volatility model intercept
$\alpha_1 u_{t-1}^2$ is the previous period error term squared and its coefficient
$\beta \sigma_{t-1}^2$ is the previous period variance and its coefficient
$\gamma u_{t-1}^2 I_{t-1}$ is the dummy variable and its coefficient, where $I_{t-1} = 1$ if $u_{t-1} < 0$ and $I_{t-1} = 0$ otherwise

The TGARCH or GJR model places some preconditions on the coefficients for the model to be valid and stationary:

$$\alpha_0 \ge 0 \quad \text{and} \quad \alpha_1 + \gamma \ge 0$$

The TGARCH model measures the leverage effect, which states that volatility increases more with a price fall or bad news than it does with a price rise or good news. For a leverage effect to exist in the model, the value of $\gamma$ should be > 0. The model also has a GARCH in Mean term, that is, a volatility term in the mean equation. This is expected of financial data, since higher risk should be compensated with higher return. The volatility term models risk and the dependent variable is the return, so the expectation is that the value of $b$ is > 0.

Stating the model with the coefficients³:

$$y_t = -0.0931 - 0.7546\,y_{t-1} + 0.7702\,u_{t-1} + 0.0753\,\sigma_{t-1} + u_t$$

$$\sigma_t^2 = 0.0304 + 0.0441\,u_{t-1}^2 + 0.9264\,\sigma_{t-1}^2 + 0.0341\,u_{t-1}^2 I_{t-1}$$

Forecast Band: Returns ± 3.208

³ Model is executed with Bollerslev-Wooldridge robust standard errors and covariance, to adjust the standard errors for non-normality of the residuals
The model satisfies all the constraints. The mean equation has a negative intercept and is negatively related to the previous period return, which indicates oscillation between positive and negative values, as is common in financial return series. It is positively linked with the error term of the previous period, which reverses the effect of the prediction shortfall from the previous period. Both coefficients are below 1 in absolute value, which means that these are individually stationary effects on the return. As mentioned above, the volatility term has a positive coefficient, which models the fact that high volatility in the previous period must be compensated by high return in this period.

The volatility model has a positive intercept, indicating that volatility is always positive. It is a stationary volatility process. It is positively linked with the error and variance terms of the previous period, which is to be expected since finance data exhibits volatility clustering. The coefficient on the previous period variance is very high, indicating a momentum effect in volatility. The leverage coefficient is positive, indicating that the model captures the fact that bad news affects volatility more than good news. Standard forecast values and error bands (twice the standard error) are also indicated, which is in general agreement with financial time series models. Technical readers may browse the model specifics attached below.
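The interpretation above can be checked numerically with a short sketch that plugs the reported coefficients into the two equations. The coefficient values are the ones stated in the report; the function names, the starting state and the persistence formula are assumptions. In particular, the expression alpha1 + beta + gamma/2 as a measure of persistence presumes a symmetric shock distribution and is not taken from the report.

```python
# Coefficient estimates as stated in the report.
MU, A1, A2, B = -0.0931, -0.7546, 0.7702, 0.0753
ALPHA0, ALPHA1, BETA, GAMMA = 0.0304, 0.0441, 0.9264, 0.0341

def next_variance(u_prev, sigma2_prev):
    """One step of the threshold (GJR) variance equation."""
    indicator = 1.0 if u_prev < 0 else 0.0
    return (ALPHA0 + ALPHA1 * u_prev ** 2 + BETA * sigma2_prev
            + GAMMA * indicator * u_prev ** 2)

def mean_forecast(y_prev, u_prev, sigma_prev):
    """Conditional mean of y_t; the current shock u_t has expectation zero."""
    return MU + A1 * y_prev + A2 * u_prev + B * sigma_prev

# Persistence and implied long-run variance (symmetric shocks assumed).
persistence = ALPHA1 + BETA + GAMMA / 2.0
long_run_variance = ALPHA0 / (1.0 - persistence)

# Hypothetical state: the same size of shock with opposite signs.
variance_after_bad_news = next_variance(-1.0, 2.0)
variance_after_good_news = next_variance(1.0, 2.0)
```

With a shock of the same size, the negative shock produces the larger next-period variance, which is the leverage effect captured by the positive threshold coefficient. A persistence value below one is what keeps the variance recursion from exploding under the stated assumption.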
Limitations
Every optimisation exercise has drawbacks. Since there are constraints to be satisfied, especially model assumptions and parsimony, some criteria for a good model may not be met or may be met only at a sub-optimal level. Listed below are some limitations faced in building the model.

• Choosing the perfect model involves a large number of permutations and combinations. Testing all of them and arriving at a metric that makes one model better than the rest is not an easy or efficient task. Although the chosen model is optimal with respect to some criteria, it is sub-optimal with respect to others.
• The choice of the optimal model finally comes down to subjectivity and is not completely quantitative. Although the framework is purely mathematical and statistical, the final choice depends heavily on judgement and experience in model picking, since the number of choices is very large.
• The forecasting period is short because of the limited availability of data. Ideally, a longer forecasting window gives more room for forecasting and for testing the validity of the model in the long run. With the number of data points constrained, the forecasting power of the model cannot be completely determined.
• The assumption that residual autocorrelation is absent is not completely achieved, nor is normality of the residuals. These criteria are difficult to fulfil and are very rarely satisfied. Although autocorrelation has been removed for the first ten lags, it exists beyond that. The consequence is a loss of efficiency of the coefficient estimates.
• Because of the adjustments made to fit the model assumptions on all sides, the regression sum of squares may be quite low, which is a limitation of time series analysis.

Conclusion
Although there are plenty of drawbacks, the model for estimating future return levels of the Shanghai Stock Exchange index is a robust one. It may be used to predict a range of possible outcomes for the index, and business and investment decisions may be based on these forecasts. The model is driven by business logic and is designed quantitatively. It updates the volatility level at every time period to reflect changes in risk parameters and environment, which in turn provides better predictions of the returns. Risk also directly influences the returns, which follows business logic. This model therefore provides a good framework for estimating the returns on investments and ventures in the Shanghai Pilot Free Trade Zone (SPFTZ), with the margin for error clearly defined. Overall risk exposures may therefore also be assessed when relying on the model predictions to take investment decisions at the firm level.
Acknowledgement
I would like to thank Dr. Wu Yuan for giving me the opportunity to work on the data he provided to design the time series model. My thanks extend to Nanyang Technological University for providing the facilities needed to complete the assignment. In particular, the Business Library and Financial Trading Room resources, including the modelling software managed by the Nanyang Business School, were of utmost importance.
Bibliography
1. Brooks, Chris: Introductory Econometrics for Finance, 2nd Edition: Cambridge University Press 2008
2. Kutner, Michael H., Nachtsheim, Christopher J., Neter, John, Li, William: Applied Linear Statistical Models: McGraw-Hill/Irwin 2005
3. Shanghai Stock Exchange Index level Data: NTULearn Web Portal