PPT_Sanjeev

Prediction of Stock Prices
Over Short Term
IE 5300 Spring 2015 Project
SANJEEV PUDASAINI

Contents
1. Introduction
• Nature of Stock Data
2. Data Acquisition and Pre-processing
• Source
• Pre-processing
3. Methodology
• Autoregressive AR(p) model
• Moving average MA(q) model
• Multivariate VAR(p) model
• Multivariate VARMA(p, q) model
• Formulating the problem in Matlab© environment
• Evaluation of the model performance
4. Results and Conclusions

Introduction
• Stock prices – a subject of curiosity for both
the general public and expert investors
• They give a picture of the overall volatility of the market and
the financial condition of a particular company
• They depend on the classical economic theory of Supply
and Demand – there is no mathematical formula to
calculate them!
• However, there are ways to predict them!

Introduction – Nature of Stock Data
• Stock prices are affected by Internal and External
Factors
• Internal factors – Earnings statement, market share,
etc.
• External factors – Competitor performance, public
financial indicators
• Unless significantly affected by these factors, stock
prices tend to follow their historical trend –
they are correlated with their own past data
• They are also correlated with other stocks' prices, which act as
external factors

Sample autocorrelation plot of a stock
on a particular day & correlation
between stocks
Hence traditional Multiple Linear Regression, which assumes
independent observations, is not suitable
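
The autocorrelation referred to on this slide can be computed with a few lines of code. The following is a minimal sketch in plain Python, not the tool used to produce the original plot; the function name `autocorrelation` and the sample numbers are illustrative assumptions.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer `lag`."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    # Denominator: total sum of squared deviations from the mean.
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Numerator: products of deviations that are `lag` steps apart.
    num = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, n))
    return num / denom


# Made-up intraday percentage changes for one stock (illustrative only).
changes = [0.0, 0.1, 0.15, 0.2, 0.3, 0.35, 0.4, 0.5]
acf = [autocorrelation(changes, k) for k in range(4)]
```

Values of `acf` that stay well above zero for small lags indicate that the series depends on its own recent past, which is the property the time-series models in this deck exploit.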

Data Acquisition and Pre-processing
• Source – Kaggle repository for the competition “Predict
Short Term Movements in Stock Prices Using News and
Sentiment Data Provided by RavenPack”
• Given: the percentage change in price, relative to the
previous day's closing price, of 198 anonymous stocks,
measured at 5-minute intervals from 9 AM to 2 PM
over 510 days
• Also given: the values of 244 anonymous features at the
same intervals for the same 510 days
• Task: predict the % change in prices 25 intervals
ahead, i.e. at 4 PM
• Training set: 200 days, with closing data at 4 PM

Data Acquisition and Pre-processing
• Pre-processing
• Zeroing out the 9 AM value for each day
• Seeking data entry errors
• Replacing missing values with zeros
• Testing for stationarity and differencing the time
series – VAR models are based on stationary time
series
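
The pre-processing steps listed above can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the original Matlab code: the function name `preprocess_day`, the use of `None` for missing values, and the sample data are all hypothetical, and the stationarity test itself is not shown.

```python
def preprocess_day(values):
    """Clean one day's series of percentage changes and difference it once.

    `values` is a list of floats in which None marks a missing observation.
    Returns the first-order differenced series (one element shorter).
    """
    if not values:
        return []
    # Replace missing observations with zeros.
    cleaned = [0.0 if v is None else float(v) for v in values]
    # Zero out the first (9 AM) observation of the day.
    cleaned[0] = 0.0
    # First-order differencing: d[t] = x[t] - x[t-1].
    return [cleaned[t] - cleaned[t - 1] for t in range(1, len(cleaned))]


# Made-up values for one stock on one day (illustrative only).
day = [0.8, 0.5, None, 0.7, 1.0]
diffed = preprocess_day(day)
```

Differencing removes a slowly drifting level from the series, which is why it is a common first remedy when a series fails a stationarity test.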

Data Acquisition and Pre-processing cont.
Before and after first-order differencing of stock data for a single day

Methodology
• Multivariate Vector Autoregressive - VAR(p) and
Vector Autoregressive Moving Average - VARMA(p,q)
models used for forecasting
• VAR(p) and VARMA(p,q) are derived from their
univariate counterparts
• Matlab© statistical toolboxes used for estimating
the parameters in both models

Methodology
• VAR (p) model
• Vector Autoregressive Moving Average - VARMA
(p,q) model
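
For reference, the standard textbook forms of the two models are given here; the notation is an assumption and may differ from the one used on the original slide. Here $y_t$ is the $n \times 1$ vector of series values at time $t$, $c$ is a constant vector, $A_i$ and $B_j$ are $n \times n$ coefficient matrices, and $\varepsilon_t$ is a white-noise error vector.

```latex
% VAR(p): each series depends on p lags of every series in the system
y_t = c + \sum_{i=1}^{p} A_i \, y_{t-i} + \varepsilon_t

% VARMA(p,q): adds q lags of the error vector to the VAR(p) model
y_t = c + \sum_{i=1}^{p} A_i \, y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} B_j \, \varepsilon_{t-j}
```

Setting $n = 1$ recovers the univariate AR(p) and ARMA(p,q) models, and setting $q = 0$ in the second equation recovers the first.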

Methodology - Formulating the problem in
Matlab© environment
• Determine the number of time series (n) to be used:
an empirically chosen group of 4 highly correlated stocks
used at a time in the model
• Determine the Moving Average (MA) lags and
Autoregressive (AR) lags: Akaike Information Criterion
(AIC) used to find the minimum lag orders
• Trial and error then used to arrive at the AR and
MA lags that produced the least Mean Absolute Error
(MAE)
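
The slides do not say how the groups of 4 highly correlated stocks were formed. One plausible way, shown here purely as an assumption, is to rank the other stocks by the absolute Pearson correlation with the stock being forecast and keep the top three. The function names and the data below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # treat a constant series as uncorrelated
    return cov / (sx * sy)


def most_correlated(target, candidates, k=3):
    """Names of the k candidate series with the largest |correlation| to target."""
    ranked = sorted(candidates,
                    key=lambda name: abs(pearson(target, candidates[name])),
                    reverse=True)
    return ranked[:k]


# Made-up series (illustrative only): the target plus three companions
# gives the group of 4 series mentioned on the slide.
target = [0.1, 0.2, 0.15, 0.3, 0.25]
candidates = {
    "stock_a": [0.2, 0.4, 0.3, 0.6, 0.5],
    "stock_b": [0.3, 0.1, 0.2, 0.0, 0.1],
    "stock_c": [0.1, 0.1, 0.2, 0.1, 0.2],
    "stock_d": [-0.1, -0.2, -0.15, -0.3, -0.25],
}
group = most_correlated(target, candidates, k=3)
```

Ranking by the absolute value keeps strongly negatively correlated stocks as well, since they carry as much predictive information as positively correlated ones.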

Evaluation of model performance
• Kaggle’s criterion: Best model – the model with the lowest
Mean Absolute Error (MAE) score. Mean Absolute
Error is calculated as:
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
• where $y_i$ is the true value of the percentage change in stock
price, $\hat{y}_i$ is the predicted value of the percentage change in
stock price at the end of the day (4 PM), and $n$ is the number of predictions
• Forecasts made for the first 200 days only, because closing
price changes at 4 PM are available only for those days
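
The MAE score can be computed in a few lines. This is a minimal plain-Python sketch, not Kaggle's scoring code; the function name `mean_absolute_error` and the sample values are illustrative assumptions.

```python
def mean_absolute_error(actual, predicted):
    """Mean of |actual - predicted| over paired observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up percentage changes at 4 PM for three stock-days (illustrative only).
actual = [5.0, -1.0, 0.5]
predicted = [-2.0, -0.5, 1.0]
score = mean_absolute_error(actual, predicted)
```

Because the inputs are percentage changes, the score is in percentage points: a score of 0.42 means the forecasts miss the true change by 0.42 percentage points on average.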

What does MAE (Mean Absolute
Error) signify?
• A stock worth 100 dollars yesterday jumped to 105 dollars
this morning at 10 AM, but the model predicted 98 dollars
• Actual change: jump of 5%
• Predicted change: decline of 2%
MAE (single prediction) = |5 - (-2)|% = 7%
• % prediction error relative to the true price = |98 - 105| / 105 ≈ 6.67%
• Hence, the forecast lies within about 6.67% of the
true value
• Good or bad? Depends!
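
The arithmetic in this example can be checked with a short script. The variable names are illustrative; the prices are the ones used on the slide, and the relative error is taken with respect to the true price, which is an assumption about what the slide intends.

```python
prev_close = 100.0       # yesterday's closing price, in dollars
actual_price = 105.0     # price observed this morning
predicted_price = 98.0   # price implied by the model's forecast

# Percentage changes relative to the previous close.
actual_change = (actual_price - prev_close) / prev_close * 100
predicted_change = (predicted_price - prev_close) / prev_close * 100

# Absolute error in percentage points: this is what MAE averages.
abs_error = abs(actual_change - predicted_change)

# Error relative to the true price, in percent (about 6.67).
relative_error = abs(predicted_price - actual_price) / actual_price * 100
```

The two numbers answer different questions: `abs_error` is measured against yesterday's close, while `relative_error` is measured against today's true price.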

Results and Conclusion
• Best results:
The VARMA(10,4) model performs slightly better than
the VAR(4) model, with a lower
Mean Absolute Error (MAE)

Results - Distribution of MAE and sample
forecast graph for a stock

Conclusion / Future Work
• Neither model reached the Kaggle winning
benchmark of 0.42 MAE
• MAE was found to keep increasing as the
forecasting horizon is extended
• Forecasting at 2 PM with the data up to 1 PM as
training set gave an MAE of 0.27
• VAR and VARMA models are more suitable for very
short-term forecasting
• Use of the features in a future model may improve
performance.
