Kondal Kolipaka.pptx

Stock Market Prediction using
Deep Learning Models
Kondal Kolipaka
Liverpool John Moores University
Student number: 931219

Outline
• Introduction
• Problem Description
• Aim and objectives
• Literature Study
• Research Methodology
• Analysis
• Results and Discussions
• Conclusion and future work

Introduction
• Stock market prediction is the act of determining
the future value of a company stock or other
financial instrument traded on an exchange.
• Predicting stock market performance is a large
and potentially profitable area of study
• The successful prediction of a stock's future price
could yield significant profit
• The BSE is the 7th largest stock exchange in the world,
with a market cap of US$2.8 trillion; its Sensex index
represents 30 of the largest companies listed on the
exchange

Problem Description
• Stock market prediction is an interesting task for researchers and academics,
and it divides them into two groups
• The stock market cannot be predicted – Efficient Market
Hypothesis (EMH)
• There is scope to beat the stock market
• Deep learning models for prediction
• LSTM for stock market prediction
• Few researchers have combined numerical and textual analysis for
prediction
• Hybrid LSTM model

Aim and objectives of the study
• Assist investors in making better decisions
• Identify gaps in past research
• Build a prediction model based on stock historical data and news data
• Model identification
• Model building
• Model performance analysis

Research Questions
• Can we combine numerical analysis and textual analysis of stocks to
predict the stock market?
• What is the best machine learning model for stock market prediction?
• How can business news be classified for public sentiment analysis?
• How do historical data and text data techniques help generate better
stock market predictions?

Literature Review
• Numerical data – Indian and international markets
• Textual data – News, Twitter feeds, blogs
• Linear models – AR, MA, ARMA, ARIMA
• Deep learning models – RNN, MLP, CNN, LSTM
• Hybrid models – ARIMA-BPNN, ARIMA-GRU, LSTM and ensemble EMD
Limitations
• Prior work focused either on stock historical data or on news sentiment data
• Little research merges numerical and text analysis data to
predict the stock market

Analysis
• Numerical Data
o BSE Sensex historical data downloaded from Yahoo Finance
o 15 years of data (30-06-2005 to 29-06-2020)
o Daily prices for 3672 days
o Variables – Date, Open, High, Low, Close, Adj Close and Volume
o Main variable: Close
• Text Data
o News headlines published by Times of India (Harvard Dataverse)
o 20 years of data (until mid-2020)
o 3.3 million records
o Variables – publish_date, headline_category, headline_text
o Main variable: headline_text

Analysis
• Data cleansing and pre-processing
• Numerical data: dropping null values, handling missing data, outlier
detection, feature selection
• Text data: dropping null values, feature selection, data range
• Exploratory data analysis
• Numerical Data Modeling – ARIMA & LSTM
[Figures: ARIMA model prediction; LSTM model prediction]
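The cleansing steps listed on this slide could look roughly like the sketch below. It is a minimal illustration only: the slide does not say which outlier test was used, so Tukey's IQR rule is an assumption, as are the helper names `drop_null_rows` and `iqr_outlier_flags`, the treatment of the string "null" as a missing value, and the toy rows (made-up values, not real Sensex data).

```python
import statistics

def drop_null_rows(rows, required=("Date", "Close")):
    """Keep rows whose required fields are all present (not None, "" or "null")."""
    missing = (None, "", "null")
    return [r for r in rows if all(r.get(key) not in missing for key in required)]

def iqr_outlier_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule, an assumption here)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]

# Toy rows with made-up closing prices (hypothetical, for illustration only).
rows = [
    {"Date": "2020-01-01", "Close": 100.0},
    {"Date": "2020-01-02", "Close": 101.0},
    {"Date": "2020-01-03", "Close": None},    # missing value
    {"Date": "2020-01-06", "Close": 102.0},
    {"Date": "2020-01-07", "Close": 99.0},
    {"Date": "2020-01-08", "Close": 500.0},   # suspicious spike
    {"Date": "2020-01-09", "Close": 100.5},
    {"Date": "2020-01-10", "Close": "null"},  # missing value encoded as text
    {"Date": "2020-01-13", "Close": 101.5},
    {"Date": "2020-01-14", "Close": 98.5},
]

clean = drop_null_rows(rows)
flags = iqr_outlier_flags([r["Close"] for r in clean])
outliers = [r for r, flagged in zip(clean, flags) if flagged]
print(len(clean), "rows kept;", "outliers:", outliers)
```

Whether a flagged row is dropped, capped or kept is a separate decision that the slide does not specify.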

Cont. Analysis
LSTM Hybrid Model
Add the sentiment of the headline text as an input to the
original LSTM and check whether
performance improves
• Date
• Close
• Headline_text => Sentiment Score
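One way to build the three inputs above is to reduce each day's headlines to a single sentiment score and join it to that day's closing price. The sketch below is an assumption about how this could be done, not the method used in the study: the +1/-1 labels, the averaging rule, the neutral default of 0.0 for days without news, and the helper names are all hypothetical, and the data is made up.

```python
from collections import defaultdict

def daily_sentiment(headlines):
    """Average the per-headline labels (+1 positive, -1 negative) for each date."""
    by_date = defaultdict(list)
    for date, label in headlines:
        by_date[date].append(label)
    return {date: sum(labels) / len(labels) for date, labels in by_date.items()}

def merge_close_with_sentiment(closes, sentiment, default=0.0):
    """Left-join closing prices with daily sentiment; days without news get `default`."""
    return [(date, close, sentiment.get(date, default)) for date, close in closes]

# Made-up classifier output: (publish_date, predicted label).
headlines = [
    ("2020-06-25", 1),
    ("2020-06-25", -1),
    ("2020-06-25", 1),
    ("2020-06-26", -1),
]
# Made-up closing prices: (Date, Close).
closes = [
    ("2020-06-24", 100.0),
    ("2020-06-25", 102.0),
    ("2020-06-26", 101.0),
]

merged = merge_close_with_sentiment(closes, daily_sentiment(headlines))
for row in merged:
    print(row)
```

A left join on the price dates keeps every trading day, so headlines published on non-trading days are simply ignored in this sketch.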
Model Parameters:
• 80:20 training and validation set
• Tanh activation function
• Adam optimizer
• Batch size 16
• Epochs 100
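Before a model with these parameters can be trained, the daily series has to be cut into fixed-length input windows and split 80:20. The sketch below shows one plausible way to do that; the look-back length of 3, the chronological (unshuffled) split, and the helper names are assumptions for illustration, and the numbers are made up.

```python
def make_windows(features, targets, lookback):
    """Pair each run of `lookback` consecutive feature rows with the next day's target."""
    X, y = [], []
    for end in range(lookback, len(features)):
        X.append(features[end - lookback:end])
        y.append(targets[end])
    return X, y

def chronological_split(X, y, train_fraction=0.8):
    """Split without shuffling so the validation set lies strictly after the training set."""
    cut = int(len(X) * train_fraction)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])

# Made-up daily rows of [Close, sentiment score] for 13 days.
features = [[100.0 + day, 0.1 * (day % 3 - 1)] for day in range(13)]
targets = [row[0] for row in features]  # predict the closing price

X, y = make_windows(features, targets, lookback=3)
(train_X, train_y), (val_X, val_y) = chronological_split(X, y)
print(len(train_X), "training windows,", len(val_X), "validation windows")
```

Each window in `X` has shape (lookback, 2), which matches the (timesteps, features) layout that recurrent layers typically expect.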
Text Analysis
• Naïve Bayes Classifier
• SVM Classifier
• Random Forest Classifier

Results and Discussions
ARIMA model performance
Parameter   Result
MSE         14469805.031856986
MAE         2620.2431482654974
RMSE        3803.9196931398255
MAPE        0.07676215004310963
A MAPE of about 7.7% means predictions are off by about 7.7% of the actual price
on average over the test set, i.e. the model is about 92.3% accurate.
LSTM model performance
Parameter   Result
MSE         637816.3887958465
MAE         650.9328685484523
RMSE        798.6340769062177
MAPE        0.01779417716769563
A MAPE of about 1.8% means predictions are off by about 1.8% of the actual price
on average over the test set, i.e. the model is about 98.2% accurate.
Text analysis
Classification Model       Accuracy
Naive Bayes Classifier     0.751
SVM Classifier             0.888
Random Forest Classifier   0.842
Hybrid model performance
Parameter   Result
MSE         243371.66329966017
MAE         317.4715822069669
RMSE        493.32713618820947
MAPE        0.009039365197613879
A MAPE of about 0.9% means the model is about 99.1% accurate in predicting
the stock price over the test set, a clear improvement over both the
individual LSTM model and the ARIMA model.
Different model performance
Model         MSE            MAE        RMSE       MAPE
ARIMA         14469805.032   2620.243   3803.920   0.0768
LSTM          637816.389     650.933    798.634    0.0178
Hybrid LSTM   243371.663     317.472    493.327    0.0090
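For reference, the four error measures reported above can be computed as in the following sketch. It uses the standard definitions (with MAPE expressed as a fraction, matching the tables); the function name `regression_metrics` and the toy prices are illustrative assumptions, not the study's code or data.

```python
import math

def regression_metrics(actual, predicted):
    """Return MSE, MAE, RMSE and MAPE (as a fraction) for paired price series."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE divides each error by the actual value, so actual prices must be non-zero.
    mape = sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MSE": mse, "MAE": mae, "RMSE": rmse, "MAPE": mape}

# Made-up prices for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

metrics = regression_metrics(actual, predicted)
accuracy = 1.0 - metrics["MAPE"]  # the "accuracy" reading used on this slide
print(metrics, accuracy)
```

Because RMSE is the square root of MSE, the two reported values for each model can be checked against each other; MAE and MAPE need the underlying predictions to verify.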

Contributions and future work
• Novel approach to stock market prediction that combines numerical and text analysis data
• In the LSTM Hybrid model, the text analysis augments the numerical analysis by combining the
sentiment score with the closing price. This resulted in a MAPE of 0.0090 and an RMSE of 493.33,
which shows that combining sentiment analysis data with historical data gives better results
than the individual LSTM model.
• Results for the numerical analysis could be improved with more sophisticated approaches, such as
the deep learning framework of Bao, Yue and Rao (2017), which combines wavelet transforms (WT),
stacked autoencoders (SAEs) and long short-term memory (LSTM) to forecast stock prices
