1. Stock Market Prediction Using Disparate Data Sources
INFORMS Philadelphia
November 2015
Bin Weng (Email: bzw0018@auburn.edu)
Ph.D. Candidate in Industrial and Systems Engineering
Mohamed A. Ahmed (Email: mza0068@auburn.edu)
M.S. Candidate in Industrial and Systems Engineering
Fadel M. Megahed (Email: fmegahed@auburn.edu)
Assistant Professor of Industrial and Systems Engineering
2. Stock Market Prediction
Why?
• The stock market is one
of the most important
ways for companies to
raise money.
• About 48% of Americans
were invested in the stock
market as of 2015 (CNBC).
• The successful prediction
of a stock’s future price
could yield significant
PROFIT.
4. Stock Market Prediction
Ray Dalio’s $165B Bridgewater Associates will start
a new artificial-intelligence unit to use predictive
analysis for trades. (Bloomberg, 2015)
5. Related Work
Paper Index   Selected Papers
[1] Predicting Financial Markets: Comparing Survey, News, Twitter and Search Engine Data
[2] A fusion model of HMM, ANN and GA for stock market forecasting
[3] Twitter mood predicts the stock market
[4] Stock Market Prediction System with Modular Neural Networks
[5] Empirical evaluation of an automated intraday stock recommendation system incorporating both market data and textual news
[6] A Hybrid Machine Learning System for Stock Market Forecasting
[7] Market Index and Stock Price Direction Prediction using Machine Learning Techniques: An empirical study on the KOSPI and HSI
[8] Stock Market Prediction Using Disparate Data Sources (Proposed)
6. Related Work
Columns compared for each paper:
Data: Market Data, Technical Indicator, Social Media, News, Secondary Variable
Model: Time Series, Logistic Regression, Decision Trees, Neural Networks, Support Vector Machines
Type of Stock: IT Index, Mix of companies
Paper   Target
[1]     Price, Volume
[2]     Price
[3]     Movement
[4]     Buy and sell signal
[5]     Price, Volume
[6]     Movement
[7]     Movement
[8]     Movement
7. Research Motivation
Which sources of data correlate most strongly
with the stock market time series?
Which logical target has the best prediction
capability with regard to the stock movement?
Which technological model is best at predicting
the stock movement?
Can we construct a better model using disparate
data sources?
10. Data Sources
Social Media and Internet Data
• “Financial news articles play a large role in influencing the
movement of a stock as humans react to the information.” (M.
Nardo et al., 2015)
• “Data on changes in how often financially
related Wikipedia pages were viewed have contained early signs
of stock market moves.” (H. Moat et al., 2013)
• Blog communication exhibits remarkable
predictive power. (M. Choudhury et al., 2008)
11. Data Sources
Secondary Variables
• Data from social media and the Internet are highly variable, so
secondary variables are derived from them (e.g., Moving Average,
Momentum, Relative Strength Index).
• Does the upward or downward movement of the predictor variables
affect the target movement?
• What range of the primary variables has predictive power over the
targets?
[Figure: Google News & Blogs series, 1/2/2014 to 6/2/2014; vertical axis 0 to 3500]
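The three secondary variables named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the window lengths, and the simple-average form of RSI (rather than a smoothed variant) are all assumptions.

```python
from typing import List, Optional

Series = List[Optional[float]]


def moving_average(series: List[float], window: int) -> Series:
    """Simple moving average; None until a full window is available."""
    out: Series = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(series[i + 1 - window:i + 1]) / window)
    return out


def momentum(series: List[float], lag: int) -> Series:
    """Today's value minus the value `lag` observations earlier."""
    return [None if i < lag else series[i] - series[i - lag]
            for i in range(len(series))]


def rsi(series: List[float], window: int) -> Series:
    """Relative Strength Index from simple averages of gains and losses.

    Assumption: plain averages over the last `window` changes; a flat
    window (no gains, no losses) is reported as 50.
    """
    changes = [series[i] - series[i - 1] for i in range(1, len(series))]
    out: Series = [None] * len(series)
    for i in range(window, len(series)):
        recent = changes[i - window:i]  # the `window` changes ending at i
        gain = sum(c for c in recent if c > 0) / window
        loss = sum(-c for c in recent if c < 0) / window
        if gain == 0 and loss == 0:
            out[i] = 50.0
        elif loss == 0:
            out[i] = 100.0
        else:
            out[i] = 100.0 - 100.0 / (1.0 + gain / loss)
    return out
```

Applied to a daily count series such as the Google News & Blogs counts, each function returns a smoothed or directional series of the same length that can be used as an extra predictor.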
12. Target Matrix
Target Type   Method
1             Open (i+1) – Close (i)
2             Open (i+1) – Open (i)
3             Close (i+1) – Close (i)
4             Close (i+1) – Open (i)
5             Volume of trades moves as previous day
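As a rough illustration of the target matrix above, the sketch below turns a list of daily records into up/down labels. The field names (`open`, `close`, `volume`), the 1/0 labelling, and the reading of target 5 as the direction of the day-over-day volume change are assumptions made for the example, not details given on the slide.

```python
from typing import Dict, List


def direction(diff: float) -> int:
    """Assumed binary label: 1 if the difference is positive, else 0."""
    return 1 if diff > 0 else 0


def build_targets(days: List[Dict[str, float]]) -> List[Dict[str, int]]:
    """Return one row of five targets for each pair (day i, day i+1)."""
    rows = []
    for today, nxt in zip(days, days[1:]):
        rows.append({
            "target_1": direction(nxt["open"] - today["close"]),
            "target_2": direction(nxt["open"] - today["open"]),
            "target_3": direction(nxt["close"] - today["close"]),
            "target_4": direction(nxt["close"] - today["open"]),
            # Assumption: target 5 is the direction of the volume change.
            "target_5": direction(nxt["volume"] - today["volume"]),
        })
    return rows
```

A series of n trading days yields n - 1 labelled rows, because every target compares day i with day i+1.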
30. Conclusion
• Disparate sources of data help predict the stock market.
• Prediction results for multiple targets can be used in
conjunction to successfully track stock market
movements.
• The decision tree and support vector machine models
alternate as the best performer, depending on the
combination of input data.
• When all types of input data are used, SVMs perform best.
31. Future Work
• Identify new sources of data that have a predictive
effect on the movement of the stock market, such as Twitter
sentiment and textual analysis of market news, and add
them to a more inclusive form of this model.
• Include linguistic modeling, clustering, and control
methods such as fuzzy theory in obtaining predictions
of the price range.
[Figures: Fuzzy Membership Function; Fuzzy System]
Editor's Notes
Ray Dalio (born August 8, 1949) is an American businessman and founder of the investment firm Bridgewater Associates.
journal
initial
Does the target satisfy investors who wish to know movement in different time periods? What kind of prediction is being done, short or long term?
Google News can search and explore information from historical archives dating back over 200 years. Blogs provide insight into people's communication patterns. The communication dynamics correlate with certain external events, which justifies their predictive power.
Feature selection is an indispensable step in machine learning.
As more and more data is collected and analyzed, decision makers at all levels welcome data visualization software that enables them to see analytical results presented visually, find relevance among the millions of variables, communicate concepts and hypotheses to others, and even predict the future.
It works by recursively removing attributes and building a model on the attributes that remain. Code is available.
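The note above reads like a description of recursive feature elimination; assuming that is the method meant, the loop can be sketched as follows. The linear model fitted by gradient descent, the use of absolute weights as importance scores, and all function and parameter names are illustrative assumptions, not the code the note refers to. The sketch also assumes the features are on comparable scales.

```python
from typing import Callable, List

Matrix = List[List[float]]


def abs_linear_weights(X: Matrix, y: List[float],
                       epochs: int = 500, lr: float = 0.05) -> List[float]:
    """Fit a linear model by gradient descent; return |weight| per column."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, target in zip(X, y):
            err = b + sum(wj * xj for wj, xj in zip(w, row)) - target
            grad_b += err
            for j in range(p):
                grad_w[j] += err * row[j]
        b -= lr * grad_b / n
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
    return [abs(wj) for wj in w]


def recursive_feature_elimination(
    X: Matrix, y: List[float], names: List[str], n_keep: int,
    importance: Callable[[Matrix, List[float]], List[float]] = abs_linear_weights,
) -> List[str]:
    """Refit on the remaining columns and drop the weakest, until n_keep remain."""
    keep = list(range(len(names)))
    while len(keep) > n_keep:
        sub = [[row[j] for j in keep] for row in X]
        scores = importance(sub, y)
        weakest = min(range(len(keep)), key=lambda k: scores[k])
        del keep[weakest]
    return [names[j] for j in keep]
```

Any model that exposes a per-feature importance can be passed as `importance`; the model is rebuilt after each removal, which is what distinguishes this procedure from a one-shot ranking.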
Gap as increase
Stronger lines
This research work points to the fact that historical market data alone is not sufficient to predict the movements of stocks in the market. It validates the proposition that internet search data has predictive power, too.