This document summarizes research on using disparate data sources and fuzzy logic systems to predict stock prices. It discusses feature selection techniques such as recursive feature elimination for choosing relevant inputs from multiple data sources when modeling stock movement and price. Judged by prediction error, the experimental results indicate that the fuzzy logic models can predict stock movement and price changes from selected features drawn from technical analysis, fundamental data, and other sources. The document proposes combining the movement and price prediction models to support stock selection decisions. Suggested future work includes expanding the data sources and the rule base and using error-driven methods to improve the models.
1. Stock Price Prediction Using Disparate Data Sources in Fuzzy Systems
INFORMS Philadelphia
November 2015
Mohamed Abraar Ahmed (Email: mza0068@auburn.edu)
M.S. Candidate, Industrial and Systems Engineering
2. Stock Market Prediction
Why?
• The stock market is one of the most important ways for companies to raise money
• About 48% of Americans invested in the stock market in 2015 (CNBC)
• The successful prediction of a stock’s future price could yield significant profit
5. Motivation and Previous Process Overview
• Which sources of data correlate most strongly with the stock market time series?
• Which logical target has the best prediction capability with regard to the stock movement?
• Which technological model is best at predicting the stock movement?
• Can we construct a better model using disparate data sources?
10. Motivation
Movement could be predicted quite accurately; can the same be done for price?
Movement indicates whether to buy or sell; price indicates whether the trade is worth it
Will applying fuzzy logic to the disparate data sources improve, maintain, or reduce accuracy compared to other implementations?
Can the movement and price models be used in conjunction for better decision making in stock selection?
11. Membership Functions for Input and Output
Cluster Analysis
• Cluster analysis, or segmentation analysis, forms clusters such that data points in the same cluster are very similar
• K-means clustering
• Clusters were used to form the ranges of the membership functions
• Coding: in MATLAB
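The slide states that k-means clusters were used to form the membership function ranges but does not show how. The following Python sketch illustrates one way this could work, under stated assumptions: the membership functions are triangular (the slides do not name a shape), each cluster center is a triangle peak, and neighboring centers are the triangle feet. The function names `kmeans_1d`, `triangular_mfs`, and `membership` are hypothetical and are not taken from the original MATLAB code.

```python
def kmeans_1d(values, k, iterations=100):
    """Plain 1-D k-means; returns the cluster centers in ascending order."""
    data = sorted(values)
    # Deterministic start: spread the initial centers evenly over the sorted data.
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        updated = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def triangular_mfs(centers, lo, hi):
    """Assumed shape: one (left, peak, right) triangle per cluster center."""
    points = [lo] + list(centers) + [hi]
    return [(points[i - 1], points[i], points[i + 1])
            for i in range(1, len(points) - 1)]


def membership(x, tri):
    """Degree of membership of x in a triangular set (left, peak, right)."""
    left, peak, right = tri
    if x == peak:
        return 1.0
    if left < x < peak:
        return (x - left) / (peak - left)
    if peak < x < right:
        return (right - x) / (right - peak)
    return 0.0


# Illustrative values only, not data from the study.
sample = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8]
centers = kmeans_1d(sample, k=3)
low, medium, high = triangular_mfs(centers, lo=0.0, hi=10.0)
print(centers, membership(3.0, medium))
```

With three clusters, the triangles can be read as Low, Medium, and High levels of an input; other shapes (trapezoidal, Gaussian) could be derived from the same centers.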
12. Rules
Defined level categories based on the input membership functions
Derived an input-output rule for each pair from real historical data
Checked which rule-forming input-output pairs were repeated
Picked the most repeated
Coding: in VBA
Part of results for Output Low MF
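The rule-building steps above (label each historical input-output pair, count repeats, keep the most repeated) can be sketched in Python as follows. This is a minimal illustration rather than the original VBA code: the names `label` and `extract_rules`, the level names, and the use of crisp cut points in place of full membership degrees are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def label(value, cut_points, names):
    """Map a number to a level name using ascending cut points (assumed crisp)."""
    for cut, name in zip(cut_points, names):
        if value < cut:
            return name
    return names[-1]


def extract_rules(history, input_cuts, output_cuts,
                  names=("Low", "Medium", "High")):
    """Build one rule per input pattern: the most repeated output level wins.

    history is a list of (inputs, output) pairs of raw numbers.
    """
    votes = defaultdict(Counter)
    for inputs, output in history:
        antecedent = tuple(label(v, input_cuts, names) for v in inputs)
        consequent = label(output, output_cuts, names)
        votes[antecedent][consequent] += 1
    # Ties go to the output level that was seen first for that pattern.
    return {ante: counts.most_common(1)[0][0] for ante, counts in votes.items()}


# Illustrative, made-up observations scaled to [0, 1].
history = [
    ((0.10, 0.90), 0.20),
    ((0.20, 0.80), 0.10),
    ((0.15, 0.95), 0.80),
    ((0.90, 0.10), 0.90),
]
rules = extract_rules(history, input_cuts=(0.33, 0.66), output_cuts=(0.33, 0.66))
for antecedent, consequent in rules.items():
    print("IF inputs are", antecedent, "THEN output is", consequent)
```

Conflicting observations for the same input pattern are resolved by majority vote, which is one reading of "picked most repeated".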
13. Results
• Using error as a marker of performance, the results are convincing
• There are situations where it looks like more rules are required for predicting the market
• The system appears to react well even when the stock price range has changed
15. Model Combination
The main idea behind adding fuzzy logic to the chosen movement model is to predict the close price after the movement is known
If the predicted close price moves in the opposite direction to the movement prediction, the close price is reset to the previous day's price
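The reset rule above can be written as a small function. This is a sketch under assumptions: the movement prediction is encoded as +1 (up) or -1 (down), and the name `combine_predictions` is hypothetical.

```python
def combine_predictions(prev_close, predicted_close, predicted_movement):
    """Reconcile the price model with the movement model.

    predicted_movement is assumed to be +1 (up) or -1 (down).
    If the predicted price change contradicts the predicted movement,
    fall back to the previous day's close.
    """
    change = predicted_close - prev_close
    if change * predicted_movement < 0:
        return prev_close
    return predicted_close


# Illustrative numbers only.
print(combine_predictions(100.0, 98.0, +1))   # contradiction: reset to 100.0
print(combine_predictions(100.0, 103.0, +1))  # consistent: keep 103.0
```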
16. Future Work
Gather data by other methods, such as Twitter sentiment and textual analysis of financial reports
Scan for more rules via the input-output pairing method
Use prediction error in genetic algorithms to modify rules
Ray Dalio’s $165B Bridgewater Associates will start a new artificial-intelligence unit to use predictive analysis for trades. (Bloomberg, 2015)
Talk about Secondary Variables
Feature selection is an indispensable process in machine learning.
As more and more data is collected and analyzed, decision makers at all levels welcome data visualization software that enables them to see analytical results presented visually, find relevance among the millions of variables, communicate concepts and hypotheses to others, and even predict the future.
Recursive feature elimination works by recursively removing attributes and building a model on those attributes that remain. Code available
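To make the elimination loop concrete, here is a minimal standard-library Python sketch. It is not the author's code: it assumes an ordinary least-squares linear model as the estimator, ranks features by the absolute size of their coefficients (which is only meaningful when features are on comparable scales), and removes one feature per round. The names `solve`, `fit_linear`, and `rfe` are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear(rows, y):
    """Least-squares feature coefficients via the normal equations.

    An intercept is fitted but not returned.
    """
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)[1:]


def rfe(rows, y, n_keep):
    """Return the original indices of the n_keep surviving features."""
    active = list(range(len(rows[0])))
    while len(active) > n_keep:
        subset = [[r[i] for i in active] for r in rows]
        coefs = fit_linear(subset, y)  # rebuild the model on what remains
        weakest = min(range(len(active)), key=lambda k: abs(coefs[k]))
        del active[weakest]            # drop the least important feature
    return active


# Synthetic example: the target depends on features 0 and 2 only.
rows = [[i, (i * i) % 5, (3 * i) % 7] for i in range(8)]
y = [3 * r[0] + 2 * r[2] + 1 for r in rows]
print(rfe(rows, y, n_keep=2))
```

In practice a library implementation with cross-validation and a stronger estimator would likely be preferable; the sketch only shows the remove-and-refit cycle.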