SlideShare a Scribd company logo
Figure. SVM Kernel Gamma Γ vs Accuracy
Stock Market Trends Prediction after Earning Release
Chen Qian, Wenjie Zheng, Ran An
(cqian23 / zhwj814 / anran)@stanford.edu
Public companies release quarterly earnings report to present financial
status to investors. Short term stock prices can be significantly affected
by company’s quarterly performance. Our project aims to explore the
impact of financial report factors, earning per share (EPS), consensus
forecast and the earning report related articles on short term stock
market trends. We narrow down our targeted company to Silicon based
public technology companies.
Features
Relativized financial data
Data Selection
• Financial factors
Discussion
Future
Result: Stock market is known as a chaotic system and our model analysis
depicts that even building our model with empirical key features could still result
in low accuracy. However, by limiting our scope to earning release day, we are
able to build the prediction model of around 70% accuracy.
Total asset Total equity Total revenue Net income
EPS surprise Total asset Total equity Total revenue Net income
15.30% 5.52% 7.82% -3.45% 14.21%
Very negative Negative Neutral Positive Very Positive
2.564% 8.974% 62.820% 20.512% 5.128%
Normalized sentiment analysis features
Data Preprocessing
Relativisation
Calculate the growth rate of
each financial factor from last
quarter, allows us to combine
companies data together.
Sentiment Analysis
Interact with Stanford NLP
toolkit to obtain sentiment
analysis matrix from news
and articles.
Normalization
Normalized input features
simplifies model design.
● EPS vs Forecast
How much the actual EPS bits
consensus expectation is also a
key factor to be considered for
investment strategy.
● News and analytic articles
We would like to analyze if these
articles from mainstream media could
influence stock market and investor’s
trading strategies.
● Financial features from earning report
Mixed
● Sentiment features from articles
Mixed
Model Selection & Evaluation
Introduction Results
Model Equation Parameters Smoothing Tools
Locally Weighted LR Bandwidth Ƭ
Model Equation Parameters Kernel
SVM Bandwidth Γ
Figure. LWLR with financial features:
Bandwidth vs Accuracy
Figure. LWLR with NLP features:
Bandwidth vs Accuracy
● Training labels
Label 1: Next day opening price vs release date close price
Label 2: Next day closing price vs release date close price
● All error analysis are validated using K-Fold cross validation (K = 5)
Data Visualization
Models and features: SVM seems to give us a better prediction than NB and
other generalized LR algorithms. Further, it is observed that data visualization
from different companies shows that data set from a certain company is much
more distinguishable than mixing data together. The commercial correlations
between companies are far less than previous expectation and the features we
collected are far from enough to predict the stock trend.
Limitation: Due to the limited number of company choices, we have small data
size (~300 samples) which could lead to high bias and over-fitting. Stock price is
not only affected by certain financial features, consensus news, but also company
direction and future business guidance, which are difficult to be digitized.
Figure. Accuracy with different models over multiple tests
Feature selections: Write script to auto collect/process data to increase the
learning set size and use feature ranking to analysis the covariance between
features and labels and pick the most top ranked one as the model inputs.
NLP analysis: The sentiment result from Stanford NLP toolkit is too general
to apply for financial news analysis and we would spend more time to develop
a financial specific sentiment analysis algorithm.
Model Improvement: We would spend more time on SVM algorithm and
explore the most appropriate kernel to separate features in high dimension.
Classification Models
Predict by report SVM Naïve Bayesian Locally Weighted Boosting
after-market 69.80% 66% 64.15% 66%
next-day 64.20% 60.30% 68.12% 58.30%
Predict by article (norm)
after-market 57.64% 52.17% 58.96% 53.96%
next-day 54.91% 50.94% 64.38% 52.17%
1. In SVM with RBF, decision boundary to
separate samples of different labels is easier to
find with a certain company. It is hard to find the
boundary if we mix data among companies.
However, in terms of article, no matter if the
focus is on individual company or several, the
data tends to follow a similar distribution.
2. In experiments with SVM and LWLR, we
tried to adjust the bandwidth 𝛕/Γ to improve the
performance of these models.
3. 5-fold cross validation was exploited for
accuracy measurement

More Related Content

What's hot

Stock Market Price Prediction Using Technical Analysis
Stock Market Price Prediction Using Technical AnalysisStock Market Price Prediction Using Technical Analysis
Stock Market Price Prediction Using Technical Analysis
ASHEESHVERMA6
 
Managing uncertainty in ai performance target setting
Managing uncertainty in ai performance target settingManaging uncertainty in ai performance target setting
Managing uncertainty in ai performance target setting
Noelle Ibrahim
 
ForecastIT 3. Simple Exponential Smoothing
ForecastIT 3. Simple Exponential SmoothingForecastIT 3. Simple Exponential Smoothing
ForecastIT 3. Simple Exponential Smoothing
DeepThought, Inc.
 
Stock Market Prediction
Stock Market PredictionStock Market Prediction
Stock Market Prediction
MRIDUL GUPTA
 
ForecastIT 1. Introduction to Forecasting
ForecastIT 1. Introduction to ForecastingForecastIT 1. Introduction to Forecasting
ForecastIT 1. Introduction to Forecasting
DeepThought, Inc.
 
Presentation 2
Presentation 2Presentation 2
Presentation 2
uliana8
 
Npi rpg 22
Npi rpg 22Npi rpg 22
Npi rpg 22
Simon Joiner
 

What's hot (10)

Stock Market Price Prediction Using Technical Analysis
Stock Market Price Prediction Using Technical AnalysisStock Market Price Prediction Using Technical Analysis
Stock Market Price Prediction Using Technical Analysis
 
Managing uncertainty in ai performance target setting
Managing uncertainty in ai performance target settingManaging uncertainty in ai performance target setting
Managing uncertainty in ai performance target setting
 
Stock Market Analysis
Stock Market AnalysisStock Market Analysis
Stock Market Analysis
 
ForecastIT 3. Simple Exponential Smoothing
ForecastIT 3. Simple Exponential SmoothingForecastIT 3. Simple Exponential Smoothing
ForecastIT 3. Simple Exponential Smoothing
 
Stock Market Prediction
Stock Market PredictionStock Market Prediction
Stock Market Prediction
 
Forecasting
ForecastingForecasting
Forecasting
 
ForecastIT 1. Introduction to Forecasting
ForecastIT 1. Introduction to ForecastingForecastIT 1. Introduction to Forecasting
ForecastIT 1. Introduction to Forecasting
 
Presentation 2
Presentation 2Presentation 2
Presentation 2
 
Npi rpg 22
Npi rpg 22Npi rpg 22
Npi rpg 22
 
STOCK MARKET PREDICTION
STOCK MARKET PREDICTIONSTOCK MARKET PREDICTION
STOCK MARKET PREDICTION
 

Similar to Stock Market Trends Prediction after Earning Release.pptx

Quant Foundry Labs - Low Probability Defaults
Quant Foundry Labs - Low Probability DefaultsQuant Foundry Labs - Low Probability Defaults
Quant Foundry Labs - Low Probability Defaults
Davidkerrkelly
 
IRJET - Stock Recommendation System using Machine Learning Approache
IRJET - Stock Recommendation System using Machine Learning ApproacheIRJET - Stock Recommendation System using Machine Learning Approache
IRJET - Stock Recommendation System using Machine Learning Approache
IRJET Journal
 
Stock price prediction using stock eod of day price
Stock price prediction using stock eod of day priceStock price prediction using stock eod of day price
Stock price prediction using stock eod of day price
Ekaantik Software Solutions
 
The Analysis of Share Market using Random Forest & SVM
The Analysis of Share Market using Random Forest & SVMThe Analysis of Share Market using Random Forest & SVM
The Analysis of Share Market using Random Forest & SVM
IRJET Journal
 
ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...
ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...
ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...
IRJET Journal
 
Investment Portfolio Risk Manager using Machine Learning and Deep-Learning.
Investment Portfolio Risk Manager using Machine Learning and Deep-Learning.Investment Portfolio Risk Manager using Machine Learning and Deep-Learning.
Investment Portfolio Risk Manager using Machine Learning and Deep-Learning.
IRJET Journal
 
STOCK MARKET ANALYZING AND PREDICTION USING MACHINE LEARNING TECHNIQUES
STOCK MARKET ANALYZING AND PREDICTION USING MACHINE LEARNING TECHNIQUESSTOCK MARKET ANALYZING AND PREDICTION USING MACHINE LEARNING TECHNIQUES
STOCK MARKET ANALYZING AND PREDICTION USING MACHINE LEARNING TECHNIQUES
IRJET Journal
 
IRJET- Financial Analysis using Data Mining
IRJET- Financial Analysis using Data MiningIRJET- Financial Analysis using Data Mining
IRJET- Financial Analysis using Data Mining
IRJET Journal
 
IRJET- Slant Analysis of Customer Reviews in View of Concealed Markov Display
IRJET- Slant Analysis of Customer Reviews in View of Concealed Markov DisplayIRJET- Slant Analysis of Customer Reviews in View of Concealed Markov Display
IRJET- Slant Analysis of Customer Reviews in View of Concealed Markov Display
IRJET Journal
 
Prediction of customer propensity to churn - Telecom Industry
Prediction of customer propensity to churn - Telecom IndustryPrediction of customer propensity to churn - Telecom Industry
Prediction of customer propensity to churn - Telecom Industry
Pranov Mishra
 
What we do; predictive and prescriptive analytics
What we do; predictive and prescriptive analyticsWhat we do; predictive and prescriptive analytics
What we do; predictive and prescriptive analytics
Weibull AS
 
Heuristic Approach for Demand Forecasting under the Impact of Promotions
Heuristic Approach for Demand Forecasting under the Impact of PromotionsHeuristic Approach for Demand Forecasting under the Impact of Promotions
Heuristic Approach for Demand Forecasting under the Impact of Promotions
IRJET Journal
 
EAD Parameter : A stochastic way to model the Credit Conversion Factor
EAD Parameter : A stochastic way to model the Credit Conversion FactorEAD Parameter : A stochastic way to model the Credit Conversion Factor
EAD Parameter : A stochastic way to model the Credit Conversion Factor
Genest Benoit
 
Neural Network Model
Neural Network ModelNeural Network Model
Neural Network ModelEric Esajian
 
IRJET- Stock Market Prediction using Machine Learning
IRJET- Stock Market Prediction using Machine LearningIRJET- Stock Market Prediction using Machine Learning
IRJET- Stock Market Prediction using Machine Learning
IRJET Journal
 
Demand forecasting case study
Demand forecasting case studyDemand forecasting case study
Demand forecasting case study
Rupam Devnath
 
Advanced Pricing in General Insurance
Advanced Pricing in General InsuranceAdvanced Pricing in General Insurance
Advanced Pricing in General Insurance
Syed Danish Ali
 
everis Marcus Evans FRTB Conference 23Feb17
everis Marcus Evans FRTB Conference 23Feb17everis Marcus Evans FRTB Conference 23Feb17
everis Marcus Evans FRTB Conference 23Feb17
Jonathan Philp
 

Similar to Stock Market Trends Prediction after Earning Release.pptx (20)

Quant Foundry Labs - Low Probability Defaults
Quant Foundry Labs - Low Probability DefaultsQuant Foundry Labs - Low Probability Defaults
Quant Foundry Labs - Low Probability Defaults
 
IRJET - Stock Recommendation System using Machine Learning Approache
IRJET - Stock Recommendation System using Machine Learning ApproacheIRJET - Stock Recommendation System using Machine Learning Approache
IRJET - Stock Recommendation System using Machine Learning Approache
 
Stock price prediction using stock eod of day price
Stock price prediction using stock eod of day priceStock price prediction using stock eod of day price
Stock price prediction using stock eod of day price
 
The Analysis of Share Market using Random Forest & SVM
The Analysis of Share Market using Random Forest & SVMThe Analysis of Share Market using Random Forest & SVM
The Analysis of Share Market using Random Forest & SVM
 
ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...
ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...
ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...
 
Investment Portfolio Risk Manager using Machine Learning and Deep-Learning.
Investment Portfolio Risk Manager using Machine Learning and Deep-Learning.Investment Portfolio Risk Manager using Machine Learning and Deep-Learning.
Investment Portfolio Risk Manager using Machine Learning and Deep-Learning.
 
STOCK MARKET ANALYZING AND PREDICTION USING MACHINE LEARNING TECHNIQUES
STOCK MARKET ANALYZING AND PREDICTION USING MACHINE LEARNING TECHNIQUESSTOCK MARKET ANALYZING AND PREDICTION USING MACHINE LEARNING TECHNIQUES
STOCK MARKET ANALYZING AND PREDICTION USING MACHINE LEARNING TECHNIQUES
 
IRJET- Financial Analysis using Data Mining
IRJET- Financial Analysis using Data MiningIRJET- Financial Analysis using Data Mining
IRJET- Financial Analysis using Data Mining
 
IRJET- Slant Analysis of Customer Reviews in View of Concealed Markov Display
IRJET- Slant Analysis of Customer Reviews in View of Concealed Markov DisplayIRJET- Slant Analysis of Customer Reviews in View of Concealed Markov Display
IRJET- Slant Analysis of Customer Reviews in View of Concealed Markov Display
 
Prediction of customer propensity to churn - Telecom Industry
Prediction of customer propensity to churn - Telecom IndustryPrediction of customer propensity to churn - Telecom Industry
Prediction of customer propensity to churn - Telecom Industry
 
Final Report
Final ReportFinal Report
Final Report
 
What we do; predictive and prescriptive analytics
What we do; predictive and prescriptive analyticsWhat we do; predictive and prescriptive analytics
What we do; predictive and prescriptive analytics
 
Heuristic Approach for Demand Forecasting under the Impact of Promotions
Heuristic Approach for Demand Forecasting under the Impact of PromotionsHeuristic Approach for Demand Forecasting under the Impact of Promotions
Heuristic Approach for Demand Forecasting under the Impact of Promotions
 
EAD Parameter : A stochastic way to model the Credit Conversion Factor
EAD Parameter : A stochastic way to model the Credit Conversion FactorEAD Parameter : A stochastic way to model the Credit Conversion Factor
EAD Parameter : A stochastic way to model the Credit Conversion Factor
 
CCAR - Kocis
CCAR - KocisCCAR - Kocis
CCAR - Kocis
 
Neural Network Model
Neural Network ModelNeural Network Model
Neural Network Model
 
IRJET- Stock Market Prediction using Machine Learning
IRJET- Stock Market Prediction using Machine LearningIRJET- Stock Market Prediction using Machine Learning
IRJET- Stock Market Prediction using Machine Learning
 
Demand forecasting case study
Demand forecasting case studyDemand forecasting case study
Demand forecasting case study
 
Advanced Pricing in General Insurance
Advanced Pricing in General InsuranceAdvanced Pricing in General Insurance
Advanced Pricing in General Insurance
 
everis Marcus Evans FRTB Conference 23Feb17
everis Marcus Evans FRTB Conference 23Feb17everis Marcus Evans FRTB Conference 23Feb17
everis Marcus Evans FRTB Conference 23Feb17
 

Stock Market Trends Prediction after Earning Release.pptx

  • 1. Figure. SVM Kernel Gamma Γ vs Accuracy Stock Market Trends Prediction after Earning Release Chen Qian, Wenjie Zheng, Ran An (cqian23 / zhwj814 / anran)@stanford.edu Public companies release quarterly earnings report to present financial status to investors. Short term stock prices can be significantly affected by company’s quarterly performance. Our project aims to explore the impact of financial report factors, earning per share (EPS), consensus forecast and the earning report related articles on short term stock market trends. We narrow down our targeted company to Silicon based public technology companies. Features Relativized financial data Data Selection • Financial factors Discussion Future Result: Stock market is known as a chaotic system and our model analysis depicts that even building our model with empirical key features could still result in low accuracy. However, by limiting our scope to earning release day, we are able to build the prediction model of around 70% accuracy. Total asset Total equity Total revenue Net income EPS surprise Total asset Total equity Total revenue Net income 15.30% 5.52% 7.82% -3.45% 14.21% Very negative Negative Neutral Positive Very Positive 2.564% 8.974% 62.820% 20.512% 5.128% Normalized sentiment analysis features Data Preprocessing Relativisation Calculate the growth rate of each financial factor from last quarter, allows us to combine companies data together. Sentiment Analysis Interact with Stanford NLP toolkit to obtain sentiment analysis matrix from news and articles. Normalization Normalized input features simplifies model design. ● EPS vs Forecast How much the actual EPS bits consensus expectation is also a key factor to be considered for investment strategy. ● News and analytic articles We would like to analyze if these articles from mainstream media could influence stock market and investor’s trading strategies. ● Financial features from earning report Mixed ● Sentiment features from articles Mixed Model Selection & Evaluation Introduction Results Model Equation Parameters Smoothing Tools Locally Weighted LR Bandwidth Ƭ Model Equation Parameters Kernel SVM Bandwidth Γ Figure. LWLR with financial features: Bandwidth vs Accuracy Figure. LWLR with NLP features: Bandwidth vs Accuracy ● Training labels Label 1: Next day opening price vs release date close price Label 2: Next day closing price vs release date close price ● All error analysis are validated using K-Fold cross validation (K = 5) Data Visualization Models and features: SVM seems to give us a better prediction than NB and other generalized LR algorithms. Further, it is observed that data visualization from different companies shows that data set from a certain company is much more distinguishable than mixing data together. The commercial correlations between companies are far less than previous expectation and the features we collected are far from enough to predict the stock trend. Limitation: Due to the limited number of company choices, we have small data size (~300 samples) which could lead to high bias and over-fitting. Stock price is not only affected by certain financial features, consensus news, but also company direction and future business guidance, which are difficult to be digitized. Figure. Accuracy with different models over multiple tests Feature selections: Write script to auto collect/process data to increase the learning set size and use feature ranking to analysis the covariance between features and labels and pick the most top ranked one as the model inputs. NLP analysis: The sentiment result from Stanford NLP toolkit is too general to apply for financial news analysis and we would spend more time to develop a financial specific sentiment analysis algorithm. Model Improvement: We would spend more time on SVM algorithm and explore the most appropriate kernel to separate features in high dimension. Classification Models Predict by report SVM Naïve Bayesian Locally Weighted Boosting after-market 69.80% 66% 64.15% 66% next-day 64.20% 60.30% 68.12% 58.30% Predict by article (norm) after-market 57.64% 52.17% 58.96% 53.96% next-day 54.91% 50.94% 64.38% 52.17% 1. In SVM with RBF, decision boundary to separate samples of different labels is easier to find with a certain company. It is hard to find the boundary if we mix data among companies. However, in terms of article, no matter if the focus is on individual company or several, the data tends to follow a similar distribution. 2. In experiments with SVM and LWLR, we tried to adjust the bandwidth 𝛕/Γ to improve the performance of these models. 3. 5-fold cross validation was exploited for accuracy measurement