Stock Prediction Using Social
Network Data
Rohit Tiwari (rtiwari2)
Chanon Hongsirikulkit (hongsir2)
Outline
- Introduction
- Data Sources
- APIs
- Filter Relevant Data
- Text Normalization
- Noise Removal
- Feature Extraction
- Topic Modeling
- Sentiment Analysis
- Tweet Features
- Prediction Model Construction
- Conclusion
- Future Works
Fake Tweet -> Stocks Plunged
Introduction
- Social networks are communication platforms that contain valuable hidden knowledge
- Information on social networks can reflect real-world events
- Many studies exploit this information to enhance application capabilities
- Analyzing tweets that express information needs (Zhao and Mei 2013)
- Applying tweet-rate to predict movie box-office revenues (Asur and Huberman 2010)
- Our survey focuses on using social network data to predict stock market movement
- False message on Twitter “BREAKING: Two Explosions in the White House and Barack Obama is injured.” -> The Dow Jones and
S&P 500 indexes dropped by close to 1%, the equivalent of hundreds of billions of dollars changing hands.
- In August 2012, an Italian journalist set up a fake Twitter account for a member of Russia's government and tweeted that the
president of Syria had been killed, causing brief fluctuations in the oil markets.
http://www.telegraph.co.uk/finance/markets/10013768/Bogus-AP-tweet-about-explosion-at-the-White-House-wipes-billions-off-US-markets.html
Formal Description: The Efficient Market Hypothesis (EMH)
- The EMH states that market prices fully reflect all available information.
It implies that price changes are driven by new information, since investors take it into
account and act accordingly.
- However, research asserts that investors’ rational considerations are also influenced by
psychological biases and emotions.
- For several decades, direct surveys have been the prominent method for estimating public mood
and investor sentiment. However, self-reported answers can be inaccurate or biased, and surveys
cannot take behavior-based indicators into consideration.
J. Bollen and H. Mao, “Twitter Mood as a Stock Market Predictor,” Computer, vol. 44, no. 10, pp. 91-94, 2011.
General Methodology for Stock prediction
[Flow diagram: Data Sources -> (via APIs) -> Relevant Dataset -> Data Preprocessing (Text Filter, Text Normalization, Noise Removal) -> Feature Extraction -> Features (Topic Modeling, Sentiment Analysis, Tweet Features) -> Classifiers (Training Data, Testing) -> Results (Correlation / Prediction Capability)]
Data Sources
- Twitter (Asur and Huberman 2010; Bollen and Mao 2011; Zhao and Mei 2013; Arias et al. 2015)
- Streaming API -> collect real-time tweets
- Search API -> search and collect historical tweets from the past week
- Yahoo Finance (Nguyen et al. 2015)
- Collect historical stock prices
- Collect posts from Yahoo Finance Board
- Sina Weibo (Liu et al. 2015)
- A Chinese microblogging service similar to Twitter
Filter Relevant Data from Corpus
- Data collected from social networks contains both data relevant and data irrelevant
to our specific domain
- We need to keep only the relevant data
- Approaches used in the literature:
- Filter by keywords -> exploit hashtags or cashtags in the messages
- Apply LDA for topic modeling and then keep only the related topics (Arias et al.
2015)
M. Arias, A. Arratia, and R. Xuriguera, “Forecasting with Twitter Data,” ACM
Transactions on Intelligent Systems and Technology, vol. 5, no. 1, pp. 1-24, 2015.
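The keyword filter above can be sketched as follows. This is a minimal illustration, not the method of any cited paper; the keyword list, the sample messages, and the function names are assumptions made for the example.

```python
import re

# Hypothetical keyword list for one company; not taken from the slides.
KEYWORDS = {"$aapl", "#aapl", "#apple"}

# A cashtag is '$' and a hashtag is '#', each followed by word characters.
TAG_PATTERN = re.compile(r"[$#]\w+")


def extract_tags(text):
    """Return the set of lowercase hashtags and cashtags found in a message."""
    return {tag.lower() for tag in TAG_PATTERN.findall(text)}


def is_relevant(text, keywords=KEYWORDS):
    """Keep a message only if it carries at least one tracked tag."""
    return bool(extract_tags(text) & keywords)


tweets = [
    "Bought more $AAPL before earnings",
    "Lovely weather today #sunny",
    "New phone announced #Apple",
]
relevant = [t for t in tweets if is_relevant(t)]
```

Matching is case-insensitive here because tags are lowercased before comparison; a stricter filter might also require the tag to match a known ticker list.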
Text Normalization
The primary step in refining the data. It can involve the following tasks:
- Stop word removal
- Punctuation removal
- Lowercase conversion
- Compressing
- Transform “Haaappyyyy” to “Happy” by collapsing repeated letters. This is done over
multiple iterations, with a dictionary lookup validating the result at the end.
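A minimal sketch of these normalization steps, assuming a tiny hand-made stop-word list and dictionary (both hypothetical). The compression step here tries shortening each run of repeated letters to one or two letters and accepts the first candidate found in the dictionary, which is one possible reading of the iterate-then-validate idea above.

```python
import string
from itertools import groupby, product

# Tiny illustrative resources; a real system would use much larger lists.
STOP_WORDS = {"i", "am", "so", "the", "a", "is", "to"}
DICTIONARY = {"i", "am", "so", "happy", "good", "stock", "today"}


def compress(word, dictionary=DICTIONARY):
    """Shorten runs of repeated letters until the word is in the dictionary."""
    if word in dictionary:
        return word
    runs = [(ch, len(list(group))) for ch, group in groupby(word)]
    # Each repeated run may shrink to a single or a double letter.
    options = [[ch] if n == 1 else [ch, ch * 2] for ch, n in runs]
    for parts in product(*options):
        candidate = "".join(parts)
        if candidate in dictionary:
            return candidate
    # No dictionary match: fall back to capping every run at two letters.
    return "".join(ch * min(n, 2) for ch, n in runs)


def normalize(text):
    """Lowercase, strip punctuation, compress words, and drop stop words."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = [compress(tok) for tok in text.split()]
    return [tok for tok in tokens if tok not in STOP_WORDS]


tokens = normalize("I am sooo Haaappyyyy today!!!")
```

Punctuation removal also strips "$" and "#", so any hashtag or cashtag filtering would need to run before this step.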
Noise Removal in tweets
- Standard noise-removal tools use inverse document frequency (IDF) weighting to
remove very frequent, uninformative terms.
- Named entity recognition (NER) system - built on a conditional random fields (CRF)
model to determine whether a tweet contains named entities related to the company
(or another feature of interest). If the tweet does not contain any named entity
from the company’s keyword list, it is removed.
[Figure: cluttered information -> refined form]
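The IDF-based step can be illustrated with a short sketch. It assumes tweets are already tokenized, uses the plain log(N / df) form of IDF, and picks an arbitrary threshold; the function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def inverse_document_frequency(docs):
    """Map each term to log(N / df), where df counts the documents containing it."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n_docs / count) for term, count in df.items()}


def drop_frequent_terms(docs, min_idf):
    """Remove terms whose IDF falls below min_idf (i.e. very common terms)."""
    idf = inverse_document_frequency(docs)
    return [[t for t in doc if idf[t] >= min_idf] for doc in docs]


docs = [
    ["rt", "apple", "earnings", "beat"],
    ["rt", "apple", "stock", "up"],
    ["rt", "market", "down"],
]
# "rt" appears in every document, so its IDF is log(3 / 3) = 0 and it is dropped.
cleaned = drop_frequent_terms(docs, min_idf=0.1)
```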
Feature Extraction
- Some studies use the topics of messages as features for the forecasting model
- Many approaches have been proposed for topic extraction
- Extract n-grams (unigrams or bigrams)
- Latent Dirichlet Allocation (LDA)
- Joint Sentiment-Topic (JST) -> extracts sentiment information and topics from
text data simultaneously
- Aspect-based sentiment -> extracts topics first and then calculates sentiment scores
based on the distance between topics and emotion words / the importance of each topic
(Nguyen et al. 2015)
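Of the approaches listed, n-gram extraction is the simplest to show in code. The sketch below assumes messages are already tokenized; the function names and sample data are illustrative only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token tuples from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(docs, n):
    """Count n-grams over a collection of tokenized messages."""
    counts = Counter()
    for tokens in docs:
        counts.update(ngrams(tokens, n))
    return counts


docs = [
    ["stock", "price", "rises"],
    ["stock", "price", "falls"],
]
unigram_counts = ngram_counts(docs, 1)
bigram_counts = ngram_counts(docs, 2)
```

The resulting counts can serve directly as bag-of-n-grams features for a classifier.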
Topic Modelling
- Extract topics first and then calculate sentiment scores based on the distance
between topics and emotion words / the importance of each topic (Nguyen et al.
2015)
[Figure: aspect-based sentiment algorithm, shown as two listings: an algorithm for extracting topics from the dataset, and an algorithm for extracting topics and their sentiment values]
T. H. Nguyen, K. Shirai, and J. Velcin, “Sentiment analysis on
social media for stock movement prediction,” Expert Systems with
Applications, vol. 42, no. 24, pp. 9603-9611, 2015.
Sentiment Analysis
- Some studies use sentiment information from social networks as features for their models
- There are two ways to obtain sentiment scores
- Use existing software to calculate sentiment scores
- Construct a classifier for sentiment classification
- Popular tools
- GPOMS -> categorizes people’s emotions into 6 categories: calm, alert, sure, vital, kind, and happy
- OpinionFinder (OF) -> classifies sentiment as positive or negative
Constructing Sentiment Classifier
- Have experts annotate sentiment data and use it as training data
- Extract features from the training data -> n-grams, POS tags
- Use a classifier (SVM, Linear Regression Model) to learn from the training data
- Apply the classifier to the entire collection
Extracting Sentiment Features
Once the data has been classified by sentiment, sentiment features can be generated in various ways.
Examples of sentiment features used in the literature:
- Average daily sentiment score
- Sentiment index = Number of positive tweets / Total number of tweets
- PNRatio = Number of positive tweets / Number of negative tweets
- Sentiment polarity = (ptw - ntw) / (ptw + ntw)
- ptw : number of positive tweets
- ntw : number of negative tweets
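The ratios above can be computed from per-tweet labels as in the sketch below. The label strings ("pos", "neg", "neu") and the choice to count neutral tweets in the total are assumptions; undefined ratios are returned as None rather than raising.

```python
def sentiment_features(labels):
    """Compute sentiment index, PNRatio, and polarity from per-tweet labels."""
    ptw = sum(1 for label in labels if label == "pos")  # positive tweets
    ntw = sum(1 for label in labels if label == "neg")  # negative tweets
    total = len(labels)  # assumption: neutral tweets count toward the total
    return {
        "sentiment_index": ptw / total if total else None,
        "pn_ratio": ptw / ntw if ntw else None,
        "polarity": (ptw - ntw) / (ptw + ntw) if (ptw + ntw) else None,
    }


# Five tweets from one day: three positive, one negative, one neutral.
features = sentiment_features(["pos", "pos", "neg", "neu", "pos"])
```

For this sample, ptw = 3 and ntw = 1, so the sentiment index is 3/5, the PNRatio is 3/1, and the polarity is (3 - 1)/(3 + 1).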
Sentiment Features Testing
- To ensure that sentiment information reflects real-world events and can be used for prediction
- Approaches used in the literature (Bollen and Mao 2011)
- Causality testing : to test the correlation between sentiment information and stock market indices (DJIA / VIX)
- Self-organizing fuzzy neural network (SOFNN) : to test the predictive capability of sentiment information
J. Bollen, and H. Mao, “Twitter Mood as a Stock Market Predictor,”
Computer, vol. 44, no. 10, pp. 91-94, 2011.
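A formal causality test (for example a Granger causality test) needs a statistics package. As a much simpler stand-in, and not the procedure of the cited paper, the sketch below measures the Pearson correlation between sentiment on day t and the market value change on day t + lag. All numbers are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(sentiment, returns, lag):
    """Correlate sentiment on day t with the return on day t + lag."""
    if lag > 0:
        return pearson(sentiment[:-lag], returns[lag:])
    return pearson(sentiment, returns)


# Made-up series in which each return is twice the previous day's sentiment.
sentiment = [0.1, 0.4, 0.2, 0.6, 0.3, 0.5]
returns = [0.0, 0.2, 0.8, 0.4, 1.2, 0.6]
same_day = lagged_correlation(sentiment, returns, 0)
next_day = lagged_correlation(sentiment, returns, 1)
```

A high lagged correlation only suggests that sentiment leads the index; it does not establish causality.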
Extracting Tweet Features
Some useful quantifiable information can be extracted from the corpus.
- Number of followers of the company, or of the well-known personality tweeting about the
company (a typical problem for the MapReduce framework)
- Tweet volume (related to a specific identity or hashtag)
- Retweet volume (related to a specific hashtag coupled with an identity)
- Tweet-rate = Number of tweets / Duration over which those tweets were generated
- Tweet length
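A minimal sketch of the volume, tweet-rate, and length features, assuming each tweet is a (timestamp, text) pair. Measuring the duration as the span between the first and last tweet, in hours, is an assumption made for this example.

```python
from datetime import datetime


def tweet_rate(timestamps):
    """Tweets per hour over the span between the first and last tweet."""
    if len(timestamps) < 2:
        return 0.0
    span_hours = (max(timestamps) - min(timestamps)).total_seconds() / 3600
    return len(timestamps) / span_hours if span_hours else 0.0


def tweet_features(tweets):
    """Compute volume, tweet-rate, and mean length for (timestamp, text) pairs."""
    times = [ts for ts, _ in tweets]
    texts = [text for _, text in tweets]
    return {
        "volume": len(tweets),
        "tweet_rate_per_hour": tweet_rate(times),
        "mean_length": sum(len(t) for t in texts) / len(texts) if texts else 0.0,
    }


tweets = [
    (datetime(2015, 1, 5, 10, 0), "abcd"),
    (datetime(2015, 1, 5, 10, 30), "ab"),
    (datetime(2015, 1, 5, 11, 0), "abcdef"),
    (datetime(2015, 1, 5, 12, 0), "abcd"),
]
features = tweet_features(tweets)
```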
Prediction Model Construction
1. Combine features from the previous steps
- Topic features
- Sentiment features
- Tweet features
- Historical stock price features (additional features)
Google Heat Map:
Gives a fair idea of how any kind of information is concentrated geographically, e.g., Facebook trends
Iterative Training & Validation
2. Train the classifier -> SVM, Linear Regression, Neural Networks
3. Test and evaluate the model
- The most popular method is a windowing mechanism, in which the model groups
tweets in a window (w1) spanning several days and analyzes their sentiment or
other features.
- Then, in the subsequent window (w2) of 1-2 days, stock indices are measured.
- Finally, w1 and w2 are analyzed together to find interesting patterns.
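The windowing mechanism can be sketched as pairing each w1-day feature window with the w2 days of index values that follow it. The one-value-per-day representation, the function name, and the sample numbers are assumptions for illustration.

```python
def window_pairs(daily_features, daily_index, w1, w2):
    """Pair each w1-day feature window with the following w2 days of index values."""
    pairs = []
    last_start = len(daily_features) - w1 - w2
    for start in range(last_start + 1):
        feature_window = daily_features[start:start + w1]
        target_window = daily_index[start + w1:start + w1 + w2]
        pairs.append((feature_window, target_window))
    return pairs


# Made-up daily sentiment scores and index values for five trading days.
daily_sentiment = [1, 2, 3, 4, 5]
daily_index = [10, 11, 12, 13, 14]
pairs = window_pairs(daily_sentiment, daily_index, w1=3, w2=1)
```

Each pair can then be used as one training or test example, with the feature window as input and the index window as the target.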
Correlation of sentiments & indices
This involves formally, causally correlating social network sentiment with stock market
indices from Dow Jones, NASDAQ, NYSE, and VIX
M. Arias, A. Arratia, and R. Xuriguera, “Forecasting with Twitter Data,” ACM
Transactions on Intelligent Systems and Technology, vol. 5, no. 1, pp. 1-24, 2015.
T. H. Nguyen, K. Shirai, and J. Velcin, “Sentiment analysis on social media for stock movement
prediction,” Expert Systems with Applications, vol. 42, no. 24, pp. 9603-9611, 2015.
Conclusion
- Information on social networks reflects real-world events
- Social network data can be used to predict stock market movement to a certain
degree
- Knowledge extracted from social media can be applied to different
applications
- Individual stock price prediction
- Predicting the box-office revenue of a movie
- Presidential/Senate election prediction based on campaign data.
Future Works
- Work with datasets covering a longer duration -> some current studies use only 15
transaction dates
- Combining information from different data sources might improve prediction
accuracy -> Twitter data is known to be noisy
- Come up with new features, such as the credibility of tweets -> most
current research focuses on topic + sentiment without considering the
reliability of the data
References
[1] M. Arias, A. Arratia, and R. Xuriguera, “Forecasting with Twitter Data,” ACM Transactions on Intelligent Systems and Technology, vol. 5, no. 1,
pp. 1-24, 2015.
[2] L. Liu, J. Wu, P. Li, and Q. Li, “A social-media-based approach to predicting stock comovement,” Expert Systems with Applications, vol. 42, no.
8, pp. 3893-3901, 2015.
[3] T. H. Nguyen, K. Shirai, and J. Velcin, “Sentiment analysis on social media for stock movement prediction,” Expert Systems with Applications,
vol. 42, no. 24, pp. 9603-9611, 2015.
[4] S. Asur and B. A. Huberman, “Predicting the Future with Social Media,” 2010 IEEE/WIC/ACM International Conference on Web Intelligence and
Intelligent Agent Technology, pp. 492-499, 2010.
[5] Z. Zhao, Q. Mei, “Questions about questions: an empirical analysis of information needs on Twitter,” Proceedings of the 22nd international
conference on World Wide Web, May 13-17, 2013, Rio de Janeiro, Brazil
[6] J. Bollen, and H. Mao, “Twitter Mood as a Stock Market Predictor,” Computer, vol. 44, no. 10, pp. 91-94, 2011.
[7] J. Si, A. Mukherjee, B. Liu, Q. Li, H. Li, and X. Deng, “Exploiting Topic based Twitter Sentiment for Stock Prediction,” Proceedings of the 51st
Annual Meeting of the Association for Computational Linguistics, pp. 24-29, 2013.
[8] X. Zhang, H. Fuehres, and P. A. Gloor, “Predicting Stock Market Indicators Through Twitter “I hope it is not as bad as I fear”,” The 2nd
Collaborative Innovation Networks Conference - COINs2010, vol. 26, pp. 55-62, 2011.
[9] G. Ranco, D. Aleksovski, G. Caldarelli, M. Grčar, and I. Mozetič, “The Effects of Twitter Sentiment on Stock Price Returns,” PLoS ONE, vol. 10,
no. 9, pp. 1-21, 2015.
[10] T. T. Vu, S. Chang, Q. T. Ha, and N. Collier, “An Experiment in Integrating Sentiment Features for Tech Stock Prediction in Twitter,”
Workshop on Information Extraction and Entity Analytics on Social Media Data, pp. 23-38, 2012.
Thank You :)

More Related Content

What's hot

Final PPT.pptx
Final PPT.pptxFinal PPT.pptx
Final PPT.pptx
samarth70133
 
Stock Market Prediction
Stock Market PredictionStock Market Prediction
Stock Market Prediction
MRIDUL GUPTA
 
PRML上巻勉強会 at 東京大学 資料 第2章2.3.3 〜 2.3.6
PRML上巻勉強会 at 東京大学 資料 第2章2.3.3 〜 2.3.6PRML上巻勉強会 at 東京大学 資料 第2章2.3.3 〜 2.3.6
PRML上巻勉強会 at 東京大学 資料 第2章2.3.3 〜 2.3.6Hiroyuki Kato
 
テンソル代数
テンソル代数テンソル代数
テンソル代数
KCS Keio Computer Society
 
差分の差分法(Difference-in-Difference)
差分の差分法(Difference-in-Difference)差分の差分法(Difference-in-Difference)
差分の差分法(Difference-in-Difference)
Jaehyun Song
 
Stock Price Prediction
Stock Price PredictionStock Price Prediction
Stock Price Prediction
Manisha Mishra
 
連続時間フラクショナル・トピックモデル(NLP2023 金融・経済ドメインのための言語処理)
連続時間フラクショナル・トピックモデル(NLP2023 金融・経済ドメインのための言語処理)連続時間フラクショナル・トピックモデル(NLP2023 金融・経済ドメインのための言語処理)
連続時間フラクショナル・トピックモデル(NLP2023 金融・経済ドメインのための言語処理)
Kei Nakagawa
 
ICML 2021 Workshop 深層学習の不確実性について
ICML 2021 Workshop 深層学習の不確実性についてICML 2021 Workshop 深層学習の不確実性について
ICML 2021 Workshop 深層学習の不確実性について
tmtm otm
 
PRML上巻勉強会 at 東京大学 資料 第1章前半
PRML上巻勉強会 at 東京大学 資料 第1章前半PRML上巻勉強会 at 東京大学 資料 第1章前半
PRML上巻勉強会 at 東京大学 資料 第1章前半Ohsawa Goodfellow
 
Stock market prediction using data mining
Stock market prediction using data miningStock market prediction using data mining
Stock market prediction using data mining
ShivakumarSoppannavar
 
データ解析6 重回帰分析
データ解析6 重回帰分析データ解析6 重回帰分析
データ解析6 重回帰分析
Hirotaka Hachiya
 
Tokyo.R 41 サポートベクターマシンで眼鏡っ娘分類システム構築
Tokyo.R 41 サポートベクターマシンで眼鏡っ娘分類システム構築Tokyo.R 41 サポートベクターマシンで眼鏡っ娘分類システム構築
Tokyo.R 41 サポートベクターマシンで眼鏡っ娘分類システム構築Tatsuya Tojima
 
STOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUE
STOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUESTOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUE
STOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUE
Richa Handa
 
[DLHacks LT] PytorchのDataLoader -torchtextのソースコードを読んでみた-
[DLHacks LT] PytorchのDataLoader -torchtextのソースコードを読んでみた-[DLHacks LT] PytorchのDataLoader -torchtextのソースコードを読んでみた-
[DLHacks LT] PytorchのDataLoader -torchtextのソースコードを読んでみた-
Deep Learning JP
 
クラシックな機械学習入門 1 導入
クラシックな機械学習入門 1 導入クラシックな機械学習入門 1 導入
クラシックな機械学習入門 1 導入
Hiroshi Nakagawa
 
Accelerating Dynamic Time Warping Subsequence Search with GPU
Accelerating Dynamic Time Warping Subsequence Search with GPUAccelerating Dynamic Time Warping Subsequence Search with GPU
Accelerating Dynamic Time Warping Subsequence Search with GPU
Davide Nardone
 
多項式あてはめで眺めるベイズ推定 ~今日からきみもベイジアン~
多項式あてはめで眺めるベイズ推定~今日からきみもベイジアン~多項式あてはめで眺めるベイズ推定~今日からきみもベイジアン~
多項式あてはめで眺めるベイズ推定 ~今日からきみもベイジアン~
tanutarou
 
これならわかる最適化数学8章_動的計画法
これならわかる最適化数学8章_動的計画法これならわかる最適化数学8章_動的計画法
これならわかる最適化数学8章_動的計画法
kenyanonaka
 
Bayesian Structural Equations Modeling (SEM)
Bayesian Structural Equations Modeling (SEM)Bayesian Structural Equations Modeling (SEM)
Bayesian Structural Equations Modeling (SEM)
Hamy Temkit
 
Classifying and understanding financial data using graph neural network
Classifying and understanding financial data using graph neural networkClassifying and understanding financial data using graph neural network
Classifying and understanding financial data using graph neural network
Park JunPyo
 

What's hot (20)

Final PPT.pptx
Final PPT.pptxFinal PPT.pptx
Final PPT.pptx
 
Stock Market Prediction
Stock Market PredictionStock Market Prediction
Stock Market Prediction
 
PRML上巻勉強会 at 東京大学 資料 第2章2.3.3 〜 2.3.6
PRML上巻勉強会 at 東京大学 資料 第2章2.3.3 〜 2.3.6PRML上巻勉強会 at 東京大学 資料 第2章2.3.3 〜 2.3.6
PRML上巻勉強会 at 東京大学 資料 第2章2.3.3 〜 2.3.6
 
テンソル代数
テンソル代数テンソル代数
テンソル代数
 
差分の差分法(Difference-in-Difference)
差分の差分法(Difference-in-Difference)差分の差分法(Difference-in-Difference)
差分の差分法(Difference-in-Difference)
 
Stock Price Prediction
Stock Price PredictionStock Price Prediction
Stock Price Prediction
 
連続時間フラクショナル・トピックモデル(NLP2023 金融・経済ドメインのための言語処理)
連続時間フラクショナル・トピックモデル(NLP2023 金融・経済ドメインのための言語処理)連続時間フラクショナル・トピックモデル(NLP2023 金融・経済ドメインのための言語処理)
連続時間フラクショナル・トピックモデル(NLP2023 金融・経済ドメインのための言語処理)
 
ICML 2021 Workshop 深層学習の不確実性について
ICML 2021 Workshop 深層学習の不確実性についてICML 2021 Workshop 深層学習の不確実性について
ICML 2021 Workshop 深層学習の不確実性について
 
PRML上巻勉強会 at 東京大学 資料 第1章前半
PRML上巻勉強会 at 東京大学 資料 第1章前半PRML上巻勉強会 at 東京大学 資料 第1章前半
PRML上巻勉強会 at 東京大学 資料 第1章前半
 
Stock market prediction using data mining
Stock market prediction using data miningStock market prediction using data mining
Stock market prediction using data mining
 
データ解析6 重回帰分析
データ解析6 重回帰分析データ解析6 重回帰分析
データ解析6 重回帰分析
 
Tokyo.R 41 サポートベクターマシンで眼鏡っ娘分類システム構築
Tokyo.R 41 サポートベクターマシンで眼鏡っ娘分類システム構築Tokyo.R 41 サポートベクターマシンで眼鏡っ娘分類システム構築
Tokyo.R 41 サポートベクターマシンで眼鏡っ娘分類システム構築
 
STOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUE
STOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUESTOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUE
STOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUE
 
[DLHacks LT] PytorchのDataLoader -torchtextのソースコードを読んでみた-
[DLHacks LT] PytorchのDataLoader -torchtextのソースコードを読んでみた-[DLHacks LT] PytorchのDataLoader -torchtextのソースコードを読んでみた-
[DLHacks LT] PytorchのDataLoader -torchtextのソースコードを読んでみた-
 
クラシックな機械学習入門 1 導入
クラシックな機械学習入門 1 導入クラシックな機械学習入門 1 導入
クラシックな機械学習入門 1 導入
 
Accelerating Dynamic Time Warping Subsequence Search with GPU
Accelerating Dynamic Time Warping Subsequence Search with GPUAccelerating Dynamic Time Warping Subsequence Search with GPU
Accelerating Dynamic Time Warping Subsequence Search with GPU
 
多項式あてはめで眺めるベイズ推定 ~今日からきみもベイジアン~
多項式あてはめで眺めるベイズ推定~今日からきみもベイジアン~多項式あてはめで眺めるベイズ推定~今日からきみもベイジアン~
多項式あてはめで眺めるベイズ推定 ~今日からきみもベイジアン~
 
これならわかる最適化数学8章_動的計画法
これならわかる最適化数学8章_動的計画法これならわかる最適化数学8章_動的計画法
これならわかる最適化数学8章_動的計画法
 
Bayesian Structural Equations Modeling (SEM)
Bayesian Structural Equations Modeling (SEM)Bayesian Structural Equations Modeling (SEM)
Bayesian Structural Equations Modeling (SEM)
 
Classifying and understanding financial data using graph neural network
Classifying and understanding financial data using graph neural networkClassifying and understanding financial data using graph neural network
Classifying and understanding financial data using graph neural network
 

Similar to Stock prediction using social network

Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter Data
Sumit Raj
 
IRJET- Sentimental Analysis of Twitter for Stock Market Investment
IRJET- Sentimental Analysis of Twitter for Stock Market InvestmentIRJET- Sentimental Analysis of Twitter for Stock Market Investment
IRJET- Sentimental Analysis of Twitter for Stock Market Investment
IRJET Journal
 
Sentiment analysis using machine learning and deep Learning
Sentiment analysis using machine learning and deep LearningSentiment analysis using machine learning and deep Learning
Sentiment analysis using machine learning and deep Learning
Venkat Projects
 
IRJET- A Real-Time Twitter Sentiment Analysis and Visualization System: Twisent
IRJET- A Real-Time Twitter Sentiment Analysis and Visualization System: TwisentIRJET- A Real-Time Twitter Sentiment Analysis and Visualization System: Twisent
IRJET- A Real-Time Twitter Sentiment Analysis and Visualization System: Twisent
IRJET Journal
 
IRJET - Twitter Sentiment Analysis using Machine Learning
IRJET -  	  Twitter Sentiment Analysis using Machine LearningIRJET -  	  Twitter Sentiment Analysis using Machine Learning
IRJET - Twitter Sentiment Analysis using Machine Learning
IRJET Journal
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
Mahir Haque
 
UTILIZING TWITTER TO PERFORM AUTONOMOUS SENTIMENT ANALYSIS
UTILIZING TWITTER TO PERFORM AUTONOMOUS SENTIMENT ANALYSISUTILIZING TWITTER TO PERFORM AUTONOMOUS SENTIMENT ANALYSIS
UTILIZING TWITTER TO PERFORM AUTONOMOUS SENTIMENT ANALYSIS
IRJET Journal
 
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...
IRJET Journal
 
Twitter sentimentanalysis report
Twitter sentimentanalysis reportTwitter sentimentanalysis report
Twitter sentimentanalysis report
Savio Aberneithie
 
THE ANALYSIS FOR CUSTOMER REVIEWS THROUGH TWEETS, BASED ON DEEP LEARNING
THE ANALYSIS FOR CUSTOMER REVIEWS THROUGH TWEETS, BASED ON DEEP LEARNINGTHE ANALYSIS FOR CUSTOMER REVIEWS THROUGH TWEETS, BASED ON DEEP LEARNING
THE ANALYSIS FOR CUSTOMER REVIEWS THROUGH TWEETS, BASED ON DEEP LEARNING
IRJET Journal
 
IRJET- Interpreting Public Sentiments Variation by using FB-LDA Technique
IRJET- Interpreting Public Sentiments Variation by using FB-LDA TechniqueIRJET- Interpreting Public Sentiments Variation by using FB-LDA Technique
IRJET- Interpreting Public Sentiments Variation by using FB-LDA Technique
IRJET Journal
 
[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah
[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah
[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah
IJET - International Journal of Engineering and Techniques
 
Analysis and Prediction of Sentiments for Cricket Tweets using Hadoop
Analysis and Prediction of Sentiments for Cricket Tweets using HadoopAnalysis and Prediction of Sentiments for Cricket Tweets using Hadoop
Analysis and Prediction of Sentiments for Cricket Tweets using Hadoop
IRJET Journal
 
A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...
A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...
A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...
csandit
 
A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...
A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...
A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...
cscpconf
 
Neural Network Based Context Sensitive Sentiment Analysis
Neural Network Based Context Sensitive Sentiment AnalysisNeural Network Based Context Sensitive Sentiment Analysis
Neural Network Based Context Sensitive Sentiment Analysis
Editor IJCATR
 
IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...
IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...
IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...
IRJET Journal
 
IRJET- Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
IRJET-  	  Improved Real-Time Twitter Sentiment Analysis using ML & Word2VecIRJET-  	  Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
IRJET- Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
IRJET Journal
 
A Survey on Analysis of Twitter Opinion Mining using Sentiment Analysis
A Survey on Analysis of Twitter Opinion Mining using Sentiment AnalysisA Survey on Analysis of Twitter Opinion Mining using Sentiment Analysis
A Survey on Analysis of Twitter Opinion Mining using Sentiment Analysis
IRJET Journal
 
Sentiment analysis using machine learning
Sentiment analysis using machine learningSentiment analysis using machine learning
Sentiment analysis using machine learning
Venkat Projects
 

Similar to Stock prediction using social network (20)

Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter Data
 
IRJET- Sentimental Analysis of Twitter for Stock Market Investment
IRJET- Sentimental Analysis of Twitter for Stock Market InvestmentIRJET- Sentimental Analysis of Twitter for Stock Market Investment
IRJET- Sentimental Analysis of Twitter for Stock Market Investment
 
Sentiment analysis using machine learning and deep Learning
Sentiment analysis using machine learning and deep LearningSentiment analysis using machine learning and deep Learning
Sentiment analysis using machine learning and deep Learning
 
IRJET- A Real-Time Twitter Sentiment Analysis and Visualization System: Twisent
IRJET- A Real-Time Twitter Sentiment Analysis and Visualization System: TwisentIRJET- A Real-Time Twitter Sentiment Analysis and Visualization System: Twisent
IRJET- A Real-Time Twitter Sentiment Analysis and Visualization System: Twisent
 
IRJET - Twitter Sentiment Analysis using Machine Learning
IRJET -  	  Twitter Sentiment Analysis using Machine LearningIRJET -  	  Twitter Sentiment Analysis using Machine Learning
IRJET - Twitter Sentiment Analysis using Machine Learning
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
 
UTILIZING TWITTER TO PERFORM AUTONOMOUS SENTIMENT ANALYSIS
UTILIZING TWITTER TO PERFORM AUTONOMOUS SENTIMENT ANALYSISUTILIZING TWITTER TO PERFORM AUTONOMOUS SENTIMENT ANALYSIS
UTILIZING TWITTER TO PERFORM AUTONOMOUS SENTIMENT ANALYSIS
 
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...
 
Twitter sentimentanalysis report
Twitter sentimentanalysis reportTwitter sentimentanalysis report
Twitter sentimentanalysis report
 
THE ANALYSIS FOR CUSTOMER REVIEWS THROUGH TWEETS, BASED ON DEEP LEARNING
THE ANALYSIS FOR CUSTOMER REVIEWS THROUGH TWEETS, BASED ON DEEP LEARNINGTHE ANALYSIS FOR CUSTOMER REVIEWS THROUGH TWEETS, BASED ON DEEP LEARNING
THE ANALYSIS FOR CUSTOMER REVIEWS THROUGH TWEETS, BASED ON DEEP LEARNING
 
IRJET- Interpreting Public Sentiments Variation by using FB-LDA Technique
IRJET- Interpreting Public Sentiments Variation by using FB-LDA TechniqueIRJET- Interpreting Public Sentiments Variation by using FB-LDA Technique
IRJET- Interpreting Public Sentiments Variation by using FB-LDA Technique
 
[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah
[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah
[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah
 
Analysis and Prediction of Sentiments for Cricket Tweets using Hadoop
Analysis and Prediction of Sentiments for Cricket Tweets using HadoopAnalysis and Prediction of Sentiments for Cricket Tweets using Hadoop
Analysis and Prediction of Sentiments for Cricket Tweets using Hadoop
 
A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...
A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...
A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...
 
A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...
A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...
A MODEL BASED ON SENTIMENTS ANALYSIS FOR STOCK EXCHANGE PREDICTION - CASE STU...
 
Neural Network Based Context Sensitive Sentiment Analysis
Neural Network Based Context Sensitive Sentiment AnalysisNeural Network Based Context Sensitive Sentiment Analysis
Neural Network Based Context Sensitive Sentiment Analysis
 
IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...
IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...
IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...
 
IRJET- Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
IRJET-  	  Improved Real-Time Twitter Sentiment Analysis using ML & Word2VecIRJET-  	  Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
IRJET- Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
 
A Survey on Analysis of Twitter Opinion Mining using Sentiment Analysis
A Survey on Analysis of Twitter Opinion Mining using Sentiment AnalysisA Survey on Analysis of Twitter Opinion Mining using Sentiment Analysis
A Survey on Analysis of Twitter Opinion Mining using Sentiment Analysis
 
Sentiment analysis using machine learning
Sentiment analysis using machine learningSentiment analysis using machine learning
Sentiment analysis using machine learning
 

Recently uploaded

Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
RaedMohamed3
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
EugeneSaldivar
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
TechSoup
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
GeoBlogs
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
Vivekanand Anglo Vedic Academy
 
How to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERPHow to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERP
Celine George
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf

Stock prediction using social network

  • 1. Stock Prediction Using Social Network Data Rohit Tiwari (rtiwari2) Chanon Hongsirikulkit (hongsir2)
  • 2. Outline - Introduction - Data Sources - APIs - Filter Relevant Data - Text Normalization - Noise Removal - Feature Extraction - Topic Modeling - Sentiment Analysis - Tweet Features - Prediction Model Construction - Conclusion - Future Works
  • 3. Fake Tweet -> Stocks Plunged
  • 4. Introduction - A social network is a communication platform that contains hidden, valuable knowledge - Information on social networks can reflect real-world events - Many studies exploit this information to enhance application capabilities - Analyzing tweets that express information needs (Zhao and Mei 2013) - Applying tweet-rate to predict movie box-office revenues (Asur and Huberman 2010) - Our survey focuses on using social network data to predict stock market movement - A false message on Twitter, “BREAKING: Two Explosions in the White House and Barack Obama is injured.” -> The Dow Jones and S&P 500 indexes dropped by close to 1%, the equivalent of hundreds of billions of dollars changing hands. - In August 2012, an Italian journalist set up a fake Twitter account for a member of Russia's government and tweeted that the president of Syria had been killed, causing brief fluctuations in the oil markets. http://www.telegraph.co.uk/finance/markets/10013768/Bogus-AP-tweet-about-explosion-at-the-White-House-wipes-billions-off-US-markets.html
  • 5. Formal Description: The Efficient Market Hypothesis (EMH) - The EMH states that financial markets are informationally efficient: market prices reflect all available information. It implies that market prices reflect changes in investor behavior, since investors take new information into account and act accordingly. - Research asserts that investors' rational considerations are influenced by psychological biases and emotions. - For several decades, direct surveys have been the prominent method for estimating public mood and investor sentiment. However, explicitly stated responses can be manipulated or reported inaccurately, and surveys cannot take behavior-based indicators into consideration. J. Bollen and H. Mao, “Twitter Mood as a Stock Market Predictor,” Computer, vol. 44, no. 10, pp. 91-94, 2011.
  • 6. General Methodology for Stock Prediction - Data Sources -> (via APIs) -> Relevant Dataset - Data Preprocessing: Text Filter, Text Normalization, Noise Removal - Feature Extraction: Topic Modeling, Sentiment Analysis, Tweet Features -> Features - Classifiers: Training Data, Testing - Results: Correlation / Prediction Capability
  • 7. Data Sources - Twitter (Asur and Huberman 2010; Bollen and Mao 2011; Zhao and Mei 2013; Arias et al. 2015) - Streaming API -> collects real-time tweets - Search API -> searches and collects historical tweets from the past week - Yahoo Finance (Nguyen et al. 2015) - Collect historical stock prices - Collect posts from the Yahoo Finance message board - Sina Weibo (Liu et al. 2015) - A Chinese microblogging service similar to Twitter
  • 8. Filter Relevant Data from Corpus - Data collected from social networks contains both data relevant to our specific domain and irrelevant data - We need to keep only the relevant data - Approaches used in the literature - Filter by keywords -> exploit hashtags or cashtags in the messages - Apply LDA for topic modeling and then keep only the related topics (Arias et al. 2015) M. Arias, A. Arratia, and R. Xuriguera, “Forecasting with Twitter Data,” ACM Transactions on Intelligent Systems and Technology, vol. 5, no. 1, pp. 1-24, 2015.
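The keyword-based filter described on this slide can be sketched as follows. This is a minimal illustration, not the method of any cited paper: the keyword set, the tag pattern, and the function names are all assumptions made for the example.

```python
import re

# Hypothetical keyword list (assumption): cashtags and hashtags for one company.
KEYWORDS = {"$aapl", "#apple"}

# Matches cashtags such as $AAPL and hashtags such as #Apple.
TAG_PATTERN = re.compile(r"[$#]\w+")


def is_relevant(tweet, keywords=KEYWORDS):
    """Return True if the tweet contains at least one tracked tag."""
    tags = {tag.lower() for tag in TAG_PATTERN.findall(tweet)}
    return bool(tags & keywords)


def filter_relevant(tweets, keywords=KEYWORDS):
    """Keep only the tweets that mention a tracked cashtag or hashtag."""
    return [tweet for tweet in tweets if is_relevant(tweet, keywords)]
```

For example, `filter_relevant(["Buy $AAPL now", "nice weather"])` keeps only the first tweet.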
  • 9. Text Normalization The primary step in refining the data. It can involve the following tasks: - Stop word removal - Punctuation removal - Lowercase conversion - Compressing -> transform “Haaappyyyy” to “Happy”. This is done over multiple iterations, and the result is validated with a dictionary lookup at the end.
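The normalization tasks listed on this slide can be sketched roughly as below. The stop-word list and dictionary are toy placeholders (assumptions), and the compression step is one possible reading of "multiple iterations plus dictionary lookup": it tries keeping one or two copies of each repeated letter until a dictionary word is found.

```python
import itertools
import re
import string

# Toy resources (assumptions); a real system would load full word lists.
STOP_WORDS = {"i", "am", "so", "the", "a", "is"}
DICTIONARY = {"happy", "today", "good"}


def compress(word, dictionary=DICTIONARY):
    """Collapse elongated words such as 'haaappyyyy' to a dictionary word."""
    if word in dictionary:
        return word
    # Split the word into runs of identical characters: h, aaa, pp, yyyy.
    runs = [m.group(0) for m in re.finditer(r"(.)\1*", word)]
    # A repeated run may keep two copies or one; a single letter stays as is.
    options = [(r[0],) if len(r) == 1 else (r[0] * 2, r[0]) for r in runs]
    for candidate in itertools.product(*options):
        joined = "".join(candidate)
        if joined in dictionary:
            return joined
    return word  # no dictionary match: leave the word unchanged


def normalize(text):
    """Lowercase, strip punctuation, compress elongations, drop stop words."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    tokens = [compress(token) for token in text.split()]
    return [token for token in tokens if token not in STOP_WORDS]
```

Note that stripping all punctuation also removes `$` and `#`, so any cashtag or hashtag filtering would have to run before this step.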
  • 10. Noise Removal in Tweets - Noise removal relies on standard tools that use IDF to identify and remove highly frequent terms. - Named entity recognition (NER) system - An NER system based on a conditional random fields (CRF) model determines whether a tweet contains named entities related to companies (or other features). If the tweet does not contain any named entity from the company's keyword list, it is removed.
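One way to read the IDF-based step is: terms that appear in almost every tweet have a low inverse document frequency and carry little signal, so they can be dropped. The sketch below follows that reading; the `min_idf` threshold and the function names are assumptions, and the slide does not say which tool or cutoff the surveyed papers used.

```python
import math
from collections import Counter


def inverse_document_frequency(docs):
    """Map each term to log(N / df), where df counts documents containing it."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n_docs / count) for term, count in doc_freq.items()}


def drop_frequent_terms(docs, min_idf=0.3):
    """Remove terms whose IDF falls below min_idf (a hypothetical cutoff)."""
    idf = inverse_document_frequency(docs)
    return [[term for term in doc if idf[term] >= min_idf] for doc in docs]
```

Here each document is a list of tokens, for example the output of a normalization step.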
  • 12. Topic Modelling - Some studies use the topics of messages as features for the forecasting model - Many approaches have been proposed for topic extraction - Extract n-grams (unigrams or bigrams) - Latent Dirichlet Allocation (LDA) - Joint Sentiment-Topic (JST) -> extracts sentiment information and topics from text data simultaneously - Aspect-based sentiment -> extracts topics first and then calculates sentiment scores based on the distance between topics and emotion words and the importance of each topic (Nguyen et al. 2015)
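The simplest of the approaches listed here, n-gram extraction, can be sketched in a few lines. The function names and the choice of raw counts as the feature value are assumptions for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(tokens):
    """Count unigrams and bigrams together, usable as bag-of-n-grams features."""
    return Counter(ngrams(tokens, 1) + ngrams(tokens, 2))
```

For instance, `ngrams(["stocks", "fall", "sharply"], 2)` yields the two bigrams `("stocks", "fall")` and `("fall", "sharply")`.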
  • 13. Aspect-based sentiment algorithm - Extract topics first, then calculate sentiment scores based on the distance between topics and emotion words and the importance of each topic (Nguyen et al. 2015) - [Figure: algorithm for extracting topics from the dataset] - [Figure: algorithm for extracting topics and their sentiment values] T. H. Nguyen, K. Shirai, and J. Velcin, “Sentiment analysis on social media for stock movement prediction,” Expert Systems with Applications, vol. 42, no. 24, pp. 9603-9611, 2015.
  • 14. Sentiment Analysis - Some studies use sentiment information from social networks as features for their models - There are two ways to extract sentiment scores - Use existing software to calculate sentiment scores - Construct a classifier for sentiment classification - Popular tools - GPOMS -> categorizes people's emotions into 6 categories: calm, alert, sure, vital, kind, and happy - OpinionFinder (OF) -> classifies sentiment as positive or negative
  • 15. Constructing Sentiment Classifier - Have experts annotate sentiment data and use it as training data - Extract features from the training data -> n-grams, POS tagging - Use a classifier (SVM, Linear Regression Model) to learn from the training data - Apply the classifier to the entire collection
  • 16. Extracting Sentiment Features After classifying the sentiment data, we can generate sentiment features in various ways. Examples of sentiment features used in some studies: - Average daily sentiment score - Sentiment index = Number of positive tweets / Total number of tweets - PNRatio = Number of positive tweets / Number of negative tweets - Sentiment polarity = (ptw - ntw) / (ptw + ntw) - ptw: number of positive tweets - ntw: number of negative tweets
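The three ratio features defined on this slide translate directly into code. In the sketch below, the label strings (`"pos"`, `"neg"`, anything else counted as neutral) and the handling of empty or all-positive days are assumptions; the slide only gives the formulas.

```python
def sentiment_features(labels):
    """Compute sentiment index, PNRatio and polarity for one day's tweets.

    labels: list of per-tweet sentiment labels, e.g. "pos", "neg", "neu".
    """
    ptw = labels.count("pos")   # number of positive tweets
    ntw = labels.count("neg")   # number of negative tweets
    total = len(labels)
    return {
        # Sentiment index = positive tweets / all tweets
        "sentiment_index": ptw / total if total else 0.0,
        # PNRatio = positive tweets / negative tweets (undefined if ntw == 0)
        "pn_ratio": ptw / ntw if ntw else None,
        # Sentiment polarity = (ptw - ntw) / (ptw + ntw)
        "polarity": (ptw - ntw) / (ptw + ntw) if (ptw + ntw) else 0.0,
    }
```

Returning `None` for an undefined PNRatio is a design choice for this sketch; a real pipeline might instead smooth the counts so the ratio is always defined.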
  • 17. Sentiment Features Testing - The goal is to ensure that sentiment information reflects real-world events and can be used for prediction - Approaches used in the literature (Bollen and Mao 2011) - Causality testing: tests the correlation between sentiment information and stock market indicators (DJIA / VIX) - Self-organizing fuzzy neural network (SOFNN): tests the predictive capability of sentiment information J. Bollen and H. Mao, “Twitter Mood as a Stock Market Predictor,” Computer, vol. 44, no. 10, pp. 91-94, 2011.
  • 18. Extracting Tweet Features Useful quantifiable information can be extracted from the corpus: - Number of followers of the company, or of the famous personality tweeting about the company (a typical problem for the MapReduce framework) - Tweet volume (related to a specific identity or hashtag) - Retweet volume (related to a specific hashtag coupled with an identity) - Tweet-rate = Number of tweets / Duration over which those tweets were generated - Tweet length
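The tweet-rate formula above can be sketched as below. Measuring the duration in hours and taking it as the span between the earliest and latest timestamp are assumptions; the slide does not fix a unit or a window definition.

```python
from datetime import datetime


def tweet_rate(timestamps):
    """Tweets per hour over the span covered by the given timestamps."""
    span = max(timestamps) - min(timestamps)
    hours = span.total_seconds() / 3600
    if hours == 0:
        raise ValueError("all tweets share one timestamp; rate is undefined")
    return len(timestamps) / hours
```

Four tweets posted between 09:00 and 11:00 on the same day therefore give a rate of 2.0 tweets per hour.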
  • 19. Prediction Model Construction 1. Combine features from previous step - Topic features - Sentiment features - Tweet features - Stock historical price features (additional features)
  • 20. Google Heat Map: Gives a fair idea of how any kind of information is concentrated geographically, e.g., Facebook trends
  • 21. Iterative Training & Validation 2. Train the classifier -> SVM, Linear Regression, Neural Networks 3. Test and evaluate the model - The most popular method is a windowing mechanism, in which the model groups tweets into a window (w1) spanning several days and analyzes their sentiment or other features. - In the subsequent window (w2) of 1-2 days, stock indices are measured. - w1 and w2 are then formally analyzed together to find interesting patterns.
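The windowing mechanism can be sketched as a function that slides over two aligned daily series and emits (features, label) pairs. The window lengths, the up/down label, and the function name are assumptions; the slide only says that w1 holds tweet features and w2 holds the subsequent index values.

```python
def make_windows(daily_sentiment, daily_close, w1=3, w2=1):
    """Pair w1 days of sentiment with the index movement over the next w2 days.

    daily_sentiment and daily_close must be aligned, one value per trading day.
    Label is 1 if the close at the end of w2 is above the close at the end
    of w1, else 0.
    """
    if len(daily_sentiment) != len(daily_close):
        raise ValueError("series must have the same length")
    samples = []
    for start in range(len(daily_sentiment) - w1 - w2 + 1):
        features = daily_sentiment[start:start + w1]
        close_end_w1 = daily_close[start + w1 - 1]
        close_end_w2 = daily_close[start + w1 + w2 - 1]
        samples.append((features, 1 if close_end_w2 > close_end_w1 else 0))
    return samples
```

The resulting pairs can be split chronologically into training and test sets so that the classifier is never evaluated on days that precede its training data.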
  • 22. Correlation of Sentiments & Indices This involves formally testing the causal correlation between social network sentiment and stock market indices such as the Dow Jones, NASDAQ, NYSE, and VIX. M. Arias, A. Arratia, and R. Xuriguera, “Forecasting with Twitter Data,” ACM Transactions on Intelligent Systems and Technology, vol. 5, no. 1, pp. 1-24, 2015. T. H. Nguyen, K. Shirai, and J. Velcin, “Sentiment analysis on social media for stock movement prediction,” Expert Systems with Applications, vol. 42, no. 24, pp. 9603-9611, 2015.
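As a simple first check before any formal causality test, one can compute the Pearson correlation between sentiment on day t and the index value on day t + lag. The sketch below is only that: a lagged correlation, which is weaker than the causality testing the cited papers perform. The function names and the lag convention are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(sentiment, index_values, lag=1):
    """Correlate sentiment on day t with the index value on day t + lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    return pearson(sentiment[:len(sentiment) - lag], index_values[lag:])
```

Scanning a range of lags (for example 0 to 7 days) shows at which delay, if any, sentiment lines up most closely with the index.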
  • 23. Conclusion - Information on social networks reflects real-world events - Social network data can be used to predict stock market movement to a certain degree - The knowledge extracted from social media can be applied to different applications - Individual stock price prediction - Predicting the box-office revenue of a movie - Presidential/Senate election prediction based on campaign data.
  • 24. Future Works - Work with datasets covering a longer duration -> some current studies use only 15 transaction dates - Combining information from different data sources might improve prediction accuracy -> Twitter is known to contain a lot of noisy data - Come up with new features, such as the credibility of tweets -> most current research focuses on topic + sentiment without considering the reliability of the data
  • 25. References
    [1] M. Arias, A. Arratia, and R. Xuriguera, “Forecasting with Twitter Data,” ACM Transactions on Intelligent Systems and Technology, vol. 5, no. 1, pp. 1-24, 2015.
    [2] L. Liu, J. Wu, P. Li, and Q. Li, “A social-media-based approach to predicting stock comovement,” Expert Systems with Applications, vol. 42, no. 8, pp. 3893-3901, 2015.
    [3] T. H. Nguyen, K. Shirai, and J. Velcin, “Sentiment analysis on social media for stock movement prediction,” Expert Systems with Applications, vol. 42, no. 24, pp. 9603-9611, 2015.
    [4] S. Asur and B. A. Huberman, “Predicting the Future with Social Media,” 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, pp. 492-499, 2010.
    [5] Z. Zhao and Q. Mei, “Questions about questions: an empirical analysis of information needs on Twitter,” Proceedings of the 22nd International Conference on World Wide Web, May 13-17, 2013, Rio de Janeiro, Brazil.
    [6] J. Bollen and H. Mao, “Twitter Mood as a Stock Market Predictor,” Computer, vol. 44, no. 10, pp. 91-94, 2011.
    [7] J. Si, A. Mukherjee, B. Liu, Q. Li, H. Li, and X. Deng, “Exploiting Topic based Twitter Sentiment for Stock Prediction,” Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pp. 24-29, 2013.
    [8] X. Zhang, H. Fuehres, and P. A. Gloor, “Predicting Stock Market Indicators Through Twitter “I hope it is not as bad as I fear”,” The 2nd Collaborative Innovation Networks Conference - COINs2010, vol. 26, pp. 55-62, 2011.
    [9] G. Ranco, D. Aleksovski, G. Caldarelli, M. Grčar, and I. Mozetič, “The Effects of Twitter Sentiment on Stock Price Returns,” PLoS ONE, vol. 10, no. 9, pp. 1-21, 2015.
    [10] T. T. Vu, S. Chang, Q. T. Ha, and N. Collier, “An Experiment in Integrating Sentiment Features for Tech Stock Prediction in Twitter,” Workshop on Information Extraction and Entity Analytics on Social Media Data, pp. 23-38, 2012.