This document describes research on using deep learning models to predict stock market movements based on news events. It presents a method to extract event representations from news articles, generalize the events, embed the events, and feed the embedded events into deep learning models. Experimental results show that using embedded events as inputs to convolutional neural networks achieves more accurate stock market predictions than baseline methods, and that modeling long-, mid-, and short-term event effects further improves performance. The research demonstrates that deep learning can effectively capture hidden relationships between news events and stock prices.
2. My research areas
• Machine Learning
• Natural Language Processing
• Applications
– Text synthesis
– Machine translation
– Information extraction
– Market prediction
– Sentiment analysis
– Syntactic analysis
3. This talk
• Reading news from the Internet and predicting the stock market
5. Introduction
• Is it possible?
– Random walk theory
– Efficient market hypothesis
– Human/algorithmic trading
• Examples
– Shares of Apple Inc. fell as trading began in New York on Tuesday morning, the day after former CEO Steve Jobs passed away.
– Google's stock falls after grim earnings come out early.
6. Why events?
• Previous work
– Bag-of-words
– Named entities
– Noun phrases
• Examples
– Oracle Corp would sue Google Inc., claiming Google's Android operating system…
– Microsoft agrees to buy Nokia's mobile phone business for $7.2 billion.
9. Method
• Event generalization
– First, we construct a morphological analysis tool based on the WordNet stemmer to extract the lemma forms of inflected words.
– Second, we generalize each verb to its class name in VerbNet.
• For example
– Instant view: Private sector adds 114,000 jobs in July.
– (Private sector, adds, 114,000 jobs)
– (private sector, multiply_class, 114,000 job)
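The two-step generalization above can be sketched as follows. The lemma and verb-class tables here are tiny hand-written stand-ins for the WordNet stemmer and VerbNet lookups, so the mappings (and the function names) are illustrative only:

```python
# Illustrative stand-ins for the WordNet stemmer and VerbNet class lookup.
LEMMAS = {"adds": "add", "jobs": "job"}
VERB_CLASSES = {"add": "multiply_class"}

def lemmatize(word):
    # Step 1: reduce an inflected word to its lemma form (lowercased).
    return LEMMAS.get(word.lower(), word.lower())

def generalize_event(event):
    """(actor, action, object) -> lemmatized, verb-generalized event."""
    actor, action, obj = event
    actor = " ".join(lemmatize(w) for w in actor.split())
    obj = " ".join(lemmatize(w) for w in obj.split())
    # Step 2: replace the verb with its VerbNet-style class name when known.
    action = VERB_CLASSES.get(lemmatize(action), lemmatize(action))
    return (actor, action, obj)

print(generalize_event(("Private sector", "adds", "114,000 jobs")))
# -> ('private sector', 'multiply_class', '114,000 job')
```

A real system would query WordNet and VerbNet instead of the hard-coded tables, but the pipeline shape is the same.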
10. Method
• Model
– Input: events
– Output: two-way movement
• Training: historical data
• Testing: incoming data
11. Method
• Prediction model
– Linear model
• Most previous work uses linear models to predict the stock market. To allow direct comparisons, this paper constructs a linear prediction model using an SVM with a linear kernel.
– Nonlinear model
• Intuitively, the relationship between events and the stock market may be more complex than linear, owing to hidden and indirect relationships. We exploit a deep neural network model, whose hidden layers are useful for learning such hidden relationships.
12. [Figure: deep neural network for prediction. News documents yield input features φ1, φ2, φ3, …, φM in the input layer; hidden layers connect to an output layer with class +1 (the polarity of the stock price movement is positive) and class -1 (the polarity of the stock price movement is negative).]
13. Method
• Feature representation
– Bag-of-words
• TF*IDF
– Events
• O1, P, O2, O1 + P, P + O2, O1 + P + O2
• For example
– (Microsoft, buy, Nokia's mobile phone business)
– (#arg1=Microsoft, #action=buy, #arg2=Nokia's mobile phone business, #arg1_action=Microsoft buy, #action_arg2=buy Nokia's mobile phone business, #arg1_action_arg2=Microsoft buy Nokia's mobile phone business)
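A minimal sketch of these feature templates, using the argument names from the example (the function name is ours):

```python
def event_features(o1, p, o2):
    """Expand a structured event (O1, P, O2) into the six indicator
    features listed above: each argument alone plus the combinations."""
    return {
        "#arg1": o1,
        "#action": p,
        "#arg2": o2,
        "#arg1_action": f"{o1} {p}",
        "#action_arg2": f"{p} {o2}",
        "#arg1_action_arg2": f"{o1} {p} {o2}",
    }

feats = event_features("Microsoft", "buy", "Nokia's mobile phone business")
print(feats["#arg1_action"])   # -> Microsoft buy
```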
14. Experiments
• Data description
– We use publicly available financial news from Reuters and Bloomberg over the period from October 2006 to November 2013. This time span witnesses a severe economic downturn in 2007-2010, followed by a modest recovery in 2011-2013. There are 106,521 documents in total from Reuters News and 447,145 from Bloomberg News.
– We mainly focus on predicting the Standard & Poor's 500 stock (S&P 500) index, obtaining indices and stock prices from Yahoo Finance.
21. Conclusion
• Events are useful.
– Events are a more useful representation than bags-of-words for the task of stock market prediction.
• Hidden relations are useful.
– A deep neural network model can be more accurate at predicting the stock market than a linear model.
• Robust results obtained.
– Our approach achieves stable experimental results on S&P 500 index prediction and individual stock prediction over a large amount of data (eight years of stock prices and more than 550,000 news articles).
• Quality of information is more important than quantity.
– The most relevant information (i.e., news titles vs. news content, individual company news vs. all news) is better than more, but less relevant, information.
23. Two extensions
• Event sparsity
– Using structured events to predict stock market movement suffers from increased data sparsity.
– (Actor = Microsoft, Action = sues, Object = Barnes & Noble)
24. Two extensions
• Modeling the long-term effect of events
– The effect becomes weaker over time
– Little work has been done on this
28. Neural Tensor Network for Event Embedding
• R1 = f(O1^T T1^[1:k] P + W [O1; P] + b)
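A minimal numpy sketch of this tensor layer, assuming f = tanh and arbitrary embedding size d and slice count k (both chosen here for illustration):

```python
import numpy as np

# R1 = f(O1^T T1^[1:k] P + W [O1; P] + b), with f = tanh (assumed).
d, k = 5, 3
rng = np.random.default_rng(1)
O1, P = rng.standard_normal(d), rng.standard_normal(d)  # actor / action embeddings
T = rng.standard_normal((k, d, d))                      # tensor with k slices
W = rng.standard_normal((k, 2 * d))                     # standard-layer weights
b = np.zeros(k)

# Bilinear term: one scalar O1^T T^[i] P per tensor slice.
bilinear = np.array([O1 @ T[i] @ P for i in range(k)])
# Standard term: linear map of the concatenated embeddings, plus bias.
R1 = np.tanh(bilinear + W @ np.concatenate([O1, P]) + b)
print(R1.shape)   # -> (3,)
```

The same layer composes R1 with the object embedding to build the full event embedding; only the single layer is shown here.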
29. Training
• Minimize the margin loss
• 500 iterations
• Standard back-propagation
• Corrupted examples: randomly replace one argument with another object
• Regularization weight set to 0.0001
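A sketch of this objective, assuming a hinge margin of 1 and L2 regularization; the scores and parameter shapes below are illustrative, and `score_real`/`score_corrupt` stand in for the tensor network's outputs on a real event and its corrupted copy:

```python
import numpy as np

LAMBDA = 1e-4   # regularization weight from the slide

def margin_loss(score_real, score_corrupt, params):
    """Hinge loss: the real event should out-score the corrupted one
    by a margin of 1, plus L2 regularization over all parameters."""
    reg = LAMBDA * sum(float(np.sum(p ** 2)) for p in params)
    return max(0.0, 1.0 - score_real + score_corrupt) + reg

params = [np.ones((2, 2))]
print(margin_loss(0.9, 0.2, params))   # hinge term 0.3 + reg 0.0004
```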
30. Deep Prediction Model
• We model long-, mid-, and short-term events
– Long-term events (last month)
– Mid-term events (last week)
– Short-term events (last day)
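The three windows can be sketched as a simple date filter; the window lengths follow the slide, while the function name and sample data are ours:

```python
from datetime import date, timedelta

def event_windows(events, today):
    """events: list of (date, event) pairs.
    Returns the events falling in each window relative to `today`:
    long = last 30 days, mid = last 7 days, short = last day."""
    windows = {"long": timedelta(days=30), "mid": timedelta(days=7),
               "short": timedelta(days=1)}
    return {name: [e for d, e in events if today - span <= d < today]
            for name, span in windows.items()}

events = [(date(2013, 11, 1), "A"), (date(2013, 11, 25), "B"),
          (date(2013, 11, 29), "C")]
print(event_windows(events, date(2013, 11, 30)))
```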
32. Deep Prediction Model
• Convolution and max-pooling
– Convolution layer to obtain local features
– Max-pooling to determine the globally representative feature
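A minimal sketch of this step, assuming a single filter of width 3 producing scalar local features (real models use many filters in parallel):

```python
import numpy as np

def conv_max_pool(U, filt):
    """U: (n, d) sequence of event embeddings; filt: (w, d) filter.
    Convolution: slide the filter over every window of w consecutive
    events to get local features; max-pooling keeps the strongest one."""
    w = filt.shape[0]
    local = [float(np.sum(U[i:i + w] * filt)) for i in range(len(U) - w + 1)]
    return max(local)

rng = np.random.default_rng(2)
U = rng.standard_normal((7, 4))      # 7 events, embedding size 4
filt = rng.standard_normal((3, 4))   # filter spanning 3 consecutive events
print(conv_max_pool(U, filt))
```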
33. Experiments
• Baselines
  Method                       Input             Model
  Luss and d'Aspremont [2012]  Bag of words      NN
  Ding et al. [2014] (E-NN)    Structured event  NN
  WB-NN                        Word embedding    NN
  WB-CNN                       Word embedding    CNN
  E-CNN                        Structured event  CNN
  EB-NN                        Event embedding   NN
  EB-CNN                       Event embedding   CNN
34. Experiments
• Findings
– Events are better features than words
– Reducing sparsity is helpful in the task
– CNN-based models are more powerful
35. Experiments
• 15 companies from the S&P 500
– Consists of high-, mid-, and low-ranking companies
– Evaluation metrics: accuracy and MCC
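MCC (the Matthews correlation coefficient) can be computed directly from the two-way up/down confusion matrix; a minimal sketch with made-up counts:

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.
    Ranges from -1 (total disagreement) through 0 (chance) to +1 (perfect)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

print(mcc(40, 45, 10, 5))   # mostly correct predictions: well above 0
print(mcc(25, 25, 25, 25))  # -> 0.0, no better than chance
```

Unlike accuracy, MCC stays near zero for a degenerate predictor on an imbalanced up/down split, which is why it is reported alongside accuracy.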
36. Conclusion
• Event-embedding-based document representations are better than discrete-event-based methods
• Deep CNNs can help capture the longer-term influence of news events
37. Current work
• More technical enhancements
• More markets
– China's A-share market
– Chinese syntactic and semantic analysis
– Chinese Open IE