A Hierarchical Model of Reviews for Aspect-based Sentiment Analysis
1. A Hierarchical
Model of
Reviews for
ABSA
Sebastian
Ruder
Introduction
A Brief
History of
ABSA
Task
Data
SotA &
Motivation
DL
Background
Model
Experiments
Results &
Takeaways
Bibliography
A Hierarchical Model of Reviews for
Aspect-based Sentiment Analysis
Sebastian Ruder
PhD Candidate, Social Semantics Unit, Insight Centre, NUIG
Research Scientist, Aylien Ltd., Dublin
24.08.16
Agenda
1 Introduction
2 A brief history of Aspect-based Sentiment Analysis
3 Task description
4 Data
5 State-of-the-art approaches and motivation
6 Deep Learning background
7 Model
8 Experiments
9 Results and takeaways
Introduction
Figure: Aspect-based Sentiment Analysis (ABSA)
A Brief History of Aspect-based Sentiment Analysis
Main driver of research: shared tasks at SemEval
workshops
2014. First SemEval task on ABSA [Pontiki et al., 2014]:
English reviews for laptops and restaurants
2015. Second SemEval task [Pontiki et al., 2015]:
Extension and consolidation of previous subtasks
2016. Third SemEval task on ABSA [Pontiki et al., 2016]:
Extension to new languages and domains
Task Description
Subtask 1. Sentence-level ABSA:
Slot 1. Aspect category: FOOD#QUALITY, FOOD#PRICE,
etc.
Slot 2. Opinion Target Expression: food, service, etc.
Slot 3. Sentiment Polarity: positive, negative,
neutral
Subtask 2. Text-level ABSA: FOOD#QUALITY:
positive, FOOD#PRICE: negative, etc.
Subtask 3. Out-of-domain ABSA.
Data
Language  Domain       # of reviews  # of sentences
English   Restaurants   440          2676
English   Laptops       530          3303
Arabic    Hotels       2291          6029
Chinese   Phones        200          9521
Chinese   Cameras       200          8040
Dutch     Restaurants   400          2286
Dutch     Phones        270          1697
French    Restaurants   455          2429
Russian   Restaurants   405          4299
Spanish   Restaurants   913          2951
Turkish   Restaurants   339          1248
Table: Number of reviews and sentences for every language-domain
pair in the SemEval 2016 ABSA task [Pontiki et al., 2016].
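Review length varies widely across these datasets, which matters for a model that reads a review sentence by sentence. As a rough illustration, the snippet below derives the average number of sentences per review from the counts in the table. The names `DATASETS` and `avg_sentences` are chosen for this sketch only.

```python
# (language, domain): (number of reviews, number of sentences), copied from the table.
DATASETS = {
    ("English", "Restaurants"): (440, 2676),
    ("English", "Laptops"): (530, 3303),
    ("Arabic", "Hotels"): (2291, 6029),
    ("Chinese", "Phones"): (200, 9521),
    ("Chinese", "Cameras"): (200, 8040),
    ("Dutch", "Restaurants"): (400, 2286),
    ("Dutch", "Phones"): (270, 1697),
    ("French", "Restaurants"): (455, 2429),
    ("Russian", "Restaurants"): (405, 4299),
    ("Spanish", "Restaurants"): (913, 2951),
    ("Turkish", "Restaurants"): (339, 1248),
}

# Average number of sentences per review for every language-domain pair.
avg_sentences = {
    pair: sentences / reviews for pair, (reviews, sentences) in DATASETS.items()
}

longest = max(avg_sentences, key=avg_sentences.get)
shortest = min(avg_sentences, key=avg_sentences.get)

# Print the pairs from shortest to longest average review.
for pair, avg in sorted(avg_sentences.items(), key=lambda kv: kv[1]):
    print(f"{pair[0]:8} {pair[1]:12} {avg:5.1f}")
```

The Chinese datasets have by far the longest reviews on average, the Arabic hotel reviews the shortest.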
An example sentence.
<sentence id="347:0">
    <text>I bought it for really cheap also and its AMAZING.</text>
    <Opinions>
        <Opinion category="LAPTOP#PRICE" polarity="positive"/>
        <Opinion category="LAPTOP#GENERAL" polarity="positive"/>
    </Opinions>
</sentence>
Figure: Example XML entry in a SemEval 2016 ABSA dataset.
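An entry of this shape can be read with Python's standard library. The following is a minimal sketch that assumes the element and attribute names from the example entry (`sentence`, `text`, `Opinion`, `category`, `polarity`). The helper `parse_sentence` is a name introduced here, not part of any SemEval tooling.

```python
import xml.etree.ElementTree as ET

# Sample entry copied from the example (laptop domain).
SAMPLE = """<sentence id="347:0">
    <text>I bought it for really cheap also and its AMAZING.</text>
    <Opinions>
        <Opinion category="LAPTOP#PRICE" polarity="positive"/>
        <Opinion category="LAPTOP#GENERAL" polarity="positive"/>
    </Opinions>
</sentence>"""


def parse_sentence(xml_string):
    """Return (sentence id, text, [(category, polarity), ...]) for one <sentence>."""
    sentence = ET.fromstring(xml_string)
    text = sentence.findtext("text", default="")
    # Collect every opinion annotation attached to the sentence.
    opinions = [
        (opinion.get("category"), opinion.get("polarity"))
        for opinion in sentence.iter("Opinion")
    ]
    return sentence.get("id"), text, opinions


sentence_id, text, opinions = parse_sentence(SAMPLE)
print(sentence_id, opinions)
```

Each `(category, polarity)` pair is one training instance for the sentiment slot: the same sentence appears once per aspect category.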
State-of-the-art Approaches and Motivation
State-of-the-art approaches use a lot of additional
information, e.g. domain-specific parsers and lexicons
[Brun et al., 2014, Brun et al., 2016] as well as large
sentiment lexicons [Kumar et al., 2016]
Can we achieve performance that is on par or better
using only the information contained in the review?
What information can we leverage?
The sentence.
The aspect.
The context of the surrounding sentences / the structure
of the review.
Review structure
Example review: "I love this restaurant. I am amazed at the quality
of the food that they cook with only simple ingredients."
Relations labeled in the figure: Elaboration, Background.
Figure: RST [Mann and Thompson, 1988] structure of an example
review.
RNNs
Recurrent Neural Networks (RNNs) and LSTMs are
state-of-the-art for many text classification and sequence
tagging tasks.
Figure: An RNN takes an input x_t at every time step t and produces
an output h_t.
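One common way to write this recurrence is the simple (Elman-style) RNN below. The slides do not give a formula, so the weight matrices W and U, the bias b, and the tanh nonlinearity are assumptions made for illustration.

```latex
h_t = \tanh\left(W x_t + U h_{t-1} + b\right), \qquad h_0 = 0
```

The output h_t depends on the current input x_t and, through h_{t-1}, on all earlier inputs.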
Bidirectional RNNs
Bidirectional RNNs allow RNNs to "look ahead" and often work
even better in practice.
Figure: A bidirectional RNN: One RNN processes the input
left-to-right; the other one right-to-left. The output y_t at every time
step t is the concatenation of the outputs of the RNNs at the
corresponding time step.
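The caption can be written out as follows. The arrow notation and the generic step functions RNN_fwd and RNN_bwd are notation introduced for this sketch, not taken from the slides.

```latex
\overrightarrow{h}_t = \mathrm{RNN}_{\mathrm{fwd}}\left(x_t, \overrightarrow{h}_{t-1}\right), \qquad
\overleftarrow{h}_t = \mathrm{RNN}_{\mathrm{bwd}}\left(x_t, \overleftarrow{h}_{t+1}\right), \qquad
y_t = \left[\overrightarrow{h}_t \, ; \, \overleftarrow{h}_t\right]
```

The forward state summarizes x_1, ..., x_t and the backward state summarizes x_t, ..., x_T, so y_t has access to the whole sequence.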
LSTM
An LSTM adds input, output, and forget gates to an
RNN, which lets it model the long-range dependencies that are
essential for capturing sentiment.
Figure: An LSTM cell.
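For reference, a widely used formulation of the LSTM cell is given below. The slides do not state which variant is used, so this is an assumed standard form. The sigmoid is written as σ, element-wise multiplication as ⊙, and all weight and bias symbols are introduced here for illustration.

```latex
\begin{aligned}
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate)} \\
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate)} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate)} \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state)} \\
h_t &= o_t \odot \tanh\left(c_t\right) && \text{(output)}
\end{aligned}
```

The additive update of c_t is what allows information to persist over many time steps.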
Putting things together...
Sentence. Use a sentence-level bidirectional LSTM to
capture the sentence context.
Review. Use a review-level bidirectional LSTM to capture
the review context.
Aspect. Feed the aspect representation together with the
sentence representation into the review-level LSTM.
Our model
Example input: the review "Food is great. Service is top notch."
with the aspects FOOD#QUALITY and SERVICE#GENERAL.
Layers, bottom to top: aspect/word embeddings; sentence-level
forward LSTM; sentence-level backward LSTM; review-level forward
LSTM; review-level backward LSTM; output layer.
Figure: The bidirectional hierarchical LSTM (H-LSTM) for ABSA.
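The data flow of the hierarchy can be sketched in a few lines of plain Python. This is an untrained toy, not the authors' implementation: a simple tanh recurrence stands in for the LSTM cell, all weights are random, the sizes are arbitrary, and taking the final state of each direction as the sentence representation is an assumption of this sketch. Every name (`make_rnn`, `run`, `predict`, ...) is hypothetical.

```python
import math
import random

rng = random.Random(0)
EMB, ASP, HID, CLASSES = 4, 2, 3, 3  # toy sizes chosen for this sketch


def rand_vec(n):
    return [rng.uniform(-0.5, 0.5) for _ in range(n)]


def make_rnn(input_dim):
    """Toy tanh recurrent step with random weights; stands in for an LSTM cell."""
    w = [rand_vec(input_dim) for _ in range(HID)]
    u = [rand_vec(HID) for _ in range(HID)]

    def step(x, h):
        return [math.tanh(sum(a * b for a, b in zip(w[j], x)) +
                          sum(a * b for a, b in zip(u[j], h)))
                for j in range(HID)]
    return step


def run(step, inputs):
    """Return the hidden state after each element of the input sequence."""
    h, states = [0.0] * HID, []
    for x in inputs:
        h = step(x, h)
        states.append(h)
    return states


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


sent_fwd, sent_bwd = make_rnn(EMB), make_rnn(EMB)
rev_fwd, rev_bwd = make_rnn(2 * HID + ASP), make_rnn(2 * HID + ASP)
out_w = [rand_vec(2 * HID) for _ in range(CLASSES)]


def predict(review, aspects, word_emb, aspect_emb):
    """Return one distribution over CLASSES sentiment labels per sentence."""
    review_inputs = []
    for words, aspect in zip(review, aspects):
        xs = [word_emb[w] for w in words]
        # Assumption: sentence vector = final forward state joined with the
        # final backward state ("+" on lists is concatenation).
        sentence_vec = run(sent_fwd, xs)[-1] + run(sent_bwd, xs[::-1])[-1]
        # The aspect embedding is appended before the review-level pass.
        review_inputs.append(sentence_vec + aspect_emb[aspect])
    fwd = run(rev_fwd, review_inputs)
    bwd = run(rev_bwd, review_inputs[::-1])[::-1]  # realign to sentence order
    return [softmax([sum(a * b for a, b in zip(row, f + b_)) for row in out_w])
            for f, b_ in zip(fwd, bwd)]


review = [["food", "is", "great"], ["service", "is", "top", "notch"]]
aspects = ["FOOD#QUALITY", "SERVICE#GENERAL"]
word_emb = {w: rand_vec(EMB) for sentence in review for w in sentence}
aspect_emb = {a: rand_vec(ASP) for a in aspects}
probs = predict(review, aspects, word_emb, aspect_emb)
print(probs)
```

Because the review-level pass runs in both directions, the prediction for each sentence can draw on every other sentence in the review, which is the point of the hierarchy.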
Experiments
Hyperparameter tuning on development set
Dropout of 0.5 before and after LSTM cell
Pre-trained 300-dimensional GloVe word embeddings for
English, random embeddings for other languages
(64-dimensional Polyglot embeddings [Al-Rfou et al., 2013]
did not improve performance)
Comparison models:
Best: best model of shared task [Pontiki et al., 2016] for
each domain-language pair
IIT-TUDA: best single model of the competition
[Kumar et al., 2016]
CNN: sentence-level convolutional neural network
[Ruder et al., 2016]
LSTM: sentence-level Bi-LSTM
Results
Language  Domain       Best  IIT-TUDA  CNN   LSTM  H-LSTM
English   Restaurants  88.1  86.7      82.1  81.4  85.3
Spanish   Restaurants  83.6  83.6      79.6  75.7  79.5
French    Restaurants  78.8  72.2      73.2  69.8  73.6
Russian   Restaurants  77.9  73.6      75.1  73.9  78.1
Dutch     Restaurants  77.8  77.0      75.0  73.6  82.2
Turkish   Restaurants  84.3  84.3      74.2  73.6  76.7
Arabic    Hotels       82.7  81.7      82.7  80.5  82.8
English   Laptops      82.8  82.8      78.4  76.0  80.1
Dutch     Phones       83.3  82.6      83.3  81.8  81.3
Chinese   Cameras      80.5  -         78.2  77.6  78.6
Chinese   Phones       73.3  -         72.4  70.3  74.1
Table: Results of our system (H-LSTM) in comparison to the best
system for each pair (Best), the best single system (IIT-TUDA), a
sentence-level CNN (CNN), and our sentence-level LSTM (LSTM).
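The table can be summarized with a little arithmetic. The sketch below uses only the Best, LSTM and H-LSTM columns copied from the table. It computes the gain of H-LSTM over the sentence-level LSTM per pair and lists the pairs where H-LSTM exceeds the best shared-task system. The variable names are chosen for this illustration.

```python
# (language, domain): (Best, LSTM, H-LSTM), copied from the results table.
RESULTS = {
    ("English", "Restaurants"): (88.1, 81.4, 85.3),
    ("Spanish", "Restaurants"): (83.6, 75.7, 79.5),
    ("French", "Restaurants"): (78.8, 69.8, 73.6),
    ("Russian", "Restaurants"): (77.9, 73.9, 78.1),
    ("Dutch", "Restaurants"): (77.8, 73.6, 82.2),
    ("Turkish", "Restaurants"): (84.3, 73.6, 76.7),
    ("Arabic", "Hotels"): (82.7, 80.5, 82.8),
    ("English", "Laptops"): (82.8, 76.0, 80.1),
    ("Dutch", "Phones"): (83.3, 81.8, 81.3),
    ("Chinese", "Cameras"): (80.5, 77.6, 78.6),
    ("Chinese", "Phones"): (73.3, 70.3, 74.1),
}

# Gain from adding the review-level context (H-LSTM minus sentence-level LSTM).
gains = {pair: round(h - l, 1) for pair, (_, l, h) in RESULTS.items()}
mean_gain = sum(h - l for _, l, h in RESULTS.values()) / len(RESULTS)

# Pairs where H-LSTM scores above the best shared-task system.
beats_best = [pair for pair, (best, _, h) in RESULTS.items() if h > best]

print(f"mean gain over LSTM: {mean_gain:.2f}")
print("H-LSTM above Best:", beats_best)
```

On these numbers, H-LSTM improves on the sentence-level LSTM for every pair except Dutch phones, and it exceeds the best shared-task system on four pairs: Russian restaurants, Dutch restaurants, Arabic hotels, and Chinese phones.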
Takeaways
Knowledge of surrounding sentences / review context is
helpful.
Id   Sentence                                     LSTM      H-LSTM
1.1  No Comparison                                negative  positive
1.2  It has great sushi and even better service.  positive  positive
2.1  Green Tea creme brulee is a must!            positive  positive
2.2  Don’t leave the restaurant without it.       negative  positive
Table: Example sentences where knowledge of other sentences in the
review (not necessarily neighbors) helps to disambiguate the
sentiment of the sentence in question.
Takeaways
Pre-trained embeddings increase performance across all
languages significantly (more results in final version).
Gathering multilingual corpora is worth it.
H-LSTM outperforms the state of the art particularly for
low-resource languages where reliable parsers are not
available.
Generally, there is too little training data to fully
compensate for the lack of domain information; the lack of data
also rules out more sophisticated models, e.g.
attention.
The gap to the best model in English, Spanish and French is still
large. LSTMs can also use a sentiment lexicon, but the best way
to integrate it is not obvious (use scalar scores,
embed/bucket scores, filter based on occurrence, etc.).
Presentation is based on:
Sebastian Ruder, Parsa Ghaffari, John G. Breslin (2016). A
Hierarchical Model of Reviews for Aspect-based Sentiment
Analysis. EMNLP, Austin, Texas, US.
Credit for RNN and LSTM images: Christopher Olah.
Thank you for your attention!
Bibliography I
[Al-Rfou et al., 2013] Al-Rfou, R., Perozzi, B., and Skiena, S.
(2013).
Polyglot: Distributed Word Representations for Multilingual
NLP.
Proceedings of the Seventeenth Conference on
Computational Natural Language Learning, pages 183–192.
[Brun et al., 2016] Brun, C., Perez, J., and Roux, C. (2016).
XRCE at SemEval-2016 Task 5: Feedbacked Ensemble
Modelling on Syntactico-Semantic Knowledge for Aspect
Based Sentiment Analysis.
pages 282–286.
Bibliography II
[Brun et al., 2014] Brun, C., Popa, D., and Roux, C. (2014).
XRCE: Hybrid Classification for Aspect-based Sentiment
Analysis.
Proceedings of the 8th International Workshop on Semantic
Evaluation (SemEval 2014), pages 838–842.
[Kumar et al., 2016] Kumar, A., Kohail, S., Kumar, A., Ekbal,
A., and Biemann, C. (2016).
IIT-TUDA at SemEval-2016 Task 5: Beyond Sentiment
Lexicon: Combining Domain Dependency and Distributional
Semantics Features for Aspect Based Sentiment Analysis.
Proceedings of the 10th International Workshop on Semantic
Evaluation (SemEval 2016).
Bibliography III
[Mann and Thompson, 1988] Mann, W. C. and Thompson,
S. A. (1988).
Rhetorical Structure Theory: Toward a functional theory of
text organization.
[Pontiki et al., 2016] Pontiki, M., Galanis, D., Papageorgiou,
H., Androutsopoulos, I., Manandhar, S., AL-Smadi, M.,
Al-Ayyoub, M., Zhao, Y., Qin, B., Clercq, O. D., Hoste, V.,
Apidianaki, M., Tannier, X., Loukachevitch, N., Kotelnikov,
E., Bel, N., Jiménez-Zafra, S. M., and Eryiğit, G. (2016).
SemEval-2016 Task 5: Aspect-Based Sentiment Analysis.
In Proceedings of the 10th International Workshop on
Semantic Evaluation, San Diego, California. Association for
Computational Linguistics.
Bibliography IV
[Pontiki et al., 2015] Pontiki, M., Galanis, D., Papageorgiou,
H., Manandhar, S., and Androutsopoulos, I. (2015).
SemEval-2015 Task 12: Aspect Based Sentiment Analysis.
Proceedings of the 9th International Workshop on Semantic
Evaluation (SemEval 2015), pages 486–495.
[Pontiki et al., 2014] Pontiki, M., Galanis, D., Pavlopoulos, J.,
Papageorgiou, H., Androutsopoulos, I., and Manandhar, S.
(2014).
SemEval-2014 Task 4: Aspect Based Sentiment Analysis.
Proceedings of the 8th International Workshop on Semantic
Evaluation (SemEval 2014), pages 27–35.
Bibliography V
[Ruder et al., 2016] Ruder, S., Ghaffari, P., and Breslin, J. G.
(2016).
INSIGHT-1 at SemEval-2016 Task 5: Deep Learning for
Multilingual Aspect-based Sentiment Analysis.
Proceedings of the 10th International Workshop on Semantic
Evaluation.