The document proposes two aspect-specific polarity-aware sentiment summarization models called APSM and ME-APSM. APSM uses an LDA model to jointly model aspects and sentiments from reviews, while ME-APSM improves on APSM by incorporating a maximum entropy component to better distinguish aspect words from sentiment words. Experimental results on hotel and product reviews show that both APSM and ME-APSM outperform previous models in extracting coherent aspects and sentiments, and ME-APSM performs better than APSM. Additionally, both models achieve better sentiment classification accuracy compared to baselines.
4. Motivation
A large amount of reviews contain people's opinions.
However, there are too many reviews to read!
Techniques to discover and summarize aspects and sentiments from online reviews are needed.
It is still a challenging task:
manual analysis is impractical because of the huge number of reviews
the reviews are composed of unstructured texts
5. Aspect & Sentiment Extraction
[Figure: system overview.
Input: Review 1 ... Review n. Example (Review 1, by Michelle K, Busan, South Korea): "...Hilton Wangfujing made my stay in Beijing perfect! The location of the hotel is great. ... The room was large, luxurious and very comfortable..."
Output 1 (aspect and sentiment extraction):
Aspect 1 (room): aspect words: bathroom, towels, bed, shower, ...; pos: large, clean, safe, comfortable, ...; neg: dirty, small, uncomfortable, noise, ...
Aspect 2 (meal): aspect words: breakfast, fruit, eggs, juice, ...; pos: good, fresh, delicious, wonderful, ...; neg: cold, awful, terrible, poor, ...
Output 2 (sentiment classification): for each review, an overall sentiment and aspect-specific sentiments (room, meal, staff, ...).]
Problem Setup
Aspect and sentiment extraction
Aspect extraction
Aspect-specific sentiment extraction
Sentiment classification
Classify the overall review as positive or negative
Key advantage: figure out how sentiments are expressed according to different polarities for a particular aspect.
7. Related Work
Aspect-based sentiment analysis
Identify aspects that have been evaluated (aspect extraction) and predict the sentiment for each extracted aspect (sentiment extraction).
Frequency-based methods (Hu et al. 2004; Popescu et al. 2005)
Use frequent pattern mining and a dependency parser to find frequent noun terms and the opinions cast on them.
Limitation: produce many non-aspects that match the patterns.
Sequential labeling techniques (Jin et al. 2009; Jakob 2010; Choi and Cardie 2010)
Employ POS and lexical features on labeled data sets to train a CRF or HMM model.
Limitation: need manually labeled data for training.
LDA-based methods (ME-LDA (Zhao et al. 2010), ME-SAS (Mukherjee and Liu 2012), ASUM (Jo and Oh 2011))
Unsupervised; can extract aspects and sentiments simultaneously.
Limitation: cannot extract polarity-aware sentiments for each aspect.
9. The Proposed Models
Two LDA-based aspect and sentiment models:
Aspect-specific Polarity-aware Sentiment Model (APSM)
An improved version of APSM (ME-APSM), which uses a maximum entropy component to better distinguish aspect words from sentiment words
Model inference
Integrate sentiment and aspect priors via an asymmetric Dirichlet prior
13. Incorporating Prior Knowledge
we expect that no negative word appears in each aspect’s
positive sentiment model
positive word will be more likely to appear in each
aspect’s positive model
sentiment seeds will get
higher prior weights
words in the aspect seed list
will get higher prior weights
sentiment words should unlikely appear in aspect model
Sentiment Prior
Aspect Prior
13
Asymmetric Dirichlet prior
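The seed handling described on this slide can be sketched as follows — a minimal illustration of encoding seed words as an asymmetric Dirichlet prior. The weight values (0.1 base, 1.0 boost) and the toy vocabulary are assumptions for illustration, not the paper's settings.

```python
# Sketch: encode seed knowledge as an asymmetric Dirichlet prior by giving
# seed words larger pseudo-counts. Base/boost weights are illustrative.

def build_asymmetric_prior(vocab, seed_words, base=0.1, boost=1.0):
    """Return a per-word Dirichlet prior vector: seeds get higher weight."""
    seeds = set(seed_words)
    return [base + (boost if w in seeds else 0.0) for w in vocab]

vocab = ["breakfast", "fresh", "awful", "eggs"]
# Positive sentiment seeds get boosted in the positive sentiment model...
pos_prior = build_asymmetric_prior(vocab, {"fresh"})
# ...and aspect seeds get boosted in the aspect model.
aspect_prior = build_asymmetric_prior(vocab, {"breakfast", "eggs"})
```

With such a prior, a Gibbs sampler is biased toward assigning seed words to the intended model while remaining free to overrule the prior given enough evidence.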
16. Qualitative Results

Example Aspects and Sentiments Extracted by APSM and ME-APSM

Aspect: Staff
APSM    Aspect:   staff, helpful, friendly, english, desk, front, good, extremely
APSM    Senti(p): staff, friendly, courteous, helpful, attentive, clean, great, recommend
APSM    Senti(n): unhelpful, poor, bad, noise, cold, problem, overpriced, disappointed
ME-APSM Aspect:   staff, helpful, friendly, english, desk, extremely, waiter, waitress
ME-APSM Senti(p): good, great, helpful, friendly, excellent, wonderful, staff, clean
ME-APSM Senti(n): rude, unfriendly, unhelpful, noise, poor, disappointed, cheap, hard

Aspect: Meal
APSM    Aspect:   breakfast, coffee, buffet, room, fruit, eggs, fresh, included
APSM    Senti(p): breakfast, friendly, fresh, variety, good, great, delicious, nice
APSM    Senti(n): cold, scrambled, problem, hard, bad, expensive, poor, die
ME-APSM Aspect:   breakfast, coffee, fruit, buffet, eggs, cheese, cereal, juice
ME-APSM Senti(p): good, great, fresh, hot, wonderful, excellent, nice, fantastic
ME-APSM Senti(n): cold, scrambled, awful, limited, terrible, bad, poor, disappointed
Both APSM and ME-APSM can extract coherent aspects and aspect-specific sentiments well: "breakfast", "coffee", "buffet", "fruit" and "eggs" are all words related to the aspect meal.
In general, ME-APSM performs better than APSM: APSM incorrectly identifies the aspect word "staff" as a positive sentiment word, while ME-APSM can discover more specific negative sentiment words, such as "rude" and "unfriendly".
18. Sentiment Classification

Method                     Hotel Data Set   Product Data Set
Lexicon-based Method       62.7%            60.2%
ASUM                       65.6%            64.5%
APSM                       69.7%            66.5%
ME-APSM                    72.9%            69.2%
APSM+                      70.3%            66.9%
ME-APSM+                   73.9%            70.1%
Supervised Classification  74.3%            70.7%

Sentiment Classification Accuracy
Lexicon-based Method: counting the positive and negative words in the review
Supervised Classification (Denecke 2009): logistic regression
ASUM (Jo and Oh 2011)
APSM+: APSM with aspect and sentiment seeds
ME-APSM+: ME-APSM with aspect and sentiment seeds
The lexicon-based method performs worst: it cannot capture the aspect information of the sentiment words.
APSM and ME-APSM give better results than ASUM: separating aspects and sentiments improves sentiment classification accuracy.
ME-APSM further outperforms APSM, which suggests the effectiveness of the MaxEnt component.
APSM+ and ME-APSM+ outperform APSM and ME-APSM: incorporating sentiment and aspect priors improves performance.
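The lexicon-based baseline amounts to a simple word-counting rule. A minimal sketch (the word lists are toy examples, not the lexicon used in the experiments):

```python
# Sketch of the lexicon-based baseline: classify a review by counting
# positive vs. negative lexicon words. The lexicons here are toy examples.

POS_WORDS = {"great", "clean", "friendly", "delicious", "fresh"}
NEG_WORDS = {"dirty", "rude", "awful", "cold", "terrible"}

def lexicon_classify(review: str) -> str:
    tokens = review.lower().split()
    pos = sum(t in POS_WORDS for t in tokens)
    neg = sum(t in NEG_WORDS for t in tokens)
    return "positive" if pos >= neg else "negative"

print(lexicon_classify("the staff was friendly and the room clean"))  # positive
```

Note that "cold" is fixed as negative here regardless of aspect; this aspect-unaware polarity is exactly the limitation behind the baseline's weak results.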
19. Effect of the Number of Aspects
Sentiment Classification Accuracy with Different Numbers of Aspects
Sentiment classification performance increases as K increases.
This trend is more evident on the product data set.
21. Conclusion
In this paper, we focus on the problem of simultaneous aspect and sentiment extraction and sentiment classification of online reviews.
We proposed APSM and ME-APSM to address the problem.
Key advantage: extracting aspect-specific and polarity-aware sentiments
Incorporating sentiment and aspect prior information
In the future, we plan to apply our models to more sentiment analysis tasks, such as aspect-level sentiment classification.
22. References
Hu, M., Liu, B.: Mining and summarizing customer reviews. In: KDD. (2004) 168–177
Jo, Y., Oh, A.H.: Aspect and sentiment unification model for online review analysis. In: WSDM. (2011) 815–824
Popescu, A.M., Nguyen, B., Etzioni, O.: OPINE: Extracting product features and opinions from reviews. In: HLT/EMNLP. (2005)
Zhao, W.X., Jiang, J., Yan, H., Li, X.: Jointly modeling aspects and opinions with a MaxEnt-LDA hybrid. In: EMNLP. (2010) 56–65
Mukherjee, A., Liu, B.: Aspect extraction through semi-supervised modeling. In: ACL (1). (2012) 339–348
Jin, W., Ho, H.H.: A novel lexicalized HMM-based learning framework for web opinion mining. In: Proceedings of the 26th Annual International Conference on Machine Learning (ICML '09), New York, NY, USA, ACM (2009) 465–472
Jakob, N., Gurevych, I.: Extracting opinion targets in a single and cross-domain setting with conditional random fields. In: EMNLP. (2010) 1035–1045
Choi, Y., Cardie, C.: Hierarchical sequential learning for extracting opinions and their attributes. In: ACL (Short Papers). (2010) 269–274
Denecke, K.: Are SentiWordNet scores suited for multi-domain sentiment classification? In: ICDIM. (2009) 33–38
Good afternoon, everyone! I'm Gaoyan Ou, a PhD student at Peking University. My talk today is "Aspect-Specific Polarity-Aware Summarization of Online Reviews".
Here is my outline. First, I will talk about the motivation of our work and then give a review of the related research. Next, I will propose two probabilistic models named APSM and ME-APSM to model online reviews. After that, I will describe the experiments and results. Last, I'll give the conclusion.
First, let's look at our motivation. With the rapid development of various types of social media, there exists a large amount of reviews which contain people's opinions. These opinions are important for both customers and manufacturers. However, the number of reviews is too huge for users to digest. We need techniques to automatically discover and summarize aspects and sentiments for us. It is still a challenging task.
Now, I will describe the problem we want to address. Given a collection of reviews about a product, we want a structured summarization of which features (or aspects) of the product are discussed. For each aspect, we further want to know what people say when they are satisfied or unsatisfied. This is the aspect and sentiment extraction task. We also want to know, for each review, whether its sentiment is positive or negative. This is the sentiment classification task. Compared to existing work, our key advantage is that we can figure out how sentiments are expressed according to different polarities for a particular aspect. This is what "aspect-specific polarity-aware" means.
Our work falls in the framework of aspect-based sentiment analysis. Here I give a review of the related work. Generally speaking, there are three kinds of methods. The first is frequency-based methods, which use frequent pattern mining and a dependency parser to find frequent noun terms and the opinions cast on them. The limitation is that they produce many non-aspects. The second kind is sequential labeling methods, which employ POS and lexical features on labeled training data to train a CRF or HMM model. However, labeled training data is hard to obtain. The third kind is LDA-based models, which use topic modeling to discover latent topics as aspects and sentiments. These methods are unsupervised, which means we do not need labeled training data. However, most existing LDA-based models cannot extract polarity-aware sentiments for each aspect.
Now, I will describe our models. First, I describe APSM and its improved version ME-APSM. Then we show how to perform inference for the models. Last, I will describe how to integrate aspect and sentiment priors into APSM and ME-APSM via an asymmetric Dirichlet prior.
OK, here is the generative process of APSM. First we generate K aspect models and, for each aspect, M sentiment models. Both aspect and sentiment models are distributions over words. Given the aspect and sentiment models, our review corpus can be generated as follows. For each review d, draw an aspect distribution θ and K sentiment distributions π. For each sentence s in d, we first draw its aspect z from the aspect distribution θ. Given z and π_d, we then draw a sentiment l for sentence s. Each sentence has a proportion ψ of aspect and sentiment words. For each word w in sentence s, we first draw r from ψ, which indicates whether the word is an aspect word or a sentiment word. If w is an aspect word, we draw it from aspect model z; otherwise, if w is a sentiment word, we draw it from the l sentiment model of aspect z.
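The generative story just described can be simulated in a few lines — a sketch, not the paper's implementation; K, M, the vocabulary size, and all Dirichlet hyperparameters below are arbitrary illustrative values.

```python
# Sketch of APSM's generative process. Sizes and hyperparameters are
# illustrative assumptions, not the settings used in the paper.
import numpy as np

rng = np.random.default_rng(0)
V, K, M = 1000, 5, 2  # vocabulary size, number of aspects, polarities

phi_aspect = rng.dirichlet(np.full(V, 0.1), size=K)       # K aspect word dists
phi_sent = rng.dirichlet(np.full(V, 0.1), size=(K, M))    # per-aspect sentiment dists

def generate_review(n_sentences=3, n_words=8):
    theta = rng.dirichlet(np.full(K, 0.1))        # review's aspect distribution
    pi = rng.dirichlet(np.full(M, 0.5), size=K)   # per-aspect sentiment distribution
    review = []
    for _ in range(n_sentences):
        z = rng.choice(K, p=theta)                # sentence aspect
        l = rng.choice(M, p=pi[z])                # sentence sentiment
        psi = rng.dirichlet([1.0, 1.0])           # aspect-vs-sentiment proportion
        words = []
        for _ in range(n_words):
            r = rng.choice(2, p=psi)              # 0: aspect word, 1: sentiment word
            dist = phi_aspect[z] if r == 0 else phi_sent[z, l]
            words.append(int(rng.choice(V, p=dist)))
        review.append((z, l, words))
    return review
```

Running `generate_review()` yields sentences tagged with an aspect, a sentiment, and word ids drawn from the matching aspect or sentiment model.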
Now, I describe ME-APSM. We observe that aspect and sentiment terms play different syntactic roles in a sentence. Aspect words tend to be nouns and noun phrases, while sentiment words tend to be adjectives and adverbs. So we train a MaxEnt classifier to distinguish them. The training data are obtained by exploiting a sentiment lexicon. The graphical representation of ME-APSM is shown in the figure on the right.
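The MaxEnt component can be sketched with scikit-learn, where LogisticRegression stands in for a maximum entropy classifier over POS features; the tiny lexicon-derived training set below is a toy assumption, not the paper's data.

```python
# Sketch: a logistic-regression (maximum entropy) classifier over POS-tag
# features that separates aspect words from sentiment words. The features
# and training pairs are toy examples labeled as a sentiment lexicon would.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

train = [
    ({"pos": "NN"}, "aspect"),       # e.g. "staff"
    ({"pos": "NNS"}, "aspect"),      # e.g. "towels"
    ({"pos": "NN"}, "aspect"),       # e.g. "breakfast"
    ({"pos": "JJ"}, "sentiment"),    # e.g. "friendly" (in the lexicon)
    ({"pos": "RB"}, "sentiment"),    # e.g. "extremely"
    ({"pos": "JJ"}, "sentiment"),    # e.g. "delicious" (in the lexicon)
]

vec = DictVectorizer()
X = vec.fit_transform([features for features, _ in train])
y = [label for _, label in train]
clf = LogisticRegression().fit(X, y)

print(clf.predict(vec.transform([{"pos": "JJ"}])))  # -> ['sentiment']
```

In ME-APSM, the classifier's probabilities supply the first term of the sampler for the indicator r, replacing the symmetric ψ prior used in APSM.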
We use collapsed Gibbs sampling to infer the models. The sampling formulas for z and l are the same for APSM and ME-APSM, and are shown on the slide. For APSM, the sampler for r is as shown. For ME-APSM, only the first term changes to exp(…).
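Since the sampling formula for r appears only on the slide, the following is a hedged sketch of its typical collapsed-Gibbs shape (a sentence-level indicator count times the collapsed word probability under the corresponding model); the count bookkeeping and the gamma/beta hyperparameters are illustrative assumptions, not the paper's exact formula.

```python
# Hedged sketch of one collapsed Gibbs step for the indicator r
# (0: aspect word, 1: sentiment word). Substitute the paper's formula here.
import random

def sample_r(w, z, l, counts, gamma=1.0, beta=0.01, V=1000):
    # Term 1: how many words of this sentence are currently aspect/sentiment.
    # Term 2: collapsed probability of w under the aspect / sentiment model.
    p_aspect = (counts["sent_r"][0] + gamma) * \
        (counts["aspect_word"][z].get(w, 0) + beta) / (counts["aspect_total"][z] + V * beta)
    p_sent = (counts["sent_r"][1] + gamma) * \
        (counts["sent_word"][(z, l)].get(w, 0) + beta) / (counts["sent_total"][(z, l)] + V * beta)
    return 0 if random.random() < p_aspect / (p_aspect + p_sent) else 1
```

For ME-APSM, the first term of each expression would be replaced by the MaxEnt classifier's exp(…) score for the word's features.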
We can see from Table 1 that both APSM and ME-APSM can extract coherent aspects and aspect-specific sentiments well. For example, "breakfast", "coffee", "buffet", "fruit" and "eggs" are all words related to the aspect meal. They are correctly identified by both APSM and ME-APSM. In general, ME-APSM performs better than APSM. For the aspect staff, both APSM and ME-APSM discover the aspect word "staff", but APSM fails to discover more staff-related words like "waiter" and "waitress", which are successfully captured by ME-APSM. APSM also incorrectly identifies the aspect word "staff" as a positive sentiment word, while ME-APSM can discover more specific negative sentiment words, such as "rude" and "unfriendly".
Table 2 gives P@n values for ME-LDA, APSM and ME-APSM. It can be seen that APSM and ME-APSM give better results than ME-LDA. ME-APSM further outperforms APSM, which suggests the effectiveness of the MaxEnt component.
In this section, we present the results of sentiment classification. The experimental results are shown in Table 3. The performance of unified aspect and sentiment models (ASUM, DS-LDA, APSM and ME-APSM) is better than the lexicon-based method on both data sets. This is because sentiment polarities are dependent on aspects; the lexicon-based method cannot capture the aspect information of the sentiment words. APSM and ME-APSM consistently outperform ASUM, which cannot separate aspects and sentiments. This suggests that separating aspects and sentiments not only improves aspect extraction performance but also improves sentiment classification accuracy. We denote the models with aspect and sentiment seeds as APSM+ and ME-APSM+. Note that the supervised method needs labeled training data, while ME-APSM+ only needs several aspect and sentiment seeds.
We also analyze the influence of the number of aspects K. We can see from Fig. 3 that as the number of aspects increases, the sentiment classification performance increases. This trend is more evident on the product data set.