The document proposes two aspect-specific polarity-aware sentiment summarization models called APSM and ME-APSM. APSM uses an LDA model to jointly model aspects and sentiments from reviews, while ME-APSM improves on APSM by incorporating a maximum entropy component to better distinguish aspect words from sentiment words. Experimental results on hotel and product reviews show that both APSM and ME-APSM outperform previous models in extracting coherent aspects and sentiments, and ME-APSM performs better than APSM. Additionally, both models achieve better sentiment classification accuracy compared to baselines.
4. Motivation
A large amount of reviews contain people's opinions.
However, there are too many reviews to read!
Techniques to discover and summarize aspects and sentiments from online reviews are needed.
It is still a challenging task:
manual analysis is impractical because of the huge number of reviews
the reviews are composed of unstructured texts
5. Aspect & Sentiment Extraction
[Figure: system overview.
Input: Review 1 ... Review n. Example (Review 1, by Michelle K, Busan, South Korea): "...Hilton Wangfujing made my stay in Beijing perfect! The location of the hotel is great. ... The room was large, luxurious and very comfortable..."
Output 1 (aspect and sentiment extraction):
Aspect 1 (room): aspect words: bathroom, towels, bed, shower, ...; pos: large, clean, safe, comfortable, ...; neg: dirty, small, uncomfortable, noise, ...
Aspect 2 (meal): aspect words: breakfast, fruit, eggs, juice, ...; pos: good, fresh, delicious, wonderful, ...; neg: cold, awful, terrible, poor, ...
Output 2 (sentiment classification): for each review, an overall sentiment and aspect-specific sentiments (room, meal, staff, ...).]
Problem Setup
Aspect and sentiment extraction
Aspect extraction
Aspect-specific sentiment extraction
Sentiment classification
Classify the overall review as positive or negative
Key advantage: figure out how sentiments are expressed according to different polarities for a particular aspect.
7. Related Work
Aspect-based sentiment analysis
Identify aspects that have been evaluated (aspect extraction) and predict the sentiment for each extracted aspect (sentiment extraction).
Frequency-based methods (Hu et al. 2004; Popescu et al. 2005)
Use frequent pattern mining and a dependency parser to find frequent noun terms and the opinions cast on them.
Limitation: produce many non-aspects that match the patterns.
Sequential labeling techniques (Jin et al. 2009; Jakob 2010; Choi and Cardie 2010)
Employ POS and lexical features on labeled data sets to train a CRF or HMM model.
Limitation: need manually labeled data for training.
LDA-based methods (ME-LDA (Zhao et al. 2010), ME-SAS (Mukherjee and Liu 2012), ASUM (Jo and Oh 2011))
Unsupervised; can extract aspects and sentiments simultaneously.
Limitation: cannot extract polarity-aware sentiments for each aspect.
9. The Proposed Models
Two LDA-based aspect and sentiment models:
Aspect-specific Polarity-aware Sentiment Model (APSM)
An improved version of APSM (ME-APSM), which uses a maximum entropy component to better distinguish aspect words from sentiment words
Model inference
Integrate sentiment and aspect priors via an asymmetric Dirichlet prior
13. Incorporating Prior Knowledge
we expect that no negative word appears in each aspect’s
positive sentiment model
positive word will be more likely to appear in each
aspect’s positive model
sentiment seeds will get
higher prior weights
words in the aspect seed list
will get higher prior weights
sentiment words should unlikely appear in aspect model
Sentiment Prior
Aspect Prior
13
Asymmetric Dirichlet prior
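The seed handling described on this slide can be sketched as follows — a minimal illustration of encoding seed words as an asymmetric Dirichlet prior. The weight values (0.1 base, 1.0 boost) and the toy vocabulary are assumptions for illustration, not the paper's settings.

```python
# Sketch: encode seed knowledge as an asymmetric Dirichlet prior by giving
# seed words larger pseudo-counts. Base/boost weights are illustrative.

def build_asymmetric_prior(vocab, seed_words, base=0.1, boost=1.0):
    """Return a per-word Dirichlet prior vector: seeds get higher weight."""
    seeds = set(seed_words)
    return [base + (boost if w in seeds else 0.0) for w in vocab]

vocab = ["breakfast", "fresh", "awful", "eggs"]
# Positive sentiment seeds get boosted in the positive sentiment model...
pos_prior = build_asymmetric_prior(vocab, {"fresh"})
# ...and aspect seeds get boosted in the aspect model.
aspect_prior = build_asymmetric_prior(vocab, {"breakfast", "eggs"})
```

With such a prior, a Gibbs sampler is biased toward assigning seed words to the intended model while remaining free to overrule the prior given enough evidence.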
16. Qualitative Results

Example Aspects and Sentiments Extracted by APSM and ME-APSM

Aspect: Staff
APSM    Aspect:   staff, helpful, friendly, english, desk, front, good, extremely
APSM    Senti(p): staff, friendly, courteous, helpful, attentive, clean, great, recommend
APSM    Senti(n): unhelpful, poor, bad, noise, cold, problem, overpriced, disappointed
ME-APSM Aspect:   staff, helpful, friendly, english, desk, extremely, waiter, waitress
ME-APSM Senti(p): good, great, helpful, friendly, excellent, wonderful, staff, clean
ME-APSM Senti(n): rude, unfriendly, unhelpful, noise, poor, disappointed, cheap, hard

Aspect: Meal
APSM    Aspect:   breakfast, coffee, buffet, room, fruit, eggs, fresh, included
APSM    Senti(p): breakfast, friendly, fresh, variety, good, great, delicious, nice
APSM    Senti(n): cold, scrambled, problem, hard, bad, expensive, poor, die
ME-APSM Aspect:   breakfast, coffee, fruit, buffet, eggs, cheese, cereal, juice
ME-APSM Senti(p): good, great, fresh, hot, wonderful, excellent, nice, fantastic
ME-APSM Senti(n): cold, scrambled, awful, limited, terrible, bad, poor, disappointed
Both APSM and ME-APSM can extract coherent aspects and aspect-specific sentiments well: "breakfast", "coffee", "buffet", "fruit" and "eggs" are all words related to the aspect meal.
In general, ME-APSM performs better than APSM: APSM incorrectly identifies the aspect word "staff" as a positive sentiment word, while ME-APSM can discover more specific negative sentiment words, such as "rude" and "unfriendly".
18. Sentiment Classification

Method                     Hotel Data Set   Product Data Set
Lexicon-based Method       62.7%            60.2%
ASUM                       65.6%            64.5%
APSM                       69.7%            66.5%
ME-APSM                    72.9%            69.2%
APSM+                      70.3%            66.9%
ME-APSM+                   73.9%            70.1%
Supervised Classification  74.3%            70.7%

Sentiment Classification Accuracy
Lexicon-based Method: counting the positive and negative words in the review
Supervised Classification (Denecke 2009): logistic regression
ASUM (Jo and Oh 2011)
APSM+: APSM with aspect and sentiment seeds
ME-APSM+: ME-APSM with aspect and sentiment seeds
The lexicon-based method performs worst: it cannot capture the aspect information of the sentiment words.
APSM and ME-APSM give better results than ASUM: separating aspects and sentiments improves sentiment classification accuracy.
ME-APSM further outperforms APSM, which suggests the effectiveness of the MaxEnt component.
APSM+ and ME-APSM+ outperform APSM and ME-APSM: incorporating sentiment and aspect priors improves performance.
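The lexicon-based baseline amounts to a simple word-counting rule. A minimal sketch (the word lists are toy examples, not the lexicon used in the experiments):

```python
# Sketch of the lexicon-based baseline: classify a review by counting
# positive vs. negative lexicon words. The lexicons here are toy examples.

POS_WORDS = {"great", "clean", "friendly", "delicious", "fresh"}
NEG_WORDS = {"dirty", "rude", "awful", "cold", "terrible"}

def lexicon_classify(review: str) -> str:
    tokens = review.lower().split()
    pos = sum(t in POS_WORDS for t in tokens)
    neg = sum(t in NEG_WORDS for t in tokens)
    return "positive" if pos >= neg else "negative"

print(lexicon_classify("the staff was friendly and the room clean"))  # positive
```

Note that "cold" is fixed as negative here regardless of aspect; this aspect-unaware polarity is exactly the limitation behind the baseline's weak results.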
19. Effect of the Number of Aspects
Sentiment Classification Accuracy with Different Numbers of Aspects
Sentiment classification performance increases as K increases.
This trend is more evident on the product data set.
21. Conclusion
In this paper, we focus on the problem of simultaneous aspect and sentiment extraction and sentiment classification of online reviews.
We proposed APSM and ME-APSM to address the problem.
Key advantage: extracting aspect-specific and polarity-aware sentiments
Incorporating sentiment and aspect prior information
In the future, we plan to apply our models to more sentiment analysis tasks, such as aspect-level sentiment classification.
22. References
Hu, M., Liu, B.: Mining and summarizing customer reviews. In: KDD. (2004) 168–177
Jo, Y., Oh, A.H.: Aspect and sentiment unification model for online review analysis. In: WSDM. (2011) 815–824
Popescu, A.M., Nguyen, B., Etzioni, O.: OPINE: Extracting product features and opinions from reviews. In: HLT/EMNLP. (2005)
Zhao, W.X., Jiang, J., Yan, H., Li, X.: Jointly modeling aspects and opinions with a MaxEnt-LDA hybrid. In: EMNLP. (2010) 56–65
Mukherjee, A., Liu, B.: Aspect extraction through semi-supervised modeling. In: ACL (1). (2012) 339–348
Jin, W., Ho, H.H.: A novel lexicalized HMM-based learning framework for web opinion mining. In: Proceedings of the 26th Annual International Conference on Machine Learning (ICML '09), New York, NY, USA, ACM (2009) 465–472
Jakob, N., Gurevych, I.: Extracting opinion targets in a single and cross-domain setting with conditional random fields. In: EMNLP. (2010) 1035–1045
Choi, Y., Cardie, C.: Hierarchical sequential learning for extracting opinions and their attributes. In: ACL (Short Papers). (2010) 269–274
Denecke, K.: Are SentiWordNet scores suited for multi-domain sentiment classification? In: ICDIM. (2009) 33–38
Good afternoon, everyone! I'm Gaoyan Ou, a PhD student at Peking University. My talk today is "Aspect-Specific Polarity-Aware Summarization of Online Reviews".
Here is my outline. First, I will talk about the motivation of our work and then give a review of the related research. Next, I will propose two probabilistic models named APSM and ME-APSM to model online reviews. After that, I will describe the experiments and results. Last, I'll give the conclusion.
First, let's look at our motivation. With the rapid development of various types of social media, there exists a large amount of reviews which contain people's opinions. These opinions are important for both customers and manufacturers. However, the number of reviews is too huge for users to digest. We need techniques to automatically discover and summarize aspects and sentiments for us. It is still a challenging task.
Now, I will describe the problem we want to address. Given a collection of reviews about a product, we want a structured summarization of which features (or aspects) of the product are discussed. For each aspect, we further want to know what people say when they are satisfied or unsatisfied. This is the aspect and sentiment extraction task. We also want to know, for each review, whether its sentiment is positive or negative. This is the sentiment classification task. Compared to existing work, our key advantage is that we can figure out how sentiments are expressed according to different polarities for a particular aspect. This is what "aspect-specific polarity-aware" means.
Our work falls in the framework of aspect-based sentiment analysis. Here I give a review of the related work. Generally speaking, there are three kinds of methods. The first is frequency-based methods, which use frequent pattern mining and a dependency parser to find frequent noun terms and the opinions cast on them. The limitation is that they produce many non-aspects. The second kind is sequential labeling methods, which employ POS and lexical features on labeled training data to train a CRF or HMM model. However, labeled training data is hard to obtain. The third kind is LDA-based models, which use topic modeling to discover latent topics as aspects and sentiments. These methods are unsupervised, which means we do not need labeled training data. However, most existing LDA-based models cannot extract polarity-aware sentiments for each aspect.
Now, I will describe our models. First, I describe APSM and its improved version ME-APSM. Then we show how to perform inference for the models. Last, I will describe how to integrate aspect and sentiment priors into APSM and ME-APSM via an asymmetric Dirichlet prior.
OK, here is the generative process of APSM. First we generate K aspect models and, for each aspect, M sentiment models. Both aspect and sentiment models are distributions over words. Given the aspect and sentiment models, our review corpus can be generated as follows. For each review d, draw an aspect distribution θ and K sentiment distributions π. For each sentence s in d, we first draw its aspect z from the aspect distribution θ. Given z and π_d, we then draw a sentiment l for sentence s. Each sentence has a proportion ψ of aspect and sentiment words. For each word w in sentence s, we first draw r from ψ, which indicates whether the word is an aspect word or a sentiment word. If w is an aspect word, we draw it from aspect model z; otherwise, if w is a sentiment word, we draw it from the l sentiment model of aspect z.
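The generative story just described can be simulated in a few lines — a sketch, not the paper's implementation; K, M, the vocabulary size, and all Dirichlet hyperparameters below are arbitrary illustrative values.

```python
# Sketch of APSM's generative process. Sizes and hyperparameters are
# illustrative assumptions, not the settings used in the paper.
import numpy as np

rng = np.random.default_rng(0)
V, K, M = 1000, 5, 2  # vocabulary size, number of aspects, polarities

phi_aspect = rng.dirichlet(np.full(V, 0.1), size=K)       # K aspect word dists
phi_sent = rng.dirichlet(np.full(V, 0.1), size=(K, M))    # per-aspect sentiment dists

def generate_review(n_sentences=3, n_words=8):
    theta = rng.dirichlet(np.full(K, 0.1))        # review's aspect distribution
    pi = rng.dirichlet(np.full(M, 0.5), size=K)   # per-aspect sentiment distribution
    review = []
    for _ in range(n_sentences):
        z = rng.choice(K, p=theta)                # sentence aspect
        l = rng.choice(M, p=pi[z])                # sentence sentiment
        psi = rng.dirichlet([1.0, 1.0])           # aspect-vs-sentiment proportion
        words = []
        for _ in range(n_words):
            r = rng.choice(2, p=psi)              # 0: aspect word, 1: sentiment word
            dist = phi_aspect[z] if r == 0 else phi_sent[z, l]
            words.append(int(rng.choice(V, p=dist)))
        review.append((z, l, words))
    return review
```

Running `generate_review()` yields sentences tagged with an aspect, a sentiment, and word ids drawn from the matching aspect or sentiment model.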
Now, I describe ME-APSM. We observe that aspect and sentiment terms play different syntactic roles in a sentence. Aspect words tend to be nouns and noun phrases, while sentiment words tend to be adjectives and adverbs. So we train a MaxEnt classifier to distinguish them. The training data are obtained by exploiting a sentiment lexicon. The graphical representation of ME-APSM is shown in the figure on the right.
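The MaxEnt component can be sketched with scikit-learn, where LogisticRegression stands in for a maximum entropy classifier over POS features; the tiny lexicon-derived training set below is a toy assumption, not the paper's data.

```python
# Sketch: a logistic-regression (maximum entropy) classifier over POS-tag
# features that separates aspect words from sentiment words. The features
# and training pairs are toy examples labeled as a sentiment lexicon would.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

train = [
    ({"pos": "NN"}, "aspect"),       # e.g. "staff"
    ({"pos": "NNS"}, "aspect"),      # e.g. "towels"
    ({"pos": "NN"}, "aspect"),       # e.g. "breakfast"
    ({"pos": "JJ"}, "sentiment"),    # e.g. "friendly" (in the lexicon)
    ({"pos": "RB"}, "sentiment"),    # e.g. "extremely"
    ({"pos": "JJ"}, "sentiment"),    # e.g. "delicious" (in the lexicon)
]

vec = DictVectorizer()
X = vec.fit_transform([features for features, _ in train])
y = [label for _, label in train]
clf = LogisticRegression().fit(X, y)

print(clf.predict(vec.transform([{"pos": "JJ"}])))  # -> ['sentiment']
```

In ME-APSM, the classifier's probabilities supply the first term of the sampler for the indicator r, replacing the symmetric ψ prior used in APSM.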
We use collapsed Gibbs sampling to infer the models. The sampling formulas for z and l are the same for APSM and ME-APSM, and are shown on the slide. For APSM, the sampler for r is as shown. For ME-APSM, only the first term changes to exp(…).
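Since the sampling formula for r appears only on the slide, the following is a hedged sketch of its typical collapsed-Gibbs shape (a sentence-level indicator count times the collapsed word probability under the corresponding model); the count bookkeeping and the gamma/beta hyperparameters are illustrative assumptions, not the paper's exact formula.

```python
# Hedged sketch of one collapsed Gibbs step for the indicator r
# (0: aspect word, 1: sentiment word). Substitute the paper's formula here.
import random

def sample_r(w, z, l, counts, gamma=1.0, beta=0.01, V=1000):
    # Term 1: how many words of this sentence are currently aspect/sentiment.
    # Term 2: collapsed probability of w under the aspect / sentiment model.
    p_aspect = (counts["sent_r"][0] + gamma) * \
        (counts["aspect_word"][z].get(w, 0) + beta) / (counts["aspect_total"][z] + V * beta)
    p_sent = (counts["sent_r"][1] + gamma) * \
        (counts["sent_word"][(z, l)].get(w, 0) + beta) / (counts["sent_total"][(z, l)] + V * beta)
    return 0 if random.random() < p_aspect / (p_aspect + p_sent) else 1
```

For ME-APSM, the first term of each expression would be replaced by the MaxEnt classifier's exp(…) score for the word's features.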
We can see from Table 1 that both APSM and ME-APSM can extract coherent aspects and aspect-specific sentiments well. For example, "breakfast", "coffee", "buffet", "fruit" and "eggs" are all words related to the aspect meal. They are correctly identified by both APSM and ME-APSM. In general, ME-APSM performs better than APSM. For the aspect staff, both APSM and ME-APSM discover the aspect word "staff", but APSM fails to discover more staff-related words like "waiter" and "waitress", which are successfully captured by ME-APSM. APSM also incorrectly identifies the aspect word "staff" as a positive sentiment word, while ME-APSM can discover more specific negative sentiment words, such as "rude" and "unfriendly".
Table 2 gives P@n values for ME-LDA, APSM and ME-APSM. It can be seen that APSM and ME-APSM give better results than ME-LDA. ME-APSM further outperforms APSM, which suggests the effectiveness of the MaxEnt component.
In this section, we present the results of sentiment classification. The experimental results are shown in Table 3. The performance of unified aspect and sentiment models (ASUM, DS-LDA, APSM and ME-APSM) is better than the lexicon-based method on both data sets. This is because sentiment polarities are dependent on aspects; the lexicon-based method cannot capture the aspect information of the sentiment words. APSM and ME-APSM consistently outperform ASUM, which cannot separate aspects and sentiments. This suggests that separating aspects and sentiments not only improves aspect extraction performance but also improves sentiment classification accuracy. We denote the models with aspect and sentiment seeds as APSM+ and ME-APSM+. Note that the supervised method needs labeled training data, while ME-APSM+ only needs several aspect and sentiment seeds.
We also analyze the influence of the number of aspects K. We can see from Fig. 3 that as the number of aspects increases, the sentiment classification performance increases. This trend is more evident on the product data set.