Prof. Dr. Computer Science (Artificial Intelligence, Software Engineering), Co-Founder AGISI.org at Computer Science Dept., Berlin School of Economics and Law
Predicting Star Ratings
based on Annotated Reviews
of Mobile Apps
Talk at the 6th International Workshop on Advances in Semantic Information Retrieval
ASIR 2016
Prof. Dr. Dagmar Monett, Hermann Stolte
D. Monett
Reviews and star ratings
Gdańsk, Poland, September 11 – 14, 2016
Example of reviews and star ratings of the
Evernote App, Google Play Store (07/2016)
Star ratings matter
15% would consider downloading an app with a 2-star rating
50% would consider downloading an app with a 3-star rating
96% would consider downloading an app with a 4-star rating
Source: Aptentive 2015 Consumer Study
The Mobile Marketer's Guide to App Store Ratings & Reviews
Some questions…
■ Could we (a program) teach users how to rate
apps consistently with the review they are writing
for a mobile app?
■ I.e., could we (a program) suggest to users the
most appropriate star rating for a product, based
on the semantic orientation of what they have
already written in the review?
■ Would it improve users' engagement and
satisfaction with the app?
Review rating prediction
■ Also called sentiment rating prediction:
■ …a task that deals with the inference of an
author's implied numerical rating, i.e., with the
prediction of a rating score from a given written
review
■ E.g., recommendation systems often suggest
products based on star ratings of similar
products previously rated by other users
Other related work
■ Analysing textual reviews and inferring sentiment
polarity (positive/negative/neutral) (Pang et al., 2002;
Liu, 2010)
■ Using not only textual semantics but also other
information, e.g., about the author and/or the
product (Tang et al., 2015; Li et al., 2011)
■ Considering phrase-level sentiment polarity (Qu et
al., 2010)
■ Considering aspect-based opinion mining (Zhang et
al., 2006; Ganu et al., 2013; Klinger & Cimiano, 2013; Sänger, 2015)
Our approach
■ We do not deal with aspect identification or
sentiment classification
■ We are assuming that these tasks are already
performed before the star ratings are predicted
■ We focus on predicting star ratings based solely
on available annotated, fine-granular opinions
■ I.e., our work complements approaches like (Sänger, 2015),
which extends (Klinger & Cimiano, 2013) and uses a
German annotated corpus of mobile app reviews
SCARE Corpus
Mario Sänger, Ulf Leser, Steffen Kemmerer, Peter Adolphs, and Roman Klinger.
SCARE - The Sentiment Corpus of App Reviews with Fine-grained Annotations in
German. In Proceedings of the Tenth International Conference on Language
Resources and Evaluation (LREC'16), Portorož, Slovenia, May 2016. European
Language Resources Association (ELRA).
■ Fine-grained annotations for mobile application
reviews from the Google Play Store
■ 1,760 German application reviews with 2,487
aspects and 3,959 subjective phrases
■ SCARE corpus v.1.0.0 (annotations only)
■ Available at http://www.romanklinger.de/scare/
We “played” with
different models
Computational models
For example,
x0=1
x1 : no. of subjective phrases with positive polarity
x2 : no. of subjective phrases with negative polarity
x3 : no. of subjective phrases with neutral polarity
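The feature set above can be sketched as the input of an ordinary least-squares fit. This is an illustrative reconstruction, not the authors' implementation: the toy data, the `predict_stars` helper, and the clipping to the 1–5 range are assumptions.

```python
# Hypothetical sketch: linear regression over the phrase-count features
# x0 (bias), x1 (#positive), x2 (#negative), x3 (#neutral phrases).
# Data and helper names are illustrative, not from the paper.
import numpy as np

# Each row: [x0=1, x1, x2, x3]; target y is the review's star rating (1-5).
X = np.array([
    [1, 3, 0, 1],   # mostly positive review
    [1, 0, 4, 0],   # mostly negative review
    [1, 2, 1, 1],   # mixed review
    [1, 5, 0, 0],
    [1, 0, 2, 1],
], dtype=float)
y = np.array([5, 1, 3, 5, 2], dtype=float)

# Ordinary least squares: theta minimises ||X @ theta - y||^2
theta, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict_stars(x):
    """Predict a star rating, clipped to the valid 1-5 range."""
    return float(np.clip(x @ theta, 1.0, 5.0))

print(predict_stars(np.array([1, 4, 0, 0], dtype=float)))
```

A review with many positive and no negative phrases should receive a higher predicted rating than one with only negative phrases.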
Experiments
(1) Assessing the importance of sentiment in the
reviews:
■ Neutral phrases (yes/no)?
■ Reviews with no sentiment (yes/no)?
(2) Using other predictors
■ Each individual experiment is run 10,000 times
■ Monte Carlo cross-validation: a random split into
70% training data and 30% testing data on each
iteration
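The evaluation protocol above can be sketched as follows. This is a minimal illustration of Monte Carlo cross-validation, not the authors' code: the function names and the toy mean-rating baseline are assumptions, and only 100 repetitions are used here instead of 10,000.

```python
# Minimal sketch of Monte Carlo cross-validation: repeated random
# 70%/30% train/test splits, scores averaged over all repetitions.
import random

def monte_carlo_cv(data, train_and_eval, repetitions=100, train_frac=0.7, seed=42):
    """Average an evaluation score over random train/test splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repetitions):
        shuffled = data[:]
        rng.shuffle(shuffled)            # new random split on each iteration
        cut = int(len(shuffled) * train_frac)
        train, test = shuffled[:cut], shuffled[cut:]
        scores.append(train_and_eval(train, test))
    return sum(scores) / len(scores)

# Toy usage: the "model" predicts the mean training rating; the score
# is the mean absolute error on the test split.
ratings = [5, 4, 1, 3, 5, 2, 4, 1, 5, 3] * 10
def mean_baseline(train, test):
    pred = sum(train) / len(train)
    return sum(abs(pred - r) for r in test) / len(test)

print(round(monte_carlo_cv(ratings, mean_baseline), 3))
```

Unlike k-fold cross-validation, the random splits may overlap across iterations, which is why many repetitions are needed for a stable estimate.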
“Best” model, exp. (1)
■ It uses a single feature: the average value of the
polarities of a review
■ Plus:
■ filtering both subjective phrases with neutral
polarity and reviews with no sentiment
orientation at all
■ No normalisation
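The single-feature model with its filtering steps can be sketched as below. The +1/-1 polarity encoding, the function name, and the use of `None` for filtered reviews are illustrative assumptions, not the authors' exact representation.

```python
# Hedged sketch of the "best" model's feature: represent each review by
# the average of its phrase polarities (+1 positive, -1 negative), after
# filtering neutral phrases and reviews with no sentiment at all.
def average_polarity(phrase_polarities):
    """Map phrase labels to +1/-1, drop neutral phrases, and return the
    mean, or None for a review with no sentiment orientation."""
    values = [{"positive": 1.0, "negative": -1.0}[p]
              for p in phrase_polarities if p != "neutral"]
    if not values:        # no sentiment orientation at all: filter out
        return None
    return sum(values) / len(values)

reviews = [
    ["positive", "positive", "neutral"],   # neutral phrase dropped
    ["negative", "positive"],              # mixed sentiment
    ["neutral"],                           # whole review filtered out
]
print([average_polarity(r) for r in reviews])  # [1.0, 0.0, None]
```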
Conclusion
■ Textually-derived rating prediction can be
performed well even when only phrase-level
sentiment polarity is available
■ Phrases with neutral sentiment could be filtered
out of the corpus
■ Computing the overall sentiment of a review using
the review rating score (Ganu et al., 2009, 2013) provides
the best star rating predictions
Further work
■ To consider the aspects’ relevance
■ aspect-oriented subjective phrases
■ To analyse the strengths of the opinions (Wilson et al.,
2004)
■ not only positive/negative/neutral sentiment
■ To explore model types other than linear,
multivariate regression
Sources
Related work:
- See the reference list in our paper!
■ https://www.researchgate.net/publication/304244445_Predicting_Star_Ratings_based_on_Annotated_Reviews_of_Mobile_Apps