Web Opinion Mining - Presentation
1. Web Opinion Mining
Marc-Antoine Dupré
Alexander Patronas
Erhard Dinhobl
Ksenija Ivekovic
Martin Trenkwalder
2. Roadmap
What is opinion mining and why?
Objects, model and task
Words and phrases
Sentiment classification
Feature-based opinion mining
Opinion Spam
Tools on opinion mining
3. Questions:
What do users think about a specific product?
Which of our customers are unsatisfied? Why?
Which product is more popular among users?
Answer: Web Opinion Mining
4. Web Opinion Mining
Facebook, blogs, … > opinion
Wikipedia > fact
Opinions: underlying question
“What do people in America think about Barack Obama?”
Mostly in deep web
AI algorithm necessary
Useful: market intelligence (better ads)
5. Objects, Model
Opinion holder / object / opinion
Features of object
F = {f1, f2, f3, …}, fi ∈ F
Each fi is expressed by a set of words or phrases Wi
W = {W1, W2, W3, …}, Wi ∈ W
O is some object (event, person, product, …)
“An opinion holder j comments on a subset of features Sj of F of O.
For each feature fk ∈ Sj, holder j chooses a word or phrase from Wk to
name the feature and gives a positive, negative or neutral opinion on fk.”
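The model above can be written down as a small data structure. The following is a minimal sketch, not the presenters' implementation: the class names `Feature` and `Opinion`, the helper `commented_features`, and the second reviewer are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

ORIENTATIONS = ("positive", "negative", "neutral")

@dataclass(frozen=True)
class Feature:
    name: str                       # f_k, e.g. "battery life"
    words: frozenset = frozenset()  # W_k: words or phrases that express f_k

@dataclass(frozen=True)
class Opinion:
    holder: str        # opinion holder j
    obj: str           # object O
    feature: Feature   # a feature f_k in S_j
    orientation: str   # one of ORIENTATIONS

    def __post_init__(self):
        if self.orientation not in ORIENTATIONS:
            raise ValueError(f"unknown orientation: {self.orientation}")

def commented_features(opinions, holder):
    """Return S_j: the names of the features that holder j commented on."""
    return {o.feature.name for o in opinions if o.holder == holder}

picture = Feature("picture", frozenset({"picture", "photo", "image"}))
battery = Feature("battery life", frozenset({"battery", "battery life"}))
opinions = [
    Opinion("jprice174", "camera", picture, "positive"),
    # Hypothetical second holder, added only to show that S_j differs per holder.
    Opinion("another_reviewer", "camera", battery, "negative"),
]
```

Calling `commented_features(opinions, "jprice174")` yields the subset Sj for that holder.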
6. Task
One document – one opinion from one holder
Opinion: positive, negative, neutral
3 levels:
Document – determining the class
Sentence (one opinion)
sentence type (objective or subjective)
sentence class (neutral, positive, negative)
Feature – determining words and phrases
7. Words and Phrases
Words are often context dependent (“long”: long loading time
vs. long battery runtime)
3 approaches to get wordlist:
Manual approach
Corpus-based approach
Dictionary-based approach
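One common reading of the dictionary-based approach is to start from a few seed words with known orientation and grow the list through synonym and antonym links. The sketch below assumes this reading. The toy `SYNONYMS` and `ANTONYMS` tables are invented for illustration, and a real system would query a lexical resource such as WordNet instead.

```python
from collections import deque

# Toy thesaurus, invented for illustration only.
SYNONYMS = {
    "good": ["great", "fine"],
    "great": ["amazing"],
    "bad": ["poor"],
    "poor": ["awful"],
}
ANTONYMS = {"good": ["bad"], "bad": ["good"]}

def flip(orientation):
    return "negative" if orientation == "positive" else "positive"

def expand_wordlist(seeds):
    """Breadth-first expansion of seed words through synonyms and antonyms.

    seeds maps word -> "positive" or "negative". Synonyms inherit the
    orientation of the word they were reached from; antonyms get the
    opposite one. The first orientation assigned to a word wins.
    """
    lexicon = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        orientation = lexicon[word]
        neighbours = [(s, orientation) for s in SYNONYMS.get(word, [])]
        neighbours += [(a, flip(orientation)) for a in ANTONYMS.get(word, [])]
        for other, label in neighbours:
            if other not in lexicon:
                lexicon[other] = label
                queue.append(other)
    return lexicon

lexicon = expand_wordlist({"good": "positive"})
```

A list built this way is context independent, so a word like “long” still needs separate handling.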
8. Sentiment Classification
Classify documents (e.g. reviews) based on overall
sentiments expressed by opinion holders
Positive, negative or neutral
Useful, but doesn’t find what the reviewer liked or disliked!
A negative sentiment on an object doesn’t mean that the opinion
holder dislikes everything about the object, and vice versa
Need to go to the sentence level and the feature level
9. Feature-based Opinion Mining
Objective: find what reviewers like and dislike
Features and components
Three tasks:
Extract object features that have been commented on in each
review
Determine whether opinions on the feature are positive,
negative or neutral
Group synonyms and produce summary
10. Different Review Formats
GREAT Camera., Jun 3, 2004
Reviewer: jprice174 from Atlanta, Ga.
I did a lot of research last year before I
bought this camera... It kinda hurt to leave behind
my beloved nikon 35mm SLR, but I was going to
Italy, and I needed something smaller, and digital.
The pictures coming out of this camera are
amazing. The 'auto' feature takes great pictures
most of the time. And with digital, you're not
wasting film if the picture doesn't come out. …
….
11. Extracting Object Features
1. Part-of-speech tagging:
Features are nouns and noun phrases
2. Frequent features generation
Association mining to generate candidate features
Feature pruning
3. Infrequent feature generation
Opinion words extraction
Finding infrequent features using opinion words
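Steps 1 and 2 above can be sketched in a simplified form: take the nouns from tagged sentences and keep those that occur in enough sentences. This is an illustration under stated assumptions, not the association mining from the slides. It counts single nouns only, the function name `frequent_features` and the `min_support` threshold are hypothetical, and the input is hand-tagged instead of coming from a real part-of-speech tagger.

```python
from collections import Counter

def frequent_features(tagged_sentences, min_support=0.5):
    """Return nouns that occur in at least min_support of the sentences.

    tagged_sentences: list of sentences, each a list of (word, tag) pairs
    with Penn Treebank style tags, where noun tags start with "NN".
    """
    counts = Counter()
    for sentence in tagged_sentences:
        nouns = {w.lower() for w, tag in sentence if tag.startswith("NN")}
        counts.update(nouns)  # count each noun once per sentence
    n = len(tagged_sentences)
    return {noun for noun, c in counts.items() if c / n >= min_support}

# Hand-tagged toy input, invented for illustration.
tagged = [
    [("The", "DT"), ("pictures", "NNS"), ("are", "VBP"), ("amazing", "JJ")],
    [("Great", "JJ"), ("pictures", "NNS"), ("and", "CC"), ("battery", "NN")],
    [("The", "DT"), ("battery", "NN"), ("is", "VBZ"), ("weak", "JJ")],
    [("I", "PRP"), ("went", "VBD"), ("to", "TO"), ("Italy", "NNP")],
]
features = frequent_features(tagged, min_support=0.5)
```

Here “Italy” is a noun but appears in only one of four sentences, so it is pruned as a candidate feature.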
12. Identifying Orientation of Opinion
Sentence
Use the dominant orientation of the opinion words as the sentence
orientation
If positive opinion words prevail, the opinion sentence is regarded as
positive, and vice versa
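The dominant-orientation rule can be sketched as a simple count of opinion words. The word lists `POSITIVE` and `NEGATIVE` are tiny hypothetical examples, and treating a tie as neutral is an assumption, since the slide does not say how ties are resolved.

```python
# Tiny illustrative word lists; a real system would use a full opinion lexicon.
POSITIVE = {"amazing", "great", "good"}
NEGATIVE = {"hazy", "blurry", "bad"}

def sentence_orientation(sentence):
    """Classify a sentence by the dominant orientation of its opinion words."""
    words = [w.strip(".,!?'\"").lower() for w in sentence.split()]
    positives = sum(w in POSITIVE for w in words)
    negatives = sum(w in NEGATIVE for w in words)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"  # assumption: ties and sentences without opinion words

example = sentence_orientation("The pictures coming out of this camera are amazing.")
```

A plain count like this ignores negation (“not good”) and context-dependent words, which is one reason the feature level needs extra care.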
13. Feature-based Summary
Review:
GREAT Camera., Jun 3, 2004
Reviewer: jprice174 from Atlanta, Ga.
I did a lot of research last year before I
bought this camera... It kinda hurt to leave behind
my beloved nikon 35mm SLR, but I was going to
Italy, and I needed something smaller, and digital.
The pictures coming out of this camera are
amazing. The 'auto' feature takes great pictures
most of the time. And with digital, you're not
wasting film if the picture doesn't come out. …
….
Feature Based Summary:
Feature1: picture
Positive: 12
The pictures coming out of this camera are amazing.
Overall this is a good camera with a really good
picture clarity.
…
Negative: 2
The pictures come out hazy if your hands shake
even for a moment during the entire process of
taking a picture.
Focusing on a display rack about 20 feet away in
a brightly lit room during day time, pictures
produced by this camera were blurry and in a
shade of orange.
Feature2: battery life
…
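A summary of this shape can be produced by grouping classified opinion sentences by feature and orientation. The sketch below assumes the sentences have already been assigned a feature and an orientation. The function names `summarize` and `render` are hypothetical, and the input contains only three of the sentences shown on the slide, so the counts differ from the 12 and 2 above.

```python
from collections import defaultdict

def summarize(opinion_sentences):
    """Group (feature, orientation, sentence) triples by feature and orientation."""
    summary = defaultdict(lambda: {"positive": [], "negative": []})
    for feature, orientation, sentence in opinion_sentences:
        if orientation in ("positive", "negative"):  # neutral is not listed
            summary[feature][orientation].append(sentence)
    return dict(summary)

def render(summary):
    """Format the summary as plain text: a count per orientation, then sentences."""
    lines = []
    for feature, groups in summary.items():
        lines.append(f"Feature: {feature}")
        for orientation in ("positive", "negative"):
            lines.append(f"  {orientation.capitalize()}: {len(groups[orientation])}")
            lines.extend(f"    {s}" for s in groups[orientation])
    return "\n".join(lines)

triples = [
    ("picture", "positive",
     "The pictures coming out of this camera are amazing."),
    ("picture", "positive",
     "Overall this is a good camera with a really good picture clarity."),
    ("picture", "negative",
     "The pictures come out hazy if your hands shake even for a moment "
     "during the entire process of taking a picture."),
]
summary = summarize(triples)
report = render(summary)
```

Grouping synonyms such as “photo” and “picture” under one feature name would have to happen before this step.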
14. Opinion Spam
Reviews contain rich user opinions on products and services,
which can influence the purchase decisions of other users
Generally three types of spam reviews:
Untruthful opinions
Reviews on brands only
Non-Reviews
15. Tools for Sentiment Analysis [1/2]
APIs
Evri – semantic search engine, very powerful API
OpenDover – Java based webservice
Blogosphere/Twittersphere
RankSpeed – search by criteria
Twittratr – simple search tool (keyword based)
TwitterSentiment – project from Stanford University,
classifiers from machine learning algorithms, transparent
16. Tools for Sentiment Analysis [2/2]
Newspaper
Newssift – sentiment search tool on newspapers (by Financial
Times)
Applications
LingPipe – Java tool
Radian6 – commercial social media monitoring application
RapidMiner – open-source machine learning and data mining
tool (Community Edition)