A holistic lexicon based approach to opinion mining

Transcript

  • 1. Xiaowen Ding, Bing Liu and Philip Yu. Presenter: Quang Nguyen. Date: 2010.10.18. Saltlux Vietnam Development Center
  • 2. Feature-based Opinion Mining Tasks
      • Task 1: Identify and extract object features F that have been commented on by an opinion holder (e.g., a reviewer).
      • Task 2: Determine whether the opinions on the features F are positive, negative or neutral.
      • Task 3: Group feature synonyms.
      • Produce a feature-based opinion summary of multiple reviews.
      This paper focuses on Task 2, assuming that the features have already been discovered.
  • 3. Opinion Words • Positive: beautiful, wonderful, good, amazing. • Negative: bad, poor, terrible, cost someone an arm and a leg (idiom). One effective approach is to use an opinion lexicon, i.e., a list of opinion words: • Identify all opinion words in a sentence. • Aggregate these words to give the final opinion on each feature.
  • 4. Dictionary-based approaches: • Start from a set of seed opinion words. • Use WordNet’s hierarchy and synsets to acquire more opinion words. Corpus-based approaches: extract opinion words from large corpora using syntactic rules and co-occurrence patterns. Neither approach deals well with context-dependent words!
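The dictionary-based bootstrapping on slide 4 can be sketched as follows. This is a minimal sketch, not the procedure of any particular system: it assumes a tiny hand-made table (`SYNONYMS`, `ANTONYMS`) standing in for WordNet, and the function name `expand_lexicon` and all table entries are hypothetical.

```python
from collections import deque

# Toy stand-in for WordNet relations (hypothetical entries, not real WordNet data).
SYNONYMS = {
    "good": ["great", "fine"],
    "great": ["wonderful"],
    "bad": ["poor"],
    "poor": ["terrible"],
}
ANTONYMS = {
    "good": ["bad"],
}


def expand_lexicon(seeds):
    """Grow a {word: orientation} lexicon from seed words by breadth-first search.

    Synonyms inherit the orientation of the word they were reached from;
    antonyms receive the opposite orientation.
    """
    lexicon = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        orientation = lexicon[word]
        neighbours = [(w, orientation) for w in SYNONYMS.get(word, [])]
        neighbours += [(w, -orientation) for w in ANTONYMS.get(word, [])]
        for neighbour, neighbour_orientation in neighbours:
            if neighbour not in lexicon:  # the first label assigned wins
                lexicon[neighbour] = neighbour_orientation
                queue.append(neighbour)
    return lexicon


lexicon = expand_lexicon({"good": +1})
```

In this toy table a word such as "small" is never reached from the seed, so it stays unlabeled. Even if it were reached, a single fixed label could not be right for both the bedroom and the phone, which is the limitation the slide points out.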
  • 5. Improve lexicon-based approaches by handling context-dependent opinion words • Negative: “The bedroom is very small” • Positive: “The Nokia N3100 is so small as to be put in any pockets”. Propose a function for aggregating multiple opinion words in the same sentence. Consider explicit and implicit opinions.
  • 6. Intra-sentence conjunction rule; pseudo intra-sentence conjunction rule; inter-sentence conjunction rule
  • 7. Opinions on both sides of “and” should be the same • E.g., “This camera takes great pictures and has a long battery life”. One is not likely to say: • “This camera takes great pictures and has a short battery life.”
  • 8. Sometimes an explicit conjunction “and” is not used. • The opinion stays the same within a sentence, unless there is a “but”-like clause • E.g., “The camera has a long battery life, which is great”
  • 9. People usually express the same opinion across sentences • unless there is an indication of opinion change using words such as “but” and “however” • E.g., “The picture quality is amazing. The battery life is long”. It is not so natural to say: • “The picture quality is amazing. The battery life is short”
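The conjunction rules on slides 7 to 9 can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's algorithm: the tiny `LEXICON`, the `CONTRAST_WORDS` set and the function names are hypothetical, and when a "but"-like word is present the sketch simply makes no inference.

```python
import re

# Context-independent opinion words with known orientation (illustrative values).
LEXICON = {"great": +1, "amazing": +1, "terrible": -1, "poor": -1}
CONTRAST_WORDS = {"but", "however"}


def tokens(text):
    """Lowercase word tokens of a text."""
    return re.findall(r"[a-z']+", text.lower())


def known_orientation(text):
    """Sign (+1, -1 or 0) of the summed orientations of known opinion words."""
    total = sum(LEXICON.get(w, 0) for w in tokens(text))
    return (total > 0) - (total < 0)


def intra_sentence(sentence, context_word):
    """Borrow the orientation expressed elsewhere in the same sentence.

    Covers both an explicit "and" and the pseudo case without one;
    returns 0 when the sentence contains a "but"-like word.
    """
    words = tokens(sentence)
    if context_word not in words or CONTRAST_WORDS & set(words):
        return 0
    return known_orientation(sentence)


def inter_sentence(sentences, index, context_word):
    """Fall back to the neighbouring sentences when the sentence gives no evidence."""
    current = tokens(sentences[index])
    if context_word not in current:
        return 0
    result = intra_sentence(sentences[index], context_word)
    if result != 0:
        return result
    # Previous sentence: usable unless the current sentence signals a change.
    if index > 0 and not CONTRAST_WORDS & set(current):
        previous = known_orientation(sentences[index - 1])
        if previous != 0:
            return previous
    # Next sentence: usable unless it signals a change itself.
    if index + 1 < len(sentences):
        if not CONTRAST_WORDS & set(tokens(sentences[index + 1])):
            return known_orientation(sentences[index + 1])
    return 0


same = intra_sentence("This camera takes great pictures and has a long battery life", "long")
across = inter_sentence(["The picture quality is amazing.", "The battery life is long."], 1, "long")
```

Here "long" is not in the lexicon, yet both calls label it positive because the surrounding text is positive and nothing signals a change of opinion.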
  • 10. An opinion lexicon alone is far from sufficient. Some constructions need special handling: • Negation rule and “but” rule • Non-negation that contains a negative word, e.g., “I like this camera not just because it is beautiful” • Non-contrary sentences that contain a “but”, e.g., “I not only like the picture quality of this camera, but also its size” • …
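The negation half of slide 10 can be illustrated with the sketch below. It is a hedged example rather than the paper's rule set: the word lists, the look-back `window` of three tokens and the function name `word_orientations` are assumptions, and the "but" handling is left out.

```python
import re

# Illustrative lexicon and negation words (assumed, not taken from the paper).
LEXICON = {"like": +1, "beautiful": +1, "good": +1, "bad": -1}
NEGATIONS = {"not", "no", "never"}
# Phrases that contain a negation word but do not negate (slide 10's examples).
NON_NEGATING = {("not", "just"), ("not", "only")}


def word_orientations(sentence, window=3):
    """Return [(opinion_word, orientation)], flipping the sign after a true negation.

    A negation word counts only if it appears within `window` tokens before
    the opinion word and is not the start of a non-negating phrase.
    """
    words = re.findall(r"[a-z']+", sentence.lower())
    results = []
    for i, word in enumerate(words):
        if word not in LEXICON:
            continue
        orientation = LEXICON[word]
        for j in range(max(0, i - window), i):
            is_negation = words[j] in NEGATIONS
            is_exempt = (words[j], words[j + 1]) in NON_NEGATING
            if is_negation and not is_exempt:
                orientation = -orientation
                break
        results.append((word, orientation))
    return results


negated = word_orientations("I do not like this camera")
not_negated = word_orientations("I not only like the picture quality of this camera, but also its size")
```

The first sentence flips "like" to negative, while "not only" in the second is recognised as a non-negating phrase and leaves "like" positive.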
  • 11. An implicit feature is determined through an adjective (an implicit feature indicator) • E.g., “This camera is very small”: “small” is an indicator for “size” • E.g., “This camera is very heavy”: “heavy” is an indicator for “weight”
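A lookup table is one simple way to picture implicit feature indicators. The sketch below assumes such a table; `FEATURE_INDICATORS` and `implicit_features` are hypothetical names, and the only entries are the two examples from slide 11.

```python
import re

# Indicator adjective -> feature it implies (entries from the slide's examples).
FEATURE_INDICATORS = {"small": "size", "heavy": "weight"}


def implicit_features(sentence):
    """List the features implied by indicator adjectives, in order of appearance."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [FEATURE_INDICATORS[w] for w in words if w in FEATURE_INDICATORS]


features = implicit_features("This camera is very small")
```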
  • 12. An object O is an entity, which can be a product, person, event, organization, or topic.
      An object O is represented with a finite set of features, F = {f1, f2, …, fn}.
      • Each feature fi in F can be expressed with a finite set of words or phrases Wi, which are synonyms.
      Model of a review: an opinion holder j comments on a subset of the features Sj ⊆ F of object O.
      • For each feature fk ∈ Sj that j comments on, he/she chooses a word or phrase from Wk to describe the feature, and expresses a positive, negative or neutral opinion on fk.
  • 13. Input: a pair (f, s), where f is a product feature and s is a sentence that contains f.
      Output: whether the opinion on f in s is pos, neg, or neut.
      • wi: an opinion word
      • V: the set of all opinion words
      • dis(wi, f): the distance between wi and f
      • SO: the semantic orientation of wi (+1, -1, 0)
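Using the symbols from slide 13, one way to combine the opinion words is a distance-weighted sum, score(f) = Σ SO(wi) / dis(wi, f), taken over the opinion words wi in s that belong to V. That form is our reading of the aggregation function in the cited paper and is not spelled out in this transcript. The sketch below assumes it, measures dis as a gap in token positions, handles only single-word features, and uses a toy lexicon; the function name `opinion_on_feature` is hypothetical.

```python
import re

# V: toy opinion lexicon mapping each opinion word to its semantic orientation SO.
V = {"great": +1, "amazing": +1, "good": +1, "bad": -1, "poor": -1, "terrible": -1}


def opinion_on_feature(f, s):
    """Classify the opinion on the single-word feature `f` in sentence `s`.

    Each opinion word contributes SO / dis, so words closer to the feature
    weigh more. Returns "pos", "neg" or "neut".
    """
    words = re.findall(r"[a-z']+", s.lower())
    if f not in words:
        raise ValueError("sentence does not contain the feature")
    f_pos = words.index(f)
    score = 0.0
    for i, w in enumerate(words):
        if w in V and i != f_pos:
            score += V[w] / abs(i - f_pos)  # dis assumed to be the token gap
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neut"


sentence = "The picture is great but the battery is terrible"
picture_opinion = opinion_on_feature("picture", sentence)
battery_opinion = opinion_on_feature("battery", sentence)
```

For "picture" the nearby "great" (distance 2) outweighs the distant "terrible" (distance 7), and for "battery" the weights are reversed, so the same sentence yields a different opinion per feature.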
  • 16. System             Precision   Recall   F-Score
        FBS                0.93        0.76     0.83
        OPINE              0.86        0.89     0.87
        Opinion Observer   0.92        0.91     0.91
        FBS: M. Hu and B. Liu. Mining and summarizing customer reviews. KDD’04, 2004.
        OPINE: A-M. Popescu and O. Etzioni. Extracting Product Features and Opinions from Reviews. EMNLP-05, 2005.
        Opinion Observer: this paper.
  • 17. Xiaowen Ding, Bing Liu, and Philip S. Yu. A Holistic Lexicon-Based Approach to Opinion Mining. Proceedings of the International Conference on Web Search and Web Data Mining, USA, 2008.