A holistic lexicon based approach to opinion mining


Published in: Technology, Business

  1. Xiaowen Ding, Bing Liu and Philip Yu
     Presenter: Quang Nguyen
     Date: 2010.10.18
     Saltlux Vietnam Development Center
  2. Feature-based opinion mining tasks:
     • Task 1: Identify and extract the object features F that an opinion holder (e.g., a reviewer) has commented on.
     • Task 2: Determine whether the opinions on the features F are positive, negative or neutral.
     • Task 3: Group feature synonyms.
     • Produce a feature-based opinion summary of multiple reviews.
     This paper focuses on Task 2, assuming that the features have already been discovered.
  3. Opinion words
     • Positive: beautiful, wonderful, good, amazing
     • Negative: bad, poor, terrible, “cost someone an arm and a leg” (idiom)
     One effective approach is to use an opinion lexicon, i.e., a list of opinion words:
     • Identify all opinion words in a sentence.
     • Aggregate these words to give the final opinion on each feature.
  4. Dictionary-based approaches:
     • Start from a set of seed opinion words.
     • Use WordNet’s hierarchy and synsets to acquire more opinion words.
     Corpus-based approaches: extract opinion words from large corpora using syntactic rules and co-occurrence patterns.
     These approaches do not deal well with context-dependent opinion words!
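The dictionary-based idea can be sketched as a breadth-first walk over a synonym graph. This is only an illustration: the `SYNONYMS` table is a hypothetical stand-in for WordNet synsets, and `expand_lexicon` is an assumed name, not something taken from the paper.

```python
from collections import deque

# Hypothetical toy synonym graph standing in for WordNet synsets.
SYNONYMS = {
    "good": ["fine", "great"],
    "great": ["wonderful"],
    "bad": ["poor"],
    "poor": ["terrible"],
}

def expand_lexicon(seeds, synonyms):
    """Propagate each seed's orientation (+1 or -1) to its synonyms, breadth-first."""
    lexicon = dict(seeds)
    queue = deque(seeds)  # iterating a dict yields its keys
    while queue:
        word = queue.popleft()
        for neighbour in synonyms.get(word, []):
            if neighbour not in lexicon:  # the first label assigned wins
                lexicon[neighbour] = lexicon[word]
                queue.append(neighbour)
    return lexicon

lexicon = expand_lexicon({"good": 1, "bad": -1}, SYNONYMS)
```

A word such as "small" never enters the lexicon this way, because its orientation depends on the feature it describes, which is the gap the slide points out.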
  5. Improve lexicon-based approaches by handling context-dependent opinion words:
     • Negative: “The bedroom is very small”
     • Positive: “The Nokia N3100 is so small that it can be put in any pocket”
     Propose a function for aggregating multiple opinion words in the same sentence.
     Consider both explicit and implicit opinions.
  6. Intra-sentence conjunction rule; pseudo intra-sentence conjunction rule; inter-sentence conjunction rule
  7. Intra-sentence conjunction rule: the opinions on both sides of “and” should be the same.
     • E.g., “This camera takes great pictures and has a long battery life.”
     A reviewer is not likely to say:
     • “This camera takes great pictures and has a short battery life.”
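A minimal sketch of how the “and” rule could assign an orientation to a context-dependent word such as “long”. The lexicon, the function names and the simple split on “and” are all assumptions made for illustration; the paper's actual procedure may differ.

```python
import re

LEXICON = {"great": 1, "bad": -1}  # hypothetical tiny lexicon

def clause_orientation(clause, lexicon):
    """Return the sign (+1, -1 or 0) of the summed orientations in a clause."""
    words = re.findall(r"[a-z']+", clause.lower())
    total = sum(lexicon.get(w, 0) for w in words)
    return (total > 0) - (total < 0)

def infer_from_and(sentence, word, feature, lexicon):
    """If `word` and `feature` sit on one side of "and" and the other side has
    a known orientation, return that orientation; otherwise return None."""
    parts = re.split(r"\band\b", sentence.lower(), maxsplit=1)
    if len(parts) != 2:
        return None  # no explicit "and" in the sentence
    left, right = parts
    for here, other in ((left, right), (right, left)):
        if word in here.split() and feature in here:
            return clause_orientation(other, lexicon) or None
    return None

sentence = "This camera takes great pictures and has a long battery life"
orientation = infer_from_and(sentence, "long", "battery life", LEXICON)
```

Here “great” on the left side of “and” lends its positive orientation to the pair (“long”, “battery life”).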
  8. Pseudo intra-sentence conjunction rule: sometimes a reviewer does not use an explicit “and”.
     • The opinion stays the same within a sentence, unless there is a “but”-like clause.
     • E.g., “The camera has a long battery life, which is great”
  9. Inter-sentence conjunction rule: people usually express the same opinion across sentences,
     • unless there is an indication of an opinion change, using words such as “but” and “however”.
     • E.g., “The picture quality is amazing. The battery life is long.”
     It is not so natural to say:
     • “The picture quality is amazing. The battery life is short.”
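The cross-sentence rule can be sketched as carrying the previous sentence's orientation forward. The lexicon and helper names below are hypothetical, and flipping the sign after a “but”-like word is an assumption about how an opinion change would be handled, not a detail stated on the slide.

```python
import re

LEXICON = {"amazing": 1, "terrible": -1}  # hypothetical tiny lexicon
CHANGE_WORDS = {"but", "however"}

def tokens(sentence):
    return re.findall(r"[a-z']+", sentence.lower())

def known_orientation(sentence, lexicon):
    """Return the sign of the summed orientations of known opinion words."""
    total = sum(lexicon.get(w, 0) for w in tokens(sentence))
    return (total > 0) - (total < 0)

def orientations_across_sentences(sentences, lexicon):
    """Label each sentence; when no known opinion word is present, borrow the
    previous label, flipping it if a "but"-like word appears."""
    labels, previous = [], 0
    for sentence in sentences:
        current = known_orientation(sentence, lexicon)
        if current == 0:
            changed = CHANGE_WORDS & set(tokens(sentence))
            current = -previous if changed else previous
        labels.append(current)
        previous = current
    return labels

same = orientations_across_sentences(
    ["The picture quality is amazing.", "The battery life is long."], LEXICON)
changed = orientations_across_sentences(
    ["The picture quality is amazing.", "However, the battery life is short."], LEXICON)
```

In the first pair “long” inherits the positive label; in the second, “however” signals a change, so “short” is read as negative.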
  10. An opinion lexicon alone is far from sufficient; some constructs need special handling:
      • Negation rule and “but” rule
      • Non-negation that contains a negation word, e.g., “I like this camera not just because it is beautiful”
      • A “but” that does not signal contrast, e.g., “I not only like the picture quality of this camera, but also its size”
      • …
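A rough sketch of negation handling that skips the non-negation phrases mentioned above (“not just”, “not only”). The word lists, the three-token window and the function name are assumptions chosen for illustration.

```python
import re

LEXICON = {"like": 1, "beautiful": 1, "good": 1, "bad": -1}  # hypothetical
NEGATIONS = {"not", "no", "never"}
NON_NEGATION_FOLLOWERS = {"just", "only"}  # as in "not just", "not only"

def word_orientations(sentence, lexicon, window=3):
    """Return (word, orientation) pairs, flipping the orientation when a true
    negation appears within `window` tokens before the opinion word."""
    toks = re.findall(r"[a-z']+", sentence.lower())
    result = []
    for i, tok in enumerate(toks):
        if tok not in lexicon:
            continue
        orientation = lexicon[tok]
        for j in range(max(0, i - window), i):
            is_negation = toks[j] in NEGATIONS
            is_idiom = toks[j + 1] in NON_NEGATION_FOLLOWERS  # j + 1 <= i is always valid
            if is_negation and not is_idiom:
                orientation = -orientation
                break
        result.append((tok, orientation))
    return result

negated = word_orientations("This camera is not good", LEXICON)
not_only = word_orientations(
    "I not only like the picture quality of this camera, but also its size", LEXICON)
```

The second sentence keeps “like” positive because “not only” is treated as a fixed phrase, not as a negation.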
  11. An implicit feature is determined through adjectives that act as implicit feature indicators:
      • E.g., “This camera is very small”: “small” is an indicator for “size”
      • E.g., “This camera is very heavy”: “heavy” is an indicator for “weight”
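The indicator idea amounts to a lookup from adjective to feature. The two pairs below come from the slide; the table name and the function are illustrative assumptions, and a real system would need a much larger, domain-specific table.

```python
import re

# The two indicator pairs given on the slide.
IMPLICIT_FEATURE_INDICATORS = {"small": "size", "heavy": "weight"}

def implicit_features(sentence, indicators=IMPLICIT_FEATURE_INDICATORS):
    """Return the features implied by indicator adjectives in the sentence."""
    toks = re.findall(r"[a-z']+", sentence.lower())
    return [indicators[t] for t in toks if t in indicators]

features = implicit_features("This camera is very small")
```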
  12. An object O is an entity, which can be a product, person, event, organization, or topic.
      An object O is represented with a finite set of features, F = {f1, f2, …, fn}.
      • Each feature fi in F can be expressed with a finite set of words or phrases Wi, which are synonyms.
      Model of a review: an opinion holder j comments on a subset of the features Sj ⊆ F of object O.
      • For each feature fk ∈ Sj that j comments on, he/she chooses a word or phrase from Wk to describe the feature, and expresses a positive, negative or neutral opinion on fk.
  13. Input: a pair (f, s), where f is a product feature and s is a sentence that contains f.
      Output: whether the opinion on f in s is positive, negative or neutral.
      Notation:
      • wi: an opinion word
      • V: the set of all opinion words
      • dis(wi, f): the distance between wi and f
      • SO: the semantic orientation of wi (+1, -1, 0)
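One natural reading of this notation is a distance-weighted sum: each opinion word wi in V contributes SO(wi) / dis(wi, f), and the sign of the total gives the output. The formula itself is not preserved in this transcript, so the sketch below is an assumption, as are the lexicon, the token-count distance and the restriction to single-word features.

```python
import re

# Hypothetical opinion word set V with SO values.
LEXICON = {"great": 1, "amazing": 1, "poor": -1, "terrible": -1}

def opinion_on_feature(feature, sentence, lexicon):
    """Sum SO(w) / dis(w, f) over the opinion words w in the sentence and map
    the sign of the total to 'pos', 'neg' or 'neut'."""
    toks = re.findall(r"[a-z']+", sentence.lower())
    if feature not in toks:
        raise ValueError("the sentence must contain the feature")
    f_index = toks.index(feature)
    score = 0.0
    for i, tok in enumerate(toks):
        if tok in lexicon and i != f_index:
            score += lexicon[tok] / abs(i - f_index)  # dis = token distance
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neut"

sentence = "The screen is great but the battery is terrible"
screen = opinion_on_feature("screen", sentence, LEXICON)
battery = opinion_on_feature("battery", sentence, LEXICON)
```

Dividing by the distance lets the nearest opinion word dominate, so “screen” comes out positive and “battery” negative even though both words occur in the same sentence.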
  16. System             Precision  Recall  F-score
      FBS                0.93       0.76    0.83
      OPINE              0.86       0.89    0.87
      Opinion Observer   0.92       0.91    0.91
      FBS: M. Hu and B. Liu. Mining and summarizing customer reviews. KDD’04, 2004.
      OPINE: A-M. Popescu and O. Etzioni. Extracting Product Features and Opinions from Reviews. EMNLP-05, 2005.
      Opinion Observer: this paper.
  17. Xiaowen Ding, Bing Liu, and Philip S. Yu. A Holistic Lexicon-Based Approach to Opinion Mining. Proceedings of the International Conference on Web Search and Web Data Mining, USA, 2008.