Unsupervised Sentiment Analysis

What is sentiment analysis? How can it be used in business? What is the potential of UNSUPERVISED sentiment analysis?

Published in: Technology, Education

  1. Taras Zagibalov, T.Zagibalov@sussex.ac.uk. PhD candidate at the University of Sussex, Brighton, UK. Ford Foundation International Fellowship fellow. Natural languages: Russian, English, Mandarin. Programming: Java, Prolog. Taras Zagibalov© 2009
  2. Unsupervised Sentiment Analysis: Listening to the Word of Mouth. What is it? How does it work? How can it be used?
  3. Outline • What is Sentiment Analysis • Application of Sentiment Analysis • Who's in the business? • Unsolved Problems • Why unsupervised? • Is it effective?
  4. Sentiment Analysis: Sentiment Analysis (or Opinion Mining) is a relatively new research area in Information Retrieval and Natural Language Processing, which is concerned not with a document's topic, but with what opinion it expresses.
  5. What is Sentiment Analysis • Subjectivity Classification • Orientation Detection • Opinion Holder and Target Extraction • Feature-Based Opinion Mining
  6. What is Sentiment Analysis • Subjectivity Classification • Orientation Detection • Opinion Holder and Target Extraction • "Feature-Based Opinion Mining" Example: A car has four wheels. vs. It's a good car.
  7. What is Sentiment Analysis • Subjectivity Classification • Orientation Detection • Opinion Holder and Target Extraction • "Feature-Based Opinion Mining" Example: It's a good car. vs. It's a bad car.
  8. What is Sentiment Analysis • Subjectivity Classification • Orientation Detection • Opinion Holder and Target Extraction • "Feature-Based Opinion Mining" Example: Ian says it's a good car.
  9. What is Sentiment Analysis • Subjectivity Classification • Orientation Detection • Opinion Holder and Target Extraction • "Feature-Based Opinion Mining" Example: The wheels are good, but all the rest is just unusable.
  10. Application of Sentiment Analysis: Where can opinions be found? • News feeds (Google, Yahoo, Reuters etc.) • Blogs (LJ, Technorati etc.) • Social networks (Twitter, Facebook...) • Customer review sites (Amazon, eBay...)
  11. Application of Sentiment Analysis • Marketing Research • Product Reviews Analysis • Brand Tracking • Influence Analysis • Public Opinion Tracking • Customer Correspondence Analysis
  12. Application of Sentiment Analysis: What questions can a sentiment analysis system answer? • What do customers think about our product? • Which of our customers are unsatisfied? • What features of our product are the worst? • Who influences our image, and how? • What is the public reaction to (some event or some person)? • and so on...
  13. Example 1: On-line monitoring (blogs, mass media) of product promotion campaigns. [Chart: campaigns A and B, vertical axis 0 to 10] Promotional campaign A is successful, as most on-line reviews are positive. Promotional campaign B needs immediate action, as most on-line reviews are negative.
  14. Example 2: A new product release as mirrored in customer on-line reviews. [Chart: points A and B marked, vertical axis 0 to 8] (A) The product release and ad campaign are quite effective, as public opinion is mostly positive. But the sentiment changes as sales grow (B): more people are unsatisfied, and this needs to be analysed (probably some quality-related issues).
  15. Example 3: Influence analysis by tracking blogs. [Chart: points A and B marked, vertical axis 0 to 9] (A) A negative review in a newspaper does not affect the generally positive sentiment towards a product, whereas a positive review in a magazine (B) is quite effective.
  16. Who's in the business? • BrandWatch • Istrategy Labs • Cataphora • Scoutlabs • Lexalytics • Infonic • Attensity • Open Dover • ...
  17. What's the technology? • Machine Learning: manually tagged training data sets; user-tagged training data sets (“thumbs up” and “five stars”) • Knowledge-based Approaches: manually created word-lists; generic word-lists (like SentiWordNet or sentiment vocabularies) • Manual Processing
  18. Unsolved Problems • Domain-dependency • Unpredictable evaluation language • Language-dependency
  19. Unsolved Problems • Domain-dependency • Unpredictable evaluation language • Language-dependency Example: "The plot was unpredictable" vs. "the steering was unpredictable"
  20. Unsolved Problems • Domain-dependency • Unpredictable evaluation language • Language-dependency Example: “good” == “bad” in eBay; “3G” (technology for mobile phones) == “good”
  21. Unsolved Problems • Domain-dependency • Unpredictable evaluation language • Language-dependency Example: culture-related issues (“good” <> “好”); language-related issues (SVO vs SOV)
  22. Why unsupervised? • Cross-Domain applicability • Multi-Lingual applicability • Cheap Start
  23. Why unsupervised? • Cross-Domain applicability • Multi-Lingual applicability • Cheap Start. No expensive human annotation is needed: all information is found in the documents that need to be processed. All extracted information is domain-specific and free from the noise produced by “generic” word lists and wordnets.
  24. Why unsupervised? • Cross-Domain applicability • Multi-Lingual applicability • Cheap Start. Unsupervised systems, being data-independent, can be easily ported to almost any language.
  25. Why unsupervised? • Cross-Domain applicability • Multi-Lingual applicability • Cheap Start. Once an unsupervised system is developed, it can be applied to new data almost immediately, saving the costs of labelling data and/or writing rules (word-lists).
  26. Is it effective? • The unsupervised approach was tested on corpora in different languages (English, Simplified Chinese, Traditional Chinese, Japanese) and in many cases compared reasonably well with supervised methods. • Results were presented at major international scientific conferences (ACL, IJCNLP, COLING, NTCIR).
  27. Is it effective? The approach can be easily combined with supervised techniques: • An unsupervised system can provide initial data for in-depth research of the data (building up word-lists and rule-sets). • Automatically extracted information can be used for training machine learning systems.
  28. Conclusion • Unsupervised Sentiment Analysis is an efficient instrument for keeping track of public opinion in different domains and languages. • It can be used as an entry point to a new domain or language. • It can be combined with supervised methods to increase accuracy.
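
To make the knowledge-based approach from the "What's the technology?" slide and the review tallies of Example 1 more concrete, here is a minimal Python sketch. It is an illustration only, not the unsupervised method the talk argues for: the word lists `POSITIVE` and `NEGATIVE` and the function names `orientation` and `positive_share` are hypothetical and made up for this example, and a real system would need far more than word counting.

```python
import re

# Hypothetical, hand-made word lists (assumption: not taken from the slides
# or from any real lexicon such as SentiWordNet).
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "unusable"}


def orientation(text):
    """Label a text 'positive', 'negative' or 'neutral' by counting listed words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def positive_share(reviews):
    """Fraction of opinionated reviews that are positive, or None if there are none."""
    labels = [orientation(r) for r in reviews]
    opinionated = [label for label in labels if label != "neutral"]
    if not opinionated:
        return None
    return opinionated.count("positive") / len(opinionated)


if __name__ == "__main__":
    for sentence in [
        "A car has four wheels.",
        "It's a good car.",
        "It's a bad car.",
        "The wheels are good, but all the rest is just unusable.",
        "The steering was unpredictable.",
    ]:
        print(orientation(sentence), "|", sentence)
```

With these particular lists, the mixed sentence about the wheels cancels out to "neutral", which is the case feature-based opinion mining is meant to handle, and "unpredictable" is missed entirely because a generic list cannot know whether it is praise (a plot) or criticism (steering). Both gaps correspond to the unsolved problems listed in the slides.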
