
OEG at TASS 2017: Spanish Sentiment Analysis of tweets at document level


Participation of OEG at Task 1 of the workshop TASS - 19.09.2017



  1. 1. OEG at TASS 2017: Spanish Sentiment Analysis of tweets at document level
     María Navas-Loro, Víctor Rodríguez-Doncel
     Universidad Politécnica de Madrid
     mnavas@fi.upm.es
     TASS - SEPLN, 19th September 2017
  2. 2. Outline
     1. Background
     2. Our approach
        1. Labeling Strategy
        2. Linguistic Considerations
        3. Means
     3. Final systems
     4. Conclusions
  3. 3. BACKGROUND: Why are we here?
  4. 4. LPS BIGGER project
  5. 5. Analysis proposal. Each post is analysed along four dimensions:
     • Purchase Funnel: Awareness, Evaluation, Purchase, Postpurchase, Review
     • Meaningful Brands: Marketplace, Personal Wellbeing, Collective Wellbeing
     • Marketing Mix: Product, Price, Promotion, Place
     • Sentiment Analysis: Hate / Love, Satisfaction / Dissatisfaction, Happiness / Sadness, Trust / Fear
  6. 6. Analysis proposal. When? Purchase Funnel: Awareness, Evaluation, Purchase, Postpurchase, Review
  7. 7. Analysis proposal. Where? Meaningful Brands: Marketplace, Personal Wellbeing, Collective Wellbeing
  8. 8. Analysis proposal. What? Marketing Mix: Product, Price, Promotion, Place
  9. 9. Analysis proposal. Emotion? Sentiment Analysis: Hate / Love, Satisfaction / Dissatisfaction, Happiness / Sadness, Trust / Fear
  10. 10. Analysis proposal (Sentiment Analysis highlighted): Hate / Love, Satisfaction / Dissatisfaction, Happiness / Sadness, Trust / Fear
  11. 11. LABELING STRATEGY (Our approach)
  12. 12. Labeling Strategy
      • Sentiment Analysis is usually a binary classification task; our case is more complex (P, N, NEU and NONE).
      • NONE is just 14% of the InterTASS corpus.
      • We focused on testing different strategies:
        1) P, N, Neu, None → Max
        2) P, N → F(x), with
           \[ F(x) = \begin{cases} \text{NEU} & P - N < t_d \\ \text{NEU} & 0 < t_{n}^{\min} < P, N < t_{n}^{\max} \\ \text{NONE} & 0 \le P, N \le t_{n}^{\min} \end{cases} \]
        3) Emotion map: satisfaction, trust, happiness, love → P; dissatisfaction, fear, sadness, hate → N
        4) N, Neu, P → G(x), with
           \[ G(x) = \begin{cases} \text{N} & 0 \le X < t_{-} \\ \text{NEU} & t_{-} \le X \le t_{+} \\ \text{P} & t_{+} < X \le 1 \end{cases} \]
        5) Two stages: 1. Is there any sentiment? (if not, None) 2. If so, which one? (P, N, Neu)
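The threshold rules F(x) and G(x) on slide 12 can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation. The default thresholds are the values listed for laOEG on slide 19. Three points are assumptions not stated on the slides: the difference between P and N is taken in absolute value, the NONE rule is checked before the NEU rules, and when no rule fires the stronger of P and N is returned. The function names are hypothetical, and no values are given in the talk for `t_minus` and `t_plus`.

```python
def label_from_p_n(p, n, t_d=0.10, t_n_min=0.10, t_n_max=0.15):
    """Strategy 2 sketch: map the P and N scores to one of four labels."""
    # NONE: both scores are at or below the lower threshold.
    # Assumption: checked first, otherwise the difference rule would absorb it.
    if p <= t_n_min and n <= t_n_min:
        return "NONE"
    # NEU: both scores fall inside the (t_n_min, t_n_max) band.
    if t_n_min < p < t_n_max and t_n_min < n < t_n_max:
        return "NEU"
    # NEU: the two scores are too close to call.
    # Assumption: absolute difference.
    if abs(p - n) < t_d:
        return "NEU"
    # Assumption: otherwise the stronger polarity wins.
    return "P" if p > n else "N"


def label_from_score(x, t_minus, t_plus):
    """Strategy 4 sketch: map a single score X in [0, 1] to N, NEU or P."""
    if x < t_minus:
        return "N"
    if x <= t_plus:
        return "NEU"
    return "P"
```

For example, `label_from_p_n(0.8, 0.1)` takes none of the NONE/NEU branches and falls through to the final comparison.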
  13. 13. LINGUISTIC CONSIDERATIONS (Our approach)
  14. 14. Linguistic Considerations
      Features:
      • Tokens
      • Lemmas
      • Words from lexicons
      Negation treatment:
      • Presence of NEG constituents (“no”, “nunca”…): if present within a verbal group, polarity is inverted.
      • Double negation is not considered.
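The negation rule on slide 14 can be illustrated with a small sketch. It assumes the verbal groups have already been identified by a parser and arrive as lists of tokens; the toy lexicon and the function names are hypothetical. Only the two cues named on the slide are included. A group is inverted at most once, however many cues it contains, which is one way to read "double negation is not considered".

```python
NEG_CUES = {"no", "nunca"}  # only the two cues named on the slide


def has_negation(verbal_group):
    """True if any token of the verbal group is a negation cue."""
    return any(token.lower() in NEG_CUES for token in verbal_group)


def group_polarity(verbal_group, lexicon):
    """Sum lexicon scores over the group and invert once if it is negated."""
    score = sum(lexicon.get(token.lower(), 0.0) for token in verbal_group)
    return -score if has_negation(verbal_group) else score


# Hypothetical one-word polarity lexicon, for illustration only.
toy_lexicon = {"gusta": 1.0}
print(group_polarity(["me", "gusta"], toy_lexicon))
print(group_polarity(["no", "me", "gusta"], toy_lexicon))
```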
  15. 15. Linguistic Considerations: Preprocessing
      • Laugh patterns
      • URLs
      • Slang expressions:
        • Q, k, qu, ke, qe
        • d, tb, lol
        • Xq, pq, porq
      • Repeated letters
      • Suppression of numbers
      • Emoticons
      • Stopword list:
        • Manual
        • TF-IDF
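Some of the preprocessing steps on slide 15 can be sketched with regular expressions. This is a minimal illustration under several assumptions. The slide lists the slang forms but not what they are normalised to, so the expansions below are guesses. The placeholder tokens `_url_` and `_risa_` are hypothetical. Emoticons and stopword filtering are left out.

```python
import re

# Assumed expansions: the slide names the slang forms, not their targets.
SLANG = {
    "q": "que", "k": "que", "qu": "que", "ke": "que", "qe": "que",
    "d": "de", "tb": "también",
    "xq": "porque", "pq": "porque", "porq": "porque",
}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
# A syllable such as "ja" or "he" repeated at least twice ("jaja", "jejeje").
LAUGH_RE = re.compile(r"\b([jh][aeiou])\1+[jh]?\b", re.IGNORECASE)
NUMBER_RE = re.compile(r"\b\d+(?:[.,]\d+)*\b")
# Any character repeated three or more times in a row.
REPEAT_RE = re.compile(r"(\w)\1{2,}")


def preprocess(tweet):
    """Normalise URLs, laughs, numbers, repeated letters and slang."""
    text = URL_RE.sub(" _url_ ", tweet)     # URLs first, so later steps cannot mangle them
    text = LAUGH_RE.sub(" _risa_ ", text)   # collapse laugh patterns
    text = NUMBER_RE.sub(" ", text)         # suppress numbers
    text = REPEAT_RE.sub(r"\1", text)       # "buenooo" -> "bueno"
    tokens = [SLANG.get(tok.lower(), tok.lower()) for tok in text.split()]
    return " ".join(tokens)


print(preprocess("Jajaja q buenooo http://t.co/abc 123"))
```

The laugh pattern uses a backreference so that ordinary words such as "hijo" are not mistaken for laughter.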
  16. 16. MEANS: Resources and Algorithms (Our approach)
  17. 17. Means
      External resources:
      • IXA-Pipes for NLP
      • WEKA for Machine Learning
      • Different lexicons
      Algorithms:
      • MNB (Multinomial Naïve Bayes)
      • SMO (Sequential Minimal Optimization for SVMs)
      * Previous efforts: Logistic Regression, trees, word embeddings…
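For readers unfamiliar with MNB, the following is a pure-Python sketch of multinomial Naïve Bayes over token counts with additive smoothing. It is not WEKA code and does not reproduce the configuration used in the talk. The class name `TinyMNB`, the `alpha` parameter and the toy training data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class TinyMNB:
    """Multinomial Naive Bayes over token lists (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing constant

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
        self.vocab = {t for counts in self.token_counts.values() for t in counts}
        return self

    def log_scores(self, tokens):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for token in tokens:
                if token in self.vocab:  # tokens unseen in training are skipped
                    score += math.log((counts[token] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


# Toy data invented for illustration.
docs = [["me", "encanta"], ["muy", "bueno"], ["lo", "odio"], ["muy", "malo"]]
labels = ["P", "P", "N", "N"]
model = TinyMNB().fit(docs, labels)
print(model.predict(["encanta"]))
```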
  18. 18. OUR SYSTEMS (Results)
  19. 19. Our systems: laOEG
      • 2nd labeling strategy (P, N → F(x)).
      • MNB (slightly better than SMO), tokens.
      • Parameters: 0.3 for MNB (0.01-0.5); thresholds 0.10-0.15:
        \[ F(x) = \begin{cases} \text{NEU} & P - N < 0.10 \\ \text{NEU} & 0 < 0.10 < P, N < 0.15 \\ \text{NONE} & 0 \le P, N \le 0.10 \end{cases} \]
      • PRO: it is versatile and fast.
      • CONS: it is the simplest and shallowest approach; it does not even consider negation.
  20. 20. Our systems: victor0
      • Same as above (2nd labeling strategy, MNB; P, N → F(x)), but:
        • Using lemmas.
        • Handling negation in the verbal group at different constituent levels, which yields new features. E.g. ‘don’t like’ is now considered a single feature.
      • PRO: it is versatile and detects negation.
      • CONS: much slower than laOEG.
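The idea of treating a negated verb as a single feature can be sketched as follows. The sketch assumes a parser has already supplied the lemmas of the verbal group and its head lemma. The function name, the `no_` prefix and the cue list are hypothetical choices, not details from the talk.

```python
NEG_CUES = {"no", "nunca"}  # cues named on slide 14


def verbal_group_feature(group_lemmas, head_lemma):
    """Return one feature for the group: the head, prefixed if negated."""
    negated = any(lemma.lower() in NEG_CUES for lemma in group_lemmas)
    return f"no_{head_lemma}" if negated else head_lemma


print(verbal_group_feature(["no", "me", "gustar"], "gustar"))
print(verbal_group_feature(["me", "gustar"], "gustar"))
```

With features built this way, a bag-of-lemmas classifier sees "no_gustar" and "gustar" as unrelated entries and can learn opposite polarities for them.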
  21. 21. Our systems: victor2 and victor3b
      victor2 (lemmas; P, N → F(x))
      • Same as above, but:
        • With stopwords.
        • Using a special dataset to better distinguish NEU and NONE (Hinojosa et al. 2016).
      victor3b
      • victor2 combined with the IBM Watson NLU module when confidence is good (0.75): one branch for > 0.75, the other for < 0.75.
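One possible reading of the victor3b combination is a confidence gate: use the external label when its confidence exceeds 0.75, and fall back to victor2 otherwise. The slide does not spell out which branch uses which system, so this direction is an assumption. The function and parameter names are hypothetical, and no real Watson NLU call is shown.

```python
def combine_labels(victor2_label, external_label, external_confidence, threshold=0.75):
    """Prefer the external label only when its confidence exceeds the threshold."""
    if external_confidence > threshold:
        return external_label
    return victor2_label


print(combine_labels("NEU", "P", 0.90))
print(combine_labels("NEU", "P", 0.40))
```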
  22. 22. Our systems (overview diagram): laOEG; victor0; victor2 (Hinojosa et al. 2016); victor3b (victor2 + W NLU)
  23. 23. Results
      (M-P, M-R, M-F1: macro-averaged precision, recall and F1; Acc: accuracy)

      InterTASS
      System        M-P     M-R     M-F1    Acc
      victor2       0.400   0.389   0.395   0.451
      victor0       0.388   0.378   0.383   0.433
      laOEG         0.383   0.370   0.377   0.505
      Max. result   0.497   0.490   0.493   0.607
      Min. result   0.291   0.322   0.306   0.479

      General 1k
      System        M-P     M-R     M-F1    Acc
      victor3b      0.402   0.337   0.367   0.486
      victor2       0.361   0.370   0.366   0.412
      laOEG         0.348   0.345   0.346   0.448
      Max. result   0.559   0.595   0.577   0.645
      Min. result   0.302   0.348   0.324   0.434

      General TASS
      System        M-P     M-R     M-F1    Acc
      victor2       0.395   0.384   0.389   0.496
      laOEG         0.350   0.342   0.346   0.407
      Max. result   0.559   0.595   0.577   0.645
      Min. result   0.302   0.348   0.324   0.434
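The four reported metrics can be computed along the following lines. This sketch assumes M-F1 is the harmonic mean of M-P and M-R. The slides do not state this, but the table is consistent with it: for victor3b, 2 · 0.402 · 0.337 / (0.402 + 0.337) ≈ 0.367. The function name and the handling of classes with no predictions (scored as 0) are assumptions.

```python
def macro_metrics(gold, pred, labels=("P", "N", "NEU", "NONE")):
    """Return (macro precision, macro recall, macro F1, accuracy)."""
    precisions, recalls = [], []
    for label in labels:
        true_pos = sum(g == label and p == label for g, p in zip(gold, pred))
        predicted = sum(p == label for p in pred)
        actual = sum(g == label for g in gold)
        # Assumption: a class that is never predicted (or never occurs) scores 0.
        precisions.append(true_pos / predicted if predicted else 0.0)
        recalls.append(true_pos / actual if actual else 0.0)
    m_p = sum(precisions) / len(labels)
    m_r = sum(recalls) / len(labels)
    # Assumption: macro F1 is the harmonic mean of macro precision and recall.
    m_f1 = 2 * m_p * m_r / (m_p + m_r) if (m_p + m_r) else 0.0
    acc = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    return m_p, m_r, m_f1, acc


gold = ["P", "P", "N", "N"]
pred = ["P", "N", "N", "N"]
print(macro_metrics(gold, pred, labels=("P", "N")))
```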
  24. 24. CONCLUSIONS: Contributions and future work
  25. 25. Conclusions and future lines
      Conclusions on this first participation:
      • There is room for improvement, but…
      • … we had stable results and versatile systems.
      • … out-of-the-box external software (Watson) worked any better.
      Future lines:
      • Fine-grained emotions are not the right way.
      • More resources.
      • Use of concepts instead of simple words.
  26. 26. Bibliography
      • IXA pipes: Rodrigo Agerri, Josu Bermudez and German Rigau (2014). "IXA pipeline: Efficient and Ready to Use Multilingual NLP tools". In: Proceedings of the 9th Language Resources and Evaluation Conference (LREC2014), 26-31 May 2014, Reykjavik, Iceland.
      • WEKA: Eibe Frank, Mark A. Hall and Ian H. Witten (2016). The WEKA Workbench. Online Appendix for "Data Mining: Practical Machine Learning Tools and Techniques", Morgan Kaufmann, Fourth Edition, 2016.
      • (Hinojosa et al. 2016) dataset: J. A. Hinojosa, N. Martínez-García, C. Villalba-García, U. Fernández-Folgueiras, A. Sánchez-Carmona, M. A. Pozo and P. Montoro (2016). Affective norms of 875 Spanish words for five discrete emotional categories and two emotional dimensions. Behavior Research Methods, 48(1):272-284.
      • IBM Watson NLU: https://www.ibm.com/watson/developercloud/doc/natural-language-understanding/
