
Spanish Corpus for Sentiment Analysis towards Brands

Presentation of the SAB corpus at SPECOM (Hertfordshire, 14.09.2017)


  1. 1. Spanish Corpus for Sentiment Analysis towards Brands María Navas-Loro, Víctor Rodríguez-Doncel, Idafen Santana-Perez, Alberto Sánchez Technical University of Madrid mnavas@fi.upm.es SPECOM, 14th September 2017
  2. 2. INTRODUCTION: Spanish Sentiment Analysis
  3. 3. Introduction: Sentiment Analysis on social networks lets companies better understand their clients' opinions.
  4. 4. Main problem: Sentiment Analysis, especially in Spanish, focuses only on polarity. Also, due to Twitter policies, little annotated training data is available.
  5. 5. Main problem: Although corpora exist for several fields, such as medicine or tourism, and for more general opinions, there is nothing for opinions towards brands in Spanish.
  6. 6. ANALYSIS PROPOSAL: Designed schema
  7. 7. Who we are
  8. 8. Analysis proposal. Each post is analysed along four dimensions:
      • Purchase Funnel: Awareness, Evaluation, Purchase, Postpurchase, Review.
      • Marketing Mix: Product, Price, Promotion, Place.
      • Sentiment Analysis: Hate / Love, Satisfaction / Dissatisfaction, Happiness / Sadness, Trust / Fear.
      • Meaningful Brands: Marketplace, Personal Wellbeing, Collective Wellbeing.
  9. 9. Analysis proposal. When? Purchase Funnel: Awareness, Evaluation, Purchase, Postpurchase, Review.
  10. 10. Analysis proposal. Where? Meaningful Brands: Marketplace, Personal Wellbeing, Collective Wellbeing.
  11. 11. Analysis proposal. Emotion? Sentiment Analysis: Hate / Love, Satisfaction / Dissatisfaction, Happiness / Sadness, Trust / Fear.
  12. 12. Analysis proposal. What? Marketing Mix: Product, Price, Promotion, Place.
  13. 13. Analysis proposal. The four dimensions together: When? (Purchase Funnel), Where? (Meaningful Brands), Emotion? (Sentiment Analysis), What? (Marketing Mix).
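The four-dimension schema can be sketched as a small data model. This is only an illustration: the class and field names are hypothetical, the value lists are read off the slide labels, and the choice of one value per dimension (with several emotions allowed) is an assumption, not something the slides state.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class PurchaseFunnel(Enum):  # "When?"
    AWARENESS = "awareness"
    EVALUATION = "evaluation"
    PURCHASE = "purchase"
    POSTPURCHASE = "postpurchase"
    REVIEW = "review"


class MarketingMix(Enum):  # "What?"
    PRODUCT = "product"
    PRICE = "price"
    PROMOTION = "promotion"
    PLACE = "place"


class MeaningfulBrand(Enum):  # "Where?"
    MARKETPLACE = "marketplace"
    PERSONAL_WELLBEING = "personal wellbeing"
    COLLECTIVE_WELLBEING = "collective wellbeing"


class Emotion(Enum):  # "Emotion?"
    HATE = "hate"
    LOVE = "love"
    SATISFACTION = "satisfaction"
    DISSATISFACTION = "dissatisfaction"
    HAPPINESS = "happiness"
    SADNESS = "sadness"
    TRUST = "trust"
    FEAR = "fear"


@dataclass(frozen=True)
class AnnotatedPost:
    """One tagged post (hypothetical container, not the corpus format)."""
    text: str
    brand: str
    funnel: Optional[PurchaseFunnel] = None
    marketing_mix: Optional[MarketingMix] = None
    meaningful_brand: Optional[MeaningfulBrand] = None
    emotions: FrozenSet[Emotion] = frozenset()  # a post may carry several


# Made-up text; the tag values mirror a complaint about price after purchase.
example = AnnotatedPost(
    text="(illustrative complaint about running out of credit)",
    brand="Movistar",
    funnel=PurchaseFunnel.POSTPURCHASE,
    marketing_mix=MarketingMix.PRICE,
    meaningful_brand=MeaningfulBrand.MARKETPLACE,
    emotions=frozenset({Emotion.HATE, Emotion.DISSATISFACTION}),
)
```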
  14. 14. LPS BIGGER project
  15. 15. BUILDING PROCESS: Spanish Corpus for Sentiment Analysis towards Brands
  16. 16. Corpus building: 1. Selection of the brands; 2. Acquisition of tweets; 3. Sifting; 4. Tagging; 5. Transformation. Brands selected in step 1, by sector:
      • BEVERAGES: Cruzcampo, Heineken, Estrella Galicia, Mahou
      • AUTOMOTIVE: Citroën, Fiat, Hyundai, Kia, Peugeot, Toyota
      • BANKING: Bankia, Bankinter, BBVA, Sabadell, ING, La Caixa/Caixabank, Santander
      • FOOD: Auchan, Bimbo, Hacendado, Milka, Pascual, Puleva
      • RETAIL: Alcampo, Carrefour, Decathlon, Ikea, Leroy Merlin, Mediamarkt, Mercadona
      • TELECOM: Amena, Lowi, Movistar, Orange, Vodafone, Yoigo
      • SPORTS: Adidas, Nike, Reebok
  17. 17. Corpus building, step 2: Acquisition of tweets
      • Collected from Twitter between 1 and 7 February 2017.
      • Using only keywords related to the brand names.
      • Using a single filter, for Spanish-language tweets.
      • Avoiding retweets.
      • At the end of this step, more than 23,000 tweets remained.
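The acquisition criteria (brand keywords, Spanish only, no retweets) can be sketched as a simple filter. The tweet fields `text`, `lang` and `is_retweet` are hypothetical names for already-downloaded records; the slides do not say which API or fields were actually used.

```python
from typing import Dict, Iterable, List


def mentions_brand(text: str, keywords: Iterable[str]) -> bool:
    """Case-insensitive substring match against the brand keywords."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def acquire(tweets: Iterable[Dict], keywords: List[str]) -> List[Dict]:
    """Keep Spanish, non-retweet tweets that mention a brand keyword.

    Assumes each tweet is a dict with hypothetical keys
    'text', 'lang' and 'is_retweet'.
    """
    return [
        tweet
        for tweet in tweets
        if tweet.get("lang") == "es"
        and not tweet.get("is_retweet", False)
        and mentions_brand(tweet.get("text", ""), keywords)
    ]
```

Substring matching is deliberately naive here: it also catches polysemous words that are not brand mentions, which is the kind of noise a later screening step has to remove.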
  18. 18. Corpus building, step 3: Sifting
      • Manual and automatic screening removed:
          • Repeated tweets.
          • The same tweet with a different URL.
          • Mentions that do not refer to the brand (polysemous names).
          • Tweets in other languages…
      • The final corpus contains 4548 tweets.
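One way the automatic part of this screening could work for repeated tweets and tweets that differ only in their URL is sketched below. The normalisation rule (strip URLs, lowercase, collapse whitespace) is an assumption for illustration, not the authors' documented procedure.

```python
import re
from typing import Iterable, List

# Matches http/https links up to the next whitespace.
URL_RE = re.compile(r"https?://\S+")


def normalise(text: str) -> str:
    """Remove URLs, lowercase, and collapse runs of whitespace."""
    without_urls = URL_RE.sub("", text)
    return " ".join(without_urls.lower().split())


def drop_near_duplicates(texts: Iterable[str]) -> List[str]:
    """Keep the first tweet for each normalised form, preserving order."""
    seen = set()
    kept = []
    for text in texts:
        key = normalise(text)
        if key not in seen:
            seen.add(key)
            kept.append(text)
    return kept
```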
  19. 19. Corpus building, step 4: Tagging
      • Three taggers for BEVERAGES, one for the whole corpus.
      • Tags are assigned per post, and a post may carry several emotions.
      • Specific criteria were given to the taggers; they are available with the corpus, along with information on inter-annotator agreement and Kappa values.
      • Example of criteria: HAPPINESS is only given to products already acquired, not to future purchases. If a desired product is not found, SATISFACTION and SADNESS are tagged.
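As a reminder of what a Kappa value measures, here is a minimal sketch of Cohen's kappa for two taggers. The slides do not say which Kappa variant was reported, so this is an assumption; since a post can carry several emotions, one plausible use is to run it once per emotion on binary "tagged / not tagged" labels.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(a)
    # Observed agreement: share of items with identical labels.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each annotator's label frequencies.
    counts_a, counts_b = Counter(a), Counter(b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[l] * counts_b[l] for l in labels) / (n * n)
    if expected == 1.0:  # both annotators used a single identical label
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

For example, two taggers deciding whether each of four posts expresses HATE would pass lists such as `[True, False, False, True]` and `[True, False, True, True]`.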
  20. 20. Corpus building, step 5: Transformation
      • The corpus is represented as linked data, reusing existing ontologies and also creating new classes:
          • Marl and Onyx.
          • SIOC.
          • GoodRelations.
      • It is also linked to external databases such as:
          • Thomson Reuters’ PermID.
          • DBpedia.
  21. 21. SAB CORPUS: Spanish Corpus for Sentiment Analysis towards Brands
  22. 22. Final corpus statistics

      Sector      HATE  SADNESS  FEAR  DISSATISF.   NC2  SATISF.  TRUST  HAPPINESS  LOVE  WITH EMOTION
      FOOD           5        4     0          28   181      153    149         50    44           367
      AUTOMOTIVE     0        1     5          11   508       33     17          6     5           551
      BANKING       32        6    92         146   561        8      3          0     0           712
      BEVERAGES     15        8     5         131   253      302    225         51    53           686
      SPORTS        16       17     2          87   430      123     78         32    74           653
      RETAIL        27       10    13         100  1057      119    118         31    28          1328
      TELECOM       32        2     0          73   152       21     15          8     3           249
      TOTAL        127       48   117         576  3142      759    605        178   207          4546
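A quick consistency check on these statistics: the TOTAL row should equal the column-wise sum of the seven sector rows. The sketch below copies the figures from the slide; the variable and function names are only illustrative.

```python
from typing import Dict, List

COLUMNS = ["HATE", "SADNESS", "FEAR", "DISSATISF.", "NC2",
           "SATISF.", "TRUST", "HAPPINESS", "LOVE", "WITH EMOTION"]

# Per-sector counts, in the column order above (copied from the slide).
ROWS: Dict[str, List[int]] = {
    "FOOD":       [5, 4, 0, 28, 181, 153, 149, 50, 44, 367],
    "AUTOMOTIVE": [0, 1, 5, 11, 508, 33, 17, 6, 5, 551],
    "BANKING":    [32, 6, 92, 146, 561, 8, 3, 0, 0, 712],
    "BEVERAGES":  [15, 8, 5, 131, 253, 302, 225, 51, 53, 686],
    "SPORTS":     [16, 17, 2, 87, 430, 123, 78, 32, 74, 653],
    "RETAIL":     [27, 10, 13, 100, 1057, 119, 118, 31, 28, 1328],
    "TELECOM":    [32, 2, 0, 73, 152, 21, 15, 8, 3, 249],
}

# TOTAL row as reported on the slide.
REPORTED_TOTAL = [127, 48, 117, 576, 3142, 759, 605, 178, 207, 4546]


def column_totals(rows: Dict[str, List[int]]) -> List[int]:
    """Sum each column across all sectors."""
    return [sum(column) for column in zip(*rows.values())]


mismatches = [
    name
    for name, computed, reported
    in zip(COLUMNS, column_totals(ROWS), REPORTED_TOTAL)
    if computed != reported
]
print("columns that do not add up:", mismatches)
```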
  23. 23. SAB corpus connections: the SAB corpus connects to external resources and uses its own vocabulary, while also reusing several existing ontologies.
  24. 24. Example of a post

      lps:826812979421257730 a sioc:Post ;
          sioc:id "826812979421257730" ;
          sioc:content "Ya me quede sin credito?? Hace 3 dias tengo credito nomas... Movistar y la concha de tu hermana"@es ;
          marl:describesObject lps:Movistar ;
          lps:isInPurchaseFunnel lps:postPurchase ;
          lps:hasMarketingMix lps:price ;
          lps:hasMeaningfulBrand lps:marketplace ;
          onyx:hasEmotion lps:hate, lps:dissatisfaction ;
          marl:hasPolarity marl:negative ;
          marl:forDomain "TELCO" .

      lps:hate a onyx:Emotion ;
          rdfs:label "odio"@es, "hate"@en .

      lps:dissatisfaction a onyx:Emotion ;
          rdfs:label "insatisfaccion"@es, "dissatisfaction"@en .
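To show how a tagged post maps onto this representation, the sketch below renders a few of the same properties as a Turtle fragment using plain string formatting. The function names are hypothetical, prefix declarations are omitted because the slides do not give the `lps:` namespace IRI, and a real pipeline would more likely use an RDF library.

```python
from typing import List


def turtle_literal(text: str, lang: str) -> str:
    """Quote a string as a language-tagged Turtle literal."""
    escaped = (
        text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"@{lang}'


def post_to_turtle(post_id: str, content: str, brand: str,
                   emotions: List[str], polarity: str) -> str:
    """Render one post as a Turtle fragment (no @prefix lines)."""
    lines = [
        f"lps:{post_id} a sioc:Post ;",
        f'    sioc:id "{post_id}" ;',
        f"    sioc:content {turtle_literal(content, 'es')} ;",
        f"    marl:describesObject lps:{brand} ;",
    ]
    if emotions:  # several emotions share one predicate, comma-separated
        emotion_list = ", ".join(f"lps:{name}" for name in emotions)
        lines.append(f"    onyx:hasEmotion {emotion_list} ;")
    lines.append(f"    marl:hasPolarity marl:{polarity} .")
    return "\n".join(lines)


print(post_to_turtle("826812979421257730", "Texto de ejemplo", "Movistar",
                     ["hate", "dissatisfaction"], "negative"))
```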
  25. 25. Example of information for a brand and a company

      lps:Movistar a gr:Brand ;
          rdfs:seeAlso <http://dbpedia.org/resource/Movistar> ;
          rdfs:label "Movistar" .

      lps:1-5000062703 a gr:Business ;
          rdfs:label "Telefonica de Espana, S.A.U." ;
          rdfs:seeAlso <https://opencorporates.com/companies/es/82018474> ;
          owl:sameAs permid:1-5000062703 .
  26. 26. CONCLUSIONS: Contributions and future work
  27. 27. Contributions and future lines
      Some contributions of the SAB corpus:
      • It fills a gap in Spanish Sentiment Analysis.
      • It offers a language-independent representation for Sentiment Analysis towards Brands.
      • It offers Linked Data information that:
          • Prevents the corpus from becoming outdated (changes in names, for instance).
          • Offers data related to brands beyond the text (CEOs…)
      Future lines:
      • Full annotation of all the aspects.
      • More links.
      • More tweets.
      • Semantic annotation of emotional keywords.
  28. 28. Bibliography
      • Breslin, J.G., Decker, S., et al.: SIOC: an approach to connect web-based communities. International Journal of Web Based Communities 2(2), 133-142 (2006)
      • Sanchez Rada, J.F., Torres, M., et al.: A linked data approach to sentiment and emotion analysis of Twitter in the financial domain. In: 2nd International Workshop on Finance and Economics on the Semantic Web (2014)
      • Hepp, M.: GoodRelations: An ontology for describing products and services offers on the web. In: International Conference on Knowledge Engineering and Knowledge Management. pp. 329-346. Springer (2008)
      • Thomson Reuters’ PermID: https://permid.org/
      • DBpedia: http://dbpedia.org/
  29. 29. Link to the corpus: http://sabcorpus.linkeddata.es/ Thank you for your attention.
