Analyzing customer sentiments in microblogs


Published on

Published in: Technology, Business
1 Comment
  • Good introduction for LDA approach
    Are you sure you want to  Yes  No
    Your message goes here
No Downloads
Total Views
On Slideshare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Analyzing customer sentiments in microblogs

  1. 1. Analyzingcustomersentiments in microblogs:A topic-model-basedapproachforTwitterdatasets<br />Amercian Conference on Information Systems, Detroit, 2011<br />Stefan Sommer, Andreas Schieber<br />Twitterbird:<br />
  2. 2. Imagineyourare a productmanagerof Sony TVs:Whatistheconversationabout?<br />Buy a 3D TV<br />08/06/2011<br />Sommer, Schieber / Analyzingcustomersentiments in microblogs<br />2<br />Sentiments, opinions<br />Sony TV:<br />Communication plattformsofthe Web 2.0<br />
  3. 3. The communication plattform Twitter:What am I doing?<br />Microblog - Ordinaryblogwithsocialnetworkingfeatures (followingotherusers)<br />Eachentryis limited to 140 characters<br />Mark wordswithhashtags (#), Adressuserswith@, link informationwithshort URLs<br />Twitteristhemostpopularmicrobloggingservices:1.8 mill. users in Germany (Pettey & Stevens, 2009)<br />190 mill. usersworldwide valuablesourceforcompanies (Pak & Paroubekl, 2010)<br />Free accesstothedata via Twitter API<br />08/06/2011<br />Sommer, Schieber / Analyzingcustomersentiments in microblogs<br />3<br />
  4. 4. Sentiment analysisisthegoal:Why do weneedtopicmodels?<br />Analytics<br />Sentimentanalysis<br />Hugeammountofdata, only a smallsetofthedatais relevant.<br />Knowledge Discovery<br />in Databases<br />Twitterdata<br />Data source<br />08/06/2011<br />Sommer, Schieber / Analyzingcustomersentiments in microblogs<br />4<br />
  5. 5. Research methodology. <br />Research goals<br />Identification of microblog entries containing opinions in a specific context.<br />How can we automatically identify the topics of the entries?<br />Research approach<br />Design-science-based approach (Hevner et al., 2004)<br />Topic detection with generative topic models (Blei& Lafferty, 2009)<br />Twitter as a textual data source<br />08/06/2011<br />Sommer, Schieber / Analyzingcustomersentiments in microblogs<br />5<br />
  6. 6. Topic Modelling.<br />„Probabilistic models for uncovering the underlying semantic structure of a document collection“ (Blei& Lafferty, 2009)<br />Exploratory approach:Latent Dirichlet Allocation (LDA)<br />Lack of knowledge about underlying correlations between topics<br />LDA is allowing the documents to have a mixture of topics<br />LDA is used for exploring and representing topics in microblog entries<br />08/06/2011<br />Sommer, Schieber / Analyzingcustomersentiments in microblogs<br />6<br />
  7. 7. Knowledge-Discovery-in-Databasesfor Twitter data sets. <br />Topic modelling<br />Topic identification by implementing LDA<br />Lexicalization and co-occurrences<br />Transformation<br />Preprocessing<br />Removal of unwanted characters<br />Sentiment analysis<br />based on specifictopicclusters<br />Selection by keywords<br />Twitter search by using Twitter API<br />Twitterdata<br />08/06/2011<br />Sommer, Schieber / Analyzingcustomersentiments in microblogs<br />7<br />
  8. 8. Data set 1: November 2010, 1.500tweets.Data set 2: January 2011, 1.200 tweets.<br />Target twitter data <br />without search terms<br />Search items<br />3D<br />Preprocessed<br />targettwitterdata<br />Sony 3D<br />Sony 3D KDL<br />08/06/2011<br />Sommer, Schieber / Analyzingcustomersentiments in microblogs<br />8<br />
  9. 9. Results of the first period.November 2010.<br />Identification of 10 topic clusters<br />Most frequent topic “X7”<br />Top 8 words which are representing topic “X7”<br />Sample Twitter documents corresponding to topic “X7”<br />Reduction of raw Twitter data to tweets containing words of a specific context<br />Sentiment analysis of these tweets<br />08/06/2011<br />Sommer, Schieber / Analyzingcustomersentiments in microblogs<br />9<br />
  10. 10. Results of the second period.January 2011.<br />Identification of 10 topic clusters<br />Most frequent topic “X9”<br />Top 5 words which are representing topic “X9”<br />Tweets correspond to one topic or at least to one dominant topic <br />We separated tweets with specific topics by using generative topic models<br />We obtained topics referencing current events in our short-time datasets<br />The algorithm automatically detected several topics (manual setting of topic ammount)<br />The topic distribution becomes more specific by specifying the search string<br />08/06/2011<br />Sommer, Schieber / Analyzingcustomersentiments in microblogs<br />10<br />
  11. 11. Summary and Q&A.<br />Topic modellingworks in thefieldoftwitterdataanalysis.<br />Wecandistinguishbetween relevant and non-relevant tweetsforspecificquestions(bythecompanies).<br />A firststep in ordertoknowwhattheconversationofcustomersisreallyabout.<br />Goal: Identification of microblog entries containing sentiments in a specific context.<br />Limited datasets, noevaluationmetricsandtheresearchisatthebeginning.<br />Developing a crawlerforlong-term analysis (large scaleevaluation).<br />Evaluation ofothertopicmodelapproaches (correlated, dynamic).<br />Literaturereviewanddiscussion on non topicmodelapproaches.<br />08/06/2011<br />Sommer, Schieber / Analyzingcustomersentiments in microblogs<br />11<br />
  12. 12. Contactdetails.<br />Stefan SommerT-Systems Multimedia Solutions GmbH<br />E-Mail: s.sommer@t-systems.comPhone: +49 170 2236 469Twitter: @somerus<br />Andreas SchieberDresden University of Technology<br />E-Mail: andreas.schieber@tu-dresden.deTelefon: +49 351 463 32735Twitter: @reakahont<br />Slides:<br />08/06/2011<br />Sommer, Schieber / Analyzingcustomersentiments in microblogs<br />12<br />
  13. 13. References.<br />Barnes, S.J. and Böhringer, M. (2009) 'Continuance Usage Intention in Microblogging Services: The Case of Twitter', Proceedings of the 17th European Conference on Information Systems, 1-13.<br />Bermingham, A. and Smeaton, A. (2010) 'Classifying Sentiment in Microblogs - Is Brevity an Advantage?', Proceedings of the 19th ACM international conference on Information and knowledge management, 1833-1836.<br />Blei, D. and Lafferty, J. (2009) Topic Models, [Online], Available: [30 Nov 2010].<br />Blei, D., Ng, A. and Jordan, M. (2003) 'Latent Dirichlet Allocation', Journal of Machine Learning Research, pp. 933-1022.<br />Böhringer, M. andGluchowski, P. (2009) 'Microblogging', Informatik-Spektrum, pp. 505-510.<br />Fayyad, U. (1996) Advances in Knowledge Discovery and Data Mining, Menlo Park: AAAI Press.<br />Hevner, A., March, S., Park, J. and Ram, S. (2004) 'Design Science in Information Systems', MIS Quarterly, 28, pp. 75-105.<br />Liu, B. (2007) Web Data Mining, Berlin: Springer.<br />O'Connor, B., Balasubramanyan, R., Routledge, B. and Smith, N. (2010) 'From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series', Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media, 122-129.<br />Oulasvirta, A., Lehtonen, E., Kurvinen, E. and Raento, M. (2010) 'Making the ordinary visible in microblogs', Personal and ubiquitous computing, Vol. 14 (3), pp. 237-249.<br />Pak, A. and Paroubek, P. (2010) 'Twitter as a Corpus for Sentiment Analysis and Opinion Mining', Proceedings of the International Conference on Language Resources and Evaluation, 1320-1326.<br />Pettey, C. and Stevens, H. (2009) Gartner's Hype Cycle Special Report for 2009, [Online], Available: [7 Dec 2010].<br />Ramage, D., Dumais, S. and Liebling, D. (2010) 'Characterizing Microblogs with Topic Models', Fourth International AAAI Conference on Weblogs and Social Media.<br />Richter, A., Koch, M. and Krisch, J. (2007) 'Social Commerce - Eine Analyse des Wandels im E-Commerce', Bericht 2007/03, Fakultät Informaitk, Universität der Bundeswehr München.<br />Stephen, A.T. and Toubia, O. (2010) 'Deriving Value from Social Commerce Networks', Journal of Marketing Research, Nr. 2 Vol. 67, pp. 215-228.<br />Tumasjan, A., Sprenger, T., Sandner, P. and Welpe, I. (2010) 'Predicting Elections with Twitter - What 140 Characters Reveal about Political Sentiment', Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media, 178-185.<br /> (2011) 'Your world, more connected', [Online], Available: [2 Aug 2011]<br />08/06/2011<br />Sommer, Schieber / Analyzingcustomersentiments in microblogs<br />13<br />