Topic and Opinion Classification based Information Credibility Analysis on Twitter

My presentation about information credibility analysis at SMC 2013.

Published in: Data & Analytics

  1. Topic and Opinion Classification based Information Credibility Analysis on Twitter
     Yukino Ikegami, Kenta Kawai, Yoshimi Namihira, Setsuo Tsuruta
     At SMC 2013, 2013/10/16
  2. Background and Motivation
     • False rumors often confuse people
     • Confirming the reliability of a rumor often requires domain knowledge about the problem
     • Goal: automatic information credibility analysis
  3. Related Work (1): Using Web-page-dependent features
     • [Wassmer et al. 2005]
       – Uses the credentials of the site, advertisements, and Web design
     • [Castillo et al. 2011]
       – Twitter-dependent features
         • E.g., number of followers
       – Twitter-independent features
         • E.g., number of !/?
  4. Related Work (2): Using textual features
     • Rumor information cloud system [Miyabe et al. 2011]
       – Confirms whether a rumor is true or not using alerting information about the rumor
       – Finds correcting information with an SVM applied to a word n-gram model
       – The word n-gram model consists of the words before and after the word “デマ” (“dema”, an abbreviation of “demagogic” in Japanese-English)
     • Dematter [Toriumi et al. 2012]
       – Assesses credibility by the percentage of alerting tweets about a rumor
       – Detects alerting tweets by keyword matching
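The Dematter-style idea on this slide (the share of alerting tweets, found by keyword matching) could be sketched roughly as follows. This is only an illustration, not the system from [Toriumi et al. 2012]: the keyword list and the names `is_alert` and `alert_ratio` are hypothetical.

```python
# Hypothetical alert keywords; the actual keyword list is not given in the slides.
ALERT_KEYWORDS = ("デマ", "hoax", "false rumor")


def is_alert(tweet, keywords=ALERT_KEYWORDS):
    """Return True if the tweet text contains any alert keyword (case-insensitive)."""
    text = tweet.lower()
    return any(keyword in text for keyword in keywords)


def alert_ratio(tweets, keywords=ALERT_KEYWORDS):
    """Fraction of tweets about a rumor that look like alerts (0.0 if there are none)."""
    if not tweets:
        return 0.0
    return sum(is_alert(t, keywords) for t in tweets) / len(tweets)
```

A higher ratio would suggest that more people are warning about the rumor; where to set a threshold is left open here.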
  5. Topic and Opinion Classification based Information Credibility Analysis
     [Figure: system overview with the components Twitter, Tweet crawler, Topic & opinion classifier, Tweet opinion DB, and Tweet credibility calculator]
  6. Topic classification
     • Classify tweets with a topic model
       – Topic model: Latent Dirichlet Allocation (LDA) with Gibbs sampling [Griffiths, 2002]
       – Features: content words (i.e., nouns, verbs, adjectives, and adverbs)
     Example topics:
       Topic1        Topic2                Topic3
       Vegetable     Measure               Radioactive material
       Eat           Amount of radiation   In prefecture
       No problem    Result                Governor
       Leaf of tea   Pool                  Fukushima
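For readers unfamiliar with LDA fitted by Gibbs sampling, a compact collapsed Gibbs sampler might look like the sketch below. It is a generic textbook-style version, not the authors' implementation; the hyperparameters `alpha` and `beta`, the iteration count, and the helper `dominant_topic` (which assigns a tweet to its most frequent topic) are assumptions made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=100, seed=0):
    """Collapsed Gibbs sampling for LDA; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]  # tokens of doc d in topic k
    topic_word = [{} for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                # tokens assigned to topic k
    assign = [[0] * len(doc) for doc in docs]     # current topic of each token

    def update(d, w, k, delta):
        doc_topic[d][k] += delta
        topic_word[k][w] = topic_word[k].get(w, 0) + delta
        topic_total[k] += delta

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            assign[d][i] = rng.randrange(num_topics)
            update(d, w, assign[d][i], +1)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                update(d, w, assign[d][i], -1)  # take the token out of the counts
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k].get(w, 0) + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                assign[d][i] = rng.choices(range(num_topics), weights=weights)[0]
                update(d, w, assign[d][i], +1)
    return doc_topic, topic_word


def dominant_topic(topic_counts):
    """Index of the topic with the most tokens in one document (assumed rule)."""
    return max(range(len(topic_counts)), key=topic_counts.__getitem__)
```

Each document would be the list of content words extracted from one tweet, for example `lda_gibbs([["vegetable", "eat"], ["measure", "radiation"]], num_topics=2)`.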
  7. Opinion Classification
     • Classify whether a tweet expresses a positive or a negative opinion using a dictionary
     • Takamura’s semantic orientation dictionary [Takamura et al. 2006]
       – Contains pairs of a word and its positivity in [-1, 1]
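A dictionary-based opinion classifier of this kind could be sketched as below. The slides do not say how word scores are combined, so averaging the scores of matched words is an assumption, and the tiny `ORIENTATION` lexicon is a made-up stand-in for the real dictionary.

```python
# Hypothetical miniature lexicon: word -> positivity in [-1, 1].
ORIENTATION = {"safe": 0.8, "good": 0.9, "danger": -0.9, "contaminated": -0.7}


def classify_opinion(tokens, lexicon=ORIENTATION):
    """Label a tokenized tweet by the mean positivity of its known words."""
    scores = [lexicon[t] for t in tokens if t in lexicon]
    if not scores:
        return "neutral"  # no dictionary word found
    mean = sum(scores) / len(scores)
    if mean > 0:
        return "positive"
    if mean < 0:
        return "negative"
    return "neutral"
```

For instance, `classify_opinion(["vegetable", "safe"])` yields `"positive"` with this toy lexicon.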
  8. Information Credibility Assessment
     • Majority decision
     [Figure labels: All tweets; Tweets about the topic of interest; Positive; Negative]
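The majority decision could be sketched as a simple vote over the tweets that belong to the topic of interest. The input format (pairs of topic and opinion label), the `"tie"` outcome, and the function name are assumptions; how the winning opinion maps to a final credibility label is not spelled out in the slides and is left to the caller.

```python
from collections import Counter


def majority_opinion(tweets, topic):
    """tweets: iterable of (topic, opinion) pairs produced by the two classifiers."""
    votes = Counter(
        opinion
        for tweet_topic, opinion in tweets
        if tweet_topic == topic and opinion in ("positive", "negative")
    )
    positive, negative = votes["positive"], votes["negative"]
    if positive == negative:
        return "tie"  # includes the case of no opinionated tweets on the topic
    return "positive" if positive > negative else "negative"
```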
  9. Evaluation
     • Dataset: 2960 tweets
       – Each was confirmed as true or not by a human
     • Criterion: weighted kappa
       – The weight w is designed so that judging certainly false information as certainly true, or vice versa, is a critical error
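Weighted kappa can be computed as 1 minus the ratio of weighted observed disagreement to weighted chance disagreement. The sketch below uses linear disagreement weights by default, which is an assumption: the weight matrix actually used in the evaluation is not recoverable from the slide text, and the category labels in the usage note are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, categories, weight=None):
    """Cohen's weighted kappa for two label sequences over ordered categories."""
    n = len(rater_a)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    if weight is None:
        # Assumed linear disagreement weight: 0 on the diagonal, 1 for opposite ends.
        weight = lambda i, j: abs(i - j) / (k - 1)

    # Observed joint proportions of (label from a, label from b).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    disagreement = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    by_chance = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1 - disagreement / by_chance
```

With categories such as `["false", "unsure", "true"]`, confusing `"false"` with `"true"` receives the largest weight, which mirrors the idea of treating that confusion as the critical error.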
  10. Result
      TABLE 1: Kappa for each condition
        Method                                      Kappa
        Fully random method (all tweets)            0.003
        Our method (all tweets)                     0.604
        Our method (only topic & opinion correct)   0.616
      • Landis’s kappa guideline: κ > 0.61 is substantial
      • Our method is substantially effective for assessing tweet credibility
  11. Conclusion
      • Topic and opinion classification based information credibility analysis on Twitter
        – Majority decision based on a topic model and sentiment analysis
      • The evaluation shows that it has a substantial effect
  12. Future work
      • Weighting tweets by the author’s expertise
        – People often judge whether information is trustworthy by the author’s expertise
      • Applying an online topic model
        – New topics and new usages of existing words appear one after another
      • Excluding neutral tweets
        – Tweets without sentiment are useless for our method
  13. References
      • [Wassmer et al. 2005] M. Wassmer and C. Eastman, “Automatic evaluation of credibility on the Web,” ASIS&T 2005, 42(1), 2005.
      • [Castillo et al. 2011] C. Castillo, M. Mendoza, and B. Poblete, “Information credibility on twitter,” WWW 2011, pp. 675-684, 2011.
      • [Miyabe et al. 2011] M. Miyabe, A. Umejima, A. Nadamoto, and E. Aramaki, “Proposal of Rumor Information Cloud based on Rumor-Correction Information” (in Japanese), RRDS4-019, 2011.
      • [Toriumi et al. 2012] F. Toriumi, K. Shinoda, and G. Kaneyama, “Accuracy Evaluation of Demagogue Detection System using Social Media” (in Japanese), IPSJ Digital Practice, 3.3, pp. 201-208, 2012.
