Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Leveraging Joint Interactions for Credibility Analysis in News Communities

342 views

Published on

Leveraging Joint Interactions for Credibility Analysis in News Communities,
Subhabrata Mukherjee and Gerhard Weikum,
Max Planck Institute for Informatics,
CIKM 2015

Published in: Data & Analytics
  • Be the first to comment

Leveraging Joint Interactions for Credibility Analysis in News Communities

  1. 1. Leveraging Joint Interactions for Credibility Analysis in News Communities Subhabrata Mukherjee and Gerhard Weikum Max Planck Institute for Informatics CIKM 2015
  2. 2. Motivation ➢ Media plays a crucial role in public dissemination of information ➢ However, people believe there is substantial media bias in news in view of inter-dependencies and cross-ownerships of media companies and other industries (like energy) ➢ 4 out of 5 Americans among younger generations do not trust major news networks [Gallup poll, 2013] ➢ This work: Credibility Analysis of News Communities
  3. 3. News Community ➢ A news community is a news aggregator site (e.g., reddit.com, digg.com, newstrust.net) where: ➢ Users can give explicit feedback (e.g., rate, review, share) on the quality of news ➢ Interact (e.g., comment, vote) with each other ➢ However, this adds user subjectivity as users incorporate their own bias and perspectives in the framework ➢ Controversial topics create polarization among users which influence their evaluation
  4. 4. Contributions ● A model to capture joint interaction between language, topics, users and sources leading to better prediction than the ones in isolation ● User expertise, source trustworthiness, language objectivity, topical perspective and article credibility mutually reinforce each other ● A supervised Conditional Random Field model that can capture these interactions, and handle real-valued ratings
  5. 5. s1 s1 d1 r11 r12 u1 u2 d2 r22 u2 y1 y2 C1 C2 Source Article Review User Example FACTORS
  6. 6. s1 s1 d1 r11 r12 u1 u2 d2 r22 u2 y1 y2 C1 C2 Source Article Review User Alternet.org (progressive/liberal) Why do conservaties hate your children? Ratings Discussions (liberal vs.conservative) Example Topic: Climate FACTORSInstantiation
  7. 7. s1 s1 d1 r11 r12 u1 u2 d2 r22 u2 y1 y2 C1 C2 Viewpoint, Expertise Why do conservaties hate your children? Ratings Discussions (liberal vs.conservative) Example Topic: Climate Source Article Review User FACTORSFEATURES
  8. 8. s1 s1 d1 r11 r12 u1 u2 d2 r22 u2 y1 y2 C1 C2 Viewpoint, Expertise Emotionality, Discourse Ratings Discussions (liberal vs.conservative) Example Topic: Climate Source Article Review User FACTORSFEATURES
  9. 9. s1 s1 d1 r11 r12 u1 u2 d2 r22 u2 y1 y2 C1 C2 Viewpoint, Expertise Emotionality, Discourse Discussions (liberal vs.conservative) Example Ratings Topic Source Article Review User FACTORSFEATURES
  10. 10. s1 s1 d1 r11 r12 u1 u2 d2 r22 u2 y1 y2 C1 C2 Viewpoint, Expertise Emotionality, Discourse Bias, Viewpoint, Expertise Example Topic Ratings Source Article Review User FACTORSFEATURES
  11. 11. s1 s1 d1 r11 r12 u1 u2 d2 r22 u2 y1 y2 C1 C2 Source Article Review User Article Credibility Rating? Trustworthiness Objectivity Credibility Expertise Task FACTORSATTRIBUTES
  12. 12. Credibility Analysis ➢ Given a set of news sources generating news articles, and users reviewing them on different qualitative aspects with mutual interactions: ➢ Jointly rank the sources, articles, and users based on their trustworthiness, credibility,and expertise
  13. 13. Credibility of Statements in Health Communities [S. Mukherjee et al.: KDD‘14]
  14. 14. s1 s1 d1 r11 r12 u1 u2 d2 r22 u2 y1 y2 C1 C2 Source Article Review User Objectivity Language Features Assertives, Factives, Hedges, Implicatives, Report, Discourse, Subjectivity etc. 1. M. Recasens, C. Danescu-Niculescu-Mizil, and D. Jurafsky. Linguistic models for analyzing and detecting biased language. In ACL, 2013. 2. S. Mukherjee, G. Weikum, and C. Danescu-Niculescu-Mizil. People on drugs: Credibility of user statements in health communities. KDD, 2014.
  15. 15. ➢ Only 33% of the articles have explicit tags ➢ Use Latent Dirichlet Allocation to learn the latent topic distribution in the corpus of news articles Topic Features
  16. 16. Source Features
  17. 17. Category Elements Engagement answers, ratings (given / received), comments etc. Agreement Inter-user agreement Topics perspective and expertise Interactions user-user, user-item, user-source User Features
  18. 18. s1 s1 d1 r11 r12 u1 u2 d2 r22 u2 y1 y2 C1 C2 Source Article Review User Article Credibility Rating? Source Models Article Language Model, Topic Model Review Language Model, Topic Model User Models How to aggregate? Given a factor, with its features, use Support Vector Regression to learn a model that will predict its rating for an article.
  19. 19. Probability Mass Function for discrete labels: Probability Density Function for continuous ratings: Conditional Random Field
  20. 20. Clique potential Energy Function User Potential Source Potential Language Potential Topic Potential Clique: source, article, <users>, <reviews>
  21. 21. error of predictor SVR partitions the user space user expertise
  22. 22. source trustworthiness Energy Function language objectivity topical perspective
  23. 23. Σ needs to be positive definite for inverse to exist → {α, β, γ} > 0 Makes sense: predictor reliability should be positive The joint p.d.f is a multivariate gaussian distribution
  24. 24. Maximize log-likelihood with respect to log λk instead of λk Prediction is the expected value of the function given by the mean of the Multivariate Gaussian distribution: Constrained optimization problem. Gradient ascent cannot be directly used.
  25. 25. Experiments: NewsTrust Data available at: http://www.mpi-inf.mpg.de/impact/credibilityanalysis/
  26. 26. Predicting User Ratings Users, Articles, Ratings +Time +Review Text +Review Text and Interactions 1. Y. Koren. Factorization meets the neighborhood: A multifaceted collaborative filtering model. KDD, 2008. 2. J. McAuley and J. Leskovec. Hidden factors and hidden topics: Understanding rating dimensions with review text. RecSys, 2013. 3. J. J. McAuley and J. Leskovec. From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews. In WWW, 2013.
  27. 27. Predicting Article Credibility Ratings
  28. 28. Predicting Article Credibility Ratings
  29. 29. Predicting Article Credibility Ratings
  30. 30. Predicting Article Credibility Ratings
  31. 31. Ranking Trustworthy News Sources Ranking Expert Users:
  32. 32. Sample Output: Most and Least Trust Sources on Sample Topics
  33. 33. Conclusions ➢ Joint interaction between language, topics, users and sources lead to better prediction in multiple tasks ➢ User expertise, source trustworthiness, language objectivity, topical perspective and article credibility mutually reinforce each other
  34. 34. Ongoing Work ➢ Analyze temporal evolution of these factors ➢ Communities are inherently dynamic in nature ➢ Source trustworthiness, and user expertise change with time ➢ To this end we propose an Experience-aware Item Recommendation for Evolving Review Communities, ICDM 2015.

×