Learning To Recognize Reliable Users And Content In Social Media With Coupled Mutual Reinforcement

Learning to Recognize Reliable Users and Content in Social Media with Coupled Mutual Reinforcement, Mohammad Ali Abbasi,
Arizona State University
http://dmml.asu.edu

Published in: Education, Technology

  • An answer is likely to be of high quality if its content is responsive and well-formed, the question is of high quality, and the answerer has a high answer-reputation. At the same time, a user will have a high answer-reputation if she posts high-quality answers, and a high question-reputation if she tends to post high-quality questions. Finally, a question is likely to be of high quality if it is well stated, is posted by a user with a high question-reputation, and attracts high-quality answers.
  • Circular definition from user to content. In previous work, question and answer quality were defined in terms of content, form, and style, as manually labeled by paid editors [2]. In contrast, our definitions focus on question effectiveness and answer accuracy: both quantities can be measured automatically and do not necessarily require human judgments.
  • Proportional: user question-reputation and user answer-reputation; question quality; answer quality. y_q(a) denotes the quality of answer a’s question.
  • 3,000 factoid questions serve as the initial set of queries, from which the 1,250 that have at least one similar question in the Yahoo! Answers archive are selected
  • and reputation as extra features for learning the ranking function

    1. Data Mining and Machine Learning in a Nutshell: Learning to Recognize Reliable Users and Content in Social Media with Coupled Mutual Reinforcement • Mohammad-Ali Abbasi, http://www.public.asu.edu/~mabbasi2/ • School of Computing, Informatics, and Decision Systems Engineering, Arizona State University • Data Mining and Machine Learning Lab, http://dmml.asu.edu/
    2. About the paper • Learning to Recognize Reliable Users and Content in Social Media with Coupled Mutual Reinforcement – Jiang Bian, Georgia Institute of Technology – Yandong Liu, Emory University – Ding Zhou, Facebook Inc. – Eugene Agichtein, Emory University – Hongyuan Zha, Georgia Institute of Technology • WWW 2009, April 20–24, 2009, Madrid, Spain.
    3. Community Question Answering (CQA) • A popular kind of forum where users pose questions for other users to answer • Users can ask questions in natural language • Comparable to regular web search
    4. Sample: Yahoo! Answers • Introduction
    5. What is the problem? • Retrieving answers from a social media archive that holds a large amount of information – The quality, accuracy, and comprehensiveness of the submitted questions and answers vary widely – A large fraction of the content is not useful for answering queries – Current approaches require large amounts of manually labeled data
    6. CQA environment • Users • Questions • Answers
    7. The goal • Identify – High-quality answers – High-quality questions – High-reputation users • Simultaneously • With minimal manual labeling
    8. The contribution of this paper • A semi-supervised coupled mutual reinforcement framework that calculates content quality and user reputation simultaneously and requires relatively few labeled examples to initialize the training process • More effective at finding high-quality answers, questions, and users • Improves the accuracy of search over CQA archives
    9. Current approaches • Rely on user reputation, • OR require a large amount of supervision, • OR focus on the network properties of the CQA • without considering the actual content of the information exchanged
    10. How to rank? • Current approaches: – Content quality OR – User reputation • This paper: – Content quality AND – User reputation
    11. Definitions • Question quality – a question's effectiveness at attracting high-quality answers • Answer quality – the responsiveness, accuracy, and comprehensiveness of the answer to a question • Question reputation – the expected quality of the questions posted by a user • Answer reputation – the expected quality of the answers posted by a user
    12. Model the problem • Solution
    13. Mutual reinforcement principle • Solution
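The mutual reinforcement principle (answer quality depends on the answerer's reputation, and reputation depends on the quality of the user's answers) can be illustrated with a toy fixed-point iteration. This is only a sketch under stated assumptions, not the CQA-MR algorithm from the paper: the function name `toy_mutual_reinforcement`, the linear mixing rule, the `weight` parameter, and the content scores are all hypothetical.

```python
from collections import defaultdict


def toy_mutual_reinforcement(answers, iterations=50, weight=0.5):
    """Toy illustration of a circular quality/reputation definition.

    answers: list of (author, content_score) pairs, content_score in [0, 1].
    Returns (answer_quality, reputation). The update rules are assumptions
    made for illustration; the paper learns these quantities with classifiers.
    """
    # Every author starts with a neutral reputation.
    reputation = {author: 0.5 for author, _ in answers}
    quality = [0.0] * len(answers)
    for _ in range(iterations):
        # Answer quality mixes the answer's own content score
        # with the current reputation of its author.
        quality = [
            (1 - weight) * score + weight * reputation[author]
            for author, score in answers
        ]
        # A user's reputation is the mean quality of that user's answers.
        per_author = defaultdict(list)
        for (author, _), q in zip(answers, quality):
            per_author[author].append(q)
        reputation = {a: sum(qs) / len(qs) for a, qs in per_author.items()}
    return quality, reputation


if __name__ == "__main__":
    answers = [("alice", 0.9), ("alice", 0.8), ("bob", 0.2)]
    quality, reputation = toy_mutual_reinforcement(answers)
    print(quality, reputation)
```

With these update rules each user's reputation settles at the mean content score of that user's answers, which shows how a circular definition can still yield stable values when iterated.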
    14. Feature space: X(Q), X(A), X(U) • Solution
    15. Learning quality and reputation (Coupled Mutual Reinforcement) • P(x): probability of being “good” • Model of P(x) • B is the coefficient vector of the linear model and can be found by maximizing:
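The equations on this slide did not survive in the transcript. A hedged reconstruction, assuming the standard logistic regression form: the slide's coefficient B is written as β, x is a feature vector, and y_i ∈ {0, 1} is the "good" label of training example i. These symbols are assumptions, not taken from the slide.

```latex
% Model of P(x): the log-odds of being "good" are linear in the features.
\log\frac{P(x)}{1-P(x)} = \beta^{\top}x
\quad\Longleftrightarrow\quad
P(x) = \frac{1}{1+e^{-\beta^{\top}x}}

% beta is found by maximizing the conditional log-likelihood of the labels.
\ell(\beta) = \sum_{i}\Big[\,y_i\log P(x_i) + (1-y_i)\log\big(1-P(x_i)\big)\Big]
```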
    16. Non-independent equations • Conditional log-likelihood • Objective function
    17. CQA-MR algorithm • Solution
    18. Experimental setup: data collection • Collected from Yahoo! Answers through its API • Used the TREC QA benchmark (http://trec.nist.gov/data.html) to crawl the QA archive • Retrieved all available answers for each question – 107,293 users – 27,354 questions – 224,617 answers
    19. Evaluation metrics • Mean Reciprocal Rank (MRR) – the reciprocal of the rank at which the first relevant answer was returned, or 0 if none of the top N results contained a relevant answer, averaged over queries • Precision at K – for a given query, P(K) reports the fraction of answers ranked in the top K results that are labeled as relevant • Mean Average Precision (MAP) – the mean, over queries, of the average of the precision-at-K values calculated after each relevant answer was retrieved
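The three metrics can be sketched in a few lines of Python. This is a minimal illustration, assuming each query's ranked results are given as a list of booleans (True = labeled relevant); the function names are hypothetical, and average precision is normalized by the number of relevant answers actually retrieved, following the slide's wording.

```python
from typing import Sequence


def reciprocal_rank(relevant: Sequence[bool], n: int) -> float:
    """1/rank of the first relevant result in the top n, or 0 if there is none."""
    for rank, is_relevant in enumerate(relevant[:n], start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def precision_at_k(relevant: Sequence[bool], k: int) -> float:
    """Fraction of the top k results that are labeled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(relevant[:k]) / k


def average_precision(relevant: Sequence[bool]) -> float:
    """Mean of precision-at-rank taken at each relevant result."""
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_reciprocal_rank(runs: Sequence[Sequence[bool]], n: int) -> float:
    """MRR: reciprocal rank averaged over all queries."""
    return sum(reciprocal_rank(r, n) for r in runs) / len(runs)


def mean_average_precision(runs: Sequence[Sequence[bool]]) -> float:
    """MAP: average precision averaged over all queries."""
    return sum(average_precision(r) for r in runs) / len(runs)


if __name__ == "__main__":
    runs = [
        [False, True, False, True],    # first relevant answer at rank 2
        [True, False, False, False],   # first relevant answer at rank 1
        [False, False, False, False],  # no relevant answer retrieved
    ]
    print(mean_reciprocal_rank(runs, n=4))
    print(precision_at_k(runs[0], k=2))
    print(mean_average_precision(runs))
```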
    20. User reputation methods • Baseline – users are ranked by “indegree” (number of answers posted) • HITS – users are ranked by their authority scores • CQA-Supervised – a classifier trained over the features separates users into “high” and “low” reputation • CQA-MR – predicts user reputation with the mutual reinforcement algorithm
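The first two methods can be sketched as follows. This is a minimal illustration, assuming the user graph has one directed edge (asker, answerer) per posted answer, so that an answerer's indegree is the number of answers posted and the HITS authority score rewards users who answer questions from active askers. The edge construction and function names are assumptions, not details taken from the slides.

```python
import math
from collections import defaultdict


def indegree(edges):
    """Baseline: count incoming (asker -> answerer) edges per answerer."""
    counts = defaultdict(int)
    for _asker, answerer in edges:
        counts[answerer] += 1
    return dict(counts)


def _normalize(scores):
    """Scale scores to unit Euclidean length (no-op if all are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {user: v / norm for user, v in scores.items()}


def hits(edges, iterations=50):
    """Plain HITS power iteration; returns (hub, authority) score dicts."""
    users = {user for edge in edges for user in edge}
    hub = {user: 1.0 for user in users}
    auth = {user: 1.0 for user in users}
    for _ in range(iterations):
        # Authority: sum of hub scores of the users whose questions you answered.
        new_auth = {user: 0.0 for user in users}
        for asker, answerer in edges:
            new_auth[answerer] += hub[asker]
        # Hub: sum of authority scores of the users who answered your questions.
        new_hub = {user: 0.0 for user in users}
        for asker, answerer in edges:
            new_hub[asker] += new_auth[answerer]
        auth = _normalize(new_auth)
        hub = _normalize(new_hub)
    return hub, auth


if __name__ == "__main__":
    edges = [("u1", "u3"), ("u2", "u3"), ("u1", "u4")]
    print(indegree(edges))
    hub, auth = hits(edges)
    print(sorted(auth, key=auth.get, reverse=True))
```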
    21. CQA retrieval methods • Baseline – score computed as the difference between up votes and down votes • GBrank – does not include answer and question quality or user reputation • GBrank-HITS – augments GBrank with user reputation calculated by the HITS algorithm • GBrank-Supervised – augments GBrank with quality obtained through supervised learning
    22. Precision at K for the top contributors • Experiments
    23. Precision at K • Experiments
    24. Accuracy • Experiments
    25. Training labels • Experiments
    26. Training labels • Experiments
    27. Mohammad-Ali Abbasi (Ali) is a Ph.D. student at the Data Mining and Machine Learning Lab, Arizona State University. His research interests include data mining, machine learning, social computing, and social media behavior analysis. http://www.public.asu.edu/~mabbasi2/