Data mining paper presentation


  1. From Bias to Opinion: a Transfer-Learning Approach to Real-Time Sentiment Analysis. Pedro, Adriano, Wagner, Virgilio, Universidade Federal de Minas Gerais, Brazil, 4/10/2012. Presentation for Comp722 Data Mining, Kaiwen Qi, 5/22/2012
  2. Outline: Background and Paper Purpose; Quantify Bias; Exploiting Bias for Sentiment Analysis; Conclusions
  3. Social Media and Opinionated Data (from Pedro’s PPT)
  4. Background: Sentiment Analysis. Goal: determine the attitude of a speaker or a writer with respect to some topic, or the overall contextual polarity of a document.
  5. Sentiment Analysis example: http://www.tweetsentiments.com/analyze?utf8=%E2%9C%93&q=Lady+Gaga&topic=false&commit=Analyze+Tweets
  6. Sentiment Analysis is also known as Opinion Mining
  7. Use sentiment analysis to:
     • Help companies keep on top of issues and respond to trends impacting business.
     • Gather new customer insights from unstructured content (gathered from social networks).
     • Determine the degree to which a sentiment is positive, negative or neutral for the entire content or a segment of the content.
     • Identify the voices and publications influencing customers and competitors.
     • Adjust and optimize communication strategies.
     • Direct strategic decisions such as modifying marketing messages, customer service or product development.
     • Receive early warnings of market developments.
     • Manage and preserve brand equations and reputations.
     • Monitor public opinion.
     • Summarize the aggregated sentiment of online society.
     Source: http://passionjunkie.hubpages.com/hub/Sentimental-Analysis-Business-Insights-that-Help-you-Grow
  8. Real-Time vs. Traditional Sentiment Analysis
     • Traditional: uses static and well-controlled scenarios that target analysis of reviews of products and services; relies on pre-defined lists of positive and negative words.
     • Real-time: lack of labeled textual data; dynamicity of discussion (dynamic, concept drift, non-stationary distribution).
  9. Dynamic discussion and lack of labeled data (from Pedro’s PPT)
  10. Task: What is the time-invariant pattern that does not require significant labeling effort and supports real-time sentiment analysis?
  11. Proposal
  12. Social Media Endorsements as Evidence of User Bias. Endorsements: interactions among users in which one user implicitly agrees with another.
  13. Bias and opinions (from Pedro’s PPT)
  14. Proposal intent: How can the sociological definition of bias be implemented on a social media platform by considering only social interactions among users? How can bias information be converted into information on the sentiment associated with the generated content?
  15. Modeling User Bias Prediction: determine the most similar users based on individual endorsements
  16. Measuring bias: we label users whose bias is clearly identifiable as representative of a particular side in a discussion (from Pedro’s PPT)
  17. Modeling User Bias Prediction
      • Activity similarity: the similarity of a pair of users based on the users that both of them retweeted.
      • Passive similarity: the similarity of a pair of users based on the users who retweeted both of them.
  18. The Opinion Agreement Graph G=(V,E)
      • Vertices: users
      • Edges: global judgment of the connected users
  19. The Opinion Agreement Graph (from Pedro’s PPT)
  20. Explanation
  21. Measure Bias
      • Attractors: serve as reliable sources of bias knowledge.
      • The bias of each node toward a side is its proximity to the attractors that represent that side, computed for all users in U.
      • Random walk: used to measure proximity among nodes.
  22. Bias measurement
  23. Case Study: Brazilian 2010 Presidential Elections; Brazilian 2010 Soccer League
  24. Bias in Elections Discussions
  25. Bias in Elections Discussions
  26. Bias in Soccer Discussions
  27. Bias in Soccer Discussions
  28. Bias is a consistent pattern (from Pedro’s PPT)
  29. Consistent Bias
  30. Background: Transfer Learning. Using learned knowledge from one context to benefit further learning tasks in other contexts. Benefit from knowledge obtained from similar tasks or domains (from Liyuan Dai’s paper)
  31. Transfer Learning: example
  32. Transferring bias from user to content (from Pedro’s PPT)
  33. Relationship between terms and users’ bias (from Pedro’s PPT)
  34. Relationship between terms and users’ bias (from Pedro’s PPT)
  35. Relationship between terms and users’ bias (from Pedro’s PPT)
  36. Message Polarity Determination. The polarity of a tweet is that of its term of highest polarity: $\text{polarity} = \arg\max_x \hat{p}(\text{polarity} = x \mid t)$
  37. Evaluating the Knowledge Transfer Process: F1 vs. number of users with known bias. When the bias of 15% of users commenting on politics is known, F1=85%
  38. Evaluating the Knowledge Transfer Process: F1 vs. number of users with known bias. When the bias of 15% of users commenting on politics is known, F1=90%
  39. Comparison to SVM
      • SVM F1 decreases because the textual feature distribution changes over time.
      • The bias-based approach performs better without using labeled textual data.
      • It maintains a stable F1, as it incrementally incorporates bias information on new terms by propagating user bias.
  40. Comparison to SVM
      • SVM F1 decreases.
      • The bias-based approach matches SVM but does not require labeled textual data.
  41. Analyzing a Soccer Match in Real Time: live event (from Pedro’s PPT)
  42. Conclusions: real-time sentiment analysis based on the consistency of user bias. Known bias → propagate through endorsements → propagate user bias to terms associated with user content → combine term bias to compute the overall content polarity
  43. Thanks & Questions?
  44. Extra Slides: http://www.cs.cornell.edu/people/pabo/movie-review-data/
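
The bias measurement on slides 18 to 21 (attractors, proximity, random walk) could be sketched roughly as follows. This is only an illustration under assumptions: the slides do not say which random-walk variant is used, so the sketch swaps in a random walk with restart, and the restart probability, the toy graph, and the names `rwr_proximity` and `bias` are all hypothetical.

```python
def rwr_proximity(adj, attractors, restart=0.15, iters=100):
    """Random walk with restart from a set of attractor nodes (assumed variant).

    adj: {node: {neighbor: weight}}; every node must appear as a key, and an
    undirected edge must be listed in both directions.
    Returns {node: proximity score}; the scores sum to 1.
    """
    nodes = list(adj)
    start = {n: (1.0 / len(attractors) if n in attractors else 0.0) for n in nodes}
    score = dict(start)
    for _ in range(iters):
        nxt = {n: restart * start[n] for n in nodes}
        for n in nodes:
            out = sum(adj[n].values())
            if out == 0:
                # Dangling node: return its mass to the attractors.
                for m in nodes:
                    nxt[m] += (1 - restart) * score[n] * start[m]
                continue
            for m, w in adj[n].items():
                nxt[m] += (1 - restart) * score[n] * w / out
        score = nxt
    return score


def bias(adj, attractors_by_side):
    """Relative proximity of every node to the attractors of each side."""
    prox = {side: rwr_proximity(adj, set(a)) for side, a in attractors_by_side.items()}
    result = {}
    for n in adj:
        total = sum(prox[side][n] for side in prox)
        result[n] = {side: (prox[side][n] / total if total else 0.0) for side in prox}
    return result


# Toy opinion agreement graph (made-up data): a - u1 - u2 - b
adj = {
    "a": {"u1": 1.0},
    "u1": {"a": 1.0, "u2": 1.0},
    "u2": {"u1": 1.0, "b": 1.0},
    "b": {"u2": 1.0},
}
user_bias = bias(adj, {"A": ["a"], "B": ["b"]})
print(user_bias["u1"], user_bias["u2"])
```

In this toy graph, `u1` sits next to the side-A attractor and `u2` next to the side-B attractor, so each leans toward its neighbor's side. Normalizing across sides makes the bias values of a user sum to 1.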

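
The transfer from user bias to terms and then to messages (slides 32 to 36) might look like the sketch below. The slides only state that a message takes the polarity of its highest-polarity term; the estimator used here for the term probabilities (averaging the bias of the users who used a term), the whitespace tokenizer, the sample data, and the names `term_polarity` and `message_polarity` are assumptions made for illustration.

```python
from collections import defaultdict


def term_polarity(messages, user_bias):
    """Estimate p_hat(polarity = x | term) by averaging the bias of the users
    who used the term (an assumed estimator, not taken from the slides).

    messages: list of (user, text)
    user_bias: {user: {side: weight}}, weights summing to 1 per user
    """
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for user, text in messages:
        user_sides = user_bias.get(user)
        if user_sides is None:
            continue  # users with unknown bias contribute nothing
        for term in set(text.lower().split()):
            counts[term] += 1
            for side, weight in user_sides.items():
                totals[term][side] += weight
    return {
        term: {side: value / counts[term] for side, value in sides.items()}
        for term, sides in totals.items()
    }


def message_polarity(text, p_hat):
    """Return the side of the single most polarized known term, or None."""
    best_side, best_p = None, 0.0
    for term in set(text.lower().split()):
        for side, p in p_hat.get(term, {}).items():
            if p > best_p:
                best_side, best_p = side, p
    return best_side


# Made-up example data
known_bias = {"ana": {"A": 0.9, "B": 0.1}, "bob": {"A": 0.2, "B": 0.8}}
messages = [("ana", "great goal today"), ("bob", "terrible referee today")]
p_hat = term_polarity(messages, known_bias)

print(message_polarity("what a great match", p_hat))
print(message_polarity("terrible call", p_hat))
```

Because new terms pick up polarity from the users who write them, a scheme of this kind can keep incorporating vocabulary as a discussion evolves, which is the property slide 39 contrasts with an SVM trained on a fixed set of labeled text.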