Eddi: Topic Browsing of Twitter Streams
Presented at UIST 2010.

Eddi: Topic Browsing of Twitter Streams Presentation Transcript

  • 1. eddi: Interactive Topic-Based Browsing of Social Status Streams. Michael Bernstein (MIT CSAIL); Bongwon Suh, Lichan Hong, Sanjay Kairam, Ed H. Chi (PARC Augmented Social Cognition); Jilin Chen (University of Minnesota). MIT Human-Computer Interaction
  • 2. Example topics: shopping, library science, google, pakistan, grammar, writing, facebook
  • 3. User Goal: Topic Exploration on trending topics in the feed or topics of interest
  • 4. Topic Detection is Difficult. Existing algorithms expect reasonably long documents. Wikipedia articles: average 400 words. Tweets: average 15 words. Example tweet from msbernst: “macbook died, but the Genius guys gave me a new one!” Existing algorithm might find: macbook, died, guys. Existing algorithm might miss: apple, customer support.
  • 5. eddi: interactive topic browser for twitter feeds. TweeTopic: realtime topic detection algorithm for tweets. Pipeline: Tweet, Noun Phrases, Web Search, Topic Keywords.
  • 6. TweeTopic: from tweet to topics. Tweet from msbernst: “Awesome article on some SIGGRAPH user interface work: http://bit.ly/30MJy”. Topics: animation, character, 3d, computer graphics, user interface.
  • 7. Information Retrieval Techniques. Assume decent length to text: repetition as a measure of importance, e.g., Term Frequency – Inverse Document Frequency (TF-IDF); co-occurrence matrices, e.g., Latent Dirichlet Allocation (LDA) [Blei et al., Ramage et al.]. But with 140 characters, it is difficult to distinguish signal from noise, topic from commentary. Example tweet from katrina_: “Ron Rivest cracks me up. It keeps me awake when algorithm design brings the lulz.”
  • 8. Information Retrieval Techniques. Assume decent length to text: repetition as a measure of importance, e.g., Term Frequency – Inverse Document Frequency (TF-IDF); co-occurrence matrices, e.g., Latent Dirichlet Allocation (LDA) [Blei et al., Ramage et al.]. But with 140 characters, it is difficult to distinguish signal from noise, topic from commentary. The same tweet from katrina_ with words blanked out: “[…] me up. It […] me […] when […] brings the […].”
  • 9. Information Retrieval Techniques. katrina_: “[…] me up. It […] me […] when […] brings the […].”
  • 10. TweeTopic: Intuition. Tweets look like search queries, and search results can be mined for topics.
  • 11. TweeTopic: Intuition. Tweets look like search queries, and search results can be mined for topics. Pipeline: Tweet, Noun Phrases, Web Search, Topic Keywords.
    Tweet (msbernst): “Awesome article on some SIGGRAPH user interface work: http://bit.ly/30MJy”
    Noun phrases: article SIGGRAPH user interface work
    Web search results:
    SIGGRAPH 2004 Trip Report. This year’s themes at SIGGRAPH … good navigation interface … (www.stoneschool.com/Work/Siggraph/2004/index.html)
    WIMP (computing) – Wikipedia. Possibility ... (like the noun GUI, for graphical user interface) ... (en.wikipedia.org/wiki/WIMP_(computing))
    SIGGRAPH: Specialty 3D Applications. Standalone programs give alternatives to the toolset of a 3D ... (maxon.digitalmedianet.com/articles/viewarticle.jsp?id=55098)
    Topic keywords (number of pages, term): 9 SIGGRAPH; 7 user interface; 6 animation; 6 computer graphics
  • 12. Step 1: Noun phrase detection. msbernst: “Awesome article on some SIGGRAPH user interface work: http://bit.ly/30MJy”
  • 13. Step 1: Noun phrase detection. msbernst: “Awesome article on some SIGGRAPH user interface work: http://bit.ly/30MJy”
  • 14. Step 1: Noun phrase detection. msbernst: “Awesome article on some SIGGRAPH user interface work: http://bit.ly/30MJy”
  • 15. Step 2: Query a search engine. Search query: article SIGGRAPH user interface work
  • 16. Step 2: Query a search engine. Search results:
    SIGGRAPH 2004 Trip Report. This year’s themes at SIGGRAPH … Automatic Distinctive Icons for Desktop Interfaces … such that they actually do provide a good navigation interface … (www.stoneschool.com/Work/Siggraph/2004/index.html)
    WIMP (computing) – Wikipedia. Another possibility is to have the P in WIMP stand for Program, allowing it to be used as a noun (like the noun GUI, for graphical user interface) rather ... (en.wikipedia.org/wiki/WIMP_(computing))
    SIGGRAPH: Specialty 3D Applications. Aug 4, 2006 ... SIGGRAPH: Specialty 3D Applications Standalone programs give alternatives to the toolset of a 3D animation application By Frank Moldstad ... (maxon.digitalmedianet.com/articles/viewarticle.jsp?id=55098)
    Graphical specification of flexible user interface displays. Full text, Pdf (983 KB). Source, Symposium on User Interface Software and Technology archive ... (portal.acm.org/citation.cfm?id=73673)
    UIST 2010. UIST (ACM Symposium on User Interface Software and Technology) is the premier forum for innovations in the software and technology of human-computer … (www.acm.org/uist/)
  • 17. Step 3: Mine topics from results. Example result: SIGGRAPH 2004 Trip Report. This year’s themes at SIGGRAPH … Automatic Distinctive Icons for Desktop Interfaces … such that they actually do provide a good navigation interface … (www.stoneschool.com/Work/Siggraph/2004/index.html). TF-IDF on a web corpus yields terms such as: sketch skin model character paper shader Gollum collada cards real-time animation cloth map subsurface texture scattering SIGGRAPH Balrog fluids special session
  • 18. Step 3: Mine topics from results.
    Number of pages (max. 10), term:
    Keep terms in at least 50% of search results: 9 SIGGRAPH; 7 user interface; 6 animation; 6 computer graphics; 5 3d; 5 character
    Use less common terms as suggestions: 4 WIMP; 4 interaction; 3 pop-up menus; 3 mice; 3 subsurface scattering; 2 human computer interface
  • 19. Example tweets grouped by topic.
    Apple:
    “W00t! Snow Leopard gave me 10 gigs back!”
    “RT @username: gmail is down, but the imap connection on my iphone still works (fingers crossed!)”
    “My iPhone 3GS cracked-on-a-rock, @username’s swam in a toilet, both repaired/replaced in 20 min @ Boylston Apple Store. Total cost: $0.”
    Obama:
    “I think the most striking thing about Obama’s speech + GOP response for casual listeners would be how much agreement there was.”
    “Watching Obama attempt to #reversethecursehealthcare”
    “RT @username: The fastest way to prove you are an idiot is to call the President a liar on live TV”
    Research:
    “@username Congratulations on the CSCW best paper nomination!”
    “Stanford scientists turn liposuction leftovers into embryonic-like stem cells: http://bit.ly/3GHsw9”
    “CORRECTION: the deadline for submissions to the Graduate Student Consortium for TEI ’09 is October 2 http://bit.ly/15D8Mv”
  • 20. Related Work: Design. Topic browsing interfaces [Käki et al., CHI 2005] [Kammerer et al., CHI 2009] [Leskovec et al., KDD 2009]
  • 21. Related Work: Algorithms. Noun phrases as key concepts in short segments of text [Bendersky and Croft, SIGIR 2008]. Search engine callouts to find query similarity [Sahami and Heilman, WWW 2006]. LDA on Twitter [Ramage et al., ICWSM 2010].
  • 22. Evaluation. How does TweeTopic compare to other topic detection algorithms? How does Eddi compare to a typical chronological Twitter interface?
  • 23. TweeTopic Evaluation. Comparison topic detection algorithms: Random Unigram. Example tweet from msbernst: “Awesome article on some SIGGRAPH user interface work: http://bit.ly/30MJy”
  • 24. TweeTopic Evaluation. Comparison topic detection algorithms: Random Unigram; Inverse Document Frequency (IDF). Example tweet from msbernst: “Awesome article on some SIGGRAPH user interface work: http://bit.ly/30MJy”
  • 25. TweeTopic Evaluation. Comparison topic detection algorithms: Random Unigram; Inverse Document Frequency (IDF); Latent Dirichlet Allocation (LDA). Example tweet from msbernst: “Awesome article on some SIGGRAPH user interface work: http://bit.ly/30MJy”
  • 26. TweeTopic Evaluation. 100 random tweets from Twitter’s stream. Three human coders rated the top five recommendations from each algorithm (Fleiss’s κ=.70). Example tweet: “Yup, Medal of Honor will have a demo http://bit.ly/bx6PSG”; recommended topics: video games, medal of honor, reviews, honor. Logistic regression analysis for binary outcomes.
  • 27. Results: TweeTopic Doubles Baseline. [Bar chart: Topic Labeling Accuracy as an odds ratio (baseline = 1 at Random Unigram; axis from 0 to 2) for TweeTopic (No Noun Detection), TweeTopic, IDF, Unigram (baseline), and LDA.]
  • 28. LDA vs. TweeTopic. Tweet: “I’m off to take a nap now. See y’all in a few hours!” LDA: bed, half, hour, sleep. TweeTopic: naptime, power nap, sleep, take a nap.
  • 29. Eddi Evaluation. Recruited active Twitter users, preferring those who followed more than 100 people. Gave users 3 minutes to browse 24 hours of their feed using Eddi or a chronological interface, over 6 total trials.
  • 30. Results: More Efficient and Enjoyable. [Chart: Likert Response (Agreement), axis marked 1, 4, 9, comparing Eddi and Chronological on three statements.]
    Is Quick to Scan: “Eddi helps me find things that I’m interested in, faster.”
    Is Enjoyable: “I get bored faster with the traditional feed. There’s way more stuff that I’m not interested in.”
    I’m Confident I Saw Everything: “[The chronological feed] is less enjoyable but more comprehensive.”
  • 31. Results: Twice As Effective. Track tweets remaining onscreen for > 2 seconds. Get relevance judgments from users: “I’m glad that I saw this tweet in my feed.” Users consume a purer feed:
  • 32. Discussion and Future Work. Eddi is most useful for overwhelming feeds: @msbernst follows 1000 people; @msbernst follows 100 people; @msbernst follows 10 people. Use case: filter accounts with selective interests. “Show me @GuyKawasaki when he tweets about social computing; ignore the rest.”
  • 33. eddi: Interactive Topic-Based Browsing of Social Status Streams. Explore an overwhelming feed by topics of interest. Uncover the central topic of a tweet, given very little text.
  • 34. TweeTopic Evaluation. TweeTopic Variants: Transformed vs. Raw: Do we massage the tweet to look like a query? Iterated vs. None: Do we keep removing words if the search engine fails?
  • 35. Step 4: Iterate to remove words if needed. Query: article SIGGRAPH user interface work
  • 36. Results: Noun Phrase Analysis Unnecessary. [Bar chart: Topic Labeling Accuracy as an odds ratio (baseline = 1 at Random Unigram; axis from 0 to 2) for TweeTopic (No Noun Detection), TweeTopic, IDF, Unigram (baseline), and LDA.]
  • 37. Related Work: Twitter and Design. Common uses of Twitter: information sharing, opinions, status [Naaman et al., CSCW 2009]. [Bar chart: % of all tweets (axis from 0% to 50%) by category: Information Sharing, Opinions, Random Thoughts, Personal Status.]
  • 38. ed chi l
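
The four TweeTopic steps described in the slides (noun phrase detection, querying a search engine, mining topics from the results, and removing words when the search fails) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the stopword list stands in for real noun phrase detection, the `search` callable is a hypothetical placeholder for a web search engine, the fixed `vocabulary` replaces the TF-IDF term extraction on a web corpus, and dropping the last word on failure is an arbitrary choice, since the slides do not say which word is removed. Only the 50% threshold comes from the slides.

```python
import re
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Assumption: a tiny stopword list stands in for real noun phrase detection.
STOPWORDS = {"awesome", "on", "some", "a", "an", "the", "of", "to"}


def to_query(tweet: str) -> List[str]:
    """Step 1 (crude stand-in): drop URLs, punctuation, and stopwords."""
    without_urls = re.sub(r"https?://\S+", " ", tweet)
    words = re.findall(r"[A-Za-z0-9']+", without_urls)
    return [w for w in words if w.lower() not in STOPWORDS]


def vote_on_terms(
    pages: Sequence[str],
    vocabulary: Sequence[str],
    min_share: float = 0.5,
) -> Tuple[List[str], List[str]]:
    """Step 3: count how many result pages mention each candidate term.

    Terms found in at least `min_share` of the pages become topics;
    less common terms are returned separately as suggestions.
    """
    counts: Counter = Counter()
    for page in pages:
        text = page.lower()
        for term in vocabulary:
            if term.lower() in text:
                counts[term] += 1
    threshold = min_share * len(pages)
    ranked = counts.most_common()
    topics = [term for term, n in ranked if n >= threshold]
    suggestions = [term for term, n in ranked if n < threshold]
    return topics, suggestions


def tweetopic(
    tweet: str,
    search: Callable[[str], List[str]],  # hypothetical search engine wrapper
    vocabulary: Sequence[str],
) -> Tuple[List[str], List[str]]:
    """Run steps 1 to 4 and return (topics, suggestions)."""
    query = to_query(tweet)
    while query:
        pages = search(" ".join(query))  # Step 2: query a search engine
        if pages:
            return vote_on_terms(pages, vocabulary)
        query.pop()  # Step 4: remove a word and retry (choice of word is an assumption)
    return [], []


if __name__ == "__main__":
    def fake_search(query: str) -> List[str]:
        # Pretend the engine only returns results for short queries.
        if len(query.split()) > 3:
            return []
        return [
            "SIGGRAPH user interface animation",
            "SIGGRAPH animation",
            "SIGGRAPH user interface",
            "WIMP user interface",
        ]

    tweet = ("Awesome article on some SIGGRAPH user interface work: "
             "http://bit.ly/30MJy")
    vocab = ["SIGGRAPH", "user interface", "animation", "WIMP"]
    print(tweetopic(tweet, fake_search, vocab))
```

Passing the search engine in as a callable keeps the sketch runnable offline and makes the retry loop easy to exercise with canned results.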