Recommending #-Tags in Twitter
Transcript

  • 1. Recommending #-Tags in Twitter
    Eva Zangerle, Wolfgang Gassler and Günther Specht
  • 2. Outline
    Motivation
    Approach
    Ranking Methods
    Evaluation
    Future Directions
  • 3. Hashtags
    Tags for tweets
    (Manual) categorization of conversations
    Follow streams of conversation
  • 4. Motivation
    Only 20% of tweets contain hashtags
    Hashtags can be chosen freely
    #umap2011? #umap11? #umap?
    Synonymous hashtags
    Heterogeneity
    Search capability limited
  • 5. Motivation
    Are hashtag recommendations possible?
  • 6. Goals
    Recommendation of suitable hashtags while entering a tweet
    Encourage use of hashtags
    Improve search capabilities
    Better categorization
    Fight heterogeneity
    Avoid use of synonymous hashtags
  • 7. Approach
    First attempt
    Crawl set of tweets containing hashtags
    Analysis of dataset
    Can it be done based on content?
    Compare entered tweet to existing tweets
  • 8. Content-based Approach
  • 9. Crawled Dataset
    Crawled July 2010 – February 2011
    16,034,195 messages in total
    3,209,281 messages containing hashtags (20%) -> used as dataset for evaluation
      • Top five hashtags are contained in 8% of all messages containing hashtags (#jobs, #nowplaying, #zodiacfacts, #news and #fb)
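The dataset figures on this slide (share of tweets with hashtags, coverage of the top five hashtags) can be reproduced on any tweet collection with a few lines of counting. The following is a minimal sketch, not the authors' crawler or analysis code: the function names `extract_hashtags` and `hashtag_stats` are hypothetical, and the regular expression is a simplified assumption about what counts as a hashtag.

```python
import re
from collections import Counter

# Simplified hashtag pattern (an assumption; Twitter's own tokenization is more involved).
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the lowercased hashtags found in a tweet text, in order of appearance."""
    return [tag.lower() for tag in HASHTAG_RE.findall(text)]


def hashtag_stats(tweets, top_n=5):
    """Return (share of tweets with hashtags, top_n hashtags,
    share of hashtag-carrying tweets that contain at least one top hashtag)."""
    tag_sets = [set(extract_hashtags(text)) for text in tweets]
    tagged = [tags for tags in tag_sets if tags]

    # Count each hashtag once per tweet.
    counts = Counter(tag for tags in tagged for tag in tags)
    top = [tag for tag, _ in counts.most_common(top_n)]

    covered = sum(1 for tags in tagged if tags & set(top))
    share_tagged = len(tagged) / len(tweets) if tweets else 0.0
    share_top = covered / len(tagged) if tagged else 0.0
    return share_tagged, top, share_top
```

Applied to the crawled messages, `share_tagged` would correspond to the 20% figure and `share_top` to the 8% figure, provided hashtags are extracted in a comparable way.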
  • 10. Hashtags per Tweet
  • 11. Hashtags per Tweet
    RT @Bhupeshtweet: #Quad #loop-http://bit.ly/ciHX2U #retweet #India #Jobs #World #news #canada #ad #win #USA #tdf #oea #hacking #icantstop #sdcc #game
  • 12. Longtail Distribution
  • 13. Ranking Methods
    Input: Set of candidate hashtags (from 500 similar tweets)
    Output: Ranked candidate list -> top k shown
    Similarity Rank
      Use similarity measure of tweets for ranking (tf/idf cosine similarity)
      The higher the similarity of the tweets, the higher the ranking of the corresponding hashtags
    Overall Popularity Rank
      Most popular hashtags over whole dataset
      The more popular, the higher the ranking within the candidate hashtags
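The two rankings on this slide can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the tokenizer, the smoothed idf formula `log(N/df) + 1`, and all function names are hypothetical choices; the slides only specify tf/idf cosine similarity and a pool of 500 similar tweets.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercased word tokens; hashtags are excluded from the content terms."""
    return [t for t in re.findall(r"#?\w+", text.lower()) if not t.startswith("#")]


def idf_table(corpus_tokens):
    """Smoothed inverse document frequency (one of several common variants)."""
    n = len(corpus_tokens)
    df = Counter(term for tokens in corpus_tokens for term in set(tokens))
    return {term: math.log(n / d) + 1.0 for term, d in df.items()}


def tfidf(tokens, idf):
    """Sparse tf-idf vector; terms unseen in the corpus get weight 0."""
    return {term: count * idf.get(term, 0.0) for term, count in Counter(tokens).items()}


def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_rank(query, tagged_tweets, n_similar=500):
    """tagged_tweets: list of (text, [hashtags]).
    Hashtags are ordered by the similarity of the most similar tweet carrying them."""
    corpus = [tokenize(text) for text, _ in tagged_tweets]
    idf = idf_table(corpus)
    query_vec = tfidf(tokenize(query), idf)

    scored = [
        (cosine(query_vec, tfidf(tokens, idf)), tags)
        for tokens, (_, tags) in zip(corpus, tagged_tweets)
    ]
    # Keep the n_similar most similar tweets that share at least one term with the query.
    scored = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    ranked = {}
    for score, tags in scored[:n_similar]:
        for tag in tags:
            ranked.setdefault(tag, score)  # first occurrence has the highest score
    return list(ranked)


def overall_popularity_rank(candidates, tagged_tweets):
    """Order candidate hashtags by their frequency over the whole dataset."""
    counts = Counter(tag for _, tags in tagged_tweets for tag in tags)
    return sorted(candidates, key=lambda tag: counts[tag], reverse=True)
```

Recomputing tf-idf vectors for the whole corpus on every query is only workable for small examples; with millions of tweets an inverted index or a precomputed vector store would be needed.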
  • 14-15. Ranking Methods
    Input: Set of candidate hashtags (from 500 similar tweets)
    Output: Ranked candidate list -> top k shown
    Recommendation Popularity Rank
      • Count number of occurrences for each hashtag within candidate list
      • The more similar tweets feature the hashtag, the higher the rank of the hashtag
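Recommendation Popularity Rank reduces to counting how many of the retrieved similar tweets carry each hashtag. A minimal sketch, assuming the similar tweets have already been retrieved and are passed in as lists of hashtags (the function name and signature are hypothetical):

```python
from collections import Counter


def recommendation_popularity_rank(similar_tweet_hashtags, k=None):
    """similar_tweet_hashtags: one list of hashtags per similar tweet.
    Returns hashtags ordered by the number of similar tweets that feature them."""
    counts = Counter(
        tag
        for tags in similar_tweet_hashtags
        for tag in dict.fromkeys(tags)  # count each hashtag once per tweet, keep order
    )
    ranked = [tag for tag, _ in counts.most_common()]
    return ranked if k is None else ranked[:k]
```

For example, `recommendation_popularity_rank([["jobs", "java"], ["jobs"], ["news"]], k=2)` puts `jobs` first because two of the three similar tweets feature it.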
  • 16. Evaluation
  • 17. Evaluation
    Dataset
      3,209,281 messages
      5,097,545 hashtags
      510,170 distinct hashtags
    Test run
      10,000 randomly chosen tweets (max. 5 hashtags)
      Retweets excluded
      30,000 test runs (3 ranking methods)
  • 18. Evaluation - Precision
    Avg. number of hashtags per message in dataset
  • 19. Evaluation - Recall
    Top-3 recommendations enough?
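Precision and recall for a single test run compare the top-k recommended hashtags with the hashtags the tweet actually carried. The slides do not spell out the exact definitions, so the sketch below assumes the standard ones (hits divided by the number of recommendations shown, and hits divided by the number of actual hashtags); the function names are hypothetical.

```python
def precision_recall_at_k(recommended, actual, k):
    """recommended: ranked list of hashtags; actual: hashtags the tweet really used."""
    top_k = recommended[:k]
    hits = len(set(top_k) & set(actual))
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


def mean_precision_recall(runs, k):
    """runs: list of (recommended, actual) pairs, one per test tweet."""
    if not runs:
        return 0.0, 0.0
    scores = [precision_recall_at_k(rec, act, k) for rec, act in runs]
    mean_p = sum(p for p, _ in scores) / len(scores)
    mean_r = sum(r for _, r in scores) / len(scores)
    return mean_p, mean_r
```

Because most tweets in the dataset carry only one or two hashtags, precision necessarily drops as k grows while recall rises, which is why the number of recommendations shown matters.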
  • 20. What we showed…
    Motivation for recommendation of hashtags
    Content-based recommendations
    Simple, straightforward approach
    40% Recall@3
    … so it can be done!
  • 21. We've barely scratched the surface…
    Exploited only a small fraction of available information
    90% are metadata
    {
      "coordinates": null,
      "favorited": false,
      "created_at": "Thu Jul 15 23:26:04 +0000 2010",
      "truncated": false,
      "text": "RT @ApeyBaby44: Labels r run by lawyer & accountants. http://tl.gd/2hkmas",
      "contributors": null,
      "id": 18639444000,
      "geo": null,
      "in_reply_to_user_id": null,
      "place": null,
      "in_reply_to_screen_name": null,
      "user": {
        "name": "DIGGZ",
        "profile_sidebar_border_color": "F2E195",
        "profile_background_tile": true,
        "profile_sidebar_fill_color": "FFF7CC",
        "created_at": "Fri Apr 03 03:16:01 +0000 2009",
        "profile_image_url": "http://a3.twimg.com/profile_images/1079346239/untitled_normal.JPG",
        "location": "ATL, NC, VA, NY all day!",
        "profile_link_color": "FF0000",
        "follow_request_sent": null,
        "url": "http://thisisseriousbiz.com",
        "favourites_count": 42,
        "contributors_enabled": false,
        "utc_offset": -18000,
        "id": 28489988,
        "profile_use_background_image": true,
        "profile_text_color": "0C3E53",
        "protected": false,
        "followers_count": 588,
        "lang": "en",
        "notifications": null,
        "time_zone": "Quito",
        "verified": false,
        "profile_background_color": "BADFCD",
        "geo_enabled": true,
        "description": "Half of Production duo SeriousBIZ circa 2008\r\n#teamSERIOUSBIZ\r\n#teamblackberry PIN 315442C9\r\n#teamfollowback",
        "friends_count": 477,
        "statuses_count": 6269,
        "profile_background_image_url": "http://a1.twimg.com/profile_background_images/118926256/_MG_43571.JPG",
        "following": null,
        "screen_name": "DIGGZSeriousBIZ"
      },
      "source": "<a href=\"http://www.ubertwitter.com/bb/download.php\" rel=\"nofollow\">\u00dcberTwitter</a>",
      "in_reply_to_status_id": null
    }
  • 22. Thank You!
    #ideas?
  • 23. Future Challenges
    User Interface
    Social Graph
    Realtime Recommendations
    Synonymous Tags?
    Ranking
    Real User Tests