Your SlideShare is downloading. ×
0
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Leveraging microblogs for resource ranking
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Leveraging microblogs for resource ranking

1,909

Published on

Published in: Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
1,909
On Slideshare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
2
Comments
0
Likes
1
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Leveraging Microblogs for Resource RankingTomáš Majer, Marián Šimkotomasmajer@gmail.com, simko@fiit.stuba.sk23.01.2012, SOFSEM 2012Institute of Informatics and Software EngineeringFaculty of Informatics and Information TechnologiesSlovak University of Technology in Bratislava
  • 2. Microblog• a brief form of a blog with limited size of a post, typically 140 characters of length• sharing experiences, opinions, comments, links• Twitter, Identi.ca, Jaiku, (Google+, Facebook)• Twitter: ▫ over 300 millions of users (100 mil. at the time of writing paper) ▫ over 300 millions of tweets/day (100 mil. --||--)
  • 3. Microblog – why important?• „Read-Write Web“ vision• user-generated data ▫ unbiased ▫ not moderated• huge amount of data – text, links• valuable source of information ▫ data mining
  • 4. Microblog – why important?• „Read-Write Web“ vision• user-generated data ▫ unbiased ▫ not moderated 22 % of posts• huge amount of data – text, links• valuable source of information ▫ data mining
  • 5. State-of-the-Art• topic identification/keyword extraction (Ramage et al., 2010)• opinion mining (Pendey, Iyer, 2009)• twitter search, tweets ranking (Teevan et al., 2011)• user ranking (Gayo-Avello, 2010)• user modeling (Abel et al., 2011)•…• challenge: resource ranking
  • 6. Twitter Graph T1 R1 U1 T2 R2 T3 U2 R3 T4 T5 R4 U3 T6 R5 T7 Users Tweets Resources
  • 7. Twitter Graph T1 R1 posts U1 T2 contains R2 T3 U2 R3 T4 T5 R4 U3 re-posts follows T6 R5 T7 Users Tweets Resources
  • 8. Resource Ranking: Overview• TweetRank ▫ ranking of a resource based on Twitter graph analysis• Computation 1. UserRank 2. TweetRelevance 3. TweetRank
  • 9. Resource Ranking: Overview• TweetRank ▫ ranking of a resource based on Twitter graph analysis• Computation T1 R1 1. UserRank U1 T2 R2 T3 2. TweetRelevance U2 R3 T4 T5 R4 3. TweetRank U3 T6 R5 T7
  • 10. UserRank 1+γ (u)UserRank ( f ) UserRank (u) = ∑ f ∈ followers(u) ∣ followers( f )∣
  • 11. UserRank ∣ followers(u)∣ γ(u)= ∣tweets(u)∣ 1+γ (u)UserRank ( f ) UserRank (u) = ∑ f ∈ followers(u) ∣ followers( f )∣
  • 12. TweetRelevance, TweetRank UserRank ( Author (t)) TweetRelevance (t) = ∣tweets( Author (t ))∣
  • 13. TweetRelevance, TweetRank UserRank ( Author (t)) TweetRelevance (t) = = TR(t) ∣tweets( Author (t ))∣ TweetRank (r) = ∑ t ∈tweets(r ) ( TR(t)+ ∑ rt ∈retweets(t) TR(t)TR(rt) )
  • 14. Evaluation1. TweetRank ranking vs. explicit user ranking (YouTube )2. Search results ranking study (Search)• Data: ▫ 1,997,466 tweets from 367,824 users  85 % in English ▫ 1,468,365,182 connections between 40,103,281 users ▫ 1,150,168 unique web links  3 % of them: YouTube
  • 15. Computing TweetRank• TweetRank computed for each resource ▫ power-law distribution # of resources TweetRank intervals
  • 16. Experiment YouTube• YouTube – explicit user rating (Y1Rank) ▫ positive/negative vote, normalized• TweetRank vs. Y1Rank TweetRank, Y1Rank ▫ correlation coefficient r = 0.02 YouTube videos
  • 17. Experiment YouTube 2• YouTube – application-collected user rating (Y2Rank) ▫ „how do you like the video“? ▫ 5 degree scale (1-best, 5-worst), 70 participants• TweetRank vs. Y2Rank ▫ Kendall rank correlation coefficient τ= 0.125• Relative video ranking
  • 18. Experiment YouTube 2 ▫ 5-tuplets
  • 19. Experiment YouTube 2 ▫ pairs
  • 20. Experiment Search• 20,000 resources• indexing: SOLR (Apache Lucene)• searching: resource ranking extended with TweetRank• search results manual comparison ▫ 20 randomly selected queries (e.g.: „apple“) ▫ analyzed top-k results
  • 21. Experiment Search• findings: ▫ in general, „newer“ resources rank better ▫ ranking does not reflect chronological ordering of resources• suitable for sorting search results within a predefined time window
  • 22. Conclusions• microblog ▫ perspective source of data, information, knowledge• we proposed novel method for resource ranking leveraging microblog network• an important additional knowledge from the crowd ▫ a form of indirect explicit user rating• great potential for search improvement ▫ reflects temporal characteristics (not linearly) ▫ sorting results within a predefined time window

×