Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

What is Twitter, a Social Network or a News Media?

176,182 views

Published on

The 19th international conference on World wide web, Raleigh, North Carolina, USA

Published in: Technology

What is Twitter, a Social Network or a News Media?

  1. TWITTE What is Twitter, a Social Network or a News Media? Haewoon Kwak Changhyun Lee Hosung Park Sue Moon Department of Computer Science, KAIST, Korea 19th International World Wide Web Conference (WWW2010)
  2. TWITTE Twitter, a microblog service 2
  3. TWITTE Twitter, a microblog service write a short message 2
  4. TWITTE Twitter, a microblog service read neighbors’ tweets 3
  5. TWITTE In most OSN “We are friends.” 4
  6. TWITTE In Twitter “I follow you.” 5
  7. TWITTE on Twitter “Unlike most social networks, following on Twitter is not mutual.  Someone who thinks you're interesting can follow you, and you don't have to approve, or follow back." 6 http://help.twitter.com/entries/14019-what-is-following
  8. TWITTE Following = subscribing tweets recent tweets of followings 7
  9. 8
  10. TWITTE http://blog.marsdencartoons.com/2009/06/18/cartoon-iranian-election-demonstrations-and-twitter/marsden-iran-twitter72/ 9
  11. PROBLEM STATEMEN The goal of this work We analyze how directed relations of following set Twitter apart from existing OSNs. Then, we see if Twitter has any characteristics of news media. 10
  12. TWITTE me⋅di⋅a [mee-dee-uh] 1.a pl. of medium 2.the means of communication, as radio and television, newspapers, and magazines, that reach or influence people widely 11 http://dictionary.reference.com/
  13. PROBLEM STATEMEN The goal of this work We analyze how directed relations of following set Twitter apart from existing OSNs. Then, we see if Twitter has any characteristics of news media. 12
  14. OUTLIN Summary of our findings 1. Following is mostly not reciprocated (not so “social”) 2. Users talk about timely topics 3. A few users reach large audience directly 4. Most users can reach large audience by WOM* quickly *WOM: word-of-mouth 13
  15. TWITTE Data collection (09/6/1~9/24) • 41.7M user profiles (near-complete at that time) • 1.47B following relations *publicly available • 4262 trending topics • 106M tweets mentioning trending topics ‣ Spam tweets removed by CleanTweets 14 *http://an.kaist.ac.kr/traces/WWW2010.html
  16. TWITTE How we crawled • Twitter’s well-defined 3rd party API • With 20+ ‘whitelisted’ IPs ‣ Send 20,000 requests per IP / hour 15
  17. TWITTE Recent studies • Ranking methodologies [WSDM’10] • Predicting movie profits [HYPERTEXT’10] • Recommending users [CHI’10 microblogging] • Detecting real time events [WWW’10] • The ‘entire’ Twittersphere unexplored 16
  18. TRANSITIO Part I. 1. Following is mostly not reciprocated (not so “social”) 2. Users talk about timely topics 3. A few users reach large audience directly 4. Most users can reach large audience by WOM* quickly 17
  19. 2. ACTIVE SUBSCRIPTIO Why do people follow others? • Reflection of offline social relationships otherwise, • Subscription to others’ messages 18
  20. 2. ACTIVE SUBSCRIPTIO Sociologists’ answer • “Reciprocal interactions pervade every relation of primitive life and in all social systems” 19
  21. 2. ACTIVE SUBSCRIPTIO Is following reciprocal? • Only 22.1% of user pairs follow each other • Much lower than ‣ 68% on Flickr ‣ 84% on Yahoo! 360 ‣ 77% on Cyworld guestbook messages 20
  22. 2. ACTIVE SUBSCRIPTIO Low reciprocity of following • Following is not similarly used as friend in OSNs ‣ Not reflection of offline social relationships • Active subscription of tweets! 21
  23. TRANSITIO Part II. 1. Following is mostly not reciprocated (not so “social”) 2. Users talk about timely topics 3. A few users reach large audience directly 4. Most users can reach large audience by WOM* quickly 22
  24. 1. TIMELINESS TOPIC Dynamically changing trends 23
  25. 1. TIMELINESS TOPIC 5.3 User Participation in Trending Topics User participation pattern can How many topics does a user participate on average? Out of 41 million Twitter users, a large number of users (8, 262, 545) partici- be a signature of a topic pated in trending topics and about 15% of those users participated in more than 10 topics during four months. (a) Topic ’apple’ (b) Topic ’#iranelection’ Figure 11: Cumulative numbers of tweets and users over time 24
  26. 1. TIMELINESS TOPIC Majority of topics are headline topics ranked by the pro- e about offline news, and ‘remembering 9’) and, we cting frequent words from 31.5% 54.3% “ephemeral” “headline news” rending Topics te on average? Out of 41 sers (8, 262, 545) partici- f those users participated (a) Exogenous subcritical (b) Exogenous critical s. (topic ‘#backintheday’) (topic ‘beyonce’) 7.3% 6.9% “persistent news” (c) Endogenous subcritical (d) Endogenous critical (topic ‘lynn harris’) (topic ‘#redsox’) Topic ’#iranelection’ eets and users over time Figure 13: The examples of classified popularity patterns 25
  27. TRANSITIO Part III. 1. Following is mostly not reciprocated (not so “social”) 2. Users talk about timely topics 3. A few users reach large audience directly 4. Most users can reach large audience by WOM* quickly 26
  28. and compared against each other. Before we delve into the eccen- 3. A FEW HUB tricities and peculiarities of Twitter, we run a batch of well-known analysis and present the summary. How many followers a user has? 3.1 Basic Analysis Figure 1: Number of followings and followers 27
  29. ≤ 3, attest to the existence of a relatively 3. A FEW HUB nodes with a very large number of links. CCDF so have distinguishing properties, such as mic threshold, ultra-small worldness, and random errors [11, 12, 13, 14]. The degree • Complementary Cumulative Density Function en plotted as a complementary cumulative • CCDF(x=k) = P (k )dk ∼ k ∼ on (CCDF), ℘(k) ≡ k P(x)dx R∞ −α wer-law distribution shows up as a straight plot, the exponent of a power-law distri- entative characteristic, distinguishing one 28
  30. and compared against each other. Before we delve into the eccen- 3. A FEW HUB tricities and peculiarities of Twitter, we run a batch of well-known analysis and present the summary. 3.1 Reading the graph Basic Analysis Figure 1: Number of followings and followers 29
  31. and compared against each other. Before we delve into the eccen- 3. A FEW HUB tricities and peculiarities of Twitter, we run a batch of well-known analysis and present the summary. 3.1 Plenty of super-hubs Basic Analysis Figure 1: Number of followings and followers 30
  32. 3. A FEW HUB More super-hubs than projected by power-law • Where do they get all the followers? Possibly from... ‣ Search by ‘name’ ‣ Recommendation by Twitter • They reach millions in one hop 31
  33. ings of the top 40 users is 114, three orders of magnitude smaller 3. A FEW HUB than the number of followers). We revisit the issue of reciprocity in Are those who have many Section 3.3. 3.2 followers active? Followers vs. Tweets Figure 2: The number of followers and that of tweets per user 32
  34. 3. A FEW HUB How we plotted 33
  35. 3. A FEW HUB How we plotted × Med.= 8 Avg. =9 34
  36. ings of the top 40 users is 114, three orders of magnitude smaller 3. A FEW HUB than the number of followers). We revisit the issue of reciprocity in Section 3.3. More followers, more tweets 3.2 Followers vs. Tweets Figure 2: The number of followers and that of tweets per user 35
  37. ings of the top 40 users is 114, three orders of magnitude smaller 3. A FEW HUB than the number of followers). We revisit the issue of reciprocity in Section 3.3. Many followers without activity 3.2 Followers vs. Tweets Figure 2: The number of followers and that of tweets per user 36
  38. 3. A FEW HUB Twitter user rankings by Followers, PageRank and RT 37
  39. 3. A FEW HUB Twitter user rankings by Followers, PageRank and RT 38
  40. independent news media based on online distribution. Ranking by 3. A FEW HUB the retweets shows the rise of alternative media in Twitter. Great discrepancy Rankingsrankings 4.3 Comparison among among 39
  41. TRANSITIO Part IV. 1. Following is mostly not reciprocated (not so “social”) 2. Users talk about timely topics 3. A few users reach large audience directly 4. Most users can reach large audience by WOM* quickly *WOM: word-of-mouth 40
  42. 4. WORD-OF-MOUT Which is more efficient for WOM? 41
  43. 4. WORD-OF-MOUT In Twitter Information Following 42
  44. 4. WORD-OF-MOUT rather a source of information than a social networking site. Fur- ther validation is out of the scope of this paper and we leave it for future work. Average path length: 4.1 3.4 Degree of Separation Figure 4: Degree of separation 43
  45. 4. WORD-OF-MOUT Retweet (RT) • Relay tweets from a following to followers 44
  46. 4. WORD-OF-MOUT Retweet (RT) • Relay tweets from a following to followers Last day of WWW’10 node 0 45
  47. 4. WORD-OF-MOUT Retweet (RT) • Relay tweets from a following to followers Last day of WWW’10 Last day of WWW’10 Last day of WWW’10 Last day of WWW’10 46
  48. 4. WORD-OF-MOUT Retweet (RT) • Relay tweets from a following to followers RT @node0 Last day of WWW’10 node 4 47
  49. 4. WORD-OF-MOUT Retweet (RT) • Relay tweets from a following to followers RT @node0 Last day of WWW’10 RT @node0 Last day of WWW’10 RT @node0 Last day of WWW’10 48
  50. 4. WORD-OF-MOUT Retweet (RT) • Relay tweets from a following to followers retweeter r w writer 49
  51. 4. WORD-OF-MOUT Retweet (RT) • Not only hop neighbors hop neighbors r w 50
  52. 4. WORD-OF-MOUT Retweet (RT) • More goes further 2 hop neighbors r w 51
  53. 4. WORD-OF-MOUT We construct RT tree • A tree with writer and retweeter(s) r w 52
  54. 4. WORD-OF-MOUT Height of RT trees W W W r r r r r r r 53
  55. 4. WORD-OF-MOUT Empirical RT trees 54
  56. 4. WORD-OF-MOUT 96% of RT trees = Height Figure 15: Retweet trees of ‘air france flight’ tweets Figure 16: Height and participating users in retweet trees 55
  57. 4. WORD-OF-MOUT yond adjacent neighbors. We dig into the retweet trees constructed per trending topic and examine key factors that impact the eventual Boosting audience by RT spread of information. 6.1 Audience Size of Retweet Figure 14: Average and median numbers of additional recipi- ents of the tweet via retweeting 56
  58. 4. WORD-OF-MOUT Additional readers 2 additional readers by retweeter 3 followers r w 57
  59. 4. WORD-OF-MOUT yond adjacent neighbors. We dig into the retweet trees constructed A retweet brings a few per trending topic and examine key factors that impact the eventual spread of information. 6.1 hundred additional readers Audience Size of Retweet Figure 14: Average and median numbers of additional recipi- ents of the tweet via retweeting 58
  60. retweets appear and how long they last. Figure 17 plots the tim lag from a tweet to its retweet. Half of retweeting occurs within a 4. WORD-OF-MOUT hour, and 75% under a day. However about 10% of retweets tak Time lag between hops in RT tree place a month later, Figure 17: Time lag between 59 retweet and the original tweet a
  61. retweets appear and how long they last. Figure 17 plots the tim lag from a tweet to its retweet. Half of retweeting occurs within a 4. WORD-OF-MOUT Fast relaying tweets by RT: hour, and 75% under a day. However about 10% of retweets tak place a month later, 35% of RT < 10 min. Figure 17: Time lag between 60 retweet and the original tweet a
  62. retweets appear and how long they last. Figure 17 plots the tim lag from a tweet to its retweet. Half of retweeting occurs within a 4. WORD-OF-MOUT Fast relaying tweets by RT: hour, and 75% under a day. However about 10% of retweets tak place a month later, 55% of RT < 1hr. Figure 17: Time lag between 61 retweet and the original tweet a
  63. SUMMAR Summary 1. We study the entire Twittersphere 2. Low reciprocity distinguishes Twitter from OSNs 3. Twitter has characteristics of news media: ‣ Tweets mentioning timely topics ‣ Plenty of hubs reaching a large public directly ‣ Fast and wide spread of word-of-mouth 62
  64. SUMMAR Resources • http://an.kaist.ac.kr/traces/WWW2010.html 63
  65. Supplementary info. 64
  66. SUPPLEMENTARY INF This can be interpreted as a large following in another continent. Homophily in terms of followers We conclude that Twitter users who have reciprocal relations of (only in reciprocal network) fewer than 2, 000 are likely to be geographically close. Figure 6: The average number65 followers of r-friends per user of
  67. SUPPLEMENTARY INF This can be interpreted as a large following in another continent. Assortative mixing We conclude that Twitter users who have reciprocal relations of fewer than 2, 000 are likely to be geographically close. Figure 6: The average number66 followers of r-friends per user of
  68. SUPPLEMENTARY INF Homophily in terms of location (only in reciprocal network) Figure 5: The average time differences between a user and r- 67
  69. SUPPLEMENTARY INF Favoritism in RTs? • A few informative users? ? ? ? 68
  70. SUPPLEMENTARY INF Disparity in weighted network s than evenly among one’s followers. How even is the information diffu- have sion in retweet? To answer this question we investigate disparity [2] The in retweet trees. This For each user i we define |rij | as the number of retweets from d how user j. The Y (k, i) is defined as follows: k 2 |rij | Y (k, i) = k (3) height j=1 l=1 |ril | r how soon Y (k) represents Y (k, i) averaged over all nodes that have k out- time going (incoming) edges. Here an edge represents a retweet. When E. Almaas, B. Kovacs, T. Vicsek, Z. N. Oltvai, and A.-L. Barabasi. hin an retweeting occurs evenly among followers, Nature,kY (k) ∼ 1. Feb 2003. 69 then 427:839–843, If
  71. SUPPLEMENTARY INF ures 19(a) and 19(b) shows a linear correlation up to 1, 000 follow- ers. The linear correlation to k represents favoritism in retweets: people only retweets from a small number of people and only a Favoritism in RTs subset of a user’s followers actually retweet. Chun et al. also re- port that favoritism exists in conversation from guestbook logs of Cyworld, the biggest social networks in Korea [5]. tweet (a) kout Y(kout ) ∼ kout (b) kin Y(kin ) ∼ kin etweet on the Figure 19: Disparity in retweet trees and the than a hat the 7. RELATED WORK h more Online social networks and social media 70 away.
  72. responsive and basically occur back to back up to 5 hops away. SUPPLEMENTARY INF Cha et al. reports that favorite photos diffuse in the order of days Fast WOM by retweet in Flickr [4]. The strength of Twitter as a medium for information diffusion stands out by the speed of retweets. Figure 18: Elapsed time of retweet from (n − 1) hop to n hop 71

×