What is Twitter, a Social Network or a News Media?

The 19th international conference on World wide web, Raleigh, North Carolina, USA

  • I’m Haewoon Kwak, a Ph.D. student at KAIST, Korea.
    In this talk I raise the question “What is Twitter, a social network or a news media?” and answer it with empirical data.
    This is joint work with Changhyun Lee, Hosung Park, and Sue Moon.

  • I hope all of you have heard about Twitter, a microblogging service.
    (click) People write short messages, called tweets,
    via the Twitter web page, an iPhone app, or even SMS.
    So people can write a tweet anywhere, anytime.
  • In Twitter, people can easily read tweets written by their neighbors.
    Twitter delivers neighbors’ tweets almost in real time.
    So people subscribe to their neighbors’ tweets without visiting each neighbor’s page.
    Then, in Twitter, who are the neighbors?
  • In most online social networks, such as Facebook, neighbors are called friends.
    A friend relation is established by both sides: one requests and the other accepts.

  • By contrast, in Twitter a relation can be established by one side alone.
    This relation is called following.
    Literally, the term “follow” has a direction: one follows the other.


  • So, following on Twitter is not mutual.
    If you want to follow others, you can freely follow them without their agreement.
    This is a major difference between Twitter and common online SNSs.
  • As I mentioned earlier, Twitter delivers all tweets of those I’m following almost in real time.
    So, in some sense, the act of following in Twitter, that is, subscribing to tweets, is similar to subscribing to blogs by RSS.
    Instead of visiting each blog, we can read new blog articles in an RSS reader.
    Similarly, in Twitter, we easily read new tweets of our followings with Twitter’s assistance.

  • A few events show the potential of Twitter as a news medium.
    One happened last year.
    This picture and tweet were the first report of the airplane crash on the Hudson River,
    much earlier than the mass media.
    The tweet was written on a ferry by a person going to pick up the people.
  • The other, also last year, is related to the Iran election.
    People reported and disseminated news widely and quickly, and a mass protest movement arose in Twitter. At that time, Twitter was undoubtedly one of the most influential news media.
    Then, what is Twitter today? Is the power of Twitter as a news medium still there? If something happens, can Twitter play the role of news media again?
  • We find answers in the structural properties of the Twitter network and in user behavior observed from empirical data. First, we analyze how the directed relation of following sets Twitter apart from other OSNs.
    Then, we see if Twitter has any characteristics of news media.
    (:D) By the way, what is news media? We need a clear definition.
  • Media are means of communication that reach people widely, such as radio and newspapers.
    We show that this definition holds for Twitter.
  • Remember our goal.
    We achieve this goal step by step.
  • We divide the talk into four parts.
    In the first part, we report that following is mostly not reciprocated.
    We explain how this sets Twitter apart from other OSNs.
    Next, to show the characteristics of media, we present three observations.
    We show the timeliness of tweets, which is one of the important characteristics of news media, and we demonstrate the high reachability of users.
  • For this study we collected large-scale Twitter data over four months last year.
    41.7 million user profiles, including name, location, and short bio. This is near-complete.
    1.47 billion following/follower relations. We make the following-relation data publicly available on our web site. Note that, for privacy, the published dataset contains only numeric IDs and no textual information such as names.
    We also collected about 4 thousand trending topics. Trending topics are popular words or hashtags (hashtags are similar to tags), and they are regularly announced by Twitter. We collected trending topics every 10 minutes.
    Finally, we collected 106 million tweets mentioning trending topics.

  • All our data was collected using Twitter’s great API.
    With more than 20 whitelisted IPs, we could send 20 thousand requests per IP every hour.
  • There are many recent studies reporting interesting observations about Twitter,
    but they do not explore the entire Twittersphere.
    Thus, our large-scale data is one of the strong points of our work.
  • Now we move on to the first part.
    In this part we show how different Twitter and other OSNs are.
    In particular, Twitter’s following relation differs from the friend relation in other OSNs.

  • Think about the motivation for following. Why do people follow others?
    There are two plausible explanations.
    One is a social reason: as in other OSNs, people may follow others because they are friends offline. The other is a functional reason: people may follow others as information sources, to read interesting tweets easily. Both explanations seem equally reasonable.
    Then, how can we know which is the better explanation?

  • Sociologists answered our question a few decades ago. They observed that social interactions are reciprocal: reciprocal interactions pervade every relation in all social systems.
    So, whether reciprocal interactions are observed in following relations is a good indicator of what the following relation is: a reflection of social ties, or a subscription to tweets.
  • Only 22% of user pairs follow each other, that is, follow and follow back.
    This is much lower than the values observed in other online social networks:
    68% on Flickr, 84% on Yahoo! 360, and 77% on Cyworld, the biggest OSN in Korea.
    This low reciprocity is observed consistently across the whole range of the number of followings a user has.
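
A minimal sketch of how such a reciprocity figure could be computed from a list of follow edges. It assumes that reciprocity is counted over user pairs connected by at least one follow edge; the function name `reciprocity` and the toy edge list are hypothetical and are not taken from the published dataset or its tooling.

```python
def reciprocity(follow_edges):
    """Share of connected user pairs in which both users follow each other."""
    # A directed edge (a, b) means "a follows b"; self-follows are ignored.
    edges = {(a, b) for a, b in follow_edges if a != b}
    # A connected pair is an unordered pair with at least one edge between them.
    pairs = {frozenset(edge) for edge in edges}
    # Each mutual pair contributes two directed edges, hence the division by 2.
    mutual = sum(1 for a, b in edges if (b, a) in edges) // 2
    return mutual / len(pairs) if pairs else 0.0


# Hypothetical toy data: only users 1 and 2 follow each other.
toy_edges = [(1, 2), (2, 1), (1, 3), (4, 1), (3, 5)]
print(reciprocity(toy_edges))  # 1 mutual pair out of 4 connected pairs
```

On the full graph the same ratio would need a streaming or disk-backed pass, since 1.47 billion edges may not fit in an in-memory Python set.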

  • So it is hard to say that following is a reflection of social relations, because few reciprocal interactions are observed.
    Thus, the motivation for following is mostly not social but functional: the subscription to tweets.
    Following in Twitter is not used the way friends are in Facebook, and
    this is a major difference between existing OSNs and Twitter.

  • Then, what are tweets about? What tweets do users subscribe to?
    In this part, we show that Twitter users talk about timely topics.
    This means that tweets deliver hot topics to users, as news media do.
  • This figure shows the dynamics of trending topics in Twitter.
    The green area below the red dotted line stands for the proportion of newly emerging trending topics.
    More than half of the trending topics are new.
    We also find that some trending topics appear again and again.
    This memory effect is not observed in Google’s popular search keywords, so it might come from discussion between users.
    The dynamically changing trending topics show that users talk about timely topics.


  • Looking at how the number of tweets and the number of unique users mentioning a trending topic grow over time,
    we find various growth patterns.
    For the topic apple, new users constantly write tweets.
    But for the topic iranelection, after a certain point the tweets keep growing without an influx of new writers.
    These patterns of growth in users and tweets can be a signature of a topic.


  • As the first step in analyzing participation patterns, we classify the trending topics into four classes using a method borrowed from the work of Crane and Sornette. Crane and Sornette built a framework to classify YouTube videos by their popularity curves, and their framework works well for Twitter, too. We show that the majority of trending topics are headline news. About 30% of trending topics catch the attention of a limited subset of users and eventually die out; we call them ephemeral. And 7.3% are of a more lasting nature, such as professional sports teams and brands.
  • So far we have shown that Twitter users actively subscribe to others’ tweets,
    and that those tweets deliver hot topics to users as news media do.
    Now we move on to reachability. The power of mass media comes from high reachability.
    Many people read newspapers, watch TV, and listen to the radio.
    Then, how about the reachability of Twitter users?
    To answer this question, this part first shows that a few users directly reach a large audience.


  • The number of followers is a measure of reachability, because followers read a user’s tweets.
    This figure shows the distributions of the numbers of followings and followers.
    The y-axis is the CCDF.
  • CCDF stands for the complementary cumulative distribution function.
    The value of the CCDF at x = k is the integral of the probability density function from k to infinity.
  • So, reading this graph, about 0.01% of users have more than 10 thousand followers.
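
A small sketch of an empirical CCDF over follower counts, to make the reading of the graph concrete. It assumes the discrete form P(X ≥ k), the counterpart of integrating from k to infinity; the function name `ccdf` and the toy follower counts are hypothetical.

```python
from bisect import bisect_left


def ccdf(values, k):
    """Empirical CCDF: fraction of observations that are >= k."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # bisect_left returns how many observations are strictly below k.
    below = bisect_left(ordered, k)
    return (len(ordered) - below) / len(ordered)


# Hypothetical follower counts for ten users.
toy_followers = [0, 1, 1, 2, 5, 10, 10, 100, 1000, 20000]
print(ccdf(toy_followers, 10))      # share of users with at least 10 followers
print(ccdf(toy_followers, 10001))   # share of users with more than 10,000 followers
```

Plotting `ccdf` against k on log-log axes gives the kind of curve described in the notes, where a power law would appear as a straight line.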
  • For most real networks, it is well known that the distribution of the number of neighbors follows a power law. A power-law distribution appears as a straight line in log-log space, as in this figure. But we clearly observe a bump, which shows plenty of super-hubs: users who have a few hundred thousand followers.

  • These super-hubs might result from global exposure.
    Twitter supports searching for users by real name, and provides a recommendation list of popular users,
    so we easily find celebrities and follow them.
    These super-hubs reach a large public directly.
  • But not all of them are active.
    The figure shows the correlation between the number of followers and the number of tweets written.
    Here is how we plot this graph.
  • Consider three users who each have one follower and have written 3, 9, and 12 tweets.
  • Then we compute the median and the average number of tweets over these three users, who have the same number of followers.
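
The plotting step can be sketched as a group-by over the follower count, using the three-user example from the notes. The function name `tweets_by_follower_count` and the `(followers, tweets)` tuple layout are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, median


def tweets_by_follower_count(users):
    """Map each follower count to (median, average) of tweets written.

    users: iterable of (num_followers, num_tweets) tuples.
    """
    groups = defaultdict(list)
    for followers, tweets in users:
        groups[followers].append(tweets)
    # One (median, average) point per distinct follower count.
    return {f: (median(t), mean(t)) for f, t in sorted(groups.items())}


# The toy example from the notes: three users with one follower each.
toy_users = [(1, 3), (1, 9), (1, 12)]
summary = tweets_by_follower_count(toy_users)
print(summary[1])  # (median, average) for users with one follower
```

Each key of the returned dictionary becomes one x position on the plot, with the median and the average drawn as two separate series.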
  • Up to a point, we see a positive correlation between the number of followers and the number of tweets: in this range, more followers means more tweets.
  • But we find some users who have many followers without activity.
    These users’ authority was built offline and carries over to Twitter,
    because they have written no tweets, or fewer than 10.
  • We compare the top 10 users by number of followers with the top 10 users who frequently write valuable tweets.
    The right column, labeled RT, stands for retweets.
    Here we simply consider an RT as a follower marking a tweet as valuable.

  • Once we color celebrities green and news media red,
    we clearly observe the difference between the rankings by followers and by retweets.
    It means that having many followers does not guarantee high-quality tweets.
  • Up to the top 2,000 users, there is a great discrepancy among the rankings.
  • In the previous part, we showed that a few users reach a large public directly.
    Part IV, the last part of today’s talk, shows that most users can reach a large public quickly by word-of-mouth, even though they do not reach a large audience directly.
    In other words, users without many followers can still reach a large public quickly by word-of-mouth.
  • First, we talk about the structural properties of a network that affect its efficiency for word-of-mouth (WOM).
    Consider these two networks. Each circle is an information source, and information propagates along the edges. Then, which network is more efficient for propagating information by WOM?
    All of you reach the same, correct answer: the network below.
    Intuitively, we know the average path length is an important factor in the efficiency of a network for WOM. A shorter average path length is more effective for WOM.
  • Then how can we compute the average path length in Twitter?
    Information disseminates from a user to her followers in Twitter,
    but the reverse is not always true; it holds for only 22% of pairs.
    So the average path length might be longer.
  • But the result is surprising.
    We observe that the average path length is 4.1 and the effective diameter is 4.8.
    Additionally, for 97% of node pairs, the path length is 6 or shorter.
    This is much shorter than the six degrees of separation reported in previous work.
    This shows that Twitter can be an efficient medium for word-of-mouth.
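
A hedged sketch of how path lengths along the direction of information flow could be measured with breadth-first search. It assumes edges are given as `(follower, followee)` pairs and that information flows from followee to follower; it runs BFS from every node, which is only feasible on small graphs, so a crawl of this size would need sampled sources. All function names and the toy graph are hypothetical.

```python
from collections import Counter, deque


def hop_distribution(follow_edges):
    """Count reachable ordered pairs by hop distance along the information flow."""
    followers = {}
    for follower, followee in follow_edges:
        # Information written by `followee` is delivered to `follower`.
        followers.setdefault(followee, set()).add(follower)
        followers.setdefault(follower, set())
    hops = Counter()
    for source in followers:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            user = queue.popleft()
            for reader in followers[user]:
                if reader not in dist:
                    dist[reader] = dist[user] + 1
                    hops[dist[reader]] += 1
                    queue.append(reader)
    return hops


def average_path_length(hops):
    """Mean hop distance over all reachable ordered pairs."""
    return sum(d * n for d, n in hops.items()) / sum(hops.values())


def share_within(hops, max_hops):
    """Fraction of reachable ordered pairs at most max_hops apart."""
    return sum(n for d, n in hops.items() if d <= max_hops) / sum(hops.values())


# Hypothetical toy graph: 1 follows 0, 2 follows 1, 0 follows 2.
toy_hops = hop_distribution([(1, 0), (2, 1), (0, 2)])
print(average_path_length(toy_hops), share_within(toy_hops, 1))
```

`share_within(hops, 6)` is the toy-scale analogue of the "6 or shorter" statement above.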


  • Then, how is information propagated in Twitter by word-of-mouth?
    We introduce the means of relaying information, called retweet.
    Consider a network that consists of 7 nodes, in which all edges are reciprocal.
  • One user, node 0, writes the tweet “Last day of WWW’10”.
  • This tweet is read by her 3 followers.
  • Then one of the followers relays the tweet.
    She writes the original tweet again with the additional characters “RT”, which stands for retweet,
    and a pointer to the original writer. In this case, “@node0” is the pointer to the original writer.


  • This retweet is delivered to her two followers.
  • We label the user who writes an original tweet the writer,
    and those who retweet an original tweet retweeters.
  • Without retweets, an original tweet reaches only 1-hop neighbors.
  • But with retweets, an original tweet goes further:
    in this example, to some of the 2-hop neighbors.
  • We construct a retweet tree that consists of the writer and the retweeters.
    Here we construct a retweet tree of one writer and one retweeter.

  • We can define the height of an RT tree.
    The height of an RT tree is defined as the longest path between the writer and a retweeter.
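
The height definition can be sketched as a traversal from the writer. The representation below, a dictionary mapping each user to the users who retweeted from them, is an assumption made for illustration, as are the node names.

```python
def rt_tree_height(children, writer):
    """Length of the longest path from the writer to any retweeter.

    children: dict mapping a user to the list of users who retweeted from them.
    """
    height = 0
    stack = [(writer, 0)]
    while stack:
        node, depth = stack.pop()
        height = max(height, depth)
        for child in children.get(node, []):
            stack.append((child, depth + 1))
    return height


# Hypothetical trees: a star (three direct retweeters) and a two-hop chain.
star = {"w": ["r1", "r2", "r3"]}
chain = {"w": ["r1"], "r1": ["r2"]}
print(rt_tree_height(star, "w"), rt_tree_height(chain, "w"))
```

A star-shaped tree has height 1 regardless of how many retweeters it has, which is why the number of participants and the height are reported separately.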
  • These are the shapes of empirical retweet trees about iphone.
    We see many star-shaped trees.
  • Most retweet trees have height 1, and there are a few taller trees.
    Then, how do retweets affect the number of readers empirically?
  • This figure shows the audience boosted by retweets.
    The x-axis is the number of followers of an original tweet’s writer,
    and the y-axis is the number of additional readers.
    Let us go back to the toy network.
  • The additional readers are those who read an original tweet through a retweet.
    The writer has 3 followers, and there are two additional readers.
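
A minimal sketch of counting additional readers on a toy network like the one in the slides. The exact topology below is hypothetical (the notes only say 7 nodes with reciprocal edges, a writer with 3 followers, and a retweeter with 2 further followers), and the function name is illustrative.

```python
def additional_readers(followers, writer, retweeters):
    """Users reached only through retweets, not by the original tweet."""
    # Readers of the original tweet: the writer and the writer's followers.
    direct = followers.get(writer, set()) | {writer}
    reached = set()
    for retweeter in retweeters:
        reached |= followers.get(retweeter, set())
    return reached - direct - set(retweeters)


# Hypothetical 7-node network with reciprocal edges 0-1, 0-2, 0-4, 2-3, 4-5, 4-6.
toy_network = {
    0: {1, 2, 4},
    1: {0},
    2: {0, 3},
    3: {2},
    4: {0, 5, 6},
    5: {4},
    6: {4},
}
print(sorted(additional_readers(toy_network, writer=0, retweeters=[4])))
```

Here node 0 is the writer with 3 followers and node 4 is the retweeter, so the two extra readers are node 4's followers other than the writer.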
  • Return to the graph.
    Interestingly, up to a point, the average number of additional readers gained by a retweet is a few hundred, no matter how many followers the writer has.
    Once a user’s tweet starts spreading via retweets, it is likely to reach an audience of a certain size.

  • Furthermore, this retweet process occurs quickly.
    This figure shows the cumulative distribution of the time lag between retweets.
  • 35% of retweets occur within 10 minutes.
  • Half of retweets occur within an hour.
    So, to conclude this part,
    retweets provide most users with a means to propagate information widely and quickly.
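
A short sketch of how the share of retweets within a time limit could be read off a list of timestamp pairs. The `(original_time, retweet_time)` layout, the function name, and the toy timestamps are assumptions for illustration and do not reflect the actual trace format.

```python
from datetime import datetime, timedelta


def share_retweeted_within(pairs, limit):
    """Fraction of retweets whose lag from the previous hop is at most `limit`.

    pairs: list of (original_time, retweet_time) datetime tuples.
    """
    lags = [retweet_time - original_time for original_time, retweet_time in pairs]
    return sum(lag <= limit for lag in lags) / len(lags)


# Hypothetical retweets arriving 2, 8, 45, and 300 minutes after the tweet.
t0 = datetime(2009, 7, 1, 12, 0)
toy_pairs = [(t0, t0 + timedelta(minutes=m)) for m in (2, 8, 45, 300)]
print(share_retweeted_within(toy_pairs, timedelta(minutes=10)))
print(share_retweeted_within(toy_pairs, timedelta(hours=1)))
```

Evaluating the function over a range of limits yields the cumulative distribution shown in the figure.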
  • Through this work we collected and published a huge social graph of Twitter.
    We then showed low reciprocity, which is reason enough to interpret following as a subscription to tweets rather than a representation of offline social relationships.
    Furthermore, Twitter has characteristics of news media: tweets mentioning timely topics, plenty of hubs who reach a large audience directly, and fast and wide coverage by word-of-mouth.









Transcript

  • 1. TWITTE What is Twitter, a Social Network or a News Media? Haewoon Kwak Changhyun Lee Hosung Park Sue Moon Department of Computer Science, KAIST, Korea 19th International World Wide Web Conference (WWW2010)
  • 2. TWITTE Twitter, a microblog service 2
  • 3. TWITTE Twitter, a microblog service write a short message 2
  • 4. TWITTE Twitter, a microblog service read neighbors’ tweets 3
  • 5. TWITTE In most OSN “We are friends.” 4
  • 6. TWITTE In Twitter “I follow you.” 5
  • 7. TWITTE on Twitter “Unlike most social networks, following on Twitter is not mutual.  Someone who thinks you're interesting can follow you, and you don't have to approve, or follow back." 6 http://help.twitter.com/entries/14019-what-is-following
  • 8. TWITTE Following = subscribing tweets recent tweets of followings 7
  • 9. 8
  • 10. TWITTE http://blog.marsdencartoons.com/2009/06/18/cartoon-iranian-election-demonstrations-and-twitter/marsden-iran-twitter72/ 9
  • 11. PROBLEM STATEMEN The goal of this work We analyze how directed relations of following set Twitter apart from existing OSNs. Then, we see if Twitter has any characteristics of news media. 10
  • 12. TWITTE me⋅di⋅a [mee-dee-uh] 1.a pl. of medium 2.the means of communication, as radio and television, newspapers, and magazines, that reach or influence people widely 11 http://dictionary.reference.com/
  • 13. PROBLEM STATEMEN The goal of this work We analyze how directed relations of following set Twitter apart from existing OSNs. Then, we see if Twitter has any characteristics of news media. 12
  • 14. OUTLIN Summary of our findings 1. Following is mostly not reciprocated (not so “social”) 2. Users talk about timely topics 3. A few users reach large audience directly 4. Most users can reach large audience by WOM* quickly *WOM: word-of-mouth 13
  • 15. TWITTE Data collection (09/6/1~9/24) • 41.7M user profiles (near-complete at that time) • 1.47B following relations *publicly available • 4262 trending topics • 106M tweets mentioning trending topics ‣ Spam tweets removed by CleanTweets 14 *http://an.kaist.ac.kr/traces/WWW2010.html
  • 16. TWITTE How we crawled • Twitter’s well-defined 3rd party API • With 20+ ‘whitelisted’ IPs ‣ Send 20,000 requests per IP / hour 15
  • 17. TWITTE Recent studies • Ranking methodologies [WSDM’10] • Predicting movie profits [HYPERTEXT’10] • Recommending users [CHI’10 microblogging] • Detecting real time events [WWW’10] • The ‘entire’ Twittersphere unexplored 16
  • 18. TRANSITIO Part I. 1. Following is mostly not reciprocated (not so “social”) 2. Users talk about timely topics 3. A few users reach large audience directly 4. Most users can reach large audience by WOM* quickly 17
  • 19. 2. ACTIVE SUBSCRIPTIO Why do people follow others? • Reflection of offline social relationships otherwise, • Subscription to others’ messages 18
  • 20. 2. ACTIVE SUBSCRIPTIO Sociologists’ answer • “Reciprocal interactions pervade every relation of primitive life and in all social systems” 19
  • 21. 2. ACTIVE SUBSCRIPTIO Is following reciprocal? • Only 22.1% of user pairs follow each other • Much lower than ‣ 68% on Flickr ‣ 84% on Yahoo! 360 ‣ 77% on Cyworld guestbook messages 20
  • 22. 2. ACTIVE SUBSCRIPTIO Low reciprocity of following • Following is not similarly used as friend in OSNs ‣ Not reflection of offline social relationships • Active subscription of tweets! 21
  • 23. TRANSITIO Part II. 1. Following is mostly not reciprocated (not so “social”) 2. Users talk about timely topics 3. A few users reach large audience directly 4. Most users can reach large audience by WOM* quickly 22
  • 24. 1. TIMELINESS TOPIC Dynamically changing trends 23
  • 25. 1. TIMELINESS TOPIC User participation pattern can be a signature of a topic. Paper excerpt, 5.3 User Participation in Trending Topics: How many topics does a user participate on average? Out of 41 million Twitter users, a large number of users (8,262,545) participated in trending topics and about 15% of those users participated in more than 10 topics during four months. Figure 11: Cumulative numbers of tweets and users over time, (a) Topic ’apple’ (b) Topic ’#iranelection’ 24
  • 26. 1. TIMELINESS TOPIC Majority of topics are headline topics ranked by the pro- e about offline news, and ‘remembering 9’) and, we cting frequent words from 31.5% 54.3% “ephemeral” “headline news” rending Topics te on average? Out of 41 sers (8, 262, 545) partici- f those users participated (a) Exogenous subcritical (b) Exogenous critical s. (topic ‘#backintheday’) (topic ‘beyonce’) 7.3% 6.9% “persistent news” (c) Endogenous subcritical (d) Endogenous critical (topic ‘lynn harris’) (topic ‘#redsox’) Topic ’#iranelection’ eets and users over time Figure 13: The examples of classified popularity patterns 25
  • 27. TRANSITIO Part III. 1. Following is mostly not reciprocated (not so “social”) 2. Users talk about timely topics 3. A few users reach large audience directly 4. Most users can reach large audience by WOM* quickly 26
  • 28. 3. A FEW HUB How many followers a user has? Paper excerpt: … and compared against each other. Before we delve into the eccentricities and peculiarities of Twitter, we run a batch of well-known analysis and present the summary. 3.1 Basic Analysis. Figure 1: Number of followings and followers 27
  • 29. 3. A FEW HUB CCDF • Complementary Cumulative Distribution Function • $\mathrm{CCDF}(x=k) = \int_{k}^{\infty} P(x)\,dx$ 28
  • 30. 3. A FEW HUB Reading the graph. Paper excerpt: … and compared against each other. Before we delve into the eccentricities and peculiarities of Twitter, we run a batch of well-known analysis and present the summary. 3.1 Basic Analysis. Figure 1: Number of followings and followers 29
  • 31. 3. A FEW HUB Plenty of super-hubs. Paper excerpt: … and compared against each other. Before we delve into the eccentricities and peculiarities of Twitter, we run a batch of well-known analysis and present the summary. 3.1 Basic Analysis. Figure 1: Number of followings and followers 30
  • 32. 3. A FEW HUB More super-hubs than projected by power-law • Where do they get all the followers? Possibly from... ‣ Search by ‘name’ ‣ Recommendation by Twitter • They reach millions in one hop 31
  • 33. 3. A FEW HUB Are those who have many followers active? Paper excerpt: …ings of the top 40 users is 114, three orders of magnitude smaller than the number of followers). We revisit the issue of reciprocity in Section 3.3. 3.2 Followers vs. Tweets. Figure 2: The number of followers and that of tweets per user 32
  • 34. 3. A FEW HUB How we plotted 33
  • 35. 3. A FEW HUB How we plotted × Med.= 8 Avg. =9 34
  • 36. 3. A FEW HUB More followers, more tweets. Paper excerpt: …ings of the top 40 users is 114, three orders of magnitude smaller than the number of followers). We revisit the issue of reciprocity in Section 3.3. 3.2 Followers vs. Tweets. Figure 2: The number of followers and that of tweets per user 35
  • 37. 3. A FEW HUB Many followers without activity. Paper excerpt: …ings of the top 40 users is 114, three orders of magnitude smaller than the number of followers). We revisit the issue of reciprocity in Section 3.3. 3.2 Followers vs. Tweets. Figure 2: The number of followers and that of tweets per user 36
  • 38. 3. A FEW HUB Twitter user rankings by Followers, PageRank and RT 37
  • 39. 3. A FEW HUB Twitter user rankings by Followers, PageRank and RT 38
  • 40. 3. A FEW HUB Great discrepancy among rankings. Paper excerpt: … independent news media based on online distribution. Ranking by the retweets shows the rise of alternative media in Twitter. 4.3 Comparison among Rankings 39
  • 41. TRANSITIO Part IV. 1. Following is mostly not reciprocated (not so “social”) 2. Users talk about timely topics 3. A few users reach large audience directly 4. Most users can reach large audience by WOM* quickly *WOM: word-of-mouth 40
  • 42. 4. WORD-OF-MOUT Which is more efficient for WOM? 41
  • 43. 4. WORD-OF-MOUT In Twitter Information Following 42
  • 44. 4. WORD-OF-MOUT Average path length: 4.1. Paper excerpt: … rather a source of information than a social networking site. Further validation is out of the scope of this paper and we leave it for future work. 3.4 Degree of Separation. Figure 4: Degree of separation 43
  • 45. 4. WORD-OF-MOUT Retweet (RT) • Relay tweets from a following to followers 44
  • 46. 4. WORD-OF-MOUT Retweet (RT) • Relay tweets from a following to followers Last day of WWW’10 node 0 45
  • 47. 4. WORD-OF-MOUT Retweet (RT) • Relay tweets from a following to followers Last day of WWW’10 Last day of WWW’10 Last day of WWW’10 Last day of WWW’10 46
  • 48. 4. WORD-OF-MOUT Retweet (RT) • Relay tweets from a following to followers RT @node0 Last day of WWW’10 node 4 47
  • 49. 4. WORD-OF-MOUT Retweet (RT) • Relay tweets from a following to followers RT @node0 Last day of WWW’10 RT @node0 Last day of WWW’10 RT @node0 Last day of WWW’10 48
  • 50. 4. WORD-OF-MOUT Retweet (RT) • Relay tweets from a following to followers retweeter r w writer 49
  • 51. 4. WORD-OF-MOUT Retweet (RT) • Not only hop neighbors hop neighbors r w 50
  • 52. 4. WORD-OF-MOUT Retweet (RT) • More goes further 2 hop neighbors r w 51
  • 53. 4. WORD-OF-MOUT We construct RT tree • A tree with writer and retweeter(s) r w 52
  • 54. 4. WORD-OF-MOUT Height of RT trees W W W r r r r r r r 53
  • 55. 4. WORD-OF-MOUT Empirical RT trees 54
  • 56. 4. WORD-OF-MOUT 96% of RT trees = Height Figure 15: Retweet trees of ‘air france flight’ tweets Figure 16: Height and participating users in retweet trees 55
  • 57. 4. WORD-OF-MOUT Boosting audience by RT. Paper excerpt: … beyond adjacent neighbors. We dig into the retweet trees constructed per trending topic and examine key factors that impact the eventual spread of information. 6.1 Audience Size of Retweet. Figure 14: Average and median numbers of additional recipients of the tweet via retweeting 56
  • 58. 4. WORD-OF-MOUT Additional readers 2 additional readers by retweeter 3 followers r w 57
  • 59. 4. WORD-OF-MOUT A retweet brings a few hundred additional readers. Paper excerpt: … beyond adjacent neighbors. We dig into the retweet trees constructed per trending topic and examine key factors that impact the eventual spread of information. 6.1 Audience Size of Retweet. Figure 14: Average and median numbers of additional recipients of the tweet via retweeting 58
  • 60. 4. WORD-OF-MOUT Time lag between hops in RT tree. Paper excerpt: … retweets appear and how long they last. Figure 17 plots the time lag from a tweet to its retweet. Half of retweeting occurs within an hour, and 75% under a day. However about 10% of retweets take place a month later, … Figure 17: Time lag between a retweet and the original tweet 59
  • 61. 4. WORD-OF-MOUT Fast relaying tweets by RT: 35% of RT < 10 min. Paper excerpt: … retweets appear and how long they last. Figure 17 plots the time lag from a tweet to its retweet. Half of retweeting occurs within an hour, and 75% under a day. However about 10% of retweets take place a month later, … Figure 17: Time lag between a retweet and the original tweet 60
  • 62. 4. WORD-OF-MOUT Fast relaying tweets by RT: 55% of RT < 1hr. Paper excerpt: … retweets appear and how long they last. Figure 17 plots the time lag from a tweet to its retweet. Half of retweeting occurs within an hour, and 75% under a day. However about 10% of retweets take place a month later, … Figure 17: Time lag between a retweet and the original tweet 61
  • 63. SUMMAR Summary 1. We study the entire Twittersphere 2. Low reciprocity distinguishes Twitter from OSNs 3. Twitter has characteristics of news media: ‣ Tweets mentioning timely topics ‣ Plenty of hubs reaching a large public directly ‣ Fast and wide spread of word-of-mouth 62
  • 64. SUMMAR Resources • http://an.kaist.ac.kr/traces/WWW2010.html 63
  • 65. Supplementary info. 64
  • 66. SUPPLEMENTARY INF Homophily in terms of followers (only in reciprocal network). Paper excerpt: This can be interpreted as a large following in another continent. We conclude that Twitter users who have reciprocal relations of fewer than 2,000 are likely to be geographically close. Figure 6: The average number of followers of r-friends per user 65
  • 67. SUPPLEMENTARY INF Assortative mixing. Paper excerpt: This can be interpreted as a large following in another continent. We conclude that Twitter users who have reciprocal relations of fewer than 2,000 are likely to be geographically close. Figure 6: The average number of followers of r-friends per user 66
  • 68. SUPPLEMENTARY INF Homophily in terms of location (only in reciprocal network) Figure 5: The average time differences between a user and r- 67
  • 69. SUPPLEMENTARY INF Favoritism in RTs? • A few informative users? ? ? ? 68
  • 70. SUPPLEMENTARY INF Disparity in weighted network. Paper excerpt: How even is the information diffusion in retweet? To answer this question we investigate disparity [2] in retweet trees. For each user i we define |r_ij| as the number of retweets from user j. The Y(k, i) is defined as follows: $Y(k, i) = \sum_{j=1}^{k} \left( \frac{|r_{ij}|}{\sum_{l=1}^{k} |r_{il}|} \right)^{2}$ (3). Y(k) represents Y(k, i) averaged over all nodes that have k outgoing (incoming) edges. Here an edge represents a retweet. When retweeting occurs evenly among followers, then kY(k) ∼ 1. Reference on slide: E. Almaas, B. Kovacs, T. Vicsek, Z. N. Oltvai, and A.-L. Barabasi. Nature, 427:839–843, Feb 2003. 69
  • 71. SUPPLEMENTARY INF ures 19(a) and 19(b) shows a linear correlation up to 1, 000 follow- ers. The linear correlation to k represents favoritism in retweets: people only retweets from a small number of people and only a Favoritism in RTs subset of a user’s followers actually retweet. Chun et al. also re- port that favoritism exists in conversation from guestbook logs of Cyworld, the biggest social networks in Korea [5]. tweet (a) kout Y(kout ) ∼ kout (b) kin Y(kin ) ∼ kin etweet on the Figure 19: Disparity in retweet trees and the than a hat the 7. RELATED WORK h more Online social networks and social media 70 away.
  • 72. responsive and basically occur back to back up to 5 hops away. SUPPLEMENTARY INF Cha et al. reports that favorite photos diffuse in the order of days Fast WOM by retweet in Flickr [4]. The strength of Twitter as a medium for information diffusion stands out by the speed of retweets. Figure 18: Elapsed time of retweet from (n − 1) hop to n hop 71