This research study examines how news articles propagate on Twitter. It compares 12 major news media agencies using network and temporal analysis of a dataset collected from Twitter.
1. Sharing News Articles Using 140 Characters:
A Propagation Analysis on Twitter
Dr. Sudha Ram
McClelland Professor of MIS and Computer Science
Eller College of Management
University of Arizona
Tucson, AZ 86721
Email: ram@eller.arizona.edu
April 2, 2013
INSITE: Center for Business Intelligence and Analytics
www.insiteua.org
This is joint work with Devi Bhattacharya
Friday, April 05, 2013
2. Motivation
Twitter: an important hub of social media
140 million members [1]
340 million tweets/day [1]
Evolution of Twitter as a serious newswire: credible news appearing on it before anywhere else on the web [2]
Iran election results controversy
Egyptian revolution
[1] As of March 2012, via http://blog.twitter.com/2012/03/twitter-turns-six.html
[2] http://blog.twitter.com/2007/09/twitter-for-news.html, http://blog.twitter.com/2008/07/twitter-as-news-wire.html, http://www.sysomos.com/insidetwitter/engagement/
3. Research Objective
“Understanding and comparing the influence of news
agency brands in a micro-blogging environment using
network analysis”
Measuring the news article cascades caused by Twitter
users’ participation
Investigating Twitter news agency Followers’ involvement
in the propagation process
Theory development for model of online news article propagation
Identifying and producing new network measures
4/5/2013 4:34 AM
4. Our Initial Questions
How do different news sources compare in
volume and extent of spread of news articles?
How do the articles from different news sources
spread and survive over time and what is their
lifespan?
How do rates of spread compare across different
news sources?
5. Data Collection
Dataset of tweets (21 November to 13 December 2011)
Tweets containing a valid URL of an article from the selected news sources
Excludes tweets containing a URL that refers to the news source homepage
Minimum URL reach of 5 or more (tweeted, retweeted or posted)
6 million tweets
Collected via the Twitter Streaming API and the Phirehose application
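The selection criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the domain list, the function names `article_url` and `select_tweets`, and the `(user, url)` tweet representation are hypothetical, and shortened links would need to be expanded before this step.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical domain list; the study's actual source list is not reproduced here.
NEWS_DOMAINS = {"bbc.co.uk", "nytimes.com"}
MIN_REACH = 5  # a URL must appear in at least 5 tweets/retweets


def article_url(url):
    """Return a normalised article key, or None for other sites and homepages."""
    parts = urlparse(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host not in NEWS_DOMAINS:
        return None
    if parts.path.strip("/") == "":
        return None  # link to the news source homepage, excluded
    return host + parts.path.rstrip("/")


def select_tweets(tweets):
    """Keep (user, article key) pairs whose article reaches MIN_REACH tweets."""
    keyed = [(user, article_url(url)) for user, url in tweets]
    keyed = [(user, key) for user, key in keyed if key is not None]
    reach = Counter(key for _, key in keyed)
    return [(user, key) for user, key in keyed if reach[key] >= MIN_REACH]


sample = [("u%d" % i, "http://www.nytimes.com/2011/11/21/world/story.html")
          for i in range(5)]
sample.append(("u9", "http://www.nytimes.com/"))            # homepage
sample.append(("u10", "http://www.bbc.co.uk/news/world-1"))  # reach of 1
print(len(select_tweets(sample)))  # 5
```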
6. Twitter Propagation Network
A weighted user-user network : G = (V, E, W)
[Diagram: nodes v1 and v2 joined by an edge of strength s]
where,
V : the set of nodes, representing the users on Twitter who
tweet/retweet about a news media article
E : the set of edges, signifying two users who are linked via
tweet-retweet/reply relationship
W : the normalized edge weight representing the strength
of the tweet-retweet/reply relationship. (Strength s = Number
of times v2 retweets/replies v1)
This network is called the Twitter Activity Network (TAN)
Captures the user participation in news propagation for the
entire time period
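A minimal sketch of how G = (V, E, W) could be assembled from retweet/reply records. The slides do not state how W is normalized, so dividing each strength by the largest strength in the network is an assumption made here; the function name `build_tan` and the `authors` parameter (used to keep users whose tweets drew no response) are likewise hypothetical.

```python
from collections import Counter


def build_tan(interactions, authors=()):
    """Build a weighted user-user network.

    interactions: iterable of (v1, v2) pairs, one per retweet/reply of v1 by v2.
    authors: users who tweeted an article, kept as nodes even with no response.
    Returns (nodes, weights); weights maps each edge (v1, v2) to s / max(s).
    """
    strength = Counter(interactions)  # s = number of times v2 retweets/replies v1
    nodes = set(authors) | {user for edge in strength for user in edge}
    max_s = max(strength.values(), default=1)
    # Assumed normalisation: scale strengths into (0, 1] by the maximum strength.
    weights = {edge: s / max_s for edge, s in strength.items()}
    return nodes, weights


nodes, weights = build_tan(
    [("nytimes", "a"), ("nytimes", "a"), ("a", "b")], authors=["c"]
)
print(sorted(nodes))            # ['a', 'b', 'c', 'nytimes']
print(weights[("a", "b")])      # 0.5
```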
7. Example of Twitter Activity Network
NYTimes TAN for
1 week activity
#Nodes : 58,945
#Edges : 29,676
8. Example of TAN: WaPo
Washington Post
Non-media seeding node
cascade streams
Disconnected network section – single user nodes
capturing tweets with no response
9. QUESTION 1
How do different news sources
compare in volume and extent of
spread of news articles?
Nodes volume and edge/node ratio
comparison using TAN
Ego network node analysis
10. Diffusion Network Statistics
News Source      Media Seeding Nodes              Nodes        Edges    Diameter  Edge/Node Ratio
BBC              bbcworld, bbcbreaking, bbcnews   380158 (1)   154276   34 (1)    0.406
Reuters          reuters                          61440        27035    8         0.44 (1)
Guardian         guardiannews                     21454        9250     3         0.431 (2)
NYTimes          nytimes                          155650 (3)   63436    14 (3)    0.408 (4)
NPR              nprnews                          40454        16459    4         0.407
Washington Post  washingtonpost                   98247 (4)    40791    15 (2)    0.415 (3)
FT               ft                               10242        2925     4         0.286
Forbes           forbes                           88630        28718    9         0.324
Bloomberg        bloombergnews                    26487        8719     10 (4)    0.33
Arstechnica      arstechnica                      13313        5016     3         0.377
Mashable         mashable                         197109 (2)   78620    8         0.398
Wired            wired                            43744        14786    9         0.338
Values in parentheses indicate rank for that column
Annotations on the table:
Bloomberg, NPR, Washington Post, Forbes, and Wired: high % of non-media nodes creating cascades
BBC network's three major seeding nodes (bbcworld, bbcbreaking and bbcnews): acting as information boosters for each other
Financial news agencies: low edge/node ratios, fewer long cascade streams
NYTimes vs. Washington Post
Higher number of long cascade streams
Lowest diameter, high edge/node ratio
Sparse # retweets/replies, high concentration of individual tweets
Information spread concentrated at levels closer to the seeding node
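The edge/node ratio column can be re-derived from the node and edge counts. A small worked check using the Reuters, NPR and FT rows of the table; the function name `edge_node_ratio` is ours, chosen for illustration.

```python
# (nodes, edges) as reported in the Diffusion Network Statistics table
stats = {
    "Reuters": (61440, 27035),
    "NPR": (40454, 16459),
    "FT": (10242, 2925),
}


def edge_node_ratio(nodes, edges):
    """Edges per node in a news source's Twitter Activity Network."""
    return edges / nodes


for source, (nodes, edges) in stats.items():
    print(source, round(edge_node_ratio(nodes, edges), 3))
# Reuters 0.44
# NPR 0.407
# FT 0.286
```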
11. Example of Twitter Ego Network
NYTimes Ego
Network for 1
week activity
#Nodes : 22,176
#Edges : 23,081
Ego Network ⊆ TAN
12. Analyzing News Cascades – Diffusion Depth
Day-wise Comparison of Average Diffusion Depth
[Figure: average diffusion depth by news source. Y-axis: diffusion depth (0 to 8); series: M, Tu, W, Th, F, Sa, Su, Avg]
Highest average weekly diffusion depth: BBC News (5.43), NYTimes (4.00), Washington Post (4.00)
Lowest average weekly diffusion depth: FT (1.71), Guardian (2.14), Arstechnica (2.29)
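The slides do not define diffusion depth. One plausible reading, assumed here, is the number of retweet/reply levels reachable from a seeding node. The sketch below assigns each user a level by breadth-first search; the function names and the edge representation are hypothetical.

```python
from collections import deque


def diffusion_levels(edges, seed):
    """Return {user: level}, where level is the hop count from the seed.

    edges: iterable of (v1, v2) pairs meaning v2 retweeted or replied to v1.
    """
    responders = {}
    for v1, v2 in edges:
        responders.setdefault(v1, []).append(v2)
    level = {seed: 0}
    queue = deque([seed])
    while queue:
        user = queue.popleft()
        for nxt in responders.get(user, []):
            if nxt not in level:  # first visit gives the shortest hop count
                level[nxt] = level[user] + 1
                queue.append(nxt)
    return level


def diffusion_depth(edges, seed):
    """Deepest level reached from the seed (assumed definition of depth)."""
    return max(diffusion_levels(edges, seed).values())


edges = [("bbcworld", "a"), ("bbcworld", "b"), ("a", "c"), ("c", "d"), ("x", "y")]
print(diffusion_depth(edges, "bbcworld"))  # 3
```

Users not reachable from the seed (here `x` and `y`) are left out of the result, which matches the idea of an ego network being a subset of the TAN.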
13. QUESTION 2
How do the articles from different
news sources spread and survive over
time and what is their lifespan?
Analyzing Lifespan: Time difference
between the last and first tweet posted
containing the URL to an article
Examining article survival distributions
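The lifespan definition above translates directly into code. A minimal sketch, assuming tweets are available as `(url, timestamp)` pairs; the function names `lifespans` and `surviving` are hypothetical.

```python
from datetime import datetime, timedelta


def lifespans(tweets):
    """Map each article URL to the time between its first and last tweet."""
    first, last = {}, {}
    for url, when in tweets:
        if url not in first or when < first[url]:
            first[url] = when
        if url not in last or when > last[url]:
            last[url] = when
    return {url: last[url] - first[url] for url in first}


def surviving(spans, hours):
    """Count articles whose lifespan is at least the given number of hours."""
    limit = timedelta(hours=hours)
    return sum(1 for span in spans.values() if span >= limit)


t0 = datetime(2011, 11, 21, 9, 0)
tweets = [
    ("article-a", t0),
    ("article-a", t0 + timedelta(hours=55)),
    ("article-b", t0),
    ("article-b", t0 + timedelta(hours=2)),
]
spans = lifespans(tweets)
print([surviving(spans, h) for h in (0, 24, 72)])  # [2, 1, 0]
```

Evaluating `surviving` over a range of hours gives the points of a survival curve: the count of articles still alive at each time offset from the first tweet.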
14. Life Span of Articles on Twitter
[Figure: article survival curves for NYTimes, Bloomberg, Wired, Forbes, Mashable and BBC, with callouts marking the shortest and longest lifespans]
• Y-Axis: Count of articles surviving
• X-Axis: Time Progression from first tweet submission time (In Hours)
15. QUESTION 3
How do rates of spread compare
across different news sources when
analyzed against tweet posting
activity?
Concept of URL half-life
Cumulative and non-cumulative trends
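The slides name URL half-life without defining it. The sketch below assumes one common reading: the time from the first tweet until half of all observed tweets for that URL have been posted. Both function names are hypothetical, and times are given in hours.

```python
def half_life_hours(tweet_hours):
    """Hours from the first tweet until half of all observed tweets are posted."""
    times = sorted(tweet_hours)
    if not times:
        raise ValueError("no tweets for this article")
    half = (len(times) + 1) // 2  # smallest tweet count that is at least half
    return times[half - 1] - times[0]


def cumulative_counts(tweet_hours, horizon=72):
    """Cumulative tweets posted within h hours of the first tweet, h = 0..horizon."""
    times = sorted(tweet_hours)
    start = times[0]
    return [sum(1 for t in times if t - start <= h) for h in range(horizon + 1)]


hours = [0, 0.5, 1, 2, 10, 30]
print(half_life_hours(hours))       # 1
print(cumulative_counts(hours, 3))  # [1, 3, 4, 4]
```

The non-cumulative trend is the difference between consecutive entries of the cumulative list, that is, the number of tweets posted in each hour.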
16. Cumulative Number of Tweets Posted
[Figure: cumulative tweets over time for Forbes, BBC, Mashable, Wired, NYTimes, Reuters and Washington Post, with callouts "News Sources Exceeding Half-Life - Fair" and "News Sources Exceeding Half-Life - Most Popular"]
• Y-Axis: Cumulative number of tweets posted for a given article
• X-Axis: Time Progression from first tweet submission time (In Hours)
• Tweet frequency restricted to 0-400 and time progression to 72 hours
17. Summary of Findings
• BBC
– Maximum reach in terms of affected users, retweet levels and distribution of Twitter
users in tweet-retweet ladders.
– Best article survival chances, with 0.1% of articles surviving for three days or more
– Median lifespan of article is 55 hours from first submission.
– Largely supported by 2 other BBC Twitter accounts: bbcbreaking and bbcworld.
• Reuters
– Average article lifespan and rate of spread
– Best edge/node ratio - indicating highest percentage of user interactions via retweets
and replies
• NYTimes and Mashable
– Similar tweet volume and rate of spread
– Mashable: high percentage of level 1 tweets; NYTimes: tweet-retweet cascades
resulting in a diameter of 14 levels.
• Guardian
– Most of the tweets in its ego network are concentrated in the first level
– Second highest edge/node ratio.
– Highest median number of tweets posted in the first hour
– Third highest rate of spread
18. Applying the network measures for news
agency comparison
Considering the collective influence of the network measures
on news agency article propagation
Diffusion depth and diffusion magnitude appear to be inversely related
Follower engagement seems directly proportional to diffusion magnitude
News agencies with high tweeting activity have high diffusion depth
Involvement of followers is more important than the number of followers
19. Implications
Exhaustive analysis of news media agency related
tweets
Leads for developing diffusion strategy to maximize
audience reach
Magnitude Vs. Depth
Identification of useful metrics to evaluate
propagation
Diffusion Diameter, Edge/Node Ratio, Reach
Follower Participation, User Volume, Node composition at different levels of ego network
Lifespan and related metrics, Cumulative and non-cumulative rate of spread
20. NPR Video
Click here to view the NPR news item on our work:
http://www.insiteua.org/news/Twitter_Becomes_News_Media_Tool.asp
21. Publication Reference
For more information please see our research paper:
Devipsita Bhattacharya and Sudha Ram, “Sharing News Articles Using 140 Characters: A Diffusion Analysis on Twitter”, IEEE 2012 International Conference on Advances in Social Network Analysis and Mining (ASONAM 2012), pp. 966-971.
http://www.computer.org/csdl/proceedings/asonam/2012/4799/00/4799a966-abs.html
22. Harnessing the Power of Social Media Using
INSITE
Facebook ID: BusinessIntelligenceAndAnalyticsCenter Twitter ID: insite_ua
23. Questions
INSITE: Center for Business Intelligence and Analytics
URL: www.insiteua.org
Contact Information: ram@eller.arizona.edu