Paper by Axel Bruns, Patrik Wikström, Peta Mitchell, Brenda Moon, Felix Münch, Lucia Falzon, and Lucy Resnyansky presented at the ACSPRI 2016 conference, Sydney, 19-22 July 2016.
Information Contagion through Social Media: Towards a Realistic Model of the Australian Twittersphere
1. UNCLASSIFIED – Approved For Public release
Information Contagion through Social Media: Towards
a Realistic Model of the Australian Twittersphere
Work in progress
Axel Bruns, Patrik Wikström, Peta Mitchell,
Brenda Moon, and Felix Münch
Digital Media Research Centre
Queensland University of Technology
Lucia Falzon and Lucy Resnyansky
NSID DSTG
5th Biennial ACSPRI Social Science Methodology Conference
July 19-22, 2016, The University of Sydney
2. Introduction
AREA: the development of contagion simulation approaches
AIM: to simulate the effects of a range of possible communication strategies on a network
structure that accurately replicates the real-world Twitter follower network in Australia
FOCUS: crisis communication during two qualitatively different events:
– the Brisbane flood: affected a large geographical area and a large number of
people, either as an actual or a potential threat;
– the Sydney siege: located at a single point and directly affected only a small
number of people, but was the focus of attention for many who were located at a
significant distance from the actual event location
OUTCOMES:
– new methodological impulses for the modelling of realistic information contagion
processes in social media
– directly actionable insights into the specific processes of information contagion in crisis
contexts within the Australian Twittersphere.
3. Social Contagion in Social Media: Literature
Fabrega, J., & Paredes, P. (2013). Social Contagion and Cascade Behaviors
on Twitter. Information, 4(2), 171-181. [Twitter]
Hodas, N. O., & Lerman, K. (2014). The simple rules of social contagion.
Scientific Reports, 4. [Twitter]
Kramer, A. D., Guillory, J. E., & Hancock, J. T. (2014). Experimental evidence
of massive-scale emotional contagion through social networks. PNAS
111(24): 8788–90. [Facebook]
Shuai, X., Ding, Y., Busemeyer, J., Chen, S., Sun, Y., & Tang, J. (2012).
Modeling indirect influence on twitter. International Journal on Semantic
Web and Information Systems (IJSWIS), 8(4), 20-36. [Twitter]
Weng, L., Menczer, F., & Ahn, Y.-Y. (2013). Virality Prediction and Community
Structure in Social Networks. Sci. Rep., 3. doi: 10.1038/srep0252 [Twitter]
4. Important Concepts and Definitions for This Project
1. Contagion as a property of a spreading object:
the inherent virality
2. Contagion as a process:
simple vs. complex contagion
3. Contagion as a structural property:
virality of a cascade
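One way to make the third notion concrete is the average shortest-path distance between all pairs of nodes in a cascade tree, that is, the Wiener index divided by the number of node pairs. The slides do not state which measure the project uses, so the sketch below is an assumption; the function name and the edge-list input format are hypothetical, and the cascade is assumed to be a connected tree.

```python
from collections import deque


def structural_virality(edges):
    """Average shortest-path distance over all node pairs of a cascade.

    `edges` is a list of (source, re-poster) links, treated as undirected.
    Assumes the cascade is connected (illustrative sketch only).
    """
    adjacency = {}
    for parent, child in edges:
        adjacency.setdefault(parent, set()).add(child)
        adjacency.setdefault(child, set()).add(parent)
    n = len(adjacency)
    if n < 2:
        return 0.0

    total = 0  # sums every ordered pair, i.e. twice the Wiener index
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            current = queue.popleft()
            for neighbour in adjacency[current]:
                if neighbour not in dist:
                    dist[neighbour] = dist[current] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
    return total / (n * (n - 1))


# A broadcast (one account re-posted by four followers) versus a chain.
broadcast = [("s", "a"), ("s", "b"), ("s", "c"), ("s", "d")]
chain = [("s", "a"), ("a", "b"), ("b", "c"), ("c", "d")]
print(structural_virality(broadcast), structural_virality(chain))
```

With the same number of accounts, the chain scores higher than the broadcast, which is the distinction between serial transmission and one-to-many dissemination.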
6. The Australian Twittersphere
Twitter in Australia:
– Strong take-up since 2009
– Centred around 25-55 age range, urban, educated, affluent users (but gradually broadening)
– Significant role in crisis communication, political communication, audience engagement, …
Mapping the Twittersphere:
– Long-term project to identify all Australian Twitter accounts
– First iteration: snowball crawl of follower/followee networks
• Starting with key hashtag populations (#auspol, #spill, …)
• Map of ~1m accounts in early 2012
– Second iteration: full crawl of global Twitter ID numberspace through to Sep. 2013
(~870m accounts)
• Filtering by description, location, timezone fields
• Focus on identifiably Australian cities, states, timezones and other markers
• 2.8 million Australian accounts identified (by Sep. 2013)
• Retrieval of their follower/followee lists
• Best guess of account location based on timezone, location and description settings
7. The Australian Twittersphere
2.8m known Australian accounts
Network of follower connections
Filtered for degree ≥1000
140k nodes (~5%), 22.8m edges
Labels assigned through qualitative evaluation
[Figure: network map of the Australian Twittersphere. Cluster labels: Education, Agriculture, Literature, Adelaide / SA, Food, Wine, Beer, Parenting, Mums PR, Netizens, Marketing, Investing, Real Estate, Home Business, Sole Traders, Self-Help, HR / Support, Followback, Urban Media, Utilities, Advertising, Business, Fashion, Beauty, Arts, Cinema, Journalists, Politics, Hard Right, Leftists, News, Cycling, Talkback, Music, TV, V8s, UFC, NRL, AFL, Football, Horse Racing, Cricket, NRU, Celebrities, Hillsong, Perth, Pop, Media, Teen Idols, Cody Simpson]
8. TrISMA: Tracking Australian Twitter
ARC LIEF project:
– Tracking Infrastructure for Social Media Analysis
– Multi-university project led by QUT to develop comprehensive
infrastructure for large-scale social media data analytics
– Twitter: continuous capture of tweets by all 2.8m identified Australian
accounts
– 1b+ tweets captured to date, 1m+ new tweets/day
– Data storage via Google BigQuery, analysis via Tableau and Gephi
9. Modelling Experiments
Two issues to be addressed:
1. The impact that accounts with certain characteristics have, during
the early phases of the crisis communication process, on the
overall dissemination of emergency messages.
2. The impact that using Twitter-specific communicative
features – e.g. a topical hashtag – has on the dissemination
of emergency messages.
10. Research Assumptions & Questions
ASSUMPTION (1): Contagion behaviour on Twitter is affected by a range of
factors that are specific to individual users and local subsets of the wider
network. Modelling of contagion processes based on realistic data can
identify the impact of these factors.
QUESTION (1): What is the impact of factors such as the following on
contagion processes: volume of incoming content feeds to an individual
account; repeated exposure to contagious content; network position of
immediately preceding vector of contagion; affinity of message content to
recipient's key interests; ...?
11. Research Assumptions & Questions
ASSUMPTION (2): People are more likely to share information with those who
are similar to them (Romero et al. 2011).
QUESTION (2): Does this assumption still hold in times of crisis? In the case of
the Australian Twittersphere, does the community structure (clusters of users
based on similarity of interests, tastes, demographics, etc.) change to reflect
the fact that in crises the basis of online community structuring differs from
normal life situations?
12. Research Assumptions & Questions
ASSUMPTION (3): Rumour diffusion on social media can be modelled in the
same way as the spread of rumours according to a traditional model based on
studies of rumours in a mass media society.
The traditional rumour model (Andrews et al. 2016; Freberg 2012):
• an official source determines the certainty and veracity of a piece of information
• information is a rumour only until it is either confirmed or denied by the official
source
• rumours only spread until certainty is established.
QUESTION (3): How significant is the impact of official tweets on rumour
spreading? What implications does this have for the communication
strategies of official stakeholders?
13. Factors to Consider in Understanding Retweeting Patterns
The relevance of the information source to the crisis situation has a bigger impact
on retweetability than a large number of subscribers or followers.
The waiting time of a retweet in the context of extreme events: hashtags are
strongly correlated with shorter waiting times, whereas a large number of followers
is associated with longer waiting times; messages broadcast to a large number of
individuals may make re-tweeting redundant (Spiro et al. 2012).
Case study: the Sky News Australia Lakemba Raids rumour vs. the AFPMedia tweet
denying the rumour (Arif et al. 2016)
- due to a lack of serial transmission, the Sky News tweet had a very low impact
on the rumour’s overall propagation. Even though the AFP had far fewer
followers, their message was able to spread to a larger network
- a large number of followers may be counter-productive for rumour spreading
because people might assume that others have already seen the message.
14. Possible Modelling Parameters
Snowball sampling from seeds within each cluster, with the number of seeds in
proportion to the size of cluster.
Modelling at the network cluster level instead of the current node-level modelling.
The model currently uses network-wide metrics for generation; we want to expand
this to allow for different coefficients for different clusters in the network.
Incorporate different behaviour for different types of nodes, for example
individuals, media outlets, bots of different types, verified users.
Introduce additional metrics such as tweeting rate (number of tweets sent during
a period) and proportion of tweets, retweets, @mentions and hashtag use for each
node.
Extract these new metrics and the topic preferences from the selected nodes in
the Australian Twittersphere.
Model effect of geographic location in addition to network location.
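The first parameter above, snowball sampling with seeds in proportion to cluster size, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the largest-remainder rounding rule, and the cluster sizes in the example are hypothetical and are not taken from the project.

```python
def allocate_seeds(cluster_sizes, total_seeds):
    """Split `total_seeds` across clusters in proportion to cluster size.

    Uses largest-remainder rounding so the counts sum to `total_seeds`.
    """
    population = sum(cluster_sizes.values())
    quotas = {c: total_seeds * size / population
              for c, size in cluster_sizes.items()}
    seeds = {c: int(q) for c, q in quotas.items()}
    leftover = total_seeds - sum(seeds.values())
    # Hand the remaining seeds to the clusters with the largest fractions.
    by_remainder = sorted(quotas, key=lambda c: quotas[c] - seeds[c],
                          reverse=True)
    for c in by_remainder[:leftover]:
        seeds[c] += 1
    return seeds


def snowball_sample(followees, seeds, waves):
    """Return the seeds plus every account reached within `waves` follow steps.

    `followees` maps an account to the accounts it follows.
    """
    sampled = set(seeds)
    frontier = set(seeds)
    for _ in range(waves):
        frontier = {n for account in frontier
                    for n in followees.get(account, ())} - sampled
        sampled |= frontier
    return sampled


# Hypothetical cluster sizes, for illustration only.
print(allocate_seeds({"Politics": 600, "Sport": 300, "Music": 100}, 10))
print(snowball_sample({"a": ["b"], "b": ["c"], "c": ["d"]}, ["a"], waves=2))
```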
15. Key Measures in Modelling
Extent of serial transmission (time-ordered reach)
Overall propagation (proportion of nodes a message reaches; this measure
is related to information exposure, the total number of people who have
been exposed to messages related to a particular piece of information)
Retweet waiting time – the time between the first posting of a message
and its re-transmission
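A minimal sketch of how these three measures might be computed from a recorded cascade. The data layout and function names are assumptions made for illustration: `reposted_from` maps each re-posting account to the account it re-posted from, `followers` maps an account to its set of followers, and serial transmission is read here as the longest chain of successive re-posts.

```python
def serial_transmission_depth(reposted_from):
    """Length of the longest chain of successive re-posts.

    Assumes the mapping contains no cycles.
    """
    def depth(account):
        steps = 0
        while account in reposted_from:
            account = reposted_from[account]
            steps += 1
        return steps
    return max((depth(a) for a in reposted_from), default=0)


def overall_propagation(posters, followers, network_size):
    """Proportion of the network reached: posters plus their followers."""
    reached = set(posters)
    for account in posters:
        reached |= followers.get(account, set())
    return len(reached) / network_size


def waiting_times(first_post_time, repost_times):
    """Time between the first posting and each re-transmission."""
    return [t - first_post_time for t in sorted(repost_times)]


reposted_from = {"b": "a", "c": "b", "d": "a"}
followers = {"a": {"b", "d", "e"}, "b": {"c"}, "c": set(), "d": {"f"}}
print(serial_transmission_depth(reposted_from))
print(overall_propagation(["a", "b", "c", "d"], followers, network_size=10))
print(waiting_times(100, [160, 130, 400]))
```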
16. Simulation model
An agent-based model has been developed in order to
simulate information contagion in a social media network.
The agents in the network are “accounts” that follow
other accounts and create “posts” that diffuse through the
network based on certain rules.
There are two main processes that are modelled; network
generation and information diffusion (outlined below).
The model is very much a work in progress at this stage and
will be developed as the project continues.
17. Account Characteristics
The accounts are modelled as having a specific interest profile.
The profile consists of n_t topics t, and each account’s interest
i_t in a specific topic varies between 0 and 100 (see illustration).
Accounts are more likely to create and pass on posts that fit
their interest profile.
Accounts are also likely to follow other accounts with a similar
interest profile (homophily).
Example:
[Figure: interest profiles of Account 1 and Account 2 across Topics A to E, on a scale from 0 to 100]
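The interest profile can be sketched as a mapping from topic to an interest value between 0 and 100. The slides do not say how profile similarity is measured or how a topic is chosen for a new post; cosine similarity and interest-weighted sampling are assumptions made here for illustration, and all names are hypothetical.

```python
import math
import random

TOPICS = ["A", "B", "C", "D", "E"]


def random_profile(rng):
    """Draw an interest value between 0 and 100 for every topic."""
    return {topic: rng.uniform(0, 100) for topic in TOPICS}


def profile_similarity(p, q):
    """Cosine similarity between two interest profiles (1.0 = same shape)."""
    dot = sum(p[t] * q[t] for t in TOPICS)
    norm = (math.sqrt(sum(p[t] ** 2 for t in TOPICS))
            * math.sqrt(sum(q[t] ** 2 for t in TOPICS)))
    return dot / norm if norm else 0.0


def pick_topic(profile, rng):
    """Choose a topic with probability proportional to interest.

    Assumes at least one interest value is greater than zero.
    """
    topics = list(profile)
    weights = [profile[t] for t in topics]
    return rng.choices(topics, weights=weights, k=1)[0]


rng = random.Random(42)
account_1 = random_profile(rng)
account_2 = random_profile(rng)
print(round(profile_similarity(account_1, account_2), 3))
print(pick_topic(account_1, rng))
```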
18. Brief Overview of the Network Generation Process in
the Simulation Model
The model can either load a real-world network structure or generate an
artificial network that replicates characteristics of the real-world social
network.
When an artificial network is generated it is controlled by the following
weighted parameters:
– The extent a new account is likely to connect with accounts…
• …beyond the friends of its current friends. (Introversion)
• …with a similar profile (Homophily)
• …that already have many followers (Popularity preference)
• …that are active communicators (Communicator preference)
– The likelihood that an existing account starts following another account, based
on the parameters listed above.
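The weighted parameters above could be combined as in the sketch below, where each candidate account receives a score and the new account picks whom to follow with probability proportional to that score. The linear combination, the normalisation, and every name in the code are assumptions; the slides do not specify how the model combines the weights.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    similarity: float       # profile similarity to the new account, 0..1
    followers: int          # current follower count
    posts_per_step: float   # how actively the account communicates
    friend_of_friend: bool  # already within the friends-of-friends circle


def follow_scores(candidates, weights):
    """Score each candidate as a weighted sum of the four preferences."""
    max_followers = max(c.followers for c in candidates) or 1
    max_activity = max(c.posts_per_step for c in candidates) or 1
    scores = {}
    for c in candidates:
        scores[c.name] = (
            weights["homophily"] * c.similarity
            + weights["popularity"] * c.followers / max_followers
            + weights["communicator"] * c.posts_per_step / max_activity
            # "Introversion" rewards accounts beyond friends of friends.
            + weights["introversion"] * (0.0 if c.friend_of_friend else 1.0)
        )
    return scores


def choose_followee(candidates, weights, rng):
    """Pick one candidate with probability proportional to its score."""
    scores = follow_scores(candidates, weights)
    names = list(scores)
    return rng.choices(names, weights=[scores[n] for n in names], k=1)[0]


candidates = [
    Candidate("hub", similarity=0.2, followers=1000, posts_per_step=1.0,
              friend_of_friend=True),
    Candidate("peer", similarity=0.9, followers=10, posts_per_step=1.0,
              friend_of_friend=True),
]
weights = {"homophily": 1.0, "popularity": 0.0,
           "communicator": 0.0, "introversion": 0.0}
print(follow_scores(candidates, weights))
print(choose_followee(candidates, weights, random.Random(0)))
```

Raising the popularity weight relative to the homophily weight shifts the choice towards the hub account; the same scoring could be reused for existing accounts that start following another account.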
19. Brief Overview of the Information Diffusion Process in
the Simulation Model
Posting
– The probability that an account sends a post during a timestep is p (exponentially distributed).
– When an account creates a post, the post is assigned to one of the topics in
the account’s interest profile.
Observing
– An account receives posts from the accounts that it follows.
– An account keeps the x most recently received posts in its newsfeed.
Re-posting
– The likelihood that an account re-posts a post in their newsfeed is p, which is a
function of how well the post fits with the account’s interest profile.
– A post is only re-posted once.
21. Example of a Basic Simulation Experiment
Output data is recorded to allow for
post-simulation analysis.
This is an example of a basic simulation
experiment that explores the
relationships between the size of the
creator’s 1-step and 2-step
neighbourhoods and the number of
accounts that (a) see the post and (b)
re-tweet the post.
As expected, the experiment shows that
the size of the 2-step neighbourhood
correlates more strongly with the diffusion
of the posts than the size of the 1-step
neighbourhood does.
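A minimal sketch of the quantities behind such an experiment: the size of a creator's 1-step and 2-step follower neighbourhoods, and a Pearson correlation that could be applied to recorded simulation output. The function names are hypothetical, and counting the 1-step neighbourhood as part of the 2-step neighbourhood is an assumption.

```python
import math


def neighbourhood_sizes(followers, creator):
    """Return (1-step size, 2-step size) of a creator's follower network.

    `followers` maps an account to the set of accounts that follow it.
    The 2-step neighbourhood includes the 1-step neighbourhood.
    """
    one_step = set(followers.get(creator, ())) - {creator}
    two_step = set(one_step)
    for account in one_step:
        two_step |= set(followers.get(account, ()))
    two_step.discard(creator)
    return len(one_step), len(two_step)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


followers = {"a": {"b", "c"}, "b": {"d", "e"}, "c": {"e", "a"}}
print(neighbourhood_sizes(followers, "a"))
```

Applying `pearson` to the neighbourhood sizes and the number of accounts that saw or re-posted each post, one pair of lists per measure, would give the correlations that the experiment compares.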
22. References
Andrews, C., Fichet, E., Ding, Y., Spiro, E.S., Starbird, K., 2016, ‘Keeping up with the Tweet-dashians: The
impact of ‘official’ accounts on online rumoring’, CSCW’16, February 27 – March 2, 2016, San Francisco, CA,
USA, ACM, pp. 452-465.
Bruns, A., & Burgess, J. (2015). Twitter Hashtags from Ad Hoc to Calculated Publics. In N. Rambukkana
(Ed.), Hashtag Publics: The Power and Politics of Discursive Networks (pp. 13–28). New York: Peter Lang.
Bruns, A., Burgess, J., & Highfield, T. (2014). A “Big Data” Approach to Mapping the Australian
Twittersphere. In P. L. Arthur & K. Bode (Eds.), Advancing Digital Humanities: Research, Methods, Theories
(pp. 113–129). Houndmills: Palgrave Macmillan.
Bruns, A., J. Burgess, J. Banks, D. Tjondronegoro, A. Dreiling, J. Hartley, T. Leaver, A. Aly, T. Highfield, R.
Wilken, E. Rennie, D. Lusher, M. Allen, D. Marshall, K. Demetrious, & T. Sadkowsky. (2015). TrISMA:
Tracking Infrastructure for Social Media Analysis. http://www.trisma.org/.
Romero, D., Meeder, B., Kleinberg, J. (2011). Differences in the Mechanics of Information Diffusion Across
Topics: Idioms, Political Hashtags, and Complex Contagion on Twitter, Proceedings of WWW’11 – the 20th
international conference on World Wide Web, pp. 695-704. Accessed 31/3/13 at
http://www.cs.cornell.edu/home/kleinber/www11-hashtags.pdf
Spiro, E.S., DuBois, C.L., Butts, C.T., ‘Waiting for a Retweet: Modeling Waiting Times in Information
Propagation’, Workshop on Social Network and Social Media Analysis: Methods, Models and Applications.
Neural Information Processing Systems (NIPS), January 2012, Lake Tahoe, NV. Accessed 14/3/15 at
http://snap.stanford.edu/social2012/papers/spiro-dubois-butts.pdf
Editor's Notes
Image credit: Arisu San. Untitled image of Tomás Saraceno’s Galaxies Forming Along Filaments, Like Droplets Along the Strands of a Spider's Web, Venice Biennale, 2009. https://www.flickr.com/photos/sairenso/4002039333. Creative Commons Licence (CC BY-NC-SA 2.0).
Inherited from mathematical chemistry: Wiener Index (Wiener 1947)