Political astroturfing and organised trolling are malicious online behaviours with significant real-world effects. Common approaches to examining these phenomena focus on broad campaigns rather than the small groups responsible for them. To reveal networks of cooperating accounts, we propose a novel temporal window approach that relies on account interactions and metadata alone. It detects groups of accounts engaging in behaviours that, in concert, execute different goal-based strategies, which we describe. We validate our approach against two relevant datasets with ground-truth data. See https://github.com/weberdc/find_hccs for code and data.
Presented at ASONAM'20 (2020 IEEE/ACM International Conference on Advances in Social Network Analysis and Mining).
Co-authored with Frank Neumann (University of Adelaide)
Who’s in the Gang? Revealing Coordinating Communities in Social Media
1. OFFICIALASONAM 7-10 Dec 2020
Derek Weber1,2 & Frank Neumann1
Contact: derek.weber@adelaide.edu.au
1 School of Computer Science, University of Adelaide, Australia.
2 Defence Science and Technology Group, Department of Defence, Australia.
WHO’S IN THE GANG?
REVEALING COORDINATING COMMUNITIES IN SOCIAL MEDIA
https://twitter.com/conspirator0/status/1328479128908132358 17 Nov 2020
Context
• Social media for political communication
• Targeted marketing → (Political) Spam & recruitment
• Anonymity → Trolls
• Automation → Bots, social bots & political bots
Targeted marketing + Anonymity + Automation = Interference
Information Campaigns & Coordination Strategies
[Figure: an information campaign proceeds from Intent → Strategy → Planning → Execution. Three coordination strategies are illustrated as account-action timelines (t0–t3):
• Pollution – junk posts mixed in with good posts in a channel, e.g., #OurPartyRocks (Woolley, 2016; Fisher, 2018; Nasim et al., 2018)
• Boost – reposting the same posts to amplify them (Cao et al., 2015; Vo et al., 2017; Graham et al., 2020)
• Bully – hostile and friendly interactions directed at targets (Hine et al., 2017; Kumar et al., 2018)]
The Challenge
• Discovery
• RQ1 How can highly coordinating communities (HCCs) be found?
• Validation
• RQ2 How do the discovered communities differ?
• RQ3 How consistent is the HCC messaging?
• RQ4 Are the HCCs internally or externally focused?
To identify groups of accounts whose behaviour,
though typical in nature, is anomalous in degree.
Extract HCCs
Focal Structures Analysis – Variant (FSA_V); cf. Şen et al., 2016
https://github.com/weberdc/find_hccs
Evaluation
• Window size, γ = {15, 60, 360, 1440} minutes
• Community extraction:
• FSA_V, θ = 0.3
• k-Nearest Neighbour (kNN), k = ln(|V|) (cf. Cao et al., 2015)
• Threshold
• Coordination strategy
• Boost (co-retweet)
• Pollute (co-hashtag)
• Bully (co-mention)
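As a hedged sketch of the kNN extraction option (the adjacency structure here is illustrative, not the released implementation), each account in the latent coordination network keeps only its k = ln(|V|) heaviest-weighted neighbours:

```python
import math

def knn_prune(adjacency, k=None):
    """Retain each node's k heaviest-weighted neighbours.

    `adjacency` maps node -> {neighbour: weight} (illustrative structure).
    k defaults to ln(|V|), rounded, cf. Cao et al. (2015).
    """
    if k is None:
        k = max(1, round(math.log(len(adjacency))))
    pruned = {}
    for node, nbrs in adjacency.items():
        # Sort neighbours by descending weight and keep the top k.
        top = sorted(nbrs.items(), key=lambda kv: -kv[1])[:k]
        pruned[node] = dict(top)
    return pruned

adjacency = {
    'a': {'b': 5.0, 'c': 1.0, 'd': 0.2},
    'b': {'a': 5.0, 'c': 0.1},
    'c': {'a': 1.0, 'b': 0.1},
    'd': {'a': 0.2},
}
print(knn_prune(adjacency))  # |V| = 4, so k = round(ln 4) = 1
```

With four nodes, k rounds to 1, so each account keeps only its single strongest co-action partner.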
Data
DS1 – Australian regional election, 2018
• Including ground truth (GT, cf. Keller et al., 2017)
DS2 – Twitter’s election integrity dataset1
• Internet Research Agency, 2016 tweets
1 https://about.twitter.com/en_us/values/elections-integrity.html
        Tweets (T)   Retweets (RT)    Accts (A)   Days   T/A/Day   RT/A/Day
DS1     115.9k       64.2k (54.5%)    20.6k       18     0.31      0.17
 - GT   4.2k         2.5k (59.7%)     134         18     1.74      1.04
DS2     1.57m        729.9k (56.6%)   1.4k        365    3.12      1.45
Ethics: University of Adelaide HREC H-2018-045
https://github.com/weberdc/find_hccs
Finding HCCs
• Coordination strategies: HCCs found under all three
• Many components (HCCs), incl. a very large one
• kNN – a single HCC with internal structure
[Figure: HCC networks found in DS1 and DS2 by FSA_V, kNN and Threshold extraction, alongside the GT networks]
Networks: Gephi https://gephi.org
Hashtags
[Figure: hashtags used by HCCs found via retweeting the same tweet vs. retweeting the same account, for GT, DS1 and DS2]
Consistency
Hypothesis
• Dissemination groups should have highly similar content
• i.e., internal similarity ≥ external similarity
Approach
• For each group:
• For each member:
• Combine the member's tweets into a corpus
• Compare 5-char n-grams of the corpus against all other accounts
• Plot similarities as a matrix, cf. a heatmap
[Figure: pairwise account similarity heatmaps for GT, DS1, DS2 and a RANDOM baseline]
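The per-account comparison described above can be sketched as follows; the slide specifies 5-char n-grams, but the similarity measure (Jaccard here) and the single-string corpora are my assumptions for illustration:

```python
def char_ngrams(text, n=5):
    """Set of character n-grams from an account's combined tweet corpus."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def jaccard(a, b):
    """Jaccard similarity between two n-gram sets."""
    return len(a & b) / len(a | b) if (a or b) else 1.0

# Hypothetical corpora: one combined tweet string per account.
corpora = {
    'u1': 'my wife just told me she voted for joe biden',
    'u2': 'my wife just told me that she voted for joe biden',
    'u3': 'completely unrelated content about sports results',
}
grams = {u: char_ngrams(c) for u, c in corpora.items()}
# Pairwise similarity matrix, plottable as a heatmap.
matrix = {(u, v): jaccard(grams[u], grams[v]) for u in grams for v in grams}
print(round(matrix[('u1', 'u2')], 2), round(matrix[('u1', 'u3')], 2))
```

Near-identical copypasta yields high similarity; unrelated accounts sit near zero, which is what the heatmaps make visible at group scale.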
Literature
Campaign detection
• Content (Lee et al., 2013)
• URL sharing (Cao et al., 2015)
• Temporal signatures (Hine et al., 2017)
• Cross-platform linking (Starbird & Wilson, 2020)
Social bots
• Agenda-oriented automated accounts pretending to be human (Ferrara et al., 2016)
• Hard to identify (Cresci et al., 2017; Nasim et al., 2018; Grimme et al., 2018)
Coordination as “orchestrated activities”
• Focus on detecting strategies (Fisher, 2018; Grimme et al., 2018; Starbird et al., 2019;
Weber, 2019)
• Co-retweet (Weber, 2019; Graham et al., 2020)
• Co-hashtag (Woolley, 2016; Fisher, 2018)
• Co-URL (Cao et al., 2015; Giglietto et al., 2020)
• Brooking, E. T., and Singer, P. W. (2016). War Goes Viral: How social media is being weaponized across the world. The Atlantic. Retrieved
from https://www.theatlantic.com/magazine/archive/2016/11/war-goes-viral/501125/
• Cao, C., Caverlee, J., Lee, K., Ge, H. and Chung, J. 2015. Organic or Organized? Exploring URL Sharing Behavior. CIKM’15, 513–522.
• Cresci, S., Pietro, R. D., Petrocchi, M., Spognardi, A. and Tesconi, M. 2017. The Paradigm-Shift of Social Spambots. WWW’17 (Companion
Volume), 963-972.
• Ferrara, E., Varol, O., Davis, C., Menczer, F. and Flammini, A. 2016. The rise of social bots. Communications of the ACM. 59(7) (Jun. 2016),
96–104. DOI:10.1145/2818717.
• Fisher, A. 2018. Netwar in Cyberia: Decoding the Media Mujahidin. USC Centre on Public Diplomacy, Figueroa Press.
• Giglietto, F., Righetti, N., Rossi, L. and Marino, G. 2020. Coordinated Link Sharing Behavior as a Signal to Surface Sources of Problematic
Information on Facebook. SMSociety, 85-91.
• Grimme, C., Assenmacher, D. and Adam, L. 2018. Changing Perspectives: Is It Sufficient to Detect Social Bots? HCI (13) 2018, 445–461.
• Graham, T., Bruns, A., Zhu, G., and Campbell, R. 2020. Like a virus: The coordinated spread of coronavirus disinformation. Centre for
Responsible Technology, The Australia Institute.
• Hine, G. E., Onaolapo, J., Cristofaro, E. D., Kourtellis, N., Leontiadis, I., Samaras, R., Stringhini, G. and Blackburn, J. 2017. Kek, Cucks, and
God Emperor Trump: A Measurement Study of 4chan’s Politically Incorrect Forum and Its Effects on the Web. ICWSM’17, 92–101.
• Keller, F.B., Schoch, D., Stier, S. and Yang, J.H. 2017. How to Manipulate Social Media: Analyzing Political Astroturfing Using Ground Truth Data from South Korea. ICWSM’17, 564–567.
• Kumar, S., Hamilton, W.L., Leskovec, J. and Jurafsky, D. 2018. Community Interaction and Conflict on the Web. Proceedings of the 2018 World Wide Web Conference, WWW’18, 933–943.
References (1)
References (2)
• Lee, K., Caverlee, J., Cheng, Z. and Sui, D. Z. 2013. Campaign extraction from social media. ACM Transactions on Intelligent Systems and
Technology. 5(1), 9:1–9:28. DOI:10.1145/2542182.2542191.
• Lim, K. H., Jayasekara, S., Karunasekera, S., Harwood, A., Falzon, L., Dunn, J. and Burgess, G. 2019. RAPID: Real-time Analytics Platform for Interactive Data Mining. ECML/PKDD (3) 2018, 649–653.
• Nasim, M., Nguyen, A., Lothian, N., Cope, R. and Mitchell, L. 2018. Real-time Detection of Content Polluters in Partially Observable Twitter
Networks. WWW’18 (Companion Volume), 1331-1339.
• Pacheco, D., Hui, P.-M., Torres-Lugo, C., Truong, B. T., Flammini, A. and Menczer, F. 2020. Uncovering Coordinated Networks on Social Media. ICWSM’21, to appear.
• Rizoiu, M.-A., Graham, T., Zhang, R., Zhang, Y., Ackland, R. and Xie, L. 2018. #DebateNight: The Role and Influence of Socialbots on
Twitter During the 1st 2016 U.S. Presidential Debate. ICWSM’18, 300–309.
• Saulwick, A., & Trentelman, K. (2014). Towards a formal semantics of social influence. Knowledge-Based Systems, 71, 52–60.
DOI:10.1016/j.knosys.2014.06.022
• Şen, F., Wigand, R., Agarwal, N., Tokdemir, S., and Kasprzyk, R. 2016. Focal structures analysis: identifying influential sets of individuals in
a social network. Social Network Analysis and Mining, 6(1). DOI:10.1007/s13278-016-0319-z
• Starbird, K. and Wilson, T. 2020. Cross-Platform Disinformation Campaigns: Lessons Learned and Next Steps. Harvard Kennedy School
Misinformation Review. (Jan. 2020). DOI:10.37016/mr-2020-002.
• Starbird, K., Arif, A. and Wilson, T. 2019. Disinformation as Collaborative Work: Surfacing the Participatory Nature of Strategic Information Operations. Proc. ACM on Human-Computer Interaction. 3 (CSCW), 127:1–127:26. DOI:10.1145/3359229.
• Vo, N., Lee, K., Cao, C., Tran, T. and Choi, H. 2017. Revealing and detecting malicious retweeter groups. ASONAM’17, 363-368.
• Weber, D. 2019. On Coordinated Online Behaviour. Poster presented at ASNAC’19, Adelaide, Australia.
Editor's Notes
In mid-November there was a sudden spate of almost identical tweets posted, starting with “My wife just told me that she voted for Joe Biden. As we speak we are getting divorced and I’m leaving for” somewhere, “Papers signed everything is done.. Absolutely disgusted with the 2020 elections, what a disgrace”.
This collection was posted by a data scientist with an interest in propaganda on social media.
As you can see, the language is almost, but not quite, identical, and based on other analyses, these are not bots. An earlier instance of this “copypasta” related to the sale of a football club, and a huge number of accounts (https://twitter.com/conspirator0/status/1299127612804075523) posted pretty much the same message, which, in the scheme of things, is relatively harmless. This example, however, could have effects such as reinforcing the idea of rejecting election results, which hurts democratic systems.
This kind of activity could be regarded as an information campaign, especially if it was seeded or supported by a foreign adversary, so it’s important to be able to identify such campaigns. As these accounts aren’t likely to be bots, bot detection systems won’t be as much help, unless they’re used for retweeting.
What is, however, of particular interest, is finding the groups of accounts who are working together to do this, and to see how they’re coordinating their activities.
Cf. “Swarmcast” p.10 from Fisher, A. 2018. “Netwar in Cyberia: Decoding the Media Mujahidin”, USC Center on Public Diplomacy, Paper 5.
Online marketing used to be just spam; now it’s political advertising.
Anonymity is great for giving the disenfranchised a voice, but it also enables trolling, and I suspect Twitter’s latest ‘Fleets’ feature will make this worse.
Automation is great for news aggregators and sports announcement bots, but it allows accounts to post vast amounts of polarising, biased information, mis- and disinformation.
It’s now much easier for nation states to interfere in each other’s online discussion, particularly political discussion.
When I refer to coordination, there’s a spectrum.
Fundamentally, as described by Malone and Crowston in the 90s, it’s the alignment of dependencies between tasks and the resources they use.
At higher levels are Starbird et al’s descriptions of information campaigns being orchestrated (run from on high), cultivated (e.g., infiltrating existing issue-motivated groups), or emergent (like in conspiracy communities).
In between is the space where specific communication actions occur. In our case, we’re interested in social media communications, which have many parallels across platforms, so methods for detecting reposting may apply to Twitter retweets, Facebook shares, or Tumblr reposts.
Intent: convince a population to lose weight
Strategy: ad blitz + word of mouth + tax incentives
(Planning): talk to an ad company, design/disseminate flyers, craft legislation
Execution: TV ads, social media ads, flyers in fast food shops, pass laws & enforce
These are some coordination strategies observed in the literature, …
This is not an exhaustive list, of course, and coordination of activity may take many guises, but we’re focusing on co-actions, where a pair of accounts do the same thing to achieve their goal.
To find the highly coordinating communities, or HCCs, we first
[CLICK] Extract the abstracted common interaction behaviours from social media posts of a variety of platforms;
[CLICK] Then create a multi-digraph of interactions between users, hashtags and URLs.
[CLICK] This interaction graph is mined for evidence of coordination based on a search criterion, which is general and could be quite domain-specific. Examples of these include amplify-by-repost which is deliberate dissemination through content sharing, channel pollution through posting to particular hashtags or other communities, and coordinated attacks on a single user or community.
[CLICK] A latent coordination network is constructed from this evidence, being a weighted undirected network of users,
[CLICK] And this is then mined for the most highly coordinating communities.
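The latent coordination network built in these steps can be sketched for the co-retweet (Boost) case. The post fields and function name here are illustrative assumptions, not the released implementation:

```python
from collections import defaultdict
from itertools import combinations

def latent_coordination_network(posts):
    """Build a weighted, undirected user-user network from co-retweets.

    `posts` is a list of dicts with illustrative fields:
    {'user': str, 'retweet_of': tweet id or None}.
    Each time two accounts retweet the same tweet, the edge between
    them gains one unit of coordination evidence.
    """
    # Group retweeting users by the tweet they amplified.
    retweeters = defaultdict(set)
    for p in posts:
        if p.get('retweet_of'):
            retweeters[p['retweet_of']].add(p['user'])

    # Every pair of co-retweeters contributes to an undirected edge weight.
    edges = defaultdict(int)
    for users in retweeters.values():
        for u, v in combinations(sorted(users), 2):
            edges[(u, v)] += 1
    return dict(edges)

posts = [
    {'user': 'a', 'retweet_of': 't1'},
    {'user': 'b', 'retweet_of': 't1'},
    {'user': 'c', 'retweet_of': 't1'},
    {'user': 'a', 'retweet_of': 't2'},
    {'user': 'b', 'retweet_of': 't2'},
]
print(latent_coordination_network(posts))
# {('a', 'b'): 2, ('a', 'c'): 1, ('b', 'c'): 1}
```

Swapping the grouping key (hashtag, mentioned account, URL) yields the Pollute, Bully and co-URL variants without changing the pairing logic.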
That covers how to search for HCCs in social media data, but we need to consider its temporal aspect.
[CLICK] (shrink) [CLICK] (next slide)
Given a timeline of social media posts, we segment them into windows of gamma minutes.
We can vary the window depending on the nature of coordination sought.
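A minimal sketch of the windowing step (timestamps in seconds and payloads are illustrative; this is not the paper's code): assignment to a gamma-minute window is just integer division.

```python
from collections import defaultdict

def segment_into_windows(posts, gamma_minutes):
    """Bucket (timestamp_seconds, payload) pairs into gamma-minute windows."""
    window_seconds = gamma_minutes * 60
    windows = defaultdict(list)
    for ts, payload in posts:
        # Window index = how many full windows have elapsed since t=0.
        windows[ts // window_seconds].append(payload)
    return dict(windows)

posts = [(0, 'p1'), (600, 'p2'), (1000, 'p3'), (90000, 'p4')]
print(segment_into_windows(posts, 15))
# {0: ['p1', 'p2'], 1: ['p3'], 100: ['p4']}
```

Each window's posts are then mined independently, and per-window evidence is accumulated into the latent coordination network.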
Lots of variables, so I’ll mostly focus on 15m windows, FSA_V and Boost via co-retweet.
Threshold = retain heaviest normalised edges above 0.1
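The thresholding note above can be sketched as follows; normalising by the maximum edge weight is my assumption of what "normalised" means here, not a detail stated on the slide:

```python
def threshold_edges(edges, cutoff=0.1):
    """Keep edges whose max-normalised weight exceeds `cutoff`.

    `edges` maps (u, v) -> raw co-action count (illustrative structure).
    """
    if not edges:
        return {}
    max_w = max(edges.values())
    # Retain only the heaviest edges relative to the strongest pair.
    return {e: w / max_w for e, w in edges.items() if w / max_w > cutoff}

edges = {('a', 'b'): 20, ('a', 'c'): 2, ('b', 'c'): 1}
print(threshold_edges(edges))  # {('a', 'b'): 1.0}
```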
Clearly different groups are found.
The kNN results consist of a single large component, but I’ve used Gephi’s ForceAtlas and then Fruchterman–Reingold layouts to reveal internal structures, and then applied the Louvain method to identify the clusters by colour. I’ve done this for visualisation purposes, but have not analysed the networks any more deeply yet.
[CLICK] The final HCCs are the ones I discovered in the ground truth – each component consists of accounts from a different political party.
How similar is the membership between the HCCs discovered in different window sizes? Quite a lot of variation.
Different HCCs have clearly different content – looking at their hashtags, we can get a feel for what their interests are.
Internal vs External focus
Internal retweet ratio – how often do HCC members retweet one another?
Internal mention ratio – how often do they mention one another?
Remember that the members of HCCs need not be directly connected – all their connections may be inferred.
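A minimal sketch of the internal retweet ratio (the pair representation is illustrative; the internal mention ratio is analogous, with mentions in place of retweets):

```python
def internal_retweet_ratio(members, retweets):
    """Fraction of a group's retweets that amplify fellow group members.

    `retweets` is a list of (retweeter, original_author) pairs.
    """
    members = set(members)
    # Retweets posted by group members.
    by_group = [(u, v) for u, v in retweets if u in members]
    if not by_group:
        return 0.0
    # Of those, the ones amplifying another member.
    internal = [pair for pair in by_group if pair[1] in members]
    return len(internal) / len(by_group)

retweets = [('a', 'b'), ('a', 'x'), ('b', 'a'), ('b', 'y')]
print(internal_retweet_ratio({'a', 'b'}, retweets))  # 0.5
```

A ratio near 1 suggests an inward-focused echo chamber; near 0, a group amplifying outside content.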
Looked at Twitter data over the recent 2020 US Democratic and Republican Conventions back in August.
Using co-retweet and a 10 second window, I identified a number of communities, all of which Botometer tells us are highly bot-like, and many of which present themselves as normal people, but with greatly inflated tweeting rates.
Named accounts are organisational or have been deleted.
Co-hashtag with 10 second window to find the HCCs, then added the hashtags they use back in, to see which HCCs are associated.
Structure tells us something about behaviour:
Clusters around a few hashtags say that many groups are, in fact, one
Isolated stars tell us the accounts are pushing an agenda (the content of the hashtags) and no one else here is interested
Fans are pushing an agenda but have connected with other communities here via the linking hashtags
This tells me the community extraction method that I used (FSA_V) could do with a bit of tweaking and perhaps communities could be stitched back together.
Future:
Statistical measures
Evolution of HCCs
Simulation of coordination strategies
Campaign detection spawned out of spam detection but has relied on a number of features over the years.
Automation detection is particularly important for social bots, accounts that masquerade as real humans but are, in fact, automated.
It’s so hard to tell bots and humans apart that the real question is more about how they work together to achieve goals – how do they orchestrate their activities? For example, by disseminating content by retweeting the same tweets, or URLs, or using the same hashtags.