IC2S2, Amsterdam, July 2019
Felix Victor Münch, Ben Thies, Cornelius Puschmann, Axel Bruns
We release the code and present the results of an experiment with an adapted version of the rank degree network sampling method, which we modified to sample the most influential accounts in the German-speaking Twitter follow network.
Mining Influencers in the German Twittersphere – Mapping a Language-Based Follow Network
1. Mining Influencers in the German Twittersphere: Mapping a Language-Based Follow Network
Felix Victor Münch¹, Ben Thies¹, Cornelius Puschmann¹, Axel Bruns²
¹ Leibniz Institute for Media Research | Hans-Bredow-Institut (HBI), Germany
² Digital Media Research Centre (DMRC), Queensland University of Technology (QUT), Australia
IC2S2, July 2019, Amsterdam
2. Problem
‘I can’t get no population …’
● follow/friend networks of Online Social Networks (OSNs) arguably main predictor for content exposure (despite sponsored content and algorithmic sorting of timelines)
● APIs too restrictive for collection of comprehensive follow networks
3. Opportunities
Possible Research Questions
● “Are there filter bubbles/echo chambers?”
● “Are there social and/or topical communities/issue publics?”
● “Who can spread which content most efficiently?”
● “Can we predict the spread of (fake) news?”
● “Who let the bots out? And where?”
5. The Australian Twittersphere
Bruns, A., Moon, B., Münch, F. V., & Sadkowsky, T. (2017). The Australian Twittersphere in 2016: mapping the follower/followee network. Social Media + Society, 3(4). https://doi.org/10.1177/2056305117748162
Münch, F. V. (2019). Measuring the Networked Public – Exploring Network Science Methods for Large Scale Online Media Studies. Queensland University of Technology. https://doi.org/10.5204/thesis.eprints.125543
7. Our adaptation of the ‘rank degree’ method
Based on: Salamanos, N., Voudigari, E., & Yannakoudakis, E. J. (2017). Deterministic graph exploration for efficient graph sampling. Social Network Analysis and Mining, 7(1), 24. https://doi.org/10.1007/s13278-017-0441-6
Bottom: Original graph without walked edges. Starting nodes (seeds) are drawn randomly (1) and walkers move to their friend with the highest in-degree (2-6). Walked edges get removed/‘burned’.
Top: Current sample at each step. Walked (and symmetric) edges are added to the sample.
8. Our adaptation of the ‘rank degree’ method
Main adaptations (amongst others) mostly due to API restrictions:
● Undirected → Directed
● Fewer walkers (200)
● Walkers do not collapse when ending up on the same node
● Last 5000 friends only
● ‘Degree’ stays constant and is equal to follower count reported by Twitter API
● Only collect edges to accounts with German as their interface language (no longer possible due to API changes ¯_(ツ)_/¯ … we are working on a solution.)
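The walk and the adaptations listed above can be sketched roughly as follows. This is a minimal illustration on an in-memory toy graph, not the released code: the function name `walk`, the dictionaries `friends`, `follower_count` and `language`, and the restart-at-a-random-seed rule for dead ends are all assumptions made for the example.

```python
import random

MAX_FRIENDS = 5000  # only the most recent 5000 friends of an account are considered


def walk(friends, follower_count, language, seeds, steps, lang="de", rng=None):
    """Run one walker per seed and return the set of sampled directed edges.

    friends:        dict account -> list of accounts it follows, newest first
    follower_count: dict account -> follower count (kept constant during the walk)
    language:       dict account -> interface language code
    All three are hypothetical stand-ins for data fetched from an API.
    """
    rng = rng or random.Random(0)
    burned = set()           # walked edges are 'burned' and never walked again
    sample = set()           # edges collected so far
    positions = list(seeds)  # walkers stay separate even if they meet on a node
    for _ in range(steps):
        for i, node in enumerate(positions):
            candidates = [
                f for f in friends.get(node, [])[:MAX_FRIENDS]
                if language.get(f) == lang and (node, f) not in burned
            ]
            if not candidates:
                # Assumption: a stuck walker restarts from a random seed.
                positions[i] = rng.choice(seeds)
                continue
            # Move to the friend with the highest (constant) follower count.
            target = max(candidates, key=lambda f: follower_count.get(f, 0))
            burned.add((node, target))
            sample.add((node, target))
            # Add the symmetric edge if the target follows the walker's node back.
            if node in friends.get(target, [])[:MAX_FRIENDS]:
                sample.add((target, node))
            positions[i] = target
    return sample
```

With 200 seeds, as on the slide, `seeds` would simply hold 200 randomly drawn accounts; in practice each lookup in `friends` would be a rate-limited API request.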
12. Coverage
Distribution of public accounts with > 1 friend in the test sample over the percentage of their friends that can be found in the influencer sample (left, filtered for in-degree >= 1, leaving 199,180 accounts) / baseline sample (right, same size, randomly drawn from German accounts in global dataset)
13. Coverage
Ranked distribution of the percentage of ‘friends’ of accounts in a random sample (n=1000, filtered for having at least 2 ‘friends’) found in our influencer sample (excl. nodes with in-degree 0) and a random sample of the same size (181k accounts)
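The coverage measure described on these slides can be computed along the following lines. This is a small sketch under assumptions: `friends_of` (test account to set of friends) and `sample` (set of account IDs in the influencer or baseline sample) are hypothetical names for data that would already have been collected.

```python
def coverage(friends_of, sample, min_friends=2):
    """Percentage of each test account's friends that are found in `sample`.

    Accounts with fewer than `min_friends` friends are skipped, mirroring the
    'at least 2 friends' filter on the slides.
    """
    result = {}
    for account, friends in friends_of.items():
        if len(friends) < min_friends:
            continue
        found = sum(1 for f in friends if f in sample)
        result[account] = 100.0 * found / len(friends)
    return result


# Toy usage: rank accounts by coverage, highest first.
toy = coverage({"u1": {"a", "b", "c", "d"}, "u2": {"a"}, "u3": {"x", "y"}}, {"a", "b"})
ranked = sorted(toy.values(), reverse=True)
```

Running the same function once with the influencer sample and once with the baseline sample gives the two distributions that the figures compare.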
14. Reach
Ranked distribution of the percentage of accounts reached in a random sample (n=1000, filtered for having at least 2 ‘friends’) by accounts in our influencer sample (excl. nodes with in-degree 0) and by accounts in a random sample of the same size (181k accounts)
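One way to read the reach measure is sketched below. It assumes that a test account counts as 'reached' by a sample account when it follows that account; this interpretation, like the names `friends_of` and `sample`, is an assumption for illustration and is not confirmed by the slides.

```python
from collections import Counter


def reach(friends_of, sample, min_friends=2):
    """For each account in `sample`, the percentage of test accounts following it.

    friends_of: dict test account -> set of accounts it follows
    sample:     set of account IDs (influencer or baseline sample)
    """
    eligible = [f for f in friends_of.values() if len(f) >= min_friends]
    if not eligible:
        return {}
    followers_in_test = Counter()
    for friends in eligible:
        for f in set(friends):
            if f in sample:
                followers_in_test[f] += 1
    return {s: 100.0 * followers_in_test[s] / len(eligible) for s in sample}


# Toy usage: "u3" has only one friend and is filtered out.
toy_reach = reach({"u1": {"a", "b"}, "u2": {"a", "c"}, "u3": {"a"}}, {"a", "b", "z"})
```

Sorting the returned percentages in descending order yields a ranked distribution of the kind shown in the figure.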
16. Community detection with Infomap
3-core of our sample network; coloured by communities detected with the Infomap community detection algorithm; node size represents PageRank
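For readers unfamiliar with k-cores: the 3-core is what remains after repeatedly removing nodes with fewer than three connections. The sketch below shows the pruning in plain Python; it assumes that edge direction is ignored and that each distinct neighbour counts once, which may differ from how the figure was actually produced. Community detection and PageRank are not covered here.

```python
from collections import defaultdict


def k_core(edges, k=3):
    """Return the nodes of the k-core of a graph given as (source, target) pairs.

    Assumption: direction is ignored and each distinct neighbour counts once.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbours[u].add(v)
            neighbours[v].add(u)
    queue = [n for n, nb in neighbours.items() if len(nb) < k]
    removed = set()
    while queue:
        n = queue.pop()
        if n in removed:
            continue
        removed.add(n)
        # Removing n lowers the degree of its remaining neighbours.
        for m in neighbours[n]:
            if m not in removed:
                neighbours[m].discard(n)
                if len(neighbours[m]) < k:
                    queue.append(m)
    return set(neighbours) - removed


# Toy usage: a 4-clique with one pendant node "e" attached to "a".
clique_plus_pendant = [("a", "b"), ("a", "c"), ("a", "d"),
                       ("b", "c"), ("b", "d"), ("c", "d"), ("e", "a")]
core_nodes = k_core(clique_plus_pendant, k=3)
```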
18. Tagged Community graph
Community graph of communities in the 3-core of our sample with over 300 accounts, at least 80 active accounts during the examined time frame, and edges with a weight of at least 150; edge width represents weight; edge direction follows clockwise curvature; edges coloured by source node; node size represents the number of accounts in each community
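Collapsing an account-level network into a community graph like this one can be sketched as follows. The slide does not define the edge weight, so the example assumes it is the number of follow edges from one community to another; the activity filter (at least 80 active accounts) is left out because it needs tweet data. The names `community_graph`, `community_of`, `size_above` and `min_weight` are hypothetical.

```python
from collections import Counter


def community_graph(edges, community_of, size_above=300, min_weight=150):
    """Aggregate account-level edges into weighted community-level edges.

    edges:        iterable of (source account, target account)
    community_of: dict account -> community label
    Keeps communities with more than `size_above` accounts and
    inter-community edges with a weight of at least `min_weight`.
    """
    sizes = Counter(community_of.values())
    weights = Counter()
    for u, v in edges:
        cu, cv = community_of.get(u), community_of.get(v)
        if cu is None or cv is None or cu == cv:
            continue  # skip unassigned accounts and edges within a community
        if sizes[cu] > size_above and sizes[cv] > size_above:
            weights[(cu, cv)] += 1
    return {edge: w for edge, w in weights.items() if w >= min_weight}


# Toy usage with small thresholds so the effect is visible.
toy_membership = {"a": 1, "b": 1, "c": 2, "d": 2, "e": 3}
toy_edges = [("a", "c"), ("b", "d"), ("c", "a"), ("a", "b"), ("a", "e")]
toy_graph = community_graph(toy_edges, toy_membership, size_above=1, min_weight=2)
```

In the toy data, community 3 is too small and the edge from community 2 to community 1 is too light, so only the edge from community 1 to community 2 survives.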
19. Outlook
● Adaptation to API changes
● Bootstrapping the seed pool
● Other language-based spheres
● Topical mining
● Other community detection methods
● Bot detection
22. Interesting 🤔
Münch, F. V. (2019). Measuring the Networked Public – Exploring Network Science Methods for Large Scale Online Media Studies. Queensland University of Technology. https://doi.org/10.5204/thesis.eprints.125543
23. Coverage
Count, mean, standard deviation, minimum, quartiles, and maximum of the number of friends and the percentages of friends in the influencer and baseline sample for public accounts in the test sample with at least 2 friends.
n = 597   number of friends   percent of friends in influencer sample   percent of friends in baseline sample
mean      57                  40                                        0.54
std       160                 30                                        2.7
min       2                   0                                         0
25%       7                   11                                        0
50%       18                  40                                        0
75%       42                  65                                        0
max       1988                100                                       50
24. Twitter is not representative of the general population, but of itself
https://www.pewinternet.org/2019/04/24/sizing-up-twitter-users/
25. Dominance of an active and influential elite
https://www.pewinternet.org/2019/04/24/sizing-up-twitter-users/