An approach to use social media data for the characterization of a company’s
follower base and the identification of influential players within the network
1. by Moritz Osterberger
Instagram network analytics
3. Which data?
Instagram data about followers of the online
home & living shopping club Westwing and
those followers’ followings:
• General information: username, full name,
bio, user ID, usernames of followings, and
the numbers of followers, followings, and posts
• Information about a user's posts:
engagement (number of comments and likes)
and upload frequency (posts per day)
31,000 users
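The two post-level measures named on this slide could be computed along the following lines. This is only a minimal sketch: the slides do not state the exact formulas, so averaging likes plus comments per post and counting the days between first and last post inclusively are both assumptions, as are the function names.

```python
from datetime import date

def engagement(likes, comments):
    """Assumed definition: average number of likes plus comments per post."""
    if not likes:
        return 0.0
    return (sum(likes) + sum(comments)) / len(likes)

def posts_per_day(post_dates):
    """Assumed definition: posts divided by the days from first to last post (inclusive)."""
    if not post_dates:
        return 0.0
    span_days = (max(post_dates) - min(post_dates)).days + 1
    return len(post_dates) / span_days

# Made-up example values, not data from the study.
example_dates = [date(2018, 1, 1), date(2018, 1, 3), date(2018, 1, 10)]
example_likes = [100, 50, 30]
example_comments = [10, 5, 3]

print(posts_per_day(example_dates))
print(engagement(example_likes, example_comments))
```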
4. How collected?
Four different scrapers written in Python with the
libraries Selenium, Requests, and Beautiful Soup:
1. Usernames of Westwing's followers
2. Usernames of those followers' followings
3. General user information
4. Information about posts
Total (successful) runtime: 138.5 hours
5. Data exploration
Part 1: Characteristics of followers based on their user
page information
Part 2: Creation of a graph with different communities
based on the users’ followings
6. Where are the followers from?
• Information extracted from the users' bios
• Top cities:

City      Frequency   Population
Munich    8.26%       1.46 m
Hamburg   8.08%       1.81 m
Berlin    7.57%       3.57 m
Cologne   3.44%       1.07 m
Vienna    3.37%       1.87 m
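One way such city frequencies might be derived from bios is a simple keyword match. The sketch below is illustrative only: the alias table, the function names, and the choice of "share of all bios" as the denominator are assumptions, since the slides do not describe the matching procedure.

```python
import re
from collections import Counter

# Hypothetical alias table: lowercase spelling -> canonical city name.
CITY_ALIASES = {
    "munich": "Munich", "münchen": "Munich",
    "hamburg": "Hamburg",
    "berlin": "Berlin",
    "cologne": "Cologne", "köln": "Cologne",
    "vienna": "Vienna", "wien": "Vienna",
}

def cities_in_bio(bio):
    """Return the set of known cities mentioned as whole words in a bio."""
    tokens = set(re.findall(r"\w+", bio.lower()))
    return {CITY_ALIASES[t] for t in tokens if t in CITY_ALIASES}

def city_shares(bios):
    """Share of bios that mention each city (assumed denominator: all bios)."""
    counts = Counter()
    for bio in bios:
        counts.update(cities_in_bio(bio))
    return {city: n / len(bios) for city, n in counts.items()}

# Made-up bios for illustration.
example_bios = ["Mama | München", "Interior lover from Hamburg",
                "travel & food", "Berlin / Wien"]
print(city_shares(example_bios))
```

Matching on whole tokens rather than substrings avoids counting a city name that merely appears inside a longer word.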
7. How is the gender distributed?
Information derived from the user's full name by
identifying the first name and then assigning it
to a gender

Female   94.59%
Male      5.41%

Figure: word cloud of first names in user profiles
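The name-based gender assignment can be sketched as a dictionary lookup on the first token of the full name. The lookup table here is a tiny hypothetical stand-in; the slides do not say which name list or tool was used, and names that cannot be matched are assumed to be left out of the percentages.

```python
# Hypothetical lookup table; a real analysis would need a much larger name list.
NAME_GENDER = {
    "anna": "female", "julia": "female", "lisa": "female",
    "max": "male", "paul": "male",
}

def first_name(full_name):
    """Take the first whitespace-separated token as the first name."""
    parts = full_name.strip().split()
    return parts[0].lower() if parts else None

def gender_of(full_name):
    """Map a full name to 'female', 'male', or 'unknown'."""
    return NAME_GENDER.get(first_name(full_name), "unknown")

def gender_shares(full_names):
    """Shares among names that could be classified (assumption: unknowns are dropped)."""
    known = [g for g in map(gender_of, full_names) if g != "unknown"]
    if not known:
        return {}
    return {g: known.count(g) / len(known) for g in set(known)}

# Made-up names for illustration.
example_names = ["Anna Schmidt", "Lisa M.", "Max Mustermann", "interior_studio", ""]
print(gender_shares(example_names))
```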
8. What are the followers’ interests?
• Information extracted from the users' bios
• Top words:

Word       Frequency
travel     5.37%
mother     3.50%
design     3.08%
fashion    3.07%
food       2.96%
interior   2.95%
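A word ranking of this kind could be produced by counting, for each word, the share of bios that contain it. The sketch below assumes that definition of "frequency" and uses a small hypothetical stop-word list; neither is confirmed by the slides.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration.
STOPWORDS = {"and", "the", "of", "from", "in", "a", "my"}

def word_shares(bios, top_n=3):
    """Return the top_n words with the share of bios containing each word."""
    counts = Counter()
    for bio in bios:
        words = set(re.findall(r"[a-zäöüß]+", bio.lower())) - STOPWORDS
        counts.update(words)  # each bio counts a word at most once
    return [(word, n / len(bios)) for word, n in counts.most_common(top_n)]

# Made-up bios for illustration.
example_word_bios = ["Travel and design", "travel, food, travel", "Mother of two"]
print(word_shares(example_word_bios, top_n=1))
```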
9. What are Graphs?
• The foundation and mathematical
representation of every network
• A graph G is a pair of the set of all nodes V and
the set of all edges E: G = (V, E)
• Graphs are directed or undirected
• Centrality: a measure of a node's importance
• Indegree centrality counts the
direct incoming relationships of an actor
• Communities: subsets of nodes that are "near"
each other in a specific network
Figure: a graph in which Node A and Node B are connected by an edge
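To make the definitions above concrete, a directed graph can be held as a set of nodes plus a list of (source, target) edges, and indegree centrality is then the count of incoming edges per node. This is a minimal sketch with made-up nodes; the optional normalization by the number of other nodes is a common convention and an assumption here, not something stated on the slide.

```python
from collections import Counter

# Directed edges (follower, followed); made-up example graph.
example_edges = [("A", "B"), ("C", "B"), ("A", "C"), ("D", "B")]
example_nodes = {node for edge in example_edges for node in edge}

def indegree_centrality(nodes, edges, normalized=False):
    """Count incoming edges per node; optionally divide by (number of nodes - 1)."""
    indegree = Counter(target for _, target in edges)
    scale = (len(nodes) - 1) if normalized and len(nodes) > 1 else 1
    return {node: indegree[node] / scale for node in nodes}

print(indegree_centrality(example_nodes, example_edges))
print(indegree_centrality(example_nodes, example_edges, normalized=True))
```

In a follower network, an edge from A to B means "A follows B", so a high indegree marks an account that many others in the network follow.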
10. Big Players of Instagram
Influencer & Models
Celebrities
Big Brands
Home and Living
Health, Beauty & small shops
How was the graph created?
• The top 1,000 followings of
Westwing's followers were
selected by their indegree
centrality
• Followers who follow none of
those accounts were excluded
• Six communities were found
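The selection and filtering steps above can be sketched as follows. The function name and data layout are hypothetical, and "excluded" is read here as "follows none of the selected accounts". Community detection itself is not shown, because the slides do not name the algorithm that produced the six communities.

```python
from collections import Counter

def build_subgraph(followings, top_k):
    """followings maps each follower to the set of accounts they follow.

    Returns the top_k most-followed accounts and, for every follower who
    follows at least one of them, the subset of those accounts they follow.
    """
    indegree = Counter(acc for accounts in followings.values() for acc in accounts)
    top_accounts = {acc for acc, _ in indegree.most_common(top_k)}
    restricted = {f: accounts & top_accounts for f, accounts in followings.items()}
    kept = {f: accounts for f, accounts in restricted.items() if accounts}
    return top_accounts, kept

# Made-up example; the study used top_k = 1,000.
example_followings = {
    "u1": {"a", "b"},
    "u2": {"a", "c"},
    "u3": {"d"},
    "u4": {"a", "b"},
}
top_accounts, kept_followers = build_subgraph(example_followings, top_k=2)
print(top_accounts)
print(kept_followers)
```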
11. What are the communities?
Optimal target group
➔ further analysis
12. Analysis on Home & Living
• Followers within this community
fit the company best
• To find the influential
accounts, the analysis was
restricted to these followers'
followings within the network
• Manual categorization into 5
groups
13. Who are the optimal influencers?
• Optimality is defined as the
combination of centrality and
engagement
• Taking the product of both
identifies the most fitting
influencers
• Ideal choice for influencer
marketing
➔ "karlas_view", "solebich" &
"svenja_traumzuhause"
[Figure legend: 300,000 / 100,000 / 50,000]
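The ranking by the product of centrality and engagement can be illustrated with a few lines of Python. The account names and numbers below are invented placeholders, and the units of both measures are assumptions; only the "multiply, then sort" idea comes from the slide.

```python
def rank_by_score(accounts, top_n=3):
    """accounts: iterable of (name, centrality, engagement).

    Scores each account as centrality * engagement and returns the top_n
    (name, score) pairs in descending order of score.
    """
    scored = [(name, centrality * engagement)
              for name, centrality, engagement in accounts]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]

# Placeholder values: (name, indegree within the network, engagement per post).
example_accounts = [
    ("account_a", 400, 10),
    ("account_b", 100, 50),
    ("account_c", 300, 5),
]
print(rank_by_score(example_accounts))
```

Using the product means an account must do reasonably well on both measures: a widely followed account with little engagement, or a highly engaging account that few followers in the network follow, ends up with a low score.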
14. Who are fitting brands?
• Optimality is defined as the
combination of centrality and
engagement
• Taking the product of both
identifies the most fitting
brands
• Addition to the product range
and/or extra promotion
➔ "sostrenegrene", "thejungalow"
& "eulenschnitt"
[Figure legend: 1,000,000 / 500,000 / 250,000]