In this work, we present the Klout Score, an influence scoring system that assigns scores to 750 million users across 9 different social networks on a daily basis. We propose a hierarchical framework for generating an influence score for each user, by incorporating information for the user from multiple networks and communities. Over 3600 features that capture signals of influential interactions are aggregated across multiple dimensions for each user. The features are scalably generated by processing over 45 billion interactions from social networks every day, as well as by incorporating factors that indicate real world influence. Supervised models trained from labeled data determine the weights for features, and the final Klout Score is obtained by hierarchically combining communities and networks. We validate the correctness of the score by showing that users with higher scores are able to spread information more effectively in a network. Finally, we use several comparisons to other ranking systems to show that highly influential and recognizable users across different domains have high Klout scores.
http://arxiv.org/abs/1510.08487
Klout Score: Measuring Influence Across Multiple Social Networks
1. Klout Score: Measuring Influence Across Multiple Social Networks
October 29, 2015
Mining Big Data in Social Networks Workshop
IEEE International Conference on Big Data, Santa Clara
*Adithya Rao, Nemanja Spasojevic, Zhisheng Li, Trevor DSouza
Link to paper: http://arxiv.org/abs/1510.08487
2. What is Klout?
● Klout is a social influence measurement tool.
● Users register on Klout.com and connect their
social network accounts.
● Klout collects authorized/public information
from connected networks.
● Klout derives influence scores and topics for
users from collected data.
● Klout recommends:
○ content to post
○ times to post
● Klout Website (klout.com)
4. Paper Contributions
● Scalable Production System:
○ Full production system
○ 750 million public and registered user profiles
○ 45 billion interactions from 9 different networks
● Feature Generation:
○ How to generate features that signify influence?
○ Over 3600 features generated.
● Hierarchical Scoring:
○ How to combine networks into a single score?
● Validation:
○ Experiments and comparisons that validate effectiveness of the Klout score
6. Problem Statement
Formal Definition:
For each user u in a network G, let G_u be the subset of the network containing the users who may
directly or indirectly interact with u, via a set of reactions R ⊆ A. Then an influence score I(u,T) is a
measure of the degree and quantity of reactions that u can induce in G_u over a specified time period
T.
In simpler words, an influence score may be defined as the ability of a user to drive
actions among other users.
8. Networks and Sources
● Facebook: Mentions, Likes, Comments, Subscribers, Wall Posts, Friends
● Twitter: Retweets, Mentions, List Memberships, Followers, Replies
● LinkedIn: Title, Education, Connections, Recommenders, Comments
● Foursquare: Check-ins and Tips, Friends and Mayorship
● Klout: +K received
● Wikipedia: Inlinks, Inlinks to Outlinks, Page Importance, Category counts
● Google+: Comments, +1’s, Reshares
● Instagram: Posts, Followers, Likes and Comments
● YouTube: Subscribers, Views, Likes
9. Scoring
Step 1: Acquire labeled ground-truth data
● 100k labels from human evaluators
● Each network has its own labels
Step 2: Derive features from the interaction graph
● Long-lasting
● Dynamic
Step 3: Generate a score per network / community
● Fit a model for the labels with features
using supervised learning models
● Non-negative least squares
Step 4: Hierarchically combine scores
● Use heuristics such as graph size to
determine weights
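Step 3 names non-negative least squares as the model used to fit feature weights to labels. The sketch below illustrates that idea only: it solves a tiny non-negative least squares problem with projected gradient descent, which is a stand-in solver chosen for brevity and not necessarily what the production system uses. The feature matrix `X`, the labels `y`, and the function name are hypothetical.

```python
def nnls_projected_gradient(X, y, iterations=2000):
    """Minimize ||Xw - y||^2 subject to w >= 0 (illustrative solver)."""
    n_features = len(X[0])
    # trace(X^T X) bounds the largest eigenvalue of X^T X, so this step is safe.
    step = 1.0 / sum(v * v for row in X for v in row)
    w = [0.0] * n_features
    for _ in range(iterations):
        # Residual of the current fit for every labeled example.
        residuals = [sum(wj * xj for wj, xj in zip(w, row)) - yi
                     for row, yi in zip(X, y)]
        # Gradient of 0.5 * ||Xw - y||^2 is X^T (Xw - y).
        gradient = [sum(r * row[j] for r, row in zip(residuals, X))
                    for j in range(n_features)]
        # Gradient step, then projection onto the non-negative orthant.
        w = [max(0.0, wj - step * gj) for wj, gj in zip(w, gradient)]
    return w


# Hypothetical data: 6 labeled users, 3 features each.
# The labels were generated from the weights [2, 0, 1].
X = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
]
y = [2.0, 0.0, 1.0, 2.0, 1.0, 3.0]

weights = nnls_projected_gradient(X, y)
print([round(wj, 3) for wj in weights])
```

The non-negativity constraint keeps every feature from lowering a score, which fits features that count positive signals such as reactions. For real workloads, a library routine such as `scipy.optimize.nnls` could replace the hand-written loop.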
11. Dynamic Features
● Who:
○ The characteristics of the audience who
reacted to the original post from the user.
● When:
○ The difference between the current time
and the time at which the reaction
occurred.
● Where:
○ The social network on which the reaction
was performed.
● What:
○ The unit of original content or action on
which the reaction was performed.
● How:
○ The type of reaction.
HIGHER_(SCORED)-D3-FACEBOOK-POST-COMMENT
All-D3-FACEBOOK-POST-COMMENT
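The two feature names above suggest that each feature is keyed by the five dimensions (who, when, where, what, how). The sketch below shows one way such keys could be built and counted. It is an illustration under assumptions: the spelling of the keys is simplified, the label `D3` is copied from the slide without interpreting it, and the rule that "higher scored" means the reacting user's score exceeds the author's score is a guess, as are all field names.

```python
from collections import Counter


def feature_keys(reaction, author_score):
    """Return the feature names that a single reaction contributes to."""
    suffix = "-".join(
        [reaction["when"], reaction["where"], reaction["what"], reaction["how"]]
    )
    keys = ["ALL-" + suffix]
    # Assumption: the audience bucket compares reactor and author scores.
    if reaction["reactor_score"] > author_score:
        keys.append("HIGHER_SCORED-" + suffix)
    return keys


# Hypothetical reactions to one author's Facebook posts.
reactions = [
    {"reactor_score": 62.0, "when": "D3", "where": "FACEBOOK",
     "what": "POST", "how": "COMMENT"},
    {"reactor_score": 35.0, "when": "D3", "where": "FACEBOOK",
     "what": "POST", "how": "COMMENT"},
    {"reactor_score": 70.0, "when": "D3", "where": "FACEBOOK",
     "what": "POST", "how": "LIKE"},
]

features = Counter()
for reaction in reactions:
    features.update(feature_keys(reaction, author_score=50.0))

for name, count in sorted(features.items()):
    print(name, count)
```

Keying features by the cross product of the dimensions is what makes the count grow into the thousands: each new network, content type, or reaction type multiplies the number of possible keys.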
14. Hierarchical Combining - Cont.
● Treat networks as orthogonal vectors
since networks are mostly independent.
● Use heuristics such as network size to
determine weights.
● Final Klout score is the Euclidean norm
of the combined vector.
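As a rough illustration of the combination step, the sketch below treats each network score as one component of a vector, scales it by a weight, and takes the Euclidean norm. The weights and scores are made-up values, and any rescaling of the result into the published score range is left out because the slide does not describe it.

```python
import math


def combine_network_scores(scores, weights):
    """Euclidean norm of the weighted per-network score vector."""
    return math.sqrt(sum((weights[name] * score) ** 2
                         for name, score in scores.items()))


# Hypothetical per-network scores and size-based weights.
scores = {"twitter": 60.0, "facebook": 40.0, "linkedin": 20.0}
weights = {"twitter": 1.0, "facebook": 0.5, "linkedin": 0.5}

combined = combine_network_scores(scores, weights)
print(round(combined, 2))
```

Because the components are squared before summing, the strongest network dominates the result, while additional networks can only raise the combined value and never lower it.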
15. Key Insights
● Features are log-normalized => Klout scores are on a log scale
○ e.g. an order-of-magnitude difference between users scored 50 and 60
● Network models achieve 70-75% F1 scores.
○ Human evaluators do not always agree on influence ordering
● Wikipedia and LinkedIn are important sources for less active, high-influence users
○ e.g. Warren Buffett => low social network activity, high score
● Twitter and Facebook are important sources for long-tail users:
○ e.g. low-scored users with less influential interactions
● Temporal Dependence:
○ Combining long lasting and dynamic features allows influence measurement on
different time scales
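To make the log-scale point concrete, the sketch below applies one common form of log normalization to raw interaction counts. The slide does not give the exact transform, so `log10(1 + count)` and the function name are assumptions.

```python
import math


def log_normalize(count):
    """Compress a raw count onto a log scale; 0 maps to 0."""
    return math.log10(1.0 + count)


# Each tenfold increase in the raw count adds roughly 1 to the feature value.
for count in [9, 99, 999, 9999]:
    print(count, round(log_normalize(count), 3))
```

With features on this scale, equal steps in the score correspond to multiplicative changes in the underlying activity, which is why a fixed score gap can reflect an order-of-magnitude difference.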
17. Spreading information
● 87k users targeted with perks and encouraged to post messages
● 18k posts created, 394k reactions received
● Order of magnitude difference for users with Klout Score 60 vs 30
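The aggregate figures above imply two simple ratios, worked out in the sketch below. The inputs are the rounded "k" values from the slide, so the results are approximate, and the variable names are illustrative.

```python
# Rounded totals taken from the slide.
users_targeted = 87_000
posts_created = 18_000
reactions_received = 394_000

# Posts created per targeted user (at most a rough upper bound on the
# share of users who posted, since one user may post more than once).
posts_per_targeted_user = posts_created / users_targeted

# Average number of reactions each post received.
reactions_per_post = reactions_received / posts_created

print(round(posts_per_targeted_user, 3))
print(round(reactions_per_post, 1))
```

These are averages over all score levels; the slide's point is that the per-user reaction counts differ by about an order of magnitude between users scored 60 and users scored 30.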
21. Conclusion
● A hierarchical scoring system called the Klout Score and a feature generation
framework to capture different dimensions of influential interactions.
● Framework scales to hundreds of millions of users and billions of interactions across 9
social networks.
● Sources like Wikipedia and LinkedIn provide partial signals for real world influence.
Temporal dependence is also considered.
● The Klout Score is only a partial representation of the influence of a user.
● However, an extensible system that is able to easily incorporate new sources of
information can grow more accurate over time.
22. Paper Reference
● Klout Score: Measuring Influence Across Multiple Social Networks
Adithya Rao, Nemanja Spasojevic, Zhisheng Li, Trevor DSouza
arXiv link to paper: http://arxiv.org/abs/1510.08487