All Friends are Not Equal:
Using Weights in Social Graphs to
Sudheendra Hangal, Diana MacLean, Monica S. Lam, Jeffrey Heer
Computer Science Department, Stanford University
Problem: social search in a global network
Most contemporary approaches optimize for path length
Tie strength not considered
Success can be highly path dependent
Question: Could a longer path be better?
More likely to get an introduction through people who like me.
If not, is there a best shortest path?
Influence as a model of tie strength in directed & undirected networks
Best path = most influential path
Study of influence and optimal paths in 2 networks (Twitter RT & DBLP)
The shortest path is not always the best path!
Social Search Scenario
John would like to apply for a job at Google. What is the best path to the HR
James thinks Mary is cute. Who is the best person to ask for an introduction?
<Graphic of LinkedIn showing hundreds of paths>
Assigning Tie Strengths
A social tie may be both weighted and asymmetric
Most users would not input in any case
Based on interaction frequency
Latently captured in many social networks (emails, co-authorships…)
Involves cost investment from user, so good proxy for tie strength
We assume a global view of the data
X’s influence on Y is proportional to Y’s investment in X
Assume each node has equal, fixed resources to invest
Influence of an edge:
$\mathrm{Influence}(A, B) = \dfrac{\mathrm{Invests}(B, A)}{\sum_{X} \mathrm{Invests}(B, X)}$
Influence of a person:
$\mathrm{Influence}(A) = \sum_{X} \mathrm{Influence}(A, X)$
Influence is both asymmetric and weighted
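A minimal sketch of the two influence definitions, assuming investments are stored as raw interaction counts. The names `edge_influence`, `node_influence`, and the toy `invests` data are illustrative only and do not come from the talk.

```python
from collections import defaultdict


def edge_influence(invests):
    """Normalize investments into edge influence.

    invests[b][a] is how much b invests in a (for example, how often b
    retweets a). Returns a dict mapping (a, b) to the influence of a on b,
    i.e. the share of b's total investment that goes to a.
    """
    influence = {}
    for b, targets in invests.items():
        total = sum(targets.values())
        if total == 0:
            continue  # b invests in nobody, so nobody influences b
        for a, amount in targets.items():
            influence[(a, b)] = amount / total
    return influence


def node_influence(influence):
    """Sum a node's influence over everyone it influences."""
    totals = defaultdict(float)
    for (a, _b), value in influence.items():
        totals[a] += value
    return dict(totals)


# Hypothetical toy data: bob invests mostly in alice, a little in carol.
invests = {
    "bob": {"alice": 3, "carol": 1},
    "carol": {"alice": 2},
    "alice": {"bob": 1},
}
edges = edge_influence(invests)
nodes = node_influence(edges)
```

Because every node's outgoing investment is normalized to 1, the influence of alice on bob (0.75 here) need not equal the influence of bob on alice (1.0 here), which is the asymmetry the slide refers to.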
Influence of a Path
Influence of a path:
$S(P) = D^{|P|} \prod_{e_i \in P} \mathrm{Influence}(e_i)$
Decay factor D damps influence as path length increases
Many other models
This one is simple
Strongest path = most influential
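A small sketch of the path score, assuming |P| counts edges and that `influence` maps each directed edge `(u, v)` along the path to a value in (0, 1]. The function name, the edge-direction convention, and the numbers are assumptions made for illustration.

```python
import math


def path_strength(path, influence, decay):
    """Return decay ** (number of edges) times the product of edge influences."""
    edges = list(zip(path, path[1:]))
    return decay ** len(edges) * math.prod(influence[e] for e in edges)


# Hypothetical influence values: a weak direct tie and two stronger ties.
influence = {("a", "c"): 0.1, ("a", "b"): 0.5, ("b", "c"): 0.4}
direct = path_strength(["a", "c"], influence, decay=0.9)
two_hop = path_strength(["a", "b", "c"], influence, decay=0.9)
```

With these made-up numbers the two-hop path scores higher than the direct edge despite the extra decay, which is the kind of case the talk is concerned with.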
Computing the Strongest Path
Adaptation of Dijkstra’s shortest path algorithm.
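In order to maximize S(P):
$S(P) = \prod_{e_i \in P} \left( D \times \mathrm{Influence}(e_i) \right)$
$\log S(P) = \sum_{e_i \in P} \left( \log D + \log \mathrm{Influence}(e_i) \right)$
Maximizing S(P) is equivalent to minimizing
$-\sum_{e_i \in P} \left( \log D + \log \mathrm{Influence}(e_i) \right) = \sum_{e_i \in P} \left( \log \frac{1}{D} + \log \frac{1}{\mathrm{Influence}(e_i)} \right)$
We provide log(1/D) + log(1/Influence(e_i)) as the edge weights to the
shortest path algorithm.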
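One way this could be sketched is a textbook Dijkstra search over the transformed weights log(1/D) + log(1/Influence(e)). This is not the authors' code: the function name, the dict-of-edges input, and the example values are assumptions, and the decay and every influence value must lie in (0, 1] so that all weights are non-negative.

```python
import heapq
import math


def strongest_path(influence, source, target, decay):
    """Return (path, strength) for the most influential path, or (None, 0.0)."""
    graph = {}
    for (u, v), value in influence.items():
        if value > 0:  # zero-influence edges can never be on a strongest path
            weight = math.log(1 / decay) + math.log(1 / value)
            graph.setdefault(u, []).append((v, weight))

    best = {source: 0.0}  # lowest transformed cost found so far
    parent = {}
    done = set()
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        if node == target:
            break
        for nxt, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))

    if target not in best:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(parent[path[-1]])
    path.reverse()
    # Undo the log transform to recover S(P).
    return path, math.exp(-best[target])


# Hypothetical influence values, keyed by directed edge.
influence = {("a", "c"): 0.1, ("a", "b"): 0.5, ("b", "c"): 0.4}
path, strength = strongest_path(influence, "a", "c", decay=0.9)
```

Summing the transformed weights along a path gives -log S(P), so the minimum-cost path under these weights is the maximum-influence path.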
~600K nodes, ~4M edges (giant component only)
Example of influence relationship: earlier slide
Investment: re-tweeting someone’s tweet
1 month’s worth of tweets
~2.4M nodes, ~8.85M edges (giant component only)
Example of influence relationship: Obama Joe the Plumber
Pick 500 random node pairs
Do stronger paths tend to be longer? Equivalent to shortest path?
What proportion of stronger paths are longer?
How is influence distributed across nodes?
Captures asymmetry at the node level – most have influence 1
Differences between Twitter and DBLP datasets
<Graphic: Twitter outliers and DBLP outliers>
Twitter more sparse than DBLP
Twitter, driven by popularity hype, lends itself to influence?
Stronger path longer than shortest path
43% in DBLP (~1 extra hop compared with shortest path)
68% in Twitter (~2 extra hops compared with shortest path)
More worthwhile to pick the stronger path
Strongest path length equal to shortest path length
Still better to pick the strongest, shortest path
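A hedged sketch of how the strongest path among all shortest paths could be selected. Every shortest path has the same number of hops, so the decay term is identical for all of them and is left out of the score. The function name, the breadth-first approach, and the example values are assumptions, not the method used in the study.

```python
from collections import deque


def strongest_shortest_path(influence, source, target):
    """Among minimum-hop paths, return (path, product of edge influences)."""
    graph = {}
    for (u, v), value in influence.items():
        if value > 0:
            graph.setdefault(u, []).append((v, value))

    dist = {source: 0}    # hop distance from source
    best = {source: 1.0}  # best influence product over shortest paths
    parent = {}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt, value in graph.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                best[nxt] = 0.0
                queue.append(nxt)
            # Only edges that stay on a shortest path may improve the score.
            if dist[nxt] == dist[node] + 1 and best[node] * value > best[nxt]:
                best[nxt] = best[node] * value
                parent[nxt] = node

    if target not in dist:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(parent[path[-1]])
    path.reverse()
    return path, best[target]


# Hypothetical influence values: two 2-hop routes from a to c.
influence = {
    ("a", "b"): 0.5, ("b", "c"): 0.4,
    ("a", "d"): 0.9, ("d", "c"): 0.6,
}
path, product = strongest_shortest_path(influence, "a", "c")
```

Breadth-first order guarantees that a node's score is final before any of its successors are expanded, so a single pass is enough.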
Explore alternate models of influence
Consider paths between n-degree connected pairs.
Global social search
Aardvark [Horowitz & Kamvar, WWW ’09]
Facebook (and other OSN companies)
Local social search
[Dodds et al., Science, August ’03]
[Adamic & Adar, Social Networks, July ’05]
[Watts et al., Science, May ’02]
Inferring tie strengths from social graphs
[Gilbert & Karahalios, CHI ’09], [Xiang et al., WWW ’09],
[Leskovec et al., CHI ’10], [Onnela et al., NJP, June ’07]
Longer paths are often better than shortest paths
Cost of 1-2 extra “hops” seems small for tasks that are highly path dependent
Even when the better path is not longer
it is still better than picking randomly from the set of shortest paths
In general, we need to develop more graph analysis methods for
Binary ties are often arbitrary
Weights can be easily inferred
Weights encode a wealth of social information
Applicable to any graph encoding social interactions