1. All Friends are Not Equal:
Using Weights in Social Graphs to
Improve Search
Sudheendra Hangal, Diana MacLean, Monica S. Lam, Jeffrey Heer
Computer Science Department, Stanford University
2. Outline
Problem: social search in a global network
Most contemporary approaches optimize for path length
Tie strength not considered
Success can be highly path dependent
Question: Could a longer path be better?
More likely to get an introduction through people who like me.
If not, is there a best shortest path?
Contributions:
Influence as a model of tie strength in directed & undirected networks
Best path = most influential path
Study of influence and optimal paths in 2 networks (Twitter RT & DBLP)
Results
The shortest path is not always the best path!
3. Social Search Scenario
[Figure: social graph with nodes John, John’s College Roommate, HR Manager’s Brother, Google Recruiter, and Google HR Manager]
John would like to apply for a job at Google. What is the best path to the HR
manager?
James thinks Mary is cute. Who is the best person to ask for an introduction?
…
5. Assigning Tie Strengths
A social tie may be both weighted and asymmetric
Infer automatically
Most users would not input in any case
Based on interaction frequency
Latently captured in many social networks (emails, co-authorships…)
Involves cost investment from user, so good proxy for tie strength
We assume a global view of the data
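A minimal sketch of inferring investments from interaction frequency, assuming the raw data is a list of (actor, target) events such as "actor retweeted target" or "actor emailed target". The event format and the name `count_investments` are hypothetical and not taken from the slides.

```python
from collections import defaultdict


def count_investments(events):
    """Count how often each actor interacts with each target.

    events: iterable of (actor, target) pairs (assumed format).
    Returns invests[actor][target] = number of interactions.
    """
    invests = defaultdict(lambda: defaultdict(int))
    for actor, target in events:
        invests[actor][target] += 1
    # Convert to plain dicts so the result is easy to inspect and compare.
    return {actor: dict(targets) for actor, targets in invests.items()}


# Toy event log: bob interacts with alice twice and with carol once.
events = [("bob", "alice"), ("bob", "alice"), ("bob", "carol"), ("alice", "bob")]
invests = count_investments(events)
```

The counts are directed, so bob's investment in alice need not match alice's investment in bob.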
6. Influence
X’s influence on Y is proportional to Y’s investment in X
Assume each node has equal, fixed resources to invest
Influence of an edge:
$$\mathrm{Influence}(A, B) = \frac{\mathrm{Invests}(B, A)}{\sum_{X} \mathrm{Invests}(B, X)}$$
Influence of a person:
$$\mathrm{Influence}(A) = \sum_{X} \mathrm{Influence}(A, X)$$
Influence is both asymmetric and weighted
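The two influence definitions on this slide can be sketched as follows. The nested-dictionary layout (`invests[b][a]` is what B invests in A) and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def edge_influence(invests):
    """Influence(A, B) = Invests(B, A) / sum over X of Invests(B, X).

    invests[b][a] is the amount B invests in A (assumed layout).
    Returns a dict keyed by (a, b) pairs.
    """
    influence = {}
    for b, targets in invests.items():
        total = sum(targets.values())
        if total == 0:
            continue  # B invests in nobody, so nobody influences B
        for a, amount in targets.items():
            influence[(a, b)] = amount / total
    return influence


def node_influence(influence):
    """Influence(A) = sum over X of Influence(A, X)."""
    totals = defaultdict(float)
    for (a, _b), value in influence.items():
        totals[a] += value
    return dict(totals)


# Toy data: bob invests 3 units in alice and 1 in carol, and so on.
invests = {
    "bob": {"alice": 3, "carol": 1},
    "carol": {"alice": 2},
    "alice": {"bob": 1, "carol": 1},
}
inf = edge_influence(invests)
node_inf = node_influence(inf)
```

In this toy graph alice's influence on bob is 0.75 while bob's influence on alice is 0.5, which illustrates the asymmetry. Because each node's outgoing investments are normalized to 1, the node influences sum to the number of investing nodes.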
8. Influence of a Path
Influence of a path:
$$S(P) = D^{|P|} \prod_{e_i \in P} \mathrm{Influence}(e_i)$$
Decay factor D damps influence as path length increases
Many other models
This one is simple
Strongest path = most influential
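As a small worked example of the path formula, the sketch below evaluates S(P) for a two-hop path with weak ties and a three-hop path with strong ties. The edge influences and the decay value 0.5 are made-up numbers chosen only for illustration.

```python
import math


def path_influence(edge_influences, decay):
    """S(P) = decay**|P| times the product of the edge influences on P."""
    return decay ** len(edge_influences) * math.prod(edge_influences)


D = 0.5                         # illustrative decay factor
short_path = [0.1, 0.2]         # two hops over weak ties
long_path = [0.9, 0.8, 0.9]     # three hops over strong ties

s_short = path_influence(short_path, D)  # 0.25 * 0.02
s_long = path_influence(long_path, D)    # 0.125 * 0.648
```

With these numbers the longer path scores about 0.081 against 0.005 for the shorter one, so the extra hop is outweighed by the stronger ties.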
9. Computing the Strongest Path
Adaptation of Dijkstra’s shortest path algorithm.
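In order to maximize S(P):
$$S(P) = \prod_{e_i \in P} \bigl(D \times \mathrm{Influence}(e_i)\bigr)$$
we can equivalently maximize its logarithm:
$$\log S(P) = \sum_{e_i \in P} \bigl(\log(D) + \log(\mathrm{Influence}(e_i))\bigr)$$
Thus minimizing:
$$-\sum_{e_i \in P} \bigl(\log(D) + \log(\mathrm{Influence}(e_i))\bigr) = \sum_{e_i \in P} \Bigl(\log\bigl(\tfrac{1}{D}\bigr) + \log\bigl(\tfrac{1}{\mathrm{Influence}(e_i)}\bigr)\Bigr)$$
We provide log(1/D) + log(1/Influence(e_i)) as the edge weights to the
shortest path algorithm.
These weights are non-negative when 0 < D ≤ 1, since Influence(e_i) ≤ 1.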
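The log transformation above can be sketched with a standard heap-based Dijkstra search. This is an illustrative sketch, not the authors' implementation: the graph layout (`influence[u][v]` is the influence used for the step from u to v), the function name, and the toy numbers are all assumptions.

```python
import heapq
import math


def strongest_path(influence, source, target, decay=0.5):
    """Return (path, S(P)) for the most influential path, or (None, 0.0).

    influence[u][v] is the influence value in (0, 1] for the step u -> v
    (assumed layout); decay must satisfy 0 < decay <= 1.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, inf in influence.get(u, {}).items():
            if inf <= 0:
                continue  # zero influence cannot lie on a strongest path
            # Edge weight: log(1/D) + log(1/Influence(e)), always >= 0.
            nd = d + math.log(1 / decay) + math.log(1 / inf)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    # Undo the transformation: S(P) = exp(-total weight).
    return path, math.exp(-dist[target])


# Toy graph loosely modeled on the job-search scenario (made-up values).
graph = {
    "john": {"recruiter": 0.1, "roommate": 0.9},
    "recruiter": {"hr": 0.2},
    "roommate": {"brother": 0.8},
    "brother": {"hr": 0.9},
}
path, strength = strongest_path(graph, "john", "hr", decay=0.5)
```

Here the three-hop route through the roommate and the brother is returned, with strength about 0.081, even though a two-hop route through the recruiter exists.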
10. Networks Studied
DBLP
Investment: co-authorship
~600K nodes, ~4M edges (giant component only)
Example of influence relationship: earlier slide
Twitter RT
Investment: re-tweeting someone’s tweet
1 month’s worth of tweets
~2.4M nodes, ~8.85M edges (giant component only)
Example of influence relationship: Obama Joe the Plumber
12. Experiment
Pick 500 random node pairs
Compute:
Strongest path
Shortest path
Questions
Do stronger paths tend to be longer? Equivalent to shortest path?
What proportion of stronger paths are longer?
How is influence distributed across nodes?
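One possible shape for the path comparison is sketched below: breadth-first search gives the shortest hop count, a Dijkstra search over the log-transformed weights gives the hop count of the strongest path, and the two are compared over a set of node pairs. The graph layout, function names, and toy data are assumptions; the actual experimental code is not shown in the slides.

```python
import heapq
import math
import random
from collections import deque


def shortest_hops(graph, source, target):
    """Hop count of a shortest path found by BFS, or None if unreachable."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == target:
            return hops
        for nxt in graph.get(node, {}):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None


def strongest_hops(graph, source, target, decay):
    """Hop count of the path minimizing sum of log(1/D) + log(1/influence)."""
    best = {source: 0.0}
    heap = [(0.0, 0, source)]
    done = set()
    while heap:
        cost, hops, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            return hops
        for nxt, inf in graph.get(node, {}).items():
            if inf <= 0:
                continue
            new_cost = cost - math.log(decay) - math.log(inf)
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, hops + 1, nxt))
    return None


def fraction_longer(graph, pairs, decay):
    """Share of connected pairs whose strongest path has more hops."""
    longer = connected = 0
    for s, t in pairs:
        short = shortest_hops(graph, s, t)
        if short is None:
            continue
        connected += 1
        if strongest_hops(graph, s, t, decay) > short:
            longer += 1
    return longer / connected if connected else 0.0


def sample_pairs(nodes, k, seed=0):
    """Draw k random ordered pairs of distinct nodes (hypothetical helper)."""
    rng = random.Random(seed)
    return [tuple(rng.sample(nodes, 2)) for _ in range(k)]


# Toy graph with made-up influence values; graph[u][v] is the u -> v step.
graph = {
    "john": {"recruiter": 0.1, "roommate": 0.9},
    "recruiter": {"hr": 0.2},
    "roommate": {"brother": 0.8},
    "brother": {"hr": 0.9},
}
pairs = [("john", "hr"), ("john", "brother"), ("roommate", "hr")]
frac = fraction_longer(graph, pairs, decay=0.5)
```

On a real dataset, `pairs` would come from something like `sample_pairs(list_of_nodes, 500)` restricted to the giant component.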
15. Discussion (1)
Influence metric
Captures asymmetry at the node level – most have influence 1
Differences between Twitter and DBLP datasets
Twitter outliers vs. DBLP outliers
Twitter more sparse than DBLP
Twitter, driven by popularity hype, lends itself to influence?
16. Discussion (2)
Stronger path longer than shortest path
43% in DBLP (~1 extra hop compared with shortest path)
68% in Twitter (~2 extra hops compared with shortest path)
More worthwhile to pick the stronger path
Strongest path length equal to shortest path length
Still better to pick the strongest, shortest path
Future work:
Explore alternate models of influence
Consider paths between n-degree connected pairs.
17. Related Work
Global social search
Aardvark [Horowitz & Kamvar, WWW ’09]
Facebook (and other OSN companies)
Local social search
[Dodds et al., Science, August ’03]
[Adamic & Adar, Social Networks, July ’05]
[Watts et al., Science, May ’02]
Inferring tie strengths from social graphs
[Gilbert & Karahalios, CHI ’09], [Xiang et al., WWW ’09],
[Leskovec et al., CHI ’10], [Onnela et al., NJP, June ’07]
18. Conclusions
Longer paths are often better than shortest paths
Cost of 1-2 extra “hops” seems small for tasks that are highly path dependent
Even when the better path is not longer,
it is still better than picking randomly from the set of shortest paths
In general, we need to develop more graph analysis methods for
weighted graphs
Binary ties are often arbitrary
Weights can be easily inferred
Weights encode a wealth of social information
Influence metric
Simple
Applicable to any graph encoding social interactions