Maclean.pptx

All Friends are Not Equal:
Using Weights in Social Graphs to
Improve Search

Sudheendra Hangal, Diana MacLean, Monica S. Lam, Jeffrey Heer

Computer Science Department, Stanford University

Outline

  Problem: social search in a global network
  Most contemporary approaches optimize for path length
  Tie strength not considered
  Success can be highly path dependent

  Question: Could a longer path be better?
  More likely to get an introduction through people who like me.
  If not, is there a best shortest path?

  Contributions:
  Influence as a model of tie strength in directed & undirected networks
  Best path = most influential path
  Study of influence and optimal paths in 2 networks (Twitter RT & DBLP)

  Results
  The shortest path is not always the best path!

Social Search Scenario
  <Graphic: social graph with nodes John, John’s College Roommate, HR Manager’s Brother, Google Recruiter, and Google HR Manager>

  John would like to apply for a job at Google. What is the best path to the HR
manager?
  James thinks Mary is cute. Who is the best person to ask for an introduction?
  …

  <Graphic of LinkedIn showing hundreds of paths>

Assigning Tie Strengths

  A social tie may be both weighted and asymmetric

  Infer automatically
  Most users would not enter tie strengths manually in any case

  Based on interaction frequency
  Latently captured in many social networks (emails, co-authorships…)
  Involves cost investment from user, so good proxy for tie strength

  We assume a global view of the data

Influence

  X’s influence on Y is proportional to Y’s investment in X

  Assume each node has equal, fixed resources to invest

  Influence of an edge:

$$\mathrm{Influence}(A, B) = \frac{\mathrm{Invests}(B, A)}{\sum_{X} \mathrm{Invests}(B, X)}$$

  Influence of a person:

$$\mathrm{Influence}(A) = \sum_{X} \mathrm{Influence}(A, X)$$

  Influence is both asymmetric and weighted
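
The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the nested-dict layout `invests[b][a]`, the function names, and the three-node example counts are all assumptions made for the sketch.

```python
from collections import defaultdict

def edge_influence(invests):
    """invests[b][a] = how much b invests in a (hypothetical layout,
    e.g. number of co-authored papers or retweets).
    Returns influence[a][b] = Invests(b, a) / sum over X of Invests(b, X)."""
    influence = defaultdict(dict)
    for b, targets in invests.items():
        total = sum(targets.values())
        if total == 0:
            continue  # b invests in nobody, so nobody has influence on b
        for a, amount in targets.items():
            influence[a][b] = amount / total
    return influence

def node_influence(influence):
    """Influence(A) = sum over X of Influence(A, X)."""
    return {a: sum(over.values()) for a, over in influence.items()}

# Made-up co-authorship counts: A and B share 6 papers, B and C share 2.
invests = {
    "A": {"B": 6},
    "B": {"A": 6, "C": 2},
    "C": {"B": 2},
}
inf = edge_influence(invests)
totals = node_influence(inf)
print(inf["A"]["B"], inf["B"]["A"])  # A's influence on B vs. B's influence on A
print(totals)
```

In this toy graph A's influence on B is 6/8 while B's influence on A is 6/6, which shows the asymmetry even though the underlying co-authorship count is symmetric.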

Influence
  <Graphic: co-authorship influence example; labels shown in the figure: 7, 6, 2, 4, 0.75, 0.35>

Influence of a Path

  Influence of a path:

$$S(P) = D^{|P|} \prod_{e_i \in P} \mathrm{Influence}(e_i)$$

  Decay factor D damps influence as path length increases

  Many other models
  This one is simple

  Strongest path = most influential
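
As a small worked example of the path formula, the sketch below compares a one-hop path over a weak tie with a two-hop path over strong ties. The decay value 0.9 and the edge influences are hypothetical numbers chosen for illustration; `path_strength` is a name introduced here, not one from the talk.

```python
import math

def path_strength(edge_influences, decay):
    """S(P) = decay**|P| times the product of the edge influences along P."""
    return decay ** len(edge_influences) * math.prod(edge_influences)

D = 0.9  # hypothetical decay factor

direct = path_strength([0.05], D)        # one hop over a weak tie
detour = path_strength([0.75, 0.80], D)  # two hops over strong ties
print(direct, detour)
```

With these made-up values the detour scores roughly 0.486 against roughly 0.045 for the direct edge, so the longer path is the stronger one despite the extra decay.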

Computing the Strongest Path

  Adaptation of Dijkstra’s shortest path algorithm.

  In order to maximize S(P):

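$$S(P) = \prod_{e_i \in P} \bigl(D \times \mathrm{Influence}(e_i)\bigr)$$

  Equivalently, maximize its logarithm:

$$\log S(P) = \sum_{e_i \in P} \bigl(\log(D) + \log(\mathrm{Influence}(e_i))\bigr)$$

  Thus minimizing:

$$-\sum_{e_i \in P} \bigl(\log(D) + \log(\mathrm{Influence}(e_i))\bigr) = \sum_{e_i \in P} \Bigl(\log\bigl(\tfrac{1}{D}\bigr) + \log\bigl(\tfrac{1}{\mathrm{Influence}(e_i)}\bigr)\Bigr)$$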

  We provide log(1/D) + log(1/Influence(ei)) as the edge weights to the
shortest path algorithm. These weights are non-negative whenever 0 < D ≤ 1,
because Influence(ei) ≤ 1.
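
The transformation above can be sketched with a textbook Dijkstra over a heap. This is an illustrative sketch under stated assumptions, not the authors' code: `influence[u][v]` is assumed to hold the influence value of the directed edge u -> v with 0 < value <= 1, the decay of 0.9 is hypothetical, and the node names and weights in the toy graph are invented to echo the job-search scenario.

```python
import heapq
import math

def strongest_path(influence, source, target, decay=0.9):
    """Return (path, S(P)) for the most influential path, or (None, 0.0).
    Runs Dijkstra with edge cost log(1/decay) + log(1/influence)."""
    dist = {source: 0.0}
    prev = {}
    done = set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        if u == target:
            break
        for v, inf in influence.get(u, {}).items():
            if inf <= 0:
                continue  # zero influence means the edge is unusable
            cost = math.log(1 / decay) + math.log(1 / inf)
            if d + cost < dist.get(v, math.inf):
                dist[v] = d + cost
                prev[v] = u
                heapq.heappush(heap, (d + cost, v))
    if target not in dist:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    # Summed cost is -log S(P), so exponentiate to recover the strength.
    return path, math.exp(-dist[target])

# Hypothetical influence values for a tiny version of the scenario.
graph = {
    "John": {"HR Manager": 0.05, "Roommate": 0.75},
    "Roommate": {"HR Manager": 0.80},
}
path, strength = strongest_path(graph, "John", "HR Manager")
print(path, strength)
```

Here the search prefers the two-hop route through the roommate over the direct but weak tie, because the summed log-costs of the two strong edges are lower than the single cost of the weak one.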

Networks Studied

  DBLP
  Investment: co-authorship
  ~600K nodes, ~4M edges (giant component only)
  Example of influence relationship: earlier slide

  Twitter RT
  Investment: re-tweeting someone’s tweet
  1 month’s worth of tweets
  ~2.4M nodes, ~8.85M edges (giant component only)
  Example of influence relationship: Obama and Joe the Plumber

  <Graphic: Obama and Joe the Plumber>

Experiment

  Pick 500 random node pairs

  Compute:
  Strongest path
  Shortest path

  Questions
  Do strongest paths tend to be longer than shortest paths, or the same length?
  What proportion of strongest paths are longer?
  How is influence distributed across nodes?
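
A rough sketch of how such a comparison could be run is shown below. It is not the authors' experimental code: the graph layout `graph[u][v] = influence`, the helper names, the decay of 0.9, and the three-node toy graph are all assumptions. The same Dijkstra routine is reused with unit costs for the shortest path and log-transformed costs for the strongest path.

```python
import heapq
import math
import random

def best_path(graph, source, target, cost):
    """Dijkstra over graph[u][v] = influence; cost(inf) gives the edge cost."""
    dist, prev, done = {source: 0.0}, {}, set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, inf in graph.get(u, {}).items():
            nd = d + cost(inf)
            if nd < dist.get(v, math.inf):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def sample_pairs(nodes, n, seed=0):
    """Draw n random (source, target) pairs of distinct nodes."""
    rng = random.Random(seed)
    return [tuple(rng.sample(nodes, 2)) for _ in range(n)]

def compare_paths(graph, pairs, decay=0.9):
    """Count pairs whose strongest path is longer than / as long as the shortest."""
    longer = equal = 0
    for s, t in pairs:
        shortest = best_path(graph, s, t, lambda inf: 1.0)
        if shortest is None:
            continue  # no path between this pair
        strongest = best_path(
            graph, s, t,
            lambda inf: math.log(1 / decay) + math.log(1 / inf))
        if len(strongest) > len(shortest):
            longer += 1
        else:
            equal += 1
    return longer, equal

# Toy graph with hypothetical influence values.
graph = {
    "A": {"B": 0.05, "C": 0.75},
    "B": {"A": 1.0},
    "C": {"B": 0.80},
}
pairs = [("A", "B"), ("C", "B"), ("B", "A")]
result = compare_paths(graph, pairs)
print(result)
```

For a real run, `sample_pairs(list(graph), 500)` would stand in for the hand-written `pairs` list. A strongest path can never have fewer hops than a shortest path, which is why only the "longer" and "equal" cases are counted.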

Results

  Node influence distributions

Discussion (1)

  Influence metric
  Captures asymmetry at the node level – most have influence 1

  Differences between Twitter and DBLP datasets
  Twitter outliers vs. DBLP outliers
  Twitter more sparse than DBLP
  Twitter, driven by popularity hype, lends itself to influence?

Discussion (2)

  Strongest path longer than shortest path
  43% in DBLP (~1 extra hop compared with shortest path)
  68% in Twitter (~2 extra hops compared with shortest path)
  More worthwhile to pick the stronger path

  Strongest path length equal to shortest path length
  Still better to pick the strongest, shortest path

  Future work:
  Explore alternate models of influence
  Consider paths between n-degree connected pairs.

Related Work
  Global social search
  Aardvark [Horowitz & Kamvar, WWW ’09]
  Facebook (and other OSN companies)

  Local social search
[Dodds et al., Science, August ’03]
[Adamic & Adar, Social Networks, July ’05]
[Watts et al., Science, May ’02]

  Inferring tie strengths from social graphs
[Gilbert & Karahalios, CHI ’09], [Xiang et al., WWW ’09],
[Leskovec et al., CHI ’10], [Onnela et al., NJP, June ’07]

Conclusions

  Longer paths are often better than shortest paths
  Cost of 1-2 extra “hops” seems small for tasks that are highly path dependent

  Even when the better path is not longer
  it is still better than picking randomly from the set of shortest paths

  In general, we need to develop more graph analysis methods for
weighted graphs
  Binary ties are often arbitrary
  Weights can be easily inferred
  Weights encode a wealth of social information

  Influence metric
  Simple
  Applicable to any graph encoding social interactions

Thank you!

  Questions?

  http://prpl.stanford.edu/influence
