All Friends are Not Equal:
      Using Weights in Social Graphs to
      Improve Search

Sudheendra Hangal, Diana MacLean,...
Outline

  Problem: social search in a global network
     Most contemporary approaches optimize for path length
     T...
Social Search Scenario
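  <Graphic of paths linking John, John's college roommate, the HR manager's brother, a Google recruiter, and the Google HR manager>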
                         ...
  <Graphic of LinkedIn showing hundreds of paths>
Assigning Tie Strengths

  A social tie may be both weighted and asymmetric

  Infer automatically
     Most users woul...
Influence

  X’s influence on Y is proportional to Y’s investment in X

  Assume each node has equal, fixed resources to...
Influence
  <Graphic of a co-authorship graph next to the influence graph derived from it>
Influence of a Path

  Influence of a path:
                             
     $S(P) = D^{|P|} \prod_{e_i \in P} \mathrm{Influence}(e_i)$
                    ...
Computing the Strongest Path

  Adaptation of Dijkstra’s shortest path algorithm.

  In order to maximize S(P):
        ...
Networks Studied

  DBLP
    Investment: co-authorship
    ~600K nodes, ~4M edges (giant component only)
    Example o...
Obama  Joe


  <Graphic of the influence relationship between Obama and Joe the Plumber>
Experiment

  Pick 500 random node pairs

  Compute:
    Strongest path
    Shortest path

  Questions
    Do strong...
Results

  Node influence distributions
Results

  Short vs. Strong paths

  DBLP              All     |Pstrong| > |Pshort|   |Pshort| = |Pstrong|
  Node Pairs   ...
Discussion (1)

  Influence metric
     Captures asymmetry at the node level – most have influence  1

  Differences be...
Discussion (2)

  Stronger path longer than shortest path
     43% in DBLP (~1 extra hop compared with shortest path)
  ...
Related Work
  Global social search
     Aardvark [Horowitz & Kamvar, WWW ’09]
     Facebook (and other OSN companies)

...
Conclusions

  Longer paths are often better than shortest paths
     Cost of 1-2 extra “hops” seems small for tasks tha...
Thank you!

  Questions?

  http://prpl.stanford.edu/influence
SNAKDD '10 presentation on social search

Maclean.pptx

  1. All Friends are Not Equal: Using Weights in Social Graphs to Improve Search
     Sudheendra Hangal, Diana MacLean, Monica S. Lam, Jeffrey Heer
     Computer Science Department, Stanford University
  2. Outline
     Problem: social search in a global network
        Most contemporary approaches optimize for path length
        Tie strength not considered
        Success can be highly path dependent
     Question: Could a longer path be better?
        More likely to get an introduction through people who like me.
        If not, is there a best shortest path?
     Contributions:
        Influence as a model of tie strength in directed & undirected networks
        Best path = most influential path
        Study of influence and optimal paths in 2 networks (Twitter RT & DBLP)
     Results
        The shortest path is not always the best path!
  3. Social Search Scenario
     <Graphic of paths linking John, John's college roommate, the HR manager's brother, a Google recruiter, and the Google HR manager>
     John would like to apply for a job at Google. What is the best path to the HR manager?
     James thinks Mary is cute. Who is the best person to ask for an introduction?
     …
  4. <Graphic of LinkedIn showing hundreds of paths>
  5. Assigning Tie Strengths
     A social tie may be both weighted and asymmetric
     Infer automatically
        Most users would not enter tie strengths by hand in any case
     Based on interaction frequency
        Latently captured in many social networks (emails, co-authorships…)
        Involves a cost investment from the user, so it is a good proxy for tie strength
     We assume a global view of the data
  6. Influence
     X’s influence on Y is proportional to Y’s investment in X
     Assume each node has equal, fixed resources to invest
     Influence of an edge: $\mathrm{Influence}(A, B) = \frac{\mathrm{Invests}(B, A)}{\sum_X \mathrm{Invests}(B, X)}$
     Influence of a person: $\mathrm{Influence}(A) = \sum_X \mathrm{Influence}(A, X)$
     Influence is both asymmetric and weighted
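The two definitions on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the dictionary layout, and the toy co-authorship counts are assumptions made for the example, not the authors' code or data.

```python
from collections import defaultdict


def edge_influence(invests):
    """Compute Influence(A, B) = Invests(B, A) / sum_X Invests(B, X).

    `invests` maps (investor, target) -> amount invested (hypothetical layout).
    The result maps (A, B) -> A's influence on B.
    """
    total_invested = defaultdict(float)
    for (investor, _target), amount in invests.items():
        total_invested[investor] += amount
    return {
        (target, investor): amount / total_invested[investor]
        for (investor, target), amount in invests.items()
        if total_invested[investor] > 0
    }


def node_influence(influence):
    """Compute Influence(A) = sum_X Influence(A, X)."""
    totals = defaultdict(float)
    for (a, _b), value in influence.items():
        totals[a] += value
    return dict(totals)


# Hypothetical toy data: co-authorship counts treated as symmetric investments.
invests = {
    ("alice", "bob"): 6, ("bob", "alice"): 6,
    ("alice", "carol"): 2, ("carol", "alice"): 2,
}
influence = edge_influence(invests)
per_node = node_influence(influence)
```

Even though the co-authorship counts are symmetric, the resulting influence is not: in this toy data Bob invests everything in Alice, so Alice's influence on Bob is 1.0, while Bob's influence on Alice is only 6/8 = 0.75.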
  7. Influence
     <Graphic of a co-authorship graph next to the influence graph derived from it>
  8. Influence of a Path
     Influence of a path: $S(P) = D^{|P|} \prod_{e_i \in P} \mathrm{Influence}(e_i)$
     Decay factor $D$ damps influence as path length increases
     Many other models
        This one is simple
     Strongest path = most influential
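As a small worked example of the path formula, the sketch below evaluates S(P) for two hypothetical paths. The function name, the decay value 0.9, and the edge influences are assumptions chosen for illustration; the slides do not state which decay value was used.

```python
import math


def path_influence(edge_influences, decay):
    """S(P) = decay**|P| * product of the edge influences along P."""
    if not 0 < decay <= 1:
        raise ValueError("decay must lie in (0, 1]")
    return decay ** len(edge_influences) * math.prod(edge_influences)


# Hypothetical values: a one-hop path over a weak tie versus a
# two-hop path over strong ties, both with an assumed decay of 0.9.
short_weak = path_influence([0.1], decay=0.9)
long_strong = path_influence([0.75, 0.5], decay=0.9)
```

With these made-up numbers the two-hop path scores higher than the one-hop path despite the extra decay factor, which is the situation the talk is concerned with.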
  9. Computing the Strongest Path
     Adaptation of Dijkstra’s shortest path algorithm.
     In order to maximize $S(P)$:
        $S(P) = \prod_{e_i \in P} (D \times \mathrm{Influence}(e_i))$
        $\log S(P) = \sum_{e_i \in P} (\log(D) + \log(\mathrm{Influence}(e_i)))$
     Thus minimizing:
        $-\sum_{e_i \in P} (\log(D) + \log(\mathrm{Influence}(e_i))) = \sum_{e_i \in P} \left(\log\frac{1}{D} + \log\frac{1}{\mathrm{Influence}(e_i)}\right)$
     We provide $\log(1/D) + \log(1/\mathrm{Influence}(e_i))$ as the edge weights to the shortest path algorithm.
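The reduction on this slide can be sketched as a standard heap-based Dijkstra search over the transformed weights. This is a minimal illustration rather than the authors' implementation: the adjacency-dictionary layout, the function name, the default decay of 0.9, and the toy graph are all assumptions. The transformed weights are non-negative, as Dijkstra requires, as long as the decay and every edge influence lie in (0, 1].

```python
import heapq
import math


def strongest_path(influence, source, target, decay=0.9):
    """Return (path, S(P)) for the most influential path, or (None, 0.0).

    `influence[u][v]` is the edge influence used when stepping from u to v
    (hypothetical layout); values are assumed to lie in (0, 1].
    """
    log_inv_decay = math.log(1.0 / decay)
    best = {source: 0.0}   # lowest transformed cost found so far
    parent = {}            # predecessor on the best known path
    heap = [(0.0, source)]
    done = set()
    while heap:
        cost, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            break
        for nxt, inf in influence.get(node, {}).items():
            if inf <= 0:
                continue  # zero influence means the edge is unusable
            new_cost = cost + log_inv_decay + math.log(1.0 / inf)
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if target not in best:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(parent[path[-1]])
    path.reverse()
    # Undo the log transform: S(P) = exp(-total transformed cost).
    return path, math.exp(-best[target])


# Hypothetical graph loosely modeled on the job-search scenario.
graph = {
    "john": {"recruiter": 0.1, "roommate": 0.8},
    "recruiter": {"hr": 0.1},
    "roommate": {"brother": 0.9},
    "brother": {"hr": 0.9},
}
path, strength = strongest_path(graph, "john", "hr")
```

In this made-up graph the search returns the three-hop route through the roommate and the brother, because its product of influences outweighs the extra decay relative to the two-hop route through the recruiter.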
  10. Networks Studied
      DBLP
         Investment: co-authorship
         ~600K nodes, ~4M edges (giant component only)
         Example of influence relationship: earlier slide
      Twitter RT
         Investment: re-tweeting someone’s tweet
         1 month’s worth of tweets
         ~2.4M nodes, ~8.85M edges (giant component only)
         Example of influence relationship: Obama and Joe the Plumber
  11. Obama and Joe
      <Graphic of the influence relationship between Obama and Joe the Plumber>
  12. Experiment
      Pick 500 random node pairs
      Compute:
         Strongest path
         Shortest path
      Questions
         Do stronger paths tend to be longer? Equivalent to shortest path?
         What proportion of stronger paths are longer?
         How is influence distributed across nodes?
  13. Results
      Node influence distributions
  14. Results
      Short vs. Strong paths

      DBLP             All    |Pstrong| > |Pshort|   |Pshort| = |Pstrong|
      Node Pairs       500    215 (43.0%)            285 (57.0%)
      Avg. |Pshort|    6.5    6.6                    6.5
      Avg. |Pstrong|   7.0    7.8                    6.5

      TWITTER          All    |Pstrong| > |Pshort|   |Pshort| = |Pstrong|
      Node Pairs       500    339 (67.8%)            161 (32.2%)
      Avg. |Pshort|    7.7    7.9                    7.3
      Avg. |Pstrong|   9.2    10.1                   7.3
  15. Discussion (1)
      Influence metric
         Captures asymmetry at the node level – most have influence 1
      Differences between Twitter and DBLP datasets
         Twitter outliers DBLP outliers
         Twitter more sparse than DBLP
         Twitter, driven by popularity hype, lends itself to influence?
  16. Discussion (2)
      Stronger path longer than shortest path
         43% in DBLP (~1 extra hop compared with shortest path)
         68% in Twitter (~2 extra hops compared with shortest path)
         More worthwhile to pick the stronger path
      Strongest path length equal to shortest path length
         Still better to pick the strongest, shortest path
      Future work:
         Explore alternate models of influence
         Consider paths between n-degree connected pairs.
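The "~1" and "~2" extra hops quoted on this slide appear to follow from the averages on the results slide, taken over the node pairs whose strongest path is longer than their shortest path. A quick check of that reading, using only the reported numbers:

```latex
\begin{align*}
\text{DBLP:}    \quad & \overline{|P_{strong}|} - \overline{|P_{short}|} = 7.8 - 6.6 = 1.2 \approx 1 \text{ extra hop} \\
\text{Twitter:} \quad & \overline{|P_{strong}|} - \overline{|P_{short}|} = 10.1 - 7.9 = 2.2 \approx 2 \text{ extra hops}
\end{align*}
```

Averaged over all 500 pairs instead, the gaps are smaller: 7.0 - 6.5 = 0.5 hops for DBLP and 9.2 - 7.7 = 1.5 hops for Twitter.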
  17. Related Work
      Global social search
         Aardvark [Horowitz & Kamvar, WWW ’09]
         Facebook (and other OSN companies)
      Local social search
         [Dodds et al., Science, August ’03], [Adamic & Adar, Social Networks, July ’05], [Watts et al., Science, May ’02]
      Inferring tie strengths from social graphs
         [Gilbert & Karahalios, CHI ’09], [Xiang et al., WWW ’09], [Leskovec et al., CHI ’10], [Onnela et al., NJP, June ’07]
  18. Conclusions
      Longer paths are often better than shortest paths
         Cost of 1-2 extra “hops” seems small for tasks that are highly path dependent
      Even when the better path is not longer
         it is still better than picking randomly from the set of shortest paths
      In general, we need to develop more graph analysis methods for weighted graphs
         Binary ties are often arbitrary
         Weights can be easily inferred
         Weights encode a wealth of social information
      Influence metric
         Simple
         Applicable to any graph encoding social interactions
  19. Thank you!
      Questions?
      http://prpl.stanford.edu/influence
