Fast shortest path distance estimation in large networks

Loading...

Flash Player 9 (or above) is needed to view presentations.
We have detected that you do not have it on your computer. To install it, go here.

0 comments

Post a comment

    Post a comment
    Embed Video
    Edit your comment Cancel

    Notes on slide 1

    Title intro

    motivation1

    Motivation2

    500 random point-to-point queries error: | d’ – d| / d

    500 random point-to-point queries error: | d’ – d| / d

    p@k: size of the intersection between the estimated top-k elements and the actual top-k, normalized by k 200 queries of the form (source, tag)

    Favorites, Groups & Events

    Fast shortest path distance estimation in large networks - Presentation Transcript

    1. Fast Shortest Path Distance Estimation in Large Networks Michalis Potamias Francesco Bonchi Carlos Castillo Aristides Gionis
    2. Context-aware Search …use shortest-path distance in wikipedia links-graph! Shortest Paths in Large Networks @ CIKM 2009 2
    3. Social Search John searches Mary Ellie Mary B Ranking: Jack 1. Mary A Jim 2. Mary B Ron 3. Mary C John Mary C Joe Mary A Frodo …use shortest-path distance in friendship graph! Shortest Paths in Large Networks @ CIKM 2009 3
    4. Problem and Solutions • DB: Graph G = (V,E) • Query: Nodes s and t in V • Goal: Compute fast shortest path d(s,t) • Exact Solution – BFS - Dijkstra – Bidirectional - Dijkstra with A* (aka ALT methods) • [Ikeda, 1994] [Pohl, 1971] [Goldberg and Harrelson, SODA 2005] • Heuristic Solution – Avoid traversals – Use Random Landmarks • [Kleinberg et al, FOCS 2004] [Vieira et al, CIKM 2007] s – Can we choose Better Landmarks ?!? t u Shortest Paths in Large Networks @ CIKM 2009 4
    5. The Landmarks’ Method • Offline – Precompute distance of all nodes to a small set of nodes (landmarks) – Each node is associated with a vector with its SP-distance from each landmark (embedding) • Query-time – d(s,t) = ? – Combine the embeddings of s and t to get an estimate of the query Shortest Paths in Large Networks @ CIKM 2009 5
    6. Contribution 1. Proved that covering the network with landmarks is NP-hard. 2. Devised heuristics for good landmarks. 3. Experiments with 5 large real-world networks and more than 30 heuristics. Comparison with state of the art. 4. Application to Social Search. Shortest Paths in Large Networks @ CIKM 2009 6
    7. Algorithmic Framework • Triangle Inequality s t u • Observation: the case of equality Shortest Paths in Large Networks @ CIKM 2009 7
    8. The Landmarks’ Method 1. Selection: Select k landmarks 2. Offline: Run k BFS/Dijkstra and store the embeddings of each node: Φ(s) = <d(s, u1), d(s , u2), … , d(s, uk)> = <s1, s2, …, sk> 1. Query-time: d(s,t) = ? – Fetch Φ(s) and Φ(t) – Compute mini{si + ti} (i.e. inf of UB) ... in time O(k) Shortest Paths in Large Networks @ CIKM 2009 8
    9. Example query: d(s,t) d(_,u1) d(_,u2) d(_,u3) d(_,u4) Φ(s) 2 4 5 2 Φ(t) 3 5 1 4 UB 5 9 6 6 LB 1 1 4 2 Shortest Paths in Large Networks @ CIKM 2009 9
    10. Coverage Using Upper Bounds • A landmark u covers a pair (s, t), if u lies on a shortest path from s to t • Problem Definition: find a set of k landmarks that cover as many pairs (s,t) in V x V as possible – NP-hard – k = 1 : node with the highest betweenness centrality – k > 1 : greedy set-cover (approximation - too expensive) …central nodes are a good start for devising heuristics! Shortest Paths in Large Networks @ CIKM 2009 10
    11. Landmarks Selection: Basic Heuristics • Random (baseline) • Choose central nodes! – Degree – Closeness centrality • Closeness of u is the average distance of u to any vertex in G • Caveat: many central nodes may cover the same pairs: newly added landmarks should cover different pairs …spread the landmarks in the graph! Shortest Paths in Large Networks @ CIKM 2009 11
    12. Constrained Heuristics • Remove immediate neighborhood 1. Rank all nodes according to Degree or Centrality 2. Iteratively choose the highest ranking nodes. Remove h-neighbors of each selected node from candidate set • Denote as – Degree/h – Closeness/h – Best results for h = 1 Shortest Paths in Large Networks @ CIKM 2009 12
    13. Partitioning-based Heuristics • Use graph-partitioning to spread nodes. • Utilize any partitioning scheme and – Degree/P • Pick the node with the highest degree in each partition – Closeness/P • Pick the node with the highest closeness in each partition – Border/P • Pick the node closer to the border in each partition. Maximize the border-value that is given from the following formula: Shortest Paths in Large Networks @ CIKM 2009 13
    14. Border/P u d1(u) = 3 1 2 d2(u) = 3 d3(u) = 2 b(u) = 3 d1(u)*(d2(u) + d3(u))= 3(3 + 2) Shortest Paths in Large Networks @ CIKM 2009 14
    15. Versus Random - error Shortest Paths in Large Networks @ CIKM 2009 15
    16. Versus Random - triangulation random landmarks have theoretical guarantees [FOCS04] Shortest Paths in Large Networks @ CIKM 2009 16
    17. Versus ALT - efficiency Ours (10%) 20 100 500 50 50 Operations ALT LB 60K 40K 80K 20K 2K Operations >300x >400x >160x >400x >40x ALT 7K 10K 20K 2K 2K Visited Nodes state of the art exact ALT methods [SODA05] Shortest Paths in Large Networks @ CIKM 2009 17
    18. Social Search Task random landmarks have been used [CIKM07] Shortest Paths in Large Networks @ CIKM 2009 18
    19. Conclusion • Novel search paradigms need distance as primitive – Approximations should be computed in milliseconds • Heuristic landmarks yield remarkable tradeoffs for SP- distance estimation in huge graphs – Hard to find the optimal landmarks – Border and Centrality heuristics: • outperform Random even by a factor of 250. • are, for a 10% error, many orders of magnitude faster than state of the art exact algorithms (ALT) • Future Work – Provide fast estimation for more graph primitives! Shortest Paths in Large Networks @ CIKM 2009 19
    20. Thank you! ? Shortest Paths in Large Networks @ CIKM 2009 20

    + Carlos CastilloCarlos Castillo, 2 weeks ago

    custom

    115 views, 0 favs, 0 embeds more stats

    Michalis Potamias, Francesco Bonchi, Carlos Castill more

    More info about this document

    CC Attribution License

    Go to text version

    • Total Views 115
      • 115 on SlideShare
      • 0 from embeds
    • Comments 0
    • Favorites 0
    • Downloads 1
    Most viewed embeds

    more

    All embeds

    less

    Flagged as inappropriate Flag as inappropriate
    Flag as inappropriate

    Select your reason for flagging this presentation as inappropriate. If needed, use the feedback form to let us know more details.

    Cancel
    File a copyright complaint
    Having problems? Go to our helpdesk?

    Categories

    Tags