Copyright© 2016 NTT Corp. All Rights Reserved.
Rabbit Order:
Just-in-time Parallel Reordering
for Fast Graph Analysis
Rabbit Order: Just-in-time Reordering for Fast Graph Analysis

Presentation slides used in IPDPS'16.

Published in: Technology
Rabbit Order: Just-in-time Reordering for Fast Graph Analysis

  1. Rabbit Order: Just-in-time Parallel Reordering for Fast Graph Analysis
     • Junya Arai, Nippon Telegraph and Telephone Corp. (NTT)
     • Hiroaki Shiokawa, Univ. of Tsukuba
     • Takeshi Yamamuro, NTT
     • Makoto Onizuka, Osaka Univ.
     • Sotetsu Iwamura, NTT
     Copyright© 2016 NTT Corp. All Rights Reserved.
  2. Summary
     • Vertex reordering has been used to improve the locality of graph processing
     • However, the overhead of reordering tends to increase end-to-end runtime (= reordering + analysis)
     • Thus, we propose a fast reordering algorithm, Rabbit Order
       • Exploits community structures in real-world graphs
     • Up to 3.5x speedup for PageRank
       • Including reordering overheads!
     • Also effective for various graph analyses
     Footer: J. Arai+, "Rabbit Order: Just-in-time Reordering for Fast Graph Analysis," IPDPS'16. Copyright© 2016 NTT Corp. All Rights Reserved.
  3. Graph analysis
     To deal with large-scale graphs, the performance of various analysis algorithms needs to be improved.
     Real-world graphs (large-scale):
     • Web graphs: over 50B pages*1
     • Social graphs: 1B users, 200B friendships*1
     Analysis algorithms (various):
     • Community detection
     • Ranking (e.g., PageRank)
     • Shortest path
     • Diameter
     • Connected components
     • k-core decomposition
     • ...
     *1: Andrew+, “Parallel Graph Analytics,” CACM, 59(5), ‘16
  4. Poor locality
     • Poor locality in memory accesses is a problem common to various analysis algorithms
     Poor locality causes:
     • Frequent cache misses
     • Frequent inter-core communications
     • Memory bandwidth saturation
     • Simultaneous memory access from cores
     Poor scalability
  5. Memory access example
     • PageRank:
         until convergence do
           for each vertex v do
             $s[v] = \sum_{u \in \mathrm{Neighbor}(v)} s[u] / \mathrm{degree}(u)$
     • Access the PageRank score s of each neighbor
     • s[0], s[2], s[4], and s[7] are accessed when v = 0
     [Figure: example graph with vertices 0 to 7 and a grid of the accessed elements in array s (indices 0 to 7), with the row for v = 0 filled in]
  6. Memory access example (same pseudocode; the grid adds the row for v = 1)
  7. Memory access example (same pseudocode; the grid adds the row for v = 2)
  8. Memory access example (same pseudocode; the grid shows the rows for v = 0 to v = 7)
  9. Memory access example (same pseudocode; the full grid is annotated "Poor spatial locality" and "Poor temporal locality")
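The access pattern on slides 5 to 9 can be reproduced with a small sketch. It applies the update exactly as written on the slide (no damping factor) to a hypothetical 8-vertex undirected graph; the adjacency list `adj` and the helper `index_span` are illustrative assumptions, not data or code from the slides.

```python
def pagerank_step(adj, s):
    """One pass of the slide's update: s[v] = sum of s[u] / degree(u) over neighbors u."""
    degree = [len(neighbors) for neighbors in adj]
    return [sum(s[u] / degree[u] for u in adj[v]) for v in range(len(adj))]


def index_span(adj, v):
    """Distance between the lowest and highest index of s read while processing v."""
    return max(adj[v]) - min(adj[v])


# Hypothetical undirected graph (an assumption, not the graph drawn on the slides).
adj = [
    [2, 4, 7],  # vertex 0
    [3, 6],     # vertex 1
    [0, 4, 5],  # vertex 2
    [1, 6],     # vertex 3
    [0, 2, 7],  # vertex 4
    [2],        # vertex 5
    [1, 3],     # vertex 6
    [0, 4],     # vertex 7
]

s = [1.0 / len(adj)] * len(adj)
s = pagerank_step(adj, s)

# A wide span means the reads for one vertex are scattered across the array.
spans = [index_span(adj, v) for v in range(len(adj))]
print(spans)
```

With this numbering, processing vertex 0 reads `s[2]`, `s[4]`, and `s[7]`, which are spread over most of the array; that scattering is what the slides label as poor spatial locality.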
  10. Reordering
      • A preprocessing step that optimizes the vertex ordering (ID numbering)
      • No change to analysis algorithms or implementations is required
      • Improves locality by co-locating neighboring vertices in memory
      • Existing algorithms: RCM, LLP, Nested Dissection, ...
      [Figure: the example graph labeled with a random ordering and with a high-locality ordering]
  11. On a reordered graph
      • PageRank:
          until convergence do
            for each vertex v do
              $s[v] = \sum_{u \in \mathrm{Neighbor}(v)} s[u] / \mathrm{degree}(u)$
      • Access the PageRank score s of each neighbor
      [Figure: reordered example graph and the grid of accessed elements in array s (indices 0 to 7) for v = 0 to v = 7]
  12. On a reordered graph (same pseudocode and grid, annotated "High spatial locality" and "High temporal locality")
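Reordering itself is only a relabeling of vertex IDs, which is why the analysis code needs no change. The sketch below applies a permutation to an adjacency list and compares a crude locality proxy before and after. The graph, the permutation `new_id`, and the `total_span` metric are all hypothetical choices made for illustration.

```python
def relabel(adj, new_id):
    """Return the adjacency list after renaming every vertex v to new_id[v]."""
    relabeled = [None] * len(adj)
    for v, neighbors in enumerate(adj):
        relabeled[new_id[v]] = sorted(new_id[u] for u in neighbors)
    return relabeled


def total_span(adj):
    """Sum over vertices of (highest neighbor ID - lowest neighbor ID)."""
    return sum(max(n) - min(n) for n in adj if n)


# Hypothetical undirected graph with two groups: {0, 2, 4, 5, 7} and {1, 3, 6}.
adj = [[2, 4, 7], [3, 6], [0, 4, 5], [1, 6], [0, 2, 7], [2], [1, 3], [0, 4]]

# Hypothetical permutation that gives each group a contiguous ID range.
new_id = [0, 5, 1, 6, 2, 4, 7, 3]

reordered = relabel(adj, new_id)
print(total_span(adj), total_span(reordered))
```

The relabeled graph has the same edges, but each vertex's neighbors now sit in a narrower index range, so consecutive reads of the score array are closer together.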
  13. Problem in reordering
      • Reordering tends to increase end-to-end runtime
        • end-to-end = reordering + analysis (e.g., PageRank)
      • ‘Speedup’ by ahead-of-time reordering
      [Figure: pipeline of Reordering (slow), then Analysis (fast), then Result; the graph must be reordered again when it is modified]
  14. Problem in reordering (same bullets)
      [Figure: timeline comparing "w/o reordering" (Analysis only) with "w/ reordering" (Reordering followed by Analysis); the second bar is longer: Slowdown!!!]
  15. Our contribution: Rabbit Order
      • Reordering algorithm to reduce end-to-end runtime
      • Speedup by just-in-time reordering
      • Rabbit Order: fast reordering & high locality
      [Figure: timeline comparing "w/o reordering" (Analysis only), "w/ reordering" (Reordering followed by Analysis), and Rabbit Order (short Reordering followed by short Analysis)]
  16. Rabbit Order
  17. Two main techniques
      1. Hierarchical community-based ordering
         • For high locality
      2. Parallel incremental aggregation
         • For fast reordering
  18. Two main techniques
      1. Hierarchical community-based ordering
         • For high locality
      2. Parallel incremental aggregation
         • For fast reordering
  19. Community-based ordering
      • Community: a group of densely connected vertices
        • Common in real-world graphs (e.g., web, social, ...)
      • Co-locate vertices within each community in memory (cf. [Prat-Perez ‘11][Boldi+ ‘11])
      [Figure: example graph split into Community 1 (vertices 0~4) and Community 2 (vertices 5~7), with the grid of accessed elements in array s for v = 0 to v = 7 showing one block per community]
  20. Community-based ordering (same bullets)
      [Figure: real social network (http://snap.stanford.edu/data/egonets-Facebook.html) and its grid of accessed elements in array s]
  21. Hierarchy of communities
      • A community contains inner nested communities
      • e.g., a social network of students
        • Hierarchy of schools, grades, and classes
      [Figure: real social network, http://snap.stanford.edu/data/egonets-Facebook.html]
  22. Hierarchical community-based ordering
      • Hierarchical community co-location for further locality
        • Recursively co-locate vertices within each inner community
        • Inner communities produce denser blocks and higher locality
      [Figure: real social network (http://snap.stanford.edu/data/egonets-Facebook.html) and its grid of accessed elements in array s, with a denser block highlighted]
  23. Hierarchical community-based ordering (same bullets and figure)
      How can we obtain hierarchical communities? Reordering time must be short!
  24. Two main techniques
      1. Hierarchical community-based ordering
         • For high locality
      2. Parallel incremental aggregation
         • For fast reordering
  25. Incremental aggregation [Shiokawa+ '13]
      • Extract hierarchical communities by merging vertex pairs
      • Fast since it rapidly coarsens the graph, but sequential
      • Each vertex is merged into the neighbor that most improves modularity Q [Newman+ ‘04]
      • Gain of modularity for merging vertices u and v:
        $\Delta Q(u, v) = 2 \left( \frac{w_{uv}}{2m} - \frac{\deg(u)\,\deg(v)}{(2m)^2} \right)$
        where $w_{uv}$ is the edge weight between vertices u and v, and $m$ is the total number of edges in the graph
      • Example for vertex 0: ΔQ(v0, v2) = 0.052 (highlighted), ΔQ(v0, v4) = 0.031, ΔQ(v0, v7) = 0.042
      [Figure: the example graph being coarsened step by step into Community 1 and Community 2]
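The modularity gain on slide 25 is cheap to evaluate, which is part of why pairwise merging is fast. Below is a minimal sketch of the formula and of picking the best neighbor. The degrees, edge weights, and edge count are hypothetical and do not reproduce the 0.052 / 0.031 / 0.042 values on the slide; `best_merge_target` is an illustrative helper name.

```python
def delta_q(w_uv, deg_u, deg_v, m):
    """Modularity gain for merging u and v: 2 * (w_uv / 2m - deg(u) * deg(v) / (2m)^2)."""
    two_m = 2.0 * m
    return 2.0 * (w_uv / two_m - (deg_u * deg_v) / (two_m ** 2))


def best_merge_target(u, weights, degree, m):
    """Pick the neighbor of u with the largest gain.

    weights maps each neighbor of u to the weight of the connecting edge.
    """
    return max(weights, key=lambda v: delta_q(weights[v], degree[u], degree[v], m))


# Hypothetical unweighted graph with 9 edges; only vertex 0's neighborhood is needed.
m = 9
degree = {0: 3, 2: 3, 4: 3, 7: 2}
weights_of_0 = {2: 1, 4: 1, 7: 1}

gains = {v: delta_q(w, degree[0], degree[v], m) for v, w in weights_of_0.items()}
target = best_merge_target(0, weights_of_0, degree, m)
print(gains, target)
```

In this toy input all three edges have the same weight, so the lowest-degree neighbor wins: the second term of the formula penalizes merging two high-degree vertices.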
  26. Parallelization issues
      • Naive per-vertex parallelization causes conflicts
      • Mutex: large overheads
        • Fine-grained locking (per vertex) is required
      • Atomic operation: operands are too small (16 bytes on x86-64)
        • Cannot atomically merge vertices by reattaching edges and removing one of the vertices
      [Figure: Thread 1 and Thread 2 merging vertices of the same graph (vertices 0 to 6) at the same time]
  27. Solution: lazy aggregation (1/2)
      • Lightweight concurrency control by atomic operations
      • Delay merges until the merged vertex is required
        • to reduce the size of the data to be atomically modified
      • Just register vertices as community members
        • This can be performed using compare-and-swap by storing the members in a singly-linked list
        • All the members are virtually treated as vertex 1
      [Figure: Thread 1 and Thread 2 each register a vertex as a member of the community of vertex 1 in a graph with vertices 0 to 6]
  28. Solution: lazy aggregation (2/2)
      • Lightweight concurrency control by atomic operations
      • Delay merges until the merged vertex is required
        • to reduce the size of the data to be atomically modified
      • Which vertex should vertex 1 be merged into? Actually merge the members, then compute ΔQ
        • Only one thread is assigned to each vertex, so it can merge the members without conflicts
      [Figure: the thread assigned to vertex 1 merges the registered members before computing ΔQ against the neighbors of vertex 1]
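The two phases of lazy aggregation can be sketched as follows. Python has no hardware compare-and-swap, so the `AtomicCell` class below emulates one with a lock; it is a stand-in for illustration only, and the `Vertex` layout (`child`, `sibling`, `edges`) is an assumption rather than the authors' data structure. Phase 1 is a standard CAS retry loop that pushes a member onto a singly-linked list; phase 2 folds the members' edges into the community vertex and is meant to be run by the single thread that owns that vertex.

```python
import threading


class AtomicCell:
    """Emulated compare-and-swap word (a lock stands in for the hardware instruction)."""

    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False


class Vertex:
    def __init__(self, vid, edges):
        self.vid = vid
        self.edges = dict(edges)   # neighbor ID -> edge weight
        self.child = AtomicCell()  # head of the list of registered members
        self.sibling = None        # next member in that list


def register_member(community, member):
    """Phase 1: push member onto the community's list, retrying if the CAS loses a race."""
    while True:
        head = community.child.load()
        member.sibling = head
        if community.child.compare_and_swap(head, member):
            return


def merge_members(community):
    """Phase 2: fold registered members' edges into the community vertex.

    Simplification: edge endpoints are not renamed to the community ID here.
    """
    member = community.child.load()
    while member is not None:
        merge_members(member)  # a member may have members of its own
        for neighbor, weight in member.edges.items():
            community.edges[neighbor] = community.edges.get(neighbor, 0) + weight
        member = member.sibling


community = Vertex(1, {5: 1})
members = [Vertex(vid, {5: 1}) for vid in (0, 2, 3, 4)]
threads = [threading.Thread(target=register_member, args=(community, m)) for m in members]
for t in threads:
    t.start()
for t in threads:
    t.join()

merge_members(community)
print(community.edges)
```

Only the list head is modified atomically in phase 1, which keeps the contended state down to a single pointer-sized word; the expensive edge merging is deferred to a point where no other thread touches the vertex.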
  29. Communities to ordering in a hierarchical community-based manner
      • Construct a dendrogram while extracting communities
      [Figure: dendrogram over the example graph's vertices, grouping them into Community 1 (with an inner community) and Community 2]
  30. Communities to ordering in a hierarchical community-based manner
      • Construct a dendrogram while extracting communities
      • Reorder vertices to the DFS visit order on it
        • Vertices in each inner community are recursively co-located
      [Figure: DFS over the dendrogram of each community assigns the new ordering 0 to 7; the example graph is shown with its old and new vertex IDs]
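The step from dendrogram to ordering can be sketched with a nested-list dendrogram whose leaves are the original vertex IDs. This representation and the example tree are assumptions made for illustration; the slides do not specify how the dendrogram is stored. New IDs are handed out in the order a depth-first traversal reaches the leaves.

```python
def dfs_order(dendrogram):
    """Map each leaf (original vertex ID) to its rank in DFS visit order."""
    new_id = {}
    stack = [dendrogram]
    while stack:
        node = stack.pop()
        if isinstance(node, list):
            # Push children in reverse so they are visited left to right.
            stack.extend(reversed(node))
        else:
            new_id[node] = len(new_id)
    return new_id


# Hypothetical dendrogram: two top-level communities, each with an inner community.
dendrogram = [
    [[0, 2], 4, [7, 5]],  # community A with inner communities {0, 2} and {7, 5}
    [[1, 3], 6],          # community B with inner community {1, 3}
]

new_id = dfs_order(dendrogram)
permutation = [new_id[v] for v in range(8)]
print(permutation)
```

Because a DFS finishes one subtree before starting the next, every community and every inner community receives a contiguous range of new IDs, which is the recursive co-location the slide describes.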
  31. Evaluation
  32. Setup
      • Xeon E5-2697v2, 12 cores x 2 sockets, 256 GB RAM
      • Reordering methods for comparison:
        Label   | Method                                              | Execution
        Slash   | SlashBurn [Lim+ TKDE’14]                            | Sequential
        BFS     | Unordered parallel BFS [Karantasis+ SC’14]          | Parallel
        RCM     | Unordered parallel RCM [Karantasis+ SC’14]          | Parallel
        ND      | Multithreaded Nested Dissection [LaSalle+ IPDPS’13] | Parallel
        LLP     | Layered Label Propagation [Boldi+ WWW’11]           | Parallel
        Shingle | The shingle ordering [Chierichetti+ KDD’09]         | Parallel
        Degree  | Ascending order of degree                           | Parallel
        Random  | Random ordering (baseline)                          | -
      • Graphs (V: vertices, E: edges):
        Graph    | V      | E
        berkstan | 0.7M   | 7.6M
        enwiki   | 4.2M   | 101.4M
        ljournal | 4.8M   | 69.0M
        uk-2002  | 18.5M  | 298.1M
        road-usa | 23.9M  | 57.7M
        uk-2005  | 39.5M  | 936.4M
        it-2004  | 41.3M  | 1.2B
        twitter  | 41.7M  | 1.5B
        sk-2005  | 50.6M  | 1.9B
        webbase  | 118.1M | 1.0B
  33. End-to-end PageRank speedup
      • Rabbit Order yields up to 3.5x (avg. 2.2x) speedup
      • The other methods degrade performance in most cases
      • $\text{End-to-end speedup} = \dfrac{\text{PageRank runtime with random ordering}}{\text{Reordering runtime} + \text{PageRank runtime}}$
      • Reordering methods and PageRank are run with 48 threads using HyperThreading
      [Figure: end-to-end speedup on each of the 10 graphs (berkstan to webbase) for Rabbit, Slash, BFS, RCM, ND, LLP, Shingle, and Degree; values above 1 are speedups and values below 1 are slowdowns; best speedup 3.5x]
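The end-to-end speedup defined on slide 33 makes the trade-off explicit: a reordering only pays off if it costs less time than it saves in the analysis. The sketch below uses hypothetical runtimes, not measurements from the evaluation, and `pays_off` is an illustrative helper.

```python
def end_to_end_speedup(baseline_analysis, reordering, analysis):
    """Baseline analysis time divided by (reordering time + analysis time after reordering)."""
    return baseline_analysis / (reordering + analysis)


def pays_off(baseline_analysis, reordering, analysis):
    """True when reordering costs less than the analysis time it saves."""
    return reordering < baseline_analysis - analysis


# Hypothetical runtimes in seconds (assumptions, not values from the slides).
fast = end_to_end_speedup(baseline_analysis=100.0, reordering=10.0, analysis=30.0)
slow = end_to_end_speedup(baseline_analysis=100.0, reordering=150.0, analysis=25.0)
print(fast, slow)
```

In the second case the analysis itself is faster than in the first, yet the end-to-end result is a slowdown because the reordering step dominates the total runtime.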
  34. Breakdown of PageRank runtime
      • Rabbit Order achieves fast reordering and high locality at the same time
      • Reorders a 1.2B-edge graph in about 12 sec.
      [Figure: runtime in seconds on it-2004, split into reordering and PageRank, for Random, Degree, Shingle, LLP, ND, RCM, BFS, Slash, and Rabbit]
  35. Cache misses during PageRank
      • Competitive with the best state-of-the-art algorithms
      [Figure: number of L1, L2, and L3 cache misses on it-2004 for Rabbit, Slash, BFS, RCM, ND, LLP, Shingle, Degree, and Rand]
  36. Effectiveness for other analyses
      • Rabbit Order is effective for various analysis algorithms
      • Efficiency is affected by the computational cost of the analysis
        • It is difficult to amortize the reordering time when the analysis time is short (e.g., that of DFS and BFS)
      [Figure: average end-to-end speedup over the 10 graphs for DFS, BFS, connected components, graph diameter, and k-core decomposition, comparing Rabbit, Slash, BFS, RCM, ND, LLP, Shingle, and Degree; analyses are grouped by runtime (1-10 sec and 10-100 sec); values above 1 are speedups and values below 1 are slowdowns]
  37. Scalability of reordering
      • Highest scalability with respect to the number of threads
        • Plenty of parallelism in incremental aggregation
        • Lightweight concurrency control (lazy aggregation)
      [Figure: average speedup of reordering time vs. 1 thread with 12, 24, and 48 (HT) threads for Rabbit, BFS, RCM, ND, LLP, Shingle, and Degree]
  38. Conclusion
      • Reordering improves locality of graph analysis
        • But existing algorithms tend to increase end-to-end runtime
      • Rabbit Order reduces the end-to-end runtime by two main techniques:
        1. Hierarchical community-based ordering for high locality
        2. Parallel incremental aggregation for fast reordering
      • Up to 3.5x speedup for PageRank
      • Also effective for various analysis algorithms
      Implementation available: https://git.io/rabbit (for evaluation purposes only)
