http://www.logos.ic.i.u-tokyo.ac.jp/~kay/papers/ccgrid2008_stable_broadcast.pdf

1. A Stable Broadcast Algorithm
Kei Takahashi, Hideo Saito, Takeshi Shibata, Kenjiro Taura (The University of Tokyo, Japan)
CCGrid 2008 - Lyon, France

2. Broadcasting Large Messages
- Goal: distribute the same large data to many nodes (e.g., content delivery)
- Widely used in parallel processing

3. Problem of Broadcast
- Usually, in a broadcast, the source can deliver much less data to each destination than in a single transfer from the source
(Figure: a single transfer from S to D achieves 100, while a broadcast to four destinations achieves only 25 per node)

4. Problem of Slow Nodes
- Pipelined transfers improve performance
- Even in a pipeline transfer, nodes with small bandwidth (slow nodes) may degrade the receiving bandwidth of all other nodes
(Figure: a slow node with capacity 10 in a pipeline of capacity-100 links limits the nodes behind it to 10)

5. Contributions
- Propose the idea of Stable Broadcast
- In a stable broadcast:
  - Slow nodes never degrade the receiving bandwidth of other nodes
  - All nodes receive the maximum possible amount of data

6. Contributions (cont.)
- Propose a stable broadcast algorithm for tree topologies
  - Proved to be stable in a theoretical model
  - Improves performance in general graph networks
- In a real-machine experiment, our algorithm achieved 2.5 times the aggregate bandwidth of the previous algorithm (FPFR)

7. Agenda
- Introduction
- Problem Settings
- Related Work
- Proposed Algorithm
- Evaluation
- Conclusion

8. Problem Settings
- Target: large-message broadcast
- Only computational nodes handle messages

9. Problem Settings (cont.)
- Only bandwidth matters for large messages:
  (Transfer time) = (Latency) + (Message size) / (Bandwidth)
  - Example: for a 1 GB message over a 1 Gbps link with 50 ms latency, the bandwidth term accounts for about 99% of the transfer time
- Bandwidth is only limited by link capacities
  - Assume that nodes and switches have enough processing throughput

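A quick back-of-the-envelope check of the 99% figure, using the example numbers from this slide:

```python
# Check of the slide's example: 1 GB message, 1 Gbps link, 50 ms latency.
latency_s = 0.050                     # 50 ms
message_bits = 1e9 * 8                # 1 GB expressed in bits
bandwidth_bps = 1e9                   # 1 Gbps

transfer_s = message_bits / bandwidth_bps      # 8.0 s spent on the data itself
total_s = latency_s + transfer_s               # 8.05 s in total
print(f"bandwidth term: {transfer_s / total_s:.1%}")  # ~99.4%
```
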
10. Problem Settings (cont.)
- Bandwidth-annotated topologies are given in advance
  - Bandwidth and topologies can be rapidly inferred:
    - Shirai et al. A Fast Topology Inference - A building block for network-aware parallel computing. (HPDC 2007)
    - Naganuma et al. Improving Efficiency of Network Bandwidth Estimation Using Topology Information (SACSIS 2008, Tsukuba, Japan)
(Figure: an example topology annotated with link bandwidths 100, 10, 80, 40, 30)

11. Evaluation of Broadcast
- Previous algorithms evaluated broadcasts by completion time
- However, completion time cannot capture the effect of slowly receiving nodes
  - It is desirable that each node receives as much data as possible
- Aggregate bandwidth is a more reasonable evaluation criterion in many cases

12. Definition of Stable Broadcast
- All nodes receive the maximum possible bandwidth
  - The receiving bandwidth of each node is not reduced by adding other nodes to the broadcast
(Figure: destinations D0-D3 with single-transfer bandwidths 100, 10, 120, 100; in a stable broadcast, D2 still receives 120)

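Stated as a predicate, the definition compares each node's receiving bandwidth in the broadcast against its direct-transfer bandwidth. A minimal sketch; the function and dictionaries are illustrative, not from the paper:

```python
def is_stable(broadcast_bw, direct_bw):
    """A broadcast is stable if every destination receives at least the
    bandwidth it would get from a separate direct transfer from the
    source (its maximum possible receiving bandwidth)."""
    return all(broadcast_bw[d] >= direct_bw[d] for d in direct_bw)

# The slide's example values: D2 receives 120 both alone and in the broadcast.
direct = {"D0": 100, "D1": 10, "D2": 120, "D3": 100}
print(is_stable(direct.copy(), direct))  # True: nobody lost bandwidth
```
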
13. Properties of Stable Broadcast
- Maximizes aggregate bandwidth
- Minimizes completion time

14. Agenda
- Introduction
- Problem Settings
- Related Work
- Proposed Algorithm
- Evaluation
- Conclusion

15. Single-Tree Algorithms
- Flat tree: the outgoing link from the source becomes a bottleneck
- Random pipeline: some links used many times become bottlenecks
- Depth-first pipeline: each link is used only once, but fast nodes suffer from slow nodes
- Dijkstra: fast nodes do not suffer from slow nodes, but some links are used many times

16. FPFR Algorithm [†]
- FPFR (Fast Parallel File Replication) improved the aggregate bandwidth over algorithms that use only one tree
- Idea:
  1. Construct multiple spanning trees
  2. Use these trees in parallel
[†] Izmailov et al. Fast Parallel File Replication in Data Grid. (GGF-10, March 2004)

17. Tree Construction in FPFR
- Iteratively construct spanning trees:
  - Create a spanning tree (Tn) by tracing every destination
  - Set its throughput (Vn) to the bottleneck bandwidth in Tn
  - Subtract Vn from the remaining bandwidth of each link
(Figure: a first tree T1 with throughput V1, then a second tree T2 with throughput V2, each limited by its bottleneck link)

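A compact sketch of this loop in Python. The graph encoding ({(u, v): capacity} for directed links) and the unpruned depth-first search are my own simplifications, not FPFR's actual implementation:

```python
import copy

def spanning_tree(capacity, source, dests):
    """Depth-first search over links with positive remaining capacity.
    Returns the tree edges if every destination is reached, else None.
    (For simplicity the tree is not pruned to paths that reach dests.)"""
    edges, seen, stack = [], {source}, [source]
    while stack:
        u = stack.pop()
        for (a, b), c in capacity.items():
            if a == u and b not in seen and c > 0:
                seen.add(b)
                edges.append((a, b))
                stack.append(b)
    return edges if set(dests) <= seen else None

def fpfr_trees(capacity, source, dests):
    """Carve spanning trees out of a bandwidth-annotated graph until no
    spanning tree is left; each tree's throughput Vn is its bottleneck."""
    capacity = copy.deepcopy(capacity)
    trees = []
    while (tree := spanning_tree(capacity, source, dests)) is not None:
        v = min(capacity[e] for e in tree)   # bottleneck throughput Vn
        for e in tree:
            capacity[e] -= v                 # subtract Vn from each used link
        trees.append((tree, v))
    return trees
```

Each iteration drives at least the bottleneck link to zero remaining capacity, so the loop terminates after at most one round per link.
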
18. Data Transfer with FPFR
- Each tree sends a different fraction of the data in parallel
  - T1 sends the former part, T2 sends the latter part
  - The proportion of data sent through each tree may be optimized by linear programming (Balanced Multicasting [†])
[†] den Burger et al. Balanced Multicasting: High-throughput Communication for Grid Applications (SC 2005)

19. Problems of FPFR
- In FPFR, slow nodes degrade the receiving bandwidth of other nodes
- For tree topologies, FPFR outputs only one depth-first pipeline, which cannot utilize the potential network performance

20. Agenda
- Introduction
- Problem Settings
- Related Work
- Our Algorithm
- Evaluation
- Conclusion

21. Our Algorithm
- Modify the FPFR algorithm to create both spanning trees and partial trees
- Stable for tree topologies whose links have the same bandwidth in both directions

22. Tree Construction
- Iteratively construct trees:
  - Create a tree Tn by tracing every destination
  - Set its throughput Vn to the bottleneck in Tn
  - Subtract Vn from the remaining capacities
(Figure: T1 is a spanning tree; T2 and T3 are partial trees that reach only a subset of the destinations)

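The only change to the FPFR sketch above is the stopping condition: a tree is kept as long as it reaches any destination, not all of them. Again a simplification under the same assumed graph encoding:

```python
import copy

def stable_trees(capacity, source, dests):
    """Like fpfr_trees, but also keeps partial trees that reach only a
    subset of the destinations (the modification this slide describes)."""
    capacity = copy.deepcopy(capacity)
    trees = []
    while True:
        edges, seen, stack = [], {source}, [source]
        while stack:                          # depth-first trace from the source
            u = stack.pop()
            for (a, b), c in capacity.items():
                if a == u and b not in seen and c > 0:
                    seen.add(b)
                    edges.append((a, b))
                    stack.append(b)
        reached = set(dests) & seen
        if not reached:                       # no destination reachable: done
            break
        v = min(capacity[e] for e in edges)   # bottleneck throughput Vn
        for e in edges:
            capacity[e] -= v
        trees.append((edges, reached, v))
    return trees
```
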
23. Data Transfer
- Each tree sends data proportional to its throughput Vn
- Example with trees T1, T2, T3:
  - Stage 1: use T1, T2 and T3
  - Stage 2: use T1 and T2 to send the data previously sent by T3
  - Stage 3: use T1 to send the data previously sent by T2

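Splitting the message in proportion to the tree throughputs is a one-liner; the staging then re-sends a partial tree's share through the wider trees, so nodes the partial tree does not cover still receive everything. A sketch with illustrative numbers:

```python
def data_fractions(throughputs, size):
    """Portion of a `size`-byte message assigned to each tree,
    proportional to its throughput Vn."""
    total = sum(throughputs)
    return [size * v / total for v in throughputs]

# Three trees with throughputs V1=60, V2=30, V3=10 sharing a 1000-byte message:
print(data_fractions([60, 30, 10], 1000))  # [600.0, 300.0, 100.0]
# Nodes not covered by T3 later receive T3's 100 bytes again via T1 and T2.
```
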
24. Properties of Our Algorithm
- Stable for tree topologies whose links have the same capacities in both directions
  - Every node receives the maximum bandwidth
- For any topology, it achieves greater aggregate bandwidth than the baseline algorithm (FPFR)
  - Partial trees fully utilize the link capacity
- Creating a broadcast plan has a small calculation cost

25. Agenda
- Introduction
- Problem Settings
- Related Work
- Proposed Algorithm
- Evaluation
- Conclusion

26. (1) Simulations
- Simulated 5 broadcast algorithms using a real topology
- Compared the aggregate bandwidth of each method
  - Many bandwidth distributions
  - Broadcasts to 10, 50, and 100 nodes
  - 10 kinds of (source, destination) conditions
(Figure: the topology, with clusters of 110, 81, 36 and 4 nodes)

27. Compared Algorithms
- Flat Tree, Random, Dijkstra, Depth-First (FPFR), and ours

28. Result of Simulations
- Mixed two kinds of links (100 and 1000)
  - Vertical axis: speedup over FlatTree
  - 40 times more than Random and 3 times more than Depth-First (FPFR) with 100 nodes

29. Result of Simulations (cont.)
- Tested 8 bandwidth distributions:
  - Uniform distribution (500-1000)
  - Uniform distribution (100-1000)
  - Mixed 100 and 1000 links
  - Uniform distribution (100-1000) between switches
  - (each distribution was tested under two conditions: the bandwidths of the two link directions being the same, and being different)
- Our method achieved the largest bandwidth in 7 of the 8 cases
  - Especially large improvements when the bandwidth variance is large
  - With the uniform distribution (100-1000) and different bandwidths in the two directions, Dijkstra achieved 2% more aggregate bandwidth

30. (2) Real-Machine Experiment
- Performed broadcasts in 4 clusters
  - Number of destinations: 10, 47 and 105 nodes
  - Link bandwidths: 10 Mbps - 1 Gbps
- Compared the aggregate bandwidth of 4 algorithms:
  - Our algorithm
  - Depth-first (FPFR)
  - Dijkstra
  - Random (best among 100 trials)

31. Theoretical Maximum Aggregate Bandwidth
- We also calculated the theoretical maximum aggregate bandwidth
  - The total of the receiving bandwidths when each destination gets a separate direct transfer from the source
(Figure: destinations D0-D3 with direct-transfer bandwidths 100, 10, 120, 100)

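On a tree topology, each destination's direct-transfer bandwidth is the minimum link capacity on its path from the source, so this theoretical maximum is a sum of path bottlenecks. A sketch under the same assumed graph encoding as earlier:

```python
def path_bottleneck(capacity, source, dest):
    """Minimum link capacity on the source->dest path of a tree,
    found by depth-first search (each node is reached exactly once)."""
    stack, visited = [(source, float("inf"))], {source}
    while stack:
        u, bw = stack.pop()
        if u == dest:
            return bw
        for (a, b), c in capacity.items():
            if a == u and b not in visited:
                visited.add(b)
                stack.append((b, min(bw, c)))
    return 0.0

def theoretical_max(capacity, source, dests):
    """Sum of per-destination direct-transfer bandwidths, each computed
    independently (contention between transfers is ignored)."""
    return sum(path_bottleneck(capacity, source, d) for d in dests)
```
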
32. Evaluation of Aggregate Bandwidth
- For the 105-node broadcast, our algorithm achieved 2.5 times more bandwidth than the baseline algorithm DepthFirst (FPFR)
- However, it stayed at 50-70% of the theoretical maximum aggregate bandwidth
  - The computational nodes cannot fully utilize the up and down links

33. Evaluation of Stability
- Compared the aggregate bandwidth of 9 nodes before and after adding one slow node
  - Unlike DepthFirst (FPFR), existing nodes do not suffer from the added slow node with our algorithm
  - Achieved 1.6 times the bandwidth of Dijkstra

34. Agenda
- Introduction
- Problem Settings
- Related Work
- Our Algorithm
- Evaluation
- Conclusion

35. Conclusion
- Introduced the notion of Stable Broadcast
  - Slow nodes never degrade the receiving bandwidth of fast nodes
- Proposed a stable broadcast algorithm for tree topologies
  - Theoretically proved to be stable
  - 2.5 times the aggregate bandwidth in real-machine experiments
  - Confirmed speedups in simulations under many different conditions

36. Future Work
- An algorithm that maximizes aggregate bandwidth in general graph topologies
- An algorithm that changes the relay schedule by detecting bandwidth fluctuations

38. All the Graphs
(Nine plots of relative performance vs. number of destinations (10, 50, 100), comparing Ours, Depthfirst, Dijkstra, Random (best) and Random (avg): (a) Low Bandwidth Variance, Symmetric; (b) Low Bandwidth Variance, Asymmetric; (c) High Bandwidth Variance, Symmetric; (d) High Bandwidth Variance, Asymmetric; (e) Mixed Fast and Slow Links, Symmetric; (f) Mixed Fast and Slow Links, Asymmetric; (g) Random Bandwidth among Clusters, Symmetric; (h) Random Bandwidth among Clusters, Asymmetric; (i) Multiple Sources with Low Bandwidth Variance, Symmetric, plotted against (# of sources, # of destinations) from (1,10) to (5,50))

39. Broadcast with BitTorrent [†]
- BitTorrent gradually improves the transfer schedule by adaptively choosing the parent node
- Since the relaying structure created by BitTorrent has many branches, these links may become bottlenecks
(Figure: a snapshot of the transfer tree with a bottleneck link)
[†] Wei et al. Scheduling Independent Tasks Sharing Large Data Distributed with BitTorrent. (GRID '05)

40. Simulation 1
- Uniform distribution (100-1000) between switches
  - Vertical axis: speedup over FlatTree
  - 36 times more than FlatTree and 1.2 times more than DepthFirst (FPFR) for the 100-node broadcast

41. Topology-Unaware Pipeline
- Traces all the destinations from the source
  - Some links used by many transfers become bottlenecks

42. Depth-First Pipeline [†]
- Constructs a depth-first pipeline by using topology information
  - Avoids link sharing by using each link only once
  - Minimizes the completion time in a tree topology
- Slow nodes degrade the performance of other nodes
[†] Shirai et al. A Fast Topology Inference - A building block for network-aware parallel computing. (HPDC 2007)

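On a tree topology the pipeline is simply the depth-first visiting order of the nodes; chaining them in that order crosses each tree link at most once in each direction. A minimal sketch (the children map is illustrative):

```python
def depth_first_pipeline(children, source):
    """Depth-first (preorder) visiting order of the nodes; the broadcast
    pipeline forwards data along order[0] -> order[1] -> ..."""
    order, stack = [], [source]
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(reversed(children.get(u, [])))  # keep left-to-right order
    return order

print(depth_first_pipeline({"S": ["A", "B"], "A": ["C", "D"]}, "S"))
# ['S', 'A', 'C', 'D', 'B']
```
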
43. Dijkstra Algorithm [†]
- Constructs a relaying structure in a greedy manner
  - Adds the node reachable with the maximum bandwidth, one by one
  - The effects of slow nodes are small
- Some links may be used by many transfers and become bottlenecks
[†] Wang et al. A novel data grid coherence protocol using pipeline-based aggressive copy method. (GPC, pages 484-495, 2007)

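"Add the node reachable with the maximum bandwidth" is the widest-path (max-bottleneck) variant of Dijkstra's algorithm. A sketch, again under the assumed {(u, v): capacity} encoding:

```python
import heapq

def widest_path_tree(capacity, source):
    """Greedily attach the node reachable with the highest bottleneck
    bandwidth (max-heap simulated by negating the keys)."""
    best = {source: float("inf")}   # best bottleneck bandwidth to each node
    parent = {}                     # resulting relaying structure
    heap = [(-best[source], source)]
    done = set()
    while heap:
        neg_bw, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for (a, b), c in capacity.items():
            if a == u:
                bw = min(-neg_bw, c)
                if bw > best.get(b, 0):
                    best[b] = bw
                    parent[b] = u
                    heapq.heappush(heap, (-bw, b))
    return parent, best
```
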
