Deadlock Preventive Adaptive Wormhole Routing

   on k-ary n-cube Interconnection Networks

Franck Binard, 2003

Published in: Technology
Deadlock Preventive Adaptive Wormhole Routing on k-ary n-cube Interconnection Networks

Franck Binard

December 7, 2003

Abstract

A primary concern in all adaptive networks is the cost of deadlock prevention. Wormhole routing can be considered superior to other routing schemes in that it combines low latency with low buffer requirements, but it makes the deadlock issue harder to resolve, because a single message can block several links at the same time. Because planar-adaptive routing limits routing freedom to two dimensions at a time, it makes it possible to prevent deadlock with only a fixed number of virtual channels, independent of the number of network dimensions. In this essay, I study two planar-adaptive schemes (Chien and Kim’s algorithm and the turn model) in the context of cost effectiveness and deadlock prevention.

Contents

1 Introduction
2 k-ary n-cube Interconnection Networks
3 Wormhole routing
4 Adaptive Routing
5 Livelock Prevention
6 Deadlock Prevention
    6.1 The Cost of Deadlock Prevention
7 Minimal Routing
8 Virtual Channels
    8.1 Virtual Channel Utilization
        8.1.1 Advantages of Using Virtual Channels
        8.1.2 Disadvantages of Virtual Channel Usage
    8.2 Using Virtual Channels for Deadlock Avoidance
9 Planar Adaptive Wormhole Routing
    9.1 The Turn Model for Adaptive Routing
        9.1.1 The Algorithm
        9.1.2 The West-First Algorithm
        9.1.3 The North-Last Algorithm
        9.1.4 The Negative-First Algorithm
        9.1.5 Deadlock Freedom
        9.1.6 Performance
    9.2 Chien and Kim’s Partially Adaptive Routing Algorithm
        9.2.1 Notation and Terminology
        9.2.2 The Algorithm
        9.2.3 Deadlock Freedom
        9.2.4 Performance Comparisons
        9.2.5 Fault Tolerance
    9.3 The Modified Chien and Kim’s Partially Adaptive Routing Algorithm
10 Conclusion

List of Figures

1 Examples of k-ary n-cube interconnection network topologies
2 Adaptive vs Deterministic Routing Algorithms
3 A deadlock situation: four messages have entered the network through different switches, and are blocked by each other in a cycle after having each acquired the first-hop link
4 Virtual Channel Router Architecture Diagram
5 Virtual Channel Deadlock Control
6 Turns prohibited in the west-first algorithm
7 Turns prohibited in the north-last algorithm
8 Turns prohibited in the negative-first algorithm
9 Chien and Kim’s planar-adaptive routing in a 2-ary 3-cube
10 Numbering of virtual channels with respect to node A in dimension i, i + 1
11 Increasing and Decreasing Virtual Networks of an Adaptive Plane
12 Deactivation Algorithm
13 Comparison of Chien and Kim’s planar adaptive scheme and of the modification algorithm in terms of buffer requirements and channel utilization

1 Introduction

Routing algorithms are crucial to the efficient operation of interconnection networks, as they specify the paths packets take when messages are sent among the processors of the network. A good routing algorithm reduces the latency of the network by minimizing the number of hops that packets need to reach their destination, ideally forcing packets to advance closer to their destination with every hop. It should also be able to handle deadlock and livelock situations. Ideally, it would have features that allow it to route around network faults. Finally, it should be able to balance the routing traffic across the interconnection network’s routing resources.

There are two categories of routing approaches in the context of interconnection networks: deterministic and adaptive [13]. With deterministic routing, a packet follows a path that is determined exclusively by its source and its destination, whereas adaptive routers choose the path of each packet based on the current dynamic conditions of the network. Adaptive routing schemes increase network performance, but they also increase the cost and complexity of the network, and they make it harder to provide deadlock and livelock prevention and correction. Restricting adaptivity is one way to reduce some of the problems associated with adaptive routing.

Planar-adaptive routing is a limited adaptive scheme. By restricting the set of possible paths that a message can take, planar-adaptive routing resolves some of the difficulties associated with full adaptivity at a reasonable cost. Because planar-adaptive routing also has most of the desirable properties of fully adaptive routing (such as load balancing), it constitutes a compromise between deterministic routing and fully adaptive routing.

In this essay, I provide the reader with a survey of planar-adaptive wormhole routing schemes and a study of deadlock-free planar-adaptive wormhole routed networks in the
context of deadlock prevention, performance benefits, cost effectiveness and fault tolerance. In terms of topology, the discussion is restricted to k-ary n-cube networks.

Figure 1: Examples of k-ary n-cube interconnection network topologies (3-ary 2-cube (torus), 2-ary 3-cube, 4-ary 3-cube)

2 k-ary n-cube Interconnection Networks

Definition 2.1 (k-ary n-cube Interconnection Network) A k-ary n-cube interconnection network is an n-dimensional network in which each dimension has k nodes; k is also referred to as the radix. The relation $N = k^n$ (equivalently $n = \log_k N$) holds, where N is the total number of nodes in the network. The network is composed of $k^n$ nodes, with $2n$ physical channels per node.

Figure 1 presents possible architectures for three different k-ary n-cube topologies. Nodes in k-ary n-cubes are identified by an n-digit radix-k address $(a_0, \ldots, a_{n-1})$, where the digit $a_i$ represents the node’s position in the ith dimension. There is always a physical channel between the node with address $(a_0, \ldots, a_i, \ldots, a_{n-1})$ and its upper neighbor in the
ith dimension, $(a_0, \ldots, (a_i + 1) \bmod k, \ldots, a_{n-1})$.

3 Wormhole routing

Wormhole routing is a routing technique by which a switch immediately forwards an incoming message to the desired output link when that link is available [11]. In wormhole routing, each packet is divided into a sequence of small units of data (typically 8-32 bits [16]) called flits. Once a communication channel has started to send the first flits of a packet, it must transmit all the flits of that packet before it can be used for any other message. The header flit contains the routing information and governs the route. The remaining data flits follow in a pipelined fashion. If the header is blocked, the data flits are blocked as well.

Because wormhole routing does not require a node to allocate an entire packet buffer before accepting each packet, the buffers do not need to be as large as they are with other routing schemes such as virtual cut-through [7]. Thus, wormhole routing allows the construction of fast and inexpensive interconnection networks.

Compared with cut-through and store-and-forward techniques, wormhole routing reduces the buffer requirements of the network. However, it may lower network throughput, because a blocked packet remains in the network, holding the channels it has acquired, instead of being buffered at a node. When a message’s output link is in use by another message, a collision takes place and the incoming message is blocked, which increases the probability of deadlock.

4 Adaptive Routing

An adaptive routing scheme allows routers to choose from several possible paths based on channel loading, network faults and other dynamic information. A fully adaptive minimal routing algorithm permits any minimal path between a source and a destination to be used when messages are routed through the network.

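As a rough illustration of the addressing from Section 2 and of the set of minimal paths a fully adaptive minimal router may choose from, the following Python sketch computes upper neighbors, per-dimension shortest offsets, and the number of hop orderings. The function names are hypothetical and are not taken from any cited work. When an offset is exactly k/2, both ring directions are minimal and the sketch arbitrarily picks the positive one, so the count is then a lower bound.

```python
from math import factorial

def upper_neighbor(addr, i, k):
    """Return the upper neighbor of `addr` in dimension i of a k-ary n-cube."""
    nxt = list(addr)
    nxt[i] = (addr[i] + 1) % k  # the wraparound channel closes the ring
    return tuple(nxt)

def minimal_offsets(src, dst, k):
    """Signed per-dimension hop counts along a shortest path (ties go positive)."""
    offsets = []
    for s, d in zip(src, dst):
        up = (d - s) % k      # hops needed going "up" the ring
        down = (s - d) % k    # hops needed going "down" the ring
        offsets.append(up if up <= down else -down)
    return offsets

def count_minimal_paths(offsets):
    """Number of hop orderings for fixed offsets (a multinomial coefficient)."""
    total = factorial(sum(abs(o) for o in offsets))
    for o in offsets:
        total //= factorial(abs(o))
    return total

k, n = 4, 3
print(k ** n)                                  # total node count N = k^n
print(upper_neighbor((3, 0, 2), 0, k))         # wraps around to (0, 0, 2)
offs = minimal_offsets((0, 0, 0), (1, 2, 3), k)
print(offs, count_minimal_paths(offs))         # [1, 2, -1] and 12 orderings
```

A deterministic router would use exactly one of these orderings, while a fully adaptive minimal router may pick any of them hop by hop.
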
Deterministic routing algorithms avoid deadlocks by defining a single possible path between source and destination. However, this means that the interconnection network cannot make effective use of its routing resources [15]. As a result, deterministic routing often delivers less than maximal performance under some specific traffic patterns. Adaptive routing, on the other hand, allows the effective use of the interconnection network’s resources, but presents a new set of challenges in terms of deadlock and livelock prevention. There is another advantage to adaptively routed systems: since path choices can be made on the basis of any local information, adaptive routers can easily be made fault tolerant.

Full adaptivity is expensive, as it requires a large number of additional hardware resources. The Linder-Harden algorithm, a fully adaptive routing algorithm for k-ary n-cubes, requires a number of virtual channels that is exponential in n ($(n + 1)2^{n-1}$ virtual channels per physical channel [12]). The algorithm of Berman et al., another fully adaptive algorithm for k-ary n-cubes, uses as many as $10(n - 1) + 6$ virtual channels per physical channel. Because some of the most expensive parts of an interconnection network are the buffers and switching hardware, limiting the required number of virtual channels is an important factor in the choice of the routing algorithm.

5 Livelock Prevention

Livelock represents a state in which one or more messages could forever be denied the resources they require to progress towards their destinations. Unlike deadlock or indefinite postponement (a packet waiting forever to acquire a network resource for which other packets are always competing successfully), livelock does not stop a packet’s movement, but rather its progress towards its final destination.

Livelock-freedom can, in general, be ensured by assigning resources (channels or buffers) to waiting messages in a FIFO manner [12]. Both Chien and Kim’s algorithm and the turn model are livelock-free. The proofs are given in [15] and in [8] respectively.

Figure 2: Adaptive vs Deterministic Routing Algorithms

6 Deadlock Prevention

When a message’s output link is in use by another message, a collision takes place and the incoming message is blocked. Deadlock occurs when messages are blocked by messages that are themselves blocked, and the sequence of blocking forms a cycle [15]. Figure 3 illustrates such a cycle: four color-labelled messages have entered the network through different switches. Each message has acquired its first-hop link, but is blocked on its second-hop link, which creates a deadlock.

Ensuring deadlock-freedom is more difficult than ensuring livelock-freedom, and depends heavily on the design of the routing algorithm [12]. Wormhole routing makes deadlock avoidance a more difficult problem to solve because:
  1. In wormhole routing, once a communication channel has started to send the first flits of a packet, it must transmit all the flits of that packet before it can be used for any other message [11]. This means that a packet could be blocking several network links at the same time.
  2. When deadlock situations occur, it is generally hard to assemble the incoming message in a buffer for later transmission. This is because in wormhole routing the message lengths are not limited, and the buffers are small [17].

The deterministic version of the wormhole algorithm has only two ways of handling deadlock situations:

  1. By using a technique called backpressure flow [17], where a control signal is sent upstream on the reverse link to stop or resume transmission. This, however, does not prevent message blocking, which increases the message transfer time. It also requires efficient deadlock detection and resolution mechanisms, which might be expensive in terms of added hardware and complexity. For example, deadlocks can be detected using timers: a timer is started when the message arrives at the switch, and its expiry identifies the messages that are not progressing.
  2. By using an acyclic routing scheme in which there are no cycles in the sequence of links that is used [3]. This, however, leads to non-minimal paths and to the concentration of traffic in some nodes of the interconnection network. It might also be difficult to design.

Planar-adaptive wormhole routing approaches deadlock prevention by restricting the paths available to a message in a way that ensures that no cycle of blocked packets can form. We will see how this is done when we look at planar-adaptive schemes.

6.1 The Cost of Deadlock Prevention

In general, supporting both adaptivity and deadlock prevention is expensive because it requires additional virtual channels and larger crossbar switches [13]. Increasing routing
flexibility multiplies the possibilities for deadlock situations, which in turn increases the cost of deadlock prevention [6]. Constraining the routing freedom to a few dimensions at a time greatly reduces the hardware requirements for deadlock avoidance. While partially adaptive planar approaches sacrifice some routing freedom, they also drastically reduce the possibilities of deadlock [6], at a much lower hardware cost than the deadlock-freedom mechanisms used with fully adaptive schemes.

Figure 3: A deadlock situation: four messages have entered the network through different switches, and are blocked by each other in a cycle after having each acquired the first-hop link

7 Minimal Routing

We distinguish between adaptive routers that route messages using only minimal paths (wasting no work) and those that will consider nonminimal paths, potentially wasting routing work in exchange for increased routing freedom.

In minimal routing algorithms, messages get closer to their destination with each hop taken. Since message latencies increase with the number of hops [6], minimal routing makes it possible to utilize the full wire capacity of the network productively.

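The cyclic blocking described in Section 6 and pictured in Figure 3 can be checked mechanically. The sketch below is only an illustration: it assumes a hypothetical `waits_for` mapping in which each blocked message waits for exactly one other message (the holder of the link it needs), and it follows that mapping until it either leaves the set of blocked messages or revisits a message.

```python
def find_blocking_cycle(waits_for):
    """Return a list of messages forming a blocking cycle, or None.

    `waits_for` maps each blocked message to the message holding the
    link it needs (a representation assumed for this sketch).
    """
    for start in waits_for:
        seen = []
        msg = start
        # Follow the chain of "is blocked by" until it ends or repeats.
        while msg in waits_for and msg not in seen:
            seen.append(msg)
            msg = waits_for[msg]
        if msg in seen:
            return seen[seen.index(msg):]
    return None

# Four messages, each holding its first-hop link and waiting on the next one.
blocked = {"red": "green", "green": "blue", "blue": "yellow", "yellow": "red"}
print(find_blocking_cycle(blocked))                               # a 4-message cycle
print(find_blocking_cycle({"red": "green", "green": "blue"}))     # None: blue can progress
```

Deadlock prevention schemes aim to make the first situation impossible by construction, so that no such check is needed at run time.
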
The Chien and Kim planar-adaptive scheme is a minimal, adaptive routing algorithm [15]. It will, however, allow misrouting in its fault-tolerant version. The turn model can be either minimal or nonminimal [8].

8 Virtual Channels

Virtual channels are used in wormhole-routed interconnection networks to avoid deadlocks and to improve link utilization and network throughput ([7], [3], [16]). Deadlock-free planar-adaptive routing relies on the use of virtual channels.

Definition 8.1 (Virtual Channels [7]) A virtual channel is a pair of flit buffers (one in each of two connected nodes) connected by a shared physical channel. The physical channel is time-shared by the virtual channels.

Figure 4: Virtual Channel Router Architecture Diagram

Figure 4 depicts four virtual channels sharing a single physical channel. Each virtual channel has its own flit queue, but shares the bandwidth of the associated physical channel with the other virtual channels in a time-multiplexed fashion.

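The time-multiplexing in Definition 8.1 can be sketched as a round-robin arbiter that sends at most one flit per cycle over the shared physical channel. This is a simplified model with assumed names (`multiplex`, `lanes`); real routers also track credits and downstream buffer space, which the sketch reduces to a single `blocked` flag per virtual channel.

```python
from collections import deque

def multiplex(lanes, cycles):
    """Send at most one flit per cycle, visiting virtual channels round-robin.

    `lanes` maps a virtual channel name to (flit queue, blocked flag).
    Blocked or empty lanes are skipped, so they do not idle the link.
    """
    names = list(lanes)
    sent = []
    nxt = 0  # index of the lane that gets first chance this cycle
    for _ in range(cycles):
        for step in range(len(names)):
            name = names[(nxt + step) % len(names)]
            queue, blocked = lanes[name]
            if queue and not blocked:
                sent.append(queue.popleft())
                nxt = (nxt + step + 1) % len(names)
                break
        else:
            sent.append(None)  # every lane is idle or blocked: the link idles
    return sent

lanes = {
    "vc0": (deque(["A0", "A1", "A2"]), True),   # packet A is blocked downstream
    "vc1": (deque(["B0", "B1"]), False),
    "vc2": (deque(["C0"]), False),
}
print(multiplex(lanes, 4))   # ['B0', 'C0', 'B1', None]
```

Packets B and C overtake the blocked packet A, and the physical channel idles only once every unblocked lane has drained.
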
8.1 Virtual Channel Utilization

A network that uses virtual channels for flow control purposes organizes the flit buffers associated with each channel into lanes. The buffers in each lane can then be allocated independently of the buffers in the other lanes. This increases channel utilization and, by extension, throughput [7]. As virtual channels are an integral part of deadlock-free partially adaptive routing, it is worthwhile to consider the advantages and inconveniences of virtual channels when looking at planar-adaptive routing.

8.1.1 Advantages of Using Virtual Channels

Below, I outline some of the major advantages of using virtual channels.

  1. Adding lanes to the network allows blocked packets to be passed. This in turn increases network throughput [7] and facilitates deadlock prevention.
  2. Virtual channels provide an additional degree of freedom in the allocation of network resources for the routing of packets in the network. This facilitates the use of scheduling strategies, reducing the variance of the network latency [7].
  3. Because buffer memory tends to be cheaper than physical channel bandwidth, adding virtual channels to a network provides a cost-effective way to increase effective bandwidth, as it permits the decoupling of wire resources.
  4. Physical channel idle time is reduced, because with virtual channels a physical channel is idle only when all of its virtual channels are idle or blocked. [7] shows that the probability of this happening is small.
  5. Virtual channels make it easier to build virtual topologies. This results in easier network separation.

8.1.2 Disadvantages of Virtual Channel Usage

Below, I outline some of the major inconveniences of using virtual channels.

  1. Adding virtual channels to a physical channel is less expensive than adding new physical channels, but it is not free. Adding buffer space and control logic will contribute
     to increasing the cost of the underlying hardware of the interconnection network, as well as its complexity.
  2. Each extra virtual channel reduces the bandwidth available to the other virtual channels that already share the physical channel.
  3. Virtual channels increase the signaling overhead of the interconnection network.
  4. Virtual channels increase the cycle time and scheduling overhead of the interconnection network.
  5. Virtual channels increase the scheduling complexity of the system (packet stretching problem).
  6. Preserving packet transmission order is difficult in any interconnection network that uses multipath routing [15]. Using virtual channels makes it harder, as some packets might be buffered while others are progressing. It is difficult and expensive to use packet sequence numbering schemes in large networks, and reassembly schemes are expensive.

It is, however, possible to modify the planar-adaptive routing presented in [15] to make it order-preserving. The modified version works by restricting the routing paths even further, essentially reducing planar-adaptive routing to dimension-order routing.

8.2 Using Virtual Channels for Deadlock Avoidance

Planar-adaptive routing uses virtual channels primarily for deadlock avoidance. Any cyclic network can be made deadlock-free by restricting routing in such a way that there are no cycles in the channel dependency graph. Virtual channels are then added to reconnect the network [7].

Figure 5(a) shows how a packet A blocked between routers 3 and 4 also blocks packet B when the network is not equipped with virtual channels. In Figure 5(b), the network is equipped with virtual channels, allowing dual utilization of the physical channel between
node 3 and node 4. Packet B can now pass A.

Figure 5: Virtual Channel Deadlock Control, panels (a) and (b)

While virtual channels are expensive, the good news is that planar-adaptive routing requires only a constant number of virtual channels to be provably deadlock-free, independently of network size and dimension [15]. In contrast, the virtual channel requirement of deadlock-free fully adaptive routing schemes is much higher.

9 Planar Adaptive Wormhole Routing

A primary concern in all adaptive networks is the cost of deadlock prevention. Because planar-adaptive routing limits routing freedom, it makes it possible to prevent deadlock with only a fixed number of virtual channels, independent of the number of network dimensions. Planar-adaptive routing supports full adaptivity, but only at the level of a 2-dimensional plane. The routing dimensions change as the packet progresses towards its destination [15]. Though there is less routing freedom than with fully adaptive routing, planar-adaptive routing still allows a choice among a large number of paths from source to destination [6].

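To make the gap between a fixed and a growing virtual channel budget concrete, the following sketch tabulates the per-physical-channel counts quoted in this essay: $(n + 1)2^{n-1}$ for the Linder-Harden algorithm, up to $10(n - 1) + 6$ for the algorithm of Berman et al., and three for Chien and Kim’s planar-adaptive scheme (Section 9.2). The function names are illustrative only.

```python
def linder_harden_vcs(n):
    """(n + 1) * 2^(n - 1) virtual channels per physical channel."""
    return (n + 1) * 2 ** (n - 1)

def berman_vcs(n):
    """Upper figure quoted for Berman et al.: 10(n - 1) + 6."""
    return 10 * (n - 1) + 6

PLANAR_ADAPTIVE_VCS = 3  # Chien and Kim's scheme, independent of n

print("n  Linder-Harden  Berman et al.  planar-adaptive")
for n in (2, 3, 4, 6, 8):
    print(n, linder_harden_vcs(n), berman_vcs(n), PLANAR_ADAPTIVE_VCS)
```

For n = 8 the three figures are 1152, 76 and 3, which illustrates why the constant requirement matters as the number of dimensions grows.
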
9.1 The Turn Model for Adaptive Routing

Proposed in [8], the turn model is deadlock-free and livelock-free. The algorithm can be applied to networks with extra channels but, unlike Chien and Kim’s model (presented in Section 9.2 of this essay), it is not based on the addition of virtual channels. It relies instead on an analysis of the directions in which packets can turn in the network and of the cycles that these turns can form. The algorithm works by prohibiting only those turns (changes of dimension) that could cause deadlock.

9.1.1 The Algorithm

The term channel is used to designate both physical channels and virtual channels. The steps of the algorithm for a 2-dimensional mesh are as follows:

  1. The first step partitions the channels in the network into sets according to the directions in which they route packets. Where there are v channels in a physical direction, they are treated as being in v distinct virtual directions and are divided into v distinct sets. Wraparound channels are placed in a separate set and are used in step 5 of the algorithm.
  2. The possible turns from one virtual direction to another are identified. 180-degree and 0-degree turns are ignored. A 0-degree turn represents a transition from one set of channels to another, and is only possible when there are multiple channels in one direction.
  3. The cycles that these abstract turns can form are then identified. [8] indicates that, in general, identifying the simplest cycles in each plane of the topology is enough.
  4. One turn in each abstract cycle is prohibited so as to prevent deadlock. The prohibited turns are chosen carefully so as to break every possible cycle, including complex cycles not identified in step 3.
  5. Turns originating from the wraparound channels are then added back, but only after checking that they do not reintroduce cycles.
[8] indicates that at least one turn for each wraparound channel can always be incorporated.
  6. All the 180-degree and 0-degree turns that do not reintroduce cycles are added back.

9.1.2 The West-First Algorithm

One way to rule out cycles is to prohibit all turns to the west (Figure 6). This ensures that a packet that needs to go west does so at the beginning of its path.

Figure 6: Turns prohibited in the west-first algorithm

9.1.3 The North-Last Algorithm

Another possibility is to prohibit all turns away from the north direction (Figure 7). Once a packet is travelling north it can no longer turn, which forces a packet that needs to go north to do so at the end of its path.

Figure 7: Turns prohibited in the north-last algorithm

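The west-first rule can be sketched as a routing function for a 2D mesh. This is a minimal illustration, not the formulation of [8]: it covers only the minimal variant, uses assumed names, and assumes that west means decreasing x and north means increasing y.

```python
def west_first_outputs(cur, dst):
    """Productive output directions allowed by minimal west-first routing.

    Coordinates are (x, y) in a 2D mesh; west is decreasing x and north is
    increasing y (an orientation assumed for this sketch).
    """
    dx, dy = dst[0] - cur[0], dst[1] - cur[1]
    if dx < 0:
        return ["W"]            # all westward hops come first, with no choice
    outputs = []
    if dx > 0:
        outputs.append("E")
    if dy > 0:
        outputs.append("N")
    if dy < 0:
        outputs.append("S")
    return outputs              # an empty list means the packet has arrived

def turn_allowed(incoming, outgoing):
    """West-first prohibits the two turns that end in a westward hop."""
    return not (outgoing == "W" and incoming in ("N", "S"))

print(west_first_outputs((3, 1), (0, 2)))   # ['W']: must finish going west first
print(west_first_outputs((0, 1), (2, 3)))   # ['E', 'N']: adaptive choice
print(turn_allowed("N", "W"), turn_allowed("W", "N"))   # False True
```

The routing function never produces a prohibited turn: once it has returned anything other than `"W"`, the remaining x offset is non-negative, so `"W"` is never offered again.
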
9.1.4 The Negative-First Algorithm

The last variant of the turn model is the negative-first algorithm. Here, the prohibited turns are the two turns from a positive to a negative direction, which forces packets to proceed west and south first, and then east and north.

Figure 8: Turns prohibited in the negative-first algorithm

9.1.5 Deadlock Freedom

All the turn-prohibition algorithms presented above are deadlock-free. This follows from the existence, for each algorithm, of a channel numbering system in which packets can be shown to always be routed along channels with strictly decreasing (or increasing) numbers.

9.1.6 Performance

In [8] it is proven that the turn model requires the prohibition of at least a quarter of all possible turns in order to prevent deadlock. Turn prohibition has an impact on performance, as it reduces adaptivity.

9.2 Chien and Kim’s Partially Adaptive Routing Algorithm

This scheme is presented in [15]. In this version of planar-adaptive wormhole routing, three bidirectional virtual channels must be provided for each physical channel. The scheme is fault tolerant and deadlock-free; however, two faults may prevent many packets from being routed with this method [1]. The algorithm works by dividing a k-ary n-cube topology into
$n - 1$ virtual planes and routing adaptively in each plane, and deterministically from one plane to the next. This is repeated until the header has reached its final destination. Figure 9 gives a high-level idea of the way the algorithm works for a 2-ary 3-cube. In Figure 9, a packet that needs to be routed through each of the network’s three dimensions is first routed in the X-Z plane, then in the Y-Z plane.

Figure 9: Chien and Kim’s planar-adaptive routing in a 2-ary 3-cube

9.2.1 Notation and Terminology

The virtual channels of each node are labelled from 0 to 2. The virtual channels of each dimension i can now be written as the union of three sets: $\{d_{i,0} + d_{i,1} + d_{i,2}\}$ (see Figure 10).

Definition 9.1 (Adaptive Plane) An adaptive plane $A_i$ is defined formally as a set of virtual channels over two dimensions i and i + 1: $A_i = d_{i,2} + d_{i+1,0} + d_{i+1,1}$. Within the plane, messages are routed adaptively with respect to these two dimensions.

Given a k-ary n-cube with n dimensions, the algorithm starts by creating $n - 1$ such adaptive virtual planes.

Given a bidirectional virtual channel $d_{i,j}$ in dimension i of the network, we differentiate between the two directions of the data flow passing through the virtual channel by writing
$d_{i,j}+$ (for the increasing traffic), and $d_{i,j}-$ (for the decreasing traffic).

Figure 10: Numbering of virtual channels with respect to node A in dimension i, i + 1 (channels $d_{i,2}$, $d_{i+1,0}$ and $d_{i+1,1}$)

We can now separate each adaptive plane’s virtual channels into two separate virtual networks:

  1. The increasing network, which routes increasing traffic and is defined as the union of the two sets $d_{i,2}+$ and $d_{i+1,0}$.
  2. The decreasing network, which routes decreasing traffic and is defined as the union of the two sets $d_{i,2}-$ and $d_{i+1,1}$.

Figure 11 shows a plane logically decoupled into disjoint sets of virtual channels (the increasing and the decreasing network).

9.2.2 The Algorithm

  1. At the adaptive plane level, messages are routed adaptively by looping through the adaptive planes, starting with $A_0$, all the way to $A_{n-2}$. Within each adaptive plane
     $A_i$, packets may use any channels leading toward their destination until the $d_i$ address is correct. Within each adaptive plane $A_i$, if the $d_i$ address of the destination is lower than the current $d_i$ address of the packet, the packet is routed through the decreasing network only. Conversely, if the $d_i$ address of the destination is higher than the current $d_i$ address of the packet, the packet is routed through the increasing network only.
  2. At the loop’s exit, only the $d_{n-1}$ address of the packet might be different from its final address. If that is the case, the packet is routed to its final destination using the $d_{n-1,2}$ channel.

Figure 11: Increasing and Decreasing Virtual Networks of an Adaptive Plane

9.2.3 Deadlock Freedom

It is easy to see how the adaptive routing that is done at the plane level is deadlock-free. As each adaptive plane is divided into two completely separate networks, input routing in $d_i+$ can only depend on output routing from $d_i+$ and $d_{i+1,0}$ channels. Similarly, input routing in $d_i-$ can only depend on output routing from $d_i-$ and $d_{i+1,1}$ channels. This prevents any cycle from forming, as the routing flow is always unidirectional in the i dimension.

Across the planes, deadlock freedom is also straightforward, as the loop always routes from the lower to the higher dimension, again making cycles impossible.

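The two steps of Section 9.2.2 can be sketched as a function that lists the virtual channels a header may request at its current node. This is an illustrative reading of the description above rather than the implementation of [15]: the function name and the `(dimension, direction, vc)` tuple format are assumptions, and wraparound channels are ignored, so the sketch treats the network as a mesh.

```python
def planar_adaptive_candidates(cur, dst):
    """Candidate output virtual channels as (dimension, direction, vc) tuples."""
    n = len(cur)
    for i in range(n - 1):                  # adaptive planes A_0 .. A_{n-2}
        if cur[i] == dst[i]:
            continue                        # dimension i is done: next plane
        increasing = dst[i] > cur[i]
        step_i = +1 if increasing else -1
        candidates = [(i, step_i, 2)]       # d_{i,2}+ or d_{i,2}-
        if cur[i + 1] != dst[i + 1]:
            step_next = +1 if dst[i + 1] > cur[i + 1] else -1
            vc = 0 if increasing else 1     # d_{i+1,0} or d_{i+1,1}
            candidates.append((i + 1, step_next, vc))
        return candidates
    if cur[n - 1] != dst[n - 1]:            # only the last dimension is left
        step = +1 if dst[n - 1] > cur[n - 1] else -1
        return [(n - 1, step, 2)]           # d_{n-1,2}
    return []                               # the packet has arrived

print(planar_adaptive_candidates((0, 3, 1), (2, 1, 1)))  # plane A_0, increasing network
print(planar_adaptive_candidates((2, 3, 1), (2, 1, 4)))  # plane A_1, decreasing network
print(planar_adaptive_candidates((2, 1, 1), (2, 1, 4)))  # last dimension only
```

In the first call the header may move up in dimension 0 on $d_{0,2}+$ or down in dimension 1 on $d_{1,0}$; whichever dimension-0 direction it needs fixes the virtual network for the whole plane, which is the property the deadlock-freedom argument relies on.
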
9.2.4   Performance Comparisons

[6] provides an evaluation of the performance of planar-adaptive routing in comparison to deterministic and fully adaptive routing with similar resources. The simulation studies presented show that planar-adaptive routers can increase the robustness of network throughput for nonuniform communication patterns.

Chien and Kim's Planar-Adaptive Routing vs Deterministic Routing

The performance of planar-adaptive routing is first evaluated against the performance of deterministic routing under three different traffic patterns:

  1. Random (uniform): Each node sends with equal probability to all other nodes in the system.

  2. Dimension-reversal: Nodes send messages to nodes with address of reversed dimension index. (x, y) sends to (y, x), (x, y, w, z) sends to (y, x, z, w) and so on.

  3. Bit-reversal: A node with binary address abcd sends messages to the node with binary address dcba.

The comparisons show that the latency of planar-adaptive routing is similar to, or slightly worse than, that of deterministic routing when traffic loads are uniform. Non-uniform traffic conditions, however, give a clear advantage to planar-adaptive routing, as planar-adaptive routed networks saturate much later than deterministically routed networks. In general, planar-adaptive routed networks tend to perform more consistently when confronted with different traffic patterns.

Chien and Kim's Planar-Adaptive Routing vs Fully Adaptive Routing

Comparisons with fully adaptive routers show that planar-adaptive routers can give superior performance [15] with fewer resources.

9.2.5   Fault Tolerance

The scheme can be augmented with misrouting to support fault tolerance. The resulting networks tolerate faults by routing around them. The approach taken in [6] is a complement of the one presented in [5], where all faulty regions are augmented until they are convex (deactivation algorithm given in [6]). Requiring the faulty regions to be convex allows a larger fraction of the nodes to remain in service for a given pattern of faults [6]. Augmentation ensures that if the faulty regions are not naturally convex, good nodes and channels are marked as faulty until the regions become convex. Planar-adaptive routing will then route packets to the parts of the machine which remain connected. The flexibility of the adaptive routing algorithm is used to circumvent faulty channels.

The algorithm is modified to support fault tolerance as follows:

  1. Where the algorithm previously did not make use of the d_{n−1,2}, d_{0,0} and d_{0,1} channels, we now define a new adaptive plane, A_{n−1}, using these channels.

  2. Where we routed using n − 1 adaptive planes, we now route using n planes (A_0 . . . A_{n−1}), using the same high-level algorithm as before between the planes, but using the following algorithm within each adaptive plane:

      (a) If there are no faults, route as in the first algorithm.

      (b) If the packet is blocked by a fault in d_i, route in d_{i+1}. If blocked by a fault in d_{i+1}, route in d_i.

      (c) If the packet is blocked by a fault in d_i and the d_{i+1} offset has already been reduced to 0, then misrouting occurs. If we were routing in the d_{i+1} direction, we continue in the same direction. If we were routing in the d_i direction, we pick any d_{i+1} channel and start in that direction. As soon as possible, we go back to the d_i direction, returning to the first step of the algorithm.

Circumventing faulty regions with only local information requires packet misrouting. While allowing misrouting also introduces the possibility of livelock, it is shown in [6] that planar-adaptive routing forces all packets to always make progress toward their destinations, ensuring that the resulting networks are livelock-free.
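To make the idea of augmenting faulty regions concrete, the sketch below marks healthy nodes as faulty until the faulty regions of a 2D mesh become rectangular, and therefore convex. The rule used here (deactivate a healthy node that has a faulty neighbor in both dimensions, repeated until nothing changes) is an assumption chosen for illustration. It is not claimed to be the deactivation algorithm of [6], and the function name augment_faulty_region is hypothetical.

```python
from typing import Set, Tuple

Node = Tuple[int, int]


def augment_faulty_region(faulty: Set[Node], width: int, height: int) -> Set[Node]:
    """Return the faulty set plus every healthy node deactivated by the rule.

    Assumed rule (illustrative only): a healthy node is marked faulty when it
    has at least one faulty neighbor in the x dimension and at least one in
    the y dimension. The scan repeats until a full pass changes nothing.
    """
    marked = set(faulty)
    changed = True
    while changed:
        changed = False
        for x in range(width):
            for y in range(height):
                if (x, y) in marked:
                    continue
                # Out-of-range neighbors are simply never in `marked`.
                blocked_x = (x - 1, y) in marked or (x + 1, y) in marked
                blocked_y = (x, y - 1) in marked or (x, y + 1) in marked
                if blocked_x and blocked_y:
                    marked.add((x, y))
                    changed = True
    return marked
```

For example, an L-shaped fault pattern {(1, 1), (1, 2), (2, 1)} on a 4 x 4 mesh is completed to the 2 x 2 block that also contains (2, 2), while an isolated faulty node is left unchanged. The trade-off is the one noted above: some good nodes are taken out of service so that the remaining region boundaries are simple enough to route around with local information.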
Figure 12: Deactivation Algorithm

9.3   The Modified Chien and Kim's Partially Adaptive Routing Algorithm

[2] proposes a modified version of Chien and Kim's algorithm, intended to correct the low channel utilization of the original. The modification determines whether or not the packet will need to go through the wrap-around links. When a packet will not need to use the wrap-around links, it is marked as a free packet. A free packet that cannot find a virtual channel by the previous assignment is allowed to use the higher level of virtual channels. The effect is to improve the channel utilization, while still increasing the virtual channel assignment.

[2] compares the two schemes in terms of performance, and shows that while the modified scheme improves the performance of the original one, its buffer requirements are larger. The throughput vs buffer utilization of the two algorithms is given in Figure 13.

Figure 13: Comparison of Chien and Kim's planar adaptive scheme and of the modified algorithm in terms of buffer requirements and channel utilization (curves: Wormadp, the unmodified Chien and Kim algorithm, and Wormadpmod, the modified Chien and Kim algorithm; axes: throughput in packets/cycle and number of buffers occupied, x10^4)

10   Conclusion

All adaptive routing algorithms increase the hardware complexity of the network in order to support the additional routing flexibility ([12], [13]). Because the most expensive parts of routing networks (after the wires for the physical channels) are the buffers and the switching hardware [15], increasing the underlying hardware complexity of the network results in a substantial increase in the cost of the network. In addition, increased hardware complexity can significantly reduce router speed, decreasing total network performance [15].

[15] shows that planar-adaptive routers outperform deterministic routers with equal hardware resources. Further, adding virtual lanes to planar-adaptive routers increases this advantage.

In this essay, we have seen not only that the structure of planar-adaptive routers is easy to implement efficiently [6], but also that planar-adaptive routing provides a simple type of support for deadlock-free adaptive routing in k-ary n-cubes of more than two dimensions [6]. In addition, planar-adaptive routing has some useful fault-tolerant features [9], as it allows messages to be routed around failed channels and nodes.

In terms of load balancing, by allowing more freedom in the paths that are taken by messages, planar-adaptive routing spreads network load over physical channels more evenly, thus improving the performance of the interconnection network ([15], [9]). Simulations show a clear advantage for planar-adaptive routing under uneven network load conditions. This is true of all adaptive routing algorithms. However, by restricting adaptivity, planar-adaptive schemes also reduce the hardware complexity of the interconnection network [6], which has a positive impact on its performance. Planar-adaptive routing allows routing flexibility at a lower hardware cost than full adaptivity.

References

[1] Jau-Der Shih, "Adaptive Fault Tolerant Wormhole Routing Algorithms for Hypercube and Mesh."

[2] Yen-Wen Lu, Kallol Bagchi, James B. Burr, and Allen M. Peterson, "A Comparison of Different Wormhole Routing Schemes."

[3] W. J. Dally and C. L. Seitz, "Deadlock-free message routing in multiprocessor interconnection networks," IEEE Trans. Comput., vol. C-36, no. 5, pp. 547–553, May 1987.
[4] William J. Dally, "Performance Analysis of k-ary n-cube Interconnection Networks," IEEE Transactions on Computers, 1988.

[5] J. Y. Ngai and C. L. Seitz, "A Framework for Adaptive Routing in Multicomputer Networks," Proc. Symp. on Parallel Algorithms and Architectures, pp. 1–9, 1989.

[6] J. H. Kim and A. A. Chien, "An evaluation of planar-adaptive routing (PAR)," in Proc. Fourth IEEE Symp. on Par. and Distr. Processing, 1992.

[7] W. J. Dally, "Virtual channel Flow Control," IEEE Trans. Parallel and Distributed Systems, 1992.

[8] Christopher J. Glass and Lionel M. Ni, "The Turn Model for Adaptive Routing," 1992, 25 Years ISCA: Retrospectives and Reprints.

[9] W. J. Dally and H. Aoki, "Deadlock-free adaptive routing in multicomputer networks using virtual channels," IEEE Trans. Parallel and Distributed Systems, 4:466–475, 1993.

[10] R. V. Boppana and S. Chalasani, "New Wormhole Routing Algorithms for Multicomputers," in International Parallel Processing Symposium, pp. 419–423, 1993.

[11] L. M. Ni and P. K. McKinley, "A survey of wormhole routing techniques in direct networks," IEEE Computer Magazine, vol. 26, no. 2, pp. 62–76, 1993.

[12] Rajendra V. Boppana and Suresh Chalasani, "A Comparison of Adaptive Wormhole Routing Algorithms," ISCA, pp. 351–360, 1993.

[13] Kazuhiro Aoyama and Andrew A. Chien, "The Cost of Adaptivity and Virtual Lanes in a Wormhole Router," Journal of VLSI Design, 1993.

[14] W. Dally, L. Dennison, D. Harris, K. Kan, and T. Xanthopoulos, "Architecture and Implementation of the Reliable Router," in Hot Interconnects II, 1994.

[15] A. A. Chien and J. H. Kim, "Planar-adaptive routing: Low-cost adaptive networks for multiprocessors," Journal of the ACM, vol. 42, pp. 91–123, January 1995.

[16] Akhilesh Kumar and Laxmi N. Bhuyan, "Effect of Virtual Channels and Memory Organization on Cache-Coherent Shared-Memory Multiprocessor," 1996.

[17] Arne Folkestad and Christian Roche, "Deadlock Probability in Unrestricted Wormhole Routing Networks," ICC(3), pp. 1401–1405, 1997.
