03 Router (curtin - computer systems and networking)

    03 Router (curtin - computer systems and networking) - Presentation Transcript

    1. Routers and Switches: Design and Operation High Speed Router Design
    2. High-Speed Router Design
      • Outline:
      • Introduction
      • Router Generations
      • Table Lookup
      • Switch Fabric Design
      • Buffer Placement
    3. Where IP routers sit in the network (figure labels: the Internet core, core router, edge router)
    4. Basic Architectural Components
      • Two key router functions:
      • run routing algorithms/protocol (RIP, OSPF, BGP)
      • switching datagrams from incoming to outgoing link
      (Figure: the control plane holds the routing protocols and the routing table; the datapath, which does the per-packet processing, holds the forwarding table and the switching.)
    5. Per-packet processing in an IP Router
      • 1. Accept packet arriving on an incoming link.
      • 2. Look up the packet's destination address in the forwarding table to identify the outgoing port(s).
      • 3. Manipulate packet header: e.g., decrement TTL, update header checksum.
      • 4. Send packet to the outgoing port(s).
      • 5. Buffer packet in the queue.
      • 6. Transmit packet onto outgoing link.
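Step 3 above (decrement TTL, update header checksum) can be sketched as follows. This is a minimal illustration, assuming a 20-byte IPv4 header without options, with the TTL at byte offset 8 and the checksum at offsets 10-11; the function names and the sample header bytes are illustrative and not part of the slides.

```python
import struct

def ipv4_checksum(header: bytes) -> int:
    """Ones'-complement sum of the 16-bit header words, then complemented."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def forward_header(header: bytes) -> bytes:
    """Decrement TTL (byte 8) and rewrite the checksum (bytes 10-11)."""
    ttl = header[8]
    if ttl <= 1:
        raise ValueError("TTL expired: drop the packet instead of forwarding")
    hdr = bytearray(header)
    hdr[8] = ttl - 1
    hdr[10:12] = b"\x00\x00"                 # checksum field is zero while summing
    hdr[10:12] = struct.pack("!H", ipv4_checksum(bytes(hdr)))
    return bytes(hdr)

# Illustrative header (assumed values): TTL 64, UDP, 192.168.0.1 -> 192.168.0.199.
SAMPLE = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
FORWARDED = forward_header(SAMPLE)
```

A header whose checksum field is valid sums to zero under `ipv4_checksum`, which is how a receiver verifies it. Real routers usually update the checksum incrementally rather than recomputing it; the full recomputation here is only for clarity.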
    6. Input Port Functions
      • Decentralized switching :
      • given a datagram's destination, look up the output port using the routing table held in input port memory
      • goal: complete input port processing at ‘line speed’
      • queueing: occurs if datagrams arrive faster than the forwarding rate into the switch fabric
      Physical layer: bit-level reception Data link layer: e.g., Ethernet
    7. Output Ports
      • Buffering required when datagrams arrive from fabric faster than the transmission rate
      • Scheduling discipline chooses among queued datagrams for transmission ( to be discussed later )
    8. High-Speed Router Design
      • Outline:
      • Introduction
      • Router Generations
      • Table Lookup
      • Switch Fabric Design
      • Buffer Placement
    9. First Generation Routers (figure: a CPU with route table and buffer memory, connected over a shared backplane to several line interfaces, each with a MAC). Typically <0.5Gb/s aggregate capacity.
    10. First Generation Routers Switching Via Memory
      • Follow conventional computer architecture
      • A shared central bus, with a central CPU, memory and peripheral Line Cards.
      • Each Line Card connects the system to one of the external links.
      • Packets arriving from a link are transferred across the shared bus to the CPU, where a forwarding decision is made.
      • The packet is then transferred across the bus again to its outgoing Line Card, and onto the external link.
      • speed limited by memory bandwidth (2 bus crossings per packet)
      Input Port Output Port Memory/CPU System Bus
    11. First Generation Routers Queueing Structure: Shared Memory
      • Large, single, dynamically allocated memory buffer shared by inputs 1..N and outputs 1..N: N writes per packet time and N reads per packet time.
      • Limited by memory bandwidth.
      • A large body of work has shown how such a buffer can provide:
        • Fairness
        • Delay Guarantees
        • Delay Variation Control
        • Loss Guarantees
        • Statistical Guarantees
    12. First Generation Routers: How fast can we make centralized shared memory? (Figure: ports 1..N sharing one memory built from 5ns SRAM behind a 200-byte-wide bus.)
      • 5ns per memory operation
      • Two memory operations per packet
      • Therefore, up to 200 bytes x 8 bits / (2 x 5ns) = 160Gb/s
      • In practice, closer to 80Gb/s
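The arithmetic behind the 160Gb/s bound can be checked with a few lines. This is only a sketch of the calculation on the slide; the function and parameter names are made up for illustration.

```python
def shared_memory_capacity_gbps(bus_bytes: int, access_ns: float,
                                ops_per_packet: int = 2) -> float:
    """Peak throughput of a shared memory: each packet needs one write and one read."""
    bits_per_access = bus_bytes * 8
    time_per_packet_ns = access_ns * ops_per_packet
    return bits_per_access / time_per_packet_ns   # bits per ns equals Gb/s

PEAK = shared_memory_capacity_gbps(bus_bytes=200, access_ns=5)
```

With the slide's figures (200-byte bus, 5ns SRAM, two operations per packet) this gives 1600 bits every 10ns.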
    13. Second Generation Routers (figure: a central CPU with route table and buffer memory on the slow path; line cards, each with MAC, buffer memory and forwarding cache; labels: drop policy or backpressure, output link scheduling). Typically <5Gb/s aggregate capacity.
    14. Second Generation Routers Operations
      • Placing a separate CPU at each interface.
      • A local forwarding decision is made in a dedicated CPU, based on its local forwarding table cache.
      • The packet is immediately forwarded to its outgoing interface. (i.e. port mapping intelligence in Line Cards)
      • This has the additional benefit that each packet need only traverse the bus once , thus increasing the system throughput.
      • The central CPU is needed to maintain the forwarding tables in each of the other CPUs, and for centralized system management functions.
    15. Second Generation Routers Queueing Structure: Combined Input and Output Queueing (over a bus). 1 write per “packet” time and 1 read per “packet” time; the rate of writes/reads is determined by the bus speed.
    16. E.g. Cisco 7507 router (photos: front view, rear view, backplane)
    17. Third Generation Routers (figure: line cards, each with MAC, local buffer memory and forwarding table, and a CPU card with CPU, memory and routing table, all connected by a switched backplane). Typically <50Gb/s aggregate capacity.
    18. Third Generation Routers
      • BUT in second-generation routers forwarding decisions are made in software, and so are limited by the speed of a general-purpose CPU.
      • Carefully designed, special-purpose ASICs can readily outperform a CPU when making forwarding decisions, managing queues, and arbitrating access to the bus.
      • A shared bus allows only one packet at a time to traverse between two Line Cards.
      • Replacing the shared bus with a crossbar switch
      • => multiple Line Cards can communicate with each other simultaneously, greatly increasing the system throughput.
    19. Third Generation Routers Queueing Structure (over a switch). 1 write per packet time and 1 read per packet time; the rate of writes/reads is determined by the switch fabric speedup.
    20. E.g. Cisco 12000 series routers
      • http://www.cisco.com/warp/public/cc/pd/rt/12000/12416/
      • “ The Cisco 12000 series offers industry leading scalability, high performance, and guaranteed priority packet delivery through an innovative distributed architecture design that enables service providers to accelerate the evolution of the Internet through delivery of profitable, next generation services.”
      • “ The Cisco 12416 Internet router is a 10 Gigabit, 16-slot chassis member of the Cisco 12000 series that provides a total switching capacity of 320 Gigabits per second (Gbps), with 20 Gbps (10 Gbps full duplex) capacity per slot.”
    21. Routers vs. Gateways
      • Researchers who invented TCP/IP defined the term “ IP Gateway ” to refer to the systems that interconnected networks and forwarded IP datagrams among them.
      • By the early 1990s, vendors had hired marketing people to help them sell products. One vendor thought “ IP router ” sounded better than “IP gateway”, and others quickly followed the lead.
      • When Microsoft incorporated TCP/IP software into their Windows system, they chose to make the configuration screen ask the user to enter a “Gateway” address.
      • So, the terms “IP Gateway” and “IP router” are synonymous.
    22. High-Speed Router Design
      • Outline:
      • Introduction
      • Router Generations
      • Table Lookup
      • Switch Fabric Design
      • Buffer Placement
      Switching Forwarding Table Routing Table Routing Protocols
    24. Basic Architectural Components Datapath: per-packet processing
    25. Input Queueing vs Output Queueing
      • Input queueing: usually a non-blocking switch fabric, e.g. crossbar (or a single-path switch)
      • Output queueing: usually a fast bus (or a multiple-path switch)
    26. Switch fabrics
      • Transfer data from input to output, ignoring scheduling and buffering
      • There are many types of fabric architectures
      • Choosing one usually depends on where the switch will exist in the network and the amount of traffic it will have to carry
      (Figure: taxonomy of switch fabrics. Time domain: shared media, shared memory. Space domain: single path (crossbar, broadcast, banyan, Batcher-banyan) and multiple path (replicated banyan, dilated banyan, tandem banyan).)
    27. Three types of switching fabrics Interconnection Networks, e.g. crossbar
    28. Switching Via Memory
      • packet copied by system’s (single) CPU
      • speed limited by memory bandwidth
    29. Switching Via Bus
      • datagram from input port memory
      • to output port memory via a shared bus
      • bus contention: switching speed limited by bus bandwidth
      • 1 Gbps bus, Cisco 1900: sufficient speed for access and enterprise routers (not regional or backbone)
    30. Switching Via Interconnection Networks
      • An interconnection network is usually constructed from 2 x 2 switching elements (e.g. 2 x 2 crossbar switches)
      • The way those switching elements are interconnected determines the resulting switch architecture.
      • Overcomes bus bandwidth limitations
      • Major types of Interconnection Networks:
        • Crossbar
        • Banyan based
      • Note: for an interconnection network, the number of 2x2 switching elements required is considered a good measure of complexity
      (Figure: a 2 x 2 switching element in the bar state and in the cross state)
    31. Fundamental Properties of Interconnection Networks
      • An interconnection network is usually constructed from 2 x 2 switching elements. The way those switching elements are interconnected determines the resulting switch architecture.
      • A nonblocking N x N packet switch must support $N^N$ input-output mappings, since each of the N inputs can address any of the N outputs and up to N packets can be destined for the same output port simultaneously
      • Output blocking/contention occurs at an output port if it cannot accept all the packets destined for it in one packet transmission time, or time slot.
      (Figure: an N x N switch with inputs 1..N and outputs 1..N, built from 2x2 elements)
    32. How to deal with output blocking?
      • Speed-up
        • mainly adopted by time-domain switch fabrics
        • If m packets can be forwarded to the same output port in one packet transmission time, the switch fabric is said to have a speed-up factor of m.
      • Multiple paths
        • mainly adopted by space-domain switch fabrics
        • Different packets can follow different paths to arrive at the same output port simultaneously.
        • => no speed-up is required, but the switch fabric is more complicated because it must provide multiple paths
      • Group size, the maximum number of packets that an output can receive per time slot, increases in both cases
    33. How to deal with excess packets?
      • The switch can either drop excess packets that cannot be switched ( loss systems ), or it can buffer them for output access in the next time slot ( waiting systems ).
      • In a loss system , one can increase the group size to reduce the packet loss probability .
        • But the switch complexity increases.
        • Besides, packets can be dropped as a result of internal blocking in the switch, as in banyan networks
      • In a waiting system , excess packets can be buffered at inputs, internally in the switch, or at outputs. A large group size => higher throughput .
      • Note: if group size > 1, output buffer is always required.
    34. Crossbar Switch: 2N buses in parallel (N input buses crossing N output buses; the figure shows a 4 x 4 example). Complexity = $N^2$. Each crosspoint is configured into either the bar state or the cross state.
    36. Banyan Network
      • An N x N banyan network has two properties (where $N = 2^n$):
        • There is a unique path from any input to any output.
        • There are $\log_2 N$ columns/stages, each with N/2 switching elements.
      • Banyan is self-routed
        • => distributed control
      (Figure: 4 x 4, 8 x 8 and 16 x 16 banyan networks)
    37. Routing in banyan network
      • Each switching element in the i -th stage examines the i -th bit of the destination address ( most-significant-bit first ) to make the decision
        • if the bit = 1, route to the lower output
        • if the bit = 0, route to the upper output
      (Figure: an 8 x 8 banyan network, outputs labeled 000 to 111, self-routing packets addressed to 011 and 101)
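The self-routing rule can be sketched in a few lines. The sketch assumes the shuffle-exchange (Omega) wiring, a banyan-type network: before each stage the links are perfect-shuffled, and the 2 x 2 element then sends the packet to its upper (0) or lower (1) output according to the stage's destination bit. The function name and link numbering are illustrative.

```python
def self_route(src: int, dst: int, n: int) -> list:
    """Return the link number the packet occupies after each of the n stages."""
    mask = (1 << n) - 1
    pos, path = src, []
    for stage in range(n):
        bit = (dst >> (n - 1 - stage)) & 1             # i-th destination bit, MSB first
        pos = ((pos << 1) | (pos >> (n - 1))) & mask   # perfect shuffle: rotate left
        pos = (pos & ~1) | bit                         # 0 -> upper output, 1 -> lower output
        path.append(pos)
    return path

# A packet entering input 000 and addressed to output 011 in an 8 x 8 network.
PATH = self_route(0b000, 0b011, 3)
```

Whatever the input port, the last entry of the path equals the destination address, because each stage shifts one more destination bit into the link number. No central controller is involved.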
    38. Blocking in banyan network
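      • Internal blocking: occurs if two packets with different destination outputs contend for the same outgoing link at an intermediate switching element
      • Output blocking: occurs if two packets are destined for the same output
      (Figure: an 8 x 8 banyan network showing one internal-blocking conflict and one output-blocking conflict among packets addressed to 010, 011 and 101)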
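The two kinds of blocking can be detected mechanically. The sketch below assumes a shuffle-exchange style labeling in which the link a packet occupies after stage k is identified by the n-k least significant bits of its input address followed by the k most significant bits of its output address. The function names and example packets are illustrative.

```python
from collections import defaultdict

def link_after_stage(src, dst, n, k):
    """Label of the link a packet occupies after stage k (1-based)."""
    low_src = src & ((1 << (n - k)) - 1)   # n-k least significant input bits
    top_dst = dst >> (n - k)               # k most significant output bits
    return (low_src << k) | top_dst

def first_conflict(packets, n):
    """packets: list of (src, dst). Return (stage, kind, packets) or None."""
    for k in range(1, n + 1):
        users = defaultdict(list)
        for src, dst in packets:
            users[link_after_stage(src, dst, n, k)].append((src, dst))
        for pkts in users.values():
            if len(pkts) > 1:
                same_output = len({dst for _, dst in pkts}) == 1
                return k, ("output" if same_output else "internal"), pkts
    return None

INTERNAL = first_conflict([(0, 0b010), (4, 0b011)], 3)   # different outputs, same link
OUTPUT = first_conflict([(0, 0b101), (1, 0b101)], 3)     # same output requested twice
CLEAN = first_conflict([(0, 0b011), (1, 0b101)], 3)
```

In the first example the packets from inputs 000 and 100 meet in the same first-stage element and both need its upper output, so they block internally although their destinations differ.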
    39. Other “banyan-type” networks: the shuffle-exchange (Omega) network, the reverse shuffle-exchange network and the banyan network are isomorphic.
    40. Simple switches based on banyan network
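      • Consider a banyan network operated as a loss system. Assume uniform traffic and let
        • $P_m$ = Pr[there is a packet at an input link of stage m+1].
        • An output link of a 2 x 2 element is idle only if neither input sends a packet to it, and each input does so with probability $P_m/2$, so we can express
        • $P_{m+1} = 1 - (1 - P_m/2)^2 = P_m - P_m^2/4$.
        • Therefore, $P_{loss} = (P_0 - P_n) / P_0$.
        • Using a Taylor series expansion, we obtain the approximation
        • $P_{loss} \approx n P_0 / (n P_0 + 4)$.
      • When $P_0 = 1$, $P_{loss}$ = 0.25, 0.39, 0.48, 0.5, 0.56, 0.6 for switches with size N = 2, 4, 8, 16, 32, 64 (the first three values match the recursion; the last three match the approximation)
        • => blocking increases rapidly with switch size $N = 2^n$.
      (Figure: a 2 x 2 element with load $P_m$ on each input link and $P_{m+1}$ on each output link)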
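The loss figures can be reproduced numerically. The sketch below assumes the usual per-stage model for uniform traffic, in which an output link of a 2 x 2 element is busy unless neither input sends to it, $P_{m+1} = 1 - (1 - P_m/2)^2$, and compares it with the closed-form approximation $n P_0/(n P_0 + 4)$. Function names are illustrative.

```python
def banyan_loss_exact(n: int, p0: float = 1.0) -> float:
    """Iterate the per-stage load recursion through n stages."""
    p = p0
    for _ in range(n):
        p = 1.0 - (1.0 - p / 2.0) ** 2
    return (p0 - p) / p0

def banyan_loss_approx(n: int, p0: float = 1.0) -> float:
    """Closed-form approximation from the slide."""
    return n * p0 / (n * p0 + 4.0)

for n in range(1, 7):
    print(f"N={2 ** n:2d}  recursion={banyan_loss_exact(n):.3f}  "
          f"approximation={banyan_loss_approx(n):.3f}")
```

Under these assumptions the recursion gives roughly 0.25, 0.39 and 0.48 for N = 2, 4, 8, while the approximation gives 0.5, 0.56 and 0.6 for N = 16, 32, 64; the two estimates differ by a few percentage points at every size.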
    41. A simple internally nonblocking switch
      • Performance of an internally nonblocking switch functioning as a loss system
        • Consider a particular output i. In any time slot, the probability that none of the N inputs has a packet destined for it is $(1 - P_0/N)^N \to e^{-P_0}$ as N increases
        • Thus $P_n$ = Pr[there is a packet at output i] = $1 - e^{-P_0}$ for large N.
        • Since $P_n$ increases with $P_0$, the maximum throughput is obtained when $P_0 = 1$, i.e. $\rho^* = 1 - e^{-1} = 0.632$; the corresponding packet loss probability is $P_{loss} = (P_0 - P_n)/P_0 = (1 - 0.632)/1 = 0.368$.
        • Similarly, for finite N with $P_0 = 1$, $P_{loss} = (1 - 1/N)^N$ = 0.25, 0.32, 0.34, 0.356, 0.362, 0.365 for switches with size 2, 4, 8, 16, 32, 64
      (Figure: an N x N loss system with input load $P_0$ and output load $P_n$. For comparison, the banyan values are $P_{loss}$ = 0.25, 0.39, 0.48, 0.5, 0.56, 0.6.)
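The finite-N loss values for the internally nonblocking switch follow directly from $(1 - P_0/N)^N$. A short sketch, with an illustrative function name:

```python
import math

def nonblocking_loss(N: int, p0: float = 1.0) -> float:
    """Loss of an internally nonblocking N x N loss system under uniform traffic."""
    carried = 1.0 - (1.0 - p0 / N) ** N    # Pr[a given output receives a packet]
    return (p0 - carried) / p0

for N in (2, 4, 8, 16, 32, 64):
    print(f"N={N:2d}  P_loss={nonblocking_loss(N):.3f}")
print(f"limit  P_loss={math.exp(-1):.3f}")
```

Unlike the banyan, whose loss keeps growing with switch size, the nonblocking switch levels off near $e^{-1} \approx 0.368$; the remaining loss is pure output contention.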
    42. Combinatoric properties of banyan networks
      • For a nonblocking (both internally and at output) N x N packet switch, there are $N^N$ possible input-output mappings.
      • The total number of states (input-output mappings) that can be realized by a banyan network is $N^{N/2}$ $(= 2^{(N/2)\log_2 N})$, since each of its $(N/2)\log_2 N$ switching elements has two states.
      • The fraction of realizable input-output mappings is $N^{N/2} / N^N = 1/N^{N/2}$.
        • => approaches 0 as N increases
      (Figure: example mapping 1-1, 2-2, 3-3, 4-4)
    43. Non-blocking Conditions for Banyan Networks
      • Theorem. The banyan network is nonblocking if the active inputs $x_1, \ldots, x_m$ ($x_j > x_i$ if $j > i$) and their corresponding output destinations $y_1, \ldots, y_m$ satisfy the following:
        • 1. (Distinct & monotonic outputs): $y_1 < y_2 < \cdots < y_m$ or $y_1 > y_2 > \cdots > y_m$.
        • 2. (Concentrated inputs): Any input between two active inputs is also active. That is, $x_i \le w \le x_j$ implies input w is active.
      • E.g. $x_1 = 0000$, $y_1 = 0010$; $x_2 = 0001$, $y_2 = 1011$; $x_3 = 0010$, $y_3 = 1100$
      • Labeling switching elements in a banyan
        • Each node in stage k can be uniquely represented by two binary numbers $(a_{n-k} \ldots a_1, b_1 \ldots b_{k-1})$.
      (Figure: node labels for each stage of a banyan network)
      • Proof:
      • Suppose two packets, one from input $x = a_n \ldots a_1$ to output $y = b_1 \ldots b_n$, the other from input $x' = a'_n \ldots a'_1$ to output $y' = b'_1 \ldots b'_n$, collide in stage k
      • That is, their two paths merge at the same node in stage k and share the same outgoing link
      • Thus, we have
      • (A) $(a_{n-k} \ldots a_1, b_1 \ldots b_{k-1}) = (a'_{n-k} \ldots a'_1, b'_1 \ldots b'_{k-1})$ and $b_k = b'_k$
      • Since input packets are concentrated, the total number of packets between x and x', inclusive, is $|x' - x| + 1$
      • Since all packets are destined for different outputs, there must be $|x' - x| + 1$ distinct output addresses
      • Since outputs are monotonic, the largest and the smallest of these output addresses must be y and y', or y' and y. Hence we must have
      • (B) $|y' - y| \ge |x' - x|$
      • From (A), x and x' agree in their n-k least significant bits, and y and y' agree in their k most significant bits, so we have
      • $|x' - x| \ge 2^{n-k}$ and $|y' - y| \le 2^{n-k} - 1$, i.e. $|y' - y| < |x' - x|$
      • This contradicts (B). Thus the theorem is proved.
      • Proof End
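The theorem can be checked exhaustively for a small network. The sketch below assumes the labeling used in the proof: after stage k a packet occupies the link identified by the n-k least significant bits of its input address and the k most significant bits of its output address, and two packets conflict when they share a link. It tries every concentrated input block with every strictly increasing or strictly decreasing output assignment. All names are illustrative.

```python
from itertools import combinations

def link_after_stage(x, y, n, k):
    """Link label after stage k: low n-k bits of input x, then top k bits of output y."""
    return ((x & ((1 << (n - k)) - 1)) << k) | (y >> (n - k))

def is_conflict_free(pairs, n):
    for k in range(1, n + 1):
        links = [link_after_stage(x, y, n, k) for x, y in pairs]
        if len(set(links)) != len(links):
            return False
    return True

def check_theorem(n):
    """True if every concentrated, monotonic request pattern is conflict free."""
    N = 1 << n
    for m in range(1, N + 1):
        for start in range(N - m + 1):
            xs = list(range(start, start + m))        # concentrated inputs
            for ys in combinations(range(N), m):      # strictly increasing outputs
                if not is_conflict_free(list(zip(xs, ys)), n):
                    return False
                if not is_conflict_free(list(zip(xs, reversed(ys))), n):
                    return False
    return True

HOLDS_FOR_8X8 = check_theorem(3)
```

The conditions are sufficient, not necessary: inputs 000 and 100 with outputs 010 and 011 are not concentrated and do conflict, but other non-concentrated patterns route without conflict.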
    44. E.g.
      • Violating the concentrated input condition but still non-blocking
      (Figure: an 8 x 8 banyan network routing packets to outputs 011 and 101 without conflict although the active inputs are not concentrated)
      • Theorem 2. Let the input-output pair of packet i be denoted by $(x_i, y_i)$. If the packets can be routed through the banyan network without conflicts, so can the set of packets $((x_i + z) \bmod N, y_i)$.
      • Proof: (try it yourself)
      • e.g. z = 5: the inputs $x_1 = 0000$, $x_2 = 0001$, $x_3 = 0010$ (with outputs $y_1 = 0010$, $y_2 = 1011$, $y_3 = 1100$) shift to $x'_1 = 0101$, $x'_2 = 0110$, $x'_3 = 0111$
    45. Exercise: Consider an 8 x 8 banyan network. Suppose with probability 0.75 a packet is destined for outputs 000, 001, 010, or 011, and with probability 0.25 it is destined for the other four outputs. Within each group of outputs, the packet is equally likely to be destined for any of the four outputs. Is the loss probability higher in this case than when a packet is equally likely to be destined for any of the eight outputs?
    46. Solution:
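      • Let $P_{iU}$ and $P_{iL}$ be the probabilities that a link leaving stage i carries a packet in the upper half (toward outputs 000 to 011) and in the lower half (toward outputs 100 to 111) of the network. $P_0$ is the input load and let $P_0 = 1$.
      • At stage 1, $P_{1U} = 1 - (1 - 0.75 P_0)^2 = 0.9375$ and $P_{1L} = 1 - (1 - 0.25 P_0)^2 = 0.4375$
      • At stage 2, $P_{2U} = 1 - (1 - P_{1U}/2)^2 \approx 0.718$ and $P_{2L} = 1 - (1 - P_{1L}/2)^2 \approx 0.390$
      • At stage 3, $P_{3U} = 1 - (1 - P_{2U}/2)^2 \approx 0.589$ and $P_{3L} = 1 - (1 - P_{2L}/2)^2 \approx 0.352$
      • The overall loss prob. is $P_{loss} = 1 - (P_{3U} + P_{3L}) / (2 P_0) \approx 0.53$
      • When a packet is equally likely to be destined for any of the 8 outputs, $P_{loss} \approx 0.48$
      • Therefore, the loss probability in this case is higher than when a packet is equally likely to be destined for any output.
      (Figure: the 8 x 8 banyan with load $P_0$ at the inputs and loads $P_{1U}, P_{2U}, P_{3U}$ and $P_{1L}, P_{2L}, P_{3L}$ after each stage in the upper and lower halves)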
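The exercise can be worked numerically. The sketch assumes the standard per-stage model: an output link fed by two inputs of load p, each choosing that link with probability q, is busy with probability $1 - (1 - pq)^2$. The first stage splits traffic 0.75/0.25 between the upper and lower halves, and the remaining two stages split evenly. The function names are illustrative and this is one way to set up the calculation, not necessarily the slide's.

```python
def stage(p: float, q: float) -> float:
    """Load on an output link fed by two inputs of load p, each choosing it w.p. q."""
    return 1.0 - (1.0 - p * q) ** 2

def loss_skewed(p0: float = 1.0, p_upper: float = 0.75) -> float:
    up = stage(p0, p_upper)           # links toward outputs 000..011
    low = stage(p0, 1.0 - p_upper)    # links toward outputs 100..111
    for _ in range(2):                # stages 2 and 3: uniform within each half
        up, low = stage(up, 0.5), stage(low, 0.5)
    return 1.0 - (up + low) / (2.0 * p0)

def loss_uniform(p0: float = 1.0, n: int = 3) -> float:
    p = p0
    for _ in range(n):
        p = stage(p, 0.5)
    return 1.0 - p / p0

print(f"skewed:  {loss_skewed():.3f}")
print(f"uniform: {loss_uniform():.3f}")
```

Under these assumptions the skewed traffic loses more: the heavily loaded upper half saturates, and the lightly loaded lower half cannot make up for it.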
    47. Sorting Networks
      • Comparator: it takes two input numbers and places the larger number on the output pointed to by the arrow and the smaller number on the other output
      (Figure: a 3-stage network of comparators sorting inputs a1..a4 into outputs b1..b4. The first stage produces max{a1,a2}, min{a1,a2}, max{a3,a4}, min{a3,a4}; the later stages produce min{a1,a2,a3,a4}, max{min{a1,a2},min{a3,a4}}, min{max{a1,a2},max{a3,a4}} and max{a1,a2,a3,a4}.)
      • Order-preserving Property: Suppose a sorting network sorts the input sequence $a = \langle a_1, a_2, \ldots, a_N \rangle$ into the output sequence $b = \langle b_1, b_2, \ldots, b_N \rangle$; then for any monotonically increasing function f, the network sorts the input sequence $f(a) = \langle f(a_1), f(a_2), \ldots, f(a_N) \rangle$ into the output sequence $f(b) = \langle f(b_1), f(b_2), \ldots, f(b_N) \rangle$.
      • E.g. a sorting network sorts $\langle 5, 4, 7, 1 \rangle$ into $\langle 1, 4, 5, 7 \rangle$. With f(x) = x + 2, it sorts $\langle f(5), f(4), f(7), f(1) \rangle = \langle 7, 6, 9, 3 \rangle$ into $\langle f(1), f(4), f(5), f(7) \rangle = \langle 3, 6, 7, 9 \rangle$.
      • What if f is not monotonic, e.g. f(x) = x + 2 if x < 2 and f(x) = x - 3 if x ≥ 2? Then the input is $\langle 2, 1, 4, 3 \rangle$, but $\langle f(1), f(4), f(5), f(7) \rangle = \langle 3, 1, 2, 4 \rangle$ is not sorted, so the property fails.
      • Theorem 3 (Zero-One Principle) If a sorting network with N inputs sorts all the $2^N$ possible sequences of 0’s and 1’s correctly, then it sorts all sequences of arbitrary input numbers correctly.
      • Proof:
      • Consider a sorting network that sorts all sequences of 0’s and 1’s correctly. By contradiction, suppose it does not sort input sequences of arbitrary numbers correctly. That is, there is an input sequence $\langle a_1, a_2, \ldots, a_N \rangle$ containing two elements $a_i$ and $a_j$ such that $a_i < a_j$, but the network places $a_j$ before/above $a_i$.
      • Define a monotonically increasing function f(x) such that f(x) = 0 if $x \le a_i$; f(x) = 1 if $x > a_i$.
      • According to the order-preserving property, since the network places $a_j$ before/above $a_i$ when the input sequence is $\langle a_1, a_2, \ldots, a_N \rangle$, it places $f(a_j) = 1$ before/above $f(a_i) = 0$ when the input sequence is $\langle f(a_1), f(a_2), \ldots, f(a_N) \rangle$. But this input sequence consists of only 0’s and 1’s, and yet the network does not sort it correctly, leading to a contradiction.
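The zero-one principle gives a practical test for a comparator network: try the $2^N$ binary inputs instead of all N! orderings. A minimal sketch, assuming a network is given as a list of (i, j) comparators that put the smaller value on line i and the larger on line j; this representation and the two example networks are illustrative.

```python
from itertools import product

def apply_network(network, values):
    """Run the values through the comparators in order."""
    v = list(values)
    for i, j in network:              # comparator: min to line i, max to line j
        if v[i] > v[j]:
            v[i], v[j] = v[j], v[i]
    return v

def sorts_all_zero_one(network, n):
    """Zero-one test: does the network sort every sequence of n bits?"""
    return all(apply_network(network, bits) == sorted(bits)
               for bits in product((0, 1), repeat=n))

GOOD = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]   # a 4-input sorting network
BROKEN = GOOD[:-1]                                 # same network, last comparator removed

print(sorts_all_zero_one(GOOD, 4), sorts_all_zero_one(BROKEN, 4))
print(apply_network(GOOD, [5, 4, 7, 1]))
```

The broken network fails on the binary input 0, 1, 0, 1, which is exactly the kind of witness the proof above constructs from an unsorted pair.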
      (Figure: the same sorting network applied to $\langle a_1, \ldots, a_N \rangle$ and to $\langle f(a_1), \ldots, f(a_N) \rangle$; where it outputs $a_j$ above $a_i$, it outputs $f(a_j) = 1$ above $f(a_i) = 0$)
      • Merging is a divide-and-conquer technique for sorting.
        • A k-merger takes two sorted input sequences and merges them into one sorted sequence of k elements.
        • Intuitively, merging is simpler than sorting in general.
        • Given mergers of different sizes, they can be interconnected to sort an arbitrary input sequence: a column of 2-mergers feeds a column of 4-mergers, and so on up to two N/2-mergers feeding one N-merger.
        • One way to construct the mergers is to use the bitonic sorting algorithm invented by Batcher.
      (Figure: sorting network based on bitonic sort, built from 2-mergers, 4-mergers, ..., N/2-mergers and an N-merger)
    48. Some properties of bitonic sequence
      • A bitonic sequence is a sequence that either increases monotonically and then decreases monotonically, or decreases monotonically and then increases monotonically.
        • E.g. $\langle 1,3,5,7,6,4,2,0 \rangle$, $\langle 7,5,3,1,0,2,4,6 \rangle$, $\langle 1,2,3,3,2,1 \rangle$
      • A bitonic sorter is a merger that takes a bitonic sequence and sorts it into a monotonic sequence.
      (Figure: a k-bitonic sorter takes an ascending sequence followed by a descending sequence, or a descending sequence followed by an ascending sequence, and outputs one ascending sequence)
    49. Some properties of bitonic sequence
      • We focus on bitonic sequences with only 0’s and 1’s (Why?)
      • Two general forms: $\langle 1^i 0^j 1^k \rangle$ or $\langle 0^i 1^j 0^k \rangle$
      • A bitonic sequence a is said to be no less than another bitonic sequence b if no element of a is less than any element of b.
        • e.g. $\langle 00000 \rangle \le \langle 01110 \rangle$, $\langle 11111 \rangle \ge \langle 01110 \rangle$
      • Two sequences do not necessarily have an ordering relationship.
        • e.g. $\langle 00010 \rangle$ and $\langle 01110 \rangle$
      • Using the following theorem, a bitonic sequence can be decomposed into two bitonic subsequences a’ and a” using only one stage of comparators.
      • Theorem 4. If a zero-one sequence of 2n elements $a = \langle a_1, a_2, \ldots, a_{2n} \rangle$ is bitonic, then the two n-element sequences
      • $a' = \langle \min(a_1, a_{n+1}), \min(a_2, a_{n+2}), \ldots, \min(a_n, a_{2n}) \rangle$ and
      • $a'' = \langle \max(a_1, a_{n+1}), \max(a_2, a_{n+2}), \ldots, \max(a_n, a_{2n}) \rangle$
      • have two properties:
      • 1. They are both bitonic.
      • 2. $a' \le a''$.
      • Proof:
      (Figure: one stage of comparators pairing $a_i$ with $a_{n+i}$ for i = 1..n; the min outputs form a’ and the max outputs form a”)
    50. Recursive construction of a k-bitonic sorter
      • A k-bitonic sorter consists of a k-half cleaner followed by two k/2-bitonic sorters, which are built the same way recursively, down to 2-half cleaners.
      • Given a bitonic sequence $a_1, \ldots, a_k$, the k-half cleaner outputs $\min(a_i, a_{k/2+i})$ for i = 1..k/2 on one half of its outputs and $\max(a_i, a_{k/2+i})$ on the other half; the final output is an ascending sequence.
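The recursive construction translates almost directly into code. A minimal sketch, assuming the input is bitonic and its length is a power of two; the function names are illustrative.

```python
def half_cleaner(seq):
    """One stage of comparators: element i is compared with element i + k/2."""
    h = len(seq) // 2
    lo = [min(seq[i], seq[i + h]) for i in range(h)]
    hi = [max(seq[i], seq[i + h]) for i in range(h)]
    return lo, hi

def bitonic_sorter(seq):
    """Sort a bitonic sequence (length a power of two) into ascending order."""
    if len(seq) <= 1:
        return list(seq)
    lo, hi = half_cleaner(seq)        # both halves are bitonic, and lo <= hi
    return bitonic_sorter(lo) + bitonic_sorter(hi)

print(half_cleaner([0, 0, 1, 1, 1, 0, 0, 0]))
print(bitonic_sorter([1, 3, 5, 7, 6, 4, 2, 0]))
```

The first call shows Theorem 4 on a zero-one sequence: the half cleaner splits it into two bitonic halves with every element of the first half no greater than every element of the second.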
    51. An 8 x 8 Sorting Network using bitonic sorters
      • Note: a merger is implemented using a bitonic sorter
      (Figure: four 2-mergers, two 4-mergers and one 8-merger sorting eight inputs into 1 2 3 4 5 6 7 8)
      • Total # of stages = $1 + 2 + 3 + \cdots + \log_2 N = (1 + \log_2 N)\log_2 N / 2$
      • The total # of comparators in an N x N Batcher network, with N/2 comparators per stage, is: $(N/4)\log_2 N\,(1 + \log_2 N)$
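The stage and comparator counts can be cross-checked against a software version of the Batcher sort that counts its compare operations. A sketch, assuming N is a power of two; the list reversal stands in for the wiring that turns two ascending halves into a bitonic sequence, and all names are illustrative.

```python
import math

def bitonic_merge(seq, counter):
    """Bitonic sorter: half cleaner followed by two half-size bitonic sorters."""
    if len(seq) <= 1:
        return list(seq)
    h = len(seq) // 2
    counter[0] += h                                   # h comparators in this stage
    lo = [min(seq[i], seq[i + h]) for i in range(h)]
    hi = [max(seq[i], seq[i + h]) for i in range(h)]
    return bitonic_merge(lo, counter) + bitonic_merge(hi, counter)

def batcher_sort(seq, counter):
    """Sort both halves, reverse one to form a bitonic sequence, then merge."""
    if len(seq) <= 1:
        return list(seq)
    h = len(seq) // 2
    up = batcher_sort(seq[:h], counter)
    down = batcher_sort(seq[h:], counter)[::-1]
    return bitonic_merge(up + down, counter)

def batcher_counts(N):
    """(stages, comparators) from the closed-form expressions."""
    n = int(math.log2(N))
    stages = n * (n + 1) // 2
    return stages, (N // 2) * stages

COUNTER = [0]
SORTED = batcher_sort([3, 2, 8, 7, 1, 6, 5, 4], COUNTER)
print(SORTED, COUNTER[0], batcher_counts(8))
```

For N = 8 both the count from the recursion and the closed form give 6 stages and 24 comparators.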
    52. Batcher-Banyan network
      • Q1: How are packets switched in the Batcher network?
      • Q2: How are contentions resolved so that the banyan is nonblocking?
      (Figure: inputs -> Batcher network -> banyan network -> outputs)
    53. Switching in Batcher-banyan network
        • Only headers are compared
        • If the current header bits of the two packets are equal (both 0 or both 1), the comparator remains in its original state (in this case, the bar state) and the bits are forwarded to the outputs.
        • At the first pair of bits that differ, the comparator state is set accordingly and remains unchanged for the rest of the packet
        • If the two packets have the same output address, the comparator remains in its original state for the whole packet duration.
      (Figure: example with the comparator initially in the bar state and header bits arriving one at a time. The first two bit pairs are 0/0, so the comparator remains in bar; at the third bit the upper input has 1 and the lower input has 0, so the comparator is set to cross because the upper input is larger, and it remains in cross for the whole packet duration.)
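The bit-serial decision of a single comparator can be sketched as below. It assumes headers are given as bit strings with the most significant bit first, that the comparator starts in the bar state, and that it sends the smaller address to its upper output; the function name and example headers are illustrative.

```python
def comparator_state(upper_header: str, lower_header: str) -> str:
    """Return 'bar' or 'cross' after scanning the two headers bit by bit."""
    for u, l in zip(upper_header, lower_header):
        if u != l:
            # First differing bit decides; the state is then frozen for the packet.
            return "cross" if u > l else "bar"
    return "bar"      # identical addresses: stay in the original (bar) state

print(comparator_state("0010", "0001"))   # upper input is larger
print(comparator_state("0001", "0010"))   # already in order
print(comparator_state("0100", "0100"))   # same output address
```

Because the decision needs only the bits seen so far, the comparator can forward each bit as it arrives without buffering the whole header.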
    54. Contention resolution in Batcher-banyan network
      • What if some inputs are idle?
        • Add an extra bit in front of the MSB (most significant bit) of the output port address of each packet, called the activity bit.
        • If the input is active, set this bit to 0
        • Otherwise, construct a dummy packet and set this bit to 1
        • All dummy packets will be “pushed” to the lower end of the outputs
          • => all active packets are concentrated at the inputs to the banyan switch
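      (Figure: inputs requesting outputs 001, 100, 100, 000, 101 plus three idle inputs (i) are tagged 0 001, 0 100, 0 100, 0 000, 0 101, 1 xxx, 1 xxx, 1 xxx before entering the Batcher-banyan network)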
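The effect of the activity bit can be seen by sorting the tagged headers. A small sketch, assuming 3-bit addresses and using `None` for an idle input; Python's `sorted` stands in for the Batcher network, and the names are illustrative.

```python
def tag(dest, n):
    """Prefix the n-bit address with the activity bit: 0 = active, 1 = dummy packet."""
    if dest is None:
        return 1 << n             # activity bit set; the address bits are don't-care
    return dest                   # activity bit 0 followed by the address

requests = [0b001, 0b100, 0b100, 0b000, 0b101, None, None, None]
sorted_tags = sorted(tag(d, 3) for d in requests)
active = [t for t in sorted_tags if t < (1 << 3)]

print([format(t, "04b") for t in sorted_tags])
```

All dummy packets sort after every active packet, so the active packets leave the sorter on adjacent lines in address order, which are the concentrated and monotonic conditions the banyan needs. The duplicate request for 100 is still present; resolving it is a separate step.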
    55. Three-phase algorithm
      • How to solve the output contention problem?
        • Three phase algorithm for resolving output contention:
          • Probe phase : only the header of packets enter the sorting network. Packets with the same output address will be adjacent to each other at the outputs. Output j+1 checks with output j to see if their addresses are the same. If yes, let the packet at output j+1 be the loser and the packet at output j be the winner.
          • Acknowledgment phase : acknowledgements are back-propagated along the same path as the forward path in the probe phase.
          • Send phase : send the winning packets; inputs that have lost contention can buffer their packets for later attempt (=> waiting system approach)
        • The first two phases are overhead.
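The three phases can be sketched as a single function. It assumes requests are given as a mapping from input port to destination, that Python's `sorted` stands in for the sorting network, and that ties between equal addresses are ordered by input number; the names and the tie-break are illustrative assumptions.

```python
def three_phase(requests):
    """requests: dict input_port -> destination. Return (winners, losers) as input lists."""
    # Probe phase: only headers are sorted; equal destinations become adjacent.
    probes = sorted((dst, src) for src, dst in requests.items())
    winners, losers = [], []
    for j, (dst, src) in enumerate(probes):
        # Sorter output j loses if output j-1 carries the same address.
        if j > 0 and probes[j - 1][0] == dst:
            losers.append(src)
        else:
            winners.append(src)
    # Acknowledgment phase: the result travels back to each input (here: returned).
    # Send phase: winners transmit; losers buffer their packets and retry later.
    return winners, losers

WINNERS, LOSERS = three_phase({0: 0b001, 1: 0b100, 2: 0b100, 3: 0b000, 4: 0b101})
print(WINNERS, LOSERS)
```

After the losers are removed, the winning packets have distinct destinations, so the send phase meets the distinct-output condition for the banyan.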
    56. An example (figure: the three-phase algorithm applied to requests 001, 100, 100, 000, 101 and three idle inputs; one of the two packets addressed to 100 loses contention, and the packets for 000, 001, 100 and 101 are delivered)
    57. Multiple-Path Banyan Switch Designs
      • Complexity of Batcher-banyan
      • Can we get rid of the complexity of the Batcher network while keeping the performance of Batcher-banyan and the advantages of banyan?
      • Yes, using multiple-path banyan networks:
        • Dilated banyan
        • Replicated banyan
        • Tandem banyan
    58. Multiple-Path Banyan Switch Designs
      • Dilated Banyan
        • The internal link bandwidth is expanded to reduce the likelihood of a packet being dropped
        • For a banyan with dilation degree of d, the switch elements are of size 2d x 2d. Each outgoing address has d associated outgoing links
      • Replicated Banyan
        • Suppose each packet is randomly routed to one of the K banyans for switching. The load on each banyan is reduced by a factor of K, thus
        • $P_{loss} = n P_0 / (n P_0 + 4K)$.
        • Instead of random routing, we can broadcast a packet to all K banyan planes.
        • Since a packet is lost only if all copies are lost,
        • $P_{loss} = (n P_0 / (n P_0 + 4))^K$.
      (Figure: a dilated banyan link of width d; a replicated banyan in which a random router or broadcaster at each of inputs 1..N feeds the 1st to K-th banyan planes, which connect to outputs 1..N)
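The two replicated-banyan formulas are easy to compare numerically. A sketch using the slide's approximations as given; the function names are illustrative.

```python
def loss_random_routing(n: int, p0: float, K: int) -> float:
    """Each plane carries load p0/K, so n*(p0/K) / (n*(p0/K) + 4)."""
    return n * p0 / (n * p0 + 4 * K)

def loss_broadcast(n: int, p0: float, K: int) -> float:
    """A packet is lost only if its copy is lost in all K planes."""
    return (n * p0 / (n * p0 + 4)) ** K

for K in (1, 2, 4):
    print(f"K={K}  random={loss_random_routing(3, 1.0, K):.3f}  "
          f"broadcast={loss_broadcast(3, 1.0, K):.3f}")
```

With K = 1 both reduce to the single-banyan formula. The broadcast expression treats the K planes as independent and each as carrying the full load $P_0$, which is the assumption built into the slide's formula.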
      • Tandem Banyan
        • Deflection routing:
          • Whenever there is a conflict at a 2x2 switch element, one packet would be routed correctly while the other would be marked and routed in the wrong direction.
          • To optimize the number of correctly routed packets , the marked packet would have a lower priority than an unmarked one for the rest of the journey within the banyan network (why?)
          • If a packet remains unmarked when it reaches the output of the banyan, it has reached the correct destination. It is removed and forwarded to the concentrator associated with the output destination
          • On the other hand, a marked packet will be unmarked and forwarded to the next banyan (by packet filter), and a new attempt to route the packet to its desired output is initiated
          • A packet is considered lost if it still fails to reach the desired output after passing through all the K banyan networks.
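      (Figure: tandem banyan. Inputs 1..N enter the 1st banyan; after each of the K banyans, a packet filter passes marked packets on to the next banyan, while unmarked packets go through delay elements to the concentrators at outputs 1..N.)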
        • Let D be the delay suffered by a packet as it travels through a single banyan network.
        • A packet that reaches its correct destination at a later banyan experiences a larger delay
        • To compensate for the delay differences (why?), one can insert delay elements with varying delays at different places:
          • for the links that connect the N outputs of banyan i to the N concentrators, one can introduce a delay of (K-i)D.
    59. An example of tandem Banyan routing 011 010 110 001 ( 010 ) 101 ( 011 ) 111 ( 110 ) 100
    60. A practical issue
      • Today, it is the number of chip I/Os, not the number of crosspoints, that limits the size of a switch fabric.
      • The problem is how to interconnect chips with limited I/Os to form a larger switch?
    61. Three-stage switching network (Clos Network)
      • Switch modules are arranged in three stages and any module in the first ( second ) stage is interconnected with any module in the second ( third ) stage via a unique link
      • Each switch module is a nonblocking switch
      • For an N x N 3-stage switch:
        • n1 x r1 = n3 x r3 = N
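      (Figure: the first stage has r1 switch modules of size n1 x r2, the second stage has r2 modules of size r1 x r3, and the third stage has r3 modules of size r2 x n3)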
    62. An example
      • n1 = r2 = r1 = r3 = n3 = 3
      • 3-stage switching network is not necessarily nonblocking.
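      (Figure: three stages of three 3 x 3 modules each)
      • Theorem: A three-stage switching network is strictly nonblocking if and only if
        • r2 ≥ n1 + n3 - 1
        • Proof sketch: consider a new connection from an idle input of a first-stage module to an idle output of a third-stage module. In the worst case the other n1-1 inputs of that first-stage module are busy (only one is idle), the other n3-1 outputs of that third-stage module are busy (only one is idle), and these existing connections use (n1-1) + (n3-1) different second-stage modules. One more second-stage module is needed for the new connection, so r2 ≥ (n1-1) + (n3-1) + 1.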
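The nonblocking condition and the hardware cost can be checked for the example on this slide. A sketch, assuming every switch module is built as a crossbar so that an a x b module costs a*b crosspoints; the function names are illustrative.

```python
def clos_strictly_nonblocking(n1: int, r2: int, n3: int) -> bool:
    """Clos condition for a three-stage network."""
    return r2 >= n1 + n3 - 1

def clos_crosspoints(n1: int, r1: int, r2: int, r3: int, n3: int) -> int:
    """Crosspoints if every module is a crossbar (an assumption of this sketch)."""
    stage1 = r1 * (n1 * r2)       # r1 modules of size n1 x r2
    stage2 = r2 * (r1 * r3)       # r2 modules of size r1 x r3
    stage3 = r3 * (r2 * n3)       # r3 modules of size r2 x n3
    return stage1 + stage2 + stage3

print(clos_strictly_nonblocking(3, 3, 3))   # the n1 = r2 = n3 = 3 example
print(clos_strictly_nonblocking(3, 5, 3))   # with five middle-stage modules
print(clos_crosspoints(3, 3, 5, 3, 3))
```

For the 9 x 9 example, r2 = 3 is too few and r2 = 5 satisfies the condition. At this small size the nonblocking Clos network needs more crosspoints than a single 9 x 9 crossbar (81); the three-stage structure is attractive for larger N and because each module needs far fewer I/Os.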
    63. Summary: Switch Fabric Design using Interconnection networks
      • Banyan networks
        • nice but with serious internal blocking problem
      • Non-blocking conditions for banyan networks
        • how to implement?
      • Batcher-banyan networks
        • Design of a sorting network (Batcher) to satisfy the non-blocking conditions
      • Multipath banyan networks
        • Keeping the performance of batcher-banyan and advantages of banyan
      • Tackling the issue on interconnecting chips with limited I/Os to form a larger switch fabric
    64. High-Speed Router Design
      • Outline:
      • Introduction
      • Router Generations
      • Table Lookup
      • Switch Fabric Design
      • Buffer Placement
      Input Port Queueing Output Port Queueing Combined Input Output Queueing
    65. Input Port Queuing
      • Note: When we call a switch input-queued, we mean that in each time slot the switch fabric allows at most one packet to be sent by each input port and at most one packet to be received by each output port.
      • Fabric slower than input ports -> queueing may occur at input queues
      • Head-of-the-Line (HOL) blocking: queued datagram at front of queue prevents others in queue from moving forward
    66. An Analogy
    67. Input Port Queuing Performance
      • Throughput of an input-queueing switch
        • When N is large, the maximum throughput ρ* = 2 - √2 ≈ 0.586 is obtained under uniformly distributed traffic:
          • all inputs have the same loading
          • a packet is destined for each output port with equal probability
          • in case of contention, the winner is chosen randomly
      [Figure: delay versus load, with the 58.6% and 100% load points marked]
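The 58.6% saturation figure can be approximated with a short Monte Carlo sketch. This is an illustration under the three traffic assumptions listed on the slide, not the analysis behind the result; the function name, the seed and the slot count are hypothetical choices.

```python
import random


def saturated_throughput(n_ports: int, n_slots: int, seed: int = 0) -> float:
    """Estimate per-port throughput of a saturated FIFO input-queued switch."""
    rng = random.Random(seed)
    # Saturation: every input always has a head-of-line (HOL) packet.
    # hol[i] is the output requested by the HOL packet of input i.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    delivered = 0
    for _ in range(n_slots):
        # Group the inputs by the output their HOL packet requests.
        contenders: dict[int, list[int]] = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        # Each requested output accepts exactly one packet, chosen at random.
        for inputs in contenders.values():
            winner = rng.choice(inputs)
            # The winner's next packet moves to the head of its queue.
            hol[winner] = rng.randrange(n_ports)
            delivered += 1
        # Losers keep their HOL packet and block the packets behind it.
    return delivered / (n_slots * n_ports)


if __name__ == "__main__":
    for n in (2, 8, 32):
        print(n, round(saturated_throughput(n, 5000), 3))
```

The estimate should fall towards 0.586 as the number of ports grows; for small switches it is noticeably higher (0.75 for N = 2).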
    69. Input Queueing -- Virtual output queues
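    70. Input Queues: Virtual Output Queues
      • Memory b/w = 2R
      [Figure: virtual output queues at each input served by a scheduler; delay versus load plot with the 58.6% and 100% load points marked]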
    71. Input Queueing Scheduling
      • Bipartite matching (weight = 18)
      • Question: maximum weight or maximum size?
      [Figure: bipartite graph between inputs 1-4 and outputs 1-4 with weighted edges]
    72. Input Queueing Scheduling
      • Maximum Size
        • Maximizes instantaneous throughput
        • Does it maximize long-term throughput? Not necessarily
      • Maximum Weight
        • Can clear most backlogged queues
        • But does it sacrifice long-term throughput? No.
      Maximum Weight is better!
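The difference between the two matchings can be seen on a tiny example. The sketch below uses brute force over all permutations, which is only practical for very small switches, and the queue occupancies in Q are hypothetical values, not the ones from the slide figure.

```python
from itertools import permutations

# Hypothetical VOQ occupancies: Q[i][j] = packets queued at input i for output j.
Q = [
    [7, 1, 0],
    [5, 0, 0],
    [0, 0, 2],
]


def matching_size(q, edges):
    """Number of input-output pairs served in this time slot."""
    return len(edges)


def matching_weight(q, edges):
    """Total queue length over the served input-output pairs."""
    return sum(q[i][j] for i, j in edges)


def best_matching(q, key):
    """Try every one-to-one assignment and keep the best one under `key`."""
    best = None
    for perm in permutations(range(len(q))):
        # Only pairs with a non-empty VOQ count as edges of the matching.
        edges = [(i, j) for i, j in enumerate(perm) if q[i][j] > 0]
        if best is None or key(q, edges) > key(q, best):
            best = edges
    return best


if __name__ == "__main__":
    max_size = best_matching(Q, matching_size)
    max_weight = best_matching(Q, matching_weight)
    print(max_size, matching_weight(Q, max_size))      # [(0, 1), (1, 0), (2, 2)] 8
    print(max_weight, matching_weight(Q, max_weight))  # [(0, 0), (2, 2)] 9
```

Here the maximum size matching serves three queues but leaves the longest one (7 packets) waiting, while the maximum weight matching serves only two queues and drains the most backlogged one.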
    73. Input Queueing: Why is serving long/old queues better than serving the maximum number of queues?
      • When traffic is uniformly distributed, servicing the maximum number of queues leads to 100% throughput.
      • When traffic is non-uniform, some queues become longer than others.
      • A good algorithm keeps the queue lengths matched, and services a large number of queues.
      [Figure: average occupancy per VOQ under uniform traffic and under non-uniform traffic]
    74. Points to Ponder:
      • Can we design an input-queued switch with k queues per input port, where 1 < k < N, such that it is scalable and has the high throughput of a VOQ switch?
      [Figure: an input port with k queues feeding N outputs]
    75. Odd-even switch
      • In each time slot, scheduling is divided into two contention rounds, one for each output address group (even and odd).
      • The first contention round considers the HOL packets at all even queues. The second round considers the HOL packets at all odd queues.
      • After the two contention rounds, the winning packets from both rounds are switched together.
      • Note: Each input/output can send/receive at most one packet in each time slot.
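The two contention rounds can be sketched as follows. This is one possible reading of the slide, not the algorithm from the cited paper: random tie-breaking and the rule that an input which wins the first round sits out the second round are assumptions, and all names are hypothetical.

```python
import random
from typing import Dict, List, Optional, Set


def contention_round(hol: List[Optional[int]], busy_inputs: Set[int],
                     rng: random.Random) -> Dict[int, int]:
    """One round: each requested output grants one contending input at random."""
    requests: Dict[int, List[int]] = {}
    for inp, out in enumerate(hol):
        # Skip empty queues and inputs that already won an earlier round.
        if out is not None and inp not in busy_inputs:
            requests.setdefault(out, []).append(inp)
    # Map each winning input to the output it was granted.
    return {rng.choice(inputs): out for out, inputs in requests.items()}


def odd_even_schedule(even_hol: List[Optional[int]], odd_hol: List[Optional[int]],
                      rng: random.Random) -> Dict[int, int]:
    """Run the even round, then the odd round, and return input -> output grants."""
    grants = contention_round(even_hol, set(), rng)
    # Assumption: an input sends at most one packet, so round-1 winners are excluded.
    grants.update(contention_round(odd_hol, set(grants), rng))
    return grants


if __name__ == "__main__":
    # even_hol[i] / odd_hol[i]: destination of the HOL packet in input i's
    # even / odd queue, or None if that queue is empty (hypothetical state).
    even_hol = [0, 0, 2, None]
    odd_hol = [1, 3, 1, 1]
    print(odd_even_schedule(even_hol, odd_hol, random.Random(1)))
```

An input that loses the even round still has a chance in the odd round, which is how the second queue reduces HOL blocking compared with a single FIFO per input.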
    76. Performance Results
        • For details, please refer to the following paper:
        • Kwan L. Yeung and H. Shi, “Throughput Analysis for Input-buffered ATM Switches with Multiple FIFO Queues per Input Port,” Electronics Letters, Vol. 33, No. 19, pp. 1604-1606, Sep. 1997.
    77. High-Speed Router Design
      • Outline:
      • Introduction
      • Router Generations
      • Table Lookup
      • Switch Fabric Design
      • Buffer Placement
        • Input Port Queueing
        • Output Port Queueing
        • Combined Input Output Queueing
    78. Output port queueing
      • Note: When we call a switch output-queued, we imply that in each time slot the switch fabric allows up to N packets to arrive at an output port.
      • Output buffering is needed when the arrival rate via the switch exceeds the output line speed
      • queueing (delay) and loss due to output port buffer overflow
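    79. Output Queueing
      • Individual Output Queues: Memory b/w = (N+1).R
      • Centralized Shared Memory: Memory b/w = 2N.R
      [Figure: N individual output queues (ports 1, 2, ..., N) and a centralized shared memory serving ports 1, 2, ..., N]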
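To compare the memory bandwidth figures quoted for the different buffer placements, the sketch below evaluates them for a given port count N and line rate R. The function name, the scheme labels and the example values are hypothetical. The factors follow the slides: an input queue sees one write and one read per slot (2R), an individual output queue can see up to N writes plus one read ((N+1).R), and a centralized shared memory sees up to N writes and N reads (2N.R).

```python
def memory_bandwidth(scheme: str, n_ports: int, line_rate: float) -> float:
    """Required memory bandwidth, in the same units as line_rate."""
    factors = {
        "input_queued": 2,                         # 2R
        "individual_output_queues": n_ports + 1,   # (N+1).R
        "centralized_shared_memory": 2 * n_ports,  # 2N.R
    }
    return factors[scheme] * line_rate


if __name__ == "__main__":
    # Hypothetical example: 16 ports at 10 Gb/s each.
    for scheme in ("input_queued", "individual_output_queues",
                   "centralized_shared_memory"):
        print(scheme, memory_bandwidth(scheme, 16, 10.0), "Gb/s")
```

The input-queued figure does not grow with N, which is the scalability argument for input queueing.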
    80. Comments:
      • Neither an output-buffered switch nor an input-buffered VOQ switch suffers from HOL blocking.
      • Does that mean they have the same performance?
      • No.
      • Which one is better?
      • The output-buffered switch.
      • Why?
      • Intuitively, consider the case where an output line is idle. In an output-buffered switch, a packet (if any) in the output buffer can be sent immediately. In an input-buffered VOQ switch, even if packets are waiting at the input buffers for this output port, they cannot always be sent to the output port/line immediately, because of the constraints imposed by the VOQ scheduling algorithm.
        • e.g. under longest queue first, the queue destined for this output port may not be the longest
    81. Combined Input-Output Queueing: Can we design an input-queued switch that functions exactly as an output-queued switch?
      • Yes. It has been proved that a combined input-output queueing (CIOQ) switch with a switch fabric speedup of (at least) 2, together with a suitable scheduling algorithm (e.g. a stable marriage matching algorithm), can function exactly like an output-buffered switch.
      • Why bother to do so?
      • To get the performance of an output-queued switch at the cost of an input-queued switch.
      • Note: To provide Quality of Service (QoS) guarantees on, e.g., packet delay, scheduling algorithms can be easily designed and efficiently implemented for output-buffered switches (more to be covered later in Scheduling).
    82. Using Speedup
      [Figure: example of a switch fabric operating with speedup]
    83. The Ideal Solution
      [Figure: is an N x N output-queued switch equivalent (= ?) to an N x N combined input-output queued switch?]
    84. The findings
      • For a switch with combined input and output queueing to exactly mimic an output-queued switch, for all types of traffic, a speedup of 2 - 1/N is necessary and sufficient.
      • But how can such an algorithm be made fast and efficient for real implementation?
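As a small worked example of the 2 - 1/N bound, the sketch below evaluates it exactly for a few port counts. The function name is hypothetical.

```python
from fractions import Fraction


def required_speedup(n_ports: int) -> Fraction:
    """Speedup 2 - 1/N quoted as necessary and sufficient for exact emulation."""
    return 2 - Fraction(1, n_ports)


if __name__ == "__main__":
    for n in (2, 4, 32):
        s = required_speedup(n)
        print(n, s, float(s))  # e.g. 4 7/4 1.75
```

The bound approaches 2 from below as N grows, which matches the speedup of 2 mentioned for CIOQ switches.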
    85. High-Speed Router Design
      • Summary:
      • Introduction
      • Router Generations
      • Table Lookup
      • Switch Fabric Design
      • Buffer Placement