Data center network architectures v1.3


Data Center Network Architecture: Towards a Cloud Data Center

Published in: Technology

  1. Jeong Wook-jae. Data Center Network Architecture: Towards a Cloud Data Center
  2. Contents • The Conventional Architecture & Problem • The New Architecture • The Monsoon Architecture • The VL2 Architecture • The SEATTLE Architecture • The PortLand Architecture • The TRILL • Related Works • Summary • The CDCN (Cloud Data Center Network) Architecture Proposal • Trend
  3. The Conventional Architecture: the conventional architecture for data centers (adapted from a figure by Cisco, 2004)
  4. The Problems of a Conventional DC - Ethernet is hard to scale out • STP • Broadcast (ARP, RARP, DHCP…) • Packet floods in switches (for MAC learning) - Fragmentation of resources - No performance isolation - Poor server-to-server connectivity - Very high reliability is needed near the top of the tree (single point of failure)
  5. The Problems of a Conventional DC: Fragmentation of Resources - VLANs are used to isolate properties from each other - IP addresses are topologically determined by the access routers (ARs) - Reconfiguring IPs and VLAN trunks is painful, error-prone, slow, and often manual
  6. The Problems of a Conventional DC: No Performance Isolation - VLANs typically provide only reachability isolation - One service sending or receiving too much traffic hurts all services sharing its subtree
  7. The Problems of a Conventional DC: Poor Server-to-Server Connectivity - Data centers run two kinds of applications: • Outward facing (serving web pages to users) • Internal computation - 70-80% of the packets stay inside the data center
  8. The Problems of a Conventional DC
  9. Monsoon: Albert Greenberg and four co-authors (Microsoft Research)
  10. The Monsoon Architecture Monsoon - A new network architecture that scales and commoditizes data center networking. Abstract - Scale-out instead of scale-up - A single large Layer 2 domain - Built from programmable commodity Layer 2 switches and servers - Two-level hierarchy: • TOR (Top-Of-Rack) switch => access switch • LB (Load Balancing) switch => core switch - Scales to 100,000 servers or more
  11. The Monsoon Architecture: Objectives - Low cost and scale-out - Uniform high capacity • Capacity between two servers is limited only by their NICs • No need to consider topology when adding servers - Performance isolation • Traffic of one service should be unaffected by others - Layer-2 semantics • Flat addressing, so any server can have any IP address • Server configuration is the same as in a LAN • Legacy applications depending on broadcast must work
  12. The Monsoon Architecture: Server-to-Server Forwarding - An example Monsoon topology (Clos network) • A scale-out design with broad layers - Same bisection bandwidth at each layer -> no oversubscription - Extensive path diversity -> graceful degradation under failure
      Switch | Up-link ports | Down-link ports | Number of switches
      Intermediate switch | N/A | 10 Gbps x 144 | 72
      Aggregation switch | 10 Gbps x 72 | 10 Gbps x 72 | 144
      TOR switch | 10 Gbps x 2 | 1 Gbps x 20 | 5,184
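
The port counts in the example topology can be checked with a few lines of arithmetic. The sketch below is illustrative only: the `Layer` type and the helper names are made up for this example, and the only inputs are the figures from the slide's table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One switch layer of the example topology (figures from the slide)."""
    name: str
    count: int       # number of switches in this layer
    up_ports: int    # up-link ports per switch
    up_gbps: int     # speed of each up-link port
    down_ports: int  # down-link ports per switch
    down_gbps: int   # speed of each down-link port


INTERMEDIATE = Layer("intermediate", 72, 0, 0, 144, 10)
AGGREGATION = Layer("aggregation", 144, 72, 10, 72, 10)
TOR = Layer("tor", 5184, 2, 10, 20, 1)


def total_servers(tor: Layer) -> int:
    """Each TOR down-link port connects one server."""
    return tor.count * tor.down_ports


def oversubscription(layer: Layer) -> float:
    """Down-link capacity divided by up-link capacity; 1.0 means none."""
    return (layer.down_ports * layer.down_gbps) / (layer.up_ports * layer.up_gbps)


def links_match(upper: Layer, lower: Layer) -> bool:
    """True if the upper layer's down-links equal the lower layer's up-links."""
    return upper.count * upper.down_ports == lower.count * lower.up_ports


print(total_servers(TOR))             # 103680
print(oversubscription(TOR))          # 1.0
print(oversubscription(AGGREGATION))  # 1.0
```

With these figures the topology hosts 5,184 x 20 = 103,680 servers, consistent with the "100,000 servers or more" target, and the link counts between adjacent layers line up exactly.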
  13. The Monsoon Architecture: Clos Network Topology - A multistage (e.g., 3-stage) switching network, with n inputs per ingress-stage switch and k middle-stage switches. - Advantage • Connections between a large number of input and output ports can be made using only small switches. • With k ≥ n, the Clos network can be non-blocking like a crossbar switch, provided existing connections may be rearranged. - Clos theorem: if k ≥ 2n-1, a new connection can always be added without rearrangement.
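
The two conditions on this slide can be expressed as a small classifier. This is a minimal sketch, assuming the usual reading of the symbols (n inputs per ingress-stage switch, k middle-stage switches); the function name is hypothetical.

```python
def clos_blocking_class(n: int, k: int) -> str:
    """Classify a 3-stage Clos network by its middle-stage size.

    n: inputs per ingress-stage switch (assumed meaning of the slide's n)
    k: number of middle-stage switches (assumed meaning of the slide's k)
    """
    if n < 1 or k < 1:
        raise ValueError("n and k must be positive")
    if k >= 2 * n - 1:
        # Clos theorem: a new connection never requires rearrangement.
        return "strict-sense non-blocking"
    if k >= n:
        # Non-blocking only if existing connections may be rerouted.
        return "rearrangeably non-blocking"
    return "blocking"


print(clos_blocking_class(n=4, k=7))  # strict-sense non-blocking
print(clos_blocking_class(n=4, k=4))  # rearrangeably non-blocking
print(clos_blocking_class(n=4, k=3))  # blocking
```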
  14. The Monsoon Architecture: Server-to-Server Forwarding with Valiant Load Balancing • Every flow is "bounced" off a random intermediate switch • Provably hotspot-free for any admissible traffic matrix • Servers could randomize flowlets if needed
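
The per-flow random bounce can be sketched as follows. This is an illustration rather than Monsoon's implementation: the class name, the flow identifier format, and the in-memory flow table are assumptions made for the example.

```python
import random


class ValiantBalancer:
    """Assign each flow to a randomly chosen intermediate switch.

    A flow keeps its switch once chosen, so packets of one flow follow
    one path and are not reordered (an assumption of this sketch).
    """

    def __init__(self, intermediates, seed=None):
        self._intermediates = list(intermediates)
        if not self._intermediates:
            raise ValueError("at least one intermediate switch is required")
        self._rng = random.Random(seed)
        self._flow_to_switch = {}

    def pick(self, flow_id):
        """Return the intermediate switch for flow_id, choosing one if new."""
        if flow_id not in self._flow_to_switch:
            self._flow_to_switch[flow_id] = self._rng.choice(self._intermediates)
        return self._flow_to_switch[flow_id]


# 72 intermediate switches, as in the example topology.
balancer = ValiantBalancer(range(72), seed=1)
flow = ("10.0.0.1", "10.0.0.2", 6, 1234, 80)
assert balancer.pick(flow) == balancer.pick(flow)  # the flow stays on one path
```

Because the choice ignores the traffic pattern, no particular traffic matrix can concentrate load on one intermediate switch, which is the intuition behind the hotspot-free property.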
  15. The Monsoon Architecture: Valiant Load Balancing
  16. The Monsoon Architecture: Server-to-Server Forwarding - Encapsulation is used to transfer complexity to servers • Commodity switches have simple forwarding primitives • Complexity is moved to computing the headers - Encapsulation is available • IEEE 802.1ah defines MAC-in-MAC encapsulation (Figure: frame processing when packets go from one server to another in the same data center.)
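
The idea of wrapping a frame in an outer header can be modelled in a few lines. This is a deliberately simplified sketch: it is not the IEEE 802.1ah frame layout, and the `Frame` type and the address strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Frame:
    """A toy frame: addresses plus either raw data or another frame."""
    dst: str
    src: str
    payload: Union[bytes, "Frame"]


def encapsulate(inner: Frame, outer_dst: str, outer_src: str) -> Frame:
    """Wrap inner in an outer header; switches forward on the outer dst only."""
    return Frame(dst=outer_dst, src=outer_src, payload=inner)


def decapsulate(frame: Frame) -> Frame:
    """Strip the outer header and return the original frame."""
    if not isinstance(frame.payload, Frame):
        raise ValueError("frame does not carry an encapsulated frame")
    return frame.payload


# The sending server computes the header; switches only read the outer one.
original = Frame(dst="server-b", src="server-a", payload=b"data")
on_wire = encapsulate(original, outer_dst="tor-of-server-b", outer_src="server-a")
assert decapsulate(on_wire) == original
```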
  17. The Monsoon Architecture: Server-to-Server Forwarding - Data center OSes are already heavily modified for VMs, storage, etc. • A thin shim for network support is no big deal - Applications work with Application Addresses (AAs) • AAs are flat names; infrastructure addresses are invisible to apps - No change to applications or to clients outside the DC (Figure: the networking stack of a host. The Monsoon Agent looks up remote IPs in the central directory.)
  18. The Monsoon Architecture: External Connection & Full Topology (Example) - Routers do not support the Monsoon functions - An Ingress Server is paired with each Access Router (AR) • Implements the Monsoon functionality and acts as a gateway to the DC • Two interfaces: one to the AR and one to a TOR switch • Default GW (Figure: access routers, each paired with an ingress server.)
  19. The Monsoon Architecture: Directory System Performance - Key issues: • Lookup latency • How many servers are needed to handle a DC's lookup traffic? • Update latency • Convergence latency
  20. VL2: Albert Greenberg, Changhoon Kim and seven co-authors (Microsoft Research)
  21. The VL2 Architecture VL2 uses - flat addressing to allow service instances to be placed anywhere in the network - Valiant Load Balancing to spread traffic uniformly across network paths - end-system-based address resolution to scale to large server pools without introducing complexity to the network control plane. Objectives - Uniform high capacity - Performance isolation - Layer-2 semantics Topology - Low-cost switches arranged in a Clos topology. Traffic Engineering - Valiant Load Balancing
  22. The VL2 Architecture Building on proven networking technology - Link-state routing • Maintains the switch-level topology • Does not carry end hosts' information - ECMP to enable VLB Separating names from locators - Hosting any service on any server. - Addressing scheme • AAs (Application-specific Addresses) & LAs (Location-specific Addresses) • Directory system: maps names to locators. • VL2 agent (in the host): a layer-2.5 shim that invokes the directory system's resolution service. Embracing end systems - VL2 agent in the host
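
The split between names (AAs) and locators (LAs) can be illustrated with a toy directory and host agent. The class names, the cache, and the sample addresses are assumptions made for this sketch and are not taken from VL2's implementation.

```python
class Directory:
    """Toy directory system: maps application addresses (AAs) to locators (LAs)."""

    def __init__(self):
        self._aa_to_la = {}

    def register(self, aa: str, la: str) -> None:
        """Record (or update) where the service instance named aa currently lives."""
        self._aa_to_la[aa] = la

    def resolve(self, aa: str) -> str:
        """Return the locator for aa; raises KeyError if aa is unknown."""
        return self._aa_to_la[aa]


class Vl2Agent:
    """Host-side shim: resolves the destination AA and tunnels to its LA."""

    def __init__(self, directory: Directory):
        self._directory = directory
        self._cache = {}

    def send(self, dst_aa: str, payload: bytes):
        """Return the (outer LA, inner AA, payload) triple placed on the wire."""
        if dst_aa not in self._cache:
            self._cache[dst_aa] = self._directory.resolve(dst_aa)
        return (self._cache[dst_aa], dst_aa, payload)


directory = Directory()
directory.register("10.0.0.5", "20.0.1.1")   # sample AA and LA values
agent = Vl2Agent(directory)
print(agent.send("10.0.0.5", b"hello"))      # ('20.0.1.1', '10.0.0.5', b'hello')
```

Applications only ever see the AA; moving a service to another rack changes the directory entry, not the address the application uses.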
  23. The VL2 Architecture: Addressing
  24. The VL2 Architecture: Routing
  25. The VL2 Architecture Potential issue for both ECMP and VLB - Transient congestion on some links. - The agent can change the hash used to create the source address periodically, or whenever TCP detects a severe congestion event (e.g., a full window loss) or an Explicit Congestion Notification. - Switches today support only up to 16-way ECMP, with 256-way ECMP being released by some vendors this year. - Some inexpensive switches cannot correctly retrieve the five-tuple values when a packet is encapsulated with multiple IP headers. Thus, the agent at the source computes a hash of the five-tuple values and writes that value into the source IP address field, which all switches do use in making ECMP forwarding decisions.
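
Hashing the five-tuple at the source and letting switches choose among equal-cost next hops can be sketched as below. The hash function (SHA-256 truncated to 32 bits) and the modulo selection are assumptions of this example; real switches use their own hardware hash.

```python
import hashlib


def five_tuple_hash(src_ip: str, dst_ip: str, proto: int,
                    src_port: int, dst_port: int) -> int:
    """Return a 32-bit hash of the flow's five-tuple.

    32 bits is chosen so the value would fit an IPv4 source address field.
    """
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")


def ecmp_next_hop(flow_hash: int, next_hops):
    """Pick one of the equal-cost next hops from the flow hash."""
    if not next_hops:
        raise ValueError("no next hops available")
    return next_hops[flow_hash % len(next_hops)]


# 16-way ECMP, the limit the slide quotes for current switches.
hops = [f"intermediate-{i}" for i in range(16)]
h = five_tuple_hash("10.0.0.1", "10.0.0.2", 6, 1234, 80)
assert ecmp_next_hop(h, hops) == ecmp_next_hop(h, hops)  # a flow stays on one path
```

Changing any input to the hash (for example, mixing in a counter when congestion is detected) moves the flow to a different path, which is the re-hashing remedy the slide mentions.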
  26. The VL2 Architecture: Discussion - Cost & Scale • The VL2 topology can scale to create networks with no oversubscription. • Switches with 144 ports (D = 144) are available today for $150K. • Switches with 24 ports (D = 24) are available today for $8K. • Building a conventional network with no oversubscription would cost roughly 14× the cost of an equivalent VL2 network with no oversubscription.
  27. SEATTLE: Changhoon Kim and two co-authors (Princeton University)
  28. The SEATTLE Architecture Floodless in SEATTLE: A Scalable Ethernet Architecture for Large Enterprises. - In SIGCOMM, 2008. Flat addressing of end hosts - Switches use hosts' MAC addresses for routing - Ensures zero configuration and backwards compatibility Automated host discovery at the edge - Switches detect the arrival and departure of hosts - Obviates flooding and ensures scalability Hash-based on-demand resolution - A hash deterministically maps a host to a switch - Switches resolve end hosts' location and address via hashing - Ensures scalability Shortest-path forwarding between switches - Switches run link-state routing to maintain only the switch-level topology (i.e., they do not disseminate end-host information) - Ensures data-plane efficiency
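
The deterministic host-to-switch mapping can be sketched with a hash ring. This is a minimal illustration assuming a consistent-hashing scheme over switch identifiers; the hash choice (truncated SHA-1) and the function names are assumptions of the example.

```python
import bisect
import hashlib


def _ring_position(value: str) -> int:
    """Map a string to a position on a 64-bit hash ring."""
    return int.from_bytes(hashlib.sha1(value.encode()).digest()[:8], "big")


def resolver_switch(host_mac: str, switches) -> str:
    """Return the switch responsible for storing host_mac's location.

    The host is hashed onto the ring and assigned to the first switch at
    or after that position, wrapping around at the end of the ring.
    """
    if not switches:
        raise ValueError("no switches in the network")
    ring = sorted((_ring_position(s), s) for s in switches)
    points = [position for position, _ in ring]
    index = bisect.bisect_left(points, _ring_position(host_mac)) % len(ring)
    return ring[index][1]


switches = [f"switch-{i}" for i in range(8)]
mac = "00:1a:2b:3c:4d:5e"
# Every switch computes the same answer, so no flooding is needed to find it.
assert resolver_switch(mac, switches) == resolver_switch(mac, list(reversed(switches)))
```

Any switch that knows the current switch-level topology (from link-state routing) can compute the resolver locally, and removing an unrelated switch leaves the mapping for this host unchanged.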
  29. The SEATTLE Architecture: Packet Forwarding & Lookup
  30. The SEATTLE Architecture: Packet Forwarding & Lookup
  31. PortLand: R.N. Mysore and seven co-authors (University of California, San Diego)
  32. The PortLand Architecture (Figures: adding a new host; transferring a packet.) Key features - Layer 2 protocol based on a tree topology - The PMAC encodes position information - Data forwarding proceeds based on the PMAC - The edge switch is responsible for mapping between PMAC and AMAC (rewriting) - The fabric manager is responsible for address resolution - The edge switch makes the PMAC invisible to the end host - Each switch node can identify its position by itself - The fabric manager keeps information about the overall topology. When a fault occurs, it notifies the affected nodes. - PMAC (48 bits): pod(16).position(8).port(8).vmid(16)
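
The PMAC layout given above, pod(16).position(8).port(8).vmid(16), packs into a 48-bit MAC address as follows. The function names and the colon-separated output format are choices made for this sketch.

```python
def encode_pmac(pod: int, position: int, port: int, vmid: int) -> str:
    """Pack pod(16).position(8).port(8).vmid(16) into a 48-bit MAC string."""
    fields = (("pod", pod, 16), ("position", position, 8),
              ("port", port, 8), ("vmid", vmid, 16))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    value = (pod << 32) | (position << 24) | (port << 16) | vmid
    return ":".join(f"{byte:02x}" for byte in value.to_bytes(6, "big"))


def decode_pmac(pmac: str):
    """Return (pod, position, port, vmid) from a colon-separated PMAC."""
    value = int(pmac.replace(":", ""), 16)
    return ((value >> 32) & 0xFFFF, (value >> 24) & 0xFF,
            (value >> 16) & 0xFF, value & 0xFFFF)


print(encode_pmac(pod=1, position=2, port=3, vmid=4))  # 00:01:02:03:00:04
print(decode_pmac("00:01:02:03:00:04"))                # (1, 2, 3, 4)
```

Because the pod and position sit in the high-order bits, a switch can forward on a prefix of the PMAC instead of keeping one entry per host.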
  33. TRILL (RFC 5556): Radia Perlman (Sun Microsystems)
  34. The TRILL TRILL: Transparent Interconnection of Lots of Links - TRILL is a new standard protocol that performs Layer 2 bridging using IS-IS link-state routing. A simple idea - Encapsulate native frames in a transport header that provides a hop count. - Route the encapsulated frames using IS-IS. - Decapsulate the native frame before delivery. Definitions - RBridge (Routing Bridge) • A device that implements TRILL - RBridge Campus • A network of RBridges, links, and any intervening bridges, bounded by end stations and Layer 3 routers.
  35. The TRILL: Encapsulation & Header - TRILL header: 64 bits - Nicknames: auto-configured 16-bit campus-local names for RBridges - V = Version (2 bits) - R = Reserved (2 bits) - M = Multi-Destination (1 bit) - OpLng = Length of TRILL Options (5 bits) - Hop = Hop Limit (6 bits)
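
The 64-bit header can be packed with Python's `struct` module. The field widths come from the slide; the field order (Ethertype, flag word, egress nickname, ingress nickname) and the Ethertype value 0x22F3 are assumptions of this sketch based on the TRILL base protocol specification, and the function names are hypothetical.

```python
import struct

TRILL_ETHERTYPE = 0x22F3  # assumed value, see lead-in


def pack_trill_header(egress: int, ingress: int, hop_limit: int,
                      version: int = 0, multi_dest: bool = False,
                      op_len: int = 0) -> bytes:
    """Pack a 64-bit TRILL header; the reserved bits are set to zero."""
    if not (0 <= version < 4 and 0 <= op_len < 32 and 0 <= hop_limit < 64):
        raise ValueError("version, op_len or hop_limit out of range")
    if not (0 <= egress < 65536 and 0 <= ingress < 65536):
        raise ValueError("nicknames are 16-bit values")
    # V(2) | R(2) | M(1) | OpLng(5) | Hop(6) = 16 bits
    flags = (version << 14) | (int(multi_dest) << 11) | (op_len << 6) | hop_limit
    return struct.pack("!HHHH", TRILL_ETHERTYPE, flags, egress, ingress)


def unpack_trill_header(header: bytes):
    """Return (version, multi_dest, op_len, hop_limit, egress, ingress)."""
    ethertype, flags, egress, ingress = struct.unpack("!HHHH", header)
    if ethertype != TRILL_ETHERTYPE:
        raise ValueError("not a TRILL header")
    return ((flags >> 14) & 0x3, bool((flags >> 11) & 0x1),
            (flags >> 6) & 0x1F, flags & 0x3F, egress, ingress)


header = pack_trill_header(egress=1, ingress=2, hop_limit=63)
print(header.hex())  # 22f3003f00010002
```

Each RBridge on the path would decrement the hop limit and drop the frame when it reaches zero, which is what protects the campus against forwarding loops.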
  36. The TRILL: Packet Routing - ESADI (End Station Address Distribution Information protocol)
  37. Related Works & Summary
  38. Related Works OpenFlow - Shares the idea of simple switches controlled by external software - Monsoon and VL2 are a philosophy for how to use the switches Brocade: Brocade One (TRILL, Clos Net, DCB) Cisco: FabricPath (TRILL) Juniper: QFabric (HW & FC)
  39. Summary: Comparison of the Data Center Network Architectures
      - Architectures: Monsoon; VL2; SEATTLE; Fat-Tree; PortLand; SPAIN; MOOSE; TRILL; DCell; BCube; MDCube
      - Org.: MS Research; Princeton University; Univ. of California San Diego; HP; Univ. of Cambridge; MS Research Asia
      - Publishing: SIGCOMM 2008; SIGCOMM 2009; SIGCOMM 2008; SIGCOMM 2008; SIGCOMM 2009; NSDI 2010; DC CAVES Workshop 2009; RFC 5556 (2009); SIGCOMM 2008; SIGCOMM 2009; CoNEXT 2009
      - Authors: Albert Greenberg…; Albert Greenberg, Changhoon Kim…; Changhoon Kim…; M. Al-Fares…; R.N. Mysore…; J. Mudigonda, M. Al-Fares…; M. Scott…; Radia Perlman; C. Guo…; C. Guo…; H. Wu, C. Guo…
      - Topology: Clos Network; Clos Network; N/A; Fat-Tree; Fat-Tree; N/A; N/A; N/A; BCube Topology
      - Packetizing: MAC-in-MAC (802.1ah PBB); IP-in-IP; IP-in-IP(?); IP rewriting; MAC rewriting (PMAC); MAC rewriting; TRILL Hdr
      - Load Spreading: MAC-Rotation; ECMP; ECMP; ECMP; ECMP
      - Multi-path: O; O; X; O; O; O; X; O
      - Mod. of End-Host?: O; O; X; X; X; O; X; X; O
      - Mod. of switches?: O; X; O; O (Special HW); O (Special HW); X; O (RBridge); △
      - ARP: Directory Server; Directory Server; DHT on the switches; Fabric Manager; ESADI
  40. Traffic Engineering is …
  41. Thank you.