Valiant Load Balancing and Traffic Oblivious Routing
Theoretical Foundation for Valiant Load Balancing and Traffic Oblivious Routing 侯宗成, Oct. 13th, 2011• A. Greenberg et al., “VL2: A Scalable and Flexible Data Center Network”, ACM SIGCOMM 2009.• M. Kodialam, T. V. Lakshman, S. Sengupta, “Efficient and Robust Routing of Highly Variable Traffic”, HotNets, 2004.• James Roberts, “Public Reviews of Papers Appearing at HotNets-III”, HotNets, 2004.
Outline• Valiant Load Balancing in VL2• Background• Proposed Routing Scheme• Further Ideas
Outline• Valiant Load Balancing in VL2 – Goals and Building Blocks of VL2 – Spreading for Uniform High Capacity – Randomization for Volatility – References of VLB• Background• Proposed Routing Scheme• Further Ideas
Goals and Building Blocks of VL2• Current designs prevent agility – Poor server-server capacity: Oversubscription – Poor utilization: Fragmentation of resources – Poor reliability: Routing & computing deadlocks• Goals: Scalable, flexible, and agile DC – Uniform High Capacity – Performance Isolation – Layer-2 Semantics
Goals and Building Blocks of VL2• Supporting Infrastructure – Directory System / Address Mapping• Key Innovation – Application and Location Addresses• Major Application of an Innovation – VLB and ECMP• Infrastructure – Clos Topology
Goals and Building Blocks of VL2• [Figure: building blocks (Supporting, Innovation, Application, Infrastructure) mapped to goals]
Spreading for Uniform High Capacity• ECMP: spreads traffic among equal-cost paths at a node• VLB: spreads traffic among intermediate nodes across the entire network• Implement VLB by spreading traffic so that it bounces off several core switches• Hot-spot free: encapsulation and an anycast address for the core switches• No centralized engineering – Seemingly contradictory to OpenFlow – Discussed under Further Ideas
Randomization for Volatility• Destination-independent traffic spreading• Randomly chosen intermediate switches• Traffic spreading ratios are uniform• Edge constraints hold – theoretical model provided later• Shim layer agent: enables path control by adjusting randomization• Claims no problem when elephant flows occur: an area where OpenFlow can contribute
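The destination-independent spreading above can be illustrated with a minimal sketch. It assumes a hash of the flow 5-tuple stands in for the random choice of intermediate switch, so that all packets of one flow take the same path; the switch names, the synthetic flows, and the function `pick_intermediate` are hypothetical and are not taken from VL2, which relies on encapsulation, anycast addresses, and ECMP.

```python
import hashlib
from collections import Counter

def pick_intermediate(flow, intermediates):
    """Map a flow 5-tuple to one intermediate switch.

    The choice ignores where the traffic is going in the topology: it only
    hashes the flow identifier, so the same flow always gets the same switch.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(intermediates)
    return intermediates[index]

# Hypothetical core switches and synthetic flows:
# (src_ip, dst_ip, protocol, src_port, dst_port)
cores = ["core-1", "core-2", "core-3", "core-4"]
flows = [("10.0.0.%d" % (i % 250), "10.0.1.7", 6, 1024 + i, 80)
         for i in range(4000)]

# Count how many flows land on each core switch.
counts = Counter(pick_intermediate(f, cores) for f in flows)
for core in cores:
    print(core, counts[core])
```

With many flows the counts per core switch come out close to uniform, which is the uniform spreading ratio the slide refers to.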
References of VLB• Specific example, VLB: – R. Zhang-Shen and N. McKeown, “Designing a Predictable Internet Backbone Network”, HotNets-III, November 2004.• General case, Traffic Oblivious Routing: – M. Kodialam, T. V. Lakshman, S. Sengupta, “Efficient and Robust Routing of Highly Variable Traffic”, HotNets, 2004.• Both met at the Stanford Workshop on Load-Balancing, May 2004.• R. Zhang-Shen: student of McKeown (Ph.D.) and Rexford (post-doc), now at Google.• Sengupta: one of the authors of VL2, now at Microsoft Research.• Early work by Valiant, for processor interconnection networks, 1981.
Outline• Valiant Load Balancing in VL2• Background – Original Motivation in 2004 – Traditional Approach – Multi-Commodity Flow Problem – Preferred Routing Characteristics – Similarities with Data Center Network• Proposed Routing Scheme• Further Ideas
Original Motivation in 2004• Targeted Internet backbones, ISPs, VPN services, and Autonomous Systems.• Also applicable to any scenario with: – Extreme traffic variations – An unknown traffic matrix with no pattern• Data center networks (DCN) were not considered at the time.• VL2 later found the scheme well suited to DCN.
Traditional approach• Assume we know the matrix of demands between pairs of ingress/egress routers• Network design can be formulated as a multi-commodity flow problem• Routing and capacity are selected to: – optimize objective functions – while satisfying constraints.• For example: IP shortest path routing – implies that each demand is carried over a single path with the fewest hops or the lowest delay.
Preferred Routing Characteristics• Handles unpredictable traffic while maintaining good service• Minimizes overprovisioning• Mostly static routing, without dynamic adjustments and complex mechanisms
Similarities with Data Center Network• Traffic is unpredictable and highly variable• Mostly static routing relieves workload• Link bandwidth is a critical resource• The DCN core works similarly to a backbone network
Briefing• View the Internet backbone as fully meshed• N nodes, with inter-node links realized as tunnels• Traffic T_ij from node i to node j is routed through an intermediate node k: tunnel i→k→j• Traffic is split over all possible two-hop routes• Including the degenerate routes i→i→j and i→j→j
Briefing• Splitting can be performed at the flow level with a hash function, or at the packet level with resequencing• Tunnels need to be sized to accommodate all possible traffic matrices• The only constraint: an upper bound on the total amount of incoming and outgoing capacity at each node.
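The two-hop scheme above can be checked numerically with a small sketch. It assumes every node splits its traffic over the intermediates with the same fractions `alpha` (uniform here), and it treats the first or second hop as local when the intermediate coincides with the source or destination; the function names and the two example matrices are illustrative assumptions, not taken from the papers.

```python
def shift_matrix(n, s):
    """Traffic matrix where node i sends one unit to node (i + s) % n."""
    return [[1.0 if j == (i + s) % n else 0.0 for j in range(n)]
            for i in range(n)]

def tunnel_loads(traffic, alpha):
    """Load on each tunnel a->b when every demand T[i][j] is split over
    all two-hop routes i->k->j, sending fraction alpha[k] via node k."""
    n = len(traffic)
    load = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                share = alpha[k] * traffic[i][j]
                if i != k:
                    load[i][k] += share  # Phase 1: source to intermediate
                if k != j:
                    load[k][j] += share  # Phase 2: intermediate to destination
    return load

n = 4
alpha = [1.0 / n] * n  # uniform split ratios (an assumption for this sketch)

# Two different traffic matrices with identical row and column sums (all 1).
loads_a = tunnel_loads(shift_matrix(n, 1), alpha)
loads_b = tunnel_loads(shift_matrix(n, 2), alpha)

print(loads_a[0])
print(loads_b[0])
```

Both matrices put 0.5 units on every tunnel, even though their individual entries differ: the tunnel load depends only on the row and column sums and on the split ratios.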
Modeling Traffic Variability• A very demanding condition: every node i is at full capacity, with row sum R_i and column sum C_i.• If we can route any matrix in T(R,C), we can also route any matrix with smaller row and column sums.• Hence any demand with nodes below full capacity can be routed.
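The definition of T(R,C) did not survive on this slide. A sketch of the set it appears to refer to, assuming t_ij denotes the traffic from node i to node j, R_i the bound on total traffic entering the network at node i, and C_i the bound on total traffic leaving at node i:

```latex
\[
\mathcal{T}(R, C) \;=\;
\Bigl\{\, [t_{ij}] \;:\;
  \sum_{j \ne i} t_{ij} = R_i
  \ \text{ and } \
  \sum_{j \ne i} t_{ji} = C_i
  \quad \forall i \,\Bigr\}
\]
```

Under this reading, every matrix in the set has all nodes at full capacity, which is why it is the hardest case to route.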
Traffic Oblivious Routing• The scheme is implemented by: – Forming fixed-bandwidth tunnels between nodes. – These are referred to as Phase 1 and Phase 2 tunnels.• The bandwidth required for the tunnels depends only on the R and C values.• It does not depend on the unknown individual entries of the varying traffic matrix.• Tunnel demand is modeled next.
Traffic Oblivious Routing• Property 1: Routing is oblivious to traffic variations.• Property 2: Provisioned capacity is traffic matrix independent.• Property 3: Complete utilization of provisioned capacity.
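A short derivation shows why the tunnel bandwidth depends only on R and C. It assumes each node sends a fraction α_k of its traffic through intermediate node k, with the fractions summing to 1; the symbols α_k, t_ij, and d_ij are notation chosen for this sketch.

```latex
\begin{align*}
\text{Phase 1 } (i \to k):\quad
  & \sum_{j} \alpha_k\, t_{ij} \;=\; \alpha_k R_i \\
\text{Phase 2 } (k \to j):\quad
  & \sum_{i} \alpha_k\, t_{ij} \;=\; \alpha_k C_j \\
\text{Tunnel } i \to j \text{ in total}:\quad
  & d_{ij} \;=\; \alpha_j R_i + \alpha_i C_j
\end{align*}
```

The tunnel from i to j carries Phase 1 traffic for which j is the intermediate node and Phase 2 traffic for which i is the intermediate node, so no individual entry t_ij appears in d_ij.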
Traffic Oblivious Routing• Does not make any assumptions about T, apart from row and column sum bounds.• Does not require the network to detect changes in traffic.• Handles variability in the traffic matrix set by effectively routing a transformed matrix.• Depends only on row/column sum bounds and traffic distribution ratios.• Not on a specific matrix.
Traffic Oblivious Routing• [Linear program; the equations are not recoverable, only their labels:] – Minimize link capacities – Flow conservation – Demand satisfaction – Within hardware capacity – Distribution ratios
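The equations themselves are missing from this slide. One plausible form of such a linear program is sketched below; it is an assumption built from the surviving labels, not the formulation from the paper. Here x^{ij}_e is the flow of tunnel i→j on link e, u_e is the capacity provisioned on link e, ū_e is a hypothetical hardware limit, δ⁺(v) and δ⁻(v) are the links leaving and entering node v, and α_k are the distribution ratios.

```latex
\begin{align*}
\min \quad & \sum_{e \in E} u_e
  && \text{(minimize link capacities)} \\
\text{s.t.} \quad
& \sum_{e \in \delta^{+}(v)} x^{ij}_e - \sum_{e \in \delta^{-}(v)} x^{ij}_e =
  \begin{cases}
    d_{ij}  & v = i \\
    -d_{ij} & v = j \\
    0       & \text{otherwise}
  \end{cases}
  && \forall v,\ \forall i \ne j \quad \text{(flow conservation)} \\
& d_{ij} = \alpha_j R_i + \alpha_i C_j
  && \forall i \ne j \quad \text{(demand satisfaction)} \\
& \sum_{i \ne j} x^{ij}_e \le u_e \le \bar{u}_e
  && \forall e \in E \quad \text{(within hardware capacity)} \\
& \sum_{k} \alpha_k = 1, \quad \alpha_k \ge 0, \quad x^{ij}_e \ge 0
  && \text{(distribution ratios)}
\end{align*}
```

Because d_ij is linear in α, the program stays linear even though the ratios are decision variables.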
Capacity Effectiveness• The paper reports results without details.• Consider a network with 20 nodes and 33 bidirectional links, representing a US backbone.• The Ri’s and Ci’s are equal and normalized to 1.• Node capacities are identical and equal to uR.• Below a minimum uR, routing is infeasible.• Lowest feasible uR = 2.595.• At uR = 2.8, bandwidth efficiency is 94%.
Key Knowledge Gained• Violating edge constraints: the root of all network deadlocks in DCN.• Edge and network problems can be separated.• Edge: how to ensure capacity constraints are not violated?• Network: how to balance loads and separate services?
Further Ideas• Traffic-Oblivious Routing – Routing localized to switches – Centralized or distributed computation of split ratios – Needs further research• OpenFlow Controllers and Switches – Good for planning elephant flows – Should be combined with traffic-oblivious and randomized distributed routing – Randomization vs. Dictation: seemingly contradictory
How to adopt both concepts and implement them in one scheme?• Depends on flow types and scenarios. When switches can handle the routine work, leave only important and critical tasks to controllers.• Prevent edges from being overloaded. – Design and placement of tenants and hosts. – Policies of edge switches, soft or hard.
When do systems initiate dictation / randomization?• For major controllers – When critical tasks or situations occur. – What are critical tasks?• For switches / secondary controllers – Reconfigure distribution ratios when the environment changes. – How to reconfigure?• When the logical topology / link capacities change – Switches start to reconfigure. – How to define a logical change?
What are the relations between controllers and switches?• Controllers plan resource allocation and routing when elephant flows or critical situations occur.• Switches use the resources left by controllers and optimize the distribution of the remaining traffic.• Balance load between controllers and switches.
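The division of labor above can be sketched as a simple dispatch rule. This is only an illustration of the idea: the size threshold `ELEPHANT_BYTES`, the function `route_flow`, and the switch names are hypothetical, and a real design would also need the critical-situation triggers that the slides leave open.

```python
import random

# Hypothetical threshold separating elephant flows from ordinary flows.
ELEPHANT_BYTES = 10 * 1024 * 1024

def route_flow(flow_bytes, intermediates, controller_path, rng=random):
    """Decide how a flow is routed.

    Elephant flows follow a path dictated by the controller, if one exists.
    All other flows are spread by picking a random intermediate switch.
    """
    if flow_bytes >= ELEPHANT_BYTES and controller_path is not None:
        return ("dictated", controller_path)
    return ("randomized", rng.choice(intermediates))

cores = ["core-1", "core-2", "core-3", "core-4"]

# A 50 MB flow with a controller-planned path, and a 1 KB flow without one.
print(route_flow(50 * 1024 * 1024, cores, "core-2"))
print(route_flow(1000, cores, None))
```

Falling back to randomization when the controller has no plan keeps the switches able to forward traffic on their own, which matches the goal of leaving only critical tasks to controllers.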