
DevoFlow - Scaling Flow Management for High-Performance Networks

Published by the Internet Research Lab at NTU, Taiwan.


Transcript of "DevoFlow - Scaling Flow Management for High-Performance Networks"

1. DevoFlow: Scaling Flow Management for High-Performance Networks
Andrew R. Curtis (University of Waterloo); Jeffrey C. Mogul, Jean Tourrilhes, Praveen Yalagandula, Puneet Sharma, Sujata Banerjee (HP Labs), SIGCOMM 2011
Presenter: Jason, Tsung-Cheng Hou; Advisor: Wanjiun Liao; Mar. 22nd, 2012
2. Motivation
• SDN / OpenFlow can enable per-flow management... however:
• What are the costs and limitations?
• Does a network-wide logical graph require always collecting stats for all flows?
• Are there problems beyond the controller's scalability?
• Does enhancing controller performance / scalability solve all problems?
3. DevoFlow Contributions
• Characterize overheads of implementing OpenFlow on switches
• Evaluate flow mgmt capability within a data center network environment
• Propose DevoFlow to enable scalable flow mgmt by balancing:
  – Network control
  – Statistics collection
  – Overheads
  – Switch functions and controller loads
4. Agenda
• OF Benefits, Bottlenecks, and Dilemmas
• Evaluation of Overheads
• DevoFlow
• Simulation Results
5. Benefits
• Flexible policies w/o switch-by-switch config.
• Network graph and visibility, stats collection
• Enables traffic engineering and network mgmt
• OpenFlow switches are relatively simple
• Accelerates innovation:
  – VL2, PortLand: new architectures, virtualized addressing
  – Hedera: flow scheduling
  – ElasticTree: energy-proportional networking
• However, no thorough estimate of the overheads
6. Bottlenecks
• Root cause: OpenFlow excessively couples central control and complete visibility
• Controller bottleneck: can be scaled out with a distributed system
• Switch bottleneck:
  – Limited BW between data- and control-plane
  – Enormous flow tables, too many entries
  – Control and stats pkts compete for BW
  – Introduces extra delays and latencies
• The switch bottleneck has not been well studied
7. Dilemma
• Control dilemma:
  – Role of controller: visibility and mgmt capability; however, per-flow setup is too costly
  – Wildcard / hash-based flow matching: much less load, but no effective control
• Statistics-gathering dilemma:
  – Pull-based mechanism: counters of all flows → full visibility, but demands high BW
  – Wildcard counter aggregation: far fewer entries, but loses track of elephant flows
• DevoFlow aims to strike a balance in between
8. Main Concept of DevoFlow
• Devolve most flow control to switches
• Maintain partial visibility
• Keep track of significant flows
• Default vs. special actions:
  – Security-sensitive flows: categorically inspect
  – Normal flows: may evolve or cover other flows → become security-sensitive or significant
  – Significant flows: special attention
• Collect stats by sampling, triggering, and approximating
9. Design Principles of DevoFlow
• Stay in the data-plane by default
• Provide enough visibility:
  – Esp. for significant flows & security-sensitive flows
  – Otherwise, aggregate or approximate stats
• Maintain the simplicity of switches
10. Agenda
• OF Benefits, Bottlenecks, and Dilemmas
• Evaluation of Overheads
• DevoFlow
• Simulation Results
11. Overheads: Control Packets (figure: an N-switch path)
For a path with N switches, setup takes N+1 control pkts:
• First pkt of the flow goes to the controller
• N control messages, one to each switch on the path
• Average length of a flow in 1997 traces: 20 pkts
• In a Clos / fat-tree DCN topology: 5 switches → 6 control pkts per flow
→ The smaller the flow, the higher the relative BW cost
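A back-of-the-envelope check of this cost (a sketch: the 5-switch path and 20-pkt flows come from the slide; the ~100 B control-pkt and 1500 B data-pkt sizes are assumptions, not from the paper):

```python
# Relative control overhead of per-flow setup on an N-switch path (sketch).
# Packet sizes below are illustrative assumptions, not from the paper.
N = 5                          # switches on a Clos / fat-tree path
control_pkts = N + 1           # first pkt to controller + N flow-mods
flow_pkts = 20                 # average flow length (1997 traces)
ctrl_bytes = control_pkts * 100    # assume ~100 B per control message
data_bytes = flow_pkts * 1500      # assume full-size data pkts
print(f"{control_pkts} control pkts/flow; "
      f"~{100 * ctrl_bytes / data_bytes:.0f}% control-byte overhead")
```

Even under these generous assumptions, a 20-pkt flow pays a few percent of its own volume in control traffic, and shorter flows pay proportionally more.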
12. Overheads: Flow Setup
• Switch has finite BW between data- and control-plane, i.e. overhead between ASIC and CPU
• Measured setup capability: 275~300 flows/sec, similar to [30]
• In data centers: mean flow interarrival per server is 30 ms
• Rack w/ 40 servers → ~1300 flows/sec
• In the whole data center: far more [43]
[43] R. Sherwood, G. Gibb, K.-K. Yap, G. Appenzeller, M. Casado, N. McKeown, and G. Parulkar. Can the production network be the testbed? In OSDI, 2010.
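The mismatch between these numbers (all taken from the slide) is easy to check:

```python
# Offered flow-setup load vs. measured switch capability (numbers from
# the slides; the overload ratio is simply their quotient).
mean_interarrival_s = 0.030        # per-server mean flow interarrival
servers_per_rack = 40
offered = servers_per_rack / mean_interarrival_s    # ~1333 flows/sec
capacity = 300                     # measured 275~300 flow setups/sec
print(f"~{offered:.0f} new flows/s per rack vs. ~{capacity}/s per switch "
      f"→ {offered / capacity:.1f}x over capacity")
```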
13. Overheads: Flow Setup
Experiment: a single switch (figure)
14. Overheads: Flow Setup (figure)
ASIC switching rate; latency: ~5 µs
15. Overheads: Flow Setup (figure)
ASIC → CPU; latency: ~0.5 ms
16. Overheads: Flow Setup (figure)
CPU → Controller; latency: ~2 ms
A huge waste of resources!
17. Overheads: Gathering Stats
• [30]: even most of the longest-lived flows last only a few sec
• Counters: (pkts, bytes, duration)
• Push-based: sent to the controller when a flow ends
• Pull-based: fetched actively by the controller
• 88F bytes for F flows
• In the 5406zl switch: 1.5K wildcard-match + 13K exact-match entries
  → ~1.3 MB in total per fetch; 2 fetches/sec → 17 Mbps
  → Not fast enough, and consumes a lot of BW!
[30] S. Kandula, S. Sengupta, A. Greenberg, and P. Patel. The Nature of Datacenter Traffic: Measurements & Analysis. In Proc. IMC, 2009.
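The arithmetic behind those figures (the 88 B per entry and the table sizes are from the slide; the exact accounting behind the quoted 17 Mbps is not spelled out, so this sketch only reproduces the order of magnitude):

```python
# Pull-based stats bandwidth estimate (sketch; see caveat above).
bytes_per_entry = 88               # per-flow counter record size
entries = 13_000 + 1_500           # exact-match + wildcard entries (5406zl)
pull_bytes = bytes_per_entry * entries          # ~1.3 MB per full pull
pulls_per_sec = 2
mbps = pull_bytes * pulls_per_sec * 8 / 1e6     # ~20 Mbps, order of 17 Mbps
print(f"{pull_bytes / 1e6:.2f} MB per pull → ~{mbps:.0f} Mbps sustained")
```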
18. Overheads: Gathering Stats (figure)
• 2.5 sec to pull 13K entries
• 1 sec to pull 5,600 entries
• 0.5 sec to pull 3,200 entries
19. Overheads: Gathering Stats
• Per-flow setup generates too many entries
• More entries → longer for the controller to fetch
• Longer fetches → longer control loop
• Hedera: 5-sec control loop, BUT its workload is too ideal (Pareto distribution)
• On the VL2 workload, a 5-sec loop only improves 1~5% over ECMP
• [41]: the control loop must be under 0.5 sec to do better
[41] C. Raiciu, C. Pluntke, S. Barre, A. Greenhalgh, D. Wischik, and M. Handley. Data center networking with multipath TCP. In HotNets, 2010.
20. Overheads: Competition
• Flow setups and stat-pulling compete for control-plane BW
• Yet scheduling needs timely stats
• Switch flow entries:
  – OpenFlow wildcard rules live in TCAMs, consuming lots of power & space
  – Rules match 10 header fields, 288 bits each
  – vs. only 60 bits for traditional Ethernet forwarding
• Per-flow entries vs. per-host entries
21. Overheads: Competition (figure)
22. Agenda
• OF Benefits, Bottlenecks, and Dilemmas
• Evaluation of Overheads
• DevoFlow
• Simulation Results
23. Mechanisms
• Control:
  – Rule cloning
  – Local actions
• Statistics-gathering:
  – Sampling
  – Triggers and reports
  – Approximate counters
• Flow scheduler: like Hedera
• Multipath routing: based on a probability distribution → enables oblivious routing
24.-26. Rule Cloning (animated over three slides)
• The ASIC clones a wildcard rule into an exact-match rule for each new microflow
• The clone's timeout and output port are set locally, e.g. the port drawn from a probability distribution
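A minimal software sketch of this logic (all names here are illustrative; the real mechanism runs in the switch ASIC, not in Python):

```python
import random

# Sketch of DevoFlow rule cloning: a wildcard rule flagged CLONE spawns an
# exact-match rule for each new microflow, picking the output port locally
# from a probability distribution (multipath), with no controller round trip.
class WildcardRule:
    def __init__(self, port_weights, timeout):
        self.port_weights = port_weights   # {out_port: selection weight}
        self.timeout = timeout             # inherited by clones
        self.clone = True                  # CLONE flag set

exact_table = {}   # 5-tuple -> (out_port, timeout), the cloned rules

def on_packet(five_tuple, rule):
    """Data-plane lookup: clone on the first packet, then stay in hardware."""
    if five_tuple in exact_table:
        return exact_table[five_tuple][0]       # existing microflow rule
    ports, weights = zip(*rule.port_weights.items())
    port = random.choices(ports, weights=weights)[0]
    if rule.clone:                              # devolved to the switch
        exact_table[five_tuple] = (port, rule.timeout)
    return port
```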
27. Local Actions
• Rapid re-routing: fallback paths predefined → recover almost immediately
• Multipath support: output port chosen from a probability distribution, adjusted by link capacity or load
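For the multipath piece, the probability distribution might be derived from link state like this (a sketch; normalizing by residual capacity is an assumption, the paper only says the weights are adjusted by capacity or load):

```python
# Turn residual uplink capacities into a port-selection distribution (sketch).
def port_distribution(residual_bps: dict[int, float]) -> dict[int, float]:
    total = sum(residual_bps.values())
    if total == 0:                        # all links full: fall back to equal
        return {p: 1 / len(residual_bps) for p in residual_bps}
    return {p: c / total for p, c in residual_bps.items()}

# Example: two 10G uplinks, one half-loaded -> it attracts 1/3 of new flows.
print(port_distribution({1: 5e9, 2: 10e9}))   # {1: 0.33..., 2: 0.66...}
```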
28. Statistics-Gathering
• Sampling:
  – Pkt headers sent to the controller with 1/1000 probability
• Triggers and reports:
  – Set a threshold per rule
  – When a counter exceeds it, trigger flow setup at the controller
• Approximate counters:
  – Maintain a list of the top-k largest flows
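A sketch of the trigger mechanism (the 128 KB threshold is an illustrative assumption, not the paper's value):

```python
# Per-rule trigger sketch: when a flow's byte counter crosses a threshold,
# the switch reports it so the controller can re-route the elephant.
ELEPHANT_THRESHOLD_BYTES = 128 * 1024   # illustrative threshold (assumed)

byte_counters: dict[str, int] = {}
reported: set[str] = set()

def on_counted_packet(flow_id: str, pkt_len: int, report) -> None:
    """Update the rule's counter; fire the trigger once per flow."""
    if flow_id in reported:
        return
    byte_counters[flow_id] = byte_counters.get(flow_id, 0) + pkt_len
    if byte_counters[flow_id] >= ELEPHANT_THRESHOLD_BYTES:
        reported.add(flow_id)
        report(flow_id)       # report: the controller takes over this flow
```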
29. Implementation
• Not yet implemented in hardware
• Switch engineers confirm most mechanisms can be built from existing functional blocks
• Provides some basic tools for SDN
• However, tuning remains a problem: what threshold? How to sample? At what rate?
• Default multipath routing runs on the switches
• The controller samples or sets triggers to detect elephants, then schedules them with a bin-packing algorithm
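One plausible reading of that scheduler is greedy first-fit bin packing, sketched below (the data structures and names are assumptions, not the paper's implementation):

```python
# Greedy first-fit placement of detected elephant flows onto paths (sketch).
def schedule_elephants(elephants, candidate_paths, residual):
    """elephants: list of (flow_id, src_dst, demand_bps).
    candidate_paths: src_dst -> list of paths (each a list of links).
    residual: link -> remaining capacity in bps (mutated as flows are placed).
    """
    placement = {}
    # Pack the biggest flows first, a common bin-packing heuristic.
    for flow_id, src_dst, demand in sorted(elephants, key=lambda e: -e[2]):
        for path in candidate_paths[src_dst]:
            if all(residual[link] >= demand for link in path):
                for link in path:
                    residual[link] -= demand   # reserve capacity
                placement[flow_id] = path
                break                          # first fit wins
    return placement
```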
30. Simulation
• How much flow-scheduling overhead can be reduced while still achieving high performance?
• Custom-built flow-level simulator, based on the 5406zl experiments
• Workloads generated:
  – Reverse-engineered from [30] (MSR measurements, 1500 servers)
  – MapReduce shuffle stage: 128 MB from each server to every other
  – A combination of the two
[30] S. Kandula, S. Sengupta, A. Greenberg, and P. Patel. The Nature of Datacenter Traffic: Measurements & Analysis. In Proc. IMC, 2009.
31. Simulation (figure)
32. Agenda
• OF Benefits, Bottlenecks, and Dilemmas
• Evaluation of Overheads
• DevoFlow
• Simulation Results
33.-34. Simulation Results: Clos Topology (figures)
35.-36. Simulation Results: HyperX Topology (figures)
37.-41. Simulation Results (figures)
42. Conclusion
• Per-flow control imposes too much overhead
• Balance between:
  – Overheads and network visibility
  – Effective traffic engineering / network mgmt
  → Could lead to various lines of research
• Switches have limited resources:
  – Flow entries / control-plane BW
  – Hardware capability / power consumption