This document summarizes the DevoFlow paper, which proposes techniques to scale flow management for high-performance networks. It finds that per-flow management in OpenFlow introduces high overheads. DevoFlow aims to balance network control, statistics collection, and switch overhead by devolving most flow control to switches while maintaining partial visibility of significant flows. Simulation results show DevoFlow can reduce flow scheduling overheads compared to per-flow control, while still achieving high performance.
DevoFlow - Scaling Flow Management for High-Performance Networks
1. DevoFlow: Scaling Flow Management for High-Performance Networks
Andrew R. Curtis (University of Waterloo); Jeffrey C. Mogul, Jean Tourrilhes, Praveen Yalagandula, Puneet Sharma, Sujata Banerjee (HP Labs), SIGCOMM 2011
Presenter: Jason, Tsung-Cheng, HOU
Advisor: Wanjiun Liao
Mar. 22nd, 2012
2. Motivation
• SDN / OpenFlow can enable per-flow management… however…
• What are the costs and limitations?
• Does a network-wide logical graph mean always collecting every flow's stats?
• Are there problems beyond controller scalability?
• Does enhancing controller performance / scalability solve all problems?
3. DevoFlow Contributions
• Characterize the overheads of implementing OpenFlow on switches
• Evaluate flow management capability within a data center network environment
• Propose DevoFlow to enable scalable flow management by balancing
– Network control
– Statistics collection
– Overheads
– Switch functions and controller loads
4. Agenda
• OF Benefits, Bottlenecks, and Dilemmas
• Evaluation of Overheads
• DevoFlow
• Simulation Results
5. Benefits
• Flexible policies without switch-by-switch configuration
• Network graph and visibility, statistics collection
• Enables traffic engineering and network management
• OpenFlow switches are relatively simple
• Accelerates innovation:
– VL2, PortLand: new architectures, virtualized addressing
– Hedera: flow scheduling
– ElasticTree: energy-proportional networking
• However, no further estimation of the overheads
6. Bottlenecks
• Root cause: excessively couples central control and complete visibility
• Controller bottleneck: can be scaled out with distributed systems
• Switch bottleneck:
– Limited bandwidth between the data and control planes
– Enormous flow tables, too many entries
– Control and statistics packets compete for bandwidth
– Introduces extra delays and latencies
• The switch bottleneck was not well studied
7. Dilemma
• Control dilemma:
– Role of the controller: visibility and management capability; however, per-flow setup is too costly
– Wildcard or hash-based flow matching: much less load, but no effective control
• Statistics-gathering dilemma:
– Pull-based mechanism: counters for all flows give full visibility but demand high bandwidth
– Wildcard counter aggregation: far fewer entries, but loses track of elephant flows
• Aim to strike a balance in between
8. Main Concept of DevoFlow
• Devolve most flow control to switches
• Maintain partial visibility
• Keep track of significant flows
• Default vs. special actions:
– Security-sensitive flows: categorically inspected
– Normal flows: may evolve into, or cover, flows that become security-sensitive or significant
– Significant flows: given special attention
• Collect statistics by sampling, triggering, and approximating
9. Design Principles of DevoFlow
• Try to stay in the data plane by default
• Provide enough visibility:
– Especially for significant and security-sensitive flows
– Otherwise, aggregate or approximate statistics
• Maintain the simplicity of switches
10. Agenda
• OF Benefits, Bottlenecks, and Dilemmas
• Evaluation of Overheads
• DevoFlow
• Simulation Results
11. Overheads: Control Packets
An N-switch path
For a path with N switches: N+1 control packets
• The first packet of the flow goes to the controller
• N control messages, one to each switch
Average length of a flow in 1997: 20 packets
In a Clos / fat-tree DCN topology, a path crosses 5 switches → 6 control packets per flow
The smaller the flow, the higher the relative bandwidth cost
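A back-of-the-envelope sketch of that bandwidth cost, assuming ~128-byte control messages and 1500-byte data packets (illustrative values, not figures from the slide):

```python
# Rough per-flow control overhead for the slide's example path (assumed sizes).
def control_overhead(path_switches=5, flow_pkts=20,
                     ctrl_pkt_bytes=128, data_pkt_bytes=1500):
    ctrl_pkts = path_switches + 1   # first packet to the controller + one flow-mod per switch
    ratio = (ctrl_pkts * ctrl_pkt_bytes) / (flow_pkts * data_pkt_bytes)
    return ctrl_pkts, ratio

pkts, ratio = control_overhead()
print(f"{pkts} control packets, ~{ratio:.1%} extra bytes for a 20-packet flow")
# A 2-packet flow with the same setup cost would pay roughly 25% extra bytes.
```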
12. Overheads: Flow Setup
• Switches have finite bandwidth between the data and control planes, i.e. overhead between the ASIC and the CPU
• Setup capability: 275~300 flows/sec
• Similar to [30]
• In a data center: mean flow interarrival per server is 30 ms
• A rack with 40 servers → ~1,300 flow setups/sec
• Across the whole data center: far higher still
[43] R. Sherwood, G. Gibb, K.-K. Yap, G. Appenzeller, M. Casado, N. McKeown, and G. Parulkar. Can the production network be the testbed? In OSDI, 2010.
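The rack-level arithmetic behind those numbers, as a quick sketch (the 30 ms interarrival and 275~300 setups/sec figures are the ones quoted above):

```python
# Flow-setup demand of one rack vs. a switch's measured setup capacity.
servers_per_rack = 40
mean_interarrival_s = 0.030                 # per server, from the slide
switch_setup_capacity = 300                 # flow setups/sec (upper end of 275~300)

rack_demand = servers_per_rack / mean_interarrival_s   # ≈ 1333 setups/sec
print(f"rack demand ≈ {rack_demand:.0f} setups/s, "
      f"≈ {rack_demand / switch_setup_capacity:.1f}x the switch's setup capacity")
```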
17. Overheads: Gathering Stats
• Per [30], even most long-lived flows last only a few seconds
• Counters: (packets, bytes, duration)
• Push-based: sent to the controller when a flow ends
• Pull-based: fetched actively by the controller
• 88F bytes for F flows
• In the 5406zl switch:
Entries: 1.5K wildcard-match / 13K exact-match
→ total ~1.3 MB per pull; at 2 fetches/sec, ~17 Mbps
Not fast enough, and it consumes a lot of bandwidth!
[30] S. Kandula, S. Sengupta, A. Greenberg, and P. Patel. The Nature of Datacenter Traffic: Measurements & Analysis. In Proc. IMC, 2009.
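A sketch of the pull-bandwidth arithmetic; the simple product below lands near, though not exactly at, the slide's ~17 Mbps figure, so treat the constants as approximations:

```python
# Control-channel load of pulling every counter from a full 5406zl table.
entries = 1_500 + 13_000          # wildcard + exact-match entries
bytes_per_entry = 88              # counter record size quoted on the slide
fetches_per_sec = 2

per_pull_mb = entries * bytes_per_entry / 1e6                  # ≈ 1.3 MB
load_mbps = entries * bytes_per_entry * 8 * fetches_per_sec / 1e6
print(f"≈ {per_pull_mb:.1f} MB per pull, ≈ {load_mbps:.0f} Mbps of statistics traffic")
```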
18. Overheads: Gathering Stats
• 2.5 sec to pull 13K entries
• 1 sec to pull 5,600 entries
• 0.5 sec to pull 3,200 entries
19. Overheads: Gathering Stats
• Per-flow setup generates too many table entries
• The more entries, the longer each pull by the controller takes
• The longer the pull, the longer the control loop
• Hedera used a 5-second control loop, but its workload was idealized (Pareto distribution)
• On the VL2 workload, a 5-second loop improves only 1~5% over ECMP
• Per [41], the loop must be shorter than 0.5 sec to do better
[41] C. Raiciu, C. Pluntke, S. Barre, A. Greenhalgh, D. Wischik, and M. Handley. Data center networking with multipath TCP. In HotNets, 2010.
20. Overheads: Competition
• Flow setups and stat pulls compete for bandwidth
• Timely stats are needed for scheduling
• Switch flow entries:
– OpenFlow: wildcard rules in TCAMs consume a lot of power and space
– Rules: 10 header fields, 288 bits each
– Only 60 bits for traditional Ethernet forwarding
• Per-flow entries vs. per-host entries
22. Agenda
• OF Benefits, Bottlenecks, and Dilemmas
• Evaluation of Overheads
• DevoFlow
• Simulation Results
23. Mechanisms
• Control
– Rule cloning
– Local actions
• Statistics-gathering
– Sampling
– Triggers and reports
– Approximate counters
• Flow scheduler: like Hedera
• Multipath routing: based on a probability distribution, enabling oblivious routing
24. Rule Cloning
• The ASIC clones a wildcard rule into an exact-match rule for each new microflow
• Timeout or output port selected by probability
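A minimal sketch of what rule cloning does, assuming a simple software model of the tables (the class, field names, and clone flag are illustrative, not the switch's actual data structures):

```python
import random

class WildcardRule:
    def __init__(self, match_fn, out_ports, clone=True, idle_timeout_s=5):
        self.match_fn = match_fn        # predicate over the flow's 5-tuple
        self.out_ports = out_ports      # {port: probability} for multipath
        self.clone = clone              # clone flag: devolve this rule to the ASIC
        self.idle_timeout_s = idle_timeout_s

exact_table = {}                        # 5-tuple -> (out_port, idle_timeout_s)

def forward(five_tuple, wildcard_rules):
    if five_tuple in exact_table:       # microflow already has its own cloned rule
        return exact_table[five_tuple][0]
    for rule in wildcard_rules:
        if rule.match_fn(five_tuple):
            ports, weights = zip(*rule.out_ports.items())
            port = random.choices(ports, weights)[0]   # pick a path once per microflow
            if rule.clone:              # install an exact-match clone locally, no controller round trip
                exact_table[five_tuple] = (port, rule.idle_timeout_s)
            return port
    return None                         # table miss: would go to the controller
```

The point of the clone is that it pins the microflow to one output port, so later packets of the same flow are not re-balanced onto a different path.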
27. Local Actions
• Rapid re-routing: fallback paths are predefined, so recovery is almost immediate
• Multipath support: based on a probability distribution, adjusted by link capacity or load
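One obvious way to derive those multipath probabilities from link capacities, as a tiny sketch (the actual weighting scheme is not spelled out on the slide; this is just capacity-proportional weighting):

```python
def path_weights(link_capacity_gbps):
    """Map {output_port: capacity} to a probability distribution over ports."""
    total = sum(link_capacity_gbps.values())
    return {port: cap / total for port, cap in link_capacity_gbps.items()}

print(path_weights({"uplink1": 10, "uplink2": 10, "uplink3": 1}))
# -> uplink1 and uplink2 each get ~47.6% of new microflows, uplink3 ~4.8%
```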
28. Statistics-Gathering
• Sampling
– Packet headers are sent to the controller with 1/1000 probability
• Triggers and reports
– Set a threshold per rule
– When the counter exceeds it, trigger flow setup at the controller
• Approximate counters
– Maintain a list of the top-k largest flows
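A toy sketch of those two reporting paths (the 1/1000 probability is from the slide; the byte threshold and the report callback are placeholders):

```python
import random

SAMPLE_PROB = 1 / 1000
ELEPHANT_THRESHOLD_BYTES = 1 << 20       # 1 MB, placeholder threshold

rule_bytes = {}                          # rule id -> byte counter
already_reported = set()

def on_packet(rule_id, header, pkt_len, report):
    # Sampling: forward only headers, at a fixed probability.
    if random.random() < SAMPLE_PROB:
        report("sample", header)
    # Trigger and report: per-rule byte counter crosses its threshold once.
    rule_bytes[rule_id] = rule_bytes.get(rule_id, 0) + pkt_len
    if rule_bytes[rule_id] > ELEPHANT_THRESHOLD_BYTES and rule_id not in already_reported:
        already_reported.add(rule_id)
        report("trigger", rule_id)       # controller can now set up / reschedule this flow
```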
29. Implementation
• Not yet implemented in hardware
• Switch engineers indicate most mechanisms can be supported with existing functional blocks
• Provides some basic tools for SDN
• However, scaling remains a problem: what threshold? how to sample, and at what rate?
• Default multipath on switches
• The controller samples or sets triggers to detect elephants, then schedules them with a bin-packing algorithm
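A simplified sketch of the kind of greedy bin-packing placement meant here, in the spirit of Hedera's scheduler (the real algorithm estimates flow demand and differs in detail; this only illustrates the idea):

```python
def schedule_elephants(elephants, paths, link_capacity_gbps=10.0):
    """Place detected elephants, largest first, on the path with the most spare capacity.

    elephants: list of (flow_id, rate_gbps); paths: {path_id: [link, ...]}.
    """
    link_load = {}                                        # link -> allocated Gbps
    placement = {}
    for flow_id, rate in sorted(elephants, key=lambda e: -e[1]):
        def spare(path_id):
            return min(link_capacity_gbps - link_load.get(lk, 0.0)
                       for lk in paths[path_id])
        best = max(paths, key=spare)                      # path with the most headroom
        placement[flow_id] = best
        for link in paths[best]:
            link_load[link] = link_load.get(link, 0.0) + rate
    return placement

# Example: two elephants and two disjoint paths end up on different paths.
print(schedule_elephants([("f1", 4.0), ("f2", 3.0)],
                         {"p1": ["l1", "l2"], "p2": ["l3", "l4"]}))
```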
30. Simulation
• How much can flow scheduling overheads be reduced while still achieving high performance?
• Custom-built flow-level simulator, calibrated with the 5406zl experiments
• Workloads generated:
– Reverse-engineered from [30] (MSR, 1,500-server cluster)
– MapReduce shuffle stage, 128 MB sent to each other server
– A combination of these two
[30] S. Kandula, S. Sengupta, A. Greenberg, and P. Patel. The Nature of Datacenter Traffic: Measurements & Analysis. In Proc. IMC, 2009.
42. Conclusion
• Per-flow control imposes too much overhead
• Balancing between
– Overheads and network visibility
– Effective traffic engineering / network management
could lead to various research directions
• Switches have limited resources
– Flow entries / control-plane bandwidth
– Hardware capability / power consumption