ONOS (Open Network Operating System): An Experimental Open-Source Distributed SDN OS
Software-Defined Networking (Scott Shenker, ONS '11)
[Figure: control applications (TE, Routing, Mobility, Network Virtualization) run on a Network OS, which exposes a Global Network View through an Abstract Network Model and programs the Packet Forwarding elements via OpenFlow.]
Logically Centralized NOS – Key Questions
[Figure: the same SDN stack as the previous slide, annotated with the key questions.]
Ø How do we realize a global network view?
Ø Fault tolerance?
Ø Scale-out?
Related Work
Ø ONIX: a distributed control platform for large-scale networks
  Ø Focus on reliability, scalability, and generality
  Ø State distribution primitives, global network view, ONIX API
Ø Other work
  Ø Distributed control planes: Helios, HyperFlow, Maestro, Kandoo
  Ø Controllers: NOX, POX, Beacon, Floodlight, Trema
Motivation
Ø Build an open-source distributed NOS
Ø Learn from and share with the community
Ø Target WAN use cases
Phase 1 Goals (December 2012 – April 2013)
Ø Demo key functionality
  Ø Fault tolerance: highly available control plane
  Ø Scale-out: distributed architecture
  Ø Global network view: network graph abstraction
Ø Non-goals
  Ø Performance optimization
  Ø Support for reactive flows
  Ø Stress testing
ONOS: Scale-out Using Control Isolation
[Figure: three instances of the Distributed Network OS share one Network Graph, each controlling its own partition of the data plane.]
Simple scale-out design:
Ø An instance is responsible for building and maintaining a part of the network graph
Ø Control capacity can grow with network size
ONOS Network Graph Abstraction
[Figure: switch vertices (Id: 1/A, 2/C, 3/B) joined through labeled port vertices (Id: 101–106).]
Ø The network graph is stored in the Titan graph database
Ø Titan is backed by Cassandra, used as an in-memory DHT
Network Graph
[Figure: graph schema; each port is "on" a switch, "link" edges connect ports, and "host" edges attach devices to ports.]
Ø Network state is naturally represented as a graph
Ø The graph holds basic network objects: switches, ports, devices, and links
Ø Applications write to this graph and program the data plane (see the sketch below)
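To make the object model concrete, here is a minimal sketch in plain Java of the vertex types and edges the slide names. The class, field, and id choices are illustrative; the actual prototype stores these objects in Titan rather than in local collections.

    import java.util.ArrayList;
    import java.util.List;

    class Vertex {
        final String type;  // "switch", "port", or "device"
        final long id;
        final List<Vertex> edges = new ArrayList<>(); // "on", "link", and "host" edges

        Vertex(String type, long id) { this.type = type; this.id = id; }
    }

    public class NetworkGraphSketch {
        public static void main(String[] args) {
            Vertex swA = new Vertex("switch", 1), swB = new Vertex("switch", 2);
            Vertex p1 = new Vertex("port", 101), p2 = new Vertex("port", 102);
            Vertex host = new Vertex("device", 201);

            swA.edges.add(p1);   // port 101 is on switch A
            swB.edges.add(p2);   // port 102 is on switch B
            p1.edges.add(p2);    // link edge between the two ports
            p2.edges.add(host);  // host edge: device attached to port 102
            System.out.println("switch A has " + swA.edges.size() + " port(s)");
        }
    }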
Example: Path Computation App on the Network Graph
[Figure: the graph schema extended with flow-path and flow-entry objects that reference inports, outports, and switches.]
Ø The application computes a path by traversing link edges from source to destination
Ø The application writes one flow entry per switch along the path (sketch below)
Ø Thus the path computation app does not need to worry about topology maintenance
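A minimal sketch of such an app, again in plain Java with an assumed API: breadth-first search over an adjacency view of the link edges, then one flow entry per switch on the resulting path. A real app would read the adjacency from the shared graph and write real flow entries (with inports and outports) back into it.

    import java.util.*;

    public class PathApp {
        // adjacency: switch DPID -> neighbor DPIDs (derived from link edges)
        static List<Long> shortestPath(Map<Long, List<Long>> adj, long src, long dst) {
            Map<Long, Long> parent = new HashMap<>();
            Deque<Long> queue = new ArrayDeque<>(List.of(src));
            parent.put(src, src);
            while (!queue.isEmpty()) {
                long u = queue.poll();
                if (u == dst) break;
                for (long v : adj.getOrDefault(u, List.of()))
                    if (parent.putIfAbsent(v, u) == null) queue.add(v);
            }
            if (!parent.containsKey(dst)) return List.of();   // no path
            LinkedList<Long> path = new LinkedList<>();
            for (long v = dst; v != src; v = parent.get(v)) path.addFirst(v);
            path.addFirst(src);
            return path;
        }

        public static void main(String[] args) {
            Map<Long, List<Long>> adj = Map.of(
                1L, List.of(2L), 2L, List.of(1L, 3L), 3L, List.of(2L));
            for (long dpid : shortestPath(adj, 1L, 3L))
                System.out.println("write flow entry on switch " + dpid);
        }
    }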
Example: A Simpler Abstraction on the Network Graph?
[Figure: a "logical crossbar" exposing only edge ports (virtual network objects), layered over the physical graph (real network objects).]
Ø An app or service on top of ONOS
Ø Maintains the mapping from the simpler virtual objects to the complex real ones (sketch below)
Ø Thus makes applications even simpler and enables new abstractions
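One way such a service could look, as a hedged sketch: the LogicalCrossbar class, its edge-port map, and connect() are invented for illustration and are not part of ONOS. An app sees only numbered edge ports; the service resolves them to real switch/port pairs and would hand the pair to a path-computation service like the one above.

    import java.util.Map;

    public class LogicalCrossbar {
        // virtual edge-port id -> (real switch DPID, real port number)
        record RealPort(long dpid, int port) {}
        private final Map<Integer, RealPort> edgeMap;

        LogicalCrossbar(Map<Integer, RealPort> edgeMap) { this.edgeMap = edgeMap; }

        // The app just "connects" two edge ports of the one big virtual switch;
        // the service maps them onto real network objects.
        void connect(int vIn, int vOut) {
            RealPort a = edgeMap.get(vIn), b = edgeMap.get(vOut);
            System.out.printf("compute path %d/%d -> %d/%d%n",
                              a.dpid(), a.port(), b.dpid(), b.port());
        }

        public static void main(String[] args) {
            var xbar = new LogicalCrossbar(Map.of(
                1, new RealPort(1L, 3), 2, new RealPort(4L, 7)));
            xbar.connect(1, 2);
        }
    }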
Phase 1 Goals (December 2012 – April 2013)
Ø Demo key functionality
  ✓ Fault tolerance: highly available control plane
  ✓ Scale-out: distributed architecture
  ✓ Global network view: network graph abstraction
How is the Network Graph built and maintained by ONOS?
Network Graph and Switches
[Figure: each ONOS instance runs a Switch Manager that speaks OpenFlow (OF) to its switches and writes the switch objects into the shared Network Graph.]
Network Graph and Link Discovery
[Figure: a Link Discovery module is added alongside each Switch Manager (SM).]
Network Graph and Link Discovery
[Figure: the Link Discovery modules send LLDP probes through the switches and write the discovered link objects into the Network Graph; a sketch of the mechanism follows.]
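A sketch of LLDP-style discovery in plain Java. The probe encoding and class names are illustrative (real LLDP carries the origin port in TLVs): a probe is emitted on one port, and when it reappears as a packet-in on another port, the pair of ports is recorded as a unidirectional link in the graph.

    import java.util.HashSet;
    import java.util.Set;

    public class LinkDiscovery {
        record PortId(long dpid, int port) {}
        record Link(PortId src, PortId dst) {}
        private final Set<Link> links = new HashSet<>(); // network-graph stand-in

        // Probe payload just names the origin port here; real LLDP uses TLVs.
        String buildProbe(PortId out) { return out.dpid() + "/" + out.port(); }

        void onPacketIn(PortId rcv, String payload) {
            String[] f = payload.split("/");
            PortId src = new PortId(Long.parseLong(f[0]), Integer.parseInt(f[1]));
            links.add(new Link(src, rcv));   // write the link edge to the graph
        }

        public static void main(String[] args) {
            LinkDiscovery ld = new LinkDiscovery();
            PortId a = new PortId(1L, 2), b = new PortId(3L, 1);
            ld.onPacketIn(b, ld.buildProbe(a));  // probe sent on a, heard on b
            System.out.println(ld.links);
        }
    }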
Devices and Network Graph
[Figure: a Device Manager is added alongside each Switch Manager (SM) and Link Discovery (LD) module.]
Devices and Network Graph
[Figure: each Device Manager learns hosts from PACKET-IN events and writes device objects into the Network Graph; a sketch follows.]
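A hedged sketch of host learning; the class and its callback are invented for illustration, not the ONOS DeviceManager API. A packet-in whose ingress port carries no infrastructure link reveals a host, and the manager attaches a device object to that port.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class DeviceManager {
        record Attachment(long dpid, int port) {}
        private final Map<String, Attachment> devices = new ConcurrentHashMap<>(); // MAC -> attachment point

        void onPacketIn(String srcMac, long dpid, int inPort, boolean isEdgePort) {
            if (!isEdgePort) return;   // ignore packets arriving over infrastructure links
            devices.put(srcMac, new Attachment(dpid, inPort)); // device vertex + "host" edge
        }

        public static void main(String[] args) {
            DeviceManager dm = new DeviceManager();
            dm.onPacketIn("aa:bb:cc:dd:ee:ff", 1L, 4, true);
            System.out.println(dm.devices);
        }
    }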
Consistency Definitions
Ø Strong consistency: after an update to the network state by one instance, all subsequent reads by any instance return the last updated value.
Ø Strong consistency adds complexity and latency to distributed data management.
Ø Eventual consistency is a slight relaxation: readers may lag behind writes for a short period of time.
Strong Consistency Using a Registry
[Figure: timeline of master election for switch A across three instances. Initially every instance sees "Switch A master = NONE". Instance 1 then wins the election in the registry, and after the locking delay every instance sees "Switch A master = ONOS 1". That delay is the cost of locking.]
Why Strong Consistency Is Needed for Master Election
Ø With weaker consistency, a master election on instance 1 might not be visible on the other instances.
Ø That can lead to multiple masters for one switch.
Ø Multiple masters would break our semantics of control isolation.
Ø Master election therefore needs strong locking semantics (see the registry sketch below).
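A sketch of how a registry can provide that locking semantic using ZooKeeper ephemeral nodes; the znode layout and instance ids here are assumptions, not the actual ONOS registry code. Exactly one create() succeeds atomically, so at most one master can exist, and the ephemeral node disappears automatically if the winner's session dies, freeing the switch for re-election.

    import org.apache.zookeeper.*;

    public class MastershipRegistry {
        private final ZooKeeper zk;
        MastershipRegistry(ZooKeeper zk) { this.zk = zk; }

        /** Try to become master for a switch; returns true iff this instance won. */
        boolean requestMastership(String dpid, String instanceId) throws Exception {
            try {
                zk.create("/onos/masters/" + dpid,     // hypothetical path; parent znodes assumed to exist
                          instanceId.getBytes(),
                          ZooDefs.Ids.OPEN_ACL_UNSAFE,
                          CreateMode.EPHEMERAL);       // released automatically if this instance fails
                return true;
            } catch (KeeperException.NodeExistsException e) {
                return false;                          // another instance is already master
            }
        }

        public static void main(String[] args) throws Exception {
            ZooKeeper zk = new ZooKeeper("localhost:2181", 3000, event -> {});
            MastershipRegistry reg = new MastershipRegistry(zk);
            System.out.println("won switch A: "
                + reg.requestMastership("00:00:00:00:00:00:00:01", "onos-1"));
        }
    }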
Eventual Consistency in the Network Graph
[Figure: timeline of a switch state change across three instances. Switch A connects and instance 1 writes STATE = ACTIVE to the DHT; for a short period instance 1 reads ACTIVE while instances 2 and 3 still read INACTIVE, until the update reaches every instance. That lag is the consistency cost; a toy illustration follows.]
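A toy, self-contained illustration of that timeline (not ONOS code): two in-memory maps stand in for replicas of the DHT, and a write to one replica becomes visible on the other only after a simulated propagation delay, so a reader there briefly sees stale state.

    import java.util.Map;
    import java.util.concurrent.*;

    public class EventualDemo {
        static Map<String, String> replica1 = new ConcurrentHashMap<>();
        static Map<String, String> replica2 = new ConcurrentHashMap<>();
        static ScheduledExecutorService gossip = Executors.newSingleThreadScheduledExecutor();

        static void write(String key, String value) {
            replica1.put(key, value);   // local write is visible immediately
            gossip.schedule(() -> { replica2.put(key, value); }, 100, TimeUnit.MILLISECONDS);
        }

        public static void main(String[] args) throws Exception {
            replica1.put("switchA", "INACTIVE");
            replica2.put("switchA", "INACTIVE");
            write("switchA", "ACTIVE");
            System.out.println("instance 1 sees: " + replica1.get("switchA")); // ACTIVE
            System.out.println("instance 2 sees: " + replica2.get("switchA")); // still INACTIVE
            Thread.sleep(200);
            System.out.println("instance 2 sees: " + replica2.get("switchA")); // ACTIVE, eventually
            gossip.shutdown();
        }
    }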
Cost of Eventual Consistency
Ø In the previous example, the short delay means switch A's state is not yet ACTIVE on some ONOS instances.
Ø Applications on one instance will compute flows through switch A while the other instances still exclude it from path computation.
Ø The effects of eventual consistency become more visible when the control-plane network is congested.
Why Is Eventual Consistency Good Enough for Network State?
Ø The physical network state changes asynchronously anyway.
Ø Strong consistency across the data plane and the control plane is too hard.
Ø Control applications know how to deal with eventual consistency: in today's distributed control planes, each router makes its own decisions based on old information from other parts of the network, and it works fine.
Ø Control-plane congestion is real, and under it strong consistency is more likely to leave the network-state view inaccurate, because updates block instead of propagating.
ONOS – Open Questions
Ø Is a graph the right abstraction?
  Ø Can it scale for reactive flows?
  Ø What is the concurrency requirement on the graph?
  Ø Is it large enough to need a NoSQL backend?
  Ø What about notifications or publish/subscribe on the graph?
Ø Are we using the right technologies for the graph?
  Ø Is a DHT the right choice?
  Ø Cassandra has latencies on the order of milliseconds – is that OK?
  Ø What throughput do we need?
  Ø Titan is good for rapid prototyping – is it good enough for production?
Ø Have we got our consistency and partition tolerance right?
  Ø What is the latency impact?
  Ø Should we pick availability over consistency?
What Is Next for ONOS
Ø ONOS Core
  Ø Performance benchmarks and improvements
  Ø Reactive flows and low-latency forwarding
  Ø Events, callbacks, and a publish/subscribe API
  Ø Expand the graph abstraction to more types of network state
Ø ONOS Apps
  Ø ONOS Northbound API
  Ø Service chaining
  Ø Network monitoring, analytics, and debugging framework
Ø Community
  Ø Release as open source and/or contribute to OpenDaylight
  Ø Build and assist a developer community outside ON.Lab
  Ø Support deployments in R&E networks