Software Defined Data Centers
Brent Salisbury, Network Architect, University of Kentucky
firstname.lastname@example.org
My Obligatory Rationalizing: Change is Bad
• We are operating far too close to the hardware.
  o Do systems administrators configure their services in the x86 BIOS? Guess what? We do.
• Generic components decomposed into resources to consume anywhere, anytime.
• Abstraction of Forwarding, State and Management.
  o Forwarding: networking gear with flow tables and firmware.
  o State: destruction of the bag of protocols.
  o Management: orchestration, CMDB, etc. Join the rest of the data center (and world).
A Quick Recap: "Doh!" (jumbled protocol picture; source: Nick McKeown, Stanford)
The Problem Has Always Been the Edge
• Security policy at the Edge.
• Multi-tenancy at the Edge.
• QoS policy at the Edge.
• Management at the Edge.
• Cost at the Edge.
• Complexity at the Edge.
Commoditization: A Collage of Disruption (pictured: Google's Pluto switch)
What Changed? Why Now? #1: HW Commoditization
1. Commodity hardware, off-the-shelf "merchant silicon." If all vendors are using the same pieces and parts, where is the value?
• "We want to create a dynamic where we have a very good base set of vendor-agnostic instructions. On the other hand, we need to give room for switch/chip vendors to differentiate." -Nick McKeown
• "You don't have to have an MBA to realize there is a problem. We are still ok but not for very long." -Stuart Selby, Verizon
• "When you run a large data center it is cheaper per unit to run a large thing rather than a small thing; unfortunately in networking that's not really true." -Urs Hölzle, Google
• "Work with existing silicon today; tomorrow may bring dedicated OpenFlow silicon." -David Erickson
• "The path to OpenFlow is not a four lane highway of joy and freedom with a six pack and a girl in the seat next to you, it's a bit more complex and a little hard to say how it will work out, but I'd be backing OpenFlow in my view." -Greg Ferro, Etherealmind.com
Not New Ideas: VM Farms Today vs. the SDN Network
• Physical server infrastructure (servers, CPU, memory, disk, physical HW) <-> physical network infrastructure (routers, switches, RIB, LIB, NIC, bus, TCAM, memory, CPU, ASIC).
• Hypervisors for x86 virtualization and multi-tenancy (VMware, Hyper-V, KVM, Xen) <-> FlowVisor and the OpenFlow controller instruction set.
• Many general-purpose Windows guests sliced out of one server <-> many network slices (general-purpose, secure, research) sliced out of one network.
Conceptually Simple, Yet Powerful
Every rule matches on the same ten-tuple (Switch Port, MAC src, MAC dst, Eth type, VLAN ID, IP Src, IP Dst, IP Prot, TCP sport, TCP dport) and applies an action; wildcards (*) cover the fields a rule does not care about (a toy matching sketch follows below):
• Flow switching: port3, 00:20.., 00:1f.., 0800, vlan1, 184.108.40.206, 220.127.116.11, 4, 17264, 80 -> Action: port6
• VLAN switching: *, *, 00:1f.., *, vlan1, *, *, *, *, * -> Action: port6, port7, port9
• Routing: *, *, *, *, *, *, 18.104.22.168, *, *, * -> Action: port6
• Firewall: *, *, *, *, *, *, *, *, *, 22 -> Action: drop
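A minimal, hedged sketch of the idea: the rules below mirror the slide's examples, fields left out of a rule are treated as wildcards, and the first matching rule wins. The field names and the toy lookup function are illustrative, not any controller's real API.

```python
# Toy flow-table lookup: omitted fields act as wildcards; first match wins.
rules = [
    ({"tcp_dport": 22}, "drop"),                                   # firewall
    ({"ip_dst": "18.104.22.168"}, "output:port6"),                 # routing
    ({"mac_dst": "00:1f..", "vlan_id": "vlan1"},
     "output:port6,port7,port9"),                                  # VLAN switching
]

def lookup(packet):
    """Return the action of the first rule whose specified fields all match."""
    for match, action in rules:
        if all(packet.get(field) == value for field, value in match.items()):
            return action
    return "punt-to-controller"                                    # table miss

pkt = {"in_port": "port3", "mac_dst": "00:1f..", "vlan_id": "vlan1",
       "tcp_dport": 80}
print(lookup(pkt))   # -> output:port6,port7,port9
```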
Abstraction Layers: the Operating System vs. the Main SDN Stack (Win32, x86, mainframe, abacus: the same story each time)
• Applications <-> Applications/Policy, joined by a Northbound API (POSIX on the OS side; REST/JSON on the SDN side).
• Kernel/OS/Hypervisor <-> Controllers/Slicing, joined by a Southbound API (x86-'like', or a HAL).
• Hardware/Firmware (CPU, memory, devices) <-> vSwitch/Firmware.
Enterprise Wireless at Larger Scale Today: distributed controllers in the same administrative domain, sitting above the campus core and distribution layers.
Decoupled Control Plane (NOS): SDN/OpenFlow controllers on x86 apply policy centrally, above the campus core, distribution layers and edge switches.
Policy Application in Wired Networks
• Decoupling the control plane != distributed systems in networks going away.
• The problem is a distributed systems theory problem, managed in software independent of the hardware.
• We already centralize the control plane in traditional hierarchical campus architectures today, in big expensive chassis. (Diagram: distributed SDN/OF controllers.)
The Alternative is More of the Same
• The alternative for applying policy is business as usual: un-scalable and cost-prohibitive bumps in the wire at the campus core, distribution and edge switches.
• NAC at scale is even more mythical than BYOD and SDN.
Open vSwitch: Scale, HW vs. SW
• VM rack density means East-West traffic could be problematic for a general-purpose top-of-rack switch.
• 100K+ entries in a rack is unrealistic in HW today; software tables in (n)RAM can hold them.
Two cases for a packet-in against the TCAM (fields: Ingress Port, Ether Src, Ether Dst, Ether Type, VLAN ID, IP Dst, IP Src, TCP Dst, TCP Src, IP Proto), sketched below:
• Match in the TCAM, e.g. Port 0/3, *, *, *, *, 192.168.1.1/32, *, *, *, * -> action bucket: send packet to Port 0/2.
• No match in the TCAM (e.g. a packet arriving on Port 0/5 for 172.24.16.5/32 with TCP dst 80 and no installed entry) -> action: punt the packet to the controller.
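A toy sketch of those two outcomes, assuming nothing beyond the slide: the single installed entry echoes the matched example above, and anything else falls through to the controller. The exact-match dictionary is only a stand-in for the TCAM, not how real hardware indexes entries.

```python
# Stand-in for the TCAM: (ingress_port, ip_dst) -> action bucket.
installed_entries = {
    ("0/3", "192.168.1.1/32"): "output:0/2",   # the slide's matched entry
}

def packet_in(ingress_port, ip_dst):
    """Forward on a hit; punt to the controller on a table miss."""
    action = installed_entries.get((ingress_port, ip_dst))
    return action if action is not None else "punt-to-controller"

print(packet_in("0/3", "192.168.1.1/32"))   # hit  -> output:0/2
print(packet_in("0/5", "172.24.16.5/32"))   # miss -> punt-to-controller
```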
Where Do You Start? (Physical Devices): Ships in the Night
• Whatever your method of virtualization, or an overlay network riding the hardware lifecycle: the idea is to create a test bed and migrate as warranted.
• Virtualization today is done in segments or layers (lambdas, VLANs or VRFs, physical, overlay) rather than as a complete network solution.
• Most vendors' early OpenFlow-enabled code supports hybrid adoption.
EZ Deployment Scenario: New Flow Processing, struct ofp_packet_in (POX L2 learning algorithm, sketched below)
1. Update the source address in the (T)CAM or SW tables.
2. Is the destination address an Ethertype such as LLDP, or a bridge-filtered MAC? Drop, forward to the controller, or even hand off to STP. LLDP may be used to build a topology (important for the future).
3. Is it multicast? Yes: flood.
4. Is the destination address in the port MAC address table? If not, flood.
5. Is the output port the same as the input port? Drop, to prevent loops.
6. Install the flow and forward the buffered and subsequent packets.
(Diagram: a POX/Floodlight OF controller alongside the traditional network; a physical switch running an OF agent, with an access port toward the traditional side and its Layer 3 gateway, and an 802.1Q trunk or (M)LAG group; Host A on VLAN 10 at 10.100.1.10/24 and Host B on VLAN 20 at 10.200.1.10/24.)
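A minimal sketch of steps 1 through 6 as a Python controller, written in the spirit of POX's bundled l2_learning component. It is a simplified illustration rather than the full component: VLAN handling, flow timeouts and the STP hand-off mentioned in step 2 are left out.

```python
from pox.core import core
import pox.openflow.libopenflow_01 as of


class LearningSwitch (object):
  def __init__ (self, connection):
    self.connection = connection
    self.mac_to_port = {}                 # step 1's software table
    connection.addListeners(self)

  def flood (self, event):
    # Send the buffered packet out every port except the one it came in on.
    msg = of.ofp_packet_out()
    msg.data = event.ofp
    msg.in_port = event.port
    msg.actions.append(of.ofp_action_output(port=of.OFPP_FLOOD))
    self.connection.send(msg)

  def _handle_PacketIn (self, event):
    packet = event.parsed
    self.mac_to_port[packet.src] = event.port      # 1. learn the source port
    if packet.type == packet.LLDP_TYPE:            # 2. LLDP: leave it to
      return                                       #    topology discovery
    if packet.dst.is_multicast:                    # 3. multicast: flood
      self.flood(event)
      return
    if packet.dst not in self.mac_to_port:         # 4. unknown destination: flood
      self.flood(event)
      return
    out_port = self.mac_to_port[packet.dst]
    if out_port == event.port:                     # 5. same port in and out:
      return                                       #    drop to prevent loops
    # 6. Install a flow for this traffic and forward the buffered packet.
    msg = of.ofp_flow_mod()
    msg.match = of.ofp_match.from_packet(packet, event.port)
    msg.actions.append(of.ofp_action_output(port=out_port))
    msg.data = event.ofp
    self.connection.send(msg)


def launch ():
  # Attach a LearningSwitch instance to every switch that connects to POX.
  core.openflow.addListenerByName(
      "ConnectionUp", lambda event: LearningSwitch(event.connection))
```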
What Changed? Why Now? #2: The Data Center
• "The network is in my way." -James Hamilton, Amazon
• "Networking is complex because the appropriate abstractions have not yet been defined." -A Case for Expanding OpenFlow/SDN Deployments On University Campuses
• "If you look at the way things are done today, it makes it impossible to build an efficient cloud. If you think about the physical network, because of things like VLAN placements, you are limited on where you can place workloads. So even without thinking about the application at all, there are limits on where you can place a VM because of capacity issues or because of VLAN placement issues." -Martin Casado
• The tools we have today for automation: SNMP, NETCONF, subprocess.Popen (Python), Net::Telnet (Perl), #!/bin/bash, autoexpect, etc. (a sketch of that style of automation follows below).
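To make that last bullet concrete, here is a hedged sketch of the screen-scraping style of automation using subprocess.Popen, which the slide names. The hostnames, username and CLI command are illustrative placeholders, not any particular vendor's syntax; the point is that this approach is per-box, brittle and carries no common data model.

```python
import subprocess

def fetch_running_config(host, user="admin"):
    # Shell out to ssh and scrape whatever the device CLI prints back.
    proc = subprocess.Popen(
        ["ssh", "{}@{}".format(user, host), "show running-config"],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError("ssh to {} failed: {}".format(host, err))
    return out.decode()

# One box at a time, one screen-scrape at a time.
for switch in ["dist-sw-1", "dist-sw-2", "dist-sw-3"]:
    print(fetch_running_config(switch)[:200])
```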
What Changed? #2: The Data Center
• Public cloud scale.
• VID limitations: ~4094 tags (the arithmetic is worked out below).
• Roughly a quarter of servers are virtualized.
• Customers want flat networks, but they do not scale.
• Complexity in the network substrate to support bad application design.
• Required: flexible and open APIs to consume network resources.
• East-West policy application.
• East-West BW consumption.
• L2 multi-tenancy.
• Hypervisor agnostic.
• VM port-characteristic mobility.
• Traffic trombone for policy.
The edge needs to be smarter but also manageable; the stack pictured (a physical network and its policy above a physical x86 host running Open vSwitch and a hypervisor, with VM1-VM4 on ports 1-4) is neither.
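The "~4094 tags" bullet is just field-width arithmetic; the lines below work it out and contrast it with a 24-bit overlay segment ID such as a VXLAN VNI (the overlay protocols appear later in this deck).

```python
# 802.1Q carries a 12-bit VLAN ID; VXLAN carries a 24-bit VNI.
vlan_ids = 2 ** 12 - 2       # 0x000 and 0xFFF are reserved -> 4094 usable tags
vxlan_vnis = 2 ** 24         # ~16.7 million overlay segments
print(vlan_ids, vxlan_vnis)  # 4094 16777216
```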
Open vSwitch Forwarding
(Diagram: physical hardware/hypervisor running Open vSwitch, attached to a controller; the first packet in a flow goes up to the controller, subsequent packets stay in the Open vSwitch datapath; VM 1 and VM 2 hang off the datapath.)
• The first packet in a flow goes to the OVS controller (slowpath) and subsequent packets are forwarded by the OVS datapath (fastpath).
• Underlying Open vSwitch is a flow-table forwarding model similar to that used by OpenFlow.
• When attached to a controller, datapath decisions are determined by the OpenFlow controller (a setup sketch follows below).
• Multiple tenants can share the same tunnel.
• Actions: forward, drop, encapsulate and send to controller.
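A hedged sketch of pointing an OVS bridge at an external OpenFlow controller so the slowpath/fastpath split above kicks in. It shells out to the standard ovs-vsctl/ovs-ofctl tools from Python; the bridge name, controller address and the classic OpenFlow port 6633 are illustrative.

```python
import subprocess

def sh(*args):
    # Run one command and fail loudly if it does.
    subprocess.check_call(list(args))

sh("ovs-vsctl", "--may-exist", "add-br", "br0")
# First packet of each flow now goes to the controller (slowpath);
# the flows it installs are handled in the datapath (fastpath).
sh("ovs-vsctl", "set-controller", "br0", "tcp:192.0.2.10:6633")

# Inspect what the controller has pushed into the bridge so far.
sh("ovs-ofctl", "dump-flows", "br0")
```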
New Encapsulation to Traverse the Physical Network: encapsulated in new outer headers, the original packet plus its headers becomes the payload; the tenant keys/VIDs are inserted in the new outer header (sketched below).
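A small sketch of that layering with Scapy (assuming a Scapy recent enough to ship a VXLAN layer): the original VM frame becomes the payload of a new outer Ethernet/IP/UDP/VXLAN header, and the tenant key rides in the VNI field. All addresses and the VNI are illustrative.

```python
from scapy.all import Ether, IP, UDP
from scapy.layers.vxlan import VXLAN

# The original VM-to-VM frame (inner headers + payload).
inner = Ether(src="00:00:00:aa:00:01", dst="00:00:00:aa:00:02") / \
        IP(src="192.168.1.10", dst="192.168.1.20")

# New outer headers between the two hypervisors; the inner frame is now payload.
outer = Ether() / IP(src="10.100.0.1", dst="10.100.0.2") / \
        UDP(dport=4789) / VXLAN(vni=5001) / inner

outer.show()   # prints the layered headers, inner frame last
```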
Tunneling/Overlays
• Physical Host A and Physical Host B each run a virtual switch and hypervisor. VM-A taps into a virtual bridge (tap0 -> br0/eth0), a GRE/VXLAN/etc. tunnel runs between the hosts, and on the far side eth0/br0 -> tap0 reaches VM-B.
• The network substrate has no view of the VM hosts inside the tunnels traversing it: a flat Layer 2 network (10.100.0.0/16) over the legacy network, with a controller on an x86 box establishing and orchestrating the tunnels in some fashion.
• The tunnel bridges the two VMs (VM-A and VM-B, 192.168.1.10/24) onto the same network. The br2 interface on each host cannot reach the other side until the GRE tunnel is brought up: br2 is an island needing connectivity, and br1 with a GRE tunnel stitches the two together (a setup sketch follows below).
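A rough sketch of bringing up that GRE stitch with Open vSwitch, run once on each physical host and pointed at the other host's address. Bridge and port names and the remote IP are illustrative, and the slide's br1/br2 split is collapsed into a single bridge for brevity.

```python
import subprocess

def ovs(*args):
    subprocess.check_call(["ovs-vsctl"] + list(args))

def stitch_bridge_over_gre(remote_ip, bridge="br2", port="gre0"):
    ovs("--may-exist", "add-br", bridge)
    # Until this port exists on both hosts, VMs on the bridge are islands.
    ovs("add-port", bridge, port, "--",
        "set", "interface", port, "type=gre",
        "options:remote_ip={}".format(remote_ip))

stitch_bridge_over_gre("10.100.0.2")   # on Host A, pointing at Host B
```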
Data Center Overlays
• Early SDN adoption is happening today in data centers: decouple the virtual from the physical discretely.
• Over the physical L3 network, where do we terminate the tunnel endpoints for Tenancy X, Y and Z? HW or SW for de-encapsulation?
• SDN overlays (GRE, STT, VXLAN) over traditional and SDN network substrates, creating dynamic network resource pools.
• Resource consumption (storage, network, compute), either local or hybrid private/public cloud.
• Visibility, OAM, dynamic provisioning, brokerage and analytics.
Does This Make Sense?
Tenancies X, Y and Z spanning cloud-provided elastic compute, a disaster-recovery warm/hot site, and the data center West and East segments, over a Layer 3 network (e.g. carrier MPLS/VPN, Internet, L3 segmented): leveraging overlays with VXLAN/(NV)GRE/CAPWAP to create one flat network.
Hybrid Cloud: How Public Cloud Feels vs. How It Really Is
In reality, the public cloud spoke is a controller running dnsmasq and iptables masquerading (that is, your router, switch and firewall), fronting VM instances and VM networks that carry a public and a private IP address on one NIC, reached across the Internet.
Hybrid Cloud: IMO Not as Bad as It Looks; this exists today in most DCs.
(Diagram: public cloud spokes with public and private addresses on one NIC, reached over the Internet, and private cloud spokes on your network, all terminating on a hub gateway.)
Tunneling & Hybrid Cloud Creates One Network
• An x86 node can aggregate the tunnel endpoints: hub and spoke. The alternative would be a full mesh (the tunnel-count arithmetic is worked out below). Policy could be applied centrally at the hub.
• The network is unaware of the encapsulated tunnels running from public cloud spokes across the Internet, and from private cloud spokes on your network, to the hub gateway.
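The hub-and-spoke preference is mostly counting; the lines below work out the tunnel counts for an assumed example of a dozen sites.

```python
n = 12                           # example number of sites
full_mesh = n * (n - 1) // 2     # every site tunnels to every other site
hub_and_spoke = n - 1            # every spoke tunnels only to the hub
print(full_mesh, hub_and_spoke)  # 66 vs 11
```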
De-Duplicating Policy is the Best Reason for Tunnels
• Leverage existing centralized policy application and orchestration (crypto, IDS/IPS, firewall, etc.) at the hub gateway, for both public cloud spokes and private cloud spokes on your network.
• That said, sending the client directly to a cloud provider outside of a tunnel, via the Internet, is by far the easiest and most scalable solution.
Public Cloud: The Internet Will Be the New LAN
• Option 1: General Internet1 best-effort connectivity through commodity I1 drains, a Cogent for example, at roughly 25-50 cents per Mb. Capture that as a lower-tier SLA, priced significantly lower. Primary option.
• Option 2: Dedicated peerings to any node, from tenant to colo, into the super-regional colos globally, for a broader service-level net to capture not-competitively-priced resources.
• Option 3: Internet2. Ideally begin leveraging anyone selling their peering and resource pools with open APIs: Rackspace, HP, Dell, Piston Cloud; companies whose end game is 100% cloud.
Leverage regional and super-regional statewide networks and open peerings to cloud providers. xAAS driven as a commodities market through emerging open API standards. Programmability should enable efficiency in usage and allow for time sharing via orchestration. OpenStack resources either local, or with the ability to leverage hybrid private/public cloud offerings based on the best market price that year, month, maybe even day, depending on the elasticity and flexibility to move workloads; also balancing workloads amongst each other through scheduling, predictive analysis and magic. Tenants would be any community anchor: state, city, education, non-profit, etc.
Challenges
• The idea that all networks should be built like the Internet is getting in the way. I have to build campus and healthcare networks in the same manner as Tier 1 service providers build their networks, and as we build our regional SP network in the state, to find reliability and scale. Time and space should be taken into account.
• The Internet will be the LAN in the next decade. The carriers, LECs and cable companies are not investing the capital necessary to scale.
• Distribution of state and mapping of elements in a decoupled control plane: this has been solved in systems today, e.g. Hadoop/HDFS. Tracking millions of elements over some number of distributed controllers and maintaining state is well within the realm of reality.
• "Big old tech companies are often incapable of investing in new ideas because they're addicted to the revenue streams from their current businesses and don't want to disrupt those businesses." -Marc Andreessen, Silicon Valley VC
• If providing IaaS, understand self-provisioning and the customer experience.
How To Get Involved
• Get involved with test beds with the community and vendors.
• Thought leadership and knowledge transfer.
• Dip your toes in the public and private cloud.
• Installers for local OpenStack instances @ http://www.rackspace.com/ & https://airframe.pistoncloud.com/
• Participate in the Internet2 working group. Co-chairs: Dan Schmeidt WILLYS@clemson.edu & Deniz Gurkan dgurkan@Central.UH.EDU
• http://incntre.iu.edu/openflow IU OpenFlow in a Day class.
• Networking vendors need to recognize the role reversal in networking that has occurred (thanks in large part to R&E and DIYers). Just as in the x86 market, it is consumer driven. The legacy echo chambers product managers live in are no longer acceptable.
• Mimic what the software industry perfected: openness and communities of practice.
Brent's Bookmarks - Comments, Questions, Nerd Rage?
• http://ioshints.info
• http://etherealmind.com
• http://nerdtwilight.wordpress.com/
• http://networkheresy.com/
• http://www.openflowhub.org/ (Floodlight)
• http://www.noxrepo.org/ (POX)
• The first 10 minutes of McKeown's presentation, for anyone with "manager" in their title; not to mention it brings tears to my eyes.
• http://www.youtube.com/watch?v=W734gLC9-dw (McKeown)
• http://packetpushers.net
• http://www.codybum.com
• http://www.rackspace.com (Rackspace OpenStack Private Cloud build)
• http://www.networkworld.com/community/fewell
• http://networkstatic.net/ My ramblings
• irc.freenode.net #openflow #openvswitch #openstack