
OpenStack Scale-out Networking Architecture


A detailed description of how Cloudscaling's Open Cloud System (OCS) has solved the network scalability problems in OpenStack. We'll cover how and why we designed a Layer-3 (L3) scale-out network, how we plug into and extend OpenStack, and why we did it this way.


  1. CCA-NoDerivs 3.0 Unported License: usage OK, no modifications, full attribution.* (* All unlicensed or borrowed works retain their original licenses.)
     OpenStack Scale-out Networking Architecture
     Abhishek Chanda, Software Engineer, @rony358
     OpenStack Juno Design Summit, May 13th, 2014
     Scaling to 1,000+ servers without Neutron
  2. Abhishek Chanda
     • Engineer at Cloudscaling
     • Started with a focus on software-defined networking and distributed systems
     • Ended up a jack of all trades
     • Still bad at configuring network devices
  3. Today's Goals
     • Our Layer-3 (L3) network architecture
     • Motivation
     • Design goals
     • Network topology
     • Software components layout
     • Under the hood
     • Challenges faced
     • Future directions (SDN, Neutron, etc.)
  4. Networking Modes in Vanilla OpenStack using nova-network
  5. OpenStack Networking Options (scalability & control rated 0 to 4)
     • Single-Host OpenStack Networking (0): Layer-2 VLANs w/ STP; centralized network stack. Core flaw: all traffic through a single x86 server. Issues: SPOF, performance bottleneck, no SDN.
     • Multi-Host OpenStack Networking (1): Layer-2 VLANs w/ STP; decentralized network stack. Core flaw: NAT at the hypervisor, 802.1q tagging. Issues: security and QoS issues, no SDN.
     • OpenStack Networking (2): Layer-2 VLANs w/ STP; centralized network stack. Core flaw: all traffic through a single x86 server. Issues: no control-plane scalability, no SDN.
     • OCS Classic Networking (4): L3 spine & leaf + scale-out NAT; fully distributed. Core flaw: requires deeper networking savvy. Issues: no virtual networks.
     • OCS VPC Networking (4): L3 spine & leaf underlay + network virtualization; fully distributed. Core flaw: more bleeding edge (scale not proven). Issues: no VLAN support.
  6. Option 1 - OpenStack Single-Host
     [Diagram: Neutron router / nova-network node fronting multiple VLANs]
     • All traffic flows through a centralized bottleneck and single point of failure
     • Performance bottleneck; non-standard networking architecture
  7. Option 2 - OpenStack Multi-Host
     [Diagram: core network; hypervisor running nova-network, NATing between public and private IPs]
     • Separate NICs cut available bandwidth by 50%
     • Direct database access increases risk
     • NAT at the hypervisor means running routing protocols to the hypervisor, or VLANs + STP across racks for gratuitous ARP
     • Must route public IPs across the core
     • Security problem; non-standard networking architecture
  8. Option 3 - OpenStack Flat
     [Diagram: controller (nova-network, nova-scheduler, nova-api) and compute (nova-compute) nodes joined by br100 bridges on eth0/eth1 over your local network IP address space]
     • Networking scalability, but no control-plane scalability
     • Uses Ethernet adapters configured as bridges to allow network traffic between nodes
     • Commonly used in POC and development environments
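The flat mode above is driven by a handful of nova-network flags. A minimal sketch of the relevant nova.conf section, assuming the br100/eth0 names from the diagram (exact option names varied somewhat across OpenStack releases):

```ini
[DEFAULT]
# Use the flat network manager rather than the VLAN or FlatDHCP managers
network_manager = nova.network.manager.FlatManager
# Linux bridge that every instance's vNIC is attached to
flat_network_bridge = br100
# Physical NIC enslaved to the bridge on each node
flat_interface = eth0
```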
  9. What Path Have Others Chosen?
     • HP, CERN, ANL, and others
     • CERN uses a custom driver that talks to a DB mapping IPs to MAC addresses (amongst other attributes): essentially flat networking with manually created VLANs assigned to specific compute nodes
     • ANL added InfiniBand and VXLAN support to nova-network
  10. The L3 Network Architecture in OCS
  11. Design Goals
     • Mimic Amazon EC2 "classic" networking
     • Pure L3 is blazing fast and well understood; network/systems folks can troubleshoot it easily
     • No VLANs, no Spanning Tree Protocol (STP), no L2 mess
     • No SPOFs, smaller failure domains (unlike the SPOFs in single-host and flat modes)
     • Distributed "scale-out" DHCP
     • No virtual routers!
     • Path is vm -> host -> ToR -> core; this enables a horizontally scalable, stateless NAT layer that provides floating IPs
  12. Layer-3 (L3) Networks Scale
     • The largest cloud operators are L3 with NO VLANs
     • Cloud-ready apps don't need or want VLANs (L2)
     • An L3 networking model is the ideal underlay for an SDN overlay
     • The Internet scales!
  13. Why NOT L2? (L3 vs. L2)
     • Hierarchical topology vs. flat topology
     • Route aggregation vs. no route aggregation (everything, everywhere)
     • Fast convergence times vs. fast convergence only when tuned properly
     • Locality matters vs. locality disappears
     • Uses all available bandwidth via ECMP over multiple paths vs. uses half of available bandwidth, with most traffic taking a long route
     • Proven scale vs. STP/VLANs working only at small to medium scale
     • The Internet (and ISPs) are L3-oriented vs. the typical enterprise datacenter being L2-oriented
     • Best practice for an SDN "underlay" vs. an SDN "overlay" designed to provide L2 virtual networks
  14. Smaller Failure Domains
     • Would you rather have the whole cloud down, or just a small bit of it for a short time?
  15. Spine & Leaf
  16. Simple Network View
  17. Software Components Schematic Layout
     [Diagram: zone node running nova-network against nova-db; compute node; ToR switches; core network]
     • nova-network is distributed and synchronized, which means we can have many instances running at once
     • This drives horizontal network scalability by adding more network managers
  18. Software Components Schematic Layout
     [Diagram: adds a VIF driver, a per-VM DHCP server, and a VM on the compute node]
     • VIF driver on each compute node
     • Bridge creation for each VM (/30)
     • Enhanced iptables rules
     • Per-VM udhcpd process
     • Configures routing
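The per-VM /30 scheme above gives each VM a tiny network of its own: of the four addresses in a /30, the two usable hosts go to the Linux bridge (the VM's gateway) and to the VM itself. A minimal sketch of that split (the function name is illustrative, not the real VIF driver code):

```python
import ipaddress

def addresses_for_vm(cidr: str):
    """Split a per-VM /30 into (bridge/gateway address, VM address).

    A /30 holds four addresses: network, two usable hosts, broadcast.
    The VIF driver gives one usable host to the Linux bridge and the
    other to the VM.
    """
    net = ipaddress.ip_network(cidr)
    assert net.prefixlen == 30, "expected a /30 per-VM network"
    bridge_ip, vm_ip = list(net.hosts())
    return str(bridge_ip), str(vm_ip)

print(addresses_for_vm("10.1.0.4/30"))  # ('10.1.0.5', '10.1.0.6')
```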
  19. Software Components Schematic Layout
     [Diagram: adds a natter on the edge network]
     • NAT service on the edge
     • Provides an on-demand elastic IP service
     • Provides utilities (l3addnet) to create a number of /30 networks per host and pin them to that host
  20. Under the Hood: Natter
     [Diagram: natter between the core and edge networks (eth0/eth1), with control-plane and data connections to the zone node's nova-db]
     1) Polls nova-db for new <floating_ip, private_ip> tuples
     2) Uses tc to install 1:1 NAT rules on eth0
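The two steps above can be sketched as rule generation: for each <floating_ip, private_ip> tuple, the natter needs a pair of stateless tc NAT rules, one rewriting inbound destinations and one rewriting outbound sources. This is an illustrative sketch (the real natter's code isn't shown in the slides), assuming the ingress and egress qdiscs on eth0 are already in place:

```python
def natter_rules(floating_ip: str, private_ip: str) -> list[str]:
    """Build tc commands for a stateless 1:1 NAT between a floating IP
    and a private VM IP, one pair per <floating_ip, private_ip> tuple
    polled from nova-db.  Illustrative only; assumes an ingress qdisc
    (ffff:) and a classful egress qdisc (1:) already exist on eth0.
    """
    return [
        # Inbound: rewrite the floating destination to the private IP
        f"tc filter add dev eth0 parent ffff: protocol ip prio 10 u32 "
        f"match ip dst {floating_ip}/32 "
        f"action nat ingress {floating_ip}/32 {private_ip}",
        # Outbound: rewrite the private source back to the floating IP
        f"tc filter add dev eth0 parent 1: protocol ip prio 10 u32 "
        f"match ip src {private_ip}/32 "
        f"action nat egress {private_ip}/32 {floating_ip}",
    ]

for cmd in natter_rules("203.0.113.10", "10.1.0.6"):
    print(cmd)
```

In a real poll loop these strings would be executed per new tuple; because the NAT is stateless, no connection tracking state needs to be synchronized between natters, which is what makes the layer horizontally scalable.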
  21. Under the Hood: Natter
  22. Under the Hood: VIF Driver and nova-network
     [Diagram: zone node (nova-network), compute node (VIF driver, VM), ToRs, core network, natter]
     VM provisioning:
     1) VIF: build a Linux bridge
     2) NM: get the host
     3) NM: get all available networks for the host
     4) NM: find the first unused network for the host
     5) VIF: create a VIF
     6) VIF: set up and start udhcpd on the host, one per VM; the MAC is calculated based on the IP
     7) VIF: create a /30 network for the VM, assigning one address to the bridge and one to the VM
     8) VIF: add routes to enable VM-to-gateway traffic
     9) VIF: add iptables rules to enforce blocked networks and whitelisted hosts
     VM decommissioning:
     1) VIF: stop udhcpd for the bridge the VM is attached to and remove its config
     2) NM: delete all IPs in all VIFs
     3) NM: clean up the Linux bridge
     4) NM: clean up all /30 networks
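Step 6 notes that the MAC is calculated from the IP. The slides don't show the exact derivation, so this is one plausible scheme, using OpenStack's fa:16:3e OUI followed by the low three octets of the address, which keeps the MAC deterministic and collision-free within a /24-adjacent range:

```python
import ipaddress

def mac_from_ip(ip: str, prefix: str = "fa:16:3e") -> str:
    """Derive a deterministic MAC address from an IPv4 address.

    Illustrative only: the OUI prefix and the choice of the last three
    octets are assumptions, not the VIF driver's actual algorithm.
    """
    octets = ipaddress.ip_address(ip).packed  # the 4 raw address bytes
    return prefix + "".join(f":{b:02x}" for b in octets[1:])

print(mac_from_ip("10.1.0.6"))  # fa:16:3e:01:00:06
```

A deterministic mapping like this lets udhcpd's static lease (MAC to IP) be written without coordinating with any other node.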
  23. Under the Hood: l3addnet
     • Used by cloud admins to pin networks to hosts
     • Wrapper around nova-manage network create
     • Breaks the input CIDR down into /30 blocks
     • Loops through each block and calls the nova-manage API to create a network on that compute host
     root@z2.b0.z1:~# l3addnet
     Usage: l3addnet cidr host01 dns1 dns2
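The CIDR-splitting step of l3addnet is a small piece of address arithmetic: an input /28 yields four /30 blocks, a /24 yields sixty-four, and so on. A sketch of just that looping logic (the nova-manage calls and host pinning are omitted, and the function name is illustrative):

```python
import ipaddress

def split_into_slash30s(cidr: str) -> list[str]:
    """Break an input CIDR into the /30 blocks that l3addnet would
    create one at a time via nova-manage network create."""
    net = ipaddress.ip_network(cidr)
    return [str(block) for block in net.subnets(new_prefix=30)]

blocks = split_into_slash30s("10.1.0.0/28")
print(len(blocks))  # 4
print(blocks[0])    # 10.1.0.0/30
```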
  24. Network Driver Challenges
     • OpenStack releases are moving targets: plugin interfaces change, library dependencies change
     • The database API is not rich/efficient enough, so we went straight to SQL to get what we needed
     • nova-network supposed to be deprecated? First in Folsom or Grizzly? Then Havana?
     • Have to figure out our Neutron strategy
  25. Why Not Neutron Now?
     • Created in the Diablo timeframe
     • Neutron still not stable; API changes and interfaces are actively hostile
     • No multi-host support
     • Complicated, non-intuitive maintenance procedures
     • Not all plugins are equally stable; many are outright buggy
  26. SDN in OCS
     • OpenContrail is the only one to meet our requirements
     • The OCS reference network architecture is an ideal "underlay"; SDN underlays are usually spine and leaf
     • L3 routing does not interfere with supporting encapsulation or tunneling protocols
     • Customers can choose their network model: VPC or L3
     • A large customer who wants to seamlessly support autoscaling for its tenants is a perfect use case for VPC
  27. Example Packet Path - L3*
     [Diagram: Internet -> edge routers ("egress") -> core routers ("spine") -> ToR switches ("leaf") -> Linux bridge on compute node -> VMs; legend distinguishes routers (L3), switches (L2), Ethernet domains, IP traffic, and encapsulated (L3-over-L3) traffic]
     * Natters not shown for simplicity
  28. Example Packet Path
     [Diagram: the same physical path as the previous slide, with a network virtualization layer (vRouter/vSwitch abstraction) providing Layer-3 virtual networks for customers and apps on top of the Layer-3 physical network]
  29. Future Directions
     • OCS L3 networking migrates to Neutron as a networking plugin (beyond a nova-network replacement)
     • OCS VPC with more advanced SDN capabilities
     • NFV combined with service chaining for carriers
     • Support existing physical network assets with service chaining
  30. Questions? Abhishek Chanda, @rony358