Open Cloud System Networking Vision

Presentation to Silicon Valley Virtual Networking Meetup group in December 2012 by Dan Sneddon of Cloudscaling.

  1. 1. Open Cloud Networking Vision: The state of OpenStack networking and a vision of things to come... Dan Sneddon, Member Technical Staff, Twitter: @dxs. OCS 2.0: Public Cloud Benefits | Private Cloud Control | Open Cloud Economics. Bay Area Network Virtualization Meetup – Dec 2012. CCA - NoDerivs 3.0 Unported License - Usage OK, no modifications, full attribution. Thursday, December 13, 12
  2. 2. Our Journey Today: 1. Cloudscaling Introduction 2. Elastic Clouds and Private Hybrid Clouds 3. Advanced Networking 4. Future Vision 5. Network-Based Service Resiliency
  3. 3. Elastic Clouds
  4. 4. Two Cloud Infrastructure Models
  5. 5. Two Cloud Infrastructure Models: 1) Enterprise Virtualization (Legacy Apps)
  6. 6. Two Cloud Infrastructure Models: 1) Enterprise Virtualization (Legacy Apps) 2) Elastic Infrastructure (New Dynamic Apps)
  7. 7. Elastic Cloud vs Enterprise Virtualization (each row: Enterprise Virtualization / Elastic Cloud)
      • Applications: Traditional & Legacy / Dynamic
      • Scaling Architecture: Managed Silos / Horizontal
      • Technology Stack: Heavy & Proprietary / Distributed & Open
      • Price/Performance: Low / High (4-7x better)
      • Failure Domains: Large / Small
      • Provisioning: Slower & Manual / Faster & 100% API
      • Best For: Server consolidation and lower datacenter mgmt costs / On-demand, scale-out infrastructure for new apps
  8. 8. Public Elastic Clouds Are Not Enough. Why? • Loss of control/governance • Security concerns • Integration w/ existing systems • Data privacy/patriation issues • Expensive at scale • Limited performance options • Geographic reach
  9. 9. Wanted: Private Elastic Clouds • Dynamic Applications • Public Benefits • Private Control
  10. 10. Wanted: Private/Public Federation. Private and public clouds with common architecture & behavior
  11. 11. Wanted: Open Economics/Choice • No vendor lock-in • Open source software • Standard interfaces • Commodity hardware • Community/rapid innovation
  12. 12. Can I Build A Private Elastic Cloud Myself? Yes, but painfully: • Highly complex • Costly to build and maintain • Not where the value is • No innovation path • Not production-grade • Requires special skills
  13. 13. Our Solution: Open Cloud System
  14. 14. Our Product: Open Cloud System 2.0. The most reliable, scalable and production-grade solution for private elastic clouds powered by OpenStack technology.
  15. 15. OpenStack Elastic Cloud • Compute (Nova) • Block Storage (Cinder) • Object Storage (Swift / Hadoop) • Advanced Networking (Quantum)
  16. 16. Production-Grade Focus • Performance: Can the system guarantee quality of service, responsiveness, scalability, and economic performance? • Availability: Does the system offer redundancy, resiliency, fault isolation, graceful degradation, and scale-out engineering? • Security: Are best practices of default deny, least privilege, a minimal attack surface, and encryption/data privacy followed? • Maintainability: Does the system provide a reliable upgrade path for new enhancements & system updates, transparency & measurability of performance, comprehensive lifecycle management, and predictable behavior?
  17. 17. OCS is Production OpenStack. OCS is more than just software ... it’s OpenStack release synchronicity, Cloudscaling innovations, a compelling roadmap, community involvement & production support from a team with deep operational experience. OCS is open core software that tracks the OpenStack release cycle closely. [Diagram labels: Virtual Machines (Nova), Object Storage (Swift), VM Image Management (Glance), Identity Service (Keystone); Essex, Folsom, Future; Hardware Lifecycle Mgmt, Topology & Data Model, Block-based Architecture, Security Hardening, Scale-Out Networking, Service Redundancy, Public Cloud Compatibility] OpenStack ecosystem contributions: aws-compat, zeromq-rpc-driver, nova-gerrit-monitor, tarkin, cs-nova-simplescheduler, sheep
  18. 18. Advanced Networking
  19. 19. OCS Advanced Networking • Layer 3 Networking Plugin • Distributed NAT Service • Network Definition API • Tenant Isolation • Distributed Load Balancing
  20. 20. Distributed NAT Service
  21. 21. Brokerless Messaging With ZeroMQ: Avoiding RabbitMQ’s Single Point Of Failure [Diagram: RabbitMQ (Brokered): Nova-Compute, Nova-Scheduler, and Nova-API connect through a RabbitMQ Broker, labeled Single Point Of Failure]
  22. 22. Brokerless Messaging With ZeroMQ: Avoiding RabbitMQ’s Single Point Of Failure [Diagram: RabbitMQ (Brokered), where Nova-Compute, Nova-Scheduler, and Nova-API connect through a RabbitMQ Broker labeled Single Point Of Failure, vs. ZeroMQ (Peer To Peer), where Nova-Compute, Nova-Scheduler, and Nova-API connect to each other directly]
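The difference between the two topologies on this slide can be sketched as a small reachability model. This is only an illustration of the hub-and-spoke versus full-mesh argument, not of RabbitMQ or ZeroMQ themselves; the function names and the `broker` node label are hypothetical and no messaging library is used.

```python
from itertools import combinations

# Service names follow the slide; "broker" is a hypothetical label for the hub.
SERVICES = ["nova-api", "nova-scheduler", "nova-compute"]


def brokered_links(services, broker="broker"):
    """Hub and spoke: every service holds a single link to the broker."""
    return {frozenset((service, broker)) for service in services}


def peer_to_peer_links(services):
    """Full mesh: every service holds a direct link to every other service."""
    return {frozenset(pair) for pair in combinations(services, 2)}


def can_communicate(links, a, b, failed):
    """Return True if a can still reach b once the failed node is removed."""
    if failed in (a, b):
        return False
    live = [link for link in links if failed not in link]
    seen, frontier = {a}, [a]
    while frontier:
        node = frontier.pop()
        for link in live:
            if node in link:
                (other,) = link - {node}
                if other not in seen:
                    seen.add(other)
                    frontier.append(other)
    return b in seen


def surviving_pairs(links, services, failed):
    """List the service pairs that can still exchange messages."""
    return [
        pair
        for pair in combinations(services, 2)
        if can_communicate(links, pair[0], pair[1], failed)
    ]


brokered = brokered_links(SERVICES)
mesh = peer_to_peer_links(SERVICES)
```

In this model, losing the broker leaves no pair of services able to talk, while losing any single peer in the mesh only removes the pairs that involved that peer.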
  23. 23. Quantum Development
  24. 24. OpenStack Networking: Nova Manages VMs, Quantum Manages Network
  25. 25. Quantum Architecture: Unified Open APIs For Advanced Networking * Image From OpenStack Documentation
  26. 26. Quantum in Folsom (2012) • Layer 3 plugin to Nova Networking • DHCP service plugin • Network Address Translation (NAT) • No overlapping IP assignments • Very limited multi-tenant security
  27. 27. Quantum in Grizzly (2013) • Complete Rewrite of Quantum API (v3) • Security Group Enhancements • Distributed Firewall Configuration API • Distributed Load Balancer Services API • VPN Services API • Better integration with OpenFlow Controllers
  28. 28. Quantum Compatibility: Lots Of Choices For Virtual Network/SDN Providers • Open vSwitch. http://www.openvswitch.org/openstack/documentation • Nicira NVP. quantum/plugins/nicira/nicira_nvp_plugin/README and http://www.nicira.com/support • Midokura. http://www.midokura.com/midonet/openstack/ • BigSwitch. http://www.bigswitch.com/sites/default/files/sdn_resources/openstack_aag.pdf • Cisco. quantum/plugins/cisco/README and http://wiki.openstack.org/cisco-quantum • Linux Bridge. quantum/plugins/linuxbridge/README and http://wiki.openstack.org/Quantum-Linux-Bridge-Plugin • Ryu. quantum/plugins/ryu/README and http://www.osrg.net/ryu/using_with_openstack.html • NEC OpenFlow. http://wiki.openstack.org/Quantum-NEC-OpenFlow-Plugin
  29. 29. Network Virtualization
  30. 30. Network Virtualization Use Case: Isolation and Network Tunnels in Multi-Tenant VM Host
  31. 31. Origin Of Virtual Networking: It makes more sense if you know where it came from • Despite the appearance of new technologies, virtual networking and SDN have been around a long time • Prior to the latest protocols, virtual networking was achieved through creative use of VLANs, proxies, dynamic routing, GRE tunnels, and expect scripts • MPLS was an attempt by Cisco and the telecom providers to do managed virtual networking • SDN isn’t new either, e.g. CiscoWorks, NETCONF, OPNET
  32. 32. Future Of Virtual Networking: A few bold predictions • Increased adoption of OpenFlow is an obvious inevitability • The biggest change on the horizon: virtualized hardware • OSS soon to be a viable option for elastic cloud networking • Networking will be about programming, not configuring • It is already easier to do virtual networking at scale, but soon it will also be cheaper, leading to massive disruption
  33. 33. The Path To The Future • Virtualizing network hardware: VRFs, VM support on routers, VXLANs, virtual instances • SDN needs to be a software and hardware approach; device configuration should be done through common APIs • A dramatic shift away from HA and failover, and toward distributed services with smaller failure domains and an expectation of failure
  34. 34. Service Resiliency
  35. 35. What Do We Mean By “HA”? We mean what most people mean ... Two servers or network devices that look like one
  36. 36. “HA HA”? HA pairs come in a couple of flavors: Active / Passive
  37. 37. “HA HA”? People like this flavor best, but it’s not always possible... Active / Active
  38. 38. “HA HA HA HA HA”?? Many people wish they could get it more like this ... HA cluster, aka ‘massive operational nightmare’
  39. 39. What is Scale-out? • Scale-up - Make boxes bigger (usually an HA pair) • Scale-out - Make moar boxes [Diagram: boxes A, B for scale-up; boxes A, B, C, D ... N for scale-out]
  40. 40. Scaling out is a mindset. Scaling up is like treating your servers as pets (bowzer.company.com). Servers *are* cattle (web001.company.com)
  41. 41. “HA” Pairs Are an All-in Move: They better not fail ...
  42. 42. Risk Reduction: Many small failure domains are usually better
  43. 43. Big failure domains vs. small: Would you rather have the whole cloud down or just a small bit for a short period of time? Still a scale-up pattern ... wouldn’t you rather scale-out?
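The trade-off on this slide comes down to simple arithmetic: if the cloud is split into equally sized failure domains, one failure removes one share of it. A minimal sketch follows, assuming equal-sized domains and a single failure at a time; the function name and the example domain counts are illustrative, not figures from the talk.

```python
from fractions import Fraction


def fraction_unavailable(failure_domains, domains_down=1):
    """Share of the cloud lost when `domains_down` equal-sized domains fail."""
    if failure_domains < 1 or not 0 <= domains_down <= failure_domains:
        raise ValueError("domains_down must be between 0 and failure_domains")
    return Fraction(domains_down, failure_domains)


# Hypothetical domain counts: one big domain versus progressively smaller ones.
for domains in (1, 4, 20):
    lost = fraction_unavailable(domains)
    print(f"{domains:>2} failure domain(s): one failure removes {float(lost):.0%} of the cloud")
```

With a single large domain the whole cloud goes down; with twenty small ones the same event takes out one twentieth of it.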
  44. 44. What’s Usually an “HA” Pair in OpenStack? Everything ... • Service Endpoints (APIs) • Messaging System (RPC) • Worker Threads (e.g. Scheduler, Networking) • Database (MySQL)
  45. 45. What needs to be an HA pair? Not much needs state synchronization • Service Endpoints (APIs) • Messaging System (RPC) • Worker Threads (e.g. Scheduler, Networking) • Database (MySQL)
  46. 46. Traditional HA Pair Failover
  47. 47. Traditional HA Pair Failover
  48. 48. Fault Tolerance Methodologies
  49. 49. Fault Tolerance in OCS
  50. 50. Service Distribution: High Availability Without Compromise • Resilient • Stateless • Scale-out
  51. 51. Service Distribution Combines Standard Networking Technologies
      OSPF (/etc/quagga/ospfd.conf):
          router ospf
           ospf router-id 10.1.1.1
           network 10.1.255.1 area 0.0.0.0
      Anycast (/etc/quagga/zebra.conf):
          interface lo:2
           description Pound listening address
           ip address 10.1.255.1/32
      Load-Balancing Proxy (/etc/pound/pound.conf):
          ListenHTTP
              Address 10.1.255.1
              Port 8774
              xHTTP 1
              Service
                  BackEnd
                      Address 10.1.1.1
                      Port 8774
                  End
                  BackEnd
                      Address 10.1.1.2
                      Port 8774
                  End
              End
          End
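Because every proxy node in this design carries the same listener and back-end list, the Pound fragment on this slide lends itself to being generated rather than hand-edited. The sketch below renders a stanza of that shape from a list of back-end addresses. The directive names and addresses are the ones on the slide; the function name, the indentation, and the placement of `xHTTP 1` directly under `ListenHTTP` are assumptions, so check the output against the Pound documentation before using it.

```python
def render_pound_config(listen_address, port, backends):
    """Render a ListenHTTP stanza with one BackEnd block per back-end address.

    Hypothetical helper: the layout mirrors the slide, not a verified template.
    """
    lines = [
        "ListenHTTP",
        f"    Address {listen_address}",
        f"    Port {port}",
        "    xHTTP 1",
        "    Service",
    ]
    for backend in backends:
        lines += [
            "        BackEnd",
            f"            Address {backend}",
            f"            Port {port}",
            "        End",
        ]
    lines += ["    End", "End"]
    return "\n".join(lines)


# Anycast listener address and back-ends as shown on the slide.
config = render_pound_config("10.1.255.1", 8774, ["10.1.1.1", "10.1.1.2"])
print(config)
```

Adding a back-end then means appending one address to the list and redistributing the rendered file to each proxy.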
  52. 52. Resilient OpenStack: Horizontally Scalable, No Single Point Of Failure • Service Endpoints (APIs): Service Distribution • Messaging System (RPC): ZeroMQ • Worker Threads (e.g. Scheduler, Networking): Service Distribution • Database (MySQL): MMR + HA
  53. 53. Service Distribution Advantages: What Makes This a Superior Solution? • True horizontal scalability with no centralized controller • Services are always running, so failover is nearly instant • Reduced complexity, fewer idle resources • No need for separate load balancers [Diagram: Failover vs. Distributed Services]
  54. 54. Site Failover and Global LB: Service Distribution Works With Multiple Sites • Traditional HA pairs do not support cross-site resiliency • Service Distribution can fail over across sites without DNS redirection
  55. 55. Service Distribution in Action. Example: Distributed Load Balancing. 1) OSPF [Diagram: two HTTP Proxy nodes, each running Quagga, send OSPF advertisements to the OSPF Router(s)]
  56. 56. Service Distribution in Action. Example: Distributed Load Balancing. 1) OSPF 2) ECMP Per-flow Load Balancing 3) Load-balancing [Diagram: HTTP Proxy nodes running Quagga send OSPF advertisements to the OSPF Router(s), which apply per-flow load balancing across the proxies]
  57. 57. Service Distribution in Action. Example: Distributed Load Balancing. 1) OSPF 2) ECMP Per-flow Load Balancing 3) Load-balancing 4) Unlimited # of Back-End Servers [Diagram: HTTP Proxy nodes running Quagga send OSPF advertisements to the OSPF Router(s), which apply per-flow load balancing across the proxies; the proxies balance across the back-end servers]
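Step 2, per-flow ECMP, can be illustrated with a hash over a flow's 5-tuple: every packet of one flow lands on the same proxy, while different flows spread across all proxies advertising the anycast address. This is a sketch of the general idea only. Real routers use their own vendor-specific hash functions, and SHA-256, the function name, the client address, and the third proxy address here are stand-ins chosen for illustration.

```python
import hashlib


def pick_next_hop(flow, next_hops):
    """Map a flow 5-tuple to one next hop; the same flow always gets the same hop."""
    if not next_hops:
        raise ValueError("no next hops are advertising the route")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return next_hops[int.from_bytes(digest[:8], "big") % len(next_hops)]


# Proxies advertising the anycast address; 10.1.1.3 is a hypothetical third node.
proxies = ["10.1.1.1", "10.1.1.2", "10.1.1.3"]

# (source IP, destination IP, protocol, source port, destination port)
flow = ("192.0.2.10", "10.1.255.1", "tcp", 40000, 8774)

print(pick_next_hop(flow, proxies))
# If a proxy stops advertising, the router simply hashes over the remaining hops.
print(pick_next_hop(flow, proxies[:2]))
```

Note that a plain modulo hash like this one can move existing flows to a different proxy when the set of next hops changes, which is one reason stateless proxies suit this design.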
  58. 58. Failure Resiliency [Diagram: four clients send flows 1-4 through five Load Balancer/Proxy nodes to ten servers; label: 10% Server Load Each]
  59. 59. Failure Resiliency [Diagram: same topology with one Load Balancer/Proxy marked X; label: 10% Server Load Each]
  60. 60. Failure Resiliency [Diagram: same topology with one server marked X; label: Increased Server Load]
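The "increased server load" in the last diagram is easy to quantify. A minimal sketch, assuming load is spread evenly and that the ten servers at 10% each shown on the slide represent the full load; the function name and the HA-pair comparison are illustrative additions.

```python
def load_per_server(servers, total_load=100.0):
    """Even share of the total load (in percent) carried by each live server."""
    if servers < 1:
        raise ValueError("at least one server must remain")
    return total_load / servers


healthy = load_per_server(10)          # ten servers share the load
degraded = load_per_server(9)          # one server has failed
ha_pair_survivor = load_per_server(1)  # an HA pair after failover, for comparison

print(f"10 servers: {healthy:.1f}% each")
print(f" 9 servers: {degraded:.1f}% each")
print(f"HA pair after failover: {ha_pair_survivor:.1f}% on the survivor")
```

Losing one of ten servers raises each survivor's share by roughly a ninth of its previous load, whereas the surviving half of an HA pair has to carry everything.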
  61. 61. CCA - NoDerivs 3.0 Unported License - Usage OK, no modifications, full attribution
