Deep Dive into Highly Available OpenStack Architecture, OpenStack Summit Vancouver 2015

OpenStack HA

Published in: Engineering

  1. 1. Deep Dive into Highly Available OpenStack Architecture. Arthur Berezin, Sr. Technical Product Manager, Red Hat. OpenStack Summit Vancouver, May 2015
  2. 2. Agenda ★ HA Enabling Services: Pacemaker and HAProxy ★ Shared Services: MariaDB w/ Galera, RabbitMQ w/ Mirrored Queues ★ OpenStack Services: Keystone, Nova, Neutron, Glance, Cinder, Horizon ★ Topologies: Controller, Compute, Network, Storage
  3. 3. cc: Morio2015 Source: https://www.wikiwand.com/en/Scuderia_Ferrari
  4. 4. Losing Your Controller https://www.youtube.com/watch?v=Kb43Nxuwc4I
  5. 5. High Availability ● Minimize downtime by avoiding single points of failure (SPOF) ● Create service redundancy ○ Active-Active when possible ■ Stateless services or services with internal HA support ○ Active-Passive if nothing else is applicable ● Scale-out architecture
  6. 6. HA Enabling Technologies: Pacemaker, HAProxy
  7. 7. Pacemaker ● Cluster resource manager ● Uses Corosync for cluster communication ● Monitors and controls resources: ○ Floating virtual IP address (VIP) ○ systemd/LSB/OCF services ○ Cloned services (Active/Active) ● STONITH: fencing with power management ○ Important for ensuring data consistency
  8. 8. Pacemaker ● Virtual IP (VIP) ● systemd cloned resource ● STONITH fencing [Diagram: Node 1 (192.168.1.1) and Node 2 (192.168.1.2), each running pcsd, STONITH, and a cloned OpenStack service, behind virtual IP 10.0.0.1]
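
The VIP and fencing behavior on this slide can be sketched as a toy model. This is a minimal illustration, assuming a two-node cluster with the addresses from the diagram; the `Node` and `Cluster` classes and the `fence` and `monitor` methods are hypothetical names, not Pacemaker's API or implementation. It shows one idea only: the VIP moves to a survivor after the failed holder has been powered off.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    address: str
    healthy: bool = True
    powered_on: bool = True


@dataclass
class Cluster:
    """Toy model of a cluster manager that owns one floating VIP."""
    nodes: list
    vip: str
    vip_holder: Node = field(init=False)

    def __post_init__(self):
        # The first node holds the VIP initially.
        self.vip_holder = self.nodes[0]

    def fence(self, node):
        # STONITH stand-in: power the node off so it cannot keep the VIP.
        node.powered_on = False

    def monitor(self):
        """One monitoring pass: fence a failed VIP holder, then move the VIP."""
        if self.vip_holder.healthy:
            return self.vip_holder
        self.fence(self.vip_holder)
        survivors = [n for n in self.nodes if n.healthy and n.powered_on]
        if not survivors:
            raise RuntimeError("no healthy node left to hold the VIP")
        self.vip_holder = survivors[0]
        return self.vip_holder


node1 = Node("node1", "192.168.1.1")
node2 = Node("node2", "192.168.1.2")
cluster = Cluster(nodes=[node1, node2], vip="10.0.0.1")

node1.healthy = False            # simulate a failure on the VIP holder
new_holder = cluster.monitor()
print(new_holder.name, node1.powered_on)
```

Fencing before moving the VIP is the point of the sketch: the old holder is guaranteed to be off before another node answers on the same address.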
  9. 9. HAProxy Load Balancer Load Balancing and Proxy for HTTP/TCP ● Mature and popular with web applications ● Health Checking ● Load Distribution
  10. 10. HAProxy ● Load distribution ○ Round robin ○ Stick-table ● API isolation ● Failure detection [Diagram: an HAProxy load balancer in front of service instances on Node 1, Node 2, and Node 3]
  11. 11. Avoiding SPOFs: A Day in the Life of a Highly Available Service
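
Round-robin distribution combined with failure detection can be sketched in a few lines of Python. This is an illustrative model only, not HAProxy's implementation; `RoundRobinBalancer`, `health_check`, and `pick` are hypothetical names, and the probe is a stand-in for a real health check.

```python
import itertools


class RoundRobinBalancer:
    """Toy round-robin balancer that skips backends marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.up = {b: True for b in self.backends}
        self._cycle = itertools.cycle(self.backends)

    def health_check(self, probe):
        # probe is a callable returning True when the backend answers.
        for backend in self.backends:
            self.up[backend] = bool(probe(backend))

    def pick(self):
        # Try each backend at most once per call.
        for _ in range(len(self.backends)):
            backend = next(self._cycle)
            if self.up[backend]:
                return backend
        raise RuntimeError("no healthy backend available")


lb = RoundRobinBalancer(["node1", "node2", "node3"])
first_pass = [lb.pick() for _ in range(3)]

lb.health_check(lambda backend: backend != "node2")   # node2 fails its check
second_pass = [lb.pick() for _ in range(4)]
print(first_pass, second_pass)
```

Once `node2` fails its check, requests alternate between the two remaining nodes without the client noticing.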
  12. 12. Horizon Controller Give Me Horizon Web UI NOW!
  13. 13. Horizon Controller Give Me Horizon Web UI NOW! Single Point Of Failure
  14. 14. Horizon Controller 1 Horizon Controller 2 Horizon Controller 3 Give Me Horizon Web UI NOW! HAProxy Controller 1
  15. 15. Horizon Controller 1 Horizon Controller 2 Horizon Controller 3 Give Me Horizon Web UI NOW! HAProxy Controller 1 Single Point Of Failure Each Could Fail
  16. 16. Horizon Controller 1 Horizon Controller 2 Horizon Controller 3 Give Me Horizon Web UI NOW! HAProxy Controller 1 Single Point Of Failure Pacemaker Cloned Horizon Service
  17. 17. Horizon Controller 1 Horizon Controller 2 Horizon Controller 3 Give Me Horizon Web UI NOW! HAProxy Controller 1 HAProxy Controller 3 HAProxy Controller 2 Pacemaker Cloned Horizon Service Pacemaker Cloned HAProxy Service
  18. 18. Pacemaker Cloned HAProxy Service Horizon Controller 1 Horizon Controller 2 Horizon Controller 3 HAProxy Controller 1 HAProxy Controller 3 HAProxy Controller 2 Give Me Horizon Web UI NOW! Horizon VIP Pacemaker Cloned Horizon Service
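
The progression in slides 12 to 18 can be checked with a small reachability model. This is a sketch under simplifying assumptions: the VIP can land on any controller running HAProxy, and HAProxy can forward to any controller running Horizon. The function names and the service labels are hypothetical.

```python
def can_serve(controllers):
    """Return True if some controller can hold the VIP and reach a Horizon backend.

    controllers maps a controller name to the set of services running there.
    """
    vip_candidates = [n for n, services in controllers.items() if "haproxy" in services]
    horizon_backends = [n for n, services in controllers.items() if "horizon" in services]
    return bool(vip_candidates) and bool(horizon_backends)


def survives_any_single_failure(controllers):
    # Remove each controller in turn and check the UI is still reachable.
    return all(
        can_serve({n: s for n, s in controllers.items() if n != failed})
        for failed in controllers
    )


single = {"controller1": {"haproxy", "horizon"}}
clustered = {
    "controller1": {"haproxy", "horizon"},
    "controller2": {"haproxy", "horizon"},
    "controller3": {"haproxy", "horizon"},
}

print(survives_any_single_failure(single))     # one controller is a SPOF
print(survives_any_single_failure(clustered))  # cloned HAProxy and Horizon
```

The single-controller layout fails the check, while the layout with cloned HAProxy and cloned Horizon on three controllers passes it.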
  19. 19. Shared Components: Database, Messaging
  20. 20. Galera with MariaDB ● Active-Active multi-master synchronous replication ● Automatic node joining ● Row-level parallel replication ● Native with MariaDB [Diagram: DB Node 1, DB Node 2, and DB Node 3, each running MariaDB with wsrep, linked by Galera replication]
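
The observable effect of multi-master synchronous replication and automatic node joining can be sketched with a toy model. This is not how Galera works internally (the real protocol relies on write-set certification and state transfer); `ToyMultiMasterCluster` and its methods are hypothetical names, and conflicts between concurrent writers are ignored.

```python
class ToyMultiMasterCluster:
    """Toy model: a write accepted by any node is applied on all nodes before it returns."""

    def __init__(self, node_names):
        self.data = {name: {} for name in node_names}

    def write(self, node, key, value):
        if node not in self.data:
            raise KeyError(f"unknown node: {node}")
        # Synchronous replication stand-in: every node applies the write.
        for store in self.data.values():
            store[key] = value

    def read(self, node, key):
        return self.data[node][key]

    def join(self, new_node, donor):
        # Node joining stand-in: the new node copies the donor's full state.
        self.data[new_node] = dict(self.data[donor])


db = ToyMultiMasterCluster(["db1", "db2", "db3"])
db.write("db1", "instance-42", "ACTIVE")    # write through one node
value_on_db3 = db.read("db3", "instance-42")  # read through another

db.join("db4", donor="db2")
value_on_db4 = db.read("db4", "instance-42")
print(value_on_db3, value_on_db4)
```

Because every node accepts writes and holds the same data, the load balancer can send a client to any database node.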
  21. 21. RabbitMQ Clustering with Mirrored Queues [Diagram: three clustered RabbitMQ nodes sharing a mirrored queue]
  22. 22. OpenStack Services: Keystone, Glance, Cinder, Nova, Neutron, Horizon
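
A mirrored queue keeps a copy of each message on several nodes, so the queue survives the loss of the node hosting its master. The sketch below is a toy model, assuming the longest-standing mirror is promoted when the master fails; `MirroredQueue` and its methods are hypothetical names and not part of any RabbitMQ client library.

```python
from collections import deque


class MirroredQueue:
    """Toy mirrored queue: the first node hosts the master, the rest are mirrors."""

    def __init__(self, nodes):
        self.order = list(nodes)                      # join order, master first
        self.replicas = {n: deque() for n in self.order}

    @property
    def master(self):
        return self.order[0]

    def publish(self, message):
        # Every replica stores the message.
        for queue in self.replicas.values():
            queue.append(message)

    def consume(self):
        # The master serves the message; mirrors drop their copy.
        message = self.replicas[self.master][0]
        for queue in self.replicas.values():
            queue.popleft()
        return message

    def fail(self, node):
        # Losing the master promotes the next node in join order.
        self.order.remove(node)
        del self.replicas[node]
        if not self.order:
            raise RuntimeError("all replicas lost")


queue = MirroredQueue(["rabbit1", "rabbit2", "rabbit3"])
queue.publish("boot instance A")
queue.publish("boot instance B")

first = queue.consume()
queue.fail("rabbit1")            # the node hosting the master goes away
new_master = queue.master
second = queue.consume()         # the pending message is still available
print(first, new_master, second)
```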
  23. 23. Keystone
  24. 24. Keystone HTTPD SQL: Assignments SQL: Identities LDAP: Identities API Call Keystone Service: ★ httpd/Keystone ○ API ○ Assignments ○ Identities ■ LDAP ■ SQL
  25. 25. ● Cloned stateless HTTPD service ● Same SSL certs on all nodes ● Cache is local on each host [Diagram: Node 1 and Node 2, each running pcsd, STONITH, HAProxy, and a cloned HTTPd/Keystone instance behind the Keystone VIP; API calls reach SQL (assignments, identities) and LDAP (identities)]
  27. 27. Glance
  28. 28. Glance SQLStorage Glance-API Glance Registry Service: ★ Glance-API ○ API ○ Storage Calls ★ Glance-Registry ○ Keeps images registry at the Database Ceilometer Notifications HTTP RabbitMQ
  29. 29. ● Both services are cloned Active/Active ● Both services sit behind the load balancer and a VIP [Diagram: Node 1 and Node 2, each running pcsd, STONITH, HAProxy, a cloned Glance-API, and a cloned Glance Registry; a Glance API VIP and a Glance Registry VIP; shared SQL and images store]
  32. 32. Cinder
  33. 33. Cinder Volume ★ Cinder-API ○ API ★ Cinder-Scheduler ○ Volumes placement ★ Cinder-Volume ○ Manages Storage ★ Cinder-Backup SQL Storage Cinder-API Cinder Scheduler RabbitMQ Cinder Backup Storage Cinder Driver VM Data Path
  34. 34. ● Cinder-API is a stateless cloned service ● Behind the load balancer and a VIP ● Cinder-Volume is A/P due to potential races ● Cinder-Backup is A/P [Diagram: Node 1 and Node 2, each running pcsd, STONITH, HAProxy, a cloned Cinder-API, and a cloned Scheduler behind the Cinder API VIP; Cinder-Volume runs Active/Passive, with its driver attached to Cinder storage]
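
The difference between cloned and Active/Passive placement for the Cinder services can be sketched as follows. This is an illustrative model only: the `placement` function and the mode labels are hypothetical, and the choice of the first healthy node as the active one is an assumption made for the example.

```python
CINDER_SERVICES = {
    "cinder-api": "cloned",
    "cinder-scheduler": "cloned",
    "cinder-volume": "active-passive",
    "cinder-backup": "active-passive",
}


def placement(mode, healthy_nodes):
    """Return the nodes on which a service should run, given its HA mode."""
    if mode == "cloned":
        # Active/Active: one instance on every healthy node.
        return list(healthy_nodes)
    if mode == "active-passive":
        # Exactly one instance, which avoids races between two volume managers.
        return list(healthy_nodes[:1])
    raise ValueError(f"unknown mode: {mode}")


before = {name: placement(mode, ["node1", "node2"]) for name, mode in CINDER_SERVICES.items()}
after = {name: placement(mode, ["node2"]) for name, mode in CINDER_SERVICES.items()}  # node1 failed
print(before["cinder-volume"], after["cinder-volume"])
```

With both nodes healthy, `cinder-api` runs on both while `cinder-volume` runs on one; after `node1` fails, the single `cinder-volume` instance is placed on `node2`.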
  35. 35. Nova
  36. 36. Nova Nova Compute ★ Nova-API ○ API ★ Nova-Scheduler ○ VM placement ★ Nova-Conductor ○ Updates DB on Compute’s behalf ★ Nova-Compute ○ Runs VM Instances SQL Nova-API Nova Scheduler RabbitMQ Nova Conductor libvirt/KVM VMVM
  37. 37. Compute Controller Services Nova Nova Compute ★ Nova-API ○ API ★ Nova-Scheduler ○ VM placement ★ Nova-Conductor ○ Updates DB on Compute’s behalf ★ Nova-Compute ○ Runs VM Instances SQL Nova-API Nova Scheduler RabbitMQ Nova Conductor libvirt/KVM VMVM
  38. 38. Controller Services ● Nova-API configured with LB and VIP ● Nova-API, Nova-Scheduler, and Nova-Conductor are stateless A/A cloned services [Diagram: Node 1 and Node 2, each running pcsd, STONITH, and cloned HAProxy, Nova-API, Scheduler, and Conductor behind the Nova-API VIP, with shared Nova SQL and RabbitMQ]
  39. 39. Compute Service ● Each host is independent ● Nova-compute watched locally by systemd ● VM HA not supported (yet), probably in Liberty [Diagram: Compute1 and Compute2, each running Nova Compute and libvirt/KVM with VMs]
  40. 40. Compute Service: Nova VM HA ● Probably supported in Liberty ● Each host is independent ● Nova-compute watched locally by systemd ● Liberty blueprint: Mark Host Down [Diagram: compute hosts, each running Nova Compute, libvirt/KVM with VMs, STONITH, and pacemaker_remote]
  41. 41. Neutron
  42. 42. ★ Neutron Server ○ API and Management ★ Neutron L2 Agent ○ L2 Traffic on compute ★ Neutron L3 Agent ○ Network Routing ★ DHCP Agent ★ LBaaS Agent Neutron SQL Neutron Server L2 Agent(s) Open vSwitch RabbitMQ L3 Agent DHCP Agent LBaaS Agent
  43. 43. Controller Neutron Network Node Compute2 SQL Internet Compute1 L2 Agent(s) Open vSwitch L2 Agent(s) Open vSwitch VMVMVMVMVM VM L3 Agent DHCP Agent L2 Agent(s) Open vSwitch RabbitMQ LBaaS Agent Neutron Server R1
  44. 44. Network Node2 Network Node1 A/P Cloned Cloned + VRRP L2 Agent L3 Agent LBaaS Agent RPC DHCP Agent Compute1 L2 Agent L3 Agent Controller1 LBaaS Agent DHCP Agent R1pcsd Neutron API VIP Controller2 pcsd Cloned Cloned HAProxy Neutron-API Neutron-API HAProxy R1
  45. 45. Neutron ● Kilo ○ L3 Agent HA with VRRP ○ DHCP Agent HA ● Liberty ■ L3 Agent - DVR ■ DVR + VRRP Longer Term ■ Distributed DHCP on compute nodes
  46. 46. Horizon
  47. 47. Service: ★ httpd/OpenStack- Dashboard ○ Django web app ○ Uses services APIs Horizon Browser Horizon Cinder API Neutron API Glance API Keystone API Nova API
  48. 48. Horizon ● Cloned stateless HTTPd service ● Same SSL certs on all nodes ● Cache is local on each host [Diagram: Node 1 and Node 2, each running pcsd, STONITH, HAProxy, and a cloned HTTPd/Horizon instance behind the Horizon VIP]
  49. 49. Topologies: Controller, Compute, Network, Storage
  50. 50. Active-Active Controller Cluster [Diagram: Controller 1, Controller 2, and Controller 3, each running HAProxy, Pacemaker, Keystone, Neutron, Cinder, ...; MariaDB on each controller with Galera multi-master replication; RabbitMQ with mirrored queues]
  51. 51. Controller Cluster Compute2Compute1 Nova Compute Nova Compute Controller 1 Controller 2 Controller 3 Controller Services Public Tenant Management Controller Services Controller Services Controller Services Controller Services Controller Services
  52. 52. Controller Cluster Compute2Compute1 Nova Compute Nova Compute Network Cluster Public Tenant Management Neutron Network Node1 Neutron Network Node2 Neutron Network Node3 Controller 1 Controller 2 Controller 3
  53. 53. Controller Cluster Compute2Compute1 Nova Compute Nova Compute Storage Cluster Storage Management Controller 1 Controller 2 Controller 3 Cinder Glance Node1 Cinder Glance Node2 Cinder Glance Node3 Volume Storage Image Store
  54. 54. Resources
  55. 55. Resources ● RDO HA Ref Arch: https://github.com/beekhof/osp-ha-deploy ● Layer 3 High Availability (VRRP, DVR, DHCP): http://assafmuller.com/2014/08/16/layer-3-high-availability/ ● DVR: http://assafmuller.com/2015/04/15/distributed-virtual-routing-overview-and-eastwest-routing/ ● Creating a Highly Available Red Hat OpenStack Platform Configuration (OSP5 and RHEL 7): https://access.redhat.com/articles/1150463 ● About High Availability with OpenStack Platform: https://access.redhat.com/articles/1274203 ● New nova API call to mark nova-compute down: https://review.openstack.org/#/c/169836/ ● The Different Facets of OpenStack HA: http://blog.russellbryant.net/2015/03/10/the-different-facets-of-openstack-ha/ ● Implementation of Pacemaker Managed OpenStack VM Recovery: http://blog.russellbryant.net/2015/04/08/implementation-of-pacemaker-managed-openstack-vm-recovery/
  56. 56. HA Talks during Summit ★ HA Infrastructure Talks ○ Pacemaker: OpenStack’s PID 1 ○ MariaDB Galera cluster: Best practices ★ High Availability Architecture ○ Deep Dive Into a Highly Available OpenStack Architecture ★ Real World Practices ○ Highly Available OpenStack: From Theory to Reality ○ Lessons learned on upgrades: the importance of HA and automation ○ Providing OpenStack Service High-Availability Through Anycast Routing ★ HA Storage Talks ○ Keeping OpenStack storage trendy with Ceph and containers ○ DRBD9 for OpenStack ○ The Road to Enterprise-Ready OpenStack Storage as Service ○ Dude, where is my volume ★ HA Networking Talks ○ Highly Available, Performant, VXLAN Service Node ○ IPv6 impact on Neutron L3 High Availability ○ High Availability and Resiliency Testing Strategies for OpenStack Clouds
  57. 57. Thank You