OpenStack High Availability


Presentation about high availability in OpenStack. It covers the Icehouse release and divides the topic into basic domains. September 2014

Published in: Technology


  1. OpenStack High Availability, Jakub Pavlík
  2. About me: Jakub Pavlík
       • Cloud Platform Engineer
       • 3 years in Cloud
       • 2 years in OpenStack
  3. High Availability vs. Disaster Recovery
       • High Availability = fault detection and correction procedures to maximize availability of critical services and applications, often in an automated fashion.
       • Disaster Recovery = process of preparing for recovery or continuation of technology infrastructure critical to an organization after a natural or human-induced disaster.
     High Availability ≠ Disaster Recovery!
  4. Four types of HA in an OpenStack Cloud
       • Physical infrastructure: physical nodes, physical network, physical storage, hypervisor, host OS, …
       • OpenStack Control services: compute controller, network controller, database, message queue, storage, …
       • VMs (OpenStack Compute): virtual machine, virtual network, virtual storage, VM mobility, …
       • Applications: service resiliency, QoS, cost transparency, data integrity, …
  5. Physical Infrastructure
  6. tcp cloud VPC Hardware
     Diagram labels: two hardware blocks, each with Controller 1, Controller 2, SAN 1, SAN 2, Passthru 1, Passthru 2; Switch 1, Switch 2.
       • 168 cores at 3.46 GHz, 336 threads; aggregation 1:4 gives 1344 vCPU; 2688 GB RAM; 28 x 10GE ports
       • 168 cores at 2.67 GHz, 336 threads; aggregation 1:4 gives 1344 vCPU; 1792 GB RAM; 28 x 10GE ports
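The vCPU figures on the hardware slide can be reproduced with a short calculation. This is a minimal sketch that assumes the slide's "aggregation ¼" means a 1:4 CPU overcommit ratio and that each core exposes two hardware threads (168 cores, 336 threads); the function name is hypothetical.

```python
def vcpu_capacity(cores: int, threads_per_core: int, overcommit: int) -> int:
    """Schedulable vCPUs = hardware threads multiplied by the overcommit ratio."""
    hardware_threads = cores * threads_per_core
    return hardware_threads * overcommit


# Values from the slide: 168 cores, 336 threads, 1:4 aggregation.
# The 2 threads per core is inferred from 336 / 168.
print(vcpu_capacity(cores=168, threads_per_core=2, overcommit=4))  # 1344
```

Both hardware blocks on the slide have the same core count, so they reach the same 1344 vCPU figure and differ only in clock speed and RAM.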
  7. OpenStack Control services
  8. OpenStack modules – TCP VPC
  9. OpenStack High Availability Concepts
     Stateless services
       • There is no dependency between requests
       • For example APIs: Nova, Keystone, Glance, Cinder, etc.
     Stateful services
       • An action typically comprises multiple requests
       • For example: MySQL, RabbitMQ, etc.
     Active/Passive
       • Redundant instances of stateless services are load balanced
       • For stateful services a replacement resource can be brought online
     Active/Active
       • Redundant instances of stateless services are load balanced
       • Stateful services are managed in such a way that services are redundant and all instances have an identical state
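Because stateless requests do not depend on each other, any healthy instance can serve any request. The following is a minimal sketch of that idea as a round-robin pool that skips failed backends. The class and the backend names (`api-1` and so on) are hypothetical and only illustrate the concept; in the deployments described here this job is done by a load balancer, not by application code.

```python
from itertools import cycle


class RoundRobinPool:
    """Rotate requests across redundant stateless backends, skipping failed ones."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call, starting after the last pick.
        for _ in range(len(self.backends)):
            backend = next(self._ring)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backend available")


pool = RoundRobinPool(["api-1", "api-2", "api-3"])
first_round = [pool.pick() for _ in range(3)]
pool.mark_down("api-2")  # simulate a failed API instance
second_round = [pool.pick() for _ in range(3)]
print(first_round, second_round)
```

A stateful service cannot be handled this way without extra work, since a request routed to a different instance would not see the state created by earlier requests.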
  10. Corosync, Pacemaker and HAProxy
      Corosync
        • Totem single-ring ordering and membership protocol
        • UDP and InfiniBand based messaging, quorum, and cluster membership to Pacemaker
      Pacemaker
        • High availability and load balancing stack for the Linux platform
        • Interacts with applications through Resource Agents (RA)
      HAProxy
        • Load balancing and proxying for HTTP and TCP applications
        • Works over multiple connections
        • Used to load balance API services
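The quorum that Corosync provides to Pacemaker follows a simple majority rule. A minimal sketch, assuming one vote per node and no special two-node configuration; the function names are hypothetical:

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return expected_votes // 2 + 1


def has_quorum(online_votes: int, expected_votes: int) -> bool:
    """A partition may keep running resources only if it holds a majority."""
    return online_votes >= quorum_threshold(expected_votes)


for nodes in (2, 3, 5):
    tolerated = nodes - quorum_threshold(nodes)
    print(f"{nodes} nodes: quorum at {quorum_threshold(nodes)}, tolerates {tolerated} failure(s)")
```

Under this rule a two-node cluster cannot lose a node and keep quorum, which is one reason three or more controllers are a common choice.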
  11. MySQL Galera: synchronous multi-master cluster technology for MySQL/InnoDB
        • MySQL patched for wsrep (Write Set REPlication)
        • Active/active multi-master topology
        • Read and write to any cluster node
        • True parallel replication at row level
        • No slave lag or integrity issues
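Since any Galera node accepts reads and writes, a load balancer or monitoring check needs a way to decide whether a given node is safe to use. The sketch below evaluates the `wsrep_*` status variables that a node reports through `SHOW STATUS LIKE 'wsrep_%'`. The function name, the dictionary input, and the default minimum cluster size of 3 are assumptions made for illustration; fetching the values from a live server is left out.

```python
def galera_node_ready(status: dict, min_cluster_size: int = 3) -> bool:
    """Return True if the node looks safe to receive reads and writes.

    `status` maps wsrep status variable names to their string values.
    `min_cluster_size` is an assumed policy value, not a Galera default.
    """
    return (
        status.get("wsrep_ready") == "ON"
        and status.get("wsrep_cluster_status") == "Primary"
        and status.get("wsrep_local_state_comment") == "Synced"
        and int(status.get("wsrep_cluster_size", 0)) >= min_cluster_size
    )


healthy = {
    "wsrep_ready": "ON",
    "wsrep_cluster_status": "Primary",
    "wsrep_local_state_comment": "Synced",
    "wsrep_cluster_size": "3",
}
print(galera_node_ready(healthy))
```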
  12. Sample OpenStack HA architecture
      Stateful
        • Cinder Volume
        • Neutron L3, DHCP agents
        • Ceilometer central agent
        • RabbitMQ
      Stateless
        • Neutron Server
        • OpenStack APIs
        • Apache web server
        • Nova Scheduler
        • Cinder Scheduler
      Diagram labels: Neutron agents (Active), Neutron agents (Hot Standby)
  13. VMs – Compute nodes
  14. VMs HA – two layers
      Storage
        • Shared storage filesystem – file disks (qcow2, vmdk, vhd)
        • Block storage
      Network
        • Vanilla Neutron L3 agent (Open vSwitch, Linux Bridge)
        • Vendor plugins – SDN controller
  15. No vSphere-style HA with KVM
  16. Non-Shared/Shared Storage filesystem
      Shared Storage
        • Live migration – RAM only
        • Hypervisor Evacuation – the instance will be booted from the same disk and data will be preserved
        • CEPH, Gluster, NFS, Samba, GFS
      Non-Shared Storage
        • Block Live Migration – disk and RAM
        • Hypervisor Evacuation – the instance will be booted from a new disk, but will preserve the configuration, e.g. id, name, uuid
        • Standard filesystem EXT4, etc.
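The difference between the two storage layouts can be written down as a small decision function. This is only a sketch of the rules stated on the slide; the `RecoveryPlan` type and `plan_for` function are hypothetical and do not correspond to any Nova API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryPlan:
    migration: str               # which live-migration variant applies
    transferred: tuple           # what must be copied between hypervisors
    evacuation_keeps_data: bool  # does evacuation boot from the same disk?


def plan_for(shared_storage: bool) -> RecoveryPlan:
    """Map the storage layout to the migration and evacuation behaviour."""
    if shared_storage:
        # The disk is already visible to the target hypervisor.
        return RecoveryPlan("live migration", ("RAM",), True)
    # Local disks: the disk must be copied, and evacuation starts from a new disk.
    return RecoveryPlan("block live migration", ("disk", "RAM"), False)


print(plan_for(shared_storage=True))
print(plan_for(shared_storage=False))
```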
  17. Block Storage – Cinder
        • Instance boots from volume
        • iSCSI/FC direct mapping to instance
        • Enable Live Migration
        • Cinder Backends
            • LVM Driver: default Linux iSCSI server
            • Vendor software plugins: Gluster, CEPH, VMware VMDK driver
            • Vendor storage plugins: EMC VNX, IBM Storwize, SolidFire, etc.
  18. Networking – Vanilla Neutron L3 agent
      Problems
        • Routing on a Linux server (max. bandwidth approximately 3-4 Gbit/s)
        • Limited distribution across multiple network nodes
        • East-West and North-South communication goes through the network node
      High Availability
        • Pacemaker & Corosync
        • Keepalived VRRP
        • DVR + VRRP – should be in the Juno release
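The Keepalived option relies on VRRP to decide which network node currently owns the virtual router address. A minimal sketch of the election rule follows: among the routers that are still advertising, the highest priority wins, and a tie is broken by the higher primary IP address. The router names, priorities and addresses are made-up examples, and the function models only the outcome of the election, not VRRP timers or advertisements.

```python
import ipaddress


def elect_master(routers):
    """Pick the VRRP master among live routers.

    `routers` is an iterable of (name, priority, primary_ip) tuples.
    Highest priority wins; ties go to the numerically higher IP address.
    """
    candidates = list(routers)
    if not candidates:
        raise ValueError("no live routers to elect from")
    winner = max(candidates, key=lambda r: (r[1], ipaddress.ip_address(r[2])))
    return winner[0]


nodes = [
    ("network-node-1", 150, "10.0.0.11"),
    ("network-node-2", 100, "10.0.0.12"),
]
print(elect_master(nodes))      # network-node-1
print(elect_master(nodes[1:]))  # network-node-2 takes over after node 1 fails
```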
  19. Networking – Vendor SDN Controller plugins
      Examples
        • Juniper OpenContrail, VMware NSX, SDN PLUMgrid
      Advantages over the Neutron L3 agent
        • North-South communication on network devices (iBGP, MPLSoverGRE)
        • East-West communication directly between compute nodes
        • Higher bandwidth (9.7 Gbit/s per 10 Gbit/s port)
      High Availability
        • iBGP peering into two routers
        • Native HA implemented inside the network devices
  20. OpenStack HA TCP VPC
      Diagram labels:
        • OpenStack Controller with MySQL and RabbitMQ (x3), joined by GALERA
        • Contrail Database with ZooKeeper and Cassandra (x3)
        • Contrail Config with Analytics & WebUI (x3)
        • Contrail Control (x2)
        • HAProxy (x3), VIP, Bond Interface
        • Pacemaker, Corosync (x2)
  21. TCP Virtual Private Cloud
  22. HA methods – vendors
      Vendor | Cluster/Replication Technique | Characteristics
      RackSpace | Keepalived, HAProxy, VRRP, DRBD | Automatic – Chef
      Red Hat | Pacemaker, Corosync, Galera | Manual installation/Foreman
      Cisco | Keepalived, HAProxy, Galera | Manual installation, at least 3 controllers
      tcp cloud | Pacemaker, Corosync, HAProxy, Galera, Contrail | Automatic Salt-Stack deployment
      Mirantis | Pacemaker, Corosync, HAProxy, Galera | Automatic – Puppet
  23. Thank you for your attention!