OpenStack High Availability
Jakub Pavlik
About me
Jakub Pavlík
• Cloud Platform Engineer
• 3 years in Cloud
• 2 years in OpenStack
High Availability vs. Disaster Recovery
High Availability = fault detection & correction procedures to maximize
availability of critical services and applications, often in an automated
fashion.
Disaster Recovery = process of preparing for recovery or continuation of
technology infrastructure critical to an organization after a natural or
human-induced disaster.
High Availability ≠ Disaster Recovery!
Four types of HA in an OpenStack Cloud
Physical infrastructure
• Physical nodes
• Physical network
• Physical storage
• Hypervisor
• Host OS
• …
OpenStack Control services
• Compute Controller
• Network Controller
• Database
• Message Queue
• Storage
• …
VMs (OpenStack Compute)
• Virtual Machine
• Virtual Network
• Virtual Storage
• VM Mobility
• …
Applications
• Service Resiliency
• QoS
• Cost
• Transparency
• Data Integrity
• …
Physical Infrastructure
tcp cloud VPC Hardware
[Diagram: two hardware groups, each labelled Controller 1, Controller 2, SAN 1, SAN 2, Passthru 1, Passthru 2, together with Switch 1 and Switch 2]
• 168 cores at 3.46 GHz, 336 threads; aggregation 1:4: 1344 vCPU; 2688 GB RAM; 28 x 10GE ports
• 168 cores at 2.67 GHz, 336 threads; aggregation 1:4: 1344 vCPU; 1792 GB RAM; 28 x 10GE ports
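The vCPU figures above follow from the thread count and the ¼ (1:4) aggregation ratio. A minimal sketch of that arithmetic; the function name and parameters are illustrative and do not come from the deck:

```python
def vcpu_capacity(threads: int, aggregation: int = 4) -> int:
    """Return the number of schedulable vCPUs for a hardware thread count.

    `aggregation` is the oversubscription ratio: 4 means 1:4, i.e. four
    vCPUs per hardware thread (an assumption read off the slide).
    """
    if threads <= 0 or aggregation <= 0:
        raise ValueError("threads and aggregation must be positive")
    return threads * aggregation


# Both hardware groups on the slide list 336 threads.
print(vcpu_capacity(336))
```

With 336 threads and a 1:4 ratio this reproduces the 1344 vCPU stated for each group.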
OpenStack Control services
OpenStack modules – TCP VPC
OpenStack High Availability Concepts
Stateless services
• There is no dependency between requests
• For example APIs: Nova, Keystone, Glance, Cinder, etc.
Stateful services
• An action typically comprises multiple requests
• For example: MySQL, RabbitMQ, etc.
Active/Passive
• Redundant instances of stateless services are load balanced
• For stateful services, a replacement resource can be brought online
Active/Active
• Redundant instances of stateless services are load balanced
• Stateful services are managed in such a way that services are redundant and all instances have an identical state
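Because stateless requests do not depend on each other, any healthy instance can serve the next one. The sketch below illustrates that idea with a round-robin picker that skips instances marked unhealthy. It is a toy model, not how HAProxy is implemented, and the class, method, and backend names are hypothetical.

```python
from itertools import cycle
from typing import Dict, Iterator, List


class RoundRobinBalancer:
    """Toy round-robin balancer over redundant stateless instances."""

    def __init__(self, backends: List[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._healthy: Dict[str, bool] = {b: True for b in self._backends}
        self._ring: Iterator[str] = cycle(self._backends)

    def mark(self, backend: str, healthy: bool) -> None:
        """Record the result of a (hypothetical) health check."""
        self._healthy[backend] = healthy

    def pick(self) -> str:
        """Return the next healthy backend, skipping failed ones."""
        for _ in range(len(self._backends)):
            candidate = next(self._ring)
            if self._healthy[candidate]:
                return candidate
        raise RuntimeError("no healthy backend available")


lb = RoundRobinBalancer(["api-1", "api-2", "api-3"])
print([lb.pick() for _ in range(3)])
lb.mark("api-2", False)  # simulate a failed API node
print([lb.pick() for _ in range(3)])
```

The same approach does not carry over to stateful services, since a request sent to a different instance may not see the state left by the previous one.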
Corosync, Pacemaker and HAProxy
Corosync
• Totem single-ring ordering and membership protocol
• Provides UDP- and InfiniBand-based messaging, quorum, and cluster membership to Pacemaker
Pacemaker
• High availability and load balancing stack for the Linux platform
• Interacts with applications through Resource Agents (RA)
HAProxy
• Load balancing and proxying for HTTP and TCP applications
• Works over multiple connections
• Used to load balance API services
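Quorum is what lets a cluster decide which partition may keep running resources after a split. A minimal sketch assuming a simple majority rule with one vote per node; Corosync's real quorum service has further options that are not modelled here, and the function names are illustrative:

```python
def quorum_threshold(total_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if total_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return total_nodes // 2 + 1


def has_quorum(total_nodes: int, reachable_nodes: int) -> bool:
    """True if the reachable partition holds a majority of votes."""
    return reachable_nodes >= quorum_threshold(total_nodes)


def tolerated_failures(total_nodes: int) -> int:
    """How many nodes can fail while the rest still hold quorum."""
    return total_nodes - quorum_threshold(total_nodes)


for n in (2, 3, 5):
    print(n, quorum_threshold(n), tolerated_failures(n))
```

Under this rule a two-node cluster tolerates no failures while a three-node cluster tolerates one, which is one reason HA control planes are commonly built from at least three controllers.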
MySQL Galera
Synchronous multi-master cluster technology for MySQL/InnoDB
• MySQL patched for wsrep (Write Set REPlication)
• Active/active multi-master topology
• Read and write to any cluster node
• True parallel replication at row level
• No slave lag or integrity issues
Sample OpenStack HA architecture
Stateful
• Cinder Volume
• Neutron L3, DHCP agents
• Ceilometer central agent
• RabbitMQ
Stateless
• Neutron Server
• OpenStack APIs
• Apache web server
• Nova Scheduler
• Cinder Scheduler
Neutron agents (Active)
Neutron agents (Hot Standby)
VMs – Compute nodes
VMs HA – two layers
Storage
• Shared storage filesystem – file disks (qcow2, vmdk, vhd)
• Block storage
Network
• Vanilla Neutron L3 agent (Open vSwitch, Linux Bridge)
• Vendor plugins - SDN controller
No vSphere Style HA with KVM
Non-Shared/Shared Storage filesystem
Shared Storage
• Live migration – only RAM is transferred
• Hypervisor Evacuation – the instance will be booted from the same disk and data will be preserved
• CEPH, Gluster, NFS, Samba, GFS
Non-Shared Storage
• Block Live Migration – disk and RAM are transferred
• Hypervisor Evacuation – the instance will be booted from a new disk, but will preserve its configuration, e.g. id, name, uuid
• Standard filesystem: EXT4, etc.
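The four cases on this slide can be read as a small decision table keyed on whether the source hypervisor is still running and whether instance disks are on shared storage. A sketch under that reading; the enum and function names are hypothetical and this is not the Nova API:

```python
from enum import Enum


class Action(Enum):
    LIVE_MIGRATION = "live migration (RAM only)"
    BLOCK_LIVE_MIGRATION = "block live migration (disk and RAM)"
    EVACUATE_SAME_DISK = "evacuation, boots from the same disk, data preserved"
    EVACUATE_NEW_DISK = "evacuation, boots from a new disk, configuration preserved"


def plan_move(host_alive: bool, shared_storage: bool) -> Action:
    """Pick the recovery or mobility action described on the slide.

    A running host allows a live move; a failed host requires evacuation.
    Shared storage decides whether the disk has to be copied or is lost.
    """
    if host_alive:
        return Action.LIVE_MIGRATION if shared_storage else Action.BLOCK_LIVE_MIGRATION
    return Action.EVACUATE_SAME_DISK if shared_storage else Action.EVACUATE_NEW_DISK


for alive in (True, False):
    for shared in (True, False):
        print(alive, shared, plan_move(alive, shared).value)
```

The only case that loses instance data is a failed hypervisor without shared storage.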
Block Storage - Cinder
• Instance boots from volume
• iSCSI/FC direct mapping to instance
• Enables live migration
• Cinder backends
  • LVM driver: default Linux iSCSI server
  • Vendor software plugins: Gluster, CEPH, VMware VMDK driver
  • Vendor storage plugins: EMC VNX, IBM Storwize, SolidFire, etc.
Networking - Vanilla Neutron L3 agent
Problems
• Routing on a Linux server (max. bandwidth approximately 3-4 Gbit/s)
• Limited distribution across multiple network nodes
• East-West and North-South communication goes through the network node
High Availability
• Pacemaker & Corosync
• Keepalived VRRP
• DVR + VRRP – should be in the Juno release
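With VRRP, the network nodes agree on a single master that owns the virtual router address: the highest priority wins and a tie goes to the higher IP address. The sketch below models only that election rule. It ignores preemption settings and advertisement timers, is not Keepalived's implementation, and uses made-up node names and addresses.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import Iterable, Optional


@dataclass(frozen=True)
class Router:
    name: str
    priority: int          # higher value wins the election
    address: IPv4Address   # tie-breaker: higher address wins
    alive: bool = True


def elect_master(routers: Iterable[Router]) -> Optional[Router]:
    """Return the router that should own the virtual IP, if any is alive."""
    candidates = [r for r in routers if r.alive]
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r.priority, r.address))


nodes = [
    Router("net-1", 150, IPv4Address("10.0.0.11")),
    Router("net-2", 100, IPv4Address("10.0.0.12")),
]
print(elect_master(nodes).name)

# Simulate failure of the current master: the backup takes over.
nodes[0] = Router("net-1", 150, IPv4Address("10.0.0.11"), alive=False)
print(elect_master(nodes).name)
```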
Networking – Vendor SDN Controller plugins
Examples
• Juniper OpenContrail, VMware NSX, SDN PLUMgrid
Advantages over the Neutron L3 agent
• North-South communication on network devices (iBGP, MPLSoverGRE)
• East-West communication directly between compute nodes
• Higher bandwidth (9.7 Gbit/s per 10 Gbit/s port)
High Availability
• iBGP peering into two routers
• Native HA implemented inside the network devices
OpenStack HA – TCP VPC
[Architecture diagram labels: 3 x OpenStack Controller (MySQL, RabbitMQ) with GALERA; 3 x Contrail Database (Zookeeper, Cassandra); 3 x Contrail Config with Analytics & WebUI; 2 x Contrail Control; 3 x HAProxy; VIP; Bond Interface; 2 x Pacemaker, Corosync]
TCP Virtual Private Cloud
HA methods - vendors
Vendor | Cluster/Replication Technique | Characteristics
RackSpace | Keepalived, HAProxy, VRRP, DRBD | Automatic - Chef
Red Hat | Pacemaker, Corosync, Galera | Manual installation/Foreman
Cisco | Keepalived, HAProxy, Galera | Manual installation, at least 3 controllers
tcp cloud | Pacemaker, Corosync, HAProxy, Galera, Contrail | Automatic Salt-Stack deployment
Mirantis | Pacemaker, Corosync, HAProxy, Galera | Automatic - Puppet
Thank you for your attention!
