High Availability for OpenStack

HA for OpenStack:
Connecting the dots
Raghavan “Rags” Srinivas
Rackspace

OpenStack Meetup,
Boston on Feb. 19th 2014

Rags
•  Solutions Architect at Rackspace for OpenStack-based Rackspace Private Cloud
•  Speaker at JavaOne, RSA conferences, Sun Tech Days, JUGs and other developer conferences
•  Trying to help make OpenStack more App Developer friendly

Agenda

What is HA?
HA of OpenStack APIs
HA of RabbitMQ
MySQL HA
A Peek into HA Methods
Resources and Summary

OpenStack Design Tenets
1.  Scalability and elasticity are our main goals
2.  Any feature that limits our main goals must be optional
3.  Everything should be asynchronous
    –  a) If you can't do something asynchronously, see #2
4.  All required components must be horizontally scalable
5.  Always use shared nothing architecture (SN) or sharding
    –  a) If you can't share nothing/shard, see #2
6.  Distribute everything
    –  a) Especially logic. Move logic to where state naturally exists.
7.  Accept eventual consistency and use it where it is appropriate.
8.  Test everything
RACKSPACE® HOSTING | WWW.RACKSPACE.COM

What is HA?
•  Minimization of system downtime
•  Minimization of data/transaction loss
•  In case of multiple (or interrelated) failures, minimization of data loss is preferred over minimization of system downtime

HA as Nines

Availability            Downtime/Year
99% (two nines)         3.65 days
99.9%                   8.76 hours
99.99%                  52.56 minutes
99.999%                 5.26 minutes
99.9999% (six nines)    31.5 seconds
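
The downtime figures follow from multiplying the unavailable fraction by the length of a year. A minimal sketch of that arithmetic, assuming a 365-day year (the function name is only illustrative):

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: non-leap year


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in seconds, for a given availability."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99, 99.999, 99.9999):
        secs = downtime_seconds_per_year(pct)
        print(f"{pct}% -> {secs / 3600:.2f} hours ({secs:.1f} seconds)")
```

Each extra nine cuts the downtime budget by a factor of ten, which is why the higher tiers leave little room for manual recovery.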

Implementing HA
•  Elimination of Single Points of Failure (SPOFs)
•  Redundancy of network components such as switches and routers
•  Redundancy of applications and automatic service migrations
•  Redundancy of storage components
•  Redundancy of facilities services such as power, AC, etc.
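
A rough way to see why redundancy helps: if component failures are independent (a simplifying assumption that real deployments only approximate), redundant components are down only when all of them are down, while a chain of components is up only when every link is up. A minimal sketch with illustrative function names:

```python
from math import prod


def parallel(availabilities):
    """Availability of redundant components: up if at least one is up."""
    return 1 - prod(1 - a for a in availabilities)


def series(availabilities):
    """Availability of a chain of components: up only if every one is up."""
    return prod(availabilities)


if __name__ == "__main__":
    # Two redundant nodes, each available 99% of the time.
    print(f"redundant pair: {parallel([0.99, 0.99]):.4%}")
    # Two 99% components that must both work (each one is a SPOF).
    print(f"chained pair:   {series([0.99, 0.99]):.4%}")
```

Under these assumptions a redundant pair of two-nines nodes reaches four nines, whereas chaining the same two nodes drops below two nines.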

Components (High Level)
[Diagram: a Client connects through a VIP to NODE 1 and NODE 2. Each node runs Replication Services, Health Check, and Cluster Communication.]

Concepts
State: Stateless
•  There is no dependency between requests
•  No need for data replication/synchronization. A failed request may need to be restarted on a different node.
Example: Apache web server, Nova API, Nova Scheduler, etc.

State: Stateful
•  An action typically comprises multiple requests
•  Data needs to be replicated and synchronized between redundant services (to preserve state and consistency)
Example: MySQL, RabbitMQ, etc.
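
For a stateless service, restarting a failed request on a different node can be as simple as trying the next node. A minimal sketch of that idea; the node functions and the request string are hypothetical stand-ins, not OpenStack APIs:

```python
def call_with_failover(nodes, request):
    """Try a stateless request on each node in turn; return the first success."""
    last_error = None
    for node in nodes:
        try:
            return node(request)
        except ConnectionError as exc:
            last_error = exc  # this node failed; restart the request elsewhere
    raise RuntimeError("all nodes failed") from last_error


def dead_node(request):
    """Hypothetical node that is down."""
    raise ConnectionError("node 1 is down")


def healthy_node(request):
    """Hypothetical node that answers normally."""
    return f"handled {request}"


if __name__ == "__main__":
    print(call_with_failover([dead_node, healthy_node], "GET /servers"))
```

This works only because no request depends on an earlier one. A stateful service would also need its data replicated to the second node before the retry could succeed.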

More Concepts
Terminology    Description
Failover       Migration of a service from the “primary” to the “secondary”
Failback       Migration of a service back to the “primary”
Switchover     Migration is initiated manually

Even More Concepts

Active/Passive
o  There is a single master
o  Load balance stateless services using a VIP and a load balancer such as HAProxy
o  For stateful services a replacement resource can be brought online. A separate application monitors these services, bringing the backup online as necessary
o  After a failover the system will encounter a speed bump since the passive node has to notice the fault in the active node and become active

Active/Active
o  Multiple masters
o  Load balance stateless services using a VIP and a load balancer such as HAProxy
o  Stateful services are managed in such a way that services are redundant, and that all instances have an identical state
o  Updates to one instance of a database propagate to all other instances
o  After a failover the system will function in a degraded state

HA for OpenStack
•  OpenStack APIs (nova, cinder, etc.)
•  RabbitMQ
•  MySQL
•  Cinder, Swift, and so on
•  Heat (still a work in progress)
•  Applications running on OpenStack (application dependent)

HA on OpenStack
•  Overall Philosophy (Don’t reinvent the wheel)
   –  Leverage time-tested Linux utilities such as Keepalived, HAProxy and Virtual IP (using VRRP)
   –  Leverage Hardware Load Balancers
   –  Leverage replication services for RabbitMQ/MySQL such as RabbitMQ Clustering, MySQL master-master replication, Corosync, Pacemaker, DRBD, Galera and so on

Keepalived
•  Based on the Linux Virtual Server (IPVS) kernel module providing layer 4 load balancing
•  Implements a set of checkers to maintain health and load balancing
•  HA is implemented using the VRRP protocol

    vrrp_script rabbitmq {
        script "/usr/sbin/service rabbitmq-server status"  # check the service status
        interval 5   # check every 5 seconds
        weight -2    # lower the VRRP priority by 2 while the check is failing
        rise 2       # required number of successes for OK switch
        fall 2       # required number of failures for KO switch
    }
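
The interplay of rise, fall and a negative weight can be sketched as a small state machine. This is an illustration of the idea under stated assumptions (rise 2, fall 2, weight -2, a hypothetical base priority of 100, and a check that starts out healthy), not Keepalived's actual implementation:

```python
class ScriptState:
    """Track consecutive check results and derive an effective VRRP priority."""

    def __init__(self, rise=2, fall=2, weight=-2, base_priority=100):
        self.rise = rise                    # successes needed to switch to OK
        self.fall = fall                    # failures needed to switch to KO
        self.weight = weight                # negative: subtracted while KO
        self.base_priority = base_priority  # hypothetical configured priority
        self.healthy = True                 # assumption: start in the OK state
        self._streak = 0                    # consecutive results against the current state

    @property
    def priority(self):
        """Base priority while OK; base plus the (negative) weight while KO."""
        if self.healthy:
            return self.base_priority
        return self.base_priority + self.weight

    def record(self, ok):
        """Feed one check result and return the resulting priority."""
        if ok == self.healthy:
            self._streak = 0  # result agrees with the current state
        else:
            self._streak += 1
            needed = self.rise if ok else self.fall
            if self._streak >= needed:
                self.healthy = ok
                self._streak = 0
        return self.priority


if __name__ == "__main__":
    state = ScriptState()
    for result in (False, False, True, True):
        print(result, state.record(result))
```

Requiring several consecutive results before switching state keeps a single flaky check from moving the virtual IP back and forth.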

HAProxy
•  Load Balancing and Proxying for HTTP and TCP Applications
•  Works over multiple connections
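
To make the load-balancing idea concrete, here is a minimal sketch of round-robin selection that skips backends marked unhealthy. It shows the general technique only, not how HAProxy is implemented; the class name and the backend names controller1 and controller2 are hypothetical:

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends in rotation, skipping any that are marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        """Return the next healthy backend, checking each at most once."""
        for _ in range(len(self.backends)):
            backend = next(self._ring)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backends")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["controller1", "controller2"])
    print([lb.pick() for _ in range(3)])
    lb.mark_down("controller1")
    print([lb.pick() for _ in range(2)])
```

With every backend healthy, requests alternate between the controllers. Once one is marked down, all traffic goes to the survivor until it is marked up again.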

HA with Keepalived, VRRP & HAProxy
[Diagram labels: Application; VRRP (Network Layer); HAProxy (Application Layer); Keepalived; Host1; Host2 (Backup); Realserver1; Realserver2.]

HA on Rackspace Private Cloud
[Diagram: traffic from the Internet reaches the VIP (Keepalived, VRRP) and HAProxy. Controller 1 runs the Active-Passive infrastructure services (MySQL, Rabbit) and the Active-Active infrastructure services (API services). Controller 2 holds the redundant Active-Passive and Active-Active infrastructure services. A heartbeat links the controllers. VMs are instantiated on Compute Node 1 through Compute Node N.]

HA on Rackspace Private Cloud (switchover)
[Diagram: the same topology after a switchover. Labels: Internet; VIP (HAProxy); Controller 2; Controller 1; Active-Passive infrastructure services (MySQL, Rabbit); Heartbeat; VMs instantiated on Compute Node 1 through Compute Node N.]

RabbitMQ HA Options
•  Health Check without Clustering
•  Clustering without Health Check
•  Health Check and Clustering

RabbitMQ HA

[Diagram: Controller 1 (IP address 192.168.236.11, Master/Active) and Controller 2 (IP address 192.168.236.12, Backup/Passive) share VRID 13 and the address 192.168.236.199 over Ethernet. The RabbitMQ instances on the two controllers are joined by RabbitMQ Clustering.]

MYSQL HA: MASTER/MASTER REPLICATION

MySQL – Master/Master Replication
[Diagram: Controller 1 (IP address 192.168.236.11, Master/Active) and Controller 2 (IP address 192.168.236.12, Backup/Passive) share VRID 12 and the address 192.168.236.198 over Ethernet. The MySQL instances on the two controllers use Master/Master replication.]

MySQL – Master/Master Replication simplified
[Diagram]

MYSQL HA: COROSYNC, PACEMAKER AND DRBD

Pacemaker, Corosync and DRBD

Image from: http://dev.mysql.com/doc/refman/5.0/en/ha-drbd.html

Pacemaker, Corosync, DRBD

Pacemaker
•  High availability and load balancing stack for the Linux platform
•  Interacts with applications through Resource Agents (RA)

Corosync
•  Totem single-ring ordering and membership protocol
•  Provides UDP and InfiniBand based messaging, quorum, and cluster membership to Pacemaker

DRBD
•  Synchronizes data at the block device
•  Uses a journaling file system (such as ext3 or ext4)

DRBD
[Diagram: two nodes with mirrored I/O stacks. Labels on each node: Service, File System, Buffer Cache, DRBD, Raw Device, TCP/IP, Disk Sched, Disk Driver, NIC Driver, Disk, NIC. The DRBD layers on the two nodes are connected through the TCP/IP and NIC path.]

Galera
•  Synchronous multi-master cluster technology for MySQL/InnoDB
•  MySQL patched for wsrep (Write Set REPlication)
•  Active/active multi-master topology
•  Read and write to any cluster node
•  True parallel replication, at row level
•  No slave lag or integrity issues

[Diagram: clients make transparent connections to three DBMS nodes, each exposing the wsrep API on top of Galera Replication.]

Multi-master replication
•  Based on Optimistic Concurrency Control
•  If two transactions modify the same row on different nodes, one of the transactions will abort
•  The victim transaction gets a deadlock error
•  The application needs to handle this error
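
Handling the deadlock error usually means retrying the whole transaction. A minimal sketch of such a retry wrapper follows. `DeadlockError` is a hypothetical stand-in for whatever exception the database driver raises (MySQL reports deadlocks as error 1213, ER_LOCK_DEADLOCK), and `make_flaky` only simulates a conflicting transaction:

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the driver's deadlock exception."""


def run_with_retry(transaction, attempts=3, backoff_seconds=0.0):
    """Run a transaction callable, retrying it when it is chosen as the victim."""
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except DeadlockError:
            if attempt == attempts:
                raise  # give up and let the caller see the error
            time.sleep(backoff_seconds * attempt)  # simple linear backoff


def make_flaky(failures):
    """Build a demo transaction that hits `failures` deadlocks, then commits."""
    state = {"left": failures}

    def transaction():
        if state["left"] > 0:
            state["left"] -= 1
            raise DeadlockError("conflicting write on another node")
        return "committed"

    return transaction


if __name__ == "__main__":
    print(run_with_retry(make_flaky(2)))
```

The retry must re-run the entire transaction, including its reads, because the aborted attempt may have been based on rows that the winning transaction has since changed.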

Multi-master Replication
[Diagram: clients read and write to each of the MySQL nodes.]
Multi-master cluster looks like one big database with multiple entry points

Multi-master conflicts
[Diagram sequence: (1) two clients write to different MySQL nodes joined by Galera Replication; (2) a conflict is detected; (3) one write returns OK and the other receives a deadlock error.]

OpenStack and Galera

Image from http://www.severalnines.com/blog/clustering-mysql-backend-openstack

Galera on Rackspace Private Cloud/OpenStack
A How To: OFFICIALLY UNSUPPORTED
1.  Install Rackspace Private Cloud on 2 controllers in HA mode (HAProxy, Keepalived and VRRP are already installed)
2.  Install Galera (with wsrep) on 3 separate nodes
3.  Run mysqldump on the controller nodes and load the dump into a Galera node
4.  Grant privileges to the OpenStack (nova, glance, etc.) and haproxy users
5.  Update the Keepalived, HAProxy and OpenStack configuration files on the controller/compute nodes
6.  Stop/uninstall the MySQL services on the controller nodes and restart the controller nodes

HA methods

Vendor       Clustering/Replication Technique                     Characteristics
Rackspace    Keepalived, HAProxy, VRRP, native clustering         Automatic install on 2 controller nodes via Chef recipes
Red Hat      Pacemaker, Corosync, DRBD                            Manual installation. Fewer components to install
Cisco        Keepalived, HAProxy, Galera for MySQL                Manual install, at least 3 controller nodes
HP           Microsoft Windows based installation with Hyper-V    MS SQL Server and other Windows-based methods

HA methods

Infrastructure    Clustering/Replication Technique    Characteristics
OpenStack APIs    None required (Stateless)           HA also serves as scale out using HAProxy
RabbitMQ          RabbitMQ Clustering                 RabbitMQ Clustering is set up for single/multiple nodes
Heat              TBD                                 Application dependent (no standard methods yet)
MySQL             Many                                Discussed on a later slide

HA methods for MySQL

Clustering Method: Pacemaker/Corosync/DRBD
Replication Technique: Mirroring on block devices
Characteristics:
•  Well tested, more complex to set up
•  Split brain possibility

Clustering Method: Keepalived/HAProxy/VRRP
Replication Technique: Works on MySQL master-master replication
Characteristics:
•  Simple to implement and understand
•  Works for any storage system
•  Master-master replication does not work beyond 2 nodes

Clustering Method: Galera
Replication Technique: Based on write-set replication (wsrep)
Characteristics:
•  No slave lag
•  Needs at least 3 nodes
•  Deadlock errors on hotspot rows
•  Relatively new

Clustering Method: Others
Replication Technique: MySQL Cluster, RHCS with DAS/SAN storage
Characteristics:
•  Some relatively new (GTID)
•  Some well tested
•  More complex setup
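
The "at least 3 nodes" requirement and the split-brain risk both come down to majority quorum: after a network partition, only a side that still sees more than half of the cluster may keep accepting writes. A minimal sketch assuming equally weighted nodes (Galera's real quorum calculation is weighted and more involved):

```python
def has_quorum(partition_size: int, cluster_size: int) -> bool:
    """True if a partition holds a strict majority of the cluster's nodes."""
    return 2 * partition_size > cluster_size


if __name__ == "__main__":
    for cluster_size in (2, 3):
        for partition_size in range(1, cluster_size + 1):
            verdict = "quorum" if has_quorum(partition_size, cluster_size) else "no quorum"
            print(f"{partition_size} of {cluster_size} nodes: {verdict}")
```

With 2 nodes, a split leaves one node on each side and neither has a majority. With 3 nodes, any single failure still leaves a 2-node majority that can continue.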

Resources
•  OpenStack HA guide
   –  http://docs.openstack.org/high-availability-guide/content/ch-intro.html
   –  https://wiki.ubuntu.com/ServerTeam/OpenStackHA
•  Other Resources
   –  http://www.rackspace.com/blog/implementing-high-availability-ha-for-rackspace-private-cloud/
   –  http://www.rackspace.com/blog/high-availability-ha-with-galera-for-rackspace-private-cloud/
   –  https://www.hastexo.com/
   –  http://www.mysql.com/why-mysql/white-papers/mysql-high-availability-drbd-configuration-deployment-guide/
   –  http://docwiki.cisco.com/wiki/OpenStack_Havana_Release:_High-Availability_Manual_Deployment_Guide
   –  http://www.drbd.org/
   –  http://www.codership.com/
   –  http://www.severalnines.com/blog/clustering-mysql-backend-openstack
   –  https://wiki.openstack.org/wiki/BasicDesignTenets
   –  http://db.cs.berkeley.edu/papers/hpts85-nothing.pdf

Summary
•  In general, leverage existing methods of HA.
•  There are several time-tested and more recent methods for implementing MySQL HA.
•  Rackspace Private Cloud provides Chef cookbooks and recipes for implementing HA via Keepalived, HAProxy and VRRP.
•  Galera is gaining popularity. Since it’s Active/Active, it both scales out and provides HA.
•  It takes only a few steps to get from Rackspace Private Cloud to MySQL with Galera (officially unsupported).
•  Corosync/Pacemaker/DRBD is recommended by Oracle/MySQL.
•  The OpenStack HA guide goes through all these options in more detail.

Thank you!
Raghavan “Rags” Srinivas
Solutions Architect
Rackspace
