High Availability and Fault Tolerance of OpenStack
High Availability and Fault Tolerance (OpenStack)
Deepak Mane, Cloud Architect
Objective & Motivation
• To build a fault-tolerant, highly available architecture (OpenStack)
• Motivation
 – To build a fault-tolerant architecture for OpenStack
 – To build a cluster architecture for the MySQL and RabbitMQ components
 – To build a highly available architecture for the network
 – To build a predictive and reactive model for detecting failures of Nova, Swift, and Compute
Use cases
• Master-master cluster architecture for MySQL
• Disk-level replication for MySQL using DRBD for Glance, Swift, and Cinder
• Session-level replication for RabbitMQ
• High availability for networking
• High availability for Horizon (OpenStack dashboard)
• Predictive model for detecting failure of all components
• Reactive model for recovery of all components
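The slides do not say how the reactive model works, so the following is only a minimal sketch of one common approach: run a health check per component, count consecutive failures, and trigger a recovery action once a threshold is reached. The names `reactive_pass`, `checks`, `recover`, and `threshold` are hypothetical and do not come from OpenStack or from the presentation.

```python
from typing import Callable, Dict, List


def reactive_pass(
    checks: Dict[str, Callable[[], bool]],
    recover: Callable[[str], None],
    failures: Dict[str, int],
    threshold: int = 3,
) -> List[str]:
    """Run one monitoring pass and return the components that were recovered.

    checks    -- hypothetical health probes, one per component name
    recover   -- hypothetical recovery action (for example, restart a service)
    failures  -- consecutive-failure counters, kept by the caller between passes
    threshold -- consecutive failures required before recovery is triggered
    """
    recovered: List[str] = []
    for name, check in checks.items():
        try:
            healthy = check()
        except Exception:
            # A probe that raises is treated the same as a failed probe.
            healthy = False

        if healthy:
            failures[name] = 0
            continue

        failures[name] = failures.get(name, 0) + 1
        if failures[name] >= threshold:
            recover(name)
            failures[name] = 0
            recovered.append(name)
    return recovered
```

In practice a caller would invoke `reactive_pass` on a timer and keep the `failures` dictionary between passes. Requiring several consecutive failures is a design choice that avoids restarting a component because of a single transient probe error.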
Non-use cases
• Scenarios not suitable for cloud
 – Redundancy of network components, such as switches and routers
 – Redundancy of applications and automatic service migration
 – Redundancy of storage components
 – Redundancy of facility services such as power, air conditioning, fire protection, and others
Pacemaker – High availability for OpenStack
• The Pacemaker cluster stack is the state-of-the-art high-availability and load-balancing stack for the Linux platform
• Pacemaker is storage- and application-agnostic, and is in no way specific to OpenStack
• Pacemaker relies on the Corosync messaging layer for reliable cluster communications
• Corosync implements the Totem single-ring ordering and membership protocol and provides UDP- and InfiniBand-based messaging, quorum, and cluster membership to Pacemaker
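To make the quorum idea concrete, here is a small sketch of the majority rule that a vote-based quorum service applies: a partition may keep running resources only if it holds more than half of the expected votes. This illustrates the arithmetic only, not Corosync's implementation, and the function names are hypothetical.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Return the smallest vote count that is a strict majority."""
    if expected_votes < 1:
        raise ValueError("expected_votes must be at least 1")
    return expected_votes // 2 + 1


def has_quorum(expected_votes: int, live_votes: int) -> bool:
    """Return True if the live members hold a strict majority of votes."""
    return live_votes >= quorum_threshold(expected_votes)
```

For example, a three-node cluster with one vote per node keeps quorum after losing one node, while a two-node cluster loses quorum as soon as either node fails. This is why two-node clusters usually need special handling.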