High availability (HA) is an important and frequently discussed topic for clouds at the infrastructure level. There are several concepts for building an HA-ready OpenStack, and software-defined storage such as Ceph is itself highly available, with no single point of failure.
But what about HA if you bring OpenStack and Ceph together? How do they work together and what are the impacts on the availability of your OpenStack cloud infrastructure from the tenant or application point of view?
How does the design of your classic highly available data center, e.g. with two fire compartments, power backup, and redundant power and network lines, impact your cluster setup? There are many different potential failure scenarios. What does this mean for building and managing failure zones, especially with technologies like Ceph, which must be able to form a quorum to keep running?
This talk will cover:
- Failure scenarios and their impact on OpenStack and Ceph availability
- Which components of the cloud need a quorum
- How to set up the infrastructure to ensure a quorum
- How the different quorum devices work together and whether they guarantee the HA of your cloud
- Pitfalls and solutions
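
To make the quorum question concrete, here is a minimal sketch of the majority rule that quorum-based components such as Ceph monitors depend on: a strict majority of members must stay reachable. The zone names and member counts are hypothetical and chosen only for illustration; they do not describe any particular deployment.

```python
def has_quorum(alive: int, total: int) -> bool:
    """A strict majority of all members must be alive."""
    return alive > total // 2


def survives_zone_loss(layout: dict[str, int]) -> dict[str, bool]:
    """For each failure zone, report whether quorum survives losing that zone."""
    total = sum(layout.values())
    return {zone: has_quorum(total - count, total) for zone, count in layout.items()}


# Hypothetical layouts: number of quorum members placed in each failure zone.
two_rooms = {"room_a": 2, "room_b": 1}
three_zones = {"room_a": 1, "room_b": 1, "tiebreaker": 1}

print(survives_zone_loss(two_rooms))
print(survives_zone_loss(three_zones))
```

With only two fire compartments, one room always holds at least half of the members, so losing that room loses the quorum. Under these assumptions, a third, independent failure zone lets the quorum survive the loss of any single zone.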