SlideShare a Scribd company logo
OpenStack High Availability
Jakub Pavlik
About me
Jakub Pavlík
• Cloud Platform Engineer
• 3 years in Cloud
• 2 years in OpenStack
High Availability vs. Disaster Recovery
High Availability = fault detection & correction procedures to maximize
availability of critical services and applications, often in an automated
fashion.
Disaster Recovery = process of preparing for recovery or continuation of
technology infrastructure critical to an organization after a natural or
human-induced disaster.
High Availability ≠ Disaster Recovery!
Four types of HA in an OpenStack Cloud
Physical infrastructure
OpenStack
Control services
VMs
OpenStack Compute
Applications
Compute Controller
Network Controller
Database
Message Queue
Storage
....
Physical nodes
Physical network
Physical storage
Hypervisor
Host OS
….
Service Resiliency
QoS Cost
Transparency
Data Integrity
…..
Virtual Machine
Virtual Network
Virtual Storage
VM Mobility
…
Physical Infrastructure
Controller 1 Controller 2
SAN 1 SAN 2
Passthru 2Passthru 1
Controller 1 Controller 2
SAN 1 SAN 2
Passthru 2Passthru 1
Switch 1 Switch 2
168 cores 3,46GHz ,336 threads
agregation ¼ : 1344 vCPU
2688 GB RAM
28 x 10GE ports
168 cores 2,67GHz ,336 threads
agregation ¼ : 1344 vCPU
1792 GB RAM
28 x 10GE ports
tcp cloud
VPC
Hardware
OpenStack Control services
OpenStack modules – TCP VPC
Stateless services
• There is no dependency between requests
• For example APIs: Nova, Keystone, Glance, Cinder, etc.
Stateful services
• An action typically compromises multiple requests
• For example: MySQL, RabbitMQ, etc.
OpenStack High Availability Concepts
Active/Passive
• Redundant instances of stateless services are load balanced
• For Stateful services a replacement resource can be brought
online
Active/Active
• Redundant instances of stateless services are load balanced
• Stateful services are managed in such a way that services are
redundant, and that all instances have and identical state.
Corosync
• Totem single-ring ordering and membership
protocol
• UDP and InfiniBand based messaging, quorum,
and cluster membership to Pacemaker
Pacemaker
• High availability and load balancing stack for the
Linux platform.
• Interacts with applications through Resource
Agents (RA)
HAProxy
• Load Balancing and Proxying for HTTP and TCP
Applications
• Works over multiple connections
• Used to load balance API services
Corosync, Pacemaker and HAProxy
• MySQL patched for wsrep
(Write Set REPlication)
• Active/active multi-master
topology
• Read and write to any cluster
node
• True parallel replication, in row
level
• No slave lag or integrity issues
MySQL Galera
Synchronous multi-master cluster technology for MySQL/InnoDB
Sample OpenStack HA architecture
Stateful
• Cinder Volume
• Neutron L3, DHCP agents
• Ceilometer central agent
• RabbitMQ
Stateless
• Neutron Server
• OpenStack APIs
• Apache web server
• Nova Scheduler
• Cinder Scheduler
Neutron agents
(Active)
Neutron agents
(Hot Standby)
VMs – Compute nodes
Storage
• Shared storage filesystem – file disks (qcow2, vmdk, vhv)
• Block storage
Network
• Vanilla Neutron L3 agent (OpenVSwitch, Linux Bridge)
• Vendor plugins - SDN controller
VMs HA – two layers
No vSphere Style HA with KVM
Shared Storage
• Live migration – just RAM memory
• Hypervisor Evacuation – The instance will be booted from
same disk and data will be preserved
• CEPH, Gluster, NFS, Samba, GFS
Non-Shared Storage
• Block Live Migration – disk and RAM
• Hypervisor Evacuation – the instance will be booted from a
new disk, but will preserve the configuration, e.g. id, name,
uuid
• Standard filesystem EXT4, etc.
Non-Shared/Shared Storage filesystem
• Instance boots from volume
• iSCSI/FC direct mapping to instance
• Enable Live Migration
• Cinder Backends
• LVM Driver
• Default linux iSCSI server
• Vendor software plugins
• Gluster, CEPH, VMware VMDK driver
• Vendor storage plugins
• EMC VNX, IBM Storwize, Solid Fire, etc.
Block Storage - Cinder
Problems
• Routing on Linux server (max. bandwith approximately 3-4
Gbits)
• Limited distribution between more network nodes
• East-West and North-South communication through network
node
High Availability
• Pacemaker&Corosync
• Keepalived VRRP
• DVR + VRRP – should be in Juno release
Networking - Vanilla Neutron L3 agent
Examples
• Juniper OpenContrail, VMware NSX, SDN PLUMgrid
Advantages against Neutron L3 agent
• North-South communication on network devices (iBGP,
MLPSoverGRE)
• East-West communication directly between compute nodes
• Higher bandwidth (9.7 Gbits per 10Gbits port)
High Availability
• iBGP peering into two routers
• Native HA implemented inside of network devices
Networking – Vendor SDN Controller plugins
OpenStack HA
TCP VPC
MySQL RabbitMQ
Openstack
Controller
GALERA
Zookee
per
Cassandra
Contrail
Database
Contrail Config
with Analytics &
WebUI
Contrail
Control
Zookee
per
Cassandra
Contrail
Database
MySQL RabbitMQ
Openstack
Controller
MySQL RabbitMQ
Openstack
Controller
Zookee
per
Cassandra
Contrail
Database
Contrail
Control
Contrail Config
with Analytics &
WebUI
HAProxy HAProxy HAProxy
VIP
Bond Interface
Pacemaker
Corosync
Contrail Config
with Analytics &
WebUI
Pacemaker
Corosync
TCP Virtual Private Cloud
HA methods - vendors
Vendor Cluster/Replication Technique Characteristics
RackSpace Keepalived, HAProxy, VRRP,
DRBD
Automatic - Chef
Red Hat Pacemaker, Corosync, Galera Manual
installation/Foreman
Cisco Keepalived, HAProxy, Galera Manual installation,
at least 3 controller
tcp cloud Pacemaker, Corosync, HAProxy,
Galera, Contrail
Automatic Salt-Stack
deployment
Mirantis Pacemaker, Corosync, HAProxy
Galera
Automatic - Puppet
Thank you for your attention!

More Related Content

What's hot

Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache Kafka
Jiangjie Qin
 
CloudStack Networking
CloudStack NetworkingCloudStack Networking
Prometheus monitoring
Prometheus monitoringPrometheus monitoring
Prometheus monitoring
Hien Nguyen Van
 
Systems Monitoring with Prometheus (Devops Ireland April 2015)
Systems Monitoring with Prometheus (Devops Ireland April 2015)Systems Monitoring with Prometheus (Devops Ireland April 2015)
Systems Monitoring with Prometheus (Devops Ireland April 2015)
Brian Brazil
 
Envoy and Kafka
Envoy and KafkaEnvoy and Kafka
Envoy and Kafka
Adam Kotwasinski
 
Room 1 - 6 - Trần Quốc Sang - Autoscaling for multi cloud platform based on S...
Room 1 - 6 - Trần Quốc Sang - Autoscaling for multi cloud platform based on S...Room 1 - 6 - Trần Quốc Sang - Autoscaling for multi cloud platform based on S...
Room 1 - 6 - Trần Quốc Sang - Autoscaling for multi cloud platform based on S...
Vietnam Open Infrastructure User Group
 
Velero search & practice 20210609
Velero search & practice 20210609Velero search & practice 20210609
Velero search & practice 20210609
KAI CHU CHUNG
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
confluent
 
Room 3 - 1 - Nguyễn Xuân Trường Lâm - Zero touch on-premise storage infrastru...
Room 3 - 1 - Nguyễn Xuân Trường Lâm - Zero touch on-premise storage infrastru...Room 3 - 1 - Nguyễn Xuân Trường Lâm - Zero touch on-premise storage infrastru...
Room 3 - 1 - Nguyễn Xuân Trường Lâm - Zero touch on-premise storage infrastru...
Vietnam Open Infrastructure User Group
 
Wars of MySQL Cluster ( InnoDB Cluster VS Galera )
Wars of MySQL Cluster ( InnoDB Cluster VS Galera ) Wars of MySQL Cluster ( InnoDB Cluster VS Galera )
Wars of MySQL Cluster ( InnoDB Cluster VS Galera )
Mydbops
 
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Henning Jacobs
 
Redis cluster
Redis clusterRedis cluster
Redis cluster
iammutex
 
Prometheus - basics
Prometheus - basicsPrometheus - basics
Prometheus - basics
Juraj Hantak
 
Kubernetes and Prometheus
Kubernetes and PrometheusKubernetes and Prometheus
Kubernetes and Prometheus
Weaveworks
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
Jiangjie Qin
 
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
StreamNative
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Jean-Paul Azar
 
Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...
Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...
Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...
Vietnam Open Infrastructure User Group
 
Kubernetes Observability with Prometheus by Example
Kubernetes Observability with Prometheus by ExampleKubernetes Observability with Prometheus by Example
Kubernetes Observability with Prometheus by Example
Thomas Riley
 
Docker and Go: why did we decide to write Docker in Go?
Docker and Go: why did we decide to write Docker in Go?Docker and Go: why did we decide to write Docker in Go?
Docker and Go: why did we decide to write Docker in Go?
Jérôme Petazzoni
 

What's hot (20)

Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache Kafka
 
CloudStack Networking
CloudStack NetworkingCloudStack Networking
CloudStack Networking
 
Prometheus monitoring
Prometheus monitoringPrometheus monitoring
Prometheus monitoring
 
Systems Monitoring with Prometheus (Devops Ireland April 2015)
Systems Monitoring with Prometheus (Devops Ireland April 2015)Systems Monitoring with Prometheus (Devops Ireland April 2015)
Systems Monitoring with Prometheus (Devops Ireland April 2015)
 
Envoy and Kafka
Envoy and KafkaEnvoy and Kafka
Envoy and Kafka
 
Room 1 - 6 - Trần Quốc Sang - Autoscaling for multi cloud platform based on S...
Room 1 - 6 - Trần Quốc Sang - Autoscaling for multi cloud platform based on S...Room 1 - 6 - Trần Quốc Sang - Autoscaling for multi cloud platform based on S...
Room 1 - 6 - Trần Quốc Sang - Autoscaling for multi cloud platform based on S...
 
Velero search & practice 20210609
Velero search & practice 20210609Velero search & practice 20210609
Velero search & practice 20210609
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
 
Room 3 - 1 - Nguyễn Xuân Trường Lâm - Zero touch on-premise storage infrastru...
Room 3 - 1 - Nguyễn Xuân Trường Lâm - Zero touch on-premise storage infrastru...Room 3 - 1 - Nguyễn Xuân Trường Lâm - Zero touch on-premise storage infrastru...
Room 3 - 1 - Nguyễn Xuân Trường Lâm - Zero touch on-premise storage infrastru...
 
Wars of MySQL Cluster ( InnoDB Cluster VS Galera )
Wars of MySQL Cluster ( InnoDB Cluster VS Galera ) Wars of MySQL Cluster ( InnoDB Cluster VS Galera )
Wars of MySQL Cluster ( InnoDB Cluster VS Galera )
 
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
 
Redis cluster
Redis clusterRedis cluster
Redis cluster
 
Prometheus - basics
Prometheus - basicsPrometheus - basics
Prometheus - basics
 
Kubernetes and Prometheus
Kubernetes and PrometheusKubernetes and Prometheus
Kubernetes and Prometheus
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
 
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
 
Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...
Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...
Room 2 - 6 - Đinh Tuấn Phong - Migrate opensource database to Kubernetes easi...
 
Kubernetes Observability with Prometheus by Example
Kubernetes Observability with Prometheus by ExampleKubernetes Observability with Prometheus by Example
Kubernetes Observability with Prometheus by Example
 
Docker and Go: why did we decide to write Docker in Go?
Docker and Go: why did we decide to write Docker in Go?Docker and Go: why did we decide to write Docker in Go?
Docker and Go: why did we decide to write Docker in Go?
 

Similar to OpenStack High Availability

Open stack ha design & deployment kilo
Open stack ha design & deployment   kiloOpen stack ha design & deployment   kilo
Open stack ha design & deployment kilo
Steven Li
 
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
HostedbyConfluent
 
Sharing High-Performance Interconnects Across Multiple Virtual Machines
Sharing High-Performance Interconnects Across Multiple Virtual MachinesSharing High-Performance Interconnects Across Multiple Virtual Machines
Sharing High-Performance Interconnects Across Multiple Virtual Machines
inside-BigData.com
 
HPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyHPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journey
Peter Clapham
 
OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...
OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...
OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...
OpenNebula Project
 
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Belmiro Moreira
 
Txlf2012
Txlf2012Txlf2012
Txlf2012
Joe Brockmeier
 
Hacking apache cloud stack
Hacking apache cloud stackHacking apache cloud stack
Hacking apache cloud stack
Nitin Mehta
 
Flexible compute
Flexible computeFlexible compute
Flexible compute
Peter Clapham
 
Sanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansSanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticians
Peter Clapham
 
[발표자료] 오픈소스 Pacemaker 활용한 zabbix 이중화 방안(w/ Zabbix Korea Community)
[발표자료] 오픈소스 Pacemaker 활용한 zabbix 이중화 방안(w/ Zabbix Korea Community) [발표자료] 오픈소스 Pacemaker 활용한 zabbix 이중화 방안(w/ Zabbix Korea Community)
[발표자료] 오픈소스 Pacemaker 활용한 zabbix 이중화 방안(w/ Zabbix Korea Community)
동현 김
 
Openstack HA
Openstack HAOpenstack HA
Openstack HA
Yong Luo
 
Next Generation Security Solution
Next Generation Security SolutionNext Generation Security Solution
Next Generation Security Solution
MarketingArrowECS_CZ
 
Cloud stack overview
Cloud stack overviewCloud stack overview
Cloud stack overview
howie YU
 
Climb Technical Overview
Climb Technical OverviewClimb Technical Overview
Climb Technical Overview
Arif Ali
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
emreakis
 
pps Matters
pps Matterspps Matters
QoS, QoS Baby
QoS, QoS BabyQoS, QoS Baby
Secure Your Containers: What Network Admins Should Know When Moving Into Prod...
Secure Your Containers: What Network Admins Should Know When Moving Into Prod...Secure Your Containers: What Network Admins Should Know When Moving Into Prod...
Secure Your Containers: What Network Admins Should Know When Moving Into Prod...
Cynthia Thomas
 
Cloud orchestration major tools comparision
Cloud orchestration major tools comparisionCloud orchestration major tools comparision
Cloud orchestration major tools comparision
Ravi Kiran
 

Similar to OpenStack High Availability (20)

Open stack ha design & deployment kilo
Open stack ha design & deployment   kiloOpen stack ha design & deployment   kilo
Open stack ha design & deployment kilo
 
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
 
Sharing High-Performance Interconnects Across Multiple Virtual Machines
Sharing High-Performance Interconnects Across Multiple Virtual MachinesSharing High-Performance Interconnects Across Multiple Virtual Machines
Sharing High-Performance Interconnects Across Multiple Virtual Machines
 
HPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyHPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journey
 
OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...
OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...
OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...
 
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
 
Txlf2012
Txlf2012Txlf2012
Txlf2012
 
Hacking apache cloud stack
Hacking apache cloud stackHacking apache cloud stack
Hacking apache cloud stack
 
Flexible compute
Flexible computeFlexible compute
Flexible compute
 
Sanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansSanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticians
 
[발표자료] 오픈소스 Pacemaker 활용한 zabbix 이중화 방안(w/ Zabbix Korea Community)
[발표자료] 오픈소스 Pacemaker 활용한 zabbix 이중화 방안(w/ Zabbix Korea Community) [발표자료] 오픈소스 Pacemaker 활용한 zabbix 이중화 방안(w/ Zabbix Korea Community)
[발표자료] 오픈소스 Pacemaker 활용한 zabbix 이중화 방안(w/ Zabbix Korea Community)
 
Openstack HA
Openstack HAOpenstack HA
Openstack HA
 
Next Generation Security Solution
Next Generation Security SolutionNext Generation Security Solution
Next Generation Security Solution
 
Cloud stack overview
Cloud stack overviewCloud stack overview
Cloud stack overview
 
Climb Technical Overview
Climb Technical OverviewClimb Technical Overview
Climb Technical Overview
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
pps Matters
pps Matterspps Matters
pps Matters
 
QoS, QoS Baby
QoS, QoS BabyQoS, QoS Baby
QoS, QoS Baby
 
Secure Your Containers: What Network Admins Should Know When Moving Into Prod...
Secure Your Containers: What Network Admins Should Know When Moving Into Prod...Secure Your Containers: What Network Admins Should Know When Moving Into Prod...
Secure Your Containers: What Network Admins Should Know When Moving Into Prod...
 
Cloud orchestration major tools comparision
Cloud orchestration major tools comparisionCloud orchestration major tools comparision
Cloud orchestration major tools comparision
 

More from Jakub Pavlik

Mirantis - Continuous Deployment of Infrastructure, Platform, and Application...
Mirantis - Continuous Deployment of Infrastructure, Platform, and Application...Mirantis - Continuous Deployment of Infrastructure, Platform, and Application...
Mirantis - Continuous Deployment of Infrastructure, Platform, and Application...
Jakub Pavlik
 
OpenStack Journey in Tieto Elastic Cloud
OpenStack Journey in Tieto Elastic CloudOpenStack Journey in Tieto Elastic Cloud
OpenStack Journey in Tieto Elastic Cloud
Jakub Pavlik
 
Evolve or Die: Enterprise Ready OpenStack upgrades with Kubernetes
Evolve or Die: Enterprise Ready OpenStack upgrades with KubernetesEvolve or Die: Enterprise Ready OpenStack upgrades with Kubernetes
Evolve or Die: Enterprise Ready OpenStack upgrades with Kubernetes
Jakub Pavlik
 
Kubernetes SDN performance and architecture
Kubernetes SDN performance and architectureKubernetes SDN performance and architecture
Kubernetes SDN performance and architecture
Jakub Pavlik
 
Operators experience and perspective on SDN with VLANs and L3 Networks
Operators experience and perspective on SDN with VLANs and L3 NetworksOperators experience and perspective on SDN with VLANs and L3 Networks
Operators experience and perspective on SDN with VLANs and L3 Networks
Jakub Pavlik
 
SmartCity IoT on Kubernetes and OpenStack
SmartCity IoT on Kubernetes and OpenStackSmartCity IoT on Kubernetes and OpenStack
SmartCity IoT on Kubernetes and OpenStack
Jakub Pavlik
 
OpenContrail Experience tcp cloud OpenStack Summit Tokyo
OpenContrail Experience tcp cloud OpenStack Summit TokyoOpenContrail Experience tcp cloud OpenStack Summit Tokyo
OpenContrail Experience tcp cloud OpenStack Summit Tokyo
Jakub Pavlik
 
OpenStack Ousts vCenter for DevOps and Unites IT Silos at AVG Technologies
OpenStack Ousts vCenter for DevOps and Unites IT Silos at AVG Technologies OpenStack Ousts vCenter for DevOps and Unites IT Silos at AVG Technologies
OpenStack Ousts vCenter for DevOps and Unites IT Silos at AVG Technologies
Jakub Pavlik
 
OpenContrail Implementations
OpenContrail ImplementationsOpenContrail Implementations
OpenContrail Implementations
Jakub Pavlik
 
OpenContrail deployment experience
OpenContrail deployment experienceOpenContrail deployment experience
OpenContrail deployment experience
Jakub Pavlik
 

More from Jakub Pavlik (10)

Mirantis - Continuous Deployment of Infrastructure, Platform, and Application...
Mirantis - Continuous Deployment of Infrastructure, Platform, and Application...Mirantis - Continuous Deployment of Infrastructure, Platform, and Application...
Mirantis - Continuous Deployment of Infrastructure, Platform, and Application...
 
OpenStack Journey in Tieto Elastic Cloud
OpenStack Journey in Tieto Elastic CloudOpenStack Journey in Tieto Elastic Cloud
OpenStack Journey in Tieto Elastic Cloud
 
Evolve or Die: Enterprise Ready OpenStack upgrades with Kubernetes
Evolve or Die: Enterprise Ready OpenStack upgrades with KubernetesEvolve or Die: Enterprise Ready OpenStack upgrades with Kubernetes
Evolve or Die: Enterprise Ready OpenStack upgrades with Kubernetes
 
Kubernetes SDN performance and architecture
Kubernetes SDN performance and architectureKubernetes SDN performance and architecture
Kubernetes SDN performance and architecture
 
Operators experience and perspective on SDN with VLANs and L3 Networks
Operators experience and perspective on SDN with VLANs and L3 NetworksOperators experience and perspective on SDN with VLANs and L3 Networks
Operators experience and perspective on SDN with VLANs and L3 Networks
 
SmartCity IoT on Kubernetes and OpenStack
SmartCity IoT on Kubernetes and OpenStackSmartCity IoT on Kubernetes and OpenStack
SmartCity IoT on Kubernetes and OpenStack
 
OpenContrail Experience tcp cloud OpenStack Summit Tokyo
OpenContrail Experience tcp cloud OpenStack Summit TokyoOpenContrail Experience tcp cloud OpenStack Summit Tokyo
OpenContrail Experience tcp cloud OpenStack Summit Tokyo
 
OpenStack Ousts vCenter for DevOps and Unites IT Silos at AVG Technologies
OpenStack Ousts vCenter for DevOps and Unites IT Silos at AVG Technologies OpenStack Ousts vCenter for DevOps and Unites IT Silos at AVG Technologies
OpenStack Ousts vCenter for DevOps and Unites IT Silos at AVG Technologies
 
OpenContrail Implementations
OpenContrail ImplementationsOpenContrail Implementations
OpenContrail Implementations
 
OpenContrail deployment experience
OpenContrail deployment experienceOpenContrail deployment experience
OpenContrail deployment experience
 

Recently uploaded

Things to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUUThings to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUU
FODUU
 
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdfAI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
Techgropse Pvt.Ltd.
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
CAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on BlockchainCAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on Blockchain
Claudio Di Ciccio
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 

Recently uploaded (20)

Things to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUUThings to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUU
 
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdfAI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
CAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on BlockchainCAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on Blockchain
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 

OpenStack High Availability

  • 2. About me Jakub Pavlík • Cloud Platform Engineer • 3 years in Cloud • 2 years in OpenStack
  • 3. High Availability vs. Disaster Recovery High Availability = fault detection & correction procedures to maximize availability of critical services and applications, often in an automated fashion. Disaster Recovery = process of preparing for recovery or continuation of technology infrastructure critical to an organization after a natural or human-induced disaster. High Availability ≠ Disaster Recovery!
  • 4. Four types of HA in an OpenStack Cloud Physical infrastructure OpenStack Control services VMs OpenStack Compute Applications Compute Controller Network Controller Database Message Queue Storage .... Physical nodes Physical network Physical storage Hypervisor Host OS …. Service Resiliency QoS Cost Transparency Data Integrity ….. Virtual Machine Virtual Network Virtual Storage VM Mobility …
  • 6. Controller 1 Controller 2 SAN 1 SAN 2 Passthru 2Passthru 1 Controller 1 Controller 2 SAN 1 SAN 2 Passthru 2Passthru 1 Switch 1 Switch 2 168 cores 3,46GHz ,336 threads agregation ¼ : 1344 vCPU 2688 GB RAM 28 x 10GE ports 168 cores 2,67GHz ,336 threads agregation ¼ : 1344 vCPU 1792 GB RAM 28 x 10GE ports tcp cloud VPC Hardware
  • 9. Stateless services • There is no dependency between requests • For example APIs: Nova, Keystone, Glance, Cinder, etc. Stateful services • An action typically compromises multiple requests • For example: MySQL, RabbitMQ, etc. OpenStack High Availability Concepts Active/Passive • Redundant instances of stateless services are load balanced • For Stateful services a replacement resource can be brought online Active/Active • Redundant instances of stateless services are load balanced • Stateful services are managed in such a way that services are redundant, and that all instances have and identical state.
  • 10. Corosync • Totem single-ring ordering and membership protocol • UDP and InfiniBand based messaging, quorum, and cluster membership to Pacemaker Pacemaker • High availability and load balancing stack for the Linux platform. • Interacts with applications through Resource Agents (RA) HAProxy • Load Balancing and Proxying for HTTP and TCP Applications • Works over multiple connections • Used to load balance API services Corosync, Pacemaker and HAProxy
  • 11. • MySQL patched for wsrep (Write Set REPlication) • Active/active multi-master topology • Read and write to any cluster node • True parallel replication, in row level • No slave lag or integrity issues MySQL Galera Synchronous multi-master cluster technology for MySQL/InnoDB
  • 12. Sample OpenStack HA architecture Stateful • Cinder Volume • Neutron L3, DHCP agents • Ceilometer central agent • RabbitMQ Stateless • Neutron Server • OpenStack APIs • Apache web server • Nova Scheduler • Cinder Scheduler Neutron agents (Active) Neutron agents (Hot Standby)
  • 14. Storage • Shared storage filesystem – file disks (qcow2, vmdk, vhv) • Block storage Network • Vanilla Neutron L3 agent (OpenVSwitch, Linux Bridge) • Vendor plugins - SDN controller VMs HA – two layers
  • 15. No vSphere Style HA with KVM
  • 16. Shared Storage • Live migration – just RAM memory • Hypervisor Evacuation – The instance will be booted from same disk and data will be preserved • CEPH, Gluster, NFS, Samba, GFS Non-Shared Storage • Block Live Migration – disk and RAM • Hypervisor Evacuation – the instance will be booted from a new disk, but will preserve the configuration, e.g. id, name, uuid • Standard filesystem EXT4, etc. Non-Shared/Shared Storage filesystem
  • 17. • Instance boots from volume • iSCSI/FC direct mapping to instance • Enable Live Migration • Cinder Backends • LVM Driver • Default linux iSCSI server • Vendor software plugins • Gluster, CEPH, VMware VMDK driver • Vendor storage plugins • EMC VNX, IBM Storwize, Solid Fire, etc. Block Storage - Cinder
  • 18. Problems • Routing on Linux server (max. bandwith approximately 3-4 Gbits) • Limited distribution between more network nodes • East-West and North-South communication through network node High Availability • Pacemaker&Corosync • Keepalived VRRP • DVR + VRRP – should be in Juno release Networking - Vanilla Neutron L3 agent
  • 19. Examples • Juniper OpenContrail, VMware NSX, SDN PLUMgrid Advantages against Neutron L3 agent • North-South communication on network devices (iBGP, MLPSoverGRE) • East-West communication directly between compute nodes • Higher bandwidth (9.7 Gbits per 10Gbits port) High Availability • iBGP peering into two routers • Native HA implemented inside of network devices Networking – Vendor SDN Controller plugins
  • 20. OpenStack HA TCP VPC MySQL RabbitMQ Openstack Controller GALERA Zookee per Cassandra Contrail Database Contrail Config with Analytics & WebUI Contrail Control Zookee per Cassandra Contrail Database MySQL RabbitMQ Openstack Controller MySQL RabbitMQ Openstack Controller Zookee per Cassandra Contrail Database Contrail Control Contrail Config with Analytics & WebUI HAProxy HAProxy HAProxy VIP Bond Interface Pacemaker Corosync Contrail Config with Analytics & WebUI Pacemaker Corosync
  • 22. HA methods - vendors Vendor Cluster/Replication Technique Characteristics RackSpace Keepalived, HAProxy, VRRP, DRBD Automatic - Chef Red Hat Pacemaker, Corosync, Galera Manual installation/Foreman Cisco Keepalived, HAProxy, Galera Manual installation, at least 3 controller tcp cloud Pacemaker, Corosync, HAProxy, Galera, Contrail Automatic Salt-Stack deployment Mirantis Pacemaker, Corosync, HAProxy Galera Automatic - Puppet
  • 23. Thank you for your attention!