1
High Availability
on Linux
Roger Zhou
zzhou@suse.com
openSUSE.Asia
Summit 2015
SUSE
way
© SUSE, All rights reserved.
2
CURIOSITY in the land
of Linux High Availability
3
Agenda
• HA architectural components
• Use case examples
• Future outlook & Demo
4
What is a Cluster?
• HPC (supercomputing)
• Load Balancer (very high capacity)
• High Availability
‒ 99.999% availability ≈ 5 minutes of downtime per year
‒ SPOF (single point of failure)
‒ Murphy's Law
"Everything that can go wrong will go wrong"
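The "five nines" figure follows from simple arithmetic. A minimal sketch, assuming a 365-day year; the function name is made up for illustration:

```python
# Downtime budget implied by an availability target (365-day year assumed).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the downtime allowed per year, in minutes."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_minutes_per_year(target):.2f} min/year")
```

Each extra nine cuts the yearly downtime budget by a factor of ten, which is why 99.999% leaves only about five minutes.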
5
"HA", widely used, often confusing
• VMware vSphere HA
‒ Works at the hypervisor and hardware level. Closed source.
‒ Agnostic to the guest OS inside the VM.
• SUSE HA
‒ Runs inside the Linux OS.
‒ That said, Windows needs its own Windows HA solution.
• Different Industries
‒ We are Enterprise.
‒ Hadoop HA (dot com)
‒ OpenStack HA (IaaS)
‒ OpenSAF (telecom)
6
History of HA in Linux OS
• 1990s: the Heartbeat project. Just two nodes.
• Early 2000s: Heartbeat 2.0 became too complex.
‒ The industry demanded a split:
1) one component for cluster membership
2) one component for resource management
• Today: ClusterLabs.org
‒ Pacemaker + Corosync, a completely different solution in its early days
‒ Later merged with the Heartbeat project.
‒ 2015 HA Summit
7
Typical HA Problem - Split Brain
• Clustering
‒ multiple nodes share the same resources.
• Split partitions run the same service
‒ This breaks data integrity!
• Two key concepts as the solution:
‒ Fencing
The cluster does not tolerate any ambiguous state.
STONITH - "shoot the other node in the head".
‒ Quorum
Quorum means "majority". Without quorum there are no actions:
no resource management, no fencing.
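The quorum rule can be sketched in a few lines. This is an illustration only, not Corosync's implementation; it assumes one vote per node, and the function names are hypothetical:

```python
from typing import List


def has_quorum(visible_nodes: int, total_nodes: int) -> bool:
    """A partition has quorum when it sees a strict majority of all nodes."""
    return visible_nodes > total_nodes // 2


def allowed_actions(visible_nodes: int, total_nodes: int) -> List[str]:
    """Without quorum the partition does nothing at all."""
    if not has_quorum(visible_nodes, total_nodes):
        return []
    return ["fence unreachable nodes", "manage resources"]


# A 5-node cluster splits 3/2: only the 3-node side may act.
print(allowed_actions(3, 5))
print(allowed_actions(2, 5))
```

Note that an even split (for example 1/1 in a two-node cluster) leaves neither side with a strict majority under this rule.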
8
HA Hardware Components
• Multiple networks
‒ A user network for end user access.
‒ A dedicated network for cluster communication/heartbeat.
‒ A dedicated storage network infrastructure.
• Network Bonding
‒ a.k.a. link aggregation
• Fencing/STONITH devices
‒ remote “powerswitch”
• Shared storage
‒ NAS(nfs/cifs), SAN(fc/iscsi)
9
Architectural Software Components
"clusterlabs.org"
‒ Corosync
‒ Pacemaker
‒ Resource Agents
‒ Fencing/STONITH Devices
‒ UI(crmsh and Hawk2)
‒ Booth for GEO Cluster
Outside of "clusterlabs.org"
‒ LVS: Layer 4, ip+port, kernel space.
‒ HAproxy: Layer 7/ HTTP, user space.
‒ Shared filesystem: OCFS2 / GFS2
‒ Block device replication:
DRBD, cLVM mirroring, cluster-md
‒ Shared storage:
SAN (FC / FCoE / iSCSI)
‒ Multipathing
Software Components in Detail
11
Corosync: messaging and membership
• Consensus algorithm
‒ "Totem Single Ring Ordering and Membership protocol"
• Closed Process Group
‒ Analogous to the TCP three-way handshake
‒ Membership handling.
‒ Message ordering.
• A quorum system
‒ notifies apps when quorum is achieved or lost.
• In-memory object database
‒ for Configuration engines and Service engines.
‒ Shared-nothing cluster.
12
Pacemaker: the resource manager
• The brain of the cluster.
• Policy engine for decision making.
‒ To start/stop resources on a node according to the score.
‒ To monitor resources according to interval.
‒ To restart resources if monitor fails.
‒ To fence/STONITH a node if stop operation fails.
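The decisions listed above can be sketched as two small functions. This is a deliberately simplified illustration, not Pacemaker's policy engine: the score table, node names, and function names are hypothetical, and real placement takes many more constraints into account.

```python
from typing import Dict


def place_resource(scores: Dict[str, int]) -> str:
    """Pick the node with the highest score; ties go to the first name alphabetically."""
    return max(sorted(scores), key=lambda node: scores[node])


def recovery_action(failed_operation: str) -> str:
    """Map a failed operation to the reaction listed on the slide."""
    if failed_operation == "monitor":
        return "restart"  # a failed monitor leads to a resource restart
    if failed_operation == "stop":
        return "fence"    # a failed stop leads to fencing the node
    return "none"


print(place_resource({"node1": 100, "node2": 200}))
print(recovery_action("stop"))
```

Escalating a failed stop to fencing reflects the slide's point: if the cluster cannot prove a resource is stopped, it cannot safely start it elsewhere.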
13
Shoot The Other Node In The Head
• Data integrity does not tolerate any ambiguous state.
Before migrating resources to another node, the cluster must
confirm that the suspect node really is down.
• STONITH is mandatory for *enterprise* Linux HA clusters.
14
Popular STONITH devices
• APC PDU
‒ network based powerswitch
• Standard Protocols Integrated with Servers
‒ Intel AMT, HP iLO, Dell DRAC, IBM IMM, IPMI Alliance
• Software libraries
‒ to deal with KVM, Xen and VMware VMs.
• Software based
‒ SBD (STONITH Block Device), which makes a node terminate itself.
The last, implicit option in the fencing topology.
• NOTE: Fencing devices can be chained.
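Chaining can be pictured as trying devices in order until one confirms the node is down. The sketch below is a simplification of a real fencing topology; the device names and the callables standing in for them are hypothetical:

```python
from typing import Callable, List, Tuple

# A device is a (name, function) pair; the function returns True on success.
FenceDevice = Tuple[str, Callable[[str], bool]]


def fence_node(target: str, chain: List[FenceDevice]) -> str:
    """Try each device in order and return the name of the first that succeeds."""
    for name, device in chain:
        if device(target):
            return name
    raise RuntimeError(f"all fencing devices failed for {target}")


# Hypothetical scenario: the PDU is unreachable, IPMI works, SBD is the last resort.
chain: List[FenceDevice] = [
    ("apc-pdu", lambda node: False),
    ("ipmi", lambda node: True),
    ("sbd", lambda node: True),
]
print(fence_node("node2", chain))
```

If every device fails, the sketch raises an error rather than pretending the node is down, since an unconfirmed fence leaves the cluster in exactly the ambiguous state that fencing is meant to rule out.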
15
Resource Agents (RAs)
• Write an RA for your application
• LSB shell scripts:
start / stop / monitor
• More than a hundred contributors upstream on GitHub.
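To show the start / stop / monitor contract, here is a minimal sketch written in Python rather than shell. It is not a production agent: the state file stands in for a real service, and the exit codes follow the OCF resource agent convention (0 for success, 7 for "not running", 3 for an unimplemented action), which goes beyond what the slide states.

```python
import os
import tempfile

OCF_SUCCESS = 0
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7

# Hypothetical state file standing in for a real service.
STATE_FILE = os.path.join(tempfile.gettempdir(), "demo-ra.state")


def monitor() -> int:
    """Report whether the 'service' is running."""
    return OCF_SUCCESS if os.path.exists(STATE_FILE) else OCF_NOT_RUNNING


def start() -> int:
    """Start the 'service', then verify it with monitor."""
    open(STATE_FILE, "w").close()
    return monitor()


def stop() -> int:
    """Stop the 'service'; stopping an already stopped service succeeds."""
    if os.path.exists(STATE_FILE):
        os.remove(STATE_FILE)
    return OCF_SUCCESS


ACTIONS = {"start": start, "stop": stop, "monitor": monitor}


def run(action: str) -> int:
    """Dispatch an action name to its handler."""
    handler = ACTIONS.get(action)
    return handler() if handler else OCF_ERR_UNIMPLEMENTED
```

Making stop idempotent matters because a failed stop is what escalates to fencing.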
16
Cluster Filesystem
• OCFS2 / GFS2
‒ On the shared storage.
‒ Multiple nodes concurrently access the same filesystem.
http://clusterlabs.org/doc/fr/Pacemaker/1.1/html/Clusters_from_Scratch/_pacemaker_architecture.html
17
Cluster Block Device
• DRBD
‒ Network-based RAID1.
‒ high performance data replication over network.
• cLVM2 + cmirrord
‒ Clustered lvm2 mirroring.
‒ Multiple nodes can manipulate volumes on the shared disk.
‒ clvmd distributes LVM metadata updates in the cluster.
‒ Data replication speed is way too slow.
• Cluster md raid1
‒ multiple nodes use the shared disks as md-raid1.
‒ High performance raid1 solution in cluster.
Cluster Examples
19
NFS Server (Highly Available NAS)
Cluster Example in Diagram
[Diagram: two-node cluster, VM1 and VM2, with failover between them]
VM1: DRBD Master, LVM, Filesystem (ext4), NFS exports, Virtual IP
VM2: DRBD Slave, LVM, Filesystem (ext4), NFS exports, Virtual IP
Both nodes: Pacemaker + Corosync on top of the kernel
Between nodes: network RAID1 (DRBD)
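The layered NFS stack in the diagram suggests an ordering for a planned switch-over: stop on the active node from the top down, then start on the peer from the bottom up. The sketch below only illustrates that ordering. The resource names are shorthand for the diagram's layers, and treating the diagram order as the start order is an assumption, not something the slide states.

```python
from typing import List

# Layers of the NFS example, bottom first (assumed start order).
NFS_STACK = ["drbd", "lvm", "filesystem", "nfs-exports", "virtual-ip"]


def switchover_plan(stack: List[str], source: str, target: str) -> List[str]:
    """Stop the stack on `source` in reverse order, then start it on `target` in order."""
    stops = [f"stop {res} on {source}" for res in reversed(stack)]
    starts = [f"start {res} on {target}" for res in stack]
    return stops + starts


for step in switchover_plan(NFS_STACK, "VM1", "VM2"):
    print(step)
```

Moving the virtual IP last on the target means clients reconnect only once the exports beneath it are ready.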
20
HA iSCSI Server (Active/Passive)
Cluster Example in Diagram
[Diagram: two-node cluster, VM1 and VM2, with failover between them]
VM1: DRBD Master, LVM, iSCSITarget, iSCSILogicalUnit, Virtual IP
VM2: DRBD Slave, LVM, iSCSITarget, iSCSILogicalUnit, Virtual IP
Both nodes: Pacemaker + Corosync on top of the kernel
Between nodes: network RAID1 (DRBD)
21
Cluster FS - OCFS2 on shared disk
Cluster Example in Diagram
[Diagram: hosts sharing one disk through OCFS2; host labels Host1, Host2, Host3]
Per host: VM, OCFS2, Cluster MD-RAID1 (/dev/md127), Kernel
Shared: Virtual Machine Images and Configuration ( /mnt/images/ )
Across hosts: Pacemaker + Corosync, VM migration, Replication / Backup
22
Future outlook
• Upstream activities
‒ OpenStack: from the control plane into the compute domain.
‒ Scalability of corosync/pacemaker
‒ Docker adoption
‒ “Zero” Downtime HA VM
‒ ...
23
Join us ( all *open-source* )
• Play with Leap 42.1: http://www.opensuse.org
Doc: https://www.suse.com/documentation/sle-ha-12/
• Report and Fix Bugs: http://bugzilla.opensuse.org
• Discussion: opensuse-ha@opensuse.org
• HA ClusterLabs: http://clusterlabs.org/
• General HA Users: users@clusterlabs.org
Demo + Q&A + Have fun
25

Linux High Availability Overview - openSUSE.Asia Summit 2015

  • 1. High Availability on Linux; Roger Zhou, zzhou@suse.com; openSUSE.Asia Summit 2015; SUSE way
  • 2. © SUSE, All rights reserved. CURIOSITY in the land of Linux High Availability
  • 3. Agenda
      • HA architectural components
      • Use case examples
      • Future outlook & Demo
  • 4. What is a Cluster?
      • HPC (supercomputing)
      • Load balancer (very high capacity)
      • High availability
          ‒ 99.999% availability = about 5 minutes of downtime per year
          ‒ SPOF (single point of failure)
          ‒ Murphy's Law: "Everything that can go wrong will go wrong"
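The "five nines" figure on the slide above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming an average year of 365.25 days; the function name is made up for illustration.

```python
# Average year length in minutes (365.25 days); an assumption for this sketch.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget implied by an availability target."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


# Print the downtime budget for two to five nines.
for target in (99.0, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(target)
    print(f"{target}% availability -> {minutes:.2f} minutes of downtime per year")
```

For 99.999% this works out to a little over five minutes per year, which matches the rounded number on the slide.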
  • 5. "HA", widely used, often confusing
      • VMware vSphere HA
          ‒ Works at the hypervisor and hardware level. Closed source.
          ‒ Agnostic to the guest OS inside the VM.
      • SUSE HA
          ‒ Runs inside the Linux OS.
          ‒ That said, Windows needs its own Windows HA solution.
      • Different industries
          ‒ We are Enterprise.
          ‒ Hadoop HA (dot com)
          ‒ OpenStack HA (IaaS)
          ‒ OpenSAF (telecom)
  • 6. History of HA in Linux OS
      • 1990s: the Heartbeat project. Simply two nodes.
      • Early 2000s: Heartbeat 2.0 became too complex.
          ‒ Industry demanded a split:
              1) one component for cluster membership
              2) one component for resource management
      • Today: ClusterLabs.org
          ‒ Pacemaker + Corosync, a completely different solution in its early days.
          ‒ Since then, the Heartbeat project has been merged in.
          ‒ 2015 HA Summit
  • 7. Typical HA Problem - Split Brain
      • Clustering
          ‒ Multiple nodes share the same resources.
      • Split partitions run the same service
          ‒ This breaks data integrity!
      • Two key concepts as the solution:
          ‒ Fencing: the cluster doesn't accept any confusing state. STONITH - "shoot the other node in the head".
          ‒ Quorum: it stands for "majority". No quorum, then no actions: no resource management, no fencing.
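The quorum rule from the split-brain slide can be sketched as a strict-majority check. This is only an illustration, assuming one vote per node; the function name and the five-node example are hypothetical, and real clusters may add options this toy model ignores.

```python
def has_quorum(votes_present: int, total_votes: int) -> bool:
    """A partition has quorum only if it holds a strict majority of all votes."""
    return votes_present > total_votes // 2


# A hypothetical five-node cluster splits into partitions of 3 and 2 nodes.
total = 5
for partition_size in (3, 2):
    allowed = has_quorum(partition_size, total)
    print(f"partition with {partition_size}/{total} votes may act: {allowed}")
```

Because a strict majority is required, at most one partition can hold quorum after a split. A plain two-node cluster that splits 1/1 leaves neither side with a majority under this rule.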
  • 8. HA Hardware Components
      • Multiple networks
          ‒ A user network for end user access.
          ‒ A dedicated network for cluster communication/heartbeat.
          ‒ A dedicated storage network infrastructure.
      • Network bonding
          ‒ a.k.a. link aggregation
      • Fencing/STONITH devices
          ‒ Remote “powerswitch”
      • Shared storage
          ‒ NAS (NFS/CIFS), SAN (FC/iSCSI)
  • 9. Architectural Software Components
      • "clusterlabs.org"
          ‒ Corosync
          ‒ Pacemaker
          ‒ Resource Agents
          ‒ Fencing/STONITH devices
          ‒ UI (crmsh and Hawk2)
          ‒ Booth for Geo Cluster
      • Outside of "clusterlabs.org"
          ‒ LVS: Layer 4, IP + port, kernel space.
          ‒ HAProxy: Layer 7 / HTTP, user space.
          ‒ Shared filesystem: OCFS2 / GFS2
          ‒ Block device replication: DRBD, cLVM mirroring, cluster-md
          ‒ Shared storage: SAN (FC / FCoE / iSCSI)
          ‒ Multipathing
  • 11. Corosync: messaging and membership
      • Consensus algorithm
          ‒ "Totem Single Ring Ordering and Membership protocol"
      • Closed Process Group
          ‒ Analogous to the TCP/IP three-way handshake.
          ‒ Membership handling.
          ‒ Message ordering.
      • A quorum system
          ‒ Notifies applications when quorum is achieved or lost.
      • In-memory object database
          ‒ For configuration engines and service engines.
          ‒ Shared-nothing cluster.
  • 12. Pacemaker: the resource manager
      • The brain of the cluster.
      • Policy engine for decision making.
          ‒ Starts/stops resources on a node according to the score.
          ‒ Monitors resources at the configured interval.
          ‒ Restarts a resource if its monitor operation fails.
          ‒ Fences (STONITH) a node if a stop operation fails.
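The decision rules on the Pacemaker slide can be pictured with a toy model. This is not Pacemaker code: the function names, the score dictionary and the action strings are all assumptions made for illustration, and only the two failure reactions named on the slide are modelled.

```python
from typing import Dict, Optional


def pick_node(scores: Dict[str, int]) -> Optional[str]:
    """Pick the node with the highest score (ties go to the alphabetically first name)."""
    if not scores:
        return None
    return max(sorted(scores), key=lambda node: scores[node])


def recovery_action(failed_operation: str) -> str:
    """Map a failed operation to the reaction listed on the slide."""
    if failed_operation == "monitor":
        return "restart"  # a failed monitor leads to a resource restart
    if failed_operation == "stop":
        return "fence"  # a failed stop leads to fencing the node
    raise ValueError(f"operation not modelled in this sketch: {failed_operation}")


# Hypothetical scores for one resource on two nodes.
print(pick_node({"node1": 100, "node2": 200}))
print(recovery_action("monitor"))
print(recovery_action("stop"))
```

The point of the sketch is the escalation: a resource that cannot be stopped cleanly leaves the cluster unsure of its state, so the node itself is fenced.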
  • 13. Shoot The Other Node In The Head
      • Data integrity does not tolerate any confusing state. Before migrating resources to another node in the cluster, the cluster must confirm that the suspect node really is down.
      • STONITH is mandatory for *enterprise* Linux HA clusters.
  • 14. Popular STONITH devices
      • APC PDU
          ‒ Network-based powerswitch
      • Standard protocols integrated with servers
          ‒ Intel AMT, HP iLO, Dell DRAC, IBM IMM, IPMI
      • Software libraries
          ‒ To deal with KVM, Xen and VMware VMs.
      • Software based
          ‒ SBD (STONITH Block Device) for self-termination. The last implicit option in the fencing topology.
      • NOTE: Fencing devices can be chained.
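The note that fencing devices can be chained can be illustrated with a small model. This sketch assumes that levels are tried in order and that a level counts as successful only if every device in it succeeds; the device functions and node name are hypothetical stand-ins, not real fencing agents.

```python
from typing import Callable, Sequence

# A fencing device is modelled as a callable that returns True once the
# target node is confirmed to be off.
FenceDevice = Callable[[str], bool]


def fence_node(target: str, levels: Sequence[Sequence[FenceDevice]]) -> bool:
    """Try each level in order; stop at the first level where all devices succeed."""
    for level in levels:
        if level and all(device(target) for device in level):
            return True
    return False


def management_board_power_off(target: str) -> bool:
    """Hypothetical device: pretend the management board is unreachable."""
    return False


def disk_based_self_fence(target: str) -> bool:
    """Hypothetical device: pretend the last-resort fencing worked."""
    return True


fenced = fence_node("node2", [[management_board_power_off], [disk_based_self_fence]])
print(f"node2 fenced: {fenced}")
```

Resources would only be moved to another node once `fence_node` reports success, which is the confirmation step the STONITH slide asks for.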
  • 15. Resource Agents (RAs)
      • Write an RA for your applications
      • LSB shell scripts: start / stop / monitor
      • More than a hundred contributors upstream on GitHub.
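The start / stop / monitor contract of a resource agent can be sketched as follows. The slide describes shell scripts; this example swaps in Python to keep it self-contained, and it borrows the exit-code convention from the OCF resource agent API (0 for success, 7 for "not running"). The state-file approach and all function names are assumptions for illustration, not an upstream agent.

```python
import os

# Exit codes following the OCF resource agent convention.
OCF_SUCCESS = 0
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7


def monitor(state_file: str) -> int:
    """Report whether the dummy resource is running (state file present)."""
    return OCF_SUCCESS if os.path.exists(state_file) else OCF_NOT_RUNNING


def start(state_file: str) -> int:
    """Start the dummy resource; starting an already running resource is not an error."""
    if monitor(state_file) == OCF_SUCCESS:
        return OCF_SUCCESS
    with open(state_file, "w") as handle:
        handle.write("running\n")
    return monitor(state_file)


def stop(state_file: str) -> int:
    """Stop the dummy resource; stopping an already stopped resource must succeed."""
    try:
        os.remove(state_file)
    except FileNotFoundError:
        pass
    return OCF_SUCCESS


def run(action: str, state_file: str) -> int:
    """Dispatch an action name to its handler, as an agent's entry point would."""
    handlers = {"start": start, "stop": stop, "monitor": monitor}
    handler = handlers.get(action)
    if handler is None:
        return OCF_ERR_UNIMPLEMENTED
    return handler(state_file)
```

Both `start` and `stop` are written to be idempotent, since the cluster manager may repeat an operation and treats a failed stop as a reason to fence the node.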
  • 16. Cluster Filesystem
      • OCFS2 / GFS2
          ‒ On the shared storage.
          ‒ Multiple nodes concurrently access the same filesystem.
      • http://clusterlabs.org/doc/fr/Pacemaker/1.1/html/Clusters_from_Scratch/_pacemaker_architecture.html
  • 17. Cluster Block Device
      • DRBD
          ‒ Network-based RAID 1.
          ‒ High performance data replication over the network.
      • cLVM2 + cmirrord
          ‒ Clustered LVM2 mirroring.
          ‒ Multiple nodes can manipulate volumes on the shared disk.
          ‒ clvmd distributes LVM metadata updates in the cluster.
          ‒ Data replication speed is way too slow.
      • Cluster md RAID 1
          ‒ Multiple nodes use the shared disks as md RAID 1.
          ‒ High performance RAID 1 solution in a cluster.
  • 19. NFS Server (Highly Available NAS): cluster example in diagram. [Diagram: VM1 runs DRBD Master, LVM, Filesystem (ext4), NFS exports and Virtual IP; VM2 runs DRBD Slave, LVM, Filesystem (ext4), NFS exports and Virtual IP. DRBD links the two kernels as network RAID 1, and Pacemaker + Corosync handles failover between VM1 and VM2.]
  • 20. HA iSCSI Server (Active/Passive): cluster example in diagram. [Diagram: VM1 runs DRBD Master, LVM, iSCSITarget, iSCSILogicalUnit and Virtual IP; VM2 runs DRBD Slave, LVM, iSCSITarget, iSCSILogicalUnit and Virtual IP. DRBD links the two kernels as network RAID 1, and Pacemaker + Corosync handles failover between VM1 and VM2.]
  • 21. Cluster FS - OCFS2 on shared disk: cluster example in diagram. [Diagram: Host1 and Host2 each run OCFS2 on top of Cluster MD-RAID1 (/dev/md127), with Pacemaker + Corosync across the nodes. The shared filesystem holds virtual machine images and configuration ( /mnt/images/ ). Further labels: Host3, VM, VM migration, Replication / Backup.]
  • 22. Future outlook
      • Upstream activities
          ‒ OpenStack: from the control plane into the compute domain.
          ‒ Scalability of corosync/pacemaker
          ‒ Docker adoption
          ‒ “Zero” Downtime HA VM
          ‒ ...
  • 23. Join us ( all *open-source* )
      • Play with Leap 42.1: http://www.opensuse.org
      • Doc: https://www.suse.com/documentation/sle-ha-12/
      • Report and Fix Bugs: http://bugzilla.opensuse.org
      • Discussion: opensuse-ha@opensuse.org
      • HA ClusterLabs: http://clusterlabs.org/
      • General HA Users: users@clusterlabs.org
  • 24. Demo + Q&A + Have fun