Linux High Availability

Kris Buytaert
@krisbuytaert
●   I used to be a Dev, Then Became an
    Op
●   Senior Linux and Open Source
    Consultant @inuits.be
●   „Infrastructure Architect“
●   Building Clouds since before the
    Cloud
●   Surviving the 10th floor test
●   Co-Author of some books
●   Guest Editor at some sites
What is HA Clustering ?


●   One service goes down
      => others take over its work
●   IP address takeover, service takeover
●   Not designed for high-performance
●   Not designed for high throughput (load balancing)
Does it Matter ?


●   Downtime is expensive
●   You miss out on $$$
●   Your boss complains
●   New users don't return
Lies, Damn Lies, and
Statistics Counting nines
              (slide by Alan R)




 Availability                Downtime per year
 99.9999%                          30 sec
 99.999%                            5 min
 99.99%                            52 min
 99.9%                              9 hr
 99%                              3.5 day
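The figures in the table follow from simple arithmetic. The sketch below recomputes them, assuming a 365-day year; the function names are illustrative and not part of any HA tool.

```python
# Yearly downtime budget for a given availability percentage.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: 365-day year


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Return the downtime (in seconds per year) allowed at this availability."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


def humanize(seconds: float) -> str:
    """Render a duration in the largest unit that keeps the value >= 1."""
    if seconds < 60:
        return f"{seconds:.1f} sec"
    if seconds < 3600:
        return f"{seconds / 60:.1f} min"
    if seconds < 86400:
        return f"{seconds / 3600:.1f} hr"
    return f"{seconds / 86400:.1f} day"


for nines in (99.9999, 99.999, 99.99, 99.9, 99.0):
    print(f"{nines}%  {humanize(downtime_seconds_per_year(nines))}")
```

Each extra nine divides the downtime budget by ten, which is why the step from 99.9% to 99.99% moves from hours to minutes.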
The Rules of HA

●   Keep it Simple
●   Keep it Simple
●   Prepare for Failure
●   Complexity is the enemy of
    reliability
●   Test your HA setup
Myths
●   Virtualization will solve your HA Needs
●   Live migration is the solution to all your problems
●   HA will make your platform more stable
You care about ?
●   Your data ?
    •   Consistent
    •   Realtime
    •   Eventually Consistent
●   Your Connection
    •   Always
    •   Most of the time
Eliminating the SPOF
●   Find out what Will Fail
    •   Disks
    •   Fans
    • Power (Supplies)
●   Find out what Can Fail
    •   Network
    •   Going Out Of Memory
Split Brain
●   Communications failures can lead to separated partitions of
    the cluster
●   If those partitions each try and take control of the cluster,
    then it's called a split-brain condition
●   If this happens, then bad things will happen
    •   http://linux-ha.org/BadThingsWillHappen
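One common guard against split brain is quorum: a partition may only act if it holds a strict majority of the votes. The sketch below is a simplified majority rule with one vote per node, not Pacemaker's actual implementation; `has_quorum` is a hypothetical helper.

```python
def has_quorum(partition_votes: int, total_votes: int) -> bool:
    """A partition has quorum only if it holds a strict majority of all votes."""
    return 2 * partition_votes > total_votes


# 3-node cluster split 2/1: only the larger partition may continue.
print(has_quorum(2, 3), has_quorum(1, 3))

# 2-node cluster split 1/1: neither side has a majority.
print(has_quorum(1, 2), has_quorum(1, 2))
```

The 2-node case shows why majority voting alone cannot decide who survives in a two-node cluster, and why fencing (STONITH) matters there.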
Shared Storage
●   Shared Storage
●   Filesystem
    •   e.g. GFS, GPFS
●   Replicated ?
●   Exported
    Filesystem ?
●   $$$
    1+1 <> 2
●   Storage = SPOF
●   Split Brain :(
●   Stonith
(Shared) Data
●   Issues :
    •   Who Writes ?
    •   Who Reads ?
    •   What if 2 Active application want to write ?
    •   What if an active server crashes during writing ?
    •   Can we accept delays ?
    •   Can we accept readonly data ?
●   Hardware Requirements
●   Filesystem Requirements (GFS, GPFS, ...)
DRBD
●   Distributed Replicated Block Device
●   In the mainline Linux kernel (as of very recently)
●   Usually only 1 mount
    •   Multi mount as of 8.X
        •   Requires GFS / OCFS2
●   Regular FS ext3 ...
●   Only 1 application instance Active accessing data
●   Upon Failover application needs to be started on other node
DRBD(2)
●   What happens when you pull the plug of a Physical
    machine ?
    •   Minimal Timeout
    •   Why did the crash happen ?
    •   Is my data still correct ?
Alternatives to DRBD
●   GlusterFS looked promising
    •   “Friends don't let Friends use Gluster”
    •   Consistency problems
    •   Stability Problems
    •   Maybe later
●   MogileFS
    •   Not POSIX
    •   App needs to implement the API
●   Ceph
    •   ?
HA Projects
●   Linux HA Project
●   Red Hat Cluster Suite
●   LVS/Keepalived


●   Application Specific Clustering Software
       •   e.g. Terracotta, MySQL NDBD
HeartBeat
●   No shared storage
●   Serial Connections to UPS to STONITH
●   (periodical/realtime) Replication or no shared data.
●   e.g. Static Website, FileServer
Heartbeat
●   Heartbeat v1
    •     Max 2 nodes
    •     No fine-grained resources
    •     Monitoring using “mon”
●   Heartbeat v2
    •     XML usage was a consulting opportunity
    •     Stability issues
    •     Forking ?
Heartbeat v1
/etc/ha.d/ha.cf
/etc/ha.d/haresources
mdb-a.menos.asbucenter.dz ntc-restart-mysql mon IPaddr2::10.8.0.13/16/bond0 \
    IPaddr2::10.16.0.13/16/bond0.16 mon

/etc/ha.d/authkeys
Heartbeat v2



“A consulting Opportunity” (LMB)
Clone Resource
Clones in v2 were buggy
Resources were started on 2 nodes
Stopped again on “1”
Heartbeat v3

•   No more /etc/ha.d/haresources
•   No more xml
•   Better integrated monitoring
•   /etc/ha.d/ha.cf has crm yes
Pacemaker ?
●   Not a fork
●   Only CRM Code taken out of Heartbeat
●   As of Heartbeat 2.1.3
    •   Support for both OpenAIS / HeartBeat
    •   Different release cycle than Heartbeat
Heartbeat, OpenAis,
Corosync ?
●   All Messaging Layers
●   Initially only Heartbeat
●   OpenAIS
●   Heartbeat got unmaintained
●   OpenAIS had heisenbugs :(
●   Corosync
●   Heartbeat maintenance taken over by LinBit
●   CRM Detects which layer
Configuring Heartbeat 3
●   /etc/ha.d/ha.cf
      Use crm yes


●   /etc/ha.d/authkeys
Configuring Heartbeat with puppet

heartbeat::hacf {"clustername":
    hosts   => ["host-a","host-b"],
    hb_nic  => ["bond0"],
    hostip1 => ["10.0.128.11"],
    hostip2 => ["10.0.128.12"],
    ping    => ["10.0.128.4"],
}
heartbeat::authkeys {"ClusterName":
    password => "ClusterName",
}
http://github.com/jtimberman/puppet/tree/master/heartbeat/
Pacemaker
[Diagram: Pacemaker; Heartbeat or OpenAIS; Cluster Glue]
Pacemaker Architecture
●   stonithd : The Heartbeat fencing subsystem.
●   lrmd : Local Resource Management Daemon. Interacts directly with resource agents (scripts).
●   pengine : Policy Engine. Computes the next state of the cluster based on the current state and the configuration.
●   cib : Cluster Information Base. Contains definitions of all cluster options, nodes, resources, their relationships to one another and current status. Synchronizes updates to all cluster nodes.
●   crmd : Cluster Resource Management Daemon. Largely a message broker for the PEngine and LRM, it also elects a leader to co-ordinate the activities of the cluster.
●   openais : messaging and membership layer.
●   heartbeat : messaging layer, an alternative to OpenAIS.
●   ccm : Short for Consensus Cluster Membership. The Heartbeat membership layer.
CRM
●   Cluster Resource Manager
●   Keeps Nodes in Sync
●   XML Based
●   cibadmin
●   CLI manageable
●   crm

configure
property $id="cib-bootstrap-options" \
        stonith-enabled="FALSE" \
        no-quorum-policy=ignore \
        start-failure-is-fatal="FALSE"
rsc_defaults $id="rsc_defaults-options" \
        migration-threshold="1" \
        failure-timeout="1"
primitive d_mysql ocf:local:mysql \
        op monitor interval="30s" \
        params test_user="sure" test_passwd="illtell" test_table="test.table"
primitive ip_db ocf:heartbeat:IPaddr2 \
        params ip="172.17.4.202" nic="bond0" \
        op monitor interval="10s"
group svc_db d_mysql ip_db
commit
Heartbeat Resources
●   LSB
●   Heartbeat resource (+status)
●   OCF (Open Cluster Framework) (+monitor)
●   Clones (don't use in HAv2)
●   Multi State Resources
LSB Resource Agents
●   LSB == Linux Standards Base
●   LSB resource agents are standard System V-style init
    scripts commonly used on Linux and other UNIX-like OSes
●   LSB init scripts are stored under /etc/init.d/
●   This enables Linux-HA to immediately support nearly every
    service that comes with your system, and most packages
    which come with their own init script
●   It's straightforward to change an LSB script to an OCF
    script
OCF
●   OCF == Open Cluster Framework
●   OCF Resource agents are the most powerful
    type of resource agent we support
●   OCF RAs are extended init scripts
    • They have additional actions:
      • monitor – for monitoring resource health
      • meta-data – for providing information
        about the RA
●   OCF RAs are located in
    /usr/lib/ocf/resource.d/provider-name/
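To make the "extended init script" idea concrete, here is a toy sketch of how an OCF-style agent dispatches its actions and reports status through exit codes. Real agents are usually shell scripts that manage a real service; `ToyService`, `run_action` and the placeholder metadata string are hypothetical. The exit code values shown are the standard OCF ones.

```python
# Standard OCF exit codes used by this sketch.
OCF_SUCCESS = 0
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7

# Placeholder only: a real agent prints a full XML description of itself.
META_DATA = '<resource-agent name="toy"/>'


class ToyService:
    """Stands in for the real service; it only remembers whether it runs."""

    def __init__(self) -> None:
        self.running = False


def run_action(service: ToyService, action: str) -> int:
    """Dispatch one action and return the exit code the cluster would see."""
    if action == "start":
        service.running = True
        return OCF_SUCCESS
    if action == "stop":
        service.running = False
        return OCF_SUCCESS
    if action == "monitor":
        # monitor is what distinguishes OCF agents from plain init scripts
        return OCF_SUCCESS if service.running else OCF_NOT_RUNNING
    if action == "meta-data":
        print(META_DATA)
        return OCF_SUCCESS
    return OCF_ERR_UNIMPLEMENTED


svc = ToyService()
print(run_action(svc, "monitor"))  # not started yet
print(run_action(svc, "start"))
print(run_action(svc, "monitor"))
```

The cluster only sees the exit code, so a monitor action that returns success while the application is broken hides the failure from the cluster.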
Monitoring
●   Defined in the OCF Resource script
●   Configured in the parameters
●   Tomcat :
    •   Checks a configurable health page
●   MySQL :
    •   Checks query from a configurable table
●   Others :
    •   Basic process state
Anatomy of a Cluster config

•   Cluster properties
•   Resource Defaults
•   Primitive Definitions
•   Resource Groups and Constraints
Cluster Properties

property $id="cib-bootstrap-options" \
     stonith-enabled="FALSE" \
     no-quorum-policy="ignore" \
     start-failure-is-fatal="FALSE" \
     pe-error-series-max="9" \
     pe-warn-series-max="9" \
     pe-input-series-max="9"

no-quorum-policy = we ignore the loss of quorum, as this is a 2 node cluster
pe-* = restrict logging (limits the number of policy engine input files kept)

start-failure-is-fatal : when set to FALSE, a failed start is not treated as fatal; the cluster will instead use the resource's failcount and
value for resource-failure-stickiness
Resource Defaults

rsc_defaults $id="rsc_defaults-options" \
     migration-threshold="1" \
     failure-timeout="1" \
     resource-stickiness="INFINITY"


failure-timeout means that after a failure there will be a 60 second timeout before the
resource can come back to the node on which it failed.

migration-threshold=1 means that after 1 failure the resource will try to start on the other node

Resource-stickiness=INFINITY means that the resource really wants to stay where it is now.
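The interplay between failcount, migration-threshold and failure-timeout can be sketched as below. This is a simplified model of the documented behaviour, not Pacemaker's code: both function names are hypothetical, and the real cluster only re-evaluates expired failures when the policy engine next runs.

```python
def effective_failcount(failcount: int,
                        seconds_since_last_failure: float,
                        failure_timeout: float) -> int:
    """Forget failures older than failure_timeout (0 means never expire)."""
    if failure_timeout > 0 and seconds_since_last_failure >= failure_timeout:
        return 0
    return failcount


def may_run_on_node(failcount: int, migration_threshold: int) -> bool:
    """The node is off limits once the failcount reaches the threshold."""
    return failcount < migration_threshold


# With migration-threshold=1, a single failure moves the resource away.
print(may_run_on_node(failcount=1, migration_threshold=1))

# Once the failure has expired, the node becomes eligible again.
expired = effective_failcount(1, seconds_since_last_failure=120, failure_timeout=60)
print(may_run_on_node(expired, migration_threshold=1))
```

Combined with resource-stickiness=INFINITY, the resource does not move back by itself once the failure expires; the node merely becomes a valid target again.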
Primitive Definitions
primitive d_mine ocf:custom:tomcat \
     params instance_name="mine" \
     monitor_urls="health.html" \
     monitor_use_ssl="no" \
     op monitor interval="15s" \
     on-fail="restart" \
     timeout="30s"

primitive ip_mine_svc ocf:heartbeat:IPaddr2 \
     params ip="10.8.4.131" cidr_netmask="16" nic="bond0" \
     op monitor interval="10s"
Parsing a config
●   Isn't always done correctly
●   Even a verify won't find all issues
●   Unexpected behaviour might occur
Where a resource runs
•   multi state resources
    •   Master – Slave,
        •   e.g. mysql master-slave, drbd
•   Clones
    •   Resources that can run on multiple nodes, e.g.
        •   Multimaster mysql servers
        •   Mysql slaves
        •   Stateless applications
•   location
    •   Preferred location to run resource, e.g. based on hostname
•   colocation
    •   Resources that have to live together
        •   e.g. ip address + service
•   order
    •   Define what resource has to start first, or wait for another resource
•   groups
    •   Colocation + order
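The "group = colocation + order" shorthand can be spelled out as follows. The sketch expands an ordered member list into the pairwise constraints it implies; `expand_group` is a hypothetical helper for illustration, not how the CRM stores groups internally.

```python
def expand_group(members):
    """Return (colocations, orders) implied by an ordered group.

    colocations: (resource, must_run_with) pairs
    orders:      (start_first, start_then) pairs
    """
    pairs = list(zip(members, members[1:]))
    colocations = [(later, earlier) for earlier, later in pairs]
    orders = [(earlier, later) for earlier, later in pairs]
    return colocations, orders


# The svc_db group from the CRM example: d_mysql first, then ip_db.
colocations, orders = expand_group(["d_mysql", "ip_db"])
print(colocations)
print(orders)
```

Members start in the listed order and stop in reverse, and every member lands on the same node as the one before it.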
A Tomcat app on DRBD
●   DRBD can only be active on 1 node
●   The filesystem needs to be mounted on that active DRBD
    node
Resource Groups and Constraints
group svc_mine d_mine ip_mine

ms ms_drbd_storage drbd_storage \
    meta master_max="1" master_node_max="1" clone_max="2" clone_node_max="1" notify="true"

location cli-prefer-svc_db svc_db \
    rule $id="cli-prefer-rule-svc_db" inf: #uname eq db-a

colocation fs_on_drbd inf: svc_mine ms_drbd_storage:Master
order fs_after_drbd inf: ms_drbd_storage:promote svc_mine:start
Using crm
●   crm configure
●   Edit primitive
●   Verify
●   Commit
crm commands
crm
   Start the cluster resource manager shell
crm resource
   Change into resource mode
crm configure
   Change into configure mode
crm configure show
   Show the current resource config
crm resource show
   Show the current resource state
cibadmin -Q
   Dump the full Cluster Information Base in XML
But We love XML
●   cibadmin -Q
Checking the Cluster State
crm_mon -1


============
Last updated: Wed Nov 4 16:44:26 2009
Stack: Heartbeat
Current DC: xms-1 (c2c581f8-4edc-1de0-a959-91d246ac80f5) - partition with quorum
Version: 1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7
2 Nodes configured, unknown expected votes
2 Resources configured.
============

Online: [ xms-1 xms-2 ]

Resource Group: svc_mysql
  d_mysql       (ocf::ntc:mysql):     Started xms-1
  ip_mysql      (ocf::heartbeat:IPaddr2):       Started xms-1
Resource Group: svc_XMS
  d_XMS         (ocf::ntc:XMS):       Started xms-2
  ip_XMS        (ocf::heartbeat:IPaddr2):       Started xms-2
  ip_XMS_public (ocf::heartbeat:IPaddr2):       Started xms-2
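Output like the above is easy to check from a script. The sketch below extracts the resource lines from `crm_mon -1` text; the regular expression is an assumption based only on the output format shown on these slides (other Pacemaker versions may print it differently), and `parse_crm_mon` is a hypothetical helper.

```python
import re

# Matches lines such as "  d_mysql  (ocf::ntc:mysql):  Started xms-1".
# The node is optional because stopped resources are listed without one.
RESOURCE_LINE = re.compile(r"^\s*(\S+)\s+\(([^)]+)\):?\s+(\S+)(?:\s+(\S+))?\s*$")


def parse_crm_mon(text):
    """Map resource name -> agent, state and node taken from crm_mon -1 text."""
    resources = {}
    for line in text.splitlines():
        match = RESOURCE_LINE.match(line)
        if match:
            name, agent, state, node = match.groups()
            resources[name] = {"agent": agent, "state": state, "node": node}
    return resources


SAMPLE = """\
Online: [ xms-1 xms-2 ]

Resource Group: svc_mysql
  d_mysql       (ocf::ntc:mysql):     Started xms-1
  ip_mysql      (ocf::heartbeat:IPaddr2):       Started xms-1
Resource Group: svc_XMS
  d_XMS         (ocf::ntc:XMS):       Started xms-2
"""

for name, info in parse_crm_mon(SAMPLE).items():
    print(name, info["state"], info["node"])
```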
Stopping a resource
crm resource stop svc_XMS


crm_mon -1


============
Last updated: Wed Nov 4 16:56:05 2009
Stack: Heartbeat
Current DC: xms-1 (c2c581f8-4edc-1de0-a959-91d246ac80f5) - partition with quorum
Version: 1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7
2 Nodes configured, unknown expected votes
2 Resources configured.
============

Online: [ xms-1 xms-2 ]

Resource Group: svc_mysql
  d_mysql    (ocf::ntc:mysql): Started xms-1
  ip_mysql (ocf::heartbeat:IPaddr2): Started xms-1
Starting a resource
crm resource start svc_XMS
 crm_mon -1


============
Last updated: Wed Nov 4 17:04:56 2009
Stack: Heartbeat
Current DC: xms-1 (c2c581f8-4edc-1de0-a959-91d246ac80f5) - partition with quorum
Version: 1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7
2 Nodes configured, unknown expected votes
2 Resources configured.
============

Online: [ xms-1 xms-2 ]

Resource Group: svc_mysql
  d_mysql    (ocf::ntc:mysql): Started xms-1
  ip_mysql (ocf::heartbeat:IPaddr2): Started xms-1
Resource Group: svc_XMS
Moving a resource
●   crm resource migrate
●   Is permanent, even upon failure
●   Useful in upgrade scenarios
●   Use crm resource unmigrate to restore
Moving a resource
[xpoll-root@XMS-1 ~]# crm resource migrate svc_XMS xms-1
[xpoll-root@XMS-1 ~]# crm_mon -1
Last updated: Wed Nov 4 17:32:50 2009
Stack: Heartbeat
Current DC: xms-1 (c2c581f8-4edc-1de0-a959-91d246ac80f5) - partition with quorum
Version: 1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7
2 Nodes configured, unknown expected votes
2 Resources configured.
Online: [ xms-1 xms-2 ]
Resource Group: svc_mysql
  d_mysql     (ocf::ntc:mysql): Started xms-1
  ip_mysql (ocf::heartbeat:IPaddr2): Started xms-1
Resource Group: svc_XMS
  d_XMS       (ocf::ntc:XMS):    Started xms-1
  ip_XMS      (ocf::heartbeat:IPaddr2): Started xms-1
  ip_XMS_public         (ocf::heartbeat:IPaddr2): Started xms-1
Putting a node in Standby
[menos-val3-root@mss-1031a ~]# crm node standby
[menos-val3-root@mss-1031a ~]# crm_mon -1
============
Last updated: Wed Dec 22 14:33:45 2010
Stack: Heartbeat
Current DC: mss-1031a (45674b38-5aad-4a7c-bbf1-562b2f244763) - partition with quorum
Version: 1.0.7-d3fa20fc76c7947d6de66db7e52526dc6bd7d782
2 Nodes configured, unknown expected votes
1 Resources configured.
============
Node mss-1031b (110dc817-e2ea-4290-b275-4e6d8ca7b031): OFFLINE (standby)
Node mss-1031a (45674b38-5aad-4a7c-bbf1-562b2f244763): standby
Restoring a node from standby
[menos-val3-root@mss-1031b ~]# crm node online
[menos-val3-root@mss-1031b ~]# crm_mon -1
============
Last updated: Thu Dec 23 08:36:21 2010
Stack: Heartbeat
Current DC: mss-1031b (110dc817-e2ea-4290-b275-4e6d8ca7b031) - partition with quorum
Version: 1.0.7-d3fa20fc76c7947d6de66db7e52526dc6bd7d782
2 Nodes configured, unknown expected votes
1 Resources configured.
============


Online: [ mss-1031b mss-1031a ]
Migrate vs Standby
●   Think of clusters with more than 2 nodes
●   Migrate : send resource to node X
    •   Only use that one node
●   Standby : do not send resources to node X
    •   But use the other available ones
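The difference shows up when you list which nodes remain eligible. A minimal sketch with three hypothetical node names; both functions are illustrative and only model the two bullets above.

```python
def candidates_after_migrate(nodes, target):
    """Migrate pins the resource: only the target node stays eligible."""
    return [node for node in nodes if node == target]


def candidates_after_standby(nodes, standby_node):
    """Standby excludes one node: every other node stays eligible."""
    return [node for node in nodes if node != standby_node]


nodes = ["node-a", "node-b", "node-c"]  # hypothetical 3-node cluster
print(candidates_after_migrate(nodes, "node-a"))
print(candidates_after_standby(nodes, "node-a"))
```

After a migrate, losing the target node leaves the resource with nowhere to go, whereas standby still leaves the rest of the cluster available.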
Debugging
●   Check crm_mon -f
●   Failcounts ?
●   Did the application launch correctly ?
●   /var/log/messages
    •   Warning: very verbose
●   Tomcat logs
Resource not running
[menos-val3-root@mrs-a ~]# crm
crm(live)# resource
crm(live)resource# show
Resource Group: svc-MRS
   d_MRS      (ocf::ntc:tomcat) Stopped
   ip_MRS_svc         (ocf::heartbeat:IPaddr2) Stopped
   ip_MRS_usr         (ocf::heartbeat:IPaddr2) Stopped
Resource Failcount
[menos-val3-root@mrs-a ~]# crm
crm(live)# resource
crm(live)resource# failcount d_MRS show mrs-a
scope=status name=fail-count-d_MRS value=1
crm(live)resource# failcount d_MRS delete mrs-a
crm(live)resource# failcount d_MRS show mrs-a
scope=status name=fail-count-d_MRS value=0
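For monitoring scripts, the `key=value` line printed by the failcount command can be turned into a dictionary. A small sketch, assuming the output format shown above; `parse_failcount` is a hypothetical helper.

```python
def parse_failcount(line: str) -> dict:
    """Split 'scope=status name=fail-count-d_MRS value=1' into fields."""
    fields = dict(pair.split("=", 1) for pair in line.split())
    # Convert the value only when it is numeric; leave anything else as text.
    if fields.get("value", "").isdigit():
        fields["value"] = int(fields["value"])
    return fields


result = parse_failcount("scope=status name=fail-count-d_MRS value=1")
print(result["name"], result["value"])
```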
Pacemaker and Puppet
●   Plenty of non usable modules around
    •   HAv1
●   https://github.com/rodjek/puppet-pacemaker.git
    •   Strict set of ops / parameters
●   Make sure your modules don't enable resources
●   I've been using templates till now to populate
●   cibadmin to configure
●   crm is complex, even crm doesn't parse correctly yet
●   Plenty of work ahead !
Getting Help
●   http://clusterlabs.org
●   #linux-ha on irc.freenode.org
●   http://www.drbd.org/users-guide/
Contact :
Kris Buytaert
Kris.Buytaert@inuits.be

Further Reading
@krisbuytaert
http://www.krisbuytaert.be/blog/
http://www.inuits.be/
http://www.virtualization.com/
http://www.oreillygmt.com/
Inuits
't Hemeltje
Gemeentepark 2
2930 Brasschaat
891.514.231
+32 473 441 636

Esquimaux
Kheops Business Center
Avenue Georges Lemaître 54
6041 Gosselies
889.780.406
More Related Content

What's hot

Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems confluent
 
MariaDB Performance Tuning Crash Course
MariaDB Performance Tuning Crash CourseMariaDB Performance Tuning Crash Course
MariaDB Performance Tuning Crash CourseSeveralnines
 
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...DataStax
 
Automate Your Kafka Cluster with Kubernetes Custom Resources
Automate Your Kafka Cluster with Kubernetes Custom Resources Automate Your Kafka Cluster with Kubernetes Custom Resources
Automate Your Kafka Cluster with Kubernetes Custom Resources confluent
 
[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.
[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.
[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.NAVER D2
 
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin	Kata Container - The Security of VM and The Speed of Container | Yuntong Jin
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin Vietnam Open Infrastructure User Group
 
Query Compilation in Impala
Query Compilation in ImpalaQuery Compilation in Impala
Query Compilation in ImpalaCloudera, Inc.
 
Tuning Apache Ambari performance for Big Data at scale with 3000 agents
Tuning Apache Ambari performance for Big Data at scale with 3000 agentsTuning Apache Ambari performance for Big Data at scale with 3000 agents
Tuning Apache Ambari performance for Big Data at scale with 3000 agentsDataWorks Summit
 
OpenvSwitch Deep Dive
OpenvSwitch Deep DiveOpenvSwitch Deep Dive
OpenvSwitch Deep Diverajdeep
 
Flink on Kubernetes operator
Flink on Kubernetes operatorFlink on Kubernetes operator
Flink on Kubernetes operatorEui Heo
 
Everything you ever needed to know about Kafka on Kubernetes but were afraid ...
Everything you ever needed to know about Kafka on Kubernetes but were afraid ...Everything you ever needed to know about Kafka on Kubernetes but were afraid ...
Everything you ever needed to know about Kafka on Kubernetes but were afraid ...HostedbyConfluent
 
MySQL InnoDB Cluster / ReplicaSet - Tutorial
MySQL InnoDB Cluster / ReplicaSet - TutorialMySQL InnoDB Cluster / ReplicaSet - Tutorial
MySQL InnoDB Cluster / ReplicaSet - TutorialKenny Gryp
 
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Henning Jacobs
 
PostgreSQL on EXT4, XFS, BTRFS and ZFS
PostgreSQL on EXT4, XFS, BTRFS and ZFSPostgreSQL on EXT4, XFS, BTRFS and ZFS
PostgreSQL on EXT4, XFS, BTRFS and ZFSTomas Vondra
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machineAlexei Starovoitov
 
IBM Spectrum Scale Authentication for File Access - Deep Dive
IBM Spectrum Scale Authentication for File Access - Deep DiveIBM Spectrum Scale Authentication for File Access - Deep Dive
IBM Spectrum Scale Authentication for File Access - Deep DiveShradha Nayak Thakare
 
MySQL GTID 시작하기
MySQL GTID 시작하기MySQL GTID 시작하기
MySQL GTID 시작하기I Goo Lee
 
Object Storage in a Cloud-Native Container Envirnoment
Object Storage in a Cloud-Native Container EnvirnomentObject Storage in a Cloud-Native Container Envirnoment
Object Storage in a Cloud-Native Container EnvirnomentMinio
 
Maria db 이중화구성_고민하기
Maria db 이중화구성_고민하기Maria db 이중화구성_고민하기
Maria db 이중화구성_고민하기NeoClova
 

What's hot (20)

Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems
 
MariaDB Performance Tuning Crash Course
MariaDB Performance Tuning Crash CourseMariaDB Performance Tuning Crash Course
MariaDB Performance Tuning Crash Course
 
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
 
Automate Your Kafka Cluster with Kubernetes Custom Resources
Automate Your Kafka Cluster with Kubernetes Custom Resources Automate Your Kafka Cluster with Kubernetes Custom Resources
Automate Your Kafka Cluster with Kubernetes Custom Resources
 
[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.
[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.
[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.
 
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin	Kata Container - The Security of VM and The Speed of Container | Yuntong Jin
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin
 
eBPF/XDP
eBPF/XDP eBPF/XDP
eBPF/XDP
 
Query Compilation in Impala
Query Compilation in ImpalaQuery Compilation in Impala
Query Compilation in Impala
 
Tuning Apache Ambari performance for Big Data at scale with 3000 agents
Tuning Apache Ambari performance for Big Data at scale with 3000 agentsTuning Apache Ambari performance for Big Data at scale with 3000 agents
Tuning Apache Ambari performance for Big Data at scale with 3000 agents
 
OpenvSwitch Deep Dive
OpenvSwitch Deep DiveOpenvSwitch Deep Dive
OpenvSwitch Deep Dive
 
Flink on Kubernetes operator
Flink on Kubernetes operatorFlink on Kubernetes operator
Flink on Kubernetes operator
 
Everything you ever needed to know about Kafka on Kubernetes but were afraid ...
Everything you ever needed to know about Kafka on Kubernetes but were afraid ...Everything you ever needed to know about Kafka on Kubernetes but were afraid ...
Everything you ever needed to know about Kafka on Kubernetes but were afraid ...
 
MySQL InnoDB Cluster / ReplicaSet - Tutorial
MySQL InnoDB Cluster / ReplicaSet - TutorialMySQL InnoDB Cluster / ReplicaSet - Tutorial
MySQL InnoDB Cluster / ReplicaSet - Tutorial
 
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
 
PostgreSQL on EXT4, XFS, BTRFS and ZFS
PostgreSQL on EXT4, XFS, BTRFS and ZFSPostgreSQL on EXT4, XFS, BTRFS and ZFS
PostgreSQL on EXT4, XFS, BTRFS and ZFS
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machine
 
IBM Spectrum Scale Authentication for File Access - Deep Dive
IBM Spectrum Scale Authentication for File Access - Deep DiveIBM Spectrum Scale Authentication for File Access - Deep Dive
IBM Spectrum Scale Authentication for File Access - Deep Dive
 
MySQL GTID 시작하기
MySQL GTID 시작하기MySQL GTID 시작하기
MySQL GTID 시작하기
 
Object Storage in a Cloud-Native Container Envirnoment
Object Storage in a Cloud-Native Container EnvirnomentObject Storage in a Cloud-Native Container Envirnoment
Object Storage in a Cloud-Native Container Envirnoment
 
Maria db 이중화구성_고민하기
Maria db 이중화구성_고민하기Maria db 이중화구성_고민하기
Maria db 이중화구성_고민하기
 

Viewers also liked

Rhel cluster basics 1
Rhel cluster basics   1Rhel cluster basics   1
Rhel cluster basics 1Manoj Singh
 
Linux Cluster Concepts
Linux Cluster ConceptsLinux Cluster Concepts
Linux Cluster Conceptsnixsavy
 
7 tools for your devops stack
7 tools for your devops stack7 tools for your devops stack
7 tools for your devops stackKris Buytaert
 
Red Hat Enterprise Linux and NFS by syedmshaaf
Red Hat Enterprise Linux and NFS by syedmshaafRed Hat Enterprise Linux and NFS by syedmshaaf
Red Hat Enterprise Linux and NFS by syedmshaafSyed Shaaf
 
Red Hat TUG Utrecht - Storage Update june 2015
Red Hat TUG Utrecht - Storage Update june 2015Red Hat TUG Utrecht - Storage Update june 2015
Red Hat TUG Utrecht - Storage Update june 2015Marcel Hergaarden
 
Product backlog refinement
Product backlog refinementProduct backlog refinement
Product backlog refinementespeo
 
Linux-HA with Pacemaker
Linux-HA with PacemakerLinux-HA with Pacemaker
Linux-HA with PacemakerKris Buytaert
 
MySQL HA with PaceMaker
MySQL HA with  PaceMakerMySQL HA with  PaceMaker
MySQL HA with PaceMakerKris Buytaert
 
High Availability Options for Modern Oracle Infrastructures
High Availability Options for Modern Oracle InfrastructuresHigh Availability Options for Modern Oracle Infrastructures
High Availability Options for Modern Oracle InfrastructuresSimon Haslam
 
Pacemaker+DRBD
Pacemaker+DRBDPacemaker+DRBD
Pacemaker+DRBDDan Frincu
 
High Availability for OpenStack
High Availability for OpenStackHigh Availability for OpenStack
High Availability for OpenStackKamesh Pemmaraju
 
Clustering
ClusteringClustering
ClusteringMeme Hei
 
Icinga Camp Berlin 2017 - Train IT Platform Monitoring
Icinga Camp Berlin 2017 - Train IT Platform MonitoringIcinga Camp Berlin 2017 - Train IT Platform Monitoring
Icinga Camp Berlin 2017 - Train IT Platform MonitoringIcinga
 
Всем плевать на ваш дизайн
Всем плевать на ваш дизайнВсем плевать на ваш дизайн
Всем плевать на ваш дизайнAlexander Kirov
 
Metrics that matter: Making the business case that documentation has value
Metrics that matter: Making the business case that documentation has valueMetrics that matter: Making the business case that documentation has value
Metrics that matter: Making the business case that documentation has valuePublishing Smarter
 

Viewers also liked (20)

Rhel cluster basics 1
Rhel cluster basics   1Rhel cluster basics   1
Rhel cluster basics 1
 
Linux Cluster Concepts
Linux Cluster ConceptsLinux Cluster Concepts
Linux Cluster Concepts
 
7 tools for your devops stack
7 tools for your devops stack7 tools for your devops stack
7 tools for your devops stack
 
Exadata Cloud Service Overview(v2)
Exadata Cloud Service Overview(v2) Exadata Cloud Service Overview(v2)
Exadata Cloud Service Overview(v2)
 
Red Hat Enterprise Linux and NFS by syedmshaaf
Red Hat Enterprise Linux and NFS by syedmshaafRed Hat Enterprise Linux and NFS by syedmshaaf
Red Hat Enterprise Linux and NFS by syedmshaaf
 
Red Hat TUG Utrecht - Storage Update june 2015
Red Hat TUG Utrecht - Storage Update june 2015Red Hat TUG Utrecht - Storage Update june 2015
Red Hat TUG Utrecht - Storage Update june 2015
 
Product backlog refinement
Product backlog refinementProduct backlog refinement
Product backlog refinement
 
Gluster 3.3 deep dive
Gluster 3.3 deep diveGluster 3.3 deep dive
Gluster 3.3 deep dive
 
Linux-HA with Pacemaker
Linux-HA with PacemakerLinux-HA with Pacemaker
Linux-HA with Pacemaker
 
RedHat Cluster!
RedHat Cluster!RedHat Cluster!
RedHat Cluster!
 
MySQL HA with PaceMaker
MySQL HA with  PaceMakerMySQL HA with  PaceMaker
MySQL HA with PaceMaker
 
High Availability Options for Modern Oracle Infrastructures
High Availability Options for Modern Oracle InfrastructuresHigh Availability Options for Modern Oracle Infrastructures
High Availability Options for Modern Oracle Infrastructures
 
Pacemaker+DRBD
Pacemaker+DRBDPacemaker+DRBD
Pacemaker+DRBD
 
OpenStack HA
OpenStack HAOpenStack HA
OpenStack HA
 
High Availability for OpenStack
High Availability for OpenStackHigh Availability for OpenStack
High Availability for OpenStack
 
Clustering
ClusteringClustering
Clustering
 
Icinga Camp Berlin 2017 - Train IT Platform Monitoring
Icinga Camp Berlin 2017 - Train IT Platform MonitoringIcinga Camp Berlin 2017 - Train IT Platform Monitoring
Icinga Camp Berlin 2017 - Train IT Platform Monitoring
 
Pacemaker
PacemakerPacemaker
Pacemaker
 
Всем плевать на ваш дизайн
Всем плевать на ваш дизайнВсем плевать на ваш дизайн
Всем плевать на ваш дизайн
 
Metrics that matter: Making the business case that documentation has value
Metrics that matter: Making the business case that documentation has valueMetrics that matter: Making the business case that documentation has value
Metrics that matter: Making the business case that documentation has value
 

Linux-HA with Pacemaker

  • 1. Linux High Availability Kris Buytaert
  • 2. Kris Buytaert @krisbuytaert ● I used to be a Dev, Then Became an Op ● Senior Linux and Open Source Consultant @inuits.be ● „Infrastructure Architect“ ● Building Clouds since before the Cloud ● Surviving the 10th floor test ● Co-Author of some books ● Guest Editor at some sites
  • 3. What is HA Clustering ? ● One service goes down => others take over its work ● IP address takeover, service takeover ● Not designed for high-performance ● Not designed for high throughput (load balancing)
  • 4. Does it Matter ? ● Downtime is expensive ● You miss out on $$$ ● Your boss complains ● New users don't return
  • 5. Lies, Damn Lies, and Statistics: Counting nines (slide by Alan R) ● Allowed downtime per year: ● 99.9999% = 30 sec ● 99.999% = 5 min ● 99.99% = 52 min ● 99.9% = 9 hr ● 99% = 3.5 day
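The figures on the counting-nines slide can be reproduced with a little arithmetic. The sketch below assumes a 365-day year and treats each percentage as allowed downtime per year; the function name is made up for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: non-leap year


def downtime_seconds_per_year(availability_percent):
    """Seconds of downtime per year that still meet the given availability."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.9999, 99.999, 99.99, 99.9, 99.0):
    secs = downtime_seconds_per_year(pct)
    print(f"{pct}% -> {secs:,.0f} s/year ({secs / 3600:.2f} h)")
```

The slide rounds these values, e.g. roughly 31.5 seconds becomes "30 sec" and roughly 8.76 hours becomes "9 hr".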
  • 6. The Rules of HA ● Keep it Simple ● Keep it Simple ● Prepare for Failure ● Complexity is the enemy of reliability ● Test your HA setup
  • 7. Myths ● Virtualization will solve your HA Needs ● Live migration is the solution to all your problems ● HA will make your platform more stable
  • 8. You care about ? ● Your data ? • Consistent • Realtime • Eventually Consistent ● Your Connection • Always • Most of the time
  • 9. Eliminating the SPOF ● Find out what Will Fail • Disks • Fans • Power (Supplies) ● Find out what Can Fail • Network • Going Out Of Memory
  • 10. Split Brain ● Communications failures can lead to separated partitions of the cluster ● If those partitions each try and take control of the cluster, then it's called a split-brain condition ● If this happens, then bad things will happen • http://linux-ha.org/BadThingsWillHappen
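One common guard against split brain is a quorum rule: a partition may only act if it holds a strict majority of the votes. The sketch below is a simplified illustration of that rule, not Pacemaker's implementation; `has_quorum` is a hypothetical helper.

```python
def has_quorum(partition_votes, total_votes):
    """A partition has quorum only with a strict majority of all votes."""
    return 2 * partition_votes > total_votes


# 5-node cluster split 3/2: only the larger side keeps quorum.
print(has_quorum(3, 5), has_quorum(2, 5))

# 2-node cluster split 1/1: neither side has a majority.
print(has_quorum(1, 2))
```

The 2-node case shows why a majority rule alone cannot decide a two-node split, which is where fencing (STONITH) comes in.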
  • 11. Shared Storage ● Shared Storage ● Filesystem • e.g GFS, GpFS ● Replicated ? ● Exported Filesystem ? ● $$$ 1+1 <> 2 ● Storage = SPOF ● Split Brain :( ● Stonith
  • 12. (Shared) Data ● Issues : • Who Writes ? • Who Reads ? • What if 2 Active application want to write ? • What if an active server crashes during writing ? • Can we accept delays ? • Can we accept readonly data ? ● Hardware Requirements ● Filesystem Requirements (GFS, GpFS, ...)
  • 13. DRBD ● Distributed Replicated Block Device ● In the Linux Kernel (as of very recent) ● Usually only 1 mount • Multi mount as of 8.X • Requires GFS / OCFS2 ● Regular FS ext3 ... ● Only 1 application instance Active accessing data ● Upon Failover application needs to be started on other node
  • 14. DRBD(2) ● What happens when you pull the plug of a Physical machine ? • Minimal Timeout • Why did the crash happen ? • Is my data still correct ?
  • 15. Alternatives to DRBD ● GlusterFS looked promising • “Friends don't let Friends use Gluster” • Consistency problems • Stability Problems • Maybe later ● MogileFS • Not posix • App needs to implement the API ● Ceph • ?
  • 16. HA Projects ● Linux HA Project ● Red Hat Cluster Suite ● LVS/Keepalived ● Application Specific Clustering Software • e.g Terracotta, MySQL NDBD
  • 17. HeartBeat ● No shared storage ● Serial Connections to UPS to STONITH ● (periodical/realtime) Replication or no shared data. ● e.g Static Website, FileServer
  • 18. Heartbeat ● Heartbeat v1 • Max 2 nodes • No fine-grained resources • Monitoring using “mon” ● Heartbeat v2 • XML usage was a consulting opportunity • Stability issues • Forking ?
  • 19. Heartbeat v1 /etc/ha.d/ha.cf /etc/ha.d/haresources mdb-a.menos.asbucenter.dz ntc-restart-mysql mon IPaddr2::10.8.0.13/16/bond0 IPaddr2::10.16.0.13/16/bond0.16 mon /etc/ha.d/authkeys
  • 20. Heartbeat v2 “A consulting Opportunity” LMB
  • 21. Clone Resource Clones in v2 were buggy Resources were started on 2 nodes Stopped again on “1”
  • 22. Heartbeat v3 • No more /etc/ha.d/haresources • No more xml • Better integrated monitoring • /etc/ha.d/ha.cf has • crm=yes
  • 23. Pacemaker ? ● Not a fork ● Only CRM Code taken out of Heartbeat ● As of Heartbeat 2.1.3 • Support for both OpenAIS / HeartBeat • Different Release Cycles than Heartbeat
  • 24. Heartbeat, OpenAis, Corosync ? ● All Messaging Layers ● Initially only Heartbeat ● OpenAIS ● Heartbeat got unmaintained ● OpenAIS had heisenbugs :( ● Corosync ● Heartbeat maintenance taken over by LinBit ● CRM Detects which layer
  • 25. Configuring Heartbeat 3 ● /etc/ha.d/ha.cf Use crm = yes ● /etc/ha.d/authkeys
  • 26. Configuring Heartbeat with puppet heartbeat::hacf {"clustername": hosts => ["host-a","host-b"], hb_nic => ["bond0"], hostip1 => ["10.0.128.11"], hostip2 => ["10.0.128.12"], ping => ["10.0.128.4"], } heartbeat::authkeys {"ClusterName": password => "ClusterName", } http://github.com/jtimberman/puppet/tree/master/heartbeat/
  • 27. Pacemaker Heartbeat or OpenAIS Cluster Glue
  • 28. Pacemaker Architecture ● stonithd : The Heartbeat fencing subsystem. ● lrmd : Local Resource Management Daemon. Interacts directly with resource agents (scripts). ● pengine : Policy Engine. Computes the next state of the cluster based on the current state and the configuration. ● cib : Cluster Information Base. Contains definitions of all cluster options, nodes, resources, their relationships to one another and current status. Synchronizes updates to all cluster nodes. ● crmd : Cluster Resource Management Daemon. Largely a message broker for the PEngine and LRM, it also elects a leader to co-ordinate the activities of the cluster. ● openais : messaging and membership layer. ● heartbeat : messaging layer, an alternative to OpenAIS. ● ccm : Short for Consensus Cluster Membership. The Heartbeat membership layer.
  • 29. CRM configure ● Cluster Resource Manager ● Keeps Nodes in Sync ● XML Based ● cibadmin ● Cli manageable ● crm ● Example: property $id="cib-bootstrap-options" stonith-enabled="FALSE" no-quorum-policy=ignore start-failure-is-fatal="FALSE" rsc_defaults $id="rsc_defaults-options" migration-threshold="1" failure-timeout="1" primitive d_mysql ocf:local:mysql op monitor interval="30s" params test_user="sure" test_passwd="illtell" test_table="test.table" primitive ip_db ocf:heartbeat:IPaddr2 params ip="172.17.4.202" nic="bond0" op monitor interval="10s" group svc_db d_mysql ip_db commit
  • 30. Heartbeat Resources ● LSB ● Heartbeat resource (+status) ● OCF (Open Cluster FrameWork) (+monitor) ● Clones (don't use in HAv2) ● Multi State Resources
  • 31. LSB Resource Agents ● LSB == Linux Standards Base ● LSB resource agents are standard System V-style init scripts commonly used on Linux and other UNIX-like OSes ● LSB init scripts are stored under /etc/init.d/ ● This enables Linux-HA to immediately support nearly every service that comes with your system, and most packages which come with their own init script ● It's straightforward to change an LSB script to an OCF script
  • 32. OCF ● OCF == Open Cluster Framework ● OCF Resource agents are the most powerful type of resource agent we support ● OCF RAs are extended init scripts • They have additional actions: • monitor – for monitoring resource health • meta-data – for providing information about the RA ● OCF RAs are located in /usr/lib/ocf/resource.d/provider-name/
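To make the "extended init script" idea concrete, here is a minimal sketch of an action dispatcher in the shape of an OCF resource agent. Real agents are usually shell scripts; this is only an illustration. The state file stands in for a real service, and `STATE_FILE`, `ACTIONS` and `main` are hypothetical names. The exit codes are the standard OCF ones (0 success, 3 unimplemented, 7 not running).

```python
import os
import tempfile

OCF_SUCCESS = 0
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7

# Hypothetical state file standing in for a real service.
STATE_FILE = os.path.join(tempfile.gettempdir(), "demo-ocf-agent.state")


def start():
    open(STATE_FILE, "w").close()
    return OCF_SUCCESS


def stop():
    if os.path.exists(STATE_FILE):
        os.remove(STATE_FILE)
    return OCF_SUCCESS  # stopping an already stopped resource is still success


def monitor():
    return OCF_SUCCESS if os.path.exists(STATE_FILE) else OCF_NOT_RUNNING


def meta_data():
    # Placeholder only: a real agent prints a full XML description here.
    print("<resource-agent name='demo'/>")
    return OCF_SUCCESS


ACTIONS = {"start": start, "stop": stop, "monitor": monitor, "meta-data": meta_data}


def main(argv):
    """Dispatch on the action name, as the cluster would call the agent."""
    handler = ACTIONS.get(argv[1] if len(argv) > 1 else "")
    return handler() if handler else OCF_ERR_UNIMPLEMENTED


# A real agent would finish with: sys.exit(main(sys.argv))
print(main(["agent", "monitor"]))
```

The cluster decides what to do from the exit code alone, so `monitor` must distinguish "cleanly stopped" from "running".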
  • 33. Monitoring ● Defined in the OCF Resource script ● Configured in the parameters ● Tomcat : • Checks a configurable health page ● MySQL : • Checks query from a configurable table ● Others : • Basic process state
  • 34. Anatomy of a Cluster config • Cluster properties • Resource Defaults • Primitive Definitions • Resource Groups and Constraints
  • 35. Cluster Properties property $id="cib-bootstrap-options" stonith-enabled="FALSE" no-quorum-policy="ignore" start-failure-is-fatal="FALSE" pe-error-series-max="9" pe-warn-series-max="9" pe-input-series-max="9" ● no-quorum-policy : we ignore the loss of quorum, as this is a 2 node cluster ● pe-* : restrict logging ● start-failure-is-fatal : when set to FALSE, the cluster will instead use the resource's failcount and value for resource-failure-stickiness
  • 36. Resource Defaults rsc_defaults $id="rsc_defaults-options" migration-threshold="1" failure-timeout="1" resource-stickiness="INFINITY" ● failure-timeout sets how long after a failure (in seconds) the resource must wait before it can come back to the node on which it failed; the value shown here is 1 second, so use "60" for a 60 second timeout ● migration-threshold=1 means that after 1 failure the resource will try to start on the other node ● resource-stickiness=INFINITY means that the resource really wants to stay where it is now.
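The interplay of failcount, migration-threshold and stickiness can be shown with a toy model. This is a simplification, not Pacemaker's score-based placement: it ignores failure-timeout expiry, and `FailcountTracker` and `place` are hypothetical names. Node names are borrowed from the later crm_mon slides.

```python
class FailcountTracker:
    """Toy model of per-node failcounts for one resource (not Pacemaker code)."""

    def __init__(self, migration_threshold):
        self.migration_threshold = migration_threshold
        self.failcounts = {}  # node name -> failures seen

    def record_failure(self, node):
        self.failcounts[node] = self.failcounts.get(node, 0) + 1

    def clear(self, node):
        self.failcounts.pop(node, None)

    def allowed(self, node):
        return self.failcounts.get(node, 0) < self.migration_threshold


def place(tracker, current, nodes):
    """Stay on the current node if allowed (stickiness), else move, else None."""
    if current is not None and tracker.allowed(current):
        return current
    for node in nodes:
        if tracker.allowed(node):
            return node
    return None


nodes = ["xms-1", "xms-2"]
tracker = FailcountTracker(migration_threshold=1)
where = place(tracker, None, nodes)   # initial placement
tracker.record_failure(where)         # one failure reaches the threshold
where = place(tracker, where, nodes)  # resource moves to the other node
print(where)
```

Once every node has reached the threshold, `place` returns `None` and the resource stays stopped until a failcount is cleared, which matches the "Resource not running" and "Resource Failcount" slides later in the deck.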
  • 37. Primitive Definitions primitive d_mine ocf:custom:tomcat params instance_name="mine" monitor_urls="health.html" monitor_use_ssl="no" op monitor interval="15s" on-fail="restart" timeout="30s" primitive ip_mine_svc ocf:heartbeat:IPaddr2 params ip="10.8.4.131" cidr_netmask="16" nic="bond0" op monitor interval="10s"
  • 38. Parsing a config ● Isn't always done correctly ● Even a verify won't find all issues ● Unexpected behaviour might occur
  • 39. Where a resource runs • multi state resources • Master – Slave , • e.g mysql master-slave, drbd • Clones • Resources that can run on multiple nodes e.g • Multimaster mysql servers • Mysql slaves • Stateless applications • location • Preferred location to run resource, eg. Based on hostname • colocation • Resources that have to live together • e.g ip address + service • order Define what resource has to start first, or wait for another resource • groups • Colocation + order
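"Groups = colocation + order" can be spelled out as pairwise constraints. The sketch below is only an illustration of that expansion; `expand_group` and the tuple layout are assumptions, not crm shell output.

```python
def expand_group(members):
    """Expand an ordered group into pairwise order and colocation constraints."""
    pairs = list(zip(members, members[1:]))
    return {
        # (first, then): start `first` before `then`
        "order": pairs,
        # (rsc, with_rsc): run `rsc` on the same node as `with_rsc`
        "colocation": [(later, earlier) for earlier, later in pairs],
    }


print(expand_group(["d_mysql", "ip_db"]))
```

For the `svc_db` group used earlier in the deck, this gives one order constraint (d_mysql before ip_db) and one colocation constraint (ip_db with d_mysql).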
  • 40. A Tomcat app on DRBD ● DRBD can only be active on 1 node ● The filesystem needs to be mounted on that active DRBD node
  • 41. Resource Groups and Constraints group svc_mine d_mine ip_mine ms ms_drbd_storage drbd_storage meta master_max="1" master_node_max="1" clone_max="2" clone_node_max="1" notify="true" location cli-prefer-svc_db svc_db rule $id="cli-prefer-rule-svc_db" inf: #uname eq db-a colocation fs_on_drbd inf: svc_mine ms_drbd_storage:Master order fs_after_drbd inf: ms_drbd_storage:promote svc_mine:start
  • 42. Using crm ● Crm configure ● Edit primitive ● Verify ● Commit
  • 43. Crm commands ● crm : Start the cluster resource manager ● crm resource : Change into resource mode ● crm configure : Change into configure mode ● crm configure show : Show the current resource config ● crm resource show : Show the current resource state ● cibadmin -Q : Dump the full Cluster Information Base in XML
  • 44. But We love XML ● cibadmin -Q
  • 45. Checking the Cluster State crm_mon -1 ============ Last updated: Wed Nov 4 16:44:26 2009 Stack: Heartbeat Current DC: xms-1 (c2c581f8-4edc-1de0-a959-91d246ac80f5) - partition with quorum Version: 1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7 2 Nodes configured, unknown expected votes 2 Resources configured. ============ Online: [ xms-1 xms-2 ] Resource Group: svc_mysql d_mysql (ocf::ntc:mysql): Started xms-1 ip_mysql (ocf::heartbeat:IPaddr2): Started xms-1 Resource Group: svc_XMS d_XMS (ocf::ntc:XMS): Started xms-2 ip_XMS (ocf::heartbeat:IPaddr2): Started xms-2 ip_XMS_public (ocf::heartbeat:IPaddr2): Started xms-2
  • 46. Stopping a resource crm resource stop svc_XMS crm_mon -1 ============ Last updated: Wed Nov 4 16:56:05 2009 Stack: Heartbeat Current DC: xms-1 (c2c581f8-4edc-1de0-a959-91d246ac80f5) - partition with quorum Version: 1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7 2 Nodes configured, unknown expected votes 2 Resources configured. ============ Online: [ xms-1 xms-2 ] Resource Group: svc_mysql d_mysql (ocf::ntc:mysql): Started xms-1 ip_mysql (ocf::heartbeat:IPaddr2): Started xms-1
  • 47. Starting a resource crm resource start svc_XMS crm_mon -1 ============ Last updated: Wed Nov 4 17:04:56 2009 Stack: Heartbeat Current DC: xms-1 (c2c581f8-4edc-1de0-a959-91d246ac80f5) - partition with quorum Version: 1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7 2 Nodes configured, unknown expected votes 2 Resources configured. ============ Online: [ xms-1 xms-2 ] Resource Group: svc_mysql d_mysql (ocf::ntc:mysql): Started xms-1 ip_mysql (ocf::heartbeat:IPaddr2): Started xms-1 Resource Group: svc_XMS
  • 48. Moving a resource ● Resource migrate ● Is permanent, even upon failure ● Useful in upgrade scenarios ● Use resource unmigrate to restore
  • 49. Moving a resource [xpoll-root@XMS-1 ~]# crm resource migrate svc_XMS xms-1 [xpoll-root@XMS-1 ~]# crm_mon -1 Last updated: Wed Nov 4 17:32:50 2009 Stack: Heartbeat Current DC: xms-1 (c2c581f8-4edc-1de0-a959-91d246ac80f5) - partition with quorum Version: 1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7 2 Nodes configured, unknown expected votes 2 Resources configured. Online: [ xms-1 xms-2 ] Resource Group: svc_mysql d_mysql (ocf::ntc:mysql): Started xms-1 ip_mysql (ocf::heartbeat:IPaddr2): Started xms-1 Resource Group: svc_XMS d_XMS (ocf::ntc:XMS): Started xms-1 ip_XMS (ocf::heartbeat:IPaddr2): Started xms-1 ip_XMS_public (ocf::heartbeat:IPaddr2): Started xms-1
  • 50. Putting a node in Standby [menos-val3-root@mss-1031a ~]# crm node standby [menos-val3-root@mss-1031a ~]# crm_mon -1 ============ Last updated: Wed Dec 22 14:33:45 2010 Stack: Heartbeat Current DC: mss-1031a (45674b38-5aad-4a7c-bbf1-562b2f244763) - partition with quorum Version: 1.0.7-d3fa20fc76c7947d6de66db7e52526dc6bd7d782 2 Nodes configured, unknown expected votes 1 Resources configured. ============ Node mss-1031b (110dc817-e2ea-4290-b275-4e6d8ca7b031): OFFLINE (standby) Node mss-1031a (45674b38-5aad-4a7c-bbf1-562b2f244763): standby
  • 51. Restoring a node from standby [menos-val3-root@mss-1031b ~]# crm node online [menos-val3-root@mss-1031b ~]# crm_mon -1 ============ Last updated: Thu Dec 23 08:36:21 2010 Stack: Heartbeat Current DC: mss-1031b (110dc817-e2ea-4290-b275-4e6d8ca7b031) - partition with quorum Version: 1.0.7-d3fa20fc76c7947d6de66db7e52526dc6bd7d782 2 Nodes configured, unknown expected votes 1 Resources configured. ============ Online: [ mss-1031b mss-1031a ]
  • 52. Migrate vs Standby ● Think of clusters with more than 2 nodes ● Migrate : send resource to node X • Only use that available one ● Standby : do not send resources to node X • But use the other available ones
  • 53. Debugging ● Check crm_mon -f ● Failcounts ? ● Did the application launch correctly ? ● /var/log/messages • Warning: very verbose ● Tomcat logs
  • 54. Resource not running [menos-val3-root@mrs-a ~]# crm crm(live)# resource crm(live)resource# show Resource Group: svc-MRS d_MRS (ocf::ntc:tomcat) Stopped ip_MRS_svc (ocf::heartbeat:IPaddr2) Stopped ip_MRS_usr (ocf::heartbeat:IPaddr2) Stopped
  • 55. Resource Failcount [menos-val3-root@mrs-a ~]# crm crm(live)# resource crm(live)resource# failcount d_MRS show mrs-a scope=status name=fail-count-d_MRS value=1 crm(live)resource# failcount d_MRS delete mrs-a crm(live)resource# failcount d_MRS show mrs-a scope=status name=fail-count-d_MRS value=0
  • 58. Pacemaker and Puppet ● Plenty of non usable modules around • Hav1 ● https://github.com/rodjek/puppet-pacemaker.git • Strict set of ops / parameters ● Make sure your modules don't enable resources ● I've been using templates to populate ● cibadmin to configure ● Crm is complex, even crm doesn't parse correctly yet ● Plenty of work ahead !
  • 59. Getting Help ● http://clusterlabs.org ● #linux-ha on irc.freenode.org ● http://www.drbd.org/users-guide/
  • 60. Contact : Kris Buytaert, Kris.Buytaert@inuits.be, @krisbuytaert ● Further Reading : http://www.krisbuytaert.be/blog/ http://www.inuits.be/ http://www.virtualization.com/ http://www.oreillygmt.com/ ● Inuits : 't Hemeltje, Gemeentepark 2, 2930 Brasschaat, 891.514.231, +32 473 441 636 ● Esquimaux : Kheops Business Center, Avenue Georges Lemaître 54, 6041 Gosselies, 889.780.406