MySQL HA
with PaceMaker
    Kris Buytaert
   #opendbcamp
Kris Buytaert
●   I used to be a Dev, Then Became an Op,
●   Today I feel like a dev again
●   Senior Linux and Open Source Consultant @inuits.be
●   „Infrastructure Architect“
●   Building Clouds since before the Cloud
●   Surviving the 10th floor test
●   Co-Author of some books
●   Guest Editor at some sites
In this presentation
●   High Availability ?
●   MySQL HA Solutions
●   Linux HA / Pacemaker
What is HA Clustering ?

●   One service goes down
     => others take over its work
●   IP address takeover, service takeover,
●   Not designed for high-performance
●   Not designed for high throughput (load
    balancing)
Lies, Damn Lies, and
Statistics
         Counting nines
            (slide by Alan R)




 Availability          Downtime per year
 99.9999%                        30 sec
 99.999%                          5 min
 99.99%                          52 min
 99.9%                            9 hr
 99%                            3.5 day
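The figures in the table can be reproduced with a few lines of arithmetic. The sketch below assumes a 365-day year; the function names are illustrative and not part of the original slides.

```python
# Assumption: a non-leap year of 365 days.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Return the downtime budget in seconds per year for a given availability."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


def humanize(seconds: float) -> str:
    """Render a duration using the largest unit that fits (sec, min, hr, day)."""
    if seconds < 60:
        return f"{seconds:.1f} sec"
    if seconds < 3600:
        return f"{seconds / 60:.1f} min"
    if seconds < 86400:
        return f"{seconds / 3600:.1f} hr"
    return f"{seconds / 86400:.2f} day"


if __name__ == "__main__":
    for availability in (99.9999, 99.999, 99.99, 99.9, 99.0):
        budget = downtime_seconds_per_year(availability)
        print(f"{availability}%  {humanize(budget)}")
```

Every extra nine divides the downtime budget by ten, which is why each step up the table gets much harder to achieve.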
The Rules of HA

●   Keep it Simple
●   Keep it Simple
●   Prepare for Failure
●   Complexity is the enemy of reliability
●   Test your HA setup
Eliminating the SPOF
●   Find out what Will Fail
    •   Disks
    •   Fans
    •   Power (Supplies)
●   Find out what Can Fail
    •   Network
    •   Going Out Of Memory
Data vs Connection
●   DATA :
    •   Replication
    •   Shared storage
    •   DRBD
●   Connection
    •   LVS
    •   Proxy
    •   Heartbeat / Pacemaker
Shared Storage
●   1 MySQL instance
●   Monitor MySQL node
●   Stonith
●   $$$              1+1 <> 2
●   Storage = SPOF
●   Split Brain :(
DRBD
●   Distributed Replicated Block Device
●   In the Linux Kernel
●   Usually only 1 mount
    •   Multi mount as of 8.X
        •   Requires GFS / OCFS2
●   Regular FS ext3 ...
●   Only 1 MySQL instance Active accessing data
●   Upon Failover MySQL needs to be started on
    other node
DRBD(2)
●   What happens when you pull the plug of a
    Physical machine ?
    •   Minimal Timeout
    •   Why did the crash happen ?
    •   Is my data still correct ?
    •   Innodb Consistency Checks ?
        •   Lengthy ?
        •   Check your BinLog size
Other Solutions Today

●   MySQL Cluster NDBD
●   Multi Master Replication
●   MySQL Proxy
●   MMM
●   Flipper
●   BYO
●   ....
Pulling Traffic
●   Eg. for Cluster, MultiMaster setups
    •   DNS
    •   Advanced Routing
    •   LVS


    •   Or the upcoming slides
Linux-HA PaceMaker
●   Plays well with others
●   Manages more than MySQL
●   ...v3 .. don't even think about the rest anymore
●   http://clusterlabs.org/
Heartbeat v1
•   Max 2 nodes
•   No fine-grained resources
•   Monitoring using “mon”

/etc/ha.d/ha.cf
/etc/ha.d/haresources
mdb-a.menos.asbucenter.dz ntc-restart-mysql mon IPaddr2::10.8.0.13/16/bond0 \
    IPaddr2::10.16.0.13/16/bond0.16 mon



/etc/ha.d/authkeys
Heartbeat v2
 •   Stability issues
 •   Forking ?




“A consulting Opportunity”
                             LMB
Clone Resource
Clones in v2 were buggy
Resources were started on 2 nodes
Stopped again on “1”
Heartbeat v3

•   No more /etc/ha.d/haresources
•   No more xml
•   Better integrated monitoring
•   /etc/ha.d/ha.cf has: crm yes
Pacemaker ?
●   Not a fork
●   Only CRM Code taken out of Heartbeat
●   As of Heartbeat 2.1.3
    •   Support for both OpenAIS / Heartbeat
    •   Different release cycle than Heartbeat
Heartbeat, OpenAIS,
Corosync ?
●   All Messaging Layers
●   Initially only Heartbeat
●   OpenAIS
●   Heartbeat became unmaintained
●   OpenAIS had heisenbugs :(
●   Corosync
●   Heartbeat maintenance taken over by LinBit
●   CRM Detects which layer
Pacemaker
[Diagram: Pacemaker stacked on Heartbeat or OpenAIS, with Cluster Glue]
Pacemaker Architecture
●   stonithd: The Heartbeat fencing subsystem.
●   lrmd: Local Resource Management Daemon. Interacts
    directly with resource agents (scripts).
●   pengine: Policy Engine. Computes the next state of the
    cluster based on the current state and the configuration.
●   cib: Cluster Information Base. Contains definitions of all
    cluster options, nodes, resources, their relationships to
    one another and current status. Synchronizes updates to
    all cluster nodes.
●   crmd: Cluster Resource Management Daemon. Largely
    a message broker for the PEngine and LRM, it also elects
    a leader to co-ordinate the activities of the cluster.
●   openais: messaging and membership layer.
●   heartbeat: messaging layer, an alternative to OpenAIS.
●   ccm: Short for Consensus Cluster Membership. The
    Heartbeat membership layer.
Configuring Heartbeat Correctly

heartbeat::hacf {"clustername":
    hosts   => ["host-a","host-b"],
    hb_nic  => ["bond0"],
    hostip1 => ["10.0.128.11"],
    hostip2 => ["10.0.128.12"],
    ping    => ["10.0.128.4"],
}

heartbeat::authkeys {"ClusterName":
    password => "ClusterName",
}

http://github.com/jtimberman/puppet/tree/master/heartbeat/
CRM
●   Cluster Resource Manager
●   Keeps Nodes in Sync
●   XML Based
●   cibadmin
●   Cli manageable
●   crm

    configure
    property $id="cib-bootstrap-options" \
            stonith-enabled="FALSE" \
            no-quorum-policy=ignore \
            start-failure-is-fatal="FALSE"
    rsc_defaults $id="rsc_defaults-options" \
            migration-threshold="1" \
            failure-timeout="1"
    primitive d_mysql ocf:local:mysql \
            op monitor interval="30s" \
            params test_user="sure" test_passwd="illtell" test_table="test.table"
    primitive ip_db ocf:heartbeat:IPaddr2 \
            params ip="172.17.4.202" nic="bond0" \
            op monitor interval="10s"
    group svc_db d_mysql ip_db
    commit
Heartbeat Resources
●   LSB
●   Heartbeat resource (+status)
●   OCF (Open Cluster Framework) (+monitor)
●   Clones (don't use in HAv2)
●   Multi State Resources
LSB Resource Agents
●   LSB == Linux Standards Base
●   LSB resource agents are standard System V-
    style init scripts commonly used on Linux and
    other UNIX-like OSes
●   LSB init scripts are stored under /etc/init.d/
●   This enables Linux-HA to immediately support
    nearly every service that comes with your
    system, and most packages which come with
    their own init script
●   It's straightforward to change an LSB script to
    an OCF script
OCF
●   OCF == Open Cluster Framework
●   OCF Resource agents are the most powerful type of
    resource agent we support
●   OCF RAs are extended init scripts
    • They have additional actions:
      • monitor – for monitoring resource health
      • meta-data – for providing information about the RA

●   OCF RAs are located in
    /usr/lib/ocf/resource.d/provider-name/
Monitoring
●   Defined in the OCF Resource script
●   Configured in the parameters
●   You have to support multiple states
    •   Not running
    •   Running
    •   Failed
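The three states above map onto the exit codes an OCF monitor action returns. The sketch below uses the standard OCF values (0, 1 and 7); the function name and its two boolean inputs are hypothetical stand-ins for whatever checks a real resource agent performs.

```python
# Standard OCF return codes used by the monitor action.
OCF_SUCCESS = 0       # resource is running and healthy
OCF_ERR_GENERIC = 1   # resource is running but failed its health check
OCF_NOT_RUNNING = 7   # resource is cleanly stopped


def monitor_exit_code(process_running: bool, health_check_ok: bool) -> int:
    """Map the observed state of a resource to an OCF monitor exit code.

    Both inputs are assumptions for illustration: a real agent would derive
    them from a pid file, a test query, an HTTP probe, and so on.
    """
    if not process_running:
        return OCF_NOT_RUNNING
    if not health_check_ok:
        return OCF_ERR_GENERIC
    return OCF_SUCCESS
```

Distinguishing "not running" from "failed" matters: the cluster treats a clean stop differently from a resource that is up but misbehaving.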
Anatomy of a Cluster
config

•   Cluster properties
•   Resource Defaults
•   Primitive Definitions
•   Resource Groups and Constraints
Cluster Properties

property $id="cib-bootstrap-options" \
     stonith-enabled="FALSE" \
     no-quorum-policy="ignore" \
     start-failure-is-fatal="FALSE"

no-quorum-policy="ignore": we ignore the loss of quorum, which is needed on a 2 node cluster

start-failure-is-fatal: when set to FALSE, the cluster will instead use the resource's failcount and value for resource-failure-stickiness
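Why quorum has to be ignored on two nodes follows from the majority rule. The sketch below is a simplified model of that rule only, not Pacemaker's actual membership code; the function name is hypothetical.

```python
def has_quorum(total_nodes: int, live_nodes: int) -> bool:
    """Majority rule: a partition has quorum if it holds more than half of all nodes."""
    return live_nodes > total_nodes / 2


if __name__ == "__main__":
    # In a 2 node cluster the surviving node alone is exactly half, not a majority.
    print("2 nodes, 1 alive:", has_quorum(2, 1))
    # In a 3 node cluster two survivors still form a majority.
    print("3 nodes, 2 alive:", has_quorum(3, 2))
```

With two nodes, losing either one leaves the survivor without a majority, so without no-quorum-policy="ignore" the remaining node would not run resources.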
Resource Defaults

rsc_defaults $id="rsc_defaults-options" \
     migration-threshold="1" \
     failure-timeout="1" \
     resource-stickiness="INFINITY"

failure-timeout is the time that has to pass after a failure before the resource can come back to the
node on which it failed. The value "1" above is read as 1 second; use "60s" for a 60 second timeout.

migration-threshold=1 means that after 1 failure the resource will try to start on the other node

resource-stickiness=INFINITY means that the resource really wants to stay where it is now.
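How migration-threshold and failure-timeout interact can be illustrated with a toy model. This is a deliberately simplified sketch, not Pacemaker's implementation: the class and method names are hypothetical, time is passed in explicitly as seconds, and the failcount is assumed to reset once failure-timeout has passed since the last failure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailureTracker:
    """Per-node failure bookkeeping for one resource (simplified model)."""
    migration_threshold: int
    failure_timeout: float            # seconds
    fail_count: int = 0
    last_failure: Optional[float] = None

    def _expire(self, now: float) -> None:
        # Forget old failures once failure_timeout has passed since the last one.
        if self.last_failure is not None and now - self.last_failure >= self.failure_timeout:
            self.fail_count = 0
            self.last_failure = None

    def record_failure(self, now: float) -> None:
        self._expire(now)
        self.fail_count += 1
        self.last_failure = now

    def may_run_here(self, now: float) -> bool:
        # The resource is pushed away once fail_count reaches the threshold.
        self._expire(now)
        return self.fail_count < self.migration_threshold


if __name__ == "__main__":
    tracker = FailureTracker(migration_threshold=1, failure_timeout=60)
    tracker.record_failure(now=10)
    print("allowed at t=20:", tracker.may_run_here(now=20))
    print("allowed at t=70:", tracker.may_run_here(now=70))
```

In this model a threshold of 1 moves the resource away after a single failure, and the node becomes eligible again only after the timeout has elapsed.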
Primitive Definitions

primitive d_mine ocf:custom:tomcat \
     params instance_name="mine" \
     monitor_urls="health.html" \
     monitor_use_ssl="no" \
     op monitor interval="15s" \
     on-fail="restart"

primitive ip_mine_svc ocf:heartbeat:IPaddr2 \
     params ip="10.8.4.131" cidr_netmask="16" nic="bond0" \
     op monitor interval="10s"
Parsing a config
●   Isn't always done correctly
●   Even a verify won't find all issues
●   Unexpected behaviour might occur
Where a resource runs
•   Multi-state resources
    •  Master – Slave
       •   e.g. mysql master-slave, drbd
•   Clones
    •  Resources that can run on multiple nodes, e.g.
       •   Multimaster mysql servers
       •   Mysql slaves
       •   Stateless applications
•   location
    •  Preferred location to run a resource, e.g. based on hostname
•   colocation
    •  Resources that have to live together
       •   e.g. ip address + service
•   order
    •  Defines which resource has to start first, or wait for another resource
•   groups
    •  Colocation + order
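The last point, that a group is shorthand for colocation plus order, can be made concrete by expanding a group into the pairwise constraints it implies. This is an illustrative sketch only; the function name and the tuple layout are assumptions, not crm shell output.

```python
from typing import List, Tuple


def expand_group(members: List[str]) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    """Expand a group into the implied (order, colocation) constraint pairs.

    order:      (first, then)      -> 'then' starts after 'first'
    colocation: (dependent, with_) -> 'dependent' runs where 'with_' runs
    """
    pairs = list(zip(members, members[1:]))
    order = [(first, then) for first, then in pairs]
    colocation = [(then, first) for first, then in pairs]
    return order, colocation


if __name__ == "__main__":
    order, colocation = expand_group(["d_mysql", "ip_db"])
    print("order:", order)
    print("colocation:", colocation)
```

For a two-member group such as d_mysql followed by ip_db, this yields one ordering pair and one colocation pair, which is what would otherwise have to be written by hand.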
eg. A Service on DRBD
●   DRBD can only be active on 1 node
●   The filesystem needs to be mounted on that
    active DRBD node

group svc_mine d_mine ip_mine

ms ms_drbd_storage drbd_storage \
     meta master_max="1" master_node_max="1" clone_max="2" clone_node_max="1" \
     notify="true"

colocation fs_on_drbd inf: svc_mine ms_drbd_storage:Master

order fs_after_drbd inf: ms_drbd_storage:promote svc_mine:start

location cli-prefer-svc_db svc_db \
     rule $id="cli-prefer-rule-svc_db" inf: #uname eq db-a
A MySQL Resource
●   OCF
    •   Clone
        •   Where do you hook up the IP ?
    •   Multi State
        •   But we have Master Master replication
    •   Meta Resource
        •   Dummy resource that can monitor
            •   Connection
            •   Replication state
Simple 2 node example
primitive d_mysql ocf:ntc:mysql \
     op monitor interval="30s" \
     params test_user="just" test_passwd="kidding" test_table="really"

primitive ip_mysql_svc ocf:heartbeat:IPaddr2 \
     params ip="10.8.0.30" cidr_netmask="24" nic="bond0" \
     op monitor interval="10s"

group svc_mysql d_mysql ip_mysql_svc
Monitor your Setup
●   Not just connectivity
●   Also functional
    •   Query data
    •   Check resultset is correct
●   Check replication
    •   MaatKit
    •   OpenARK
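A functional check goes one step beyond "can I connect": it runs a query and compares the result set with what is expected. The sketch below works against any Python DB-API connection; to stay self-contained it uses an in-memory sqlite3 database as a stand-in for MySQL, and the table name, query and helper names are all hypothetical.

```python
import sqlite3
from typing import Any, List, Tuple


def functional_check(conn: Any, query: str, expected_rows: List[Tuple]) -> bool:
    """Run a query and report whether the full result set matches expectations.

    Any driver error (lost connection, missing table, ...) counts as a failed check.
    """
    try:
        cursor = conn.cursor()
        cursor.execute(query)
        return cursor.fetchall() == expected_rows
    except Exception:
        return False


def make_demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with one known row (stand-in for a MySQL test table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ha_check (id INTEGER PRIMARY KEY, marker TEXT)")
    conn.execute("INSERT INTO ha_check VALUES (1, 'alive')")
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = make_demo_connection()
    ok = functional_check(conn, "SELECT id, marker FROM ha_check", [(1, "alive")])
    print("functional check passed:", ok)
```

A monitor built this way fails when the server accepts connections but returns wrong or no data, which a plain connectivity probe would miss.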
How to deal with replication state ?
●   Multiple slaves
    •   Use DRBD OCF resource
●   2 masters only: use own script
    •   Replication is slow on the active node
        •   Shouldn't happen; talk to HR / config management people
    •   Replication is slow on the passive node
        •   Weight--
    •   Replication breaks on the active node
        •   Send out a warning, don't modify weights, and check the other node
    •   Replication breaks on the passive node
        •   Fence off the passive node
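The two-master rules above amount to a small decision table. The sketch below encodes them as a lookup; the function name, the role and problem labels, and the action strings are hypothetical and only restate the bullets, they are not taken from an existing script.

```python
# (node role, replication problem) -> action, restating the rules above.
ACTIONS = {
    ("active", "slow"): "escalate: should not happen, talk to HR / config management",
    ("passive", "slow"): "decrease the weight of the passive node",
    ("active", "broken"): "send a warning, keep weights, check the other node",
    ("passive", "broken"): "fence off the passive node",
}


def replication_action(node_role: str, problem: str) -> str:
    """Return the action for a replication problem on the given node."""
    try:
        return ACTIONS[(node_role, problem)]
    except KeyError:
        raise ValueError(f"unknown combination: {node_role!r}, {problem!r}") from None


if __name__ == "__main__":
    for role in ("active", "passive"):
        for problem in ("slow", "broken"):
            print(f"{role:8} {problem:7} -> {replication_action(role, problem)}")
```

Keeping the policy in a table makes the asymmetry explicit: problems on the passive node are acted on automatically, while problems on the active node only raise an alarm.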
Adding MySQL to the
stack

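[Diagram: layered stack across Node A and Node B. Hardware: Node A and Node B. Cluster Stack: HeartBeat with Pacemaker on top. Resource MySQL: a "MySQLd" instance on each node, linked by Replication. Service IP MySQL on top.]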
Pitfalls & Solutions
●   Monitor,
    •   Replication state
    •   Replication Lag


●   MaatKit
●   OpenARK
Conclusion
●   Plenty of Alternatives
●   Think about your Data
●   Think about getting Queries to that Data
●   Complexity is the enemy of reliability
●   Keep it Simple
●   Monitor inside the DB
Contact
Kris Buytaert Kris.Buytaert@inuits.be

Further Reading
@KrisBuytaert
http://www.krisbuytaert.be/blog/
http://www.inuits.be/
http://www.virtualization.com/
http://www.oreillygmt.com/




Inuits
't Hemeltje
Gemeentepark 2
2930 Brasschaat
891.514.231
+32 473 441 636

Esquimaux
Kheops Business Center
Avenue Georges Lemaître 54
6041 Gosselies
889.780.406
+32 495 698 668

More Related Content

What's hot

brief introduction of drbd in SLE12SP2
brief introduction of drbd in SLE12SP2brief introduction of drbd in SLE12SP2
brief introduction of drbd in SLE12SP2Nick Wang
 
Corosync and Pacemaker
Corosync and PacemakerCorosync and Pacemaker
Corosync and PacemakerMarian Marinov
 
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsLeveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsJulien Anguenot
 
Apache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentialsApache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentialsJulien Anguenot
 
Cassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsCassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsJulien Anguenot
 
Introduction to XtraDB Cluster
Introduction to XtraDB ClusterIntroduction to XtraDB Cluster
Introduction to XtraDB Clusteryoku0825
 
Cassandra and Solid State Drives
Cassandra and Solid State DrivesCassandra and Solid State Drives
Cassandra and Solid State DrivesRick Branson
 
Cassandra EU 2012 - Storage Internals by Nicolas Favre-Felix
Cassandra EU 2012 - Storage Internals by Nicolas Favre-FelixCassandra EU 2012 - Storage Internals by Nicolas Favre-Felix
Cassandra EU 2012 - Storage Internals by Nicolas Favre-FelixAcunu
 
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWSCassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWSDataStax Academy
 
The Google Chubby lock service for loosely-coupled distributed systems
The Google Chubby lock service for loosely-coupled distributed systemsThe Google Chubby lock service for loosely-coupled distributed systems
The Google Chubby lock service for loosely-coupled distributed systemsRomain Jacotin
 
Journey to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack CloudJourney to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack CloudPatrick McGarry
 
Bulk Loading into Cassandra
Bulk Loading into CassandraBulk Loading into Cassandra
Bulk Loading into CassandraBrian Hess
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcachedJurriaan Persyn
 
Filesystem Comparison: NFS vs GFS2 vs OCFS2
Filesystem Comparison: NFS vs GFS2 vs OCFS2Filesystem Comparison: NFS vs GFS2 vs OCFS2
Filesystem Comparison: NFS vs GFS2 vs OCFS2Giuseppe Paterno'
 
Cassandra at Instagram (August 2013)
Cassandra at Instagram (August 2013)Cassandra at Instagram (August 2013)
Cassandra at Instagram (August 2013)Rick Branson
 

What's hot (20)

brief introduction of drbd in SLE12SP2
brief introduction of drbd in SLE12SP2brief introduction of drbd in SLE12SP2
brief introduction of drbd in SLE12SP2
 
HIGH AVAILABLE CLUSTER IN WEB SERVER WITH HEARTBEAT + DRBD + OCFS2
HIGH AVAILABLE CLUSTER IN WEB SERVER WITH  HEARTBEAT + DRBD + OCFS2HIGH AVAILABLE CLUSTER IN WEB SERVER WITH  HEARTBEAT + DRBD + OCFS2
HIGH AVAILABLE CLUSTER IN WEB SERVER WITH HEARTBEAT + DRBD + OCFS2
 
Corosync and Pacemaker
Corosync and PacemakerCorosync and Pacemaker
Corosync and Pacemaker
 
The Accidental DBA
The Accidental DBAThe Accidental DBA
The Accidental DBA
 
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsLeveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
 
Apache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentialsApache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentials
 
Cassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsCassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentials
 
Shootout at the PAAS Corral
Shootout at the PAAS CorralShootout at the PAAS Corral
Shootout at the PAAS Corral
 
Introduction to XtraDB Cluster
Introduction to XtraDB ClusterIntroduction to XtraDB Cluster
Introduction to XtraDB Cluster
 
Cassandra and Solid State Drives
Cassandra and Solid State DrivesCassandra and Solid State Drives
Cassandra and Solid State Drives
 
Cassandra EU 2012 - Storage Internals by Nicolas Favre-Felix
Cassandra EU 2012 - Storage Internals by Nicolas Favre-FelixCassandra EU 2012 - Storage Internals by Nicolas Favre-Felix
Cassandra EU 2012 - Storage Internals by Nicolas Favre-Felix
 
Strata - 03/31/2012
Strata - 03/31/2012Strata - 03/31/2012
Strata - 03/31/2012
 
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWSCassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
 
The Google Chubby lock service for loosely-coupled distributed systems
The Google Chubby lock service for loosely-coupled distributed systemsThe Google Chubby lock service for loosely-coupled distributed systems
The Google Chubby lock service for loosely-coupled distributed systems
 
Journey to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack CloudJourney to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack Cloud
 
Bulk Loading into Cassandra
Bulk Loading into CassandraBulk Loading into Cassandra
Bulk Loading into Cassandra
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
 
Filesystem Comparison: NFS vs GFS2 vs OCFS2
Filesystem Comparison: NFS vs GFS2 vs OCFS2Filesystem Comparison: NFS vs GFS2 vs OCFS2
Filesystem Comparison: NFS vs GFS2 vs OCFS2
 
Cassandra at Instagram (August 2013)
Cassandra at Instagram (August 2013)Cassandra at Instagram (August 2013)
Cassandra at Instagram (August 2013)
 
Containers > VMs
Containers > VMsContainers > VMs
Containers > VMs
 

Viewers also liked

MySQL HA with PaceMaker
MySQL HA with  PaceMakerMySQL HA with  PaceMaker
MySQL HA with PaceMakerKris Buytaert
 
Best practices for MySQL High Availability
Best practices for MySQL High AvailabilityBest practices for MySQL High Availability
Best practices for MySQL High AvailabilityColin Charles
 
Highly Available MySQL/PHP Applications with mysqlnd
Highly Available MySQL/PHP Applications with mysqlndHighly Available MySQL/PHP Applications with mysqlnd
Highly Available MySQL/PHP Applications with mysqlndJervin Real
 
Code Quality - Security
Code Quality - SecurityCode Quality - Security
Code Quality - Securitysedukull
 
Barbican 1.0 - Open Source Key Management for OpenStack
Barbican 1.0 - Open Source Key Management for OpenStackBarbican 1.0 - Open Source Key Management for OpenStack
Barbican 1.0 - Open Source Key Management for OpenStackjarito030506
 
Open Source KMIP Implementation
Open Source KMIP ImplementationOpen Source KMIP Implementation
Open Source KMIP Implementationsedukull
 
Supriya Shailaja Latest Gallery
 Supriya Shailaja Latest Gallery Supriya Shailaja Latest Gallery
Supriya Shailaja Latest Gallerytelugustop.com
 
High availability and fault tolerance of openstack
High availability and fault tolerance of openstackHigh availability and fault tolerance of openstack
High availability and fault tolerance of openstackDeepak Mane
 
Open stack HA - Theory to Reality
Open stack HA -  Theory to RealityOpen stack HA -  Theory to Reality
Open stack HA - Theory to RealitySriram Subramanian
 
Devops is not about Tooling
Devops is not about ToolingDevops is not about Tooling
Devops is not about ToolingKris Buytaert
 
Deep dive into highly available open stack architecture openstack summit va...
Deep dive into highly available open stack architecture   openstack summit va...Deep dive into highly available open stack architecture   openstack summit va...
Deep dive into highly available open stack architecture openstack summit va...Arthur Berezin
 
Chef cookbooks for OpenStack HA
Chef cookbooks for OpenStack HAChef cookbooks for OpenStack HA
Chef cookbooks for OpenStack HAAdam Spiers
 
Pacemaker Overview
Pacemaker OverviewPacemaker Overview
Pacemaker Overviewstooty s
 

Viewers also liked (14)

MySQL HA with PaceMaker
MySQL HA with  PaceMakerMySQL HA with  PaceMaker
MySQL HA with PaceMaker
 
Best practices for MySQL High Availability
Best practices for MySQL High AvailabilityBest practices for MySQL High Availability
Best practices for MySQL High Availability
 
Pacemaker basics
Pacemaker basicsPacemaker basics
Pacemaker basics
 
Highly Available MySQL/PHP Applications with mysqlnd
Highly Available MySQL/PHP Applications with mysqlndHighly Available MySQL/PHP Applications with mysqlnd
Highly Available MySQL/PHP Applications with mysqlnd
 
Code Quality - Security
Code Quality - SecurityCode Quality - Security
Code Quality - Security
 
Barbican 1.0 - Open Source Key Management for OpenStack
Barbican 1.0 - Open Source Key Management for OpenStackBarbican 1.0 - Open Source Key Management for OpenStack
Barbican 1.0 - Open Source Key Management for OpenStack
 
Open Source KMIP Implementation
Open Source KMIP ImplementationOpen Source KMIP Implementation
Open Source KMIP Implementation
 
Supriya Shailaja Latest Gallery
 Supriya Shailaja Latest Gallery Supriya Shailaja Latest Gallery
Supriya Shailaja Latest Gallery
 
High availability and fault tolerance of openstack
High availability and fault tolerance of openstackHigh availability and fault tolerance of openstack
High availability and fault tolerance of openstack
 
Open stack HA - Theory to Reality
Open stack HA -  Theory to RealityOpen stack HA -  Theory to Reality
Open stack HA - Theory to Reality
 
Devops is not about Tooling
Devops is not about ToolingDevops is not about Tooling
Devops is not about Tooling
 
Deep dive into highly available open stack architecture openstack summit va...
Deep dive into highly available open stack architecture   openstack summit va...Deep dive into highly available open stack architecture   openstack summit va...
Deep dive into highly available open stack architecture openstack summit va...
 
Chef cookbooks for OpenStack HA
Chef cookbooks for OpenStack HAChef cookbooks for OpenStack HA
Chef cookbooks for OpenStack HA
 
Pacemaker Overview
Pacemaker OverviewPacemaker Overview
Pacemaker Overview
 

Similar to MySQL HA with Pacemaker

Buytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemakerBuytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemakerkuchinskaya
 
Building the Enterprise infrastructure with PostgreSQL as the basis for stori...
Building the Enterprise infrastructure with PostgreSQL as the basis for stori...Building the Enterprise infrastructure with PostgreSQL as the basis for stori...
Building the Enterprise infrastructure with PostgreSQL as the basis for stori...PavelKonotopov
 
Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101MongoDB
 
UKOUG 2011: Practical MySQL Tuning
UKOUG 2011: Practical MySQL TuningUKOUG 2011: Practical MySQL Tuning
UKOUG 2011: Practical MySQL TuningFromDual GmbH
 
Replication using PostgreSQL Replicator
Replication using PostgreSQL ReplicatorReplication using PostgreSQL Replicator
Replication using PostgreSQL ReplicatorCommand Prompt., Inc
 
SUSE Expert Days Paris 2018 - SUSE HA Cluster Multi-Device
SUSE Expert Days Paris 2018 - SUSE HA Cluster Multi-DeviceSUSE Expert Days Paris 2018 - SUSE HA Cluster Multi-Device
SUSE Expert Days Paris 2018 - SUSE HA Cluster Multi-DeviceSUSE
 
Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013dotCloud
 
Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013Docker, Inc.
 
Scalable Architecture 101
Scalable Architecture 101Scalable Architecture 101
Scalable Architecture 101ConFoo
 
Backing up Wikipedia Databases
Backing up Wikipedia DatabasesBacking up Wikipedia Databases
Backing up Wikipedia DatabasesJaime Crespo
 
Sdc challenges-2012
Sdc challenges-2012Sdc challenges-2012
Sdc challenges-2012Gluster.org
 
Sdc 2012-challenges
Sdc 2012-challengesSdc 2012-challenges
Sdc 2012-challengesGluster.org
 
Migrating to XtraDB Cluster
Migrating to XtraDB ClusterMigrating to XtraDB Cluster
Migrating to XtraDB Clusterpercona2013
 
Infrastructure Around Hadoop
Infrastructure Around HadoopInfrastructure Around Hadoop
Infrastructure Around HadoopDataWorks Summit
 
2013 london advanced-replication
2013 london advanced-replication2013 london advanced-replication
2013 london advanced-replicationMarc Schwering
 
Preventing and Resolving MySQL Downtime
Preventing and Resolving MySQL DowntimePreventing and Resolving MySQL Downtime
Preventing and Resolving MySQL DowntimeJervin Real
 

Similar to MySQL HA with Pacemaker (20)

Buytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemakerBuytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemaker
 
Scale 10x 01:22:12
Scale 10x 01:22:12Scale 10x 01:22:12
Scale 10x 01:22:12
 
Building the Enterprise infrastructure with PostgreSQL as the basis for stori...
Building the Enterprise infrastructure with PostgreSQL as the basis for stori...Building the Enterprise infrastructure with PostgreSQL as the basis for stori...
Building the Enterprise infrastructure with PostgreSQL as the basis for stori...
 
Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101
 
UKOUG 2011: Practical MySQL Tuning
UKOUG 2011: Practical MySQL TuningUKOUG 2011: Practical MySQL Tuning
UKOUG 2011: Practical MySQL Tuning
 
MySQL HA
MySQL HAMySQL HA
MySQL HA
 
Replication using PostgreSQL Replicator
Replication using PostgreSQL ReplicatorReplication using PostgreSQL Replicator
Replication using PostgreSQL Replicator
 
Go replicator
Go replicatorGo replicator
Go replicator
 
SUSE Expert Days Paris 2018 - SUSE HA Cluster Multi-Device
SUSE Expert Days Paris 2018 - SUSE HA Cluster Multi-DeviceSUSE Expert Days Paris 2018 - SUSE HA Cluster Multi-Device
SUSE Expert Days Paris 2018 - SUSE HA Cluster Multi-Device
 
Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013
 
Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013
 
Introduction to Galera Cluster
Introduction to Galera ClusterIntroduction to Galera Cluster
Introduction to Galera Cluster
 
Scalable Architecture 101
Scalable Architecture 101Scalable Architecture 101
Scalable Architecture 101
 
Backing up Wikipedia Databases
Backing up Wikipedia DatabasesBacking up Wikipedia Databases
Backing up Wikipedia Databases
 
Sdc challenges-2012
Sdc challenges-2012Sdc challenges-2012
Sdc challenges-2012
 
Sdc 2012-challenges
Sdc 2012-challengesSdc 2012-challenges
Sdc 2012-challenges
 
Migrating to XtraDB Cluster
Migrating to XtraDB ClusterMigrating to XtraDB Cluster
Migrating to XtraDB Cluster
 
Infrastructure Around Hadoop
Infrastructure Around HadoopInfrastructure Around Hadoop
Infrastructure Around Hadoop
 
2013 london advanced-replication
2013 london advanced-replication2013 london advanced-replication
2013 london advanced-replication
 
Preventing and Resolving MySQL Downtime
Preventing and Resolving MySQL DowntimePreventing and Resolving MySQL Downtime
Preventing and Resolving MySQL Downtime
 

More from Kris Buytaert

Years of (not) learning , from devops to devoops
Years of (not) learning , from devops to devoopsYears of (not) learning , from devops to devoops
Years of (not) learning , from devops to devoopsKris Buytaert
 
Observability will not fix your Broken Monitoring ,Ignite
Observability will not fix your Broken Monitoring ,IgniteObservability will not fix your Broken Monitoring ,Ignite
Observability will not fix your Broken Monitoring ,IgniteKris Buytaert
 
Infrastructure as Code Patterns
Infrastructure as Code PatternsInfrastructure as Code Patterns
Infrastructure as Code PatternsKris Buytaert
 
From devoops to devops 13 years of (not) learning
From devoops to devops 13 years of (not) learningFrom devoops to devops 13 years of (not) learning
From devoops to devops 13 years of (not) learningKris Buytaert
 
Pipeline all the Dashboards as Code




MySQL HA with Pacemaker

  • 1. MySQL HA with PaceMaker Kris Buytaert #opendbcamp
  • 2. Kris Buytaert ● I used to be a Dev, Then Became an Op, ● Today I feel like a dev again ● Senior Linux and Open Source Consultant @inuits.be ● „Infrastructure Architect“ ● Building Clouds since before the Cloud ● Surviving the 10th floor test ● Co-Author of some books ● Guest Editor at some sites
  • 3. In this presentation ● High Availability ? ● MySQL HA Solutions ● Linux HA / Pacemaker
  • 4. What is HA Clustering ? ● One service goes down => others take over its work ● IP address takeover, service takeover ● Not designed for high-performance ● Not designed for high throughput (load balancing)
  • 5. Lies, Damn Lies, and Statistics ● Counting nines (slide by Alan R), availability vs. downtime per year ● 99.9999% = 30 sec ● 99.999% = 5 min ● 99.99% = 52 min ● 99.9% = 9 hr ● 99% = 3.5 day
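The figures on the "counting nines" slide can be reproduced with a little arithmetic. The sketch below is an illustration only: the function name `downtime_seconds_per_year` and the choice of a 365-day year are assumptions, not part of the original slides.

```python
# Yearly downtime budget for a given availability percentage.
# Assumption: a non-leap year of 365 days.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Return the allowed downtime per year, in seconds."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for pct in (99.9999, 99.999, 99.99, 99.9, 99.0):
        secs = downtime_seconds_per_year(pct)
        print(f"{pct}% -> {secs:.1f} s/year = {secs / 3600:.2f} h/year")
```

The slide rounds these values, so small differences from the computed numbers are expected.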
  • 6. The Rules of HA ● Keep it Simple ● Keep it Simple ● Prepare for Failure ● Complexity is the enemy of reliability ● Test your HA setup
  • 7. Eliminating the SPOF ● Find out what Will Fail • Disks • Fans • Power (Supplies) ● Find out what Can Fail • Network • Going Out Of Memory
  • 8. Data vs Connection ● DATA : • Replication • Shared storage • DRBD ● Connection • LVS • Proxy • Heartbeat / Pacemaker
  • 9. Shared Storage ● 1 MySQL instance ● Monitor MySQL node ● Stonith ● $$$ 1+1 <> 2 ● Storage = SPOF ● Split Brain :(
  • 10. DRBD ● Distributed Replicated Block Device ● In the Linux Kernel ● Usually only 1 mount • Multi mount as of 8.X • Requires GFS / OCFS2 ● Regular FS ext3 ... ● Only 1 MySQL instance Active accessing data ● Upon Failover MySQL needs to be started on other node
  • 11. DRBD(2) ● What happens when you pull the plug of a Physical machine ? • Minimal Timeout • Why did the crash happen ? • Is my data still correct ? • Innodb Consistency Checks ? • Lengthy ? • Check your BinLog size
  • 12. Other Solutions Today ● MySQL Cluster NDBD ● Multi Master Replication ● MySQL Proxy ● MMM ● Flipper ● BYO ● ....
  • 13. Pulling Traffic ● Eg. for Cluster, MultiMaster setups • DNS • Advanced Routing • LVS • Or the upcoming slides
  • 14. Linux-HA PaceMaker ● Plays well with others ● Manages more than MySQL ● ...v3 .. don't even think about the rest anymore ● http://clusterlabs.org/
  • 15. Heartbeat v1 • Max 2 nodes • No finegrained resources • Monitoring using “mon” /etc/ha.d/ha.cf /etc/ha.d/haresources mdb-a.menos.asbucenter.dz ntc-restart-mysql mon IPaddr2::10.8.0.13/16/bond0 IPaddr2::10.16.0.13/16/bond0.16 mon /etc/ha.d/authkeys
  • 16. Heartbeat v2 • Stability issues • Forking ? “A consulting Opportunity” LMB
  • 17. Clone Resource Clones in v2 were buggy Resources were started on 2 nodes Stopped again on “1”
  • 18. Heartbeat v3 • No more /etc/ha.d/haresources • No more xml • Better integrated monitoring • /etc/ha.d/ha.cf has • crm=yes
  • 19. Pacemaker ? ● Not a fork ● Only CRM Code taken out of Heartbeat ● As of Heartbeat 2.1.3 • Support for both OpenAIS / HeartBeat • Different release cycles than Heartbeat
  • 20. Heartbeat, OpenAis, Corosync ? ● All Messaging Layers ● Initially only Heartbeat ● OpenAIS ● Heartbeat got unmaintained ● OpenAIS had heisenbugs :( ● Corosync ● Heartbeat maintenance taken over by LinBit ● CRM Detects which layer
  • 21. Pacemaker Heartbeat or OpenAIS Cluster Glue
  • 22. Pacemaker Architecture ● stonithd: The Heartbeat fencing subsystem. ● lrmd: Local Resource Management Daemon. Interacts directly with resource agents (scripts). ● pengine: Policy Engine. Computes the next state of the cluster based on the current state and the configuration. ● cib: Cluster Information Base. Contains definitions of all cluster options, nodes, resources, their relationships to one another and current status. Synchronizes updates to all cluster nodes. ● crmd: Cluster Resource Management Daemon. Largely a message broker for the PEngine and LRM, it also elects a leader to co-ordinate the activities of the cluster. ● openais: Messaging and membership layer. ● heartbeat: Messaging layer, an alternative to OpenAIS. ● ccm: Short for Consensus Cluster Membership. The Heartbeat membership layer.
  • 23. Configuring Heartbeat Correctly heartbeat::hacf {"clustername": hosts => ["host-a","host-b"], hb_nic => ["bond0"], hostip1 => ["10.0.128.11"], hostip2 => ["10.0.128.12"], ping => ["10.0.128.4"], } heartbeat::authkeys {"ClusterName": password => "ClusterName", } http://github.com/jtimberman/puppet/tree/master/heartbeat/
  • 24. CRM ● Cluster Resource Manager ● Keeps Nodes in Sync ● XML Based ● cibadmin ● Cli manageable ● Crm ● Example: configure property $id="cib-bootstrap-options" stonith-enabled="FALSE" no-quorum-policy=ignore start-failure-is-fatal="FALSE" rsc_defaults $id="rsc_defaults-options" migration-threshold="1" failure-timeout="1" primitive d_mysql ocf:local:mysql op monitor interval="30s" params test_user="sure" test_passwd="illtell" test_table="test.table" primitive ip_db ocf:heartbeat:IPaddr2 params ip="172.17.4.202" nic="bond0" op monitor interval="10s" group svc_db d_mysql ip_db commit
  • 25. Heartbeat Resources ● LSB ● Heartbeat resource (+status) ● OCF (Open Cluster FrameWork) (+monitor) ● Clones (don't use in HAv2) ● Multi State Resources
  • 26. LSB Resource Agents ● LSB == Linux Standards Base ● LSB resource agents are standard System V- style init scripts commonly used on Linux and other UNIX-like OSes ● LSB init scripts are stored under /etc/init.d/ ● This enables Linux-HA to immediately support nearly every service that comes with your system, and most packages which come with their own init script ● It's straightforward to change an LSB script to an OCF script
  • 27. OCF ● OCF == Open Cluster Framework ● OCF Resource agents are the most powerful type of resource agent we support ● OCF RAs are extended init scripts • They have additional actions: • monitor – for monitoring resource health • meta-data – for providing information about the RA ● OCF RAs are located in /usr/lib/ocf/resource.d/provider-name/
  • 28. Monitoring ● Defined in the OCF Resource script ● Configured in the parameters ● You have to support multiple states • Not running • Running • Failed
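The three states a monitor action has to distinguish map onto the standard OCF exit codes (0 for success, 7 for not running, 1 for a generic error). The sketch below shows that mapping only; real resource agents are usually shell scripts, and the function name `monitor_exit_code` and its two boolean inputs are hypothetical simplifications, not taken from the slides.

```python
# Standard OCF resource agent exit codes relevant to the monitor action.
OCF_SUCCESS = 0       # resource is running and healthy
OCF_ERR_GENERIC = 1   # resource is running but failed its health check
OCF_NOT_RUNNING = 7   # resource is cleanly stopped


def monitor_exit_code(process_alive: bool, health_check_ok: bool) -> int:
    """Map the observed state of a service to an OCF monitor exit code.

    Hypothetical helper: how 'process_alive' and 'health_check_ok' are
    determined (pid file, test query, ...) is left to the agent.
    """
    if not process_alive:
        return OCF_NOT_RUNNING
    if not health_check_ok:
        return OCF_ERR_GENERIC
    return OCF_SUCCESS


if __name__ == "__main__":
    print(monitor_exit_code(process_alive=True, health_check_ok=True))
    print(monitor_exit_code(process_alive=True, health_check_ok=False))
    print(monitor_exit_code(process_alive=False, health_check_ok=False))
```

Distinguishing "not running" from "failed" matters because the cluster treats a cleanly stopped resource differently from one that is running but broken.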
  • 29. Anatomy of a Cluster config • Cluster properties • Resource Defaults • Primitive Definitions • Resource Groups and Constraints
  • 30. Cluster Properties property $id="cib-bootstrap-options" stonith-enabled="FALSE" no-quorum-policy="ignore" start-failure-is-fatal="FALSE" ● no-quorum-policy: we ignore the loss of quorum on a 2 node cluster ● start-failure-is-fatal: when set to FALSE, the cluster will instead use the resource's failcount and value for resource-failure-stickiness
  • 31. Resource Defaults rsc_defaults $id="rsc_defaults-options" migration-threshold="1" failure-timeout="1" resource-stickiness="INFINITY" ● failure-timeout means that after a failure there will be a 60 second timeout before the resource can come back to the node on which it failed. ● migration-threshold=1 means that after 1 failure the resource will try to start on the other node. ● resource-stickiness=INFINITY means that the resource really wants to stay where it is now.
  • 32. Primitive Definitions primitive d_mine ocf:custom:tomcat params instance_name="mine" monitor_urls="health.html" monitor_use_ssl="no" op monitor interval="15s" on-fail="restart" primitive ip_mine_svc ocf:heartbeat:IPaddr2 params ip="10.8.4.131" cidr_netmask="16" nic="bond0" op monitor interval="10s"
  • 33. Parsing a config ● Isn't always done correctly ● Even a verify won't find all issues ● Unexpected behaviour might occur
  • 34. Where a resource runs • multi state resources • Master – Slave , • e.g mysql master-slave, drbd • Clones • Resources that can run on multiple nodes e.g • Multimaster mysql servers • Mysql slaves • Stateless applications • location • Preferred location to run resource, eg. Based on hostname • colocation • Resources that have to live together • e.g ip address + service • order Define what resource has to start first, or wait for another resource • groups • Colocation + order
  • 35. eg. A Service on DRBD ● DRBD can only be active on 1 node ● The filesystem needs to be mounted on that active DRBD node group svc_mine d_mine ip_mine ms ms_drbd_storage drbd_storage meta master_max="1" master_node_max="1" clone_max="2" clone_node_max="1" notify="true" colocation fs_on_drbd inf: svc_mine ms_drbd_storage:Master order fs_after_drbd inf: ms_drbd_storage:promote svc_mine:start location cli-prefer-svc_db svc_db rule $id="cli-prefer-rule-svc_db" inf: #uname eq db-a
  • 36. A MySQL Resource ● OCF • Clone • Where do you hook up the IP ? • Multi State • But we have Master Master replication • Meta Resource • Dummy resource that can monitor • Connection • Replication state
  • 37. Simple 2 node example primitive d_mysql ocf:ntc:mysql op monitor interval="30s" params test_user="just" test_passwd="kidding" test_table="really" primitive ip_mysql_svc ocf:heartbeat:IPaddr2 params ip="10.8.0.30" cidr_netmask="255.255.255.0" nic="bond0" op monitor interval="10s" group svc_mysql d_mysql ip_mysql_svc
  • 38. Monitor your Setup ● Not just connectivity ● Also functional • Query data • Check resultset is correct ● Check replication • MaatKit • OpenARK
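A functional check, as opposed to a plain connectivity check, runs a known query and compares the result with what is expected. The sketch below illustrates the idea against any Python DB-API connection. It uses an in-memory SQLite database purely as a stand-in so that it runs without a MySQL server; the table name `ha_check` and the function `functional_check` are hypothetical.

```python
import sqlite3


def functional_check(conn, query, expected_rows):
    """Return True only if the query succeeds and yields the expected rows.

    'conn' is any DB-API 2.0 connection; a MySQL driver connection
    could be passed instead of the SQLite stand-in used here.
    """
    try:
        cur = conn.cursor()
        cur.execute(query)
        return cur.fetchall() == expected_rows
    except Exception:
        # Any error (missing table, lost connection, ...) counts as failure.
        return False


# Stand-in database with a known test row (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ha_check (id INTEGER PRIMARY KEY, marker TEXT)")
conn.execute("INSERT INTO ha_check VALUES (1, 'ok')")
conn.commit()

if __name__ == "__main__":
    print(functional_check(conn, "SELECT id, marker FROM ha_check", [(1, "ok")]))
```

The `test_user`, `test_passwd` and `test_table` parameters in the primitives shown on the slides suggest the custom resource agent does something along these lines, though its actual implementation is not shown.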
  • 39. How to deal with replication state ? ● Multiple slaves • Use Drbd ocf resource ● 2 masters only: use own script • Replication is slow on the active node • Shouldn't happen; talk to HR / cfgmgmt people • Replication is slow on the passive node • Weight-- • Replication breaks on the active node • Send out warning, don't modify weights and check other node • Replication breaks on the passive node • Fence off the passive node
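The four replication cases for a two-master setup form a small decision table. The sketch below encodes it as a lookup; the enum names and action strings are illustrative labels chosen here, not identifiers from the author's script.

```python
from enum import Enum


class Role(Enum):
    ACTIVE = "active"
    PASSIVE = "passive"


class Problem(Enum):
    SLOW = "replication is slow"
    BROKEN = "replication is broken"


# Decision table from the slide; action names are hypothetical labels.
ACTIONS = {
    (Role.ACTIVE, Problem.SLOW): "escalate_to_humans",
    (Role.PASSIVE, Problem.SLOW): "decrease_weight",
    (Role.ACTIVE, Problem.BROKEN): "warn_and_check_other_node",
    (Role.PASSIVE, Problem.BROKEN): "fence_passive_node",
}


def action_for(role: Role, problem: Problem) -> str:
    """Look up what the monitoring script should do for a given state."""
    return ACTIONS[(role, problem)]


if __name__ == "__main__":
    for (role, problem), action in ACTIONS.items():
        print(f"{problem.value} on the {role.value} node -> {action}")
```

Keeping the policy in a table like this makes it easy to review that the active node is never fenced or re-weighted automatically, which matches the slide's intent.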
  • 40. Adding MySQL to the stack Replication Service IP MySQL “MySQLd” “MySQLd” Resource MySQL Cluster Stack Pacemaker HeartBeat Node A Node B Hardware
  • 41. Pitfalls & Solutions ● Monitor, • Replication state • Replication Lag ● MaatKit ● OpenARK
  • 42. Conclusion ● Plenty of Alternatives ● Think about your Data ● Think about getting Queries to that Data ● Complexity is the enemy of reliability ● Keep it Simple ● Monitor inside the DB
  • 43. Contact ● Kris Buytaert ● Kris.Buytaert@inuits.be ● Further Reading ● @KrisBuytaert ● http://www.krisbuytaert.be/blog/ ● http://www.inuits.be/ ● http://www.virtualization.com/ ● http://www.oreillygmt.com/ ● Inuits: 't Hemeltje, Gemeentepark 2, 2930 Brasschaat, 891.514.231, +32 473 441 636 ● Esquimaux: Kheops Business Center, Avenue Georges Lemaître 54, 6041 Gosselies, 889.780.406, +32 495 698 668