MySQL HA with Pacemaker

Kris Buytaert
#opendbcamp

Kris Buytaert
● I used to be a dev, then became an op
● Today I feel like a dev again
● Senior Linux and Open Source Consultant @inuits.be
● “Infrastructure Architect”
● Building Clouds since before the Cloud
● Surviving the 10th floor test
● Co-Author of some books
● Guest Editor at some sites

In this presentation
● High Availability ?
● MySQL HA Solutions
● Linux HA / Pacemaker

What is HA Clustering ?

● One service goes down
=> others take over its work
● IP address takeover, service takeover
● Not designed for high performance
● Not designed for high throughput (load balancing)

Lies, Damn Lies, and
Statistics
Counting nines
(slide by Alan R)

Availability   Downtime per year (approx.)
99.9999%       30 sec
99.999%        5 min
99.99%         52 min
99.9%          9 hr
99%            3.65 days
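
The downtime budgets behind "counting nines" follow directly from the availability percentage. A minimal sketch of that arithmetic, assuming a 365-day year (the function names are illustrative, not from the slides):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Return the allowed downtime in seconds per year for an availability in percent."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


def human(seconds: float) -> str:
    """Format a duration using the largest unit that fits."""
    for unit, size in (("days", 86400), ("hours", 3600), ("minutes", 60)):
        if seconds >= size:
            return f"{seconds / size:.1f} {unit}"
    return f"{seconds:.1f} seconds"


if __name__ == "__main__":
    for nines in (99.9999, 99.999, 99.99, 99.9, 99.0):
        print(f"{nines}% -> {human(downtime_seconds_per_year(nines))}")
```

Each extra nine divides the yearly downtime budget by ten, which is why the last nines are so expensive to reach.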

The Rules of HA

● Keep it Simple
● Keep it Simple
● Prepare for Failure
● Complexity is the enemy of reliability
● Test your HA setup

Eliminating the SPOF
● Find out what Will Fail
• Disks
• Fans
• Power (Supplies)
● Find out what Can Fail
• Network
• Going Out Of Memory

Data vs Connection
● DATA :
• Replication
• Shared storage
• DRBD
● Connection
• LVS
• Proxy
• Heartbeat / Pacemaker

Shared Storage
● 1 MySQL instance
● Monitor MySQL node
● Stonith
● $$$ 1+1 <> 2
● Storage = SPOF
● Split Brain :(

DRBD
● Distributed Replicated Block Device
● In the Linux Kernel
● Usually only 1 mount
• Multi mount as of 8.X
• Requires GFS / OCFS2
● Regular FS ext3 ...
● Only 1 MySQL instance Active accessing data
● Upon failover, MySQL needs to be started on the other node

DRBD(2)
● What happens when you pull the plug on a physical machine ?
• Minimal timeout
• Why did the crash happen ?
• Is my data still correct ?
• InnoDB consistency checks ?
• Lengthy ?
• Check your binlog size

Other Solutions Today

● MySQL Cluster NDBD
● Multi Master Replication
● MySQL Proxy
● MMM
● Flipper
● BYO
● ....

Pulling Traffic
● Eg. for Cluster, MultiMaster setups
• DNS
• Advanced Routing
• LVS

• Or the upcoming slides

Linux-HA Pacemaker
● Plays well with others
● Manages more than MySQL
● ...v3 .. don't even think about the rest anymore
● http://clusterlabs.org/

Heartbeat v1
• Max 2 nodes
• No fine-grained resources
• Monitoring using “mon”

/etc/ha.d/ha.cf
/etc/ha.d/haresources
mdb-a.menos.asbucenter.dz ntc-restart-mysql mon IPaddr2::10.8.0.13/16/bond0 IPaddr2::10.16.0.13/16/bond0.16 mon

/etc/ha.d/authkeys

Heartbeat v2
• Stability issues
• Forking ?

“A consulting Opportunity”
LMB

Clone Resource
● Clones in v2 were buggy
● Resources were started on 2 nodes
● Stopped again on “1”

Heartbeat v3

• No more /etc/ha.d/haresources
• No more XML
• Better integrated monitoring
• /etc/ha.d/ha.cf has crm yes

Pacemaker ?
● Not a fork
● Only CRM code taken out of Heartbeat
● As of Heartbeat 2.1.3
• Support for both OpenAIS / Heartbeat
• Different release cycle than Heartbeat

Heartbeat, OpenAIS, Corosync ?
● All messaging layers
● Initially only Heartbeat
● OpenAIS
● Heartbeat became unmaintained
● OpenAIS had heisenbugs :(
● Corosync
● Heartbeat maintenance taken over by LINBIT
● CRM detects which layer is in use

Pacemaker Architecture
[Figure: stack diagram showing Pacemaker, Heartbeat or OpenAIS, and Cluster Glue]
● stonithd: The Heartbeat fencing subsystem.
● lrmd: Local Resource Management Daemon. Interacts directly with resource agents (scripts).
● pengine: Policy Engine. Computes the next state of the cluster based on the current state and the configuration.
● cib: Cluster Information Base. Contains definitions of all cluster options, nodes, resources, their relationships to one another and current status. Synchronizes updates to all cluster nodes.
● crmd: Cluster Resource Management Daemon. Largely a message broker for the PEngine and LRM, it also elects a leader to co-ordinate the activities of the cluster.
● openais: messaging and membership layer.
● heartbeat: messaging layer, an alternative to OpenAIS.
● ccm: Short for Consensus Cluster Membership. The Heartbeat membership layer.

Configuring Heartbeat Correctly

heartbeat::hacf {"clustername":
  hosts   => ["host-a","host-b"],
  hb_nic  => ["bond0"],
  hostip1 => ["10.0.128.11"],
  hostip2 => ["10.0.128.12"],
  ping    => ["10.0.128.4"],
}

heartbeat::authkeys {"ClusterName":
  password => "ClusterName",
}

http://github.com/jtimberman/puppet/tree/master/heartbeat/

CRM
● Cluster Resource Manager
● Keeps nodes in sync
● XML based
● cibadmin
● CLI manageable
● crm

configure
property $id="cib-bootstrap-options" \
        stonith-enabled="FALSE" \
        no-quorum-policy=ignore \
        start-failure-is-fatal="FALSE"
rsc_defaults $id="rsc_defaults-options" \
        migration-threshold="1" \
        failure-timeout="1"
primitive d_mysql ocf:local:mysql \
        op monitor interval="30s" \
        params test_user="sure" test_passwd="illtell" test_table="test.table"
primitive ip_db ocf:heartbeat:IPaddr2 \
        params ip="172.17.4.202" nic="bond0" \
        op monitor interval="10s"
group svc_db d_mysql ip_db
commit

Heartbeat Resources
● LSB
● Heartbeat resource (+status)
● OCF (Open Cluster Framework) (+monitor)
● Clones (don't use in HAv2)
● Multi State Resources

LSB Resource Agents
● LSB == Linux Standards Base
● LSB resource agents are standard System V-
style init scripts commonly used on Linux and
other UNIX-like OSes
● LSB init scripts are stored under /etc/init.d/
● This enables Linux-HA to immediately support
nearly every service that comes with your
system, and most packages which come with
their own init script
● It's straightforward to change an LSB script to
an OCF script

OCF
● OCF == Open Cluster Framework
● OCF Resource agents are the most powerful type of
resource agent we support
● OCF RAs are extended init scripts
• They have additional actions:
• monitor – for monitoring resource health
• meta-data – for providing information about the RA

● OCF RAs are located in
/usr/lib/ocf/resource.d/provider-name/

Monitoring
● Defined in the OCF Resource script
● Configured in the parameters
● You have to support multiple states
• Not running
• Running
• Failed
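
The three states map onto the exit codes an OCF resource agent returns from its monitor action: 0 (OCF_SUCCESS) for running, 7 (OCF_NOT_RUNNING) for cleanly stopped, and 1 (OCF_ERR_GENERIC) for failed. Real agents are usually shell scripts; the sketch below only illustrates the decision in Python, and the two boolean inputs are hypothetical stand-ins for whatever checks an agent performs.

```python
import enum


class OcfRc(enum.IntEnum):
    """Subset of the OCF return codes relevant to a monitor action."""
    SUCCESS = 0       # resource is running and healthy
    ERR_GENERIC = 1   # resource is present but failed
    NOT_RUNNING = 7   # resource is cleanly stopped


def monitor(process_alive: bool, health_check_ok: bool) -> OcfRc:
    """Map two hypothetical probe results onto a monitor return code."""
    if not process_alive:
        return OcfRc.NOT_RUNNING
    if not health_check_ok:
        return OcfRc.ERR_GENERIC
    return OcfRc.SUCCESS
```

Distinguishing "not running" from "failed" matters because the cluster treats a clean stop and a failure differently when it decides whether to recover the resource.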

Anatomy of a Cluster config

• Cluster properties
• Resource Defaults
• Primitive Definitions
• Resource Groups and Constraints

Cluster Properties

property $id="cib-bootstrap-options" \
        stonith-enabled="FALSE" \
        no-quorum-policy="ignore" \
        start-failure-is-fatal="FALSE"

no-quorum-policy="ignore": we ignore the loss of quorum on a 2 node cluster.

start-failure-is-fatal: when set to FALSE, the cluster will instead use the resource's failcount and value for resource-failure-stickiness.

Resource Defaults

rsc_defaults $id="rsc_defaults-options" \
        migration-threshold="1" \
        failure-timeout="1" \
        resource-stickiness="INFINITY"

failure-timeout means that after a failure there is a timeout (60 seconds with failure-timeout="60") before the resource can come back to the node on which it failed.

migration-threshold=1 means that after 1 failure the resource will try to start on the other node.

resource-stickiness=INFINITY means that the resource really wants to stay where it is now.

Primitive Definitions

primitive d_mine ocf:custom:tomcat \
        params instance_name="mine" \
        monitor_urls="health.html" \
        monitor_use_ssl="no" \
        on-fail="restart"

primitive ip_mine_svc ocf:heartbeat:IPaddr2 \
        params ip="10.8.4.131" cidr_netmask="16" nic="bond0"

Parsing a config
● Isn't always done correctly
● Even a verify won't find all issues
● Unexpected behaviour might occur

Where a resource runs
• multi state resources
• Master – Slave
• e.g. mysql master-slave, drbd
• Clones
• Resources that can run on multiple nodes, e.g.
• Multimaster mysql servers
• Mysql slaves
• Stateless applications
• location
• Preferred location to run resource, e.g. based on hostname
• colocation
• Resources that have to live together
• e.g. ip address + service
• order
• Define what resource has to start first, or wait for another resource
• groups
• Colocation + order

eg. A Service on DRBD
● DRBD can only be active on 1 node
● The filesystem needs to be mounted on that
active DRBD node

group svc_mine d_mine ip_mine

ms ms_drbd_storage drbd_storage \
        meta master_max="1" master_node_max="1" clone_max="2" clone_node_max="1" notify="true"

colocation fs_on_drbd inf: svc_mine ms_drbd_storage:Master

order fs_after_drbd inf: ms_drbd_storage:promote svc_mine:start

location cli-prefer-svc_db svc_db \
        rule $id="cli-prefer-rule-svc_db" inf: #uname eq db-a

A MySQL Resource
● OCF
• Clone
• Where do you hook up the IP ?
• Multi State
• But we have Master Master replication
• Meta Resource
• Dummy resource that can monitor
• Connection
• Replication state

Simple 2 node example
primitive d_mysql ocf:ntc:mysql \
        params test_user="just" test_passwd="kidding" test_table="really"

primitive ip_mysql_svc ocf:heartbeat:IPaddr2 \
        params ip="10.8.0.30" cidr_netmask="24" nic="bond0"

group svc_mysql d_mysql ip_mysql_svc

Monitor your Setup
● Not just connectivity
● Also functional
• Query data
• Check resultset is correct
● Check replication
• MaatKit
• OpenARK
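
A functional check goes one step beyond "can I connect": it runs a known query and compares the result set with what is expected. A minimal sketch, assuming any Python DB-API connection; `functional_check`, `demo_connection` and the `heartbeat_check` table are hypothetical names, and sqlite3 is used only as a stand-in so the example runs without a MySQL server.

```python
import sqlite3


def functional_check(conn, query, expected_rows):
    """Run a query on a DB-API connection and compare the full result set."""
    try:
        cur = conn.cursor()
        cur.execute(query)
        rows = cur.fetchall()
    except Exception:
        # Any driver error (connection lost, missing table, ...) counts as a failure.
        return False
    # Normalise rows to tuples, since drivers differ in the row types they return.
    return [tuple(r) for r in rows] == [tuple(r) for r in expected_rows]


def demo_connection():
    """Build an in-memory sqlite3 database holding one known row (stand-in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE heartbeat_check (id INTEGER, note TEXT)")
    conn.execute("INSERT INTO heartbeat_check VALUES (1, 'ok')")
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(functional_check(conn, "SELECT id, note FROM heartbeat_check", [(1, "ok")]))
```

With a real MySQL driver the same function could be pointed at a small test table, in the spirit of the test_table parameter used in the primitive definitions.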

How to deal with replication state ?
● Multiple slaves
• Use Drbd ocf resource
● 2 masters only: use own script
• Replication is slow on the active node
=> shouldn't happen, talk to HR / cfgmgmt people
• Replication is slow on the passive node
=> Weight--
• Replication breaks on the active node
=> send out warning, don't modify weights and check other node
• Replication breaks on the passive node
=> fence off the passive node
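
The four cases for a two-master setup amount to a small decision table. A sketch of how a custom script could encode it; the function name and the action labels are made up for illustration, and what each action does in practice (adjusting scores, alerting, fencing) is left out.

```python
def replication_action(node_role: str, problem: str) -> str:
    """Return the action for a replication problem on the given node.

    node_role is "active" or "passive"; problem is "slow" or "broken".
    """
    actions = {
        ("active", "slow"): "escalate",              # shouldn't happen: talk to people
        ("passive", "slow"): "lower_weight",         # make the passive node less attractive
        ("active", "broken"): "warn_and_check_peer", # warn, keep weights, check other node
        ("passive", "broken"): "fence_passive",      # fence off the passive node
    }
    try:
        return actions[(node_role, problem)]
    except KeyError:
        raise ValueError(f"unknown case: {node_role!r}, {problem!r}") from None
```

Keeping the table explicit makes it easy to review, which fits the "keep it simple" rule.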

Adding MySQL to the stack
[Figure: layered stack diagram. Hardware: Node A, Node B. Cluster Stack: Heartbeat, Pacemaker. Resource: “MySQLd” on each node. MySQL: Service IP, Replication.]

Pitfalls & Solutions
● Monitor
• Replication state
• Replication lag
● Maatkit
● OpenARK
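
Replication state and lag can both be read from the output of `SHOW SLAVE STATUS`, using the `Slave_IO_Running`, `Slave_SQL_Running` and `Seconds_Behind_Master` columns. A minimal sketch that classifies one status row passed in as a dict; the function name and the 30 second threshold are assumptions, and fetching the row from MySQL is left out.

```python
def classify_replication(status: dict, max_lag_seconds: int = 30) -> str:
    """Classify one SHOW SLAVE STATUS row as "ok", "slow" or "broken"."""
    io_ok = status.get("Slave_IO_Running") == "Yes"
    sql_ok = status.get("Slave_SQL_Running") == "Yes"
    lag = status.get("Seconds_Behind_Master")
    # A stopped thread or a NULL lag value means replication is not working.
    if not (io_ok and sql_ok) or lag is None:
        return "broken"
    if int(lag) > max_lag_seconds:
        return "slow"
    return "ok"
```

The threshold for "slow" depends on the application; 30 seconds is only a placeholder.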

Conclusion
● Plenty of Alternatives
● Think about your Data
● Think about getting Queries to that Data
● Complexity is the enemy of reliability
● Keep it Simple
● Monitor inside the DB

Contact
Kris Buytaert Kris.Buytaert@inuits.be

Further Reading
@KrisBuytaert
http://www.krisbuytaert.be/blog/
http://www.inuits.be/
http://www.virtualization.com/
http://www.oreillygmt.com/

Inuits
't Hemeltje
Gemeentepark 2
2930 Brasschaat
891.514.231
+32 473 441 636

Esquimaux
Kheops Business Center
Avenue Georges Lemaître 54
6041 Gosselies
889.780.406
+32 495 698 668
