MySQL HA with Pacemaker
My opendbcamp 2011 presentation on Pacemaker and MySQL opportunities


Published in Technology
  • 1. MySQL HA with Pacemaker. Kris Buytaert, #opendbcamp
  • 2. Kris Buytaert
      ● I used to be a Dev, then became an Op
      ● Today I feel like a dev again
      ● Senior Linux and Open Source Consultant
      ● "Infrastructure Architect"
      ● Building Clouds since before the Cloud
      ● Surviving the 10th floor test
      ● Co-Author of some books
      ● Guest Editor at some sites
  • 3. In this presentation
      ● High Availability?
      ● MySQL HA Solutions
      ● Linux HA / Pacemaker
  • 4. What is HA Clustering?
      ● One service goes down => others take over its work
      ● IP address takeover, service takeover
      ● Not designed for high performance
      ● Not designed for high throughput (load balancing)
  • 5. Lies, Damn Lies, and Statistics: Counting nines (slide by Alan R)
      Availability   Downtime per year
      99.9999%       30 sec
      99.999%        5 min
      99.99%         52 min
      99.9%          9 hr
      99%            3.5 day
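The "counting nines" figures on slide 5 follow from simple arithmetic: allowed downtime is the unavailable fraction of a year. The following is a minimal sketch, assuming a 365-day year; the function names are illustrative and not part of the original deck.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: non-leap year


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Seconds of downtime per year permitted at a given availability."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


def humanize(seconds: float) -> str:
    """Render a duration in the largest unit that fits (day, hr, min, sec)."""
    for unit, size in (("day", 86400), ("hr", 3600), ("min", 60)):
        if seconds >= size:
            return f"{seconds / size:.1f} {unit}"
    return f"{seconds:.1f} sec"


if __name__ == "__main__":
    for pct in (99.9999, 99.999, 99.99, 99.9, 99.0):
        print(f"{pct}%  {humanize(downtime_seconds_per_year(pct))}")
```

The slide rounds these values, so the computed numbers differ slightly from the table (for example, three nines works out to a little under nine hours).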
  • 6. The Rules of HA
      ● Keep it Simple
      ● Keep it Simple
      ● Prepare for Failure
      ● Complexity is the enemy of reliability
      ● Test your HA setup
  • 7. Eliminating the SPOF
      ● Find out what will fail
          • Disks
          • Fans
          • Power (supplies)
      ● Find out what can fail
          • Network
          • Going out of memory
  • 8. Data vs Connection
      ● Data
          • Replication
          • Shared storage
          • DRBD
      ● Connection
          • LVS
          • Proxy
          • Heartbeat / Pacemaker
  • 9. Shared Storage
      ● 1 MySQL instance
      ● Monitor MySQL node
      ● Stonith
      ● $$$ 1+1 <> 2
      ● Storage = SPOF
      ● Split Brain :(
  • 10. DRBD
      ● Distributed Replicated Block Device
      ● In the Linux kernel
      ● Usually only 1 mount
          • Multi mount as of 8.X
          • Requires GFS / OCFS2
      ● Regular FS: ext3 ...
      ● Only 1 MySQL instance active, accessing the data
      ● Upon failover, MySQL needs to be started on the other node
  • 11. DRBD (2)
      ● What happens when you pull the plug of a physical machine?
          • Minimal timeout
          • Why did the crash happen?
          • Is my data still correct?
          • InnoDB consistency checks?
              • Lengthy?
              • Check your binlog size
  • 12. Other Solutions Today
      ● MySQL Cluster NDBD
      ● Multi Master Replication
      ● MySQL Proxy
      ● MMM
      ● Flipper
      ● BYO
      ● ....
  • 13. Pulling Traffic
      ● E.g. for Cluster, MultiMaster setups
          • DNS
          • Advanced Routing
          • LVS
          • Or the upcoming slides
  • 14. Linux-HA Pacemaker
      ● Plays well with others
      ● Manages more than MySQL
      ● ... v3 .. don't even think about the rest anymore
  • 15. Heartbeat v1
      • Max 2 nodes
      • No fine-grained resources
      • Monitoring using “mon”
      /etc/ha.d/ ntc-restart-mysql mon IPaddr2:: IPaddr2:: mon
      /etc/ha.d/authkeys
  • 16. Heartbeat v2
      • Stability issues
      • Forking?
      “A consulting Opportunity” (LMB)
  • 17. Clone Resource
      Clones in v2 were buggy
      Resources were started on 2 nodes
      Stopped again on “1”
  • 18. Heartbeat v3
      • No more /etc/ha.d/haresources
      • No more XML
      • Better integrated monitoring
      • /etc/ha.d/ has
          • crm=yes
  • 19. Pacemaker?
      ● Not a fork
      ● Only the CRM code taken out of Heartbeat
      ● As of Heartbeat 2.1.3
          • Support for both OpenAIS and Heartbeat
          • Different release cycle than Heartbeat
  • 20. Heartbeat, OpenAIS, Corosync?
      ● All messaging layers
      ● Initially only Heartbeat
      ● OpenAIS
      ● Heartbeat got unmaintained
      ● OpenAIS had heisenbugs :(
      ● Corosync
      ● Heartbeat maintenance taken over by LinBit
      ● CRM detects which layer
  • 21. [Stack diagram: Pacemaker, Heartbeat or OpenAIS, Cluster Glue]
  • 22. Pacemaker Architecture
      ● stonithd: The Heartbeat fencing subsystem.
      ● lrmd: Local Resource Management Daemon. Interacts directly with resource agents (scripts).
      ● pengine: Policy Engine. Computes the next state of the cluster based on the current state and the configuration.
      ● cib: Cluster Information Base. Contains definitions of all cluster options, nodes, resources, their relationships to one another and current status. Synchronizes updates to all cluster nodes.
      ● crmd: Cluster Resource Management Daemon. Largely a message broker for the PEngine and LRM, it also elects a leader to co-ordinate the activities of the cluster.
      ● openais: Messaging and membership layer.
      ● heartbeat: Messaging layer, an alternative to OpenAIS.
      ● ccm: Short for Consensus Cluster Membership. The Heartbeat membership layer.
  • 23. Configuring Heartbeat Correctly
      heartbeat::hacf {"clustername":
          hosts   => ["host-a","host-b"],
          hb_nic  => ["bond0"],
          hostip1 => [""],
          hostip2 => [""],
          ping    => [""],
      }
      heartbeat::authkeys {"ClusterName":
          password => "ClusterName",
      }
  • 24. CRM
      ● Cluster Resource Manager
      ● Keeps nodes in sync
      ● XML based
      ● cibadmin
      ● CLI manageable
      ● crm
      configure
      property $id="cib-bootstrap-options" \
          stonith-enabled="FALSE" \
          no-quorum-policy=ignore \
          start-failure-is-fatal="FALSE"
      rsc_defaults $id="rsc_defaults-options" \
          migration-threshold="1" \
          failure-timeout="1"
      primitive d_mysql ocf:local:mysql \
          op monitor interval="30s" \
          params test_user="sure" test_passwd="illtell" test_table="test.table"
      primitive ip_db ocf:heartbeat:IPaddr2 \
          params ip="" nic="bond0" \
          op monitor interval="10s"
      group svc_db d_mysql ip_db
      commit
  • 25. Heartbeat Resources
      ● LSB
      ● Heartbeat resource (+status)
      ● OCF (Open Cluster Framework) (+monitor)
      ● Clones (don't use in HAv2)
      ● Multi State Resources
  • 26. LSB Resource Agents
      ● LSB == Linux Standards Base
      ● LSB resource agents are standard System V-style init scripts commonly used on Linux and other UNIX-like OSes
      ● LSB init scripts are stored under /etc/init.d/
      ● This enables Linux-HA to immediately support nearly every service that comes with your system, and most packages which come with their own init script
      ● It's straightforward to change an LSB script to an OCF script
  • 27. OCF
      ● OCF == Open Cluster Framework
      ● OCF resource agents are the most powerful type of resource agent we support
      ● OCF RAs are extended init scripts
          • They have additional actions:
              • monitor – for monitoring resource health
              • meta-data – for providing information about the RA
      ● OCF RAs are located in /usr/lib/ocf/resource.d/provider-name/
  • 28. Monitoring
      ● Defined in the OCF resource script
      ● Configured in the parameters
      ● You have to support multiple states
          • Not running
          • Running
          • Failed
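The three states on slide 28 correspond to exit codes that an OCF resource agent's monitor action returns: 0 (OCF_SUCCESS) for running, 7 (OCF_NOT_RUNNING) for cleanly stopped, and 1 (OCF_ERR_GENERIC) for failed. A minimal sketch of that mapping follows. Real agents are usually shell scripts, and the two boolean inputs here are hypothetical stand-ins for whatever probes an agent performs.

```python
# Exit codes defined by the OCF resource agent API.
OCF_SUCCESS = 0       # resource is running and healthy
OCF_ERR_GENERIC = 1   # resource is running but failed its health check
OCF_NOT_RUNNING = 7   # resource is cleanly stopped


def monitor_exit_code(process_alive: bool, health_check_ok: bool) -> int:
    """Map two hypothetical probe results onto an OCF monitor exit code."""
    if not process_alive:
        return OCF_NOT_RUNNING
    if not health_check_ok:
        return OCF_ERR_GENERIC
    return OCF_SUCCESS
```

Distinguishing "not running" from "failed" matters because the cluster treats a cleanly stopped resource differently from one that needs recovery.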
  • 29. Anatomy of a Cluster config
      • Cluster properties
      • Resource defaults
      • Primitive definitions
      • Resource groups and constraints
  • 30. Cluster Properties
      property $id="cib-bootstrap-options" \
          stonith-enabled="FALSE" \
          no-quorum-policy="ignore" \
          start-failure-is-fatal="FALSE"
      no-quorum-policy: we'll ignore the loss of quorum on a 2-node cluster.
      start-failure-is-fatal: when set to FALSE, the cluster will instead use the resource's failcount and the value of resource-failure-stickiness.
  • 31. Resource Defaults
      rsc_defaults $id="rsc_defaults-options" \
          migration-threshold="1" \
          failure-timeout="1" \
          resource-stickiness="INFINITY"
      failure-timeout means that after a failure there will be a 60 second timeout before the resource can come back to the node on which it failed.
      migration-threshold=1 means that after 1 failure the resource will try to start on the other node.
      resource-stickiness=INFINITY means that the resource really wants to stay where it is now.
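The interplay of migration-threshold and failure-timeout on slide 31 can be pictured with a toy model: a node stops being eligible once its failcount reaches the threshold, and becomes eligible again once the last failure is older than the timeout. This is a simplified sketch, not Pacemaker's actual scheduling logic; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailureRecord:
    """Per-node failure bookkeeping for one resource (toy model)."""
    failcount: int = 0
    last_failure: Optional[float] = None  # timestamp in seconds


def may_run_here(record: FailureRecord, now: float,
                 migration_threshold: int, failure_timeout: float) -> bool:
    """Return True if the resource is still allowed on this node."""
    expired = (record.last_failure is not None
               and now - record.last_failure >= failure_timeout)
    if expired:
        return True  # old failures no longer count against the node
    return record.failcount < migration_threshold
```

With a threshold of 1, a single failure at t=100 keeps the resource off that node until the timeout has elapsed, which matches the behaviour the slide describes.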
  • 32. Primitive Definitions
      primitive d_mine ocf:custom:tomcat \
          params instance_name="mine" monitor_urls="health.html" monitor_use_ssl="no" \
          op monitor interval="15s" on-fail="restart"
      primitive ip_mine_svc ocf:heartbeat:IPaddr2 \
          params ip="" cidr_netmask="16" nic="bond0" \
          op monitor interval="10s"
  • 33. Parsing a config
      ● Isn't always done correctly
      ● Even a verify won't find all issues
      ● Unexpected behaviour might occur
  • 34. Where a resource runs
      • Multi state resources
          • Master – Slave
          • e.g. mysql master-slave, drbd
      • Clones
          • Resources that can run on multiple nodes, e.g.
              • Multimaster mysql servers
              • Mysql slaves
              • Stateless applications
      • location
          • Preferred location to run a resource, e.g. based on hostname
      • colocation
          • Resources that have to live together
          • e.g. IP address + service
      • order
          • Define what resource has to start first, or wait for another resource
      • groups
          • Colocation + order
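Slide 34 summarises a group as "colocation + order". One way to picture this is to expand an ordered member list into the pairwise constraints it implies: each member must run with, and start after, the member before it. A minimal sketch, using a hypothetical helper name and plain tuples rather than any real Pacemaker data structure:

```python
from typing import Dict, List, Tuple


def expand_group(members: List[str]) -> Dict[str, List[Tuple[str, str]]]:
    """Expand an ordered group into its implied pairwise constraints.

    colocation: (resource, with_resource), resource runs where with_resource runs
    order:      (first, then), 'then' starts only after 'first'
    """
    pairs = list(zip(members, members[1:]))
    return {
        "colocation": [(later, earlier) for earlier, later in pairs],
        "order": [(earlier, later) for earlier, later in pairs],
    }
```

For the `svc_db` group from slide 24, `expand_group(["d_mysql", "ip_db"])` yields one colocation pair (`ip_db` with `d_mysql`) and one ordering pair (`d_mysql` before `ip_db`).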
  • 35. E.g. a Service on DRBD
      ● DRBD can only be active on 1 node
      ● The filesystem needs to be mounted on that active DRBD node
      group svc_mine d_mine ip_mine
      ms ms_drbd_storage drbd_storage \
          meta master_max="1" master_node_max="1" clone_max="2" clone_node_max="1" notify="true"
      colocation fs_on_drbd inf: svc_mine ms_drbd_storage:Master
      order fs_after_drbd inf: ms_drbd_storage:promote svc_mine:start
      location cli-prefer-svc_db svc_db \
          rule $id="cli-prefer-rule-svc_db" inf: #uname eq db-a
  • 36. A MySQL Resource
      ● OCF
          • Clone
              • Where do you hook up the IP?
          • Multi State
              • But we have Master Master replication
          • Meta Resource
              • Dummy resource that can monitor
                  • Connection
                  • Replication state
  • 37. Simple 2 node example
      primitive d_mysql ocf:ntc:mysql \
          op monitor interval="30s" \
          params test_user="just" test_passwd="kidding" test_table="really"
      primitive ip_mysql_svc ocf:heartbeat:IPaddr2 \
          params ip="" cidr_netmask="" nic="bond0" \
          op monitor interval="10s"
      group svc_mysql d_mysql ip_mysql_svc
  • 38. Monitor your Setup
      ● Not just connectivity
      ● Also functional
          • Query data
          • Check that the result set is correct
      ● Check replication
          • MaatKit
          • OpenARK
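The "check replication" point on slide 38 can be illustrated by classifying one row of `SHOW SLAVE STATUS` output. The column names `Slave_IO_Running`, `Slave_SQL_Running` and `Seconds_Behind_Master` are MySQL's own; the function name, the dict-shaped input and the 30-second lag threshold are assumptions made for this sketch, and fetching the row from the server is left out.

```python
from typing import Any, Mapping


def replication_health(status: Mapping[str, Any],
                       max_lag_seconds: int = 30) -> str:
    """Classify a SHOW SLAVE STATUS row as 'ok', 'slow' or 'broken'.

    max_lag_seconds is an assumed threshold, not a value from the deck.
    """
    io_ok = status.get("Slave_IO_Running") == "Yes"
    sql_ok = status.get("Slave_SQL_Running") == "Yes"
    lag = status.get("Seconds_Behind_Master")
    # A NULL lag (None here) means replication is not running.
    if not (io_ok and sql_ok) or lag is None:
        return "broken"
    if int(lag) > max_lag_seconds:
        return "slow"
    return "ok"
```

A monitor built this way reports on replication state rather than just on whether the server accepts connections.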
  • 39. How to deal with replication state?
      ● Multiple slaves
          • Use the DRBD OCF resource
      ● 2 masters only: use your own script
          • Replication is slow on the active node
              • Shouldn't happen; talk to HR / cfgmt people
          • Replication is slow on the passive node
              • Weight--
          • Replication breaks on the active node
              • Send out a warning, don't modify weights, and check the other node
          • Replication breaks on the passive node
              • Fence off the passive node
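The two-master cases on slide 39 form a small decision table keyed on the node's role and the replication condition. A minimal sketch of such a lookup follows; the role and condition labels and the function name are hypothetical, and the action strings simply paraphrase the slide.

```python
# (role, condition) -> action, paraphrasing slide 39.
ACTIONS = {
    ("active", "slow"): "escalate: should not happen, talk to HR / config management",
    ("passive", "slow"): "decrease the weight of the passive node",
    ("active", "broken"): "send warning, leave weights unchanged, check the other node",
    ("passive", "broken"): "fence the passive node",
}


def replication_action(role: str, condition: str) -> str:
    """Return the action for a node role and replication condition."""
    if condition == "ok":
        return "none"
    try:
        return ACTIONS[(role, condition)]
    except KeyError:
        raise ValueError(f"unknown role/condition: {role!r}, {condition!r}")
```

Keeping the policy in a table rather than in nested conditionals makes it easier to review against the slide and to test each case in isolation.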
  • 40. Adding MySQL to the stack
      [Stack diagram: Hardware (Node A, Node B); Cluster Stack (HeartBeat, Pacemaker); MySQL Resource (“MySQLd” on each node); Replication; Service IP]
  • 41. Pitfalls & Solutions
      ● Monitor
          • Replication state
          • Replication lag
      ● MaatKit
      ● OpenARK
  • 42. Conclusion
      ● Plenty of alternatives
      ● Think about your data
      ● Think about getting queries to that data
      ● Complexity is the enemy of reliability
      ● Keep it Simple
      ● Monitor inside the DB
  • 43. Contact
      Kris Buytaert
      Kris.Buytaert@inuits.be
      Further Reading
      @KrisBuytaert
      Inuits: 't Hemeltje, Gemeentepark 2, 2930 Brasschaat, 891.514.231, +32 473 441 636
      Esquimaux: Kheops Business Center, Avenue Georges Lemaître 54, 6041 Gosselies, 889.780.406, +32 495 698 668