MySQL HA with PaceMaker

An overview of MySQL HA solutions, and a further in-depth example with Pacemaker / Linux-HA

Published in: Technology

  1. 1. MySQL HA with PaceMaker Kris Buytaert
  2. 2. Kris Buytaert <ul><li>Senior Linux and Open Source Consultant @inuits.be
  3. 3. „ Infrastructure Architect“
  4. 4. I don't remember when I started using MySQL :)
  5. 5. Specializing in Automated, Large Scale Deployments, Highly Available infrastructures, since 2008 also known as “the Cloud”
  6. 6. Surviving the 10th floor test
  7. 7. DevOp </li></ul>
  8. 8. In this presentation <ul><li>High Availability ?
  9. 9. MySQL HA Solutions
  10. 10. MySQL Replication
  11. 11. Linux HA / Pacemaker </li></ul>
  12. 12. What is HA Clustering ? <ul><li>One service goes down </li><ul><li>=> others take over its work </li></ul><li>IP address takeover, service takeover,
  13. 13. Not designed for high-performance
  14. 14. Not designed for high throughput (load balancing) </li></ul>
  15. 15. Does it Matter ? <ul><li>Downtime is expensive
  16. 16. You miss out on $$$
  17. 17. Your boss complains
  18. 18. New users don't return </li></ul>
  19. 19. Lies, Damn Lies, and Statistics Counting nines (slide by Alan R)
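"Counting nines" comes down to simple arithmetic: each extra nine divides the allowed downtime by ten. The following is a minimal sketch of that calculation, assuming a 365-day year; the function name `downtime_per_year` is made up for this illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: non-leap year


def downtime_per_year(nines: int) -> float:
    """Allowed downtime in seconds per year for an availability of `nines` nines.

    2 nines = 99%, 3 nines = 99.9%, and so on.
    """
    unavailability = 10 ** -nines
    return SECONDS_PER_YEAR * unavailability


for n in range(1, 6):
    minutes = downtime_per_year(n) / 60
    print(f"{n} nines: {minutes:,.1f} minutes of downtime per year")
```

Three nines leave roughly 8.8 hours per year, five nines only a bit over five minutes, which is less time than it usually takes to notice and diagnose a failure by hand.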
  20. 20. The Rules of HA <ul><li>Keep it Simple
  21. 21. Keep it Simple
  22. 22. Prepare for Failure
  23. 23. Complexity is the enemy of reliability
  24. 24. Test your HA setup </li></ul>
  25. 25. You care about ? <ul><li>Your data ? </li><ul><li>Consistent
  26. 26. Realtime
  27. 27. Eventually Consistent </li></ul><li>Your Connection </li><ul><li>Always
  28. 28. Most of the time </li></ul></ul>
  29. 29. Eliminating the SPOF <ul><li>Find out what Will Fail </li><ul><li>Disks
  30. 30. Fans
  31. 31. Power (Supplies) </li></ul><li>Find out what Can Fail </li><ul><li>Network
  32. 32. Going Out Of Memory </li></ul></ul>
  33. 33. Split Brain <ul><li>Communications failures can lead to separated partitions of the cluster
  34. 34. If those partitions each try and take control of the cluster, then it's called a split-brain condition
  35. 35. If this happens, then bad things will happen </li><ul><li>http://linux-ha.org/BadThingsWillHappen </li></ul></ul>
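One common guard against split brain is a quorum rule: a partition may only take control when it can see a strict majority of the cluster. The sketch below illustrates that rule only; `has_quorum` is a hypothetical helper, not a Pacemaker or Heartbeat API.

```python
def has_quorum(partition_size: int, cluster_size: int) -> bool:
    """Return True if this partition holds a strict majority of all nodes."""
    return 2 * partition_size > cluster_size


# A 3-node cluster split into 2 + 1: only the larger side may run resources.
print(has_quorum(2, 3))  # larger partition
print(has_quorum(1, 3))  # isolated node

# A 2-node cluster split into 1 + 1: neither side has a majority.
print(has_quorum(1, 2))
```

The last case shows why two-node clusters cannot rely on majority voting alone and typically need fencing (Stonith) to decide which node survives.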
  36. 36. Historical MySQL HA <ul><li>Replication </li><ul><li>1 read write node
  37. 37. Multiple read only nodes
  38. 38. Application needed to be modified </li></ul></ul>
  39. 39. Solutions Today <ul><li>BYO
  40. 40. DRBD
  41. 41. MySQL Cluster NDBD
  42. 42. Multi Master Replication
  43. 43. MySQL Proxy
  44. 44. MMM
  45. 45. Flipper </li></ul>
  46. 46. Data vs Connection <ul><li>DATA : </li><ul><li>Replication
  47. 47. DRBD </li></ul><li>Connection </li><ul><li>LVS
  48. 48. Proxy
  49. 49. Heartbeat / Pacemaker </li></ul></ul>
  50. 50. Shared Storage <ul><li>1 MySQL instance
  51. 51. Monitor MySQL node
  52. 52. Stonith
  53. 53. $$$ 1+1 <> 2
  54. 54. Storage = SPOF
  55. 55. Split Brain :( </li></ul>
  56. 56. DRBD <ul><li>Distributed Replicated Block Device
  57. 57. In the Linux Kernel (as of very recent)
  58. 58. Usually only 1 mount </li><ul><li>Multi mount as of 8.X </li><ul><li>Requires GFS / OCFS2 </li></ul></ul><li>Regular FS ext3 ...
  59. 59. Only 1 MySQL instance Active accessing data
  60. 60. Upon Failover MySQL needs to be started on other node </li></ul>
  61. 61. DRBD(2) <ul><li>What happens when you pull the plug on a physical machine ? </li><ul><li>Minimal Timeout
  62. 62. Why did the crash happen ?
  63. 63. Is my data still correct ?
  64. 64. Innodb Consistency Checks ? </li><ul><li>Lengthy ?
  65. 65. Check your BinLog size </li></ul></ul></ul>
  66. 66. MySQL Cluster NDBD <ul><li>Shared-nothing architecture
  67. 67. Automatic partitioning
  68. 68. Synchronous replication
  69. 69. Fast automatic fail-over of data nodes
  70. 70. In-memory indexes
  71. 71. Not suitable for all query patterns (multi-table JOINs, range scans) </li></ul>
  72. 72. Title <ul><ul><li>Data </li></ul></ul>
  73. 73. MySQL Cluster NDBD <ul><li>All indexed data needs to be in memory
  74. 74. Good and bad experiences </li><ul><li>Better experiences when using the API
  75. 75. Bad when using the MySQL Server </li></ul><li>Test before you deploy
  76. 76. Does not fit all apps </li></ul>
  77. 77. How replication works <ul><li>Master server keeps track of all updates in the Binary Log </li><ul><li>Slave requests to read the binary update log
  78. 78. Master acts in a passive role, not keeping track of which slave has read which data </li></ul></ul><ul><li>Upon connecting, the slaves do the following: </li><ul><li>The slave informs the master of where it left off
  79. 79. It catches up on the updates
  80. 80. It waits for the master to notify it of new updates </li></ul></ul>
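The division of labour described above (a passive master that only appends to its binary log, and slaves that remember their own position) can be sketched as a toy model. This is an illustration under simplifying assumptions, not MySQL's implementation: the classes `Master` and `Slave` are hypothetical, and a list index stands in for the binary log file and offset.

```python
class Master:
    """Keeps every update in an ordered log and tracks nothing per slave."""

    def __init__(self):
        self.binlog = []

    def execute(self, statement):
        self.binlog.append(statement)

    def read_from(self, position):
        # The master only serves what it is asked for.
        return self.binlog[position:]


class Slave:
    """Remembers where it left off and asks the master for everything after that."""

    def __init__(self):
        self.position = 0
        self.applied = []

    def catch_up(self, master):
        events = master.read_from(self.position)
        self.applied.extend(events)
        self.position += len(events)
        return len(events)


master = Master()
slave = Slave()
master.execute("INSERT INTO t VALUES (1)")
master.execute("INSERT INTO t VALUES (2)")
print(slave.catch_up(master))  # number of events applied on first connect
master.execute("UPDATE t SET id = 3 WHERE id = 2")
print(slave.catch_up(master))  # only the new event is fetched
```

Because the position lives on the slave, a slave that was disconnected for a while simply resumes from its last position when it reconnects.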
  81. 82. Two Slave Threads <ul><li>How does it work? </li><ul><li>The I/O thread connects to the master and asks for the updates in the master’s binary log
  82. 83. The I/O thread copies the statements to the relay log
  83. 84. The SQL thread implements the statements in the relay log </li></ul></ul>Advantages <ul><ul><li>Long running SQL statements don’t block log downloading
  84. 85. Allows the slave to keep up with the master better
  85. 86. In case of master crash the slave is more likely to have all statements </li></ul></ul>
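The two-thread design is a producer/consumer pattern: one thread downloads statements into the relay log, the other applies them at its own pace. The sketch below models that with Python threads and a queue standing in for the relay log; the function `replicate` and the `None` end marker are assumptions for this illustration only.

```python
import queue
import threading


def replicate(binlog):
    """Copy statements through a relay log using separate I/O and SQL threads."""
    relay_log = queue.Queue()  # stands in for the on-disk relay log
    applied = []

    def io_thread():
        # Downloads statements from the master's binary log.
        for statement in binlog:
            relay_log.put(statement)
        relay_log.put(None)  # end marker, only needed for this demo

    def sql_thread():
        # Applies statements; a slow statement here does not block the download.
        while True:
            statement = relay_log.get()
            if statement is None:
                break
            applied.append(statement)

    threads = [threading.Thread(target=io_thread),
               threading.Thread(target=sql_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return applied


print(replicate(["INSERT 1", "INSERT 2", "UPDATE 3"]))
```

Since the download never waits for the apply step, the relay log on the slave tends to hold everything the master has written, even when the SQL thread is behind.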
  86. 87. Replication commands Slave commands <ul><li>START|STOP SLAVE
  87. 88. RESET SLAVE
  88. 89. SHOW SLAVE STATUS
  89. 90. CHANGE MASTER TO…
  90. 91. LOAD DATA FROM MASTER
  91. 92. LOAD TABLE tblname FROM MASTER </li></ul>Master commands <ul><li>SHOW MASTER STATUS
  92. 93. PURGE MASTER LOGS… </li></ul>
  93. 94. SHOW SLAVE STATUS\G
        Slave_IO_State: Waiting for master to send event
        Master_Host: 172.16.0.1
        Master_User: repli
        Master_Port: 3306
        Connect_Retry: 60
        Master_Log_File: XMS-1-bin.000014
        Read_Master_Log_Pos: 106
        Relay_Log_File: XMS-2-relay.000033
        Relay_Log_Pos: 251
        Relay_Master_Log_File: XMS-1-bin.000014
        Slave_IO_Running: Yes
        Slave_SQL_Running: Yes
        Replicate_Do_DB: xpol
        Replicate_Ignore_DB:
        Replicate_Do_Table:
        Replicate_Ignore_Table:
        Replicate_Wild_Do_Table:
        Replicate_Wild_Ignore_Table:
        Last_Errno: 0
        Last_Error:
        Skip_Counter: 0
        Exec_Master_Log_Pos: 106
        Relay_Log_Space: 547
        Until_Condition: None
        Until_Log_File:
        Until_Log_Pos: 0
        Master_SSL_Allowed: No
        Master_SSL_CA_File:
        Master_SSL_CA_Path:
        Master_SSL_Cert:
        Master_SSL_Cipher:
        Master_SSL_Key:
        Seconds_Behind_Master: 0
        Master_SSL_Verify_Server_Cert: No
        Last_IO_Errno: 0
        Last_IO_Error:
        Last_SQL_Errno: 0
        Last_SQL_Error:
        1 row in set (0.00 sec)
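The fields that matter most for a health check are Slave_IO_Running, Slave_SQL_Running and Seconds_Behind_Master. The sketch below checks them on a status row that has already been fetched into a dictionary. The function name `replication_problems` and the 30-second lag threshold are assumptions for this example; how the row is fetched is left out.

```python
def replication_problems(status, max_lag_seconds=30):
    """Return a list of human-readable problems found in a SHOW SLAVE STATUS row."""
    problems = []
    if status.get("Slave_IO_Running") != "Yes":
        problems.append("I/O thread is not running")
    if status.get("Slave_SQL_Running") != "Yes":
        problems.append("SQL thread is not running")
    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        # MySQL reports NULL here when the slave is not replicating.
        problems.append("replication lag is unknown")
    elif int(lag) > max_lag_seconds:
        problems.append(f"slave is {lag} seconds behind the master")
    return problems


# Values taken from the status output on the slide.
healthy = {
    "Slave_IO_Running": "Yes",
    "Slave_SQL_Running": "Yes",
    "Seconds_Behind_Master": 0,
}
print(replication_problems(healthy))
```

An empty list means the three checks passed; it says nothing about whether the data on both sides is actually identical.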
  94. 95. Row vs Statement <ul><li>Statement-based: Pro </li><ul><li>Proven (around since MySQL 3.23)
  95. 96. Smaller log files
  96. 97. Auditing of actual SQL statements
  97. 98. No primary key requirement for replicated tables </li></ul><li>Statement-based: Con </li><ul><li>Non-deterministic functions and UDFs </li></ul></ul><ul><li>Row-based: Pro </li><ul><li>All changes can be replicated
  98. 99. Similar technology used by other RDBMSes
  99. 100. Fewer locks required for some INSERT, UPDATE or DELETE statements </li></ul><li>Row-based: Con </li><ul><li>More data to be logged
  100. 101. Log file size increases (backup/restore implications)
  101. 102. Replicated tables require explicit primary keys
  102. 103. Possible different result sets on bulk INSERTs </li></ul></ul>
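The "non-deterministic functions" drawback of statement-based replication can be shown with a toy example. Here Python's `uuid.uuid4()` stands in for a non-deterministic SQL function such as UUID(); the function `run_statement` is hypothetical and no database is involved.

```python
import uuid


def run_statement():
    """Stand-in for a non-deterministic statement like INSERT ... VALUES (UUID())."""
    return str(uuid.uuid4())


# The master executes the statement and stores the resulting row.
master_row = run_statement()

# Statement-based: the slave re-executes the same statement and gets its own value.
statement_based_row = run_statement()

# Row-based: the slave applies the row image that the master logged.
row_based_row = master_row

print("statement-based matches master:", statement_based_row == master_row)
print("row-based matches master:", row_based_row == master_row)
```

The price for the row-based guarantee is the one listed on the slide: the log carries full row data instead of a short statement.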
  103. 104. Multi Master Replication <ul><li>Replicating the same table data both ways can lead to race conditions </li><ul><li>Auto_increment, unique keys, etc. can cause problems if you write them on both nodes </li></ul><li>Both nodes are master
  104. 105. Both nodes are slave
  105. 106. Write on one, get updates on the other </li></ul>M|S M|S
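A common way to keep auto-increment values from colliding in a master-master pair is to give each node its own offset and a shared step, which MySQL exposes as the `auto_increment_offset` and `auto_increment_increment` server variables. The sketch below only simulates the resulting ID sequences; the generator function is an illustration, not MySQL code.

```python
from itertools import count, islice


def auto_increment_ids(offset, increment):
    """Yield the IDs a node hands out with the given offset and increment."""
    return count(start=offset, step=increment)


# Two masters: same increment (number of masters), different offsets.
node_a = list(islice(auto_increment_ids(offset=1, increment=2), 5))
node_b = list(islice(auto_increment_ids(offset=2, increment=2), 5))

print("node A:", node_a)
print("node B:", node_b)
print("overlap:", set(node_a) & set(node_b))
```

This only addresses auto-increment keys. Other unique keys and conflicting updates to the same row still need care when both nodes accept writes.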
  106. 107. MySQL Proxy <ul><li>Man in the middle
  107. 108. Decides where to connect to </li><ul><li>LUA </li></ul><li>Write rules to </li><ul><li>Redirect traffic </li></ul></ul>
  108. 109. Master Slave & Proxy <ul><li>Split Read and Write Actions
  109. 110. No Application change required
  110. 111. Sends specific queries to a specific node
  111. 112. Based on </li><ul><li>Customer
  112. 113. User
  113. 114. Table
  114. 115. Availability </li></ul></ul>
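The read/write split can be sketched as a small routing function. MySQL Proxy itself is scripted in Lua; the Python class below is only a language-neutral illustration of the decision, and the class name, host names and the "starts with SELECT" rule are all assumptions. It also assumes at least one slave is configured.

```python
import itertools


class ReadWriteSplitter:
    """Routes reads to the slaves (round robin) and everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)

    def route(self, query):
        words = query.split()
        # Simplistic rule: plain SELECTs are reads, anything else is a write.
        if words and words[0].upper() == "SELECT":
            return next(self._slaves)
        return self.master


splitter = ReadWriteSplitter("db-master", ["db-slave-1", "db-slave-2"])
for q in ["SELECT * FROM users", "UPDATE users SET name = 'x'", "select 1"]:
    print(q, "->", splitter.route(q))
```

A real rule set needs more than this: reads inside a transaction, SELECT ... FOR UPDATE, and reads that must see a write made a moment ago should all go to the master.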
  115. 116. MySQL Proxy <ul><li>Your new SPOF
  116. 117. Make your Proxy HA too ! </li><ul><li>Heartbeat OCF Resource </li></ul></ul>
  117. 118. Breaking Replication <ul><li>If the master and slave get out of sync
  118. 119. Updates on slave with identical index id </li><ul><li>Check error log for disconnections and issues with replication </li></ul></ul>
  119. 120. Monitor your Setup <ul><li>Not just connectivity
  120. 121. Also functional </li><ul><li>Query data
  121. 122. Check resultset is correct </li></ul><li>Check replication </li><ul><li>MaatKit
  122. 123. OpenARK </li></ul></ul>
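A functional check goes one step beyond "can I connect": it runs a known query and compares the result with what is expected. The sketch below shows the idea with the database access injected as a callable, so it makes no assumption about any particular MySQL client library; `functional_check` and `run_query` are hypothetical names.

```python
def functional_check(run_query, query, expected_rows):
    """Return True only if the query runs and returns exactly the expected rows.

    `run_query` is any callable that takes a SQL string and returns rows.
    """
    try:
        rows = run_query(query)
    except Exception:
        # Connection refused, timeout, SQL error: all count as a failed check.
        return False
    return list(rows) == list(expected_rows)


def fake_healthy_db(query):
    # Stand-in for a real database call, used only for this demo.
    return [(1, "canary")]


def fake_broken_db(query):
    raise ConnectionError("server has gone away")


check = "SELECT id, name FROM test.canary"
print(functional_check(fake_healthy_db, check, [(1, "canary")]))
print(functional_check(fake_broken_db, check, [(1, "canary")]))
```

The same shape works for a replication check: write a marker on the master, then verify on the slave that it arrives within an acceptable delay.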
  123. 124. Pulling Traffic <ul><li>Eg. for Cluster, MultiMaster setups </li><ul><li>DNS
  124. 125. Advanced Routing
  125. 126. LVS
  126. 127. Or the upcoming slides </li></ul></ul>
  127. 128. MMM <ul><li>Multi-Master Replication Manager for MySQL </li><ul><li>Perl scripts to perform monitoring/failover and management of MySQL master-master replication configurations </li></ul><li>Balance master / slave configs based on replication state </li><ul><li>Map Virtual IP to the Best Node </li></ul><li>http://mysql-mmm.org/ </li></ul>
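"Map Virtual IP to the Best Node" implies some ranking of nodes by replication state. The sketch below shows one plausible ranking, picking the replicating node with the least lag. It is not MMM's actual algorithm; the function name, the dictionary keys and the node names are invented for this illustration.

```python
def best_node(nodes):
    """Pick the name of the replicating node with the smallest lag, or None."""
    healthy = [n for n in nodes if n["replicating"]]
    if not healthy:
        return None
    return min(healthy, key=lambda n: n["lag_seconds"])["name"]


nodes = [
    {"name": "db1", "replicating": True, "lag_seconds": 12},
    {"name": "db2", "replicating": True, "lag_seconds": 0},
    {"name": "db3", "replicating": False, "lag_seconds": 0},
]
print(best_node(nodes))
```

Returning `None` when no node qualifies forces the caller to decide explicitly what to do with the virtual IP, rather than pointing it at a broken node.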
  128. 129. Flipper <ul><li>Flipper is a Perl tool for managing read and write access pairs of MySQL servers
  129. 130. master-master MySQL Servers
  130. 131. Client machines do not connect &quot;directly&quot; to either node; instead,
  131. 132. One IP for read,
  132. 133. One IP for write.
  133. 134. Flipper allows you to move these IP addresses between the nodes in a safe and controlled manner.
  134. 135. http://provenscaling.com/software/flipper/ </li></ul>
  135. 136. Linux-HA PaceMaker <ul><li>Plays well with others
  136. 137. Manages more than MySQL
  137. 138. ...v3 .. don't even think about the rest anymore
  138. 139. http://clusterlabs.org/ </li></ul>
  139. 140. Heartbeat <ul><li>Heartbeat v1 </li><ul><li>Max 2 nodes
  140. 141. No fine-grained resources
  141. 142. Monitoring using “mon” </li></ul><li>Heartbeat v2 </li><ul><li>XML usage was a consulting opportunity
  142. 143. Stability issues
  143. 144. Forking ? </li></ul></ul>
  144. 145. Pacemaker Architecture <ul><li>Stonithd : The Heartbeat fencing subsystem.
  145. 146. Lrmd : Local Resource Management Daemon. Interacts directly with resource agents (scripts).
  146. 147. pengine : Policy Engine. Computes the next state of the cluster based on the current state and the configuration.
  147. 148. cib : Cluster Information Base. Contains definitions of all cluster options, nodes, resources, their relationships to one another and current status. Synchronizes updates to all cluster nodes.
  148. 149. crmd : Cluster Resource Management Daemon. Largely a message broker for the PEngine and LRM, it also elects a leader to co-ordinate the activities of the cluster.
  149. 150. openais : Messaging and membership layer.
  150. 151. heartbeat : Messaging layer, an alternative to OpenAIS.
  151. 152. ccm : Short for Consensus Cluster Membership. The Heartbeat membership layer. </li></ul>
  152. 153. Pacemaker ? <ul><li>Not a fork
  153. 154. Only CRM Code taken out of Heartbeat
  154. 155. As of Heartbeat 2.1.3 </li><ul><li>Support for both OpenAIS / HeartBeat
  155. 156. Different release cycle than Heartbeat </li></ul></ul>
  156. 157. Heartbeat, OpenAis ? <ul><li>Both Messaging Layers
  157. 158. Initially only Heartbeat
  158. 159. OpenAIS
  159. 160. Heartbeat became unmaintained
  160. 161. OpenAIS has heisenbugs :(
  161. 162. Heartbeat maintenance taken over by LinBit
  162. 163. CRM Detects which layer </li></ul>
  163. 164. [Diagram: Pacemaker, Cluster Glue, and Heartbeat or OpenAIS]
  164. 165. Configuring Heartbeat <ul><li>/etc/ha.d/ha.cf </li><ul><li>Use crm = yes </li></ul><li>/etc/ha.d/authkeys </li></ul>
  165. 166. Configuring Heartbeat
        heartbeat::hacf {&quot;clustername&quot;:
          hosts => [&quot;host-a&quot;,&quot;host-b&quot;],
          hb_nic => [&quot;bond0&quot;],
          hostip1 => [&quot;10.0.128.11&quot;],
          hostip2 => [&quot;10.0.128.12&quot;],
          ping => [&quot;10.0.128.4&quot;],
        }
        heartbeat::authkeys {&quot;ClusterName&quot;:
          password => &quot;ClusterName&quot;,
        }
        http://github.com/jtimberman/puppet/tree/master/heartbeat/
  166. 167. Heartbeat Resources <ul><li>LSB
  167. 168. Heartbeat resource (+status)
  168. 169. OCF (Open Cluster FrameWork) (+monitor)
  169. 170. Clones (don't use in HAv2)
  170. 171. Multi State Resources </li></ul>
  171. 172. The MySQL Resource <ul><li>OCF </li><ul><li>Clone </li><ul><li>Where do you hook up the IP ? </li></ul><li>Multi State </li><ul><li>But we have Master Master replication </li></ul><li>Meta Resource </li><ul><li>Dummy resource that can monitor </li><ul><li>Connection
  172. 173. Replication state
  173. 174. .... </li></ul></ul></ul></ul>
  174. 175. CRM <ul><li>Cluster Resource Manager
  175. 176. Keeps Nodes in Sync
  176. 177. XML Based
  177. 178. cibadmin
  178. 179. Cli manageable
  179. 180. Crm </li></ul>
        configure
        property $id=&quot;cib-bootstrap-options&quot; stonith-enabled=&quot;FALSE&quot; no-quorum-policy=ignore start-failure-is-fatal=&quot;FALSE&quot;
        rsc_defaults $id=&quot;rsc_defaults-options&quot; migration-threshold=&quot;1&quot; failure-timeout=&quot;1&quot;
        primitive d_mysql ocf:local:mysql op monitor interval=&quot;30s&quot; params test_user=&quot;sure&quot; test_passwd=&quot;illtell&quot; test_table=&quot;test.table&quot;
        primitive ip_db ocf:heartbeat:IPaddr2 params ip=&quot;172.17.4.202&quot; nic=&quot;bond0&quot; op monitor interval=&quot;10s&quot;
        group svc_db d_mysql ip_db
        commit
  180. 181. Adding MySQL to the stack [Diagram: layers Hardware, Cluster Stack, Resource; Node A and Node B each run HeartBeat, Pacemaker and “MySQLd”; MySQL Replication between the nodes; Service IP MySQL]
  181. 182. Pitfalls & Solutions <ul><li>Monitor, </li><ul><li>Replication state
  182. 183. Replication Lag </li></ul><li>MaatKit
  183. 184. OpenARK </li></ul>
  184. 185. Conclusion <ul><li>Plenty of Alternatives
  185. 186. Think about your Data
  186. 187. Think about getting Queries to that Data
  187. 188. Complexity is the enemy of reliability
  188. 189. Keep it Simple
  189. 190. Monitor inside the DB </li></ul>
  190. 191. Kris Buytaert < [email_address] > Further Reading http://www.krisbuytaert.be/blog/ http://www.inuits.be/ http://www.virtualization.com/ http://www.oreillygmt.com/ ? !