
High Availability in 37 Easy Steps



High Availability can be a curiously nebulous term, and most people probably don't care about it until they can't access their online banking service, or their plane crashes.

This presentation examines some of the considerations necessary when building highly available computer systems, then focuses on the HA infrastructure software currently available from the Corosync/OpenAIS, Linux-HA and Pacemaker projects.

Originally presented at Linux Users Victoria in April 2010.

Published in: Technology


  1. High Availability in 37 Easy Steps. Tim Serong, Senior Clustering Engineer, [email_address]
  2-4. Agenda:
       • What is High Availability?
       • System Design Considerations
       • HA Clustering Software
  5. What is High Availability?
  6. What is High Availability? “High availability is a system design protocol and associated implementation that ensures a certain degree of operational continuity during a given measurement period.”
  7-8. What is High Availability? So:
       • Increase MTTF (better hardware)
       • Decrease MTTR (redundant hardware + software)
       Availability = MTTF / (MTTF + MTTR)
  9. What is High Availability? (hopefully your hardware is better than this)
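The Availability = MTTF / (MTTF + MTTR) formula above is easy to play with. As a quick sketch (the MTTF and MTTR figures here are made up purely for illustration), here is what it gives you in "nines" and yearly downtime:

```shell
# Hypothetical figures: MTTF = 10000 hours, MTTR = 4 hours.
# Availability = MTTF / (MTTF + MTTR)
awk 'BEGIN {
    mttf = 10000; mttr = 4
    a = mttf / (mttf + mttr)
    printf "availability = %.4f\n", a
    # Expected downtime over a year (8760 hours):
    printf "yearly downtime = %.1f hours\n", (1 - a) * 8760
}'
# availability = 0.9996
# yearly downtime = 3.5 hours
```

Note how halving MTTR buys you as much availability as doubling MTTF, which is why the slides emphasise redundancy (fast recovery) rather than just better hardware.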
  10-15. What is High Availability?
       • I lied about the presentation title
       • High Availability in 37 Easy Steps
       • High Availability is a Process, not a Product
  16. (hopefully you hired this sysadmin)
  17. System Design Considerations
  18-21. System Design Considerations:
       • What, exactly, do you need?
       • How good is your system already?
       • Within what limits can you operate?
       • Please, for the love of Eris, keep it simple.
  22. [Diagram: single file server with dual F/C links to RAID storage and Ethernet to the client network]
  23. [Same diagram, labeled “Reasonably Highly Available, Most of the Time”]
  24-27. System Design Considerations
       Good:
       • Redundant power to server
       • Redundant F/C connections
       • RAID
       Bad:
       • Server can still fail
       • Software can still fail
  28. [Diagram: two-node “File Server” cluster; each node has dual F/C links to shared RAID, Ethernet to the client network, and a private network between the nodes]
  29. [Same diagram: Node 2 takes over when Node 1 fails]
  30-32. System Design Considerations:
       • Redundancy adds complexity
       • Who's the boss if the two nodes get confused?
       • STONITH to the rescue
  33. System Design Considerations
  34-37. System Design Considerations:
       • Two-node clusters can be problematic.
       • Set STONITH action to power off (not reset).
       • Get a third node.
       • Test, test, test!
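In Pacemaker's crm shell, the "power off, don't reset" advice above boils down to a couple of properties plus a fencing device per node. The sketch below is illustrative only: it assumes IPMI-capable management boards, and the addresses and credentials are invented; check the agent list on your system (`stonith -L`) for what your hardware actually supports.

```shell
# Hypothetical two-node fencing setup (external/ipmi is one of the
# stonith agents shipped with cluster-glue; all params here are made up).
crm configure primitive st-node1 stonith:external/ipmi \
    params hostname="node1" ipaddr="192.168.1.101" userid="admin" passwd="secret"
crm configure primitive st-node2 stonith:external/ipmi \
    params hostname="node2" ipaddr="192.168.1.102" userid="admin" passwd="secret"
# Power off rather than reboot, so two confused nodes can't
# endlessly shoot each other in a reset loop:
crm configure property stonith-enabled="true" stonith-action="poweroff"
```

You would normally also add location constraints so each fencing device runs on the node it does not shoot.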
  38. HA Clustering Software
  39-41. HA Clustering Software
       • In the beginning was Heartbeat v1
       • Easy to configure...
         # cat /etc/haresources
         node1 IPaddr:: Filesystem::/dev/sda1::/data1::ext3
       • ...because it couldn't do anything.
  42-44. HA Clustering Software
       Then came Heartbeat v2, supporting:
       • More than 2 nodes
       • Resource level monitoring
       • Dependencies between resources
  45. HA Clustering Software: Trickier to configure:
      # cibadmin -Q
      <?xml version="1.0"?>
      <cib ...>
        <configuration>
          ...
          <resources>
            <group id="ip-with-fs">
              <primitive class="ocf" id="IP" provider="heartbeat" type="IPaddr">
                <instance_attributes id="IP-instance_attributes">
                  <nvpair id="IP-instance_attributes-ip" name="ip" value=""/>
                </instance_attributes>
                <operations>
                  <op id="IP-monitor-5min" interval="5min" name="monitor"/>
                </operations>
              </primitive>
              <primitive class="ocf" id="FS" provider="heartbeat" type="Filesystem">
                <instance_attributes id="FS-instance_attributes">
                  <nvpair id="FS-instance_attributes-device" name="device" value="/dev/sda1"/>
                  <nvpair id="FS-instance_attributes-directory" name="directory" value="/data1"/>
                  <nvpair id="FS-instance_attributes-fstype" name="fstype" value="ext3"/>
                </instance_attributes>
                <operations>
                  <op id="FS-monitor-60s" interval="60s" name="monitor"/>
                </operations>
              </primitive>
            </group>
          </resources>
          <constraints>
            <rsc_location id="prefer-node1" node="node1" rsc="ip-with-fs" score="100"/>
          </constraints>
          ...
        </configuration>
      </cib>
  46. HA Clustering Software: Sometimes, it hurts your eyes:
  47-49. HA Clustering Software
       Heartbeat then split into two projects:
       • Heartbeat 2.1.4 (membership & messaging)
       • Pacemaker 0.6 (CRM)
       • (also glue, agents)
       Pacemaker added support for OpenAIS as an alternative to Heartbeat:
       • Necessary for DLM, thus supporting CLVM, GFS2, OCFS2
  50. HA Clustering Software
  51-53. HA Clustering Software
       Most recently, OpenAIS split into two projects:
       • Corosync (cluster infrastructure)
       • OpenAIS (SA Forum APIs, i.e. magic for DLM, OCFS2, etc.)
       So now we have:
       • Pacemaker 1.x on Heartbeat 3.0, or,
       • Pacemaker 1.x on Corosync 1.x (+ OpenAIS 1.x)
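For the Corosync-based stack, the glue between the layers is corosync.conf. A minimal sketch of the era's format is below; the network addresses are illustrative (bindnetaddr must match your private cluster network), and the `service` stanza is the Corosync 1.x plugin mechanism that loads Pacemaker:

```text
# /etc/corosync/corosync.conf (sketch; addresses are examples only)
totem {
    version: 2
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.1.0
        mcastaddr: 239.255.1.1
        mcastport: 5405
    }
}
service {
    # Start Pacemaker as a Corosync plugin
    name: pacemaker
    ver: 0
}
```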
  54. [Stack diagram, courtesy of Lars Marowsky-Brée: resource agents (Apache, MySQL, SAP, libvirt, Xen, iSCSI, filesystems, IP address, DRBD, clvmd, ...) and STONITH devices (DRAC, iLO, SBD) driven by the LRM; Pacemaker (CIB, Policy Engine) with Web GUI, Python GUI and CRM shell on top; OpenAIS membership/messaging over Ethernet/Infiniband (bonding, UDP multicast, TCP, SCTP); DLM, dlm_controld, ocfs2_controld; storage via local disks, SAN FC(oE)/iSCSI, multipath IO, DRBD; filesystems ext3, XFS, OCFS2, cLVM2; all on the Linux kernel (SUSE only)]
  55-56. HA Clustering Software
       But, after all that, it's easy to configure again, and vastly more flexible:
       # crm configure show
       primitive IP ocf:heartbeat:IPaddr \
           params ip="" op monitor interval="5min"
       primitive FS ocf:heartbeat:Filesystem \
           params device="/dev/sda1" directory="/data1" fstype="ext3" \
           op monitor interval="60s"
       group ip-with-fs IP FS
       location prefer-node1 ip-with-fs 100: node1
  57. [Diagram, courtesy of Lars Marowsky-Brée: two cluster nodes running Corosync + OpenAIS, Pacemaker, DLM and cLVM2+OCFS2, hosting Xen VMs (e.g. a LAMP VM with Apache, an IP address and ext3), connected to shared storage, network links and clients]
  58-64. HA Clustering Software
       Before it breaks:
       # crm
       crm(live)# cib new sandbox
       INFO: sandbox shadow CIB created
       crm(sandbox)# cib cibstatus load live
       crm(sandbox)# cib cibstatus op monitor IP not_running
       crm(sandbox)# configure ptest
       ptest[12971]: 2010/04/05_07:43:36 WARN: unpack_rsc_op: Processing failed op IP_monitor_300000 on hex-14: not running (7)
  65. HA Clustering Software
  66-72. HA Clustering Software
       After it breaks:
       # hb_report -n "node1 node2" -f 12:00 /tmp/hb_report
       Compiles:
       • Cluster-wide log files
       • Package state
       • DLM/OCFS2 state
       • System information
       • CIB history
       • Parses core dump reports (needs debuginfo packages!)
       ...into a single tarball for subsequent analysis
  73-76. HA Clustering Software
       Active community:
       • #linux-ha
       • #linux-cluster
       • Various mailing lists
       SUSE and Red Hat converging on cluster stacks
       Heartbeat in maintenance mode
  77. Questions and Answers
  78-83. Further Reading