MySQL HA Presentation

MySQL High Availability in practice: pros and cons of different alternatives in the real world

Published in: Technology

  1. MySQL HA Using different solutions <ul><li>Robert Krzykawski </li></ul><ul><li>DB Team Coordinator, bwin games. </li></ul><ul><li>Anders Karlsson </li></ul><ul><li>Principal Sales Engineer, MySQL </li></ul>
  2. Agenda <ul><li>Who are we? </li></ul><ul><li>HA Basics – Anders </li></ul><ul><li>How we did it; Success or failure – Robert </li></ul><ul><li>Summary </li></ul><ul><li>Questions? </li></ul>
  3. Anders Karlsson <ul><li>Sales Engineer with Sun / MySQL for 5+ years </li></ul><ul><li>I have been in the RDBMS business for 20+ years </li></ul><ul><li>I have worked for many of the major vendors and with most of the vendor products </li></ul><ul><li>I’ve been in roles as </li></ul><ul><ul><li>Sales Engineer </li></ul></ul><ul><ul><li>Consultant </li></ul></ul><ul><ul><li>Porting engineer </li></ul></ul><ul><ul><li>Support engineer </li></ul></ul><ul><ul><li>Etc. </li></ul></ul><ul><li>Outside MySQL I build websites (www.papablues.com), develop Open Source software (MyQuery, ndbtop etc), am a keen photographer and drive sub-standard cars, among other things. Also: www.makezfsgpl.com ! Right now! </li></ul>
  4. Robert Krzykawski <ul><li>DB Team Coordinator @ bwin Games AB </li></ul><ul><li>I have worked with MySQL in every role from system admin to DBA and DBD, and am now taking a more system-architectural role. </li></ul><ul><li>I have been involved in building both small and big web-based solutions using MySQL since 1998. </li></ul><ul><li>My roles throughout my professional life have varied: System administrator, Technical Sales support, DBA, DBD, Programmer, Application architect and System architect. </li></ul><ul><li>Off work I try to automate things with scripts and programs to offload myself when “on work”. </li></ul><ul><li>I also try to find time to snowboard and play some paintball, and a recently introduced hobby is our Maine Coon kittens. </li></ul>
  5. Why do you need HA? <ul><li>Something can break. It usually will, eventually. </li></ul><ul><li>You will need to maintain your database eventually, without shutting the whole system down. </li></ul><ul><li>Adding HA to an existing running system is difficult, much more so than providing HA from the start. </li></ul><ul><li>You want a good night’s sleep! You want failover to be automatic! </li></ul>
  6. HA Concepts <ul><li>Fault tolerant architectures </li></ul><ul><ul><li>These are hardware architectures with supporting software that protect against even individual component failures. </li></ul></ul><ul><li>Single Point of Failure (SPOF) </li></ul><ul><ul><li>In any fault tolerant setup, you want to avoid a SPOF, as a chain is no stronger than its weakest link. </li></ul></ul><ul><li>Fail over and Fail back </li></ul><ul><ul><li>Fail over is the process of switching from a failed component to another component, which may be dormant or already active. Fail back is the process of switching back from the backup component to the original one. </li></ul></ul>
  7. Some HA Components <ul><li>Heartbeat </li></ul><ul><ul><li>Heartbeat is an HA component that checks that the services being failed over are alive. Heartbeat can check individual servers, software services, networking etc. </li></ul></ul><ul><li>HA Monitor </li></ul><ul><ul><li>The HA Monitor has different names in different frameworks. This is the component that allows configuration of the services, ensures proper shutdown and startup, and allows manual control. </li></ul></ul><ul><li>Replication </li></ul><ul><ul><li>Replication is a common component that ensures that the data held by managed, data-rich components is in sync. </li></ul></ul>
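
The heartbeat idea on this slide can be sketched in a few lines of Python. This is only an illustration of the principle (a node that stays silent longer than a timeout is treated as dead), not how Linux HA or any other framework implements it. The class name `HeartbeatMonitor`, the node names and the 3-second timeout are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each node (illustrative only)."""
    timeout: float                                   # seconds of silence before a node counts as dead
    last_seen: dict = field(default_factory=dict)    # node name -> time of last heartbeat

    def beat(self, node: str, now: float) -> None:
        """Record a heartbeat received from `node` at time `now`."""
        self.last_seen[node] = now

    def dead_nodes(self, now: float) -> list:
        """Return the nodes whose last heartbeat is older than the timeout."""
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)


# Hypothetical two-node setup: db2 keeps sending heartbeats, db1 goes silent.
monitor = HeartbeatMonitor(timeout=3.0)
monitor.beat("db1", now=0.0)
monitor.beat("db2", now=0.0)
monitor.beat("db2", now=2.5)
print(monitor.dead_nodes(now=4.0))
```

A real heartbeat would typically run over more than one network path, so that a single failed link is not mistaken for a failed node.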
  8. What should I require? <ul><li>Don’t aim too high, aim for what is reasonable for your needs </li></ul><ul><li>Aim to ensure that no important data is lost </li></ul><ul><ul><li>What is “important data”? You decide! Different data means different “needs”! </li></ul></ul><ul><li>Aim to ensure that the solution can be automated. You will want this eventually anyway </li></ul><ul><li>Aim to ensure a solution that can easily be tested and administered </li></ul><ul><li>Aim to ensure that the solution is performant and scalable </li></ul>
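
To make "what is reasonable for your needs" concrete, it can help to translate an availability target into the downtime it allows per year. The sketch below is plain arithmetic; the three targets are example values chosen for illustration and do not come from the presentation.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # ignores leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes of downtime per year allowed by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Example targets only; pick the one that matches your own requirements.
for target in (99.0, 99.9, 99.99):
    print(f"{target}% availability -> {downtime_minutes_per_year(target):.1f} minutes/year")
```

Each extra "nine" cuts the downtime budget by a factor of ten, which is why aiming too high quickly becomes expensive.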
  9. HA with MySQL – In short <ul><li>MySQL Replication </li></ul><ul><ul><li>Easy to use and set up. Low performance impact </li></ul></ul><ul><ul><li>Asynchronous only. Failback can be difficult. Need additional components </li></ul></ul><ul><li>MySQL with DRBD / ZFS / AVS </li></ul><ul><ul><li>Easy to use. Low cost software only. Synchronous. Good HA software integration. </li></ul></ul><ul><ul><li>Certain performance impact. Limited data size and transaction rates. </li></ul></ul>
  10. HA with MySQL – In short <ul><li>MySQL with Shared storage </li></ul><ul><ul><li>Good performance. Eases hardware management. Good integration with HA software. </li></ul></ul><ul><ul><li>Costly. SAN itself is a SPOF. </li></ul></ul><ul><li>MySQL Cluster </li></ul><ul><ul><li>Very good performance. Self contained. Very short fail-over times. Software only solution. </li></ul></ul><ul><ul><li>Needs several physical servers. Not optimized for all MySQL applications. </li></ul></ul>
  11. bwin Games AB
  12. Our goal at bwin <ul><li>We were faced with a requirement: establish a highly available database platform. </li></ul><ul><li>We had some rules to follow from management. </li></ul><ul><ul><li>Interruptions due to hardware failure should not require hands-on work. </li></ul></ul><ul><ul><li>Downtime should be minimized during interruptions. </li></ul></ul><ul><ul><li>Performance of the DB platform should not decrease when operating as usual. </li></ul></ul><ul><ul><li>Performance can decrease if a failure has occurred, but should not render the service unusable. </li></ul></ul><ul><ul><li>Implementation should be done by the operations department. Developers should not be involved. </li></ul></ul>
  13. What solutions did we consider? <ul><li>Master/Master </li></ul><ul><li>Linux HA </li></ul><ul><li>HP Service Guard </li></ul><ul><li>Sun Cluster </li></ul><ul><li>Combination of the above </li></ul><ul><li>MySQL Cluster </li></ul><ul><li>Will walk through all of the above </li></ul>
  14. Master/Master <ul><li>Master/Master with two active nodes would give us a seamless switch if we have a good load balancer. </li></ul><ul><ul><li>Will give us the ability to do schema changes “on line” </li></ul></ul><ul><ul><li>Not only higher availability when both nodes are up, but better performance. </li></ul></ul><ul><ul><li>Can eliminate the use of production slaves. </li></ul></ul><ul><ul><li>One entry point for application when using “LB” </li></ul></ul>
  15. Linux HA/ServiceGuard/SunCluster <ul><li>Service IP switch will cause a glitch in service. </li></ul><ul><li>Since we are running 4.0 we can’t really do a master/master setup with service IP switching. </li></ul><ul><li>Slave integrity is important and we are running 4.0 with one master copy of the data. We can’t switch to a slave and hope that everything was replicated. </li></ul><ul><li>We are using SAN – Shared storage possible. </li></ul><ul><li>One instance, two machines – One active, one standby. </li></ul><ul><li>InnoDB log size will be a problem. </li></ul><ul><li>Timeout during recovery can cause problems during switch. </li></ul>
  16. MySQL Cluster <ul><li>High availability built in if implemented correctly </li></ul><ul><li>Requires more hardware. </li></ul><ul><li>More complex solution </li></ul><ul><li>Requires application to support NDB </li></ul><ul><li>Not full feature set. </li></ul>
  17. Obstacles <ul><li>We are using MySQL 4.0 in our biggest database </li></ul><ul><li>Master/Master scenario on 4.0 requires a higher level of application awareness. </li></ul><ul><li>LinuxHA/ServiceGuard/Sun Cluster will cause a small glitch when we move resources. </li></ul><ul><li>MySQL Cluster will require even more application changes in our case. </li></ul>
  18. Our Choice <ul><li>LinuxHA because it is GPL/LGPL. Free and not owned by an organization. </li></ul><ul><li>Fastest way to implement; it did not require any support from the dev department. </li></ul><ul><li>All other ways required changes in the application. </li></ul>
  19. Layout <ul><li>Two versions </li></ul>
  20. We do... <ul><li>Use Linux HA 2.0. Needed for setup of “cluster” </li></ul><ul><li>Use SAN. Shared storage is easier and faster, but expensive. </li></ul><ul><ul><li>DRBD can be used but saves the same data twice. It also comes with a performance decrease. </li></ul></ul><ul><li>Heartbeat on two bonds. Primary on the database interconnect network, secondary on the database service network </li></ul><ul><li>We have LUNs presented to multiple hosts </li></ul><ul><li>Services have rules to be run on specific hosts only. </li></ul><ul><li>We fence using RiLOE </li></ul><ul><ul><li>Have plans to fence on port level in FC switches. </li></ul></ul>
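
With LUNs presented to multiple hosts, the order of operations during a takeover matters: the failed peer has to be fenced before the standby touches the shared storage. The sketch below shows only that ordering. In the setup described here the cluster software drives these steps; the function `take_over` and the three callables passed to it are hypothetical stand-ins, not part of Linux HA.

```python
def take_over(fence_peer, mount_storage, start_mysql):
    """Run the takeover steps in order; do nothing further if fencing fails."""
    if not fence_peer():
        # Without confirmed fencing, both hosts could write to the same LUN.
        return "aborted: peer not fenced"
    mount_storage()
    start_mysql()
    return "active"


# Hypothetical stand-ins that only record what would have been done.
steps = []
result = take_over(
    fence_peer=lambda: steps.append("fence") or True,
    mount_storage=lambda: steps.append("mount"),
    start_mysql=lambda: steps.append("start mysql"),
)
print(result, steps)
```

Refusing to continue when fencing cannot be confirmed trades availability for data safety, which is usually the right trade on shared storage.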
  21. What’s good and what’s bad... <ul><li>Easy and fast implementation </li></ul><ul><li>Our config does not increase/decrease performance. </li></ul><ul><li>InnoDB log size causes long recovery times. Testing to decrease it has caused performance penalties. </li></ul><ul><li>Our solution is not foolproof because of long recovery times. </li></ul><ul><li>It causes interruption of service. </li></ul><ul><li>We can say it’s HA, but a true HA solution would give us 100% uptime. </li></ul><ul><li>The 2nd setup is complicated. We should aim for simple, more common setups. </li></ul>
  22. What can we do better? <ul><li>Fine tune config for faster recovery/startup </li></ul><ul><li>Add better fencing </li></ul><ul><li>Monitor failover in case recovery takes a long time </li></ul><ul><li>Master/Master or Multi master. </li></ul><ul><ul><li>If the application can reconnect or if we have a smart load balancer we have no outages. </li></ul></ul><ul><ul><li>Upgrades or schema changes can be made “online” </li></ul></ul><ul><ul><li>No separation between writes and reads. Less complicated for developers. One entry point. </li></ul></ul>
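
The point about an application that can reconnect can be illustrated with a small retry helper that moves on to the next master when one is unreachable. This is a minimal sketch under stated assumptions: `run_with_failover`, the endpoint names and `fake_query` are invented for the example, and a real application would pass in a function that opens a database connection and runs the query.

```python
def run_with_failover(endpoints, operation, attempts_per_endpoint=2):
    """Try `operation(endpoint)` on each endpoint in turn until one succeeds."""
    last_error = None
    for endpoint in endpoints:
        for _ in range(attempts_per_endpoint):
            try:
                return operation(endpoint)
            except ConnectionError as exc:
                last_error = exc          # remember the failure and keep trying
    raise ConnectionError(f"all endpoints failed: {last_error}")


def fake_query(endpoint):
    """Stand-in for a real query; pretends that master-a is down."""
    if endpoint == "master-a":
        raise ConnectionError("master-a is down")
    return f"ok from {endpoint}"


print(run_with_failover(["master-a", "master-b"], fake_query))
```

A smart load balancer moves the same logic out of the application, which is what gives the single entry point mentioned above.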
  23. Summary <ul><li>Concepts </li></ul><ul><li>Components </li></ul><ul><li>Requirements </li></ul><ul><li>Technologies </li></ul><ul><li>Your goal </li></ul><ul><li>Considerations </li></ul><ul><li>Obstacles </li></ul><ul><li>How we did it @ bwin games AB </li></ul><ul><li>HA recommendations </li></ul>
  24. Questions The question is not, ‘What is the answer?’ The question is, ‘What is the question?’ Henri Poincaré
  25. Thank you for your time! <ul><li>And thank you for listening so kindly. </li></ul><ul><li>We can be found on: </li></ul><ul><li>Robert Krzykawski – http://krzykawski.com </li></ul><ul><li>Anders Karlsson – http://papablues.com http://karlssonondatabases.blogspot.com/ </li></ul>
