Galera Cluster: Synchronous Multi-Master Replication for MySQL HA
Some slides I've presented at TechEvent 04.2013 and LinuxDay 2013


Usage Rights

© All Rights Reserved

Galera Cluster: Synchronous Multi-Master Replication for MySQL HA Presentation Transcript

  • 1. Galera Cluster: Synchronous Multi-Master Replication for MySQL HA
    TechEvent, April 2013
    Ludovico CALDARA, LS-IMS, 27.04.2013
    2012 © Trivadis
    BASEL, BERN, LAUSANNE, ZÜRICH, DÜSSELDORF, FRANKFURT A.M., FREIBURG I.BR., HAMBURG, MÜNCHEN, STUTTGART, WIEN
  • 2. MySQL forks: which one is better?
      • MySQL and its forks: Oracle MySQL, Percona Server, MariaDB, Drizzle
      • New forks
      • Many new features
      • Improved instrumentation
      • New solutions for DEVs and DBAs
      • Fast-paced competition between the forks' developers
      • Recent evolutions in HA and scalability have made MySQL enterprise-ready
  • 3. There is no recipe that can satisfy all tastes

    Feature                    Percona Server     MariaDB            MySQL
    Multi-source replication   NO                 YES (rel. 10)      NO
    NoSQL integration          YES (cassandra)    YES (cassandra)    YES (memcached)
    Virtual Columns            NO                 YES                NO
    Improved diagnostics       YES                NO                 NO
    Online DDL                 NO                 YES                YES
    Galera Cluster             YES                YES                YES (codership patch)
    Many many others           YES/NO             YES/NO             YES/NO
  • 4. Your real requirements will let you choose… Need HA?
      • How will your customers react to a major loss of service?
  • 5. Old-school solutions have weaknesses
      • Native MySQL Replication
          • Doesn’t scale writes
          • Complex to promote slaves
      • MySQL Multi-Master Replication
          • Complex and not reliable
          • Concurrent writes lead to logical corruption
      • DRBD Replication
          • Standby is offline, doesn’t scale at all
          • Poor performance
      • MySQL Cluster (NDB)
          • Very complex
          • It’s not InnoDB!
  • 6. New-school solutions: 3rd parties are playing a decisive role
      • Continuent Tungsten Replicator
          • Similar to Golden Gate
          • Heterogeneous databases (the diagram shows Oracle and MySQL)
          • Provides complex topologies
          • Asynchronous
          • Conflicts are complex to resolve
          • Complex to maintain
          • Not free
      • Galera Cluster Replication
          • Transparent multi-master, easy to maintain
          • (Virtually) synchronous
          • It’s InnoDB (only InnoDB)
          • Great and easy scalability
          • Optimistic locking (side effects)
          • At least 3 nodes for good HA
  • 7. Multi-Master and virtually synchronous: it’s transparent
    [Diagram: every node in the cluster accepts reads and writes (R/W)]
  • 8. Cluster implementation - Ingredients
      • One or more standalone servers (either physical or virtual)
      • Linux (other operating systems are not yet available)
      • “Permissive” firewall between nodes
      • Codership’s Galera Library package
      • A package of your choice:
          • Percona XtraDB Cluster
          • MariaDB Galera Cluster
          • MySQL with wsrep patch (patched by Codership)
  • 9. Cluster implementation - Variables
      • Each server’s my.cnf must contain:
          • wsrep_cluster_address=gcomm://192.168.1.100,…,192.168.1.10x
          • wsrep_provider=/usr/lib64/libgalera_smm.so
          • binlog_format=ROW
          • default_storage_engine=InnoDB
          • innodb_autoinc_lock_mode=2
          • innodb_locks_unsafe_for_binlog=1 #disables gap locking
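The settings listed on this slide lend themselves to a mechanical check. The following is only a minimal sketch: it assumes the options live in the `[mysqld]` section, are spelled with underscores, and that Python's standard `configparser` can read the file. The function name `missing_galera_settings` and the sample file contents are hypothetical, not part of the presentation.

```python
import configparser

# Settings named on the slide. The cluster address and provider path are
# site-specific, so only their presence is checked (expected value None).
REQUIRED = {
    "wsrep_cluster_address": None,
    "wsrep_provider": None,
    "binlog_format": "ROW",
    "default_storage_engine": "InnoDB",
    "innodb_autoinc_lock_mode": "2",
    "innodb_locks_unsafe_for_binlog": "1",
}


def missing_galera_settings(my_cnf_text):
    """Return a sorted list of problems found in the [mysqld] section."""
    parser = configparser.ConfigParser(
        allow_no_value=True,            # my.cnf allows bare flags
        strict=False,                   # tolerate repeated options
        interpolation=None,             # do not treat % specially
        inline_comment_prefixes=("#",),
    )
    parser.read_string(my_cnf_text)
    section = parser["mysqld"] if parser.has_section("mysqld") else {}
    problems = []
    for key, expected in REQUIRED.items():
        value = section.get(key)
        if value is None:
            problems.append(f"{key} is not set")
        elif expected is not None and value.strip().lower() != expected.lower():
            problems.append(f"{key}={value} (expected {expected})")
    return sorted(problems)


# Hypothetical file contents, for illustration only.
sample = """
[mysqld]
wsrep_cluster_address=gcomm://192.168.1.100,192.168.1.101
wsrep_provider=/usr/lib64/libgalera_smm.so
binlog_format=STATEMENT
default_storage_engine=InnoDB
"""
for problem in missing_galera_settings(sample):
    print(problem)
```

The check is deliberately strict about values such as `binlog_format`, because the slide states them as requirements rather than recommendations.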
  • 10. Cluster implementation – Start the cluster
      mysqld_safe --wsrep_cluster_address=gcomm:// &
      […]
      130220 17:56:46 [Note] WSREP: Starting new group from scratch: […]
      • The empty gcomm:// address starts the node as the first of the cluster
      • NEVER USE IT TO JOIN AN EXISTING CLUSTER
  • 11. Cluster implementation – Adding nodes to the cluster
      mysqld_safe --wsrep_cluster_address=gcomm://host1,host2… &
      […]
      130220 18:01:56 [Note] WSREP: Shifting OPEN -> PRIMARY (TO:…)
      130220 18:01:56 [Note] WSREP: State transfer required: […]
      • The address should already be present in my.cnf!
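The difference between bootstrapping and joining comes down to whether the `gcomm://` address is empty. As a hedged illustration, the helper below (a hypothetical name, `gcomm_address`, not part of Galera) builds the value and refuses the dangerous combination the slides warn about: an empty address when the intent is to join.

```python
def gcomm_address(peers, bootstrap=False):
    """Build a wsrep_cluster_address value.

    peers: host names or IPs of nodes that are already cluster members.
    bootstrap: True only for the very first node of a brand-new cluster.
    """
    peers = [p.strip() for p in peers if p.strip()]
    if bootstrap:
        if peers:
            raise ValueError("bootstrap uses the empty gcomm:// address")
        return "gcomm://"
    if not peers:
        # An empty address would start a new cluster instead of joining one.
        raise ValueError("joining requires at least one existing node")
    return "gcomm://" + ",".join(peers)


print(gcomm_address([], bootstrap=True))
print(gcomm_address(["host1", "host2"]))
```

Making the bootstrap case an explicit flag means a typo in the peer list fails loudly instead of silently creating a second cluster.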
  • 12-16. State Snapshot Transfer (SST)
      • The joiner asks for an SST
      • The cluster chooses a donor; the donor is taken offline
      • The donor is backed up
      • The donor comes online again and the joiner is loaded
      • The joiner replays the missing transactions and joins the cluster
      • The cluster can also do Incremental State Transfers (IST)
    [Diagram sequence: the other nodes stay R/W while the DONOR serves the joiner; at the end all nodes are R/W]
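The choice between a full snapshot and an incremental transfer can be pictured as a small decision function. This is a simplified sketch, not Galera's implementation: it assumes a rejoining node remembers the cluster state identifier and the last sequence number it applied, and that a donor can serve an IST only while the missing write sets are still in its write-set cache. All names and numbers are hypothetical.

```python
def choose_state_transfer(joiner_uuid, joiner_seqno,
                          cluster_uuid, cluster_seqno,
                          donor_cache_first_seqno):
    """Return "NONE", "IST" or "SST" for a node that wants to join."""
    if joiner_uuid != cluster_uuid or joiner_seqno < 0:
        # Unknown or foreign history: only a full snapshot is safe.
        return "SST"
    if joiner_seqno >= cluster_seqno:
        # Nothing was missed while the node was away.
        return "NONE"
    if joiner_seqno + 1 >= donor_cache_first_seqno:
        # Every missing write set is still cached on the donor.
        return "IST"
    return "SST"


# A brand-new node has no history at all.
print(choose_state_transfer(None, -1, "c1", 5000, 4000))
# A node that was down briefly missed only cached write sets.
print(choose_state_transfer("c1", 4800, "c1", 5000, 4000))
# A node that was down for a long time fell out of the cache.
print(choose_state_transfer("c1", 1200, "c1", 5000, 4000))
```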
  • 17. Split-Brain
      • The majority of nodes wins
      • Complete loss of network: all nodes go offline
      • The offline nodes will respond:
          mysql> select * from emp;
          ERROR 1047 (08S01): Unknown command
      • Galera arbitrator (garbd) can join the cluster and count as a member in split-brain resolution
      • NEW: Galera 2.4 introduces weighted quorum
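The "majority wins" rule, and the effect of weights, can be sketched in a few lines. This is a simplification under stated assumptions: it compares the reachable weight with the total weight of the last known membership and ignores details such as nodes that left gracefully. The function name `has_quorum` is hypothetical.

```python
def has_quorum(reachable_weights, all_weights):
    """Return True if the reachable nodes hold a strict majority of the weight."""
    return 2 * sum(reachable_weights) > sum(all_weights)


# Three equal nodes: two that still see each other keep the quorum.
print(has_quorum([1, 1], [1, 1, 1]))
# An even 2/2 split leaves neither half with a majority.
print(has_quorum([1, 1], [1, 1, 1, 1]))
# Giving one node a weight of 2 lets its half of the same split win.
print(has_quorum([2, 1], [2, 1, 1, 1]))
```

The strict inequality is what prevents two halves of an even split from both continuing to accept writes.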
  • 18. Example 1: Arbitrator in Trivadis Switzerland (… sorry for German/Austrian attendees ☺)
    [Diagram: the sites BASEL, ZURICH, BERN and LAUSANNE connected over a WAN, plus an arbitrator]
      • If the WAN connection is lost, Zurich survives
      • If the Zurich site is lost, the cluster will go offline
  • 19. Example 2: Arbitrator in Trivadis Switzerland (… sorry for German/Austrian attendees ☺)
    [Diagram: the sites BASEL, ZURICH, BERN and LAUSANNE connected over a WAN, plus an arbitrator]
      • If the Zurich site is lost, the other sites survive
      • If the WAN connection is lost, the cluster will go offline
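The two arbitrator examples differ only in where the extra vote sits. The exact topology is not recoverable from the transcript, so the sketch below uses a hypothetical two-site layout (sites "A" and "B" with two nodes each, one arbitrator vote) that reproduces the same pattern of outcomes. The function `surviving_sites` and the vote counts are assumptions for illustration.

```python
def surviving_sites(votes_by_site, partitions):
    """Return the sites that keep a strict majority of votes.

    votes_by_site: {site: number of votes, including any arbitrator there}
    partitions: groups of sites that can still reach each other
    """
    total = sum(votes_by_site.values())
    alive = []
    for group in partitions:
        if 2 * sum(votes_by_site[site] for site in group) > total:
            alive.extend(group)
    return sorted(alive)


# Layout 1: the arbitrator runs inside site A (2 nodes + 1 arbitrator).
layout1 = {"A": 3, "B": 2}
print(surviving_sites(layout1, [{"A"}, {"B"}]))   # WAN lost
print(surviving_sites(layout1, [{"B"}]))          # site A lost

# Layout 2: the arbitrator runs at a third location C.
layout2 = {"A": 2, "B": 2, "C": 1}
print(surviving_sites(layout2, [{"B", "C"}]))         # site A lost
print(surviving_sites(layout2, [{"A"}, {"B"}, {"C"}]))  # every link lost
```

In the first layout the site hosting the arbitrator survives a WAN failure but is a single point of failure; in the second layout any one site can be lost, but a complete loss of connectivity takes every site offline.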
  • 20-24. What does “Virtually synchronous” mean? In brief:
      • Writes are as fast as if they were local
      • Commits take just the time of a network roundtrip: if that is acceptable, the cluster can be spread geographically
    [Diagram sequence: Write, then Commit, then the write set (WS) is sent to the other nodes, then Commit OK]
  • 25-28. Optimistic locking leads to side effects
      Session on the first node:
          mysql> update emp set salary='peanuts' where name='Caldara';
          Query OK, 1 row affected (0.03 sec)
          Rows matched: 1 Changed: 1 Warnings: 0
      Session on the second node:
          mysql> update emp set salary='one billion' where name='Caldara';
          Query OK, 1 row affected (0.03 sec)
          Rows matched: 1 Changed: 1 Warnings: 0
      The second session commits first and its write set (WS) is sent to the other nodes:
          mysql> commit;
          Query OK, 0 rows affected (0.01 sec)
      The first session then fails at commit time:
          mysql> commit;
          ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
      Final result:
          mysql> select salary from emp where name='Caldara';
          +-------------+
          | salary      |
          +-------------+
          | one billion |
          +-------------+
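The behaviour in this example can be reproduced with a toy model of first-committer-wins certification. This is a sketch for intuition only, not Galera's actual algorithm: it keeps a single global commit counter and rejects a transaction if one of its rows was committed by someone else after the transaction started. The class name `ToyCluster` and the initial salary value are hypothetical.

```python
class ToyCluster:
    """Toy model of first-committer-wins certification (not Galera's code)."""

    def __init__(self, rows):
        self.rows = dict(rows)   # committed data, identical on every node
        self.seqno = 0           # global commit order
        self.last_commit = {}    # row key -> seqno of its last committed write

    def begin(self):
        # A transaction remembers the commit position it started from.
        return {"start": self.seqno, "writes": {}}

    def update(self, txn, key, value):
        # Local, uncommitted change: other nodes do not see it yet.
        txn["writes"][key] = value

    def commit(self, txn):
        # Certification: reject if another transaction committed one of
        # our rows after we started.
        for key in txn["writes"]:
            if self.last_commit.get(key, 0) > txn["start"]:
                return False  # the client would see the deadlock error
        self.seqno += 1
        for key, value in txn["writes"].items():
            self.rows[key] = value
            self.last_commit[key] = self.seqno
        return True


cluster = ToyCluster({"Caldara": "unknown"})
first = cluster.begin()
second = cluster.begin()
cluster.update(first, "Caldara", "peanuts")
cluster.update(second, "Caldara", "one billion")
print(cluster.commit(second))    # first to commit passes certification
print(cluster.commit(first))     # conflicting write set is rejected
print(cluster.rows["Caldara"])
```

Both updates succeed locally because no lock is shared between nodes; the conflict only becomes visible at commit time.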
  • 29. Conclusions on optimistic locking…
      • Locally, the first that acquires the lock wins (it’s InnoDB…)
      • Cluster-wide, the first that broadcasts its commit wins (it’s Galera…)
      • The application should not have hotspots...
      • … or it should retry the transaction after the deadlock occurs…
      • … or, for each database, you can elect one node as the master
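The "retry the transaction" advice can be sketched as a small wrapper. The example is driver-agnostic on purpose: `DeadlockError` is a hypothetical stand-in for whatever exception a real MySQL driver raises for error 1213, and `transaction` is assumed to be a callable that runs the whole unit of work, including the commit.

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver error carrying MySQL error 1213."""


def run_with_retry(transaction, attempts=3, delay=0.0):
    """Run transaction(); retry the whole unit of work after a deadlock."""
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except DeadlockError:
            if attempt == attempts:
                raise                      # give up and report the conflict
            time.sleep(delay * attempt)    # simple linear back-off


# Hypothetical unit of work that loses certification twice, then succeeds.
calls = {"count": 0}

def give_raise():
    calls["count"] += 1
    if calls["count"] < 3:
        raise DeadlockError("1213: try restarting transaction")
    return "committed"

print(run_with_retry(give_raise))
```

Retrying only makes sense if the callable re-reads its inputs on each attempt, since the row that caused the conflict has changed in the meantime.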
  • 30. About performance
      • Commit performance loss is between 5% and 10%, plus the network RTT
      • Write workloads scale up to 8 nodes
      • >8 nodes: it scales reads, not writes
      • Many benchmarks show that Galera outperforms NDB with few nodes
      • NDB scales out better with many nodes thanks to data sharding
      • Benchmarks on the internet are not always reliable… test the performance of YOUR application
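The first bullet can be turned into back-of-the-envelope arithmetic. This is one possible reading of the slide, not a measured model: it assumes commit latency is the local commit time increased by 5% to 10%, plus one network round trip. The local commit time and RTT values below are hypothetical.

```python
def estimated_commit_ms(local_commit_ms, rtt_ms, overhead=0.10):
    """Estimate commit latency: local cost plus overhead plus one RTT."""
    if not 0.0 <= overhead <= 1.0:
        raise ValueError("overhead is a fraction, e.g. 0.05 for 5%")
    return local_commit_ms * (1.0 + overhead) + rtt_ms


def max_sequential_commits_per_second(commit_ms):
    """Upper bound for one connection that commits one transaction at a time."""
    return 1000.0 / commit_ms


# Hypothetical numbers: 2 ms local commit, LAN versus WAN round trips.
for label, rtt in [("LAN", 0.2), ("WAN", 20.0)]:
    ms = estimated_commit_ms(local_commit_ms=2.0, rtt_ms=rtt)
    rate = max_sequential_commits_per_second(ms)
    print(f"{label}: ~{ms:.1f} ms per commit, ~{rate:.0f} commits/s per connection")
```

The bound applies per connection; many connections committing in parallel overlap their round trips, which is why aggregate throughput can stay high even over a WAN.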
  • 31. How to migrate
      • Convert all your tables to InnoDB
      • Double-check that all tables have primary keys
      • Think about potential problems caused by triggers (if you have any)
      • Create a new empty Galera Cluster
      • Set up MySQL native replication between the old database and the Galera cluster
      • Once all is aligned, direct your clients to the new cluster
      • Set up the old node to join the cluster
    [Diagram: NATIVE REPLICATION from the old server to the cluster, then the old server JOINs]
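The first two migration steps can be prepared from the data dictionary. The sketch below is an assumption-laden illustration: the query string targets `information_schema` and is not executed here, and the helper `migration_todo` (a hypothetical name) works on rows of `(schema, table, engine, has_primary_key)` however they were fetched.

```python
# Query to run on the old server (shown for reference, not executed here).
TABLE_AUDIT_SQL = """
SELECT t.table_schema, t.table_name, t.engine,
       COUNT(c.constraint_name) > 0 AS has_pk
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
       ON c.table_schema = t.table_schema
      AND c.table_name = t.table_name
      AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema')
GROUP BY t.table_schema, t.table_name, t.engine
"""


def migration_todo(rows):
    """List the work needed before a table can be replicated by Galera."""
    todo = []
    for schema, table, engine, has_pk in rows:
        name = f"{schema}.{table}"
        if (engine or "").lower() != "innodb":
            todo.append(f"ALTER TABLE {name} ENGINE=InnoDB")
        if not has_pk:
            todo.append(f"{name}: add a primary key")
    return todo


# Hypothetical result rows, for illustration only.
rows = [
    ("hr", "emp", "InnoDB", 1),
    ("hr", "audit_log", "MyISAM", 0),
]
for item in migration_todo(rows):
    print(item)
```

The primary-key task is left as a note rather than generated SQL, because choosing the key column is a design decision the script cannot make.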
  • 32. Load balancing
      • HAProxy is the most used solution so far
      • Codership is actively developing its own load balancer: Galera Load Balancer (glbd)
          • Several balancing modes: round robin, custom, least connected, …
          • Automatically drains disconnected nodes
          • New nodes can be added with a single tcp call
          • Release 1.0 (now rc1) will support a watchdog and automatic discovery of the nodes composing the cluster
      • Other methods are possible (e.g. java connector properties, HW load balancer)
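Two of the balancing modes named on this slide, round robin and least connected, can be illustrated with a short sketch. It is not how HAProxy or glbd are implemented; the class `Balancer` and the node names are hypothetical, and node health is set by hand rather than by a real check.

```python
import itertools


class Balancer:
    """Minimal node picker with round-robin and least-connected modes."""

    def __init__(self, nodes):
        self.connections = {node: 0 for node in nodes}
        self.healthy = set(nodes)
        self._cycle = itertools.cycle(list(nodes))

    def mark_down(self, node):
        # A drained node no longer receives new connections.
        self.healthy.discard(node)

    def round_robin(self):
        # Try each node at most once per call, skipping drained ones.
        for _ in range(len(self.connections)):
            node = next(self._cycle)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy node available")

    def least_connected(self):
        candidates = [n for n in self.connections if n in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy node available")
        return min(candidates, key=lambda n: self.connections[n])


lb = Balancer(["node1", "node2", "node3"])
print([lb.round_robin() for _ in range(4)])
lb.mark_down("node2")
print(lb.round_robin())
lb.connections.update({"node1": 5, "node3": 1})
print(lb.least_connected())
```

Because every Galera node accepts writes, either mode is usable; least connected tends to even out load when some clients hold long-running sessions.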
  • 33. Conclusions on Galera Cluster
      • Pros:
          • Multi-master
          • Shared-nothing
          • Great performance and scalability
          • «Virtually» synchronous
          • It uses InnoDB!!
          • Conflict prevention
          • Split-brain (no inconsistencies)
          • Easy to add/remove nodes
      • Cons:
          • At least 3 nodes to have good HA
          • Optimistic locking (side effects)
          • Explicit locking doesn’t work
          • Only InnoDB is replicated
          • Primary keys are mandatory
          • Not yet available for MySQL 5.6
          • Linux only
  • 34. Links
      • http://www.slideshare.net/skysql/galera-cluster-by-seppo-jaakola-codership-at-skysql-roadshow-instuttgart-2013
      • http://www.codership.com/files/presentations/Galera_Replication_PLL_2011.pdf
      • http://www.mysqlperformanceblog.com/2013/01/31/feature-in-details-incremental-state-transfer-after-anode-crash-in-percona-xtradb-cluster/
      • http://www.percona.tv/percona-webinars/migrating-to-percona-xtradb-cluster
      • http://www.codership.com/content/5-tips-migrating-your-mysql-server-galera-cluster
      • http://www.mysqlperformanceblog.com/2012/08/17/percona-xtradb-cluster-multi-node-writing-andunexpected-deadlocks/
      • http://www.mysqlperformanceblog.com/2012/11/20/understanding-multi-node-writing-conflict-metrics-inpercona-xtradb-cluster-and-galera/
      • http://www.mysqlperformanceblog.com/2011/10/13/benchmarking-galera-replication-overhead/
      • http://karlssonondatabases.blogspot.ch/2012/12/galera-features-beyond-just-ha.html
      • http://infoscience.epfl.ch/record/52305/files/IC_TECH_REPORT_199908.pdf
      • http://www.inf.usi.ch/faculty/pedone/Paper/2005/2005WDIDDR.pdf
  • 35. Little demo?
  • 36. ?
  • 37. THANK YOU.
    Trivadis SA
    Ludovico Caldara, Senior Consultant
    Ludovico.caldara@trivadis.com
    www.trivadis.com