Galera Cluster

TechEvent

Synchronous Multi-Master
Replication for MySQL HA

April 2013

Ludovico CALDARA
LS-IMS
27.04.2013

BASEL

BERN

LAUSANNE

ZÜRICH

DÜSSELDORF

FRANKFURT A.M.

FREIBURG I.BR.

2012 © Trivadis
Galera Cluster Synchronous Multi-Master Replication for MySQL HA
27.04.2013

HAMBURG

MÜNCHEN

STUTTGART

WIEN
MySQL forks: which one is better?

MySQL
• Oracle MySQL
• New forks:
  • Percona Server
  • MariaDB
  • Drizzle

• Many new features
• Improved instrumentation
• New solutions for DEVs and DBAs
• Fast-paced competition between forks’ developers
• Recent evolutions in HA and scalability have made MySQL enterprise ready

There is no recipe that can satisfy all tastes

Feature                    Percona Server    MariaDB           MySQL
Multi-source replication   NO                YES (rel. 10)     NO
NoSQL integration          YES (Cassandra)   YES (Cassandra)   YES (memcached)
Virtual columns            NO                YES               NO
Improved diagnostics       YES               NO                NO
Online DDL                 NO                YES               YES
Galera Cluster             YES               YES               YES (Codership patch)
Many many others           YES/NO            YES/NO            YES/NO

Your real requirements will let you choose… Need HA?
• How will your customers react to a significant loss of service?

Old-school solutions have weaknesses
Native MySQL Replication
• Doesn’t scale writes
• Complex to promote slaves
MySQL Multi-Master Replication
• Complex and not reliable
• Concurrent writes lead to logical corruption
DRBD Replication
• Standby is offline, doesn’t scale at all
• Poor performance
MySQL Cluster
• Very complex
• It’s not InnoDB!

New school solutions: 3rd parties are playing a decisive role
Continuent Tungsten Replicator
• Similar to Oracle GoldenGate
• Heterogeneous databases
• Provides complex topologies
• Asynchronous
• Conflicts are complex to resolve
• Complex to maintain
• Not free

Galera Cluster Replication
• Transparent multi-master, easy to maintain
• (Virtually) Synchronous
• It’s InnoDB (only InnoDB)
• Great and easy scalability
• Optimistic locking (side effects)
• At least 3 nodes for good HA

Multi-Master and virtually synchronous: it’s transparent

[Figure: cluster nodes, each labelled R/W (read/write)]

Cluster implementation - Ingredients
• One or more standalone servers (either physical or virtual)
• Linux (other operating systems are not yet available)
• “Permissive” Firewall between nodes
• Codership’s Galera Library package
• A package of your choice:
  • Percona XtraDB Cluster
  • MariaDB Galera Cluster
  • MySQL with the wsrep patch (patched by Codership)

Cluster implementation - Variables

• Each server’s my.cnf must contain:
  wsrep_cluster_address=gcomm://192.168.1.100,…,192.168.1.10x
  wsrep_provider=/usr/lib64/libgalera_smm.so
  binlog_format=ROW
  default_storage_engine=InnoDB
  innodb_autoinc_lock_mode=2
  innodb_locks_unsafe_for_binlog=1   # disables gap locking

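The settings listed on this slide could be checked before a node is started. The following is a minimal sketch in Python, not part of the original talk: the function name check_galera_settings and the sample addresses in the usage note are assumptions for illustration, and the parser used here does not understand every my.cnf feature (for example !include directives or dash-separated option names).

```python
import configparser

# Values the slide lists as mandatory. The cluster address and the provider
# path are site-specific, so they are only checked for presence/prefix.
REQUIRED = {
    "binlog_format": "ROW",
    "default_storage_engine": "InnoDB",
    "innodb_autoinc_lock_mode": "2",
    "innodb_locks_unsafe_for_binlog": "1",
}


def check_galera_settings(cnf_text):
    """Return a list of problems found in the [mysqld] section of cnf_text."""
    parser = configparser.ConfigParser(
        allow_no_value=True,            # my.cnf allows bare options
        interpolation=None,             # do not treat % specially
        inline_comment_prefixes=("#",)  # accept "value #comment"
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return ["missing [mysqld] section"]

    section = parser["mysqld"]
    problems = []
    for key, expected in REQUIRED.items():
        actual = section.get(key)
        if actual is None:
            problems.append(f"{key} is not set")
        elif actual.lower() != expected.lower():
            problems.append(f"{key}={actual}, expected {expected}")

    address = section.get("wsrep_cluster_address") or ""
    if not address.startswith("gcomm://"):
        problems.append("wsrep_cluster_address must start with gcomm://")
    if not section.get("wsrep_provider"):
        problems.append("wsrep_provider is not set")
    return problems
```

Calling check_galera_settings on the text of a configuration file returns an empty list when every setting from the slide is present, and one message per missing or unexpected value otherwise.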
Cluster implementation – Start the cluster

mysqld_safe --wsrep_cluster_address=gcomm:// &
[…]
130220 17:56:46 [Note] WSREP: Starting new group from scratch:
[…]

The empty gcomm:// address starts the node as the first member of a new cluster.
NEVER USE IT TO JOIN AN EXISTING CLUSTER.

Cluster implementation – Adding nodes to the cluster

mysqld_safe --wsrep_cluster_address=gcomm://host1,host2… &
[…]
130220 18:01:56 [Note] WSREP: Shifting OPEN -> PRIMARY (TO:…)
130220 18:01:56 [Note] WSREP: State transfer required:
[…]

The address should already be present in my.cnf!

State Snapshot Transfer (SST)
• The joiner asks for an SST
• The cluster chooses a donor; the donor is taken offline
• The donor is backed up
• The donor comes online again and the joiner is loaded
• The joiner replays the missing transactions and joins the cluster
• The cluster can also do Incremental State Transfers (IST)
[Figure: cluster nodes labelled R/W, with one node marked DONOR while the joiner is loaded]

Split-Brain
• The majority of nodes wins
• Complete loss of network: all nodes go offline
• The offline nodes will respond:
mysql> select * from emp;
ERROR 1047 (08S01): Unknown command
• Galera arbitrator (garbd) can join the cluster and count as a member in split-brain resolution
• NEW: Galera 2.4 introduces weighted quorum

Example 1: Arbitrator in Trivadis Switzerland
… sorry for German/Austrian attendees ☺
[Figure: sites BASEL, ZURICH, BERN and LAUSANNE connected over a WAN, with an arbitrator]
• If the WAN connection is lost, Zurich survives
• If the Zurich site is lost, the cluster goes offline

Example 2: Arbitrator in Trivadis Switzerland
… sorry for German/Austrian attendees ☺
[Figure: sites BASEL, ZURICH, BERN and LAUSANNE connected over a WAN, with an arbitrator]
• If the Zurich site is lost, the other sites survive
• If the WAN connection is lost, the cluster goes offline

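The "majority wins" rule and the effect of an arbitrator can be illustrated with a small calculation. This is a simplified sketch in Python under stated assumptions: the function has_quorum, the node names, and the weights are hypothetical, and the real quorum protocol tracks cluster membership in more detail than this model does.

```python
def has_quorum(partition, weights):
    """Return True if `partition` holds a strict majority of the total weight.

    partition: iterable of node names that can still see each other
    weights:   dict mapping every cluster member to its voting weight
    """
    total = sum(weights.values())
    reachable = sum(weights[node] for node in partition)
    # A strict majority is required: exactly half is not enough.
    return 2 * reachable > total


# Hypothetical two-site cluster without an arbitrator: after a network
# split neither side has a majority, so both sides stop serving.
two_nodes = {"site_a": 1, "site_b": 1}

# Same cluster with an arbitrator that stays reachable from site_a.
with_arbitrator = {"site_a": 1, "site_b": 1, "garbd": 1}

# Weighted quorum: site_a carries more weight and survives on its own.
weighted = {"site_a": 2, "site_b": 1}
```

With these inputs, has_quorum(["site_a"], two_nodes) is false, has_quorum(["site_a", "garbd"], with_arbitrator) is true, and has_quorum(["site_a"], weighted) is true, which matches the behaviour described in the split-brain slides.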
What does “Virtually synchronous” mean? In brief:
• Writes are as fast as if they were local
• Commits take just the time of a network round trip: if that is acceptable, the cluster can be spread geographically
[Figure: sequence Write, Commit, write set (WS) sent to the other nodes, Commit OK]

Optimistic locking leads to side effects
Two sessions on different nodes update the same row, then both try to commit.

Session 1:
mysql> update emp set salary='peanuts' where name='Caldara';
Query OK, 1 row affected (0.03 sec)
Rows matched: 1 Changed: 1 Warnings: 0

Session 2:
mysql> update emp set salary='one billion' where name='Caldara';
Query OK, 1 row affected (0.03 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> commit;
Query OK, 0 rows affected (0.01 sec)

Session 1:
mysql> commit;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
mysql> select salary from emp where name='Caldara';
+-------------+
| salary      |
+-------------+
| one billion |
+-------------+

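The outcome above (the first commit to be broadcast wins, the other is rejected) can be reproduced with a toy model. The sketch below is in Python and is only an illustration of the first-committer-wins idea: the ToyCluster class and its fields are hypothetical and do not reflect how Galera is actually implemented.

```python
class ToyCluster:
    """Toy model of optimistic, commit-time conflict detection."""

    def __init__(self):
        self.seqno = 0        # global commit order
        self.last_write = {}  # key -> seqno of the last committed write
        self.data = {}        # committed values

    def begin(self):
        # A transaction remembers the commit order it started from and
        # buffers its writes locally; nothing is sent to other nodes yet.
        return {"start": self.seqno, "writes": {}}

    def commit(self, txn):
        # The transaction fails if any key it wrote was committed by
        # someone else after this transaction began.
        for key in txn["writes"]:
            if self.last_write.get(key, 0) > txn["start"]:
                return False
        self.seqno += 1
        for key, value in txn["writes"].items():
            self.data[key] = value
            self.last_write[key] = self.seqno
        return True


cluster = ToyCluster()
session1 = cluster.begin()
session2 = cluster.begin()
session1["writes"]["Caldara"] = "peanuts"      # succeeds locally
session2["writes"]["Caldara"] = "one billion"  # also succeeds locally
second_ok = cluster.commit(session2)  # first to commit: accepted
first_ok = cluster.commit(session1)   # conflicts with session 2: rejected
```

In this model neither update blocks, because no lock is shared between the sessions; the conflict only surfaces when the second commit arrives.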
Conclusions on optimistic locking…
• Locally, the first transaction that acquires the lock wins (it’s InnoDB…)
• Cluster-wide, the first that broadcasts its commit wins (it’s Galera…)
• The application should not have hotspots...
• … or it should retry the transaction after the deadlock occurs…
• … or, for each database, you can elect one node as the master

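Retrying the transaction after a deadlock error, as suggested above, could look like the following Python sketch. It is deliberately independent of any MySQL driver: DeadlockError is a hypothetical stand-in for whatever exception a real driver raises for error 1213, and run_with_retry, max_attempts and base_delay are names chosen for this example.

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver error with MySQL code 1213."""


def run_with_retry(transaction, max_attempts=3, base_delay=0.05):
    """Run `transaction` (a callable) and retry it when it deadlocks.

    The callable must execute the whole transaction, from its first
    statement to the commit, because a failed commit rolls everything back.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and let the caller handle it
            # Wait a little longer after each failure to reduce contention.
            time.sleep(base_delay * attempt)
```

The important design point is that the retry wraps the entire unit of work, not just the commit statement: after error 1213 the earlier updates of that transaction are gone and must be issued again.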
About performance
• Commit performance loss is between 5% and 10%, plus the network RTT
• Write workloads scale up to 8 nodes
• >8 nodes: reads scale, writes do not
• Many benchmarks show that Galera outperforms NDB with few nodes
• NDB scales out further with many nodes thanks to data sharding
• Benchmarks on the internet are not always reliable… test the performance of YOUR application

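The first bullet can be turned into a back-of-the-envelope estimate. The Python sketch below assumes one possible reading of the slide, namely that a commit costs the local commit time increased by 5% to 10%, plus one network round trip; the function names and all numbers are hypothetical, and only a test against a real workload gives reliable figures.

```python
def estimated_commit_ms(local_commit_ms, rtt_ms, overhead=0.10):
    """Estimate the commit time on a cluster node, in milliseconds.

    local_commit_ms: commit time measured on a standalone server
    rtt_ms:          network round-trip time between the nodes
    overhead:        relative replication overhead (0.05 to 0.10 on the slide)
    """
    return local_commit_ms * (1.0 + overhead) + rtt_ms


def max_serial_commits_per_second(commit_ms):
    """Upper bound on commits per second for a single connection that
    commits one transaction after another."""
    return 1000.0 / commit_ms


# Hypothetical figures: a 2 ms local commit on a LAN and over a WAN.
lan_ms = estimated_commit_ms(2.0, rtt_ms=0.5, overhead=0.05)
wan_ms = estimated_commit_ms(2.0, rtt_ms=10.0, overhead=0.10)
```

Under these assumptions the round-trip time dominates as soon as the nodes are far apart, which is why the per-connection commit rate, rather than total throughput, is the figure to watch in a geographically spread cluster.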
How to migrate
• Convert all your tables to InnoDB
• Double-check that all tables have primary keys
• Think about potential problems caused by triggers (if you have any)
• Create a new empty Galera Cluster
• Set up MySQL native replication between the old database and the Galera cluster
• Once everything is aligned, point your clients to the new cluster
• Set up the old node to join the cluster
[Figure: the old database feeds the new cluster through NATIVE REPLICATION, then JOINs it]

Load balancing
• HAProxy is the most widely used solution so far
• Codership is actively developing its own load balancer: Galera Load Balancer (glbd)
  • Several balancing modes: round robin, custom, least connected, …
  • Automatically drains disconnected nodes
  • New nodes can be added with a single TCP call
  • Release 1.0 (now rc1) will support a watchdog and automatic discovery of the nodes composing the cluster
• Other methods are possible (e.g. Java connector properties, hardware load balancer)

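Two of the balancing modes named above, round robin and least connected, together with draining a node, can be sketched in a few lines of Python. This is an illustrative model only: ToyBalancer and its methods are hypothetical names and say nothing about how glbd or HAProxy are implemented.

```python
class ToyBalancer:
    """Toy connection balancer over a list of cluster nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.active = {node: 0 for node in self.nodes}  # open connections
        self._next = 0  # position of the next round-robin pick

    def _require_nodes(self):
        if not self.nodes:
            raise RuntimeError("no nodes available")

    def round_robin(self):
        """Pick nodes in turn, regardless of their current load."""
        self._require_nodes()
        node = self.nodes[self._next % len(self.nodes)]
        self._next += 1
        self.active[node] += 1
        return node

    def least_connected(self):
        """Pick the node with the fewest open connections."""
        self._require_nodes()
        node = min(self.nodes, key=lambda n: self.active[n])
        self.active[node] += 1
        return node

    def release(self, node):
        """Record that a connection to `node` was closed."""
        if self.active.get(node, 0) > 0:
            self.active[node] -= 1

    def drain(self, node):
        """Stop sending new connections to a disconnected node."""
        self.nodes.remove(node)
        del self.active[node]
```

Round robin is the simplest choice when all nodes are equal, while least connected adapts better when some connections are long-lived; in both cases a drained node simply stops receiving new connections.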
Conclusions on Galera Cluster
Strengths:
• Multi-master
• Shared-nothing
• Great performance and scalability
• «Virtually» synchronous
• It uses InnoDB!!
• Conflict prevention
• Split-brain handling (no inconsistencies)
• Easy to add/remove nodes

Limitations:
• At least 3 nodes to have good HA
• Optimistic locking (side effects)
• Explicit locking doesn’t work
• Only InnoDB is replicated
• Primary keys are mandatory
• Not yet available for MySQL 5.6
• Linux only

Links
http://www.slideshare.net/skysql/galera-cluster-by-seppo-jaakola-codership-at-skysql-roadshow-instuttgart-2013
http://www.codership.com/files/presentations/Galera_Replication_PLL_2011.pdf
http://www.mysqlperformanceblog.com/2013/01/31/feature-in-details-incremental-state-transfer-after-anode-crash-in-percona-xtradb-cluster/
http://www.percona.tv/percona-webinars/migrating-to-percona-xtradb-cluster
http://www.codership.com/content/5-tips-migrating-your-mysql-server-galera-cluster
http://www.mysqlperformanceblog.com/2012/08/17/percona-xtradb-cluster-multi-node-writing-andunexpected-deadlocks/
http://www.mysqlperformanceblog.com/2012/11/20/understanding-multi-node-writing-conflict-metrics-inpercona-xtradb-cluster-and-galera/
http://www.mysqlperformanceblog.com/2011/10/13/benchmarking-galera-replication-overhead/
http://karlssonondatabases.blogspot.ch/2012/12/galera-features-beyond-just-ha.html
http://infoscience.epfl.ch/record/52305/files/IC_TECH_REPORT_199908.pdf
http://www.inf.usi.ch/faculty/pedone/Paper/2005/2005WDIDDR.pdf

Little demo?

Questions?

Trivadis SA

THANK YOU.

Ludovico Caldara
Senior Consultant

Ludovico.caldara@trivadis.com
www.trivadis.com
