Migrating to Galera Cluster
Seppo Jaakola
Codership
www.codership.com
Agenda
● Migrating to Galera Replication
● Differences in MySQL Features
● Supported engines
● Tables with no primary key
● Auto Increment Handling
● DDL processing
● Events, triggers...
● Huge transactions
● LOAD DATA processing
● Multi-master Conflicts
● Locking Sessions
● Online Migration
Supported Features
Galera Cluster is close to the native MySQL/InnoDB look & feel
However, there are some differences in behavior, and some
limitations on what can be done in Galera Cluster
This presentation goes through these limitations and gives
sanity checks and best practices for the migration
process
Supported Storage Engines
The first and foremost limitation is that only the InnoDB storage
engine is replicated
However, Galera also has limited MyISAM support
● enabled through the 'wsrep_replicate_myisam' configuration variable
● Low performance
● Not for non-deterministic writes: no timestamps, no rands
● Works for simple, low load writes
Transactions on unsupported storage engines are not
replicated; data modifications remain node local
All DDL is replicated regardless of the target engine
InnoDB tables
Find out which storage engines are in use, e.g.:
mysql> select table_schema, table_name, engine
from information_schema.tables
where engine != 'InnoDB' and
table_schema not in ('mysql', 'performance_schema', 'information_schema');
If you have non-InnoDB tables, figure out whether migration to
InnoDB is possible
If you must keep e.g. MyISAM table(s), find out whether their
use case is supported by Galera Cluster
Note that:
– even though MyISAM is not replicated by default, SST will still copy all tables
– All DDL is replicated regardless of the affected table type
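If migration to InnoDB turns out to be possible, the result of the query above can be turned into conversion statements. The following is a minimal sketch, not part of the original slides: the function name alter_statements and the sample schema and table names are made up for illustration, and the generated statements should be reviewed before running them against a real server.

```python
def alter_statements(rows):
    """Build one ALTER TABLE ... ENGINE=InnoDB statement per (schema, table) row."""
    statements = []
    for schema, table in rows:
        # Backtick-quote both identifiers; an embedded backtick is doubled.
        quoted = ".".join(
            "`" + name.replace("`", "``") + "`" for name in (schema, table)
        )
        statements.append(f"ALTER TABLE {quoted} ENGINE=InnoDB;")
    return statements


if __name__ == "__main__":
    # Hypothetical rows, as they could come back from the information_schema query.
    for statement in alter_statements([("shop", "orders"), ("shop", "audit_log")]):
        print(statement)
```

The sketch only builds text. Whether a given table can really be converted (full-text indexes, table size, locking during the ALTER) still has to be checked case by case.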
Tables with no Primary Key
● Galera uses ROW based replication
● Applying ROW events on a slave is not optimal for such tables: InnoDB may need
to fall back to a full table scan to locate the target rows
● Nevertheless, it is safe to use tables without primary keys,
even in a multi-master topology
● For certification, Galera generates MD5 pseudo keys from
the full row
[Figure: an insert of the row (name=seppo, city=helsinki, title=geek) produces a write set with seqno n, containing the binlog events and the MD5 pseudo key 6b4ca6868e422208a0190a5a1c57a246]
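The idea of an MD5 pseudo key computed from the full row can be sketched as follows. This is an illustration only: the slides do not say how Galera serializes a row before hashing, so the length-prefixed encoding here is an assumption and the digest will not match the key shown on the slide.

```python
import hashlib


def pseudo_key(row):
    """Return an MD5 hex digest computed over all column values of a row."""
    digest = hashlib.md5()
    for value in row:
        data = str(value).encode("utf-8")
        # Assumed serialization: prefix each value with its length so that
        # ("ab", "c") and ("a", "bc") hash differently.
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return digest.hexdigest()


if __name__ == "__main__":
    print(pseudo_key(("seppo", "helsinki", "geek")))
```

Because the key depends on every column, two rows with identical content map to the same pseudo key.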
Finding Tables with no PK
It makes sense to optimize the schema design and assign a
primary key to every table
Note that you save nothing by not defining a PK: InnoDB will
create an internal 6-byte primary key for such tables anyway,
you just cannot use that column for anything
The query below lists tables that have neither a primary key nor a unique key:
select
t.table_schema, t.table_name, t.engine
from
information_schema.tables t
inner join information_schema.columns c
on t.table_schema = c.table_schema and
t.table_name = c.table_name
where
t.table_type = 'BASE TABLE' and
t.table_schema not in ('mysql', 'performance_schema', 'information_schema')
group by
t.table_schema, t.table_name, t.engine
having
sum(if(c.column_key in ('PRI','UNI'), 1, 0)) = 0;
http://stackoverflow.com/questions/7233703/how-do-i-find-out-which-tables-have-no-indexes-in-mysql
Auto Increments
MySQL has auto increment control variables for guaranteeing
interleaved sequences across the cluster nodes
– auto_increment_increment – the step between successive autoinc values
– auto_increment_offset – where the autoinc sequence starts
By default, Galera manages the autoincrement variables
automatically: wsrep_auto_increment_control=ON. Galera
will set the increment to the number of nodes in the cluster,
and give each node a distinct offset in 1..n:
Node-1: 1, 4, 7 ...
Node-2: 2, 5, 8 ...
Node-3: 3, 6, 9 ...
Note that the autoinc sequence will contain holes when inserts
hit different nodes at random.
Only innodb_autoinc_lock_mode=2 is supported
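The interleaving can be reproduced with a few lines of Python. This is a sketch of the arithmetic only, assuming the increment equals the number of nodes and each node gets its own offset starting from 1; the function names are made up for illustration.

```python
from itertools import count, islice


def node_sequence(offset, increment):
    """Yield the autoinc values one node generates: offset, offset + increment, ..."""
    return count(start=offset, step=increment)


def cluster_sequences(n_nodes, length):
    """Return the first `length` autoinc values for each node of an n-node cluster."""
    return {
        f"Node-{i}": list(islice(node_sequence(offset=i, increment=n_nodes), length))
        for i in range(1, n_nodes + 1)
    }


if __name__ == "__main__":
    for node, values in cluster_sequences(3, 3).items():
        print(node, values)
```

No value can be generated by two nodes, which is why concurrent inserts on different nodes do not collide on the autoinc column. A node that receives no inserts simply leaves its values unused, which is where the holes come from.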
DDL – Schema Changes
Galera requires special attention when running schema
changes.
Basically, the alternatives are to run the DDL in the whole
cluster at once (wsrep_OSU_method=TOI, the default) or as a
rolling upgrade, node by node (wsrep_OSU_method=RSU)
See more details in the previous Severalnines Galera webinar
Events, Triggers, Stored Procedures
Events, triggers, views, prepared statements and stored
procedures are supported
Triggers fire only on the master node, and only the
results of trigger execution, if any, are replicated
Events fire on every node
– Make sure the end result is what was planned
Foreign keys (even cascading) are supported
Huge Transactions
ROW based replication replicates every modified row.
If a transaction modifies a large number of rows, it may result in a
huge write set for Galera to replicate.
Problems with Huge Transactions:
– The write set grows big and can cause memory issues
– The transaction is vulnerable to multi-master conflicts
– Slave side applying will take a long time
Galera has two limits for transaction size
– wsrep_max_ws_rows – not enforced at the moment
– wsrep_max_ws_size – enforced, max limit 2G
– Transactions that are too big roll back on the master node
LOAD DATA
LOAD DATA can cause very big transactions
To support arbitrarily long LOAD DATA sessions, it is
possible to split a LOAD DATA session into a series of small
INSERT transactions (10K inserts per transaction).
Configure with: wsrep_load_data_splitting = ON | OFF
Note that each batch will commit and replicate
independently. If LOAD DATA is interrupted or rolled back on the
master node, all earlier committed 10K insert batches will
remain in effect.
Clean up with TRUNCATE if need be
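The consequence of independent batch commits can be illustrated with a small simulation. This is a sketch of the batching idea only, not of Galera's implementation: the function names, the commit callback and the fail_after_rows parameter are hypothetical and exist just to show what remains after an interrupted load.

```python
from itertools import islice

BATCH_SIZE = 10_000  # rows per transaction, as stated on the slide


def split_into_batches(rows, batch_size=BATCH_SIZE):
    """Yield consecutive lists of at most `batch_size` rows."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def load(rows, commit, fail_after_rows=None):
    """Pass rows to `commit` one batch at a time; optionally simulate an interruption."""
    seen = 0
    for batch in split_into_batches(rows):
        if fail_after_rows is not None and seen + len(batch) > fail_after_rows:
            raise RuntimeError("load interrupted")
        commit(batch)  # each batch is committed on its own
        seen += len(batch)


if __name__ == "__main__":
    committed = []
    try:
        load(range(25_000), committed.append, fail_after_rows=22_000)
    except RuntimeError:
        pass
    print([len(batch) for batch in committed])
```

In this simulation the first two batches stay committed even though the load as a whole failed, which is the situation where a manual cleanup is needed.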
Multi-Master Replication
[Diagram: two topologies on top of Galera Replication. Multi-Master: node1, node2 and node3 each serve reads & writes. Master / Slave: one node takes the writes, the other nodes serve reads]
Multi-Master Conflicts
You can use Galera Cluster in either a Master/Slave or a
Multi-Master topology
In a multi-master topology, multi-master conflicts
may happen, and some transactions then fail with a deadlock error
code
Even a transaction issuing COMMIT may be aborted with a
deadlock error
Make sure your application can deal with deadlock errors;
the correct action is just to retry, with better luck
wsrep_retry_autocommit may help to hide deadlock errors
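On the application side, the retry can be a small wrapper around the whole transaction. The following is a minimal sketch under stated assumptions: DatabaseError is a hypothetical stand-in for whatever exception your MySQL driver raises, and run_with_retry is a made-up helper name. Only the error number 1213 (ER_LOCK_DEADLOCK) is the real MySQL deadlock code.

```python
import time

ER_LOCK_DEADLOCK = 1213  # MySQL error code reported for a deadlock


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver's error type; carries the MySQL errno."""

    def __init__(self, errno, message=""):
        super().__init__(message)
        self.errno = errno


def run_with_retry(transaction, max_attempts=5, backoff_seconds=0.0):
    """Call `transaction()` and retry it when it fails with a deadlock error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            # Only deadlocks are retried; re-raise anything else, and give up
            # once the last attempt has failed too.
            if exc.errno != ER_LOCK_DEADLOCK or attempt == max_attempts:
                raise
            time.sleep(backoff_seconds * attempt)


if __name__ == "__main__":
    attempts = []

    def flaky_transaction():
        attempts.append(1)
        if len(attempts) < 3:
            raise DatabaseError(ER_LOCK_DEADLOCK, "deadlock")
        return "committed"

    print(run_with_retry(flaky_transaction), "after", len(attempts), "attempts")
```

The callable should cover the complete transaction, from its first statement to COMMIT, because the deadlock error may be returned by the COMMIT itself.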
Multi-Master Conflict
[Diagram: two nodes each hold a copy of table t1. A client on each node runs UPDATE t1 where id=1... at the same time, and each node sends its write set (WS) through Galera Replication]
Multi-Master Conflict
[Diagram: the outcome of the conflict on t1. One client's UPDATE returns OK, the other client's UPDATE fails with DEADLOCK]
Multi-Master Conflicts
Learn about multi-master conflicts by enabling logging:
– wsrep_log_conflicts
– wsrep_provider_options: cert.log_conflicts=1
For recovering, wsrep_retry_autocommit may help to hide
deadlock errors
Latency Effects
Galera replicates at commit time, and this adds some
delay to commit processing
The delay depends on cluster topology, networking and
SQL load profile
Per-connection transaction throughput is lower; you may
see performance degradation if the application uses just a few
database connections
But accumulated over all connections, the cluster
performance is high
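A back-of-the-envelope calculation shows why the number of connections matters. The sketch below assumes a deliberately simple model that is not taken from the slides: every connection commits one transaction at a time, and commit latency is the only cost. The latency figures are made-up examples, and the result is an upper bound rather than a prediction.

```python
def max_commits_per_second(commit_latency_ms, connections=1):
    """Upper bound on the commit rate when each connection commits serially."""
    if commit_latency_ms <= 0:
        raise ValueError("commit latency must be positive")
    return 1000 * connections / commit_latency_ms


if __name__ == "__main__":
    # Hypothetical 2 ms commit latency, with one and with 64 connections.
    print(max_commits_per_second(2))
    print(max_commits_per_second(2, connections=64))
```

With a single connection the commit latency caps the throughput directly; with many connections the same latency is paid in parallel, so the aggregate rate can stay high.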
Long Lasting Transactions
A multi-statement transaction that takes a long time to
process may be vulnerable to multi-master conflicts
even if it does not modify many rows, simply because of
its long lifetime.
MySQL Replication
Galera Cluster is compatible with MySQL replication
– A Galera cluster can operate as a MySQL slave
– A Galera cluster can operate as a master for a MySQL slave
MySQL 5.6 and MariaDB 10 GTID make it very simple to
manage MySQL master failover in Galera Cluster
MySQL replication provides an effective migration path from
MySQL to Galera Cluster
MySQL Replication
[Diagram: a three-node cluster joined by Galera Replication, combined with MySQL replication. Node1 and Node2 act as MySQL slaves (external servers MySQL master1 and MySQL master2 are shown as masters), and Node3 acts as MySQL master for an external MySQL slave]
Miscellaneous
Query Cache is supported with the latest Galera releases
binlog_format must be set to ROW; STATEMENT and MIXED
are currently not supported
Locking sessions (LOCK TABLES ... UNLOCK TABLES) are not
supported
– A locking session will work locally, but in a multi-master topology,
replication may break the locks
Lock functions get_lock() and release_lock() are not supported
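Some of these requirements can be checked mechanically before migration. The sketch below is an illustration, not an official tool: it assumes the server variables have already been read into a plain dictionary (for example from SHOW VARIABLES output), and the helper name check_settings is made up. It covers only binlog_format and the autoinc lock mode, under its MySQL variable name innodb_autoinc_lock_mode.

```python
# Settings named in this presentation as required for Galera Cluster.
REQUIRED_SETTINGS = {
    "binlog_format": "ROW",
    "innodb_autoinc_lock_mode": "2",
}


def check_settings(variables):
    """Return a list of problems found in a {variable_name: value} mapping."""
    problems = []
    for name, expected in REQUIRED_SETTINGS.items():
        actual = variables.get(name)
        if actual is None:
            problems.append(f"{name} is not set (expected {expected})")
        elif str(actual).upper() != expected:
            problems.append(f"{name}={actual} (expected {expected})")
    return problems


if __name__ == "__main__":
    # Hypothetical values from a server that still uses MIXED binlog format.
    for problem in check_settings({"binlog_format": "MIXED", "innodb_autoinc_lock_mode": 2}):
        print(problem)
```

An empty result means only that these two settings match. Use of LOCK TABLES or get_lock() lives in application code and has to be searched for separately.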
Part II
Online Migration Scenario + Demo

