- Galera is a MySQL clustering solution that offers true multi-master, synchronous replication with no single point of failure.
- It provides high availability, data integrity, and elastic scaling of databases across multiple nodes.
- Companies like Percona and MariaDB have integrated Galera to provide highly available database clusters.
Slide 2: Created by Codership Oy
- Our founders participated in three MySQL cluster developments since 2003.
- Galera work started in 2007, based on the PhD thesis of Fernando Pedone.
- Version 1.0 was released in 2011; Percona and MariaDB adopted it in 2012.
- Galera is free and open source. Support and consulting are available from Codership and partners.
Slide 3: Galera in a nutshell
True multi-master:
- Read & write to any node
- Synchronous replication: no slave lag, no integrity issues
- No master-slave failovers or VIP needed
- Multi-threaded slave, no performance penalty
- Automatic node provisioning
Elastic:
- Easy scale-out & scale-in; all nodes read-write
[Diagram: three nodes, each acting as master]
Slide 4: Sysbench disk bound (20GB data / 6GB InnoDB buffer), tps
- EC2 with local disk. Note: pretty poor I/O here.
- Blue vs red: innodb_flush_log_at_trx_commit, > 66% improvement
- Scale-out factors: 2N = 0.5 x 1N; 4N = 0.5 x 2N
http://codership.com/content/scaling-out-oltp-load-amazon-ec2-revisited
Slide 5: Galera vs other HA solutions
Galera is like...
- MySQL replication without integrity issues or slave lag
- DRBD/SAN without failover downtime and performance penalty
- Oracle RAC without failover downtime
- NDB, but you get to keep InnoDB
[Chart: HA solutions plotted by failover downtime (slow to fast) and data integrity (poor to solid, 99% to 99.999...%); compares Galera, NDB, MySQL replication, DRBD, SAN, RAC, and backups]
Slide 6: Active-Active DB = best with Load Balancer
HAProxy, GLB, Cisco, F5...
Pictured: load balancer on each app server
- No single point of failure
- One less layer of network components
- PHP and JDBC drivers provide this built-in:
  jdbc:mysql:loadbalance://10.0.0.1,10.0.0.2,10.0.0.3/<database>?loadBalanceBlacklistTimeout=5000
Or: separate HW or SW load balancer
- Centralized administration
- But what if the LB fails?
[Diagram: two load balancers in front of a three-node MySQL cluster]
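As a sketch of the separate software load balancer option, a minimal HAProxy TCP section for a three-node cluster could look like the following. The IPs, port, and balancing policy are illustrative assumptions, not from the original deck:

```
listen galera
    bind 0.0.0.0:3306
    mode tcp
    balance leastconn
    server node1 10.0.0.1:3306 check
    server node2 10.0.0.2:3306 check
    server node3 10.0.0.3:3306 check
```

Note that a plain `check` on a TCP listener only verifies that the port accepts connections; production setups usually add a deeper health check that inspects the node's wsrep status.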
Slide 8: Quorum
Galera uses quorum-based failure handling:
- When cluster partitioning is detected, the majority partition "has quorum" and can continue
- A minority partition cannot commit transactions, but will attempt to re-connect to the primary partition
- Note: 50% is not a majority! => Minimum 3 nodes recommended.
The load balancer will notice errors and remove the node from its pool.
[Diagram: two load balancers in front of a three-node MySQL cluster]
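A node's quorum state can be inspected from the wsrep status variables, for example:

```sql
-- 'Primary' = this node is in the partition that has quorum;
-- 'non-Primary' = quorum lost, commits are refused.
SHOW STATUS LIKE 'wsrep_cluster_status';

-- How many nodes this node currently sees in the cluster.
SHOW STATUS LIKE 'wsrep_cluster_size';
```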
Slide 9: WAN replication
- Works fine
- Use higher timeouts and send windows
- No impact on reads
- No impact within a transaction; adds 100-300 ms to commit latency
- No major impact on tps
Quorum between data centers:
- 3 data centers
- Distribute nodes evenly
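The "higher timeouts and send windows" advice maps to wsrep_provider_options in my.cnf. The sketch below uses illustrative values only; tune them to your actual WAN round-trip times:

```ini
# my.cnf sketch for WAN replication; all values are illustrative assumptions
[mysqld]
wsrep_provider_options="evs.keepalive_period=PT3S;evs.suspect_timeout=PT30S;evs.inactive_timeout=PT1M;evs.send_window=512;evs.user_send_window=512"
```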
Slide 10: WAN with MySQL asynchronous replication
- You can mix Galera replication and MySQL replication
- Good option on a poor WAN
- Remember to watch out for slave lag, etc.
- "Channel failover" if a master node crashes
- Mixed replication is useful when you want an async slave (such as time-delayed, filtered, or multi-source)
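For the mixed setup, the Galera node that feeds an asynchronous slave must write a binary log. A minimal my.cnf sketch under that assumption (values illustrative):

```ini
# On the Galera node acting as async master (illustrative values)
[mysqld]
server_id         = 1          # must be unique across all nodes
log_bin           = mysqld-bin
log_slave_updates = ON         # also binlog events received via replication
```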
Slide 13: Migration checklist
- Are your tables InnoDB?
- Make sure all tables have a primary key
- Watch out for triggers and events
Tip: Don't make too many changes at once. Migrate to InnoDB first, run a month in production, then migrate to Galera.
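The first two checklist items can be verified against information_schema; the queries below are a sketch that skips the system schemas:

```sql
-- Tables not stored in InnoDB
SELECT table_schema, table_name, engine
FROM information_schema.tables
WHERE engine <> 'InnoDB'
  AND table_type = 'BASE TABLE'
  AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');

-- Tables without a PRIMARY KEY
SELECT t.table_schema, t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.statistics s
  ON  s.table_schema = t.table_schema
  AND s.table_name   = t.table_name
  AND s.index_name   = 'PRIMARY'
WHERE s.index_name IS NULL
  AND t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');
```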
Slide 14: A MySQL Galera cluster is...
[Diagram: a MySQL server with InnoDB and MyISAM storage engines, exposing the replication API (wsrep API), connected to the other MySQL nodes through the Galera group communication library]
- SHOW STATUS LIKE "wsrep%"
- SHOW VARIABLES ...
Snapshot State Transfer methods: mysqldump, rsync, xtrabackup, etc.
http://www.codership.com/downloads/download-mysqlgalera
Slide 15: Understanding the transaction sequence in Galera
[Diagram: transaction flow between a master node and a slave node]
- On the master, BEGIN, SELECT, UPDATE, COMMIT run as a normal user transaction.
- At COMMIT, the writeset goes through group communication, which assigns it a GTID, and is then certified on every node.
- Master: if certification passes, InnoDB commits and the COMMIT returns; if it fails, the transaction is rolled back.
- Slave: certification is deterministic, so a writeset that passes is applied and committed (after a small commit delay); one that fails is discarded.
- Virtual synchrony = committed events are written to InnoDB after a small delay.
- Optimistic locking between nodes = risk for deadlocks.
Slide 16: What if I only have 2 nodes?
Galera Arbitrator (garbd):
- Acts as a 3rd node in the cluster but doesn't store the data
- Run it on an app server, or on any other available server
- Note: Do not run a 3rd node in a VM on the same hypervisor as the other Galera nodes. (Why?)
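A garbd invocation for this setup might look like the following; the addresses and group name are illustrative assumptions:

```shell
# Joins the cluster as an arbitrator: votes in quorum decisions and
# relays replication traffic, but stores no data itself.
garbd --address "gcomm://10.0.0.1:4567,10.0.0.2:4567" \
      --group my_galera_cluster \
      --daemon
```

The --group value must match the wsrep_cluster_name configured on the two data nodes.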
Master-slave clustering:
- Pacemaker, Heartbeat, etc. Manual failover?
- Still better than MySQL replication or DRBD: hot standby, multi-threaded slave...
- Prioritize data integrity: set global wsrep_on=0 (at failover)
- Prioritize failover speed: pc.ignore_quorum=on (at startup)
Slide 17: Optimistic locking cluster-wide
...theoretical chance of deadlocks
- In most cases less than 1 out of 10,000 trx
- Correct solution: catch exceptions in the app and retry
- Design: avoid hot spots in tables
- Workaround: directing all writes (or all problematic writes) to a single node brings back 100% InnoDB compatibility
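The "catch exceptions in the app and retry" advice can be sketched as a small wrapper. `DeadlockError` is an illustrative stand-in for whatever exception your driver raises on MySQL error 1213 (a deadlock, or a Galera certification conflict), and the transaction body is hypothetical:

```python
# Sketch of the retry-on-deadlock pattern. DeadlockError stands in for
# the driver exception raised on MySQL error 1213.

class DeadlockError(Exception):
    pass

def with_retries(txn, max_attempts=3):
    """Run txn(); retry it if it aborts with a deadlock."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DeadlockError:
            if attempt == max_attempts:
                raise
            # A real app would sleep briefly (with jitter) before retrying.

# Example: a transaction that hits one certification conflict, then succeeds.
attempts = []
def flaky_txn():
    attempts.append(1)
    if len(attempts) < 2:
        raise DeadlockError("certification conflict")
    return "committed"

print(with_retries(flaky_txn))  # committed
```

Because the chance of a conflict is so small, a low retry cap is enough; transactions that keep failing should surface the error to the caller.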
Slide 18: Snapshot options
SST = full snapshot:
- mysqldump & rsync will block the donor
  - Dedicate 1 node to act as donor
- xtrabackup is a non-blocking option
- Really big databases:
  - wsrep_sst_method=skip + manual backup & restore
  - wsrep_sst_method=fedex :-)
IST = Incremental State Transfer:
- Logic: IST is preferred over SST
- gcache.size <= DB size
- gcache.size >= (rate of wsrep_replicated_bytes) * <outage duration>
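The IST sizing rule is simple arithmetic: the gcache must hold at least as many writeset bytes as the cluster replicates during the longest outage you want IST (rather than a full SST) to cover. A back-of-the-envelope sketch with assumed numbers:

```python
# All numbers below are illustrative assumptions, not measurements.
replication_rate = 2 * 1024 * 1024      # ~2 MB/s of writesets (bytes/s)
outage_seconds = 60 * 60                # tolerate a one-hour node outage

# gcache.size must be at least rate * duration for IST to cover the gap
gcache_bytes = replication_rate * outage_seconds
print(gcache_bytes // (1024 ** 2), "MB")  # 7200 MB
```

In practice you would derive the rate from the growth of the wsrep_replicated_bytes counter on your own workload.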
Slide 20: Sysbench disk bound (20GB data / 6GB buffer), tps
- EC2 with local disk. Note: pretty poor I/O here.
- Blue vs red: turning off innodb_flush_log_at_trx_commit gives a 66% improvement
- Scale-out factors: 2N = 0.5 x 1N; 4N = 0.5 x 2N
- The 5th node hit an EC2 weakness. A later test scaled a little more, up to 8 nodes.
http://codership.com/content/scaling-out-oltp-load-amazon-ec2-revisited
Slide 21: Sysbench disk bound (20GB data / 6GB buffer), latency
- As before
- Not syncing InnoDB decreases latency
- Scale-out decreases latency
- Galera does not add latency overhead
http://codership.com/content/scaling-out-oltp-load-amazon-ec2-revisited
Slide 22: Galera and NDB shootout: sysbench "out of the box"
Galera is 4x better. Ok, so what does this really mean?
That Galera is better...
- For this workload
- With default settings (Severalnines)
- Pretty user friendly and general purpose
NDB:
- Excels at key-value and heavy-write workloads (which sysbench is not)
- Would benefit here from PARTITION BY RANGE
http://codership.com/content/whats-difference-kenneth
Slide 23: Drupal on Galera: baseline with a single server
- Drupal, Apache, PHP, MySQL 5.1
- JMeter: 3 types of users (poster, commenter, reader); Gaussian (15, 7) think time
- Large EC2 instance
- Ideal scalability: linear until tipping point at 140-180 users
  - Constrained by Apache/PHP CPU utilization
  - Could scale out by adding more Apache in front of a single MySQL
http://codership.com/content/scaling-drupal-stack-galera-part-2-mystery-failed-login
Slide 24: Drupal on Galera: scale-out with 1-4 Galera nodes (tps)
- Drupal, Apache, PHP, MySQL 5.1 with Galera
- 1-4 identical nodes: whole stack clustered, MySQL connection to localhost
- Multiply number of users: 180, 360, 540, 720
- 3 nodes = linear scalability; 4 nodes still near-linear
- Minimal latency overhead
http://codership.com/content/scaling-drupal-stack-galera-part-2-mystery-failed-login
Slide 25: Drupal on Galera: scale-out with 1-4 Galera nodes (latency)
- Like before, but a constant number of users: 180, 180, 180, 180
- Scaling from 1 to 2 nodes drastically reduces latency; tps is back to linear scalability
- Scaling to 3 and 4 nodes: no more tps, as there was no bottleneck; slightly better latency
- Note: no overhead from additional nodes!
http://codership.com/content/scaling-drupal-stack-galera-part-2-mystery-failed-login
Slide 26: WAN replication, EC2 eu-west + us-east, tps
- Client in eu-west, DB in us-east
http://codership.com/content/synchronous-replication-loves-you-again
Slide 27: WAN replication, EC2 eu-west + us-east, latency
- Client in eu-west, DB in us-east
http://codership.com/content/synchronous-replication-loves-you-again
Slide 28: Conclusion: WAN only adds commit latency, which is usually ok
Measured commit latencies:
- EU-west <-> US-east: 90 ms ("best case")
- EU <-> JPN: 275 ms
- EU <-> JPN <-> USA: 295 ms
You can choose where the latency sits:
- between user and web server = ok
- between web server and db = bad
- between db and db = great!
http://codership.com/content/synchronous-replication-loves-you-again
http://www.mysqlperformanceblog.com/2012/01/11/making-the-impossible-3-nodes-intercontinental-replication/