Galera Cluster - Node Recovery - Webinar slides


Slides for the webinar held on January 21st 2014

Repair & Recovery for your MySQL, MariaDB & MongoDB / TokuMX Clusters

Galera Cluster, NDB Cluster, VIP with HAProxy and Keepalived, MongoDB Sharded Cluster, etc. all have their own availability models. We are aware of these availability models and will demonstrate in this webinar how to take corrective action in case of failures via our cluster management tool, ClusterControl.

In this webinar, Severalnines CTO Johan Andersson will show you how to leverage ClusterControl to detect failures in your database cluster and automatically repair them to maximize the availability of your database services. And Codership CEO Seppo Jaakola will be joining Johan to provide a deep-dive into Galera recovery internals.


Redundancy models for Galera, NDB and MongoDB/TokuMX
Failover & Recovery (Automatic vs Manual)
Zooming into Galera recovery procedures
Split brains in multi-datacenter setups

Published in Technology , Travel


  • 1. Galera Cluster Node Recovery. Seppo Jaakola, Codership
  • 2. Agenda
    ● Node Recovery Scenarios
    ● Incremental State Transfer
    ● State Snapshot Transfer
    ● Full Cluster Recovery
    www.codership.com
  • 3. Node Recovery Scenarios
    Node drops from cluster gracefully and joins back
    ● Replication state is stored in the grastate.dat file
    ● Joining happens by Incremental State Transfer (IST)
    Joining after node crash
    ● Node has either a known or an unknown state
    ● Joining can happen by IST, or a full State Snapshot Transfer (SST) is required
    Full cluster recovery
    ● e.g. data center power down
    ● All nodes with known or unknown states
    ● The node with the latest known state must be identified
    ● New cluster needs to be bootstrapped
  • 4. Joining One Node to Cluster
  • 5. Automatic Node Joining (diagram: the joiner performs a cluster handshake with the MySQL nodes over Galera Replication)
  • 6. Automatic Node Joining (diagram: the cluster selects a donor to help the joiner to join; the donor sends state to the joiner by IST or SST)
  • 7. Automatic Node Joining (diagram: the joiner catches up by applying its slave queue)
  • 8. Automatic Node Joining (diagram: three MySQL nodes on Galera Replication)
  • 9. Incremental State Transfer
  • 10. Incremental State Transfer
    ● Every node in Galera Cluster has a log of replicated write sets: gcache
    ● Gcache is an mmap file; available disk space is the upper limit for size allocation
    ● If the joining node has past history in the cluster and the donor has a long enough gcache containing the joiner's seqno position, then IST can be used for synchronization
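The IST condition on slide 10 can be sketched as a small check. This is a simplified model for illustration, not Galera's actual implementation: the `GcacheRange` type and `can_use_ist` function are hypothetical names, and the sketch assumes the donor's gcache holds one contiguous range of seqnos.

```python
from dataclasses import dataclass

UNDEFINED_SEQNO = -1  # seqno stored in grastate.dat when the position is unknown


@dataclass
class GcacheRange:
    """Hypothetical view of the write sets a donor still holds in gcache."""
    group_uuid: str
    first_seqno: int  # oldest cached write set
    last_seqno: int   # newest cached write set


def can_use_ist(joiner_uuid: str, joiner_seqno: int, donor: GcacheRange) -> bool:
    """Return True if the donor's gcache covers everything the joiner missed."""
    if joiner_uuid != donor.group_uuid:
        return False  # joiner has no history in this cluster
    if joiner_seqno == UNDEFINED_SEQNO:
        return False  # joiner position is unknown
    # The joiner needs every write set after its own position, so the
    # first missing seqno must still be in the cache (or nothing is missing).
    first_missing = joiner_seqno + 1
    return donor.first_seqno <= first_missing <= donor.last_seqno + 1
```

When the check fails, the joiner would need a full SST instead.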
  • 11. Incremental State Transfer (diagram: the joiner, Node-1, reads GTID Group ID:seqno-n from grastate.dat and sends a request to join; the donor, Node-n, is at seqno-n+m; both nodes have a gcache)
  • 12. Incremental State Transfer (diagram: the donor, Node-n, sends IST events from its gcache; the joiner, Node-1, applies them from seqno-n up to seqno-n+m)
  • 13. Incremental State Transfer
    ● Node synchronization by IST is very effective and the least intrusive method for the donor
    ● The gcache.size parameter defines how big a cache will be maintained
    ● Use database size and write rate to optimize gcache:
      ➢ gcache < database size
      ➢ Write rate defines how long a tail is available in the cache
    ● If the joiner node had crashed and IST was used to synchronize it back, it is essential that InnoDB recovery works (innodb_doublewrite)
    ● If IST is not possible, the donor will switch automatically to the SST method
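The relation between write rate and the length of the gcache tail can be estimated with simple arithmetic. The sketch below is a rough estimate only: the function names are hypothetical, and it assumes a steady write rate and ignores any per-write-set overhead in the cache.

```python
def gcache_window_seconds(gcache_size_bytes: int, write_rate_bytes_per_sec: float) -> float:
    """Estimate how many seconds of replicated writes fit in the gcache."""
    if write_rate_bytes_per_sec <= 0:
        raise ValueError("write rate must be positive")
    return gcache_size_bytes / write_rate_bytes_per_sec


def gcache_size_for_window(window_seconds: float, write_rate_bytes_per_sec: float) -> int:
    """Estimate the gcache size needed to keep a node eligible for IST
    after being away for window_seconds."""
    if window_seconds < 0 or write_rate_bytes_per_sec < 0:
        raise ValueError("window and write rate must be non-negative")
    return int(window_seconds * write_rate_bytes_per_sec)


# Example: a 1 GiB cache with 1 MiB/s of replicated writes
window = gcache_window_seconds(1024**3, 1024**2)  # 1024.0 seconds
```

A node that stays down longer than this window would fall outside the cached tail and need SST.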
  • 14. State Snapshot Transfer
  • 15. SST Request (diagram: the joiner MySQL node sends an SST Request over Galera Replication)
    ● wsrep_sst_method
  • 16. SST Method (diagram: MySQL donor and joiner on Galera Replication, with the scripts wsrep_sst_mysqldump, wsrep_sst_rsync and wsrep_sst_xtrabackup)
  • 17. State Snapshot Transfer
    ● To send full database state
    ● wsrep_sst_method to choose the method:
      ➢ mysqldump
      ➢ rsync
      ➢ xtrabackup
    ● Open API for creating new SST methods
    ● All SST methods cause at least some service break in the donor node
    ● If a node has crashed, InnoDB recovery will happen during startup. But with SST, this InnoDB recovery is more or less useless
  • 18. Full Cluster Recovery
  • 19. Full Cluster Recovery
    All nodes dropped from cluster:
    1. Find the node which has the latest changes
    2. Bootstrap a new cluster from the latest node
  • 20. Node With Latest Changes
    Check the grastate.dat files:
    1. File has valid seqno
       # GALERA saved state
       version: 2.1
       uuid: 5ee99582-bb8d-11e2-b8e3-23de375c1d30
       seqno: 8204503945773
       ● Graceful shutdown
       ● Find the node which has the biggest seqno
    2. No seqno, but group ID is there
       # GALERA saved state
       version: 2.1
       uuid: 5ee99582-bb8d-11e2-b8e3-23de375c1d30
       seqno: -1
       ● Crash during transaction processing
       ● Use --wsrep-recover to dig out the last seqno
    3. No seqno, no group ID
       # GALERA saved state
       version: 2.1
       uuid: 00000000-0000-0000-0000-000000000000
       seqno: -1
       ● Crash during DDL
    http://www.codership.com/wiki/doku.php?id=mysql_galera_restart
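The three grastate.dat cases on slide 20 can be told apart with a short parser. This is a minimal sketch that assumes the file has exactly the `key: value` layout shown on the slide; `parse_grastate` and `classify_state` are hypothetical helper names, not part of Galera.

```python
NIL_UUID = "00000000-0000-0000-0000-000000000000"


def parse_grastate(text: str) -> dict:
    """Parse the 'key: value' lines of a grastate.dat file, skipping comments."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def classify_state(fields: dict) -> str:
    """Map the parsed fields to one of the three cases on the slide."""
    uuid = fields.get("uuid", NIL_UUID)
    seqno = int(fields.get("seqno", "-1"))
    if uuid == NIL_UUID:
        return "no_seqno_no_group_id"   # case 3: crash during DDL
    if seqno == -1:
        return "needs_wsrep_recover"    # case 2: crash during transaction processing
    return "valid_seqno"                # case 1: graceful shutdown
```

For the first example on the slide, `classify_state(parse_grastate(...))` yields `"valid_seqno"`, and that node's seqno can be compared directly with the other nodes.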
  • 21. --wsrep-recover
    MySQL stores the last committed GTID in the InnoDB data header, transactionally
    ● This GTID can be read by starting mysqld with the --wsrep-recover option
    ● <path to bin>/mysqld --defaults-file=<path to my.cnf> --wsrep-recover
    ● mysqld will read the InnoDB header files and shut down immediately
    ● The last wsrep position is printed in the MySQL error file:
      130514 18:39:13 [Note] WSREP: Recovered position: 5ee99582-bb8d-11e2-b8e3-23de375c1d30:8204503945771
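Pulling the recovered position out of the error file can be scripted. The sketch below assumes the log line has the format shown on slide 21; `recovered_position` is a hypothetical helper name.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "WSREP: Recovered position: <uuid>:<seqno>"
RECOVERED_RE = re.compile(
    r"WSREP: Recovered position:\s*([0-9a-fA-F-]{36}):(-?\d+)"
)


def recovered_position(log_text: str) -> Optional[Tuple[str, int]]:
    """Return (group uuid, seqno) from the last 'Recovered position' line,
    or None if the log contains no such line."""
    matches = RECOVERED_RE.findall(log_text)
    if not matches:
        return None
    uuid, seqno = matches[-1]  # the most recent recovery run wins
    return uuid, int(seqno)
```

The last match is used because an error file may contain output from several earlier recovery runs.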
  • 22. Bootstrapping New Cluster
    When the latest node has been identified, start this node as the first node in the cluster, using either:
    ● service mysql start --wsrep_new_cluster
    ● service mysql start --wsrep_cluster_address=gcomm://
    Start all other nodes. my.cnf should have wsrep_cluster_address pointing to all other nodes
    ● service mysql start
    Don't fire up all nodes at once; rather, start them one by one
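Choosing the bootstrap node and the start order can be sketched as follows. This is an illustration under stated assumptions: the host names are made up, `startup_order` is a hypothetical helper, and all positions are assumed to belong to the same group ID (nodes with an unknown position are given seqno -1).

```python
from typing import Dict, List


def startup_order(positions: Dict[str, int]) -> List[str]:
    """Given host -> last known seqno, return the hosts in start order:
    the node with the highest seqno first (it bootstraps the new cluster),
    then the remaining nodes, to be started one by one."""
    if not positions:
        raise ValueError("no nodes given")
    bootstrap = max(positions, key=positions.get)
    rest = sorted(host for host in positions if host != bootstrap)
    return [bootstrap] + rest


# Hypothetical three-node cluster after a data center power down
order = startup_order({
    "db1": 8204503945771,
    "db2": 8204503945773,
    "db3": -1,
})
# order[0] is the node to bootstrap; the others join it afterwards
```

Starting the remaining nodes sequentially matches the advice on the slide not to start all nodes at once.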
  • 23. Questions? Thank you for listening! Happy Clustering :-)