Plny12 galera-cluster-best-practices
 

Usage Rights

© All Rights Reserved

    Plny12 galera-cluster-best-practices Presentation Transcript

    • Galera Cluster Best Practices. Seppo Jaakola, Codership (www.codership.com)
    • Agenda
      ● Galera Cluster Short Introduction
      ● Multi-master Conflicts
      ● State Transfers (SST, IST)
      ● Backups
      ● Schema Upgrades
      ● Galera Project
    • Galera Cluster
    • Multi-Master Replication (diagram: MySQL nodes joined by Galera Replication)
      ● There can be several nodes
      ● Client can connect to any node
      ● Read & write access to any node
      ● Replication is synchronous
    • Multi-Master Replication: a multi-master cluster looks like one big database with multiple entry points
    • Galera Cluster
      ➢ Synchronous multi-master cluster
      ➢ For MySQL/InnoDB
      ➢ 3 or more nodes needed for HA
      ➢ Automatic node provisioning
      ➢ Works in LAN / WAN / Cloud
    • Synchronous Replication (diagram: three MySQL nodes on Galera Replication)
      1. Transaction is processed locally up to commit time
      2. At commit, the transaction is replicated to the whole cluster
      3. Client gets OK status
      4. Transaction is applied in slaves
    • Dealing with Multi-Master Conflicts
    • Multi-master Conflicts (diagram: three MySQL nodes on Galera Replication)
      1. Two clients write to different nodes
      2. Conflict detected
      3. One write gets OK, the other gets a deadlock error
    • Multi-Master Conflicts
      ● Galera uses optimistic concurrency control
      ● If two transactions modify the same row on different nodes at the same time, one of the transactions must abort
        ➔ Victim transaction will get a deadlock error
      ● Application should retry deadlocked transactions; however, not all applications have retry logic built in
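The application-side retry mentioned on this slide can be sketched as below. This is a minimal sketch, not the presenter's code: `DatabaseError` is a hypothetical stand-in for whatever exception your MySQL driver raises, and `run_with_retry` is an illustrative helper name. The only MySQL-specific fact used is error code 1213 (ER_LOCK_DEADLOCK).

```python
import time

ER_LOCK_DEADLOCK = 1213  # MySQL error code reported for a deadlock victim


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception that carries a MySQL errno."""

    def __init__(self, errno, message=""):
        super().__init__(message)
        self.errno = errno


def run_with_retry(transaction, max_attempts=3, backoff_seconds=0.0):
    """Run transaction() and retry it only when it fails with a deadlock error.

    transaction must be a callable that executes the WHOLE transaction
    (begin, statements, commit), so that a retry starts from scratch.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            # Re-raise anything that is not a deadlock, or the final failure.
            if exc.errno != ER_LOCK_DEADLOCK or attempt == max_attempts:
                raise
            time.sleep(backoff_seconds * attempt)
```

The callable has to replay every statement of the transaction, because the victim transaction has been rolled back as a whole.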
    • Database Hot-Spots
      ● Rows that many transactions want to write to simultaneously
      ● Patterns like queue or ID allocation can be hot-spots
    • Hot-Spots (diagram: several writes target one hot row)
    • Diagnosing Multi-Master Conflicts
      ● In the past, Galera did not log much information about cluster-wide conflicts
      ● With the wsrep_debug configuration, all conflicts (...and plenty of other information) will be logged
      ● The next release will add a new variable, wsrep_log_conflicts, which will cause each cluster conflict to be logged in the MySQL error log
      ● Monitor:
        ● wsrep_local_bf_aborts
        ● wsrep_local_cert_failures
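Both monitored counters are cumulative, so what matters is how fast they grow. The sketch below assumes you have already collected two samples of the status values as dictionaries (for example from SHOW STATUS output); `conflict_rates` and the sample numbers are made up for illustration.

```python
def conflict_rates(previous, current, interval_seconds):
    """Return per-second growth of the two conflict counters between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    counters = ("wsrep_local_bf_aborts", "wsrep_local_cert_failures")
    return {
        # Status values usually arrive as strings, hence the int() conversion.
        name: (int(current[name]) - int(previous[name])) / interval_seconds
        for name in counters
    }


# Hypothetical samples taken 60 seconds apart.
before = {"wsrep_local_bf_aborts": "10", "wsrep_local_cert_failures": "4"}
after = {"wsrep_local_bf_aborts": "16", "wsrep_local_cert_failures": "7"}
rates = conflict_rates(before, after, 60)
```

A steadily rising rate on either counter is a hint to look for a hot-spot.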
    • wsrep_retry_autocommit
      ● Galera can retry an autocommit transaction on behalf of the client application, inside the MySQL server
      ● MySQL will not return a deadlock error, but will silently retry the transaction
      ● wsrep_retry_autocommit=n will retry the transaction n times before giving up and returning a deadlock error
      ● Retrying applies only to autocommit transactions, as retrying is not safe for multi-statement transactions
    • Retry Autocommit (diagram: a write hits a conflict; 1. conflict detected, 2. retrying)
    • Multi-Master Conflicts
      1) Analyze the hot-spot
      2) Check if application logic can be changed to catch the deadlock exception and apply retry logic in the application
      3) Try if wsrep_retry_autocommit configuration helps
      4) Limit the number of master nodes, or change completely to a master-slave model
         ● If you can filter out the access to the hot-spot table, it is enough to treat writes to the hot-spot table only as master-slave
    • State Transfers
    • State Transfer
      ➢ Joining node needs to get the current database state
      ➢ Two choices:
        ➢ IST: incremental state transfer
        ➢ SST: full state transfer
      ➢ If the joining node had some previous state and gcache spans to that, then IST can be used
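The IST-or-SST choice described above can be modelled as a small decision function. This is a simplified sketch of the rule on the slide, not Galera's actual implementation; the function name, its parameters, and the use of a negative seqno to mean "no previous state" are assumptions made for illustration.

```python
def choose_state_transfer(joiner_uuid, joiner_seqno, donor_uuid,
                          gcache_first_seqno, donor_last_seqno):
    """Return "IST" if the donor's gcache covers what the joiner missed, else "SST"."""
    # State from a different cluster history cannot be patched incrementally.
    if joiner_uuid != donor_uuid:
        return "SST"
    # Assumption: a negative seqno means the joiner has no usable previous state.
    if joiner_seqno < 0:
        return "SST"
    # A joiner that claims to be ahead of the donor is treated as inconsistent.
    if joiner_seqno > donor_last_seqno:
        return "SST"
    # IST needs every write set after joiner_seqno to still be in the gcache.
    first_missing = joiner_seqno + 1
    return "IST" if first_missing >= gcache_first_seqno else "SST"
```

For example, a joiner at seqno 900 can use IST from a donor whose gcache starts at seqno 500, but a joiner at seqno 100 cannot.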
    • State Snapshot Transfer
      ● To send full database state
      ● wsrep_sst_method to choose the method:
        ➢ mysqldump
        ➢ rsync
        ➢ xtrabackup
    • SST Request (diagram: the joiner sends an SST request over Galera Replication, naming its wsrep_sst_method)
    • SST Method (diagram: the donor runs wsrep_sst_mysqldump, wsrep_sst_rsync or wsrep_sst_xtrabackup to send state to the joiner)
    • SST API
      ● SST is an open API for shell scripts
      ● Anyone can write a custom SST
      ● SST API can be used e.g. for:
        ● Backups
        ● Filtering out part of the database
    • wsrep_sst_mysqldump
      ● Logical backup
      ● Slowest method
      ● Configure authentication
        ➢ wsrep_sst_auth="root:rootpass"
        ➢ Super privilege needed
      ● Make sure the SST user on the donor node can take a mysqldump from the donor and load it over the network to the joiner node
      ● You can try this manually beforehand
    • wsrep_sst_rsync
      ● Physical backup
      ● Fast method
      ● Can only be used when the node is starting
        ➢ Rsyncing the data directory under a running InnoDB is not possible
    • wsrep_sst_xtrabackup
      ● Contributed by Percona
      ● Probably the fastest method
      ● Uses xtrabackup
      ● Least blocking on the donor side (a short read lock is still used when the backup starts)
    • SST Donor
      ● All SST methods cause some disturbance for the donor node
      ● By default the donor accepts client connections, although committing will be prohibited for a while
      ● If wsrep_sst_donor_rejects_queries is set, the donor gives an unknown command error to clients
      ➔ Best practice is to dedicate a reference node for donor and backup activities
    • Incremental State Transfer (diagram: donor and joiner, each with a gcache)
      1. Joiner sends a request to join, carrying its GTID (seqno-n)
      2. Donor sends IST events from its gcache, starting from seqno-n, and the joiner applies them
    • Incremental State Transfer
      ● Very effective
      ● gcache.size parameter defines how big a cache will be maintained
      ● Gcache is mmap; available disk space is the upper limit for size allocation
    • Incremental State Transfer
      ● Use database size and write rate to optimize gcache:
        ➢ gcache < database
        ➢ Write rate tells how long a tail will be stored in the cache
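The relation between write rate and the length of the cached tail is simple arithmetic, sketched below. The helper names are illustrative, and the estimate ignores any per-write-set overhead inside the gcache, so treat the result as a rough upper bound on how long a node can be away and still rejoin with IST.

```python
def gcache_window_seconds(gcache_size_bytes, write_rate_bytes_per_second):
    """Estimate how many seconds of replicated writes fit in the gcache."""
    if write_rate_bytes_per_second <= 0:
        raise ValueError("write rate must be positive")
    return gcache_size_bytes / write_rate_bytes_per_second


def gcache_size_for_window(window_seconds, write_rate_bytes_per_second):
    """Estimate the gcache size needed to cover a given outage window."""
    if window_seconds < 0 or write_rate_bytes_per_second < 0:
        raise ValueError("inputs must not be negative")
    return window_seconds * write_rate_bytes_per_second


GIB = 1024 ** 3
MIB = 1024 ** 2

# A 1 GiB gcache at a replicated write rate of 1 MiB/s covers about 1024 seconds.
window = gcache_window_seconds(1 * GIB, 1 * MIB)

# Covering a one-hour outage at the same rate needs about 3600 MiB.
needed = gcache_size_for_window(3600, 1 * MIB)
```

The write rate has to be measured on your own cluster, for example from the growth of the replicated byte counters over a representative period.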
    • Incremental State Transfer
      ● You can think of IST as a short asynchronous replication session
      ● If communication is of bad quality, a node can drop and join back fast with IST
    • Backups
    • Backups
      ➢ All Galera nodes are constantly up to date
      ➢ Best practices:
        ➢ Dedicate a reference node for backups
        ➢ Assign global trx ID with the backup
      ➢ Possible methods:
        1. Disconnecting a node for backup
        2. Using SST script interface
        3. xtrabackup
    • Backups with global Trx ID
      ➢ Global transaction ID (GTID) marks a position in the cluster transaction stream
      ➢ A backup with a known GTID makes it possible to utilize IST when joining new nodes, e.g. when:
        ➢ Recovering the node
        ➢ Provisioning new nodes
    • Backup by Disconnecting a Node (diagram: load balancer in front of three MySQL nodes on Galera Replication)
      1. Isolate the backup node
      2. Disconnect from group, e.g. clear wsrep_provider
      3. Work your backup magic
      4. Read global transaction ID from status and assign to backup: wsrep_cluster_uuid, wsrep_last_committed
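The last step, combining the cluster UUID and the last committed seqno into one GTID label for the backup, might look like the sketch below. It assumes the two values have already been read from the node's status; `format_gtid` is an illustrative helper, and the `uuid:seqno` text form is the usual way a Galera GTID is written.

```python
import uuid


def format_gtid(cluster_uuid, last_committed):
    """Build a "uuid:seqno" label from the two status values of the backup node."""
    # uuid.UUID raises ValueError on malformed input; str() gives the canonical form.
    canonical = str(uuid.UUID(cluster_uuid))
    seqno = int(last_committed)
    if seqno < 0:
        raise ValueError("a backup needs a defined (non-negative) seqno")
    return f"{canonical}:{seqno}"


# Hypothetical values, as they might be read from the status of the isolated node.
label = format_gtid("6B86B273-FF34-4BE1-8B3C-0A1D2F3E4C5D", "1234")
```

Storing this label next to the backup files is what later lets a node restored from the backup ask for IST instead of a full SST.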
    • Backup by SST
      ● Donor mode provides an isolated processing environment
      ● A special SST script can be written just to prepare a backup on the donor node: wsrep_sst_backup
      ● Garbd can be used to trigger the donor node to run wsrep_sst_backup
    • Backup by SST API (diagram: load balancer, node1, node2, node3 and garbd on Galera Replication)
      1. Launch garbd, which sends an SST request with wsrep_sst_donor=node3 and wsrep_sst_method=backup
      2. Donor launches wsrep_sst_backup
      3. wsrep_sst_backup prepares the backup, tagged with the GTID
      4. Backup node returns to cluster
    • Backup by xtrabackup
      ● Xtrabackup is a hot backup method and can be used anytime
      ● Simple, efficient
      ● Use the --galera-info option to get the global transaction ID logged into a separate galera info file
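Reading the GTID back from the galera info file could be done as in the sketch below. It assumes the file holds a single `uuid:seqno` line, which is the common layout but should be checked against your xtrabackup version; `parse_galera_info` is an illustrative helper that takes the file's text rather than opening a path.

```python
def parse_galera_info(text):
    """Split the text of a galera info file into (uuid, seqno)."""
    lines = text.strip().splitlines()
    if not lines:
        raise ValueError("galera info text is empty")
    # Split on the LAST colon so a negative seqno such as -1 still parses.
    uuid_part, separator, seqno_part = lines[0].strip().rpartition(":")
    if not separator or not uuid_part:
        raise ValueError("expected a line of the form uuid:seqno")
    return uuid_part, int(seqno_part)


# Hypothetical file content, for illustration only.
cluster_uuid, seqno = parse_galera_info("6b86b273-ff34-4be1-8b3c-0a1d2f3e4c5d:1234\n")
```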
    • Schema Upgrades
    • Schema Upgrades
      ● DDL is non-transactional, and therefore bad
      ● Galera has two methods for DDL:
        ● TOI, Total Order Isolation
        ● RSU, Rolling Schema Upgrade
      ● Use wsrep_osu_method to choose either option
    • Total Order Isolation
      ● DDL is replicated up-front
      ● Each node will get the DDL statement and must process the DDL at the same slot in the transaction stream
      ● Galera will isolate the affected table/database for the duration of DDL processing
    • Rolling Schema Upgrade
      ● DDL is not replicated
      ● Galera will take the node out of replication for the duration of DDL processing
      ● When the DDL is done, the node will catch up with missed transactions (like IST)
      ● DBA should roll the RSU operation over all nodes
      ● Requires backwards compatible schema changes
    • wsrep_on=OFF
      ● wsrep_on is a session variable telling if this session will be replicated or not
      ● I tried to hide this information as best I could, but somebody has leaked it
      ● And so, yes, it is possible to run "poor man's RSU" with wsrep_on set to OFF
        ● Such a session may be aborted by replication
      ● Use only if you are really sure that:
        ● Planned SQL is not conflicting
        ● SQL will not generate inconsistency
    • Schema Upgrades
      ● Best practices:
        ➔ Plan your upgrades
        ➔ Try to be backwards compatible
        ➔ Rehearse your upgrades
        ➔ Find out DDL execution time
        ➔ Go for RSU if possible
    • Consistent Reads
    • Consistent reads: replication is virtually synchronous... (diagram: at commit, the transaction is replicated to the whole cluster)
    • Consistent reads
      1. Insert into t1 values (1,....)
      2. Select from t1 where i=1
      Will the select see the inserted row?
    • Consistent Reads
      ● Aka read causality
      ● There is a causal dependency between operations on two database connections
      ● Application is expecting to see the values of an earlier write
    • Consistent Reads
      ● Use: wsrep_causal_reads=ON
        ➔ Every read (select, show) will wait until the slave queue has been fully applied
      ● There is a timeout for max causal read wait:
        ● replicator.causal_read_keepalive
    • Other Tidbits...
    • Parallel Applying
      ● Aka parallel replication
      ● “True parallel applying”
        ● Every application will benefit from it
        ● Works not on database, not on table, but on row level
      ● wsrep_slave_threads=n
      ● How many slaves make sense:
        ● Monitor wsrep_cert_deps_distance
        ● Max 2 * cores
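The two sizing hints on this slide can be combined into a rough starting value for wsrep_slave_threads, as sketched below. Reading wsrep_cert_deps_distance as "how many write sets could be applied in parallel" and rounding it to a thread count is an interpretation made for this example; `suggest_slave_threads` is an illustrative helper, not a Galera tool.

```python
def suggest_slave_threads(cert_deps_distance, cpu_cores):
    """Suggest a thread count: the observed parallelism, capped at 2 * cores."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    upper_limit = 2 * cpu_cores                 # "Max 2 * cores" from the slide
    wanted = max(1, round(cert_deps_distance))  # never suggest fewer than one thread
    return min(wanted, upper_limit)


# Hypothetical readings: a distance of 23.4 on an 8-core node is capped at 16,
# while a distance of 3.6 on the same node suggests 4 threads.
high_parallelism = suggest_slave_threads(23.4, 8)
low_parallelism = suggest_slave_threads(3.6, 8)
```

Treat the result as a first setting to try and keep watching the status variable under real load.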
    • MyISAM Replication
      ● On experimental level
      ● MyISAM is being phased out, so there is not much demand to complete this
      ● Replicates SQL up-front, like TOI
      ● Should be used in master-slave model
      ● No checks for non-deterministic SQL, e.g.:
        ● Insert into t (r, time) values (rand(), now());
    • SSL / TLS
      ● Replication over SSL is supported
      ● No authentication (yet), only encryption
      ● Whole cluster must use SSL
    • SSL or VPN
      ● Bundling several nodes through a VPN tunnel may cause a vulnerability
      ● When the VPN gateway breaks, a big part of the cluster will be blacked out
      ● Best practice is to go for SSL if the VPN does not have alternative routes
    • UDP Multicast
      ● Configure with gmcast.mcast_addr
      ● The full cluster must be configured either for multicast or for TCP sockets
      ● Multicast is good for scalability
      ● Best practice is to go for multicast if planning for large clusters
    • Galera Project
    • Galera Project
      ● Galera Cluster for MySQL
        ● 5 years development
        ● Based on MySQL server community edition
        ● Fully open source
        ● Active community
      ● ~3 releases per year
        ● Release 2.2 RC out yesterday
        ● Major release 3.0 in the works
      ● Galera Replication also used in:
        ● Percona XtraDB Cluster
        ● MariaDB Galera Cluster
    • Galera Project (diagram: Galera Cluster for MySQL, MariaDB Galera Cluster and Percona XtraDB Cluster, built on MySQL, MariaDB and Percona Server with merges between them; each connects through an API to the Galera Replication plugin)
    • Questions? Thank you for listening! Happy Clustering :-)