Plny12 galera-cluster-best-practices

  1. 1. Galera Cluster Best Practices. Seppo Jaakola, Codership
  2. 2. Agenda ● Galera Cluster Short Introduction ● Multi-master Conflicts ● State Transfers (SST, IST) ● Backups ● Schema Upgrades ● Galera Project www.codership.com
  3. 3. Galera Cluster
  4. 4. Multi-Master Replication (diagram: a MySQL node attached to Galera Replication)
  5. 5. Multi-Master Replication ● There can be several nodes (diagram: two MySQL nodes on Galera Replication)
  6. 6. Multi-Master Replication ● There can be several nodes (diagram: three MySQL nodes on Galera Replication)
  7. 7. Multi-Master Replication ● There can be several nodes ● Client can connect to any node
  8. 8. Multi-Master Replication ● There can be several nodes ● Client can connect to any node ● Read & write access to any node
  9. 9. Multi-Master Replication ● There can be several nodes ● Client can connect to any node ● Read & write access to any node ● Replication is synchronous
  10. 10. Multi-Master Replication ● Multi-master cluster looks like one big database with multiple entry points (diagram: read & write access through every entry point)
  11. 11. Galera Cluster ➢ Synchronous multi-master cluster ➢ For MySQL/InnoDB ➢ 3 or more nodes needed for HA ➢ Automatic node provisioning ➢ Works in LAN / WAN / Cloud
  12. 12. Synchronous Replication ● Transaction is processed locally up to commit time (diagram: a client reads and writes on one of three MySQL nodes)
  13. 13. Synchronous Replication ● Transaction is replicated to whole cluster (diagram: commit on one node, sent over Galera Replication)
  14. 14. Synchronous Replication ● Client gets OK status
  15. 15. Synchronous Replication ● Transaction is applied in slaves
  16. 16. Dealing with Multi-Master Conflicts
  17. 17. Multi-master Conflicts (diagram: two clients write to different MySQL nodes at the same time)
  18. 18. Multi-master Conflicts ● Conflict detected (diagram: the two writes meet in Galera Replication)
  19. 19. Multi-master Conflicts (diagram: one write gets OK, the other gets a deadlock error)
  20. 20. Multi-Master Conflicts ● Galera uses optimistic concurrency control ● If two transactions modify the same row on different nodes at the same time, one of the transactions must abort ➔ Victim transaction will get a deadlock error ● Application should retry deadlocked transactions; however, not all applications have retry logic built in
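The retry advice on this slide can be sketched in application code. This is a minimal sketch, not the presenter's code: `DeadlockError` is a hypothetical stand-in for whatever exception your database driver raises for MySQL error 1213 (deadlock), and `run_with_retry`, `max_attempts` and `base_delay` are names invented for illustration.

```python
import random
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the driver's deadlock exception (MySQL error 1213)."""


def run_with_retry(txn, max_attempts=3, base_delay=0.05, sleep=time.sleep):
    """Run txn() and replay it when it is aborted with a deadlock error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Back off briefly, with jitter, before replaying the whole transaction.
            sleep(base_delay * attempt * (1 + random.random()))
```

The callable passed as `txn` should contain the whole transaction (begin, statements, commit), because a victim transaction has been rolled back as a unit and must be replayed from the start.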
  21. 21. Database Hot-Spots ● Hot-spots are rows that many transactions want to write to simultaneously ● Patterns like queues or ID allocation can be hot-spots
  22. 22. Hot-Spots (diagram: three concurrent writes hit the same hot row)
  23. 23. Diagnosing Multi-Master Conflicts ● In the past Galera did not log much information about cluster-wide conflicts ● But with the wsrep_debug configuration, all conflicts (...and plenty of other information) will be logged ● Next release will add a new variable, wsrep_log_conflicts, which will cause each cluster conflict to be logged in the MySQL error log ● Monitor: ➢ wsrep_local_bf_aborts ➢ wsrep_local_cert_failures
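The two counters named on this slide are cumulative, so a single reading says little; what matters is how fast they grow. A minimal sketch, assuming you have already collected two snapshots of the status variables as dictionaries (for example from `SHOW GLOBAL STATUS LIKE 'wsrep_local%'`); the function name and the snapshot format are assumptions made for illustration.

```python
CONFLICT_COUNTERS = ("wsrep_local_bf_aborts", "wsrep_local_cert_failures")


def conflict_rates(before, after, interval_seconds):
    """Return conflicts per second for each counter between two status snapshots."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return {
        # Status values usually arrive as strings, so convert before subtracting.
        name: (int(after[name]) - int(before[name])) / interval_seconds
        for name in CONFLICT_COUNTERS
    }
```

A rate that climbs when write load is spread over more nodes is a hint that a hot-spot is involved.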
  24. 24. wsrep_retry_autocommit ● Galera can retry an autocommit transaction on behalf of the client application, inside the MySQL server ● MySQL will not return a deadlock error, but will silently retry the transaction ● wsrep_retry_autocommit=n will retry the transaction n times before giving up and returning a deadlock error ● Retrying applies only to autocommit transactions, as retrying is not safe for multi-statement transactions
  25. 25. Retry Autocommit (diagram: a write hits a conflict on one node; 1. conflict detected 2. retrying)
  26. 26. Multi-Master Conflicts 1) Analyze the hot-spot 2) Check if application logic can be changed to catch the deadlock exception and apply retry logic in the application 3) Try if the wsrep_retry_autocommit configuration helps 4) Limit the number of master nodes, or change completely to a master-slave model ➔ If you can filter out the access to the hot-spot table, it is enough to treat writes only to the hot-spot table as master-slave
  27. 27. State Transfers
  28. 28. State Transfer ➢ Joining node needs to get the current database state ➢ Two choices: ● IST: incremental state transfer ● SST: full state transfer ➢ If joining node had some previous state and gcache spans to that, then IST can be used
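The IST-or-SST rule on this slide can be written down as a small decision function. This is a simplified model for illustration only, not Galera's actual logic: the function and parameter names are invented, and it assumes that a seqno of -1 marks an undefined local state and that the donor can tell you the oldest seqno still held in its gcache.

```python
from typing import Optional


def choose_state_transfer(
    joiner_uuid: Optional[str],
    joiner_seqno: int,
    cluster_uuid: str,
    donor_gcache_first_seqno: int,
) -> str:
    """Return "IST" if the donor's gcache covers everything the joiner missed, else "SST"."""
    # No usable previous state: different history or undefined position.
    if joiner_uuid != cluster_uuid or joiner_seqno < 0:
        return "SST"
    # IST needs every write set after joiner_seqno to still be in the donor's gcache.
    if joiner_seqno + 1 >= donor_gcache_first_seqno:
        return "IST"
    return "SST"
```

For example, a joiner that stopped at seqno 100 can use IST from a donor whose gcache starts at seqno 50, while a joiner that stopped at seqno 10 needs a full SST.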
  29. 29. State Snapshot Transfer ● To send full database state ● wsrep_sst_method to choose the method: ➢ mysqldump ➢ rsync ➢ xtrabackup
  30. 30. SST Request ● wsrep_sst_method (diagram: the joiner MySQL node sends an SST request over Galera Replication)
  31. 31. SST Method (diagram: the donor runs wsrep_sst_mysqldump, wsrep_sst_rsync or wsrep_sst_xtrabackup towards the joiner)
  32. 32. SST API ● SST is an open API for shell scripts ● Anyone can write a custom SST ● SST API can be used e.g. for: ➢ Backups ➢ Filtering out part of database
  33. 33. wsrep_sst_mysqldump ● Logical backup ● Slowest method ● Configure authentication ➢ wsrep_sst_auth="root:rootpass" ➢ Super privilege needed ● Make sure SST user in donor node can take mysqldump from donor and load it over the network to joiner node ➢ You can try this manually beforehand
  34. 34. wsrep_sst_rsync ● Physical backup ● Fast method ● Can only be used when node is starting ➢ Rsyncing the data directory under a running InnoDB is not possible
  35. 35. wsrep_sst_xtrabackup ● Contributed by Percona ● Probably the fastest method ● Uses xtrabackup ● Least blocking on donor side (a short read lock is still used when backup starts)
  36. 36. SST Donor ● All SST methods cause some disturbance for donor node ● By default donor accepts client connections, although committing will be prohibited for a while ● If wsrep_sst_donor_rejects_queries is set, donor gives unknown command error to clients ➔ Best practice is to dedicate a reference node for donor and backup activities
  37. 37. Incremental State Transfer (diagram: the joiner, at seqno-n, sends a request to join with its GTID; the donor holds later write sets in its gcache)
  38. 38. Incremental State Transfer (diagram: the donor sends IST events from its gcache and the joiner applies them)
  39. 39. Incremental State Transfer ● Very effective ● gcache.size parameter defines how big a cache will be maintained ● Gcache is mmap, so available disk space is the upper limit for size allocation
  40. 40. Incremental State Transfer ● Use database size and write rate to optimize gcache: ➢ gcache < database ➢ Write rate tells how long a tail will be stored in cache
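The relation between write rate and the length of the tail kept in gcache is simple arithmetic. A minimal sketch, assuming the write rate is measured as bytes of replicated write sets per second and ignoring per-entry overhead in the cache; both function names are invented for illustration.

```python
def gcache_retention_seconds(gcache_size_bytes, write_rate_bytes_per_sec):
    """Estimate how many seconds of write history fit in a gcache of the given size."""
    if write_rate_bytes_per_sec <= 0:
        raise ValueError("write rate must be positive")
    return gcache_size_bytes / write_rate_bytes_per_sec


def gcache_size_for_outage(outage_seconds, write_rate_bytes_per_sec):
    """Estimate the gcache size needed so a node can be away this long and still use IST."""
    return outage_seconds * write_rate_bytes_per_sec
```

With these assumptions, a 1 GiB gcache at 1 MiB/s of writes holds 1024 seconds of history, roughly 17 minutes; a node that is away longer than that would fall back to SST.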
  41. 41. Incremental State Transfer ● You can think of IST as a short asynchronous replication session ● If communication quality is bad, a node can drop out and join back fast with IST
  42. 42. Backups
  43. 43. Backups ➢ All Galera nodes are constantly up to date ➢ Best practices: ● Dedicate a reference node for backups ● Assign global trx ID with the backup ➢ Possible methods: 1. Disconnecting a node for backup 2. Using SST script interface 3. xtrabackup
  44. 44. Backups with global Trx ID ➢ Global transaction ID (GTID) marks a position in the cluster transaction stream ➢ A backup with a known GTID makes it possible to utilize IST when joining new nodes, e.g. when: ● Recovering the node ● Provisioning new nodes
  45. 45. Backup by Disconnecting a Node ● Isolate the backup node (diagram: load balancing in front of three MySQL nodes on Galera Replication)
  46. 46. Backup by Disconnecting a Node ● Disconnect from group, e.g. clear wsrep_provider
  47. 47. Backup by Disconnecting a Node ● Disconnect from group, e.g. clear wsrep_provider
  48. 48. Backup by Disconnecting a Node ● Work your backup magic (diagram: the isolated node writes backups)
  49. 49. Backup by Disconnecting a Node ● Read global transaction ID from status and assign to backup: ➢ wsrep_cluster_uuid ➢ wsrep_last_committed
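Assigning the global transaction ID to a backup comes down to joining the cluster UUID and the last committed sequence number as `uuid:seqno`. A minimal sketch, assuming you have already read the two status values from the node; `format_gtid` is a name invented for illustration.

```python
import uuid


def format_gtid(cluster_uuid, last_committed):
    """Combine a cluster UUID and a sequence number into a 'uuid:seqno' label."""
    canonical = str(uuid.UUID(cluster_uuid))  # raises ValueError on a malformed UUID
    seqno = int(last_committed)
    if seqno < 0:
        raise ValueError("seqno must be non-negative to mark a usable position")
    return f"{canonical}:{seqno}"
```

Storing the resulting string next to the backup (for example in its file name or a sidecar file) keeps the position available when the backup is later used to provision a node.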
  50. 50. Backup by SST ● Donor mode provides an isolated processing environment ● A special SST script can be written just to prepare a backup in the donor node: wsrep_sst_backup ● Garbd can be used to trigger the donor node to run wsrep_sst_backup
  51. 51. Backup by SST API ● Launch garbd (diagram: garbd sends an SST request with wsrep_sst_donor=node3 and wsrep_sst_method=backup to a cluster of node1, node2 and node3 behind load balancing)
  52. 52. Backup by SST API ● Donor launches wsrep_sst_backup
  53. 53. Backup by SST API ● wsrep_sst_backup prepares the backup (diagram: the backup is stored together with its GTID)
  54. 54. Backup by SST API ● Backup node returns to cluster
  55. 55. Backup by xtrabackup ● Xtrabackup is a hot backup method and can be used anytime ● Simple, efficient ● Use the --galera-info option to get the global transaction ID logged into a separate galera info file
  56. 56. Schema Upgrades
  57. 57. Schema Upgrades ● DDL is non-transactional, and therefore bad ● Galera has two methods for DDL: ➢ TOI, Total Order Isolation ➢ RSU, Rolling Schema Upgrade ● Use wsrep_osu_method to choose either option
  58. 58. Total Order Isolation ● DDL is replicated up-front ● Each node will get the DDL statement and must process the DDL at the same slot in the transaction stream ● Galera will isolate the affected table/database for the duration of DDL processing
  59. 59. Rolling Schema Upgrade ● DDL is not replicated ● Galera will take the node out of replication for the duration of DDL processing ● When the DDL is done, the node will catch up with missed transactions (like IST) ● DBA should roll the RSU operation over all nodes ● Requires backwards compatible schema changes
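Rolling an RSU operation over all nodes means repeating the same three steps on each node in turn. The sketch below only builds that list of steps; it does not connect to anything. The function name is invented, and the variable scope is an assumption: whether wsrep_osu_method must be set with GLOBAL or SESSION depends on the Galera version, so check your release before using either.

```python
def rolling_schema_upgrade_plan(nodes, ddl, scope="GLOBAL"):
    """Return (node, statement) pairs that roll one DDL over the nodes, one node at a time."""
    if scope not in ("GLOBAL", "SESSION"):
        raise ValueError("scope must be GLOBAL or SESSION")
    plan = []
    for node in nodes:
        # Switch this node to RSU so the DDL is applied locally and not replicated.
        plan.append((node, f"SET {scope} wsrep_OSU_method='RSU'"))
        plan.append((node, ddl))
        # Restore the default method before moving on to the next node.
        plan.append((node, f"SET {scope} wsrep_OSU_method='TOI'"))
    return plan
```

Running the plan strictly in order, and waiting for each node to catch up before starting the next, keeps at most one node out of replication at a time.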
  60. 60. wsrep_on=OFF ● wsrep_on is a session variable telling if this session will be replicated or not ● I tried to hide this information the best I could, but somebody has leaked it ● And so, yes, it is possible to run “poor man's RSU” with wsrep_on set to OFF ➢ such a session may be aborted by replication ● Use only if you are really sure that: ➢ planned SQL is not conflicting ➢ SQL will not generate inconsistency
  61. 61. Schema Upgrades ● Best practices: ➔ Plan your upgrades ➔ Rehearse your upgrades ➔ Try to be backwards compatible ➔ Find out DDL execution time ➔ Go for RSU if possible
  62. 62. Consistent Reads
  63. 63. Consistent reads ● Replication is virtually synchronous... (diagram: a transaction is replicated to the whole cluster at commit)
  64. 64. Consistent reads 1. Insert into t1 values (1,....) 2. Select from t1 where i=1 ● Will the select see the inserted row? (diagram: the insert and the select go to different MySQL nodes)
  65. 65. Consistent Reads ● Aka read causality ● There is a causal dependency between operations on two database connections ➢ Application is expecting to see the values of an earlier write
  66. 66. Consistent Reads ● Use: wsrep_causal_reads=ON ➔ Every read (select, show) will wait until the slave queue has been fully applied ● There is a timeout for max causal read wait: ➢ replicator.causal_read_keepalive
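The question from the previous slides (will the select see the inserted row?) can be illustrated with a toy model of a node that has received a write but not yet applied it. This is not Galera code: `ToyNode` and its methods are invented for illustration, and the timeout mentioned on the slide is left out.

```python
from collections import deque


class ToyNode:
    """Toy model of a node with a queue of replicated but not yet applied writes."""

    def __init__(self):
        self.rows = {}
        self.slave_queue = deque()

    def receive(self, key, value):
        # A replicated write arrives and waits in the slave queue.
        self.slave_queue.append((key, value))

    def apply_one(self):
        # Apply the oldest queued write to local storage.
        key, value = self.slave_queue.popleft()
        self.rows[key] = value

    def read(self, key, causal_reads=False):
        if causal_reads:
            # Mimic wsrep_causal_reads=ON: drain the queue before reading.
            while self.slave_queue:
                self.apply_one()
        return self.rows.get(key)
```

In this model a plain `read` issued right after `receive` returns `None`, while the same read with `causal_reads=True` returns the row, at the cost of waiting for the queue to drain.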
  67. 67. Other Tidbits...
  68. 68. Parallel Applying ● Aka parallel replication ● “true parallel applying” ➢ Every application will benefit from it ➢ Works not on database, not on table, but on row level ● wsrep_slave_threads=n ● How many slaves makes sense: ➢ Monitor wsrep_cert_deps_distance ➢ Max 2 * cores
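The sizing hints on this slide can be combined into one rule of thumb: let the observed wsrep_cert_deps_distance suggest the thread count, and cap it at twice the number of cores. The sketch below is one reading of the slide, not an official formula; the function name is invented for illustration.

```python
def suggest_slave_threads(cert_deps_distance, cpu_cores):
    """Suggest a wsrep_slave_threads value from the dependency distance, capped at 2 * cores."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    upper_bound = 2 * cpu_cores
    # Never go below one applier thread, never above the cap from the slide.
    return max(1, min(int(cert_deps_distance), upper_bound))
```

A workload with a dependency distance of about 24 on a 4-core node would be capped at 8 threads, while a distance of about 3 suggests that more than 3 threads would mostly sit idle.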
  69. 69. MyISAM Replication ● On experimental level ● MyISAM is phasing out, so there is not much demand to complete it ● Replicates SQL up-front, like TOI ● Should be used in master-slave model ● No checks for non-deterministic SQL ➢ Insert into t (r, time) values (rand(), now());
  70. 70. SSL / TLS ● Replication over SSL is supported ● No authentication (yet), only encryption ● Whole cluster must use SSL
  71. 71. SSL or VPN ● Bundling several nodes through a VPN tunnel may cause a vulnerability ● When the VPN gateway breaks, a big part of the cluster will be blacked out ● Best practice is to go for SSL if the VPN does not have alternative routes
  72. 72. UDP Multicast ● Configure with gmcast.mcast_addr ● Full cluster must be configured for either multicast or TCP sockets ● Multicast is good for scalability ● Best practice is to go for multicast if planning for large clusters
  73. 73. Galera Project
  74. 74. Galera Project ● Galera Cluster for MySQL ➢ 5 years development ➢ based on MySQL server community edition ➢ Fully open source ➢ Active community ● ~3 releases per year ➢ Release 2.2 RC out yesterday ➢ Major release 3.0 in the works ● Galera Replication also used in: ➢ Percona XtraDB Cluster ➢ MariaDB Galera Cluster
  75. 75. Galera Project (diagram: Galera Cluster for MySQL, Percona XtraDB Cluster and MariaDB Galera Cluster, built on MySQL, Percona Server and MariaDB; code merges between the servers; each connects through an API to the Galera Replication plugin)
  76. 76. Questions? Thank you for listening! Happy Clustering :-)
