
Taking Full Advantage of Galera Multi Master Cluster

We will discuss important topics related to multi-master setups:
* Practical considerations when using Galera in a multi-master setup
* Evaluating the characteristics of your database workload
* Preparing your application for multi-master
* Detecting and dealing with transaction conflicts

Published in: Software

  1. Taking Full Advantage of Galera Multi-Master (Philip Stoev, Codership Oy)
  2. Agenda
     • A very quick overview of Galera Cluster
     • General principles of multi-master (MM)
     • Workloads that are well-suited for MM
     • Application considerations for MM
     • Configuring, monitoring and troubleshooting multi-master
  3. Galera Cluster Overview
     • Synchronous
       – each transaction is immediately replicated on all nodes at commit
       – no stale slaves
     • Multi-Master
       – read from and write to any node
       – automatic transaction conflict detection
     • Replication
       – a copy of the entire dataset is available on all nodes
       – new nodes can join automatically
     • For MySQL
       – based on a modified version of MySQL (5.5, 5.6 with 5.7 coming up)
       – InnoDB storage engine
  4. And more …
     • Recovers from node failures within seconds
     • Data consistency protections
       – avoids reading stale data
       – prevents unsafe data modifications
     • Cloud and WAN support
  5. Introduction to Multi-Master
  6. What is Multi-Master
     • The ability to issue any transaction on any Galera node
     • A core feature of the product, not a clever trick that happens to work
     • Available out of the box
  7. Benefits to Multi-Master
     • Operational flexibility
       – no need to designate a single node to use exclusively for writes
       – simplified configuration for load balancing
       – easier handling of scheduled downtime and node failures
     • Wide Area Networks
       – applications can write to the node that is closest to them
  8. General Principles
     • Galera puts consistency first:
       – conflicting transactions issued on different nodes will be detected
       – the transaction that committed first succeeds; those that attempt to commit after it are rejected
       – a transaction can be aborted halfway through if Galera detects that it cannot be completed without a conflict
     • Callaghan’s Law: “a given row can’t be modified more than once per RTT” (RTT is the network round-trip time between nodes)
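Callaghan's Law can be turned into a rough upper bound on how often a single row can be updated. The sketch below only does that arithmetic; the RTT values are illustrative assumptions, not measurements from the talk.

```python
def max_row_updates_per_second(rtt_ms: float) -> float:
    """Upper bound on updates to one row per second, given the RTT in ms.

    Callaghan's Law: a given row cannot be modified more than once per RTT.
    """
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return 1000.0 / rtt_ms


# Hypothetical round-trip times, chosen only for illustration.
for label, rtt_ms in [("same datacenter", 0.5), ("cross-region WAN", 100.0)]:
    bound = max_row_updates_per_second(rtt_ms)
    print(f"{label}: at most {bound:.0f} updates/s to any single row")
```

The bound applies per row, so a workload that spreads its writes over many rows is much less affected than one that keeps updating the same row.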
  9. Write Scaling
     Does multi-master provide write scaling?
     • The updates made by every write transaction need to be applied on every Galera node
     • But none of the following operations are duplicated:
       – parser and optimizer overhead
       – the effort needed to find and read many records in order to update a few
       – execution of triggers
  10. Applications and Workloads
  11. The Multi-Master-Ready Application
      • Check whether the application uses transactions or individual queries
      • Suggested application behavior:
        – ensure that the application can handle “deadlock” errors during a transaction and at COMMIT
        – the application should be able to retry failed transactions
        – know which transactions require absolutely fresh data
        – reads and writes can be directed to different servers if needed
      • Better logging:
        – make sure all database errors are logged to enable analysis
      • For legacy applications:
        – autocommit statements can be retried by Galera in case of failure
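The retry and logging advice above could look roughly like the following in application code. This is a minimal sketch, not the speaker's implementation: `DatabaseError` is a hypothetical stand-in for whatever exception your driver raises, and `run_with_retry`, `max_attempts` and `backoff_s` are names invented for the example. The one grounded detail is error code 1213 (`ER_LOCK_DEADLOCK`), which is the MySQL deadlock error that a Galera conflict surfaces as.

```python
import logging
import time

log = logging.getLogger("galera_retry")

ER_LOCK_DEADLOCK = 1213  # MySQL "deadlock" error code, also reported on Galera conflicts


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception that carries a MySQL error code."""

    def __init__(self, code, message=""):
        super().__init__(message)
        self.code = code


def run_with_retry(transaction, max_attempts=3, backoff_s=0.05):
    """Run transaction() and retry it when it fails with a deadlock error.

    transaction must perform the whole unit of work, including COMMIT, so
    that a failure at commit time restarts the work from the beginning.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            # Log every database error so that conflicts can be analyzed later.
            log.warning("attempt %d failed with error code %s", attempt, exc.code)
            if exc.code != ER_LOCK_DEADLOCK or attempt == max_attempts:
                raise
            time.sleep(backoff_s * attempt)  # short, growing pause before retrying
```

Errors other than a deadlock are re-raised immediately, since retrying them would usually just repeat the failure.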
  12. Suitable Workloads
      • Low percentage of effective database updates
      • Queries or transactions that perform a lot of work or contain a lot of business logic but eventually update a smaller set of rows
  13. Typical Example
      START TRANSACTION;
      SELECT * FROM table1;
      SELECT * FROM table2;
      SELECT * FROM table3;
      ...
      UPDATE table1 SET total_amount = 42 WHERE pk = 1;
      COMMIT;
  14. Other Examples
      INSERT INTO t1 SELECT COUNT(*) FROM very_large_table;
      UPDATE shipments SET flag = 1
        WHERE sender_country = 'Vatican' AND receiving_state = 'WY'; # Assuming no suitable indexes
  15. Workload Considerations
      • High percentage of NoSQL-style updates that each act on a single row
      • SELECT ... FOR UPDATE statements
      • Frequent operations on “hot” rows:
        – job queues or locking schemes implemented in the database
        – counters
        – generation of sequence numbers
        – repeated updates to “last accessed” timestamp records
      • Long-running and housekeeping transactions
  16. Autoincrement Handling
      • Galera handles AUTO_INCREMENT columns in a safe way
        – works even as nodes join or leave the cluster
        – gaps in the sequence are possible, so use BIGINT columns
      • There is no need for the application to manage sequence values, reserve ranges, etc.
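To see why gaps appear, it helps to model how values can be handed out without coordination. The sketch below assumes the usual mechanism, where each node generates values of the form offset + k × increment (MySQL's `auto_increment_offset` and `auto_increment_increment`, which Galera adjusts when `wsrep_auto_increment_control` is enabled). The function name and the three-node cluster are illustrative.

```python
def node_sequence(node_index, cluster_size, count):
    """First `count` AUTO_INCREMENT values generated on one node.

    node_index is 1-based and plays the role of auto_increment_offset;
    cluster_size plays the role of auto_increment_increment.
    """
    return [node_index + k * cluster_size for k in range(count)]


cluster_size = 3  # assumption: a three-node cluster
for node in range(1, cluster_size + 1):
    print(f"node {node}: {node_sequence(node, cluster_size, 4)}")
```

Each node draws from its own interleaved series, so values never collide, but if one node receives fewer inserts its values go unused and the combined sequence has gaps.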
  17. Read-Write Splitting
      • If there are conflicts due to heavy contention on rows, the application can direct writes to those rows alone to a single node
      • With a TCP load-balancer, provide a TCP port that can be load-balanced to any node and a TCP port that is directed to a single node only
      • Consider a query-aware proxy such as MaxScale
  18. Configuring Galera for MM
  19. Galera Variables
      • Galera is multi-master by default
        – any node can accept any query out of the box
      Useful options:
      • wsrep_retry_autocommit – retries autocommit queries that failed
      • wsrep_sync_wait – ensures data freshness
      • wsrep_log_conflicts – prints conflict information in the server error log
  20. Retrying Autocommit Transactions
      • Autocommit transactions are those that contain only a single SQL statement, even if it updates multiple rows
      • A higher value of wsrep_retry_autocommit will help most such transactions complete successfully
      • The default is 1, so one retry will happen by default
      • SQL statements that update many rows may not be successful even if retried multiple times
  21. Sync Waiting
      • With Galera, some small slave lag (a few transactions) is allowed for performance reasons
      • If a transaction absolutely, positively needs the most up-to-date data there is, set wsrep_sync_wait
      • Can be set on a session basis, as needed (do not forget to reset the variable at the end of the critical block)
      • Makes sure the data is up to date as of the start of the transaction
      • Sync waiting is a property of the transaction that requires fresh data, not of the transaction that wrote the data
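One way to avoid forgetting the reset is to wrap the critical block in a context manager. This is a sketch under assumptions: `fresh_reads` is a name invented here, `cursor` is any DB-API style cursor with an `execute` method, the value 1 enables the check for read statements, and resetting to 0 assumes the session started from the server default.

```python
from contextlib import contextmanager


@contextmanager
def fresh_reads(cursor, mask=1):
    """Enable wsrep_sync_wait for the enclosed block, then reset it.

    mask=1 asks the node to catch up with the cluster before reads.
    The reset to 0 assumes the session default was 0 (an assumption).
    """
    cursor.execute(f"SET SESSION wsrep_sync_wait = {int(mask)}")
    try:
        yield cursor
    finally:
        # Runs even if the block raises, so the setting never leaks
        # into later, non-critical queries on the same session.
        cursor.execute("SET SESSION wsrep_sync_wait = 0")
```

Usage would look like `with fresh_reads(cur): cur.execute("SELECT ...")`, keeping the extra wait limited to the transactions that need fresh data.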
  22. Dealing with Conflicts
  23. Monitoring Conflicts
      • wsrep_local_bf_aborts
        – number of transactions that were aborted because a conflicting transaction has already been committed locally
        – this type of abort can happen even prior to COMMIT, to avoid performing unnecessary work that is doomed to fail
      • wsrep_local_cert_failures
        – number of transactions that failed at COMMIT time because they conflict with another transaction that is still “in flight”
      Use the sum of the two counters as the overall conflict count.
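Summing the two counters and turning them into a rate is simple arithmetic once the status values have been fetched. In the sketch below, `STATUS_QUERY` is one way to read them from a node; the helper names and the idea of sampling twice are assumptions made for the example, and the rows are expected as (name, value) pairs as returned by `SHOW GLOBAL STATUS`.

```python
CONFLICT_COUNTERS = ("wsrep_local_bf_aborts", "wsrep_local_cert_failures")

# Statement that could be run on a node to fetch both counters.
STATUS_QUERY = (
    "SHOW GLOBAL STATUS WHERE Variable_name IN "
    "('wsrep_local_bf_aborts', 'wsrep_local_cert_failures')"
)


def total_conflicts(status_rows):
    """Sum the two conflict counters from (Variable_name, Value) rows."""
    status = dict(status_rows)
    return sum(int(status[name]) for name in CONFLICT_COUNTERS)


def conflicts_per_second(before_rows, after_rows, elapsed_s):
    """Conflict rate between two samples taken elapsed_s seconds apart."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return (total_conflicts(after_rows) - total_conflicts(before_rows)) / elapsed_s
```

The counters are per node, so a cluster-wide picture needs the same sampling on every node.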
  24. Debugging Conflicts
      • Enable logging on the application side:
        – to provide context information on the failing query (e.g. function or line numbers; schema name)
      • Ensure that system time is synchronized across the cluster and with the application servers
      • Enable the wsrep_log_conflicts variable
      • Enable binary logging to obtain information on the winning transaction (see http://goo.gl/Tw5JLn)
  25. Log Output
      *** Victim TRANSACTION:
      TRANSACTION 1374, ACTIVE 23 sec starting index read
      mysql tables in use 1, locked 1
      4833 lock struct(s), heap size 554536, 1004832 row lock(s), undo log entries 934296
      MySQL thread id 5, OS thread handle 0x7fbbb4601700, query id 50 localhost ::1 root updating
      update t1 set f2 = 'problematic_key_value21'
      *** WAITING FOR THIS LOCK TO BE GRANTED:
      RECORD LOCKS space id 8 page no 4 n bits 280 index `PRIMARY` of table `test`.`t1` trx id 1374 lock_mode X
      Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
       0: len 4; hex 80000001; asc ;; # Unsigned integer value of PK
       1: len 6; hex 00000000055e; asc ^;;
       2: len 7; hex 39000021fd0110; asc 9 ! ;;
       3: len 30; hex 70726f626c656d617469635f6b65795f76616c7565323120202020202020; asc problematic_key_value21 ; (total 50 bytes);
  26. Avoiding Conflicts
      For hot records:
      • break down a “hot record” into multiple rows
      • replace repeated updates with inserting new records into a log table
      For long-running transactions:
      • split housekeeping work into smaller units
      Or:
      • Send conflicting writes to a single node
        – non-conflicting transactions can still be directed to any node
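Breaking a hot record into multiple rows can be sketched for a counter as follows. Everything specific here is hypothetical: the `counter_shards` table (columns `name`, `shard`, `value`), the shard count, and the `%s` placeholder style, which matches common MySQL drivers but may differ in yours. The functions only build statements; executing them is left to the application.

```python
import random

NUM_SHARDS = 16  # assumption: tune to the contention actually observed


def increment_statement(counter_name, rng=random):
    """Build an UPDATE that increments one randomly chosen shard row.

    Concurrent writers on different nodes usually pick different shards,
    so they touch different rows and conflict far less often.
    """
    shard = rng.randrange(NUM_SHARDS)
    sql = (
        "UPDATE counter_shards SET value = value + 1 "
        "WHERE name = %s AND shard = %s"
    )
    return sql, (counter_name, shard)


def read_statement(counter_name):
    """Build a SELECT that reconstructs the counter by summing its shards."""
    sql = "SELECT SUM(value) FROM counter_shards WHERE name = %s"
    return sql, (counter_name,)
```

The trade-off is that reads become slightly more expensive, since they aggregate several rows instead of fetching one.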
  27. Questions
      • Please use the Question/Chat box in the GoToWebinar panel
  28. Thank You
      http://www.galeracluster.com
      Discussion group: codership-team@googlegroups.com
