Taking Full Advantage of Galera Multi Master Cluster

Taking Full Advantage of
Galera Multi-Master
Philip Stoev
Codership Oy

Agenda
• A very quick overview of Galera Cluster
• General principles of multi-master (MM)
• Workloads that are well-suited for MM
• Application considerations for MM
• Configuring, monitoring and troubleshooting multi-
master

Galera Cluster Overview
Synchronous
– each transaction is immediately replicated on all nodes at commit
– no stale slaves
Multi-Master
– read from and write to any node
– automatic transaction conflict detection
Replication
– a copy of the entire dataset is available on all nodes
– new nodes can join automatically
For MySQL
– based on a modified version of MySQL (5.5, 5.6 with 5.7 coming up)
– InnoDB storage engine

And more …
• Recovers from node failures within seconds
• Data consistency protections
– avoids reading stale data
– prevents unsafe data modifications
• Cloud and WAN support

What is Multi-Master
• The ability to issue any transaction on any Galera node
• A core feature of the product, not a clever trick that
happens to work
• Available out of the box

Benefits to Multi-Master
• Operational flexibility
– no need to designate a single node to use exclusively for writes
– simplified configuration for load balancing
– easier handling of scheduled downtime and node failures
• Wide Area Networks
– applications can write to the node that is closest to them

General Principles
• Galera places consistency on top:
– conflicting transactions issued on different nodes will be detected
– the transaction that committed first succeeds, those that attempt to
commit after it are rejected
– a transaction can be aborted halfway through if Galera detects that it
can not be completed without a conflict
• Callaghan’s Law
“a given row can’t be modified more than once per RTT”

Write Scaling
Does multi-master provide write scaling?
• The updates made by every write transaction need to
be applied on every Galera node
• But none of the following operations are duplicated:
– parser and optimizer overhead
– the effort needed to find and read many records in order to update a few
– execution of triggers

The Multimaster-ready Application
• Check if application uses transactions or individual queries
• Suggested application behavior:
– ensure that the application can handle “deadlock” errors during
transaction and at COMMIT
– application should be able to retry failed transactions
– transactions requiring absolutely fresh data are known
– reads and writes can be directed to different servers if needed
• Better logging:
– make sure all database errors are logged to enable analysis
• For legacy applications:
– autocommit statements can be retried by Galera in case of failure

Suitable Workloads
• Low percentage of effective database updates
• Queries or transactions that perform a lot of work or
contain a lot of business logic but eventually update a
smaller set of rows

Typical Example
START TRANSACTION
SELECT * FROM table1;
...
UPDATE total_amount = 42 WHERE pk = 1
COMMIT

Other Examples
INSERT INTO t1
SELECT COUNT(*) FROM very_large_table;
UPDATE shipments
SET flag = 1
WHERE sender_country = ‘Vatican’
AND receiving_state = ‘WY’;
# Assuming no suitable indexes

Workload Considerations
• High-percentage of single-row, NoSQL-style updates
that act on single rows
• The SELECT FOR UPDATE statement
• Frequent operations on “hot” rows:
– job queues or locking schemes implemented in the database
– counters
– generation of sequence numbers
– repeated updates to “last accessed” timestamp records
• Long-running and housekeeping transactions

Autoincrement Handling
• Galera handles AUTO_INCREMENT columns in a safe
way
– works even as nodes join or leave the cluster
– gaps in the sequence are possible, so use bigint columns
• There is no need for the application to manage
sequence values, reserve ranges, etc.

Read-Write Splitting
• If there are conflicts due to heavy contention on rows,
the application can direct writes to those rows alone to a
single node
• With a TCP load-balancer, provide a TCP port that can
be load-balanced to any node and a TCP port that is
directed to a single node only
• Consider a query-aware proxy such as MaxScale

Galera Variables
• Galera is multi-master by default
– any node can accept any query out of the box
Useful options:
• wsrep_retry_autocommit
– retries queries that failed
• wsrep_sync_wait
– ensures data freshness
• wsrep_log_conficts
– prints information in the server error log

Retrying Autocommit Transactions
• Autocommit transactions are those that contain only a
single SQL statement, even if it updates multiple rows
• A higher value of wsrep_retry_autocommit will help a
most such transactions complete successfully
• default is 1, so one retry will happen by default
• SQL statements that update many rows may not be
successful even if retried multiple times

Sync Waiting
• With Galera, some small slave lag (a few transactions) is
allowed for performance reasons
• If a transaction absolutely positively needs the most up-to-
date data there is, set wsrep_sync_wait
• Can be set on a session basis, as needed (do not forget to
reset the variable at the end of the critical block)
• Makes sure the data is up to date as of the start of the
transaction
• Sync waiting is a properly of the transaction that requires
fresh data, not of the transaction that wrote the data

Monitoring Conflicts
• wsrep_local_bf_aborts
– number of transactions that were aborted because a conflicting
transaction has already been committed locally
– this type of abort can happen even prior to COMMIT, to avoid
performing unnecessary work that is doomed to fail
• wsrep_local_cert_failures
– number of transactions failed at COMMIT time because they conflict
with another transaction still in “in flight”
Use the sum of the two counters.

Debugging Conflicts
• Enable logging on the application side:
– to provide context information on the failing query (e.g. function or line
numbers; schema name)
• Ensure that system time is synchronized across the
cluster and with the application servers
• Enable the wsrep_log_conflicts variable
• Enable binary logging to obtain information on the
winning transaction (see http://goo.gl/Tw5JLn)

Log Output
*** Victim TRANSACTION:
TRANSACTION 1374, ACTIVE 23 sec starting index read
mysql tables in use 1, locked 1
4833 lock struct(s), heap size 554536, 1004832 row lock(s), undo log entries 934296
MySQL thread id 5, OS thread handle 0x7fbbb4601700, query id 50
localhost ::1 root updating
update t1 set f2 = 'problematic_key_value21'
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 8 page no 4 n bits 280 index `PRIMARY`
of table `test`.`t1` trx id 1374 lock_mode X
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0: len 4; hex 80000001; asc ;; # Unsigned integer value of PK
1: len 6; hex 00000000055e; asc ^;;
2: len 7; hex 39000021fd0110; asc 9 ! ;;
3: len 30; hex 70726f626c656d617469635f6b65795f76616c7565323120202020202020; asc
problematic_key_value21 ; (total 50 bytes);

Avoiding Conflicts
For hot records:
• break down a “hot record” into multiple rows
• replace repeated updates with inserting new records into a log
table
For long-running transactions:
• split housekeeping work into smaller units
Or:
• Send conflicting writes to a single node
– non-conflicting transactions can still be directed to any node

Questions
• Please use the Question/Chat box in the GoToWebinar
panel

Thank You
http://www.galeracluster.com
Discussion group:
codership-team@googlegroups.com

Taking Full Advantage of Galera Multi Master Cluster

More Related Content

What's hot

Similar to Taking Full Advantage of Galera Multi Master Cluster

More from Codership Oy - Creators of Galera Cluster

Recently uploaded

Taking Full Advantage of Galera Multi Master Cluster