High Availability: replication, clustering and failover
Shane Johnson
Senior Director of Product Marketing
Agenda: HA with MariaDB TX
Getting started
Terminology
Fundamentals
Scalability
Advanced concepts
Topologies
Internals
Optimizations
Terminology: HA with MariaDB TX
Failover: when a standby database becomes the primary database because the primary database is unreachable or unavailable
Switchover: when a standby database becomes the primary database and the primary becomes a standby (e.g., to perform a rolling upgrade)
Rejoin: when a failed primary database becomes a standby database because it is reachable and available again
Fundamentals
High availability with MariaDB TX
Fundamentals: HA with MariaDB TX
– replication and clustering –
[Diagram: three replication approaches]
Asynchronous (master/slave replication): one master, three slaves
Semi-synchronous (master/slave replication): one master, three slaves
Synchronous (multi-master clustering): three masters
Master/slave replication
High availability with MariaDB TX
Fundamentals: HA with MariaDB TX
– master/slave replication –
[Diagram: the master's binary log holds TX (GTID 1), TX (GTID 2) and TX (GTID 3); the slave's relay log holds TX (GTID 1) and TX (GTID 2)]
1. Slave: request next transaction (GTID = 2)
2. Master: read next transaction (GTID = 3) from the binary log
3. Master: reply with next transaction (GTID = 3)
4. Slave: write next transaction (GTID = 3) to the relay log
Fundamentals: HA with MariaDB TX
– master/slave replication –
Asynchronous or semi-synchronous?
Fundamentals: HA with MariaDB TX
– asynchronous with automatic failover –
[Diagram: three stages, each behind MariaDB MaxScale (Proxy)]
1. Master (GTID = 3), Slave 1 (GTID = 2), Slave 2 (GTID = 1)
2. The master (GTID = 3) fails, leaving Slave 1 (GTID = 2) and Slave 2 (GTID = 1)
3. Slave 1 is promoted: Master * (GTID = 2), Slave 2 (GTID = 1)
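The failover shown above can be sketched as promoting the most advanced surviving slave. This is a minimal Python sketch, assuming each server is reduced to a name and the sequence number of its last applied GTID. `Server` and `pick_new_master` are hypothetical names for illustration, not MaxScale APIs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    name: str
    gtid_seq: int           # sequence number of the last applied transaction
    reachable: bool = True


def pick_new_master(slaves: List[Server]) -> Optional[Server]:
    """Return the reachable slave with the highest GTID, or None if there is none."""
    candidates = [s for s in slaves if s.reachable]
    return max(candidates, key=lambda s: s.gtid_seq, default=None)


# Values from the diagram: the failed master was at GTID = 3.
slaves = [Server("Slave 1", 2), Server("Slave 2", 1)]
new_master = pick_new_master(slaves)
lost = 3 - new_master.gtid_seq  # transactions the old master had not shipped yet
print(new_master.name, "promoted;", lost, "transaction(s) not replicated")
```

With asynchronous replication the promoted slave can be behind the failed master, which is why the new master in the diagram is at GTID = 2 rather than 3.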
Fundamentals: HA with MariaDB TX
– semi-synchronous with automatic failover –
[Diagram: three stages, each behind MariaDB MaxScale (Proxy)]
1. Master (GTID = 3), Slave 1 (GTID = 2), Slave 2 (GTID = 3)
2. The master (GTID = 3) fails, leaving Slave 1 (GTID = 2) and Slave 2 (GTID = 3)
3. Slave 2 is promoted: Master * (GTID = 3), Slave 1 (GTID = 2)
Fundamentals: HA with MariaDB TX
– semi-synchronous system variables –
Variable                        Values             Default
rpl_semi_sync_master_enabled    0 (OFF) | 1 (ON)   0 (OFF)
rpl_semi_sync_master_timeout    0 to n (ms)        10000 (10 seconds)
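A rough model of what the timeout controls, as a minimal Python sketch. It assumes the master waits for an acknowledgement from at least one slave and, if none arrives within the timeout, continues as if replication were asynchronous. `commit_outcome` and its return labels are hypothetical names for illustration only.

```python
def commit_outcome(ack_delays_ms, timeout_ms=10000):
    """Classify a commit given how long each slave takes to acknowledge it.

    ack_delays_ms: acknowledgement delay per slave, in milliseconds.
    timeout_ms: stands in for rpl_semi_sync_master_timeout (default 10000).
    """
    fastest = min(ack_delays_ms, default=None)
    if fastest is not None and fastest <= timeout_ms:
        return "semi-sync"       # at least one slave acknowledged in time
    return "async-fallback"      # no acknowledgement in time (or no slaves)


print(commit_outcome([12, 40]))             # one slave answers quickly
print(commit_outcome([15000, 20000]))       # every slave is slower than the timeout
```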
Fundamentals: HA with MariaDB TX
– master/slave replication use cases –
Asynchronous
Read-intensive: product catalogs
Mixed: shopping carts
Write-intensive: clickstream data
Semi-synchronous
Read-intensive: customer profiles
Mixed: inventory and pricing
Write-intensive: checkouts
Multi-master clustering
High availability with MariaDB TX
Fundamentals: HA with MariaDB TX
– multi-master clustering –
[Diagram: a transaction modifying Row 1, Row 2 and Row 3 on one node of a three-node cluster]
1. get writes
2. send writes
3. certify and apply writes
Fundamentals: HA with MariaDB TX
– synchronous with automatic failover –
[Diagram: three stages, each behind MariaDB MaxScale (Proxy)]
1. Node 1 (Priority = 1), Node 2 (Priority = 2), Node 3 (Priority = 3)
2. Node 1 (Priority = 1) becomes unavailable
3. Node 2 (Priority = 2) and Node 3 (Priority = 3) remain
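Priority-based selection can be sketched as follows. This is a minimal Python sketch, assuming (as the priorities in the diagram suggest) that the available node with the lowest priority value receives the writes. `pick_write_node` is a hypothetical helper, not a MaxScale function.

```python
def pick_write_node(priorities, available):
    """Return the available node with the lowest priority value, or None.

    priorities: mapping of node name -> priority (lower is preferred).
    available: set of node names that are currently reachable.
    """
    candidates = {node: prio for node, prio in priorities.items() if node in available}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


priorities = {"Node 1": 1, "Node 2": 2, "Node 3": 3}
print(pick_write_node(priorities, {"Node 1", "Node 2", "Node 3"}))  # before the failure
print(pick_write_node(priorities, {"Node 2", "Node 3"}))            # after Node 1 is lost
```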
Fundamentals: HA with MariaDB TX
– multi-master clustering use cases –
Synchronous
Read-intensive: account status
Mixed: package tracking
Write-intensive: payments
Scalability
High availability with MariaDB TX
Master/slave replication
High availability with MariaDB TX
Scalability: HA with MariaDB TX
– read-write splitting –
[Diagram: MariaDB MaxScale (Proxy) in front of one master and two slaves; writes are routed to the master and reads to the slaves]
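Read-write splitting can be illustrated with a toy router. This is a minimal Python sketch, assuming statements are classified only by their first keyword and reads are spread round-robin over the slaves. A real proxy parses statements far more carefully (transactions, `SELECT ... FOR UPDATE`, and so on); `ReadWriteSplitter` is a hypothetical class.

```python
import itertools


class ReadWriteSplitter:
    """Toy router: reads go to slaves in round-robin order, everything else to the master."""

    READ_VERBS = ("SELECT", "SHOW")

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)

    def route(self, sql):
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in self.READ_VERBS:
            return next(self._slaves)
        return self.master


router = ReadWriteSplitter("master", ["slave1", "slave2"])
for statement in ("SELECT * FROM t", "INSERT INTO t VALUES (1)", "SELECT 1"):
    print(statement, "->", router.route(statement))
```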
Scalability: HA with MariaDB TX
– read scalability with consistency –
If read consistency is required, enable the consistent critical reads filter in the database proxy.
Variable   Values                                                        Default
time       Length of time to route reads to the master (seconds)         60
count      Number of reads to route to the master                        -
match      RegEx to determine if a write should trigger the filter       -
ignore     RegEx to determine if a write should NOT trigger the filter   -
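The effect of the `time` parameter can be sketched as a small state machine. This is a minimal Python sketch, assuming that after any write, reads are sent to the master until the window has elapsed. It ignores `count`, `match` and `ignore`, and `CriticalReadRouter` is a hypothetical class rather than the filter's implementation.

```python
import time


class CriticalReadRouter:
    """Route reads to the master for window_s seconds after the most recent write."""

    def __init__(self, window_s=60, clock=time.monotonic):
        self.window_s = window_s
        self.clock = clock          # injectable so the behaviour can be tested
        self.last_write = None

    def route(self, is_write):
        now = self.clock()
        if is_write:
            self.last_write = now
            return "master"
        if self.last_write is not None and now - self.last_write < self.window_s:
            return "master"         # a slave may not have the recent write yet
        return "slave"


fake_now = [0.0]
router = CriticalReadRouter(window_s=60, clock=lambda: fake_now[0])
print(router.route(is_write=False))   # no recent write
router.route(is_write=True)
fake_now[0] = 30.0
print(router.route(is_write=False))   # inside the 60 second window
fake_now[0] = 61.0
print(router.route(is_write=False))   # window has elapsed
```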
Multi-master clustering
High availability with MariaDB TX
Scalability: HA with MariaDB TX
– read-write splitting –
[Diagram: MariaDB MaxScale (Proxy) in front of three cluster nodes, one with role=master and two with role=slave; writes are routed to the role=master node and reads to the role=slave nodes]
Scalability: HA with MariaDB TX
– read scalability with consistency –
If read consistency is required, set the wsrep_sync_wait system variable to 1.
Variable          Values                                             Default
wsrep_sync_wait   0 (DISABLED)                                       0 (DISABLED)
                  1 (READ)
                  2 (UPDATE and DELETE)
                  3 (READ, UPDATE and DELETE)
                  4 (INSERT and REPLACE)
                  5 (READ, INSERT and REPLACE)
                  6 (UPDATE, DELETE, INSERT and REPLACE)
                  7 (READ, UPDATE, DELETE, INSERT and REPLACE)
                  8 (SHOW), 9-15 (1-7 + SHOW)
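The values listed for wsrep_sync_wait combine as a bitmask (1, 2, 4 and 8), which is why 3 means READ plus UPDATE and DELETE. A minimal Python sketch of decoding a value follows; `decode_sync_wait` is a hypothetical helper written for this example.

```python
# Bit values taken from the table above.
SYNC_WAIT_FLAGS = {
    1: "READ",
    2: "UPDATE and DELETE",
    4: "INSERT and REPLACE",
    8: "SHOW",
}


def decode_sync_wait(value):
    """Return the statement classes that wait for synchronization for a given value."""
    if not 0 <= value <= 15:
        raise ValueError("wsrep_sync_wait must be between 0 and 15")
    return [name for bit, name in SYNC_WAIT_FLAGS.items() if value & bit]


for value in (0, 1, 3, 9):
    print(value, decode_sync_wait(value) or ["DISABLED"])
```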
Topologies
High availability with MariaDB TX
Topologies: HA with MariaDB TX
– master/slave replication with multiple data centers –
[Diagram: Data Center (DC1, Active) and Data Center (DC2, Passive), each with MariaDB MaxScale (Proxy), one master and two slaves]
Topologies: HA with MariaDB TX
– multi-master clustering with multiple data centers –
[Diagram: Data Center (DC1, Active) and Data Center (DC2, Passive), each with MariaDB MaxScale (Proxy), in front of one multi-master cluster (synchronous replication)]
Node 1 (P1: priority=1, P2: priority=3)
Node 2 (P1: priority=2, P2: priority=2)
Node 3 (P1: priority=3, P2: priority=1)
Topologies: HA with MariaDB TX
– master/slave replication with read scaling –
[Diagram: MariaDB MaxScale (Proxy) with Writes (port 3307) and Reads (port 3308); Cluster 1 and Cluster 2; servers shown: Master, Slave M1, Slave M2, Binlog Server, Slave R1, Slave R2, Slave R100]
Topologies: HA with MariaDB TX
– master/slave replication with a dedicated backup –
[Diagram: MariaDB MaxScale (Proxy) in front of a Master, Slave 2 (reads) and Slave 3 (reads); Slave 1 (backups) is used by MariaDB Backup]
Internals
High availability with MariaDB TX
Master/slave replication
High availability with MariaDB TX
Internals: HA with MariaDB TX
– binary log –
Variable           Values                                                  Default
sync_binlog        0 (defer to OS), n (number of group commits to fsync)   0 (defer to OS)
binlog_format      STATEMENT | ROW | MIXED                                 MIXED
log_bin_compress   0 (OFF), 1 (ON)                                         0 (OFF)
1. You can fsync multiple transactions by enabling group commits (sync_binlog=1)
2. You can use the binlog ROW format if transactions take a long time or result in small changes
3. You can compress binlog events to reduce disk and network IO
Internals: HA with MariaDB TX
– Global Transaction IDs (GTIDs) –
GTID = domain ID + server ID + sequence number
1. prevents conflicts between multiple masters
2. enables slaves to resume replication
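The three parts of a GTID can be pulled apart with a few lines of code. This is a minimal Python sketch, assuming the dash-separated text form used elsewhere in these slides (for example `0-1-200`). `Gtid` and `parse_gtid` are hypothetical names.

```python
from typing import NamedTuple


class Gtid(NamedTuple):
    domain_id: int
    server_id: int
    sequence: int


def parse_gtid(text):
    """Split 'domain-server-sequence' into its three integer parts."""
    domain, server, sequence = (int(part) for part in text.split("-"))
    return Gtid(domain, server, sequence)


gtid = parse_gtid("0-1-200")
print(gtid.domain_id, gtid.server_id, gtid.sequence)
```

Because tuples compare element by element, two GTIDs from the same domain and server can be ordered by sequence number, which is what lets a slave ask for "everything after my current GTID".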
Internals: HA with MariaDB TX
– binlog with group commits and GTIDs –
Logical view of the binlog
Commit ID   GTID      Server ID   Event type   Position   End position
100         0-1-200   1           Query        0          150
100         0-1-201   1           Query        151        500
100         0-1-202   1           Query        501        600
101         0-1-203   1           Query        601        800
101         0-1-204   1           Query        801        1000
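Rows that share a commit ID were committed together as one group. A minimal Python sketch of recovering those groups from the logical view above follows; the tuples simply restate the Commit ID and GTID columns, and the variable names are illustrative.

```python
from itertools import groupby

# (commit ID, GTID) pairs in binlog order, copied from the table above.
binlog_rows = [
    (100, "0-1-200"),
    (100, "0-1-201"),
    (100, "0-1-202"),
    (101, "0-1-203"),
    (101, "0-1-204"),
]

# groupby only merges adjacent rows, which is fine because the binlog is ordered.
commit_groups = {
    commit_id: [gtid for _, gtid in rows]
    for commit_id, rows in groupby(binlog_rows, key=lambda row: row[0])
}

for commit_id, gtids in commit_groups.items():
    print(commit_id, gtids)
```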
Internals: HA with MariaDB TX
– replication process –
1. The slave IO thread requests binlog events, including its current GTID
2. The master returns binlog events for the next GTID(s)
3. The slave IO thread writes the binlog events to its relay log
4. The slave SQL thread reads the binlog events from its relay log
5. The slave SQL thread executes the binlog events and updates its current GTID
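The five steps above can be sketched in a single function. This is a minimal Python sketch, assuming GTIDs are reduced to their sequence numbers and both threads run one after the other. `replicate` is a hypothetical function, not part of MariaDB.

```python
from collections import deque


def replicate(binlog, slave_gtid):
    """Simulate one replication round.

    binlog: ordered list of (sequence number, event) pairs held by the master.
    slave_gtid: sequence number of the last transaction the slave has applied.
    Returns (events applied in this round, the slave's new GTID).
    """
    relay_log = deque()

    # Steps 1-3: the IO thread asks for everything after its current GTID
    # and writes what the master returns to the relay log.
    for sequence, event in binlog:
        if sequence > slave_gtid:
            relay_log.append((sequence, event))

    # Steps 4-5: the SQL thread reads the relay log, executes each event
    # and advances the slave's current GTID.
    applied = []
    while relay_log:
        sequence, event = relay_log.popleft()
        applied.append(event)
        slave_gtid = sequence

    return applied, slave_gtid


master_binlog = [(1, "TX 1"), (2, "TX 2"), (3, "TX 3")]
print(replicate(master_binlog, slave_gtid=2))
```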
Multi-master clustering
High availability with MariaDB TX
Internals: HA with MariaDB TX
– components –
Group communication ensures total ordering of messages sent from multiple nodes
Write sets contain all of the rows modified by a transaction and are created during the commit phase
Global transaction ordering assigns write sets a GTID (UUID + sequence number) so writes are applied in the same order on every node
Certification uses deterministic testing to ensure write sets are either applied on all nodes or rejected on all nodes
Internals: HA with MariaDB TX
– replication process –
1. Synchronous
a. Originating node: create a write set
b. Originating node: assign a global transaction ID to the write set and replicate it
c. Originating node: apply the write set and commit the transaction
2. Asynchronous
a. Other nodes: certify the write set
b. Other nodes: apply the write set and commit the transaction
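Why certification gives the same answer on every node can be shown with a toy model. This is a minimal Python sketch loosely based on the idea described in these slides, not on the cluster's actual implementation. It assumes each write set carries the row keys it modified plus the sequence number of the last write set its transaction had seen. `Certifier` is a hypothetical class.

```python
class Certifier:
    """Toy certification: reject a write set if a row it touches was modified
    by a write set that its transaction had not yet seen."""

    def __init__(self):
        self.last_writer = {}   # row key -> sequence number of last certified writer
        self.seqno = 0          # global sequence number, identical on every node

    def certify(self, keys, last_seen):
        self.seqno += 1         # every write set gets the next sequence number
        if any(self.last_writer.get(key, 0) > last_seen for key in keys):
            return False        # conflict: rejected (on every node alike)
        for key in keys:
            self.last_writer[key] = self.seqno
        return True             # certified: applied (on every node alike)


# Two nodes that receive the same write sets in the same order decide identically.
write_sets = [({"row1"}, 0), ({"row1"}, 0), ({"row2"}, 0), ({"row1"}, 1)]
node_a, node_b = Certifier(), Certifier()
results_a = [node_a.certify(keys, seen) for keys, seen in write_sets]
results_b = [node_b.certify(keys, seen) for keys, seen in write_sets]
print(results_a, results_a == results_b)
```

The test depends only on the write set and the globally ordered history, so no node needs to ask another node for a vote.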
Optimizations
High availability with MariaDB TX
Optimizations: HA with MariaDB TX
– master/slave replication optimizations –
Variable                   Values                                                    Default
slave-parallel-mode        optimistic | conservative | aggressive | minimal | none   conservative
slave-parallel-threads     0 - n                                                     0
binlog_commit_wait_count   0 - n                                                     0
binlog_commit_wait_usec    0 - n                                                     100000 (100ms)
read_binlog_speed_limit    0 (unlimited), n (kb)                                     0
1. You can execute transactions in parallel on slaves (slave-parallel-threads > 0)
2. You can throttle replication to reduce slave load on the master
Optimizations: HA with MariaDB TX
– multi-master clustering optimizations –
Variable                        Values                                         Default
innodb_flush_log_at_tx_commit   0 (write and flush once a second)              1
                                1 (write and flush during commit)
                                2 (write during commit, flush once a second)
1. You can fsync InnoDB logs asynchronously because synchronous replication provides durability
What’s next?
Check out the high availability microsite: mariadb.com/database/topics/high-availability
Check out the high availability white paper: goo.gl/WdZMM9
Thank you

Choosing the right high availability strategy
