Galera Explained
A Beginner (?) Level Tutorial
Marco “The Grinch” Tusa
2015
About me Marco “The Grinch”
• Former UN, MySQL AB,
Pythian, Percona
• 2 kids, 1 wife
• Ex-MySQL AB employee
• History of Religions; Ski; Snowboard; Scuba Diving
My Motto
Use the Right Tool for the Job
Why you are here
• You want to understand what Galera Cluster is
• You know what it is, but want to know more
• You’d like to grill the speaker with some nasty
questions about it (wait for the end!)
• You’re bored, with nothing better to do (a special
welcome to you!)
Agenda
• What is Galera?
• How does Galera work?
• What is a Node?
• Node Status
• Primary Component
• Quorum
• Data Replication (Synch.)
• Optimistic & Pessimistic locking
• Write-set Cache
• State Transfer
• Flow Control
• Apply DDL
• Geographic Distribution
• Galera & Binary Logs
• What to keep an Eye on
• Well-known issues
What is Galera?
(Virtually) Synchronous Replication:
– True multi-master
– No slave lag
– No master-slave failover or VIP
– Multi-threaded app layers
– Automatic node provisioning
– Elastic scale (in – out)
– Geographically distributed (with segments)
– Mix with Async replication
Figure: Web traffic, Balancer, Galera
Data Replication (sync)
Pros
– High Availability Synchronous replication provides highly available
clusters and guarantees 24/7 service availability, given that:
» No data loss when nodes crash.
» Data replicas remain consistent.
» No complex, time-consuming failovers.
– Improved Performance Synchronous replication allows you to execute transactions on
all nodes in the cluster in parallel, increasing performance.
– Causality across the Cluster Synchronous replication guarantees causality
across the whole cluster.
What is Galera NOT?
• Not a write-scalable solution
• Not great for a high amount of
parallel, small requests
• Not great for working with Foreign
Keys
• Not good for sharding Data (each
node has the entire dataset)
Figure: Web traffic, Balancer, Galera
Data Replication (sync) (adv)
Cons
– Does not scale on writes
– Uses a two-phase commit, or
distributed locking, with capacity
formula: m = n x o x t (where
m is messages/sec, n the number of
nodes, o the number of operations
per transaction, and t the
transaction throughput)
– More nodes, more deadlocks &
conflicts
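The capacity formula above is plain arithmetic, so a tiny worked example may help. This is only a sketch: the function name and the sample numbers are illustrative assumptions, and the reading of o as "operations per transaction" follows the usual statement of this formula for eager replication.

```python
def messages_per_second(nodes: int, ops_per_txn: int, txn_per_sec: float) -> float:
    """m = n x o x t: message load generated by eager (2PC / distributed locking) replication."""
    return nodes * ops_per_txn * txn_per_sec


# Hypothetical workload: 100 transactions/sec, 10 operations each.
for n in (3, 5, 7):
    print(n, "nodes ->", messages_per_second(n, 10, 100), "messages/sec")
```

Adding nodes multiplies the message load for the same workload, which is why this approach does not scale on writes.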
Comparing Galera with:
MHA
– Each Slave has its own position
– Data is replicated asynchronously
– In case of a crash, ONLY one server can
be elected, and in some cases it needs to
wait for updates from the binlog
Galera
– Data is the same at each finalized
commit
– All Nodes share the same position
– Any Node can be written to at any
time
Master
Log_pos=1000
Slave Log_pos=995
Slave Log_pos=993
Slave Log_pos=980
Slave Log_pos=998
Async Replication
In Case of Master crash
Election by position
Comparing Galera with:
Continuent Enterprise
– Applications connect to an entry point
– All data is distributed asynchronously
– A central point keeps information on all
Galera
– Applications can connect to any node
– Data is shared using XA transactions
– Status and State is at cluster level
Async
Replication
Canada Italy
Entry point
(man in the
middle)
Galera and HAProxy
Two friends working together
• Automatic Donor/fail/resurrection identification
• Automatic write distribution
• Light process scaling on Application server (no single point of failure)
Galera minimal requirements
• Transactional Database The database must be transactional.
Specifically, the database must be able to roll back uncommitted changes.
• Atomic Changes Replication events must change the
database atomically. Specifically, a series of database
operations must either all occur or not occur at all.
• Global Ordering Replication events must be ordered
globally. Specifically, they are applied on all instances in the same
order.
How does Galera work? 1
Main components corresponding to code
blocks
• Database Management System (DBMS) The database
server that runs on the individual node.
• wsrep API The interface and the responsibilities for the
database server and replication provider
• Galera Replication Plugin The plugin that enables
write-set replication service functionality.
• Group Communication plugins The group
communication systems available to Galera Cluster.
How does Galera work? 2
Main components (WSREP API)
• Is a generic replication plugin interface for databases
• Database servers have a state
• State refers to the contents of the database
• Changes in the database state are represented as a series of atomic
changes, or transactions
• In a database cluster, all nodes always have the same
state
How does Galera work? 3
Main components (Galera Replication Plugin)
The Galera Replication Plugin implements the wsrep API. It operates as
the wsrep provider.
• Certification Layer This layer prepares the write-sets and performs the
certification checks on them, ensuring that they can be applied.
• Replication Layer This layer manages the replication protocol and provides the
total ordering capability.
• Group Communication Framework This layer provides a plugin architecture
for the various group communication systems that connect to Galera Cluster.
How does Galera work? 4
Main components (Group communication plugin)
• Implements a virtual synchrony QoS (Quality of Service)
• Implements its own runtime-configurable temporal flow control. Flow
control keeps nodes synchronized to a fraction of a second
• Provides a total ordering of messages from multiple sources. It uses this
to generate Global Transaction ID’s in a multi-master cluster
• Is a symmetric undirected graph. All database nodes connect to each
other over a TCP connection
What is a Node? 1
• Standard MySQL
Replication
Master
Slave
Slave
• Galera MySQL Replication
Node
Node Node
9cba28fa-a8be-11e4-8f41-9f963e1dbf4f
What is a Node? 2
• Standard MySQL
Replication
– Each MySQL instance is
independent
– Data can be different per
node (schema, engine,
content)
• Galera MySQL Replication
– Data is the center
– Nodes connect and share
same data
– Nodes cannot (should not) be
different; they must have the same
STATE
What is a Node? 3
• Data is the center
– Data has an UUID =
• 9cba28fa-a8be-11e4-8f41-9f963e1dbf4f
– Data has a Position (seq number)
• wsrep_last_committed | 1398 |
• Position is the same in ANY Synchronized node
– Node has UUID
• 8186a31a-a8bf-11e4-9d19-6bd85d36493b
Node belongs to a cluster/Data and NOT vice versa.
What is a Node 4
1. A connecting node talks to one
node in the cluster
2. A DONOR is elected
3. The Donor shares Status and
Starts Synchronization
uuid: 9cba28fa-a8be-11e4-8f41-9f963e1dbf4f
seqno: 1950
New cluster view: global state: 9cba28fa-a8be-11e4-8f41-9f963e1dbf4f:2037,
view# 9: Primary, number of nodes: 5, my index: 2, protocol version 3
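The "New cluster view" line above carries the data UUID, the position (seqno), the view number, and the component status in one place. As a minimal sketch, it can be picked apart with a regular expression. The pattern below is an assumption built only from the sample line shown here, not an official log format specification.

```python
import re

# Pattern derived from the sample log line on this slide (assumption, not a spec).
VIEW_RE = re.compile(
    r"global state: (?P<uuid>[0-9a-f-]{36}):(?P<seqno>-?\d+),\s*"
    r"view# (?P<view>-?\d+): (?P<status>[\w-]+), "
    r"number of nodes: (?P<nodes>\d+), my index: (?P<index>-?\d+)"
)


def parse_cluster_view(line: str) -> dict:
    """Extract state UUID, seqno, view number, status and node count from a view line."""
    match = VIEW_RE.search(line)
    if match is None:
        raise ValueError("not a 'New cluster view' line")
    fields = match.groupdict()
    return {
        "uuid": fields["uuid"],
        "seqno": int(fields["seqno"]),
        "view": int(fields["view"]),
        "status": fields["status"],
        "nodes": int(fields["nodes"]),
        "index": int(fields["index"]),
    }


LOG_LINE = (
    "New cluster view: global state: 9cba28fa-a8be-11e4-8f41-9f963e1dbf4f:2037, "
    "view# 9: Primary, number of nodes: 5, my index: 2, protocol version 3"
)
print(parse_cluster_view(LOG_LINE))
```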
Segments
• A segment is a logical
grouping of nodes.
• Replication between Segments
is optimized
• Traffic and messaging are
reduced
• In case of SST, the donor is
chosen by proximity
Node Status
1. Node connects and
sends its status
2. Cluster provides a DONOR
3. Status (data) exchange
starts (node is a Joiner)
4. Donor ends transmission,
applies the “delta” and rejoins
5. Joiner -> Joined: checks the
seq_num and becomes Synced
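The status steps above can be sketched as a small state table. This is an illustration only: it covers just the transitions named on this slide (the real node state machine has more states and edges), and the names `NodeState`, `TRANSITIONS`, `is_allowed` and `walk` are hypothetical.

```python
from enum import Enum


class NodeState(Enum):
    JOINER = "Joiner"
    DONOR = "Donor"
    JOINED = "Joined"
    SYNCED = "Synced"


# Only the transitions mentioned on the slide; not the complete state machine.
TRANSITIONS = {
    NodeState.JOINER: {NodeState.JOINED},   # state transfer received
    NodeState.DONOR: {NodeState.JOINED},    # donor finished sending, catches up
    NodeState.JOINED: {NodeState.SYNCED},   # seqno has caught up with the cluster
    NodeState.SYNCED: {NodeState.DONOR},    # a synced node is picked as donor
}


def is_allowed(current: NodeState, target: NodeState) -> bool:
    """True if the slide describes a direct transition from current to target."""
    return target in TRANSITIONS[current]


def walk(path):
    """Follow a list of states, failing on any transition the table does not contain."""
    state = path[0]
    for target in path[1:]:
        if not is_allowed(state, target):
            raise ValueError(f"{state.value} -> {target.value} is not in the table")
        state = target
    return state


# Joiner path and donor path from the slide.
print(walk([NodeState.JOINER, NodeState.JOINED, NodeState.SYNCED]).value)
print(walk([NodeState.SYNCED, NodeState.DONOR, NodeState.JOINED, NodeState.SYNCED]).value)
```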
Primary Component
Under normal operations, the Primary
Component is the whole cluster.
When cluster partitioning occurs, Galera
Cluster invokes a special quorum algorithm
to select one component as the Primary
Component.
This guarantees that there is never more
than one Primary Component in the cluster.
Primary component
Primary Component 2
In case of a network issue, the cluster might
be split.
If the pc.weight and segments are set up
correctly, the nodes in the Non-Primary state
will attempt to rejoin the cluster.
This is an automatic recovery that may
trigger:
• IST
• SST
Primary
Non-Primary
Primary Component 3
When the cluster is NOT able to manage
WHO is the primary correctly, a so-called
“split brain” issue may occur.
Split Brain:
• Cannot be automatically recovered from
• Puts all nodes in READ ONLY mode
Non-Primary
Non-Primary
Split Brain
Quorum
Quorum can be managed using:
• pc.weight
• Segments
Segments do not modify the quorum calculation but are
useful to logically group servers.
• Zone 1: Segment=1, weight = 2
• Zone 2: Segment=2, weight = 1
Quorum (adv)
Galera tracks the presence of nodes, and any change in membership, in
VIEWS:
WSREP: view(view_id(PRIM,28b4b776,78)
memb { 28b4b776,1
79cc1886,1
8637105e,2
f218f33d,2}
joined {} left {}
partitioned { b9aabaa5,1 <--- node is shutting down})
78 is the VIEW number
PRIM defines the view as the Primary Component
The number after each member ID is the segment identifier
Quorum (adv)
Assuming 2 Segments with 3 nodes each
(columns View 1-3 show the "active" flag of each node)
      seg  weight  View 1  View 2  View 3
n1    1    1       1       0       1
n2    1    1       1       0       1
n3    1    1       1       0       1
n4    2    1       1       1       0
n5    2    1       1       1       0
n6    2    1       1       1       0
View 2: Segment 2, Quorum 0. View 3: Segment 1, Quorum 0.
In this case, in Views 2 and 3 we will not have a quorum, and
the Segments will become NON-PRIMARY
Quorum (adv)
Assuming 2 Segments with 3 nodes each
(columns View 1-3 show the "active" flag of each node)
      seg  weight  View 1  View 2  View 3
n1    1    1       1       1       0
n2    1    1       1       1       0
n3    1    1       1       1       0
n4    2    1       1       0       1
n5    2    1       1       0       1
n6    2    1       1       0       1
n7    3    1       1       1       1
View 2: Segment 1, Quorum 1. View 3: Segment 2, Quorum 1.
Using an arbitrator we can have the quorum.
BUT what if both segments can reach the arbitrator but not each other?
SPLIT BRAIN !!!
Quorum (adv)
Assuming 2 Segments with 3 nodes each
(columns show the "active" flag of each node)
      seg  weight  View 1  View 2 (1)  View 2 (2)
n1    1    4       1/0     1           0
n2    1    3       1       1           0
n3    1    1       1       1           0
n4    2    5       1       0           1
n5    2    1       1       0           1
n6    2    1       1       0           1
View 2 (1): Segment 1, Quorum 1. View 2 (2): Segment 2, Quorum 1.
In this case we will have a quorum in the partitioned views:
• Segment 1 always wins and will have the quorum
• Segment 2 will have the quorum in case of a planned switch, otherwise NON-PRIMARY
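The view tables above can be checked with a few lines of code. The sketch below assumes the weighted-quorum rule as described in the Galera documentation: a component keeps the quorum when (total weight of the last Primary Component, minus the weight of members that left gracefully) / 2 is strictly less than the total weight of the members still present. The function name `has_quorum` and the dictionaries are illustrative and not part of any Galera API.

```python
def has_quorum(weights, present, left_gracefully=frozenset()):
    """weights: node -> pc.weight for every member of the last Primary Component.

    present: nodes still visible in the current component.
    left_gracefully: nodes known to have left in a planned way.
    """
    total = sum(weights.values())
    left = sum(weights[node] for node in left_gracefully)
    current = sum(weights[node] for node in present)
    return (total - left) / 2 < current


SEG1 = {"n1", "n2", "n3"}
SEG2 = {"n4", "n5", "n6"}

equal = {node: 1 for node in SEG1 | SEG2}
with_arbitrator = dict(equal, n7=1)
weighted = {"n1": 4, "n2": 3, "n3": 1, "n4": 5, "n5": 1, "n6": 1}

print(has_quorum(equal, SEG1))                         # 3 vs 3: no quorum
print(has_quorum(with_arbitrator, SEG1 | {"n7"}))      # 4 of 7: quorum
print(has_quorum(weighted, SEG1))                      # 8 of 15: quorum
print(has_quorum(weighted, SEG2))                      # 7 of 15: no quorum
print(has_quorum(weighted, SEG2, left_gracefully=SEG1))  # planned switch: quorum
```

The last two calls match the bullets above: Segment 2 only keeps the quorum when Segment 1 leaves in a planned way.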
Quorum Summary
• Number of Nodes, Even/Odd, not really relevant
• Quorum weight is relevant
• Remember how the quorum is calculated for each View
• A witness node will NOT guarantee split-brain prevention; a real node
should
• HAProxy can help (a lot) to manage Segments
• Plan your cluster carefully, and check the View status before maintenance
Data Replication (sync)
• Happens at commit time, before the commit is finalized
• Transaction changes are
ordered by PK and collected in
a write set
• The write set is certified on each
node (including originator) for
apply/reject
• On failure, the originator rolls
back, while others discard the
write set
Data Replication (sync)(adv)
Local Certification issues
– Each re-ordered transaction
(deterministic) has a Seq_no
– Galera evaluates all transactions in
the queue from the last successfully
committed
– If another writeset in the queue is
conflicting, then the writeset in
evaluation is discarded, and rolled
back on the originator
– Counter is incremented only on
originator
Figure: cluster commit queue (6 5 4 3 2 1); write-set 5 conflicts and is discarded (wsrep_local_cert_failures)
Data Replication (sync) (adv)
Local certification issues (2)
– Transaction started, not
committed
– Incoming writeset is applied
– A lock conflict with local
open transaction is raised
– Incoming transaction (write
set) always wins
wsrep_local_bf_aborts
Data Replication (sync)
• Certification takes place on write-sets
• Each write-set contains references for each
affected key:
– Primary
– Unique
– Foreign key
• Keys are also maintained in a local
certification index for multi-master conflict
resolutions
Optimistic & Pessimistic locking
1. The originator has all internal locks
2. Originator ignores other nodes
3. On Commit, it optimistically sends the
modification
4. The write-set is reordered and goes through
a deterministic certification test
5. In the presence of a conflict, the last commit
loses
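The optimistic flow above can be illustrated with a deliberately simplified certification model. It is not Galera's implementation: it assumes each write-set records the last seqno its originator had seen, and that a write-set fails if any of its keys was changed by a write-set certified after that point. All class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteSet:
    origin: str
    keys: frozenset      # keys touched, e.g. (table, primary key)
    last_seen: int       # seqno the originator had applied when it committed


class Certifier:
    """Simplified certification index: key -> seqno of the last certified change."""

    def __init__(self):
        self.index = {}
        self.seqno = 0

    def certify(self, ws: WriteSet):
        self.seqno += 1  # position assigned by the total order
        for key in ws.keys:
            if self.index.get(key, 0) > ws.last_seen:
                return self.seqno, False   # conflict: the later commit loses
        for key in ws.keys:
            self.index[key] = self.seqno
        return self.seqno, True


def replay(ordered_write_sets):
    """Run the test on a fresh index; every node doing this gets the same answers."""
    certifier = Certifier()
    return [certifier.certify(ws) for ws in ordered_write_sets]


a = WriteSet("node1", frozenset({("t1", 1)}), last_seen=0)
b = WriteSet("node2", frozenset({("t1", 1)}), last_seen=0)  # same row, concurrent
c = WriteSet("node3", frozenset({("t1", 2)}), last_seen=0)  # different row
print(replay([a, b, c]))
```

Because the test depends only on the ordered stream of write-sets, every node reaches the same apply/reject decision without talking to the others.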
Write-set Cache
GCache A library that provides a transparent on-disk
memory buffer cache.
Its purpose is to allow an (almost) arbitrarily big action cache
without RAM consumption.
Permanent Ring-Buffer File Here, write-sets are pre-allocated
to disk during cache initialization.
State Transfer 1
The process of replicating data from the cluster to an
individual node, bringing that node into sync with the cluster.
AKA Provisioning.
Two ways of doing it:
• Incremental State Transfers (IST) Where only the missing
transactions transfer.
• State Snapshot Transfers (SST) Where a snapshot of the
entire node state transfers.
State Transfer 2
State transfers always require a:
– Donor
– Joiner
A Joiner is the node that requests the state transfer:
Member 0.1 (node3) requested state transfer from 'node5'
A Donor is the node providing the data; a Donor can be
blocked from serving incoming queries.
State Transfer 3
IST Incremental State Transfer transfers the missing delta (Δ)
between the Joiner and the Donor.
• State UUID must be the same as that of the group
• All missing write-sets are available in the donor’s write-set cache
• Much faster and non-blocking operation on the Donor
• IST has a well-known interval:
WSREP: IST request: 9cba28fa-a8be-11e4-8f41-9f963e1dbf4f:77030-85722|tcp://10.177.128.45:4568
• IST picks a donor that can provide the full write-set (WS) range (the donor
can change, even if one is defined)
State Transfer 4
SST State Snapshot Transfer is a full data copy from one
node to another.
This may happen because:
• A New Node joins the cluster
• The required write-sets (WS) are not present in the GCache of any Donor
Two approaches:
• Logical (mysqldump; export)
• Physical (rsync; xtrabackup)
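The choice between IST and SST described on the last two slides can be sketched as a simple check. This is an illustration under assumptions: the cache ranges per donor are invented numbers, the function name is hypothetical, and the real donor selection takes more factors into account (segments, configured donors).

```python
def choose_state_transfer(joiner_uuid, group_uuid, needed, donor_caches):
    """Return ("IST", donor) if some donor's cache covers the needed range, else ("SST", None).

    needed: (first_seqno, last_seqno) the joiner is missing.
    donor_caches: donor name -> (lowest_seqno, highest_seqno) held in its write-set cache.
    """
    if joiner_uuid != group_uuid:
        return ("SST", None)          # different state UUID: a full copy is required
    first, last = needed
    for donor, (low, high) in sorted(donor_caches.items()):
        if low <= first and high >= last:
            return ("IST", donor)     # the whole missing range is still cached
    return ("SST", None)              # nobody holds enough write-sets


GROUP = "9cba28fa-a8be-11e4-8f41-9f963e1dbf4f"
NEEDED = (77030, 85722)               # range taken from the IST request line above

# Hypothetical cache contents per donor.
caches = {"node4": (80000, 90000), "node5": (70000, 90000)}
print(choose_state_transfer(GROUP, GROUP, NEEDED, caches))
```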
Flow Control
Galera Cluster manages the replication process using a
feedback mechanism named FLOW CONTROL.
• Allows any node to pause and resume replication
• Prevents any node from lagging too far behind
Modes
– No Flow Control
– Write-set Caching
– Catching Up
– Cluster Sync
How Flow Control Works
1. Galera Cluster synchronously replicates write-sets on a
cluster-wide ordering.
2. Transactions received but not yet applied and committed
are placed in the receive queue (wsrep_local_recv_queue)
3. When the size of the queue exceeds the Flow Control Limit,
the node will send a FC pause.
4. When the queue is manageable again (below the limit), the
node removes the pause.
Flow Control States 1
Write-set Caching
• Happens when the node is a:
– Joiner
– Donor
• Write-set will be locally cached and applied later
Flow Control States 2
Catching up
• Happens when the node is:
– Joined
• Nodes in this state can apply write-sets but are still making up
the gap
• Cluster replication rate is tuned to the Joined Node’s capacity
• Applying a write-set is faster than executing a transaction
• With an empty (cold) Buffer Pool, operations will be slower
Flow Control States 3
Cluster Sync
• Happens when the node is:
– Synced
• By far the most common state
• Node enters Flow Control to limit the size of its receive queue
• Can be tuned with gcs.fc_limit, gcs.fc_factor
Flow Control
How small should my fc_limit be?
• Small enough to keep low the delay any node in the cluster
might have when applying cluster transactions
• Small enough to keep the certification interval small, which
minimizes replication conflicts on a cluster where writes
happen on all nodes
– A small fc_limit keeps the certification index smaller in memory
Manage Flow Control
What to check?
• wsrep_flow_control_sent; wsrep_flow_control_recv;
• wsrep_flow_control_paused; wsrep_flow_control_paused_ns
What can be tuned?
• Replication Rate (expert feature, do not touch)
• Flow control
– gcs.fc_limit (default 16, way too low for any real production workload)
– gcs.fc_factor (default 1, meaning replication resumes as soon as the
queue goes below fc_limit)
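How the two settings interact can be shown with a small simulation. This is a simplified model, not Galera code: it assumes a node pauses when its receive queue exceeds fc_limit and resumes when the queue drops below fc_limit x fc_factor, using the default values quoted above. The function name and the queue samples are made up for the example.

```python
def flow_control_trace(queue_lengths, fc_limit=16, fc_factor=1.0):
    """Return (step, event) pairs for pause/resume decisions over sampled queue lengths."""
    paused = False
    events = []
    for step, queue_length in enumerate(queue_lengths):
        if not paused and queue_length > fc_limit:
            paused = True
            events.append((step, "pause"))
        elif paused and queue_length < fc_limit * fc_factor:
            paused = False
            events.append((step, "resume"))
    return events


samples = [10, 17, 20, 16, 15, 8, 7]   # hypothetical wsrep_local_recv_queue samples
print(flow_control_trace(samples))                  # fc_factor = 1
print(flow_control_trace(samples, fc_factor=0.5))   # resume only below 8
```

A lower fc_factor keeps the cluster paused longer but avoids rapid pause/resume flapping around the limit.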
Flow Control
Badly tuned flow control (?)
Apply DDL
• Any DDL is a non-transactional operation
• Modifications raise a metadata lock (server/schema)
In a Galera Cluster, you can choose to run DDL using:
• TOI Total Order Isolation
• RSU Rolling Schema Upgrade
• pt-online-schema-change (recommended for large tables)
Apply DDL TOI
When using Total Order Isolation, the cluster will work as a
single server until the end of the process on ALL nodes.
Cluster will stay locked:
• Server Level For CREATE SCHEMA, GRANT and similar queries, where the
cluster cannot concurrently apply any other transactions.
• Schema Level For CREATE TABLE and similar queries, where the cluster
cannot concurrently apply any transactions that access the schema.
• Table Level For ALTER TABLE and similar queries, where the cluster cannot
concurrently apply any other transactions that access the table.
Apply DDL RSU
When using Rolling Schema Upgrade, each modification will
apply ONLY on the node where the command is executed.
• Different structure between Nodes
• Data inconsistency
• Dangerous use of WSREP_ON
(http://www.tusacentral.net/joomla/index.php/mysql-blogs/168-how-to-mess-up-your-data-using-one-command-in-mysqlgalera.html)
In short, this is potentially unsafe.
Apply DDL PT-OSC
When using pt-online-schema-change, the cluster will block
the nodes for a very short period of time: at the start and at
the end of the process.
• Data is replicated as a normal transaction
• Nodes maintain consistency
• No locking during the copy
• Is recoverable
Geographic distribution
Galera Cluster is well suited to cover a geographically
distributed scenario.
• Use a combination of Asynchronous and Synchronous
replication
• Use Master/Slave settings inside Galera
• Use of Segments
Galera and Binary logs
Not needed ?
For a long while I stated so, but today I am older and wiser.
• Useful to identify which transaction a seqno refers to
• Required when using a slave
• Must have it on at least 2 Nodes when using a slave
• Still an Option in case of DR (trust me I saw it!!)
Galera and Binary logs
Understand the differences between
SQL_LOG_BIN & WSREP_ON
• SQL_LOG_BIN (set to 0) will prevent ANY DML from being replicated
NOTE: In standard MySQL it excludes both DML and DDL
• WSREP_ON (set to OFF) will prevent ANY DML & DDL from being replicated
• Use of GLOBAL in this context will almost certainly (99%) cause data inconsistency
What to keep an eye on
Like any complex system, Galera Cluster requires your
attention in many areas; the most critical are:
• Certification
• Network performance
• Proper schema design (PK/UK/FK)
• Number of nodes (write distribution, not write scaling)
• Correctly plan schema modification
Well known Issues
• Foreign Keys
• Small (very small) transactions and highly parallel
committing
• WSREP_ON (global) == SQL_LOG_BIN=0
• Master/Slave is ok, but be careful when using filters
• Locks/Deadlocks can become more frequent
• Network support (documentation)
What Next?
Galera Operations:
• Installation, simple and distributed
• Add/remove a node
• Data consistency
• Debug issues using the log
• Data export/Load
• Backups
• Monitoring
Q & A
Thank you
To contact us
sales@pythian.com
1-877-PYTHIAN
To follow us
http://www.pythian.com/blog
http://www.facebook.com/pages/The-Pythian-Group/163902527671
@pythian
http://www.linkedin.com/company/pythian
To contact Me
tusa@pythian.com
marcotusa@tusacentral.net
To follow me
http://www.tusacentral.net/
https://www.facebook.com/marco.tusa.94
@marcotusa
http://it.linkedin.com/in/marcotusa/

More Related Content

What's hot

ProxySQL for MySQL
ProxySQL for MySQLProxySQL for MySQL
ProxySQL for MySQL
Mydbops
 
Errant GTIDs breaking replication @ Percona Live 2019
Errant GTIDs breaking replication @ Percona Live 2019Errant GTIDs breaking replication @ Percona Live 2019
Errant GTIDs breaking replication @ Percona Live 2019
Dieter Adriaenssens
 
Almost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
Almost Perfect Service Discovery and Failover with ProxySQL and OrchestratorAlmost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
Almost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
Jean-François Gagné
 
Galera Cluster Best Practices for DBA's and DevOps Part 1
Galera Cluster Best Practices for DBA's and DevOps Part 1Galera Cluster Best Practices for DBA's and DevOps Part 1
Galera Cluster Best Practices for DBA's and DevOps Part 1
Codership Oy - Creators of Galera Cluster
 
MariaDB Galera Cluster
MariaDB Galera ClusterMariaDB Galera Cluster
MariaDB Galera ClusterAbdul Manaf
 
MySQL GTID Concepts, Implementation and troubleshooting
MySQL GTID Concepts, Implementation and troubleshooting MySQL GTID Concepts, Implementation and troubleshooting
MySQL GTID Concepts, Implementation and troubleshooting
Mydbops
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated Environment
Jean-François Gagné
 
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
Severalnines
 
Running MariaDB in multiple data centers
Running MariaDB in multiple data centersRunning MariaDB in multiple data centers
Running MariaDB in multiple data centers
MariaDB plc
 
Wars of MySQL Cluster ( InnoDB Cluster VS Galera )
Wars of MySQL Cluster ( InnoDB Cluster VS Galera ) Wars of MySQL Cluster ( InnoDB Cluster VS Galera )
Wars of MySQL Cluster ( InnoDB Cluster VS Galera )
Mydbops
 
Advanced Percona XtraDB Cluster in a nutshell... la suite
Advanced Percona XtraDB Cluster in a nutshell... la suiteAdvanced Percona XtraDB Cluster in a nutshell... la suite
Advanced Percona XtraDB Cluster in a nutshell... la suite
Kenny Gryp
 
My sql failover test using orchestrator
My sql failover test  using orchestratorMy sql failover test  using orchestrator
My sql failover test using orchestrator
YoungHeon (Roy) Kim
 
[2018] MySQL 이중화 진화기
[2018] MySQL 이중화 진화기[2018] MySQL 이중화 진화기
[2018] MySQL 이중화 진화기
NHN FORWARD
 
MySQL GTID 시작하기
MySQL GTID 시작하기MySQL GTID 시작하기
MySQL GTID 시작하기
I Goo Lee
 
Intro ProxySQL
Intro ProxySQLIntro ProxySQL
Intro ProxySQL
I Goo Lee
 
Maxscale_메뉴얼
Maxscale_메뉴얼Maxscale_메뉴얼
Maxscale_메뉴얼
NeoClova
 
MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...
MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...
MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...
Jean-François Gagné
 
Introduction to Galera Cluster
Introduction to Galera ClusterIntroduction to Galera Cluster
Introduction to Galera Cluster
Codership Oy - Creators of Galera Cluster
 
The Proxy Wars - MySQL Router, ProxySQL, MariaDB MaxScale
The Proxy Wars - MySQL Router, ProxySQL, MariaDB MaxScaleThe Proxy Wars - MySQL Router, ProxySQL, MariaDB MaxScale
The Proxy Wars - MySQL Router, ProxySQL, MariaDB MaxScale
Colin Charles
 
MySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitationsMySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitations
Jean-François Gagné
 

What's hot (20)

ProxySQL for MySQL
ProxySQL for MySQLProxySQL for MySQL
ProxySQL for MySQL
 
Errant GTIDs breaking replication @ Percona Live 2019
Errant GTIDs breaking replication @ Percona Live 2019Errant GTIDs breaking replication @ Percona Live 2019
Errant GTIDs breaking replication @ Percona Live 2019
 
Almost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
Almost Perfect Service Discovery and Failover with ProxySQL and OrchestratorAlmost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
Almost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
 
Galera Cluster Best Practices for DBA's and DevOps Part 1
Galera Cluster Best Practices for DBA's and DevOps Part 1Galera Cluster Best Practices for DBA's and DevOps Part 1
Galera Cluster Best Practices for DBA's and DevOps Part 1
 
MariaDB Galera Cluster
MariaDB Galera ClusterMariaDB Galera Cluster
MariaDB Galera Cluster
 
MySQL GTID Concepts, Implementation and troubleshooting
MySQL GTID Concepts, Implementation and troubleshooting MySQL GTID Concepts, Implementation and troubleshooting
MySQL GTID Concepts, Implementation and troubleshooting
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated Environment
 
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
 
Running MariaDB in multiple data centers
Running MariaDB in multiple data centersRunning MariaDB in multiple data centers
Running MariaDB in multiple data centers
 
Wars of MySQL Cluster ( InnoDB Cluster VS Galera )
Wars of MySQL Cluster ( InnoDB Cluster VS Galera ) Wars of MySQL Cluster ( InnoDB Cluster VS Galera )
Wars of MySQL Cluster ( InnoDB Cluster VS Galera )
 
Advanced Percona XtraDB Cluster in a nutshell... la suite
Advanced Percona XtraDB Cluster in a nutshell... la suiteAdvanced Percona XtraDB Cluster in a nutshell... la suite
Advanced Percona XtraDB Cluster in a nutshell... la suite
 
My sql failover test using orchestrator
My sql failover test  using orchestratorMy sql failover test  using orchestrator
My sql failover test using orchestrator
 
[2018] MySQL 이중화 진화기
[2018] MySQL 이중화 진화기[2018] MySQL 이중화 진화기
[2018] MySQL 이중화 진화기
 
MySQL GTID 시작하기
MySQL GTID 시작하기MySQL GTID 시작하기
MySQL GTID 시작하기
 
Intro ProxySQL
Intro ProxySQLIntro ProxySQL
Intro ProxySQL
 
Maxscale_메뉴얼
Maxscale_메뉴얼Maxscale_메뉴얼
Maxscale_메뉴얼
 
MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...
MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...
MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...
 
Introduction to Galera Cluster
Introduction to Galera ClusterIntroduction to Galera Cluster
Introduction to Galera Cluster
 
The Proxy Wars - MySQL Router, ProxySQL, MariaDB MaxScale
The Proxy Wars - MySQL Router, ProxySQL, MariaDB MaxScaleThe Proxy Wars - MySQL Router, ProxySQL, MariaDB MaxScale
The Proxy Wars - MySQL Router, ProxySQL, MariaDB MaxScale
 
MySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitationsMySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitations
 

Viewers also liked

Galera cluster for MySQL - Introduction Slides
Galera cluster for MySQL - Introduction SlidesGalera cluster for MySQL - Introduction Slides
Galera cluster for MySQL - Introduction Slides
Severalnines
 
Galera Replication Demystified: How Does It Work?
Galera Replication Demystified: How Does It Work?Galera Replication Demystified: How Does It Work?
Galera Replication Demystified: How Does It Work?
Frederic Descamps
 
Choosing A Concurrency Model, Optimistic Or Pessimistic
Choosing A Concurrency Model, Optimistic Or PessimisticChoosing A Concurrency Model, Optimistic Or Pessimistic
Choosing A Concurrency Model, Optimistic Or Pessimistic
Vinod Kumar
 
Java Persistence API 2.0: An Overview
Java Persistence API 2.0: An OverviewJava Persistence API 2.0: An Overview
Java Persistence API 2.0: An Overview
Sanjeeb Sahoo
 
Best practices for MySQL High Availability
Best practices for MySQL High AvailabilityBest practices for MySQL High Availability
Best practices for MySQL High Availability
Colin Charles
 
SQL window functions for MySQL
SQL window functions for MySQLSQL window functions for MySQL
SQL window functions for MySQL
Dag H. Wanvik
 
MySQL 8.0: Common Table Expressions
MySQL 8.0: Common Table Expressions MySQL 8.0: Common Table Expressions
MySQL 8.0: Common Table Expressions
oysteing
 
Hadoop HDFS Detailed Introduction
Hadoop HDFS Detailed IntroductionHadoop HDFS Detailed Introduction
Hadoop HDFS Detailed Introduction
Hanborq Inc.
 
MySQL Group Replication
MySQL Group ReplicationMySQL Group Replication
MySQL Group Replication
Kenny Gryp
 
How Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lagHow Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lag
Jean-François Gagné
 

Viewers also liked (10)

Galera cluster for MySQL - Introduction Slides
Galera cluster for MySQL - Introduction SlidesGalera cluster for MySQL - Introduction Slides
Galera cluster for MySQL - Introduction Slides
 
Galera Replication Demystified: How Does It Work?
Galera Replication Demystified: How Does It Work?Galera Replication Demystified: How Does It Work?
Galera Replication Demystified: How Does It Work?
 
Choosing A Concurrency Model, Optimistic Or Pessimistic
Choosing A Concurrency Model, Optimistic Or PessimisticChoosing A Concurrency Model, Optimistic Or Pessimistic
Choosing A Concurrency Model, Optimistic Or Pessimistic
 
Java Persistence API 2.0: An Overview
Java Persistence API 2.0: An OverviewJava Persistence API 2.0: An Overview
Java Persistence API 2.0: An Overview
 
Best practices for MySQL High Availability
Best practices for MySQL High AvailabilityBest practices for MySQL High Availability
Best practices for MySQL High Availability
 
SQL window functions for MySQL
SQL window functions for MySQLSQL window functions for MySQL
SQL window functions for MySQL
 
MySQL 8.0: Common Table Expressions
MySQL 8.0: Common Table Expressions MySQL 8.0: Common Table Expressions
MySQL 8.0: Common Table Expressions
 
Hadoop HDFS Detailed Introduction
Hadoop HDFS Detailed IntroductionHadoop HDFS Detailed Introduction
Hadoop HDFS Detailed Introduction
 
MySQL Group Replication
MySQL Group ReplicationMySQL Group Replication
MySQL Group Replication
 
How Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lagHow Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lag
 


Galera explained 3

  • 1. Galera Explained A Beginner (?) Level Tutorial Marco “The Grinch” Tusa 2015
  • 2. About me Marco “The Grinch” • Former UN, MySQL AB, Pythian, Percona • 2 kids, 1 wife • Ex-MySQL AB employee • History of Religions; Ski; Snowboard; Scuba Diving
  • 3. My Motto Use the Right Tool for the Job
  • 4. Why you are here • You want to understand what Galera Cluster is • You know what it is, but want to know more • You’d like to grill the speaker with some nasty questions about it (wait for the end!) • You’re bored, with nothing better to do (a special welcome to you!)
  • 5. Agenda • What is Galera? • How does Galera work? • What is a Node? • Node Status • Primary Component • Quorum • Data Replication (Synch.) • Optimistic & Pessimistic locking • Write-set Cache • State Transfer • Flow Control • Apply DDL • Geographic Distribution • Galera & Binary Logs • What to keep an Eye on • Well-known issues
  • 6. What is Galera? (Virtually) Synchronous Replication: – True multi-master – No slave lag – No master-slave failover or VIP – Multi-threaded app layers – Automatic node provisioning – Elastic scale (in – out) – Geographically distributed (with segments) – Can be mixed with async replication [Diagram: Web traffic, Balancer, Galera]
  • 7. Data Replication (sync) Pros – High Availability: synchronous replication provides highly available clusters and guarantees 24/7 service availability, given that: » No data is lost when nodes crash. » Data replicas remain consistent. » No complex, time-consuming failovers are needed. – Improved Performance: synchronous replication allows you to execute transactions on all nodes in the cluster in parallel, increasing performance. – Causality across the Cluster: synchronous replication guarantees causality across the whole cluster.
  • 8. What is Galera NOT? • Not a write-scalable solution • Not great for a high volume of parallel, small requests • Not great for working with Foreign Keys • Not good for sharding data (each node has the entire dataset) [Diagram: Web traffic, Balancer, Galera]
  • 9. Data Replication (sync) (adv) Cons – Does not scale on writes – Uses a two-phase commit or distributed locking, with the capacity formula m = n × o × t (where m is the number of messages per second, n is the number of nodes, o is the number of operations, and t is the transaction throughput) – More nodes mean more deadlocks and conflicts
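The capacity formula on slide 9 can be made concrete with a small worked example. This is only a sketch: the function name and the sample figures (4 operations, 500 transactions per second) are illustrative assumptions, not numbers from the talk.

```python
def messages_per_second(nodes: int, operations: int, tps: float) -> float:
    """Capacity formula from the slide: m = n x o x t."""
    return nodes * operations * tps


# Assumed workload: 4 operations per transaction at 500 transactions/sec.
# The message load grows linearly with the number of nodes.
for n in (3, 5, 9):
    print(n, "nodes ->", messages_per_second(n, operations=4, tps=500), "messages/sec")
```

Under these assumed figures, growing the cluster from 3 to 9 nodes triples the message load for the same workload, which is the point behind "does not scale on writes".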
  • 10. Comparing Galera with: MHA
        MHA – Each Slave has its own position – Data is replicated asynchronously – In case of a crash, ONLY one server can be elected, and in some cases it needs to wait for updates from the binlog
        Galera – Data is the same at each finalized commit – All Nodes share the same position – Any Node can be written to at any time
        [Diagram: Async Replication; Master Log_pos=1000; Slaves Log_pos=995, 993, 980, 998; in case of Master crash, election by position]
  • 11. Comparing Galera with: Continuent Enterprise
        Continuent Enterprise – Applications connect to an entry point – All data is distributed asynchronously – A central point keeps information on all nodes
        Galera – Applications can connect to any node – Data is shared using XA transactions – Status and State are at cluster level
        [Diagram: Async Replication between Canada and Italy; entry point (man in the middle)]
  • 12. Galera and HAProxy Two friends working together • Automatic Donor/fail/resurrection identification • Automatic write distribution • Light process scaling on Application server (no single point of failure)
  • 13. Galera minimal requirements • Transactional Database: the database must be transactional. Specifically, it must be able to roll back uncommitted changes. • Atomic Changes: replication events must change the database atomically. Specifically, a series of database operations must either all occur or not occur at all. • Global Ordering: replication events must be ordered globally. Specifically, they are applied on all instances in the same order.
  • 14. How does Galera work? 1 Main components corresponding to code blocks • Database Management System (DBMS) The database server that runs on the individual node. • wsrep API The interface and the responsibilities for the database server and replication provider • Galera Replication Plugin The plugin that enables write-set replication service functionality. • Group Communication plugins The group communication systems available to Galera Cluster.
  • 15. How does Galera work? 2 Main components (WSREP API) • Is a generic replication plugin interface for databases • Database servers have a state • State refers to the contents of the database • Changes in the database state are represented as a series of atomic changes, or transactions • In a database cluster, all nodes always have the same state
  • 16. How does Galera work? 3 Main components (Galera Replication Plugin) The Galera Replication Plugin implements the wsrep API. It operates as the wsrep provider. • Certification Layer This layer prepares the write-sets and performs the certification checks on them, ensuring that they can be applied. • Replication Layer This layer manages the replication protocol and provides the total ordering capability. • Group Communication Framework This layer provides a plugin architecture for the various group communication systems that connect to Galera Cluster.
  • 17. How does Galera work? 4 Main components (Group communication plugin) • Implements a virtual synchrony QoS (Quality of Service) • Implements its own runtime-configurable temporal flow control. Flow control keeps nodes synchronized to within a fraction of a second • Provides a total ordering of messages from multiple sources. It uses this to generate Global Transaction IDs in a multi-master cluster • Is a symmetric undirected graph. All database nodes connect to each other over a TCP connection
  • 18. What is a Node? 1 • Standard MySQL Replication [Diagram: Master, Slave, Slave] • Galera MySQL Replication [Diagram: Node, Node, Node around 9cba28fa-a8be-11e4-8f41-9f963e1dbf4f]
  • 19. What is a Node? 2 • Standard MySQL Replication – Each MySQL instance is independent – Data can be different per node (schema, engine, content) • Galera MySQL Replication – Data is the center – Nodes connect and share the same data – Nodes cannot (should not) differ, and must have the same STATE
  • 20. What is a Node? 3 • Data is the center – Data has a UUID: 9cba28fa-a8be-11e4-8f41-9f963e1dbf4f – Data has a Position (sequence number): wsrep_last_committed = 1398 – The Position is the same in ANY Synchronized node – Each Node has its own UUID: 8186a31a-a8bf-11e4-9d19-6bd85d36493b • A Node belongs to a cluster/Data and NOT vice versa.
  • 21. What is a Node? 4 1. A connecting node talks to one node in the cluster 2. A DONOR is elected 3. The Donor shares its Status and starts Synchronization
        uuid: 9cba28fa-a8be-11e4-8f41-9f963e1dbf4f seqno: 1950
        New cluster view: global state: 9cba28fa-a8be-11e4-8f41-9f963e1dbf4f:2037, view# 9: Primary, number of nodes: 5, my index: 2, protocol version 3
  • 22. Segments • A segment is a logical grouping of nodes. • Replication between segments is optimized • Traffic and messaging are reduced • In case of SST, the donor is chosen by proximity
  • 23. Node Status 1. The node connects and sends its status 2. The cluster provides a DONOR 3. The status (data) exchange starts (the node is a Joiner) 4. The Donor ends the transmission, applies the “delta” and rejoins 5. Joiner -> Joined: the node checks the seq_num and becomes Synced
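The state that slides 20 and 21 describe is a pair: the data UUID plus a sequence number. As a rough illustration, the "global state" field of the log line quoted on slide 21 can be pulled apart like this. The regular expression and the `ClusterState` type are assumptions made for this sketch, not part of Galera.

```python
import re
from typing import NamedTuple


class ClusterState(NamedTuple):
    uuid: str   # identifies the data set (the cluster), not a node
    seqno: int  # position of the last committed write-set


# Assumed pattern: "global state: <36-char uuid>:<seqno>"
STATE_RE = re.compile(r"global state:\s*([0-9a-f-]{36}):(-?\d+)")


def parse_global_state(log_line: str) -> ClusterState:
    match = STATE_RE.search(log_line)
    if match is None:
        raise ValueError("no global state found in log line")
    return ClusterState(match.group(1), int(match.group(2)))


line = ("New cluster view: global state: "
        "9cba28fa-a8be-11e4-8f41-9f963e1dbf4f:2037, view# 9: Primary, "
        "number of nodes: 5, my index: 2, protocol version 3")
state = parse_global_state(line)
print(state.uuid, state.seqno)
```

Two nodes are in the same state only when both fields match, which is why the position is the same on every synchronized node.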
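The Joiner -> Joined -> Synced progression is visible through the `wsrep_local_state` status variable. A minimal sketch of a health-check decision built on it follows. The numeric values are the ones documented for `wsrep_local_state`; the `is_safe_for_traffic` policy and its `allow_donor` flag are hypothetical and only illustrate what a balancer check might do.

```python
from enum import IntEnum


class NodeState(IntEnum):
    # Values as reported by the wsrep_local_state status variable.
    JOINING = 1
    DONOR_DESYNCED = 2
    JOINED = 3
    SYNCED = 4


def is_safe_for_traffic(state: NodeState, allow_donor: bool = False) -> bool:
    """Hypothetical policy: route traffic only to Synced nodes,
    optionally also to a Donor (for example during a non-blocking SST)."""
    if state is NodeState.SYNCED:
        return True
    return allow_donor and state is NodeState.DONOR_DESYNCED


for value in (1, 2, 3, 4):
    state = NodeState(value)
    print(state.name, is_safe_for_traffic(state))
```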
  • 24. Primary Component Under normal operations, the Primary Component is the whole cluster. When cluster partitioning occurs, Galera Cluster invokes a special quorum algorithm to select one component as the Primary Component. This guarantees that there is never more than one Primary Component in the cluster.
  • 25. Primary Component 2 In case of a network issue, the cluster might be split. If the pc.weight and segments are set up correctly, the nodes in the Non-Primary state will attempt to rejoin the cluster. This is an automatic recovery that may trigger: • IST • SST [Diagram: Primary and Non-Primary components]
  • 26. Primary Component 3 When the cluster is NOT able to manage WHO is the primary correctly, a so-called “split brain” issue may occur. Split Brain: • Cannot be automatically recovered from • Puts all nodes in READ ONLY mode [Diagram: two Non-Primary components, Split Brain]
  • 27. Quorum Quorum can be managed using: • pc.weight • Segments Segments do not modify the quorum calculation but are useful to logically group servers. • Zone 1: Segment=1, weight = 2 • Zone 2: Segment=2, weight = 1
  • 29. Quorum (adv) Galera organizes the presence/modification of nodes in VIEWS:
        WSREP: view(view_id(PRIM,28b4b776,78) memb { 28b4b776,1 79cc1886,1 8637105e,2 f218f33d,2} joined {} left {} partitioned { b9aabaa5,1 })
        – 78 is the VIEW number – PRIM defines the view as the Primary Component – The number after each node ID (1 or 2) is the segment identifier – b9aabaa5 is a node that is shutting down
  • 30. Quorum (adv) Assuming 2 Segments with 3 nodes each
        node  seg  weight | View 1 active | View 2 active | View 3 active
        n1    1    1      | 1             | 0             | 1
        n2    1    1      | 1             | 0             | 1
        n3    1    1      | 1             | 0             | 1
        n4    2    1      | 1             | 1             | 0
        n5    2    1      | 1             | 1             | 0
        n6    2    1      | 1             | 1             | 0
        View 2: Segment 2, Quorum 0. View 3: Segment 1, Quorum 0.
        In this case, in VIEW 2|3 we will not have a quorum and the Segments will become NON-PRIMARY
  • 31. Quorum (adv) Assuming 2 Segments with 3 nodes each, plus an arbitrator (n7) in a third segment
        node  seg  weight | View 1 active | View 2 active | View 3 active
        n1    1    1      | 1             | 1             | 0
        n2    1    1      | 1             | 1             | 0
        n3    1    1      | 1             | 1             | 0
        n4    2    1      | 1             | 0             | 1
        n5    2    1      | 1             | 0             | 1
        n6    2    1      | 1             | 0             | 1
        n7    3    1      | 1             | 1             | 1
        View 2: Segment 1, Quorum 1. View 3: Segment 2, Quorum 1.
        Using an arbitrator we can have the quorum. BUT what if both segments can reach the arbitrator but not each other? SPLIT BRAIN !!!
  • 32. Quorum (adv) Assuming 2 Segments with 3 nodes each
        node  seg  weight | View 1 active | View 2 (1) active | View 2 (2) active
        n1    1    4      | 1/0           | 1                 | 0
        n2    1    3      | 1             | 1                 | 0
        n3    1    1      | 1             | 1                 | 0
        n4    2    5      | 1             | 0                 | 1
        n5    2    1      | 1             | 0                 | 1
        n6    2    1      | 1             | 0                 | 1
        View 2 (1): Segment 1, Quorum 1. View 2 (2): Segment 2, Quorum 1.
        In this case, in VIEW 2|3 we will have a quorum • Segment 1 always wins and will have the quorum • Segment 2 will have the quorum in case of a planned switch, otherwise NON-PRIMARY
  • 33. Quorum Summary • The number of nodes, even or odd, is not really relevant • Quorum weight is relevant • Remember the view quorum calculation • A witness node will NOT guarantee split-brain prevention; a real node should • HAProxy can help (a lot) to manage Segments • Plan your cluster carefully, and check the View status before maintenance
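To check a view before maintenance, the log line on slide 29 can be read programmatically. The sketch below assumes the layout shown on the slide (8-character node IDs followed by a segment number); the regular expressions and the returned dictionary shape are choices made for this illustration only.

```python
import re

VIEW_ID_RE = re.compile(r"view_id\((\w+),(\w+),(\d+)\)")
GROUP_RE = re.compile(r"(memb|joined|left|partitioned)\s*\{([^}]*)\}")
NODE_RE = re.compile(r"([0-9a-f]{8}),(\d+)")


def parse_view(text: str) -> dict:
    """Extract view type, view number and (node_id, segment) pairs."""
    header = VIEW_ID_RE.search(text)
    if header is None:
        raise ValueError("not a view line")
    kind, _representative, number = header.groups()
    view = {"primary": kind == "PRIM", "view_number": int(number)}
    for name, body in GROUP_RE.findall(text):
        view[name] = [(node, int(seg)) for node, seg in NODE_RE.findall(body)]
    return view


log = ("WSREP: view(view_id(PRIM,28b4b776,78) memb { 28b4b776,1 79cc1886,1 "
       "8637105e,2 f218f33d,2} joined {} left {} partitioned { b9aabaa5,1 })")
view = parse_view(log)
print(view["view_number"], view["primary"])
print("members:", view["memb"])
print("partitioned:", view["partitioned"])
```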
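The outcomes in the quorum views can be reproduced with a few lines of arithmetic. This sketch assumes the weighted-quorum rule described in the Galera documentation: a component keeps quorum when the weight of its members is strictly greater than half of the previous Primary Component's weight, after subtracting the weight of nodes that left gracefully. The function and variable names are illustrative.

```python
def has_quorum(previous, current, left_gracefully=frozenset()):
    """previous: {node: weight} of the last Primary Component.
    current: nodes still reachable in this component.
    left_gracefully: nodes that shut down cleanly (planned switch)."""
    total = sum(previous.values())
    left = sum(previous[node] for node in left_gracefully)
    present = sum(previous[node] for node in current)
    return present > (total - left) / 2


segment1, segment2 = {"n1", "n2", "n3"}, {"n4", "n5", "n6"}

# Slide 30: every node has weight 1, so each half holds 3 of 6: no quorum.
equal = {node: 1 for node in segment1 | segment2}
print(has_quorum(equal, segment1), has_quorum(equal, segment2))

# Slide 32: weights 4, 3, 1 in segment 1 and 5, 1, 1 in segment 2.
weighted = {"n1": 4, "n2": 3, "n3": 1, "n4": 5, "n5": 1, "n6": 1}
print(has_quorum(weighted, segment1))          # 8 > 7.5
print(has_quorum(weighted, segment2))          # 7 > 7.5 fails
print(has_quorum(weighted, segment2, {"n1"}))  # planned: 7 > 5.5
```

The last call models the "1/0" entry for n1 in View 1: once n1 has left gracefully, its weight no longer counts against segment 2.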
  • 34. Data Replication (sync) • Replication happens at commit time, before the commit is finalized • Transaction changes are ordered by PK and collected in a write-set • The write-set is certified on each node (including the originator) for apply/reject • On failure, the originator rolls back, while the other nodes discard the write-set
  • 35. Data Replication (sync) (adv) Local certification issues – Each re-ordered transaction (deterministic) has a Seq_no – Galera evaluates all transactions in the queue from the last successfully committed one – If another write-set in the queue conflicts, the write-set under evaluation is discarded and rolled back on the originator – The wsrep_local_cert_failures counter is incremented only on the originator [Diagram: cluster commit queue holding write-sets 1 to 6; write-set 5 conflicts and is discarded]
  • 36. Data Replication (sync) (adv) Local certification issues (2) – A local transaction is started but not yet committed – An incoming write-set is applied – A lock conflict with the local open transaction is raised – The incoming transaction (write-set) always wins, and the abort is counted in wsrep_local_bf_aborts
  • 37. Data Replication (sync) • Certification takes place on write-sets • Each write-set contains references for each affected key: – Primary – Unique – Foreign key • Keys are also maintained in a local certification index for multi-master conflict resolution
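A toy model of the certification test may help here. It is a deliberate simplification, not Galera's implementation: each write-set carries its global sequence number, the last sequence number its originator had seen, and the keys it touches, and it is rejected when a key it touches was changed by a write-set the originator had not yet seen. All names are assumptions for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteSet:
    seqno: int        # global order assigned by replication
    last_seen: int    # last seqno visible to the originator
    keys: frozenset   # (table, key) pairs touched by the transaction


def certify(queue):
    """Deterministic test: every node gets the same result from the same queue."""
    index = {}  # certification index: key -> seqno of last certified writer
    applied, rejected = [], []
    for ws in sorted(queue, key=lambda w: w.seqno):
        if any(index.get(key, -1) > ws.last_seen for key in ws.keys):
            rejected.append(ws.seqno)   # originator rolls back
            continue
        for key in ws.keys:
            index[key] = ws.seqno
        applied.append(ws.seqno)
    return applied, rejected


queue = [
    WriteSet(1, 0, frozenset({("t", "a")})),
    WriteSet(2, 0, frozenset({("t", "b")})),
    WriteSet(3, 1, frozenset({("t", "c")})),
    WriteSet(4, 3, frozenset({("t", "x")})),
    WriteSet(5, 3, frozenset({("t", "x")})),  # did not see write-set 4
    WriteSet(6, 5, frozenset({("t", "a")})),
]
applied, rejected = certify(queue)
print("applied:", applied, "rejected:", rejected)
```

Write-set 5 loses because it touches the same key as write-set 4 without having seen it, mirroring the "conflict, 5 discarded" case on slide 35.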
  • 38. Optimistic & Pessimistic locking 1. The originator has all internal locks 2. Originator ignores other nodes 3. On Commit, it optimistically sends the modification 4. The write-set is reordered and goes through a deterministic certification test 5. In the presence of a conflict, the last commit loses
  • 39. Write-set Cache GCache A library that provides a transparent on-disk memory buffer cache. Its purpose is to allow an (almost) arbitrarily big action cache without RAM consumption. Permanent Ring-Buffer File Here, write-sets are pre-allocated to disk during cache initialization.
  • 40. State Transfer 1 The process of replicating data from the cluster to an individual node, bringing that node into sync with the cluster. AKA Provisioning. Two ways of doing it: • Incremental State Transfers (IST) Where only the missing transactions transfer. • State Snapshot Transfers (SST) Where a snapshot of the entire node state transfers.
  • 41. State Transfer 2 State transfers always require a: – Donor – Joiner The Joiner is the node that requests the state transfer, for example: Member 0.1 (node3) requested state transfer from 'node5' The Donor is the node providing the data; during the transfer the donor can be blocked from serving incoming queries.
  • 42. State Transfer 3 IST Incremental State Transfer transfers the missing delta between the Joiner and the Donor. • State UUID must be the same as that of the group • All missing write-sets are available in the donor’s write-set cache • Much faster and non-blocking operation on the Donor • IST has a well-known interval: WSREP: IST request: 9cba28fa-a8be-11e4-8f41-9f963e1dbf4f:77030-85722|tcp://10.177.128.45:4568 • IST picks the donor that can provide the full WS range (also, if defined, the donor can change)
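The two IST preconditions on this slide (same state UUID, full write-set range available on the donor) can be sketched as a check against the IST request line. The helper names and the donor cache boundaries below are hypothetical, and the sketch assumes the `first-last` part of the request is the seq_no interval the joiner is missing.

```python
import re

# Shape of the request as shown on the slide: <uuid>:<first>-<last>|<address>
IST_REQUEST = re.compile(
    r"(?P<uuid>[0-9a-f-]{36}):(?P<first>\d+)-(?P<last>\d+)\|(?P<addr>\S+)"
)


def parse_ist_request(text):
    """Extract (uuid, first seq_no, last seq_no, address) from a log line."""
    match = IST_REQUEST.search(text)
    if match is None:
        raise ValueError("not an IST request line")
    return (match.group("uuid"), int(match.group("first")),
            int(match.group("last")), match.group("addr"))


def can_serve_ist(request, group_uuid, cache_first, cache_last):
    """True if the donor's cached seq_no range covers the whole request."""
    uuid, first, last, _addr = request
    return uuid == group_uuid and cache_first <= first and last <= cache_last


line = ("WSREP: IST request: 9cba28fa-a8be-11e4-8f41-9f963e1dbf4f:"
        "77030-85722|tcp://10.177.128.45:4568")
request = parse_ist_request(line)
group = "9cba28fa-a8be-11e4-8f41-9f963e1dbf4f"

# Hypothetical donor cache ranges, for illustration only.
ist_possible = can_serve_ist(request, group, cache_first=70000, cache_last=90000)
sst_needed = not can_serve_ist(request, group, cache_first=80000, cache_last=90000)
print(request, ist_possible, sst_needed)
```

When no donor can cover the requested interval, the joiner has to fall back to a full state snapshot transfer.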
  • 43. State Transfer 4 SST State Snapshot Transfer is a full data copy from one node to another. This may happen because: • A new node joins the cluster • The required write-sets are not present in the GCache of any Donor Two approaches: • Logical (mysqldump; export) • Physical (rsync; xtrabackup)
  • 44. Flow Control Galera Cluster manages the replication process using a feedback mechanism named FLOW CONTROL. • Allows any node to pause and resume replication • Prevents any node from lagging too far behind Modes – No Flow Control – Write-set Caching – Catching Up – Cluster Sync
  • 45. How Flow Control Works 1. Galera Cluster synchronously replicates write-sets in a cluster-wide order. 2. Transactions received but not yet applied and committed are placed in the receive queue (wsrep_local_recv_queue) 3. When the size of the queue exceeds the Flow Control Limit, the node sends an FC pause. 4. When the queue is manageable again (below the limit), the node removes the pause.
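The pause/resume cycle in steps 3 and 4 can be sketched as a small state machine. This is a simplified model under stated assumptions: the class `FlowControl` and the message labels are illustrative, and the resume threshold is taken as `fc_limit * fc_factor`, following the documented meaning of `gcs.fc_limit` and `gcs.fc_factor`.

```python
class FlowControl:
    """Toy model of a node's flow-control decisions."""

    def __init__(self, fc_limit=16, fc_factor=1.0):
        self.fc_limit = fc_limit
        self.fc_factor = fc_factor
        self.paused = False
        self.messages = []  # what the node would broadcast to the cluster

    def on_queue_length(self, recv_queue):
        """Update the pause state for the current receive queue length."""
        if not self.paused and recv_queue > self.fc_limit:
            self.paused = True
            self.messages.append("FC_PAUSE")
        elif self.paused and recv_queue < self.fc_limit * self.fc_factor:
            self.paused = False
            self.messages.append("FC_CONT")
        return self.paused


fc = FlowControl(fc_limit=16, fc_factor=0.5)
trace = [fc.on_queue_length(q) for q in (10, 17, 12, 9, 7)]
print(trace, fc.messages)
```

With a factor below 1 the node keeps the cluster paused until the queue has drained well under the limit, which trades fewer pause/resume messages for longer pauses.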
  • 46. Flow Control States 1 Write-set Caching • Happens when the node is a: – Joiner – Donor • Write-set will be locally cached and applied later
  • 47. Flow Control States 2 Catching up • Happens when the node is: – Joined • Nodes in this state can apply write-sets but are still making up the gap • The cluster replication rate is tuned to the Joined node’s capacity • Applying a write-set is faster than executing a transaction • With an empty Buffer Pool, operations will be slower
  • 48. Flow Control States 3 Cluster Sync • Happens when the node is: – Synced • By far the most common state • The node triggers FC to limit the receive queue • Can be tuned with gcs.fc_limit, gcs.fc_factor
  • 49. Flow Control How small should my fc_limit be? • Small enough to keep low the delay any node in the cluster might have when applying cluster transactions • Small enough to keep the certification interval small, which minimizes replication conflicts on a cluster where writes happen on all nodes – A small fc_limit keeps the certification index smaller in memory
  • 50. Manage Flow Control What to check? • wsrep_flow_control_sent; wsrep_flow_control_recv; • wsrep_flow_control_paused; wsrep_flow_control_paused_ns What can be tuned? • Replication Rate (expert feature, do not touch) • Flow control – gcs.fc_limit (default 16, way too low for every real production) – gcs.fc_factor (default 1, means resume replication as soon as we go below fc_limit)
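A quick way to read the counters listed above is to turn them into findings. The sketch below works on a plain dictionary, such as one built from the rows of `SHOW GLOBAL STATUS LIKE 'wsrep_flow_control%'`; the function name and the 10% threshold are assumptions chosen for illustration, not recommended values.

```python
def flow_control_report(status, paused_threshold=0.1):
    """Summarize flow-control counters from a status dictionary.

    status: mapping of status variable name -> value (strings or numbers)
    paused_threshold: hypothetical alert level for the paused fraction
    """
    paused = float(status.get("wsrep_flow_control_paused", 0))
    sent = int(status.get("wsrep_flow_control_sent", 0))
    recv = int(status.get("wsrep_flow_control_recv", 0))

    findings = []
    if paused > paused_threshold:
        findings.append(f"replication paused {paused:.0%} of the time")
    if sent > 0:
        findings.append(f"this node sent {sent} FC pause message(s)")
    if recv > sent:
        findings.append("other nodes are also triggering flow control")
    return findings


sample = {
    "wsrep_flow_control_paused": "0.25",
    "wsrep_flow_control_sent": "3",
    "wsrep_flow_control_recv": "7",
}
report = flow_control_report(sample)
print(report)
```

A node with a high `wsrep_flow_control_sent` is the one slowing the cluster down, so it is usually the first place to look before raising `gcs.fc_limit`.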
  • 51. Flow Control Badly tuned flow control (?)
  • 52. Apply DDL • Any DDL is a non-transactional operation • Modifications raise metadata locks (server/schema level) In a Galera Cluster, you can choose to run DDL in • TOI Total Order Isolation • RSU Rolling Schema Upgrade • pt-online-schema-change (recommended for large tables)
  • 53. Apply DDL TOI When using Total Order Isolation, the cluster will work as a single server until the end of the process on ALL nodes. Cluster will stay locked: • Server Level For CREATE SCHEMA, GRANT and similar queries, where the cluster cannot apply concurrently any other transactions. • Schema Level For CREATE TABLE and similar queries, where the cluster cannot apply concurrently any transactions that access the schema. • Table Level For ALTER TABLE and similar queries, where the cluster cannot apply concurrently any other transactions that access the table.
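The three lock levels on this slide can be summarized as a lookup from statement type to scope. The mapping below only covers the statements named on the slide; the table `TOI_SCOPE` and the function `toi_lock_scope` are hypothetical names, and the prefix matching is a deliberate simplification.

```python
# Statement prefixes named on the slide and the scope they lock under TOI.
TOI_SCOPE = (
    ("CREATE SCHEMA", "server"),
    ("GRANT", "server"),
    ("CREATE TABLE", "schema"),
    ("ALTER TABLE", "table"),
)


def toi_lock_scope(statement):
    """Return 'server', 'schema', 'table', or None if the statement is unknown."""
    # Normalize case and whitespace so the prefix comparison is stable.
    normalized = " ".join(statement.upper().split())
    for prefix, scope in TOI_SCOPE:
        if normalized.startswith(prefix):
            return scope
    return None


scopes = [
    toi_lock_scope("GRANT SELECT ON app.* TO 'reader'@'%'"),
    toi_lock_scope("create   table app.orders (id INT PRIMARY KEY)"),
    toi_lock_scope("ALTER TABLE app.orders ADD COLUMN note TEXT"),
    toi_lock_scope("SELECT 1"),
]
print(scopes)
```

The wider the scope, the more concurrent transactions are held back on every node while the DDL runs, which is why large `ALTER TABLE` operations are the usual candidates for pt-online-schema-change.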
  • 54. Apply DDL RSU When using Rolling Schema Upgrade, each modification will apply ONLY on the node where the command is executed. Risks: • Different structure between nodes • Data inconsistency • Dangerous use of WSREP_ON (http://www.tusacentral.net/joomla/index.php/mysql-blogs/168-how-to-mess-up-your-data-using-one-command-in-mysqlgalera.html) In short, this is potentially unsafe.
  • 55. Apply DDL PT-OSC When using pt-online-schema-change, the cluster will block the nodes for a very short period of time: at the start and at the end of the process. • Data is replicated as a normal transaction • Nodes maintain consistency • No locking during the copy • Is recoverable
  • 56. Geographic distribution Galera Cluster is well suited to cover a geographically distributed scenario. • Use a combination of Asynchronous and Synchronous replication • Use Master/Slave settings inside Galera • Use of Segments
  • 57. Galera and Binary logs Not needed? For a long while I stated so, but today I am older and wiser. • Useful to identify which transaction corresponds to a seq_no • Required when using a slave • Must be enabled on at least 2 nodes when using a slave • Still an option in case of DR (trust me, I saw it!!)
  • 58. Galera and Binary logs Understand the differences between SQL_LOG_BIN & WSREP_ON • Disabling SQL_LOG_BIN prevents ANY DML from being replicated (NOTE: in standard MySQL it excludes both DML and DDL) • Disabling WSREP_ON prevents ANY DML & DDL from being replicated • Using GLOBAL in this context will cause data inconsistency 99% of the time
  • 59. What to keep an eye on Like any complex system, Galera Cluster requires your attention in many areas. The most critical are: • Certification • Network performance • Proper schema design (PK/UK/FK) • Number of nodes (write distribution, not write scaling) • Correct planning of schema modifications
  • 60. Well known Issues • Foreign Keys • Small (very small) transactions and highly parallel committing • WSREP_ON (global) == SQL_LOG_BIN=0 • Master/Slave is ok, but be careful when using filters • Locks/Deadlocks can become more frequent • Network support (documentation)
  • 61. What Next? Galera Operations: • Installation, simple and distributed • Add/remove a node • Data consistency • Debug issues using the log • Data export/Load • Backups • Monitoring
  • 62. Q & A
  • 63. Thank you To contact us sales@pythian.com 1-877-PYTHIAN To follow us http://www.pythian.com/blog http://www.facebook.com/pages/The-Pythian-Group/163902527671 @pythian http://www.linkedin.com/company/pythian To contact Me tusa@pythian.com marcotusa@tusacentral.net To follow me http://www.tusacentral.net/ https://www.facebook.com/marco.tusa.94 @marcotusa http://it.linkedin.com/in/marcotusa/