Consistency between Engine and Binlog under Reduced Durability
Yoshinori Matsunobu
Production Engineer, Facebook
Jan/2020
What we want to do
▪ When slave or master instances fail and recover, we want them to rejoin
the replication chain (replica set) instead of being dropped and
rebuilt
▪ Imagine a 10-minute network outage in one Availability Zone, after which
we want to recover the MySQL instances in that AZ
Agenda
▪ When binlog and storage engine consistency gets broken
▪ What can go wrong when restarting a replica
▪ What can go wrong when restarting a master
▪ Challenges to support multiple transactional storage engines
Consistency between binlog and engine
▪ MySQL separates Replication logs (Binary Logs) and Transactional Storage Engine logs
(InnoDB/MyRocks/NDB)
▪ MySQL internally uses XA (two-phase commit) between the two
▪ Commit ordering:
▪ Binlog Prepare (doing nothing)
▪ Engine Prepare (in parallel)
▪ Binlog Commit (ordered)
▪ Engine Commit (ordered, if binlog_order_commits==ON)
▪ If the MySQL instance or host dies in between, Engine and Binlog may become inconsistent
▪ The chance of inconsistency is higher when operating with reduced durability (sync_binlog != 1 and
innodb_flush_log_at_trx_commit != 1)
▪ The binlog may lose some events whose transactions were persisted in the engine
▪ The engine may lose some transactions that were persisted in the binlog
▪ This talk is about how to address consistency issues under reduced durability
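The two failure directions named above can be written down as a tiny classifier. This is only an illustrative sketch: the function name `classify` and its integer arguments are hypothetical and stand for the highest GTID sequence number that survived the crash in the binlog and in the engine.

```python
def classify(binlog_max: int, engine_max: int) -> str:
    """Compare the last GTID sequence number that survived in each log."""
    if binlog_max < engine_max:
        # The binlog lost events for transactions the engine committed.
        return "binlog < engine"
    if binlog_max > engine_max:
        # The engine lost transactions whose events the binlog kept.
        return "binlog > engine"
    return "consistent"


print(classify(98, 99))
print(classify(98, 95))
print(classify(98, 98))
```

The rest of the talk treats these two inconsistent states separately, first on replicas and then on masters.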
5.6 Single Threaded Slave, Binlog < Engine
▪ An unplanned OS reboot on a slave may leave the Binlog GTID sets
and the Engine state inconsistent
▪ The big question is whether the slave can continue replication with START
SLAVE, without being entirely replaced
▪ Transactional storage engines (both InnoDB and MyRocks) store the
last committed GTID, which is visible in the
mysql.slave_relay_log_info table. This table is updated on each
commit to the engine
▪ With a Single Threaded Slave, you don’t have to think about
out-of-order execution
▪ Run with relay_log_recovery=1
▪ The slave discards its relay logs and restarts replication from the master
at the engine max GTID position
▪ Skips execution in the engine if GTID <= the GTID in slave_relay_log_info
▪ Skips writing binlog events if the binlog GTID overlaps
Master: GTID 1-100
Replica: Binlog GTID 1-98, Engine Max GTID 99
5.6 Single Threaded Slave, Binlog > Engine
▪ Replication continues from GTID 95 or earlier
▪ The engine executes GTID 96-98 without saving binlog events
(the binlog already has them)
▪ Normal replication flow continues from 99
Replica: Binlog GTID 1-98, Engine Max GTID 95
Master: GTID 1-100
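The skip rules for both single-threaded cases (Binlog < Engine and Binlog > Engine) can be sketched as one per-GTID decision. This is a simplified model, not the server's implementation: `recovery_plan` is a hypothetical helper, GTIDs are reduced to plain sequence numbers, and it assumes single-threaded apply so that binlog and engine each hold a gap-free prefix.

```python
def recovery_plan(binlog_max: int, engine_max: int, master_max: int):
    """Return (gtid, apply_to_engine, write_to_binlog) for each GTID to re-fetch.

    Assumes single-threaded apply: both logs contain 1..max with no gaps.
    """
    start = min(binlog_max, engine_max) + 1
    plan = []
    for gtid in range(start, master_max + 1):
        apply_to_engine = gtid > engine_max   # skip if the engine already has it
        write_to_binlog = gtid > binlog_max   # skip if the binlog already has it
        plan.append((gtid, apply_to_engine, write_to_binlog))
    return plan


# Binlog < Engine: binlog 1-98, engine max 99
print(recovery_plan(98, 99, 100))
# Binlog > Engine: binlog 1-98, engine max 95
print(recovery_plan(98, 95, 100))
```

In the first case only GTID 99 is written to the binlog without touching the engine; in the second, 96-98 are applied to the engine without being written to the binlog again.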
Multi Threaded Slave
Master: GTID 1-100
Replica: Binlog GTID 1-98, Engine Max GTID 95
▪ mysql.slave_relay_log_info stores only the max
executed GTID in the instance
▪ Under parallel database execution, MySQL has no
idea whether GTID 94 is in the engine or not
▪ Execution order might be 91 -> 92 -> 95
▪ In upstream 5.6, you can’t guarantee consistency
5.7 gtid_executed table
Replica: Binlog GTID 1-98, gtid_executed table 1-93, 95-98
Master: GTID 1-100
▪ The 5.7 gtid_executed table stores GTID sets in InnoDB
(crash safe)
▪ However, executed GTIDs are not updated on each
commit
▪ The table is updated on binlog rotation
▪ If it were updated on each commit, you could tell whether
GTID 94 is there or not (you can’t right now)
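To see why a full GTID set answers the "is 94 there?" question while a single max value cannot, here is a small sketch that subtracts one interval set from another. The helpers `parse_intervals` and `missing` are hypothetical, and the server UUID part of a GTID is omitted for brevity.

```python
def parse_intervals(text: str) -> set:
    """Parse an interval list such as '1-93, 95-98' into a set of integers."""
    numbers = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        lo, _, hi = part.partition("-")
        # A single number like '95' has no upper bound, so reuse the lower one.
        numbers.update(range(int(lo), int(hi or lo) + 1))
    return numbers


def missing(binlog: str, engine: str) -> list:
    """GTIDs present in the binlog set but absent from the engine set."""
    return sorted(parse_intervals(binlog) - parse_intervals(engine))


print(missing("1-98", "1-93, 95-98"))  # [94]
# With only a max value, executions 91 -> 92 -> 95 look the same as 91..95:
print(max({91, 92, 95}))               # 95
```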
FB Extension: Slave Idempotent Recovery
- Start replication from an old enough binlog GTID
- Re-execute binlog events against the engine, ignoring
all duplicate key / row not found errors during
catchup
- Eventual consistency
- Must use RBR, and tables must have primary keys
Master: GTID 1-100
Replica: Binlog GTID 1-98, Engine GTID state empty
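The idea can be illustrated with a toy model in which a table is a dict keyed by primary key and row-based events carry the full row image. This is a simplified sketch, not the actual server logic: `apply_idempotent` and the event tuples are hypothetical.

```python
def apply_idempotent(table: dict, events: list) -> dict:
    """Apply row events keyed by primary key, ignoring the two tolerated errors."""
    for op, pk, row in events:
        if op == "insert":
            if pk in table:
                continue          # duplicate key error: ignored
            table[pk] = row
        elif op == "update":
            if pk not in table:
                continue          # row not found error: ignored
            table[pk] = row       # after-image from the row-based event
        elif op == "delete":
            table.pop(pk, None)   # row not found error: ignored
        else:
            raise ValueError(f"unknown op {op!r}")
    return table


events = [
    ("insert", 1, "a"),
    ("update", 1, "b"),
    ("insert", 2, "x"),
    ("delete", 2, None),
    ("insert", 3, "c"),
]
clean = apply_idempotent({}, events)
# The engine already holds the effect of the first four events,
# but replay starts from an older position (the second event).
replayed = apply_idempotent({1: "b"}, events[1:])
print(clean == replayed)
```

Intermediate states during the replay can differ from anything the master ever exposed, which is why the slide calls the result eventual consistency. Primary keys and row-based replication are what make each event addressable by key.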
What can go wrong when restarting master
▪ A master may go down unexpectedly for various reasons
▪ Segfaults (SIG=11), assertions (SIG=6), forced kill (SIG=9), out of
memory
▪ Kernel panic
▪ Power outage, with a restart after a while
▪ Nowadays dead master promotion kicks in (Orchestrator, MHA)
▪ The question is whether the failed master can restart replication from the new master
▪ The dead master may come back before dead master promotion completes
▪ If the master lost some transactions that were already replicated, replicas may
not be able to continue replication
Master Promotion happening, Binlog < Engine
▪ “Loss-Less Semi-Synchronous Replication” guarantees that the semisync tailer gets binlog events before the master engine commits (so Engine on
orig master <= Binlog/Engine on new master)
▪ You need to start replication from the last GTID in the engine
▪ In this case, the GTID Executed Set on the master is 1-98, but replication should start after 99
▪ The master’s engine execution order is serialized (with binlog_order_commits=ON), so it’s guaranteed that 1-99 are in the engine
▪ However, this information is not visible from MySQL commands (it is only printed in the error log)
▪ Feature request to Oracle: InnoDB should expose the last committed GTID, binlog file and position through information_schema
▪ With Slave Idempotent Recovery, fetching the last committed GTID can be skipped, so automation becomes simpler
Before promotion:
Instance 1 (Master): Binlog 1-98, Engine 1-99
Instance 2 (Replica): Binlog 1-98
Instance 3 (Replica): Binlog 1-99
Instance 4 (Replica): Binlog 1-98
After promotion:
Instance 1 (Dead): Binlog 1-98, Engine 1-99
Instance 2 (Replica): Binlog 1-98
Instance 3 (Master): Binlog 1-99
Instance 4 (Replica): Binlog 1-98
Master Promotion happening, Binlog > Engine
Before promotion:
Instance 1 (Master): Binlog 1-100, Engine 1-98
Instance 2 (Replica): Binlog 1-98
Instance 3 (Replica): Binlog 1-99
Instance 4 (Replica): Binlog 1-98
After promotion:
Instance 1 (Dead): Binlog 1-100, Engine 1-98
Instance 2 (Replica): Binlog 1-98
Instance 3 (Master): Binlog 1-99
Instance 4 (Replica): Binlog 1-98
“100” should be discarded before replicating from the new master (instance 3)
InnoDB: Last binlog file position 79143, file name binlog.000005
InnoDB: Last MySQL Gtid UUID:98
▪ Binlog GTID 100 is on instance 1 only, and is not acked to the client (with loss-less semisync)
▪ If the original master (instance 1) applies Binlog 100, it can’t join as a replica
▪ We need some way to avoid applying GTID 100 during recovery
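Since the engine's last committed position is only printed in the error log, automation has to scrape it. The sketch below parses the two log lines exactly as they appear on this slide. The function name `parse_recovery_lines` and the regular expressions are assumptions, and the real log format may differ between server builds.

```python
import re

# Patterns modelled on the two sample lines shown on the slide.
BINLOG_RE = re.compile(r"InnoDB: Last binlog file position (\d+), file name (\S+)")
GTID_RE = re.compile(r"InnoDB: Last MySQL Gtid (\S+)")


def parse_recovery_lines(log_text: str) -> dict:
    """Extract the engine's last binlog position and GTID from error-log text."""
    info = {}
    for line in log_text.splitlines():
        m = BINLOG_RE.search(line)
        if m:
            info["binlog_pos"] = int(m.group(1))
            info["binlog_file"] = m.group(2)
        m = GTID_RE.search(line)
        if m:
            info["last_gtid"] = m.group(1)
    return info


sample = """InnoDB: Last binlog file position 79143, file name binlog.000005
InnoDB: Last MySQL Gtid UUID:98"""
print(parse_recovery_lines(sample))
```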
FB Extension: Server Side Binlog Truncation
▪ At instance startup, truncate binlog events that don’t exist in the storage
engine
▪ The end of the binlog is then at or before the engine’s last committed GTID
▪ The original binlog file is retained as a backup
▪ All prepared-state transactions in the storage engines are rolled back
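A rough sketch of the truncation step, outside the server: find the offset of the first binlog event whose GTID is newer than the engine's last committed GTID, keep a copy of the original file, and cut there. Everything here is hypothetical (the helper names, the `.bak` suffix, the event offsets); the real extension runs inside mysqld at startup and works on actual binlog events.

```python
import os
import shutil


def truncation_offset(events, engine_last_gtid):
    """events: ordered (gtid, start_offset) pairs from the last binlog file.

    Returns the offset of the first event the engine does not have,
    or None when the binlog holds nothing beyond the engine.
    """
    for gtid, start_offset in events:
        if gtid > engine_last_gtid:
            return start_offset
    return None


def truncate_with_backup(path, offset):
    """Keep the original file as a backup, then cut the binlog at offset."""
    backup = path + ".bak"      # hypothetical backup naming
    shutil.copy2(path, backup)
    os.truncate(path, offset)
    return backup


# Made-up offsets: binlog holds 97-100, engine committed up to 98.
events = [(97, 400), (98, 520), (99, 640), (100, 760)]
print(truncation_offset(events, 98))
```

Cutting at the start of GTID 99 leaves the binlog ending at 98, matching the engine, so the instance can later fetch 99 from the new master.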
Master Promotion not happening
Before crash:
Instance 1 (Master): Binlog 1-100, Engine 1-98
Instance 2 (Replica): Binlog 1-98
Instance 3 (Replica): Binlog 1-99
Instance 4 (Replica): Binlog 1-98
After recovery:
Instance 1 (Recovered): Binlog 1-98, Engine 1-98
Instance 2 (Replica): Binlog 1-98
Instance 3 (Replica): Binlog 1-99
Instance 4 (Replica): Binlog 1-98
▪ An unplanned reboot of the master may lose transactions that were already replicated to slaves
▪ Instance 1 should not serve write requests until it has caught up Binlog GTID 99 from instance 3
Common Replica errors
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from
binary log: 'Slave has more GTIDs than the master has, using the
master's SERVER_UUID. This may indicate that the end of the binary log
was truncated or that the last binary log file was lost, e.g., after a
power or disk failure when sync_binlog != 1. The master may or may not
have rolled back transactions that were already replica’
▪ Set read_only=1 by default
▪ Find the most advanced slave, catch up from there, then start serving write requests
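The "find the most advanced slave, catch up, then accept writes" rule can be sketched as follows. `catch_up_plan` and the instance names are hypothetical, and GTID sets are again reduced to a single max sequence number per instance.

```python
def catch_up_plan(recovered_max: int, replicas: dict):
    """Pick the most advanced replica and list the GTIDs the master must fetch."""
    source = max(replicas, key=replicas.get)
    missing = list(range(recovered_max + 1, replicas[source] + 1))
    return source, missing


replicas = {"instance2": 98, "instance3": 99, "instance4": 98}
source, missing = catch_up_plan(98, replicas)
read_only = bool(missing)   # stay read_only=1 while anything is missing
print(source, missing, read_only)
```

Only once `missing` is empty would the recovered master be switched out of read_only and start serving writes.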
Dual Engine Consistency
▪ Binlog GTID Sets
▪ InnoDB
▪ MyRocks
▪ Binlog, InnoDB and MyRocks (or NDB) need to be consistent
▪ Binlog: GTID 1-200, InnoDB: GTID 190, MyRocks: GTID 197
▪ It is unclear if 191-196 are committed
▪ Roll back all prepared transactions (server side binlog truncation)
▪ Idempotent recovery
▪ Recover from binlogs on semi-sync replica
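The arithmetic behind the 191-196 example is short enough to spell out. In this sketch `uncertain_range` and `binlog_only_range` are hypothetical helpers, and each engine is assumed to report only its own max committed GTID.

```python
def uncertain_range(engine_max: dict) -> range:
    """GTIDs between the lowest and highest engine max: commit state unknown."""
    lo = min(engine_max.values())
    hi = max(engine_max.values())
    return range(lo + 1, hi)


def binlog_only_range(binlog_max: int, engine_max: dict) -> range:
    """GTIDs in the binlog that are newer than every engine's max."""
    return range(max(engine_max.values()) + 1, binlog_max + 1)


engines = {"InnoDB": 190, "MyRocks": 197}
print(list(uncertain_range(engines)))          # 191 through 196
print(list(binlog_only_range(200, engines)))   # 198 through 200
```

GTIDs 198-200 exist only in the binlog, while 191-196 may have touched either engine, which is why the slide lists truncation, idempotent recovery, or recovery from a semi-sync replica's binlogs as options.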
Dual Engine consistency without binlog
▪ 8.0 DDL is transactional
▪ Table metadata info is stored in InnoDB
▪ It is common to run DDL outside of replication
▪ FB OSC changes schema without binlog
▪ MyRocks table changes made without binlog may end up inconsistent
▪ There is no binlog to fix inconsistency
▪ DDL validation is our current workaround
Summary
▪ MySQL needs to be aware of executed engine GTID sets
▪ With low update costs
▪ Upstream MySQL does not have this yet; it would be a nice feature
▪ We worked around it with Slave Idempotent Recovery
▪ Binlog Truncation during recovery, so that an old master can rejoin
as a replica

Unlocking the Future of AI Agents with Large Language Models
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 

MySQL Replication Consistency with Reduced Durability

  • 1. Consistency between Engine and Binlog under Reduced Durability. Yoshinori Matsunobu, Production Engineer, Facebook. Jan/2020
  • 2. What we want to do
      ▪ When slave or master instances fail and recover, we want them to rejoin the replication chain (replica set) instead of being dropped and rebuilt
      ▪ Imagine a 10-minute network outage in one Availability Zone, after which we want to recover the MySQL instances in that AZ
  • 3. Agenda
      ▪ When binlog and storage engine consistency gets broken
      ▪ What can go wrong on restarting a replica
      ▪ What can go wrong on restarting a master
      ▪ Challenges to support multiple transactional storage engines
  • 4. Consistency between binlog and engine
      ▪ MySQL keeps replication logs (binary logs) separate from transactional storage engine logs (InnoDB/MyRocks/NDB)
      ▪ It internally handles XA between them
      ▪ Commit ordering:
          ▪ Binlog Prepare (doing nothing)
          ▪ Engine Prepare (in parallel)
          ▪ Binlog Commit (ordered)
          ▪ Engine Commit (ordered, if binlog_order_commits==ON)
      ▪ If the MySQL instance or host dies in between, engine and binlog might become inconsistent
      ▪ The chance of inconsistency is higher when operating with reduced durability (sync-binlog != 1 and innodb-flush-log-at-trx-commit != 1)
          ▪ Some binlog events that were persisted in the engine may be lost
          ▪ The engine may lose some transactions that were persisted in the binlog
      ▪ This talk is about how to address consistency issues under reduced durability
  • 5. 5.6 Single Threaded Slave, Binlog < Engine
      ▪ An unplanned OS reboot on a slave may leave the Binlog GTID sets and the engine state inconsistent with each other
      ▪ A big question is whether the slave can continue replication with START SLAVE, without being replaced entirely
      ▪ Transactional storage engines (both InnoDB and MyRocks) store the last committed GTID, and it is visible from the mysql.slave_relay_log_info table. This table is updated on each commit to the engine
      ▪ With a single threaded slave, you don’t have to think about out-of-order execution
      ▪ Run with relay_log_recovery=1
          ▪ The slave discards relay logs and restarts replication from the engine max GTID position on the master
          ▪ Skips execution in engine if GTID < slave_relay_log_info
          ▪ Skips writing binlog events if binlog GTID overlaps
      ▪ Example state: Master GTID: 1-100, Replica Binlog GTID: 1-98, Engine Max GTID: 99
  • 6. 5.6 Single Threaded Slave, Binlog > Engine
      ▪ Replication will continue from GTID 95 or less
      ▪ GTIDs 96-98 are executed in the engine, but their binlog events are not written again
      ▪ Normal replication flow continues after 99
      ▪ Example state: Replica Binlog GTID: 1-98, Engine Max GTID: 95, Master GTID: 1-100
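The two single threaded cases (Binlog < Engine and Binlog > Engine) can be sketched as a replay plan. This is a simplified model under stated assumptions, not MySQL's implementation: GTIDs are plain integers, `plan_replay` and its parameters are hypothetical names, and replay is assumed to start right after the smaller of the two positions.

```python
from typing import List, Tuple


def plan_replay(binlog_max: int, engine_max: int, master_max: int) -> List[Tuple[int, bool, bool]]:
    """Return (gtid, apply_to_engine, write_to_binlog) for every GTID to replay.

    binlog_max: highest GTID present in the replica's binlog (assumed gap-free).
    engine_max: highest GTID committed in the engine (assumed gap-free, which
                only holds for a single threaded slave).
    master_max: highest GTID available on the master.
    """
    start = min(binlog_max, engine_max) + 1
    plan = []
    for gtid in range(start, master_max + 1):
        apply_to_engine = gtid > engine_max   # skip what the engine already has
        write_to_binlog = gtid > binlog_max   # skip what the binlog already has
        plan.append((gtid, apply_to_engine, write_to_binlog))
    return plan


if __name__ == "__main__":
    # Binlog < Engine: binlog 1-98, engine max 99
    print(plan_replay(98, 99, 100))
    # Binlog > Engine: binlog 1-98, engine max 95
    print(plan_replay(98, 95, 100))
```

In the first case only GTID 99 needs a binlog write without engine execution; in the second, GTIDs 96-98 are applied to the engine without being written to the binlog again.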
  • 7. Multi Threaded Slave
      ▪ Example state: Master GTID: 1-100, Replica Binlog GTID: 1-98, Engine Max GTID: 95
      ▪ mysql.slave_relay_log_info stores only the max executed GTID in the instance
      ▪ Under parallel database execution, MySQL has no idea whether GTID 94 is in the engine or not
          ▪ Execution order might be 91 -> 92 -> 95
      ▪ In upstream 5.6, you can’t guarantee consistency
  • 8. 5.7 gtid_executed table
      ▪ Example state: Replica Binlog GTID: 1-98, gtid_executed table: 1-93, 95-98, Master GTID: 1-100
      ▪ The 5.7 gtid_executed table stores GTID sets in InnoDB (crash safe)
      ▪ However, executed GTIDs are not updated on each commit
          ▪ The table is updated on binlog rotation
      ▪ If it were updated on each commit, you could tell whether GTID 94 is there or not (you can’t right now)
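If the executed set were current on every commit, spotting a hole such as GTID 94 would be a simple interval scan. A minimal sketch, assuming the simplified notation used on the slide ("1-93, 95-98") rather than MySQL's real `uuid:interval` syntax; `parse_intervals` and `find_gaps` are hypothetical helper names.

```python
from typing import List, Tuple


def parse_intervals(text: str) -> List[Tuple[int, int]]:
    """Parse a slide-style GTID list such as "1-93, 95-98" into sorted (lo, hi) pairs."""
    intervals = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            intervals.append((int(lo), int(hi)))
        else:
            intervals.append((int(part), int(part)))
    return sorted(intervals)


def find_gaps(text: str) -> List[int]:
    """Return every transaction number missing below the highest executed one."""
    gaps: List[int] = []
    expected = 1
    for lo, hi in parse_intervals(text):
        gaps.extend(range(expected, lo))  # numbers skipped before this interval
        expected = max(expected, hi + 1)
    return gaps


if __name__ == "__main__":
    print(find_gaps("1-93, 95-98"))
```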
  • 9. FB Extension: Slave Idempotent Recovery
      ▪ Start replication from an old enough binlog GTID
      ▪ Re-execute binlog events against the engine, ignoring all duplicate key errors and row not found errors during catchup
      ▪ Eventual consistency
      ▪ Must use RBR, and tables must have primary keys
      ▪ Example state: Master GTID: 1-100, Replica Binlog GTID: 1-98, Engine GTID state: empty
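The idea of re-applying row events while ignoring duplicate key and row not found errors can be illustrated with a toy model. This is only a sketch under assumptions: the table is a Python dict keyed by primary key, each event carries a full after-image (as with row-based replication), and `apply_idempotent` and the event tuple layout are hypothetical, not part of the FB extension.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]
Event = Tuple[str, int, Row]  # (kind, primary_key, after_image)


def apply_idempotent(table: Dict[int, Row], events: List[Event]) -> int:
    """Apply row events in order, ignoring duplicate-key and row-not-found errors.

    Mutates `table` in place and returns how many errors were ignored.
    """
    ignored = 0
    for kind, pk, after in events:
        if kind == "insert":
            if pk in table:
                ignored += 1          # duplicate key: row already applied
            else:
                table[pk] = dict(after)
        elif kind == "update":
            if pk in table:
                table[pk] = dict(after)
            else:
                ignored += 1          # row not found: already deleted later
        elif kind == "delete":
            if pk in table:
                del table[pk]
            else:
                ignored += 1          # row not found: already deleted
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return ignored


if __name__ == "__main__":
    events = [
        ("insert", 1, {"v": "a"}),
        ("update", 1, {"v": "b"}),
        ("insert", 2, {"v": "x"}),
        ("delete", 2, {}),
    ]
    # Engine state after a crash: everything applied except the final delete.
    table = {1: {"v": "b"}, 2: {"v": "x"}}
    print(apply_idempotent(table, events), table)
```

Replaying from an old enough position converges to the same final rows whether or not part of the stream had already reached the engine, which is why primary keys and full row images are required.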
  • 10. What can go wrong when restarting master
      ▪ A master may go down unexpectedly for various reasons
          ▪ Segfaults (SIG=11), assertions (SIG=6), forced kill (SIG=9), out of memory
          ▪ Kernel panic
          ▪ Power outage, with the host restarted after a while
      ▪ Nowadays dead master promotion kicks in (Orchestrator, MHA)
          ▪ The question is whether the failed master can restart replication from the new master
      ▪ The dead master may come back before dead master promotion happens
          ▪ If the master lost some transactions that were already replicated, replicas may not be able to continue replication
  • 11. Master Promotion happening, Binlog < Engine
      ▪ “Loss-Less Semi-Synchronous Replication” guarantees that the semisync tailer gets binlog events before the master engine commit (so Engine on orig master <= Binlog/Engine on new master)
      ▪ You need to start replication from the last GTID in the engine
          ▪ In this case, the GTID executed set on the master is 1-98, but replication should start after 99
          ▪ The master’s engine execution order is serialized (with binlog_order_commits=1), so it’s guaranteed that 1-99 are in the engine
          ▪ However, this information is not visible from MySQL commands (it is only printed in the error log)
      ▪ Feature request to Oracle: InnoDB should add an information_schema view that shows the current last committed GTID, binlog file and position
      ▪ With Slave Idempotent Recovery, fetching the last committed GTID can be skipped, so automation becomes simpler
      ▪ Before promotion: Instance 1 (Master) Binlog: 1-98, Engine: 1-99; Instance 2 (Replica) Binlog: 1-98; Instance 3 (Replica) Binlog: 1-99; Instance 4 (Replica) Binlog: 1-98
      ▪ After promotion: Instance 1 (Dead) Binlog: 1-98, Engine: 1-99; Instance 2 (Replica) Binlog: 1-98; Instance 3 (Master) Binlog: 1-99; Instance 4 (Replica) Binlog: 1-98
  • 12. Master Promotion happening, Binlog > Engine
      ▪ Before promotion: Instance 1 (Master) Binlog: 1-100, Engine: 1-98; Instance 2 (Replica) Binlog: 1-98; Instance 3 (Replica) Binlog: 1-99; Instance 4 (Replica) Binlog: 1-98
      ▪ After promotion: Instance 1 (Dead) Binlog: 1-100, Engine: 1-98; Instance 2 (Replica) Binlog: 1-98; Instance 3 (Master) Binlog: 1-99; Instance 4 (Replica) Binlog: 1-98
      ▪ “100” should be discarded before replicating from the new master (instance 3)
      ▪ Error log output on instance 1:
          InnoDB: Last binlog file position 79143, file name binlog.000005
          InnoDB: Last MySQL Gtid UUID:98
      ▪ Binlog GTID 100 is on instance 1 only, and is not acked to the client (with loss-less semisync)
      ▪ If the original master (instance 1) applies Binlog 100, it can’t join as a replica
      ▪ We need some way not to apply GTID 100 during recovery
  • 13. FB Extension: Server Side Binlog Truncation
      ▪ At instance startup, truncate binlog events that don’t exist in the storage engine
      ▪ The end of the binlog position is the same as or smaller than the engine’s last committed GTID
      ▪ The original binlog file is retained as a backup
      ▪ All prepared-state transactions in storage engines are rolled back
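The truncation step can be sketched outside the server as follows. This is a hedged illustration, not the FB server code: it assumes the caller already has a list of `(gtid, end_offset)` pairs in binlog order (parsing real binlog files is not shown), GTIDs are plain integers, and `truncation_offset`, `truncate_with_backup`, `start_offset` and the `.bak` suffix are all hypothetical.

```python
import os
import shutil
from typing import List, Tuple


def truncation_offset(events: List[Tuple[int, int]], engine_last_gtid: int,
                      start_offset: int = 0) -> int:
    """Return the byte offset after the last transaction the engine committed.

    events: (gtid, end_offset) pairs in binlog order.
    start_offset: offset where the first transaction begins (assumed; a real
                  binlog has header events before it).
    """
    keep = start_offset
    for gtid, end_offset in events:
        if gtid > engine_last_gtid:
            break                     # everything from here on is not in the engine
        keep = end_offset
    return keep


def truncate_with_backup(path: str, keep_bytes: int) -> str:
    """Copy the original file aside, then cut the file down to keep_bytes."""
    backup = path + ".bak"
    shutil.copy2(path, backup)        # retain the original as a backup
    os.truncate(path, keep_bytes)
    return backup


if __name__ == "__main__":
    # Binlog holds GTIDs 98-100 ending at offsets 10/20/30; engine committed up to 98.
    print(truncation_offset([(98, 10), (99, 20), (100, 30)], engine_last_gtid=98))
```

With the slide 12 state (binlog 1-100, engine 1-98), everything after the end of GTID 98 would be cut, leaving the binlog no further ahead than the engine.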
  • 14. Master Promotion not happening
      ▪ Before crash: Instance 1 (Master) Binlog: 1-100, Engine: 1-98; Instance 2 (Replica) Binlog: 1-98; Instance 3 (Replica) Binlog: 1-99; Instance 4 (Replica) Binlog: 1-98
      ▪ After recovery: Instance 1 (Recovered) Binlog: 1-98, Engine: 1-98; Instance 2 (Replica) Binlog: 1-98; Instance 3 (Replica) Binlog: 1-99; Instance 4 (Replica) Binlog: 1-98
      ▪ An unplanned reboot on the master may end up losing transactions that were already replicated to slaves
      ▪ Instance 1 should not serve write requests until it has caught up Binlog GTID 99 from instance 3
  • 15. Common Replica errors
      ▪ Typical error seen on a replica:
          Last_IO_Errno: 1236
          Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replica’
      ▪ Set read_only=1 by default
      ▪ Find the most advanced slave, catch up from there, then start serving write requests
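The "find the most advanced slave, catch up, then accept writes" rule can be sketched as a small planning function. It is an illustration under assumptions: GTID sets are modelled as Python sets of integers from a single source, "most advanced" is approximated by set size, and `most_advanced` and `catch_up_plan` are hypothetical names.

```python
from typing import Dict, List, Set, Tuple


def most_advanced(replicas: Dict[str, Set[int]]) -> str:
    """Pick the replica holding the most transactions (ties broken by name)."""
    return max(sorted(replicas), key=lambda name: len(replicas[name]))


def catch_up_plan(recovered: Set[int],
                  replicas: Dict[str, Set[int]]) -> Tuple[str, List[int], bool]:
    """Return (source replica, GTIDs to fetch, whether writes can be enabled now).

    The recovered master stays read-only (read_only=1) while anything is missing.
    """
    source = most_advanced(replicas)
    missing = sorted(replicas[source] - recovered)
    can_enable_writes = not missing
    return source, missing, can_enable_writes


if __name__ == "__main__":
    recovered_master = set(range(1, 99))          # binlog and engine at 1-98
    replicas = {
        "instance2": set(range(1, 99)),           # 1-98
        "instance3": set(range(1, 100)),          # 1-99
        "instance4": set(range(1, 99)),           # 1-98
    }
    print(catch_up_plan(recovered_master, replicas))
```

For the slide 14 state this selects instance 3 and reports GTID 99 as missing, so the recovered master keeps read_only=1 until it has applied it.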
  • 16. Dual Engine Consistency
      ▪ Binlog GTID Sets
      ▪ InnoDB
      ▪ MyRocks
      ▪ Binlog, InnoDB and MyRocks (or NDB) need to be consistent
      ▪ Binlog: GTID 1-200, InnoDB: GTID 190, MyRocks: GTID 197
          ▪ It is unclear if 191-196 are committed
      ▪ Roll back all prepared transactions (server side binlog truncation)
      ▪ Idempotent recovery
      ▪ Recover from binlogs on semi-sync replica
  • 17. Dual Engine consistency without binlog
      ▪ 8.0 DDL is transactional
          ▪ Table metadata info is stored in InnoDB
      ▪ It is common to run DDL outside of replication
          ▪ FB OSC changes schema without binlog
      ▪ MyRocks table changes made without binlog may end up inconsistent
          ▪ There is no binlog to fix the inconsistency
      ▪ DDL validation is our current workaround
  • 18. Summary
      ▪ MySQL needs to be aware of executed engine GTID sets
          ▪ With low update costs
          ▪ Upstream MySQL doesn’t have this yet. It would be a nice feature
      ▪ We worked around this with Slave Idempotent Recovery
      ▪ Binlog truncation during recovery, so that an old master can rejoin as a replica