Autopsy of a MySQL automation
disaster
by Jean-François Gagné
presented at SRECon Dublin (October 2019)
Senior Infrastructure Engineer / System and MySQL Expert
jeanfrancois AT messagebird DOT com / @jfg956 #SREcon
To err is human
To really foul things up requires a computer[1]
(or a script)
[1]: http://quoteinvestigator.com/2010/12/07/foul-computer/
2
Session Summary
1. MySQL replication
2. Automation disaster: external eye
3. Chain of events: analysis
4. Learning / takeaway
3
MySQL replication
● Typical MySQL replication deployment at Booking.com:
+---+
| M |
+---+
|
+------+-- ... --+---------------+-------- ...
| | | |
+---+ +---+ +---+ +---+
| S1| | S2| | Sn| | M1|
+---+ +---+ +---+ +---+
|
+-- ... --+
| |
+---+ +---+
| T1| | Tm|
+---+ +---+
4
MySQL replication’
● And they use(d) Orchestrator (more about Orchestrator in the next slide):
5
MySQL replication’’
● Orchestrator allows you to:
● Visualize replication deployments
● Move slaves for planned maintenance of an intermediate master
● Automatically replace an intermediate master in case of its unexpected failure
(thanks to pseudo-GTIDs when we have not deployed GTIDs)
● Automatically replace a master in case of a failure (failing over to a slave)
● But Orchestrator cannot replace a master alone:
● Booking.com uses DNS for master discovery
● So Orchestrator calls a homemade script to repoint DNS (and to do other magic)
6
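The DNS repointing script mentioned above is exactly the kind of automation that benefits from defensive checks. Below is a minimal Python sketch of the idea (all names are hypothetical; this is not the actual Booking.com script): refuse to steal the DNS entry from a master that still looks alive, and refuse to point it at a host that is not writable.

```python
# Hypothetical sketch of a *defensive* DNS repointing hook of the kind
# a failover tool like Orchestrator can call. All names are illustrative.

def check_writable(host):
    """Pretend probe: True if `host` accepts writes.
    A real implementation would e.g. run `SELECT @@read_only` over MySQL."""
    return host.get("alive", False) and not host.get("read_only", True)

def repoint_dns(entry, new_master, hosts):
    """Repoint `entry` to `new_master` only if it is safe to do so."""
    current = entry["target"]
    # Defense 1: never steal the DNS entry from a master that still looks alive.
    if current != new_master and check_writable(hosts[current]):
        raise RuntimeError(f"refusing to steal DNS from live master {current}")
    # Defense 2: never point DNS at a host that is not a writable master.
    if not check_writable(hosts[new_master]):
        raise RuntimeError(f"{new_master} is not a writable master")
    entry["target"] = new_master
    return entry

hosts = {
    "X": {"alive": True, "read_only": False},  # current, healthy master
    "B": {"alive": True, "read_only": False},  # outdated server coming back
}
entry = {"target": "X"}
try:
    repoint_dns(entry, "B", hosts)  # the disaster scenario: refused
except RuntimeError as e:
    print("blocked:", e)
```

With a check like Defense 1, the incident described later in this deck (the script stealing the DNS entry from the live master X) would have been blocked instead of silently executed.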
Intermediate Master Failure
7
Master Failure
8
MySQL Master High Availability [1 of 4]
Failing-over the master to a slave is my favorite HA method
• But it is not as easy as it sounds, and it is hard to automate well
• An example of complete failover solution in production:
https://github.blog/2018-06-20-mysql-high-availability-at-github/
The five considerations of master high availability:
(https://jfg-mysql.blogspot.com/2019/02/mysql-master-high-availability-and-failover-more-thoughts.html)
• Plan how you are doing master high availability
• Decide when you apply your plan (Failure Detection – FD)
• Tell the application about the change (Service Discovery – SD)
• Protect against the limits of FD and SD to avoid split-brains (Fencing)
• Fix your data if something goes wrong
9
10
MySQL Master High Availability [2 of 4]
Failure detection (FD) is the 1st part (and 1st challenge) of failover
• It is a very hard problem: partial failure, unreliable network, partitions, …
• It is impossible to be 100% sure of a failure, and confidence needs time
→ quick FD is unreliable; relatively reliable FD implies longer unavailability
➢ You need to accept that FD generates false positives (and/or false negatives)
Repointing is the 2nd part of failover:
• Relatively easy with the right tools: MHA, GTID, Pseudo-GTID, Binlog Servers, …
• Complexity grows with the number of direct slaves of the master
(what if you cannot contact some of those slaves…)
• Some software for repointing:
• Orchestrator, Ripple Binlog Server,
Replication Manager, MHA, Cluster Control, MaxScale, …
11
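The FD tradeoff above can be made concrete with a toy detector (illustrative only, not Orchestrator's actual algorithm): requiring more consecutive failed probes reduces false positives, but lengthens the unavailability window before failover starts.

```python
# Minimal failure-detection sketch: declare a master dead only after
# `threshold` consecutive failed probes. A low threshold gives quick but
# unreliable FD; a high threshold gives reliable but slow FD.

def is_dead(probe_results, threshold):
    """probe_results: most-recent-last list of booleans (True = probe OK)."""
    consecutive = 0
    for ok in probe_results:
        consecutive = 0 if ok else consecutive + 1
    return consecutive >= threshold

# A transient network blip of two failed probes:
blip = [True, True, False, False, True]
assert not is_dead(blip, 3)   # slow FD: no false positive
# but the same blip, seen with threshold 2 before it cleared,
# would have triggered a failover:
assert is_dead(blip[:4], 2)   # quick FD: false positive
```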
MySQL Master High Availability [3 of 4]
In this configuration and when the master fails,
one of the slaves needs to be repointed to the new master:
12
MySQL Master High Availability [4 of 4]
Service Discovery (SD) is the 3rd part (and 2nd challenge) of failover:
• If centralised, it is a SPOF; if distributed, impossible to update atomically
• SD will either introduce a bottleneck (including performance limits)
or will be unreliable in some way (pointing to the wrong master)
• Some ways to implement MySQL Master SD: DNS, VIP, Proxy, Zookeeper, …
http://code.openark.org/blog/mysql/mysql-master-discovery-methods-part-1-dns
➢ Unreliable FD and unreliable SD are a recipe for split-brains !
Protecting against split-brains (Fencing): an advanced subject – not many solutions
(Proxies and semi-synchronous replication might help)
Fixing your data in case of a split-brain: only you can know how to do this !
(tip on this later in the talk)
MySQL Service Discovery @ MessageBird
MessageBird uses ProxySQL for MySQL Service Discovery
13
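The slides do not detail MessageBird's configuration, but with ProxySQL, master service discovery typically means membership of a writer hostgroup: a failover script updates that hostgroup via the admin interface instead of repointing DNS. A hedged sketch (hostgroup id 10 and hostname 'X' are examples, not the real setup):

```sql
-- Run against the ProxySQL admin interface after failover.
-- Replace the writer hostgroup (id 10) content with the new master.
DELETE FROM mysql_servers WHERE hostgroup_id = 10;
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (10, 'X', 3306);
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
```

The update is local to each ProxySQL instance, so with multiple proxies this is still a distributed SD problem, as described on the previous slide.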
Orchestrator @ MessageBird
15
Failover War Story
Master failover does not always go as planned
We will now look at our War Story
Our subject database
● Simple replication deployment (in two data centers):
Master RWs +---+
happen here --> | A |
+---+
|
+------------------------+
| |
Reads +---+ +---+
happen here --> | B | | X |
+---+ +---+
|
+---+ And reads
| Y | <-- happen here
+---+
16
Incident: 1st event
● A and B (two servers in same data center) fail at the same time:
Master RWs +-/+
happen here --> | A |
but now failing +/-+
Reads +-/+ +---+
happen here --> | B | | X |
but now failing +/-+ +---+
|
+---+ Reads
| Y | <-- happen here
+---+
17
(I will cover how/why this happened later.)
Incident: 1st event’
● Orchestrator fixes things:
+-/+
| A |
+/-+
Reads +-/+ +---+ Now, Master RWs
happen here --> | B | | X | <-- happen here
but now failing +/-+ +---+
|
+---+ Reads
| Y | <-- happen here
+---+
18
Split-brain: disaster
● A few things happened that day and night, and I woke up to this:
+-/+
| A |
+/-+
Master RWs +---+ +---+
happen here --> | B | | X |
+---+ +---+
|
+---+
| Y |
+---+
19
Split-brain: disaster’
● And to make things worse, reads are still happening on Y:
+-/+
| A |
+/-+
Master RWs +---+ +---+
happen here --> | B | | X |
+---+ +---+
|
+---+ Reads
| Y | <-- happen here
+---+
20
Split-brain: disaster’’
● This is not good:
● When A and B failed, X was promoted as the new master
● Something made DNS point to B (we will see what later)
→ writes are now happening on B
● But B is outdated: all writes to X (after the failure of A) did not reach B
● So we have data on X that cannot be read on B
● And we have new data on B
that is not read on Y
21
+-/+
| A |
+/-+
Master RWs +---+ +---+
happen here --> | B | | X |
+---+ +---+
|
+---+ Reads
| Y | <-- happen here
+---+
Split-brain: analysis
● Digging more in the chain of events, we find that:
● After the 1st failure of A, a 2nd one was detected and Orchestrator failed over to B
● So after their failures, A and B came back and formed an isolated replication chain
22
+-/+
| A |
+/-+
+-/+ +---+ Master RWs
| B | | X | <-- happen here
+/-+ +---+
|
+---+ Reads
| Y | <-- happen here
+---+
Split-brain: analysis
● Digging more in the chain of events, we find that:
● After the 1st failure of A, a 2nd one was detected and Orchestrator failed over to B
● So after their failures, A and B came back and formed an isolated replication chain
● And something caused a failure of A
23
+---+
| A |
+---+
|
+---+ +---+ Master RWs
| B | | X | <-- happen here
+---+ +---+
|
+---+ Reads
| Y | <-- happen here
+---+
Split-brain: analysis
● Digging more in the chain of events, we find that:
● After the 1st failure of A, a 2nd one was detected and Orchestrator failed over to B
● So after their failures, A and B came back and formed an isolated replication chain
● And something caused a failure of A
● But how did DNS end up pointing to B ?
● The failover to B called the DNS repointing script
● The script stole the DNS entry
from X and pointed it to B
24
+-/+
| A |
+/-+
+---+ +---+ Master RWs
| B | | X | <-- happen here
+---+ +---+
|
+---+ Reads
| Y | <-- happen here
+---+
Split-brain: analysis
● Digging more in the chain of events, we find that:
● After the 1st failure of A, a 2nd one was detected and Orchestrator failed over to B
● So after their failures, A and B came back and formed an isolated replication chain
● And something caused a failure of A
● But how did DNS end up pointing to B ?
● The failover to B called the DNS repointing script
● The script stole the DNS entry
from X and pointed it to B
● But is that all: what made A fail ?
25
+-/+
| A |
+/-+
Master RWs +---+ +---+
happen here --> | B | | X |
+---+ +---+
|
+---+ Reads
| Y | <-- happen here
+---+
Split-brain: analysis’
● What made A fail ?
● Once A and B came back up as a new replication chain, they had outdated data
● If B had come back before A, it could have been re-slaved to X
26
+---+
| A |
+---+
|
+---+ +---+ Master RWs
| B | | X | <-- happen here
+---+ +---+
|
+---+ Reads
| Y | <-- happen here
+---+
Split-brain: analysis’
● What made A fail ?
● Once A and B came back up as a new replication chain, they had outdated data
● If B had come back before A, it could have been re-slaved to X
● But because A came back before re-slaving, it injected heartbeat and p-GTID to B
27
+-/+
| A |
+/-+
+---+ +---+ Master RWs
| B | | X | <-- happen here
+---+ +---+
|
+---+ Reads
| Y | <-- happen here
+---+
Split-brain: analysis’
● What made A fail ?
● Once A and B came back up as a new replication chain, they had outdated data
● If B had come back before A, it could have been re-slaved to X
● But because A came back before re-slaving, it injected heartbeat and p-GTID to B
● Then B could have been re-cloned without problems
28
+---+
| A |
+---+
|
+---+ +---+ Master RWs
| B | | X | <-- happen here
+---+ +---+
|
+---+ Reads
| Y | <-- happen here
+---+
Split-brain: analysis’
● What made A fail ?
● Once A and B came back up as a new replication chain, they had outdated data
● If B had come back before A, it could have been re-slaved to X
● But because A came back before re-slaving, it injected heartbeat and p-GTID to B
● Then B could have been re-cloned without problems
● But A was re-cloned instead (human error #1)
29
+---+
| A |
+---+
+-/+ +---+ Master RWs
| B | | X | <-- happen here
+/-+ +---+
|
+---+ Reads
| Y | <-- happen here
+---+
Split-brain: analysis’
● What made A fail ?
● Once A and B came back up as a new replication chain, they had outdated data
● If B had come back before A, it could have been re-slaved to X
● But because A came back before re-slaving, it injected heartbeat and p-GTID to B
● Then B could have been re-cloned without problems
● But A was re-cloned instead (human error #1)
● Why did Orchestrator not fail over right away ?
● B was promoted hours after A was brought down…
● Because A was downtimed only for 4 hours (human error #2)
30
+-/+
| A |
+/-+
+---+ +---+ Master RWs
| B | | X | <-- happen here
+---+ +---+
|
+---+ Reads
| Y | <-- happen here
+---+
Orchestrator anti-flapping
● Orchestrator has a failover throttling/acknowledgment mechanism[1]:
● Automated recovery will happen
● for an instance in a cluster that has not recently been recovered
● unless such recent recoveries were acknowledged
● In our case:
● the recovery might have been acknowledged too early (human error #0 ?)
● with too short a “recently” timeout
● and maybe Orchestrator should not have failed over the second time
[1]: https://github.com/github/orchestrator/blob/master/docs/topology-recovery.md
#blocking-acknowledgments-anti-flapping
31
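For reference, the blocking window described above maps to a single orchestrator setting; a hedged configuration fragment (the value is illustrative, not the one used in this incident):

```json
{
  "RecoveryPeriodBlockSeconds": 3600
}
```

Within that window, a new automated recovery on the same cluster is blocked unless the previous recovery has been acknowledged, which is why a premature acknowledgment re-arms automation earlier than intended.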
Split-brain: summary
● So in summary, this disaster was caused by:
1. A fancy failure: two servers failing at the same time
2. A debatable premature acknowledgment in Orchestrator
and probably too short a timeout for recent failover
3. Edge-case recovery: both servers forming a new replication topology
4. Restarting with the event-scheduler enabled (A injecting heartbeat and p-GTID)
5. Re-cloning the wrong server (A instead of B; should have created C and D and thrown away A
and B; too short a downtime for the re-cloning)
6. Orchestrator failing over something that it should not have
(including taking a questionable action when a downtime expired)
7. DNS repointing script not defensive enough
32
Fancy failure: more details
● Why did A and B fail at the same time ?
● Deployment error: the two servers in the same rack/failure domain ?
● And/or very unlucky ?
● Very unlucky because…
10 to 20 servers failed that day in the same data center
Because of human operations and “sensitive” hardware
33
+-/+
| A |
+/-+
+-/+ +---+
| B | | X |
+/-+ +---+
|
+---+
| Y |
+---+
How to fix such a situation ?
● Fixing “split-brain” data on B and X is hard
● Some solutions are:
● Kill B or X (and lose data)
● Replay writes from B on X or vice versa (manually or with replication)
● But AUTO_INCREMENTs are in the way:
● up to i used on A before 1st failover
● i-n to j1 used on X after recovery
● i to j2 used on B after 2nd failover
34
+-/+
| A |
+/-+
Master RWs +---+ +---+
happen here --> | B | | X |
+---+ +---+
|
+---+ Reads
| Y | <-- happen here
+---+
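With illustrative numbers (the slide's i, n, j1 and j2 are unknown, so these values are made up), the AUTO_INCREMENT collision is easy to compute:

```python
# Example values only, to show why AUTO_INCREMENT collides after a split-brain.
i = 1000    # highest id written on A before the 1st failover
n = 50      # X was slightly behind A, so it resumed from i - n
j1 = 1400   # highest id written on X after recovery
j2 = 1200   # highest id written on B after the 2nd failover

x_ids = range(i - n, j1 + 1)  # ids created on X
b_ids = range(i, j2 + 1)      # ids created on B
overlap = set(x_ids) & set(b_ids)

# The same primary-key values now identify *different* rows on X and on B,
# so the two datasets cannot be merged by simply replaying writes.
print(f"{len(overlap)} conflicting primary-key values")  # 201 conflicting primary-key values
```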
Takeaway
● Twisted situations happen
● Automation (including failover) is not simple:
→ code automation scripts defensively
● Be mindful of premature acknowledgment, downtime more rather than less,
shut down slaves first → understand the complex interactions of your tools in detail
● Try something other than AUTO_INCREMENT for primary keys
(monotonically increasing UUID[1] [2] ?)
[1]: https://www.percona.com/blog/2014/12/19/store-uuid-optimized-way/
[2]: http://mysql.rjweb.org/doc.php/uuid
35
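A hedged sketch of the ordered-UUID idea from references [1] and [2] above: a UUIDv1 stores its timestamp low-bytes first, so rearranging the fields timestamp-high-first makes values sort by creation time, which keeps primary-key inserts append-only (the function name is mine, not from those posts).

```python
import uuid

def ordered_uuid_hex(u=None):
    """Rearrange a UUIDv1 into a time-ordered 32-char hex string.
    UUIDv1 hex layout: time_low(8) time_mid(4) time_hi_version(4) rest(16);
    putting time_hi_version first yields a big-endian timestamp prefix."""
    u = u or uuid.uuid1()
    h = u.hex
    return h[12:16] + h[8:12] + h[0:8] + h[16:32]

a = ordered_uuid_hex()
b = ordered_uuid_hex()
print(a < b)  # later UUIDv1s compare greater once reordered
```

Stored in a BINARY(16) column, such keys avoid the random-insert pattern of plain UUIDs while keeping them globally unique, sidestepping the AUTO_INCREMENT overlap problem shown earlier.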
36
Re Failover War Story
Master failover does not always go as planned
• because it is complicated
It is not a matter of “if” things go wrong
• but “when” things will go wrong
Please share your war stories
• so we can learn from each other’s experience
• GitHub published a public MySQL post-mortem (great of them to share this):
https://blog.github.com/2018-10-30-oct21-post-incident-analysis/
MessageBird is hiring
messagebird.com/en/careers/
Thanks !
by Jean-François Gagné
presented at SRECon Dublin (October 2019)
Senior Infrastructure Engineer / System and MySQL Expert
jeanfrancois AT messagebird DOT com / @jfg956 #SREcon

Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
c5vrf27qcz
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
Safe Software
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
Pablo Gómez Abajo
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
Miro Wengner
 
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
Edge AI and Vision Alliance
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
Javier Junquera
 
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
Fwdays
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
saastr
 
Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving
 
Leveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and StandardsLeveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and Standards
Neo4j
 

Recently uploaded (20)

Dandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity serverDandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity server
 
Artificial Intelligence and Electronic Warfare
Artificial Intelligence and Electronic WarfareArtificial Intelligence and Electronic Warfare
Artificial Intelligence and Electronic Warfare
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsConnector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
 
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
 
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
 
Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024
 
Leveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and StandardsLeveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and Standards
 

Autopsy of a MySQL Automation Disaster

  • 1. Autopsy of a MySQL automation disaster by Jean-François Gagné presented at SRECon Dublin (October 2019) Senior Infrastructure Engineer / System and MySQL Expert jeanfrancois AT messagebird DOT com / @jfg956 #SREcon
  • 2. To err is human To really foul things up requires a computer[1] (or a script) [1]: http://quoteinvestigator.com/2010/12/07/foul-computer/ 2
  • 3. Session Summary 1. MySQL replication 2. Automation disaster: external eye 3. Chain of events: analysis 4. Learning / takeaway 3
  • 4. MySQL replication ● Typical MySQL replication deployment at Booking.com: +---+ | M | +---+ | +------+-- ... --+---------------+-------- ... | | | | +---+ +---+ +---+ +---+ | S1| | S2| | Sn| | M1| +---+ +---+ +---+ +---+ | +-- ... --+ | | +---+ +---+ | T1| | Tm| +---+ +---+ 4
  • 5. MySQL replication’ ● And they use(d) Orchestrator (more about Orchestrator in the next slide): 5
  • 6. MySQL replication’’ ● Orchestrator allows to: ● Visualize replication deployments ● Move slaves for planned maintenance of an intermediate master ● Automatically replace an intermediate master in case of its unexpected failure (thanks to pseudo-GTIDs when we have not deployed GTIDs) ● Automatically replace a master in case of a failure (failing over to a slave) ● But Orchestrator cannot replace a master alone: ● Booking.com uses DNS for master discovery ● So Orchestrator calls a homemade script to repoint DNS (and to do other magic) 6
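The "repoint DNS" hook mentioned on this slide is a natural place for defensive coding, a theme that returns later in the talk. A minimal sketch of such a guard, with entirely hypothetical names (the real homemade script is not public):

```python
def repoint_dns(failed_master: str, new_master: str, current_record: str) -> bool:
    """Hypothetical guard for a DNS repointing hook: only steal the
    record if it still points at the master we believe just failed."""
    if current_record != failed_master:
        # DNS no longer points where we expect: another failover already
        # happened, or we are racing one. Refuse to repoint.
        return False
    if new_master == failed_master:
        # Never "promote" the server we are failing away from.
        return False
    # ... call the DNS API to move the record to new_master here ...
    return True
```

A check of this kind matters in the incident described later, where the repointing script stole the DNS entry from a server that was not the master it was failing away from.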
  • 9. MySQL Master High Availability [1 of 4] Failing-over the master to a slave is my favorite HA method • But it is not as easy as it sounds, and it is hard to automate well • An example of complete failover solution in production: https://github.blog/2018-06-20-mysql-high-availability-at-github/ The five considerations of master high availability: (https://jfg-mysql.blogspot.com/2019/02/mysql-master-high-availability-and-failover-more-thoughts.html) • Plan how you are doing master high availability • Decide when you apply your plan (Failure Detection – FD) • Tell the application about the change (Service Discovery – SD) • Protect against the limit of FD and SD for avoiding split-brains (Fencing) • Fix your data if something goes wrong 9
  • 10. 10 MySQL Master High Availability [2 of 4] Failure detection (FD) is the 1st part (and 1st challenge) of failover • It is a very hard problem: partial failure, unreliable network, partitions, … • It is impossible to be 100% sure of a failure, and confidence needs time → quick FD is unreliable, relatively reliable FD implies longer unavailability ⇒ You need to accept that FD generates false positives (and/or false negatives) Repointing is the 2nd part of failover: • Relatively easy with the right tools: MHA, GTID, Pseudo-GTID, Binlog Servers, … • Complexity grows with the number of direct slaves of the master (what if you cannot contact some of those slaves…) • Some software for repointing: • Orchestrator, Ripple Binlog Server, Replication Manager, MHA, Cluster Control, MaxScale, …
  • 11. 11 MySQL Master High Availability [3 of 4] In this configuration and when the master fails, one of the slaves needs to be repointed to the new master:
  • 12. 12 MySQL Master High Availability [4 of 4] Service Discovery (SD) is the 3rd part (and 2nd challenge) of failover: • If centralised, it is a SPOF; if distributed, impossible to update atomically • SD will either introduce a bottleneck (including performance limits) or will be unreliable in some way (pointing to the wrong master) • Some ways to implement MySQL Master SD: DNS, VIP, Proxy, Zookeeper, … http://code.openark.org/blog/mysql/mysql-master-discovery-methods-part-1-dns ⇒ Unreliable FD and unreliable SD are a recipe for split-brains ! Protecting against split-brains (Fencing): Adv. Subject – not many solutions (Proxies and semi-synchronous replication might help) Fixing your data in case of a split-brain: only you can know how to do this ! (tip on this later in the talk)
  • 13. MySQL Service Discovery @ MessageBird MessageBird uses ProxySQL for MySQL Service Discovery 13
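With a proxy in front of MySQL, service discovery moves out of DNS: promoting a new master becomes an update of the writer hostgroup on ProxySQL's admin interface. A sketch using ProxySQL's standard admin commands (hostgroup 0 as the writer hostgroup and the hostname are illustrative assumptions, not MessageBird's actual setup):

```sql
-- On the ProxySQL admin interface (assuming hostgroup 0 is the writer hostgroup)
UPDATE mysql_servers SET hostname = 'new-master.example' WHERE hostgroup_id = 0;
LOAD MYSQL SERVERS TO RUNTIME;   -- apply to the running configuration
SAVE MYSQL SERVERS TO DISK;      -- persist across ProxySQL restarts
```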
  • 15. 15 Failover War Story Master failover does not always go as planned We will now look at our War Story
  • 16. Our subject database ● Simple replication deployment (in two data centers): Master RWs +---+ happen here --> | A | +---+ | +------------------------+ | | Reads +---+ +---+ happen here --> | B | | X | +---+ +---+ | +---+ And reads | Y | <-- happen here +---+ 16
  • 17. Incident: 1st event ● A and B (two servers in same data center) fail at the same time: Master RWs +-/+ happen here --> | A | but now failing +/-+ Reads +-/+ +---+ happen here --> | B | | X | but now failing +/-+ +---+ | +---+ Reads | Y | <-- happen here +---+ 17 (I will cover how/why this happened later.)
  • 18. Incident: 1st event’ ● Orchestrator fixes things: +-/+ | A | +/-+ Reads +-/+ +---+ Now, Master RWs happen here --> | B | | X | <-- happen here but now failing +/-+ +---+ | +---+ Reads | Y | <-- happen here +---+ 18
  • 19. Split-brain: disaster ● A few things happen in that day and night, and I wake-up to this: +-/+ | A | +/-+ Master RWs +---+ +---+ happen here --> | B | | X | +---+ +---+ | +---+ | Y | +---+ 19
  • 20. Split-brain: disaster’ ● And to make things worse, reads are still happening on Y: +-/+ | A | +/-+ Master RWs +---+ +---+ happen here --> | B | | X | +---+ +---+ | +---+ Reads | Y | <-- happen here +---+ 20
  • 21. Split-brain: disaster’’ ● This is not good: ● When A and B failed, X was promoted as the new master ● Something made DNS point to B (we will see what later) à writes are now happening on B ● But B is outdated: all writes to X (after the failure of A) did not reach B ● So we have data on X that cannot be read on B ● And we have new data on B that is not read on Y 21 +-/+ | A | +/-+ Master RWs +---+ +---+ happen here --> | B | | X | +---+ +---+ | +---+ Reads | Y | <-- happen here +---+
  • 22. Split-brain: analysis ● Digging more in the chain of events, we find that: ● After the 1st failure of A, a 2nd one was detected and Orchestrator failed over to B ● So after their failures, A and B came back and formed an isolated replication chain 22 +-/+ | A | +/-+ +-/+ +---+ Master RWs | B | | X | <-- happen here +/-+ +---+ | +---+ Reads | Y | <-- happen here +---+
  • 23. Split-brain: analysis ● Digging more in the chain of events, we find that: ● After the 1st failure of A, a 2nd one was detected and Orchestrator failed over to B ● So after their failures, A and B came back and formed an isolated replication chain ● And something caused a failure of A 23 +---+ | A | +---+ | +---+ +---+ Master RWs | B | | X | <-- happen here +---+ +---+ | +---+ Reads | Y | <-- happen here +---+
  • 24. Split-brain: analysis ● Digging more in the chain of events, we find that: ● After the 1st failure of A, a 2nd one was detected and Orchestrator failed over to B ● So after their failures, A and B came back and formed an isolated replication chain ● And something caused a failure of A ● But how did DNS end-up pointing to B ? ● The failover to B called the DNS repointing script ● The script stole the DNS entry from X and pointed it to B 24 +-/+ | A | +/-+ +---+ +---+ Master RWs | B | | X | <-- happen here +---+ +---+ | +---+ Reads | Y | <-- happen here +---+
  • 25. Split-brain: analysis ● Digging more in the chain of events, we find that: ● After the 1st failure of A, a 2nd one was detected and Orchestrator failed over to B ● So after their failures, A and B came back and formed an isolated replication chain ● And something caused a failure of A ● But how did DNS end-up pointing to B ? ● The failover to B called the DNS repointing script ● The script stole the DNS entry from X and pointed it to B ● But is that all: what made A fail ? 25 +-/+ | A | +/-+ Master RWs +---+ +---+ happen here --> | B | | X | +---+ +---+ | +---+ Reads | Y | <-- happen here +---+
  • 26. Split-brain: analysis’ ● What made A fail ? ● Once A and B came back up as a new replication chain, they had outdated data ● If B would have come back before A, it could have been re-slaved to X 26 +---+ | A | +---+ | +---+ +---+ Master RWs | B | | X | <-- happen here +---+ +---+ | +---+ Reads | Y | <-- happen here +---+
  • 27. Split-brain: analysis’ ● What made A fail ? ● Once A and B came back up as a new replication chain, they had outdated data ● If B would have come back before A, it could have been re-slaved to X ● But because A came back before re-slaving, it injected heartbeat and p-GTID to B 27 +-/+ | A | +/-+ +---+ +---+ Master RWs | B | | X | <-- happen here +---+ +---+ | +---+ Reads | Y | <-- happen here +---+
  • 28. Split-brain: analysis’ ● What made A fail ? ● Once A and B came back up as a new replication chain, they had outdated data ● If B would have come back before A, it could have been re-slaved to X ● But because A came back before re-slaving, it injected heartbeat and p-GTID to B ● Then B could have been re-cloned without problems 28 +---+ | A | +---+ | +---+ +---+ Master RWs | B | | X | <-- happen here +---+ +---+ | +---+ Reads | Y | <-- happen here +---+
  • 29. Split-brain: analysis’ ● What made A fail ? ● Once A and B came back up as a new replication chain, they had outdated data ● If B would have come back before A, it could have been re-slaved to X ● But because A came back before re-slaving, it injected heartbeat and p-GTID to B ● Then B could have been re-cloned without problems ● But A was re-cloned instead (human error #1) 29 +---+ | A | +---+ +-/+ +---+ Master RWs | B | | X | <-- happen here +/-+ +---+ | +---+ Reads | Y | <-- happen here +---+
  • 30. Split-brain: analysis’ ● What made A fail ? ● Once A and B came back up as a new replication chain, they had outdated data ● If B would have come back before A, it could have been re-slaved to X ● But because A came back before re-slaving, it injected heartbeat and p-GTID to B ● Then B could have been re-cloned without problems ● But A was re-cloned instead (human error #1) ● Why did Orchestrator not fail-over right away ? ● B was promoted hours after A was brought down… ● Because A was downtimed only for 4 hours (human error #2) 30 +-/+ | A | +/-+ +---+ +---+ Master RWs | B | | X | <-- happen here +---+ +---+ | +---+ Reads | Y | <-- happen here +---+
  • 31. Orchestrator anti-flapping ● Orchestrator has a failover throttling/acknowledgment mechanism[1]: ● Automated recovery will happen ● for an instance in a cluster that has not recently been recovered ● unless such recent recoveries were acknowledged ● In our case: ● the recovery might have been acknowledged too early (human error #0 ?) ● with too short a “recently” timeout ● and maybe Orchestrator should not have failed over the second time [1]: https://github.com/github/orchestrator/blob/master/docs/topology-recovery.md#blocking-acknowledgments-anti-flapping 31
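The throttling window described on this slide is driven by Orchestrator configuration. A fragment showing the kind of settings involved (the values are illustrative, not the ones in use during the incident):

```json
{
  "RecoveryPeriodBlockSeconds": 3600,
  "RecoverMasterClusterFilters": ["*"],
  "RecoverIntermediateMasterClusterFilters": ["*"]
}
```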
  • 32. Split-brain: summary ● So in summary, this disaster was caused by: 1. A fancy failure: two servers failing at the same time 2. A debatable premature acknowledgment in Orchestrator and probably too short a timeout for recent failover 3. Edge-case recovery: both servers forming a new replication topology 4. Restarting with the event-scheduler enabled (A injecting heartbeat and p-GTID) 5. Re-cloning wrong (A instead of B; should have created C and D and thrown away A and B; too short downtime for the re-cloning) 6. Orchestrator failing over something that it should not have (including taking a questionable action when a downtime expired) 7. DNS repointing script not defensive enough 32
  • 33. Fancy failure: more details ● Why did A and B fail at the same time ? ● Deployment error: the two servers in the same rack/failure domain ? ● And/or very unlucky ? ● Very unlucky because… 10 to 20 servers failed that day in the same data center Because of human operations and “sensitive” hardware 33 +-/+ | A | +/-+ +-/+ +---+ | B | | X | +/-+ +---+ | +---+ | Y | +---+
  • 34. How to fix such situation ? ● Fixing “split-brain” data on B and X is hard ● Some solutions are: ● Kill B or X (and lose data) ● Replay writes from B on X or vice-versa (manually or with replication) ● But AUTO_INCREMENTs are in the way: ● up to i used on A before 1st failover ● i-n to j1 used on X after recovery ● i to j2 used on B after 2nd failover 34 +-/+ | A | +/-+ Master RWs +---+ +---+ happen here --> | B | | X | +---+ +---+ | +---+ Reads | Y | <-- happen here +---+
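The overlap described on this slide can be made concrete: both sides of the split-brain handed out ids from the same counter, so the same AUTO_INCREMENT value names two different rows. A small sketch of the range arithmetic (the numeric ranges are illustrative):

```python
def conflicting_ids(range_x: tuple, range_b: tuple) -> range:
    """Return the AUTO_INCREMENT ids handed out on both sides of a
    split-brain: each of those ids names a different row on each
    master, so the two datasets cannot be merged mechanically."""
    lo = max(range_x[0], range_b[0])
    hi = min(range_x[1], range_b[1])
    return range(lo, hi + 1) if lo <= hi else range(0)

# After the incident: X used ids (i-n)..j1, B used i..j2 (i-n <= i).
# Every id in i..min(j1, j2) exists on both masters with different data.
overlap = conflicting_ids((95, 130), (100, 120))  # i-n=95, j1=130, i=100, j2=120
```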
  • 35. Takeaway ● Twisted situations happen ● Automation (including failover) is not simple: → code automation scripts defensively ● Be mindful of premature acknowledgment, downtime more rather than less, shut down slaves first → understand the complex interactions of tools in detail ● Try something other than AUTO_INCREMENT for Primary Keys (monotonically increasing UUID[1] [2] ?) [1]: https://www.percona.com/blog/2014/12/19/store-uuid-optimized-way/ [2]: http://mysql.rjweb.org/doc.php/uuid 35
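The "monotonically increasing UUID" idea from reference [1] above can be sketched as follows: a version-1 UUID stores its timestamp low bytes first, so sorting raw UUIDs scatters inserts across an InnoDB primary key; moving the high timestamp bits to the front makes byte order follow generation order. A minimal sketch of the rearrangement (not the exact stored function from the blog post):

```python
import uuid

def ordered_uuid(u: uuid.UUID) -> bytes:
    """Rearrange a version-1 UUID so its timestamp bits lead, making
    byte order follow generation order (after the Percona approach)."""
    b = u.bytes
    # v1 layout:  time_low (4 bytes) | time_mid (2) | time_hi_and_version (2) | rest (8)
    # reordered:  time_hi_and_version | time_mid | time_low | rest
    return b[6:8] + b[4:6] + b[0:4] + b[8:16]
```

Stored in a BINARY(16) column, such values keep new rows appending at the right-hand end of the primary key instead of landing on random pages.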
  • 36. 36 Re Failover War Story Master failover does not always go as planned • because it is complicated It is not a matter of “if” things go wrong • but “when” things will go wrong Please share your war stories • so we can learn from each other’s experience • GitHub has a MySQL public Post-Mortem (great of them to share this): https://blog.github.com/2018-10-30-oct21-post-incident-analysis/
  • 38. Thanks ! by Jean-François Gagné presented at SRECon Dublin (October 2019) Senior Infrastructure Engineer / System and MySQL Expert jeanfrancois AT messagebird DOT com / @jfg956 #SREcon