Un-split brain MySQL

Is there a way to repair a MySQL split brain scenario? When two servers have diverged from each other, is it possible to identify and undo the conflicting changes?

We introduce gh-mysql-rewind, which combines multiple technologies to achieve auto-resolution of data divergence. This presentation explains how gh-mysql-rewind works, and how it is being tested in production to validate its operation.

Published in: Technology

  1. 1. MySQL Un-split brain
 aka Move Back in Time via gh-mysql-rewind. Shlomi Noach, GitHub. FOSDEM 2019
  2. 2. About me: @github/database-infrastructure. Author of orchestrator, gh-ost, freno, ccql and others. Blog at http://openark.org
 github.com/shlomi-noach
 @ShlomiNoach
  3. 3. GitHub
 Built for developers. Largest open source hosting. 100M+ repositories, 31M+ developers,
 2.1M+ organizations. Supplier of octocat T-Shirts and stickers.
  4. 4. Incentive: Expected MySQL split brain, with unexpected requirement.
  5. 5. Failover & split brain (topology diagram)
  6. 6. Incentive: The time it took to restore data on the demoted topology.
  7. 7. Incentive: Could we just roll back the changes and move back in time?
  8. 8. gh-mysql-rewind: Rewind a dirty server back in time and connect it as a healthy replica in a replication topology.
  9. 9. Iteratively restore hosts into replication topology (topology diagram)
  10. 10. gh-mysql-rewind: How?
  11. 11. GTID: The good parts
  12. 12. GTID: Each server keeps track of:
   • gtid_executed: all transactions ever executed:
     00020192-1111-1111-1111-111111111111:1-130541
   • gtid_purged: of which, some may have been purged:
     00020192-1111-1111-1111-111111111111:1-120008
  13. 13. Split brain: GTID contents (diagram: servers and their binary logs)
  14. 14. Split brain: gtid_executed. gtid_executed on demoted master:
 00020192-1111-1111-1111-111111111111:1-5042
  15. 15. Split brain: gtid_executed. gtid_executed on promoted master:
 00020192-1111-1111-1111-111111111111:1-5000,
 00020193-2222-2222-2222-222222222222:1-200
  16. 16. Identifying bad GTID transactions: gtid_executed (demoted) - gtid_executed (promoted).
 Demoted: 00020192-1111-1111-1111-111111111111:1-5042
 Promoted: 00020192-1111-1111-1111-111111111111:1-5000,
 00020193-2222-2222-2222-222222222222:1-200
 Difference (the bad transactions): 00020192-1111-1111-1111-111111111111:5001-5042
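The subtraction on this slide can be reproduced offline. Below is a minimal sketch, assuming GTID sets in the textual `uuid:start-end[:start-end...]` form shown above; the function names are illustrative and not part of gh-mysql-rewind. On a live server, MySQL's built-in `GTID_SUBTRACT()` function performs the same computation.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:7,uuid2:1-3' into {uuid: [(start, end), ...]}."""
    result = {}
    for entry in text.replace("\n", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        uuid, *ranges = entry.split(":")
        intervals = result.setdefault(uuid, [])
        for r in ranges:
            start, _, end = r.partition("-")
            intervals.append((int(start), int(end or start)))
    return result


def subtract_intervals(a, b):
    """Return the parts of intervals `a` that are not covered by intervals `b`."""
    remaining = []
    for start, end in sorted(a):
        cursor = start
        for b_start, b_end in sorted(b):
            if b_end < cursor or b_start > end:
                continue  # no overlap with what is left of this interval
            if b_start > cursor:
                remaining.append((cursor, b_start - 1))
            cursor = max(cursor, b_end + 1)
        if cursor <= end:
            remaining.append((cursor, end))
    return remaining


def gtid_subtract(minuend, subtrahend):
    """Textual GTID set difference: transactions in `minuend` but not in `subtrahend`."""
    a, b = parse_gtid_set(minuend), parse_gtid_set(subtrahend)
    parts = []
    for uuid in sorted(a):
        left = subtract_intervals(a[uuid], b.get(uuid, []))
        if left:
            ranges = ":".join(f"{s}-{e}" if s != e else str(s) for s, e in left)
            parts.append(f"{uuid}:{ranges}")
    return ",".join(parts)


# Values taken from the slides.
demoted = "00020192-1111-1111-1111-111111111111:1-5042"
promoted = (
    "00020192-1111-1111-1111-111111111111:1-5000,"
    "00020193-2222-2222-2222-222222222222:1-200"
)
bad = gtid_subtract(demoted, promoted)
print(bad)
```

The result names the transactions that exist only on the demoted master, which are the ones that need to be undone.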
  17. 17. Row based replication: With binlog_row_image=FULL, each INSERT, DELETE, UPDATE is expressed in the binary log as the complete row image before/after change.
  18. 18. Row based replication:
     BINLOG '
     L9MtXBMBAAAANgAAAJAWAAAAAG4AAAAAAAEABG1ldGEACWhlYXJ0YmVhdAACAxEBBgCk9AE9
     L9MtXB8BAAAAPAAAAMwWAAAAAG4AAAAAAAEAAgAC///8AQAAAFwt0y4BQYr8AQAAAFwt0y8BWKiT
     IZN+
     '/*!*/;
     ### UPDATE `meta`.`heartbeat`
     ### WHERE
     ###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
     ###   @2=1546507054.082314 /* TIMESTAMP(6) meta=6 nullable=0 is_null=0 */
     ### SET
     ###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
     ###   @2=1546507055.088232 /* TIMESTAMP(6) meta=6 nullable=0 is_null=0 */
     # at 5836
     #190103 11:17:35 server id 1 end_log_pos 5867 CRC32 0x2cf60376 Xid = 114
     COMMIT/*!*/;
  19. 19. MariaDB, flashback: Developed by Alibaba. Contributed to MySQL and MariaDB. Implemented in MariaDB’s mysqlbinlog:
   • mysqlbinlog --flashback
  20. 20. flashback example, pseudo code.
 Original events:
     insert(1, 'a')
     insert(2, 'b')
     insert(3, 'c')
     update(2, 'b')->(2, 'second')
     update(3, 'c')->(3, 'third')
     insert(4, 'd')
     delete(1, 'a')
 Flashback (each event inverted, in reverse order):
     insert(1, 'a')
     delete(4, 'd')
     update(3, 'third')->(3, 'c')
     update(2, 'second')->(2, 'b')
     delete(3, 'c')
     delete(2, 'b')
     delete(1, 'a')
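The flashback idea in this example can be sketched in a few lines: invert every row event and replay the inverted events in reverse order. This is only an illustration of the principle on in-memory tuples, assuming a toy table keyed by id; it is not how `mysqlbinlog --flashback` is implemented.

```python
def invert(event):
    """Return the event that undoes `event`."""
    kind = event[0]
    if kind == "insert":
        return ("delete", event[1])
    if kind == "delete":
        return ("insert", event[1])
    if kind == "update":
        _, before, after = event
        return ("update", after, before)
    raise ValueError(f"cannot invert event type: {kind}")


def flashback(events):
    """Invert each event and reverse the order, newest change undone first."""
    return [invert(e) for e in reversed(events)]


def apply(table, events):
    """Apply row events to a toy table modelled as {id: value}."""
    for event in events:
        kind = event[0]
        if kind == "insert":
            row_id, value = event[1]
            table[row_id] = value
        elif kind == "delete":
            del table[event[1][0]]
        else:  # update: full before and after row images
            _, before, after = event
            del table[before[0]]
            table[after[0]] = after[1]
    return table


# The event sequence from the slide.
events = [
    ("insert", (1, "a")),
    ("insert", (2, "b")),
    ("insert", (3, "c")),
    ("update", (2, "b"), (2, "second")),
    ("update", (3, "c"), (3, "third")),
    ("insert", (4, "d")),
    ("delete", (1, "a")),
]

table = apply({}, events)            # state after the original writes
table = apply(table, flashback(events))  # state after undoing them
print(table)
```

Inversion only works because the full before and after row images are available, which is why binlog_row_image=FULL matters here.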
  21. 21. rewind: GTID provides information on “what diverged”. flashback provides the mechanics to undo changes. Now, what exactly do we need to rewind?
  22. 22. Time axis (diagram)
  23. 23. Rewind to split point (diagram)
  24. 24. Rewind beyond split point (diagram)
  25. 25. Finding bad transactions in binlogs (diagram: binary logs mysql-bin.0000620, mysql-bin.0000621, mysql-bin.0000622, mysql-bin.0000623)
  26. 26. Finding bad transactions in binlogs:
     # at 4
     #190103 11:17:14 server id 1 end_log_pos 123 CRC32 0x60730f6b Start: binlog v 4, server v 5.7.17-log created 190103 11:17:14 at startup
     ROLLBACK/*!*/;
     BINLOG '
     GtMtXA8BAAAAdwAAAHsAAAAAAAQANS43LjE3LWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
     AAAAAAAAAAAAAAAAAAAa0y1cEzgNAAgAEgAEBAQEEgAAXwAEGggAAAAICAgCAAAACgoKKioAEjQA
     AWsPc2A=
     '/*!*/;
     # at 123
     #190103 11:17:14 server id 1 end_log_pos 194 CRC32 0x48f18c88 Previous-GTIDs
     # 00020192-1111-1111-1111-111111111111:1-59023
     # at 194
  27. 27. gh-mysql-rewind: preparation
   • Identifies the bad GTID transactions.
   • Identifies which binary logs contain those transactions.
   • Generates flashback per-binlog.
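The second preparation step can be sketched as follows. Each binary log opens with a Previous-GTIDs event listing everything executed before that file, so comparing those values against the first bad transaction tells us which files to flash back. This is a simplified sketch, assuming a single server UUID with contiguous sequence numbers; the function name and the per-file numbers are hypothetical, not taken from gh-mysql-rewind.

```python
def binlogs_to_rewind(previous_gtids, first_bad):
    """Pick the binary logs to flash back, and the point we land on.

    previous_gtids: ordered list of (file name, highest sequence number
                    listed in that file's Previous-GTIDs event).
    first_bad:      sequence number of the first bad transaction.
    """
    start = None
    for index, (_, executed_before) in enumerate(previous_gtids):
        if executed_before < first_bad:
            start = index  # this file may still contain first_bad
    if start is None:
        raise ValueError("first bad transaction predates the available binary logs")
    files = [name for name, _ in previous_gtids[start:]]
    rewind_point = previous_gtids[start][1]
    return files, rewind_point


# Hypothetical Previous-GTIDs values for the four files named on the slides.
previous_gtids = [
    ("mysql-bin.0000620", 4500),
    ("mysql-bin.0000621", 4805),
    ("mysql-bin.0000622", 5010),
    ("mysql-bin.0000623", 5030),
]
files, rewind_point = binlogs_to_rewind(previous_gtids, first_bad=5001)
print(files, rewind_point)
```

Because flashback is generated per binlog, the whole of the first affected file is rolled back, so the server lands at or before the split point rather than exactly on it.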
  28. 28. gh-mysql-rewind: inject GTID. MariaDB/flashback do not speak MySQL-GTID. We use some good old awk hacking to inject dummy GTID entries:
     mysqlbinlog --flashback … |
       awk 'BEGIN {"uuidgen -r" |& getline u} /^BEGIN/ {c += 1 ; print "SET @@SESSION.GTID_NEXT= \x27" u ":" c "\x27/*!*/;"} {print}' |
       sed -e 's/, @@session.check_constraint_checks=1//g'
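For readers less fluent in awk, here is a rough Python equivalent of the pipeline's two filters. It is a sketch of the same idea, not the tool's code: the function name is made up, and the sample input is a hand-written stand-in for real flashback output.

```python
import uuid


def inject_gtids(lines, server_uuid=None):
    """Yield `lines` with a dummy GTID_NEXT statement before every transaction.

    A random UUID (like `uuidgen -r`) is used unless one is supplied.
    """
    server_uuid = server_uuid or str(uuid.uuid4())
    counter = 0
    for line in lines:
        # Same effect as the sed step: drop a session variable that
        # MariaDB's mysqlbinlog emits.
        line = line.replace(", @@session.check_constraint_checks=1", "")
        if line.startswith("BEGIN"):
            counter += 1
            yield f"SET @@SESSION.GTID_NEXT= '{server_uuid}:{counter}'/*!*/;"
        yield line


# Hand-written stand-in for flashback output; the UUID is a placeholder.
sample = [
    "BEGIN",
    "### DELETE FROM `meta`.`heartbeat`",
    "COMMIT/*!*/;",
    "BEGIN",
    "COMMIT/*!*/;",
]
out = list(inject_gtids(sample, "00020199-9999-9999-9999-999999999999"))
print("\n".join(out))
```

Each rewound transaction gets its own GTID under a throwaway UUID, so the server accepts the statements while GTID mode is enabled.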
  29. 29. gh-mysql-rewind: apply. cat all this into the broken MySQL server. We have now reverted the server into some consistent point in history, at or (more likely) before the split.
  30. 30. gh-mysql-rewind: back in time. Have we just made things even worse? Where are we? gtid_executed is a mess! What happens if we reconnect the server into the topology?
  31. 31. Computing target GTID:
 Executed before rewind: 00020192-1111-1111-1111-111111111111:1-5042
 Rewound: 00020192-1111-1111-1111-111111111111:4806-5042
 Target (difference): 00020192-1111-1111-1111-111111111111:1-4805
 RESET MASTER
 SET GLOBAL gtid_purged := <computation>
  32. 32. Join topology as healthy replica:
 CHANGE MASTER TO
 MASTER_HOST=<healthy host>,
 MASTER_AUTO_POSITION=1
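Putting the last two slides together, the statement sequence could be generated along these lines. This is a sketch under stated assumptions: a single server UUID, a hypothetical host name, and a final START SLAVE that the slides do not show. RESET MASTER comes first because in MySQL 5.7 gtid_purged can only be set while gtid_executed is empty.

```python
def target_gtid_purged(server_uuid, executed_end, rewound_start):
    """GTID set left after removing the rewound tail rewound_start..executed_end."""
    if not 1 < rewound_start <= executed_end:
        raise ValueError("rewound range must be a proper tail of the executed range")
    return f"{server_uuid}:1-{rewound_start - 1}"


def rejoin_statements(gtid_purged, healthy_host):
    """SQL to run on the rewound server, in order."""
    return [
        "RESET MASTER",  # clears gtid_executed and the local binary logs
        f"SET GLOBAL gtid_purged = '{gtid_purged}'",
        f"CHANGE MASTER TO MASTER_HOST='{healthy_host}', MASTER_AUTO_POSITION=1",
        "START SLAVE",  # assumption: not shown on the slides
    ]


# GTID values from the slides; the host name is hypothetical.
target = target_gtid_purged("00020192-1111-1111-1111-111111111111", 5042, 4806)
statements = rejoin_statements(target, "healthy-replica.example.com")
for statement in statements:
    print(statement + ";")
```

With MASTER_AUTO_POSITION=1 the replica asks the healthy host for everything missing from its declared GTID set, so replication resumes from transaction 4806 onward.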
  33. 33. Limitations:
   • Cannot rewind DDL.
   • JSON, POINT not supported.
   • Full binlog rollback means more recovery time.
   • Local run on each server.
   • One server at a time. With careful planning, the operation could be applied on a single server and propagate to its replicas.
  34. 34. Testing: Is this reliable?
  35. 35. Testing: Continuous testing in production!
  36. 36. Testing: Ideally: checksum entire data, contaminate, rewind, checksum again. But…
   • Dataset is too big, tests would take days.
   • How can we predict where (in history) gh-mysql-rewind will drop us?
  37. 37. Testing, a dedicated replica:
     STOP SLAVE
     CHANGE MASTER TO MASTER_AUTO_POSITION=1 (rotates relay logs)
     FLUSH BINARY LOGS (opens a new binary log)
 We predict that gh-mysql-rewind will drop us back into the current position.
  38. 38. Testing: grab production data.
     START SLAVE IO_THREAD
     Sleep 30 sec
     STOP SLAVE
 We aggregate 30 sec of production traffic in a relay log.
  39. 39. Testing: analyze production data. We parse the relay log and list affected tables. We use all “small” tables, and a single “large” table. We checksum all these tables.
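The analysis step might look like the sketch below. It assumes the relay log has already been decoded with `mysqlbinlog --verbose`, which prints row events as `### UPDATE ...` pseudo-SQL lines like the earlier row based replication example. The row-count threshold for "small" and the rule for picking the one large table are assumptions; the slides do not specify them.

```python
import re

# Matches the pseudo-SQL header lines printed for row events.
ROW_EVENT = re.compile(r"^### (?:INSERT INTO|UPDATE|DELETE FROM) `([^`]+)`\.`([^`]+)`")


def affected_tables(decoded_log_lines):
    """Collect schema.table names touched by row events in a decoded log."""
    tables = set()
    for line in decoded_log_lines:
        match = ROW_EVENT.match(line)
        if match:
            tables.add(f"{match.group(1)}.{match.group(2)}")
    return tables


def tables_to_checksum(affected, row_counts, small_limit=100_000):
    """All "small" tables plus a single "large" one (first by name, an assumption)."""
    small = sorted(t for t in affected if row_counts[t] <= small_limit)
    large = sorted(t for t in affected if row_counts[t] > small_limit)
    return small + large[:1]


# Hand-written sample; table names other than meta.heartbeat are hypothetical.
lines = [
    "### UPDATE `meta`.`heartbeat`",
    "### WHERE",
    "### INSERT INTO `app`.`events`",
    "### DELETE FROM `app`.`sessions`",
    "### UPDATE `meta`.`heartbeat`",
]
row_counts = {"meta.heartbeat": 1, "app.events": 5_000_000, "app.sessions": 2_000_000}
chosen = tables_to_checksum(affected_tables(lines), row_counts)
print(chosen)
```

Limiting the checksum to these tables keeps each test iteration short while still covering data that the captured traffic actually touched.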
  40. 40. Testing: apply production data.
     START SLAVE SQL_THREAD;
     SELECT MASTER_POS_WAIT(<relaylog end coordinates>)
 We apply the entire content of the relay log.
  41. 41. Testing: contaminate. For each table, we issue:
     DELETE FROM <t> ORDER BY id DESC LIMIT 10
     DELETE FROM <t> LIMIT 10
 And to make things worse:
     START SLAVE
     Sleep 30 sec
     STOP SLAVE
  42. 42. Replication is likely to be broken. Data is corrupted. A mess.
  43. 43. Testing: gh-mysql-rewind. Connect to a healthy server. gh-mysql-rewind will rewind the dirty writes, and past those changes. Our choice of 30 sec was intentional: it will rewind back to our FLUSH point.
  44. 44. Testing: checksum. We checksum all the affected tables again. We expect 100% match.
  45. 45. Testing: iterate! The replica is back in the topology. Wait for it to catch up, run again, and again, and again. A complementary test exists to confirm our “point of arrival” calculation is correct; however, replication success makes it redundant.
  46. 46. Expectations: To never use gh-mysql-rewind. To rewind & join replication within minutes as opposed to hours. Smaller binary logs can help.
  47. 47. Status: Shell script. To be released shortly to the community. At some stage integrate into orchestrator. But for that, we will need remote access.
  48. 48. In development at MySQL: Query & manipulate GTID over the client protocol:
  49. 49. Questions? github.com/shlomi-noach @ShlomiNoach Thank you!
