MySQL Replication Alternative: Pros and Cons

4,599 views
4,454 views

Published on

At Percona Live, New York 2011

Published in: Technology, Business
0 Comments
4 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
4,599
On SlideShare
0
From Embeds
0
Number of Embeds
23
Actions
Shares
0
Downloads
76
Comments
0
Likes
4
Embeds 0
No embeds

No notes for slide

MySQL Replication Alternative: Pros and Cons

  1. 1. MySQL replication alternatives: Pros and Cons<br />Darpan Dinker<br />VP Engineering<br />Schooner Information Technology, Inc.http://www.schoonerinfotech.com/<br />Oracle, MySQL, Java are registered trademarks of Oracle and/or its affiliates. <br />Other names may be trademarks of their respective owners.<br />
  2. 2. MySQL/InnoDB Replication<br />Replication for databases is critical<br />High Availability: avoid SPOF, fail-over for service continuity, DR<br />Scale-Out: scale reads on replicas<br />Misc. administrative tasks: upgrades, schema changes, backup, PITR ...<br />MySQL/InnoDB with base replication has serious gaps<br />Failover is not straight-forward<br />Possibility of data loss and data inconsistency<br />Possibility of stale data read on Slaves<br />Writes on Master need to be de-rated to Slave’s single-thread applier performance (forcing unnecessary sharding)<br />Applications and deployments need to work around these issues or live with the consequences<br />Several alternatives exist with different design considerations<br />
  3. 3. Replication Options for MySQL/InnoDB on Commodity HW<br />Qualitative Comparison<br />Integrated Replication<br />External Replication<br />MySQL async<br />Available in MySQL 5.1, 5.5<br />MySQL semi-sync<br />Available in MySQL 5.5<br />Schooner sync<br />Available in Schooner Active Cluster<br />GoldenGate<br />Tungsten<br />DRBD<br />Available on Linux<br />
  4. 4. #1 MySQL Asynchronous Replication<br />Read version based on tx=50<br />Master mysqld<br />Slave mysqld<br />Tx.Commit(101)<br />Repl.apply(51)<br />tx=101<br />tx=101<br />tx=51<br />tx=51<br />DB<br />Relay <br />log<br />InnoDB<br />Tx log<br />DB<br />MySQL<br />Bin log<br />InnoDB<br />Tx log<br />Log events pulled by Slave<br />Last tx=100<br />Last tx=100<br />Last tx=70<br />Last tx=50<br /><ul><li>Loosely coupled master/slave relationship
  5. 5. Master does not wait for Slave
  6. 6. Slave determines how much to read and from which point in the binary log
  7. 7. Slave can be arbitrarily behind master in reading and applying changes
  8. 8. Read on slave can give old data
  9. 9. No checksums in binary or relay log stored on disk, data corruption possible
  10. 10. Upon a Master’s failure
  11. 11. Slave may not have latest committed data resulting in data loss
  12. 12. Fail-over to a slave is stalled until all transactions in relay log have been committed – not instantaneous</li></li></ul><li>#1 MySQL Asynchronous Replication<br />Read version based on tx=50<br />Read Stale data on Slave<br />No flow control<br />Corruption<br />Master failure = mess<br />Data loss<br />Master mysqld<br />Slave mysqld<br />Tx.Commit(101)<br />Repl.apply(51)<br />tx=101<br />tx=101<br />tx=51<br />tx=51<br />DB<br />Relay <br />log<br />InnoDB<br />Tx log<br />DB<br />MySQL<br />Bin log<br />InnoDB<br />Tx log<br />Log events pulled by Slave<br />Last tx=100<br />Last tx=100<br />Last tx=70<br />Last tx=50<br /><ul><li>Loosely coupled master/slave relationship
  13. 13. Master does not wait for Slave
  14. 14. Slave determines how much to read and from which point in the binary log
  15. 15. Slave can be arbitrarily behind master in reading and applying changes
  16. 16. Read on slave can give old data
  17. 17. No checksums in binary or relay log stored on disk, data corruption possible
  18. 18. Upon a Master’s failure
  19. 19. Slave may not have latest committed data resulting in data loss
  20. 20. Fail-over to a slave is stalled until all transactions in relay log have been committed – not instantaneous</li></li></ul><li>#2 MySQL Semi-synchronous Replication<br />Read version based on tx=50<br />Master mysqld<br />Slave mysqld<br />Tx.Commit(101)<br />Repl.apply(51)<br />tx=101<br />tx=101<br />tx=51<br />tx=51<br />Log for tx=100 pulled by Slave<br />DB<br />Relay <br />log<br />InnoDB<br />Tx log<br />DB<br />MySQL<br />Bin log<br />InnoDB<br />Tx log<br />Slave ACK for tx=100<br />Last tx=100<br />Last tx=100<br />Last tx=100<br />Last tx=50<br /><ul><li>Semi-coupled master/slave relationship
  21. 21. On commit, Master waits for an ACK from Slave
  22. 22. Slave logs the transaction event in relay log and ACKs (may not apply yet)
  23. 23. Slave can be arbitrarily behind master in applying changes
  24. 24. Read on slave can give old data
  25. 25. No checksums in binary or relay log stored on disk, data corruption possible
  26. 26. Upon a Master’s failure
  27. 27. Fail-over to a slave is stalled until all transactions in relay log have been committed – not instantaneous</li></li></ul><li>#2 MySQL Semi-synchronous Replication<br />Read version based on tx=50<br />Read Stale data on Slave<br />No flow control<br />Corruption<br />Master failure = mess<br />Data loss<br />Master mysqld<br />Slave mysqld<br />Tx.Commit(101)<br />Repl.apply(51)<br />tx=101<br />tx=101<br />tx=51<br />tx=51<br />Log for tx=100 pulled by Slave<br />DB<br />Relay <br />log<br />InnoDB<br />Tx log<br />DB<br />MySQL<br />Bin log<br />InnoDB<br />Tx log<br />Slave ACK for tx=100<br />Last tx=100<br />Last tx=100<br />Last tx=100<br />Last tx=50<br /><ul><li>Semi-coupled master/slave relationship
  28. 28. On commit, Master waits for an ACK from Slave
  29. 29. Slave logs the transaction event in relay log and ACKs (may not apply yet)
  30. 30. Slave can be arbitrarily behind master in applying changes
  31. 31. Read on slave can give old data
  32. 32. No checksums in binary or relay log stored on disk, data corruption possible
  33. 33. Upon a Master’s failure
  34. 34. Fail-over to a slave is stalled until all transactions in relay log have been committed – not instantaneous</li></li></ul><li>#3 Schooner Active Cluster (SAC):An integrated HA and replication solution for MySQL/InnoDB<br />Read version based on tx=100<br />Master mysqld<br />Slave mysqld<br />Log for tx=100 pushed to Slave<br />Slave ACK for tx=100<br />Repl.apply(100)<br />Tx.Commit(101)<br />tx=100<br />tx=101<br />DB<br />InnoDB<br />Tx log<br />DB<br />MySQL<br />Bin log<br />InnoDB<br />Tx log<br />Last tx=100<br />Last tx=100<br />Last tx=100<br /><ul><li>Tightly-coupled master/slave relationship
  35. 35. After commit, all Slaves guaranteed to receive and commit the change
  36. 36. Slave in lock-step with Master
  37. 37. Read on slave gives latest committed data
  38. 38. Checksums in log stored on disk
  39. 39. Upon a Master’s failure
  40. 40. Fail-over to a slave is fully integrated and automatic
  41. 41. Application writes continue on new master instantaneously
  42. 42. No data loss</li></li></ul><li>#3 Schooner Active Cluster (SAC):An integrated HA and replication solution for MySQL/InnoDB<br />Read version based on tx=100<br />Master mysqld<br />Slave mysqld<br />Log for tx=100 pushed to Slave<br />Slave ACK for tx=100<br />Repl.apply(100)<br />Tx.Commit(101)<br />Read Stale data on Slave<br />No flow control<br />Corruption<br />Master failure = mess<br />Data loss<br />tx=100<br />tx=101<br />DB<br />InnoDB<br />Tx log<br />DB<br />MySQL<br />Bin log<br />InnoDB<br />Tx log<br />Last tx=100<br />Last tx=100<br />Last tx=100<br /><ul><li>Tightly-coupled master/slave relationship
  43. 43. After commit, all Slaves guaranteed to receive and commit the change
  44. 44. Slave in lock-step with Master
  45. 45. Read on slave gives latest committed data
  46. 46. Checksums in log stored on disk
  47. 47. Upon a Master’s failure
  48. 48. Fail-over to a slave is fully integrated and automatic
  49. 49. Application writes continue on new master instantaneously
  50. 50. No data loss</li></li></ul><li>#4 Oracle GoldenGate<br />Read version based on tx=50<br />Master mysqld<br />Slave mysqld<br />Tx.Commit(101)<br />Repl.apply(51)<br />Native SQL<br />tx=51<br />tx=101<br />tx=101<br />tx=51<br />Log for tx=71 sent to Slave<br />DB<br />Dest.<br />Trail<br />InnoDB<br />Tx log<br />DB<br />MySQL<br />Bin log<br />InnoDB<br />Tx log<br />Source<br />Trail<br />Last tx=100<br />Last tx=100<br />Last tx=70<br />Last tx=50<br /><ul><li>Loosely coupled master/slave relationship
  51. 51. Master does not wait for Slave
  52. 52. Changes applied on slave similar to MySQL
  53. 53. Slave can be arbitrarily behind master in reading and applying changes
  54. 54. Read on slave can give old data
  55. 55. Heterogeneous Database support
  56. 56. Oracle, Microsoft SQL Server, IBM DB2, MySQL</li></ul>Oracle and MySQL are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.<br />
  57. 57. #4 Oracle GoldenGate<br />Read version based on tx=50<br />Read Stale data on Slave<br />No flow control<br />Corruption<br />Master failure = mess<br />Data loss<br />Master mysqld<br />Slave mysqld<br />Tx.Commit(101)<br />Repl.apply(51)<br />Native SQL<br />tx=51<br />tx=101<br />tx=101<br />tx=51<br />Log for tx=71 sent to Slave<br />DB<br />Dest.<br />Trail<br />InnoDB<br />Tx log<br />DB<br />MySQL<br />Bin log<br />InnoDB<br />Tx log<br />Source<br />Trail<br />Last tx=100<br />Last tx=100<br />Last tx=70<br />Last tx=50<br /><ul><li>Loosely coupled master/slave relationship
  58. 58. Master does not wait for Slave
  59. 59. Changes applied on slave similar to MySQL
  60. 60. Slave can be arbitrarily behind master in reading and applying changes
  61. 61. Read on slave can give old data
  62. 62. Heterogeneous Database support
  63. 63. Oracle, Microsoft SQL Server, IBM DB2, MySQL</li></ul>Oracle and MySQL are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.<br />
  64. 64. #5 Tungsten Replicator<br />Read version based on tx=50<br />Master mysqld<br />Slave mysqld<br />Tx.Commit(101)<br />Repl.apply(51)<br />JDBC<br />tx=51<br />tx=101<br />tx=101<br />tx=51<br />Log for tx=71 sent to Slave<br />DB<br />Tx<br />History<br />Log<br />InnoDB<br />Tx log<br />DB<br />MySQL<br />Bin log<br />InnoDB<br />Tx log<br />Tx<br />History<br />Log<br />Last tx=100<br />Last tx=100<br />Last tx=70<br />Last tx=50<br /><ul><li>Loosely coupled master/slave relationship
  65. 65. Master does not wait for Slave
  66. 66. Changes applied on slave similar to MySQL
  67. 67. Slave can be arbitrarily behind master in reading and applying changes
  68. 68. Read on slave can give old data*
  69. 69. Heterogeneous Database support
  70. 70. MySQL, PostgreSQL
  71. 71. Global Tx ID, useful to point to new master upon failure
  72. 72. SaaS & ISP feature: Parallel replication for multi-tenant MySQL databases</li></ul>Tungsten is a registered trademark of Continuent<br />http://code.google.com/p/tungsten-replicator/<br />
  73. 73. #5 Tungsten Replicator<br />Read version based on tx=50<br />Read Stale data on Slave<br />No flow control<br />Corruption<br />Master failure = mess<br />Data loss<br />Master mysqld<br />Slave mysqld<br />Tx.Commit(101)<br />Repl.apply(51)<br />JDBC<br />tx=51<br />tx=101<br />tx=101<br />tx=51<br />Log for tx=71 sent to Slave<br />DB<br />Tx<br />History<br />Log<br />InnoDB<br />Tx log<br />DB<br />MySQL<br />Bin log<br />InnoDB<br />Tx log<br />Tx<br />History<br />Log<br />Last tx=100<br />Last tx=100<br />Last tx=70<br />Last tx=50<br /><ul><li>Loosely coupled master/slave relationship
  74. 74. Master does not wait for Slave
  75. 75. Changes applied on slave similar to MySQL
  76. 76. Slave can be arbitrarily behind master in reading and applying changes
  77. 77. Read on slave can give old data*
  78. 78. Heterogeneous Database support
  79. 79. MySQL, PostgreSQL
  80. 80. Global Tx ID, useful to point to new master upon failure
  81. 81. SaaS & ISP feature: Parallel replication for multi-tenant MySQL databases</li></ul>Tungsten is a registered trademark of Continuent<br />http://code.google.com/p/tungsten-replicator/<br />
  82. 82. #6 Linux DRBD<br />Master mysqld<br />STAND-BY<br />(No mysqld running)<br />Heartbeat<br />Tx.Commit(101)<br />tx=101<br />tx=101<br />Block {dev-a, 20123}<br />InnoDB<br />Tx log<br />DB<br />MySQL<br />Bin log<br />InnoDB<br />Tx log<br />DB<br />MySQL<br />Bin log<br />ACK for Block {dev-a, 20123}<br />Last tx=100<br />Last tx=100<br />Last tx=100<br />Last tx=100<br /><ul><li>Active-Passive mirroring at block device
  83. 83. After each commit, the Stand-by server is guaranteed to have identical blocks on device
  84. 84. Stand-by in lock-step with Master
  85. 85. Stand-by server does not service load
  86. 86. No data-loss
  87. 87. Upon a Master’s failure
  88. 88. MySQL is started on stand-by, database recovery takes ~minutes
  89. 89. Stand-by is made new Master
  90. 90. Application writes may use VIPs to write to new Master when its ready</li></li></ul><li>#6 Linux DRBD<br />Peer offers no service: wasted resources<br />No flow control<br />Corruption<br />Master failure = service interruption<br />Data loss<br />Master mysqld<br />STAND-BY<br />(No mysqld running)<br />Heartbeat<br />Tx.Commit(101)<br />tx=101<br />tx=101<br />Block {dev-a, 20123}<br />InnoDB<br />Tx log<br />DB<br />MySQL<br />Bin log<br />InnoDB<br />Tx log<br />DB<br />MySQL<br />Bin log<br />ACK for Block {dev-a, 20123}<br />Last tx=100<br />Last tx=100<br />Last tx=100<br />Last tx=100<br /><ul><li>Active-Passive mirroring at block device
  91. 91. After each commit, the Stand-by server is guaranteed to have identical blocks on device
  92. 92. Stand-by in lock-step with Master
  93. 93. Stand-by server does not service load
  94. 94. No data-loss
  95. 95. Upon a Master’s failure
  96. 96. MySQL is started on stand-by, database recovery takes ~minutes
  97. 97. Stand-by is made new Master
  98. 98. Application writes may use VIPs to write to new Master when its ready</li></li></ul><li>Replication Options for MySQL/InnoDB<br />Quantitative Comparison: Performance<br /><ul><li>Performance comparison</li></ul>MySQL asynchronous (v5.5.8)<br />MySQL semi-synchronous (v5.5.8)<br />Schooner Active Cluster (SAC) synchronous replication (v3.1)<br /><ul><li>Benchmark: DBT2 open-source transaction processing benchmark</li></li></ul><li>DBT2 Benchmark<br /><ul><li>Open source benchmark available at http://osdldbt.sourceforge.net
  99. 99. On-line Transaction Processing (OLTP) performance test
  100. 100. Fair-usage implementation of the TPC-C benchmark
  101. 101. Simulates a wholesale parts supplier with a database containing inventory and customer information
  102. 102. Benchmark scale determined by number of warehouses
  103. 103. Results here based on a scale of 1000
  104. 104. Use InnoDB storage engine with 48GB buffer pool and full consistency/durability settings
  105. 105. Schooner has found that optimizing MySQL/InnoDB for DBT2 yields significant benefit for many real customer workloads</li></li></ul><li>Measurement Setup<br />SWITCH<br /><ul><li>Arista 10Gb Ethernet switch</li></ul>SERVERS<br />CLIENT<br /><ul><li>2x 2RU Westmere servers
  106. 106. CPU: 2x Intel Xeon X5670 processors, 6 cores/12 threads per processor, 2.93 GHz
  107. 107. DRAM: 72GB
  108. 108. HDD: 2x300GB 2.5” 10k RPM HDD RAID-1, with LSI M5015 controllerwith NVRAM writeback cache for logging
  109. 109. Flash: 8x200GB OCZ MLC SSDs with LSI 9211controller, md RAID-0
  110. 110. Network: 1x Chelsio 10Gb Ethernet NIC
  111. 111. DBT2 client runs on the master server</li></li></ul><li>Results: DBT2 Performance<br />DBT2 Throughput (kTpm)<br />DBT2 Response Time (ms)<br />
  112. 112. Results: Master CPU Utilization<br />Master CPU Utilization (%)<br />
  113. 113. Results: Master CPU Utilization<br />Master CPU Utilization (%)<br />Higher throughput means<br />higher CPU utilization,<br />better system balance<br />
  114. 114. Results: Transient Behavior on Master<br />SAC<br />5.5 Async<br />5.5 Semi-sync<br />Inconsistent performance with 5.5 Async<br />
  115. 115. Results: Slave Utilization<br />Slave CPU Utilization (%)<br />
  116. 116. Results: Transient Behavior on Slave<br />SAC<br />5.5 Async<br />5.5 Semi-sync<br />Fluctuations with 5.5 Async<br />
  117. 117. Results: Storage IO Utilization<br />Master Storage IOPS<br />Utilization (%)<br />
  118. 118. Results: Storage IO Utilization<br />Master Storage IOPS<br />Utilization (%)<br />Higher throughput means<br />higher storage bandwidth,<br />better system balance<br />
  119. 119. Results: Network Utilization<br />Replication<br />Network Utilization (%)<br />Replication<br />Network Bandwidth (MB/s)<br />
  120. 120. Results: Network Utilization<br />Replication<br />Network Utilization (%)<br />Replication<br />Network Bandwidth (MB/s)<br />Higher throughput means<br />higher replication bandwidth,<br />better system balance<br />
  121. 121. Results: Summary<br /><ul><li>Sustainable replication performance of 5.5.8 Async and Semi-sync is severely limited</li></ul>Results in premature and unnecessary sharding<br /><ul><li>Deeply integrated synchronous replication in SAC yield 4-5X boost in sustainable replication performance</li></ul>High performance and fully synchronous replication are not mutually exclusive<br />
  122. 122. How the Replication Options Compare<br />*Using external tools and scripts<br />
  123. 123. Try it out!Schooner Active Cluster is available for a free trial.http://www.schoonerinfotech.com/free_trials<br />Darpan Dinker<br />darpan@schoonerinfotech.com<br />Twitter: @darpandinker<br />Schooner Information Technology, Inc.http://www.schoonerinfotech.com/<br />

×