Performance comparisons and trade-offs for various
               MySQL replication schemes


               Darpan Dinker                          Brian O’Krafka,
               VP Engineering                         Chief Architect
               Schooner Information Technology, Inc.
               http://www.schoonerinfotech.com/




Oracle, MySQL, Java are registered trademarks of Oracle and/or its affiliates.
Other names may be trademarks of their respective owners.
MySQL Replication


     • Replication for databases is critical
          – High Availability: avoid SPOF, fail-over for service continuity, DR
          – Scale-Out: scale reads on replicas
          – Misc. administrative tasks: upgrades, schema changes, backup, PITR ...

     • MySQL/InnoDB with base replication has serious gaps
          – Failover is not straight-forward
          – Possibility of data loss and data inconsistency
          – Possibility of stale data read on Slaves
          – Writes on Master need to be de-rated to Slave’s single-thread applier
            performance (forcing unnecessary sharding)
          – Applications and deployments need to work around these issues or live with the
            consequences

     • Several alternatives exist with different design considerations


© 2011 Schooner Information Technology, p2
Technology Covered: Replication Options for MySQL/InnoDB


                                             Qualitative Comparison

          MySQL-Specific Replication                        External Replication
          1. MySQL async                                    4. GoldenGate
               – Available in MySQL 5.1, 5.5                5. Tungsten
          2. MySQL semi-sync                                6. DRBD
               – Available in MySQL 5.5                      – Available on Linux
          3. Schooner sync
               – Available in Schooner Active
                 Cluster




© 2011 Schooner Information Technology, p3
#1 MySQL Asynchronous Replication
                                                                                           Read version based on tx=50


                                 Master mysqld                                                           Slave mysqld

                                    Tx.Commit(101)                                                       Repl.apply(51)



                                   tx=101                tx=101                                 tx=51             tx=51

                                             InnoDB MySQL         Log events pulled by Slave     Relay     InnoDB
                           DB                                                                                             DB
                                              Tx log Bin log                                      log       Tx log
                                      Last tx=100 Last tx=100                                    Last tx=70 Last tx=50

          • Loosely coupled master/slave                                          • Read on slave can give old data
            relationship                                                          • No checksums in binary or relay log
                     • Master does not wait for Slave                               stored on disk, data corruption possible
                     • Slave determines how much to read                          • Upon a Master’s failure
                       and from which point in the binary log
                                                                                        • Slave may not have latest committed
                     • Slave can be arbitrarily behind master in                          data resulting in data loss
                       reading and applying changes
                                                                                        • Fail-over to a slave is stalled until all
                                                                                          transactions in relay log have been
                                                                                          committed – not instantaneous

© 2011 Schooner Information Technology, p4
#2 MySQL Semi-synchronous Replication

                                                                                             Read version based on tx=50


                                 Master mysqld                                                              Slave mysqld

                                    Tx.Commit(101)                                                          Repl.apply(51)



                                   tx=101                tx=101                                    tx=51            tx=51
                                                                  Log for tx=100 pulled by Slave
                                             InnoDB MySQL                                           Relay     InnoDB
                           DB                                         Slave ACK for tx=100                                   DB
                                              Tx log Bin log                                         log       Tx log
                                      Last tx=100 Last tx=100                                      Last tx=100 Last tx=50

          • Semi-coupled master/slave relationship                                  • Read on slave can give old data
                     • On commit, Master waits for an ACK                           • No checksums in binary or relay log
                       from Slave                                                     stored on disk, data corruption possible
                     • Slave logs the transaction event in relay
                                                                                    • Upon a Master’s failure
                       log and ACKs (may not apply yet)
                                                                                          • Fail-over to a slave is stalled until all
                     • Slave can be arbitrarily behind master in
                                                                                            transactions in relay log have been
                       applying changes
                                                                                            committed – not instantaneous



© 2011 Schooner Information Technology, p5
#3 Schooner MySQL Active Cluster (SAC):
     An integrated HA and replication solution for MySQL/InnoDB
                                                                                           Read version based on tx=100


                                 Master mysqld                 Log for tx=100 pushed to Slave           Slave mysqld
                                                                    Slave ACK for tx=100            Repl.apply(100)
                                    Tx.Commit(101)


                                                                                                               tx=100
                                   tx=101

                                             InnoDB MySQL                                                  InnoDB
                           DB                                                                                            DB
                                              Tx log Bin log                                                Tx log
                                      Last tx=100 Last tx=100                                              Last tx=100

          • Tightly-coupled master/slave                                          • Read on slave gives latest committed
            relationship                                                            data
                     • After commit, all Slaves guaranteed to                     • Checksums in log stored on disk
                       receive and commit the change
                                                                                  • Upon a Master’s failure
                     • Slave in lock-step with Master
                                                                                        • Fail-over to a slave is fully integrated
                                                                                          and automatic
                                                                                        • Application writes continue on new
                                                                                          master instantaneously


© 2011 Schooner Information Technology, p6
#4 Oracle GoldenGate
                                                                                                        Read version based on tx=50


                                 Master mysqld                                                                             Slave mysqld

                                    Tx.Commit(101)                                                                          Repl.apply(51)



                                   tx=101          tx=101                                                                               tx=51
                                                                            Log for tx=71 sent to Slave
                                             InnoDB MySQL Source                                                  Dest.        InnoDB
                           DB                                                                                                                   DB
                                              Tx log Bin log Trail                                                Trail         Tx log
                                      Last tx=100 Last tx=100                                                    Last tx=70 Last tx=50

          • Loosely coupled master/slave                                                   • Read on slave can give old data
            relationship                                                                   • Heterogeneous Database support
                     • Master does not wait for Slave                                          • Oracle, Microsoft SQL Server, IBM
                     • Changes applied on slave similar to                                       DB2, MySQL
                       MySQL
                     • Slave can be arbitrarily behind master in
                       reading and applying changes


Oracle and MySQL are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

© 2011 Schooner Information Technology, p7
#5 Tungsten Replicator
                                                                                               Read version based on tx=50


                                 Master mysqld                                                                        Slave mysqld

                                    Tx.Commit(101)                                                                     Repl.apply(51)



                                   tx=101          tx=101                                                                       tx=51
                                                                   Log for tx=71 sent to Slave
                                                                Tx                                          Tx
                                             InnoDB MySQL                                                        InnoDB
                           DB                                History                                     History                        DB
                                              Tx log Bin log                                                      Tx log
                                                               Log                                         Log
                                      Last tx=100 Last tx=100                                             Last tx=70 Last tx=50

          • Loosely coupled master/slave                                        • Read on slave can give old data*
            relationship                                                        • Heterogeneous Database support
                     • Master does not wait for Slave                                • MySQL, PostgreSQL
                     • Changes applied on slave similar to
                       MySQL
                                                                                • Global Tx ID, useful to point to new
                                                                                  master upon failure
                     • Slave can be arbitrarily behind master in
                       reading and applying changes                             • SaaS & ISP feature: Parallel replication
                                                                                  for multi-tenant MySQL databases

Tungsten is a registered trademark of Continuent                                http://code.google.com/p/tungsten-replicator/

© 2011 Schooner Information Technology, p8
#6 Linux DRBD



                                 Master mysqld
                                                                         Heartbeat                   STAND-BY
                                    Tx.Commit(101)
                                                                                                (No mysqld running)



                                   tx=101          tx=101
                                                                  Block {dev-a, 20123}
                                             InnoDB MySQL                                               InnoDB MySQL
                           DB                                   ACK for Block {dev-a, 20123}
                                                                                                DB
                                              Tx log Bin log                                             Tx log Bin log
                                      Last tx=100 Last tx=100                                         Last tx=100 Last tx=100

          • Active-Passive mirroring at block device                           • Stand-by server does not service load
                     • After each commit, the Stand-by server                  • No data-loss
                       is guaranteed to have identical blocks
                                                                               • Upon a Master’s failure
                       on device
                                                                                     • MySQL is started on stand-by, database
                     • Stand-by in lock-step with Master
                                                                                       recovery takes ~minutes
                                                                                     • Stand-by is made new Master
                                                                                     • Application writes may use VIPs to write
                                                                                       to new Master when its ready


© 2011 Schooner Information Technology, p9
Replication Options for MySQL/InnoDB


                           Quantitative Comparison: Performance
     • Performance comparison of:
       1. MySQL asynchronous (v5.5.8)
       2. MySQL semi-synchronous (v5.5.8)
       3. Schooner Active Cluster (SAC) synchronous replication (v3.1)

     • Benchmark: DBT2 open-source transaction processing benchmark




© 2011 Schooner Information Technology, p10
DBT2 Benchmark

   • Open source benchmark available at http://osdldbt.sourceforge.net
        • On-line Transaction Processing (OLTP) performance test
        • Fair-usage implementation of the TPC-C benchmark
        • Simulates a wholesale parts supplier with a database containing inventory and
          customer information

   • Benchmark scale determined by number of warehouses
        • Results here based on a scale of 1000

   • Use InnoDB storage engine with 48GB buffer pool and full
     consistency/durability settings

   • Schooner has found that optimizing MySQL/InnoDB for DBT2 yields
     significant benefit for many real customer workloads


© 2011 Schooner Information Technology, p11
Measurement Setup
                                                        SWITCH         •    Arista 10Gb Ethernet
                                                                            switch

                               SERVERS




                                                                           CLIENT
       • IBM 3650 2RU Westmere Server                        •   DBT2 client runs on master server
       • CPU: 2x Intel Xeon X5670 processors, 6 cores/12
         threads per processor, 2.93 GHz
       • DRAM: 72GB
       • HDD: 2x300GB 10k RPM HDD RAID-1 for logging, with
         LSI M5015 controller with NVRAM writeback cache
       • Flash: 8x200GB OCZ MLC SSD’s, with 1 LSI 9211
         controller, md RAID-0
       • Network: 1x Chelseo 10Gb Ethernet NIC




© 2011 Schooner Information Technology, p12
Results: DBT2 Performance
     DBT2 Throughput (kTpm)                            DBT2 Response Time (ms)

        120                                            6
        100                                            5
          80                                           4
          60                                           3
          40                                           2
          20                                           1
             0                                         0
                    2-node 5.5 2-node 5.5     2-node       2-node 5.5 2-node 5.5   2-node
                      async       semi         SAC           async       semi       SAC




© 2011 Schooner Information Technology, p13
Results: DBT2 Performance
     DBT2 Throughput (kTpm)                            DBT2 Response Time (ms)

        120                                            6
        100                                            5
          80                                           4
          60                                           3
          40                                           2
          20                                           1
             0                                         0
                    2-node 5.5 2-node 5.5     2-node        2-node 5.5 2-node 5.5   2-node
                      async       semi         SAC            async       semi       SAC

         5.5 Async and Semi-sync
      limited by serial slave applier                           SAC optimizations
                                                           yields lower response times
              SAC optimizations
       yield 4-5x boost in throughput
© 2011 Schooner Information Technology, p14
Results: Master CPU Utilization
                                              Master CPU Utilization (%)
                                              1600
                                              1400
                                              1200
                                              1000
                                              800
                                              600
                                              400
                                              200
                                                0
                                                      2-node 2-node 2-node
                                                     5.5 async 5.5 semi SAC




© 2011 Schooner Information Technology, p15
Results: Master CPU Utilization
                                              Master CPU Utilization (%)
                                              1600
                                              1400
                                              1200
                                              1000
                                              800
                                              600
                                              400
                                              200
                                                0
                                                      2-node 2-node 2-node
                                                     5.5 async 5.5 semi SAC


                                                 Higher throughput means
                                                  higher CPU utilization,
                                                   better system balance
© 2011 Schooner Information Technology, p16
Results: Transient Behavior on Master
                        5.5 Async                         5.5 Semi-sync                 SAC




                                              Inconsistent performance with 5.5 Async

© 2011 Schooner Information Technology, p17
Results: Slave Utilization
                                              Slave CPU Utilization (%)
                                              600
                                              500
                                              400
                                              300
                                              200
                                              100
                                                0
                                                    2-node 5.5   2-node 5.5   2-node SAC
                                                      async         semi




© 2011 Schooner Information Technology, p18
Results: Slave Utilization
                                              Slave CPU Utilization (%)
                                              600
                                              500
                                              400
                                              300
                                              200
                                              100
                                                0
                                                    2-node 5.5   2-node 5.5   2-node SAC
                                                      async         semi

                                                      Single applier in 5.5
                                                      saturates one core
                                          Even at 5X throughput on SAC Slave,
                                        enough headroom to service Read traffic
© 2011 Schooner Information Technology, p19
Results: Transient Behavior on Slave
               5.5 Async                         5.5 Semi-sync              SAC




                                              Fluctuations with 5.5 Async

© 2011 Schooner Information Technology, p20
Results: Storage IO Utilization
                                                   Master Storage IOPS
                                                     Utilization (%)
                                              16
                                              14
                                              12
                                              10
                                               8
                                               6
                                               4
                                               2
                                               0
                                                    2-node 5.5 2-node 5.5 2-node SAC
                                                      async       semi




© 2011 Schooner Information Technology, p21
Results: Storage IO Utilization
                                                   Master Storage IOPS
                                                     Utilization (%)
                                              16
                                              14
                                              12
                                              10
                                               8
                                               6
                                               4
                                               2
                                               0
                                                    2-node 5.5 2-node 5.5 2-node SAC
                                                      async       semi


                                                   Higher throughput means
                                                   higher storage bandwidth,
                                                     better system balance
© 2011 Schooner Information Technology, p22
Results: Network Utilization
               Replication                                    Replication
          Network Utilization (%)                      Network Bandwidth (MB/s)
         40                                            50
         35                                            45
                                                       40
         30                                            35
         25                                            30
         20                                            25
         15                                            20
                                                       15
         10                                            10
          5                                             5
          0                                             0
                    2-node 5.5 2-node 5.5     2-node        2-node 5.5 2-node 5.5   2-node
                      async       semi         SAC            async       semi       SAC




© 2011 Schooner Information Technology, p23
Results: Network Utilization
               Replication                                           Replication
          Network Utilization (%)                             Network Bandwidth (MB/s)
         40                                                    50
         35                                                    45
                                                               40
         30                                                    35
         25                                                    30
         20                                                    25
         15                                                    20
                                                               15
         10                                                    10
          5                                                     5
          0                                                     0
                    2-node 5.5 2-node 5.5        2-node             2-node 5.5 2-node 5.5   2-node
                      async       semi            SAC                 async       semi       SAC


                                                Higher throughput means
                                              higher replication bandwidth,
                                                  better system balance
© 2011 Schooner Information Technology, p24
Results: Summary

       • Sustainable replication performance of 5.5.8 Async and
         Semi-sync is severely limited

       • Deeply integrated synchronous replication in SAC yield 4-
         5X boost in sustainable replication performance

       • High performance and fully synchronous replication are not
         mutually exclusive




© 2011 Schooner Information Technology, p25
Conclusion
 • Asynchronous replication is de-facto standard for 10+ years, issues with
   consistency, data-loss and service continuity

 • Recently released semi-synchronous replication mitigates the need to DRBD to
   avoid data-loss on failover, however consistency and service continuity issues
   exist

 • Schooner synchronous provides consistent reads on Slaves instantaneous
   failover with zero data-loss

 • For database interoperability, GoldenGate and Tungsten are a good choice

 • Performance comparisons between MySQL 5.5.8 async/semi-sync and
   Schooner Active Cluster using DBT2 show:
       • 5.5.8 async/semi-sync is severely limited by serial slave applier
       • Parallel slave appliers plus Schooner core optimizations in SAC yield 4-5x
         increase in throughput at lower response times, while maintaining read
         consistency across the cluster.


© 2011 Schooner Information Technology, p26
Backup Slides
Reference:
     MySQL Replication Solutions Compared

                                                     Schooner               MySQL 5.5       MySQL 5.1/5.5                     Linux Heartbeat +
                                                                                                                 Tungsten                             GoldenGate
                                                 Active Cluster 3.1         (semi-sync)        (async)                              DRBD
   Eliminates Slave Lag                                   Y                      N                   N              N                Y                    N
   Slave Consistent w/ Master                             Y                      N                   N              N                N                    N
   Write scalability                                    High                 Moderate            Moderate          Low              Low               Moderate
   Read scalability                                     High                   High                High           High              N/A                  High
   Data loss Probability on Failover                    Zero                   Zero             High            High             Zero                  High
                                                                                          Slave-lag-catch- Slave-lag-catch- InnoDB recovery       Slave-lag-catch-
   Planned Failover time (stop mysql)                  <2sec           Slave-lag-catch-up
                                                                                                 up               up             time                    up
                                                                       #                    #                #                #                   #
                                                                       ~8sec + slave-lag-   ~8sec + slave-   ~8sec + slave-    ~8sec + InnoDB      ~8sec + slave-
   Automated Unplanned Failover time                   <8sec
                                                                          catch-up          lag-catch-up     lag-catch-up      recovery time       lag-catch-up
   @
       Manual Unplanned Failover time                   N/A                Minutes-Hours    Minutes-Hours Minutes-Hours        Minutes-Hours      Minutes-Hours

   Checksum in replication logs                           Y                      N                   N              Y                N                    ?
   Replication Solution                                Built-in               Built-in            Built-in       External         External             External
   Replicate to Non-MySQL Databases                      N^                     N^                  N^              Y                N^                   Y
   Point-in Time Recovery (PITR)                          Y                      Y                  Y               Y                Y                    Y
   Automated DB Instantiation                             Y                      N                  N               Y                N                    Y
   Incremental data movement                              Y                      Y                   Y              Y                Y                    Y
   Rolling upgrade                                        Y                     N                   N               Y                N                    Y
   Automated Rolling migration                            Y                     N                   N               ?                N                    ?


     #   Using Multi-master for MySQL (MMM), Flipper, etc. assuming 8sec for heartbeat retries
     @   Manual response to an alert and running trouble-shooting and replication commands/scripts
     ^   Possible to leverage complementing solution like GoldenGate

© 2011 Schooner Information Technology, p28

Performance Comparisons And Trade Offs For Various My Sql Replication Schemes Presentation

  • 1.
    Performance comparisons andtrade-offs for various MySQL replication schemes Darpan Dinker Brian O’Krafka, VP Engineering Chief Architect Schooner Information Technology, Inc. http://www.schoonerinfotech.com/ Oracle, MySQL, Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
  • 2.
    MySQL Replication • Replication for databases is critical – High Availability: avoid SPOF, fail-over for service continuity, DR – Scale-Out: scale reads on replicas – Misc. administrative tasks: upgrades, schema changes, backup, PITR ... • MySQL/InnoDB with base replication has serious gaps – Failover is not straight-forward – Possibility of data loss and data inconsistency – Possibility of stale data read on Slaves – Writes on Master need to be de-rated to Slave’s single-thread applier performance (forcing unnecessary sharding) – Applications and deployments need to work around these issues or live with the consequences • Several alternatives exist with different design considerations © 2011 Schooner Information Technology, p2
  • 3.
    Technology Covered: ReplicationOptions for MySQL/InnoDB Qualitative Comparison MySQL-Specific Replication External Replication 1. MySQL async 4. GoldenGate – Available in MySQL 5.1, 5.5 5. Tungsten 2. MySQL semi-sync 6. DRBD – Available in MySQL 5.5 – Available on Linux 3. Schooner sync – Available in Schooner Active Cluster © 2011 Schooner Information Technology, p3
  • 4.
    #1 MySQL AsynchronousReplication Read version based on tx=50 Master mysqld Slave mysqld Tx.Commit(101) Repl.apply(51) tx=101 tx=101 tx=51 tx=51 InnoDB MySQL Log events pulled by Slave Relay InnoDB DB DB Tx log Bin log log Tx log Last tx=100 Last tx=100 Last tx=70 Last tx=50 • Loosely coupled master/slave • Read on slave can give old data relationship • No checksums in binary or relay log • Master does not wait for Slave stored on disk, data corruption possible • Slave determines how much to read • Upon a Master’s failure and from which point in the binary log • Slave may not have latest committed • Slave can be arbitrarily behind master in data resulting in data loss reading and applying changes • Fail-over to a slave is stalled until all transactions in relay log have been committed – not instantaneous © 2011 Schooner Information Technology, p4
  • 5.
    #2 MySQL Semi-synchronousReplication Read version based on tx=50 Master mysqld Slave mysqld Tx.Commit(101) Repl.apply(51) tx=101 tx=101 tx=51 tx=51 Log for tx=100 pulled by Slave InnoDB MySQL Relay InnoDB DB Slave ACK for tx=100 DB Tx log Bin log log Tx log Last tx=100 Last tx=100 Last tx=100 Last tx=50 • Semi-coupled master/slave relationship • Read on slave can give old data • On commit, Master waits for an ACK • No checksums in binary or relay log from Slave stored on disk, data corruption possible • Slave logs the transaction event in relay • Upon a Master’s failure log and ACKs (may not apply yet) • Fail-over to a slave is stalled until all • Slave can be arbitrarily behind master in transactions in relay log have been applying changes committed – not instantaneous © 2011 Schooner Information Technology, p5
  • 6.
    #3 Schooner MySQLActive Cluster (SAC): An integrated HA and replication solution for MySQL/InnoDB Read version based on tx=100 Master mysqld Log for tx=100 pushed to Slave Slave mysqld Slave ACK for tx=100 Repl.apply(100) Tx.Commit(101) tx=100 tx=101 InnoDB MySQL InnoDB DB DB Tx log Bin log Tx log Last tx=100 Last tx=100 Last tx=100 • Tightly-coupled master/slave • Read on slave gives latest committed relationship data • After commit, all Slaves guaranteed to • Checksums in log stored on disk receive and commit the change • Upon a Master’s failure • Slave in lock-step with Master • Fail-over to a slave is fully integrated and automatic • Application writes continue on new master instantaneously © 2011 Schooner Information Technology, p6
  • 7.
    #4 Oracle GoldenGate Read version based on tx=50 Master mysqld Slave mysqld Tx.Commit(101) Repl.apply(51) tx=101 tx=101 tx=51 Log for tx=71 sent to Slave InnoDB MySQL Source Dest. InnoDB DB DB Tx log Bin log Trail Trail Tx log Last tx=100 Last tx=100 Last tx=70 Last tx=50 • Loosely coupled master/slave • Read on slave can give old data relationship • Heterogeneous Database support • Master does not wait for Slave • Oracle, Microsoft SQL Server, IBM • Changes applied on slave similar to DB2, MySQL MySQL • Slave can be arbitrarily behind master in reading and applying changes Oracle and MySQL are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. © 2011 Schooner Information Technology, p7
  • 8.
    #5 Tungsten Replicator Read version based on tx=50 Master mysqld Slave mysqld Tx.Commit(101) Repl.apply(51) tx=101 tx=101 tx=51 Log for tx=71 sent to Slave Tx Tx InnoDB MySQL InnoDB DB History History DB Tx log Bin log Tx log Log Log Last tx=100 Last tx=100 Last tx=70 Last tx=50 • Loosely coupled master/slave • Read on slave can give old data* relationship • Heterogeneous Database support • Master does not wait for Slave • MySQL, PostgreSQL • Changes applied on slave similar to MySQL • Global Tx ID, useful to point to new master upon failure • Slave can be arbitrarily behind master in reading and applying changes • SaaS & ISP feature: Parallel replication for multi-tenant MySQL databases Tungsten is a registered trademark of Continuent http://code.google.com/p/tungsten-replicator/ © 2011 Schooner Information Technology, p8
  • 9.
    #6 Linux DRBD Master mysqld Heartbeat STAND-BY Tx.Commit(101) (No mysqld running) tx=101 tx=101 Block {dev-a, 20123} InnoDB MySQL InnoDB MySQL DB ACK for Block {dev-a, 20123} DB Tx log Bin log Tx log Bin log Last tx=100 Last tx=100 Last tx=100 Last tx=100 • Active-Passive mirroring at block device • Stand-by server does not service load • After each commit, the Stand-by server • No data-loss is guaranteed to have identical blocks • Upon a Master’s failure on device • MySQL is started on stand-by, database • Stand-by in lock-step with Master recovery takes ~minutes • Stand-by is made new Master • Application writes may use VIPs to write to new Master when its ready © 2011 Schooner Information Technology, p9
  • 10.
    Replication Options forMySQL/InnoDB Quantitative Comparison: Performance • Performance comparison of: 1. MySQL asynchronous (v5.5.8) 2. MySQL semi-synchronous (v5.5.8) 3. Schooner Active Cluster (SAC) synchronous replication (v3.1) • Benchmark: DBT2 open-source transaction processing benchmark © 2011 Schooner Information Technology, p10
  • 11.
    DBT2 Benchmark • Open source benchmark available at http://osdldbt.sourceforge.net • On-line Transaction Processing (OLTP) performance test • Fair-usage implementation of the TPC-C benchmark • Simulates a wholesale parts supplier with a database containing inventory and customer information • Benchmark scale determined by number of warehouses • Results here based on a scale of 1000 • Use InnoDB storage engine with 48GB buffer pool and full consistency/durability settings • Schooner has found that optimizing MySQL/InnoDB for DBT2 yields significant benefit for many real customer workloads © 2011 Schooner Information Technology, p11
  • 12.
    Measurement Setup SWITCH • Arista 10Gb Ethernet switch SERVERS CLIENT • IBM 3650 2RU Westmere Server • DBT2 client runs on master server • CPU: 2x Intel Xeon X5670 processors, 6 cores/12 threads per processor, 2.93 GHz • DRAM: 72GB • HDD: 2x300GB 10k RPM HDD RAID-1 for logging, with LSI M5015 controller with NVRAM writeback cache • Flash: 8x200GB OCZ MLC SSD’s, with 1 LSI 9211 controller, md RAID-0 • Network: 1x Chelseo 10Gb Ethernet NIC © 2011 Schooner Information Technology, p12
  • 13.
    Results: DBT2 Performance DBT2 Throughput (kTpm) DBT2 Response Time (ms) 120 6 100 5 80 4 60 3 40 2 20 1 0 0 2-node 5.5 2-node 5.5 2-node 2-node 5.5 2-node 5.5 2-node async semi SAC async semi SAC © 2011 Schooner Information Technology, p13
  • 14.
    Results: DBT2 Performance DBT2 Throughput (kTpm) DBT2 Response Time (ms) 120 6 100 5 80 4 60 3 40 2 20 1 0 0 2-node 5.5 2-node 5.5 2-node 2-node 5.5 2-node 5.5 2-node async semi SAC async semi SAC 5.5 Async and Semi-sync limited by serial slave applier SAC optimizations yields lower response times SAC optimizations yield 4-5x boost in throughput © 2011 Schooner Information Technology, p14
  • 15.
    Results: Master CPUUtilization Master CPU Utilization (%) 1600 1400 1200 1000 800 600 400 200 0 2-node 2-node 2-node 5.5 async 5.5 semi SAC © 2011 Schooner Information Technology, p15
  • 16.
    Results: Master CPUUtilization Master CPU Utilization (%) 1600 1400 1200 1000 800 600 400 200 0 2-node 2-node 2-node 5.5 async 5.5 semi SAC Higher throughput means higher CPU utilization, better system balance © 2011 Schooner Information Technology, p16
  • 17.
    Results: Transient Behavioron Master 5.5 Async 5.5 Semi-sync SAC Inconsistent performance with 5.5 Async © 2011 Schooner Information Technology, p17
  • 18.
    Results: Slave Utilization Slave CPU Utilization (%) 600 500 400 300 200 100 0 2-node 5.5 2-node 5.5 2-node SAC async semi © 2011 Schooner Information Technology, p18
  • 19.
    Results: Slave Utilization Slave CPU Utilization (%) 600 500 400 300 200 100 0 2-node 5.5 2-node 5.5 2-node SAC async semi Single applier in 5.5 saturates one core Even at 5X throughput on SAC Slave, enough headroom to service Read traffic © 2011 Schooner Information Technology, p19
  • 20.
    Results: Transient Behavioron Slave 5.5 Async 5.5 Semi-sync SAC Fluctuations with 5.5 Async © 2011 Schooner Information Technology, p20
  • 21.
    Results: Storage IOUtilization Master Storage IOPS Utilization (%) 16 14 12 10 8 6 4 2 0 2-node 5.5 2-node 5.5 2-node SAC async semi © 2011 Schooner Information Technology, p21
  • 22.
    Results: Storage IOUtilization Master Storage IOPS Utilization (%) 16 14 12 10 8 6 4 2 0 2-node 5.5 2-node 5.5 2-node SAC async semi Higher throughput means higher storage bandwidth, better system balance © 2011 Schooner Information Technology, p22
  • 23.
    Results: Network Utilization Replication Replication Network Utilization (%) Network Bandwidth (MB/s) 40 50 35 45 40 30 35 25 30 20 25 15 20 15 10 10 5 5 0 0 2-node 5.5 2-node 5.5 2-node 2-node 5.5 2-node 5.5 2-node async semi SAC async semi SAC © 2011 Schooner Information Technology, p23
  • 24.
    Results: Network Utilization Replication Replication Network Utilization (%) Network Bandwidth (MB/s) 40 50 35 45 40 30 35 25 30 20 25 15 20 15 10 10 5 5 0 0 2-node 5.5 2-node 5.5 2-node 2-node 5.5 2-node 5.5 2-node async semi SAC async semi SAC Higher throughput means higher replication bandwidth, better system balance © 2011 Schooner Information Technology, p24
  • 25.
    Results: Summary • Sustainable replication performance of 5.5.8 Async and Semi-sync is severely limited • Deeply integrated synchronous replication in SAC yield 4- 5X boost in sustainable replication performance • High performance and fully synchronous replication are not mutually exclusive © 2011 Schooner Information Technology, p25
  • 26.
    Conclusion • Asynchronousreplication is de-facto standard for 10+ years, issues with consistency, data-loss and service continuity • Recently released semi-synchronous replication mitigates the need to DRBD to avoid data-loss on failover, however consistency and service continuity issues exist • Schooner synchronous provides consistent reads on Slaves instantaneous failover with zero data-loss • For database interoperability, GoldenGate and Tungsten are a good choice • Performance comparisons between MySQL 5.5.8 async/semi-sync and Schooner Active Cluster using DBT2 show: • 5.5.8 async/semi-sync is severely limited by serial slave applier • Parallel slave appliers plus Schooner core optimizations in SAC yield 4-5x increase in throughput at lower response times, while maintaining read consistency across the cluster. © 2011 Schooner Information Technology, p26
  • 27.
  • 28.
    Reference: MySQL Replication Solutions Compared Schooner MySQL 5.5 MySQL 5.1/5.5 Linux Heartbeat + Tungsten GoldenGate Active Cluster 3.1 (semi-sync) (async) DRBD Eliminates Slave Lag Y N N N Y N Slave Consistent w/ Master Y N N N N N Write scalability High Moderate Moderate Low Low Moderate Read scalability High High High High N/A High Data loss Probability on Failover Zero Zero High High Zero High Slave-lag-catch- Slave-lag-catch- InnoDB recovery Slave-lag-catch- Planned Failover time (stop mysql) <2sec Slave-lag-catch-up up up time up # # # # # ~8sec + slave-lag- ~8sec + slave- ~8sec + slave- ~8sec + InnoDB ~8sec + slave- Automated Unplanned Failover time <8sec catch-up lag-catch-up lag-catch-up recovery time lag-catch-up @ Manual Unplanned Failover time N/A Minutes-Hours Minutes-Hours Minutes-Hours Minutes-Hours Minutes-Hours Checksum in replication logs Y N N Y N ? Replication Solution Built-in Built-in Built-in External External External Replicate to Non-MySQL Databases N^ N^ N^ Y N^ Y Point-in Time Recovery (PITR) Y Y Y Y Y Y Automated DB Instantiation Y N N Y N Y Incremental data movement Y Y Y Y Y Y Rolling upgrade Y N N Y N Y Automated Rolling migration Y N N ? N ? # Using Multi-master for MySQL (MMM), Flipper, etc. assuming 8sec for heartbeat retries @ Manual response to an alert and running trouble-shooting and replication commands/scripts ^ Possible to leverage complementing solution like GoldenGate © 2011 Schooner Information Technology, p28