1 TB MySQL Database Migration and HA Infrastructure
Alex Gorbachev
Miracle OpenWorld 2010
16 April 2010
Alex Gorbachev

    • CTO, The Pythian Group
    • Blogger

    • OakTable Network member
    • Oracle ACE

    • BattleAgainstAnyGuess.com

    • Vice-president, Oracle RAC SIG




2                             © 2009/2010 Pythian
Why Companies Trust Pythian
    • Recognized Leader:
    •   Global industry-leader in remote database administration services and consulting for Oracle,
        Oracle Applications, MySQL and SQL Server
    •   Work with over 150 multinational companies such as Forbes.com, Fox Interactive Media, and
        MDS Inc. to help manage their complex IT deployments

    • Expertise:
    •   One of the world’s largest concentrations of dedicated, full-time DBA expertise.

    • Global Reach & Scalability:
    •   24/7/365 global remote support for DBA and consulting, systems administration, special
        projects or emergency response




Agenda

    • Migration
     •   Schema, data, application code
    • HA infrastructure
     •   Options available
     •   Implemented - Heartbeat cold failover cluster
    • Acceptance testing
     •   How we simulated failures
    • DR setup & backups
     •   Replication between two data-centers
     •   1 TB on MySQL - that’s not a simple e-commerce web-site


Project profile

    • Document management solution
     •   Archival & retrieval
    • Web front-end
    • Critical availability requirements

    • 1 TB now, growth to 2-3 TB in a couple of years




Migration from Oracle RDB

    • MySQL Migration Toolkit
     •   RDB has a package to connect via Oracle TNS
     •   Java

    • Create and review schema

    • Pump the data (1 TB)




Schema conversion

    •   Integer sizes mismatch - smallint, mediumint, decimal(10,2), etc.
    •   DATE VMS => DATE or DATETIME
    •   MEDIUMBLOB + LONGBLOB
    •   no DEFERRABLE constraints in MySQL
    •   character set / VARCHAR behavior (trailing space)
    •   Sequences => AUTO_INCREMENT
    •   InnoDB storage => file per table
        •   want Oracle tablespaces there!
        •   page size 16 KB
    •   No conversion of stored procedures and modules
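
The integer size mismatch can be checked mechanically during schema review. The sketch below is only an illustration, not part of the migration toolkit: given the smallest and largest value a source column holds, it picks the narrowest signed MySQL integer type. The function name and the idea of probing min/max values are assumptions made for this example.

```python
# Bit widths of MySQL's signed integer types, narrowest first.
MYSQL_INT_TYPES = [
    ("TINYINT", 8),
    ("SMALLINT", 16),
    ("MEDIUMINT", 24),
    ("INT", 32),
    ("BIGINT", 64),
]


def smallest_signed_int_type(min_value, max_value):
    """Return the narrowest signed MySQL integer type that holds the range.

    Hypothetical helper for schema review; min_value and max_value would
    come from a MIN()/MAX() query against the source column.
    """
    for name, bits in MYSQL_INT_TYPES:
        low = -(2 ** (bits - 1))
        high = 2 ** (bits - 1) - 1
        if low <= min_value and max_value <= high:
            return name
    raise ValueError("range does not fit in BIGINT; consider DECIMAL")


print(smallest_signed_int_type(0, 32767))
print(smallest_signed_int_type(0, 32768))
```

A column whose values reach 32768 no longer fits SMALLINT, which is the kind of mismatch that is easy to miss when types are mapped by name alone.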


1 TB data move

    • ARCHIVE part
     •   Separate and load in advance - 800 GB
    • LIVE part
     •   200 GB - 30 hours
    • MySQL Migration Toolkit
     •   agent mode to speed up data transfer
    • Speeding up
     •   Disable binlogs
     •   Build indexes and constraints later
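
The LIVE figures above imply a fairly low sustained transfer rate, which helps explain why the ARCHIVE part was loaded in advance. The arithmetic below is a rough sketch: it uses decimal units (1 GB = 1000 MB) and assumes, purely for illustration, that the ARCHIVE part would move at the same rate as the LIVE part. The slides do not state the actual ARCHIVE load time.

```python
def transfer_rate_mb_per_s(gigabytes, hours):
    """Average rate in MB/s for a transfer of `gigabytes` taking `hours`."""
    return gigabytes * 1000 / (hours * 3600)


def hours_at_rate(gigabytes, rate_mb_per_s):
    """Hours needed to move `gigabytes` at a constant `rate_mb_per_s`."""
    return gigabytes * 1000 / rate_mb_per_s / 3600


# LIVE part: 200 GB in 30 hours (figures from the slide).
live_rate = transfer_rate_mb_per_s(200, 30)

# ARCHIVE part: 800 GB, ASSUMING the same rate as LIVE (not from the slides).
archive_hours = hours_at_rate(800, live_rate)

print(f"LIVE rate: {live_rate:.2f} MB/s")
print(f"ARCHIVE at the same rate: {archive_hours:.0f} hours")
```

At roughly 1.85 MB/s, 800 GB would take about 120 hours, so moving it inside the cutover window would not have been practical.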




Hardware

    • Primary data-center
     •   2 x IBM x3850 servers
         •   Each in different chassis
         •   4 x quad-core Intel Xeon E7330, 2.4 GHz
         •   16 GB RAM
     •   Storage IBM DS4700 Express Model 72
         •   Fibre Channel
         •   RAID5 with 6 x 300 GB disks + spare = 1.5 TB
    • DR data-center
     •   1 x IBM x3850 server
     •   Same storage
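
The 1.5 TB figure follows from RAID5 keeping one disk's worth of parity: usable capacity is (disks in the array - 1) x disk size, and the hot spare does not count. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def raid5_usable_gb(disks_in_array, disk_size_gb):
    """Usable capacity of a RAID5 array; one disk's worth goes to parity.

    Hot spares are not part of the array and must not be counted here.
    """
    if disks_in_array < 3:
        raise ValueError("RAID5 needs at least three disks")
    return (disks_in_array - 1) * disk_size_gb


usable = raid5_usable_gb(6, 300)
print(f"{usable} GB usable")
```

Against the projected growth to 2-3 TB, this leaves little headroom on the primary array.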

Primary DC HA: Options

     • MySQL replication
      •   - Can lose some data (seconds), not reliable
      •   - Double storage requirements
      •   + potential to scale out
     • DRBD replication
      •   - Performance impact in SYNC mode
      •   - Double storage requirements
      •   - no scale out (primary + mirror only)
      •   + reliable
     • Third-party replication

Primary DC HA: cold failover cluster

     • Heartbeat controls resources
     • Shared storage
      •   LUNs accessible from two servers
      •   ext3 - mounted on active node *only*
      •   no LVM - LVM is not clustered
     • Virtual IP / VIP
      •   Up only on one node
     • MySQL 5.0.67 instance is running on active node
      •   read-write data - must be InnoDB
      •   read-only data - can be MyISAM


Heartbeat - simple clustering solution

     • Linux-HA.org




Heartbeat and network infrastructure

     [Diagram: Chassis 1 and Chassis 2, each holding one database server. Labels recoverable from the figure: a switch per chassis, a management switch, data ports (single NIC used), RSA ports, management ports, a single-NIC crossover link, and a crossover serial cable marked as HA backup.]




Heartbeat and network infrastructure

     • Private heartbeat network
      •   Cross-over ethernet patch-cord
      •   ++ Simple $100 switch - works great
      •   --- Expensive switch and VLAN - no good
     • Serial link heartbeat
      •   Redundant to ethernet
     • Access to RSA2 cards
      •   Remote reset and remote power off / lights-out
      •   Dedicated management network and management switches




Shared storage setup

     • Linux multipathing MPIO
      •   2 HBAs per server
      •   2 controllers on SAN box
     • Added the 2nd SAN box (cheap SATA disks)
     • errors=panic in mount options
      •   default is to make the filesystem read-only
     • SAN LUNs visible from both nodes
     • NEVER MOUNT FILESYSTEM ON BOTH NODES!!!
      •   ext3 is not clustered




Heartbeat and monitoring

     • Heartbeat 1.0
      •   Starts and stops resources in sequence
      •   Failure detected during start
      •   No resource monitoring - that requires Heartbeat 2.0
          •   Not sure if 2.0 is stable enough
     • mon 1.2.0 Service Monitoring Daemon
      •   mon.wiki.kernel.org
      •   Stable
      •   Has a number of “monitors” out-of-the-box
      •   Can write custom monitors


Heartbeat resources

          Start sequence (stop is reverse)
     1.   Virtual / floating IP
     2.   SAN mount points
     3.   MySQL daemon / instance
     4.   mon
     5.   mon-shadow


          mon monitors all resources and initiates a failover
          mon-shadow monitors and restarts mon only
          mon monitors and restarts mon-shadow
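
The ordering rule above (start in sequence, stop in reverse) can be sketched in a few lines. This is not Heartbeat's code: the `Resource` class is a hypothetical stand-in for a resource script, and undoing the already-started resources when one start fails is an assumption of this sketch about what "failure detected during start" should lead to.

```python
class Resource:
    """Hypothetical stand-in for a cluster resource script."""

    def __init__(self, name, start_ok=True):
        self.name = name
        self.start_ok = start_ok
        self.running = False

    def start(self):
        self.running = self.start_ok
        return self.start_ok

    def stop(self):
        self.running = False


def start_all(resources, log):
    """Start resources in order; on a failed start, stop the started ones
    in reverse order and report failure (assumed behaviour, see lead-in)."""
    started = []
    for res in resources:
        log.append(f"start {res.name}")
        if not res.start():
            for done in reversed(started):
                log.append(f"stop {done.name}")
                done.stop()
            return False
        started.append(res)
    return True


def stop_all(resources, log):
    """Stop resources in the reverse of their start order."""
    for res in reversed(resources):
        log.append(f"stop {res.name}")
        res.stop()


names = ["vip", "san-mounts", "mysql", "mon", "mon-shadow"]
log = []
cluster = [Resource(n) for n in names]
start_all(cluster, log)
stop_all(cluster, log)
print(log)
```

Starting the VIP and the mounts before MySQL, and mon last, means the monitor never sees a half-started stack.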

“mon” monitors

     • msql-mysql.monitor

     • fping.monitor

     • freespace.monitor    custom mount point monitor
     • mon.monitor



      On resource failure - goes to standby role.
      Other potential options - stop heartbeat, reboot, or reset.
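
A custom mount point monitor can be very small. The sketch below is not the monitor used in the project; it is a Python illustration of the idea, relying on the mon convention that a monitor signals failure through a non-zero exit status and puts a summary on its first line of output. The function names and the 1 GiB default threshold are assumptions for this example.

```python
import os
import shutil


def check_mount(path, min_free_bytes):
    """Return (ok, message) for one expected mount point."""
    if not os.path.ismount(path):
        return False, f"{path}: not mounted"
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        return False, f"{path}: only {free} bytes free"
    return True, f"{path}: ok"


def run_monitor(paths, min_free_bytes=1024 ** 3):
    """Check every path; return the exit status the monitor should use.

    0 means all mount points are healthy, 1 means at least one failed.
    Failures are printed on a single line so they form the summary.
    """
    failures = []
    for path in paths:
        ok, message = check_mount(path, min_free_bytes)
        if not ok:
            failures.append(message)
    if failures:
        print("; ".join(failures))
        return 1
    return 0
```

A wrapper script would call `sys.exit(run_monitor(sys.argv[1:]))`. Checking that the path is actually a mount point matters on a cold failover cluster: an unmounted directory still exists and still reports free space, only from the wrong filesystem.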




Improving failover

     • innodb_max_dirty_pages_pct=5 in my.cnf
     • service_startup_timeout=60 in /etc/init.d/mysql
     • Heartbeat resource manager retries offline 10 times
      •   /usr/lib64/heartbeat/ResourceManager => ${HA_STOPRETRYMAX=10}
      •   Changed to one
     • mysql.pid - don’t place it on shared storage
     • mon didn’t have timeout functionality
      •   Hacked the Perl script and added timeout
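
The timeout matters because a hung check is worse than a failed one: the monitor blocks and no failover is ever triggered. The project patched mon's Perl code; the sketch below only illustrates the same idea in Python, with a hypothetical helper that treats a check exceeding its time budget as a failure.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run a check command and return True only if it exits 0 in time.

    A command that is still running after timeout_seconds counts as failed.
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising this exception.
        return False
    return result.returncode == 0


# Stand-in checks: one that returns immediately and one that hangs.
quick = run_with_timeout([sys.executable, "-c", "pass"], 30)
hung = run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 1)
print(quick, hung)
```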




Other gotchas

     • Standard MySQL monitor improvement
      •   Added insert/delete from a dummy table
     • Standard /etc/init.d/mysql is not POSIX compliant
      •   mysql start returns error when MySQL is already up
      •   mysql stop returns error when MySQL is already down
     • SELINUX=disabled

     • innodb-flush-method = O_DSYNC or O_DIRECT
     • ibmrsa-telnet STONITH plug-in has a bug
      •   http://lists.community.tummy.com/pipermail/linux-ha/2008-June/033279.html
     • Heartbeat’s test suite - BasicSanityCheck
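
The insert/delete check on a dummy table proves the server can still write, which a plain connect or SELECT does not. The sketch below shows the shape of such a probe against any Python DB-API connection. It is an illustration only: the table name `ha_probe` is hypothetical, and an in-memory SQLite database stands in for MySQL so the example runs without a server or a third-party driver.

```python
import sqlite3
import time


def write_probe(conn, table="ha_probe"):
    """Insert and delete one row; raise if the database cannot accept writes.

    `table` is a trusted, hypothetical dummy table with one integer column
    named token (it would need to be BIGINT on MySQL for these values).
    """
    token = time.time_ns()
    cur = conn.cursor()
    cur.execute(f"INSERT INTO {table} (token) VALUES ({token})")
    cur.execute(f"DELETE FROM {table} WHERE token = {token}")
    deleted = cur.rowcount
    conn.commit()
    if deleted != 1:
        raise RuntimeError("probe row was not written")
    return True


# SQLite is only a stand-in here; the real monitor talks to MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ha_probe (token INTEGER)")
print(write_probe(conn))
```

On the cluster, the dummy table would have to be InnoDB so that the probe exercises the same read-write storage path as the live data.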

Acceptance testing - 42 individual tests (1)

     • Node down
      •   power-off, halt command, cpu overload
     • Network tests
      •   (ifconfig) - Heartbeat NIC down, app NIC down, management NIC down
      •   spam serial link - cat /dev/zero >/dev/ttyS0
      •   pulling heartbeat cables - one at a time and together
     • Storage tests
      •   freeze IO - dmsetup suspend --noflush lunmultipathproddb-01
      •   pull cables (one HBA and both HBA ports)
      •   mess up mount points between two servers

Acceptance testing - 42 individual tests (2)

     • MySQL daemon test
      •   MySQL dies - kill -9 {mysqld_pid} {mysqld_safe_pid}
      •   MySQL hangs - kill -STOP {mysqld_pid}
      •   MySQL can’t connect (max connections)
     • “mon” tests
      •   kill -9, kill -STOP, manual start on wrong node (including shadow)
     • Heartbeat
      •   kill -9, kill -STOP
      •   Stopping and starting
      •   Graceful switchover between the nodes
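
The kill -STOP tests are worth a closer look, because a stopped process still exists and passes any check that only looks for a running PID. The sketch below demonstrates this on a Unix system with a sleeping child process standing in for mysqld; it is an illustration of the failure mode, not the test harness used in the project.

```python
import os
import signal
import subprocess
import sys


def is_alive(proc):
    """A naive liveness check: the process has not exited."""
    return proc.poll() is None


# Stand-in for mysqld: a child process that just sleeps.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Same effect as `kill -STOP <pid>`: the process is frozen, not gone.
os.kill(child.pid, signal.SIGSTOP)
alive_while_stopped = is_alive(child)

# Same effect as `kill -9 <pid>`: SIGKILL also terminates a stopped process.
child.kill()
child.wait()
alive_after_kill = is_alive(child)

print(alive_while_stopped, alive_after_kill)
```

The frozen child still looks alive, which is why a hang can only be caught by a monitor that performs real work against the database within a time limit.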



Backup infrastructure

     • Split into LIVE and ARCHIVE
      •   LIVE - InnoDB 200-500 GB
      •   ARCHIVE - MyISAM 2 TB
     • ARCHIVE backup - production
      •   can lock + rsync
      •   no LVM => no snapshot
      •   storage snapshot is expensive
     • LIVE backup - on slave
      •   FLUSH ... WITH READ LOCK
      •   Stop slave SQL thread
      •   LVM snapshot or RSYNC
     • Restore
      •   LIVE first as a whole instance
      •   ARCHIVE later - it’s MyISAM

Disaster recovery infrastructure




Q&A



                                  Thank you!




     • gorbachev@pythian.com

     • http://pythian.com/news/author/alex

     • Twitter @alexgorbachev


MOW2010: 1TB MySQL Database Migration and HA Infrastructure by Alex Gorbachev, Pythian


Editor's Notes

  • #4 - Successful growing business for more than 10 years - Served many customers with complex requirements/infrastructure just like yours. - Operate globally for 24 x 7 “always awake” services