
M|18 Battle of the Online Schema Change Methods


  1. Battle of the online schema change methods
     Ivan Groenewold, Valerie Parham-Thompson, 27 February 2018
  2. Agenda
     ● Definition of “Online”
     ● Native Online DDL
     ● Alternate Methods
     ● Alternate Tools
     ● Battle! Special Replication/Cluster Cases
     ● Battle! Special Workload Cases
     ● Q&A
  3. What is “Online”?
  4. Definition
     ● “Online” can mean several things:
       ● Changes happen only to metadata.
       ● Changes are made in place, and the software allows concurrent DML.
       ● The table is rebuilt with a table copy, but the software allows concurrent DML.
       ● In some cases (e.g., TokuDB), changes are made in memory and not pushed to disk until the row is accessed. More on that later!
  5. Why do we care?
     ● Blocking
     ● Locking
     ● Application failures
  6. Native Online DDL
  7. Native Online DDL - InnoDB/XtraDB - Overview
     ● Supported since MariaDB 10.0 / MySQL 5.6
     ● Information about progress is limited
     ● MariaDB >= 10.1.1: SHOW STATUS LIKE 'innodb_onlineddl%'; and the error log
     ● MySQL >= 5.7.6: Performance Schema stage/innodb/alter%
  8. Native Online DDL - InnoDB/XtraDB - Settings
     ● Algorithm and locking methods, e.g., ALTER TABLE test ADD INDEX name_idx (name), ALGORITHM=inplace, LOCK=none;
     ● You get an error message if the specified algorithm/lock combination cannot be used
     ● If not specified explicitly, e.g., ALTER TABLE test ADD INDEX name_idx (name), the least expensive combination is used
     ● ALGORITHM=copy, LOCK=shared is the old behaviour (only reads allowed)
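The fail-fast behaviour described above can be sketched as follows (assuming a table `test` with a `name` column; this is an illustration, not part of the original deck):

```sql
-- Request the cheapest option explicitly; the server errors out
-- instead of silently falling back to a blocking table copy.
ALTER TABLE test ADD INDEX name_idx (name), ALGORITHM=INPLACE, LOCK=NONE;

-- Changing a column's data type cannot be done in place, so the same
-- flags make the statement fail fast rather than run a table copy:
ALTER TABLE test MODIFY name TEXT, ALGORITHM=INPLACE, LOCK=NONE;
-- expected to fail with ER_ALTER_OPERATION_NOT_SUPPORTED
```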
  9. Native Online DDL - InnoDB/XtraDB - Limitations
     ● Some operations still require a table copy
       ● Changing a column data type
         i. From MariaDB 10.2.2, increasing the size of a VARCHAR column can be done in place
       ● Charset changes
       ● Primary key changes
       ● DATE, DATETIME, TIMESTAMP columns created before MariaDB 10.1.2 / MySQL 5.6
     ● Online DDL is not really “online” on replicas
       ● The operation needs to finish on the master before being replicated
       ● The SQL thread is “blocked” while the replica executes the DDL operation
       ● Lag spike at the end
  10. Native Online DDL - InnoDB/XtraDB - More Limitations
     ● Works only with InnoDB
     ● Can’t pause an online DDL operation
     ● Rollback can be expensive
     ● LOCK=none is not supported with fulltext indexes
       ● https://bugs.mysql.com/bug.php?id=81819
     ● Instant ADD COLUMN for InnoDB in MariaDB 10.3+ is in RC
       ● https://jira.mariadb.org/browse/MDEV-11369
     https://mariadb.com/kb/en/library/online-ddl-overview/
     https://dev.mysql.com/doc/refman/5.7/en/innodb-create-index-overview.html
  11. Native Online DDL - TokuDB - Overview
     ● “TokuDB enables you to add or delete columns in an existing table, expand char, varchar, varbinary, and integer type columns in an existing table, or rename an existing column in a table with little blocking of other updates and queries. … The work of altering the table for column addition, deletion, or expansion is performed as subsequent operations touch parts of the Fractal Tree, both in the primary index and secondary indexes.”
     Note: tokudb_version=5.6.37-82.2
  12. Native Online DDL - TokuDB - Limitations
     ● Don’t change too much at once. For example, when changing a column name, don’t change other attributes in the same statement, or it will revert to a regular MySQL (not hot-copy) alter. The column was originally intcol1 int(11) default null:

     ALTER TABLE test.t1 CHANGE intcol1 intone INT(11) DEFAULT NULL;
     Query OK, 0 rows affected (0.01 sec)
     Records: 0  Duplicates: 0  Warnings: 0

     ALTER TABLE test.t1 CHANGE intcol1 inttwo INT(11) NOT NULL;
     Query OK, 56777 rows affected (1.52 sec)
     Records: 56777  Duplicates: 0  Warnings: 0

     Note: tokudb_version=5.6.37-82.2
  13. Native Online DDL - TokuDB - HCADER - Limitations
     ● Don’t try to do multiple types of changes, or multiple renames, in one statement.

     These are not executed online:

     ALTER TABLE test.t2 DROP number4, ADD number7 int(11) default null;
     Query OK, 56777 rows affected (0.93 sec)
     Records: 56777  Duplicates: 0  Warnings: 0

     ALTER TABLE test.t2 CHANGE number4 num4 int(11) default null, CHANGE number7 num7 int(11) default null;
     Query OK, 56777 rows affected (1.53 sec)
     Records: 56777  Duplicates: 0  Warnings: 0

     Although these are (because they are natively supported):

     ALTER TABLE test.t2 ADD number5 int(11) default null, ADD number6 int(11) default null;
     Query OK, 0 rows affected (0.02 sec)
     Records: 0  Duplicates: 0  Warnings: 0

     ALTER TABLE test.t2 DROP number5, DROP number6;
     Query OK, 0 rows affected (0.18 sec)
     Records: 0  Duplicates: 0  Warnings: 0
  14. Native Online DDL - TokuDB - HCADER - Limitations
     ● If dropping a column with an index, drop the index first, then drop the column.

     This is not executed online, because number4 has an associated index:

     ALTER TABLE test.t1 DROP column number4;
     Query OK, 159002 rows affected (4.52 sec)
     Records: 159002  Duplicates: 0  Warnings: 0

     Separating the statements allows the drop to happen online:

     ALTER TABLE test.t1 DROP index idx4;
     Query OK, 0 rows affected (0.37 sec)
     Records: 0  Duplicates: 0  Warnings: 0

     ALTER TABLE test.t1 DROP column number4;
     Query OK, 0 rows affected (0.18 sec)
     Records: 0  Duplicates: 0  Warnings: 0
  15. Native Online DDL - TokuDB - HCADER - Limitations
     ● Changing the size of an int, char, varchar, or varbinary column is supported online, but only if the column doesn’t have an index attached to it.

     This is not executed online, because string1 has an associated index:

     ... `string1` char(2) DEFAULT NULL, ... KEY `idxstr1` (`string1`)

     ALTER TABLE test.t2 CHANGE column string1 string1 char(3) default null;
     Query OK, 56777 rows affected (6.39 sec)
     Records: 56777  Duplicates: 0  Warnings: 0

     This is executed online, because string2 has no index:

     ... `string2` char(2) DEFAULT NULL,

     ALTER TABLE test.t2 CHANGE column string2 string2 char(3) default null;
     Query OK, 0 rows affected (0.26 sec)
     Records: 0  Duplicates: 0  Warnings: 0
  16. Native Online DDL - TokuDB - HCADER - Limitations
     ● A note about performance: “The time that the table lock is held can vary. The table-locking time for HCADER is dominated by the time it takes to flush dirty pages, because MySQL closes the table after altering it. If a checkpoint has happened recently, this operation is fast (on the order of seconds). However, if the table has many dirty pages, then the flushing stage can take on the order of minutes.”
     ● Another note about performance: “The work of altering the table for column addition, deletion, or expansion is performed as subsequent operations touch parts of the Fractal Tree, both in the primary index and secondary indexes.” Think about very busy workloads.
     ● Take this very seriously, especially in the presence of replication! Test early and often.
  17. Native Online DDL - TokuDB - Adding Indexes
     ● Adding indexes is done online, allowing concurrent inserts and updates, by:
       1. Setting the variable: set tokudb_create_index_online=on
       2. Using the syntax: create index index_name on table_name (column_name);
     ● BUT it will block replication on replicas: the operation is long-running even though it is non-blocking
  18. Native Online DDL - RocksDB
     All changes require a table copy, except creating or dropping secondary indexes. More may be supported soon.
     https://github.com/facebook/mysql-5.6/issues/47
  19. Native Online DDL: RDS MySQL / Aurora - Overview
     ● Traditional online DDL
     ● Aurora Fast DDL
       ● Updates the INFORMATION_SCHEMA system table with the new schema
       ● Records the old schema into a Schema Version Table, with the timestamp
       ● The change is propagated to read replicas
       ● On subsequent DML operations, checks whether the affected data page has a pending schema operation by comparing the log sequence number (LSN) timestamp of the page with the LSN timestamp of the schema changes
       ● Updates the page to the new schema before applying the DML statement
     https://aws.amazon.com/blogs/database/amazon-aurora-under-the-hood-fast-ddl/
  20. Native Online DDL: RDS MySQL / Aurora - Limitations
     ● Problems with traditional online DDL
       ● The temp table can fill ephemeral storage
     ● Fast DDL
       ● Only supports adding nullable columns to the end of a table
       ● No support for partitioned tables
       ● No support for the older REDUNDANT row format
       ● Only usable if the maximum possible record size is smaller than half the page size
  21. Alternate Methods: Rolling Schema Updates
  22. Rolling Schema Updates - Overview
     ● Stop a replica
     ● Perform the alter on the replica
     ● Start the replica and allow it to catch up
     ● (repeat for each replica...)
     ● Promote a replica to master + deploy the new code release
     ● Perform the alter on the old master
     ● Promote the old master back (optional)
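On each replica, the first three steps above amount to something like the following (a sketch; the database, table, and column names are placeholders, and the binlog is disabled for the session so the change does not replicate further):

```sql
-- On the replica (placeholder names: my_db.my_table, new_col)
STOP SLAVE;
SET SESSION sql_log_bin = 0;   -- keep the DDL out of this server's binlog
ALTER TABLE my_db.my_table ADD COLUMN new_col INT NULL;
SET SESSION sql_log_bin = 1;
START SLAVE;
-- then wait for Seconds_Behind_Master to return to 0 before moving on
```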
  23. Rolling Schema Updates - Limitations
     ● Time-consuming manual process
     ● Complexity proportional to the number of slaves
     ● The schema is temporarily inconsistent across the topology
     ● If slaves are used for reads, the application needs to be forward/backward compatible
     ● Some changes can break replication
     ● Potential problems with GTID
  24. Rolling Schema Updates - Limitations - Errant Tx
     ● Run a DDL change on slave A
     ● The DDL change is applied to the other slaves (B, C, …)
     ● Promote slave A to master
     ● Possible outcomes:
       ● Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires'
       ● If server A still has the binlog with the DDL change, the transaction is sent (again) to all current slaves
  25. Rolling Schema Updates - Limitations - Errant Tx
     ● Avoid errant transactions
       ● Disable the binlog for the session
       ● Run the changes
     ● How to spot them
       ● Check executed_gtid_set on the slave (SHOW SLAVE STATUS)
       ● Check executed_gtid_set on the master (SHOW MASTER STATUS)
       ● Use the gtid_subtract('slave_set', 'master_set') function
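The detection steps above can be sketched as a single query (the two GTID sets are placeholders; substitute the Executed_Gtid_Set values from SHOW SLAVE STATUS on the slave and SHOW MASTER STATUS on the master):

```sql
-- Sketch: GTIDs executed on the slave but absent from the master
SELECT GTID_SUBTRACT(
    '3e11fa47-71ca-11e1-9e33-c80aa9429562:1-100',  -- slave set (placeholder)
    '3e11fa47-71ca-11e1-9e33-c80aa9429562:1-98'    -- master set (placeholder)
) AS errant_transactions;
-- A non-empty result means the slave has errant transactions.
```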
  26. Alternate Methods: Downtime
  27. Downtime - Overview
     ● Schedule application downtime
     ● Make the changes during code-rollout maintenance
     ● Rarely used these days, but you can sometimes choose to make impactful changes during relatively quiet times to reduce how long the change runs
  28. Alternate Tools: pt-online-schema-change
  29. pt-osc - Overview
     ● Create an empty copy of the table to alter
     ● Modify the new table
     ● Create triggers from the old table to the new table (_ins, _upd, _del)
     ● Copy rows from the original table into the new table (in chunks)
     ● Move away the original table and replace it with the new one
     https://www.percona.com/doc/percona-toolkit/LATEST/pt-online-schema-change.html
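A typical invocation of the workflow above might look like this (a sketch; the host, schema, and table in the DSN are placeholders, and --dry-run lets you validate the plan before committing to --execute):

```shell
# Validate first (creates and alters the new table, but copies no rows):
pt-online-schema-change \
  --alter "ADD COLUMN b INT NULL" \
  --dry-run \
  h=master.example.com,D=my_db,t=my_table

# Same change, actually executed:
pt-online-schema-change \
  --alter "ADD COLUMN b INT NULL" \
  --execute \
  h=master.example.com,D=my_db,t=my_table
```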
  30. pt-osc - Settings
     ● Chunk size controls
       ● --chunk-time
         i. Adjusts the chunk size dynamically based on row-copy time
       ● --chunk-size
         i. Specifies a fixed chunk size
     ● Triggers
       ● --preserve-triggers (MySQL 5.7 only)
     ● Finding replicas
       ● --recursion-method
         i. processlist
         ii. hosts (uses SHOW SLAVE HOSTS)
         iii. dsn=DSN
         iv. none
  31. pt-osc - Settings
     ● Throttling
       ● Master
         i. --max-load Threads_connected=800,Threads_running=80
         ii. --critical-load
         iii. --pause-file
       ● Slave lag
         i. --max-lag
         ii. --check-interval
         iii. --check-slave-lag, or a DSN to monitor specific replica(s)
  32. pt-osc - Limitations
     ● Potential stalls after dropping a big table
       ● Use --no-drop-old-table
     ● Trigger-related
       ● The tool cannot be completely paused
       ● Adding/dropping triggers is expensive!
         i. Control metadata lock time
            1. --set-vars=lock_wait_timeout=10
         ii. Use retries
            1. --tries create_triggers:5:0.5,drop_triggers:5:0.5 (try 5 times, wait 0.5 sec between tries)
  33. pt-osc - More Limitations
     ● Dealing with foreign keys
       ● --alter-foreign-keys-method
         i. drop_swap: no atomic rename
         ii. rebuild_constraints: the rebuild is done using normal blocking DDL
         iii. auto: let pt-osc decide
     ● FK names: pt-osc adds/removes leading underscores based on the first FK of the table
       ● Be careful with underscores in FK names
       ● Try to avoid (in the same table) things like:
         i. _FK1
         ii. __FK2
         iii. FK3
       ● https://bugs.launchpad.net/percona-toolkit/+bug/1428812
  34. Alternate Tools: Facebook OSC
  35. Facebook OSC - Overview
     ● Create an empty copy (“shadow”) of the table to alter, with the new structure
     ● Create a change-capture (“deltas”) table
       ● All columns of the source table
       ● An extra integer auto-increment column to track the order of changes
       ● An extra integer column to track the DML type
     ● Create 3 triggers from the old table to the “deltas” table
       ● Inserts
       ● Deletes
       ● Updates
  36. Facebook OSC - Overview
     ● Dump chunks of rows from the original table and load them into the new table
     ● Replay changes from the deltas table
     ● Checksum
     ● Cut over
       ● Lock the existing table
       ● Final round of replay
       ● Swap the tables
  37. Facebook OSC - Unique features
     ● Developed for rolling schema updates
       ● The tool runs with sql-log-bin=0
     ● Change replay is asynchronous
     ● Uses SELECT INTO OUTFILE / LOAD DATA INFILE to avoid gap locks
     ● Manage schema via source control
       ● Relies on a file containing the CREATE TABLE to run the schema change
     ● Supports MyRocks
     ● Implemented as a standalone Python class
       ● Interact with OSC from your own code
  38. Facebook OSC - Limitations
     ● No support for triggers, FKs, renaming columns, or RBR
     ● Only one OSC can run at a time
     ● Requires Python 2.7
       ● Difficult to install on older distributions
     ● No atomic rename
       ● 'table not found' errors can happen for a short period
     ● Implications of using sql-log-bin=0
     https://github.com/facebookincubator/OnlineSchemaChange
  39. Alternate Tools: gh-ost
  40. gh-ost - Overview
     ● Create an empty copy of the table to alter
     ● Modify the new table
     ● Hook up as a MySQL replica (stream binlog events)
     ● Copy rows from the original table into the new table in chunks
     ● Apply events on the new table
     ● When the copy is complete, move away the original table and replace it with the new one
  41. gh-ost - Unique features
     ● No triggers!
     ● Can be completely paused
     ● Low overhead and less locking
     ● Multiple concurrent migrations
     ● Dynamic reconfiguration
     ● Can offload some operations to a slave
       ● Reading binlog events
       ● Queries required for the (optional) accurate progress counter
     ● Supports testing or migrating on a slave
       ● --test-on-replica
       ● --migrate-on-replica
  42. gh-ost - Settings
     ● Throttling
       ● Master
         i. --max-load=Threads_connected=800,Threads_running=80
         ii. --critical-load
         iii. --critical-load-interval-millis=10000 (2nd chance)
         iv. --critical-load-hibernate-seconds=300 (don’t panic, hibernate instead)
       ● Slave lag
         i. --max-lag-millis
         ii. --throttle-control-replicas
     ● Pause
       i. echo throttle | nc -U /tmp/gh-ost.test.sock
  43. gh-ost - Interactive commands
     ● Send commands via a Unix socket
       ● echo 'status' | nc -U /tmp/gh-ost.test.sock
       ● echo 'chunk-size=?' | nc -U /tmp/gh-ost.test.sock
       ● echo 'chunk-size=2500' | nc -U /tmp/gh-ost.test.sock
       ● echo '[no-]throttle' | nc -U /tmp/gh-ost.test.sock
         ■ Make sure binlogs don’t get purged while throttled!
     ● Delay the cut-over
       ■ --postpone-cut-over-flag-file=/tmp/file
       ■ echo unpostpone | nc -U /tmp/gh-ost.test.sock
     https://github.com/github/gh-ost/blob/master/doc/interactive-commands.md
  44. gh-ost - Limitations
     ● No support for FKs or triggers
     ● Needs row-based binlogs on at least one slave
     ● No support for new MySQL 5.7 column types
       ● Generated
       ● POINT
       ● No JSON PKs
     ● Master-master is supported, but only active-passive
  45. Special Replication / Cluster Cases: RDS / Aurora
  46. Use Case: RDS MySQL / Aurora - Overview
     ● Black-box approach
     ● RDS replicas are similar to traditional slaves
     ● Aurora replicas share the same storage with the master
     ● STOP/START SLAVE is done through procedures
       ● mysql.rds_stop_replication
       ● mysql.rds_start_replication
  47. BATTLE!
  48. Use Case: RDS MySQL / Aurora - Complications
     ● Privileges are limited
       ● No SUPER
     ● Detecting replicas is tricky
       ● The processlist gives you the internal IP
     ● Aurora read replicas are “special”
       ● read_only can’t be modified
     ● Replication filters are used
       ● Replicate_Ignore_Table: mysql.plugin,mysql.rds_monitor
     ● Binlogs are disabled by default
  49. gh-ost wins!
  50. Use Case: Aurora - gh-ost on master
     ● Make sure binary logs are enabled
       ● Set backup retention >= 1 day
       ● Set binlog_format=ROW in the parameter group
     ● Use --assume-rbr
     ● Specify the Aurora writer endpoint in --host
     ● Use --allow-on-master
  51. Use Case: Aurora - gh-ost on master
     ./gh-ost … --alter="add column b int" --assume-rbr --allow-on-master --host=<aurora writer endpoint> --database="testdb" --table="mytable" --verbose --execute
  52. Use Case: RDS MySQL - gh-ost on replica
     ● Make sure binary logs are enabled on the replica
       ● Set backup retention >= 1 day
       ● Set binlog_format=ROW in the parameter group
     ● Use --assume-rbr
     ● Specify the replica endpoint in --host
     ● Specify the master’s endpoint in --assume-master-host
  53. Use Case: RDS MySQL - gh-ost on replica
     ./gh-ost … --alter="add column b int" --assume-rbr --assume-master-host=<master endpoint> --host=<replica endpoint> --max-lag-millis=5000 --database="testdb" --table="mytable" --verbose --execute
  54. Special Replication / Cluster Cases: Tungsten
  55. Use case: Tungsten - Overview
     ● Tungsten uses an external replicator process
       ● Reads from the binary logs (row- or statement-based data)
       ● Converts transaction data to a db-agnostic format
       ● Writes into Transaction History Log (THL) files
     ● Hosts are not aware they are replicating
       ● SHOW SLAVE STATUS is empty
  56. BATTLE!
  57. Use case: Tungsten - Complications
     ● Issues with pt-osc
       ● Replicas have to be specified manually (DSN table)
       ● Binlog format considerations (triggers)
         ■ STATEMENT: works out of the box
         ■ ROW: use the Tungsten plugin for Percona Toolkit
         ■ MIXED: do not use!
     ● Issues with gh-ost
       ● Can’t detect the master/replicas automatically
       ● Can’t start/stop replication
       ● Can’t switch the binlog format
  58. gh-ost wins!
  59. Use case: Tungsten - gh-ost using a replica
     (Diagram: the master writes binlogs, which the master-side Tungsten Replicator reads and converts to THL; the slave-side replicator applies the THL to the slave’s MySQL. gh-ost reads changes from the slave’s binlogs, and copies rows and applies changes to the ghost table on the master.)
  60. Use case: Tungsten - gh-ost using a slave
     ● Pass the slave to --host
     ● Use --assume-master-host to specify the actual master
     ● Specify the --tungsten argument
     ● The slave needs binlog_format=ROW
     ● Use the Tungsten log-slave-updates parameter
       ○ tpm configure property=log-slave-updates=true
       ○ Otherwise gh-ost hangs at "INFO Waiting for tables to be in place"
  61. Use case: Tungsten - gh-ost using a slave
     ./gh-ost --assume-master-host=<master_ip> --tungsten --host=<slave1_ip> --database="test" --table="test_t" --verbose --alter="add column b int" --execute
  62. Special Replication / Cluster Cases: Galera
  63. Use case: Galera - Overview
     ● Can’t benefit from online DDL
     ● Total order isolation (TOI)
       ● Blocks all writes to the cluster - no thanks!
     ● Rolling schema updates (RSU)
       ● Block one node at a time
       ● New and old schema definitions need to be compatible
       ● Manual operation
       ● Make the gcache big enough to allow IST afterwards
  64. BATTLE!
  65. Use case: Galera - Complications
     ● pt-osc
       ● Only InnoDB tables are supported
     ● gh-ost
       ● No official support for Galera/XtraDB Cluster
         i. Works in Galera 5.6 using --cut-over=two-step
         ii. Doesn’t work with Galera 5.7
         iii. https://github.com/github/gh-ost/issues/224
  66. pt-osc wins!
  67. Use case: Galera - pt-osc
     ● Set wsrep_OSU_method to TOI
       ● The ALTER of the _new table needs to replicate to all nodes at the same time
     ● No other special arguments are required
     ● Throttling
       ● There is no replication lag in Galera
       ● --max-flow-ctl (percentage)
         i. Checks the average time the cluster spent pausing for flow control, and makes pt-osc pause if it goes over the indicated percentage
         ii. Available for Galera >= 5.6
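A Galera run of pt-osc might therefore look like this (a sketch; the node, schema, and table names are placeholders, and the flow-control threshold is illustrative):

```shell
# Sketch: pt-osc against one Galera node, with wsrep_OSU_method=TOI in
# effect so the _new table's ALTER replicates to all nodes together.
# --max-flow-ctl pauses the copy when the cluster spends more than the
# given percentage of time in flow-control pauses.
pt-online-schema-change \
  --alter "ADD COLUMN b INT NULL" \
  --max-flow-ctl 1 \
  --execute \
  h=node1.example.com,D=my_db,t=my_table
```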
  68. Special Replication / Cluster Cases: Multisource Replication
  69. Use case: Multisource Replication - Overview
     Multisource replication is commonly found in MariaDB because it has been supported in the product for a while. One replica (s1) supports multiple sources (m1, m2).
  70. BATTLE!
  71. gh-ost wins!
  72. Use case: Multisource Replication
     A previous version required running gh-ost on the master (see https://github.com/github/gh-ost/issues/225). Now we can use --assume-master-host:

     ./gh-ost --max-load=Threads_running=25 --critical-load=Threads_running=1000 --chunk-size=1000 --throttle-control-replicas="192.168.56.12" --max-lag-millis=1500 --user="ghost" --password="ghost" --allow-master-master --assume-master-host=192.168.56.10 --database="d1" --table="t1" --verbose --alter="change column name name char(50)" --cut-over=default --default-retries=120 --panic-flag-file=/tmp/ghost.panic.flag --postpone-cut-over-flag-file=/tmp/ghost.postpone.flag --initially-drop-ghost-table --initially-drop-old-table --execute
  73. Special Replication / Cluster Cases: Daisy-Chained Replication
  74. Use case: Daisy-chained replication - Overview
     Daisy-chained replicas are commonly used in complex topologies to scale out a large number of second-level replicas (m1 -> s1 -> s1a) while avoiding load on the primary from the replication threads. The key is to use log-slave-updates on the intermediate replica so that changes are passed on to the second-level replicas.
  75. BATTLE!
  76. pt-osc wins!
  77. Use case: Daisy-chained replication
     Run pt-osc on the master (m1); the change flows through s1 to s1a.
  78. Special Workload Cases: Very Large Tables
  79. Use case: Very large tables - Overview
     ● The definition of “big” is environment-dependent
       ● Many rows
       ● Disk space used/free
  80. BATTLE!
  81. Use case: Very large tables - Complications
     ● The problems of DDL changes are exacerbated
       ● Replication lag
       ● Expensive rollback
       ● Info about progress can be limited
       ● Long queries
       ● Disk space
       ● Memory usage
  82. pt-osc wins!
  83. Use case: Very large tables
     ● External tools are more desirable than online DDL
     ● pt-osc is usually faster than gh-ost
     ● Experiment with --chunk-size and --chunk-time vs. the default auto-tuning
     ● Set high timeouts
     ● Retry failed operations
     ● Throttle
     ● If the table is extremely big, consider a dump/load into an altered table
  84. Use case: Very large tables - pt-osc
     pt-online-schema-change --set-vars=lock_wait_timeout=10,innodb_lock_wait_timeout=10 --critical-load Threads_running=1000 --max-load Threads_running=50 --max-lag 30 --alter-foreign-keys-method=drop_swap --chunk-time 1 --alter '...' D=my_db,t=my_table --execute
  85. Special Workload Cases: Very Busy Tables
  86. Use case: Very busy tables - Overview
     ● Busy tables receive many data changes over a short period of time
     ● Often combined with the very large tables use case
  87. BATTLE!
  88. Use case: Very busy tables - Complications
     ● The usual problems of online DDL
       ● No throttling
       ● Progress info is limited
       ● Long queries, IO contention
     ● Problems buffering the ongoing changes
       ● The innodb_online_alter_log_max_size parameter
         ■ Sizes the temporary log that stores data inserted, updated, or deleted in the table during the DDL operation
         ■ If the temporary log file exceeds the upper size limit, the operation fails
         ■ A large value extends the period of time at the end of the DDL operation when the table is locked to apply the data from the log
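If native online DDL must be used on a busy table anyway, the buffer described above can be enlarged before starting the ALTER (a sketch; the value is illustrative, and a bigger buffer trades the risk of failure for a longer final lock):

```sql
-- Give buffered DML more headroom during the online DDL operation
-- (512 MiB here; the default is 128 MiB)
SET GLOBAL innodb_online_alter_log_max_size = 512 * 1024 * 1024;
```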
  89. gh-ost wins!
  90. Use case: Very busy tables
     ● Use a replica to offload traffic from the master
     ● Use small chunks
     ● Set low timeouts to reduce impact
     ● Configure retrying of failed operations
     ● Configure throttling
     ● Trigger the cut-over manually during a low-traffic period
  91. Use case: Very busy tables - gh-ost
     ./gh-ost --max-load=Threads_running=50 --critical-load=Threads_running=500 --chunk-size=250 --max-lag-millis=5000 --host=... --database="..." --table="..." --verbose --default-retries=200 --cut-over-lock-timeout-seconds=5 --critical-load-interval-millis=1000 --critical-load-hibernate-seconds=30 --execute
  92. Special Cases: Metadata-only
  93. Use case: Metadata-only - Overview
     ● Metadata-only changes touch the .frm (the table definition), not the data files
     ● Examples
       ● Dropping an index
       ● Setting a column default value
       ● Changing auto_increment
       ● Renaming a column
       ● FK changes with foreign_key_checks=0
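Some of the examples above can be sketched as statements (the table and column names are placeholders; each of these touches only the table definition, so it completes almost instantly regardless of table size):

```sql
-- Change a column's default value (metadata-only)
ALTER TABLE my_table ALTER COLUMN qty SET DEFAULT 0;

-- Reset the auto-increment counter (metadata-only)
ALTER TABLE my_table AUTO_INCREMENT = 100000;

-- Rename a column, keeping its exact type and nullability (metadata-only)
ALTER TABLE my_table CHANGE old_name new_name INT NOT NULL;
```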
  94. BATTLE!
  95. Native online DDL wins!
  96. Thank you
     Robot photos by Ariel Waldman: https://www.flickr.com/photos/ariels_photos
  97. About Pythian
     Pythian’s 400+ IT professionals help companies adopt and manage disruptive technologies to better compete.
