Online Schema Changes for Maximizing Uptime


PalominoDB's David Turner and Ben Black cover common operations implemented in production and how you can minimize downtime and customer impact

Published in: Technology

  1. Online Schema Changes for Maximizing Uptime
      PalominoDB: Operational Excellence for Databases
      David Turner
      Ben
  2. PalominoDB: Who are we and what do we do?
      • Proactive database awesomeness so you can focus on other things
      • Systems and Performance Optimization
      • Operational Best Practices
      • DevOps
      • Ask us how we like working from anywhere
  3. Online Schema Changes
      • Best practices
      • Online Schema Tools
      • PT Online Schema Change
      • InnoDB Online DDL
      • Case Study
  4. Best Practices
      • Backups
      • Benchmark
          o Start small
          o Step up
          o Time
          o Performance_schema and Information_schema
  5. Repository
      • Prevent
          o Broken replication
          o Data loss
          o Broken applications
  6. Creating the repository
      • Dump
      • Script
  7. Syncing the repository
      • Migrations
      • Schema Changes
      • Scheduled
  8. Problems
      • Schema changes
      • Fragmentation
  9. You want to do what to that 40G table?
      • Down for Maintenance/blocking DDL
      • Disable writes
      • Remaster
      • Facebook OSC
      • Large Hadron Migrator
      • Openark-kit
      • PT Online Schema Change
      • MySQL InnoDB DDL - IOD
  10. OSC Tools or how I learned to love the alter...
      • copy original table structure to new table
      • alter new table
      • create triggers to copy DML from original table to new table
      • copy data in original table to new table in chunks
      • swap table names and drop original table
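The five steps on this slide can be walked through end to end on a toy table. The following is a minimal sketch, not how any of the listed tools is actually implemented: it uses Python's built-in `sqlite3` instead of MySQL so that it runs anywhere, and the `tickets` table, the `_tickets_new` shadow table, the trigger names, and the chunk size are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, subject TEXT)")
cur.executemany("INSERT INTO tickets (id, subject) VALUES (?, ?)",
                [(i, f"ticket {i}") for i in range(1, 11)])

# 1. Copy the original table structure to a new (shadow) table.
cur.execute("CREATE TABLE _tickets_new (id INTEGER PRIMARY KEY, subject TEXT)")

# 2. Alter the new table while it is still empty, so the change is cheap.
cur.execute("ALTER TABLE _tickets_new ADD COLUMN status TEXT DEFAULT 'open'")

# 3. Triggers replay DML from the original table onto the new table.
cur.execute("""CREATE TRIGGER tickets_ins AFTER INSERT ON tickets BEGIN
    INSERT OR REPLACE INTO _tickets_new (id, subject) VALUES (NEW.id, NEW.subject);
END""")
cur.execute("""CREATE TRIGGER tickets_upd AFTER UPDATE ON tickets BEGIN
    INSERT OR REPLACE INTO _tickets_new (id, subject) VALUES (NEW.id, NEW.subject);
END""")
cur.execute("""CREATE TRIGGER tickets_del AFTER DELETE ON tickets BEGIN
    DELETE FROM _tickets_new WHERE id = OLD.id;
END""")

# 4. Copy existing rows in primary-key chunks. INSERT OR IGNORE keeps any
#    newer version of a row that a trigger has already written.
CHUNK_SIZE = 4  # tiny on purpose; illustrative only
last_id = 0
while True:
    ids = cur.execute("SELECT id FROM tickets WHERE id > ? ORDER BY id LIMIT ?",
                      (last_id, CHUNK_SIZE)).fetchall()
    if not ids:
        break
    upper = ids[-1][0]
    cur.execute("""INSERT OR IGNORE INTO _tickets_new (id, subject)
                   SELECT id, subject FROM tickets WHERE id > ? AND id <= ?""",
                (last_id, upper))
    last_id = upper
    if last_id == 4:
        # Simulated application writes arriving in the middle of the copy.
        cur.execute("UPDATE tickets SET subject = 'edited' WHERE id = 2")
        cur.execute("INSERT INTO tickets (id, subject) VALUES (11, 'late arrival')")
        cur.execute("DELETE FROM tickets WHERE id = 9")

# 5. Swap the table names and drop the original. The triggers are dropped
#    first here only to keep the SQLite renames simple.
for trigger in ("tickets_ins", "tickets_upd", "tickets_del"):
    cur.execute(f"DROP TRIGGER {trigger}")
cur.execute("ALTER TABLE tickets RENAME TO _tickets_old")
cur.execute("ALTER TABLE _tickets_new RENAME TO tickets")
cur.execute("DROP TABLE _tickets_old")
conn.commit()

rows = cur.execute("SELECT id, subject, status FROM tickets ORDER BY id").fetchall()
ids_after = [row[0] for row in rows]
print(rows)
```

The writes injected during the copy all survive the swap: the edit to row 2, the late insert of row 11, and the delete of row 9. That is the property the triggers exist to provide.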
  11. pt-online-schema-change
      • Percona Toolkit is your friend.
      • Mature product
      • Pay attention to the version you are using!!!
      • FK issues, log_bin with 2.0
      • screen - use it along with tee and time
  12. It's all ball bearings nowadays
      • --progress time,10 (default 30 seconds)
      • --max-lag
      • --recursion-method (how to find slaves)
          o show processlist
          o show hosts
          o dsn table (great for ignoring some slaves)
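To make the options on this slide concrete, here is a small sketch that assembles a `pt-online-schema-change` command line as an argument list. The helper name `build_pt_osc_command`, the `helpdesk` database, the `tickets` table, and the ALTER clause are hypothetical. The flags themselves (`--alter`, `--progress`, `--max-lag`, `--recursion-method`, `--dry-run`, `--execute`) are standard options of the tool, but check them against the version you have installed, as the previous slide warns.

```python
import shlex

def build_pt_osc_command(database, table, alter, *, max_lag=1,
                         progress="time,10", recursion_method="processlist",
                         execute=False):
    """Return an argv list for pt-online-schema-change (hypothetical helper)."""
    return [
        "pt-online-schema-change",
        "--alter", alter,
        "--progress", progress,                  # report every 10 seconds
        "--max-lag", str(max_lag),               # pause while replicas lag more than this
        "--recursion-method", recursion_method,  # how replicas are discovered
        # The tool requires exactly one of --dry-run or --execute.
        "--execute" if execute else "--dry-run",
        f"D={database},t={table}",               # DSN naming the table to alter
    ]

cmd = build_pt_osc_command("helpdesk", "tickets", "ADD COLUMN status VARCHAR(16)")
print(shlex.join(cmd))
```

Defaulting to `--dry-run` is a deliberate choice in this sketch: the caller has to opt in to `--execute`. The resulting list can be handed to `subprocess.run(cmd)` inside a `screen` session, with output captured the way the previous slide suggests.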
  13. What could possibly go wrong?
      • PK/UK required
      • FK names will change on altered table
      • FKs reference table to alter
          o --alter-foreign-keys-method drop_swap
          o sets foreign_key_checks=0
          o drops original table (hardlink!!!)
          o renames new table
  14. What could possibly go wrong?
      • non-XFS table drops (create the hardlink!)
          o Can even cause InnoDB to crash mysqld
      • PKs with gaps
      • Largest table to alter vs free disk space
      • Disk space (2x for RBR)
      • Global mutexes (table drops)
      • Table metadata locks (triggers)
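The two disk-space bullets lend themselves to a quick pre-flight check. The sketch below is one possible reading of the slide, not a rule from the tool: it assumes the table copy needs roughly the size of the table, that row-based replication (RBR) roughly doubles that, and it adds a 10% safety margin. The function name and the margin are assumptions.

```python
GIB = 1024 ** 3

def has_room_for_osc(table_bytes, free_bytes, row_based_replication=False,
                     headroom_percent=10):
    """Return (ok, required_bytes) for a rough pre-flight disk check."""
    # Assumption: 1x the table size for the copy, 2x when RBR is in use.
    multiplier = 2 if row_based_replication else 1
    required = table_bytes * multiplier * (100 + headroom_percent) // 100
    return free_bytes >= required, required

# A 40 GiB table with 60 GiB free on the data volume.
ok_sbr, need_sbr = has_room_for_osc(40 * GIB, 60 * GIB)
ok_rbr, need_rbr = has_room_for_osc(40 * GIB, 60 * GIB, row_based_replication=True)
print(ok_sbr, need_sbr // GIB)
print(ok_rbr, need_rbr // GIB)
```

In practice `free_bytes` could come from `shutil.disk_usage(path).free` for the volume holding the data directory, and the check should be run against the largest table you plan to alter.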
  15. NOOOOOO!!
      • Running PT-OSC against a slave with RBR
          o Replication started erroring after pt-osc
          o why???
      • And how about syncing a table using pt-osc with RBR?
  16. Set it and forget it???
      • watch
          o w
          o df -h
          o ls -alh ibd files
          o mysql -e "show processlist;" | egrep -v "Sleep|repl"
          o slave lag
      • How is it affecting the application?
      • max-lag, max-load,
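The processlist check on this slide is easy to script if you want to log it alongside the other watch items. This is a minimal sketch of the same filter as `egrep -v "Sleep|repl"`, applied to text you have already captured from `mysql -e "show processlist;"`. The function name and the sample rows are invented for illustration.

```python
import re

# Same pattern as the slide: hide idle connections and replication threads.
NOISE = re.compile(r"Sleep|repl")

def interesting_threads(processlist_text):
    """Return the processlist lines that do not match Sleep or repl."""
    return [line for line in processlist_text.splitlines()
            if line.strip() and not NOISE.search(line)]

# Made-up sample in the tab-separated layout that `mysql -e` prints.
sample = "\n".join([
    "Id\tUser\tHost\tdb\tCommand\tTime\tState\tInfo",
    "11\tapp\t10.0.0.5:5123\thelpdesk\tSleep\t12\t\tNULL",
    "12\trepl\t10.0.0.7:4410\tNULL\tBinlog Dump\t9034\tMaster has sent all binlog to slave\tNULL",
    "13\tapp\t10.0.0.5:5130\thelpdesk\tQuery\t3\tcopy to tmp table\tALTER TABLE tickets ADD COLUMN status VARCHAR(16)",
])

visible = interesting_threads(sample)
for line in visible:
    print(line)
```

Like the `egrep` original, the pattern is a plain substring match, so it would also hide a query whose text happens to contain "repl" (a `REPLACE` statement, for example).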
  17. Features we like
      • Throttling
      • Nibbling
      • Reporting
      • Replicating
      • Reorgs
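Throttling and nibbling work together: the copy proceeds one small chunk at a time and waits whenever the replicas fall behind. The sketch below shows that control loop in isolation. It is a simplified illustration, not the tool's code; `copy_in_chunks`, its parameters, and the fake lag readings are all hypothetical.

```python
import time

def copy_in_chunks(chunks, copy_chunk, get_lag_seconds, max_lag=1.0,
                   check_interval=1.0, sleep=time.sleep):
    """Copy chunk by chunk, pausing while replica lag exceeds max_lag."""
    copied = 0
    for chunk in chunks:
        # Throttle: wait until the replicas have caught up.
        while get_lag_seconds() > max_lag:
            sleep(check_interval)
        # Nibble: copy one small chunk, then re-check the lag.
        copy_chunk(chunk)
        copied += 1
    return copied

# Dry run with fake lag readings and a recording sleep, so nothing blocks.
lag_readings = iter([0, 5, 3, 0.5, 0])
pauses = []
done = []
total = copy_in_chunks(
    chunks=[(1, 1000), (1001, 2000), (2001, 3000)],
    copy_chunk=done.append,
    get_lag_seconds=lambda: next(lag_readings),
    sleep=pauses.append,
)
print(total, len(pauses))
```

In the dry run the second chunk waits through two lag readings above the threshold before it is copied, which is why two pauses are recorded for three chunks.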
  18. InnoDB Online DDL - IOD
      • Looking back 5.1 and 5.5
          o Fast index creation
      • Enhancements
          o Online DDL
          o Inplace
  19. Silver Bullets
      • Drop and add Index together
      • Temporary table index creation
  20. Alter Online lock modes
      • Exclusive
      • Shared
      • Default
      • None*
      * Note
  21. Alter Online Algorithms
      • Inplace
      • Copy
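The lock modes and algorithms on the last two slides map onto the `LOCK=` and `ALGORITHM=` clauses of `ALTER TABLE` in MySQL 5.6. As a rough sketch, the helper below builds such a statement and rejects values outside the documented sets. The helper name, the `tickets` table, and the index definition are hypothetical; the clause syntax is standard MySQL.

```python
ALGORITHMS = {"DEFAULT", "INPLACE", "COPY"}
LOCKS = {"DEFAULT", "NONE", "SHARED", "EXCLUSIVE"}

def online_alter(table, change, algorithm="INPLACE", lock="NONE"):
    """Build an ALTER TABLE statement with explicit ALGORITHM and LOCK clauses."""
    algorithm, lock = algorithm.upper(), lock.upper()
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown algorithm: {algorithm}")
    if lock not in LOCKS:
        raise ValueError(f"unknown lock mode: {lock}")
    return f"ALTER TABLE {table} {change}, ALGORITHM={algorithm}, LOCK={lock}"

# Dropping and re-adding an index in a single statement.
stmt = online_alter("tickets",
                    "DROP INDEX idx_status, ADD INDEX idx_status (status, id)")
print(stmt)
```

Spelling out both clauses is useful because the server then refuses the statement with an error if it cannot honor them, instead of quietly falling back to a table copy or a stronger lock.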
  22. Grouping DDL
      • Copy
      • Sequence
      • Lock
  23. IOD Behavior
      • Metadata
      • Data volume
      • Type of index
      • Foreign keys
      • Partitioning
          o Add partition, drop partition
          o Truncate partition
          o Add partition and coalesce partition
      • Algorithm
      • Autoincrement
      • Locking
  24. IOD Errors
      • Locking
      • Timeouts
      • Tmpdir
      • DML
  25. IOD Crash Recovery
      • Secondary Indexes
      • Clustered Indexes (don't crash!)
  26. Of Interest
      • Tmpdir
      • Dropping Indexes with Foreign keys
      • Cascade on
      • Inconsistent .frm
  27. Monitoring
      • Rows Affected
      • Performance_schema and Information_schema
      • Progress
      • Replication lag
  28. IOD Server parms
      • innodb_online_alter_log_max_size
      • sort_buffer_size
  29. Case Study
      • Ticketing system
  30. Architecture
      • 1 master
      • 1 active slave
      • 1 analytics/indexing slave
      • 1 backup/reorg slave
  31. Issues
      • Growth
      • Rebalancing Shards
      • Deletions
      • Reorgs
  32. Timeline
      • Difficult failovers
      • Ticket Growth
      • FB OSC
      • PT OSC
      • Reorgs
  33. Questions
      David Turner
      dturner@palominodb.com
      Ben