This document discusses backup solutions for MySQL databases. It begins by defining logical and physical backups. For logical backups, it recommends mysqldump and mydumper/myloader. For physical backups, it recommends xtrabackup and snapshots. It provides details on using these tools and best practices such as regular testing of backups. It gives example setups for on-premises and Amazon Web Services environments, and shows how ClusterControl can be used to manage backups.
Become a MySQL DBA - slides: Deciding on a relevant backup solution
1. Copyright 2015 Severalnines AB
Deciding on a relevant backup solution
June 30, 2015
Krzysztof Książek
Severalnines
krzysztof@severalnines.com
2. Copyright 2015 Severalnines AB
! We want to help all non-DBA people who have to take care
of MySQL infrastructure
! Discuss most common activities
! Share tips and good practices
! If you missed it, we’d like to encourage you to watch the
replay of “Deep Dive Into How to Monitor Galera Cluster”
! http://www.severalnines.com/blog/deep-dive-how-monitor-galera-cluster-mysql-mariadb-percona-xtradb-webinar-replay
“Become a MySQL DBA” series
3. Copyright 2015 Severalnines AB
! Backup types
! Tools
! How backups are done in ClusterControl (CC)
! Good practices
! Example setups
Agenda
6. Copyright 2015 Severalnines AB
! Generate a plain text file: SQL, CSV, tab-separated
! Use SQL commands to load data
! INSERT INTO …
! LOAD DATA INFILE …
! Make it possible to recover even a single row
! A must-have for major version upgrades
Logical backups
7. Copyright 2015 Severalnines AB
! Arguably the best known backup tool for MySQL
! Ability to dump tables, schemas
! Ability to recover even a single row (it’s easier with
--extended-insert=0)
! Locking - yes, but it’s not as big a problem as you may
think (for InnoDB tables)
! May generate large SQL files; consider dumping separate
tables rather than the full schema
mysqldump
8. Copyright 2015 Severalnines AB
! Can be used to build a slave (--master-data)
! Long recovery time (need to parse all that SQL)
! Single-threaded (you can run it in parallel on a per-table
basis, though - a little bit of scripting required)
! Character set may be tricky if you don't pay attention
! Did I mention long recovery time?
mysqldump
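The mysqldump points above can be sketched as a small Python helper that only assembles command lines and does not run them. `--single-transaction`, `--master-data`, `--skip-extended-insert` (the documented way to turn extended inserts off) and `--default-character-set` are existing mysqldump options; the function names, the `shop` schema, the table names and the `utf8mb4` default are assumptions made for illustration.

```python
from typing import List


def mysqldump_cmd(schema: str, table: str = "", row_per_insert: bool = False,
                  for_slave: bool = False, charset: str = "utf8mb4") -> List[str]:
    """Assemble (not run) a mysqldump command line for one schema or table."""
    cmd = [
        "mysqldump",
        "--single-transaction",                # consistent InnoDB dump without long locks
        f"--default-character-set={charset}",  # be explicit about the character set
    ]
    if row_per_insert:
        cmd.append("--skip-extended-insert")   # one INSERT per row, easier single-row recovery
    if for_slave:
        cmd.append("--master-data=2")          # record binlog coordinates as a comment
    cmd.append(schema)
    if table:
        cmd.append(table)
    return cmd


def per_table_cmds(schema: str, tables: List[str]) -> List[List[str]]:
    """One command per table, so a wrapper script can run them in parallel."""
    return [mysqldump_cmd(schema, t) for t in tables]


if __name__ == "__main__":
    # Hypothetical schema and table names.
    for cmd in per_table_cmds("shop", ["orders", "customers"]):
        print(" ".join(cmd))
```

Keep in mind that per-table dumps started separately each open their own transaction, so the resulting files are not guaranteed to be consistent with one another.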
9. Copyright 2015 Severalnines AB
! Available as a separate mode in mysqldump
! Less performance overhead in restoring
! More tricky to restore (do you remember all those
‘terminated by’, ‘enclosed by’ settings?)
! Can be used to generate CSV files - for compatibility
SELECT INTO OUTFILE
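One way to avoid forgetting the format settings is to generate the export and import statements from the same definition. The sketch below builds a `SELECT ... INTO OUTFILE` and the matching `LOAD DATA INFILE` statement as strings; the helper names, the `orders` table and the file path are hypothetical, and the chosen separators are just one common CSV convention.

```python
def csv_clauses(sep: str = ",", quote: str = '"', eol: str = "\\n") -> str:
    """Format clauses; the very same string is used for export and import."""
    return (f"FIELDS TERMINATED BY '{sep}' "
            f"OPTIONALLY ENCLOSED BY '{quote}' "
            f"LINES TERMINATED BY '{eol}'")


def dump_sql(table: str, path: str) -> str:
    """Statement that writes the table to a file on the database server."""
    return f"SELECT * FROM {table} INTO OUTFILE '{path}' {csv_clauses()};"


def load_sql(table: str, path: str) -> str:
    """Statement that loads the file back using identical format settings."""
    return f"LOAD DATA INFILE '{path}' INTO TABLE {table} {csv_clauses()};"


if __name__ == "__main__":
    print(dump_sql("orders", "/tmp/orders.csv"))
    print(load_sql("orders", "/tmp/orders.csv"))
```

Because both statements share `csv_clauses()`, a change to the separator or quoting cannot get out of sync between dump and restore.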
10. Copyright 2015 Severalnines AB
! Like mysqldump, only different - allows parallelization and
splits tables into chunks
! Data only, you need to rely on mysqldump to grab the
schema
! Tends to be hard to install - source code only (RPMs
appeared recently, I haven’t tested them yet)
! Buggy, although it’s system-dependent and if you get it
to work, it should work just fine
mydumper/myloader
11. Copyright 2015 Severalnines AB
! Pretty nice dump time (1 TB in roughly 4-6 h, YMMV of course)
! Long loading time (but not as long as with mysqldump)
! The fastest logical backup I know - you may need to get
familiar with it if you have a large data set and plan an upgrade
mydumper/myloader
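To put the "1 TB in 4-6 h" figure into perspective, it corresponds to an average throughput of roughly 46-69 MB/s. The sketch below does that arithmetic and scales the figure to other data sizes. Linear scaling and decimal units (1 GB = 1000 MB) are assumptions; real dump times depend on hardware, schema and thread count.

```python
from typing import Tuple


def throughput_mb_s(size_gb: float, hours: float) -> float:
    """Average throughput in MB/s (decimal units: 1 GB = 1000 MB)."""
    return size_gb * 1000 / (hours * 3600)


def estimated_dump_hours(size_gb: float,
                         ref_gb: float = 1000.0,
                         ref_hours: Tuple[float, float] = (4.0, 6.0)
                         ) -> Tuple[float, float]:
    """Scale the '1 TB in 4-6 h' figure linearly to another data size."""
    scale = size_gb / ref_gb
    return ref_hours[0] * scale, ref_hours[1] * scale


if __name__ == "__main__":
    print(round(throughput_mb_s(1000, 4), 1))  # fastest case from the slide
    print(round(throughput_mb_s(1000, 6), 1))  # slowest case from the slide
    print(estimated_dump_hours(250))           # rough estimate for 250 GB
```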
14. Copyright 2015 Severalnines AB
! Generate an exact copy of the data
! Tend to work at a high level - restore all or nothing
! Fast way of grabbing a copy of your data
! Fast way of restoring your data
! Limitation is usually the hardware (disk, network)
! Great for building the infrastructure
Physical backups
15. Copyright 2015 Severalnines AB
! _The_ backup solution
! Online backup for InnoDB tables
! “Virtually” non-locking
! Works by copying the data files and logging transactions
which happened in the meantime
! If you have MyISAM, you’ll get locked. Don’t use MyISAM
xtrabackup
16. Copyright 2015 Severalnines AB
! Supports local backups and streaming over the network
! Supports incremental backups
! Backup needs to be prepared (transactions from the log
have to be applied)
! innobackupex --apply-log /path/to/BACKUP-DIR
! Remember about --use-memory when applying logs.
Memory speeds things up
xtrabackup
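The prepare step gets more involved once incremental backups are in the picture. The sketch below lists, in order, the innobackupex commands that would prepare a full backup plus its incrementals, following the pattern described in the Percona XtraBackup 2.x documentation: `--redo-only` on the base and on every incremental except the last, then a final `--apply-log` on the base. The directory names and the 4G memory value are assumptions, and the commands are only assembled, not executed. Newer XtraBackup releases drop the innobackupex wrapper in favour of `xtrabackup --prepare`.

```python
from typing import List


def prepare_cmds(base_dir: str, incrementals: List[str],
                 use_memory: str = "4G") -> List[List[str]]:
    """Ordered innobackupex commands to prepare a full backup and its incrementals."""
    common = ["innobackupex", "--apply-log", f"--use-memory={use_memory}"]
    if not incrementals:
        # Plain full backup: a single apply-log pass is enough.
        return [common + [base_dir]]
    # Replay committed transactions on the base, but skip the rollback phase.
    cmds = [common + ["--redo-only", base_dir]]
    for i, inc in enumerate(incrementals):
        step = list(common)
        if i < len(incrementals) - 1:
            step.append("--redo-only")  # every incremental except the last one
        step += [base_dir, f"--incremental-dir={inc}"]
        cmds.append(step)
    # Final pass rolls back uncommitted transactions.
    cmds.append(common + [base_dir])
    return cmds


if __name__ == "__main__":
    # Hypothetical backup directories.
    for cmd in prepare_cmds("/backups/full", ["/backups/inc1", "/backups/inc2"]):
        print(" ".join(cmd))
```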
17. Copyright 2015 Severalnines AB
! Supports partial backups
! Per schema
! Per table
! This comes in handy when restoring missing data; it speeds
things up
! Can bundle replication information with the backup
! Can bundle Galera’s sequence number with the backup
xtrabackup
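As a rough illustration of the options on this slide, the helper below assembles an innobackupex command for a partial backup and for bundling replication or Galera information. `--databases`, `--slave-info` and `--galera-info` are existing innobackupex options; the function name, schema names and target directory are hypothetical, and nothing is executed.

```python
from typing import List, Optional


def backup_cmd(target_dir: str, databases: Optional[List[str]] = None,
               slave_info: bool = False, galera_info: bool = False) -> List[str]:
    """Assemble (not run) an innobackupex backup command."""
    cmd = ["innobackupex"]
    if databases:
        # Entries may be "schema" or "schema.table", separated by spaces.
        cmd.append("--databases=" + " ".join(databases))
    if slave_info:
        cmd.append("--slave-info")    # store the replication coordinates of a slave
    if galera_info:
        cmd.append("--galera-info")   # store the Galera sequence number
    cmd.append(target_dir)
    return cmd


if __name__ == "__main__":
    print(backup_cmd("/backups", databases=["shop", "crm.accounts"], slave_info=True))
```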
18. Copyright 2015 Severalnines AB
! LVM, EBS snapshot, SAN snapshot, you name it
! Grabs the whole data at once
! Usually it’s pretty fast
! Great for building infrastructure, especially in the cloud
! Not so great for recovering small pieces of data
Snapshots
19. Copyright 2015 Severalnines AB
! Snapshots have to be consistent to be useful
! innodb_flush_log_at_trx_commit=1 works for InnoDB but
you’ll have to go through InnoDB recovery
! Running FLUSH TABLES WITH READ LOCK (FTWRL) may work for
all engines but you’ll still have to go through the InnoDB
recovery process
! Cold backup is the best but it’s also expensive (to shut
down MySQL you need to have a separate host)
Snapshots
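The ordering matters when taking a snapshot under a read lock, so here is a minimal sketch of the sequence for an LVM snapshot. It only returns the steps as data. The volume path, snapshot name and size are hypothetical. The important detail is that the lock is released as soon as the session that took it disconnects, so all SQL steps have to run in one session that stays open while `lvcreate` runs.

```python
from typing import List, Tuple


def lvm_snapshot_plan(volume: str, snap_name: str,
                      size: str = "10G") -> List[Tuple[str, str]]:
    """Ordered (where, what) steps; the SQL steps must share ONE open session."""
    return [
        ("mysql", "FLUSH TABLES WITH READ LOCK;"),   # block writes
        ("mysql", "SHOW MASTER STATUS;"),            # note the binlog position
        ("shell", f"lvcreate --snapshot --size {size} --name {snap_name} {volume}"),
        ("mysql", "UNLOCK TABLES;"),                 # release the lock right away
    ]


if __name__ == "__main__":
    # Hypothetical logical volume holding the MySQL data directory.
    for where, what in lvm_snapshot_plan("/dev/vg0/mysql", "mysql_snap"):
        print(f"[{where}] {what}")
```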
20. Copyright 2015 Severalnines AB
! Great tool to implement snapshots in EC2
! Supports:
! Cold backup
! FLUSH TABLES WITH READ LOCK
! fsfreeze / xfs_freeze
! RAIDed volumes
ec2-consistent-snapshot
22. Copyright 2015 Severalnines AB
! Backups are taken at a given time
! How to recover data removed between backups?
! Binary logs store all modifications and can be used to
replay changes. As long as binlogs are enabled, that is.
! Using mysqlbinlog you can easily convert binary logs
into SQL format
! mysqlbinlog binary_log.00001 > data.sql
! add --base64-output=DECODE-ROWS --verbose to read
row-based replication (RBR) events
Point-in-time recovery
23. Copyright 2015 Severalnines AB
! Whole process is fairly simple:
! Use a backup prior to data loss to recover the system
! Grab binary log position at the time when the
backup was taken (SHOW MASTER STATUS,
xtrabackup_binlog_info)
! mysqlbinlog --start-position=xxx
! Find the point of the data loss, identify position before
it
! mysqlbinlog --start-position=xxx --stop-position=yyy
Point-in-time recovery
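The steps above can be sketched in a few lines of Python: read the binary log coordinates from the contents of `xtrabackup_binlog_info` (binlog file name and position, separated by whitespace) and assemble the mysqlbinlog command that replays changes from that position up to a chosen stop position. The function names, the sample file name and the positions are made up for illustration; the command is only built, not executed.

```python
from typing import List, Optional, Tuple


def parse_binlog_info(text: str) -> Tuple[str, int]:
    """Return (binlog file, position) from xtrabackup_binlog_info contents."""
    fields = text.split()
    return fields[0], int(fields[1])


def replay_cmd(binlogs: List[str], start: int,
               stop: Optional[int] = None) -> List[str]:
    """mysqlbinlog command covering the changes made after the backup."""
    cmd = ["mysqlbinlog", f"--start-position={start}"]
    if stop is not None:
        cmd.append(f"--stop-position={stop}")  # position just before the data loss
    return cmd + binlogs


if __name__ == "__main__":
    # Hypothetical coordinates taken from a backup.
    binlog, pos = parse_binlog_info("binlog.000042\t1337\n")
    print(" ".join(replay_cmd([binlog], pos, stop=2000)), "| mysql")
```

When several binary log files are passed, mysqlbinlog applies the start position to the first file and the stop position to the last one.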
25. Copyright 2015 Severalnines AB
! Define a schedule
! Define a method
! Define a backup type (full or incremental)
! Define where it should be taken
! Define where it should be stored
Backups in ClusterControl
27. Copyright 2015 Severalnines AB
! Restore the cluster with a couple of clicks
! Support for mysqldump and xtrabackup
! Support for full and incremental backups
! Ability to store backups offsite (upload to S3 or Glacier)
Backups in ClusterControl
29. Copyright 2015 Severalnines AB
! Ensure the backups are being made
! Ensure that their size makes sense
! Ensure the logs are free of errors
! Automate this process to make your life easier
Daily/weekly healthcheck
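A healthcheck like this is easy to automate. The following is a minimal sketch, assuming each backup is a single file in one directory: it verifies that a recent backup exists and that it is not suspiciously small compared to the previous one. The thresholds (26 hours, 50% of the previous size) and the function name are assumptions to adjust to your own schedule; checking the backup logs for errors is left out.

```python
import os
import time
from typing import List, Optional


def check_backups(directory: str, max_age_hours: float = 26.0,
                  min_size_ratio: float = 0.5,
                  now: Optional[float] = None) -> List[str]:
    """Return a list of problems found; an empty list means the check passed."""
    now = time.time() if now is None else now
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    files = sorted((p for p in paths if os.path.isfile(p)), key=os.path.getmtime)
    if not files:
        return ["no backups found"]
    problems = []
    newest = files[-1]
    age_h = (now - os.path.getmtime(newest)) / 3600
    if age_h > max_age_hours:
        problems.append(f"newest backup is {age_h:.1f}h old")
    if len(files) > 1:
        prev_size = os.path.getsize(files[-2])
        if os.path.getsize(newest) < min_size_ratio * prev_size:
            problems.append("newest backup is much smaller than the previous one")
    return problems


if __name__ == "__main__":
    # Checks the current directory only as a demonstration.
    print(check_backups(".") or "backups look fine")
```

Run from cron or a monitoring system, a non-empty result would be the signal to alert someone.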
30. Copyright 2015 Severalnines AB
! Perform it regularly, e.g. every other month or twice per year.
! Perform it after you make any changes to the backup
process
! It should cover the whole recovery process:
! Decompress/decrypt the backup
! Build and start a new instance using this data
! Slave it off the master using the data from the backup
Restore test
31. Copyright 2015 Severalnines AB
! Have one
! Store your backups outside of the main datacenter
! Assess time needed to transfer the data back and
recover it
! Prepare detailed runbooks - when in a rush you either stick
to the runbook or make mistakes
Disaster recovery plan
34. Copyright 2015 Severalnines AB
! xtrabackup, full backup daily, incremental backups every
4h
! Store locally and copy to a separate backup server
! Ensure you make a copy of binary logs too
! This should give you decent recovery speed when doing
Point-in-Time recovery
! If you can use LVM, it’s also feasible - remember about
data consistency, though
On premises
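To illustrate the schedule above, the sketch below decides which backup to take at a given hour (full at midnight, incremental every 4 hours) and picks the chain of backups needed to restore to a given point in time. Whatever happened after the last backup in the chain has to be replayed from the binary logs. The data layout, using (hour, type) tuples, and the function names are assumptions made for illustration.

```python
from typing import List, Tuple

Backup = Tuple[float, str]  # (hours since an arbitrary start, "full" or "incr")


def backup_type(hour: int, incr_every: int = 4) -> str:
    """Full backup at midnight, incremental every `incr_every` hours, else none."""
    if hour == 0:
        return "full"
    return "incr" if hour % incr_every == 0 else "none"


def restore_chain(backups: List[Backup], target: float) -> List[Backup]:
    """Latest full backup at or before `target` plus the incrementals after it."""
    usable = sorted(b for b in backups if b[0] <= target)
    fulls = [i for i, b in enumerate(usable) if b[1] == "full"]
    if not fulls:
        return []  # nothing to restore from
    return usable[fulls[-1]:]


if __name__ == "__main__":
    history = [(0, "full"), (4, "incr"), (8, "incr"), (24, "full"), (28, "incr")]
    print(restore_chain(history, 9.5))  # binlogs cover the time after hour 8
```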
35. Copyright 2015 Severalnines AB
! If you have an additional server available (for ad-hoc
queries for example), you can use it to:
! Take a logical backup for easier recovery of small bits
of data
! Set up LVM for taking cold backups of the whole
dataset
On premises
36. Copyright 2015 Severalnines AB
! EBS snapshots are great but it’s hard to take them frequently
! Remember about the consistency requirements
! xtrabackup and incremental backups may be useful
when you need to take a backup every 10 minutes or so
! ec2-consistent-snapshot will help on RAIDed setups
! On a regular basis, copy the snapshot to a different region
for DR
Amazon Web Services
37. Copyright 2015 Severalnines AB
! If you have an additional server available (for ad-hoc
queries for example), you can use it to:
! Take a logical backup for easier recovery of small bits
of data
! Set up ec2-consistent-snapshot for taking cold backups
of the whole dataset
Amazon Web Services
38. Copyright 2015 Severalnines AB
! More blogs in “Become a MySQL DBA” series:
! http://www.severalnines.com/blog/become-dba-blog-series-monitoring-and-trending
! http://www.severalnines.com/blog/become-mysql-dba-blog-series-database-high-availability
! Contact: krzysztof@severalnines.com
Thank You!