2. Peter’s Background
•
Working with MySQL since 2003
•
Programming since 1995
•
Working with Linux since 1999
•
Presently Dir. of Database Ops at Clear Channel Media and Entertainment
– With Clear Channel since 2008; leads the DB team
– Users generate around 200 million page views per month
– MySQL used as primary data store
– Many instances of MySQL deployed
– Primary applications generate around 15K queries/sec
3. Introduction to Storage
•
Approach to storage will greatly affect things like availability, scalability,
replication, backup/recovery, and DR.
•
SAN vs. attached storage
•
Storage performance/architecture:
– Spindles (IOPS) vs. disk capacity
– Write caching
•
Using a hosting facility and leasing SAN storage:
– Disks/networking shared with other customers?
– API/interface available to take/restore snapshots
4. What is LVM
•
Logical Volume Management is a way to
virtualize storage
•
An abstraction layer over disk storage
6. Logical / Physical Mapping
Logical blocks in a volume:    VA:LB1   VA:LB2   VA:LB3
Physical blocks on disk/LUN:   PB1      PB2      PB3
VA = Volume A
7. Snapshots vs. Clones
•
A snapshot is a static/read only image of the data at a
specific point in time.
•
A clone is a dynamic/writeable image of the data at a
specific point in time.
•
A clone can appear to be a snap when mounted read
only
8. Creating a snapshot or clone
Logical blocks in a volume:    VA:LB1   VB:LB1
Physical blocks on disk/LUN:   PB1 (shared by both)
VA = volume A, VB = snapshot of VA
9. Copy-On-Write (COW)
Logical blocks in a volume:    VA:LB1   VB:LB1
Physical blocks on disk/LUN:   PB1 --(copy)--> PB2
On a write, PB1 is copied to PB2 and the old mapping to PB1 is dropped (crossed out in the diagram).
VA = volume A, VB = snapshot of VA
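The copy-on-write idea can be sketched as a toy model in Python. This is only an illustration, not how any particular volume manager is implemented: the class and method names are hypothetical, and it assumes the variant where the old data is copied to a new physical block for the snapshot before the original block is overwritten in place (other systems redirect the new write instead).

```python
class BlockStore:
    """Toy physical disk: a growing list of physical blocks."""

    def __init__(self):
        self.blocks = []

    def allocate(self, data):
        self.blocks.append(data)
        return len(self.blocks) - 1  # index of the new physical block


class Volume:
    """Maps logical block numbers to physical block indexes."""

    def __init__(self, store, mapping=None, read_only=False):
        self.store = store
        self.mapping = dict(mapping or {})
        self.read_only = read_only

    def read(self, lb):
        return self.store.blocks[self.mapping[lb]]

    def snapshot(self):
        # Only the mapping is copied; no data blocks are duplicated.
        return Volume(self.store, self.mapping, read_only=True)

    def write(self, lb, data, snapshots=()):
        if self.read_only:
            raise PermissionError("snapshots are read only")
        old_pb = self.mapping.get(lb)
        for snap in snapshots:
            if old_pb is not None and snap.mapping.get(lb) == old_pb:
                # Copy-on-write: preserve the old contents for the snapshot first.
                snap.mapping[lb] = self.store.allocate(self.store.blocks[old_pb])
        if old_pb is None:
            self.mapping[lb] = self.store.allocate(data)
        else:
            self.store.blocks[old_pb] = data  # overwrite in place


store = BlockStore()
va = Volume(store)
va.write(1, "v1")                  # VA:LB1 -> first physical block
vb = va.snapshot()                 # VB shares that block, nothing copied yet
va.write(1, "v2", snapshots=[vb])  # triggers the copy
print(va.read(1), vb.read(1), len(store.blocks))
```

Creating the snapshot costs no data blocks; the second physical block only appears once the shared block is written to.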
10. LVM doesn’t actually use blocks
•
File systems – block size
–
A block is a logical container of data with a configurable fixed size in
bytes. A block is written/read from disk in a single disk I/O operation.
•
RAID – stripe size
–
The smallest unit of storage allocation that can be written to each disk. It
is a fixed size measured in bytes. (e.g. RAID 0, 4, 5, 6)
•
LVM – extent size
–
The smallest unit of storage allocated to each physical volume. An
extent is a fixed size measured in bytes.
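Because allocation happens in whole extents, a volume's size is always rounded up to a multiple of the extent size. A small arithmetic sketch, assuming a 4 MiB extent (the LVM2 default when no size is specified); the function name is hypothetical:

```python
MIB = 1024 * 1024


def extents_needed(volume_bytes, extent_bytes=4 * MIB):
    """Number of whole extents required to hold volume_bytes (integer ceiling)."""
    return -(-volume_bytes // extent_bytes)


ten_gib = 10 * 1024 * MIB
print(extents_needed(ten_gib))      # 10 GiB at 4 MiB per extent
print(extents_needed(ten_gib + 1))  # one extra byte costs a full extra extent
```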
11. How it works: Device Mapper
•
Device Mapper is a Linux framework used to create block
devices which are mapped to other block devices.
It is the foundation for:
•
Linux mdadm (RAID)
•
Linux LVM2
•
File system encryption
•
and more
For more info: http://mbroz.fedorapeople.org/talks/DeviceMapperBasics/dm.pdf
12. Backing up MySQL
•
Backup basics:
– Online vs. offline & physical vs. logical
•
http://dev.mysql.com/doc/refman/5.5/en/backup-types.html
– mysqldump is good for portability, bad for fast recovery
•
Sequential export, not globally atomic, holds locks, long running processes
•
Can take a long time to restore
•
File system copy is impractical for a large dataset
•
Snapshot is extremely fast
•
Quiescing the database
•
Can be more frequent than traditional backups
•
Transfer backups to another medium
13. Quiescing MySQL
Initiate backup
– Get dirty page %, then set % to zero
– Set session timeout
– Call lock monitor
– Acquire global lock
– Lock monitor: sleep, then check for a hanging lock; if yes, kill the global lock; if no, the lock monitor terminates
– Check dirty pages: if dirty pages remain, sleep and check again, until clean or timeout
– Capture binlog position and processlist
– Take snapshot
– Release lock
– Restore dirty page %
Backup complete
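The main path of this flow can be sketched in Python. This is a minimal sketch under stated assumptions, not the presenter's script: `execute` and `take_snapshot` are hypothetical callables supplied by the caller, `execute` is assumed to run every statement on one and the same MySQL session (the global read lock belongs to the session that took it), and the lock monitor and session timeout steps are left out. The SQL statements themselves are standard MySQL.

```python
import time


def quiesce_and_snapshot(execute, take_snapshot, poll_seconds=1.0, max_wait_seconds=60.0):
    """Run the backup flow; execute(sql) returns a list of row tuples."""
    # Remember the current flush threshold so it can be restored afterwards.
    original_pct = execute("SELECT @@GLOBAL.innodb_max_dirty_pages_pct")[0][0]
    execute("SET GLOBAL innodb_max_dirty_pages_pct = 0")
    locked = False
    try:
        execute("FLUSH TABLES WITH READ LOCK")
        locked = True
        # Wait until no dirty pages remain, or give up waiting at the timeout.
        deadline = time.monotonic() + max_wait_seconds
        while time.monotonic() < deadline:
            rows = execute("SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty'")
            if int(rows[0][1]) == 0:
                break
            time.sleep(poll_seconds)
        binlog_position = execute("SHOW MASTER STATUS")
        processlist = execute("SHOW FULL PROCESSLIST")
        snapshot_id = take_snapshot()
    finally:
        # Always release the lock and restore the setting, even on failure.
        if locked:
            execute("UNLOCK TABLES")
        execute(f"SET GLOBAL innodb_max_dirty_pages_pct = {float(original_pct):g}")
    return snapshot_id, binlog_position, processlist


# Demo against a fake server; all returned values here are made up.
log = []


def fake_execute(sql):
    log.append(sql)
    if sql.startswith("SELECT @@GLOBAL"):
        return [(75,)]
    if "Innodb_buffer_pool_pages_dirty" in sql:
        return [("Innodb_buffer_pool_pages_dirty", "0")]
    if sql == "SHOW MASTER STATUS":
        return [("mysql-bin.000042", 107)]
    return []


result = quiesce_and_snapshot(fake_execute, lambda: "snap-demo")
print(result)
print(log[-2:])
```

Putting the unlock and the restore of the dirty page setting in a `finally` block means a failed snapshot does not leave the server locked.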
14. Recovery
•
Reverting to a backup cannot be undone
•
Future backups lost
16. Recovery
•
Near instant recovery, no need to copy data files,
untar or source in a dumpfile
•
Any slaves will need to be rebuilt
•
PITR – can be automated or performed manually
17. File System Architecture
Isolated: datadir in /var/lib/mysql/, binary logs in /var/lib/mysql-binlog
Unified: datadir and binary logs together in /var/lib/mysql/
•
Isolated
– Only datadir is restored, binlogs left untouched
– Simpler recovery process
– Transactions will be duplicated during PITR
– Slaves maintain binlog file and position
•
Unified
– Datadir and binlogs are restored
– Binlogs must be copied to another location for PITR
– More complicated recovery process
– Slave replication will fail
18. Granular Recovery
•
Table or row level recovery
•
Use a dedicated data recovery host
– Create clone of snap (MySQL needs r/w file system)
– Don’t work around this by keeping clones as backups
•
Extract via mysqldump then source into production
– Use WHERE clause (-w 'id=123')
– Pipe to sed to convert INSERT to REPLACE, etc.
•
MyISAM: copy the 3 files (.frm, .MYD, .MYI), rename, then RENAME TABLE
•
InnoDB: create dumpfile, source in with a new table name, then RENAME TABLE
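The row-level extract can be sketched in Python as follows. This is an illustration only: the database and table names are hypothetical, the helper names are made up for this example, and the regular expression stands in for the sed step. `--where` and `--no-create-info` are standard mysqldump options.

```python
import re


def mysqldump_row_command(database, table, where):
    """Argument list for dumping only the rows matching `where` from one table."""
    return ["mysqldump", "--no-create-info", f"--where={where}", database, table]


def insert_to_replace(dump_sql):
    """Rewrite INSERT statements as REPLACE, line by line."""
    return re.sub(r"^INSERT INTO", "REPLACE INTO", dump_sql, flags=re.MULTILINE)


command = mysqldump_row_command("shop", "orders", "id=123")
print(command)

sample_dump = "INSERT INTO `orders` VALUES (123,'open');\n"
converted = insert_to_replace(sample_dump)
print(converted)
```

The command would be run against the recovery host (for example with `subprocess.run`), and the converted output sourced into production.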
19. Retention
•
Process to purge snaps based on age
– Example: 10d; 7d, 4w, 2m; etc.
•
Move archive backups to another medium
– Will never be restored
– Disk usage grows while aging
– The greater the difference from the parent, the greater the I/O
overhead
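An age-based purge can be sketched in a few lines of Python. It assumes the simplest policy, a single maximum age; the snapshot names and dates are made up, and actually deleting the snapshots is left to whatever the storage system provides.

```python
from datetime import datetime, timedelta


def snapshots_to_purge(snapshots, now, max_age):
    """Return names of snapshots older than max_age, oldest first."""
    expired = [(taken, name) for name, taken in snapshots.items() if now - taken > max_age]
    return [name for taken, name in sorted(expired)]


now = datetime(2013, 1, 31)
snaps = {
    "nightly-01-30": datetime(2013, 1, 30),
    "nightly-01-15": datetime(2013, 1, 15),
    "nightly-01-01": datetime(2013, 1, 1),
}
purge = snapshots_to_purge(snaps, now, timedelta(days=10))
print(purge)
```

A tiered policy (daily, weekly, monthly) could be built by running the same check per tier with a different maximum age.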
20. Replication
•
Run slaves on clone of parent (master)
21. Replication – shared disk
Underlying physical disks are shared for master/slave volumes
Master:  /vol/mysql_master
Slave1:  /vol/mysql_slave1
Slave2:  /vol/mysql_slave2
Slave3:  /vol/mysql_slave3
All four volumes sit on the same aggregate.
22. Replication
•
Shared blocks on parent and slaves
•
Add slaves quickly to add capacity
•
Removes need for slave resync tools
23. Resync a slave
Re-syncing a slave or reclaiming disk space is fast and easy
– Destroy the old clone volume /vol/mysql_slave1 (Slave1)
– Create a new volume /vol/mysql_slave1 as a clone of /vol/mysql_master (Master1)
24. Slave Reclone
•
Create a clone of a backup snapshot
– Will require time for replication to catch up
•
Clone from live master
– Will need to quiesce the master, as during a
backup
•
Consider urgency and timing
25. Replication
•
Entire reclone process should be automated
•
No need to back up slaves
•
Reclone regularly to reclaim space
•
Reclone after recovering from backup
•
Mostly static or r/o data? Consider no MySQL replication
– Replication is inefficient for large volumes of static data
– Consider a reclone instead
•
Easier to rely on SAN for replication to DR facility than replicating multiple MySQL
instances
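An automated reclone could be structured roughly as below. This is a sketch under assumptions: the `storage` object and its `destroy_volume` and `clone_volume` methods, and the `stop_mysql`, `start_mysql`, and `run_sql` callables, are hypothetical stand-ins for whatever the SAN API and host tooling provide; only the `CHANGE MASTER TO` and `START SLAVE` statements are standard MySQL. The binlog file and position are assumed to be the coordinates captured when the source snapshot was taken.

```python
def change_master_sql(log_file, log_pos):
    """Statement pointing a freshly cloned slave at the captured binlog coordinates."""
    if not isinstance(log_pos, int) or log_pos < 0:
        raise ValueError("log_pos must be a non-negative integer")
    escaped = log_file.replace("\\", "\\\\").replace("'", "\\'")
    return f"CHANGE MASTER TO MASTER_LOG_FILE='{escaped}', MASTER_LOG_POS={log_pos}"


def reclone_slave(storage, stop_mysql, start_mysql, run_sql,
                  master_volume, slave_volume, log_file, log_pos):
    """Replace the slave's volume with a fresh clone and restart replication."""
    stop_mysql()                                       # never destroy a volume in use
    storage.destroy_volume(slave_volume)               # also reclaims disk space
    storage.clone_volume(master_volume, slave_volume)
    start_mysql()
    run_sql(change_master_sql(log_file, log_pos))
    run_sql("START SLAVE")


# Demo with fakes that only record what would have happened.
events = []


class FakeStorage:
    def destroy_volume(self, name):
        events.append(f"destroy {name}")

    def clone_volume(self, source, target):
        events.append(f"clone {source} -> {target}")


reclone_slave(
    FakeStorage(),
    stop_mysql=lambda: events.append("stop mysqld"),
    start_mysql=lambda: events.append("start mysqld"),
    run_sql=events.append,
    master_volume="/vol/mysql_master",
    slave_volume="/vol/mysql_slave1",
    log_file="mysql-bin.000042",
    log_pos=107,
)
for event in events:
    print(event)
```

Passing the storage and service operations in as parameters keeps the sequence testable without touching a real SAN or database.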
26. Other Things to Consider
•
Monitor for most recent backup
•
Automated recovery testing host
•
Monitor for backup quality
– Recoverable
– Dirty
•
ETL host with automated reclones from
backup