MySQL with Enterprise Storage
Presented by Peter Teitelbaum
Peter’s Background
• Working with MySQL since 2003
• Programming since 1995
• Working with Linux since 1999
• Presently Dir. of Database Ops at Clear Channel Media and Entertainment
     – With Clear Channel since 2008 and leads the DB team
     – Users generate around 200 million page views per month
     – MySQL used as primary data store
     – Many instances of MySQL deployed
     – Primary applications generate around 15K queries/sec
Introduction to Storage
• The approach to storage greatly affects availability, scalability, replication, backup/recovery, and disaster recovery (DR).
• SAN vs. attached storage
• Storage performance/architecture:
     – Spindles (IOPS) vs. disk capacity
     – Write caching
• Using a hosting facility and leasing SAN storage:
     – Are disks/networking shared with other customers?
     – Is an API/interface available to take/restore snapshots?
What is LVM?
• Logical Volume Management (LVM) is a way to virtualize storage
• An abstraction layer over disk storage
LV / VG / PV Relationship

Logical Volumes     /dev/myvolgroup/logs    /dev/myvolgroup/data    /dev/myvolgroup/images
                           10GB                    600GB                    250GB

Volume Group                                  /dev/myvolgroup
                                                  1500GB

Physical Volumes         /dev/sda1               /dev/sdb1               /dev/sdc1
                           500GB                   500GB                   500GB
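
The capacity arithmetic in the diagram above can be checked with a small model. This is a minimal sketch in Python, not an LVM API: the `VolumeGroup` class and its method names are hypothetical, and sizes are tracked in whole gigabytes for simplicity.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VolumeGroup:
    """Toy model of a volume group: pooled physical volumes, carved into logical volumes."""
    name: str
    physical_volumes_gb: Dict[str, int]
    logical_volumes_gb: Dict[str, int] = field(default_factory=dict)

    @property
    def size_gb(self) -> int:
        # The group's capacity is the sum of its physical volumes.
        return sum(self.physical_volumes_gb.values())

    @property
    def free_gb(self) -> int:
        # Space not yet handed out to any logical volume.
        return self.size_gb - sum(self.logical_volumes_gb.values())

    def create_lv(self, name: str, size_gb: int) -> str:
        """Allocate a logical volume and return its device path."""
        if size_gb > self.free_gb:
            raise ValueError(f"only {self.free_gb}GB free in {self.name}")
        self.logical_volumes_gb[name] = size_gb
        return f"/dev/{self.name}/{name}"
```

With the three 500GB physical volumes from the slide, the group holds 1500GB. Creating the 10GB, 600GB, and 250GB logical volumes leaves 640GB unallocated, which is the space available for growing volumes or holding snapshot data.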
Logical / Physical Mapping

Logical blocks in a volume      VA LB1    VA LB2    VA LB3
                                  |         |         |
Physical blocks on disk/LUN      PB1       PB2       PB3

VA = Volume A
Snapshots vs. Clones
• A snapshot is a static/read-only image of the data at a specific point in time.
• A clone is a dynamic/writeable image of the data at a specific point in time.
• A clone can appear to be a snap when mounted read-only
Creating a snapshot or clone

Logical blocks in a volume      VA LB1      VB LB1
                                     \      /
Physical blocks on disk/LUN            PB1

VA = volume A, VB = snapshot of VA
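
On a host that uses LVM2, creating such a snapshot is a single `lvcreate` call. The sketch below only builds the argument list; it does not run anything. The function name and the example volume names are hypothetical, and SAN or filer products expose their own snapshot and clone commands instead.

```python
from typing import List


def lvm_snapshot_command(origin: str, snapshot_name: str, cow_size: str,
                         read_only: bool = True) -> List[str]:
    """Build an lvcreate argument list for a snapshot of an existing logical volume.

    cow_size is the space reserved for blocks copied out of the origin,
    not the size of the origin itself.
    """
    cmd = ["lvcreate", "--snapshot", "--name", snapshot_name, "--size", cow_size]
    if read_only:
        # LVM2 snapshots are writable unless a read-only permission is requested.
        cmd += ["--permission", "r"]
    cmd.append(origin)
    return cmd
```

For example, `lvm_snapshot_command("/dev/myvolgroup/data", "data_snap", "50G")` describes a read-only snapshot of the data volume, and passing `read_only=False` gives the writeable behaviour that the slides call a clone.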
Copy-On-Write (COW)

                                (write)
                                   |
Logical blocks in a volume      VA LB1               VB LB1
                                   |                    |     (VB's link to PB1 is removed)
Physical blocks on disk/LUN       PB1  ---(copy)--->   PB2

VA = volume A, VB = snapshot of VA
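
The copy-on-write behaviour in the diagram can be sketched in a few lines of Python. This is an in-memory illustration only, with hypothetical class names: before the origin overwrites a block, the old contents are copied aside for the snapshot, so the snapshot keeps seeing the point-in-time data.

```python
from typing import Dict, List


class Snapshot:
    """Point-in-time view of an Origin; stores only blocks that changed since creation."""

    def __init__(self, origin: "Origin") -> None:
        self.origin = origin
        self.copied: Dict[int, bytes] = {}  # block index -> preserved old contents

    def preserve(self, index: int) -> None:
        # Copy the block out only the first time it is about to be overwritten.
        if index not in self.copied:
            self.copied[index] = self.origin.blocks[index]

    def read(self, index: int) -> bytes:
        # Unchanged blocks are still shared with the origin.
        if index in self.copied:
            return self.copied[index]
        return self.origin.blocks[index]


class Origin:
    """Live volume (VA in the diagram)."""

    def __init__(self, blocks: List[bytes]) -> None:
        self.blocks = list(blocks)
        self.snapshots: List[Snapshot] = []

    def snapshot(self) -> Snapshot:
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index: int, data: bytes) -> None:
        # Copy-on-write: preserve the old block for every snapshot, then overwrite.
        for snap in self.snapshots:
            snap.preserve(index)
        self.blocks[index] = data
```

Creating the snapshot copies nothing, which is why it is nearly instant. The cost shows up later as one extra copy per block on the first write after the snapshot, and the snapshot's space usage grows as the origin diverges from it.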
LVM doesn’t actually use blocks
• File systems – block size
     – A block is a logical container of data with a configurable fixed size in bytes. A block is written/read from disk in a single disk I/O operation.
• RAID – stripe size
     – The smallest unit of storage allocation that can be written to each disk. It is a fixed size measured in bytes (e.g., RAID 0, 4, 5, 6).
• LVM – extent size
     – The smallest unit of storage that can be allocated from a physical volume to a logical volume. An extent is a fixed size measured in bytes.
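
Because LVM allocates in whole extents, a logical volume's real size is always a multiple of the extent size. A minimal Python sketch of that arithmetic follows; the function names are hypothetical, and the 4 MiB default is the LVM2 default extent size, which can be changed when the volume group is created.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def extents_needed(lv_size_bytes: int, extent_size_bytes: int = 4 * MIB) -> int:
    """Number of whole extents required to hold lv_size_bytes (rounded up)."""
    if lv_size_bytes <= 0 or extent_size_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return -(-lv_size_bytes // extent_size_bytes)


def allocated_bytes(lv_size_bytes: int, extent_size_bytes: int = 4 * MIB) -> int:
    """Actual space consumed once the request is rounded up to whole extents."""
    return extents_needed(lv_size_bytes, extent_size_bytes) * extent_size_bytes
```

Treating the slide's 600GB data volume as 600 GiB, `extents_needed(600 * GIB)` gives 153600 extents of 4 MiB each.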
How it works: Device Mapper
• Device Mapper is a Linux framework used to create block devices which are mapped to other block devices.
• It is the foundation for:
     – Linux software RAID via dm-raid and dmraid (mdadm itself uses the separate md driver)
     – Linux LVM2
     – Disk encryption (dm-crypt)
     – and more

For more info: http://mbroz.fedorapeople.org/talks/DeviceMapperBasics/dm.pdf
Backing up MySQL
• Backup basics:
     – Online vs. offline & physical vs. logical
          • http://dev.mysql.com/doc/refman/5.5/en/backup-types.html
     – mysqldump is good for portability, bad for fast recovery
          • Sequential export, not globally atomic, holds locks, long-running processes
          • Can take a long time to restore
• A file system copy is impractical for a large dataset
• A snapshot is extremely fast
• Requires quiescing the database
• Can be more frequent than traditional backups
• Transfer backups to another medium
Quiescing MySQL

Main flow:
    Initiate backup → Get dirty page % → Set % to zero → Set session timeout
    → Call lock monitor → Acquire global lock → Check dirty pages
    → Dirty pages? Or timeout
         yes: Sleep → Check dirty pages
         no:  Capture binlog position and processlist → Take snapshot
              → Release lock → Restore dirty page % → Backup complete

Lock monitor:
    Sleep → Hanging lock?
         yes: Kill global lock
         no:  Lock monitor terminates
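
The main flow above might look roughly like the following Python sketch. Several things here are assumptions rather than the presenter's implementation: `execute` and `take_snapshot` are hypothetical callables supplied by the caller (a database cursor wrapper and a storage API call), "dirty page %" is taken to mean `innodb_max_dirty_pages_pct`, the session timeout is taken to be `lock_wait_timeout`, the wait loop is read as "poll until no dirty pages remain or a time limit passes", and the separate lock monitor is omitted.

```python
import time
from typing import Callable, Dict, List, Sequence

Rows = List[Sequence]


def quiesce_and_snapshot(
    execute: Callable[[str], Rows],       # hypothetical: runs SQL, returns result rows
    take_snapshot: Callable[[], None],    # hypothetical: calls the SAN/LVM snapshot API
    max_wait_s: float = 60.0,
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, object]:
    """Flush InnoDB, take a global read lock, snapshot, then restore settings."""
    # Get the current dirty page % so it can be restored afterwards.
    original_pct = execute("SHOW GLOBAL VARIABLES LIKE 'innodb_max_dirty_pages_pct'")[0][1]
    execute("SET GLOBAL innodb_max_dirty_pages_pct = 0")
    locked = False
    try:
        # Assumption: bound how long this session waits for the global lock.
        execute("SET SESSION lock_wait_timeout = 30")
        execute("FLUSH TABLES WITH READ LOCK")
        locked = True
        # Poll until no dirty pages remain or the time limit is reached.
        waited = 0.0
        while waited < max_wait_s:
            dirty = int(execute("SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty'")[0][1])
            if dirty == 0:
                break
            sleep(poll_s)
            waited += poll_s
        # Record where the snapshot sits in the binary log, plus what was running.
        binlog = execute("SHOW MASTER STATUS")
        processlist = execute("SHOW FULL PROCESSLIST")
        take_snapshot()
    finally:
        # Always release the lock and restore the original setting.
        if locked:
            execute("UNLOCK TABLES")
        execute(f"SET GLOBAL innodb_max_dirty_pages_pct = {original_pct}")
    return {
        "binlog_file": binlog[0][0],
        "binlog_pos": int(binlog[0][1]),
        "processlist": processlist,
    }
```

The global read lock lives only as long as the session that took it, so every statement has to run on the same connection. Passing the connection in as `execute` keeps that explicit and lets the flow be exercised against a fake in tests.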
Recovery
• Reverting to a backup cannot be undone
• Future backups lost
Recovery & Future Backups
[Diagram: backups]
Recovery
• Near-instant recovery: no need to copy data files, untar, or source in a dumpfile
• Any slaves will need to be rebuilt
• PITR (point-in-time recovery) can be automated or performed manually
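
For manual PITR, the binlog position captured when the snapshot was taken is the starting point for replaying later events with `mysqlbinlog`. The Python sketch below only assembles the command line; the function name, file name, and values are hypothetical examples.

```python
from typing import List, Optional


def mysqlbinlog_command(binlog_files: List[str], start_position: int,
                        stop_datetime: Optional[str] = None) -> List[str]:
    """Build a mysqlbinlog argument list that replays events after a snapshot.

    start_position applies to the first file listed; stop_datetime, if given,
    ends the replay just before the unwanted change (e.g. an accidental DROP).
    """
    if not binlog_files:
        raise ValueError("at least one binlog file is required")
    cmd = ["mysqlbinlog", f"--start-position={start_position}"]
    if stop_datetime is not None:
        cmd.append(f"--stop-datetime={stop_datetime}")
    cmd.extend(binlog_files)
    return cmd
```

The output of the resulting command is SQL text, which is typically piped into the `mysql` client on the restored instance.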
File System Architecture

Isolated:   Datadir /var/lib/mysql/     Binary Logs /var/lib/mysql-binlog
Unified:    Datadir & Binary Logs /var/lib/mysql/

• Isolated
     – Only datadir is restored, binlogs left untouched
     – Simpler recovery process
     – Transactions will be duplicated during PITR
     – Slaves maintain binlog file and position
• Unified
     – Datadir and binlogs are restored
     – Binlogs must be copied to another location for PITR
     – More complicated recovery process
     – Slave replication will fail
Granular Recovery
• Table or row level recovery
• Use a dedicated data recovery host
     – Create clone of snap (MySQL needs r/w file system)
     – Don’t circumvent this by using clones as backups
• Extract via mysqldump then source into production
     – Use WHERE clause (-w 'id=123')
     – Pipe to sed to convert INSERT to REPLACE, etc.
• MyISAM: copy the 3 files (.frm, .MYD, .MYI), rename, then RENAME TABLE
• InnoDB: create dumpfile, source in with a new table name, then RENAME TABLE
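
The row-level extraction step can be sketched as follows, in Python rather than sed. The function names and the example database and table are hypothetical; the mysqldump options used are `--where` (the long form of `-w`) and `--no-create-info`, which omits the CREATE TABLE statement so the production table is left in place.

```python
import re
from typing import List


def mysqldump_rows_command(database: str, table: str, where: str) -> List[str]:
    """Build a mysqldump argument list that exports only the matching rows."""
    return ["mysqldump", "--no-create-info", f"--where={where}", database, table]


# Only rewrite statements that begin a line, so row data that happens to
# contain the text "INSERT INTO" is left untouched.
_INSERT_AT_LINE_START = re.compile(r"^INSERT INTO ", re.MULTILINE)


def inserts_to_replaces(dump_sql: str) -> str:
    """Convert INSERT statements in a dump to REPLACE statements."""
    return _INSERT_AT_LINE_START.sub("REPLACE INTO ", dump_sql)
```

Run against the recovery host, `mysqldump_rows_command("shop", "orders", "id=123")` exports one row, and the converted dump can then be sourced into production, where REPLACE overwrites any existing row with the same key. mysqldump also offers a `--replace` option that writes REPLACE statements directly.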
Retention
• Process to purge snaps based on age
     – Example: 10d; 7d, 4w, 2m; etc.
• Move archive backups to another medium
     – Will never be restored
     – Disk usage grows while aging
     – The greater the difference from the parent, the greater the I/O overhead
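
An age-based purge can be as simple as the Python sketch below. It implements only the single-window case (the "10d" example) and assumes snapshot names and creation times have already been read from the storage system; the function name and data layout are hypothetical. A tiered policy such as 7 daily, 4 weekly, and 2 monthly would need extra bookkeeping.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def snapshots_to_purge(snapshots: Dict[str, datetime], now: datetime,
                       max_age: timedelta = timedelta(days=10)) -> List[str]:
    """Return the names of snapshots strictly older than max_age, sorted by name."""
    return sorted(name for name, taken in snapshots.items() if now - taken > max_age)
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside the function, keeps the decision reproducible and easy to test.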
Replication
• Run slaves on clone of parent (master)
Replication – shared disk
Underlying physical disks are shared for master/slave volumes

      Master               Slave1               Slave2               Slave3
 /vol/mysql_master    /vol/mysql_slave1    /vol/mysql_slave2    /vol/mysql_slave3

                                   Aggregate
Resync a slave
Re-syncing a slave or reclaiming disk space is fast and easy

      Master1                                              Slave1
 /vol/mysql_master       (destroy clone volume)      X  /vol/mysql_slave1 (old)

      Master1                                              Slave1
 /vol/mysql_master  --(create new volume as clone)-->   /vol/mysql_slave1 (new)
Slave Reclone
• Create a clone of a backup snapshot
     – Will require time for replication to catch up
• Clone from live master
     – Will need to quiesce the master, as during a backup
• Consider urgency and timing
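
Once the slave's volume has been recloned, replication has to be pointed at the binlog coordinates that were captured when the source snapshot or clone was taken. The Python sketch below builds the statements using the MySQL 5.x names (`CHANGE MASTER TO`, `START SLAVE`). The function names and example values are hypothetical, and the inputs are assumed to come from trusted automation, since they are interpolated without escaping.

```python
from typing import List


def change_master_sql(host: str, log_file: str, log_pos: int) -> str:
    """Statement that points a slave at the master's binlog file and position."""
    return (
        f"CHANGE MASTER TO MASTER_HOST='{host}', "
        f"MASTER_LOG_FILE='{log_file}', MASTER_LOG_POS={int(log_pos)}"
    )


def reclone_statements(host: str, log_file: str, log_pos: int) -> List[str]:
    """Statements to run on the slave after its data volume has been recloned."""
    return ["STOP SLAVE", change_master_sql(host, log_file, log_pos), "START SLAVE"]
```

A clone taken from an older backup snapshot starts from older coordinates, which is why it needs more time to catch up than a clone of the live master.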
Replication
• Run slaves on clone of parent (master)
• Shared blocks on parent and slaves
• Add slaves quickly to add capacity
• Removes need for slave resync tools
• Entire reclone process should be automated
• No need to back up slaves
• Reclone regularly to reclaim space
• Reclone after recovering from a backup
• Mostly static or r/o data? Consider not using MySQL replication
     – Replication is inefficient for large volumes of static data
     – Consider a reclone instead
• Easier to rely on SAN for replication to DR facility than replicating multiple MySQL instances
Other Things to Consider
• Monitor for most recent backup
• Automated recovery testing host
• Monitor for backup quality
     – Recoverable
     – Dirty
• ETL host with automated reclones from backup
Thanks!



Questions/thoughts?
