Data deduplication in virtualized environments by Marc Crespi, Backup Academy

The objective of this program is first and foremost to explain what data deduplication is. We're going to talk about why you should use deduplication in your backup and recovery operations and about some of the specific challenges found in virtualized environments. We'll go over the major components of a successful backup and recovery infrastructure, including data deduplication. We'll talk about the various approaches to deduplication and the pros and cons of each approach, and finally we'll summarize the role of data deduplication in data protection and disaster recovery.

Published in: Education, Technology
  • IT monitoring/management of backups: growing number of virtual machines; inability to monitor backups on individual virtual machines; issues managing multiple guest OSs. More data to store: each change means the entire vmdk file is backed up. Example: 10 guest OS instances x 50GB = 500GB of backed-up virtual images daily.
  • Source deduplication is the removal of redundancies from data before transmission to the backup target. Source deduplication products offer a number of benefits, including reduced bandwidth and storage usage. No additional hardware is required to back up to a remote site, and many source deduplication products also support automation for offsite copies. On the other hand, the source-based method can be slower than target deduplication, especially for large (multiple terabyte) amounts of data. Because of the increased workload on servers, overall backup times may increase.
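The source-based approach described in the notes above can be illustrated with a minimal Python sketch. It is not the method of any particular product: the fixed 4096-byte chunk size, the SHA-256 fingerprints, and the names `BackupTarget`, `source_side_backup`, and `restore` are all assumptions made for illustration. The idea it shows is that the source fingerprints each chunk and transmits only the chunks the target does not already hold.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen only for illustration


def chunk_digests(data, chunk_size=CHUNK_SIZE):
    """Yield (digest, chunk) pairs for consecutive fixed-size chunks of data."""
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        yield hashlib.sha256(chunk).hexdigest(), chunk


class BackupTarget:
    """Hypothetical backup target that stores each unique chunk once."""

    def __init__(self):
        self.chunks = {}         # digest -> chunk bytes
        self.bytes_received = 0  # bytes actually transmitted to the target

    def has(self, digest):
        return digest in self.chunks

    def store(self, digest, chunk):
        self.chunks[digest] = chunk
        self.bytes_received += len(chunk)


def source_side_backup(data, target):
    """Send only chunks the target lacks; return the list of digests
    (the "recipe") needed to rebuild this backup later."""
    recipe = []
    for digest, chunk in chunk_digests(data):
        if not target.has(digest):
            target.store(digest, chunk)  # only new chunks cross the wire
        recipe.append(digest)
    return recipe


def restore(recipe, target):
    """Rebuild a backup from its recipe."""
    return b"".join(target.chunks[digest] for digest in recipe)


if __name__ == "__main__":
    # Two "virtual machine images" that differ in a single chunk.
    image_day1 = b"".join(bytes([i]) * CHUNK_SIZE for i in range(10))
    image_day2 = (image_day1[:3 * CHUNK_SIZE]
                  + bytes([99]) * CHUNK_SIZE
                  + image_day1[4 * CHUNK_SIZE:])

    target = BackupTarget()
    source_side_backup(image_day1, target)
    sent_day1 = target.bytes_received
    recipe_day2 = source_side_backup(image_day2, target)
    sent_day2 = target.bytes_received - sent_day1
    print("day 1 bytes sent:", sent_day1)
    print("day 2 bytes sent:", sent_day2)
    print("day 2 restores correctly:", restore(recipe_day2, target) == image_day2)
```

The hashing runs on the source server, which is the increased server workload the notes mention as the trade-off for sending less data.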

    1. Data Deduplication in Virtualized Environments. Marc Crespi, ExaGrid Systems. http://blog.exagrid.com Twitter: @ExaGrid
    2. About the speaker: Marc has over 20 years of software and hardware experience in the high technology sector. He is part of the ExaGrid team that drives product strategy and execution and is responsible for managing product operations. Prior to joining the company, Marc was director of product management for security management products at Altiris.
    3. Objective of This Program: What is Deduplication? Why Use Deduplication in Backup and Recovery? Challenges of Deduplication in Virtualized Environments. Deduplication approaches (two camps). Summary ‒ Deduplication’s Role in Data Protection and Disaster Recovery
    4. Why Use Deduplication in Backup and Recovery? Enhanced Speed/Performance ● Faster backup times due to lower volume of data to be backed up ● Data lands faster because it is targeted at disk Dramatic Savings in Disk Costs ● 20:1 reduction in amount of disk space required to store backups Scalability ● Back up higher data volumes while maintaining backup window Offsite Disaster Recovery ● Efficient use of bandwidth via WAN-efficient replication
    5. Eliminate Redundancies for More Efficient Virtual Server Backups. Without deduplication: ● Each virtual server image gets backed up in its entirety ● Large amount of storage consumed With deduplication: ● Deduplicate backups to changed bytes ● Dramatic savings in disk and bandwidth ● Integrated replication Reduced storage footprint with deduplication: ● Reduce total amount of storage by as much as 1000:1 ● Store only the bytes that change in your VMware virtual servers ● Eliminate redundancy of typical VMware backups ● Restore quickly from most recent VMware backup
    6. Specific Challenges of Backups/Restores in Virtualized Environments. Management of backups ● Growing number of virtual machines/sprawl ● Inability to monitor backups on individual virtual machines Handling the volume of backup data efficiently ● More data to store as virtual machines proliferate ● Each change means entire virtual server is backed up. Example: 10 guest OS instances x 50GB = 500GB of backed-up virtual images daily. These challenges are driving a need for better tools to more reliably and easily back up and restore virtual machines
    7. How Dedupe Works: Store Only Changed Bytes. Standard Disk: 10 VM backups, from most recent to oldest, stored at 50GB each. Total 500GB. Data Deduplication: the most recent backup is stored optimized for read at 2.5GB, and each of the 9 older backups, down to the oldest, is stored at 100MB. Total 3.4GB.
    8. Where to Deploy Deduplication. Source Based Data Reduction: removes data redundancies before transmission to the backup target. PROS ● Reduces impact on VM ● Shortens backup window/less data ● Reduced bandwidth needed to the backup target ● Reduction in storage usage CONS ● Can be slower for large (multiple TB) amounts of data ● Increased workload on servers Target Based Data Reduction: removes data redundancies after transmission to the backup target. PROS ● Shortens backup window/less data ● Reduced replication bandwidth ● Reduction in storage usage CONS ● Must transfer the entire dataset to the device ● Don’t get reduced bandwidth needed to the backup target 2011 ExaGrid Systems, Inc.
    9. Using Both Deduplication Techniques Provides Complementary Benefits. Source Based PLUS Target Based Data Deduplication: removes data redundancies before and after transmission to the backup target. Achieves an additional 80% data reduction (98% total) ● Further reduction in bandwidth ● Further reduction in storage usage ● Further reduction in backup window Integrated replication of virtual servers
    10. Architectural Considerations. Diagram, Legacy Architecture - Single Controller, One Deduplication Engine: disks are added as data grows from 10 TB to 60 TB, but throughput stays at X TB/hr at every size. Diagram, Scalable GRID Architecture, Multiple Deduplication Engines: a deduplication engine is added with each capacity step, so throughput rises from X TB/hr at 10 TB to 2X, 3X, 4X, 5X and 6X TB/hr at 20, 30, 40, 50 and 60 TB. Both diagrams are plotted against the backup window.
    11. Architectural Considerations. Legacy Architecture – Single Controller: One Deduplication Engine. Legacy Architecture – Appliance Sprawl: individual appliances, each with its own deduplication engine. Scalable GRID Architecture: Multiple Deduplication Engines. Scalable GRID Features ● Linear performance as data grows, stable backup window ● Capacity is virtualized across nodes ● Deduplication is shared across nodes ● Simplified management through single UI ● System can be right-sized to current data size ● Avoids forklift upgrades
    12. GRID Architecture for Deduplication Performance. Diagram: backup servers send VM backup jobs at wire speed, with load balancing, to two nodes (Node 1 and Node 2, each labeled System Capacity – RAID6). Each node has a landing zone; the diagram also shows the deduplication process and the repository. Benefits ● One-time division of data during installation (15 to 30 minutes) ● GRID software manages placement of data ● Revisit only during expansion (additional 15 to 30 minutes) ● Eliminates the challenges of monolithic, primary-storage-like architectures
    13. What We Covered: What is Deduplication? Why Use Deduplication in Backup and Recovery? Challenges of Deduplication in Virtualized Environments. Overview Diagram of Major Components. Deduplication approaches (two camps). Summary ‒ Deduplication’s Role in Data Protection and Disaster Recovery
    14. Enjoy and share this material. Feel free to promote this material. Recommend your peers to pass certification. Blog, Tweet and share this material and your experience on Facebook. You’re an Expert? We will be happy to have you as Backup Academy contributor. Apply here. Web: http://www.backupacademy.com E-mail: feedback@backupacademy.com Twitter: BckpAcademy Facebook: backup.academy
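The figures quoted on slides 7, 9 and 10 can be checked with a little arithmetic. The Python sketch below is only an illustration: the function names are made up for this example, the 90% first-stage reduction is an inference (it is the value that makes "an additional 80%" come out to "98% total") and is not stated on the slides, and the 10 TB/hr throughput stands in for the unspecified "X TB/hr".

```python
def standard_total_mb(backups, full_backup_mb):
    """Standard disk: every retained backup is stored in full."""
    return backups * full_backup_mb


def dedup_total_mb(backups, newest_mb, changed_mb):
    """Deduplicated store as drawn on slide 7: the newest backup is kept
    whole (already reduced) and each older backup keeps only changed bytes."""
    return newest_mb + (backups - 1) * changed_mb


def combined_reduction(first_stage, additional):
    """Total reduction when a second stage removes a further fraction of
    whatever the first stage left behind."""
    return 1 - (1 - first_stage) * (1 - additional)


def backup_window_hours(data_tb, tb_per_hour_per_engine, engines=1):
    """Time to ingest data_tb, assuming throughput scales with engine count."""
    return data_tb / (tb_per_hour_per_engine * engines)


if __name__ == "__main__":
    # Slide 7: ten retained backups of a 50GB virtual machine
    # (decimal units, 1GB = 1000MB, which matches the slide's totals).
    print("standard disk MB:", standard_total_mb(10, 50_000))
    print("deduplicated MB:", dedup_total_mb(10, 2_500, 100))

    # Slide 9: assumed 90% first stage, then the stated additional 80%.
    print("combined reduction:", round(combined_reduction(0.90, 0.80), 4))

    # Slide 10: 60 TB with one engine versus six, at a hypothetical 10 TB/hr.
    print("single engine hours:", backup_window_hours(60, 10, engines=1))
    print("six engine hours:", backup_window_hours(60, 10, engines=6))
```

Under these assumptions the single-engine backup window grows in proportion to the data, while adding one engine per 10 TB keeps it at the same length as the original 10 TB system.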