Venkatesh Iyer - Next Generation Backup and Recovery with Data Deduplication - Interop Mumbai 2009



Protecting critical data is a challenge for organizations of all sizes. According to the Enterprise Strategy Group (ESG), the amount of data requiring protection continues to grow at approximately 60 percent per year. Traditional backup solutions store data repeatedly, expanding total storage under management by five to 10 times. Customers need solutions to help manage the information explosion. In addition, government regulations and requests for legal discovery strain the resources and capabilities of traditional data protection solutions. Failure to comply or provide information in a timely fashion can result in significant costs and penalties. Furthermore, recent legislation has exposed the risk of shipping tapes—either encrypted or unencrypted—as one of the greatest security concerns in today’s IT infrastructure. This session will elaborate on how de-duplication technologies help in achieving lower costs, improving efficiencies, improving protection and simplifying management.

Published in: Technology

  1. Next Generation Backup and Recovery With Data Deduplication: Driving Down the Cost and Risk. Venkatesh K. Iyer, Head – India & SAARC, Backup, Recovery & Archival Solutions. © Copyright 2009 EMC Corporation. All rights reserved.
  2. Agenda
     • Current Market Scenario
     • Backup and Recovery Challenges
     • Why is Data De-duplication so hot?
     • What is Data Deduplication?
     • Next Gen Backup and Recovery Architecture
  3. 2009 – Market Conditions
     • Challenging times with global economic recession in 2009
     • Economic environment is a leading indicator of tech spending
     • 71% of CIOs anticipate flat or declining IT spending budgets
     • IT budgets in developed countries set to decline by 5%
     • Total IT spending is the lowest in the history of the survey
     • “The current environment has moved virtualisation toward the top of the priority list for CIOs”
     • “TCO reductions will be a key driver of the acceleration in server virtualisation deployments as CIOs are forced to cut capital spending and rein in management, administrative, and power/cooling cost”
     • Source: Goldman Sachs IT Spending Survey, Nov 2008, and Merrill Lynch CIO Survey, Oct 2008
  4. Industry Challenges & CIOs’ Concerns
     • Current Economic Crisis
     • Cut Operating Costs
     • Continued Information Growth (10X growth over next 5 years, EMC/IDC white paper)
  5. Key Backup and Recovery Themes
     • ROI and TCO are #1 on CIO minds
     • The data protection market continues to evolve:
       – Operational Savings through Automation and Integration
       – Improvements in Service Levels (RPOs / RTOs) and IT Compliance
       – Decreasing Reliance on Tape through B2D and Data Deduplication
     • Traditional data protection methodologies don’t map well to virtualized servers
     • A perfect storm is brewing for a fundamental re-architecture of data protection environments in organizations
  6. Today’s Backup and Recovery Challenges
     • Massive Data Growth
     • Costs
     • Complexity
     • Shift to Virtual
     • Compliance
  7. Hierarchy of Data Reduction Types for Backup
     • Regular Storage Array: 1:1
     • LZ Compression (Whitespace Reduction): ~2:1
     • Single Instance Storage (File Level): ~3:1
     • Fixed Block (Fixed Blocks, Data Snapshots): ~3:1
     • ‘Dedupe’ (Variable Segments): ~20:1 to 500:1
     • Data Deduplication significantly reduces: Power, Heat, Cooling, Management, Bandwidth
  8. Gartner Dedupe Prediction: The Market is HUGE
     • By 2012, deduplication will be applied to 75% of backups
     • Key Findings: Production deployments of deduplication for backups have progressed at an unusually high rate for such a recent technology; however, Gartner estimates that less than 5% of backups today use deduplication techniques.
     • Market Implications: Gartner views this technology as transformational because it radically decreases the economics of disk-based backup and recovery ... too compelling to ignore.
     • Recommendations: There are several different implementations of deduplication, and some vendors have only recently released this technology and have a few dozen customers; others have been shipping it for several years and have more than 1,000 customers. ... ensure that your organization is comfortable with the robustness and maturity of the vendor's approach.
     • Analysis by: Dave Russell
  9. Why So Much Interest in Data Deduplication?
     • Backup & Archive processes have been overwhelmed by information growth
     • Primary storage efficiency has become a necessity to cope with massive growth
     • ROI drives the compelling appeal of dedupe
       – Reduced Storage Capacities
       – Lower Infrastructure Costs
       – Improved SLAs
       – Efficient Replication for DR
     • One of the top 10 Technology Considerations
       – Deduplication: 59% Very important
     • Deploying Deduplication
       – 24% In use
       – 55% Evaluating / In Near – Long Term plan
       – 21% Not in Plan
     • Source: TheInfoPro Wave 11 Storage Study, 2008
  10. Why so much Interest in Data De-Duplication?
     • Data De-duplication – One of the hottest emerging segments within the storage and data protection market – Why?
       – Network Bandwidth utilisation – Efficiently Move Data
       – Massive reduction in Storage requirements – Efficiently Store Data
       – Security – Data protection in transit
       – Improving efficiencies in virtualised environments
     • Market is under duress; Backup and Restore has not kept pace with enterprise growth
     • Companies looking to protect more data – increasing desktop volumes, mobile employees, remote offices, data growth circa 50%+
     • Data retained at the back-end for longer periods of time for internal reasons or external regulations – Need to archive
     • Tape is not ideal for backup and restore; the industry is moving towards backup-to-disk
     • De-duplication market opportunity $1B by 2009
     • It’s here to stay; it’s based on compression, which has been around for 20+ years
  11. Deduplication 101
     • Dedupe: storing only unique ‘chunks’ of data (blocks, objects, files)
       – Uses identification & comparison algorithms, content addressing, indexing or cataloging
       – Unique “chunks” are reconstituted to original format from the de-duplicated state
     • Diagram: Data Set 1, Data Set 2 and Data Set 3 pass through De-duplication
     • Compression
       – Minimizes empty space within files, but does not eliminate redundant data
       – Compression is employed in conjunction with other dedupe processes
  12. Target- and Source-based Data De-duplication
     • There are strong use cases for both technologies, but only source-based de-duplication reduces daily network bandwidth requirements and decreases client resource utilization during backups.
     • De-duplication at Source (EMC Avamar)
       – Moves ~2 percent of primary data weekly
       – Up to 50 times reduction in backup storage
       – Up to 500 times less daily network impact
       – Up to 10 times faster daily full backups
       – Fast, daily full backups, single-step recovery
       – Next-generation backup and recovery
     • De-duplication at Target (EMC Disk Library)
       – Moves ~200 percent of primary data weekly
       – Up to 50 times reduction in backup storage
       – Backups are typically restored from full and incremental images
       – De-dupe device viewed as file system and/or virtual tape library target for traditional backup software
  13. Data De-Duplication: How it Works
     • First Instance (A B C D): only unique data segments are backed up
     • Duplicate Instance (A B C D): data already backed up, so only a unique ID pointer is stored (20 bytes)
     • Modified Instance (E B C D): new data segment E identified and backed up
     • Unique data (A B C D E) stored on disk, available for immediate recovery
  14. Potential Impact of Data De-duplication on a Backup
     • File 1 = 1 MB (A B C D); File 2 = 1 MB (A B C D); File 3 = 1 MB (E B C D)
     • Total Capacity Stored (RAW Data: 3 MB, Over 12 Weeks)
       – Daily incremental, weekly full: 12 Fulls = 36 MB
       – Daily full: 84 Fulls = 252 MB
       – De-duplicated backup: 84 Fulls = 1.25 MB
     • Data de-duplication reduces Backup to Disk capacity requirements
  15. Backup and Recovery Use Cases
     • Source: Next Generation Backup and Archive (Avamar)
       – Virtualized Environments: Relieves backup bottlenecks, enables greater server consolidation ratios
       – Remote / Branch Offices: Protects ROBOs with highest WAN efficiency and with consistent DC policies
       – Edge Devices: Protects enterprise desktops / laptops with low device overhead
     • Target: Data Domain + DL4000
       – 3rd Party Backup: Heterogeneous target for existing backup applications
       – Datacenter NAS / SAN: Enterprise infrastructure support
       – High Transaction Apps: High-change rate, large data sets
  16. Next Generation Backup, Recovery and Archive - Take the Steps
     • Better Protection and Compliance. Less Cost.
     • Best-Practice steps: Assess, Archive, Backup, Manage
     • Business Benefits
       – Avoid time and expense of developing expertise
       – Identify maximum investment/benefit strategies
       – Reduce the size of backup
       – Free valuable primary storage capacity
       – Assure compliance, remove exposure
       – Reduce eDiscovery expenses
       – Reduce time, bandwidth and infrastructure
       – Streamline D/R operations, infrastructure
       – Expedite disaster recovery
       – Eliminate remote office backup infrastructure
       – Expedite application recovery
       – Reduce backup management overhead
       – Streamline problem detection, resolution
       – Lower backup management costs
  17. Backup Service Tiering / Catalogue Alignment Attributes Tier 1 Tier 2 Tier 3 Archive Scheme Specification RecoverPoint EDL Avamar LTO4 Tape Centera Backup to Source Dedup Disk Proposed technology CDP to Disk Tape disk to disk Concept Optional at Integrated at Deduplication None None Integrated Target Source Backup Performance Backup Performance Highest High High Medium NA Files/NAS/ Long term Best use Case DB DB/Emails OS Retention VMware/RO Backup Time Instant Normal Fastest Medium Architecture Considerations / Impact Minimal - LAN/CPU/DISK Impact None HIGH HIGH Minimal None Weeks / Retention on Disk (Typical) < 2 Days 4 Weeks Years Years Months Retention & Disposition Operational Data shredding compliance No Yes Yes No Yes Backup & Verify Quality of Backup Recovery Data Integrity Checks None Protocol Daily None Daily Data automatically Encryption Ability to encrypt backup None None Integrated Optional Application data How long for a replicated Real Time 24-48 hours 30-90 Minutes 2-3 Days 30-90 Min. Replication for DR copy Last Operational Recovery Pt Obj Amount of data loss > = 24 hours < 24 hours 1-2 days 0 Transaction Recoverability Ability to recover data 100% 100% 100% 95% 100% Length of time data is 2 days 3 Weeks 12 Weeks N/A Years Retention period retained on disk Last Last Disaster Recovery Pt Obj (RPO) Amount of data loss Last Backup Last Backup Last Backup Replication Disaster Transaction Recovery Disaster Recovery Time Obj (RTO) 25% Data Restore < 2 hours < 24 hours < 24 hours < 48 hours NA Cost per TB K / TB K / TB K / TB K / TB K / TB
  18. Next Generation Data Protection Architecture
     • Centralized Data Protection Management
     • Typically 80% of Data; Typically 20% of Data
     • 70-80% data reduction through shortcutting
     • Diagram labels: Source De-Duplication, Mail and Database archive & retrieval, Professional Services, Archive, Source De-Duplicated Backup, SAN Backup, CDP
  19. Next Generation Data Protection with EMC
     • EMC DPA
     • Typically 80% of Data; Typically 20% of Data
     • 70-80% data reduction through shortcutting
     • Diagram labels: EMC Professional Services, Source De-Duplication, Mail and Database archive & retrieval, Backup
     • Products: EmailXtender / DiskXtender + Centera, EMC | Avamar, EMC | NetWorker, EMC | RecoverPoint, Avamar, Centera, RecoverPoint, EDL, Data Domain, Tape
  20. Email:
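
Slides 11 and 13 describe deduplication as storing each unique segment once and keeping only a small ID pointer for repeats. The following is a minimal Python sketch of that idea, not EMC's implementation. Everything named here is an assumption made for illustration: the `DedupStore` class, the `make_segment` helper, the fixed 256 KiB segment size (slide 7 indicates that real products may use variable segments), and the choice of SHA-1, picked only because its digest is 20 bytes, the pointer size quoted on slide 13. The slides do not name a hash function.

```python
import hashlib

# Assumed fixed segment size: 0.25 MB, so a 1 MB file splits into four segments.
CHUNK_SIZE = 256 * 1024


class DedupStore:
    """Toy content-addressed store (hypothetical): one copy per unique segment."""

    def __init__(self):
        self.segments = {}  # 20-byte SHA-1 digest -> segment bytes
        self.backups = {}   # backup name -> ordered list of digests (pointers)

    def backup(self, name, data):
        """Store a backup and return how many new bytes had to be written."""
        pointers = []
        new_bytes = 0
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha1(chunk).digest()  # the segment's unique ID
            if digest not in self.segments:        # unseen segment: store it
                self.segments[digest] = chunk
                new_bytes += len(chunk)
            pointers.append(digest)                # seen or not, keep a pointer
        self.backups[name] = pointers
        return new_bytes

    def restore(self, name):
        """Reconstitute the original data from the stored segments."""
        return b"".join(self.segments[d] for d in self.backups[name])

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.segments.values())


def make_segment(letter):
    """Build a synthetic segment standing in for 'A', 'B', ... on the slide."""
    return letter.encode() * CHUNK_SIZE


first_instance = b"".join(make_segment(c) for c in "ABCD")
modified_instance = b"".join(make_segment(c) for c in "EBCD")

store = DedupStore()
new_first = store.backup("first", first_instance)           # A, B, C, D are all new
new_duplicate = store.backup("duplicate", first_instance)   # nothing new, pointers only
new_modified = store.backup("modified", modified_instance)  # only E is new
```

In this sketch the duplicate backup adds no segment data at all, and the modified backup adds a single segment, which mirrors the three cases on slide 13. A real product would also persist the index and handle hash collisions, neither of which is attempted here.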
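
The capacity figures on slide 14 can be reproduced with a few lines of arithmetic. This sketch assumes, as the slide appears to, that the three 1 MB files do not change over the 12 weeks, that each file consists of four equal 0.25 MB segments, that the daily-incremental scheme is counted by its weekly fulls only, and that pointer and index overhead is ignored. The variable names are illustrative and do not come from the slides.

```python
# Slide 14 inputs: three 1 MB files, each made of four equal segments.
FILE_MB = 1.0
SEGMENTS_PER_FILE = 4
SEGMENT_MB = FILE_MB / SEGMENTS_PER_FILE  # 0.25 MB per segment

files = {"File 1": "ABCD", "File 2": "ABCD", "File 3": "EBCD"}
WEEKS = 12
DAYS_PER_WEEK = 7

raw_mb = FILE_MB * len(files)  # 3 MB of primary data

# Daily incremental, weekly full: one full per week (incrementals assumed empty).
weekly_full_mb = WEEKS * raw_mb

# Daily full: one full every day for 12 weeks.
daily_full_mb = WEEKS * DAYS_PER_WEEK * raw_mb

# De-duplicated backup: only the distinct segments are ever stored.
unique_segments = set("".join(files.values()))  # the letters A, B, C, D, E
dedup_mb = len(unique_segments) * SEGMENT_MB

reduction_vs_daily_full = daily_full_mb / dedup_mb

print(f"weekly fulls: {weekly_full_mb} MB")
print(f"daily fulls:  {daily_full_mb} MB")
print(f"de-duplicated: {dedup_mb} MB")
print(f"reduction vs daily fulls: {reduction_vs_daily_full:.1f}:1")
```

Under these assumptions the totals come out to 36 MB, 252 MB and 1.25 MB, matching the slide. Real data changes between backups, so observed ratios depend heavily on change rate and retention.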