Deduplication reduces the amount of disk storage needed to retain and protect data by ratios of 10-30x and greater, making disk a cost-effective alternative to tape. Data on disk is available online and onsite for longer retention periods, and restores become fast and reliable. Storing only unique data on disk also means that data can be cost-effectively replicated over existing networks to remote sites for disaster recovery and consolidated tape operations.
11. Retain: Store More for Longer with Less
Over one year of retention in 3U of Data Domain deduplication storage

Backup data          Cumulative logical   Estimated reduction   Physical
First Full           1 TB                 4x                    250 GB
Week 1 (April 7)     2.4 TB               8x                    308 GB
Week 2 (April 14)    3.8 TB               10x                   366 GB
Week 3 (April 21)    5.2 TB               12x                   424 GB
Month 1 (April 28)   6.6 TB               14x                   482 GB
Month 2 (May 31)     12.2 TB              17x                   714 GB
Month 3 (June 30)    17.8 TB              19x                   946 GB
Month 4 (July 31)    23.4 TB              20x                   1,178 GB
TOTAL                23.4 TB              20x                   1,178 GB
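The estimated reduction figures on slide 11 appear to be the cumulative logical data divided by the physical data stored, rounded to a whole number. The short Python sketch below checks that reading against the slide's numbers. It assumes 1 TB = 1000 GB, and the names `ROWS` and `reduction_ratio` are illustrative only, not part of any Data Domain tool.

```python
# Figures copied from slide 11: (backup, cumulative logical TB, physical GB).
# Assumption: 1 TB is treated as 1000 GB.
ROWS = [
    ("First Full", 1.0, 250),
    ("Week 1", 2.4, 308),
    ("Week 2", 3.8, 366),
    ("Week 3", 5.2, 424),
    ("Month 1", 6.6, 482),
    ("Month 2", 12.2, 714),
    ("Month 3", 17.8, 946),
    ("Month 4", 23.4, 1178),
]


def reduction_ratio(logical_tb: float, physical_gb: float) -> float:
    """Return logical data divided by physical data, both in GB."""
    return logical_tb * 1000 / physical_gb


if __name__ == "__main__":
    for label, logical_tb, physical_gb in ROWS:
        ratio = reduction_ratio(logical_tb, physical_gb)
        print(f"{label:<10} {ratio:5.1f}x")
```

Rounding each computed ratio to the nearest whole number reproduces the slide's column (4x through 20x), which suggests the column is derived this way rather than measured separately.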
12. Data Integrity: Data Invulnerability Architecture
[Diagram, three panels]
End-to-end data verification: generate checksum; deduplication, write to disk; verify (re-checksum and compare). Storage stack shown: file system, deduplication, local compression, RAID. Checks shown: verify the file system metadata integrity, verify user data integrity, verify stripe integrity.
Self-healing file system: cleaning, expired data, defrag, verify.
Other: RAID 6, NVRAM, snapshots.
17. Cascaded
[Diagram: Data Domain systems at remote sites (source) holding backup, archive, and DB data connect over links labeled 1–5% to the data center hub (destination), shown as a Data Domain system and a Data Domain Global Deduplication Array.]
Supports hundreds of remote sites
95–99% cross-site bandwidth reduction
18. DD Boost Software
Distributes parts of deduplication process to backup server or application clients
Licensable software works across Data Domain portfolio
Supports majority of backup software market
  EMC Avamar and NetWorker
  Symantec NetBackup and Backup Exec
Speeds backups by up to 50 percent
Process more backups with existing resources
  20–40 percent less overall impact to backup server
  80–99 percent less LAN bandwidth
Enables Data Domain replication management from the backup application
38. Data Domain Systems Trajectory
Data Domain SISL Scaling Architecture: CPU-centric
Improvement since 2004: Throughput: ~175x; Capacity: ~450x
[Chart: throughput (GB/s) over time, from the DD200 (2004) through 2010, 2011, and 2014 (est.). Stages labeled: single-controller, standard protocols; DD Boost; dual-controller Global Deduplication Array. Plotted values: 0.04, 1.5, 3, 5.]
39. Why Data Domain?
Less disk to resource, less to manage
  CPU-centric deduplication
  Inline deduplication
Simple, mature, and flexible
  Simple, mature appliance
  Any fabric, any software, backup or archive applications
Resilience and disaster recovery
  Storage of last resort
  Fast time-to-DR readiness
  Cross-site global compression
  Data center or remote office
46. EMC best practices and unmatched product expertise = superior customer experience
47. Reduce disruption while taking advantage of the features and benefits of the latest EMC products and solutions
Editor's Notes
Another important differentiator for Data Domain systems is the Data Invulnerability Architecture. Data Domain Data Invulnerability Architecture lays out the industry's best defense against data integrity issues by providing unprecedented levels of data protection, data verification, and self-healing capabilities that are unavailable in conventional disk or tape systems.

There are three key areas of data integrity protection described on this slide:

First is end-to-end data verification at backup time. As illustrated by the graphic at the right, end-to-end verification means reading data after it is written and comparing it to what was sent to disk, proving that it is reachable through the file system to disk and that the data is not corrupted. Specifically, when the Data Domain Operating System receives a write request from backup software, it computes a checksum over the data. After analyzing the data for redundancy, it stores the new data segments and all of the checksums. After all the data has been written to disk, Data Domain Operating System verifies that it can read the entire file from the disk platter and through the Data Domain file system, and that the checksums of the data read back match the checksums of the written data. This confirms the data is correct and recoverable from every level of the system. If there are problems anywhere along the way (for example, if a bit has flipped on a disk drive), they will be caught. Since most restores happen within a day or two of backups, systems that verify and correct data integrity slowly over time will be too late for most recoveries.

Second is a self-healing file system. Data Domain systems actively re-verify the integrity of all data every week in an ongoing background process. This scrub process will find and repair defects on the disk before they can become a problem. In addition, real-time error detection ensures that all data returned to the user during a restore is correct. On every read from disk, the system first verifies that the block read from disk is the block expected. It then uses the checksum to verify the integrity of the data. If any issue is found, the Data Domain Operating System will self-heal and correct the data error.

Third, in addition to data verification and self-healing, there is a collection of other capabilities. Data Domain with RAID 6 provides double disk failure protection; NVRAM enables fast, safe restart; and snapshots provide point-in-time file system recoverability.

Backups are the data store of last resort. Data Domain Data Invulnerability Architecture provides extra levels of data integrity protection to detect faults and repair them to ensure backup data or recovery is not at risk.
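The write-then-verify sequence described in these notes can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the Data Domain Operating System's implementation: the `SegmentStore` class, the fixed 4096-byte segment size, the in-memory dictionary standing in for disk, and the choice of SHA-256 are all hypothetical. It shows a checksum computed per segment, only new segments being stored, a read-back comparison after the write, and a checksum check on every read.

```python
import hashlib

SEGMENT_SIZE = 4096  # hypothetical fixed segment size, for illustration only


def checksum(data: bytes) -> str:
    """Return a hex digest for a segment (SHA-256 chosen for this sketch)."""
    return hashlib.sha256(data).hexdigest()


class SegmentStore:
    """Toy in-memory stand-in for a deduplicating store."""

    def __init__(self) -> None:
        self.segments = {}  # checksum -> segment bytes (the "disk")
        self.files = {}     # file name -> ordered list of segment checksums

    def write(self, name: str, data: bytes) -> None:
        digests = []
        for start in range(0, len(data), SEGMENT_SIZE):
            segment = data[start:start + SEGMENT_SIZE]
            digest = checksum(segment)
            if digest not in self.segments:  # store only new segments
                self.segments[digest] = segment
            digests.append(digest)
        self.files[name] = digests
        # End-to-end check: read everything back and compare checksums.
        if not self.verify(name):
            raise IOError(f"verification failed after writing {name!r}")

    def verify(self, name: str) -> bool:
        """Re-checksum every stored segment of a file and compare."""
        return all(
            digest in self.segments
            and checksum(self.segments[digest]) == digest
            for digest in self.files[name]
        )

    def read(self, name: str) -> bytes:
        parts = []
        for digest in self.files[name]:
            segment = self.segments[digest]
            if checksum(segment) != digest:  # verify on every read
                raise IOError(f"corrupt segment detected in {name!r}")
            parts.append(segment)
        return b"".join(parts)
```

The sketch only detects corruption. Repairing a bad segment, as the notes describe, would need redundant data such as RAID parity, which is outside what this example models.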
In addition to DD Boost, EMC offers four additional Data Domain software options that can enhance the value of a Data Domain system in your environment.

Note to Presenter: Click now in Slide Show mode for animation.

The first is DD Virtual Tape Library software, which eliminates tape-related failures by enabling all Data Domain systems to emulate multiple tape devices over a Fibre Channel interface. This software option provides easy integration of deduplication storage in open systems and IBM i environments.

Note to Presenter: Click now in Slide Show mode for animation.

Next is DD Replicator software, which provides fast, network-efficient, encrypted replication for disaster recovery, remote office data protection, multi-site tape consolidation, and long-term offsite retention. DD Replicator asynchronously transfers only the compressed, deduplicated data over the WAN, making network-based replication cost-effective, fast, and reliable. In addition, you can replicate up to 270 remote sites into a single Data Domain system for consolidated protection of your distributed enterprise.

Note to Presenter: Click now in Slide Show mode for animation.

Next, DD Retention Lock software enables you to easily implement deduplication with file locking to satisfy IT governance and compliance policies for archive protection. DD Retention Lock also enables electronic data shredding on a per-file basis to ensure that deleted files have been disposed of in an appropriate and permanent manner, in order to maintain confidentiality of classified material, limit liability, and enforce privacy requirements.

Note to Presenter: Click now in Slide Show mode for animation.

Finally, DD Encryption software protects backup and archive data stored on Data Domain systems with encryption that is performed inline, before the data is written to disk. Encrypting data at rest satisfies internal governance rules and compliance regulations and protects against theft or loss of a physical system. The combination of inline encryption and deduplication provides the most secure data-at-rest encryption solution available.
Like other Data Domain systems, Data Domain Archiver includes a controller and storage shelves, referred to as the “active tier” in this system. The active tier can be expanded to up to four storage shelves (96 TB of usable capacity), and it is used for short-term (generally less than 90 days) retention of backup and archive data. In addition, DD Archiver also incorporates an “archive tier” with up to 23 additional storage shelves (474 TB of usable capacity). Built on a standard Data Domain controller, DD Archiver leverages existing Data Domain technology to enable high throughput of up to 9.8 TB/hr. DD Archiver is cost-optimized for long-term retention of backup and archive data, with up to a total of 570 TB usable or 28.5 PB logical capacity (assuming a 50:1 deduplication ratio). In addition, the system offers the unique combination of low cost per gigabyte while still maintaining high throughput. Finally, new fault isolation capabilities ensure long-term recoverability of archive units.

All of this leverages existing Data Domain system advantages, including support for network-efficient replication with DD Replicator as well as DD Retention Lock for enforcing file retention. In addition, Data Domain’s Data Invulnerability Architecture ensures data integrity for the life of the system.

The combination of high-throughput, cost-optimized storage built on proven Data Domain system technology makes DD Archiver the perfect tape replacement solution.
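The capacity figures quoted for DD Archiver follow from simple arithmetic, sketched below in Python. The constants come from the notes; the function names are illustrative, and the conversion assumes 1 PB = 1000 TB.

```python
# Figures from the DD Archiver notes.
ACTIVE_TIER_TB = 96    # usable capacity of the active tier
ARCHIVE_TIER_TB = 474  # usable capacity of the archive tier
DEDUP_RATIO = 50       # the notes assume a 50:1 deduplication ratio


def total_usable_tb(active_tb: int, archive_tb: int) -> int:
    """Usable capacity is the sum of the two tiers."""
    return active_tb + archive_tb


def logical_capacity_pb(usable_tb: float, dedup_ratio: float) -> float:
    """Logical capacity in PB, assuming 1 PB = 1000 TB."""
    return usable_tb * dedup_ratio / 1000


if __name__ == "__main__":
    usable = total_usable_tb(ACTIVE_TIER_TB, ARCHIVE_TIER_TB)
    print(f"usable: {usable} TB")
    print(f"logical: {logical_capacity_pb(usable, DEDUP_RATIO)} PB")
```

The logical figure scales linearly with the deduplication ratio, so a workload that achieves a lower ratio than 50:1 would see proportionally less logical capacity.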
Here’s a look at the latest Data Domain product family, including the recently introduced DD800 series, Data Domain Global Deduplication Array, and Data Domain Archiver (the system for long-term retention of backup and archive data).
OPTIONAL SLIDE
EMC Global Services is a large component of your total EMC experience. EMC Global Services allows you to:
Save money by:
  Significantly lowering your implementation and operating expenditure costs
  Filling internal resource gaps for less
  Protecting your investments in EMC solutions
Accelerate time to value by:
  Reducing deployment time
  Accelerating return on investment for new projects
  Easing the burden of compliance while protecting critical business information
Mitigate risk and get better results by:
  Configuring the solution to meet your requirements
  Improving your service levels and reducing your management costs
  Using EMC best practices and unmatched product expertise = superior customer experience
  Reducing disruption while taking advantage of the features and benefits of the latest EMC products and solutions