
HP, the next wave of "Data Deduplication" (White Paper in English)

The spread of private cloud, the growth of unstructured data volumes, server virtualization: all of these factors are accelerating the adoption of deduplication.

But how far along is the adoption of deduplication in the enterprise? What new techniques are being used to optimize it?
What is HP's answer on these topics, as seen through an analysis of its latest solutions?

English-language document written by the Enterprise Strategy Group

Published in: Technology, Business

Transcript

  • 1. White Paper
    Dedupe 2.0: What HP Has In Store(Once)
    By Jason Buffington, Senior Analyst
    June 2012
    This ESG White Paper was commissioned by HP and is distributed under license from ESG.
    © 2012 by The Enterprise Strategy Group, Inc. All Rights Reserved.
  • 2. White Paper: Dedupe 2.0: What HP Has In Store[Once]
    Contents
    Introduction
    Deduplication Today
        Dedupe 1.0 – Optimized Storage is Good
        Dedupe 1.5 – Smarter Backup Servers Are Better
        Dedupe 2.0 – Client-side Deduplication Is Best
        Deduplication Optimizations and Markers
    HP StoreOnce Backup
        HP StoreOnce Catalyst
        HP StoreOnce B6200
        HP Data Protector 7
        HP StoreOnce and Symantec OST
    The Bigger Truth

    All trademark names are property of their respective companies. Information contained in this publication has been obtained by sources The Enterprise Strategy Group (ESG) considers to be reliable but is not warranted by ESG. This publication may contain opinions of ESG, which are subject to change from time to time. This publication is copyrighted by The Enterprise Strategy Group, Inc. Any reproduction or redistribution of this publication, in whole or in part, whether in hard-copy format, electronically, or otherwise to persons not authorized to receive it, without the express consent of The Enterprise Strategy Group, Inc., is in violation of U.S. copyright law and will be subject to an action for civil damages and, if applicable, criminal prosecution. Should you have any questions, please contact ESG Client Relations at 508.482.0188.
  • 3. Introduction
    In ESG’s most recent IT spending research for 2012, “Improve Data Backup and Recovery” tied as the number one priority, as seen in Figure 1, with business continuity and disaster recovery (BC/DR) coming in the top ten as well.

    Figure 1. Most Important IT Priorities for 2012
    Which of the following would you consider to be your organization's most important IT priorities over the next 12-18 months? (Percent of respondents, N=614, ten responses accepted)
        Improve data backup and recovery                  30%
        Increased use of server virtualization            30%
        Major application deployments or upgrades         29%
        Manage data growth                                27%
        Information security initiatives                  27%
        Business continuity/disaster recovery programs    25%
        Data center consolidation                         24%
        Desktop virtualization                            23%
        Mobile workforce enablement                       22%
        Deploying a "private cloud" infrastructure        22%
    Source: Enterprise Strategy Group, 2012.

    Data protection is usually in the top ten, but why is it the top IT initiative in 2012? One only has to look at the others in the list to understand more:

    Server virtualization – always near the top over the past few years. As virtualization becomes more commoditized, VM sprawl consumes more and more production storage, which in turn strains legacy approaches to disk-based protection. In addition, not all backup technologies adequately protect virtual machines, often requiring integration with backup APIs such as VMware vStorage 5 VADP or Microsoft Volume Shadow Copy Services (VSS), which not every backup solution has chosen to implement or support to the same degree.

    Private cloud infrastructure – every challenge with protecting virtualized environments is even more daunting in private cloud architectures, where self-service portals, as well as elastic load monitoring, can dynamically create new virtualized resources without any IT interaction with (much less awareness or automation of) backup processes.

    Data growth – along with VM sprawl, every other workload also continues to grow: databases grow into “big data,” e-mail mailboxes continue to expand (with retention limiting the use of offline folders), and unstructured data grows exorbitantly, often with huge amounts of undiscovered redundancy. And, of course, as production data grows, so do disk-based backup solutions (often at 3-5 times or more the production size).
  • 4. Deduplication Today
    With so many top IT challenges related to or caused by storage demands in production data or the backup system that protects it, deduplication has never been more of a priority or an understood necessity, as seen in Figure 2.

    Figure 2. Is Your Organization Currently Using any Data Deduplication Solutions?
    Is your organization currently using any data deduplication solutions? (Percent of respondents, N=323)
        Yes                                       37%
        No, but we plan to within 12 months       23%
        No, but we plan to within 24 months       16%
        No, and we have no plans to               20%
        Don't know                                 4%
    Source: Enterprise Strategy Group, 2012.

    Deduplication technologies provide organizations with substantial data storage efficiency advantages when deployed as a part of their data protection infrastructures. While deduplication does not stop the growth of protected data, it does provide an effective strategy for controlling that growth (at least within one’s backups).

    Deduplication technologies and methodologies continue to evolve as vendors develop improved ways of ensuring that data is transported and stored efficiently. The algorithms that analyze stored data are constantly improving in efficiency, with even a fraction of a percent improvement leading to gigabytes of saved storage. In addition to the algorithms improving, the “location” where deduplication occurs is also evolving: from simply being enhancements within secondary storage arrays to working across the network within the data path and across multiple devices.

    Dedupe 1.0 – Optimized Storage is Good
    The first iteration of deduplication involved a single storage device. In the majority of "Dedupe 1.0" scenarios, the backup server treated the storage device like it would any other storage device and wrote all of its data to it. Even when the deduplicated storage already stored a significant number of items that the backup server was transmitting to it, the backup server was unaware, and those redundant elements would be discarded upon receipt by the deduplication appliance.

    From a data protection perspective, while organizations did reduce their costs due to backup data being stored more efficiently, the time required to perform a backup wasn't improved because the same amount of backup data still needed to be written to the Dedupe 1.0 backup device. And, of course, if the backup server needed to replicate the data anywhere else, the data would be fully read and transmitted from one backup server to the next and then written in full to another (perhaps deduplication-capable) storage device at the alternate location.
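The Dedupe 1.0 behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `TargetDedupeStore`, the fixed 4-byte chunk size, and the use of SHA-256 digests are all hypothetical choices made for readability, not details of any HP or ESG implementation. The point it shows is that the backup server sends everything, and redundancy is only discarded once the data reaches the appliance.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def chunks(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks (an assumption; products may chunk differently)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class TargetDedupeStore:
    """Hypothetical appliance that deduplicates only after receiving the data."""

    def __init__(self):
        self.chunk_index = {}    # SHA-256 digest -> chunk bytes, each stored once
        self.bytes_received = 0  # everything the backup server sent over the wire

    def write(self, data: bytes):
        """Accept a full backup stream and return its recipe (ordered digests)."""
        self.bytes_received += len(data)
        recipe = []
        for chunk in chunks(data):
            digest = hashlib.sha256(chunk).hexdigest()
            # A chunk that is already stored is discarded on receipt.
            self.chunk_index.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def bytes_stored(self) -> int:
        return sum(len(c) for c in self.chunk_index.values())

    def read(self, recipe) -> bytes:
        """Rehydrate: rebuild the original stream from its recipe."""
        return b"".join(self.chunk_index[d] for d in recipe)


store = TargetDedupeStore()
monday = b"AAAABBBBCCCCDDDD"
tuesday = b"AAAABBBBCCCCEEEE"  # only the last chunk changed
monday_recipe = store.write(monday)
tuesday_recipe = store.write(tuesday)

# Both backups crossed the network in full, but only five unique chunks are kept.
print(store.bytes_received, store.bytes_stored())
```

In this toy run the appliance receives 32 bytes and stores 20, which mirrors the paper's observation: storage cost drops, but the amount of data written by the backup server (and therefore the backup window) does not.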
  • 5. Dedupe 1.5 – Smarter Backup Servers Are Better
    Dedupe 1.5 improves upon Dedupe 1.0 by enabling the backup server to send the storage device only data that isn't already present on the device. In effect, the backup server becomes smarter because it is aware of what is already in the deduplicated storage device. Dedupe 1.5 can also improve the speed with which backups are taken because the backup server may no longer be the bottleneck in sending data, since less information is actually sent to the backup storage device. Of note is the fact that dedicated backup appliances that utilize deduplication are often thought of as Dedupe 1.5, since the backup software and deduplication storage are within the same device.

    While certainly better, Dedupe 1.5 does not typically solve the challenge of replicating data between sites without “rehydrating”; the backup server will read and send the whole data set from the deduplicated storage to another offsite backup server and its deduplicated storage regardless of what the secondary deduplication device might already have.

    Dedupe 2.0 – Client-side Deduplication Is Best
    Dedupe 2.0 leverages intelligence and awareness at the source, backup server, and storage device. In these scenarios, the awareness of what data is already in the deduplicated storage, and the discernment to send new data or not, reside within the production server instead of the backup server or deduplicated storage. In doing so, network savings begin at the production server, and backups are often significantly faster since only changed data is transmitted from the production server to the storage solution.

    Deduplication Optimizations and Markers
    In addition to optimizing the algorithms and adjusting where in the backup flow deduplication discernment occurs, the most modern deduplication technologies are beginning to offer other evolutionary capabilities as well.

    Don’t Rehydrate Across Devices or Sites
    An effective deduplication solution minimizes the rehydration of data between devices. Consider an example of 500GB of data: after deduplication (3:1), the backup storage device may be storing less than 200GB (including multiple iterations of small changes).
      • In legacy Dedupe 1.0 solutions, making a replica of that data at another location involved rehydrating the 500GB and transmitting all of it across the network to the target, even if it then ends up being deduplicated down to the 200GB again.
      • Often, Dedupe 1.5 solutions would only send the 200GB (keeping the data deduplicated), since the data movement logic in the backup server is deduplication aware. That’s better than 1.0.
      • But the best solutions understand that on the next backup, if 1GB was changed in the production server, then only 1GB of new data traverses through the backup server to the deduplicated storage, and then only that 1GB is replicated to the secondary site and its deduplicated storage.

    Rapid Restores
    A key to successfully deploying dedupe is ensuring that restore operations are just as fast when performed from deduplicated backup storage as they are when performed from storage that does not leverage deduplication. Any deduplication solution that substantially increases the amount of time required to perform a recovery operation due to rehydration delays is going to adversely impact data recovery service level agreements (SLAs).

    Fast (deduplicated) backup is nice, but doesn’t mean as much without equally fast (deduplicated) restores.
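As a rough illustration of the source-side ("Dedupe 2.0") flow and of replication without rehydration described on this slide, the following Python sketch has the production server hash its own data, ask the store which chunks are missing, and send only those. Every name here (`DedupeStore`, `missing`, `source_side_backup`, `replicate`) is hypothetical, the 4-byte chunks and SHA-256 digests are simplifying assumptions, and the traffic counter ignores the small cost of exchanging digests. It is not a description of how StoreOnce Catalyst works internally.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny for readability


def digest(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


def split(data: bytes, size: int = CHUNK_SIZE):
    return [data[i:i + size] for i in range(0, len(data), size)]


class DedupeStore:
    """Hypothetical deduplicated store that can report which chunks it lacks."""

    def __init__(self):
        self.chunk_index = {}    # digest -> chunk bytes
        self.bytes_received = 0  # chunk payload received over the network

    def missing(self, digests):
        """Return the set of digests this store does not hold yet."""
        return {d for d in digests if d not in self.chunk_index}

    def put(self, new_chunks: dict):
        for d, chunk in new_chunks.items():
            self.bytes_received += len(chunk)
            self.chunk_index[d] = chunk


def source_side_backup(data: bytes, store: DedupeStore):
    """Runs on the production server: hash locally, send only unknown chunks."""
    pieces = split(data)
    recipe = [digest(c) for c in pieces]
    by_digest = dict(zip(recipe, pieces))
    wanted = store.missing(recipe)
    store.put({d: by_digest[d] for d in wanted})
    return recipe


def replicate(recipe, primary: DedupeStore, secondary: DedupeStore):
    """Copy only the chunks the secondary lacks; nothing is rehydrated."""
    wanted = secondary.missing(recipe)
    secondary.put({d: primary.chunk_index[d] for d in wanted})


def restore(recipe, store: DedupeStore) -> bytes:
    return b"".join(store.chunk_index[d] for d in recipe)


primary, secondary = DedupeStore(), DedupeStore()
monday = b"AAAABBBBCCCCDDDD"
tuesday = b"AAAABBBBCCCCEEEE"  # one changed chunk, like the paper's "1GB changed"

monday_recipe = source_side_backup(monday, primary)
replicate(monday_recipe, primary, secondary)
tuesday_recipe = source_side_backup(tuesday, primary)
replicate(tuesday_recipe, primary, secondary)

# The second backup and its replica each moved only the single changed chunk.
print(primary.bytes_received, secondary.bytes_received)
```

Scaled up, this is the shape of the 500GB example: the first pass moves the deduplicated data set once, and each later backup moves only what changed, both to the primary store and on to the secondary site.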
  • 6. HP StoreOnce Backup
    HP StoreOnce Backup technology is a “Dedupe 2.0–ready” technology that was first introduced in 2010. StoreOnce deduplication is available in HP's B6200 Backup System appliance, backup/recovery software, and HP Data Protector. By using a consistent deduplication algorithm across the product suite (see Figure 3), HP StoreOnce minimizes problems related to different products with incompatible deduplication algorithms, which normally lead to comparatively slower recovery versus backup performance and decreased efficiency when it comes to scaling a deduplication solution to meet expanding capacity requirements.

    Figure 3. HP StoreOnce Using Common Deduplication Technology from Source, to Backup Server, to Storage
    Source: HP, 2012.

    HP StoreOnce Catalyst
    StoreOnce Catalyst (new in 2012) is a software accelerator that enables deduplication anywhere rather than just at those places in the network where specific vendor technologies allow it. StoreOnce Catalyst leverages a common deduplication algorithm across the enterprise and allows deduplication at the:
      • Production source or “Client.”
      • Backup or media server.
      • Target appliance.

    By enabling source-side deduplication (Dedupe 2.0), HP has the advantage of performing deduplication at the data creation point. Source-side deduplication eliminates the need for specialist deduplication hardware at remote and branch office sites. StoreOnce Catalyst allows customers to align backup with data protection needs, such as minimizing bandwidth utilization when moving data between sites or data centers.

    As a key “Dedupe 2.0” design goal, StoreOnce Catalyst provides a single technology that can be used in multiple locations on the network and that does not require rehydration when data is transferred between source server, backup device, and target appliance.

    As a software component, HP StoreOnce Catalyst supports HP Data Protector and Symantec NetBackup via OST Integration. Symantec Backup Exec support will be added in August, along with other software solutions that utilize the HP StoreOnce Catalyst software development kit (SDK).

    HP StoreOnce B6200
    The HP StoreOnce B6200 Backup System, the largest in the StoreOnce family, is an enterprise deduplication appliance that uses a scale-out architecture. The B6200 can grow from a single, 48TB two-node couplet up to a capacity of 768TB. It also has a pay-as-you-grow model that allows nodes to be added to the configuration without incurring downtime.
  • 7. Figure 4. The HP StoreOnce Family
    Source: HP, 2012.

    Also notable are the HP StoreOnce B6200's high availability features, which mitigate the impact of node failures. Its Autonomic Restart feature configures backup jobs to restart after node failover without requiring the intervention of the data protection administrator. The StoreOnce B6200 is designed to automatically detect and remediate certain failures, hiding this complexity from data protection administrators and users.

    Performance
    According to HP, the HP StoreOnce B6200 Backup system is able to restore data at nearly the same rate at which it performs native backups. Table 1 shows HP’s reported performance numbers for backups with the B6200.

    Table 1. HP StoreOnce B6200 Performance Numbers, as Reported by HP
        Native VTL performance (max. configuration)                                   40 TB/hour
        With source-side deduplication via StoreOnce Catalyst (max. configuration)   100 TB/hour
        Native VTL performance (single couplet)                                       10 TB/hour
        With source-side deduplication via StoreOnce Catalyst (single couplet)        25 TB/hour
    Source: HP, 2012.

    According to HP, this places the B6200 in a strong position against its closest (unnamed) competitor at:
      • Native VTL performance (max config): 2.7 times its competition
      • With source-side deduplication (max config): 3.2 times its competition
      • Native VTL performance (single couplet): 1.2 times its competition
      • With source-side deduplication (single couplet): 1.7 times its competition

    According to HP, the performance gains shown in Table 1 are largely due to StoreOnce Catalyst, which enables the deduplication tasks to be distributed across the data protection infrastructure. The ability to perform deduplication at the source, at the backup server, and at the appliance gives organizations new options to help them minimize the growth of data under protection. In providing a solution that works at the source, backup server, and appliance level, HP offers a comprehensive Dedupe 2.0 solution.
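To make the figures in Table 1 easier to compare, the short Python sketch below works through the arithmetic. The throughput numbers and competitive multiples are the ones HP reported above; the helper names, the "implied competitor rate" (simply HP's figure divided by HP's claimed multiple), and the 200 TB data set used for the backup-window estimate are assumptions introduced here for illustration and do not appear in the paper.

```python
# Throughput as reported by HP in Table 1, in TB/hour.
reported = {
    "max configuration": {"native_vtl": 40, "catalyst": 100},
    "single couplet": {"native_vtl": 10, "catalyst": 25},
}

# HP's claimed advantage over its unnamed closest competitor (multiples).
claimed_multiple = {
    "max configuration": {"native_vtl": 2.7, "catalyst": 3.2},
    "single couplet": {"native_vtl": 1.2, "catalyst": 1.7},
}


def catalyst_speedup(config: str) -> float:
    """Ratio of Catalyst (source-side) throughput to native VTL throughput."""
    rates = reported[config]
    return rates["catalyst"] / rates["native_vtl"]


def implied_competitor_rate(config: str, mode: str) -> float:
    """Competitor throughput implied by HP's claim (a derived, unverified figure)."""
    return reported[config][mode] / claimed_multiple[config][mode]


def backup_hours(dataset_tb: float, rate_tb_per_hour: float) -> float:
    """Idealized backup window: data set size divided by sustained throughput."""
    return dataset_tb / rate_tb_per_hour


for config in reported:
    print(config, "Catalyst speed-up:", catalyst_speedup(config))
    for mode in ("native_vtl", "catalyst"):
        rate = implied_competitor_rate(config, mode)
        print(f"  implied competitor {mode}: {rate:.1f} TB/hour")

# Hypothetical 200 TB full backup on the maximum configuration.
print(backup_hours(200, reported["max configuration"]["native_vtl"]), "hours via native VTL")
print(backup_hours(200, reported["max configuration"]["catalyst"]), "hours via Catalyst")
```

On HP's own numbers, Catalyst gives the same 2.5x uplift over native VTL in both configurations, so the hypothetical 200 TB job shrinks from 5 hours to 2. These are best-case, vendor-reported rates; real backup windows depend on data change rates, network capacity, and the deduplication ratio actually achieved.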
  • 8. Backing up the data is one thing, but as the amount of protected data grows, the ability to restore it fast enough to meet SLAs becomes more and more challenging. Notably, HP reports that its StoreOnce B6200 can ingest data (in VTL mode) and restore data at up to 40 TB/hour. This is especially impressive since many deduplication technologies suffer an I/O penalty that causes them to restore at much slower rates than their published ingest rate.

    But at the end of the day, the most important performance number is “price-performance” ($/TB/hour), in which HP claims that it offers 75% better price-performance than its primary competition.

    HP Data Protector 7
    Also “new” to the HP StoreOnce family is the addition of HP Data Protector (now shipping version 7) to the portfolio. HP Data Protector 7 (HP DP7) is designed to take advantage of the deduplication capabilities of the StoreOnce family and its Catalyst accelerator to enable client-side, media-centric, or target-based deduplication. In addition, by utilizing snapshots from HP storage arrays, HP DP7 can perform “instant recoveries” of complex workloads or large datasets.

    HP DP7 also takes advantage of another HP acquisition: the Autonomy Cloud. Currently boasting 50PB, the HP Autonomy cloud is a superset of what was once Iron Mountain Digital’s online storage and recovery repository, and is now an ideal location to replicate HP DP7 data for offsite preservation. HP DP7 also offers expanded hypervisor support (VMware, Microsoft Hyper-V, and Citrix) as well as enterprise application protection for Microsoft, Oracle, and SAP server platforms, including granular data restore capabilities for Microsoft Exchange, SharePoint, and VMware.

    As a last note, HP DP7 also supports HP Autonomy’s IDOL 10 framework to deliver “meaning-based data protection,” resulting in data protection, retention, and indexing based on the types and context of data in an environment.

    HP StoreOnce and Symantec OST
    The benefits of HP StoreOnce are not limited to just Data Protector. Via a Symantec OST standard plug-in, a NetBackup Media Server is able to utilize HP StoreOnce deduplicated storage, enabling smarter transit not only from the media server to the deduplicated storage, but also between HP StoreOnce arrays without rehydration, all managed by NetBackup.

    Figure 5. HP StoreOnce Using Common Deduplication Technology from Source, to Backup Server, to Storage
    Source: HP, 2012.

    While this design does not use StoreOnce Catalyst on the production server, the architecture still qualifies as a “Dedupe 2.0” solution since Symantec NetBackup 7.5 has its own client-side deduplication and a new feature called “Accelerator” for optimized discernment from the production node, as well as utilizing the deduplication capabilities within the StoreOnce appliances.
  • 9. The Bigger Truth
    Between increased use of server virtualization, the dynamic proliferation of private cloud, the ever-growing unstructured data pool, and the advent of “big data,” organizational data is going to continue to grow. Effective deduplication reduces the impact of that growth, lessening the strain of constant storage growth on organizations. All organizations need an effective deduplication strategy; ignoring deduplication will lead to data protection infrastructure being swamped with data.

    Choosing a deduplication strategy involves assessing a range of factors such as data type and location. By being able to support deduplication at the data source, backup server, and appliance, HP StoreOnce gives data protection solution architects a high degree of flexibility in developing a deduplication solution tailored to their needs.

    By delivering a single solution through HP Data Protector paired with StoreOnce appliances, and a joint solution that offers StoreOnce capabilities through Symantec OST capable software, HP has taken aim at a unified and broadly applicable data protection solution that challenges the presumptions of deduplicated storage. With the addition of cloud capabilities and the “understanding” of data through Autonomy’s assets, the data protection game just got more interesting.

    Deduplication is not new, and HP was certainly not the first to market it. Instead, by watching how deduplication was introduced and listening to the evolving demands of customers who struggle with storage and backup issues, HP built StoreOnce as a next generation or “Dedupe 2.0” architecture that is available now. With its formidable enterprise experience, server and storage product lines, and broad partner ecosystem, HP intends to catch up with, and in fact, surpass the status quo to bring better deduplication at a lower cost.

    If you haven’t already, 2012 is the year to wholly adopt deduplication within your data protection strategy. With HP’s newest offerings, the short list of enterprise-suitable deduplication for data protection providers just got bigger, adding a name that many view as synonymous with enterprise servers and their storage.
  • 10. 20 Asylum Street | Milford, MA 01757 | Tel: 508.482.0188 Fax: 508.482.0218 | www.esg-global.com 4AA4-1782ENW
