Data Reduction Technologies Impact the Cloud: What You Need to Know
 

This presentation outlines the trends and forces driving the use of the cloud as a tier in data protection for virtualized environments. We’ll discuss the data protection partnership required from the source of data (virtual machine disks and the files inside virtual machines) all the way to the cloud in order to achieve data protection goals. This includes examples of how content-specific data reduction techniques for virtualized disks can provide immediate value as the cloud becomes the next great tier in the data center storage hierarchy. Finally, we examine the challenges that arise as data reduction becomes more valuable as a wire-usage optimization than as a disk-usage optimization.

Presentation Transcript

  • Cloudy With a Chance of Data Reduction: How Data Reduction Technologies Impact the Cloud. Mitch Haile, Technical Director, Virtualization. October 2011
  • Overview: The World Has Changed; Life of a File; On the Wire
  • The World Has Changed
  • Unstructured FILE Data Growth: over-committing storage infrastructure and budget; by 2015, 80% of file data in virtualized server environments (by capacity); management nightmare. Sources: ESG Digital Archive Study, June 2010; IDC, Virtualized Environments, August 2010; NetApp Letter to Stockholders, Nov 2009
  • Structured Data: Virtual Disks
  • Structured Data: Virtual Disks, Databases, Email, Movies
  • Data Usage: Hot, Warm, DR, Compliance
  • Cheaper Storage: 1 Petabyte retail = $39,995
  • MTBF and size per spindle
    – SATA disk: look at errors/disk capacity
    – 2 TB disk → 16% chance of hard error
    – 4 TB disk → 32% chance of hard error
    – ~2015, disks expected to be 8 – 16 TB → 100% chance of error during whole disk read
    Source: Steve Hetzler, IBM, FAST ’11
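The percentages on this slide scale linearly with capacity, which is what you get from multiplying the number of bits read by a fixed unrecoverable-error rate. The sketch below reproduces that arithmetic. The rate of one error per 1e14 bits is an assumption (a figure commonly quoted for SATA drives); the slide does not say which rate it used. The function names are illustrative only.

```python
import math

# ASSUMPTION: one unrecoverable read error per 1e14 bits read.
# The slide does not state the rate behind its percentages.
ERRORS_PER_BIT = 1e-14


def expected_hard_errors(capacity_tb: float) -> float:
    """Expected number of unrecoverable errors in one full-disk read."""
    bits = capacity_tb * 1e12 * 8  # decimal terabytes converted to bits
    return bits * ERRORS_PER_BIT


def chance_of_hard_error(capacity_tb: float) -> float:
    """Probability of at least one error, treating errors as independent."""
    return 1.0 - math.exp(-expected_hard_errors(capacity_tb))


for tb in (2, 4, 8, 16):
    print(f"{tb:>2} TB: expected errors {expected_hard_errors(tb):.2f}, "
          f"P(at least one) {chance_of_hard_error(tb):.0%}")
```

Under this assumption the expected error count is 0.16 for 2 TB and 0.32 for 4 TB, matching the slide, and it passes 1.0 between 8 TB and 16 TB. The probability of at least one error stays below 100% in the independent-error model, so the slide's "100%" is best read as "expect at least one error per full read".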
  • Power TCO
    – Cost of powering 1 drive = from ~$14 to ~$33 per year
    – Cost of cooling 1 drive = ~$33 per year
    – If your kWh cost is ~13¢, you can save ~$28,000/Petabyte/Year on just power/cooling
    Sources: http://www.scsita.org/aboutscsi/sas/SFF_WP_3Q08.pdf http://features.techworld.com/storage/3454/calculating-carbon-emissions/
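To relate the per-drive dollar figures to drive wattage, a minimal sketch follows. It assumes the drive runs continuously all year and that the quoted cost is pure energy cost at the slide's ~13¢ per kWh; the function names are illustrative. The slide does not show how the ~$28,000 per petabyte figure was derived, so that number is not reproduced here.

```python
HOURS_PER_YEAR = 24 * 365
PRICE_PER_KWH = 0.13  # from the slide: roughly 13 cents per kWh


def annual_power_cost(watts: float, price_per_kwh: float = PRICE_PER_KWH) -> float:
    """Dollars per year to run a device drawing `watts` continuously."""
    return watts / 1000 * HOURS_PER_YEAR * price_per_kwh


def watts_for_annual_cost(dollars: float, price_per_kwh: float = PRICE_PER_KWH) -> float:
    """Continuous draw in watts implied by a yearly energy bill."""
    return dollars / (HOURS_PER_YEAR * price_per_kwh) * 1000


# The slide's range for powering one drive, converted back to watts.
for dollars in (14.0, 33.0):
    print(f"${dollars:.0f}/year is about {watts_for_annual_cost(dollars):.1f} W continuous")
```

Every drive that data reduction removes from the floor saves its power cost plus its cooling cost, which is why the per-petabyte saving grows with the reduction ratio.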
  • “If I can do it at home …”
  • The Cost of Moving Data: Tolls; Throttled
  • Data Volume: Cloud
  • Perfect Storm
  • Observations
    – Lots of copies of data you will never use again
    – Economies of scale will help drive cloud adoption
    – Optimizing the wire will dominate
    – Optimizing the wire requires data reduction
  • LIFE OF A FILE
  • Old Life of a File. When: On, Backup, Monthly. Where: Source, LAN, LAN. How long: until deleted or changed or expired, one week, forever.
  • Life of a File Today. When: On Write Time, Or Tonight, This weekend, End of Month, Or …. Where: Source, LAN, WAN, Cloud, Cloud. How long: until deleted or changed or expired, one week, one month, six months, seven years.
  • Life of a File.2
  • Life of a File: Data Reduction. Encryption, Compression, Filters, Data-specific Deduplication, Data Pipeline, Data Specific. Data reduction starts here; otherwise opportunities are lost. Where: Source, LAN, WAN, Cloud, Cloud.
  • Email: Split the Data. Fixed Reduction: LAN, WAN, Cloud. Meta-Data: Everywhere. Plain Copy: LAN. Variable Reduction: WAN, Cloud.
  • Virtual Disks: Split the Data. Fixed Reduction: Source (swap, Temp, Temp, Temp). File: Plain Copy, Source. File, File, File: Variable Reduction, Source, LAN, Cloud. File, File, File: Variable Reduction, Cloud. Meta-Data: Everywhere.
  • ON THE WIRE
  • Data Reduction and the Cloud
    For the front-end:
    – Bandwidth reduction value
    – Latency improvement
    – Overall cost because of billing model
    For the back-end:
    – Storage footprint
    – Data is copied around a lot: bandwidth saving
    But there are issues:
    – Need homogeneous solutions
    – Must be woven into DP and DM activities
    – Must be automatic; sizing can be a big issue
  • Dedupe-Aware Transfers
    Dedupe-aware transfers disruptively reduce the amount of data passed between the tiers
    – Dedupe negotiation protocols are required
    – These protocols can be layered over compression
    – These protocols must efficiently handle two cases: data being pushed from the edge into a repository, and data being replicated between repositories
    [Diagram: Primary Data Center (Storage Systems, Database, Mail, Files, Virtual) with Efficient Data Transfer and Efficient Data Replication to a Remote DR Facility and Tape Library; labels: Migration, Integration, Encryption, Indexing, Self-healing]
  • Data Reduction “On-the-Wire”
    Multiple considerations when moving data over the wire:
    – Is data being moved between a data-reduced repo and a traditional “raw” system?
    – Is data being moved between two systems with the same reduction technology?
    – Can multiple data reduction technologies be employed at each stage of movement?
    Mixing file and block level solutions is problematic; often, mixing NAS and VTL demonstrates similar problems
    What media must the data be moved over: high-latency or low-latency?
    Each data reduction scheme has benefits and downsides in each of the above scenarios
    There is no “free lunch”: somehow, somewhere you have to dehydrate and re-hydrate the data!
  • Deduplication On-the-Wire
    Most dedupe vendors offer dedupe-enabled replication, but there is a lot of variance
    Most are somewhat complex forms of a simple model:
    – Client batches up a group of sequential chunk fingerprints
    – Client sends the batch to a smart target that can query the existence of each fingerprint
    – Target sends back results and client pushes unique data
    The above scheme only works when client and server can both form identical chunks and fingerprints
    Collaborative dedupe schemes are less common; these schemes provide a method that allows the client to chunk and fingerprint data to enable the negotiation
    Collaborative schemes don’t work over the old legacy protocols (NAS); that’s starting to change (OST/XAM/pNFS)
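The simple model on this slide (batch fingerprints, query the target, push only unique data) can be sketched in a few lines. This is an illustration under stated assumptions, not any vendor's protocol: fixed-size chunking, SHA-256 fingerprints, the `Target` class, and the `CHUNK_SIZE` and `BATCH_SIZE` values are all hypothetical choices made for the example, and the "wire" is an in-process method call.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4096  # hypothetical fixed chunk size in bytes
BATCH_SIZE = 64    # hypothetical number of fingerprints per round trip


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    """Fixed-size chunking; both ends must chunk identically."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


class Target:
    """Stand-in for the smart target: a chunk store keyed by fingerprint."""

    def __init__(self) -> None:
        self.store: Dict[str, bytes] = {}

    def query(self, fingerprints: List[str]) -> List[bool]:
        """Report, per fingerprint, whether the chunk is already stored."""
        return [fp in self.store for fp in fingerprints]

    def push(self, chunk: bytes) -> None:
        self.store[fingerprint(chunk)] = chunk


def transfer(data: bytes, target: Target) -> Dict[str, int]:
    """Send data to the target, pushing only chunks it does not have."""
    stats = {"round_trips": 0, "chunks_sent": 0, "bytes_sent": 0}
    pieces = split_chunks(data)
    for start in range(0, len(pieces), BATCH_SIZE):
        batch = pieces[start:start + BATCH_SIZE]
        fps = [fingerprint(c) for c in batch]
        known = target.query(fps)  # one negotiation round trip per batch
        stats["round_trips"] += 1
        sent_in_batch = set()      # avoid re-sending duplicates within a batch
        for chunk, fp, present in zip(batch, fps, known):
            if not present and fp not in sent_in_batch:
                target.push(chunk)
                sent_in_batch.add(fp)
                stats["chunks_sent"] += 1
                stats["bytes_sent"] += len(chunk)
    return stats
```

Calling `transfer` twice with the same data against one `Target` pushes every unique chunk the first time and none the second, while still paying one round trip per batch on both runs. Real products typically use variable-size chunking, which is one reason client and server must agree on how chunks are formed.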
  • Deduplication On-the-Wire
    Benefits and cost are more subtle:
    – Most dedupe solutions send file/object level hash of hashes to prune copies similar to SI technologies
    – Some solutions provide hierarchical hash-of-hashes to obviate the transfer of large ranges
    – Most solutions can negotiate individual chunks
    – For solutions that negotiate all (or most) chunks, there are a large number of hash negotiations: excellent results when most actual data transfer is obviated; results can add to transfer overhead when dedupe ratios are low; the cost of hash negotiations serializes data transfers, which can be invisible on low-latency wires but cause significant slowdowns on high-latency wires
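The latency point on this slide can be made concrete with a back-of-the-envelope model. The sketch below assumes a strictly serialized protocol (each fingerprint batch waits a full round trip before the next one starts), 32-byte fingerprints, and a raw transfer that streams at full bandwidth; all parameter names and values are hypothetical and chosen only to show the shape of the trade-off.

```python
import math


def raw_seconds(total_chunks: int, chunk_bytes: int, bandwidth_bps: float) -> float:
    """Time to stream every chunk with no negotiation."""
    return total_chunks * chunk_bytes * 8 / bandwidth_bps


def negotiated_seconds(total_chunks: int, chunk_bytes: int, fraction_known: float,
                       batch: int, rtt_s: float, bandwidth_bps: float,
                       fp_bytes: int = 32) -> float:
    """Time for a serialized dedupe negotiation.

    fraction_known is the share of chunks the target already holds (0.0 to 1.0).
    """
    round_trips = math.ceil(total_chunks / batch)
    unique_chunks = round(total_chunks * (1.0 - fraction_known))
    payload_bytes = total_chunks * fp_bytes + unique_chunks * chunk_bytes
    return round_trips * rtt_s + payload_bytes * 8 / bandwidth_bps


if __name__ == "__main__":
    common = dict(total_chunks=10_000, chunk_bytes=4096, batch=64, bandwidth_bps=100e6)
    print("raw:", raw_seconds(10_000, 4096, 100e6))
    for rtt in (0.001, 0.1):
        for known in (0.9, 0.0):
            t = negotiated_seconds(fraction_known=known, rtt_s=rtt, **common)
            print(f"rtt={rtt}s, known={known:.0%}: {t:.2f}s")
```

In this model, negotiation wins clearly on a 1 ms wire when most chunks are already known, adds pure overhead when nothing is known, and loses to the raw stream on a 100 ms wire because the round trips dominate. Larger batches or pipelined negotiations reduce that penalty.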
  • Data Reduction EARLY. Timeline: Days → Weeks → Months → Years. [Diagram: Division/Remote, Primary Data Center, DR Facility, Archive; Centralized Management; Open Replication between sites; Files, Virtual, Database, Mail; Core; Server-based Solution, Midrange Storage Solution, Enterprise Storage Solution, Managed Service Solution]
  • Other Details Federation Policy Data Movement
  • Conclusion
    – Forces are driving us to Cloud more quickly
    – It’s all about the wire
    – Ubiquitous data reduction