© 2014 WESTERN DIGITAL TECHNOLOGIES, INC. ALL RIGHTS RESERVED
5 Questions to Ask Your Vendor
About Data Deduplication

5 Key Questions to Ask Your Vendor About Data Deduplication

1. Deduplication technology options
2. The role of deduplication in backup
3. Fixed-block versus variable-block deduplication
4. Benefits of application-aware deduplication
5. The pros and cons of integrating deduplication

  1. 5 Questions to Ask Your Vendor About Data Deduplication
  2. 5 Questions to Ask Your Vendor About Dedupe: Agenda
     • Deduplication vs. compression
     • Target-side vs. source-side dedupe in backup
     • Fixed vs. Variable vs. Progressive™ deduplication
     • Strategies for block decomposition
     • Orphaned Block Reclamation
  3. Role of Deduplication in Backup (Basics of Deduplication)
     • Reduces Storage
     • Shortens Backup Window
     • Across computers (e.g. word.exe)
     • Across/Within Files (e.g. PPT files)
     • Over Time (e.g. outlook.pst)
  4. Key Deduplication Technologies (Basics of Deduplication)
     Block decomposition
     • Progressive Deduplication™
     • Sliding Window with Progressive Matching
     Orphaned Block Reclamation
     • Interruptible / Resumable
     • Scalable (for arbitrarily large block pools)
     Hash management
     • Scalable (for machines with limited memory)
     • Source-side (agent-side) cache management
  5. Compression Technologies (Basics of Deduplication)
     Compression Type   Compression Target                          Examples
     Local              Files and Blocks (files and smaller)        DEFLATE (Zip), JPEG, MPEG, Various…
     Global             Files and File Systems (files and larger)   Deduplication, Byte-differencing
  6. Compression Technologies (Basics of Deduplication; table repeated from slide 5)
     Question 1 of 5: Do you really deduplicate?
  7. Block-Grain Deduplication (Basics of Deduplication)
     Files are replaced by pointers to compressed blocks.
     [Diagram: files A, B, C, D]
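The idea on this slide, files replaced by pointers to compressed blocks, can be sketched in a few lines. This is a minimal illustration and not the vendor's implementation: the `BlockStore` class, the fixed 8,192-byte block size, SHA-256 fingerprints, and zlib compression are all assumptions chosen for the example.

```python
import hashlib
import zlib

BLOCK_SIZE = 8192  # assumed block size, for illustration only


class BlockStore:
    """Hypothetical block pool: each unique block is stored once, compressed."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> compressed block
        self.files = {}   # file name -> ordered list of fingerprints (the "pointers")

    def put(self, name, data):
        recipe = []
        for start in range(0, len(data), BLOCK_SIZE):
            block = data[start:start + BLOCK_SIZE]
            fingerprint = hashlib.sha256(block).digest()
            if fingerprint not in self.blocks:
                # Only blocks not seen before consume storage.
                self.blocks[fingerprint] = zlib.compress(block)
            recipe.append(fingerprint)
        self.files[name] = recipe

    def get(self, name):
        # Rebuild the file by following its pointers into the block pool.
        return b"".join(zlib.decompress(self.blocks[fp]) for fp in self.files[name])
```

Two files that share blocks then share storage: the second `put` adds only the blocks the pool has not already seen.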
  8. Target vs. Source Dedupe in Backup (Basics of Deduplication)
     [Diagram: Source of data, LAN, Backup data storage]
     Processing                   Target Deduplication   Source Deduplication
     Location of Deduplication?   At storage box         At data source
     Impact on…
       Storage                    Positive               Positive
       Network                    None                   Positive
       Backup Server Scaling      Scale-up               Scale-out
  9. Target vs. Source Dedupe in Backup (table repeated from slide 8)
     Question 2 of 5: Is deduplication source-side?
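The "Network" row of the table can be made concrete with a small byte count. The sketch below is a simplified model under stated assumptions: fixed 8,192-byte blocks, 32-byte SHA-256 fingerprints, and a hypothetical protocol in which a source-side agent sends each fingerprint and then the block only if the server does not already hold it. Real products exchange this information differently.

```python
import hashlib

BLOCK_SIZE = 8192  # assumed block size
FP_LEN = 32        # SHA-256 digest length in bytes


def split(data):
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def target_side_bytes_sent(data, server_index):
    """Target-side: every byte crosses the LAN; the storage box dedupes on arrival."""
    for block in split(data):
        server_index.add(hashlib.sha256(block).digest())
    return len(data)


def source_side_bytes_sent(data, server_index):
    """Source-side: send each fingerprint, and the block itself only if unknown."""
    sent = 0
    for block in split(data):
        fingerprint = hashlib.sha256(block).digest()
        sent += FP_LEN
        if fingerprint not in server_index:
            sent += len(block)
            server_index.add(fingerprint)
    return sent
```

In this model, a second backup of unchanged data costs only the fingerprints on the network, while the target-side variant resends everything.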
  10. Dedupe Step 1: File Decomposition (Basics of Deduplication)
      Strategy            File        Block   Variable-Block   Progressive
      Speed               Excellent   Good    Fair             Good
      Compression Ratio   Poor        Fair    Good             Better
  11. Dedupe Step 1: File Decomposition (Basics of Deduplication; table repeated from slide 10)
      Question 3 of 5: Is deduplication file- or block-grain?
  12. Dedupe Step 1: File Decomposition (Basics of Deduplication)
      Strategy            File        Fixed Block   Variable-Block   Progressive
      Speed               Excellent   Good          Fair             Good
      Compression Ratio   Poor        Fair          Good             Better
  13. Dedupe Step 1: File Decomposition (Basics of Deduplication; table repeated from slide 12)
      Question 4 of 5: Is deduplication fixed-block…or better?
  14. Dedupe Step 2: Hash Calculation (Basics of Deduplication)
      Each 8,192-byte block is uniquely identified by a 16-byte fingerprint.
      Unique Blocks & Fingerprints (as shown on the slide):
        1EC52D04100B9977
        4FCC5428CC9C4CB6
        BBE52B32B4672B36
        1B1279BD62654F98
        908B3236BC03B5F0
      Fingerprint Types
      Algorithm         Length (bytes)
      MD5               16
      SHA-1             20
      SHA-2 (SHA-256)   32
      SHA-2 (SHA-512)   64
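The fingerprint lengths in the table can be checked against Python's standard `hashlib`. The helper name `digest_lengths` is made up for this example; the two SHA-2 rows on the slide correspond to SHA-256 and SHA-512.

```python
import hashlib


def digest_lengths(names=("md5", "sha1", "sha256", "sha512")):
    """Return the digest length in bytes for each named hash algorithm."""
    return {name: hashlib.new(name).digest_size for name in names}


def fingerprint(block, algorithm="md5"):
    """Fingerprint one block; MD5 yields the 16-byte length quoted on the slide."""
    return hashlib.new(algorithm, block).digest()
```

A fingerprint of an 8,192-byte block is therefore 16 to 64 bytes, depending on the algorithm chosen.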
  15. Deduplication Genealogies (Basics of Deduplication)
      • Ron Rivest, Message Digests, 1989: Single-instance Storage
      • RockSoft patent, 1995: Content-defined chunks
      • Andrew Tridgell, 1998: Single-file, rolling checksum
      • Kadena, 2004: Multiple file pools
      • Patent acquired by ADIC in March 2006; Quantum in June 2006
      • Arkeia Software, 2009: Block size optimization
      Algorithm families shown: Fixed-Block Algorithm, Variable-Block Algorithm, Progressive Algorithm
  16. Deduplication Technologies (Basics of Deduplication)
      Tolerates…               Single-instance   Fixed-block   Variable-block   Progressive
      “Append” or “Modify”     No                Yes           Yes              Yes
      “Insert”                 No                No            Yes              Yes
      Compression Ratios       Poor              Fair          Good             Better
      Speed                    Excellent         Good          Fair             Good
  17. Fixed-Block (Basics of Deduplication)
      [Diagram: blocks of a file, the file with an append, and the file with an insert, each block labelled New or Known]
      Tolerates…               Fixed-block   Variable-block   Progressive
      “Append” or “Modify”     Yes           Yes              Yes
      “Insert”                 No            Yes              Yes
      Compression Ratios       Fair          Good             Better
      Speed                    Good          Fair             Good
  18. Variable-Block (Basics of Deduplication)
      [Diagram: blocks of a file, the file with an append, and the file with an insert, each block labelled New or Known]
      Tolerates…               Fixed-block   Variable-block   Progressive
      “Append” or “Modify”     Yes           Yes              Yes
      “Insert”                 No            Yes              Yes
      Compression Ratios       Fair          Good             Better
      Speed                    Good          Fair             Good
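The "Insert: No" entry for fixed-block deduplication follows from how block boundaries are chosen. The sketch below uses tiny 8-byte blocks and SHA-256 fingerprints, both chosen only for illustration: an append leaves the earlier blocks recognizable, while a one-byte insert at the front shifts every boundary so that no block matches.

```python
import hashlib


def fixed_fingerprints(data, size=8):
    """Fingerprint consecutive fixed-size blocks (the last one may be shorter)."""
    return [hashlib.sha256(data[i:i + size]).digest() for i in range(0, len(data), size)]


def count_known(old, new, size=8):
    """Count blocks of `new` whose fingerprints already occur in `old`."""
    known = set(fixed_fingerprints(old, size))
    return sum(fp in known for fp in fixed_fingerprints(new, size))


original = bytes(range(64))       # eight distinct 8-byte blocks
appended = original + b"tail"     # all eight original blocks are still found
inserted = b"X" + original        # every block boundary shifts by one byte
```

With these inputs, `count_known(original, appended)` finds all eight original blocks, and `count_known(original, inserted)` finds none.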
  19. Variable-Block Deduplication Detail (Basics of Deduplication)
      File byte index: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 •••
      STEP ONE: Polynomial hash function, “F”, randomizes file data.
        F(1, 2, 3, 4, 5, 6, 7, 8)         = 31461D82B327A05F
        F(2, 3, 4, 5, 6, 7, 8, 9)         = 879DE819CAF76125
        F(3, 4, 5, 6, 7, 8, 9, 10)        = 67B3EE45C803695C
        F(4, 5, 6, 7, 8, 9, 10, 11)       = 93B4AB041829F665
        F(5, 6, 7, 8, 9, 10, 11, 12)      = DA0D523C5F9706BB
        F(6, 7, 8, 9, 10, 11, 12, 13)     = DBDF4303EC8095D7
        F(7, 8, 9, 10, 11, 12, 13, 14)    = 668556EE61492000   (End of “Variable-Block”)
        F(8, 9, 10, 11, 12, 13, 14, 15)   = 5217DDF625C2C270
        F(9, 10, 11, 12, 13, 14, 15, 16)  = 3FBC70CEFE54395B
        •••
      STEP TWO: Examine last “n” bits; if zero then block boundary.
      (If n = 12, average block length = 2^12 = 4096 bytes.)
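The two steps on this slide can be sketched as a content-defined chunker. This is an illustrative sketch, not the function used by any product: the slide does not specify "F", so the base, modulus, and 8-byte window below are assumptions, and `chunk_boundaries` and `chunks` are hypothetical names.

```python
WINDOW = 8            # bytes per hash window, matching the 8-byte windows on the slide
BASE = 1_000_003      # assumed polynomial base
MOD = (1 << 61) - 1   # assumed prime modulus


def chunk_boundaries(data, n_bits=12):
    """Return end offsets of blocks: a boundary falls wherever the rolling
    polynomial hash of the last WINDOW bytes has its low n_bits all zero."""
    mask = (1 << n_bits) - 1
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte that leaves the window
    h = 0
    boundaries = []
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD  # drop the oldest byte
        h = (h * BASE + byte) % MOD                 # shift in the newest byte
        if i >= WINDOW - 1 and (h & mask) == 0:
            boundaries.append(i + 1)
    return boundaries


def chunks(data, n_bits=12):
    """Split data at the boundaries found above."""
    out, start = [], 0
    for end in chunk_boundaries(data, n_bits):
        out.append(data[start:end])
        start = end
    if start < len(data):
        out.append(data[start:])
    return out
```

Because each boundary depends only on the bytes inside the window, inserting bytes near the start of a file shifts later boundaries by the same amount instead of destroying them, which is why variable-block deduplication tolerates inserts.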
  20. Variable-Block vs. Progressive Dedupe (Basics of Deduplication)
      Example: 1MB file; 1kB target block size (TBS) chosen to deliver optimal compression
      VARIABLE-BLOCK DEDUPLICATION
      1. 53% of blocks by number are shorter than half or longer than twice TBS
      2. 50% of blocks by volume are shorter than half or longer than twice TBS
      PROGRESSIVE DEDUPLICATION
      1. Every block is the size of the sliding window
      Because variable-block dedupe cannot control block sizes precisely, it cannot be “content aware”
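The slide does not say how the 53% and 50% figures were obtained. One model that yields similar numbers, offered here as an assumption rather than the authors' derivation, treats every byte position as an independent block boundary with probability 1/TBS, so block lengths are approximately exponentially distributed with mean TBS. The function names below are hypothetical.

```python
import math


def outlier_fraction_by_number(lo=0.5, hi=2.0):
    """Fraction of blocks shorter than lo*TBS or longer than hi*TBS,
    for exponentially distributed lengths with mean TBS."""
    return (1 - math.exp(-lo)) + math.exp(-hi)


def outlier_fraction_by_volume(lo=0.5, hi=2.0):
    """Fraction of bytes held in those blocks; each block is weighted by its length."""
    below = 1 - (1 + lo) * math.exp(-lo)
    above = (1 + hi) * math.exp(-hi)
    return below + above
```

Under this model the two functions evaluate to roughly 0.53 and 0.50, which is consistent with the figures quoted on the slide.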
  21. Progressive Deduplication (Basics of Deduplication)
      [Diagram: blocks of a file, the file with an append, and the file with an insert, each block labelled New or Known]
      Tolerates…               Fixed-block   Variable-block   Progressive
      “Append” or “Modify”     Yes           Yes              Yes
      “Insert”                 No            Yes              Yes
      Compression Ratios       Fair          Good             Better
      Speed                    Good          Fair             Good
  22. Why Is Progressive Dedupe Better? (Basics of Deduplication)
      Higher compression because…
      • Application-aware (different block sizes for different files)
      • Sliding window sees all possible block boundary locations
        - Variable-block boundaries fixed at arbitrary locations
      Faster because…
      • Blocks are of fixed sizes
        - Blocks range from 1kB to 64kB
      • Known data: Deduped as fixed blocks
        - Sliding only happens on new or modified files
      • New data: Dedupe uses progressive-matching
        - Find probable matches with speedy, light-weight algorithm
        - Confirm candidates with heavy-weight hash
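The "find probable matches, then confirm" idea can be illustrated with a sliding window over new data. This sketch is not WD Arkeia's patented algorithm: it substitutes a well-known rsync-style rolling checksum as the light-weight test and SHA-256 as the heavy-weight confirmation, and all function names are hypothetical.

```python
import hashlib


def weak(block):
    """Light-weight checksum of a block (rsync-style pair of 16-bit sums)."""
    a = sum(block) & 0xFFFF
    b = sum((len(block) - i) * x for i, x in enumerate(block)) & 0xFFFF
    return (b << 16) | a


def roll(checksum, out_byte, in_byte, size):
    """Slide the window one byte without rescanning the whole block."""
    a = ((checksum & 0xFFFF) - out_byte + in_byte) & 0xFFFF
    b = ((checksum >> 16) - size * out_byte + a) & 0xFFFF
    return (b << 16) | a


def build_index(known_blocks):
    """Map each weak checksum to the strong hashes of known blocks that share it."""
    index = {}
    for block in known_blocks:
        index.setdefault(weak(block), set()).add(hashlib.sha256(block).digest())
    return index


def progressive_scan(data, index, size):
    """Return ('known', digest) and ('new', bytes) items covering data in order."""
    out, pending, pos, checksum = [], bytearray(), 0, None
    while pos + size <= len(data):
        if checksum is None:
            checksum = weak(data[pos:pos + size])
        if checksum in index:  # probable match: confirm with the heavy hash
            strong = hashlib.sha256(data[pos:pos + size]).digest()
            if strong in index[checksum]:
                if pending:
                    out.append(("new", bytes(pending)))
                    pending.clear()
                out.append(("known", strong))
                pos += size       # jump a whole block, as with fixed blocks
                checksum = None
                continue
        pending.append(data[pos])  # no match here: slide the window by one byte
        if pos + size < len(data):
            checksum = roll(checksum, data[pos], data[pos + size], size)
        pos += 1
    pending.extend(data[pos:])
    if pending:
        out.append(("new", bytes(pending)))
    return out
```

The heavy hash runs only when the cheap checksum already matches, and the window slides byte by byte only through data that is not yet known.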
  23. How Orphaned Dedupe-Blocks Are Created (Basics of Deduplication)
      [Diagram: files A, B, C, D; three x marks]
  24. Block Reclamation Options (Basics of Deduplication)
      • Scheduled Maintenance Window
      • Continuous in Background
      [Two charts of Disk over Time; labels: Maintenance Window (three times), Wasted Buffer Disk (once per chart)]
  25. Block Reclamation Options (Basics of Deduplication; content repeated from slide 24)
      Question 5 of 5: How much buffer disk is required to manage orphan block reclamation?
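Orphaned blocks are blocks that no remaining file points to, for example after old backups are deleted. A simple way to picture reclamation is a mark-and-sweep pass over the block pool. The sketch below is an assumption-laden illustration: `reclaim_orphans`, the dictionary layout, and the `limit` parameter (a stand-in for doing a little work at a time in the background) are all hypothetical.

```python
def reclaim_orphans(blocks, files, limit=None):
    """Delete blocks that no file recipe references; return how many were freed.

    blocks: dict mapping fingerprint -> stored block
    files:  dict mapping file name -> list of fingerprints
    limit:  optional cap on blocks freed per call (incremental reclamation)
    """
    live = {fp for recipe in files.values() for fp in recipe}  # mark
    orphans = [fp for fp in blocks if fp not in live]
    if limit is not None:
        orphans = orphans[:limit]
    for fp in orphans:                                         # sweep
        del blocks[fp]
    return len(orphans)
```

Until a pass runs, orphaned blocks keep occupying disk, which is the buffer space that Question 5 asks about; calling the function often with a small `limit` approximates the continuous option, while one unrestricted call approximates a maintenance window.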
  26. WD Progression (WD Business Storage Solutions Overview)
      • Storage Components: Started in 1988…#1
      • Consumer Storage Solutions: Started in 2007…#1
      • Business Storage Solutions: Started in 2011…
      Channels and markets shown: Direct, Retail, VAR, OEM, Consumer, SMB
  27. Network Backup for Business Since 1999 (WD Arkeia Architecture)
      Backup Servers as
      • Software Application (since 1999)
      • Physical Appliance (since 2007)
      • Virtual Appliance (since 2008)
      Key technology for fast backups
      • Patented Progressive Deduplication™
      • Multi-flow (parallel backup of up to 200 clients)
      • Seed and Feed™
      • Optimized restore (never overwrites)
      • Scale-out architecture: Compression / Dedupe / Encryption on Client
      Key technology for ease of use
      • Web UI
      • Disk, Tape, Cloud targets
      • Broad platform support (243 physical and all major virtual environments)
      Key technology for affordability
      • Backward compatibility of server to *all* previous versions of agents reduces cost of upgrades
      • Single solution for both physical and virtual environments
      • No agent fees for file-folder backup (or per-CPU or per-core fees)
  28. WD Arkeia™ v10 Architecture (WD Arkeia Architecture)
      [Architecture diagram; labels follow]
      • Storage devices: Disk, Tape, Cloud
      • Backup Agent (four shown); Backup Server; Backup Server…; Backup Replication Server
      • Backup Agents; Encryption; Disaster recovery
      • Virtualization Bundle: VMware, Hyper-V
      • Microsoft Bundle: SQLServer, ActiveDirectory, SharePoint, Exchange
      • Graphical Web Administration; Command-line Administration
      • Lotus, DB2, Oracle; DataCenter Bundle
      • Any target: Disk storage: file or block; Tape: 600+ drives/libraries; Cloud: public or private
      • Premium Bundle; GroupWise, eDirectory, PostgreSQL, LDAP, MySQL
      • Files, Folders, Progressive Deduplication; Basic Bundle
      • File and folder backup on 240+ platforms: Linux, Windows, Macintosh, AIX, HP-UX, Solaris, Novell, BSD
      • Progressive Deduplication: Source- or Target-side
      • Media Server; Replication
      • Software: Backup Servers on any of 130 Linux platforms
      • Appliance: Capacities of 4TB to 48TB disk and tape drives
      • Virtual appliance: VMware ESX and ESXi hypervisors
  29. Please Contact Us
      WDBusinessStorage@wdc.com
      www.arkeia.com
