Champion Fas Deduplication
FAS Deduplication Overview and Best Practices


Published in: Technology, Business

Transcript

  • 1. Champion & FAS Deduplication: Overview & Best Practices
    • For more information, please contact Michael Hudak, NetApp Sales Specialist, [email_address], 800-771-7000 x344
  • 2. FAS Deduplication
    • GD Release April 2007
    • R200
    • FAS2000
    • FAS3000
    • FAS6000
    • V-Series (2008)
    • Multi-tier Deduplication
    • Primary data
    • Backup data
    • Archival data
    • NetApp Deduplication System Adoption
      • 2007 projection = 1,700 systems
      • Deduplication-enabled system storage = 50PB
  • 3. Space Reduction Technologies
    • 1993: Snapshot Technology
    • 2001: SATA NearStore
    • 2002: SnapVault/OSSV
    • 2004: RAID-DP
    • 2005: Thin Provisioning
    • 2005: Virtual Cloning
    • 2006: VTL Compression
    • 2006: SnapVault for NBU
    • 2007: Deduplication
    [Chart: “Additive” Space Reduction Features, Cost/GB plotted against Time]
  • 4. Deduplication Basics: Fingerprint Catalog
    • A deduplication catalog consists of a series of “hash values” aka digital fingerprints
    • Once catalogued, hashes can be compared and deduplication candidates identified
    [Diagram: Data Object, Hashing Algorithm, Digital Fingerprint, Fingerprint Catalog]
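
The fingerprint catalog idea on this slide can be sketched in a few lines of Python. This is only an illustration: the slides do not name the hashing algorithm, so SHA-256 is an assumption, and the function names `fingerprint`, `build_catalog`, and `dedup_candidates` are hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(data: bytes) -> str:
    """Return a digital fingerprint (hash value) for one data object."""
    # Assumption: SHA-256 stands in for whatever algorithm the product uses.
    return hashlib.sha256(data).hexdigest()


def build_catalog(objects):
    """Map each fingerprint to the indexes of the objects that produced it."""
    catalog = defaultdict(list)
    for index, data in enumerate(objects):
        catalog[fingerprint(data)].append(index)
    return dict(catalog)


def dedup_candidates(catalog):
    """Return the groups of object indexes that share a fingerprint."""
    return [indexes for indexes in catalog.values() if len(indexes) > 1]


objects = [b"alpha", b"beta", b"alpha", b"gamma", b"beta", b"alpha"]
catalog = build_catalog(objects)
candidates = dedup_candidates(catalog)
print(candidates)
```

Objects 0, 2, and 5 share one fingerprint and objects 1 and 4 share another, so both groups are reported as deduplication candidates; object 3 is unique and is not.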
  • 5. Deduplication Basics: Reference Pointers
    • Data objects are written to storage systems using “Reference Pointers”
    • Deduplication introduces two important concepts:
      • Catalog of data objects
      • The ability to reference one object multiple times
    [Diagram: Non-Deduplicated Reference Pointers (five blocks of Allocated Storage) compared with Deduplicated Reference Pointers through the Deduplication Catalog (one block of Allocated Storage, four blocks of Free Storage)]
  • 6. NetApp Enabling Technology: WAFL Block Sharing
    • FAS deduplication utilizes block sharing within the WAFL file system
    • A single block can be referenced up to 255 times
    • This technology has been in place for 15 years (Snapshots)
    [Diagram: INODE 1 and INODE 2 pointing through indirect (IND) blocks to DATA blocks]
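
A rough model of block sharing with a per-block reference limit is sketched below. It is not WAFL code. The 255 limit comes from the slide; the `BlockStore` class is hypothetical, it matches blocks by their full content instead of by fingerprint to stay short, and allocating a fresh copy once the limit is reached is an assumption.

```python
MAX_REFS = 255  # from the slide: one block can be referenced up to 255 times


class BlockStore:
    """Toy block store in which identical blocks share one physical copy."""

    def __init__(self):
        self.blocks = []      # physical blocks actually stored
        self.refcounts = []   # number of references to each physical block
        self.by_content = {}  # content -> index of the newest copy

    def write(self, data: bytes) -> int:
        """Store one logical block and return the physical block index used."""
        index = self.by_content.get(data)
        if index is not None and self.refcounts[index] < MAX_REFS:
            # Share the existing block by adding one more reference.
            self.refcounts[index] += 1
            return index
        # Assumption: a full (or unseen) block leads to a new physical copy.
        self.blocks.append(data)
        self.refcounts.append(1)
        index = len(self.blocks) - 1
        self.by_content[data] = index
        return index


store = BlockStore()
for _ in range(300):
    store.write(b"same 4KB payload")
print(len(store.blocks), store.refcounts)
```

Writing the same block 300 times leaves two physical copies in this model: the first holds 255 references and the second holds the remaining 45.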
  • 7. FAS Deduplication: Commands
    • License it: license add <a_sis>
    • Turn it on: sis on <vol>
    • [Deduplicate existing data]: sis start -s <vol>
    • Schedule when to deduplicate or run manually: sis config [-s schedule] <vol> or sis start <vol>
    • Check out what’s happening: sis status [-l] <vol>
    • See the savings: df -s <vol>
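
The order of the steps above can be captured in a small helper that only builds the command strings; it does not connect to a storage system or run anything. The function name `dedup_command_sequence` and its parameters are hypothetical, and the `<a_sis>` placeholder is kept exactly as the slide gives it.

```python
def dedup_command_sequence(volume, license_code="<a_sis>",
                           schedule=None, scan_existing=True):
    """Return the slide's commands, in order, for one volume."""
    commands = [
        f"license add {license_code}",  # license it
        f"sis on {volume}",             # turn it on
    ]
    if scan_existing:
        # Deduplicate data that was already on the volume.
        commands.append(f"sis start -s {volume}")
    if schedule is not None:
        # Scheduled runs.
        commands.append(f"sis config -s {schedule} {volume}")
    elif not scan_existing:
        # No schedule and no initial scan: start a manual run.
        commands.append(f"sis start {volume}")
    commands.append(f"sis status -l {volume}")  # check what is happening
    commands.append(f"df -s {volume}")          # see the savings
    return commands


for command in dedup_command_sequence("/vol/vol5", schedule="auto"):
    print(command)
```

The `auto` schedule value used here is the one shown on the archival data slide; any other schedule string would need to follow the product documentation.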
  • 8. FAS Deduplication: “sis status” Progress Messages and Stages
    Filer> sis status
    Path State Status Progress
    • Gathering: /vol/vol5 Enabled Active 25 MB Scanned
    • Sorting: /vol/vol5 Enabled Active 25 MB Searched
    • Deduplicating: /vol/vol5 Enabled Active 40MB (20%) done
    • Checking: /vol/vol5 Enabled Active 30MB Verified OR /vol/vol5 Enabled Active 10% Merged
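
For anyone scripting around these messages, a status line can be split into its columns and matched to a stage by the last word of the progress text. This is a sketch under assumptions: the pairing of keywords with stages is inferred from the slide, the helper name `parse_sis_status_line` is hypothetical, and only the line shapes shown on the slide are handled.

```python
# Assumption: keyword-to-stage pairing as inferred from the slide.
STAGE_BY_KEYWORD = {
    "scanned": "Gathering",
    "searched": "Sorting",
    "done": "Deduplicating",
    "verified": "Checking",
    "merged": "Checking",
}


def parse_sis_status_line(line):
    """Split one data row of 'sis status' output into named fields."""
    # The first three columns are single words; the rest is the progress text.
    path, state, status, progress = line.split(None, 3)
    keyword = progress.split()[-1].lower()
    return {
        "path": path,
        "state": state,
        "status": status,
        "progress": progress,
        "stage": STAGE_BY_KEYWORD.get(keyword, "Unknown"),
    }


sample_lines = [
    "/vol/vol5 Enabled Active 25 MB Scanned",
    "/vol/vol5 Enabled Active 25 MB Searched",
    "/vol/vol5 Enabled Active 40MB (20%) done",
    "/vol/vol5 Enabled Active 10% Merged",
]
stages = [parse_sis_status_line(line)["stage"] for line in sample_lines]
print(stages)
```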
  • 9. Deduplication Space Savings
    • Space Savings Will Vary Based On Data Types
    • Use NetApp Space Savings Estimation Tool (SSET) For Validation
  • 10. SSET 2.0 Overview
    • Scans volumes and discovers duplicate data
      • Simulates the effect of FAS deduplication
    • Does not require Data ONTAP® or A-SIS license
    • Three standalone executables:
      • Linux®
      • Solaris™
      • Windows®
    • Available from NetApp and Partner SEs
  • 11. Using SSET 2.0
    • Using the tool, command example:
      • Find_space -f <fingerprint file> -p <path>
    • The tool “crawls” through the specified path and creates a fingerprint for each block of data
    • Fingerprints are compared and matches are reported
    • 2TB maximum: if the tool determines that the path is larger than 2TB, it exits with an error message
    • Large volumes will take a long time to analyze
    • The tool should not be left installed at the customer site once the evaluation is completed
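
The crawl-and-fingerprint approach described above can be approximated as follows. This is not SSET and does not reproduce its output. The 4KB block size, SHA-256, reading 2TB as 2 × 1024⁴ bytes, checking the limit while reading, and the function name `estimate_savings` are all assumptions made for the sketch.

```python
import hashlib
import os
import tempfile

BLOCK_SIZE = 4096          # assumption: 4KB blocks
MAX_BYTES = 2 * 1024 ** 4  # assumption: the 2TB limit read as 2 TiB


def estimate_savings(path):
    """Walk a directory tree, fingerprint each block, and count duplicates."""
    fingerprints = set()
    total_blocks = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            with open(os.path.join(root, name), "rb") as handle:
                while True:
                    block = handle.read(BLOCK_SIZE)
                    if not block:
                        break
                    total_bytes += len(block)
                    if total_bytes > MAX_BYTES:
                        raise ValueError("path holds more than 2TB of data")
                    total_blocks += 1
                    fingerprints.add(hashlib.sha256(block).digest())
    duplicates = total_blocks - len(fingerprints)
    percent = 100.0 * duplicates / total_blocks if total_blocks else 0.0
    return {
        "total_blocks": total_blocks,
        "unique_blocks": len(fingerprints),
        "duplicate_blocks": duplicates,
        "percent_saved": percent,
    }


# Small self-contained demonstration on two temporary files.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "a.bin"), "wb") as f:
        f.write(b"A" * 4096 + b"B" * 4096)
    with open(os.path.join(demo_dir, "b.bin"), "wb") as f:
        f.write(b"A" * 4096 + b"A" * 4096)
    result = estimate_savings(demo_dir)
print(result)
```

The demonstration writes four blocks, of which only two are distinct, so half of the blocks are reported as duplicates. Real savings depend on the data, as the earlier slide on space savings notes.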
  • 12. SSET 2.0 Example
  • 13. Deduplication Best Practices: Qtree SnapMirror® (QSM) Replication
    • QSM replication
    • Improves storage efficiency at secondary location
    • No impact on primary storage workload
    • V-Series data can be mirrored to DR site and also deduplicated
    [Diagram: FAS at Site A (e.g., Data Center), QSM, FAS at Site B (e.g., DR Site) with deduplication; V-Series]
  • 14. Deduplication Best Practices: Volume SnapMirror® (VSM) Replication
    • VSM replication
    • Deduplicated data at primary and secondary locations
      • Secondary site inherits deduplicated data
    • Network efficiency: reduced amount of data traveling across the network
    [Diagram: FAS at Site A (e.g., Data Center) with deduplication, VSM, FAS at Site B (e.g., DR Site) with “inherited” deduplication; V-Series (Q1 2008)]
  • 15. Deduplication Best Practices: Copying to Tape
    • NDMP to tape can be accomplished at any time; there is no need to wait for deduplication to complete
    [Diagram: DB Server, ERP/ECM Server, E-mail Server, and Third-Party Backup Application Server; SAN/NAS; Primary Storage with Deduplication; NDMP to Tape]
  • 16. Deduplication Best Practices: Scheduling with Backup Data
    • Deduplication scripted after each backup: sis start <vol>
    • Volume SnapMirror® (VSM): deduped image is mirrored
    • Saves network bandwidth and storage space on both NearStore units
    [Diagram: DB Server, ERP/ECM Server, E-mail Server, and Third-Party Backup Application Server; SAN/NAS; Primary Storage; Deduplication; VSM; DR Site]
  • 17. Deduplication Best Practices: Scheduling with Archival Data
    • Deduplication on an automated schedule based on 20% change rate: sis config -s auto <vol>
    • Volume SnapMirror® (VSM): deduped image is mirrored
    • Saves network bandwidth and storage space on both NearStore units
    [Diagram: ERP/ECM Server, E-mail Server, and Third-Party Archival Application Server; SAN/NAS; Primary Storage; Deduplication; VSM; DR Site]
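
The idea of triggering on a change rate can be illustrated with a simple threshold check. The slide gives only the 20% figure; how Data ONTAP actually measures change is not described here, so the block counts and the function name `should_start_dedup` are hypothetical.

```python
def should_start_dedup(changed_blocks, total_blocks, threshold=0.20):
    """Return True once the changed fraction reaches the threshold."""
    if total_blocks <= 0:
        # Nothing stored yet, so there is nothing to deduplicate.
        return False
    return changed_blocks / total_blocks >= threshold


print(should_start_dedup(150, 1000))  # 15% changed
print(should_start_dedup(250, 1000))  # 25% changed
```

With the default threshold, the first call stays below 20% and the second crosses it.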
  • 18. Deduplication Best Practices: Scheduling Light-Duty Primary Data
    • Deduplication scheduled during off-peak time: sis config -s schedule <vol>
    • “Lite use” primary storage: VMware®, CIFS shares, home dirs, etc.
    • Volume SnapMirror® (VSM): deduped image is mirrored
    • Saves network bandwidth and storage space on both NearStore units
    [Diagram: Servers and Clients; SAN/NAS; Mission-Critical Primary Storage and “Lite Use” Primary Storage; Deduplication; VSM; DR Site]
  • 19. Deduplication in a VMware® Infrastructure
    • A VMware infrastructure consists of virtual machine (VM) templates and clone copies
    • Templates, or Golden Masters, are created for each application environment and consist of a VM configuration file (.vmx) and one or more virtual disk files (.vmdk)
  • 20. Cloning a VMware® Virtual Machine
    • VM templates and clones can grow very large. For example, one NetApp user with 1,800 VMs requires 64TB of disk capacity to manage these copies
    [Diagram: Virtual Machines, ESX Server]
  • 21. An Opportunity for Deduplication
    • The creation of VM clone images presents an opportunity for space reduction via deduplication
    • Deduplication removes redundant blocks within a NetApp system volume and does so in a transparent manner so that all clone copies appear intact to the ESX server
    [Diagram: Virtual Machines, ESX Server, Deduplication]
  • 22. Deduplication with VMware® VMs
    • Space savings:
      • Up to 90%
    • Deduplication runs as a background task, scheduled during off-peak times
    • Deduplication imposes only nominal impact on read/write performance
    [Diagram: Primary Data Center with VMware ESX Servers, SAN/NAS, and a NetApp FAS System holding “Golden” VMware Masters + Virtual Machine Clones (Deduplication, up to 90% space savings); SnapMirror® Replication to a Remote Data Center with SAN/NAS and a NetApp FAS System (up to 90% space savings)]
  • 23. Deduplication Miscellaneous Best Practices
    • SnapVault®/OSSV
      • Deduplicate only the baseline image today
      • Extended use will be supported in Data ONTAP 7.3
    • Snapshot™ Copies
      • Deduplicate before taking Snapshot copies
      • Delete stale Snapshot copies
      • Refer to Deployment Guide for detailed info
      • Efficiency will improve in Data ONTAP 7.3
  • 24. Volume Limits
    • FAS Deduplication Volume Limits
  • 25. Resources
    • Deduplication FAQs ->
    • TR-3505: Deduplication Deployment and Implementation Guide
    • Online Backup and Recovery Guide
    • Space Savings Estimation Tool
    • All Resources:
      • PartnerCenter>Products>NearStore on FAS
