This document provides an overview of Data Domain advanced features and functions for Velocity Partner Accreditation. It covers topics such as virtual tape library (VTL) planning, snapshots, replication, recovery, DD Boost integration, capacity and throughput planning, and system monitoring tools. The document contains lessons and explanations on these topics to help partners learn about and describe Data Domain's data protection solutions.
2. 2EMC CONFIDENTIAL—INTERNAL USE ONLY.
Module Objectives
Upon completion of this module, you will be able to:
• Describe VTL and VTL library planning
• Describe snapshots, fastcopy, and data retention
• Describe data replication and recovery
• Describe DD Boost and integration with EMC NetWorker
• Describe capacity and throughput planning
• Describe Data Domain system monitoring tools
Lesson: Virtual Tape Library (VTL)
Upon completion of this lesson, you will be able to:
• Describe a Data Domain VTL
• Describe VTL library planning
VTL Definition
[Figure labels: Application/media backup server; Backup cache; Retention/restore/cloning; Disaster recovery/archive/offsite storage]
Why should you use a VTL?
Configuration Terms
Barcode
• Unique ID assigned to a virtual tape when you create it
• Also known in the Data Domain OS as: label, tape label
CAP
• Cartridge access port: emulated tape enter/eject point for moving tapes to and from a library
• Also known in the Data Domain OS as: mail slot
Library
• Emulates a physical tape library with tape drives, changer, CAPs, and slots (cartridge slots)
• Also known in the Data Domain OS as: autoloader, tape silo, tape mount, tape jukebox, vault
Pool
• Collection of tapes that maps to a directory on a file system; used to replicate tapes to a destination
Tapes
• Represented in a system as files. You can export/import tapes between a vault and a library, and move them within a library across drives, slots, and CAPs
• Also known in the Data Domain OS as: cartridge
Vault
• Holds unused tapes; a tape is always in either a library or the vault
VTL Library Planning
Fibre Channel
• 256 virtual tape drives (DD880 only)
• 128 virtual LTO-1, LTO-2, LTO-3 tape drives (all other models)
• Robot loads/changes tape cartridges
VTL
• Up to 100,000 tape cartridges in the virtual vault
• 64 virtual libraries
• 20,000 slots per library
• 100 CAPs per library
• 1,000 CAPs per system
• 800 GiB per tape
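As a planning aid, the limits on this slide can be checked before a layout is configured. The sketch below is illustrative only: `LibraryPlan` and `check_vtl_plan` are hypothetical names, not Data Domain commands or APIs, and it assumes the drive limit applies to the whole system.

```python
from dataclasses import dataclass

# Limits as quoted on the VTL Library Planning slide.
MAX_LIBRARIES = 64
MAX_SLOTS_PER_LIBRARY = 20_000
MAX_CAPS_PER_LIBRARY = 100
MAX_CAPS_PER_SYSTEM = 1_000
MAX_TAPE_GIB = 800
MAX_DRIVES_DD880 = 256
MAX_DRIVES_OTHER = 128  # assumption: treated here as a system-wide total


@dataclass
class LibraryPlan:  # hypothetical planning record, not a Data Domain object
    name: str
    slots: int
    caps: int


def check_vtl_plan(libraries, drives, tape_gib, is_dd880=False):
    """Return a list of limit violations for a planned layout (empty if none)."""
    problems = []
    max_drives = MAX_DRIVES_DD880 if is_dd880 else MAX_DRIVES_OTHER
    if len(libraries) > MAX_LIBRARIES:
        problems.append(f"{len(libraries)} libraries exceeds {MAX_LIBRARIES}")
    if drives > max_drives:
        problems.append(f"{drives} drives exceeds {max_drives}")
    if tape_gib > MAX_TAPE_GIB:
        problems.append(f"{tape_gib} GiB per tape exceeds {MAX_TAPE_GIB}")
    for lib in libraries:
        if lib.slots > MAX_SLOTS_PER_LIBRARY:
            problems.append(f"{lib.name}: {lib.slots} slots exceeds {MAX_SLOTS_PER_LIBRARY}")
        if lib.caps > MAX_CAPS_PER_LIBRARY:
            problems.append(f"{lib.name}: {lib.caps} CAPs exceeds {MAX_CAPS_PER_LIBRARY}")
    total_caps = sum(lib.caps for lib in libraries)
    if total_caps > MAX_CAPS_PER_SYSTEM:
        problems.append(f"{total_caps} CAPs in total exceeds {MAX_CAPS_PER_SYSTEM}")
    return problems
```

An empty list means the plan fits within the quoted limits; each string names one limit that the plan exceeds.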
Capacity Planning
• More planning is needed at installation
• Expired tapes are NOT deleted; space is not reclaimed until the tape is overwritten or deleted
• Always create more slots than you need
• Load tapes when you need them
• Stop loading tapes once retention requirements are met
Lesson: Fastcopy, Snapshots, and Data Retention
Upon completion of this lesson, you will be able to:
• Describe Data Domain fastcopy
• Describe Data Domain snapshots
• Describe Data Domain data retention
• Explain the Data Domain cleaning process
Fastcopy
[Figure: fastcopy copies the source directory to the target directory]
If you change the source or target directory while copying, the two directories will not be equal.
Snapshots
Original copy                 Snapshot copy
/data/col1/backup             /data/col1/backup/.snapshot
/data/col1/backup/files       /data/col1/backup/files/.snapshot
Snapshot taken at 22:24 GMT   22:24 GMT snapshot saved
Retention Lock
• Initiated by archive software or by a user
• Prevents retention-locked files from being deleted or modified for up to 70 years
• Licensed feature
• Retention-locked files can be stored, encrypted, and replicated
Retention Lock Flow
1. License/enable retention lock
2. Set min/max retention period
3. Create file
4. Lock file (set retention period)
- Extend retention-locked file
- Delete expired retention-locked file
5. Transfer file to Data Domain system
Configure Client File Retention Period
[Figure: timeline starting at the current time, marking the minimum retention period, the maximum retention period, and the valid atime period]
Client must initiate retention lock
1. User creates file and sets last access time (atime) to desired retention period
2. Data Domain system administrator sets min/max retention periods on Data Domain system
3. File either committed as a retention-locked file or ignored
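The client-side step can be sketched in a few lines. This is a minimal illustration, assuming the file lives on a share exported by the Data Domain system and that the valid atime window runs from the current time plus the minimum period to the current time plus the maximum period. The function names and the example periods are hypothetical.

```python
import os
import time


def retention_request_is_valid(atime, now, min_period, max_period):
    """True if the requested atime falls inside [now + min, now + max] (seconds)."""
    return now + min_period <= atime <= now + max_period


def request_retention(path, retain_seconds):
    """Set the file's atime to now + retain_seconds, leaving mtime unchanged."""
    st = os.stat(path)
    target_atime = time.time() + retain_seconds
    os.utime(path, (target_atime, st.st_mtime))
    return target_atime
```

`os.utime` only sets the timestamps; whether the file is then committed as retention-locked or ignored is decided by the Data Domain system against the administrator's min/max settings.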
File System Cleaning
• File A, deleted with no retention lock: removed at the next cleaning
• File B, deleted with retention lock initiated: maintained until the retention lock period ends
• Software backups are written to the Data Domain system
• Cleaning reclaims physical storage occupied by deleted objects
Cleaning
What?
• Reclaim space in disk blocks
Why?
• Housekeeping (reclaim "dead" segments)
• Performance tuning (rewrite duplicate data)
[Figure labels: Container 1, Container 2, Container 3; dead; copy forward valid; free space]
Lesson: Replication and Recovery
Upon completion of this lesson, you will be able to:
• Describe the types of Data Domain replication
• Identify how replication improves storage
• Describe the data recovery process
Data Replication
New deduplicated, compressed data is automatically replicated to the destination
[Figure: source replicates to destination over a LAN or WAN]
Data Domain Replication Types
• Collection: for entire site backup
• Directory: for partial site backup
• Pool: for VTL files/tape backup
Data Domain Collection Replication
[Figure: /backup on the source replicates to /backup on the destination]
• Recovers entire system
• Immediate accessibility
• Read only
• User accounts/passwords replicated from source
• Works with encrypted files
• Works with retention lock
Data Domain Directory Replication
[Figure: /backup/dir a and /backup/dir b on the source replicate to /backup/dir a and /backup/dir b on the destinations]
• Recovers selected data
• Destination must have available storage
• CIFS and NFS clients ok
• Do not mix CIFS/NFS data in same directory
• Destination directory created automatically
• Works with encryption
• Works with retention lock
Data Domain Pool Replication
[Figure: pool 1, pool 2, and pool 3 on the source replicate to pool 1, pool 2, and pool 3 on the destination]
• Works like directory replication
• Destination doesn't require VTL license
Replication Topologies
[Figure: replication topologies between source and destination systems]
• 1 to 1
• Bi-directional
• 1 to many
• Many to 1
• Cascaded (figure labels: primary; source/destination)
• Cascaded 1-to-many (figure labels: primary; source/destination)
Recover Data
[Figure: clients, file server, and backup server on site; replication over the WAN to off-site disaster recovery]
• In case of disaster, recover the off-site replica
• You can configure a Data Domain system to store backup data and retain it onsite for 30-90 days
Why Resynchronize Recovered Data?
[Figure: resynchronization between source and destination over the WAN]
• Recreate deleted context
• Out of space
• Convert collection to directory replication
Lesson: Data Domain Boost
Upon completion of this lesson, you will be able to:
• Describe DD Boost
• Describe replica awareness
• Describe how DD Boost works with EMC NetWorker
• Describe supported network topologies
• Describe DD Boost advanced load balancing and link failover feature
DD Boost
• Provides standard/centralized management features through backup software
• Works with industry standard backup software
– EMC NetWorker
– Symantec NetBackup (Data Domain plug-in required)
– Symantec Backup Exec (Data Domain plug-in required)
• Enables advanced load balancing and failover
• Requires licenses on Data Domain system
– DD Boost
– Replication (if used)
Note: Your backup software might require a license to enable the feature. Check your backup software documentation.
DD Boost (contd.)
[Figure: clients, backup server with OST plug-in (DD Boost), and Data Domain system with DD Boost, connected over LANs]
• Clients send data to backup server
• Deduplication/compression occur in backup server
• Less data sent over LAN
• Optimized protocol for high throughput
• Manages connections between backup applications and Data Domain systems with DD Boost
• Deduped data stored
Replica Awareness
[Figure: backup server with OST plug-in at the backup site; DD Boost systems at the backup site and the disaster recovery site, with replication over the WAN]
• Initiates and tracks replication for easy management and disaster recovery
• You manage replication from backup server console
• Archive to tape as needed
DD Boost Advantage
• Without DD Boost
– Backup server(s) not aware of Data Domain replica(s)
– Recovery is manual process
• With DD Boost
– Backup server dedupes data and minimizes network bandwidth use
– Replication and recovery are centrally configured and monitored
[Figure: without OST, backup plus manually configured replication; with DD Boost, backup server with OST plug-in, optimized deduplication, and a replication engine between DD Boost servers]
NetWorker – Work Flow
[Figure: NetWorker server (control data), local Data Domain system (Save Set 1), and remote Data Domain system (Clone 1)]
1. New data backup (Save Set 1)
2. Done (Save Set 1)
3. Save Set 1: update control data
4. Start clone (Clone 1)
5. Data transfer
6. Done (Clone 1)
7. Clone 1: update control data
Lesson: Capacity and Throughput Planning
Upon completion of this lesson, you will be able to:
• Describe capacity planning and its importance
• Describe throughput planning and its importance
Monitor File System Space Use
• Factors that affect how fast data on disk grows
– Size of data sets being backed up
– Compressibility of data being backed up
– Retention period specified in backup software
• Monitor disk use closely when you back up large data sets that show low compression factors and have long retention times
• You can get a more accurate space-use view from the CLI
• Use filesys show space to monitor post-compression data growth
Space Graph
[Figure: space graph with the following series]
• Amount of data within backup application (pre-compression)
• Cumulative physical data written to DDS (data collection)
• Compression ratio: pre-compression / data collection
• Available space on DDS
Compression Factor Calculation
Compression factor = Original bytes / Data Domain system data written
What does cleaning do to this equation?
It decreases the Data Domain system data written (the denominator) and thus increases the compression factor.
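A short worked example makes the effect of cleaning concrete. The byte counts below are hypothetical and chosen only to keep the arithmetic simple; `compression_factor` is an illustrative helper, not a Data Domain command.

```python
def compression_factor(original_bytes, stored_bytes):
    """Original bytes divided by the data written on the Data Domain system."""
    if stored_bytes <= 0:
        raise ValueError("stored_bytes must be positive")
    return original_bytes / stored_bytes


# Hypothetical figures: cleaning shrinks the stored data from 1,000 to 800 units
# while the original (pre-compression) data stays at 10,000 units.
before_cleaning = compression_factor(10_000, 1_000)  # 10.0
after_cleaning = compression_factor(10_000, 800)     # 12.5
```

The numerator is unchanged, so any reduction in stored data raises the factor.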
Capacity Planning: Determine Capacity Needs
How much?
• Data size (TB)
• Data type
• Full backup size
• Compression rate (deduplication)
How long?
• Retention policy (duration)
• Schedule
[Figure: "How much?" and "How long?" together determine capacity needs]
Determine Capacity Needs (contd.)
• Data Domain system internal indexes and other
components use variable storage amounts
depending on data type and file sizes
• If different data sets are sent to identical systems,
one system may, over time, have room for
more/less backup data than another
• Challenging data types
– Pre-compressed (multimedia, .zip, and .tiff)
– Encrypted
Compression Requirements with Variables
• 5x – Nearline and archive
– Incremental + weekly full backup with two weeks retention
– Daily full backup with one week retention
– Nearline and archival use compression tends to be capped here
• 10x – Overall compression
– Incremental + weekly full backup with one month of retention
– Daily full backup with two-three weeks retention
• 20x – Overall compression
– Incremental + weekly full backup with two-three months retention
– Daily full backup with three-four weeks retention
Calculate Required Capacity
[Figure: required capacity is built up from the 1st full backup, the incremental backups (4), the weekly full backup, and the number of weeks; the result is the total space required]
Calculate Required Throughput
Required throughput = Largest backup / Backup time window
Example: 6 TB / 10 hrs = 600 GB/hr
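The same calculation as a small helper, assuming decimal units (1 TB = 1,000 GB), which is what the slide's example implies. The function name is illustrative.

```python
def required_throughput_gb_per_hr(largest_backup_tb, window_hours, gb_per_tb=1000):
    """Largest backup divided by the backup time window, in GB/hr."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return largest_backup_tb * gb_per_tb / window_hours


example = required_throughput_gb_per_hr(6, 10)  # 600.0 GB/hr, as on the slide
```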
System Model Capacity and Performance
• Maximum capacity is amount of usable data storage
space
• Maximum capacity based on max number of drives
supported by a model
• Maximum throughput is achieved using either VTL
interface and 4Gbps Fibre Channel or DD Boost and
10Gb Ethernet
• Current model throughput and capacity specifications
http://www.datadomain.com/products/
Select Model
• Be conservative when determining which model to use
• Use 75-85% of model capacity and throughput (factor in a 15-25% buffer for capacity and throughput)
Capacity % = (Required capacity / Maximum logical capacity) × 100
Throughput % = (Required throughput / Maximum throughput) × 100
Calculate Capacity Buffer for Selected Models
% of maximum capacity = (Required capacity / Maximum capacity) × 100%
DD140 example: 840 GB / 860 GB × 100% = 97%; 3% buffer, not ok
DD610 example: 840 GB / 1650 GB × 100% = 51%; 49% buffer, ok
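The buffer check can be expressed as a small helper. This is a sketch under one assumption: the 15% lower bound of the recommended 15-25% buffer is used as the pass/fail threshold. The function names are illustrative.

```python
def utilization_pct(required, maximum):
    """Required amount as a percentage of the model's maximum."""
    return required / maximum * 100


def buffer_pct(required, maximum):
    """Headroom left on the model, as a percentage of its maximum."""
    return 100 - utilization_pct(required, maximum)


def buffer_ok(required, maximum, min_buffer_pct=15.0):
    """True if the headroom meets the minimum buffer (15% assumed here)."""
    return buffer_pct(required, maximum) >= min_buffer_pct


# Capacity figures from the slide, in GB.
dd140_ok = buffer_ok(840, 860)    # False: only a few percent of headroom
dd610_ok = buffer_ok(840, 1650)   # True: roughly half the capacity is free
```

The same helpers apply unchanged to throughput, since both checks are a ratio of required to maximum.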
Match Required Capacity to Model Specifications
For example, required capacity = 840 GB
• DD140: 860 GB
• DD610: 1,650 GB with 7 drives
Ensure capacity buffer is big enough
Calculate Performance Buffer for Selected Models
% of maximum throughput = (Required throughput / Maximum throughput) × 100%
DD610 example: 600 GB/hr / 675 GB/hr × 100% = 89%; 11% buffer, not ok
DD630 example: 600 GB/hr / 1126 GB/hr × 100% = 53%; 47% buffer, ok
Match Required Throughput to Model Specifications
For example, required throughput = 600 GB/hr
• DD610: 675 GB/hr
• DD630: 1.1 TB/hr (1126 GB/hr)
Ensure performance buffer is big enough
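Model selection by throughput can be sketched as picking the smallest model that still leaves enough headroom. The maximum throughput figures are the ones used in the performance buffer example (675 GB/hr and 1126 GB/hr); the 15% minimum buffer and the "smallest model that fits" rule are assumptions made for illustration.

```python
# Maximum throughput in GB/hr, taken from the performance buffer example.
MODEL_MAX_THROUGHPUT = {"DD610": 675, "DD630": 1126}


def pick_model(required, models, min_buffer_pct=15.0):
    """Return the lowest-throughput model that keeps the minimum buffer, or None."""
    candidates = [
        (maximum, name)
        for name, maximum in models.items()
        if (1 - required / maximum) * 100 >= min_buffer_pct
    ]
    return min(candidates)[1] if candidates else None


choice = pick_model(600, MODEL_MAX_THROUGHPUT)  # "DD630"
```

Returning `None` when nothing qualifies forces the planner to look at a larger model instead of silently accepting a thin buffer.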
Lesson: System Monitoring Tools
Upon completion of this lesson, you will be able to:
• Describe Data Domain system monitoring tools
– SNMP
– syslog
– autosupport
– SUB (support upload bundle)
Monitoring a Data Domain System
1. SNMP
2. syslog
3. autosupport
4. SUB
[Figure: the Data Domain system sends alerts and daily alert and autosupport reports to the Data Domain system administrator and to Data Domain technical support]
SNMP
• You can monitor a Data Domain system via SNMP utilities
• You can integrate the Data Domain Management
Information Base (MIB) into SNMP monitoring
Syslog (Remote Logging)
[Figure: system messages sent over the LAN to a syslog server on port 514]
• Sends system messages to remote syslog server
• Uses TCP port 514
• You collect logs
Autosupport
• Easy to install – just once at system setup
• Helps solve/prevent system problems
– Provides timely notification of significant issues
– Enables rapid response time to address or prevent problems
– Includes critical system data to aid support case triage and
management
Autosupport System
[Figure: via SMTP, the system sends the detailed autosupport report, summary autosupport report, daily alert summary, warnings, and reboots to autosupport@autosupport.datadomain.com; figure labels on the receiving side: system history, reports, integration to other systems, Data Domain technical support (support case), other vendors]
Autosupport Types
Scheduled
• Detailed autosupport report, sent 6 am
• Daily alert summary email, sent 8 am
Non scheduled
• Alerts (warning, failure)
• Reboots
Autosupport Via Enterprise Manager
• Data Domain systems provide alerts,
autosupport reports, and logs
• Access through Enterprise Manager
Autosupport Reports
• Using SMTP, sent to Data Domain
technical support daily at 6 am
local time (default)
• Contains system ID, uptime
information, system command
outputs, runtime parameters, logs,
system settings, status and performance
data, and other debugging information
• Long text report (500-800K)
• Sections parsed into data warehouse
for analysis and reporting
• Subscribers receive daily detailed reports
Daily Summary Autosupport
• Provides summary autosupport report
• Tells you whether the system is ok. If it is not, no email is received
• Sent daily at 8 am
• Uses autosupport email distribution list
[Figure: example of an alert]
Alerts
• Unique numerical ID
• Physical component where alert occurred
• Subsystem where alert occurred
• Alert severity
• Date and time alert occurred
Alerts Notification
Terse description of event
Sent immediately upon detection
Creates support case
Has separate email distribution list
Logs
Every Sunday at 3 am
1. New log file opened
2. Old log file renamed
CLI: log view filename
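When scripting log collection around this schedule, it can help to know when the next rollover falls. The sketch below assumes the Sunday 3 am rollover is in the system's local time and works on naive datetimes. `next_log_rotation` is an illustrative helper, not a Data Domain command.

```python
from datetime import datetime, timedelta


def next_log_rotation(now):
    """Return the next Sunday 03:00 strictly after `now` (naive local time)."""
    days_ahead = (6 - now.weekday()) % 7  # Monday is 0, Sunday is 6
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=3, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # Already at or past this week's rollover, so move to next Sunday.
        candidate += timedelta(days=7)
    return candidate


# 3 January 2024 was a Wednesday, so the next rollover is Sunday 7 January.
example = next_log_rotation(datetime(2024, 1, 3, 12, 0))
```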
Support Upload Bundle (SUB)
• Large (multi-GB sized) tar file
• Contains
– OS settings and log files
– System files (not customer data files) identified as needed
for system diagnosis by Data Domain support and
engineering
• Used to triage and diagnose a Data Domain system in the field
• CLI commands used to generate and optionally send (via http)
SUB to Data Domain support site
• Generated by sysadmin on Data Domain system via GUI/CLI
Module Summary
Key points covered in this module include:
• VTL and VTL library planning
• Snapshots, fastcopy, and data retention
• Data replication and recovery
• DD Boost and integration with EMC NetWorker
• Capacity and throughput planning
• Data Domain system monitoring tools