IBM Systems and Technology
Thought Leadership White Paper
July 2011
Combining IBM Real-time
Compression and
IBM ProtecTIER Deduplication
Benchmark tests show that combining storage optimization technologies
achieves compelling results
Contents
Introduction
Landmarks in the data optimization landscape
The need for data optimization in database backup and recovery
The test environment: An overview
Test 1: ProtecTIER deduplication only
Test 2: IBM Real-time Compression and ProtecTIER deduplication
Summary
For more information
Introduction
As the capacity and overhead of powering, cooling and managing
larger amounts of storage continue to outpace the growth
of storage budgets, IT decision makers are increasingly looking
to optimization technologies to meet capacity demands while
minimizing capital expenditures. Recently, two storage optimiza-
tion approaches in particular have been receiving significant
attention in the industry: real-time compression for primary and
secondary data, and data deduplication for highly redundant
backup data sets. Although sometimes viewed as mutually
exclusive, the two technologies are, in fact, very complementary.
This paper discusses the compelling financial and operational
advantages of deploying real-time compression and data dedupli-
cation in conjunction, as demonstrated by the results of tests in
which IBM Real-time Compression and IBM® ProtecTIER®
Deduplication solutions were combined to optimize Oracle
database physical backups in a Network File System (NFS)
environment.
The compelling results of combining IBM solutions for real-
time compression and data deduplication in the Oracle database
environment include:
● Greater than 82 percent immediate savings on initial write
to disk.
● Greater than 96 percent overall data reduction when com-
bined with deduplication.
● Up to 71 percent reduction in backup time.
● Less CPU utilization on the deduplication engine.
● Less disk activity in the deduplication subsystem.
● Less network traffic on the deduplication backup network.
Figure 1: The data optimization landscape
Detailed test results are included later in this document. The
bottom line is that combining real-time compression and data
deduplication optimizes your overall storage footprint by reduc-
ing data on your primary NAS devices as well as throughout
your data life cycle. By combining these two technologies, you
can achieve maximum data reduction, which maximizes your
return on investment and dramatically improves your data pro-
tection performance and capabilities.
Landmarks in the data optimization landscape
To fully appreciate the benefits of combining real-time compres-
sion and data deduplication, it’s important first to understand
how each technology works, the differences between them, and
where they fit in the overall storage architecture, as shown
in Figure 1.
Real-time compression
Data compression reduces the size of data files so that less space
is required to store them. Real-time compression, as the name
implies, is the ability to compress data in real time—before it is
written to the hard disk rather than after—without any notice-
able performance degradation.
Designed to sit transparently in front of primary Network
Attached Storage (NAS), IBM Real-time Compression offers
the unique advantage of making it possible to shrink primary,
online data in real time with no loss in speed. With over
30 patents, it reduces the size of every file you create by up to
five times, depending upon the file type. It significantly reduces
the physical capacity required to store a file (or copies and per-
mutations of a file) through the entire data life cycle, including
backup. IBM Real-time Compression also has a feature called
the Compression Accelerator that enables the non-disruptive
compression of data that has already been saved to disk—while
applications continue to have random, read-write access to the
data. IBM Real-time Compression can also significantly enhance
overall network and storage performance, since less data is writ-
ten to disk and more data can be stored in the storage cache.
As more server workloads become virtualized, real-time com-
pression becomes increasingly valuable as a tool for storage
optimization in virtualized environments. The technology works
particularly well given the compression rates associated with
virtualized files (see Table 1). As a result, many companies that
have adopted file virtualization technologies are also exploring
deployment of IBM Real-time Compression in conjunction
with file virtualization. IBM Real-time Compression solutions
transparently integrate with file virtualization solutions and
can dramatically extend the cost reductions that file virtualiza-
tion enables.
File type                          Compression rate
Database                           Up to 85 percent
Microsoft Office                   20 to 60 percent
VMware VMDK (virtualized files)    Up to 72 percent
CAD/CAM                            Up to 70 percent
Oil and Gas                        Up to 50 percent
Table 1: Compression rates
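The percentages in Table 1 and the "up to five times" figure quoted above are two ways of expressing the same kind of measurement. As a rough illustration only (the function names are ours and are not part of any IBM tool), a space saving can be converted to a reduction ratio and back as follows:

```python
def savings_to_ratio(percent_saved: float) -> float:
    """Convert a space saving in percent to an N:1 reduction ratio."""
    if not 0 <= percent_saved < 100:
        raise ValueError("percent_saved must be in [0, 100)")
    return 100.0 / (100.0 - percent_saved)


def ratio_to_savings(ratio: float) -> float:
    """Convert an N:1 reduction ratio to a space saving in percent."""
    if ratio < 1:
        raise ValueError("ratio must be >= 1")
    return (1.0 - 1.0 / ratio) * 100.0


if __name__ == "__main__":
    # A five-times reduction leaves one fifth of the data behind.
    print(round(ratio_to_savings(5), 1))   # 80.0
    # An 85 percent saving (the database row of Table 1) is about 6.7:1.
    print(round(savings_to_ratio(85), 1))  # 6.7
```

Because the ratio grows quickly as the saving approaches 100 percent, small differences in a high percentage correspond to large differences in the ratio.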
Data deduplication
Data deduplication is designed to reduce the physical storage
required to store redundant data. The deduplication process
removes duplicate data and replaces it with a pointer to the main
copy, leaving only one copy of the data that actually has to be
stored. This is why it is well suited for backup data where there
are typically multiple data sets (daily/weekly, for example) of
mostly redundant data. The more copies of redundant data you
have, the higher your effective deduplication rate.
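The pointer-based idea described above can be sketched with a toy in-memory store. This is a deliberately simplified, hash-based illustration with fixed-size chunks. It is not how ProtecTIER works (this paper describes HyperFactor as a non-hash-based approach), and the class and method names are hypothetical.

```python
import hashlib


class ToyDedupStore:
    """Keeps one copy of each distinct chunk plus pointers per backup."""

    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size
        self.chunks = {}        # digest -> chunk bytes (the single stored copy)
        self.backups = {}       # backup name -> list of digests (the pointers)
        self.nominal_bytes = 0  # total bytes the backup application has written

    def write_backup(self, name: str, data: bytes) -> None:
        pointers = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # store only unseen chunks
            pointers.append(digest)
        self.backups[name] = pointers
        self.nominal_bytes += len(data)

    def read_backup(self, name: str) -> bytes:
        """Rebuild a backup by following its pointers."""
        return b"".join(self.chunks[d] for d in self.backups[name])

    def physical_bytes(self) -> int:
        return sum(len(c) for c in self.chunks.values())

    def dedup_ratio(self) -> float:
        return self.nominal_bytes / self.physical_bytes()


if __name__ == "__main__":
    # 64 distinct chunks of 4096 bytes each, built deterministically.
    base = b"".join(i.to_bytes(4, "big") * 1024 for i in range(64))
    store = ToyDedupStore()
    for day in range(1, 8):
        store.write_backup(f"day{day}", base)
        print(day, store.dedup_ratio())
```

With seven identical copies the ratio climbs from 1.0 to 7.0, which is the sense in which more redundant copies yield a higher effective deduplication rate. Real backup streams change between copies, so real ratios are lower.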
IBM ProtecTIER Deduplication solutions feature revolutionary
and patented HyperFactor® data deduplication technology.
They provide enterprise-class performance, scalability and
proven enterprise-level data integrity to meet disk-based data
protection needs while enabling significant infrastructure cost
reductions. They specifically provide improved backup
performance, up to 2000 MB/sec (7.2 TB/hour) sustained
inline deduplication, and even faster restores at up to
2800 MB/sec (10 TB/hour). They also provide:
● The ability to scale to 1 PB of physical storage.
● A reduction in storage capacity consumption of up to 25 times
or more.
● A non-hash-based approach that protects data integrity by
reducing the risk of data loss due to hash collision.
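The throughput figures quoted above are given in both MB/sec and TB/hour. A minimal sketch of the conversion, assuming decimal units (1 TB = 1,000,000 MB), shows how the two relate and how a backup window might be estimated. The helper names are illustrative, and the result is an idealized figure at the quoted peak rate, not a prediction for any environment.

```python
def mb_per_sec_to_tb_per_hour(mb_per_sec: float) -> float:
    """Convert MB/sec to TB/hour using decimal units (1 TB = 1,000,000 MB)."""
    return mb_per_sec * 3600 / 1_000_000


def hours_to_transfer(size_tb: float, mb_per_sec: float) -> float:
    """Idealized time to move size_tb at a sustained rate of mb_per_sec."""
    if mb_per_sec <= 0:
        raise ValueError("mb_per_sec must be positive")
    return size_tb / mb_per_sec_to_tb_per_hour(mb_per_sec)


if __name__ == "__main__":
    print(mb_per_sec_to_tb_per_hour(2000))  # 7.2
    print(mb_per_sec_to_tb_per_hour(2800))  # 10.08
    # Hypothetical 36 TB backup at the quoted peak inline rate.
    print(hours_to_transfer(36, 2000))
```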
Technology differences
As described above and illustrated in Figure 1, real-time com-
pression and data deduplication technologies address different
problems and sit at different points in the data life cycle. But
more importantly, the two technologies are complementary;
in particular, deploying real-time compression significantly
enhances the value and performance of data deduplication. This
conclusion has been demonstrated in a series of performance
tests, which are described in detail in this paper.
The need for data optimization in database backup and recovery
In general, backup and recovery refers to the various strategies
and procedures involved in protecting a database against data
loss and allowing for the reconstruction of the database after any
kind of disaster. The performance and reliability of backup and
recovery operations are critical to effective database operation.
Physical backups are backups of the physical files used in storing
and recovering your database, such as data files, control files and
archived redo logs.
Ultimately, every physical backup is a copy of files storing data-
base information to some other location, whether on disk or
some offline storage media such as tape. A backup performed after
a database is properly shut down is called a cold database backup.
Conversely, a backup performed while a database is online and
fully functional is called a hot database backup. During a cold
backup, the database is shut down and unavailable; obviously, any
technology that reduces the period of time that the database is
offline is advantageous. In either case, due to the tremendous
growth in data, it is becoming increasingly difficult for backups
to complete within designated backup windows.
Figure 2: The test environment (diagram not reproduced; components shown: three database clients running Quest Benchmark Factory, Oracle Database 10.2.0.4, a Gigabit Ethernet switch on the LAN, IBM Real-time Compression, the IBM N5600-A10, a Brocade 4100, Tivoli Storage Manager with NDMP backup, and the IBM TS7610 ProtecTIER Deduplication Appliance Express)
Although the benefits of utilizing database backups (either cold
or hot) are clear, their use can result in the creation of large
amounts of data that must reside in the storage environment,
taking up precious disk space and increasing the complexity and
cost of backup procedures. This is why primary storage opti-
mization is so important.
The test environment: An overview
In order to simulate accurate and realistic data storage scenarios,
IBM used Quest Software’s Benchmark Factory and Data
Factory to create and populate an Oracle database running over
NFS to an IBM System Storage® N5600-A10 storage controller.
A 37 GB baseline database was then used to test the
effects of data deduplication and compression. In each
test, a seven percent daily change rate was simulated
between successive database copies through a combination
of updates to existing data, additions of new data, and other
database activities such as delete, drop, create and remove.
Seven database copies were taken to simulate a week’s worth
of Oracle data sets in an enterprise environment.
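To put the test parameters in perspective, the following back-of-the-envelope sketch works out the nominal data volume for the week. The "idealized unique data" figure rests on our own simplifying assumptions (the database stays at its baseline size and each day's changes touch data not changed before). It is not a measured result from these tests.

```python
BASELINE_GB = 37.0    # baseline database size from the test description
DAILY_CHANGE = 0.07   # seven percent change between successive copies
COPIES = 7            # one copy per day for a week


def nominal_gb(baseline_gb: float, copies: int) -> float:
    """Total data the backup application writes before any optimization."""
    return baseline_gb * copies


def changed_gb_per_day(baseline_gb: float, change_rate: float) -> float:
    """Data that differs between two successive copies."""
    return baseline_gb * change_rate


def idealized_unique_gb(baseline_gb: float, change_rate: float,
                        copies: int) -> float:
    """First full copy plus the changed data for each later copy.

    Assumption: constant database size and non-overlapping changes.
    """
    return baseline_gb + (copies - 1) * changed_gb_per_day(baseline_gb,
                                                           change_rate)


if __name__ == "__main__":
    print(round(nominal_gb(BASELINE_GB, COPIES), 2))                 # 259.0
    print(round(changed_gb_per_day(BASELINE_GB, DAILY_CHANGE), 2))   # 2.59
    print(round(idealized_unique_gb(BASELINE_GB, DAILY_CHANGE, COPIES), 2))
```

The gap between the nominal volume and the idealized unique volume is the redundancy that a deduplication engine can, in principle, remove.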
Backup using Network Data Management Protocol (NDMP)
The test environment consisted of an IBM TS7610
ProtecTIER Deduplication Appliance Express acting as a
Virtual Tape Library attached to an IBM Tivoli® Storage
Manager server. In such a configuration, the Tivoli Storage
Manager server controls the virtual tape library through a direct
physical connection to the library robotics control port.
(The library robotics, the IBM Tivoli Storage Manager server
and the NAS file server are all connected over Fibre Channel.)
For NDMP operations, the drives in the library were connected
directly to the NAS file server, with a path defined from the
NAS head to the virtual drives. The NAS file server transfers
data to the virtual tape drives at the request of the IBM Tivoli
Storage Manager server.
As shown in Figure 3, to allow Tivoli Storage Manager to use
the virtual tape drives for non-NDMP operations, the virtual
tape drives were also connected to the Tivoli Storage Manager
server, with paths defined from the server to the drives.
This configuration also supports an IBM Tivoli Storage
Manager storage agent having access to the virtual tape drives
for its LAN-free operations.
Figure 3: NDMP architecture (diagram not reproduced; components shown: Tivoli Storage Manager server, NAS file server, NAS file server file system disks, optional web client, and virtual tape library; legend: SCSI or Fibre Channel connection, TCP/IP connection, data flow, robotics control, drive access)
SAN configuration
All components with Fibre Channel connectivity (Tivoli Storage
Manager server, TS7610 ProtecTIER Deduplication Appliance
Express and IBM System Storage N5600 storage controller)
were connected to a Brocade 4100 SAN switch in the test environment.
Following Fibre Channel SAN zoning best practices,
five zones were defined, each including one initiator and
one target. Four zones were defined to connect ProtecTIER’s
virtual tape library robotics and its virtual tape drives to the
Tivoli Storage Manager server in a redundant manner, in order
to enable Control Path Failover (CPF) and Data Path Failover
(DPF) between the Tivoli Storage Manager server and the
ProtecTIER virtual tape library. Control Path Failover and Data
Path Failover were handled transparently by the IBM Tape
Device Driver for Windows (IBM tape) installed on the Tivoli
Storage Manager server. In addition, a single virtual drive was
zoned to the N series to enable the NAS file server to transfer
data directly to a virtual drive.
N Series configuration
The IBM N5600-A10 storage controller configuration
included 28 Fibre Channel drives (144 GB, 15k rpm) in a RAID-DP
environment. The N5600 operating environment was ONTAP
7.3.3 for the tests.
IBM ProtecTIER Deduplication configuration
A ProtecTIER virtual tape library was defined with two virtual
tape drives, each assigned to one Fibre Channel front-end port.
To enable Control Path Failover, the virtual robot was assigned
to both Fibre Channel front-end ports. ProtecTIER’s LUN
masking feature was used to assign specific virtual devices to a
specific host running backup application modules. This feature
enables multiple initiators to share the same target Fibre
Channel port on the ProtecTIER system without having
conflicts on the devices that are being emulated.
Ten cartridges were defined to store backup data. To limit the
nominal cartridge size, maximum cartridge growth was set to
200 GB. Under these conditions, as soon as a cartridge stores
200 GB of nominal data, it is marked “full” and another cartridge
is used to back up data. Ten virtual tape slots were defined
in the virtual tape library to house the ten virtual tape cartridges.
By default, eight import/export slots were defined.
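The effect of the 200 GB growth limit can be illustrated with a small simulation. This is a sketch under our own assumption that a backup may span from a full cartridge onto the next one. Actual cartridge selection is governed by the backup application and the ProtecTIER system, and the function name is hypothetical.

```python
def fill_cartridges(backup_sizes_gb, max_cartridge_gb=200, cartridge_count=10):
    """Return the nominal GB held by each cartridge that was used.

    Assumes a backup continues on the next cartridge once the current
    one reaches its maximum nominal size and is marked full.
    """
    cartridges = [0.0]
    for size in backup_sizes_gb:
        remaining = float(size)
        while remaining > 0:
            free = max_cartridge_gb - cartridges[-1]
            if free <= 0:
                # Current cartridge is full: mount the next one, if any.
                if len(cartridges) == cartridge_count:
                    raise RuntimeError("no empty cartridges left")
                cartridges.append(0.0)
                free = max_cartridge_gb
            written = min(free, remaining)
            cartridges[-1] += written
            remaining -= written
    return cartridges


if __name__ == "__main__":
    # Seven 37 GB backup sets, as in the tests described in this paper.
    print(fill_cartridges([37] * 7))  # [200.0, 59.0]
```

Under these assumptions, the week of uncompressed backups fills one cartridge and part of a second, well within the ten cartridges defined.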
IBM Real-time Compression configuration
The IBM Real-time Compression Appliance STN6800 (Version
3.7.0) with Gigabit Ethernet ports was used for the tests.
1-Gigabit Ethernet connections were established to the Gigabit
Ethernet switch and the N5600 for connectivity between the
Oracle server and the N5600 storage controller.
Tivoli Storage Manager configuration
A server running IBM Tivoli Storage Manager 5.5.4.3 Extended
Edition was installed using Tivoli Storage Manager’s built-in
configuration wizards. (An Extended Edition license is required
to allow NDMP backups of NAS devices.) The Tivoli Storage
Manager database size was configured to 2048 MB and the log
size was configured to 1024 MB.
To initiate an NDMP backup from the Tivoli Storage Manager
server, the backup node command was used to perform a full
backup of the Oracle database files on the N5600 storage con-
troller. A table of contents (TOC) was not created as it is needed
only for single file restore. The backup NAS process in Tivoli
Storage Manager was monitored to measure backup time.
Test 1: ProtecTIER deduplication only
To illustrate the benefits of data deduplication, tests were per-
formed to deduplicate the seven 37 GB cold backup sets using
the ProtecTIER deduplication appliance only. Deduplication
was performed during the time the data was copied using
NDMP from the N5600 storage controller to the ProtecTIER
deduplication appliance.
Test 2: IBM Real-time Compression and ProtecTIER deduplication
To illustrate the added benefits of using real-time compression
with deduplication, an IBM Real-time Compression STN6800
appliance was installed in front of the IBM N5600 storage
controller. IBM Real-time Compression provided an immediate
footprint reduction of the database file size from 37 GB to
6.6 GB, a reduction of over 82 percent. The introduction of
IBM Real-time Compression provided immediate space savings,
since the compression was performed in real time, when the data
was written to storage. No post processing or configuration
changes were required to realize these savings.
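The reduction quoted above follows directly from the two file sizes. A minimal sketch of the arithmetic is shown below. The weekly figure assumes, for illustration only, that each of the seven copies compresses to about the same 6.6 GB; the paper does not report per-copy sizes.

```python
def percent_saved(original_gb: float, reduced_gb: float) -> float:
    """Space saving, in percent, when original_gb shrinks to reduced_gb."""
    if original_gb <= 0:
        raise ValueError("original_gb must be positive")
    return (1.0 - reduced_gb / original_gb) * 100.0


if __name__ == "__main__":
    # 37 GB database file reduced to 6.6 GB by real-time compression.
    print(round(percent_saved(37, 6.6), 1))  # 82.2

    # Assumed weekly volume sent to the deduplication appliance:
    # seven copies at roughly 6.6 GB each instead of 37 GB each.
    print(round(7 * 6.6, 1), "GB compressed versus", 7 * 37, "GB uncompressed")
```

A smaller input stream is also why the deduplication engine has less data to receive over the backup network and less to process.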
Clearly, both data deduplication and compression standing alone
offer significant space savings over traditional, non-optimized
storage. However, the benefits of combining these technologies
are even more compelling.
Figure 4: Backup times for IBM Real-time Compression combined with ProtecTIER Deduplication, compared to deduplication alone (chart not reproduced; x-axis: day of backup, 1 to 7; y-axis: backup time in seconds, 0 to 900; series: IBM Real-time Compression with IBM ProtecTIER, and IBM ProtecTIER)
Figure 5: Space used with IBM Real-time Compression combined with ProtecTIER Deduplication, compared to deduplication alone (chart not reproduced; x-axis: day of backup, 1 to 7; y-axis: used space in GB, 0 to 25; series: IBM Real-time Compression with IBM ProtecTIER, and IBM ProtecTIER)
When the ProtecTIER deduplication solution was used to
back up the data compressed by IBM Real-time Compression, the
seven compressed backup sets were further reduced by an average
of 39 percent. In addition, backup of compressed data took
an average of 68 percent less time than backup in the absence of
IBM Real-time Compression.
Summary
While IBM Real-time Compression and IBM ProtecTIER
Deduplication solutions both offer compelling storage and data
protection benefits when used individually, the combination of
the two technologies has been shown to produce far greater
storage efficiency, significantly reduce backup times, and
improve utilization of resources. Tests involving Oracle
database physical backups have shown that together, these data
compression and deduplication solutions are capable of produc-
ing benefits far exceeding those found with either technology
alone—including 96 percent overall data reduction and up to
71 percent reduction in backup time, as well as better deduplica-
tion CPU utilization, less deduplication disk activity and less
deduplication network traffic. These results demonstrate strong
synergies between real-time compression and deduplication and
present a powerful argument for using both in order to achieve
storage optimization.
For more information
To learn more about how IBM Real-time Compression and
IBM ProtecTIER Deduplication solutions can optimize storage
efficiency in your environment, contact your IBM representative
or visit ibm.com/storage/solutions/rtc and
ibm.com/systems/storage/tape/protectier/
Additionally, financing solutions from IBM Global Financing
can enable effective cash management, protection from technol-
ogy obsolescence, improved total cost of ownership and return
on investment. Also, our Global Asset Recovery Services help
address environmental concerns with new, more energy-efficient
solutions. For more information on IBM Global Financing,
visit: ibm.com/financing
© Copyright IBM Corporation 2011
IBM Systems and Technology Group
Route 100
Somers, NY 10589
U.S.A.
Produced in the United States of America
July 2011
All Rights Reserved
IBM, the IBM logo, ibm.com and ProtecTIER are trademarks or
registered trademarks of International Business Machines Corporation in the
United States, other countries, or both. If these and other IBM trademarks
are marked on their first occurrence in this information with a trademark
symbol (® or ™), these symbols indicate U.S. registered or common law
trademarks owned by IBM at the time this information was published. Such
trademarks may also be registered or common law trademarks in other
countries. A current list of IBM trademarks is available on the web at
“Copyright and trademark information” at ibm.com/legal/copytrade.shtml
Other company, product and service names may be trademarks or service
marks of others.
This paper is intended to provide information regarding IBM Real-time
Compression Appliance (RTC) in combination with ProtecTIER
Deduplication solutions. It discusses findings based on configurations that
were created and tested under laboratory conditions. These findings may
not be realized in all customer environments, and implementation in such
environments may require additional steps, configurations and performance,
compression and deduplication analysis. This information does not constitute
a specification or form part of the warranty for any IBM or non-
IBM products.
Information in this document was developed in conjunction with the use of
the equipment specified and is limited in application to those specific
hardware and software products and levels.
The information contained in this document has not been submitted to any
formal IBM test and is distributed as-is. The use of this information or the
implementation of these techniques is a customer responsibility and depends
on the customer’s ability to evaluate and integrate them into the customer’s
operational environment. While each item may have been reviewed by
IBM for accuracy in a specific situation, there is no guarantee that the same
or similar results will be obtained elsewhere. Customers attempting to adapt
these techniques to their own environments do so at their own risk.
IBM may not officially support techniques mentioned in this document.
For questions regarding officially supported techniques, please refer to the
product documentation or announcement letters, or contact IBM Support.
This document could include technical inaccuracies or typographical
errors. IBM may not offer the products, services or features discussed in
this document in other countries, and the product information may be
subject to change without notice. Consult your local IBM business contact
for information on the product or services available in your area. Any
statements regarding IBM’s future direction and intent are subject to
change or withdrawal without notice, and represent goals and objectives
only. The information contained in this document is current as of the
initial date of publication only and is subject to change without notice. All
performance information was determined in a controlled environment.
Actual results may vary. Performance information is provided “AS IS” and
no warranties or guarantees are expressed or implied by IBM. Information
concerning non-IBM products was obtained from the suppliers of those
products, their published announcements or other publicly available
sources. Questions on the capabilities of the non-IBM products should be
addressed with the suppliers. IBM does not warrant that the information
offered herein will meet your requirements or those of your distributors
or customers. IBM disclaims all warranties, express or implied, including
the implied warranties of merchantability and fitness for
a particular purpose or noninfringement. IBM products are warranted
according to the terms and conditions of the agreements under which they
are provided.
TSW03093-USEN-00

More Related Content

What's hot

Backup policy template julie bozzi oregon
Backup policy template   julie bozzi oregonBackup policy template   julie bozzi oregon
Backup policy template julie bozzi oregonJulie Bozzi, PfPM, PMP
 
WETEC Compellent Storage Center 4.0 Release
WETEC Compellent Storage Center 4.0 ReleaseWETEC Compellent Storage Center 4.0 Release
WETEC Compellent Storage Center 4.0 ReleaseEddy Jennekens
 
Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...
Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...
Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...IBM India Smarter Computing
 
Equip your Dell EMC PowerEdge R740xd servers with Intel Optane persistent mem...
Equip your Dell EMC PowerEdge R740xd servers with Intel Optane persistent mem...Equip your Dell EMC PowerEdge R740xd servers with Intel Optane persistent mem...
Equip your Dell EMC PowerEdge R740xd servers with Intel Optane persistent mem...Principled Technologies
 
Dell PowerEdge M820 blades: Balancing performance, density, and high availabi...
Dell PowerEdge M820 blades: Balancing performance, density, and high availabi...Dell PowerEdge M820 blades: Balancing performance, density, and high availabi...
Dell PowerEdge M820 blades: Balancing performance, density, and high availabi...Principled Technologies
 
Backup & restore in windows
Backup & restore in windowsBackup & restore in windows
Backup & restore in windowsJab Vtl
 
Basic principles of backup policies by Andrea Mauro, Backup Academy
Basic principles of backup policies by Andrea Mauro, Backup AcademyBasic principles of backup policies by Andrea Mauro, Backup Academy
Basic principles of backup policies by Andrea Mauro, Backup AcademyVeeam Software
 
Presentation on backup and recoveryyyyyyyyyyyyy
Presentation on backup and recoveryyyyyyyyyyyyyPresentation on backup and recoveryyyyyyyyyyyyy
Presentation on backup and recoveryyyyyyyyyyyyyTehmina Gulfam
 
Scale-Out Architectures for Secondary Storage
Scale-Out Architectures for Secondary StorageScale-Out Architectures for Secondary Storage
Scale-Out Architectures for Secondary StorageInteractiveNEC
 
Server TCO Showdown -- Lenovo x3950 X6 and IBM Storwize V7000 vs HP superdome
Server TCO Showdown -- Lenovo x3950 X6 and IBM Storwize V7000 vs  HP superdomeServer TCO Showdown -- Lenovo x3950 X6 and IBM Storwize V7000 vs  HP superdome
Server TCO Showdown -- Lenovo x3950 X6 and IBM Storwize V7000 vs HP superdomeLenovo Data Center
 
Lenovo System x3950 X6 vs HP Superdome 2
Lenovo System x3950 X6 vs  HP Superdome 2Lenovo System x3950 X6 vs  HP Superdome 2
Lenovo System x3950 X6 vs HP Superdome 2Lenovo Data Center
 
Make sense of important data faster with AWS EC2 M6i instances
Make sense of important data faster with AWS EC2 M6i instancesMake sense of important data faster with AWS EC2 M6i instances
Make sense of important data faster with AWS EC2 M6i instancesPrincipled Technologies
 
Performance Tuning
Performance TuningPerformance Tuning
Performance TuningJannet Peetz
 
Firebird database recovery and protection for enterprises and ISV
Firebird database recovery and protection for enterprises and ISVFirebird database recovery and protection for enterprises and ISV
Firebird database recovery and protection for enterprises and ISVMind The Firebird
 
03 backup-and-recovery
03 backup-and-recovery03 backup-and-recovery
03 backup-and-recoveryhunny garg
 

What's hot (20)

Exa backup playbook
Exa backup playbookExa backup playbook
Exa backup playbook
 
Backup policy template julie bozzi oregon
Backup policy template   julie bozzi oregonBackup policy template   julie bozzi oregon
Backup policy template julie bozzi oregon
 
WETEC Compellent Storage Center 4.0 Release
WETEC Compellent Storage Center 4.0 ReleaseWETEC Compellent Storage Center 4.0 Release
WETEC Compellent Storage Center 4.0 Release
 
Database recovery
Database recoveryDatabase recovery
Database recovery
 
Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...
Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...
Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...
 
Equip your Dell EMC PowerEdge R740xd servers with Intel Optane persistent mem...
Equip your Dell EMC PowerEdge R740xd servers with Intel Optane persistent mem...Equip your Dell EMC PowerEdge R740xd servers with Intel Optane persistent mem...
Equip your Dell EMC PowerEdge R740xd servers with Intel Optane persistent mem...
 
Dell PowerEdge M820 blades: Balancing performance, density, and high availabi...
Dell PowerEdge M820 blades: Balancing performance, density, and high availabi...Dell PowerEdge M820 blades: Balancing performance, density, and high availabi...
Dell PowerEdge M820 blades: Balancing performance, density, and high availabi...
 
Backup & restore in windows
Backup & restore in windowsBackup & restore in windows
Backup & restore in windows
 
Basic principles of backup policies by Andrea Mauro, Backup Academy
Basic principles of backup policies by Andrea Mauro, Backup AcademyBasic principles of backup policies by Andrea Mauro, Backup Academy
Basic principles of backup policies by Andrea Mauro, Backup Academy
 
Presentation on backup and recoveryyyyyyyyyyyyy
Presentation on backup and recoveryyyyyyyyyyyyyPresentation on backup and recoveryyyyyyyyyyyyy
Presentation on backup and recoveryyyyyyyyyyyyy
 
GuideIT Delivery Design - File Shares
GuideIT Delivery Design - File SharesGuideIT Delivery Design - File Shares
GuideIT Delivery Design - File Shares
 
Scale-Out Architectures for Secondary Storage
Scale-Out Architectures for Secondary StorageScale-Out Architectures for Secondary Storage
Scale-Out Architectures for Secondary Storage
 
Server TCO Showdown -- Lenovo x3950 X6 and IBM Storwize V7000 vs HP superdome
Server TCO Showdown -- Lenovo x3950 X6 and IBM Storwize V7000 vs  HP superdomeServer TCO Showdown -- Lenovo x3950 X6 and IBM Storwize V7000 vs  HP superdome
Server TCO Showdown -- Lenovo x3950 X6 and IBM Storwize V7000 vs HP superdome
 
Lenovo System x3950 X6 vs HP Superdome 2
Lenovo System x3950 X6 vs  HP Superdome 2Lenovo System x3950 X6 vs  HP Superdome 2
Lenovo System x3950 X6 vs HP Superdome 2
 
Backup strategy
Backup strategyBackup strategy
Backup strategy
 
Make sense of important data faster with AWS EC2 M6i instances
Make sense of important data faster with AWS EC2 M6i instancesMake sense of important data faster with AWS EC2 M6i instances
Make sense of important data faster with AWS EC2 M6i instances
 
Performance Tuning
Performance TuningPerformance Tuning
Performance Tuning
 
Select enterprise backup software
Select enterprise backup softwareSelect enterprise backup software
Select enterprise backup software
 
Firebird database recovery and protection for enterprises and ISV
Firebird database recovery and protection for enterprises and ISVFirebird database recovery and protection for enterprises and ISV
Firebird database recovery and protection for enterprises and ISV
 
03 backup-and-recovery
03 backup-and-recovery03 backup-and-recovery
03 backup-and-recovery
 

Viewers also liked

Deduplication and single instance storage
Deduplication and single instance storageDeduplication and single instance storage
Deduplication and single instance storageInterop
 
Implementing ibm storage data deduplication solutions sg247888
  • 3. 3IBM Systems and Technology Figure 1: The data optimization landscape Detailed test results are included later in this document. The bottom line is that combining real-time compression and data deduplication optimizes your overall storage footprint by reduc- ing data on your primary NAS devices as well as throughout your data life cycle. By combining these two technologies, you can achieve maximum data reduction, which maximizes your return on investment and dramatically improves your data pro- tection performance and capabilities. Landmarks in the data optimization landscape To fully appreciate the benefits of combining real-time compres- sion and data deduplication, it’s important first to understand how each technology works, the differences between them, and where they fit in the overall storage architecture, as shown in Figure 1.
  • 4. 4 Combining IBM Real-time Compression and IBM ProtecTIER Deduplication Real-time compression Data compression reduces the size of data files so that less space is required to store them. Real-time compression, as the name implies, is the ability to compress data in real time—before it is written to the hard disk rather than after—without any notice- able performance degradation. Designed to sit transparently in front of primary Network Attached Storage (NAS), IBM Real-time Compression offers the unique advantage of making it possible to shrink primary, online data in real time with no loss in speed. With over 30 patents, it reduces the size of every file you create by up to five times, depending upon the file type. It significantly reduces the physical capacity required to store a file (or copies and per- mutations of a file) through the entire data life cycle, including backup. IBM Real-time Compression also has a feature called the Compression Accelerator that enables the non-disruptive compression of data that has already been saved to disk—while applications continue to have random, read-write access to the data. IBM Real-time Compression can also significantly enhance overall network and storage performance, since less data is writ- ten to disk and more data can be stored in the storage cache. As more server workloads become virtualized, real-time com- pression becomes increasingly valuable as a tool for storage optimization in virtualized environments. The technology works particularly well given the compression rates associated with virtualized files (see Table 1). As a result, many companies that have adopted file virtualization technologies are also exploring deployment of IBM Real-time Compression in conjunction with file virtualization. IBM Real-time Compression solutions transparently integrate with file virtualization solutions and can dramatically extend the cost reductions that file virtualiza- tion enables. 
File type Compression rate Database Up to 85 percent Microsoft Office Up to 20 - 60 percent VMware VMDK (virtualized files) Up to 72 percent CAD/CAM Up to 70 percent Oil and Gas Up to 50 percent Table 1: Compression rates
  • 5. 5IBM Systems and Technology Data deduplication Data deduplication is designed to reduce the physical storage required to store redundant data. The deduplication process removes duplicate data and replaces it with a pointer to the main copy, leaving only one copy of the data that actually has to be stored. This is why it is well suited for backup data where there are typically multiple data sets (daily/weekly, for example) of mostly redundant data. The more copies of redundant data you have, the higher your effective deduplication rate. IBM ProtecTIER Deduplication solutions feature revolutionary and patented HyperFactor® data deduplication technology. They provide enterprise-class performance, scalability and proven enterprise-level data integrity to meet disk-based data protection needs while enabling significant infrastructure cost reductions. They specifically provide improved backup performance, up to 2000 MB/sec (7.2 TB/hour) sustained inline deduplication, and even faster restores at up to 2800 MB/sec (10 TB/hour). It also provides: ● The ability to scale to 1 PB of physical storage. ● A reduction in storage capacity consumption of up to 25 times or more. ● A non-hash-based approach that protects data integrity by reducing the risk of data loss due to hash collision. Technology differences As described above and illustrated in Figure 1, real-time com- pression and data deduplication technologies address different problems and sit at different points in the data life cycle. But more importantly, the two technologies are complementary; in particular, deploying real-time compression significantly enhances the value and performance of data deduplication. This conclusion has been demonstrated in a series of performance tests, which are described in detail in this paper. 
The need for data optimization in database backup and recovery In general, backup and recovery refers to the various strategies and procedures involved in protecting a database against data loss and allowing for the reconstruction of the database after any kind of disaster. The performance and reliability of backup and recovery operations are critical to effective database operation. Physical backups are backups of the physical files used in storing and recovering your database, such as data files, control files and archived redo logs. Ultimately, every physical backup is a copy of files storing data- base information to some other location, whether on disk or some offline storage media such as tape. Backup performed after a database is properly shut down is called cold database backup. Conversely, backup performed when a database is online and fully functional is called hot database backup. During a cold backup, the database is shut down and unavailable; obviously, any technology that reduces the period of time that the database is offline is advantageous. In either case, due to the tremendous growth in data, it is becoming increasingly difficult for backups to complete within designated backup windows.
  • 6. 6 Combining IBM Real-time Compression and IBM ProtecTIER Deduplication IBM TS7610 ProtecTIER Deduplication Appliance Express Tivoli Storage Manager NDMP Backup Gigabit Ethernet Switch Oracle Database 10.2.0.4 3x DB Clients Quest Benchmark Factory IBM Real-time Compression IBM N5600-A10 Brocade 4100 LAN Figure 2: The test environment
  • 7. 7IBM Systems and Technology Although the benefits of utilizing database backups (either cold or hot) are clear, their use can result in the creation of large amounts of data that must reside in the storage environment, taking up precious disk space and increasing the complexity and cost of backup procedures. This is why primary storage opti- mization is so important. The test environment: An overview In order to simulate accurate and realistic data storage scenarios, IBM used Quest Software’s Benchmark Factory and Data Factory to create and populate an Oracle database running over NFS to an IBM System Storage® N5600-A10 storage con- troller. A 37GB baseline database was then used to test the effects of data deduplication and compression, respectively. Each test had a seven percent daily change rate that was simulated between each database copy. Seven database copies were then taken to simulate a week’s worth of Oracle data sets in an enter- prise environment through a combination of updates to existing data, additions of new data, and other database activities such as delete, drop, create and remove. Backup using Network Data Management Protocol (NDMP) The test environment consisted of an IBM TS7610 ProtecTIER Deduplication Appliance Express acting as a Virtual Tape Library attached to an IBM Tivoli® Storage Manager server. In such a configuration, the Tivoli Storage Manager server controls the virtual tape library through a direct physical connection to the library robotics control port. (The library robotics, the IBM Tivoli Storage Manager server and the NAS file server are all connected over Fibre Channel.) For NDMP operations, the drives in the library were connected directly to the NAS file server, with a path defined from the NAS head to the virtual drives. The NAS file server transfers data to the virtual tape drives at the request of the IBM Tivoli Storage Manager server. 
As shown in Figure 3, to allow Tivoli Storage Manager to use the virtual tape drives for non-NDMP operations, the virtual tape drives were also connected to the Tivoli Storage Manager server, with paths defined from the server to the drives. This configuration also supports an IBM Tivoli Storage Manager storage agent having access to the virtual tape drives for its LAN-free operations.
  • 8. 8 Combining IBM Real-time Compression and IBM ProtecTIER Deduplication Tivoli Storage Manager Server NAS File Server NAS File Server File System Disks Web Client (optional) Virtual Tape Library LEGEND SCSI or Fibre Channel Connection TCP/IP Connection Data Flow Robotics Control Drive access Figure 3: NDMP architecture
  • 9. 9IBM Systems and Technology SAN configuration All components with Fibre Channel connectivity (Tivoli Storage Manager server, TS7610 ProtecTIER Deduplication Appliance Express and IBM System Storage N5600 storage controller) were connected to a Brocade 4100 SAN switch in the test envi- ronment. According to Fibre Channel SAN zoning best prac- tices, five zones were defined, each including one initiator and one target. Four zones were defined to connect ProtecTIER’s virtual tape library robotics and its virtual tape drives to the Tivoli Storage Manager server in a redundant manner, in order to enable Control Path Failover (CPF) and Data Path Failover (DPF) between the Tivoli Storage Manager server and the ProtecTIER virtual tape library. Control Path Failover and Data Path Failover were handled transparently by the IBM Tape Device Driver for Windows (IBM tape) installed on the Tivoli Storage Manager server. In addition, a single virtual drive was zoned to the N series to enable the NAS file server to transfer data directly to a virtual drive. N Series configuration The IBM N5600-A10 storage controller configuration included 28 144 GB 15k Fibre Channel drives in a RAID-DP environment. The N5600 operating environment was ONTAP 7.3.3 for the tests. IBM ProtecTIER Deduplication configuration A ProtecTIER virtual tape library was defined with two virtual tape drives, each assigned to one Fibre Channel front-end port. To enable Control Path Failover, the virtual robot was assigned to both Fibre Channel front-end ports. ProtecTIER’s LUN masking feature was used to assign specific virtual devices to a specific host running backup application modules. This feature enables multiple initiators to share the same target Fibre Channel port on the ProtecTIER system without having conflicts on the devices that are being emulated. Ten cartridges were defined to store backup data. To limit the nominal cartridge size, maximum cartridge growth was set to 200 GB. 
Under these conditions, as soon as a cartridge stores 200 GB of nominal data, it is marked “full” and another car- tridge is used to backup data. Ten virtual tape slots were defined in the virtual tape library to house the ten virtual tape cartridges. By default, eight import/export slots were defined. IBM Real-time Compression configuration The IBM Real-time Compression Appliance STN6800 (Version 3.7.0) with Gigabit Ethernet ports were used for the tests. 1-Gigabit Ethernet connections were established to the Gigabit Ethernet switch and the N5600 for connectivity between the Oracle server and the N5600 storage controller.
  • 10. 10 Combining IBM Real-time Compression and IBM ProtecTIER Deduplication Tivoli Storage Manager configuration A server running IBM Tivoli Storage Manager 5.5.4.3 Extended Edition was installed using Tivoli Storage Manager’s built-in configuration wizards. (An Extended Edition license is required to allow NDMP backups of NAS devices.) The Tivoli Storage Manager database size was configured to 2048 MB and the log size was configured to 1024 MB. To initiate an NDMP backup from the Tivoli Storage Manager server, the backup node command was used to perform a full backup of the Oracle database files on the N5600 storage con- troller. A table of contents (TOC) was not created as it is needed only for single file restore. The backup NAS process in Tivoli Storage Manager was monitored to measure backup time. Test 1: ProtecTIER deduplication only To illustrate the benefits of data deduplication, tests were per- formed to deduplicate the seven 37 GB cold backup sets using the ProtecTIER deduplication appliance only. Deduplication was performed during the time the data was copied using NDMP from the N5600 storage controller to the ProtecTIER deduplication appliance. Test 2: IBM Real-time Compression and ProtecTIER deduplication To illustrate the added benefits of using real-time compression with deduplication, an IBM Real-time Compression STN6800 appliance was installed in front of the IBM N5600 storage controller. IBM Real-time Compression provided an immediate footprint reduction of the database file size from 37 GB to 6.6 GB, a reduction of over 82 percent. The introduction of IBM Real-time Compression provided immediate space savings, since the compression was performed in real time, when the data was written to storage. No post processing or configuration changes were required to realize these savings. Clearly, both data deduplication and compression standing alone offer significant space savings over traditional, non-optimized storage. 
However, the benefits of combining these technologies are even more compelling. BackupTimeinSeconds Day of Backup 900 800 700 600 500 400 300 200 100 0 1 2 3 4 5 6 7 IBM Real-time Compression with IBM ProtecTIER IBM ProtecTIER Figure 4: Backup times for IBM Real-time Compression combined with ProtecTIER Deduplication, compared to deduplication alone
  • 11. 11IBM Systems and Technology UsedSpaceGB Day of Backup 25 20 15 10 5 0 1 2 3 4 5 6 7 IBM Real-time Compression with IBM ProtecTIER IBM ProtecTIER Figure 5: Space used with IBM Real-time Compression combined with ProtecTIER Deduplication, compared to deduplication alone When the ProtecTIER deduplication solution was used to backup the IBM Real-time Compression compressed data, the seven compressed backup sets were further reduced by an aver- age of 39 percent. In addition, backup of compressed data took an average 68 percent less time than backup in the absence of IBM Real-time Compression. Summary While IBM Real-time Compression and IBM ProtecTIER Deduplication solutions both offer compelling storage and data protection benefits when used individually, the combination of the two technologies has been shown to produce far greater storage efficiency, significantly reduce backup times, and improve utilization of resources. Tests involving Oracle database physical backups have shown that together, these data compression and deduplication solutions are capable of produc- ing benefits far exceeding those found with either technology alone—including 96 percent overall data reduction and up to 71 percent reduction in backup time, as well as better deduplica- tion CPU utilization, less deduplication disk activity and less deduplication network traffic. These results demonstrate strong synergies between real-time compression and deduplication and present a powerful argument for using both in order to achieve storage optimization. 
For more information

To learn more about how IBM Real-time Compression and IBM ProtecTIER Deduplication solutions can optimize storage efficiency in your environment, contact your IBM representative or visit ibm.com/storage/solutions/rtc and ibm.com/systems/storage/tape/protectier/

Additionally, financing solutions from IBM Global Financing can enable effective cash management, protection from technology obsolescence, improved total cost of ownership and return on investment. Also, our Global Asset Recovery Services help address environmental concerns with new, more energy-efficient solutions. For more information on IBM Global Financing, visit: ibm.com/financing
  • 12. © Copyright IBM Corporation 2011

IBM Systems and Technology Group
Route 100
Somers, NY 10589
U.S.A.

Produced in the United States of America
July 2011
All Rights Reserved

IBM, the IBM logo, ibm.com and ProtecTIER are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarks are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the web at “Copyright and trademark information” at ibm.com/legal/copytrade.shtml

Other company, product and service names may be trademarks or service marks of others.

This paper is intended to provide information regarding IBM Real-time Compression Appliance (RTC) in combination with ProtecTIER Deduplication solutions. It discusses findings based on configurations that were created and tested under laboratory conditions. These findings may not be realized in all customer environments, and implementation in such environments may require additional steps, configurations and performance, compression and deduplication analysis. This information does not constitute a specification or form part of the warranty for any IBM or non-IBM products.

Information in this document was developed in conjunction with the use of the equipment specified and is limited in application to those specific hardware and software products and levels.

The information contained in this document has not been submitted to any formal IBM test and is distributed as-is.
The use of this information or the implementation of these techniques is a customer responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.

IBM may not officially support techniques mentioned in this document. For questions regarding officially supported techniques, please refer to the product documentation or announcement letters, or contact IBM Support.

This document could include technical inaccuracies or typographical errors. IBM may not offer the products, services or features discussed in this document in other countries, and the product information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area. Any statements regarding IBM’s future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. The information contained in this document is current as of the initial date of publication only and is subject to change without notice.

All performance information was determined in a controlled environment. Actual results may vary. Performance information is provided “AS IS” and no warranties or guarantees are expressed or implied by IBM.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. Questions on the capabilities of the non-IBM products should be addressed with the suppliers.

IBM does not warrant that the information offered herein will meet your requirements or those of your distributors or customers.
IBM disclaims all warranties, express or implied, including the implied warranties of noninfringement, merchantability and fitness for a particular purpose. IBM products are warranted according to the terms and conditions of the agreements under which they are provided.

TSW03093-USEN-00