Data footprint reduction is the umbrella term for technologies such as thin provisioning, space-efficient snapshots, data deduplication, and Real-time Compression.
26. Data Footprint Reduction
                          Active Data      Backup Data
Real-time Compression     40-80% (Best)    40-80%
Data Deduplication        20-30%           80-95% (Best)
Real-time Compression is a method of reducing storage needs by changing the encoding scheme as the data is being read and written
– Short patterns for frequent data
– Longer patterns for infrequent data
– Can achieve 40 to 80 percent reduction in storage capacity for active data
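The "short patterns for frequent data" idea can be illustrated with a Huffman code, a textbook variable-length encoding. This is only a sketch of the general principle, not the encoding Real-time Compression actually uses; the function name `huffman_code_lengths` and the sample data are made up for the illustration.

```python
import heapq
from collections import Counter
from typing import Dict


def huffman_code_lengths(data: bytes) -> Dict[int, int]:
    """Return {byte value: code length in bits} for a Huffman code built from data."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries are (weight, tiebreak, {symbol: depth}); the unique tiebreak
    # keeps Python from ever comparing the dictionaries.
    heap = [(weight, i, {sym: 0}) for i, (sym, weight) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


# Illustrative data: one very frequent byte and three rarer ones.
sample = b"a" * 80 + b"b" * 15 + b"c" * 4 + b"d"
lengths = huffman_code_lengths(sample)
encoded_bits = sum(lengths[byte] for byte in sample)
savings = 1 - encoded_bits / (8 * len(sample))
print(lengths, f"{savings:.0%}")
```

The frequent byte `a` gets a 1-bit code while the rare bytes `c` and `d` get 3-bit codes. Real data is rarely this skewed, so the savings here should not be read as typical.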
Data deduplication is a method of reducing storage needs by eliminating duplicate copies of data
– Store only one unique instance of the data
– Redundant data replaced with pointer
– Can achieve 80 to 95 percent reduction in storage capacity for backup data
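A minimal sketch of the "one unique instance plus pointers" idea follows. It assumes fixed 4 KiB chunks and uses SHA-256 digests as the pointers; both choices, and the `DedupStore` class itself, are hypothetical and do not describe any particular product.

```python
import hashlib
from typing import Dict, List


class DedupStore:
    """Toy deduplicating store: each unique chunk is kept once, keyed by its digest."""

    def __init__(self, chunk_size: int = 4096) -> None:
        self.chunk_size = chunk_size
        self.chunks: Dict[str, bytes] = {}  # digest -> the single stored copy

    def write(self, data: bytes) -> List[str]:
        """Store data and return the list of pointers (digests) that rebuild it."""
        pointers = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Only the first occurrence of a chunk consumes capacity.
            self.chunks.setdefault(digest, chunk)
            pointers.append(digest)
        return pointers

    def read(self, pointers: List[str]) -> bytes:
        """Follow the pointers to return the original bytes."""
        return b"".join(self.chunks[p] for p in pointers)

    def stored_bytes(self) -> int:
        return sum(len(chunk) for chunk in self.chunks.values())


# Ten identical "full backups" of the same 32 KiB of data.
backup = b"".join(bytes([i]) * 4096 for i in range(8))
store = DedupStore()
recipes = [store.write(backup) for _ in range(10)]

logical_bytes = 10 * len(backup)
dedup_savings = 1 - store.stored_bytes() / logical_bytes
print(f"{dedup_savings:.0%}")
```

Repeated full backups are the favourable case: nine of the ten copies collapse into pointers. Data with few repeated chunks would see far less benefit from this approach.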
32. Compression Acceleration Cards – Intel® QuickAssist Technology
• Intel QuickAssist technology is integrated into the new Compression Acceleration cards
  – Used to offload the LZ compression and decompression processing
  – Each node supports up to two Compression Acceleration cards
  – SVC uses 4 parallel compression threads per card
• To use compressed volumes, nodes require at least:
  – SVC 2145-DH8 or next-generation Storwize V7000
  – 64 GB of cache memory per node
  – One Compression Acceleration card
• When compression is enabled
  – 38 GB is used as a compression cache
• Optionally upgrade each node to contain a second Compression Acceleration card
  – Upgrade recommended when the normal data working set is larger than 32 TB
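The requirements above can be written down as a small planning check. This is a sketch only: the `Node` class, the function names, and the model label strings are hypothetical, and the thresholds are simply the figures quoted on the slide.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the slide; model labels are illustrative strings.
SUPPORTED_MODELS = {"SVC 2145-DH8", "Storwize V7000 (next generation)"}
MIN_CACHE_GB = 64
MAX_COMPRESSION_CARDS = 2
SECOND_CARD_WORKING_SET_TB = 32


@dataclass
class Node:
    model: str
    cache_gb: int
    compression_cards: int
    working_set_tb: float = 0.0


def compression_problems(node: Node) -> List[str]:
    """Return the reasons a node cannot host compressed volumes (empty list if none)."""
    problems = []
    if node.model not in SUPPORTED_MODELS:
        problems.append(f"unsupported model: {node.model}")
    if node.cache_gb < MIN_CACHE_GB:
        problems.append(f"needs at least {MIN_CACHE_GB} GB cache, has {node.cache_gb} GB")
    if node.compression_cards < 1:
        problems.append("needs at least one Compression Acceleration card")
    if node.compression_cards > MAX_COMPRESSION_CARDS:
        problems.append(f"at most {MAX_COMPRESSION_CARDS} cards per node")
    return problems


def second_card_recommended(node: Node) -> bool:
    """A second card is suggested once the working set exceeds 32 TB."""
    return node.compression_cards == 1 and node.working_set_tb > SECOND_CARD_WORKING_SET_TB


node = Node(model="SVC 2145-DH8", cache_gb=64, compression_cards=1, working_set_tb=40)
print(compression_problems(node), second_card_recommended(node))
```

For the example node the problem list is empty and the second card is recommended, because the 40 TB working set is above the 32 TB guideline.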
33. New Dual Layer Cache Architecture
• First major update to cache since 2003
• Flexible design for plug and play style cache algorithm enhancements in the future
• “SVC” like L2 cache for advanced functions
• Upper Cache – simple write cache
• Lower Cache – algorithm intelligence
  – Understands mdisks
• Shared buffer space between two layers
* Only 4F2 hardware limited to running no later than 5.1 Software due to 32bit CPU
[Diagram: 7.3.0 Software Stack. Components shown: SCSI Target, Forwarding, Replication, Upper Cache, FlashCopy, Mirroring, Thin Provisioning, Compression, Virtualization, Easy Tier 3, Lower Cache, RAID, SCSI Initiator, plus Configuration, Peer Communications, Clustering, and an Interface Layer (Fibre Channel, iSCSI, FCoE, SAS, PCIe). Three components carry a “New” label.]
34. Turbo Compression Explained
                                                           Store more    IOPS            Response time
  Real Time Compression [RtC]                              Store more    Limited effect  Limited effect
+ Auto Tiering [Easy Tier and Flash Technology]            No effect     More IOPS       Faster response
= Turbo Compression [RtC + Easy Tier and Flash Technology] Store more    More IOPS       Faster response
Turbo Compression may double the net usability of existing infrastructures
Turbo Compression tests: Oracle TPC-C (07/2013) [2% Flash Capacity]
– 4x Compression
– 2.1x IOPS Throughput
– ½x Response time
at a fraction of the cost of traditional means
35. Turbo Compression for Tiered Flash/Disk Pools
• Easy Tier (no compression)
  – 1 volume, 100 GB
  – 4% Flash (4 GB): 23% of IOPS (assumption: skew = 7)
  – HDD Tier: 77% of IOPS
• Compression (RtC) (assumption: 66% savings)
  – 12% of the compressed data fits in 4 GB
  – 12% of data: 60% of IOPS
  – HDD Tier: 40% of IOPS
• Turbo Compression
  – Pool IOPS capability nearly doubled without adding any Flash
[Chart: Cumulative IOPS vs. Capacity. Axes: IO % against Capacity %. Marked points: 4% of capacity serves 23% of IOPS; 12% of capacity serves 60% of IOPS. Curve labels: RtC, TC.]
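The arithmetic behind this slide can be sketched as follows. The 23% and 60% IOPS shares are read from the slide's skew curve and are not derived here. The "nearly doubled" estimate additionally assumes that the HDD tier is the bottleneck, so the pool's IOPS capability scales inversely with the share of IOPS the HDDs must serve; that assumption, and the function names, are this sketch's own rather than the slide's.

```python
def flash_data_fraction(volume_gb: float, flash_gb: float, compression_savings: float) -> float:
    """Fraction of the volume's data that fits in flash after compression."""
    stored_gb = volume_gb * (1.0 - compression_savings)
    return min(1.0, flash_gb / stored_gb)


def pool_iops_gain(hdd_share_before: float, hdd_share_after: float) -> float:
    """Relative pool IOPS capability if the HDD tier is the limiting factor."""
    return hdd_share_before / hdd_share_after


# Easy Tier only: 4 GB of flash in front of a 100 GB volume.
uncompressed_fraction = flash_data_fraction(100, 4, 0.0)

# With 66% compression savings the volume occupies about 34 GB,
# so the same 4 GB of flash holds roughly 12% of the data.
compressed_fraction = flash_data_fraction(100, 4, 0.66)

# HDD share of IOPS drops from 77% to 40% (figures from the slide).
gain = pool_iops_gain(0.77, 0.40)

print(f"{uncompressed_fraction:.0%} {compressed_fraction:.0%} {gain:.2f}x")
```

Under these assumptions the gain works out to a little over 1.9x, which is consistent with the slide's "nearly doubled" claim.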
39. Summary
• Data Footprint Reduction technologies have been around for many years
• Algorithms are stable, mature, and well-understood by the IT industry
• Data is returned byte-for-byte identical to what was originally stored
• Implementations can vary greatly between vendors and products
• IBM’s implementations tend to have faster performance, offer better scalability, be easier to use, and have a lower TCO
43. About the Speaker
Tony Pearson is a Master Inventor and Senior Managing Consultant for the IBM System Storage™ product line. Tony joined
IBM Corporation in 1986 in Tucson, Arizona, USA, and has been there ever since. In his current role, Tony presents briefings
on storage topics covering the entire System Storage product line, and topics related to Cloud, Analytics and Social media. He
interacts with clients, speaks at conferences and events, and leads client workshops to help clients with strategic planning for
IBM’s integrated set of storage software, hardware and virtualization products.
Tony writes the “Inside System Storage” blog, which is read by hundreds of clients, IBM sales reps and IBM Business Partners
every week. This blog was rated one of the top 10 blogs for the IT storage industry by “Networking World” magazine and #1
most read IBM blog on IBM’s developerWorks. The blog has been published into a series of books, Inside System Storage:
Volumes I through V.
Over the years, Tony has worked in development, marketing and consulting positions for various storage hardware and
software products. Tony has a Bachelor of Science degree in Software Engineering and a Master of Science degree in
Electrical Engineering, both from the University of Arizona. Tony holds 19 IBM patents for inventions on storage hardware and
software products.
9000 S. Rita Road
Bldg 9032 Floor 1
Tucson, AZ 85744
+1 520-799-4309 (Office)
tpearson@us.ibm.com
Tony Pearson
Master Inventor,
Senior IT Specialist
IBM System Storage™