© COPYRIGHT IBM CORPORATION, 2017
© COPYRIGHT IBM CORPORATION, 2016
IBM® Systems
February, 2017
IBM® Spectrum Archive™ Enterprise Edition V1.2.2
Performance White Paper
Takeshi Ishimoto, Spectrum Archive Development & Architect, IBM Tokyo
Carla Corral, Spectrum Archive Performance, IBM Guadalajara
Pedro Ramos, Spectrum Archive Performance, IBM Guadalajara
Khanh V. Ngo, Spectrum Archive Development, IBM Tucson
Osamu Matsumiya, Spectrum Archive Development, IBM Tokyo
Contents
PREFACE
1. IBM SPECTRUM ARCHIVE
1.1. PRODUCT OVERVIEW
1.2. REFERENCE ARCHITECTURE FOR SCALE-OUT
2. TEST METHODOLOGY
2.1. HARDWARE SETUP AND RECOMMENDATIONS
2.1.1. PC SERVER
2.1.2. TAPE HARDWARE
2.1.3. DISK SUBSYSTEM AND IBM SPECTRUM SCALE SETTING
2.1.4. SAN ZONING CONSIDERATIONS
2.1.5. SOFTWARE VERSIONS
2.2. TEST PROCEDURES
3. MIGRATION PERFORMANCE RESULTS
3.1. PERFORMANCE RESULT WITH TS1150 TAPE DRIVE
3.2. PERFORMANCE RESULT WITH LTO 7 TAPE DRIVE
3.3. PERFORMANCE COMPARISON BETWEEN TS1150 AND LTO 7 DRIVES
3.4. PERFORMANCE SCALABILITY BY NUMBER OF TAPE DRIVES
4. CONCLUSIONS
APPENDIX - SERVER AND DISK STORAGE TUNING
ACKNOWLEDGMENTS
REFERENCES
Preface
This white paper describes the I/O performance characteristics of IBM Spectrum Archive™ Enterprise
Edition Version 1.2.2 software (Spectrum Archive EE), based on IBM's in-house testing with IBM
TS1150 tape drives and IBM LTO Ultrium 7 (LTO 7) tape drives.
It summarizes the effective data rates measured under different workload conditions to
characterize the software’s horizontal scalability as servers and tape drives are added.
Specifically, the tests measure the throughput of file migration from a disk-based file
system to tape storage, with different file sizes and several hardware configurations. The intent of
the paper is to provide recommendations that help customers plan for their data rate requirements
when installing new systems or upgrading existing ones.
Chapter 1 gives a high-level overview of Spectrum Archive EE functions and the scale-out
reference architecture. Chapter 2 describes the test environment and test procedures, and Chapter 3
shows the test results. Chapter 4 concludes with a summary of the measurements and best practices.
DISCLAIMER
Performance measurements presented in this document apply only to the hardware configuration
used in the tests. Performance can vary depending on the hardware used (servers, storage system, SAN)
and its configuration.
The following units of measurement are used in this white paper:
Binary Units                     Decimal Units
Metric     Value     Symbol      Metric     Value     Symbol
Kibibyte   1024      KiB         Kilobyte   1000      KB
Mebibyte   1024^2    MiB         Megabyte   1000^2    MB
Gibibyte   1024^3    GiB         Gigabyte   1000^3    GB
Tebibyte   1024^4    TiB         Terabyte   1000^4    TB
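Because file sizes in this paper are given in binary units (MiB, GiB) while data rates are given in decimal units (MB/s, GB/s), it can help to convert between the two explicitly. The following Python sketch illustrates the unit definitions above; the function names are illustrative and are not part of any IBM tool.

```python
# Binary (IEC) and decimal (SI) byte units, as defined in the units table.
KIB, MIB, GIB, TIB = 1024, 1024**2, 1024**3, 1024**4
KB, MB, GB, TB = 1000, 1000**2, 1000**3, 1000**4


def gib_to_gb(gib: float) -> float:
    """Convert a size in GiB (binary) to GB (decimal)."""
    return gib * GIB / GB


def rate_mb_per_s(size_bytes: float, seconds: float) -> float:
    """Data rate in decimal MB/s for size_bytes moved in the given time."""
    return size_bytes / MB / seconds


if __name__ == "__main__":
    # A 10 GiB test file is slightly larger than 10 GB.
    print(f"10 GiB  = {gib_to_gb(10):.2f} GB")
    print(f"100 GiB = {gib_to_gb(100):.2f} GB")
```

A 10 GiB file is about 7% larger than 10 GB, so mixing the two unit systems would skew a data rate calculation by the same amount.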
1. IBM Spectrum Archive
This chapter gives an overview of IBM Spectrum Archive Enterprise Edition and of the architecture
behind the Blueprint (small, medium, and large) configurations used for performance planning.
1.1. Product Overview
Spectrum Archive EE provides seamless integration of a tape storage tier with the highly
available and scalable file system provided by IBM Spectrum Scale™. It performs policy-based
migration of files from disk storage to tape to free up disk space, and it allows the
user to recall data from tape on demand or through explicit prefetching. Because disk and
tape are integrated transparently, the data owner can run any application
designed for disk while keeping cold data on the low-cost tape storage tier.
Spectrum Archive EE runs on one or more Linux servers and makes the cluster of
servers work as a gateway to tape storage. As shown in Figure 1.1, each server is configured with a
small number of dedicated tape drives, and Spectrum Archive EE automatically distributes the
I/O workload across the servers so that the aggregated performance scales out as servers
are added.
Figure 1.1: Spectrum Archive EE System
IBM Spectrum Archive EE provides the following benefits (IBM, 2016):
• A low-cost storage tier in an IBM Spectrum Scale environment.
• An active archive or big data repository for long-term storage of data that requires
file system access to that content.
• File-based storage in the Linear Tape File System™ (LTFS) tape format that is open,
self-describing, portable, and interchangeable across platforms.
• Lowers capital expenditure and operational expenditure costs by using cost-effective
and energy-efficient tape media without dependencies on external server hardware or
software.
• Supports the highly scalable automated TS4500, TS3500, and TS3310 tape libraries.
• Allows the retention of data on tape media for long-term preservation (10+ years).
• Provides the portability of large amounts of data by bulk transfer of tape cartridges
between sites for disaster recovery and the initial synchronization of two Spectrum Scale
sites by using open-format, portable, self-describing tapes.
• Migration of data to newer tape or newer technology that is managed by IBM
Spectrum Scale.
• Provides ease of management for operational and active archive storage.
• Expand archive capacity simply by adding and provisioning media without impacting
the availability of data already in the pool.
With Spectrum Archive EE, you can perform the following management tasks on your
systems (IBM, 2016):
• Create and define tape cartridge pools for file migrations.
• Migrate files in the IBM Spectrum Scale namespace to the IBM Spectrum Archive tape tier.
• Recall files that were migrated to the IBM Spectrum Archive tape tier back into
IBM Spectrum Scale.
• Reconcile file inconsistencies between files in IBM Spectrum Scale and their equivalents
in IBM Spectrum Archive.
• Reclaim tape space that is occupied by non-referenced files and non-referenced
content that is present on the physical tapes.
• Export tape cartridges to remove them from the IBM Spectrum Archive EE system.
• Import tape cartridges to add them to the IBM Spectrum Archive EE system.
• Add tape cartridges to the IBM Spectrum Archive EE system to expand the tape cartridge
pool with no disruption to your system.
• Obtain the inventory, job, and scan status of the IBM Spectrum Archive EE solution.
1.2. Reference Architecture for Scale-out
The reference architecture of Spectrum Archive EE provides a template of server hardware and
software configurations. It is a blueprint that helps the IT architect plan and configure the
servers for use with IBM Spectrum Archive EE, and it also helps in planning future upgrade paths
that add I/O bandwidth.
In this white paper, three different configuration classes are presented, each with
model variations that differ in the number of attached tape drives, as shown in Figure 1.2.
Figure 1.2: Configuration Options of Spectrum Archive EE for Performance Scale-out
• The Small configuration is an entry-level configuration: a single server with two, three, or
four tape drives.
• The Medium configuration is a dual-node configuration with three or four tape drives per node.
• The Large configurations are multi-node configurations (four server nodes) with
four or five tape drives per node.
IMPORTANT: This white paper only includes measurements for the small and medium configurations.
Measurements for the large configurations will be added in the future.
In this white paper, the configuration models are identified by the naming convention “xNyDzT”,
where “x” is the total number of servers, “y” is the number of tape drives attached to each server,
and “z” is the total number of tape drives (z = x * y).
Configuration   Configuration Name   Number of   Number of Drives   Number of Drives
Class           (xNyDzT)             Nodes (x)   per Node (y)       in Total (z)
Small           1N2D2T               1           2                  2
                1N3D3T               1           3                  3
                1N4D4T               1           4                  4
Medium          2N3D6T               2           3                  6
                2N4D8T               2           4                  8
Large           4N4D16T              4           4                  16
                4N5D20T              4           5                  20
Table 1.1: Blueprint Configurations for IBM Spectrum Archive EE
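The “xNyDzT” naming convention can be checked mechanically. The following Python sketch parses a configuration name and verifies that z = x * y; the `Blueprint` type and `parse_config_name` function are hypothetical helpers introduced here for illustration and are not part of Spectrum Archive EE.

```python
import re
from typing import NamedTuple


class Blueprint(NamedTuple):
    nodes: int            # x: total number of servers
    drives_per_node: int  # y: tape drives attached to each server
    total_drives: int     # z: total number of tape drives


_NAME = re.compile(r"(\d+)N(\d+)D(\d+)T")


def parse_config_name(name: str) -> Blueprint:
    """Parse an 'xNyDzT' name and check that z equals x * y."""
    match = _NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not an xNyDzT name: {name!r}")
    x, y, z = (int(group) for group in match.groups())
    if z != x * y:
        raise ValueError(f"{name}: expected {x * y} total drives, got {z}")
    return Blueprint(x, y, z)


if __name__ == "__main__":
    # The seven configuration names listed in Table 1.1.
    for name in ["1N2D2T", "1N3D3T", "1N4D4T", "2N3D6T", "2N4D8T",
                 "4N4D16T", "4N5D20T"]:
        print(name, parse_config_name(name))
```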
2. Test Methodology
This chapter describes the hardware and software specifications and the setup used to obtain the best
performance.
2.1. Hardware Setup and Recommendations
All performance results in this document were obtained using:
- Two single-socket x86-processor servers, running IBM Spectrum Scale and IBM
Spectrum Archive EE
- Eight tape drives in the tape library, and at least the same number of tape cartridges
- Shared SAN disk storage for IBM Spectrum Scale
- Fibre Channel adapters and a SAN switch for the connection to external SAN disk
storage and tape drives
The models and types of selected hardware components are shown in Figure 2.1.
Besides the number of servers and the number of tape drives, several other factors can affect the
final performance: the server performance, the tape drive type, the disk storage hardware and IBM
Spectrum Scale setup, and the interconnect speed. It is beyond the scope of this white paper to
present a complete picture of the relative performance characteristics of all
possible hardware/software configurations. However, the Appendix in this document provides
some tuning tips, based on the hardware characteristics of these test measurements.
Figure 2.1: IBM Spectrum Archive EE hardware components
2.1.1. PC server
A recent PC server with a single CPU socket and three PCIe slots is recommended.
The performance tests in this white paper used IBM System x3850 X5 servers, the fifth
generation of the Enterprise X-Architecture, which enables optimal performance for databases,
enterprise applications, and virtualized environments. To improve the performance of
IBM Spectrum Archive EE under the Non-Uniform Memory Access (NUMA) architecture, the
tuning recommendations described in the Appendix were applied.
2.1.2. Tape Hardware
IBM Spectrum Archive supports the latest tape storage technology for maximum cost efficiency
and performance:
IBM TS1150 Enterprise Tape Drive
- Native data rate performance of up to 360 MB/sec (non-compressible data)
- With JD tape cartridge, it can store 10 TB (non-compressible data) or 30 TB (with 3:1
data compression)
IBM LTO 7 Tape Drive
- Native data rate performance of up to 300 MB/sec (non-compressible data)
- With LTO 7 tape cartridge, it can store 6 TB (non-compressible data) or 15 TB (with
2.5:1 data compression)
The choice between the two tape drive types depends on many factors, such as reliability, cost,
and any requirement to use industry-standard tape media. From the
performance perspective, the IBM TS1150 should provide the better result.
The performance test was conducted by having two logical libraries in an IBM TS4500 tape
library; one for TS1150 tape drives and the other for LTO 7 tape drives, because a single logical
library cannot mix drive types.
In Chapter 3, this white paper provides the test results for both TS1150 and LTO 7 tape drives
using the same test cases, for comparison.
2.1.3. Disk Subsystem and IBM Spectrum Scale setting
General performance tuning tips can be applied for the selection of disk storage and its
configuration. The Appendix describes how IBM Storwize V7000 in the test system was
configured.
The following IBM Spectrum Scale parameters were set with the mmchconfig command for the
performance tests on both single-node and multi-node configurations (entered as a single line):
>mmchconfig nsdBufSpace=50,nsdMaxWorkerThreads=1024,nsdMinWorkerThreads=1024,nsdMultiQueue=64,nsdMultiQueueType=1,nsdSmallThreadRatio=1,nsdThreadsPerQueue=48,numaMemoryInterleave=yes,maxStatCache=0,ignorePrefetchLUNCount=yes,logPingPongSector=no,scatterBufferSize=256k -N all
2.1.4. SAN Zoning Considerations
The SAN carries the data traffic between the servers and the storage devices (tape and disk).
Zoning plays a key role in improving performance by avoiding contention and
congestion.
As shown in Figure A.2 in the Appendix, it is recommended to:
• Isolate the SAN zones for disk and tape
• Dedicate each HBA port to a small number of tape drives to avoid overloading the port
The tests showed different results with HBAs from different manufacturers, and the final test was
conducted using an 8 Gbps FC adapter from QLogic (note that the maximum link speed of a tape
drive is 8 Gbps).
2.1.5. Software Versions
This test was conducted with the following code levels:
Software                               Version
IBM Spectrum Archive EE                1.2.2.0
IBM Spectrum Scale                     4.2.1
IBM Tape Device Driver                 lin_tape-3.0.10

OS                                     Version
Linux Version                          RHEL 7.2
Linux Kernel                           3.10.0-327.el7

Firmware                               Level
IBM TS4500 Library Code                1.3.0.4
IBM TS1150 Drive Code                  D3I4_68E
IBM LTO 7 Full Height Drive Code       LTO7_G9Q0
IBM Storwize V7000 code                7.7.1.2
2.2. Test Procedures
The performance tests in this white paper focus on measuring the data rate (MB/sec) of file
migration from disk to tape for a variety of file sizes, and they evaluate how performance
changes with the number of servers and the number of tape drives.
The performance tests measure the maximum capabilities of IBM Spectrum Archive EE with
the least amount of overhead.
The migration test was conducted by running the following steps:
1. Create files of uniform size on disk.
2. Run the mmapplypolicy command manually to find the files that match the policy criteria and
to pass the list of candidates to the Spectrum Archive command (“ltfsee MIGRATE”).
The mmapplypolicy command invokes multiple instances of the ltfsee MIGRATE command,
depending on the length of the file list and the optional arguments of mmapplypolicy.
Once all migrations complete, mmapplypolicy returns to the command prompt.
3. Measure the elapsed time for step 2.
4. Repeat steps 1 to 3 three times.
5. Divide the amount of data transferred by the best elapsed time to obtain the
aggregated performance.
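The calculation in step 5 can be sketched in Python as follows. This is only an illustration of the arithmetic: the function name is hypothetical and the elapsed times in the example are made-up values, not measurements from this paper.

```python
from typing import Sequence

MB = 1000**2   # decimal megabyte, the unit used for data rates in this paper
GIB = 1024**3  # binary gibibyte, the unit used for file sizes in this paper


def aggregated_rate_mb_s(bytes_migrated: int, elapsed_seconds: Sequence[float]) -> float:
    """Aggregated migration rate in MB/s, based on the best (shortest) run."""
    if not elapsed_seconds:
        raise ValueError("at least one timed run is required")
    best = min(elapsed_seconds)
    return bytes_migrated / best / MB


if __name__ == "__main__":
    # Hypothetical example: 4 drives, 100 GiB prepared per drive, three timed runs.
    data = 4 * 100 * GIB
    runs = [340.0, 325.0, 331.5]  # seconds; illustrative values only
    print(f"{aggregated_rate_mb_s(data, runs):.0f} MB/s")
```

Using the best of the three runs, rather than the mean, matches the stated goal of measuring maximum capability with the least amount of overhead.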
The test uses the following parameters:
• Migration Source
- File size: one of 5 MiB, 10 MiB, 100 MiB, 1 GiB, and 10 GiB; all files in a test run have
the same size.
- File content: non-compressible random data, generated from /dev/random.
- Amount of data prepared on disk: for each test run, step 1 creates 100 GiB of files
per drive. For example, with 10 MiB files, a test with 4 drives creates 40960 files
(= 4 * 100 GiB / 10 MiB = 4 * 10240) at the beginning.
• Migration Target
- Number of file replicas: 1 (one tape pool is specified in the policy)
- The tape is empty at the first run
- Target tapes are already loaded in the tape drives (the tape library robot does not
move during the test)
• Command and Policy Options used
- “mmapplypolicy filesystem -P policy_file -B 10000 -m 2*T”, where T is the total number of
tape drives in the system
  - -B specifies how many files are passed for each invocation of the EXEC script. If the number
  of files exceeds the value that is specified by the -B parameter, mmapplypolicy starts the external
  program multiple times.
  - -m specifies the number of threads that are created and dispatched within
  each mmapplypolicy process during the policy execution phase.
- The policy file contains “SIZE 10485760” after the OPTS statement
  - SIZE limits the total number of bytes, in KB, in all of the files named in each list
  of files passed to the EXEC script. 10485760 is equivalent to 10 GiB.
<< Portion of Policy File >>
RULE EXTERNAL POOL 'ltfs'
EXEC '/opt/ibm/ltfsee/bin/ltfsee'
OPTS '-p perftest@library1'
SIZE 10485760
See the Knowledge Center of IBM Spectrum Archive EE and IBM Spectrum Scale for more
information about mmapplypolicy parameters for performance optimization.
Test Parameters
File Size                    5MiB     10MiB    100MiB   1GiB   10GiB
Number of files per drive    20480    10240    1024     100    10
-B parameter                 10000
-m parameter                 2 * T (where T is the total number of tape drives in the system)
SIZE parameter               10485760
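The file counts and the -m value in the test parameters follow directly from the 100 GiB prepared per drive. The Python sketch below reproduces that arithmetic; the function and constant names are illustrative only.

```python
MIB = 1024**2
GIB = 1024**3

DATA_PER_DRIVE = 100 * GIB  # step 1 prepares 100 GiB of files per tape drive

FILE_SIZES = {
    "5MiB": 5 * MIB,
    "10MiB": 10 * MIB,
    "100MiB": 100 * MIB,
    "1GiB": 1 * GIB,
    "10GiB": 10 * GIB,
}


def files_per_drive(file_size: int) -> int:
    """Number of files of the given size needed to prepare 100 GiB."""
    return DATA_PER_DRIVE // file_size


def total_files(file_size: int, total_drives: int) -> int:
    """Number of files created in step 1 for a test with total_drives drives."""
    return total_drives * files_per_drive(file_size)


def policy_threads(total_drives: int) -> int:
    """Value used for mmapplypolicy -m: twice the total number of tape drives."""
    return 2 * total_drives


if __name__ == "__main__":
    for label, size in FILE_SIZES.items():
        print(f"{label:>7}: {files_per_drive(size)} files per drive")
    print("10MiB files, 4 drives:", total_files(FILE_SIZES["10MiB"], 4))
    print("-m for 8 drives:", policy_threads(8))
```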
3. Migration Performance Results
This chapter presents the IBM Spectrum Archive EE V1.2.2 performance measurements obtained with
TS1150 and LTO 7 tape drives, their scalability, and a comparison between the two drive types.
3.1. Performance result with TS1150 tape drive
Table 3.1 shows the aggregated transfer rate of file migration with IBM TS1150 tape drives and
IBM 3592 JD tape cartridges. As shown in the upper right corner, IBM Spectrum Archive EE
migrates the 10 GiB files at 2.3 GB/s with 8 tape drives. Given that each tape drive can
transfer the non-compressible data used in this test at 360 MB/s, the result is
equivalent to about 80% of the combined capability of the 8 tape drives.
Table 3.1: Aggregated Migration Rate - TS1150 Tape Drive (in MB/s)
The graph in Figure 3.1 plots the test results and presents the projected performance curve for
each hardware configuration. The X axis is the file size in logarithmic scale, and Y axis is the
transfer rate in MB/s.
Figure 3.1: Migration scaling performance for TS1150
3.2. Performance result with LTO 7 tape drive
Table 3.2 shows the aggregated transfer rate of file migration with IBM LTO 7 tape drives and LTO
7 tape cartridges. As shown in the upper right corner, IBM Spectrum Archive EE migrates the
10 GiB files at 1.9 GB/s with 8 tape drives. Given that each LTO 7 tape drive can
transfer the non-compressible data used in this test at 300 MB/s, the result is
equivalent to about 80% of the combined capability of the 8 tape drives.
Table 3.2: Aggregated Migration Rate - LTO 7 Tape Drive (in MB/s)
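The 80% figures quoted in sections 3.1 and 3.2 can be reproduced from the numbers given in the text. The Python sketch below uses the rounded aggregate rates (2.3 GB/s and 1.9 GB/s) and the native drive rates (360 MB/s and 300 MB/s); the function name is illustrative.

```python
def drive_efficiency(aggregate_mb_s: float, drives: int, native_mb_s: float) -> float:
    """Fraction of the combined native drive rate that was actually achieved."""
    return aggregate_mb_s / (drives * native_mb_s)


# Rounded figures quoted in the text for 10 GiB files on 8 drives.
ts1150 = drive_efficiency(2300, 8, 360)  # 2300 MB/s against 8 x 360 MB/s
lto7 = drive_efficiency(1900, 8, 300)    # 1900 MB/s against 8 x 300 MB/s

print(f"TS1150: {ts1150:.1%} of native rate")
print(f"LTO 7:  {lto7:.1%} of native rate")
```

Both ratios come out just under 80%, which is consistent with the approximate value stated for each drive type.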
The graph in Figure 3.2 plots the test results and presents the projected performance curve for
each hardware configuration. The X axis is the file size in logarithmic scale, and Y axis is the
transfer rate in MB/s.
Figure 3.2: Migration scaling performance for LTO 7
3.3. Performance comparison between TS1150 and LTO 7 drives
The graph in Figure 3.3 compares the TS1150 results presented in Figure 3.1 with the
LTO 7 results presented in Figure 3.2. The TS1150 tape drive performs better than the
LTO 7 tape drive across the whole tested range, although the difference is very small for the smaller files.
Figure 3.3: Migration scaling performance TS1150 and LTO 7 (Comparison)
3.4. Performance scalability by number of tape drives
Table 3.3 presents the same test results as Table 3.1, but as the
rate per drive rather than the aggregated rate. The table shows that the
performance per drive decreases slightly as more drives are added to the system.
Table 3.3: Migration performance by TS1150 tape drive (in MB/s)
Figure 3.4 illustrates the performance scalability for each file size; the lines show how the
performance improves as drives are added for a given file size. In this graph, the scaling
factor index is defined as “2” for the result of the 2-drive configuration, and the other results
are expressed relative to it. With perfectly linear scalability, the index of an 8-drive
configuration would be “8”; the actual results for the TS1150 tape drives range from 6.2 to 7.4.
Figure 3.4: Migration scaling performance for TS1150 configuration
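One way to compute the scaling factor index described above is sketched below in Python. The formula (index for n drives = 2 * rate for n drives / rate for 2 drives) is an assumption based on the description in the text, and the rates in the example are hypothetical values, not the measured results from Table 3.1.

```python
from typing import Dict


def scaling_index(rate_by_drives: Dict[int, float], base_drives: int = 2) -> Dict[int, float]:
    """Relative performance index, pinned to base_drives for the base configuration."""
    base_rate = rate_by_drives[base_drives]
    return {n: base_drives * rate / base_rate for n, rate in rate_by_drives.items()}


# Hypothetical aggregated rates in MB/s, for illustration only (not measured values).
example = {2: 600.0, 4: 1150.0, 8: 2100.0}
index = scaling_index(example)

for drives, value in sorted(index.items()):
    # value / drives is the fraction of perfectly linear scaling achieved.
    print(f"{drives} drives: index {value:.2f} ({value / drives:.1%} of linear)")
```

Read this way, the reported 8-drive indexes of 7.4 and 6.2 correspond to 92.5% and 77.5% of perfectly linear scaling, respectively.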
Table 3.4 shows the migration transfer rate per LTO 7 tape drive.
Table 3.4: Performance per drive LTO 7 (in MB/s)
Figure 3.5 is the equivalent version of Figure 3.4 but for LTO 7 tape drives, and it shows a
similar trend.
Figure 3.5: Migration scaling performance LTO 7 configurations
4. Conclusions
IBM Spectrum Archive EE lowers the cost of storage infrastructure by seamlessly integrating a
large-capacity, economical tape tier with IBM Spectrum Scale under a single
namespace. IBM Spectrum Archive EE can provision tape drives and nodes in the
tape tier, which makes it easier to expand storage capacity, increase I/O
bandwidth, and optimize data availability with minimal downtime.
The test results in this white paper demonstrate that adding tape drives, in both single-node
and multi-node configurations, produces a higher sustained data rate, building on the high native
data rate of the drives. IBM Spectrum Archive EE shows its best performance for large files (10
GiB) in all configurations. The measurements also show that increasing the number of
nodes and the number of drives per node improves performance. Performance for small files
also improves as drives and nodes are added; however, the incremental benefit remains
small. It should also be noted that the measured
results depend on the hardware configuration, and they could be improved
by using a faster disk storage solution (SSD or flash), which might be tested for a future
revision of this white paper.
This white paper shows the benefit of IBM Spectrum Archive EE in terms of performance
and of the time required to migrate data from any Spectrum Scale tier to a Spectrum Archive tape
tier.
The results also show that IBM Spectrum Scale can deliver the high throughput
and low-latency access that IBM Spectrum Archive EE requires when it is optimized for that
workload, in which Spectrum Archive EE reads the data of the files being migrated in a streaming
manner and updates the file system metadata for stubbing.
Appendix - Server and Disk Storage Tuning
Server Optimization for NUMA architecture
“This architecture allows control of the memory access time, which varies with the location of
the data to be accessed. If data resides in local memory, access is fast. If data resides in
remote memory, access is slower. The advantage of the NUMA architecture as a
hierarchical shared memory scheme is its potential to improve average case access time
through the introduction of fast, local memory.
In the NUMA shared memory architecture, each processor has its own local memory
module that it can access directly with a distinctive performance advantage. At the same
time, it can also access any memory module belonging to another processor using a shared
bus (or some other type of interconnect) as seen in the diagram below:
Figure A.1: NUMA Architecture
Thread migration from one core to another poses a problem for the NUMA shared
memory architecture because of the way it disassociates a thread from its local memory
allocations. That is, a thread may allocate memory on node 1 at startup as it runs on a
core within the node 1 package. But when the thread is later migrated to a core on node
2, the data stored earlier becomes remote and memory access time significantly increases.”
(Intel, 2011)
The numaMemoryInterleave parameter of Spectrum Scale is used on NUMA-based
systems to improve file system performance. It was enabled for this performance testing
because the servers use a NUMA configuration.
Disk Storage Optimization
When designing a GPFS file system on Storwize V7000 storage for optimum performance,
there are two basic operating modes that match different usage types.
• General I/O workloads: By default, Storwize creates LUNs (vdisks) over multiple arrays to
utilize the available storage.
• Optimal sequential performance: The Storwize V7000 uses a Redundant Array of
Independent Disks (RAID), a method of configuring member drives to create
high-availability and high-performance systems. For sequential I/O with GPFS, the Storwize V7000
uses RAID 5 or RAID 6 arrays. There are two types of RAID configuration: distributed arrays
and non-distributed arrays. (IBM, IBM Storwize V7000 with GPFS, 2015)
RAID Array Type
Performance testing for this paper used distributed RAID 5 arrays because the performance of
the pool is more uniform: all of the available drives are used for every volume
extent. These arrays can tolerate the failure of one member drive. They stripe data
over the member drives with one parity strip on every stripe.
“Distributed RAID arrays can support 4 - 128 drives and they also contain rebuild areas that
are used to maintain redundancy after a drive fails. As a result, the distributed configuration
dramatically reduces rebuild times and decreases the exposure volumes have to the extra
load of recovering redundancy.
Distributed arrays remove the need for separate drives that are idle until a failure
occurs. Instead of allocating one or more drives as spares, the spare capacity is distributed
over specific rebuild areas across all the member drives. After the failed drive is replaced,
data is copied back to the drive from the distributed spare capacity. Unlike "hot spare"
drives, read/write requests are processed on other parts of the drive that are not being used
as rebuild areas. The number of rebuild areas is based on the width of the array. The size of
the rebuild area determines how many times the distributed array can recover failed drives
without risking becoming degraded.” (IBM, Distributed array properties, n.d.)
DRAID: A distributed array (DRAID) is used for the IBM Spectrum Archive EE configuration. It
allows a RAID 5 or RAID 6 array to be distributed over a larger set of drives, so that the
capacity that would otherwise sit on idle spare drives also serves reads and writes for host I/O.
RAID Strip Size and File System Block Size
The drive assignment was tuned using the total number of drives (48 drives)
in a V7000 distributed array (array width). A stripe (redundancy unit) is the smallest amount
of data that can be addressed. It is best to use a GPFS block size that is a multiple of the
V7000 stripe size. The V7000 has two strip size options, 128 KiB and 256 KiB; 256 KiB
was used for this performance testing.
To optimize the Storwize V7000 for sequential I/O with GPFS, a GPFS file system with a 2
MiB block size (2048 KiB) was created. By default, the V7000 uses 256
KiB RAID strips. If you have a large sequential workload, you may want to look at your
host I/O size. For this purpose, it is recommended to create a 10-disk RAID5
array (8+P+Q) with the default strip size of 256 KiB. That gives an 8 * 256 KiB =
2048 KiB stripe size, which matches the file system block size. The strip width for RAID 5 is
10 (number of disks) - 1 = 9.
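The relationship between strip size, stripe size, and file system block size can be checked with simple arithmetic. The Python sketch below assumes 8 data strips per stripe, as in the 8 * 256 KiB calculation above; the function names are illustrative and are not part of any Storwize or GPFS tool.

```python
def full_stripe_kib(data_strips: int, strip_kib: int) -> int:
    """Amount of data in one full stripe, excluding parity strips."""
    return data_strips * strip_kib


def block_matches_stripe(block_kib: int, stripe_kib: int) -> bool:
    """True when the file system block size is a whole multiple of the stripe size."""
    return block_kib % stripe_kib == 0


stripe = full_stripe_kib(data_strips=8, strip_kib=256)  # 8 * 256 KiB
gpfs_block_kib = 2048                                   # 2 MiB GPFS block size

print(f"stripe size: {stripe} KiB")
print(f"block size is a multiple of the stripe: {block_matches_stripe(gpfs_block_kib, stripe)}")
```

With these values, each 2 MiB file system block maps onto exactly one full stripe, so a sequential write of one block does not require a partial-stripe update.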
SAN Connection
The Storwize V7000 nodes must always be connected to SAN switches only. Multiple
connections are permitted from redundant storage systems to improve data bandwidth
performance. Use an additional zone (figure X.X: Zone1) dedicated to the traffic between
the FC ports of all nodes and all Storwize V7000 ports for best performance and
availability.
Figure A.2: IBM Spectrum Archive EE configuration
Acknowledgments
The authors would like to thank Joaquin Quiroz, Vernon Miller, Bruce McNutt, and Larry Coyne for
their support, reviews, comments, and feedback.
References
IBM. (2016, December). IBM Spectrum Archive Enterprise Edition V1.2.2. Retrieved from IBM
Knowledge Center:
http://www.ibm.com/support/knowledgecenter/ST9MBR_1.2.2/ltfs_ee_ichome.html
IBM. (2016, December). IBM Spectrum Archive Enterprise Edition V1.2.2: Installation and
Configuration Guide. Retrieved from Redbooks:
http://www.redbooks.ibm.com/redpieces/abstracts/sg248333.html?Open
IBM. (2016). IBM TS4500 - Supported tape cartridges. Retrieved from IBM Knowledge Center:
http://www.ibm.com/support/knowledgecenter/en/STQRQ9/com.ibm.storage.ts4500.doc/ts4500_ipg_cartridges_supported.html
IBM. (2016, October). IBM Storwize V7000 - Distributed array properties. Retrieved from IBM
Knowledge Center:
http://www.ibm.com/support/knowledgecenter/en/ST3FR7_7.7.1/com.ibm.storwize.v7000.771.doc/svc_distributedRAID.html
IBM. (2015, November 30). IBM Storwise V7000 with GPFS. Retrieved from developerWorks:
https://www.ibm.com/developerworks/community/wikis/home?lang=en#/wiki/General%20Parallel%20File%20System%20(GPFS)/page/IBM%20Storwise%20V7000%20with%20GPFS
IBM. (2013, October). IBM System x3850 X5 and x3950 X5 - Types 7145, 7146, 7143, and 7191.
Retrieved from Installation and User's Guide:
http://publib.boulder.ibm.com/infocenter/systemx/documentation/topic/com.ibm.sysx.7145.doc/7145_iug_pdf.pdf
IBM. (n.d.). System x Documentation - Memory Modules. Retrieved from Info center:
http://publib.boulder.ibm.com/infocenter/systemx/documentation/index.jsp?topic=/com.ibm.sysx.7145.doc/bb1pw_r_memorymodules.html
Intel. (2011, November 2). Optimizing Applications for NUMA. Retrieved from Intel Developer
Zone: https://software.intel.com/en-us/articles/optimizing-applications-for-numa
© International Business Machines Corporation 2017
Printed in the United States of America
February 2017
All Rights Reserved
IBM, the IBM logo, Linear Tape File System, Spectrum Archive, Spectrum Scale, and System Storage are trademarks or
registered trademarks of International Business Machines Corporation in the United States, other countries, or both.
Linear Tape-Open, LTO, the LTO logo, Ultrium and the Ultrium logo are registered trademarks of Hewlett Packard
Enterprise, IBM and Quantum in the US and other countries.
Other company, product and service names may be trademarks or service marks of others.
Product data has been reviewed for accuracy as of the date of initial publication. Product data is subject to change
without notice. This information could include technical inaccuracies and/or typographical errors. IBM may make
improvements and/or changes in the product(s) and/or program(s) at any time without notice.
References in this document to IBM products, programs, or services do not imply that IBM intends to make such
products, programs or services available in all countries in which IBM operates or does business. Any reference to an IBM
Program Product in this document is not intended to state or imply that only that program product may be used. Any
functionally equivalent program that does not infringe IBM’s intellectual property rights may be used instead. It is the
user’s responsibility to evaluate and verify the operation of any non-IBM product, program or service.
The performance data contained herein was obtained in a controlled, isolated environment. Actual results that may be
obtained in other operating environments may vary significantly. While IBM has reviewed each item for accuracy in a
specific situation, there is no guarantee that the same or similar results will be obtained elsewhere.
THE INFORMATION PROVIDED IN THIS DOCUMENT IS DISTRIBUTED AS IS WITHOUT ANY WARRANTY, EITHER
EXPRESS OR IMPLIED. IBM EXPRESSLY DISCLAIMS ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR
A PARTICULAR PURPOSE OR NON-INFRINGEMENT. IBM shall have no responsibility to update this information. IBM
products are warranted according to the terms and conditions of the agreements (e.g., IBM Customer Agreement,
Statement of Limited Warranty, International Program License Agreement, etc.) under which they are provided. IBM is
not responsible for the performance or interoperability of any non-IBM products discussed herein.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any
IBM patents or copyrights. Inquiries regarding patent or copyright licenses should be made, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785 U.S.A.
Ibm spectrum archive ee v1.2.2 performance white_paper v1.1

  • 1. © COPYRIGHT IBM CORPORATION, 2017 © COPYRIGHT IBM CORPORATION, 2016 IBM® Systems February, 2017 IBM® Spectrum Archive™ Enterprise Edition V1.2.2 Performance White Paper Takeshi Ishimoto, Spectrum Archive Development & Architect, IBM Tokyo Carla Corral, Spectrum Archive Performance, IBM Guadalajara Pedro Ramos, Spectrum Archive Performance, IBM Guadalajara Khanh V. Ngo, Spectrum Archive Development, IBM Tucson Osamu Matsumiya, Spectrum Archive Development, IBM Tokyo
  • 2. Contents
Preface
1. IBM Spectrum Archive
1.1. Product Overview
1.2. Reference Architecture for Scale-out
2. Test Methodology
2.1. Hardware Setup and Recommendations
2.1.1. PC Server
2.1.2. Tape Hardware
2.1.3. Disk Subsystem and IBM Spectrum Scale Setting
2.1.4. SAN Zoning Considerations
2.1.5. Software Versions
2.2. Test Procedures
3. Migration Performance Results
3.1. Performance Result with TS1150 Tape Drive
3.2. Performance Result with LTO 7 Tape Drive
3.3. Performance Comparison between TS1150 and LTO 7 Drives
3.4. Performance Scalability by Number of Tape Drives
4. Conclusions
Appendix - Server and Disk Storage Tuning
Acknowledgments
References
  • 3. Preface
This white paper describes the I/O performance characteristics of IBM Spectrum Archive™ Enterprise Edition Version 1.2.2 software (Spectrum Archive EE), based on IBM's in-house testing with IBM TS1150 tape drives and IBM LTO Ultrium 7 (LTO 7) tape drives. It summarizes the effective data rates measured under different workload conditions, in order to characterize the software's horizontal scalability as servers and tape drives are added. Specifically, the tests measure the throughput of file migration from a disk-based file system to tape storage, with different file sizes and several hardware configurations. The intent of the paper is to provide recommendations that help customers plan to meet their data rate requirements for new installations or for upgrades of existing systems.
Chapter 1 gives a high-level overview of Spectrum Archive EE functions and the scale-out reference architecture. Chapter 2 describes the test environment and test procedures, and Chapter 3 shows the test results. Chapter 4 concludes with a summary of the measurements and best practices.
DISCLAIMER
The performance measurements presented in this document apply only to the hardware configuration described here. Performance can vary depending on the hardware used (servers, storage system, SAN) and its configuration.
The following units of measurement are used in this white paper:
Binary unit | Value | Symbol | Decimal unit | Value | Symbol
Kibibyte | 1024 | KiB | Kilobyte | 1000 | KB
Mebibyte | 1024^2 | MiB | Megabyte | 1000^2 | MB
Gibibyte | 1024^3 | GiB | Gigabyte | 1000^3 | GB
Tebibyte | 1024^4 | TiB | Terabyte | 1000^4 | TB
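Because the paper states file sizes in binary units (MiB, GiB) and data rates in decimal units (MB/s, GB/s), it can help to convert between the two. The following is a minimal Python sketch of the unit table above; the names `UNITS` and `convert` are illustrative assumptions and are not part of any IBM tool.

```python
# Byte multipliers for the binary and decimal units listed in the unit table.
BINARY = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}
DECIMAL = {"KB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4}
UNITS = {**BINARY, **DECIMAL}


def convert(value, from_unit, to_unit):
    """Convert a quantity between any two units in the table (hypothetical helper)."""
    return value * UNITS[from_unit] / UNITS[to_unit]


# A 10 GiB test file expressed in decimal gigabytes.
ten_gib_in_gb = convert(10, "GiB", "GB")
# The 100 GiB prepared per drive expressed in decimal gigabytes.
hundred_gib_in_gb = convert(100, "GiB", "GB")
```

A 10 GiB file is therefore slightly larger than 10 GB, which matters when a transfer rate in MB/s is derived from a data volume given in GiB.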
  • 4. 1. IBM Spectrum Archive
This chapter gives an overview of IBM Spectrum Archive Enterprise Edition and of the architecture behind the Blueprint configurations (small, medium, and large) used for performance planning.
1.1. Product Overview
Spectrum Archive EE provides seamless integration of a tape storage tier with the highly available and scalable file system provided by IBM Spectrum Scale™. It performs policy-based migration of files from disk storage to tape to free up disk space, and it allows the user to recall data from tape either on demand or by explicit prefetching. Because disk and tape are integrated transparently, the data owner can run any application designed for disk while keeping cold data on the low-cost tape storage tier.
Spectrum Archive EE runs on one or more Linux servers and makes the cluster of servers work as the gateway to tape storage. As shown in Figure 1.1, each server is configured with a few dedicated tape drives, and Spectrum Archive EE automatically distributes the I/O workload across the servers so that the aggregate performance scales out as servers are added.
Figure 1.1: Spectrum Archive EE System
IBM Spectrum Archive EE provides the following benefits (IBM, 2016):
- A low-cost storage tier in an IBM Spectrum Scale environment.
- An active archive or big data repository for long-term storage of data that requires file system access to that content.
- File-based storage in the Linear Tape File System™ (LTFS) tape format that is open, self-describing, portable, and interchangeable across platforms.
- Lowers capital expenditure and operational expenditure costs by using cost-effective and energy-efficient tape media without dependencies on external server hardware or software.
  • 5. Benefits of IBM Spectrum Archive EE (continued):
- Supports the highly scalable automated TS4500, TS3500, and TS3310 tape libraries.
- Allows the retention of data on tape media for long-term preservation (10+ years).
- Provides the portability of large amounts of data by bulk transfer of tape cartridges between sites for disaster recovery and the initial synchronization of two Spectrum Scale sites by using open-format, portable, self-describing tapes.
- Migration of data to newer tape or newer technology that is managed by IBM Spectrum Scale.
- Provides ease of management for operational and active archive storage.
- Expands archive capacity simply by adding and provisioning media, without impacting the availability of data already in the pool.
With Spectrum Archive EE, you can perform the following management tasks on your systems (IBM, 2016):
- Create and define tape cartridge pools for file migrations.
- Migrate files in the IBM Spectrum Scale namespace to the IBM Spectrum Archive tape tier.
- Recall files that were migrated to the IBM Spectrum Archive tape tier back into IBM Spectrum Scale.
- Reconcile file inconsistencies between files in IBM Spectrum Scale and their equivalents in IBM Spectrum Archive.
- Reclaim tape space that is occupied by non-referenced files and non-referenced content that is present on the physical tapes.
- Export tape cartridges to remove them from the IBM Spectrum Archive EE system.
- Import tape cartridges to add them to the IBM Spectrum Archive EE system.
- Add tape cartridges to the IBM Spectrum Archive EE system to expand the tape cartridge pool with no disruption to your system.
- Obtain inventory, job, and scan status of the IBM Spectrum Archive EE solution.
1.2. Reference Architecture for Scale-out
The reference architecture of Spectrum Archive EE provides a template of server hardware and software configurations. It is a blueprint that helps the IT architect plan and configure the servers for use with IBM Spectrum Archive EE, and it also helps in planning future upgrade paths that add I/O bandwidth. In this white paper, three configuration classes are presented, each with model variations that differ in the number of attached tape drives, as shown in Figure 1.2.
  • 6. Figure 1.2: Configuration Options of Spectrum Archive EE for Performance Scale-out
- Small Configuration is an entry-level configuration with a single server and two, three, or four tape drives.
- Medium Configuration is a dual-node configuration with three or four tape drives per node.
- Large Configurations are multi-node configurations (four server nodes) with four or five tape drives per node.
IMPORTANT: This white paper only includes measurements for small and medium configurations. The large configurations will be integrated in the future.
In this white paper, the configuration models are identified by the naming convention "xNyDzT", where "x" is the total number of servers, "y" is the number of tape drives attached to each server, and "z" is the total number of tape drives (z = x * y).
Configuration Class | Configuration Name (xNyDzT) | Number of Nodes (x) | Number of Drives per Node (y) | Number of Drives in Total (z)
Small | 1N2D2T | 1 | 2 | 2
Small | 1N3D3T | 1 | 3 | 3
Small | 1N4D4T | 1 | 4 | 4
Medium | 2N3D6T | 2 | 3 | 6
Medium | 2N4D8T | 2 | 4 | 8
Large | 4N4D16T | 4 | 4 | 16
Large | 4N5D20T | 4 | 5 | 20
Table 1.1: Blueprint Configurations for IBM Spectrum Archive EE
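The "xNyDzT" naming convention can be checked mechanically. The sketch below is only an illustration in Python: the function name `parse_config_name` is an assumption made for this example, and the only rule it enforces is the relation z = x * y stated above.

```python
import re


def parse_config_name(name):
    """Split an 'xNyDzT' name into (nodes, drives_per_node, total_drives).

    Hypothetical helper: raises ValueError if the name is malformed or if
    the total does not satisfy z = x * y.
    """
    match = re.fullmatch(r"(\d+)N(\d+)D(\d+)T", name)
    if match is None:
        raise ValueError(f"not an xNyDzT name: {name!r}")
    nodes, per_node, total = (int(group) for group in match.groups())
    if total != nodes * per_node:
        raise ValueError(
            f"{name}: expected z = x * y = {nodes * per_node}, got {total}"
        )
    return nodes, per_node, total


# The seven blueprint names from Table 1.1.
BLUEPRINTS = ["1N2D2T", "1N3D3T", "1N4D4T", "2N3D6T", "2N4D8T", "4N4D16T", "4N5D20T"]
parsed = {name: parse_config_name(name) for name in BLUEPRINTS}
```

For example, `parse_config_name("2N4D8T")` yields two nodes, four drives per node, and eight drives in total.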
  • 7. 2. Test Methodology
This chapter describes the hardware and software specifications and the setup details used to obtain the best performance.
2.1. Hardware Setup and Recommendations
All performance results in this document were obtained using:
- Two single-socket x86-processor servers, running IBM Spectrum Scale and IBM Spectrum Archive EE
- Eight tape drives in the tape library, and at least the same number of tape cartridges
- Shared SAN disk storage for IBM Spectrum Scale
- Fibre Channel adapters and a SAN switch for the connection to external SAN disk storage and tape drives
The models and types of the selected hardware components are shown in Figure 2.1.
Besides the number of servers and the number of tape drives, several other factors can affect the final performance: the server performance, the tape drive type, the disk storage hardware and IBM Spectrum Scale setup, and the interconnect speed. It is beyond the scope of this white paper to present a complete picture of the relative performance characteristics of all possible hardware and software configurations. However, the Appendix of this document provides some tuning tips based on the hardware characteristics of these test measurements.
Figure 2.1: IBM Spectrum Archive EE hardware components
  • 8. 2.1.1. PC Server
It is recommended to use a recent PC server with a single CPU socket and three PCIe slots. The performance tests in this white paper use IBM System x3850 X5 servers. This is the fifth generation of the Enterprise X-Architecture, which enables optimal performance for databases, enterprise applications, and virtualized environments. To improve the performance of IBM Spectrum Archive EE under the Non-Uniform Memory Access (NUMA) architecture, the tuning recommendations described in the Appendix were applied.
2.1.2. Tape Hardware
IBM Spectrum Archive supports the latest tape storage technology for maximum cost efficiency and performance:
IBM TS1150 Enterprise Tape Drive
- Native data rate of up to 360 MB/sec (non-compressible data)
- With a JD tape cartridge, it can store 10 TB (non-compressible data) or 30 TB (with 3:1 data compression)
IBM LTO 7 Tape Drive
- Native data rate of up to 300 MB/sec (non-compressible data)
- With an LTO 7 tape cartridge, it can store 6 TB (non-compressible data) or 15 TB (with 2.5:1 data compression)
The choice between the two tape drive types depends on many factors, such as reliability, cost, and the requirement to use industry-standard tape media. From the performance perspective, however, IBM TS1150 should provide better results.
The performance test was conducted with two logical libraries in an IBM TS4500 tape library, one for TS1150 tape drives and the other for LTO 7 tape drives, because a single logical library cannot mix drive types. In Chapter 3, this white paper provides the test results for both TS1150 and LTO 7 tape drives using the same test cases, for comparison.
2.1.3. Disk Subsystem and IBM Spectrum Scale Setting
General performance tuning tips apply to the selection of disk storage and its configuration. The Appendix describes how the IBM Storwize V7000 in the test system was configured.
The following IBM Spectrum Scale mmchconfig parameters were used for the performance tests, on both single-node and multi-node configurations:
>mmchconfig nsdBufSpace=50,nsdMaxWorkerThreads=1024,nsdMinWorkerThreads=1024,nsdMultiQueue=64,nsdMultiQueueType=1,nsdSmallThreadRatio=1,nsdThreadsPerQueue=48,numaMemoryInterleave=yes,maxStatCache=0,ignorePrefetchLUNCount=yes,logPingPongSector=no,scatterBufferSize=256k -N all
  • 9. 2.1.4. SAN Zoning Considerations
The SAN is primarily responsible for managing data traffic between the servers and the storage devices (tape and disk). Zoning plays a key role in improving performance by avoiding contention and congestion. As shown in Figure A.2 in the Appendix, it is recommended to:
- Isolate the SAN zones for disk and tape
- Dedicate each HBA port to a small number of tape drives to avoid overloading the ports
The tests showed different results with HBAs from different manufacturers, and the final test was conducted using an 8 Gbps FC adapter from QLogic (note that the maximum link speed of a tape drive is 8 Gbps).
2.1.5. Software Versions
This test was conducted with the following code levels:
Software | Version
IBM Spectrum Archive EE | 1.2.2.0
IBM Spectrum Scale | 4.2.1
IBM Tape Device Driver | lin_tape-3.0.10
OS | Version
Linux Version | RHEL 7.2
Linux Kernel | 3.10.0-327.el7
Firmware | Level
IBM TS4500 Library Code | 1.3.0.4
IBM TS1150 Drive Code | D3I4_68E
IBM LTO 7 Full Height Drive Code | LTO7_G9Q0
IBM Storwize V7000 code | 7.7.1.2
2.2. Test Procedures
The performance tests in this white paper focus on measuring the data rate (MB/sec) of file migration from disk to tape for a variety of file sizes, and they evaluate how the performance changes with the number of servers and the number of tape drives. The tests measure the maximum capabilities of IBM Spectrum Archive EE with the least amount of overhead.
The migration test was conducted by running the following steps:
1. Create files of uniform size on disk.
2. Run the mmapplypolicy command manually to find the files that match the policy criteria and to pass the list of candidates to the Spectrum Archive command ("ltfsee MIGRATE"). The mmapplypolicy command invokes multiple instances of the ltfsee MIGRATE command, depending on the length of the file list and on the optional arguments of mmapplypolicy. Once all migrations complete, mmapplypolicy returns to the command prompt.
3. Measure the elapsed time for step 2.
  • 10. 4. Repeat steps 1 to 3 three times.
5. Divide the amount of data transferred by the best elapsed time to obtain the aggregated performance.
The test uses the following parameters:
- Migration Source
  - File size: select one file size from 5 MiB, 10 MiB, 100 MiB, 1 GiB, and 10 GiB, and create files of that size.
  - Each file contains non-compressible random data, generated from /dev/random.
  - Amount of data prepared on disk: for each test run, step 1 creates files totaling 100 GiB per drive. For example, with 10 MiB files, a test with 4 drives creates 40960 files (= 4 * 100 GiB / 10 MiB = 4 * 10240) at the beginning.
- Migration Target
  - Number of file replicas: 1 (one tape pool is specified in the policy)
  - The tape is empty at the first run.
  - Target tapes are already loaded in the tape drives (the tape library robot does not move during the test).
- Command and Policy Options used
  - "mmapplypolicy filesystem -P policy_file -B 10000 -m 2*T", where T is the total number of tape drives in the system
    - -B specifies how many files are passed for each invocation of the EXEC script. If the number of files exceeds the value specified by the -B parameter, mmapplypolicy starts the external program multiple times.
    - -m specifies the number of threads that are created and dispatched within each mmapplypolicy process during the policy execution phase.
  - The policy file contains "SIZE 10485760" after the OPTS statement.
    - The SIZE parameter limits the total size, expressed in KiB, of all the files named in each list of files passed to the EXEC script. 10485760 KiB is equivalent to 10 GiB.
<< Portion of Policy File >>
RULE EXTERNAL POOL 'ltfs'
EXEC '/opt/ibm/ltfsee/bin/ltfsee'
OPTS '-p perftest@library1'
SIZE 10485760
See the Knowledge Center of IBM Spectrum Archive EE and IBM Spectrum Scale for more information about mmapplypolicy parameters for performance optimization.
Test Parameters
  • 11. Test parameters by file size:
File Size | 5 MiB | 10 MiB | 100 MiB | 1 GiB | 10 GiB
Number of files per drive | 20480 | 10240 | 1024 | 100 | 10
-B parameter | 10000 (all file sizes)
-m parameter | 2 * T, where T is the total number of tape drives in the system (all file sizes)
SIZE parameter | 10485760 (all file sizes)
  • 12. 3. Migration Performance Results
This chapter presents the IBM Spectrum Archive EE V1.2.2 performance measurements obtained with TS1150 and LTO 7 tape drives, their scalability, and a comparison between the two drive types.
3.1. Performance result with TS1150 tape drive
Table 3.1 shows the aggregated transfer rate of file migration with IBM TS1150 tape drives and IBM 3592 JD tape cartridges. As shown in the upper right corner, IBM Spectrum Archive EE migrates the 10 GiB files at 2.3 GB/s with 8 tape drives. Each tape drive can transfer the non-compressible data used in this test at 360 MB/s, so eight drives could reach at most 2.88 GB/s; the result is therefore equivalent to about 80% of the tape drives' capability.
Table 3.1: Aggregated Migration Rate - TS1150 Tape Drive (in MB/s)
The graph in Figure 3.1 plots the test results and presents the projected performance curve for each hardware configuration. The X axis is the file size on a logarithmic scale, and the Y axis is the transfer rate in MB/s.
Figure 3.1: Migration scaling performance for TS1150
  • 13. 3.2. Performance result with LTO 7 tape drive
Table 3.2 shows the aggregated transfer rate of file migration with IBM LTO 7 tape drives and LTO 7 tape cartridges. As shown in the upper right corner, IBM Spectrum Archive EE migrates the 10 GiB files at 1.9 GB/s with 8 tape drives. Each LTO 7 tape drive can transfer the non-compressible data used in this test at 300 MB/s, so eight drives could reach at most 2.4 GB/s; the result is therefore equivalent to roughly 80% of the tape drives' capability.
Table 3.2: Aggregated Migration Rate - LTO 7 tape drive (in MB/s)
The graph in Figure 3.2 plots the test results and presents the projected performance curve for each hardware configuration. The X axis is the file size on a logarithmic scale, and the Y axis is the transfer rate in MB/s.
Figure 3.2: Migration scaling performance for LTO 7
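The drive-utilization figures quoted in sections 3.1 and 3.2 can be reproduced from the numbers given in the text (aggregate rate, drive count, native drive rate). The Python sketch below is only a check of that arithmetic; `drive_utilization` is a hypothetical helper name.

```python
def drive_utilization(aggregate_mb_s, drives, native_mb_s):
    """Fraction of the combined native drive rate reached by the measured aggregate rate."""
    return aggregate_mb_s / (drives * native_mb_s)


# 2.3 GB/s with eight TS1150 drives rated at 360 MB/s each.
ts1150 = drive_utilization(2300, 8, 360)
# 1.9 GB/s with eight LTO 7 drives rated at 300 MB/s each.
lto7 = drive_utilization(1900, 8, 300)

print(f"TS1150: {ts1150:.0%}, LTO 7: {lto7:.0%}")
```

Both values come out close to 0.8, in line with the roughly 80% stated for each drive type.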
  • 14. 3.3. Performance comparison between TS1150 and LTO 7 drives
The graph in Figure 3.3 compares the TS1150 results presented in Figure 3.1 with the LTO 7 results presented in Figure 3.2. The TS1150 tape drive performs better than the LTO 7 tape drive across the whole tested range, although the difference is very small for the smaller files.
Figure 3.3: Migration scaling performance TS1150 and LTO 7 (Comparison)
  • 15. 3.4. Performance scalability by number of tape drives
Table 3.3 contains the same test results as Table 3.1, but presented as performance per drive rather than as aggregated performance. This table shows that the expected performance per drive becomes slightly lower as more drives are added to the system.
Table 3.3: Migration performance by TS1150 tape drive (in MB/s)
Figure 3.4 illustrates the performance scalability for each file size; the lines show how the performance improves as drives are added for a given file size. In this graph, the scaling factor index is defined as "2" for the result of a 2-drive configuration, and the other results are calculated relative to that baseline. With perfectly linear scalability, the index of an 8-drive configuration would be "8"; the actual results range from 6.2 to 7.4 for the TS1150 tape drives.
Figure 3.4: Migration scaling performance for TS1150 configuration
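The scaling factor index described above can be written as the baseline drive count multiplied by the ratio of a configuration's aggregate rate to the 2-drive aggregate rate. The Python sketch below is one reading of that definition; the function names are assumptions, and the rates in the example are made-up placeholders rather than measurements from this paper.

```python
def per_drive_rate(aggregate_mb_s, drives):
    """Average rate per drive, as presented in the per-drive tables."""
    return aggregate_mb_s / drives


def scaling_index(aggregate_mb_s, baseline_aggregate_mb_s, baseline_drives=2):
    """Relative performance index, normalized so the 2-drive result equals 2."""
    return baseline_drives * aggregate_mb_s / baseline_aggregate_mb_s


# Placeholder rates (NOT measured values), chosen only to show the calculation.
example_two_drive_rate = 500.0
example_eight_drive_rate = 1750.0

index = scaling_index(example_eight_drive_rate, example_two_drive_rate)
```

With these placeholder numbers the index is 7.0, compared with 8 under perfectly linear scaling.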
  • 16. Table 3.4 shows the migration transfer rate per LTO 7 tape drive.
Table 3.4: Performance per drive LTO 7 (in MB/s)
Figure 3.5 is the equivalent of Figure 3.4 for LTO 7 tape drives, and it shows a similar trend.
Figure 3.5: Migration scaling performance LTO 7 configurations
  • 17. 4. Conclusions
IBM Spectrum Archive EE lowers the cost of storage infrastructure by integrating a large-capacity, economical tape tier seamlessly with IBM Spectrum Scale under a single namespace.
IBM Spectrum Archive EE can provision tape drives and nodes in the tape tier. This makes it easier to expand storage capacity, increase I/O bandwidth, and optimize data availability with minimal downtime.
The test results in this white paper demonstrate that adding tape drives, on both single-node and multi-node configurations, produces a higher sustained data rate, building on the high native data rate of the drives.
IBM Spectrum Archive EE shows its best performance for large files (10 GiB) in all configurations. The measurements also show that increasing the number of nodes and the number of drives per node improves performance. The performance for small files also improves as drives and nodes are added; however, this incremental benefit remains small even with the addition of drives.
It should also be noted that the measurement results depend on the hardware configuration, and they could be improved by using a faster disk storage solution (SSD or Flash), which might be tested for a future revision of this white paper.
This white paper shows the benefit of IBM Spectrum Archive EE in terms of performance and of the time required to migrate data from any Spectrum Scale tier to a Spectrum Archive tape tier. The results also show that IBM Spectrum Scale can serve high-throughput requirements and provide low-latency access and cost benefits when it is optimized for IBM Spectrum Archive EE, which reads the data of the files being migrated in a streaming manner and updates the file system metadata for stubbing.
  • 18. Appendix - Server and Disk Storage Tuning
Server Optimization for NUMA architecture
"With this architecture, the memory access time varies with the location of the data to be accessed. If data resides in local memory, access is fast. If data resides in remote memory, access is slower. The advantage of the NUMA architecture as a hierarchical shared memory scheme is its potential to improve average case access time through the introduction of fast, local memory. In the NUMA shared memory architecture, each processor has its own local memory module that it can access directly with a distinctive performance advantage. At the same time, it can also access any memory module belonging to another processor using a shared bus (or some other type of interconnect) as seen in the diagram below:
Figure A.1: NUMA Architecture
Thread migration from one core to another poses a problem for the NUMA shared memory architecture because of the way it disassociates a thread from its local memory allocations. That is, a thread may allocate memory on node 1 at startup as it runs on a core within the node 1 package. But when the thread is later migrated to a core on node 2, the data stored earlier becomes remote and memory access time significantly increases." (Intel, 2011)
The numaMemoryInterleave parameter of Spectrum Scale is used on NUMA-based systems to improve file system performance. It was enabled for this performance testing because the servers use a NUMA configuration.
Disk Storage Optimization
When designing a GPFS file system on Storwize V7000 storage for optimum performance, there are two basic operating modes that match different usage types.
- General I/O workloads: By default, Storwize creates LUNs (vdisks) over multiple arrays to utilize the available storage.
- Optimal sequential performance: The Storwize V7000 uses a Redundant Array of Independent Disks (RAID), a method of configuring member drives to create high-availability, high-performance systems. For sequential I/O with GPFS, the Storwize V7000 uses RAID 5 or RAID 6 arrays.
  • 19. There are two types of RAID configuration: a distributed array and a non-distributed array configuration. (IBM, IBM Storwize V7000 with GPFS, 2015)
RAID Array Type
The performance testing for this paper uses distributed RAID 5 arrays. With these arrays the performance of the pool is more uniform, because all of the available drives are used for every volume extent, and the arrays can tolerate the failure of one member drive. These arrays stripe data over the member drives with one parity strip on every stripe.
"Distributed RAID arrays can support 4 - 128 drives and they also contain rebuild areas that are used to maintain redundancy after a drive fails. As a result, the distributed configuration dramatically reduces rebuild times and decreases the exposure volumes have to the extra load of recovering redundancy. Distributed arrays remove the need for separate drives that are idle until a failure occurs. Instead of allocating one or more drives as spares, the spare capacity is distributed over specific rebuild areas across all the member drives. After the failed drive is replaced, data is copied back to the drive from the distributed spare capacity. Unlike "hot spare" drives, read/write requests are processed on other parts of the drive that are not being used as rebuild areas. The number of rebuild areas is based on the width of the array. The size of the rebuild area determines how many times the distributed array can recover failed drives without risking becoming degraded." (IBM, Distributed array properties, 2016)
DRAID: A distributed array (DRAID) is used for the IBM Spectrum Archive EE configuration. It allows a RAID 5 or RAID 6 array to be distributed over a larger set of drives, so that the drive that would otherwise sit idle as a spare also performs reads and writes for host I/O.
RAID Strip Size and File System Block Size
The drive assignment was tuned to use the total number of drives (48 drives) in a V7000 distributed array (the array width).
A stripe (redundancy unit) is the smallest amount of data that can be addressed. It is best to use a GPFS block size that is a multiple of the V7000 stripe size. The V7000 has two strip size options, 128 KiB and 256 KiB; 256 KiB was used for these performance tests.
To optimize the Storwize V7000 for sequential I/O with GPFS, a GPFS file system with a 2 MiB block size (2048 KiB) was created.
By default, the V7000 uses 256 KiB RAID strips. If you have a large sequential workload, you may want to look at your host I/O size. For these performance tests, it is recommended to create a 10-disk RAID5 array (8+P+Q) with the default strip size of 256 KiB. That gives an 8 * 256 KiB = 2048 KiB stripe size, which matches the file system block size. The strip width for RAID 5 is 10 (number of disks) - 1 = 9.
SAN Connection
The Storwize V7000 nodes must always be connected to SAN switches only. Multiple connections are permitted from redundant storage systems to improve data bandwidth performance. Use an additional zone (figure X.X: Zone1) dedicated to the traffic between the FC ports of all nodes and all Storwize V7000 ports, for best performance and availability.
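The stripe arithmetic in the RAID Strip Size and File System Block Size section (eight data strips of 256 KiB giving a 2048 KiB stripe that matches the 2 MiB GPFS block size) can be checked with a few lines of Python. This is a minimal sketch; the helper names are assumptions and do not correspond to any Storwize or Spectrum Scale command.

```python
def full_stripe_kib(data_strips, strip_kib):
    """Size of one full stripe, counting data strips only (parity excluded)."""
    return data_strips * strip_kib


def block_size_is_aligned(block_kib, stripe_kib):
    """True when the file system block size is a whole multiple of the stripe size."""
    return block_kib % stripe_kib == 0


# Values taken from the text: 8 data strips, 256 KiB strips, 2048 KiB GPFS blocks.
stripe_kib = full_stripe_kib(data_strips=8, strip_kib=256)
aligned = block_size_is_aligned(block_kib=2048, stripe_kib=stripe_kib)

# The other strip size option (128 KiB) would halve the stripe size.
small_stripe_kib = full_stripe_kib(data_strips=8, strip_kib=128)
```

With a 128 KiB strip the stripe would be 1024 KiB, and a 2048 KiB block would still be a whole multiple of it, spanning two stripes instead of one.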
  • 20. Figure A.2: IBM Spectrum Archive EE configuration
Acknowledgments
The authors would like to thank Joaquin Quiroz, Vernon Miller, Bruce McNutt, and Larry Coyne for their support, reviews, comments, and feedback.
  • 21. References
IBM. (2016, December). IBM Spectrum Archive Enterprise Edition V1.2.2. Retrieved from IBM Knowledge Center: http://www.ibm.com/support/knowledgecenter/ST9MBR_1.2.2/ltfs_ee_ichome.html
IBM. (2016, December). IBM Spectrum Archive Enterprise Edition V1.2.2: Installation and Configuration Guide. Retrieved from Redbooks: http://www.redbooks.ibm.com/redpieces/abstracts/sg248333.html?Open
IBM. (2016). IBM TS4500 - Supported tape cartridges. Retrieved from IBM Knowledge Center: http://www.ibm.com/support/knowledgecenter/en/STQRQ9/com.ibm.storage.ts4500.doc/ts4500_ipg_cartridges_supported.html
IBM. (2016, October). IBM Storwize V7000 - Distributed array properties. Retrieved from IBM Knowledge Center: http://www.ibm.com/support/knowledgecenter/en/ST3FR7_7.7.1/com.ibm.storwize.v7000.771.doc/svc_distributedRAID.html
IBM. (2015, November 30). IBM Storwise V7000 with GPFS. Retrieved from developerWorks: https://www.ibm.com/developerworks/community/wikis/home?lang=en#/wiki/General%20Parallel%20File%20System%20(GPFS)/page/IBM%20Storwise%20V7000%20with%20GPFS
IBM. (2013, October). IBM System x3850 X5 and x3950 X5 - Types 7145, 7146, 7143, and 7191. Retrieved from Installation and User's Guide: http://publib.boulder.ibm.com/infocenter/systemx/documentation/topic/com.ibm.sysx.7145.doc/7145_iug_pdf.pdf
IBM. (n.d.). System x Documentation - Memory Modules. Retrieved from Info center: http://publib.boulder.ibm.com/infocenter/systemx/documentation/index.jsp?topic=/com.ibm.sysx.7145.doc/bb1pw_r_memorymodules.html
Intel. (2011, November 2). Optimizing Applications for NUMA. Retrieved from Intel Developer Zone: https://software.intel.com/en-us/articles/optimizing-applications-for-numa
  • 22. © International Business Machines Corporation 2017
Printed in the United States of America, February 2017
All Rights Reserved
IBM, the IBM logo, Linear Tape File System, Spectrum Archive, Spectrum Scale, System Storage are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both.
Linear Tape-Open LTO, the LTO logo, Ultrium and the Ultrium logo are registered trademarks of Hewlett Packard Enterprise, IBM and Quantum in the US and other countries.
Other company, product and service names may be trademarks or service marks of others.
Product data has been reviewed for accuracy as of the date of initial publication. Product data is subject to change without notice. This information could include technical inaccuracies and/or typographical errors. IBM may make improvements and/or changes in the product(s) and/or program(s) at any time without notice. References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business. Any reference to an IBM Program Product in this document is not intended to state or imply that only that program product may be used. Any functionally equivalent program that does not infringe IBM's intellectual property rights may be used instead. It is the user's responsibility to evaluate and verify the operation of any non-IBM product, program or service.
The performance data contained herein was obtained in a controlled, isolated environment. Actual results that may be obtained in other operating environments may vary significantly. While IBM has reviewed each item for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere.
THE INFORMATION PROVIDED IN THIS DOCUMENT IS DISTRIBUTED AS IS WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IBM EXPRESSLY DISCLAIMS ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT. IBM shall have no responsibility to update this information. IBM products are warranted according to the terms and conditions of the agreements (e.g., IBM Customer Agreement, Statement of Limited Warranty, International Program License Agreement, etc.) under which they are provided. IBM is not responsible for the performance or interoperability of any non-IBM products discussed herein.
The provision of the information contained herein is not intended to, and does not grant any right or license under any IBM patents or copyrights. Inquiries regarding patent or copyright licenses should be made, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.