Storage for Science: Methods for Managing Large and Rapidly Growing Data Stores in Life Science Research Environments


An Isilon® Systems Whitepaper
August 2008
Prepared by:
Table of Contents

- Introduction
- Requirements for Science
  - “Large” Capacity
  - Accelerating Growth
  - Variable File Types and Operations
  - Shared Read/Write Access
  - Ease of Use
- Understanding the Alternatives
  - Common Feature Trade-offs
  - Direct Attached Storage (DAS)
  - Storage Area Network (SAN)
  - Network Attached Storage (NAS)
  - Asymmetric Clustered Storage
  - Symmetric Clustered Storage
- Isilon Clustered Storage Solution
  - OneFS Operating System
  - Inherent High Availability & Reliability
  - Single Level of Management
  - Linear Scalability in Performance & Capacity
- Conclusion

ISILON SYSTEMS
Introduction

This document is intended to inform the Life Science researcher with large and rapidly growing data storage needs. We explore many of the storage requirements common to Life Science research and explain the evolution of modern storage architectures, from local disks through symmetric clustered storage. Finally, we present Isilon’s IQ clustered storage solution in detail.

Requirements for Science

“Large” Capacity

Many branches of Life Science research involve the generation, accumulation, analysis, and distribution of “large” amounts of data. What is considered “large” changes rapidly as data generation increases through advances in scientific methods and instrumentation. These advances are offset by capacity increases in storage technologies that are undergoing their own rapid evolution. Presently, Neuro-Imaging and Next-Generation Sequencing are branches of science churning out massive amounts of data that push the limits of “large”. We explore these two examples in further detail below.

Neuro-Imaging

A common Neuro-Imaging experiment uses fMRI (Functional Magnetic Resonance Imaging) to determine which regions of the brain are activated in response to a stimulus. This “brain mapping” is achieved by observing increased blood flow to the activated areas of the brain with an fMRI scanner. The scanning of a single human test subject might occur over a 60 to 90 minute period, with hundreds of discrete scans taken every few seconds, generating as much as 1 GB of data per subject. A single instrument operating at only 50% capacity can produce many terabytes (thousands of GB) of data per year. The Neuro-Imaging centers interviewed for this paper utilize up to ten instruments, supporting dozens of scientists, each allocated a baseline of 2 TB of disk space for ongoing experiments. While this rapid scaling is a significant challenge for many labs, data growth of 10 to 20 TB per year is not unusual in these environments.
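To make the arithmetic above concrete, the following sketch estimates annual fMRI data volume from the per-subject figure quoted above. The session and working-day counts are illustrative assumptions, not measurements from any particular center.

```python
# Back-of-the-envelope fMRI data-volume estimate using the ~1 GB per
# subject-session figure cited above. Session counts are assumptions.

GB_PER_SESSION = 1          # primary image data per subject (from the text)
SESSIONS_PER_DAY = 8        # assumed: one instrument at roughly 50% utilization
WORK_DAYS_PER_YEAR = 250    # assumed

def annual_volume_tb(instruments):
    """Terabytes of primary fMRI data produced per year."""
    gb = instruments * SESSIONS_PER_DAY * WORK_DAYS_PER_YEAR * GB_PER_SESSION
    return gb / 1000  # 1 TB = 1,000 GB, as used in this paper

# Under these assumptions, a single instrument yields ~2 TB/year of primary
# data; a ten-instrument center, before any derived or replicated copies,
# is already near 20 TB/year, consistent with the growth rates quoted above.
```

Even this crude model shows why per-user quotas alone are a poor planning basis: the instrument count dominates the total.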
“Next-Generation” DNA Sequencing

DNA sequencing has undergone a revolution in recent years. Driven by novel sequencing chemistries, micro-fluidic systems, and reaction detection methods, “Next-Generation” sequencing instruments from 454, Illumina, ABI, and Helicos offer 100 to 1,000-fold increased throughput, combined with a 100 to 1,000-fold decrease in cost per nucleotide compared with conventional Sanger sequencing. This change has put high-throughput genome sequencing, once achievable only by a few major sequencing centers, within reach of many smaller research groups and individual labs. The result for such labs is a dramatic increase in storage requirements, from gigabytes to petabytes (1 million GB), in the course of only a couple of years.

Each Next-Generation sequencing platform is unique in the nature and volume of the data it generates. Typically, anywhere from 600 GB (gigabytes) to 6 TB (terabytes) of primary image data is written over a period of one to three days. By today’s standards, a terabyte is not large. However, for a single laboratory, accumulating and moving terabytes of data per day without loss can be a significant challenge, especially for small sequencing labs that have not yet adopted a highly scalable storage solution.
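The sequencing figures above translate into a sustained write rate that a lab’s storage must absorb. A rough estimate, using only the 600 GB to 6 TB per one-to-three-day run range quoted above:

```python
# Rough sustained-write-rate estimate for a Next-Generation sequencing run.
# Uses only the data-volume range quoted in the text; purely illustrative.

def sustained_rate_mb_s(run_gb, run_days):
    """Average MB/s the storage system must absorb for one run."""
    seconds = run_days * 24 * 3600
    return run_gb * 1000 / seconds  # 1 GB = 1,000 MB

# Heaviest case in the quoted range: 6 TB written in a single day.
peak = sustained_rate_mb_s(6000, 1)   # roughly 69 MB/s, per instrument
# Lightest case: 600 GB spread over three days.
low = sustained_rate_mb_s(600, 3)     # roughly 2.3 MB/s
```

Note that these are averages; instruments write in bursts, so the storage system must handle short-term peaks well above these figures.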
Accelerating Growth

Storage capacity planning for Life Science research is particularly difficult because requirements change rapidly and at irregular rates. Planning for growth according to the number of users or instruments is often insufficient when, for instance, a new grant can double capacity needs, or a revolutionary new instrument can increase data production many-fold. To be responsive to the requirements of Life Science research, an ideal storage architecture must be scalable in both small and large increments without requiring a system redesign or replacement. Ideally, a storage solution should have “pay-as-you-grow” characteristics that allow for growth as needed.

Variable File Types and Operations

Life Science data is highly variable, both in composition and in the way it is accessed. An ideal storage system for Life Science organizations must therefore deliver good I/O performance across these varied use cases:

- Many small files or fewer big files
- Text files and binary files
- Sequential and random access
- Highly concurrent access

This variability is common to both Neuro-Imaging and Next-Generation sequencing. Massive simultaneous computations are performed on many large primary image files, each ranging into the gigabytes and requiring highly parallel streaming I/O, and produce fewer, smaller text files. The resulting data might be kept within directories containing thousands to hundreds of thousands of files, totaling many terabytes.

Shared Read/Write Access

Storage systems for Life Science data must be simultaneously accessible to many instruments, users, analysis computers, and data servers. These storage systems cannot reside in isolated silos with limited accessibility. They must instead permit concurrent, integrated, file-level read/write access across the entire organization, with I/O bandwidth that scales to accommodate concurrent demand.
A typical Neuro-Imaging or Next-Generation sequencing workflow involves the following steps:

- Multiple instruments generate primary image data.
- Large-memory SMP machines and compute clusters distill the primary data into a derived form.
- Researchers evaluate and annotate the data to answer scientific questions.
- Researchers iterate on the above process, adding more primary data and refining their analyses.
- Finally, results are served to a wider audience via internet repositories, usually accessed via FTP or HTTP.

The requirements of this workflow are the sum of the requirements of instruments, researchers, computing systems, and customers. A sustainable storage plan for even a small research organization requires a system with shared, file-level read/write access to a common, large, scalable storage repository, accessible over these common protocols:

- NFS (Network File System) – the common network file system for UNIX instruments and analysis computers
- SMB/CIFS (Server Message Block/Common Internet File System) – the common network file system for Windows-based instruments and user desktops
- HTTP (Hypertext Transfer Protocol) – the file transfer protocol of the World Wide Web
- FTP (File Transfer Protocol) – a common internet protocol for disseminating data

Ease of Use

At many levels, ease of use is the most significant storage requirement for Life Science research, even though it is generally the most difficult to quantify.

Management

The human resources involved in maintaining a large storage system range from just above zero to many FTEs (full-time equivalents). The management of an ideal storage system should not require hiring additional, dedicated IT staff.

Scaling

Scaling a storage system’s capacity and/or performance, whether by fractional amounts or by orders of magnitude, should not require man-months of planning meetings or man-days of specialized IT expertise to implement. Scaling an ideal storage system should take minutes, independent of scale.

User

Ideally, the researcher is focused on science, not computers or disks. The researcher shouldn’t need to be concerned with, or even aware of, volumes, capacity, formats, or how to access their data. After a storage expansion, a user might notice that capacity suddenly increased, but should never experience an interruption in service.

Understanding the Alternatives

Common Feature Trade-offs

Like most products, storage solutions compete on their features. An ideal storage solution would excel at all of them: high I/O performance, no downtime, no data loss, the ability to grow arbitrarily large in both large and small increments, a low purchase price, little management effort, and ease of use. In the real world, decisions are based on which of these requirements are most important within a given budget. Storage decisions typically reduce to four questions:

- Will this provide sufficient performance and capacity for my present needs?
- Will I experience significant downtime or data loss?
- Do I have the human resources needed to manage the system?
- How long before I need to upgrade this system, and at what cost?

When designing storage systems in a scientific research environment, many variables come into play. Present capacity needs may be the easiest to quantify, but they are only a starting point. Performance requirements generally aren’t known until after the storage has been deployed and workflows are executed against real data. Data loss is known to be a very bad thing, but quantifying its cost is difficult when the core value to the lab might be a publication or a discovery. Labor costs may be very indirect; the use of graduate students as part-time systems
administrators is a prime example. Students come and go, which can impose high additional costs if storage systems are difficult to learn or require specialized training. Particularly in primary research, very early in a product pipeline, it can be difficult to set dollar values on these factors. However, such variables must be considered in order to make sensible storage infrastructure decisions.

All storage systems have maximum scaling limits. With some, once this limit is reached, one must either (a) deploy a second standalone system or (b) perform a “forklift upgrade,” in which major components are retired and replaced with bigger or newer ones.

In addition to the decision parameters above, some storage architectures can be tuned to optimize certain facets of their performance. They can be configured to excel at one feature or another, but not at all of them at the same time:

- I/O Performance – the speed of writing data to disk and reading it back to the CPU and user; this might be further broken down into transactional, sequential, and random access patterns
- Availability – the cost associated with maintaining uninterrupted access to the data
- Reliability – the cost associated with mitigating the risk of data loss
- Maximum Scalability – the largest data volume the storage system can ever hold
- Dynamic Configuration – the cost in time and effort to make a change to the system
- Resolution of Scale – the smallest increment by which the storage can be made larger
- Purchase Price – the cost to purchase the system
- Total Cost of Ownership – the cost to buy and operate the system over its useful life
- Ease of Use – the cost in time and effort to get it working and keep it working

Direct Attached Storage (DAS)

Discrete Disk

The local disk is the most familiar means of providing storage, either directly attached to the motherboard of a system or attached through USB, Firewire, SCSI, or Fibre Channel cables.
While it may seem odd, when discussing critical, enterprise-class storage systems, to consider discrete local disks, the explosion of data and the ubiquity of simple DAS storage mean that labs with shelves full of Firewire drives pressed into service as long-term archives are not uncommon. These discrete disks can be purchased for as little as $200-$300 per TB and are fairly easy to use. In the very short term, buying one alongside the reagents for an instrument run may seem a reasonable purchasing decision. However, combining these disks into a larger storage pool is impossible, and accessing the needed data means locating the correct disk and plugging it into the correct computer. This is not practical at scale, even for very small environments. We describe this approach only as a baseline; it should not be considered a reasonable enterprise-class storage solution.
Redundant Array of Independent Disks (RAID)

RAID is a technology that combines two or more disks to achieve greater volume capacity, performance, or reliability. A RAID might be directly attached to one computer as in DAS, attached indirectly through a SAN (Storage Area Network), or provide the underlying storage for a NAS (Network Attached Storage). With RAID, data is striped across all of the disks to create a “RAID set,” and depending on how the data is distributed to the individual disks, a RAID is highly tunable:

- RAID 0: maximum capacity, maximum risk
- RAID 1: maximum read performance, minimum risk
- RAID 5: balances capacity, performance, and risk
- RAID 6: capacity and performance, with less risk than RAID 5

Storage Area Network (SAN)

A SAN is an indirect means of attaching disks to a computer. Rather than connecting directly to the motherboard, a computer connects through a SAN fabric (typically Fibre Channel, iSCSI, or a proprietary low-latency network). A SAN provides a convenient means to centralize the management of many disks into a common storage pool and then allocate logical portions of this pool to one or more computers. The common pool can be expanded by attaching additional disks to the SAN. SAN clients have block-level access (which appears as a local disk) to “their” portion of the SAN; however, they have no access to portions allocated to other computers.
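The trade-offs between the RAID levels listed above can be made concrete with a small capacity calculator. This is a simplified model assuming uniform disk sizes; it ignores controller and formatting overheads, and the RAID 1 figure assumes the set is organized as mirrored pairs.

```python
# A minimal sketch of the capacity/fault-tolerance trade-off between common
# RAID levels. Assumes uniform disk sizes; real arrays add implementation-
# specific overheads not modeled here.

def raid_usable_tb(level, disks, disk_tb):
    """Usable capacity of a single RAID set, ignoring formatting overhead."""
    if level == 0:                  # striping only: full capacity, no redundancy
        return disks * disk_tb
    if level == 1:                  # mirrored pairs: half the raw capacity
        return disks * disk_tb / 2
    if level == 5:                  # one disk's worth of distributed parity
        return (disks - 1) * disk_tb
    if level == 6:                  # two disks' worth of distributed parity
        return (disks - 2) * disk_tb
    raise ValueError("unsupported RAID level")

# Disk failures each set survives without data loss (per mirror pair for RAID 1).
FAILURES_TOLERATED = {0: 0, 1: 1, 5: 1, 6: 2}
```

For an eight-disk set of 1 TB drives, the usable capacities are 8, 4, 7, and 6 TB respectively, which shows why RAID 5 and 6 dominate capacity-sensitive deployments.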
Network Attached Storage (NAS)

DAS and SAN are methods for attaching disks to a computer, and RAID is a technology for configuring many disks to achieve different I/O characteristics. A NAS is neither of these. A NAS is a computer (a file server) that contains, or is attached to, one or more local disks or RAIDs and runs a file-sharing service that provides concurrent, file-level access to many computers over a TCP/IP network. The performance of a NAS is therefore often limited by the number of clients competing for access to the file server.

NAS/SAN Hybrid

Naturally, one can provide the files served by a NAS from disks provided over a SAN. A NAS/SAN hybrid is a common approach to overcoming the performance limitations of a NAS by distributing file-server load over multiple file servers attached to a common SAN. In theory, this provides a means to increase the number of file servers in response to the demand of competing clients. In practice, the approach often fails when one of the file servers is serving a SAN volume holding more “popular” data than the others. Responding to this failure requires careful monitoring of demand and moving, copying, or synchronizing data across the various SAN volumes. As capacity grows, the management of a NAS/SAN hybrid becomes more complex.

Asymmetric Clustered Storage

Asymmetric clustered storage is implemented as a software layer operating on top of a NAS/SAN hybrid architecture. The storage of many SAN volumes is “clustered” and allocated to separate computers, but the architecture is asymmetric in that nodes have specialized roles in providing the service. Each NAS file server has concurrent access to all of the SAN volumes, so the manual re-distribution of NAS servers and SAN volumes is no longer required. This concurrency is managed through the addition of another type of computer, frequently called the “Meta Data Controller,” which ensures that file servers take proper turns accessing the SAN volumes and files.
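The coordination role of the “Meta Data Controller” described above can be sketched as a simple lock service: a file server must be granted a SAN volume before touching it, and other servers wait their turn. This is a toy model for illustration only, not any vendor’s actual protocol; all names are hypothetical.

```python
# Toy model of a metadata controller coordinating shared SAN-volume access.
# Illustrates why the controller is both essential and a potential bottleneck:
# every access decision flows through this one component.

class MetaDataController:
    def __init__(self):
        self._holders = {}            # volume -> file server currently holding it

    def acquire(self, volume, server):
        """Grant the volume if it is free (or already held by this server)."""
        holder = self._holders.get(volume)
        if holder is None or holder == server:
            self._holders[volume] = server
            return True
        return False                  # another server must wait its turn

    def release(self, volume, server):
        if self._holders.get(volume) == server:
            del self._holders[volume]

mdc = MetaDataController()
assert mdc.acquire("vol1", "nas-a")      # first server gets the volume
assert not mdc.acquire("vol1", "nas-b")  # second server must wait
mdc.release("vol1", "nas-a")
assert mdc.acquire("vol1", "nas-b")      # now it can proceed
```

Even in this toy form, the design choice is visible: concurrency is safe, but every file server’s progress depends on one central component.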
Symmetric Clustered Storage

Symmetric clustered storage provides a balanced architecture for scalable storage and concurrent read/write access to a common network file system. It pools both the storage and the file-serving capabilities of many anonymous, identical systems. While individual “nodes” within the cluster manage their directly attached disks, a software layer allows the nodes to work together to present an integrated whole. This approach maximizes the ability to “scale out” storage by adding nodes. It is the approach offered by Isilon Systems, described in greater detail below.

Isilon Clustered Storage Solution

Life Science research presents an ever-changing, highly automated laboratory data environment, which can be a major challenge for computing and storage vendors. Isilon IQ represents a flexible, scalable system in which capacity can be added in parallel with new equipment or laboratory capacity. Once the system is configured, scientists can reasonably expect to spend most of their time doing science rather than worrying about whether a new sequencing machine will require a complete overhaul of the storage system. Isilon IQ’s biggest strengths are the ability to scale capacity and performance linearly or independently, symmetric data access, and ease of use. Several customers shared that the staffing requirements for their data storage environment dropped to near zero once they installed the Isilon system.

The Isilon storage product is a fully symmetric, file-based, clustered architecture. The hardware consists of industry-standard x86 servers that arrive pre-configured with Isilon’s patented OneFS operating system. These “nodes” connect to each other over a low-latency, high-bandwidth InfiniBand network. Filesystem clients access data over a pair of Ethernet ports on each node. Clients connect to a single “virtual” network address, and OneFS dynamically distributes the actual connections among the nodes.
This means that input/output bandwidth, caching, and latency are shared across the entire
system. Performance scales smoothly as nodes are added, in contrast to gateway-based master-slave architectures, where the gateway inevitably becomes a bottleneck. Because all connections go to the same virtual address, adding or removing nodes requires no client-side reconfiguration. Users simply continue with the same data volume, but with more capacity.

OneFS manages all aspects of the system, including detecting and optimizing new nodes. This makes configuration and expansion remarkably simple. All the complexity usually associated with adding new storage is an “under the hood” activity managed by the operating system. Administrators add new nodes to an existing cluster by connecting the network and power cables and powering on. The nodes detect the existing cluster and make themselves available to it. OneFS then smoothly re-balances data across all the nodes, and during this process all filesystem mount points remain available. Compared with the usual headache of taking down an entire data storage system for days at a time to migrate existing data off to tape, create a new, larger system, and re-import the data, this is a remarkable benefit.

Isilon nodes of different ages and sizes can be intermixed, which eliminates one of the major risks of investing in an integrated storage system. Computing hardware doubles in capacity and speed on a regular basis, so any system intended for use over more than a year or two must account for expansions consisting of substantially different hardware than the original system. Storage architectures that cannot smoothly support heterogeneous hardware frequently require a “forklift” upgrade, in which the old system is decommissioned to make space for a new one. Such organizations are then in a constant state of flux, migrating data, with the associated chaos in both user access and automated pipelines.
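The node-addition re-balancing described above can be illustrated with a toy model: when a node joins, the cluster’s data blocks are evened out across all members. This is purely illustrative and is not OneFS’s actual placement algorithm; node names and block counts are invented for the example.

```python
# Toy illustration of cluster re-balancing: a new node joins, and a share of
# existing blocks migrates onto it so that load evens out across all nodes.
# A simplistic model, not the real OneFS placement algorithm.

def rebalance(node_blocks, new_node):
    """Evenly redistribute a cluster's total block count across all nodes,
    including a newly added (empty) node."""
    total = sum(node_blocks.values())
    nodes = list(node_blocks) + [new_node]
    share, extra = divmod(total, len(nodes))
    # The first `extra` nodes absorb one leftover block each.
    return {n: share + (1 if i < extra else 0) for i, n in enumerate(nodes)}

cluster = {"node1": 300, "node2": 300, "node3": 300}
print(rebalance(cluster, "node4"))   # each of the four nodes ends up with 225
```

The key property this models is that clients never see the migration: the namespace is unchanged, only block placement moves.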
With an Isilon system, by contrast, older nodes may remain in service as long as their hardware is functional. Since data management and re-balancing are handled by OneFS, when it is time to retire a storage node, administrators simply instruct the system to migrate the data off of that node, wait for the migration to complete, and then power it off and remove it from the cluster. During migration, all data remains available for user access.

OneFS Operating System

To address the scalable distributed file system requirement of clustered storage, Isilon built, patented, and delivered the OneFS operating system, which combines the three layers of a traditional storage architecture (the file system, the volume manager, and the RAID controller) into a unified software layer. This creates a single, intelligent, fully symmetric file system that spans all nodes within a cluster. File striping takes place across multiple storage nodes, versus the traditional method of striping across individual disks within a volume or RAID array. This provides a very specific benefit: no one node is 100% responsible for any particular file. An Isilon IQ system can withstand the loss of multiple disks or entire nodes without losing access to any data.

OneFS provides each node with knowledge of the entire file system layout. Accessing any independent node gives a user access to all content in one unified namespace: there are no volumes or shares, no inflexible volume-size limits, no downtime for reconfiguration or expansion of storage, and no multiple network drives to manage. This seamless integration of a fully symmetric clustered architecture makes Isilon unique in the storage landscape.

Inherent High Availability & Reliability

Non-symmetric data storage architectures contain intrinsic dependencies that create points of failure and bottlenecks. One way to ensure data integrity and eliminate single points of failure is to make all nodes in a cluster peers.
Each node in an Isilon cluster can handle a request from any client and can provide any content. If any particular node fails, the other nodes dynamically fill in. This is true for all the tasks of the filesystem. Because there is no dedicated “head” to the system, all aspects of system performance can be balanced across the entire cluster: I/O, availability, storage capacity, and resilience to hardware failure are all distributed.
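The peer behavior described above can be sketched as a simple routing function: any healthy node can serve any request, so a failed node is transparently bypassed. Node names and the round-robin policy here are illustrative assumptions, not Isilon’s actual connection-distribution logic.

```python
# Minimal sketch of peer-to-peer request routing: every node can serve any
# content, so a client request simply lands on any healthy node.

def route_request(nodes, request_id):
    """Pick a healthy node for a request; any peer can serve any content."""
    healthy = [name for name, up in nodes.items() if up]
    if not healthy:
        raise RuntimeError("no healthy nodes in cluster")
    return healthy[request_id % len(healthy)]   # simple round-robin spread

cluster = {"node1": True, "node2": True, "node3": True}
assert route_request(cluster, 0) == "node1"
cluster["node1"] = False                        # node1 fails...
assert route_request(cluster, 0) == "node2"     # ...its peers fill in
```

Because the routing function needs no privileged “head” node, removing any one node changes only which peers answer, not whether requests are answered.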
OneFS further increases the availability of Isilon’s clustered storage solution by providing multi-failure support of n+4. Simply stated, an Isilon cluster can withstand the simultaneous loss of four disks or four entire nodes without losing access to any content, and without requiring dedicated parity drives. Additionally, self-healing capabilities and high levels of hardware redundancy greatly reduce the chances of a production node failing in the first place. In the event of a failure, OneFS automatically re-builds files in parallel across the distributed free space in the cluster, eliminating the need for dedicated “hot” spare drives. Because OneFS leverages all available free space across all nodes to rebuild data, the window of vulnerability during the rebuild process is minimized.

Isilon IQ constantly monitors the health of all files and disks and maintains records of the SMART statistics (e.g. recoverable read errors) available on each drive to anticipate when that drive will fail. This is simply an automation of a “best practice” that many system administrators wish they had time to perform. When Isilon IQ identifies at-risk components, it preemptively migrates data off of the “at risk” disk to available free space on the cluster, automatically and transparently to the user. Once the data is migrated, the administrator is notified to service the suspect drive in advance of an actual failure. This feature gives customers confidence that data written today will be stored reliably and will be available whenever it is needed. No other solution today provides this level of data protection and reliability.

Single Level of Management

An Isilon IQ cluster creates a single, shared pool of all content, providing one point of access for users and one point of management for administrators, with a single file system or volume of up to 1.6 petabytes.
Users can connect to any storage node and securely access all of the content within the entire cluster. This means there is a single point for all applications to connect to, and every application has visibility and access to every file in the entire file system (subject, of course, to security and permission policies).

Linear Scalability in Performance & Capacity

One of the key benefits of an Isilon IQ cluster is the ease with which it allows administrators to add both performance and capacity without downtime or application changes. System administrators simply insert a new Isilon IQ storage node, connect the network cables, and power up. The cluster automatically detects the newly added node and begins to configure it as a member of the cluster. In less than 60 seconds, an administrator can grow a single file system by 2 to 12 terabytes and increase throughput by an additional 2 gigabits per second. Isilon’s modular, building-block (“pay-as-you-grow”) approach means customers aren’t forced to buy more storage capacity than they need up front. The modular design also enables customers to incorporate new technologies in the same cluster, such as adding a node with higher-density disk drives, more CPU horsepower, or more total performance.

Conclusion

We have described the data storage needs of Life Science researchers, summarized the major data storage architectures currently in use, and presented the Isilon IQ product as a strong and flexible solution to those needs. Data storage is a critical component of modern scientific research. As smaller labs and individual researchers become responsible for terabytes and petabytes of data, understanding the options and trade-offs will become ever more critical.