UCSC's Biomolecular Department Eliminates I/O Bottleneck with Panasas

Slow I/O and downtime lengthened the run times of the University of California, Santa Cruz Genome Browser search tool, which scientists use in their work on the questions of the post-genomic era. The Center for Biomolecular Sciences and Engineering sought a storage solution that delivered high-performance random I/O to an exceptionally large number of cluster nodes and let researchers focus solely on their tests instead of the systems running them.

Published in: Technology

Customer Success Story: University of California, Santa Cruz

“The Panasas Storage system has reduced run times by over 40 hours.”
Robert Baertsch, Research Assistant, UCSC

The Center for Biomolecular Sciences and Engineering at University of California, Santa Cruz (UCSC) launches interdisciplinary research and academic programs that address the scientific questions of the post-genomic era. The Center uses computational, mathematical, and statistical approaches to probe and analyze biological data, from DNA to biological processes to healthcare systems. One of the Center’s major projects is the UCSC Genome Browser, a web-based tool that allows researchers to view all 23 chromosomes of the human genome from as large as a full chromosome down to an individual nucleotide. The UCSC Genome Browser integrates the work of numerous scientists in laboratories worldwide and includes work generated at UCSC in an interactive, graphical display.

SUMMARY

Industry: Life Sciences

THE CHALLENGE
Slow I/O and downtime impacted the run times of their Genome Browser search tool used by scientists in their work to solve questions of the post-genomic era. They were searching for a storage solution that delivered high performance random I/O to an exceptionally large number of cluster nodes and one that would allow them to focus solely on their tests instead of the systems running them.

THE SOLUTION
The fully integrated software/hardware solution included the Panasas® Operating Environment and the PanFS™ parallel file system with the Panasas DirectFLOW® protocol.

THE RESULT
  • Exceptional I/O Performance
  • A single namespace for simplified cluster management
  • Maximized ROI from their clustered computing environment

The Challenge

The UCSC Genome Browser leverages extremely fast search software that runs on the KiloKluster, a second-generation 1000+ node bioinformatics Linux cluster. It enables researchers to match any DNA sequence to the human genome in seconds and maps experimental data to the reference sequence. In order to process the Browser’s huge quantity of data, the Center searched for a storage solution that delivered high performance random I/O to a large number of cluster nodes. “Our KiloKluster really taxes the capabilities of a storage system,” said Robert Baertsch, Research Assistant at UCSC. “To be fully effective, a storage solution in our environment needs to be able to deliver exceptional performance with a large number of cluster nodes.”

UCSC’s system needed to scale in performance as well as capacity. As a result, the Center searched for a solution that had the potential to scale as a large single pool of data. Similar to many universities, the UCSC researchers work on several complex projects at any one time. The ability to focus solely on their tests, instead of the systems that were running them, was critically important. “Many of our programs would take years to run on a single CPU. Having a cluster with many nodes shortens run time to days or even hours, making our research possible,” said Baertsch. “Slow I/O and downtime can really impact overall run times and it is critical to have a storage system that can scale to thousands of nodes with a single system image.” Finally, price is always a major consideration for the Center. A fundamental requirement is to deliver a compelling price point.

The Solution

The Center conducted an extended evaluation process including detailed testing with many high performance network-attached and direct-attached storage solutions. After thorough testing, the Panasas® Storage solution was selected for its ability to deliver high-speed random I/O performance in a large cluster environment, simplified management through a scalable, shared pool of storage, and exceptional value. The Panasas Storage system is now connected to the KiloKluster to store and retrieve reference sequences. “A typical NFS server is brought to its knees when 1000 cluster nodes are pulling data from it,” said Baertsch. “With Panasas Storage and its object-based architecture, we are able to simultaneously read from and write to all cluster nodes.”

Panasas Storage helps organizations like the Center for Biomolecular Sciences and Engineering accelerate the speed and accuracy of its decisions and ultimately, lead to real world breakthroughs that improve people’s lives. Panasas Storage enables the Center to maximize the benefits of Linux cluster computing by breaking down the storage bottleneck created with legacy network storage technologies. The solution is powered by the Panasas Operating Environment and the company’s unique object-based storage architecture. In addition to exceptional performance benefits, the system enables seamless growth of a single namespace, greatly improving system manageability. Finally, by leveraging industry standard components, Panasas is able to offer this solution at an extremely competitive price.

The Result

The Center for Biomolecular Sciences and Engineering was able to see a significant performance boost once the Panasas Storage system was moved into production. “The consistently high I/O performance delivered by the Panasas solution enables our researchers to get their results more quickly,” said Baertsch. “By using the Panasas DirectFLOW® protocol we’ve been able to eliminate our I/O bottleneck.” In fact, for specific batches of jobs the Panasas Storage system has reduced run times by over 40 hours. Perhaps even more important, by leveraging the object-based architecture, Panasas has been able to offer the Center new ways to look at increasing overall performance. “Panasas has given us many ideas on how to scale I/O and we look forward to further experiments,” said Baertsch.

The Panasas Storage system’s ease of management also added tremendous value to the Center’s solution. The single unified namespace ensures administrator management will be streamlined today and in the future. As the system capacity requirements increase, the Center is confident that the Panasas Storage system can increase in size with no impact to administrator management.

About Panasas

Panasas, Inc., the leader in high-performance scale-out NAS storage solutions, enables enterprise customers to rapidly solve complex computing problems, speed innovation and bring new products to market faster. All Panasas solutions leverage the patented PanFS™ storage operating system to deliver exceptional performance, scalability and manageability.

PW-10-21700 | Phone: 1-888-PANASAS | www.panasas.com
© 2010 Panasas Incorporated. All rights reserved. Panasas is a trademark of Panasas, Inc. in the United States and other countries.
