The Gordon Data-intensive Supercomputer. Enabling Scientific Discovery
SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

Gordon Flash-based Data-intensive Supercomputer: One Year into Production
ISC’13 Leipzig
Shawn Strande
Gordon Project Manager & Co-PI

Gordon – An Innovative Data-Intensive Supercomputer

• Designed to accelerate access to massive amounts of data in areas of genomics, earth science, engineering, medicine, and others.
• Emphasizes memory and I/O over FLOPS.
• Cray-integrated 1,024-node Xeon E5 (Sandy Bridge) cluster.
• 300 TB of high performance Intel flash.
• Large memory supernodes via vSMP Foundation from ScaleMP.
• 3D torus interconnect from Mellanox.
• In production operation since February 2012.
• Funded by the NSF and available through the Extreme Science and Engineering Discovery Environment program (XSEDE).

Gordon is a highly flexible system for exploring a wide range of data-intensive technologies and applications

• High performance flash technology
• High speed InfiniBand interconnect
• On-demand Hadoop and data-intensive environments
• Massively large memory environments
• High performance parallel file system
• Scientific databases
• Complex application architectures
• New algorithms and optimizations

Gordon is a data movement machine

Sandy Bridge Compute Nodes (1,024)
• 64 TB memory
• 341 Tflop/s

Flash-based I/O Nodes (64)
• 300 TB Intel eMLC flash
• 35M IOPS

Large Memory Nodes
• vSMP Foundation 5.0
• 2 TB of cache-coherent memory per node

“Data Oasis” Lustre PFS
• 100 GB/sec, 4 PB

Dual-rail, 3D Torus Interconnect
• 7 GB/s

SSD latencies are 2 orders of magnitude lower than those of HDDs (that’s a big deal for some data-intensive applications)

Typical hard drive
• Latency: ~10 ms (0.010 s)
• IOPS = 200

Solid State Disk
• Latency: ~100 µs (0.0001 s)
• IOPS = 35,000/3,000 (R/W)
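
The "2 orders of magnitude" claim can be checked directly from the latency figures on this slide. The following is a minimal sketch of that arithmetic; the variable names are illustrative and not part of the original deck.

```python
import math

# Figures taken from the slide.
hdd_latency_s = 0.010     # typical hard drive, ~10 ms
ssd_latency_s = 0.0001    # solid state disk, ~100 microseconds
hdd_iops = 200
ssd_read_iops = 35_000
ssd_write_iops = 3_000

# Latency ratio and its order of magnitude (log base 10).
latency_ratio = hdd_latency_s / ssd_latency_s
orders_of_magnitude = math.log10(latency_ratio)

# IOPS ratios for reads and writes, for comparison.
read_iops_ratio = ssd_read_iops / hdd_iops
write_iops_ratio = ssd_write_iops / hdd_iops

print(f"Latency ratio: {latency_ratio:.0f}x "
      f"(about {orders_of_magnitude:.0f} orders of magnitude)")
print(f"IOPS ratio: {read_iops_ratio:.0f}x read, {write_iops_ratio:.0f}x write")
```

Note that the advantage is asymmetric: with the slide's numbers, the read IOPS ratio is much larger than the write IOPS ratio.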

Protein Data Bank (flash-based I/O node)

The RCSB Protein Data Bank (PDB) is the leading primary database that provides access to the experimentally determined structures of proteins, nucleic acids and complex assemblies. In order to allow users to quickly identify more distant 3D relationships, the PDB provides a pre-calculated set of all possible pairwise 3D protein structure alignments.

Although the pairwise structure comparisons are computationally intensive, the bottleneck is the centralized server that is responsible for assigning work, collecting results and updating the MySQL database.

Using a dedicated Gordon I/O node and the associated 16 compute nodes, work could be accomplished 4-6x faster than using the OSG.

Configuration      Time for 15M alignments   Speedup
Reference (OSG)    24 hours                  1
Lyndonville        6.3 hours                 3.8
Taylorsville       4.1 hours                 5.8
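
The speedup column in the table above is the reference time divided by the time for each configuration. A minimal sketch of that calculation follows; the dictionary layout and names are illustrative only. The recomputed values (about 3.81 and 5.85) agree with the reported 3.8 and 5.8 to within the rounding of the published run times.

```python
# Run times (hours) and reported speedups, as given in the table.
reference_hours = 24.0
configs = {
    "Lyndonville": {"hours": 6.3, "reported_speedup": 3.8},
    "Taylorsville": {"hours": 4.1, "reported_speedup": 5.8},
}

# Speedup = reference time / configuration time.
speedups = {
    name: reference_hours / cfg["hours"] for name, cfg in configs.items()
}

for name, value in speedups.items():
    reported = configs[name]["reported_speedup"]
    print(f"{name}: computed {value:.2f}x, reported {reported}x")
```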

Condensed phase chemical reactions (SSD)

Many reactions occur too quickly to be experimentally observed or measured, and there is much interest in modeling the reactions using hybrid quantum mechanics/molecular mechanics (QM/MM). The “weighted ensemble” path sampling approach is a pivotal step towards performing these calculations efficiently.

The weighted ensemble approach, as implemented in the highly scalable WESTPA software package, involves carrying out a large number of loosely coupled simulations to generate potential reaction pathways as well as rigorous reaction rates. A 24-hour run of a simulation using 1024 cores involves writing more than ten million files, most of which are less than 100 KB in size, to local scratch.

Using the local SSDs on Gordon reduces the I/O overhead of the simulations by more than 30x (from 10-20% to 0.3%) relative to Kraken.

Figure: azide-cation addition in explicit solvent
Source: Lillian Chong (U Pittsburgh). Used by permission. 2013
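
To put the figures on this slide in perspective, the sketch below works out the sustained file-creation rate implied by ten million files in 24 hours and the overhead reduction implied by going from 10-20% to 0.3%. It uses ten million as a lower bound (the slide says "more than") and the variable names are illustrative.

```python
# Figures taken from the slide; ten million is a lower bound.
files_written = 10_000_000
run_seconds = 24 * 60 * 60

# Average number of small files created per second over the whole run.
files_per_second = files_written / run_seconds

# I/O overhead as a fraction of run time: Kraken range vs. Gordon local SSD.
overhead_kraken = (0.10, 0.20)
overhead_gordon = 0.003

# Reduction factor at the low and high end of the Kraken range.
reduction_low, reduction_high = (k / overhead_gordon for k in overhead_kraken)

print(f"Sustained rate: at least {files_per_second:.0f} files per second")
print(f"Overhead reduction: {reduction_low:.0f}x to {reduction_high:.0f}x")
```

Even at the low end of the Kraken range the reduction exceeds 30x, which is consistent with the "more than 30x" stated on the slide.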

OpenTopography Facility (flash-based I/O node)

The NSF funded OpenTopography Facility provides online access to Earth science-oriented high-resolution LIDAR topography data along with online processing tools and derivative products. Point cloud data are processed to produce digital elevation models (DEMs) - 3D representations of the landscape.

The local binning algorithm utilizes the elevation information from only the points inside of a circular search area with a user-specified radius. An out-of-core (memory) version of the local binning algorithm exploits secondary storage for saving intermediate results when the size of a grid exceeds that of memory.

Using a dedicated Gordon I/O node with the fast SSD drives reduces run times of massive concurrent out-of-core processing jobs by a factor of 20.

Dataset and processing configuration: Lake Tahoe, 208 million LIDAR returns, 0.2-m grid resolution and 0.2-m radius

# concurrent jobs   OT Servers   Gordon ION   Speed-up
1                   3297 sec     1102 sec     3x
4                   29607 sec    1449 sec     20x

Figure: High-resolution bare earth DEM of the San Andreas fault south of San Francisco, generated using OpenTopography LIDAR processing tools. Source: C. Crosby, UNAVCO
Figure: Illustration of local binning geometry. Dots are LIDAR shots; ‘+’ indicates locations of DEM nodes at which elevation is estimated.
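
The local binning idea described above can be sketched in a few lines. This is a minimal in-memory illustration, not OpenTopography's implementation: the function name `local_binning_mean` is hypothetical, the mean is assumed as the estimator (the slide does not say which statistic is used), and the brute-force search over all points stands in for the out-of-core machinery that the real tool needs for large grids.

```python
def local_binning_mean(points, grid_xs, grid_ys, radius):
    """Estimate elevation at each DEM node from nearby LIDAR points.

    points  : list of (x, y, z) LIDAR returns
    grid_xs : x coordinates of DEM nodes
    grid_ys : y coordinates of DEM nodes
    radius  : search radius around each node

    Returns a dict mapping (x, y) node -> mean z of the points inside the
    circular search area, or None when no point falls inside it.
    """
    radius_sq = radius * radius
    dem = {}
    for gy in grid_ys:
        for gx in grid_xs:
            # Keep only elevations of points within the search circle.
            zs = [
                z for (x, y, z) in points
                if (x - gx) ** 2 + (y - gy) ** 2 <= radius_sq
            ]
            dem[(gx, gy)] = sum(zs) / len(zs) if zs else None
    return dem


# Tiny made-up example: three returns, a 2 x 2 grid, 0.2 search radius.
sample_points = [(0.0, 0.0, 10.0), (0.1, 0.0, 12.0), (1.0, 1.0, 20.0)]
dem = local_binning_mean(sample_points, [0.0, 1.0], [0.0, 1.0], radius=0.2)
for node, elevation in sorted(dem.items()):
    print(node, elevation)
```

In the out-of-core variant mentioned on the slide, the per-node intermediate results would be spilled to secondary storage instead of being held in a dictionary, which is where fast SSDs help.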

IntegromeDB (flash-based I/O node)

The IntegromeDB is a large-scale data integration system and biomedical search engine. IntegromeDB collects and organizes heterogeneous data from over a thousand databases covered by the Nucleic Acid and millions of public biomedical, biochemical, drug and disease-related resources.

IntegromeDB is a distributed system stored in a PostgreSQL database containing over 5,000 tables, 500 billion rows and 50 TB of data. New content is acquired using a modified version of the SmartCrawler web crawler and pages are indexed using Apache Lucene.

The project was awarded two Gordon I/O nodes, the accompanying compute nodes and 50 TB of space on Data Oasis. The compute nodes are used primarily for post-processing of raw data. Using the I/O nodes dramatically increased the speed of read/write file operations (10x) and I/O database operations (50x).

Source: Michael Baitaluk (UCSD). Used by permission. 2013

Structural response of bone to stress (vSMP)

The goal of the simulations is to analyze how small variances in boundary conditions affect high strain regions in the model. The research goal is to understand the response of trabecular bone to mechanical stimuli. This has relevance for paleontologists to infer habitual locomotion of ancient people and animals, and in treatment strategies for populations with fragile bones such as the elderly.

• 5 million quadratic, 8-noded elements
• Model created with custom Matlab application that converts 253 microCT images into voxel-based finite element models

Source: Matthew Goff, Chris Hernandez (Cornell University). Used by permission. 2012

Predictive analytics graduate class (dedicated I/O node)

For operations on large data sets, the amount of time spent moving data between levels of the storage and memory hierarchy often dwarfs the computation time. In these cases, it can be more efficient to move the software to the data rather than the traditional approach of moving data to the software. Distributed computing frameworks such as Hadoop take advantage of this new paradigm.

Yoav Freund used an I/O node to build a Hadoop cluster for use by a graduate-level class in distributed computing. The HDFS resides on the SSDs to enable rapid data access.

Student projects included
• Analysis of temperature and airflow sensors on UCSD campus
• Detection of failures in the Internet
• Prediction of medical costs in car accidents
• Registration of brain images
• Deciphering RNA regulatory code
• Analysis of election day Tweets

Figure: MapReduce framework and coronal mouse brain stem image
Source: Yuncong Chen, Velu Ganapath and Yoav Freund (UCSD)
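
The MapReduce pattern that Hadoop implements can be illustrated with a small single-process sketch. This is not Hadoop's API and not the students' code: the function names, the sensor identifiers and the readings are all hypothetical, chosen to loosely echo the temperature-sensor project listed above. In a real Hadoop job the map tasks would run on the nodes that hold the data blocks, which is what "moving the software to the data" refers to.

```python
from collections import defaultdict


def map_reading(record):
    """Map step: emit (key, value) pairs from one input record."""
    sensor_id, temperature = record
    yield sensor_id, temperature


def reduce_mean(sensor_id, temperatures):
    """Reduce step: combine all values for one key into a single result."""
    return sensor_id, sum(temperatures) / len(temperatures)


def run_mapreduce(records):
    """Run map, group by key (the 'shuffle'), then reduce, in one process."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_reading(record):
            groups[key].append(value)
    return dict(reduce_mean(key, values) for key, values in groups.items())


# Hypothetical sensor readings: (sensor id, temperature in degrees C).
readings = [("sensor_a", 20.0), ("sensor_b", 25.0), ("sensor_a", 22.0)]
mean_by_sensor = run_mapreduce(readings)
print(mean_by_sensor)
```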

The Gordon system enables science breakthroughs in important ways

• Solid state drives (SSDs) provide fast scratch space for applications that read/write large amounts of temporary data.
• Dedicated I/O nodes are vital to projects requiring fast, persistent access to multi-terabyte data sets.
• 64 GB DRAM, Xeon E5 compute nodes support applications requiring both significant compute resources and relatively large memory.
• vSMP nodes provide large, logical shared memory and large core counts. Useful for big memory and highly scalable threaded apps.
• Expert support helps users effectively use Gordon’s novel features and make the transition from serial to parallel or workstation to HPC.

Thank you! Dankeschön
firstname.lastname@example.org