Extreme scripting and other adventures in data-intensive computing

A talk presented at an NSF Workshop on Data-Intensive Computing, July 30, 2009.

Data analysis in many scientific laboratories is performed via a mix of standalone analysis programs, often written in languages such as Matlab or R, and shell scripts, used to coordinate multiple invocations of these programs. These programs and scripts all run against a shared file system that is used to store both experimental data and computational results.

While superficially messy, this approach is highly popular and surprisingly effective, thanks to its flexibility and simplicity. However, continued exponential growth in data volumes is leading to a crisis of sorts in many laboratories. Workstations and file servers, even local clusters and storage arrays, are no longer adequate. Users also struggle with the logistical challenges of managing growing numbers of files and computational tasks. In other words, they face the need to engage in data-intensive computing.

We describe the Swift project, which seeks not to replace this scripting approach but to scale it, from the desktop to larger clusters and ultimately to supercomputers. Motivated by applications in the physical, biological, and social sciences, we have developed methods for specifying parallel scripts that operate on large amounts of data, and for executing those scripts efficiently and reliably on different computing systems. A particular focus of this work is on methods for implementing, in an efficient and scalable manner, the Posix file system semantics that underpin scripting applications. These methods have allowed us to run applications unchanged on workstations, clusters, infrastructure-as-a-service ("cloud") systems, and supercomputers, and to scale applications from a single workstation to a 160,000-core supercomputer.
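
To give a flavor of what such scripts look like, here is a minimal sketch in Swift notation, loosely modeled on the AIRSN example reproduced in the slide transcript below. The executable name (analyze), the mapper patterns, and the file names are illustrative assumptions, not taken from the talk; the sketch simply applies one program to every matching input file, with each invocation free to run in parallel:

    type file;

    // Wrap an arbitrary executable as a Swift "app".
    // "analyze" is a hypothetical program name used only for illustration.
    app (file o) analyze (file i) {
      analyze @i @o;
    }

    // Map existing input files on the shared file system to an array,
    // and map one output file per input.
    file inputs[]  <filesys_mapper; pattern="*.in">;
    file outputs[] <simple_mapper; prefix="result.">;

    // Iterations are independent, so the runtime is free to execute them
    // in parallel on a workstation, a cluster, or a supercomputer.
    foreach f, i in inputs {
      outputs[i] = analyze(f);
    }

Because the script names files and programs rather than hosts or queues, the same logic can, in principle, be retargeted from a workstation to a cluster or supercomputer without modification.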

Swift is one of a variety of projects in the Computation Institute that seek individually and collectively to develop and apply software architectures and methods for data-intensive computing. Our investigations seek to treat data management and analysis as an end-to-end problem. Because interesting data often has its origins in multiple organizations, a full treatment must encompass not only data analysis but also issues of data discovery, access, and integration. Depending on context, data-intensive applications may have to compute on data at its source, move data to computing, operate on streaming data, or adopt some hybrid of these and other approaches.

Thus, our projects span a wide range, from software technologies (e.g., Swift, the Nimbus infrastructure as a service system, the GridFTP and DataKoa data movement and management systems, the Globus tools for service oriented science, the PVFS parallel file system) to application-oriented projects (e.g., text analysis in the biological sciences, metagenomic analysis, image analysis in neuroscience, information integration for health care applications, management of experimental data from X-ray sources, diffusion tensor imaging for computer aided diagnosis), and the creation and operation of national-scale infrastructures, including the Earth System Grid (ESG), cancer Biomedical Informatics Grid (caBIG), Biomedical Informatics Research Network (BIRN), TeraGrid, and Open Science Grid (OSG).
For more information, please see www.ci.uchicago.edu/swift.

Speaker notes:

  • Another perspective on the problem, with a few words of explanation. If we are deploying a hospital IT system, we are (hopefully) in the bottom left-hand corner. “You can’t achieve success via central planning” (quoted in Crossing the Quality Chasm, p. 312). In our scenarios, we don’t have that ability to control.
  • What is the alternative? We can put in place mechanisms that help groups with some common goal to form and function. Over time things change and these groups evolve; if we are successful, they can expand and perhaps merge. The challenges: make this easy, and leverage scale effects.
  • Principles and mechanisms that have been under development for some years: first computer science, then the physical sciences, then biology, and most recently biomedicine.

    1. Extreme scripting and other adventures in data-intensive computing
       Ian Foster
       Allan Espinosa, Ioan Raicu, Mike Wilde, Zhao Zhang
       Computation Institute, Argonne National Lab & University of Chicago
    2. How data analysis happens at data-intensive computing workshops
    3. How data analysis really happens in scientific laboratories
       % foo file1 > file2
       % bar file2 > file3
       % foo file1 | bar > file3
       % foreach f (f1 f2 f3 f4 f5 f6 f7 … f100)
       foreach? foo $f.in | bar > $f.out
       foreach? end
       %
       % Now where on earth is f98.out, and how did I generate it again?
       Now: command not found.
       %
    4. Extreme scripting: Swift aims to take us from simple scripts on small computers to complex scripts (many activities, numerous files, complex data, data dependencies, many programs) on big computers (many processors, storage hierarchy, failure, heterogeneity), while preserving file system semantics and the ability to call arbitrary executables.
    5. Functional magnetic resonance imaging (fMRI) data analysis
    6. AIRSN program definition
       (Run or) reorientRun (Run ir, string direction) {
         foreach Volume iv, i in ir.v {
           or.v[i] = reorient(iv, direction);
         }
       }
       (Run snr) functional ( Run r, NormAnat a, Air shrink ) {
         Run yroRun = reorientRun( r, "y" );
         Run roRun = reorientRun( yroRun, "x" );
         Volume std = roRun[0];
         Run rndr = random_select( roRun, 0.1 );
         AirVector rndAirVec = align_linearRun( rndr, std, 12, 1000, 1000, "81 3 3" );
         Run reslicedRndr = resliceRun( rndr, rndAirVec, "o", "k" );
         Volume meanRand = softmean( reslicedRndr, "y", "null" );
         Air mnQAAir = alignlinear( a.nHires, meanRand, 6, 1000, 4, "81 3 3" );
         Warp boldNormWarp = combinewarp( shrink, a.aWarp, mnQAAir );
         Run nr = reslice_warp_run( boldNormWarp, roRun );
         Volume meanAll = strictmean( nr, "y", "null" );
         Volume boldMask = binarize( meanAll, "y" );
         snr = gsmoothRun( nr, boldMask, "6 6 6" );
       }
    7. Many, many tasks: identifying potential drug targets
       2M+ ligands × protein target(s)
       Benoit Roux et al.
    8. Workflow for one protein target (PDB protein description, ~1 MB; ZINC 3-D structures, 2M structures, 6 GB): FRED and DOCK6 docking (~4M tasks × 60 s × 1 cpu ≈ 60K cpu-hrs), each against a manually prepared receptor file that defines the pocket to bind to; select best ~5K; Amber scoring (AmberizeLigand, AmberizeReceptor, AmberizeComplex, then generate and run a NAB script; ~10K tasks × 20 min × 1 cpu ≈ 3K cpu-hrs); select best ~500; GCMC (~500 tasks × 10 hr × 100 cpu ≈ 500K cpu-hrs). For 1 target: 4 million tasks, 500,000 cpu-hrs (50 cpu-years).
    9. (image-only slide)
    10. IBM BG/P: 570 Teraflop/s, 164,000 cores, 80 TB
    11. DOCK on BG/P: ~1M tasks on 119,000 CPUs
        118,784 cores; 934,803 tasks; elapsed time: 7,257 s; compute time: 21.43 CPU-years; average task: 667 s
        Relative efficiency 99.7% (from 16 to 32 racks); utilization 99.6% sustained, 78.3% overall
        Ioan Raicu et al.
    12. Managing 160,000 cores with Falkon, using high-speed local “disk” and slower shared storage
    13. Scaling Posix to petascale (architecture diagram): global file system; staging over the torus and tree interconnects; a CN-striped intermediate file system (IFS) built from compute nodes, with Chirp (multicast) and MosaStore (striping); ZOID on the I/O node; compute nodes with local file systems (LFS) holding local datasets
    14. Efficiency for 4-second tasks and varying data size (1 KB to 1 MB) for CIO and GPFS, up to 32K processors
    15. Provisioning for data-intensive workloads
        Example: on-demand “stacking” of arbitrary locations within a ~10 TB Sloan sky survey dataset
        Challenges: random data access, much computing, time-varying load
        Solution: dynamic acquisition of compute & storage; data diffusion
        Ioan Raicu
    16. “Sine” workload, 2M tasks, 10MB:10ms ratio, 100 nodes, GCC policy, 50GB caches/node (Ioan Raicu)
    17. Same scenario, but with dynamic resource provisioning
    18. Data diffusion sine-wave workload: Summary
        GPFS:    5.70 hrs,  ~8 Gb/s, 1138 CPU hrs
        DD+SRP:  1.80 hrs, ~25 Gb/s,  361 CPU hrs
        DD+DRP:  1.86 hrs, ~24 Gb/s,  253 CPU hrs
    19. Data-intensive computing @ Computation Institute: Example applications
        Astrophysics, cognitive science, East Asian studies, economics, environmental science, epidemiology, genomic medicine, neuroscience, political science, sociology, solid state physics
    20. (image-only slide)
    21. Sequencing outpaces Moore’s law: gigabases produced by 454, Solexa, and next-generation Solexa sequencers, plotted against the US$ cost of BLAST analysis on EC2 (Folker Meyer, Computation Institute)
    22. Data-intensive computing @ Computation Institute: Hardware
        PADS: Petascale Active Data Store (NSF MRI), providing 180 TB at 180 GB/s with 17 Top/s of analysis, 500 TB reliable storage (data & metadata), 1000 TB tape backup, dynamic provisioning, parallel analysis, data ingest from diverse data sources, remote access for diverse users, and offload to remote data centers
    23. Data-intensive computing @ Computation Institute: Software
        HPC systems software (MPICH, PVFS, ZeptoOS)
        Collaborative data tagging (GLOSS)
        Data integration (XDTM)
        HPC data analytics and visualization
        Loosely coupled parallelism (Swift, Hadoop)
        Dynamic provisioning (Falkon)
        Service authoring (Introduce, caGrid, gRAVI)
        Provenance recording and query (Swift)
        Service composition and workflow (Taverna)
        Virtualization management (Workspace Service)
        Distributed data management (GridFTP, etc.)
    24. Data-intensive computing is an end-to-end problem
        (Stacey diagram: agreement about outcomes vs. certainty about outcomes, spanning “plan and control”, the “zone of complexity”, and “chaos”. Ralph Stacey, Complexity and Creativity in Organizations, 1996)
    25. We need to function in the zone of complexity
        (Same Stacey diagram)
    26. The Grid paradigm: principles and mechanisms for dynamic virtual organizations
        Leverage service oriented architecture; loose coupling of data and services; open software, architecture
        Adoption across engineering, biomedicine, computer science, physics, healthcare, astronomy, and biology, 1995-2010
    27. As of Oct 19, 2008: 122 participants, 105 services (70 data, 35 analytical)
    28. Multi-center clinical cancer trials: image capture and review (Center for Health Informatics)
    29. Summary
        Extreme scripting offers the potential for easy scaling of proven working practices
        Interesting technical problems relating to programming and I/O models
        Many wonderful applications
        Data-intensive computing is an end-to-end problem
        Data generation, integration, analysis, etc., is a continuous, loosely coupled process
    30. Thank you!
        Computation Institute, www.ci.uchicago.edu
