From the Heroic to the Logistical: Programming Model Implications of New Supercomputing Applications
Ian Foster, Computation Institute, Argonne National Laboratory & University of Chicago
Abstract
High-performance computers such as the petascale systems being installed at DOE and NSF centers in the US are conventionally focused on “heroic” computations in which many processors are applied to a single task. Yet a growing number of science applications are equally concerned with “logistical” issues: that is, with the high-performance and reliable execution of many tasks that operate on large shared data and/or are linked by communication-intensive producer-consumer relations. Such applications may require the extreme computational capacity and specialized communication fabrics of petascale computers, but are not easily expressed using conventional parallel programming models such as MPI. To enable the use of high-performance computers for these applications, we need new methods for the efficient dispatch, coupling, and management of large numbers of communication-intensive tasks. I discuss how work on scripting languages, high-throughput computing, and parallel I/O can be combined to build new tools that enable the efficient and reliable execution of applications involving from hundreds to millions of uniprocessor and multiprocessor tasks, with aggregate communication requirements of tens of gigabytes per second. I illustrate my presentation by referring to our experiences adapting the Swift parallel programming system (www.ci.uchicago.edu/swift) for efficient execution in both large-scale grid and petascale cluster environments.
What will we do with 1+ Exaflops and 1M+ cores?
Or, If You Prefer, A Worldwide Grid (or Cloud): EGEE
1) Tackle Bigger and Bigger Problems: Computational Scientist as Hero
2) Tackle More Complex Problems: Computational Scientist as Logistics Officer
“More Complex Problems”
- Ensemble runs to quantify climate model uncertainty
- Identify potential drug targets by screening a database of ligand structures against target proteins
- Study economic model sensitivity to parameters
- Analyze turbulence dataset from many perspectives
- Perform numerical optimization to determine optimal resource assignment in energy problems
- Mine collections of data from advanced light sources
- Construct databases of computed properties of chemical compounds
- Analyze data from the Large Hadron Collider
- Analyze log data from 100,000-node parallel computations
Programming Model Issues
- Massive task parallelism
- Massive data parallelism
- Integrating black box applications
- Complex task dependencies (task graphs)
- Failure, and other execution management issues
- Data management: input, intermediate, output
- Dynamic computations (task graphs)
- Dynamic data access to large, diverse datasets
- Long-running computations
- Documenting provenance of data products
(A small code sketch illustrating several of these issues follows.)
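To make the first few issues concrete, here is a minimal Python sketch (plain concurrent.futures, not Swift or Karajan) of many independent application tasks followed by a dependent analysis stage, with a simple retry to mask transient failures. The ./simulate and ./analyze commands are hypothetical placeholders, not tools from the talk.

    from concurrent.futures import ThreadPoolExecutor
    import subprocess

    def run_app(cmd, retries=2):
        # Run a black-box application as one task; retry to mask transient failures.
        for _ in range(retries + 1):
            if subprocess.run(cmd, shell=True).returncode == 0:
                return
        raise RuntimeError("task failed: " + cmd)

    params = range(1000)
    with ThreadPoolExecutor(max_workers=64) as pool:
        # Massive task parallelism: one independent simulation per parameter value.
        list(pool.map(lambda p: run_app(f"./simulate --param {p} --out sim_{p}.dat"), params))
        # Task dependency: the analysis stage consumes the simulation outputs.
        # (Swift-like systems track per-file dependencies instead of this coarse stage barrier.)
        list(pool.map(lambda p: run_app(f"./analyze sim_{p}.dat --out an_{p}.dat"), params))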
Problem Types, arranged by number of tasks (1 to 1M) and input data size (low to high):
- Heroic MPI tasks (single task)
- Data analysis, mining (much data)
- Many loosely coupled tasks (many tasks)
- Much data and complex tasks (many tasks, much data)
An Incomplete and Simplistic View of Programming Models and Tools
- Single task, modest data: MPI, etc., etc., etc.
- Many tasks: DAGMan+Pegasus, Karajan+Swift
- Much data: MapReduce/Hadoop, Dryad
- Complex tasks, much data: Dryad, Pig, Sawzall, Swift+Falkon
Many Tasks: Climate Ensemble Simulations (using FOAM, 2005)
Image courtesy Pat Behling and Yun Liu, UW Madison
- NCAR computer + grad student: 160 ensemble members in 75 days
- TeraGrid + “Virtual Data System”: 250 ensemble members in 4 days
Many Many Tasks: Identifying Potential Drug Targets
2M+ ligands x protein target(s)
(Mike Kubal, Benoit Roux, and others)
Drug-screening workflow, per protein target:
- Inputs: one PDB protein description (~1 MB), manually prepared into DOCK6 and FRED receptor files (one per protein; each defines the pocket to bind to), and ~2M ligand 3-D structures from ZINC (~6 GB).
- Docking: DOCK6 and FRED score every ligand against the receptor (~4M tasks x 60 s x 1 CPU, ~60K CPU-hours); each selects the best ~5K complexes.
- Amber scoring: BuildNABScript generates a NAB script from a template plus parameters (flexible residues, #MD steps); then AmberizeLigand, AmberizeReceptor, AmberizeComplex, and RunNABScript (~10K tasks x 20 min x 1 CPU, ~3K CPU-hours); select the best ~500.
- GCMC: ~500 tasks x 10 hr x 100 CPUs, ~500K CPU-hours.
- Totals for one target: ~4 million tasks, ~500,000 CPU-hours (~50 CPU-years).
(A rough code sketch of this funnel follows.)
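The funnel structure can be summarized in a few lines. Below is a hedged, sequential Python sketch of the pattern only; dock_score, amber_score, and gcmc_free_energy are hypothetical stand-ins for the DOCK6/FRED, Amber, and GCMC runs, and in the real campaign each list comprehension is fanned out as many parallel tasks (with both DOCK6 and FRED contributing survivors to the Amber stage).

    def screen_target(receptor, ligands):
        # Stage 1: fast docking of every ligand (~4M short tasks in the full campaign).
        docked = [(lig, dock_score(receptor, lig)) for lig in ligands]
        best_5k = [lig for lig, _ in sorted(docked, key=lambda x: x[1])[:5000]]
        # Stage 2: more expensive Amber rescoring of the survivors (~10K x 20 min tasks).
        rescored = [(lig, amber_score(receptor, lig)) for lig in best_5k]
        best_500 = [lig for lig, _ in sorted(rescored, key=lambda x: x[1])[:500]]
        # Stage 3: GCMC free-energy runs on the final candidates (~500 long, 100-CPU tasks).
        return [(lig, gcmc_free_energy(receptor, lig)) for lig in best_500]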
DOCK on SiCortex
- CPU cores: 5,760
- Tasks: 92,160
- Elapsed time: 12,821 s
- Compute time: 1.94 CPU-years
- Average task time: 660.3 s (does not include ~800 s to stage input data)
(Ioan Raicu, Zhao Zhang)
DOCK on BG/P: ~1M Tasks on 118,000 CPUs
- CPU cores: 118,784
- Tasks: 934,803
- Elapsed time: 7,257 s
- Compute time: 21.43 CPU-years
- Average task time: 667 s
- Relative efficiency: 99.7% (scaling from 16 to 32 racks)
- Utilization: 99.6% sustained, 78.3% overall
Per-task traffic to GPFS: 1 script (~5 KB), 2 file reads (~10 KB), 1 file write (~10 KB). Cached in RAM from GPFS on the first task per node: 1 binary (~7 MB) and static input data (~45 MB).
(Ioan Raicu, Zhao Zhang, Mike Wilde; a sketch of the per-node caching pattern follows.)
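The GPFS/RAM split above suggests a simple per-node caching pattern: the first task on each node copies the binary and static input from the shared file system to node-local memory, and later tasks read only their small per-task files from GPFS. The Python sketch below illustrates that pattern under those assumptions; the paths and file names are hypothetical, and a production version would add per-node locking.

    import os, shutil

    GPFS_DIR  = "/gpfs/dock_app"          # slow, shared file system (hypothetical path)
    LOCAL_DIR = "/dev/shm/dock_cache"     # fast, node-local RAM disk (hypothetical path)

    def ensure_local_copy():
        # First task on this node pays the copy cost; later tasks reuse the cache.
        # (No locking here: a real implementation must guard against concurrent copies.)
        if not os.path.isdir(LOCAL_DIR):
            os.makedirs(LOCAL_DIR, exist_ok=True)
            shutil.copy(os.path.join(GPFS_DIR, "dock6"), LOCAL_DIR)    # ~7 MB binary
            shutil.copytree(os.path.join(GPFS_DIR, "static_input"),
                            os.path.join(LOCAL_DIR, "static_input"))   # ~45 MB static data
        return LOCAL_DIR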
Managing 120K CPUs with Falkon: combining slower shared storage with high-speed local disk.
MARS Economic Model Parameter Study
- 2,048 BG/P CPU cores
- Tasks: 49,152
- Micro-tasks: 7,077,888
- Elapsed time: 1,601 s
- CPU hours: 894
(Zhao Zhang, Mike Wilde; a micro-task bundling sketch follows.)
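The task/micro-task split above (7,077,888 micro-tasks in 49,152 tasks, i.e. 144 per task) suggests that many very short model evaluations are bundled into each dispatched task to amortize dispatch overhead. The Python fragment below illustrates only that bundling idea; run_mars_case and all_params are hypothetical names, not the actual MARS driver.

    BUNDLE = 144   # micro-tasks per dispatched task, as implied by the counts above

    def bundled_task(micro_params):
        # Runs as a single task on one core, looping over its bundle of model cases.
        return [run_mars_case(p) for p in micro_params]

    bundles = [all_params[i:i + BUNDLE] for i in range(0, len(all_params), BUNDLE)]
    # Each bundle is then submitted as one task to the pool of 2,048 BG/P cores.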
AstroPortal Stacking Service
- Purpose: on-demand “stacks” of random locations within an ~10 TB dataset
- Challenge: rapid access to 10-10K “random” files, under time-varying load
- Sample workloads: Sloan data, accessed via a web page or web service
AstroPortal Stacking Service with Data Diffusion
- Aggregate throughput: 39 Gb/s, 10x higher than GPFS
- Reduced load on GPFS: 0.49 Gb/s, 1/10 of the original load
- Big performance gains as locality increases
(Ioan Raicu, 11:15am TOMORROW; a minimal sketch of the caching idea follows.)
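The gains reported above come from caching data on compute nodes and dispatching tasks where their inputs already reside. The following Python sketch is a minimal illustration of that data-diffusion idea, not Falkon's actual scheduler or API; fetch_from_gpfs and stack_on_node are hypothetical helpers.

    from collections import defaultdict

    node_cache = defaultdict(set)   # node id -> names of files cached on that node

    def pick_node(task_files, nodes):
        # Locality-aware dispatch: prefer the node that already caches the most inputs.
        return max(nodes, key=lambda n: len(node_cache[n] & set(task_files)))

    def run_stack(task_files, nodes):
        node = pick_node(task_files, nodes)
        for f in task_files:
            if f not in node_cache[node]:
                fetch_from_gpfs(f, node)       # pull a missing file from shared storage...
                node_cache[node].add(f)        # ...so it "diffuses" into the node's cache
        return stack_on_node(node, task_files) # co-add the cutouts on that node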
B. Berriman, J. Good (Caltech) J. Jacob, D. Katz (JPL)
Montage Benchmark (Yong Zhao, Ioan Raicu, U.Chicago)
- MPI: ~950 lines of C for one stage
- Pegasus: ~1200 lines of C + tools to generate a DAG for a specific dataset
- SwiftScript: ~92 lines for any dataset
(A rough analogue of the dataset-independent structure follows.)
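One reason the SwiftScript version stays short is that its stages are mapped over whatever input files exist, rather than being generated for a specific dataset. The fragment below is a rough Python analogue of that dataset-independent structure, not actual SwiftScript or the Montage benchmark code; run() is a hypothetical wrapper around the Montage reprojection and co-addition tools.

    import glob

    def mosaic(image_dir, header):
        raws = glob.glob(image_dir + "/*.fits")                   # works for any dataset size
        projected = [run("mProjectPP", r, header) for r in raws]  # fan-out: reproject each image
        return run("mAdd", projected, header)                     # gather: co-add into one mosaic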
Summary
- Peta- and exa-scale computers enable us to tackle new problems at greater scales: parameter studies, ensembles, interactive data analysis, “workflows” of various kinds
- Such apps frequently stress petascale hardware and software in interesting ways
- New programming models and tools are required: mixed task/data parallelism, task management, complex data management, failure handling, …
- Tools (DAGMan, Swift, Hadoop, …) exist but need refinement
- Interesting connections to distributed systems
More info: www.ci.uchicago.edu/swift
 

Editor's Notes

  • #2 Ken Wilson observed that computational science is a third mode of enquiry, in addition to experiment and theory. My theme is rather how, by taking a systems view of the knowledge generation process, we can identify ways in which computation can accelerate it.