Many-Task Computing Tools forMultiscale ModelingDaniel S. Katz (d.katz@ieee.org)Senior Fellow, Computation Institute (Univ...
Founded in 1892 as an institutional “communityof scholars,” where all fields and disciplines meetand collaborate          ...
Computation Institute•   Mission: address the most challenging problems arising in the use of strategic    computation and...
Multiscale Modeling                          www.ci.anl.gov4                          www.ci.uchicago.edu
Multiscale•   The world is multiscale•   In modeling, a common challenge is    determining the correct scale to capture a ...
Material Science, Methods and ScalesCredit: M. Stan, Materials Today, 12 (2009) 20-28                                     ...
Modeling nuclear fuel rods                  Fuel Element Model (FEM)                                                      ...
Sequential (information passing, hand-shaking, bridging)                             ?                          R. Devanat...
Coupling methods•   We often know how to solve a part of the problem with sufficient accuracy, but when    we combine mult...
More on coupling              Tight               Loose                               Model1      File     Model1    Msg  ...
Coupling Models                    Told , kold , tnew       Finite Element (fuel element, solve for T)                    ...
Swift (for Loose Coupling)                             www.ci.anl.gov12                             www.ci.uchicago.edu
Workflow Example: Protein Structure Prediction     predict()                                      …                       ...
Swift•    Portable workflows – deployable on many     resources•    Fundamental script elements are external     processes...
Portability:dynamic development and execution•     Separate workflow description from resource and      component implemen...
Swift scripts•    C-like syntax, also Python prototype•    Supports file/task model directly in the language        type f...
Data flow and natural concurrency•     Provide natural concurrency through automatic      data flow analysis and task sche...
Variables, Tasks, Files, Concurrency•    Variables are single assignment futures     –   Unassigned variables are open•   ...
Execution model• In a standard Swift workflow, each task must enumerate its input and  output files• These files are shipp...
Performance and Usage•    Swift is fast     – Uses Karajan (in Java CoG) as powerful, efficient,       scalable, and flexi...
nSim (1000) predict()s                                           …                                         T1af7        T1...
Powerful parallel prediction loops in Swift1.    Sweep( )2.    {3.    intnSim = 1000;4.    intmaxRounds = 3;5.      Protei...
Swift                                  Data server                 File                                                   ...
Swift summary•    Structures and arrays of data•    Typed script variables intermixed with     references to file data•   ...
Using Swift for Loose           Coupling                             www.ci.anl.gov25                             www.ci.u...
Multiscale Molecular Dynamics  •   Problem: many systems are too large to solve      using all-atom molecular dynamics (MD...
Building a CG model – initial data processing  •   Stage 0 – run multiple short-duration trajectories of all-      atom MD...
Building CG model – covariance matrix  •   Stage 2 – join trajectory files together into ascii file       –   Requires tra...
Building a CG model – CG mapping  •   Stage 4 – for a given number of sites (#sites), find best      mapping for each traj...
Building a CG model – finding #sites  •   Stage 6 – determine #sites       –   Estimate best #sites (b#sites) from slope/i...
Bio workflow: AA->CG MD                Stage 0:              Stage 1                                       Stage          ...
Multiscale?  • So far, this isn’t really multiscale  • It has just used fine grain information to build the    best coarse...
NSF Center for Chemical Innovation Phase I award: "Centerfor Multiscale Theory and Simulation”•    ... development of a no...
Actin at multiple scales                         Actin filament (F-Actin) – complex of G-Actins                           ...
Geophysics Application  •   Subsurface flow model  •   Couples continuum and pore scale simulations       –   Continuum mo...
Subsurface Hybrid Model WorkflowCredit: Karen Schuchardt , Bruce Palmer, KhushbuAgarwal, Tim Scheibe, PNNL                ...
Swift code                                               //Hybrid Model                                               (fil...
Towards Tighter Coupling                            www.ci.anl.gov38                            www.ci.uchicago.edu
More on coupling              Tight               Loose                               Model1      File     Model1    Msg  ...
More on coupling•    Message vs. file issues:      –   Performance: overhead involved in writing to disk vs. keeping in me...
Work in progress towards tighter coupling inSwift: Collective Data Management (CDM)•    Data transfer mechanism is: transf...
CDM examples•    Broadcast an input data set to workers     – On Open Science Grid, just send it to the shared file       ...
Work in progress towards tighter coupling after Swift: ExM(Many-task computing on extreme-scale systems)•    Deploy Swift ...
Increased coverage of scripting                                  ExM                Swift     Many executables w/ driver  ...
Conclusions•    Multiscale modeling is important now, and use     will grow•    Can think of multiscale modeling instances...
Acknowledgments• Swift is supported in part by NSF grants OCI-721939, OCI-0944332, and  PHY-636265, NIH DC08638, DOE and U...
Upcoming SlideShare
Loading in...5
×

Multiscale Modeling

903

Published on

presented at RPI workshop on Data-Driven Multiscale Modeling of Complex Systems

see also: http://arxiv.org/abs/1110.0404

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
903
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
17
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide
  • Microstructure model (phase-field) simulates the accumulation and transport of fission products and the evolution of Hegas bubble microstructures in nuclear fuels.The effective thermal conductivity strongly depends on the bubble volume fraction, but weakly on the morphology of the bubbles.
  • nSim – number of Monte Carlo simulationsPredict() – a single montecarloprediction of a protein structureAnalyze() – analyzes a bunch of protein structuresResults of Analyze() need to be fed into next RoundmaxRounds is used in the ... code
  • nSim – number of Monte Carlo simulationsPredict() – a single montecarloprediction of a protein structureAnalyze() – analyzes a bunch of protein structuresResults of Analyze() need to be fed into next RoundmaxRounds is used in the ... code
  • Why not just use AA_MD?Biologically relevant timescales are millisecond or seconds or minutes, while MD simulations provide trajectories with the length of tens or hundreds nanoseconds (there was a pioneer work this year, with a millisecond all-atom MD simulation, but for a very small system). CG models is a way to compensate this difference.
  • (a) is a single monomer of actin (so-called G-actin). This picture symbolizes all-atom representation. (I know, you can't see atoms there, only helices and sheets, but if we draw all atoms there, it will be just a heap of dots and lines).(b) is the actin filament - a linear complex formed by repetition of the above-mentioned monomers. For example, the green fragment is in fact one molecule such as shown in (a), the yellow fragment is another copy of the same molecule from (a), and so on.(c) is the coarse-grained model, with 4 CG sites per each monomer from (b). The beads are the centers of masses of CG sites, and the lines connecting the beads symbolize the terms in the potential energy function, e.g. in Elastic Network Models. (d) is a picture made by molecular biologists. I assume that the red and blue spots is a nucleus of a cell, and the cytoskeleton is shown in green. Every green filament is an actin filament, as shown on (b) and (c). To sum up, (a) refer to the all-atom description, (b) and (c) - to a CG model, (d) - to the mesoscopic level.
  • Multiscale Modeling

    1. 1. Many-Task Computing Tools forMultiscale ModelingDaniel S. Katz (d.katz@ieee.org)Senior Fellow, Computation Institute (Universityof Chicago & Argonne National Laboratory) www.ci.anl.gov www.ci.uchicago.edu
    2. 2. Founded in 1892 as an institutional “communityof scholars,” where all fields and disciplines meetand collaborate www.ci.anl.gov www.ci.uchicago.edu
    3. 3. Computation Institute• Mission: address the most challenging problems arising in the use of strategic computation and communications• Joint Argonne/Chicago institute, ~100 Fellows (~50 UChicago faculty) & ~60 staff• Primary goals: – Pursue new discoveries using multi-disciplinary collaborations and computational methods – Develop new computational methods and paradigms required to tackle these problems, and create the computational tools required for the effective application of advanced methods at the largest scales – Educate the next generation of investigators in the advanced methods and platforms required for discovery www.ci.anl.gov3 www.ci.uchicago.edu
    4. 4. Multiscale Modeling www.ci.anl.gov4 www.ci.uchicago.edu
    5. 5. Multiscale• The world is multiscale• In modeling, a common challenge is determining the correct scale to capture a phenomenon of interest – In computer science, a parallel problem is describing a problem with the right level of abstraction o Capture the details you care about and ignore those you don’t• But multiple phenomena interact, often at different scales www.ci.anl.gov5 www.ci.uchicago.edu
    6. 6. Material Science, Methods and ScalesCredit: M. Stan, Materials Today, 12 (2009) 20-28 www.ci.anl.gov 6 www.ci.uchicago.edu
    7. 7. Modeling nuclear fuel rods Fuel Element Model (FEM) Microstructure Model (PF) 1 cm 10 mB. Mihaila, et al., J. Nucl. Mater., 394 (2009) 182-189 S.Y. Hu et al., J. Nucl. Mater. 392 (2009) 292–300 www.ci.anl.gov7 www.ci.uchicago.edu
    8. 8. Sequential (information passing, hand-shaking, bridging) ? R. Devanathan, et al., Energy Env. Sc., 3 (2010) 1406-1426 www.ci.anl.gov8 www.ci.uchicago.edu
    9. 9. Coupling methods• We often know how to solve a part of the problem with sufficient accuracy, but when we combine multiple parts of the problem at various scales, we need to couple the solution methods too• Must first determine the models to be run and how they iterate/interact• Coupling options – “Manual” coupling (sequential, manual) o Inputs to a code at one scale are influenced by study of the outputs of a previously run code at another scale o Coupling timescale: hours to weeks – “Loose” coupling (sequential, automated) between codes o Typically performed using workflow tools o Often in different memory spaces o Coupling timescale: minutes – “Tight” coupling (concurrent, automated) between codes o e.g., ocean-atmosphere-ice-bio o Typically performed using coupling methods (e.g., CCA), maybe in same memory space o Hard to develop, changes in one code may break the system o Coupling timescale: seconds• Boundary between options can be fuzzy• Choice often depends on how frequently the interactions are required, and how much work the codes do independently www.ci.anl.gov9 www.ci.uchicago.edu
    10. 10. More on coupling Tight Loose Model1 File Model1 Msg Model2 File Model2 www.ci.anl.gov10 www.ci.uchicago.edu
    11. 11. Coupling Models Told , kold , tnew Finite Element (fuel element, solve for T) T  CP k T Q t Ti ( r , ti ) Phase Field (evolve microstructure) Finite Element (microstructure, solve for k) 0 k (r , T ) T ki 1 ( r, Ti , ti 1 ) ki 1 ki No Yes k new No Ti 1 Ti Yes Tnew Tnew , knew , tnew Credit: Marius Stan, Argonne www.ci.anl.gov11 www.ci.uchicago.edu
    12. 12. Swift (for Loose Coupling) www.ci.anl.gov12 www.ci.uchicago.edu
    13. 13. Workflow Example: Protein Structure Prediction predict() … T1af7 T1b72 T1r69 analyze() Want to run: 10 proteins x 1000 simulations x 3 MC rounds x 2 temps x 5 deltas = 300K tasks www.ci.anl.gov13 www.ci.uchicago.edu
    14. 14. Swift• Portable workflows – deployable on many resources• Fundamental script elements are external processes and data files• Provides natural concurrency at runtime through automatic data flow analysis and task scheduling• Data structures and script operations to support scientific computing• Provenance gathered automatically www.ci.anl.gov14 www.ci.uchicago.edu
    15. 15. Portability:dynamic development and execution• Separate workflow description from resource and component implementations <sites.xml> rawdata = sim(settings); … stats = analysis(rawdata); … select resources write script allocate resources Execute (Swift) <tc.data> … define components www.ci.anl.gov15 www.ci.uchicago.edu
    16. 16. Swift scripts• C-like syntax, also Python prototype• Supports file/task model directly in the language type file; app (file output) sim(file input) { namd2 @input @output }• Historically, most tasks have been sequential applications, but don’t have to be www.ci.anl.gov16 www.ci.uchicago.edu
    17. 17. Data flow and natural concurrency• Provide natural concurrency through automatic data flow analysis and task schedulingfile o11 = sim(input1);file o12 = sim(input2);file m = exchange(o11, o12);file i21 = create(o11, m);file o21 = sim(i21);... input1 i21 sim o11 exchange create sim o21 m input2 o12 i22 sim create sim o22 www.ci.anl.gov17 www.ci.uchicago.edu
    18. 18. Variables, Tasks, Files, Concurrency• Variables are single assignment futures – Unassigned variables are open• Variables can represent files – When a file doesn’t exist, the variable is open – When a file exists, the variable is closed• All tasks found at runtime• Tasks with satisfied dependencies (closed variables) are run on whatever resources are available• These runs create files/variables that allow more tasks to run www.ci.anl.gov18 www.ci.uchicago.edu
    19. 19. Execution model• In a standard Swift workflow, each task must enumerate its input and output files• These files are shipped to and from the compute site compute copy inputs submit site return outputs• RPC-like technique, can use multiple queuing systems, data services, and execution environments • Uses abstractions for file transfer, job execution, etc. • Allows use of local systems (laptop, desktop), parallel systems (HPC), distributed systems (HTC, clouds) • Supports grid authentication mechanisms• Can use multi-level scheduling (Coasters) www.ci.anl.gov19 www.ci.uchicago.edu
    20. 20. Performance and Usage• Swift is fast – Uses Karajan (in Java CoG) as powerful, efficient, scalable, and flexible execution engine – Scaling close to 1M tasks; .5M in live science work, and growing• Swift usage is growing (~300 users in last year): – Applications in neuroscience, proteomics, molecular dynamics, biochemistry, climate, economics, statistics, astronomy, etc. www.ci.anl.gov20 www.ci.uchicago.edu
    21. 21. nSim (1000) predict()s … T1af7 T1b72 T1r69ItFix(p, nSim, maxRounds, st, dt) analyze(){ ...foreachsim in [1:nSim] { (structure[sim], log[sim]) = predict(p,st, dt); } result = analyze(structure) ...} www.ci.anl.gov21 www.ci.uchicago.edu
    22. 22. Powerful parallel prediction loops in Swift1. Sweep( )2. {3. intnSim = 1000;4. intmaxRounds = 3;5. Protein pSet[ ] <ext; exec="Protein.map">;6. float startTemp[ ] = [ 100.0, 200.0 ];7. float delT[ ] = [ 1.0, 1.5, 2.0, 5.0, 10.0 ];8. foreachp, pn in pSet {9. foreacht in startTemp {10. foreachd in delT {11. ItFix(p, nSim, maxRounds, t, d);12. }13. }14. }15. } 10 proteins x 1000 simulations x 3 rounds x 2 temps x 5 deltas = 300K tasks www.ci.anl.gov22 www.ci.uchicago.edu
    23. 23. Swift Data server File transport f1 f2 f3 Submithost (Laptop, Linux server, …) Clouds script ????? App App a1 a2 Compute nodes site app f1 list list Workflow status Provenance a1 and logs log f2 a2 f3 www.ci.anl.gov23 http://www.ci.uchicago.edu/swift/ www.ci.uchicago.edu
    24. 24. Swift summary• Structures and arrays of data• Typed script variables intermixed with references to file data• Natural concurrency• Integration with schedulers such as PBS, Cobalt, SGE, GT2, …• Advanced scheduling settings• A variety of useful workflows can be considered www.ci.anl.gov24 www.ci.uchicago.edu
    25. 25. Using Swift for Loose Coupling www.ci.anl.gov25 www.ci.uchicago.edu
    26. 26. Multiscale Molecular Dynamics • Problem: many systems are too large to solve using all-atom molecular dynamics (MD) models • Potential solution: coarse-grained (CG) models where each site represents multiple atoms • In order to do this, have to decide how to coarsen the model – How many sites are needed? – Which atoms are mapped to which sites? – What is the potential energy as a function of coordinates of those CG sites?Credit: Anton Sinitskiy, John Grime, Greg Voth, U. Chicago www.ci.anl.gov 26 www.ci.uchicago.edu
    27. 27. Building a CG model – initial data processing • Stage 0 – run multiple short-duration trajectories of all- atom MD simulation, e.g., using NAMD, capture dcd files – Can require large run time and memory, so run on TeraGrid system – Download (binary) dcd files to local resources for archiving – Remove light atoms (e.g., water, H) – Performed manually • Stage 1 – remove non α-Carbon atoms on a subset of the dcd files from each trajectory – Need to know how many steps were in each trajectory – not always what was planned, and final file may be corrupt, so some manual checking needed – Performed by fast Tcl scriptCredit: Anton Sinitskiy, John Grime, Greg Voth, U. Chicago www.ci.anl.gov 27 www.ci.uchicago.edu
    28. 28. Building CG model – covariance matrix • Stage 2 – join trajectory files together into ascii file – Requires trajectory length from previous stage – Performed by fast Tcl script • Stage 3 – generate covariance matrix for each trajectory – Find deviation of each atom from its average position across all time steps – Covariance matrix determines which atoms can be grouped into rigid bodies (roughly) – Performed by shell script that runs a compiled C code o Takes several hours per trajectoryCredit: Anton Sinitskiy, John Grime, Greg Voth, U. Chicago www.ci.anl.gov 28 www.ci.uchicago.edu
    29. 29. Building a CG model – CG mapping • Stage 4 – for a given number of sites (#sites), find best mapping for each trajectory – Pick 3 to 5 values for #sites that should cover the likely best value – For each #sites, can find χ² value for each mapping – Overall, want lowest χ² and corresponding mapping – Uses a group of random initial values and simulated annealing from each – Performed by shell script to launch compiled C code, O(50k) trials, takes several days on 100-1000 processors • Stage 5 – check χ² values for each trajectory – χ² vs. #sites on a log-log plot should be linear – Performed by script – If a point is not close to the line, it’s probably not a real minimum χ² for that #sites o Go back to Stage 4 – run more initial case to get a lower χ²Credit: Anton Sinitskiy, John Grime, Greg Voth, U. Chicago www.ci.anl.gov 29 www.ci.uchicago.edu
    30. 30. Building a CG model – finding #sites • Stage 6 – determine #sites – Estimate best #sites (b#sites) from slope/intercept of line in stage 5, and compare results of all trajectories – Performed by script – If results for each trajectory are different, trajectories didn’t sample enough of the phase space – go back to Stage 0 and run more/longer trajectories – If b#sites is outside the range of #sites that have been calculated, add to initial range and go back to Stage 4 – If b#site is inside the range, create a smaller range around b#sites and go back to Stage 4 o b#sites is an integer, so don’t have to do this too much – Outputs final b#sites and corresponding mapping • Stage 7 – building potential energy as function of site coordinates – Can be done by different methods, e.g., Elastic Network Models (ENM) o Currently under constructionCredit: Anton Sinitskiy, John Grime, Greg Voth, U. Chicago www.ci.anl.gov 30 www.ci.uchicago.edu
    31. 31. Bio workflow: AA->CG MD Stage 0: Stage 1 Stage Stage 11 Stage1:1: Stage remove Stage1:1: remove Stage 2:2: Stage Stage 3: Stage 2: AA_MD remove Stage remove Stage 2: Stage 2: build non-α-C remove non-α-C remove join trajs join join trajs and data non-α-C remove non-α-C join trajs join trajs covar atoms non-α-C atoms non-α-C trajects transfer atoms non-α-C atoms matrix atoms atoms atoms Stage 4: Stage 4: Stage 4: Stage 6: pick 3-5 pick 3-5 Stage 5: Stage 5: Stage 7: pick 3-5 Stage 5: find best #sites #sites check fit check fit find #sites check fit #sites and find and find of χ² of χ² potential and find of χ² and lowest χ² lowest χ² values values energy lowest χ² values mapping for each for each for eachCredit: Anton Sinitskiy, John Grime, Greg Voth, U. Chicago www.ci.anl.gov 31 www.ci.uchicago.edu
    32. 32. Multiscale? • So far, this isn’t really multiscale • It has just used fine grain information to build the best coarse grained model • But it’s a needed part of the process • Overall, can’t run AA_MD as much as desired. – Here, limited AA_MD simulations -> structural information for a rough CG model of the internal molecular structure – With rough CG model, user can parameterize interactions for CG "atoms" via targeted all-atom simulations -> determine average energies and forces etc. for the CG beads • Doing this automatically is a long-term goalCredit: Anton Sinitskiy, John Grime, Greg Voth, U. Chicago www.ci.anl.gov 32 www.ci.uchicago.edu
    33. 33. NSF Center for Chemical Innovation Phase I award: "Centerfor Multiscale Theory and Simulation”• ... development of a novel, powerful, and integrated theoretical and computational capability for the description of biomolecular processes across multiple and connected scales, starting from the molecular scale and ending at the cellular scale• Components: – A theoretical and computer simulation capability to describe biomolecular systems at multiple scales will be developed, includes atomistic, coarse-grained, and mesoscopic scales ... all scales will be connected in a multiscale fashion so that key information is passed upward in scale and vice-versa – Latest generation scalable computing and a novel cyberinfrastructure will be implemented – A high profile demonstration projects will be undertaken using the resulting theoretical and modeling advances which involves themultiscale modeling of the key biomolecular features of the eukaryotic cellular cytoskeleton (i.e., actin-based networks and associated proteins)• Core CCI team includes a diverse group of leading researchers at the University of Chicago from the fields of theoretical/computational chemistry, biophysics, mathematics, and computer science: – Gregory A. Voth (PI, Chemistry, James Franck Institute, Institute for Biophysical Dynamics, Computation Institute); Benoit Roux (co-PI, Biochemistry and Molecular Biology, Institute for Biophysical Dynamics); Nina SinghalHinrichs (co-PI, Computer Science and Statistics); Aaron Dinner (co-PI, Chemistry, James Franck Institute, Institute for Biophysical Dynamics); Karl Freed (co-PI, Chemistry, James Franck Institute, Institute for Biophysical Dynamics); Jonathan Weare (co-PI, Mathematics); Daniel S. Katz (Senior Personnel, Computation Institute) www.ci.anl.gov33 www.ci.uchicago.edu
    34. 34. Actin at multiple scales Actin filament (F-Actin) – complex of G-Actins Actin filament (F-Actin) – CG representation Single Actin monomer (G-Actin) – all-atom representation Actin in cytoskeleton – mesoscaleCredit: Greg Voth, U. Chicago representation www.ci.anl.gov 34 www.ci.uchicago.edu
    35. 35. Geophysics Application • Subsurface flow model • Couples continuum and pore scale simulations – Continuum model: exascale Subsurface Transport Over Multiple Phases (eSTOMP) o Scale: meter o Models full domain – Pore scale model: Smoothed Particle Hydrodynamics (SPH) o Scale: grains of soil (mm) o Models subset of domain as needed • Coupler codes developed – Pore Generator (PG) – adaptively decides where to run SPH and generates inputs for each run – Grid Parameter Generator (GPG) – uses outputs from SPH to build inputs for next eSTOMP iterationCredit: Karen Schuchardt , Bruce Palmer, KhushbuAgarwal, Tim Scheibe, PNNL www.ci.anl.gov 35 www.ci.uchicago.edu
    36. 36. Subsurface Hybrid Model WorkflowCredit: Karen Schuchardt , Bruce Palmer, KhushbuAgarwal, Tim Scheibe, PNNL www.ci.anl.gov 36 www.ci.uchicago.edu
    37. 37. Swift code //Hybrid Model (file simOutput) HybridModel (file input) {// Driver …file stompIn<"stomp.in">; stompOut = runStomp(input);iterate iter { (sphins, numsph) = pg(stompOut, sphinprefix); output = HybridModel(inputs[iter]); inputs[iter+1] = output; //Find number of pore scale runscapture_provenance(output); intn = @toint(readData(numsph)); foreachi in [1:@toint(n)] {} until(iter>= MAX_ITER); sphout[i]= runSph(sphins[i], procs_task) } simOutput = gpg(sphOutArr, n, sphout); }Credit: Karen Schuchardt , Bruce Palmer, KhushbuAgarwal, Tim Scheibe, PNNL www.ci.anl.gov 37 www.ci.uchicago.edu
    38. 38. Towards Tighter Coupling www.ci.anl.gov38 www.ci.uchicago.edu
    39. 39. More on coupling Tight Loose Model1 File Model1 Msg Model2 File Model2 www.ci.anl.gov39 www.ci.uchicago.edu
    40. 40. More on coupling• Message vs. file issues: – Performance: overhead involved in writing to disk vs. keeping in memory – Semantics: messages vs. Posix – Fault tolerance: file storage provides an automatic recovery mechanism – Synchronicity: messages can be sync/async, files must be async• Practical issues: – What drives the application? o Loose case: A driver script calls multiple executables in turns o Tight case: No driver, just one executable – What’s the cost of initialization? o Loose case: Executables initialized each time o Tight case: All executables exist at all times, only initialized once – How much can components be overlapped? o Loose case: If all components need the same number of resources, all resources can be kept busy all the time o Tight case: Components can be idle waiting for other components www.ci.anl.gov40 www.ci.uchicago.edu
    41. 41. Work in progress towards tighter coupling inSwift: Collective Data Management (CDM)• Data transfer mechanism is: transfer input, run, transfer output• Fine for single node systems, could be improved to take advantage of other system features, such as intermediate file system (or shared global file system on distributed sites)• Define I/O patterns (gather, scatter, broadcast, etc.) and build primitives for them• Improve support for shared filesystems on HPC resources• Make use of specialized, site-specific data movement features• Employ caching through the deployment of distributed storage resources on the computation sites• Aggregate small file operations into single larger operations www.ci.anl.gov41 www.ci.uchicago.edu
    42. 42. CDM examples• Broadcast an input data set to workers – On Open Science Grid, just send it to the shared file system of each cluster once, the let worker nodes copy it from there – On IBM BG/P, use intermediate storage on I/O nodes on each pset similarly• Gather an output data set – Rather than sending each job’s output, if multiple jobs are running on a node and sufficient jobs are already runnable, wait and bundle multiple output files, then transfer bundle www.ci.anl.gov42 www.ci.uchicago.edu
    43. 43. Work in progress towards tighter coupling after Swift: ExM(Many-task computing on extreme-scale systems)• Deploy Swift applications on exascale-generation systems• Distributed task (and function management) – Break the bottleneck of a single execution engine – Call functions, not just executables• JETS: Dynamically run multiple MPI tasks on an HPC resource – Allow dynamic mapping of workers to resources – Add resilience – allow mapping of workers to dynamic resources• MosaStore: intermediate file storage – Use files for message passing, but stripe them across RAMdisk on nodes (single distributed filesystemw/ shared namespace), backing store in shared file system, potentially cache in the middle• AME: intermediate file storage – Use files for message passing, but store them in RAMdisk on nodes where written (multiple filesystemsw/ multiple namespaces), copy to new nodes when needed for reading www.ci.anl.gov43 www.ci.uchicago.edu
    44. 44. Increased coverage of scripting ExM Swift Many executables w/ driver Multi-component executable? Single executable All files are individual Files can be grouped Exchange via files Exchange via files in RAM Exchange via messages State stored on disk ? State stored in memory Loose coupling Tight coupling• Questions: – Will we obtain good-enough performance in ExM? – How far can we go towards the tightly-coupled regime without breaking the basic Swift model? www.ci.anl.gov44 www.ci.uchicago.edu
    45. 45. Conclusions• Multiscale modeling is important now, and use will grow• Can think of multiscale modeling instances on a spectrum of loose to tight coupling• Swift works for loose coupling – Examples shown for nuclear energy, biomolecular modeling, and subsurface flows• Improvements in Swift (and ExM) will allow it to be used along more of the spectrum www.ci.anl.gov45 www.ci.uchicago.edu
    46. 46. Acknowledgments• Swift is supported in part by NSF grants OCI-721939, OCI-0944332, and PHY-636265, NIH DC08638, DOE and UChicago SCI Program• http://www.ci.uchicago.edu/swift/• The Swift team: – Mike Wilde, MihaelHategan, Justin Wozniak, KetanMaheshwari, Ben Clifford, David Kelly, Allan Espinosa, Ian Foster, IoanRaicu, Sarah Kenny, Zhao Zhang, Yong Zhao, Jon Monette, Daniel S. Katz• ExM is supported by the DOE Office of Science, ASCR Division – Mike Wilde, Daniel S. Katz, MateiRipeanu, Rusty Lusk, Ian Foster, Justin Wozniak, KetanMaheshwari, Zhao Zhang, Tim Armstrong, Samer Al- Kiswany, EmalayanVairavanathan• Scientific application collaborators and usage described in this talk: – Material Science: Marius Stan, Argonne – Biomolecular Modeling: Anton Sinitskiy, John Grime, Greg Voth, U. Chicago – Subsurface Flows: Karen Schuchardt , Bruce Palmer, KhushbuAgarwal, Tim Scheibe, PNNL• Thanks! – questions now or later – d.katz@ieee.org www.ci.anl.gov46 www.ci.uchicago.edu
    1. A particular slide catching your eye?

      Clipping is a handy way to collect important slides you want to go back to later.

    ×