Many-Task Computing Tools for
Multiscale Modeling


Daniel S. Katz (d.katz@ieee.org)
Senior Fellow, Computation Institute (University
of Chicago & Argonne National Laboratory)




                                                   www.ci.anl.gov
                                                   www.ci.uchicago.edu
Founded in 1892 as an institutional “community
of scholars,” where all fields and disciplines meet
and collaborate




Computation Institute
•   Mission: address the most challenging problems arising in the use of strategic
    computation and communications
•   Joint Argonne/Chicago institute, ~100 Fellows (~50 UChicago faculty) & ~60
    staff
•   Primary goals:
     – Pursue new discoveries using multi-disciplinary collaborations and computational
       methods
     – Develop new computational methods and paradigms required to tackle these
       problems, and create the computational tools required for the effective
       application of advanced methods at the largest scales
     – Educate the next generation of investigators in the advanced methods and
       platforms required for discovery




Multiscale Modeling




Multiscale
•   The world is multiscale
•   In modeling, a common challenge is
    determining the correct scale to capture a
    phenomenon of interest
    –   In computer science, a parallel problem is describing
        a problem with the right level of abstraction
         o   Capture the details you care about and ignore those you
             don’t
•   But multiple phenomena interact, often at
    different scales

Material Science, Methods and Scales




Credit: M. Stan, Materials Today, 12 (2009) 20-28
Modeling nuclear fuel rods
[Figure: Fuel Element Model (FEM) at ~1 cm scale, coupled to a
Microstructure Model (phase field, PF) at ~10 µm scale]
B. Mihaila, et al., J. Nucl. Mater., 394 (2009) 182-189   S.Y. Hu et al., J. Nucl. Mater. 392 (2009) 292–300
Sequential (information passing, hand-shaking, bridging)




                          R. Devanathan, et al., Energy Env. Sc., 3 (2010) 1406-1426
Coupling methods
•   We often know how to solve a part of the problem with sufficient accuracy, but when
    we combine multiple parts of the problem at various scales, we need to couple the
    solution methods too
•   Must first determine the models to be run and how they iterate/interact
•   Coupling options
     –   “Manual” coupling (sequential, manual)
           o Inputs to a code at one scale are influenced by study of the outputs of a previously run code at another
             scale
           o Coupling timescale: hours to weeks
     –   “Loose” coupling (sequential, automated) between codes
           o Typically performed using workflow tools
           o Often in different memory spaces
           o Coupling timescale: minutes
     –   “Tight” coupling (concurrent, automated) between codes
           o e.g., ocean-atmosphere-ice-bio
           o Typically performed using coupling methods (e.g., CCA), maybe in same memory space
           o Hard to develop, changes in one code may break the system
           o Coupling timescale: seconds

•   Boundary between options can be fuzzy
•   Choice often depends on how frequently the interactions are required, and how much
    work the codes do independently
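A minimal sketch of the loose vs. tight distinction, with hypothetical one-step "models" as Python stand-ins for real simulation codes: loose coupling exchanges data through a file between separately run steps, while tight coupling passes it directly in memory.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical one-step "models": each maps an input dict to an output dict.
def model_a(state):
    return {"T": state["T"] + 1.0}

def model_b(state):
    return {"k": 2.0 * state["T"]}

def loose_couple(state, workdir):
    """Loose coupling: model A writes a file, model B reads it back.
    In a real workflow the two models would be separate executables."""
    f = Path(workdir) / "exchange.json"
    f.write_text(json.dumps(model_a(state)))    # A's output goes to disk
    return model_b(json.loads(f.read_text()))   # B reads A's output back

def tight_couple(state):
    """Tight coupling: outputs stay in memory and pass directly."""
    return model_b(model_a(state))

with tempfile.TemporaryDirectory() as d:
    assert loose_couple({"T": 1.0}, d) == tight_couple({"T": 1.0})
```

The two paths compute the same answer; what differs is the coupling timescale and where the intermediate state lives.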

More on coupling

     Tight:  Model1 <-- Msg --> Model2    (exchange via messages)
     Loose:  Model1 --> File --> Model2   (exchange via files)
Coupling Models
Input: T_old, k_old, t_new

1. Finite Element (fuel element, solve for T):
      rho * C_P * dT/dt = div( k grad T ) + Q
2. Phase Field (evolve microstructure), given T_i(r, t_i)
3. Finite Element (microstructure, solve for k):
      0 = div( k(r, T) grad T )  ->  k_{i+1}(r, T_i, t_{i+1})
4. If k_{i+1} != k_i, repeat from step 3; once converged, accept k_new
5. If T_{i+1} != T_i, repeat from step 1; once converged, accept T_new

Output: T_new, k_new, t_new                     Credit: Marius Stan, Argonne
Swift (for Loose Coupling)




Workflow Example: Protein Structure Prediction

[Figure: predict() runs over protein targets (T1af7, T1b72, T1r69, ...);
results feed analyze()]
         Want to run: 10 proteins x 1000 simulations x
          3 MC rounds x 2 temps x 5 deltas = 300K tasks
Swift
•    Portable workflows – deployable on many
     resources
•    Fundamental script elements are external
     processes and data files
•    Provides natural concurrency at runtime
     through automatic data flow analysis and task
     scheduling
•    Data structures and script operations to support
     scientific computing
•    Provenance gathered automatically

Portability:
dynamic development and execution
•     Separate workflow description from resource and
      component implementations

[Figure: a script ("rawdata = sim(settings); stats = analysis(rawdata); ...")
is executed by Swift against two separate catalogs: <sites.xml> selects and
allocates resources, and <tc.data> defines the components]
Swift scripts
•    C-like syntax, also Python prototype
•    Supports file/task model directly in the language

        type file;


        app (file output) sim(file input) {
          namd2 @input @output
        }



•    Historically, most tasks have been sequential
     applications, but don’t have to be
Data flow and natural concurrency
•     Provide natural concurrency through automatic
      data flow analysis and task scheduling
file    o11   =   sim(input1);
file    o12   =   sim(input2);
file    m     =   exchange(o11, o12);
file    i21   =   create(o11, m);
file    o21   =   sim(i21);
...
[Dataflow graph: sim(input1) -> o11 and sim(input2) -> o12 run concurrently;
exchange(o11, o12) -> m; create(o11, m) -> i21 and create(o12, m) -> i22;
sim(i21) -> o21 and sim(i22) -> o22]



Variables, Tasks, Files, Concurrency
•    Variables are single assignment futures
     –   Unassigned variables are open
•    Variables can represent files
     –   When a file doesn’t exist, the variable is open
     –   When a file exists, the variable is closed
• All tasks found at runtime
• Tasks with satisfied dependencies (closed variables)
  are run on whatever resources are available
• These runs create files/variables that allow more
  tasks to run
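The single-assignment-future behavior can be imitated with Python futures. This is a hypothetical sketch with stand-ins for the sim/exchange/create tasks; note that unlike Swift, which discovers the dependency graph automatically, here the graph is wired by hand via result() calls.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the sim/exchange/create tasks.
def sim(x):          return x * 2
def exchange(a, b):  return a + b
def create(o, m):    return o - m

with ThreadPoolExecutor() as pool:
    # Each submit returns a future (an initially "open" variable);
    # a task can run once all its inputs are "closed" (resolved).
    o11 = pool.submit(sim, 1)
    o12 = pool.submit(sim, 2)
    m   = pool.submit(exchange, o11.result(), o12.result())
    i21 = pool.submit(create, o11.result(), m.result())
    o21 = pool.submit(sim, i21.result())
    assert o21.result() == -8   # sim(1)=2, sim(2)=4, m=6, i21=-4, o21=-8
```

Swift's advantage is that o11 and o12 (and every other task whose inputs are closed) run concurrently without the programmer sequencing anything.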
Execution model
• In a standard Swift workflow, each task must enumerate its input and
  output files
• These files are shipped to and from the compute site


      submit site --(copy inputs)--> compute site
      submit site <--(return outputs)-- compute site
• RPC-like technique, can use multiple queuing systems, data services, and
  execution environments
   • Uses abstractions for file transfer, job execution, etc.
   • Allows use of local systems (laptop, desktop), parallel systems (HPC),
      distributed systems (HTC, clouds)
   • Supports grid authentication mechanisms
• Can use multi-level scheduling (Coasters)
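A rough sketch of the RPC-like stage-in/run/stage-out cycle. The `run_remote` helper is hypothetical; real Swift uses pluggable transfer and execution providers rather than local directory copies, but the shape of the cycle is the same.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path

def run_remote(task_cmd, inputs, submit_dir, compute_dir):
    """Sketch of the execution model: copy inputs to the compute site,
    run the task there, then copy outputs back to the submit site."""
    for f in inputs:                                   # stage in
        shutil.copy(Path(submit_dir) / f, compute_dir)
    subprocess.run(task_cmd, cwd=compute_dir, check=True)
    outputs = [p.name for p in Path(compute_dir).iterdir()
               if p.name not in inputs]                # everything newly created
    for f in outputs:                                  # stage out
        shutil.copy(Path(compute_dir) / f, submit_dir)
    return outputs

with tempfile.TemporaryDirectory() as submit, \
     tempfile.TemporaryDirectory() as compute:
    (Path(submit) / "in.txt").write_text("abc")
    outs = run_remote(
        [sys.executable, "-c",
         "open('out.txt','w').write(open('in.txt').read().upper())"],
        ["in.txt"], submit, compute)
    assert (Path(submit) / "out.txt").read_text() == "ABC"
```

Because each task enumerates its inputs and outputs, the same cycle works whether the "compute site" is a local directory, an HPC scratch filesystem, or a cloud node.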

Performance and Usage
•    Swift is fast
     – Uses Karajan (in Java CoG) as powerful, efficient,
       scalable, and flexible execution engine
     – Scaling close to 1M tasks; 0.5M in live science work,
       and growing


•    Swift usage is growing (~300 users in last year):
     –   Applications in neuroscience, proteomics, molecular
         dynamics, biochemistry, climate, economics,
         statistics, astronomy, etc.

[Figure: nSim (1000) predict() calls per protein target (T1af7, T1b72,
T1r69, ...), followed by analyze()]

ItFix(p, nSim, maxRounds, st, dt)
{
  ...
  foreach sim in [1:nSim] {
    (structure[sim], log[sim]) = predict(p, st, dt);
  }
  result = analyze(structure)
  ...
}

Powerful parallel prediction loops in Swift
Sweep()
{
  int nSim = 1000;
  int maxRounds = 3;
  Protein pSet[] <ext; exec="Protein.map">;
  float startTemp[] = [ 100.0, 200.0 ];
  float delT[] = [ 1.0, 1.5, 2.0, 5.0, 10.0 ];
  foreach p, pn in pSet {
    foreach t in startTemp {
      foreach d in delT {
        ItFix(p, nSim, maxRounds, t, d);
      }
    }
  }
}

                    10 proteins x 1000 simulations x
                   3 rounds x 2 temps x 5 deltas
                           = 300K tasks
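The task count on this slide can be checked with a quick sketch of the same sweep (hypothetical parameter lists mirroring the Swift script above):

```python
from itertools import product

# Hypothetical parameters matching the sweep: 10 proteins in pSet,
# 2 start temperatures, 5 temperature deltas.
proteins   = [f"protein{i}" for i in range(10)]
start_temp = [100.0, 200.0]
del_t      = [1.0, 1.5, 2.0, 5.0, 10.0]
n_sim, max_rounds = 1000, 3

# Each (protein, temp, delta) combination launches ItFix, which runs
# n_sim predict() tasks per round for max_rounds rounds.
combos = list(product(proteins, start_temp, del_t))
total_tasks = len(combos) * n_sim * max_rounds
assert total_tasks == 300_000
```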

Swift
[Architecture: a Swift script runs on a submit host (laptop, Linux server, ...)
with a site list and an app list; the execution engine dispatches apps (a1, a2)
to compute nodes on clusters or clouds; files (f1, f2, f3) move between a data
server and the compute nodes via file transport; workflow status/logs and a
provenance log are recorded]

http://www.ci.uchicago.edu/swift/
Swift summary
•    Structures and arrays of data
•    Typed script variables intermixed with
     references to file data
•    Natural concurrency
•    Integration with schedulers such as
     PBS, Cobalt, SGE, GT2, …
•    Advanced scheduling settings
•    A variety of useful workflows can be considered


Using Swift for Loose Coupling




Multiscale Molecular Dynamics
  •   Problem: many systems are too large to solve
      using all-atom molecular dynamics (MD) models
  •   Potential solution: coarse-grained (CG) models
      where each site represents multiple atoms
  •   In order to do this, have to decide how to
      coarsen the model
       – How many sites are needed?
       – Which atoms are mapped to which sites?
       – What is the potential energy as a function of
         coordinates of those CG sites?
Credit: Anton Sinitskiy, John Grime, Greg Voth, U. Chicago
Building a CG model – initial data processing
  •   Stage 0 – run multiple short-duration trajectories of all-
      atom MD simulation, e.g., using NAMD, capture dcd files
       –   Can require large run time and memory, so run on TeraGrid
           system
       –   Download (binary) dcd files to local resources for archiving
       –   Remove light atoms (e.g., water, H)
       –   Performed manually
  •   Stage 1 – remove non α-Carbon atoms on a subset of the
      dcd files from each trajectory
       –   Need to know how many steps were in each trajectory – not
           always what was planned, and final file may be corrupt, so
           some manual checking needed
       –   Performed by fast Tcl script
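A toy illustration of the Stage 1 filtering step. The in-memory atom records here are hypothetical; the real filtering is a Tcl script operating on binary dcd trajectory files.

```python
# Hypothetical plain-text atom records: (atom_name, residue, x, y, z).
def keep_alpha_carbons(atoms):
    """Keep only the alpha-carbon (CA) atoms of each frame."""
    return [a for a in atoms if a[0] == "CA"]

frame = [("N",  "ALA", 0.0, 0.0, 0.0),
         ("CA", "ALA", 1.0, 0.0, 0.0),
         ("C",  "ALA", 2.0, 0.0, 0.0),
         ("CA", "GLY", 3.0, 0.0, 0.0)]
assert [a[1] for a in keep_alpha_carbons(frame)] == ["ALA", "GLY"]
```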
Credit: Anton Sinitskiy, John Grime, Greg Voth, U. Chicago
Building CG model – covariance matrix
  •   Stage 2 – join trajectory files together into an ASCII file
       –   Requires trajectory length from previous stage
       –   Performed by fast Tcl script
  •   Stage 3 – generate covariance matrix for each
      trajectory
       –   Find deviation of each atom from its average position
           across all time steps
       –   Covariance matrix determines which atoms can be
           grouped into rigid bodies (roughly)
       –   Performed by shell script that runs a compiled C code
            o   Takes several hours per trajectory
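In outline, the Stage 3 covariance computation might look like the following NumPy sketch (hypothetical; the real code is compiled C running for hours per trajectory):

```python
import numpy as np

def positional_covariance(traj):
    """traj: (n_frames, n_atoms, 3) array of atom positions.
    Returns the (3*n_atoms, 3*n_atoms) covariance matrix of deviations
    of each atom from its average position across all time steps."""
    flat = traj.reshape(traj.shape[0], -1)   # one row per frame
    dev = flat - flat.mean(axis=0)           # deviation from mean position
    return dev.T @ dev / traj.shape[0]

rng = np.random.default_rng(0)
traj = rng.normal(size=(100, 4, 3))          # toy trajectory: 100 frames, 4 atoms
C = positional_covariance(traj)
assert C.shape == (12, 12)
assert np.allclose(C, C.T)                   # covariance is symmetric
```

Strongly correlated blocks of this matrix indicate atoms that move together and so can (roughly) be grouped into rigid bodies.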
Credit: Anton Sinitskiy, John Grime, Greg Voth, U. Chicago
Building a CG model – CG mapping
  •   Stage 4 – for a given number of sites (#sites), find best
      mapping for each trajectory
       –   Pick 3 to 5 values for #sites that should cover the likely best value
       –   For each #sites, can find χ² value for each mapping
       –   Overall, want lowest χ² and corresponding mapping
       –   Uses a group of random initial values and simulated annealing from
           each
       –   Performed by shell script to launch compiled C code, O(50k)
           trials, takes several days on 100-1000 processors
  •   Stage 5 – check χ² values for each trajectory
       – χ² vs. #sites on a log-log plot should be linear
       – Performed by script
       – If a point is not close to the line, it’s probably not a real minimum
         χ² for that #sites
             o   Go back to Stage 4 – run more initial cases to get a lower χ²

Credit: Anton Sinitskiy, John Grime, Greg Voth, U. Chicago
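The Stage 5 linearity check can be sketched as a least-squares fit in log-log space; points far from the fitted line are flagged. Helper names and the tolerance are hypothetical, purely illustrative.

```python
import math

def loglog_fit(nsites, chi2):
    """Least-squares line through (log #sites, log chi^2); returns slope, intercept."""
    xs = [math.log(n) for n in nsites]
    ys = [math.log(c) for c in chi2]
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
             / sum((x - xbar) ** 2 for x in xs))
    return slope, ybar - slope * xbar

def outliers(nsites, chi2, tol=0.1):
    """Points whose log-chi^2 misses the fitted line by more than tol:
    likely not a real minimum for that #sites."""
    slope, intercept = loglog_fit(nsites, chi2)
    return [n for n, c in zip(nsites, chi2)
            if abs(math.log(c) - (slope * math.log(n) + intercept)) > tol]

nsites = [10, 20, 40, 80]
chi2 = [100.0 * n ** -1.5 for n in nsites]   # an exact power law: all on the line
assert outliers(nsites, chi2) == []
```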
Building a CG model – finding #sites
  •   Stage 6 – determine #sites
       –   Estimate best #sites (b#sites) from slope/intercept of line in stage 5, and
           compare results of all trajectories
       –   Performed by script
       –   If results for each trajectory are different, trajectories didn’t sample
           enough of the phase space – go back to Stage 0 and run more/longer
           trajectories
       –   If b#sites is outside the range of #sites that have been calculated, add to
           initial range and go back to Stage 4
       –   If b#site is inside the range, create a smaller range around b#sites and go
           back to Stage 4
            o   b#sites is an integer, so don’t have to do this too much
       –   Outputs final b#sites and corresponding mapping
  •   Stage 7 – building potential energy as function of site coordinates
       –   Can be done by different methods, e.g., Elastic Network Models (ENM)
            o   Currently under construction


Credit: Anton Sinitskiy, John Grime, Greg Voth, U. Chicago
Bio workflow: AA->CG MD

  Stage 0: AA_MD and data transfer
  Stage 1: remove non-α-C atoms
  Stage 2: join trajectories
  Stage 3: build covariance matrix
  Stage 4: pick 3-5 #sites and find lowest χ² for each
  Stage 5: check fit of χ² values
  Stage 6: find best #sites and mapping
  Stage 7: find potential energy

Credit: Anton Sinitskiy, John Grime, Greg Voth, U. Chicago
Multiscale?
  • So far, this isn’t really multiscale
  • It has just used fine grain information to build the
    best coarse grained model
  • But it’s a needed part of the process
  • Overall, can’t run AA_MD as much as desired.
       –   Here, limited AA_MD simulations -> structural
           information for a rough CG model of the internal
           molecular structure
       –   With rough CG model, user can parameterize
           interactions for CG "atoms" via targeted all-atom
           simulations -> determine average energies and forces
           etc. for the CG beads
  •   Doing this automatically is a long-term goal
Credit: Anton Sinitskiy, John Grime, Greg Voth, U. Chicago
NSF Center for Chemical Innovation Phase I award: "Center
for Multiscale Theory and Simulation”
•    ... development of a novel, powerful, and integrated theoretical and computational capability for
     the description of biomolecular processes across multiple and connected scales, starting from the
     molecular scale and ending at the cellular scale
•    Components:
      –   A theoretical and computer simulation capability to describe biomolecular systems at multiple scales will
          be developed, includes atomistic, coarse-grained, and mesoscopic scales ... all scales will be connected in a
          multiscale fashion so that key information is passed upward in scale and vice-versa
      –   Latest generation scalable computing and a novel cyberinfrastructure will be implemented
       –   A high-profile demonstration project will be undertaken using the resulting theoretical and modeling
           advances, involving the multiscale modeling of the key biomolecular features of the eukaryotic
           cellular cytoskeleton (i.e., actin-based networks and associated proteins)
•    Core CCI team includes a diverse group of leading researchers at the University of Chicago from the
     fields of theoretical/computational chemistry, biophysics, mathematics, and computer science:
      –   Gregory A. Voth (PI, Chemistry, James Franck Institute, Institute for Biophysical Dynamics, Computation
           Institute); Benoit Roux (co-PI, Biochemistry and Molecular Biology, Institute for Biophysical Dynamics); Nina
           Singhal Hinrichs (co-PI, Computer Science and Statistics); Aaron Dinner (co-PI, Chemistry, James Franck
          Institute, Institute for Biophysical Dynamics); Karl Freed (co-PI, Chemistry, James Franck Institute, Institute
          for Biophysical Dynamics); Jonathan Weare (co-PI, Mathematics); Daniel S. Katz (Senior Personnel,
          Computation Institute)

Actin at multiple scales


[Figure, at increasing scale:
  - Single Actin monomer (G-Actin) – all-atom representation
  - Actin filament (F-Actin) – complex of G-Actins
  - Actin filament (F-Actin) – CG representation
  - Actin in cytoskeleton – mesoscale representation]

Credit: Greg Voth, U. Chicago
Geophysics Application
  •   Subsurface flow model
  •   Couples continuum and pore scale simulations
       –   Continuum model: exascale Subsurface Transport Over
           Multiple Phases (eSTOMP)
            o   Scale: meter
            o   Models full domain
       –   Pore scale model: Smoothed Particle Hydrodynamics (SPH)
            o   Scale: grains of soil (mm)
            o   Models subset of domain as needed
  •   Coupler codes developed
       –   Pore Generator (PG) – adaptively decides where to run SPH
           and generates inputs for each run
       –   Grid Parameter Generator (GPG) – uses outputs from SPH to
           build inputs for next eSTOMP iteration
Credit: Karen Schuchardt, Bruce Palmer, Khushbu Agarwal, Tim Scheibe, PNNL
Subsurface Hybrid Model Workflow




Credit: Karen Schuchardt, Bruce Palmer, Khushbu Agarwal, Tim Scheibe, PNNL
Swift code
// Driver
file stompIn<"stomp.in">;

iterate iter {
    output = HybridModel(inputs[iter]);
    inputs[iter+1] = output;
    capture_provenance(output);
} until (iter >= MAX_ITER);

// Hybrid Model
(file simOutput) HybridModel (file input) {
    ...
    stompOut = runStomp(input);
    (sphins, numsph) = pg(stompOut, sphinprefix);

    // Find number of pore scale runs
    int n = @toint(readData(numsph));
    foreach i in [1:n] {
        sphout[i] = runSph(sphins[i], procs_task);
    }

    simOutput = gpg(sphout, n);
}




Credit: Karen Schuchardt , Bruce Palmer, KhushbuAgarwal, Tim Scheibe, PNNL
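The control flow of this script can be mimicked in a few lines of Python. The stand-ins for the eSTOMP, PG, SPH, and GPG steps below are hypothetical toys; in Swift, the foreach over SPH runs executes concurrently rather than in a list comprehension.

```python
# Hypothetical stand-ins for the eSTOMP, PG, SPH, and GPG steps.
def run_stomp(inp):   return {"grid": inp["grid"]}
def pg(stomp_out):    return [{"pore": i} for i in range(3)]      # SPH inputs
def run_sph(sph_in):  return {"flux": sph_in["pore"] * 0.5}
def gpg(sph_outs):    return {"grid": sum(o["flux"] for o in sph_outs)}

def hybrid_model(inp):
    stomp_out = run_stomp(inp)                   # continuum step
    sph_ins = pg(stomp_out)                      # decide where pore-scale runs go
    sph_outs = [run_sph(s) for s in sph_ins]     # independent: concurrent in Swift
    return gpg(sph_outs)                         # fold into next eSTOMP input

MAX_ITER = 4
state = {"grid": 0.0}
for _ in range(MAX_ITER):                        # the Swift `iterate ... until` loop
    state = hybrid_model(state)
```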
Towards Tighter Coupling




More on coupling

     Tight:  Model1 <-- Msg --> Model2    (exchange via messages)
     Loose:  Model1 --> File --> Model2   (exchange via files)
More on coupling
•    Message vs. file issues:
      –   Performance: overhead involved in writing to disk vs. keeping in memory
      –   Semantics: message passing vs. POSIX file I/O
      –   Fault tolerance: file storage provides an automatic recovery mechanism
      –   Synchronicity: messages can be sync/async, files must be async
•    Practical issues:
      –   What drives the application?
           o   Loose case: A driver script calls multiple executables in turns
           o   Tight case: No driver, just one executable
      –   What’s the cost of initialization?
           o   Loose case: Executables initialized each time
           o   Tight case: All executables exist at all times, only initialized once
      –   How much can components be overlapped?
           o   Loose case: If all components need the same number of resources, all resources
               can be kept busy all the time
           o   Tight case: Components can be idle waiting for other components


Work in progress towards tighter coupling in
Swift: Collective Data Management (CDM)
•    Data transfer mechanism is: transfer input, run, transfer
     output
•    Fine for single node systems, could be improved to take
     advantage of other system features, such as intermediate file
     system (or shared global file system on distributed sites)
•    Define I/O patterns (gather, scatter, broadcast, etc.) and build
     primitives for them
•    Improve support for shared filesystems on HPC resources
•    Make use of specialized, site-specific data movement features
•    Employ caching through the deployment of distributed
     storage resources on the computation sites
•    Aggregate small file operations into single larger operations



CDM examples
•    Broadcast an input data set to workers
     – On Open Science Grid, just send it to the shared file
       system of each cluster once, then let worker nodes
       copy it from there
     – On IBM BG/P, use intermediate storage on I/O nodes
       on each pset similarly
•    Gather an output data set
     –   Rather than sending each job’s output, if multiple
         jobs are running on a node and sufficient jobs are
         already runnable, wait and bundle multiple output
         files, then transfer bundle
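The gather/bundle idea might be sketched as follows (hypothetical helpers; the real CDM work sits inside Swift's data movement layer, not in user code): many small output transfers are replaced by one archive transfer.

```python
import tarfile
import tempfile
from pathlib import Path

def bundle(files, archive):
    """Aggregate many small file transfers into a single larger one
    by shipping one tar archive instead of individual files."""
    with tarfile.open(archive, "w") as tar:
        for f in files:
            tar.add(f, arcname=Path(f).name)
    return archive

def unbundle(archive, dest):
    """On the receiving side, unpack the bundle back into files."""
    with tarfile.open(archive) as tar:
        tar.extractall(dest)
    return sorted(p.name for p in Path(dest).iterdir())

with tempfile.TemporaryDirectory() as src, \
     tempfile.TemporaryDirectory() as dst:
    files = []
    for i in range(3):
        p = Path(src) / f"part{i}.dat"
        p.write_text("x" * 10)
        files.append(str(p))
    arc = bundle(files, str(Path(src) / "outputs.tar"))
    assert unbundle(arc, dst) == ["part0.dat", "part1.dat", "part2.dat"]
```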
Work in progress towards tighter coupling after Swift: ExM
(Many-task computing on extreme-scale systems)
•    Deploy Swift applications on exascale-generation systems
•    Distributed task (and function) management
     – Break the bottleneck of a single execution engine
     – Call functions, not just executables
•    JETS: Dynamically run multiple MPI tasks on an HPC resource
     – Allow dynamic mapping of workers to resources
     – Add resilience – allow mapping of workers to dynamic resources
•    MosaStore: intermediate file storage
     –   Use files for message passing, but stripe them across RAMdisk on
         nodes (a single distributed filesystem w/ a shared namespace), with a
         backing store in the shared file system and potentially a cache in the middle
•    AME: intermediate file storage
     –   Use files for message passing, but store them in RAMdisk on the nodes
         where they are written (multiple filesystems w/ multiple namespaces), and
         copy them to new nodes when needed for reading

Increased coverage of scripting

                 Swift                        ExM

     Many executables w/ driver   Multi-component executable?    Single executable
     All files are individual     Files can be grouped
     Exchange via files           Exchange via files in RAM      Exchange via messages
     State stored on disk         ?                              State stored in memory

     Loose coupling  <-------------------------------------------->  Tight coupling


•     Questions:
       –   Will we obtain good-enough performance in ExM?
       –   How far can we go towards the tightly-coupled regime
           without breaking the basic Swift model?
Conclusions
•    Multiscale modeling is important now, and use
     will grow
•    Can think of multiscale modeling instances on a
     spectrum of loose to tight coupling
•    Swift works for loose coupling
     –   Examples shown for nuclear energy, biomolecular
         modeling, and subsurface flows
•    Improvements in Swift (and ExM) will allow it to
     be used along more of the spectrum

Acknowledgments
• Swift is supported in part by NSF grants OCI-721939, OCI-0944332, and
  PHY-636265, NIH DC08638, DOE and UChicago SCI Program
• http://www.ci.uchicago.edu/swift/
• The Swift team:
      –   Mike Wilde, Mihael Hategan, Justin Wozniak, Ketan Maheshwari, Ben
          Clifford, David Kelly, Allan Espinosa, Ian Foster, Ioan Raicu, Sarah Kenny,
          Zhao Zhang, Yong Zhao, Jon Monette, Daniel S. Katz
•    ExM is supported by the DOE Office of Science, ASCR Division
      –   Mike Wilde, Daniel S. Katz, Matei Ripeanu, Rusty Lusk, Ian Foster, Justin
          Wozniak, Ketan Maheshwari, Zhao Zhang, Tim Armstrong, Samer Al-Kiswany,
          Emalayan Vairavanathan
•    Scientific application collaborators and usage described in this talk:
      –   Material Science: Marius Stan, Argonne
      –   Biomolecular Modeling: Anton Sinitskiy, John Grime, Greg Voth, U.
          Chicago
      –   Subsurface Flows: Karen Schuchardt, Bruce Palmer, Khushbu Agarwal, Tim
          Scheibe, PNNL
•    Thanks! – questions now or later – d.katz@ieee.org


Editor's Notes

  • #12 The microstructure model (phase-field) simulates the accumulation and transport of fission products and the evolution of He gas bubble microstructures in nuclear fuels. The effective thermal conductivity depends strongly on the bubble volume fraction, but weakly on the morphology of the bubbles.
  • #14 nSim – number of Monte Carlo simulations. Predict() – a single Monte Carlo prediction of a protein structure. Analyze() – analyzes a set of protein structures. Results of Analyze() need to be fed into the next round. maxRounds is used in the ... code
  • #22 nSim – number of Monte Carlo simulations. Predict() – a single Monte Carlo prediction of a protein structure. Analyze() – analyzes a set of protein structures. Results of Analyze() need to be fed into the next round. maxRounds is used in the ... code
  • #28 Why not just use AA_MD? Biologically relevant timescales are milliseconds, seconds, or minutes, while MD simulations provide trajectories tens or hundreds of nanoseconds long (there was pioneering work this year with a millisecond all-atom MD simulation, but for a very small system). CG models are a way to compensate for this difference.
  • #35 (a) is a single monomer of actin (so-called G-actin). This picture represents the all-atom description. (I know, you can't see atoms there, only helices and sheets, but if we drew all the atoms, it would be just a heap of dots and lines.) (b) is the actin filament – a linear complex formed by repetition of the above-mentioned monomers. For example, the green fragment is in fact one molecule such as shown in (a), the yellow fragment is another copy of the same molecule from (a), and so on. (c) is the coarse-grained model, with 4 CG sites per monomer from (b). The beads are the centers of mass of the CG sites, and the lines connecting the beads represent the terms in the potential energy function, e.g. in Elastic Network Models. (d) is a picture made by molecular biologists. I assume that the red and blue spots are the nucleus of a cell, and the cytoskeleton is shown in green. Every green filament is an actin filament, as shown in (b) and (c). To sum up, (a) refers to the all-atom description, (b) and (c) to a CG model, and (d) to the mesoscopic level.