An Advanced Simulation



  • 1. FLASH Future (Anshu Dubey, March 21, 2005)
  • 2. FLASH3 Motivation
    • The FLASH Center won’t be around forever
    • Put FLASH in a position to be a community developed and supported code
    • First opportunity to focus on design rather than capabilities
  • 3. FLASH 3 Goals
    • Create robust, reliable, efficient and expandable code that stands the test of time and users
    • Build upon lessons learned in FLASH 2
      • Simplify the architecture
      • Decentralize data management
      • Define rules for inheritance and data movement
      • Eliminate assumptions
    • Incorporate newer software engineering practices
      • Design by contract / Unit test
    • Exploit technologies
      • Tools
    • Create a community code
      • Easy to add new functionality
      • Policy for external contributions
  • 4. The Design
    • Great effort to get the design right
      • Individual proposals
        • Discussed, iterated upon
        • Detailed specifications worked out
        • Prototyped and tested
    • Keeping track of design documents
      • Digital photographs of the board
      • CVS repository of the design specification proposals and documents
        • The documents in the repository should eventually become the basis for a developer's guide.
    • Some parts still dynamic.
  • 5. Design Trade-offs
    • Determine the scope
      • How deep should the changes be
        • Infrastructure
        • Algorithms
    • Design and Implementation Methodology
      • Transition of modules/units from FLASH2 to FLASH3
      • Reconcile the divergence
  • 6. The Implementation Plan
    • Initially concentrate on the Framework
      • Carried out independently of FLASH2
      • No disruption for people using FLASH2 for science runs
    • Move a minimal set of units from FLASH2 to FLASH3
      • Dependencies dictate the set
      • Those units frozen in FLASH2
      • If unavoidable, changes in those units must be carefully synchronized
      • Evaluate the algorithms in the unit and redesign as needed
    • Repeat until all units done
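The rule "dependencies dictate the set" amounts to taking the transitive closure of a unit's dependencies. The following is a minimal Python sketch of that idea, not FLASH code. The dependency table is invented for illustration: the Hydro to Eos link appears on slide 20, and every other edge is an assumption.

```python
def units_to_move(target, depends_on):
    """Return the target unit plus everything it transitively depends on."""
    needed = set()
    stack = [target]
    while stack:
        unit = stack.pop()
        if unit in needed:
            continue  # already visited, also guards against cycles
        needed.add(unit)
        stack.extend(depends_on.get(unit, ()))
    return needed


# Hypothetical dependency table, for illustration only.
DEPENDS_ON = {
    "Hydro": ["Eos", "Grid"],
    "Eos": ["Materials"],
    "Gravity": ["Grid"],
}

print(sorted(units_to_move("Hydro", DEPENDS_ON)))
```

Under these assumed dependencies, moving Hydro pulls in Eos, Grid and Materials, while Gravity can wait for a later pass.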
  • 7. Unit Testing
    • Goal: have each module be testable as independently as possible from the rest of the code
    • Tests should be very simple, light on memory.
    • Should eventually hook up with the FLASH test suite
    • What are the obstacles?
      • Most modules have some knowledge of other units
      • Most modules need some form of Grid
  • 8. Architecture: Lessons from FLASH 2
    • Example: Decentralize data management
    • [Diagram: FLASH2 and FLASH3 source trees side by side. Labels in extraction order: source, mesh, dbase, amr, PM data, PM func., UG, UG data, UG func; then source, Grid, AG, PM func., UG, UG data, UG func, PM data]
  • 9. IO: Eliminate duplicate code
    • Before: IO is split into UG and AMR, each with hdf5 and netcdf versions that carry their own checkpoint(). Lots of duplicate code!
    • After: IO Common holds a single checkpoint.F90 that calls generic methods, open_file() and write_data(); the hdf5 and netcdf versions supply them (Hdf5_open(), ncmpi_open())
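The restructuring on slide 9 can be sketched as one shared checkpoint routine that only calls generic methods, with each file format supplying its own version of those methods. This is a minimal Python illustration, not FLASH code (FLASH is written in Fortran). Only the names `checkpoint`, `open_file` and `write_data` come from the slide; the backend classes and the dictionary standing in for a file are hypothetical, and no real HDF5 or netCDF calls are made.

```python
class Hdf5Backend:
    """Hypothetical stand-in for the hdf5-specific layer."""
    name = "hdf5"

    def open_file(self, path):
        # A real backend would call the HDF5 library here.
        return {"path": path, "format": self.name, "datasets": {}}

    def write_data(self, handle, label, values):
        handle["datasets"][label] = list(values)


class NetcdfBackend:
    """Hypothetical stand-in for the netcdf-specific layer."""
    name = "netcdf"

    def open_file(self, path):
        # A real backend would call the parallel netCDF library here.
        return {"path": path, "format": self.name, "datasets": {}}

    def write_data(self, handle, label, values):
        handle["datasets"][label] = list(values)


def checkpoint(backend, path, variables):
    """Written once and shared: it only uses the generic methods."""
    handle = backend.open_file(path)
    for label, values in variables.items():
        backend.write_data(handle, label, values)
    return handle


h = checkpoint(Hdf5Backend(), "chk_0001", {"dens": [1.0, 2.0]})
n = checkpoint(NetcdfBackend(), "chk_0001", {"pres": (3.0,)})
```

Adding another file format in this sketch means writing one more backend; `checkpoint` itself does not change.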
  • 10. Infrastructure : Mesh Packages in Flash
    • The original FLASH design was Paramesh specific
      • Block Structured
      • Fixed sized blocks
      • Specified at compile time
      • Every section of the code assumed knowledge of block size
    • Removing the fixed block size requirement opens the door for other mesh packages, such as a uniform grid, a squishy grid and a patch-based grid
    • [Figure: Paramesh]
    • Relaxing the fixed block size constraint in FLASH makes the code more flexible and robust; however, it requires the code to be modified at the deepest levels
  • 11. New Capabilities
    • Uniform Grid
      • One block per processor
      • No AMR-related overhead
    • Patch Based Mesh (planned)
      • Adaptive mesh with variable block size
      • Allows mesh to “zoom” in on specific areas without having to refine surrounding areas
  • 12. New Capabilities - Squishy Grid (Planned)
    • Non-uniform, non-adaptive grid
    • No overhead from AMR
    • Useful in cases where high levels of resolution are needed in fixed areas of the domain
    • [Figure: Squishy Grid]
  • 13. FLASH 3 Architecture: Four Cornerstones
    • Setup tool: assemble simulation
    • Config files: tell setup how to assemble simulation
    • Driver: organize interactions between units
    • Unit Architecture: API, inheritance, data management
  • 14. FLASH3 Units
    • [Diagram of units. Labels: Driver, I/O, Runtime inputs, Grid, Profiling, Runtime viz, Simulation (group labels: Infrastructure, monitoring); Hydro, Burn, Gravity, MHD (group label: Physics)]
  • 15. Unit Architecture
    • Top level:
      • API
      • Unit Test
    • [Diagram labels: Driver; Grid; Data Module (block data, time step, etc.); Wrapper; Kernel]
  • 16. Unit Hierarchy
    • [Diagram labels: API/Stubs; Common API impl; API impl (three instances); Wrapper and kernel (three instances)]
  • 17. Building an Application
    • [Diagram labels: Grid, mesh, I/O, Runtime Inputs, Profiler, Simulation/setup, Driver, Physics]
  • 18. FLASH3 Framework
    • How do you get the Framework out of FLASH 3?
      • Include the source tree up to the top level of each Unit.
        • Stub implementations
        • Config files
      • Include setups/default
      • Include the setup tool
      • Include the Driver
      • Unit tests could be included, but have no meaning with stubs
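The role of stub implementations can be illustrated with a small sketch: the top level of a unit defines the full API with do-nothing bodies, an implementation overrides it, and the driver depends only on the API. This is a Python analogy rather than FLASH's Fortran mechanism, and every name in it (`HydroStub`, `HydroImpl`, `evolve`, the `gamma` parameter) is hypothetical. The "kernel" is a placeholder, not a real hydro update.

```python
class HydroStub:
    """Top level of a unit: the complete API, with empty bodies."""

    def init(self, runtime_parameters):
        pass

    def advance(self, block, dt):
        pass


class HydroImpl(HydroStub):
    """Hypothetical implementation that overrides the stub API."""

    def init(self, runtime_parameters):
        self.gamma = runtime_parameters["gamma"]

    def advance(self, block, dt):
        # Placeholder kernel for illustration only.
        block["dens"] = [d * (1.0 + dt) for d in block["dens"]]


def evolve(unit, block, dt, nsteps):
    """Driver loop: works with the stub or with an implementation."""
    unit.init({"gamma": 1.4})
    for _ in range(nsteps):
        unit.advance(block, dt)
    return block


with_stub = evolve(HydroStub(), {"dens": [1.0]}, 1.0, 2)
with_impl = evolve(HydroImpl(), {"dens": [1.0]}, 1.0, 2)
```

With the stub the driver runs to completion but the data never changes, which is why a unit test against a stub carries no information.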
  • 19. Move a Unit
    • Create a Unit_data Fortran module to store data needed from other units in the code.
    • Create storage in Unit_data for all variables that the kernel gets through either runtime parameter or dbase calls.
    • Eliminate the “first call” constructs; move them to Unit_init.
      • Unit_init fills up Unit_data.
    • Use wrapper functions to fill up the remaining variables in Unit_data.
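The steps above can be sketched as a before-and-after pair: a kernel that looks up its parameters inside a "first call" guard, and the same kernel reading from unit-owned storage that an init routine fills once. This is a Python analogy for what would be a Fortran module in FLASH. The parameter names, the `RUNTIME_PARAMETERS` table and the sound-speed kernel are assumptions chosen for illustration; only the Unit_data and Unit_init roles come from the slide.

```python
# Hypothetical stand-in for the runtime parameter database.
RUNTIME_PARAMETERS = {"gamma": 1.4, "cfl": 0.8}

# Before: the kernel fetches what it needs on its first call.
_first_call = True
_gamma = None


def kernel_before(pressure, density):
    global _first_call, _gamma
    if _first_call:
        _gamma = RUNTIME_PARAMETERS["gamma"]
        _first_call = False
    return (_gamma * pressure / density) ** 0.5  # sound speed


# After: unit-owned storage, filled once by the init routine.
class UnitData:
    gamma = None
    cfl = None


def unit_init(parameters):
    """Plays the role of Unit_init: fills UnitData before any kernel runs."""
    UnitData.gamma = parameters["gamma"]
    UnitData.cfl = parameters["cfl"]


def kernel_after(pressure, density):
    # No lookups and no first-call guard: the kernel only reads UnitData.
    return (UnitData.gamma * pressure / density) ** 0.5


unit_init(RUNTIME_PARAMETERS)
```

After the move, the kernel no longer knows where its parameters came from, so it can be exercised in a unit test by filling `UnitData` directly.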
  • 20. Moving a Module Example: PPM Hydro
    • PPM Hydro
      • And its dependencies (for example Eos)
    • Reasons :
      • Central Module
      • Majority of applications need it
      • Exercises the FLASH3 design
      • FLASH3 cannot be meaningfully tested without it
    • Challenges :
      • Legacy, with many assumptions
      • Has too much knowledge of the mesh
      • Very difficult to encapsulate
      • Eos even more challenging
  • 21. FLASH 3 Status
    • The framework works, and can be separated out
    • The following units are implemented:
      • Grid : can switch between PM2, PM3 and UG
        • PM2 and UG work in parallel
      • Driver, RuntimeInputs, Profiler
      • Hydro, Gamma Eos, Materials
      • Dummy MPI
      • IO
    • Simple applications work
    • Compiles and runs on multiple platforms
    • Test suite is up and running
    • Will be released to users within the center soon.
    • A Beta version might be released by the end of the year.
  • 22. Impact on user
    • We hope it won’t be painful
    • Lots of Kernels will be the same as in FLASH2
    • A user who has modified source files in setup:
      • Should be able to read the Unit API documentation to clearly determine how to make the code compatible with FLASH3, or
      • Look at the new source file implementation to see how to start converting to FLASH3
    • If you’re in the process of developing a deep functional module for FLASH2 that you want to contribute, talk to us now
  • 23. Tools
    • The Diagnostic Tools
      • Memory diagnostic tool
      • Profiling tools
      • Runtime viz
    • Post Processing Tools
      • Fidlr
      • FLASH view
    • Documentation Tools
      • Benchmark management tool
      • Robodoc for web interfaced documentation
  • 24. Tools Snapshots
  • 25. The Future
    • The next few steps
      • Benchmark and profile the performance
      • Move efficient units in FLASH 2 to FLASH 3
      • Evaluate algorithms in units that are not efficient
        • With some help select the appropriate ones
        • Implement them
    • The still open major design issues
      • Gravity
      • Particles
    • Look at load balancing with non-uniform work per block
  • 26. Load Balancing
    • Load balanced in number of blocks
    • Work concentrated in few blocks
    • CPU snapshot shows one processor idle for the most part
    • Work weighted load balancing needed.
    • [Figures: Work in blocks; Distribution of blocks]
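The difference between balancing by block count and balancing by work can be shown with a small sketch. It assumes blocks are kept in a fixed order and each processor receives a contiguous run of them; the per-block work values are invented, and the midpoint rule used for the weighted split is one simple choice made for illustration, not the scheme FLASH uses.

```python
def split_by_count(work, nprocs):
    """Give each processor an equal number of consecutive blocks."""
    n = len(work)
    return [work[p * n // nprocs:(p + 1) * n // nprocs] for p in range(nprocs)]


def split_by_work(work, nprocs):
    """Assign each block by where the midpoint of its work falls in the total.

    Assumes the total work is positive.
    """
    total = sum(work)
    chunks = [[] for _ in range(nprocs)]
    done = 0
    for w in work:
        proc = min(nprocs - 1, int((done + w / 2) * nprocs / total))
        chunks[proc].append(w)
        done += w
    return chunks


# Hypothetical work per block: two expensive blocks among cheap ones.
WORK = [1, 1, 1, 1, 8, 8, 1, 1]

count_loads = [sum(c) for c in split_by_count(WORK, 4)]
work_loads = [sum(c) for c in split_by_work(WORK, 4)]
print(count_loads, work_loads)
```

In this made-up case every processor holds two blocks under the count-based split, yet one processor carries most of the work while the others wait; the weighted split halves the heaviest load.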
  • 27. … which brings us to
    • Discussion and Questions