
Introducing the HACC Simulation Data Portal


This presentation was given at the 2019 GlobusWorld Conference in Chicago, IL by Katrin Heitmann from Argonne National Laboratory.


  1. Introduction to the HACC Simulation Data Portal. GlobusWorld 2019, Chicago, May 1, 2019. Katrin Heitmann (Argonne National Laboratory). Based on: arXiv:1904.11966
  2. Introduction
     • In cosmology we study the origin, evolution, and make-up of the Universe.
     • Many unsolved questions:
       ○ What is the nature of dark energy and dark matter, which make up 95% of the energy-matter budget of our Universe?
       ○ What is the mass of the lightest particle in the Universe, the neutrino?
       ○ How can we learn more about the very first moments of the Universe?
     • Upcoming cosmological surveys aim to answer these questions and rely on detailed, complex simulations:
       ○ Simulations are carried out and analyzed on the largest supercomputers available worldwide.
       ○ Cosmological simulations generate large amounts of data (PBs) to capture the evolution of the Universe faithfully.
       ○ Given the resources required for these simulations, it is crucial to share them with the community to enable the best possible science outcome.
     (Images: HACC/Galacticus/GalSim pipeline; Hubble Ultra Deep Field, NASA)
  3. What is needed: a large-scale effort that provides easy access to a range of simulation products to the world's cosmologists, as well as analysis capabilities to established survey collaborations.
  4. Architecture diagram (in collaboration with Tom Uram, Mike Papka, Ian Foster). Goal: public access to cosmological data and computational support for collaborations.
     • Simulation: HPC allocations (e.g., INCITE, ALCC) on Theta (10 PF) and Cooley, driven by simulation and analysis job descriptions through a job submission/adaptation layer.
     • Storage: O(50 PB) total.
     • Datasets published via Globus Online to Petrel: O(1 PB), 100 TB to start.
     • User community access via the web portal and Globus, plus community-specific clients.
     • Collaboration-installed web/data interfaces: LSST DM Butler, Jupyter, PDACS (Galaxy), DESCQA, visualization, databases, Globus, workflows.
     • ALCF-hosted, collaboration-controlled resources: physical/virtual machine(s) (Phoenix).
  5.–9. (Step-by-step build-up of the same architecture diagram.) Slide 6 adds a note on the HPC storage: it is temporary, expires with the allocation, and only collaborators on the project have direct access. Slides 7–8 add the Petrel datasets and the portal/Globus access path; slide 9 repeats the full diagram.
  10. What exists ...
      • Petrel and Phoenix
      • Simulations
      • First version of web portal using Globus
  11. Petrel: Data Management and Sharing Pilot, hosted at Argonne
      • 1.7 PB parallel filesystem
      • Embedded in Argonne's 100+ Gbps network fabric to allow high-speed data transfers
      • Web and API access via Globus
      • Federated login
      • Self-managed by PIs
      • https://press3.mcs.anl.gov/petrel/
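As a hedged illustration of what "API access via Globus" can look like, the sketch below shows how a Petrel-hosted share could be listed with the public `globus-sdk` Transfer API. The endpoint UUID is a placeholder, not the real Petrel endpoint, and the authenticated calls are shown only as comments; the `summarize_listing` helper just organizes the entry dictionaries that `TransferClient.operation_ls` returns.

```python
# Sketch only: PETREL_ENDPOINT is a placeholder, not the real Petrel UUID.
PETREL_ENDPOINT = "00000000-0000-0000-0000-000000000000"

def summarize_listing(entries):
    """Split a Globus `operation_ls`-style result into (dirs, files).

    Each entry is a dict with at least "name" and "type" ("dir" or "file"),
    matching the shape of Globus Transfer directory-listing entries.
    """
    dirs = sorted(e["name"] for e in entries if e["type"] == "dir")
    files = sorted(e["name"] for e in entries if e["type"] == "file")
    return dirs, files

if __name__ == "__main__":
    # With an authenticated client (requires a federated Globus login):
    #   import globus_sdk
    #   tc = globus_sdk.TransferClient(authorizer=...)
    #   entries = tc.operation_ls(PETREL_ENDPOINT, path="/")
    # Illustrative stand-in data:
    entries = [
        {"name": "outerrim", "type": "dir"},
        {"name": "README.txt", "type": "file"},
    ]
    print(summarize_listing(entries))
```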
  12. Web portal for easy access to simulations
      • Currently: ~82.5 TB in our project, covering three simulation projects
      • Step 0: Register with Globus
      • Step 1: Select a simulation project
      • Step 2: Select data products; information about data size is available
      • Step 3: Transfer with Globus to an endpoint of your choice
  13.–14. (Portal screenshots walking through the same steps.)
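Step 3 above can be sketched with the public `globus-sdk`. The endpoint names, dataset paths, and the `dest_root` default below are all hypothetical placeholders, not the portal's real values; the actual Globus submission, which needs an authenticated `TransferClient`, is shown only as comments, while `build_transfer_items` is a small pure helper that maps selected data products to destination paths.

```python
def build_transfer_items(products, dest_root="/~/hacc"):
    """Map selected data-product source paths to (source, destination) pairs.

    `dest_root` is an illustrative default destination directory.
    """
    items = []
    for src in products:
        name = src.rstrip("/").split("/")[-1]  # last path component
        items.append((src, f"{dest_root}/{name}"))
    return items

if __name__ == "__main__":
    # Actual submission would look roughly like this (hypothetical UUIDs):
    #   import globus_sdk
    #   tc = globus_sdk.TransferClient(authorizer=...)
    #   tdata = globus_sdk.TransferData(tc, SOURCE_EP, DEST_EP, label="HACC data")
    #   for src, dst in build_transfer_items(selected):
    #       tdata.add_item(src, dst)
    #   task = tc.submit_transfer(tdata)
    selected = ["/outerrim/particles/snapshot_499/", "/outerrim/halos/"]
    for src, dst in build_transfer_items(selected):
        print(src, "->", dst)
```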
  15. “The purpose of computing is insight, not numbers.” - Richard Hamming
