Introducing the HACC Simulation Data Portal

Introduction to the HACC Simulation Data Portal
Globus World 2019; Chicago, May 1, 2019
Katrin Heitmann (Argonne National Laboratory)
Based on: arXiv:1904.11966
Introduction
• In cosmology we study the origin, evolution, and make-up of the Universe
• Many unsolved questions:
  ○ What is the nature of dark energy and dark matter, making up 95% of the energy-matter budget of our Universe?
  ○ What is the mass of the lightest particle in the Universe, the neutrino?
  ○ How can we learn more about the very first moments of the Universe?
• Upcoming cosmological surveys try to answer these questions and rely on detailed, complex simulations
  ○ Simulations are carried out and analyzed on the largest supercomputers available world-wide
  ○ Cosmological simulations generate large amounts of data (PBs) to capture the evolution of the Universe faithfully
  ○ Given the resources required for these simulations, it is crucial to share them with the community to enable the best possible science outcome
Images: HACC/Galacticus/GalSim; Hubble Ultra Deep Field (NASA)
What is needed ...
A large-scale effort that provides easy access to a range of simulation products to the world’s cosmologists, as well as analysis capabilities to established survey collaborations
[Architecture diagram: Public access to cosmological data and computational support for collaborations]
• Simulation (HPC allocations, e.g., INCITE, ALCC) and Analysis: Cooley, Theta (10 PF)
• Simulation job description and analysis job description, passed through a job submission/adaptation layer
• Storage: O(50 PB total)
• Datasets: Petrel, O(1 PB, 100 TB to start), with Globus Online
• Portal and Globus: user community via web and community-specific clients
• Phoenix: ALCF-hosted, collaboration-controlled resources, physical/virtual machine(s)
• Collaboration-installed web/data interfaces: LSST DM Butler, Jupyter, PDACS (Galaxy), DESCQA, visualization, databases, Globus, workflows
In collaboration with Tom Uram, Mike Papka, Ian Foster
Temporary storage, expires with allocation; only collaborators on the project have direct access
What exists ...
• Petrel and Phoenix
• Simulations
• First version of web portal using Globus
• Petrel: Data Management and Sharing Pilot, hosted at Argonne
• 1.7 PB parallel filesystem
• Embedded in Argonne’s 100+ Gbps network fabric to allow high-speed data transfers
• Web and API access via Globus
• Federated login
• Self-managed by PIs
• https://press3.mcs.anl.gov/petrel/
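To give a rough sense of scale for the figures on this slide, the sketch below estimates how long a bulk transfer would take at a given link rate. It is an idealized back-of-the-envelope calculation, not a measurement of Petrel: it assumes decimal units (1 PB = 10^15 bytes) and a sustained line rate, and the function name and the `efficiency` parameter are hypothetical, introduced here only for illustration.

```python
PB = 10**15      # bytes in a petabyte (decimal units, an assumption)
GBPS = 10**9     # bits per second in one gigabit per second


def transfer_time_hours(size_bytes: float, rate_bits_per_s: float,
                        efficiency: float = 1.0) -> float:
    """Idealized time in hours to move size_bytes at rate_bits_per_s.

    efficiency is a hypothetical factor in (0, 1] for the fraction of the
    nominal rate that is actually sustained; 1.0 means full line rate.
    """
    if size_bytes < 0 or rate_bits_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, rate > 0, efficiency in (0, 1]")
    seconds = (size_bytes * 8) / (rate_bits_per_s * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Moving the full 1.7 PB filesystem at a sustained 100 Gbps:
    hours = transfer_time_hours(1.7 * PB, 100 * GBPS)
    print(f"{hours:.1f} hours")  # about 37.8 hours
```

Real transfers depend on both endpoints, the filesystems, and the path between them, so the result is best read as a lower bound under the stated assumptions.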
• Web portal for easy access to simulations
• Currently: ~82.5 TB in our project, covering three simulation projects
• Step 0: Register with Globus
• Step 1: Select simulation project
• Step 2: Select data products (information about data size is available)
• Step 3: Transfer with Globus to an endpoint of your choice
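Steps 1 to 3 above can be sketched as a small data model. This is a minimal illustration, not the portal's implementation and not the Globus API: the catalog contents, project and product names, sizes, paths, and the shape of the transfer request are all hypothetical, and the actual transfer in Step 3 is carried out by Globus rather than by this code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    name: str
    path: str        # hypothetical path on the source endpoint
    size_tb: float   # hypothetical size, shown to the user in Step 2


# Hypothetical catalog: simulation project -> available data products.
CATALOG = {
    "example-project-a": [
        DataProduct("particles", "/example-project-a/particles/", 4.0),
        DataProduct("halos", "/example-project-a/halos/", 0.5),
    ],
    "example-project-b": [
        DataProduct("halos", "/example-project-b/halos/", 1.5),
    ],
}


def select_products(project: str, names: list[str]) -> list[DataProduct]:
    """Steps 1 and 2: pick a project, then pick data products by name."""
    available = {p.name: p for p in CATALOG[project]}
    missing = [n for n in names if n not in available]
    if missing:
        raise KeyError(f"unknown data products for {project}: {missing}")
    return [available[n] for n in names]


def total_size_tb(products: list[DataProduct]) -> float:
    """Step 2: total size of the selection, reported before transfer."""
    return sum(p.size_tb for p in products)


def build_transfer_request(products: list[DataProduct],
                           destination_endpoint: str,
                           destination_root: str) -> dict:
    """Step 3: describe the transfer; a real portal would hand this to Globus."""
    root = destination_root.rstrip("/")
    return {
        "destination_endpoint": destination_endpoint,
        "items": [
            {"source_path": p.path, "destination_path": root + p.path}
            for p in products
        ],
    }


if __name__ == "__main__":
    chosen = select_products("example-project-a", ["particles", "halos"])
    print(total_size_tb(chosen), "TB selected")
    print(build_transfer_request(chosen, "my-endpoint-id", "/scratch/hacc/"))
```

Keeping the size next to each product is what lets the user see the volume of a selection before committing to a transfer.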
“The purpose of computing is insight not numbers”
- Richard Hamming