“NRP Application Drivers”
4th National Research Platform (4NRP) Workshop
February 9, 2023
Dr. Larry Smarr
Founding Director Emeritus, California Institute for Telecommunications and Information Technology;
Distinguished Professor Emeritus, Dept. of Computer Science and Engineering
Jacobs School of Engineering, UCSD
Rotating Storage
4000 TB
2023: NRP’s Nautilus is a Multi-Institution National to Global Scale Hypercluster
Connected by Optical Networks
~200 FIONAs on 25 Partner Campuses
Networked Together at 10-100Gbps
Feb 9, 2023
Grafana Graphs Nautilus Namespaces Usage
Calendar 2022 GPUs
Grafana Graphs Nautilus Namespaces Usage
Calendar 2022 CPU Cores
2022 Nautilus Namespace Users:
Largest User is One Million Times Smallest!
Nautilus Namespaces
Using >10 GPU-hrs/year
Or >10 CPU-hrs/year
I Will Look in Detail at the
Namespaces in Red
The New Pacific Research Platform Video
Highlights 3 Different Applications Out of 800 Nautilus Namespace Projects
Pacific Research Platform Video:
2015 PRP Grant Was Science-Driven:
Connecting Multi-Campus Application Teams and Devices
UC San Diego, UC Berkeley, UC Merced
What Are
The Largest 2022
PRP Users
in Each Area?
The Open Science Grid (OSG)
Has Been Integrated With the PRP
In aggregate ~ 200,000 Intel x86 cores
used by ~400 projects
Source: Frank Würthwein,
OSG Exec Director; PRP co-PI; UCSD/SDSC
OSG Federates ~100 Clusters Worldwide
All OSG Users
Use HTCondor for
Resource Orchestration
OSG Petabyte
Storage Caches
The Open Science Grid (OSG) Delivers to Over 50 Fields of Science
2.6 Billion Core-Hours Per Year of Distributed High Throughput Computing
NCSA Delivered
~35,000 Core-Hours
Per Year in 1990
PRP’s Nautilus Appears
as Just Another OSG Resource
Nautilus Namespace osg-opportunistic Supported a Wide Set of Applications
As the Largest Consumer of CPU Core-Hours in 2022
Source: Igor Sfiligoi, SDSC
3.7 Million CPU Core-Hours
Peaking at 3500 CPU Cores
osg-opportunistic runs fully in low-priority mode,
using only PRP CPU cycles
that would otherwise be unused.
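As a rough check on the figures above, a yearly core-hour total can be converted into the average number of cores in continuous use. This is a minimal sketch, assuming a 365-day year; the function and variable names are illustrative and not part of any PRP tooling.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap calendar year (8,760 hours)


def average_units_in_use(unit_hours: float, hours: float = HOURS_PER_YEAR) -> float:
    """Average number of cores (or GPUs) busy at any moment over the period."""
    return unit_hours / hours


# osg-opportunistic in 2022: 3.7 million CPU core-hours, peaking at 3500 cores
osg_avg_cores = average_units_in_use(3.7e6)
peak_to_average = 3500 / osg_avg_cores

print(f"average ~{osg_avg_cores:.0f} cores, peak/average ~{peak_to_average:.1f}x")
# prints: average ~422 cores, peak/average ~8.3x
```

The same conversion applies to the GPU-hour totals quoted for other namespaces. A peak well above the average is what one would expect from a workload that only fills otherwise idle cycles.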
Particle Physics
Bringing Machine Learning to Particle Physics
A new particle was
discovered in 2012
The “holy grail” of the LHC program today is measurement of di-Higgs
production to infer the hhh coupling that determines the Higgs potential
Source: Frank Wuerthwein, SDSC
ML Inference as a Service on NRP
Raghav Kansal (grad student, UCSD) runs ~1,000 CPU jobs calling out to
~10 GPUs on NRP for inference for his ML model for hh search.
80M events inferenced, sending 1.3TB of data from CPUs to GPUs in 3h
The ML model is too large to fit into the DRAM of the CPUs.
Fastest way to get the job done is “ML Inference as a service” on NRP
~4MB/s output from GPUs
~200MB/s input to GPUs
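The run totals quoted above can be turned into average rates with a few lines of arithmetic. This is a sketch only, assuming decimal units (1 TB = 10^12 bytes) and a run of exactly 3 hours. The resulting average input rate is of the same order as the ~200MB/s quoted on the slide, which may be a peak or differently measured figure; the slide does not say.

```python
events = 80e6          # events inferenced (from the slide)
bytes_sent = 1.3e12    # 1.3 TB sent from CPUs to GPUs, assuming decimal terabytes
seconds = 3 * 3600     # assumption: the run lasted exactly 3 hours

avg_input_mb_per_s = bytes_sent / seconds / 1e6   # average CPU -> GPU rate
events_per_s = events / seconds                   # average inference rate
kb_per_event = bytes_sent / events / 1e3          # average payload per event

print(f"~{avg_input_mb_per_s:.0f} MB/s average input to GPUs")
print(f"~{events_per_s:.0f} events/s, ~{kb_per_event:.2f} kB per event")
```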
See Talk by
Shih-Chieh Hsu
4NRP Friday
Source: Frank Wuerthwein, SDSC
Namespace cms-ml Was the
4th Largest Consumer of Nautilus GPU-Hours in 2022
157,571 GPU-Hours
Peaking at 130 GPUs
PI Frank Wuerthwein, UCSD
Co-Existence of Interactive and
Non-Interactive Computing on PRP
GPU Simulations Needed to Improve Ice Model.
=> Results in Significant Improvement
in Pointing Resolution for Multi-Messenger Astrophysics
NSF Large-Scale Observatories Are Using PRP and OSG
as a Cohesive, Federated, National-Scale Research Data Infrastructure
IceCube Peaked at
560 GPUs in 2022!
Namespace osg-icecube
Was the Largest Consumer of Nautilus GPU-Hours in 2022
0.8 Million GPU-Hours
Peaking at 560 GPUs
osg-icecube also runs fully in low-priority mode,
using only PRP GPU cycles
that would otherwise be unused.
In 2022 IceCube was the Largest Consumer of OSG GPU-Hours
and PRP was the Largest Supplier of GPU-Hours to OSG
Laser Interferometer Gravitational-Wave Observatory (LIGO)
Uses Nautilus/OSG Data Cyberinfrastructure
• LIGO Runs Their Production Rucio Data Management System on Nautilus
– Rucio is the De-Facto Data Management System for Many Large Instruments, LIGO, LHC, …
– LIGO Continues to be One of the Major Users of the OSG Caching Infrastructure (A.K.A.
Stashcache), Which is Deployed Mostly as PRP-Managed Kubernetes Pods.
• LIGO Does Not Use Much PRP Compute Given Their Dedicated Infrastructure
PRP Supports Radio Telescopes Through Partnering with
CASPER: the Collaboration for Astronomy Signal Processing and Electronics Research
PRP Access Has Allowed CASPER
to Expand in Several Aspects:
• PRP Portal to CASPER Tools/Libraries
Was Developed by PRP’s John Graham
• The PRP Team Added FPGAs to Nautilus
FIONAs with the CASPER Software Stack
• Nautilus JupyterHub Used for FPGA Training
• Optical Fiber Connected Data Storage
Source: Dan Werthimer
SETI Chief Scientist, UC Berkeley,
Xilinx, Intel, Fujitsu, HP, Nvidia,
The CASPER Collaboration of ~1000 Members
and 50 Radio-Astronomy Instruments Worldwide
to Develop Open-Source
Signal Processing and Instrumentation Pipelines,
Primarily using FPGAs and GPUs.
Radio Telescopes include:
• Event Horizon Telescope
• Square Kilometer Array
• Very Large Array
PRP Portal to CASPER Tools/Libraries
Developed by PRP’s John Graham, UCSD
See John Graham’s CASPER 2021 Workshop Talk and Tutorial:
CASPER designs,
compiles, tests
and evaluates
on the PRP,
then deploys
clusters at the
Discoveries Made with CASPER-Enabled Instrumentation
Radio Image
of a Black Hole
Fast Radio Bursts
Weighing the Universe
Pulsar Timing
Gravitational Waves
Diamond Planet
Prostheses Control
Neutron Imaging
Source: Dan Werthimer, UC Berkeley
OpenForceField Uses OPEN Software, OPEN Data, OPEN Science
and PRP to Generate Quantum Chemistry Datasets for Druglike Molecules
OFF Open-Source Models are Used in Drug Discovery,
Including in the COVID-19 Computing on Folding@Home.
OFF Runs Quantum Mechanical Computations on Many Molecules
to Determine Their Optimized Force Fields
50% of OFF compute is run on Nautilus.
PRP is Capable of Running Millions of Quantum Chemistry Workloads
OpenFF-1.0.0 released OpenFF-2.0.0 released
OpenFF begins using Nautilus
We run "workers" that pull down QC jobs
for computation from a central project queue.
These jobs require between minutes and hours,
and results are uploaded to the
central, public QCArchive server.
Workers are deployed from Docker images and
scheduled on PRP's Kubernetes system. Due to
the short job duration, these deployments can still
be effective if interrupted every few hours.
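The pull-based worker pattern described above can be sketched as follows. This is an illustration under stated assumptions, not OpenFF's actual worker code: an in-memory `queue.Queue` stands in for the central project queue, a plain dict stands in for the QCArchive server, and `run_worker`, `make_flaky`, and the `InterruptedError` signal are hypothetical names chosen for the example.

```python
import queue


def run_worker(jobs: queue.Queue, results: dict, compute) -> int:
    """Pull jobs until the queue is empty; return how many were completed.

    If compute() is interrupted, the job goes back on the queue so that
    another worker can pick it up, and this worker stops.
    """
    done = 0
    while True:
        try:
            job_id, payload = jobs.get_nowait()
        except queue.Empty:
            break
        try:
            results[job_id] = compute(payload)  # stand-in for uploading a result
        except InterruptedError:
            jobs.put((job_id, payload))         # requeue the unfinished job
            break
        done += 1
    return done


def make_flaky(fail_on_call: int):
    """Build a compute function that is 'preempted' on its Nth call."""
    calls = {"n": 0}

    def compute(x):
        calls["n"] += 1
        if calls["n"] == fail_on_call:
            raise InterruptedError("worker preempted")
        return x * x

    return compute


jobs = queue.Queue()
for i in range(5):
    jobs.put((i, i))

results = {}
first = run_worker(jobs, results, make_flaky(3))     # interrupted on its third job
second = run_worker(jobs, results, lambda x: x * x)  # a fresh worker finishes the rest
print(first, second, sorted(results.items()))
```

Because each job is short and independent, an interruption costs at most the one job in flight, which is why deployments of this kind can tolerate being stopped every few hours.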
OFF Was the Top Nautilus CPU Core Consumer
in 2020 & 2021, 4th Highest in 2022
7.6 Million CPU Core-Hours
Peaking at 1300 CPU Cores
OFF Datasets Consist of Hundreds to Millions of Jobs,
Each Requiring Tens to Thousands of CPU-Hours and 8-32 GB of RAM
Dataset listing:
Python example notebooks for data access:
OpenFF’s dataset lifecycle:
The OFF Datasets on QCArchive
are Fully Open!
Nautilus Namespace tempredict Utilized PRP to Compute
COVID-19 and Vaccine Responses of ~65K Participants
Purawat et al., IEEE Big Data, 2021
Mason et al., Sci Rep, 2021
Mason et al., Vaccines, 2022
Source: Prof. Benjamin Smarr, UCSD
Nautilus Namespace braingeneers: One of the Most Advanced PRP projects -
Uses Optical Fiber Connected Shared Storage, CPUs & GPUs
UCSC/Hengenlab Data Analysis Pipeline Using PRP
Source: David Parks, UCSC; braingeneers PI David Haussler
Multiple Worker Processes
Circulate Data
in a 50GB Cache
Sampling Strategy
for braingeneers TB+ data
Jobs Local
Model Training
on the Local Cache
are Returned
to S3
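One way to read the diagram labels above is as a bounded local cache that background workers keep refreshing from the full dataset while training draws random batches from whatever is currently cached. The sketch below illustrates that reading only; it is not the braingeneers pipeline, and `RotatingCache`, the chunk ids, and the sizes are hypothetical.

```python
import random
from collections import deque


class RotatingCache:
    """Fixed-capacity cache: adding a chunk evicts the oldest one when full."""

    def __init__(self, capacity_chunks: int):
        self._chunks = deque(maxlen=capacity_chunks)

    def refresh(self, chunk) -> None:
        """Called by a background worker after fetching a chunk from bulk storage."""
        self._chunks.append(chunk)

    def sample(self, k: int, rng: random.Random) -> list:
        """Draw k distinct cache slots at random for one training batch."""
        return rng.sample(list(self._chunks), k)

    def __len__(self) -> int:
        return len(self._chunks)


rng = random.Random(0)
dataset_size = 1000                  # hypothetical number of chunks in the full dataset
cache = RotatingCache(capacity_chunks=50)

for _ in range(200):                 # workers circulate chunks through the cache
    cache.refresh(rng.randrange(dataset_size))

batch = cache.sample(8, rng)         # training reads only from the local cache
print(len(cache), len(batch))
```

The point of the design is that training never waits on remote storage: over time the cache contents rotate through the larger dataset, so batches approximate samples from the whole.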
Source: David Parks, UCSC; braingeneers PI David Haussler
UCSC, UCSF & WUSTL Are Collaborating
To Grow Human Cerebral Organoids and Measure Their Neural Activity
Multi Electrode Array Silicon Probes
Source: David Parks, UCSC; braingeneers PI David Haussler
Goal: For Every Human Brain Slice, Grow 1000 Organoids,
And For Every Organoid, Compute 1000 Simulated Organoids
From Neural Activity in Living Mouse Brain
To Neural Activity in Human Brain Organoids
Source: David Parks, UCSC; braingeneers PI David Haussler
Nautilus Namespace braingeneers
Was The 3rd Largest Consumer of CPU Core-Hours in 2022
57,000 GPU-Hours
Peaking at 110 GPUs
950,000 CPU Core-Hours
Peaking at 2000 CPU Cores
NeuroKube: An Automated Neuroscience Reconstruction Framework
Uses Nautilus for Large-Scale Processing & Labeling of Neuroimage Volumes
Figures 2, 4, & 5 in “NeuroKube:
An Automated and Autoscaling Neuroimaging Reconstruction Framework
Using Cloud Native Computing and A.I.,”
Matthew Madany, et al. (IEEE Big Data ’20, pp. 320-330)
Computer Vision-Based Approach
Provides the Potential to Automatically Generate Labels Using ML
Subset of Neurites from
Cerebellum Neuropil
Extracted & Rendered
in 3D with Structures
of Interest Labeled
Figures 1 & 14 in “NeuroKube:
An Automated and Autoscaling
Neuroimaging Reconstruction
Framework using
Cloud Native Computing
and A.I.,”
Matthew Madany, et al.
(accepted to IEEE Big Data ’20)
Volumetric Electron Microscopy (VEM)
Data with Colorized Labels
Earth Sciences
NSF-Funded WIFIRE Uses PRP/CENIC to Couple Wireless Edge Sensors
With Supercomputers, Enabling Fire Modeling Workflows
Landscape data
WIFIRE Firemap
Fire Perimeter
Source: Ilkay Altintas, SDSC
Meteorological Sensors
Weather Forecasts
Work Flow
WIFIRE’s Firemap Provides Public Website
Combining Satellite Fire Detections with GIS
SoCal Wildfires Sept 6, 2022
PRP is Building on NSF-Funded SAGE Technology
to Bring ML/AI to the Edge For Smoke Plume Detection
Source: Charlie Catlett, Pete Beckman, Argonne National Lab
Source: Ilkay Altintas, SDSC, HDSI
Training Data: Archive of
25,000 Labeled Wireless Camera Images
of Wildland Fires
PRP namespace digits
Nautilus Namespace wifire-quicfire was the 25th Largest 2022 Consumer of CPU Core-Hours;
digits was the 14th Largest GPU Consumer
108,000 CPU Core-Hours
Peaking at 360 CPU Cores
40,700 GPU-Hours
Peaking at 18 GPUs
Visualization and Virtual Reality
2017: PRP 20Gbps Connection of UCSD SunCAVE and UCM WAVE Over CENIC
2018-2019: Added Their 90 GPUs to PRP for Machine Learning Computations
Leveraging UCM Campus Funds and NSF CNS-1456638 & CNS-1730158 at UCSD
UC Merced WAVE (20 Screens, 20 GPUs)
UCSD SunCAVE (70 Screens, 70 GPUs)
See These VR Facilities in Action in the PRP Video
PRP Has Been Bringing Machine Learning to Building Virtual Worlds,
Including Robotics and Autonomous Vehicles
• Goal: Train Robots That Can Manipulate Arbitrary Objects
o Open Drawer, Turn Faucet, Stack Cube, Pull Chair,
Pour Water, Pick And Place, Hang Ropes, Make
Dough, …
Namespace ucsd-haosulab
Consumed the 2nd Most Nautilus GPU-Hours in 2022 (1st is IceCube)
585,170 GPU-Hours
Peaking at 150 GPUs
A Major Project in UCSD’s Hao Su Lab
is Large-Scale Robot Learning
• We Build A Digital Twin of The Real World in Virtual Reality (VR)
For Object Manipulation
• Agents Evolve In VR
o Specialists (Neural Nets) Learn Specific Skills
by Trial and Error
o Generalists (Neural Nets) Distill Knowledge
to Solve Arbitrary Tasks
• On Nautilus:
o Hundreds of specialists
have been trained
o Each specialist is trained
in millions of environments
o ~10,000 GPU hours per
UCSD’s Ravi Group: How to Create Visually Realistic
3D Objects or Dynamic Scenes in VR or the Metaverse
Source: Prof. Ravi Ramamoorthi, UCSD
ML Computing Transforms a Series of 2D Images
Into a 3D View Synthesis
Machine Learning-Based
Neural Radiance Fields for View Synthesis (NeRFs) Are Transformational!
NOVEMBER 10, 2022
A neural radiance field (NeRF) is
a fully-connected neural network
that can generate
novel views of complex 3D scenes,
based on a partial set of 2D images.
Source: Prof. Ravi Ramamoorthi, UCSD
Namespace ucsd-ravigroup
Consumed the 3rd Most Nautilus GPU-Hours in 2022
200,000 GPU-Hours
Peaking at 122 GPUs
• Much of the compute involves training computationally expensive NeRFs.
• Training time to learn a representation of a single scene on a GPU can vary from seconds to a day.
• NeRFs that can see behind occlusions may require a week of training on 8 GPUs simultaneously.
Source: Alexander Trevithick, UCSD Ravi Group
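To put the training costs above in proportion to the namespace total, the following sketch converts "a week on 8 GPUs" into GPU-hours and compares it with the 200,000 GPU-hours reported for 2022. It only gives an upper bound on how many such runs the total could cover if nothing else were run; it says nothing about the group's actual mix of jobs.

```python
HOURS_PER_WEEK = 7 * 24

gpus_per_run = 8                                  # from the slide
gpu_hours_per_run = gpus_per_run * HOURS_PER_WEEK  # one week on 8 GPUs
namespace_gpu_hours_2022 = 200_000                # ucsd-ravigroup total from the slide

equivalent_runs = namespace_gpu_hours_2022 / gpu_hours_per_run

print(f"{gpu_hours_per_run} GPU-hours per run, ~{equivalent_runs:.0f} such runs per year at most")
# prints: 1344 GPU-hours per run, ~149 such runs per year at most
```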
2022-2026 NRP Future: PRP Federates with
NSF-Funded Prototype National Research Platform
NSF Award OAC #2112167 (June 2021) [$5M Over 5 Years]
PI Frank Wuerthwein (UCSD, SDSC)
Co-PIs Tajana Rosing (UCSD), Thomas DeFanti (UCSD),
Mahidhar Tatineni (SDSC), Derek Weitzel (UNL)

More Related Content

Similar to Larry Smarr - NRP Application Drivers

Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
The PRP and Its Applications
The PRP and Its ApplicationsThe PRP and Its Applications
The PRP and Its Applications
Larry Smarr
How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental science
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
Larry Smarr
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource Provisioning
Rafael Ferreira da Silva
CHASE-CI: A Distributed Big Data Machine Learning Platform
CHASE-CI: A Distributed Big Data Machine Learning PlatformCHASE-CI: A Distributed Big Data Machine Learning Platform
CHASE-CI: A Distributed Big Data Machine Learning Platform
Larry Smarr
High Performance Cyberinfrastructure Enabling Data-Driven Science Supporting ...
High Performance Cyberinfrastructure Enabling Data-Driven Science Supporting ...High Performance Cyberinfrastructure Enabling Data-Driven Science Supporting ...
High Performance Cyberinfrastructure Enabling Data-Driven Science Supporting ...
Larry Smarr
A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...
A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...
A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...
Larry Smarr
OpenACC Monthly Highlights: May 2020
OpenACC Monthly Highlights: May 2020OpenACC Monthly Highlights: May 2020
OpenACC Monthly Highlights: May 2020
Analyzing Large Earth Data Sets: New Tools from the OptiPuter and LOOKING Pro...
Analyzing Large Earth Data Sets: New Tools from the OptiPuter and LOOKING Pro...Analyzing Large Earth Data Sets: New Tools from the OptiPuter and LOOKING Pro...
Analyzing Large Earth Data Sets: New Tools from the OptiPuter and LOOKING Pro...
Larry Smarr
The Pacific Research Platform
 Two Years In
The Pacific Research Platform
 Two Years InThe Pacific Research Platform
 Two Years In
The Pacific Research Platform
 Two Years In
Larry Smarr
High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...
High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...
High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...
Larry Smarr
Toward a Global Research Platform for Big Data Analysis
Toward a Global Research Platform for Big Data AnalysisToward a Global Research Platform for Big Data Analysis
Toward a Global Research Platform for Big Data Analysis
Larry Smarr
Toward a Global Interactive Earth Observing Cyberinfrastructure
Toward a Global Interactive Earth Observing CyberinfrastructureToward a Global Interactive Earth Observing Cyberinfrastructure
Toward a Global Interactive Earth Observing Cyberinfrastructure
Larry Smarr
OpenACC Monthly Highlights Summer 2019
OpenACC Monthly Highlights Summer 2019OpenACC Monthly Highlights Summer 2019
OpenACC Monthly Highlights Summer 2019
Berkeley cloud computing meetup may 2020
Berkeley cloud computing meetup may 2020Berkeley cloud computing meetup may 2020
Berkeley cloud computing meetup may 2020
Larry Smarr
Advances at the Argonne Leadership Computing Center
Advances at the Argonne Leadership Computing CenterAdvances at the Argonne Leadership Computing Center
Advances at the Argonne Leadership Computing Center
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
Larry Smarr
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
Larry Smarr
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
Larry Smarr

Similar to Larry Smarr - NRP Application Drivers (20)

Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
The PRP and Its Applications
The PRP and Its ApplicationsThe PRP and Its Applications
The PRP and Its Applications
How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental science
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource Provisioning
CHASE-CI: A Distributed Big Data Machine Learning Platform
CHASE-CI: A Distributed Big Data Machine Learning PlatformCHASE-CI: A Distributed Big Data Machine Learning Platform
CHASE-CI: A Distributed Big Data Machine Learning Platform
High Performance Cyberinfrastructure Enabling Data-Driven Science Supporting ...
High Performance Cyberinfrastructure Enabling Data-Driven Science Supporting ...High Performance Cyberinfrastructure Enabling Data-Driven Science Supporting ...
High Performance Cyberinfrastructure Enabling Data-Driven Science Supporting ...
A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...
A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...
A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...
OpenACC Monthly Highlights: May 2020
OpenACC Monthly Highlights: May 2020OpenACC Monthly Highlights: May 2020
OpenACC Monthly Highlights: May 2020
Analyzing Large Earth Data Sets: New Tools from the OptiPuter and LOOKING Pro...
Analyzing Large Earth Data Sets: New Tools from the OptiPuter and LOOKING Pro...Analyzing Large Earth Data Sets: New Tools from the OptiPuter and LOOKING Pro...
Analyzing Large Earth Data Sets: New Tools from the OptiPuter and LOOKING Pro...
The Pacific Research Platform
 Two Years In
The Pacific Research Platform
 Two Years InThe Pacific Research Platform
 Two Years In
The Pacific Research Platform
 Two Years In
High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...
High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...
High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...
Toward a Global Research Platform for Big Data Analysis
Toward a Global Research Platform for Big Data AnalysisToward a Global Research Platform for Big Data Analysis
Toward a Global Research Platform for Big Data Analysis
Toward a Global Interactive Earth Observing Cyberinfrastructure
Toward a Global Interactive Earth Observing CyberinfrastructureToward a Global Interactive Earth Observing Cyberinfrastructure
Toward a Global Interactive Earth Observing Cyberinfrastructure
OpenACC Monthly Highlights Summer 2019
OpenACC Monthly Highlights Summer 2019OpenACC Monthly Highlights Summer 2019
OpenACC Monthly Highlights Summer 2019
Berkeley cloud computing meetup may 2020
Berkeley cloud computing meetup may 2020Berkeley cloud computing meetup may 2020
Berkeley cloud computing meetup may 2020
Advances at the Argonne Leadership Computing Center
Advances at the Argonne Leadership Computing CenterAdvances at the Argonne Leadership Computing Center
Advances at the Argonne Leadership Computing Center
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform

More from Larry Smarr

My Remembrances of Mike Norman Over The Last 45 Years
My Remembrances of Mike Norman Over The Last 45 YearsMy Remembrances of Mike Norman Over The Last 45 Years
My Remembrances of Mike Norman Over The Last 45 Years
Larry Smarr
Metagenics How Do I Quantify My Body and Try to Improve its Health? June 18 2019
Metagenics How Do I Quantify My Body and Try to Improve its Health? June 18 2019Metagenics How Do I Quantify My Body and Try to Improve its Health? June 18 2019
Metagenics How Do I Quantify My Body and Try to Improve its Health? June 18 2019
Larry Smarr
Panel: Reaching More Minority Serving Institutions
Panel: Reaching More Minority Serving InstitutionsPanel: Reaching More Minority Serving Institutions
Panel: Reaching More Minority Serving Institutions
Larry Smarr
Global Network Advancement Group - Next Generation Network-Integrated Systems
Global Network Advancement Group - Next Generation Network-Integrated SystemsGlobal Network Advancement Group - Next Generation Network-Integrated Systems
Global Network Advancement Group - Next Generation Network-Integrated Systems
Larry Smarr
Wireless FasterData and Distributed Open Compute Opportunities and (some) Us...
 Wireless FasterData and Distributed Open Compute Opportunities and (some) Us... Wireless FasterData and Distributed Open Compute Opportunities and (some) Us...
Wireless FasterData and Distributed Open Compute Opportunities and (some) Us...
Larry Smarr
Panel Discussion: Engaging underrepresented technologists, researchers, and e...
Panel Discussion: Engaging underrepresented technologists, researchers, and e...Panel Discussion: Engaging underrepresented technologists, researchers, and e...
Panel Discussion: Engaging underrepresented technologists, researchers, and e...
Larry Smarr
The Asia Pacific and Korea Research Platforms: An Overview Jeonghoon Moon
The Asia Pacific and Korea Research Platforms: An Overview Jeonghoon MoonThe Asia Pacific and Korea Research Platforms: An Overview Jeonghoon Moon
The Asia Pacific and Korea Research Platforms: An Overview Jeonghoon Moon
Larry Smarr
Panel: Reaching More Minority Serving Institutions
Panel: Reaching More Minority Serving InstitutionsPanel: Reaching More Minority Serving Institutions
Panel: Reaching More Minority Serving Institutions
Larry Smarr
Panel: The Global Research Platform: An Overview
Panel: The Global Research Platform: An OverviewPanel: The Global Research Platform: An Overview
Panel: The Global Research Platform: An Overview
Larry Smarr
Panel: Future Wireless Extensions of Regional Optical Networks
Panel: Future Wireless Extensions of Regional Optical NetworksPanel: Future Wireless Extensions of Regional Optical Networks
Panel: Future Wireless Extensions of Regional Optical Networks
Larry Smarr
Global Research Platform Workshops - Maxine Brown
Global Research Platform Workshops - Maxine BrownGlobal Research Platform Workshops - Maxine Brown
Global Research Platform Workshops - Maxine Brown
Larry Smarr
Built around answering questions
Built around answering questionsBuilt around answering questions
Built around answering questions
Larry Smarr
Panel: NRP Science Impacts​
Panel: NRP Science Impacts​Panel: NRP Science Impacts​
Panel: NRP Science Impacts​
Larry Smarr
Democratizing Science through Cyberinfrastructure - Manish Parashar
Democratizing Science through Cyberinfrastructure - Manish ParasharDemocratizing Science through Cyberinfrastructure - Manish Parashar
Democratizing Science through Cyberinfrastructure - Manish Parashar
Larry Smarr
Panel: Building the NRP Ecosystem with the Regional Networks on their Campuses;
Panel: Building the NRP Ecosystem with the Regional Networks on their Campuses;Panel: Building the NRP Ecosystem with the Regional Networks on their Campuses;
Panel: Building the NRP Ecosystem with the Regional Networks on their Campuses;
Larry Smarr
Open Force Field: Scavenging pre-emptible CPU hours* in the age of COVID - Je...
Open Force Field: Scavenging pre-emptible CPU hours* in the age of COVID - Je...Open Force Field: Scavenging pre-emptible CPU hours* in the age of COVID - Je...
Open Force Field: Scavenging pre-emptible CPU hours* in the age of COVID - Je...
Larry Smarr
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Larry Smarr
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Larry Smarr
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Larry Smarr
Frank Würthwein - NRP and the Path forward
Frank Würthwein - NRP and the Path forwardFrank Würthwein - NRP and the Path forward
Frank Würthwein - NRP and the Path forward
Larry Smarr

More from Larry Smarr (20)

My Remembrances of Mike Norman Over The Last 45 Years
My Remembrances of Mike Norman Over The Last 45 YearsMy Remembrances of Mike Norman Over The Last 45 Years
My Remembrances of Mike Norman Over The Last 45 Years
Metagenics How Do I Quantify My Body and Try to Improve its Health? June 18 2019
Metagenics How Do I Quantify My Body and Try to Improve its Health? June 18 2019Metagenics How Do I Quantify My Body and Try to Improve its Health? June 18 2019
Metagenics How Do I Quantify My Body and Try to Improve its Health? June 18 2019
Panel: Reaching More Minority Serving Institutions
Panel: Reaching More Minority Serving InstitutionsPanel: Reaching More Minority Serving Institutions
Panel: Reaching More Minority Serving Institutions
Global Network Advancement Group - Next Generation Network-Integrated Systems
Global Network Advancement Group - Next Generation Network-Integrated SystemsGlobal Network Advancement Group - Next Generation Network-Integrated Systems
Global Network Advancement Group - Next Generation Network-Integrated Systems
Wireless FasterData and Distributed Open Compute Opportunities and (some) Us...
 Wireless FasterData and Distributed Open Compute Opportunities and (some) Us... Wireless FasterData and Distributed Open Compute Opportunities and (some) Us...
Wireless FasterData and Distributed Open Compute Opportunities and (some) Us...
Panel Discussion: Engaging underrepresented technologists, researchers, and e...
Panel Discussion: Engaging underrepresented technologists, researchers, and e...Panel Discussion: Engaging underrepresented technologists, researchers, and e...
Panel Discussion: Engaging underrepresented technologists, researchers, and e...
The Asia Pacific and Korea Research Platforms: An Overview Jeonghoon Moon
The Asia Pacific and Korea Research Platforms: An Overview Jeonghoon MoonThe Asia Pacific and Korea Research Platforms: An Overview Jeonghoon Moon
The Asia Pacific and Korea Research Platforms: An Overview Jeonghoon Moon
Panel: Reaching More Minority Serving Institutions
Panel: Reaching More Minority Serving InstitutionsPanel: Reaching More Minority Serving Institutions
Panel: Reaching More Minority Serving Institutions
Panel: The Global Research Platform: An Overview
Panel: The Global Research Platform: An OverviewPanel: The Global Research Platform: An Overview
Panel: The Global Research Platform: An Overview
Panel: Future Wireless Extensions of Regional Optical Networks
Panel: Future Wireless Extensions of Regional Optical NetworksPanel: Future Wireless Extensions of Regional Optical Networks
Panel: Future Wireless Extensions of Regional Optical Networks
Global Research Platform Workshops - Maxine Brown
Global Research Platform Workshops - Maxine BrownGlobal Research Platform Workshops - Maxine Brown
Global Research Platform Workshops - Maxine Brown
Built around answering questions
Built around answering questionsBuilt around answering questions
Built around answering questions
Panel: NRP Science Impacts​
Panel: NRP Science Impacts​Panel: NRP Science Impacts​
Panel: NRP Science Impacts​

Larry Smarr - NRP Application Drivers

  • 1. “NRP Application Drivers” Presentation 4th National Research Platform (4NRP) Workshop February 9, 2023 Dr. Larry Smarr Founding Director Emeritus, California Institute for Telecommunications and Information Technology; Distinguished Professor Emeritus, Dept. of Computer Science and Engineering Jacobs School of Engineering, UCSD
  • 2. Rotating Storage 4000 TB 2023: NRP’s Nautilus is a Multi-Institution National to Global Scale Hypercluster Connected by Optical Networks ~200 FIONAs on 25 Partner Campuses Networked Together at 10-100Gbps Feb 9, 2023
  • 3. Grafana Graphs Nautilus Namespaces Usage Calendar 2022 GPUs 900
  • 4. Grafana Graphs Nautilus Namespaces Usage Calendar 2022 CPU Cores 7,000
  • 5. 2022 Nautilus Namespace Users: Largest User is One Million Times Smallest! Nautilus Namespaces Using >10 GPU-hrs/year Or >10 CPU-hrs/year. I Will Look in Detail at the Namespaces in Red. Labeled namespaces: osg-opportunistic, ucsd-haosulab, osg-icecube, ucsd-ravigroup, cms-ml, braingeneers, wifire-quicfire, digits
  • 6. The New Pacific Research Platform Video Highlights 3 Different Applications Out of 800 Nautilus Namespace Projects Pacific Research Platform Video:
  • 7. 2015 PRP Grant Was Science-Driven: Connecting Multi-Campus Application Teams and Devices Earth Sciences UC San Diego UC Berkeley UC Merced What Are The Largest 2022 PRP Users in Each Area?
  • 8. The Open Science Grid (OSG) Has Been Integrated With the PRP In aggregate ~ 200,000 Intel x86 cores used by ~400 projects Source: Frank Würthwein, OSG Exec Director; PRP co-PI; UCSD/SDSC OSG Federates ~100 Clusters Worldwide All OSG User Communities Use HTCondor for Resource Orchestration SDSC U.Chicago FNAL Caltech Distributed OSG Petabyte Storage Caches
  • 9. The Open Science Grid (OSG) Delivers to Over 50 Fields of Science 2.6 Billion Core-Hours Per Year of Distributed High Throughput Computing NCSA Delivered ~35,000 Core-Hours Per Year in 1990 CMS ATLAS PRP’s Nautilus Appears as Just Another OSG Resource
  • 10. Nautilus Namespace osg-opportunistic Supported a Wide Set of Applications As the Largest Consumer of CPU Core-Hours in 2022: 3.7 Million CPU Core-Hours, Peaking at 3500 CPU Cores. osg-opportunistic runs fully in low-priority mode, using only PRP CPU cycles that would otherwise be unused. Source: Igor Sfiligoi, SDSC
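As a rough reading of the osg-opportunistic figures on this slide, the sketch below converts the yearly total into an average core count and compares it with the stated peak. It assumes the 3.7 million core-hours are spread over all 8,760 hours of calendar 2022; the function name is illustrative and does not come from the talk.

```python
HOURS_IN_2022 = 365 * 24  # 8,760 hours; 2022 was not a leap year

def average_in_use(resource_hours, hours=HOURS_IN_2022):
    """Average number of cores (or GPUs) busy over the period."""
    return resource_hours / hours

# Figures quoted on the slide: 3.7 million core-hours, 3,500-core peak.
avg_cores = average_in_use(3.7e6)
peak_fraction = avg_cores / 3500

print(f"average ~{avg_cores:.0f} cores, {peak_fraction:.0%} of the peak")
```

An average well below the peak is what one would expect from a namespace that only picks up cycles nobody else is using.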
  • 12. Bringing Machine Learning to Particle Physics A new particle was discovered in 2012. The “holy grail” of the LHC program today is measurement of di-Higgs production to infer the hhh coupling λ that determines the Higgs potential. Source: Frank Wuerthwein, SDSC
  • 13. ML Inference as a Service on NRP Raghav Kansal (grad student, UCSD) runs ~1,000 CPU jobs calling out to ~10 GPUs on NRP for inference for his ML model for hh search. 80M events inferenced, sending 1.3TB of data from CPUs to GPUs in 3h. The ML model is too large to fit into the DRAM of the CPUs. Fastest way to get the job done is “ML Inference as a service” on NRP. ~200MB/s input to GPUs, ~4MB/s output from GPUs. See Talk by Shih-Chieh Hsu, 4NRP Friday. Source: Frank Wuerthwein, SDSC
  • 14. Namespace cms-ml Was the 4th Largest Consumer of Nautilus GPU-Hours in 2022 157,571 GPU-Hours Peaking at 130 GPUs PI Frank Wuerthwein, UCSD
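The numbers quoted on the "ML Inference as a Service" slide (80M events, 1.3TB sent, 3h) can be turned into per-event and per-second figures. This is back-of-the-envelope arithmetic, assuming decimal terabytes and that the 3 hours cover the whole transfer; the variable names are illustrative.

```python
TB = 1e12  # assumption: decimal terabytes

events = 80e6            # events inferenced
bytes_sent = 1.3 * TB    # data sent from CPUs to GPUs
seconds = 3 * 3600       # quoted wall-clock time

mean_input_mb_s = bytes_sent / seconds / 1e6   # average input rate to GPUs
events_per_s = events / seconds                # average inference throughput
kb_per_event = bytes_sent / events / 1e3       # average payload per event

print(f"{mean_input_mb_s:.0f} MB/s mean input, "
      f"{events_per_s:.0f} events/s, {kb_per_event:.2f} kB/event")
```

The mean input rate computed this way is lower than the ~200MB/s shown on the slide. That would be consistent with the slide quoting a rate observed during active transfer rather than an average over the full run, but the slide does not say which.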
  • 16. Co-Existence of Interactive and Non-Interactive Computing on PRP GPU Simulations Needed to Improve Ice Model. => Results in Significant Improvement in Pointing Resolution for Multi-Messenger Astrophysics NSF Large-Scale Observatories Are Using PRP and OSG as a Cohesive, Federated, National-Scale Research Data Infrastructure IceCube Peaked at 560 GPUs in 2022!
  • 17. Namespace osg-icecube Was the Largest Consumer of Nautilus GPU-Hours in 2022 0.8 Million GPU-Hours Peaking at 560 GPUs osg-icecube also runs fully in low-priority mode, using only PRP GPU cycles that would otherwise be unused. OSG GPU Consumers OSG GPU Providers In 2022 IceCube was the Largest Consumer of OSG GPU-Hours and PRP was the Largest Supplier of GPU-Hours to OSG
  • 18. Laser Interferometer Gravitational-Wave Observatory (LIGO) Uses Nautilus/OSG Data Cyberinfrastructure • LIGO Runs Their Production Rucio Data Management System on Nautilus – Rucio is the De-Facto Data Management System for Many Large Instruments, LIGO, LHC, … – LIGO Continues to be One of the Major Users of the OSG Caching Infrastructure (A.K.A. Stashcache), Which is Deployed Mostly as PRP-Managed Kubernetes Pods. • LIGO Does Not Use Much PRP Compute Given Their Dedicated Infrastructure
  • 19. PRP Supports Radio Telescope Through Partnering with CASPER: the Collaboration for Astronomy Signal Processing and Electronics Research PRP Access Has Allowed CASPER to Expand in Several Aspects: • PRP Portal to CASPER Tools/Libraries Was Developed by PRP’s John Graham • The PRP Team Added FPGAs to Nautilus FIONAs with the CASPER Software Stack • Nautilus JupyterHub Used for FPGA Training • Optical Fiber Connected Data Storage Source: Dan Werthimer SETI Chief Scientist, UC Berkeley, Xilinx, Intel, Fujitsu, HP, Nvidia, NSF, NASA, NRAO, NAIC The CASPER Collaboration of ~1000 Members and 50 Radio-Astronomy Instruments Worldwide to Develop Open-Source Signal Processing and Instrumentation Pipelines, Primarily using FPGAs and GPUs. Radio Telescopes include: • Event Horizon Telescope • Square Kilometer Array • Very Large Array
  • 20. PRP Portal to CASPER Tools/Libraries Developed by PRP’s John Graham, UCSD See John Graham’s CASPER 2021 Workshop Talk and Tutorial: CASPER designs, compiles, tests and evaluates instrumentation on the PRP, then deploys dedicated FPGA and GPU clusters at the observatories
  • 21. Discoveries Made with CASPER-Enabled Instrumentation: Radio Image of a Black Hole, Fast Radio Bursts, Weighing the Universe, Pulsar Timing, Gravitational Waves, Diamond Planet, Prostheses Control, Neutron Imaging. Source: Dan Werthimer, UC Berkeley
  • 23. OpenForceField Uses OPEN Software, OPEN Data, OPEN Science and PRP to Generate Quantum Chemistry Datasets for Druglike Molecules www.openforcefield.or OFF Open-Source Models are Used in Drug Discovery, Including in the COVID-19 Computing on Folding@Home.
  • 24. OFF Runs Quantum Mechanical Computations on Many Molecules to Determine Their Optimized Force Fields
  • 25. 50% of OFF compute is run on Nautilus. PRP is Capable of Running Millions of Quantum Chemistry Workloads. Chart annotations: OpenFF-1.0.0 released; OpenFF-2.0.0 released; OpenFF begins using Nautilus. We run "workers" that pull down QC jobs for computation from a central project queue. These jobs require between minutes and hours, and results are uploaded to the central, public QCArchive server. Workers are deployed from Docker images and scheduled on PRP's Kubernetes system. Due to the short job duration, these deployments can still be effective if interrupted every few hours.
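The pull-based worker pattern described on this slide can be sketched in a few lines. This is a minimal in-process simulation, not OpenFF's or QCArchive's actual code: the queue, the `Preempted` exception, and the `budget` parameter (how many jobs a worker finishes before the scheduler reclaims it) are all hypothetical stand-ins.

```python
import queue

class Preempted(Exception):
    """Raised when the (simulated) scheduler reclaims the worker."""

def run_worker(jobs, results, compute, budget):
    """Pull jobs until the queue is empty or the worker is preempted."""
    done = 0
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return done
        if done >= budget:
            jobs.put(job)  # hand the unstarted job back to the central queue
            raise Preempted
        results[job] = compute(job)  # stand-in for uploading a result
        done += 1

def drain(job_ids, compute, budget):
    """Keep starting replacement workers until every job has a result."""
    if budget < 1:
        raise ValueError("each worker must be able to finish at least one job")
    jobs = queue.Queue()
    for job in job_ids:
        jobs.put(job)
    results, restarts = {}, 0
    while not jobs.empty():
        try:
            run_worker(jobs, results, compute, budget)
        except Preempted:
            restarts += 1
    return results, restarts

results, restarts = drain(range(10), lambda j: j * j, budget=3)
print(len(results), "jobs finished after", restarts, "preemptions")
```

Because each job is short and is returned to the queue when a worker is interrupted, preemption costs only the work in flight, which is why short jobs suit low-priority scheduling.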
  • 26. OFF Was the Top Nautilus CPU Core Consumer in 2020 & 2021, 4th Highest in 2022 7.6 Million CPU Core-Hours (2020-2022) Peaking at 1300 CPU Cores OFF Datasets Consist of Hundreds to Millions of Jobs, Each Requiring Tens to Thousands of CPU-Hours and 8-32 GB of RAM
  • 27. Dataset listing: Python example notebooks for data access: OpenFF’s dataset lifecycle: The OFF Datasets on QCArchive are Fully Open!
  • 28. Nautilus Namespace tempredict Utilized PRP to Compute COVID-19 and Vaccine Responses ~65K Participants Purawat et al., IEEE Big Data, 2021 Mason et al., Sci Rep, 2021 Mason et al., Vaccines, 2022 Source: Prof. Benjamin Smarr, UCSD
  • 29. Nautilus Namespace braingeneers: One of the Most Advanced PRP projects - Uses Optical Fiber Connected Shared Storage, CPUs & GPUs
  • 30. UCSC/Hengenlab Data Analysis Pipeline Using PRP Hengenlab WUSL PRP/S3 Results PRP Compute CNN Source: David Parks, UCSC; braingeneers PI David Haussler
  • 31. Multiple Worker Processes Circulate Data in a 50GB Cache Sampling Strategy for braingeneers TB+ data PRP/S3 PRP Compute Jobs Local NVMe Model Training Operates on the Local Cache Results are Returned to S3 Source: David Parks, UCSC; braingeneers PI David Haussler
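One way to picture the cache-sampling strategy on this slide is a fixed-size local cache that background workers keep refreshing with random chunks of the remote dataset, while training samples only from whatever is currently cached. The sketch below is an assumption-laden illustration, not the braingeneers implementation: capacity is counted in chunks rather than bytes, and the class and method names are invented.

```python
import random
from collections import deque

class CirculatingCache:
    """Bounded local cache refreshed with random chunks of a remote dataset."""

    def __init__(self, remote_keys, capacity, seed=0):
        self.remote_keys = list(remote_keys)   # stand-in for objects in S3
        self.rng = random.Random(seed)
        self.cache = deque(maxlen=capacity)    # oldest chunk drops out when full

    def refresh(self, n=1):
        """Stand-in for a worker process: pull n random remote chunks."""
        for _ in range(n):
            self.cache.append(self.rng.choice(self.remote_keys))

    def sample_batch(self, batch_size):
        """Training reads only from the local cache, never from remote storage."""
        return [self.rng.choice(self.cache) for _ in range(batch_size)]

cache = CirculatingCache(range(1000), capacity=50)
cache.refresh(200)            # workers circulate data through the cache
batch = cache.sample_batch(8)
print(len(cache.cache), "chunks cached; batch:", batch)
```

The design choice this illustrates is decoupling: training speed depends on local NVMe reads, while the slower remote fetches happen in the background and gradually expose the model to the full dataset.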
  • 32. UCSC, UCSF & WUSL Are Collaborating To Grow Human Cerebral Organoids and Measure Their Neural Activity Tetrodes Multi Electrode Array Silicon Probes Source: David Parks, UCSC; braingeneers PI David Haussler
  • 33. Goal: For Every Human Brain Slice, Grow 1000 Organoids, And For Every Organoid, Compute 1000 Simulated Organoids From Neural Activity in Living Mouse Brain Human To Neural Activity in Human Brain Organoids Source: David Parks, UCSC; braingeneers PI David Haussler
  • 34. Nautilus Namespace braingeneers Was The 3rd Largest Consumer of CPU Core-Hours in 2022 57,000 GPU-Hours Peaking at 110 GPUs 950,000 CPU Core-Hours Peaking at 2000 CPU Cores
  • 35. NeuroKube: An Automated Neuroscience Reconstruction Framework Uses Nautilus for Large-Scale Processing & Labeling of Neuroimage Volumes Figures 2, 4, & 5 in “NeuroKube: An Automated and Autoscaling Neuroimaging Reconstruction Framework Using Cloud Native Computing and A.I.,” Matthew Madany, et al. (IEEE Big Data ’20, pp. 320-330)
  • 36. Computer Vision-Based Approach Provides the Potential to Automatically Generate Labels Using ML Subset of Neurites from Cerebellum Neuropil Extracted & Rendered in 3D with Structures of Interest Labeled Figures 1 & 14 in “NeuroKube: An Automated and Autoscaling Neuroimaging Reconstruction Framework using Cloud Native Computing and A.I.,” Matthew Madany, et al. (accepted to IEEE Big Data ’20) Volumetric Electron Microscopy (VEM) Data with Colorized Labels
  • 38. NSF-Funded WIFIRE Uses PRP/CENIC to Couple Wireless Edge Sensors With Supercomputers, Enabling Fire Modeling Workflows Landscape data WIFIRE Firemap Fire Perimeter Source: Ilkay Altintas, SDSC Real-Time Meteorological Sensors Weather Forecasts Work Flow PRP
  • 39. WIFIRE’s Firemap Provides Public Website Combining Satellite Fire Detections with GIS SoCal Wildfires Sept 6, 2022
  • 40. PRP is Building on NSF-Funded SAGE Technology to Bring ML/AI to the Edge For Smoke Plume Detection Source: Charlie Catlett, Pete Beckman, Argonne National Lab Source: Ilkay Altintas, SDSC, HDSI Training Data: Archive of 25,000 Labeled Wireless Camera Images of Wildland Fires PRP namespace digits
  • 41. Nautilus Namespace wifire-quicfire was the 25th Largest 2022 Consumer of CPU Core-Hours; digits was the 14th Largest GPU Consumer wifire-quicfire 108,000 CPU Core-Hours Peaking at 360 CPU Cores digits 40,700 GPU-Hours Peaking at 18 GPUs
  • 43. 2017: PRP 20Gbps Connection of UCSD SunCAVE and UCM WAVE Over CENIC 2018-2019: Added Their 90 GPUs to PRP for Machine Learning Computations Leveraging UCM Campus Funds and NSF CNS-1456638 & CNS-1730158 at UCSD UC Merced WAVE (20 Screens, 20 GPUs) UCSD SunCAVE (70 Screens, 70 GPUs) See These VR Facilities in Action in the PRP Video
  • 44. PRP Has Been Bringing Machine Learning to Building Virtual Worlds, Including Robotics and Autonomous Vehicles • Goal: Train Robots That Can Manipulate Arbitrary Objects o Open Drawer, Turn Faucet, Stack Cube, Pull Chair, Pour Water, Pick And Place, Hang Ropes, Make Dough, … (video)
  • 45. Namespace ucsd-haosulab Consumed the 2nd Most Nautilus GPU-Hours in 2022 (1st is IceCube) 585,170 GPU-Hours Peaking at 150 GPUs
  • 46. A Major Project in UCSD’s Hao Su Lab is Large-Scale Robot Learning • We Build A Digital Twin of The Real World in Virtual Reality (VR) For Object Manipulation • Agents Evolve In VR o Specialists (Neural Nets) Learn Specific Skills by Trial and Error o Generalists (Neural Nets) Distill Knowledge to Solve Arbitrary Tasks • On Nautilus: o Hundreds of specialists have been trained o Each specialist is trained in millions of environment variants o ~10,000 GPU hours per run
  • 47. UCSD’s Ravi Group: How to Create Visually Realistic 3D Objects or Dynamic Scenes in VR or the Metaverse Source: Prof. Ravi Ramamoorthi, UCSD ML Computing Transforms a Series of 2D Images Into a 3D View Synthesis
  • 48. Machine Learning-Based Neural Radiance Fields for View Synthesis (NeRFs) Are Transformational! BY JARED LINDZON NOVEMBER 10, 2022 A neural radiance field (NeRF) is a fully-connected neural network that can generate novel views of complex 3D scenes, based on a partial set of 2D images. Source: Prof. Ravi Ramamoorthi, UCSD
  • 49. Namespace ucsd-ravigroup Consumed the 3rd Most Nautilus GPU-Hours in 2022 200,000 GPU-Hours Peaking at 122 GPUs • Much of the compute involves training computationally expensive NeRFs. • Training time to learn a representation of a single scene on a GPU can vary from seconds to a day. • NeRFs that can see behind occlusions may require a week of training on 8 GPUs simultaneously. Source: Alexander Trevithick, UCSD Ravi Group
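To put the ucsd-ravigroup figures in proportion, the short calculation below expresses the namespace's 2022 total in units of the most expensive training run the slide mentions (a week on 8 GPUs). It assumes the GPUs are busy around the clock for the full week; the real workload is of course a mix of much shorter runs.

```python
gpu_hours_per_long_run = 7 * 24 * 8    # one week on 8 GPUs
namespace_total = 200_000              # GPU-hours quoted for 2022

equivalent_runs = namespace_total / gpu_hours_per_long_run

print(f"{gpu_hours_per_long_run} GPU-hours per week-long run, "
      f"about {equivalent_runs:.0f} such runs in the yearly total")
```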
  • 50. 2022-2026 NRP Future: PRP Federates with NSF-Funded Prototype National Research Platform NSF Award OAC #2112167 (June 2021) [$5M Over 5 Years] PI Frank Wuerthwein (UCSD, SDSC) Co-PIs Tajana Rosing (UCSD), Thomas DeFanti (UCSD), Mahidhar Tatineni (SDSC), Derek Weitzel (UNL)