Global Services for Global Science
Ian Foster
Compare climate models to
understand the earth system
>100 models, >20 countries
Coupled Model Intercomparison Project
(CMIP): Standard protocol for studying
general circulation model output
Earth System Grid Federation
Link field data on pathogen variants with AI for pandemic surveillance
M. Zvyagin et al., https://www.biorxiv.org/content/10.1101/2022.10.10.511571v1
TRAINING: Foundation model(s) trained on 110 million viral sequences; finetune model on SARS-CoV-2 ORFs; periodically retrain on new variants
Trained foundation model(s)
PREDICTION WORKFLOW; DETECTION WORKFLOW
Scores: semantic similarity score (embeddings); sequence log likelihood score
Evaluation: immune escape; fitness; OpenFold; PPI interaction (MD simulations); epitope alteration; Variant of Concern score
Diffusion model to derive hierarchy of gene organization; generated SARS-CoV-2 genomes
Detection targets; new viral genomes; field; labs
1.5 x 10^21 flops
Cerebras CS-2
Model (re)training on local GPU:
1102 seconds
@ LCLS
Connect scientific instruments to remote computers
to create smart instruments
Z. Liu et al., https://doi.org/10.48550/arXiv.2105.13967
Cerebras CS-2: 7 seconds, 19 seconds, 5 seconds
7 + 19 + 5 = 31 s
35x faster
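The "35x faster" figure can be checked from the two timings quoted on these slides: 1102 seconds for (re)training on a local GPU against 7 + 19 + 5 seconds in the remote case. The short calculation below is only a sanity check of that arithmetic; the variable names are illustrative and what each of the three remote steps covers is not stated here.

```python
# Timings as quoted on the slides (seconds).
local_gpu_s = 1102
remote_steps_s = [7, 19, 5]

# Total remote time and the resulting speedup factor.
remote_total_s = sum(remote_steps_s)
speedup = local_gpu_s / remote_total_s

print(f"remote total: {remote_total_s} s, speedup: {speedup:.1f}x")
```

The ratio comes out a little above 35, consistent with the rounded figure on the slide.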
Integrate data & models to advance
urban science & resilience
http://dx.doi.org/10.1016/j.wear.2016.04.019
Friction first characterized by Leonardo da Vinci in 1493
Integration of distributed resources, a sine qua non,
is impeded by numerous sources of friction
“Whereas computational friction expresses the struggle involved in transforming
data into information and knowledge … data friction expresses a more primitive
form of resistance -- the costs in time, energy, and attention required simply to
collect, check, store, move, receive, and access data. Whenever data travel …
data friction impedes their movement” (Edwards, 2010, p. 84).
Increasingly, telecommunications is not the problem
Aalyria’s “Spacetime” [originally “Minkowski”] platform Free-space optics
“Henceforth, space for itself, and time for itself, are doomed to fade
away into mere shadows, and only a kind of union of the two will
preserve an independent reality.” – Hermann Minkowski
A computing continuum
But it remains unduly difficult to:
1) Act on resources regardless of location and interface
Friction: Varying interfaces, behaviors; reliability; security
→ Widely deployed local agents provide a global footprint for actions
2) Execute remote actions reliably
Friction: Great variety of failures, scalability, usability
→ Cloud-hosted managed research acceleration services buffer against inevitable failures
3) Manage who is trusted to perform what actions, where and when
Friction: Varying credentials, authentication protocols, authorization policies; need to act on behalf of others
→ Distributed authentication with delegation enables secure management of privileges
Need: 1) Act anywhere
Past approaches (data actions):
• Gopher, FTP, Web, OPeNDAP, …
• Distributed file systems
Past approaches (compute actions):
• SSH, grid protocols, cloud APIs
• Java, virtual machines, containers
Challenges: Performance, scalability, reliability, portability, usability
Figure: Globus Connect and Globus Compute agents, each exposing an adapter interface with plug-ins
Lightweight agents abstract local details
Our approach:
• HTTPS, GridFTP for universal, fast access
• Portable agents for broad deployment
• Modularity to target many systems
• Integration with secure delegation
• Integration with hosted supervision
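The adapter-plus-plug-in idea behind "lightweight agents abstract local details" can be sketched in a few lines. This is a minimal illustration only, not the Globus Connect or Globus Compute implementation: the names `StorageAdapter`, `DictStore`, `PrefixedStore`, and `Agent` are hypothetical, and in-memory dictionaries stand in for real storage systems.

```python
from abc import ABC, abstractmethod


class StorageAdapter(ABC):
    """Hypothetical adapter interface: one uniform call for every back end."""

    @abstractmethod
    def read(self, path):
        """Return the bytes stored at path."""


class DictStore(StorageAdapter):
    """Hypothetical plug-in: an in-memory stand-in for a local file system."""

    def __init__(self, files):
        self._files = files

    def read(self, path):
        return self._files[path]


class PrefixedStore(StorageAdapter):
    """Hypothetical plug-in: a back end that names objects differently."""

    def __init__(self, objects, prefix):
        self._objects = objects
        self._prefix = prefix

    def read(self, path):
        # Translate the uniform path into this system's native key.
        return self._objects[self._prefix + path]


class Agent:
    """Hypothetical local agent that dispatches on the URL scheme."""

    def __init__(self):
        self._plugins = {}

    def register(self, scheme, plugin):
        self._plugins[scheme] = plugin

    def read(self, url):
        scheme, _, path = url.partition("://")
        return self._plugins[scheme].read(path)


agent = Agent()
agent.register("posix", DictStore({"data.txt": b"hello"}))
agent.register("object", PrefixedStore({"bucket/data.txt": b"world"}, "bucket/"))

print(agent.read("posix://data.txt"), agent.read("object://data.txt"))
```

The caller uses the same `read` call for both systems; supporting a new system means writing one more plug-in rather than changing callers.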
Need: 2) Reliable execution of (sets of) actions
Past approaches:
• Workflow systems
• Distributed file systems
• Eventing, consistency protocols
• Reliable RPC, replication
• Cloud
Challenges: Complexity, fragility, scalability, reach
Hosted research supervision services
Cloud-hosted, persistent, scalable, resilient, secure
• Transfer: Copy data A→B
• Compute: Run f(x)@C
• Search: Read & write catalogs
• Flows: Sequences of actions
• Other: other Globus services, non-Globus services
Our approach:
• Cloud-hosted, replicated supervision
• Simple retry-based protocols
• Reduce endpoint complexity
• High assurance for sensitive data
• Integration with secure delegation
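A "simple retry-based protocol" can be illustrated with a supervisor that re-issues a failed action after a growing delay. The sketch below is an assumption-laden toy, not the protocol used by the hosted services: `run_with_retry` and `FlakyTransfer` are hypothetical names, the failure is simulated, and the approach is only safe when the action is idempotent (repeating it does no harm).

```python
import time


def run_with_retry(action, max_attempts=5, base_delay=0.01,
                   retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call action() until it succeeds or max_attempts is reached.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


class FlakyTransfer:
    """Hypothetical action that fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("simulated transient failure")
        return "done"


flaky = FlakyTransfer(failures=2)
result = run_with_retry(flaky, sleep=lambda seconds: None)  # skip real waiting
print(result, flaky.calls)
```

Keeping the retry logic in the supervising service rather than at each endpoint is one way to read the "reduce endpoint complexity" bullet: endpoints only need to perform an action, not track its history.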
Need: 3) Control who can perform what actions, when & where
Past approaches:
• Passwords, PKI, Kerberos
• Grid Security Infrastructure
• OAuth
• Specialized delegation protocols
Challenges: Multi-site, dynamic computing; complexity, usability
Globus Auth trust fabric: OAuth + delegation
Federated auth & secure managed delegation
Our approach:
• Secure delegation
• Scoped credentials
• Leverage OAuth2
• Browser compatibility
• IdP federation
Token hierarchy: consent, access token, dependent token
(1) “Run flow” with consent
(2) “Provide access token for funcX”
(3) “Run F() on node A”
(4) “Provide dependent token for F() on node A”
(5) “Run F()”
(6) F() runs
Components shown: Auth, Flows, cloud-hosted services, F() @ node A
Globus Auth: A managed research acceleration service
providing distributed authorization with delegation
Not shown:
Token refresh
Who do I trust to act on
my behalf, when, and for
what purpose?
Leverage OAuth for:
• Security via scoped
credentials
• Usability via browser
compatibility
S. Tuecke et al., https://doi.org/10.1109/eScience.2016.7870901, R. Chard et al., https://doi.org/10.48550/arXiv.2208.09513
1700 identity providers
1.3 B access tokens
2.7 M consents
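The idea of scoped credentials with delegation can be modeled as a chain of tokens, each naming who holds it and what it permits. The sketch below is a toy model only and does not reproduce the Globus Auth or OAuth 2.0 wire protocol: the `Token` class, the scope strings, and the rule that a delegated token may carry only a subset of its parent's scopes are all simplifying assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    """Hypothetical token: who it speaks for, who holds it, what it allows."""

    subject: str                      # the user on whose behalf actions occur
    holder: str                       # the service presenting the token
    scopes: frozenset                 # permitted operations
    parent: Optional["Token"] = None  # token this one was derived from

    def delegate(self, holder, scopes):
        """Derive a dependent token limited to a subset of this token's scopes."""
        requested = frozenset(scopes)
        if not requested <= self.scopes:
            raise PermissionError("cannot delegate scopes the parent lacks")
        return Token(self.subject, holder, requested, parent=self)


# The user consents to a flow service acting with two scopes (made-up names).
access = Token("alice", "flows", frozenset({"flows:run", "compute:run"}))

# The flow service passes on only what the compute service needs.
dependent = access.delegate("compute", {"compute:run"})

print(sorted(dependent.scopes), dependent.parent is access)
```

The chain from `dependent` back to `access` is what allows a question such as "who do I trust to act on my behalf, and for what purpose?" to be answered for each hop.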
Globus
Compute
1) Act on resources regardless of location and interface
→ Widely deployed local agents provide a global footprint for actions
2) Execute remote actions reliably
→ Cloud-hosted managed research acceleration services buffer against inevitable failures
3) Manage who is trusted to perform what actions, where and when
→ Distributed authentication with delegation enables secure management of privileges
In total: Global services enable low-friction global science
Globus platform
Operated by UChicago for researchers worldwide
Made possible by the support of 200+ subscribers
Endpoints at an experimental facility
Figure: Globus Compute agents and Globus Connect agents deployed across the Advanced Photon Source, the Laboratory Computing Resource Center, and the Argonne Leadership Computing Facility
Systems shown: Polaris, Theta, Bebop cluster, Orthros cluster, Eagle store, APS computing, APS DM system, and two portal servers
Globus-accessible storage and computing (10,000s of systems)
Globus research data movement service
Collins Australian Clear
School Atlas, 1964
Wellington, New Zealand
to Madrid, Spain:
19,855 km
Circumference of the earth:
40,075 km
Semi-circumference:
20,037 km
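The Wellington to Madrid distance can be approximated with the haversine great-circle formula. The sketch below assumes a spherical earth with a mean radius of 6371 km and approximate city-center coordinates; both are assumptions introduced here, not values from the slide, so the result should only be expected to land near the quoted 19,855 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical earth


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against rounding for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Approximate coordinates (assumptions): latitude, longitude in degrees.
wellington = (-41.29, 174.78)
madrid = (40.42, -3.70)

distance_km = great_circle_km(*wellington, *madrid)
semi_circumference_km = 40075 / 2  # from the slide's circumference figure

print(f"{distance_km:.0f} km of a possible {semi_circumference_km:.0f} km")
```

The two cities are close to antipodal, so the distance falls just short of the semi-circumference, which is the largest separation possible on the earth's surface.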
https://dashboard.globus.org/esgf As of May 4, 2022
1.5 GB/s
4 to 6 GB/s
7.5 PB transferred between mid-Feb and May 4, ‘22
17,347,671 directories and 28,907,532 files
Cumulative data transferred over time
GPFS Errors &
Unreadable Files
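The transfer totals imply an average sustained rate and an average file size. The calculation below assumes that "mid-Feb" means February 15, 2022 and that PB and GB are decimal units; both are assumptions, so the output is a rough estimate rather than a reported measurement.

```python
from datetime import date

bytes_transferred = 7.5e15   # 7.5 PB, taken as decimal petabytes
files = 28_907_532           # file count from the slide

start = date(2022, 2, 15)    # assumption: "mid-Feb" read as February 15
end = date(2022, 5, 4)

elapsed_s = (end - start).days * 86_400
avg_rate_gb_s = bytes_transferred / elapsed_s / 1e9
avg_file_mb = bytes_transferred / files / 1e6

print(f"average rate: {avg_rate_gb_s:.2f} GB/s, "
      f"average file size: {avg_file_mb:.0f} MB")
```

Under these assumptions the average rate is a little over 1 GB/s, sustained across the whole period including any interruptions.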
Globus Compute: A hosted research supervision service
that implements a universal computing fabric
https://funcx.org
Z. Li et al., https://doi.org/10.48550/arXiv.2209.11631
1) Deploy Globus Compute agents
    $ pip install globus-compute-endpoint
    $ globus-compute-endpoint configure myep
    $ globus-compute-endpoint start myep
2) Register functions
    from globus_compute_sdk import Client
    gcc = Client()
    def F(in_args):
        # do something
        results = in_args  # placeholder for real work
        return results
    func_id = gcc.register_function(F)  # returns the function's ID
3) Run functions
    # ep is the ID of the endpoint started in step 1
    task_id = gcc.run("A", endpoint_id=ep, function_id=func_id)
    R = gcc.get_result(task_id)  # raises if the task has not finished yet
Better Classification
Globus Compute application:
Privacy Preserving Fed. Learning
COVID19 detection from chest X-rays
Distributed execution of
30,000-task drug-screening
protocol over multiple HPC
systems, with different
scheduling
heuristics
Z. Li et al., 2023
“Globus Flows”: Hosted research supervision services for flow
specification, execution, and management
R. Chard et al., https://doi.org/10.48550/arXiv.2208.09513
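A flow, in the sense of a sequence of actions, can be pictured as a list of steps that each take the current state and return an updated one. The sketch below is a generic illustration and is not the Globus Flows definition language or API: `run_flow` and the three stand-in actions are hypothetical, and real actions would call remote services rather than edit a dictionary.

```python
def transfer(state):
    # Stand-in for "copy data A -> B".
    return {**state, "location": "B"}


def compute(state):
    # Stand-in for "run f(x)": here f simply counts the data items.
    return {**state, "result": len(state["data"])}


def publish(state):
    # Stand-in for "write to a catalog".
    return {**state, "published": True}


def run_flow(steps, state):
    """Run (name, action) steps in order, threading state through each one."""
    completed = []
    for name, action in steps:
        state = action(state)
        completed.append(name)
    return state, completed


final_state, completed = run_flow(
    [("transfer", transfer), ("compute", compute), ("publish", publish)],
    {"data": [1, 2, 3], "location": "A"},
)
print(completed, final_state)
```

Recording which steps have completed is what lets a hosted service resume or retry a flow after a failure instead of starting over.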
Flows allow us to create smart instruments
Z. Liu et al., https://doi.org/10.48550/arXiv.2105.13967
Figure labels: Globus Compute agent, Request, Cerebras CS-2, @ LCLS
7 seconds, 19 seconds, 5 seconds: 7 + 19 + 5 = 31 s, 35x faster
ModelTrain
New applications mean new computing workloads
Globus Flows can invoke
arbitrary functions via
Globus Compute
Functions may be
executed in various
locations: at a beamline,
local server, cluster, cloud
R. Vescovi et al., https://doi.org/10.1016/j.patter.2022.100606
Seven flows in use at the
Advanced Photon Source
Execution times at the
Argonne Leadership
Computing Facility
Figure (over time): faster components (megascale, gigascale, terascale, petascale, exascale, zettascale) and increased connectivity (dial-up lines, 155 Mbps, 10 Gbps, 5G, 400 Gbps, free-space optics) lead to a computing continuum
Science in a converged world
New methods and tools to program the continuum
• Data fabric: Access data anywhere (Globus Transfer)
• Compute fabric: Compute anywhere (Globus Compute)
• Trust fabric: Ensure only authorized operations occur in trusted locations (Globus Auth)
• Continuum-aware programming: Develop efficient, reliable, reusable applications (Globus Automation Services)
https://braid-project.org
https://globus.org
Foucault’s pendulum, Paris Panthéon
Problems we have solved, in part
• Authorization (“enterprise grade
security, consumer-grade usability”)
• Universal connectivity
• Location-independent computing
• Programming
New opportunities and challenges
• New devices and applications
• Massive performance and scale
• Energy as a ubiquitous constraint
• Global data space architecture
• Programming
Utility (’60s) … xSP (’80s) … Grid (’90s) … Cloud (2010) … Edge … Continuum …
The pendulum swings between centralization and decentralization
There remain important questions to answer
• What will continuum networks look like in practice?
• Energy vs. accuracy vs. cost vs. … tradeoffs
• Where will we place computing and storage?
• What new instruments will be created?
• What new applications and new science will be enabled?
• What new abstractions, services, and tools do we need?
• Do we need new continuum-aware algorithm design methods?
• What will be the economic foundation of this new computing fabric?
• (How) Will we integrate quantum sensors, networks, computers?
Thank you for your attention!
Thanks to:
Argonne National Laboratory and the University of Chicego – and students and staff
Federal agencies for continued support: DOE, NSF, NIH, NIST
Wonderful colleagues: Rachana Ananthakrishnan, Ben Blaiszik, Kyle Chard, Ryan Chard,
Carl Kesselman, Arvind Ramanathan, Rick Stevens, Vas Vasiliadis, Logan Ward, & many more
To learn more about our work: https://labs.globus.org https://globus.org
Ask questions or share your thoughts: foster@anl.gov
Experiment with tools: https://braid-project.org
https://doi.org/10.1016/j.patter.2022.100606

TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc
 

Recently uploaded (20)

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Quantum Leap in Next-Generation Computing
Quantum Leap in Next-Generation ComputingQuantum Leap in Next-Generation Computing
Quantum Leap in Next-Generation Computing
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
API Governance and Monetization - The evolution of API governance
API Governance and Monetization -  The evolution of API governanceAPI Governance and Monetization -  The evolution of API governance
API Governance and Monetization - The evolution of API governance
 
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Design and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data ScienceDesign and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data Science
 
Modernizing Legacy Systems Using Ballerina
Modernizing Legacy Systems Using BallerinaModernizing Legacy Systems Using Ballerina
Modernizing Legacy Systems Using Ballerina
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
ChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps ProductivityChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps Productivity
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
 

Global Services for Global Science March 2023.pptx

  • 1. Global Services for Global Science Ian Foster
  • 2. Compare climate models to understand the earth system >100 models, >20 countries Coupled Model Intercomparison Project (CMIP): Standard protocol for studying general circulation model output Earth System Grid Federation
  • 3. Foundation model(s) trained on 110 million viral sequences Finetune model on SARS-CoV-2 ORFs TRAINING Periodically retrain on new variants Trained foundation model(s) PREDICTION WORKFLOW DETECTION WORKFLOW Semantic similarity score (embeddings) Sequence log likelihood score Immune Escape Fitness Evaluation OPENFOLD PPI interaction (MD simulations) Epitope alteration Variant of Concern score Diffusion model to derive hierarchy of gene organization Generated SARS-CoV-2 genomes Link field data on pathogen variants with AI for pandemic surveillance Detection targets New viral genomes M. Zvyagin et al., https://www.biorxiv.org/content/10.1101/2022.10.10.511571v1 Field Labs 1.5 x 10^21 flops
  • 4. ources Cerebras CS-2 1102 1102 1102 1102 Model (re)training on local GPU: 1102 seconds @ LCLS Connect scientific instruments to remote computers to create smart instruments Z. Liu et al., https://doi.org/10.48550/arXiv.2105.13967
  • 5. ources Z. Liu et al., https://doi.org/10.48550/arXiv.2105.13967 7 seconds 7 + 19 + 5 = 31 s 35x faster 5 seconds 19 seconds Cerebras CS-2 @ LCLS Connect scientific instruments to remote computers to create smart instruments
  • 6. Integrate data & models to advance urban science & resilience
  • 7. http://dx.doi.org/10.1016/j.wear.2016.04.019 Friction first characterized by Leonardo da Vinci in 1493 Integration of distributed resources, a sine qua non, is impeded by numerous sources of friction “Whereas computational friction expresses the struggle involved in transforming data into information and knowledge … data friction expresses a more primitive form of resistance -- the costs in time, energy, and attention required simply to collect, check, store, move, receive, and access data. Whenever data travel … data friction impedes their movement” (Edwards, 2010, p. 84).
  • 9. Aalyria’s “Spacetime” [originally “Minkowski”] platform Free-space optics Increasingly, telecommunications is not the problem
  • 10. Free-space optics “Henceforth, space for itself, and time for itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.” – Hermann Minkowski A computing continuum Aalyria’s “Spacetime” [originally “Minkowski”] platform Increasingly, telecommunications is not the problem
  • 11. But it remains unduly difficult to: 1) Act on resources regardless of location and interface → Widely deployed local agents provide a global footprint for actions 2) Execute remote actions reliably → Cloud-hosted managed research acceleration services buffer against inevitable failures 3) Manage who is trusted to perform what actions, where and when → Distributed authentication with delegation enables secure management of privileges Friction: Varying interfaces, behaviors; reliability; security
  • 12. But it remains unduly difficult to: 1) Act on resources regardless of location and interface → Widely deployed local agents provide a global footprint for actions 2) Execute remote actions reliably → Cloud-hosted managed research acceleration services buffer against inevitable failures 3) Manage who is trusted to perform what actions, where and when → Distributed authentication with delegation enables secure management of privileges Friction: Great variety of failures Friction: Varying interfaces, behaviors; reliability; security
  • 13. But it remains unduly difficult to: 1) Act on resources regardless of location and interface → Widely deployed local agents provide a global footprint for actions 2) Execute remote actions reliably → Cloud-hosted managed research acceleration services buffer against inevitable failures 3) Manage who is trusted to perform what actions, where and when → Distributed authentication with delegation enables secure management of privileges Friction: Varying credentials, authentication protocols, authorization policies; need to act on behalf of others Friction: Varying interfaces, behaviors; reliability; security Friction: Great variety of failures, scalability, usability
  • 14. Need: 1) Act anywhere Past approaches (data actions): • Gopher, FTP, Web, OPeNDAP, … • Distributed file systems Past approaches (compute actions): • SSH, grid protocols, cloud APIs • Java, virtual machines, containers Challenges: Performance, scalability, reliability, portability, usability
  • 15. Need: 1) Act anywhere Adapter interface Plug-in Plug-in Adapter interface Plug-in Adapter interface Plug-in Plug-in Globus Connect Globus Compute Lightweight agents abstract local details Globus Connect Globus Compute Our approach: • HTTPS, GridFTP for universal, fast access • Portable agents for broad deployment • Modularity to target many systems • Integration with secure delegation • Integration with hosted supervision
  • 16. Need: 2) Reliable execution of (sets of) actions Past approaches: • Workflow systems • Distributed file systems • Eventing, consistency protocols • Reliable RPC, replication • Cloud Challenges: Complexity, fragility, scalability, reach
  • 17. Need: 2) Reliable execution of (sets of) actions Hosted research supervision services Cloud-hosted, persistent, scalable, resilient, secure Other services Transfer Copy data A→B Compute Run f(x)@C actions Search Read & write catalogs Other Globus services Flows Sequences of actions Non-Globus services Hosted research supervision services Our approach: • Cloud-hosted, replicated supervision • Simple retry-based protocols • Reduce endpoint complexity • High assurance for sensitive data • Integration with secure delegation
  • 18. Need: 3) Control who can perform what actions, when & where Hosted research supervision services Cloud-hosted, persistent, scalable, resilient, secure Other services Transfer Copy data A→B Compute Run f(x)@C actions Search Read & write catalogs Other Globus services Flows Sequences of actions Non-Globus services Past approaches: • Passwords, PKI, Kerberos • Grid Security Infrastructure • OAuth • Specialized delegation protocols Challenges: Multi-site, dynamic computing; complexity, usability
  • 19. Globus Auth trust fabric OAuth + delegation Hosted research supervision services Cloud-hosted, persistent, scalable, resilient, secure Other services Transfer Copy data A→B Compute Run f(x)@C actions Search Read & write catalogs Other Globus services Flows Sequences of actions Non-Globus services Federated auth & secure managed delegation Need: 3) Control who can perform what actions, when & where Our approach: • Secure delegation • Scoped credentials • Leverage OAuth2 • Browser compatibility • IdP federation
  • 20. (5) “Run F()” F() @A node A F() Auth (3) “Run F() on node A” Flows (2) “Provide access token for funcX” (4) “Provide dependent token for F() on node A” (6) F() runs Consent Dependent token Token hierarchy (1) “Run flow” with consent Access token Cloud-hosted services Globus Auth: A managed research acceleration service providing distributed authorization with delegation Not shown: Token refresh Who do I trust to act on my behalf, when, and for what purpose? Leverage OAuth for: • Security via scoped credentials • Usability via browser compatibility S. Tuecke et al., https://doi.org/10.1109/eScience.2016.7870901, R. Chard et al., https://doi.org/10.48550/arXiv.2208.09513 1700 identity providers 1.3 B access tokens 2.7 M consents Globus Compute
  • 21. 1) Act on resources regardless of location and interface → Widely deployed local agents provide a global footprint for actions 2) Execute remote actions reliably → Cloud-hosted managed research acceleration services buffer against inevitable failures 3) Manage who is trusted to perform what actions, where and when → Distributed authentication with delegation enables secure management of privileges In total: Global services enable low-friction global science
  • 22. Globus platform Operated by UChicago for researchers worldwide Made possible by the support of 200+ subscribers
  • 23. Polaris Bebop Cluster Argonne Leadership Computing Facility Laboratory Computing Research Center Eagle store APS Computing Orthros Cluster APS DM system Portal server Portal server Theta Advanced Photon Source Key: Globus Compute agent Globus Connect agent Globus-accessible storage and computing (10,000s of systems) Endpoints at an experimental facility
  • 24. Globus research data movement service
  • 25. Collins Australian Clear School Atlas, 1964 Wellington, New Zealand to Madrid, Spain: 19,855 km Circumference of the earth: 40,075 km Semi-circumference: 20,037 km
  • 26. https://dashboard.globus.org/esgf As of May 4, 2022 1.5 GB/s 4 to 6 GB/s 7.5 PB transferred between mid-Feb and May 4, ‘22 17,347,671 directories and 28,907,532 files
  • 27. Cumulative data transferred over time GPFS Errors & Unreadable Files 7.5 PB transferred between mid-Feb and May 4, ‘22 17,347,671 directories and 28,907,532 files
  • 28. Globus Compute: A hosted research supervision service that implements a universal computing fabric https://funcx.org Z. Li et al., https://doi.org/10.48550/arXiv.2209.11631 Deploy Globus Compute agents: $ pip install globus-compute-endpoint $ globus-compute-endpoint configure myep $ globus-compute-endpoint start myep Register functions: def F(in_args): # do something return results gcc.register_function(F) Run functions: f = gcc.run("A", endpoint_id=ep, function_id=F) R = gcc.get_result(f) F(ep, arg) F(ep, arg) F(ep, "A") Globus Compute
  • 29. Better Classification Globus Compute application: Privacy Preserving Fed. Learning COVID19 detection from chest X-rays
  • 30. Distributed execution of 30,000-task drug-screening protocol over multiple HPC systems, with different scheduling heuristics Z. Li et al., 2023
  • 31. “Globus Flows”: Hosted research supervision services for flow specification, execution, and management R. Chard et al., https://doi.org/10.48550/arXiv.2208.09513 GLOBUS COMPUTE AGENT
  • 32. ources Request Z. Liu et al., https://doi.org/10.48550/arXiv.2105.13967 7 seconds 7 + 19 + 5 = 31 s 35x faster 5 seconds 19 seconds Cerebras CS-2 @ LCLS Flows allow us to create smart instruments
  • 33. ources Request Z. Liu et al., https://doi.org/10.48550/arXiv.2105.13967 7 seconds 7 + 19 + 5 = 31 s 35x faster 5 seconds 19 seconds Cerebras CS-2 @ LCLS Flows allow us to create smart instruments ModelTrain
  • 34. New applications mean new computing workloads Globus Flows can invoke arbitrary functions via Globus Compute Functions may be executed in various locations: at a beamline, local server, cluster, cloud R. Vescovi et al., https://doi.org/10.1016/j.patter.2022.100606 Seven flows in use at the Advanced Photon Source
  • 35. New applications mean new computing workloads Globus Flows can invoke arbitrary functions via Globus Compute Functions may be executed in various locations: at a beamline, local server, cluster, cloud R. Vescovi et al., https://doi.org/10.1016/j.patter.2022.100606 Execution times at the Argonne Leadership Computing Facility
  • 36. Megascale Gigascale Terascale Petascale Exascale Zettascale Faster components Increased connectivity Dial-up lines 155 Mbps 10 Gbps 5G 400 Gbps Free space optics Computing continuum Time Science in a converged world
  • 37. Continuum-aware programming Compute fabric Trust fabric Access data anywhere Compute anywhere Ensure only authorized operations occur in trusted locations Develop efficient, reliable, reusable applications New methods and tools to program the continuum Data fabric
  • 38. Continuum-aware programming Compute fabric Trust fabric Globus Compute Globus Transfer Globus Automation Services Globus Auth https://braid-project.org https://globus.org Data fabric New methods and tools to program the continuum
  • 39. Foucault’s pendulum, Paris Panthéon Problems we have solved, in part • Authorization (“enterprise grade security, consumer-grade usability”) • Universal connectivity • Location-independent computing • Programming New opportunities and challenges • New devices and applications • Massive performance and scale • Energy as a ubiquitous constraint • Global data space architecture • Programming Utility (‘60s) … xSP (’80s) … Grid (‘90s) … Cloud (2010) … Edge … Continuum … The pendulum swings between centralization and decentralization
  • 40. There remain important questions to answer • What will continuum networks look like in practice? • Energy vs. accuracy vs. cost vs. … tradeoffs • Where will we place computing and storage? • What new instruments will be created? • What new applications and new science will be enabled? • What new abstractions, services, and tools do we need? • Do we need new continuum-aware algorithm design methods? • What will be the economic foundation of this new computing fabric? • (How) Will we integrate quantum sensors, networks, computers?
  • 41. Thank you for your attention! Argonne National Laboratory and the University of Chicago – and students and staff Federal agencies for continued support: DOE, NSF, NIH, NIST Wonderful colleagues: Rachana Ananthakrishnan, Ben Blaiszik, Kyle Chard, Ryan Chard, Carl Kesselman, Arvind Ramanathan, Rick Stevens, Vas Vasiliadis, Logan Ward, & many more To learn more about our work: https://labs.globus.org https://globus.org Ask questions or share your thoughts: foster@anl.gov Experiment with tools: https://braid-project.org Thanks to: https://doi.org/10.1016/j.patter.2022.100606
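
The register-then-run pattern on slide 28 can be sketched as follows. This is a minimal local stand-in, not the Globus Compute SDK: `LocalComputeClient` is a hypothetical class invented for illustration that runs functions in a local thread pool, so no endpoint, network, or authentication is involved. The method names mirror those on the slide (`register_function`, `run`, `get_result`); the sketch assumes that `register_function` returns an identifier that is then passed as `function_id`, and it makes `get_result` block until the task finishes, which is a simplification.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class LocalComputeClient:
    """Hypothetical in-process stand-in for a hosted compute client."""

    def __init__(self):
        self._functions = {}  # function_id -> callable
        self._tasks = {}      # task_id -> Future
        self._pool = ThreadPoolExecutor(max_workers=2)

    def register_function(self, func):
        # Store the function and hand back an opaque identifier.
        function_id = str(uuid.uuid4())
        self._functions[function_id] = func
        return function_id

    def run(self, *args, endpoint_id, function_id, **kwargs):
        # endpoint_id is accepted for shape only; everything runs locally here.
        task_id = str(uuid.uuid4())
        func = self._functions[function_id]
        self._tasks[task_id] = self._pool.submit(func, *args, **kwargs)
        return task_id

    def get_result(self, task_id):
        # Simplification: block until the task has finished.
        return self._tasks[task_id].result()


def F(in_args):
    # "do something" placeholder from the slide
    return f"processed {in_args}"


gcc = LocalComputeClient()
ep = "myep"  # placeholder endpoint name taken from the slide's CLI example
fid = gcc.register_function(F)
f = gcc.run("A", endpoint_id=ep, function_id=fid)
R = gcc.get_result(f)
print(R)
```

The point of the pattern is that the caller only ever holds identifiers (for the function, the endpoint, and the task), which is what lets a hosted service sit between the caller and wherever the function actually executes.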

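Slide 17 lists "simple retry-based protocols" as part of the approach to reliable execution. A minimal sketch of that idea, assuming exponential backoff between attempts, is shown below. The function `run_with_retries`, its parameters, and the `flaky_transfer` action are hypothetical names chosen for illustration; the slides do not describe how the hosted services implement retries.

```python
import time


def run_with_retries(action, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call action() until it succeeds or max_attempts is reached.

    The wait doubles after each failure (base_delay, 2*base_delay, ...).
    The sleep function is injectable so the logic can be exercised
    without actually waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == max_attempts:
                raise  # give up and surface the last failure
            sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical action that fails twice before succeeding.
calls = {"n": 0}


def flaky_transfer():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "transfer complete"


delays = []  # records the waits instead of sleeping
outcome = run_with_retries(flaky_transfer, sleep=delays.append)
print(outcome, calls["n"], delays)
```

Keeping the retry loop in a supervising layer, rather than in each endpoint, is in line with the slide's "reduce endpoint complexity" point: the remote side only needs to perform an action that is safe to repeat.
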
Editor's Notes

  1. We are on the verge of a global communications revolution based on ubiquitous high-speed 5G, 6G, and free-space optics technologies. The resulting global communications fabric can enable new ultra-collaborative research modalities that pool sensors, data, and computation with unprecedented flexibility and focus. But realizing these modalities requires new services to overcome the tremendous friction currently associated with any actions that traverse institutional boundaries. The solution, I argue, is new global science services to mediate between user intent and infrastructure realities. I describe our experiences building and operating such services and the principles that we have identified as needed for successful deployment and operations.
  2. https://clivebest.com/blog/?p=8788
  3. Or, a quite different but equally important example: linking field data on SARS-CoV-2 genomes to large-scale AI models.
  4. Another example, involving work by Zhengchun Liu and others at Argonne and SLAC. High energy diffraction microscopy experiments at the Linac Coherent Light Source use a deep neural network, running on a GPU, to extract information from high-rate measurements of microstructure evolution in materials. The GPU is not powerful enough, however, for the periodic retraining required to respond to structural deformation.
  5. Dispatching requests to a Cerebras system at Argonne reduces retraining time from 1102 seconds on the local GPU to 31 seconds: 12 seconds to move the data and the new model, and 19 seconds for training. That is 35x faster.
  6. Another example: the recently funded CROCUS urban integrated field laboratory, which combines multiple sensors with simulations to generate both short-term flood warnings and long-term flood risk warnings. (Chicago Tribune, September 25, 2022)
  7. One aspect of this new world is that the large white spaces in our maps, in which little or no communication is possible, will be filled in. 5G and 6G are one reason. Another technology that may be even more transformative is coherent light free-space optics: that is, space lasers. This image depicts such a device, on the right, and the “Spacetime” platform that Aalyria (pronounced ah-Leer-eeh-ah), a Google spinout, is developing to provide global communication. The original name for their platform was Minkowski, after Hermann Minkowski.
  8. Hermann Minkowski, a colleague of Einstein’s, observed that the special theory of relativity defined a space-time continuum.
  9. Another essential building block, this one operational for several years, is the Globus Auth identity and access management platform service. We need flows to perform actions on instruments, computers, repositories, and networks at remote locations. To do this securely, we provide consent and token mechanisms for controlling what agents acting on a user’s behalf can do.
  10. Supported by over 200 subscribers worldwide.
  11. As a second example of work intended to produce better information faster, I will describe the work of the Globus team on science services to accelerate complex tasks, for example, by making fast and reliable data transfer a one-click operation. The Globus system, developed and operated by a wonderful team at the University of Chicago, uses software running on the Amazon cloud to manage data transfers, and more, with extreme reliability and high performance: what we may call a managed research acceleration service. I created this animation of Globus transfers from late 2010 to today. The x-axis is the great circle distance between source and destination, from 0 to 20,000 kilometers. The y-axis is the transfer size, from 100 MB to several petabytes. Each dot represents a transfer. I also track total exabytes and gigafiles. I label a few source-destination pairs. You’ll see a large gap around 5,000 km. It turns out that the sizes of the continents and oceans mean that few communications occur over that distance. You may also see a point to the far right at 19,855 kilometers. Can any two places be that far apart?
  12. It turns out those dots represent transfers between Madrid and Wellington, and those two cities are about as far apart on the globe as you can get, as shown in this map from a childhood atlas explaining the concept of “antipodes.”
  13. Another managed research acceleration service, Globus Compute (formerly funcX), led by Kyle Chard, …
  14. Diaspora
  15. A wonderful team at the University of Chicago led by Rachana Ananthakrishnan has developed Globus Automation Services to manage the execution of such flows. As a cloud-hosted service, it can manage flows that run for seconds or months.
  16. Dispatching requests to a Cerebras system at Argonne reduces retraining time from 1102 seconds on the local GPU to 31 seconds: 12 seconds to move the data and the new model, and 19 seconds for training. That is 35x faster.
  17. Dispatching requests to a Cerebras system at Argonne reduces retraining time from 1102 seconds on the local GPU to 31 seconds: 12 seconds to move the data and the new model, and 19 seconds for training. That is 35x faster.
  18. I’ll mention that these new applications represent a new source of increasingly different workloads. Here, for example, I show the average duration of each action across thousands of instances of seven synchrotron light source flows. Durations range from seconds to hours. Globus services ensure reliable execution.
  19. I’ll mention that these new applications represent a new source of increasingly different workloads. Here, for example, I show the average duration of each action across thousands of instances of seven synchrotron light source flows. Durations range from seconds to hours. Globus services ensure reliable execution.
  20. We have identified the following key requirements. Surely not a complete list, but all essential.
  21. Others include methods for managing flows, computing anywhere, accessing data anywhere, and delegating credentials. As with Globus Transfer, we make extensive use of cloud-hosted research acceleration services, which, I remind you, combine cloud-hosted management logic for high reliability with local agents for a global footprint. We are exploring new programming abstractions and managed research acceleration services.
  22. You might ask, how are the problems and solutions that I have presented here different from those deployed 10, 20, or 30 years ago? Dan Reed has observed that computing differs from other sciences in that the questions often stay the same while the answers change. Far more reliable and performant networks, and effective containerization, allow us to revisit old answers. And new concerns, like energy, result in new questions.
  23. Indeed, our early experiences programming the continuum suggest numerous other questions.
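
The question posed in notes 11 and 12 (can two places really be 19,855 km apart?) can be checked with a great-circle calculation. The sketch below uses the haversine formula on a spherical Earth. The mean radius of 6371 km and the city-center coordinates are assumptions supplied for illustration, not values from the slides, so the result should only be expected to land close to the slide's figure, and just under half the Earth's circumference.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical model)


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=EARTH_RADIUS_KM):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp guards against rounding pushing sqrt(a) slightly above 1.
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))


# Approximate city-center coordinates (assumed): (latitude, longitude)
WELLINGTON = (-41.2866, 174.7756)
MADRID = (40.4168, -3.7038)

distance_km = great_circle_km(*WELLINGTON, *MADRID)
half_circumference_km = math.pi * EARTH_RADIUS_KM

print(f"Wellington to Madrid: {distance_km:,.0f} km")
print(f"Half circumference:   {half_circumference_km:,.0f} km")
```

Because the two cities are nearly antipodal, the distance between them approaches the maximum possible separation on a sphere, which is why those transfers sit at the far right of the x-axis in the animation.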