Linking Scientific Instruments and Computation:
Patterns, Technologies, Experiences
Ian Foster
The University of Chicago
Argonne National Laboratory
foster@anl.gov
Crescat scientia; vita excolatur
https://arxiv.org/abs/2204.05128
https://arxiv.org/abs/2208.09513
A new generation of scientific instruments
• New sensors produce data at high velocities and in large volumes
• New methods and structures are required to capture and process data, and to feed back to sensors
• Increasing need to harness HPC, cloud, and edge computers
• An instrument becomes a set of flows, overlaid on distributed physical resources and software
Mark Boland, https://bit.ly/3cfSosk, 2017
Example: High-energy diffraction microscopy
Example: Ptychographic reconstruction
Example: Serial synchrotron crystallography
A modular, extensible approach to creating and running flows
Flows
Capture useful patterns as
sequences of actions.
Resource-independent
Action providers
Implement actions.
Resource-independent
Compute Action Provider: Run function at A.
Transfer Action Provider: Transfer from A to B.
Search Action Provider: Publish metadata.
…
Fabric
Implements auth, data, and compute APIs for manipulating resources
Authenticate user.
Delegate credentials.
Manage file transfers.
Run jobs on computers.
Access data catalog.
…
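The three layers above (flows, action providers, fabric) can be sketched in a few lines of Python. This is a toy, in-memory illustration only: the names `Fabric`, `make_providers`, and `run_flow`, the parameter keys, and the status strings are all hypothetical and are not the Globus APIs.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical model: an action provider is a callable, parameters in, result out.
ActionProvider = Callable[[Dict[str, Any]], Dict[str, Any]]


@dataclass
class Fabric:
    """In-memory stand-in for the data and catalog services (illustrative only)."""
    files: Dict[str, List[str]] = field(default_factory=dict)
    catalog: List[Dict[str, Any]] = field(default_factory=list)


def make_providers(fabric: Fabric) -> Dict[str, ActionProvider]:
    def transfer(p: Dict[str, Any]) -> Dict[str, Any]:
        # Transfer from A to B: copy the file listing of the source location.
        moved = list(fabric.files.get(p["source"], []))
        fabric.files.setdefault(p["destination"], []).extend(moved)
        return {"status": "SUCCEEDED", "files": len(moved)}

    def compute(p: Dict[str, Any]) -> Dict[str, Any]:
        # Run function at A: apply the function to the files held at that location.
        return {"status": "SUCCEEDED", "result": p["function"](fabric.files.get(p["where"], []))}

    def search(p: Dict[str, Any]) -> Dict[str, Any]:
        # Publish metadata: append a record to the catalog.
        fabric.catalog.append(p["metadata"])
        return {"status": "SUCCEEDED"}

    return {"transfer": transfer, "compute": compute, "search": search}


def run_flow(flow: List[Tuple[str, Dict[str, Any]]],
             providers: Dict[str, ActionProvider]) -> List[Dict[str, Any]]:
    """Run actions in order; stop at the first failure and return the action log."""
    log = []
    for name, params in flow:
        try:
            result = providers[name](params)
        except Exception as exc:
            result = {"status": "FAILED", "error": str(exc)}
        log.append({"action": name, **result})
        if result["status"] != "SUCCEEDED":
            break
    return log


fabric = Fabric(files={"instrument": ["scan_001.tiff", "scan_002.tiff"]})
flow = [
    ("transfer", {"source": "instrument", "destination": "hpc"}),
    ("compute", {"where": "hpc", "function": len}),
    ("search", {"metadata": {"experiment": "demo"}}),
]
log = run_flow(flow, make_providers(fabric))
```

The flow itself names only actions and logical locations, so the same sequence could be pointed at different resources by changing the parameters rather than the flow.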
Builds on cloud-hosted Globus automation services
[Figure: Globus automation services (Triggers, Flows, Timers, Queues) connecting a Microscope and an Analysis Computer. Labels shown:
Event: Type: creation; Match: *tiff
Action: Type: transfer; From: microscope; To: analysis computer
Action: Type: user selection; data: <feature extraction>; Options: approve/reject
Queue: 1 2 3 4
Flow run: a sequence of steps]
https://arxiv.org/abs/2204.05128
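The trigger shown in the figure (an event of type creation whose file name matches *tiff leads to a transfer action) can be sketched as a small matching function. This is an assumption-laden illustration: the dictionary layout and the function name `actions_for` are hypothetical, and only the event type, match pattern, and transfer endpoints come from the slide.

```python
from fnmatch import fnmatch

# Trigger condition taken from the slide; the dictionary layout is hypothetical.
TRIGGER = {"event_type": "creation", "match": "*tiff"}


def actions_for(event, trigger=TRIGGER):
    """Return the actions to queue for one file-system event (empty if no match)."""
    if event["type"] != trigger["event_type"]:
        return []
    if not fnmatch(event["path"], trigger["match"]):
        return []
    return [{
        "type": "transfer",
        "from": "microscope",
        "to": "analysis computer",
        "path": event["path"],
    }]


queued = actions_for({"type": "creation", "path": "scan_001.tiff"})
```

Events that do not match, such as the creation of a non-TIFF file, produce an empty list and therefore queue nothing.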
Capture flows in reusable forms
In various ways:
- YAML documents
- Python “Gladier” SDK
- Web authoring
Customize flow to application
Specialize flow to resources
Execute specialized flow
Check flow status
Examine flow actions
Identify failed actions
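The "capture, then specialize" steps can be illustrated with a resource-independent flow template whose parameters are filled in for particular resources. The sketch below is loosely modeled on state-machine style definitions; the field names, the `.$` reference convention as used here, the `example.org` URLs, and the endpoint identifiers are all assumptions for illustration, not the exact Globus Flows schema. In a real service such references would normally be resolved at run time rather than by a local helper.

```python
# Hypothetical resource-independent flow: where to transfer from and to, and
# which compute endpoint to use, are left as references to the flow input.
FLOW_TEMPLATE = {
    "StartAt": "Transfer",
    "States": {
        "Transfer": {
            "Type": "Action",
            "ActionUrl": "https://example.org/actions/transfer",  # placeholder URL
            "Parameters": {
                "source_endpoint.$": "$.input.source_endpoint",
                "destination_endpoint.$": "$.input.destination_endpoint",
            },
            "Next": "Analyze",
        },
        "Analyze": {
            "Type": "Action",
            "ActionUrl": "https://example.org/actions/compute",  # placeholder URL
            "Parameters": {"endpoint.$": "$.input.compute_endpoint"},
            "End": True,
        },
    },
}

PREFIX = "$.input."


def specialize(node, flow_input):
    """Return a copy of the template with '$.input.*' references replaced by values."""
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if key.endswith(".$") and isinstance(value, str) and value.startswith(PREFIX):
                out[key[:-2]] = flow_input[value[len(PREFIX):]]
            else:
                out[key] = specialize(value, flow_input)
        return out
    if isinstance(node, list):
        return [specialize(item, flow_input) for item in node]
    return node


# Hypothetical resource identifiers for one deployment.
specialized = specialize(FLOW_TEMPLATE, {
    "source_endpoint": "beamline-endpoint-id",
    "destination_endpoint": "hpc-endpoint-id",
    "compute_endpoint": "hpc-compute-id",
})
```

Because `specialize` builds a new structure, the template stays unchanged and can be reused with a different set of resources.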
[Figure: three example flows (serial synchrotron crystallography, ptychography, high-energy diffraction microscopy). Diagram labels: AI model training; AI model deployment; data collection & transfer; Cerebras; catalog & publish; detector; injector; x-ray; target; FAIR data; data reduction, refine structures; AI accelerators, HPC; ptychographic reconstruction; data collection & transfer (raw); data collection & transfer (position); AI accelerators]
Flows have been developed for light source
data analysis, biomedical and materials
science data ingest, on-demand simulation, …
Determining protein structures 10-100x faster
“These data services have taken the
time to solve a structure from
weeks to days and now to hours”
Darren Sherrell, SBC beamline
scientist APS Sector 19
• Developed a new automation pipeline to collect, analyze, and visualize data, solve protein structures, and load results into a searchable portal for real-time feedback
• Achieved a 10-100x speedup in time to solution of protein structures at the APS beamline
• Leveraged unique DOE facilities at the Advanced Photon Source (SBC Sector 19) and ALCF (Theta/ThetaGPU, Petrel, and Data Portals)
• Deposited first results in open repositories
Automation pipeline
(Chard, Vescovi, Foster, Blaiszik, Sherrell, Joachimiak, et al.)
[Figure: automation pipeline spanning APS, ALCF Theta, Petrel, and Data Portals]
Flow invocations 2020-21 for five APS experiments
Numbers vary due to facility and experimental schedules.
We collect detailed performance data on flows
https://arxiv.org/abs/2204.05128
Transfer, compute,
and cataloging
costs for median
flows
Round-trip latencies for various action providers
• Current architecture
has ~1 sec minimum
latency due to cloud
interaction
• funcX latencies higher
due to polling strategy
• Both can be improved
as needed
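Why polling raises observed latency can be shown with simple arithmetic. The sketch below assumes a client that checks for results at fixed intervals and adds the roughly 1 second cloud overhead noted above; the function name, the additive model, and the polling intervals are hypothetical and are not measurements from the talk.

```python
import math


def observed_latency(completion_time, poll_interval, service_overhead=1.0):
    """Seconds until a fixed-interval poller first sees a result (simplified model).

    completion_time: seconds until the action actually finishes.
    poll_interval: seconds between polls (first poll after one interval).
    service_overhead: assumed fixed cloud round-trip cost, ~1 s per the slide.
    """
    polls = max(1, math.ceil(completion_time / poll_interval))
    return service_overhead + polls * poll_interval


# A 0.3 s function seen through a slow and a fast (hypothetical) polling interval.
slow = observed_latency(0.3, poll_interval=2.0)
fast = observed_latency(0.3, poll_interval=0.5)
```

Under this model, a short action is dominated by the polling interval rather than by its own run time, which is consistent with the remark that the polling strategy can be changed to reduce latency.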
We build on a universal auth, compute, & data fabric
Globus Auth: Authentication and delegation mechanisms to control what happens where
Run functions anywhere funcX is deployed
Globus Connect: Access data anywhere Globus Connect is deployed
* See also: Integrated Research Infrastructure, computing continuum, grid
As of 4/2022
Globus hybrid “SaaS” model: Data fabric
Globus hybrid “SaaS” model: Compute fabric
Customer-owned and -administered computer with a funcX agent running on it
The funcX service orchestrates function execution by communicating with the funcX agent
[Figure: User-defined flows call Globus Automation Services, which reach Globus-accessible storage and computing (10,000s of systems) through APIs. Resources shown, marked with funcX agents and/or Globus Connect agents:
Argonne Leadership Computing Facility: Polaris, Theta, Eagle store
Laboratory Computing Research Center: Bebop Cluster
Advanced Photon Source: APS Computing, Orthros Cluster, APS DM system
Portal servers]
Building computationally-enhanced instruments: There is much more to be done!
• We have worked so far with light sources and data ingest pipelines
• We are pleased with adaptability and reliability
• Work required in capability (e.g., iteration) and performance
• Others are applying tools to microscopes and other instruments
• New action providers are needed for instrument control
• We are eager to find partners who want to work with us on developing and/or applying these methods and tools!
Thanks to talented colleagues!
Linking Scientific Instruments & HPC: Patterns, Technologies, Experiences (https://arxiv.org/abs/2204.05128)
Raf Vescovi, Ryan Chard, Nick Saint, Ben Blaiszik, Jim Pruyne, Tekin Bicer, Alex Lavens, Zhengchun Liu, Mike Papka, Suresh Narayanan, Nicholas Schwarz, Kyle Chard
Globus Automation Services: Research process automation across the space-time continuum (https://arxiv.org/abs/2208.09513)
Rachana Ananthakrishnan, Josh Bryan, Kyle Chard, Ryan Chard, Kurt McKee, Jim Pruyne, Brigitte Raumann
And the rest of the ALCF, APS, & Globus teams
And sponsors
Recap: Enabling
new instruments
Reusable flows
composed from an
extensible set of
actions
Built on global
auth, compute, data
fabric
Join us in applying
these methods!
https://arxiv.org/abs/2204.05128
https://arxiv.org/abs/2208.09513
https://www.globus.org/platform/services/flows

Editor's Notes

  • #3 Probe. Instrument. Meter.
  • #4-#6 Metacomputing revisited: 10^10 x faster; 10^5 x more tasks; 10^6 x more data. Link HPC, AI, instruments. c still 3 x 10^8 m/s
  • #28 Need to mention other Braid people! Eliu Huerta Bogdan Nicolae Justin Wozniak MENTION Eliu work?