SlideShare a Scribd company logo
1 of 25
Download to read offline
Vas Vasiliadis
vas@uchicago.edu
February 28, 2024
Reliable, Remote Computation at all Scales
Why do we need remote computing?
Workloads
• Interactive, real-time
workloads
• Machine learning training
and inference
• Components may best be
executed in different places
Users
• Diverse backgrounds
and expertise
• Different user interfaces
(e.g., notebooks)
Resources
• Hardware specialization
• Specialization leads to
distribution
Why do researchers “hate” remote computing?
• Remote computing is notoriously complicated
– Authentication
– Network connections
– Configuring/managing jobs
– Interacting with resources (waiting in queues, scaling nodes)
– Configuring execution environment
– Getting results back again
• Researchers need to overcome the same obstacles
every time they move to a new resource
3
How can we bridge this gap?
Borrow page from data management playbook
è“Fire-and-forget” computation
èUniform access interface
èFederated access control
èFast networks making compute resource “local”
How can we bridge this gap?
Move closer to researchers’ environments
èResearchers primarily work in high level languages
èFunctions are a natural unit of computation
Function-as-a-Service (FaaS)!
• Researchers work in familiar language, e.g., Python…
• …using familiar interfaces, e.g., Jupyter
FaaS as offered by cloud providers
6
• Single provider, single
location to submit and
manage tasks
• Homogenous execution
environment
• Transparent and elastic
execution (of even very
small tasks)
• Integrated with
cloud provider data
management
Cloud
Provider 1
Cloud
Provider 2
FaaS as an interface to the advanced computing
ecosystem
7
• Multiple provider,
multiple location, single
interface
• Homogenous execution
environment
• Transparent and elastic
execution
• Integrated with
institution-wide data
management
Globus Compute helps bridge the gap
Managed, federated
Function-as-a-Service for
reliably, scaleably and
securely executing functions
on remote endpoints from
laptops to supercomputers
The Globus Compute model
• Compute service — Highly available cloud-hosted
service for managed, fire-and-forget function execution
• Compute endpoint — Abstracts access to compute
resources, from edge device to supercomputer
• Compute SDK — Python interface for interacting with the
service, suing the familiar (to many) Globus look and feel
• Security — Leverages Globus Auth; Compute endpoints
are resource servers, authN and access via OAuth tokens
9
User interaction with Globus Compute
10
A B
You request a function be
executed on endpoints A and B
1
2 Globus Compute manages
the reliable and secure
execution on these endpoints
3
Globus Compute returns results or
stores them until requested
Compute
Service
You register your code
(as a Python function)
0
A compute
resource
Another
compute
resource
Globus Compute transforms any computing
resource into a function serving endpoint
• Python pip installable agent
• Elastic resource provisioning
from local, cluster, or cloud
system (via Parsl)
• Parallel execution using local
fork or via common schedulers
– Slurm, PBS, LSF, Cobalt, K8s
11
Compute
Service
Executing functions with Globus Compute
• Users invoke functions as tasks
– Register Python function
– Pass input arguments
– Select endpoint(s)
• Service stores tasks in the cloud
• Endpoints fetch waiting tasks (when
online), run tasks, and return results
• Results stored in the cloud and on
Globus storage endpoints
• Users retrieve results asynchronously
12
Compute
Service
Common Use Cases
13
Use case 1: fire-and-forget execution
Executing a bag of tasks, e.g., running simulations with
different parameters, executing ML inferences, on one
or more remote computers directly from your
environment, e.g., Jupyter on your laptop
• Fire-and-forget execution managed by Globus Compute;
tasks/results cached until endpoint/client are online
• Portability across different systems—optionally making use of
specialized hardware
• Elastic scaling to provision resources as needed (from HPC and
cloud systems)
ML-based drug screening
15
National Virtual Biotechnology Laboratory, arXiv:2006.02431
Use case 2: automated analysis of data
Constructing and running automated analysis
pipelines with data processing steps that need to be
executed in different locations
• Automatically process data as they are acquired (event- and
workflow-based)
• Integrate with data movement and other actions—human and
machine
• Execute functions across the computing continuum e.g., near
instrument, in data center, on specialized hardware
Enabling serial crystallography at scale
• Serially image chips with
000’s of embedded crystals
• Quality control first 1,000 to
report failures
• Analyze batches of images as
they are collected
• Report statistics and images
during experiment
• Return structure to scientist
Darren Sherrell, Gyorgy Babnigg, Andrzej Joachimiak
Globus
Flows
How remote computing enables this use case
Image
processing
Local workstation
Carbon!
Check
threshold
Data publication
Transfer
Transfer
raw files
Transfer
Move results
to repo
Compute
Analyze
images
Compute
Visualize
Compute
Gather
metadata
Share
Set access
controls
Compute
Launch QA
job
Search
Ingest to
index
Supercomputer
GPU Cluster
Globus enables “smart” instruments
19
aining of Deep Neural Networks
ources
7sec
19sec
5sec
31sec.
cycle time
Seems too easy? Let’s take a look…
• Run function on my laptop …test, debug
• Scale (a little) on a cloud VM (or K8s cluster)
• Scale (a bit more) on my campus cluster
• …run complete model on DOE supercomputer :--(
20
Common pattern of computing in a flow
Transfer raw
instrument
images
Run a compute
job to process
image files
Transfer Compute
Move processed
images to sharing
system/repository
Set access
controls for
sharing data
Share
Transfer
1 2 3 4
EC2
Instance
“Instrument”
EC2
Instance
Compute
Resource
Our environment Registered
Function
Monitor
script
transfer control
access
result files
0 trigger
flow run
transfer
raw files
1
invoke image
processing function 2
set
permissions
4
transfer
result files
3
Data
Endpoint
Data
Endpoint
Data
Endpoint
ALCF
Eagle
Compute
Endpoint
Globus Compute current state
• Service is running in production
– Tried and tested by 10MMs of invocations on 100Ks endpoints
• Globus Computer Agent supports single user
– Akin to Globus Connect Personal
• Endpoints visible via Globus web app
• Function registration and invocation via SDK
23
Compute service will evolve rapidly
• Multi-user compute endpoints
– Akin to Globus Connect Server
• Native integration with transfer for stage in and stage
out of data for compute tasks
– Can be done today via Globus Flows
• Expanding compute service interfaces in the webapp
for administrators and users
Support resources
• Globus documentation: docs.globus.org
• YouTube channel: youtube.com/GlobusOnline
• Helpdesk: support@globus.org
• Mailing Lists: globus.org/mailing-lists
• Customer engagement team (office hours)
• Professional services team (advisory, custom work)

More Related Content

Similar to Reliable, Remote Computation at All Scales

Openstack For Beginners
Openstack For BeginnersOpenstack For Beginners
Openstack For Beginners
cpallares
 
Other distributed systems
Other distributed systemsOther distributed systems
Other distributed systems
Sri Prasanna
 

Similar to Reliable, Remote Computation at All Scales (20)

Brad stack - Digital Health and Well-Being Festival
Brad stack - Digital Health and Well-Being Festival Brad stack - Digital Health and Well-Being Festival
Brad stack - Digital Health and Well-Being Festival
 
Monitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloudMonitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloud
 
re:Invent 2013-foster-madduri
re:Invent 2013-foster-maddurire:Invent 2013-foster-madduri
re:Invent 2013-foster-madduri
 
Scaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and JupyterScaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and Jupyter
 
Sanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansSanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticians
 
Flexible compute
Flexible computeFlexible compute
Flexible compute
 
Big Process for Big Data @ NASA
Big Process for Big Data @ NASABig Process for Big Data @ NASA
Big Process for Big Data @ NASA
 
Взгляд на облака с точки зрения HPC
Взгляд на облака с точки зрения HPCВзгляд на облака с точки зрения HPC
Взгляд на облака с точки зрения HPC
 
Taming Big Data!
Taming Big Data!Taming Big Data!
Taming Big Data!
 
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
 
Openstack For Beginners
Openstack For BeginnersOpenstack For Beginners
Openstack For Beginners
 
StarlingX - Project Onboarding
StarlingX - Project OnboardingStarlingX - Project Onboarding
StarlingX - Project Onboarding
 
Genomic Computation at Scale with Serverless, StackStorm and Docker Swarm
Genomic Computation at Scale with Serverless, StackStorm and Docker SwarmGenomic Computation at Scale with Serverless, StackStorm and Docker Swarm
Genomic Computation at Scale with Serverless, StackStorm and Docker Swarm
 
RAMSES: Robust Analytic Models for Science at Extreme Scales
RAMSES: Robust Analytic Models for Science at Extreme ScalesRAMSES: Robust Analytic Models for Science at Extreme Scales
RAMSES: Robust Analytic Models for Science at Extreme Scales
 
Other distributed systems
Other distributed systemsOther distributed systems
Other distributed systems
 
DOE Magellan OpenStack user story
DOE Magellan OpenStack user storyDOE Magellan OpenStack user story
DOE Magellan OpenStack user story
 
Overview of Scientific Workflows - Why Use Them?
Overview of Scientific Workflows - Why Use Them?Overview of Scientific Workflows - Why Use Them?
Overview of Scientific Workflows - Why Use Them?
 
Globus Labs: Forging the Next Frontier
Globus Labs: Forging the Next FrontierGlobus Labs: Forging the Next Frontier
Globus Labs: Forging the Next Frontier
 
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & AlluxioAlluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
 
Dsdt meetup 2017 11-21
Dsdt meetup 2017 11-21Dsdt meetup 2017 11-21
Dsdt meetup 2017 11-21
 

More from Globus

More from Globus (20)

Advanced Globus System Administration Topics
Advanced Globus System Administration TopicsAdvanced Globus System Administration Topics
Advanced Globus System Administration Topics
 
Instrument Data Automation: The Life of a Flow
Instrument Data Automation: The Life of a FlowInstrument Data Automation: The Life of a Flow
Instrument Data Automation: The Life of a Flow
 
Building Research Applications with Globus PaaS
Building Research Applications with Globus PaaSBuilding Research Applications with Globus PaaS
Building Research Applications with Globus PaaS
 
Best Practices for Data Sharing Using Globus
Best Practices for Data Sharing Using GlobusBest Practices for Data Sharing Using Globus
Best Practices for Data Sharing Using Globus
 
An Introduction to Globus for Researchers
An Introduction to Globus for ResearchersAn Introduction to Globus for Researchers
An Introduction to Globus for Researchers
 
Introduction to Research Automation with Globus
Introduction to Research Automation with GlobusIntroduction to Research Automation with Globus
Introduction to Research Automation with Globus
 
Globus for System Administrators
Globus for System AdministratorsGlobus for System Administrators
Globus for System Administrators
 
Introduction to Globus for System Administrators
Introduction to Globus for System AdministratorsIntroduction to Globus for System Administrators
Introduction to Globus for System Administrators
 
Introduction to Data Transfer and Sharing for Researchers
Introduction to Data Transfer and Sharing for ResearchersIntroduction to Data Transfer and Sharing for Researchers
Introduction to Data Transfer and Sharing for Researchers
 
Introduction to the Globus Platform for Developers
Introduction to the Globus Platform for DevelopersIntroduction to the Globus Platform for Developers
Introduction to the Globus Platform for Developers
 
Introduction to the Command Line Interface (CLI)
Introduction to the Command Line Interface (CLI)Introduction to the Command Line Interface (CLI)
Introduction to the Command Line Interface (CLI)
 
Automating Research Data with Globus Flows and Compute
Automating Research Data with Globus Flows and ComputeAutomating Research Data with Globus Flows and Compute
Automating Research Data with Globus Flows and Compute
 
Automating Research Data Flows and Introduction to the Globus Platform
Automating Research Data Flows and Introduction to the Globus PlatformAutomating Research Data Flows and Introduction to the Globus Platform
Automating Research Data Flows and Introduction to the Globus Platform
 
Advanced Globus System Administration
Advanced Globus System AdministrationAdvanced Globus System Administration
Advanced Globus System Administration
 
Introduction to Globus for System Administrators
Introduction to Globus for System AdministratorsIntroduction to Globus for System Administrators
Introduction to Globus for System Administrators
 
Introduction to Globus for New Users
Introduction to Globus for New UsersIntroduction to Globus for New Users
Introduction to Globus for New Users
 
Working with Globus Platform Services and Portals
Working with Globus Platform Services and PortalsWorking with Globus Platform Services and Portals
Working with Globus Platform Services and Portals
 
Globus Automation
Globus AutomationGlobus Automation
Globus Automation
 
Advanced Globus System Administration
Advanced Globus System AdministrationAdvanced Globus System Administration
Advanced Globus System Administration
 
Introduction to Globus
Introduction to GlobusIntroduction to Globus
Introduction to Globus
 

Recently uploaded

CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
mohitmore19
 

Recently uploaded (20)

The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Pharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodologyPharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodology
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
LEVEL 5 - SESSION 1 2023 (1).pptx - PDF 123456
LEVEL 5   - SESSION 1 2023 (1).pptx - PDF 123456LEVEL 5   - SESSION 1 2023 (1).pptx - PDF 123456
LEVEL 5 - SESSION 1 2023 (1).pptx - PDF 123456
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 

Reliable, Remote Computation at All Scales

  • 1. Vas Vasiliadis vas@uchicago.edu February 28, 2024 Reliable, Remote Computation at all Scales
  • 2. Why do we need remote computing? Workloads • Interactive, real-time workloads • Machine learning training and inference • Components may best be executed in different places Users • Diverse backgrounds and expertise • Different user interfaces (e.g., notebooks) Resources • Hardware specialization • Specialization leads to distribution
  • 3. Why do researchers “hate” remote computing? • Remote computing is notoriously complicated – Authentication – Network connections – Configuring/managing jobs – Interacting with resources (waiting in queues, scaling nodes) – Configuring execution environment – Getting results back again • Researchers need to overcome the same obstacles every time they move to a new resource 3
  • 4. How can we bridge this gap? Borrow page from data management playbook è“Fire-and-forget” computation èUniform access interface èFederated access control èFast networks making compute resource “local”
  • 5. How can we bridge this gap? Move closer to researchers’ environments èResearchers primarily work in high level languages èFunctions are a natural unit of computation Function-as-a-Service (FaaS)! • Researchers work in familiar language, e.g., Python… • …using familiar interfaces, e.g., Jupyter
  • 6. FaaS as offered by cloud providers 6 • Single provider, single location to submit and manage tasks • Homogenous execution environment • Transparent and elastic execution (of even very small tasks) • Integrated with cloud provider data management Cloud Provider 1 Cloud Provider 2
  • 7. FaaS as an interface to the advanced computing ecosystem 7 • Multiple provider, multiple location, single interface • Homogenous execution environment • Transparent and elastic execution • Integrated with institution-wide data management
  • 8. Globus Compute helps bridge the gap Managed, federated Function-as-a-Service for reliably, scaleably and securely executing functions on remote endpoints from laptops to supercomputers
  • 9. The Globus Compute model • Compute service — Highly available cloud-hosted service for managed, fire-and-forget function execution • Compute endpoint — Abstracts access to compute resources, from edge device to supercomputer • Compute SDK — Python interface for interacting with the service, suing the familiar (to many) Globus look and feel • Security — Leverages Globus Auth; Compute endpoints are resource servers, authN and access via OAuth tokens 9
  • 10. User interaction with Globus Compute 10 A B You request a function be executed on endpoints A and B 1 2 Globus Compute manages the reliable and secure execution on these endpoints 3 Globus Compute returns results or stores them until requested Compute Service You register your code (as a Python function) 0 A compute resource Another compute resource
  • 11. Globus Compute transforms any computing resource into a function serving endpoint • Python pip installable agent • Elastic resource provisioning from local, cluster, or cloud system (via Parsl) • Parallel execution using local fork or via common schedulers – Slurm, PBS, LSF, Cobalt, K8s 11 Compute Service
  • 12. Executing functions with Globus Compute • Users invoke functions as tasks – Register Python function – Pass input arguments – Select endpoint(s) • Service stores tasks in the cloud • Endpoints fetch waiting tasks (when online), run tasks, and return results • Results stored in the cloud and on Globus storage endpoints • Users retrieve results asynchronously 12 Compute Service
  • 14. Use case 1: fire-and-forget execution Executing a bag of tasks, e.g., running simulations with different parameters, executing ML inferences, on one or more remote computers directly from your environment, e.g., Jupyter on your laptop • Fire-and-forget execution managed by Globus Compute; tasks/results cached until endpoint/client are online • Portability across different systems—optionally making use of specialized hardware • Elastic scaling to provision resources as needed (from HPC and cloud systems)
  • 15. ML-based drug screening 15 National Virtual Biotechnology Laboratory, arXiv:2006.02431
  • 16. Use case 2: automated analysis of data Constructing and running automated analysis pipelines with data processing steps that need to be executed in different locations • Automatically process data as they are acquired (event- and workflow-based) • Integrate with data movement and other actions—human and machine • Execute functions across the computing continuum e.g., near instrument, in data center, on specialized hardware
  • 17. Enabling serial crystallography at scale • Serially image chips with 000’s of embedded crystals • Quality control first 1,000 to report failures • Analyze batches of images as they are collected • Report statistics and images during experiment • Return structure to scientist Darren Sherrell, Gyorgy Babnigg, Andrzej Joachimiak
  • 18. Globus Flows How remote computing enables this use case Image processing Local workstation Carbon! Check threshold Data publication Transfer Transfer raw files Transfer Move results to repo Compute Analyze images Compute Visualize Compute Gather metadata Share Set access controls Compute Launch QA job Search Ingest to index Supercomputer GPU Cluster
  • 19. Globus enables “smart” instruments 19 aining of Deep Neural Networks ources 7sec 19sec 5sec 31sec. cycle time
  • 20. Seems too easy? Let’s take a look… • Run function on my laptop …test, debug • Scale (a little) on a cloud VM (or K8s cluster) • Scale (a bit more) on my campus cluster • …run complete model on DOE supercomputer :--( 20
  • 21. Common pattern of computing in a flow Transfer raw instrument images Run a compute job to process image files Transfer Compute Move processed images to sharing system/repository Set access controls for sharing data Share Transfer 1 2 3 4
  • 22. EC2 Instance “Instrument” EC2 Instance Compute Resource Our environment Registered Function Monitor script transfer control access result files 0 trigger flow run transfer raw files 1 invoke image processing function 2 set permissions 4 transfer result files 3 Data Endpoint Data Endpoint Data Endpoint ALCF Eagle Compute Endpoint
  • 23. Globus Compute current state • Service is running in production – Tried and tested by 10MMs of invocations on 100Ks endpoints • Globus Computer Agent supports single user – Akin to Globus Connect Personal • Endpoints visible via Globus web app • Function registration and invocation via SDK 23
  • 24. Compute service will evolve rapidly • Multi-user compute endpoints – Akin to Globus Connect Server • Native integration with transfer for stage in and stage out of data for compute tasks – Can be done today via Globus Flows • Expanding compute service interfaces in the webapp for administrators and users
  • 25. Support resources • Globus documentation: docs.globus.org • YouTube channel: youtube.com/GlobusOnline • Helpdesk: support@globus.org • Mailing Lists: globus.org/mailing-lists • Customer engagement team (office hours) • Professional services team (advisory, custom work)