Rachana Ananthakrishnan
ranantha@uchicago.edu
February 28, 2024
Instrument Data Automation: Life of a Flow
Instrument data management needs
[Diagram labels: Cryo EM, Lightsheet, Sequencer, ALS/APS, …; Local system; download; Remote analysis, visualization; Local policy store; --/cohort045, --/cohort096, --/cohort127]
• Reliable, near-real time data access
• Self-service access control, management
• Grant data access to collaborators
• Compute on data across storage classes
• Do it all at SCALE
What is needed for such automation…?
[Diagram labels: Data capture; Image analysis; QA check; Threshold; Analysis; Visualize; Metadata extraction; Publish to search index; Proceed or discard sample?]
• Bridge across different facility resources
• Network as an instrument
• Use variety of resources
• Human input
• Credentials for automation
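The "QA check", "Threshold", and "Proceed or discard sample?" labels point to a decision step that sometimes needs human input. The following is a minimal sketch in Python of how such a step might look. The score scale, the threshold and margin values, and the function name are assumptions made for illustration and do not come from the talk.

```python
def qa_decision(score: float, threshold: float = 0.8, margin: float = 0.05) -> str:
    """Return "proceed", "discard", or "ask_human" for a QA score.

    All numbers are hypothetical; a real facility would choose its own
    quality metric and cut-offs.
    """
    if score >= threshold + margin:
        return "proceed"    # clearly good: continue automatically
    if score < threshold - margin:
        return "discard"    # clearly bad: stop processing this sample
    return "ask_human"      # borderline: pause and wait for human input


for s in (0.95, 0.80, 0.40):
    print(s, qa_decision(s))
```

Routing only the borderline band to a person keeps the common cases automated while still leaving room for the human input the slide calls for.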
XPCS: X-ray Photon Correlation Spectroscopy
[Diagram labels: APS; Lab Server 1; ALCF Data Portal; Argonne Leadership Computing Facility; Eagle Storage. Numbered steps: 1 Imaging; 2 Acquisition; 3 XPCS-Eigen; 4 Plot results; 5 Publication; 6 Science!]
● Automate flows stage data to ALCF for on-demand analysis and publication
● Metadata and plots dynamically extracted, and published into a search catalog
● Scientists can select datasets and initiate flows to perform batch analysis tasks
Suresh Narayanan, Nicholas Schwarz
Globus Flows
End-to-end Automation: XPCS
[Flow diagram from data capture to data publication; steps labeled by service and action]
• Transfer: Transfer IMM
• Transfer: Move results to repo
• Compute: Run Corr
• Compute: Plot results
• Compute: Gather metadata
• Share: Set access controls
• Search: Ingest to index
• Transfer: Transfer HDF5 files
XPCS flow: definition
XPCS Flow
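The slide text does not include the flow definition itself. As a rough sketch, a Globus flow is described by a JSON document with a start state and a set of named states, each pointing to the next. The Python below builds such a document for the step names shown on the "End-to-end Automation: XPCS" slide. The step order, the state names, the empty parameters, and the `example.org` action URLs are all assumptions for illustration; a real definition would use the action provider URLs and parameters given in the Globus documentation.

```python
import json

# Hypothetical placeholder URLs, one per service named on the slide.
ACTION_URLS = {
    "Transfer": "https://actions.example.org/transfer",
    "Compute": "https://actions.example.org/compute",
    "Share": "https://actions.example.org/share",
    "Search": "https://actions.example.org/search",
}

# Step names come from the slide; their order here is an assumption.
STEPS = [
    ("TransferIMM", "Transfer"),
    ("TransferHDF5Files", "Transfer"),
    ("RunCorr", "Compute"),
    ("PlotResults", "Compute"),
    ("GatherMetadata", "Compute"),
    ("MoveResultsToRepo", "Transfer"),
    ("SetAccessControls", "Share"),
    ("IngestToIndex", "Search"),
]


def build_flow_definition(steps):
    """Chain the steps into a linear state machine."""
    states = {}
    for i, (name, service) in enumerate(steps):
        state = {
            "Type": "Action",
            "ActionUrl": ACTION_URLS[service],
            "Parameters": {},            # left empty in this sketch
            "ResultPath": f"$.{name}",   # where the step's result is stored
        }
        if i + 1 < len(steps):
            state["Next"] = steps[i + 1][0]
        else:
            state["End"] = True
        states[name] = state
    return {"StartAt": steps[0][0], "States": states}


def walk(definition):
    """Follow the Next links from StartAt and return the visited state names."""
    order = []
    name = definition["StartAt"]
    while name is not None:
        order.append(name)
        name = definition["States"][name].get("Next")
    return order


flow_definition = build_flow_definition(STEPS)
print(" -> ".join(walk(flow_definition)))
print(json.dumps(flow_definition["States"]["RunCorr"], indent=2))
```

Keeping each step's output under its own `ResultPath` lets later steps read earlier results, for example passing a transfer destination to the compute step.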
XPCS: Integrating experiment and compute facility
[Diagram labels: Reprocessing of data; Experiment-time processing of data]
Argonne: Ian Foster, Mike Papka, Tom Uram, Christine Simpson, Bill Allcock, Benoit Cote, Ryan Chard
APS: Suresh Narayanan, Miaoqi Chu, Hannah Parraga, Nicholas Schwarz, Laurent Chapon
UChicago: Rachana Ananthakrishnan, Kyle Chard, Nickolaus Saint, Ben Blaiszik
One-time configuration per beamline
Automating for experiment-time processing
[Diagram labels: APS; APS DM; APS Beamline service account; Compute function (import … def …)]
• Create a Globus application credential for the software at the instrument facility
• Register the compute function(s) needed for analysis
• Configure the flow so that the service account can run it
• Set up a guest collection on Globus Connect Personal (Windows machine), with read permissions for the service account to read data
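The "import … def …" label stands for an analysis function that is registered once and then run remotely. A minimal sketch of what such a function might look like follows. The function name and its arithmetic are hypothetical stand-ins and are not the XPCS analysis code. The import sits inside the function body because remotely executed functions generally need to carry their own imports.

```python
def summarize_intensities(intensities):
    """Hypothetical stand-in for an analysis function run at the compute facility."""
    # Import inside the body so the function is self-contained when it is
    # serialized and executed on a remote endpoint.
    import statistics

    mean = statistics.fmean(intensities)
    variance = statistics.pvariance(intensities)
    return {
        "n": len(intensities),
        "mean": mean,
        "normalized_variance": variance / mean**2 if mean else None,
    }


# Local check before registering the function anywhere.
print(summarize_intensities([1.0, 2.0, 3.0]))

# Registration is left as a comment because it needs the Globus Compute SDK
# and credentials (assumed here to be the beamline service account):
#   from globus_compute_sdk import Client
#   function_id = Client().register_function(summarize_intensities)
```

Testing the function locally first is cheap and catches most errors before the one-time registration step.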
XPCS flow: permissions
XPCS flow permission
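The permission settings themselves are not captured in the slide text. As a hedged sketch, flow access can be thought of as lists of principals per role. The role names below (`flow_viewers`, `flow_starters`, `flow_administrators`) follow the Globus Flows role names as I understand them and should be checked against the Globus documentation. The identity UUIDs are placeholders, and the rule that administrators may also start runs is an assumption.

```python
# Placeholder identity URNs; real values come from Globus Auth.
SERVICE_ACCOUNT = "urn:globus:auth:identity:00000000-0000-0000-0000-000000000001"
BEAMLINE_ADMIN = "urn:globus:auth:identity:00000000-0000-0000-0000-000000000002"
OTHER_USER = "urn:globus:auth:identity:00000000-0000-0000-0000-000000000003"

flow_permissions = {
    "flow_administrators": [BEAMLINE_ADMIN],
    "flow_starters": [SERVICE_ACCOUNT],
    "flow_viewers": [SERVICE_ACCOUNT, BEAMLINE_ADMIN],
}


def can_start(principal: str, permissions: dict) -> bool:
    """True if the principal may start a run of the flow.

    Assumption for this sketch: administrators may also start runs.
    """
    return (
        principal in permissions["flow_starters"]
        or principal in permissions["flow_administrators"]
    )


for who in (SERVICE_ACCOUNT, BEAMLINE_ADMIN, OTHER_USER):
    print(who[-1], can_start(who, flow_permissions))
```

Listing only the beamline service account as a starter matches the one-time configuration described earlier, where the flow is set up so that the service account can run it.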
One-time configuration per beamline
Automating for experiment-time processing
[Diagram labels: ALCF; Authorized APS admins with ALCF account allowed to manage the endpoint, and analysis code]
• Create a local account at the compute facility to allow automated processing
• Install Globus Compute endpoint in the local account, using the Globus service account
• Set appropriate local account policy to manage the compute endpoint deployment
Automating for experiment-time processing
[Diagram labels: One-time configuration per beamline; Automated workflow during experiments; Beamline and experiment ID; Data acquisition; APS; APS DM; ALCF; APS Beamline service account; Compute function (import … def …); Authorized APS admins with ALCF account allowed to manage the endpoint, and analysis code]
Automating for experiment-time processing
[Diagram labels: One-time configuration per beamline; Automated workflow during experiments; Beamline and experiment ID; Data acquisition; APS; APS DM; ALCF; Polaris; Eagle Compute endpoint; Beamline account (env/ $> …); APS Beamline service account; Compute function (import … def …); Authorized APS admins with ALCF account allowed to manage the endpoint, and analysis code]
XPCS – Reprocessing of data
• Flow triggered by the user via portal
• A separate application credential is used to run the flow
• Data shared with researcher(s) using Globus
XPCS portal
Scaling to several beamlines
Flows can be used beyond instruments...
CityCOVID
• Integrated COVID-19 pandemic monitoring, modeling, and analysis capability
• CityCOVID is a city-scale agent-based model
• Steps:
  – Scrape daily Chicago reports
  – Perform simulations at Argonne Leadership Computing Facility
  – Postprocess data at Lab Computing Resource Center
Jonathan Ozik, Nick Collier, and Charles Macal
CityCOVID
[Flow diagram; steps labeled by service and action]
• funcX: Analyze
• Transfer: Publish
• Auth: Get credentials
• funcX: Scrape
• funcX: Simulate
• Transfer: Transfer data
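The CityCOVID steps can be read as an ordered chain in which each step sees the results of the earlier ones. The sketch below mimics that pattern locally in plain Python. The step order, the function bodies, and every number are made-up stand-ins; nothing here calls funcX, Transfer, or Auth, and none of it is CityCOVID code or data.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]
Step = Tuple[str, Callable[[State], object]]


def run_steps(steps: List[Step], state: State) -> State:
    """Run steps in order, storing each result under the step's name."""
    for name, action in steps:
        state = {**state, name: action(state)}
    return state


# Hypothetical local stand-ins for the remote actions on the slide.
steps: List[Step] = [
    ("get_credentials", lambda s: "token-placeholder"),
    ("scrape", lambda s: [10, 12, 15]),                       # made-up daily counts
    ("simulate", lambda s: [x * 2 for x in s["scrape"]]),     # toy "simulation"
    ("analyze", lambda s: sum(s["simulate"]) / len(s["simulate"])),
    ("transfer_data", lambda s: "transferred"),
    ("publish", lambda s: f"published mean={s['analyze']:.1f}"),
]

result = run_steps(steps, {})
print(result["publish"])
```

The same shape applies whether the steps run on an instrument pipeline or a simulation pipeline, which is the point of the "beyond instruments" slide.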
Materials Data Facility
> 40 TB of data
> 320 published authors
> 400 datasets
• Accept data from many locations with flexible interfaces
• Index dataset contents in science-aware ways
• Dispatch data to the community
• Using Automate to simplify building composable flows of services
MDF Data Publication Automation
[Flow diagram (Automate); steps labeled by service and action]
• Ingest: Bulk Ingest
• Auth: Get Credentials
• Transfer: Transfer Dataset
• XTract: Extract Metadata
• Share: Set permissions
• Transfer: Move metadata
• Transfer: Transfer Dataset
• Transfers: Transfer Dataset
• Identifier: Mint DOI
• Web form: Metadata
• Notify: Notify Curator
• Web form: Curation
• Notify: Notify user
Support resources
• Globus documentation: docs.globus.org
• YouTube channel: youtube.com/GlobusOnline
• Helpdesk: support@globus.org
• Mailing Lists: globus.org/mailing-lists
• Customer engagement team (office hours)
• Professional services team (advisory, custom work)
