Automate: The Globus Vision (GlobusWorld Tour - UMich)

Presented at the GlobusWorld Tour workshop at the University of Michigan, on July 22, 2019.

  1. Automate: The Globus Vision (Ryan Chard)
  2. Our goal is to make all research data reliably, rapidly, and securely accessible, discoverable, and usable.
  3. Addressing environment heterogeneity
     • Automation tasks need to be applied to laptops, desktops, local clusters, cloud, and supercomputers
     • Each has its own local system resources, permissions, environments, and timescales
     • We need a common way to interact with arbitrary resources and perform tasks that abstracts these “details”
  4. Requires a platform that …
     • Automates best practices (replicate, catalog, share)
     • Can be applied as data are created
     • Operates across arbitrary storage and compute systems
     • Dynamically re-programs to respond to new events
     • Enables non-expert users to define automations
  5. Globus Automate
  6. Globus Automate
     • A PaaS for creating, using, and sharing data management automation pipelines
     • Define and register an automation flow, authenticate, then run the flow
     • Does not do any compute itself; it calls out to Action Providers
     • Status: Alpha! A whitelist Globus Group grants permission to create flows for testing
  7. Globus automation capabilities
     • Built on AWS Step Functions
         – Simple JSON-based state machine language
         – Conditions, loops, fault tolerance, etc.
         – Propagates state through the flow
     • Standardized API for integrating custom event and action services
         – Actions: synchronous or asynchronous
         – Custom Web forms prompt for user input
     • Actions secured with Globus Auth
  8. Creating flows
     • The definition extends the AWS Step Functions (SFN) state machine language
     • Definitions are transformed into flows, then into state machines
         – Adds auth, async, etc.
         – Returns a UUID
     • Accepts a JSON blob as input
  9. Running flows
     • Define an input doc
     • In the flow, you specify the branches of the doc to be used as input and output
         – $.Transfer1Input
         – $.Transfer1Result
     • Invoke the flow with the input
  10. Automate Interfaces
     • No GUI available yet
     • SDK to create, visualize, run, and monitor flows
     • CLI to interact with the flows service and Action Providers
  11. Automate Actions
     • Create an Auth scope so users can authenticate with the service
     • Toolkit provided to help build actions
     • Custom GLabs actions (not product): DLHub, funcX, Xtract, user input
     • Action API endpoints: /run, /cancel, /status, /, /release
  12. Introspecting Actions
     • You can find out what an action requires via the SDK and CLI:
         $ globus-automate action-provider-introspect
     • Designed to help build GUIs for creating flows and to help users invoke actions
  13. Automation platform: Core services
     • Execute: remote execution, self optimization, secure connections
     • Transfer: file operations, transfer data, manage shared data permissions
     • Search: catalog, faceted search, search query
     • Identifier: manage namespace, mint DOI
     • Auth: user login, secure service interactions, app identity and interactions
  14. funcX: FaaS platform for HPC
     • funcX endpoints deployed at the resource
     • Cloud service routes requests to endpoints
     • Parsl acquires resources
     • Singularity containers run functions
     • Globus Auth secures communication
     • Status: Pre-alpha
  15. Other Services
     • DLHub
     • User Input
     • Notification
     • Xtract
     • Any others you’d like? We have dozens of summer students!
  16. Use cases
     • Connectomics
     • Serial Crystallography
     • Data publication
  17. Building the connectome (diagram: neuroanatomy reconstruction pipeline)
     • Numbered steps: 1 Imaging, 2 Acquisition, 3 Pre-processing, 4 Preview/Center, 5 User validation, 6 Reconstruction, 7 Publication, 8 Visualization, 9 Science!
     • Resources: APS, Lab Server 1, Lab Server 2, UChicago, Argonne JLSE, Argonne Leadership Computing Facility
  18. Automating the connectome (diagram of services and their actions)
     • Web form: User input
     • Search: Ingest
     • Share: Set policy
     • Identifier: Mint DOI
     • funcX: Run job
     • Auth: Get credentials
     • Describe: Get metadata
     • Transfer: Transfer data
     • funcX: Run job
     • Transfer: Transfer data
     • Automate
  19. Enabling serial crystallography at scale
     • Serially image 26,000 crystals
     • Quality control on the first 1,000
     • Analyze the full 26,000
     • Return the crystal structure to the scientist
  20. SSX Automation (diagram of services and their actions)
     • funcX: Analyze
     • Transfer: Return result
     • Auth: Get credentials
     • funcX: Preprocess
     • Stop?: Threshold
     • Transfer: Transfer data
  21. Data Publication
     • Citable Data: standard metadata, persistent identifiers, durable storage
     • Institutional Data: many domains, custom metadata, locally managed storage
     • Community Data: agreed schema, larger datasets, fine-grained metadata
  22. Data Publication
     • Decompose Globus Publish v1 into platform services
     • Allow for flexible re-composition and adaptation of services
     • Enable extension and enhancement
     • Services in the diagram: Auth, Transfer, Describe, Identifier, Search, Automate
     • Actions in the diagram: Get credentials, Create folder, Transfer data, Set permission, Get metadata, Mint DOI, Catalog
  23. Examples
  24. Thanks!
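
The ideas in the "Creating flows" and "Running flows" slides (a JSON state machine definition, an input doc, and `$.Transfer1Input` / `$.Transfer1Result` naming the branches a state reads from and writes to) can be illustrated with a minimal sketch. This is not the Globus Automate SDK or service. The definition uses field names from the Amazon States Language (`StartAt`, `States`, `InputPath`, `ResultPath`, `End`); the resource name `hypothetical:transfer`, the `fake_transfer` function, and the toy `run_flow` interpreter are assumptions made up for illustration only.

```python
import copy

# Flow definition in the style of the Amazon States Language, the JSON format
# used by AWS Step Functions. The resource name is hypothetical.
FLOW_DEFINITION = {
    "StartAt": "Transfer1",
    "States": {
        "Transfer1": {
            "Type": "Task",
            "Resource": "hypothetical:transfer",
            "InputPath": "$.Transfer1Input",    # branch of the doc the state reads
            "ResultPath": "$.Transfer1Result",  # branch of the doc the state writes
            "End": True,
        }
    },
}


def fake_transfer(params):
    """Stand-in for a real action; it only echoes what it was asked to do."""
    return {
        "status": "SUCCEEDED",
        "moved": params["source"] + " -> " + params["destination"],
    }


# Hypothetical registry mapping resource names to local Python callables.
ACTIONS = {"hypothetical:transfer": fake_transfer}


def key_of(path):
    """Handle only top-level paths such as '$.Transfer1Input'."""
    if not path.startswith("$.") or "." in path[2:]:
        raise ValueError(f"unsupported path: {path}")
    return path[2:]


def run_flow(definition, input_doc):
    """Toy interpreter: run Task states in order, threading the doc through."""
    doc = copy.deepcopy(input_doc)  # leave the caller's input doc untouched
    name = definition["StartAt"]
    while True:
        state = definition["States"][name]
        action = ACTIONS[state["Resource"]]
        doc[key_of(state["ResultPath"])] = action(doc[key_of(state["InputPath"])])
        if state.get("End"):
            return doc
        name = state["Next"]


input_doc = {
    "Transfer1Input": {"source": "lab:/raw", "destination": "cluster:/scratch"}
}
result = run_flow(FLOW_DEFINITION, input_doc)
print(result["Transfer1Result"]["status"])
```

Because each state writes its result back into the same document, a later state could read `$.Transfer1Result` as its own input, which is one way to picture state propagating through the flow. The sketch handles only top-level paths and sequential Task states; conditions, loops, authentication, and asynchronous actions are left out.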