Simple Data Automation with Globus (GlobusWorld Tour West)

Globus
Data Automation Programming
Teaching Globus Some New Tricks
Greg Nawrocki
greg@globus.org
September 15, 2021
Custom portals? Science Gateways? Unique workflows? Our
Command Line Interface, Timer Service, open REST APIs and
Python SDK empower you to create an integrated ecosystem of
research data services and applications.
PaaS Security Challenges – Globus Auth
• How to provide:
– Login to apps
o Web apps (Jupyter Notebook, Portals), Mobile, Desktop, Command line
– Protect all REST API communications
o App → Globus service (Jupyter Notebook, Portals)
o App → non-Globus service (Portals)
o Service → service (Portals)
• While:
– Not introducing even more identities
o Providing a platform to consolidate those identities
– Providing a least-privilege security model (consents)
– Being agnostic to programming language and framework
– Being web friendly
– Making it easy for users and developers
Securing Apps with Globus Auth
• Native App (with refresh tokens – extend expiration)
– Authentication as user identity
– Authentication URL / come back with an auth code, which is exchanged for tokens
– Clients can’t keep a secret - tokens in plain text
– Jupyter Notebook examples / Timer Service
• Auth Code Grant
– Authentication as user identity
– Browser redirect to Globus Auth, auth code returned (no manual copy)
– Tokens stored securely
– Jupyter hub secured with Globus Auth
• Confidential Client:
– Authentication as application
– ClientID and Secret stored securely
– Custom apps
Globus Command Line Interface
Open source; built on the Python SDK.
Because of this correspondence, the CLI is an excellent tool for getting the gist of how the SDK functions.
Great in shell scripts.
Globus CLI
• Easy install and get updates
– https://docs.globus.org/cli/
– https://docs.globus.org/cli/examples/
– https://github.com/globus/globus-cli
• All interactions with transfer and auth at the identity level
– Command “globus login” gets access tokens and refresh tokens
o Stores the token locally (~/.globus.cfg )
o Tokens for Globus Auth and Transfer services
– Command “globus logout” deletes those
– Command “globus whoami” reveals logged in identity
CLI Basics – “globus” is the executable
$ globus endpoint search 'Globus Tutorial'
$ globus task list
$ globus get-identities greg@globus.org --verbose
• Getting help / list of commands
– globus list-commands
– globus --help
• UUIDs for endpoint, task, user identity, groups…
• Can query to discover the UUIDs
– Use search / list / get options
The Globus CLI – Simple tasks
$ globus ls ddb59af0-6d04-11e5-ba46-22000b92c6ec
$ globus ls ddb59aef-6d04-11e5-ba46-22000b92c6ec:/share/godata/
$ globus transfer \
    ddb59aef-6d04-11e5-ba46-22000b92c6ec:/share/godata/file3.txt \
    ddb59af0-6d04-11e5-ba46-22000b92c6ec:/~/file3.txt
• List endpoint contents
• Single file transfer
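The commands above address data as ENDPOINT_UUID:PATH. A script that builds such arguments may want to check them first. The sketch below is a minimal illustration in Python; the helper name split_endpoint_path is hypothetical and is not part of the Globus CLI or SDK.

```python
import uuid

def split_endpoint_path(arg):
    """Split 'ENDPOINT_UUID:/path' into (uuid_string, path); path may be empty."""
    endpoint, _, path = arg.partition(":")
    # uuid.UUID raises ValueError if the first part is not a well-formed UUID.
    return str(uuid.UUID(endpoint)), path

print(split_endpoint_path("ddb59aef-6d04-11e5-ba46-22000b92c6ec:/share/godata/"))
print(split_endpoint_path("ddb59af0-6d04-11e5-ba46-22000b92c6ec"))
```

This only checks the shape of the argument; it does not confirm that the endpoint exists or that the path is readable.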
The Globus CLI – Simple tasks
$ globus transfer --recursive \
    ddb59aef-6d04-11e5-ba46-22000b92c6ec:/share/godata/ \
    ddb59af0-6d04-11e5-ba46-22000b92c6ec:/~/
• Recursive transfer
$ globus delete ddb59af0-6d04-11e5-ba46-22000b92c6ec:/~/file3.txt
• Delete
Batch Transfers
• Transfer tasks have one source/destination, but can have
any number of files
• Provide input source-dest pairs via local file
• File may have embedded comments
$ globus transfer \
    ddb59aef-6d04-11e5-ba46-22000b92c6ec:/share/godata/ \
    ddb59af0-6d04-11e5-ba46-22000b92c6ec:/~/ \
    --batch --label 'CLI Batch' < files.txt
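As a rough illustration of the batch input described above, the Python sketch below generates the contents of a file such as files.txt: one source/destination pair per line, with an optional comment line starting with #. The function name make_batch_lines and the sample paths are hypothetical, and the exact batch syntax accepted by your CLI version should be confirmed with globus transfer --help.

```python
import shlex

def make_batch_lines(pairs, comment=None):
    """Return batch-file lines: one 'SOURCE DEST' pair per line."""
    lines = []
    if comment:
        lines.append(f"# {comment}")
    for src, dst in pairs:
        # Quote each path so names containing spaces stay a single token.
        lines.append(f"{shlex.quote(src)} {shlex.quote(dst)}")
    return lines

# Hypothetical paths, relative to the endpoint paths given on the command line.
pairs = [("file1.txt", "file1.txt"), ("my data.csv", "backup/my data.csv")]
batch_text = "\n".join(make_batch_lines(pairs, comment="nightly copy")) + "\n"
print(batch_text, end="")
```

Writing batch_text to files.txt and redirecting it into the command above would submit all pairs as a single task.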
Parsing CLI output
$ globus endpoint search --filter-scope my-endpoints
$ globus endpoint search --filter-scope my-endpoints --format json
$ globus endpoint search --filter-scope my-endpoints --jmespath 'DATA[].[id, display_name]'
• Default output is text; for JSON output use --format json
• Extract specific attributes using --jmespath <expression>
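When the JSON output is consumed by a script instead of filtered with --jmespath, the same extraction can be done after parsing. The sketch below assumes the output has a top-level DATA list, as the expression above implies; the sample document and its display names are made up for illustration.

```python
import json

# Hypothetical sample shaped like `--format json` output: a top-level DATA list.
sample_output = """
{"DATA": [
  {"id": "ddb59aef-6d04-11e5-ba46-22000b92c6ec", "display_name": "Example Endpoint A"},
  {"id": "ddb59af0-6d04-11e5-ba46-22000b92c6ec", "display_name": "Example Endpoint B"}
]}
"""

def id_and_name(json_text):
    """Mimic the JMESPath expression DATA[].[id, display_name]."""
    doc = json.loads(json_text)
    # Missing keys become None, much as JMESPath yields null for them.
    return [[item.get("id"), item.get("display_name")] for item in doc.get("DATA", [])]

rows = id_and_name(sample_output)
for endpoint_id, name in rows:
    print(endpoint_id, name)
```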
Managing notifications
• Turn off emails sent for tasks
• Useful when an application manages tasks for a user
• Disable notifications with the --notify option
--notify off (all notifications)
--notify succeeded|failed|inactive (select notifications)
Other CLI Commands
• globus endpoint permission
– Manage access control rules
– CLI based portal
• globus endpoint role
– Manage endpoint roles
– Delegate roles to other identities
• globus task
– show
– cancel
Automation with the CLI
• Interactions are as the user, both for data access and for Globus services
– Run “globus login” to get tokens
• Collection access
– Mapped Collections
o Use --skip-activation-check to submit the task even if the endpoint is not activated at submit time
– Guest Collections
o Guest Collection / Shared Endpoints auto-activate
o Use Guest Collections whenever possible
• Reference
– Basic Data Automation with the Globus Command Line Interface
(CLI)
o https://www.youtube.com/watch?v=qIQTC6YOvrE
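A script that drives the CLI on a user's behalf typically assembles the command, runs it, and parses the JSON result. The Python sketch below only assembles the argument list and does not run anything; the function name build_transfer_argv and the label are hypothetical, and the flags are the ones shown in the slides above.

```python
import shlex

def build_transfer_argv(source, dest, label, recursive=False, notify="off"):
    """Assemble (without running) an argv list for `globus transfer`."""
    argv = ["globus", "transfer", source, dest, "--label", label,
            "--notify", notify, "--format", "json"]
    if recursive:
        argv.append("--recursive")
    return argv

argv = build_transfer_argv(
    "ddb59aef-6d04-11e5-ba46-22000b92c6ec:/share/godata/",
    "ddb59af0-6d04-11e5-ba46-22000b92c6ec:/~/",
    label="Scripted sync",
    recursive=True,
)
print(shlex.join(argv))
# To run it: subprocess.run(argv, check=True, capture_output=True, text=True)
```

Passing a list to subprocess.run avoids shell quoting problems with paths that contain spaces.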
The Globus Timer Service
• For scheduling recurring Globus transfers using
Globus Automate
– Backups
– Synchronizations
• Doc: https://pypi.org/project/globus-timer-cli/
• Service with a CLI interface
– Simple installation (pip install)
– Authentication as user identity
o Browser redirect to Globus Auth – copy back auth code – native app
o Authentication information is thereafter cached so the authentication
process is only needed on the first use of the CLI
Using the Globus Timer Service
• globus-timer session {login, logout, whoami}
globus-timer job transfer \
    --name example-job \
    --label "Timer Transfer Job" \
    --interval 28800 \
    --start '2020-01-01T12:34:56' \
    --source-endpoint ddb59aef-6d04-11e5-ba46-22000b92c6ec \
    --dest-endpoint ddb59af0-6d04-11e5-ba46-22000b92c6ec \
    --item ~/file1.txt ~/new_file1.txt false \
    --item ~/file2.txt ~/new_file2.txt false
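The --interval value is given in seconds, so 28800 corresponds to 8 hours. The Python sketch below derives that number and lists the first few run times, assuming the first run happens at --start and later runs follow at fixed intervals; that scheduling assumption and the helper name run_times are illustrative, not taken from the Timer service documentation.

```python
from datetime import datetime, timedelta

def run_times(start_iso, interval_seconds, count):
    """Return the first `count` scheduled run times for a fixed interval."""
    start = datetime.fromisoformat(start_iso)
    step = timedelta(seconds=interval_seconds)
    return [start + i * step for i in range(count)]

interval = int(timedelta(hours=8).total_seconds())  # 28800, as in the command above
for t in run_times("2020-01-01T12:34:56", interval, 3):
    print(t.isoformat())
```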
Using the Globus Timer Service
• --items-file transfer_items.csv
• Other options – just like in the web app
--sync-level (how timer behaves if files exist)
--verify-checksum
--encrypt-data
--preserve-timestamp
--stop-after-runs
--stop-after-date
• globus-timer job transfer --help
Monitoring and Deleting Jobs
• globus-timer job list
• globus-timer job status <job_id> [--verbose]
• globus-timer job delete <job_id>
Data centric applications leveraging Globus
Globus Transfer API
• Globus Web App consumes public Transfer API
• Resource named by URL (standard REST approach)
– Query params allow refinement (e.g., subset of fields)
• Globus APIs use JSON for documents and resource
representations
• Requests authorized via OAuth2 access token
– Authorization: Bearer asdflkqhafsdafeawk
docs.globus.org/api/transfer
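To make the bearer-token pattern concrete, the Python sketch below builds a GET request with an Authorization header but does not send it. The base URL, the /endpoint_search path, and the filter_fulltext and fields query parameters are assumptions to be checked against docs.globus.org/api/transfer; the token is a placeholder.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed base URL for the Transfer API; confirm at docs.globus.org/api/transfer.
TRANSFER_BASE = "https://transfer.api.globus.org/v0.10"

def build_request(path, access_token, **query):
    """Build (but do not send) a GET request carrying an OAuth2 bearer token."""
    url = f"{TRANSFER_BASE}{path}"
    if query:
        url += "?" + urlencode(query)
    req = Request(url, method="GET")
    req.add_header("Authorization", f"Bearer {access_token}")
    return req

# Placeholder token; assumed resource path with a field-subset query parameter.
req = build_request("/endpoint_search", "PLACEHOLDER_TOKEN",
                    filter_fulltext="Globus Tutorial", fields="id,display_name")
print(req.full_url)
print(req.get_header("Authorization"))
```

In practice the Python SDK handles this header for you; the sketch only shows what travels on the wire.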
Globus Python SDK
• Python client library for the Globus Auth and Transfer
REST APIs
• TransferClient class handles connection management,
security, framing, marshaling
– Largely direct mapping to REST API
– One method for each API resource and HTTP verb
• Nice high level wrapper to the API – manages low level
API housekeeping tasks
https://globus-sdk-python.readthedocs.io/en/stable/
globus.github.io/globus-sdk-python
Endpoint Activation
• Activating endpoint means binding a credential to an
endpoint for login
• Mapped Collections require login via web app
• Auto-activate
– Globus Connect Personal and Guest Collections use
Globus-provided credential
– Must auto-activate before any API calls to endpoints
Synchronous Tasks
• Endpoint search (with scopes)
• List directory contents (ls)
• Make directory (mkdir)
• Rename
• Note:
– Path encoding & UTF gotchas
– Don’t forget to auto-activate first
Asynchronous Tasks
• Transfer
– Sync level option
• Delete
• Get submission_id, followed by submit
– Once and only once submission
• Use task id to “follow up”
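The point of fetching a submission_id before submitting is that a retried submit cannot create a duplicate task. The Python sketch below models that idea with a toy in-memory service; ToyTaskService and its methods are hypothetical stand-ins and are not the Globus Transfer API or SDK.

```python
import uuid

class ToyTaskService:
    """In-memory stand-in for a task service; NOT the Globus API."""

    def __init__(self):
        self._tasks_by_submission = {}

    def get_submission_id(self):
        return str(uuid.uuid4())

    def submit(self, submission_id, document):
        # A repeated submission_id returns the existing task instead of a new one.
        if submission_id not in self._tasks_by_submission:
            self._tasks_by_submission[submission_id] = {
                "task_id": str(uuid.uuid4()),
                "document": document,
            }
        return self._tasks_by_submission[submission_id]["task_id"]

    def task_count(self):
        return len(self._tasks_by_submission)

service = ToyTaskService()
sid = service.get_submission_id()
first = service.submit(sid, {"type": "transfer"})
retry = service.submit(sid, {"type": "transfer"})  # e.g. after a network timeout
print(first == retry, service.task_count())
```

The returned task id is what a client would then use to follow up on the task.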
The Globus API / SDK with a Jupyter Notebook in a Jupyter Hub
[Diagram: login returns tokens; REST API calls carry a bearer token (Bearer a45cd…)]
Walkthrough API with our Jupyter Hub
• https://jupyter.demo.globus.org
– Sign in with Globus
– Verify the consents
– Start My Server (this will take about a minute)
– Open folder: globus-jupyter-notebooks
– Run Platform_Introduction_JupyterHub_Auth.ipynb
• If you mess it up and want to “go back to the beginning”
– Just stop and restart the server
• If you want to use the notebook outside of our hub
– https://github.com/globus/globus-jupyter-notebooks
– Authentication is a manual copy and paste: you exchange the authorization code for an access token (Native App)
Automation Examples
• Simple code examples for various use cases using
Globus
– https://github.com/globus/automation-examples
– Syncing a directory
o Bash script that calls the Globus CLI and a Python module that can
be run as a script or imported as a module.
– Staging data in a shared directory
o Bash / Python
– Removing directories after files are transferred
o Python script
Support resources
• Globus documentation: docs.globus.org
• GitHub: https://github.com/globus
• YouTube channel: youtube.com/user/GlobusOnline
1 of 28

Recommended

Tutorial: Leveraging Globus in your Research Applications by
Tutorial: Leveraging Globus in your Research ApplicationsTutorial: Leveraging Globus in your Research Applications
Tutorial: Leveraging Globus in your Research ApplicationsGlobus
292 views29 slides
GlobusWorld 2021 Tutorial: The Globus CLI, Platform and SDK by
GlobusWorld 2021 Tutorial: The Globus CLI, Platform and SDKGlobusWorld 2021 Tutorial: The Globus CLI, Platform and SDK
GlobusWorld 2021 Tutorial: The Globus CLI, Platform and SDKGlobus
176 views32 slides
Introduction to Globus (GlobusWorld Tour West) by
Introduction to Globus (GlobusWorld Tour West)Introduction to Globus (GlobusWorld Tour West)
Introduction to Globus (GlobusWorld Tour West)Globus
110 views13 slides
Making Storage Systems Accessible via Globus (GlobusWorld Tour West) by
Making Storage Systems Accessible via Globus (GlobusWorld Tour West)Making Storage Systems Accessible via Globus (GlobusWorld Tour West)
Making Storage Systems Accessible via Globus (GlobusWorld Tour West)Globus
128 views31 slides
Best Practices for Data Sharing (GlobusWorld Tour - UCSD) by
Best Practices for Data Sharing (GlobusWorld Tour - UCSD)Best Practices for Data Sharing (GlobusWorld Tour - UCSD)
Best Practices for Data Sharing (GlobusWorld Tour - UCSD)Globus
65 views17 slides
Globus Command Line Interface (APS Workshop) by
Globus Command Line Interface (APS Workshop)Globus Command Line Interface (APS Workshop)
Globus Command Line Interface (APS Workshop)Globus
100 views25 slides

More Related Content

What's hot

GlobusWorld 2021 Tutorial: Introduction to Globus by
GlobusWorld 2021 Tutorial: Introduction to GlobusGlobusWorld 2021 Tutorial: Introduction to Globus
GlobusWorld 2021 Tutorial: Introduction to GlobusGlobus
158 views13 slides
Tutorial: Managing Protected Data with Globus Connect Server v5 by
Tutorial: Managing Protected Data with Globus Connect Server v5Tutorial: Managing Protected Data with Globus Connect Server v5
Tutorial: Managing Protected Data with Globus Connect Server v5Globus
371 views40 slides
Leveraging the Globus Platform in Web Applications (CHPC 2019 - South Africa) by
Leveraging the Globus Platform in Web Applications (CHPC 2019 - South Africa)Leveraging the Globus Platform in Web Applications (CHPC 2019 - South Africa)
Leveraging the Globus Platform in Web Applications (CHPC 2019 - South Africa)Globus
106 views30 slides
Globus for System Administrators (GlobusWorld Tour - UCSD) by
Globus for System Administrators (GlobusWorld Tour - UCSD)Globus for System Administrators (GlobusWorld Tour - UCSD)
Globus for System Administrators (GlobusWorld Tour - UCSD)Globus
110 views55 slides
Globus Platform Overview by
Globus Platform OverviewGlobus Platform Overview
Globus Platform OverviewGlobus
260 views24 slides
Globus Endpoint Setup and Configuration - XSEDE14 Tutorial by
Globus Endpoint Setup and Configuration - XSEDE14 TutorialGlobus Endpoint Setup and Configuration - XSEDE14 Tutorial
Globus Endpoint Setup and Configuration - XSEDE14 TutorialGlobus
1.2K views33 slides

What's hot(20)

GlobusWorld 2021 Tutorial: Introduction to Globus by Globus
GlobusWorld 2021 Tutorial: Introduction to GlobusGlobusWorld 2021 Tutorial: Introduction to Globus
GlobusWorld 2021 Tutorial: Introduction to Globus
Globus 158 views
Tutorial: Managing Protected Data with Globus Connect Server v5 by Globus
Tutorial: Managing Protected Data with Globus Connect Server v5Tutorial: Managing Protected Data with Globus Connect Server v5
Tutorial: Managing Protected Data with Globus Connect Server v5
Globus 371 views
Leveraging the Globus Platform in Web Applications (CHPC 2019 - South Africa) by Globus
Leveraging the Globus Platform in Web Applications (CHPC 2019 - South Africa)Leveraging the Globus Platform in Web Applications (CHPC 2019 - South Africa)
Leveraging the Globus Platform in Web Applications (CHPC 2019 - South Africa)
Globus 106 views
Globus for System Administrators (GlobusWorld Tour - UCSD) by Globus
Globus for System Administrators (GlobusWorld Tour - UCSD)Globus for System Administrators (GlobusWorld Tour - UCSD)
Globus for System Administrators (GlobusWorld Tour - UCSD)
Globus 110 views
Globus Platform Overview by Globus
Globus Platform OverviewGlobus Platform Overview
Globus Platform Overview
Globus 260 views
Globus Endpoint Setup and Configuration - XSEDE14 Tutorial by Globus
Globus Endpoint Setup and Configuration - XSEDE14 TutorialGlobus Endpoint Setup and Configuration - XSEDE14 Tutorial
Globus Endpoint Setup and Configuration - XSEDE14 Tutorial
Globus 1.2K views
Introduction to the Globus Platform (GlobusWorld Tour - UMich) by Globus
Introduction to the Globus Platform (GlobusWorld Tour - UMich)Introduction to the Globus Platform (GlobusWorld Tour - UMich)
Introduction to the Globus Platform (GlobusWorld Tour - UMich)
Globus 136 views
Automating Research Data Flows with the Globus Command Line Interface (CLI) by Globus
Automating Research Data Flows with the Globus Command Line Interface (CLI)Automating Research Data Flows with the Globus Command Line Interface (CLI)
Automating Research Data Flows with the Globus Command Line Interface (CLI)
Globus 237 views
Introduction to Globus for New Users (GlobusWorld Tour - UCSD) by Globus
Introduction to Globus for New Users (GlobusWorld Tour - UCSD)Introduction to Globus for New Users (GlobusWorld Tour - UCSD)
Introduction to Globus for New Users (GlobusWorld Tour - UCSD)
Globus 64 views
Globus Portal Framework (APS Workshop) by Globus
Globus Portal Framework (APS Workshop)Globus Portal Framework (APS Workshop)
Globus Portal Framework (APS Workshop)
Globus 121 views
Instrument Data Orchestration with Globus Search and Flows by Globus
Instrument Data Orchestration with Globus Search and FlowsInstrument Data Orchestration with Globus Search and Flows
Instrument Data Orchestration with Globus Search and Flows
Globus 111 views
GlobusWorld 2021 Tutorial: Building with the Globus Platform by Globus
GlobusWorld 2021 Tutorial: Building with the Globus PlatformGlobusWorld 2021 Tutorial: Building with the Globus Platform
GlobusWorld 2021 Tutorial: Building with the Globus Platform
Globus 156 views
Data Publication and Discovery with Globus by Globus
Data Publication and Discovery with GlobusData Publication and Discovery with Globus
Data Publication and Discovery with Globus
Globus 266 views
Tutorial: Introduction to Globus for System Administrators by Globus
Tutorial: Introduction to Globus for System AdministratorsTutorial: Introduction to Globus for System Administrators
Tutorial: Introduction to Globus for System Administrators
Globus 465 views
Introduction to Globus for System Administrators (GlobusWorld Tour - UMich) by Globus
Introduction to Globus for System Administrators (GlobusWorld Tour - UMich)Introduction to Globus for System Administrators (GlobusWorld Tour - UMich)
Introduction to Globus for System Administrators (GlobusWorld Tour - UMich)
Globus 357 views
Globus for System Administrators (CHPC 2019 - South Africa) by Globus
Globus for System Administrators (CHPC 2019 - South Africa)Globus for System Administrators (CHPC 2019 - South Africa)
Globus for System Administrators (CHPC 2019 - South Africa)
Globus 180 views
Jupyter + Globus: The Foundation for Interactive Data Science by Globus
Jupyter + Globus: The Foundation for Interactive Data ScienceJupyter + Globus: The Foundation for Interactive Data Science
Jupyter + Globus: The Foundation for Interactive Data Science
Globus 423 views
Data Orchestration at Scale (GlobusWorld Tour West) by Globus
Data Orchestration at Scale (GlobusWorld Tour West)Data Orchestration at Scale (GlobusWorld Tour West)
Data Orchestration at Scale (GlobusWorld Tour West)
Globus 102 views
Globus for System Administrators (GlobusWorld Tour - Columbia University) by Globus
Globus for System Administrators (GlobusWorld Tour - Columbia University)Globus for System Administrators (GlobusWorld Tour - Columbia University)
Globus for System Administrators (GlobusWorld Tour - Columbia University)
Globus 84 views
Tutorial: Automating Research Data Workflows by Globus
Tutorial: Automating Research Data WorkflowsTutorial: Automating Research Data Workflows
Tutorial: Automating Research Data Workflows
Globus 135 views

Similar to Simple Data Automation with Globus (GlobusWorld Tour West)

Automating Research Data Flows and Introduction to the Globus Platform by
Automating Research Data Flows and Introduction to the Globus PlatformAutomating Research Data Flows and Introduction to the Globus Platform
Automating Research Data Flows and Introduction to the Globus PlatformGlobus
50 views41 slides
Automating Research Data Flows and an Introduction to the Globus Platform by
Automating Research Data Flows and an Introduction to the Globus PlatformAutomating Research Data Flows and an Introduction to the Globus Platform
Automating Research Data Flows and an Introduction to the Globus PlatformGlobus
132 views42 slides
Introduction to the Globus Platform (APS Workshop) by
Introduction to the Globus Platform (APS Workshop)Introduction to the Globus Platform (APS Workshop)
Introduction to the Globus Platform (APS Workshop)Globus
97 views24 slides
Leveraging the Globus Platform (GlobusWorld Tour - Columbia University) by
Leveraging the Globus Platform (GlobusWorld Tour - Columbia University)Leveraging the Globus Platform (GlobusWorld Tour - Columbia University)
Leveraging the Globus Platform (GlobusWorld Tour - Columbia University)Globus
84 views42 slides
Leveraging the Globus Platform (GlobusWorld Tour - UCSD) by
Leveraging the Globus Platform (GlobusWorld Tour - UCSD)Leveraging the Globus Platform (GlobusWorld Tour - UCSD)
Leveraging the Globus Platform (GlobusWorld Tour - UCSD)Globus
62 views26 slides
Automating Research Data Workflows (GlobusWorld Tour - Columbia University) by
Automating Research Data Workflows (GlobusWorld Tour - Columbia University)Automating Research Data Workflows (GlobusWorld Tour - Columbia University)
Automating Research Data Workflows (GlobusWorld Tour - Columbia University)Globus
120 views22 slides

Similar to Simple Data Automation with Globus (GlobusWorld Tour West)(20)

Automating Research Data Flows and Introduction to the Globus Platform by Globus
Automating Research Data Flows and Introduction to the Globus PlatformAutomating Research Data Flows and Introduction to the Globus Platform
Automating Research Data Flows and Introduction to the Globus Platform
Globus 50 views
Automating Research Data Flows and an Introduction to the Globus Platform by Globus
Automating Research Data Flows and an Introduction to the Globus PlatformAutomating Research Data Flows and an Introduction to the Globus Platform
Automating Research Data Flows and an Introduction to the Globus Platform
Globus 132 views
Introduction to the Globus Platform (APS Workshop) by Globus
Introduction to the Globus Platform (APS Workshop)Introduction to the Globus Platform (APS Workshop)
Introduction to the Globus Platform (APS Workshop)
Globus 97 views
Leveraging the Globus Platform (GlobusWorld Tour - Columbia University) by Globus
Leveraging the Globus Platform (GlobusWorld Tour - Columbia University)Leveraging the Globus Platform (GlobusWorld Tour - Columbia University)
Leveraging the Globus Platform (GlobusWorld Tour - Columbia University)
Globus 84 views
Leveraging the Globus Platform (GlobusWorld Tour - UCSD) by Globus
Leveraging the Globus Platform (GlobusWorld Tour - UCSD)Leveraging the Globus Platform (GlobusWorld Tour - UCSD)
Leveraging the Globus Platform (GlobusWorld Tour - UCSD)
Globus 62 views
Automating Research Data Workflows (GlobusWorld Tour - Columbia University) by Globus
Automating Research Data Workflows (GlobusWorld Tour - Columbia University)Automating Research Data Workflows (GlobusWorld Tour - Columbia University)
Automating Research Data Workflows (GlobusWorld Tour - Columbia University)
Globus 120 views
Introduction to the Globus PaaS (GlobusWorld Tour - STFC) by Globus
Introduction to the Globus PaaS (GlobusWorld Tour - STFC)Introduction to the Globus PaaS (GlobusWorld Tour - STFC)
Introduction to the Globus PaaS (GlobusWorld Tour - STFC)
Globus 77 views
Gateways 2020 Tutorial - Introduction to Globus by Globus
Gateways 2020 Tutorial - Introduction to GlobusGateways 2020 Tutorial - Introduction to Globus
Gateways 2020 Tutorial - Introduction to Globus
Globus 151 views
Automating Research Data Workflows (GlobusWorld Tour - STFC) by Globus
Automating Research Data Workflows (GlobusWorld Tour - STFC)Automating Research Data Workflows (GlobusWorld Tour - STFC)
Automating Research Data Workflows (GlobusWorld Tour - STFC)
Globus 128 views
Automating Research Data Workflows (GlobusWorld Tour - UCSD) by Globus
Automating Research Data Workflows (GlobusWorld Tour - UCSD)Automating Research Data Workflows (GlobusWorld Tour - UCSD)
Automating Research Data Workflows (GlobusWorld Tour - UCSD)
Globus 29 views
Automating Data Flows with the Globus CLI (GlobusWorld Tour - UMich) by Globus
Automating Data Flows with the Globus CLI (GlobusWorld Tour - UMich)Automating Data Flows with the Globus CLI (GlobusWorld Tour - UMich)
Automating Data Flows with the Globus CLI (GlobusWorld Tour - UMich)
Globus 231 views
Automating Research Data Flows with Globus (CHPC 2019 - South Africa) by Globus
Automating Research Data Flows with Globus (CHPC 2019 - South Africa)Automating Research Data Flows with Globus (CHPC 2019 - South Africa)
Automating Research Data Flows with Globus (CHPC 2019 - South Africa)
Globus 156 views
Working with Globus Platform Services and Portals by Globus
Working with Globus Platform Services and PortalsWorking with Globus Platform Services and Portals
Working with Globus Platform Services and Portals
Globus 28 views
Using Globus to Streamline Research at Scale by Globus
Using Globus to Streamline Research at ScaleUsing Globus to Streamline Research at Scale
Using Globus to Streamline Research at Scale
Globus 30 views
Automating Research Data with Globus Flows and Compute by Globus
Automating Research Data with Globus Flows and ComputeAutomating Research Data with Globus Flows and Compute
Automating Research Data with Globus Flows and Compute
Globus 6 views
Introduction to the Command Line Interface (CLI) by Globus
Introduction to the Command Line Interface (CLI)Introduction to the Command Line Interface (CLI)
Introduction to the Command Line Interface (CLI)
Globus 12 views
Globus Endpoint Administration (GlobusWorld Tour - STFC) by Globus
Globus Endpoint Administration (GlobusWorld Tour - STFC)Globus Endpoint Administration (GlobusWorld Tour - STFC)
Globus Endpoint Administration (GlobusWorld Tour - STFC)
Globus 230 views
Introduction to Globus: Research Data Management Software at the ALCF by Globus
Introduction to Globus: Research Data Management Software at the ALCFIntroduction to Globus: Research Data Management Software at the ALCF
Introduction to Globus: Research Data Management Software at the ALCF
Globus 274 views
"What's New With Globus" Webinar: Spring 2018 by Globus
"What's New With Globus" Webinar: Spring 2018"What's New With Globus" Webinar: Spring 2018
"What's New With Globus" Webinar: Spring 2018
Globus 270 views

More from Globus

Introduction to Globus for System Administrators by
Introduction to Globus for System AdministratorsIntroduction to Globus for System Administrators
Introduction to Globus for System AdministratorsGlobus
11 views55 slides
Introduction to Data Transfer and Sharing for Researchers by
Introduction to Data Transfer and Sharing for ResearchersIntroduction to Data Transfer and Sharing for Researchers
Introduction to Data Transfer and Sharing for ResearchersGlobus
4 views33 slides
Introduction to the Globus Platform for Developers by
Introduction to the Globus Platform for DevelopersIntroduction to the Globus Platform for Developers
Introduction to the Globus Platform for DevelopersGlobus
4 views28 slides
Advanced Globus System Administration by
Advanced Globus System AdministrationAdvanced Globus System Administration
Advanced Globus System AdministrationGlobus
26 views29 slides
Introduction to Globus for System Administrators by
Introduction to Globus for System AdministratorsIntroduction to Globus for System Administrators
Introduction to Globus for System AdministratorsGlobus
94 views54 slides
Introduction to Globus for New Users by
Introduction to Globus for New UsersIntroduction to Globus for New Users
Introduction to Globus for New UsersGlobus
55 views26 slides

More from Globus (20)

Introduction to Globus for System Administrators by Globus
Introduction to Globus for System AdministratorsIntroduction to Globus for System Administrators
Introduction to Globus for System Administrators
Globus 11 views
Introduction to Data Transfer and Sharing for Researchers by Globus
Introduction to Data Transfer and Sharing for ResearchersIntroduction to Data Transfer and Sharing for Researchers
Introduction to Data Transfer and Sharing for Researchers
Globus 4 views
Introduction to the Globus Platform for Developers by Globus
Introduction to the Globus Platform for DevelopersIntroduction to the Globus Platform for Developers
Introduction to the Globus Platform for Developers
Globus 4 views
Advanced Globus System Administration by Globus
Advanced Globus System AdministrationAdvanced Globus System Administration
Advanced Globus System Administration
Globus 26 views
Introduction to Globus for System Administrators by Globus
Introduction to Globus for System AdministratorsIntroduction to Globus for System Administrators
Introduction to Globus for System Administrators
Globus 94 views
Introduction to Globus for New Users by Globus
Introduction to Globus for New UsersIntroduction to Globus for New Users
Introduction to Globus for New Users
Globus 55 views
Globus Automation by Globus
Globus AutomationGlobus Automation
Globus Automation
Globus 20 views
Advanced Globus System Administration by Globus
Advanced Globus System AdministrationAdvanced Globus System Administration
Advanced Globus System Administration
Globus 21 views
Introduction to Globus by Globus
Introduction to GlobusIntroduction to Globus
Introduction to Globus
Globus 38 views
Introduction to Globus for System Administrators by Globus
Introduction to Globus for System AdministratorsIntroduction to Globus for System Administrators
Introduction to Globus for System Administrators
Globus 27 views
Working with Globus Platform Services by Globus
Working with Globus Platform ServicesWorking with Globus Platform Services
Working with Globus Platform Services
Globus 41 views
Advanced Globus System Administration by Globus
Advanced Globus System AdministrationAdvanced Globus System Administration
Advanced Globus System Administration
Globus 29 views
Introduction to Globus for System Administrators by Globus
Introduction to Globus for System AdministratorsIntroduction to Globus for System Administrators
Introduction to Globus for System Administrators
Globus 145 views
Introduction to Globus for Researchers by Globus
Introduction to Globus for ResearchersIntroduction to Globus for Researchers
Introduction to Globus for Researchers
Globus 89 views
Introduction to Globus for New Users by Globus
Introduction to Globus for New UsersIntroduction to Globus for New Users
Introduction to Globus for New Users
Globus 58 views
Globus Endpoint Migration and Advanced Administration Topics by Globus
Globus Endpoint Migration and Advanced Administration TopicsGlobus Endpoint Migration and Advanced Administration Topics
Globus Endpoint Migration and Advanced Administration Topics
Globus 54 views
Globus for System Administrators by Globus
Globus for System AdministratorsGlobus for System Administrators
Globus for System Administrators
Globus 68 views
Building Data Portals and Science Gateways with Globus by Globus
Building Data Portals and Science Gateways with GlobusBuilding Data Portals and Science Gateways with Globus
Building Data Portals and Science Gateways with Globus
Globus 133 views
Automating Research Data Management with Globus by Globus
Automating Research Data Management with GlobusAutomating Research Data Management with Globus
Automating Research Data Management with Globus
Globus 250 views
Moemoea nui Aotearoa: Challenges and Strategies in Data Lifecycle Management ... by Globus
Moemoea nui Aotearoa: Challenges and Strategies in Data Lifecycle Management ...Moemoea nui Aotearoa: Challenges and Strategies in Data Lifecycle Management ...
Moemoea nui Aotearoa: Challenges and Strategies in Data Lifecycle Management ...
Globus 150 views

Recently uploaded

RIO GRANDE SUPPLY COMPANY INC, JAYSON.docx by
RIO GRANDE SUPPLY COMPANY INC, JAYSON.docxRIO GRANDE SUPPLY COMPANY INC, JAYSON.docx
RIO GRANDE SUPPLY COMPANY INC, JAYSON.docxJaysonGarabilesEspej
6 views3 slides
MOSORE_BRESCIA by
MOSORE_BRESCIAMOSORE_BRESCIA
MOSORE_BRESCIAFederico Karagulian
5 views8 slides
Building Real-Time Travel Alerts by
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel AlertsTimothy Spann
109 views48 slides
Survey on Factuality in LLM's.pptx by
Survey on Factuality in LLM's.pptxSurvey on Factuality in LLM's.pptx

Simple Data Automation with Globus (GlobusWorld Tour West)

  • 1. Data Automation Programming Teaching Globus Some New Tricks Greg Nawrocki greg@globus.org September 15, 2021
  • 2. Custom portals? Science Gateways? Unique workflows? Our Command Line Interface, Timer Service, open REST APIs and Python SDK empower you to create an integrated ecosystem of research data services and applications.
  • 3. PaaS Security Challenges – Globus Auth • How to provide: – Login to apps o Web apps (Jupyter Notebook, Portals), Mobile, Desktop, Command line – Protect all REST API communications o App → Globus service (Jupyter Notebook, Portals) o App → non-Globus service (Portals) o Service → service (Portals) • While: – Not introducing even more identities o Providing a platform to consolidate those identities – Providing a least-privilege security model (consents) – Being agnostic to programming language and framework – Being web friendly – Making it easy for users and developers
  • 4. Securing Apps with Globus Auth • Native App (with refresh tokens – extend expiration) – Authentication as user identity – Authentication URL / come back with an auth code – exchanged for tokens – Clients can’t keep a secret - tokens in plain text – Jupyter Notebook examples / Timer Service • Auth Code Grant – Authentication as user identity – Browser redirect to Globus Auth, auth code returned (no manual copy) – Tokens stored securely – Jupyter hub secured with Globus Auth • Confidential Client: – Authentication as application – ClientID and Secret stored securely – Custom apps
  • 5. Globus Command Line Interface Open source, uses the Python SDK Because of this correspondence the CLI is an excellent tool for getting the gist of how the SDK functions. Great in shell scripts.
  • 6. Globus CLI • Easy to install and update – https://docs.globus.org/cli/ – https://docs.globus.org/cli/examples/ – https://github.com/globus/globus-cli • All interactions with transfer and auth at the identity level – Command “globus login” gets access tokens and refresh tokens o Stores the tokens locally (~/.globus.cfg) o Tokens for Globus Auth and Transfer services – Command “globus logout” deletes those – Command “globus whoami” reveals logged in identity
  • 7. CLI Basics – “globus” is the executable $ globus endpoint search 'Globus Tutorial' $ globus task list $ globus get-identities greg@globus.org --verbose • Getting help / list of commands – globus list-commands – globus --help • UUIDs for endpoint, task, user identity, groups… • Can query to discover the UUIDs – Use search / list / get options
  • 8. The Globus CLI – Simple tasks $ globus ls ddb59af0-6d04-11e5-ba46-22000b92c6ec $ globus ls ddb59aef-6d04-11e5-ba46-22000b92c6ec:/share/godata/ $ globus transfer ddb59aef-6d04-11e5-ba46-22000b92c6ec:/share/godata/file3.txt ddb59af0-6d04-11e5-ba46-22000b92c6ec:/~/file3.txt • List endpoint contents • Single file transfer
  • 9. The Globus CLI – Simple tasks $ globus transfer --recursive ddb59aef-6d04-11e5-ba46-22000b92c6ec:/share/godata/ ddb59af0-6d04-11e5-ba46-22000b92c6ec:/~/ • Recursive transfer $ globus delete ddb59af0-6d04-11e5-ba46-22000b92c6ec:/~/file3.txt • Delete
  • 10. Batch Transfers • Transfer tasks have one source/destination, but can have any number of files • Provide input source-dest pairs via local file • File may have embedded comments $ globus transfer ddb59aef-6d04-11e5-ba46-22000b92c6ec:/share/godata/ ddb59af0-6d04-11e5-ba46-22000b92c6ec:/~/ --batch --label 'CLI Batch' < files.txt
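A minimal sketch of how the `files.txt` input for a batch transfer could be generated. It assumes each non-comment line holds one source path and one destination path (relative to the endpoint paths given on the command line), that lines starting with `#` are treated as comments, and that paths containing spaces need shell-style quoting. The helper name `build_batch_lines` and the file names are hypothetical.

```python
import shlex


def build_batch_lines(pairs, comment=None):
    """Return batch-file lines: an optional '#' comment line, then one
    'source dest' pair per line, quoted so that spaces in paths survive."""
    lines = []
    if comment:
        lines.append(f"# {comment}")
    for src, dst in pairs:
        lines.append(f"{shlex.quote(src)} {shlex.quote(dst)}")
    return lines


# Hypothetical file names, relative to the source and destination paths
# passed to `globus transfer ... --batch`.
pairs = [("file1.txt", "file1.txt"), ("file2.txt", "copy of file2.txt")]
batch_lines = build_batch_lines(pairs, comment="files for the CLI Batch task")

# Printing keeps the sketch side-effect free; redirect to files.txt to use it.
print("\n".join(batch_lines))
```

Redirecting the printed lines into `files.txt` would give a file that can be fed to the `--batch` command shown on the slide.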
  • 11. Parsing CLI output $ globus endpoint search --filter-scope my-endpoints $ globus endpoint search --filter-scope my-endpoints --format json $ globus endpoint search --filter-scope my-endpoints --jmespath 'DATA[].[id, display_name]' • Default output is text; for JSON output use --format json • Extract specific attributes using --jmespath <expression>
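When the JSON output is captured by a script, the same selection as the JMESPath expression on the slide can be done in Python. This is a sketch only: the sample document below is a made-up stand-in for real CLI output, and it assumes nothing beyond a top-level `DATA` list whose items carry `id` and `display_name`, as the expression implies.

```python
import json

# Hypothetical stand-in for the output of
#   globus endpoint search --filter-scope my-endpoints --format json
# The UUIDs and names are invented for illustration.
sample_output = """
{
  "DATA": [
    {"id": "11111111-1111-1111-1111-111111111111", "display_name": "Lab share"},
    {"id": "22222222-2222-2222-2222-222222222222", "display_name": "Laptop"}
  ]
}
"""


def id_name_pairs(raw_json):
    """Mirror the JMESPath expression DATA[].[id, display_name]."""
    document = json.loads(raw_json)
    # .get() yields None for a missing key, much as JMESPath yields null.
    return [[item.get("id"), item.get("display_name")]
            for item in document.get("DATA", [])]


endpoint_pairs = id_name_pairs(sample_output)
for endpoint_id, name in endpoint_pairs:
    print(endpoint_id, name)
```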
  • 12. Managing notifications • Turn off emails sent for tasks • Useful when an application manages tasks for a user • Disable notifications with the --notify option --notify off (all notifications) --notify succeeded|failed|inactive (select notifications)
  • 13. Other CLI Commands • globus endpoint permission – Manage access control rules – CLI based portal • globus endpoint role – Manage endpoint roles – Delegate roles to other identities • globus task – show – cancel
  • 14. Automation with the CLI • Interactions are as user: both for data access and to Globus services – Globus login to get tokens • Collection access – Mapped Collections o Use --skip-activation-check to submit the task even if the endpoint is not activated at submit time – Guest Collections o Guest Collection / Shared Endpoints auto-activate o Use Guest Collections whenever possible • Reference – Basic Data Automation with the Globus Command Line Interface (CLI) o https://www.youtube.com/watch?v=qIQTC6YOvrE
  • 15. The Globus Timer Service • For scheduling recurring Globus transfers using Globus Automate – Backups – Synchronizations • Doc: https://pypi.org/project/globus-timer-cli/ • Service with a CLI – Simple installation (pip install) – Authentication as user identity o Browser redirect to Globus Auth – copy back auth code – native app o Authentication information is thereafter cached so the authentication process is only needed on the first use of the CLI
  • 16. Using the Globus Timer Service • globus-timer session {login, logout, whoami} globus-timer job transfer --name example-job --label "Timer Transfer Job" --interval 28800 --start '2020-01-01T12:34:56' --source-endpoint ddb59aef-6d04-11e5-ba46-22000b92c6ec --dest-endpoint ddb59af0-6d04-11e5-ba46-22000b92c6ec --item ~/file1.txt ~/new_file1.txt false --item ~/file2.txt ~/new_file2.txt false
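To make the schedule in the example job concrete, the sketch below computes the first few run times. It assumes `--interval` is expressed in seconds (so 28800 is eight hours) and that runs are spaced evenly from `--start`; the helper `run_times` is hypothetical and not part of the timer CLI.

```python
from datetime import datetime, timedelta


def run_times(start, interval_seconds, count):
    """Return the first `count` scheduled times, beginning at `start`."""
    step = timedelta(seconds=interval_seconds)
    return [start + i * step for i in range(count)]


# Values taken from the example job: --start and --interval.
start = datetime.fromisoformat("2020-01-01T12:34:56")
schedule = run_times(start, 28800, 3)

for moment in schedule:
    print(moment.isoformat())
# 2020-01-01T12:34:56
# 2020-01-01T20:34:56
# 2020-01-02T04:34:56
```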
  • 17. Using the Globus Timer Service • --items-file transfer_items.csv • Other options – just like in the web app --sync-level (how timer behaves if files exist) --verify-checksum --encrypt-data --preserve-timestamp --stop-after-runs --stop-after-date • globus-timer job transfer --help
  • 18. Monitoring and Deleting Jobs • globus-timer job list • globus-timer job status <job_id> [--verbose] • globus-timer job delete <job_id>
  • 19. Data-centric applications leveraging Globus
  • 20. Globus Transfer API • Globus Web App consumes public Transfer API • Resource named by URL (standard REST approach) – Query params allow refinement (e.g., subset of fields) • Globus APIs use JSON for documents and resource representations • Requests authorized via OAuth2 access token – Authorization: Bearer asdflkqhafsdafeawk docs.globus.org/api/transfer
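As a sketch of what "resource named by URL, authorized via a Bearer token" looks like in practice, the following builds (but does not send) such a request using only the standard library. The base URL, the `endpoint_search` resource, and the `filter_fulltext` query parameter are assumptions recalled from the public Transfer API documentation and should be checked against docs.globus.org/api/transfer; the token is a placeholder.

```python
import urllib.parse
import urllib.request

# Assumed base URL for the Transfer API; verify against the official docs.
BASE_URL = "https://transfer.api.globus.org/v0.10"
# Placeholder only; a real token comes from a Globus Auth login flow.
ACCESS_TOKEN = "PLACEHOLDER_TOKEN"


def build_request(resource, token, query=None):
    """Build a GET request for a resource URL with an OAuth2 Bearer header."""
    url = f"{BASE_URL}/{resource}"
    if query:
        url += "?" + urllib.parse.urlencode(query)
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}, method="GET"
    )


# Query parameters refine the result; `filter_fulltext` is an assumed name.
request = build_request(
    "endpoint_search", ACCESS_TOKEN, {"filter_fulltext": "Globus Tutorial"}
)
print(request.get_method(), request.full_url)
print("Authorization:", request.get_header("Authorization"))
```

The request is only constructed here; passing it to `urllib.request.urlopen` with a valid token would perform the call and return a JSON document.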
  • 21. Globus Python SDK • Python client library for the Globus Auth and Transfer REST APIs • TransferClient class handles connection management, security, framing, marshaling – Largely direct mapping to REST API – One method for each API resource and HTTP verb • Nice high level wrapper to the API – manages low level API housekeeping tasks https://globus-sdk-python.readthedocs.io/en/stable/ globus.github.io/globus-sdk-python
  • 22. Endpoint Activation • Activating endpoint means binding a credential to an endpoint for login • Mapped Collections require login via web app • Auto-activate – Globus Connect Personal and Guest Collections use Globus-provided credential – Must auto-activate before any API calls to endpoints
  • 23. Synchronous Tasks • Endpoint search (with scopes) • List directory contents (ls) • Make directory (mkdir) • Rename • Note: – Path encoding & UTF gotchas – Don’t forget to auto-activate first
  • 24. Asynchronous Tasks • Transfer – Sync level option • Delete • Get submission_id, followed by submit – Once and only once submission • Use task id to “follow up”
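The "get submission_id, then submit" pattern is what makes once-and-only-once submission possible: a client that retries after a lost response reuses the same id, so no duplicate task is created. The sketch below illustrates the idea with a hypothetical in-memory class, `FakeTransferService`; it is not the Globus API or SDK, and the real service's responses will differ.

```python
import uuid


class FakeTransferService:
    """Hypothetical in-memory stand-in for a task-submission service."""

    def __init__(self):
        self._tasks_by_submission = {}

    def get_submission_id(self):
        # Step 1: the client asks for an id before submitting anything.
        return str(uuid.uuid4())

    def submit(self, submission_id, document):
        # Step 2: a repeated submission_id maps back to the task already
        # created, so a retry cannot start a second transfer.
        if submission_id not in self._tasks_by_submission:
            self._tasks_by_submission[submission_id] = {
                "task_id": str(uuid.uuid4()),
                "document": document,
            }
        return self._tasks_by_submission[submission_id]["task_id"]

    def task_count(self):
        return len(self._tasks_by_submission)


service = FakeTransferService()
submission_id = service.get_submission_id()
document = {"source": "/share/godata/file3.txt", "destination": "/~/file3.txt"}

first_task_id = service.submit(submission_id, document)
retry_task_id = service.submit(submission_id, document)  # simulated retry

# The task id returned by submit is what a client would use to follow up.
print(first_task_id == retry_task_id, service.task_count())
```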
  • 25. The Globus API / SDK with a Jupyter Notebook in a Jupyter Hub [Diagram: login, token responses ({"tokens": …}), and REST API calls carrying "Bearer a45cd…"]
  • 26. Walkthrough API with our Jupyter Hub • https://jupyter.demo.globus.org – Sign in with Globus – Verify the consents – Start My Server (this will take about a minute) – Open folder: globus-jupyter-notebooks – Run Platform_Introduction_JupyterHub_Auth.ipynb • If you mess it up and want to “go back to the beginning” – Just stop and restart the server • If you want to use the notebook outside of our hub – https://github.com/globus/globus-jupyter-notebooks – Authentication is a manual cut and paste of exchanging the authorization code for an access token – Native App
  • 27. Automation Examples • Simple code examples for various use cases using Globus – https://github.com/globus/automation-examples – Syncing a directory o Bash script that calls the Globus CLI and a Python module that can be run as a script or imported as a module. – Staging data in a shared directory o Bash / Python – Removing directories after files are transferred o Python script
  • 28. Support resources • Globus documentation: docs.globus.org • GitHub: https://github.com/globus • YouTube channel: youtube.com/user/GlobusOnline