2. Data replication
• For backup: initiated by user or system back up
• Automated transfer of data from science instrument
• Replication to a data share
2
Recurring transfers
with sync option
Copy /ingest
Daily @ 3:30am
3. Staging data with compute jobs
• Stage data in or out as part of the job
• Transfer task is submitted when the job is run
– Endpoint may not be currently activated
• Alternative approaches
1. User adds directives to job submission script
2. Application manages data staging on user’s behalf
4. Application driven automation
• Application (e.g. portal, science gateway) submits a
transfer of compute results as the user
• Application monitors transfer, and initiates additional
processing and/or backup of data
6. Globus Auth: Native apps
• Client that cannot keep a secret, e.g…
– Command line, desktop apps
– Mobile apps
– Jupyter notebooks
• Native app is registered with Globus Auth
– Not a confidential client, the task on behalf of an authenticated user
• Native App Grant is used
– Variation on the Authorization Code Grant the the Jupyter Hub used
• Globus SDK:
– To get tokens: NativeAppAuthClient
– To use tokens: AccessTokenAuthorizer
6
7. Browser
Native App grant
7
Native App
(Client)
1. Run
application
2. URL to
authenticate
3. Authenticate and
consent
4. Auth code
5. Register
auth code
6. Exchange
code
7. Access tokens
8. Authenticate with access
tokens to invoke transfer
service as user App/Service
(Resource Server)
Globus Auth
(Authorization Server)
8. Refresh tokens
• Common use cases
– Portal checking transfer status when user is not logged in
– Running command line app from script
o The CLI gets access and refresh tokens upon ”globus login”
• Refresh tokens issued to client, in particular scope
• Client uses refresh token to get access token
– Confidential client: client_id and client_secret required
– Native app: client_secret not required
• Refresh token good for 6 months after last use
• Consent rescindment revokes resource token
8
9. Refresh tokens
9
Native App
(Client)
App/Service
(Resource Server)
Globus Auth
(Authorization Server)
1. Run
application
2. URL to
authenticate
Browser
3. Authenticate and consent
4. Auth code
5. Register
auth code
6. Exchange code,
request refresh tokens
7. Access
tokens and refresh tokens
9. Exchange refresh token
for new access tokens
8. Store refresh tokens
10. Access tokens
11. Authenticate with access
tokens to invoke service as user
10. Native App/Refresh Tokens Sample Code
github.com/globus/native-app-examples
• ./example_copy_paste.py
– User copies and pastes code to the app
• ./example_copy_paste_refresh_token.py
– Stores refresh token locally, uses it to get new access tokens
• See README for installation
10
12. Globus CLI
• It’s a native application distributed by Globus
– https://docs.globus.org/cli/
– https://github.com/globus/globus-cli
• Easy install and updates
• Command “globus login” gets access tokens and refresh
tokens
– Stores the token locally (~/.globus.cfg )
• All interactions with the service use the tokens
– Tokens for Globus Auth and Transfer services
– Just like Vas did in the Platform examples with the API
• Command “globus logout” deletes those
13. CLI Basics
• Getting help / list of commands
– globus –help
– globus list-commands
• UUIDs for endpoint, task, user identity, groups…
– Use search/list options
• get-identities for identity username to UUID
$ globus endpoint search 'Globus Tutorial'
$ globus task list
$ globus get-identities vas@globus.org 2867d9fb-
d5b5-4f21-95e7-312b288b8d11 --verbose
14. The Globus CLI – Simple tasks
• Find endpoints
– globus endpoint search Midway
– globus endpoint search ESNet
– globus endpoint search --filter-scope=recently-used
• Find endpoint contents
– globus ls af7bda53-6d04-11e5-ba46-22000b92c6ec
– globus ls af7bda53-6d04-11e5-ba46-22000b92c6ec:GlobusWorld2020
• Transfer a file
– From ESnet Read-Only Test DTN at CERN to Midway
– Note the specific paths
– globus transfer d8eb36b6-6d04-11e5-ba46-22000b92c6ec:/~/data1/1M.dat af7bda53-6d04-11e5-
ba46-22000b92c6ec:/~/1M.dat
• Transfer a directory
– From Globus Tutorial Endpoint 2 to Midway (create directory and contents)
– globus transfer --recursive ddb59aef-6d04-11e5-ba46-22000b92c6ec:/share/godata/ af7bda53-
6d04-11e5-ba46-22000b92c6ec:/~/syncDemo
• https://docs.globus.org/cli/examples/
15. Batch Transfers
• Transfer tasks have one source/destination, but can have
any number of files
• Provide input source-dest pairs via local file
• e.g. move files listed in files.txt from $ep1 to $ep2
$ ep1=ddb59aef-6d04-11e5-ba46-22000b92c6ec
$ ep2=ddb59af0-6d04-11e5-ba46-22000b92c6ec
$ globus transfer $ep1:/share/godata/ $ep2:/~/ --
batch --label 'CLI Batch' < files.txt
16. Useful submission commands
• Safe resubmissions
– Applies to all tasks (transfer and delete)
– Get a task UUID, use that in submission
– $ globus task generate-submission-id
– --submission-id option in transfer
• Task wait
– useful for scripting conditional on transfer task status
17. Parsing CLI output
• Default output is text; for JSON output use --format json
$ globus endpoint search --filter-scope my-endpoints
$ globus endpoint search --filter-scope my-endpoints --
format json
• Extract specific attributes using --jmespath <expression>
$ globus endpoint search --filter-scope my-endpoints --
jmespath 'DATA[].[id, display_name]'
18. Managing notifications
• Turn off emails sent for tasks
• Useful when an application manages tasks for a user
• Disable notifications with the --notify option
--notify off (all notifications)
--notify succeeded|failed|inactive (select notifications)
19. Permission management
• Set and manage permissions on shared endpoint
• Requires access manager role
$ share=<shared_endpoint_UUID>
$ globus endpoint permission create --permissions r --
identity greg@nawrockinet.com $share:/nawrockipersonal/
$ globus endpoint permission list $share
$ globus endpoint permission delete $share <perm_UUID>
20. Automation with CLI
• A script that uses the CLI to transfer data repeatedly via
task manager/cron
– Interactions are as user: both for data access and to Globus
services
• CLI commands used in the job submission script
– CLI is installed on head node
– User runs ”globus login”, the tokens are stored in user’s home
directory
– Tokens accessible when the job runs and submits stage in or stage
out tasks
– Use the –skip-activation-check to submit the task even if endpoint is
not activated at submit time
21. Automation with portals
• Portal needs to act as the user
• User grants “offline” access to the portal
– Portal gets and stores refresh tokens for each user
– Uses client id/secret + refresh tokens to get new access tokens
– Portal maintains state about transfers being managed (task id)
22. Automation Examples
• Syncing a directory
– Bash script that calls the Globus CLI and a
Python module that can be run as a script or
imported as a module.
• Staging data in a shared directory
– Bash / Python
• Removing directories after files are
transferred
– Python script
• Simple code examples for various use cases
using Globus
– https://github.com/globus/automation-examples
22
23. Support resources
• Globus documentation: docs.globus.org
• Sample code: github.com/globus
• Helpdesk and issue escalation: support@globus.org
• Mailing lists
– https://www.globus.org/mailing-lists
– developer-discuss@globus.org
• Globus professional services team
– Assist with portal/gateway/app architecture and design
– Develop custom applications that leverage the Globus platform
– Advise on customized deployment and integration scenarios
Editor's Notes
Automated (timed) transfers – in the roadmap to be part of the Globus Service at some point, but until then….
Confidential client.
We went through this in the last (Sharing) session.
Do something when I tell you to, and do it as me - Native App
More than just “Do something when I tell you to, and do it as me “ more stuff needs to happen further on down the workflow.
Refresh Tokens - runs independent of the user indefinitely – well 6 months indefinitely.
Let’s review the Native App Grant.
Globus SDK (helper methods) if you want to write your own native app – Jupyter Notebook example
Native App Grant – Review – CLI / Jupyter Notebook (no local secret)
User attempts to access the portal / have the application access the services
Browser redirect
Local site Auth Server prompts for user name and password (if they haven’t already authenticated to Globus) and prompts for consents (the specific things it’s going to use your Globus account for) - “By clicking "Allow", you allow Insert Application Name Here, in accordance with its terms of service and privacy policy, to use the above listed information and services.” -- May have to authenticate to an identity provider.
Auth code is either “passed” or manually ”pasted in” by….
Returning to the application with an authorization code
Exchange the authorization code for
Access token(s)
Use the access token(s) to create a transfer client object and invoke the service.
End result: All calls to the transfer service needs to have the authorization header with the transfer token.
How do we accommodate that third (Application driven automation) use case?
Find endpoints
globus endpoint search Midway
globus endpoint search ESNet
globus endpoint search --filter-scope=recently-used
Find endpoint contents
globus ls af7bda53-6d04-11e5-ba46-22000b92c6ec
globus ls af7bda53-6d04-11e5-ba46-22000b92c6ec:GlobusWorld2020
Transfer a file
From ESnet Read-Only Test DTN at CERN to Midway
Note the specific paths
globus transfer d8eb36b6-6d04-11e5-ba46-22000b92c6ec:/~/data1/1M.dat af7bda53-6d04-11e5-ba46-22000b92c6ec:/~/1M.dat
Transfer a directory
From Globus Tutorial Endpoint 2 to Midway (create directory and contents)
globus transfer --recursive ddb59aef-6d04-11e5-ba46-22000b92c6ec:/share/godata/ af7bda53-6d04-11e5-ba46-22000b92c6ec:/~/syncDemo
export ep1=ddb59aef-6d04-11e5-ba46-22000b92c6ec
export ep2=ddb59af0-6d04-11e5-ba46-22000b92c6ec
$ globus transfer $ep1:/share/godata/ $ep2:/home/nawrocki/ --batch --label "CLI Batch" < files.txt
One task, multiple files
Generates a submission-id that allows for resubmitting a task multiple times while guaranteeing that the actual task will only be carried out once. This is useful for handling the unreliability of networks, lazy script branching, and multiple script threads
Note that the task ID of the task will differ from the submission ID.
export sub_id=$(globus task generate-submission-id)
globus transfer $ep1:/share/godata $ep2:/home/nawrocki --recursive --submission-id $sub_id --label “1st submission”
globus transfer $ep1:/share/godata $ep2:/home/nawrocki --recursive --submission-id $sub_id --label “2nd submission”
globus transfer $ep1:/share/godata $ep2:/home/nawrocki --recursive --submission-id $sub_id --label “3rd submission”
globus endpoint search --filter-scope my-endpoints
globus endpoint search --filter-scope my-endpoints --format json
JMESPath is a query language for JSON
globus endpoint search --filter-scope my-endpoints --jmespath 'DATA[].[id, display_name]’
CLI commands used in the job submission script
Using this method a job submission script can contain specific commands that just use the CLI to be able to stage data in or stay shoot out.
Because tokens are needed, with this method the tokens are on the users home directory and the CLI can use them.
The skip activation check allows the user to submit a transfer without requiring the endpoints be activated. So when the job runs if the endpoints are not activated, it still accepts the task and send the user an email to activate it.
The portal acts as a user, it doesn’t act as an app, but acts on your behalf.
If I want it to check status on my transfer I need to give it access by virture of an access and a refresh token.
MRDP – transfer status – refresh button does exactly this – on behalf of the user - checking status