GlobusWorld 2021 Tutorial: Globus for System Administrators

An overview of installing and configuring a Globus endpoint on your storage system. This tutorial was presented at the GlobusWorld 2021 conference in Chicago, IL by Vas Vasiliadis.

  1. Globus for System Administrators. Vas Vasiliadis, GlobusWorld, May 13, 2021
  2. Globus Connect Server
     • Makes your storage accessible via Globus
     • Software/tools installed and managed by sysadmin
     • Default access for all local accounts
     • Native packaging for Linux: DEB, RPM
     [Diagram labels: Local system users; Local Storage System (HPC cluster, NAS, …); Globus Connect Server; DTN]
  3. Creating a Globus endpoint: Globus Connect Server v5 (GCSv5) should be used for all new endpoint installations
  4. GCSv5 resources – please consult these first
     • Quickstart Guide
     • GCS Command Line Reference
     • Video walkthrough of an installation
  5. Globus Connect Server v5 Architecture
  6. GCSv5 installation summary
     1. Register a Globus Connect Server with Globus Auth
     2. Install GCS packages on data transfer node (DTN)
     3. Set up the endpoint and add node(s)
     4. Create a POSIX storage gateway
     5. Create a mapped collection
     6. Optional: Associate endpoint with a subscription
     7. Optional: Create a guest collection*
     8. Optional: Add other storage systems to the endpoint
     * requires a paid subscription
  7. GCS management conceptual architecture
     Step 1: Register a Globus Connect Server (get GCS client ID, secret)
     [Diagram labels: Data Transfer Node; GCS Command Line Interface; GCS Manager; GridFTP Server; Globus Auth Service; Globus Transfer Service; GCS management requests; authorize request using client ID/secret; define Globus Transfer resources (gateways, collections, …); GCS Manager endpoint:]
  8. Step 2: Install Globus Connect Server v5 packages
     # Download and install the Globus repository package
     $ curl -LOs toolkit-repo_latest_all.deb
     $ dpkg -i globus-toolkit-repo_latest_all.deb
     # Enable the stable repositories by uncommenting their "deb" lines
     $ sed -i /etc/apt/sources.list.d/globus-toolkit-6-stable*.list \
     >   -e 's/^# deb /deb /'
     $ sed -i /etc/apt/sources.list.d/globus-connect-server-stable*.list \
     >   -e 's/^# deb /deb /'
     # Trust the Globus signing key, then install GCS
     $ apt-key add /usr/share/globus-toolkit-repo/RPM-GPG-KEY-Globus
     $ apt-get update
     $ apt-get --assume-yes install globus-connect-server54
  9. Step 3: Set up endpoint and add node
     $ globus-connect-server endpoint setup \
     >   "Endpoint display name" \
     >   --organization "University of Chicago" \
     >   --client-id 4321dddd-af72-4c4b-9533-a0f4055dd321 \
     >   --owner
     $ globus-connect-server node setup \
     >   --client-id 4321dddd-af72-4c4b-9533-a0f4055dd321
     Note: The endpoint setup command generates deployment-key.json. Use this file when setting up additional data transfer nodes.
  10. Our setup so far
      • Run globus-connect-server node setup to set up additional data transfer nodes
      • Copy deployment-key.json from the original DTN
  11. Step 4: Create a storage gateway
      $ globus-connect-server storage-gateway create posix \
      >   "Gateway Display Name" \
      >   --domain \
      >   --authentication-timeout-mins 60
  12. Common Storage Gateway configuration options
      • Allowed identity domain(s)
      • Storage system path restrictions
      • Local user restrictions
      • Identity mapping
      • High assurance* setting (and associated timeout)
      * requires a paid subscription
  13. Mapping identities to local accounts
      • Configure on storage gateway
      • Default: Strip identity domain (everything after “@”)
        – Identity “” → local user “userx”
      • Alternative: Expressions specified in JSON document
        – Use --identity-mapping option on storage-gateway commands
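      The default mapping on this slide can be sketched in shell. This is an illustration of the rule only, not GCS code: `map_identity` is a hypothetical helper name, and `userx@example.edu` is a made-up identity chosen for the example.

```shell
#!/bin/sh
# Illustrative sketch of the default identity mapping: strip the identity
# domain (the "@" and everything after it) to get the local account name.
# map_identity is a hypothetical helper, not part of Globus Connect Server.
map_identity() {
    identity="$1"
    # ${var%%@*} removes the longest suffix that starts at the first "@".
    printf '%s\n' "${identity%%@*}"
}

# Example with a made-up identity; prints the local user name.
map_identity "userx@example.edu"
```

      Anything more involved than dropping the domain is what the JSON expression document passed with --identity-mapping is for.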
  14. Our setup so far…
  15. Step 5: Create a mapped collection
      $ globus-connect-server collection create \
      >   f77ff456-1f18-41d3-94a7-f3fd8858ea4d \
      >   "/home/$USER" \
      >   "Collection Display Name"
      Note: Collections are rooted at the specified base path. Specifying "/home/$USER" as the base path sets the collection root to the local user’s home directory, as was the default in GCSv4.
  16. Common Collection configuration options
      • Allow guest collections* → enables sharing
      • Sharing restrictions: paths, users, groups
      • HTTPS access*
      * requires a paid subscription
  17. Our setup so far…
  18. Alternative authentication flow (if not using Globus trusted IdP)
  19. Step 6: Associate endpoint with a subscription
      $ globus-connect-server endpoint set-subscription-id DEFAULT
      Note:
      • Must be run using an identity that is a subscription manager
      • Replace DEFAULT with the subscription ID if the identity is associated with multiple subscriptions
      • Can also be set via the web app Endpoints page (search for endpoint name)
  20. Be identity-, role-, and permission-aware
      • Default: Only endpoint owner can configure endpoint
      • Delegate administrator role to other sysadmins
        – Best practice: Delegate to a Globus group, not individuals
      • Check identity using the session command
      • Check resource permissions on storage gateways and collections with --include-private-policies option
  21. Step 7: Create a guest collection
      • Created by user, not endpoint admin
      • Root is relative to mapped collection base path
  22. Step 8: Add other storage systems to the endpoint
      • Update your GCS packages
      • Add storage gateway
      • Non-POSIX systems require premium connector
      • Gateway configuration options vary by connector
  23. On performance…
  24. Globus is performant: 72.8 Gbps
  25. Balance: performance vs. reliability
      • Network use parameters: concurrency, parallelism
      • Maximum and preferred values for each
      • Transfer considers source and destination endpoint settings:
        min( max(preferred src, preferred dest), max src, max dest )
      • Service limits, e.g. concurrent requests
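      As a worked example of the min/max expression on this slide, the sketch below evaluates it for one parameter (say, concurrency). The function name `effective_value` and the sample numbers are assumptions made for illustration; they are not Globus defaults or part of any Globus tool.

```shell
#!/bin/sh
# Illustrative sketch of the slide's formula for one network use parameter:
#   min( max(preferred src, preferred dest), max src, max dest )
# effective_value is a hypothetical helper; arguments are positive integers.
# Usage: effective_value PREF_SRC PREF_DST MAX_SRC MAX_DST
effective_value() {
    pref_src="$1"; pref_dst="$2"; max_src="$3"; max_dst="$4"

    # Start from the larger of the two preferred values.
    value="$pref_src"
    if [ "$pref_dst" -gt "$value" ]; then value="$pref_dst"; fi

    # Cap the result at each endpoint's maximum.
    if [ "$max_src" -lt "$value" ]; then value="$max_src"; fi
    if [ "$max_dst" -lt "$value" ]; then value="$max_dst"; fi

    printf '%s\n' "$value"
}

# Made-up numbers: preferred 4 (src) and 8 (dest), maximum 16 (src) and 6 (dest).
# max(4, 8) = 8, then min(8, 16, 6) = 6.
effective_value 4 8 16 6
```

      With these numbers the destination's maximum is the binding limit, so raising the preferred value on either side would not change the outcome.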
  26. Performance (and security) requires planning
  27. Legacy Architecture
      [Diagram labels: WAN; 10GE Border Router; Firewall; Enterprise; perfSONAR; Filesystem (data store); 10GE Portal Server; Browsing path; Query path; Data path]
      Portal server applications: web server, search, database, authentication, data service
  28. Current best practice using a Science DMZ
      [Diagram labels: WAN; 10GE Border Router; Science DMZ Switch/Router; Firewall; Enterprise; perfSONAR; DTNs; API DTNs (data access governed by portal); Filesystem (data store); 10GE Portal Server; Browsing path; Query path; Data Path; Data Transfer Path; Portal Query/Browse Path]
      Portal server applications: web server, search, database, authentication
  29. Science DMZ configuration
      [Diagram labels: User Organization; Source Router; Source Border Router; Source security filters; Source Science DMZ; Data Transfer Node (DTN); Destination Science DMZ; Destination security filters; Destination Border Router; Destination Router; Data Transfer Node (DTN); Physical Control Path; Logical Control Path; Physical Data Path; Logical Data Path]
      • CONTROL: ports 443, 2811, 7512*
      • DATA: ports 50000-51000*
      * Please see TCP ports reference:
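      When reviewing firewall rules against this slide, it can help to classify a port by the channel it serves. The sketch below hard-codes the port numbers shown on the slide and assumes that 443, 2811, and 7512 belong to the control path, that 50000-51000 is the data range, and that the range is inclusive. `port_role` is a hypothetical helper, and the TCP ports reference remains the authority for a given GCS version.

```shell
#!/bin/sh
# Illustrative sketch: classify a TCP port using the numbers on the slide.
# Assumptions: 443, 2811, 7512 are control-channel ports; 50000-51000
# (inclusive) is the data-channel range. port_role is a hypothetical helper.
port_role() {
    port="$1"
    case "$port" in
        443|2811|7512)
            echo "control"
            ;;
        *)
            if [ "$port" -ge 50000 ] && [ "$port" -le 51000 ]; then
                echo "data"
            else
                echo "other"
            fi
            ;;
    esac
}

# Print the role of a few sample ports.
for p in 443 2811 50500 22; do
    printf '%s: %s\n' "$p" "$(port_role "$p")"
done
```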
  30. Resources
      • Access the service:
      • Documentation:
      • Engage:
      • Subscribe:
      • Need help?
      • Follow us: @globus