Successfully reported this slideshow.
Your SlideShare is downloading. ×

Advanced Globus System Administration

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad

Check these out next

1 of 34 Ad

Advanced Globus System Administration

Download to read offline

We discuss and demonstrate various advanced features such as managing multi-DTN endpoints, customized mapping of user identities, public/private deployment configurations, tweaking file transfer performance, and differences/migrating between Globus Connect Server versions 4 and 5.

We discuss and demonstrate various advanced features such as managing multi-DTN endpoints, customized mapping of user identities, public/private deployment configurations, tweaking file transfer performance, and differences/migrating between Globus Connect Server versions 4 and 5.

Advertisement
Advertisement

More Related Content

More from Globus (20)

Advertisement

Advanced Globus System Administration

  1. 1. Advanced Globus System Administration Vas Vasiliadis vas@uchicago.edu September 8, 2022 Sponsored by
  2. 2. Agenda • Multi-DTN deployments • Supporting non-POSIX storage systems • Optimizing (or not!) file transfer performance • Modifying the data channel interface • Custom identity mapping example • Comparing GCS v4 and v5 comparison • Migrating from GCS v4 to v5 2
  3. 3. Adding DTNs to your endpoint 3
  4. 4. Recall: GCSv5 deployment key 4
  5. 5. Adding a node requires just two commands $ globus-connect-server node setup $CLIENT_ID --deployment-key THE_KEY $ systemctl restart apache2 Copy the deployment key from the first node (DTN) to every other node Node setup pulls configuration from Globus service Check your DTN cluster status: globus-connect-server node list
  6. 6. Multi-node DTN behavior • Active nodes can receive transfer tasks • Tasks on inactive node will pause until active again • GCS manager assistant – Synchronizes configuration among nodes in the endpoint – Stores encrypted configuration values in Globus service 6
  7. 7. Migrating an endpoint to a new host (DTN) • An endpoints is a logical construct è replace host system without disrupting the endpoint – Avoid replicating configuration data (esp. for guest collections!) – Maintain continuity for custom apps, automation scripts, etc., that use the endpoint UUID • Using GCS’s multi-node configuration, add new node(s) to endpoint and then remove original node(s) • Again, deployment key is required – Export node configuration with node setup --export-node – Import on new DTN using node setup --import-node
  8. 8. Supporting non-POSIX systems • Update your GCS packages • Add the appropriate storage gateway – Non-POSIX systems require add-on connector subscription(s) • Gateway configuration options vary by connector – e.g., specify bucket name(s) for AWS S3 • Collection authentication options vary by connector – e.g., provide user access key and secret key for AWS S3 – Credentials must grant appropriate permissions – Mapped collection may not actually “map” to local user account
  9. 9. Supporting access to AWS S3 (and S3-compatible systems) 9
  10. 10. On performance… 10
  11. 11. You should have Great Expectations 11 ESnet EPOC target for all DOE labs Requires at least a 10G connection
  12. 12. Globus transfer is fast …but it depends on… • Data Transfer Node (CPU, RAM, bus, NIC, …) • Network (devices, path quality, latency, …) • Storage (hardware, attach mode, …) • Dataset make-up (file#, size, tree depth, …) – Remember: LoSF == Great sadness • Things people do (one transfer per file …1M files) • …? 16
  13. 13. Globus balances performance with reliability 72.8Gbps
  14. 14. Performance is a pairs sport • Network use parameters: concurrency, parallelism • Maximum, Preferred values for each • Transfer considers source and destination endpoint settings min( max(preferred src, preferred dest), max src, max dest ) • Service limits, e.g. concurrent requests 18
  15. 15. Globus network use parameters • May only be changed on managed endpoints • Modify via the web app: Console à Endpoints tab • Modify via Globus Connect Server CLI – Run globus-connect-server endpoint modify • Strong recommendation: Do not change network use parameters before establishing baseline performance 19
  16. 16. Modifying network use parameters 20
  17. 17. Configuring a “private” data channel • Default: data interface is set to the DTN’s public IP address (see data_interface in /etc/gridftp.d/globus-connect-server) • Create /etc/gridftp.d/STORAGE_GATEWAY_ID • Set data_interface PRIVATE_INTERFACE_IP_ADDRESS • Replicate on every DTN (files in /etc/gridftp.d/ are not sync'd between nodes by Globus) 21
  18. 18. Customizing identity mapping • Recall: Globus identity userX@domain.edu maps to local user userX • Customize via mapping expressions or external code • Apply mapping expression(s) to storage gateway configuration • Call external script; be aware of storage gateway type – Can use static map files, database, etc. 22
  19. 19. Simple custom mapping example Note: Requires the storage gateway to accept identities from two domains { "DATA_TYPE": "expression_identity_mapping#1.0.0", "mappings": [ { "source": "{username}", "match": “42032579@wassamottau.edu", "output": "vas", "ignore_case": false, "literal": false }, { "source": "{username}", "match": ".*@uchicago.edu", "output": "{0}", "ignore_case": false, "literal": false } ] } Otherwise, default behavior local user à domain username Map 42032579@wassamottau.edu to local user vas
  20. 20. Troubleshooting Globus Connect Server 24
  21. 21. Before asking for help… • self-diagnostic can identify many issues – Are services running? GCS manager/assistant, GridFTP server • Connectivity is a common cause – Is the DTN control channel reachable? – Can the DTN establish data channel connection? docs.globus.org/globus-connect-server/v5.4/troubleshooting-guide …and we’re always here for you: support@globus.org 25
  22. 22. When you really need a clean slate… • Proper clean-up—both on your system and in the Globus service—is important! • Execute these commands in the specified order: o globus-connect-server node cleanup (on every DTN) o globus-connect-server endpoint cleanup (on last DTN) • Delete the GCS registration at developers.globus.org • Don’t use the same Client ID for another endpoint!
  23. 23. Comparing GCSv4 to GCSv5 27
  24. 24. Out with the old, in with the new • Host endpoints è Mapped collections – Need local account to access data • Shared endpoints è Guest collections – No local account needed for data access, permissions set in Globus • Create shared endpoint(s) on a host endpoint è Create guest collection(s) on a mapped collection
  25. 25. GCSv4 vs GCSv5 Feature GCSv4 GCSv5 Credentials for creating endpoints • Globus ID was required to create endpoint • Use any identity with which you can log into Globus. • Endpoint registered as a resource server with Globus Auth à client id/secret. Policy and data access interfaces • Static endpoint definition supporting one collection • Configuration changes required re-running setup on every DTN • Data access via GridFTP only • Endpoint supports multiple storage systems/collections • Automated management across DTNs • Data access via GridFTP and HTTPS
  26. 26. GCSv4 vs GCSv5 Feature GCSv4 GCSv5 Server credentials • Default certificate issued by Globus CA • Custom certificate supported • Default certificate issued by Let’s Encrypt (in order to enable browser-based downloads) • Custom certificate supported Management interface • Configuration files and tools on the DTN • Some attributes editable via the web app • GCS Manager service on the DTNs • GCS CLI used to interact with the service
  27. 27. GCSv4 vs GCSv5 Feature GCSv4 GCSv5 Endpoint management roles • Administrator, Activity Manager, Activity Monitor supported • Unchanged High Assurance data access • Not supported • Storage gateway can be configured for high assurance data handling, enforcing additional security constraints User authentication for data access • InCommon OAuth, MyProxy or MyProxy OAuth • Any identity used to log into Globus • Custom identity provider via OIDC
  28. 28. GCSv4 vs GCSv5 Feature GCSv4 GCSv5 Mapping to local account • Data access identity independent from Globus account • User certificate DN to local account • ePPN based, GridMap files, custom callout • Storage gateway policy maps Globus authentication information to local account • Expression based matching for values in the authentication context • Custom external program Custom domain name • Not applicable • Custom domain name for collections
  29. 29. Migrating GCSv4 to GCSv5 33
  30. 30. Goals • No user intervention should be required • Recreate all host and guest endpoints • Preserve all relevant configuration • Preserve the UUIDs of the resource • Minimize downtime 34
  31. 31. Migration tools: Approach • Read v4 configuration to create migration plan • Allow edits and changes by administrator to plan • Apply migration plan to a vanilla install of v5 • Test v5 endpoint/collections and validate • Finalize by assigning v4 UUID to the new endpoint 35
  32. 32. Impact • Downtime after final migration step (preserving UUID) • Active transfers cancelled on final step; users notified • Pause rules NOT preserved; must be recreated • Custom applications using v4 host endpoint (with activation) must move to v5 collection (with consent) docs.globus.org/globus-connect-server/migrating-to- v5.4/application-migration 36
  33. 33. Migration tools: Steps 37 docs.globus.org/globus-connect- server/migrating-to-v5.4/migration4-guide/ • Valid for POSIX endpoints with <100 shared endpoints • Coming soon: Migration of non-POSIX endpoints and those with more shared endpoints
  34. 34. Resources • GCSv5 Guides: docs.globus.org/globus-connect-server/ • Migration: docs.globus.org/globus-connect- server/migrating-to-v5.4/ • Globus support: support@globus.org 38

×