Successfully reported this slideshow.
Your SlideShare is downloading. ×

Advanced Globus System Administration

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad

Check these out next

1 of 25 Ad

Advanced Globus System Administration

Download to read offline

Building on basic Globus administration skills, we introduce topics such as using multiple data transfer nodes for your endpoint, customizing identity mapping, and using Globus connectors to access non-POSIX filesystems such as iRODS and Amazon S3.

Presented at a workshop at KU Leuven on July 8, 2022.

Building on basic Globus administration skills, we introduce topics such as using multiple data transfer nodes for your endpoint, customizing identity mapping, and using Globus connectors to access non-POSIX filesystems such as iRODS and Amazon S3.

Presented at a workshop at KU Leuven on July 8, 2022.

Advertisement
Advertisement

More Related Content

Similar to Advanced Globus System Administration (20)

More from Globus (20)

Advertisement

Recently uploaded (20)

Advanced Globus System Administration

  1. 1. Advanced Globus System Administration Vas Vasiliadis vas@uchicago.edu 8 July 2022
  2. 2. Agenda • Comparing GCS v4 and v5 comparison • Migrating from GCS v4 to v5 • Multi-DTN deployments • Supporting non-POSIX storage systems • Optimizing (or not!) file transfer performance • Modifying the data channel interface 2
  3. 3. Adding DTNs to your endpoint 14
  4. 4. Recall: GCSv5 deployment key 15
  5. 5. Adding a node requires just two commands $ globus-connect-server node setup $CLIENT_ID --deployment-key THE_KEY $ systemctl restart apache2 Copy the deployment key from the first node (DTN) to every other node Node setup pulls configuration from Globus service Check your DTN cluster status: globus-connect-server node list
  6. 6. Multi-node DTN behavior • Active nodes can receive transfer tasks • Tasks on inactive node will pause until active again • GCS manager assistant – Synchronizes configuration among nodes in the endpoint – Stores encrypted configuration values in Globus service 17
  7. 7. Migrating an endpoint to a new host (DTN) • An endpoints is a logical construct è replace host system without disrupting the endpoint – Avoid replicating configuration data (esp. for guest collections!) – Maintain continuity for custom apps, automation scripts, etc., that use the endpoint UUID • Using GCS’s multi-node configuration, add new node(s) to endpoint and then remove original node(s) • Again, deployment key is required – Export node configuration with node setup --export-node – Import on new DTN using node setup --import-node
  8. 8. Supporting non-POSIX systems • Update your GCS packages • Add the appropriate storage gateway – Non-POSIX systems require add-on connector subscription(s) • Gateway configuration options vary by connector – e.g., specify bucket name(s) for AWS S3 • Collection authentication options vary by connector – e.g., provide user access key and secret key for AWS S3 – Credentials must grant appropriate permissions – Mapped collection may not actually “map” to local user account
  9. 9. Supporting access to AWS S3 (and S3-compatible systems) 20
  10. 10. On performance… 21
  11. 11. Globus transfer is fast …but it depends on… • Data Transfer Node (CPU, RAM, bus, NIC, …) • Network (devices, path quality, latency, …) • Storage (hardware, attach mode, …) • Dataset make-up (file#, size, tree depth, …) – Remember: LoSF == Great sadness • Things people do (one transfer per file …1M files) • …? 22
  12. 12. You should have Great Expectations 23 ESnet EPOC target for all DOE labs Requires at least a 10G connection
  13. 13. Esnet makes magic happen
  14. 14. Legacy Architecture (don’t do this) 10GE Border Router WAN Firewall Enterprise perfSONAR perfSONAR Filesystem (data store) 10GE Portal Server Browsing path Query path Data path Portal server applications: · web server · search · database · authentication · data service
  15. 15. Best practice: ScienceDMZ – you have one! 10GE 10GE 10GE 10GE Border Router WAN Science DMZ Switch/Router Firewall Enterprise perfSONAR perfSONAR 10GE 10GE 10GE 10GE DTN DTN API DTNs (data access governed by portal) DTN DTN perfSONAR Filesystem (data store) 10GE Portal Server Browsing path Query path Portal server applications: · web server · search · database · authentication Data Path Data Transfer Path Portal Query/Browse Path
  16. 16. Science DMZ configuration 27 Source security filters Destination security filters Destination Science DMZ Source Science DMZ Source Border Router Destination Border Router Source Router Destination Router User Organization DATA CONTROL Physical Control Path Logical Control Path Physical Data Path Logical Data Path * Port 443 * Ports 50000- 51000 Data Transfer Node (DTN) Data Transfer Node (DTN) * Please see TCP ports reference: https://docs.globus.org/resource-provider-guide/#open-tcp-ports_section
  17. 17. Globus balances performance with reliability 72.8Gbps
  18. 18. Performance is a pairs sport • Network use parameters: concurrency, parallelism • Maximum, Preferred values for each • Transfer considers source and destination endpoint settings min( max(preferred src, preferred dest), max src, max dest ) • Service limits, e.g. concurrent requests 29
  19. 19. Globus network use parameters • May only be changed on managed endpoints • Modify via the web app: Console à Endpoints tab • Modify via Globus Connect Server CLI – Run globus-connect-server endpoint modify • Strong recommendation: Do not change network use parameters before establishing baseline performance 30
  20. 20. Modifying network use parameters 31
  21. 21. Configuring a “private” data channel • Default: data interface is set to the DTN’s public IP address (see data_interface in /etc/gridftp.d/globus-connect-server • Create /etc/gridftp.d/STORAGE_GATEWAY_ID • Set data_interface PRIVATE_INTERFACE_IP_ADDRESS • Replicate on every DTN (files in /etc/gridftp.d/ are not sync'd between nodes by Globus) 32
  22. 22. Troubleshooting Globus Connect Server 33
  23. 23. Before asking for help… • self-diagnostic can identify many issues – Are services running? GCS manager/assistant, GridFTP server • Connectivity is a common cause – Is the DTN control channel reachable? – Can the DTN establish data channel connection? docs.globus.org/globus-connect-server/v5.4/troubleshooting-guide …and we’re always here for you: support@globus.org 34
  24. 24. When you really need a clean slate… • Proper clean-up—both on your system and in the Globus service—is important! • Execute these commands in the specified order: o globus-connect-server node cleanup (on every DTN) o globus-connect-server endpoint cleanup (on last DTN) • Delete the GCS registration at developers.globus.org • Don’t use the same Client ID for another endpoint!
  25. 25. Resources • GCSv5 Guides: docs.globus.org/globus-connect-server/ • Migration: docs.globus.org/globus-connect- server/migrating-to-v5.4/ • Globus support: support@globus.org 36

×