
Introduction to Globus for New Users (GlobusWorld Tour - Columbia University)



Presented at the GlobusWorld Tour workshop at Columbia University, on April 24, 2019.

Published in: Data & Analytics


  1. Introduction to Globus for New Users. Vas Vasiliadis, Columbia University – April 24, 2019
  2. Research data management today (circa 2008). How do we... move, share, describe, discover, reproduce? Index? Facilitate data stewardship?
  3. Globus… bridges data and people within and beyond organizational boundaries
  4. Unified access to data across storage tiers: Research Computing HPC, Desktop Workstations, Mass Storage, Instruments, Personal Resources, Public/Private Cloud, National Resources
  5. Sharing with collaborators, community: Public / private cloud stores; External campus storage; Project repositories, replication stores; Public repositories
  6. Manage data from instruments: Next-Gen Sequencer, MRI, Advanced Light Source, Light Sheet Microscope, Cryo-EM. Destinations: Analysis store; Personal system; Remote visualization; High-durability, low-cost store
  7. Globus SaaS / PaaS: Research data lifecycle. Transfer: (1) Researcher initiates transfer request, or it is requested automatically by a script or science gateway. (2) Globus transfers files reliably, securely. Share: (3) Researcher selects files to share, selects user or group, and sets access permissions. (4) Globus controls access to shared files on existing storage; no need to move files to cloud storage! (5) Collaborator logs in to Globus and accesses shared files; no local account required; download via Globus. Build: (6) The Globus Command Line Interface, API sets, and Python SDK provide a platform… (7) …for building science gateways, portals and publication services. (8) Automating research workflows and ensuring those that need access to the data have it. • Use a Web browser or platform services • Access any storage • Use an existing identity. Diagram labels: Instrument, Compute Facility, Personal Computer
  8. Endpoints (Collections) • Storage abstraction – All transfers happen between two endpoints – Globus Connect instantiates endpoints • Collection ~= Endpoint • Test / Demo Endpoints – Globus Tutorial Endpoint 1 – Globus Tutorial Endpoint 2 – ESnet Read-Only endpoints (test files of various sizes)
  9. Conceptual architecture: Hybrid SaaS • Globus Control Domain – Single, globally accessible multi-tenant service – CONTROL Channel • Subscriber Control Domain – Subscriber owned and administered storage system – Globus “connector” software – DATA Channel between Source Endpoint and Destination Endpoint • No data relay or staging via Globus cloud service
  10. Demonstration: File transfer; Data sharing
  11. Globus accounts and identities • Globus Account = – Primary identity (e.g. …) + – Optional linked identities (e.g. …) • Note: Identity ≠ E-mail Address • Linked identities simplify sharing • Identities/consents managed like any other resource
  12. Groups • What can they be used for? – Sharing: Access permissions for more than one person – Roles: Endpoint management and monitoring • Policy-based membership, management – Group creation and visibility – Member invitation and approval – Optional membership fields – Terms and conditions – Roles for delegating management – Subgroups inherit policies; may be overridden
  13. Demonstration: Linked identities; Group management
  14. Globus for high assurance data management • Restricted data handling: PHI, PII, CUI • Security controls: NIST 800-53, 800-171 Low • Business Associate Agreement (BAA) with UChicago – University of Chicago has a BAA with Amazon • “Equivalent” UK/EU privacy contractual agreements – e.g. to cover Data Processor requirements under GDPR
  15. High Assurance features • Additional authentication assurance – Per storage gateway policy on frequency of authentication with specific identity for access to data – Ensure that the user authenticates, within the session, with the specific identity that gives them access • Application instance isolation – Authentication context is per application, per session • Encryption of user data in transit and Globus data at rest • Detailed audit log (on data transfer nodes)
  16. Demonstration: High Assurance Endpoints
  17. …client software that makes a storage system accessible via Globus
  18. Globus Connect Personal • Rapid installation/removal by non-privileged account • Zero configuration; auto updating • Handles NATs
  19. Demonstration: Globus Connect Personal installation and configuration
  20. …on sustainability
  21. Globus by the numbers • 6,922 active shared endpoints • 70+ petabyte movers • 595 PB moved • 21,306 active personal endpoints • 87 billion files processed • 1,879 active server endpoints • 110+ subscribers • 1 PB largest single transfer to date • 99.9% availability • 617 identity providers • 1,923 most shared endpoints at a single institution • 135,000 users
  22. Thank you to our sponsors... U.S. Department of Energy
  23. …and THANK YOU, subscribers!
  24. Globus sustainability model • Standard Subscription – Sharing, data publication – HTTPS access – Console, usage reporting – Priority support – App integration support • High Assurance subscription – App instance isolation – Additional authentication assurance – Audit logging – NIST 800-53, NIST 800-171 (+ BAA) • Branded Web Site • Premium Storage Connectors • Alternate Identity Provider (InCommon is standard)
  25. Support resources • Globus documentation • Community email list • Helpdesk and issue escalation • Customer engagement team • Globus professional services team – Assist with portal/gateway/app architecture and design – Develop custom applications that leverage the Globus platform – Advise on customized deployment and integration scenarios
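
The transfer-and-share lifecycle on the "Research data lifecycle" and "Endpoints (Collections)" slides can be illustrated with a small toy model. This is only a sketch under stated assumptions: it does not use the Globus CLI, API, or Python SDK, and every name in it (`Collection`, `AccessRule`, `transfer`, `share`, `can_read`, the example paths and identities) is hypothetical and invented for illustration. It shows two ideas from the slides: a transfer always happens between two endpoints, and sharing is a set of access rules on files that stay on existing storage.

```python
from dataclasses import dataclass, field


@dataclass
class AccessRule:
    """One sharing rule (hypothetical model, not a Globus data structure)."""
    principal: str    # an identity or a group name
    path: str         # folder prefix the rule covers, e.g. "/run1/"
    permissions: str  # "r" (read) or "rw" (read/write)


@dataclass
class Collection:
    """Toy stand-in for an endpoint/collection: named storage plus sharing rules."""
    name: str
    files: dict = field(default_factory=dict)  # path -> file contents
    rules: list = field(default_factory=list)  # AccessRule entries

    def share(self, principal, path, permissions="r"):
        """Grant a user or group access to a folder; the files stay where they are."""
        if permissions not in ("r", "rw"):
            raise ValueError("permissions must be 'r' or 'rw'")
        self.rules.append(AccessRule(principal, path, permissions))

    def can_read(self, identities, path):
        """True if any of the caller's identities or groups has a rule covering path."""
        return any(
            rule.principal in identities and path.startswith(rule.path)
            for rule in self.rules
        )


def transfer(source, destination, paths):
    """Copy the named files directly from source to destination.

    Nothing is staged anywhere else, mirroring the "no data relay" point
    on the architecture slide. Returns the list of paths copied.
    """
    copied = []
    for path in paths:
        destination.files[path] = source.files[path]
        copied.append(path)
    return copied


# Example: move instrument data to campus storage, then share it read-only.
instrument = Collection("instrument", {"/run1/scan.dat": b"raw"})
campus = Collection("campus-store")
transfer(instrument, campus, ["/run1/scan.dat"])
campus.share("collaborator@example.org", "/run1/", "r")
print(campus.can_read({"collaborator@example.org"}, "/run1/scan.dat"))  # True
print(campus.can_read({"stranger@example.org"}, "/run1/scan.dat"))      # False
```

Passing a set of identities to `can_read` is a deliberate simplification of the linked-identities and groups slides: a rule matches if it names any identity or group the caller holds.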