Introduction to the Globus SaaS (GlobusWorld Tour - STFC)

Presented at the GlobusWorld Tour workshop at the STFC Rutherford Appleton Laboratory on January 10, 2019.

Published in: Data & Analytics

  1. 1. Welcome! Presenters: Vas Vasiliadis Brendan McCollam vas@uchicago.edu bjmc@globus.org STFC – January 10, 2019
  2. 2. Agenda. Morning topics: Introduction to the Globus SaaS (service overview & architecture; demo: a researcher’s perspective; our sustainability model) • Break • Globus Endpoints (installing and configuring GCS; security mechanisms; deployment best practices) • Lunch. Afternoon topics: Introduction to the Globus PaaS (accessing Globus APIs; Globus Transfer API walkthrough) • Automating Research Data Workflows • Break • Facilitating Collaboration with Globus (data sharing best practices/automation; Jupyter + Globus for interactive data science) • Building the Services Ecosystem (deep dive into Globus Auth; securing your apps and services)
  3. 3. Introduction to Globus SaaS Vas Vasiliadis vas@uchicago.edu STFC – January 10, 2019
  4. 4. Research data management today (circa 2008) How do we... ...move, share, describe, discover, reproduce? Index? Facilitate data stewardship
  5. 5. Globus: A Brief History of Time • Oct. 1998 – Globus Toolkit v1.0.0 • Nov. 2010 – Globus Online initial release • Nov. 2013 – Sustainability model launched • Dec. 2016 – 50,000 registered users, 200PB+ moved • Jan. 2018 – Globus Toolkit support EOL • Jan. 2019 – Globus service >50% sustainable • ??? – Globus service becomes fully self-sustaining
  6. 6. Globus… bridges data and people within and beyond organizational boundaries
  7. 7. Unified access to data across storage tiers: Research Computing HPC, Desktop Workstations, Mass Storage, Instruments, Personal Resources, Public/Private Cloud, National Resources
  8. 8. Public / private cloud stores External campus storage Project repositories, replication stores Public repositories Sharing with collaborators, community
  9. 9. Analysis store Next-Gen Sequencer MRI Advanced Light Source Personal system Remote visualization Light Sheet Microscope High-durability, low-cost store Manage data from instruments Cryo-EM
  10. 10. Globus: Core functions (Transfer, Share, Publish, Discover) • Use a Web browser • Access any storage • Use an existing identity. Steps: (1) Researcher initiates transfer request, or it is requested automatically by a script or science gateway. (2) Globus transfers files reliably and securely. (3) Researcher selects files to share, selects user or group, and sets access permissions. (4) Globus controls access to shared files on existing storage; no need to move files to cloud storage! (5) Collaborator logs in to Globus and accesses shared files; no local account required; download via Globus. (6) Researcher assembles data set and describes it using metadata (Dublin Core and domain-specific). (7) Curator reviews and approves; data set published on campus or other system. (8) Peers and collaborators search and discover datasets; transfer and share using Globus. Diagram labels: Instrument, Compute Facility, Personal Computer, Publication Repository.
  11. 11. Endpoints (Collections) • Storage abstraction – All transfers happen between two endpoints – Globus Connect instantiates endpoints • Collection ~= Endpoint • Test / Demo Endpoints – Globus Tutorial Endpoint 1 – Globus Tutorial Endpoint 2 – ESnet Read-Only endpoints (test files of various sizes)
  12. 12. Conceptual architecture: Hybrid SaaS DATA Channel CONTROL Channel Source Endpoint Destination Endpoint Subscriber owned and administered storage system Globus “connector” software No data relay or staging via Globus cloud service Subscriber Control Domain Globus Control Domain Single, globally accessible multi-tenant service
  13. 13. The Globus Web App • Contextual menu, based on type of endpoint/storage • File transfer settings – label – when using the activity monitor it’s nice to see a recognizable name – sync - only transfer new or changed files – delete files on destination that do not exist on source – preserve source file modification times – verify file integrity after transfer – encrypt transfer (for data channel; control channel always encrypted) • Search: Endpoints, Users, Groups
  14. 14. Demonstration File transfer Data Sharing Data Publication
  15. 15. Globus accounts and identities • Globus Account = – Primary identity (e.g. vas@globusid.org) + – Optional linked identities (e.g. vas@uchicago.edu) • Note: Identity ≠ E-mail Address • Linked identities simplify sharing • Identities/consents managed like any other resource
  16. 16. Groups • What can they be used for? – Sharing: Access permissions for more than one person – Roles: Endpoint management and monitoring • Policy based membership, management – Group creation and visibility – Member invitation and approval – Optional membership fields – Terms and conditions – Roles for delegating management – Subgroups inherit policies; may be overridden
  17. 17. Demonstration Linked identities Group management
  18. 18. Globus for high assurance data management • Restricted data handling: PHI, PII, CUI • Security controls: NIST 800-53, 800-171 Low • Business Associate Agreement (BAA) w/UChicago – University of Chicago has a BAA with Amazon • “Equivalent” UK/EU privacy contractual agreements – e.g. to cover Data Processor requirements under GDPR
  19. 19. High Assurance features • Additional authentication assurance – Per storage gateway policy on frequency of authentication with specific identity for access to data – Ensure that user authenticates with the specific identity that gives them access within session • Application instance isolation – Authentication context is per application, per session • Encryption of user data in transit and Globus data at rest • Detailed audit log (on data transfer nodes)
  20. 20. Demonstration High Assurance Endpoints
  21. 21. …client software that makes a storage system accessible via Globus
  22. 22. Globus Connect Personal • Rapid installation/removal by non-privileged account • Zero configuration; auto updating • Handles NATs
  23. 23. Demonstration Globus Connect Personal installation and configuration
  24. 24. …on sustainability
  25. 25. Globus by the numbers: 7,274 active shared endpoints • 70+ petabyte movers • 520 PB moved • 21,306 active personal endpoints • 82 billion files processed • 1,835 active server endpoints • 100+ subscribers • 1 PB largest single transfer to date • 99.9% availability • 569 identity providers • 1,923 most shared endpoints at a single institution • 125,000 users
  26. 26. Thank you to our sponsors... U.S. Department of Energy
  27. 27. …and THANK YOU, subscribers!
  28. 28. Globus sustainability model • Standard Subscription – Sharing, data publication – HTTPS access – Console, usage reporting – Priority support – App integration support • High Assurance subscription – App instance isolation – Additional authentication assurance – Audit logging – NIST 800-53, NIST 800-171 (+ BAA) • Branded Web Site • Premium Storage Connectors • Alternate Identity Provider (InCommon is standard)
  29. 29. Support resources • Globus documentation: docs.globus.org • Community email list: developer-discuss@globus.org • Helpdesk and issue escalation: support@globus.org • Customer engagement team • Globus professional services team – Assist with portal/gateway/app architecture and design – Develop custom applications that leverage the Globus platform – Advise on customized deployment and integration scenarios
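
Note on slide 13 (file transfer settings): the options listed there (label, sync, delete extra files on the destination, preserve modification times, verify integrity, encrypt the data channel) correspond to fields of a transfer request. The sketch below is not from the slides. It builds such a request as a plain Python dict without contacting any service. The field names and sync-level numbers follow my reading of the Globus Transfer API documentation and should be checked at docs.globus.org; the function name `build_transfer_request`, the endpoint IDs, and the paths are hypothetical placeholders. A real submission would also need a submission ID and an authenticated client, which are omitted here.

```python
import json

# Sync levels as I understand them from the Transfer API docs (an assumption):
# copy a file only if it is missing (0), differs in size (1),
# has a newer modification time (2), or differs in checksum (3).
SYNC_LEVELS = {"exists": 0, "size": 1, "mtime": 2, "checksum": 3}


def build_transfer_request(source_endpoint, destination_endpoint, items, *,
                           label=None, sync=None, delete_extra=False,
                           preserve_mtime=False, verify=True, encrypt=False):
    """Hypothetical helper: return a dict shaped like a transfer request."""
    doc = {
        "DATA_TYPE": "transfer",
        "source_endpoint": source_endpoint,
        "destination_endpoint": destination_endpoint,
        "verify_checksum": verify,                 # verify file integrity after transfer
        "preserve_timestamp": preserve_mtime,      # keep source modification times
        "encrypt_data": encrypt,                   # encrypt the data channel
        "delete_destination_extra": delete_extra,  # remove files absent from source
        "DATA": [
            {
                "DATA_TYPE": "transfer_item",
                "source_path": src,
                "destination_path": dst,
                # Treat paths ending in "/" as directories to copy recursively.
                "recursive": src.endswith("/"),
            }
            for src, dst in items
        ],
    }
    if label is not None:
        doc["label"] = label                       # name shown in the activity monitor
    if sync is not None:
        doc["sync_level"] = SYNC_LEVELS[sync]      # only transfer new or changed files
    return doc


# Placeholder endpoint IDs and paths, for illustration only.
request = build_transfer_request(
    "SOURCE-ENDPOINT-UUID",
    "DESTINATION-ENDPOINT-UUID",
    [("/data/run42/", "/archive/run42/")],
    label="run42 archive copy",
    sync="checksum",
    encrypt=True,
)
print(json.dumps(request, indent=2))
```

Leaving `sync` unset omits `sync_level`, so every listed file is copied; setting it restricts the task to new or changed files, as described on the slide.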
