Managing Protected and
Controlled Data with Globus
Vas Vasiliadis
vas@uchicago.edu
Globus SaaS: Research data lifecycle
1. Researcher initiates transfer request; or requested automatically by script, science gateway
2. Globus transfers files reliably, securely
3. Researcher selects files to share, selects user or group, and sets access permissions
4. Globus controls access to shared files on existing storage; no need to move files to cloud storage!
5. Collaborator logs in to Globus and accesses shared files; no local account required; download via Globus
6. Researcher assembles data set; describes it using metadata (Dublin Core and domain-specific)
7. Curator reviews and approves; data set published on campus or other system
8. Peers, collaborators search and discover datasets; transfer and share using Globus
[Diagram labels: Instrument, Compute Facility, Personal Computer, Publication Repository; phases: Transfer, Share, Publish, Discover]
• Use a Web browser
• Access any storage
• Use an existing identity
Globus for high assurance data management
• Restricted data handling: PHI, PII, CUI
• Security controls: NIST 800-53, 800-171 Low
• Business Associate Agreement (BAA) w/UChicago
– University of Chicago has a BAA with Amazon
Compliance focus areas
• Access Control: Least privilege model
• Configuration Management: Change control, impact/risk
• Maintenance: Automation, vulnerability mitigation
• Accountability: Detailed audit trail (protection, forensics)
• Information integrity: Protection, monitoring
Restricted data disclosed to Globus
• Globus does not see file contents …and never did!
• File paths/name can have restricted data, e.g. PHI
• No other elements (endpoint definitions, labels,
collection definitions) can contain restricted data
Initial release scope
• Globus Service: Auth, Transfer, Groups, DNS, Sharing
• New web app (app.globus.org) – try it now!
• Globus Connect Server v5.2
• Globus Connect Personal v3.x
• Globus Command Line Interface (CLI)
Other features/services/products
• Connectors: AWS S3 as priority (future release)
• Platform: Globus Search (future release)
• Out of scope: Globus ID, data publication SaaS,
current web app, GCS v4.x, GCS v5.0 and v5.1, GCP v2.x
• Discontinued: Hosted CLI (as of August 1, 2018)
Out with the old, in with the new
• Host endpoints → Mapped collections
– Need local account to access data
• Shared endpoints → Guest collections
– No local account needed for data access, permissions set in Globus
• Use host endpoint to create shared endpoint → Use storage gateway to create guest collections
• Access via GridFTP → Access via GridFTP or HTTPS
• Initially available via Globus Connect Server v5.2
Conceptual architecture: Mapped collections
[Diagram: a Globus endpoint in the Subscriber Security Domain, the Globus service in the Globus Security Domain, and clients in the External Security Domain (user, web app, data portal, science gateway, …), linked by a CONTROL channel and a DATA channel]
• No data relay or staging via Globus; files move directly between endpoints
• User identity mapped to local account
• Globus service: single, globally accessible multi-tenant service
• Globus endpoint: Globus “client” software
• Storage: subscriber owned and administered storage system
Conceptual architecture: Guest Collections
[Diagram: a guest collection on a Globus endpoint in the Subscriber Security Domain, the Globus service in the Globus Security Domain, and clients in the External Security Domain (user, web app, data portal, science gateway, …), linked by a CONTROL channel and a DATA channel]
• User managed “overlay” permissions stored in Globus service
• Subscriber managed filesystem and endpoint policies
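The overlay idea on this slide can be sketched in a few lines. This is an illustration only, assuming that a guest collection can never grant more than the underlying storage allows the sharing user's local account; the class and method names (`GuestCollection`, `grant`, `effective`) are hypothetical and are not part of any Globus API.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class GuestCollection:
    """Hypothetical model of a guest collection (not the Globus data model)."""
    # Access the subscriber-managed filesystem gives the sharing user's
    # local account, e.g. {"read"} or {"read", "write"}.
    owner_local_access: Set[str]
    # User-managed "overlay" permissions, keyed by Globus identity.
    overlay: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, identity: str, permissions: Set[str]) -> None:
        """Record an overlay permission for an identity."""
        self.overlay.setdefault(identity, set()).update(permissions)

    def effective(self, identity: str) -> Set[str]:
        """Overlay permissions, capped by what the filesystem allows."""
        return self.overlay.get(identity, set()) & self.owner_local_access
```

In this sketch, an identity with no overlay entry gets no access at all, and an overlay grant of write has no effect if the filesystem only permits read.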
Globus Connect Server v5 Milestones
• v5.0: Google Drive
• v5.1: POSIX guest collections, HTTPS
• v5.2: High assurance
• v5.3: …
• v5.x: v4 feature parity+
• Other features
– Multi DTN support
– Additional storage types
– Custom IdPs
– …
High Assurance features
• Additional authentication assurance
– Policy (per storage gateway) on frequency of authentication with
specific identity for access to data
– Enforce user authentication with specific identity within session
• Application instance isolation
– Authentication context is per app, per session
• Encryption of user data in transit and Globus data at rest
• Detailed audit log (on DTN via GCSv5.2)
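A minimal sketch of how the authentication assurance policy above could be evaluated, assuming each storage gateway carries a required identity domain and a timeout, and the session records when each identity last authenticated. The names `GatewayPolicy` and `is_access_allowed` are hypothetical and do not come from Globus Connect Server.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class GatewayPolicy:
    """Hypothetical per-storage-gateway policy (not the GCS data model)."""
    required_domain: str   # e.g. "uchicago.edu"
    timeout_seconds: int   # authentication assurance timeout


def is_access_allowed(policy: GatewayPolicy,
                      session_auth_times: Dict[str, float],
                      now: float) -> bool:
    """True if an identity from the required domain authenticated in this
    session no longer ago than the policy timeout."""
    for identity, auth_time in session_auth_times.items():
        domain = identity.rsplit("@", 1)[-1]
        if domain == policy.required_domain and now - auth_time <= policy.timeout_seconds:
            return True
    return False
```

Under these assumptions, a session that only contains userX@anl.gov is refused by a gateway requiring uchicago.edu, and a uchicago.edu authentication stops counting once the timeout has passed.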
Additional authentication assurance
userX@anl.gov userX@anl.gov
Additional authentication assurance
userX@anl.gov userX@uchicago.edu
Re-authentication timeout
userX@anl.gov userX@uchicago.edu
Application Instance Isolation
userX@uchicago.edu
Authenticated in browser
session (app instance 1)
Re-authentication required in
CLI session (app instance 2)
userX@uchicago.edu
userX@ucmed.org
Async transfer between HA collections
userX@uchicago.edu
Encrypted data channel
Mapped Collection
HA timeout: 2hrs
Mapped Collection
HA timeout: 4hrs
Globus Auth: Foundational IAM service
• Enables login for diverse app ecosystem
• Protects REST API communications between and
among apps and services
• No new identity required
• Based on OAuth2 and OpenID Connect
– Least-privilege security model: scopes/consents
– Access via OAuth2 and OIDC libraries of your choice
– Programming language and framework agnostic
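Because Globus Auth is based on OAuth2 and OpenID Connect, the first step of a login can be sketched with nothing but the standard library. The sketch below builds an authorization-code request URL. The endpoint URL is assumed to be the Globus Auth authorization endpoint and should be confirmed against the Globus Auth documentation; the client ID, redirect URI and state value are placeholders supplied by the caller.

```python
from typing import Iterable
from urllib.parse import urlencode

# Assumed Globus Auth authorization endpoint; verify before use.
AUTHORIZE_ENDPOINT = "https://auth.globus.org/v2/oauth2/authorize"


def build_authorize_url(client_id: str,
                        redirect_uri: str,
                        scopes: Iterable[str],
                        state: str) -> str:
    """Build a standard OAuth2 authorization-code request URL."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",      # authorization code grant
        "scope": " ".join(scopes),    # least privilege: request only what is needed
        "state": state,               # opaque value echoed back to the app
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"
```

Any OAuth2 or OIDC client library would produce an equivalent URL; the point of the sketch is that no Globus-specific code is needed for this step.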
Globus Auth: Identity broker for research apps
Brokers authentication and authorization among…
• End-users
• Identity providers: enterprise, external (e.g. Google)
• Services: resource servers with REST APIs
• Apps: web, mobile, desktop, command line clients
• Services acting as clients to other services
Mission: Provide a platform for developers to easily
access hundreds of IdPs with just a bit of standard code
Sessions: High Assurance for Globus Auth
• Determine which identities in a user’s identity set have
been used to authenticate and when
• Services make access control decisions
• Uses token introspection
• Session context = app instance, device
• Failed operation → app generates specific redirect URL
docs.globus.org/api/auth/sessions
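The session check described above can be sketched as follows. This is an illustration under stated assumptions, not the documented API: the introspection response is assumed to carry a `session_info.authentications` mapping from identity ID to an entry with an `auth_time` (seconds since the epoch), and the re-authentication redirect is assumed to accept a `session_required_identities` query parameter. Both names should be checked against docs.globus.org/api/auth/sessions. The function names are hypothetical.

```python
import time
from typing import Optional, Set
from urllib.parse import urlencode


def recently_authenticated(introspection: dict,
                           max_age_seconds: int,
                           now: Optional[float] = None) -> Set[str]:
    """Identity IDs that authenticated in this session within max_age_seconds.

    Assumes the (hypothetical) response shape
    {"session_info": {"authentications": {identity_id: {"auth_time": ...}}}}.
    """
    now = time.time() if now is None else now
    auths = introspection.get("session_info", {}).get("authentications", {})
    return {
        identity_id
        for identity_id, info in auths.items()
        if now - info["auth_time"] <= max_age_seconds
    }


def reauth_redirect_url(authorize_url: str, required_identity: str) -> str:
    """Append the assumed session parameter to an existing authorize URL."""
    separator = "&" if "?" in authorize_url else "?"
    query = urlencode({"session_required_identities": required_identity})
    return f"{authorize_url}{separator}{query}"
```

A service would call the first function after introspecting the token and, if the required identity is missing from the result, have the app send the user to the URL produced by the second.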
Example user flow: Guest collection
HA
userA@uchicago.edu
User_A@uchospitals.edu
g.user@gmail.com
accmgr@uchospitals.edu
ham@gmail.com
Guest
Collection
(timeout: 4hrs)
[Role:Access Manager]
grants:Read
Example user flow: Guest collection
HA
userA@uchicago.edu
User_A@uchospitals.edu
g.user@gmail.com
accmgr@uchospitals.edu
ham@gmail.com
Guest
Collection
(timeout: 4hrs)
Example user flow: Guest collection
HA
userA@uchicago.edu
User_A@uchospitals.edu
g.user@gmail.com
accmgr@uchospitals.edu
ham@gmail.com
Guest
Collection
(timeout: 4hrs)
Example user flow: Guest collection
HA
userA@uchicago.edu
User_A@uchospitals.edu
g.user@gmail.com
accmgr@uchospitals.edu
ham@gmail.com
Guest
Collection
(timeout: 4hrs)
redirect → UC Medicine
Example user flow: Guest collection
HA
userA@uchicago.edu
User_A@uchospitals.edu
g.user@gmail.com
accmgr@uchospitals.edu
ham@gmail.com
[Permission:Read]
Guest
Collection
(timeout: 4hrs)
Example user flow: Manage Permissions
HA
accmgr@uchospitals.edu
ham@gmail.com
Guest
Collection
(timeout: 4hrs)
userB@uchicago.edu
User_B@uchospitals.edu
grants:Read, Write
Example user flow: Guest collection
HA
accmgr@uchospitals.edu
ham@gmail.com
Guest
Collection
(timeout: 4hrs)
redirect → UC Medicine
userB@uchicago.edu
User_B@uchospitals.edu
Groups accessing HA guest collections
• Policy options
– High assurance – (not) strict
– Authentication assurance timeout
• Additional restrictions
– Invitations can only be issued by
administrator or manager
– Changes to group policies require
specific identity within session/
authentication assurance timeout
– Subgroups inherit HA policy
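The group rules above can be sketched as a small model. It assumes that a subgroup takes the policy of its nearest high assurance ancestor and that only the administrator and manager roles may issue invitations; the `Group` class and `may_invite` function are hypothetical and do not reflect the Globus Groups API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Group:
    """Hypothetical group record (not the Globus Groups data model)."""
    name: str
    high_assurance: bool = False
    timeout_seconds: Optional[int] = None   # authentication assurance timeout
    parent: Optional["Group"] = None

    def effective_policy(self) -> Tuple[bool, Optional[int]]:
        """Walk up the hierarchy; the nearest HA group's policy applies."""
        node: Optional[Group] = self
        while node is not None:
            if node.high_assurance:
                return (True, node.timeout_seconds)
            node = node.parent
        return (False, None)


def may_invite(role: str) -> bool:
    """Only administrators and managers may issue invitations."""
    return role in {"administrator", "manager"}
```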
Example management flows
• Managing High Assurance endpoints requires
authentication with authorized identity, within session
– Endpoint configuration
– Globus Groups used to provide access to high assurance data
– Management Console access (e.g. to review logs)
New Globus Connect Server installation flow
• Install GCSv5.2+ binaries
• Register the endpoint at developers.globus.org
• Add connectors
• Add storage gateways
– Set as high assurance, configure authentication assurance timeout
– Set policy on type of collections supported
• Add mapped collection
– User must log in with identity from configured domain
– Local account determined by removing the domain:
username@example1.org → username is local account
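The mapping rule for mapped collections can be sketched as a short function that strips everything from the @ onward and rejects identities from other domains. The function name and error choices are illustrative assumptions, not GCS behavior.

```python
def map_identity_to_local_account(identity: str, configured_domain: str) -> str:
    """Map username@domain to the local account name 'username'.

    Raises ValueError for malformed identities and PermissionError when the
    identity is not from the configured domain.
    """
    username, separator, domain = identity.rpartition("@")
    if not separator or not username or not domain:
        raise ValueError(f"not a username@domain identity: {identity!r}")
    if domain.lower() != configured_domain.lower():
        raise PermissionError(f"identity is not from {configured_domain}")
    return username
```

For example, with `example1.org` as the configured domain, `username@example1.org` maps to the local account `username`, while an identity from any other domain is refused.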
Audit log on DTN via GCSv5.2
Globus Connect Personal (GCP)
• New version for high assurance data handling
• Allow user to choose an identity for use with the
endpoint
– Using GCP for data access requires that identity be in session
– Guest collections will work as they do with GCS
• Additional logging
Secure operations
• Intrusion detection and prevention
• Performance and health monitoring
• Logging
• Secure remote access, access control
• Uniform configuration management and change control
• Backups and disaster recovery
• AWS best practices for securing operating environment:
VPCs, security groups, IAM best practices
New subscription levels
• High Assurance
– 33% uplift on Standard subscription
and on premium connectors used for
high assurance data
• BAA
– All High Assurance features + BAA
with University of Chicago
– 50% uplift on Standard subscription
and on premium connectors used
under a BAA
• Separate subscription ID issued
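A worked example of the uplift arithmetic, assuming the uplift applies to the Standard subscription fee plus only those premium connectors used for high assurance (or BAA) data. All fee amounts are made-up inputs; the slide gives percentages only, and the function name is hypothetical.

```python
from decimal import Decimal
from typing import Iterable

# Uplift percentages from the slide, as fractions.
UPLIFT = {
    "standard": Decimal("0"),
    "high_assurance": Decimal("0.33"),
    "baa": Decimal("0.50"),
}


def subscription_cost(standard_fee: str,
                      uplifted_connector_fees: Iterable[str],
                      other_connector_fees: Iterable[str],
                      level: str) -> Decimal:
    """Total cost: uplift applies to the standard fee and to connectors
    used for high assurance / BAA data; other connectors are unchanged."""
    uplifted_base = Decimal(standard_fee) + sum(
        (Decimal(fee) for fee in uplifted_connector_fees), Decimal("0"))
    unchanged = sum((Decimal(fee) for fee in other_connector_fees), Decimal("0"))
    total = uplifted_base * (1 + UPLIFT[level]) + unchanged
    return total.quantize(Decimal("0.01"))
```

With a hypothetical standard fee of 10000, one uplifted connector at 2000 and one other connector at 1000, the High Assurance level comes to (10000 + 2000) × 1.33 + 1000 and the BAA level to (10000 + 2000) × 1.50 + 1000.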
Questions?


Editor's Notes

  • #2 July 16th Globus team presentation
  • #19 Access control: identities are provided and managed by the institution. Globus acts as identity broker only and does not access or store any institutional user credentials. The institution controls all access policies (at multiple levels): who can access what data and with what permissions, and who can share what data and with what permissions. All access policies can be changed or revoked at any time.
  • #22 Administrator configurable reauthentication timeout
  • #30 Single sign-on preferred. Want to encourage application development against the Globus service, and to encourage others to use Globus Auth for their own services: not only for apps, but also for service-to-service communication.
  • #31 NOT an identity manager; think of it as an “identity provider proxy”. It gets the user authenticated, obtains consent (so users consent to what tokens are used for), issues tokens, and verifies tokens. Globus Auth is a foundational service for all of these. In some sense it is an IdP, but think of it more as an identity broker. Two protocols: OAuth2/OpenID Connect (web world) and SAML/Shibboleth (universities).
  • #32 OAuth2/OpenID Connect (web world); OpenID Connect is the authentication layer (RESTful/JSON). RA: some concepts to follow, then present use cases for integrating with Auth, with specific solutions using our SDK.
  • #41 Anything that can either access PHI or lead to access to PHI == file names and paths
  • #45 Highlights of some of the Globus security features.