3. 3
Globus delivers…
Fast and reliable big data transfer,
sharing, and platform services…
…directly from your own storage
systems…
...via software-as-a-service using
existing identities with the overarching
goal of...
4. 4
Research Computing HPC
Desktop Workstations
Mass Storage Instruments
Personal Resources
Public Cloud
National Resources
Unifying access to data across tiers
9. Globus SaaS / PaaS: Research data lifecycle
Researcher initiates
transfer request; or
requested automatically
by script, science
gateway
1
Instrument
Compute Facility
Globus transfers files
reliably, securely
2
Globus controls
access to shared
files on existing
storage; no need
to move files to
cloud storage!
4
Researcher
selects files to
share, selects
user or group,
and sets access
permissions
3
Collaborator logs in to
Globus and accesses
shared files; no local
account required;
download via Globus
5
Automating research
workflows and
ensuring those that
need access to the
data have it.
8
Personal Computer
Transfer
Share
• Use a Web browser or
platform services
• Access any storage
• Use an existing identity
Build
The Globus
Command Line
Interface, API sets,
and Python SDK
provide a platform…
6
… for building
science gateways,
portals and
publication services.
7
10. Conceptual architecture: Hybrid SaaS
DATA
Channel
CONTROL
Channel
Source
Endpoint
Destination
Endpoint
Subscriber owned
and administered
storage system
Globus
“client” software
No data relay or
staging via Globus
Subscriber
Control
Domain
Globus
Control
Domain
Single, globally accessible
multi-tenant service
User
accessing
the Globus
Web App via
a browser
13. Endpoints (Collections)
• Storage abstraction
– All transfers happen between two endpoints
– Globus Connect instantiates endpoints
• Collection ~= Endpoint
• Test / Demo Endpoints
– Globus Tutorial Endpoint 1
– Globus Tutorial Endpoint 2
– ESnet Test Endpoints
o Contain file samples of various sizes
• Globus Connect Personal
– Now your laptop is an endpoint
– https://www.globus.org/globus-connect-personal
14
14. Globus Connect Personal
• Installers do not require admin access
• Zero configuration; auto updating
• Handles NATs
• Installs in seconds – easy to delete - I’ll prove it!
15. Almost demo time…
• How do I get a Globus account?
– A Globus Account is
o A Primary Identity
o Possible Linked Identities
– Your existing institutional identity may already work
– Linking / Managing Identities
– Consents
• But I don’t have any Endpoints (Collections)!
– Globus Connect Personal
– Globus Tutorial Endpoint 1
– Globus Tutorial Endpoint 2
– ESnet Test Endpoints
o Contain file samples of various sizes
16
16. Demo time!
Identities and
Accounts Transfer
Sharing
Transfer Details
Bookmarks
The Console
The Hamburger
Menu
The Activity Monitor
Groups
Roles
Responsive
Interface
17. Globus Command Line Interface
Open source, uses
Python SDK
docs.globus.org/cli
github.com/globus/
globus-cli
18. Globus Auth API
(Group Management)
…
GlobusTransferAPI
GlobusConnect
Data Discovery
File Sharing
File Transfer & Replication
Globus Platform-as-a-Service
Use existing institutional
ID systems in external
web applications
Integrate file transfer and sharing
capabilities into scientific web
apps, portals, gateways, etc...
19. A bit of Globus history
U . S . D E P A R T M E N T O F
ENERGY
20. Globus sustainability model
• Standard Subscription
– Shared endpoints
– Management console
– Usage reporting
– Priority support
– Application integration
– HTTPS support
– Branded Web Site
• Premium Storage Connectors
• Alternate Identity Provider (InCommon is standard)
21
22. Globus by the numbers...
6764
active shared
endpoints
115
subscribers
887+ PB
moved
23,818
active personal
endpoints
94 billion
files processed
1866
active server
endpoints
80
countries where
Globus is used
2.9 PB
largest single
transfer to date
99.9%
availability
745
identity providers
1960
most shared
endpoints
at a single
institution 112,105
total users
23. Manage Protected Data
24
Higher assurance levels for HIPAA and other regulated data
• Support for protected data
such as health related
information
• Share data with collaborators
while meeting compliance
requirements
• Includes BAA option
24. Globus for high assurance data management
• Restricted data handling
– PII (Personally identifiable information)
– Controlled Unclassified Information
– PHI (Protected Health Information)
• University of Chicago security controls
– NIST 800-53 Low
– Superset of 800-171 Low
• Business Associate Agreements (BAA) will be between
University of Chicago and our subscribers
– University of Chicago has a BAA with Amazon
25. High Assurance features
• Additional authentication assurance
– Per storage gateway policy on frequency of authentication with
specific identity for access to data (timeout)
– Ensure that user authenticates with the specific identity that
gives them access within session (decoupling linked identities)
• Session/device isolation
– Authentication context is per application, per session (~browser
session)
• Enforces encryption of all user data in transit
• Audit logging
26. Support resources
• Globus documentation: docs.globus.org
• Helpdesk and issue escalation: support@globus.org
• Mailing Lists
– https://www.globus.org/mailing-lists
• Customer engagement team
• Globus professional services team
– Assist with portal/gateway/app architecture and design
– Develop custom applications that leverage the Globus platform
– Advise on customized deployment and integration scenarios
27. Globus on your Campus
• Webinars
• Programs
– Helping you evangelize Globus within your
institution.
• Professional Services
• Globus World Tour
– Taking the show on the road.
28
Editor's Notes
We’ve all seen this. Some of us have been the cause of this.
Actually fairly effective. That is until we need to…. <click>
Big is crossed out because:
One man’s ceiling is another man’s floor.
Why just big data? You don’t need convenience and reliability for ALL data?
Unify – Where / Who / What / How
Where
Who
What
How
Transfer – More data quickly, securely and reliably over optimized networks. Set and forget.
Share – Transfer data across security enclaves between different “accounts”.
No ”local account” use existing institutional credentials.
Point out no tertiary storage (Dropbox) and the cost and security concerns associated with them.
Pretty powerful stuff using just a web browser.
The Globus service is a controller
No data passes through the Globus Service
Fire and forget control – The Service GUI (web page) can go away
Globus abstracts storage systems in a quanta called an ”endpoint”
Storage system complexities are masked or abstracted
Transfers between disparate storage systems is natural
This is a simple transfer case – a single user has permissions on both source and destination filesystems.
The Globus service is a controller
No data passes through the Globus Service
Fire and forget control – The Service GUI (web page) can go away
Globus abstracts storage systems in a quanta called an ”endpoint”
Storage system complexities are masked or abstracted
Transfers between disparate storage systems is natural
This is a simple transfer case – a single user has permissions on both source and destination filesystems.
Storage system complexities such as permissions are abstracted as well.
Endpoint definition
Endpoints you can use right now
GCP – Your very own endpoint, no DTN running Globus Connect Server needed
We will demo this in a minute
Delete old version – don’t forget to blow away the old endpoint
Install new version
Set a directory
Talk about Globus Plus and the shareable check box
Endpoint definition
Endpoints you can use right now
GCP – Your very own endpoint, no DTN running Globus Connect Server needed
We will demo this in a minute
Written in Python and open source – go get it an hack it up if you like.
Highly scriptable – integrate it into you bash scripts for easy RDM automation. Rachana will cover.
HOW
The fundamentals of the practical Globus research data management tools you’ve seen in our web app (File transfer, file sharing, data publication) are available to you through our set of APIs.
<click>
In this talk I’m going to concentrate more on the implementation of the transfer API set, that give you the ability to Integrate file transfer and sharing capabilities into scientific web apps, portals, gateways, etc.
<click>
Rachana and Steve are going to dig into Globus Auth in far more depth than I will, but I’ll touch on Globus Auth a bit in the context of the demos I’m going to do. You can’t just use the Transfer APIs at will, you have to prove who you are first and that you have the authority to do the things that you want to do.
Source:
1960 most shared endpoints at 1 institution NIH, per Vas as of Bio-IT19
686 PB transferred website as of 8-16-19 (actual 686,442,707 per usage report that day)
94 billion files processed Daily Usage Report 8-9-19 (actual 93,914)
1866 active server endpts Globus Metrics Yearly report 8-5-19
115 subscribers per Greg 8-9-19
112,105 users Daily Usage Report 8-9-19
80 countries where we are used Greg data Aug 13 2019 – see my online stats sheet: https://docs.google.com/spreadsheets/d/1G3_ABgpDveicRuLWTmygZsuVcsTuYtRm2AhGdASP51I/edit#gid=1198715507
23,818 active personal endpoints (average in any given year) Globus Metrics Yearly report 8-5-19
745 identity providers per Mattias 8/12/19
2.9 PB largest single transfer Katrin Heitmann June 2019
6764 active shared endpoints Globus Metrics Yearly report 8-5-19
99.9% availability correct per Stu data 8-9-19
Where did our efforts stem from -HIPAA
But HIPAA is not a “thing” or a set of standards. These are the standards an conformities we set to adhere to in our design of the product.
So what do you get.
Sounds simple right? – From a subscriber / user perspective that is by design.
“How hard can it be”