Vas Vasiliadis
vas@uchicago.edu
February 28, 2024
Building Research Applications with Globus PaaS
Globus PaaS accelerates development
• Auth
• Groups
• Transfer
• Compute
• Search
• Timer
• Flows
• GCS Manager
• Globus web app consumes the
same public APIs
• Resources named by URL
(standard REST approach)
• Request/response body is JSON
• Python SDK available; JS coming
docs.globus.org/api
Globus Auth: Foundational IAM service
• Makes app/portal widely and easily accessible
• Brokers authentication and authorization among…
– End-users
– Identity providers: enterprise, external (federated identities)
– Services: resource servers with REST APIs
– Apps: web, mobile, desktop, command line clients
– Services acting as clients to other services
• OAuth 2.0 Authorization Framework (a.k.a. OAuth2)
• OpenID Connect Core 1.0 (a.k.a. OIDC)
5
Fundamental Concepts
• Scopes
– APIs that client is requesting access to
– Service and resources within that service
• Consents
– Authorizes a client to access a service, within limited scope, on
the resource owner's behalf
• Multiple methods for user to grant consent depending
on the type of application
6
Several authentication modes supported
• A portal/science/gateway or other application you host
– Authorization code grant: authenticate as user identity
– Browser redirect; auth code returned automatically; tokens stored securely
• A thick client/script installed and run on the user’s device
– Native app grant: authenticate as user identity
– Auth code returned automatically; tokens stored per installation
• Service account or application credentials for automation
– Client credentials grant: authenticate as application identity
– Client ID and Secret stored securely
• Application able to manage tokens for offline/long lived tasks
– Request refresh tokens in addition to access tokens
Job #1:
App registration
8
Get app credentials at
https://app.globus.org/settings/developers
Managing service accounts/app credentials
• Application Identity:
appclientid@clients.auth.globus.org
• These are confidential apps with client id and secret
• Ensure application is on a secure device
• Set up policy for rotation of secret
• Assign project admins to manage the registration
Globus Groups for authorization
• Grant group manager role to your application (client identity)
– App can add/remove members to grant/remove access
• Use groups to manage your application’s permissions
– Use group membership instead of managing ACLs directly
– Check user’s group membership(s) to determine permissions
globus-sdk-python.readthedocs.io/en/stable/services/groups.html
Guest collections simplify data access
• Users don't need local accounts to access data
• Portal can act as itself when accessing collections
• Grant the application Access Manager role
– Allows the application to manage permissions on the collection
• Grant roles for management of endpoint and tasks
Globus Search: Data description and discovery
• Metadata store with fine-
grained visibility controls
• Schema agnostic dynamic
schemas
• Simple search using URL
query parameters
• Complex search using
search request document
13
docs.globus.org/api/search
Search
Index
Data ingest with Globus Search
14
Search
Index
POST /index/{index_id}/ingest'
{
"ingest_type": "GMetaList",
"ingest_data": {
"gmeta": [
{
"id": "filetype",
"subject”: "https://search.api.globus.org/abc.txt",
"visible_to": ["public"],
"content": {
"metadata-schema/file#type": "file”
}
},
...
]
}
- Bulk create and update
- Task model for ingest at scale
Data ingest with Globus Search
15
Search
Index
POST /index/{index_id}/ingest'
{
"ingest_type": "GMetaList",
"ingest_data": {
"gmeta": [
{
"id": ”weight",
"subject": "https://search.api.globus.org/abc.txt",
"visible_to": ["urn:globus:auth:identity:46bd0f56-
e24f-11e5-a510-131bef46955c"],
"content": {
"metadata-schema/file#size": ”37.6",
"metadata-schema/file#size_human": ”<50lb”
}
},
...
]
}
Visibility limited to Globus Auth identity
- Single user
- Globus Group
- Registered client application
Data discovery with Globus Search
16
{
"@datatype": "GSearchResult",
"@version": "2017-09-01",
"count": 1,
"gmeta": [
{
"@datatype": "GMetaResult",
"@version": "2019-08-27",
"entries": [
{ ... }
],
"subject": "https://..."
}
],
"offset": 0,
"total": 1
}
GET /index/{index_id}/search?q=type%3Ahdf5
Search
Index
Simple query
Data discovery with Globus Search
17
POST /index/{index_id}/search
Search
Index
Complex query
{
"filters": [
{
"type": "range",
"field_name": ”pubdate",
"values": [
{
"from": "*",
"to": "2020-12-31"
}
]
}
],
"facets": [
{
"name": "Publication Date",
"field_name": "pubdate",
...
}
]
}
Filter
Facets
Boosts
Sort
Limit
Experimenting with
Globus services
using notebooks
jupyter.demo.globus.org
Making data
more FAIR with
Globus PaaS and
portal framework
The Modern Research Data
Portal Design Pattern
docs.globus.org/mrdp
Legacy Architecture (don’t do this)
10GE
Border Router
WAN
Firewall
Enterprise
perfSONAR
perfSONAR
Filesystem
(data store)
10GE
Portal
Server
Browsing path
Query path
Data path
Portal server applications:
· web server
· search
· database
· authentication
· data service
Science gateway server applications
Science
gateway
server
Gateway logic (“small” data) and
research data traverse the enterprise
firewall è massive bottleneck
Best practice: ScienceDMZ
10GE
10GE
10GE
10GE
Border Router
WAN
Science DMZ
Switch/Router
Firewall
Enterprise
perfSONAR
perfSONAR
10GE
10GE
10GE
10GE
DTN
DTN
API DTNs
(data access governed
by portal)
DTN
DTN
perfSONAR
Filesystem
(data store)
10GE
Portal
Server
Browsing path
Query path
Portal server applications:
· web server
· search
· database
· authentication
Data Path
Data Transfer Path
Portal Query/Browse Path
Science
gateway
server
Science gateway server applications
Only gateway logic (“small” data)
traverses the enterprise firewall
è fast, clean path for research data
MRDP: Key elements
Science DMZ
Fast, clean data path
Data Transfer Nodes
Purpose-built data movers
Globus Platform
Secure, reliable data
orchestration
Globus Connect
Storage system enabler
23
Globus Portal
Framework
Data discovery and access
Django Globus Portal key features
• Federated login via Globus Auth
• Data export via Globus Transfer
• Browse datasets via Globus Search
• Template-driven search results and landing pages
• Django-based framework with extensible templates
• Bootstrap your project using Cookiecutter
25
Source: github.com/globus/django-globus-portal-framework
Docs: django-globus-portal-framework.readthedocs.io
Portal framework + Compute = Science Gateway
• Discover and select data of interest
• Run analyses on selected data via Globus Compute
– Run on any resource: laptop à supercomputer
• Move (and, optionally, share) analysis results
• Do all this reliably, at scale with Globus Flows
26
Bootstrap your portal, science gateway
27
Resources
• Experiment with APIs: jupyter.demo.globus.org
• Access code samples: github.com/globus
• Leverage Globus professional services team
• Support: support@globus.org
28

Building Research Applications with Globus PaaS

  • 1.
    Vas Vasiliadis vas@uchicago.edu February 28,2024 Building Research Applications with Globus PaaS
  • 2.
    Globus PaaS acceleratesdevelopment • Auth • Groups • Transfer • Compute • Search • Timer • Flows • GCS Manager • Globus web app consumes the same public APIs • Resources named by URL (standard REST approach) • Request/response body is JSON • Python SDK available; JS coming docs.globus.org/api
  • 4.
    Globus Auth: FoundationalIAM service • Makes app/portal widely and easily accessible • Brokers authentication and authorization among… – End-users – Identity providers: enterprise, external (federated identities) – Services: resource servers with REST APIs – Apps: web, mobile, desktop, command line clients – Services acting as clients to other services • OAuth 2.0 Authorization Framework (a.k.a. OAuth2) • OpenID Connect Core 1.0 (a.k.a. OIDC) 5
  • 5.
    Fundamental Concepts • Scopes –APIs that client is requesting access to – Service and resources within that service • Consents – Authorizes a client to access a service, within limited scope, on the resource owner's behalf • Multiple methods for user to grant consent depending on the type of application 6
  • 6.
    Several authentication modessupported • A portal/science/gateway or other application you host – Authorization code grant: authenticate as user identity – Browser redirect; auth code returned automatically; tokens stored securely • A thick client/script installed and run on the user’s device – Native app grant: authenticate as user identity – Auth code returned automatically; tokens stored per installation • Service account or application credentials for automation – Client credentials grant: authenticate as application identity – Client ID and Secret stored securely • Application able to manage tokens for offline/long lived tasks – Request refresh tokens in addition to access tokens
  • 7.
  • 8.
    Get app credentialsat https://app.globus.org/settings/developers
  • 9.
    Managing service accounts/appcredentials • Application Identity: appclientid@clients.auth.globus.org • These are confidential apps with client id and secret • Ensure application is on a secure device • Set up policy for rotation of secret • Assign project admins to manage the registration
  • 10.
    Globus Groups forauthorization • Grant group manager role to your application (client identity) – App can add/remove members to grant/remove access • Use groups to manage your application’s permissions – Use group membership instead of managing ACLs directly – Check user’s group membership(s) to determine permissions globus-sdk-python.readthedocs.io/en/stable/services/groups.html
  • 11.
    Guest collections simplifydata access • Users don't need local accounts to access data • Portal can act as itself when accessing collections • Grant the application Access Manager role – Allows the application to manage permissions on the collection • Grant roles for management of endpoint and tasks
  • 12.
    Globus Search: Datadescription and discovery • Metadata store with fine- grained visibility controls • Schema agnostic dynamic schemas • Simple search using URL query parameters • Complex search using search request document 13 docs.globus.org/api/search Search Index
  • 13.
    Data ingest withGlobus Search 14 Search Index POST /index/{index_id}/ingest' { "ingest_type": "GMetaList", "ingest_data": { "gmeta": [ { "id": "filetype", "subject”: "https://search.api.globus.org/abc.txt", "visible_to": ["public"], "content": { "metadata-schema/file#type": "file” } }, ... ] } - Bulk create and update - Task model for ingest at scale
  • 14.
    Data ingest withGlobus Search 15 Search Index POST /index/{index_id}/ingest' { "ingest_type": "GMetaList", "ingest_data": { "gmeta": [ { "id": ”weight", "subject": "https://search.api.globus.org/abc.txt", "visible_to": ["urn:globus:auth:identity:46bd0f56- e24f-11e5-a510-131bef46955c"], "content": { "metadata-schema/file#size": ”37.6", "metadata-schema/file#size_human": ”<50lb” } }, ... ] } Visibility limited to Globus Auth identity - Single user - Globus Group - Registered client application
  • 15.
    Data discovery withGlobus Search 16 { "@datatype": "GSearchResult", "@version": "2017-09-01", "count": 1, "gmeta": [ { "@datatype": "GMetaResult", "@version": "2019-08-27", "entries": [ { ... } ], "subject": "https://..." } ], "offset": 0, "total": 1 } GET /index/{index_id}/search?q=type%3Ahdf5 Search Index Simple query
  • 16.
    Data discovery withGlobus Search 17 POST /index/{index_id}/search Search Index Complex query { "filters": [ { "type": "range", "field_name": ”pubdate", "values": [ { "from": "*", "to": "2020-12-31" } ] } ], "facets": [ { "name": "Publication Date", "field_name": "pubdate", ... } ] } Filter Facets Boosts Sort Limit
  • 17.
    Experimenting with Globus services usingnotebooks jupyter.demo.globus.org
  • 18.
    Making data more FAIRwith Globus PaaS and portal framework
  • 19.
    The Modern ResearchData Portal Design Pattern docs.globus.org/mrdp
  • 20.
    Legacy Architecture (don’tdo this) 10GE Border Router WAN Firewall Enterprise perfSONAR perfSONAR Filesystem (data store) 10GE Portal Server Browsing path Query path Data path Portal server applications: · web server · search · database · authentication · data service Science gateway server applications Science gateway server Gateway logic (“small” data) and research data traverse the enterprise firewall è massive bottleneck
  • 21.
    Best practice: ScienceDMZ 10GE 10GE 10GE 10GE BorderRouter WAN Science DMZ Switch/Router Firewall Enterprise perfSONAR perfSONAR 10GE 10GE 10GE 10GE DTN DTN API DTNs (data access governed by portal) DTN DTN perfSONAR Filesystem (data store) 10GE Portal Server Browsing path Query path Portal server applications: · web server · search · database · authentication Data Path Data Transfer Path Portal Query/Browse Path Science gateway server Science gateway server applications Only gateway logic (“small” data) traverses the enterprise firewall è fast, clean path for research data
  • 22.
    MRDP: Key elements ScienceDMZ Fast, clean data path Data Transfer Nodes Purpose-built data movers Globus Platform Secure, reliable data orchestration Globus Connect Storage system enabler 23 Globus Portal Framework Data discovery and access
  • 23.
    Django Globus Portalkey features • Federated login via Globus Auth • Data export via Globus Transfer • Browse datasets via Globus Search • Template-driven search results and landing pages • Django-based framework with extensible templates • Bootstrap your project using Cookiecutter 25 Source: github.com/globus/django-globus-portal-framework Docs: django-globus-portal-framework.readthedocs.io
  • 24.
    Portal framework +Compute = Science Gateway • Discover and select data of interest • Run analyses on selected data via Globus Compute – Run on any resource: laptop à supercomputer • Move (and, optionally, share) analysis results • Do all this reliably, at scale with Globus Flows 26
  • 25.
    Bootstrap your portal,science gateway 27
  • 26.
    Resources • Experiment withAPIs: jupyter.demo.globus.org • Access code samples: github.com/globus • Leverage Globus professional services team • Support: support@globus.org 28