But the data are big, distributed…
…and the science is collaborative
petrel.alcf.anl.gov
materialsdatafacility.org
2PB, 80Gbps store
3.2M materials data
Cooley: 290 TFLOPS
(1) Query → (2) Transfer → (3) Learn → (4) Share
Need multi-credential, multi-service authentication and data management
Hub
Configurable HTTP proxy
Authenticator
User DB
Spawner
Notebook
/api/auth
Browser
/hub/
/user/[name]/
• Multi-user hub
• Manages multiple instances
of Jupyter notebook server
• Configurable HTTP proxy
JupyterHub
Goal: Liberate the notebook!
• Tokens for remote services
• APIs for remote actions, e.g. data
management via Globus service
petrel.alcf.anl.gov
Securing JupyterHub with Globus Auth plugin
• Existing OAuth
framework
• Can restrict IdP
• Custom scopes
• Tokens passed into
notebook environment
github.com/jupyterhub/oauthenticator
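The bullets above correspond to a few JupyterHub configuration settings. A minimal sketch, assuming the `GlobusOAuthenticator` class from the oauthenticator package and its `identity_provider` and `scope` options. The helper name `apply_globus_auth` is hypothetical, and option names should be checked against the installed oauthenticator version.

```python
def apply_globus_auth(c, client_id, client_secret, callback_url,
                      identity_provider=None, scopes=None):
    """Populate a JupyterHub config object `c` for Globus Auth login (sketch)."""
    # Use the Globus authenticator shipped with the oauthenticator package.
    c.JupyterHub.authenticator_class = "oauthenticator.globus.GlobusOAuthenticator"
    c.GlobusOAuthenticator.client_id = client_id
    c.GlobusOAuthenticator.client_secret = client_secret
    c.GlobusOAuthenticator.oauth_callback_url = callback_url
    # Optionally restrict logins to a single identity provider (assumed option name).
    if identity_provider:
        c.GlobusOAuthenticator.identity_provider = identity_provider
    # Optionally request custom scopes so the notebook receives matching tokens.
    if scopes:
        c.GlobusOAuthenticator.scope = list(scopes)
    # Persist auth state so tokens can be handed to the spawned notebook server.
    c.Authenticator.enable_auth_state = True
    return c
```

In `jupyterhub_config.py` this would be called as `apply_globus_auth(get_config(), ...)`, for example with a scope such as `urn:globus:auth:scope:transfer.api.globus.org:all`. Persisting auth state also requires an encryption key on the Hub.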
REST APIs
Bearer a45cd...
Hub
Configurable HTTP proxy
Authenticator
User DB
Spawner
Notebook
/api/auth
/hub/
/user/[name]/
login
Browser
{"tokens":...
{"tokens":...
Tokens in Jupyter notebooks
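Inside the notebook, the tokens have to be read back out of the environment. A minimal sketch, assuming the Hub passes them as base64-encoded JSON in an environment variable named `GLOBUS_DATA` with a top-level `tokens` key. Both the variable name and the encoding are assumptions that depend on the oauthenticator version (older releases used a different serialization).

```python
import base64
import json
import os


def load_globus_tokens(env_var="GLOBUS_DATA", environ=None):
    """Return the token dictionary passed into the notebook, or {} if absent."""
    environ = os.environ if environ is None else environ
    raw = environ.get(env_var)
    if raw is None:
        return {}
    # Assumed format: base64-encoded JSON document with a "tokens" entry.
    payload = json.loads(base64.b64decode(raw).decode("utf-8"))
    return payload.get("tokens", {})
```

The returned dictionary is assumed to be keyed by resource server, so a Transfer token would be looked up with something like `tokens["transfer.api.globus.org"]["access_token"]`.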
The world is your
oyster API…
• Globus Transfer
• Globus Search
• Your app
• Data portal
• Analysis engine
• …
Ad hoc data analysis/results distribution
Notebook
Data
Repository
Bearer a45cd…
Dataset
Shared
endpoint
POST '/endpoint/a3c345f.../mkdir'
200 OK
...
X-Transfer-API-Version: 0.10
Content-Type: application/json
...
Analyze
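The mkdir call in the diagram can be sketched with the standard library alone. This builds the request without sending it. The base URL, the `/operation/endpoint/<id>/mkdir` path, and the `DATA_TYPE` body are assumptions based on the Transfer API v0.10 shown above; the function name is hypothetical.

```python
import json
import urllib.request

# Assumed base URL for the Globus Transfer API.
TRANSFER_BASE = "https://transfer.api.globus.org/v0.10"


def build_mkdir_request(endpoint_id, path, access_token, base_url=TRANSFER_BASE):
    """Build (but do not send) a POST request that creates a directory."""
    body = json.dumps({"DATA_TYPE": "mkdir", "path": path}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/operation/endpoint/{endpoint_id}/mkdir",
        data=body,
        method="POST",
        headers={
            # The bearer token is the one passed into the notebook environment.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` would send it. In practice the Globus Python SDK wraps these calls.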
Experiment with the demo notebook
• Log in to our JupyterHub*: jupyter.demo.globus.org
• Launch (spawn) a notebook server; get tokens
• Using the JupyterHub_Integration.ipynb notebook:
– Access Globus APIs; download some data
– “Analyze” data (generate plot)
– PUT results (graph) on an HTTPS endpoint
– Share the URL with others so they can access the results
*zero-to-jupyterhub.readthedocs.io
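The "PUT results" step can be sketched as follows, assuming the HTTPS endpoint accepts a bearer token in the `Authorization` header. The function name and the example host are hypothetical, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request


def build_put_request(https_base_url, remote_path, payload, access_token,
                      content_type="image/png"):
    """Build (but do not send) a PUT request that uploads `payload` bytes."""
    # Percent-encode the path so spaces and similar characters survive in the URL.
    url = https_base_url.rstrip("/") + "/" + urllib.parse.quote(remote_path.lstrip("/"))
    return urllib.request.Request(
        url=url,
        data=payload,
        method="PUT",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": content_type,
        },
    )
```

The request's `full_url` is the link to share. Others can open it only if the endpoint's access rules allow them to read that path.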
UChicago Kasthuri Lab: Brain aging and disease
• Construct connectomes—mapping of neuron connections
• Use APS synchrotron to rapidly image brains
– Beam time available once every few months
– ~20GB/minute for large (cm) unsectioned brains
• Generate segmented datasets/visualizations for the community
• Perform semi-standard reconstruction on all data across HPC
resources
Our (simplistic) data flow thus far…
• Adequate for ad hoc sharing (implicit knowledge)
• Broader access, reuse requires “formalization”
• Leverage additional Globus platform services
Search · Identifier · Describe · Transfer · Auth
Extending the automation flow
• How can we enable more structured/robust data
discovery using Globus platform services?
Create folder
Transfer data
Get metadata
Mint persistent identifier
Catalog
Get credentials
Set ACL
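As one concrete piece of this flow, the "Set ACL" step amounts to posting an access rule document for the shared endpoint. A minimal sketch of building that document, assuming the Transfer API's `access` document fields and its rule that paths are directories beginning and ending with a slash. The function name is hypothetical.

```python
def build_access_rule(identity_id, path, permissions="r"):
    """Return an access rule granting one identity access to a directory."""
    if permissions not in ("r", "rw"):
        raise ValueError("permissions must be 'r' or 'rw'")
    # Assumed constraint: ACL paths are directories with leading and trailing slashes.
    if not (path.startswith("/") and path.endswith("/")):
        raise ValueError("path must begin and end with '/'")
    return {
        "DATA_TYPE": "access",
        "principal_type": "identity",
        "principal": identity_id,
        "path": path,
        "permissions": permissions,
    }
```

The document would be sent as the JSON body of a POST to the endpoint's `access` resource, using the same bearer token as the other Transfer calls.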
Globus Search
• Scalable service → billions of entries
• Schema agnostic: use standard (e.g. DataCite) or custom
metadata
• Fine grained access control: only returns results that are
visible to user
• Plain text search: ranked results
• Faceted search: facilitates data discovery
• Rich query language: ranges, expressions, regex, etc.
docs.globus.org/api/search
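A faceted, range-filtered query like those listed above can be expressed as a JSON document. A minimal sketch, assuming the `q`, `facets`, and `filters` structure of the Globus Search query format. The function name and field names are hypothetical, and the exact schema should be checked against docs.globus.org/api/search.

```python
def build_search_query(text, facet_fields=(), range_filters=None, limit=10):
    """Build a query document combining plain text, facets, and range filters."""
    query = {"q": text, "limit": limit}
    if facet_fields:
        # One "terms" facet per field, to support faceted data discovery.
        query["facets"] = [
            {"name": field, "type": "terms", "field_name": field, "size": 10}
            for field in facet_fields
        ]
    if range_filters:
        # range_filters maps a field name to a (low, high) pair.
        query["filters"] = [
            {"type": "range", "field_name": field,
             "values": [{"from": low, "to": high}]}
            for field, (low, high) in range_filters.items()
        ]
    return query
```

The document would be POSTed to a search index with a bearer token. Results are limited to entries visible to the calling user.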
Other Globus integrations
• Web app development frameworks (Flask, Django)
• Content management systems (WordPress, Drupal)
• Development tools (Confluence, Jira)
• Scalable cyberinfrastructure (Kubernetes)
• Genomics analysis (Galaxy)
– galaxyproject.org/authnz/use/oidc/idps/globus
globus-integration-examples.readthedocs.io
Support resources
• Globus documentation: docs.globus.org
• Sample code: github.com/globus
• Helpdesk and issue escalation: support@globus.org
• Customer engagement team
• Globus professional services team
– Assist with portal/gateway/app architecture and design
– Develop custom applications that leverage the Globus platform
– Advise on customized deployment and integration scenarios
Join the Globus community
• Access the service: globus.org/login
• Create a personal endpoint: globus.org/app/endpoints/create-gcp
• Documentation: docs.globus.org
• Engage: globus.org/mailing-lists
• Subscribe: globus.org/subscriptions
• Need help? support@globus.org
• Follow us: @globusonline