Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Globus Integrations (JupyterHub, Django, ...)

38 views

Published on

This tutorial was given at the 2019 GlobusWorld Conference in Chicago, IL by Globus Chief Customer Officer Vas Vasiliadis.

Published in: Technology
  • Be the first to comment

  • Be the first to like this

Globus Integrations (JupyterHub, Django, ...)

  1. 1. Globus Integrations Vas Vasiliadis vas@uchicago.edu May 2, 2019
  2. 2. Enabling large-scale data intensive science with Jupyter 2
  3. 3. Andre Schleife, UIUC 16,000 CPU-hours per simulation Sample'Experimental' sca0ering' Material' composi4on' Simulated' structure' Simulated' sca0ering' La'60%' Sr'40%' Evolu4onary'op4miza4on' 786,432 CPUs, 10 PFLOPS supercomputer Argonne Leadership Computing Facility MDF: Advanced materials research Modeling stopping power with time-dependent density functional theory
  4. 4. @python_app Logan Ward Jupyter notebooks enable rapid iteration/results
  5. 5. But the data are big, distributed… …and the science is collaborative petrel.alcf.anl.gov materialsdatafacility.org 2PB, 80Gbps store 3.2M materials data Cooley: 290 TFLOPS Query1 Share4 Transfer2 Learn3 Need multi-credential, multi-service authentication and data management
  6. 6. Hub Configurable HTTP proxy Authenticator User DB Spawner Notebook /api/auth Browser /hub/ /user/[name]/ • Multi-user hub • Manages multiple instances of Jupyter notebook server • Configurable HTTP proxy JupyterHub Goal: Liberate the notebook! • Tokens for remote services • APIs for remote actions, e.g. data management via Globus service petrel.alcf.anl.gov
  7. 7. Securing JupyterHub with Globus Auth plugin • Existing OAuth framework • Can restrict IdP • Custom scopes • Tokens passed into notebook environment github.com/jupyterhub/oauthenticator
  8. 8. github.com/jupyterhub/oauthenticator#globus-setup Securing JupyterHub with Globus Auth
  9. 9. REST APIs REST APIs REST APIs Bearer a45cd... Hub Configurable HTTP proxy Authenticator User DB Spawner Notebook /api/auth /hub/ /user/[name]/ login Browser {"tokens":... {"tokens":... Tokens in Jupyter notebooks The world is your oyster API… • Globus Transfer • Globus Search • Your app • Data portal • Analysis engine • …
  10. 10. Ad hoc data analysis/results distribution Notebook Data Repository Bearer a45cd… Dataset Shared endpoint POST '/endpoint/a3c345f... /mkdir’ 200 OK ... X-Transfer-API-Version: 0.10 Content-Type: application/json ... Analyze
  11. 11. Experiment with the demo notebook • Login into our JupyterHub*: jupyter.demo.globus.org • Launch (spawn) a notebook server; get tokens • Using the JupyterHub_Integration.ipynb notebook: – Access Globus APIs; download some data – “Analyze” data (generate plot) – PUT results (graph) on an HTTPS endpoint – Share the URL with others so they can access the results *zero-to-jupyterhub.readthedocs.io
  12. 12. Leveraging the next generation of services 12
  13. 13. Our (simplistic) data flow thus far… • Adequate for ad hoc sharing (implicit knowledge) • Broader access, reuse requires “formalization” • Leverage additional Globus platform services Notebook Data Repository Bearer a45cd… Dataset Shared endpoint POST '/endpoint/a3c345f... /mkdir’ 200 OK ... X-Transfer-API-Version: 0.10 Content-Type: application/json ... Analyze
  14. 14. Globus Search • Scalable service à billions of entries • Schema agnostic: use standard (e.g. DataCite) or custom metadata • Fine grained access control: only returns results that are visible to user • Plain text search: ranked results • Faceted search: facilitates data discovery • Rich query language: ranges, expressions, regex, etc. 14 docs.globus.org/api/search
  15. 15. Persistent identifiers • Developing service for issuing persistent identifiers – DOI, ARK, Handle, Globus – e.g. https://identifiers.globus.org/doi:10.1145/2076450.2076468 • Within a namespace, e.g. your DataCite namespace – Control which identities/groups can create identifiers • Identifier attributes: – Link to data: one or more https URLs, to file, folder or manifest – Landing page: provided by service, or by user – Visibility: identities, groups that can see identifier – Checksum: of the file or manifest – Metadata: as required by identifier (e.g., DataCite), extensible – Replaces/replaced-by: for versioning 15
  16. 16. SearchIdentifierDescribeTransferAuth Extending the automation flow • How can we enable more structured/robust data discovery using Globus platform services? Create folder Transfer data Get metadata Mint persistent identifier Catalog Get credentials Set ACL
  17. 17. Other Globus integrations • Web app development frameworks (Flask, Django) • Content management systems (WordPress, Drupal) • Development tools (Confluence, Jira) • Scalable cyberinfrastructure (Kubernetes) • Genomics analysis (Galaxy) – galaxyproject.org/authnz/use/oidc/idps/globus globus-integration-examples.readthedocs.io
  18. 18. Example ALCF Data Discovery Portal https://petreldata.net
  19. 19. Support resources • Globus documentation: docs.globus.org • Sample code: github.com/globus • Helpdesk and issue escalation: support@globus.org • Customer engagement team • Globus professional services team – Assist with portal/gateway/app architecture and design – Develop custom applications that leverage the Globus platform – Advise on customized deployment and integration scenarios
  20. 20. Join the Globus community • Access the service: globus.org/login • Create a personal endpoint: globus.org/app/endpoints/create-gcp • Documentation: docs.globus.org • Engage: globus.org/mailing-lists • Subscribe: globus.org/subscriptions • Need help? support@globus.org • Follow us: @globusonline

×