More Related Content

Slideshows for you(20)

Similar to Globus Integrations (CHPC 2019 - South Africa)(20)


Globus Integrations (CHPC 2019 - South Africa)

  1. Globus Integrations Vas Vasiliadis CHPC National Conference December 5, 2019
  2. JupyterHub 2
  3. Enabling large-scale data intensive science with Jupyter 3
  4. Andre Schleife, UIUC 16,000 CPU-hours per simulation Sample'Experimental' sca0ering' Material' composi4on' Simulated' structure' Simulated' sca0ering' La'60%' Sr'40%' Evolu4onary'op4miza4on' 786,432 CPUs, 10 PFLOPS supercomputer Argonne Leadership Computing Facility MDF: Advanced materials research Modeling stopping power with time-dependent density functional theory
  5. @python_app Logan Ward Jupyter notebooks enable rapid iteration/results
  6. But the data are big, distributed… …and the science is collaborative 2PB, 80Gbps store 3.2M materials data Cooley: 290 TFLOPS Query1 Share4 Transfer2 Learn3 Need multi-credential, multi-service authentication and data management
  7. Hub Configurable HTTP proxy Authenticator User DB Spawner Notebook /api/auth Browser /hub/ /user/[name]/ • Multi-user hub • Manages multiple instances of Jupyter notebook server • Configurable HTTP proxy JupyterHub Goal: Liberate the notebook! • Tokens for remote services • APIs for remote actions, e.g. data management via Globus service
  8. Securing JupyterHub with Globus Auth plugin • Existing OAuth framework • Can restrict IdP • Custom scopes • Tokens passed into notebook environment
  9. Securing JupyterHub with Globus Auth
  10. REST APIs REST APIs REST APIs Bearer a45cd... Hub Configurable HTTP proxy Authenticator User DB Spawner Notebook /api/auth /hub/ /user/[name]/ login Browser {"tokens":... {"tokens":... Tokens in Jupyter notebooks The world is your oyster API… • Globus Transfer • Globus Search • Your app • Data portal • Analysis engine • …
  11. Ad hoc data analysis/results distribution Notebook Data Repository Bearer a45cd… Dataset Shared endpoint POST '/endpoint/a3c345f... /mkdir’ 200 OK ... X-Transfer-API-Version: 0.10 Content-Type: application/json ... Analyze
  12. Experiment with the demo notebook • Login into our JupyterHub*: • Launch (spawn) a notebook server; get tokens • Using the JupyterHub_Integration.ipynb notebook: – Access Globus APIs; download some data – “Analyze” data (generate plot) – PUT results (graph) on an HTTPS endpoint – Share the URL with others so they can access the results *
  13. Leveraging the next generation of services 13
  14. UChicago Kasthuri Lab: Brain aging and disease • Construct connectomes—mapping of neuron connections • Use APS synchrotron to rapidly image brains – Beam time available once every few months – ~20GB/minute for large (cm) unsectioned brains • Generate segmented datasets/visualizations for the community • Perform semi-standard reconstruction on all data across HPC resources
  15. Advanced Photon Source Argonne Leadership Computing Facility 1 km 5μsec 15
  16. Argonne JLSEUChicago Argonne Leadership Computing FacilityAPS Publication7 Building the connectome Imaging1 Lab Server 1 Acquisition2 Lab Server 2 Pre-processing3 Preview/Center4 Reconstruction6Visualization8 User validation5 Science!9 Neuroanatomy reconstruction pipeline
  17. Our (simplistic) data flow thus far… • Adequate for ad hoc sharing (implicit knowledge) • Broader access, reuse requires “formalization” • Leverage additional Globus platform services Notebook Data Repository Bearer a45cd… Dataset Shared endpoint POST '/endpoint/a3c345f... /mkdir’ 200 OK ... X-Transfer-API-Version: 0.10 Content-Type: application/json ... Analyze
  18. SearchIdentifierDescribeTransferAuth Extending the automation flow • How can we enable more structured/robust data discovery using Globus platform services? Create folder Transfer data Get metadata Mint persistent identifier Catalog Get credentials Set ACL
  19. Globus Search • Scalable service à billions of entries • Schema agnostic: use standard (e.g. DataCite) or custom metadata • Fine grained access control: only returns results that are visible to user • Plain text search: ranked results • Faceted search: facilitates data discovery • Rich query language: ranges, expressions, regex, etc. 19
  20. Other Globus integrations • Web app development frameworks (Flask, Django) • Content management systems (WordPress, Drupal) • Development tools (Confluence, Jira) • Scalable cyberinfrastructure (Kubernetes) • Genomics analysis (Galaxy) –
  21. ALCFPortal Project2 Project1 Globus Search 22Petrel DTNs: GCS+HTTPS Index 2 Custom page templates (display) Custom views (logic) Custom page templates (display) Custom views (logic) Index 1 Data 1 Data 2 users ALCF Project Portals Custom CLIs
  22. Example ALCF Data Discovery Portal
  23. Support resources • Globus documentation: • Sample code: • Helpdesk and issue escalation: • Customer engagement team • Globus professional services team – Assist with portal/gateway/app architecture and design – Develop custom applications that leverage the Globus platform – Advise on customized deployment and integration scenarios
  24. Join the Globus community • Access the service: • Create a personal endpoint: • Documentation: • Engage: • Subscribe: • Need help? • Follow us: @globusonline