Publishing and Serving Machine Learning Models with DLHub

This presentation was given at the PEARC 2019 conference in Chicago by Zhuozhao Li, Globus Labs Postdoc.

  1. Publishing and Serving Machine Learning Models with DLHub
     Ryan Chard, Logan Ward, Zhuozhao Li, Yadu Babuji, Anna Woodard, Steven Tuecke, Kyle Chard, Ben Blaiszik, and Ian Foster
     PEARC 2019
     https://www.dlhub.org
  2. Overview
     - Scientific ML
     - DLHub
     - How it works
     - Use cases
     - Summary
  3. Scientific ML
     - Where are the model and trained weights?
     - How do I run the model on my data?
     - How do I scale my model to run on my cluster?
     - Should I run the model on my data?
     - How do I share my model with the community?
     - How can I build on this work?
     - How can I create pipelines composed of many models?
  4. Scientific ML
     Unique scientific requirements:
     - Publication, citation, reuse
     - Reproducibility
     - Research infrastructure
     - Scalability
     - Low latency
     - Research ecosystem
     - Workflows
     Need general solutions to support vanguard model types, implementations, dependencies, runtimes, data, and infrastructures
  5. Data and Learning Hub for Science
     • Collect, publish, categorize models from many disciplines (materials science, physics, chemistry, genomics, etc.)
     • Serve model inference on-demand via API to simplify sharing, consumption, and access
     • Enable new science through reuse, real-time model-in-the-loop integration, and synthesis & ensembling of existing models
     https://github.com/DLHub-Argonne
  6. DLHub Model Repository
     • Register model metadata, weights, and files to improve discoverability and reusability
     • Containerize model to enhance interoperability
     • Identify model with a permanent identifier (e.g., DOI, minid, etc.)
     • Version model and data pre/post processing steps
     [Diagram labels: Collect Data, Train Model, Register Model, User, DLHub SDK, local]
  7. DLHub Model Serving
     • Servables are self-contained images
     • Deploy servables for on-demand inference
     • Scale deployments based on load
     • Inference can be performed via SDK, CLI, and REST requests
     [Diagram labels: Collect Data, Receive Pred. Properties, Send Compositions, Call DLHub, User, Find Model]
  8. DLHub Servables
     - Self-contained images
       - Embed model architecture, weights, and dependencies
       - Supports almost any model type and implementation
       - repo2docker builds servables with almost arbitrary dependencies (apt, pip, R, etc.)
     - Servables include DLHub SDK as shim for loading and interacting with models
       - Recognizes model type from metadata and loads appropriately
       - Facilitates secure data staging for servable to directly download data on users’ behalf
     - Deploy servables for scalable inference
       - Kubernetes pods
       - docker2singularity for HPC
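
The shim behaviour described on this slide (recognize the model type from metadata, then load accordingly) can be pictured as a small dispatch table. The following is only a sketch under stated assumptions: the `model_type` key, the loader names, and the `KNOWN_FUNCTIONS` table are hypothetical and are not taken from the DLHub SDK.

```python
from typing import Any, Callable, Dict

# Hypothetical registry: metadata "model_type" string -> loader function.
LOADERS: Dict[str, Callable[[dict], Callable[[Any], Any]]] = {}

# Stand-in for code shipped inside the servable image (assumption).
KNOWN_FUNCTIONS = {"double": lambda x: 2 * x}


def register_loader(model_type: str):
    """Decorator that records a loader under the given model type."""
    def decorator(fn):
        LOADERS[model_type] = fn
        return fn
    return decorator


@register_loader("python_function")
def load_python_function(metadata: dict):
    # A real shim would import the named function; here it is looked up locally.
    return KNOWN_FUNCTIONS[metadata["function"]]


@register_loader("lookup_table")
def load_lookup_table(metadata: dict):
    table = metadata["table"]
    return lambda key: table[key]


def load_servable(metadata: dict):
    """Pick a loader from the metadata and return a callable model."""
    model_type = metadata["model_type"]
    if model_type not in LOADERS:
        raise ValueError(f"unsupported model type: {model_type!r}")
    return LOADERS[model_type](metadata)


predict = load_servable({"model_type": "python_function", "function": "double"})
print(predict(21))  # prints 42
```

Supporting a new model type in this sketch means registering one more loader; the calling code does not change.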
  9. DLHub Servables
     [Diagram labels: Pipelines: Preprocess, Infer, Postprocess; Preprocess .run(), Model predict .run(), Postprocess .run(); methods: .run(), .test(); Singularity or Docker]
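
The diagram on this slide shows preprocess, predict, and postprocess steps, each with a `.run()` method, chained into a pipeline. A minimal sketch of that idea follows. The `Step` and `Pipeline` classes are hypothetical illustrations, not DLHub classes, and the meaning given to `.test()` here (compare the output for a sample input with an expected value) is an assumption.

```python
from typing import Any, Callable, Iterable


class Step:
    """One stage of a pipeline; wraps a plain function behind .run()."""

    def __init__(self, name: str, fn: Callable[[Any], Any]):
        self.name = name
        self.fn = fn

    def run(self, data: Any) -> Any:
        return self.fn(data)


class Pipeline:
    """Runs steps in order, feeding each output into the next step."""

    def __init__(self, steps: Iterable[Step]):
        self.steps = list(steps)

    def run(self, data: Any) -> Any:
        for step in self.steps:
            data = step.run(data)
        return data

    def test(self, sample: Any, expected: Any) -> bool:
        # Assumed semantics: a smoke test on one known input/output pair.
        return self.run(sample) == expected


preprocess = Step("preprocess", lambda xs: [float(x) for x in xs])
predict = Step("predict", lambda xs: [x * x for x in xs])  # toy "model"
postprocess = Step("postprocess", lambda ys: {"predictions": ys})

pipeline = Pipeline([preprocess, predict, postprocess])
print(pipeline.run(["1", "2", "3"]))  # {'predictions': [1.0, 4.0, 9.0]}
```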
  10. DLHub Features
      • Security model
        ○ provided from publication to inference
        ○ Globus Auth -- login with one of hundreds of supported identity providers (e.g., institutions, ORCID, Google)
      • DLHub CLI and SDK
        ○ Describe, publish, share, and invoke
      • DLHub model searching
        ○ Rich metadata of the model
        ○ Metadata stored in a flexible search index, built on Globus Search
  11. DLHub Architecture
      • Management Service for users to publish, search, and infer
      • Task Managers (TM) to support deployment on various compute resources
        ○ Parsl, a Python library that supports parallel execution on many sites
      • Executors on execution sites to invoke servables
      • Optimizations including
        ○ Memoization
        ○ Data staging with Globus
        ○ Batch submissions
        ○ Scalability through deployment of model replicas
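
Two of the optimizations listed on this slide, memoization and batch submissions, can be illustrated with a toy executor. This is a sketch under stated assumptions: `MemoizingExecutor` is a hypothetical class that uses only the standard library, and it does not show how DLHub or Parsl implement either feature.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List


class MemoizingExecutor:
    """Caches servable results keyed by a hash of the JSON-encoded inputs."""

    def __init__(self, servable: Callable[[Any], Any]):
        self.servable = servable
        self.cache: Dict[str, Any] = {}
        self.calls = 0  # number of real servable invocations

    @staticmethod
    def _key(inputs: Any) -> str:
        payload = json.dumps(inputs, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def run(self, inputs: Any) -> Any:
        key = self._key(inputs)
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = self.servable(inputs)
        return self.cache[key]

    def run_batch(self, batch: List[Any]) -> List[Any]:
        # One submission carries many inputs; repeated inputs hit the cache.
        return [self.run(inputs) for inputs in batch]


executor = MemoizingExecutor(lambda xs: sum(xs))
results = executor.run_batch([[1, 2], [3, 4], [1, 2]])
print(results, executor.calls)  # [3, 7, 3] 2
```

In a served deployment the benefit of batching comes mostly from paying the per-request overhead once per batch; the sketch only shows the shape of the interface.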
  12. DLHub Performance: Scale Testing
      [Figure: Scaling performance of IPP and HTEX]
      • Deployed the servables on PetrelKube, a 14-node Kubernetes cluster
      • Parsl -- IPyParallel (IPP) and HighThroughput (HTEX) executors
      • 10000 batch inferences of “no-op” servable
  13. DLHub Performance: Latency (Serving General Models)
      [Figure: Latency performance of IPP and HTEX]
      • Deployed the executors on PetrelKube, a 14-node Kubernetes cluster
      • Parsl -- IPyParallel (IPP) and HighThroughput (HTEX) executors
      • 1000 repeated inferences of “no-op” servable
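
A "no-op" servable isolates the overhead of the serving path from the cost of the model itself. The sketch below shows the general shape of such a latency measurement with a local function call. It is only an illustration: `noop` and `measure_latencies` are hypothetical names, and nothing here reproduces the PetrelKube experiments or their numbers.

```python
import statistics
import time
from typing import Any, Callable, List


def noop(_inputs: Any = None) -> None:
    """Stand-in for a "no-op" servable: does no work and returns nothing."""
    return None


def measure_latencies(fn: Callable[[Any], Any], repeats: int) -> List[float]:
    """Call fn repeatedly and return per-call latencies in milliseconds."""
    latencies = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(None)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


latencies = measure_latencies(noop, 1000)
print(f"median: {statistics.median(latencies):.4f} ms")
print(f"max:    {max(latencies):.4f} ms")
```

Swapping `noop` for a function that submits a request to a remote service would measure end-to-end latency instead of local call overhead.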
  14. Using DLHub is Easy!
      Python SDK:
          $ pip install dlhub_sdk
      Command Line Interface:
          $ pip install dlhub_cli
  15. Marking up a Model – Python SDK
      [Diagram labels: Existing Model; User Mark Up with SDK; Send to DLHub (via Globus or HTTPS); DLHub Containerization; Populate Search Index / Mint Identifiers]
      SDK Extracts Metadata for Known Model Types
  16. Python SDK – Automated Metadata Generation
      Metadata groups: Citation Metadata, DLHub Metadata, Servable Metadata
      Access Control
      • Public
      • Globus users
      • Globus groups
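
The three metadata groups and the three access-control levels on this slide can be sketched as a plain dictionary plus a check function. Every field name below (`citation`, `dlhub`, `servable`, `access`, and so on) is a hypothetical placeholder chosen for illustration; the actual DLHub schema is not shown in the slides.

```python
from typing import Iterable

# Hypothetical metadata record with the three groups named on the slide.
metadata = {
    "citation": {"title": "Example model", "authors": ["Doe, Jane"]},
    "dlhub": {
        "name": "example_model",
        "domains": ["materials science"],
        "access": {
            "public": False,
            "users": ["jane@example.org"],
            "groups": ["example-group"],
        },
    },
    "servable": {
        "type": "python_function",
        "methods": {"run": {"input": "list of float", "output": "float"}},
    },
}


def can_access(access: dict, user: str, user_groups: Iterable[str] = ()) -> bool:
    """Allow access if the model is public, or the user or a group is listed."""
    if access.get("public", False):
        return True
    if user in access.get("users", []):
        return True
    allowed_groups = set(access.get("groups", []))
    return any(group in allowed_groups for group in user_groups)


access = metadata["dlhub"]["access"]
print(can_access(access, "jane@example.org"))                  # True
print(can_access(access, "sam@example.org", ["example-group"]))  # True
print(can_access(access, "sam@example.org"))                   # False
```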
  17. Comparing Models
      Cherukara (NST), Nashed (MCS), Harder (XSD) @ Argonne
  18. Denoising Tomography Data with TomoGAN
      • Tomography data yields important insights for a number of different fields. However:
        ○ data are initially noisy
      • TomoGAN denoises tomography data using a generative adversarial network (GAN)
      • Powerful tool for quickly denoising measurements at scale.
  19. Analyzing Beamline Images
      [Diagram labels: DLHub, Image tags]
      • Stage data into containers via Globus HTTPS
      • Pass valid token and data location
  20. DLHub Summary
      Model deposit and discovery
      - Developed a model schema to promote discovery
      - Implemented advanced search and filtering
      - Built ingest flow: models are dynamically staged, packaged, dockerized, published, and indexed
      Model serving
      - Deployed capabilities for users to run inference with SDK and CLI
      - Automated testing of containers
      - Implemented caching and batching
      Support for multiple execution sites
      - PetrelKube: Parsl, TF Serving, SageMaker
      - Other: AWS, OSG
      Authentication
      - Protected model metadata and inference with Globus Auth
      - Secured data staging
      Future work
      - Build Web UI to create pipelines and invoke models
      - Cache at the servable level within pipelines
      - Couple DLHub to data sources (MDF, etc.)
      - Integrate with ML frontend tools (DeepForge), optimization tools (DeepHyper), and more
      - Create interface for training and retraining of models
  21. Thanks to our sponsors!
      [Logos: U.S. Department of Energy, ALCF DF, Parsl, Globus, IMaD, DLHub, Argonne LDRD]
  22. Learning Systems
      Model Repositories
      - Catalog and aggregate models
      - Enable discovery and citation
      - Capture provenance
      - Record performance data
      - Mint identifiers
      Model Serving
      - On-demand model inference
      - Scalable deployments
      - Standardized interfaces
      - Low latency vs ease of use
      Drawbacks:
      1. Current serving platforms are not usable on most HPC platforms
      2. There is not an integrated system that provides both