1. REST and Microservices-Based ML Deployment with Containers: From Data Science Experiments to Production ML
Nisha Talagala
CTO, ParallelM
2. Growing AI Investments; Few Deployed at Scale
• Survey of 3,073 AI-aware C-level executives: 20% have AI in production; 80% are developing, experimenting, or contemplating
• Out of 160 reviewed AI use cases, 88% did not progress beyond the experimental stage
• But successful early AI adopters report profit margins 3–15% higher than the industry average
Source: "Artificial Intelligence: The Next Digital Frontier?", McKinsey Global Institute, June 2017
https://parallelm.com/free-account
3. Development Is Only the Beginning
• Your model has great accuracy! Now what?
• Many systems are available to help develop models
• "Single click" deployment has become a cliché
• Publicly available endpoint != Production
• Even vetted models can cause catastrophic errors (e.g., Knight Capital)
• Real-world context is constantly changing (e.g., search results)
• Test conditions may not accurately represent the overall goal (e.g., resume filters)
4. A Microservice Approach to ML Deployment
[Diagram: the Business Application sends a Request to the MLApp and receives a Prediction]
5. A Microservice Approach to ML Deployment
[Diagram: the Business Application sends a Request to the MLApp and receives a Prediction]
Example request:
- Customer name
- Prior purchases: cat food, rugs, pet brush
- Search terms: toys
Example prediction:
- Recommend: cat toys
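As a rough illustration of the exchange in the example above, the request and the prediction can be modeled as small JSON payloads. This is only a sketch: the field names (`customer_name`, `prior_purchases`, `search_terms`, `recommendations`) and the hand-written `recommend` rule are assumptions made for illustration, not something taken from MCenter or any real service.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class PredictionRequest:
    # Hypothetical request schema mirroring the slide's example.
    customer_name: str
    prior_purchases: List[str]
    search_terms: List[str]


@dataclass
class PredictionResponse:
    # Hypothetical response schema: a list of recommended items.
    recommendations: List[str]


def recommend(request: PredictionRequest) -> PredictionResponse:
    """Stand-in for a trained model: a hand-written rule, not real ML."""
    owns_cat_items = any("cat" in item for item in request.prior_purchases)
    wants_toys = "toys" in request.search_terms
    if owns_cat_items and wants_toys:
        return PredictionResponse(recommendations=["cat toys"])
    return PredictionResponse(recommendations=[])


request = PredictionRequest(
    customer_name="example customer",
    prior_purchases=["cat food", "rugs", "pet brush"],
    search_terms=["toys"],
)
# What the business application would send over the wire.
request_json = json.dumps(asdict(request))
# What the MLApp would do on receipt: decode, predict, encode.
response = recommend(PredictionRequest(**json.loads(request_json)))
response_json = json.dumps(asdict(response))
```

Keeping both sides of the exchange as plain JSON is what lets the business application and the MLApp be written in different languages and deployed independently.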
6. Endpoint != Production ML Service
• Performance: meet SLAs and avoid unhappy customers
• Resource efficiency: lower cost
• Scale out: adapt to load
• High availability
• Debuggability, observability, and reproducibility
• Governance and explainability
• All models and all languages
• Update models on the fly without restarting the service
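The last requirement, updating a model without restarting the service, can be sketched as a holder object that swaps the active model under a lock. This is a minimal illustration under stated assumptions: `ModelHolder`, its method names, and the toy lambda "models" are hypothetical and do not describe how MCenter implements updates.

```python
import threading
from typing import Callable, List

# A "model" here is any callable mapping features to a prediction.
Model = Callable[[List[float]], float]


class ModelHolder:
    """Holds the active model and lets it be replaced while serving."""

    def __init__(self, model: Model, version: str) -> None:
        self._lock = threading.Lock()
        self._model = model
        self._version = version

    def swap(self, model: Model, version: str) -> None:
        # Replace model and version together so readers never see a mix.
        with self._lock:
            self._model = model
            self._version = version

    def predict(self, features: List[float]) -> dict:
        # Take a consistent snapshot, then run inference outside the lock
        # so a slow prediction does not block a model update.
        with self._lock:
            model, version = self._model, self._version
        return {"prediction": model(features), "model_version": version}


# Toy models standing in for trained artifacts.
holder = ModelHolder(model=lambda x: sum(x), version="v1")
before = holder.predict([1.0, 2.0])
holder.swap(model=lambda x: max(x), version="v2")
after = holder.predict([1.0, 2.0])
```

Returning the model version with each prediction is one simple way to support the debuggability and reproducibility items in the same list.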
7. What Is Inside the MLApp? A Simple Way to Get Started
[Diagram: the Business Application sends a Request to a REST Inference Pipeline inside the MLApp and receives a Prediction]
• The REST Inference Pipeline serves a trained model from your favorite developer or AutoML tool
Example request: customer name; prior purchases: cat food, rugs, pet brush; search terms: toys
Example prediction: recommend cat toys
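A REST inference pipeline of this kind can be sketched with nothing but the Python standard library. The sketch below is an assumption-laden illustration, not MCenter code: the `/predict` route, the JSON field names, and the rule inside `model_predict` (a placeholder for a real trained model) are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def model_predict(features: dict) -> dict:
    """Placeholder for a trained model loaded from disk (assumption)."""
    purchases = features.get("prior_purchases", [])
    terms = features.get("search_terms", [])
    if "toys" in terms and any("cat" in p for p in purchases):
        return {"recommend": ["cat toys"]}
    return {"recommend": []}


def handle_predict(body: bytes) -> Tuple[int, bytes]:
    """Parse, predict, serialize; returns (HTTP status, response body)."""
    try:
        features = json.loads(body)
        if not isinstance(features, dict):
            raise ValueError("request body must be a JSON object")
    except ValueError as err:  # json.JSONDecodeError subclasses ValueError
        return 400, json.dumps({"error": str(err)}).encode("utf-8")
    return 200, json.dumps(model_predict(features)).encode("utf-8")


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/predict":  # hypothetical route name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Blocking call; run it from a script or a container entry point."""
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

The parse, predict, and serialize steps live in `handle_predict` rather than in the handler class so they can be exercised without opening a socket.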
8. What Is Inside the MLApp? Getting Fancier
[Diagram: the Business Application sends a Request to the REST Inference Pipeline and receives a Prediction; a Model Retraining Pipeline produces a new model, which goes through a Policy check and Human Approval before it reaches the REST Inference Pipeline]
Example request: customer name; prior purchases: cat food, rugs, pet brush; search terms: toys
Example prediction: recommend cat toys
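The gate between the retraining pipeline and the serving pipeline can be sketched as two checks: an automated policy, then a human sign-off. Everything here is hypothetical and chosen for illustration: the `Candidate` fields, the accuracy-gain policy, the `min_gain` threshold, and the `approve` callback that stands in for a person clicking "approve".

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Candidate:
    version: str
    accuracy: float  # measured on a held-out validation set


def passes_policy(candidate: Candidate, current: Candidate,
                  min_gain: float = 0.01) -> bool:
    """Hypothetical policy: promote only on a clear accuracy gain."""
    return candidate.accuracy >= current.accuracy + min_gain


def promote(candidate: Candidate, current: Candidate,
            approve: Callable[[Candidate], bool]) -> Candidate:
    """Return the model that should serve traffic after this cycle."""
    if not passes_policy(candidate, current):
        return current  # rejected automatically, no human time spent
    if not approve(candidate):
        return current  # a human reviewer declined the new model
    return candidate


current = Candidate(version="v1", accuracy=0.90)
retrained = Candidate(version="v2", accuracy=0.93)

approved = promote(retrained, current, approve=lambda c: True)
declined = promote(retrained, current, approve=lambda c: False)
too_small = promote(Candidate("v3", 0.905), current, approve=lambda c: True)
```

Running the automated policy first means reviewers only see candidates that already clear the bar.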
9. What We Do: MCenter
[Diagram: the MCenter Server exchanges models, retraining, control, statistics, events, and alerts with MCenter Agents running on analytic engines, alongside data streams and data lakes; MCenter Developer Connectors link to data science platforms (CDSW)]
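10. MCenter Production Model Deployment
[Diagram: Data Scientist Environment, MCenter Server, MCenter Agent, ML Compute Platforms, Business Application]
Step 1: Upload the trained model (and optional inference logic) from the data scientist environment to the MCenter Server
Step 2: Launch an ML application: the inference logic and trained model are packaged as a container and deployed through the MCenter Agent onto ML compute platforms, with a serving endpoint and MCenter monitoring
Step 3: The business application sends a prediction request to the serving endpoint
Step 4: The serving endpoint returns a prediction response to the business application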
11. Deploying ML Models as a REST-Based Service with MCenter
• Train your model wherever you like
• Upload the model into MCenter (manually or via a script)
• Configure a REST service in one step
• Click "Launch"
• Get reports, update models, etc. from the MCenter dashboards
12. Demo
• ACME Prey Tracking Service: Wile E. Coyote is tracking Road Runner
• Road Runner's cell phone stats are used to train a scikit-learn model
• The data scientist uploads the model and deploys it as a REST service
• The data scientist can use MCenter to manage the service
13. Demo
[Diagram: the Business Application (Acme Prey Tracker) sends a Request carrying Road Runner cell phone monitoring data to the MLApp, where a REST Inference Pipeline runs the scikit-learn model, and receives a Prediction of Road Runner status]
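From the business application's side, calling such a service is an ordinary JSON POST. The client below is a sketch under assumptions: the endpoint URL and the feature names in `sample` are made up, since the slides do not show the demo's actual schema or address.

```python
import json
import urllib.request

# Hypothetical endpoint; substitute the URL your deployment reports.
ENDPOINT = "http://localhost:8080/predict"


def build_request(sample: dict, url: str = ENDPOINT) -> urllib.request.Request:
    """Build (but do not send) a JSON POST carrying one monitoring sample."""
    return urllib.request.Request(
        url,
        data=json.dumps(sample).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_status(sample: dict, url: str = ENDPOINT,
                   timeout: float = 5.0) -> dict:
    """Send the sample and decode the JSON prediction from the service."""
    with urllib.request.urlopen(build_request(sample, url),
                                timeout=timeout) as resp:
        return json.loads(resp.read())


# Made-up feature names standing in for cell phone monitoring data.
sample = {"signal_strength": -70, "speed_kmh": 95.0}
req = build_request(sample)
```

Calling `request_status(sample)` only works once a service is listening at the chosen URL; `build_request` alone has no network side effects.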
15. Summary
• We are at the beginning of MLOps: production ML for real life
• With MCenter, anyone can deploy a production-grade ML service and manage the full ML lifecycle
• Interested? Get a free account:
https://parallelm.com/free-account