Building and Deploying
Scalable NLP Model Services
September 22, 2022
● Overview of Kubernetes (a pedestrian perspective…)
● Overview of Seldon Core
● Building & Deploying a Seldon Core Model
● More Complex Inference Graphs
Agenda
● What we’re shooting for:
○ High-level understanding of k8s and Seldon
○ Ability to deploy single model using this tooling
○ Template/stub to get started with deployments
● Out of scope:
○ k8s expertise
○ Full Seldon.ai ecosystem (we’ll focus narrowly)
Setting Expectations
What we’re going to build:
[Diagram: components labeled redaction, REST]
[Diagram: components labeled redaction, punctuation, REST]
What we’re going to build (time permitting):
[Diagram: components labeled punctuation, combine, redaction, sentiment, REST]
whoami
Before We Jump In
If you’d like to follow along, please make sure to:
git clone https://github.com/zak-s-brown/seldon_sl2022.git
cd seldon_sl2022
make init
Overview of Kubernetes
(A pedestrian perspective…)
“Kubernetes is a portable,
extensible, open source platform
for managing containerized
workloads and services, that
facilitates both declarative
configuration and automation.”
What is Kubernetes?
Allows us to deploy and manage
resilient, scalable services
● Automated rollout/rollback
● Self-healing
● Automatic bin packing
● Storage orchestration
● Secrets/config management
What is Kubernetes?
What is Kubernetes?
Affectionately referred to as k8s
(k-eights or kates) by most folks,
as “kubernetes” is a bit
cumbersome
Cluster: platform for managing
containerized workloads and
services
Anatomy of a k8s Cluster
Node: Worker machine in k8s
Anatomy of a k8s Cluster
Pod: Set of running containers in
your cluster
Anatomy of a k8s Cluster
Container: Lightweight* and
portable executable image that
contains all software and
dependencies
Anatomy of a k8s Cluster
This is usually (part of) the
deliverable artifact for an MLE
Clusters typically contain a
mix of services, with varying
resource requirements
Anatomy of a k8s Cluster
= pod with lower resource reqs
= pod with higher resource reqs
Pods can also specify a node
group to be deployed on,
allowing hardware
optimization for
heterogeneous workloads
Anatomy of a k8s Cluster
[Diagram: node groups labeled general purpose and compute optimized]
Anatomy of a k8s Cluster
K8s supports (naive) fixed
replication as well as
horizontal pod autoscaling
(HPA), which uses pod group
metrics to trigger scaling
events
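To make the scaling idea concrete, here is a minimal sketch of the replica calculation described in the Kubernetes HPA documentation: desired = ceil(current replicas × current metric / target metric). The function name, the parameter names, and the min/max defaults are illustrative assumptions, and the real controller also applies a tolerance band and stabilization windows that this sketch leaves out.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Replica count an HPA-style controller would ask for (simplified)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    # Scale proportionally to how far the observed metric is from the target.
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds (hypothetical defaults above).
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Two pods averaging 90% CPU against a 60% target.
    print(desired_replicas(2, 90, 60))
```

With fixed replication the count never changes; with this rule, load above the target adds pods and load below it removes them, within the bounds.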
Kubernetes Tooling
kubectl is the primary command line tool for interacting with a
kubernetes cluster
kubectl get/describe nodes/pods
kubectl apply/delete -f my-deployment.yml
k9s is an alternative tool offering much of the same functionality as
kubectl in a terminal-based UI
Kubernetes Tooling
Overview of Seldon Core
What is Seldon?
“Seldon Core makes it easier and
faster to deploy your machine
learning models and experiments
at scale on Kubernetes. Seldon
Core serves models built in any
open-source or commercial
model building framework”
● Prepackaged model servers for common frameworks:
○ Sklearn
○ XGBoost
○ TensorFlow
○ MLflow
● Language Wrappers
○ Python
○ Java (incubating)
○ R, Node, Go (alpha)
Out of the Box Support
● Seldon deployments come with:
○ REST and gRPC endpoints
○ Swagger documentation*
○ Integration with k8s metrics and monitoring (grafana)
What Comes “Out of the Box”
A consistent framework for deploying models
across diverse organizations
Inference Graph Components
The Seldon Python SDK supports a variety of inference graph
components, accommodating a wide array of use cases
● Models: Model deployment (minimally) with predict method
● Transformers: Custom input/output transformations
● Routers: Logically direct requests to child components
● Combiners: Combine responses from multiple models
Inference Graph Components
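As a rough sketch of the four component types, each can be written as a plain Python class exposing one method. The method names (predict, transform_input, route, aggregate) follow the Seldon Python wrapper as I understand it, so check them against the wrapper docs for your version. The class names and the toy logic are purely illustrative assumptions.

```python
class UppercaseModel:
    """Model: exposes predict."""

    def predict(self, X, features_names=None):
        return [str(x).upper() for x in X]


class StripTransformer:
    """Transformer: reshapes or cleans input before it reaches a model."""

    def transform_input(self, X, features_names=None):
        return [str(x).strip() for x in X]


class LengthRouter:
    """Router: returns the index of the child that should handle the request."""

    def route(self, features, feature_names=None):
        return 0 if len(str(features[0])) < 20 else 1


class ConcatCombiner:
    """Combiner: merges the responses of several children into one."""

    def aggregate(self, features_list, feature_names_list=None):
        # features_list holds one response per child; join them row by row.
        return [" | ".join(str(x) for x in row) for row in zip(*features_list)]


if __name__ == "__main__":
    cleaned = StripTransformer().transform_input(["  hello  "])
    print(UppercaseModel().predict(cleaned))
```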
Building & Deploying a Seldon
Core Model
Using the Python Wrapper
Define Class Containerize Deploy
To create a new model, we need to
define a model class in a file with
the same name as the defined class
(e.g. MySeldonModel.py)
Defining a Custom Model Class
The (minimal) definition of a model
with the Python SDK requires:
● __init__
● predict
● (optional) load
Defining a Custom Model Class
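A minimal sketch of such a class, saved as MySeldonModel.py, might look like the following. It is not the SpacyScrubber model from the hands-on: a regular expression stands in for the spaCy pipeline so the example runs without extra dependencies, and the redaction rule (masking email addresses) is an assumption made for illustration.

```python
import re


class MySeldonModel:
    def __init__(self):
        # Keep __init__ cheap; heavy resources are created in load().
        self._pattern = None
        self.ready = False

    def load(self):
        # Stand-in for loading a real model artifact (e.g. an NLP pipeline).
        self._pattern = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
        self.ready = True

    def predict(self, X, features_names=None):
        # Load lazily in case load() was not called before the first request.
        if not self.ready:
            self.load()
        return [self._pattern.sub("[REDACTED]", str(text)) for text in X]


if __name__ == "__main__":
    print(MySeldonModel().predict(["mail bob@example.com now"]))
```

Keeping the expensive setup in load rather than __init__ means the object can be constructed quickly and the model loaded once per worker.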
SpacyScrubber Model Def Hands On
Using the Python Wrapper
Define Class Containerize Deploy
There are two primary options for
containerizing models created with
the Seldon Python SDK
● OpenShift source-to-image
● Docker
Containerizing a Seldon Model Class
To create a container image, your
Dockerfile should contain:
● A reference to the model class file in
the root of the Docker build
context (class/file name only)
● Seldon-specific env vars
● An invocation of
seldon-core-microservice
Containerizing a Seldon Model Class
Once we’ve built our model
container, we then need to make it
available to the k8s cluster via a
container registry
Containerizing a Seldon Model Class
docker build -t mymodel:latest .
docker tag mymodel:latest localhost:5001/mymodel:latest
docker push localhost:5001/mymodel:latest
SpacyScrubber Dockerfile Hands On
Using the Python Wrapper
Define Class Containerize Deploy
Deploying a Containerized Model
A k8s deployment defines the full
configuration for the model pod(s)
● Seldon version/type info
● Container definition
● Graph definition
● Replication/scaling config
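To illustrate how those four pieces fit together, here is a sketch that builds a SeldonDeployment manifest as a Python dict and prints it as JSON (kubectl accepts JSON manifests as well as YAML). The field layout follows the machinelearning.seldon.io/v1 resource as I understand it; the helper name, the predictor name "default", and the image tag are assumptions, so compare against the Seldon Core reference before relying on it.

```python
import json


def seldon_deployment(name, image, replicas=1, namespace="default"):
    """Build a single-model SeldonDeployment manifest (illustrative helper)."""
    return {
        # Seldon version/type info
        "apiVersion": "machinelearning.seldon.io/v1",
        "kind": "SeldonDeployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "predictors": [{
                "name": "default",
                # Replication/scaling config (fixed replication here)
                "replicas": replicas,
                # Container definition
                "componentSpecs": [{
                    "spec": {"containers": [{"name": name, "image": image}]}
                }],
                # Graph definition: one MODEL node named after its container
                "graph": {"name": name, "type": "MODEL", "children": []},
            }]
        },
    }


manifest = seldon_deployment("mymodel", "localhost:5001/mymodel:latest", replicas=2)

if __name__ == "__main__":
    print(json.dumps(manifest, indent=2))
```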
Once we define the deployment,
we can push it to our k8s cluster
using kubectl:
# push deployment to cluster
kubectl apply -f mymodel-deploy.yml
# remove (destroy) deployment
kubectl delete -f mymodel-deploy.yml
Deploying a Containerized Model
SpacyScrubber Deployment Hands On
Testing Your Model
Seldon REST endpoints by default expect a numpy.ndarray as
input, with a request payload of the form:
{
"data": {
"ndarray":
[<input>]
}
}
Testing Your Model
When coupled with Istio for ingress, Seldon automatically wires up
new models according to the following routing pattern:
http://<ingress>/seldon/<ns>/<service>/api/v1.0/predictions
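Putting the payload shape and the routing pattern together, a test request can be assembled with the standard library alone. This is a sketch: the ingress address, namespace, and service name used below are placeholders, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def prediction_url(ingress, namespace, service):
    """Fill in the routing pattern for a deployed model."""
    return f"http://{ingress}/seldon/{namespace}/{service}/api/v1.0/predictions"


def build_request(ingress, namespace, service, inputs):
    """Wrap inputs in the {"data": {"ndarray": [...]}} payload as a POST request."""
    body = json.dumps({"data": {"ndarray": list(inputs)}}).encode("utf-8")
    return urllib.request.Request(
        prediction_url(ingress, namespace, service),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; substitute your own ingress, namespace, and service.
    req = build_request("localhost:8080", "default", "mymodel", ["some text"])
    print(req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req).read()
```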
SpacyScrubber Endpoint Test
More Complex Inference Graphs
Now that we’ve run through the basics of a single model, let’s take a
look at a slightly more complex inference graph
[Diagram: components labeled redaction, punctuation, REST]
Serial Model Deployment
Assuming we already have the
component services deployed, we
can define a new deployment
similar to the following:
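A minimal sketch of the graph section of such a deployment, expressed as a Python dict: in Seldon's graph format, chaining is done by nesting, so a node's output is passed to its child. The helper name chain is hypothetical, and the ordering (redaction first, then punctuation) is an assumption.

```python
def chain(*names):
    """Nest MODEL nodes so that each one feeds its output to the next."""
    node = None
    # Build from the last model backwards so the first name ends up outermost.
    for name in reversed(names):
        node = {
            "name": name,
            "type": "MODEL",
            "children": [node] if node else [],
        }
    return node


graph = chain("redaction", "punctuation")

if __name__ == "__main__":
    import json
    print(json.dumps(graph, indent=2))
```

The resulting dict would take the place of the single-node graph entry in a predictor, with one container per node name.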
Serial Inference Graph Hands On
Going one step further, we can create even more complex inference
graphs:
[Diagram: components labeled punctuation, combine, redaction, sentiment, REST]
Serial Model Deployment
Again, assuming we already have
the component services deployed,
we can define a new deployment
similar to the following:
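One way such a graph could be laid out, as a sketch: a COMBINER node at the root fans the request out to its children and aggregates their responses. The exact wiring here (redaction feeding punctuation on one branch, sentiment on the other) is an assumption about the intended topology, and node_names is a hypothetical helper for checking that every node has a matching container.

```python
graph = {
    "name": "combine",
    "type": "COMBINER",
    "children": [
        {
            "name": "redaction",
            "type": "MODEL",
            "children": [
                {"name": "punctuation", "type": "MODEL", "children": []},
            ],
        },
        {"name": "sentiment", "type": "MODEL", "children": []},
    ],
}


def node_names(node):
    """List every node name in the graph, depth first."""
    names = [node["name"]]
    for child in node.get("children", []):
        names.extend(node_names(child))
    return names


if __name__ == "__main__":
    # Each of these names needs a container definition in the deployment.
    print(node_names(graph))
```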
Complex Inference Graph Hands On
