Agenda
● Overview of Kubernetes (a pedestrian perspective…)
● Overview of Seldon Core
● Building & Deploying a Seldon Core Model
● More Complex Inference Graphs
Setting Expectations
● What we’re shooting for:
○ High-level understanding of k8s and Seldon
○ Ability to deploy a single model using this tooling
○ Template/stub to get started with deployments
● Out of scope:
○ k8s expertise
○ Full Seldon.ai ecosystem (we’ll focus narrowly)
Before We Jump In
If you’d like to follow along, please make sure to:

git clone https://github.com/zak-s-brown/seldon_sl2022.git
cd seldon_sl2022
make init
What is Kubernetes?
“Kubernetes is a portable, extensible, open source platform for managing containerized workloads and services, that facilitates both declarative configuration and automation.”
What is Kubernetes?
Allows us to deploy and manage resilient, scalable services:
● Automated rollout/rollback
● Self-healing
● Automatic bin packing
● Storage orchestration
● Secrets/config management
Anatomy of a k8s Cluster
Clusters typically contain a mix of services, with varying resource requirements.
(diagram legend: pods with lower vs. higher resource requirements)
Anatomy of a k8s Cluster
Pods can also specify a node group to be deployed on, allowing hardware optimization for heterogeneous workloads (e.g. general-purpose vs. compute-optimized nodes).
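One common way to express this is a `nodeSelector` in the pod spec. A minimal sketch, assuming nodes carry a hypothetical `node-group` label:

```yaml
# Illustrative pod spec; the "node-group" label key and the
# "compute-optimized" value are assumptions about cluster labeling.
apiVersion: v1
kind: Pod
metadata:
  name: heavy-inference
spec:
  nodeSelector:
    node-group: compute-optimized
  containers:
  - name: model
    image: mymodel:latest
```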
Anatomy of a k8s Cluster
K8s supports (naive) fixed replication as well as horizontal pod autoscaling (HPA), which leverages pod-group metrics to trigger scaling events.
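The HPA mechanism above can be sketched with a standard manifest; the target name and thresholds below are assumptions for illustration:

```yaml
# Illustrative HPA: scale the "mymodel" deployment between 1 and 5
# replicas based on average CPU utilization across its pods.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mymodel-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mymodel
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```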
Kubernetes Tooling
kubectl is the primary command-line tool for interacting with a Kubernetes cluster:

kubectl get/describe nodes/pods
kubectl apply/delete -f my-deployment.yml
Kubernetes Tooling
k9s is an alternative tool offering much of the same functionality as kubectl in a terminal-based UI.
What is Seldon?
“Seldon Core makes it easier and faster to deploy your machine learning models and experiments at scale on Kubernetes. Seldon Core serves models built in any open-source or commercial model building framework.”
Out of the Box Support
● Prepackaged model servers for common frameworks:
○ scikit-learn
○ XGBoost
○ TensorFlow
○ MLflow
● Language wrappers:
○ Python
○ Java (incubating)
○ R, Node, Go (alpha)
What Comes “Out of the Box”
● Seldon deployments come with:
○ REST and gRPC endpoints
○ Swagger documentation*
○ Integration with k8s metrics and monitoring (Grafana)

A consistent framework for deploying models across diverse organizations
Inference Graph Components
The Seldon Python SDK supports a variety of inference graph components, accommodating a wide array of use cases:
● Models: Model deployment (minimally) with a predict method
● Transformers: Custom input/output transformations
● Routers: Logically direct requests to child components
● Combiners: Combine responses from multiple models
Defining a Custom Model Class
To create a new model, we need to define a model class in a file with the same name as the defined class (e.g. MySeldonModel.py).
Defining a Custom Model Class
The (minimal) definition of a model with the Python SDK requires:
● __init__
● predict
● (optional) load
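The shape above can be sketched as follows; this is a minimal illustration, not the SDK's full interface, and the doubling "model" inside predict is a hypothetical stand-in for real artifacts:

```python
import numpy as np

class MySeldonModel:
    """Minimal sketch of a Seldon Python SDK model class (would live
    in MySeldonModel.py). The 'model' here is a hypothetical stand-in
    that simply doubles its input."""

    def __init__(self):
        self.ready = False

    def load(self):
        # In a real deployment, load weights/artifacts here
        # (called once before serving traffic).
        self.ready = True

    def predict(self, X, features_names=None):
        # Seldon invokes predict with a numpy ndarray built from
        # the request's "data.ndarray" payload.
        if not self.ready:
            self.load()
        return np.asarray(X) * 2
```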
Containerizing a Seldon Model Class
There are two primary options for containerizing models created with the Seldon Python SDK:
● OpenShift source-to-image (s2i)
● Docker
Containerizing a Seldon Model Class
To create a container image, your Dockerfile should:
● Reference the model class file in the root of the Docker build context (class/file name only)
● Add Seldon-specific env vars
● Invoke seldon-core-microservice
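Putting those pieces together, one plausible shape for such a Dockerfile is sketched below; the base image, pins, and port are assumptions:

```dockerfile
# Hedged sketch; base image and env var values are assumptions.
FROM python:3.9-slim

RUN pip install seldon-core

# Model class file in the root of the build context
COPY MySeldonModel.py /app/
WORKDIR /app

# Seldon-specific env vars: class name and component type
ENV MODEL_NAME=MySeldonModel
ENV SERVICE_TYPE=MODEL

EXPOSE 9000
CMD exec seldon-core-microservice $MODEL_NAME --service-type $SERVICE_TYPE
```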
Containerizing a Seldon Model Class
Once we’ve built our model container, we then need to make it available to the k8s cluster via a container registry:

docker build -t mymodel:latest .
docker tag mymodel:latest localhost:5001/mymodel:latest
docker push localhost:5001/mymodel:latest
Deploying a Containerized Model
A k8s deployment defines the full configuration for the model pod(s):
● Seldon version/type info
● Container definition
● Graph definition
● Replication/scaling config
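A manifest covering those pieces might look like the sketch below; the image tag and names are assumptions carried over from the docker commands:

```yaml
# Hedged sketch of a mymodel-deploy.yml SeldonDeployment manifest.
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: mymodel
spec:
  predictors:
  - name: default
    replicas: 1
    componentSpecs:
    - spec:
        containers:
        - name: mymodel
          image: localhost:5001/mymodel:latest
    graph:
      name: mymodel   # must match the container name above
      type: MODEL
```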
Deploying a Containerized Model
Once we define the deployment, we can push it to our k8s cluster using kubectl:

# push deployment to cluster
kubectl apply -f mymodel-deploy.yml
# remove (destroy) deployment
kubectl delete -f mymodel-deploy.yml
Testing Your Model
Seldon REST endpoints by default expect a numpy.ndarray as input, with a request payload of the form:

{
  "data": {
    "ndarray": [<input>]
  }
}
Testing Your Model
When coupled with Istio for ingress, Seldon automatically wires up new models according to the following routing pattern:

http://<ingress>/seldon/<ns>/<service>/api/v1.0/predictions
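Combining the payload shape and routing pattern, a test request can be sketched like this; the host, namespace, and service name are assumptions for a locally port-forwarded cluster:

```python
import json
import urllib.request

# Build (but don't send) a prediction request following the
# routing pattern above.
payload = {"data": {"ndarray": [[1.0, 2.0, 3.0]]}}
req = urllib.request.Request(
    "http://localhost:8080/seldon/default/mymodel/api/v1.0/predictions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# With a live cluster, you would then call:
# response = urllib.request.urlopen(req)
# print(json.load(response))
```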
More Complex Inference Graphs
Now that we’ve run through the basics of a single model, let’s take a look at a slightly more complex inference graph.
Serial Model Deployment
Assuming we already have the component services deployed, we can define a new deployment similar to the following:
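A serial graph of this kind can be sketched as a parent component with a child; the component names below are assumptions:

```yaml
# Hedged sketch of a graph section for a serial deployment:
# a transformer whose output feeds a model.
graph:
  name: my-transformer
  type: TRANSFORMER
  children:
  - name: mymodel
    type: MODEL
    children: []
```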