KFServing and Feast
Animesh Singh
The InferenceService architecture consists of a static graph of components which coordinate
requests for a single model. Advanced features such as ensembling, A/B testing, and multi-armed
bandits should compose InferenceServices together.
Inference Service Control Plane
Inference Service with Transformer
apiVersion: serving.kubeflow.org/v1alpha2
kind: InferenceService
metadata:
  name: bert-serving
spec:
  default:
    transformer:
      custom:
        container:
          image: bert-transformer:v1
          env:
            - name: STORAGE_URI
              value: s3://examples/bert_transformer
    predictor:
      pytorch:
        storageUri: s3://examples/bert
        runtimeVersion: v0.3.0-gpu
        resources:
          limits:
            nvidia.com/gpu: 1
PyTorch Model Server
apiVersion: serving.kubeflow.org/v1alpha2
kind: InferenceService
metadata:
  name: bert-serving-onnx
spec:
  default:
    transformer:
      custom:
        container:
          image: bert-transformer:v1
          env:
            - name: STORAGE_URI
              value: s3://examples/bert_transformer
    predictor:
      onnx:
        storageUri: s3://examples/bert
        runtimeVersion: 0.5.1
ONNX Runtime Server
Diagram labels: Pre/Post Processing, ONNX Runtime Server, Pre/Post Processing
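The pre/post processing role of the transformer can be sketched in a few lines of Python. This is a minimal illustration, not the KFServing implementation: the vocabulary, the labels, and the stand-in predictor `fake_predict` are all hypothetical, and the `instances` / `predictions` keys are assumed as the request and response envelope.

```python
from typing import Callable, Dict

# Hypothetical vocabulary and labels; a real BERT transformer would use a tokenizer.
VOCAB = {"[UNK]": 0, "hello": 1, "world": 2}
LABELS = ["negative", "positive"]


def preprocess(request: Dict) -> Dict:
    """Turn raw text instances into token-id lists for the predictor."""
    instances = [
        [VOCAB.get(tok, VOCAB["[UNK]"]) for tok in text.lower().split()]
        for text in request["instances"]
    ]
    return {"instances": instances}


def postprocess(response: Dict) -> Dict:
    """Map each row of scores to the label with the highest score."""
    labels = [
        LABELS[max(range(len(row)), key=row.__getitem__)]
        for row in response["predictions"]
    ]
    return {"predictions": labels}


def serve(request: Dict, predict: Callable[[Dict], Dict]) -> Dict:
    """Static graph for a single model: preprocess -> predict -> postprocess."""
    return postprocess(predict(preprocess(request)))


def fake_predict(request: Dict) -> Dict:
    """Stand-in predictor: scores depend only on the number of tokens."""
    return {
        "predictions": [
            [0.1, 0.9] if len(ids) > 1 else [0.8, 0.2]
            for ids in request["instances"]
        ]
    }


result = serve({"instances": ["Hello world", "foo"]}, fake_predict)
```

Because the predictor is passed in as a callable, the same transformer code can sit in front of either model server shown above.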
Feature
● What is a feature?
/feature/
A feature is a measurable property of the object you’re trying to analyze.
Features are the basic building blocks of models. The
quality of the features in your dataset has a major impact
on the quality of the insights you will gain when you use
that dataset for machine learning.
Importance of Features
Hidden Technical Debt in Machine Learning Systems
Coming up with features is difficult, time-consuming,
requires expert knowledge. "Applied machine
learning" is basically feature engineering.
- Andrew Ng, Founder of deeplearning.ai
...some machine learning projects succeed and
some fail. What makes the difference? Easily the
most important factor is the features used.
- Pedro Domingos, author of ‘The Master Algorithm’
The algorithms we used are very standard for Kagglers. […]
We spent most of our efforts in feature engineering.
[...]
- Xavier Conort, Chief Data Scientist, DataRobot
Feature Engineering is essential, difficult, and costly
The Feature problem
● Different ML models typically use some common set of features
● Examples of common features:
○ Average Loan default rate by zip code: Used by models which predict who should be targeted for a marketing offer, models which predict who should be offered a loan, etc.
○ Average property prices in an area
○ Credit history of customers: Used by models which predict anything that is related to clients.
○ Average traffic in an area: Used by models which deal with finding the best route
● Finding the right features for an ML model requires:
○ Thinking of which features will be relevant for building the model
○ Identifying the right data from the data catalog/data lake for building the feature
○ Feature engineering to get the feature in the right format from the source data
○ This is repeated across teams by data scientists!
Feature Management is a Huge Pain Point
Spend more time on data prep
Lack of data consistency between training and serving
Duplicate work because they do not know a feature already exists
Manage fragmented data infrastructure
Deal with more requests as the data science team scales
Hard to get features into production
Data Scientists
Data Engineer
● Poor Feature Management Leads to…
○ Long Development Time
○ Poor Data Quality
○ Difficulty in Production
The feature store is the central
place to store curated features
for machine learning pipelines.
Feature Store
Feast is a Feature Store Catalog that
attempts to solve the key data
challenges with production machine
learning.
Feature stores are a critical piece of ML infra
’17 Uber Michelangelo (Proprietary, original feature store)
’18 Feast (Open source)
’18 Logical Clocks (Open source, ML platform)
’19 Airbnb’s Zipline (Closed source)
’19 Spotify’s Feature Store on Kubeflow (Closed source)
’20 Pinterest (Closed source)
’20 Twitter Feature Store (Closed source, library based)
’20 Tecton Feature Store (Closed source)
What is a Feature Catalog?
● A feature catalog can be thought of as “Master Data” used for building and serving machine learning models
● It stores different features which can be used across different teams for building ML models
● It is not just a feature repository, but also includes two serving mechanisms for:
○ Batch access
○ Real time access
● Feature Update: Feature values will get updated over time
○ Some will be updated in real time, e.g., average traffic in an area
○ Some will be updated infrequently, e.g., the credit rating of a customer
○ Features need to be synced between repositories used for batch access and real time access.
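The two serving mechanisms and the sync requirement can be sketched as a toy in-memory catalog. This is only an illustration under stated assumptions: `TinyFeatureCatalog` and its methods are hypothetical names, integer timestamps stand in for real event times, and a production catalog would use separate storage systems for the two stores.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (entity_id, feature_name)


class TinyFeatureCatalog:
    """Toy catalog with a batch (offline) view and a real-time (online) view."""

    def __init__(self) -> None:
        # Offline store: full history, for batch access (training).
        self.offline: Dict[Key, List[Tuple[int, float]]] = defaultdict(list)
        # Online store: only the latest value, for real-time access.
        self.online: Dict[Key, Tuple[int, float]] = {}

    def ingest(self, entity_id: str, feature: str, timestamp: int, value: float) -> None:
        """Write one update to both stores so they stay in sync."""
        key = (entity_id, feature)
        self.offline[key].append((timestamp, value))
        latest = self.online.get(key)
        # Late-arriving data must not overwrite a newer online value.
        if latest is None or timestamp >= latest[0]:
            self.online[key] = (timestamp, value)

    def get_online(self, entity_id: str, feature: str) -> float:
        return self.online[(entity_id, feature)][1]

    def get_history(self, entity_id: str, feature: str) -> List[Tuple[int, float]]:
        return sorted(self.offline[(entity_id, feature)])


catalog = TinyFeatureCatalog()
catalog.ingest("area_1", "avg_traffic", 1, 10.0)
catalog.ingest("area_1", "avg_traffic", 3, 30.0)
catalog.ingest("area_1", "avg_traffic", 2, 20.0)  # late arrival
```

Writing every update through a single `ingest` path is one simple way to keep the batch and real-time views consistent; other designs replicate from one store to the other.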
What does Feast provide?
Registry: A common catalog with which to explore, develop, collaborate on, and publish new feature definitions within
and across teams.
Ingestion: A means for continually ingesting batch and streaming data and storing consistent copies in both an offline
and online store.
Serving: A feature-retrieval interface which provides a temporally consistent view of features for both training and online
serving.
Monitoring: Tools that allow operational teams to monitor and act on the quality and accuracy of data reaching models.
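A "temporally consistent view" for training usually means that each training row only sees feature values that were known at the time of the labelled event. The sketch below shows that point-in-time lookup in plain Python. It is not Feast's implementation; `value_as_of` and `build_training_rows` are hypothetical helper names, and timestamps are plain integers.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def value_as_of(history: List[Tuple[int, float]], event_time: int) -> Optional[float]:
    """Return the latest feature value with timestamp <= event_time.

    `history` must be sorted by timestamp. Returns None if no value
    existed yet at `event_time`.
    """
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, event_time)
    return history[idx - 1][1] if idx > 0 else None


def build_training_rows(
    history: List[Tuple[int, float]],
    label_events: List[Tuple[int, int]],
) -> List[Tuple[int, Optional[float], int]]:
    """Join each (event_time, label) with the feature value known at that time."""
    return [(t, value_as_of(history, t), label) for t, label in label_events]


history = [(10, 1.0), (20, 2.0), (30, 3.0)]
rows = build_training_rows(history, [(15, 0), (30, 1)])
```

Online serving is the special case where the event time is "now", so the lookup returns the most recent value.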
Feast on K8s
Diagram components: Feature Repo, feast apply, registry (GCS/S3), Ingestion API, Kafka, Spark on K8s, Offline Store (BQ/S3/GCS/Other), Redis, Serving API
Diagram legend: Exists / Planning phase
feast apply configures infrastructure based on feature definitions and the “provider”.
TBD what the scope of apply would be for a K8s provider. It may be that it only spins up jobs and updates stores.
Feast - KFS
Feast Online Serving
• gRPC server
• Serves 2 methods
  • GetFeastServingInfo
  • GetOnlineFeaturesV2
• How to call the gRPC server-side methods?
  • Short answer: the Feast SDK
  • The Feast Python SDK (also available in Java and Go) wraps the gRPC client libs.
  • gRPC client libs call gRPC server-side methods like calling local methods.
  • gRPC client libs are generated from the protobuf definitions.
• How are the Python gRPC client libs (modules) generated?
  • They are generated from *.proto. Example:
    python -m grpc_tools.protoc -I. \
      --grpc_python_out=../sdk/python/feast/protos/ \
      --python_out=../sdk/python/feast/protos/ \
      --mypy_out=../sdk/python/feast/protos/ \
      feast/serving/ServingService.proto
  • Generates
    ServingService_pb2_grpc.py: client stub and server base classes, built on the messages in ServingService_pb2.py
    ServingService_pb2.py: request and response message classes used by ServingService_pb2_grpc.py
    ServingService_pb2.pyi: type stub (interface) file for ServingService_pb2.py
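The idea that a client stub makes a remote method look like a local one can be shown without gRPC itself. The sketch below is a toy stand-in: `FakeChannel` replaces the network, JSON replaces protobuf serialization, and the method path and the `version` field are assumptions for illustration rather than values taken from the Feast protos.

```python
import json
from typing import Callable, Dict

# Assumed fully qualified method path; a generated stub takes it from the .proto.
INFO_METHOD = "/feast.serving.ServingService/GetFeastServingInfo"


class FakeChannel:
    """Stand-in for a network channel: routes a method path to a handler."""

    def __init__(self, handlers: Dict[str, Callable[[bytes], bytes]]) -> None:
        self._handlers = handlers

    def unary_unary(self, path: str) -> Callable[[dict], dict]:
        def call(request: dict) -> dict:
            payload = json.dumps(request).encode("utf-8")   # serialize the request
            reply = self._handlers[path](payload)           # simulated network hop
            return json.loads(reply.decode("utf-8"))        # deserialize the response
        return call


class ServingServiceStub:
    """Mimics the shape of a generated stub: one attribute per remote method."""

    def __init__(self, channel: FakeChannel) -> None:
        self.GetFeastServingInfo = channel.unary_unary(INFO_METHOD)


def handle_info(payload: bytes) -> bytes:
    """Hypothetical server-side handler returning a made-up version string."""
    return json.dumps({"version": "example"}).encode("utf-8")


stub = ServingServiceStub(FakeChannel({INFO_METHOD: handle_info}))
info = stub.GetFeastServingInfo({})  # reads like a local method call
```

A generated `*_pb2_grpc.py` module follows the same shape, with a real channel and protobuf serializers in place of the stand-ins used here.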
Extend the Transformer
apiVersion: serving.kubeflow.org/v1alpha2
kind: InferenceService
metadata:
  name: bert-serving
spec:
  default:
    transformer:
      feast:
        feastUrl: "http://feast-serving.default.svc"
        dataType: TensorProto
        entityIds:
          - source
        featureIds:
          - weather:1:temp
          - weather:1:clouds
          - weather:1:humidity
        numFeatureValues: 5
    predictor:
      pytorch:
        storageUri: s3://examples/bert
        runtimeVersion: v0.3.0-gpu
        resources:
          limits:
            nvidia.com/gpu: 1
Diagram labels: PyTorch Model Server, Feast Transformer
KFServing with Feast
Feast transformer as a new type of transformer for preprocessing
○ Has a custom container image with a generic implementation to interact with Feast online serving
○ Properties: entity IDs, feature refs, project, Feast serving URL…
○ Specify IDs in the InferenceService YAML
■ Entity IDs → FeatureStore.get_online_features(entity_rows…)
■ Feature refs → FeatureStore.get_online_features(feature_refs…)
○ The initial request is augmented with features from the Feast online store and sent to the predictor as the final input
○ Postprocess is a pass-through, not implemented in this transformer
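Diagram: a model serving request (predict or explain) passes as a Python dict through the Transformer (Preprocess), then the Predictor (Predict) or Explainer (Explain), then the Transformer (Postprocess), and returns as the model serving response (predict or explain). During preprocessing the Transformer calls Feast Online Serving (Python, gRPC), which is backed by the Online Store (Redis) and the Feast Registry.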
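The preprocess step of such a Feast transformer can be sketched as follows. This is a hedged illustration rather than the proposed implementation: the feature lookup is injected as a plain callable (`fake_lookup` is a hypothetical stand-in for a call to Feast online serving), the returned mapping shape is an assumption, and the entity and feature names are borrowed from the YAML example above.

```python
from typing import Callable, Dict, List

# Assumed lookup signature: (feature_refs, entity_rows) -> {feature_ref: [value per row]}.
FeatureLookup = Callable[[List[str], List[Dict]], Dict[str, List[float]]]


def make_preprocess(
    lookup: FeatureLookup, entity_id: str, feature_refs: List[str]
) -> Callable[[Dict], Dict]:
    """Build a preprocess function that augments each instance with online features."""

    def preprocess(request: Dict) -> Dict:
        entity_rows = [{entity_id: inst[entity_id]} for inst in request["instances"]]
        features = lookup(feature_refs, entity_rows)
        augmented = []
        for i, inst in enumerate(request["instances"]):
            row = dict(inst)  # copy so the original request is not mutated
            for ref in feature_refs:
                row[ref] = features[ref][i]
            augmented.append(row)
        return {"instances": augmented}

    return preprocess


def postprocess(response: Dict) -> Dict:
    """Pass-through, as described for this transformer."""
    return response


def fake_lookup(feature_refs: List[str], entity_rows: List[Dict]) -> Dict[str, List[float]]:
    """Hypothetical stand-in for the online store, keyed by the `source` entity."""
    table = {"station_a": {"weather:1:temp": 21.5, "weather:1:humidity": 0.4}}
    return {ref: [table[row["source"]][ref] for row in entity_rows] for ref in feature_refs}


preprocess = make_preprocess(fake_lookup, "source", ["weather:1:temp", "weather:1:humidity"])
augmented = preprocess({"instances": [{"source": "station_a"}]})
```

Injecting the lookup keeps the transformer logic independent of whether features are fetched through the Python API or the gRPC API.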
KFServing with Feast – Phased Approach
Phase 1: Provide a sample Feast transformer
○ Illustrate how online features in Feast feature stores can be retrieved and used for model serving
○ As a sample in KFServing docs folder
○ Use the driver ranking data and model from the Feast tutorial, https://github.com/feast-dev/feast-driver-ranking-tutorial
○ Use a custom container image
○ Interact with Feast online serving via the Python API
Phase 2: Provide a generic Feast transformer
○ Support a variety of Feast feature stores in preprocessing and model serving
○ As a general transformer in KFServing python folder
○ Include tests, instructions, and examples
○ Provide a common Feast base image
○ Interact with Feast online serving via the gRPC API
Where can it go?
Better precision
Diagram: Data Asset 1 feeds Model 1 (poor quality), Data Asset 2 feeds Model 2 (poor quality), Data Asset 3 feeds Model 3 (poor quality).
Difficult to identify the features that lead to poor quality models
Better precision
Diagram: a Feature Store holding Feature 1 (poor quality) and Feature 2 (moderate quality), feeding Model 1 (poor quality), Model 2 (poor quality), Model 3 (poor quality), and Model 4 (good quality).
• Feature 1 – Used in 3 models, all of which have poor quality
• Feature 2 – Used in 2 models, which have good and poor quality
• Feature 3 – Used in 1 model with good quality
Easy to identify feature quality
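One simple way to turn the picture above into numbers is to score each feature by the average quality of the models that use it. The sketch below assumes that heuristic; the slide does not specify a scoring method, and the feature-to-model mapping and the 0/1 quality values are illustrative.

```python
from collections import defaultdict
from typing import Dict, List


def feature_quality(
    model_features: Dict[str, List[str]], model_quality: Dict[str, float]
) -> Dict[str, float]:
    """Average the quality of every model that uses each feature."""
    scores: Dict[str, List[float]] = defaultdict(list)
    for model, features in model_features.items():
        for feature in features:
            scores[feature].append(model_quality[model])
    return {feature: sum(vals) / len(vals) for feature, vals in scores.items()}


# Assumed mapping: which features each model consumes (not given on the slide).
model_features = {
    "model_1": ["feature_1"],
    "model_2": ["feature_1"],
    "model_3": ["feature_1", "feature_2"],
    "model_4": ["feature_2", "feature_3"],
}
# 0.0 = poor quality, 1.0 = good quality.
model_quality = {"model_1": 0.0, "model_2": 0.0, "model_3": 0.0, "model_4": 1.0}

quality = feature_quality(model_features, model_quality)
```

With this mapping, the feature shared only by poor models scores lowest and the feature used only by the good model scores highest, which matches the poor, moderate, and good ordering on the slide.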
