The third technical webinar in the "The A-Z of Data" series, dedicated to deploying DL models with Kubernetes and Kubeflow.
https://dataphoenix.info/webinar-deploying-deep-learning-models-with-kubernetes-and-kubeflow/
In this talk, you will learn how to deploy Keras models. First we will see how to do it with TF-Serving and Kubernetes, and in the second part of the talk we will do it with KFServing and Kubeflow.
Speaker:
Alexey Grigorev - Principal Data Scientist at OLX Group and founder of DataTalks.Club. Alexey has written several books on machine learning. One of them is Machine Learning Bookcamp, a book for programmers who want to get into machine learning.
Subscribe to our Telegram channel (https://t.me/DataPhoenix) to stay up to date with the latest news!
"The A-Z of Data" (https://dataphoenix.info/the-a-z-of-data) is a series of webinars from the Data Phoenix Events team that will help you systematize and expand your knowledge of working with data. The webinars are organized into thematic blocks. Each block consists of an overview webinar, several technical webinars on the best tools, practices, approaches and model architectures, and a webinar with practical use cases and an expert panel discussion. By the end of 2021 we plan to cover topics such as MLOps, Natural Language Processing, Computer Vision and Time-Series Forecasting.
4. Plan
● Different options to deploy a model (Lambda, Kubernetes, SageMaker)
● Kubernetes 101
● Deploying an XGB model with Flask and Kubernetes
● Deploying a Keras model with TF-Serving and Kubernetes
● Deploying a Keras model with Kubeflow
5. Ways to deploy a model
● Flask + AWS Elastic Beanstalk
● Serverless (AWS Lambda)
● Kubernetes (EKS)
● Kubeflow (EKS)
● AWS SageMaker
● ...
(or their alternatives in other cloud providers)
13. Lambda vs SageMaker vs Kubernetes
● Lambda
○ Cheap for small load
○ Easy to manage
○ Not always transparent
● SageMaker (serving)
○ Easy to use/manage
○ Needs wrappers
○ Not always transparent
○ Expensive
● Kubernetes
○ Complex (for me)
○ More flexible
○ Cloud-agnostic *
○ Requires support
○ Cheaper for high load
* sort of
15. Kubernetes glossary
● Pod ~ one instance of your service
● Deployment - a bunch of pods
● HPA - horizontal pod autoscaler
● Node - a server (e.g. EC2 instance)
● Service - an interface to the deployment
● Ingress - an interface to the cluster
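To make the glossary a bit more concrete, here is a minimal sketch of how a Deployment and a Service fit together, written as plain Python dictionaries dumped to JSON (kubectl accepts JSON manifests as well as YAML). The name `xgb-model`, the image `example-registry/xgb-model:v1` and the replica count are hypothetical; port 9696 is the one used by the Flask service in this deck.

```python
import json

APP_NAME = "xgb-model"                    # hypothetical service name
IMAGE = "example-registry/xgb-model:v1"   # hypothetical image reference
PORT = 9696                               # port the Flask/gunicorn app listens on


def make_deployment(name, image, port, replicas=2):
    """Deployment: keeps `replicas` identical pods of the service running."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # the selector must match the labels of the pod template below
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


def make_service(name, port):
    """Service: a stable interface that routes traffic to the deployment's pods."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "ClusterIP",
            "selector": {"app": name},  # picks pods by the same label
            "ports": [{"port": 80, "targetPort": port}],
        },
    }


deployment = make_deployment(APP_NAME, IMAGE, PORT)
service = make_service(APP_NAME, PORT)

print(json.dumps(deployment, indent=2))
print(json.dumps(service, indent=2))
```

Saving each printed document to a file and running `kubectl apply -f` on it would create the objects. The Service finds the pods only because its selector uses the same `app` label as the pod template.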
18.
import xgboost as xgb
from flask import Flask, jsonify, request

app = Flask(__name__)

# load the model from the pickle file
# (the loading code and apply_model are not shown on the slide)

@app.route('/predict', methods=['POST'])
def predict():
    data = request.get_json()
    result = apply_model(data)
    return jsonify(result)

if __name__ == "__main__":
    app.run(debug=True, host='0.0.0.0', port=9696)
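For illustration, a client for the `/predict` endpoint could look roughly like the sketch below, using only the standard library. The feature names in `example` are hypothetical (the deck does not show the model's inputs), and the URL assumes the service runs locally on port 9696.

```python
import json
import urllib.request


def build_predict_request(features, url="http://localhost:9696/predict"):
    """Build (but do not send) a JSON POST request for the /predict endpoint."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, url="http://localhost:9696/predict"):
    """Send the request and parse the JSON response (needs a running service)."""
    req = build_predict_request(features, url)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# hypothetical input record; the real feature names depend on the model
example = {"feature_a": 1.0, "feature_b": "x"}
req = build_predict_request(example)
print(req.get_method(), req.full_url)
```

Calling `predict(example)` only works while the Flask app or the gunicorn container is running. Building the request separately makes it possible to inspect what would be sent without a server.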
19.
FROM python:3.7.5-slim
RUN pip install flask gunicorn xgboost
COPY "model.py" "model.py"
EXPOSE 9696
ENTRYPOINT ["gunicorn", "--bind", "0.0.0.0:9696", "model:app"]
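26.
import tensorflow as tf
from tensorflow import keras

# load the trained Keras model from the HDF5 file
model = keras.models.load_model('keras-model.h5')
# export it in the SavedModel format that TF-Serving loads
tf.saved_model.save(model, 'tf-model')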
27.
$ ls -lhR
.:
total 3,1M
4,0K assets
3,1M saved_model.pb
4,0K variables
./assets:
total 0
./variables:
total 83M
83M variables.data-00000-of-00001
15K variables.index
28.
$ saved_model_cli show --dir tf-model --all
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
...
signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input_8'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 299, 299, 3)
        name: serving_default_input_8:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['dense_7'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 10)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict
29.
$ docker run -it --rm \
    -p 8500:8500 \
    -v "$(pwd)/tf-model:/models/tf-model/1" \
    -e MODEL_NAME=tf-model \
    tensorflow/serving:2.3.0
2021-09-07 21:03:58.579046: I tensorflow_serving/model_servers/server.cc:367] Running gRPC ModelServer at 0.0.0.0:8500 ...
[evhttp_server.cc : 238] NET_LOG: Entering the event loop ...
2021-09-07 21:03:58.582097: I tensorflow_serving/model_servers/server.cc:387] Exporting HTTP/REST API at:localhost:8501 ...
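The log shows that TF-Serving also exposes a REST API on port 8501. The talk goes with gRPC, but as a rough sketch, a REST predict call for the same model could be assembled as below. It assumes port 8501 is also published (the command above only maps 8500), and the tiny 2x2 "image" is a stand-in for a real 299x299x3 input.

```python
import json


def build_rest_predict(model_name, instances, host="localhost", port=8501,
                       signature_name="serving_default"):
    """Return the URL and JSON body for a TF-Serving REST predict call."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    payload = {"signature_name": signature_name, "instances": instances}
    return url, json.dumps(payload)


# stand-in input: one 2x2 RGB image; the real model expects 299x299x3 floats
dummy_image = [[[0.0, 0.0, 0.0] for _ in range(2)] for _ in range(2)]

url, body = build_rest_predict("tf-model", [dummy_image])
print(url)
```

The body would be sent as an HTTP POST to `url`. Each element of `instances` is one input example, so the outer list plays the role of the batch dimension (the `-1` in the signature).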
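33. Not so fast
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# gRPC port published by the docker run command (assumes a local container)
channel = grpc.insecure_channel('localhost:8500')
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

def np_to_protobuf(data):
    return tf.make_tensor_proto(data, shape=data.shape)

# X is the preprocessed image batch, a NumPy array of shape (1, 299, 299, 3)
pb_request = predict_pb2.PredictRequest()
pb_request.model_spec.name = 'tf-model'
pb_request.model_spec.signature_name = 'serving_default'
pb_request.inputs['input_8'].CopyFrom(np_to_protobuf(X))
pb_result = stub.Predict(pb_request, timeout=20.0)
pred = pb_result.outputs['dense_7'].float_val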
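The `pred` value is a flat sequence of floats: ten scores, one per output of `dense_7`, when a single image is sent. A gateway would typically turn it into something readable before returning it. The sketch below shows one way to do that. The label names and the score values are made up for illustration, since the deck does not list the model's classes.

```python
def decode_predictions(scores, labels):
    """Pair each raw score with its class label."""
    if len(scores) != len(labels):
        raise ValueError("expected one score per label")
    return dict(zip(labels, (float(s) for s in scores)))


def top_class(scores, labels):
    """Return the label with the highest score."""
    decoded = decode_predictions(scores, labels)
    return max(decoded, key=decoded.get)


# hypothetical label names; the real ones depend on how the model was trained
labels = [f"class_{i}" for i in range(10)]

# made-up scores standing in for pb_result.outputs['dense_7'].float_val
pred = [-1.9, -4.8, -2.4, -1.1, 9.9, -2.8, -3.6, 3.2, -2.6, -4.8]

result = decode_predictions(pred, labels)
print(top_class(pred, labels))  # class_4
```

Converting to plain `float` values matters in practice because the resulting dictionary can then be passed straight to a JSON encoder.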
34. A 2.0 GB dependency?
Get only the things you need!
https://github.com/alexeygrigorev/tensorflow-protobuf
35.
# with the full TensorFlow package:
from tensorflow.keras.applications.xception import preprocess_input

# with keras-image-helper (https://github.com/alexeygrigorev/keras-image-helper):
from keras_image_helper import create_preprocessor
preprocessor = create_preprocessor('xception', target_size=(299, 299))
url = 'http://bit.ly/mlbookcamp-pants'
X = preprocessor.from_url(url)
36. Next steps...
● Bake the model into the TF-Serving image
● Wrap the gRPC calls in a Flask app for the Gateway
● Write a Dockerfile for the Gateway
● Publish the images to ECR
47. Kubeflow
● Open-source ML platform with many services (notebooks, pipelines, serving)
● KFServing makes it easier to deploy models compared to plain Kubernetes
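As a rough illustration of why this is easier, a KFServing deployment is described by a single InferenceService object instead of separate Deployment and Service manifests. The sketch below builds one as a Python dictionary. The field layout follows the `serving.kubeflow.org/v1beta1` API as I understand it and should be checked against the installed KFServing version. The name and the storage URI are hypothetical.

```python
import json


def make_inference_service(name, storage_uri):
    """InferenceService manifest with a TensorFlow predictor (v1beta1 layout assumed)."""
    return {
        "apiVersion": "serving.kubeflow.org/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {
            "predictor": {
                # KFServing fetches the SavedModel from this location
                "tensorflow": {"storageUri": storage_uri}
            }
        },
    }


# hypothetical name and bucket path
inference_service = make_inference_service(
    "tf-model", "s3://example-bucket/tf-model"
)
print(json.dumps(inference_service, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`. KFServing then takes care of creating the serving pods and the endpoint for the model.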
64. Summary
● AWS SageMaker vs AWS Lambda vs Kubernetes vs Kubeflow
● Deploying models with Kubernetes: deployment + service
● Deploying Keras models: TF-Serving + Gateway (over gRPC)
● KFServing: transformers + model
● No one size fits all
65. mlbookcamp.com
● Learn Machine Learning by doing projects
● http://bit.ly/mlbookcamp
● Get 40% off with code “grigorevpc”