Train and deliver machine learning models to production with a single command
STEPAN PUSHKAREV
ILNUR GARIFULLIN
Today’s webinar overview
1. Machine Learning Workflow
2. Tools overview
a. Kubeflow
b. Hydrosphere.io
3. Deep Dive into Automation
a. Steps definition
b. Steps automation
Machine Learning Workflow
ML Workflow
1. Research
2. Data Preparation
3. Model Training
4. Model Cataloguing
5. Model Deployment
6. Model Integration Testing
7. Production Inferencing
8. Model Performance Monitoring
9. Model Maintenance
Step 1: Research
● Defining an objective
● Defining requirements
● Defining methods
● Defining data sources
Step 2: Data Preparation
● Collecting data
● Preparing data
○ Cleaning
○ Feature engineering
○ Transformation
● Important: the same preparation steps must be reused for inferencing.
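The note about reuse matters because a model only behaves as evaluated when inference applies exactly the transformation used in training. A minimal sketch, assuming a hypothetical `preprocess` function and 8-bit grayscale pixel input (neither is taken from the slides):

```python
from typing import List, Sequence, Tuple


def preprocess(pixels: Sequence[int]) -> List[float]:
    """Scale 8-bit grayscale pixels to [0.0, 1.0] (hypothetical shared transformation)."""
    if any(p < 0 or p > 255 for p in pixels):
        raise ValueError("pixel values must be in 0..255")
    return [p / 255.0 for p in pixels]


def make_training_example(pixels: Sequence[int], label: int) -> Tuple[List[float], int]:
    # Training path: transform the features, keep the label untouched.
    return preprocess(pixels), label


def make_inference_request(pixels: Sequence[int]) -> List[float]:
    # Inference path: call the very same function, not a reimplementation.
    return preprocess(pixels)


features, label = make_training_example([0, 128, 255], 7)
# Both paths produce identical features for identical raw input.
assert features == make_inference_request([0, 128, 255])
```

Packaging the transformation as one importable function (or one container stage) is one way to keep the two paths from drifting apart.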
Step 3: Model Training
● Building the model
● Training the model
● Evaluating the model
● Tuning hyper-parameters
● Versioning training data
Step 4: Model Cataloguing
● Metadata extraction
○ Graph definition
○ Weights
○ Training data version / stats
○ Other dependencies (lookup vocabulary, etc.)
● Indexing the model’s binaries
● Versioning a model artifact
● Storing a model in Repository
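The cataloguing bullets above can be sketched as a small metadata record. This is an illustration only: the record layout, the `catalogue_model` helper, and the `hypothetical-data-v1` tag are assumptions, not the format used by any of the tools discussed in this deck.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def catalogue_model(model_dir: str, name: str, version: int,
                    training_data_version: str) -> dict:
    """Index every file under model_dir by SHA-256 and attach version info."""
    root = Path(model_dir)
    files = {}
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # The digest identifies the exact binary that was catalogued.
        files[path.relative_to(root).as_posix()] = hashlib.sha256(
            path.read_bytes()).hexdigest()
    return {
        "name": name,
        "version": version,
        "training_data_version": training_data_version,
        "files": files,
    }


# Demo on a throwaway directory standing in for an exported model.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "weights.bin").write_bytes(b"\x00\x01")
    record = catalogue_model(tmp, "mnist", 1, "hypothetical-data-v1")

print(json.dumps(record, indent=2, sort_keys=True))
```

Storing the digests next to the version number makes it possible to check later that a deployed artifact is the one that was catalogued.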
Step 5: Model Deployment
● Preparing infrastructure for the model
● Preparing runtime for the model
● Deploying the model server
● Exposing API endpoints to the model
● Model Integration
Step 6: Model Integration Testing
● Performing integration tests
● Replaying a golden data set
● Replaying edge cases
● Replaying recent traffic
● Asserting results
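Replaying a golden data set and asserting results can be sketched as follows. The `fake_predict` function is a hypothetical stand-in for a call to the deployed model's API, and the two-example golden set is made up for the demo.

```python
from typing import Callable, Iterable, List, Tuple


def replay(predict: Callable[[list], int],
           golden: Iterable[Tuple[list, int]]) -> List[Tuple[list, int, int]]:
    """Send every golden input to predict and collect the mismatches."""
    failures = []
    for inputs, expected in golden:
        actual = predict(inputs)
        if actual != expected:
            failures.append((inputs, expected, actual))
    return failures


def fake_predict(inputs: list) -> int:
    # Hypothetical model: returns the index of the largest input (argmax).
    return max(range(len(inputs)), key=inputs.__getitem__)


golden_set = [([0.1, 0.9], 1), ([0.8, 0.2], 0)]
failures = replay(fake_predict, golden_set)
assert not failures, f"golden replay failed: {failures}"
```

The same `replay` function could be pointed at a file of edge cases or at recorded recent traffic; only the source of `(inputs, expected)` pairs changes.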
Step 7: Production Inferencing
● A/B & Canary deployment
● Model scaling
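A canary deployment routes a small share of traffic to a new model version while the stable version serves the rest. A minimal sketch of weighted routing, assuming a hypothetical 90/10 split and a hypothetical second version `mnist:2`:

```python
import random
from collections import Counter


def choose_version(weights: dict, rng: random.Random) -> str:
    """Pick a model version with probability proportional to its weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


# Hypothetical split: 90% of requests to the stable model, 10% to the canary.
traffic_split = {"mnist:1": 90, "mnist:2": 10}
rng = random.Random(42)  # seeded so the demo is repeatable
counts = Counter(choose_version(traffic_split, rng) for _ in range(10_000))
print(counts)
```

In a real serving platform this decision is usually made by the router or service mesh rather than by application code; the sketch only shows the selection rule.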
Step 8: Model Performance Monitoring
● System metrics monitoring
● Model metrics tracking
● Model comparison
● Concept drift monitoring
● Anomaly detection
● Data profiling
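Data profiling and drift monitoring can be illustrated with a deliberately simple check: profile a feature at training time, then flag production batches whose mean has moved too far. This is a mean-shift test chosen for brevity, not the method of any tool in this deck; the threshold of 3 standard errors is an assumption.

```python
import math
import statistics
from typing import Sequence


def profile(values: Sequence[float]) -> dict:
    """Summarise one feature as mean and population standard deviation."""
    return {"mean": statistics.fmean(values), "std": statistics.pstdev(values)}


def drifted(training_profile: dict, production: Sequence[float],
            threshold: float = 3.0) -> bool:
    """True when the production mean is more than `threshold` standard errors
    away from the training mean."""
    if not production:
        raise ValueError("production batch is empty")
    production_mean = statistics.fmean(production)
    standard_error = training_profile["std"] / math.sqrt(len(production))
    if standard_error == 0:
        # Constant training feature: any change in the mean counts as drift.
        return production_mean != training_profile["mean"]
    return abs(production_mean - training_profile["mean"]) / standard_error > threshold


training_profile = profile([0.0, 1.0] * 50)             # mean 0.5, std 0.5
same_distribution = drifted(training_profile, [0.0, 1.0] * 10)
shifted = drifted(training_profile, [1.0] * 20)
print(same_distribution, shifted)
```

Production monitoring typically compares whole distributions per feature; the mean check here only shows where a training-time profile enters the comparison.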
Step 9: Model Maintenance
● Alerts & Troubleshooting
● Root Cause Analysis
● Edge Case Exploration
● Retraining Dataset Subsampling
● Retraining
The Toolset
● Hydrosphere.io: the Machine Learning Model Management Platform
● Kubeflow: the Machine Learning Toolkit for Kubernetes
What is Kubeflow?
● Began as a Kubernetes template / blueprint for running TensorFlow
● Evolved into a “toolkit”: loosely coupled tools and blueprints for ML on Kubernetes
What is Hydrosphere.io?
Hydrosphere.io is a platform for ML model management.
- A specific value-add “tool”, one part of the toolkit
- Open source
- Augments Cataloguing, Deployment, Inferencing, Monitoring and Maintenance
Tools Landscape (figure): tools mapped onto the workflow stages Research, Data Prep, Training, Cataloguing, Deployment, Integration Testing, Production Inferencing, Performance Monitoring, and Model Maintenance. Surviving labels: Orchestrate, ModelDB.
Deep Dive into Workflow Automation
Part 1: Creating executables
Step 1: Research
MNIST
● Objective – given an image of a handwritten digit, predict which digit it is;
● Requirements – easy model export;
● Tools and Methods – TensorFlow Estimator API;
● Data – MNIST dataset
Step 2: Data Preparation
Step 2: Data Preparation — Building Container
Dockerfile:
FROM python:3.6-slim
RUN pip install numpy==1.14.3 Pillow==5.2.0
ADD ./download.py /src/
WORKDIR /src/
ENTRYPOINT [ "python", "download.py" ]

$ docker build -t {username}/mnist-pipeline-download .
$ docker push {username}/mnist-pipeline-download
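The container's entrypoint is `download.py`, whose contents are not shown on the slide. As a rough idea of the parsing half of such a script, here is a sketch that reads the IDX binary format in which MNIST is distributed (after the files are decompressed). It uses only the standard library, whereas the image above installs numpy and Pillow, so the real script presumably differs; the function names are hypothetical.

```python
import struct
from typing import List, Tuple

IMAGE_MAGIC = 2051  # 0x00000803: unsigned bytes, 3 dimensions
LABEL_MAGIC = 2049  # 0x00000801: unsigned bytes, 1 dimension


def parse_idx_images(data: bytes) -> Tuple[int, int, List[bytes]]:
    """Return (rows, cols, images) from an uncompressed IDX image file."""
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError(f"unexpected image magic number: {magic}")
    size = rows * cols
    if len(data) != 16 + count * size:
        raise ValueError("image file length does not match its header")
    images = [data[16 + i * size:16 + (i + 1) * size] for i in range(count)]
    return rows, cols, images


def parse_idx_labels(data: bytes) -> List[int]:
    """Return the label list from an uncompressed IDX label file."""
    magic, count = struct.unpack(">II", data[:8])
    if magic != LABEL_MAGIC:
        raise ValueError(f"unexpected label magic number: {magic}")
    if len(data) != 8 + count:
        raise ValueError("label file length does not match its header")
    return list(data[8:])


# Synthetic two-image file (2x2 pixels each) instead of a real download.
image_file = struct.pack(">IIII", IMAGE_MAGIC, 2, 2, 2) + bytes(range(8))
label_file = struct.pack(">II", LABEL_MAGIC, 2) + bytes([7, 3])
rows, cols, images = parse_idx_images(image_file)
labels = parse_idx_labels(label_file)
print(rows, cols, [list(image) for image in images], labels)
```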
Step 3: Model Training — Building a model
Step 3: Model Training
Step 3.5: Model Training and Saving
Step 4: Model Cataloguing
DIY:
● Instrument training pipeline
● Store metadata
● Zip model and metadata
● Store in S3
● Or push to Artifactory
● Or push to git
ModelDB:
Python DSL:
- Sync Model
- Sync Test data
- Sync metrics
Nice UI
Hydrosphere.io:
$ hs upload /models/mnist/
$ hs profile push /data/mnist/
Hydrosphere.io cataloguing steps:
● Version
● Extract metadata
● Build model Docker image
● Store in Docker Registry
Step 5: Model Deployment
DIY:
● Implement model server (Flask app)
● Look up the model
● Dockerize
● Add Kube configs, tags
● Expose API (HTTP, gRPC, batch, streaming)
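To give a sense of what "implement model server" involves in the DIY route, here is a minimal HTTP prediction endpoint. Note the substitution: the slide names Flask, while this sketch uses Python's standard-library `http.server` so that it runs without extra packages. The `/predict` path, the JSON shape, and the argmax `predict` function are all assumptions for the demo.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(inputs):
    """Hypothetical stand-in for a real model: index of the largest input."""
    return max(range(len(inputs)), key=inputs.__getitem__)


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["inputs"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def call(port, inputs):
    """POST inputs to the local server and return the decoded JSON reply."""
    connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        connection.request("POST", "/predict",
                           body=json.dumps({"inputs": inputs}).encode(),
                           headers={"Content-Type": "application/json"})
        return json.loads(connection.getresponse().read())
    finally:
        connection.close()


server = HTTPServer(("127.0.0.1", 0), PredictHandler)  # port 0: OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
result = call(server.server_port, [0.1, 0.7, 0.2])
server.shutdown()
server.server_close()
print(result)
```

Everything else on the DIY list (model lookup, container image, Kubernetes configuration, additional protocols) still has to be built around a server like this.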
Niche tools:
TensorFlow Serving
PyTorch Serving
Nvidia TensorRT Serving
Hydrosphere.io:
$ hs apply -f - << EOF
kind: Application
name: "MyPredictionApp"
singular:
  model: mnist:1
  runtime: "serving-runtime-python:1.7.0-latest"
EOF
● Metadata, runtime, model
● Model launched on Kube
● HTTP, gRPC, Kafka API
Step 6: Model Integration Testing
DIY:
● Implement testing script
● Dockerize, add to Kube
● Replay a golden data set
● Replay edge cases
● Replay recent traffic
● Assert results
Hydrosphere Serving (Q2 2019)
$ hs test -f /test/dataset
$ hs test replay anomalies
$ hs test replay <from_date>
Step 7: Production Inference
Step 8: Model Performance Monitoring
Step 9: Model Maintenance
Figure labels: alert; accuracy drops; data changed; what exactly?
Step 9: Model Maintenance - explainability of monitoring alert
Deep Dive into Workflow Automation
Part 2: Defining Kubeflow Pipeline
Defining Kubeflow Pipeline
Parametrizing function
Stage 1: Defining Downloading Container
Stage 1: Mounting Volumes
Stage 2: Defining Training Container
Stage 3: Defining Uploading Container
Stage 4: Defining Deploying Container
Stage 5: Defining Testing Container
Stage 6: Defining Cleaning Container
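The six stages above form a chain of containers. The pipeline code itself is not reproduced here, so the following is a pure-Python stand-in (not the Kubeflow Pipelines SDK) that only models the stage graph. It assumes each stage depends on the one before it; only the download image name appears earlier in this deck, and the other image names are hypothetical placeholders.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Stage name -> container image. Only the download image is from the deck;
# the remaining names are hypothetical.
STAGES = {
    "download": "{username}/mnist-pipeline-download",
    "train": "{username}/mnist-pipeline-train",
    "upload": "{username}/mnist-pipeline-upload",
    "deploy": "{username}/mnist-pipeline-deploy",
    "test": "{username}/mnist-pipeline-test",
    "clean": "{username}/mnist-pipeline-clean",
}

# Stage -> set of stages that must finish first (assumed linear chain).
DEPENDENCIES = {
    "download": set(),
    "train": {"download"},
    "upload": {"train"},
    "deploy": {"upload"},
    "test": {"deploy"},
    "clean": {"test"},
}


def execution_order(dependencies: dict) -> list:
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(DEPENDENCIES)
for position, stage in enumerate(order, start=1):
    print(f"Stage {position}: {stage} -> {STAGES[stage]}")
```

In an actual pipeline definition, each entry would become a container step and each dependency an explicit ordering constraint between steps, which the compiler turns into the workflow YAML submitted later.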
Compiling Pipeline
$ python pipeline.py pipeline.tar.gz
$ tar -xvf pipeline.tar.gz # produces pipeline.yaml
Executing Pipeline with a single command
$ argo submit pipeline.yaml --watch
Executing Pipeline with UI
Source code
https://github.com/Hydrospheredata/hydro-serving-kubeflow-demo
Contact Us
GENERAL INQUIRIES
hydrosphere.io
info@hydrosphere.io
linkedin.com/company/hydrospherebigdata
twitter.com/hydrospheredata
facebook.com/hydrosphere.io
ADDRESS
125 University Avenue, Suite 290
Palo Alto, CA, 94301
tel: 650-521-7875
BUSINESS AND TECHNICAL
Stepan Pushkarev
spushkarev@hydrosphere.io
Ilnur Garifullin
igarifullin@provectus.com

Hydrosphere.io for ODSC: Webinar on Kubeflow

  • 1. Train and deliver machine learning models to production with a single command STEPAN PUSHKAREV ILNUR GARIFULLIN
  • 2. Today’s webinar overview 1. Machine Learning Workflow 2. Tools overview a. Kubeflow b. Hydrosphere.io 3. Deep Dive into Automation a. Steps definition b. Steps automation
  • 4. ML Workflow 1. Research 2. Data Preparation 3. Model Training 4. Model Cataloguing 5. Model Deployment 6. Model Integration Testing 7. Production Inferencing 8. Model Performance Monitoring 9. Model Maintenance
  • 5. Step 1: Research ● Defining an objective ● Defining requirements ● Defining methods ● Defining data sources
  • 6. Step 2: Data Preparation ● Collecting data ● Preparing data ○ Cleaning ○ Feature engineering ○ Transformation ● Important! To be reused for Inferencing.
  • 7. Step 3: Model Training ● Building the model ● Training the model ● Evaluating the model ● Tuning hyper-parameters ● Versioning training data
  • 8. Step 4: Model Cataloguing ● Metadata extraction ○ Graph definition ○ Weights ○ Training data version / stats ○ Other dependencies (look_up vocabulary, etc.) ● Indexing model’s binaries ● Versioning a model artifact ● Storing a model in Repository
  • 9. Step 5: Model Deployment ● Preparing infrastructure for the model ● Preparing runtime for the model ● Deploying the model server ● Exposing API endpoints to the model ● Model Integration
  • 10. Step 6: Model Integration Testing ● Performing integration tests ● Replaying a golden data set ● Replaying edge cases ● Replaying recent traffic ● Asserting results
  • 11. Step 7: Production Inferencing ● A/B & Canary deployment ● Model scaling
  • 12. Step 8: Model Performance Monitoring ● System metrics monitoring ● Model metrics tracking ● Model comparison ● Concept drift monitoring ● Anomaly detection ● Data profiling
  • 13. Step 9: Model Maintenance ● Alerts & Troubleshooting ● Root Cause Analysis ● Edge Case Exploration ● Retraining Dataset Subsampling ● Retraining
  • 15. Hydrosphere.io: The Machine Learning Model Management Platform. Kubeflow: The Machine Learning Toolkit for Kubernetes.
  • 16. What is Kubeflow? ● Began as a Kubernetes template / blueprint for running TensorFlow ● Evolved into a “Toolkit”: loosely coupled tools and blueprints for ML on Kubernetes
  • 17. What is Hydrosphere.io? Hydrosphere.io is a platform for ML model management. - A specific value-add “tool” that is part of the toolkit - Open source - Augments Cataloguing, Deployment, Inferencing, Monitoring and Maintenance
  • 18. Tools Landscape [diagram]. Workflow steps: Research, Data Prep, Training, Cataloguing, Deployment, Integration Testing, Production Inferencing, Performance Monitoring, Model Maintenance. Tools shown: Orchestrate, ModelDB
  • 19. Deep Dive into Workflow Automation Part 1: Creating executables
  • 21. Step 1: Research (MNIST) ● Objective – given an image of a handwritten digit, predict which digit it is; ● Requirements – easy model export; ● Tools and Methods – TensorFlow Estimator API; ● Data – MNIST dataset
  • 22. Step 2: Data Preparation
  • 23. Step 2: Data Preparation - Building Container
      Dockerfile:
        FROM python:3.6-slim
        RUN pip install numpy==1.14.3 Pillow==5.2.0
        ADD ./download.py /src/
        WORKDIR /src/
        ENTRYPOINT [ "python", "download.py" ]
      Build and push:
        $ docker build -t {username}/mnist-pipeline-download .
        $ docker push {username}/mnist-pipeline-download
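
The transcript does not include the contents of download.py. As a rough illustration of the kind of work such a script might do after fetching the MNIST archives, the sketch below decodes the IDX file format that MNIST is distributed in. The function names are hypothetical and only the standard library is used, so this is not the presenters' code.

```python
import gzip
import struct


def _maybe_gunzip(raw: bytes) -> bytes:
    """Decompress the payload if it starts with the gzip magic bytes."""
    return gzip.decompress(raw) if raw[:2] == b"\x1f\x8b" else raw


def parse_idx_images(raw: bytes):
    """Decode an IDX3 image file into a list of flat pixel lists (one per image)."""
    raw = _maybe_gunzip(raw)
    # Header: four big-endian unsigned 32-bit ints (magic, count, rows, cols).
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != 2051:
        raise ValueError(f"not an IDX3 image file (magic={magic})")
    size = rows * cols
    pixels = raw[16:16 + count * size]
    if len(pixels) != count * size:
        raise ValueError("truncated image data")
    return [list(pixels[i * size:(i + 1) * size]) for i in range(count)]


def parse_idx_labels(raw: bytes):
    """Decode an IDX1 label file into a list of integer labels."""
    raw = _maybe_gunzip(raw)
    # Header: two big-endian unsigned 32-bit ints (magic, count).
    magic, count = struct.unpack(">II", raw[:8])
    if magic != 2049:
        raise ValueError(f"not an IDX1 label file (magic={magic})")
    labels = raw[8:8 + count]
    if len(labels) != count:
        raise ValueError("truncated label data")
    return list(labels)
```

In a container like the one above, the decoded arrays would typically be written to a mounted volume so that the training stage can read them.
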
  • 24. Step 3: Model Training — Building a model
  • 25. Step 3: Model Training
  • 26. Step 3.5: Model Training and Saving
  • 27. Step 4: Model Cataloguing. DIY: instrument training pipeline; store metadata; zip model and metadata; store in S3, or push to Artifactory, or push to git
  • 28. Step 4: Model Cataloguing. DIY: instrument training pipeline; store metadata; zip model and metadata; store in S3, or push to Artifactory, or push to git. ModelDB: Python DSL (sync model, sync test data, sync metrics); nice UI
  • 29. Step 4: Model Cataloguing. DIY: instrument training pipeline; store metadata; zip model and metadata; store in S3, or push to Artifactory, or push to git. ModelDB: Python DSL (sync model, sync test data, sync metrics); nice UI. Hydrosphere.io:
        $ hs upload /models/mnist/
        $ hs profile push /data/mnist/
  • 30. Step 4: Model Cataloguing. Steps shown: version; extract metadata; build model Docker image; store in Docker Registry. Hydrosphere.io:
        $ hs upload /models/mnist/
        $ hs profile push /data/mnist/
  • 31. Step 4: Model Cataloguing
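
The DIY cataloguing path listed on the preceding slides (store metadata, zip the model together with its metadata, then push the archive to a repository) can be sketched in a few lines. This is a minimal illustration under stated assumptions: `catalogue_model`, the `metadata.json` file name, and the content-hash versioning scheme are hypothetical choices, and the final upload to S3, Artifactory, or git is left out.

```python
import hashlib
import json
import zipfile
from pathlib import Path


def catalogue_model(model_dir, metadata, out_dir):
    """Zip a model directory together with a metadata record and return the archive path.

    The version is derived from a hash of the model files, so identical
    artifacts always receive the same version string.
    """
    model_dir, out_dir = Path(model_dir), Path(out_dir)
    files = sorted(p for p in model_dir.rglob("*") if p.is_file())

    # Hash relative paths and contents in a stable order.
    digest = hashlib.sha256()
    for path in files:
        digest.update(path.relative_to(model_dir).as_posix().encode())
        digest.update(path.read_bytes())
    version = digest.hexdigest()[:12]

    names = [p.relative_to(model_dir).as_posix() for p in files]
    record = dict(metadata, version=version, files=names)

    out_dir.mkdir(parents=True, exist_ok=True)
    archive = out_dir / f"model-{version}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path, name in zip(files, names):
            zf.write(path, arcname=name)
        zf.writestr("metadata.json", json.dumps(record, indent=2, sort_keys=True))
    return archive
```

A call such as `catalogue_model("/models/mnist", {"name": "mnist"}, "/tmp/repo")` would produce a single versioned archive that can then be uploaded by whatever storage client the team already uses.
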
  • 32. Step 5: Model Deployment. DIY: implement model server (Flask app); look up the model; Dockerize; add Kube configs and tags; expose API (HTTP, gRPC, batch, streaming)
  • 33. Step 5: Model Deployment. DIY: implement model server (Flask app); look up the model; Dockerize; add Kube configs and tags; expose API (HTTP, gRPC, batch, streaming). Niche tools: TensorFlow Serving, PyTorch Serving, Nvidia TensorRT Serving
  • 34. Step 5: Model Deployment. DIY: implement model server (Flask app); look up the model; Dockerize; add Kube configs and tags; expose API (HTTP, gRPC, batch, streaming). Niche tools: TensorFlow Serving, PyTorch Serving, Nvidia TensorRT Serving. Hydrosphere.io:
        $ hs apply -f - << EOF
        kind: Application
        name: "MyPredictionApp"
        singular:
          model: mnist:1
          runtime: "serving-runtime-python:1.7.0-latest"
        EOF
  • 35. Step 5: Model Deployment. Hydrosphere.io:
        $ hs apply -f - << EOF
        kind: Application
        name: "MyPredictionApp"
        singular:
          model: mnist:1
          runtime: "serving-runtime-python:1.7.0-latest"
        EOF
      Diagram labels: metadata, runtime, model; model launched on Kube; HTTP, gRPC, Kafka API
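
For comparison with the one-command route, the DIY "implement model server" step might look roughly like the sketch below. The slides name Flask; this version deliberately swaps in the standard library's `http.server` so it has no dependencies. The `/predict` route, the JSON payload shape, and the placeholder `predict` function are all assumptions for illustration, not part of the talk.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(pixels):
    """Placeholder standing in for a real model; it is NOT a digit classifier."""
    return int(sum(pixels)) % 10


def handle_predict(body: bytes):
    """Validate a JSON request body and return (status_code, response_dict)."""
    try:
        pixels = json.loads(body)["pixels"]
        valid = (
            isinstance(pixels, list)
            and len(pixels) == 784  # 28 x 28 MNIST image, flattened
            and all(isinstance(p, (int, float)) for p in pixels)
        )
        if not valid:
            raise ValueError("bad pixels")
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "expected JSON like {\"pixels\": [784 numbers]}"}
    return 200, {"digit": predict(pixels)}


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, result = handle_predict(self.rfile.read(length))
        data = json.dumps(result).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8080):
    """Blocking entry point; a container would call this from its ENTRYPOINT script."""
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

Even this small server still needs a Dockerfile, Kubernetes configs, and separate work for gRPC or streaming endpoints, which is the overhead the DIY column on the slides is pointing at.
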
  • 36. Step 6: Model Integration Testing. DIY: implement testing script; Dockerize, add to Kube; replay a golden data set; replay edge cases; replay recent traffic; assert results
  • 37. Step 6: Model Integration Testing. DIY: implement testing script; Dockerize, add to Kube; replay a golden data set; replay edge cases; replay recent traffic; assert results
  • 38. Step 6: Model Integration Testing. DIY: implement testing script; Dockerize, add to Kube; replay a golden data set; replay edge cases; replay recent traffic; assert results. Hydrosphere Serving (Q2 2019):
        $ hs test -f /test/dataset
        $ hs test replay anomalies
        $ hs test replay <from_date>
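
The DIY testing script described in Step 6 boils down to replaying recorded inputs against the deployed model and asserting on the results. A minimal sketch follows, assuming the model is reachable through some `predict` callable (for example, a thin wrapper around an HTTP client); `replay` and `assert_replay` are hypothetical names.

```python
def replay(predict, samples):
    """Run every (inputs, expected) pair through predict and collect mismatches."""
    failures = []
    total = 0
    for index, (inputs, expected) in enumerate(samples):
        total += 1
        actual = predict(inputs)
        if actual != expected:
            failures.append({"index": index, "expected": expected, "actual": actual})
    accuracy = (total - len(failures)) / total if total else 0.0
    return {"total": total, "accuracy": accuracy, "failures": failures}


def assert_replay(report, min_accuracy):
    """Fail the integration test if replay accuracy is below the threshold."""
    if report["accuracy"] < min_accuracy:
        raise AssertionError(
            f"accuracy {report['accuracy']:.3f} below {min_accuracy:.3f}; "
            f"{len(report['failures'])} of {report['total']} samples failed"
        )
```

The same two functions can serve a golden data set, a list of edge cases, or a window of recent traffic; only the `samples` source and the threshold change.
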
  • 39. Step 7: Production Inference
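
The workflow overview lists A/B and canary deployment under production inferencing. One common way to implement the traffic split, shown here only as a sketch, is to hash a request identifier into a bucket so that a fixed share of traffic reaches the new version. `choose_version` is a hypothetical helper, and `mnist:2` is an assumed name for a newer model version (only `mnist:1` appears in the slides).

```python
import hashlib


def choose_version(request_id, canary_percent, stable="mnist:1", canary="mnist:2"):
    """Deterministically route a request to the stable or the canary model.

    canary_percent is a number between 0 and 100. The same request_id always
    maps to the same version, which keeps a user's experience consistent.
    """
    digest = hashlib.sha256(request_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10000  # 0..9999
    return canary if bucket < canary_percent * 100 else stable
```

Raising `canary_percent` step by step (for example 1, 10, 50, 100) while watching the monitoring metrics is the usual rollout pattern; an A/B test would use the same split but keep both versions running for comparison.
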
  • 40. Step 8: Model Performance Monitoring
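
Among the monitoring tasks listed for Step 8 are data profiling and concept drift monitoring. The sketch below shows a much simpler proxy: it profiles one numeric input feature on the training data and flags a production batch whose mean has shifted by an unusual amount. This detects input (data) drift rather than true concept drift, and the function names and the threshold of 3 are illustrative assumptions.

```python
import math
import statistics


def build_profile(values):
    """Summarise a training feature by its mean and population standard deviation."""
    return {"mean": statistics.mean(values), "stdev": statistics.pstdev(values)}


def drift_score(profile, batch):
    """Return how many standard errors the batch mean lies from the training mean."""
    if not batch:
        raise ValueError("batch must not be empty")
    shift = abs(statistics.mean(batch) - profile["mean"])
    if profile["stdev"] == 0:
        return 0.0 if shift == 0 else math.inf
    standard_error = profile["stdev"] / math.sqrt(len(batch))
    return shift / standard_error


def has_drifted(profile, batch, threshold=3.0):
    """Flag the batch when its mean is more than `threshold` standard errors away."""
    return drift_score(profile, batch) > threshold
```

In practice a check like this would run per feature on a schedule, and a positive result would raise the kind of alert that Step 9 then has to explain.
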
  • 41. Step 9: Model Maintenance. Diagram labels: alert; accuracy drops; data changed; what exactly?
  • 42. Step 9: Model Maintenance - explainability of monitoring alert
  • 43. Deep Dive into Workflow Automation Part 2: Defining Kubeflow Pipeline
  • 48. Stage 1: Defining Downloading Container
  • 49. Stage 1: Mounting Volumes
  • 50. Stage 2: Defining Training Container
  • 51. Stage 2: Defining Training Container
  • 52. Stage 3: Defining Uploading Container
  • 53. Stage 4: Defining Deploying Container
  • 54. Stage 5: Defining Testing Container
  • 55. Stage 6: Defining Cleaning Container
  • 56. Compiling Pipeline
        $ python pipeline.py pipeline.tar.gz
        $ tar -xvf pipeline.tar.gz  # produces pipeline.yaml
  • 57. Executing Pipeline with a single command $ argo submit pipeline.yaml --watch
  • 60. Contact Us
      General inquiries: hydrosphere.io, info@hydrosphere.io, linkedin.com/company/hydrospherebigdata, twitter.com/hydrospheredata, facebook.com/hydrosphere.io
      Address: 125 University Avenue, Suite 290, Palo Alto, CA 94301; tel: 650-521-7875
      Business and technical: Stepan Pushkarev (spushkarev@hydrosphere.io), Ilnur Garifullin (igarifullin@provectus.com)