
Training And Serving ML Model Using Kubeflow by Jayesh Sharma


We will walk through the exploration, training and serving of a machine learning model by leveraging Kubeflow's main components. We will use Jupyter notebooks on the cluster to train the model and then introduce Kubeflow Pipelines to chain all the steps together, to automate the entire process.


  1. Training and Serving ML models using Kubeflow • Jayesh Sharma
  2. Machine Learning Stages (@aronchick)
  3. Make it easy for everyone to develop, deploy, and manage portable, scalable ML everywhere
  4. Why Kubeflow?
     ● Composability
       ○ Choose from existing popular tools
     ● Portability
       ○ Build using cloud-native, portable Kubernetes APIs
     ● Scalability
       ○ TF already supports CPU/GPU/distributed training
       ○ Kubernetes scales to 5,000 nodes with the same stack
  5. What’s in the Box?
     ● JupyterHub for collaborative & interactive training
     ● A TensorFlow training controller
     ● A TensorFlow Serving deployment
     ● Argo for workflows
     ● Much more
  6. What’s in the Box? (diagram)
  7. Kubeflow today
  8. Kubeflow is composable
     ● Training
       ○ Perform distributed training with TF-Jobs
       ○ Run pipelines with regular containers as steps
       ○ Run pipelines with TF-Jobs and other CRDs as steps
     ● Serving
       ○ KFServing, Seldon Core
       ○ Azure ML Service and other frameworks
  9. Kubeflow Architecture
  10. TF-Job: Distributed Training
      A distributed TensorFlow job typically contains 0 or more of the following processes:
      ● Chief: responsible for orchestrating training and performing tasks like checkpointing the model.
      ● PS: the parameter servers; these provide a distributed data store for the model parameters.
      ● Worker: the workers do the actual work of training the model. In some cases, worker 0 might also act as the chief.
      ● Evaluator: the evaluators can be used to compute evaluation metrics as the model is trained.
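These roles map onto the TF_CONFIG environment variable that distributed TensorFlow reads, and which the TF-Job operator injects into each pod. A minimal sketch of what such a cluster spec looks like for one of the worker processes (host names are illustrative placeholders, not from the talk):

```python
import json
import os

# Hypothetical cluster: one chief, one parameter server, two workers.
# The TF-Job operator generates a spec of this shape for every pod it creates.
cluster = {
    "chief": ["trainer-chief-0:2222"],
    "ps": ["trainer-ps-0:2222"],
    "worker": ["trainer-worker-0:2222", "trainer-worker-1:2222"],
}

# Each process additionally gets its own role and index, e.g. worker 0:
tf_config = {"cluster": cluster, "task": {"type": "worker", "index": 0}}
os.environ["TF_CONFIG"] = json.dumps(tf_config)

print(os.environ["TF_CONFIG"])
```

TensorFlow's distribution strategies read this variable at startup to decide which role the current process plays.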
  11. An example TF-Job YAML (annotated figure; callouts: parameter-server option, worker specification, image with your code, command to begin training)
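The annotated manifest on this slide is not reproduced in the export. A minimal TFJob manifest along the lines the callouts describe might look like the sketch below (API version, names, image, and command are illustrative assumptions):

```yaml
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: mnist-train                # placeholder name
spec:
  tfReplicaSpecs:
    PS:                            # parameter-server option
      replicas: 1
      template:
        spec:
          containers:
            - name: tensorflow
              image: myregistry/mnist-train:latest   # image with your code
              command: ["python", "/opt/train.py"]   # command to begin training
    Worker:                        # worker specification
      replicas: 2
      template:
        spec:
          containers:
            - name: tensorflow
              image: myregistry/mnist-train:latest
              command: ["python", "/opt/train.py"]
```

Applying a manifest like this with kubectl causes the TF-Job operator to create the PS and Worker pods and inject the matching TF_CONFIG into each.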
  12. Kubeflow Pipelines
      ● A user interface (UI) for managing and tracking experiments, jobs, and runs
      ● An engine for scheduling multi-step ML workflows
      ● An SDK for defining and manipulating pipelines and components
      ● Notebooks for interacting with the system using the SDK
  13. Anatomy of a pipeline
      ● Containerized implementations of ML tasks
        ○ Pre-built components: just provide params or code snippets; or create your own components from code or libraries
        ○ Use any runtime, framework, or data types
        ○ Attach k8s objects: volumes, secrets
      ● Specification of the sequence of steps
        ○ Specified via a Python DSL
        ○ Inferred from data dependencies on input/output
      ● Input parameters
        ○ A “run” = a pipeline invoked with specific parameters
      ● Schedules
        ○ Invoke a single run or create a recurring scheduled pipeline
