Scaling AI and Machine Learning with
Containers and Kubernetes
Global Big Data Conference
Boston, Oct 1-3, 2019
Tushar Katarki
OpenShift Product Manager - Lead for AI/ML
Red Hat
Outline
● Scaling challenges in AI/ML
● Addressing the challenges with containers, Kubernetes and more
● Open Data Hub - A community open source project
● Putting it together:
○ Self-service cloud like experience
○ From Experimentation to production continuously with CI/CD
● Summary
● Resources
Scaling
Challenges
Unable to easily share and collaborate,
iteratively and rapidly
Access to data is bespoke, manual and time
consuming
No on-demand access to ML tools and
frameworks and compute infrastructure
Models remain prototypes and do not go
into production
Reproducing, tracking and explaining results of
AI/ML is hard
IMPACT
Speed, efficiency and productivity of teams
Frustration and lack of satisfaction
The promise of AI/ML to the business is not
redeemed
What do Data Scientists want?
Inferencing
Perform ML
Modelling
Self service portal to select
ML frameworks, data access
Deployment in
production
As a Data Scientist, I want a “self-service
cloud like” experience for my Machine
Learning projects, where I can access a
rich set of modelling frameworks, data,
and computational resources, share and
collaborate with colleagues, and deliver
my work into production with speed,
agility and repeatability to drive
business value!
How do we address this?
Look no further: we have already done this with application
software development and delivery
Cloud
Microservices
Containers
CI/CD
Agile
How do we bring this to the world of AI?
Source: http://www1.semi.org/en/semi-arizona-forum-artificial-intelligence-machine-learning-deep-learning-applications-0
Kubernetes
Containers
Are the basic units that make
AI/ML programs shareable and
portable across hybrid cloud
Choice: Containers contain all your ML
frameworks and tools
Sharing: Container images can be shared
and iterated in flexible ways
Immutable & Portable: Containerize once and run
anywhere with integrity
Versioning: Incremental changes are tracked
Fast & Efficient: They are Linux processes!
Security: Process isolation and resource
control
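The "Immutable" and "Versioning" points can be illustrated with content digests. Container registries identify image content by SHA-256 digest, so any change to a layer yields a new image identity. The sketch below is only a simplified model of that idea: the manifest layout, the layer contents and the function names are hypothetical and do not follow the real image format.

```python
import hashlib
import json


def layer_digest(content: bytes) -> str:
    """Return a content digest in the "sha256:<hex>" form."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def image_digest(layer_digests):
    """Digest of a simplified, hypothetical manifest listing the layers in order."""
    manifest = json.dumps({"layers": list(layer_digests)}, sort_keys=True).encode()
    return layer_digest(manifest)


# Hypothetical layers: a runtime, an ML framework, and the data scientist's code.
base = layer_digest(b"python runtime")
frameworks = layer_digest(b"hypothetical ML framework layer")

v1 = image_digest([base, frameworks])
v2 = image_digest([base, frameworks, layer_digest(b"new notebook code")])

print(v1 != v2)                               # adding a layer changes the identity
print(image_digest([base, frameworks]) == v1)  # same content, same identity
```

Because the identity is derived from the content, two people pulling the same digest get the same bits, which is what makes sharing and reproducing an environment practical.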
Kubernetes
Kubernetes centralizes compute resources
and provides a cloud experience across the
data center, cloud and edge
Provides resource management for compute
resources
Kubernetes provides workload scheduling
and management
Kubernetes provides multi tenancy and
enforces quotas
Networking and storage abstractions
Kubernetes is the de facto container
platform for the hybrid cloud
Foundation of the AI platform for
Hybrid Cloud
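The multi-tenancy and quota point can be sketched as a simple admission check. In a real cluster this is done by Kubernetes itself (for example through ResourceQuota objects per namespace); the code below is only a toy model of the decision, and the resource names and limits are hypothetical.

```python
# Hypothetical per-team limits, loosely modelled on a namespace quota.
TEAM_QUOTA = {"cpu": 16, "memory_gib": 64, "gpu": 2}


def fits(quota, used, request):
    """Return True if the request can be admitted without exceeding any limit.

    Resources that the quota does not mention are not restricted in this sketch.
    """
    return all(
        used.get(name, 0) + request.get(name, 0) <= limit
        for name, limit in quota.items()
    )


used = {"cpu": 12, "memory_gib": 32, "gpu": 1}

training_job = {"cpu": 4, "memory_gib": 16, "gpu": 1}
big_job = {"cpu": 4, "memory_gib": 16, "gpu": 2}

print(fits(TEAM_QUOTA, used, training_job))  # within every limit
print(fits(TEAM_QUOTA, used, big_job))       # would exceed the GPU limit
```

The design point is that the check happens centrally, before scheduling, so one team's training run cannot starve another team's workloads.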
Self-service,
Automation,
CI/CD
Boosts speed, efficiency and
productivity
JupyterHub and Jupyter Notebooks running on
Kubernetes form the basis for Self-service
Source-2-image automatically converts a
notebook into a container image that is ready to be
deployed
Kubernetes Operators provide automation and
lifecycle management for the containers
CI/CD makes rapid, incremental and iterative
change possible; Open source technologies such
as Argo, Tekton, Jenkins and Spinnaker in
conjunction with Kubernetes make this happen
‘Serverless’ technologies such as Knative will
enable AI/ML users to spend more time developing
their models
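The automation that Operators provide rests on a reconcile loop: compare the desired state with the observed state and act on the difference. The following is a minimal sketch of that pattern in plain Python, not the API of any Operator framework; the component names and the `replicas` field are hypothetical.

```python
def reconcile(desired, observed):
    """Return the actions needed to move the observed state to the desired state."""
    actions = []
    for name, spec in desired.items():
        if name not in observed:
            actions.append(("create", name))
        elif observed[name] != spec:
            actions.append(("update", name))
    for name in observed:
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Hypothetical desired and observed states for a data science stack.
desired = {"jupyterhub": {"replicas": 1}, "spark": {"replicas": 3}}
observed = {"spark": {"replicas": 2}, "old-job": {"replicas": 1}}

for action in reconcile(desired, observed):
    print(action)
```

A real Operator runs this comparison continuously against the cluster API, which is why a deleted or misconfigured component is restored without manual intervention.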
Data
Engineering
Easy, self-service and
repeatable
Data sources: Kubernetes Persistent Volumes and
S3 object stores make access to storage easy and
standardized
Data pipes: Kubernetes Networking and
ServiceMesh provide secure, high-bandwidth,
low-latency data connectivity
Data streaming and manipulation: Tools such as
Spark, Kafka, Presto, etc. can run natively and can be
accessed as a service
Data governance: With open source technologies
like Open Policy Agent (OPA)
Deploying into
production
To deliver business value and
redeem the promise of AI in the
enterprise
Containerize models and expose the service
with a REST API using the microservices
pattern - ServiceMesh (such as Istio) makes
this easy!
Models are incorporated in a data pipeline
Jobs (batch or real-time) with tools such as
Spark, Kafka and Argo
Models are delivered into existing application
workflow as binaries: PMML, ONNX, Pickle
Monitoring model performance and drift with
open source tools native to Kubernetes:
Prometheus and Grafana
CI/CD to drive continuous change and
improvement in production
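Exposing a model as a REST service can be sketched with the Python standard library alone. This is an illustration of the pattern rather than a production setup: the linear "model", its weights, the JSON shape (`features`, `prediction`) and the port are all hypothetical, and a real deployment would package this in a container image and load a trained model.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for a trained model: a fixed linear function.
WEIGHTS = [0.5, -0.25]
BIAS = 1.0


def predict(features):
    """Score one feature vector with the stand-in model."""
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))


def score(body: bytes):
    """Turn a JSON request body into an (HTTP status, JSON response body) pair."""
    try:
        features = json.loads(body)["features"]
        if len(features) != len(WEIGHTS):
            raise ValueError("wrong number of features")
        result = predict(features)
    except (ValueError, KeyError, TypeError):
        return 400, json.dumps({"error": "expected {'features': [x1, x2]}"}).encode()
    return 200, json.dumps({"prediction": result}).encode()


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, payload = score(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(port=8080):
    """Build the HTTP server; call .serve_forever() on the result to start it."""
    return HTTPServer(("0.0.0.0", port), Handler)
```

Keeping `score` separate from the HTTP handler means the request logic can be tested without opening a socket, which fits the CI/CD flow described in the talk.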
OpenShift - Enterprise Distro of Kubernetes
[Diagram: ANY CONTAINER; layers: APPLICATION LIFECYCLE MANAGEMENT, ENTERPRISE CONTAINER HOST, CONTAINER ORCHESTRATION AND MANAGEMENT (KUBERNETES); ANY INFRASTRUCTURE: Laptop, Datacenter, OpenStack, Amazon Web Services, Microsoft Azure, Google Cloud]
OpenShift Abstraction Layers
[Diagram: Red Hat Enterprise Linux or Red Hat CoreOS; Kubernetes; Automated Operations with Operators; CaaS and PaaS; Best IT Ops Experience and Best Developer Experience]
Application Services: Middleware, Service Mesh, Functions, ISV
Cluster Services: Metrics, Chargeback, Registry, Logging
Developer Services: Dev Tools, Automated Builds, CI/CD, IDE
OpenShift Architecture for AI/ML
[Diagram: MASTER (API/AUTHENTICATION, DATA STORE, SCHEDULER, HEALTH/SCALING); RHEL NODEs running containers; SERVICE LAYER; ROUTING LAYER; PERSISTENT STORAGE; REGISTRY; EXISTING AUTOMATION TOOLSETS; SCM (GIT); CI/CD; RED HAT ENTERPRISE LINUX; PHYSICAL, VIRTUAL, PRIVATE, PUBLIC, HYBRID]
DATA SCIENTIST
Deploy ML on any cloud
Expose ML as services, load balanced and scalable
Compute Resources on-demand
Best of SDLC
ML in Production
Open Data Hub Community Project
● Meta-Project that includes best of open source AI projects
● Derives from Red Hat’s internal Data Science and AI platform
● Serves as Reference Architecture for AI on OpenShift
● Growing ecosystem of data science tools and ISVs
Data
Acquisition & Preparation
ML Model
Selection, Training, Testing
ML Model Deployment in
App. Dev. Process
Open Data Hub v0.4
Now available on opendatahub.io
● Unified analytics engine
● Large-scale data
● Runs on Kubernetes
● Multi-user Jupyter
● Used for data science and research
● Monitoring and alerting toolkit
● Records numeric time series data
● Used to diagnose problems
● Analytics platform for all metrics
● Query, visualize and alert on metrics
● Deploying machine learning models on Kubernetes
● Expose models via REST and gRPC
● Full model lifecycle management
● Distributed Object Store
● S3 Interface
● Distributed event streaming
● Pub/Sub Messaging
Open Data Hub Operator
[Diagram: Operator; Open Data Hub; Deploy and manage lifecycle]
Open Data Hub
Vision and Future
A self-service cloud like experience
Model
test &
iteration
Jupyter Hub
Model deployed
into production
ACCESS TO
DATA
CPUs, GPUs, Memory, NVMe
DATA SCIENTIST
SELF
SERVICE
Compute
Resources
From experimentation to production with CI/CD
Container
DATA SCIENTIST
Source-2-image
Check-in to
source repo
Deploy
notebook
container
Model test &
iteration and
integration
Promote and
Serve models
into production
as services
Continuous monitoring
and change management
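Continuous monitoring of a served model can be illustrated with a simple drift check. The sketch below uses a mean-shift heuristic (how far the recent mean has moved, in baseline standard deviations); this is one basic choice for illustration, not the method used in the talk, and the threshold and sample values are hypothetical. In the setup described earlier, such a score would typically be exported as a metric for Prometheus and plotted in Grafana.

```python
import statistics


def drift_score(baseline, recent):
    """Shift of the recent mean from the baseline mean, in baseline standard deviations."""
    mu = statistics.mean(baseline)
    sigma = statistics.pstdev(baseline)
    shift = abs(statistics.mean(recent) - mu)
    if sigma == 0:
        # A constant baseline: any shift at all counts as unbounded drift.
        return 0.0 if shift == 0 else float("inf")
    return shift / sigma


def needs_review(baseline, recent, threshold=2.0):
    """Flag the model for review when the drift score exceeds the threshold."""
    return drift_score(baseline, recent) > threshold


# Hypothetical prediction values captured at training time and in production.
baseline = [0.0, 1.0, 2.0, 3.0, 4.0]

print(needs_review(baseline, [2.5, 1.5]))  # recent mean matches the baseline
print(needs_review(baseline, [6.0, 6.0]))  # recent mean has moved well away
```

A flag from a check like this is a natural trigger for the "change management" step: retrain, test and promote a new model through the same pipeline.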
Summary
Containers and Kubernetes are foundational
to scaling AI
Also need to think about: Managing data
pipelines, automation and CI/CD, deploying
models into production
OpenShift - Enterprise Kubernetes Distro that
builds on Red Hat Enterprise Linux and
adds services for CI/CD and automation
on top
Open Data Hub - open source community
project and reference architecture for AI/ML
Scaling AI
Resources
OpenShift developer preview: try.openshift.com
OpenDataHub: https://opendatahub.io/
Contacts:
Tushar Katarki: tkatarki@redhat.com
Linkedin: https://www.linkedin.com/in/katarki/
Upcoming:
OpenShift Commons Gathering on AI/ML in San Francisco
KubeCon, Nov 20th 2019 in San Diego - Customer case study for scaling AI/ML with Kubernetes
Thank You

Scaling AI/ML with Containers and Kubernetes

  • 1. Scaling AI and Machine Learning with Containers and Kubernetes Global Big Data Conference Boston, Oct 1-3, 2019 Tushar Katarki OpenShift Product Manager - Lead for AI/ML Red Hat
  • 2. Outline ● Scaling challenges in AI/ML ● Addressing the challenges with containers, kubernetes and more ● Open Data Hub - A community open source project ● Putting it together: ○ Self-service cloud like experience ○ From Experimentation to production continuously with CI/CD ● Summary ● Resources
  • 3. Scaling Challenges Unable to easily share and collaborate, iteratively and rapidly Access to data is bespoke, manual and time consuming No on-demand access to ML tools and frameworks and compute infrastructure Models are remaining prototypes and not going into production Reproducing, tracking and explaining results of AI/ML is hard IMPACT Speed, efficiency and productivity of teams Frustration and lack of satisfaction The promise of AI/ML to the business is not redeemed
  • 4. What do Data Scientists want?
  • 5. Inferencing; Perform ML modelling; Self-service portal to select ML frameworks, data access; Deployment in production. As a Data Scientist, I want a “self-service cloud-like” experience for my Machine Learning projects, where I can access a rich set of modelling frameworks, data, and computational resources, share and collaborate with colleagues, and deliver my work into production with speed, agility and repeatability to drive business value!
  • 6. How do we address this?
  • 7. Look no further: we have done this with application software development and delivery. Cloud, Microservices, Containers, CI/CD, Agile, Kubernetes. How do we bring this to the world of AI? Source: http://www1.semi.org/en/semi-arizona-forum-artificial-intelligence-machine-learning-deep-learning-applications-0
  • 8. Containers are the basic units that make AI/ML programs shareable and portable across the hybrid cloud. Choice: Containers contain all your ML frameworks and tools. Sharing: Container images can be shared and iterated in flexible ways. Immutable & Portable: Containerize once and run anywhere with integrity. Versioning: Incremental changes are tracked. Fast & Efficient: They are Linux processes! Security: Process isolation and resource control.
  • 9. Kubernetes centralizes compute resources and provides a cloud experience across the data center, cloud and edge. Provides resource management for compute resources. Provides workload scheduling and management. Provides multi-tenancy and enforces quotas. Networking and storage abstractions. Kubernetes is the de facto container platform for the hybrid cloud. Foundation of the AI platform for Hybrid Cloud.
  • 10. Self-service, Automation, CI/CD: Boosts speed, efficiency and productivity. JupyterHub and Jupyter Notebooks running on Kubernetes form the basis for self-service. Source-to-Image (S2I) automatically converts a notebook into a container image that is ready to be deployed. Kubernetes Operators provide automation and lifecycle management for the containers. CI/CD makes rapid, incremental and iterative change possible; open source technologies such as Argo, Tekton, Jenkins and Spinnaker, in conjunction with Kubernetes, make this happen. ‘Serverless’ technologies such as Knative will enable AI/ML users to spend more time developing their models.
  • 11. Data Engineering: Easy, self-service and repeatable. Data sources: Kubernetes Persistent Volumes and S3 object stores make access to storage easy and standardized. Data pipes: Kubernetes networking and service mesh provide data connectivity that is high-bandwidth, low-latency and secure. Data streaming and manipulation: Tools such as Spark, Kafka and Presto can run natively and be accessed as a service. Data governance: With open source technologies like Open Policy Agent (OPA).
  • 12. Deploying into production: To deliver business value and redeem the promise of AI in the enterprise. Containerize models and expose the service with a REST API using the microservices pattern; a service mesh (such as Istio) makes this easy! Models are incorporated into data pipeline jobs (batch or real-time) with tools such as Spark, Kafka and Argo. Models are delivered into existing application workflows as binaries: PMML, ONNX, Pickle. Monitor model performance and drift with open source tools native to Kubernetes: Prometheus and Grafana. CI/CD to drive continuous change and improvement in production.
  • 13. OpenShift - Enterprise Distro of Kubernetes. ANY CONTAINER. ANY INFRASTRUCTURE: Amazon Web Services, Microsoft Azure, Google Cloud, OpenStack, Datacenter, Laptop. APPLICATION LIFECYCLE MANAGEMENT; ENTERPRISE CONTAINER HOST; CONTAINER ORCHESTRATION AND MANAGEMENT (KUBERNETES).
  • 14. OpenShift Abstraction Layers. Automated Operations with Operators. Kubernetes. Red Hat Enterprise Linux or Red Hat CoreOS. CaaS, PaaS. Best IT Ops Experience, Best Developer Experience. Application Services: Middleware, Service Mesh, Functions, ISV. Cluster Services: Metrics, Chargeback, Registry, Logging. Developer Services: Dev Tools, Automated Builds, CI/CD, IDE.
  • 15. OpenShift Architecture for AI/ML. Diagram labels: DATA SCIENTIST; EXISTING AUTOMATION TOOLSETS; SCM (GIT); CI/CD; SERVICE LAYER; ROUTING LAYER; PERSISTENT STORAGE; REGISTRY; RHEL NODES (containers); RED HAT ENTERPRISE LINUX; MASTER: API/AUTHENTICATION, DATA STORE, SCHEDULER, HEALTH/SCALING; PHYSICAL, VIRTUAL, PRIVATE, PUBLIC, HYBRID. Callouts: Deploy ML on any cloud. Expose ML as services, load balanced and scalable. Compute resources on demand. Best of SDLC. ML in production.
  • 16. Open Data Hub Community Project ● Meta-project that includes the best of open source AI projects ● Derives from Red Hat’s internal Data Science and AI platform ● Serves as Reference Architecture for AI on OpenShift ● Growing ecosystem of data science tools and ISVs. Stages: Data Acquisition & Preparation; ML Model Selection, Training, Testing; ML Model Deployment in App. Dev. Process.
  • 17. Open Data Hub v0.4 Now available on opendatahub.io ● Unified analytics engine ● Large-scale data ● Runs on Kubernetes ● Multi-user Jupyter ● Used for data science and research ● Monitoring and alerting toolkit ● Records numeric time series data ● Used to diagnose problems ● Analytics platform for all metrics ● Query, visualize and alert on metrics ● Deploying machine learning models on Kubernetes ● Expose models via REST and gRPC ● Full model lifecycle management ● Distributed Object Store ● S3 Interface ● Distributed event streaming ● Pub/Sub Messaging Operator Open Data Hub
  • 18. Open Data Hub Operator: Deploy and manage lifecycle
  • 19. Open Data Hub Vision and Future
  • 20. A self-service cloud-like experience. DATA SCIENTIST; SELF SERVICE; JupyterHub; ACCESS TO DATA; Compute Resources: CPUs, GPUs, Memory, NVMe; Model test & iteration; Model deployed into production.
  • 21. From experimentation to production with CI/CD. DATA SCIENTIST; Check-in to source repo; Source-to-Image; Container; Deploy notebook container; Model test & iteration and integration; Promote and serve models into production as services; Continuous monitoring and change management.
  • 22. Summary: Containers and Kubernetes are foundational to scaling AI. Also need to think about: managing data pipelines, automation and CI/CD, and deploying models into production. OpenShift: Enterprise Kubernetes distro that builds on Red Hat Enterprise Linux and adds services for CI/CD and automation on top. Open Data Hub: open source community project and reference architecture for AI/ML.
  • 23. Resources. OpenShift developer preview (try.openshift.com). Open Data Hub (https://opendatahub.io/). Contacts: Tushar Katarki (tkatarki@redhat.com), LinkedIn (https://www.linkedin.com/in/katarki/). Upcoming: OpenShift Commons Gathering on AI/ML in San Francisco; KubeCon, Nov 20th 2019 in San Diego: customer case study for scaling AI/ML with Kubernetes.
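
Slide 12 recommends containerizing a model and exposing it with a REST API using the microservices pattern. The following is a minimal sketch of what such a service could look like, not the approach used in the talk. It uses only the Python standard library; the `predict` function, its weights, the `/predict` path and the port are all hypothetical placeholders standing in for a real trained model and deployment configuration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model: a fixed linear scorer."""
    weights = [0.5, -0.25, 1.0]  # assumed values, not from the talk
    bias = 0.1
    return sum(w * x for w, x in zip(weights, features)) + bias


class PredictHandler(BaseHTTPRequestHandler):
    """Handles POST /predict with a JSON body like {"features": [1, 2, 3]}."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404, "unknown endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            score = predict(payload["features"])
        except (ValueError, KeyError, TypeError):
            self.send_error(400, "expected a JSON object with a features list")
            return
        body = json.dumps({"score": score}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging in this sketch


def serve(port=8080):
    """Blocks forever; this would be the container's entry point."""
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the service. Packaged in a container image, a process like this could sit behind a Kubernetes Service, with routing, scaling and monitoring handled by the platform rather than by the model code.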