SlideShare a Scribd company logo
1 of 52
Download to read offline
Machine Learning:
From Lab to
Production with
Kubeflow
With Special Guests Tensorflow and an
Apache Spark teaser
Agenda
● About Us
● Background
● Problem: Building Model is Easy, Serving models in Prod is hard.
● What Is KubeFlow?
● Kubeflow’s Design and Core Components
● …
Holden About Me Slides
● Prefered pronouns are she/her
● Developer Advocate at Google
● Apache Spark PMC / ASF member + contributor on lots of other projects
● previously IBM, Alpine, Databricks, Google, Foursquare & Amazon
● co-author of Learning Spark & High Performance Spark
● Twitter: @holdenkarau
● Slide share http://www.slideshare.net/hkarau
● Code review livestreams: https://www.twitch.tv/holdenkarau /
https://www.youtube.com/user/holdenkarau
● Talk Videos http://bit.ly/holdenSparkVideos
● Talk feedback: http://bit.ly/holdenTalkFeedback
● Organizing data track @ IT Next AMS - CFP Open!
Trevor Grant
● From Chicago
● Preferred pronouns he/him
● Various odd jobs around IBM (data scientist/evangelist/janitor)
● PMC Apache Mahout, Streams, Community Development
● Apache Roadshow Chicago [1]
● IoT Track at Apache Con North America [2]
● Blog: rawkintrevo.org
● Twitter: @rawkintrevo
● Github: rawkintrevo
● Shameless self promotion: rawkintrevo.org/shameless-self-promotion/
[1] https://www.apachecon.com/chiroadshow19/
[2] https://www.apachecon.com/acna19/index.html
Kubeflow Salesman:
Kubeflow Salesman:
**Slaps roof of Kubeflow**
THIS BAD BOY CAN FIT SO MANY
BUZZWORDS IN IT
Kubeflow Salesman:
**Slaps roof of Kubeflow**
THIS BAD BOY CAN FIT SO MANY
BUZZWORDS IN IT
Background
Things we thought you might know.
History of Predictive Analytics
Photo: Numerology Sign
Photo: Akash Kataruka
Photo: Hans Splinter
What is Statistics?
What is Statistics?
Machine Learning?
What is Statistics?
Machine Learning?
A.I. (Artificial Intelligence)
What is Statistics?
Machine Learning?
A.I. (Artificial Intelligence)
Photo: Andreas Kretschmer
Model Training
Photo: Helen Harrop
Model Serving
What is Statistics?
Machine Learning?
A.I. (Artificial Intelligence)
What is Statistics?
Machine Learning?
A.I. (Artificial Intelligence)
From don’t know shit to know your shit*
Verses # of GPUs required
*Holden’s gut feelings after coffee
NeedGPUs
Knowledge of Shit
What is Kubernetes?
Kubeflow: Dev on Laptop Deploy in Cloud
Deploy to Production
So what is Kubeflow?
What is Kubeflow?
What is Kubeflow?
What is Kubeflow?
VIK hotels group
Components Buffet
argo
automation
chainer-job
core
credentials-pod-preset
katib
mpi-job
mxnet-job
openmpi
pachyderm
pytorch-job
Seldon
spark
tf-serving
Paul Harrison
The (many) kinds of models you can train
● All your favourite Python libraries* (in Jupyter)
○ Different options to parallelize, with more coming (for now MPI or Beam ish)
● PyTorch
● Tensorflow (along with hyper param tuning with katib)
● mxnet
● etc.
Add-ons:
● H2o: CI is failing but you know, it's Wednesday
● And more!
But don't forget about data prep friends!
● For now options are a little limited, but other tools like Apache Spark are
in the works
○ https://github.com/kubeflow/kubeflow/pull/1467
● Now with Spark!*
● Pachyderm
● pandas?
● shell scripts?
● Tensorflow Transform in local mode Python 2 only…...
● Yeah I guess we did forget about our dataprep friends
*Where now == what's in master as of Feb 14th 2019
Model persistence/deployment/CI
● You need to save your model somewhere
● Your favourite cloud storage provider goes here
○ Why invest in model management when I can make directory?
● ModelDB
● WeaveFlux
● Pachyderm
Model Serving
Python Flask
TensorFlow Model Server
Openvino
NVIDIA Inference Server
Seldon Core
● Routers for A/B tests and multi-armed bandits.
● Supports lots of libraries (Python/Spark/R/etc)
● Monitoring / Security
If you can put it in a Docker container, you can use it.
So you want to use this?
What’s Next?!
Step away from keyboard
Think about type(s) of model
Look at components directory and see what’s a fit tool wise
Don’t know? Choose jupyter deal with the details live
Can’t find it?
^^ New Cat
Content!!!
^._.^
What about just tensorflow?*
ks registry add kubeflow
github.com/kubeflow/kubeflow/tree/${VERSION}/kubeflow
ks pkg install kubeflow/core@${VERSION}
ks pkg install kubeflow/tf-serving@${VERSION}
ks pkg install kubeflow/tf-job@${VERSION}
Getting the chef's recommend pairing:
kfctl.sh init my_awesome_project --platform {none, gcp, minikube}
cd my_awesome_project
kfctl.sh generate platform && kfctl.sh apply platform
kfctl.sh generate k8s && kfctl.sh apply k8s
# Add spark
cd ks_ap && ks pkg install kubeflow/spark
Douglas O'Brien
Connect to the Kubeflow Web UI
kubectl port-forward svc/ambassador -n kubeflow 8080:80 &
# Or use IAP, but that's… another story
The chef's recommend pairing is:
● Jupyter Hub
● TF Job & TF Serving
● PyTorch
● Katib (Hyper parameter tuning)
● Ambassador (makes it easier to access the UIs)
● Pipelines (Argo + Magic)
chicoblue
Click-to-deploy: get started hella fast* on
GCP
https://deploy.kubeflow.cloud
Click-to-deploy continued
Click-to-deploy continued
Click-to-deploy continued
Click-to-deploy continued
ro Hasegawa
What are those pipelines?
“Kubeflow Pipelines is a platform for building and deploying portable,
scalable machine learning (ML) workflows based on Docker containers.” -
kubeflow.org
Directed Acyclic Graph (DAG) of “pipeline components” (read “docker
containers”) each performing a function.
Building that pipeline?
Running that pipeline
Serving that job (not the only way)
When you don't know anything or know a
lot about ML: Hyper Parameter Tuning
● Katib
○ Does not depend on a specific ML tool (e.g. not just TF)
○ Supports a few different search algorithms
○ e.g. "What should I set my L1 regularization too? Idk let's ask the computer"
● Great way to accidently overfit too!* (*if you're not careful)
● As respective cloud providers, we are happy to rent you a lot of
resources
● Seriously, mention our names in the sales call. We're both going for
promo (and that shit is hard)
Katib Screenshot
But what about [special foo-baz-inator] or
[special-yak-shaving-tool]?
Write a Dockerfile and build an image, use FROM so you’re not starting from
scratch.
FROM gcr.io/kubeflow-images-public/tensorflow-1.6.0-notebook-cpu
RUN pip install py-special-yak-shaving-tool
Then tell set it as a param for your training/serving job as needed:
ks param set tfjob-v1alpha2 image "my-special-image-goes-here”
Now your fortran lives forever!
Live streamed demos (recorded on
YouTube)
● Kubeflow intro
https://codelabs.developers.google.com/codelabs/kubeflow-introductio
n/index.html & streamed http://bit.ly/kfIntroStream
● Kubeflow E2E with Github issue
summurizationhttps://codelabs.developers.google.com/codelabs/cloud-
kubeflow-e2e-gis/ & streamed http://bit.ly/kfGHStream
● You can tell they were live streamed by how poorly went, I promise no
video editing has occurred.
Limited & Optional: Workshop Demo
● We are doing a workshop @ Strata SF and we'd love to trick offer the
option to do a self-guided pre-ffer you an exciting opportunity to try a
self-guided version free of charge and provide us feedback. We have
some demo accts.
● What you will might learn:
○ Installing Kubeflow
○ Setting up a project
○ Deploying that project to GCP / Azure / IBM
○ Monkeying around with a project and still having it work
● Please come and talk to us after. Holden is wearing a shark dress.
● We'll be around to answer your questions
fionasjournal
Why you shouldn't use
this?
Downsides to Kubeflow
● Lot's of overhead versus doing it locally
● Active development (look it's 0.4)
● 3 talks on Kubeflow can give you 3 different toolsets
Questions? Half-baked Demo?

More Related Content

What's hot

Kubernetes Application Deployment with Helm - A beginner Guide!
Kubernetes Application Deployment with Helm - A beginner Guide!Kubernetes Application Deployment with Helm - A beginner Guide!
Kubernetes Application Deployment with Helm - A beginner Guide!Krishna-Kumar
 
Kubeflow at Spotify (For the Kubeflow Summit)
Kubeflow at Spotify (For the Kubeflow Summit)Kubeflow at Spotify (For the Kubeflow Summit)
Kubeflow at Spotify (For the Kubeflow Summit)Josh Baer
 
Deep dive into Kubernetes Networking
Deep dive into Kubernetes NetworkingDeep dive into Kubernetes Networking
Deep dive into Kubernetes NetworkingSreenivas Makam
 
An intro to Kubernetes operators
An intro to Kubernetes operatorsAn intro to Kubernetes operators
An intro to Kubernetes operatorsJ On The Beach
 
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and IstioAdvanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and IstioAnimesh Singh
 
Serverless and Design Patterns In GCP
Serverless and Design Patterns In GCPServerless and Design Patterns In GCP
Serverless and Design Patterns In GCPOliver Fierro
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorFlink Forward
 
Kubeflow Distributed Training and HPO
Kubeflow Distributed Training and HPOKubeflow Distributed Training and HPO
Kubeflow Distributed Training and HPOAnimesh Singh
 
Introduction to Kubernetes Workshop
Introduction to Kubernetes WorkshopIntroduction to Kubernetes Workshop
Introduction to Kubernetes WorkshopBob Killen
 
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformHow to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformDatabricks
 
Deploy Application on Kubernetes
Deploy Application on KubernetesDeploy Application on Kubernetes
Deploy Application on KubernetesOpsta
 
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowMLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowJan Kirenz
 
[Giovanni Galloro] How to use machine learning on Google Cloud Platform
[Giovanni Galloro] How to use machine learning on Google Cloud Platform[Giovanni Galloro] How to use machine learning on Google Cloud Platform
[Giovanni Galloro] How to use machine learning on Google Cloud PlatformMeetupDataScienceRoma
 
[KubeCon EU 2022] Running containerd and k3s on macOS
[KubeCon EU 2022] Running containerd and k3s on macOS[KubeCon EU 2022] Running containerd and k3s on macOS
[KubeCon EU 2022] Running containerd and k3s on macOSAkihiro Suda
 
Kubernetes Security with Calico and Open Policy Agent
Kubernetes Security with Calico and Open Policy AgentKubernetes Security with Calico and Open Policy Agent
Kubernetes Security with Calico and Open Policy AgentCloudOps2005
 
Helm - Application deployment management for Kubernetes
Helm - Application deployment management for KubernetesHelm - Application deployment management for Kubernetes
Helm - Application deployment management for KubernetesAlexei Ledenev
 

What's hot (20)

Kubernetes Application Deployment with Helm - A beginner Guide!
Kubernetes Application Deployment with Helm - A beginner Guide!Kubernetes Application Deployment with Helm - A beginner Guide!
Kubernetes Application Deployment with Helm - A beginner Guide!
 
Kubeflow at Spotify (For the Kubeflow Summit)
Kubeflow at Spotify (For the Kubeflow Summit)Kubeflow at Spotify (For the Kubeflow Summit)
Kubeflow at Spotify (For the Kubeflow Summit)
 
Deep dive into Kubernetes Networking
Deep dive into Kubernetes NetworkingDeep dive into Kubernetes Networking
Deep dive into Kubernetes Networking
 
Grafana vs Kibana
Grafana vs KibanaGrafana vs Kibana
Grafana vs Kibana
 
Apache Kafka Best Practices
Apache Kafka Best PracticesApache Kafka Best Practices
Apache Kafka Best Practices
 
An intro to Kubernetes operators
An intro to Kubernetes operatorsAn intro to Kubernetes operators
An intro to Kubernetes operators
 
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and IstioAdvanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
 
Serverless and Design Patterns In GCP
Serverless and Design Patterns In GCPServerless and Design Patterns In GCP
Serverless and Design Patterns In GCP
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes Operator
 
Kubeflow Distributed Training and HPO
Kubeflow Distributed Training and HPOKubeflow Distributed Training and HPO
Kubeflow Distributed Training and HPO
 
Introduction to Kubernetes Workshop
Introduction to Kubernetes WorkshopIntroduction to Kubernetes Workshop
Introduction to Kubernetes Workshop
 
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformHow to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
 
Deploy Application on Kubernetes
Deploy Application on KubernetesDeploy Application on Kubernetes
Deploy Application on Kubernetes
 
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowMLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
 
[Giovanni Galloro] How to use machine learning on Google Cloud Platform
[Giovanni Galloro] How to use machine learning on Google Cloud Platform[Giovanni Galloro] How to use machine learning on Google Cloud Platform
[Giovanni Galloro] How to use machine learning on Google Cloud Platform
 
[KubeCon EU 2022] Running containerd and k3s on macOS
[KubeCon EU 2022] Running containerd and k3s on macOS[KubeCon EU 2022] Running containerd and k3s on macOS
[KubeCon EU 2022] Running containerd and k3s on macOS
 
DevOps with Kubernetes
DevOps with KubernetesDevOps with Kubernetes
DevOps with Kubernetes
 
Kubernetes Security with Calico and Open Policy Agent
Kubernetes Security with Calico and Open Policy AgentKubernetes Security with Calico and Open Policy Agent
Kubernetes Security with Calico and Open Policy Agent
 
Kubernetes PPT.pptx
Kubernetes PPT.pptxKubernetes PPT.pptx
Kubernetes PPT.pptx
 
Helm - Application deployment management for Kubernetes
Helm - Application deployment management for KubernetesHelm - Application deployment management for Kubernetes
Helm - Application deployment management for Kubernetes
 

Similar to Introducing Kubeflow (w. Special Guests Tensorflow and Apache Spark)

Intro - End to end ML with Kubeflow @ SignalConf 2018
Intro - End to end ML with Kubeflow @ SignalConf 2018Intro - End to end ML with Kubeflow @ SignalConf 2018
Intro - End to end ML with Kubeflow @ SignalConf 2018Holden Karau
 
Migrating Apache Spark ML Jobs to Spark + Tensorflow on Kubeflow
Migrating Apache Spark ML Jobs to Spark + Tensorflow on KubeflowMigrating Apache Spark ML Jobs to Spark + Tensorflow on Kubeflow
Migrating Apache Spark ML Jobs to Spark + Tensorflow on KubeflowDatabricks
 
PySpark on Kubernetes @ Python Barcelona March Meetup
PySpark on Kubernetes @ Python Barcelona March MeetupPySpark on Kubernetes @ Python Barcelona March Meetup
PySpark on Kubernetes @ Python Barcelona March MeetupHolden Karau
 
Big data with Python on kubernetes (pyspark on k8s) - Big Data Spain 2018
Big data with Python on kubernetes (pyspark on k8s) - Big Data Spain 2018Big data with Python on kubernetes (pyspark on k8s) - Big Data Spain 2018
Big data with Python on kubernetes (pyspark on k8s) - Big Data Spain 2018Holden Karau
 
Sharing (or stealing) the jewels of python with big data & the jvm (1)
Sharing (or stealing) the jewels of python with big data & the jvm (1)Sharing (or stealing) the jewels of python with big data & the jvm (1)
Sharing (or stealing) the jewels of python with big data & the jvm (1)Holden Karau
 
Kafka Summit SF 2017 - Streaming Processing in Python – 10 ways to avoid summ...
Kafka Summit SF 2017 - Streaming Processing in Python – 10 ways to avoid summ...Kafka Summit SF 2017 - Streaming Processing in Python – 10 ways to avoid summ...
Kafka Summit SF 2017 - Streaming Processing in Python – 10 ways to avoid summ...confluent
 
Infrastructure as code with Puppet and Apache CloudStack
Infrastructure as code with Puppet and Apache CloudStackInfrastructure as code with Puppet and Apache CloudStack
Infrastructure as code with Puppet and Apache CloudStackke4qqq
 
ApacheCloudStack
ApacheCloudStackApacheCloudStack
ApacheCloudStackPuppet
 
Powering tensor flow with big data using apache beam, flink, and spark cern...
Powering tensor flow with big data using apache beam, flink, and spark   cern...Powering tensor flow with big data using apache beam, flink, and spark   cern...
Powering tensor flow with big data using apache beam, flink, and spark cern...Holden Karau
 
Powering tensorflow with big data (apache spark, flink, and beam) dataworks...
Powering tensorflow with big data (apache spark, flink, and beam)   dataworks...Powering tensorflow with big data (apache spark, flink, and beam)   dataworks...
Powering tensorflow with big data (apache spark, flink, and beam) dataworks...Holden Karau
 
Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...
Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...
Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...Holden Karau
 
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...Chris Fregly
 
Accelerating Big Data beyond the JVM - Fosdem 2018
Accelerating Big Data beyond the JVM - Fosdem 2018Accelerating Big Data beyond the JVM - Fosdem 2018
Accelerating Big Data beyond the JVM - Fosdem 2018Holden Karau
 
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017Chris Fregly
 
Simplifying training deep and serving learning models with big data in python...
Simplifying training deep and serving learning models with big data in python...Simplifying training deep and serving learning models with big data in python...
Simplifying training deep and serving learning models with big data in python...Holden Karau
 
Big Data Beyond the JVM - Strata San Jose 2018
Big Data Beyond the JVM - Strata San Jose 2018Big Data Beyond the JVM - Strata San Jose 2018
Big Data Beyond the JVM - Strata San Jose 2018Holden Karau
 
CoffeeScript: A beginner's presentation for beginners copy
CoffeeScript: A beginner's presentation for beginners copyCoffeeScript: A beginner's presentation for beginners copy
CoffeeScript: A beginner's presentation for beginners copyPatrick Devins
 
Serverless Data Architecture at scale on Google Cloud Platform
Serverless Data Architecture at scale on Google Cloud PlatformServerless Data Architecture at scale on Google Cloud Platform
Serverless Data Architecture at scale on Google Cloud PlatformMeetupDataScienceRoma
 

Similar to Introducing Kubeflow (w. Special Guests Tensorflow and Apache Spark) (20)

Intro - End to end ML with Kubeflow @ SignalConf 2018
Intro - End to end ML with Kubeflow @ SignalConf 2018Intro - End to end ML with Kubeflow @ SignalConf 2018
Intro - End to end ML with Kubeflow @ SignalConf 2018
 
Migrating Apache Spark ML Jobs to Spark + Tensorflow on Kubeflow
Migrating Apache Spark ML Jobs to Spark + Tensorflow on KubeflowMigrating Apache Spark ML Jobs to Spark + Tensorflow on Kubeflow
Migrating Apache Spark ML Jobs to Spark + Tensorflow on Kubeflow
 
PySpark on Kubernetes @ Python Barcelona March Meetup
PySpark on Kubernetes @ Python Barcelona March MeetupPySpark on Kubernetes @ Python Barcelona March Meetup
PySpark on Kubernetes @ Python Barcelona March Meetup
 
Big data with Python on kubernetes (pyspark on k8s) - Big Data Spain 2018
Big data with Python on kubernetes (pyspark on k8s) - Big Data Spain 2018Big data with Python on kubernetes (pyspark on k8s) - Big Data Spain 2018
Big data with Python on kubernetes (pyspark on k8s) - Big Data Spain 2018
 
Sharing (or stealing) the jewels of python with big data & the jvm (1)
Sharing (or stealing) the jewels of python with big data & the jvm (1)Sharing (or stealing) the jewels of python with big data & the jvm (1)
Sharing (or stealing) the jewels of python with big data & the jvm (1)
 
Kafka Summit SF 2017 - Streaming Processing in Python – 10 ways to avoid summ...
Kafka Summit SF 2017 - Streaming Processing in Python – 10 ways to avoid summ...Kafka Summit SF 2017 - Streaming Processing in Python – 10 ways to avoid summ...
Kafka Summit SF 2017 - Streaming Processing in Python – 10 ways to avoid summ...
 
Infrastructure as code with Puppet and Apache CloudStack
Infrastructure as code with Puppet and Apache CloudStackInfrastructure as code with Puppet and Apache CloudStack
Infrastructure as code with Puppet and Apache CloudStack
 
ApacheCloudStack
ApacheCloudStackApacheCloudStack
ApacheCloudStack
 
Powering tensor flow with big data using apache beam, flink, and spark cern...
Powering tensor flow with big data using apache beam, flink, and spark   cern...Powering tensor flow with big data using apache beam, flink, and spark   cern...
Powering tensor flow with big data using apache beam, flink, and spark cern...
 
Powering tensorflow with big data (apache spark, flink, and beam) dataworks...
Powering tensorflow with big data (apache spark, flink, and beam)   dataworks...Powering tensorflow with big data (apache spark, flink, and beam)   dataworks...
Powering tensorflow with big data (apache spark, flink, and beam) dataworks...
 
Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...
Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...
Powering Tensorflow with big data using Apache Beam, Flink, and Spark - OSCON...
 
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
 
Accelerating Big Data beyond the JVM - Fosdem 2018
Accelerating Big Data beyond the JVM - Fosdem 2018Accelerating Big Data beyond the JVM - Fosdem 2018
Accelerating Big Data beyond the JVM - Fosdem 2018
 
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017
 
Simplifying training deep and serving learning models with big data in python...
Simplifying training deep and serving learning models with big data in python...Simplifying training deep and serving learning models with big data in python...
Simplifying training deep and serving learning models with big data in python...
 
Big Data Beyond the JVM - Strata San Jose 2018
Big Data Beyond the JVM - Strata San Jose 2018Big Data Beyond the JVM - Strata San Jose 2018
Big Data Beyond the JVM - Strata San Jose 2018
 
Beyond Puppet
Beyond PuppetBeyond Puppet
Beyond Puppet
 
CoffeeScript: A beginner's presentation for beginners copy
CoffeeScript: A beginner's presentation for beginners copyCoffeeScript: A beginner's presentation for beginners copy
CoffeeScript: A beginner's presentation for beginners copy
 
Serverless Data Architecture at scale on Google Cloud Platform
Serverless Data Architecture at scale on Google Cloud PlatformServerless Data Architecture at scale on Google Cloud Platform
Serverless Data Architecture at scale on Google Cloud Platform
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101
 

More from DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 

Recently uploaded (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 

Introducing Kubeflow (w. Special Guests Tensorflow and Apache Spark)

  • 1. Machine Learning: From Lab to Production with Kubeflow With Special Guests Tensorflow and an Apache Spark teaser
  • 2. Agenda ● About Us ● Background ● Problem: Building Model is Easy, Serving models in Prod is hard. ● What Is KubeFlow? ● Kubeflow’s Design and Core Components ● …
  • 3. Holden About Me Slides ● Prefered pronouns are she/her ● Developer Advocate at Google ● Apache Spark PMC / ASF member + contributor on lots of other projects ● previously IBM, Alpine, Databricks, Google, Foursquare & Amazon ● co-author of Learning Spark & High Performance Spark ● Twitter: @holdenkarau ● Slide share http://www.slideshare.net/hkarau ● Code review livestreams: https://www.twitch.tv/holdenkarau / https://www.youtube.com/user/holdenkarau ● Talk Videos http://bit.ly/holdenSparkVideos ● Talk feedback: http://bit.ly/holdenTalkFeedback ● Organizing data track @ IT Next AMS - CFP Open!
  • 4. Trevor Grant ● From Chicago ● Preferred pronouns he/him ● Various odd jobs around IBM (data scientist/evangelist/janitor) ● PMC Apache Mahout, Streams, Community Development ● Apache Roadshow Chicago [1] ● IoT Track at Apache Con North America [2] ● Blog: rawkintrevo.org ● Twitter: @rawkintrevo ● Github: rawkintrevo ● Shameless self promotion: rawkintrevo.org/shameless-self-promotion/ [1] https://www.apachecon.com/chiroadshow19/ [2] https://www.apachecon.com/acna19/index.html
  • 6. Kubeflow Salesman: **Slaps roof of Kubeflow** THIS BAD BOY CAN FIT SO MANY BUZZWORDS IN IT
  • 7. Kubeflow Salesman: **Slaps roof of Kubeflow** THIS BAD BOY CAN FIT SO MANY BUZZWORDS IN IT
  • 9. History of Predictive Analytics Photo: Numerology Sign Photo: Akash Kataruka Photo: Hans Splinter
  • 12. What is Statistics? Machine Learning? A.I. (Artificial Intelligence)
  • 13. What is Statistics? Machine Learning? A.I. (Artificial Intelligence) Photo: Andreas Kretschmer Model Training Photo: Helen Harrop Model Serving
  • 14. What is Statistics? Machine Learning? A.I. (Artificial Intelligence)
  • 15. What is Statistics? Machine Learning? A.I. (Artificial Intelligence)
  • 16. From don’t know shit to know your shit* Verses # of GPUs required *Holden’s gut feelings after coffee NeedGPUs Knowledge of Shit
  • 18. Kubeflow: Dev on Laptop Deploy in Cloud
  • 20. So what is Kubeflow?
  • 23. What is Kubeflow? VIK hotels group
  • 25. The (many) kinds of models you can train ● All your favourite Python libraries* (in Jupyter) ○ Different options to parallelize, with more coming (for now MPI or Beam ish) ● PyTorch ● Tensorflow (along with hyper param tuning with katib) ● mxnet ● etc. Add-ons: ● H2o: CI is failing but you know, it's Wednesday ● And more!
  • 26. But don't forget about data prep friends! ● For now options are a little limited, but other tools like Apache Spark are in the works ○ https://github.com/kubeflow/kubeflow/pull/1467 ● Now with Spark!* ● Pachyderm ● pandas? ● shell scripts? ● Tensorflow Transform in local mode Python 2 only…... ● Yeah I guess we did forget about our dataprep friends *Where now == what's in master as of Feb 14th 2019
  • 27. Model persistence/deployment/CI ● You need to save your model somewhere ● Your favourite cloud storage provider goes here ○ Why invest in model management when I can make directory? ● ModelDB ● WeaveFlux ● Pachyderm
  • 28. Model Serving Python Flask TensorFlow Model Server Openvino NVIDIA Inference Server Seldon Core ● Routers for A/B tests and multi-armed bandits. ● Supports lots of libraries (Python/Spark/R/etc) ● Monitoring / Security If you can put it in a Docker container, you can use it.
  • 29. So you want to use this?
  • 30. What’s Next?! Step away from keyboard Think about type(s) of model Look at components directory and see what’s a fit tool wise Don’t know? Choose jupyter deal with the details live Can’t find it? ^^ New Cat Content!!! ^._.^
  • 31. What about just tensorflow?* ks registry add kubeflow github.com/kubeflow/kubeflow/tree/${VERSION}/kubeflow ks pkg install kubeflow/core@${VERSION} ks pkg install kubeflow/tf-serving@${VERSION} ks pkg install kubeflow/tf-job@${VERSION}
  • 32. Getting the chef's recommend pairing: kfctl.sh init my_awesome_project --platform {none, gcp, minikube} cd my_awesome_project kfctl.sh generate platform && kfctl.sh apply platform kfctl.sh generate k8s && kfctl.sh apply k8s # Add spark cd ks_ap && ks pkg install kubeflow/spark Douglas O'Brien
  • 33. Connect to the Kubeflow Web UI kubectl port-forward svc/ambassador -n kubeflow 8080:80 & # Or use IAP, but that's… another story
  • 34. The chef's recommend pairing is: ● Jupyter Hub ● TF Job & TF Serving ● PyTorch ● Katib (Hyper parameter tuning) ● Ambassador (makes it easier to access the UIs) ● Pipelines (Argo + Magic) chicoblue
  • 35. Click-to-deploy: get started hella fast* on GCP https://deploy.kubeflow.cloud
  • 41. What are those pipelines? “Kubeflow Pipelines is a platform for building and deploying portable, scalable machine learning (ML) workflows based on Docker containers.” - kubeflow.org Directed Acyclic Graph (DAG) of “pipeline components” (read “docker containers”) each performing a function.
  • 44. Serving that job (not the only way)
  • 45. When you don't know anything or know a lot about ML: Hyper Parameter Tuning ● Katib ○ Does not depend on a specific ML tool (e.g. not just TF) ○ Supports a few different search algorithms ○ e.g. "What should I set my L1 regularization too? Idk let's ask the computer" ● Great way to accidently overfit too!* (*if you're not careful) ● As respective cloud providers, we are happy to rent you a lot of resources ● Seriously, mention our names in the sales call. We're both going for promo (and that shit is hard)
  • 47. But what about [special foo-baz-inator] or [special-yak-shaving-tool]? Write a Dockerfile and build an image, use FROM so you’re not starting from scratch. FROM gcr.io/kubeflow-images-public/tensorflow-1.6.0-notebook-cpu RUN pip install py-special-yak-shaving-tool Then tell set it as a param for your training/serving job as needed: ks param set tfjob-v1alpha2 image "my-special-image-goes-here” Now your fortran lives forever!
  • 48. Live streamed demos (recorded on YouTube) ● Kubeflow intro https://codelabs.developers.google.com/codelabs/kubeflow-introductio n/index.html & streamed http://bit.ly/kfIntroStream ● Kubeflow E2E with Github issue summurizationhttps://codelabs.developers.google.com/codelabs/cloud- kubeflow-e2e-gis/ & streamed http://bit.ly/kfGHStream ● You can tell they were live streamed by how poorly went, I promise no video editing has occurred.
  • 49. Limited & Optional: Workshop Demo ● We are doing a workshop @ Strata SF and we'd love to trick offer the option to do a self-guided pre-ffer you an exciting opportunity to try a self-guided version free of charge and provide us feedback. We have some demo accts. ● What you will might learn: ○ Installing Kubeflow ○ Setting up a project ○ Deploying that project to GCP / Azure / IBM ○ Monkeying around with a project and still having it work ● Please come and talk to us after. Holden is wearing a shark dress. ● We'll be around to answer your questions fionasjournal
  • 50. Why you shouldn't use this?
  • 51. Downsides to Kubeflow ● Lot's of overhead versus doing it locally ● Active development (look it's 0.4) ● 3 talks on Kubeflow can give you 3 different toolsets