End to End ML
With Kubeflow
& friends
@holdenkarau
Signal
2018
Legit-enough
Some links (slides & recordings will be at):
http://bit.ly/2QgsqF9
^ Slides & code-lab links
(after)
CatLoversShow
Holden:
● Preferred pronouns are she/her
● Developer Advocate at Google
● Apache Spark PMC/Committer, contribute to many other projects
● previously IBM, Alpine, Databricks, Google, Foursquare & Amazon
● co-author of Learning Spark & High Performance Spark
● Twitter: @holdenkarau
● Slide share http://www.slideshare.net/hkarau
● Code review livestreams: https://www.twitch.tv/holdenkarau /
https://www.youtube.com/user/holdenkarau
● Spark Talk Videos http://bit.ly/holdenSparkVideos
Who do I think you all are?
● Nice people*
● Interested in Machine Learning
● Possibly Familiar with one of Java, Scala, or Python
Amanda
What is in store for our adventure?
● We have 30 minutes :)
● Brief intros to what Kubernetes, Spark, and Kubeflow are
● How to train a model (ish)
● How to serve a model (ish)
● Scaling (ish)
● Updating models and other scary thoughts
Ada Doglace
What is Kubernetes?
What is Spark?
● General purpose distributed system
○ With a really nice API including Python :)
● Apache project
● Faster than Hadoop Map/Reduce
● Good when too big for a single machine
● Built on top of two abstractions for distributed data: RDDs & Datasets
● Has ML Libraries
● WIP Kubeflow integration PR 1467
The different pieces of Spark
[Diagram of Spark components: Apache Spark; SQL, DataFrames & Datasets; Structured Streaming; Spark ML; bagel & Graph X; MLLib; Streaming; Graph Frames. Language labels: "Scala, Java, Python, & R" and "Scala, Java, Python".]
Paul Hudson
So what does ML look like?
Code on Laptop
Train Model on ML-Rig
Photo by Tomomi
Deploy to Production
Problem:
Models are Cool,
Feature prep is Hard
Training is Tedious,
Everyone Forgot Deployment
What is Kubeflow?
“Data Scientists”
Model Serving On Kube
Model Training
*
What is Kubeflow?
“Kubeflow is a Cloud Native platform for machine learning based on
Google’s internal machine learning pipelines.”
or:
● The recognition that just a bunch of model weights isn’t enough
● Designed to support the ecosystem of tools needed (from data
prep to serving)
● Open source project :)
Ada Doglace
Really just want to replace this:
Photo by: Milestoned
So you want to use this?
What’s Next?!
Step away from keyboard
Think about type(s) of model
Look at components directory and see what’s a fit tool wise
Don’t know? Choose Jupyter and deal with the details live
Can’t find it?
Containers Buffet
argo
automation
chainer-job
core
credentials-pod-preset
katib
mpi-job
mxnet-job
openmpi
pachyderm
pytorch-job
seldon
tf-serving
weaveflux
What about just the basics?*
./scripts/kfctl.sh init ${KFAPP} --platform gcp --project ${PROJECT}
cd ${KFAPP}
../scripts/kfctl.sh generate platform
../scripts/kfctl.sh apply platform
../scripts/kfctl.sh generate k8s
../scripts/kfctl.sh apply k8s
What about just tensorflow?*
ks registry add kubeflow github.com/kubeflow/kubeflow/tree/${VERSION}/kubeflow
ks pkg install kubeflow/core@${VERSION}
ks pkg install kubeflow/tf-serving@${VERSION}
ks pkg install kubeflow/tf-job@${VERSION}
Ok well I need to be able to access Jupyter too...
kubectl port-forward -n ${NAMESPACE} `kubectl get pods -n ${NAMESPACE} --selector=service=ambassador -o jsonpath='{.items[0].metadata.name}'` 8080:80
Your Special ML Training Goes here
Don’t have any pressing projects but still want to have fun? Check
out Michelle’s notebook for Github Issue summarization.
Or want to see mnist again? here :)
Your Special ML Training Goes here
...
from keras.callbacks import CSVLogger, ModelCheckpoint
script_name_base = 'tutorial_seq2seq'
csv_logger = CSVLogger('{:}.log'.format(script_name_base))
model_checkpoint = ModelCheckpoint(
    '{:}.epoch{{epoch:02d}}-val{{val_loss:.5f}}.hdf5'.format(script_name_base),
    save_best_only=True)
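The doubled braces in the checkpoint filename are easy to misread. Here is a minimal sketch in plain Python of how the two formatting passes play out, assuming the callback later fills the remaining placeholders with str.format (which is how Keras's ModelCheckpoint behaves as far as I know). The epoch and val_loss values are made up for illustration.

```python
script_name_base = 'tutorial_seq2seq'

# First pass: '{:}' is filled in now, while '{{' and '}}' are escapes that
# survive as literal single braces.
template = '{:}.epoch{{epoch:02d}}-val{{val_loss:.5f}}.hdf5'.format(script_name_base)

# Second pass: roughly what a checkpoint callback would do at the end of an
# epoch. The epoch number and loss below are hypothetical values.
filename = template.format(epoch=3, val_loss=0.12345678)

print(template)
print(filename)
```

So each saved checkpoint ends up with the epoch and validation loss in its name, which makes it easy to pick the best one later.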
Your Special ML Training Goes here
history = seq2seq_Model.fit(
    [encoder_input_data, decoder_input_data],
    np.expand_dims(decoder_target_data, -1),
    batch_size=batch_size,
    epochs=epochs,
    validation_split=0.12,
    callbacks=[csv_logger, model_checkpoint])
Really just check out Michelle’s notebook for Github Issue
summarization.
But what about [special foo-baz-inator] or
[special-yak-shaving-tool]?
Write a Dockerfile and build an image, use FROM so you’re not
starting from scratch.
FROM gcr.io/kubeflow-images-public/tensorflow-1.6.0-notebook-cpu
RUN pip install py-special-yak-shaving-tool
Then set it as a param for your training/serving job as needed:
ks param set tfjob-v1alpha2 image "my-special-image-goes-here"
What about that magical feature prep?
For now it’s a mostly write-by-hand situation
However TFX has some cool tools we can use today (like
TF.Transform) if we’re ok with DirectRunner or Dataflow (with Flink
support in the works indirectly)
Enter: TF.Transform
● For pre-processing of your data
● e.g. where you spend 90% of your dev time anyways
● Integrates into serving time :D
● OSS
● Runs on top of Apache Beam, but current release not yet
scalable outside of GCP
● On Apache Beam master this can run-ish on Flink, but rough
● Please don’t use this in production today unless you’re on GCP/Dataflow
Kathryn Yengel
Defining a Transform processing function
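import tensorflow_transform as tft

def preprocessing_fn(inputs):
    x = inputs['x']
    y = inputs['y']
    s = inputs['s']
    x_centered = x - tft.mean(x)        # analyzer: needs a full pass for the mean
    y_normalized = tft.scale_to_0_1(y)  # analyzer: needs min and max
    s_int = tft.string_to_int(s)        # analyzer: needs the vocabulary
    return {'x_centered': x_centered,
            'y_normalized': y_normalized,
            's_int': s_int}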
Analyzers
● Reduce (full pass)
● Implemented as a distributed data pipeline
Transforms
● Instance-to-instance (don’t change batch dimension)
● Pure TensorFlow
[Diagram: data passes through mean, stddev, normalize, multiply, quantiles, bucketize; after the Analyze step the analyzer results become constant tensors, leaving normalize, multiply, bucketize]
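The analyzer/transform split can be sketched without TensorFlow. This is a plain-Python illustration of the idea, not how TF.Transform is implemented: analyze and transform_row are hypothetical names, and the vocabulary is indexed in sorted order purely for simplicity (an assumption; the real tft.string_to_int picks its own ordering).

```python
def analyze(rows):
    """Full pass over the data: reduce it to a handful of constants."""
    xs = [row['x'] for row in rows]
    ys = [row['y'] for row in rows]
    vocab = sorted({row['s'] for row in rows})
    return {
        'x_mean': sum(xs) / len(xs),
        'y_min': min(ys),
        'y_max': max(ys),
        'vocab': {token: index for index, token in enumerate(vocab)},
    }


def transform_row(row, constants):
    """Instance-to-instance: uses only one row plus the precomputed constants."""
    y_range = constants['y_max'] - constants['y_min']
    return {
        'x_centered': row['x'] - constants['x_mean'],
        'y_normalized': (row['y'] - constants['y_min']) / y_range if y_range else 0.0,
        's_int': constants['vocab'][row['s']],
    }


rows = [{'x': 1.0, 'y': 10.0, 's': 'a'}, {'x': 3.0, 'y': 20.0, 's': 'b'}]
constants = analyze(rows)
transformed = [transform_row(row, constants) for row in rows]
print(transformed)
```

Because transform_row only needs the constants, the same function can run at serving time on a single request, which is the point of the "integrates into serving time" bullet.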
Some common use-cases...
● Scale to ...: tft.scale_to_z_score
● Bag of Words / N-Grams: tf.string_split, tft.ngrams, tft.string_to_int
● Bucketization: tft.quantiles, tft.apply_buckets
● Feature Crosses: tf.string_join, tft.string_to_int
...
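To make the bucketization use-case concrete, here is a small standard-library sketch of the same two-step shape: boundaries come from a full pass over the data, then each value is mapped to a bucket on its own. The function names mirror the tft ones only loosely; sending values equal to a boundary to the upper bucket is an assumption made here, not a statement about the tft edge behaviour.

```python
import bisect
import statistics


def quantile_boundaries(values, num_buckets):
    """Analyzer step: one full pass over the data to pick bucket boundaries."""
    return statistics.quantiles(values, n=num_buckets)


def apply_buckets(value, boundaries):
    """Transform step: map a single value to a bucket index."""
    return bisect.bisect_right(boundaries, value)


values = list(range(1, 101))
boundaries = quantile_boundaries(values, 4)  # three cut points for four buckets
print(boundaries)
print([apply_buckets(v, boundaries) for v in (1, 30, 60, 99)])
```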
BEAM Beyond the JVM: Current release
● Non JVM BEAM doesn’t work outside of Google’s environment yet
● tl;dr : uses grpc / protobuf
○ Similar to the common design but with more efficient representations (often)
● But exciting new plans to unify the runners and ease the support of different
languages (called SDKS)
○ See https://beam.apache.org/contribute/portability/
● If this is exciting, you can come join me on making BEAM work in Python3
○ Yes we still don’t have that :(
○ But we're getting closer & you can come join us on BEAM-2874 :D
Emma
Serving: TF is probably easiest for now...
MODEL_COMPONENT=my-model-server
MODEL_NAME=cat-finder-3k
ks generate tf-serving ${MODEL_COMPONENT} --name=${MODEL_NAME}
ks param set ${MODEL_COMPONENT} deployHttpProxy true
ks param set ${MODEL_COMPONENT} modelPath ${MODEL_PATH}
ks apply ${KF_ENV} -c ${MODEL_COMPONENT}
Or use Seldon Core & friends*
Seldon Core is an OSS platform for deploying ML models on
Kubernetes supported by Kubeflow.
Supports Many Model types/formats:
● Tensorflow
● Sklearn
● Spark ML**
● R
● H2O
Set up seldon core for serving
# Gives cluster-admin role to the default service account
kubectl create clusterrolebinding seldon-admin \
  --clusterrole=cluster-admin \
  --serviceaccount=${NAMESPACE}:default
# Install the kubeflow/seldon package
ks pkg install kubeflow/seldon
# Generate the seldon component and deploy it
ks generate seldon seldon --name=seldon
Build an image with your model*
docker run -v $(pwd):/my_model \
  seldonio/core-python-wrapper:0.7 /my_model \
  IssueSummarization 0.1 gcr.io --base-image=python:3.6 \
  --image-name=gcr-repository-name/my-image-name
And kick off the new model:
ks generate seldon-serve-simple new-serving-magic \
  --name=model-name \
  --image=gcr.io/gcr-repository-name/model:version \
  --namespace=${NAMESPACE} \
  --replicas=2
ks apply ${KF_ENV} -c new-serving-magic
Wait so how do I use this?
Your favourite rest library goes here*
Timeouts matter!
Doing recommendations? Have fall-backs
Have multiple models? fall-backs
*Need to use in batch? Maybe skip seldon, tf-serving &
friends and integrate the library into your code. Or
not.
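A rough sketch of "timeouts matter, have fall-backs" using only the Python standard library. Everything specific here is an assumption for illustration: predict_with_fallback is a hypothetical helper, and the URL and payload shape depend on which serving component you deployed. The opener argument exists only so the behaviour can be exercised without a live server.

```python
import json
import urllib.request


def predict_with_fallback(url, payload, fallback, timeout_s=0.5,
                          opener=urllib.request.urlopen):
    """POST payload as JSON; return the decoded reply, or fallback on failure."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
    )
    try:
        # The timeout bounds how long a slow model server can stall the caller.
        with opener(request, timeout=timeout_s) as response:
            return json.loads(response.read().decode('utf-8'))
    except (OSError, ValueError):
        # OSError covers connection errors, HTTP errors and timeouts;
        # ValueError covers a reply body that is not valid JSON.
        return fallback
```

For recommendations the fallback might be a precomputed "most popular" list; with multiple models it could be the answer from a cheaper model.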
Trish Hamme
Scaling - or ruh roh people are using this!
replicas: 1
Becomes
replicas: 10
Factor of 10 =~ “science”
Wait really?
● Early: switch from mini-kube to ${cloud provider} with GPUs
○ “Vertical” scaling
● Next: increase # of workers for training
○ “Horizontal” scaling
○ Auto-scaling also WIP per-backend for the most part
● Serving, # of replicas
○ Auto-scaling is a WIP -
https://github.com/kubeflow/kubeflow/issues/1219
Jennifer C.
What about validation?
TensorFlow Data Validation (TFDV)
Or Roll your own?
● Counters & execution time most common
● Please also check % of data change
Spark-validator (proof of concept)
Please validate your pipelines, and not just for data changes: code changes too.
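A roll-your-own check can be as small as comparing this run's counters against the previous run. The sketch below is an illustration under assumptions: validate_counters, the counter names, and the 20% threshold are all hypothetical, and it is not how TFDV or spark-validator work internally.

```python
def relative_change(previous, current):
    """Fractional change between two counter values."""
    if previous == 0:
        return 0.0 if current == 0 else float('inf')
    return abs(current - previous) / previous


def validate_counters(previous_run, current_run, max_change=0.2):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, previous_value in previous_run.items():
        if name not in current_run:
            problems.append('{}: missing from current run'.format(name))
            continue
        change = relative_change(previous_value, current_run[name])
        if change > max_change:
            problems.append('{}: changed by {:.0%}'.format(name, change))
    return problems


previous_run = {'records_read': 1000, 'records_written': 990, 'seconds': 120}
current_run = {'records_read': 1050, 'records_written': 400, 'seconds': 125}
problems = validate_counters(previous_run, current_run)
print(problems)
```

Running the same check after a code change, and not only when new data arrives, catches the case where the data is fine but the pipeline quietly started dropping records.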
Demo!
Recorded Demos
Previously live demos, now recorded
● Kubeflow intro https://codelabs.developers.google.com/codelabs/kubeflow-introduction/index.html & streamed http://bit.ly/kfIntroStream
● Kubeflow E2E with Github issue summarization https://codelabs.developers.google.com/codelabs/cloud-kubeflow-e2e-gis/ & streamed http://bit.ly/kfGHStream
● You can tell they were live streamed by how poorly they went, I promise no video editing has occurred.
● You can do these yourself too (including one of them at our booth)!
Join me & Boo @ Google’s booth @ 5PM
And join my coworker Casey West @ 6 talking about:
Building Captain Obvious:
Understand Faster with Machine Learning APIs
Want to watch me working on a Kubeflow PR?
● Join Holden Friday @ 2pm pacific for live coding, continuing work on her Apache Spark to Kubeflow integration (using the existing Spark operator as a base)
https://www.youtube.com/watch?v=zHnTdqbjPik
● Or just https://youtube.com/user/holdenkarau & like +
subscribe + click the bell :p
k thnx bye :)
Give feedback on this presentation
http://bit.ly/holdenTalkFeedback

Comptia N+ Standard Networking lesson guide
 
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
 
guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
 
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
 
Latest trends in computer networking.pptx
Latest trends in computer networking.pptxLatest trends in computer networking.pptx
Latest trends in computer networking.pptx
 
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and GuidelinesMulti-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
 
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
 
The+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptxThe+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptx
 
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shopHistory+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
 
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
 
This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!
 
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptxBridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
 
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdfJAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
 
How to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptxHow to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptx
 
Internet-Security-Safeguarding-Your-Digital-World (1).pptx
Internet-Security-Safeguarding-Your-Digital-World (1).pptxInternet-Security-Safeguarding-Your-Digital-World (1).pptx
Internet-Security-Safeguarding-Your-Digital-World (1).pptx
 
BASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptxBASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptx
 
1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
 

Intro - End to end ML with Kubeflow @ SignalConf 2018

  • 1. End to End ML With Kubeflow & friends @holdenkarau Signal 2018 Legit-enough
  • 2. Some links (slides & recordings will be at): http://bit.ly/2QgsqF9 ^ Slides & code-lab links (after) CatLoversShow
  • 3. Holden: ● Preferred pronouns are she/her ● Developer Advocate at Google ● Apache Spark PMC/Committer, contribute to many other projects ● previously IBM, Alpine, Databricks, Google, Foursquare & Amazon ● co-author of Learning Spark & High Performance Spark ● Twitter: @holdenkarau ● SlideShare http://www.slideshare.net/hkarau ● Code review livestreams: https://www.twitch.tv/holdenkarau / https://www.youtube.com/user/holdenkarau ● Spark Talk Videos http://bit.ly/holdenSparkVideos
  • 5. Who do I think you all are? ● Nice people* ● Interested in Machine Learning ● Possibly Familiar with one of Java, Scala, or Python Amanda
  • 6. What is in store for our adventure? ● We have 30 minutes :) ● Brief intros to what Kubernetes, Spark, and Kubeflow are ● How to train a model (ish) ● How to serve a model (ish) ● Scaling (ish) ● Updating models and other scary thoughts Ada Doglace
  • 8. What is Spark? ● General purpose distributed system ○ With a really nice API including Python :) ● Apache project ● Faster than Hadoop Map/Reduce ● Good when too big for a single machine ● Built on top of two abstractions for distributed data: RDDs & Datasets ● Has ML Libraries ● WIP Kubeflow integration PR 1467
  • 9. The different pieces of Spark: Apache Spark; SQL, DataFrames & Datasets; Structured Streaming; Scala, Java, Python, & R; Spark ML; bagel & GraphX; MLLib; Scala, Java, Python; Streaming; GraphFrames. Paul Hudson
  • 10. So what does ML look like?
  • 12. Train Model on ML-Rig Photo by Tomomi
  • 14. Problem: Models are Cool, Feature prep is Hard Training is Tedious, Everyone Forgot Deployment
  • 17. What is Kubeflow? “Data Scientists” Model Serving On Kube Model Training *
  • 18. What is Kubeflow? “Kubeflow is a Cloud Native platform for machine learning based on Google’s internal machine learning pipelines.” or: ● The recognition that just a bunch of model weights isn’t enough ● Designed to support the ecosystem of tools needed (from data prep to serving) ● Open source project :) Ada Doglace
  • 19. Really just want to replace this: Photo by: Milestoned
  • 20. So you want to use this?
  • 21. What’s Next?! Step away from keyboard. Think about type(s) of model. Look at components directory and see what’s a fit tool wise. Don’t know? Choose Jupyter and deal with the details live. Can’t find it?
  • 23. What about just the basics?*
      ./scripts/kfctl.sh init ${KFAPP} --platform gcp --project ${PROJECT}
      cd ${KFAPP}
      ../scripts/kfctl.sh generate platform
      ../scripts/kfctl.sh apply platform
      ../scripts/kfctl.sh generate k8s
      ../scripts/kfctl.sh apply k8s
  • 24. What about just tensorflow?*
      ks registry add kubeflow github.com/kubeflow/kubeflow/tree/${VERSION}/kubeflow
      ks pkg install kubeflow/core@${VERSION}
      ks pkg install kubeflow/tf-serving@${VERSION}
      ks pkg install kubeflow/tf-job@${VERSION}
  • 25. Ok well I need to be able to access Jupyter too...
      kubectl port-forward -n ${NAMESPACE} `kubectl get pods -n ${NAMESPACE} --selector=service=ambassador -o jsonpath='{.items[0].metadata.name}'` 8080:80
  • 26. Your Special ML Training Goes here. Don’t have any pressing projects but still want to have fun? Check out Michelle’s notebook for GitHub Issue summarization. Or want to see MNIST again? here :)
  • 27. Your Special ML Training Goes here
      ...
      from keras.callbacks import CSVLogger, ModelCheckpoint
      script_name_base = 'tutorial_seq2seq'
      csv_logger = CSVLogger('{:}.log'.format(script_name_base))
      model_checkpoint = ModelCheckpoint('{:}.epoch{{epoch:02d}}-val{{val_loss:.5f}}.hdf5'.format(script_name_base),
                                         save_best_only=True)
  • 28. Your Special ML Training Goes here
      history = seq2seq_Model.fit([encoder_input_data, decoder_input_data],
                                  np.expand_dims(decoder_target_data, -1),
                                  batch_size=batch_size,
                                  epochs=epochs,
                                  validation_split=0.12,
                                  callbacks=[csv_logger, model_checkpoint])
    Really just check out Michelle’s notebook for GitHub Issue summarization.
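The checkpoint filename on slide 27 is formatted twice, which is easy to misread. The following is a minimal sketch of that mechanism using only Python string formatting. The epoch number and loss value are made-up illustration values, and the second `.format` call stands in for what Keras does when it fills the named placeholders with the epoch and the logged metrics.

```python
script_name_base = "tutorial_seq2seq"

# First pass: str.format fills the positional slot with the script name and
# collapses each doubled brace {{...}} into a single brace {...}.
filepath_template = "{:}.epoch{{epoch:02d}}-val{{val_loss:.5f}}.hdf5".format(script_name_base)

# Second pass: stand-in for the checkpoint callback filling the named
# placeholders at the end of an epoch (values here are hypothetical).
resolved = filepath_template.format(epoch=3, val_loss=0.123456)

print(filepath_template)
print(resolved)
```

The doubled braces are what keep `epoch` and `val_loss` alive as placeholders after the first pass, so each saved checkpoint gets its own file name.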
  • 29. But what about [special foo-baz-inator] or [special-yak-shaving-tool]? Write a Dockerfile and build an image; use FROM so you’re not starting from scratch:
      FROM gcr.io/kubeflow-images-public/tensorflow-1.6.0-notebook-cpu
      RUN pip install py-special-yak-shaving-tool
    Then set it as a param for your training/serving job as needed:
      ks param set tfjob-v1alpha2 image "my-special-image-goes-here"
  • 30. What about that magical feature prep? For now it’s a mostly write-by-hand situation. However, TFX has some cool tools we can use today (like TF.Transform) if we’re ok with DirectRunner or Dataflow (with Flink support in the works indirectly).
  • 31. Enter: TF.Transform ● For pre-processing of your data ● e.g. where you spend 90% of your dev time anyways ● Integrates into serving time :D ● OSS ● Runs on top of Apache Beam, but current release not yet scalable outside of GCP ● On Apache Beam master this can run-ish on Flink, but rough ● Please don’t use this in production today unless you’re on GCP/Dataflow Kathryn Yengel
  • 32. Defining a Transform processing function
      import tensorflow_transform as tft

      def preprocessing_fn(inputs):
          x = inputs['x']
          y = inputs['y']
          s = inputs['s']
          # Each tft.* call below needs a full pass over the data to compute its statistic.
          x_centered = x - tft.mean(x)
          y_normalized = tft.scale_to_0_1(y)
          s_int = tft.string_to_int(s)
          return {
              'x_centered': x_centered,
              'y_normalized': y_normalized,
              's_int': s_int}
  • 33. Analyzers: reduce (full pass); implemented as a distributed data pipeline. Transforms: instance-to-instance (don’t change batch dimension); pure TensorFlow. Operations shown: mean, stddev, normalize, multiply, quantiles, bucketize.
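The analyzer/transform split on slides 32 and 33 can be sketched in plain Python, without TensorFlow or Beam. This is an illustration of the idea only, not how TF.Transform is implemented: `analyze`, `transform`, the sample rows, and the rule for assigning string ids (most frequent first, ties alphabetical) are all assumptions made for this sketch.

```python
from collections import Counter


def analyze(rows):
    """Analyzer phase: one full pass over the dataset to compute constants."""
    xs = [r["x"] for r in rows]
    ys = [r["y"] for r in rows]
    counts = Counter(r["s"] for r in rows)
    # Assumption for this sketch: most frequent string gets id 0, ties alphabetical.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    vocab = {s: i for i, (s, _) in enumerate(ordered)}
    return {"x_mean": sum(xs) / len(xs), "y_min": min(ys), "y_max": max(ys), "vocab": vocab}


def transform(row, stats):
    """Transform phase: instance-to-instance, using only the precomputed constants."""
    y_range = stats["y_max"] - stats["y_min"]
    return {
        "x_centered": row["x"] - stats["x_mean"],
        "y_normalized": (row["y"] - stats["y_min"]) / y_range if y_range else 0.0,
        "s_int": stats["vocab"].get(row["s"], -1),  # -1 marks an unseen string
    }


rows = [
    {"x": 1.0, "y": 10.0, "s": "cat"},
    {"x": 2.0, "y": 20.0, "s": "dog"},
    {"x": 3.0, "y": 30.0, "s": "cat"},
]
stats = analyze(rows)
transformed = [transform(r, stats) for r in rows]
print(transformed)
```

Because `transform` depends only on the row and the frozen `stats`, the same function can be applied to a single request at serving time, which is the property the "integrates into serving time" bullet refers to.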
  • 35. Some common use-cases... Scale to ...; Bag of Words / N-Grams; Bucketization; Feature Crosses. Functions shown: tft.ngrams, tft.string_to_int, tf.string_split, tft.scale_to_z_score, tft.apply_buckets, tft.quantiles, tft.string_to_int, tf.string_join, ...
  • 36. BEAM Beyond the JVM: Current release ● Non-JVM BEAM doesn’t work outside of Google’s environment yet ● tl;dr: uses grpc / protobuf ○ Similar to the common design but with more efficient representations (often) ● But exciting new plans to unify the runners and ease the support of different languages (called SDKs) ○ See https://beam.apache.org/contribute/portability/ ● If this is exciting, you can come join me on making BEAM work in Python3 ○ Yes we still don’t have that :( ○ But we're getting closer & you can come join us on BEAM-2874 :D Emma
  • 37. Serving: TF is probably easiest for now...
      MODEL_COMPONENT=my-model-server
      MODEL_NAME=cat-finder-3k
      ks generate tf-serving ${MODEL_COMPONENT} --name=${MODEL_NAME}
      ks param set ${MODEL_COMPONENT} deployHttpProxy true
      ks param set ${MODEL_COMPONENT} modelPath ${MODEL_PATH}
      ks apply ${KF_ENV} -c ${MODEL_COMPONENT}
  • 38. Or use Seldon Core & friends* Seldon Core is an OSS platform for deploying ML models on Kubernetes, supported by Kubeflow. Supports many model types/formats: ● TensorFlow ● Sklearn ● Spark ML** ● R ● H2O
  • 39. Set up seldon core for serving
      # Gives cluster-admin role to the default service account
      kubectl create clusterrolebinding seldon-admin \
          --clusterrole=cluster-admin \
          --serviceaccount=${NAMESPACE}:default
      # Install the kubeflow/seldon package
      ks pkg install kubeflow/seldon
      # Generate the seldon component and deploy it
      ks generate seldon seldon --name=seldon
  • 40. Build an image with your model*
      docker run -v $(pwd):/my_model seldonio/core-python-wrapper:0.7 \
          /my_model IssueSummarization 0.1 gcr.io \
          --base-image=python:3.6 \
          --image-name=gcr-repository-name/my-image-name
  • 41. And kick off the new model:
      ks generate seldon-serve-simple new-serving-magic \
          --name=model-name \
          --image=gcr.io/gcr-repository-name/model:version \
          --namespace=${NAMESPACE} \
          --replicas=2
      ks apply ${KF_ENV} -c new-serving-magic
  • 42. Wait so how do I use this? Your favourite REST library goes here* Timeouts matter! Doing recommendations? Have fall-backs. Have multiple models? Fall-backs. *Need to use in batch? Maybe skip seldon, tf-serving & friends and integrate the library into your code. Or not. Trish Hamme
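A minimal sketch of the "timeouts matter, have fall-backs" advice from slide 42, using only the Python standard library. The function names, the JSON request/response shape, and the default timeout are assumptions for illustration; they are not the tf-serving or Seldon API, so adapt the payload to whatever your model server expects.

```python
import json
import urllib.request


def call_model(url, payload, timeout_s=0.5):
    """POST a JSON payload to a model endpoint, giving up after timeout_s seconds."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


def predict_with_fallbacks(payload, callers, default):
    """Try each caller in order; return the first answer, else the default."""
    for caller in callers:
        try:
            return caller(payload)
        except (OSError, ValueError):
            # OSError covers timeouts, refused connections and urllib's URLError;
            # ValueError covers a response body that is not valid JSON.
            continue
    return default


# Usage with stand-in callers so the sketch runs without a cluster.
def slow_primary(payload):
    raise TimeoutError("primary model timed out")


def cheap_secondary(payload):
    return {"recommendations": ["popular-item-1"]}


answer = predict_with_fallbacks({"user": 42}, [slow_primary, cheap_secondary], {"recommendations": []})
print(answer)
```

In a real deployment each caller would wrap `call_model` with a different endpoint, for example `lambda p: call_model(primary_url, p)`, with the static default as the last resort.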
  • 43. Scaling - or ruh roh people are using this! replicas: 1 Becomes replicas: 10 Factor of 10 =~ “science”
  • 44. Wait really? ● Early: switch from mini-kube to ${cloud provider} with GPUs ○ “Vertical” scaling ● Next: increase # of workers for training ○ “Horizontal” scaling ○ Auto-scaling also WIP per-backend for the most part ● Serving, # of replicas ○ Auto-scaling is a WIP - https://github.com/kubeflow/kubeflow/issues/1219 Jennifer C.
  • 45. What about validation? TensorFlow Data Validation (TFDV). Or roll your own? ● Counters & execution time most common ● Please also check % of data change. Spark-validator (proof of concept). Please validate your pipelines, and not just for data changes; validate for code changes too.
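For the "roll your own" option on slide 45, a counter-based check can be sketched as follows. This is a hypothetical helper, not TFDV or spark-validator: the function name, the use of record counts as the counter, and the 25% tolerance are assumptions chosen for illustration.

```python
def validate_run(current_count, previous_counts, max_relative_change=0.25):
    """Return True if this run's record count is close to the historical average."""
    if not previous_counts:
        return True  # first run: nothing to compare against yet
    baseline = sum(previous_counts) / len(previous_counts)
    if baseline == 0:
        return current_count == 0
    relative_change = abs(current_count - baseline) / baseline
    return relative_change <= max_relative_change


# Usage: counts from earlier pipeline runs versus today's run.
history = [950, 1050]
print(validate_run(1000, history))  # close to the baseline of 1000
print(validate_run(100, history))   # a 90% drop from the baseline
```

The same pattern applies to execution time or any other counter; the point is to gate publishing a model or dataset on the check passing rather than to inspect it after the fact.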
  • 46. Demo!
  • 48. Previously recorded live demos ● Kubeflow intro https://codelabs.developers.google.com/codelabs/kubeflow-introduction/index.html & streamed http://bit.ly/kfIntroStream ● Kubeflow E2E with GitHub issue summarization https://codelabs.developers.google.com/codelabs/cloud-kubeflow-e2e-gis/ & streamed http://bit.ly/kfGHStream ● You can tell they were live streamed by how poorly they went, I promise no video editing has occurred. ● You can do these yourself too (including one of them at our booth)!
  • 49. Join me & Boo @ Google’s booth @ 5PM. And join my coworker Casey West @ 6 talking about: Building Captain Obvious: Understand Faster with Machine Learning APIs
  • 50. Want to watch work on a Kubeflow PR? ● Join Holden Friday @ 2pm Pacific for live coding, continuing work on her Apache Spark to Kubeflow integration (using the existing Spark operator as a base) https://www.youtube.com/watch?v=zHnTdqbjPik ● Or just https://youtube.com/user/holdenkarau & like + subscribe + click the bell :p
  • 51. k thnx bye :) Give feedback on this presentation http://bit.ly/holdenTalkFeedback