SlideShare a Scribd company logo
1 of 58
Download to read offline
11
Moving Your Machine
Learning Models to
Production with
TensorFlow Extended
Jonathan Mugan
22
Moving From Our Hut to the Production Floor
Your model is going to live for a long time. Not just for a demo.
You must know when to update it. The world changes.
You must ensure production data matches training data. Data reflects its origins.
You may need to track multiple model versions. E.g., for different states.
You need to batch the input to serving. One-at-a-time is slow.
33
Interchangeable Parts
and the ML Revolution
Data Ingestion
(ExampleGen)
TensorFlow Data Validation
(StatisticsGen, SchemaGen, Example Validator)
TensorFlow Transform
(Transform)
Estimator or Keras Model
(Trainer)
TensorFlow Model Analysis
(Evaluator, Model Validator)
Validation Outcomes
(Pusher)
TensorFlow ServingImage by https://www.flickr.com/photos/36224933@N07
https://creativecommons.org/licenses/by-sa/2.0/deed.en
44
Interchangeable Parts
and the ML Revolution
• TensorFlow Extended (TFX)
• TFX used internally by Google and
recently open sourced
• Represents your pipeline to
production as a sequence of
components
• Building any one model is more
work, but for large endeavors, TFX
helps to keep you organized
Data Ingestion
(ExampleGen)
TensorFlow Data Validation
(StatisticsGen, SchemaGen, Example Validator)
TensorFlow Transform
(Transform)
Estimator or Keras Model
(Trainer)
TensorFlow Model Analysis
(Evaluator, Model Validator)
Validation Outcomes
(Pusher)
TensorFlow Serving
55
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
66
TensorFlow ExtendedData Ingestion
(ExampleGen)
TensorFlow Data Validation
(StatisticsGen, SchemaGen, Example Validator)
TensorFlow Transform
(Transform)
Estimator or Keras Model
(Trainer)
TensorFlow Model Analysis
(Evaluator, Model Validator)
Validation Outcomes
(Pusher)
TensorFlow Serving
ML Metadata (MLMD)
• Individual components talk to the
metadata store (MLMD).
• MLMD doesn’t store data itself. It
stores data about data.
https://www.tensorflow.org/tfx/guide
https://www.tensorflow.org/tfx/guide/mlmd
Each component has three pieces
1. Driver (gets information from MLMD)
2. Executor (does what the component does)
3. Publisher (writes results to MLMD)
Organized by library (components in parenthesis)
77
Data Ingestion
(ExampleGen)
TensorFlow Data Validation
(StatisticsGen, SchemaGen, Example Validator)
TensorFlow Transform
(Transform)
Estimator or Keras Model
(Trainer)
TensorFlow Model Analysis
(Evaluator, Model Validator)
Validation Outcomes
(Pusher)
TensorFlow Serving
• Pulls in your data and put it into binary format
• Also splits it into train and test
• Protocol Buffers
• tf.Example into a TFRecord file
https://www.tensorflow.org/tutorials/load_data/tfrecord
https://www.tensorflow.org/tfx/guide/examplegen
Data Ingestion:
ExampleGen
88
Data Ingestion
(ExampleGen)
TensorFlow Data Validation
(StatisticsGen, SchemaGen, Example Validator)
TensorFlow Transform
(Transform)
Estimator or Keras Model
(Trainer)
TensorFlow Model Analysis
(Evaluator, Model Validator)
Validation Outcomes
(Pusher)
TensorFlow Serving
Looks at your data and generates a schema, which
you manually update.
It makes sure the data you pass in later during serving
is still in the same format and hasn’t drifted.
Also has a great way to visualize data, FACETS, we will
see later.
https://www.tensorflow.org/tfx/guide/tfdv
TensorFlow Data Validation:
StatisticsGen, SchemaGen,
Example Validator
99
Data Ingestion
(ExampleGen)
TensorFlow Data Validation
(StatisticsGen, SchemaGen, Example Validator)
TensorFlow Transform
(Transform)
Estimator or Keras Model
(Trainer)
TensorFlow Model Analysis
(Evaluator, Model Validator)
Validation Outcomes
(Pusher)
TensorFlow Serving
Example Schema
1010
Data Ingestion
(ExampleGen)
TensorFlow Data Validation
(StatisticsGen, SchemaGen, Example Validator)
TensorFlow Transform
(Transform)
Estimator or Keras Model
(Trainer)
TensorFlow Model Analysis
(Evaluator, Model Validator)
Validation Outcomes
(Pusher)
TensorFlow Serving
Converts your data
• E.g., One-hot encoding, categorical with a vocab
• Part of TensorFlow graph, for better or worse
• Good for transformations that require looking at
all values
TensorFlow Transform:
Transform
Example: tft.scale_to_z_score
subtracts mean and divides by standard deviation
Features come in many types, and TensorFlow
Transform converts them into a format that can be
ingested by a machine learning model.
Nice to have this explicit.
https://www.tensorflow.org/tfx/guide/transform
1111
Data Ingestion
(ExampleGen)
TensorFlow Data Validation
(StatisticsGen, SchemaGen, Example Validator)
TensorFlow Transform
(Transform)
Estimator or Keras Model
(Trainer)
TensorFlow Model Analysis
(Evaluator, Model Validator)
Validation Outcomes
(Pusher)
TensorFlow Serving
• Trains the model: Part we are all
familiar with
• Except uses an Estimator
• Can use KERAS
tf.keras.estimator.model_to_estimator()
https://www.tensorflow.org/tfx/guide/trainer
Estimator or Keras Model:
Trainer
1212
Data Ingestion
(ExampleGen)
TensorFlow Data Validation
(StatisticsGen, SchemaGen, Example Validator)
TensorFlow Transform
(Transform)
Estimator or Keras Model
(Trainer)
TensorFlow Model Analysis
(Evaluator, Model Validator)
Validation Outcomes
(Pusher)
TensorFlow Serving
Evaluator Component
• Evaluates the model.
• Uses TensorFlow Model Analysis (TFMA), which we
will see shortly.
• https://www.tensorflow.org/tfx/guide/evaluator
TensorFlow Model Analysis:
Evaluator, Model Validator
Model Validator Component
• You set a baseline (such as the current
serving model) and a metric (such as AUC)
• Marks in the metadata if the model passes
the baseline.
• https://www.tensorflow.org/tfx/guide/modelval
1313
Data Ingestion
(ExampleGen)
TensorFlow Data Validation
(StatisticsGen, SchemaGen, Example Validator)
TensorFlow Transform
(Transform)
Estimator or Keras Model
(Trainer)
TensorFlow Model Analysis
(Evaluator, Model Validator)
Validation Outcomes
(Pusher)
TensorFlow Serving
For a deeper understanding, see Ice-T’s 1988
hit song, “I’m Your Pusher”
https://www.tensorflow.org/tfx/guide/pusher
Validation Outcomes:
Pusher
• Pushes the model to Serving if it is validated
• I.e., if your new model is better than the
existing model, push it to the model server.
1414
Data Ingestion
(ExampleGen)
TensorFlow Data Validation
(StatisticsGen, SchemaGen, Example Validator)
TensorFlow Transform
(Transform)
Estimator or Keras Model
(Trainer)
TensorFlow Model Analysis
(Evaluator, Model Validator)
Validation Outcomes
(Pusher)
TensorFlow Serving
• Uses the model to perform inference
• Called via gRPC APIs or RESTFUL APIs
• Easy to get running with Docker
• You can call a particular version of a model
• Takes care of batching
https://www.tensorflow.org/tfx/guide/serving
TensorFlow Serving
1515
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
1616
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• ML Metadata
• Apache Airflow
• TensorFlow and TensorFlow Tools
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
1717
Metadata StoreData Ingestion
(ExampleGen)
TensorFlow Data Validation
(StatisticsGen, SchemaGen, Example Validator)
TensorFlow Transform
(Transform)
Estimator or Keras Model
(Trainer)
TensorFlow Model Analysis
(Evaluator, Model Validator)
Validation Outcomes
(Pusher)
TensorFlow Serving
(Model Server)
ML Metadata (MLMD)
https://www.tensorflow.org/tfx/guide/mlmd
1818
Looking at the table Artifact using the DB Browser for SQLite
1919
The Types of
Artifacts
The type_id
field from the
previous slide
maps here
2020
You can see the properties of the artifacts in the ArtifactProperty table
2121
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• ML Metadata
• Apache Airflow
• TensorFlow and TensorFlow Tools
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
2222
Pipeline Management with Apache Airflow
Allows you to trigger and keep track of pipelines.
2323
Pipeline Management with Apache Airflow
2424
Pipeline Management with Apache Airflow
• You can also use Kubeflow https://github.com/tensorflow/tfx/blob/master/tfx/examples/chicago_taxi_pipeline/taxi_pipeline_kubeflow_gcp.py
• And Apache Beam https://github.com/tensorflow/tfx/blob/master/tfx/examples/chicago_taxi_pipeline/taxi_pipeline_beam.py
2525
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• TensorFlow 2.0
• TensorFlow Data and Features
• TensorFlow Estimators
• TensorBoard
• TensorFlow Data Visualization (TFDV) [Facets]
• TensorFlow Model Analysis (TFMA)
• What-If Tool
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
2626
TensorFlow 2.0
• Don’t have to define the graph separately
• More like PyTorch
• There are two ways you can do computation:
• Eager: like PyTorch, just compute
• tf.function: You decorate a function and call it
2727
TensorFlow 1.x
session
TensorFlow 2.x
function
TensorFlow 2.x
eager
output
output output
Still get performance of Session
https://www.tensorflow.org/guide/function https://www.tensorflow.org/guide/eager
Debug like a civilized person
2828
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• TensorFlow 2.0
• TensorFlow Data and Features
• TensorFlow Estimators
• TensorBoard
• TensorFlow Data Visualization (TFDV) [Facets]
• TensorFlow Model Analysis (TFMA)
• What-If Tool
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
2929
Data
• tf.train.Example is tf.train.Feature protobuf
message, where each value has a name
and a type (tf.train.BytesList,
tf.train.FloatList, tf.train.Int64List)
• TFRecord is a format for storing sequences
of binary records, each record is
tf.train.Example
• tf.data.Dataset can take in TFRecord and
create an iterator for batching
• tf.parse_example unpacks tf.Example into
standard tensors.
https://www.tensorflow.org/tutorials/load_data/tf_records
Features
• tf.feature_column,
where you further
specify what it is, such as
one-hot, vocabulary, and
embeddings and such.
https://www.tensorflow.org/guide/feature_columns
tf.train.Example specifies what it is
for storage, and tf.feature_column is
for the input to a model.
3030
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• TensorFlow 2.0
• TensorFlow Data and Features
• TensorFlow Estimators
• TensorBoard
• TensorFlow Data Visualization (TFDV) [Facets]
• TensorFlow Model Analysis (TFMA)
• What-If Tool
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
3131
To build a model you need
• format of model input
• tf.feature_column
• model architecture and hyperparameters
• tf.estimator
• (or KERAS with tf.keras.estimator.model_to_estimator)
• function to deliver training data
• tf.estimator.TrainSpec from tf.data
• function to deliver eval data
• tf.estimator.EvalSpec from tf.data
• function to deliver serving data
• tf.estimator.FinalExporter
3232
TensorFlow Estimator
• Estimator is a wrapper for regular
TensorFlow that automatically
scales to multiple machines and
automatically outputs results to
TensorBoard
Shout out to model explainability using estimator using boosted trees
https://www.tensorflow.org/tutorials/estimator/boosted_trees_model_understanding
https://www.tensorflow.org/tutorials/estimator/boosted_trees
3333
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• TensorFlow 2.0
• TensorFlow Data and Features
• TensorFlow Estimators
• TensorBoard
• TensorFlow Data Visualization (TFDV) [Facets]
• TensorFlow Model Analysis (TFMA)
• What-If Tool
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
3434
TensorBoard
Plotting of prescribers
Red has more overdose
Green has fewer
3D plots not that useful,
but they look cool
You can even use TensorBoard
from PyTorch
https://pytorch.org/docs/stable/tensorboard.html
3535
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• TensorFlow 2.0
• TensorFlow Data and Features
• TensorFlow Estimators
• TensorBoard
• TensorFlow Data Visualization (TFDV) [Facets]
• TensorFlow Model Analysis (TFMA)
• What-If Tool
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
3636
TensorFlow Data Validation (TFDV)
• We need to understand our data as well as possible.
• TFDV provides tools that make that less difficult.
• Helps to identify bugs in the data by showing you
pictures that don’t look right.
• https://www.tensorflow.org/tfx/data_validation/get_started
3737
3838
3939
By sorting by non-uniformity,
we can debug features.
4040
In general, we can make sure the distributions are what we would expect.
4141
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• TensorFlow 2.0
• TensorFlow Data and Features
• TensorFlow Estimators
• TensorBoard
• TensorFlow Data Visualization (TFDV) [Facets]
• TensorFlow Model Analysis (TFMA)
• What-If Tool
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
4242
TensorFlow
Model
Analysis
(TFMA)
We can see how well
our model does
by each slice.
We see that this model
does much better for
females than males.
https://www.tensorflow.org/tfx/guide/tfma
4343
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• TensorFlow 2.0
• TensorFlow Data and Features
• TensorFlow Estimators
• TensorBoard
• TensorFlow Data Visualization (TFDV) [Facets]
• TensorFlow Model Analysis (TFMA)
• What-If Tool
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
4444
What-IF Tool
• The What-If Tool applies a model from TensorFlow
Serving to any data you give it.
• https://pair-code.github.io/what-if-tool/index.html
• Change a record and see what the model does
• Find the most similar record with a different classification
• Can be used for fairness. Adjust the model so it is equally
likely to predict “yes” for each group
• https://www.coursera.org/lecture/machine-learning-business-professionals/activity-
applying-fairness-concerns-with-the-what-if-tool-review-0mYda
4545
Looking at the data
by race, age, and
inference score
4646
What-If Tool showing the
the probability of overdose
for individual features.
4747
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
4848
Alternatives (kind of)
• MLflow https://mlflow.org/docs/latest/index.html
• Netflix Metaflow https://github.com/Netflix/metaflow
• Sacred https://github.com/IDSIA/sacred
• Dataiku DSS https://www.dataiku.com/product/
• Polyaxon https://polyaxon.com/
• Facebook Ax https://www.ax.dev/
They all do something a little different,
with pieces straddling different sides of the data science/production divide
4949
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Streamlit Dashboard
• Python Typing, Dataclasses, and Enum
• Conclusion
5050
Streamlit
Dashboard
• Writes to the browser
• Works well for artifacts in the ML pipeline.
https://github.com/streamlit/streamlit/
https://streamlit.io/docs/getting_started.html
5151
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Streamlit Dashboard
• Python Typing, Dataclasses, and Enum
• Conclusion
5252
Typing, Dataclasses,
and Enum
You can build interchangeable
parts right in Python.
Not new of course, but they make
Python a little less wild west.
Output:
[Turtle(size=6, name='Anita'),
Turtle(size=2, name='Anita')]
5353
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
5454
TFX Disadvantages
• Steep learning curve
• Changes constantly (but not while you are watching it)
• Somewhat inflexible, you can create your own
components, but steep learning curve
• No hyperparameter search (yet,
https://github.com/tensorflow/tfx/issues/182)
5555
TFX Advantages
• Set up to scale
• Documents your process through artifacts
• Warm-starting: as new data comes in, you don’t have to
start training over. Keeps models fresh
• Tools to see data and debug problems
• Don’t have to rerun what is already run
5656
Where to Start
• Jupyter notebook tutorial
https://www.tensorflow.org/tfx/tutorials/tfx/components
• Airflow tutorial
https://www.tensorflow.org/tfx/tutorials/tfx/airflow_workshop
5757
Happy Hour!
6500 River Place Blvd.
Bldg. 3, Suite 120
Austin, TX. 78730
Jonathan Mugan, Ph. D.
Email: jmugan@deumbra.com
5858
Appendix
• Original TFX paper
https://ai.google/research/pubs/pub46484
• Documentation
• https://www.tensorflow.org/tfx
• https://www.tensorflow.org/tfx/tutorials
• https://www.tensorflow.org/tfx/guide
• https://www.tensorflow.org/tfx/api_docs

More Related Content

What's hot

Federated Learning
Federated LearningFederated Learning
Federated Learning
DataWorks Summit
 

What's hot (20)

Federated Learning
Federated LearningFederated Learning
Federated Learning
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Naive Bayes Classifier in Python | Naive Bayes Algorithm | Machine Learning A...
Naive Bayes Classifier in Python | Naive Bayes Algorithm | Machine Learning A...Naive Bayes Classifier in Python | Naive Bayes Algorithm | Machine Learning A...
Naive Bayes Classifier in Python | Naive Bayes Algorithm | Machine Learning A...
 
Parametric & Non-Parametric Machine Learning (Supervised ML)
Parametric & Non-Parametric Machine Learning (Supervised ML)Parametric & Non-Parametric Machine Learning (Supervised ML)
Parametric & Non-Parametric Machine Learning (Supervised ML)
 
Introduction of Deep Learning
Introduction of Deep LearningIntroduction of Deep Learning
Introduction of Deep Learning
 
Machine Learning Pipelines
Machine Learning PipelinesMachine Learning Pipelines
Machine Learning Pipelines
 
K Nearest Neighbors
K Nearest NeighborsK Nearest Neighbors
K Nearest Neighbors
 
Transfer Learning
Transfer LearningTransfer Learning
Transfer Learning
 
AI, Machine Learning and Deep Learning - The Overview
AI, Machine Learning and Deep Learning - The OverviewAI, Machine Learning and Deep Learning - The Overview
AI, Machine Learning and Deep Learning - The Overview
 
Machine learning for sensor Data Analytics
Machine learning for sensor Data AnalyticsMachine learning for sensor Data Analytics
Machine learning for sensor Data Analytics
 
Big Data & Data Science
Big Data & Data ScienceBig Data & Data Science
Big Data & Data Science
 
NAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIERNAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIER
 
Introduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersIntroduction to Machine Learning Classifiers
Introduction to Machine Learning Classifiers
 
Data+Science : A First Course
Data+Science : A First CourseData+Science : A First Course
Data+Science : A First Course
 
KNN presentation.pdf
KNN presentation.pdfKNN presentation.pdf
KNN presentation.pdf
 
H2O.ai's Driverless AI
H2O.ai's Driverless AIH2O.ai's Driverless AI
H2O.ai's Driverless AI
 
Prepare your data for machine learning
Prepare your data for machine learningPrepare your data for machine learning
Prepare your data for machine learning
 
Lecture #01
Lecture #01Lecture #01
Lecture #01
 
KNN
KNNKNN
KNN
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 

Similar to Moving Your Machine Learning Models to Production with TensorFlow Extended

Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Chris Fregly
 
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
Simplilearn
 

Similar to Moving Your Machine Learning Models to Production with TensorFlow Extended (20)

TensorFlow Extension (TFX) and Apache Beam
TensorFlow Extension (TFX) and Apache BeamTensorFlow Extension (TFX) and Apache Beam
TensorFlow Extension (TFX) and Apache Beam
 
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
 
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlowTensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
 
A Tour of Tensorflow's APIs
A Tour of Tensorflow's APIsA Tour of Tensorflow's APIs
A Tour of Tensorflow's APIs
 
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
 
running Tensorflow in Production
running Tensorflow in Productionrunning Tensorflow in Production
running Tensorflow in Production
 
Certification Study Group -Professional ML Engineer Session 2 (GCP-TensorFlow...
Certification Study Group -Professional ML Engineer Session 2 (GCP-TensorFlow...Certification Study Group -Professional ML Engineer Session 2 (GCP-TensorFlow...
Certification Study Group -Professional ML Engineer Session 2 (GCP-TensorFlow...
 
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
 
Meetup tensorframes
Meetup tensorframesMeetup tensorframes
Meetup tensorframes
 
TensorFlow example for AI Ukraine2016
TensorFlow example  for AI Ukraine2016TensorFlow example  for AI Ukraine2016
TensorFlow example for AI Ukraine2016
 
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
 
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
 
Overview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language ProcessingOverview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language Processing
 
Tensor flow 2.0 what's new
Tensor flow 2.0  what's newTensor flow 2.0  what's new
Tensor flow 2.0 what's new
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks
 
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowMLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Advanced Spark and TensorFlow Meetup May 26, 2016
Advanced Spark and TensorFlow Meetup May 26, 2016Advanced Spark and TensorFlow Meetup May 26, 2016
Advanced Spark and TensorFlow Meetup May 26, 2016
 
Neural Networks with Google TensorFlow
Neural Networks with Google TensorFlowNeural Networks with Google TensorFlow
Neural Networks with Google TensorFlow
 
BlaBlaConf'22 The art of MLOps in TensorFlow Ecosystem
BlaBlaConf'22 The art of MLOps in TensorFlow EcosystemBlaBlaConf'22 The art of MLOps in TensorFlow Ecosystem
BlaBlaConf'22 The art of MLOps in TensorFlow Ecosystem
 

More from Jonathan Mugan

How to build someone we can talk to
How to build someone we can talk toHow to build someone we can talk to
How to build someone we can talk to
Jonathan Mugan
 

More from Jonathan Mugan (9)

How to build someone we can talk to
How to build someone we can talk toHow to build someone we can talk to
How to build someone we can talk to
 
Generating Natural-Language Text with Neural Networks
Generating Natural-Language Text with Neural NetworksGenerating Natural-Language Text with Neural Networks
Generating Natural-Language Text with Neural Networks
 
Data Day Seattle, From NLP to AI
Data Day Seattle, From NLP to AIData Day Seattle, From NLP to AI
Data Day Seattle, From NLP to AI
 
Data Day Seattle, Chatbots from First Principles
Data Day Seattle, Chatbots from First PrinciplesData Day Seattle, Chatbots from First Principles
Data Day Seattle, Chatbots from First Principles
 
Chatbots from first principles
Chatbots from first principlesChatbots from first principles
Chatbots from first principles
 
From Natural Language Processing to Artificial Intelligence
From Natural Language Processing to Artificial IntelligenceFrom Natural Language Processing to Artificial Intelligence
From Natural Language Processing to Artificial Intelligence
 
What Deep Learning Means for Artificial Intelligence
What Deep Learning Means for Artificial IntelligenceWhat Deep Learning Means for Artificial Intelligence
What Deep Learning Means for Artificial Intelligence
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing
 
What Deep Learning Means for Artificial Intelligence
What Deep Learning Means for Artificial IntelligenceWhat Deep Learning Means for Artificial Intelligence
What Deep Learning Means for Artificial Intelligence
 

Recently uploaded

Recently uploaded (20)

Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
 
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsContinuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
 
Enterprise Knowledge Graphs - Data Summit 2024
Enterprise Knowledge Graphs - Data Summit 2024Enterprise Knowledge Graphs - Data Summit 2024
Enterprise Knowledge Graphs - Data Summit 2024
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform Engineering
 
PLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. StartupsPLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. Startups
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
 
Easier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties ReimaginedEasier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties Reimagined
 
Designing for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at ComcastDesigning for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at Comcast
 
Intro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераIntro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджера
 
Your enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4jYour enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4j
 
A Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System StrategyA Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System Strategy
 
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
 
IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and Planning
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
ECS 2024 Teams Premium - Pretty Secure
ECS 2024   Teams Premium - Pretty SecureECS 2024   Teams Premium - Pretty Secure
ECS 2024 Teams Premium - Pretty Secure
 
Oauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftOauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoft
 
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxWSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 

Moving Your Machine Learning Models to Production with TensorFlow Extended

  • 1. 11 Moving Your Machine Learning Models to Production with TensorFlow Extended Jonathan Mugan
  • 2. 22 Moving From Our Hut to the Production Floor Your model is going to live for a long time. Not just for a demo. You must know when to update it. The world changes. You must ensure production data matches training data. Data reflects its origins. You may need to track multiple model versions. E.g., for different states. You need to batch the input to serving. One-at-a-time is slow.
  • 3. 33 Interchangeable Parts and the ML Revolution Data Ingestion (ExampleGen) TensorFlow Data Validation (StatisticsGen, SchemaGen, Example Validator) TensorFlow Transform (Transform) Estimator or Keras Model (Trainer) TensorFlow Model Analysis (Evaluator, Model Validator) Validation Outcomes (Pusher) TensorFlow ServingImage by https://www.flickr.com/photos/36224933@N07 https://creativecommons.org/licenses/by-sa/2.0/deed.en
  • 4. 44 Interchangeable Parts and the ML Revolution • TensorFlow Extended (TFX) • TFX used internally by Google and recently open sourced • Represents your pipeline to production as a sequence of components • Building any one model is more work, but for large endeavors, TFX helps to keep you organized Data Ingestion (ExampleGen) TensorFlow Data Validation (StatisticsGen, SchemaGen, Example Validator) TensorFlow Transform (Transform) Estimator or Keras Model (Trainer) TensorFlow Model Analysis (Evaluator, Model Validator) Validation Outcomes (Pusher) TensorFlow Serving
  • 5. 55 Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 6. 66 TensorFlow ExtendedData Ingestion (ExampleGen) TensorFlow Data Validation (StatisticsGen, SchemaGen, Example Validator) TensorFlow Transform (Transform) Estimator or Keras Model (Trainer) TensorFlow Model Analysis (Evaluator, Model Validator) Validation Outcomes (Pusher) TensorFlow Serving ML Metadata (MLMD) • Individual components talk to the metadata store (MLMD). • MLMD doesn’t store data itself. It stores data about data. https://www.tensorflow.org/tfx/guide https://www.tensorflow.org/tfx/guide/mlmd Each component has three pieces 1. Driver (gets information from MLMD) 2. Executor (does what the component does) 3. Publisher (writes results to MLMD) Organized by library (components in parenthesis)
  • 7. 77 Data Ingestion (ExampleGen) TensorFlow Data Validation (StatisticsGen, SchemaGen, Example Validator) TensorFlow Transform (Transform) Estimator or Keras Model (Trainer) TensorFlow Model Analysis (Evaluator, Model Validator) Validation Outcomes (Pusher) TensorFlow Serving • Pulls in your data and put it into binary format • Also splits it into train and test • Protocol Buffers • tf.Example into a TFRecord file https://www.tensorflow.org/tutorials/load_data/tfrecord https://www.tensorflow.org/tfx/guide/examplegen Data Ingestion: ExampleGen
  • 8. 88 Data Ingestion (ExampleGen) TensorFlow Data Validation (StatisticsGen, SchemaGen, Example Validator) TensorFlow Transform (Transform) Estimator or Keras Model (Trainer) TensorFlow Model Analysis (Evaluator, Model Validator) Validation Outcomes (Pusher) TensorFlow Serving Looks at your data and generates a schema, which you manually update. It makes sure the data you pass in later during serving is still in the same format and hasn’t drifted. Also has a great way to visualize data, FACETS, we will see later. https://www.tensorflow.org/tfx/guide/tfdv TensorFlow Data Validation: StatisticsGen, SchemaGen, Example Validator
  • 9. 99 Data Ingestion (ExampleGen) TensorFlow Data Validation (StatisticsGen, SchemaGen, Example Validator) TensorFlow Transform (Transform) Estimator or Keras Model (Trainer) TensorFlow Model Analysis (Evaluator, Model Validator) Validation Outcomes (Pusher) TensorFlow Serving Example Schema
  • 10. 1010 Data Ingestion (ExampleGen) TensorFlow Data Validation (StatisticsGen, SchemaGen, Example Validator) TensorFlow Transform (Transform) Estimator or Keras Model (Trainer) TensorFlow Model Analysis (Evaluator, Model Validator) Validation Outcomes (Pusher) TensorFlow Serving Converts your data • E.g., One-hot encoding, categorical with a vocab • Part of TensorFlow graph, for better or worse • Good for transformations that require looking at all values TensorFlow Transform: Transform Example: tft.scale_to_z_score subtracts mean and divides by standard deviation Features come in many types, and TensorFlow Transform converts them into a format that can be ingested by a machine learning model. Nice to have this explicit. https://www.tensorflow.org/tfx/guide/transform
  • 11. 1111 Data Ingestion (ExampleGen) TensorFlow Data Validation (StatisticsGen, SchemaGen, Example Validator) TensorFlow Transform (Transform) Estimator or Keras Model (Trainer) TensorFlow Model Analysis (Evaluator, Model Validator) Validation Outcomes (Pusher) TensorFlow Serving • Trains the model: Part we are all familiar with • Except uses an Estimator • Can use KERAS tf.keras.estimator.model_to_estimator() https://www.tensorflow.org/tfx/guide/trainer Estimator or Keras Model: Trainer
  • 12. 1212 Data Ingestion (ExampleGen) TensorFlow Data Validation (StatisticsGen, SchemaGen, Example Validator) TensorFlow Transform (Transform) Estimator or Keras Model (Trainer) TensorFlow Model Analysis (Evaluator, Model Validator) Validation Outcomes (Pusher) TensorFlow Serving Evaluator Component • Evaluates the model. • Uses TensorFlow Model Analysis (TFMA), which we will see shortly. • https://www.tensorflow.org/tfx/guide/evaluator TensorFlow Model Analysis: Evaluator, Model Validator Model Validator Component • You set a baseline (such as the current serving model) and a metric (such as AUC) • Marks in the metadata if the model passes the baseline. • https://www.tensorflow.org/tfx/guide/modelval
  • 13. 1313 Data Ingestion (ExampleGen) TensorFlow Data Validation (StatisticsGen, SchemaGen, Example Validator) TensorFlow Transform (Transform) Estimator or Keras Model (Trainer) TensorFlow Model Analysis (Evaluator, Model Validator) Validation Outcomes (Pusher) TensorFlow Serving For a deeper understanding, see Ice-T’s 1988 hit song, “I’m Your Pusher” https://www.tensorflow.org/tfx/guide/pusher Validation Outcomes: Pusher • Pushes the model to Serving if it is validated • I.e., if your new model is better than the existing model, push it to the model server.
  • 14. 1414 Data Ingestion (ExampleGen) TensorFlow Data Validation (StatisticsGen, SchemaGen, Example Validator) TensorFlow Transform (Transform) Estimator or Keras Model (Trainer) TensorFlow Model Analysis (Evaluator, Model Validator) Validation Outcomes (Pusher) TensorFlow Serving • Uses the model to perform inference • Called via gRPC APIs or RESTFUL APIs • Easy to get running with Docker • You can call a particular version of a model • Takes care of batching https://www.tensorflow.org/tfx/guide/serving TensorFlow Serving
  • 15. 1515 Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 16. 1616 Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • ML Metadata • Apache Airflow • TensorFlow and TensorFlow Tools • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 17. 1717 Metadata StoreData Ingestion (ExampleGen) TensorFlow Data Validation (StatisticsGen, SchemaGen, Example Validator) TensorFlow Transform (Transform) Estimator or Keras Model (Trainer) TensorFlow Model Analysis (Evaluator, Model Validator) Validation Outcomes (Pusher) TensorFlow Serving (Model Server) ML Metadata (MLMD) https://www.tensorflow.org/tfx/guide/mlmd
  • 18. 1818 Looking at the table Artifact using the DB Browser for SQLite
  • 19. 1919 The Types of Artifacts The type_id field from the previous slide maps here
  • 20. 2020 You can see the properties of the artifacts in the ArtifactProperty table
  • 21. 2121 Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • ML Metadata • Apache Airflow • TensorFlow and TensorFlow Tools • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 22. 2222 Pipeline Management with Apache Airflow Allows you to trigger and keep track of pipelines.
  • 24. 2424 Pipeline Management with Apache Airflow • You can also use Kubeflow https://github.com/tensorflow/tfx/blob/master/tfx/examples/chicago_taxi_pipeline/taxi_pipeline_kubeflow_gcp.py • And Apache Beam https://github.com/tensorflow/tfx/blob/master/tfx/examples/chicago_taxi_pipeline/taxi_pipeline_beam.py
  • 25. 2525 Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools • TensorFlow 2.0 • TensorFlow Data and Features • TensorFlow Estimators • TensorBoard • TensorFlow Data Visualization (TFDV) [Facets] • TensorFlow Model Analysis (TFMA) • What-If Tool • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 26. 2626 TensorFlow 2.0 • Don’t have to define the graph separately • More like PyTorch • There are two ways you can do computation: • Eager: like PyTorch, just compute • tf.function: You decorate a function and call it
  • 27. 2727 TensorFlow 1.x session TensorFlow 2.x function TensorFlow 2.x eager output output output Still get performance of Session https://www.tensorflow.org/guide/function https://www.tensorflow.org/guide/eager Debug like a civilized person
  • 28. 2828 Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools • TensorFlow 2.0 • TensorFlow Data and Features • TensorFlow Estimators • TensorBoard • TensorFlow Data Visualization (TFDV) [Facets] • TensorFlow Model Analysis (TFMA) • What-If Tool • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 29. 2929 Data • tf.train.Example is tf.train.Feature protobuf message, where each value has a name and a type (tf.train.BytesList, tf.train.FloatList, tf.train.Int64List) • TFRecord is a format for storing sequences of binary records, each record is tf.train.Example • tf.data.Dataset can take in TFRecord and create an iterator for batching • tf.parse_example unpacks tf.Example into standard tensors. https://www.tensorflow.org/tutorials/load_data/tf_records Features • tf.feature_column, where you further specify what it is, such as one-hot, vocabulary, and embeddings and such. https://www.tensorflow.org/guide/feature_columns tf.train.Example specifies what it is for storage, and tf.feature_column is for the input to a model.
  • 30. 3030 Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools • TensorFlow 2.0 • TensorFlow Data and Features • TensorFlow Estimators • TensorBoard • TensorFlow Data Visualization (TFDV) [Facets] • TensorFlow Model Analysis (TFMA) • What-If Tool • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 31. 3131 To build a model you need • format of model input • tf.feature_column • model architecture and hyperparameters • tf.estimator • (or KERAS with tf.keras.estimator.model_to_estimator) • function to deliver training data • tf.estimator.TrainSpec from tf.data • function to deliver eval data • tf.estimator.EvalSpec from tf.data • function to deliver serving data • tf.estimator.FinalExporter
  • 32. 3232 TensorFlow Estimator • Estimator is a wrapper for regular TensorFlow that automatically scales to multiple machines and automatically outputs results to TensorBoard Shout out to model explainability using estimator using boosted trees https://www.tensorflow.org/tutorials/estimator/boosted_trees_model_understanding https://www.tensorflow.org/tutorials/estimator/boosted_trees
  • 33. 3333 Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools • TensorFlow 2.0 • TensorFlow Data and Features • TensorFlow Estimators • TensorBoard • TensorFlow Data Visualization (TFDV) [Facets] • TensorFlow Model Analysis (TFMA) • What-If Tool • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 34. 3434 TensorBoard Plotting of prescribers Red has more overdose Green has fewer 3D plots not that useful, but they look cool You can even use TensorBoard from PyTorch https://pytorch.org/docs/stable/tensorboard.html
  • 35. 3535 Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools • TensorFlow 2.0 • TensorFlow Data and Features • TensorFlow Estimators • TensorBoard • TensorFlow Data Visualization (TFDV) [Facets] • TensorFlow Model Analysis (TFMA) • What-If Tool • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 36. 3636 TensorFlow Data Validation (TFDV) • We need to understand our data as well as possible. • TFDV provides tools that make that less difficult. • Helps to identify bugs in the data by showing you pictures that don’t look right. • https://www.tensorflow.org/tfx/data_validation/get_started
  • 37. 3737
  • 38. 3838
  • 39. 3939 By sorting by non-uniformity, we can debug features.
  • 40. 4040 In general, we can make sure the distributions are what we would expect.
  • 41. 4141 Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools • TensorFlow 2.0 • TensorFlow Data and Features • TensorFlow Estimators • TensorBoard • TensorFlow Data Visualization (TFDV) [Facets] • TensorFlow Model Analysis (TFMA) • What-If Tool • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 42. 4242 TensorFlow Model Analysis (TFMA) We can see how well our model does by each slice. We see that this model does much better for females than males. https://www.tensorflow.org/tfx/guide/tfma
  • 43. 4343 Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools • TensorFlow 2.0 • TensorFlow Data and Features • TensorFlow Estimators • TensorBoard • TensorFlow Data Visualization (TFDV) [Facets] • TensorFlow Model Analysis (TFMA) • What-If Tool • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 44. 4444 What-IF Tool • The What-If Tool applies a model from TensorFlow Serving to any data you give it. • https://pair-code.github.io/what-if-tool/index.html • Change a record and see what the model does • Find the most similar record with a different classification • Can be used for fairness. Adjust the model so it is equally likely to predict “yes” for each group • https://www.coursera.org/lecture/machine-learning-business-professionals/activity- applying-fairness-concerns-with-the-what-if-tool-review-0mYda
  • 45. 4545 Looking at the data by race, age, and inference score
  • 46. 4646 What-If Tool showing the the probability of overdose for individual features.
  • 47. 4747 Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 48. 4848 Alternatives (kind of) • MLflow https://mlflow.org/docs/latest/index.html • Netflix Metaflow https://github.com/Netflix/metaflow • Sacred https://github.com/IDSIA/sacred • Dataiku DSS https://www.dataiku.com/product/ • Polyaxon https://polyaxon.com/ • Facebook Ax https://www.ax.dev/ They all do something a little different, with pieces straddling different sides of the data science/production divide
  • 49. 4949 Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools • Alternatives to TensorFlow Extended • Other Useful Tools • Streamlit Dashboard • Python Typing, Dataclasses, and Enum • Conclusion
  • 50. 5050 Streamlit Dashboard • Writes to the browser • Works well for artifacts in the ML pipeline. https://github.com/streamlit/streamlit/ https://streamlit.io/docs/getting_started.html
  • 51. 5151 Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools • Alternatives to TensorFlow Extended • Other Useful Tools • Streamlit Dashboard • Python Typing, Dataclasses, and Enum • Conclusion
  • 52. 5252 Typing, Dataclasses, and Enum You can build interchangeable parts right in Python. Not new of course, but they make Python a little less wild west. Output: [Turtle(size=6, name='Anita'), Turtle(size=2, name='Anita')]
  • 53. 5353 Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 54. 5454 TFX Disadvantages • Steep learning curve • Changes constantly (but not while you are watching it) • Somewhat inflexible, you can create your own components, but steep learning curve • No hyperparameter search (yet, https://github.com/tensorflow/tfx/issues/182)
  • 55. 5555 TFX Advantages • Set up to scale • Documents your process through artifacts • Warm-starting: as new data comes in, you don’t have to start training over. Keeps models fresh • Tools to see data and debug problems • Don’t have to rerun what is already run
  • 56. 5656 Where to Start • Jupyter notebook tutorial https://www.tensorflow.org/tfx/tutorials/tfx/components • Airflow tutorial https://www.tensorflow.org/tfx/tutorials/tfx/airflow_workshop
  • 57. 5757 Happy Hour! 6500 River Place Blvd. Bldg. 3, Suite 120 Austin, TX. 78730 Jonathan Mugan, Ph. D. Email: jmugan@deumbra.com
  • 58. 5858 Appendix • Original TFX paper https://ai.google/research/pubs/pub46484 • Documentation • https://www.tensorflow.org/tfx • https://www.tensorflow.org/tfx/tutorials • https://www.tensorflow.org/tfx/guide • https://www.tensorflow.org/tfx/api_docs