SlideShare a Scribd company logo
An end-to-end machine learning platform for TensorFlow
Robert Crowe
Developer Advocate, TensorFlow
@robert_crowe
ML
Code
In addition to the actual ML...
ML
Code
...you have to worry about so much more.
Configuration
Data Collection
Data
Verification
Feature Extraction
Process Management
Tools
Analysis Tools
Machine
Resource
Management
Serving
Infrastructure
Monitoring
Source: Sculley et al.: Hidden Technical Debt in Machine Learning Systems
ML
Code
Data
Ingestion
Data
Analysis + Validation
Data
Transformatio
n
Trainer
Model Evaluation
and Validation
Serving Logging
Shared Utilities for Garbage Collection, Data Access Controls
Pipeline Storage
Tuner
Shared Configuration Framework and Job Orchestration
Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
TFX is the solution to this problem...
Data
Ingestion
TensorFlow
Data Validation
TensorFlow
Transform
Estimator
or Keras
Model
TensorFlow
Model Analysis
TensorFlow
Serving
Logging
Shared Utilities for Garbage Collection, Data Access Controls
Pipeline Storage
Tuner
Shared Configuration Framework and Job Orchestration
Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
…and we are making it available to you.
TFX powers our most important bets and products...
(incl. )
...and more.
Major ProductsAlphaBets
Data
Ingestion
TensorFlow
Data Validation
Logging
Shared Utilities for Garbage Collection, Data Access Controls
Pipeline Storage
Tuner
Shared Configuration Framework and Job Orchestration
Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
Data
Transformatio
n
Trainer
Model Evaluation
and Validation
Serving
Why Data Validation is important
ML
Why Data Validation is important
garbage in garbage out
ML
garbage in garbage out
ML
Why Data Validation is important
Compute and visualize
descriptive statistics
TensorFlow Data Validation Components and Workflows
Infer initial schema
Training/serving
skew detection
Validating data
Compute and visualize
descriptive statistics
TensorFlow Data Validation Components and Workflows
Infer initial schema
Training/serving
skew detection
Validating data
Compute and visualize descriptive statistics
day1_stats =
tfdv.generate_statistics_from_csv(DAY1)
tfdv.visualize_statistics(day1_stats)
Visualization: Facets Overview (github.com/PAIR-code/facets)
Day 1
generate_statistics
day1_stats
Compute and visualize
descriptive statistics
TensorFlow Data Validation Components and Workflows
Infer initial schema
Training/serving
skew detection
Validating data
Training/serving skew detection
# Compute statistics over serving data
serving_data_stats =
tfdv.generate_statistics_from_csv(SERVING_DATA)
# Compare how new data conforms to the schema
anomalies = tfdv.validate_statistics(
serving_data_stats,
schema,
environment=”SERVING”)
Day 2
Serving
Data
schema
validate_
statistics
generate_
statistics
day2_stats
validate_
statistics
generate_
statistics
serving_
data_stats
Validation
Report
Validation
Report
TFDV deployment stats
● Data analysis and validation used by hundreds of product teams
● Certain teams set up TFX solely for data analysis and validation
● Petabytes of training/serving data per day
● Several documented ML wins by catching data anomalies
github.com/tensorflow/data-validation
TensorFlow Data Validation
Understand, validate, and monitor your ML data
Data
Ingestion
TensorFlow
Data Validation
Data
Transformatio
n
Trainer
Model Evaluation
and Validation
Serving Logging
Shared Utilities for Garbage Collection, Data Access Controls
Pipeline Storage
Tuner
Shared Configuration Framework and Job Orchestration
Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
Data
Ingestion
TensorFlow
Data Validation
TensorFlow
Transform
Trainer
Model Evaluation
and Validation
Serving Logging
Shared Utilities for Garbage Collection, Data Access Controls
Pipeline Storage
Tuner
Shared Configuration Framework and Job Orchestration
Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
Using tf.Transform for feature transformations.
Using tf.Transform for feature transformations.
Using tf.Transform for feature transformations.
Training
Serving
with beam.Pipeline() as pipeline:
with tft_beam.Context(temp_dir=tempfile.mkdtemp()):
...
raw_data = (
pipeline
| 'ReadTrainData' >> beam.io.ReadFromText(train_data_file)
| 'FixCommasTrainData' >> beam.Map(
lambda line: line.replace(', ', ','))
| 'DecodeTrainData' >> MapAndFilterErrors(converter.decode))
...
_ = (
transformed_data
| 'EncodeTrainData' >> beam.Map(transformed_data_coder.encode)
| 'WriteTrainData' >> beam.io.WriteToTFRecord(
os.path.join(working_dir, TRANSFORMED_TRAIN_DATA_FILEBASE)))
...
25
Common tf.Transform Use-Cases
Scale to ... Bag of Words / N-Grams
Bucketization Feature Crosses
tft.ngrams
tft.string_to_int
tf.string_split
tft.scale_to_z_score
tft.apply_buckets
tft.quantiles
tft.string_to_int
tf.string_join
...
Apply another TensorFlow Model
tft.apply_saved_model
Data
Ingestion
TensorFlow
Data Validation
TensorFlow
Transform
Model
Model Evaluation
and Validation
Serving Logging
Shared Utilities for Garbage Collection, Data Access Controls
Pipeline Storage
Tuner
Shared Configuration Framework and Job Orchestration
Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
...
estimator = DNNLinearCombinedClassifier(...)
estimator.train(...)
rcvr_fn_map = {
model_fn_lib.ModeKeys.TRAIN: train_rcvr_fn,
model_fn_lib.ModeKeys.EVAL: eval_rcvr_fn,
model_fn_lib.ModeKeys.PREDICT: serve_rcvr_fn,
}
export_dir = tf.contrib.estimator.export_all_saved_models(
estimator,
export_dir_base='/path/to/savedmodel',
input_receiver_fn_map=rcvr_fn_map)
...
Using TensorFlow Estimators.
DNN{Classifier|Regressor}
Linear{Classifier|Regressor}
DNNLinearCombined{Classifier|Regressor}
BoostedTrees{Classifier|Regressor}
...
...
estimator = DNNLinearCombinedClassifier(...)
estimator.train(...)
rcvr_fn_map = {
model_fn_lib.ModeKeys.TRAIN: train_rcvr_fn,
model_fn_lib.ModeKeys.EVAL: eval_rcvr_fn,
model_fn_lib.ModeKeys.PREDICT: serve_rcvr_fn,
}
export_dir = tf.contrib.estimator.export_all_saved_models(
estimator,
export_dir_base='/path/to/savedmodel',
input_receiver_fn_map=rcvr_fn_map)
...
Using TensorFlow Estimators.
TensorFlow
Serving
TensorFlow
Model Analysis
Train, Eval, and Inference Graphs
SignatureDef
Eval
Metadata
SignatureDef
Data
Ingestion
TensorFlow
Data Validation
TensorFlow
Transform
Estimator
or Keras
Model
Model Evaluation
and Validation
Serving Logging
Shared Utilities for Garbage Collection, Data Access Controls
Pipeline Storage
Tuner
Shared Configuration Framework and Job Orchestration
Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
Data
Ingestion
TensorFlow
Data Validation
TensorFlow
Transform
Estimator
or Keras
Model
TensorFlow
Model Analysis
Serving Logging
Shared Utilities for Garbage Collection, Data Access Controls
Pipeline Storage
Tuner
Shared Configuration Framework and Job Orchestration
Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
TensorFlow Model Analysis (TFMA)
Deep performance analysis
● Examine subsets of data
● Identify weaker/stronger optimization
● Identify key segments that are underserved
● Model fairness
Using TF Model Analysis to slice your metrics.
OVERALL_SLICE_SPEC = tfma.SingleSliceSpec()
FEATURE_COLUMN_SLICE_SPEC = tfma.SingleSliceSpec(columns=['trip_start_hour'])
ALL_SPECS = [
OVERALL_SLICE_SPEC,
FEATURE_COLUMN_SLICE_SPEC
]
eval_result = tfma.run_model_analysis(
model_location='/path/to/savedmodel',
data_location='/path/to/file/containing/tfrecords',
file_format='tfrecords',
slice_spec=ALL_SPECS)
tfma.view.render_slicing_metrics(
eval_result)
Train, Eval, and Inference Graphs
SignatureDef
Eval
Metadata
SignatureDef
Using TF Model Analysis to slice your metrics.
OVERALL_SLICE_SPEC = tfma.SingleSliceSpec()
FEATURE_COLUMN_SLICE_SPEC = tfma.SingleSliceSpec(columns=[‘trip_start_hour’])
ALL_SPECS = [
OVERALL_SLICE_SPEC,
FEATURE_COLUMN_SLICE_SPEC
]
eval_result = tfma.run_model_analysis(
model_location='/path/to/savedmodel',
data_location='/path/to/file/containing/tfrecords',
file_format='tfrecords',
slice_spec=ALL_SPECS)
tfma.view.render_slicing_metrics(
eval_result,
slicing_spec=FEATURE_COLUMN_SLICE_SPEC)
Train, Eval, and Inference Graphs
SignatureDef
Eval
Metadata
SignatureDef
Data
Ingestion
TensorFlow
Data Validation
TensorFlow
Transform
Estimator
or Keras
Model
TensorFlow
Model Analysis
Serving Logging
Shared Utilities for Garbage Collection, Data Access Controls
Pipeline Storage
Tuner
Shared Configuration Framework and Job Orchestration
Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
Data
Ingestion
TensorFlow
Data Validation
TensorFlow
Transform
Estimator
or Keras
Model
TensorFlow
Model Analysis
TensorFlow
Serving
Logging
Shared Utilities for Garbage Collection, Data Access Controls
Pipeline Storage
Tuner
Shared Configuration Framework and Job Orchestration
Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
Using TensorFlow Serving to serve your models.
tensorflow_model_server
--port=9000
--model_name=chicago_taxi
--model_base_path='/path/to/savedmodel'
Train, Eval, and Inference Graphs
SignatureDef
Eval
Metadata
SignatureDef
Using TensorFlow Serving to serve your models.
tensorflow_model_server
--rest_api_port=8501
--model_name=chicago_taxi
--model_base_path='/path/to/savedmodel'
Train, Eval, and Inference Graphs
SignatureDef
Eval
Metadata
SignatureDef
// Classify Iris flowers from command line!
// More on Iris Model:
// www.tensorflow.org/get_started/get_started_for_beginners
$ curl -d '{"examples": [{"measurements": [7.1, 3.0, 5.9, 2.1]}]}' 
-X POST http://localhost:8501/v1/models/iris:classify
{
"results": [[["0", 1.37459e-09], ["1", 0.000219246], ["2", 0.999781]]]
}
Putting it all together.
Anomalies
TensorFlow
Data Validation
Training
Data
Schema
Anomalies
TensorFlow
Transform
TensorFlow
Data Validation
Training
Data
Schema
SavedModel
Transform
Graph
Transformed
Data
Anomalies
TensorFlow
Transform
TensorFlow
Data Validation
TF Estimator or
Keras Model
Training
Data
Schema
SavedModel
Transform
Graph
Transformed
Data
SavedModel
Eval
Graph
SavedModel
Inference
Graph
Anomalies
TensorFlow
Transform
TensorFlow
Data Validation
TF Estimator or
Keras Model
TensorFlow
Model Analysis
TensorFlow
Serving
Training
Data
Schema
SavedModel
Transform
Graph
Transformed
Data
SavedModel
Eval
Graph
SavedModel
Inference
Graph
Anomalies
TensorFlow
Transform
TensorFlow
Data Validation
TF Estimator or
Keras Model
TensorFlow
Model Analysis
TensorFlow
Serving
Training
Data
Schema
SavedModel
Transform
Graph
Transformed
Data
SavedModel
Eval
Graph
SavedModel
Inference
Graph
Serving
Logs
Use all of this today!
● End-to-end demo works locally (single-node)
● Runner for Flink in prototype phase
● Runner for Spark planned
● Demo runs on Cloud Dataflow for distributed execution
JIRA Ticket: BEAM-2889
goo.gl/4DBvX7
TFX: A TensorFlow-Based Production-Scale Machine Learning Platform. KDD (2017).
https://youtu.be/fPTwLVCq00U
github.com/tensorflow/transform
github.com/tensorflow/model-analysis
github.com/tensorflow/serving
TensorFlow Transform
Consistent in-graph transformations in training and
serving
TensorFlow Model Analysis
Scaleable, sliced, and full-pass metrics
TensorFlow Serving
Serving the world’s machine learning models
github.com/tensorflow/data-validation
TensorFlow Data Validation
Understand, validate, and monitor your ML data

More Related Content

What's hot

Predicting Optimal Parallelism for Data Analytics
Predicting Optimal Parallelism for Data AnalyticsPredicting Optimal Parallelism for Data Analytics
Predicting Optimal Parallelism for Data Analytics
Databricks
 
Tensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with HummingbirdTensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with Hummingbird
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...
Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...
Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...
Databricks
 
AnalyticOps - Chicago PAW 2016
AnalyticOps - Chicago PAW 2016AnalyticOps - Chicago PAW 2016
AnalyticOps - Chicago PAW 2016
Robert Grossman
 
Machine Learning Pipelines
Machine Learning PipelinesMachine Learning Pipelines
Machine Learning Pipelines
jeykottalam
 
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
Jose Quesada (hiring)
 
Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...
Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...
Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...
Databricks
 
How to Lower the Cost of Deploying Analytics: An Introduction to the Portable...
How to Lower the Cost of Deploying Analytics: An Introduction to the Portable...How to Lower the Cost of Deploying Analytics: An Introduction to the Portable...
How to Lower the Cost of Deploying Analytics: An Introduction to the Portable...
Robert Grossman
 
Streaming Inference with Apache Beam and TFX
Streaming Inference with Apache Beam and TFXStreaming Inference with Apache Beam and TFX
Streaming Inference with Apache Beam and TFX
Databricks
 
Machine Learning With Spark
Machine Learning With SparkMachine Learning With Spark
Machine Learning With Spark
Shivaji Dutta
 
Pattern: PMML for Cascading and Hadoop
Pattern: PMML for Cascading and HadoopPattern: PMML for Cascading and Hadoop
Pattern: PMML for Cascading and Hadoop
Paco Nathan
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
Databricks
 
Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...
Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...
Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...
Spark Summit
 
Graph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational DataGraph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational DataBenjamin Bengfort
 
Introduction to MLflow
Introduction to MLflowIntroduction to MLflow
Introduction to MLflow
Databricks
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks
Jim Dowling
 
MLlib: Spark's Machine Learning Library
MLlib: Spark's Machine Learning LibraryMLlib: Spark's Machine Learning Library
MLlib: Spark's Machine Learning Library
jeykottalam
 
Apache Spark Model Deployment
Apache Spark Model Deployment Apache Spark Model Deployment
Apache Spark Model Deployment
Databricks
 
Automated Hyperparameter Tuning, Scaling and Tracking
Automated Hyperparameter Tuning, Scaling and TrackingAutomated Hyperparameter Tuning, Scaling and Tracking
Automated Hyperparameter Tuning, Scaling and Tracking
Databricks
 

What's hot (20)

Predicting Optimal Parallelism for Data Analytics
Predicting Optimal Parallelism for Data AnalyticsPredicting Optimal Parallelism for Data Analytics
Predicting Optimal Parallelism for Data Analytics
 
Tensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with HummingbirdTensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with Hummingbird
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...
Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...
Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...
 
AnalyticOps - Chicago PAW 2016
AnalyticOps - Chicago PAW 2016AnalyticOps - Chicago PAW 2016
AnalyticOps - Chicago PAW 2016
 
Machine Learning Pipelines
Machine Learning PipelinesMachine Learning Pipelines
Machine Learning Pipelines
 
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
 
Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...
Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...
Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...
 
How to Lower the Cost of Deploying Analytics: An Introduction to the Portable...
How to Lower the Cost of Deploying Analytics: An Introduction to the Portable...How to Lower the Cost of Deploying Analytics: An Introduction to the Portable...
How to Lower the Cost of Deploying Analytics: An Introduction to the Portable...
 
Streaming Inference with Apache Beam and TFX
Streaming Inference with Apache Beam and TFXStreaming Inference with Apache Beam and TFX
Streaming Inference with Apache Beam and TFX
 
Machine Learning With Spark
Machine Learning With SparkMachine Learning With Spark
Machine Learning With Spark
 
Pattern: PMML for Cascading and Hadoop
Pattern: PMML for Cascading and HadoopPattern: PMML for Cascading and Hadoop
Pattern: PMML for Cascading and Hadoop
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...
Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...
Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...
 
Graph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational DataGraph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational Data
 
Introduction to MLflow
Introduction to MLflowIntroduction to MLflow
Introduction to MLflow
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks
 
MLlib: Spark's Machine Learning Library
MLlib: Spark's Machine Learning LibraryMLlib: Spark's Machine Learning Library
MLlib: Spark's Machine Learning Library
 
Apache Spark Model Deployment
Apache Spark Model Deployment Apache Spark Model Deployment
Apache Spark Model Deployment
 
Automated Hyperparameter Tuning, Scaling and Tracking
Automated Hyperparameter Tuning, Scaling and TrackingAutomated Hyperparameter Tuning, Scaling and Tracking
Automated Hyperparameter Tuning, Scaling and Tracking
 

Similar to TensorFlow Extension (TFX) and Apache Beam

ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
Fei Chen
 
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlowTensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
Databricks
 
Moving Your Machine Learning Models to Production with TensorFlow Extended
Moving Your Machine Learning Models to Production with TensorFlow ExtendedMoving Your Machine Learning Models to Production with TensorFlow Extended
Moving Your Machine Learning Models to Production with TensorFlow Extended
Jonathan Mugan
 
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowMLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
Jan Kirenz
 
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
Gabriel Moreira
 
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
Gabriel Moreira
 
Certification Study Group - NLP & Recommendation Systems on GCP Session 5
Certification Study Group - NLP & Recommendation Systems on GCP Session 5Certification Study Group - NLP & Recommendation Systems on GCP Session 5
Certification Study Group - NLP & Recommendation Systems on GCP Session 5
gdgsurrey
 
Databricks Overview for MLOps
Databricks Overview for MLOpsDatabricks Overview for MLOps
Databricks Overview for MLOps
Databricks
 
MLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
MLOps Virtual Event | Building Machine Learning Platforms for the Full LifecycleMLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
MLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
Databricks
 
Ranjitbanshpal1
Ranjitbanshpal1Ranjitbanshpal1
Ranjitbanshpal1
ranjit banshpal
 
Oracle demantra online training
Oracle demantra online trainingOracle demantra online training
Oracle demantra online training
Glory IT Technologies Pvt. Ltd.
 
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Chris Fregly
 
Lap around .net 4
Lap around .net 4Lap around .net 4
Lap around .net 4Abdul Khan
 
Ml ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science MeetupMl ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science Meetup
Jim Dowling
 
Whats New In 2010 (Msdn & Visual Studio)
Whats New In 2010 (Msdn & Visual Studio)Whats New In 2010 (Msdn & Visual Studio)
Whats New In 2010 (Msdn & Visual Studio)
Steve Lange
 
AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)
AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)
AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)
Amazon Web Services
 
Overview of VS2010 and .NET 4.0
Overview of VS2010 and .NET 4.0Overview of VS2010 and .NET 4.0
Overview of VS2010 and .NET 4.0
Bruce Johnson
 
Machine Learning Models in Production
Machine Learning Models in ProductionMachine Learning Models in Production
Machine Learning Models in Production
DataWorks Summit
 
The Magic Of Application Lifecycle Management In Vs Public
The Magic Of Application Lifecycle Management In Vs PublicThe Magic Of Application Lifecycle Management In Vs Public
The Magic Of Application Lifecycle Management In Vs Public
David Solivan
 
Hot sos em12c_metric_extensions
Hot sos em12c_metric_extensionsHot sos em12c_metric_extensions
Hot sos em12c_metric_extensions
Kellyn Pot'Vin-Gorman
 

Similar to TensorFlow Extension (TFX) and Apache Beam (20)

ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
 
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlowTensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
 
Moving Your Machine Learning Models to Production with TensorFlow Extended
Moving Your Machine Learning Models to Production with TensorFlow ExtendedMoving Your Machine Learning Models to Production with TensorFlow Extended
Moving Your Machine Learning Models to Production with TensorFlow Extended
 
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowMLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
 
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
 
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
 
Certification Study Group - NLP & Recommendation Systems on GCP Session 5
Certification Study Group - NLP & Recommendation Systems on GCP Session 5Certification Study Group - NLP & Recommendation Systems on GCP Session 5
Certification Study Group - NLP & Recommendation Systems on GCP Session 5
 
Databricks Overview for MLOps
Databricks Overview for MLOpsDatabricks Overview for MLOps
Databricks Overview for MLOps
 
MLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
MLOps Virtual Event | Building Machine Learning Platforms for the Full LifecycleMLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
MLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
 
Ranjitbanshpal1
Ranjitbanshpal1Ranjitbanshpal1
Ranjitbanshpal1
 
Oracle demantra online training
Oracle demantra online trainingOracle demantra online training
Oracle demantra online training
 
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
 
Lap around .net 4
Lap around .net 4Lap around .net 4
Lap around .net 4
 
Ml ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science MeetupMl ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science Meetup
 
Whats New In 2010 (Msdn & Visual Studio)
Whats New In 2010 (Msdn & Visual Studio)Whats New In 2010 (Msdn & Visual Studio)
Whats New In 2010 (Msdn & Visual Studio)
 
AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)
AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)
AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)
 
Overview of VS2010 and .NET 4.0
Overview of VS2010 and .NET 4.0Overview of VS2010 and .NET 4.0
Overview of VS2010 and .NET 4.0
 
Machine Learning Models in Production
Machine Learning Models in ProductionMachine Learning Models in Production
Machine Learning Models in Production
 
The Magic Of Application Lifecycle Management In Vs Public
The Magic Of Application Lifecycle Management In Vs PublicThe Magic Of Application Lifecycle Management In Vs Public
The Magic Of Application Lifecycle Management In Vs Public
 
Hot sos em12c_metric_extensions
Hot sos em12c_metric_extensionsHot sos em12c_metric_extensions
Hot sos em12c_metric_extensions
 

More from markgrover

From discovering to trusting data
From discovering to trusting dataFrom discovering to trusting data
From discovering to trusting data
markgrover
 
Amundsen lineage designs - community meeting, Dec 2020
Amundsen lineage designs - community meeting, Dec 2020 Amundsen lineage designs - community meeting, Dec 2020
Amundsen lineage designs - community meeting, Dec 2020
markgrover
 
Amundsen at Brex and Looker integration
Amundsen at Brex and Looker integrationAmundsen at Brex and Looker integration
Amundsen at Brex and Looker integration
markgrover
 
REA Group's journey with Data Cataloging and Amundsen
REA Group's journey with Data Cataloging and AmundsenREA Group's journey with Data Cataloging and Amundsen
REA Group's journey with Data Cataloging and Amundsen
markgrover
 
Amundsen gremlin proxy design
Amundsen gremlin proxy designAmundsen gremlin proxy design
Amundsen gremlin proxy design
markgrover
 
Amundsen: From discovering to security data
Amundsen: From discovering to security dataAmundsen: From discovering to security data
Amundsen: From discovering to security data
markgrover
 
Amundsen: From discovering to security data
Amundsen: From discovering to security dataAmundsen: From discovering to security data
Amundsen: From discovering to security data
markgrover
 
Data Discovery & Trust through Metadata
Data Discovery & Trust through MetadataData Discovery & Trust through Metadata
Data Discovery & Trust through Metadata
markgrover
 
Data Discovery and Metadata
Data Discovery and MetadataData Discovery and Metadata
Data Discovery and Metadata
markgrover
 
The Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futureThe Lyft data platform: Now and in the future
The Lyft data platform: Now and in the future
markgrover
 
Disrupting Data Discovery
Disrupting Data DiscoveryDisrupting Data Discovery
Disrupting Data Discovery
markgrover
 
Big Data at Speed
Big Data at SpeedBig Data at Speed
Big Data at Speed
markgrover
 
Near real-time anomaly detection at Lyft
Near real-time anomaly detection at LyftNear real-time anomaly detection at Lyft
Near real-time anomaly detection at Lyft
markgrover
 
Dogfooding data at Lyft
Dogfooding data at LyftDogfooding data at Lyft
Dogfooding data at Lyft
markgrover
 
Fighting cybersecurity threats with Apache Spot
Fighting cybersecurity threats with Apache SpotFighting cybersecurity threats with Apache Spot
Fighting cybersecurity threats with Apache Spot
markgrover
 
Fraud Detection with Hadoop
Fraud Detection with HadoopFraud Detection with Hadoop
Fraud Detection with Hadoop
markgrover
 
Top 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applications
markgrover
 
Top 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applications
markgrover
 
Architecting Applications with Hadoop
Architecting Applications with HadoopArchitecting Applications with Hadoop
Architecting Applications with Hadoop
markgrover
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
markgrover
 

More from markgrover (20)

From discovering to trusting data
From discovering to trusting dataFrom discovering to trusting data
From discovering to trusting data
 
Amundsen lineage designs - community meeting, Dec 2020
Amundsen lineage designs - community meeting, Dec 2020 Amundsen lineage designs - community meeting, Dec 2020
Amundsen lineage designs - community meeting, Dec 2020
 
Amundsen at Brex and Looker integration
Amundsen at Brex and Looker integrationAmundsen at Brex and Looker integration
Amundsen at Brex and Looker integration
 
REA Group's journey with Data Cataloging and Amundsen
REA Group's journey with Data Cataloging and AmundsenREA Group's journey with Data Cataloging and Amundsen
REA Group's journey with Data Cataloging and Amundsen
 
Amundsen gremlin proxy design
Amundsen gremlin proxy designAmundsen gremlin proxy design
Amundsen gremlin proxy design
 
Amundsen: From discovering to security data
Amundsen: From discovering to security dataAmundsen: From discovering to security data
Amundsen: From discovering to security data
 
Amundsen: From discovering to security data
Amundsen: From discovering to security dataAmundsen: From discovering to security data
Amundsen: From discovering to security data
 
Data Discovery & Trust through Metadata
Data Discovery & Trust through MetadataData Discovery & Trust through Metadata
Data Discovery & Trust through Metadata
 
Data Discovery and Metadata
Data Discovery and MetadataData Discovery and Metadata
Data Discovery and Metadata
 
The Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futureThe Lyft data platform: Now and in the future
The Lyft data platform: Now and in the future
 
Disrupting Data Discovery
Disrupting Data DiscoveryDisrupting Data Discovery
Disrupting Data Discovery
 
Big Data at Speed
Big Data at SpeedBig Data at Speed
Big Data at Speed
 
Near real-time anomaly detection at Lyft
Near real-time anomaly detection at LyftNear real-time anomaly detection at Lyft
Near real-time anomaly detection at Lyft
 
Dogfooding data at Lyft
Dogfooding data at LyftDogfooding data at Lyft
Dogfooding data at Lyft
 
Fighting cybersecurity threats with Apache Spot
Fighting cybersecurity threats with Apache SpotFighting cybersecurity threats with Apache Spot
Fighting cybersecurity threats with Apache Spot
 
Fraud Detection with Hadoop
Fraud Detection with HadoopFraud Detection with Hadoop
Fraud Detection with Hadoop
 
Top 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applications
 
Top 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applications
 
Architecting Applications with Hadoop
Architecting Applications with HadoopArchitecting Applications with Hadoop
Architecting Applications with Hadoop
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
 

Recently uploaded

Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
alex933524
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive data
theahmadsaood
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
ewymefz
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
ewymefz
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 

Recently uploaded (20)

Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive data
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 

TensorFlow Extension (TFX) and Apache Beam

  • 1. An end-to-end machine learning platform for TensorFlow Robert Crowe Developer Advocate, TensorFlow @robert_crowe
  • 3. In addition to the actual ML... ML Code
  • 4. ...you have to worry about so much more. Configuration Data Collection Data Verification Feature Extraction Process Management Tools Analysis Tools Machine Resource Management Serving Infrastructure Monitoring Source: Sculley et al.: Hidden Technical Debt in Machine Learning Systems ML Code
  • 5. Data Ingestion Data Analysis + Validation Data Transformatio n Trainer Model Evaluation and Validation Serving Logging Shared Utilities for Garbage Collection, Data Access Controls Pipeline Storage Tuner Shared Configuration Framework and Job Orchestration Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization TFX is the solution to this problem...
  • 6. Data Ingestion TensorFlow Data Validation TensorFlow Transform Estimator or Keras Model TensorFlow Model Analysis TensorFlow Serving Logging Shared Utilities for Garbage Collection, Data Access Controls Pipeline Storage Tuner Shared Configuration Framework and Job Orchestration Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization …and we are making it available to you.
  • 7. TFX powers our most important bets and products... (incl. ) ...and more. Major ProductsAlphaBets
  • 8. Data Ingestion TensorFlow Data Validation Logging Shared Utilities for Garbage Collection, Data Access Controls Pipeline Storage Tuner Shared Configuration Framework and Job Orchestration Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization Data Transformatio n Trainer Model Evaluation and Validation Serving
  • 9. Why Data Validation is important ML
  • 10. Why Data Validation is important garbage in garbage out ML
  • 11. garbage in garbage out ML Why Data Validation is important
  • 12. Compute and visualize descriptive statistics TensorFlow Data Validation Components and Workflows Infer initial schema Training/serving skew detection Validating data
  • 13. Compute and visualize descriptive statistics TensorFlow Data Validation Components and Workflows Infer initial schema Training/serving skew detection Validating data
  • 14. Compute and visualize descriptive statistics day1_stats = tfdv.generate_statistics_from_csv(DAY1) tfdv.visualize_statistics(day1_stats) Visualization: Facets Overview (github.com/PAIR-code/facets) Day 1 generate_statistics day1_stats
  • 15.
  • 16. Compute and visualize descriptive statistics TensorFlow Data Validation Components and Workflows Infer initial schema Training/serving skew detection Validating data
  • 17. Training/serving skew detection # Compute statistics over serving data serving_data_stats = tfdv.generate_statistics_from_csv(SERVING_DATA) # Compare how new data conforms to the schema anomalies = tfdv.validate_statistics( serving_data_stats, schema, environment=”SERVING”) Day 2 Serving Data schema validate_ statistics generate_ statistics day2_stats validate_ statistics generate_ statistics serving_ data_stats Validation Report Validation Report
  • 18. TFDV deployment stats ● Data analysis and validation used by hundreds of product teams ● Certain teams set up TFX solely for data analysis and validation ● Petabytes of training/serving data per day ● Several documented ML wins by catching data anomalies
  • 20. Data Ingestion TensorFlow Data Validation Data Transformatio n Trainer Model Evaluation and Validation Serving Logging Shared Utilities for Garbage Collection, Data Access Controls Pipeline Storage Tuner Shared Configuration Framework and Job Orchestration Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
  • 21. Data Ingestion TensorFlow Data Validation TensorFlow Transform Trainer Model Evaluation and Validation Serving Logging Shared Utilities for Garbage Collection, Data Access Controls Pipeline Storage Tuner Shared Configuration Framework and Job Orchestration Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
  • 22. Using tf.Transform for feature transformations.
  • 23. Using tf.Transform for feature transformations.
  • 24. Using tf.Transform for feature transformations. Training Serving
  • 25. with beam.Pipeline() as pipeline: with tft_beam.Context(temp_dir=tempfile.mkdtemp()): ... raw_data = ( pipeline | 'ReadTrainData' >> beam.io.ReadFromText(train_data_file) | 'FixCommasTrainData' >> beam.Map( lambda line: line.replace(', ', ',')) | 'DecodeTrainData' >> MapAndFilterErrors(converter.decode)) ... _ = ( transformed_data | 'EncodeTrainData' >> beam.Map(transformed_data_coder.encode) | 'WriteTrainData' >> beam.io.WriteToTFRecord( os.path.join(working_dir, TRANSFORMED_TRAIN_DATA_FILEBASE))) ... 25
  • 26. Common tf.Transform Use-Cases Scale to ... Bag of Words / N-Grams Bucketization Feature Crosses tft.ngrams tft.string_to_int tf.string_split tft.scale_to_z_score tft.apply_buckets tft.quantiles tft.string_to_int tf.string_join ... Apply another TensorFlow Model tft.apply_saved_model
  • 27. Data Ingestion TensorFlow Data Validation TensorFlow Transform Model Model Evaluation and Validation Serving Logging Shared Utilities for Garbage Collection, Data Access Controls Pipeline Storage Tuner Shared Configuration Framework and Job Orchestration Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
  • 28. ... estimator = DNNLinearCombinedClassifier(...) estimator.train(...) rcvr_fn_map = { model_fn_lib.ModeKeys.TRAIN: train_rcvr_fn, model_fn_lib.ModeKeys.EVAL: eval_rcvr_fn, model_fn_lib.ModeKeys.PREDICT: serve_rcvr_fn, } export_dir = tf.contrib.estimator.export_all_saved_models( estimator, export_dir_base='/path/to/savedmodel', input_receiver_fn_map=rcvr_fn_map) ... Using TensorFlow Estimators. DNN{Classifier|Regressor} Linear{Classifier|Regressor} DNNLinearCombined{Classifier|Regressor} BoostedTrees{Classifier|Regressor} ...
  • 29. ... estimator = DNNLinearCombinedClassifier(...) estimator.train(...) rcvr_fn_map = { model_fn_lib.ModeKeys.TRAIN: train_rcvr_fn, model_fn_lib.ModeKeys.EVAL: eval_rcvr_fn, model_fn_lib.ModeKeys.PREDICT: serve_rcvr_fn, } export_dir = tf.contrib.estimator.export_all_saved_models( estimator, export_dir_base='/path/to/savedmodel', input_receiver_fn_map=rcvr_fn_map) ... Using TensorFlow Estimators. TensorFlow Serving TensorFlow Model Analysis Train, Eval, and Inference Graphs SignatureDef Eval Metadata SignatureDef
  • 30. Data Ingestion TensorFlow Data Validation TensorFlow Transform Estimator or Keras Model Model Evaluation and Validation Serving Logging Shared Utilities for Garbage Collection, Data Access Controls Pipeline Storage Tuner Shared Configuration Framework and Job Orchestration Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
  • 31. Data Ingestion TensorFlow Data Validation TensorFlow Transform Estimator or Keras Model TensorFlow Model Analysis Serving Logging Shared Utilities for Garbage Collection, Data Access Controls Pipeline Storage Tuner Shared Configuration Framework and Job Orchestration Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
  • 32. TensorFlow Model Analysis (TFMA) Deep performance analysis ● Examine subsets of data ● Identify weaker/stronger optimization ● Identify key segments that are underserved ● Model fairness
  • 33. Using TF Model Analysis to slice your metrics. OVERALL_SLICE_SPEC = tfma.SingleSliceSpec() FEATURE_COLUMN_SLICE_SPEC = tfma.SingleSliceSpec(columns=['trip_start_hour']) ALL_SPECS = [ OVERALL_SLICE_SPEC, FEATURE_COLUMN_SLICE_SPEC ] eval_result = tfma.run_model_analysis( model_location='/path/to/savedmodel', data_location='/path/to/file/containing/tfrecords', file_format='tfrecords', slice_spec=ALL_SPECS) tfma.view.render_slicing_metrics( eval_result) Train, Eval, and Inference Graphs SignatureDef Eval Metadata SignatureDef
  • 34.
  • 35. Using TF Model Analysis to slice your metrics. OVERALL_SLICE_SPEC = tfma.SingleSliceSpec() FEATURE_COLUMN_SLICE_SPEC = tfma.SingleSliceSpec(columns=[‘trip_start_hour’]) ALL_SPECS = [ OVERALL_SLICE_SPEC, FEATURE_COLUMN_SLICE_SPEC ] eval_result = tfma.run_model_analysis( model_location='/path/to/savedmodel', data_location='/path/to/file/containing/tfrecords', file_format='tfrecords', slice_spec=ALL_SPECS) tfma.view.render_slicing_metrics( eval_result, slicing_spec=FEATURE_COLUMN_SLICE_SPEC) Train, Eval, and Inference Graphs SignatureDef Eval Metadata SignatureDef
  • 36.
  • 37.
  • 38.
  • 39.
  • 40. Data Ingestion TensorFlow Data Validation TensorFlow Transform Estimator or Keras Model TensorFlow Model Analysis Serving Logging Shared Utilities for Garbage Collection, Data Access Controls Pipeline Storage Tuner Shared Configuration Framework and Job Orchestration Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
  • 41. Data Ingestion TensorFlow Data Validation TensorFlow Transform Estimator or Keras Model TensorFlow Model Analysis TensorFlow Serving Logging Shared Utilities for Garbage Collection, Data Access Controls Pipeline Storage Tuner Shared Configuration Framework and Job Orchestration Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
  • 42. Using TensorFlow Serving to serve your models. tensorflow_model_server --port=9000 --model_name=chicago_taxi --model_base_path='/path/to/savedmodel' Train, Eval, and Inference Graphs SignatureDef Eval Metadata SignatureDef
  • 43. Using TensorFlow Serving to serve your models. tensorflow_model_server --rest_api_port=8501 --model_name=chicago_taxi --model_base_path='/path/to/savedmodel' Train, Eval, and Inference Graphs SignatureDef Eval Metadata SignatureDef
  • 44. // Classify Iris flowers from command line! // More on Iris Model: // www.tensorflow.org/get_started/get_started_for_beginners $ curl -d '{"examples": [{"measurements": [7.1, 3.0, 5.9, 2.1]}]}' -X POST http://localhost:8501/v1/models/iris:classify { "results": [[["0", 1.37459e-09], ["1", 0.000219246], ["2", 0.999781]]] }
  • 45. Putting it all together.
  • 48. Anomalies TensorFlow Transform TensorFlow Data Validation TF Estimator or Keras Model Training Data Schema SavedModel Transform Graph Transformed Data SavedModel Eval Graph SavedModel Inference Graph
  • 49. Anomalies TensorFlow Transform TensorFlow Data Validation TF Estimator or Keras Model TensorFlow Model Analysis TensorFlow Serving Training Data Schema SavedModel Transform Graph Transformed Data SavedModel Eval Graph SavedModel Inference Graph
  • 50. Anomalies TensorFlow Transform TensorFlow Data Validation TF Estimator or Keras Model TensorFlow Model Analysis TensorFlow Serving Training Data Schema SavedModel Transform Graph Transformed Data SavedModel Eval Graph SavedModel Inference Graph Serving Logs
  • 51. Use all of this today! ● End-to-end demo works locally (single-node) ● Runner for Flink in prototype phase ● Runner for Spark planned ● Demo runs on Cloud Dataflow for distributed execution JIRA Ticket: BEAM-2889 goo.gl/4DBvX7
  • 52. TFX: A TensorFlow-Based Production-Scale Machine Learning Platform. KDD (2017). https://youtu.be/fPTwLVCq00U
  • 53. github.com/tensorflow/transform github.com/tensorflow/model-analysis github.com/tensorflow/serving TensorFlow Transform Consistent in-graph transformations in training and serving TensorFlow Model Analysis Scaleable, sliced, and full-pass metrics TensorFlow Serving Serving the world’s machine learning models github.com/tensorflow/data-validation TensorFlow Data Validation Understand, validate, and monitor your ML data