Building a Data Science Platform
@dcos
Why is machine learning taking off?
2013: $14B fraud losses, 19% YoY growth
The case for a data science platform.
What you want to be doing
Data (clean) → Write awesome ML code → Train (once) → Deploy (not me)
What you’re actually doing
Sculley, D., Holt, G., Golovin, D. et al. Hidden Technical Debt in Machine Learning Systems
© 2018 Mesosphere, Inc. All Rights Reserved.
Deep Learning: The Promise
Deep Learning: The Process
Step 1: Training (In Data Center - Over Hours/Days/Weeks)
Input: Lots of Labeled Data → Deep neural network model → Output: Trained Model
Step 2: Inference (Endpoint or Data Center - Instantaneous)
Input: New Input from Camera or Sensor → Trained Model → Output: Classification (97% Dog, 3% Panda)
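The two steps can be illustrated with a deliberately tiny stand-in for a deep network: a nearest-centroid classifier that is trained once on labeled data and then queried per input, with a softmax over negative distances producing class probabilities. This is only a sketch; the feature vectors, labels, and function names are invented for illustration and are not from the slides.

```python
import math

def train(labeled_data):
    """Step 1: average the feature vectors of each label into a centroid."""
    sums, counts = {}, {}
    for features, label in labeled_data:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def infer(model, features):
    """Step 2: score a new input against the trained model, return class probabilities."""
    scores = {label: -math.dist(features, centroid) for label, centroid in model.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Hypothetical two-feature training set (values are made up).
training_set = [
    ([1.0, 1.2], "dog"), ([0.9, 1.0], "dog"),
    ([3.0, 3.1], "panda"), ([3.2, 2.9], "panda"),
]
model = train(training_set)             # slow step, done once
prediction = infer(model, [1.1, 1.0])   # fast step, done per request
best_label = max(prediction, key=prediction.get)
```

The split mirrors the slide: `train` is the expensive step whose output is stored, while `infer` only reads the stored model.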
Deep Learning: Some insight
Deep Learning: The Challenges
Open Source Pipeline
Operationalizing a Machine Learning model can be super hard. It is a stage where most enterprise Machine Learning projects fail. I cannot tell you how many companies I've talked to, who have said their innovation teams had devised these cool ML projects, but they were struggling getting the ML models into production. In this set of courses, we will talk about how to train, deploy, and predict with ML models in a way that they're production ready. And finally, we delve back into Machine Learning theory.
Valliappa Lakshmanan, Tech Lead for Big Data and Machine Learning Professional Services on Google Cloud Platform
1. Model Engineering 2. Model Training 3. Monitoring 4. Debugging 5. Model Serving
1. Data Preparation using Spark
7. Streaming of requests
...
Public Cloud Pipeline
1. Model Engineering 2. Model Training 3. Monitoring 4. Debugging 5. Model Serving
1. Data Preparation using Spark
7. Kafka stream of requests
DIY Open Source Pipeline
1. Model Engineering 2. Model Training 3. Monitoring 4. Debugging 5. Model Serving
Data Science Pipeline on DC/OS
● Data & Streaming: Distributed Data Storage and Streaming
● Model Engineering: Data Preparation and Analysis
● Model Training: Distributed Training using Machine Learning Frameworks
● Model Management: Storage of trained Models and Metadata
● Model Serving: Use trained Model for Inference
● Continuous Integration
● Monitoring & Operations
● Management
● Tensorboard
Input Data Management
Input: Lots of Labeled Data
Challenges
● Training/Dev/Test + New Data
● Large amounts
● Quality
● Availability (for cluster)
● Velocity
● Streaming
Solutions
● GFS
● Apache Kafka
● Apache Cassandra
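One way to handle the "Training/Dev/Test + New Data" challenge is to assign each record to a split by hashing a stable identifier, so that records arriving later land in a split without reshuffling the existing ones. The sketch below assumes every record has a string ID; the function name, the ID format, and the 80/10/10 proportions are illustrative choices, not something the slides prescribe.

```python
import hashlib

def assign_split(record_id, train=0.8, dev=0.1):
    """Map a record ID to 'train', 'dev', or 'test' deterministically."""
    digest = hashlib.sha256(record_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as a fraction in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    if bucket < train:
        return "train"
    if bucket < train + dev:
        return "dev"
    return "test"

# Hypothetical record IDs; new IDs can be added later without moving old ones.
record_ids = [f"record-{i}" for i in range(1000)]
splits = {rid: assign_split(rid) for rid in record_ids}
counts = {
    name: sum(1 for s in splits.values() if s == name)
    for name in ("train", "dev", "test")
}
```

Because the assignment depends only on the ID, the same function can run inside a streaming consumer and a batch job and still agree on which split a record belongs to.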
Data Preparation
Challenges
● Data is typically not ready to be consumed by ML job*
● Data Cleaning
● Missing/incorrect labels
● Data Preparation
● Same Format
● Same Distribution
Solutions
* Demo datasets are a fortunate exception :)
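A minimal sketch of the two challenges named above, assuming records are plain dictionaries: cleaning drops rows whose label is missing or unknown, and preparation rescales a numeric feature to zero mean and unit variance so that batches share a comparable scale. The field names and sample values are hypothetical.

```python
def clean(rows, valid_labels):
    """Drop rows whose label is missing or not in the known label set."""
    return [r for r in rows if r.get("label") in valid_labels]

def standardize(rows, feature):
    """Rescale one numeric feature to zero mean and unit variance."""
    values = [r[feature] for r in rows]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5 or 1.0  # avoid dividing by zero for constant features
    return [{**r, feature: (r[feature] - mean) / std} for r in rows]

# Hypothetical raw records (values are made up).
raw = [
    {"amount": 10.0, "label": "ok"},
    {"amount": 30.0, "label": "fraud"},
    {"amount": 20.0, "label": None},    # missing label
    {"amount": 50.0, "label": "frod"},  # incorrect label
]
prepared = standardize(clean(raw, {"ok", "fraud"}), "amount")
```

Standardization addresses scale only; it does not by itself guarantee that training and serving data follow the same distribution.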
Users
Challenges
● Different Users/Use cases
● Data Analyst/Exploring
● Production Workloads
● Highly Optimized
● How to spawn Environments?
Solutions
TensorFlow...
Data Analytics Ecosystem
Typical Developer Workflow for TensorFlow (Distributed)
● …
● Provision a set of machines to run your computation
● Install TensorFlow on them
● Write code to map distributed computations to the exact IP address of the machine where those computations will be performed
● Deploy your code on every machine
● Train your data on the cluster → Output Trained Model
Input Data Set → Trained Model
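The address-mapping step in the workflow above might look like the following sketch. It builds, for every process, a JSON value in the layout of the `TF_CONFIG` convention that TensorFlow's distributed training APIs read. The IP addresses, port, and helper name are hypothetical; the point is that they have to be written down by hand and kept in sync with the machines.

```python
import json

# Hypothetical machine addresses; in the manual workflow these are hard-coded.
CLUSTER = {
    "ps": ["10.0.0.1:2222"],
    "worker": ["10.0.0.2:2222", "10.0.0.3:2222"],
}

def tf_config_for(task_type, task_index):
    """Build the TF_CONFIG value for one process in the cluster."""
    if task_index >= len(CLUSTER[task_type]):
        raise ValueError(f"no {task_type} with index {task_index}")
    return json.dumps({
        "cluster": CLUSTER,
        "task": {"type": task_type, "index": task_index},
    })

# One environment value per machine; each must be deployed alongside the code.
configs = {
    address: tf_config_for(task_type, index)
    for task_type, addresses in CLUSTER.items()
    for index, address in enumerate(addresses)
}
```

Any change to the set of machines means editing `CLUSTER` and redeploying every process, which is the burden the rest of the deck tries to remove.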
Resource Isolation and Allocation
Datacenter
Typical Datacenter: siloed, over-provisioned servers, low utilization
Mesos / DC/OS: automated schedulers, workload multiplexing onto the same machines
TensorFlow, Jenkins, Kafka, Spark
Datacenter and Cloud as a Single Computing Resource
Powered by Apache Mesos
PHYSICAL INFRASTRUCTURE, VIRTUAL MACHINES, PUBLIC CLOUDS
Datacenter, Edge
MICROSERVICES, CONTAINERS, & DEV TOOLS
DATA SERVICES, MACHINE LEARNING, & AI
100+ MORE, 20+ MORE
Security & Compliance, Application-Aware Automation, Multitenancy, Hybrid Cloud Management
Deploy, Scale, Configure, Recover
3 AM ...
Typical Datacenter: siloed, over-provisioned servers, low utilization
HDFS, Kafka, Kubernetes, Flink, TensorFlow
Two-level Scheduling
1. Agents advertise resources to Master
2. Master offers resources to Framework
3. Framework rejects / uses resources
4. Agent reports task status to Master
MESOS ARCHITECTURE
Mesos Master (three instances)
Mesos Agent: Cassandra Executor (Cassandra Task), Spark Executor (Spark Task)
Mesos Agent: Docker Executor (Docker Task), CDB Executor (Spark Task)
Schedulers: Flink Scheduler, Spark Scheduler, Kafka Scheduler
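The four steps of two-level scheduling can be walked through with a toy simulation. This is a sketch of the idea only, not the Mesos API: the class names, the CPU-only resource model, and the fixed offer order are simplifying assumptions (a real master decides which framework to offer to using a fairness policy).

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    cpus: float

@dataclass
class Framework:
    name: str
    cpus_per_task: float
    pending_tasks: int

    def on_offer(self, agent_name, cpus):
        """Step 3: accept the offer if a pending task fits, otherwise decline."""
        if self.pending_tasks > 0 and cpus >= self.cpus_per_task:
            self.pending_tasks -= 1
            return self.cpus_per_task
        return 0.0

@dataclass
class Master:
    available: dict = field(default_factory=dict)
    statuses: list = field(default_factory=list)

    def advertise(self, agent):
        """Step 1: an agent reports its free resources."""
        self.available[agent.name] = agent.cpus

    def offer_round(self, frameworks):
        """Step 2: offer each agent's free resources to each framework in turn."""
        for agent_name in self.available:
            for fw in frameworks:
                used = fw.on_offer(agent_name, self.available[agent_name])
                if used:
                    self.available[agent_name] -= used
                    # Step 4: the agent reports the task status back.
                    self.statuses.append((agent_name, fw.name, "TASK_RUNNING"))

master = Master()
for agent in (Agent("agent-1", 4.0), Agent("agent-2", 2.0)):
    master.advertise(agent)
frameworks = [Framework("spark", 2.0, 2), Framework("kafka", 1.0, 1)]
master.offer_round(frameworks)
```

The master never needs to know what a Spark or Kafka task is; it only tracks resources, while each framework decides what to run on an offer.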
Running distributed TensorFlow on DC/OS
● Wrapper script to abstract away distributed TensorFlow configuration
○ Separates “deployer” responsibilities from “developer” responsibilities
{
  "service": {
    "name": "mnist",
    "job_url": "...",
    "job_context": "..."
  },
  "gpu_worker": {... },
  "worker": {... },
  "ps": {... }
}
User Code, Wrapper Script
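The deployer/developer separation could be sketched as follows. The slides do not show the wrapper's actual interface, so everything here is an assumption: the function names, the shape of the endpoints dictionary, and the idea that the context is passed as a JSON string are all hypothetical. Only the role names `gpu_worker`, `worker`, and `ps` are taken from the configuration above.

```python
import json

def build_task_configs(endpoints):
    """Turn scheduler-assigned endpoints into one cluster description per task."""
    cluster = {role: list(addrs) for role, addrs in endpoints.items() if addrs}
    return [
        {"cluster": cluster, "task": {"type": role, "index": i}}
        for role, addrs in cluster.items()
        for i in range(len(addrs))
    ]

def run_wrapped(user_main, task_config, job_context):
    """Call the developer's entry point; it never handles IP addresses itself."""
    return user_main(task_config["task"], json.loads(job_context))

# Hypothetical developer code: receives its role and a free-form context.
def user_main(task, context):
    return f"{task['type']}-{task['index']} training for {context['steps']} steps"

# Hypothetical endpoints, as a scheduler might assign them at launch time.
endpoints = {
    "gpu_worker": [],
    "worker": ["10.0.0.2:2222", "10.0.0.3:2222"],
    "ps": ["10.0.0.1:2222"],
}
task_configs = build_task_configs(endpoints)
outputs = [run_wrapped(user_main, cfg, '{"steps": 100}') for cfg in task_configs]
```

In this arrangement the deployer controls how many tasks of each role exist, and the developer's function stays unchanged when that number changes.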
Running distributed TensorFlow on DC/OS
● The dcos-commons SDK cleanly restarts failed tasks and reconnects them to the cluster
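The restart behaviour can be pictured with a small supervisor loop. This sketch does not use the dcos-commons API; the function names, the restart limit, and the simulated failure are all hypothetical and serve only to show the restart-on-failure pattern.

```python
def supervise(tasks, run_task, max_restarts=3):
    """Run each task, relaunching it on failure up to max_restarts times."""
    report = {}
    for name in tasks:
        attempts = 0
        while True:
            attempts += 1
            try:
                run_task(name, attempts)
                report[name] = ("finished", attempts)
                break
            except RuntimeError:
                if attempts > max_restarts:
                    report[name] = ("failed", attempts)
                    break
    return report

# Hypothetical task runner: worker-1 crashes on its first attempt only.
def flaky_run(name, attempt):
    if name == "worker-1" and attempt == 1:
        raise RuntimeError("simulated crash")

report = supervise(["ps-0", "worker-0", "worker-1"], flaky_run)
```

A real scheduler would also have to hand the restarted task its cluster configuration again, which is the "reconnects" part of the bullet above.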
Model Management
Challenges
● Many Models
● Different Hyperparameters
● Different Models
● New Training Data
● ...
Solutions
● Persistent Storage + Metadata
● GFS
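"Persistent Storage + Metadata" can be sketched as a versioned directory layout in which every trained model is stored next to a record of how it was produced. The directory structure, file names, and metadata fields below are assumptions made for illustration, and a temporary directory stands in for a shared file system.

```python
import json
import tempfile
import time
from pathlib import Path

def save_model(store, name, weights, hyperparameters, training_data_version):
    """Write a model artifact and its metadata under a new version directory."""
    model_dir = Path(store) / name
    model_dir.mkdir(parents=True, exist_ok=True)
    version = len([p for p in model_dir.iterdir() if p.is_dir()]) + 1
    version_dir = model_dir / f"v{version}"
    version_dir.mkdir()
    (version_dir / "weights.json").write_text(json.dumps(weights))
    metadata = {
        "name": name,
        "version": version,
        "hyperparameters": hyperparameters,
        "training_data_version": training_data_version,
        "created_at": time.time(),
    }
    (version_dir / "metadata.json").write_text(json.dumps(metadata, indent=2))
    return version

def load_metadata(store, name, version):
    """Read back the metadata recorded for one model version."""
    path = Path(store) / name / f"v{version}" / "metadata.json"
    return json.loads(path.read_text())

store = tempfile.mkdtemp()  # stand-in for a persistent, shared file system
v1 = save_model(store, "mnist", [0.1, 0.2], {"learning_rate": 0.01}, "2018-01")
v2 = save_model(store, "mnist", [0.3, 0.4], {"learning_rate": 0.001}, "2018-02")
latest = load_metadata(store, "mnist", v2)
```

Keeping hyperparameters and the training data version beside each artifact makes it possible to tell apart models that differ only in one of the dimensions listed under Challenges.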
Model Serving
Challenges
● How to Deploy Models?
● Zero Downtime
● Canary
Solutions
● TensorFlow Serving
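A canary rollout can be sketched as weighted routing between two model versions: most requests keep going to the stable version while a small share exercises the new one. This is a generic illustration, not the TensorFlow Serving API; the version names, the lambda models, and the 10% fraction are hypothetical.

```python
import random

def pick_version(canary_fraction, rng):
    """Route a request to the canary with the given probability."""
    return "canary" if rng.random() < canary_fraction else "stable"

def serve(requests, models, canary_fraction, rng):
    """Send each request to one model version and record which one answered."""
    responses = []
    for request in requests:
        version = pick_version(canary_fraction, rng)
        responses.append((version, models[version](request)))
    return responses

# Hypothetical model versions; real ones would be served model endpoints.
models = {
    "stable": lambda x: x * 2,
    "canary": lambda x: x * 2 + 1,
}
rng = random.Random(42)  # fixed seed so runs are repeatable
responses = serve(range(1000), models, canary_fraction=0.1, rng=rng)
canary_share = sum(1 for v, _ in responses if v == "canary") / len(responses)
```

Raising `canary_fraction` step by step to 1.0 replaces the stable version without a moment in which no model is serving, which is one way to approach the zero-downtime requirement.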
Rendezvous Architecture
https://mapr.com/ebooks/machine-learning-logistics/
Questions and Links
● DC/OS TensorFlow Package (currently closed source)
○ https://github.com/mesosphere/dcos-tensorflow
● DC/OS TensorFlow Tools
○ https://github.com/dcos-labs/dcos-tensorflow-tools/
● Tutorial for deploying TensorFlow on DC/OS
○ https://github.com/dcos/examples/tree/master/tensorflow
● Contact:
○ https://groups.google.com/a/mesosphere.io/forum/#!forum/tensorflow-dcos
○ Slack: chat.dcos.io #tensorflow

TensorFlow London 13.09.17 Ilya DmitrichenkoTensorFlow London 13.09.17 Ilya Dmitrichenko
TensorFlow London 13.09.17 Ilya Dmitrichenko
Seldon
 
TensorFlow London 13.09.17 Karim Beguir
TensorFlow London 13.09.17 Karim BeguirTensorFlow London 13.09.17 Karim Beguir
TensorFlow London 13.09.17 Karim Beguir
Seldon
 




TensorFlow 16: Building a Data Science Platform

  • 1. Building a Data Science Platform @dcos
  • 2. Why is machine learning taking off?
  • 4. 2013: $14B fraud losses, 19% YoY growth
  • 5. The case for a data science platform.
  • 6. What you want to be doing: Data (clean); Write awesome ML code; Train (once); Deploy (not me)
  • 7. What you’re actually doing (Sculley, D., Holt, G., Golovin, D. et al., Hidden Technical Debt in Machine Learning Systems)
  • 8. © 2018 Mesosphere, Inc. All Rights Reserved. Deep Learning: The Promise
  • 9. Deep Learning: The Process. Step 1: Training (In Data Center, Over Hours/Days/Weeks). Input: Lots of Labeled Data; Deep neural network model; Output: Trained Model. Step 2: Inference (Endpoint or Data Center, Instantaneous). New Input from Camera or Sensor; Trained Model; Output: Classification (97% Dog, 3% Panda).
  • 10. Deep Learning: Some insight
  • 11. Deep Learning: The Challenges
  • 12. Open Source Pipeline. "Operationalizing a Machine Learning model can be super hard. It is a stage where most enterprise Machine Learning projects fail. I cannot tell you how many companies I've talked to, who have said their innovation teams had devised these cool ML projects, but they were struggling getting the ML models into production. In this set of courses, we will talk about how to train, deploy, and predict with ML models in a way that they're production ready. And finally, we delve back into Machine Learning theory." Valliappa Lakshmanan, Tech Lead for Big Data and Machine Learning Professional Services on Google Cloud Platform.
  • 13. 1. Model Engineering 2. Model Training 3. Monitoring 4. Debugging 5. Model Serving
  • 14. Public Cloud Pipeline: 1. Data Preparation using Spark; 7. Streaming of requests; ...; Model Engineering; 2. Model Training; 3. Monitoring; 4. Debugging; 5. Model Serving
  • 15. DIY Open Source Pipeline: 1. Data Preparation using Spark; 7. Kafka stream of requests; 1. Model Engineering; 2. Model Training; 3. Monitoring; 4. Debugging; 5. Model Serving
  • 16. Data Science Pipeline on DC/OS. Stages: Data & Streaming (Distributed Data Storage and Streaming); Model Engineering (Data Preparation and Analysis); Model Training (Distributed Training using Machine Learning Frameworks); Model Management (Storage of trained Models and Metadata); Model Serving (Use trained Model for Inference). Also shown: Continuous Integration; Monitoring & Operations; Management; TensorBoard.
  • 18. Input Data Management (Input: Lots of Labeled Data). Challenges: Training/Dev/Test + New Data; Large amounts; Quality; Availability (for cluster); Velocity; Streaming. Solutions: GFS; Apache Kafka; Apache Cassandra.
  • 19. Data Preparation. Challenges: Data is typically not ready to be consumed by an ML job*; Data Cleaning; Missing/incorrect labels; Data Preparation; Same Format; Same Distribution. * Demo datasets are a fortunate exception :)
  • 21. Users. Challenges: Different Users/Use cases; Data Analyst/Exploring; Production Workloads; Highly Optimized; How to spawn Environments?
  • 24. TensorFlow...
  • 25. Data Analytics Ecosystem
  • 26. © 2017 Mesosphere, Inc. All Rights Reserved. Typical Developer Workflow for TensorFlow (Distributed): …; Provision a set of machines to run your computation; Install TensorFlow on them; Write code to map distributed computations to the exact IP address of the machine where those computations will be performed; Deploy your code on every machine; Train your data on the cluster → Output Trained Model. Diagram: Input Data Set → Trained Model.
  • 27. Resource Isolation and Allocation
  • 28. Datacenter. Typical Datacenter: siloed, over-provisioned servers, low utilization. Mesos / DC/OS: automated schedulers, workload multiplexing onto the same machines. Workloads shown: TensorFlow; Jenkins; Kafka; Spark.
  • 29. Datacenter and Cloud as a Single Computing Resource, Powered by Apache Mesos. Physical Infrastructure; Virtual Machines; Public Clouds; Datacenter; Edge. Microservices, Containers, & Dev Tools; Data Services, Machine Learning, & AI; 100+ more; 20+ more. Security & Compliance; Application-Aware Automation; Multitenancy; Hybrid Cloud Management.
  • 30. Deploy; Scale; Configure; Recover; 3 AM ... Typical Datacenter: siloed, over-provisioned servers, low utilization. Services shown: HDFS; Kafka; Kubernetes; Flink; TensorFlow.
  • 31. Mesos Architecture: Two-level Scheduling. 1. Agents advertise resources to Master 2. Master offers resources to Framework 3. Framework rejects / uses resources 4. Agent reports task status to Master. Diagram: three Mesos Masters; Mesos Agents running a Service, Cassandra Executor and Cassandra Task, Spark Executor and Spark Task, Docker Executor and Docker Task, CDB Executor; Flink Scheduler; Spark Scheduler; Kafka Scheduler.
  • 32. Running distributed TensorFlow on DC/OS. Wrapper script to abstract away distributed TensorFlow configuration; separates “deployer” responsibilities from “developer” responsibilities. Example options: { "service": { "name": "mnist", "job_url": "...", "job_context": "..." }, "gpu_worker": {... }, "worker": {... }, "ps": {... } }. Diagram: User Code; Wrapper Script.
  • 33. Running distributed TensorFlow on DC/OS. The dcos-commons SDK cleanly restarts failed tasks and reconnects them to the cluster.
  • 35. Model Management. Challenges: Many Models; Different Hyperparameters; Different Models; New Training Data; ... Solutions: Persistent Storage + Metadata; GFS.
  • 37. Model Serving. Challenges: How to Deploy Models?; Zero Downtime; Canary. Solutions: TensorFlow Serving.
  • 38. Rendezvous Architecture: https://mapr.com/ebooks/machine-learning-logistics/
  • 40. Questions and Links. DC/OS TensorFlow Package (currently closed source): https://github.com/mesosphere/dcos-tensorflow. DC/OS TensorFlow Tools: https://github.com/dcos-labs/dcos-tensorflow-tools/. Tutorial for deploying TensorFlow on DC/OS: https://github.com/dcos/examples/tree/master/tensorflow. Contact: https://groups.google.com/a/mesosphere.io/forum/#!forum/tensorflow-dcos; Slack: chat.dcos.io #tensorflow
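
The four two-level scheduling steps on slide 31 can be sketched as a toy simulation. This is a minimal illustration under stated assumptions, not Mesos code: the class names Master, Framework, Offer and Task, the agent names and the resource figures are all hypothetical, and the sketch simply offers each agent's remaining resources to every framework in turn. Real Mesos offers, declines and status updates carry far more detail.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    agent: str
    cpus: float
    mem_mb: int


@dataclass
class Task:
    name: str
    cpus: float
    mem_mb: int


class Framework:
    """Second scheduling level: decides which offered resources to use."""

    def __init__(self, name, pending):
        self.name = name
        self.pending = list(pending)

    def on_offer(self, offer):
        """Return the tasks to launch on this offer; an empty list rejects it."""
        launched = []
        cpus, mem = offer.cpus, offer.mem_mb
        for task in list(self.pending):
            if task.cpus <= cpus and task.mem_mb <= mem:
                cpus -= task.cpus
                mem -= task.mem_mb
                self.pending.remove(task)
                launched.append(task)
        return launched


class Master:
    """First scheduling level: tracks free agent resources and offers them."""

    def __init__(self):
        self.free = {}    # agent name -> (cpus, mem_mb)
        self.status = {}  # task name -> (agent name, state)

    def advertise(self, agent, cpus, mem_mb):
        # Step 1: an agent advertises its resources to the master.
        self.free[agent] = (cpus, mem_mb)

    def offer_round(self, frameworks):
        for agent in list(self.free):
            for framework in frameworks:
                cpus, mem = self.free[agent]
                # Step 2: the master offers resources to a framework.
                # Step 3: the framework rejects or uses them.
                tasks = framework.on_offer(Offer(agent, cpus, mem))
                for task in tasks:
                    cpus -= task.cpus
                    mem -= task.mem_mb
                    # Step 4: the agent reports task status to the master.
                    self.report(agent, task.name, "TASK_RUNNING")
                self.free[agent] = (cpus, mem)

    def report(self, agent, task_name, state):
        self.status[task_name] = (agent, state)


# Hypothetical cluster: two agents and two frameworks with pending tasks.
master = Master()
master.advertise("agent-1", cpus=4.0, mem_mb=8192)
master.advertise("agent-2", cpus=2.0, mem_mb=4096)

spark = Framework("spark", [Task("spark-task", 2.0, 4096)])
tf = Framework("tensorflow", [Task("tf-worker", 3.0, 4096),
                              Task("tf-ps", 1.0, 2048)])

master.offer_round([spark, tf])
for name, (agent, state) in master.status.items():
    print(name, agent, state)
print("still pending:", [t.name for t in tf.pending])
```

The point of the split is that the master only decides who is offered what, while each framework keeps its own placement logic. In this run the hypothetical tf-worker task stays pending because no single offer is large enough for it, which is the framework's decision rather than the master's.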