SlideShare a Scribd company logo
1 of 28
Download to read offline
Distributed Deep
Learning with Docker
at Salesforce
Jeff Hajewski
Software Engineer,
Salesforce.
github.com/j-haj
jeff-hajewski-3a1b5a29
Caveats
● These my own views and opinions, not those of Salesforce
● This is how one team at Salesforce deploys deep learning
models
● When I use the term Docker I am referring to the
technology, not the company
● Some of these designs are simplified
● What is deep learning and why is it difficult?
● Deep learning at Salesforce
● Challenges
○ Designing for team specialization
○ Interacting with the model server
○ Testing
● Key takeaways
About this talk
The core task of deep learning is function approximation.
Neural networks can approximate any function.
Neural networks are expensive to evaluate.
● Linear regression: ~1,000 parameters
● Deep neural network: 100M - 1B parameters (100,000 - 1M x linear reg.)
Deep Learning Review
How should we design
distributed systems
for deep learning?
high latency tasks
We use deep learning models to provide our customers
useful information about their sales process.
They send us this data as a firehose of streaming data.
The faster we get this data to our customers, the more
useful and actionable it is for their sales teams.
Deep Learning at Salesforce
There are three steps to this process
1. Preprocessing - cleaning and formatting the data
2. Inference - running the data through the model
3. Postprocessing - interpreting the output from the model
Deep Learning at Salesforce
Deep Learning at Salesforce
Discusses
cat
preprocess
[0.2, 0.71, 0.89, 0.6]
[0.85, 0.15]
inference
postprocess“Hello! My
cat is
friendly.”
Deep Learning at Salesforce
Discusses
cat
preprocess
[0.2, 0.71, 0.89, 0.6]
[0.85, 0.15]
inference
postprocess“Hello! My
cat is
friendly.”
Data Science Team
Deep Learning at Salesforce
Discusses
cat
preprocess
[0.2, 0.71, 0.89, 0.6]
[0.85, 0.15]
inference
postprocess“Hello! My
cat is
friendly.”
Data Science Team Systems Team
Challenge 1:
designing
for team
specialization
Requirements
1. The data science team shouldn’t need to know
about the system. They just want to define a
sequence of computation.
2. The systems engineers shouldn’t need to know
anything about the computation. They just want to
scale the system.
Designing for team specialization
Challenges
1. Some functions takes longer to execute than others
(e.g., model inference)
2. The order of execution is important
Designing for team specialization
Solution: map functions to containers
Designing for team specialization
postprocess(inference(preprocess(x)))
preprocess inference postprocess
What about throughput?
preprocess inference postprocess
0110010
1001111
0110010
1000011
It’s a cat!
1,000
QPS
500
QPS
300
QPS
1,000
QPS
Max
throughput
inference
inferenceinferencepreprocess
What about throughput?
preprocess inference postprocess
0110010
1001111
0110010
1000011
It’s a cat!
1,000
QPS
2x
500
QPS
4x
300
QPS
1,000
QPS
Max
throughput
Docker enables us to easily scale out each individual stage
inferenceinferencepreprocess
What about throughput?
preprocess inference postprocess
0110010
1001111
0110010
1000011
It’s a cat!
1,000
QPS
2x
500
QPS
2x
300
QPS
1,000
QPS
Max
throughput
Kafka gives stage-wise checkpointing
Challenge 2:
interacting
with the
model servers
Model servers provide a way to query the model,
typically via gRPC or HTTP.
What is the best way to deploy and interact with these
model servers?
Serving deep learning models
Challenge:
1. Model servers are designed as a standalone process.
2. How should we best utilize multiple GPUs?
3. What about networking?
Interacting with the model server
We want to keep deployment simple!
Solution: Deploy model server images as part of a “pod” or
“group” with a coordinator service
Interacting with the model server
JVM
Manager
Model Server Model Server Model Server...
Pod
1. Who owns the model server?
2. How should we handle model versions? Where are they
stored locally?
3. What are the addresses of the model servers?
This solves additional challenges
Data science team
Docker shared volume
http://localhost via Docker private networking
Challenge 3:
testing
Challenge: how should we test these systems?
1. Deep learning models are probabilistic
2. Interservice interactions can be quite complex
Testing
Solution: Docker Compose
● Makes it easy to swap out the model server with a mock
service
● Deploying the entire system locally is easy
● Integrates well with Maven and Gradle
Testing
We haven’t spent a lot of time
discussing the details of Docker
That is precisely the point!
● Docker allows us to simplify many aspects of our design.
● Docker stays out of the way.
● Docker provides a simple alternative to a much more
complex solution.
Docker simplifies our lives

More Related Content

What's hot

Predicting Space Weather with Docker
Predicting Space Weather with DockerPredicting Space Weather with Docker
Predicting Space Weather with DockerDocker, Inc.
 
DCSF19 Container Security: Theory & Practice at Netflix
DCSF19 Container Security: Theory & Practice at NetflixDCSF19 Container Security: Theory & Practice at Netflix
DCSF19 Container Security: Theory & Practice at NetflixDocker, Inc.
 
Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...
Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...
Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...Docker, Inc.
 
Puppet plugin for vRealize Automation (vRA)
Puppet plugin for vRealize Automation (vRA)Puppet plugin for vRealize Automation (vRA)
Puppet plugin for vRealize Automation (vRA)Puppet
 
Simple tweaks to get the most out of your JVM
Simple tweaks to get the most out of your JVMSimple tweaks to get the most out of your JVM
Simple tweaks to get the most out of your JVMJamie Coleman
 
Webinar: Accelerate Your Inner Dev Loop for Kubernetes Services
Webinar: Accelerate Your Inner Dev Loop for Kubernetes Services Webinar: Accelerate Your Inner Dev Loop for Kubernetes Services
Webinar: Accelerate Your Inner Dev Loop for Kubernetes Services Ambassador Labs
 
DCSF 19 How Entergy is Mitigating Legacy Windows Operating System Vulnerabili...
DCSF 19 How Entergy is Mitigating Legacy Windows Operating System Vulnerabili...DCSF 19 How Entergy is Mitigating Legacy Windows Operating System Vulnerabili...
DCSF 19 How Entergy is Mitigating Legacy Windows Operating System Vulnerabili...Docker, Inc.
 
DCSF 19 Microservices API: Routing Across Any Infrastructure
DCSF 19 Microservices API: Routing Across Any InfrastructureDCSF 19 Microservices API: Routing Across Any Infrastructure
DCSF 19 Microservices API: Routing Across Any InfrastructureDocker, Inc.
 
DCSF 19 Developing Apps with Containers, Functions and Cloud Services
DCSF 19 Developing Apps with Containers, Functions and Cloud ServicesDCSF 19 Developing Apps with Containers, Functions and Cloud Services
DCSF 19 Developing Apps with Containers, Functions and Cloud ServicesDocker, Inc.
 
Serverless java
Serverless   javaServerless   java
Serverless javaVishwas N
 
DCSF 19 Modernizing Insurance with Docker Enterprise: The Physicians Mutual ...
DCSF 19 Modernizing Insurance with Docker Enterprise:  The Physicians Mutual ...DCSF 19 Modernizing Insurance with Docker Enterprise:  The Physicians Mutual ...
DCSF 19 Modernizing Insurance with Docker Enterprise: The Physicians Mutual ...Docker, Inc.
 
DCEU 18: From Monolith to Microservices
DCEU 18: From Monolith to MicroservicesDCEU 18: From Monolith to Microservices
DCEU 18: From Monolith to MicroservicesDocker, Inc.
 
Accessible hpc for everyone with docker and containers
Accessible hpc for everyone with docker and containersAccessible hpc for everyone with docker and containers
Accessible hpc for everyone with docker and containersDocker, Inc.
 
DCEU 18: 5 Patterns for Success in Application Transformation
DCEU 18: 5 Patterns for Success in Application TransformationDCEU 18: 5 Patterns for Success in Application Transformation
DCEU 18: 5 Patterns for Success in Application TransformationDocker, Inc.
 
Networking in Docker EE 2.0 with Kubernetes and Swarm
Networking in Docker EE 2.0 with Kubernetes and SwarmNetworking in Docker EE 2.0 with Kubernetes and Swarm
Networking in Docker EE 2.0 with Kubernetes and SwarmAbhinandan P.b
 
Application Deployment and Management at Scale with 1&1 by Matt Baldwin
Application Deployment and Management at Scale with 1&1 by Matt BaldwinApplication Deployment and Management at Scale with 1&1 by Matt Baldwin
Application Deployment and Management at Scale with 1&1 by Matt BaldwinDocker, Inc.
 
Container on azure
Container on azureContainer on azure
Container on azureVishwas N
 
Immutable Awesomeness by John Willis and Josh Corman
Immutable Awesomeness by John Willis and Josh CormanImmutable Awesomeness by John Willis and Josh Corman
Immutable Awesomeness by John Willis and Josh CormanDocker, Inc.
 

What's hot (20)

Predicting Space Weather with Docker
Predicting Space Weather with DockerPredicting Space Weather with Docker
Predicting Space Weather with Docker
 
DCSF19 Container Security: Theory & Practice at Netflix
DCSF19 Container Security: Theory & Practice at NetflixDCSF19 Container Security: Theory & Practice at Netflix
DCSF19 Container Security: Theory & Practice at Netflix
 
Hands-on Helm
Hands-on Helm Hands-on Helm
Hands-on Helm
 
Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...
Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...
Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...
 
Puppet plugin for vRealize Automation (vRA)
Puppet plugin for vRealize Automation (vRA)Puppet plugin for vRealize Automation (vRA)
Puppet plugin for vRealize Automation (vRA)
 
Simple tweaks to get the most out of your JVM
Simple tweaks to get the most out of your JVMSimple tweaks to get the most out of your JVM
Simple tweaks to get the most out of your JVM
 
Webinar: Accelerate Your Inner Dev Loop for Kubernetes Services
Webinar: Accelerate Your Inner Dev Loop for Kubernetes Services Webinar: Accelerate Your Inner Dev Loop for Kubernetes Services
Webinar: Accelerate Your Inner Dev Loop for Kubernetes Services
 
DCSF 19 How Entergy is Mitigating Legacy Windows Operating System Vulnerabili...
DCSF 19 How Entergy is Mitigating Legacy Windows Operating System Vulnerabili...DCSF 19 How Entergy is Mitigating Legacy Windows Operating System Vulnerabili...
DCSF 19 How Entergy is Mitigating Legacy Windows Operating System Vulnerabili...
 
DCSF 19 Microservices API: Routing Across Any Infrastructure
DCSF 19 Microservices API: Routing Across Any InfrastructureDCSF 19 Microservices API: Routing Across Any Infrastructure
DCSF 19 Microservices API: Routing Across Any Infrastructure
 
DCSF 19 Developing Apps with Containers, Functions and Cloud Services
DCSF 19 Developing Apps with Containers, Functions and Cloud ServicesDCSF 19 Developing Apps with Containers, Functions and Cloud Services
DCSF 19 Developing Apps with Containers, Functions and Cloud Services
 
Serverless java
Serverless   javaServerless   java
Serverless java
 
DCSF 19 Modernizing Insurance with Docker Enterprise: The Physicians Mutual ...
DCSF 19 Modernizing Insurance with Docker Enterprise:  The Physicians Mutual ...DCSF 19 Modernizing Insurance with Docker Enterprise:  The Physicians Mutual ...
DCSF 19 Modernizing Insurance with Docker Enterprise: The Physicians Mutual ...
 
DCEU 18: From Monolith to Microservices
DCEU 18: From Monolith to MicroservicesDCEU 18: From Monolith to Microservices
DCEU 18: From Monolith to Microservices
 
Accessible hpc for everyone with docker and containers
Accessible hpc for everyone with docker and containersAccessible hpc for everyone with docker and containers
Accessible hpc for everyone with docker and containers
 
DCEU 18: 5 Patterns for Success in Application Transformation
DCEU 18: 5 Patterns for Success in Application TransformationDCEU 18: 5 Patterns for Success in Application Transformation
DCEU 18: 5 Patterns for Success in Application Transformation
 
Networking in Docker EE 2.0 with Kubernetes and Swarm
Networking in Docker EE 2.0 with Kubernetes and SwarmNetworking in Docker EE 2.0 with Kubernetes and Swarm
Networking in Docker EE 2.0 with Kubernetes and Swarm
 
Application Deployment and Management at Scale with 1&1 by Matt Baldwin
Application Deployment and Management at Scale with 1&1 by Matt BaldwinApplication Deployment and Management at Scale with 1&1 by Matt Baldwin
Application Deployment and Management at Scale with 1&1 by Matt Baldwin
 
JEEconf 2017
JEEconf 2017JEEconf 2017
JEEconf 2017
 
Container on azure
Container on azureContainer on azure
Container on azure
 
Immutable Awesomeness by John Willis and Josh Corman
Immutable Awesomeness by John Willis and Josh CormanImmutable Awesomeness by John Willis and Josh Corman
Immutable Awesomeness by John Willis and Josh Corman
 

Similar to Distributed Deep Learning with Docker at Salesforce

Deep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep FeaturesDeep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep FeaturesTuri, Inc.
 
NLP and Deep Learning for non_experts
NLP and Deep Learning for non_expertsNLP and Deep Learning for non_experts
NLP and Deep Learning for non_expertsSanghamitra Deb
 
Deep Learning on Qubole Data Platform
Deep Learning on Qubole Data PlatformDeep Learning on Qubole Data Platform
Deep Learning on Qubole Data PlatformShivaji Dutta
 
DX@Scale: Optimizing Salesforce Development and Deployment for large scale pr...
DX@Scale: Optimizing Salesforce Development and Deployment for large scale pr...DX@Scale: Optimizing Salesforce Development and Deployment for large scale pr...
DX@Scale: Optimizing Salesforce Development and Deployment for large scale pr...Paris Salesforce Developer Group
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabszekeLabs Technologies
 
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher ScientificEnabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher ScientificDatabricks
 
Why is dev ops for machine learning so different - dataxdays
Why is dev ops for machine learning so different  - dataxdaysWhy is dev ops for machine learning so different  - dataxdays
Why is dev ops for machine learning so different - dataxdaysRyan Dawson
 
How to Build Deep Learning Models
How to Build Deep Learning ModelsHow to Build Deep Learning Models
How to Build Deep Learning ModelsJosh Patterson
 
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018Apache MXNet
 
Machine Learning Infrastructure
Machine Learning InfrastructureMachine Learning Infrastructure
Machine Learning InfrastructureSigOpt
 
[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...
[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...
[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...Ambassador Labs
 
ITARC15 Workshop - Architecting a Large Software Project - Lessons Learned
ITARC15 Workshop - Architecting a Large Software Project - Lessons LearnedITARC15 Workshop - Architecting a Large Software Project - Lessons Learned
ITARC15 Workshop - Architecting a Large Software Project - Lessons LearnedJoão Pedro Martins
 
Selena Deckelmann - Sane Schema Management with Alembic and SQLAlchemy @ Pos...
Selena Deckelmann - Sane Schema Management with  Alembic and SQLAlchemy @ Pos...Selena Deckelmann - Sane Schema Management with  Alembic and SQLAlchemy @ Pos...
Selena Deckelmann - Sane Schema Management with Alembic and SQLAlchemy @ Pos...PostgresOpen
 
Agile2015: Introduction to DevOps with Chocolate and Lego Game
Agile2015: Introduction to DevOps with Chocolate and Lego GameAgile2015: Introduction to DevOps with Chocolate and Lego Game
Agile2015: Introduction to DevOps with Chocolate and Lego GameDana Pylayeva
 
Testing with laravel
Testing with laravelTesting with laravel
Testing with laravelDerek Binkley
 
DLT UNIT-3.docx
DLT  UNIT-3.docxDLT  UNIT-3.docx
DLT UNIT-3.docx0567Padma
 
Super Sizing Youtube with Python
Super Sizing Youtube with PythonSuper Sizing Youtube with Python
Super Sizing Youtube with Pythondidip
 

Similar to Distributed Deep Learning with Docker at Salesforce (20)

Deep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep FeaturesDeep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep Features
 
NLP and Deep Learning for non_experts
NLP and Deep Learning for non_expertsNLP and Deep Learning for non_experts
NLP and Deep Learning for non_experts
 
Deep Learning on Qubole Data Platform
Deep Learning on Qubole Data PlatformDeep Learning on Qubole Data Platform
Deep Learning on Qubole Data Platform
 
DX@Scale: Optimizing Salesforce Development and Deployment for large scale pr...
DX@Scale: Optimizing Salesforce Development and Deployment for large scale pr...DX@Scale: Optimizing Salesforce Development and Deployment for large scale pr...
DX@Scale: Optimizing Salesforce Development and Deployment for large scale pr...
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabs
 
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher ScientificEnabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
 
Why is dev ops for machine learning so different - dataxdays
Why is dev ops for machine learning so different  - dataxdaysWhy is dev ops for machine learning so different  - dataxdays
Why is dev ops for machine learning so different - dataxdays
 
How to Build Deep Learning Models
How to Build Deep Learning ModelsHow to Build Deep Learning Models
How to Build Deep Learning Models
 
What DevOps Isn't
What DevOps Isn'tWhat DevOps Isn't
What DevOps Isn't
 
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
 
Machine Learning Infrastructure
Machine Learning InfrastructureMachine Learning Infrastructure
Machine Learning Infrastructure
 
Deep Domain
Deep DomainDeep Domain
Deep Domain
 
[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...
[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...
[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...
 
ITARC15 Workshop - Architecting a Large Software Project - Lessons Learned
ITARC15 Workshop - Architecting a Large Software Project - Lessons LearnedITARC15 Workshop - Architecting a Large Software Project - Lessons Learned
ITARC15 Workshop - Architecting a Large Software Project - Lessons Learned
 
Selena Deckelmann - Sane Schema Management with Alembic and SQLAlchemy @ Pos...
Selena Deckelmann - Sane Schema Management with  Alembic and SQLAlchemy @ Pos...Selena Deckelmann - Sane Schema Management with  Alembic and SQLAlchemy @ Pos...
Selena Deckelmann - Sane Schema Management with Alembic and SQLAlchemy @ Pos...
 
Agile2015: Introduction to DevOps with Chocolate and Lego Game
Agile2015: Introduction to DevOps with Chocolate and Lego GameAgile2015: Introduction to DevOps with Chocolate and Lego Game
Agile2015: Introduction to DevOps with Chocolate and Lego Game
 
Testing with laravel
Testing with laravelTesting with laravel
Testing with laravel
 
DLT UNIT-3.docx
DLT  UNIT-3.docxDLT  UNIT-3.docx
DLT UNIT-3.docx
 
Super Sizing Youtube with Python
Super Sizing Youtube with PythonSuper Sizing Youtube with Python
Super Sizing Youtube with Python
 
Os Solomon
Os SolomonOs Solomon
Os Solomon
 

More from Docker, Inc.

Containerize Your Game Server for the Best Multiplayer Experience
Containerize Your Game Server for the Best Multiplayer Experience Containerize Your Game Server for the Best Multiplayer Experience
Containerize Your Game Server for the Best Multiplayer Experience Docker, Inc.
 
How to Improve Your Image Builds Using Advance Docker Build
How to Improve Your Image Builds Using Advance Docker BuildHow to Improve Your Image Builds Using Advance Docker Build
How to Improve Your Image Builds Using Advance Docker BuildDocker, Inc.
 
Build & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWSBuild & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWSDocker, Inc.
 
Securing Your Containerized Applications with NGINX
Securing Your Containerized Applications with NGINXSecuring Your Containerized Applications with NGINX
Securing Your Containerized Applications with NGINXDocker, Inc.
 
How To Build and Run Node Apps with Docker and Compose
How To Build and Run Node Apps with Docker and ComposeHow To Build and Run Node Apps with Docker and Compose
How To Build and Run Node Apps with Docker and ComposeDocker, Inc.
 
The First 10M Pulls: Building The Official Curl Image for Docker Hub
The First 10M Pulls: Building The Official Curl Image for Docker HubThe First 10M Pulls: Building The Official Curl Image for Docker Hub
The First 10M Pulls: Building The Official Curl Image for Docker HubDocker, Inc.
 
COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...
COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...
COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...Docker, Inc.
 
Become a Docker Power User With Microsoft Visual Studio Code
Become a Docker Power User With Microsoft Visual Studio CodeBecome a Docker Power User With Microsoft Visual Studio Code
Become a Docker Power User With Microsoft Visual Studio CodeDocker, Inc.
 
How to Use Mirroring and Caching to Optimize your Container Registry
How to Use Mirroring and Caching to Optimize your Container RegistryHow to Use Mirroring and Caching to Optimize your Container Registry
How to Use Mirroring and Caching to Optimize your Container RegistryDocker, Inc.
 
Kubernetes at Datadog Scale
Kubernetes at Datadog ScaleKubernetes at Datadog Scale
Kubernetes at Datadog ScaleDocker, Inc.
 
Labels, Labels, Labels
Labels, Labels, Labels Labels, Labels, Labels
Labels, Labels, Labels Docker, Inc.
 
Using Docker Hub at Scale to Support Micro Focus' Delivery and Deployment Model
Using Docker Hub at Scale to Support Micro Focus' Delivery and Deployment ModelUsing Docker Hub at Scale to Support Micro Focus' Delivery and Deployment Model
Using Docker Hub at Scale to Support Micro Focus' Delivery and Deployment ModelDocker, Inc.
 
Build & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWSBuild & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWSDocker, Inc.
 
From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...
From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...
From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...Docker, Inc.
 
Developing with Docker for the Arm Architecture
Developing with Docker for the Arm ArchitectureDeveloping with Docker for the Arm Architecture
Developing with Docker for the Arm ArchitectureDocker, Inc.
 
Sharing is Caring: How to Begin Speaking at Conferences
Sharing is Caring: How to Begin Speaking at ConferencesSharing is Caring: How to Begin Speaking at Conferences
Sharing is Caring: How to Begin Speaking at ConferencesDocker, Inc.
 
Virtual Meetup Docker + Arm: Building Multi-arch Apps with Buildx
Virtual Meetup Docker + Arm: Building Multi-arch Apps with BuildxVirtual Meetup Docker + Arm: Building Multi-arch Apps with Buildx
Virtual Meetup Docker + Arm: Building Multi-arch Apps with BuildxDocker, Inc.
 
DCSF 19 eBPF Superpowers
DCSF 19 eBPF SuperpowersDCSF 19 eBPF Superpowers
DCSF 19 eBPF SuperpowersDocker, Inc.
 
DCSF 19 Zero Trust Networks Come to Enterprise Kubernetes
DCSF 19 Zero Trust Networks Come to Enterprise KubernetesDCSF 19 Zero Trust Networks Come to Enterprise Kubernetes
DCSF 19 Zero Trust Networks Come to Enterprise KubernetesDocker, Inc.
 
DCSF 19 Node.js Rocks in Docker for Dev and Ops
DCSF 19 Node.js Rocks in Docker for Dev and OpsDCSF 19 Node.js Rocks in Docker for Dev and Ops
DCSF 19 Node.js Rocks in Docker for Dev and OpsDocker, Inc.
 

More from Docker, Inc. (20)

Containerize Your Game Server for the Best Multiplayer Experience
Containerize Your Game Server for the Best Multiplayer Experience Containerize Your Game Server for the Best Multiplayer Experience
Containerize Your Game Server for the Best Multiplayer Experience
 
How to Improve Your Image Builds Using Advance Docker Build
How to Improve Your Image Builds Using Advance Docker BuildHow to Improve Your Image Builds Using Advance Docker Build
How to Improve Your Image Builds Using Advance Docker Build
 
Build & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWSBuild & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWS
 
Securing Your Containerized Applications with NGINX
Securing Your Containerized Applications with NGINXSecuring Your Containerized Applications with NGINX
Securing Your Containerized Applications with NGINX
 
How To Build and Run Node Apps with Docker and Compose
How To Build and Run Node Apps with Docker and ComposeHow To Build and Run Node Apps with Docker and Compose
How To Build and Run Node Apps with Docker and Compose
 
The First 10M Pulls: Building The Official Curl Image for Docker Hub
The First 10M Pulls: Building The Official Curl Image for Docker HubThe First 10M Pulls: Building The Official Curl Image for Docker Hub
The First 10M Pulls: Building The Official Curl Image for Docker Hub
 
COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...
COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...
COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...
 
Become a Docker Power User With Microsoft Visual Studio Code
Become a Docker Power User With Microsoft Visual Studio CodeBecome a Docker Power User With Microsoft Visual Studio Code
Become a Docker Power User With Microsoft Visual Studio Code
 
How to Use Mirroring and Caching to Optimize your Container Registry
How to Use Mirroring and Caching to Optimize your Container RegistryHow to Use Mirroring and Caching to Optimize your Container Registry
How to Use Mirroring and Caching to Optimize your Container Registry
 
Kubernetes at Datadog Scale
Kubernetes at Datadog ScaleKubernetes at Datadog Scale
Kubernetes at Datadog Scale
 
Labels, Labels, Labels
Labels, Labels, Labels Labels, Labels, Labels
Labels, Labels, Labels
 
Using Docker Hub at Scale to Support Micro Focus' Delivery and Deployment Model
Using Docker Hub at Scale to Support Micro Focus' Delivery and Deployment ModelUsing Docker Hub at Scale to Support Micro Focus' Delivery and Deployment Model
Using Docker Hub at Scale to Support Micro Focus' Delivery and Deployment Model
 
Build & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWSBuild & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWS
 
From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...
From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...
From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...
 
Developing with Docker for the Arm Architecture
Developing with Docker for the Arm ArchitectureDeveloping with Docker for the Arm Architecture
Developing with Docker for the Arm Architecture
 
Sharing is Caring: How to Begin Speaking at Conferences
Sharing is Caring: How to Begin Speaking at ConferencesSharing is Caring: How to Begin Speaking at Conferences
Sharing is Caring: How to Begin Speaking at Conferences
 
Virtual Meetup Docker + Arm: Building Multi-arch Apps with Buildx
Virtual Meetup Docker + Arm: Building Multi-arch Apps with BuildxVirtual Meetup Docker + Arm: Building Multi-arch Apps with Buildx
Virtual Meetup Docker + Arm: Building Multi-arch Apps with Buildx
 
DCSF 19 eBPF Superpowers
DCSF 19 eBPF SuperpowersDCSF 19 eBPF Superpowers
DCSF 19 eBPF Superpowers
 
DCSF 19 Zero Trust Networks Come to Enterprise Kubernetes
DCSF 19 Zero Trust Networks Come to Enterprise KubernetesDCSF 19 Zero Trust Networks Come to Enterprise Kubernetes
DCSF 19 Zero Trust Networks Come to Enterprise Kubernetes
 
DCSF 19 Node.js Rocks in Docker for Dev and Ops
DCSF 19 Node.js Rocks in Docker for Dev and OpsDCSF 19 Node.js Rocks in Docker for Dev and Ops
DCSF 19 Node.js Rocks in Docker for Dev and Ops
 

Recently uploaded

Using IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & IrelandUsing IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & IrelandIES VE
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...FIDO Alliance
 
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxWSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxJennifer Lim
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024Lorenzo Miniero
 
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPTiSEO AI
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FIDO Alliance
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024Stephanie Beckett
 
ECS 2024 Teams Premium - Pretty Secure
ECS 2024   Teams Premium - Pretty SecureECS 2024   Teams Premium - Pretty Secure
ECS 2024 Teams Premium - Pretty SecureFemke de Vroome
 
AI mind or machine power point presentation
AI mind or machine power point presentationAI mind or machine power point presentation
AI mind or machine power point presentationyogeshlabana357357
 
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfFIDO Alliance
 
Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024Patrick Viafore
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfFIDO Alliance
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfFIDO Alliance
 
IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIES VE
 
Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Hiroshi SHIBATA
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe中 央社
 
AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101vincent683379
 
TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024Stephen Perrenod
 
Your enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4jYour enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4jNeo4j
 

Recently uploaded (20)

Using IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & IrelandUsing IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & Ireland
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
 
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxWSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024
 
Overview of Hyperledger Foundation
Overview of Hyperledger FoundationOverview of Hyperledger Foundation
Overview of Hyperledger Foundation
 
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024
 
ECS 2024 Teams Premium - Pretty Secure
ECS 2024   Teams Premium - Pretty SecureECS 2024   Teams Premium - Pretty Secure
ECS 2024 Teams Premium - Pretty Secure
 
AI mind or machine power point presentation
AI mind or machine power point presentationAI mind or machine power point presentation
AI mind or machine power point presentation
 
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
 
Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and Planning
 
Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 
AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101
 
TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024
 
Your enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4jYour enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4j
 

Distributed Deep Learning with Docker at Salesforce

  • 1. Distributed Deep Learning with Docker at Salesforce
  • 3. Caveats ● These my own views and opinions, not those of Salesforce ● This is how one team at Salesforce deploys deep learning models ● When I use the term Docker I am referring to the technology, not the company ● Some of these designs are simplified
  • 4. ● What is deep learning and why is it difficult? ● Deep learning at Salesforce ● Challenges ○ Designing for team specialization ○ Interacting with the model server ○ Testing ● Key takeaways About this talk
  • 5. The core task of deep learning is function approximation. Neural networks can approximate any function. Neural networks are expensive to evaluate. ● Linear regression: ~1,000 parameters ● Deep neural network: 100M - 1B parameters (100,000 - 1M x linear reg.) Deep Learning Review
  • 6. How should we design distributed systems for deep learning? high latency tasks
  • 7. We use deep learning models to provide our customers useful information about their sales process. They send us this data as a firehose of streaming data. The faster we get this data to our customers, the more useful and actionable it is for their sales teams. Deep Learning at Salesforce
  • 8. There are three steps to this process 1. Preprocessing - cleaning and formatting the data 2. Inference - running the data through the model 3. Postprocessing - interpreting the output from the model Deep Learning at Salesforce
  • 9. Deep Learning at Salesforce Discusses cat preprocess [0.2, 0.71, 0.89, 0.6] [0.85, 0.15] inference postprocess“Hello! My cat is friendly.”
  • 10. Deep Learning at Salesforce Discusses cat preprocess [0.2, 0.71, 0.89, 0.6] [0.85, 0.15] inference postprocess“Hello! My cat is friendly.” Data Science Team
  • 11. Deep Learning at Salesforce Discusses cat preprocess [0.2, 0.71, 0.89, 0.6] [0.85, 0.15] inference postprocess“Hello! My cat is friendly.” Data Science Team Systems Team
  • 13. Requirements 1. The data science team shouldn’t need to know about the system. They just want to define a sequence of computation. 2. The systems engineers shouldn’t need to know anything about the computation. They just want to scale the system. Designing for team specialization
  • 14. Challenges 1. Some functions takes longer to execute than others (e.g., model inference) 2. The order of execution is important Designing for team specialization
  • 15. Solution: map functions to containers Designing for team specialization postprocess(inference(preprocess(x))) preprocess inference postprocess
  • 16. What about throughput? preprocess inference postprocess 0110010 1001111 0110010 1000011 It’s a cat! 1,000 QPS 500 QPS 300 QPS 1,000 QPS Max throughput
  • 17. inference inferenceinferencepreprocess What about throughput? preprocess inference postprocess 0110010 1001111 0110010 1000011 It’s a cat! 1,000 QPS 2x 500 QPS 4x 300 QPS 1,000 QPS Max throughput Docker enables us to easily scale out each individual stage
  • 18. inferenceinferencepreprocess What about throughput? preprocess inference postprocess 0110010 1001111 0110010 1000011 It’s a cat! 1,000 QPS 2x 500 QPS 2x 300 QPS 1,000 QPS Max throughput Kafka gives stage-wise checkpointing
  • 20. Model servers provide a way to query the model, typically via gRPC or HTTP. What is the best way to deploy and interact with these model servers? Serving deep learning models
  • 21. Challenge: 1. Model servers are designed as a standalone process. 2. How should we best utilize multiple GPUs? 3. What about networking? Interacting with the model server We want to keep deployment simple!
  • 22. Solution: Deploy model server images as part of a “pod” or “group” with a coordinator service Interacting with the model server JVM Manager Model Server Model Server Model Server... Pod
  • 23. 1. Who owns the model server? 2. How should we handle model versions? Where are they stored locally? 3. What are the addresses of the model servers? This solves additional challenges Data science team Docker shared volume http://localhost via Docker private networking
  • 25. Challenge: how should we test these systems? 1. Deep learning models are probabilistic 2. Interservice interactions can be quite complex Testing
  • 26. Solution: Docker Compose ● Makes it easy to swap out the model server with a mock service ● Deploying the entire system locally is easy ● Integrates well with Maven and Gradle Testing
  • 27. We haven’t spent a lot of time discussing the details of Docker That is precisely the point!
  • 28. ● Docker allows us to simplify many aspects of our design. ● Docker stays out of the way. ● Docker provides a simple alternative to a much more complex solution. Docker simplifies our lives