17 JUNE 2021
SCALING AI IN PRODUCTION USING PYTORCH
GEETA CHAUHAN
PyTorch Partner Engineering, Facebook AI
@CHAUHANG
MLOps World 2021
AGENDA
01  CHALLENGES WITH ML IN PRODUCTION
02  TORCHSERVE OVERVIEW
03  BEST PRACTICES FOR PRODUCTION DEPLOYMENT
PYTORCH COMMUNITY GROWTH
Source: https://paperswithcode.com/trends
CHALLENGES WITH ML IN DEPLOYMENT
[Diagram: applications (preprocessing, application logic, postprocessing) running on cloud / on-prem infrastructure]
• Performance
• Ease of use
• Cost efficiency
• Deployment at scale
INFERENCE AT SCALE
Deploying and managing models in production is difficult.

Some of the pain points include:
• Loading and managing multiple models, on multiple servers or end devices
• Running pre-processing and post-processing code on prediction requests
• How to log, monitor, and secure predictions
• What happens when you hit scale?
TORCHSERVE
Easily deploy PyTorch models in production at scale
• DEFAULT HANDLERS FOR COMMON TASKS
• LOW LATENCY MODEL SERVING
• WORKS WITH ANY ML ENVIRONMENT
• Default handlers for common use cases (e.g., image segmentation, text classification), along with custom handler support for other use cases and a Model Zoo
• Multi-model serving, model versioning, and the ability to roll back to an earlier version
• Automatic batching of individual inferences across HTTP requests
• Logging, including common metrics, and the ability to incorporate custom metrics
• Robust HTTP APIs: Management and Inference
TORCHSERVE
[Architecture diagram: torch-model-archiver packages model files (model1.pth) into archives (model1.mar ... model5.mar) stored under <path>/model_store. torchserve --start launches TorchServe, which serves Model 1, Model 2, and Model 3 over HTTP through the Inference API (http://localhost:8080/ …), the Management API (http://localhost:8081/ …), and the Metrics API, with logging and metrics.]
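The two ports in the diagram can be made concrete with a small client-side sketch. It only builds request URLs and never contacts a server. The helper names (`prediction_url`, `model_status_url`) and the default host are assumptions made for illustration; the `/predictions/{model}` and `/models/{model}` paths are the ones described in the Inference and Management API documents linked in the references.

```python
from urllib.parse import quote

INFERENCE_PORT = 8080   # Inference API port shown in the diagram
MANAGEMENT_PORT = 8081  # Management API port shown in the diagram


def prediction_url(model_name, host="localhost"):
    """URL to POST an inference request for a registered model (hypothetical helper)."""
    return f"http://{host}:{INFERENCE_PORT}/predictions/{quote(model_name)}"


def model_status_url(model_name, host="localhost"):
    """URL to GET the status of a registered model (hypothetical helper)."""
    return f"http://{host}:{MANAGEMENT_PORT}/models/{quote(model_name)}"


if __name__ == "__main__":
    print(prediction_url("resnet-152"))
    print(model_status_url("resnet-152"))
```

Keeping the two ports separate in client code mirrors the split in the diagram: inference traffic can be exposed to callers while the management port stays internal.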
TORCHSERVE DETAIL: MODEL HANDLERS
TorchServe has default model handlers that perform boilerplate data transforms for common cases:
• Image Classification
• Image Segmentation
• Object Detection
• Text Classification

You can also create custom model handlers for any model and inference task.
import torch


class MyModelHandler(object):
    """Skeleton of a custom handler; fill in the commented steps."""

    def __init__(self):
        self.initialized = False

    def initialize(self, context):
        # get GPU status & device handle
        # load model & supporting files (vocabularies etc.)
        self.initialized = True

    def preprocess(self, data):
        # put incoming data into tensor
        # transform as needed for your model
        return data

    def inference(self, data):
        # do predictions
        return data

    def postprocess(self, output):
        # process inference output, e.g. extracting top K
        # package output for web delivery
        return output


_service = MyModelHandler()


def handle(data, context):
    # module-level entry point, called once per request batch
    if not _service.initialized:
        _service.initialize(context)
    if data is None:
        return None
    data = _service.preprocess(data)
    data = _service.inference(data)
    data = _service.postprocess(data)
    return data
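To see the handler lifecycle run end to end without a server, the sketch below fills in the same four steps with a stub model. It does not import TorchServe: `FakeContext`, `TopKHandler`, the label list, and the idea that the payload already contains per-class scores are all assumptions made for illustration. The request shape (a list of dicts with the payload under "data" or "body") follows the convention used by TorchServe handlers.

```python
class FakeContext:
    """Hypothetical stand-in for the context object passed to a handler."""
    system_properties = {"gpu_id": None, "model_dir": "."}


class TopKHandler:
    """Toy handler: returns the top-K labels for a list of class scores."""

    def __init__(self, k=2):
        self.k = k
        self.labels = None
        self.initialized = False

    def initialize(self, context):
        # A real handler would load weights from context.system_properties["model_dir"].
        self.labels = ["cat", "dog", "bird"]
        self.initialized = True

    def preprocess(self, data):
        # Each request in the batch is a dict; the payload sits under "data" or "body".
        return [row.get("data") or row.get("body") for row in data]

    def inference(self, inputs):
        # Stub model: the payload is treated as already-computed class scores.
        return inputs

    def postprocess(self, outputs):
        results = []
        for scores in outputs:
            ranked = sorted(zip(self.labels, scores), key=lambda p: p[1], reverse=True)
            results.append(dict(ranked[: self.k]))
        return results

    def handle(self, data, context):
        if not self.initialized:
            self.initialize(context)
        if data is None:
            return None
        return self.postprocess(self.inference(self.preprocess(data)))


if __name__ == "__main__":
    handler = TopKHandler(k=2)
    print(handler.handle([{"body": [0.1, 0.7, 0.2]}], FakeContext()))
```

The handler returns one result per request in the batch, which is what lets the same code serve single and batched requests.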
MODEL ARCHIVE
torch-model-archiver: CLI tool for packaging all model artifacts into a single deployment unit
• Model checkpoints, or a model definition file with state_dict
• TorchScript and eager mode support
• Extra files such as vocab, config, and index_to_name mapping

torch-model-archiver \
    --model-name BERTSeqClassification_Torchscript \
    --version 1.0 \
    --serialized-file Transformer_model/traced_model.pt \
    --handler ./Transformer_handler_generalized.py \
    --extra-files "./setup_config.json,./Seq_classification_artifacts/index_to_name.json"

setup_config.json
{
    "model_name": "bert-base-uncased",
    "mode": "sequence_classification",
    "do_lower_case": "True",
    "num_labels": "2",
    "save_mode": "torchscript",
    "max_length": "150"
}

torchserve --start \
    --model-store model_store \
    --models <path-to model-file/s3-url/azure-blob-url>

https://github.com/pytorch/serve/tree/master/model-archiver#creating-a-model-archive
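When archives are built from a script or CI job, it can help to assemble the command as an argument list instead of a shell string. The sketch below only constructs and prints the command; it does not run it. `archiver_command` is a hypothetical helper, and the file paths are the ones from the slide.

```python
import shlex


def archiver_command(model_name, version, serialized_file, handler,
                     extra_files=(), export_path=None):
    """Return a torch-model-archiver invocation as an argument list (hypothetical helper)."""
    cmd = [
        "torch-model-archiver",
        "--model-name", model_name,
        "--version", version,
        "--serialized-file", serialized_file,
        "--handler", handler,
    ]
    if extra_files:
        # The CLI takes all extra files as one comma-separated value.
        cmd += ["--extra-files", ",".join(extra_files)]
    if export_path:
        cmd += ["--export-path", export_path]
    return cmd


cmd = archiver_command(
    model_name="BERTSeqClassification_Torchscript",
    version="1.0",
    serialized_file="Transformer_model/traced_model.pt",
    handler="./Transformer_handler_generalized.py",
    extra_files=["./setup_config.json",
                 "./Seq_classification_artifacts/index_to_name.json"],
    export_path="model_store",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where torch-model-archiver is installed.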
DYNAMIC BATCHING
Via custom handlers
• Model configuration based
    • batch_size: the maximum batch size
    • max_batch_delay: the maximum time TorchServe waits to receive batch_size requests
• (Coming soon) Batching support in default handlers

curl localhost:8081/models/resnet-152
{
  "modelName": "resnet-152",
  "modelUrl": "https://s3.amazonaws.com/model-server/model_archive_1.0/examples/resnet-152-batching/resnet-152.ma
  "runtime": "python",
  "minWorkers": 1,
  "maxWorkers": 1,
  "batchSize": 8,
  "maxBatchDelay": 10,
  "workers": [
    {
      "id": "9008",
      "startTime": "2019-02-19T23:56:33.907Z",
      "status": "READY",
      "gpu": false,
      "memoryUsage": 607715328
    }
  ]
}

https://github.com/pytorch/serve/blob/master/docs/batch_inference_with_ts.md
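The interaction between batch_size and max_batch_delay is easier to see in a small simulation. The sketch below is a simplified model of the policy, not TorchServe's implementation: it assumes the delay timer starts when the first request of a batch arrives, and `form_batches` is a hypothetical name.

```python
def form_batches(arrivals_ms, batch_size, max_batch_delay_ms):
    """Group request arrival times into batches.

    A batch is released when it holds batch_size requests, or when a request
    arrives after the delay window opened by the batch's first request.
    """
    batches = []
    current = []
    deadline = None
    for t in sorted(arrivals_ms):
        if current and t > deadline:
            # The window expired before the batch filled: release a partial batch.
            batches.append(current)
            current = []
        if not current:
            deadline = t + max_batch_delay_ms
        current.append(t)
        if len(current) == batch_size:
            batches.append(current)
            current = []
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    arrivals = [0, 1, 2, 3, 50, 51, 200]
    print(form_batches(arrivals, batch_size=4, max_batch_delay_ms=10))
```

A burst fills a batch immediately, while sparse traffic produces partial batches after the delay, so max_batch_delay bounds the latency added by waiting.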
METRICS
Out-of-the-box metrics with the ability to extend
• CPU, disk, and memory utilization
• Request type counts
• ts.metrics class for extension
• Types supported: size, percentage, counter, general metric
• Prometheus metrics support available

# Access context metrics as follows
metrics = context.metrics

# Create Dimension Object
from ts.metrics.dimension import Dimension

# Dimensions are name value pairs
dim1 = Dimension(name, value)
.
dimN = Dimension(name_n, value_n)

# Add Distance as a metric
# dimensions = [dim1, dim2, dim3, ..., dimN]
metrics.add_metric('DistanceInKM', distance, 'km', dimensions=dimensions)

# Add Image size as a size metric
metrics.add_size('SizeOfImage', img_size, None, 'MB', dimensions)

# Add MemoryUtilization as a percentage metric
metrics.add_percent('MemoryUtilization', utilization_percent, None, dimensions)

# Create a counter with name 'LoopCount' and dimensions
metrics.add_counter('LoopCount', 1, None, dimensions)

# Log custom metrics
for metric in metrics.store:
    logger.info("[METRICS]%s", str(metric))

https://github.com/pytorch/serve/blob/master/docs/metrics.md
RECENT FEATURES
+ Ensemble model support, Captum model interpretability
+ Kubeflow Pipelines / KFServing integration with auto-scaling and canary rollout on any cloud or on-prem
+ GCP Vertex AI serverless pipelines
+ MLflow integration
+ Prometheus integration with Grafana
+ Multiple nodes on EC2, autoscaling on SageMaker/EKS, AWS Inferentia support
+ New examples: MMF, NMT, DeepLabV3




BEST PRACTICES FOR PRODUCTION DEPLOYMENTS
• Deployment models: Standalone, Primary backup, Orchestration, Cloud vs. on-premises
• Optimizations: Performance vs. latency, TorchScript profiling, Offline vs. real-time, Cost
• Resilience: Robust endpoint, Auto-scaling, Canary deployments, A/B testing
• Measurement: Metrics, Model performance, Interpretability, Feedback loop
• Responsible AI: Fairness, Human-centered design
RESPONSIBLE AI
Fairness by design
• Measure skewness of data, model bias, and data bias; identify relevant metrics
• Transparency, explainable AI, inclusive design

Human-centered design
• Consider AI-driven decisions and their impact on people at the time of model design
• Provide human recourse rather than full automation; for example, avoid a mortgage-application AI rejecting people of a certain category or race
• For computer vision models, measure results by demographic; for example, include support for different skin tones and age groups
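Measuring results by demographic can start with something as simple as per-group accuracy. The sketch below is one minimal way to compute it; the group names, the records, and the helper names (`accuracy_by_group`, `accuracy_gap`) are made up for illustration, and accuracy is only one of many metrics a fairness review might use.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """records: iterable of (group, label, prediction) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, label, prediction in records:
        total[group] += 1
        correct[group] += int(label == prediction)
    return {g: correct[g] / total[g] for g in total}


def accuracy_gap(per_group):
    """Largest difference in accuracy between any two groups."""
    values = list(per_group.values())
    return max(values) - min(values)


# Illustrative records only: (group, true label, predicted label)
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 1),
    ("B", 1, 1), ("B", 0, 1), ("B", 1, 0), ("B", 0, 0),
]

per_group = accuracy_by_group(records)
print(per_group, accuracy_gap(per_group))
```

A large gap between groups is a signal to investigate the data and the model before deployment, not a verdict on its own.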
OPTIMIZATIONS
• Build with performance vs. latency goals in mind
• Reduce the size of the model: quantization, pruning, mixed precision training
• Reduce latency: TorchScript model; use the SnakeViz profiler
• Evaluate GPU vs. CPU for low latency
• Evaluate REST vs. gRPC for your prediction service
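SnakeViz visualizes profile files written by Python's built-in cProfile module, so the first step is producing such a file. The sketch below profiles a stand-in function and dumps the stats to a temporary path. `fake_inference` and `profile_to_file` are hypothetical names; in practice the profiled call would be a real model forward pass.

```python
import cProfile
import os
import pstats
import tempfile


def fake_inference(n=20000):
    """Hypothetical stand-in for a model forward pass."""
    return sum(i * i for i in range(n))


def profile_to_file(func, path):
    """Run func under cProfile and write the stats to path."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    profiler.dump_stats(path)
    return path


prof_path = os.path.join(tempfile.gettempdir(), "inference.prof")
profile_to_file(fake_inference, prof_path)

# Print the five most expensive entries by cumulative time.
stats = pstats.Stats(prof_path)
stats.sort_stats("cumulative").print_stats(5)
```

With SnakeViz installed, running `snakeviz` on the generated file opens the interactive view in a browser.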
QUANTIZATION

Model                fp32 accuracy                   int8 accuracy change   Technique                     CPU inference speed-up
ResNet50             76.1 (Top-1, ImageNet)          -0.2 (75.9)            Post-training                 2x (214 ms ➙ 102 ms), Intel Skylake-DE
MobileNetV2          71.9 (Top-1, ImageNet)          -0.3 (71.6)            Quantization-aware training   4x (75 ms ➙ 18 ms), OnePlus 5, Snapdragon 835
Translate / FairSeq  32.78 (BLEU, IWSLT 2014 de-en)  0.0 (32.78)            Dynamic (weights only)        4x for encoder, Intel Skylake-SE

These models and more are available on TorchHub: https://pytorch.org/hub/
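The rounded figures in the table can be recomputed from the raw numbers on the slide. The short check below does that for the two rows that report latencies; `summarize` is a hypothetical helper, and the inputs are copied from the table.

```python
def summarize(fp32_acc, int8_acc, before_ms, after_ms):
    """Return (accuracy change, speed-up factor), each rounded to one decimal."""
    return round(int8_acc - fp32_acc, 1), round(before_ms / after_ms, 1)


ROWS = [
    # (model, fp32 accuracy, int8 accuracy, latency before in ms, latency after in ms)
    ("ResNet50", 76.1, 75.9, 214, 102),
    ("MobileNetV2", 71.9, 71.6, 75, 18),
]

for model, fp32_acc, int8_acc, before_ms, after_ms in ROWS:
    change, speedup = summarize(fp32_acc, int8_acc, before_ms, after_ms)
    print(f"{model}: accuracy change {change}, speed-up {speedup}x")
```

The measured ratios come out slightly above the headline 2x and 4x, which the slide rounds down.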
BERT MODEL PROFILING: EAGER MODE
[Figure: profiling results]

BERT MODEL PROFILING: TORCHSCRIPT MODE
[Figure: profiling results]
4x speedup
OPTIMIZATIONS (CONTD.)
Offline vs. real-time predictions
• Offline: dynamic batching
• Online: async processing (push/poll)
• Pre-computed predictions for certain elements

Cost optimizations
• Spot instances for offline workloads
• Autoscaling based on metrics; on-demand clusters
• Evaluate supported AI accelerators, such as AWS Inferentia, for a lower cost point
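The push/poll pattern for online predictions can be sketched with a thread pool: the client pushes a request, gets a job id back immediately, and polls for the result later. This is a minimal in-process illustration, not a production design; `AsyncPredictor` and its methods are hypothetical names, and a real service would persist jobs and expose push and poll as separate endpoints.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class AsyncPredictor:
    """In-process push/poll wrapper around a prediction function."""

    def __init__(self, predict_fn, max_workers=2):
        self._predict_fn = predict_fn
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}

    def push(self, payload):
        """Queue a prediction and return a job id without waiting."""
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = self._pool.submit(self._predict_fn, payload)
        return job_id

    def poll(self, job_id):
        """Report whether the job has finished, with its result if so."""
        future = self._jobs[job_id]
        if not future.done():
            return {"status": "PENDING"}
        return {"status": "DONE", "result": future.result()}

    def wait(self, job_id, timeout=None):
        """Block until the job finishes (useful in tests)."""
        return self._jobs[job_id].result(timeout=timeout)

    def shutdown(self):
        self._pool.shutdown(wait=True)


if __name__ == "__main__":
    predictor = AsyncPredictor(lambda x: x * 2)
    job = predictor.push(21)
    predictor.wait(job, timeout=5)
    print(predictor.poll(job))
    predictor.shutdown()
```

Decoupling submission from retrieval keeps request latency low for the caller even when the model itself is slow.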
DEPLOYING MODELS IN PRODUCTION
[Diagram: deployment options by stage and environment]
Stages: Develop, Test; Staging, Experiments; Production; Large Scale Production
Environment labels: Hybrid Cloud; On-prem; Cloud; Managed
Options shown:
• Install from Source, Standalone, Docker
• MLflow, Kubeflow
• Kubernetes, Kubeflow/KFServing
• Primary/Backup, ML Microservices
• Autoscaling, Canary Rollouts
• Minikube
• Self-managed Docker
• AWS CloudFormation
• Cloud VMs / Containers
• Microservices behind API Gateway
• AWS SageMaker Endpoints, BYOC
• AWS SageMaker
• EKS/AKS/GKE
• AWS SageMaker / GCP AI Platform
• Serverless Functions
• GCP Vertex AI, AWS SageMaker Canary Rollouts
• Databricks Managed MLflow
RESILIENCE
• Create a robust endpoint for serving, for example, a SageMaker endpoint
• Auto-scaling with orchestration deployments, multi-node for EC2, and other scenarios
• Canary deployments: test a new version of a model on a small subset before making it the default
• Shadow inference: deploy a new version of the model in parallel
• A/B testing of different versions of a model
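A canary rollout needs a rule for deciding which requests reach the new model version. One common approach, sketched below under stated assumptions, hashes a stable request or user id into a bucket so the same id always takes the same route. The function name `route` and the "canary"/"stable" labels are hypothetical; managed platforms such as those on the previous slide provide their own traffic-splitting controls.

```python
import hashlib


def route(request_id, canary_percent):
    """Return "canary" for roughly canary_percent% of ids, otherwise "stable"."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the id to a stable bucket in [0, 100).
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "canary" if bucket < canary_percent else "stable"


if __name__ == "__main__":
    ids = [f"req-{i}" for i in range(10000)]
    share = sum(route(i, 10) == "canary" for i in ids) / len(ids)
    print(f"canary share: {share:.3f}")
```

Because routing depends only on the id, raising canary_percent moves more ids to the canary without reshuffling the ones already there.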
MEASUREMENT
• Define use-case-specific model performance metrics, such as accuracy, while designing the AI service
• Add custom metrics as appropriate
• Use CloudWatch or Prometheus dashboards for monitoring model performance
• Model interpretability analysis via Captum
• Deploy with a feedback loop; if model accuracy drops over time or with a new version, analyze issues like concept drift, stale data, etc.
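A feedback loop needs some way to notice that accuracy has dropped. The sketch below keeps a rolling window of labeled outcomes and flags the model for review when windowed accuracy falls below a threshold. `AccuracyMonitor`, the window size, and the threshold are illustrative assumptions; real deployments would typically export this as a custom metric to the monitoring dashboard.

```python
from collections import deque


class AccuracyMonitor:
    """Rolling-window accuracy tracker for labeled feedback."""

    def __init__(self, window=100, threshold=0.9):
        self.window = deque(maxlen=window)
        self.threshold = threshold

    def record(self, prediction, label):
        self.window.append(prediction == label)

    def accuracy(self):
        if not self.window:
            return None
        return sum(self.window) / len(self.window)

    def needs_review(self):
        """True once the window is full and accuracy is below the threshold."""
        acc = self.accuracy()
        full = len(self.window) == self.window.maxlen
        return acc is not None and full and acc < self.threshold


if __name__ == "__main__":
    monitor = AccuracyMonitor(window=4, threshold=0.75)
    for pred, label in [(1, 1), (0, 0), (1, 1), (0, 0), (1, 0), (0, 1)]:
        monitor.record(pred, label)
    print(monitor.accuracy(), monitor.needs_review())
```

Waiting for a full window avoids raising alerts on the first few samples after a deployment.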
FAIRNESS BY DESIGN
• Understand: How might the product's goals, its policy, and its implementation affect users from different subgroups? Identify contextual definitions of fairness
• Align: Stakeholder conversations to find consensus and outline measurement and mitigation plans
• Measure: Analyze model performance, label bias, outcomes, and other relevant signals
• Mitigate: Address observed issues in dataset, models, policies, etc.
• Monitor: Monitor effect of mitigations on subgroups, and ensure fairness analysis holds as product adapts
CAPTUM
Model interpretability library for PyTorch
https://captum.ai/

[Figure: attribution example. Text Contributions: 7.54; Image Contributions: 11.19; Total Contributions: 18.73]

SUPPORT FOR ATTRIBUTION ALGORITHMS TO INTERPRET:
• Output predictions with respect to inputs
• Output predictions with respect to layers
• Neurons with respect to inputs
• Currently provides gradient- and perturbation-based approaches (e.g., Integrated Gradients)
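To give a feel for what Integrated Gradients computes, here is a dependency-free sketch on a tiny hand-written model. It does not use Captum or PyTorch: the quadratic `model`, its analytic gradient `model_grad`, the weights, and the `integrated_gradients` helper are all assumptions made for illustration. The method averages gradients along a straight path from a baseline to the input and scales them by the input difference.

```python
def integrated_gradients(grad_fn, inputs, baseline, steps=50):
    """Approximate Integrated Gradients with a midpoint Riemann sum."""
    avg_grads = [0.0] * len(inputs)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight line from baseline to inputs.
        point = [b + alpha * (x - b) for x, b in zip(inputs, baseline)]
        for i, g in enumerate(grad_fn(point)):
            avg_grads[i] += g / steps
    return [(x - b) * g for x, b, g in zip(inputs, baseline, avg_grads)]


WEIGHTS = [1.0, -2.0, 0.5]  # toy model parameters


def model(x):
    """Toy model: weighted sum of squared features."""
    return sum(w * v * v for w, v in zip(WEIGHTS, x))


def model_grad(x):
    """Analytic gradient of the toy model."""
    return [2.0 * w * v for w, v in zip(WEIGHTS, x)]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(model_grad, x, baseline)
print(attributions)
print(sum(attributions), model(x) - model(baseline))
```

The last line illustrates the completeness property: the attributions sum to the difference between the model output at the input and at the baseline.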
DYNABOARD & FLORES 101 WMT COMPETITION
http://www.statmt.org/wmt21/large-scale-multilingual-translation-task.html
https://github.com/facebookresearch/dynalab
https://dynabench.org/tasks/3#overall
COMMUNITY PROJECTS
https://github.com/cceyda/torchserve-dashboard
https://github.com/Unity-Technologies/SynthDet
https://medium.com/pytorch/how-wadhwani-ai-uses-pytorch-to-empower-cotton-farmers-14397f4c9f2b
FUTURE RELEASES
+ Improved memory and resource usage for better scalability
+ C++ backend for lower latency
+ Enhanced profiling tools
REFERENCES
• TorchServe: https://github.com/pytorch/serve
• Management API: https://github.com/pytorch/serve/blob/master/docs/management_api.md
• Inference API: https://github.com/pytorch/serve/blob/master/docs/inference_api.md
• Language Translation Ensemble example: https://github.com/pytorch/serve/tree/master/examples/Workflows/nmt_tranformers_pipeline
• BERT Model example: https://github.com/pytorch/serve/tree/master/examples/Huggingface_Transformers
• Model Zoo: https://github.com/pytorch/serve/blob/master/docs/model_zoo.md
• SnakeViz visualizations: https://github.com/pytorch/serve/tree/master/benchmarks#visualize-snakeviz-results
• Logging: https://github.com/pytorch/serve/blob/master/docs/logging.md
• Metrics: https://github.com/pytorch/serve/blob/master/docs/metrics.md
• Prometheus Metrics: https://github.com/pytorch/serve/blob/master/docs/metrics_api.md
• Batch Inference: https://github.com/pytorch/serve/blob/master/docs/batch_inference_with_ts.md
• Kubeflow Pipelines: https://github.com/kubeflow/pipelines/tree/master/components/PyTorch/pytorch-kfp-components
• Kubernetes support: https://github.com/pytorch/serve/blob/master/kubernetes/README.md
• TorchServe Dashboard (Community): https://cceyda.github.io/blog/torchserve/streamlit/dashboard/2020/10/15/torchserve.html
• Custom Handler community blog: https://towardsdatascience.com/deploy-models-and-create-custom-handlers-in-torchserve-fc2d048fbe91
• Captum Interpretability for BERT models: https://github.com/pytorch/serve/blob/master/captum/Captum_visualization_for_bert.ipynb
• Operationalize, Scale and Infuse Trust in AI using KFServing: https://blog.kubeflow.org/release/official/2021/03/08/kfserving-0.5.html
QUESTIONS?
Contact:
Email: gchauhan@fb.com
LinkedIn: https://www.linkedin.com/in/geetachauhan/

More Related Content

What's hot

Batchable vs @future vs Queueable
Batchable vs @future vs QueueableBatchable vs @future vs Queueable
Batchable vs @future vs Queueable
Boris Bachovski
 
SuiteCRM Presentation
SuiteCRM PresentationSuiteCRM Presentation
SuiteCRM Presentation
FyNSiS Softlabs Private Limited
 
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
Kai Wähner
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflows
Márton Kodok
 
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...
Amazon Web Services
 
Managing the Machine Learning Lifecycle with MLflow
Managing the Machine Learning Lifecycle with MLflowManaging the Machine Learning Lifecycle with MLflow
Managing the Machine Learning Lifecycle with MLflow
Databricks
 
Einstein bots
Einstein botsEinstein bots
Einstein bots
Amit Chaudhary
 
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021
StreamNative
 
AI & Machine Learning Pipelines with Knative
AI & Machine Learning Pipelines with KnativeAI & Machine Learning Pipelines with Knative
AI & Machine Learning Pipelines with Knative
Animesh Singh
 
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Kai Wähner
 
Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...
Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...
Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...
Edureka!
 
Python Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaPython Programming Tutorial | Edureka
Python Programming Tutorial | Edureka
Edureka!
 
Python Anaconda Tutorial | Edureka
Python Anaconda Tutorial | EdurekaPython Anaconda Tutorial | Edureka
Python Anaconda Tutorial | Edureka
Edureka!
 
Einstein Bots
 Einstein Bots Einstein Bots
Einstein Bots
AIMDek Technologies
 
Feature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsFeature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systems
Andrzej Michałowski
 
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
 Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa... Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
Databricks
 
Managing Terraform Module Versioning and Dependencies
Managing Terraform Module Versioning and Dependencies Managing Terraform Module Versioning and Dependencies
Managing Terraform Module Versioning and Dependencies
Nebulaworks
 
Intro to Jupyter Notebooks
Intro to Jupyter NotebooksIntro to Jupyter Notebooks
Intro to Jupyter Notebooks
Francis Michael Bautista
 
Salesforce Einstein Analytics
Salesforce Einstein AnalyticsSalesforce Einstein Analytics
Salesforce Einstein Analytics
Harshala Shewale ☁
 
Deep Dive with Spark Streaming - Tathagata Das - Spark Meetup 2013-06-17
Deep Dive with Spark Streaming - Tathagata  Das - Spark Meetup 2013-06-17Deep Dive with Spark Streaming - Tathagata  Das - Spark Meetup 2013-06-17
Deep Dive with Spark Streaming - Tathagata Das - Spark Meetup 2013-06-17
spark-project
 

What's hot (20)

Batchable vs @future vs Queueable
Batchable vs @future vs QueueableBatchable vs @future vs Queueable
Batchable vs @future vs Queueable
 
SuiteCRM Presentation
SuiteCRM PresentationSuiteCRM Presentation
SuiteCRM Presentation
 
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflows
 
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...
 
Managing the Machine Learning Lifecycle with MLflow
Managing the Machine Learning Lifecycle with MLflowManaging the Machine Learning Lifecycle with MLflow
Managing the Machine Learning Lifecycle with MLflow
 
Einstein bots
Einstein botsEinstein bots
Einstein bots
 
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021
 
AI & Machine Learning Pipelines with Knative
AI & Machine Learning Pipelines with KnativeAI & Machine Learning Pipelines with Knative
AI & Machine Learning Pipelines with Knative
 
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
 
Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...
Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...
Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...
 
Python Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaPython Programming Tutorial | Edureka
Python Programming Tutorial | Edureka
 
Python Anaconda Tutorial | Edureka
Python Anaconda Tutorial | EdurekaPython Anaconda Tutorial | Edureka
Python Anaconda Tutorial | Edureka
 
Einstein Bots
 Einstein Bots Einstein Bots
Einstein Bots
 
Feature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsFeature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systems
 
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
 Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa... Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
 
Managing Terraform Module Versioning and Dependencies
Managing Terraform Module Versioning and Dependencies Managing Terraform Module Versioning and Dependencies
Managing Terraform Module Versioning and Dependencies
 
Intro to Jupyter Notebooks
Intro to Jupyter NotebooksIntro to Jupyter Notebooks
Intro to Jupyter Notebooks
 
Salesforce Einstein Analytics
Salesforce Einstein AnalyticsSalesforce Einstein Analytics
Salesforce Einstein Analytics
 
Deep Dive with Spark Streaming - Tathagata Das - Spark Meetup 2013-06-17
Deep Dive with Spark Streaming - Tathagata  Das - Spark Meetup 2013-06-17Deep Dive with Spark Streaming - Tathagata  Das - Spark Meetup 2013-06-17
Deep Dive with Spark Streaming - Tathagata Das - Spark Meetup 2013-06-17
 

Similar to Scaling AI in production using PyTorch

TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
Stijn Decubber
 
Scaling Up AI Research to Production with PyTorch and MLFlow
Scaling Up AI Research to Production with PyTorch and MLFlowScaling Up AI Research to Production with PyTorch and MLFlow
Scaling Up AI Research to Production with PyTorch and MLFlow
Databricks
 
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
James Anderson
 
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
Neotys_Partner
 
Reproducible AI using MLflow and PyTorch
Reproducible AI using MLflow and PyTorchReproducible AI using MLflow and PyTorch
Reproducible AI using MLflow and PyTorch
Databricks
 
Slides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdf
Slides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdfSlides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdf
Slides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdf
vitm11
 
NextGenML
NextGenML NextGenML
Overview Of Parallel Development - Ericnel
Overview Of Parallel Development -  EricnelOverview Of Parallel Development -  Ericnel
Overview Of Parallel Development - Ericnel
ukdpe
 
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Jason Dai
 
Machine learning model to production
Machine learning model to productionMachine learning model to production
Machine learning model to production
Georg Heiler
 
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowMLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
Jan Kirenz
 
DAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI Convergence
DAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI ConvergenceDAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI Convergence
DAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI Convergence
inside-BigData.com
 
Serverless machine learning architectures at Helixa
Serverless machine learning architectures at HelixaServerless machine learning architectures at Helixa
Serverless machine learning architectures at Helixa
Data Science Milan
 
Productionizing Machine Learning - Bigdata meetup 5-06-2019
Productionizing Machine Learning - Bigdata meetup 5-06-2019Productionizing Machine Learning - Bigdata meetup 5-06-2019
Productionizing Machine Learning - Bigdata meetup 5-06-2019
Iulian Pintoiu
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
DataWorks Summit
 
DevOps for DataScience
DevOps for DataScienceDevOps for DataScience
DevOps for DataScience
Stepan Pushkarev
 
Deploying Data Science Engines to Production
Deploying Data Science Engines to ProductionDeploying Data Science Engines to Production
Deploying Data Science Engines to Production
Mostafa Majidpour
 
Big Data for Testing - Heading for Post Process and Analytics
Big Data for Testing - Heading for Post Process and AnalyticsBig Data for Testing - Heading for Post Process and Analytics
Big Data for Testing - Heading for Post Process and Analytics
OPNFV
 
Peek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and RoadmapPeek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and Roadmap
Neo4j
 
Data Agility—A Journey to Advanced Analytics and Machine Learning at Scale
Data Agility—A Journey to Advanced Analytics and Machine Learning at ScaleData Agility—A Journey to Advanced Analytics and Machine Learning at Scale
Data Agility—A Journey to Advanced Analytics and Machine Learning at Scale
Databricks
 

Similar to Scaling AI in production using PyTorch (20)

TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
 
Scaling Up AI Research to Production with PyTorch and MLFlow
Scaling Up AI Research to Production with PyTorch and MLFlowScaling Up AI Research to Production with PyTorch and MLFlow
Scaling Up AI Research to Production with PyTorch and MLFlow
 
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
 
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
 
Reproducible AI using MLflow and PyTorch
Reproducible AI using MLflow and PyTorchReproducible AI using MLflow and PyTorch
Reproducible AI using MLflow and PyTorch
 
Slides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdf
Slides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdfSlides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdf
Slides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdf
 
NextGenML
NextGenML NextGenML
NextGenML
 
Overview Of Parallel Development - Ericnel
Overview Of Parallel Development -  EricnelOverview Of Parallel Development -  Ericnel
Overview Of Parallel Development - Ericnel
 
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
 
Machine learning model to production
Machine learning model to productionMachine learning model to production
Machine learning model to production
 
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowMLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
 
DAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI Convergence
DAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI ConvergenceDAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI Convergence
DAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI Convergence
 
Serverless machine learning architectures at Helixa
Serverless machine learning architectures at HelixaServerless machine learning architectures at Helixa
Serverless machine learning architectures at Helixa
 
Productionizing Machine Learning - Bigdata meetup 5-06-2019
Productionizing Machine Learning - Bigdata meetup 5-06-2019Productionizing Machine Learning - Bigdata meetup 5-06-2019
Productionizing Machine Learning - Bigdata meetup 5-06-2019
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
DevOps for DataScience
DevOps for DataScienceDevOps for DataScience
DevOps for DataScience
 
Deploying Data Science Engines to Production
Deploying Data Science Engines to ProductionDeploying Data Science Engines to Production
Deploying Data Science Engines to Production
 
Big Data for Testing - Heading for Post Process and Analytics
Big Data for Testing - Heading for Post Process and AnalyticsBig Data for Testing - Heading for Post Process and Analytics
Big Data for Testing - Heading for Post Process and Analytics
 
Peek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and RoadmapPeek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and Roadmap
 
Data Agility—A Journey to Advanced Analytics and Machine Learning at Scale
Data Agility—A Journey to Advanced Analytics and Machine Learning at ScaleData Agility—A Journey to Advanced Analytics and Machine Learning at Scale
Data Agility—A Journey to Advanced Analytics and Machine Learning at Scale
 





Scaling AI in production using PyTorch

  • 1. 17 June 2021. Scaling AI in Production Using PyTorch. Geeta Chauhan, PyTorch Partner Engineering, Facebook AI, @chauhang
  • 2. MLOps World 2021. Agenda: 01 Challenges with ML in production; 02 TorchServe overview; 03 Best practices for production deployment
  • 3. PyTorch community growth. Source: https://paperswithcode.com/trends
  • 4. Challenges with ML in deployment: performance, ease of use, cost efficiency, deployment at scale. [Diagram: cloud / on-prem stack with application logic, preprocessing, and postprocessing stages]
  • 5. Inference at scale. Deploying and managing models in production is difficult. Some of the pain points include: • Loading and managing multiple models on multiple servers or end devices • Running pre-processing and post-processing code on prediction requests • How to log, monitor, and secure predictions • What happens when you hit scale?
  • 6. TorchServe: easily deploy PyTorch models in production at scale. • Default handlers for common tasks • Low-latency model serving • Works with any ML environment
  • 7. TorchServe features: • Default handlers for common use cases (e.g., image segmentation, text classification), along with custom handler support for other use cases and a Model Zoo • Multi-model serving, model versioning, and the ability to roll back to an earlier version • Automatic batching of individual inferences across HTTP requests • Logging, including common metrics and the ability to incorporate custom metrics • Robust HTTP APIs: Management and Inference. [Architecture diagram: model files (model1.pth) are packaged by torch-model-archiver into .mar archives (model1.mar to model5.mar) under <path>/model_store; torchserve --start launches TorchServe, which serves the models over HTTP through the Inference API (http://localhost:8080/ …), the Management API (http://localhost:8081/ …), and the Metrics API, with logging and metrics]
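The two HTTP endpoints in the diagram can be reached from any HTTP client. The sketch below only builds the request objects and does not send them. It assumes TorchServe's default addresses (port 8080 for inference, port 8081 for management); the model name, archive name, and helper function names are illustrative and not part of the talk.

```python
import urllib.parse
import urllib.request

INFERENCE_URL = "http://localhost:8080"   # assumed default inference address
MANAGEMENT_URL = "http://localhost:8081"  # assumed default management address


def prediction_request(model_name, payload):
    """Build a POST /predictions/{model_name} request for the Inference API."""
    url = f"{INFERENCE_URL}/predictions/{urllib.parse.quote(model_name)}"
    return urllib.request.Request(url, data=payload, method="POST")


def register_request(mar_url, initial_workers=1):
    """Build a POST /models request that registers an archive via the Management API."""
    query = urllib.parse.urlencode({"url": mar_url, "initial_workers": initial_workers})
    return urllib.request.Request(f"{MANAGEMENT_URL}/models?{query}", method="POST")


# Hypothetical model and archive names, used only for illustration.
predict = prediction_request("resnet-152", b"<image bytes>")
register = register_request("resnet-152.mar", initial_workers=2)
print(predict.get_method(), predict.full_url)
print(register.get_method(), register.full_url)
```

Once a server is running, either request could be sent with `urllib.request.urlopen(...)`.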
  • 8. TorchServe detail: model handlers. TorchServe has default model handlers that perform boilerplate data transforms for common cases: • Image Classification • Image Segmentation • Object Detection • Text Classification. You can also create custom model handlers for any model and inference task:

        import torch

        class MyModelHandler(object):
            def __init__(self):
                self.initialized = False

            def initialize(self, context):
                # get GPU status & device handle
                # load model & supporting files (vocabularies etc.)
                self.initialized = True

            def preprocess(self, data):
                # put incoming data into tensor
                # transform as needed for your model
                return data

            def inference(self, data):
                # do predictions
                return data

            def postprocess(self, output):
                # process inference output, e.g. extracting top K
                # package output for web delivery
                return output

        # module-level service instance used by the entry point below
        _service = MyModelHandler()

        def handle(data, context):
            if not _service.initialized:
                _service.initialize(context)
            if data is None:
                return None
            data = _service.preprocess(data)
            data = _service.inference(data)
            data = _service.postprocess(data)
            return data
  • 9. Model archive. torch-model-archiver is a CLI tool for packaging all model artifacts into a single deployment unit: • Model checkpoints or model definition file with state_dict • TorchScript and eager mode support • Extra files like vocab, config, index_to_name mapping

        torch-model-archiver \
            --model-name BERTSeqClassification_Torchscript \
            --version 1.0 \
            --serialized-file Transformer_model/traced_model.pt \
            --handler ./Transformer_handler_generalized.py \
            --extra-files "./setup_config.json,./Seq_classification_artifacts/index_to_name.json"

    setup_config.json:

        {
          "model_name": "bert-base-uncased",
          "mode": "sequence_classification",
          "do_lower_case": "True",
          "num_labels": "2",
          "save_mode": "torchscript",
          "max_length": "150"
        }

        torchserve --start \
            --model-store model_store \
            --models <path-to model-file/s3-url/azure-blob-url>

    https://github.com/pytorch/serve/tree/master/model-archiver#creating-a-model-archive
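When packaging is scripted, for example in a CI job, the same archiver invocation can be assembled programmatically. This is a minimal sketch that builds the argument list and the config file contents without running anything. The values are copied from the slide, while the `model_store` export path and the `archiver_command` helper are assumptions made for illustration.

```python
import json
import shlex

# Config values copied from the slide.
setup_config = {
    "model_name": "bert-base-uncased",
    "mode": "sequence_classification",
    "do_lower_case": "True",
    "num_labels": "2",
    "save_mode": "torchscript",
    "max_length": "150",
}


def archiver_command(model_name, version, serialized_file, handler, extra_files, export_path):
    """Return a torch-model-archiver argument list suitable for subprocess.run."""
    return [
        "torch-model-archiver",
        "--model-name", model_name,
        "--version", version,
        "--serialized-file", serialized_file,
        "--handler", handler,
        "--extra-files", ",".join(extra_files),
        "--export-path", export_path,
    ]


config_text = json.dumps(setup_config, indent=2)
cmd = archiver_command(
    model_name="BERTSeqClassification_Torchscript",
    version="1.0",
    serialized_file="Transformer_model/traced_model.pt",
    handler="./Transformer_handler_generalized.py",
    extra_files=["./setup_config.json", "./Seq_classification_artifacts/index_to_name.json"],
    export_path="model_store",  # assumed output directory, not stated on the slide
)
print(config_text)
print(shlex.join(cmd))
```

Passing the list form to `subprocess.run(cmd, check=True)` avoids shell quoting issues with the comma-separated `--extra-files` value.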
  • 10. Dynamic batching via custom handlers • Model configuration based: batch_size is the maximum batch size; max_batch_delay is the maximum time TorchServe waits to receive batch_size requests • (Coming soon) Batching support in default handlers

        curl localhost:8081/models/resnet-152

        {
          "modelName": "resnet-152",
          "modelUrl": "https://s3.amazonaws.com/model-server/model_archive_1.0/examples/resnet-152-batching/resnet-152.ma
          "runtime": "python",
          "minWorkers": 1,
          "maxWorkers": 1,
          "batchSize": 8,
          "maxBatchDelay": 10,
          "workers": [
            {
              "id": "9008",
              "startTime": "2019-02-19T23:56:33.907Z",
              "status": "READY",
              "gpu": false,
              "memoryUsage": 607715328
            }
          ]
        }

    https://github.com/pytorch/serve/blob/master/docs/batch_inference_with_ts.md
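With batching enabled, a custom handler has to accept several requests at once and return one result per request. The following is a minimal sketch of that contract, assuming requests arrive as a list of dicts with the payload under a "data" or "body" key, as in the batch inference document linked above. The payload-length "model" is a stand-in, not a real model.

```python
def handle(data, context=None):
    """Batch-aware entry point: returns exactly one output per incoming request."""
    if data is None:
        return None

    # Each request in the batch is a dict; the payload is under "data" or "body".
    inputs = [request.get("data") or request.get("body") for request in data]

    # Stand-in for a single batched forward pass: report each payload's length.
    outputs = [{"length": len(item)} for item in inputs]

    # The response list must line up one-to-one with the request list.
    assert len(outputs) == len(data)
    return outputs


batch = [{"data": b"abc"}, {"body": b"hello"}]
results = handle(batch)
print(results)
```

In a real handler, the list comprehension would be replaced by stacking the inputs into one tensor and running the model once.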
  • 11. Metrics. Out-of-the-box metrics with the ability to extend • CPU, disk, memory utilization • Request type counts • ts.metrics class for extension • Types supported: size, percentage, counter, general metric • Prometheus metrics support available

        # Access context metrics as follows
        metrics = context.metrics

        # Create Dimension Object
        from ts.metrics.dimension import Dimension

        # Dimensions are name value pairs
        dim1 = Dimension(name, value)
        # ...
        dimN = Dimension(name_n, value_n)

        # Add Distance as a metric
        # dimensions = [dim1, dim2, dim3, ..., dimN]
        metrics.add_metric('DistanceInKM', distance, 'km', dimensions=dimensions)

        # Add Image size as a size metric
        metrics.add_size('SizeOfImage', img_size, None, 'MB', dimensions)

        # Add MemoryUtilization as a percentage metric
        metrics.add_percent('MemoryUtilization', utilization_percent, None, dimensions)

        # Create a counter with name 'LoopCount' and dimensions
        metrics.add_counter('LoopCount', 1, None, dimensions)

        # Log custom metrics
        for metric in metrics.store:
            logger.info("[METRICS]%s", str(metric))

    https://github.com/pytorch/serve/blob/master/docs/metrics.md
  • 12. Recent features: + Ensemble model support, Captum model interpretability + Kubeflow Pipelines / KFServing integration with auto-scaling and canary rollout on any cloud or on-prem + GCP Vertex AI serverless pipelines + MLflow integration + Prometheus integration with Grafana + Multiple nodes on EC2, autoscaling on SageMaker/EKS, AWS Inferentia support + New examples: MMF, NMT, DeepLabV3
  • 13. Best practices for production deployments:
      Deployment models: standalone, primary/backup, orchestration, cloud vs. on-premises
      Optimizations: performance vs. latency, TorchScript profiling, offline vs. real-time, cost
      Resilience: robust endpoint, auto-scaling, canary deployments, A/B testing
      Measurement: metrics, model performance, interpretability, feedback loop
      Responsible AI: fairness, human-centered design
  • 14. Responsible AI. Fairness by design • Measure skewness of data, model bias, data bias; identify relevant metrics • Transparency, explainable AI, inclusive design. Human-centered design • Consider AI-driven decisions and their impact on people at the time of model design • Provide the ability to have human recourse vs. full automation; for example, avoid a mortgage-application AI rejecting people of a certain category or race • For computer vision models, measure results based on demographics; for example, include support for different skin tones and age groups
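Measuring results by demographic group can start with something as simple as comparing outcome rates across subgroups. The sketch below computes the positive-prediction rate per group and the gap between the highest and lowest rate (often called the demographic parity difference). That metric is one illustrative choice and is not named in the talk; the group labels and records are synthetic.

```python
from collections import defaultdict


def rate_by_group(records):
    """Return the positive-prediction rate for each subgroup."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, predicted in records:
        totals[group] += 1
        positives[group] += int(predicted)
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(rates):
    """Difference between the highest and lowest subgroup rate."""
    return max(rates.values()) - min(rates.values())


# Synthetic (group, prediction) pairs; 1 means approved, 0 means rejected.
records = [
    ("A", 1), ("A", 1), ("A", 0), ("A", 1),
    ("B", 1), ("B", 0), ("B", 0), ("B", 0),
]
rates = rate_by_group(records)
gap = parity_gap(rates)
print(rates, gap)
```

A large gap is a signal to investigate, not a verdict; which metric is appropriate depends on the use case.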
  • 15. Optimizations • Build with performance vs. latency goals in mind • Reduce the size of the model: quantization, pruning, mixed-precision training • Reduce latency: TorchScript model; use SnakeViz profiler • Evaluate GPU vs. CPU for low latency • Evaluate REST vs. gRPC for your prediction service
  • 16. Quantization

      Model                fp32 accuracy                    int8 accuracy change   Technique                     CPU inference speed-up
      ResNet50             76.1 (Top-1, ImageNet)           -0.2 (75.9)            Post-training                 2x (214 ms ➙ 102 ms), Intel Skylake-DE
      MobileNetV2          71.9 (Top-1, ImageNet)           -0.3 (71.6)            Quantization-aware training   4x (75 ms ➙ 18 ms), OnePlus 5, Snapdragon 835
      Translate / FairSeq  32.78 (BLEU, IWSLT 2014 de-en)   0.0 (32.78)            Dynamic (weights only)        4x for encoder, Intel Skylake-SE

    These models and more are available on TorchHub: https://pytorch.org/hub/
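To make the fp32 to int8 trade-off concrete, here is a pure-Python sketch of affine (scale and zero-point) quantization, the general arithmetic behind int8 schemes. It is not PyTorch's implementation, and the function names and sample weights are illustrative.

```python
def quantize_params(values, qmin=-128, qmax=127):
    """Pick a scale and zero point so the value range (including 0.0) maps onto int8."""
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 if all values are zero
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to integers, clamping to the int8 range."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(quantized, scale, zero_point):
    """Map integers back to approximate floats."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, -0.3, 0.0, 0.5, 1.5]  # illustrative fp32 weights
scale, zero_point = quantize_params(weights)
q_weights = quantize(weights, scale, zero_point)
restored = dequantize(q_weights, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q_weights, max_error)
```

The round-trip error per value is bounded by about half the scale, which is why accuracy changes in the table are small when weight ranges are well behaved.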
  • 17. BERT model profiling: eager mode
  • 18. BERT model profiling: TorchScript mode, 4x speedup
  • 19. Optimizations (contd.). Offline vs. real-time predictions • Offline: dynamic batching • Online: async processing (push/poll) • Pre-computed predictions for certain elements. Cost optimizations • Spot Instances for offline • Autoscaling based on metrics, on-demand cluster • Evaluate supported AI accelerators, such as AWS Inferentia, for a lower cost point
  • 20. Deploying models in production. [Diagram: matrix of deployment options by stage and environment. Stages: develop/test; staging/experiments; production; large-scale production. Environments: on-prem; hybrid cloud; self-managed cloud; cloud managed. Options shown: install from source; standalone Docker; Minikube; MLflow, Kubeflow; Kubernetes, Kubeflow/KFServing; primary/backup, ML microservices; autoscaling, canary rollouts; AWS CloudFormation; cloud VMs/containers; microservices behind API gateway; AWS SageMaker endpoints, BYOC; EKS/AKS/GKE; AWS SageMaker / GCP AI Platform; serverless functions; GCP Vertex AI, AWS SageMaker canary rollouts; Databricks managed MLflow]
  • 21. Resilience • Create a robust endpoint for serving, for example a SageMaker endpoint • Auto-scaling with orchestration deployments, multi-node for EC2, and other scenarios • Canary deployments: test a new version of a model on a small subset of requests before making it the default • Shadow inference: deploy a new version of the model in parallel • A/B testing of different versions of a model
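A canary rollout needs some rule for deciding which requests reach the new model version. One common approach, sketched here under the assumption that each request carries a stable identifier, is to hash that identifier into a bucket so routing is deterministic and the canary share is adjustable. The function name and request ids are hypothetical; managed platforms typically provide this routing for you.

```python
import hashlib


def route(request_id, canary_percent):
    """Deterministically send roughly canary_percent% of requests to the canary."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket in [0, 100)
    return "canary" if bucket < canary_percent else "stable"


# Simulate traffic with synthetic request ids and a 10% canary share.
counts = {"stable": 0, "canary": 0}
for i in range(10_000):
    counts[route(f"req-{i}", canary_percent=10)] += 1
print(counts)
```

Because the bucket depends only on the identifier, a given caller keeps hitting the same version while the percentage is unchanged, which makes comparisons between versions easier to interpret.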
  • 22. Measurement • Define use-case-specific model performance metrics, such as accuracy, while designing the AI service • Add custom metrics as appropriate • Use CloudWatch or Prometheus dashboards for monitoring model performance • Model interpretability analysis via Captum • Deploy with a feedback loop: if model accuracy drops over time or with a new version, analyze issues like concept drift, stale data, etc.
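The feedback loop can be as simple as tracking accuracy over a sliding window of labelled outcomes and flagging when it falls below a target. This is a minimal sketch with a hypothetical `AccuracyMonitor` class, window size, and threshold; in practice the flag would feed a dashboard or alert rather than a print statement.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks accuracy over the most recent `window` labelled predictions."""

    def __init__(self, window, threshold):
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def record(self, prediction, label):
        # Store whether the prediction matched the ground-truth label.
        self.outcomes.append(prediction == label)

    @property
    def accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_review(self):
        # Only flag once the window is full, to avoid noisy early alerts.
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy < self.threshold


monitor = AccuracyMonitor(window=4, threshold=0.75)
for prediction, label in [("cat", "cat"), ("dog", "dog"), ("cat", "cat"), ("dog", "dog")]:
    monitor.record(prediction, label)
healthy = monitor.needs_review()

# Two wrong predictions push the window accuracy down to 0.5.
monitor.record("cat", "dog")
monitor.record("dog", "cat")
degraded = monitor.needs_review()
print(healthy, degraded, monitor.accuracy)
```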
  • 23. Fairness by design
      Understand: How might the product's goals, its policy, and its implementation affect users from different subgroups? Identify contextual definitions of fairness.
      Align: Stakeholder conversations to find consensus and outline measurement and mitigation plans.
      Measure: Analyze model performance, label bias, outcomes, and other relevant signals.
      Mitigate: Address observed issues in dataset, models, policies, etc.
      Monitor: Monitor the effect of mitigations on subgroups, and ensure the fairness analysis holds as the product adapts.
  • 24. Captum: model interpretability library for PyTorch. Support for attribution algorithms to interpret: • Output predictions with respect to inputs • Output predictions with respect to layers • Neurons with respect to inputs • Currently provides gradient- and perturbation-based approaches (e.g., Integrated Gradients). [Figure: example attribution visualization. Text contributions: 7.54; image contributions: 11.19; total contributions: 18.73] https://captum.ai/
  • 25. Dynaboard & FLORES 101 WMT competition. http://www.statmt.org/wmt21/large-scale-multilingual-translation-task.html https://github.com/facebookresearch/dynalab https://dynabench.org/tasks/3#overall
  • 26. Community projects. https://github.com/cceyda/torchserve-dashboard https://github.com/Unity-Technologies/SynthDet https://medium.com/pytorch/how-wadhwani-ai-uses-pytorch-to-empower-cotton-farmers-14397f4c9f2b
  • 27. Future releases: + Improved memory and resource usage for better scalability + C++ backend for lower latency + Enhanced profiling tools
  • 28. References
      • TorchServe: https://github.com/pytorch/serve
      • Management API: https://github.com/pytorch/serve/blob/master/docs/management_api.md
      • Inference API: https://github.com/pytorch/serve/blob/master/docs/inference_api.md
      • Language Translation Ensemble example: https://github.com/pytorch/serve/tree/master/examples/Workflows/nmt_tranformers_pipeline
      • BERT Model example: https://github.com/pytorch/serve/tree/master/examples/Huggingface_Transformers
      • Model Zoo: https://github.com/pytorch/serve/blob/master/docs/model_zoo.md
      • SnakeViz visualizations: https://github.com/pytorch/serve/tree/master/benchmarks#visualize-snakeviz-results
      • Logging: https://github.com/pytorch/serve/blob/master/docs/logging.md
      • Metrics: https://github.com/pytorch/serve/blob/master/docs/metrics.md
      • Prometheus Metrics: https://github.com/pytorch/serve/blob/master/docs/metrics_api.md
      • Batch Inference: https://github.com/pytorch/serve/blob/master/docs/batch_inference_with_ts.md
      • Kubeflow Pipelines: https://github.com/kubeflow/pipelines/tree/master/components/PyTorch/pytorch-kfp-components
      • Kubernetes support: https://github.com/pytorch/serve/blob/master/kubernetes/README.md
      • TorchServe Dashboard (Community): https://cceyda.github.io/blog/torchserve/streamlit/dashboard/2020/10/15/torchserve.html
      • Custom Handler community blog: https://towardsdatascience.com/deploy-models-and-create-custom-handlers-in-torchserve-fc2d048fbe91
      • Captum Interpretability for BERT models: https://github.com/pytorch/serve/blob/master/captum/Captum_visualization_for_bert.ipynb
      • Operationalize, Scale and Infuse Trust in AI using KFServing: https://blog.kubeflow.org/release/official/2021/03/08/kfserving-0.5.html