SlideShare a Scribd company logo
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Hagay Lupesko
01.25.2018
Model Serving for Deep Learning
Amazon AI
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Brief Intro to Deep Learning
AI
Machine
Learning
Deep
Learning
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Brief Intro to Deep Learning – Neural Networks
Output
Layer
Input
Layer
Hidden
Layers
Many
More…
• Non linear
• Hierarchical
feature learning
• Scalable
architecture
• Computationally
intensive
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Deep Learning is a Big Deal
It has a growing impact on our lives
Personalization Logistics Voice Autonomous
Vehicles
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Deep Learning is a Big Deal
It’s able to do better than other ML and Humans
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Model
Model Server
Mobile
Desktop
IoT
Internet
So what does a deployed model looks like?
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Performance
Availability
Networking
Monitoring
Model Decoupling
Cross Framework
Cross Platform
The Undifferentiated
Heavy Lifting of
Model Serving
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Tensor Flow
Serving
Model Server
for MXNet
UC Berkeley
Clipper
Model Serving Systems for Deep Learning
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
It’s Demo Time!
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Model Archive
REST and
OpenAPI
Containerized
ONNX Support Operational Metrics
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Trained
Network
Model
Signature
Custom
Code
Auxiliary
Assets
Model Archive
Model Export CLI
Model Archive
Back
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
REST and OpenAPI
REST-like endpoint: <model-name>/predict
Endpoint auto-generated from the model’s signature.json
JSON encoding by default
Binary input via request payload
OpenAPI support – client code-gen and tooling
Back
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
MMS
Dockerfile
Build
Push
Launch
Containerization
Container Cluster
MMS Container
MMS ContainerMMS Container
MXNet NGINX
MXNet Model Server
Lightweight virtualization, isolation, runs anywhere
Back
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
• Requests
• Latencies
• Resources
Metrics
• Model Name
• Host Name
Dimensions
• Log / CSV
• AWS CloudWatch
Target
Operational Metrics
Back
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
O(n2)
Pairs
MXNet
Caffe2
PyTorch
TF
CNTKCoreML
TensorRT
NGraph
SNPEMany Frameworks
ONNX Support
Many Platforms
ONNX: Common IR
Supported in MMS v0.2
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Performance
• Batching
• Caching
• JIT Compilation
• Custom code
• Quantization Platform
• New players
• ONNX
• Plugins
Adoption
• Ease of use
• Internal
Amazon dev
tools
• Industry
partners
Challenges Ahead
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Open source – try it out and file issues
github.com/awslabs/mxnet-model-server
mxnet-sdk-team@amazon.com

More Related Content

What's hot

AWS 마켓플레이스 기반 API 비즈니스 성장 경험 공유 (김건오 대표, 트윈워드) :: AWS TechShift 2018
AWS 마켓플레이스 기반 API 비즈니스 성장 경험 공유 (김건오 대표, 트윈워드) :: AWS TechShift 2018AWS 마켓플레이스 기반 API 비즈니스 성장 경험 공유 (김건오 대표, 트윈워드) :: AWS TechShift 2018
AWS 마켓플레이스 기반 API 비즈니스 성장 경험 공유 (김건오 대표, 트윈워드) :: AWS TechShift 2018
Amazon Web Services Korea
 
How Websites go Serverless - WebSummit Lisbon 2018
How Websites go Serverless - WebSummit Lisbon 2018How Websites go Serverless - WebSummit Lisbon 2018
How Websites go Serverless - WebSummit Lisbon 2018
Boaz Ziniman
 
Machine Learning State of the Union - MCL210 - re:Invent 2017
Machine Learning State of the Union - MCL210 - re:Invent 2017Machine Learning State of the Union - MCL210 - re:Invent 2017
Machine Learning State of the Union - MCL210 - re:Invent 2017
Amazon Web Services
 
TLC304-At the Cutting Edge AWS IOT and Greengrass for Multi-Access Edge Compu...
TLC304-At the Cutting Edge AWS IOT and Greengrass for Multi-Access Edge Compu...TLC304-At the Cutting Edge AWS IOT and Greengrass for Multi-Access Edge Compu...
TLC304-At the Cutting Edge AWS IOT and Greengrass for Multi-Access Edge Compu...
Amazon Web Services
 
Scaling Convolutional Neural Networks with Kubernetes and TensorFlow on AWS -...
Scaling Convolutional Neural Networks with Kubernetes and TensorFlow on AWS -...Scaling Convolutional Neural Networks with Kubernetes and TensorFlow on AWS -...
Scaling Convolutional Neural Networks with Kubernetes and TensorFlow on AWS -...
Amazon Web Services
 
Amazon's Innovation with Machine Learning
Amazon's Innovation with Machine LearningAmazon's Innovation with Machine Learning
Amazon's Innovation with Machine Learning
Amazon Web Services Japan
 
Cloud Computing Tutorial For Beginners | What is Cloud Computing | AWS Traini...
Cloud Computing Tutorial For Beginners | What is Cloud Computing | AWS Traini...Cloud Computing Tutorial For Beginners | What is Cloud Computing | AWS Traini...
Cloud Computing Tutorial For Beginners | What is Cloud Computing | AWS Traini...
Edureka!
 
BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...
BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...
BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...
Amazon Web Services
 
TVB 透過創新快速接觸三百萬用戶
TVB 透過創新快速接觸三百萬用戶 TVB 透過創新快速接觸三百萬用戶
TVB 透過創新快速接觸三百萬用戶
Amazon Web Services
 
ATC302_How to Leverage AWS Machine Learning Services to Analyze and Optimize ...
ATC302_How to Leverage AWS Machine Learning Services to Analyze and Optimize ...ATC302_How to Leverage AWS Machine Learning Services to Analyze and Optimize ...
ATC302_How to Leverage AWS Machine Learning Services to Analyze and Optimize ...
Amazon Web Services
 
Architecting a Real-World Microservices Architecture and DevOps Strategy on A...
Architecting a Real-World Microservices Architecture and DevOps Strategy on A...Architecting a Real-World Microservices Architecture and DevOps Strategy on A...
Architecting a Real-World Microservices Architecture and DevOps Strategy on A...
Amazon Web Services
 
Building Mobile Apps with AWS Amplify
Building Mobile Apps with AWS AmplifyBuilding Mobile Apps with AWS Amplify
Building Mobile Apps with AWS Amplify
Amazon Web Services
 
GPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital MarketsGPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital Markets
Amazon Web Services
 
Introducing AWS Cloud9 - AWS Online Tech Talks
Introducing AWS Cloud9 - AWS Online Tech TalksIntroducing AWS Cloud9 - AWS Online Tech Talks
Introducing AWS Cloud9 - AWS Online Tech Talks
Amazon Web Services
 
MBL201_Progressive Web Apps in the Real World
MBL201_Progressive Web Apps in the Real WorldMBL201_Progressive Web Apps in the Real World
MBL201_Progressive Web Apps in the Real World
Amazon Web Services
 
AWS SysOps Administrator Training | AWS SysOps Tutorial | Edureka
AWS SysOps Administrator Training | AWS SysOps Tutorial | EdurekaAWS SysOps Administrator Training | AWS SysOps Tutorial | Edureka
AWS SysOps Administrator Training | AWS SysOps Tutorial | Edureka
Edureka!
 
IOT328_Building an AWS IoT-Enabled Drink Dispenser
IOT328_Building an AWS IoT-Enabled Drink DispenserIOT328_Building an AWS IoT-Enabled Drink Dispenser
IOT328_Building an AWS IoT-Enabled Drink Dispenser
Amazon Web Services
 
Amazon Time Sync Service now makes it easier to generate and compare timestamps
Amazon Time Sync Service now makes it easier to generate and compare timestampsAmazon Time Sync Service now makes it easier to generate and compare timestamps
Amazon Time Sync Service now makes it easier to generate and compare timestamps
Dhaval Soni
 
Keynote
KeynoteKeynote
AWS Initiate Day Manchester 2019 – AWS Cloud Foundations
AWS Initiate Day Manchester 2019 – AWS Cloud FoundationsAWS Initiate Day Manchester 2019 – AWS Cloud Foundations
AWS Initiate Day Manchester 2019 – AWS Cloud Foundations
Amazon Web Services
 

What's hot (20)

AWS 마켓플레이스 기반 API 비즈니스 성장 경험 공유 (김건오 대표, 트윈워드) :: AWS TechShift 2018
AWS 마켓플레이스 기반 API 비즈니스 성장 경험 공유 (김건오 대표, 트윈워드) :: AWS TechShift 2018AWS 마켓플레이스 기반 API 비즈니스 성장 경험 공유 (김건오 대표, 트윈워드) :: AWS TechShift 2018
AWS 마켓플레이스 기반 API 비즈니스 성장 경험 공유 (김건오 대표, 트윈워드) :: AWS TechShift 2018
 
How Websites go Serverless - WebSummit Lisbon 2018
How Websites go Serverless - WebSummit Lisbon 2018How Websites go Serverless - WebSummit Lisbon 2018
How Websites go Serverless - WebSummit Lisbon 2018
 
Machine Learning State of the Union - MCL210 - re:Invent 2017
Machine Learning State of the Union - MCL210 - re:Invent 2017Machine Learning State of the Union - MCL210 - re:Invent 2017
Machine Learning State of the Union - MCL210 - re:Invent 2017
 
TLC304-At the Cutting Edge AWS IOT and Greengrass for Multi-Access Edge Compu...
TLC304-At the Cutting Edge AWS IOT and Greengrass for Multi-Access Edge Compu...TLC304-At the Cutting Edge AWS IOT and Greengrass for Multi-Access Edge Compu...
TLC304-At the Cutting Edge AWS IOT and Greengrass for Multi-Access Edge Compu...
 
Scaling Convolutional Neural Networks with Kubernetes and TensorFlow on AWS -...
Scaling Convolutional Neural Networks with Kubernetes and TensorFlow on AWS -...Scaling Convolutional Neural Networks with Kubernetes and TensorFlow on AWS -...
Scaling Convolutional Neural Networks with Kubernetes and TensorFlow on AWS -...
 
Amazon's Innovation with Machine Learning
Amazon's Innovation with Machine LearningAmazon's Innovation with Machine Learning
Amazon's Innovation with Machine Learning
 
Cloud Computing Tutorial For Beginners | What is Cloud Computing | AWS Traini...
Cloud Computing Tutorial For Beginners | What is Cloud Computing | AWS Traini...Cloud Computing Tutorial For Beginners | What is Cloud Computing | AWS Traini...
Cloud Computing Tutorial For Beginners | What is Cloud Computing | AWS Traini...
 
BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...
BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...
BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...
 
TVB 透過創新快速接觸三百萬用戶
TVB 透過創新快速接觸三百萬用戶 TVB 透過創新快速接觸三百萬用戶
TVB 透過創新快速接觸三百萬用戶
 
ATC302_How to Leverage AWS Machine Learning Services to Analyze and Optimize ...
ATC302_How to Leverage AWS Machine Learning Services to Analyze and Optimize ...ATC302_How to Leverage AWS Machine Learning Services to Analyze and Optimize ...
ATC302_How to Leverage AWS Machine Learning Services to Analyze and Optimize ...
 
Architecting a Real-World Microservices Architecture and DevOps Strategy on A...
Architecting a Real-World Microservices Architecture and DevOps Strategy on A...Architecting a Real-World Microservices Architecture and DevOps Strategy on A...
Architecting a Real-World Microservices Architecture and DevOps Strategy on A...
 
Building Mobile Apps with AWS Amplify
Building Mobile Apps with AWS AmplifyBuilding Mobile Apps with AWS Amplify
Building Mobile Apps with AWS Amplify
 
GPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital MarketsGPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital Markets
 
Introducing AWS Cloud9 - AWS Online Tech Talks
Introducing AWS Cloud9 - AWS Online Tech TalksIntroducing AWS Cloud9 - AWS Online Tech Talks
Introducing AWS Cloud9 - AWS Online Tech Talks
 
MBL201_Progressive Web Apps in the Real World
MBL201_Progressive Web Apps in the Real WorldMBL201_Progressive Web Apps in the Real World
MBL201_Progressive Web Apps in the Real World
 
AWS SysOps Administrator Training | AWS SysOps Tutorial | Edureka
AWS SysOps Administrator Training | AWS SysOps Tutorial | EdurekaAWS SysOps Administrator Training | AWS SysOps Tutorial | Edureka
AWS SysOps Administrator Training | AWS SysOps Tutorial | Edureka
 
IOT328_Building an AWS IoT-Enabled Drink Dispenser
IOT328_Building an AWS IoT-Enabled Drink DispenserIOT328_Building an AWS IoT-Enabled Drink Dispenser
IOT328_Building an AWS IoT-Enabled Drink Dispenser
 
Amazon Time Sync Service now makes it easier to generate and compare timestamps
Amazon Time Sync Service now makes it easier to generate and compare timestampsAmazon Time Sync Service now makes it easier to generate and compare timestamps
Amazon Time Sync Service now makes it easier to generate and compare timestamps
 
Keynote
KeynoteKeynote
Keynote
 
AWS Initiate Day Manchester 2019 – AWS Cloud Foundations
AWS Initiate Day Manchester 2019 – AWS Cloud FoundationsAWS Initiate Day Manchester 2019 – AWS Cloud Foundations
AWS Initiate Day Manchester 2019 – AWS Cloud Foundations
 

Similar to Deep learning systems model serving

Model Serving for Deep Learning with MXNet Model Server
Model Serving for Deep Learning with MXNet Model ServerModel Serving for Deep Learning with MXNet Model Server
Model Serving for Deep Learning with MXNet Model Server
Amazon Web Services
 
Model Serving for Deep Learning
Model Serving for Deep LearningModel Serving for Deep Learning
Model Serving for Deep Learning
Adrian Hornsby
 
Artificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to StartArtificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to Start
Vladimir Simek
 
Building Serverless Microservices with AWS
Building Serverless Microservices with AWSBuilding Serverless Microservices with AWS
Building Serverless Microservices with AWS
Donnie Prakoso
 
Learn how to build serverless applications using the AWS Serverless Platform-...
Learn how to build serverless applications using the AWS Serverless Platform-...Learn how to build serverless applications using the AWS Serverless Platform-...
Learn how to build serverless applications using the AWS Serverless Platform-...
Amazon Web Services
 
Artificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to StartArtificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to Start
Vladimir Simek
 
Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...
Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...
Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...
Amazon Web Services
 
DVC303-Technological Accelerants for Organizational Transformation
DVC303-Technological Accelerants for Organizational TransformationDVC303-Technological Accelerants for Organizational Transformation
DVC303-Technological Accelerants for Organizational Transformation
Amazon Web Services
 
Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...
Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...
Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...
Amazon Web Services
 
AWS X-Ray: Debugging Applications at Scale - AWS Online Tech Talks
AWS X-Ray: Debugging Applications at Scale - AWS Online Tech TalksAWS X-Ray: Debugging Applications at Scale - AWS Online Tech Talks
AWS X-Ray: Debugging Applications at Scale - AWS Online Tech Talks
Amazon Web Services
 
Building secure and scalable mobile applications on AWS - AWS Summit Cape Tow...
Building secure and scalable mobile applications on AWS - AWS Summit Cape Tow...Building secure and scalable mobile applications on AWS - AWS Summit Cape Tow...
Building secure and scalable mobile applications on AWS - AWS Summit Cape Tow...
Amazon Web Services
 
Innovations fueled by IoT and the Cloud
Innovations fueled by IoT and the CloudInnovations fueled by IoT and the Cloud
Innovations fueled by IoT and the Cloud
Adrian Hornsby
 
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und ExpertenMaschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
AWS Germany
 
Reactive Architectures with Microservices
Reactive Architectures with MicroservicesReactive Architectures with Microservices
Reactive Architectures with Microservices
AWS Germany
 
CON203_Driving Innovation with Containers
CON203_Driving Innovation with ContainersCON203_Driving Innovation with Containers
CON203_Driving Innovation with Containers
Amazon Web Services
 
Driving Innovation with Containers - CON203 - re:Invent 2017
Driving Innovation with Containers - CON203 - re:Invent 2017Driving Innovation with Containers - CON203 - re:Invent 2017
Driving Innovation with Containers - CON203 - re:Invent 2017
Amazon Web Services
 
AWS Application Service Workshop - Serverless Architecture
AWS Application Service Workshop - Serverless ArchitectureAWS Application Service Workshop - Serverless Architecture
AWS Application Service Workshop - Serverless Architecture
John Yeung
 
GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...
GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...
GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...
Amazon Web Services
 
GPSTEC326-GPS Industry 4.0 AI and the Future of Manufacturing
GPSTEC326-GPS Industry 4.0 AI and the Future of ManufacturingGPSTEC326-GPS Industry 4.0 AI and the Future of Manufacturing
GPSTEC326-GPS Industry 4.0 AI and the Future of Manufacturing
Amazon Web Services
 
RET304_Rapidly Respond to Demanding Retail Customers with the Same Serverless...
RET304_Rapidly Respond to Demanding Retail Customers with the Same Serverless...RET304_Rapidly Respond to Demanding Retail Customers with the Same Serverless...
RET304_Rapidly Respond to Demanding Retail Customers with the Same Serverless...
Amazon Web Services
 

Similar to Deep learning systems model serving (20)

Model Serving for Deep Learning with MXNet Model Server
Model Serving for Deep Learning with MXNet Model ServerModel Serving for Deep Learning with MXNet Model Server
Model Serving for Deep Learning with MXNet Model Server
 
Model Serving for Deep Learning
Model Serving for Deep LearningModel Serving for Deep Learning
Model Serving for Deep Learning
 
Artificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to StartArtificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to Start
 
Building Serverless Microservices with AWS
Building Serverless Microservices with AWSBuilding Serverless Microservices with AWS
Building Serverless Microservices with AWS
 
Learn how to build serverless applications using the AWS Serverless Platform-...
Learn how to build serverless applications using the AWS Serverless Platform-...Learn how to build serverless applications using the AWS Serverless Platform-...
Learn how to build serverless applications using the AWS Serverless Platform-...
 
Artificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to StartArtificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to Start
 
Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...
Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...
Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...
 
DVC303-Technological Accelerants for Organizational Transformation
DVC303-Technological Accelerants for Organizational TransformationDVC303-Technological Accelerants for Organizational Transformation
DVC303-Technological Accelerants for Organizational Transformation
 
Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...
Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...
Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...
 
AWS X-Ray: Debugging Applications at Scale - AWS Online Tech Talks
AWS X-Ray: Debugging Applications at Scale - AWS Online Tech TalksAWS X-Ray: Debugging Applications at Scale - AWS Online Tech Talks
AWS X-Ray: Debugging Applications at Scale - AWS Online Tech Talks
 
Building secure and scalable mobile applications on AWS - AWS Summit Cape Tow...
Building secure and scalable mobile applications on AWS - AWS Summit Cape Tow...Building secure and scalable mobile applications on AWS - AWS Summit Cape Tow...
Building secure and scalable mobile applications on AWS - AWS Summit Cape Tow...
 
Innovations fueled by IoT and the Cloud
Innovations fueled by IoT and the CloudInnovations fueled by IoT and the Cloud
Innovations fueled by IoT and the Cloud
 
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und ExpertenMaschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
 
Reactive Architectures with Microservices
Reactive Architectures with MicroservicesReactive Architectures with Microservices
Reactive Architectures with Microservices
 
CON203_Driving Innovation with Containers
CON203_Driving Innovation with ContainersCON203_Driving Innovation with Containers
CON203_Driving Innovation with Containers
 
Driving Innovation with Containers - CON203 - re:Invent 2017
Driving Innovation with Containers - CON203 - re:Invent 2017Driving Innovation with Containers - CON203 - re:Invent 2017
Driving Innovation with Containers - CON203 - re:Invent 2017
 
AWS Application Service Workshop - Serverless Architecture
AWS Application Service Workshop - Serverless ArchitectureAWS Application Service Workshop - Serverless Architecture
AWS Application Service Workshop - Serverless Architecture
 
GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...
GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...
GPS: Industry 4.0: AI and the Future of Manufacturing - GPSTEC326 - re:Invent...
 
GPSTEC326-GPS Industry 4.0 AI and the Future of Manufacturing
GPSTEC326-GPS Industry 4.0 AI and the Future of ManufacturingGPSTEC326-GPS Industry 4.0 AI and the Future of Manufacturing
GPSTEC326-GPS Industry 4.0 AI and the Future of Manufacturing
 
RET304_Rapidly Respond to Demanding Retail Customers with the Same Serverless...
RET304_Rapidly Respond to Demanding Retail Customers with the Same Serverless...RET304_Rapidly Respond to Demanding Retail Customers with the Same Serverless...
RET304_Rapidly Respond to Demanding Retail Customers with the Same Serverless...
 

More from Hagay Lupesko

AI Powered Personalization @ Scale - O'Reilly AI San Jose - Sep 2019
AI Powered Personalization @ Scale - O'Reilly AI San Jose - Sep 2019AI Powered Personalization @ Scale - O'Reilly AI San Jose - Sep 2019
AI Powered Personalization @ Scale - O'Reilly AI San Jose - Sep 2019
Hagay Lupesko
 
Deep learning acceleration with Amazon Elastic Inference
Deep learning acceleration with Amazon Elastic Inference  Deep learning acceleration with Amazon Elastic Inference
Deep learning acceleration with Amazon Elastic Inference
Hagay Lupesko
 
What is deep learning (and why you should care) - Talk at SJSU Oct 2018
What is deep learning (and why you should care) - Talk at SJSU Oct 2018What is deep learning (and why you should care) - Talk at SJSU Oct 2018
What is deep learning (and why you should care) - Talk at SJSU Oct 2018
Hagay Lupesko
 
Emotion recognition in images: from idea to a model in production - Nordic DS...
Emotion recognition in images: from idea to a model in production - Nordic DS...Emotion recognition in images: from idea to a model in production - Nordic DS...
Emotion recognition in images: from idea to a model in production - Nordic DS...
Hagay Lupesko
 
Build, Train and Deploy ML Models using Amazon SageMaker
Build, Train and Deploy ML Models using Amazon SageMakerBuild, Train and Deploy ML Models using Amazon SageMaker
Build, Train and Deploy ML Models using Amazon SageMaker
Hagay Lupesko
 
ONNX - The Lingua Franca of Deep Learning
ONNX - The Lingua Franca of Deep LearningONNX - The Lingua Franca of Deep Learning
ONNX - The Lingua Franca of Deep Learning
Hagay Lupesko
 

More from Hagay Lupesko (6)

AI Powered Personalization @ Scale - O'Reilly AI San Jose - Sep 2019
AI Powered Personalization @ Scale - O'Reilly AI San Jose - Sep 2019AI Powered Personalization @ Scale - O'Reilly AI San Jose - Sep 2019
AI Powered Personalization @ Scale - O'Reilly AI San Jose - Sep 2019
 
Deep learning acceleration with Amazon Elastic Inference
Deep learning acceleration with Amazon Elastic Inference  Deep learning acceleration with Amazon Elastic Inference
Deep learning acceleration with Amazon Elastic Inference
 
What is deep learning (and why you should care) - Talk at SJSU Oct 2018
What is deep learning (and why you should care) - Talk at SJSU Oct 2018What is deep learning (and why you should care) - Talk at SJSU Oct 2018
What is deep learning (and why you should care) - Talk at SJSU Oct 2018
 
Emotion recognition in images: from idea to a model in production - Nordic DS...
Emotion recognition in images: from idea to a model in production - Nordic DS...Emotion recognition in images: from idea to a model in production - Nordic DS...
Emotion recognition in images: from idea to a model in production - Nordic DS...
 
Build, Train and Deploy ML Models using Amazon SageMaker
Build, Train and Deploy ML Models using Amazon SageMakerBuild, Train and Deploy ML Models using Amazon SageMaker
Build, Train and Deploy ML Models using Amazon SageMaker
 
ONNX - The Lingua Franca of Deep Learning
ONNX - The Lingua Franca of Deep LearningONNX - The Lingua Franca of Deep Learning
ONNX - The Lingua Franca of Deep Learning
 

Recently uploaded

TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
Tier1 app
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
Globus
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
Globus
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
Globus
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
Globus
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Globus
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
XfilesPro
 
A Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdfA Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdf
kalichargn70th171
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
takuyayamamoto1800
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Globus
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
Donna Lenk
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
Fermin Galan
 
Corporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMSCorporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMS
Tendenci - The Open Source AMS (Association Management Software)
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Globus
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
abdulrafaychaudhry
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
Ortus Solutions, Corp
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 

Recently uploaded (20)

TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
 
A Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdfA Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdf
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
 
Corporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMSCorporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMS
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 

Deep learning systems model serving

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Hagay Lupesko 01.25.2018 Model Serving for Deep Learning Amazon AI
  • 2. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Brief Intro to Deep Learning AI Machine Learning Deep Learning
  • 3. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Brief Intro to Deep Learning – Neural Networks Output Layer Input Layer Hidden Layers Many More… • Non linear • Hierarchical feature learning • Scalable architecture • Computationally intensive
  • 4. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Deep Learning is a Big Deal It has a growing impact on our lives Personalization Logistics Voice Autonomous Vehicles
  • 5. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Deep Learning is a Big Deal It’s able to do better than other ML and Humans
  • 6. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Model Model Server Mobile Desktop IoT Internet So what does a deployed model looks like?
  • 7. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Performance Availability Networking Monitoring Model Decoupling Cross Framework Cross Platform The Undifferentiated Heavy Lifting of Model Serving
  • 8. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Tensor Flow Serving Model Server for MXNet UC Berkeley Clipper Model Serving Systems for Deep Learning
  • 9. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. It’s Demo Time!
  • 10. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Model Archive REST and OpenAPI Containerized ONNX Support Operational Metrics
  • 11. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Trained Network Model Signature Custom Code Auxiliary Assets Model Archive Model Export CLI Model Archive Back
  • 12. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. REST and OpenAPI REST-like endpoint: <model-name>/predict Endpoint auto-generated from the model’s signature.json JSON encoding by default Binary input via request payload OpenAPI support – client code-gen and tooling Back
  • 13. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. MMS Dockerfile Build Push Launch Containerization Container Cluster MMS Container MMS ContainerMMS Container MXNet NGINX MXNet Model Server Lightweight virtualization, isolation, runs anywhere Back
  • 14. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. • Requests • Latencies • Resources Metrics • Model Name • Host Name Dimensions • Log / CSV • AWS CloudWatch Target Operational Metrics Back
  • 15. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 16. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. O(n2) Pairs MXNet Caffe2 PyTorch TF CNTKCoreML TensorRT NGraph SNPEMany Frameworks ONNX Support Many Platforms ONNX: Common IR Supported in MMS v0.2
  • 17. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Performance • Batching • Caching • JIT Compilation • Custom code • Quantization Platform • New players • ONNX • Plugins Adoption • Ease of use • Internal Amazon dev tools • Industry partners Challenges Ahead
  • 18. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Open source – try it out and file issues github.com/awslabs/mxnet-model-server mxnet-sdk-team@amazon.com