© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Hagay Lupesko
01.25.2018
Model Serving for Deep Learning
Amazon AI
Brief Intro to Deep Learning
AI > Machine Learning > Deep Learning (each nested inside the previous)
Brief Intro to Deep Learning – Neural Networks
Input Layer → Hidden Layers (many more…) → Output Layer
• Non-linear
• Hierarchical feature learning
• Scalable architecture
• Computationally intensive
Deep Learning is a Big Deal
It has a growing impact on our lives
Personalization, Logistics, Voice, Autonomous Vehicles
Deep Learning is a Big Deal
It can do better than other ML techniques and, on some tasks, than humans
So what does a deployed model look like?
Clients (Mobile, Desktop, IoT) → Internet → Model Server → Model
The Undifferentiated Heavy Lifting of Model Serving
• Performance
• Availability
• Networking
• Monitoring
• Model Decoupling
• Cross Framework
• Cross Platform
Model Serving Systems for Deep Learning
• TensorFlow Serving
• Model Server for MXNet
• Clipper (UC Berkeley)
It’s Demo Time!
• Model Archive
• REST and OpenAPI
• Containerized
• ONNX Support
• Operational Metrics
Model Archive
Trained Network + Model Signature + Custom Code + Auxiliary Assets → Model Export CLI → Model Archive
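As a rough illustration of the bundling idea on this slide, the sketch below packs the four kinds of artifact into one zip file using only Python's standard library. The file names, the zip layout and the `build_archive` helper are assumptions made for the example; this is not the Model Export CLI, whose actual output format is not described here.

```python
import io
import json
import zipfile


def build_archive(parts: dict) -> bytes:
    """Bundle named artifacts (file name -> bytes) into one in-memory zip file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, payload in parts.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


# Hypothetical contents, one entry per box on the slide.
signature = {"inputs": [{"name": "data", "shape": [1, 3, 224, 224]}]}
parts = {
    "network-symbol.json": b"{}",                      # trained network (placeholder)
    "signature.json": json.dumps(signature).encode(),  # model signature
    "custom_service.py": b"# custom code placeholder\n",
    "labels.txt": b"cat\ndog\n",                       # auxiliary asset
}

archive_bytes = build_archive(parts)
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    print(sorted(archive.namelist()))
```

Shipping one file keeps the network, its signature and any custom code versioned together, which is the decoupling the slide points at.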
REST and OpenAPI
REST-like endpoint: <model-name>/predict
Endpoint auto-generated from the model’s signature.json
JSON encoding by default
Binary input via request payload
OpenAPI support – client code-gen and tooling
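To make the endpoint shape concrete, here is a minimal client-side sketch using Python's standard library. The host, port, model name, JSON body and the use of POST are assumptions chosen for illustration, and the request is only constructed, not sent; the real input fields would come from the model's signature.json.

```python
import json
import urllib.request


def build_predict_request(base_url: str, model_name: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST to <base_url>/<model-name>/predict without sending it."""
    url = f"{base_url.rstrip('/')}/{model_name}/predict"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical server address, model name and input field.
request = build_predict_request("http://localhost:8080", "my-model", {"data": [1.0, 2.0]})
print(request.get_method(), request.full_url)

# Sending is left commented out because it needs a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```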
Containerization
MMS Dockerfile → Build → Push → Launch
Container Cluster: multiple MMS Containers, each running MXNet Model Server, MXNet and NGINX
Lightweight virtualization, isolation, runs anywhere
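The build, push, launch flow on this slide maps onto three standard Docker CLI commands. The sketch below only assembles those command lines; the image tag, build context and port numbers are hypothetical, and nothing is executed.

```python
def docker_commands(image: str, context_dir: str, host_port: int, container_port: int) -> list:
    """Return the build, push and launch command lines as argument lists."""
    return [
        ["docker", "build", "-t", image, context_dir],
        ["docker", "push", image],
        ["docker", "run", "-d", "-p", f"{host_port}:{container_port}", image],
    ]


# Hypothetical registry, tag and ports.
for command in docker_commands("registry.example.com/mms:latest", ".", 80, 8080):
    print(" ".join(command))

# To actually run them, pass each list to subprocess.run(command, check=True).
```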
Operational Metrics
Metrics
• Requests
• Latencies
• Resources
Dimensions
• Model Name
• Host Name
Target
• Log / CSV
• AWS CloudWatch
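One way to picture the split between metrics, dimensions and targets is a record with a name, a value and a set of dimensions, written out as a CSV row. The column layout and the `Metric` class below are assumptions for illustration, not the format MMS emits.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Metric:
    name: str        # e.g. a request count or a latency
    value: float
    unit: str
    model_name: str  # dimension
    host_name: str   # dimension


def to_csv(metrics: list) -> str:
    """Serialize metrics to CSV text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "value", "unit", "model_name", "host_name"])
    for m in metrics:
        writer.writerow([m.name, m.value, m.unit, m.model_name, m.host_name])
    return out.getvalue()


# Hypothetical samples.
samples = [
    Metric("requests", 1, "count", "my-model", "host-1"),
    Metric("latency", 12.5, "ms", "my-model", "host-1"),
]
print(to_csv(samples))
```

Keeping dimensions as separate columns lets the same rows be filtered per model or per host, whichever target they are sent to.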
ONNX Support
Many Frameworks: MXNet, Caffe2, PyTorch, TF, CNTK
Many Platforms: CoreML, TensorRT, NGraph, SNPE
O(n²) Pairs
ONNX: Common IR
Supported in MMS v0.2
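The arithmetic behind this slide, as a small sketch: converting directly needs one converter per (framework, platform) pair, while a common IR needs one exporter per framework plus one importer per platform. The grouping of the names into frameworks and platforms is an assumption based on the slide's two headings, and the counts describe the idealized case only.

```python
def direct_converters(frameworks: int, platforms: int) -> int:
    """One converter for every (framework, platform) pair."""
    return frameworks * platforms


def converters_with_common_ir(frameworks: int, platforms: int) -> int:
    """One exporter per framework plus one importer per platform."""
    return frameworks + platforms


frameworks = ["MXNet", "Caffe2", "PyTorch", "TF", "CNTK"]
platforms = ["CoreML", "TensorRT", "NGraph", "SNPE"]

print(direct_converters(len(frameworks), len(platforms)))          # 20
print(converters_with_common_ir(len(frameworks), len(platforms)))  # 9
```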
Challenges Ahead
Performance
• Batching
• Caching
• JIT Compilation
• Custom code
• Quantization
Platform
• New players
• ONNX
• Plugins
Adoption
• Ease of use
• Internal Amazon dev tools
• Industry partners
Open source – try it out and file issues
github.com/awslabs/mxnet-model-server
mxnet-sdk-team@amazon.com

WSO2Con2024 - Enabling Transactional System's Exponential Growth With SimplicityWSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
 
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go Platformless
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptx
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 

Deep learning systems model serving

  • 1. Model Serving for Deep Learning. Hagay Lupesko, Amazon AI, 01.25.2018. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 2. Brief Intro to Deep Learning: AI, Machine Learning, Deep Learning.
  • 3. Brief Intro to Deep Learning: Neural Networks. Input Layer, Hidden Layers, Output Layer, Many More… Properties: non-linear; hierarchical feature learning; scalable architecture; computationally intensive.
  • 4. Deep Learning is a Big Deal: it has a growing impact on our lives. Personalization, Logistics, Voice, Autonomous Vehicles.
  • 5. Deep Learning is a Big Deal: it is able to do better than other ML and humans.
  • 6. So what does a deployed model look like? Model, Model Server, Internet, Mobile, Desktop, IoT.
  • 7. The Undifferentiated Heavy Lifting of Model Serving: Performance, Availability, Networking, Monitoring, Model Decoupling, Cross Framework, Cross Platform.
  • 8. Model Serving Systems for Deep Learning: TensorFlow Serving, Model Server for MXNet, UC Berkeley Clipper.
  • 9. It’s Demo Time!
  • 10. Model Archive, REST and OpenAPI, Containerized, ONNX Support, Operational Metrics.
  • 11. Model Archive: Trained Network, Model Signature, Custom Code, and Auxiliary Assets are packaged by the Model Export CLI into a Model Archive.
  • 12. REST and OpenAPI. REST-like endpoint: <model-name>/predict. Endpoint auto-generated from the model’s signature.json. JSON encoding by default. Binary input via request payload. OpenAPI support: client code-gen and tooling.
  • 13. Containerization. MMS Dockerfile: Build, Push, Launch. Container Cluster of MMS Containers, each with NGINX, MXNet Model Server, and MXNet. Lightweight virtualization, isolation, runs anywhere.
  • 14. Operational Metrics. Metrics: Requests, Latencies, Resources. Dimensions: Model Name, Host Name. Target: Log / CSV, AWS CloudWatch.
  • 16. ONNX Support. Many Frameworks (MXNet, Caffe2, PyTorch, TF, CNTK) and Many Platforms (CoreML, TensorRT, NGraph, SNPE): O(n²) pairs. ONNX: Common IR. Supported in MMS v0.2.
  • 17. Challenges Ahead. Performance: Batching, Caching, JIT Compilation, Custom code, Quantization. Platform: New players, ONNX, Plugins. Adoption: Ease of use, Internal Amazon dev tools, Industry partners.
  • 18. Open source: try it out and file issues. github.com/awslabs/mxnet-model-server, mxnet-sdk-team@amazon.com
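
Slide 12 describes a REST-like endpoint of the form <model-name>/predict with JSON encoding by default. A minimal Python sketch of how a client might build such a request is shown here. The server address, the model name, the input field name, and the use of POST are assumptions for illustration and are not confirmed by the slides; real input names would come from the model's signature.json.

```python
import json
import urllib.request


def build_predict_request(base_url, model_name, inputs):
    """Build (but do not send) a JSON request for <model-name>/predict."""
    url = f"{base_url.rstrip('/')}/{model_name}/predict"
    body = json.dumps(inputs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the slides do not name the HTTP method
    )


# Hypothetical server address, model name, and input field.
request = build_predict_request(
    "http://localhost:8080", "squeezenet", {"data": [0.1, 0.2, 0.3]}
)
print(request.get_method(), request.full_url)
```

Against a running server, the request could be sent with urllib.request.urlopen(request); the sketch stops short of that so it runs without one.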
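
Slide 14 lists request, latency, and resource metrics, tagged with model-name and host-name dimensions and written to a log or CSV target. The sketch below shows one way such rows could be recorded in Python. The column layout, metric names, and units are assumptions made for illustration, not the format MMS actually emits.

```python
import csv
import io
import socket
import time


def write_metric(writer, name, value, unit, model_name, host_name=None):
    """Append one metric row tagged with model and host dimensions."""
    writer.writerow([
        int(time.time()), name, value, unit, model_name,
        host_name or socket.gethostname(),
    ])


buffer = io.StringIO()  # stands in for a CSV file on disk
writer = csv.writer(buffer)
writer.writerow(["timestamp", "metric", "value", "unit", "model_name", "host_name"])

start = time.perf_counter()
# ... a prediction would run here ...
latency_ms = (time.perf_counter() - start) * 1000.0

# Hypothetical model and host names.
write_metric(writer, "Requests", 1, "Count", "squeezenet", "host-1")
write_metric(writer, "Latency", round(latency_ms, 3), "Milliseconds", "squeezenet", "host-1")

rows = list(csv.reader(io.StringIO(buffer.getvalue())))
print(rows)
```

Keeping the dimensions as separate columns lets the same rows be filtered per model or per host, which is the role dimensions play on the slide.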
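
Slide 16 motivates ONNX as a common IR by pointing at the number of framework and platform pairs. The arithmetic can be sketched as follows. The split of names into frameworks and platforms is inferred from the flattened slide text and is an assumption, and "converter" is used loosely here for any framework-to-platform translation path.

```python
frameworks = ["MXNet", "Caffe2", "PyTorch", "TF", "CNTK"]
platforms = ["CoreML", "TensorRT", "NGraph", "SNPE"]

# Without a common IR: one converter per (framework, platform) pair.
direct_converters = len(frameworks) * len(platforms)

# With a common IR: one exporter per framework plus one importer per platform.
ir_converters = len(frameworks) + len(platforms)

print(direct_converters, ir_converters)  # prints "20 9"
```

The pairwise count grows with the product of the two lists, while the common-IR count grows with their sum, which is the gap the slide's O(n²) label refers to.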
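
Slide 17 names batching among the performance challenges. The sketch below illustrates the basic idea in Python: pending requests are grouped so that the model handles several inputs per forward pass. The function names are hypothetical and run_model_on_batch is a stand-in, not an MMS API.

```python
def make_batches(requests, max_batch_size):
    """Split pending requests into lists of at most max_batch_size items."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [
        requests[i:i + max_batch_size]
        for i in range(0, len(requests), max_batch_size)
    ]


def run_model_on_batch(batch):
    """Hypothetical stand-in for one forward pass over a whole batch."""
    return [f"prediction-for-{item}" for item in batch]


pending = ["req-1", "req-2", "req-3", "req-4", "req-5"]
batches = make_batches(pending, max_batch_size=2)
results = [pred for batch in batches for pred in run_model_on_batch(batch)]
print(len(batches), results[0])
```

A production server would typically also cap how long a request may wait for its batch to fill, trading a little latency for throughput; that timing logic is left out of this sketch.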