© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Hagay Lupesko
01.25.2018
Model Serving for Deep Learning
Amazon AI
Brief Intro to Deep Learning
AI
Machine Learning
Deep Learning
Brief Intro to Deep Learning – Neural Networks
Input Layer
Hidden Layers
Output Layer
Many More…
• Nonlinear
• Hierarchical feature learning
• Scalable architecture
• Computationally intensive
Deep Learning is a Big Deal
It has a growing impact on our lives
• Personalization
• Logistics
• Voice
• Autonomous Vehicles
Deep Learning is a Big Deal
It can outperform other ML techniques and humans
So what does a deployed model look like?
Model
Model Server
Internet
Mobile
Desktop
IoT
The Undifferentiated Heavy Lifting of Model Serving
• Performance
• Availability
• Networking
• Monitoring
• Model Decoupling
• Cross Framework
• Cross Platform
Model Serving Systems for Deep Learning
• TensorFlow Serving
• Model Server for MXNet
• UC Berkeley Clipper
It’s Demo Time!
• Model Archive
• REST and OpenAPI
• Containerized
• ONNX Support
• Operational Metrics
Model Archive
• Trained Network
• Model Signature
• Custom Code
• Auxiliary Assets
Model Export CLI → Model Archive
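As a rough illustration of the packaging idea on this slide, the sketch below bundles a signature and a set of model files into a single archive. It is not the Model Export CLI: the helper name `export_model_archive`, the use of a plain zip container, and the `.model` file extension are all assumptions made for the example.

```python
import json
import zipfile
from pathlib import Path


def export_model_archive(model_name, files, signature, out_dir):
    """Bundle model files and a signature into one archive file.

    Hypothetical helper: the zip container and the ".model" extension
    are assumptions for illustration, not a description of MMS itself.
    """
    out_path = Path(out_dir) / f"{model_name}.model"
    with zipfile.ZipFile(out_path, "w") as archive:
        # The signature describes the model's inputs and outputs.
        archive.writestr("signature.json", json.dumps(signature, indent=2))
        # Trained network, custom code and auxiliary assets are added as-is.
        for path in files:
            archive.write(path, arcname=Path(path).name)
    return out_path
```

Keeping everything in one file means the serving side only needs a single artifact to load a model, which is the decoupling the slide points at.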
REST and OpenAPI
REST-like endpoint: <model-name>/predict
Endpoint auto-generated from the model’s signature.json
JSON encoding by default
Binary input via request payload
OpenAPI support – client code-gen and tooling
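A minimal client-side sketch of the endpoint shape described above, using only the Python standard library. The host, port, model name `squeezenet`, the use of POST, and the JSON field name `data` are assumptions for illustration; the real input names would come from the model's signature.json.

```python
import json
import urllib.request


def build_predict_request(base_url, model_name, payload,
                          content_type="application/json"):
    """Build a POST request for <model-name>/predict without sending it."""
    url = f"{base_url.rstrip('/')}/{model_name}/predict"
    if content_type == "application/json":
        # JSON encoding is the default on the slide.
        body = json.dumps(payload).encode("utf-8")
    else:
        # Binary input travels as the raw request payload (bytes).
        body = payload
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )


# Hypothetical server address and model name.
req = build_predict_request("http://localhost:8080", "squeezenet",
                            {"data": [1, 2, 3]})
# urllib.request.urlopen(req) would send it to a running server.
```

For binary input such as an image, the same helper could be called with `payload` set to the file's bytes and a matching `content_type`, for example `image/jpeg`.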
Containerization
MMS Dockerfile → Build → Push → Launch
Container Cluster: MMS Container, MMS Container, MMS Container
MXNet, NGINX, MXNet Model Server
Lightweight virtualization, isolation, runs anywhere
Operational Metrics
Metrics
• Requests
• Latencies
• Resources
Dimensions
• Model Name
• Host Name
Target
• Log / CSV
• AWS CloudWatch
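The metric, dimension, and target structure on this slide can be sketched as rows in a CSV log. This is only an illustration under assumptions: the column names, the `record_metric` helper, and the sample values are hypothetical, and an in-memory buffer stands in for a log file.

```python
import csv
import io
import socket
import time


def record_metric(writer, name, value, unit, model_name, host_name=None):
    """Append one metric row tagged with model-name and host-name dimensions."""
    writer.writerow([int(time.time()), name, value, unit, model_name,
                     host_name or socket.gethostname()])


buffer = io.StringIO()  # stands in for a CSV log file
writer = csv.writer(buffer)
writer.writerow(["timestamp", "metric", "value", "unit",
                 "model_name", "host_name"])

# Time a placeholder workload in place of a real predict call.
start = time.perf_counter()
_ = sum(range(1000))
latency_ms = (time.perf_counter() - start) * 1000.0

record_metric(writer, "requests", 1, "count", "squeezenet", "host-1")
record_metric(writer, "latency", round(latency_ms, 3), "ms",
              "squeezenet", "host-1")
```

Tagging each row with the model and host makes it possible to aggregate per model or per machine later, whichever target the rows are shipped to.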
ONNX Support
Many Frameworks: MXNet, Caffe2, PyTorch, TF, CNTK
Many Platforms: CoreML, TensorRT, NGraph, SNPE
O(n²) Pairs
ONNX: Common IR
Supported in MMS v0.2
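The "O(n²) pairs" point can be made concrete with the names on this slide. The count below is a simple illustration, assuming one converter per framework and platform pair when there is no shared format, and one exporter or importer per tool when a common IR sits in the middle.

```python
frameworks = ["MXNet", "Caffe2", "PyTorch", "TF", "CNTK"]
platforms = ["CoreML", "TensorRT", "NGraph", "SNPE"]

# Without a shared format, every framework needs a converter per platform.
direct_converters = len(frameworks) * len(platforms)

# With a common IR, each framework exports once and each platform imports once.
via_common_ir = len(frameworks) + len(platforms)

print(direct_converters, via_common_ir)  # 20 9
```

The gap widens as tools are added: one more framework costs four new converters in the pairwise case, but only one exporter with a common IR.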
Challenges Ahead
Performance
• Batching
• Caching
• JIT Compilation
• Custom code
• Quantization
Platform
• New players
• ONNX
• Plugins
Adoption
• Ease of use
• Internal Amazon dev tools
• Industry partners
Open source – try it out and file issues
github.com/awslabs/mxnet-model-server
mxnet-sdk-team@amazon.com
