2. Amazon Confidential
Agenda
• AI and Deep Learning at Amazon
• Brief Primer on Deep Learning & Applications
• MXNet Overview and Investments
• Deep Learning Tools and Usage
• Application Example: Deploying MXNet in ECS/Docker
• Application Example: MXNet in a ‘Server-less’ Lambda Environment
• Next Steps and Call to Action
3. Amazon Confidential
Artificial Intelligence At Amazon
Thousands Of Employees Across The Company Focused on AI
• Discovery & Search
• Fulfilment & Logistics
• Enhance Existing Products
• Define New Categories Of Products
• Bring Machine Learning To All
4. Amazon Confidential
AI on AWS Today
• Zillow
–Zestimate (using Apache Spark)
• Howard Hughes Corp
–Lead scoring for luxury real estate purchase predictions
• FINRA
–Anomaly detection, sequence matching, regression analysis, network/tribe analysis
• Netflix
–Recommendation engine
• Pinterest
–Image recognition search
• Fraud.net
–Detect online payment fraud
• DataXu
–Leverage automated & unattended ML at large scale (Amazon EMR + Spark)
• Mapillary
–Computer vision for crowd-sourced maps
• Hudl
–Predictive analytics on sports plays
• Upserve
–Restaurant table mgmt & POS for forecasting customer traffic
• TuSimple
–Computer vision for autonomous driving
• Clarifai
–Computer vision APIs
10. Amazon Confidential
Deep Learning
Significantly improves many applications across multiple domains
“Deep learning” trend over the past 10 years
• image understanding
• speech recognition
• natural language processing
• …
• autonomy
12. Amazon Confidential
Image Classification
• Hard to define the network
• The definition of the Inception network has >1k lines of code in Caffe
• A single image requires billions of floating-point operations
• Intel i7: ~500 GFLOPS
• Nvidia Titan X: ~5 TFLOPS
• Memory consumption is linear in the number of layers
• State-of-the-art networks have tens to hundreds of layers
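To put the throughput figures above in perspective, here is a rough back-of-the-envelope sketch. The per-image cost of 5 billion operations is an assumed illustrative value (the slide only says "billions"), and the device numbers are the peak figures quoted above, so the results are lower bounds on latency, not measurements.

```python
def seconds_per_image(flops_per_image, device_flops_per_sec):
    """Ideal time for one forward pass if the device ran at peak throughput."""
    return flops_per_image / device_flops_per_sec


# Assumption: 5e9 operations per image (hypothetical, "billions" per the slide).
FLOPS_PER_IMAGE = 5e9

# Peak throughput figures as quoted on the slide.
DEVICES = {
    "Intel i7": 500e9,        # ~500 GFLOPS
    "Nvidia Titan X": 5e12,   # ~5 TFLOPS
}

for name, peak in DEVICES.items():
    ms = seconds_per_image(FLOPS_PER_IMAGE, peak) * 1e3
    print(f"{name}: ~{ms:.0f} ms per image at peak throughput")
```

Real workloads rarely reach peak throughput, so actual per-image times would be higher than these estimates.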
13. Amazon Confidential
Language Modeling
• Variable length of input and output sequences
• State-of-the-art networks have many layers
• Billions of floating-point operations per sentence
• Memory consumption is linear with both sequence length and number of layers
[Figure: recurrent neural network unrolled over time. Labels: input, state, output. Input tokens: <go>, hello, world. Output tokens: hello, world, !]
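The linear memory claim can be illustrated with a small estimate. This is a simplified sketch that assumes training keeps one hidden-state vector per layer per time step for backpropagation; it ignores weights, gate activations, and framework overhead, and all sizes below are hypothetical.

```python
def activation_bytes(seq_len, num_layers, hidden_size,
                     batch_size=1, bytes_per_value=4):
    """Bytes needed to keep one hidden-state vector per layer per time step."""
    return seq_len * num_layers * batch_size * hidden_size * bytes_per_value


# Hypothetical baseline: 20 tokens, 2 layers, 512 hidden units, float32.
base = activation_bytes(seq_len=20, num_layers=2, hidden_size=512)
print("baseline bytes:", base)

# Doubling either the sequence length or the layer count doubles the memory.
print("2x sequence length:", activation_bytes(40, 2, 512) / base)
print("2x layers:", activation_bytes(20, 4, 512) / base)
```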
14. Amazon Confidential
TX1 on Flying Drone
TX1 with customized board
Drone
Real-time detection and tracking on TX1
~10 frames/sec at 640x480 resolution
15. Amazon Confidential
Deploy Everywhere
Amalgamation
• Fit the core library with all dependencies into a single C++ source file
• Easy to compile on …
Beyond
• BlindTool by Joseph Paul Cohen, demo on Nexus 4
• Runs in browser with JavaScript
• The first image for search “dog” at images.google.com
• Outputs “beagle” with prob = 73% within 1 sec
16. Amazon Confidential
Deep RL | Playing Flappy Birds
• Reinforcement learning: Observe environment → Take action → Achieve reward → Repeat. Goal is to maximize rewards over time.
• There are three interfaces:
• getInitState() for initialization
• getAction()
• setPerception(nextObservation, action, reward, terminal)
• Resources:
• http://ww1.sinaimg.cn/mw690/8708cad7jw1f8naomrmweg209n0fo7wj.gif
• https://github.com/li-haoran/DRL-FlappyBird
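The observe/act/reward loop and the three interfaces named above can be sketched roughly as follows. This is not the code from the linked repository: `ToyEnv` and `RandomAgent` are hypothetical stand-ins, the agent picks actions at random instead of learning, and passing the first observation to `getInitState` is an assumption about how initialization works.

```python
import random


class ToyEnv:
    """Hypothetical 1-D environment: the episode ends at position 3."""

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action 0 = stay, action 1 = move one step to the right
        self.pos += action
        terminal = self.pos >= 3
        reward = 1.0 if terminal else 0.0
        return self.pos, reward, terminal


class RandomAgent:
    """Hypothetical agent exposing the three interface names from the slide."""

    def __init__(self, num_actions, seed=0):
        self.rng = random.Random(seed)
        self.num_actions = num_actions
        self.state = None
        self.total_reward = 0.0

    def getInitState(self, observation):
        # Assumption: initialization receives the first observation.
        self.state = observation

    def getAction(self):
        # A learning agent would choose based on self.state; this one is random.
        return self.rng.randrange(self.num_actions)

    def setPerception(self, nextObservation, action, reward, terminal):
        # A learning agent would store this transition and update its model.
        self.total_reward += reward
        self.state = nextObservation


def run_episode(env, agent, max_steps=100):
    """Observe -> act -> receive reward -> repeat until terminal."""
    agent.getInitState(env.reset())
    for step in range(max_steps):
        action = agent.getAction()
        observation, reward, terminal = env.step(action)
        agent.setPerception(observation, action, reward, terminal)
        if terminal:
            return step + 1
    return max_steps


agent = RandomAgent(num_actions=2)
steps = run_episode(ToyEnv(), agent)
print("steps:", steps, "total reward:", agent.total_reward)
```

A deep RL agent would replace the random choice in `getAction` with a learned policy and use `setPerception` to collect training data.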
21. Amazon Confidential
Caffe - Deep Learning Framework by the BVLC
Caffe is a deep learning framework made with expression, speed, and
modularity in mind. It is developed by the Berkeley Vision and Learning
Center (BVLC) and by community contributors. Yangqing Jia created
the project.
• Expressive architecture encourages application and innovation.
Models and optimization are defined by configuration without hard-
coding. Switch between CPU and GPU by setting a single flag.
• Supports multiple GPUs but not multiple machines.
• Caffe on Spark and Caffe con Troll are some attempts to scale it.
• Community. Caffe powers academic research projects, startup
prototypes, and large-scale industrial applications in vision, speech,
and multimedia.
22. Amazon Confidential
Torch
Torch is a scientific computing framework with wide support for machine learning algorithms that puts GPUs first. It is easy to use and efficient, thanks to an easy and fast scripting language, LuaJIT, and an underlying C/CUDA implementation.
• powerful N-dimensional array
• lots of routines for indexing, slicing, transposing, ...
• amazing interface to C, via LuaJIT
• linear algebra routines
• neural network, and energy-based models
• numeric optimization routines
• Fast and efficient GPU support
• Embeddable, with ports to iOS, Android and FPGA backends
25. Amazon Confidential
MXNet Overview
• Founded by: U. Washington, Carnegie Mellon U. (~1.5 yrs old)
• State-of-the-Art Model Support: Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM)
• Ultra-scalable: Near-linear scaling across multiple GPUs and machines for the fastest time to model
• Multi-language: Support for Scala, Python, R, etc. for legacy code leverage and easy integration with Spark
• Ecosystem: Vibrant community from academia and industry
Open Source Project on GitHub | Apache 2.0 Licensed
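"Near-linear scaling" is usually quantified as scaling efficiency: measured speedup divided by the number of devices. The sketch below shows that arithmetic; the throughput numbers are hypothetical and are not MXNet benchmark results.

```python
def scaling_efficiency(throughput_1, throughput_n, n):
    """Speedup on n devices divided by n; 1.0 means perfectly linear scaling."""
    speedup = throughput_n / throughput_1
    return speedup / n


# Hypothetical figures: 100 images/sec on 1 GPU, 1480 images/sec on 16 GPUs.
eff = scaling_efficiency(throughput_1=100.0, throughput_n=1480.0, n=16)
print(f"scaling efficiency: {eff:.1%}")
```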
26. Amazon Confidential
Collaborations and Community
4th DL Framework in Popularity
(Outpacing Torch, CNTK and Theano)
[Bar chart: popularity by framework, in chart order: TensorFlow, Caffe, Keras, MXNet, Theano, Deeplearning4j, CNTK, Torch7]
Diverse Community
(Spans Industry and Academia)
Top Contributors
• Bing Xu (Apple)
• Tianqi Chen (UW)
• Mu Li (CMU/AWS)
• Eric Xie (UW/AWS)
• Yizhi Liu (MediaV)
• Chiyuan Zhang (MIT)
• Tianjun Xiao (Microsoft)
• Yutian Li (Face++)
• Guo Jian (TuSimple)
• Guosheng Dong (Sogou)
• Yu Zhang (MIT)
• Depeng Liang (?)
• Qiang Kou (Indiana U)
• Xingjian Shi (HKUST)
• Naiyan Wang (TuSimple)
31. Amazon Confidential
One-Click GPU or CPU Deep Learning
AWS Deep Learning AMI
Up to ~40k CUDA cores
MXNet
TensorFlow
Theano
Caffe
Torch
Pre-configured CUDA drivers
Anaconda, Python3
+ CloudFormation template
+ Container Image
33. Amazon Confidential
Getting started with Deep Learning
• Tool for data scientists and developers
• Setting up a DL system takes (install) time & skill
• Keep packages up to date and compile
• Install all dependencies
• NVIDIA drivers and cuDNN for G2 and P2 instances
• Intel MKL Drivers for all other instances (C4, M4, …)
http://bit.ly/deepami
34. Amazon Confidential
Getting started with Deep Learning
• Drivers
CUDA / cuDNN / cuFFT / cuSPARSE / MKL
• Development tools
Python 2 and 3, Anaconda, Jupyter notebooks, Graphviz
• Deep Learning Platforms (compiled & tested)
• MXNet, TensorFlow, CNTK
multi-GPU, multi-machine (MXNet recommended)
• Caffe, Theano, Torch
• Keras
• Up and running in just a few minutes training a Neural Network
Always up to date (less than 1 month), optimized & tested on AWS
35. Amazon Confidential
Getting started
acbc32cf4de3:image-classification smola$ ssh ec2-user@54.210.246.140
Last login: Fri Nov 11 05:58:58 2016 from 72-21-196-69.amazon.com
=============================================================================
__| __|_ )
_| ( / Deep Learning AMI for Amazon Linux
___|___|___|
This is beta version of the Deep Learning AMI for Amazon Linux.
The README file for the AMI ➜➜➜➜➜➜➜➜➜➜➜➜➜➜➜➜➜➜➜➜ /home/ec2-user/src/README.md
Tests for deep learning frameworks ➜➜➜➜➜➜➜➜➜➜➜➜ /home/ec2-user/src/bin
=============================================================================
7 package(s) needed for security, out of 75 available
Run "sudo yum update" to apply all updates.
Amazon Linux version 2016.09 is available.
[ec2-user@ip-172-31-55-21 ~]$ cd src/
[ec2-user@ip-172-31-55-21 src]$ ls
anaconda2 bazel caffe cntk keras mxnet OpenBLAS README.md Theano
anaconda3 bin caffe3 demos logs Nvidia_Cloud_EULA.pdf opencv tensorflow torch
39. Amazon Confidential
AWS CloudFormation Components
• VPC in the customer account.
• The requested number of worker instances in an Auto Scaling group within the
VPC. Workers are launched in a private subnet.
• Master instance in a separate Auto Scaling group that acts as a proxy to enable
connectivity to the cluster via SSH.
• Two security groups that open ports on the private subnet for communication
between the master and workers.
• IAM role that allows users to access and query Auto Scaling groups and the
private IP addresses of the EC2 instances.
• NAT gateway used by instances within the VPC to talk to the outside.
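As a rough inventory of the components listed above, the sketch below builds a CloudFormation-shaped dictionary with one entry per component. It is not the actual Deep Learning template and is not deployable: the logical names are hypothetical, all resource properties are omitted, and the private subnet entry is inferred from the description of where workers are launched.

```python
import json


def template_skeleton():
    """Return a CloudFormation-shaped dict naming one resource per component."""
    # Logical names (left) are hypothetical; resource types (right) are the
    # CloudFormation types that correspond to the components on the slide.
    resources = {
        "Vpc": "AWS::EC2::VPC",
        "PrivateSubnet": "AWS::EC2::Subnet",
        "WorkerGroup": "AWS::AutoScaling::AutoScalingGroup",
        "MasterGroup": "AWS::AutoScaling::AutoScalingGroup",
        "SecurityGroupA": "AWS::EC2::SecurityGroup",
        "SecurityGroupB": "AWS::EC2::SecurityGroup",
        "InstanceRole": "AWS::IAM::Role",
        "NatGateway": "AWS::EC2::NatGateway",
    }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        # Properties are intentionally left out; a real template needs them.
        "Resources": {name: {"Type": rtype} for name, rtype in resources.items()},
    }


print(json.dumps(template_skeleton(), indent=2))
```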
46. Amazon Confidential
Application Examples | Python notebooks
https://github.com/dmlc/mxnet-notebooks
Basic concepts
• NDArray - multi-dimensional array computation
• Symbol - symbolic expression for neural networks
• Module - neural network training and inference
Applications
• MNIST: recognize handwritten digits
• Check out the distributed training results
• Predict with pre-trained models
• LSTMs for sequence learning
• Recommender systems
• Train a state of the art Computer Vision model (CNN)
• Lots more..
47. Call to Action
MXNet Resources:
• MXNet Blog Post | AWS Endorsement
• Read up on MXNet and Learn More: mxnet.io
• MXNet Github Repo
• MXNet Talk by Mu Li
Developer Resources:
• Jeff Barr Blog on P2 | New P2 Instance Type for Amazon EC2 – Up to 16 GPUs
• Deep Learning AMI
• P2 Instance Information
• CloudFormation Template Instructions
• Deep Learning Benchmark