6. Amazon EC2 P3 Instances
• Up to eight NVIDIA Tesla V100 GPUs
• 1 PetaFLOPs of computational performance – 14x better than P2
• 300 GB/s GPU-to-GPU communication (NVLink) – 9X better than P2
• 16GB GPU memory with 900 GB/sec peak GPU memory bandwidth
The fastest, most powerful GPU instances in the cloud
7. Amazon EC2 C5 with Intel® Xeon® Scalable Processor
• C5 (“Skylake”): 72 vCPUs, 144 GiB memory, 12 Gbps to EBS, AVX-512
• C4 (“Haswell”): 36 vCPUs, 60 GiB memory, 4 Gbps to EBS
• C5 vs. C4: 2X vCPUs, 2X performance, 3X throughput, 2.4X memory
C5: Next Generation Compute-Optimized Instances with Intel® Xeon® Scalable Processor
AWS compute-optimized instances support the new Intel® AVX-512 advanced instruction set, enabling you to more efficiently run vector processing workloads with single and double floating point precision, such as AI/machine learning or video processing.
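The multipliers quoted on this slide can be checked from the C4 and C5 figures. A minimal sketch (the dictionary layout and the names `c4`, `c5` and `ratios` are illustrative, not from the deck; the "3X throughput" figure is assumed here to refer to EBS bandwidth, and "2X performance" cannot be derived from the listed specs):

```python
# Largest-instance figures quoted on the slide.
c4 = {"vcpus": 36, "memory_gib": 60, "ebs_gbps": 4}
c5 = {"vcpus": 72, "memory_gib": 144, "ebs_gbps": 12}

def ratios(new, old):
    """Return new/old for every key of `old`."""
    return {key: new[key] / old[key] for key in old}

c5_vs_c4 = ratios(c5, c4)
for key, value in c5_vs_c4.items():
    print(f"{key}: {value:.1f}X")
```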
8. EU (Ireland) Region Linux On Demand Pricing
      Instance      vCPU  ECU   Memory (GiB)  Instance Storage (GB)  Linux/UNIX Usage
CPU   c5.large      2     8     4             EBS Only               $0.096 per Hour
      c5.xlarge     4     16    8             EBS Only               $0.192 per Hour
      c5.2xlarge    8     31    16            EBS Only               $0.384 per Hour
      c5.4xlarge    16    62    32            EBS Only               $0.768 per Hour
      c5.9xlarge    36    139   72            EBS Only               $1.728 per Hour
      c5.18xlarge   72    278   144           EBS Only               $3.456 per Hour
GPU   p2.xlarge     4     12    61            EBS Only               $0.972 per Hour
      p2.8xlarge    32    94    488           EBS Only               $7.776 per Hour
      p2.16xlarge   64    188   732           EBS Only               $15.552 per Hour
      p3.2xlarge    8     23.5  61            EBS Only               $3.305 per Hour
      p3.8xlarge    32    94    244           EBS Only               $13.22 per Hour
      p3.16xlarge   64    188   488           EBS Only               $26.44 per Hour
Source - https://aws.amazon.com/ec2/pricing/on-demand/?refid=em_67469
As of 19th January 2018
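To compare these on-demand prices, it can help to normalise them. A small sketch using a handful of rows copied from the pricing above (the helper names `per_vcpu_hour` and `price_ratio` are made up for illustration):

```python
# On-demand prices copied from the slide: name -> (vCPUs, USD per hour).
prices = {
    "c5.large": (2, 0.096),
    "c5.18xlarge": (72, 3.456),
    "p2.16xlarge": (64, 15.552),
    "p3.2xlarge": (8, 3.305),
    "p3.16xlarge": (64, 26.44),
}

def per_vcpu_hour(name):
    """Hourly price divided by the number of vCPUs."""
    vcpus, price = prices[name]
    return price / vcpus

def price_ratio(a, b):
    """How many times more expensive instance `a` is than instance `b`."""
    return prices[a][1] / prices[b][1]

for name in prices:
    print(f"{name}: ${per_vcpu_hour(name):.3f} per vCPU-hour")
print(f"p3.16xlarge vs c5.18xlarge: {price_ratio('p3.16xlarge', 'c5.18xlarge'):.2f}X")
```

With these figures, C5 pricing is linear in vCPUs, and the largest P3 instance costs roughly 7.65 times the largest C5 instance per hour on demand.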
9. EC2 Spot instances for training & inference
GPU - p3.16xlarge CPU - c5.18xlarge
C5 CPU resources available for up to 19.8X cheaper, over a 3-month average
As of 19th January 2018
Source – Spot Pricing History Tool in EC2 Console https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances-history.html
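The same Spot price history is also available programmatically. The sketch below is one possible way to pull it with boto3's `describe_spot_price_history` paginator and average it per instance type. The region (`eu-west-1`), the 90-day window and the helper names are assumptions, not taken from the slide, and the average is an unweighted mean over price-change records across Availability Zones rather than a time-weighted one, so it will not necessarily reproduce the 19.8X figure.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

def fetch_spot_history(instance_types, region="eu-west-1", days=90):
    """Yield Spot price records for the last `days` days (needs AWS credentials)."""
    import boto3  # imported lazily so average_prices() works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    end = datetime.now(timezone.utc)
    paginator = ec2.get_paginator("describe_spot_price_history")
    pages = paginator.paginate(
        InstanceTypes=instance_types,
        ProductDescriptions=["Linux/UNIX"],
        StartTime=end - timedelta(days=days),
        EndTime=end,
    )
    for page in pages:
        yield from page["SpotPriceHistory"]

def average_prices(records):
    """Unweighted mean of SpotPrice per instance type."""
    totals = defaultdict(lambda: [0.0, 0])
    for record in records:
        entry = totals[record["InstanceType"]]
        entry[0] += float(record["SpotPrice"])  # the API returns prices as strings
        entry[1] += 1
    return {itype: total / count for itype, (total, count) in totals.items()}

# Example (requires credentials, so left commented out):
# avg = average_prices(fetch_spot_history(["p3.16xlarge", "c5.18xlarge"]))
# print(avg["p3.16xlarge"] / avg["c5.18xlarge"])
```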
11. Convolutional Neural Networks (CNN)
LeCun, 1998: handwritten digit recognition, 32x32 pixels
Convolution and pooling reduce dimensionality
https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/
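To make "convolution and pooling reduce dimensionality" concrete, here is a dependency-free sketch that pushes a 32x32 input through one unpadded convolution and one 2x2 max pooling step. The 5x5 filter size, the dummy checkerboard input and the function names are assumptions for illustration only; a real network would use a framework layer with learned weights.

```python
def conv2d_valid(image, kernel):
    """Unpadded, stride-1 2D cross-correlation on nested lists (square kernel)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(k) for dj in range(k))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling with window and stride equal to `size`."""
    return [
        [
            max(feature_map[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]

image = [[(r + c) % 2 for c in range(32)] for r in range(32)]  # dummy 32x32 input
kernel = [[1] * 5 for _ in range(5)]                           # dummy 5x5 filter

conv = conv2d_valid(image, kernel)   # 32 - 5 + 1 = 28 rows and columns
pooled = max_pool(conv)              # 28 / 2 = 14 rows and columns
print(len(image), len(conv), len(pooled))
```

Each step shrinks the feature map: 32x32 becomes 28x28 after the convolution and 14x14 after pooling.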
12. CNN: Object Classification
• Expedia has over 10 million images from 300,000 hotels
• Using great images boosts conversion
• Using Keras and EC2 GPU instances, they fine-tuned a pre-trained Convolutional Neural Network using 100,000 images
• Hotel descriptions now automatically feature the best available images
https://news.developer.nvidia.com/expedia-ranking-hotel-images-with-deep-learning/
17. AI helps find missing kids
Solution
Thorn and AWS partner MemSQL built an age-progressed facial recognition service using data analytics and deep learning on AWS compute-optimized C5 to identify missing children by matching images against child abuse material. Using the compute power of Intel® Xeon® Scalable processors in C5, Thorn is able to match thousands of pictures per second, in real time, against a database of pictures that is being constantly updated. The goal is to eventually integrate this capability into Spotlight, Thorn’s trafficking investigations tool that is used by more than 5,300 officers in over 18 countries.
Outcome
Thorn can apply 5,000 data points to a single face and classify, correlate, and match the image to an image in a database. As a result, the organization’s solution can make a positive image match in 200 milliseconds, compared to 20 minutes previously.
Spotlight identifies an average of 5 kids per day.
Source: https://itpeernetwork.intel.com/digital-defenders-fight-child-exploitation/
Thorn (www.wearethorn.org)
Non-profit organization, United States, 350 volunteers/members
Thorn, a global nonprofit organization headquartered in Los Angeles, CA, joins forces with the sharpest minds from tech, non-profit, government and law enforcement to stop the spread of child sexual exploitation and abuse material and stand up to child traffickers.
MemSQL (www.memsql.com), partner
MemSQL is a real-time data warehouse for cloud and on-premises that delivers immediate insights across live and historical data.
19. Long Short Term Memory Networks (LSTM)
• An LSTM neuron computes the output based on the input and a previous state
• LSTM networks have memory
• They’re great at predicting sequences, e.g. machine translation
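The "output depends on the input and a previous state" idea can be sketched with a single scalar LSTM cell using the standard gate equations. This is a toy illustration, not the deck's code: the weights are arbitrary, and the names `lstm_step` and `params` are invented here.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell; returns the new (output, cell state)."""
    def gate(name, squash):
        w, u, b = params[name]  # input weight, recurrent weight, bias
        return squash(w * x + u * h_prev + b)

    i = gate("input", sigmoid)        # how much new information to write
    f = gate("forget", sigmoid)       # how much old state to keep
    o = gate("output", sigmoid)       # how much state to expose
    g = gate("candidate", math.tanh)  # the new information itself
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

# Arbitrary illustrative weights, identical for every gate.
params = {name: (0.5, 0.5, 0.0)
          for name in ("input", "forget", "output", "candidate")}

h, c = 0.0, 0.0
outputs = []
for x in [1.0, 0.0, 0.0]:   # a single pulse followed by zeros
    h, c = lstm_step(x, h, c, params)
    outputs.append(h)
print(outputs)
```

With these weights, a zero input produces a zero output when the state is empty, but a non-zero output once an earlier input has been stored in the cell state. That carried state is the "memory" the bullets refer to.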
21. GAN: Welcome to the (un)real world, Neo
Generating new “celebrity” faces
https://github.com/tkarras/progressive_growing_of_gans
From semantic map to 2048x1024 picture
https://tcwang0509.github.io/pix2pixHD/
23. Apache MXNet: Open Source library for Deep Learning
• Programmable: simple syntax, multiple languages
• Portable: highly efficient models for mobile and IoT
• High Performance: near linear scaling across hundreds of GPUs
• Most Open: accepted into the Apache Incubator
• Best On AWS: optimized for Deep Learning on AWS
MXNet 1.0 released on December 4th
24. Input Output
[Diagram: MXNet building blocks connecting inputs to outputs]
• mx.sym.FullyConnected(data, num_hidden=128)
• mx.sym.Convolution(data, kernel=(5,5), num_filter=20)
• mx.sym.Pooling(data, pool_type="max", kernel=(2,2), stride=(2,2)): for the 2x2 window 4 2 / 2 0, Max = 4 and Avg = 2
• mx.sym.Activation(data, act_type="xxxx"), with act_type one of "relu", "tanh", "sigmoid", "softrelu"
• mx.symbol.Embedding(data, input_dim, output_dim = k): e.g. “Queen” mapped to a vector (0.2, -0.1, ..., 0.7), with cos(w, queen) = cos(w, king) - cos(w, man) + cos(w, woman)
• lstm.lstm_unroll(num_lstm_layer, seq_len, len, num_hidden, num_embed)
• mx.sym.SoftmaxOutput
• mx.model.FeedForward, model.fit
Inputs: Image, Video, Speech, Text, Events
Outputs: Neural Art, Face Search, Image Segmentation, Image Caption (“People Riding Bikes”), Image Labels (Bicycle, People, Road, Sport), Machine Translation (“People Riding Bikes” to “Οι άνθρωποι ιππασίας ποδήλατα”)
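The pooling numbers and activation names on this slide can be reproduced without MXNet. The sketch below is plain Python, not the MXNet implementation: it pools the slide's 2x2 window (4 2 / 2 0) and defines the four activation types by their usual formulas, assuming "softrelu" means log(1 + exp(x)) as described in the MXNet documentation.

```python
import math

# 2x2 pooling window shown on the slide.
patch = [[4, 2], [2, 0]]
values = [v for row in patch for v in row]

max_pooled = max(values)                 # "4=Max" on the slide
avg_pooled = sum(values) / len(values)   # "2=Avg" on the slide

# The four act_type values listed for mx.sym.Activation, as plain functions.
activations = {
    "relu": lambda x: max(0.0, x),
    "tanh": math.tanh,
    "sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "softrelu": lambda x: math.log1p(math.exp(x)),
}

print(max_pooled, avg_pooled)
for name, fn in activations.items():
    print(name, [round(fn(x), 3) for x in (-1.0, 0.0, 1.0)])
```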
28. Demos
- Hello World: learn a synthetic data set
- Classify images with pre-trained models
- Classify MNIST with an MLP and a CNN
https://github.com/juliensimon/dlnotebooks