2. 2
ABOUT ME
- Deep Learning Solution Architect @ NVIDIA
- Supporting delivery of AI / Deep Learning solutions
- 10 years' experience delivering Machine Learning at all scales (from embedded and mobile to Big Data)
- My past experience:
- Capgemini: https://goo.gl/MzgGbq
- Jaguar Land Rover Research: https://goo.gl/ar7LuU
Adam Grzywaczewski – adamg@nvidia.com
4. 4
NEURAL NETWORKS ARE NOT NEW
They just historically never worked well
[Chart: Algorithm performance in the small-data regime — Accuracy vs. Dataset Size; series: ML1]
5. 5
NEURAL NETWORKS ARE NOT NEW
They just historically never worked well
[Chart: Algorithm performance in the small-data regime — Accuracy vs. Dataset Size; series: ML1, ML2, ML3]
6. 6
NEURAL NETWORKS ARE NOT NEW
They just historically never worked well
[Chart: Algorithm performance in the small-data regime — Accuracy vs. Dataset Size; series: Small NN, ML1, ML2, ML3]
7. 7
NEURAL NETWORKS ARE NOT NEW
They just historically never worked well
[Chart: Algorithm performance in the small-data regime — Accuracy vs. Dataset Size; series: Small NN, ML1, ML2, ML3]
The MNIST database (1999) contains 60,000 training images and 10,000 testing images.
8. 8
NEURAL NETWORKS ARE NOT NEW
Data and Compute Availability Changed that
[Chart: Algorithm performance in the big-data regime — Accuracy vs. Dataset Size; series: Small NN, ML1, ML2, ML3]
9. 9
NEURAL NETWORKS ARE NOT NEW
Data and model size are the keys to accuracy
[Chart: Algorithm performance in the big-data regime — Accuracy vs. Dataset Size; series: Small NN, ML1, ML2, ML3, Big NN]
10. 10
NEURAL NETWORKS ARE NOT NEW
Exceeding human-level performance
[Chart: Algorithm performance in the large-data regime — Accuracy vs. Dataset Size; series: Small NN, ML1, ML2, ML3, Big NN, Bigger NN]
11. 11
EXPLODING DATASETS
Logarithmic relationship between the dataset size and accuracy
Sun, Chen, et al. "Revisiting Unreasonable Effectiveness of Data in Deep Learning Era." arXiv preprint arXiv:1707.02968 (2017).
Shazeer, Noam, et al. "Outrageously large neural networks: The sparsely-gated mixture-of-experts layer." arXiv preprint arXiv:1701.06538 (2017).
Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
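The logarithmic relationship above can be sketched numerically: fit accuracy as a linear function of log10(dataset size) and extrapolate. The data points below are illustrative placeholders, not figures from the cited papers.

```python
import numpy as np

# Illustrative (invented) points echoing the logarithmic trend reported
# by Sun et al. (2017): each 10x increase in data adds a fixed accuracy gain.
sizes = np.array([1e4, 1e5, 1e6, 1e7, 1e8])
accuracy = np.array([0.55, 0.63, 0.71, 0.79, 0.87])

# Fit: accuracy = a * log10(size) + b
a, b = np.polyfit(np.log10(sizes), accuracy, deg=1)

def predicted_accuracy(n):
    """Extrapolate accuracy for a dataset of n examples under the log fit."""
    return a * np.log10(n) + b

print(f"accuracy gain per 10x data: {a:.3f}")
print(f"predicted at 1e9 examples: {predicted_accuracy(1e9):.3f}")
```

A fit like this makes the practical point of the slide concrete: gains per unit of data shrink, so each further accuracy point costs an order of magnitude more examples.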
12. 12
NEURAL NETWORK COMPLEXITY IS EXPLODING
To Tackle Increasingly Complex Challenges
2015 – Microsoft ResNet, Superhuman Image Recognition: 7 ExaFLOPS, 60 Million Parameters
2016 – Baidu Deep Speech 2, Superhuman Voice Recognition: 20 ExaFLOPS, 300 Million Parameters
2017 – Google Neural Machine Translation, Near-Human Language Translation: 100 ExaFLOPS, 8700 Million Parameters
15. 15
AI IS THE NEW ELECTRICITY
“Just as electricity transformed almost everything 100 years ago, today I actually
have a hard time thinking of an industry that I don’t think AI will transform in the
next several years, …”
Andrew Ng, Founder of Google Brain
Affecting every aspect of our lives
16. 16
HOW TO BUILD AI PRODUCTS?
Overview of factors that make AI projects successful.
Selecting AI projects.
18. 18
WHAT TYPE OF A PROBLEM IS IT?
Supervised Learning: learning the mapping from data to labels
Based on a presentation from Andrew Ng
Data → Labels
Image → Names of objects in the image
Speech → Text
Video (e.g. football game) → Event statistics (e.g. number of passes)
Mortgage application → Mortgage risk
Text → Speech
English → French
Click-through data → Content recommendations
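The data-to-label mapping above can be made concrete with a toy classifier. This is a minimal sketch with invented numbers for the mortgage-risk row: each training example pairs an input (the data) with a target (the label), and prediction here is a simple 1-nearest-neighbour lookup.

```python
import math

# Invented toy dataset: (loan_amount_k, applicant_income_k) -> risk label
training = [
    ((100, 80), "low"),
    ((250, 60), "high"),
    ((120, 90), "low"),
    ((300, 50), "high"),
]

def predict(x):
    """1-nearest-neighbour: return the label of the closest training input."""
    nearest = min(training, key=lambda pair: math.dist(pair[0], x))
    return nearest[1]

print(predict((110, 85)))  # close to the "low"-risk examples
```

Any supervised method, from logistic regression to a deep network, is learning a better version of exactly this mapping.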
19. 19
WHAT TYPE OF A PROBLEM IS IT?
Lots of noise, little structure - most probably not (though this is changing with self-normalising NNs)
Little noise, complex structure - most probably yes
Noise vs Structure
Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep learning. MIT Press, 2016.
20. 20
WHAT TYPE OF A PROBLEM IS IT?
What is possible today?
Based on a presentation from Andrew Ng
“If a typical person can do a mental task with less than one second of thought, we
can probably automate it using AI either now or in the near future.”
Andrew Ng, Founder of Google Brain
21. 21
DO YOU HAVE ENOUGH LABELED DATA?
The Achilles heel of deep learning: You need a lot of labeled data.
Based on a presentation from Bryan Catanzaro
Without a large dataset, deep learning isn’t likely to succeed.
Labels:
Getting someone to decide the “right” answer can be hard (think of medical imaging)
If a dataset requires skilled labor to produce labels, this limits scale and drives up cost
22. 22
DO YOU HAVE ENOUGH LABELED DATA?
“As of 2016, a rough rule of thumb is that a supervised deep learning algorithm will
generally achieve acceptable performance with around 5,000 labeled examples per
category, and will match or exceed human performance when trained with a
dataset containing at least 10 million labeled examples.”
“Working successfully with datasets smaller than this is an important research
area, focusing in particular on how we can take advantage of large quantities of
unlabeled examples, with unsupervised or semi-supervised learning.”
Ian Goodfellow, Yoshua Bengio, Aaron Courville
How much data is enough?
Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep learning. MIT Press, 2016.
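The rule of thumb quoted above is easy to turn into a sanity check on your own data. The helper and dataset below are a hypothetical sketch, not part of the original slide:

```python
from collections import Counter

# Rule of thumb from Goodfellow et al. (2016): roughly 5,000 labeled
# examples per category for acceptable supervised performance.
PER_CLASS_RULE_OF_THUMB = 5_000

def under_resourced_classes(labels):
    """Return classes whose example count falls below the rule of thumb."""
    counts = Counter(labels)
    return {c: n for c, n in counts.items() if n < PER_CLASS_RULE_OF_THUMB}

# Invented label counts for illustration
labels = ["cat"] * 6_000 + ["dog"] * 1_200
print(under_resourced_classes(labels))  # "dog" is short of examples
```

Classes flagged by a check like this are candidates for more labelling effort, data augmentation, or the semi-supervised approaches the quote mentions.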
23. 23
WHAT LEVEL OF ACCURACY DO YOU NEED?
How much accuracy do you need? (mortgage risk calculation - high; celebrity portal - low)
Aim for the lowest level acceptable for the product
What is the measure:
• Accuracy (% correct)
• Coverage (% of examples processed)
• Precision (% of detections that are right)
• Recall (% of objects that are detected)
• Amount of error (for regression problems)
• What protective mechanisms do you need to safeguard the system from unavoidable
prediction error?
Defining and measuring accuracy
Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep learning. MIT Press, 2016.
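The metrics in the list above can all be computed from raw counts for a binary detector. This is a minimal sketch; the counts passed in at the bottom are illustrative.

```python
# Compute the slide's metrics from confusion-matrix counts:
# tp/fp/fn/tn for processed examples, plus any the system abstained on.
def metrics(tp, fp, fn, tn, abstained=0):
    processed = tp + fp + fn + tn
    total = processed + abstained
    return {
        "accuracy": (tp + tn) / processed,   # % correct on processed examples
        "coverage": processed / total,       # % of examples processed at all
        "precision": tp / (tp + fp),         # % of detections that are right
        "recall": tp / (tp + fn),            # % of real objects detected
    }

print(metrics(tp=80, fp=20, fn=10, tn=90, abstained=50))
```

Separating coverage from accuracy matters when the system is allowed to defer hard cases to a human: accuracy can be bought cheaply by shrinking coverage.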
24. 24
CAN SOMETHING SIMPLER WORK?
Default Baseline Model
• Build the end-to-end pipeline ASAP and use a non-deep-learning baseline model
• Measure accuracy from day 1
• You need a baseline on which to improve:
• A simple model that you know very well (linear regression, logistic regression, random
forest).
• Boosted decision trees are a very good baseline model.
• How does the baseline perform in relation to your target accuracy?
• How does the baseline perform in relation to human accuracy?
Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep learning. MIT Press, 2016.
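The cheapest possible "day 1" baseline in the spirit of the slide above is a majority-class predictor: it requires no features at all, yet sets a floor that any real model, simple or deep, must beat. The class and data below are a hypothetical sketch:

```python
from collections import Counter

class MajorityBaseline:
    """Predict the most common training label for every input."""

    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label] * len(X)

# Invented imbalanced labels: 70% "ok", 30% "fraud"
y_train = ["ok"] * 70 + ["fraud"] * 30
model = MajorityBaseline().fit(None, y_train)

y_test = ["ok"] * 8 + ["fraud"] * 2
preds = model.predict(y_test)
baseline_acc = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
print(f"baseline accuracy: {baseline_acc:.2f}")  # the bar to beat
```

On imbalanced data this trivial baseline can look deceptively strong (80% here), which is exactly why measuring accuracy from day 1 against it keeps later deep-learning gains honest.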
26. 26
TALK TO US ABOUT YOUR USE CASE
We can help (POC, partner network, training and more)
27. 27
NVIDIA GPU DEEP LEARNING
EVERYWHERE, EVERY PLATFORM
TESLA
Servers in every
shape and size
DGX Systems — The essential
deep learning systems
for instant productivity
CLOUD
Everywhere
28. 28
DEEP LEARNING INSTITUTE
DLI Mission: Help the world to solve the most challenging
problems using AI and deep learning
We help developers, data scientists and engineers get
started in architecting, optimizing, and deploying neural
networks to solve real-world problems in diverse industries
such as autonomous vehicles, healthcare, robotics, media
& entertainment and game development.
29. 29
INNOVATE IN MINUTES, NOT WEEKS
WITH DEEP LEARNING CONTAINERS
Benefits of Containers:
Simplify deployment of
GPU-accelerated applications,
eliminating time-consuming software
integration work
Isolate individual frameworks
or applications
Share, collaborate,
and test applications across
different environments
NVIDIA GPU CLOUD
(NGC)
30. 30
March 26—29, 2018 | Silicon Valley | #GTC18
www.gputechconf.com
CONNECT
Connect with technology
experts from NVIDIA and
other leading organizations
LEARN
Gain insight and valuable
hands-on training through
hundreds of sessions and
research posters
DISCOVER
See how GPU technologies
are creating amazing
breakthroughs in important
fields such as deep learning
INNOVATE
Hear about disruptive
innovations as early-stage
companies and startups
present their work
Don’t miss the world’s most important event for GPU developers
March 26—29, 2018 in Silicon Valley
REGISTRATION IS OPEN AT WWW.GPUTECHCONF.COM