Andrew Ng
What data scientists should know
about deep learning
Images Speech Behavior
Applications of Deep Learning
Computer vision:
Find coffee mug
Early, poor computer vision results.
Neurons in the brain
Output
Deep Learning: Neural network
Computer vision
Yes/No
(Mug or not?)
What is a neural network?
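Data (image) → x1 → x2 → x3 → x4 → x5, with weights W1, W2, W3, W4 between successive layers
$x_1 \in \mathbb{R}^5,\ x_2 \in \mathbb{R}^5$
$x_2 = (W_1 \times x_1)_+$
$x_3 = (W_2 \times x_2)_+$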
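The layer-by-layer computation on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the model behind the talk: it assumes the trailing "+" means "keep the positive part" (a rectifier, max(0, z)), and the layer width of five, the random weights, and the input vector are all hypothetical stand-ins chosen only so the example runs.

```python
import random

def relu(v):
    """Elementwise positive part: (z)_+ = max(0, z). Assumed meaning of the slide's '+'."""
    return [max(0.0, z) for z in v]

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(weights, x1):
    """Apply x_{k+1} = (W_k x_k)_+ for each weight matrix in turn.

    Returns the activations of every layer: [x1, x2, ..., x5] for four matrices.
    """
    layers = [x1]
    for W in weights:
        layers.append(relu(matvec(W, layers[-1])))
    return layers

# Hypothetical setup: five units per layer and random weights (illustration only).
random.seed(0)
dim = 5
weights = [[[random.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
           for _ in range(4)]  # stands in for W1..W4
x1 = [0.2, -0.1, 0.5, 0.0, 0.9]  # stand-in for image features

layers = forward(weights, x1)
x5 = layers[-1]  # final layer; a real classifier would map this to Yes/No
```

In a trained network the weights would be learned from labeled examples rather than drawn at random, and the last layer would be reduced to a single mug-or-not decision.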
Supervised learning (learning from labeled data)
X → Y
Image → Yes/No (Is it a coffee mug?)
Data: images labeled Yes or No
Why is Deep Learning taking off?
Engine: Large neural networks
Fuel: Labeled data (x,y pairs)
It’s (almost) all about scale.
Rocket engines: Deep Learning driven by scale
Year   Connections    Hardware
2007   1 million      CPU
2008   10 million     GPU
2011   1 billion      Cloud (many CPUs)
2015   100 billion    HPC (many GPUs)
A yellow bus driving down a road
with green trees and green grass
in the background.
Living room with white couch and
blue carpeting. The room in the
apartment gets some afternoon sun.
Can a computer understand these pictures?
Supervised learning (learning from labeled data)
X → Y
Image → Caption
A yellow bus driving down a
road with green trees and
green grass in the background.
Learning to Caption
Data (image) → Caption: "A yellow bus driving down…."
Learning to answer questions
X → Y
(Image, Question) → Answer
Question: What is the color of the bus?
Answer: The bus is red.
Learning to Answer Questions
Data (image, question): "What is the color of the bus?" → "The bus is red ...."
Images Speech Behavior
Applications of Deep Learning
Speech recognition
Data (audio) → Audio features → Phonemes → Language model → Transcript
Baidu Deep Speech: The rocket engine
Data (audio) → Transcript
Baidu Deep Speech: The rocket fuel (data)
Dataset        Hours of data
WSJ            80
Switchboard    300
Fisher         2000
Deep Speech    >100,000 (synthesized data)
Speech recognition performance
[Plot: Error]
With 99% accuracy, we could
redesign your cellphone using a
speech interface.
Most people don’t understand the difference
between 95% accuracy and 99% accuracy.
99% is game changing.
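One way to see the gap between 95% and 99% is to count expected mistakes. The sketch below assumes the accuracy figures are per-word recognition rates, which the slide does not state, and uses a hypothetical 100-word message. Under that assumption, 95% leaves about five wrong words per hundred and 99% leaves about one, a fivefold drop in errors.

```python
def expected_errors(accuracy, n_words):
    """Expected number of misrecognized words, treating accuracy as a per-word rate."""
    return (1.0 - accuracy) * n_words

n_words = 100  # hypothetical message length, chosen for easy arithmetic

for acc in (0.95, 0.99):
    errors = expected_errors(acc, n_words)
    print(f"{acc:.0%} accuracy: about {errors:.0f} errors per {n_words} words")

# How many times fewer errors at 99% than at 95% (approximately 5).
ratio = expected_errors(0.95, n_words) / expected_errors(0.99, n_words)
```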
Speech will transform the Internet of Things
Home appliances (e.g., TV, microwave, music player, ….)
Car interfaces
Wearables
Images Speech Behavior
Applications of Deep Learning
Deep Learning and big data
Web search/Advertising, Datacenter management, Computer security
It’s (almost) all about scale.
Why deep learning
How do data science techniques scale with amount of data?
[Plot: Performance (y-axis) vs. Amount of data (x-axis); curves: Older learning algorithms, Deep learning]
The problem of scale: Mobile devices
[Figure panels: Image, 72 keypoints, Face segmentation]
                Binary size               Speed
Desktop model   153 MB                    1.25 fps
Mobile model    800 KB (190x reduction)   25 fps (20x speedup)
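The reduction and speedup figures follow directly from the table entries. A quick check, assuming 1 MB = 1000 KB (the slide does not say which convention it uses), gives a size ratio a little above the rounded 190x quoted on the slide and a speedup of exactly 20x.

```python
# Figures taken from the slide; the MB-to-KB conversion factor is an assumption.
desktop_size_kb = 153 * 1000  # 153 MB expressed in KB
mobile_size_kb = 800

size_reduction = desktop_size_kb / mobile_size_kb  # 191.25, quoted as ~190x

desktop_fps = 1.25
mobile_fps = 25

speedup = mobile_fps / desktop_fps  # 20.0

print(f"size reduction: {size_reduction:.2f}x, speedup: {speedup:.1f}x")
```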
Faceyou (available on the Apple App Store)
The future of Deep Learning
The AI rocket
Images Speech Behavior
We have superpowers
AI Data Science

Andrew Ng, Chief Scientist at Baidu