C4_W2.pdf

Copyright Notice
These slides are distributed under the Creative Commons License.
DeepLearning.AI makes these slides available for educational purposes. You may not use or distribute
these slides for commercial purposes. You may make copies of these slides and use or distribute them for
educational purposes as long as you cite DeepLearning.AI as the source of the slides.
For the rest of the details of the license, see https://creativecommons.org/licenses/by-sa/2.0/legalcode

deeplearning.ai
Case Studies
Why look at
case studies?

Andrew Ng
Outline
Classic networks:
• LeNet-5
ResNet
Inception
• AlexNet
• VGG

deeplearning.ai
Case Studies
Classic networks

Andrew Ng
LeNet - 5
⋮ ⋮
"
#
32×32 ×1 28×28×6 14×14×6 10×10×16 5×5×16
120 84
5 × 5
s = 1
f = 2
s = 2
avg pool
5 × 5
s = 1
avg pool
f = 2
s = 2
[LeCun et al., 1998. Gradient-based learning applied to document recognition]

Andrew Ng
AlexNet
= ⋮ ⋮
227×227 ×3
55×55 ×96 27×27 ×96 27×27 ×256 13×13 ×256
13×13 ×384 13×13 ×384 13×13 ×256 6×6 ×256 9216 4096
⋮
4096
11 × 11
s = 4
3 × 3
s = 2
MAX-POOL
5 × 5
same
3 × 3
s = 2
MAX-POOL
3 × 3
same
3 × 3 3 × 3 3 × 3
s = 2
MAX-POOL
Softmax
1000
[Krizhevsky et al., 2012. ImageNet classification with deep convolutional neural networks]

Andrew Ng
VGG - 16
224×224 ×3
CONV = 3×3 filter, s = 1, same MAX-POOL = 2×2 , s = 2
[CONV 64]
×2
224×224×64
POOL
112×112 ×64
[CONV 128]
×2
112×112 ×128
POOL
56×56 ×128
[CONV 256]
×3
56×56 ×256
POOL
28×28 ×256
[CONV 512]
×3
28×28 ×512
POOL
14×14×512
[CONV 512]
×3
14×14 ×512
POOL
7×7×512 FC
4096
FC
4096
Softmax
1000
[Simonyan & Zisserman 2015. Very deep convolutional networks for large-scale image recognition]

deeplearning.ai
Case Studies
Residual Networks
(ResNets)

Andrew Ng
Residual block
![#] ![#%&]
'[#%(] = *[#%(] ![#] + ,[#%(] ![#%(] = -('[#%(]) '[#%&] = *[#%&]![#%(] + ,[#%&] ![#%&] = -('[#%&])
![#%(]
[He et al., 2015. Deep residual networks for image recognition]

Andrew Ng
Residual Network
x ![#]
# layers
training
error
Plain
# layers
training
error
ResNet

deeplearning.ai
Case Studies
Why ResNets
work

Andrew Ng
Why do residual networks work?

Andrew Ng
ResNet
Plain
ResNet

deeplearning.ai
Case Studies
Network in Network
and 1×1 convolutions

Andrew Ng
Why does a 1 × 1 convolution do?
1 2 3 6 5 8
3 5 5 1 3 4
2 1 3 4 9 3
4 7 8 5 7 9
1 5 3 7 4 8
5 4 9 8 3 5
2
∗ =
∗ =
6 × 6
6 × 6 × 32 1 × 1 × 32 6 × 6 × # filters
[Lin et al., 2013. Network in network]

Andrew Ng
Using 1×1 convolutions
28 × 28 × 192
28 × 28 × 32
ReLU
CONV 1 × 1
32
[Lin et al., 2013. Network in network]

deeplearning.ai
Case Studies
Inception network
motivation

Andrew Ng
Motivation for inception network
28 × 28 × 192
1 × 1
3 × 3
5 × 5
MAX-POOL
128
32
32
64
28
28
[Szegedy et al. 2014. Going deeper with convolutions]

Andrew Ng
The problem of computational cost
28 × 28 × 192
CONV
5 × 5,
same,
32 28 × 28 × 32

Andrew Ng
Using 1×1 convolution
28 × 28 × 192
CONV
1 × 1,
16,
1 × 1 × 192 28 × 28 × 16
CONV
5 × 5,
32,
5 × 5 × 16 28 × 28 × 32

deeplearning.ai
Case Studies
Inception network

Andrew Ng
Inception module
Previous
Activation
1 × 1
CONV
1 × 1
CONV
3 × 3
CONV
1 × 1
CONV
5 × 5
CONV
MAXPOOL
3 × 3,s = 1
same
1 × 1
CONV
Channel
Concat

Andrew Ng
Inception network
[Szegedy et al., 2014, Going Deeper with Convolutions]

Andrew Ng
http://knowyourmeme.com/memes/we-need-to-go-deeper

MobileNet
Convolutional
Neural Networks

Andrew Ng
Motivation for MobileNets
• Low computational cost at deployment
• Useful for mobile and embedded vision
applications
• Key idea: Normal vs. depthwise-
separable convolutions
[Howard et al. 2017, MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications]

Andrew Ng
Normal Convolution
Computational cost = #filter params x # filter positions x # of filters
6 x 6 x 3
*
3 x 3 x 3
=
4 x 4
4 x 4 x 5

Andrew Ng
Depthwise Separable Convolution
* =
Normal Convolution
* =
*
Depthwise Pointwise

Andrew Ng
6 x 6 x 3
*
3 x 3
=
4 x 4 x 3
Depthwise Convolution

Andrew Ng
* =
Pointwise Convolution
* =

Andrew Ng
* =
4 x 4 x 3
1 x 1 x 3
4 x 4 x 5
4 x 4

Andrew Ng
Cost Summary
[Howard et al. 2017, MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications]
Cost of depthwise separable convolution
Cost of normal convolution

Andrew Ng
* =
* =

MobileNet
Architecture
Convolutional
Neural Networks

MobileNet
Andrew Ng
[Sandler et al. 2019, MobileNetV2: Inverted Residuals and Linear Bottlenecks]
MobileNet v1
MobileNet v2
Depthwise Projection
Expansion
Residual Connection

Andrew Ng
MobileNet v2 Bottleneck
Residual Connection
Expansion Depthwise Pointwise

Andrew Ng
MobileNet v2 Full Architecture
conv2d
conv2d
1 x 1
avgpool
7 x 7
conv2d
1 x 1

EfficientNet
Convolutional
Neural Networks

Andrew Ng
[Tan and Le, 2019, EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks]
Baseline
Higher Resolution
Deeper
Wider
EfficientNet
ො
𝑦 ො
𝑦
ො
𝑦
ො
𝑦
Compound scaling
ො
𝑦

deeplearning.ai
Practical advice for
using ConvNets
Transfer Learning

deeplearning.ai
using ConvNets
Data augmentation

Andrew Ng
Common augmentation method
Mirroring
Random Cropping Rotation
Shearing
Local warping
…

Andrew Ng
Color shifting
+20,-20,+20
-20,+20,+20
+5,0,+50

Andrew Ng
Implementing distortions during training

deeplearning.ai
using ConvNets
The state of
computer vision

Andrew Ng
Data vs. hand-engineering
Two sources of knowledge
• Labeled data
• Hand engineered features/network architecture/other components

Andrew Ng
Tips for doing well on benchmarks/winning
competitions
Ensembling
• Train several networks independently and average their outputs
Multi-crop at test time
• Run classifier on multiple versions of test images and average results

Andrew Ng
Use open source code
• Use architectures of networks published in the literature
• Use pretrained models and fine-tune on your dataset
• Use open source implementations if possible

C4_W2.pdf

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to C4_W2.pdf

Similar to C4_W2.pdf (20)

Recently uploaded

Recently uploaded (20)

C4_W2.pdf