"Methods for Creating Efficient Convolutional Neural Networks," a Presentation from Xnor.ai

© 2019 xnor.ai
Methods for Creating Efficient
Convolutional Neural Networks
Mohammad Rastegari
xnor.ai
May 2019

Approaches to Efficient CNN
• Model design optimization
• Lower Precision (Quantization)
• Binary
• Sparse Models
• Lookup based CNN
• Compact Network Design
• Elastic model, Hierarchical convolution, Dimension-wise convolution
• Training Optimization
• Label Refinery

Convolutional Neural Networks

Number of Operations: 1B – 20B FLOPs
Inference time on CPU: 0.25 – 3 fps
GPU!
Operations used: + − ×

Lower Precision
Reducing Precision: 32-bit → 8-bit → 1-bit
• Saving Memory
• Saving Computation
With 1-bit values, {-1,+1} is encoded as {0,1}:
• MUL → XNOR
• ADD, SUB → Bit-Count (popcount)
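The replacement of MUL by XNOR and of ADD/SUB by a bit count can be checked on a small example. The sketch below is illustrative only: the helper names `pack_bits` and `binary_dot` are hypothetical, and Python integers stand in for machine words.

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into an int, one bit each (+1 -> 1, -1 -> 0)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two {-1,+1} vectors of length n, given their packed bits."""
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask   # bit is 1 where the two signs agree
    matches = bin(xnor).count("1")     # popcount (bit count)
    return 2 * matches - n             # agreements minus disagreements


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]
reference = sum(x * y for x, y in zip(a, b))          # ordinary multiply-add
result = binary_dot(pack_bits(a), pack_bits(b), len(a))
print(reference, result)
```

Both values agree because each product of two signs is +1 exactly when the corresponding bits are equal.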

Why Binary?
Binary Instructions
• AND, OR, XOR, XNOR, Popcount (Bit-Count)
Low Power Device
Easy to Implement in hardware

Theoretical Improvement
Network                  Operations        Memory   Computation
Full Precision           + − ×             1x       1x
Binary Weight Networks   + −               ~32x     ~2x
XNOR-Networks            XNOR, Bit-count   ~32x     ~58x
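A rough back-of-the-envelope check of the memory and computation savings is sketched below. The cost model is an assumption made for illustration: a CPU is taken to process 64 binary operations per machine word, and each output location costs one extra non-binary operation for scaling. The function names are hypothetical.

```python
def memory_saving(full_precision_bits=32, binary_bits=1):
    """Storage ratio per weight: 32-bit value vs. 1-bit sign."""
    return full_precision_bits / binary_bits


def xnor_speedup(channels, kernel_area, word_bits=64):
    """Ratio of standard to binary convolution cost per output location.

    Standard cost: channels * kernel_area multiply-accumulates.
    Binary cost (assumed): the same count divided by word_bits, because one
    machine word handles word_bits XNORs at once, plus one non-binary
    operation for the scaling factor.
    """
    standard_ops = channels * kernel_area
    binary_ops = standard_ops / word_bits + 1
    return standard_ops / binary_ops


saving = memory_saving()
speedup = xnor_speedup(channels=256, kernel_area=3 * 3)
print(f"memory: {saving:.0f}x, computation: {speedup:.2f}x")
```

Under these assumptions a 3×3 convolution over 256 channels lands near 62x, the same order as the ~58x quoted for XNOR-Networks.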

Binary weight filter: $W_B = \operatorname{sign}(W)$, with $W \in \mathbb{R}^{c \times w \times h}$
[Figure: input convolved (*) with the binary filter $W_B$]

Quantization Error
$W_B = \operatorname{sign}(W)$
[Figure: quantization error between $W$ and $W_B$; the value 0.75 appears in the figure]

Optimal Scaling Factor
WB
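One way to see what an optimal scaling factor means: for a real-valued filter W approximated as α·sign(W), the squared error is smallest when α is the mean absolute value of W (the (1/n)‖W‖ℓ1 term that appears in the XNOR-Net derivation). The sketch below checks this numerically; the helper names and the sample weights are made up for illustration.

```python
def binarize(weights):
    """Return (alpha, signs) with alpha = mean(|w|) and signs in {-1, +1}."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1.0 if w >= 0 else -1.0 for w in weights]
    return alpha, signs


def squared_error(weights, alpha, signs):
    """Squared quantization error ||W - alpha * sign(W)||^2."""
    return sum((w - alpha * s) ** 2 for w, s in zip(weights, signs))


weights = [1.0, -0.5, 0.25, -0.25]      # hypothetical filter values
alpha, signs = binarize(weights)
best = squared_error(weights, alpha, signs)
# Any other scaling factor should give a larger error.
worse_low = squared_error(weights, alpha - 0.1, signs)
worse_high = squared_error(weights, alpha + 0.1, signs)
print(alpha, best, worse_low, worse_high)
```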

Binary Input and Binary Weight (XNOR-Net)
Excerpt shown on the slide (reconstructed):
The optimal binary solution is
$$C^* = \operatorname{sign}(Y) = \operatorname{sign}(X^T) \odot \operatorname{sign}(W) = H^{*T} \odot B^* \quad (9)$$
Since $|X_i|$, $|W_i|$ are independent, knowing that $Y_i = X_i W_i$, then $E[|Y_i|] = E[|X_i||W_i|] = E[|X_i|]\,E[|W_i|]$, therefore
$$\gamma^* = \frac{\sum |Y_i|}{n} = \frac{\sum |X_i||W_i|}{n} \approx \left(\frac{1}{n}\|X\|_{\ell 1}\right)\left(\frac{1}{n}\|W\|_{\ell 1}\right) = \beta^* \alpha^* \quad (10)$$
We convolve $A$ with a 2D filter $k \in \mathbb{R}^{w \times h}$, $K = A * k$, where $\forall ij:\ k_{ij} = \frac{1}{w \times h}$. $K$ contains scaling factors $\beta$ for all sub-tensors in the input $I$. $K_{ij}$ corresponds to $\beta$ for a sub-tensor centered at the location $ij$ (across width and height). Once we have obtained the scaling factor $\alpha$ for the weight and $\beta$ for all sub-tensors in $I$ (denoted by $K$), we can approximate the convolution between input $I$ and weight filter $W \in \mathbb{R}^{c \times w \times h}$ mainly using binary operations:
$$I * W \approx (\operatorname{sign}(I) \circledast \operatorname{sign}(W)) \odot K \alpha \quad (11)$$
[Figure: binary input $X_B$ convolved with binary weight $W_B$]
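The approximation of a real-valued dot product by a binary dot product times the two scaling factors β and α can be tried on a toy vector pair. This is only a sketch: the function names and the sample values are hypothetical, and a plain sum of sign products stands in for the XNOR and bit-count step.

```python
def sign(v):
    """Sign in {-1, +1}, treating zero as positive."""
    return 1 if v >= 0 else -1


def l1_mean(vec):
    """(1/n) * L1 norm, used for both beta (input) and alpha (weight)."""
    return sum(abs(v) for v in vec) / len(vec)


def xnor_dot(x, w):
    """Approximate x . w as beta * alpha * (sign(x) . sign(w))."""
    beta = l1_mean(x)
    alpha = l1_mean(w)
    binary = sum(sign(a) * sign(b) for a, b in zip(x, w))
    return beta * alpha * binary


x = [0.5, -1.0, 1.5, -1.0]   # hypothetical input sub-tensor, flattened
w = [0.5, -1.0, 1.0, 1.5]    # hypothetical weight filter, flattened
exact = sum(a * b for a, b in zip(x, w))
approx = xnor_dot(x, w)
print(exact, approx)
```

The two numbers differ because this is an approximation; the gap depends on how far the magnitudes in x and w deviate from their means.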

How to train a CNN with binary filters?

Training Binary Weight Networks
Naive Solution:
1. Train a network with real value parameters
2. Binarize the weight filters

[Chart: Top-1 (%) on ILSVRC2012, axis from 0 to 80; bar shown: Full Precision]

[Figure: Binarization of the network weights W into binary weights W_B]

[Figure: Binarization of the weights W of a network whose outputs include the labels Person and Dog]

Training XnorNet
Train for binary weights:
$W = W - \eta G_W$
[Figure: network weights $W$ and gradients $G_W$]
[XNOR-Networks, Rastegari et al, ECCV2016]
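The update rule W = W − ηG_W can be illustrated with a toy single-neuron example. The sketch assumes, in the spirit of the XNOR-Net training procedure, that the forward pass uses binarized weights while the update is applied to the real-valued weights. The gradient is passed straight through the binarization, which is a simplification, and all names and numbers are hypothetical.

```python
def binarize(weights):
    """Scaled binary weights: alpha * sign(w), with alpha = mean(|w|)."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    return [alpha if w >= 0 else -alpha for w in weights]


def train_step(weights, x, target, lr):
    """One step on the loss 0.5 * (pred - target)^2 for a linear neuron."""
    wb = binarize(weights)                          # forward pass uses binary weights
    pred = sum(b * xi for b, xi in zip(wb, x))
    err = pred - target
    grads = [err * xi for xi in x]                  # G_W, taken w.r.t. the binarized weights
    new_weights = [w - lr * g for w, g in zip(weights, grads)]  # update real-valued W
    return new_weights, 0.5 * err ** 2


weights = [0.5, -0.25, 0.75]
x, target = [1.0, 2.0, -1.0], 1.0
weights_1, loss_0 = train_step(weights, x, target, lr=0.125)
weights_2, loss_1 = train_step(weights_1, x, target, lr=0.125)
print(loss_0, loss_1, weights_1)
```

Keeping the real-valued weights matters because a single small update rarely flips a sign; the real values accumulate the changes until the binarized weight does flip.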

[Chart: bar chart with an axis from 0 to 80; one value is labeled 0.2]
[XNOR-Networks, Rastegari et al, ECCV2016]

Approaches to Efficient CNN
• Model design optimization
• Lower Precision (Quantization)
• Binary (XNOR-Net)
• Sparse Models
• Lookup based CNN
• Compact Network Design
• Elastic model, Hierarchical convolution, Dimension-wise convolution
• Training Optimization
• Label Refinery

Lookup Based CNN

How to train the discrete indexing?

[Chart: Accuracy Rate vs. Speed-up (axis from 0 to 40), Image Classification]
• Few-shot Training
• Few Iteration Training
• On-Device Training

Elastic: Instance Specific Efficiency

Challenging vs. Simple Images

Figure 4: ImageNet Accuracy vs. FLOPs and Parameters. This figure shows our Elastic model can achieve a lower error without any extra (or with lower) computational cost.

Efficient Convolution
• Standard Convolution
• Group Convolution
• Depth-wise Convolution
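The three convolution types differ mainly in how many input channels each filter sees. A small calculation makes the difference in parameter count and multiply-accumulate (MAC) count concrete. The layer sizes below are arbitrary examples and the helper names are hypothetical; bias terms are ignored.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weights in a k x k convolution split into `groups` channel groups."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * k * k


def conv_macs(c_in, c_out, k, h_out, w_out, groups=1):
    """Multiply-accumulates: every weight is used once per output location."""
    return conv_params(c_in, c_out, k, groups) * h_out * w_out


c, k = 64, 3
standard = conv_params(c, c, k)              # groups=1: each filter sees all channels
grouped = conv_params(c, c, k, groups=4)     # each filter sees c/4 channels
depthwise = conv_params(c, c, k, groups=c)   # each filter sees one channel
print(standard, grouped, depthwise)
print(conv_macs(c, c, k, 56, 56), conv_macs(c, c, k, 56, 56, groups=c))
```

Depth-wise convolution is the limiting case of group convolution where the number of groups equals the number of channels, so the cost drops by a factor equal to the channel count.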

Standard Convolution
Dilated Convolution
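A dilated convolution applies the same kernel weights to inputs spaced `dilation` steps apart, which widens the receptive field without adding weights. The 1D sketch below is a minimal illustration with hypothetical function names; it uses the cross-correlation convention common in CNN libraries and no padding.

```python
def effective_kernel(k, dilation):
    """Span of input positions covered by a k-tap kernel at a given dilation."""
    return (k - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1D cross-correlation with a dilated kernel."""
    span = effective_kernel(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(kernel[j] * signal[start + j * dilation]
                       for j in range(len(kernel))))
    return out


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]
standard = dilated_conv1d(signal, kernel, dilation=1)   # taps 1 apart, span 3
dilated = dilated_conv1d(signal, kernel, dilation=2)    # taps 2 apart, span 5
print(standard, dilated)
```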

Depth-wise Dilated Convolution (DDConv)

Standard CNN Block-structure
[Figure: two stacked blocks, each containing Conv, Pool, Activ, and BNorm layers]

Object Boundary Detection
Standard Block structure: Gridding Effect
[Figure from Mehta et al.: panels labeled RGB, without HFF, with HFF]

Hierarchical DDConvs

Object Boundary Detection
Standard Block structure: Gridding Effect
Hierarchical structure: No Gridding Effect
[Figure from Mehta et al.: panels labeled RGB, without HFF, with HFF]

Semantic Object Segmentation
Model       FLOPs    mIOU
HDDConv     1.4 B    69.1
DeepLabV3   2.84 B   71.8

Dimension-wise Convolution
• Dim-Conv: O(c·h·w·k^2)
• Efficient Channel Fusion: O(c^2 + h·w)
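To get a feel for the two complexity terms, they can be evaluated for a concrete layer size and set against a standard convolution. This is only an order-of-magnitude sketch: the O(·) expressions are treated as exact operation counts, the standard convolution is assumed to have equal input and output channel counts (cost c²·h·w·k²), and the layer size is an arbitrary example.

```python
def dim_conv_cost(c, h, w, k):
    """Dimension-wise convolution term from the slide: c * h * w * k^2."""
    return c * h * w * k * k


def channel_fusion_cost(c, h, w):
    """Efficient channel fusion term from the slide: c^2 + h * w."""
    return c * c + h * w


def standard_conv_cost(c, h, w, k):
    """Assumed baseline: c input and c output channels, same spatial size."""
    return c * c * h * w * k * k


c, h, w, k = 64, 56, 56, 3
dim = dim_conv_cost(c, h, w, k)
fusion = channel_fusion_cost(c, h, w)
baseline = standard_conv_cost(c, h, w, k)
print(dim, fusion, baseline, baseline / (dim + fusion))
```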

FLOPs vs. Accuracy on Image Classification
[Chart: Accuracy vs. FLOPs (log millions); labeled points: ResNet-50, XNOR-res50, LCNN, ELASTIC, HDDConv, and HDDConv with DimConv]

Components in a Supervised Learning System
Data
• ImageNet, MSCOCO, SUN, …
• Data Augmentations
Model
• SVM, CNN
• Optimization Techniques (SGD, ADAM, RMSProp, …)
Label
• ?!!

Labels should be:
• Soft
• Informative
• Dynamic
Example label distributions:
Cat → 80%, Ball → 20%
Dog → 60%, Cat → 30%, Bear → 10%
Dog → 60%, Cat → 10%, Bear → 30%
Cat → 1%, Ball → 99%
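Training against a soft label such as Cat → 80%, Ball → 20% usually means using the whole distribution as the target of a cross-entropy loss instead of a one-hot vector. The sketch below shows this under that assumption; the function names and logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, logits):
    """Cross-entropy between a (possibly soft) target and the model output."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target_probs, probs) if t > 0)


soft_label = [0.8, 0.2]                    # Cat 80%, Ball 20%
matching_logits = [math.log(4.0), 0.0]     # softmax gives roughly [0.8, 0.2]
overconfident_logits = [5.0, 0.0]          # almost all mass on Cat

loss_matching = cross_entropy(soft_label, matching_logits)
loss_overconfident = cross_entropy(soft_label, overconfident_logits)
print(loss_matching, loss_overconfident)
```

With a soft target, a prediction that puts nearly all its mass on one class is penalized, whereas a one-hot target would reward it.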

Label Refinery
[Figure: Data with Ground-truth Labels trains a Model, which then serves as a Refinery producing Refined Labels for the same Data; the Data → Refinery → Refined Label step is repeated.]
Example labels shown in the figure: burrito; burrito, plate, eggnog; burrito, plate, restaurant
Top-1 across the successive models: 57.93, 59.97, 60.87, 61.22
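The refinement loop in the figure can be sketched as: train a model on the current labels, relabel the data with that model's predictions, and repeat. The code below is a toy illustration only. The "training" function is a made-up stand-in that averages the labels of identical inputs, and the two-class data set (burrito, plate) is invented for the example.

```python
def train_averaging_model(data, labels):
    """Toy stand-in for training: predict the mean label of identical inputs."""
    grouped = {}
    for x, dist in zip(data, labels):
        grouped.setdefault(x, []).append(dist)
    table = {x: [sum(col) / len(dists) for col in zip(*dists)]
             for x, dists in grouped.items()}
    return lambda x: table[x]


def refine(train_fn, data, ground_truth, iterations):
    """Iteratively replace the labels with the previous model's predictions."""
    labels = ground_truth
    models = []
    for _ in range(iterations):
        model = train_fn(data, labels)
        models.append(model)
        labels = [model(x) for x in data]   # refined (soft) labels for the next round
    return models, labels


# Classes: [burrito, plate]. Two identical images carry different hard labels.
data = ["burrito on plate", "burrito on plate", "plate"]
ground_truth = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
models, refined = refine(train_averaging_model, data, ground_truth, iterations=2)
print(refined)
```

In this toy case the refined label for the ambiguous image becomes a soft mixture of both classes, which is the kind of label the one-hot ground truth cannot express.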

                  Standard Training   Label Refinery
Model             Top-1    Top-5      Top-1    Top-5
AlexNet           57.93    79.41      66.28    86.13
MobileNet-1       68.53    88.14      73.39    91.07
MobileNet-0.75    65.93    86.28      70.92    89.68
MobileNet-0.5     63.03    84.55      66.66    87.07
MobileNet-0.25    50.65    74.42      54.62    77.92
ResNet-50         75.7     92.81      76.5     93.12
ResNet-34         73.39    91.32      75.06    92.35
ResNet-18         69.7     89.26      72.52    90.73
ResNetXnor-50     63.1     83.61      73.31    89.18
VGG16             70.1     88.54      75       92.22
VGG19             71.39    89.44      75.46    92.52
DarkNet19         70.6     89.13      74.47    91.94

How far can we get with this efficiency?
Compute Capability & Price, from Low to High: Edge & Embedded Devices, Mobile Devices, Server & Cloud
Traditional Home of AI
Hardware: CPUs, FPGA, AI/Neural Accelerator, GPU

Thank You!
1. ELASTIC: Improving CNNs with Instance Specific Scaling Policies. H Wang, A Kembhavi, A Farhadi, A Yuille, M Rastegari. arXiv preprint arXiv:1812.05262
2. ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network. S Mehta, M Rastegari, L Shapiro, H Hajishirzi. arXiv preprint arXiv:1811.11431
3. Label Refinery: Improving ImageNet Classification through Label Progression. H Bagherinezhad, M Horton, M Rastegari, A Farhadi. arXiv preprint arXiv:1805.02641
4. ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation. S Mehta, M Rastegari, A Caspi, L Shapiro, H Hajishirzi. Proceedings of the European Conference on Computer Vision (ECCV), 552-568
5. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks. M Rastegari, V Ordonez, J Redmon, A Farhadi. European Conference on Computer Vision, 525-542
6. LCNN: Lookup-based Convolutional Neural Network. H Bagherinezhad, M Rastegari, A Farhadi. Proceedings of the IEEE Conference on Computer Vision and Pattern …
