Pham Quang Khang
2018/8/18 Paper Reading Fest 20180819 1
MobileNet V2: Inverted Residuals
and Linear Bottlenecks
Mark Sandler et al. CVPR 2018
Agenda
1. Motivation of research
2. Key components of MobileNet V2
a. Depthwise Separable Convolutions
b. Linear bottlenecks and inverted residual
c. Effect of linear bottlenecks and inverted residual
3. Architecture of MobileNet V2
4. Experiments and results
Convolutional Neural Networks
■ LeNet for handwritten character recognition
Yann LeCun, 1998
Evolution of ImageNet
■ 2012: AlexNet, the major debut of the power of CNNs
– Conv layers: channels 3 (input), 48, 128, 192, 192, 128 (per GPU)
– FC layers: 2048, 2048 (per GPU)
■ 2014: VGG 19, the power of a very deep network
– Conv layers: 16 conv3 (19 weight layers in total)
– FC layers: 4096, 4096
■ 2015: ResNet, a very, very deep network
– 152 layers of residual blocks with convs of various sizes
– No hidden FC layers (only the final classifier)
■ 2014 – 2016: Inception -> Inception v4, Inception-ResNet
■ Xception (CVPR 2017)
■ MobileNet, ShuffleNet => it is time for architectures that can fit on mobile devices
Computation power requirements
■ Previous architectures require a massive amount of memory and computational power
■ To run image classification or detection on mobile devices, a lighter model with sufficient accuracy is needed
Model | ImageNet accuracy | Mult-Adds (millions) | Parameters (millions)
MobileNetV2 | 72.0% | 300 | 3.4
MobileNet(1) | 70.6% | 569 | 4.2
GoogleNet (Inception) | 69.8% | 1550 | 6.8
VGG 16 | 71.5% | 15300 | 138
Andrew G. Howard et al. 2017
Mark Sandler et al. 2018
Depthwise Separable Conv
■ Conventional conv: transforms a DF x DF x M input (spatial size DF, M channels) into a DF x DF x N output, using a DK x DK x M x N kernel
– Cost to compute one output value: DK x DK x M
– Cost to compute the whole output: DK x DK x M x DF x DF x N
■ Conv = filtering + combination
■ New way: split into two steps, filtering and combination
– Depthwise conv (filtering): apply one DK x DK x 1 kernel per input channel to get an intermediate DF x DF x M output
Cost: DK x DK x M x DF x DF
– Pointwise conv (combination): use a 1 x 1 x M x N kernel to combine the channels of the intermediate output into the final DF x DF x N output
Cost: M x DF x DF x N
– Total cost: DF x DF x M x (DK x DK + N)
– Ratio to the conventional cost: 1/N + 1/(DK x DK); with DK = 3 the cost drops by a factor of about 8 to 9
Andrew G. Howard et al. 2017
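The cost formulas for conventional and depthwise separable convolutions can be checked numerically. The following is a minimal sketch rather than code from the paper: the function names and the example layer sizes (DK = 3, DF = 14, M = N = 512) are assumptions chosen only for illustration.

```python
def standard_conv_cost(dk: int, df: int, m: int, n: int) -> int:
    """Mult-adds of a conventional conv: DK * DK * M * N * DF * DF."""
    return dk * dk * m * n * df * df


def separable_conv_cost(dk: int, df: int, m: int, n: int) -> int:
    """Mult-adds of a depthwise conv followed by a pointwise conv."""
    depthwise = dk * dk * m * df * df  # filtering: one DK x DK kernel per channel
    pointwise = m * n * df * df        # combination: 1x1 conv mixing M channels into N
    return depthwise + pointwise


def reduction_factor(dk: int, df: int, m: int, n: int) -> float:
    """How many times cheaper the separable version is."""
    return standard_conv_cost(dk, df, m, n) / separable_conv_cost(dk, df, m, n)


if __name__ == "__main__":
    # Hypothetical layer: 3x3 kernel, 14x14 feature map, 512 -> 512 channels.
    print(f"cost reduction: {reduction_factor(3, 14, 512, 512):.2f}x")
```

The factor approaches DK x DK = 9 only when N is large; for layers with few output channels the saving is smaller.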
ReLU and information loss
■ Manifold of interest: each activation tensor of dimensions $h_i \times w_i \times d_i$ can be treated as $h_i \times w_i$ pixels with $d_i$ dimensions
■ The manifold of interest can be embedded in a low-dimensional subspace => reducing the dimension of the layer would not cause information loss
■ Not quite true with non-linear transformations like ReLU:
– If the manifold of interest keeps a non-zero volume after the ReLU transformation, ReLU acts as a linear transformation on that part
– ReLU is capable of preserving complete information about the input manifold, but only if the input manifold lies in a low-dimensional subspace of the input space
Use linear bottleneck layers
Inverted Residuals and Linear Bottlenecks
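■ Residual connections: improve the ability of gradients to propagate across many layers
■ Inverted (shortcuts connect the thin bottleneck layers instead of the wide expanded layers): considerably more memory efficient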
Kaiming He et al. 2015
Unit block of MobileNet V2
■ Combining Depthwise Separable Convolutions, linear bottlenecks and inverted
residual block
■ Computational cost per block:
$h \times w \times d \times t \times (d' + k^2 + d)$
■ With this, input and output dimension can
be relatively small
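Input | Operator | Output
$h \times w \times d$ | 1x1 conv2d, ReLU6 | $h \times w \times (td)$
$h \times w \times td$ | 3x3 dwise, stride $s$, ReLU6 | $\frac{h}{s} \times \frac{w}{s} \times (td)$
$\frac{h}{s} \times \frac{w}{s} \times td$ | linear 1x1 conv2d | $\frac{h}{s} \times \frac{w}{s} \times d'$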
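The per-block cost formula can be recovered by adding up the three convolutions in the table. The sketch below assumes stride 1 (so every stage works on the full h x w map) and uses a hypothetical function name and example sizes; it is an illustration, not the authors' code.

```python
def bottleneck_block_cost(h: int, w: int, d: int, d_out: int, t: int, k: int = 3) -> int:
    """Mult-adds of one inverted residual block, assuming stride 1."""
    expanded = t * d                       # channels after the expansion layer
    expand = h * w * d * expanded          # 1x1 conv2d: d -> t*d channels
    depthwise = h * w * expanded * k * k   # k x k depthwise conv on t*d channels
    project = h * w * expanded * d_out     # linear 1x1 conv2d: t*d -> d' channels
    return expand + depthwise + project


if __name__ == "__main__":
    # Hypothetical block: 56x56 map, 24 -> 24 channels, expansion factor 6.
    cost = bottleneck_block_cost(56, 56, 24, 24, t=6)
    closed_form = 56 * 56 * 24 * 6 * (24 + 3 * 3 + 24)
    print(cost == closed_form)
```

With a stride larger than 1, the depthwise and projection stages run on the smaller h/s x w/s map, so this function would overestimate their cost.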
Inverted residual bottleneck for memory saving
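■ Transformation function: $F(x) = [A \circ N \circ B]\,x$
A: linear transformation, $\mathbb{R}^{s \times s \times k} \to \mathbb{R}^{s \times s \times n}$
N: $\mathrm{ReLU6} \circ \mathrm{dwise} \circ \mathrm{ReLU6}$, $\mathbb{R}^{s \times s \times n} \to \mathbb{R}^{s' \times s' \times n}$
B: linear transformation, $\mathbb{R}^{s' \times s' \times n} \to \mathbb{R}^{s' \times s' \times k'}$
■ Memory needed is:
$s^2 k + s'^2 k' + O(\max(s^2, s'^2))$
■ If the expanded tensor can be split into $t$ tensors (whose concatenation makes up the full tensor), the function becomes a sum over the pieces:
$F(x) = \sum_{i=1}^{t} (A_i \circ N \circ B_i)(x)$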
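To get a feel for the memory expression, the sketch below counts tensor elements under a deliberately simplified model that is an assumption of this example, not the paper's accounting: the input and output bottleneck tensors are always resident, and only one piece of the expanded tensor (n / splits channels) is held at a time. The function name and the example sizes are hypothetical.

```python
import math


def peak_elements(s: int, k: int, s_out: int, k_out: int, n: int, splits: int = 1) -> int:
    """Approximate number of tensor elements held at once (simplified model)."""
    bottlenecks = s * s * k + s_out * s_out * k_out  # s^2 k + s'^2 k'
    chunk_channels = math.ceil(n / splits)           # channels of one expanded piece
    inner = max(s * s, s_out * s_out) * chunk_channels
    return bottlenecks + inner


if __name__ == "__main__":
    # Hypothetical block: 56x56x24 in and out, expanded to n = 144 channels.
    print(peak_elements(56, 24, 56, 24, n=144, splits=1))    # whole expanded tensor
    print(peak_elements(56, 24, 56, 24, n=144, splits=144))  # one channel at a time
```

With splits equal to n, the inner term reduces to max(s^2, s'^2), which corresponds to the O(max(s^2, s'^2)) term in the expression above. In practice a very fine split may cost extra runtime, so this only illustrates the memory side of the trade-off.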
Architecture of the model
■ Each row describes a sequence of one or more identical layers, repeated n times
■ All layers in the same sequence have c output channels
■ The first layer of each sequence has stride s; all others use stride 1
■ All spatial convs use 3x3 kernels
■ Bottleneck layers use expansion factor t
■ Input resolution can range from 96 to 224
■ A width multiplier can be used to get a thinner model
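Input | Operator | t | c | n | s
$224^2 \times 3$ | conv2d | - | 32 | 1 | 2
$112^2 \times 32$ | bottleneck | 1 | 16 | 1 | 1
$112^2 \times 16$ | bottleneck | 6 | 24 | 2 | 2
$56^2 \times 24$ | bottleneck | 6 | 32 | 3 | 2
$28^2 \times 32$ | bottleneck | 6 | 64 | 4 | 2
$14^2 \times 64$ | bottleneck | 6 | 96 | 3 | 1
$14^2 \times 96$ | bottleneck | 6 | 160 | 3 | 2
$7^2 \times 160$ | bottleneck | 6 | 320 | 1 | 1
$7^2 \times 320$ | conv2d 1x1 | - | 1280 | 1 | 1
$7^2 \times 1280$ | avgpool 7x7 | - | - | 1 | -
$1 \times 1 \times 1280$ | conv2d 1x1 | - | k | - |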
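The rules for reading the table (n repeats per row, stride s only on the first layer of each sequence) can be sketched as a small shape tracer. This is an illustrative reading of the table in plain Python, not the authors' implementation; the names BOTTLENECK_ROWS and trace_bottlenecks are hypothetical.

```python
# (t, c, n, s) for the seven bottleneck rows of the architecture table.
BOTTLENECK_ROWS = [
    (1, 16, 1, 1),
    (6, 24, 2, 2),
    (6, 32, 3, 2),
    (6, 64, 4, 2),
    (6, 96, 3, 1),
    (6, 160, 3, 2),
    (6, 320, 1, 1),
]


def trace_bottlenecks(resolution: int = 224):
    """Return (in_channels, out_channels, out_size, stride) for every block."""
    size = resolution // 2  # the first conv2d has stride 2
    in_channels = 32        # and 32 output channels
    blocks = []
    for t, c, n, s in BOTTLENECK_ROWS:
        for i in range(n):
            stride = s if i == 0 else 1  # only the first layer of a sequence strides
            size //= stride
            blocks.append((in_channels, c, size, stride))
            in_channels = c
    return blocks


if __name__ == "__main__":
    for block in trace_bottlenecks(224):
        print(block)
```

For a 224 input the trace ends at a 7 x 7 x 320 tensor after 17 bottleneck blocks, which matches the input of the final 1x1 conv2d row.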
Keras code
ImageNet Classification
■ Implemented in TensorFlow
■ RMSProp optimizer with decay and momentum both set to 0.9
■ Batch normalization after every layer
■ Weight decay of 0.00004
■ Initial learning rate of 0.045
■ Learning rate decay of 0.98 per epoch
■ 16 GPUs
■ Batch size of 96
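The learning rate settings listed above can be written as a simple exponential schedule. This sketch assumes the 0.98 factor is applied once per completed epoch; the constant and function names are hypothetical, and the framework-level scheduler used by the authors is not shown in the slides.

```python
INITIAL_LR = 0.045       # initial learning rate from the slide
DECAY_PER_EPOCH = 0.98   # multiplicative decay applied once per epoch (assumed)


def learning_rate(epoch: int) -> float:
    """Learning rate after `epoch` completed epochs."""
    return INITIAL_LR * DECAY_PER_EPOCH ** epoch


if __name__ == "__main__":
    for epoch in (0, 1, 10, 100):
        print(epoch, learning_rate(epoch))
```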
Model | ImageNet accuracy | Mult-Adds (millions) | Parameters (millions)
MobileNetV2 | 72.0% | 300 | 3.4
MobileNet(1) | 70.6% | 569 | 4.2
GoogleNet (Inception) | 69.8% | 1550 | 6.8
VGG 16 | 71.5% | 15300 | 138
Comparison between models for mobile (ImageNet)
■ MobileNet, ShuffleNet, NasNet
■ MobileNetV2 with different input resolutions vs NasNet, MobileNetV1, ShuffleNet
Model | ImageNet accuracy | Mult-Adds (millions) | Parameters (millions)
MobileNetV1 | 70.6% | 575 | 4.2
ShuffleNet (1.5) | 71.5% | 292 | 3.4
ShuffleNet (x2) | 73.7% | 524 | 5.4
NasNet-A | 74.0% | 564 | 5.3
MobileNetV2 | 72.0% | 300 | 3.4
MobileNetV2 (1.4) | 74.7% | 585 | 6.9
Object detection
■ MobileNet V2 is used as the feature extractor for object detection with a modified version of the Single Shot Detector (SSD) on the COCO dataset
■ Compared with YOLOv2 and the original SSD
■ SSDLite: replaces all regular convolutions with separable convolutions in the SSD prediction layers
■ CPU time for MNetV2 + SSDLite is measured on a Pixel 1
Liu et al. 2016
Model | mAP | Params (millions) | MAdd | CPU
SSD300 | 23.2 | 36.1 | 35.2B | -
SSD512 | 26.8 | 36.1 | 99.5B | -
YOLOv2 | 21.6 | 50.7 | 17.5B | -
MNet V1 + SSDLite | 22.2 | 5.1 | 1.3B | 270ms
MNet V2 + SSDLite | 22.1 | 4.3 | 0.8B | 200ms
Thank you for listening. Time for Q&A

More Related Content

What's hot

PR-120: ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture De...
PR-120: ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture De...PR-120: ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture De...
PR-120: ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture De...Jinwon Lee
 
Handwritten Digit Recognition using Convolutional Neural Networks
Handwritten Digit Recognition using Convolutional Neural  NetworksHandwritten Digit Recognition using Convolutional Neural  Networks
Handwritten Digit Recognition using Convolutional Neural NetworksIRJET Journal
 
210523 swin transformer v1.5
210523 swin transformer v1.5210523 swin transformer v1.5
210523 swin transformer v1.5taeseon ryu
 
Convolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningConvolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningMohamed Loey
 
Deep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksDeep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksChristian Perone
 
Convolutional neural network
Convolutional neural networkConvolutional neural network
Convolutional neural networkFerdous ahmed
 
Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentationOwin Will
 
RNN and its applications
RNN and its applicationsRNN and its applications
RNN and its applicationsSungjoon Choi
 
Image classification with Deep Neural Networks
Image classification with Deep Neural NetworksImage classification with Deep Neural Networks
Image classification with Deep Neural NetworksYogendra Tamang
 
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Simplilearn
 
Resnet for image processing (3)
Resnet for image processing (3)Resnet for image processing (3)
Resnet for image processing (3)devikarb
 
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Gaurav Mittal
 
Image classification using convolutional neural network
Image classification using convolutional neural networkImage classification using convolutional neural network
Image classification using convolutional neural networkKIRAN R
 
Convolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNetConvolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNetSungminYou
 

What's hot (20)

EfficientNet
EfficientNetEfficientNet
EfficientNet
 
PR-120: ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture De...
PR-120: ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture De...PR-120: ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture De...
PR-120: ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture De...
 
Handwritten Digit Recognition using Convolutional Neural Networks
Handwritten Digit Recognition using Convolutional Neural  NetworksHandwritten Digit Recognition using Convolutional Neural  Networks
Handwritten Digit Recognition using Convolutional Neural Networks
 
210523 swin transformer v1.5
210523 swin transformer v1.5210523 swin transformer v1.5
210523 swin transformer v1.5
 
Convolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningConvolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep Learning
 
Cnn method
Cnn methodCnn method
Cnn method
 
Swin transformer
Swin transformerSwin transformer
Swin transformer
 
Resnet
ResnetResnet
Resnet
 
cnn ppt.pptx
cnn ppt.pptxcnn ppt.pptx
cnn ppt.pptx
 
Deep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksDeep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural Networks
 
Convolutional neural network
Convolutional neural networkConvolutional neural network
Convolutional neural network
 
Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentation
 
RNN and its applications
RNN and its applicationsRNN and its applications
RNN and its applications
 
Image classification with Deep Neural Networks
Image classification with Deep Neural NetworksImage classification with Deep Neural Networks
Image classification with Deep Neural Networks
 
MobileNet V3
MobileNet V3MobileNet V3
MobileNet V3
 
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
 
Resnet for image processing (3)
Resnet for image processing (3)Resnet for image processing (3)
Resnet for image processing (3)
 
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)
 
Image classification using convolutional neural network
Image classification using convolutional neural networkImage classification using convolutional neural network
Image classification using convolutional neural network
 
Convolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNetConvolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNet
 

Similar to MobileNet V2 Architecture Explained

Garbage Classification Using Deep Learning Techniques
Garbage Classification Using Deep Learning TechniquesGarbage Classification Using Deep Learning Techniques
Garbage Classification Using Deep Learning TechniquesIRJET Journal
 
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph AnalysisICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph AnalysisJason Riedy
 
Traffic Sign Recognition System
Traffic Sign Recognition SystemTraffic Sign Recognition System
Traffic Sign Recognition SystemIRJET Journal
 
Sparse Graph Attention Networks 2021.pptx
Sparse Graph Attention Networks 2021.pptxSparse Graph Attention Networks 2021.pptx
Sparse Graph Attention Networks 2021.pptxssuser2624f71
 
IRJET- Design of Memristor based Multiplier
IRJET- Design of Memristor based MultiplierIRJET- Design of Memristor based Multiplier
IRJET- Design of Memristor based MultiplierIRJET Journal
 
“Improving Power Efficiency for Edge Inferencing with Memory Management Optim...
“Improving Power Efficiency for Edge Inferencing with Memory Management Optim...“Improving Power Efficiency for Edge Inferencing with Memory Management Optim...
“Improving Power Efficiency for Edge Inferencing with Memory Management Optim...Edge AI and Vision Alliance
 
CE1009_Implementation of Civil IoT Architecture.pdf
CE1009_Implementation  of Civil IoT Architecture.pdfCE1009_Implementation  of Civil IoT Architecture.pdf
CE1009_Implementation of Civil IoT Architecture.pdfChenkai Sun
 
Using Graphs for Feature Engineering_ Graph Reduce-2.pdf
Using Graphs for Feature Engineering_ Graph Reduce-2.pdfUsing Graphs for Feature Engineering_ Graph Reduce-2.pdf
Using Graphs for Feature Engineering_ Graph Reduce-2.pdfWes Madrigal
 
RECAP: The Simulation Approach
RECAP: The Simulation ApproachRECAP: The Simulation Approach
RECAP: The Simulation ApproachRECAP Project
 
IoT Workload Distribution Impact Between Edge and Cloud Computing in a Smart ...
IoT Workload Distribution Impact Between Edge and Cloud Computing in a Smart ...IoT Workload Distribution Impact Between Edge and Cloud Computing in a Smart ...
IoT Workload Distribution Impact Between Edge and Cloud Computing in a Smart ...Otávio Carvalho
 
Semantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImagerySemantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImageryRAHUL BHOJWANI
 
Stochastic Computing Correlation Utilization in Convolutional Neural Network ...
Stochastic Computing Correlation Utilization in Convolutional Neural Network ...Stochastic Computing Correlation Utilization in Convolutional Neural Network ...
Stochastic Computing Correlation Utilization in Convolutional Neural Network ...TELKOMNIKA JOURNAL
 
Overview of the FlexPlan project. Focus on EU regulatory analysis and TSO-DSO...
Overview of the FlexPlan project. Focus on EU regulatory analysis and TSO-DSO...Overview of the FlexPlan project. Focus on EU regulatory analysis and TSO-DSO...
Overview of the FlexPlan project. Focus on EU regulatory analysis and TSO-DSO...Leonardo ENERGY
 
QUILTS: Multidimensional Data Partitioning Framework Based on Query-Aware and...
QUILTS: Multidimensional Data Partitioning Framework Based on Query-Aware and...QUILTS: Multidimensional Data Partitioning Framework Based on Query-Aware and...
QUILTS: Multidimensional Data Partitioning Framework Based on Query-Aware and...nishimurashoji
 
IRJET- Single Precision Floating Point Arithmetic using VHDL Coding
IRJET-  	  Single Precision Floating Point Arithmetic using VHDL CodingIRJET-  	  Single Precision Floating Point Arithmetic using VHDL Coding
IRJET- Single Precision Floating Point Arithmetic using VHDL CodingIRJET Journal
 
オープンハウスにおける 機械学習・データサイエンスの 取り組みについて
オープンハウスにおける機械学習・データサイエンスの取り組みについてオープンハウスにおける機械学習・データサイエンスの取り組みについて
オープンハウスにおける 機械学習・データサイエンスの 取り組みについてTeito Nakagawa
 
Computational steering Interactive Design-through-Analysis for Simulation Sci...
Computational steering Interactive Design-through-Analysis for Simulation Sci...Computational steering Interactive Design-through-Analysis for Simulation Sci...
Computational steering Interactive Design-through-Analysis for Simulation Sci...SURFevents
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...ijceronline
 

Similar to MobileNet V2 Architecture Explained (20)

PowerGraph
PowerGraphPowerGraph
PowerGraph
 
Garbage Classification Using Deep Learning Techniques
Garbage Classification Using Deep Learning TechniquesGarbage Classification Using Deep Learning Techniques
Garbage Classification Using Deep Learning Techniques
 
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph AnalysisICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
 
Traffic Sign Recognition System
Traffic Sign Recognition SystemTraffic Sign Recognition System
Traffic Sign Recognition System
 
Sparse Graph Attention Networks 2021.pptx
Sparse Graph Attention Networks 2021.pptxSparse Graph Attention Networks 2021.pptx
Sparse Graph Attention Networks 2021.pptx
 
IRJET- Design of Memristor based Multiplier
IRJET- Design of Memristor based MultiplierIRJET- Design of Memristor based Multiplier
IRJET- Design of Memristor based Multiplier
 
“Improving Power Efficiency for Edge Inferencing with Memory Management Optim...
“Improving Power Efficiency for Edge Inferencing with Memory Management Optim...“Improving Power Efficiency for Edge Inferencing with Memory Management Optim...
“Improving Power Efficiency for Edge Inferencing with Memory Management Optim...
 
CE1009_Implementation of Civil IoT Architecture.pdf
CE1009_Implementation  of Civil IoT Architecture.pdfCE1009_Implementation  of Civil IoT Architecture.pdf
CE1009_Implementation of Civil IoT Architecture.pdf
 
Using Graphs for Feature Engineering_ Graph Reduce-2.pdf
Using Graphs for Feature Engineering_ Graph Reduce-2.pdfUsing Graphs for Feature Engineering_ Graph Reduce-2.pdf
Using Graphs for Feature Engineering_ Graph Reduce-2.pdf
 
RECAP: The Simulation Approach
RECAP: The Simulation ApproachRECAP: The Simulation Approach
RECAP: The Simulation Approach
 
IoT Workload Distribution Impact Between Edge and Cloud Computing in a Smart ...
IoT Workload Distribution Impact Between Edge and Cloud Computing in a Smart ...IoT Workload Distribution Impact Between Edge and Cloud Computing in a Smart ...
IoT Workload Distribution Impact Between Edge and Cloud Computing in a Smart ...
 
Semantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImagerySemantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite Imagery
 
Stochastic Computing Correlation Utilization in Convolutional Neural Network ...
Stochastic Computing Correlation Utilization in Convolutional Neural Network ...Stochastic Computing Correlation Utilization in Convolutional Neural Network ...
Stochastic Computing Correlation Utilization in Convolutional Neural Network ...
 
Overview of the FlexPlan project. Focus on EU regulatory analysis and TSO-DSO...
Overview of the FlexPlan project. Focus on EU regulatory analysis and TSO-DSO...Overview of the FlexPlan project. Focus on EU regulatory analysis and TSO-DSO...
Overview of the FlexPlan project. Focus on EU regulatory analysis and TSO-DSO...
 
QUILTS: Multidimensional Data Partitioning Framework Based on Query-Aware and...
QUILTS: Multidimensional Data Partitioning Framework Based on Query-Aware and...QUILTS: Multidimensional Data Partitioning Framework Based on Query-Aware and...
QUILTS: Multidimensional Data Partitioning Framework Based on Query-Aware and...
 
IRJET- Single Precision Floating Point Arithmetic using VHDL Coding
IRJET-  	  Single Precision Floating Point Arithmetic using VHDL CodingIRJET-  	  Single Precision Floating Point Arithmetic using VHDL Coding
IRJET- Single Precision Floating Point Arithmetic using VHDL Coding
 
オープンハウスにおける 機械学習・データサイエンスの 取り組みについて
オープンハウスにおける機械学習・データサイエンスの取り組みについてオープンハウスにおける機械学習・データサイエンスの取り組みについて
オープンハウスにおける 機械学習・データサイエンスの 取り組みについて
 
Computational steering Interactive Design-through-Analysis for Simulation Sci...
Computational steering Interactive Design-through-Analysis for Simulation Sci...Computational steering Interactive Design-through-Analysis for Simulation Sci...
Computational steering Interactive Design-through-Analysis for Simulation Sci...
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
 
J010224750
J010224750J010224750
J010224750
 

Recently uploaded

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsAndrey Dotsenko
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 

Recently uploaded (20)

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
MobileNet V2 Architecture Explained

  • 1. Pham Quang Khang 2018/8/18 Paper Reading Fest 20180819 1 MobileNet V2: Inverted Residuals and Linear Bottlenecks Mark Sandler et al. CVPR 2018
  • 2. Agenda
      1. Motivation of research
      2. Key components of MobileNet V2
         a. Depthwise separable convolutions
         b. Linear bottlenecks and inverted residuals
         c. Effect of linear bottlenecks and inverted residuals
      3. Architecture of MobileNet V2
      4. Experiments and results
  • 4. Convolutional Neural Networks
      ■ LeNet for handwritten character recognition (Yann LeCun, 1998)
  • 5. Evolution of ImageNet architectures
      ■ 2012: AlexNet, the major debut of the power of CNNs
          – Conv layers: 3, 48, 128, 192, 192, 128
          – FC layers: 2048, 2048
      ■ 2014: VGG 19, the power of very deep networks
          – Conv layers: 16 conv3 (19 weight layers in total)
          – FC layers: 4096, 4096
      ■ 2015: ResNet, a very, very deep network
          – 152-layer residual network with convolutions of various sizes
          – No hidden FC layers (global average pooling before the classifier)
      ■ 2014 – 2016: Inception -> Inception v4, Inception + ResNet
      ■ Xception (CVPR 2017)
      ■ MobileNet, ShuffleNet => time for architectures that can fit on mobile devices
  • 6. Computation power requirements
      ■ Previous architectures required massive amounts of memory and computational power
      ■ To run image classification or detection on mobile devices, a lighter model with sufficient accuracy is a must

      Model                   ImageNet accuracy   Mult-Adds (M)   Parameters (M)
      MobileNetV2             72.0%               300             3.4
      MobileNetV1             70.6%               569             4.2
      GoogleNet (Inception)   69.8%               1550            6.8
      VGG 16                  71.5%               15300           138

      Sources: Andrew G. Howard et al. 2017; Mark Sandler et al. 2018
  • 8. Depthwise Separable Conv (Andrew G. Howard et al. 2017)
      ■ Conventional conv: transforms a $D_F \times D_F \times M$ input (spatial size $D_F$, $M$ channels) into a $D_F \times D_F \times N$ output using a $D_K \times D_K \times M \times N$ kernel
          – Cost to compute one point of the output: $D_K \times D_K \times M$
          – Cost to compute the whole output: $D_K \times D_K \times M \times D_F \times D_F \times N$
      ■ Conv = filtering + combination
      ■ New way: split into two steps, filtering and combination
          – Depthwise conv (filtering): apply one $D_K \times D_K \times 1$ kernel per input channel to get an intermediate $D_F \times D_F \times M$ output. Cost: $D_K \times D_K \times M \times D_F \times D_F$
          – Pointwise conv (combination): use a $1 \times 1 \times M \times N$ kernel to combine the channels of the intermediate output into the final $D_F \times D_F \times N$ output. Cost: $M \times D_F \times D_F \times N$
          – Total cost: $D_F \times D_F \times M \times (D_K \times D_K + N)$
          – Ratio to the conventional cost: $\frac{1}{N} + \frac{1}{D_K^2}$, so with $D_K = 3$ the cost drops by a factor of roughly 8 to 9 for typical $N$
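
The cost formulas on this slide can be checked numerically. Below is a minimal Python sketch; the function names and the example layer size (a 14 × 14 feature map with 512 input and 512 output channels) are illustrative assumptions, not values from the slides.

```python
def standard_conv_cost(d_k: int, d_f: int, m: int, n: int) -> int:
    """Mult-adds of a conventional D_K x D_K convolution on a D_F x D_F map."""
    return d_k * d_k * m * d_f * d_f * n


def separable_conv_cost(d_k: int, d_f: int, m: int, n: int) -> int:
    """Mult-adds of a depthwise conv followed by a 1x1 pointwise conv."""
    depthwise = d_k * d_k * m * d_f * d_f   # filtering, one kernel per channel
    pointwise = m * d_f * d_f * n           # combination across channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 14x14 feature map, 512 -> 512 channels.
d_k, d_f, m, n = 3, 14, 512, 512
standard = standard_conv_cost(d_k, d_f, m, n)
separable = separable_conv_cost(d_k, d_f, m, n)
ratio = separable / standard
speedup = standard / separable

print(f"standard:  {standard:,} mult-adds")
print(f"separable: {separable:,} mult-adds")
print(f"ratio: {ratio:.4f}  (1/N + 1/D_K^2 = {1 / n + 1 / d_k ** 2:.4f})")
```

The speedup only approaches 9 when N is large; for layers with few output channels the 1/N term is no longer negligible.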
  • 9. ReLU and information loss
      ■ Manifold of interest: each activation tensor of dimensions $h_i \times w_i \times d_i$ can be treated as $h_i \times w_i$ "pixels" with $d_i$ dimensions
      ■ The manifold of interest can be embedded in a low-dimensional subspace => reducing the dimension of the layer would not cause information loss
      ■ Not quite true once a non-linear transformation like ReLU is involved:
          – If the manifold of interest keeps a non-zero volume after the ReLU transformation, ReLU corresponds to a linear transformation on it
          – ReLU can preserve complete information about the input manifold, but only if the input manifold lies in a low-dimensional subspace of the input space
      => Use linear bottleneck layers
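
The point about ReLU and dimensionality can be illustrated with a toy experiment. This is a simplified sketch, not the experiment from the paper: a 2-D point is mapped into `n_dims` dimensions by a random matrix `T` and passed through ReLU, and the point is counted as recoverable when at least two output coordinates stay positive (two linear equations suffice to solve for two unknowns). The function name, the Gaussian sampling and the recoverability criterion are all assumptions made for illustration.

```python
import random


def recoverable_fraction(n_dims: int, n_points: int = 2000, seed: int = 0) -> float:
    """Fraction of random 2-D points still recoverable from ReLU(T x).

    T is one fixed random n_dims x 2 matrix. A coordinate that ReLU clamps
    to zero carries no equation for x, so at least two positive coordinates
    are needed to solve for the two unknowns (random rows are almost surely
    not parallel).
    """
    rng = random.Random(seed)
    t = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_dims)]
    kept = 0
    for _ in range(n_points):
        x = (rng.gauss(0, 1), rng.gauss(0, 1))
        y = [max(0.0, row[0] * x[0] + row[1] * x[1]) for row in t]  # ReLU(T x)
        if sum(1 for v in y if v > 0.0) >= 2:
            kept += 1
    return kept / n_points


for dims in (2, 5, 30):
    print(dims, recoverable_fraction(dims))
```

With a narrow embedding most points lose information after ReLU, while a wide embedding keeps nearly all of it. This matches the slide's conclusion: keep the non-linearity in the wide (expanded) layer and make the narrow bottleneck linear.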
  • 10. Inverted Residuals and Linear Bottlenecks
      ■ Residual connections (Kaiming He et al. 2015): improve the ability of gradients to propagate
      ■ Inverted design: considerably more memory efficient
  • 11. Unit block of MobileNet V2
      ■ Combines depthwise separable convolutions, linear bottlenecks and the inverted residual block
      ■ Computational cost per block: $h \times w \times d \times t \times (d + k^2 + d')$, with kernel size $k$ and expansion factor $t$
      ■ With this, the input and output dimensions can be relatively small

      Input                                        Operator                       Output
      $h \times w \times d$                        1x1 conv2d, ReLU6              $h \times w \times (td)$
      $h \times w \times td$                       3x3 dwise, stride $s$, ReLU6   $\frac{h}{s} \times \frac{w}{s} \times (td)$
      $\frac{h}{s} \times \frac{w}{s} \times td$   linear 1x1 conv2d              $\frac{h}{s} \times \frac{w}{s} \times d'$
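
The per-block cost formula is the sum of the three operators in the table. The sketch below checks this under the assumption of stride 1 (so the spatial size stays `h × w` throughout, as the formula implies); the function names and the example block (56 × 56 × 24 input, `t = 6`, 24 output channels) are illustrative choices.

```python
def block_cost_by_stage(h: int, w: int, d_in: int, d_out: int, t: int, k: int = 3) -> int:
    """Mult-adds of one bottleneck block, summed stage by stage (stride 1)."""
    hidden = t * d_in                       # width after the expansion layer
    expand = h * w * d_in * hidden          # 1x1 conv2d, ReLU6
    depthwise = h * w * hidden * k * k      # k x k depthwise conv, ReLU6
    project = h * w * hidden * d_out        # linear 1x1 conv2d
    return expand + depthwise + project


def block_cost_closed_form(h: int, w: int, d_in: int, d_out: int, t: int, k: int = 3) -> int:
    """Closed form from the slide: h * w * d * t * (d + k^2 + d')."""
    return h * w * d_in * t * (d_in + k * k + d_out)


# Hypothetical block: 56x56x24 input, expansion factor 6, 24 output channels.
args = dict(h=56, w=56, d_in=24, d_out=24, t=6)
print(block_cost_by_stage(**args), block_cost_closed_form(**args))
```

In this example the depthwise stage is the cheapest of the three; the two 1x1 convolutions dominate the cost.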
  • 12. Inverted residual bottleneck for memory saving
      ■ Transformation function: $F(x) = [A \circ N \circ B]\,x$
          – $A$: linear transformation, $\mathbb{R}^{s \times s \times k} \to \mathbb{R}^{s \times s \times n}$
          – $N$: ReLU6 $\circ$ dwise $\circ$ ReLU6, $\mathbb{R}^{s \times s \times n} \to \mathbb{R}^{s' \times s' \times n}$
          – $B$: linear transformation, $\mathbb{R}^{s' \times s' \times n} \to \mathbb{R}^{s' \times s' \times k'}$
      ■ Memory needed: $s^2 k + s'^2 k' + O(\max(s^2, s'^2))$
      ■ If the expansion layer can be split into $t$ tensors (whose concatenation makes up the full tensor): $F(x) = \sum_{i=1}^{t} (A_i \circ N \circ B_i)(x)$
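
To get a feel for the memory formula, the sketch below counts live activation values for one block under a deliberately simplified model: the input and output bottleneck tensors are always held, and the expanded tensor is materialised in `t_splits` channel groups, one group at a time. It ignores weights and framework overhead; the function name and the example sizes are assumptions for illustration.

```python
def block_activation_memory(s: int, k: int, s_out: int, k_out: int,
                            n: int, t_splits: int = 1) -> int:
    """Activation values alive while evaluating one inverted residual block.

    s, k:         spatial size and channels of the input bottleneck
    s_out, k_out: spatial size and channels of the output bottleneck
    n:            channels of the expanded (inner) tensor
    t_splits:     number of channel groups the inner tensor is split into
    """
    group = -(-n // t_splits)                       # ceil(n / t_splits)
    inner = max(s * s, s_out * s_out) * group       # one group held at a time
    return s * s * k + s_out * s_out * k_out + inner


# Hypothetical block: 56x56x24 -> 56x56x24 with a 144-channel inner tensor.
for splits in (1, 4, 144):
    print(splits, block_activation_memory(56, 24, 56, 24, n=144, t_splits=splits))
```

With one split the inner tensor dominates; with `t_splits = n` only a single-channel slice is held, which corresponds to the $O(\max(s^2, s'^2))$ term on the slide.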
  • 14. Architecture of the model
      ■ Each line describes a sequence of 1 or more identical layers, repeated n times
      ■ Output channel number: c
      ■ The first layer of each sequence has stride s; all others use stride 1
      ■ All spatial convolutions use 3x3 kernels
      ■ Bottleneck layer expansion factor: t
      ■ Input resolution should be between 96 and 224
      ■ A width multiplier can be used to get a thinner model

      Input            Operator       t    c      n    s
      224² × 3         conv2d         -    32     1    2
      112² × 32        bottleneck     1    16     1    1
      112² × 16        bottleneck     6    24     2    2
      56² × 24         bottleneck     6    32     3    2
      28² × 32         bottleneck     6    64     4    2
      14² × 64         bottleneck     6    96     3    1
      14² × 96         bottleneck     6    160    3    2
      7² × 160         bottleneck     6    320    1    1
      7² × 320         conv2d 1x1     -    1280   1    1
      7² × 1280        avgpool 7x7    -    -      1    -
      1 × 1 × 1280     conv2d 1x1     -    k      -
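
The table can be walked programmatically to confirm that each row's input shape follows from the previous row, and to estimate the total multiply-adds. This is a rough sketch under stated assumptions: "same" padding, no expansion convolution when `t = 1` (as common implementations do), 1000 output classes, and batch norm and activations not counted. The names `BOTTLENECKS` and `trace` are illustrative.

```python
# (t, c, n, s) for the bottleneck rows of the architecture table.
BOTTLENECKS = [
    (1, 16, 1, 1),
    (6, 24, 2, 2),
    (6, 32, 3, 2),
    (6, 64, 4, 2),
    (6, 96, 3, 1),
    (6, 160, 3, 2),
    (6, 320, 1, 1),
]


def trace(resolution: int = 224, num_classes: int = 1000):
    """Return ([(spatial_size, channels) per table row], total mult-adds)."""
    size, channels = resolution // 2, 32            # first 3x3 conv, stride 2
    madds = size * size * 3 * 3 * 3 * channels
    shapes = [(resolution, 3), (size, channels)]
    for t, c, n, s in BOTTLENECKS:
        for i in range(n):
            stride = s if i == 0 else 1             # only the first layer strides
            hidden = t * channels
            if t != 1:                              # assumption: skip expansion when t == 1
                madds += size * size * channels * hidden
            size //= stride
            madds += size * size * hidden * 9       # 3x3 depthwise
            madds += size * size * hidden * c       # linear 1x1 projection
            channels = c
        shapes.append((size, channels))
    madds += size * size * channels * 1280          # conv2d 1x1 to 1280 channels
    shapes.append((size, 1280))
    madds += 1280 * num_classes                     # final conv2d 1x1 classifier
    shapes.append((1, 1280))                        # after 7x7 average pooling
    return shapes, madds


shapes, madds = trace()
for size, ch in shapes:
    print(f"{size}x{size}x{ch}")
print(f"approx. mult-adds: {madds / 1e6:.0f}M")
```

Under these assumptions the estimate comes out close to the 300M mult-adds reported for MobileNetV2 in the comparison tables.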
  • 15. Keras code
  • 17. ImageNet Classification
      ■ Framework: TensorFlow
      ■ Optimizer: RMSProp with decay and momentum both set to 0.9
      ■ Batch normalization after every layer
      ■ Weight decay: 0.00004
      ■ Initial learning rate: 0.045
      ■ Learning rate decay: 0.98 per epoch
      ■ 16 GPUs
      ■ Batch size: 96

      Model                   ImageNet accuracy   Mult-Adds (M)   Parameters (M)
      MobileNetV2             72.0%               300             3.4
      MobileNetV1             70.6%               569             4.2
      GoogleNet (Inception)   69.8%               1550            6.8
      VGG 16                  71.5%               15300           138
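
The learning rate schedule above amounts to a simple exponential decay. The sketch below assumes the factor 0.98 is applied once per completed epoch (the slide does not say whether the decay is stepwise or continuous); the constant and function names are illustrative.

```python
import math

INITIAL_LR = 0.045        # initial learning rate from the slide
DECAY_PER_EPOCH = 0.98    # learning rate decay per epoch from the slide


def learning_rate(epoch: int) -> float:
    """Learning rate after `epoch` completed epochs (stepwise decay assumed)."""
    return INITIAL_LR * DECAY_PER_EPOCH ** epoch


# Number of epochs until the learning rate first drops below half its start value.
epochs_to_halve = math.ceil(math.log(0.5) / math.log(DECAY_PER_EPOCH))

for epoch in (0, 10, 50, 100):
    print(epoch, f"{learning_rate(epoch):.6f}")
print("epochs to halve:", epochs_to_halve)
```

The decay is slow: it takes a few dozen epochs for the learning rate to halve.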
  • 18. Comparison between models for mobile (ImageNet)
      ■ MobileNet, ShuffleNet, NasNet
      ■ MobileNetV2 at different input resolutions vs NasNet, MobileNetV1, ShuffleNet

      Model               ImageNet accuracy   Mult-Adds (M)   Parameters (M)
      MobileNetV1         70.6%               575             4.2
      ShuffleNet (1.5)    71.5%               292             3.4
      ShuffleNet (x2)     73.7%               524             5.4
      NasNet-A            74%                 564             5.3
      MobileNetV2         72.0%               300             3.4
      MobileNetV2 (1.4)   74.7%               585             6.9
  • 19. Object detection
      ■ MobileNet V2 is used as the feature extractor for object detection with a modified version of the Single Shot Detector (SSD, Liu et al. 2016) on the COCO dataset
      ■ Compared with YOLOv2 and the original SSD
      ■ SSDLite: replaces all regular convolutions in the SSD prediction layers with separable convolutions
      ■ MNet V2 + SSDLite runs on a Pixel 1

      Model               mAP (avg. precision)   Params (M)   MAdds   CPU
      SSD300              23.2                   36.1         35.2B
      SSD512              26.8                   36.1         99.5B
      YOLOv2              21.6                   50.7         17.5B
      MNet V1 + SSDLite   22.2                   5.1          1.3B    270ms
      MNet V2 + SSDLite   22.1                   4.3          0.8B    200ms
  • 20. Thank you for listening. Time for Q&A