© 2019 Bichen Wu
Enabling Automated Design of
Computationally Efficient Deep
Neural Networks
Bichen Wu
UC Berkeley
May 2019
bichen@berkeley.edu
Neural networks for embedded vision
Need for Embedded Vision
• Privacy concern
• Latency constraint
• Availability, reliability and cost of data transmission
Applications: augmented reality, biometric identification, autonomous driving, Internet-of-things
Computation Complexity of Neural Networks
VGG16 [1] model:
- Parameter size: 552 MB
- Memory: 93 MB/image
- Computation: 15.8 GOPs/image
Hardware budgets:
- DGX-1: 170 TOPS, 3.2 kW, 128 GB memory
- TitanX: 11 TOPS, 223 W, 12 GB memory
- Smartphones: 800 MOPs, 3 W, 2-4 GB memory
- Embedded devices: 100's of MHz, <5 W, <1 GB memory
[1] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
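To make the gap concrete, the figures above can be turned into a best-case throughput estimate. This is only a rough sketch: it assumes "800 MOPs" means 800 million operations per second and that every platform runs at peak throughput, which real workloads rarely do.

```python
# Best-case VGG16 throughput for the platforms listed on this slide.
VGG16_OPS_PER_IMAGE = 15.8e9  # 15.8 GOPs/image, from the slide

# Peak throughput in operations per second. Reading "800 MOPs" as
# 800 million operations per second is an assumption.
PEAK_OPS_PER_SECOND = {
    "DGX-1": 170e12,
    "TitanX": 11e12,
    "Smartphone": 800e6,
}


def images_per_second(peak_ops_per_second: float) -> float:
    """Images per second if every operation ran at peak throughput."""
    return peak_ops_per_second / VGG16_OPS_PER_IMAGE


for name, peak in PEAK_OPS_PER_SECOND.items():
    print(f"{name}: {images_per_second(peak):.2f} images/s at peak")
```

Under these assumptions a smartphone would need many seconds per VGG16 image, while the GPU platforms process hundreds of images per second.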
Goal: Accurate AND Efficient Neural Networks
• Embedded computer vision requires accurate AND
efficient neural networks
Accuracy: Essential for many
applications including security
cameras and autonomous driving
Efficiency: Real-time inference speed
on embedded processors with limited
compute & power budgets
Designing accurate and efficient
neural networks is challenging.
Intractable Design Space
• Design space of Deep Neural Nets is huge!
• VGG16[1] has 16 layers
• Design choices for each layer:
• kernel size = {1, 3, 5}
• channel size = {32, 64, 128, 256, 512}
• Search space = (3 x 5)^16 ≈ 6.6e18 candidate networks
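The size of this search space follows from multiplying the per-layer choices across all layers. A minimal sketch using only the values listed above:

```python
# Per-layer design choices taken from the slide.
kernel_sizes = [1, 3, 5]
channel_sizes = [32, 64, 128, 256, 512]
num_layers = 16

# Each layer picks one kernel size and one channel size independently.
choices_per_layer = len(kernel_sizes) * len(channel_sizes)
search_space_size = choices_per_layer ** num_layers

print(f"{choices_per_layer} choices per layer, {search_space_size:.2e} networks")
```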
[1] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
Conditional Optimality
Target platforms: GPUs, iPhones, Android phones, low-end phones, wearables, IoT
• Ideally, we should design different neural networks for different
devices/tasks/computation budgets
• In reality, due to the cost of designing and training neural networks, we can only
afford to design one and deploy it to all conditions
Inconsistent Efficiency Metrics
• Previous works focus on reducing parameter size or MACs (number of
multiply-accumulate operations)
• However, a lower MAC count does not necessarily mean lower latency
– Dilated convolution is slower due to its more complicated
memory access pattern
– NASNet-A has a slightly lower MAC count than MobileNetV1, but its
latency is 1.6x higher
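The dilated-convolution point can be seen from the MAC formula itself: dilation changes which input pixels are read, not how many multiply-accumulates are performed. The sketch below uses the standard MAC count of a 2D convolution; the layer dimensions are hypothetical values chosen for illustration.

```python
def conv2d_macs(h_out, w_out, kernel_size, c_in, c_out, dilation=1):
    """MACs of a standard 2D convolution producing an h_out x w_out x c_out map.

    `dilation` is accepted only to make the point explicit: it affects the
    memory access pattern, not the number of multiply-accumulates.
    """
    del dilation  # intentionally unused in the count
    return h_out * w_out * kernel_size * kernel_size * c_in * c_out


# Hypothetical layer: 56 x 56 output, 3 x 3 kernel, 64 -> 64 channels.
dense = conv2d_macs(56, 56, 3, 64, 64, dilation=1)
dilated = conv2d_macs(56, 56, 3, 64, 64, dilation=2)
print(dense, dilated, dense == dilated)
```

Both layers report the same MAC count, so any latency difference between them has to come from something MACs do not measure.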
Figures: dilated convolution [1]; NASNet [2]
[1] Yu, Fisher, and Vladlen Koltun. "Multi-scale context aggregation by dilated convolutions." arXiv preprint arXiv:1511.07122 (2015).
[2] Zoph, Barret, et al. "Learning transferable architectures for scalable image recognition." arXiv preprint arXiv:1707.07012 (2017).
Rethinking the flow for neural
network design.
Using Off-the-shelf Models
• Dealing with hardware constraints:
• Model is too big/small
• Can’t support 1x1, 3x3, or 5x5
convolutions
• Too slow with XXX operators
• Can’t support residual connection
• ReLU must follow convolutions
• Fixed input size
Manual Design
• Manual design:
• Can only afford a few iterations
(Previous) Neural Architecture Search
• Search-based neural architecture search
• Computationally expensive: [1] takes 450 GPUs for 4-5 days
[1] Zoph, Barret, et al. "Learning transferable architectures for scalable image recognition." arXiv:1707.07012 (2017).
DNAS: Differentiable Neural Architecture Search
Desirable features
• A general framework to support arbitrary design spaces
• Optimize for actual efficiency metrics (such as latency)
• Reasonable search cost
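The slides do not spell out how the search is made differentiable. As an illustrative assumption only, one common relaxation weights each candidate operator by a softmax over learnable architecture parameters, so that the expected latency read from a per-operator lookup table becomes a differentiable function of those parameters. The latencies below are hypothetical, and a real search would combine this term with the accuracy loss.

```python
import numpy as np


def softmax(x):
    z = np.exp(x - np.max(x))
    return z / z.sum()


# Hypothetical measured latencies (ms) of three candidate operators in one layer.
op_latency_ms = np.array([1.0, 2.5, 4.0])

# Learnable architecture parameters, one per candidate operator.
theta = np.zeros(3)


def expected_latency(theta):
    """Layer latency averaged under the softmax distribution over operators."""
    return float(softmax(theta) @ op_latency_ms)


def expected_latency_grad(theta):
    """Analytic gradient of the expected latency with respect to theta."""
    p = softmax(theta)
    return p * (op_latency_ms - p @ op_latency_ms)


# Gradient descent on latency alone shifts probability to the fastest operator.
for _ in range(100):
    theta -= 0.5 * expected_latency_grad(theta)

print(softmax(theta), expected_latency(theta))
```

Because the latency term is an ordinary differentiable function, it can be optimized with the same gradient-based training loop as the network weights, which is what keeps the search cost reasonable.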
Using DNAS to search for mixed
precision quantization strategy
Mixed Precision Quantization
• Quantizing different layers of a ConvNet to different precisions
• Candidate operators are convolutions with quantized weights and activations
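To illustrate what a candidate operator at a given precision looks like, here is a generic uniform quantizer applied at several bit-widths. It is a sketch of one common scheme, not necessarily the quantization function used in this work; `quantize_uniform` and the random weights are hypothetical.

```python
import numpy as np


def quantize_uniform(w, num_bits):
    """Round w onto 2**num_bits evenly spaced levels spanning [w.min(), w.max()]."""
    levels = 2 ** num_bits - 1
    lo, hi = float(w.min()), float(w.max())
    if hi == lo:
        return w.copy()
    scale = (hi - lo) / levels
    return np.round((w - lo) / scale) * scale + lo


# Hypothetical weights standing in for one convolution layer.
rng = np.random.default_rng(0)
weights = rng.standard_normal(1000).astype(np.float32)

for bits in (2, 4, 8):
    err = np.abs(weights - quantize_uniform(weights, bits)).mean()
    print(f"{bits}-bit: mean abs error {err:.4f}")
```

Each bit-width trades model size against quantization error, which is the per-layer choice the search has to make.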
Mixed Precision Quantization
Model             ResNet18 reference  DNAS (ours)  TTQ [1]  ADMM [2]
Precision         full                mixed        2-bit    3-bit
Accuracy          69.60%              69.58%       66.60%   68.0%
Compression rate  1.0x                21.1x        16.0x    10.7x
[1] Zhu, Chenzhuo, et al. "Trained ternary quantization." arXiv preprint arXiv:1612.01064 (2016).
[2] Leng, Cong, et al. "Extremely low bit neural network: Squeeze the last bit out with admm." arXiv preprint arXiv:1707.09870(2017).
Weight quantization on ImageNet dataset
• 21.1x smaller model size, 0.02% accuracy loss, 2.98% better than
TTQ, 1.58% better than ADMM
• Block-wise precision (a bit-width of 0 means the entire block is skipped):
Block ID   B1  B2  B3  B4  B5  B6  B7  B8  B9
Bit-width  2   3   0   2   4   2   3   2   1
Mixed Precision Quantization
[1] Choi, Jungwook, et al. "PACT: Parameterized Clipping Activation for Quantized Neural Networks." arXiv preprint arXiv:1805.06085 (2018).
[2] Jung, Sangil, et al. "Joint training of low-precision neural network with quantization interval parameters." arXiv preprint arXiv:1808.05779 (2018).
[3] Zhuang, Bohan, Chunhua Shen, and Ian Reid. "Training Compact Neural Networks with Binary Weights and Low Precision Activations." arXiv preprint arXiv:1808.02631 (2018).
Weight & activation quantization on ImageNet dataset
Model             ResNet18 reference  DNAS (ours)  PACT [1]  QIP [2]  GroupNet [3]
Precision         full                mixed        w4a4      w4a4     w1a2g5
Accuracy          69.60%              68.65%       69.20%    69.30%   67.60%
Compression rate  1.0x                103.5x       64x       64x      102.4x
• Compression rate computed as: (32 x 32) / (weight-bit x activation-bit)
• 103.5x reduction of computational cost, <1% accuracy drop
• Search finished in 24 hours on 8 GPUs
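The compression rates in the table are consistent with dividing the 32-bit x 32-bit baseline cost by the product of the weight and activation bit-widths. A small sketch of that arithmetic; treating the "g5" in w1a2g5 as a 5x cost multiplier is an assumption made here to reproduce the 102.4x entry.

```python
FULL_PRECISION_BITS = 32


def compression_rate(weight_bits, activation_bits, groups=1):
    """Cost reduction relative to 32-bit weights and 32-bit activations.

    `groups` is an assumed cost multiplier for schemes such as w1a2g5.
    """
    return FULL_PRECISION_BITS ** 2 / (weight_bits * activation_bits * groups)


print(compression_rate(4, 4))            # w4a4
print(compression_rate(1, 2, groups=5))  # w1a2g5
```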
Using DNAS to search for efficient
neural network architectures
Efficient Architecture Search
Candidate block structure (tensor shape after each stage):
Input: H x W x Cin
-> 1x1 (group) Conv, ReLU: H x W x (e x Cin)
-> K x K DWConv, ReLU: (H/s) x (W/s) x (e x Cin)
-> 1x1 (group) Conv: (H/s) x (W/s) x Cout
-> + (add with the block input)
Candidate modules with different
hyper-parameters
• Kernel size: 3, 5
• Expansion rate: 1, 3, 6
• Skip: no-operation
• Each “layer” of a network can have different modules
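The candidate set for one layer can be enumerated directly from the hyper-parameters above, and the tensor shapes give a MAC count for each candidate. This is a sketch under simplifying assumptions: the group-convolution option, bias and activations are ignored, and the layer dimensions (28 x 28, 32 channels) are hypothetical.

```python
from itertools import product

KERNEL_SIZES = (3, 5)
EXPANSION_RATES = (1, 3, 6)


def block_macs(h, w, c_in, c_out, kernel, expansion, stride=1):
    """MACs of the three-stage block, ignoring groups, bias and activations."""
    c_mid = expansion * c_in
    h_out, w_out = h // stride, w // stride
    expand = h * w * c_in * c_mid                        # 1x1 conv
    depthwise = h_out * w_out * kernel * kernel * c_mid  # K x K depthwise conv
    project = h_out * w_out * c_mid * c_out              # 1x1 conv
    return expand + depthwise + project


# Candidates for one layer: every (kernel, expansion) pair plus "skip".
candidates = [("skip", 0)] + [
    (f"k{k}_e{e}", block_macs(28, 28, 32, 32, k, e))
    for k, e in product(KERNEL_SIZES, EXPANSION_RATES)
]

for name, macs in candidates:
    print(f"{name:>6}: {macs / 1e6:.2f} M MACs")
```

Without the group-convolution variants this gives seven candidates per layer, spanning a wide range of cost, which is what lets the search trade accuracy against efficiency layer by layer.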
FBNets: ConvNets discovered by DNAS
FBNet vs. MobileNet & MNasNet
[1] Sandler, Mark, et al. "MobileNetV2: Inverted Residuals and Linear Bottlenecks." CVPR 2018.
[2] Tan, Mingxing, et al. "Mnasnet: Platform-aware neural architecture search for mobile." arXiv preprint arXiv:1807.11626 (2018).
• FBNet-B has about the same accuracy as MobileNetV2-1.3 [1] (74.1% vs. 74.4%), but 1.5x lower latency
• The smallest FBNet achieves 4.7% higher accuracy than the comparable MobileNetV2 (50.2% vs. 45.5%); its latency is only 2.9 ms (345 frames per second) on a Samsung Galaxy S8 phone.
• The search cost of DNAS is 216 GPU hours (about 8 GPUs for a day), 421x smaller than that of MnasNet [2], whose efficient ConvNets were discovered by reinforcement learning
Model                 Search cost (GPU hours)  # MACs (M)  Latency (ms)  ImageNet top-1 acc (%)
MobileNetV2-0.35-69   -                        11          3.8           45.50
FBNet-0.35-96 (ours)  216                      12.9        2.9           50.20
MobileNetV2-1.0       -                        300         21.7          72.0
MnasNet-65            91,000                   270         -             73.0
FBNet-A (ours)        216                      249         19.8          73.0
MobileNetV2-1.3       -                        509         33.8          74.4
MnasNet               91,000                   317         23.7          74.0
FBNet-B (ours)        216                      295         23.1          74.1
MobileNetV2-1.4       -                        585         37.4          74.7
MnasNet-92            91,000                   388         -             74.8
FBNet-C (ours)        216                      375         28.1          74.9
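The headline ratios on this slide can be recomputed from the table entries. A short worked check, using only numbers that appear in the table:

```python
# Search cost: MnasNet vs. FBNet (GPU hours).
mnasnet_search_gpu_hours = 91_000
fbnet_search_gpu_hours = 216
search_cost_ratio = mnasnet_search_gpu_hours / fbnet_search_gpu_hours

# Latency: MobileNetV2-1.3 vs. FBNet-B (ms).
mobilenetv2_13_latency_ms = 33.8
fbnet_b_latency_ms = 23.1
latency_ratio = mobilenetv2_13_latency_ms / fbnet_b_latency_ms

# Frame rate of the smallest FBNet from its 2.9 ms latency.
smallest_fbnet_latency_ms = 2.9
frames_per_second = 1000 / smallest_fbnet_latency_ms

print(f"search cost ratio: {search_cost_ratio:.0f}x")
print(f"latency ratio: {latency_ratio:.2f}x")
print(f"frames per second: {frames_per_second:.0f}")
```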
FBNet vs. MobileNet & MNasNet
[Figure: ImageNet top-1 accuracy vs. latency; longer latency is worse]
• MobileNetV2 [1]: Acc: 71.8%, lat: 21.7 ms
• MobileNetV2-1.3 [1]: Acc: 74.4%, lat: 33.8 ms
• MobileNetV2-1.4 [1]: Acc: 74.7%, lat: 37.4 ms
• MnasNet [2]: Acc: 74.0%, lat: 23.7 ms
• DNASNet-A (ours): Acc: 73.0%, lat: 19.8 ms
• DNASNet-B (ours): Acc: 74.1%, lat: 23.1 ms
• DNASNet-C (ours): Acc: 74.9%, lat: 28.1 ms
* Estimated from the paper description
[1] Sandler, Mark, et al. "MobileNetV2: Inverted Residuals and Linear Bottlenecks." CVPR 2018.
[2] Tan, Mingxing, et al. "Mnasnet: Platform-aware neural architecture search for mobile." arXiv:1807.11626 (2018).
FBNet Compared with Other NAS
[Figure: ImageNet top-1 accuracy (y-axis, higher is better) vs. MACs (x-axis, more is worse); mark size: search cost; circles: search cost unknown]
• NAS [1]: Acc: 74.0%, MACs: 564M, search cost: 48,000 GPU-hrs
• PNAS [2]: Acc: 74.2%, MACs: 588M, search cost*: 6,000 GPU-hrs
• DARTS [3]: Acc: 73.1%, MACs: 595M, search cost: 288 GPU-hrs
• MobileNetV2 [4]: Acc: 71.8%, MACs: 300M
• AMC [5]: Acc: 70.8%, MACs: 150M
• MnasNet [6]: Acc: 74.0%, MACs: 317M, search cost*: 91,000 GPU-hrs
• FBNet (ours): Acc: 74.1%, MACs: 295M, search cost: 216 GPU-hrs
* Estimated from the paper description
[1] Zoph, Barret, et al. "Learning transferable architectures for scalable image recognition." arXiv:1707.07012 (2017).
[2] Liu, Chenxi, et al. "Progressive neural architecture search." arXiv:1712.00559 (2017).
[3] Liu, Hanxiao, Karen Simonyan, and Yiming Yang. "Darts: Differentiable architecture search." arXiv:1806.09055 (2018).
[4] Sandler, Mark, et al. "MobileNetV2: Inverted Residuals and Linear Bottlenecks." CVPR 2018.
[5] He, Yihui, et al. "Amc: Automl for model compression and acceleration on mobile devices." ECCV 2018.
[6] Tan, Mingxing, et al. "Mnasnet: Platform-aware neural architecture search for mobile." arXiv:1807.11626 (2018).
Result: FBNet for different target devices
• Apple A11
• Big: 2 ARMv8 @ 2.5 GHz
• Little: 4 ARMv8 @ 1.4 GHz
• Vectorization: 4-wide 32-bit MAC
• LPDDR4x memory (30 GB/s)
• GPU + Neural Processing Engine
• Snapdragon 835
• Big: 4 ARMv8 @ 2.4 GHz
• Little: 4 ARMv8 @ 1.9 GHz
• Vectorization: 4-wide 32-bit MAC
• LPDDR4x memory (30 GB/s)
• Adreno 540 GPU
[Figure: FBNet latency on target devices (iPhone X, Samsung S8), comparing the target model for iPhone X with the target model for Samsung S8; 1.4x speedup]
• Under a similar accuracy constraint (73.27% vs. 73.20%), the
FBNet optimized for iPhone X achieves a 1.4x speedup over the
Samsung-optimized model
FBNet visualization
Result: FBNet for different target devices
• DNAS automatically adopts operators with low latency on the targeted devices
DNAS summary
General search space: the search space of each layer can contain arbitrary operators. This
allows DNAS to be applied to different target processors.
Extremely fast: the search typically takes 8 GPUs for 24 hours to finish. In comparison, to
find models with similar performance, MnasNet requires 421x more computing
resources.
State-of-the-art performance:
• Mixed precision quantization: 21x model size reduction or 104x computational cost
reduction, with almost no accuracy loss
• Efficient architecture search: same accuracy, 1.5x faster, 2.4x smaller MACs
Optimize for actual latency on targeted devices:
• Up to 1.4x speedup compared to non-targeted neural architectures
References
Paper:
• Wu, Bichen, et al. "FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable
Neural Architecture Search." arXiv preprint arXiv:1812.03443 (2018).
• Wu, Bichen, et al. "Mixed Precision Quantization of ConvNets via Differentiable Neural
Architecture Search." arXiv preprint arXiv:1812.00090 (2018).
FBNet models:
• Will be open-sourced soon!
Questions and feedback:
• Contact me via email: bichen@berkeley.edu
29

More Related Content

What's hot

A Review of Comparison Techniques of Image Steganography
A Review of Comparison Techniques of Image SteganographyA Review of Comparison Techniques of Image Steganography
A Review of Comparison Techniques of Image SteganographyIOSR Journals
 
COMPLETE END-TO-END LOW COST SOLUTION TO A 3D SCANNING SYSTEM WITH INTEGRATED...
COMPLETE END-TO-END LOW COST SOLUTION TO A 3D SCANNING SYSTEM WITH INTEGRATED...COMPLETE END-TO-END LOW COST SOLUTION TO A 3D SCANNING SYSTEM WITH INTEGRATED...
COMPLETE END-TO-END LOW COST SOLUTION TO A 3D SCANNING SYSTEM WITH INTEGRATED...ijcsit
 
卒業研究 角島康太郎
卒業研究 角島康太郎卒業研究 角島康太郎
卒業研究 角島康太郎ssuser415225
 
Applying convolutional neural networks for limited-memory application
Applying convolutional neural networks for limited-memory applicationApplying convolutional neural networks for limited-memory application
Applying convolutional neural networks for limited-memory applicationTELKOMNIKA JOURNAL
 
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...IJCSIS Research Publications
 
Deep learning health care
Deep learning health care  Deep learning health care
Deep learning health care Meenakshi Sood
 
Secure IoT Systems Monitor Framework using Probabilistic Image Encryption
Secure IoT Systems Monitor Framework using Probabilistic Image EncryptionSecure IoT Systems Monitor Framework using Probabilistic Image Encryption
Secure IoT Systems Monitor Framework using Probabilistic Image EncryptionIJAEMSJORNAL
 
Reversible Encrypytion and Information Concealment
Reversible Encrypytion and Information ConcealmentReversible Encrypytion and Information Concealment
Reversible Encrypytion and Information ConcealmentIJERA Editor
 
Hermes: Enabling Energy-efficient IoT Networks with Generalized Deduplication
Hermes: Enabling Energy-efficient IoT Networks with Generalized DeduplicationHermes: Enabling Energy-efficient IoT Networks with Generalized Deduplication
Hermes: Enabling Energy-efficient IoT Networks with Generalized DeduplicationLEGATO project
 
Labeling fundus images for classification models
Labeling fundus images for classification modelsLabeling fundus images for classification models
Labeling fundus images for classification modelsPetteriTeikariPhD
 
“Explainability in Computer Vision: A Machine Learning Engineer’s Overview,” ...
“Explainability in Computer Vision: A Machine Learning Engineer’s Overview,” ...“Explainability in Computer Vision: A Machine Learning Engineer’s Overview,” ...
“Explainability in Computer Vision: A Machine Learning Engineer’s Overview,” ...Edge AI and Vision Alliance
 
A new image steganography algorithm based
A new image steganography algorithm basedA new image steganography algorithm based
A new image steganography algorithm basedIJNSA Journal
 
Image Restoration for 3D Computer Vision
Image Restoration for 3D Computer VisionImage Restoration for 3D Computer Vision
Image Restoration for 3D Computer VisionPetteriTeikariPhD
 
Small Deep-Neural-Networks: Their Advantages and Their Design
Small Deep-Neural-Networks: Their Advantages and Their DesignSmall Deep-Neural-Networks: Their Advantages and Their Design
Small Deep-Neural-Networks: Their Advantages and Their DesignForrest Iandola
 

What's hot (19)

A Review of Comparison Techniques of Image Steganography
A Review of Comparison Techniques of Image SteganographyA Review of Comparison Techniques of Image Steganography
A Review of Comparison Techniques of Image Steganography
 
COMPLETE END-TO-END LOW COST SOLUTION TO A 3D SCANNING SYSTEM WITH INTEGRATED...
COMPLETE END-TO-END LOW COST SOLUTION TO A 3D SCANNING SYSTEM WITH INTEGRATED...COMPLETE END-TO-END LOW COST SOLUTION TO A 3D SCANNING SYSTEM WITH INTEGRATED...
COMPLETE END-TO-END LOW COST SOLUTION TO A 3D SCANNING SYSTEM WITH INTEGRATED...
 
卒業研究 角島康太郎
卒業研究 角島康太郎卒業研究 角島康太郎
卒業研究 角島康太郎
 
Applying convolutional neural networks for limited-memory application
Applying convolutional neural networks for limited-memory applicationApplying convolutional neural networks for limited-memory application
Applying convolutional neural networks for limited-memory application
 
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
 
Deep learning health care
Deep learning health care  Deep learning health care
Deep learning health care
 
G0210032039
G0210032039G0210032039
G0210032039
 
CI image processing
CI image processing CI image processing
CI image processing
 
Secure IoT Systems Monitor Framework using Probabilistic Image Encryption
Secure IoT Systems Monitor Framework using Probabilistic Image EncryptionSecure IoT Systems Monitor Framework using Probabilistic Image Encryption
Secure IoT Systems Monitor Framework using Probabilistic Image Encryption
 
Steganography
Steganography Steganography
Steganography
 
Reversible Encrypytion and Information Concealment
Reversible Encrypytion and Information ConcealmentReversible Encrypytion and Information Concealment
Reversible Encrypytion and Information Concealment
 
Hermes: Enabling Energy-efficient IoT Networks with Generalized Deduplication
Hermes: Enabling Energy-efficient IoT Networks with Generalized DeduplicationHermes: Enabling Energy-efficient IoT Networks with Generalized Deduplication
Hermes: Enabling Energy-efficient IoT Networks with Generalized Deduplication
 
Labeling fundus images for classification models
Labeling fundus images for classification modelsLabeling fundus images for classification models
Labeling fundus images for classification models
 
“Explainability in Computer Vision: A Machine Learning Engineer’s Overview,” ...
“Explainability in Computer Vision: A Machine Learning Engineer’s Overview,” ...“Explainability in Computer Vision: A Machine Learning Engineer’s Overview,” ...
“Explainability in Computer Vision: A Machine Learning Engineer’s Overview,” ...
 
1
11
1
 
A new image steganography algorithm based
A new image steganography algorithm basedA new image steganography algorithm based
A new image steganography algorithm based
 
Image Restoration for 3D Computer Vision
Image Restoration for 3D Computer VisionImage Restoration for 3D Computer Vision
Image Restoration for 3D Computer Vision
 
F1803063236
F1803063236F1803063236
F1803063236
 
Small Deep-Neural-Networks: Their Advantages and Their Design
Small Deep-Neural-Networks: Their Advantages and Their DesignSmall Deep-Neural-Networks: Their Advantages and Their Design
Small Deep-Neural-Networks: Their Advantages and Their Design
 

Similar to "Enabling Automated Design of Computationally Efficient Deep Neural Networks," a Presentation from UC Berkeley

Compact optimized deep learning model for edge: a review
Compact optimized deep learning model for edge: a reviewCompact optimized deep learning model for edge: a review
Compact optimized deep learning model for edge: a reviewIJECEIAES
 
ACCELERATED DEEP LEARNING INFERENCE FROM CONSTRAINED EMBEDDED DEVICES
ACCELERATED DEEP LEARNING INFERENCE FROM CONSTRAINED EMBEDDED DEVICESACCELERATED DEEP LEARNING INFERENCE FROM CONSTRAINED EMBEDDED DEVICES
ACCELERATED DEEP LEARNING INFERENCE FROM CONSTRAINED EMBEDDED DEVICESIAEME Publication
 
A Survey on Image Processing using CNN in Deep Learning
A Survey on Image Processing using CNN in Deep LearningA Survey on Image Processing using CNN in Deep Learning
A Survey on Image Processing using CNN in Deep LearningIRJET Journal
 
Content-based image retrieval based on corel dataset using deep learning
Content-based image retrieval based on corel dataset using deep learningContent-based image retrieval based on corel dataset using deep learning
Content-based image retrieval based on corel dataset using deep learningIAESIJAI
 
A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING
A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING
A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING mlaij
 
1-bit semantic segmentation
1-bit semantic segmentation1-bit semantic segmentation
1-bit semantic segmentationJeonghoonKim30
 
Recent developments in Deep Learning
Recent developments in Deep LearningRecent developments in Deep Learning
Recent developments in Deep LearningBrahim HAMADICHAREF
 
A survey on the layers of convolutional Neural Network
A survey on the layers of convolutional Neural NetworkA survey on the layers of convolutional Neural Network
A survey on the layers of convolutional Neural NetworkSasanko Sekhar Gantayat
 
Design and Implementation of JPEG CODEC using NoC
Design and Implementation of JPEG CODEC using NoCDesign and Implementation of JPEG CODEC using NoC
Design and Implementation of JPEG CODEC using NoCIRJET Journal
 
Residual balanced attention network for real-time traffic scene semantic segm...
Residual balanced attention network for real-time traffic scene semantic segm...Residual balanced attention network for real-time traffic scene semantic segm...
Residual balanced attention network for real-time traffic scene semantic segm...IJECEIAES
 
REVIEW ON OBJECT DETECTION WITH CNN
REVIEW ON OBJECT DETECTION WITH CNNREVIEW ON OBJECT DETECTION WITH CNN
REVIEW ON OBJECT DETECTION WITH CNNIRJET Journal
 
International Journal of Engineering Research and Development (IJERD)
 International Journal of Engineering Research and Development (IJERD) International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
Cisco Network Convergence System: Building the Foundation for the Internet of...
Cisco Network Convergence System: Building the Foundation for the Internet of...Cisco Network Convergence System: Building the Foundation for the Internet of...
Cisco Network Convergence System: Building the Foundation for the Internet of...Cisco Service Provider
 
8 of the Must-Read Network & Data Communication Articles Published this weeke...
8 of the Must-Read Network & Data Communication Articles Published this weeke...8 of the Must-Read Network & Data Communication Articles Published this weeke...
8 of the Must-Read Network & Data Communication Articles Published this weeke...IJCNCJournal
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)inventionjournals
 
Efficient addressing schemes for internet of things
Efficient addressing schemes for internet of thingsEfficient addressing schemes for internet of things
Efficient addressing schemes for internet of thingsIJECEIAES
 

Similar to "Enabling Automated Design of Computationally Efficient Deep Neural Networks," a Presentation from UC Berkeley (20)

team12.project_ver_1_(1).pptx
team12.project_ver_1_(1).pptxteam12.project_ver_1_(1).pptx
team12.project_ver_1_(1).pptx
 
Compact optimized deep learning model for edge: a review
Compact optimized deep learning model for edge: a reviewCompact optimized deep learning model for edge: a review
Compact optimized deep learning model for edge: a review
 
ACCELERATED DEEP LEARNING INFERENCE FROM CONSTRAINED EMBEDDED DEVICES
ACCELERATED DEEP LEARNING INFERENCE FROM CONSTRAINED EMBEDDED DEVICESACCELERATED DEEP LEARNING INFERENCE FROM CONSTRAINED EMBEDDED DEVICES
ACCELERATED DEEP LEARNING INFERENCE FROM CONSTRAINED EMBEDDED DEVICES
 
kanimozhi2019.pdf
kanimozhi2019.pdfkanimozhi2019.pdf
kanimozhi2019.pdf
 
A Survey on Image Processing using CNN in Deep Learning
A Survey on Image Processing using CNN in Deep LearningA Survey on Image Processing using CNN in Deep Learning
A Survey on Image Processing using CNN in Deep Learning
 
Content-based image retrieval based on corel dataset using deep learning
Content-based image retrieval based on corel dataset using deep learningContent-based image retrieval based on corel dataset using deep learning
Content-based image retrieval based on corel dataset using deep learning
 
A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING
A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING
A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING
 
1-bit semantic segmentation
1-bit semantic segmentation1-bit semantic segmentation
1-bit semantic segmentation
 
Recent developments in Deep Learning
Recent developments in Deep LearningRecent developments in Deep Learning
Recent developments in Deep Learning
 
A survey on the layers of convolutional Neural Network
A survey on the layers of convolutional Neural NetworkA survey on the layers of convolutional Neural Network
A survey on the layers of convolutional Neural Network
 
Design and Implementation of JPEG CODEC using NoC
Design and Implementation of JPEG CODEC using NoCDesign and Implementation of JPEG CODEC using NoC
Design and Implementation of JPEG CODEC using NoC
 
Residual balanced attention network for real-time traffic scene semantic segm...
Residual balanced attention network for real-time traffic scene semantic segm...Residual balanced attention network for real-time traffic scene semantic segm...
Residual balanced attention network for real-time traffic scene semantic segm...
 
Atul
AtulAtul
Atul
 
REVIEW ON OBJECT DETECTION WITH CNN
REVIEW ON OBJECT DETECTION WITH CNNREVIEW ON OBJECT DETECTION WITH CNN
REVIEW ON OBJECT DETECTION WITH CNN
 
International Journal of Engineering Research and Development (IJERD)
 International Journal of Engineering Research and Development (IJERD) International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
Cisco Network Convergence System: Building the Foundation for the Internet of...
Cisco Network Convergence System: Building the Foundation for the Internet of...Cisco Network Convergence System: Building the Foundation for the Internet of...
Cisco Network Convergence System: Building the Foundation for the Internet of...
 
8 of the Must-Read Network & Data Communication Articles Published this weeke...
8 of the Must-Read Network & Data Communication Articles Published this weeke...8 of the Must-Read Network & Data Communication Articles Published this weeke...
8 of the Must-Read Network & Data Communication Articles Published this weeke...
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
 
Efficient addressing schemes for internet of things
Efficient addressing schemes for internet of thingsEfficient addressing schemes for internet of things
Efficient addressing schemes for internet of things
 
Presentation
PresentationPresentation
Presentation
 

More from Edge AI and Vision Alliance

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...Edge AI and Vision Alliance
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...Edge AI and Vision Alliance
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...Edge AI and Vision Alliance
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...Edge AI and Vision Alliance
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...Edge AI and Vision Alliance
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...Edge AI and Vision Alliance
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsightsEdge AI and Vision Alliance
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...Edge AI and Vision Alliance
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
"Enabling Automated Design of Computationally Efficient Deep Neural Networks," a Presentation from UC Berkeley

  • 1. © 2019 Bichen Wu Enabling Automated Design of Computationally Efficient Deep Neural Networks Bichen Wu UC Berkeley May 2019 bichen@berkeley.edu
  • 2. Neural networks for embedded vision
  • 3. Need for Embedded Vision
      • Applications: augmented reality, biometric identification, autonomous driving, Internet-of-things
      • Privacy concerns
      • Latency constraints
      • Availability, reliability and cost of data transmission
  • 4. Computation Complexity of Neural Networks
      • DGX-1: 170 TOPS, 3.2 kW, 128 GB memory
      • TitanX: 11 TOPS, 223 W, 12 GB memory
      • Smartphones: 800 MOPs, 3 W, 2-4 GB memory
      • Embedded devices: 100s of MHz, <5 W, <1 GB memory
      • VGG16 [1] model: parameter size 552 MB; memory 93 MB/image; computation 15.8 GOPs/image
      [1] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
  • 5. Goal: Accurate AND Efficient Neural Networks
      • Embedded computer vision requires accurate AND efficient neural networks
      • Accuracy: essential for many applications, including security cameras and autonomous driving
      • Efficiency: real-time inference speed on embedded processors with limited compute and power budgets
  • 6. Designing accurate and efficient neural networks is challenging.
  • 7. Intractable Design Space
      • The design space of deep neural nets is huge
      • VGG16 [1] has 16 layers
      • Design choices for each layer: kernel size = {1, 3, 5}; channel size = {32, 64, 128, 256, 512}
      • Search space = (3 x 5)^16 ≈ 7e18
      [1] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
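The search-space count on slide 7 can be checked with a few lines of Python. This is only an arithmetic sketch of the slide's simplified model (every layer independently picks one kernel size and one channel size); the variable names are illustrative.

```python
from itertools import product

# Per-layer design choices as listed on the slide.
kernel_sizes = [1, 3, 5]
channel_sizes = [32, 64, 128, 256, 512]
num_layers = 16  # VGG16

# Every (kernel, channel) pair is one option for a single layer.
per_layer_choices = list(product(kernel_sizes, channel_sizes))

# Layers choose independently, so the counts multiply across layers.
search_space_size = len(per_layer_choices) ** num_layers

print(f"{len(per_layer_choices)} choices per layer")
print(f"{search_space_size:.1e} candidate networks")
```

The exact value is 15^16, roughly 6.6e18, which the slide rounds to 7e18.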
  • 8. Conditional Optimality
      • Target devices: IoT, GPUs, iPhones, Android phones, low-end phones, wearables
      • Ideally, we should design different neural networks for different devices, tasks and computation budgets
      • In reality, because designing and training neural networks is costly, we can only afford to design one network and deploy it under all conditions
  • 9. Inconsistent Efficiency Metrics
      • Previous work focuses on reducing parameter size or MACs (the number of multiply-accumulate operations)
      • However, a lower MAC count does not necessarily mean lower latency
          • Dilated convolution [1] is slower because of its more complicated memory access pattern
          • NASNet-A [2] has slightly fewer MACs than MobileNetV1, but its latency is 1.6x higher
      [1] Yu, Fisher, and Vladlen Koltun. "Multi-scale context aggregation by dilated convolutions." arXiv preprint arXiv:1511.07122 (2015).
      [2] Zoph, Barret, et al. "Learning transferable architectures for scalable image recognition." arXiv preprint arXiv:1707.07012 (2017).
  • 10. Rethinking the flow for neural network design.
  • 11. Using Off-the-shelf Models
      • Dealing with hardware constraints:
          • Model is too big/small
          • Can't support 1x1, 3x3, or 5x5 convolutions
          • Too slow with XXX operators
          • Can't support residual connections
          • ReLU must follow convolutions
          • Fixed input size
  • 12. Manual Design
      • Can only afford a few iterations
  • 13. (Previous) Neural Architecture Search
      • Search-based neural architecture search
      • Computationally expensive: [1] takes 450 GPUs for 4-5 days
      [1] Zoph, Barret, et al. "Learning transferable architectures for scalable image recognition." arXiv preprint arXiv:1707.07012 (2017).
  • 14. DNAS: Differentiable Neural Architecture Search
      • Desirable features:
          • A general framework that supports arbitrary design spaces
          • Optimizes for actual efficiency metrics (such as latency)
          • Reasonable search cost
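Slide 14 lists the goals of DNAS but not the mechanism. One way to make a latency objective differentiable is sketched below. It assumes each layer holds a softmax distribution over its candidate operators and that each operator's latency comes from a pre-measured lookup table. The function names, the table values and the logits are all hypothetical and are not taken from the slides.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_latency(arch_logits, latency_table_ms):
    """Sum, over layers, of the probability-weighted operator latency.

    Because the result is a smooth function of the logits, it could be added
    to a training loss and minimized by gradient descent (assumption).
    """
    total = 0.0
    for logits, latencies in zip(arch_logits, latency_table_ms):
        probs = softmax(logits)
        total += sum(p * lat for p, lat in zip(probs, latencies))
    return total

# Hypothetical lookup table: 2 layers, 3 candidate operators each (ms).
latency_table_ms = [[1.0, 2.0, 4.0], [0.5, 1.5, 3.0]]

# Uniform logits: every candidate is equally likely.
uniform = expected_latency([[0.0] * 3, [0.0] * 3], latency_table_ms)

# Logits that strongly favor the fastest candidate in each layer.
fast = expected_latency([[5.0, 0.0, 0.0], [5.0, 0.0, 0.0]], latency_table_ms)

print(f"uniform: {uniform:.2f} ms, fast-biased: {fast:.2f} ms")
```

Shifting probability mass toward low-latency operators lowers the expected latency, which is the behavior a latency-aware search would reward.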
  • 15. Using DNAS to search for mixed precision quantization strategy
  • 16. Mixed Precision Quantization
      • Quantize different layers of a ConvNet to different precisions
      • Candidate operators are convolutions with quantized weights and activations
  • 17. Mixed Precision Quantization
      Weight quantization on the ImageNet dataset:

      Model               ResNet18 reference   DNAS (ours)   TTQ [1]   ADMM [2]
      Precision           full                 mixed         2-bit     3-bit
      Accuracy            69.60%               69.58%        66.60%    68.0%
      Compression rate    1.0x                 21.1x         16.0x     10.7x

      • 21.1x smaller model size with a 0.02% accuracy loss; 2.98% more accurate than TTQ and 1.58% more accurate than ADMM
      • Block-wise precision (a bit-width of 0 means the entire block is skipped):

      Block ID     B1   B2   B3   B4   B5   B6   B7   B8   B9
      Bit-width    2    3    0    2    4    2    3    2    1

      [1] Zhu, Chenzhuo, et al. "Trained ternary quantization." arXiv preprint arXiv:1612.01064 (2016).
      [2] Leng, Cong, et al. "Extremely low bit neural network: Squeeze the last bit out with admm." arXiv preprint arXiv:1707.09870 (2017).
  • 18. Mixed Precision Quantization
      Weight & activation quantization on the ImageNet dataset:

      Model               ResNet18 reference   DNAS (ours)   PACT [1]   QIP [2]   GroupNet [3]
      Precision           full                 mixed         w4a4       w4a4      w1a2g5
      Accuracy            69.60%               68.65%        69.20%     69.30%    67.60%
      Compression rate    1.0x                 103.5x        64x        64x       102.4x

      • Compression rate computed as: (32 x 32) / (weight-bit x activation-bit)
      • 103.5x reduction in computational cost with <1% accuracy drop
      • Search finished in 24 hours on 8 GPUs
      [1] Choi, Jungwook, et al. "PACT: Parameterized Clipping Activation for Quantized Neural Networks." arXiv preprint arXiv:1805.06085 (2018).
      [2] Jung, Sangil, et al. "Joint training of low-precision neural network with quantization interval parameters." arXiv preprint arXiv:1808.05779 (2018).
      [3] Zhuang, Bohan, Chunhua Shen, and Ian Reid. "Training Compact Neural Networks with Binary Weights and Low Precision Activations." arXiv preprint arXiv:1808.02631 (2018).
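The compression rates in the slide 18 table follow from reading the rate as (32 x 32) / (weight bits x activation bits), that is, relative to a 32-bit weight and 32-bit activation baseline. The sketch below reproduces the 64x figure for w4a4. Treating the "g5" in w1a2g5 as five groups that multiply the cost is an assumption, made only because it reproduces the 102.4x in the table.

```python
def compression_rate(weight_bits, activation_bits, groups=1, full_precision_bits=32):
    """Cost of a full-precision multiply relative to the quantized one.

    `groups` is an assumed multiplier for group-style schemes such as w1a2g5.
    """
    quantized_cost = weight_bits * activation_bits * groups
    return (full_precision_bits * full_precision_bits) / quantized_cost

w4a4 = compression_rate(4, 4)                 # PACT and QIP rows
w1a2g5 = compression_rate(1, 2, groups=5)     # GroupNet row (assumed reading)

print(f"w4a4:   {w4a4:.1f}x")
print(f"w1a2g5: {w1a2g5:.1f}x")
```

The DNAS row is mixed precision, so its 103.5x is an aggregate over layers and cannot be derived from a single pair of bit-widths.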
  • 19. Using DNAS to search for efficient neural network architectures
  • 20. Efficient Architecture Search
      • Block structure (with a residual addition, "+"):
          • Input: H x W x Cin
          • 1x1 (group) Conv, ReLU: output H x W x (e x Cin)
          • K x K DWConv, ReLU: output (H/s) x (W/s) x (e x Cin)
          • 1x1 (group) Conv: output (H/s) x (W/s) x Cout
      • Candidate modules with different hyper-parameters:
          • Kernel size: 3, 5
          • Expansion rate: 1, 3, 6
          • Skip: no-operation
      • Each "layer" of a network can have different modules
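To see how the hyper-parameters on slide 20 change the cost of a block, the sketch below counts multiply-accumulate operations for the three convolutions using the tensor shapes given on the slide. It assumes ungrouped 1x1 convolutions, a stride applied in the depthwise convolution, and spatial sizes divisible by the stride. The function name and the example dimensions are hypothetical.

```python
def block_macs(h, w, c_in, c_out, kernel, expansion, stride=1):
    """MACs of a 1x1 expand -> KxK depthwise -> 1x1 project block."""
    c_mid = expansion * c_in                 # channels after expansion (e x Cin)
    h_out, w_out = h // stride, w // stride  # spatial size after the depthwise conv

    expand = h * w * c_in * c_mid                            # 1x1 Conv
    depthwise = h_out * w_out * kernel * kernel * c_mid      # K x K DWConv
    project = h_out * w_out * c_mid * c_out                  # 1x1 Conv
    return expand + depthwise + project

# The kernel/expansion combinations listed on the slide (skip costs no MACs).
candidates = [(k, e) for k in (3, 5) for e in (1, 3, 6)]

# Hypothetical layer: 14 x 14 feature map, 32 input and 32 output channels.
for kernel, expansion in candidates:
    macs = block_macs(14, 14, 32, 32, kernel, expansion)
    print(f"k={kernel}, e={expansion}: {macs / 1e6:.2f} M MACs")
```

MAC counts like these are only a proxy. As slide 9 notes, a lower MAC count does not guarantee lower latency on a given device.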
  • 21. FBNets: ConvNets discovered by DNAS
  • 22. FBNet vs. MobileNet & MNasNet
      • FBNet-B has the same accuracy as MobileNetV2 [1], but 1.5x lower latency
      • The smallest FBNet achieves 4.5% higher accuracy than MobileNetV2, with a latency of only 2.9 ms (345 frames per second) on a Samsung Galaxy S8 phone
      • The search cost of DNAS is 8 GPUs x 24 hours, 421x smaller than that of MnasNet [2], whose efficient ConvNets were discovered by reinforcement learning

      Model                   Search cost (GPU hours)   MACs (M)   Latency (ms)   ImageNet top-1 acc (%)
      MobileNetV2-0.35-69     -                         11         3.8            45.50
      FBNet-0.35-96 (ours)    216                       12.9       2.9            50.20
      MobileNetV2-1.0         -                         300        21.7           72.0
      MnasNet-65              91,000                    270        -              73.0
      FBNet-A (ours)          216                       249        19.8           73.0
      MobileNetV2-1.3         -                         509        33.8           74.4
      MnasNet                 91,000                    317        23.7           74.0
      FBNet-B (ours)          216                       295        23.1           74.1
      MobileNetV2-1.4         -                         585        37.4           74.7
      MnasNet-92              91,000                    388        -              74.8
      FBNet-C (ours)          216                       375        28.1           74.9

      [1] Sandler, Mark, et al. "MobileNetV2: Inverted Residuals and Linear Bottlenecks." CVPR18
      [2] Tan, Mingxing, et al. "Mnasnet: Platform-aware neural architecture search for mobile." arXiv preprint arXiv:1807.11626 (2018).
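The headline ratios on slide 22 can be recomputed from the table values. This is a plain arithmetic check. Comparing FBNet-B against the MobileNetV2-1.3 row is an assumption, chosen because that row has the closest accuracy (74.4% vs. 74.1%).

```python
# Values copied from the slide 22 table.
mnasnet_search_gpu_hours = 91_000
dnas_search_gpu_hours = 216
smallest_fbnet_latency_ms = 2.9
mobilenetv2_1_3_latency_ms = 33.8   # assumed comparison row for FBNet-B
fbnet_b_latency_ms = 23.1

search_cost_ratio = mnasnet_search_gpu_hours / dnas_search_gpu_hours
frames_per_second = 1000 / smallest_fbnet_latency_ms
latency_speedup = mobilenetv2_1_3_latency_ms / fbnet_b_latency_ms

print(f"search cost ratio: {search_cost_ratio:.0f}x")
print(f"throughput: {frames_per_second:.0f} frames per second")
print(f"FBNet-B speedup: {latency_speedup:.1f}x")
```

These reproduce the 421x, 345 frames per second and roughly 1.5x figures quoted in the bullets.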
  • 23. FBNet vs. MobileNet & MNasNet (scatter plot; axes: latency, where longer is worse, and ImageNet top-1 accuracy)
      • MobileNetV2 [1]: acc 71.8%, latency 21.7 ms
      • MobileNetV2-1.3 [1]: acc 74.4%, latency 33.8 ms
      • MobileNetV2-1.4 [1]: acc 74.7%, latency 37.4 ms
      • MnasNet [2]: acc 74.0%, latency 23.7 ms
      • DNASNet-A (ours): acc 73.0%, latency 19.8 ms
      • DNASNet-B (ours): acc 74.1%, latency 23.1 ms
      • DNASNet-C (ours): acc 74.9%, latency 28.1 ms
      [1] Sandler, Mark, et al. "MobileNetV2: Inverted Residuals and Linear Bottlenecks." CVPR18
      [2] Tan, Mingxing, et al. "Mnasnet: Platform-aware neural architecture search for mobile." arXiv preprint arXiv:1807.11626 (2018).
  • 24. FBNet Compared with Other NAS (scatter plot; x-axis: MACs, more is worse; y-axis: ImageNet top-1 accuracy, higher is better; mark size: search cost; circles: search cost unknown)
      • NAS [1]: acc 74.0%, MACs 564M, search cost 48,000 GPU-hrs
      • PNAS [2]: acc 74.2%, MACs 588M, search cost* 6,000 GPU-hrs
      • DARTS [3]: acc 73.1%, MACs 595M, search cost 288 GPU-hrs
      • MobileNetV2 [4]: acc 71.8%, MACs 300M
      • AMC [5]: acc 70.8%, MACs 150M
      • MnasNet [6]: acc 74.0%, MACs 317M, search cost* 91,000 GPU-hrs
      • FBNet (ours): acc 74.1%, MACs 295M, search cost 216 GPU-hrs
      * Estimated from the paper description
      [1] Zoph, Barret, et al. "Learning transferable architectures for scalable image recognition." arXiv preprint arXiv:1707.07012 (2017).
      [2] Liu, Chenxi, et al. "Progressive neural architecture search." arXiv preprint arXiv:1712.00559 (2017).
      [3] Liu, Hanxiao, Karen Simonyan, and Yiming Yang. "Darts: Differentiable architecture search." arXiv preprint arXiv:1806.09055 (2018).
      [4] Sandler, Mark, et al. "MobileNetV2: Inverted Residuals and Linear Bottlenecks." CVPR18
      [5] He, Yihui, et al. "Amc: Automl for model compression and acceleration on mobile devices." ECCV 2018.
      [6] Tan, Mingxing, et al. "Mnasnet: Platform-aware neural architecture search for mobile." arXiv preprint arXiv:1807.11626 (2018).
  • 25. Result: FBNet for different target devices
      • Apple A11:
          • Big: 2 ARMv8 @ 2.5 GHz
          • Little: 4 ARMv8 @ 1.4 GHz
          • Vectorization: 4-wide 32-bit MAC
          • LPDDR4x memory (30 GB/s)
          • GPU + Neural Processing Engine
      • Snapdragon 835:
          • Big: 4 ARMv8 @ 2.4 GHz
          • Little: 4 ARMv8 @ 1.9 GHz
          • Vectorization: 4-wide 32-bit MAC
          • LPDDR4x memory (30 GB/s)
          • Adreno 540 GPU
      • Bar chart: FBNet latency on target devices (iPhone X and Samsung S8), comparing the target model for iPhone X with the target model for Samsung S8; annotated with a 1.4x speedup
      • Under a similar accuracy constraint (73.27% vs. 73.20%), the FBNet optimized for iPhone X achieves a 1.4x speedup over the Samsung-optimized model
  • 26. FBNet visualization
  • 27. Result: FBNet for different target devices
      • DNAS automatically adopts operators with low latency on the targeted devices
  • 28. DNAS summary
      • General search space: the search space of each layer can contain arbitrary operators, which allows DNAS to support different target processors
      • Extremely fast: a search typically takes 8 GPUs for 24 hours. In comparison, MnasNet requires 421x more computing resources to find models with similar performance
      • State-of-the-art performance:
          • Mixed precision quantization: 21x model size reduction or 104x computational cost reduction with almost no accuracy loss
          • Efficient architecture search: same accuracy, 1.5x faster, 2.4x fewer MACs
      • Optimized for actual latency on targeted devices: up to 1.4x speedup compared to non-targeted neural architectures
  • 29. References
      • Papers:
          • Wu, Bichen, et al. "FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search." arXiv preprint arXiv:1812.03443 (2018).
          • Wu, Bichen, et al. "Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search." arXiv preprint arXiv:1812.00090 (2018).
      • FBNet models: will be open-sourced soon!
      • Questions and feedback: contact me via email at bichen@berkeley.edu