Deep Learning and Computer
Vision in Low Power Devices
Lokesh Vadlamudi
San Jose State University
Introduction
• Deep neural networks (DNNs) are successful in many computer vision
tasks. However, the most accurate DNNs require millions of
parameters and operations, making them energy-, computation-, and
memory-intensive. This impedes the deployment of large DNNs in
low-power devices with limited compute resources.
• Hence, this presentation covers a few techniques for reducing
memory and energy requirements.
1.1. Parameter Quantization
• One method to reduce the number of memory accesses is to
decrease the size of DNN parameters.
• Some methods show that it is possible to have negligible accuracy
loss even when the precision of the DNN parameters is reduced.
• LightNN, CompactNet, and FLightNN find the optimal bit-width for
different parameters of a DNN, given an accuracy constraint.
• Here, as the parameter bit-width decreases, the energy consumption
decreases at the expense of increasing test error.
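A minimal sketch of uniform weight quantization, illustrating the bit-width/error trade-off; the per-tensor scale is an assumption for simplicity, and this is not the exact LightNN or FLightNN procedure:

    # Minimal sketch of uniform "fake" quantization: map float weights
    # to a b-bit signed integer grid and back, so fewer bits are
    # fetched per memory access.
    import torch

    def quantize_weights(w: torch.Tensor, bits: int) -> torch.Tensor:
        """Fake-quantize a weight tensor to `bits`-bit precision."""
        qmax = 2 ** (bits - 1) - 1            # e.g. 127 for 8 bits
        scale = w.abs().max() / qmax          # one scale per tensor (assumption)
        q = torch.clamp(torch.round(w / scale), -qmax, qmax)
        return q * scale                      # dequantized values

    w = torch.randn(64, 3, 3, 3)              # a conv layer's weights
    for b in (8, 4, 2):                       # lower bit-width -> larger error
        err = (w - quantize_weights(w, b)).abs().mean().item()
        print(f"{b}-bit mean absolute error: {err:.4f}")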
CompactNet
• CompactNet automatically optimizes a pre-trained CNN model for a
specific resource-limited platform, given a target inference
speedup. Guided by a simulator of the target platform, CompactNet
progressively trims the pre-trained network by removing
redundant filters until the target speedup is reached, generating an
optimal platform-specific model while maintaining accuracy.
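A highly simplified sketch of this progressive trimming loop; the "measure_latency" callable is a placeholder for the platform simulator, and the L1 filter norm stands in for the paper's redundancy criterion:

    # Simplified CompactNet-style trimming: drop the weakest filter
    # until the simulated speedup target is met.
    import torch
    import torch.nn as nn

    def trim_until_speedup(conv: nn.Conv2d, measure_latency, target_speedup: float):
        base = measure_latency(conv)
        while base / measure_latency(conv) < target_speedup and conv.out_channels > 1:
            # Rank filters by L1 norm (stand-in criterion) and drop the weakest.
            norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))
            keep = norms.argsort(descending=True)[:-1]
            pruned = nn.Conv2d(conv.in_channels, len(keep),
                               conv.kernel_size, conv.stride, conv.padding)
            pruned.weight.data = conv.weight.data[keep]
            pruned.bias.data = conv.bias.data[keep]
            conv = pruned
            # CompactNet fine-tunes the network after each trim step to
            # recover accuracy; omitted here.
        return conv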
• Advantage:
• When the bit-widths of the parameters decrease, the prediction
performance of DNNs remains largely unchanged. This is because constraining
parameters has a regularization effect during training.
• Disadvantage:
• DNNs employing quantization techniques need to be retrained
multiple times, making the training process very expensive.
1.2. Pruning Parameters and Connections
• The Hessian-weighted distortion measure can be used to identify the
importance of parameters in a DNN. These measure-based pruning
methods operate only on the fully-connected layers. By performing
pruning, quantization, and encoding, Deep Compression reduces the
model size by 95%.
• Path-level pruning is also seen in tree-based hierarchical DNNs.
Although these techniques can identify the unimportant connections,
they create unwanted sparsity in DNNs. Sparse matrices require
special data structures (often unavailable in standard deep learning
libraries) and are difficult to map onto modern GPUs.
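A sketch of connection pruning using weight magnitude as a stand-in for the Hessian-weighted importance measure (which additionally requires second-order information about the loss):

    # Magnitude pruning of a fully-connected layer with a binary mask.
    # Note the result is a sparse matrix stored densely, illustrating
    # why special data structures are needed to realize the gains.
    import torch
    import torch.nn as nn

    def prune_by_magnitude(layer: nn.Linear, sparsity: float) -> torch.Tensor:
        """Zero out the smallest-magnitude weights; return the mask."""
        w = layer.weight.data
        k = int(sparsity * w.numel())                   # weights to remove
        threshold = w.abs().flatten().kthvalue(k).values
        mask = (w.abs() > threshold).float()
        w.mul_(mask)                                    # sparse values, same shape
        return mask                                     # reapply after each
                                                        # fine-tuning step

    fc = nn.Linear(1024, 1000)
    mask = prune_by_magnitude(fc, sparsity=0.9)
    print(f"nonzero fraction: {mask.mean().item():.2f}")  # ~0.10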
• Advantage: Pruning can be combined with quantization and encoding
for significant performance gains. When the three techniques are
used together, the VGG-16 model size decreases to 2% of its original
size.
• Disadvantage: The training effort associated with DNN pruning is
considerable because DNNs have to be pruned and retrained multiple
times. The advantages of pruning are realized only when using custom
hardware or special data structures for sparse matrices.
2.1. Convolutional Filter Compression
• Smaller convolution filters have considerably fewer parameters and
lower computation costs than larger filters. For example, a 1×1 filter
has only about 11% of the parameters of a 3×3 filter (1/9 of the
weights per channel). However, removing all large convolution filters
from a network lowers its accuracy.
• SqueezeNet is one such technique; it uses three strategies to
convert 3×3 convolutions into 1×1 convolutions and reduce the
number of parameters.
SqueezeNet
SqueezeNet begins with a standalone convolution layer (conv1),
followed by 8 Fire modules (fire2–9), and ends with a final
convolution layer (conv10).
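A sketch of one Fire module, using the channel sizes of fire2 from the SqueezeNet paper: a 1×1 "squeeze" layer cuts the channel count before the "expand" layers, and part of the expand stage uses 1×1 instead of 3×3 filters.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Fire(nn.Module):
        def __init__(self, in_ch, squeeze_ch, expand1x1_ch, expand3x3_ch):
            super().__init__()
            self.squeeze = nn.Conv2d(in_ch, squeeze_ch, kernel_size=1)
            self.expand1x1 = nn.Conv2d(squeeze_ch, expand1x1_ch, kernel_size=1)
            self.expand3x3 = nn.Conv2d(squeeze_ch, expand3x3_ch,
                                       kernel_size=3, padding=1)

        def forward(self, x):
            x = F.relu(self.squeeze(x))
            # Concatenate the cheap 1x1 and the remaining 3x3 outputs.
            return torch.cat([F.relu(self.expand1x1(x)),
                              F.relu(self.expand3x3(x))], dim=1)

    fire2 = Fire(96, 16, 64, 64)           # 96 -> 128 channels
    out = fire2(torch.randn(1, 96, 55, 55))
    print(out.shape)                        # torch.Size([1, 128, 55, 55])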
• Advantage: Bottleneck convolutional filters reduce the memory and
latency requirements of DNNs significantly. For most computer vision
tasks, these techniques obtain state-of-the-art accuracy.
based on Andrew Ng’s lectures at DeepLearning.ai
2.2. Matrix Factorization
• This technique factorizes multi-dimensional tensors (in convolutional
and fully-connected layers) into smaller matrices to eliminate
redundant computation.
• Some factorizations accelerate DNNs up to 4×.
• To minimize the accuracy loss, matrix factorizations are performed
one layer at a time. The layer-by-layer optimization approach makes it
difficult to scale these techniques to large DNNs because the number
of factorization hyper-parameters increases exponentially with DNN
depth.
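A sketch of the idea for one fully-connected layer, using truncated SVD (one of several possible factorizations): an m×n weight matrix is replaced by two rank-r factors, cutting multiply-adds from m·n to r·(m+n).

    import torch
    import torch.nn as nn

    def factorize_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
        # W (m x n) ~= U_r @ (S_r @ Vh_r), split across two smaller layers.
        U, S, Vh = torch.linalg.svd(layer.weight.data, full_matrices=False)
        first = nn.Linear(layer.in_features, rank, bias=False)
        second = nn.Linear(rank, layer.out_features)
        first.weight.data = torch.diag(S[:rank]) @ Vh[:rank]   # r x n
        second.weight.data = U[:, :rank]                        # m x r
        second.bias.data = layer.bias.data
        return nn.Sequential(first, second)

    fc = nn.Linear(4096, 4096)
    low_rank = factorize_linear(fc, rank=256)   # ~8x fewer parameters
    x = torch.randn(1, 4096)
    print((fc(x) - low_rank(x)).abs().max())    # approximation error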
• Advantages: Matrix factorization reduces the computation costs of
DNNs. The same factorizations can be applied in both convolutional
layers and fully connected layers.
• Disadvantages: The computation costs associated with matrix
factorization often offset the performance gains obtained from
performing fewer operations. Matrix factorizations are also difficult to
implement in large DNNs because the training time increases
exponentially with increasing depth.
3. Network Architecture Search
• Network Architecture Search (NAS) is a technique that automates
DNN architecture design for various tasks. NAS uses a Recurrent
Neural Network (RNN) controller trained with reinforcement learning
to compose candidate DNN architectures.
• These candidate architectures are trained and then evaluated on the
validation set. The validation accuracy is used as a reward signal to
optimize the controller, which then proposes the next candidate
architecture.
• MNasNet is 2.3× faster than NASNet with 4.8× fewer parameters and
10× fewer operations. Moreover, MNasNet is also more accurate than
NASNet.
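A toy sketch of this controller/reward loop, with learnable logits standing in for the RNN controller and a placeholder "evaluate_architecture" in place of actually training and validating each candidate:

    # REINFORCE over a tiny search space of depths and widths: sample an
    # architecture, use validation accuracy as the reward, update the
    # controller so high-reward choices become more likely.
    import torch

    depths, widths = [2, 4, 8], [16, 32, 64]
    logits = torch.zeros(2, 3, requires_grad=True)   # [choice, option]
    opt = torch.optim.Adam([logits], lr=0.1)

    def evaluate_architecture(depth, width):         # placeholder: would
        return torch.rand(1).item()                  # train + validate

    for step in range(100):
        dist = torch.distributions.Categorical(logits=logits)
        sample = dist.sample()                       # one depth, one width index
        reward = evaluate_architecture(depths[int(sample[0])],
                                       widths[int(sample[1])])
        loss = -dist.log_prob(sample).sum() * reward # policy gradient
        opt.zero_grad(); loss.backward(); opt.step()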
• Advantages: NAS automatically balances the trade-offs between
accuracy, memory, and latency by searching through the space of all
possible architectures without any human intervention. NAS achieves
state-of-the-art performance in terms of accuracy and energy
efficiency on many mobile devices.
• Disadvantage: The computational demand of NAS algorithms makes
it difficult to search for architectures that are optimized for large
datasets.
4. Knowledge Distillation
• Knowledge Distillation (KD) trains a compact DNN that mimics the
outputs, features, and activations of a more computation-heavy DNN.
• Training small DNNs to learn such functions is challenging with the
conventional backpropagation algorithm.
• Some methods train small DNNs to learn complex functions by
making the small DNNs mimic larger pre-trained DNNs. These
techniques transfer the “knowledge” of a large DNN to a small DNN
through a process called Knowledge Transfer (KT).
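A sketch of the standard distillation loss, in which the student matches the teacher's temperature-softened outputs alongside the hard labels (Hinton-style soft targets; the temperature T and weight alpha below are illustrative choices, and the feature/activation-matching KT variants differ):

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels,
                          T: float = 4.0, alpha: float = 0.9):
        # Soft-target term: KL between temperature-softened distributions,
        # scaled by T^2 to keep gradient magnitudes comparable.
        soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                        F.softmax(teacher_logits / T, dim=1),
                        reduction="batchmean") * (T * T)
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard

    student_logits = torch.randn(8, 10, requires_grad=True)
    teacher_logits = torch.randn(8, 10)        # from the frozen teacher
    labels = torch.randint(0, 10, (8,))
    loss = distillation_loss(student_logits, teacher_logits, labels)
    loss.backward()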
• Advantages: KT- and KD-based techniques can reduce the
computation costs of large pre-trained DNNs significantly.
• Conclusion:
• DNNs are powerful tools for performing many computer vision tasks.
However, because the best performing (most accurate) DNNs are
designed for situations where computational resources are readily
available, they are difficult to deploy on embedded and mobile
devices.