SlideShare a Scribd company logo
1 of 30
Download to read offline
Learning Compact DNN
Models for Embedded
Vision
Shuvra S. Bhattacharyya
University of Maryland, College Park,
USA and
INSA/IETR Rennes, France
With contributions from
Xiaomin Wu and
Rong Chen
• Pruning:
• Remove neurons or parameters that provide little or no contribution to
inference accuracy
• Distillation:
• Transfer knowledge from a large model to a small model
• Neural Architecture Search:
• Optimize the number, types and connectivity of network layers
Popular Methods to Compress DNN Models
2
© 2023 University of Maryland
Pruning: Structured and Unstructured
3
© 2023 University of Maryland
Multilayer
Perceptron:
hidden layer
example
• Implementation-
friendly
• Supports common
ML libraries
• More general
• Needs specially-
designed
hardware/
software for sparse
computation
Pruning: Structured and Unstructured
4
© 2023 University of Maryland
CNN-layer
filter
example
• Implementation-
friendly
• Supports common
ML libraries
• More general
• Needs specially-
designed
hardware/
software for sparse
computation
• Deep Compression [Han 2015]
• Uses weight threshold to prune. Leads to unstructured network architecture.
• Inference-time channel reduction without retraining [He 2017]
• Applies a criterion based on Lasso Regression.
• ThiNet — weight-magnitude-based structured pruning [Luo 2017]
• Layer-wise relevance propagation (LRP) [Yeom 2021]
• Uses a novel criterion, layer-wise relevance propagation, to select weights for
structured pruning.
5
Previously-developed Pruning Methods
© 2023 University of Maryland
NeuroGRS was designed to derive compact DNN models for neural decoding systems.
It can also be applied to generate compact DNN models for other embedded vision applications.
Calcium
imaging of
brain
Motion
correction
Neuron
detection
Neural
signal
extraction
Neural
decoding
In real-time: 10 Hz
Image source:
https://www.nature.com/articles/npp2014206,
https://www.youtube.com/watch?v=d5zK1RUJCiU&ab_c
hannel=MocomiKids.
NeuroGRS
Calcium-imaging-based
neural decoding system:
Overview
© 2023 University of Maryland 6
Design of NeuroGRS
Prediction of mouse's behavior
from analysis of neural signals.
E.g., whether or how fast
the mouse is going to move.
• GRS stands for Greedy inter-layer order with Random Selection of intra-layer
units.
• Combines pruning and architecture search with an emphasis on structured
pruning.
• Takes into consideration both the model architecture and trained weights.
• Suitable for further compressing small DNN models for optimized embedded
implementation.
• Accompanied by a dataflow-based inference system for efficient inference.
Overview of NeuroGRS
7
© 2023 University of Maryland
• Structures determine performance for shallow DNNs; learned weights can be
retrained from scratch [Liu 2018] [Frankle 2018].
• This finding is especially relevant for embedded vision, where shallow DNNs
may be preferable due to resource constraints.
• Using a large compression rate (number of removed neurons or connections)
without retraining can significantly degrade inference accuracy [Li 2016] [Hu
2016].
© 2023 University of Maryland 8
Foundational Findings Applied in NeuroGRS
• Specify initial CNN structure in Keras Tensorflow format.
• If pretrained, load pretrained weights. GRS can either train first and
then prune or perform direct-prune from pretrained model.
• Provide training, validation, and testing data sets.
• Configure hyperparameters.
• Run NeuroGRS to execute the pruning process.
• Output  compact model implemented in Python/C/C++, suitable for
inference on embedded platforms
How to Use the NeuroGRS Software Package
9
© 2023 University of Maryland
Using NeuroGRS (Continued)
10
© 2023 University of Maryland
Dataflow graph for NeuroGRS:
© 2023 University of Maryland 11
GRS Method (1)
Mi: an overparameterized DNN model candidate
P(Mi): a pruned DNN model candidate
vi: a selection criterion
k: the final selected compact model(s)
EUP: Enable Unstructured Pruning
TQ: Thresholding weight connections, Quantization
DT: Training dataset
DV: Validation dataset
GRS: Greedy inter-layer order with Random Selection of intra-layer units
• 𝑉𝑎𝑙𝐴𝑐𝑐: validation
accuracy
• 𝑂𝑟𝑖𝑉𝑎𝑙𝐴𝑐𝑐: validation
accuracy of the initial
structure
• 𝒯: tolerance of accuracy
drop
Failed validation:
𝑉𝑎𝑙𝐴𝑐𝑐 < 𝒯 × 𝑂𝑟𝑖𝑉𝑎𝑙𝐴𝑐𝑐
© 2023 University of Maryland 12
GRS Method (2)
[Wu 2022] X. Wu, D.-T. Lin, R. Chen, and S. Bhattacharyya. Learning compact DNN models for behavior
prediction from calcium imaging of neural activity. Journal of Signal Processing Systems, 94:455-472, 2022.
Two state-of-the-art unstructured pruning methods [Han 2015]:
• 𝑇ℎ𝑟𝑒𝑠ℎ𝐼𝑛𝑖𝑡 = 0.3
• 𝑇ℎ𝑟𝑒𝑠ℎ𝑆𝑡𝑒𝑝 = 0.1
• Iteratively increase the threshold until accuracy
falls below 𝒯 × 𝑂𝑟𝑖𝑉𝑎𝑙𝐴𝑐𝑐
T:
T: Cut weight connections having relatively low
magnitudes.
Q: weight
quantization.
Q:
• Iteratively decrease the number of digits until
accuracy falls below 𝒯 × 𝑂𝑟𝑖𝑉𝑎𝑙𝐴𝑐𝑐
© 2023 University of Maryland 13
GRS Method (3)
(Python)
In: input loader
B: Bias loader
W: Weights loader
Read in float type data
© 2023 University of Maryland 14
GRS Method (4)
Initial models in different types and structures:
© 2023 University of Maryland 15
Neural Network Models for Evaluation
Experiment design:
• Investigate whether intermediate sub-structures impact the overall pruning result.
• Compare GRS with RRS.
• RRS = Random inter-layer order and Random Selection of intra-layer units.
• 𝒯 = 0.985
• Report average of 4 different models on 9 MSN (Medium Spiny Neuron) datasets with 10 repeated trials each.
Results:
• AL: Test Accuracy Loss, FCI: FLOP Count Improvement, PCI: Parameter Count Improvement.
• Compare GRS with RRS. Metrics: GRS gives X percent more than RRS.
© 2023 University of Maryland 16
NeuroGRS Experiments (1)
Experiment design:
• Investigate whether state-of-the-art structured pruning methods for large neural networks are effective in our context.
• Compare GRS with NWM.
• NWM = Natural inter-layer order and Weight Magnitude based selection of intra-layer unit to prune.
• NWM is representative of other pruning methods that do not consider model structure [Han 2015, Luo 2017].
• 𝒯 = 0.985
• Report average of 4 different models on 9 MSN (Medium Spiny Neuron) datasets with 10 repeated trials each.
Results:
• AL: Test Accuracy Loss, FCI: FLOP Count Improvement, PCI: Parameter Count Improvement.
• Compare GRS with NWM. Metrics: GRS gives X percent more than NWM.
© 2023 University of Maryland 17
NeuroGRS Experiments (2)
Experiment design:
• On 9 MSN (Medium Spiny Neuron)
datasets of 3000 frames each.
• 4 different shallow DNN models.
• 10 repeated trials.
• 𝒯 of GRS, Pruning Stage T, and
Stage Q are set to 0.985, 0.995,
and 0.990, respectively.
Results:
• Structured pruning using GRS:
• Further unstructured pruning with TQ:
© 2023 University of Maryland 18
NeuroGRS Experiments (3)
Experiment design:
• With 4 different types of DNN models: use NGSynth to implement their optimized and
original forms using LIDE-C, and deploy on a Raspberry Pi Zero W V1.1 platform.
• LIDE-C = Lightweight Dataflow Environment integrated with the C programming language
[Lin 2017].
• How much runtime improvement is observed from the compact models compared to their
corresponding overparameterized models?
Results:
© 2023 University of Maryland 19
NeuroGRS Experiments (4)
Overview:
• Identify the far phase and the near phase of the GRS pruning process.
• Structures impact model performance less in the far phase compared to
the near phase.
• This type of phase-based reasoning can be adapted to other pruning
methods.
• Develop a "jump mechanism" to help GRS step into the near phase much
faster  Much less time (and less carbon footprint) required for pruning.
© 2023 University of Maryland 20
Jump-GRS Extension
• We have given an overview of pruning and other classes of methods for
compressing DNN models.
• We have introduced a new pruning method called Greedy inter-layer
order with Random Selection of intra-layer units (GRS).
• We have combined GRS with methods for unstructured pruning to
provide a more comprehensive pruning solution.
• We have introduced a software tool, called NeuroGRS, that allows system
designers apply the GRS method with a high degree of automation.
• We have introduced concepts of near- and far-phase operation, which
are applied in NeuroGRS to greatly improve pruning speed.
Conclusion
21
© 2023 University of Maryland
22
Backup Slides
© 2023 University of Maryland
• Use RRS (Random inter-layer
order, random intra-layer selection
of units).
• Retrain and validate all possible
intermediate structures 3 times.
• Set 𝒯 = 0.5 to allow pruning to
continue.
• Use an MLP model called
“mlpmulti” with a hidden structure
of 16X16X16.
• Plot the average validation
accuracy of all possible structures
with repeats at each pruning step.
Demonstrating the pruning phases
© 2023 University of Maryland 24
Jump-GRS Introduction
JGRS algorithm:
• Multiple attempts are used to exploit
randomization in the algorithm. The
best result across all attempts is taken.
• The last valid structure, after all
attempts, will be used as the initial
structure for the next phase.
• Each attempt can be regarded as an
examination of different cut-off artificial
node sets.
• The three phases of GRS have different
compression rates.
• JGRS reduces the compression rate as it
goes from one subphase/phase to the
next.
• A structure fails if its validation acc.
becomes lower than 𝒯 × 𝑂𝑟𝑖𝑉𝑎𝑙𝐴𝑐𝑐.
© 2023 University of Maryland 25
Jump-GRS Method
Initial DNN models used in NeuroGRS: Scaled DNN models:
© 2023 University of Maryland 26
Evaluation of Jump-GRS Method
JGRS and GRS Comparison
• 18 datasets (MSN) of 3000 frames.
• Report the average of 10 repeated trials
• on all MSN datasets.
JGRS vs GRS results:
• 𝒯 for both GRS and JGRS is 0.985
• Attempts:
• Far subphase 1 = 3
• Far subphase 2 = 3
• GRS = 3
© 2023 University of Maryland 27
Jump-GRS Experiments (1)
JGRS on larger DNNs
• 18 MSN Datasets of 3000
frames.
• WGEVIA-REAL: 1600
balanced labeled
embeddings for two
classes of microcircuits.
• 𝒯 = 0.985
• Pruning time is reported
in seconds using a Core
i7-2600K CPU with a
GeForce GTX 1080
GPU.
JGRS on MSN dataset
JGRS on WGEVIA-REAL dataset
GRS on MSN dataset
© 2023 University of Maryland 28
Jump-GRS Experiments (2)
GRS, JGRS runtime trend:
• WGEVIA-REAL dataset
• MLP models with different
numbers of nodes in each
layer.
• 𝒯 = 0.985
• Plot average runtime
(seconds) of 10 repeated
trials.
Execution
time
(seconds)
© 2023 University of Maryland 29
Jump-GRS Experiments (3)
• [Bhattacharyya 2019] Bhattacharyya, Shuvra S., et al. "Handbook of Signal Processing Systems." (2019).
• [Han 2015] Han, Song, et al. "Learning both weights and connections for efficient neural network." Advances in neural information
processing systems 28 (2015).
• [He 2017] He, Yihui, Xiangyu Zhang, and Jian Sun. "Channel pruning for accelerating very deep neural networks." Proceedings of the IEEE
international conference on computer vision. 2017.
• [Hu 2016] Hu, Hengyuan, et al. "Network trimming: A data-driven neuron pruning approach towards efficient deep architectures." arXiv
preprint arXiv:1607.03250 (2016).
• [Li 2019] Li, Chunyue, et al. "Prediction of forelimb reach results from motor cortex activities based on calcium imaging and deep learning."
Frontiers in cellular neuroscience 13 (2019): 88.
• [Li 2016] Li, Hao, et al. "Pruning filters for efficient convnets." arXiv preprint arXiv:1608.08710 (2016).
• [Lee 2017] Lee, Yaesop, et al. "Online learning in neural decoding using incremental linear discriminant analysis." 2017 IEEE International
conference on cyborg and bionic systems (CBS). IEEE, 2017.
• [Lin 2017] Lin, Shuoxin, et al. "The DSPCAD framework for modeling and synthesis of signal processing systems." Handbook of
hardware/software codesign. Springer, Dordrecht, 2017. 1185-1219.
• [Liu 2018] Liu, Zhuang, et al. "Rethinking the value of network pruning." arXiv preprint arXiv:1810.05270 (2018).
• [Frankle 2018] Frankle, Jonathan, and Michael Carbin. "The lottery ticket hypothesis: Finding sparse, trainable neural networks." arXiv
preprint arXiv:1803.03635 (2018).
• [Luo 2017] Luo, Jian-Hao, Jianxin Wu, and Weiyao Lin. "Thinet: A filter level pruning method for deep neural network compression."
Proceedings of the IEEE international conference on computer vision. 2017.
• [Molchanov 2016] Molchanov, Pavlo, et al. "Pruning convolutional neural networks for resource efficient inference." arXiv preprint
arXiv:1611.06440 (2016).
References (1)
30
© 2023 University of Maryland
• [Suau 2018] Suau, Xavier, et al. "Principal filter analysis for guided network compression." arXiv preprint arXiv:1807.10585 2 (2018).
• [Wu 2022]X. Wu, D.-T. Lin, R. Chen, and S. Bhattacharyya. Learning compact DNN models for behavior prediction from calcium
imaging of neural activity. Journal of Signal Processing Systems, 94:455-472, 2022.
• [Yeom 2021] Yeom, Seul-Ki, et al. "Pruning by explaining: A novel criterion for deep neural network pruning." Pattern Recognition 115
(2021): 107899.
References (2)
31
© 2023 University of Maryland

More Related Content

Similar to “Learning Compact DNN Models for Embedded Vision,” a Presentation from the University of Maryland at College Park

"Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En...
"Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En..."Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En...
"Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En...ssuser2624f71
 
Distributed Monte Carlo Feature Selection: Extracting Informative Features Ou...
Distributed Monte Carlo Feature Selection: Extracting Informative Features Ou...Distributed Monte Carlo Feature Selection: Extracting Informative Features Ou...
Distributed Monte Carlo Feature Selection: Extracting Informative Features Ou...Łukasz Król
 
Genetic Programming in Automated Test Code Generation
Genetic Programming in Automated Test Code GenerationGenetic Programming in Automated Test Code Generation
Genetic Programming in Automated Test Code GenerationDVClub
 
Gray-Box Models for Performance Assessment of Spark Applications
Gray-Box Models for Performance Assessment of Spark ApplicationsGray-Box Models for Performance Assessment of Spark Applications
Gray-Box Models for Performance Assessment of Spark ApplicationsATMOSPHERE .
 
Beyond data and model parallelism for deep neural networks
Beyond data and model parallelism for deep neural networksBeyond data and model parallelism for deep neural networks
Beyond data and model parallelism for deep neural networksJunKudo2
 
PR-187 : MorphNet: Fast & Simple Resource-Constrained Structure Learning of D...
PR-187 : MorphNet: Fast & Simple Resource-Constrained Structure Learning of D...PR-187 : MorphNet: Fast & Simple Resource-Constrained Structure Learning of D...
PR-187 : MorphNet: Fast & Simple Resource-Constrained Structure Learning of D...Sunghoon Joo
 
NS-CUK Seminar: S.T.Nguyen, Review on "Hierarchical Graph Convolutional Netwo...
NS-CUK Seminar: S.T.Nguyen, Review on "Hierarchical Graph Convolutional Netwo...NS-CUK Seminar: S.T.Nguyen, Review on "Hierarchical Graph Convolutional Netwo...
NS-CUK Seminar: S.T.Nguyen, Review on "Hierarchical Graph Convolutional Netwo...ssuser4b1f48
 
GDG Cloud Community Day 2022 - Managing data quality in Machine Learning
GDG Cloud Community Day 2022 -  Managing data quality in Machine LearningGDG Cloud Community Day 2022 -  Managing data quality in Machine Learning
GDG Cloud Community Day 2022 - Managing data quality in Machine LearningSARADINDU SENGUPTA
 
Bangla Hand Written Digit Recognition presentation slide .pptx
Bangla Hand Written Digit Recognition presentation slide .pptxBangla Hand Written Digit Recognition presentation slide .pptx
Bangla Hand Written Digit Recognition presentation slide .pptxKhondokerAbuNaim
 
Mx net image segmentation to predict and diagnose the cardiac diseases karp...
Mx net image segmentation to predict and diagnose the cardiac diseases   karp...Mx net image segmentation to predict and diagnose the cardiac diseases   karp...
Mx net image segmentation to predict and diagnose the cardiac diseases karp...KannanRamasamy25
 
Once-for-All: Train One Network and Specialize it for Efficient Deployment
 Once-for-All: Train One Network and Specialize it for Efficient Deployment Once-for-All: Train One Network and Specialize it for Efficient Deployment
Once-for-All: Train One Network and Specialize it for Efficient Deploymenttaeseon ryu
 
A fuzzy clustering algorithm for high dimensional streaming data
A fuzzy clustering algorithm for high dimensional streaming dataA fuzzy clustering algorithm for high dimensional streaming data
A fuzzy clustering algorithm for high dimensional streaming dataAlexander Decker
 
Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...
Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...
Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...csandit
 
Ijricit 01-002 enhanced replica detection in short time for large data sets
Ijricit 01-002 enhanced replica detection in  short time for large data setsIjricit 01-002 enhanced replica detection in  short time for large data sets
Ijricit 01-002 enhanced replica detection in short time for large data setsIjripublishers Ijri
 
Intrusion Detection System using K-Means Clustering and SMOTE
Intrusion Detection System using K-Means Clustering and SMOTEIntrusion Detection System using K-Means Clustering and SMOTE
Intrusion Detection System using K-Means Clustering and SMOTEIRJET Journal
 
Effective Multi-Stage Training Model for Edge Computing Devices in Intrusion ...
Effective Multi-Stage Training Model for Edge Computing Devices in Intrusion ...Effective Multi-Stage Training Model for Edge Computing Devices in Intrusion ...
Effective Multi-Stage Training Model for Edge Computing Devices in Intrusion ...IJCNCJournal
 

Similar to “Learning Compact DNN Models for Embedded Vision,” a Presentation from the University of Maryland at College Park (20)

"Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En...
"Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En..."Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En...
"Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En...
 
Distributed Monte Carlo Feature Selection: Extracting Informative Features Ou...
Distributed Monte Carlo Feature Selection: Extracting Informative Features Ou...Distributed Monte Carlo Feature Selection: Extracting Informative Features Ou...
Distributed Monte Carlo Feature Selection: Extracting Informative Features Ou...
 
Genetic Programming in Automated Test Code Generation
Genetic Programming in Automated Test Code GenerationGenetic Programming in Automated Test Code Generation
Genetic Programming in Automated Test Code Generation
 
Thesis Presentation
Thesis PresentationThesis Presentation
Thesis Presentation
 
Gray-Box Models for Performance Assessment of Spark Applications
Gray-Box Models for Performance Assessment of Spark ApplicationsGray-Box Models for Performance Assessment of Spark Applications
Gray-Box Models for Performance Assessment of Spark Applications
 
Beyond data and model parallelism for deep neural networks
Beyond data and model parallelism for deep neural networksBeyond data and model parallelism for deep neural networks
Beyond data and model parallelism for deep neural networks
 
Data aggregation in wireless sensor networks
Data aggregation in wireless sensor networksData aggregation in wireless sensor networks
Data aggregation in wireless sensor networks
 
PR-187 : MorphNet: Fast & Simple Resource-Constrained Structure Learning of D...
PR-187 : MorphNet: Fast & Simple Resource-Constrained Structure Learning of D...PR-187 : MorphNet: Fast & Simple Resource-Constrained Structure Learning of D...
PR-187 : MorphNet: Fast & Simple Resource-Constrained Structure Learning of D...
 
230727_HB_JointJournalClub.pptx
230727_HB_JointJournalClub.pptx230727_HB_JointJournalClub.pptx
230727_HB_JointJournalClub.pptx
 
NS-CUK Seminar: S.T.Nguyen, Review on "Hierarchical Graph Convolutional Netwo...
NS-CUK Seminar: S.T.Nguyen, Review on "Hierarchical Graph Convolutional Netwo...NS-CUK Seminar: S.T.Nguyen, Review on "Hierarchical Graph Convolutional Netwo...
NS-CUK Seminar: S.T.Nguyen, Review on "Hierarchical Graph Convolutional Netwo...
 
GDG Cloud Community Day 2022 - Managing data quality in Machine Learning
GDG Cloud Community Day 2022 -  Managing data quality in Machine LearningGDG Cloud Community Day 2022 -  Managing data quality in Machine Learning
GDG Cloud Community Day 2022 - Managing data quality in Machine Learning
 
Bangla Hand Written Digit Recognition presentation slide .pptx
Bangla Hand Written Digit Recognition presentation slide .pptxBangla Hand Written Digit Recognition presentation slide .pptx
Bangla Hand Written Digit Recognition presentation slide .pptx
 
Mx net image segmentation to predict and diagnose the cardiac diseases karp...
Mx net image segmentation to predict and diagnose the cardiac diseases   karp...Mx net image segmentation to predict and diagnose the cardiac diseases   karp...
Mx net image segmentation to predict and diagnose the cardiac diseases karp...
 
nnUNet
nnUNetnnUNet
nnUNet
 
Once-for-All: Train One Network and Specialize it for Efficient Deployment
 Once-for-All: Train One Network and Specialize it for Efficient Deployment Once-for-All: Train One Network and Specialize it for Efficient Deployment
Once-for-All: Train One Network and Specialize it for Efficient Deployment
 
A fuzzy clustering algorithm for high dimensional streaming data
A fuzzy clustering algorithm for high dimensional streaming dataA fuzzy clustering algorithm for high dimensional streaming data
A fuzzy clustering algorithm for high dimensional streaming data
 
Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...
Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...
Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...
 
Ijricit 01-002 enhanced replica detection in short time for large data sets
Ijricit 01-002 enhanced replica detection in  short time for large data setsIjricit 01-002 enhanced replica detection in  short time for large data sets
Ijricit 01-002 enhanced replica detection in short time for large data sets
 
Intrusion Detection System using K-Means Clustering and SMOTE
Intrusion Detection System using K-Means Clustering and SMOTEIntrusion Detection System using K-Means Clustering and SMOTE
Intrusion Detection System using K-Means Clustering and SMOTE
 
Effective Multi-Stage Training Model for Edge Computing Devices in Intrusion ...
Effective Multi-Stage Training Model for Edge Computing Devices in Intrusion ...Effective Multi-Stage Training Model for Edge Computing Devices in Intrusion ...
Effective Multi-Stage Training Model for Edge Computing Devices in Intrusion ...
 

More from Edge AI and Vision Alliance

“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...Edge AI and Vision Alliance
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...Edge AI and Vision Alliance
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...Edge AI and Vision Alliance
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...Edge AI and Vision Alliance
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...Edge AI and Vision Alliance
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsightsEdge AI and Vision Alliance
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...Edge AI and Vision Alliance
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...Edge AI and Vision Alliance
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...Edge AI and Vision Alliance
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...Edge AI and Vision Alliance
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...Edge AI and Vision Alliance
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...Edge AI and Vision Alliance
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...Edge AI and Vision Alliance
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from SamsaraEdge AI and Vision Alliance
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...Edge AI and Vision Alliance
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...Edge AI and Vision Alliance
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...Edge AI and Vision Alliance
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...Edge AI and Vision Alliance
 
“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...
“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...
“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...Edge AI and Vision Alliance
 

More from Edge AI and Vision Alliance (20)

“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
 
“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...
“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...
“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...
 

Recently uploaded

Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 

Recently uploaded (20)

Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the University of Maryland at College Park

  • 1. Learning Compact DNN Models for Embedded Vision Shuvra S. Bhattacharyya University of Maryland, College Park, USA and INSA/IETR Rennes, France With contributions from Xiaomin Wu and Rong Chen
  • 2. • Pruning: • Remove neurons or parameters that provide little or no contribution to inference accuracy • Distillation: • Transfer knowledge from a large model to a small model • Neural Architecture Search: • Optimize the number, types and connectivity of network layers Popular Methods to Compress DNN Models 2 © 2023 University of Maryland
  • 3. Pruning: Structured and Unstructured 3 © 2023 University of Maryland Multilayer Perceptron: hidden layer example • Implementation- friendly • Supports common ML libraries • More general • Needs specially- designed hardware/ software for sparse computation
  • 4. Pruning: Structured and Unstructured 4 © 2023 University of Maryland CNN-layer filter example • Implementation- friendly • Supports common ML libraries • More general • Needs specially- designed hardware/ software for sparse computation
  • 5. • Deep Compression [Han 2015] • Uses weight threshold to prune. Leads to unstructured network architecture. • Inference-time channel reduction without retraining [He 2017] • Applies a criterion based on Lasso Regression. • ThiNet — weight-magnitude-based structured pruning [Luo 2017] • Layer-wise relevance propagation (LRP) [Yeom 2021] • Uses a novel criterion, layer-wise relevance propagation, to select weights for structured pruning. 5 Previously-developed Pruning Methods © 2023 University of Maryland
  • 6. NeuroGRS was designed to derive compact DNN models for neural decoding systems. It can also be applied to generate compact DNN models for other embedded vision applications. Calcium imaging of brain Motion correction Neuron detection Neural signal extraction Neural decoding In real-time: 10 Hz Image source: https://www.nature.com/articles/npp2014206, https://www.youtube.com/watch?v=d5zK1RUJCiU&ab_c hannel=MocomiKids. NeuroGRS Calcium-imaging-based neural decoding system: Overview © 2023 University of Maryland 6 Design of NeuroGRS Prediction of mouse's behavior from analysis of neural signals. E.g., whether or how fast the mouse is going to move.
  • 7. • GRS stands for Greedy inter-layer order with Random Selection of intra-layer units. • Combines pruning and architecture search with an emphasis on structured pruning. • Takes into consideration both the model architecture and trained weights. • Suitable for further compressing small DNN models for optimized embedded implementation. • Accompanied by a dataflow-based inference system for efficient inference. Overview of NeuroGRS 7 © 2023 University of Maryland
  • 8. • Structures determine performance for shallow DNNs; learned weights can be retrained from scratch [Liu 2018] [Frankle 2018]. • This finding is especially relevant for embedded vision, where shallow DNNs may be preferable due to resource constraints. • Using a large compression rate (number of removed neurons or connections) without retraining can significantly degrade inference accuracy [Li 2016] [Hu 2016]. © 2023 University of Maryland 8 Foundational Findings Applied in NeuroGRS
  • 9. • Specify initial CNN structure in Keras Tensorflow format. • If pretrained, load pretrained weights. GRS can either train first and then prune or perform direct-prune from pretrained model. • Provide training, validation, and testing data sets. • Configure hyperparameters. • Run NeuroGRS to execute the pruning process. • Output  compact model implemented in Python/C/C++, suitable for inference on embedded platforms How to Use the NeuroGRS Software Package 9 © 2023 University of Maryland
  • 10. Using NeuroGRS (Continued) 10 © 2023 University of Maryland
  • 11. Dataflow graph for NeuroGRS: © 2023 University of Maryland 11 GRS Method (1) Mi: an overparameterized DNN model candidate P(Mi): a pruned DNN model candidate vi: a selection criterion k: the final selected compact model(s) EUP: Enable Unstructured Pruning TQ: Thresholding weight connections, Quantization DT: Training dataset DV: Validation dataset
  • 12. GRS: Greedy inter-layer order with Random Selection of intra-layer units • 𝑉𝑎𝑙𝐴𝑐𝑐: validation accuracy • 𝑂𝑟𝑖𝑉𝑎𝑙𝐴𝑐𝑐: validation accuracy of the initial structure • 𝒯: tolerance of accuracy drop Failed validation: 𝑉𝑎𝑙𝐴𝑐𝑐 < 𝒯 × 𝑂𝑟𝑖𝑉𝑎𝑙𝐴𝑐𝑐 © 2023 University of Maryland 12 GRS Method (2) [Wu 2022] X. Wu, D.-T. Lin, R. Chen, and S. Bhattacharyya. Learning compact DNN models for behavior prediction from calcium imaging of neural activity. Journal of Signal Processing Systems, 94:455-472, 2022.
  • 13. Two state-of-the-art unstructured pruning methods [Han 2015]: • 𝑇ℎ𝑟𝑒𝑠ℎ𝐼𝑛𝑖𝑡 = 0.3 • 𝑇ℎ𝑟𝑒𝑠ℎ𝑆𝑡𝑒𝑝 = 0.1 • Iteratively increase the threshold until accuracy falls below 𝒯 × 𝑂𝑟𝑖𝑉𝑎𝑙𝐴𝑐𝑐 T: T: Cut weight connections having relatively low magnitudes. Q: weight quantization. Q: • Iteratively decrease the number of digits until accuracy falls below 𝒯 × 𝑂𝑟𝑖𝑉𝑎𝑙𝐴𝑐𝑐 © 2023 University of Maryland 13 GRS Method (3)
  • 14. (Python) In: input loader B: Bias loader W: Weights loader Read in float type data © 2023 University of Maryland 14 GRS Method (4)
  • 15. Initial models in different types and structures: © 2023 University of Maryland 15 Neural Network Models for Evaluation
  • 16. Experiment design: • Investigate whether intermediate sub-structures impact the overall pruning result. • Compare GRS with RRS. • RRS = Random inter-layer order and Random Selection of intra-layer units. • 𝒯 = 0.985 • Report average of 4 different models on 9 MSN (Medium Spiny Neuron) datasets with 10 repeated trials each. Results: • AL: Test Accuracy Loss, FCI: FLOP Count Improvement, PCI: Parameter Count Improvement. • Compare GRS with RRS. Metrics: GRS gives X percent more than RRS. © 2023 University of Maryland 16 NeuroGRS Experiments (1)
  • 17. Experiment design: • Investigate whether state-of-the-art structured pruning methods for large neural networks are effective in our context. • Compare GRS with NWM. • NWM = Natural inter-layer order and Weight Magnitude based selection of intra-layer unit to prune. • NWM is representative of other pruning methods that do not consider model structure [Han 2015, Luo 2017]. • 𝒯 = 0.985 • Report average of 4 different models on 9 MSN (Medium Spiny Neuron) datasets with 10 repeated trials each. Results: • AL: Test Accuracy Loss, FCI: FLOP Count Improvement, PCI: Parameter Count Improvement. • Compare GRS with NWM. Metrics: GRS gives X percent more than NWM. © 2023 University of Maryland 17 NeuroGRS Experiments (2)
  • 18. Experiment design: • On 9 MSN (Medium Spiny Neuron) datasets of 3000 frames each. • 4 different shallow DNN models. • 10 repeated trials. • 𝒯 of GRS, Pruning Stage T, and Stage Q are set to 0.985, 0.995, and 0.990, respectively. Results: • Structured pruning using GRS: • Further unstructured pruning with TQ: © 2023 University of Maryland 18 NeuroGRS Experiments (3)
  • 19. Experiment design: • With 4 different types of DNN models: use NGSynth to implement their optimized and original forms using LIDE-C, and deploy on a Raspberry Pi Zero W V1.1 platform. • LIDE-C = Lightweight Dataflow Environment integrated with the C programming language [Lin 2017]. • How much runtime improvement is observed from the compact models compared to their corresponding overparameterized models? Results: © 2023 University of Maryland 19 NeuroGRS Experiments (4)
  • 20. Overview: • Identify the far phase and the near phase of the GRS pruning process. • Structures impact model performance less in the far phase compared to the near phase. • This type of phase-based reasoning can be adapted to other pruning methods. • Develop a "jump mechanism" to help GRS step into the near phase much faster  Much less time (and less carbon footprint) required for pruning. © 2023 University of Maryland 20 Jump-GRS Extension
  • 21. • We have given an overview of pruning and other classes of methods for compressing DNN models. • We have introduced a new pruning method called Greedy inter-layer order with Random Selection of intra-layer units (GRS). • We have combined GRS with methods for unstructured pruning to provide a more comprehensive pruning solution. • We have introduced a software tool, called NeuroGRS, that allows system designers apply the GRS method with a high degree of automation. • We have introduced concepts of near- and far-phase operation, which are applied in NeuroGRS to greatly improve pruning speed. Conclusion 21 © 2023 University of Maryland
  • 22. 22 Backup Slides © 2023 University of Maryland
  • 23. • Use RRS (Random inter-layer order, random intra-layer selection of units). • Retrain and validate all possible intermediate structures 3 times. • Set 𝒯 = 0.5 to allow pruning to continue. • Use an MLP model called “mlpmulti” with a hidden structure of 16X16X16. • Plot the average validation accuracy of all possible structures with repeats at each pruning step. Demonstrating the pruning phases © 2023 University of Maryland 24 Jump-GRS Introduction
  • 24. JGRS algorithm: • Multiple attempts are used to exploit randomization in the algorithm. The best result across all attempts is taken. • The last valid structure, after all attempts, will be used as the initial structure for the next phase. • Each attempt can be regarded as an examination of different cut-off artificial node sets. • The three phases of GRS have different compression rates. • JGRS reduces the compression rate as it goes from one subphase/phase to the next. • A structure fails if its validation acc. becomes lower than 𝒯 × 𝑂𝑟𝑖𝑉𝑎𝑙𝐴𝑐𝑐. © 2023 University of Maryland 25 Jump-GRS Method
  • 25. Initial DNN models used in NeuroGRS: Scaled DNN models: © 2023 University of Maryland 26 Evaluation of Jump-GRS Method
  • 26. JGRS and GRS Comparison • 18 datasets (MSN) of 3000 frames. • Report the average of 10 repeated trials • on all MSN datasets. JGRS vs GRS results: • 𝒯 for both GRS and JGRS is 0.985 • Attempts: • Far subphase 1 = 3 • Far subphase 2 = 3 • GRS = 3 © 2023 University of Maryland 27 Jump-GRS Experiments (1)
  • 27. JGRS on larger DNNs • 18 MSN Datasets of 3000 frames. • WGEVIA-REAL: 1600 balanced labeled embeddings for two classes of microcircuits. • 𝒯 = 0.985 • Pruning time is reported in seconds using a Core i7-2600K CPU with a GeForce GTX 1080 GPU. JGRS on MSN dataset JGRS on WGEVIA-REAL dataset GRS on MSN dataset © 2023 University of Maryland 28 Jump-GRS Experiments (2)
  • 28. GRS, JGRS runtime trend: • WGEVIA-REAL dataset • MLP models with different numbers of nodes in each layer. • 𝒯 = 0.985 • Plot average runtime (seconds) of 10 repeated trials. Execution time (seconds) © 2023 University of Maryland 29 Jump-GRS Experiments (3)
  • 29. • [Bhattacharyya 2019] Bhattacharyya, Shuvra S., et al. "Handbook of Signal Processing Systems." (2019). • [Han 2015] Han, Song, et al. "Learning both weights and connections for efficient neural network." Advances in neural information processing systems 28 (2015). • [He 2017] He, Yihui, Xiangyu Zhang, and Jian Sun. "Channel pruning for accelerating very deep neural networks." Proceedings of the IEEE international conference on computer vision. 2017. • [Hu 2016] Hu, Hengyuan, et al. "Network trimming: A data-driven neuron pruning approach towards efficient deep architectures." arXiv preprint arXiv:1607.03250 (2016). • [Li 2019] Li, Chunyue, et al. "Prediction of forelimb reach results from motor cortex activities based on calcium imaging and deep learning." Frontiers in cellular neuroscience 13 (2019): 88. • [Li 2016] Li, Hao, et al. "Pruning filters for efficient convnets." arXiv preprint arXiv:1608.08710 (2016). • [Lee 2017] Lee, Yaesop, et al. "Online learning in neural decoding using incremental linear discriminant analysis." 2017 IEEE International conference on cyborg and bionic systems (CBS). IEEE, 2017. • [Lin 2017] Lin, Shuoxin, et al. "The DSPCAD framework for modeling and synthesis of signal processing systems." Handbook of hardware/software codesign. Springer, Dordrecht, 2017. 1185-1219. • [Liu 2018] Liu, Zhuang, et al. "Rethinking the value of network pruning." arXiv preprint arXiv:1810.05270 (2018). • [Frankle 2018] Frankle, Jonathan, and Michael Carbin. "The lottery ticket hypothesis: Finding sparse, trainable neural networks." arXiv preprint arXiv:1803.03635 (2018). • [Luo 2017] Luo, Jian-Hao, Jianxin Wu, and Weiyao Lin. "Thinet: A filter level pruning method for deep neural network compression." Proceedings of the IEEE international conference on computer vision. 2017. • [Molchanov 2016] Molchanov, Pavlo, et al. "Pruning convolutional neural networks for resource efficient inference." arXiv preprint arXiv:1611.06440 (2016). References (1) 30 © 2023 University of Maryland
  • 30. • [Suau 2018] Suau, Xavier, et al. "Principal filter analysis for guided network compression." arXiv preprint arXiv:1807.10585 2 (2018). • [Wu 2022]X. Wu, D.-T. Lin, R. Chen, and S. Bhattacharyya. Learning compact DNN models for behavior prediction from calcium imaging of neural activity. Journal of Signal Processing Systems, 94:455-472, 2022. • [Yeom 2021] Yeom, Seul-Ki, et al. "Pruning by explaining: A novel criterion for deep neural network pruning." Pattern Recognition 115 (2021): 107899. References (2) 31 © 2023 University of Maryland