Deep Neural Network Training: Diagnosing Problems and Implementing Solutions
Fahed Hassanat
COO / Head of Engineering
Sensor Cortek
© 2023 Sensor Cortek
Training DNNs
• Find an acceptable
relationship between inputs
and outputs based on
patterns found in historical
data.
The Goal
[Figure: plot of Y (dependent variable) against X (independent variable)]
• The process of minimizing the
difference between the
produced output and the
actual output.
• Uses mathematical techniques
to minimize the error.
• Stop when the results are
acceptable or can no longer
learn.
The Learning Process
[Figure: plot of Y (dependent variable) against X (independent variable)]
Example Forward Calculation
SUM: $(0.7 \times 0.2) + (0.3 \times 0.9) = 0.41 \approx 0.4$
Activation (sigmoid): $\sigma(x) = \frac{1}{1 + e^{-x}}$, so $\sigma(0.4) = \frac{1}{1 + e^{-0.4}} \approx 0.6$
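The forward calculation above can be reproduced with a short sketch. Treating 0.7 and 0.3 as the inputs and 0.2 and 0.9 as the weights is an assumption made for illustration; the slide does not say which is which, and the function names are hypothetical.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_forward(inputs, weights):
    """Weighted sum of the inputs followed by a sigmoid activation."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return total, sigmoid(total)

# Assumed pairing: 0.7 and 0.3 as inputs, 0.2 and 0.9 as weights.
total, activation = neuron_forward([0.7, 0.3], [0.2, 0.9])
print(f"SUM = {total:.2f}, activation = {activation:.2f}")
```

The slide rounds the sum to 0.4 before applying the sigmoid; either way the activation rounds to 0.6.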
An Educated Network
• A labeled collection of data.
• Variations of inputs that produce
same output.
• The larger and more diverse the
better.
Training Data
• A pass through the network in
forward direction.
• Use training data.
• Produce results with the
current parameters.
The Forward Pass
• Traverse all nodes starting
from the last layer.
• Update the parameters
(weights) to minimize the
error.
• Gradient Descent is usually
the technique used to work
towards a minimum error.
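As a minimal sketch of one forward and backward pass, the loop below trains a single sigmoid neuron with plain gradient descent. The squared-error loss, the target value of 1.0, the learning rate of 0.5, and the input/weight values are all assumptions chosen for illustration; they are not taken from the slides.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_step(inputs, weights, target, learning_rate):
    """One forward pass and one gradient-descent update for a single sigmoid neuron."""
    # Forward pass: weighted sum, activation, squared-error loss.
    z = sum(i * w for i, w in zip(inputs, weights))
    out = sigmoid(z)
    loss = 0.5 * (out - target) ** 2
    # Backward pass (chain rule): dL/dw_k = (out - target) * out * (1 - out) * input_k
    delta = (out - target) * out * (1.0 - out)
    grads = [delta * i for i in inputs]
    # Gradient descent: move each weight against its gradient.
    new_weights = [w - learning_rate * g for w, g in zip(weights, grads)]
    return new_weights, loss

weights = [0.2, 0.9]  # assumed starting weights
losses = []
for _ in range(200):
    weights, loss = train_step([0.7, 0.3], weights, target=1.0, learning_rate=0.5)
    losses.append(loss)
print(f"first loss = {losses[0]:.4f}, last loss = {losses[-1]:.4f}")
```

In a real network the same chain-rule step is repeated layer by layer, starting from the last layer, which is what a framework's automatic differentiation does.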
The Backward Pass
Components of Training DNNs
• Framework: Manages the data flow and execution of the
training process.
• DNN model: An architecture that serves a certain purpose.
• Hyperparameters: Control the training process.
• Training data: Labeled dataset for training the model.
Components of Training DNNs
• Keras:
• Great for prototyping. High API level and
high readability.
• TensorFlow:
• High performance. Suited for large datasets.
• PyTorch:
• High performance. Excellent debugging.
Framework
Architecture | Purpose
Multi-layer perceptron | Image classification, natural language processing, and regression
Convolutional neural networks (CNN) | Image classification, object detection, and segmentation
Recurrent neural networks (RNN) | Time series prediction and speech recognition
Transformer networks | Natural language processing and computer vision
Generative adversarial networks (GAN) | Synthetic data generation
DNN Model
• Hyperparameters are parameters that cannot be learned during
the training process and are set prior to the training process.
• Some components of the model architecture can be considered
hyperparameters.
• Model designers can create a hyperparameter that controls
multiple hyperparameters in a certain fashion.
What is a Hyperparameter
Non-architecture-based | Architecture-based
Learning rate | Number of layers
Number of epochs | Number of neurons
Batch size | Activation function
Dropout rate |
Weight initialization |
Regularization parameters |
Optimizer |
Hyperparameters
• A value that controls the amount by which the network weights
are updated.
• Usually between 0.0 and 1.0.
• Too large a value can make training oscillate or diverge; too small a value makes convergence slow.
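The effect of the learning rate can be seen on a toy problem. The sketch below minimizes f(w) = w² with gradient descent; the function, the starting point, and the three learning rates are assumptions picked only to show slow, good, and divergent behavior.

```python
def gradient_descent(learning_rate, steps=20, start=1.0):
    """Minimize f(w) = w**2 (gradient 2*w) and return the final |w|."""
    w = start
    for _ in range(steps):
        w -= learning_rate * 2.0 * w
    return abs(w)

for lr in (0.001, 0.1, 1.1):
    print(f"learning rate {lr}: |w| after 20 steps = {gradient_descent(lr):.4g}")
```

With the smallest rate the weight barely moves, with the middle rate it approaches the minimum at 0, and with the largest rate each step overshoots and the weight grows.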
Hyperparameters: Learning Rate
• The number of times the dataset is passed through the network.
• Too few and the model may not converge.
• Too many and the model may overfit.
Hyperparameters: Number of Epochs
• The number of data samples used in each iteration of the
optimization algorithm.
• Too large can cause the model to generalize poorly.
• Too small can make the updates noisy and prevent the model from converging.
Hyperparameters: Batch Size
• Collection of data with corresponding labels.
• Data is used as input to the model, and the labels are used to
measure and correct the output.
• Dataset is divided into training/validation/
testing sets with the commonly used ratios of
60/20/20 respectively.
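A 60/20/20 split can be sketched in a few lines. The helper name, the fixed seed, and the use of a simple shuffle (rather than a stratified split that preserves class proportions) are assumptions made to keep the example small.

```python
import random

def split_dataset(samples, ratios=(0.6, 0.2, 0.2), seed=0):
    """Shuffle and split samples into training/validation/testing lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * ratios[0])
    n_val = round(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Shuffling before splitting avoids an ordering bias when the raw data is sorted by class or by collection time.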
Dataset
Sufficient data | The dataset must contain enough data to reflect the targeted population.
Balanced data | In multiclass problems, classes must have balanced contributions to the dataset.
Relevant data | The data must represent the targeted population and its environment.
Proper labeling | Labels must be accurate and consistent.
Data diversity | The diversity of the data should reflect the diversity of the targeted population.
Low noise (high SNR) | The data should have little to no noise that causes ambiguity.
Attributes of a Good Dataset
Training Metrics
True Positive (TP): The model predicts True, and the label is True.
False Positive (FP): The model predicts True, and the label is False.
False Negative (FN): The model predicts False, and the label is True.
True Negative (TN): The model predicts False, and the label is False.
Prediction Outcomes
• Used in object detection.
• Calculates overlap in bounding
boxes.
• Determines how close a
prediction is to the ground truth.
• Ranges from 0 to 1.
• A typical value for qualifying a
prediction as TP is 0.5.
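The overlap calculation can be sketched as follows. The box format (x_min, y_min, x_max, y_max) and the example coordinates are assumptions for illustration; detection frameworks use several different box conventions.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(prediction, ground_truth, threshold=0.5):
    """Qualify a prediction as TP when its IoU reaches the threshold."""
    return iou(prediction, ground_truth) >= threshold

ground_truth = (0, 0, 10, 10)
print(iou(ground_truth, (20, 20, 30, 30)))  # boxes do not overlap
print(iou(ground_truth, (0, 0, 10, 5)))     # prediction covers half of the box
print(iou(ground_truth, ground_truth))      # perfect overlap
```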
Intersection over Union
[Figure: example bounding-box overlaps with IoU = 0, IoU = 0.5, and IoU = 1]
• Is a measure of how far the
predictions are from the
actual values.
• It is the output of the loss
function during training.
Loss
[Figure: loss plotted against epoch]
• It is the fraction of correct predictions out of all predictions.
• Can be calculated as: $\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$
Accuracy
• The fraction of True predictions that are correct.
• Can be calculated as: $\text{Precision} = \frac{TP}{TP + FP}$
Precision
Recall
• The fraction of True labels in the data that are correctly predicted.
• Can be calculated as: $\text{Recall} = \frac{TP}{TP + FN}$
• It is the harmonic mean of Precision and Recall.
• Gives a global picture of the performance.
• Can be calculated as: $F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$
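Accuracy, precision, recall and F1-score can all be computed from the four prediction outcomes. The sketch below is one way to do it; the function name and the example counts are hypothetical, and returning 0.0 when a denominator is zero is a convention assumed here, not something the slides specify.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1-score from the four prediction outcomes."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts, chosen only for illustration.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```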
F1-Score
• A window into the
performance of the model on
one or more classes.
• Can be used to calculate
Accuracy, Precision and Recall.
Predicted \ Actual | P | N
P | TP | FP
N | FN | TN
Confusion Matrix
Predicted \ Actual | A | B | C | D
A | 93 | 1 | 5 | 13
B | 4 | 89 | 5 | 3
C | 1 | 7 | 88 | 5
D | 2 | 1 | 2 | 79
Example: Multi-class Confusion Matrix
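Per-class precision and recall, and the overall accuracy, can be read off a multi-class confusion matrix. The sketch below uses the values from the example and assumes that rows are predicted classes and columns are actual classes, as the axis labels on the slide suggest; the function name is hypothetical.

```python
classes = ["A", "B", "C", "D"]
# Rows are predicted classes, columns are actual classes (values from the example).
matrix = [
    [93, 1, 5, 13],
    [4, 89, 5, 3],
    [1, 7, 88, 5],
    [2, 1, 2, 79],
]

def per_class_metrics(matrix, classes):
    """Return ({class: (precision, recall)}, overall accuracy)."""
    n = len(classes)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    results = {}
    for i, name in enumerate(classes):
        predicted_as_i = sum(matrix[i])                   # row sum: TP + FP
        actually_i = sum(matrix[r][i] for r in range(n))  # column sum: TP + FN
        results[name] = (matrix[i][i] / predicted_as_i, matrix[i][i] / actually_i)
    return results, correct / total

results, accuracy = per_class_metrics(matrix, classes)
for name, (precision, recall) in results.items():
    print(f"{name}: precision = {precision:.3f}, recall = {recall:.3f}")
print(f"accuracy = {accuracy:.3f}")
```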
Problems with Training DNNs
• Overfitting
• Underfitting
• Dataset imbalance
• Inefficient learning
• Parameter initialization
• Gradient masking
• Vanishing gradient
• Exploding gradient
Common Problems
Overfitting
Problem: When a network learns the training data too well, it can fail to generalize to new data.
Solution:
• Larger dataset: either add more labeled data or use data augmentation to artificially increase the size and variation of the dataset.
• Less complex model: remove layers or reduce the width of the layers.
• Early stopping / saving checkpoints.
• Apply dropout.
• Apply regularization.
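Early stopping with checkpointing can be sketched independently of any framework. The helper below only decides when to stop and which epoch's checkpoint to keep; the patience value and the validation losses are hypothetical, and a real training loop would save the model weights where the comment indicates.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training stops once the validation loss has not improved for `patience`
    consecutive epochs; the checkpoint saved at best_epoch would be kept.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # A checkpoint of the model would be saved here.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# Hypothetical validation losses: improving, then rising as the model overfits.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.70]
stop_epoch, best_epoch = early_stopping_epoch(val_losses)
print(f"stop at epoch {stop_epoch}, keep checkpoint from epoch {best_epoch}")
```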
Underfitting
Problem: When a network is unable to capture the complexity of the data, it can fail to fit the training data well enough.
Solution:
• Examine the dataset: bad or missing labels, as well as a lack of class representation, can cause underfitting.
• Increase the number of epochs: not training the model enough causes it to fail to grasp the essential pattern in the data.
• More complex model: add layers or increase the width of the layers.
• New architecture.
Data Imbalance
Problem: If the dataset used to train the network is not representative of the target population, the network may fail to generalize well to new data.
Solution:
• Add more data: either add more labeled data to fix the imbalance or use data augmentation to artificially achieve the same effect.
• Transfer learning and fine-tuning: use pretrained model weights for training on the new data. The pretrained model would have been trained on a more balanced dataset and captured that balance within its weights.
Inefficient Learning
Problem: The learning process becomes slow or stalls during training, or the model is unable to optimize the objective function.
Solution:
• Investigate the learning rate: a specific learning rate can cause the training to get stuck at a local minimum.
• Use momentum: accelerates convergence of the model.
• New architecture.
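The effect of momentum can be illustrated on a toy problem. The sketch below minimizes f(w) = w² with and without classical momentum; the function, learning rate, momentum coefficient, and step count are assumptions chosen for illustration.

```python
def minimize(momentum, learning_rate=0.01, steps=100, start=5.0):
    """Minimize f(w) = w**2 with optional classical momentum; return the final |w|."""
    w, velocity = start, 0.0
    for _ in range(steps):
        grad = 2.0 * w
        # The velocity accumulates past gradients; momentum=0 is plain gradient descent.
        velocity = momentum * velocity - learning_rate * grad
        w += velocity
    return abs(w)

print(f"without momentum: |w| = {minimize(0.0):.4f}")
print(f"with momentum 0.9: |w| = {minimize(0.9):.4f}")
```

With the same small learning rate and the same number of steps, the run with momentum ends much closer to the minimum at 0.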
Detecting Training Problems
Loss Curve
Ideal
• Bad data
• Model too simple
• Large learning rate
• Bad data
• Check activation functions
[Figure: three loss-versus-epoch plots, one ideal curve and two problem curves annotated with the possible causes listed above]
• The loss on the test set will start to diverge from the training loss.
• Happens when the model memorizes the training data.
• Happens when the number of epochs is too large.
• Happens when the dataset is too small.
Overfitting Indicators
[Figure: train and test loss plotted against epoch]
• Indicated by high and noisy
loss values.
• The model is not able to learn
from the training data.
• Happens due to bad data, a small dataset, or a
small/low-complexity model.
Underfitting Indicators
[Figure: loss plotted against epoch]
• High train accuracy, low
test accuracy: this is a sign
of overfitting the model.
• Low train accuracy, high
test accuracy: this is a sign
of imbalance between train
and test data.
Accuracy Indicators
[Figure: train and test accuracy (0.0 to 1.0) plotted against epoch]
• Due to a large number of incorrectly classified instances.
• Usually when the value is
below 60%.
Low Accuracy
[Figure: accuracy (0.0 to 1.0) plotted against epoch]
• Due to a large number of false positives.
• The model misclassifies many
negative instances as positive.
• Usually when the value is
50% or less.
Low Precision
[Figure: precision (0.0 to 1.0) plotted against epoch]
• Due to many false negatives.
• The model is missing many
positive instances.
• Recall is low when it is below
50%.
Low Recall
[Figure: recall (0.0 to 1.0) plotted against epoch]
Possible cause | Remedy
Noisy or imbalanced data | Remove noise, augment data, use class weighting, use oversampling/undersampling to fix the imbalance of the data
Small dataset | Use pretrained models as they have already learned important features
Simple model | Increase the number of layers or the width of the layers, or change to a different architecture
Inadequate hyperparameters | Revisit the choice of hyperparameters and select appropriate parameter values
Inadequate loss function | Change to a loss function more suitable for the task
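Class weighting, one of the remedies listed for imbalanced data, can be sketched with a common inverse-frequency heuristic. The formula n_samples / (n_classes × class_count) and the 90/10 label split below are assumptions for illustration; other weighting schemes exist, and the resulting weights would be passed to the loss function of whichever framework is in use.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical imbalanced dataset: 90 samples of class A, 10 of class B.
labels = ["A"] * 90 + ["B"] * 10
weights = balanced_class_weights(labels)
print(weights)
```

The rare class receives the larger weight, so its errors contribute more to the loss.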
Fixing Low Accuracy/Precision/Recall
Confusion Matrix Indicators
Multi-class Confusion Matrix: IDEAL
Predicted \ Actual | A | B | C | D
A | 100 | 0 | 0 | 0
B | 0 | 100 | 0 | 0
C | 0 | 0 | 100 | 0
D | 0 | 0 | 0 | 100
Confusion Matrix Indicators
Multi-class Confusion Matrix: underfitting, inefficient learning, simple model, etc.
Predicted \ Actual | A | B | C | D
A | 23 | 29 | 36 | 24
B | 18 | 17 | 15 | 34
C | 38 | 39 | 30 | 26
D | 21 | 15 | 19 | 16
Confusion Matrix Indicators
Multi-class Confusion Matrix: data imbalance, insufficient dataset
Predicted \ Actual | A | B | C | D
A | 93 | 1 | 5 | 13
B | 4 | 89 | 5 | 3
C | 1 | 7 | 42 | 39
D | 2 | 1 | 46 | 40
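A confusion matrix like the last one can be scanned programmatically for the classes that the model mixes up. The sketch below uses the values from that matrix and assumes rows are predicted classes and columns are actual classes; the helper names are hypothetical.

```python
classes = ["A", "B", "C", "D"]
# Rows are predicted classes, columns are actual classes (values from the last matrix).
matrix = [
    [93, 1, 5, 13],
    [4, 89, 5, 3],
    [1, 7, 42, 39],
    [2, 1, 46, 40],
]

def recall_per_class(matrix, classes):
    """Diagonal cell divided by its column sum (all samples that are actually that class)."""
    n = len(classes)
    return {
        name: matrix[i][i] / sum(matrix[r][i] for r in range(n))
        for i, name in enumerate(classes)
    }

def most_confused(matrix, classes):
    """Return (count, predicted, actual) for the largest off-diagonal cell."""
    n = len(classes)
    return max(
        (matrix[p][a], classes[p], classes[a])
        for p in range(n) for a in range(n) if p != a
    )

recall = recall_per_class(matrix, classes)
count, predicted, actual = most_confused(matrix, classes)
print({name: round(value, 2) for name, value in recall.items()})
print(f"{count} samples of class {actual} were predicted as {predicted}")
```

Classes A and B are recognized well, while C and D are frequently swapped, which points at the data for those two classes.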
• Training DNNs is a complex process that is subject to many problems.
• With the proper understanding of the training process, one can efficiently
resolve these problems.
• Datasets, hyperparameters and model architecture are the major
contributors to problems during training.
• Precision, recall, accuracy and confusion matrices are important training
evaluation metrics that can help diagnose and guide towards successful
training.
Conclusion
Interpreting Loss Curves
https://developers.google.com/machine-learning/testing-debugging/metrics/interpretic
Resources
2023 Embedded Vision Summit
• “Fundamentals of Training AI Models for
Computer Vision Applications” by Amit
Mate (Tue. May 23rd, 1:30-2:35 pm)
• “Introduction to Modern LiDAR for
Machine Perception” by Robert Laganiere
(Wed. May 24th, 4:15-5:20 pm)
Backup Material
What is a Deep Neural Network?
• Deep Neural networks are
inspired by the function of
biological neurons, the
simplest unit of the nervous
system in humans.
• “Deep” refers to the presence
of multiple layers between the
input and the output of the
network.
Inspiration
The basic unit of a NN is a
perceptron:
• Inputs
• Weights
• Sum
• Activation
• Output
Anatomy of the Perceptron
• Input layer
• Hidden layer(s)
• Output layer
Building a Neural Network
Additional Hyperparameters
• Determines the depth of the network.
• Too large results in more parameters to train and longer training.
• Too small may not capture the complex relationship between the
input and the output.
Number of Layers (Model Architecture)
• Determines the width of the network.
• Too large results in more parameters to train and longer training.
• Too small (narrow) may not capture the complex relationship
between the input and the output.
Number of Neurons (Model Architecture)
• Determines if the neuron is activated and should pass on the
transformed input to the next layer.
• Impacts how well the model learns.
Activation Function
• Controls the number of neurons that are randomly removed
during training.
• Helps the model to generalize.
• Avoids overfitting.
Dropout Rate
• Controls how the weights are initialized prior to starting training.
Weight Initialization
• Controls how strong the penalty term in the cost function of a
model is.
• Helps in controlling over/under fitting of the model.
Regularization Parameter
• Controls the method used to update the weights.
• Helps minimize the loss and improve the accuracy.
Optimizer
Additional Training Problems
Gradient Masking
• The Problem
When gradients of some
parameters vanish because of
the choice of activation function,
the parameters cannot be
learned.
• The Solution
This can be addressed by using
activation functions that
alleviate the issue, such as leaky
ReLU or Swish, or by using
different initialization schemes.
Parameter Initialization
• The Problem
If the weights of the network
are initialized randomly and are
too small or too large, the
network may fail to learn.
• The Solution
This can be addressed by using
initialization techniques such as
Xavier or He initialization, which
ensure that the weights are
initialized in a way that is
appropriate for the network
architecture and the activation
functions used.
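The two initialization schemes named above can be sketched in pure Python. The formulas are the standard ones (Xavier uniform with limit sqrt(6 / (fan_in + fan_out)), He normal with standard deviation sqrt(2 / fan_in)); the layer sizes, the seed, and the list-of-lists weight layout are assumptions for illustration, and deep learning frameworks provide their own built-in initializers.

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng):
    """Xavier (Glorot) uniform: U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)] for _ in range(fan_in)]

def he_normal(fan_in, fan_out, rng):
    """He normal: N(0, std**2), std = sqrt(2 / fan_in); commonly paired with ReLU."""
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
w_xavier = xavier_uniform(256, 128, rng)
w_he = he_normal(256, 128, rng)
print(f"largest Xavier weight: {max(abs(w) for row in w_xavier for w in row):.4f}")
```

Both schemes scale the spread of the initial weights to the size of the layer, so that signals neither shrink nor grow excessively as they pass through the network.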
Vanishing Gradient
• The Problem
The gradients become very small as they are propagated backward through a deep network, making it difficult to update the weights of the layers closest to the input.
• The Solution
This can be addressed by using activation functions that don't saturate, such as ReLU or its variants, and by using normalization techniques such as Batch Normalization.
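A rough sketch shows why saturating activations shrink the gradient. It multiplies the activation derivative once per layer along a chain of layers; taking every weight as 1 and using the same pre-activation value in every layer are simplifying assumptions, so the numbers illustrate the trend rather than a real network.

```python
import math

def sigmoid_derivative(x):
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)  # never larger than 0.25

def relu_derivative(x):
    return 1.0 if x > 0 else 0.0

def backprop_scale(derivative, pre_activation, n_layers):
    """Product of the activation derivatives along a chain of n_layers (weights taken as 1)."""
    scale = 1.0
    for _ in range(n_layers):
        scale *= derivative(pre_activation)
    return scale

for n_layers in (2, 5, 10):
    sig = backprop_scale(sigmoid_derivative, 0.0, n_layers)
    relu = backprop_scale(relu_derivative, 1.0, n_layers)
    print(f"{n_layers} layers: sigmoid scale = {sig:.2e}, ReLU scale = {relu:.1f}")
```

Even at its steepest point the sigmoid multiplies the gradient by 0.25 per layer, while an active ReLU passes it through unchanged.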
Exploding Gradient
• The Problem
The gradients become very large as they are propagated backward through a deep network, making the weight updates of the affected layers unstable.
• The Solution
This can be addressed by using gradient clipping techniques, which limit the magnitude of the gradients.
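Gradient clipping by norm, one common clipping technique, can be sketched as follows. The helper name, the flat list representation of the gradients, and the threshold of 5.0 are assumptions for illustration; frameworks ship their own clipping utilities.

```python
import math

def clip_by_norm(gradients, max_norm):
    """Rescale a list of gradient values so that their L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in gradients))
    if norm <= max_norm or norm == 0.0:
        return list(gradients)  # already small enough, leave unchanged
    scale = max_norm / norm
    return [g * scale for g in gradients]

# Hypothetical exploding gradient with an L2 norm of 50.
clipped = clip_by_norm([30.0, -40.0], max_norm=5.0)
print(clipped)
```

Rescaling preserves the direction of the gradient and only limits its length, so the update still points downhill.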

“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
 
“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...
“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...
“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...
 

“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions,” a Presentation from Sensor Cortek

  • 1. Deep Neural Network Training: Diagnosing Problems and Implementing Solutions. Fahed Hassanat, COO / Head of Engineering, Sensor Cortek
  • 2. Training DNNs (© 2023 Sensor Cortek)
  • 3. The Goal: • Find an acceptable relationship between inputs and outputs based on patterns found in historical data. [Figure axes: Y (independent variable), X (dependent variable)]
  • 4. The Learning Process: • The process of minimizing the difference between the produced output and the actual output. • Uses mathematical techniques to minimize the error. • Stops when the results are acceptable or the model can no longer learn. [Figure axes: Y (independent variable), X (dependent variable)]
  • 5. Example Forward Calculation: $\text{SUM} = (0.7 \times 0.2) + (0.3 \times 0.9) = 0.41 \approx 0.4$; Activation (sigmoid, $1/(1+e^{-x})$) $= 1/(1+e^{-0.4}) \approx 0.6$
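The forward calculation on this slide can be reproduced with a few lines of Python. This is a minimal sketch: the slide does not say which factor in each product is the input and which is the weight, so the assignment below is an assumption, and the function names are illustrative.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_forward(inputs, weights):
    """Weighted sum of the inputs followed by a sigmoid activation."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return total, sigmoid(total)

# Numbers from the slide; treating 0.7 and 0.3 as inputs and
# 0.2 and 0.9 as weights is an assumption.
total, activation = neuron_forward([0.7, 0.3], [0.2, 0.9])
print(round(total, 2), round(activation, 2))  # 0.41 0.6
```

The slide rounds the sum to 0.4 before applying the sigmoid; either way the activation rounds to 0.6.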
  • 6. An Educated Network
  • 7. Training Data: • A labeled collection of data. • Variations of inputs that produce the same output. • The larger and more diverse, the better.
  • 8. The Forward Pass: • A pass through the network in the forward direction. • Uses the training data. • Produces results with the current parameters.
  • 9. The Backward Pass: • Traverses all nodes starting from the last layer. • Updates the parameters (weights) to minimize the error. • Gradient descent is usually the technique used to work towards a minimum error.
  • 10. Components of Training DNNs
  • 11. Components of Training DNNs: • Framework: manages the data flow and execution of the training process. • DNN model: an architecture that serves a certain purpose. • Hyperparameters: control the training process. • Training data: labeled dataset for training the model.
  • 12. Framework: • Keras: great for prototyping; high-level API and high readability. • TensorFlow: high performance; suited for large datasets. • PyTorch: high performance; excellent debugging.
  • 13. DNN Model (architecture: purpose): • Multi-layer perceptron: image classification, natural language processing, and regression. • Convolutional neural networks (CNN): image classification, object detection, and segmentation. • Recurrent neural networks (RNN): time series prediction and speech recognition. • Transformer networks: natural language processing and computer vision. • Generative adversarial networks (GAN): synthetic data generation.
  • 14. What is a Hyperparameter: • Hyperparameters are parameters that are not learned during training and are set before training starts. • Some components of the model architecture can be considered hyperparameters. • Model designers can define a single hyperparameter that controls several other hyperparameters together.
  • 15. Hyperparameters: • Non-architecture-based: learning rate, number of epochs, batch size, dropout rate, regularization parameters. • Architecture-based: number of layers, number of neurons, activation function, weight initialization, optimizer.
  • 16. Hyperparameters: Learning Rate: • A value that controls the amount by which the network weights are updated. • Usually between 0.0 and 1.0. • A value that is too large can overshoot the minimum and fail to converge; a value that is too small makes convergence slow.
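The effect of the learning rate on convergence can be seen on a toy problem. The sketch below minimizes the one-dimensional function f(w) = 2w², chosen only for illustration, with three learning rates that are all between 0.0 and 1.0.

```python
def gradient_descent(learning_rate, steps=50, w=1.0):
    """Minimize f(w) = 2 * w**2 (gradient 4 * w) and return the final w."""
    for _ in range(steps):
        w -= learning_rate * 4.0 * w
    return w

# Too small: barely moves. Moderate: converges to 0. Too large: diverges.
results = {lr: gradient_descent(lr) for lr in (0.001, 0.1, 0.6)}
for lr, final_w in results.items():
    print(f"learning rate {lr}: final w = {final_w:.3e}")
```

Each update multiplies w by (1 - 4 × learning rate), so the iterate shrinks slowly when that factor is close to 1, shrinks quickly when it is well inside (-1, 1), and grows without bound when its magnitude exceeds 1.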
  • 17. Hyperparameters: Number of Epochs: • The number of times the dataset is passed through the network. • Too small and the model does not converge. • Too large and the model can overfit.
  • 18. Hyperparameters: Batch Size: • The number of data samples used in each iteration of the optimization algorithm. • Too large can cause the model to generalize poorly. • Too small can prevent the model from converging.
  • 19. Dataset: • A collection of data with corresponding labels. • The data is the input to the model; the labels are used to adjust and correct its output. • The dataset is divided into training, validation, and testing sets; a commonly used ratio is 60/20/20.
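A 60/20/20 split can be sketched as follows. The function name, the fixed seed, and the shuffle-then-slice approach are illustrative assumptions; real pipelines often also stratify by class.

```python
import random

def split_dataset(samples, ratios=(0.6, 0.2, 0.2), seed=0):
    """Shuffle the samples and split them into train/validation/test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * ratios[0])
    n_val = round(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # the remainder goes to the test set
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```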
  • 20. Attributes of a Good Dataset: • Sufficient data: the dataset must contain enough data to reflect the targeted population. • Balanced data: in multiclass problems, classes must have balanced contributions to the dataset. • Relevant data: the data must represent the targeted population and its environment. • Proper labeling: labels must be accurate and consistent. • Data diversity: the diversity of the data should reflect the diversity of the targeted population. • High SNR (low noise): the data should have little to no noise that causes ambiguity.
  • 21. Training Metrics
  • 22. Prediction Outcomes: • True Positive (TP): the model predicts True, and the label is True. • False Positive (FP): the model predicts True, and the label is False. • False Negative (FN): the model predicts False, and the label is True. • True Negative (TN): the model predicts False, and the label is False.
  • 23. Intersection over Union: • Used in object detection. • Calculates the overlap between bounding boxes. • Determines how close a prediction is to the ground truth. • Ranges from 0 to 1. • A typical threshold for qualifying a prediction as a TP is 0.5. [Figure: examples with IoU = 0, IoU = 0.5, IoU = 1]
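Intersection over Union for two axis-aligned boxes can be computed as in the sketch below. The `(x_min, y_min, x_max, y_max)` box format is an assumption; detection frameworks differ in the convention they use.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped at 0 when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (3, 3, 5, 5)))  # 0.0 (no overlap)
print(iou((0, 0, 2, 2), (0, 0, 2, 1)))  # 0.5
print(iou((0, 0, 2, 2), (0, 0, 2, 2)))  # 1.0 (identical boxes)
```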
  • 24. Loss: • A measure of how far the predictions are from the actual values. • It is the output of the loss function during training. [Figure: loss vs. epoch]
  • 25. Accuracy: • The fraction of correct predictions out of all predictions. • Can be calculated as: $\frac{TP + TN}{TP + TN + FP + FN}$
  • 26. Precision: • The fraction of correctly predicted labels out of all True predictions. • Can be calculated as: $\frac{TP}{TP + FP}$
  • 27. Recall: • The fraction of correctly predicted labels out of all True labels in the data. • Can be calculated as: $\frac{TP}{TP + FN}$
  • 28. F1-Score: • The harmonic mean of Precision and Recall. • Gives a global picture of the performance. • Can be calculated as: $\frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$
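The four metrics above follow directly from the TP, FP, FN, and TN counts. The sketch below uses made-up counts purely for illustration, and returns 0.0 when a denominator is zero, which is one common convention rather than the only one.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from the four prediction outcome counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts, not taken from the presentation.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```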
  • 29. Confusion Matrix: • A window into the performance of the model on one or more classes. • Can be used to calculate Accuracy, Precision and Recall.
                       Actual P   Actual N
        Predicted P    TP         FP
        Predicted N    FN         TN
  • 30. Example: Multi-class Confusion Matrix (rows: predicted values; columns: actual values)
               A    B    C    D
          A   93    1    5   13
          B    4   89    5    3
          C    1    7   88    5
          D    2    1    2   79
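Per-class precision and recall can be read off the example matrix. This sketch assumes that rows are predicted classes and columns are actual classes, which matches the TP/FP over FN/TN layout of the binary matrix but is not stated explicitly for the multi-class slide; with the opposite orientation the two ratios swap.

```python
# Example matrix from the slide; rows = predicted, columns = actual (assumed).
CLASSES = ["A", "B", "C", "D"]
MATRIX = [
    [93, 1, 5, 13],
    [4, 89, 5, 3],
    [1, 7, 88, 5],
    [2, 1, 2, 79],
]

def summarize(matrix):
    """Return overall accuracy plus per-class precision and recall."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    # Row sum = everything predicted as class i; column sum = everything that is class i.
    precision = [matrix[i][i] / sum(matrix[i]) for i in range(n)]
    recall = [matrix[i][i] / sum(matrix[r][i] for r in range(n)) for i in range(n)]
    return correct / total, precision, recall

accuracy, precision, recall = summarize(MATRIX)
print(f"accuracy: {accuracy:.3f}")
for name, p, r in zip(CLASSES, precision, recall):
    print(f"class {name}: precision {p:.3f}, recall {r:.3f}")
```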
  • 31. Problems with Training DNNs
  • 32. Common Problems: • Overfitting • Underfitting • Dataset imbalance • Inefficient learning • Parameter initialization • Gradient masking • Vanishing gradient • Exploding gradient
  • 33. Overfitting. Problem: when a network learns the training data too well, it can fail to generalize to new data. Solution: • Larger dataset: either add more labeled data or use data augmentation to artificially increase the size and variation of the dataset. • Less complex model: remove layers or reduce the width of the layers. • Early stopping / saving checkpoints. • Apply dropout. • Apply regularization.
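Early stopping, one of the remedies listed for overfitting, can be sketched as a patience rule over the validation loss. The patience value and the loss numbers below are illustrative assumptions; in practice the model checkpoint saved at the returned epoch would be the one kept.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the epoch with the best validation loss.

    Scanning stops once `patience` consecutive epochs fail to improve on the best loss.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch  # this is where a checkpoint would be saved
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch

# Hypothetical validation losses: improvement stalls after epoch 3.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.40]
best = early_stopping_epoch(val_losses)
print(best)  # 3
```

With `patience=3` the loop stops at epoch 6 and never sees the later improvement at epoch 7, which illustrates the trade-off in choosing the patience value.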
  • 34. Underfitting. Problem: when a network is unable to capture the complexity of the data, it can fail to fit the training data well enough. Solution: • Examine the dataset: bad or missing labels, as well as a lack of class representation, can cause underfitting. • Increase the number of epochs: not training the model enough causes it to fail to grasp the essential patterns in the data. • More complex model: add layers or increase the width of the layers. • New architecture.
  • 35. Data Imbalance. Problem: if the dataset used to train the network is not representative of the target population, the network may fail to generalize well to new data. Solution: • Add more data: either add more labeled data to fix the imbalance or use data augmentation to artificially achieve the same effect. • Transfer learning and fine-tuning: use pretrained model weights for training on the new data. The pretrained model would have been trained on a more balanced dataset and captured that balance within its weights.
  • 36. Inefficient Learning. Problem: the learning process becomes slow or stalls during training, or the model is unable to optimize the objective function. Solution: • Investigate the learning rate: a specific learning rate can cause the training to get stuck at a local minimum. • Use momentum: accelerates convergence of the model. • New architecture.
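The effect of momentum on convergence speed can be illustrated on a toy problem. The sketch below minimizes f(w) = w² with and without a momentum term; the function, the learning rate of 0.01, and the momentum coefficient of 0.9 are assumptions chosen for illustration.

```python
def minimize(learning_rate, momentum, steps=100, w=1.0):
    """Minimize f(w) = w**2 with gradient descent plus an optional momentum term."""
    velocity = 0.0
    for _ in range(steps):
        grad = 2.0 * w
        # The velocity accumulates past gradients; momentum=0.0 gives plain gradient descent.
        velocity = momentum * velocity + grad
        w -= learning_rate * velocity
    return w

plain = minimize(learning_rate=0.01, momentum=0.0)
with_momentum = minimize(learning_rate=0.01, momentum=0.9)
print(f"plain: {plain:.4f}, with momentum: {with_momentum:.4f}")
```

With the same small learning rate, the run with momentum ends much closer to the minimum at w = 0 after the same number of steps.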
  • 37. Detecting Training Problems
  • 38. Loss Curve: [Figure: three loss vs. epoch plots, one labeled Ideal] • Bad data • Model too simple • Large learning rate • Bad data • Check activation functions
  • 39. Overfitting Indicators: • The loss on the test set starts to diverge from the training loss. • Happens when the model memorizes the training data. • Happens when the number of epochs is too large. • Happens when the dataset is too small. [Figure: train and test loss vs. epoch]
  • 40. Underfitting Indicators: • Indicated by high and noisy loss values. • The model is not able to learn from the training data. • Happens due to bad data, a small dataset, or a small / low-complexity model. [Figure: loss vs. epoch]
  • 41. Accuracy Indicators: • High train accuracy, low test accuracy: this is a sign of overfitting. • Low train accuracy, high test accuracy: this is a sign of imbalance between the train and test data. [Figure: train and test accuracy (0.0 to 1.0) vs. epoch]
  • 42. Low Accuracy: • Caused by a large number of incorrectly classified instances. • Accuracy is usually considered low when it is below 60%. [Figure: accuracy (0.0 to 1.0) vs. epoch]
  • 43. Low Precision: • Caused by a large number of false positives. • The model misclassifies many negative instances as positive. • Precision is usually considered low when it is 50% or less. [Figure: precision (0.0 to 1.0) vs. epoch]
  • 44. Low Recall: • Caused by many false negatives. • The model is missing many positive instances. • Recall is considered low when it is below 50%. [Figure: recall (0.0 to 1.0) vs. epoch]
  • 45. Fixing Low Accuracy/Precision/Recall (possible cause: remedy): • Noisy or imbalanced data: remove noise, augment data, use class weighting, or use oversampling/undersampling to fix the imbalance of the data. • Small dataset: use pretrained models, as they have already learned important features. • Simple model: increase the number of layers or the width of the layers, or change to a different architecture. • Inadequate hyperparameters: revisit the choice of hyperparameters and select appropriate values. • Inadequate loss function: change to a loss function more suitable for the task.
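Class weighting, one of the remedies listed for imbalanced data, is often implemented by weighting each class inversely to its frequency. The sketch below uses the common heuristic n_samples / (n_classes × class_count); the label counts are hypothetical.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical imbalanced dataset: 80 samples of A, 15 of B, 5 of C.
labels = ["A"] * 80 + ["B"] * 15 + ["C"] * 5
weights = inverse_frequency_weights(labels)
for cls, weight in sorted(weights.items()):
    print(f"class {cls}: weight {weight:.3f}")
```

The resulting weights would typically be passed to the loss function so that errors on rare classes count more than errors on frequent ones.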
  • 46. Confusion Matrix Indicators. Multi-class Confusion Matrix: Ideal (rows: predicted values; columns: actual values)
               A    B    C    D
          A  100    0    0    0
          B    0  100    0    0
          C    0    0  100    0
          D    0    0    0  100
  • 47. Confusion Matrix Indicators. Multi-class Confusion Matrix: Underfitting, inefficient learning, simple model, etc. (rows: predicted values; columns: actual values)
               A    B    C    D
          A   23   29   36   24
          B   18   17   15   34
          C   38   39   30   26
          D   21   15   19   16
  • 48. Confusion Matrix Indicators. Multi-class Confusion Matrix: Data imbalance, insufficient dataset (rows: predicted values; columns: actual values)
               A    B    C    D
          A   93    1    5   13
          B    4   89    5    3
          C    1    7   42   39
          D    2    1   46   40
  • 49. Conclusion: • Training DNNs is a complex process that is subject to many problems. • With a proper understanding of the training process, one can efficiently resolve these problems. • Datasets, hyperparameters and model architecture are the major contributors to problems during training. • Precision, recall, accuracy and confusion matrices are important training evaluation metrics that can help diagnose problems and guide towards successful training.
  • 50. Resources: Interpreting Loss Curves https://developers.google.com/machine-learning/testing-debugging/metrics/interpretic 2023 Embedded Vision Summit: • “Fundamentals of Training AI Models for Computer Vision Applications” by Amit Mate (Tue. May 23rd, 1:30-2:35 pm) • “Introduction to Modern LiDAR for Machine Perception” by Robert Laganiere (Wed. May 24th, 4:15-5:20 pm)
  • 51. Backup Material
  • 52. What is a Deep Neural Network?
  • 53. Inspiration: • Deep neural networks are inspired by the function of biological neurons, the simplest unit of the nervous system in humans. • “Deep” refers to the presence of multiple layers between the input and the output of the network.
  • 54. Anatomy of the Perceptron: The basic unit of a NN is a perceptron: • Inputs • Weights • Sum • Activation • Output
  • 55. Building a Neural Network: • Input layer • Hidden layer(s) • Output layer
  • 56. Additional Hyperparameters
  • 57. Number of Layers (Model Architecture): • Determines the depth of the network. • Too large results in more parameters to train and longer training. • Too small may not capture the complex relationship between the input and the output.
  • 58. Number of Neurons (Model Architecture): • Determines the width of the network. • Too large results in more parameters to train and longer training. • Too small (narrow) may not capture the complex relationship between the input and the output.
  • 59. Activation Function: • Determines whether the neuron is activated and should pass the transformed input on to the next layer. • Impacts how well the model learns.
  • 60. Dropout Rate: • Controls the fraction of neurons that are randomly dropped during training. • Helps the model to generalize. • Helps avoid overfitting.
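Dropout can be sketched in a few lines. The version below is "inverted" dropout, which rescales the surviving activations during training so that nothing needs to change at inference time; that variant, and the function signature, are assumptions rather than something the slide specifies.

```python
import random

def dropout(activations, rate, rng, training=True):
    """Zero each activation with probability `rate` and rescale the survivors."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)  # dropout is disabled at inference time
    keep_prob = 1.0 - rate
    # Dividing by keep_prob keeps the expected activation unchanged.
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]

rng = random.Random(0)
out = dropout([1.0] * 10000, rate=0.5, rng=rng)
print(out.count(0.0) / len(out))  # roughly half of the activations are dropped
```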
  • 61. Weight Initialization: • Controls how the weights are initialized prior to starting training.
  • 62. Regularization Parameter: • Controls the strength of the penalty term in the model's cost function. • Helps in controlling over/underfitting of the model.
  • 63. Optimizer: • Controls the method used to update the weights. • Helps minimize the loss and improve the accuracy.
  • 64. Additional Training Problems
  • 65. Gradient Masking. • The Problem: when the gradients of some parameters vanish because of the choice of activation function, those parameters cannot be learned. • The Solution: use activation functions that alleviate the issue, such as leaky ReLU or Swish, or use different initialization schemes.
  • 66. Parameter Initialization. • The Problem: if the weights of the network are initialized randomly and are too small or too large, the network may fail to learn. • The Solution: use initialization techniques such as Xavier or He initialization, which ensure that the weights are initialized in a way that is appropriate for the network architecture and the activation functions used.
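Xavier and He initialization both scale the random weights by the layer size. The sketch below uses the normal-distribution variants, with standard deviation sqrt(2 / (fan_in + fan_out)) for Xavier and sqrt(2 / fan_in) for He; the helper names and layer sizes are illustrative.

```python
import math
import random

def xavier_std(fan_in, fan_out):
    """Standard deviation for Xavier (Glorot) normal initialization."""
    return math.sqrt(2.0 / (fan_in + fan_out))

def he_std(fan_in):
    """Standard deviation for He normal initialization (commonly paired with ReLU)."""
    return math.sqrt(2.0 / fan_in)

def init_weights(fan_in, fan_out, std, rng):
    """Draw a fan_out x fan_in weight matrix from a zero-mean normal distribution."""
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]

rng = random.Random(0)
layer = init_weights(fan_in=256, fan_out=128, std=he_std(256), rng=rng)
print(len(layer), len(layer[0]), round(he_std(256), 4))
```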
  • 67. Vanishing Gradient. • The Problem: as gradients are propagated backward through a deep network, they can become very small, making it difficult to update the weights of the affected layers. • The Solution: use activation functions that don't saturate, such as ReLU or its variants, and use normalization techniques such as Batch Normalization.
  • 68. Exploding Gradient. • The Problem: as gradients are propagated backward through a deep network, they can become very large, making it difficult to update the weights of the affected layers. • The Solution: use gradient clipping techniques, which limit the magnitude of the gradients.
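Gradient clipping can be sketched as rescaling the gradient vector whenever its norm exceeds a threshold. Clipping by L2 norm is one common variant (clipping each value independently is another); the threshold below is an arbitrary illustrative choice.

```python
import math

def clip_by_norm(gradients, max_norm):
    """Rescale the gradient vector so that its L2 norm is at most `max_norm`."""
    norm = math.sqrt(sum(g * g for g in gradients))
    if norm <= max_norm or norm == 0.0:
        return list(gradients)  # already within the limit
    scale = max_norm / norm
    # Scaling every component equally preserves the gradient's direction.
    return [g * scale for g in gradients]

clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)  # original norm is 5.0
print(clipped)
```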