Interaction Lab., Seoul National University of Science and Technology
■Angular error of gaze estimation
▪ Geometry-based methods
• 0.5° to 3°
▪ Appearance-based methods
• 3° to 12°
■GazeTR-Hybrid and GazeTR-Conv
Improving Accuracy of Binary Neural Networks using Unbalanced Activation Distribution
Presenter: Jae-Yeop Jeong
Paper: Hyungjun Kim, Jihoon Park, Changhun Lee, Jae-Joon Kim (POSTECH, Korea)
Computer Vision and Pattern Recognition (CVPR) 2021
Agenda
■Intro
■Method
■Conclusion
Intro
■Lightweight Network
▪ Network quantization
▪ Network pruning
▪ Efficient architecture design
■Network quantization
▪ Converts floating-point values to integer or fixed-point representations
• Lighter model
• Reduced computation
• More efficient hardware use
▪ A process that reduces the precision of the numeric representation used inside a neural network
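The conversion above can be sketched as symmetric uniform quantization to signed 8-bit integers. This is one common scheme chosen for illustration, not something the slides specify; the function names and the sample values are assumptions.

```python
def quantize_int8(values):
    """Symmetric uniform quantization of floats to signed 8-bit integers."""
    max_abs = max(abs(v) for v in values)
    # One integer step corresponds to `scale` in floating point.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map the integers back to approximate floating-point values."""
    return [qi * scale for qi in q]


weights = [-1.21, 0.88, -0.11, 0.17, 1.72]  # illustrative values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each restored value differs from the original by at most half a quantization step, which is the accuracy cost paid for the smaller representation.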
■Binary Neural Network
▪ 32-bit floating-point weights are replaced by 1-bit weights
▪ Multiply-accumulate operations reduce to XNOR-and-popcount logic
▪ Weight quantization
▪ Activation quantization
Example: full-precision weights
-1.21  0.88 -0.11
 0.17  1.72 -0.7
 0.98 -0.57  0.63
Binarized weights (sign)
-1  1 -1
 1  1 -1
 1 -1  1
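A minimal sketch of how a dot product of two ±1 vectors can be computed with XNOR and popcount. The weights are the example values from the slide; the input activations are made up, and mapping zero to +1 is an assumed convention.

```python
def binarize(values):
    """Sign binarization: non-negative values map to +1, negative to -1."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Encode a +1/-1 vector as an integer bit mask (+1 -> 1, -1 -> 0)."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two +1/-1 vectors of length n from their bit masks."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask   # XNOR: 1 where the signs match
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n              # matches minus mismatches


weights = [-1.21, 0.88, -0.11, 0.17, 1.72, -0.7, 0.98, -0.57, 0.63]
acts = [0.5, -0.2, 0.1, -0.9, 0.3, 0.4, -0.6, 0.8, -0.1]  # made-up inputs
w, a = binarize(weights), binarize(acts)
result = xnor_popcount_dot(pack_bits(w), pack_bits(a), len(w))
```

The result equals the ordinary sum of element-wise products of the two binarized vectors, but it needs only bitwise operations and a bit count.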
■Accuracy of BNNs
▪ Because they are lightweight, BNNs are attracting interest as a promising solution for DNN computing on edge devices
▪ BNNs still suffer from accuracy degradation caused by the aggressive quantization
▪ Lower accuracy than the full-precision model
■Problems of BNNs
▪ Gradient mismatch problem caused by the non-differentiable binary activation function
• Gradients cannot propagate through the quantization layer during back-propagation
• Commonly handled with the Straight-Through Estimator (STE)
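A minimal sketch of the STE idea for a single scalar input. The forward pass uses the sign function; the backward pass passes the upstream gradient through unchanged. Zeroing the gradient where |x| exceeds 1 is a widely used clipped variant and is assumed here; the slides do not say which variant is meant.

```python
def sign_forward(x):
    """Binary activation used in the forward pass."""
    return 1.0 if x >= 0 else -1.0


def ste_backward(x, grad_output, clip=1.0):
    """Straight-through estimate of the gradient with respect to x.

    The true derivative of sign is zero almost everywhere, so the upstream
    gradient is passed through as-is, and zeroed where |x| > clip
    (clipped variant, an assumption of this sketch).
    """
    return grad_output if abs(x) <= clip else 0.0


inputs = [-2.0, -0.3, 0.0, 0.7, 1.5]
outputs = [sign_forward(x) for x in inputs]
grads = [ste_backward(x, grad_output=0.25) for x in inputs]
```

The gradient used in the backward pass does not match the function used in the forward pass, which is the gradient mismatch the slide refers to.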
■Techniques to improve the accuracy of BNNs
▪ Straight-Through Estimator (STE)
▪ Parameterized clipping activation function (PACT)
• Focused on multi-bit networks (2-bit, 4-bit, …)
▪ Managing activation distribution
• Binary activations take only two values (1 or -1)
• Focused on making the binary activation more balanced
▪ Additional activation function
• Placed between the binary convolution layer and the following BN layer
• ReLU, PReLU
• Increases the non-linearity
Method
■Overview
▪ Activation functions in conventional full-precision models, as motivation for using an unbalanced activation distribution
▪ How to make the distribution of the binary activation unbalanced using threshold shifting in BNNs
▪ Experimental results on the ImageNet dataset
▪ Ineffectiveness of training the threshold of the binary activation in BNNs
▪ Additional activation functions
■Motivation for using unbalanced activation distribution
▪ ReLU solved the vanishing gradient problem
• In recent models, the vanishing gradient problem is already largely diminished by BN layers
▪ There might be other reasons for the success of the ReLU function
■Making the activation distribution unbalanced using hardtanh
▪ Compare the performance of the hardtanh function and ReLU6
• ReLU6: a slight variant of the ReLU function whose positive outputs are clipped to 6
▪ Threshold shifting
• Shift along the x-axis, shift along the y-axis, increased range
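The two activation functions and the shifting options can be written out as a small sketch. The parameter names (`shift_x`, `shift_y`, `limit`) and the sample inputs are illustrative assumptions, not the paper's notation or settings.

```python
def shifted_hardtanh(x, shift_x=0.0, shift_y=0.0, limit=1.0):
    """hardtanh with optional shifts; parameter names are illustrative.

    shift_x > 0 moves the curve to the right, shift_y moves it up or down,
    and limit widens the output range from [-1, 1] to [-limit, limit].
    """
    return max(-limit, min(limit, x - shift_x)) + shift_y


def relu6(x):
    """ReLU whose positive outputs are clipped to 6."""
    return max(0.0, min(6.0, x))


def mean(values):
    return sum(values) / len(values)


inputs = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]  # symmetric around zero
balanced = mean([shifted_hardtanh(x) for x in inputs])
shifted_right = mean([shifted_hardtanh(x, shift_x=0.5) for x in inputs])
relu6_mean = mean([relu6(x) for x in inputs])
```

On inputs that are symmetric around zero, the unshifted hardtanh gives outputs that average to zero, while ReLU6 and the shifted hardtanh give outputs whose average moves away from zero, that is, an unbalanced output distribution.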
■Training model
▪ A VGG-small model was trained on the CIFAR-10 dataset with different activation functions
▪ VGG-small
• Consists of 4 Conv layers with 64, 64, 128, and 128 output channels
• Three dense layers (512)
• A BN layer is placed between each Conv layer and the activation layer
• A max pooling layer is used after each of the latter three convolution layers
▪ The model was trained under the same conditions with 10 different seeds
■Accuracy gap between hardtanh and ReLU6(1/2)
▪ The red dotted line represents the test accuracy of ReLU6
■Accuracy gap between hardtanh and ReLU6(2/2)
▪ The hardtanh function is shifted to the right along the x-axis
• The number of negative outputs increases and the output distribution becomes positively skewed
▪ The higher performance of the ReLU activation function
• Comes from the unbalanced distribution of activation outputs
• With the sign function used in BNNs, the effect of the activation distribution becomes even more noticeable
■BNNs with unbalanced activation distribution(1/4)
▪ Previous multi-bit quantization methods
• ReLU-based quantization, which outputs an unbalanced activation distribution
▪ Previous BNN quantization methods
• Use the sign function as the binary activation function, so the distribution of binary activations is balanced
• The ratio of 1 to -1 is close to 1:1
■BNNs with unbalanced activation distribution(2/4)
▪ Shifting the activation function along the x-axis helps improve the accuracy of a model
▪ Poor performance is attributed to the balanced activation distribution of the sign function
▪ Shift the sign function along the x-axis
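A small sketch of what shifting the sign function does to the ratio of +1 to -1 outputs. The pre-activations are drawn from a standard normal distribution and the threshold of 0.5 is made up for illustration; neither comes from the paper.

```python
import random


def shifted_sign(x, threshold=0.0):
    """Binary activation with its threshold moved along the x-axis."""
    return 1 if x >= threshold else -1


def positive_ratio(samples, threshold):
    """Fraction of activations that come out as +1."""
    outputs = [shifted_sign(x, threshold) for x in samples]
    return outputs.count(1) / len(outputs)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
pre_activations = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

balanced = positive_ratio(pre_activations, threshold=0.0)
unbalanced = positive_ratio(pre_activations, threshold=0.5)
```

With a zero threshold roughly half of the outputs are +1; moving the threshold to the right lowers that fraction, so the binary activation distribution becomes unbalanced.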
■BNNs with unbalanced activation distribution(3/4)
▪ VGG-small
• The max pooling layer makes the distribution positively skewed
■BNNs with unbalanced activation distribution(4/4)
▪ Effect of max pooling
▪ The max pooling layer is the only layer that introduces asymmetry in the model
• It accounts for the accuracy difference between shifting the threshold in the positive direction and in the negative direction
• Checked by replacing max pooling with average pooling
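The asymmetry introduced by max pooling can be illustrated on one-dimensional data. This sketch assumes zero-mean Gaussian features and a window of 4; both choices are illustrative and are not taken from the paper.

```python
import random


def pool(values, window, reduce_fn):
    """Non-overlapping 1-D pooling with the given reduction."""
    return [reduce_fn(values[i:i + window])
            for i in range(0, len(values) - window + 1, window)]


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
features = [rng.gauss(0.0, 1.0) for _ in range(8_000)]  # zero-mean inputs

max_pooled = pool(features, 4, max)
avg_pooled = pool(features, 4, mean)

max_pooled_mean = mean(max_pooled)
avg_pooled_mean = mean(avg_pooled)
```

Max pooling pushes the mean of symmetric inputs towards positive values, while average pooling leaves it near zero, which is consistent with average pooling being used as the symmetric replacement.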
■Effect of other conditions
▪ Datasets
▪ Models
▪ Initialization methods
▪ Optimizers
■Dataset
▪ MNIST
• Binary VGG-small model on the MNIST dataset
• Trained with 30 different random seeds; average results are presented
■Model architecture
▪ Two additional model architectures
▪ 2-layer MLP and LeNet-5
• The 2-layer MLP does not have a max pooling layer
■Initialization method(1/2)
▪ The models were trained from scratch using Xavier normal initialization
▪ The full-precision model uses 32-bit floating point
▪ Pretrained full-precision models are often used to initialize BNN models
▪ Hardtanh + ReLU
▪ Full-precision VGG-small model on the CIFAR-10 dataset
▪ Binary VGG-small model initialized with the full-precision pretrained model
■Initialization method(2/2)
■Optimizer
▪ The Adam optimizer is mostly used for BNNs
▪ Replace Adam with SGD
■Experiments on ImageNet
▪ Finding the shift amount
• Top-1 train accuracy at the first epoch and Top-1 validation accuracy
■Results on ImageNet
■Effect of training threshold
▪ Limited effect on the performance of BNNs
▪ Batch normalization bias
▪ Limited learning capability
■Batch normalization bias
▪ A trainable threshold is already covered by the bias values in the BN layer
▪ 𝑋 and 𝑌 are the input to the BN layer and the output of the binary activation layer
▪ 𝜇, 𝜎, 𝛾, and 𝛽 are the mean, standard deviation, weight, and bias of the BN layer; 𝑡ℎ is the threshold of the binary activation
▪ 𝛽 and 𝑡ℎ are initialized to zero
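Why the BN bias already covers a trainable threshold can be checked numerically. The sketch assumes the usual form Y = sign(γ(X − μ)/σ + β − th), which is a reading of the symbols listed on the slide; all numeric values are made up.

```python
def batch_norm(x, mu, sigma, gamma, beta):
    """Affine batch normalization of a scalar with fixed statistics."""
    return gamma * (x - mu) / sigma + beta


def binary_act(z, th=0.0):
    """Binary activation with threshold th."""
    return 1 if z >= th else -1


mu, sigma, gamma = 0.3, 1.5, 0.8  # made-up BN statistics and weight
beta, th = 0.1, 0.25              # made-up bias and threshold
xs = [-2.0, -0.5, 0.0, 0.4, 1.0, 3.0]

# Explicit threshold on the binary activation.
with_threshold = [binary_act(batch_norm(x, mu, sigma, gamma, beta), th)
                  for x in xs]

# Same threshold folded into the BN bias, activation threshold left at 0.
folded = [binary_act(batch_norm(x, mu, sigma, gamma, beta - th), 0.0)
          for x in xs]
```

Both lists are identical, so under this assumed form a change in 𝑡ℎ can always be expressed as a change in 𝛽.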
■Limited learning capability
▪ Trainable threshold method
• An initialization point
■Effect of additional activation function(1/4)
▪ Use an additional activation function after the convolution layer in BNNs
▪ BNNs usually lack non-linearity in the model
▪ The additional activation function is also related to the unbalanced activation distribution
■Effect of additional activation function(2/4)
▪ Effect of additional PReLU layers after binary convolution layers in ResNet20
■Effect of additional activation function(3/4)
▪ Accuracy degrades with PReLU
• PReLU layers already play a role in distorting the activation distribution
• Further shifting the threshold of the binary activation function is excessive
■Effect of additional activation function(4/4)
▪ Replace PReLU with LeakyReLU
▪ When the slope is 1
• LeakyReLU becomes the identity function
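The slope statement can be checked with a direct definition of LeakyReLU. This is a minimal sketch; the sample inputs are arbitrary.

```python
def leaky_relu(x, negative_slope):
    """LeakyReLU: identity for x >= 0, scaled by negative_slope otherwise."""
    return x if x >= 0 else negative_slope * x


xs = [-3.0, -0.5, 0.0, 2.0]
identity_like = [leaky_relu(x, 1.0) for x in xs]  # slope 1: output equals input
relu_like = [leaky_relu(x, 0.0) for x in xs]      # slope 0: plain ReLU
```

Sweeping the slope between 0 and 1 therefore moves the extra activation from ReLU towards no activation at all.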
Conclusion
■Analyzed the impact of activation distribution on the accuracy of BNN models
■Demonstrated that the accuracy of previous BNN models could be further improved
Q&A
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 

Recently uploaded (20)

OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 

Improving Accuracy of Binary Neural Networks Using Unbalanced Activation Distribution

  • 2. Interaction Lab., Seoul National University of Science and Technology. Feedback: Degree of gaze estimation. Geometric-based: 0.5° to 3°. Appearance-based: 3° to 12°.
  • 3. Feedback: GazeTR-Hybrid and GazeTR-Conv.
  • 4. Improving Accuracy of Binary Neural Networks using Unbalanced Activation Distribution. Hyungjun Kim, Jihoon Park, Changhun Lee, Jae-Joon Kim (POSTECH, Korea). Computer Vision and Pattern Recognition 2021. Jae-Yeop Jeong (presenter).
  • 5. Agenda: Intro, Method, Conclusion.
  • 7. Intro (1/6). Lightweight networks: network quantization; network pruning; efficient architecture design.
  • 8. Intro (2/6). Network quantization: converts floating-point values to integer or fixed-point values, giving a lighter model, reduced computation, and more efficient hardware use. It is the process of reducing the representation format and internal composition of a neural network.
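
The float-to-integer conversion described on this slide can be illustrated with a minimal sketch. It assumes symmetric per-tensor quantization to signed 8-bit integers, which is one common scheme and not necessarily the one the slides have in mind; the function names and the example values are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to the range [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # The scale maps the largest magnitude onto 127 (assumed scheme).
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the stored integers back to approximate float values."""
    return [q * scale for q in quantized]


# Illustrative floating-point weights (assumed values).
weights = [-1.21, 0.88, -0.11, 0.17, 1.72, -0.7, 0.98, -0.57, 0.63]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding to the nearest integer bounds the error by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight is then stored in 8 bits instead of 32, at the cost of a reconstruction error of at most `scale / 2` per value.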
  • 9. Intro (3/6). Binary Neural Network: 32-bit floating-point weights can be represented by 1-bit weights; computation uses XNOR-and-popcount logic; both weights and activations are quantized. Example: the weights [[-1.21, 0.88, -0.11], [0.17, 1.72, -0.7], [0.98, -0.57, 0.63]] are binarized to [[-1, 1, -1], [1, 1, -1], [1, -1, 1]].
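
As a rough sketch of the two ideas on this slide, the code below binarizes the slide's 3x3 weight matrix with a sign function and computes a dot product of two +1/-1 vectors with XNOR and popcount. Mapping zero to +1, the bit-packing convention, and the example activation vector are assumptions made for illustration.

```python
def sign_binarize(x):
    """Map a real value to +1 or -1 (zero goes to +1 by assumption)."""
    return 1 if x >= 0 else -1


weights = [
    [-1.21, 0.88, -0.11],
    [0.17, 1.72, -0.7],
    [0.98, -0.57, 0.63],
]
binary_weights = [[sign_binarize(w) for w in row] for row in weights]


def pack_bits(values):
    """Encode a +1/-1 vector as an integer bit mask (+1 -> 1, -1 -> 0)."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
    return bits


def xnor_popcount_dot(a, b):
    """Dot product of two +1/-1 vectors using XNOR and popcount."""
    n = len(a)
    mask = (1 << n) - 1
    xnor = ~(pack_bits(a) ^ pack_bits(b)) & mask  # 1 where the signs agree
    matches = bin(xnor).count("1")                # popcount
    return 2 * matches - n                        # agreements minus disagreements


flat_weights = [w for row in binary_weights for w in row]
activations = [1, -1, 1, 1, -1, -1, 1, 1, -1]     # hypothetical binary input
dot = xnor_popcount_dot(flat_weights, activations)
```

The result equals the ordinary sum of element-wise products, but needs only bitwise operations and a bit count, which is what makes 1-bit networks cheap in hardware.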
  • 10. Intro (4/6). Accuracy of BNNs: because they are lightweight, BNNs are attracting interest as a promising solution for DNN computing on edge devices. However, BNNs still suffer from accuracy degradation caused by the aggressive quantization and have lower accuracy than full-precision models.
  • 11. Intro (5/6). Problems of BNNs: a gradient mismatch problem caused by the non-differentiable binary activation function. Gradients cannot propagate through the quantization layer during back-propagation, so the Straight-Through Estimator (STE) is used.
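
The STE idea can be sketched as follows: the forward pass uses the sign function, while the backward pass pretends the function was (clipped) identity so that gradients can flow. The clipping range of 1 is a widely used variant and is an assumption here; the slides do not specify which STE form is used, and the function names are hypothetical.

```python
def sign_forward(x):
    """Forward pass: binary activation (zero mapped to +1 by assumption)."""
    return 1.0 if x >= 0 else -1.0


def sign_backward_ste(x, grad_output, clip=1.0):
    """Backward pass with a straight-through estimator.

    The true derivative of sign is zero almost everywhere, so the upstream
    gradient is passed through unchanged where |x| <= clip and is zeroed
    elsewhere (clipped STE, an assumed variant).
    """
    return grad_output if abs(x) <= clip else 0.0


pre_activations = [-1.5, -0.2, 0.3, 2.0]
upstream_grads = [0.5, 0.5, 0.5, 0.5]

outputs = [sign_forward(x) for x in pre_activations]
input_grads = [
    sign_backward_ste(x, g) for x, g in zip(pre_activations, upstream_grads)
]
```

The mismatch the slide refers to is visible here: the forward output is a step function, but the gradient used for training belongs to a different, smoother function.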
  • 12. Intro (6/6). Techniques to improve the accuracy of BNNs: (1) Straight-Through Estimator (STE); (2) Parameterized Clipping Activation (PACT), which focuses on multi-bit networks (2-bit, 4-bit, ...); (3) managing the activation distribution: binary activations take only two values (1 or -1), and prior work focused on making them more balanced; (4) an additional activation function (ReLU, PReLU) between the binary convolution layer and the following BN layer, which increases non-linearity.
  • 14. Method (1/25). Overview: activation functions in conventional full-precision models, to describe the motivation for using an unbalanced activation distribution; how to make the binary activation distribution unbalanced using threshold shifting in BNNs; experimental results on the ImageNet dataset; the ineffectiveness of training the threshold of the binary activation in BNNs; additional activation functions.
  • 15. Method (2/25). Motivation for using an unbalanced activation distribution: ReLU solved the vanishing gradient problem, but that problem is largely diminished by the BN layers in recent models, so there might be other reasons for the success of the ReLU function.
  • 16. Method (3/25). Making an unbalanced activation distribution using hardtanh: compare the performance of the hardtanh function and ReLU6 (a slight variant of the ReLU function whose positive outputs are clipped to 6). Threshold shifting: along the x-axis, along the y-axis, and with an increased range.
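
The three modifications listed on this slide (x-axis shift, y-axis shift, increased range) can be written down directly. The sketch below uses hypothetical function and parameter names; the particular shift values are chosen only to show that a shifted, widened hardtanh coincides with ReLU6, which is a property of the two functions rather than a setting reported in the slides.

```python
def hardtanh(x, lo=-1.0, hi=1.0):
    """Identity inside [lo, hi], clipped outside."""
    return max(lo, min(hi, x))


def relu6(x):
    """ReLU whose positive outputs are clipped to 6."""
    return max(0.0, min(6.0, x))


def shifted_hardtanh(x, shift_x=0.0, shift_y=0.0, lo=-1.0, hi=1.0):
    """hardtanh moved right by shift_x and up by shift_y, with range [lo, hi]."""
    return hardtanh(x - shift_x, lo, hi) + shift_y


# Shifting along the x-axis only: more inputs fall below the lower clip.
right_shifted = [shifted_hardtanh(x, shift_x=1.0) for x in (-1.0, 0.0, 1.0, 2.0)]

# Widening the range to [-3, 3] and shifting by 3 on both axes gives ReLU6.
grid = [x / 2 for x in range(-20, 21)]
same_as_relu6 = all(
    shifted_hardtanh(x, shift_x=3.0, shift_y=3.0, lo=-3.0, hi=3.0) == relu6(x)
    for x in grid
)
```

Seen this way, hardtanh and ReLU6 differ only in where the clipping window sits, which is what lets the experiment isolate the effect of the output distribution.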
  • 17. Method (4/25). Training model: a VGG-small model was trained on the CIFAR-10 dataset with different activation functions. VGG-small consists of four Conv layers with 64, 64, 128, and 128 output channels and three dense layers (512). A BN layer is placed between each Conv layer and its activation layer, and a max pooling layer is used after each of the latter three convolution layers. The model was trained under the same conditions with 10 different seeds.
  • 18. Method (5/25). Accuracy gap between hardtanh and ReLU6 (1/2): in the figure, the red dotted line represents the test accuracy of ReLU6.
  • 19. Method (6/25). Accuracy gap between hardtanh and ReLU6 (2/2): when the hardtanh function is shifted to the right along the x-axis, the number of negative outputs increases and the output distribution becomes positively skewed. The higher performance of the ReLU activation function comes from the unbalanced distribution of activation outputs. When the sign function is used, as in BNNs, the effect of the activation distribution becomes even more noticeable.
  • 20. Method (7/25). BNNs with unbalanced activation distribution (1/4): previous multi-bit quantization methods use ReLU-based quantization, which outputs an unbalanced activation distribution. Previous BNN quantization methods use the sign function as the binary activation function, so the distribution of binary activations is balanced: the ratio of 1 to -1 is close to 1:1.
  • 21. Method (8/25). BNNs with unbalanced activation distribution (2/4): shifting the activation function along the x-axis helps improve the accuracy of a model. The poor performance of BNNs is attributed to the balanced activation distribution produced by the sign function, so the sign function is shifted along the x-axis.
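
A small simulation shows what shifting the sign function does to the ratio of 1 to -1. It assumes zero-mean, unit-variance Gaussian pre-activations (a rough stand-in for BN outputs) and an illustrative threshold of 0.5; neither value is taken from the slides, and the function names are hypothetical.

```python
import random


def shifted_sign(x, threshold=0.0):
    """Binary activation whose decision point is moved to `threshold`."""
    return 1 if x >= threshold else -1


def positive_fraction(samples, threshold):
    """Fraction of samples that the shifted sign maps to +1."""
    outputs = [shifted_sign(x, threshold) for x in samples]
    return outputs.count(1) / len(outputs)


rng = random.Random(0)  # fixed seed for repeatability
pre_activations = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

balanced = positive_fraction(pre_activations, threshold=0.0)    # close to 1:1
unbalanced = positive_fraction(pre_activations, threshold=0.5)  # fewer +1 outputs
```

With the threshold at zero about half of the outputs are +1; moving it to the right makes +1 the minority value, which is the unbalanced distribution the method aims for.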
  • 22. Method (9/25). BNNs with unbalanced activation distribution (3/4): in VGG-small, the max pooling layer makes the distribution positively skewed.
  • 23. Method (10/25). BNNs with unbalanced activation distribution (4/4), effect of max pooling: the max pooling layer is the only layer that gives asymmetry in the model. Points examined: the accuracy difference between shifting the threshold in the positive direction and in the negative direction; replacing max pooling with average pooling.
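
Why max pooling introduces asymmetry while average pooling does not can be checked numerically. The sketch below assumes symmetric Gaussian inputs and uses 1-D windows of four values as a stand-in for 2x2 pooling; both choices are illustrative assumptions, not the experimental setup from the slides.

```python
import random


def positive_fraction(values):
    """Fraction of values that are strictly positive."""
    return sum(1 for v in values if v > 0) / len(values)


rng = random.Random(0)  # fixed seed for repeatability
window = 4              # stands in for a 2x2 pooling window
pre_pool = [rng.gauss(0.0, 1.0) for _ in range(40_000)]

windows = [pre_pool[i:i + window] for i in range(0, len(pre_pool), window)]
max_pooled = [max(w) for w in windows]
avg_pooled = [sum(w) / window for w in windows]

frac_before = positive_fraction(pre_pool)   # about one half
frac_max = positive_fraction(max_pooled)    # theory: 1 - 0.5 ** 4 = 0.9375
frac_avg = positive_fraction(avg_pooled)    # stays about one half
```

The maximum of four symmetric values is positive unless all four are negative, so the pooled distribution leans positive, whereas the average stays centred on zero.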
  • 24. Method (11/25). Effect of other conditions: datasets, models, initialization methods, optimizers.
  • 25. Method (12/25). Dataset: MNIST. A binary VGG-small model was trained on the MNIST dataset; the average results over 30 different random seeds are presented.
  • 26. Method (13/25). Model architecture: two additional model architectures, a 2-layer MLP and LeNet-5. The 2-layer MLP does not have a max pooling layer.
  • 27. Method (14/25). Initialization method (1/2): the models were trained from scratch using Xavier normal initialization. The full-precision model uses 32-bit floating point. Pretrained full-precision models are often used to initialize BNN models. Hardtanh + ReLU. Full-precision VGG-small model on the CIFAR-10 dataset; binary VGG-small model initialized with the full-precision pretrained model.
  • 28. Method (15/25). Initialization method (2/2).
  • 29. Method (16/25). Optimizer: Adam is the optimizer most often used for BNNs; Adam is replaced with SGD.
  • 30. Method (17/25). Experiments on ImageNet: finding the shift amount using the Top-1 train accuracy at the first epoch and the Top-1 validation accuracy.
  • 31. Method (18/25). Results on ImageNet.
  • 32. Method (19/25). Effect of training the threshold: limited effect on the performance of BNNs. Two factors: batch normalization bias and limited learning capability.
  • 33. Method (20/25). Batch normalization bias: a trainable threshold is already covered by the bias values in the BN layer. 𝑋 is the input to the BN layer and 𝑌 is the output of the binary activation layer; 𝜇, 𝜎, 𝛾, and 𝛽 are the mean, standard deviation, weight, and bias of the BN layer, and 𝑡ℎ is the threshold. 𝛽 and 𝑡ℎ are initialized to zero at the beginning.
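
The claim that the BN bias already covers a trainable threshold can be checked with a few lines. The sketch assumes the binary activation has the form Y = +1 when 𝛾(𝑋 − 𝜇)/𝜎 + 𝛽 ≥ 𝑡ℎ and -1 otherwise, which is consistent with the symbols on the slide but is not quoted from it; all numeric values and function names are hypothetical.

```python
def batch_norm(x, mu, sigma, gamma, beta):
    """Inference-time batch normalization for a single value."""
    return gamma * (x - mu) / sigma + beta


def binary_activation(z, th=0.0):
    """Sign function with threshold th (assumed form)."""
    return 1 if z >= th else -1


# Illustrative BN statistics and parameters (assumed values).
mu, sigma, gamma, beta, th = 0.5, 2.0, 1.5, 0.25, 0.75
inputs = [i * 0.25 for i in range(-20, 21)]

# Variant A: BN with bias beta, followed by a thresholded sign.
with_threshold = [
    binary_activation(batch_norm(x, mu, sigma, gamma, beta), th) for x in inputs
]

# Variant B: threshold folded into the BN bias, followed by a plain sign.
folded_into_bias = [
    binary_activation(batch_norm(x, mu, sigma, gamma, beta - th), 0.0)
    for x in inputs
]

equivalent = with_threshold == folded_into_bias
```

Since only the difference 𝛽 − 𝑡ℎ affects the output, a separately trained threshold adds no expressive power beyond what the BN bias already provides.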
  • 34. Method (21/25). Limited learning capability: the trainable threshold method; an initialization point.
  • 35. Method (22/25). Effect of additional activation function (1/4): an additional activation function is used after the convolution layer in BNNs, because BNNs usually lack non-linearity. The additional activation function is also related to the unbalanced activation distribution.
  • 36. Method (23/25). Effect of additional activation function (2/4): effect of additional PReLU layers after the binary convolution layers in ResNet20.
  • 37. Method (24/25). Effect of additional activation function (3/4): there is accuracy degradation with PReLU. The PReLU layers already play a role in distorting the activation distribution, so further shifting the threshold of the binary activation function is excessive.
  • 38. Method (25/25). Effect of additional activation function (4/4): PReLU is replaced with LeakyReLU. When the slope is 1, LeakyReLU is the identity function.
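
The slope remark can be made concrete with a minimal LeakyReLU sketch. The function name, the default slope, and the sample inputs are illustrative assumptions; PReLU has the same form with the slope learned during training.

```python
def leaky_relu(x, negative_slope=0.01):
    """LeakyReLU: identity for x >= 0, scaled by negative_slope otherwise."""
    return x if x >= 0 else negative_slope * x


samples = [-2.0, -0.5, 0.0, 0.5, 2.0]

# Slope 1: negative inputs pass through unchanged, so the function is identity
# and leaves the activation distribution untouched.
identity_outputs = [leaky_relu(x, negative_slope=1.0) for x in samples]

# Slope 0.25: negative inputs are compressed, which distorts the distribution.
compressed_outputs = [leaky_relu(x, negative_slope=0.25) for x in samples]
```

Sweeping the slope from 1 downward therefore gives a controlled way to vary how strongly the extra activation skews the distribution.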
  • 39. Conclusion: analyzed the impact of the activation distribution on the accuracy of BNN models; demonstrated that the accuracy of previous BNN models could be further improved.