Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
#17
2019/02/06
@iiou16_tech
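abstract
• Deep Neural Networks are hard to train because the distribution of each layer's inputs shifts as the earlier layers' parameters change
• Batch Normalization normalizes layer inputs over each mini-batch
• It allows higher learning rates and in some cases removes the need for dropOut
• With Batch Normalization the same accuracy is reached in 14 times fewer training steps (1/14)
• ImageNet: 4.9% top-5 validation error (4.8% test error)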
outline
1. Introduction
2. Towards Reducing Internal Covariate Shift
3. Normalization via Mini-Batch Statistics
  3.1 Training and Inference with Batch-Normalized Networks
  3.2 Batch-Normalized Convolutional Networks
  3.3 Batch Normalization enables higher learning rates
  3.4 Batch Normalization regularizes the model
4. Experiments
  4.1 Activations over time
  4.2 ImageNet classification
5. Conclusion
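Introduction
• Deep Learning models are commonly trained with SGD
– For training data x, SGD looks for the parameters θ that minimize the loss l
• Each step estimates the gradient from a mini-batch (of size m)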
Introduction
• covariate shift
– The input distribution of a learning system changes: a DNN trained on inputs x later receives inputs x'
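Introduction
• Internal covariate shift
– View a DNN as two sub-networks F1 and F2, where x is the output of F1 and the input of F2
– F2 depends on F1 → when F1's parameters change, the distribution of F2's input x changes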
2 Towards Reducing Internal Covariate Shift
•
–
(DNN)
– DNN
– 0 1 

( )
– 1
2 Towards Reducing Internal Covariate Shift
•
•
• itr
• (SGD)
• 

3 Normalization via Mini-Batch Statistics
•
• 1
• mean 0, variance 1
•
• γ, β: scale and shift the normalized x
• 2
3 Normalization via Mini-Batch Statistics
•
• 2
• 0 1 SGD
DNN
• 

/
• itr 
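The mini-batch normalization summarized on these two slides can be sketched in a few lines of plain Python. This is a minimal sketch for one scalar feature, assuming the transform from the paper: normalize with the mini-batch mean and variance, then scale and shift with γ and β. The function name `batch_norm_train`, the default `eps`, and the sample values are illustrative assumptions, not taken from the slides.

```python
import math

def batch_norm_train(xs, gamma, beta, eps=1e-5):
    """Normalize one scalar feature over a mini-batch, then scale and shift.

    xs: activations of a single feature for the m examples of a mini-batch.
    gamma, beta: learned scale and shift (plain floats in this sketch).
    eps: small constant for numerical stability (assumed default).
    """
    m = len(xs)
    mean = sum(xs) / m                              # mini-batch mean
    var = sum((x - mean) ** 2 for x in xs) / m      # biased mini-batch variance
    x_hat = [(x - mean) / math.sqrt(var + eps) for x in xs]
    ys = [gamma * xh + beta for xh in x_hat]        # y = gamma * x_hat + beta
    return ys, mean, var

# Illustrative mini-batch of size m = 4
batch = [1.0, 2.0, 3.0, 4.0]
ys, mean, var = batch_norm_train(batch, gamma=1.0, beta=0.0)
```

With γ = 1 and β = 0 the outputs have mean 0 and variance close to 1. Other values of γ and β let the layer undo the normalization if that is what training prefers.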

3.1 Training and Inference with Batch-Normalized Networks
•
• /

•
• / 

•
activation
/


/
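The difference between training and inference can be illustrated with a small sketch. It assumes the scheme described in the paper: at inference the mini-batch statistics are replaced by population statistics (the average of the mini-batch means, and the unbiased variance m/(m-1) times the average mini-batch variance), so BN becomes a fixed linear transform. Many implementations use moving averages instead. The function names and numbers below are hypothetical.

```python
import math

def population_stats(batch_means, batch_vars, m):
    """Estimate population mean and variance from per-mini-batch statistics.

    m is the mini-batch size; m / (m - 1) turns the biased mini-batch
    variance into an unbiased estimate.
    """
    mean = sum(batch_means) / len(batch_means)
    var = (m / (m - 1)) * sum(batch_vars) / len(batch_vars)
    return mean, var

def batch_norm_inference(x, gamma, beta, pop_mean, pop_var, eps=1e-5):
    """Apply BN at inference as a fixed linear transform y = scale * x + shift."""
    scale = gamma / math.sqrt(pop_var + eps)
    shift = beta - scale * pop_mean
    return scale * x + shift

# Illustrative statistics collected from two mini-batches of size 4
pop_mean, pop_var = population_stats([2.0, 3.0], [1.0, 2.0], m=4)
y = batch_norm_inference(2.5, gamma=2.0, beta=0.5,
                         pop_mean=pop_mean, pop_var=pop_var)
```

Because `scale` and `shift` depend only on fixed quantities, the output for one example no longer depends on the other examples in the batch.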
3.2 Batch-Normalized Convolutional Networks
• Convolutional
• ( )
•
BN
• m
• Conv BN: for feature maps of size p*q, the effective mini-batch size is m*p*q
• Conv BN 2*
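A minimal sketch of the convolutional case, assuming the scheme from the paper: each feature map shares one mean, one variance, and one (γ, β) pair, and the statistics are taken over all m examples and all p*q spatial positions. The nested-list layout `batch[n][c][i][j]`, the function name, and the sample data are assumptions made for illustration.

```python
import math

def conv_batch_norm(batch, gammas, betas, eps=1e-5):
    """Batch-normalize conv activations stored as batch[n][c][i][j].

    Statistics are computed per feature map c over the batch index n and
    both spatial indices, i.e. over m * p * q values.
    """
    m, channels = len(batch), len(batch[0])
    out = [[None] * channels for _ in range(m)]
    for c in range(channels):
        values = [v for n in range(m) for row in batch[n][c] for v in row]
        count = len(values)                     # effective mini-batch size m*p*q
        mean = sum(values) / count
        var = sum((v - mean) ** 2 for v in values) / count
        inv_std = 1.0 / math.sqrt(var + eps)
        for n in range(m):
            out[n][c] = [[gammas[c] * (v - mean) * inv_std + betas[c]
                          for v in row] for row in batch[n][c]]
    return out

# m = 2 examples, 1 feature map of size 2 x 2 (illustrative values)
batch = [[[[1.0, 2.0], [3.0, 4.0]]],
         [[[5.0, 6.0], [7.0, 8.0]]]]
out = conv_batch_norm(batch, gammas=[1.0], betas=[0.0])
flat = [v for example in out for row in example[0] for v in row]
```

Sharing the statistics across positions keeps the convolutional property: the same feature is normalized the same way wherever it appears in the map.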
3.3 Batch Normalization enables higher learning rates
• 

• BN 

a 

1/a 
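The role of the scale a and the factor 1/a can be checked numerically. This sketch assumes the property stated in the paper: BN(Wu) = BN((aW)u), and the gradient with respect to the scaled weights is 1/a times the original gradient. The two-input layer, the finite-difference gradient, and all values are illustrative assumptions. `eps` is omitted so the invariance is exact up to rounding.

```python
import math

def batch_norm(xs):
    """Normalize a list of pre-activations to mean 0, variance 1 (no eps)."""
    m = len(xs)
    mean = sum(xs) / m
    var = sum((x - mean) ** 2 for x in xs) / m
    return [(x - mean) / math.sqrt(var) for x in xs]

def layer_then_bn(w, us):
    """Linear layer with two weights followed by BN over the mini-batch."""
    return batch_norm([w[0] * u[0] + w[1] * u[1] for u in us])

def grad_first_output_wrt_w0(w, us, h=1e-6):
    """Central finite difference of the first BN output w.r.t. w[0]."""
    plus = layer_then_bn([w[0] + h, w[1]], us)[0]
    minus = layer_then_bn([w[0] - h, w[1]], us)[0]
    return (plus - minus) / (2 * h)

us = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -1.0)]  # illustrative inputs
w, a = [0.5, -0.3], 10.0
w_scaled = [a * w[0], a * w[1]]

# The BN output is unchanged when the weights are scaled by a
max_diff = max(abs(p - q) for p, q in
               zip(layer_then_bn(w, us), layer_then_bn(w_scaled, us)))

# The gradient w.r.t. the scaled weights shrinks by 1/a
ratio = grad_first_output_wrt_w0(w_scaled, us) / grad_first_output_wrt_w0(w, us)
```

Larger weights therefore receive smaller gradients, which is one reason given in the paper for why BN tolerates higher learning rates.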



3.4 Batch Normalization regularizes the model
• 

• BN
• DropOut 

4 Experiments 

4.1 Activations over time
• 

BN
• MNIST
• NN with 3 hidden layers (fully connected)
• mini-batch size 60, 50000 training steps
BN BN
4.2 ImageNet classification
• Inception ImageNet
• ReLU
• CNN layer: each 5*5 convolution → two consecutive 3*3 convolutions
• batch size = 32
• Optimiser : Momentum SGD
https://arxiv.org/pdf/1409.4842.pdf
4.2.1 Accelerating BN Networks
• BASE Inception BN
• BN
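• Increase the learning rate
• Remove DropOut
• Reduce L2 Weight regularization to 1/5
• Decay the learning rate 6 times faster
• Shuffle training examples more thoroughly
• about 1% improvement in validation accuracy
• Reduce photometric distortion
• Remove Local Response Normalization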
https://arxiv.org/pdf/1409.4842.pdf
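4.2.2 Single-Network Classification
• Dataset: LSVRC2012
• Inception: lr=0.0015
• BN-Baseline: Inception with BN, same learning rate
• BN-x5: changes from 4.2.1, lr=0.0075
• BN-x30: changes from 4.2.1, lr=0.045
• BN-x5-Sigmoid: BN-x5 with ReLU→sigmoid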
4.2.3 Ensemble Classification
• ImageNet Best Result
• Ensemble of 6 networks based on BN-x30: SoTA
• DropOut (5% or 10%)
Conclusion(1/2)
•
• NN 



• activation 

DNN
• SGD
BN 2
• BN
• BN
• dropOut
• BN ImageNet
Conclusion(2/2)
• Standardization layer
• BN
• future work
• Recurrent Neural Networks BN
• / BN
• domain adaptation
•
Batch Normalization BN→ , 2
Standardization layer SL no parameter
activation
activation
1
• BN google
• https://patents.google.com/patent/US20160217368A1/en
A neural network system implemented by one or more computers, the neural network system comprising:
a batch normalization layer between a first neural network layer and a second neural network layer, wherein the first neural network layer generates first layer outputs having a plurality of components, and wherein the batch normalization layer is configured to, during training of the neural network system on a batch of training examples:
receive a respective first layer output for each training example in the batch;
compute a plurality of normalization statistics for the batch from the first layer outputs;
normalize each component of each first layer output using the normalization statistics to generate a respective normalized layer output for each training example in the batch;
generate a respective batch normalization layer output for each of the training examples from the normalized layer outputs; and
provide the batch normalization layer output as an input to the second neural network layer.
• ※
• https://www.slideshare.net/YosukeShinya/ss-125937523
• by 50 @2018/12/15
2
• BN
• Group Normalization
• fixup
