Back Propagation using Sigmoid & ReLU
P Revanth Kumar
January 15, 2021
Introduction
Activation functions are mathematical equations that determine the output of a neural network.
The function is attached to each neuron in the network, and determines whether it should be
activated (“fired”) or not, based on whether each neuron’s input is relevant for the model’s
prediction.
Let the inputs be $x_1, x_2, \dots, x_n$. These inputs are passed to a hidden neuron, where two important operations take place:
Figure 1: Neural Network
Step 1: The summation of the weights and the inputs:

$$y = \sum_{i=1}^{n} w_i x_i = w_1 x_1 + w_2 x_2 + \dots + w_n x_n$$
Step 2: Before the activation function is applied, a bias is added to the summation:

$$z = \sum_{i=1}^{n} w_i x_i + b$$
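As a minimal sketch of these two steps in Python (NumPy, the input values, weights, and bias are illustrative assumptions, not from the text):

```python
import numpy as np

# A minimal sketch of Steps 1 and 2: the pre-activation z = sum_i(w_i * x_i) + b.
def pre_activation(x, w, b):
    return np.dot(w, x) + b

x = np.array([0.5, -1.2, 3.0])   # illustrative inputs
w = np.array([0.4, 0.1, -0.2])   # illustrative weights
b = 0.05                         # illustrative bias
print(pre_activation(x, w, b))   # ~ -0.47
```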
There are various kinds of activation functions. Here we will look at two of them:
1. Sigmoid
2. ReLU
1 Sigmoid Activation Function
This is also the function used in logistic regression.
$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

where

$$z = \sum_{i=1}^{n} w_i x_i + b$$
After this transformation the output is squashed to a value between 0 and 1, whether the pre-activation value is positive or negative. Here, 0.5 is the threshold: if the output is less than 0.5 it is treated as 0; if it is greater than 0.5 it is treated as 1 (the neuron is activated).
Figure 2: Sigmoid function
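A minimal sketch of the sigmoid activation and the 0.5 threshold just described (the sample inputs are illustrative assumptions, not from the text):

```python
import numpy as np

def sigmoid(z):
    # Squashes any real z into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-3.0, -0.47, 0.0, 2.0])
out = sigmoid(z)
print(out)                       # ~[0.047 0.385 0.5 0.881]
print((out > 0.5).astype(int))   # [0 0 0 1] -- thresholded at 0.5
```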
1.1 Sigmoid function in Back Propagation
Whenever the weights are updated, we need the derivative of the activation function.
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

Differentiating the sigmoid function with respect to $x$:

$$\frac{d\sigma(x)}{dx} = \frac{1}{(1 + e^{-x})^2} \cdot e^{-x} = \frac{e^{-x}}{(1 + e^{-x})^2} = \underbrace{\frac{1}{1 + e^{-x}}}_{\text{sigmoid}} \cdot \underbrace{\frac{e^{-x}}{1 + e^{-x}}}_{\text{1 $-$ sigmoid}}$$

$$\therefore \frac{d\sigma(x)}{dx} = \sigma(x)\,(1 - \sigma(x)).$$
The derivative of the sigmoid activation function always lies between 0 and 0.25:

$$0 \leq \frac{d\sigma(x)}{dx} \leq 0.25$$
Figure 3: Derivative of sigmoid function
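To sanity-check this bound, a short sketch that evaluates $\sigma(x)(1 - \sigma(x))$ numerically over a range of inputs (the grid of points is an arbitrary choice):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    # sigma'(z) = sigma(z) * (1 - sigma(z)); maximal at z = 0.
    s = sigmoid(z)
    return s * (1.0 - s)

z = np.linspace(-10.0, 10.0, 10001)
print(sigmoid_derivative(z).max())   # 0.25, attained at z = 0
```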
• Because of this, the vanishing gradient problem occurs. Consider the weight update

$$w_{new} = w_{old} - \eta \frac{\partial L}{\partial w}$$

where $\frac{\partial L}{\partial w}$ is calculated with the help of the chain rule.

• Suppose the chain rule involves 3 derivatives: the 1st is 0.25, the 2nd is 0.10 and the 3rd is 0.001; together they determine the change to $w$.

• Multiplying these numbers gives a very small value (0.000025). Putting this number into $\frac{\partial L}{\partial w}$ with a learning rate of 1, there is only a minor change to $w_{old}$, that is, $w_{old} \approx w_{new}$. Because of this minor change it takes much longer to converge along the gradient descent curve, as the numeric sketch below shows.
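A tiny numeric sketch of this effect, using exactly the three derivative values assumed above (the starting weight is an arbitrary illustration):

```python
# The three local derivatives assumed in the bullets above.
chain = [0.25, 0.10, 0.001]
dL_dw = 1.0
for d in chain:
    dL_dw *= d                    # chain rule: multiply local derivatives

eta = 1.0                         # learning rate, as assumed above
w_old = 0.5                       # arbitrary illustrative weight
w_new = w_old - eta * dL_dw
print(dL_dw)                      # ~2.5e-05 -- the gradient has vanished
print(w_old, w_new)               # 0.5 vs ~0.499975 -- barely any update
```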
Now, let us see how ReLU helps to avoid the vanishing gradient problem.
2 ReLU Activation Function
In the ReLU activation function, suppose that after the usual pre-activation operation we have

$$z = \sum_{i=1}^{n} w_i x_i + b$$
• If this value is passed to the ReLU activation function, a simple formula is applied:

$$\max(0, z)$$

• If $z$ is negative, then $\max(0, z) = 0$.

• If $z$ is positive, then $\max(0, z) = z$.

• The ReLU activation function is much more popular than the sigmoid function; a small sketch follows Figure 4 below.
Figure 4: ReLu function
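A minimal ReLU sketch (the sample values are illustrative assumptions):

```python
import numpy as np

def relu(z):
    # max(0, z): negatives map to 0, positives pass through unchanged.
    return np.maximum(0.0, z)

z = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(z))   # [0.  0.  0.  1.5 3. ]
```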
2.1 ReLU function in Back Propagation
During back propagation, the derivative of the ReLU function for any positive input is 1. The line $y = z$ (from Fig. 4) makes an angle of $45°$, so for any positive value the derivative of the ReLU function is always 1, because $\tan 45° = 1$.
Figure 5: Derivative of ReLu
• The derivative $\frac{\partial f}{\partial z}$ of ReLU is always 0 or 1; whenever the derivative is taken, the sign of $z$ is checked:

$$\frac{\partial f}{\partial z} = \begin{cases} 1 & z > 0 \\ 0 & z < 0 \end{cases}$$
• Now, let us apply this in the weight update. Suppose

$$w_{new} = w_{old} - \eta \frac{\partial L}{\partial w}$$

• Let the chain rule again involve 3 derivative values, now $1 \times 1 \times 1 = 1$, with learning rate $\eta = 1$. When these values enter $\frac{\partial L}{\partial w}$, there is a noticeable difference between $w_{old}$ and $w_{new}$.

• So ReLU does not suffer from the vanishing gradient problem, and the weights converge.

• But there is a small problem with the ReLU function, which is fixed by the Leaky ReLU function.

• Since each ReLU derivative is either 0 or 1, suppose one factor in the chain is 0, e.g. $1 \times 0 \times 1 = 0$.

• Then $w_{old} = w_{new}$, which creates a "dead neuron": no learning takes place. To fix this, the Leaky ReLU function is used, as the sketch below shows.
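A brief sketch of the ReLU derivative and the dead-neuron case, using the chain values assumed above (the pre-activation values are illustrative):

```python
def relu_derivative(z):
    # 1 for z > 0, 0 for z < 0 (at z = 0 the derivative is undefined;
    # taking 0 there is a common convention).
    return 1.0 if z > 0 else 0.0

# Healthy chain: every pre-activation is positive, the product stays 1.
print(relu_derivative(2.0) * relu_derivative(0.7) * relu_derivative(1.3))   # 1.0

# Dead neuron: one negative pre-activation zeroes the whole product,
# so w_new == w_old and that path stops learning.
print(relu_derivative(2.0) * relu_derivative(-0.7) * relu_derivative(1.3))  # 0.0
```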
2.2 Leaky ReLU
Leaky ReLUs are one attempt to fix the "dying ReLU" problem. Instead of the function being zero when $z < 0$, a Leaky ReLU has a small slope (0.01, or so) in the negative region:
Figure 6: Leaky ReLU activation function
$$f(z) = \begin{cases} z & z > 0 \\ 0.01z & z < 0 \end{cases}$$
• Now, if we take the derivative with respect to $z$ on the negative side, $\frac{\partial (0.01z)}{\partial z} = 0.01$; small, but not 0.

• This means Leaky ReLU solves the "dead neuron" problem during back propagation.

• Suppose a neural network has 100 neurons and, over the training cycles, there is a problem of neurons deactivating or dying; then Leaky ReLU should be applied.
*Note: The sigmoid derivative always ranges between 0 and 0.25, the tanh derivative is always at most 1, but the ReLU derivative is either 0 or 1.
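A last sketch, of Leaky ReLU and its derivative (the 0.01 slope is the value assumed above; the sample input is illustrative):

```python
def leaky_relu(z, alpha=0.01):
    # z for z > 0, alpha * z otherwise: a small slope instead of a hard 0.
    return z if z > 0 else alpha * z

def leaky_relu_derivative(z, alpha=0.01):
    # 1 for z > 0, alpha otherwise -- never exactly 0, so the gradient
    # product in back propagation cannot collapse to 0.
    return 1.0 if z > 0 else alpha

print(leaky_relu(-3.0))              # -0.03 instead of 0
print(leaky_relu_derivative(-3.0))   # 0.01 -- the neuron keeps learning
```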