SlideShare a Scribd company logo
1 of 4
Download to read offline
Derivation of Backpropagation
1 Introduction
Figure 1: Neural network processing
Conceptually, a network forward propagates activation to produce an output and it backward
propagates error to determine weight changes (as shown in Figure 1). The weights on the connec-
tions between neurons mediate the passed values in both directions.
The Backpropagation algorithm is used to learn the weights of a multilayer neural network with
a fixed architecture. It performs gradient descent to try to minimize the sum squared error between
the network’s output values and the given target values.
Figure 2 depicts the network components which affect a particular weight change. Notice that
all the necessary components are locally related to the weight being updated. This is one feature
of backpropagation that seems biologically plausible. However, brain connections appear to be
unidirectional and not bidirectional as would be required to implement backpropagation.
2 Notation
For the purpose of this derivation, we will use the following notation:
• The subscript k denotes the output layer.
• The subscript j denotes the hidden layer.
• The subscript i denotes the input layer.
1
Figure 2: The change to a hidden to output weight depends on error (depicted as a lined pattern)
at the output node and activation (depicted as a solid pattern) at the hidden node. While the
change to a input to hidden weight depends on error at the hidden node (which in turn depends
on error at all the output nodes) and activation at the input node.
• wkj denotes a weight from the hidden to the output layer.
• wji denotes a weight from the input to the hidden layer.
• a denotes an activation value.
• t denotes a target value.
• net denotes the net input.
3 Review of Calculus Rules
d(eu)
dx
= eu du
dx
d(g + h)
dx
=
dg
dx
+
dh
dx
d(gn)
dx
= ngn−1 dg
dx
4 Gradient Descent on Error
We can motivate the backpropagation learning algorithm as gradient descent on sum-squared error
(we square the error because we are interested in its magnitude, not its sign). The total error in a
network is given by the following equation (the 1
2 will simplify things later).
E =
1
2
X
k
(tk − ak)2
We want to adjust the network’s weights to reduce this overall error.
∆W ∝ −
∂E
∂W
We will begin at the output layer with a particular weight.
2
∆wkj ∝ −
∂E
∂wkj
However error is not directly a function of a weight. We expand this as follows.
∆wkj = −ε
∂E
∂ak
∂ak
∂netk
∂netk
∂wkj
Let’s consider each of these partial derivatives in turn. Note that only one term of the E summation
will have a non-zero derivative: the one associated with the particular weight we are considering.
4.1 Derivative of the error with respect to the activation
∂E
∂ak
=
∂(1
2 (tk − ak)2)
∂ak
= −(tk − ak)
Now we see why the 1
2 in the E term was useful.
4.2 Derivative of the activation with respect to the net input
∂ak
∂netk
=
∂(1 + e−netk )−1
∂netk
=
e−netk
(1 + e−netk )2
We’d like to be able to rewrite this result in terms of the activation function. Notice that:
1 −
1
1 + e−netk
=
e−netk
1 + e−netk
Using this fact, we can rewrite the result of the partial derivative as:
ak(1 − ak)
4.3 Derivative of the net input with respect to a weight
Note that only one term of the net summation will have a non-zero derivative: again the one
associated with the particular weight we are considering.
∂netk
∂wkj
=
∂(wkjaj)
∂wkj
= aj
4.4 Weight change rule for a hidden to output weight
Now substituting these results back into our original equation we have:
∆wkj = ε
δk
z }| {
(tk − ak)ak(1 − ak) aj
Notice that this looks very similar to the Perceptron Training Rule. The only difference is the
inclusion of the derivative of the activation function. This equation is typically simplified as shown
below where the δ term repesents the product of the error with the derivative of the activation
function.
∆wkj = εδkaj
3
4.5 Weight change rule for an input to hidden weight
Now we have to determine the appropriate weight change for an input to hidden weight. This is
more complicated because it depends on the error at all of the nodes this weighted connection can
lead to.
∆wji ∝ −[
X
k
∂E
∂ak
∂ak
∂netk
∂netk
∂aj
]
∂aj
∂netj
∂netj
∂wji
= ε[
X
k
δk
z }| {
(tk − ak)ak(1 − ak) wkj]aj(1 − aj)ai
= ε
δj
z }| {
[
X
k
δkwkj]aj(1 − aj) ai
∆wji = εδjai
4

More Related Content

What's hot

Stiffness matrix method of indeterminate Beam2
Stiffness matrix method of indeterminate Beam2Stiffness matrix method of indeterminate Beam2
Stiffness matrix method of indeterminate Beam2anujajape
 
Customer Segmentation using Clustering
Customer Segmentation using ClusteringCustomer Segmentation using Clustering
Customer Segmentation using ClusteringDessy Amirudin
 
Stiffness matrix method of indeterminate Beam5
Stiffness matrix method of indeterminate Beam5Stiffness matrix method of indeterminate Beam5
Stiffness matrix method of indeterminate Beam5anujajape
 
Enhance The K Means Algorithm On Spatial Dataset
Enhance The K Means Algorithm On Spatial DatasetEnhance The K Means Algorithm On Spatial Dataset
Enhance The K Means Algorithm On Spatial DatasetAlaaZ
 
Stiffness matrix method of indeterminate Beam3
Stiffness matrix method of indeterminate Beam3Stiffness matrix method of indeterminate Beam3
Stiffness matrix method of indeterminate Beam3anujajape
 
K mean-clustering
K mean-clusteringK mean-clustering
K mean-clusteringPVP College
 
STATE SPACE GENERATION FRAMEWORK BASED ON BINARY DECISION DIAGRAM FOR DISTRIB...
STATE SPACE GENERATION FRAMEWORK BASED ON BINARY DECISION DIAGRAM FOR DISTRIB...STATE SPACE GENERATION FRAMEWORK BASED ON BINARY DECISION DIAGRAM FOR DISTRIB...
STATE SPACE GENERATION FRAMEWORK BASED ON BINARY DECISION DIAGRAM FOR DISTRIB...csandit
 
Project Presentation
Project PresentationProject Presentation
Project Presentationsamvb18
 
The Back Propagation Learning Algorithm
The Back Propagation Learning AlgorithmThe Back Propagation Learning Algorithm
The Back Propagation Learning AlgorithmESCOM
 
Lecture 19: Implementation of Histogram Image Operation
Lecture 19: Implementation of Histogram Image OperationLecture 19: Implementation of Histogram Image Operation
Lecture 19: Implementation of Histogram Image OperationVARUN KUMAR
 

What's hot (20)

Stiffness matrix method of indeterminate Beam2
Stiffness matrix method of indeterminate Beam2Stiffness matrix method of indeterminate Beam2
Stiffness matrix method of indeterminate Beam2
 
Disjoint set
Disjoint setDisjoint set
Disjoint set
 
honn
honnhonn
honn
 
K means
K meansK means
K means
 
Customer Segmentation using Clustering
Customer Segmentation using ClusteringCustomer Segmentation using Clustering
Customer Segmentation using Clustering
 
Neural nw k means
Neural nw k meansNeural nw k means
Neural nw k means
 
Stiffness matrix method of indeterminate Beam5
Stiffness matrix method of indeterminate Beam5Stiffness matrix method of indeterminate Beam5
Stiffness matrix method of indeterminate Beam5
 
K-Means manual work
K-Means manual workK-Means manual work
K-Means manual work
 
Enhance The K Means Algorithm On Spatial Dataset
Enhance The K Means Algorithm On Spatial DatasetEnhance The K Means Algorithm On Spatial Dataset
Enhance The K Means Algorithm On Spatial Dataset
 
Data miningpresentation
Data miningpresentationData miningpresentation
Data miningpresentation
 
Stiffness matrix method of indeterminate Beam3
Stiffness matrix method of indeterminate Beam3Stiffness matrix method of indeterminate Beam3
Stiffness matrix method of indeterminate Beam3
 
K mean-clustering
K mean-clusteringK mean-clustering
K mean-clustering
 
Icmtea
IcmteaIcmtea
Icmtea
 
STATE SPACE GENERATION FRAMEWORK BASED ON BINARY DECISION DIAGRAM FOR DISTRIB...
STATE SPACE GENERATION FRAMEWORK BASED ON BINARY DECISION DIAGRAM FOR DISTRIB...STATE SPACE GENERATION FRAMEWORK BASED ON BINARY DECISION DIAGRAM FOR DISTRIB...
STATE SPACE GENERATION FRAMEWORK BASED ON BINARY DECISION DIAGRAM FOR DISTRIB...
 
Project Presentation
Project PresentationProject Presentation
Project Presentation
 
Rough K Means - Numerical Example
Rough K Means - Numerical ExampleRough K Means - Numerical Example
Rough K Means - Numerical Example
 
The Back Propagation Learning Algorithm
The Back Propagation Learning AlgorithmThe Back Propagation Learning Algorithm
The Back Propagation Learning Algorithm
 
Vgg
VggVgg
Vgg
 
07. disjoint set
07. disjoint set07. disjoint set
07. disjoint set
 
Lecture 19: Implementation of Histogram Image Operation
Lecture 19: Implementation of Histogram Image OperationLecture 19: Implementation of Histogram Image Operation
Lecture 19: Implementation of Histogram Image Operation
 

Similar to Back propderiv

nural network ER. Abhishek k. upadhyay
nural network ER. Abhishek  k. upadhyaynural network ER. Abhishek  k. upadhyay
nural network ER. Abhishek k. upadhyayabhishek upadhyay
 
Multilayer perceptron
Multilayer perceptronMultilayer perceptron
Multilayer perceptronsmitamm
 
Backpropagation - A peek into the Mathematics of optimization.pdf
Backpropagation - A peek into the Mathematics of optimization.pdfBackpropagation - A peek into the Mathematics of optimization.pdf
Backpropagation - A peek into the Mathematics of optimization.pdfOussama Haj Salem
 
A comparison-of-first-and-second-order-training-algorithms-for-artificial-neu...
A comparison-of-first-and-second-order-training-algorithms-for-artificial-neu...A comparison-of-first-and-second-order-training-algorithms-for-artificial-neu...
A comparison-of-first-and-second-order-training-algorithms-for-artificial-neu...Cemal Ardil
 
Introduction to Neural networks (under graduate course) Lecture 4 of 9
Introduction to Neural networks (under graduate course) Lecture 4 of 9Introduction to Neural networks (under graduate course) Lecture 4 of 9
Introduction to Neural networks (under graduate course) Lecture 4 of 9Randa Elanwar
 
Chapter No. 6: Backpropagation Networks
Chapter No. 6:  Backpropagation NetworksChapter No. 6:  Backpropagation Networks
Chapter No. 6: Backpropagation NetworksRamkrishnaPatil17
 
Design of airfoil using backpropagation training with mixed approach
Design of airfoil using backpropagation training with mixed approachDesign of airfoil using backpropagation training with mixed approach
Design of airfoil using backpropagation training with mixed approachEditor Jacotech
 
Design of airfoil using backpropagation training with mixed approach
Design of airfoil using backpropagation training with mixed approachDesign of airfoil using backpropagation training with mixed approach
Design of airfoil using backpropagation training with mixed approachEditor Jacotech
 
Scilab for real dummies j.heikell - part 2
Scilab for real dummies j.heikell - part 2Scilab for real dummies j.heikell - part 2
Scilab for real dummies j.heikell - part 2Scilab
 

Similar to Back propderiv (20)

nural network ER. Abhishek k. upadhyay
nural network ER. Abhishek  k. upadhyaynural network ER. Abhishek  k. upadhyay
nural network ER. Abhishek k. upadhyay
 
Multilayer perceptron
Multilayer perceptronMultilayer perceptron
Multilayer perceptron
 
UofT_ML_lecture.pptx
UofT_ML_lecture.pptxUofT_ML_lecture.pptx
UofT_ML_lecture.pptx
 
2-Perceptrons.pdf
2-Perceptrons.pdf2-Perceptrons.pdf
2-Perceptrons.pdf
 
Backpropagation - A peek into the Mathematics of optimization.pdf
Backpropagation - A peek into the Mathematics of optimization.pdfBackpropagation - A peek into the Mathematics of optimization.pdf
Backpropagation - A peek into the Mathematics of optimization.pdf
 
NN-Ch6.PDF
NN-Ch6.PDFNN-Ch6.PDF
NN-Ch6.PDF
 
A comparison-of-first-and-second-order-training-algorithms-for-artificial-neu...
A comparison-of-first-and-second-order-training-algorithms-for-artificial-neu...A comparison-of-first-and-second-order-training-algorithms-for-artificial-neu...
A comparison-of-first-and-second-order-training-algorithms-for-artificial-neu...
 
parallel
parallelparallel
parallel
 
ECE 565 presentation
ECE 565 presentationECE 565 presentation
ECE 565 presentation
 
Capstone paper
Capstone paperCapstone paper
Capstone paper
 
04 Multi-layer Feedforward Networks
04 Multi-layer Feedforward Networks04 Multi-layer Feedforward Networks
04 Multi-layer Feedforward Networks
 
Introduction to Neural networks (under graduate course) Lecture 4 of 9
Introduction to Neural networks (under graduate course) Lecture 4 of 9Introduction to Neural networks (under graduate course) Lecture 4 of 9
Introduction to Neural networks (under graduate course) Lecture 4 of 9
 
SOFTCOMPUTERING TECHNICS - Unit
SOFTCOMPUTERING TECHNICS - UnitSOFTCOMPUTERING TECHNICS - Unit
SOFTCOMPUTERING TECHNICS - Unit
 
Chapter No. 6: Backpropagation Networks
Chapter No. 6:  Backpropagation NetworksChapter No. 6:  Backpropagation Networks
Chapter No. 6: Backpropagation Networks
 
Neural Networks
Neural NetworksNeural Networks
Neural Networks
 
Design of airfoil using backpropagation training with mixed approach
Design of airfoil using backpropagation training with mixed approachDesign of airfoil using backpropagation training with mixed approach
Design of airfoil using backpropagation training with mixed approach
 
Design of airfoil using backpropagation training with mixed approach
Design of airfoil using backpropagation training with mixed approachDesign of airfoil using backpropagation training with mixed approach
Design of airfoil using backpropagation training with mixed approach
 
CS767_Lecture_04.pptx
CS767_Lecture_04.pptxCS767_Lecture_04.pptx
CS767_Lecture_04.pptx
 
Scilab for real dummies j.heikell - part 2
Scilab for real dummies j.heikell - part 2Scilab for real dummies j.heikell - part 2
Scilab for real dummies j.heikell - part 2
 
2. 2.1 BPN.pdf
2. 2.1 BPN.pdf2. 2.1 BPN.pdf
2. 2.1 BPN.pdf
 

Recently uploaded

Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and usesDevarapalliHaritha
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learningmisbanausheenparvam
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZTE
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 

Recently uploaded (20)

Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and uses
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 

Back propderiv

  • 1. Derivation of Backpropagation 1 Introduction Figure 1: Neural network processing Conceptually, a network forward propagates activation to produce an output and it backward propagates error to determine weight changes (as shown in Figure 1). The weights on the connec- tions between neurons mediate the passed values in both directions. The Backpropagation algorithm is used to learn the weights of a multilayer neural network with a fixed architecture. It performs gradient descent to try to minimize the sum squared error between the network’s output values and the given target values. Figure 2 depicts the network components which affect a particular weight change. Notice that all the necessary components are locally related to the weight being updated. This is one feature of backpropagation that seems biologically plausible. However, brain connections appear to be unidirectional and not bidirectional as would be required to implement backpropagation. 2 Notation For the purpose of this derivation, we will use the following notation: • The subscript k denotes the output layer. • The subscript j denotes the hidden layer. • The subscript i denotes the input layer. 1
  • 2. Figure 2: The change to a hidden to output weight depends on error (depicted as a lined pattern) at the output node and activation (depicted as a solid pattern) at the hidden node. While the change to a input to hidden weight depends on error at the hidden node (which in turn depends on error at all the output nodes) and activation at the input node. • wkj denotes a weight from the hidden to the output layer. • wji denotes a weight from the input to the hidden layer. • a denotes an activation value. • t denotes a target value. • net denotes the net input. 3 Review of Calculus Rules d(eu) dx = eu du dx d(g + h) dx = dg dx + dh dx d(gn) dx = ngn−1 dg dx 4 Gradient Descent on Error We can motivate the backpropagation learning algorithm as gradient descent on sum-squared error (we square the error because we are interested in its magnitude, not its sign). The total error in a network is given by the following equation (the 1 2 will simplify things later). E = 1 2 X k (tk − ak)2 We want to adjust the network’s weights to reduce this overall error. ∆W ∝ − ∂E ∂W We will begin at the output layer with a particular weight. 2
  • 3. ∆wkj ∝ − ∂E ∂wkj However error is not directly a function of a weight. We expand this as follows. ∆wkj = −ε ∂E ∂ak ∂ak ∂netk ∂netk ∂wkj Let’s consider each of these partial derivatives in turn. Note that only one term of the E summation will have a non-zero derivative: the one associated with the particular weight we are considering. 4.1 Derivative of the error with respect to the activation ∂E ∂ak = ∂(1 2 (tk − ak)2) ∂ak = −(tk − ak) Now we see why the 1 2 in the E term was useful. 4.2 Derivative of the activation with respect to the net input ∂ak ∂netk = ∂(1 + e−netk )−1 ∂netk = e−netk (1 + e−netk )2 We’d like to be able to rewrite this result in terms of the activation function. Notice that: 1 − 1 1 + e−netk = e−netk 1 + e−netk Using this fact, we can rewrite the result of the partial derivative as: ak(1 − ak) 4.3 Derivative of the net input with respect to a weight Note that only one term of the net summation will have a non-zero derivative: again the one associated with the particular weight we are considering. ∂netk ∂wkj = ∂(wkjaj) ∂wkj = aj 4.4 Weight change rule for a hidden to output weight Now substituting these results back into our original equation we have: ∆wkj = ε δk z }| { (tk − ak)ak(1 − ak) aj Notice that this looks very similar to the Perceptron Training Rule. The only difference is the inclusion of the derivative of the activation function. This equation is typically simplified as shown below where the δ term repesents the product of the error with the derivative of the activation function. ∆wkj = εδkaj 3
  • 4. 4.5 Weight change rule for an input to hidden weight Now we have to determine the appropriate weight change for an input to hidden weight. This is more complicated because it depends on the error at all of the nodes this weighted connection can lead to. ∆wji ∝ −[ X k ∂E ∂ak ∂ak ∂netk ∂netk ∂aj ] ∂aj ∂netj ∂netj ∂wji = ε[ X k δk z }| { (tk − ak)ak(1 − ak) wkj]aj(1 − aj)ai = ε δj z }| { [ X k δkwkj]aj(1 − aj) ai ∆wji = εδjai 4