A Primer on Back-Propagation of Errors

(applied to neural networks)
Auro Tripathy
auro@shatterline.com
Outline
• Summary of Forward-Propagation
• The Calculus of Back-Propagation
• Summary
A Feed-Forward Network is a Brain-Inspired Metaphor
Feed-forward to calculate the error relative to the desired output

Error-Function (aka Loss-, Cost-, or Objective-Function)
• In the feed-forward path, calculate the error relative to the desired output.
• We define an error-function E(X3, Y) as the “penalty” of predicting X3 when the true output is Y.
• The objective is to minimize the error across all the training samples.
• The error/loss E(X3, Y) assigns a numerical score (a scalar) to the network’s output X3 given the expected output Y.
• The loss is zero only when the network’s output is correct (see the sketch below).
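A common concrete choice is the squared-error loss; the slides don’t fix a specific E, so the Python sketch below is illustrative, with a scalar output x3 and target y:

def squared_error(x3, y):
    # Penalty for predicting x3 when the true output is y
    return 0.5 * (x3 - y) ** 2

print(squared_error(0.8, 1.0))  # 0.02 -- a small penalty for a near-miss
print(squared_error(1.0, 1.0))  # 0.0  -- zero loss when the output is correct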
Sigmoid Activation Function
The sigmoid activation function
σ(x) = 1/(1 + e^(−x))
is an S-shaped function that squashes any real-valued input x into the range (0, 1).
https://en.wikipedia.org/wiki/File:Logistic-curve.svg
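A minimal sketch in Python, using only the standard library:

import math

def sigmoid(x):
    # Squashes any real-valued x into the open interval (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0.0))   # 0.5
print(sigmoid(6.0))   # ~0.9975, approaching 1
print(sigmoid(-6.0))  # ~0.0025, approaching 0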
Gradient Descent
Note, in practice, we don’t expect a global minimum, as shown here.
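The idea in a one-dimensional Python sketch: repeatedly step against the gradient. The toy objective f(w) = (w − 3)² and the learning rate are illustrative assumptions, not from the slides.

w = 0.0    # starting guess
lr = 0.1   # learning rate (step size)
for _ in range(50):
    grad = 2.0 * (w - 3.0)  # f'(w) for the toy objective f(w) = (w - 3)^2
    w -= lr * grad          # step downhill, against the gradient
print(w)  # ~3.0, the minimum of the toy objective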
“Unshackled by the chain-rule”

- Patrick Winston, MIT
Derivative of the Error E with respect to the Output, X3
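The slides leave E unspecified; assuming the squared-error loss sketched earlier, the first factor of the chain rule is:

$$E(X_3, Y) = \tfrac{1}{2}(X_3 - Y)^2 \quad\Rightarrow\quad \frac{\partial E}{\partial X_3} = X_3 - Y$$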
Derivative of the Sigmoid Activation Function
For the sigmoid function, the cool thing is that the derivative of the output X3 (with respect to the input P3) is expressed in terms of the output itself:
X3 · (1 − X3)
http://kawahara.ca/wp-content/uploads/derivative_of_sigmoid.jpg
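The derivation behind that result (the standard one the linked image walks through):

$$\frac{d\sigma}{dx} = \frac{d}{dx}\,(1 + e^{-x})^{-1} = \frac{e^{-x}}{(1 + e^{-x})^{2}} = \sigma(x)\,\bigl(1 - \sigma(x)\bigr)$$

With X3 = σ(P3), this gives ∂X3/∂P3 = X3 · (1 − X3).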
Derivative of P3 with respect to W3
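Assuming P3 is the usual weighted input to the output neuron, P3 = W3 · X2, where X2 is the previous layer’s output, the derivative is just that input:

$$P_3 = W_3\,X_2 \quad\Rightarrow\quad \frac{\partial P_3}{\partial W_3} = X_2$$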
Propagate the errors backward and adjust the weights, w, so the actual output mimics the desired output
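Chaining the three derivatives above (still assuming the squared-error loss) gives the gradient for W3, and the weight takes a small step of size η against it:

$$\frac{\partial E}{\partial W_3} = \frac{\partial E}{\partial X_3}\cdot\frac{\partial X_3}{\partial P_3}\cdot\frac{\partial P_3}{\partial W_3} = (X_3 - Y)\cdot X_3(1 - X_3)\cdot X_2, \qquad W_3 \leftarrow W_3 - \eta\,\frac{\partial E}{\partial W_3}$$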
Computations are Localized & Partially Pre-computed in the Previous Layer
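A minimal Python sketch of that locality for a chain of two single-neuron sigmoid layers (the shape, weights, and learning rate are illustrative assumptions): each layer’s “delta” is built from the delta already computed in the layer above it, so nothing is re-derived from scratch.

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Forward pass
x1 = 0.5               # input
w2, w3 = 0.4, 0.7      # illustrative weights
x2 = sigmoid(w2 * x1)  # hidden activation
x3 = sigmoid(w3 * x2)  # output activation
y = 1.0                # desired output

# Backward pass: each delta reuses the one from the layer above
delta3 = (x3 - y) * x3 * (1 - x3)     # dE/dP3, local to the output layer
grad_w3 = delta3 * x2                 # dE/dW3 = delta3 * (input to layer 3)
delta2 = delta3 * w3 * x2 * (1 - x2)  # dE/dP2, built from delta3
grad_w2 = delta2 * x1                 # dE/dW2 = delta2 * (input to layer 2)

# Weight updates
eta = 0.5
w3 -= eta * grad_w3
w2 -= eta * grad_w2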
Summary
☑ If there’s a representative set of inputs and outputs, then back-propagation can learn the weights.
☑ Back-propagation’s running time is linear in the number of layers.
☑ Simple to implement (and test)
Credits
Concepts crystallized from MIT Professor Patrick Winston’s lecture,
https://www.youtube.com/watch?v=q0pm3BrIUFo