SSK_Artificial Neural Networks Basic to Models.pdf
1. AN INTRODUCTION TO ARTIFICIAL NEURAL NETWORKS
Dr. S. Sasikala
Department of Electronics and Communication Engineering
Kumaraguru College of Technology, Coimbatore
August 27, 2022
IEEE EAB & IEEE MAS Sponsored TryEngineering Workshop on Artificial Intelligence for All
3. What is Learning?
"Change is the end result of all true learning." — Leo Buscaglia
4. What is Learning?
• Learning happens when you observe a phenomenon and recognize a pattern.
• You try to understand this pattern by finding out whether there is any relationship between the entities involved in that phenomenon.
5. What is Learning?
• Take the example of a simple phenomenon that we observe daily — the occurrence of day and night. How do you make sense of it?
• Is there a pattern? Yes.
— Daytime: for a fixed period, we are exposed to the light and heat of the sun.
— Night-time: for another fixed period, we are deprived of the light and heat of the sun.
— This pattern repeats over and over and over.
6. What is Learning?
• How does this pattern occur?
• There are 2 entities involved in this observation — the Sun and the Earth.
• Is there a relationship between the amount of light (and heat) originating from the sun and the surface of the earth receiving it?
• The pattern suggests that the surface of the earth receives the light alternately:
— it gets the light during the daytime
— it does not get the light during the night-time.
• How is this possible?
— There are many possibilities.
7. What is Learning?
• There are 3 conclusions, called "models", that can be derived to explain the observed phenomenon.
• Model 1: Day/Night is a function of a magical ON/OFF switch of the sun
• Model 2: Day/Night is a function of the revolution of the sun around the earth
• Model 3: Day/Night is a function of the rotation of the earth on its axis
8. What is Learning?
• The question now arises:
— Which model (or function) is more accurate?
As per the observations and findings of philosophers and scientists across the ages, Model 3 is the most accurate model for explaining the phenomenon of day and night.
— We can say that this model "fits" best for the observations around this phenomenon.
9. What is Learning?
• Once a model has been built, it can be used to predict future outcomes for that phenomenon.
• In our example, our model can safely predict that the occurrence of day/night will continue until, for some reason, the earth stops rotating or the sun runs out of energy.
➢ Will the earth stop rotating?
➢ When will the sun have spent all of its energy?
10. This is How Humans Learn
11. Human Learning
• Observing something, identifying a pattern, building a theory (model) to explain this pattern, and testing this theory to check whether it fits most or all observations.
12. How Human Learn?
Parents Parents
Siblings
Teachers
Parents
Siblings
Teachers
Friends
Parents
Siblings
Teachers
Friends
Society
Experience
Parents
Siblings
Wife
Friends
Society
Colleagues
Parents
Siblings
Wife
Children
Friends
Society
Colleagues
Parents
Siblings
Wife
Children
Grand
Children
Friends
Society
Colleagues
BOOKS BOOKS
13. Is it possible for a machine to mimic
the process of human learning?
14. Human vs Machine
15. Machines Can Mimic the Human Learning Process
• The basic idea remains the same
• As with humans, machines are fed with observations (data)
• The learning algorithm tries to find a pattern in the data that best fits the observations
16. Human learning vs Machine Learning
17. Machine Learning
A very powerful extension of
Human Brainpower
18. Tasks of Machine Learning
• Pattern Recognition
• Decision Making
• Optimization
19. Pattern Recognition
A pattern
• is an object, process or event that can be given a name
• can either be seen physically or be observed
• e.g. eye colour, fingerprints, handwriting
Recognition
• the process of identifying the patterns
Pattern recognition
• is identifying patterns in data
• the process of converting raw data into a form that is amenable for a machine to use
• involves classification and clustering of patterns
22. Pattern Recognition
• Humans
Can perceive patterns naturally
But need more computational time
• Machines
Computational speed is very high compared to humans
23. Humans - Very Good at PR (Pattern Recognition)
Humans have
• the ability to learn from experience
• a brain with a lot of information-processing cells
• about 10^11 neurons, interconnected to form a vast and complex network-like structure
24. BIOLOGICAL AND ARTIFICIAL
NEURAL NETWORKS
25. Biological Neuron
Cell body (Soma)
• Contains the organelles of the neuron
Dendrites (Rx)
• Tree-like structures originating from the cell body that receive signals from surrounding neurons
Axon (Tx)
• Long connection extending from the cell body that carries the signal
• There is only one axon per neuron; it may divide into many branches at its end, connecting to other cells to transmit the signal from one neuron to others
Synapse
• Small bulb-like organ at the end of the axon which introduces the signal to the nearby dendrites of the other neuron through chemical diffusion
Neuron
• Sums up all the inputs, processes the sum with a threshold function and produces an output signal
• A neuron fires an electrical impulse only if a certain condition is met
26. Biological Neural Network
27. How Do You Model an Artificial Neuron?
By simulating the functioning of a biological neuron
❑Function 1 – Accumulation of information
Summation, or net input calculation
❑Function 2 – Passing of information
Threshold or activation, i.e. producing the output
Simulation involves
❑Identifying the equivalent mathematical operator for each function
❑Designing a mathematical model that processes information
An artificial neuron resembles the human brain in two respects:
❑Knowledge acquisition through learning
❑Storage of knowledge in the synaptic weights
28. Biological Neuron and Artificial Neuron
29. Biological & Artificial Neuron
Resemblance
30. ANN vs BNN
BNN | ANN
Soma | Node
Dendrites | Input
Synapse | Weights or interconnections
Axon | Output
Massively parallel, slow, but superior to the ANN | Massively parallel, fast, but inferior to the BNN
10^11 neurons and 10^15 interconnections | 10^2 to 10^4 nodes, depending mainly on the type of application and the network designer
Can tolerate ambiguity | Very precise, structured and formatted data is required
Performance degrades with even partial damage | Capable of robust performance, hence has the potential to be fault tolerant
Stores the information in the synapses | Stores the information in continuous memory locations
31. ANN - Function
[Figure: an artificial neuron. The input vector x = (x0, x1, ..., xn) is multiplied element-wise by the weight vector w = (w0j, w1j, ..., wnj), the products are accumulated into a weighted sum, and the sum is passed through the activation function f to produce the output y.]
32. What is an ANN?
Artificial Neuron
➢A digital construct that seeks to simulate the behavior of a biological neuron in the brain.
➢Artificial neurons may be physical devices, or purely mathematical constructs.
Artificial Neural Networks (ANN)
➢Networks of artificial neurons
➢A parallel computational system consisting of a huge number of simple processing elements, massively connected together in a specific manner in order to perform a particular task
33. History of ANN
34. Model of Artificial Neural Network
35. Model of Artificial Neural Network
• In the general model of an ANN, the net input is calculated as
y_in = b + Σ_i x_i w_i
• The output is calculated by applying the activation function over the net input:
y = f(y_in)
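A minimal sketch of this general model in Python (the helper names net_input and binary_step are illustrative, not from the slides; the numbers match the hand-worked example on a later slide):

```python
def net_input(x, w, b=0.0):
    """Net input y_in = b + sum_i x_i * w_i."""
    return b + sum(xi * wi for xi, wi in zip(x, w))

def binary_step(y_in, theta=0.0):
    """Binary step activation: 1 if y_in >= theta, else 0."""
    return 1 if y_in >= theta else 0

y_in = net_input([0.1, 0.6, 0.3], [0.3, 0.2, -0.4])
print(round(y_in, 4))     # 0.03
print(binary_step(y_in))  # 1
```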
36. ANN - Building Blocks
37. CLASSIFICATIONS OF ANN
• Based on the architecture
➢Feed Forward Neural Network (FFNN)
➢Feed Back Neural Network (FBNN)
➢Recurrent Neural Network (RNN)
➢Competitive Neural Network (CNN)
• Based on the learning algorithm
➢Supervised Learning
➢Unsupervised Learning
➢Reinforcement Learning
38. Activation Functions
➢Activation functions are mathematical equations, i.e. non-linear transformations attached to each neuron in the network, which determine whether the neuron should be activated ("fired") or not by calculating the weighted sum and further adding a bias to it.
➢The purpose of the activation function is to introduce non-linearity into the output of a neuron.
➢Activation functions also help normalize the output of each neuron to a range between 0 and 1 or between -1 and 1.
➢The activation function applies a non-linear transformation to the input, making the network capable of learning and performing more complex tasks.
39. Activation Functions
➢Linear activation (identity) function: f(x) = x
➢Sigmoid activation functions
• Binary sigmoidal function: f(x) = 1 / (1 + e^(-λx)), output in (0, 1)
• Bipolar sigmoidal function: f(x) = (1 - e^(-λx)) / (1 + e^(-λx)), output in (-1, 1)
➢Binary step activation function: f(x) = 1 if x ≥ θ, else 0
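As a quick sketch, the four functions above can be written directly in Python (λ is the steepness parameter, taken as 1 here; these are the standard textbook forms, stated as an assumption since the slide gives only the names):

```python
import math

def identity(x):                  # linear / identity: f(x) = x
    return x

def binary_sigmoid(x, lam=1.0):   # f(x) = 1 / (1 + e^(-lam*x)), range (0, 1)
    return 1.0 / (1.0 + math.exp(-lam * x))

def bipolar_sigmoid(x, lam=1.0):  # f(x) = (1 - e^(-lam*x)) / (1 + e^(-lam*x)), range (-1, 1)
    e = math.exp(-lam * x)
    return (1.0 - e) / (1.0 + e)

def binary_step(x, theta=0.0):    # f(x) = 1 if x >= theta, else 0
    return 1 if x >= theta else 0

print(round(binary_sigmoid(0.53), 2))  # 0.63, as used in a later worked example
```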
40. ANN MODELS
41. ANN Models
Models
➢McCulloch and Pitts Neuron
➢Hebb Network
➢Perceptron Network
➢Linear Separability
Insight
➢Architecture
➢Net Input Calculation
➢Output Calculation
➢Weight Updation - Learning
42. McCulloch and Pitts Neuron
➢ Usually called the M-P neuron or Threshold Logic Unit (gate)
➢ Activation is a binary step function:
f(y_in) = 1 if y_in ≥ θ, else 0
➢ Widely used in designing logic functions
➢ Simply classifies the set of inputs into two different classes
➢ Net input without bias: y_in = Σ_i x_i w_i
➢ Net input with bias: y_in = b + Σ_i x_i w_i
➢ The bias b is used to adjust the output along with the weighted sum of the inputs to the neuron
➢ b is a constant that helps the model fit best for the given data
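A small sketch of an M-P neuron in Python (the function name and structure are illustrative):

```python
def mp_neuron(x, w, theta, b=0.0):
    """McCulloch-Pitts neuron: fires (1) iff b + sum_i x_i*w_i >= theta."""
    y_in = b + sum(xi * wi for xi, wi in zip(x, w))
    return 1 if y_in >= theta else 0

print(mp_neuron([1, 0], [1, 1], theta=2))  # 0: net input 1 is below the threshold
print(mp_neuron([1, 1], [1, 1], theta=2))  # 1: net input 2 reaches the threshold
```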
43. Hand-Worked Example - M-P Neuron
Calculation of the net input without bias:
inputs [x1, x2, x3] = [0.1, 0.6, 0.3]
weights [w1, w2, w3] = [0.3, 0.2, -0.4]
y_in = x·wᵀ = x1w1 + x2w2 + x3w3
= 0.1×0.3 + 0.6×0.2 + 0.3×(-0.4)
= 0.03 + 0.12 - 0.12
= 0.03
Calculation of the output using the binary step activation function:
y = F(y_in) = F(0.03) = 1
44. Hand-Worked Example - M-P Neuron
Calculation of the output using the binary sigmoidal function, with bias b = 0.5:
X = [x1, x2, x3] = [0.1, 0.6, 0.3]
W = [w1, w2, w3] = [0.3, 0.2, -0.4]
y_in = b + x·wᵀ
Assuming x0 = 1 and w0 = b:
X = [x0, x1, x2, x3], W = [w0, w1, w2, w3]
y_in = x·wᵀ = x0w0 + x1w1 + x2w2 + x3w3
= 1×0.5 + 0.1×0.3 + 0.6×0.2 + 0.3×(-0.4)
= 0.5 + 0.03 + 0.12 - 0.12
= 0.53
y = f(y_in) = 1 using the binary step function
≈ 0.63 using the binary sigmoid function
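The two hand-worked calculations can be checked with a few lines of Python (rounding only to keep the printout tidy):

```python
import math

x = [0.1, 0.6, 0.3]
w = [0.3, 0.2, -0.4]
b = 0.5

y_in = sum(xi * wi for xi, wi in zip(x, w))   # without bias
print(round(y_in, 4))                         # 0.03  (slide 43)
y_in_b = b + y_in                             # with bias
print(round(y_in_b, 4))                       # 0.53  (slide 44)
print(1 if y_in_b >= 0 else 0)                # binary step   -> 1
print(round(1 / (1 + math.exp(-y_in_b)), 2))  # binary sigmoid -> 0.63
```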
45. Implementation of the AND Function
Truth table:
x1 x2 | y
 1  1 | 1
 1  0 | 0
 0  1 | 0
 0  0 | 0
Assume initial weights w1 = w2 = 1.
For inputs:
➢ (1,1) → y_in = x1w1 + x2w2 = 2
➢ (1,0) → 1
➢ (0,1) → 1
➢ (0,0) → 0
➢ Assume threshold value θ = 2:
y = f(y_in) = 1 if y_in ≥ 2, else 0
46. Implementation of the OR Function
Truth table:
x1 x2 | y
 1  1 | 1
 1  0 | 1
 0  1 | 1
 0  0 | 0
Assume initial weights w1 = w2 = 1 and bias b = 0.5.
For inputs:
➢ (1,1) → y_in = x1w1 + x2w2 + b = 2.5
➢ (1,0) → 1.5
➢ (0,1) → 1.5
➢ (0,0) → 0.5
➢ Assume threshold value θ = 1.5:
y = f(y_in) = 1 if y_in ≥ 1.5, else 0
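Both gates drop straight out of the M-P neuron sketched earlier; this self-contained snippet (the helper name mp_fire is an assumption) reproduces the two tables above:

```python
def mp_fire(x, w, theta, b=0.0):
    """M-P neuron: output 1 iff b + sum_i x_i*w_i >= theta."""
    return 1 if b + sum(xi * wi for xi, wi in zip(x, w)) >= theta else 0

print("x1 x2 | AND OR")
for x in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    and_y = mp_fire(x, (1, 1), theta=2)           # w1 = w2 = 1, theta = 2
    or_y  = mp_fire(x, (1, 1), theta=1.5, b=0.5)  # w1 = w2 = 1, b = 0.5, theta = 1.5
    print(x[0], x[1], "|", and_y, " ", or_y)
```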
47. Hebb Network
➢ Hebb observed that learning in the human brain takes place through changes in the synaptic gap.
➢ The weight vector is found to increase proportionately to the product of the input and the output.
➢ Weight and bias adjustment:
w_i(new) = w_i(old) + x_i·y
b(new) = b(old) + y
➢ Change in weight: Δw_i = x_i·y
➢ The activation function is the identity function: f(y_in) = y_in
➢ More suited to bipolar data
➢ Used for pattern association, classification and clustering
48. Training Steps
1. Initially, the weights are set to zero, i.e. w_i = 0 for all inputs i = 1 to n, where n is the total number of input neurons.
2. The activation function for the inputs is generally set as the identity function.
3. The activation function for the output is also set so that y = t.
4. The weights and bias are adjusted as:
w_i(new) = w_i(old) + x_i·y
b(new) = b(old) + y
5. Steps 2 to 4 are repeated for each input vector and output.
49. Implementation of the AND Function
Training data → truth table of the AND function, converted to bipolar form (1 → 1 and 0 → -1):
x1 x2 | y
 1  1 | 1
 1  0 | 0
 0  1 | 0
 0  0 | 0
➢ Initially the weights are set to zero: w1 = w2 = b = 0
➢ Present the first set of inputs and apply the Hebb rule
[x1 x2 x0] = [1 1 1] and y = [1]
w_i(new) = w_i(old) + x_i·y
• w1(new) = w1(old) + x1·y → 0 + 1×1 = 1
• w2(new) = w2(old) + x2·y → 0 + 1×1 = 1
• b(new) = b(old) + y → 0 + 1 = 1
➢ Change in weight
• Δw_i = x_i·y
• Δw1 = x1·y → 1×1 = 1
• Δw2 = x2·y → 1×1 = 1
• Δb = y = 1
50. Implementation of the AND Function
➢ Present the second set of inputs and apply the Hebb rule
– [x1 x2 x0] = [1 -1 1] and y = [-1]
– w_i(new) = w_i(old) + x_i·y
• w1(new) = w1(old) + x1·y → 1 + 1×(-1) = 0
• w2(new) = w2(old) + x2·y → 1 + (-1)×(-1) = 2
• b(new) = b(old) + y → 1 + (-1) = 0
➢ Change in weight
– Δw_i = x_i·y
• Δw1 = x1·y → 1×(-1) = -1
• Δw2 = x2·y → (-1)×(-1) = 1
• Δb = y = -1
Full training pass (weights start at 0):
x1  x2  x0 |  y | Δw1 Δw2 Δb | w1 w2  b
 1   1   1 |  1 |  1   1   1 |  1  1   1
 1  -1   1 | -1 | -1   1  -1 |  0  2   0
-1   1   1 | -1 |  1  -1  -1 |  1  1  -1
-1  -1   1 | -1 |  1   1  -1 |  2  2  -2
Hebb net for the AND function: final weights w1 = 2, w2 = 2, b = -2
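A few lines of Python reproduce the table row by row (a sketch; variable names are illustrative):

```python
# Hebb rule on bipolar AND data: w_i(new) = w_i(old) + x_i*y, b(new) = b(old) + y
samples = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]

w1 = w2 = b = 0.0
for (x1, x2), y in samples:
    w1 += x1 * y   # delta_w1 = x1 * y
    w2 += x2 * y   # delta_w2 = x2 * y
    b  += y        # delta_b  = y
    print(x1, x2, y, "->", w1, w2, b)
# Last line printed matches the table: final weights 2.0 2.0 -2.0
```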
51. Perceptron Network
➢Perceptron networks are single-layer feed-forward networks, introduced by Rosenblatt.
➢The perceptron consists of an input layer, a hidden (association) layer, and an output layer.
➢The input layer is connected to the hidden layer through weights which may be inhibitory, excitatory or zero (-1, +1 or 0).
➢The activation function used for the input layer and the hidden layer is a binary step function.
➢The output is y = f(y_in), with activation function
F(y) = 1 if y > θ
     = 0 if -θ ≤ y ≤ θ
     = -1 if y < -θ
where θ is the threshold.
52. Perceptron Learning Rule
➢ Weight updation takes place between the hidden layer and the output layer to match the target output.
➢ The error is calculated from the actual output and the desired output.
➢ If the output matches the target, no weight updation takes place.
➢ The weights in the network can initially be set to any values.
➢ Perceptron learning will converge to a weight vector that gives the correct output for all input training patterns (provided such a vector exists, i.e. the patterns are linearly separable), and this learning happens in a finite number of steps.
➢ The perceptron rule can be used for both binary and bipolar inputs.
53. Training Steps
➢ Let there be n training input vectors x(n), with t(n) the associated target values.
➢ Initialize the weights and bias to zero for easy calculation, and let the learning rate be 1.
➢ The input layer has the identity activation function, so x(i) = y(i).
➢ To calculate the output of the network:
•Calculate the net input to the output neuron
•Apply the activation function over the net input
➢ Now, based on the output y, compare the desired target value t and the actual output.
➢ Update the weights and bias if y ≠ t.
➢ Continue the iteration until there is no weight change; stop once this condition is achieved.
Weight updation:
if output y ≠ target t, then
w(new) = w(old) + t·x
b(new) = b(old) + t
else
w(new) = w(old)
b(new) = b(old)
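A compact sketch of this training loop in Python, assuming bipolar inputs and targets and the three-valued activation from the previous slide with θ = 0.2 (the slides leave these choices open):

```python
def activation(y_in, theta=0.2):
    """Perceptron activation: 1 above theta, -1 below -theta, else 0."""
    return 1 if y_in > theta else (-1 if y_in < -theta else 0)

def train_perceptron(samples, theta=0.2, alpha=1.0):
    w1 = w2 = b = 0.0
    while True:                          # one pass over all patterns = one epoch
        changed = False
        for (x1, x2), t in samples:
            y = activation(b + x1 * w1 + x2 * w2, theta)
            if y != t:                   # update only when output misses target
                w1 += alpha * t * x1
                w2 += alpha * t * x2
                b  += alpha * t
                changed = True
        if not changed:                  # convergence is guaranteed only for
            return w1, w2, b             # linearly separable patterns

# Bipolar AND: converges in a couple of epochs, e.g. to (1, 1, -1)
and_data = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
print(train_perceptron(and_data))
```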
54. Implementation of the AND Function
[Worked table with columns: Inputs (x1, x2) | Bias | Target t | Net input y_in | Output y | Weight changes (Δw1, Δw2, Δb) | New weights (w1, w2, b)]
An epoch is one full cycle of the input patterns fed to the system; epochs are repeated until no weight change is required and the iteration stops.
55. Linear Separability
• Linear separability is the concept of separating the input data into classes by means of a straight line, called the decision line (or decision boundary).
• An ANN with input and output nodes alone can achieve this only when the given problem is linear; otherwise it is not possible.
• But most real-world problems are non-linear in nature.
• Non-linear problems can be solved by introducing one or more hidden layers between the input and output layers.
[Figure: data points in the (x1, x2) plane separated by a decision line after training by the neural network.]
56. Linear Separability: Illustrative Example
'OR' gate:
x1 x2 | y
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 1
'AND' gate:
x1 x2 | y
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1
The 'OR' gate and 'AND' gate are LINEARLY SEPARABLE: in the (x1, x2) plane, a single straight line separates the logic-1 outputs from the logic-0 outputs.
'XOR' gate:
x1 x2 | y
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 0
The 'XOR' gate is NON-LINEAR: no single straight line separates its logic-1 outputs from its logic-0 outputs.
[Figure: each gate plotted in the (x1, x2) plane, with logic-1 and logic-0 outputs marked.]
NOTE: Most data from real-world problems is non-linear.
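The separability claim can be tested empirically: the perceptron rule from the earlier slides settles on AND but never stops updating on XOR. A sketch (the bipolar encoding, θ = 0.2 and the 100-epoch cap are assumptions):

```python
def step(y_in, theta=0.2):
    return 1 if y_in > theta else (-1 if y_in < -theta else 0)

def converges(samples, max_epochs=100):
    """Run the perceptron rule; report whether some epoch makes no updates."""
    w1 = w2 = b = 0.0
    for _ in range(max_epochs):
        changed = False
        for (x1, x2), t in samples:
            if step(b + x1 * w1 + x2 * w2) != t:
                w1 += t * x1; w2 += t * x2; b += t
                changed = True
        if not changed:
            return True
    return False

and_data = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
xor_data = [((1, 1), -1), ((1, -1), 1), ((-1, 1), 1), ((-1, -1), -1)]
print(converges(and_data))  # True:  linearly separable
print(converges(xor_data))  # False: no single line separates XOR
```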
58. Shallow Neural Networks
• A neural network with one hidden layer is considered
a shallow neural network whereas a network with
many hidden layers and a large number of neurons in
each layer is considered a deep neural network.
• A “shallow” neural network has only three layers of
neurons:
➢An input layer that accepts the independent
variables or inputs of the model
➢One hidden layer
➢An output layer that generates predictions
59. Shallow Machine Learning
• Feature extraction in shallow machine learning is a manual process that requires domain knowledge of the data that we are learning from.
• In other words, "shallow learning" is a type of machine learning where we learn from data described by pre-defined features.
60. Shallow Neural Networks
• Multilayer Perceptron Network (MLPN)
• Radial Basis Function Network (RBFN)
62. We will introduce the MLP and the backpropagation algorithm which is used to train it.
"MLP" is used to describe any general feedforward network (one with no recurrent connections).
However, we will concentrate on nets with units arranged in layers.
[Figure: a layered feedforward network with inputs x1 ... xn.]
63. Different books refer to the above as either a 4-layer network (counting layers of neurons) or a 3-layer network (counting layers of adaptive weights). We will follow the latter convention.
1st question: what do the extra layers gain you? Start by looking at what a single layer can't do.
64. Perceptron Learning Theorem
• Recap: A perceptron (threshold unit) can
learn anything that it can represent (i.e.
anything separable with a hyperplane)
65. The Exclusive OR problem
A Perceptron cannot represent Exclusive OR
since it is not linearly separable.
67. Minsky & Papert (1969) offered a solution to the XOR problem by combining perceptron unit responses using a second layer of units: piecewise linear classification using an MLP with threshold (perceptron) units.
[Figure: two first-layer perceptrons (1 and 2) feeding, each with weight +1, a second-layer perceptron (3).]
72. Properties of architecture
• No connections within a layer
• No direct connections between input and output layers
• Fully connected between layers
• Often more than 3 layers
• Number of output units need not equal number of input units
• Number of hidden units per layer can be more or less than input or output units
Each unit is a perceptron:
y_i = f( Σ_{j=1..m} w_ij · x_j + b_i )
Often the bias is included as an extra weight.
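The per-unit equation above translates directly into a layer-wise computation; a minimal sketch (the sigmoid is chosen here as f, and all names and weights are illustrative):

```python
import math

def layer_forward(x, W, b, f=lambda s: 1.0 / (1.0 + math.exp(-s))):
    """y_i = f( sum_j w_ij * x_j + b_i ) for every unit i in a layer."""
    return [f(sum(wij * xj for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

# Two inputs feeding three hidden units
W = [[0.2, -0.4], [0.7, 0.1], [-0.5, 0.3]]
b = [0.0, 0.1, -0.2]
print(layer_forward([1.0, 0.5], W, b))
```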
73. What does each of the layers do?
1st layer draws
linear boundaries
2nd layer combines
the boundaries
3rd layer can generate
arbitrarily complex
boundaries
74. Backpropagation Learning Algorithm 'BP'
A solution to the credit assignment problem in the MLP: Rumelhart, Hinton and Williams (1986) (though actually invented earlier, in a PhD thesis relating to economics).
BP has two phases:
Forward pass phase: computes the 'functional signal', the feedforward propagation of input pattern signals through the network.
Backward pass phase: computes the 'error signal', propagating the error backwards through the network starting at the output units (where the error is the difference between actual and desired output values).
75. Conceptually: Forward Activity -
Backward Error
77. MLP – with Single Hidden Layer
https://www.cse.unsw.edu.au/~cs9417ml/MLP2/BackPropagation.html
https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
78. Forward Propagation of Activity
• Step 1: Initialize the weights at random; choose a learning rate η
• Until the network is trained, for each training example, i.e. input pattern and target output(s):
• Step 2: Do a forward pass through the net (with fixed weights) to produce the output(s)
– i.e., in the forward direction, layer by layer:
• Inputs applied
• Multiplied by weights
• Summed
• Squashed by the sigmoid activation function
• Output passed to each neuron in the next layer
– Repeat the above until the network output(s) are produced
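Step 2 as code: chaining the layer computation gives the whole forward pass. This sketch keeps every layer's output, since the backward pass (next slide) needs them; all names and weights are illustrative:

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def forward(x, layers):
    """layers is a list of (W, b) pairs; propagate layer by layer."""
    activations = [x]
    for W, b in layers:
        # multiply by weights, sum, add bias, squash with the sigmoid
        x = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bi)
             for row, bi in zip(W, b)]
        activations.append(x)
    return activations   # every layer's output, for use in backprop

# 2 inputs -> 2 hidden -> 1 output
net = [([[0.5, -0.5], [0.3, 0.8]], [0.1, -0.1]),
       ([[1.0, -1.0]], [0.0])]
print(forward([1.0, 0.0], net))
```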
79. Step 3: Backpropagation of Error
• Compute the error (delta, or local gradient) δ_k for each output unit
• Layer by layer, compute the error (delta, or local gradient) δ_j for each hidden unit by backpropagating the errors (as shown previously)
Step 4: Next, update all the weights by Δw_ij using gradient descent, and go back to Step 2.
– The overall MLP learning algorithm, involving the forward pass and backpropagation of error (until the network training completes), is known as the Generalised Delta Rule (GDR) or, more commonly, the Back Propagation (BP) algorithm.
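Putting Steps 2-4 together, here is a sketch of one BP training step for a single-hidden-layer net with sigmoid units and squared-error loss. The delta formulas are the standard GDR ones; the layer sizes, η, random seed and the XOR task are assumptions for illustration:

```python
import math, random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def train_step(x, t, W1, b1, W2, b2, eta=0.5):
    # forward pass: input -> hidden -> output
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj) for row, bj in zip(W1, b1)]
    y = [sigmoid(sum(w * hj for w, hj in zip(row, h)) + bk) for row, bk in zip(W2, b2)]
    # backward pass: output deltas, then hidden deltas (using the old weights)
    d_out = [(tk - yk) * yk * (1 - yk) for tk, yk in zip(t, y)]
    d_hid = [sum(dk * W2[k][j] for k, dk in enumerate(d_out)) * h[j] * (1 - h[j])
             for j in range(len(h))]
    # gradient-descent updates: w += eta * delta * (input to that weight)
    for k, dk in enumerate(d_out):
        for j in range(len(h)):
            W2[k][j] += eta * dk * h[j]
        b2[k] += eta * dk
    for j, dj in enumerate(d_hid):
        for i in range(len(x)):
            W1[j][i] += eta * dj * x[i]
        b1[j] += eta * dj

# Illustration: a 2-4-1 net on XOR; the error usually falls steadily,
# though convergence depends on the random initialization.
random.seed(1)
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(4)]
b1 = [0.0] * 4
W2 = [[random.uniform(-1, 1) for _ in range(4)]]
b2 = [0.0]
data = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]
for _ in range(5000):
    for x, t in data:
        train_step(x, t, W1, b1, W2, b2)
for x, t in data:
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj) for row, bj in zip(W1, b1)]
    y = sigmoid(sum(w * hj for w, hj in zip(W2[0], h)) + b2[0])
    print(x, t, round(y, 2))
```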
80. Back Propagation Algorithm Summary
81. MLP/BP: A worked example
82. Worked example: Forward Pass
83. Worked example: Forward Pass
84. Worked example: Backward Pass
85. Worked example: Update Weights
Using Generalized Delta Rule (BP)
86. Similarly for all the weights w_ij:
87. Verification that it works
88. Training
• This was a single iteration of backprop
• Training requires many iterations over many training examples, i.e. epochs (one epoch is one entire presentation of the complete training set)
• It can be slow!
• Note that computation in the MLP is local (with respect to each neuron)
• Parallel implementation of the computation is also possible
89. Training and testing data
• How many examples?
– The more the merrier!
• Disjoint training and testing data sets
– learn from the training data, but evaluate performance (generalization ability) on unseen test data
• Aim: minimize the error on the test data