Today’s Lecture?
PATTERN RECOGNITION
A GENTLE INTRODUCTION TO
ARTIFICIAL NEURAL NETWORKS
(PART I)
WHAT IS IT?
 An artificial neural network is a crude attempt to simulate the
human brain digitally
 Human brain – Approx 10 billion neurons
 Each neuron connected with thousands of others
 Parts of neuron
 Cell body
 Dendrites – receive input signal
 Axons – Give output
INTRODUCTION
 ANN – made up of artificial neurons
 Digitally modeled biological neuron
 Each input into the neuron has its own weight
associated with it
 As each input enters the nucleus (blue circle) it's
multiplied by its weight.
INTRODUCTION
 The nucleus sums all these new input values which
gives us the activation
 For n inputs and n weights – weights multiplied by
input and summed
a = x1w1+x2w2+x3w3... +xnwn
INTRODUCTION
 If the activation is greater than a threshold value, the
neuron outputs a signal (for example, 1)
 If the activation is less than the threshold, the neuron outputs
zero.
 This is typically called a step function
INTRODUCTION
 The combination of summation and thresholding is
called a node
 For step (activation) function – The output is 1 if:
http://www-cse.uta.edu/~cook/ai1/lectures/figures/neuron.jpg
x1w1+x2w2+x3w3... +xnwn > T
INTRODUCTION
x1w1+x2w2+x3w3... +xnwn > T
x1w1+x2w2+x3w3... +xnwn -T > 0
Let w0 = -T and x0 = 1
D = x0w0 + x1w1+x2w2+x3w3... +xnwn > 0
Output is 1 if D> 0;
Output is 0 otherwise
w0 is called a bias weight
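A minimal Python sketch (illustrative only, not part of the slides) of a single artificial neuron with a step activation, showing that a threshold T and a bias weight w0 = -T give the same decision; the example inputs and weights are made up:

def neuron_step(x, w, T):
    # activation = weighted sum of the inputs
    a = sum(xi * wi for xi, wi in zip(x, w))
    return 1 if a > T else 0          # step function against the threshold T

def neuron_bias(x, w, w0):
    # fold the threshold into a bias weight: x0 = 1, w0 = -T
    D = w0 + sum(xi * wi for xi, wi in zip(x, w))
    return 1 if D > 0 else 0

x, w = [1, 0, 1], [0.4, 0.9, 0.3]
print(neuron_step(x, w, T=0.5))       # -> 1, since 0.4 + 0.3 = 0.7 > 0.5
print(neuron_bias(x, w, w0=-0.5))     # -> 1, same decision with a bias weight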
TYPICAL ACTIVATION FUNCTIONS
Step function:    Y_step = 1 if X ≥ 0;  Y_step = 0 if X < 0
Sign function:    Y_sign = +1 if X ≥ 0;  Y_sign = −1 if X < 0
Sigmoid function: Y_sigmoid = 1 / (1 + e^(−X))
Linear function:  Y_linear = X
(Figure: plots of the four activation functions, output Y against input X)
Controls when unit is “active” or “inactive”
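The four activation functions above can be written directly in Python; a small sketch (the sample inputs are chosen only for illustration):

import math

def step(x):    return 1 if x >= 0 else 0
def sign(x):    return 1 if x >= 0 else -1
def sigmoid(x): return 1.0 / (1.0 + math.exp(-x))
def linear(x):  return x

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(x, step(x), sign(x), round(sigmoid(x), 3), linear(x))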
AN ARTIFICIAL NEURON- SUMMARY SO FAR
 Receives n-inputs
 Multiplies each input by
its weight
 Applies activation
function to the sum of
results
 Outputs result
http://www-cse.uta.edu/~cook/ai1/lectures/figures/neuron.jpg
SIMPLEST CLASSIFIER
Can a single neuron learn a task?
A MOTIVATING EXAMPLE
 Each day you get lunch at the cafeteria.
 Your diet consists of fish, chips, and drink.
 You get several portions of each
 The cashier only tells you the total price of the meal
 After several days, you should be able to figure out the price of each
portion.
 Each meal price gives a linear constraint on the prices of the
portions:
price = x_fish·w_fish + x_chips·w_chips + x_drink·w_drink
SOLVING THE PROBLEM
 The prices of the portions are like the weights of a linear neuron:
w = (w_fish, w_chips, w_drink)
 We will start with guesses for the weights and then adjust the
guesses to give a better fit to the prices given by the cashier.
THE CASHIER’S BRAIN
Price of meal = 850
Portions bought: 2 portions of fish, 5 portions of chips, 3 portions of drink
True prices (the weights of the linear neuron): fish = 150, chips = 50, drink = 100
A MODEL OF THE CASHIER’S BRAIN
WITH ARBITRARY INITIAL WEIGHTS
Initial guesses: fish = 50, chips = 50, drink = 50
Estimated price of meal = 2×50 + 5×50 + 3×50 = 500
 Residual error = 850 − 500 = 350
 Apply learning rules and
update weights
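A hedged Python sketch of this idea: a linear neuron whose weights (the portion prices) are nudged after each meal in proportion to the residual error. The meal history and learning rate below are invented for illustration; only the first meal (2, 5, 3) at price 850 comes from the slides.

true_w = [150.0, 50.0, 100.0]                      # fish, chips, drink (unknown to the learner)
meals  = [[2, 5, 3], [1, 3, 2], [4, 1, 1], [3, 2, 5], [2, 2, 2]]   # assumed shopping history
prices = [sum(p * w for p, w in zip(m, true_w)) for m in meals]    # totals quoted by the cashier

w = [50.0, 50.0, 50.0]                             # arbitrary initial guesses
eta = 0.001                                        # small learning rate (assumed value)
for epoch in range(2000):
    for x, t in zip(meals, prices):
        o = sum(xi * wi for xi, wi in zip(x, w))   # predicted meal price
        w = [wi + eta * (t - o) * xi for wi, xi in zip(w, x)]   # adjust toward the quoted price

print([round(wi, 1) for wi in w])                  # approaches [150.0, 50.0, 100.0]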
PERCEPTRON
 In 1958, Frank Rosenblatt introduced a training
algorithm that provided the first procedure for
training a simple ANN: a perceptron.
(Figure: a two-input perceptron. Inputs x1 and x2, with weights w1 and w2, feed a
linear combiner; a hard limiter with threshold θ produces the output Y.)
PERCEPTRON
 A perceptron takes several inputs, x1, x2, ……, and
produces a single binary output.
 The model consists of a linear combiner followed by a hard
limiter.
 The weighted sum of the inputs is applied to the hard limiter,
which produces an output equal to +1 if its input is positive
and -1 if it is negative. (1/0 in some models).
y = sgn( Σi=1..2 wi xi + θ )
sgn(s) = 1 if s > 0;  −1 otherwise
PERCEPTRON
Example: a two-input perceptron with bias weight −10 (on a constant input of 1),
w1 = 1 and w2 = 4.
This is the equation of a line, the decision boundary:
−10 + 1𝑥1 + 4𝑥2 = 0
𝑥1 = −4𝑥2 + 10
PERCEPTRON
This is the equation of a line, the decision boundary (shown plotted in the figure):
𝑥1 = −4𝑥2 + 10
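A small Python sketch of this perceptron and its decision boundary (the test points are invented for illustration):

def perceptron(x1, x2, w0=-10.0, w1=1.0, w2=4.0):
    s = w0 + w1 * x1 + w2 * x2        # linear combination plus bias weight
    return 1 if s > 0 else -1         # hard limiter (sgn)

print(perceptron(12, 1))   # +1: the point (12, 1) lies on the x1 > -4*x2 + 10 side
print(perceptron(2, 1))    # -1: the point (2, 1) lies on the other side
print(perceptron(6, 1))    # -1: (6, 1) lies exactly on the boundary; sgn(0) = -1 here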
PERCEPTRON LEARNING
 A perceptron (threshold unit) can learn anything
that it can represent (i.e. anything separable with a
hyperplane)
X1 X2 Y
0 0 0
0 1 1
1 0 1
1 1 1
OR FUNCTION
 The two-input perceptron can implement the OR function when we set the
weights: w0 = -0.3, w1 = w2 = 0.5
Decision hyperplane:
w0 + w1 x1 + w2 x2 = 0
-0.3 + 0.5 x1 + 0.5 x2 = 0
(Figure: the line -0.3 + 0.5 x1 + 0.5 x2 = 0 in the (x1, x2) plane, with the single
negative example marked - and the three positive examples marked +)
Training Data
X1  X2   Y
0   0   -1
0   1   +1
1   0   +1
1   1   +1
OR FUNCTION
Decision hyperplane:
w0 + w1 x1 + w2 x2 = 0
-0.3 + 0.5 x1 + 0.5 x2 = 0
Test Results
X1  X2   Σ wi xi    Y
0   0    -0.3      -1
0   1     0.2      +1
1   0     0.2      +1
1   1     0.7      +1
A SINGLE PERCEPTRON CAN BE USED TO REPRESENT MANY
BOOLEAN FUNCTIONS.
 AND FUNCTION:
Decision hyperplane:
w0 + w1 x1 + w2 x2 = 0
-0.8 + 0.5 x1 + 0.5 x2 = 0
(Figure: the line -0.8 + 0.5 x1 + 0.5 x2 = 0 separating the single positive example
from the three negative examples)
Training Examples
X1  X2   Y
0   0   -1
0   1   -1
1   0   -1
1   1   +1
Test Results
X1  X2   Σ wi xi    Y
0   0    -0.8      -1
0   1    -0.3      -1
1   0    -0.3      -1
1   1     0.2      +1
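A short Python check (a sketch, not from the slides) that these weight settings reproduce the OR and AND truth tables:

def perceptron(x, w0, w1, w2):
    s = w0 + w1 * x[0] + w2 * x[1]
    return +1 if s > 0 else -1

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
for name, (w0, w1, w2), targets in [
        ("OR",  (-0.3, 0.5, 0.5), [-1, +1, +1, +1]),
        ("AND", (-0.8, 0.5, 0.5), [-1, -1, -1, +1])]:
    outputs = [perceptron(x, w0, w1, w2) for x in inputs]
    print(name, outputs, outputs == targets)       # True for both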
XOR FUNCTION
 A Perceptron cannot represent Exclusive OR since
it is not linearly separable.
X1 X2 Y
0 0 -1
0 1 +1
1 0 +1
1 1 -1
XOR Function
XOR FUNCTION :
It is impossible to implement the XOR function by a
single perceptron
Two perceptrons?
X1 X2 Y
0 0 -1
0 1 +1
1 0 +1
1 1 -1
XOR Function
2D PLOT OF BASIC LOGICAL OPERATORS
(Figure: the four input points (0,0), (0,1), (1,0) and (1,1) in the (x1, x2) plane for
(a) AND (x1 ∧ x2), (b) OR (x1 ∨ x2), and (c) Exclusive-OR (x1 ⊕ x2).)
A perceptron can learn the operations
AND and OR, but not Exclusive-OR.
PERCEPTRON
 The aim of the perceptron is to classify inputs, x1,
x2, . . ., xn, into one of two classes, say A1 and A2.
 In the case of an elementary perceptron, the n-
dimensional space is divided by a hyperplane into
two decision regions. The hyperplane is defined by
the function:
x1w1 + x2w2 + . . . + xnwn − θ = 0
LINEAR SEPARABILITY WITH PERCEPTRON
(a) Two-input perceptron: the line x1w1 + x2w2 − θ = 0 divides the (x1, x2) plane
into decision regions for Class A1 and Class A2.
(b) Three-input perceptron: the plane x1w1 + x2w2 + x3w3 − θ = 0 divides the
(x1, x2, x3) space into the two classes.
Perceptron Learning
GRADIENT DESCENT
 Error Surface
 Use gradient descent to find the minimum value of E
TRAINING RULE DERIVATION – GRADIENT
DESCENT
 Objective: Find the values of weights which minimize
the error function
O(d) is the observed and T(d) is the target output for training example ‘d’
E = (1/2) Σd ( T(d) − O(d) )²      (summed over the m training examples d)
O(d) = w0 + w1 x1(d) + w2 x2(d) + . . . + wn xn(d)
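Differentiating E with respect to each weight gives ∂E/∂wi = −Σd (T(d) − O(d)) xi(d), so taking a small step down the gradient means adding Δwi = η Σd (T(d) − O(d)) xi(d) to each weight; this is exactly the update used in the algorithm below.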
BATCH GRADIENT DESCENT
Gradient-Descent(training_examples, η)
  Each training example is a pair of the form <(x1,…,xn), t>, where (x1,…,xn) is the vector
  of input values and t is the target output value; η is the learning rate (e.g. 0.1)
  Initialize each wi to some small random value
  Until the termination condition is met, Do
    Initialize each Δwi to zero
    For each <(x1,…,xn), t> in training_examples, Do
      Input the instance (x1,…,xn) to the linear unit and compute the output o
      For each linear unit weight wi, Do
        Δwi = Δwi + η (t − o) xi
    For each linear unit weight wi, Do
      wi = wi + Δwi
That is, the batch update over the whole training set D is  Δwi = η Σd∈D (td − od) xid
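A runnable Python sketch of this batch procedure for a linear unit (the AND data and epoch count below are chosen only for illustration; the pseudocode initializes the weights to small random values, here zeros for reproducibility):

def batch_gradient_descent(training_examples, eta=0.1, epochs=200):
    n = len(training_examples[0][0])
    w = [0.0] * n
    for _ in range(epochs):
        delta = [0.0] * n                                # accumulated weight changes
        for x, t in training_examples:
            o = sum(wi * xi for wi, xi in zip(w, x))     # linear unit output
            for i in range(n):
                delta[i] += eta * (t - o) * x[i]         # sum the gradient over all examples
        w = [wi + d for wi, d in zip(w, delta)]          # one update per pass over the data
    return w

# x0 = 1 plays the role of the bias input; the targets are the AND outputs
data = [((1, 0, 0), 0), ((1, 0, 1), 0), ((1, 1, 0), 0), ((1, 1, 1), 1)]
print([round(wi, 2) for wi in batch_gradient_descent(data)])   # near the least-squares fit (-0.25, 0.5, 0.5)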
INCREMENTAL GRADIENT DESCENT
 The gradient descent training rule updates the weights after summing over
all the training examples
 Stochastic gradient descent approximates gradient descent by
updating the weights incrementally
 Calculate the error and update after each example
INCREMENTAL GRADIENT DESCENT
Gradient-Descent(training_examples, η)
  Each training example is a pair of the form <(x1,…,xn), t>, where (x1,…,xn) is
  the vector of input values, and t is the target output value; η is the learning
  rate (e.g. 0.1)
  Initialize each wi to some small random value
  Until the termination condition is met, Do
    For each <(x1,…,xn), t> in training_examples, Do
      Input the instance (x1,…,xn) to the linear unit and compute the output o
      For each linear unit weight wi, Do
        wi = wi + η (t − o) xi      (the weights change after every example)
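The incremental (stochastic) variant in Python is the same loop with the weight change applied immediately after each example; a short sketch on the same illustrative AND data as above:

def incremental_gradient_descent(training_examples, eta=0.1, epochs=200):
    n = len(training_examples[0][0])
    w = [0.0] * n
    for _ in range(epochs):
        for x, t in training_examples:
            o = sum(wi * xi for wi, xi in zip(w, x))
            w = [wi + eta * (t - o) * xi for wi, xi in zip(w, x)]   # update per example
    return w

data = [((1, 0, 0), 0), ((1, 0, 1), 0), ((1, 1, 0), 0), ((1, 1, 1), 1)]
print([round(wi, 2) for wi in incremental_gradient_descent(data)])
# close to the batch result, with a small wobble because updates are per-example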
GRADIENT DESCENT ALGORITHM
PERCEPTRON LEARNING:
LOGICAL OPERATION AND
Training data (logical AND), presented in the same order in every epoch:
Inputs x1 x2: (0, 0), (0, 1), (1, 0), (1, 1)
Desired output Yd: 0, 0, 0, 1
Initial weights: w1 = 0.3, w2 = −0.1
Threshold: θ = 0.2; learning rate: η = 0.1
Decision boundary: 𝑤1𝑥1 + 𝑤2𝑥2 − 𝜃 = 0, i.e. 𝑤1𝑥1 + 𝑤2𝑥2 − 0.2 = 0
(The full epoch-by-epoch weight table is given after the worked examples below.)
Training Example 1: (x1, x2) = (0, 0), desired output t = 0
Activation: 𝑤1𝑥1 + 𝑤2𝑥2 − 0.2 = 0.3×0 − 0.1×0 − 0.2 = −0.2 < 0, so Output o = 0
Update Rule: 𝑤𝑖 = 𝑤𝑖 + 𝜂 (𝑡 − 𝑜) 𝑥𝑖
𝑤1 = 0.3 + 0.1 (0 − 0) × 0 = 0.3
𝑤2 = −0.1 + 0.1 (0 − 0) × 0 = −0.1
Training Example 2: (x1, x2) = (0, 1), desired output t = 0
Activation: 𝑤1𝑥1 + 𝑤2𝑥2 − 0.2 = 0.3×0 − 0.1×1 − 0.2 = −0.3 < 0, so Output o = 0
𝑤1 = 0.3 + 0.1 (0 − 0) × 0 = 0.3
𝑤2 = −0.1 + 0.1 (0 − 0) × 1 = −0.1
Training Example 3: (x1, x2) = (1, 0), desired output t = 0
Activation: 𝑤1𝑥1 + 𝑤2𝑥2 − 0.2 = 0.3×1 − 0.1×0 − 0.2 = 0.1 > 0, so Output o = 1
𝑤1 = 0.3 + 0.1 (0 − 1) × 1 = 0.2
𝑤2 = −0.1 + 0.1 (0 − 1) × 0 = −0.1
PERCEPTRON LEARNING:
LOGICAL OPERATION AND
Threshold: θ = 0.2; learning rate: η = 0.1

Epoch | Inputs x1 x2 | Desired Yd | Initial w1 w2 | Actual Y | Error e | Final w1 w2
  1   |     0  0     |     0      |   0.3  -0.1   |    0     |    0    |  0.3  -0.1
  1   |     0  1     |     0      |   0.3  -0.1   |    0     |    0    |  0.3  -0.1
  1   |     1  0     |     0      |   0.3  -0.1   |    1     |   -1    |  0.2  -0.1
  1   |     1  1     |     1      |   0.2  -0.1   |    0     |   +1    |  0.3   0.0
  2   |     0  0     |     0      |   0.3   0.0   |    0     |    0    |  0.3   0.0
  2   |     0  1     |     0      |   0.3   0.0   |    0     |    0    |  0.3   0.0
  2   |     1  0     |     0      |   0.3   0.0   |    1     |   -1    |  0.2   0.0
  2   |     1  1     |     1      |   0.2   0.0   |    1     |    0    |  0.2   0.0
  3   |     0  0     |     0      |   0.2   0.0   |    0     |    0    |  0.2   0.0
  3   |     0  1     |     0      |   0.2   0.0   |    0     |    0    |  0.2   0.0
  3   |     1  0     |     0      |   0.2   0.0   |    1     |   -1    |  0.1   0.0
  3   |     1  1     |     1      |   0.1   0.0   |    0     |   +1    |  0.2   0.1
  4   |     0  0     |     0      |   0.2   0.1   |    0     |    0    |  0.2   0.1
  4   |     0  1     |     0      |   0.2   0.1   |    0     |    0    |  0.2   0.1
  4   |     1  0     |     0      |   0.2   0.1   |    1     |   -1    |  0.1   0.1
  4   |     1  1     |     1      |   0.1   0.1   |    1     |    0    |  0.1   0.1
  5   |     0  0     |     0      |   0.1   0.1   |    0     |    0    |  0.1   0.1
  5   |     0  1     |     0      |   0.1   0.1   |    0     |    0    |  0.1   0.1
  5   |     1  0     |     0      |   0.1   0.1   |    0     |    0    |  0.1   0.1
  5   |     1  1     |     1      |   0.1   0.1   |    1     |    0    |  0.1   0.1
Decision boundary plotted after each weight change during training (figures):
0.3𝑥1 − 0.1𝑥2 − 0.2 = 0   (initial weights)
0.2𝑥1 − 0.1𝑥2 − 0.2 = 0
0.3𝑥1 − 0.2 = 0
0.2𝑥1 − 0.2 = 0
0.1𝑥1 − 0.2 = 0
0.2𝑥1 + 0.1𝑥2 − 0.2 = 0
0.1𝑥1 + 0.1𝑥2 − 0.2 = 0   (final weights)
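A compact Python sketch (a re-implementation, not the slides' code) of this training run; it reproduces the end-of-epoch weights in the table above:

def step(activation, theta=0.2):
    # fire when the weighted sum reaches the threshold (tiny tolerance for float error)
    return 1 if activation >= theta - 1e-9 else 0

data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]    # logical AND
w1, w2, eta = 0.3, -0.1, 0.1                                   # initial weights and learning rate

for epoch in range(1, 6):
    for (x1, x2), t in data:
        y = step(w1 * x1 + w2 * x2)        # actual output
        e = t - y                          # error
        w1 += eta * e * x1                 # perceptron learning rule
        w2 += eta * e * x2
    print(epoch, round(w1, 1), round(w2, 1))
# prints: 1 0.3 0.0 / 2 0.2 0.0 / 3 0.2 0.1 / 4 0.1 0.1 / 5 0.1 0.1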
[Russell & Norvig, 1995]
XOR - REVISITED
 Piece-wise linear separation
(Figure: the four points (0,0), (0,1), (1,0), (1,1) plotted for AND and for XOR.
For XOR no single line separates the two classes, but a piece-wise combination of
lines, labelled 1, 2 and 3 in the figure, can.)
MULTI-LAYER PERCEPTRON - MLP
 Minsky & Papert (1969) offered solution to XOR
problem by combining perceptron unit responses
using a second layer of Units
 Piecewise linear classification using an MLP with
threshold (perceptron) units
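A tiny Python sketch of such a two-layer network of threshold units solving XOR; this particular choice of weights is illustrative, not taken from the slides:

def step(x):
    return 1 if x >= 0 else 0

def xor_mlp(x1, x2):
    h_or  = step(x1 + x2 - 0.5)        # hidden unit 1: fires for OR
    h_and = step(x1 + x2 - 1.5)        # hidden unit 2: fires for AND
    return step(h_or - h_and - 0.5)    # output: OR but not AND, i.e. exactly one input is 1

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, xor_mlp(x1, x2))     # 0, 1, 1, 0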
 More to come…..Part II