2. 01 Background and Introduction
02 Understanding the Perceptron
03 Perceptron Learning Rule
04 Case Study
05 Conclusions
3. Background and Introduction
01
Biological Neuron
The human brain comprises billions of interconnected
nerve cells called neurons, responsible for processing
and transmitting both chemical and electrical signals.
Dendrites, the branching structures, receive information
from neighboring neurons. The cell nucleus, or soma,
processes the received information. Neurons use axons,
which are cable-like structures, to transmit information.
Synapses serve as the connections between an axon and
the dendrites of other neurons.
4. Background and Introduction
01
Artificial Neuron
An artificial neuron is a mathematical function based
on a model of biological neurons, where each neuron
takes inputs, weighs them separately, sums them up
and passes this sum through a nonlinear function to
produce output.
5. Background and Introduction
01
Rise of Artificial Neurons (Based on Biological Neuron)
The first model of an artificial neuron, known as the binary threshold neuron, was proposed in
1943 by McCulloch and Pitts. It integrates a number of input values through a weighted sum,
producing a single output value, which is equal to 1 if the sum is greater than a certain
threshold, 0 otherwise. The biological neuron is analogous to artificial neurons in the following
terms:
Biological Neuron -> Artificial Neuron
Cell Nucleus (Soma) -> Node
Dendrites -> Input
Synapse -> Weights or interconnections
Axon -> Output
6. Background and Introduction
01
Artificial Neural Networks
●Artificial neural network (ANN) is a machine learning approach that models the human brain and
consists of a number of artificial neurons.
●Each neuron in ANN receives a number of inputs.
●Knowledge about the learning task is given in the form of examples called training examples.
●An Artificial Neural Network is specified by:
−neuron model: the information processing unit of the NN,
−an architecture: a set of neurons and links connecting neurons. Each link has a weight,
−a learning algorithm: used for training the NN by modifying the weights in order to model a
particular learning task correctly on the training examples.
●The aim is to obtain a NN that is trained and generalizes well.
7. Background and Introduction
01
Perceptron
Perceptron was introduced by Frank Rosenblatt in 1957. He proposed a
Perceptron learning rule based on the original MCP neuron. A Perceptron
is an algorithm for supervised learning of binary classifiers. This
algorithm enables neurons to learn and processes elements in the training
set one at a time.
There are two types of perceptrons:
• Single layer: Single layer perceptrons can learn only linearly separable
patterns.
• Multilayer: Multilayer perceptrons have two or more layers and greater
processing power, so they can also learn patterns that are not linearly separable.
The basic components of a perceptron are:
• Input Layer
• Weights
• Bias
• Activation Function
• Output
8. Understanding the Perceptron
02
A Perceptron is a function that maps its input “x,” which is
multiplied by the learned weight coefficients; an output
value “f(x)” is generated.
f(x) = 1 if w · x + b > 0
       0 otherwise

In the equation given above:
"w" = vector of real-valued weights
"b" = bias (an element that adjusts the boundary away from the
origin without any dependence on the input value)
"x" = vector of input values
"m" = number of inputs to the Perceptron

Here w · x is the weighted sum Σ (i = 1 to m) of wᵢxᵢ.
The output can be represented as “1” or “0.” It can also be
represented as “1” or “-1” depending on which activation
function is used.
Perceptron Function
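The perceptron function above can be sketched in a few lines of Python; the weights, bias, and inputs below are illustrative values, not taken from the slides:

```python
# A minimal sketch of the perceptron function f(x):
# output 1 if w . x + b > 0, else 0.

def perceptron(x, w, b):
    """Step-activated perceptron: 1 if the weighted sum plus bias is positive."""
    weighted_sum = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if weighted_sum > 0 else 0

# Illustrative example: two inputs, equal weights, bias -1
print(perceptron([1, 1], [1, 1], -1))  # 1, since 1 + 1 - 1 > 0
print(perceptron([0, 0], [1, 1], -1))  # 0, since 0 + 0 - 1 <= 0
```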
9. Understanding the Perceptron
02
A Perceptron accepts inputs, moderates them with
certain weight values, then applies the
transformation function to output the final result.
The image below shows a Perceptron with a Boolean
output.
Inputs of a Perceptron
A Boolean output is based on inputs such as salaried,
married, age, past credit profile, etc. It has only two
values: Yes and No or True and False. The summation
function “∑” multiplies all inputs of “x” by weights “w”
and then adds them up as follows:
𝑤0 + 𝑤1𝑥1 + 𝑤2𝑥2 + ⋯ + 𝑤𝑛𝑥𝑛
10. Perceptron learning rule
03
Training a perceptron involves adjusting its
weights based on input data to enable it to
correctly classify or make decisions. The goal is
to find the correct set of weights that allow the
perceptron to accurately classify the input data.
Train a perceptron
The training process typically involves the
following steps:
• Initialization
• Provide Training Data
• Compute Output
• Compare with Expected Output
• Update Weights
• Repeat
We'll use a step function as the activation function.
The step function produces an output of 1 if the
weighted sum is greater than or equal to zero, and 0 otherwise:

Output = step( Σ (i = 1 to n) of inputᵢ × weightᵢ + bias )

This equation is used for updating the weights of a
perceptron:

New weightᵢ = Old weightᵢ + Learning Rate × (Expected Output − Output) × inputᵢ
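The training steps and update rule above can be sketched as follows; the AND-gate data, learning rate, and epoch count are illustrative choices, not part of the slides:

```python
# A sketch of the perceptron training loop: initialize, compute output,
# compare with the expected output, update weights, repeat.

def step(z):
    # Step activation: 1 if the weighted sum is >= 0, else 0
    return 1 if z >= 0 else 0

def train(samples, n_inputs, lr=0.1, epochs=20):
    weights, bias = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        for x, expected in samples:
            output = step(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = expected - output
            # New weight_i = Old weight_i + lr * (expected - output) * input_i
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Illustrative dataset: the AND gate
and_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train(and_data, n_inputs=2)
print(w, b)
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this loop settles on weights that classify every row correctly.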
11. Perceptron learning rule
03
A single perceptron can only be used to implement linearly
separable functions. It takes both real and Boolean inputs
and associates a set of weights to them, along with a bias.
Once we learn the weights, we have the function. Let's use a
perceptron to learn logic functions.
Learn logic function using Perceptron (AND Gate)
First, we need to understand that the output of an AND gate is 1
only if both inputs (in this case, x1 and x2) are 1. So, following the
steps listed above;
Row 1
From w1*x1+w2*x2+b, initializing w1, w2, as 1 and b as –1, we
get;
x1(1)+x2(1)–1
Passing the first row of the AND logic table (x1=0, x2=0), we get;
0+0–1 = –1
From the Perceptron rule, if Wx+b≤0, then y`=0. Therefore, this
row is correct, and no weight update is needed.
12. Perceptron learning rule
03
Learn logic function using Perceptron (AND Gate) cont..
Row 2
Passing (x1=0 and x2=1), we get;
0+1–1 = 0
From the Perceptron rule, if Wx+b≤0, then y`=0. This row is
correct, as the output is 0 for the AND gate.
From the Perceptron rule, this works for rows 1, 2, and 3 (row 3,
x1=1 and x2=0, gives 1+0–1 = 0, so y`=0, which is correct).
Row 4
Passing (x1=1 and x2=1), we get;
1+1–1 = 1
Again, from the perceptron rule, this is still valid.
Therefore, we can conclude that the model to achieve an
AND gate, using the Perceptron algorithm is;
x1+x2–1
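The learned model can be checked against the full AND truth table; a quick sketch using the rule y` = 1 if Wx+b > 0, else 0:

```python
# Verify the learned AND model x1 + x2 - 1 on all four input rows.
def predict_and(x1, x2):
    return 1 if x1 + x2 - 1 > 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, '->', predict_and(x1, x2))  # 1 only when x1 = x2 = 1
```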
13. Perceptron learning rule
03
Learn logic function using Perceptron (OR Gate)
From the diagram, the OR gate is 0 only if both inputs are 0.
Row 1
From w1x1+w2x2+b, initializing w1, w2, as 1 and b as –1, we get;
x1(1)+x2(1)–1
Passing the first row of the OR logic table (x1=0, x2=0), we get;
0+0–1 = –1
From the Perceptron rule, if Wx+b≤0, then y`=0. Therefore, this row is correct.
Row 2
Passing (x1=0 and x2=1), we get;
0+1–1 = 0
From the Perceptron rule, if Wx+b≤0, then y`=0. Therefore, this row is incorrect,
as the output should be 1 for the OR gate.
So we want values that will make inputs x1=0 and x2=1 give y` a value of 1. If we
change w2 to 2, we have;
0+2–1 = 1
From the Perceptron rule, this is correct for both row 1 and row 2. Row 3 (x1=1,
x2=0) fails in the same way, so w1 must also be changed to 2, giving the final
model 2x1+2x2–1.
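Since row 3 (x1=1, x2=0) fails with w1=1 for the same reason row 2 did, raising w1 to 2 as well gives the model 2x1+2x2–1, which can be checked in a short sketch:

```python
# Verify the OR model 2*x1 + 2*x2 - 1 (both weights raised to 2) on all rows.
def predict_or(x1, x2):
    return 1 if 2 * x1 + 2 * x2 - 1 > 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, '->', predict_or(x1, x2))  # 0 only when x1 = x2 = 0
```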
14. Perceptron learning rule
03
Learn logic function using Perceptron (NOT Gate)
From the diagram, the output of a NOT gate is the inverse of a single
input. So, following the steps listed above;
Row 1
From w1x1+b, initializing w1 as 1 (since single input), and b as –1, we
get;
x1(1)–1
Passing the first row of the NOT logic table (x1=0), we get;
0–1 = –1
From the Perceptron rule, if Wx+b≤0, then y`=0. This row is incorrect, as
the output is 1 for the NOT gate.
So we want values that will make input x1=0 to give y` a value of 1. If we
change b to 1, we have;
0+1 = 1
15. Perceptron learning rule
03
Learn logic function using Perceptron (NOT Gate) cont..
Row 2
Passing (x1=1), we get;
1+1 = 2
From the Perceptron rule, if Wx+b > 0, then y`=1. This row is
incorrect, as the output should be 0 for the NOT gate.
So we want values that will make input x1=1 to give y` a value
of 0. If we change w1 to –1, we have;
–1+1 = 0
From the Perceptron rule, if Wx+b ≤ 0, then y`=0. Therefore,
this works (for both row 1 and row 2).
Therefore, we can conclude that the model to achieve a NOT
gate, using the Perceptron algorithm is;
–x1+1
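The final NOT model can be checked the same way:

```python
# Verify the learned NOT model -x1 + 1 on both input rows.
def predict_not(x1):
    return 1 if -x1 + 1 > 0 else 0

print(predict_not(0))  # 1
print(predict_not(1))  # 0
```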
16. Case Study
04
Multi-layer perceptron evaluation model
The first layer is the input layer, which divides the entire data set
into a training set, a validation set and a test set and normalizes
the data as input for the entire network. The second layer is the
implicit layer, which has a multilayer neuron structure. The
training of the model is completed through the entire training
process so that the entire loss function reaches the target value.
The last layer is the output layer, which makes predictions for the
test set based on the trained network.
17. Case Study
04
Bayesian optimization-based multilayer perceptron
Using a Bayesian optimization algorithm, the multilayer
perceptron model can obtain its optimal hyperparameters
in the shortest time.
The Bayesian formula:

P(A ∣ B) = P(B ∣ A) P(A) / P(B)

Where:
P(A|B) – the probability of event A occurring, given event B has occurred
P(B|A) – the probability of event B occurring, given event A has occurred
P(A) – the probability of event A
P(B) – the probability of event B
18. Case Study
04
Bayesian optimization-based multilayer perceptron
A: Event "Patient has Cancer" - Prevalence P(A) = 0.03 (3% of the population has cancer)
B: Event "Positive Test Result" - Sensitivity P(B ∣ A) = 0.95 (probability of a positive test given the patient has cancer)
Specificity P(∼ B ∣∼ A) = 0.90 (probability of a negative test given the patient does not have cancer)
Application of Bayes' Theorem:
We want to find the probability that a patient has cancer given a positive test result (P(A ∣ B)) using Bayes' theorem:
P(A ∣ B) = [P(B ∣ A) × P(A)] / [P(B ∣ A) × P(A) + P(B ∣ ∼A) × P(∼A)]

Calculation:
P(A) = 0.03 (prevalence of cancer)
P(B ∣ A) = 0.95 (sensitivity)
P(∼B ∣ ∼A) = 0.90 (specificity), so P(B ∣ ∼A) = 1 − 0.90 = 0.10
P(∼A) = 1 − P(A) = 1 − 0.03 = 0.97

P(A ∣ B) = (0.95 × 0.03) / (0.95 × 0.03 + 0.10 × 0.97)
         = 0.0285 / 0.1255
         ≈ 0.227
Given a positive test result, the
probability that the patient
actually has cancer is
approximately 22.7%.
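The calculation can be reproduced in a few lines; a sketch of the arithmetic above:

```python
# Bayes' theorem for P(A|B): probability of cancer given a positive test.
p_a = 0.03          # P(A): prevalence of cancer
sensitivity = 0.95  # P(B|A): positive test given cancer
specificity = 0.90  # P(~B|~A), so the false-positive rate P(B|~A) is 0.10

numerator = sensitivity * p_a                            # 0.0285
denominator = numerator + (1 - specificity) * (1 - p_a)  # 0.0285 + 0.097
posterior = numerator / denominator
print(round(posterior, 3))  # 0.227
```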
19. Case Study
04
Perceptron for Predicting Movie Watching
Let's generate a dataset with numeric values representing different features like
weather, day of the week, mood, and the target variable indicating whether the
person watches a movie or not.
Training the Perceptron:
Input Features: Weather, Day of the Week, and Mood
Target Label: Watched Movie (Yes/No)
Let's train a perceptron model using this dataset to predict whether a person will
watch a movie based on the provided factors.
Perceptron Implementation:
Using Python and a basic perceptron implementation to train on this dataset and
predict movie-watching behavior based on given features.
Evaluation and Interpretation:
After training the perceptron model, we'll evaluate its performance and interpret its
predictive ability.
print(f"For data {data_point}, Predicted Movie Watch: {'Yes' if prediction == 1 else 'No'}")
Based on the input features ([1, 0, 1] or [0, 1, 2]), the perceptron model would
generate a prediction of 'Yes' or 'No' for movie-watching behavior.
Here:
Weather: 0 (Sunny), 1 (Rainy), 2 (Cloudy)
Day of the Week: 0 (Weekday), 1 (Weekend)
Mood: 0 (Happy), 1 (Neutral), 2 (Sad)
Watched Movie: 1 (Yes), 0 (No)
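A minimal end-to-end sketch of this case study in Python. The training examples below are invented for illustration (the slides only describe the dataset), using the encodings listed above:

```python
# Perceptron for predicting movie watching from [weather, day, mood].
# Encodings: Weather 0/1/2, Day of the Week 0/1, Mood 0/1/2, Watched 1/0.

def step(z):
    return 1 if z >= 0 else 0

def predict(x, weights, bias):
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train(samples, n_inputs, lr=0.1, epochs=50):
    weights, bias = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        for x, expected in samples:
            error = expected - predict(x, weights, bias)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Hypothetical training examples: [weather, day_of_week, mood] -> watched
data = [([0, 0, 0], 0), ([1, 1, 0], 1), ([2, 1, 1], 1),
        ([0, 0, 2], 0), ([1, 0, 1], 1), ([0, 1, 2], 0)]

w, b = train(data, n_inputs=3)
for data_point in [[1, 0, 1], [0, 1, 2]]:
    prediction = predict(data_point, w, b)
    print(f"For data {data_point}, Predicted Movie Watch: "
          f"{'Yes' if prediction == 1 else 'No'}")
```

This toy dataset is linearly separable, so the perceptron converges within the given epochs and reproduces the slide's example predictions for [1, 0, 1] and [0, 1, 2].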
20. 05
Let us summarize what we have learned in this tutorial:
• An artificial neuron is a mathematical function conceived as a model of biological neurons; it is the basic building
block of a neural network.
• A Perceptron is a neural network unit that does certain computations to detect features or business intelligence
in the input data. It is a function that maps its input “x,” which is multiplied by the learned weight coefficient,
and generates an output value “f(x).”
• The Perceptron Learning Rule states that the algorithm automatically learns the optimal weight coefficients.
• Single layer Perceptrons can learn only linearly separable patterns.
• Multilayer Perceptrons, or feedforward neural networks with two or more layers, have greater processing
power and can process non-linear patterns as well.
• Single-layer Perceptrons can implement linearly separable logic gates like AND, OR, and NOT; XOR is not
linearly separable, so it requires a multilayer network.
Summary and Conclusions
21. 05
The perceptron, a fundamental unit of neural networks, serves as a critical building block in the development of
more complex models like multilayer perceptrons (MLPs) and neural networks. The perceptron, though simple and
limited in handling complex tasks due to its single-layer architecture and linear separability constraint, remains a
key element in the history and understanding of neural networks. While its direct application in solving complex
real-world problems is restricted, its significance lies in its contribution to the development of more advanced
neural network architectures capable of addressing non-linear patterns and solving more intricate machine learning
problems.
Summary and Conclusions