Multilayer
Perceptron
07 October 2023
Agenda
• Neurons
• Perceptron
• Structure of a Perceptron
• Perceptron in action
• Multilayer Perceptron (MLP)
• Artificial Neural Network
• Feedforward Network
• Fully Connected Neural Network
• Multilayer Perceptron Structure
• Activation Functions
• How Multilayer Perceptron works
• Forward Propagation
• Error Functions
• The Chain Rule
• Backpropagation
• Learning Rate
• Real-life examples
“A single microscopic brain cell cannot think
and is not conscious, but if you bring in a few
more brain cells and a few more, and connect
them all, at a certain point, the group itself will
be able to think and experience emotions and
have opinions and a personality and know that
it exists.”
- Michael Stevens [1]
Biological
Neuron
• Neurons talk to each other by sending electrical impulses from one cell to another. [1]
• When the incoming electrical signal reaches a certain threshold, the neuron fires. [1]
Artificial
Neuron
• An artificial neuron mimics a biological neuron.
• Typically includes multiple inputs, each with an
associated weight. [2]
• It computes a weighted sum of inputs, applies an
activation function, and produces an output. [2]
• The output of an artificial neuron is a continuous
value. [2]
Fig: Artificial neuron [3]
Perceptron
• The perceptron is a linear binary classifier. [4]
• A perceptron is a type of artificial neuron.
• A general artificial neuron can produce a continuous value, whereas a perceptron produces only a binary output. [5]
• A perceptron's output is therefore always one of the two extreme values, 0 or 1. [5]
• Other artificial neurons use softer activation functions such as the sigmoid function, hyperbolic tangent (tanh), or Rectified Linear Unit (ReLU). These produce continuous outputs: sigmoid between 0 and 1, tanh between -1 and 1, and ReLU from 0 upward. [6]
• A perceptron uses a step function, such as the Heaviside step function, as its activation function. This produces a binary output, either 0 or 1. [7]
• A perceptron still qualifies as a form of artificial neuron because it shares the same core characteristics. [2]
“All perceptrons are neurons, but not all
neurons are perceptrons”
Perceptron Structure
A perceptron has the following components.
• Input: Each input node takes a binary value of 1 or 0. [8]
• Weight: Represents the importance of each input. [8]
• Bias: Shifts the activation function towards 0 or 1. [9]
Fig: Perceptron [9]
• Summation Function: Computes a weighted sum of inputs.
• Activation function: Outputs 1 if the Summation function returns a value greater than or equal to the
threshold, otherwise returns 0.
• Output: Output of the Activation Function.
Perceptron in action
Here are the preferences of a customer and the details of two restaurants. Threshold = 1

| Node | Criteria | Personal preference (W) | Restaurant A | Restaurant B |
|------|----------|-------------------------|--------------|--------------|
| A | Good food | 0.8 | Yes (1) | Yes (1) |
| B | Friends will come | 0.6 | Yes (1) | No (0) |
| C | Cheap | 0.4 | No (0) | Yes (1) |
| D | Noisy environment | -0.5 | Yes (1) | No (0) |
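The restaurant decision can be checked with a few lines of Python. This is a minimal sketch, assuming the weights and threshold from the table; the names `WEIGHTS`, `THRESHOLD`, `weighted_sum`, and `perceptron` are illustrative and do not come from the slides.

```python
# Weights (personal preferences) and threshold come from the table.
# The dictionary keys and function names are illustrative assumptions.
WEIGHTS = {"good_food": 0.8, "friends_will_come": 0.6, "cheap": 0.4, "noisy": -0.5}
THRESHOLD = 1.0


def weighted_sum(inputs):
    """Summation function: sum of weight * input over all criteria."""
    return sum(WEIGHTS[name] * value for name, value in inputs.items())


def perceptron(inputs):
    """Step activation: 1 if the weighted sum reaches the threshold, else 0."""
    return 1 if weighted_sum(inputs) >= THRESHOLD else 0


restaurant_a = {"good_food": 1, "friends_will_come": 1, "cheap": 0, "noisy": 1}
restaurant_b = {"good_food": 1, "friends_will_come": 0, "cheap": 1, "noisy": 0}

for name, inputs in (("A", restaurant_a), ("B", restaurant_b)):
    print(name, round(weighted_sum(inputs), 2), perceptron(inputs))
```

Restaurant A sums to 0.8 + 0.6 - 0.5 = 0.9, which is below the threshold, so the perceptron outputs 0. Restaurant B sums to 0.8 + 0.4 = 1.2, so it outputs 1.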
Multilayer
Perceptron
• Multilayer perceptron is a supervised learning
model. [10]
• A Multilayer Perceptron (MLP) is a Fully-connected
Feed-forward Artificial Neural network. [10]
• A multilayer perceptron is typically trained with the
backpropagation algorithm. [11]
Fig: A taxonomy of neural network architectures [10]
Artificial Neural Network
• It’s a machine-learning model designed to mimic the
function and structure of the human brain.
• ANNs are composed of multiple nodes, which imitate
biological neurons of the human brain. [13]
• The nodes take input data and perform simple operations on
the data. [13]
• Each link between two nodes is associated with a weight.
• Nodes are arranged in layers: the input layer, one or more
hidden layers, and the output layer.
• ANNs are capable of learning by altering weight values. [13]
Fig: Neural Network [17]
How ANN Works [14]
• The input neurons multiply the input values by their weights and pass
them to the neurons in the hidden layer.
• Each neuron in the hidden layer is associated with a numerical value
called the bias.
• Hidden neurons add the weighted values and the bias, then pass the
result to an activation function.
• The result of the activation function determines whether the neuron
is activated.
• The activated neurons transmit their values to the neurons in the next
layer.
• In this manner, the data is propagated from the input layer to the
output layer. This is called forward propagation.
• In the output layer, the neuron with the highest value fires and
determines the output.
• The errors at the output layer are propagated back through the network,
and the weights and biases are modified to reduce the error. This is
called backpropagation.
The Stilwell Brain Experiment [1]
• Human Neural Network
• Experiment by Michael Stevens
• YouTube Channel: Vsauce
• Name: The Stilwell Brain
• Series: Mind Field
• Season: 3
• Episode: 3
• Each person acts like a neuron.
• Several people are arranged in layers.
• They fire by raising a flag.
• Each layer gradually identifies more
complex patterns.
The Stilwell Brain [1]
• Stevens writes a number on a sheet of
paper and divides the paper into
several pieces.
• He gives each person in the input
layer a piece of paper.
• People forming the neural
network don’t know the number.
• Each paper represents a pixel.
• People in the first hidden layer
identify very basic lines.
• The second hidden layer identifies
more complex patterns, such as an
angle.
• The third hidden layer identifies
several angles, and a pattern
starts to emerge.
• The output layer identifies their
designated patterns and the
person associated with the
number raises the flag.
Feed Forward Neural
Network [15]
• The connections between units do not form a cycle.
• Information only travels forward in the network.
• Used in classification problems where the data is not
sequential or time-dependent.
Fig: Feed Forward Neural Network [15]
Fully connected Neural Network
• Every neuron in one layer is connected to every neuron in
the next layer. [16]
• Training a fully connected network takes a long time because
a large number of parameters must be updated. [2]
• Storing and manipulating a large number of weights and
activations consumes a significant amount of memory. [2]
Fig: Fully Connected Neural Network [16]
Components of MLP
Visual Representation of MLP
Activation functions in a nutshell
• The activation function calculates the output of a neuron. [18]
| | Sigmoid Function | ReLU Function | Softmax Function |
|---|---|---|---|
| Equation | $A(x) = \frac{1}{1 + e^{-x}}$ | $A(x) = \max(0, x)$ | $A(x_i) = \frac{e^{x_i}}{\sum_{j=1}^{k} e^{x_j}}$ |
| Nature | Non-linear | Non-linear | Non-linear |
| Output | 0 to 1 | 0 to ∞ | 0 to 1 |
| Uses | Usually used in the output layer for binary classification | Usually used in the hidden layers of a neural network | Usually used when handling multiple classes |
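The three activation functions can be sketched in plain Python as follows. This is an illustrative implementation, not code from the slides; subtracting the maximum inside `softmax` is a common numerical-stability choice added here as an assumption, and it does not change the result.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """Pass positive values through unchanged and clip negatives to 0."""
    return max(0.0, x)


def softmax(xs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(xs)  # shift by the maximum to avoid overflow in exp()
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


print(sigmoid(0.0))
print(relu(-2.0), relu(3.0))
print(softmax([1.0, 2.0, 3.0]))
```

The largest score passed to `softmax` always receives the largest probability, which is why it suits multi-class output layers.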
How Multilayer Perceptron
Works?
• It works in two phases: forward propagation and backpropagation.
• Forward propagation is where prediction occurs.
• Backpropagation is where learning occurs.
• It uses non-linear activation functions such as sigmoid, tanh, and ReLU to fit
non-linear data.
Fig: Neural Network [12]
Forward Propagation [19]
• The journey starts at the input layer and ends at the output layer.
• Consider the following MLP with its inputs, initial weights, and
biases.
• Initial weights are generated randomly.
• The activation function is the sigmoid function.
Outputs of the hidden layer neurons:
$net_{h1} = w_1 \times i_1 + w_2 \times i_2 + b_1 = 0.15 \times 0.05 + 0.2 \times 0.1 + 0.35 = 0.3775$
$out_{h1} = \frac{1}{1 + e^{-net_{h1}}} = \frac{1}{1 + e^{-0.3775}} = 0.593$
$net_{h2} = w_3 \times i_1 + w_4 \times i_2 + b_1 = 0.25 \times 0.05 + 0.3 \times 0.1 + 0.35 = 0.3925$
$out_{h2} = \frac{1}{1 + e^{-net_{h2}}} = \frac{1}{1 + e^{-0.3925}} = 0.597$
Continuing Forward Propagation
Outputs of the output layer neurons:
$net_{o1} = w_5 \times out_{h1} + w_6 \times out_{h2} + b_2 = 0.4 \times 0.593 + 0.45 \times 0.597 + 0.60 \approx 1.1$
$out_{o1} = \frac{1}{1 + e^{-net_{o1}}} = \frac{1}{1 + e^{-1.1}} = 0.75$
$net_{o2} = w_7 \times out_{h1} + w_8 \times out_{h2} + b_2 = 0.5 \times 0.593 + 0.55 \times 0.597 + 0.60 \approx 1.22$
$out_{o2} = \frac{1}{1 + e^{-net_{o2}}} = \frac{1}{1 + e^{-1.22}} = 0.77$
The neuron with the highest value determines the output.
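The forward pass for this 2-2-2 network can be reproduced in a short script. This is a sketch using the inputs, weights, and biases from the worked example; the variable names mirror the slide notation, and the helper `sigmoid` is defined here for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Inputs, weights, and biases taken from the worked example.
i1, i2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60

# Hidden layer: weighted sum plus bias, then sigmoid.
out_h1 = sigmoid(w1 * i1 + w2 * i2 + b1)
out_h2 = sigmoid(w3 * i1 + w4 * i2 + b1)

# Output layer: same computation, fed by the hidden outputs.
out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2)
out_o2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2)

print(round(out_h1, 3), round(out_h2, 3), round(out_o1, 3), round(out_o2, 3))
```

Because the script keeps full floating-point precision between layers, its outputs can differ from the hand-rounded slide values in the third decimal place.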
Error Function
• A measure that quantifies the difference between the predicted output and the actual output.
• Also known as the loss function or the cost function.
• Backpropagation tries to minimize the error function.
• Common Error Functions:
• Mean Squared Error (MSE): Commonly used for regression problems. It calculates the average of
the squared differences between predicted and actual values. [20]
$MSE(p, y) = \frac{1}{n} \sum_{i=1}^{n} (y_i - p_i)^2$
• Binary Cross-Entropy Loss: Frequently used for binary classification tasks, measuring the difference
between predicted probabilities and true binary labels. [21]
$BCE(p, y) = -[y \log(p) + (1 - y) \log(1 - p)]$
• Categorical Cross-Entropy Loss: Often used for multi-class classification problems. It calculates the
difference between predicted probabilities and true class labels. [22]
$CCE(p, y) = -\sum_{i} y_i \log(p_i)$
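The three error functions can be written directly from their definitions. This is a minimal sketch in plain Python; the function names `mse`, `bce`, and `cce` are illustrative, and `cce` skips terms where the true label is 0 so that it never evaluates `log(0)` for those entries (an implementation choice assumed here).

```python
import math


def mse(p, y):
    """Mean squared error between predictions p and targets y."""
    return sum((yi - pi) ** 2 for pi, yi in zip(p, y)) / len(y)


def bce(p, y):
    """Binary cross-entropy for one predicted probability p and label y (0 or 1)."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def cce(p, y):
    """Categorical cross-entropy for probabilities p and one-hot labels y."""
    return -sum(yi * math.log(pi) for pi, yi in zip(p, y) if yi > 0)


print(mse([0.75, 0.77], [0.01, 0.99]))
print(bce(0.9, 1))
print(cce([0.7, 0.2, 0.1], [1, 0, 0]))
```

In all three cases a smaller value means the predictions are closer to the true values.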
Calculating Error
• Error of the forward propagation using the Mean Squared Error
function:
$E_{total} = MSE(p, y) = \frac{1}{n} \sum_{i=1}^{n} (y_i - p_i)^2$
$E_{total} = \frac{1}{2} [(target_{o1} - out_{o1})^2 + (target_{o2} - out_{o2})^2]$
$E_{total} = \frac{1}{2} [(0.01 - 0.75)^2 + (0.99 - 0.77)^2] = 0.298$
The Chain Rule
• The chain rule is a fundamental calculus principle used extensively in backpropagation.
• It is extremely important for the Multilayer Perceptron.
• It determines the relation between a weight and the error.
• $\frac{\partial E_{total}}{\partial w_5}$ is the rate of change in error with respect
to a change in weight.
• $\frac{\partial E_{total}}{\partial out_{o1}}$ is the rate of change in error with respect
to a change in the output value of a neuron.
• $\frac{\partial out_{o1}}{\partial net_{o1}}$ is the rate of change in the output value of a
neuron with respect to a change in its net value.
• $\frac{\partial net_{o1}}{\partial w_5}$ is the rate of change in the net value with respect to
a change in weight.
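Written out for the weight $w_5$, the chain rule links these four quantities in a single product. The expression below is the standard form of the rule applied to this network, using the same symbols as the bullets above:

```latex
\frac{\partial E_{total}}{\partial w_5}
  = \frac{\partial E_{total}}{\partial out_{o1}}
    \cdot \frac{\partial out_{o1}}{\partial net_{o1}}
    \cdot \frac{\partial net_{o1}}{\partial w_5}
```

Each factor on the right can be computed on its own, which is what makes the gradient of a deeply nested function tractable.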
Backpropagation
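• Change in error with respect to the output:
$E_{total} = \frac{1}{2} [(target_{o1} - out_{o1})^2 + (target_{o2} - out_{o2})^2]$
$\frac{\partial E_{total}}{\partial out_{o1}} = 2 \times \frac{1}{2} (target_{o1} - out_{o1}) \times (-1) + 0 = (0.01 - 0.75) \times (-1) = 0.74$
• Change in output with respect to net input:
$out_{o1} = \frac{1}{1 + e^{-net_{o1}}}$
$\frac{\partial out_{o1}}{\partial net_{o1}} = out_{o1} (1 - out_{o1})$ [23]
$= 0.75 (1 - 0.75) \approx 0.187$ (using the unrounded $out_{o1} \approx 0.7514$)
• Change in net input with respect to weight:
$net_{o1} = w_5 \times out_{h1} + w_6 \times out_{h2} + b_2$
$\frac{\partial net_{o1}}{\partial w_5} = 1 \times out_{h1} + 0 + 0 = 0.593$
• Change in error with respect to the weight:
$\frac{\partial E_{total}}{\partial w_5} = 0.74 \times 0.187 \times 0.593 \approx 0.082$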
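One way to sanity-check the analytic value of $\frac{\partial E_{total}}{\partial w_5}$ is a finite-difference estimate: nudge $w_5$ slightly in both directions and measure how the error changes. The sketch below assumes the inputs, weights, biases, and targets of the worked example; `total_error` and the step size `h` are illustrative choices, not part of the slides.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def total_error(w5):
    """E_total as a function of w5 alone; every other value stays fixed."""
    i1, i2 = 0.05, 0.10
    out_h1 = sigmoid(0.15 * i1 + 0.20 * i2 + 0.35)
    out_h2 = sigmoid(0.25 * i1 + 0.30 * i2 + 0.35)
    out_o1 = sigmoid(w5 * out_h1 + 0.45 * out_h2 + 0.60)
    out_o2 = sigmoid(0.50 * out_h1 + 0.55 * out_h2 + 0.60)
    return 0.5 * ((0.01 - out_o1) ** 2 + (0.99 - out_o2) ** 2)


# Central difference: (E(w + h) - E(w - h)) / (2h) approximates dE/dw.
h = 1e-6
numeric_grad = (total_error(0.40 + h) - total_error(0.40 - h)) / (2 * h)
print(round(numeric_grad, 3))
```

The estimate should agree with the chain-rule result of about 0.082, which is a useful check when implementing backpropagation by hand.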
What does it mean?
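• What is the meaning of $\frac{\partial E_{total}}{\partial w_5} = 0.082$?
$\Rightarrow \partial E_{total} = 0.082 \times \partial w_5$
• So a small change in $w_5$ and the resulting change in $E_{total}$ are proportionally related.
• The coefficient of proportionality is 0.082.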
Learning Rate
• The learning rate controls how much the model changes in response to the estimated error each time
the weights are updated. [24]
• A very large learning rate can overshoot the optimal values.
• A very small learning rate can cause the optimization process to converge extremely slowly.
• The learning rate is a hyperparameter of the gradient descent algorithm; it is usually tuned by experiment rather than learned by gradient descent itself.
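The effect of the learning rate can be seen on a toy problem. The sketch below runs gradient descent on the hypothetical function f(w) = w², whose minimum is at w = 0 and whose gradient is 2w; the function, the starting point, and the three learning rates are assumptions chosen only for illustration.

```python
def descend(learning_rate, steps=20, w=1.0):
    """Run gradient descent on the toy function f(w) = w**2 (gradient 2*w)."""
    for _ in range(steps):
        w = w - learning_rate * 2.0 * w
    return w


# Small, moderate, and too-large learning rates, all starting from w = 1.0.
for lr in (0.01, 0.5, 1.1):
    print(lr, descend(lr))
```

With the small rate the weight is still far from 0 after 20 steps, with the moderate rate it reaches the minimum, and with the large rate each step overshoots and the weight grows in magnitude instead of shrinking.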
Continuing Backpropagation
• To decrease the error, we then update the weight.
• Assume the learning rate $\mu = 0.5$.
$w_5^{+} = w_5 - \mu \times \frac{\partial E_{total}}{\partial w_5} = 0.4 - 0.5 \times 0.082 = 0.359$
Similarly, we get
$w_6^{+} = 0.408$
$w_7^{+} = 0.511$
$w_8^{+} = 0.561$
$w_1^{+} = 0.149$
$w_2^{+} = 0.199$
$w_3^{+} = 0.249$
$w_4^{+} = 0.299$
• Repeat this process until the desired accuracy is reached.
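A single training step for the whole network can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the example's inputs, targets, initial weights, and learning rate of 0.5, takes the error as half the sum of squared differences, leaves the biases unchanged, and computes every gradient from the original weights before any weight is updated. The variable names are illustrative.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


inputs = [0.05, 0.10]
target = [0.01, 0.99]
w_hidden = [[0.15, 0.20], [0.25, 0.30]]  # rows: [w1, w2], [w3, w4]
w_output = [[0.40, 0.45], [0.50, 0.55]]  # rows: [w5, w6], [w7, w8]
b1, b2 = 0.35, 0.60
lr = 0.5

# Forward propagation.
out_h = [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b1) for row in w_hidden]
out_o = [sigmoid(sum(w * h for w, h in zip(row, out_h)) + b2) for row in w_output]

# Output-layer deltas: dE/dnet_o = (out - target) * out * (1 - out).
delta_o = [(o - t) * o * (1 - o) for o, t in zip(out_o, target)]

# Hidden-layer deltas: error flowing back through the original output weights.
delta_h = [
    sum(delta_o[k] * w_output[k][j] for k in range(2)) * out_h[j] * (1 - out_h[j])
    for j in range(2)
]

# Gradient descent update: weight minus learning rate times gradient.
new_w_output = [[w_output[k][j] - lr * delta_o[k] * out_h[j] for j in range(2)]
                for k in range(2)]
new_w_hidden = [[w_hidden[j][m] - lr * delta_h[j] * inputs[m] for m in range(2)]
                for j in range(2)]

print(new_w_output)
print(new_w_hidden)
```

Repeating this step in a loop, feeding the new weights back in each time, is what drives the error down over many iterations.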
Importance of MLP
• Can fit highly non-linear datasets [10].
• Can automatically learn sophisticated features from raw input data.
• Versatile model, can be used in Classification, Regression, Pattern Recognition, Time Series
Analysis and many other fields.
• Scalable model, can be used from small datasets to very large datasets.
• MLPs serve as fundamental building blocks in deep learning. By stacking multiple hidden layers,
MLPs become deep neural networks (DNNs).
• MLPs have been at the forefront of neural network research and innovations.
• MLPs are widely used in practical applications such as computer vision, natural language
processing, speech recognition, finance, healthcare, recommendation systems, robotics, and more.
References
• [1] “The stilwell brain,” YouTube, https://www.youtube.com/watch?v=rA5qnZUXcqo (accessed Oct. 2, 2023).
• [2] “Chatgpt,” ChatGPT, https://openai.com/chatgpt (accessed Oct. 2, 2023).
• [3] RainerGewalt and Name, “Perceptrons - these artificial neurons are the fundamentals of Neural Networks,” Fly spaceships with your mind, https://starship-knowledge.com/neural-networks-perceptrons (accessed Oct. 3, 2023).
• [4] “What is a feedforward fully connected neural network and how to implement it in Matlab?,” Saturn Cloud Blog, https://saturncloud.io/blog/what-is-a-feedforward-fully-connected-neural-network-and-how-to-implement-it-in-matlab/ (accessed Oct. 4, 2023).
• [5] P. King, “What is the difference between the ‘neurons’ in an artificial neural network (ANN) and ‘perceptrons’?,” Quora, https://www.quora.com/What-is-the-difference-between-the-neurons-in-an-artificial-neural-network-ANN-and-perceptrons (accessed Oct. 2, 2023).
• [6] “Activation functions in neural networks,” GeeksforGeeks, https://www.geeksforgeeks.org/activation-functions-neural-networks/ (accessed Oct. 2, 2023).
• [7] “Perceptron in Machine Learning - Javatpoint,” www.javatpoint.com, https://www.javatpoint.com/perceptron-in-machine-learning (accessed Oct. 3, 2023).
• [8] Perceptrons, https://www.w3schools.com/ai/ai_perceptrons.asp (accessed Oct. 3, 2023).
• [9] A. Hange, “Flux prediction using single-layer perceptron and Multilayer Perceptron,” Medium, https://medium.com/nerd-for-tech/flux-prediction-using-single-layer-perceptron-and-multilayer-perceptron-cf82c1341c33 (accessed Oct. 3, 2023).
• [10] M. W. Gardner and S. R. Dorling, “Artificial Neural Networks (the multilayer perceptron)—a review of applications in the Atmospheric Sciences,” Atmospheric Environment, vol. 32, no. 14–15, pp. 2627–2636, 1998. doi:10.1016/s1352-2310(97)00447-0.
• [11] “Perceptron in Machine Learning - Javatpoint,” www.javatpoint.com, https://www.javatpoint.com/perceptron-in-machine-learning (accessed Oct. 3, 2023).
• [12] Minesanalytix, “Demystifying Data-Driven Neural Networks for multivariate production analysis,” Data Analytix Association @ Mines, https://orgs.mines.edu/daa/blog/2019/08/05/neural-networks-mva/ (accessed Oct. 4, 2023).
• [13] “Artificial Intelligence - Neural Networks,” Online Courses and eBooks Library, https://www.tutorialspoint.com/artificial_intelligence/artificial_intelligence_neural_networks.htm (accessed Oct. 4, 2023).
• [14] “Neural network in 5 minutes | what is a neural network? | how neural networks work | Simplilearn,” YouTube, https://www.youtube.com/watch?v=bfmFfD2RIcg (accessed Oct. 4, 2023).
• [15] “Feedforward Neural Networks,” Brilliant Math & Science Wiki, https://brilliant.org/wiki/feedforward-neural-networks/ (accessed Oct. 4, 2023).
• [16] P. Mahajan, “Fully connected vs Convolutional Neural Networks,” Medium, https://medium.com/swlh/fully-connected-vs-convolutional-neural-networks-813ca7bc6ee5 (accessed Oct. 4, 2023).
• [17] “A beginner’s guide to keras: Digit recognition in 30 minutes,” SitePoint, https://www.sitepoint.com/keras-digit-recognition-tutorial/ (accessed Oct. 4, 2023).
• [18] “Activation function,” Wikipedia, https://en.wikipedia.org/wiki/Activation_function (accessed Oct. 4, 2023).
• [19] Mazur, “A step by step backpropagation example,” Matt Mazur, https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/ (accessed Oct. 5, 2023).
• [20] “Mean squared error,” Wikipedia, https://en.wikipedia.org/wiki/Mean_squared_error (accessed Oct. 5, 2023).
• [21] “A practical guide to binary cross-entropy and Log Loss,” Aporia, https://www.aporia.com/learn/understanding-binary-cross-entropy-and-log-loss-for-effective-model-monitoring/ (accessed Oct. 5, 2023).
• [22] Neuralthreads, “Categorical cross-entropy loss-the most important loss function,” Medium, https://neuralthreads.medium.com/categorical-cross-entropy-loss-the-most-important-loss-function-d3792151d05b (accessed Oct. 5, 2023).
• [23] “Logistic function,” Wikipedia, https://en.wikipedia.org/wiki/Logistic_function#Derivative (accessed Oct. 5, 2023).
• [24] J. Brownlee, “Understand the impact of learning rate on neural network performance,” MachineLearningMastery.com, https://machinelearningmastery.com/understand-the-dynamics-of-learning-rate-on-deep-learning-neural-networks/ (accessed Oct. 5, 2023).
• [25] “Educative answers - trusted answers to developer questions,” Educative, https://www.educative.io/answers/what-is-forward-propagation-in-neural-networks (accessed Oct. 5, 2023).
Thank you

 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 

Multilayer Perceptron Neural Network MLP

  • 2. Agenda • Neurons • Perceptron • Structure of a Perceptron • Perceptron in action • Multilayer Perceptron (MLP) • Artificial Neural Network • Feedforward Network • Fully Connected Neural Network • Multilayer Perceptron Structure • Activation Functions • How Multilayer Perceptron works • Forward Propagation • Error Functions • The Chain Rule • Backpropagation • Learning Rate • Real-life examples
  • 3. “A single microscopic brain cell cannot think and is not conscious, but if you bring in a few more brain cells and a few more, and connect them all, at a certain point, the group itself will be able to think and experience emotions and have opinions and a personality and know that it exists.” - Michael Stevens [1]
  • 4. Biological Neuron • Neurons talk to each other by sending electrical impulses from one cell to another. [1] • When this electrical impulse reaches a certain threshold, the neuron fires. [1]
  • 5. Artificial Neuron • An artificial neuron mimics a biological neuron. • It typically includes multiple inputs, each with an associated weight. [2] • It computes a weighted sum of inputs, applies an activation function, and produces an output. [2] • The output of an artificial neuron is a continuous value. [2] Fig: Artificial neuron [3]
  • 6. Perceptron • A perceptron is a linear binary classifier. [4] • A perceptron is a type of artificial neuron. • An artificial neuron generally produces a continuous value, whereas a perceptron produces only a binary output. [5] • A perceptron's output is pushed to the extreme values of 0 or 1. [5] • General artificial neurons use softer activation functions such as the sigmoid function, the hyperbolic tangent (tanh), and the rectified linear unit (ReLU). Their outputs are continuous: sigmoid gives values between 0 and 1, tanh between -1 and 1, and ReLU from 0 upward. [6] • A perceptron uses a step function, such as the Heaviside step function, as its activation function. These functions generate a binary output, either 0 or 1. [7] • A perceptron still qualifies as a form of artificial neuron due to its core characteristics. [2]
  • 7. “All perceptrons are neurons, but not all neurons are perceptrons.”
  • 8. Perceptron Structure • A perceptron has the following components. • Input: Each input node takes a binary value of 1 or 0. [8] • Weight: Represents the importance of each input. [8] • Bias: Used for shifting the activation function towards 0 or 1. [9] • Summation function: Computes a weighted sum of the inputs. • Activation function: Outputs 1 if the summation function returns a value greater than or equal to the threshold, and 0 otherwise. • Output: The output of the activation function. Fig: Perceptron [9]
  • 9. Perceptron in action • Here are the preferences of a customer and the details of 2 restaurants. Threshold = 1.

      Node | Criteria          | Personal preference (W) | Restaurant A | Restaurant B
      A    | Good food         | 0.8                     | Yes (1)      | Yes (1)
      B    | Friends will come | 0.6                     | Yes (1)      | No (0)
      C    | Cheap             | 0.4                     | No (0)       | Yes (1)
      D    | Noisy environment | -0.5                    | Yes (1)      | No (0)
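
The restaurant example can be worked through in a few lines of Python. This is a minimal sketch, assuming the step rule from the Perceptron Structure slide (output 1 when the weighted sum is at least the threshold). The dictionary keys and the `perceptron` function name are illustrative choices, not part of the slides, and the computed results are not stated in the deck itself.

```python
# Customer preferences (weights) and threshold taken from the table above.
WEIGHTS = {"good_food": 0.8, "friends_will_come": 0.6, "cheap": 0.4, "noisy": -0.5}
THRESHOLD = 1.0


def perceptron(inputs, weights=WEIGHTS, threshold=THRESHOLD):
    """Return (weighted_sum, output) using a step activation."""
    weighted_sum = sum(weights[name] * value for name, value in inputs.items())
    output = 1 if weighted_sum >= threshold else 0
    return weighted_sum, output


# Binary inputs for each restaurant, copied from the table.
restaurant_a = {"good_food": 1, "friends_will_come": 1, "cheap": 0, "noisy": 1}
restaurant_b = {"good_food": 1, "friends_will_come": 0, "cheap": 1, "noisy": 0}

sum_a, out_a = perceptron(restaurant_a)
sum_b, out_b = perceptron(restaurant_b)

print(f"Restaurant A: weighted sum = {sum_a:.1f}, output = {out_a}")
print(f"Restaurant B: weighted sum = {sum_b:.1f}, output = {out_b}")
```

Under these assumptions Restaurant A sums to 0.9, which is below the threshold, so the perceptron outputs 0. Restaurant B sums to 1.2, so it outputs 1.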
  • 10. Multilayer Perceptron • The multilayer perceptron is a supervised learning model. [10] • A multilayer perceptron (MLP) is a fully connected, feed-forward artificial neural network. [10] • An MLP is typically trained with the backpropagation algorithm, which is why the model is sometimes loosely referred to by that name. [11] Fig: A taxonomy of neural network architectures [10]
  • 11. Artificial Neural Network • It is a machine-learning model designed to mimic the function and structure of the human brain. • ANNs are composed of multiple nodes, which imitate the biological neurons of the human brain. [13] • The nodes take input data and perform simple operations on the data. [13] • Each link between two nodes is associated with a weight. • Nodes are arranged in multiple layers: the input layer, the hidden layers, and the output layer. • ANNs are capable of learning by altering weight values. [13] Fig: Neural Network [17]
  • 12. How ANN Works [14] • The input neurons multiply the input values by their weights and pass them to the neurons in the hidden layer. • Neurons in the hidden layer are associated with a numerical value called the bias. • Hidden neurons add the weighted values and the bias and pass the result to an activation function. • The result of the activation function determines whether the neuron will be activated. • The activated neurons transmit their values to the neurons in the next layer. • In this manner the data is propagated from the input layer to the output layer. This is called forward propagation. • In the output layer the neuron with the highest value fires and determines the output. • The errors of the output layer are propagated back through the network to reduce the error by modifying the weights and biases. This is called backpropagation.
  • 13. The Stilwell Brain Experiment [1] • Human Neural Network • Experiment by Michael Stevens • YouTube Channel: Vsauce • Name: The Stilwell Brain • Series: Mind Field • Season: 3 • Episode: 3 • Each person acts like a neuron. • Several people are arranged in layers. • They fire by raising a flag. • Each layer gradually identifies more complex patterns.
  • 14. The Stilwell Brain [1] • He writes a number on a piece of paper and divides the paper into several pieces. • He gives each person in the input layer a piece of the paper. • The people forming the neural network don’t know the number. • Each piece of paper represents a pixel.
  • 15. The Stilwell Brain [1] • People in the first hidden layer identify very basic lines.
  • 16. The Stilwell Brain [1] • The second hidden layer identifies more complex patterns, such as an angle.
  • 17. The Stilwell Brain [1] • The third hidden layer identifies several angles, and a pattern starts to emerge.
  • 18. The Stilwell Brain [1] • The people in the output layer look for their designated patterns, and the person associated with the number raises the flag.
  • 19. Feed Forward Neural Network [15] • The connections between units do not form a cycle. • Information only travels forward in the network. • Used in classification problems where the data is not sequential or time-dependent. Fig: Feed Forward Neural Network [15]
  • 20. Fully Connected Neural Network • Every neuron in one layer is connected to every neuron in the next layer. [16] • Training a fully connected network takes a long time because a large number of parameters must be updated. [2] • Storing and manipulating a large number of weights and activations consumes a significant amount of memory. [2] Fig: Fully Connected Neural Network [16]
  • 23. Activation functions in a nutshell • The activation function calculates the output of a neuron. [18]

      Property | Sigmoid function | ReLU function | Softmax function
      Equation | $A(x) = \frac{1}{1 + e^{-x}}$ | $A(x) = \max(0, x)$ | $A(x_i) = \frac{e^{x_i}}{\sum_{j=1}^{k} e^{x_j}}$
      Nature   | Non-linear | Non-linear | Non-linear
      Output   | 0 to 1 | 0 to ∞ | 0 to 1
      Uses     | Usually used in the output layer of a binary classifier | Usually used in the hidden layers of a neural network | Usually used when handling multiple classes
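
The three activation functions in the table can be sketched in plain Python as follows. The function names are illustrative. Subtracting the maximum inside `softmax` is a common numerical-stability step that is not mentioned on the slide; it does not change the result.

```python
import math


def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """Pass positive values through unchanged and clamp negatives to 0."""
    return max(0.0, x)


def softmax(xs):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(xs)  # stability trick (assumption, not from the slide)
    exps = [math.exp(x - shift) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


print(sigmoid(0.0))              # midpoint of the sigmoid
print(relu(-2.0), relu(3.0))     # negative input is clamped, positive is kept
print(softmax([1.0, 2.0, 3.0]))  # largest score gets the largest probability
```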
  • 24. How Does a Multilayer Perceptron Work? • It works in two phases: forward propagation and backpropagation. • Forward propagation is where prediction occurs. • Backpropagation is where learning occurs. • It uses non-linear activation functions such as sigmoid, tanh, and ReLU to fit non-linear data. Fig: Neural Network [12]
  • 25. Forward Propagation [19]
      • The journey starts at the input layer and ends at the output layer.
      • Consider the following MLP with its inputs, initial weights, and biases.
      • Initial weights are generated randomly.
      • The activation function is the sigmoid.
      • Outputs of the hidden layer neurons:
        $net_{h1} = w_1 \cdot i_1 + w_2 \cdot i_2 + b_1 = 0.15 \cdot 0.05 + 0.2 \cdot 0.1 + 0.35 = 0.3775$
        $out_{h1} = \frac{1}{1 + e^{-net_{h1}}} = \frac{1}{1 + e^{-0.3775}} = 0.593$
        $net_{h2} = w_3 \cdot i_1 + w_4 \cdot i_2 + b_1 = 0.25 \cdot 0.05 + 0.3 \cdot 0.1 + 0.35 = 0.3925$
        $out_{h2} = \frac{1}{1 + e^{-net_{h2}}} = \frac{1}{1 + e^{-0.3925}} = 0.597$
  • 26. Continuing Forward Propagation
      • Outputs of the output layer neurons:
        $net_{o1} = w_5 \cdot out_{h1} + w_6 \cdot out_{h2} + b_2 = 0.4 \cdot 0.593 + 0.45 \cdot 0.597 + 0.60 \approx 1.1$
        $out_{o1} = \frac{1}{1 + e^{-net_{o1}}} = \frac{1}{1 + e^{-1.1}} \approx 0.75$
        $net_{o2} = w_7 \cdot out_{h1} + w_8 \cdot out_{h2} + b_2 = 0.5 \cdot 0.593 + 0.55 \cdot 0.597 + 0.60 \approx 1.22$
        $out_{o2} = \frac{1}{1 + e^{-net_{o2}}} = \frac{1}{1 + e^{-1.22}} \approx 0.77$
      • The neuron with the highest value determines the output.
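
The forward pass above can be reproduced with a short script. This is a sketch that uses the inputs, weights, and biases appearing in the worked equations (i1 = 0.05, i2 = 0.10, b1 = 0.35, b2 = 0.60). The variable names mirror the slide notation, and the printed values are rounded to three decimals.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Inputs, weights and biases as used in the worked example.
i1, i2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30   # input -> hidden
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55   # hidden -> output
b1, b2 = 0.35, 0.60

# Hidden layer: weighted sum plus bias, then sigmoid.
net_h1 = w1 * i1 + w2 * i2 + b1
net_h2 = w3 * i1 + w4 * i2 + b1
out_h1 = sigmoid(net_h1)
out_h2 = sigmoid(net_h2)

# Output layer: same computation, fed by the hidden outputs.
net_o1 = w5 * out_h1 + w6 * out_h2 + b2
net_o2 = w7 * out_h1 + w8 * out_h2 + b2
out_o1 = sigmoid(net_o1)
out_o2 = sigmoid(net_o2)

print(f"out_h1 = {out_h1:.3f}, out_h2 = {out_h2:.3f}")
print(f"out_o1 = {out_o1:.3f}, out_o2 = {out_o2:.3f}")
```

Because the script carries full precision between layers, its values can differ from the slide in the last digit, since the slide rounds the intermediate results.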
  • 27. Error Function
      • A measure that quantifies the difference between the predicted output and the actual output.
      • Also known as the loss function or the cost function.
      • Backpropagation tries to minimize the error function.
      • Common error functions:
      • Mean Squared Error (MSE): Commonly used for regression problems. It calculates the average of the squared differences between predicted and actual values. [20]
        $MSE(p, y) = \frac{1}{n} \sum_{i=1}^{n} (y_i - p_i)^2$
      • Binary Cross-Entropy Loss: Frequently used for binary classification tasks. It measures the difference between predicted probabilities and true binary labels. [21]
        $BCE(p, y) = -[y \log(p) + (1 - y) \log(1 - p)]$
      • Categorical Cross-Entropy Loss: Often used for multi-class classification problems. It measures the difference between predicted probabilities and true class labels. [22]
        $CCE(p, y) = -\sum_{i} y_i \log(p_i)$
  • 28. Calculating Error
      • Error of the forward propagation using the Mean Squared Error function:
        $E_{total} = MSE(p, y) = \frac{1}{n} \sum_{i=1}^{n} (y_i - p_i)^2$
        $E_{total} = \frac{1}{2} [(target_{o1} - out_{o1})^2 + (target_{o2} - out_{o2})^2]$
        $E_{total} = \frac{1}{2} [(0.01 - 0.75)^2 + (0.99 - 0.77)^2] = 0.298$
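
The error functions and the worked error value can be checked with a small sketch. The function names are illustrative. Skipping zero-valued labels in `categorical_cross_entropy` is an implementation choice made here so that a predicted probability of 0 for an inactive class does not raise an error.

```python
import math


def mse(p, y):
    """Mean squared error between predictions p and targets y."""
    return sum((yi - pi) ** 2 for pi, yi in zip(p, y)) / len(y)


def binary_cross_entropy(p, y):
    """Loss for one predicted probability p and one binary label y."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def categorical_cross_entropy(p, y):
    """Loss for predicted class probabilities p and one-hot labels y."""
    return -sum(yi * math.log(pi) for pi, yi in zip(p, y) if yi > 0)


# Worked example: rounded network outputs and their targets.
e_total = mse([0.75, 0.77], [0.01, 0.99])
print(f"E_total = {e_total:.3f}")

# Two small illustrative calls (inputs chosen for illustration only).
print(binary_cross_entropy(0.9, 1))
print(categorical_cross_entropy([0.7, 0.2, 0.1], [1, 0, 0]))
```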
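  • 29. The Chain Rule
      • The chain rule is a fundamental calculus principle used extensively in backpropagation.
      • It is extremely important for the multilayer perceptron.
      • It determines the relation between a weight and the error:
        $\frac{\partial E_{total}}{\partial w_5} = \frac{\partial E_{total}}{\partial out_{o1}} \cdot \frac{\partial out_{o1}}{\partial net_{o1}} \cdot \frac{\partial net_{o1}}{\partial w_5}$
      • $\frac{\partial E_{total}}{\partial w_5}$ is the rate of change in the error with respect to a change in the weight.
      • $\frac{\partial E_{total}}{\partial out_{o1}}$ is the rate of change in the error with respect to a change in the output value of a neuron.
      • $\frac{\partial out_{o1}}{\partial net_{o1}}$ is the rate of change in the output value of a neuron with respect to a change in its net value.
      • $\frac{\partial net_{o1}}{\partial w_5}$ is the rate of change in the net value with respect to a change in the weight.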
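  • 30. Backpropagation
      • Change in error with respect to the output:
        $E_{total} = \frac{1}{2} [(target_{o1} - out_{o1})^2 + (target_{o2} - out_{o2})^2]$
        $\frac{\partial E_{total}}{\partial out_{o1}} = 2 \cdot \frac{1}{2} (target_{o1} - out_{o1}) \cdot (-1) + 0 = -(0.01 - 0.75) = 0.74$
      • Change in output with respect to the net input:
        $out_{o1} = \frac{1}{1 + e^{-net_{o1}}}$
        $\frac{\partial out_{o1}}{\partial net_{o1}} = out_{o1} (1 - out_{o1})$ [23]
        $= 0.7514 \cdot (1 - 0.7514) \approx 0.187$ (using the unrounded output $out_{o1} \approx 0.7514$)
      • Change in net input with respect to the weight:
        $net_{o1} = w_5 \cdot out_{h1} + w_6 \cdot out_{h2} + b_2$
        $\frac{\partial net_{o1}}{\partial w_5} = 1 \cdot out_{h1} + 0 + 0 = 0.593$
      • Change in error with respect to the weight:
        $\frac{\partial E_{total}}{\partial w_5} = 0.74 \cdot 0.187 \cdot 0.593 \approx 0.082$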
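
The chain-rule product for w5 can be verified numerically. The sketch below recomputes the forward values at full precision, multiplies the three partial derivatives, and compares the result against a finite-difference estimate. The finite-difference check and the helper `error_o1` are additions for illustration and are not part of the slides.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Values from the worked example.
i1, i2 = 0.05, 0.10
b1, b2 = 0.35, 0.60
w5, w6 = 0.40, 0.45
target_o1 = 0.01

out_h1 = sigmoid(0.15 * i1 + 0.20 * i2 + b1)
out_h2 = sigmoid(0.25 * i1 + 0.30 * i2 + b1)
out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2)

# The three factors of the chain rule.
dE_dout = -(target_o1 - out_o1)       # dE_total / d out_o1
dout_dnet = out_o1 * (1.0 - out_o1)   # derivative of the sigmoid
dnet_dw5 = out_h1                     # d net_o1 / d w5

dE_dw5 = dE_dout * dout_dnet * dnet_dw5
print(f"dE/dw5 (chain rule) = {dE_dw5:.4f}")


def error_o1(w5_value):
    """Part of E_total that depends on w5 (the o2 term is constant in w5)."""
    out = sigmoid(w5_value * out_h1 + w6 * out_h2 + b2)
    return 0.5 * (target_o1 - out) ** 2


# Central finite difference as an independent check of the gradient.
eps = 1e-6
numeric = (error_o1(w5 + eps) - error_o1(w5 - eps)) / (2 * eps)
print(f"dE/dw5 (numeric)    = {numeric:.4f}")
```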
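  • 31. What does it mean? • What is the meaning of $\frac{\partial E_{total}}{\partial w_5} = 0.082$? For a small change in the weight, $\Delta E_{total} \approx 0.082 \cdot \Delta w_5$. • So small changes in $E_{total}$ and $w_5$ are proportionally related. • The coefficient of proportionality is 0.082.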
  • 32. Learning Rate • The learning rate controls how much the model changes in response to the estimated error each time the model weights are updated. [24] • A very large learning rate can overshoot the optimal values. • A very small learning rate can cause the optimization process to converge extremely slowly. • The learning rate is the step size of the gradient descent algorithm and is usually tuned as a hyperparameter.
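
The effect of the learning rate can be illustrated on a toy problem. The sketch below minimises the hypothetical function f(w) = w², which is not from the slides and was chosen only because its gradient, 2w, is easy to follow. The three learning rates are likewise arbitrary illustrative values.

```python
def gradient_descent(learning_rate, start=1.0, steps=20):
    """Run plain gradient descent on f(w) = w**2 and return the final w."""
    w = start
    for _ in range(steps):
        grad = 2.0 * w                  # derivative of w**2
        w = w - learning_rate * grad    # standard update rule
    return w


for lr in (0.01, 0.4, 1.1):
    final_w = gradient_descent(lr)
    print(f"learning rate {lr}: w after 20 steps = {final_w:.4g}")
```

With the small rate, w is still far from the minimum at 0 after 20 steps. With the moderate rate it is essentially at 0. With the large rate every step overshoots and w grows in magnitude instead of shrinking.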
  • 33. Continuing Backpropagation
      • To decrease the error, we then update the weight.
      • Assume the learning rate $\mu = 0.5$:
        $w_5^{+} = w_5 - \mu \cdot \frac{\partial E_{total}}{\partial w_5} = 0.4 - 0.5 \cdot 0.082 \approx 0.359$
      • Similarly, we get (values truncated to three decimal places):
        $w_6^{+} = 0.408$, $w_7^{+} = 0.511$, $w_8^{+} = 0.561$
        $w_1^{+} = 0.149$, $w_2^{+} = 0.199$, $w_3^{+} = 0.249$, $w_4^{+} = 0.299$
      • Repeat this process until the desired accuracy is reached.
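
The whole update step, repeated over many iterations, can be sketched as follows for the 2-2-2 network of the worked example. Two assumptions are made here: the biases are held fixed because the slides list updated values only for the weights, and the iteration count of 1000 is arbitrary. The function names and the list layout `w = [w1, ..., w8]` are illustrative.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(w, x, b1=0.35, b2=0.60):
    """Return hidden and output activations for weights w = [w1..w8]."""
    h1 = sigmoid(w[0] * x[0] + w[1] * x[1] + b1)
    h2 = sigmoid(w[2] * x[0] + w[3] * x[1] + b1)
    o1 = sigmoid(w[4] * h1 + w[5] * h2 + b2)
    o2 = sigmoid(w[6] * h1 + w[7] * h2 + b2)
    return (h1, h2), (o1, o2)


def total_error(outputs, targets):
    return sum(0.5 * (t - o) ** 2 for o, t in zip(outputs, targets))


def train_step(w, x, targets, lr):
    """One backpropagation step; biases are kept fixed (assumption)."""
    (h1, h2), (o1, o2) = forward(w, x)
    # Output deltas: dE/dout * dout/dnet for each output neuron.
    d_o1 = (o1 - targets[0]) * o1 * (1 - o1)
    d_o2 = (o2 - targets[1]) * o2 * (1 - o2)
    # Hidden deltas: error flowing back through the current output weights.
    d_h1 = (d_o1 * w[4] + d_o2 * w[6]) * h1 * (1 - h1)
    d_h2 = (d_o1 * w[5] + d_o2 * w[7]) * h2 * (1 - h2)
    grads = [d_h1 * x[0], d_h1 * x[1], d_h2 * x[0], d_h2 * x[1],
             d_o1 * h1, d_o1 * h2, d_o2 * h1, d_o2 * h2]
    return [wi - lr * g for wi, g in zip(w, grads)]


x = (0.05, 0.10)
targets = (0.01, 0.99)
weights = [0.15, 0.20, 0.25, 0.30, 0.40, 0.45, 0.50, 0.55]

after_one_step = train_step(weights, x, targets, lr=0.5)
print("w5 after one update:", round(after_one_step[4], 4))

errors = []
w = list(weights)
for _ in range(1000):
    errors.append(total_error(forward(w, x)[1], targets))
    w = train_step(w, x, targets, lr=0.5)
print(f"error before training: {errors[0]:.4f}, after: {errors[-1]:.6f}")
```

All eight gradients are computed from the weights as they were before the step, so the hidden-layer updates use the original w5 to w8, not the freshly updated ones.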
  • 34. Importance of MLP • Can fit highly non-linear datasets. [10] • Can automatically learn sophisticated features from raw input data. • A versatile model that can be used in classification, regression, pattern recognition, time series analysis, and many other fields. • A scalable model that can be used on anything from small to very large datasets. • MLPs serve as fundamental building blocks in deep learning. By stacking multiple hidden layers, MLPs become deep neural networks (DNNs). • MLPs have been at the forefront of neural network research and innovation. • MLPs are widely used in practical applications such as computer vision, natural language processing, speech recognition, finance, healthcare, recommendation systems, robotics, and more.
  • 35. References
      • [1] “The stilwell brain,” YouTube, https://www.youtube.com/watch?v=rA5qnZUXcqo (accessed Oct. 2, 2023).
      • [2] “Chatgpt,” ChatGPT, https://openai.com/chatgpt (accessed Oct. 2, 2023).
      • [3] RainerGewalt and Name, “Perceptrons - these artificial neurons are the fundamentals of Neural Networks,” Fly spaceships with your mind, https://starship-knowledge.com/neural-networks-perceptrons (accessed Oct. 3, 2023).
      • [4] “What is a feedforward fully connected neural network and how to implement it in Matlab?,” Saturn Cloud Blog, https://saturncloud.io/blog/what-is-a-feedforward-fully-connected-neural-network-and-how-to-implement-it-in-matlab/ (accessed Oct. 4, 2023).
      • [5] P. King, “What is the difference between the ‘neurons’ in an artificial neural network (ANN) and ‘perceptrons’?,” Quora, https://www.quora.com/What-is-the-difference-between-the-neurons-in-an-artificial-neural-network-ANN-and-perceptrons (accessed Oct. 2, 2023).
      • [6] “Activation functions in neural networks,” GeeksforGeeks, https://www.geeksforgeeks.org/activation-functions-neural-networks/ (accessed Oct. 2, 2023).
      • [7] “Perceptron in Machine Learning - Javatpoint,” www.javatpoint.com, https://www.javatpoint.com/perceptron-in-machine-learning (accessed Oct. 3, 2023).
      • [8] Perceptrons, https://www.w3schools.com/ai/ai_perceptrons.asp (accessed Oct. 3, 2023).
      • [9] A. Hange, “Flux prediction using single-layer perceptron and Multilayer Perceptron,” Medium, https://medium.com/nerd-for-tech/flux-prediction-using-single-layer-perceptron-and-multilayer-perceptron-cf82c1341c33 (accessed Oct. 3, 2023).
      • [10] M. W. Gardner and S. R. Dorling, “Artificial Neural Networks (the multilayer perceptron)—a review of applications in the Atmospheric Sciences,” Atmospheric Environment, vol. 32, no. 14–15, pp. 2627–2636, 1998. doi:10.1016/s1352-2310(97)00447-0.
      • [11] “Perceptron in Machine Learning - Javatpoint,” www.javatpoint.com, https://www.javatpoint.com/perceptron-in-machine-learning (accessed Oct. 3, 2023).
      • [12] Minesanalytix, “Demystifying Data-Driven Neural Networks for multivariate production analysis,” Data Analytix Association @ Mines, https://orgs.mines.edu/daa/blog/2019/08/05/neural-networks-mva/ (accessed Oct. 4, 2023).
      • [13] “Artificial Intelligence - Neural Networks,” Online Courses and eBooks Library, https://www.tutorialspoint.com/artificial_intelligence/artificial_intelligence_neural_networks.htm (accessed Oct. 4, 2023).
  • 36.
      • [14] “Neural network in 5 minutes | what is a neural network? | how neural networks work | Simplilearn,” YouTube, https://www.youtube.com/watch?v=bfmFfD2RIcg (accessed Oct. 4, 2023).
      • [15] “Feedforward Neural Networks,” Brilliant Math & Science Wiki, https://brilliant.org/wiki/feedforward-neural-networks/ (accessed Oct. 4, 2023).
      • [16] P. Mahajan, “Fully connected vs Convolutional Neural Networks,” Medium, https://medium.com/swlh/fully-connected-vs-convolutional-neural-networks-813ca7bc6ee5 (accessed Oct. 4, 2023).
      • [17] “A beginner’s guide to keras: Digit recognition in 30 minutes,” SitePoint, https://www.sitepoint.com/keras-digit-recognition-tutorial/ (accessed Oct. 4, 2023).
      • [18] “Activation function,” Wikipedia, https://en.wikipedia.org/wiki/Activation_function (accessed Oct. 4, 2023).
      • [19] Mazur, “A step by step backpropagation example,” Matt Mazur, https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/ (accessed Oct. 5, 2023).
      • [20] “Mean squared error,” Wikipedia, https://en.wikipedia.org/wiki/Mean_squared_error (accessed Oct. 5, 2023).
      • [21] “A practical guide to binary cross-entropy and Log Loss,” Aporia, https://www.aporia.com/learn/understanding-binary-cross-entropy-and-log-loss-for-effective-model-monitoring/ (accessed Oct. 5, 2023).
      • [22] Neuralthreads, “Categorical cross-entropy loss-the most important loss function,” Medium, https://neuralthreads.medium.com/categorical-cross-entropy-loss-the-most-important-loss-function-d3792151d05b (accessed Oct. 5, 2023).
      • [23] “Logistic function,” Wikipedia, https://en.wikipedia.org/wiki/Logistic_function#Derivative (accessed Oct. 5, 2023).
      • [24] J. Brownlee, “Understand the impact of learning rate on neural network performance,” MachineLearningMastery.com, https://machinelearningmastery.com/understand-the-dynamics-of-learning-rate-on-deep-learning-neural-networks/ (accessed Oct. 5, 2023).
      • [25] “Educative answers - trusted answers to developer questions,” Educative, https://www.educative.io/answers/what-is-forward-propagation-in-neural-networks (accessed Oct. 5, 2023).