Artificial Intelligence
Neural Networks
OUTLINE
• Biological neuron structure and function
• The structure of an Artificial Neuron (perceptron)
• What is an Artificial Neural Network?
• How do artificial neural networks work?
• Model of ANN
• Application of Artificial Neural Networks
Biological neuron structure and function
• The human ability to perceive our surroundings – to see, hear, and smell what’s around us – depends on the nervous system.
• The capacity to know where we are depends on the nervous system.
• The ability to act on that information also depends on the nervous system.
• All of these processes depend on the interconnected cells (neurons) that make up
your nervous system.
• The basic functions of a neuron:
• Receive signals (or information).
• Integrate incoming signals (to determine whether or not the information should be passed
along).
• Communicate signals to target cells (other neurons or muscles or glands).
Biological neuron structure and function
Neuron Anatomy
• Neurons vary in size, shape, and
structure depending on their role
and location.
• The Soma is a neuron cell body
which converts input activation
into output activation.
• Axons are transmission lines that
send activation to other neurons.
• Dendrites are the receive zones that
receive activation from other
neurons.
• A synapse allows signal transmission between an axon and a dendrite.
Biological neuron structure and function
Neuron Anatomy
• All neurons have three essential
parts:
• Dendrites: Input
• Soma (Cell body): Processor
• Axon: Output
• All neurons are connected to one another by synapses: Links.
Biological neuron structure and function
Neuron Anatomy
• A neuron is connected to other neurons through about 10,000 synapses (links).
• A neuron receives input from other neurons. Inputs are combined.
• Once the combined input exceeds a critical level, the neuron discharges a spike – an electrical pulse that travels from the cell body, down the axon (output), to the next neuron(s).
• The axon endings almost touch the dendrites (input) or cell body (processor) of the next neuron.
• Transmission of an electrical signal from one neuron to the next is mediated by neurotransmitters.
• Neurotransmitters are chemicals released from the first neuron that bind to the second.
• The strength of the signal that reaches the next neuron depends on factors such as the amount of
neurotransmitter available.
The structure of an Artificial Neuron (perceptron)
• Inspired by the biological neuron, the artificial neuron was created.
• The structure of an artificial neuron (perceptron) is as shown:
[Figure: structure of a perceptron – inputs X1, X2, X3 (dendrites) with connection weights Wa, Wb, Wc (synapses) and a bias b feed a summing function Σ (soma); its result passes through an activation function f(·) to produce the output Y (axon).]
The structure of an Artificial Neuron (perceptron)
• Similar to the biological neuron, the function of the perceptron is the same:
1. Receives inputs from other sources.
2. Combine them in some way.
3. Performs a generally nonlinear operation on the result.
4. Outputs the final result.
• Hence, the perceptron is a mathematical function modelled on the working of biological neurons.
• Perceptron is a linear classifier (binary).
• One or more inputs are separately weighted.
• Inputs are summed and passed through a nonlinear function to produce output.
• Every perceptron holds an internal state called an activation signal.
• Every perceptron is connected to another neuron via a connection link.
• Each connection link carries information about the input signal.
• It is important to note that a perceptron can send only one signal at a time.
The structure of a perceptron
• The inputs of perceptron reflect the real world data.
𝑥 = (𝑥1, … , 𝑥𝑛)
• The synaptic weights 𝑤 = (𝑤1, … , 𝑤𝑛) form a vector whose size equals the number of inputs.
• The bias 𝑏 is a constant real value or vector.
• The summing (combination) function, 𝜇(𝑥):
• takes the input vector 𝑥 to produce a combined value.
• The combination is computed as the bias plus a linear combination of the synaptic weights and the inputs in the perceptron:
$\mu = b + \sum_{i=1}^{n} w_i x_i$
• The bias increases or reduces the net input to the activation function, depending on whether it is positive or negative.
• The activation function 𝑓(⋅) defines the output from the perceptron in terms of its combination.
• The output 𝑦 is represented in terms of the composition of the combination and the activation
functions.
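A minimal Python sketch of this forward pass (the function name, sample values, and the step activation are illustrative; NumPy is assumed):

```python
import numpy as np

def perceptron_output(x, w, b):
    """Forward pass: combination mu = b + sum_i w_i * x_i, then an activation."""
    mu = b + np.dot(w, x)        # summing (combination) function
    return 1 if mu >= 0 else 0   # activation function (binary step, for illustration)

x = np.array([0.5, -1.2, 3.0])   # inputs
w = np.array([0.4, 0.1, -0.2])   # synaptic weights
print(perceptron_output(x, w, b=0.1))  # -> 0
```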
The structure of a perceptron Cont.
• Why do we need Weights?
• To answer this question:
• Let us consider that there are no weights, as shown in the figure:
• The summation function 𝜇 sums up all the inputs and adds bias to it.
• Let us say, the role of the activation function is to allocate the data points to one of the classes.
• Compare the model expression with the line equation, as shown in Eq. 1 and Eq. 2:
$y = mx + c$ (1)
$x_2 = -x_1 + b$ (2)
• The slope (𝑚) of the equation $x_2 = -x_1 + b$ is fixed at −1.
• The combination and output of the unweighted model are:
$\mu = b + \sum_{i=1}^{2} x_i = x_1 + x_2 + b$
$y = \begin{cases} 1 & \text{if } \mu(x) \ge 0 \\ 0 & \text{otherwise} \end{cases}$
The structure of a perceptron Cont.
• Why do we need Weights?
• Example: Consider the following dataset for illustration:
• It contains two independent features [𝑥₁ 𝑎𝑛𝑑 𝑥₂] and one dependent
feature y.
• Our task is to classify a given data point to one of the classes that
belong to feature 𝑦.
• From the data, if we try to fit the line equation ($x_2 = -x_1 + b$) for different values of 𝑏, we get the following plot:
The structure of a perceptron Cont.
• Why do we need Weights?
• Orange line is with 𝑏 = 0.
• Blue line is with 𝑏 = 1.
• Green line is with 𝑏 = 2.
𝑥₂ = −𝑥₁ + 𝑏
• Changing the value of 𝑏, we end up with parallel lines.
• No change in the orientation or slope of the line.
• So, we require weights to change the orientation of
the line.
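A tiny numeric sketch of this point: whatever value 𝑏 takes, the boundary of the unweighted model keeps slope −1 (values are illustrative):

```python
# Decision boundary of the unweighted model: x2 = -x1 + b.
for b in (0, 1, 2):
    x2 = lambda x1, b=b: -x1 + b
    slope = x2(1.0) - x2(0.0)   # rise over a run of 1
    print(f"b={b}: intercept={x2(0.0)}, slope={slope}")  # slope is always -1.0
```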
The structure of a perceptron Cont.
• What do the weights in a Neuron convey to us?
• Importance of the feature: features whose weights are close to zero are said to have less importance in the prediction process than features whose weights are larger.
• Tells the relationship between input and output:
• Example: consider the inputs of a perceptron representing a car purchase: people often tend to buy a car within their budget, and the most popular one among many.
• Hence, if the weight is positive, there is a direct relationship between that feature and the target value, and an inverse relationship if the weight is negative.
The structure of a perceptron Cont.
• Why do we need Bias?
• A bias value allows us to shift the activation function curve up or down.
• It can increase classification model accuracy.
• It serves as another model parameter to increase
the model performance on training data.
The structure of a perceptron Cont.
• Activation Functions:
• The activation function, also known as the transfer function, is the nonlinear function applied to the inner product $X^T W$ in an artificial neural network.
• Properties of activation function
1. Nonlinear: There are two reasons why an activation function should be
nonlinear:
• The boundaries or patterns in real-world problems are mostly non-linear and a
non-linear function can easily approximate a linear boundary whereas a linear
function cannot approximate a non-linear boundary.
• If the activation function is linear, then a perceptron with multiple hidden
layers can be easily compressed to a single layer perceptron.
The structure of a perceptron Cont.
• Activation Functions:
• Properties of activation function
2. Differentiable:
• During backpropagation, the gradient of the loss function is calculated for the gradient descent method.
• The gradient of the loss function with respect to weight is calculated using chain rule.
• Hence, it is necessary that the activation function is differentiable with respect to its input.
3. Continuous: A function cannot be differentiable unless it is continuous.
4. Bounded:
• The input data is passed through a series of perceptrons, each of which contains an activation
function.
• If the function is not bounded in a range, the output value may explode.
The structure of a perceptron Cont.
• Activation Functions:
• Properties of activation function
5. Zero-centered:
• A function is said to be zero centered when its range contains both positive and negative values.
• If the activation function of the network is not zero-centered, $y = f(X^T W)$ is always positive or always negative.
• Thus, the output of a layer is always shifted toward either positive or negative values.
• As a result, the weight vector needs more updates to be trained properly.
• So, the number of epochs needed for the network to get trained increases if the activation function is not zero-centered.
6. Computational cost:
• It is defined as the time required to generate the output of the activation function when input is fed to it.
• The computational cost of the gradient is also important as the gradient is calculated during weight update in
backpropagation.
• Gradient descent optimization itself is a very time consuming process and many iterations are needed to perform
this. Therefore, the computational cost is an issue.
• When the activation function and the gradient of the activation function have high computational cost, it requires
more time to get trained.
The structure of a perceptron Cont.
• Why do we need Activation Function?
• It is biologically inspired by activity inside our brain, wherein different neurons are activated by different stimuli.
• It decides whether a neuron should be activated or not.
• It will decide whether the neuron’s input to the network is important or not.
• It is also used to introduce non-linearity into the output of a neuron.
• It performs the non-linear transformation of the input, making the network capable of learning and performing more complex tasks.
• Studying the derivatives and applications of activation functions is essential for selecting the proper type of activation function that gives accuracy in a particular neural network model.
The structure of a perceptron Cont.
• Problems faced by activation functions:
There are two major problems:
1. Vanishing gradient problem:
• When an activation function compresses a large range of input into a small output range, a large change to the input of the activation function results in a very small change to the output.
• This leads to a small (typically close to zero) gradient value.
• During the weight update, when backpropagation calculates the gradients of the network, the gradients of each layer are multiplied together from the final layer down to that layer because of the chain rule.
• When a value close to zero is multiplied by other values close to zero several times, the product gets closer and closer to zero (see the numeric sketch below).
• So, the weights get saturated and are not updated properly.
• Neurons whose weights are not updated properly are called saturated neurons.
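A small numeric illustration of the chain-rule effect, assuming sigmoid activations (the values in the comments are approximate):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# For a saturated unit (large |input|) the sigmoid gradient is tiny.
g = sigmoid(5.0) * (1.0 - sigmoid(5.0))
print(g)         # ~0.0066

# Chain rule across 10 saturated layers: the product collapses toward zero,
# so the early layers receive almost no gradient.
print(g ** 10)   # ~1.6e-22
```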
The structure of a perceptron Cont.
• Problems faced by activation functions:
There are two major problems:
2. Dead neuron problem:
• When an activation function forces a large part of the input to zero or almost zero, those corresponding
neurons are inactive/dead in contributing to the final output.
• During weight update, there is a possibility that the weights will be updated in such a way that the
weighted sum of a large part of the network will be forced to zero.
• A network will hardly recover from such a situation and a large portion of the input fails to contribute
to the network.
• This is a problem because a large part of the input may remain completely deactivated while the network operates.
• These forcefully deactivated neurons are called 'dead neurons', and this problem is termed the dead neuron problem.
The structure of a perceptron Cont.
• Types of Neural Networks Activation Functions:
1. Binary Step Function:
• It depends on a threshold value that decides whether a neuron should be activated or not.
• Mathematically it can be represented as:
$f(x) = \begin{cases} 0 & \text{for } x < \theta \\ 1 & \text{for } x \ge \theta \end{cases}$
• The limitations of binary step function:
• It cannot provide multi-value outputs—for example, it cannot be used for multi-class
classification problems.
• The gradient (slope) of the step function is zero, which causes a hindrance in the backpropagation process (see the sketch below).
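A minimal sketch of the binary step function (θ defaults to 0 here purely for illustration):

```python
def binary_step(x, theta=0.0):
    """Return 1 if x >= theta, else 0."""
    return 1 if x >= theta else 0

print(binary_step(-0.3), binary_step(0.7))  # 0 1
# The derivative is 0 everywhere (undefined at theta), which is
# why backpropagation cannot make use of this function.
```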
The structure of a perceptron Cont.
• Types of Neural Networks Activation Functions:
2. Linear Activation Function:
• The activation is proportional to the input.
• The function does nothing to the weighted sum of the input; it simply returns the value it was given.
• Mathematically it can be represented as:
𝑓 𝑥 = 𝑥
• The limitations of the Linear activation function:
• It’s not possible to use backpropagation as the derivative
of the function is a constant and has no relation to the input x.
• All layers of the neural network will collapse into one if a linear activation function
is used. So, essentially, a linear activation function turns the neural network into just
one layer.
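A short sketch of this collapse argument: two stacked linear layers are exactly equivalent to one linear layer whose weight matrix is the product of the two (all names and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # layer 1 weights
W2 = rng.normal(size=(2, 4))   # layer 2 weights
x = rng.normal(size=3)

y_two_layers = W2 @ (W1 @ x)    # two layers with the linear activation f(x) = x
y_one_layer = (W2 @ W1) @ x     # one equivalent linear layer
print(np.allclose(y_two_layers, y_one_layer))  # True
```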
The structure of a perceptron Cont.
• Types of Neural Networks Activation Functions:
3. Non-Linear Activation Functions:
• A network using only the linear activation function is simply a linear regression model.
• Because of its limited power, this does not allow the model to create complex mappings between
the network’s inputs and outputs.
• Non-linear activation functions solve the limitations of linear activation functions.
• The common non-linear activation functions are:
• Sigmoid Function.
• Hyperbolic Tangent (Tanh) Function.
• Rectified Linear Unit (ReLU) Function.
• Leaky ReLU Function
• Softmax Function
The structure of a perceptron Cont.
• Types of Neural Networks Activation Functions:
3. Non-Linear Activation Functions:
• The common non-linear activation functions are:
• Sigmoid Function:
• Mathematically it can be represented as:
$f(x) = \dfrac{1}{1 + e^{-x}}$
• This outputs a value between 0 and 1, making it useful for binary classification problems as one
can set a threshold “probability” value.
• The limitations of the Sigmoid function:
• Computing exponents is power-intensive and can slow down large networks.
• Although the function is computationally expensive, its gradient is not: it can be calculated using the formula $f'(x) = f(x)(1 - f(x))$.
• The function is not zero-centred; as a result, the weight vector needs more updates to be trained properly.
• It also has large portions of the curve that are almost completely flat, giving very small gradients (the vanishing gradient problem), which makes the network hard to train.
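A minimal NumPy sketch of the sigmoid and its cheap gradient:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)   # f'(x) = f(x)(1 - f(x))

x = np.array([-5.0, 0.0, 5.0])
print(sigmoid(x))        # ~[0.0067, 0.5, 0.9933] -- output in (0, 1)
print(sigmoid_grad(x))   # ~[0.0066, 0.25, 0.0066] -- flat tails: vanishing gradient
```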
The structure of a perceptron Cont.
• Types of Neural Networks Activation Functions:
3. Non-Linear Activation Functions:
• The common non-linear activation functions are:
• Hyperbolic Tangent (Tanh) Function:
• Mathematically it can be represented as:
$f(x) = \dfrac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = \dfrac{1 - e^{-2x}}{1 + e^{-2x}}$
• This outputs a value between -1 and 1.
• It is zero centred which makes parameter optimisation easier in layers coming after it.
• Hence, it gives better training performance for multi-layer neural networks.
• The limitations of the Tanh function:
• Computing exponents is power-intensive and can slow down large networks.
• It also suffers from saturation: the weights get saturated and are not updated properly (the vanishing gradient problem).
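A minimal sketch of tanh and its gradient (NumPy's built-in np.tanh computes the same thing):

```python
import numpy as np

def tanh(x):
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

x = np.array([-3.0, 0.0, 3.0])
print(tanh(x))             # ~[-0.995, 0.0, 0.995] -- zero-centred output
print(1.0 - tanh(x) ** 2)  # gradient f'(x) = 1 - f(x)^2, near zero in the tails
```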
The structure of a perceptron Cont.
• Types of Neural Networks Activation Functions:
3. Non-Linear Activation Functions:
• The common non-linear activation functions are:
• Rectified Linear Unit (ReLU) Function:
• Mathematically it can be represented as:
$f(x) = \begin{cases} 0 & x < 0 \\ x & x \ge 0 \end{cases}$
• This outputs a value between 0 and ∞.
• It has a gradient of 1 for positive values of x.
• It is extremely simple to compute, so it is the most popular choice of activation function for
hidden layers.
• The limitations of the ReLU function:
• It is not zero-centred, so its outputs aren't normalised.
• Inputs with negative weighted sums fail to contribute to the whole process (the dead neuron problem).
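A minimal sketch of ReLU and its gradient:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)   # 0 for x < 0, x for x >= 0

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))                 # [0.  0.  0.  1.5]
print((x > 0).astype(float))   # gradient: 1 for positive x, 0 otherwise (dead neurons)
```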
The structure of a perceptron Cont.
• Types of Neural Networks Activation Functions:
3. Non-Linear Activation Functions:
• The common non-linear activation functions are:
• Leaky ReLU Function:
• Mathematically it can be represented as:
$f(x) = \begin{cases} \alpha x & \text{for } x \le 0 \\ x & \text{otherwise} \end{cases}$
• A shallow slope is added for negative 𝑥 by using alpha (normally small, e.g. 0.01).
• This keeps nodes with negative values of 𝑥 active.
• It is easy to compute, so it is also a popular choice of activation function for hidden layers.
• The limitations of the Leaky ReLU function:
• On the negative part, the gradient is a constant 𝛼 (e.g. 0.01), which is close to zero and may still lead to the vanishing gradient problem.
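A minimal sketch of Leaky ReLU with the usual small slope α = 0.01:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)   # alpha*x instead of 0 for x <= 0

x = np.array([-2.0, -0.5, 1.5])
print(leaky_relu(x))               # [-0.02  -0.005  1.5]
print(np.where(x > 0, 1.0, 0.01))  # gradient: alpha (not 0) on the negative side
```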
The structure of a perceptron Cont.
• Types of Neural Networks Activation Functions:
3. Non-Linear Activation Functions:
• The common non-linear activation functions are:
• Softmax Function:
• Mathematically it can be represented as:
$\sigma(z)_i = \dfrac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$
• The function maps a vector of 𝐾 scores to a probability distribution: every output lies between 0 and 1, the outputs sum to 1, and the largest score receives the highest probability.
• It is most suitable for multi-class classification problems.
• The limitations of Softmax function:
• The calculations for the softmax layer are computationally expensive.
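A minimal sketch of softmax; subtracting the maximum first is a standard numerical-stability trick and does not change the result:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))   # shift by max(z) to avoid overflow
    return e / e.sum()

p = softmax(np.array([2.0, 1.0, 0.1]))
print(p, p.sum())   # ~[0.659 0.242 0.099], sums to 1.0
```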
The structure of a perceptron Cont.
• How to choose the right Activation Function?
• As a rule, you can begin with the ReLU function and then move to other activation functions if ReLU doesn’t provide optimum results.
• A few other guidelines for choosing the most appropriate activation function:
• ReLU activation function should only be used in the hidden layers.
• Sigmoid and Tanh functions should not be used in hidden layers.
• A few rules for choosing the activation function for your output layer, based on the type of prediction problem that you are solving:
• Regression - Linear Activation Function.
• Binary Classification—Sigmoid Activation Function.
• Multiclass Classification—Softmax.
• Multilabel Classification—Sigmoid.
What is an Artificial Neural Network?
• A neural network is a system or hardware that is designed to operate like a
human brain.
• Artificial Neural network (ANN) is a data processing system consisting of a
large number of simple, highly interconnected processing elements (artificial
neurons).
• Neural networks can approximate any sort of function, no matter how complex, which helps people solve complex problems in real-life situations.
• ANNs are composed of node layers, containing an input layer, one or more hidden layers, and an output layer.
• If the output of any individual node is above the specified threshold value, that
node is activated, sending data to the next layer of the network.
How do artificial neural networks work?
• The artificial neural network receives the input signals, denoted 𝑥(𝑛), from an external source in the form of a pattern or an image represented as a vector.
• Afterward, each of the inputs is multiplied by its corresponding weight.
• All the weighted inputs are summed inside the computing unit.
• If the weighted sum is zero, a bias is added to make the output non-zero; the bias can be viewed as an extra input of 1 with its own weight.
• The total of weighted inputs is passed through the activation
function.
• The predicted outputs are compared with the actual outputs to find the error, and the weights are then updated based on the error estimate (sketched below).
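The whole loop above can be sketched with the classic perceptron learning rule; the dataset (logical AND), the learning rate, and all names here are illustrative:

```python
import numpy as np

def step(mu):
    return 1 if mu >= 0 else 0

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
t = np.array([0, 0, 0, 1])   # illustrative targets: logical AND
w, b, lr = np.zeros(2), 0.0, 0.1

for epoch in range(10):
    for x, target in zip(X, t):
        y = step(b + w @ x)   # forward pass through the activation
        error = target - y    # compare predicted output with actual output
        w += lr * error * x   # update weights from the error estimate
        b += lr * error

print([step(b + w @ x) for x in X])  # -> [0, 0, 0, 1]
```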
Model of ANN
• The model of ANN is specified by
1. Model of the neuron.
2. Model of the network interconnection.
3. Model of learning/training rule.
Model of ANN Cont.
• The model of ANN is specified by
1. Model of the neuron is specified by:
1. The net function.
• Linear function: the sum of the products of the inputs and the weights.
• Quadratic function: the sum of the products of the squared inputs and the weights (see the sketch below).
2. Activation function: There are several possible activation
functions as explained before.
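A short sketch contrasting the two net functions under illustrative values:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -0.3, 0.2])
b = 0.1

linear_net = b + w @ x            # sum of products of inputs and weights
quadratic_net = b + w @ (x ** 2)  # sum of products of squared inputs and weights
print(linear_net, quadratic_net)  # 0.6  1.2
```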
Model of ANN Cont.
• The model of ANN is specified by
2. Model of the network interconnection: There exist two basic
types of neuron connection architecture:
1. Feedforward Networks can be divided into two types:
1. Single-layer feed-forward network
2. Multilayer feed-forward network
2. Feedback Networks can be divided into three types:
1. Single node with its own feedback
2. Single-layer recurrent network
3. Multilayer recurrent network
Model of ANN Cont.
1. Feedforward Network: Information/signals flow only in
one direction, from the input layer to the output layer.
1. Single-layer feed-forward network
• It is a single-layer perceptron.
• The input layer is fully connected to the output layer.
Model of ANN Cont.
1. Feedforward Network: Information/signals flow only in one direction, from the input layer to the output layer.
2. Multilayer feed-forward network
• The concept is to have more than one weighted layer.
• Layers between the input and the output layer are called hidden layers.
• The hidden layers enable the network to be computationally stronger.
• They are used in Speech Recognition, Machine Translation, and
Complex Classification.
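A minimal forward pass through a two-weighted-layer feed-forward network (random weights and a ReLU hidden layer, purely for illustration):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # input (3) -> hidden (4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)   # hidden (4) -> output (2)

x = np.array([0.5, -1.0, 2.0])
h = relu(W1 @ x + b1)   # hidden layer; signals flow forward only
y = W2 @ h + b2         # output layer
print(y)
```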
Model of ANN Cont.
2. Feedback Network : has feedback paths, hence the
signal can flow in both directions using loops.
1. Single node with its own feedback
• The output can be directed
back as inputs to the same layer.
• The figure shows a single recurrent
network having a single neuron with feedback to itself.
Model of ANN Cont.
2. Feedback Network: has feedback paths, hence the signal can flow
in both directions using loops.
2. Single-layer recurrent network
• A single-layer network with feedback connections, in which a processing element's output can be directed back to itself, to another processing element, or to both.
• This allows it to exhibit dynamic temporal behaviour for a time
sequence.
• RNNs can use their internal state (memory) to process sequences of
inputs.
Model of ANN Cont.
2. Feedback Network: has feedback paths, hence the signal can flow in both
directions using loops.
2. Multilayer recurrent network
• A processing element's output can be directed to processing elements in the same layer and in the preceding layer, forming a multilayer recurrent network.
• They perform the same task for every element of a sequence, with the output being
dependent on the previous computations.
• The main feature of a Recurrent Neural Network is its hidden state, which captures
some information about a sequence.
• They are used in text processing (auto-suggest, grammar checks, etc.), text-to-speech processing, and translation (see the sketch below).
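A minimal numeric sketch of the recurrence that gives a recurrent network its hidden-state memory (weights and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
Wx = 0.5 * rng.normal(size=(3, 2))   # input -> hidden
Wh = 0.5 * rng.normal(size=(3, 3))   # hidden -> hidden (the feedback loop)
h = np.zeros(3)                      # hidden state: the network's memory

sequence = [np.array([1., 0.]), np.array([0., 1.]), np.array([1., 1.])]
for x in sequence:
    h = np.tanh(Wx @ x + Wh @ h)     # each output depends on the previous state
    print(h)
```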
Model of ANN Cont.
• Model of ANN is specified by
3. Model of learning/training rule:
• Learning, in artificial neural networks, is the method of modifying
the weights of connections between the neurons of a specified
network.
• Learning in ANN can be classified into three categories:
1. Supervised Learning.
2. Unsupervised Learning.
3. Reinforcement Learning.
Model of ANN Cont.
1. Supervised Learning:
• Correct inputs and correct outputs are provided, and the weight
adjustment is performed based on the error of the computed output.
• It uses a training set to teach models to yield the desired output.
• The algorithm measures its accuracy through the loss function, adjusting
until the error has been sufficiently minimized.
• It can be used to solve two types of problems:
• Classification: assign test data into specific categories.
• Regression: used to understand the relationship between dependent and
independent variables.
Model of ANN Cont.
1. Supervised Learning:
• However, training supervised learning models can be very time-intensive.
• It also cannot cluster or classify data on its own.
Model of ANN Cont.
2. Unsupervised Learning:
• Unsupervised learning algorithms analyze and cluster unlabelled
datasets.
• It discovers hidden patterns or data groupings without the need for
human intervention.
• Its ability to discover similarities and differences in information
makes it the ideal solution for exploratory data analysis, cross-
selling strategies, customer segmentation, and image recognition.
Model of ANN Cont.
2. Unsupervised Learning :
• However, it has high computational complexity due to the large volume of training data.
• There is a higher risk of inaccurate results.
Model of ANN Cont.
3. Reinforcement Learning :
• It is a feedback-based learning technique in which an agent learns to
behave in an environment by performing the actions and seeing the
results of actions.
• For each good action, the agent gets positive feedback.
• For each bad action, the agent gets negative feedback or a penalty.
• The agent learns automatically using feedback without any labelled data.
• So the agent is bound to learn by its experience only.
• The primary goal is to improve performance by getting the maximum
positive rewards.
Model of ANN Cont.
3. Reinforcement Learning:
• The main challenge in reinforcement
learning lies in preparing the
simulation environment, which is
highly dependent on the task to be
performed.
• Another challenge is reaching a local optimum – that is, the agent performs the task as given, but not in the optimal or required way.
Application of Artificial Neural Networks
• The ability of a neural network to learn, to make adjustments to its structure
over time, is what makes it so useful in the field of artificial intelligence.
• Here are some standard uses of neural networks :
• Pattern Recognition: It’s the most common application. Examples are facial
recognition, optical character recognition, etc.
• Time Series Prediction —Neural networks can be used to make predictions. Will the
stock rise or fall tomorrow? Will it rain or be sunny?
• Signal Processing —hearing aids need to filter out unnecessary noise and amplify
the important sounds. Neural networks can be trained to process an audio signal and
filter it appropriately.
Application of Artificial Neural Networks
• Here are some standard uses of neural networks :
• Control —In self-driving cars, neural networks are often used to manage the steering
decisions of physical vehicles (or simulated ones).
• Soft Sensors —A soft sensor refers to the process of analysing a collection of many
measurements. Neural networks can be employed to process the input data from
many individual sensors and evaluate them as a whole.
• Anomaly Detection —Because neural networks are so good at recognizing patterns,
they can also be trained to generate an output when something occurs that doesn’t fit
the pattern. Think of a neural network monitoring your daily routine over a long
period of time. After learning the patterns of your behaviour, it could alert you when
something is amiss.
Summary
• The artificial neuron concept was introduced.
• The major points to recall are as follows:
• Biological neurons receive signals (or information), integrate incoming signals (to determine whether or not the information should be passed along), and communicate signals to target cells (other neurons, muscles, or glands).
• Inspired by the biological neuron, the artificial neuron receives inputs from a source, combines them linearly, performs a generally nonlinear operation on the combination, and outputs the final result.
• Weights are used to express the importance of specific inputs.
• Bias is used to increase classification model accuracy.
Summary cont.
• The major points to recall are as follows cont.:
• Three types of neural network activation functions are used to activate specific perceptrons.
• Non-linear activation functions are the most useful to allow the model to
create complex mappings between the network’s inputs and outputs.
• Artificial neural networks (ANNs) are able to approximate any sort of function, no matter how complex, and are composed of node layers, containing an input layer, one or more hidden layers, and an output layer.
Summary cont.
• The major points to recall are as follows cont.:
• The model of the neuron is specified by the net function and the activation function.
• There exist two basic types of neuron connection architecture
Feedforward Network and Feedback Network.
• Learning in ANN can be classified into three categories Supervised
Learning, Unsupervised Learning, and Reinforcement Learning.

More Related Content

Similar to Neural Networks Lec3.pptx

artificial-neural-networks-rev.ppt
artificial-neural-networks-rev.pptartificial-neural-networks-rev.ppt
artificial-neural-networks-rev.pptRINUSATHYAN
 
artificial-neural-networks-rev.ppt
artificial-neural-networks-rev.pptartificial-neural-networks-rev.ppt
artificial-neural-networks-rev.pptSanaMateen7
 
Machine learning Module-2, 6th Semester Elective
Machine learning Module-2, 6th Semester ElectiveMachine learning Module-2, 6th Semester Elective
Machine learning Module-2, 6th Semester ElectiveMayuraD1
 
Feedforward neural network
Feedforward neural networkFeedforward neural network
Feedforward neural networkSopheaktra YONG
 
Artificial Neural Network_VCW (1).pptx
Artificial Neural Network_VCW (1).pptxArtificial Neural Network_VCW (1).pptx
Artificial Neural Network_VCW (1).pptxpratik610182
 
Artificial neural networks
Artificial neural networksArtificial neural networks
Artificial neural networksarjitkantgupta
 
The Introduction to Neural Networks.ppt
The Introduction to Neural Networks.pptThe Introduction to Neural Networks.ppt
The Introduction to Neural Networks.pptmoh2020
 
20200428135045cfbc718e2c.pdf
20200428135045cfbc718e2c.pdf20200428135045cfbc718e2c.pdf
20200428135045cfbc718e2c.pdfTitleTube
 
Deep Learning Sample Class (Jon Lederman)
Deep Learning Sample Class (Jon Lederman)Deep Learning Sample Class (Jon Lederman)
Deep Learning Sample Class (Jon Lederman)Jon Lederman
 
33.-Multi-Layer-Perceptron.pdf
33.-Multi-Layer-Perceptron.pdf33.-Multi-Layer-Perceptron.pdf
33.-Multi-Layer-Perceptron.pdfgnans Kgnanshek
 
Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9Randa Elanwar
 

Similar to Neural Networks Lec3.pptx (20)

artificial-neural-networks-rev.ppt
artificial-neural-networks-rev.pptartificial-neural-networks-rev.ppt
artificial-neural-networks-rev.ppt
 
artificial-neural-networks-rev.ppt
artificial-neural-networks-rev.pptartificial-neural-networks-rev.ppt
artificial-neural-networks-rev.ppt
 
Machine learning Module-2, 6th Semester Elective
Machine learning Module-2, 6th Semester ElectiveMachine learning Module-2, 6th Semester Elective
Machine learning Module-2, 6th Semester Elective
 
Feedforward neural network
Feedforward neural networkFeedforward neural network
Feedforward neural network
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 
Artificial Neural Network_VCW (1).pptx
Artificial Neural Network_VCW (1).pptxArtificial Neural Network_VCW (1).pptx
Artificial Neural Network_VCW (1).pptx
 
Artificial neural networks
Artificial neural networksArtificial neural networks
Artificial neural networks
 
The Introduction to Neural Networks.ppt
The Introduction to Neural Networks.pptThe Introduction to Neural Networks.ppt
The Introduction to Neural Networks.ppt
 
20200428135045cfbc718e2c.pdf
20200428135045cfbc718e2c.pdf20200428135045cfbc718e2c.pdf
20200428135045cfbc718e2c.pdf
 
Perceptron
Perceptron Perceptron
Perceptron
 
Deep Learning Sample Class (Jon Lederman)
Deep Learning Sample Class (Jon Lederman)Deep Learning Sample Class (Jon Lederman)
Deep Learning Sample Class (Jon Lederman)
 
Unit 6: Application of AI
Unit 6: Application of AIUnit 6: Application of AI
Unit 6: Application of AI
 
Neural Networks
Neural NetworksNeural Networks
Neural Networks
 
ANN.pptx
ANN.pptxANN.pptx
ANN.pptx
 
Neural network
Neural networkNeural network
Neural network
 
33.-Multi-Layer-Perceptron.pdf
33.-Multi-Layer-Perceptron.pdf33.-Multi-Layer-Perceptron.pdf
33.-Multi-Layer-Perceptron.pdf
 
Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9
 
UNIT-3 .PPTX
UNIT-3 .PPTXUNIT-3 .PPTX
UNIT-3 .PPTX
 
Artificial Neural Network
Artificial Neural NetworkArtificial Neural Network
Artificial Neural Network
 
neuralnetwork.pptx
neuralnetwork.pptxneuralnetwork.pptx
neuralnetwork.pptx
 

Recently uploaded

Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSAnaAcapella
 
8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital Management8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital ManagementMBA Assignment Experts
 
The Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDFThe Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDFVivekanand Anglo Vedic Academy
 
male presentation...pdf.................
male presentation...pdf.................male presentation...pdf.................
male presentation...pdf.................MirzaAbrarBaig5
 
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽中 央社
 
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文中 央社
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...Nguyen Thanh Tu Collection
 
Observing-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptxObserving-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptxAdelaideRefugio
 
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjStl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjMohammed Sikander
 
The Liver & Gallbladder (Anatomy & Physiology).pptx
The Liver &  Gallbladder (Anatomy & Physiology).pptxThe Liver &  Gallbladder (Anatomy & Physiology).pptx
The Liver & Gallbladder (Anatomy & Physiology).pptxVishal Singh
 
e-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopale-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi RajagopalEADTU
 
How to Manage Website in Odoo 17 Studio App.pptx
How to Manage Website in Odoo 17 Studio App.pptxHow to Manage Website in Odoo 17 Studio App.pptx
How to Manage Website in Odoo 17 Studio App.pptxCeline George
 
Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...EduSkills OECD
 
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...Gary Wood
 
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUMDEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUMELOISARIVERA8
 
OSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsOSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsSandeep D Chaudhary
 
AIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptAIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptNishitharanjan Rout
 
Personalisation of Education by AI and Big Data - Lourdes Guàrdia
Personalisation of Education by AI and Big Data - Lourdes GuàrdiaPersonalisation of Education by AI and Big Data - Lourdes Guàrdia
Personalisation of Education by AI and Big Data - Lourdes GuàrdiaEADTU
 

Recently uploaded (20)

Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
 
8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital Management8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital Management
 
The Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDFThe Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDF
 
male presentation...pdf.................
male presentation...pdf.................male presentation...pdf.................
male presentation...pdf.................
 
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
 
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
 
Observing-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptxObserving-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptx
 
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjStl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
 
The Liver & Gallbladder (Anatomy & Physiology).pptx
The Liver &  Gallbladder (Anatomy & Physiology).pptxThe Liver &  Gallbladder (Anatomy & Physiology).pptx
The Liver & Gallbladder (Anatomy & Physiology).pptx
 
e-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopale-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopal
 
OS-operating systems- ch05 (CPU Scheduling) ...
OS-operating systems- ch05 (CPU Scheduling) ...OS-operating systems- ch05 (CPU Scheduling) ...
OS-operating systems- ch05 (CPU Scheduling) ...
 
How to Manage Website in Odoo 17 Studio App.pptx
How to Manage Website in Odoo 17 Studio App.pptxHow to Manage Website in Odoo 17 Studio App.pptx
How to Manage Website in Odoo 17 Studio App.pptx
 
Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...
 
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
 
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUMDEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
 
OSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsOSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & Systems
 
Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"
 
AIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptAIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.ppt
 
Personalisation of Education by AI and Big Data - Lourdes Guàrdia
Personalisation of Education by AI and Big Data - Lourdes GuàrdiaPersonalisation of Education by AI and Big Data - Lourdes Guàrdia
Personalisation of Education by AI and Big Data - Lourdes Guàrdia
 

Neural Networks Lec3.pptx

  • 3. OUTLINE • Biological neuron structure and function • The structure of a Artificial Neurons (perceptron) • What is a Neural Network? • How do artificial neural networks work? • Model of ANN • Application of Artificial Neural Networks
  • 4. Biological neuron structure and function • The human ability to perceive their surroundings – to see, hear, and smell what’s around you – depends on your nervous system. • Human capacity to wonder how you know where you are depends on your nervous system. • Human ability to act on that information also depends on your nervous system. • All of these processes depend on the interconnected cells (neurons) that make up your nervous system. • The basic functions of a neuron: • Receive signals (or information). • Integrate incoming signals (to determine whether or not the information should be passed along). • Communicate signals to target cells (other neurons or muscles or glands).
  • 5. Biological neuron structure and function Neuron Anatomy • Neurons vary in size, shape, and structure depending on their role and location. • The Soma is a neuron cell body which converts input activation into output activation. • Axons are transmission lines that send activation to other neurons. • Dendrites are the receive zones that receive activation from other neurons. • Synapse allows the signal transmission between axons and dendrite
  • 6. Biological neuron structure and function Neuron Anatomy • All neurons have three essential parts: • Dendrites: Input • Soma (Cell body): Processor • Axon: Output • All neurons connected by Synaptic: Link.
  • 7. Biological neuron structure and function Neuron Anatomy • A neuron is connected to other neurons through about 10,000 synapses(link). • A neuron receives input from other neurons. Inputs are combined. • Once input exceeds a critical level, the neuron discharges a spike ‐ an electrical pulse that travels from the body, down the axon(output), to the next neuron(s). • The axon endings almost touch the dendrites (Input) or cell body (Processor) of the next neuron. • Transmission of an electrical signal from one neuron to the next is affected by neurotransmitters. • Neurotransmitters are chemicals which are released from the first neuron and which bind to the Second. • The strength of the signal that reaches the next neuron depends on factors such as the amount of neurotransmitter available.
  • 8. The structure of a Artificial Neurons (perceptron) • Inspiring by Biological neuron, artificial neuron has been made. • The structure of an artificial neuron (Perceptron) is as shown: Input units  f() Y Wa Wb Wc Connectio n weights Summing function Activation Function X1 X3 X2 Input(s) Output (dendrite) (synapse) (axon) (soma) b Bias
  • 9. The structure of a Artificial Neurons (perceptron) • Similarity to the biological network, the function of the perceptron is the same: 1. Receives inputs from other sources. 2. Combine them in some way. 3. Performs a generally nonlinear operation on the result. 4. Outputs the final result. • Hence, the perceptron is a mathematical function modelled on the working of biological neurons. • Perceptron is a linear classifier (binary). • One or more inputs are separately weighted. • Inputs are summed and passed through a nonlinear function to produce output. • Every perceptron holds an internal state called an activation signal. • Every perceptron is connected to another neuron via a connection link • Each connection link carries information about the input signal. • It is important to note that a perceptron can send only one signal at a time.
  • 10. The structure of a perceptron • The inputs of perceptron reflect the real world data. 𝑥 = (𝑥1, … , 𝑥𝑛) • The synaptic weights 𝑤 = 𝑤1, … , 𝑤𝑛 . is a vector of size the number of inputs. • The bias 𝑏 is a constant real value or vector. • The summing (combination) function, 𝜇(𝑤): • takes the input vector 𝑥 to produce a combined value. • The combination is computed as bias plus a linear combination of the synaptic weights and the inputs in the perceptron. 𝜇 = 𝑏 + 𝑖=1 𝑛 𝑤𝑖 . 𝑥𝑖 𝑤ℎ𝑒𝑟𝑒 𝑖 = 1, … , 𝑛 • The bias increases or reduces the net input to the activation function, depending on whether it is positive or negative. • The activation function 𝑓(⋅) defines the output from the perceptron in terms of its combination. • The output 𝑦 is represented in terms of the composition of the combination and the activation functions.
  • 11. The structure of a perceptron Cont. • Why do we need Weights? • To answer this question: • Let us consider there is no weights as shown in the Figure: • The summation function 𝜇 sums up all the inputs and adds bias to it. • Let us say, the role of the activation function is to allocate the data points to one of the classes. • Compare the model expression with line equation as shown in Eq. 1 and Eq. 2 : 𝑦 = 𝑚𝑥 + 𝑐 1 𝑥2 = −𝑥1 + 𝑏 2 • The slope(𝑚) of the equation 𝑥₂ = −𝑥₁ + 𝑏 is fixed that is -1. 𝜇 = 𝑏 + 𝑖=1 2 𝑥𝑖 = 𝑥1 + 𝑥2 + 𝑏 𝑦 = 1 𝑖𝑓 𝜇 𝑥 ≥ 0 = 0 𝑒𝑙𝑠𝑒
  • 12. The structure of a perceptron Cont. • Why do we need Weights? • Example: Consider the following dataset for illustration: • It contains two independent features [𝑥₁ 𝑎𝑛𝑑 𝑥₂] and one dependent feature y. • Our task is to classify a given data point to one of the classes that belong to feature 𝑦. • From the data, we infer that: • If we try to fit the line equation ( 𝑥₂ = −𝑥₁ + 𝑏) for the different values of 𝑏 we will get the following plot:
  • 13. The structure of a perceptron Cont. • Why do we need Weights? • Orange line is with 𝑏 = 0. • Blue line is with 𝑏 = 1. • Green line is with 𝑏 = 2. 𝑥₂ = −𝑥₁ + 𝑏 • Changing the value of 𝑏, end up with parallel lines. • No change in the orientation or slope of the line. • So, we require weights to change the orientation of the line.
  • 14. The structure of a perceptron Cont. • What do the weights in a Neuron convey to us? • Importance of the feature: Features with weights that are close to zero said to have lesser importance in the prediction process compared to the features with weights having a larger value. • Tells the relationship between input and output: • Example: consider the input of the perceptron represented as follow: • Suppose that people often tend to buy a car within their budget and the most popular one among many as follow: • Hence, if the weight is positive, there is a direct relationship between that feature and the target value, and inverse relationship if the weight is negative.
  • 15. The structure of a perceptron Cont. • Why do we need Bias? • A bias value allows to shift the function curve up or down. • It can increase classification model accuracy. • It serves as another model parameter to increase the model performance on training data.
  • 16. The structure of a perceptron Cont. • Activation Functions: • The activation function, also known as the transfer function, is the nonlinear function applied on the inner product 𝑋𝑇𝑊 in an artificial neural network. • Properties of activation function 1. Nonlinear: There are two reasons why an activation function should be nonlinear: • The boundaries or patterns in real-world problems are mostly non-linear and a non-linear function can easily approximate a linear boundary whereas a linear function cannot approximate a non-linear boundary. • If the activation function is linear, then a perceptron with multiple hidden layers can be easily compressed to a single layer perceptron.
  • 17. The structure of a perceptron Cont. • Activation Functions: • Properties of activation function 2. Differentiable: • During backpropagation, the gradient of the loss function is calculated in gradient descent method. • The gradient of the loss function with respect to weight is calculated using chain rule. • Hence, it is necessary that the activation function is differentiable with respect to its input. 3. Continuous: A function cannot be differentiable unless it is continuous. 4. Bounded: • The input data is passed through a series of perceptrons, each of which contains an activation function. • If the function is not bounded in a range, the output value may explode.
  • 18. The structure of a perceptron Cont. • Activation Functions: • Properties of activation function 5. Zero-centered: • A function is said to be zero centered when its range contains both positive and negative values. • If the activation function of the network is not zero centered, 𝑦 = 𝑓(𝑋𝑇 𝑊) is always positive or always negative. • Thus, the output of a layer is always being moved to either the positive values or the negative values. • As a result, the weight vector needs more update to be trained properly. • So, the number of epochs needed for the network to get trained increases if the activation function is not zero centered. 6. Computational cost: • It is defined as the time required to generate the output of the activation function when input is fed to it. • The computational cost of the gradient is also important as the gradient is calculated during weight update in backpropagation. • Gradient descent optimization itself is a very time consuming process and many iterations are needed to perform this. Therefore, the computational cost is an issue. • When the activation function and the gradient of the activation function have high computational cost, it requires more time to get trained.
  • 19. The structure of a perceptron Cont. • Why do we need Activation Function? • It is biologically inspired by activities inside our brain wherein different neurons get activated by different stimulus. • It decides whether a neuron should be activated or not. • It will decide whether the neuron’s input to the network is important or not. • It is also used to introduce non-linearity into the output of a neuron. • It does the non-linear transformation to the input making it capable to learn and perform more complex tasks. • Studying the derivatives and application of activation functions is essential for selecting the proper type of activation function that give accuracy in a particular Neural Network model.
  • 20. The structure of a perceptron Cont. • Problems faced by activation functions: There are two major problems: 1. Vanishing gradient problem: • When an activation function compresses a large range of input into a small output range, a large change to the input of the activation function results into a very small change to the output. • This leads to a small (typically close to zero) gradient value. • During weight update, when backpropagation calculates the gradients of the network, the gradients of each layer are multiplied down from the final layer to that layer because of the chain rule. • When a value close to zero is multiplied to other values close to zero several times, the value becomes closer and closer to zero. • So, the weights get saturated and they are not updated properly. • The neurons, of which the weights are not updated properly are called the saturated neurons.
  • 21. The structure of a perceptron Cont. • Problems faced by activation functions: There are two major problems: 1. Dead neuron problem: • When an activation function forces a large part of the input to zero or almost zero, those corresponding neurons are inactive/dead in contributing to the final output. • During weight update, there is a possibility that the weights will be updated in such a way that the weighted sum of a large part of the network will be forced to zero. • A network will hardly recover from such a situation and a large portion of the input fails to contribute to the network. • This leads to a problem because a large part of the input may remain completely deactivated during the network performs. • These forcefully deactivated neurons are called 'dead neurons' and this problem is termed as the dead neuron problem.
  • 22. The structure of a perceptron Cont. • Types of Neural Networks Activation Functions: 1. Binary Step Function: • It depends on a threshold value that decides whether a neuron should be activated or not. • Mathematically it can be represented as: 𝑓 𝑥 = 0 𝑓𝑜𝑟 𝑥 < 𝜃 1 𝑓𝑜𝑟 𝑥 ≥ 𝜃 • The limitations of binary step function: • It cannot provide multi-value outputs—for example, it cannot be used for multi-class classification problems. • The gradient (slop) of the step function is zero, which causes a hindrance in the backpropagation process.
  • 23. The structure of a perceptron Cont. • Types of Neural Networks Activation Functions: 2. Linear Activation Function: • The activation is proportional to the input. • The function doesn't do anything to the weighted sum of the input, it simply spits out the value it was given. • Mathematically it can be represented as: 𝑓 𝑥 = 𝑥 • The limitations of the Linear activation function: • It’s not possible to use backpropagation as the derivative of the function is a constant and has no relation to the input x. • All layers of the neural network will collapse into one if a linear activation function is used. So, essentially, a linear activation function turns the neural network into just one layer.
  • 24. The structure of a perceptron Cont. • Types of Neural Networks Activation Functions: 3. Non-Linear Activation Functions: • The linear activation function is simply a linear regression model. • Because of its limited power, this does not allow the model to create complex mappings between the network’s inputs and outputs. • Non-linear activation functions solve the limitations of linear activation functions. • The common non-linear activations functions are: • Sigmoid Function. • Hyperbolic Tangent (Tanh) Function. • Rectified Linear Unit (ReLU) Function. • Leaky ReLU Function • Softmax Function
  • 25. The structure of a perceptron Cont. • Types of Neural Networks Activation Functions: 3. Non-Linear Activation Functions: • The common non-linear activations functions are: • Sigmoid Function: • Mathematically it can be represented as: 𝑓 𝑥 = 1 1 + 𝑒−𝑥 • This outputs a value between 0 and 1, making it useful for binary classification problems as one can set a threshold “probability” value. • The limitations of Sigmoid function: • Computing exponents is power-intensive and can slow down large networks. • Although, the function is computationally expensive, its gradient is not. • Its gradient can be calculated using the formula 𝑓 𝑥 = 𝑓(𝑥)(1 − 𝑓 𝑥 ). • The function is not zero-centred, as a result, the weight vector needs more updates to be trained properly. • It also has large portions of the line which are almost completely flat, giving very small gradients (vanishing gradient problem) which makes the network hard to train.
  • 26. The structure of a perceptron Cont. • Types of Neural Networks Activation Functions: 3. Non-Linear Activation Functions: • The common non-linear activations functions are: • Hyperbolic Tangent (Tanh) Function: • Mathematically it can be represented as: 𝑓 𝑥 = 1 − 𝑒−𝑥 1 + 𝑒−𝑥 • This outputs a value between -1 and 1. • It is zero centred which makes parameter optimisation easier in layers coming after it. • Hence, it gives better training performance for multi-layer neural networks. • The limitations of Tanh function: • The computing exponents is power-intensive and so can slow down large networks. • It also suffers from saturation, the weights get saturated and they are not updated properly (vanishing gradient problem).
  • 27. The structure of a perceptron Cont. • Types of Neural Networks Activation Functions: 3. Non-Linear Activation Functions: • The common non-linear activations functions are: • Rectified Linear Unit (ReLU) Function: • Mathematically it can be represented as: 𝑓 𝑥 = 0 𝑥 < 0 𝑥 𝑥 ≥ 0 • This outputs a value between 0 and ∞. • It has a gradient of 1 for positive values of x. • It is extremely simple to compute, so it is the most popular choice of activation function for hidden layers. • The limitations of ReLU function: • It is not zero-centred, and so outputs aren’t normalised. • The input which have negative weighted sums fail to contribute to the whole process.
  • 28. The structure of a perceptron Cont. • Types of Neural Networks Activation Functions: 3. Non-Linear Activation Functions: • The common non-linear activations functions are: • Leaky ReLU Function: • Mathematically it can be represented as: 𝑓 𝑥 = 𝛼𝑥 𝑓𝑜𝑟 𝑥 ≤ 0 𝑥 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒 • A shallow slope is added for negative x by using alpha (normally small e.g. 0.01). • This keeps nodes with negative values of x active. • It is easy to compute, so it is the most popular choice of activation function for hidden layers. • The limitations of Leaky ReLU function: • The negative part, the gradient is always 0.01 which is close to zero which may lead to vanishing gradient problem.
  • 29. The structure of a perceptron Cont. • Types of Neural Network Activation Functions: 3. Non-Linear Activation Functions: • The common non-linear activation functions are: • Softmax Function: • Mathematically it can be represented as: $\sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$ • The function converts a vector of K real values into a probability distribution whose components sum to 1, with the largest input mapped to the highest probability. • It is more suitable for multi-class classification problems. • The limitations of the Softmax function: • The calculations for the softmax layer are computationally expensive.
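A sketch of softmax; subtracting the maximum before exponentiating is a standard numerical-stability trick and partly addresses the cost concern:

```python
import numpy as np

def softmax(z):
    # Shift by max(z) to avoid overflow; the result is unchanged
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # a probability vector summing to 1
```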
  • 30. The structure of a perceptron Cont. • How to choose the right Activation Function? • As a rule, you can begin with the ReLU function and move to other activation functions if ReLU doesn't provide optimum results. • A few other guidelines for choosing the most appropriate activation function: • The ReLU activation function should only be used in the hidden layers. • Sigmoid and Tanh functions should not be used in hidden layers, because their saturation can cause vanishing gradients. • A few rules for choosing the activation function for your output layer, based on the type of prediction problem you are solving: • Regression - Linear Activation Function. • Binary Classification - Sigmoid Activation Function. • Multiclass Classification - Softmax. • Multilabel Classification - Sigmoid.
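These output-layer rules can be summarised as a simple lookup table; this mapping is just a restatement of the slide, with illustrative key names:

```python
# Problem type -> recommended output-layer activation (per the rules above)
OUTPUT_ACTIVATION = {
    "regression": "linear",
    "binary_classification": "sigmoid",
    "multiclass_classification": "softmax",
    "multilabel_classification": "sigmoid",
}
```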
  • 31. What is an Artificial Neural Network? • A neural network is a software or hardware system designed to operate in a way inspired by the human brain. • An Artificial Neural Network (ANN) is a data-processing system consisting of a large number of simple, highly interconnected processing elements (artificial neurons). • Neural networks can estimate any sort of function, no matter how complex, helping people solve complex real-life problems. • ANNs are composed of layers of nodes: an input layer, one or more hidden layers, and an output layer. • If the output of an individual node is above a specified threshold value, that node is activated, sending data to the next layer of the network.
  • 32. How do artificial neural networks work? • The Artificial Neural Network receives input signals, denoted $x(n)$, from an external source in the form of a pattern or image represented as a vector. • Each input is then multiplied by its corresponding weight. • All the weighted inputs are summed inside the computing unit. • A bias is added to this sum so the output is not forced to zero; the bias behaves like an extra input fixed at 1 with its own trainable weight. • The total of the weighted inputs is passed through the activation function. • The predicted outputs are compared with the actual outputs to find the error, and the weights are then updated based on this error estimate.
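Putting these steps together, a minimal sketch of a single sigmoid neuron with a simple error-driven update (the learning rate and update rule are illustrative assumptions, not prescribed by the slide):

```python
import numpy as np

def forward(x, w, b):
    z = np.dot(w, x) + b                 # weighted sum plus bias
    return 1.0 / (1.0 + np.exp(-z))      # activation function (sigmoid)

x, target = np.array([0.5, -1.0]), 1.0
w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(100):
    y = forward(x, w, b)
    error = target - y                   # compare predicted vs actual output
    w += lr * error * x                  # update weights from the error
    b += lr * error                      # the bias behaves as an input of 1
print(forward(x, w, b))                  # moves toward the target of 1.0
```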
  • 33. Model of ANN • The model of an ANN is specified by: 1. The model of the neuron. 2. The model of the network interconnections. 3. The model of the learning/training rule.
  • 34. Model of ANN Cont. • The model of an ANN is specified by: 1. The model of the neuron, which is specified by: 1. The net function: • Linear function: the weighted sum of the inputs. • Quadratic function: the weighted sum of the squared inputs. 2. The activation function: there are several possible activation functions, as explained before.
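A small sketch contrasting the two net functions (the weights and inputs are arbitrary illustrative values):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
w = np.array([0.2, -0.5, 0.1])

linear_net = np.dot(w, x)         # sum of w_i * x_i
quadratic_net = np.dot(w, x**2)   # sum of w_i * x_i^2
print(linear_net, quadratic_net)  # -0.5 and -0.9
```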
  • 35. Model of ANN Cont. • The model of an ANN is specified by: 2. The model of the network interconnections. There are two basic types of neuron connection architecture: 1. Feedforward Networks, which can be divided into two types: 1. Single-layer feed-forward network 2. Multilayer feed-forward network 2. Feedback Networks, which can be divided into three types: 1. Single node with its own feedback 2. Single-layer recurrent network 3. Multilayer recurrent network
  • 36. Model of ANN Cont. 1. Feedforward Network: information/signals flow only in one direction, from the input layer to the output layer. 1. Single-layer feed-forward network • It is a single-layer perceptron. • The input layer is fully connected to the output layer.
  • 37. Model of ANN Cont. 1. Feedforward Network: information/signals flow only in one direction, from the input layer to the output layer. 2. Multilayer feed-forward network • The concept is to have more than one weighted layer. • The layers between the input and output layers are called hidden layers. • The hidden layers enable the network to be computationally stronger. • They are used in speech recognition, machine translation, and complex classification.
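A minimal two-layer feed-forward pass, assuming a ReLU hidden layer and randomly initialised weights purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # input (3) -> hidden (4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)  # hidden (4) -> output (2)

def feedforward(x):
    h = np.maximum(0.0, W1 @ x + b1)  # hidden layer with ReLU
    return W2 @ h + b2                # output layer (linear here)

print(feedforward(np.array([1.0, 0.5, -0.5])))  # signals flow one way only
```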
  • 38. Model of ANN Cont. 2. Feedback Network: has feedback paths, so the signal can flow in both directions using loops. 1. Single node with its own feedback • The output can be directed back as input to the same layer. • The figure shows a recurrent network having a single neuron with feedback to itself.
  • 39. Model of ANN Cont. 2. Feedback Network: has feedback paths, so the signal can flow in both directions using loops. 2. Single-layer recurrent network • A single-layer network with feedback connections, in which the output is directed back to the same processing element, to another processing element, or both. • This allows it to exhibit dynamic temporal behaviour over a time sequence. • RNNs can use their internal state (memory) to process sequences of inputs.
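A single recurrent step can be sketched as below; the hidden state h is the feedback path that carries memory between time steps (the shapes and the tanh choice are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
Wx = rng.normal(size=(4, 3))   # input -> state weights
Wh = rng.normal(size=(4, 4))   # state -> state (feedback) weights

def rnn_step(x, h):
    # The new state mixes the current input with the previous state
    return np.tanh(Wx @ x + Wh @ h)

h = np.zeros(4)
for x in [np.ones(3), np.zeros(3), -np.ones(3)]:  # a toy input sequence
    h = rnn_step(x, h)
print(h)  # depends on the whole sequence, not just the last input
```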
  • 40. Model of ANN Cont. 2. Feedback Network: has feedback paths, so the signal can flow in both directions using loops. 3. Multilayer recurrent network • A processing element's output can be directed to processing elements in the same layer and in preceding layers, forming a multilayer recurrent network. • They perform the same task for every element of a sequence, with the output depending on the previous computations. • The main feature of a Recurrent Neural Network is its hidden state, which captures information about the sequence. • They are used in text processing (auto-suggest, grammar checks, etc.), text-to-speech processing, and translation.
  • 41. Model of ANN Cont. • The model of an ANN is specified by: 3. The model of the learning/training rule: • Learning, in artificial neural networks, is the method of modifying the weights of the connections between the neurons of a network. • Learning in ANNs can be classified into three categories: 1. Supervised Learning. 2. Unsupervised Learning. 3. Reinforcement Learning.
  • 42. Model of ANN Cont. 1. Supervised Learning: • Correct inputs and correct outputs are provided, and the weights are adjusted based on the error of the computed output. • It uses a training set to teach models to yield the desired output. • The algorithm measures its accuracy through a loss function, adjusting until the error has been sufficiently minimised. • It can be used to solve two types of problems: • Classification: assigning test data to specific categories. • Regression: understanding the relationship between dependent and independent variables.
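A hedged sketch of this loop on a toy regression problem, using mean squared error as the loss (one common choice; the data and learning rate are invented for illustration):

```python
import numpy as np

X = np.array([0.0, 1.0, 2.0, 3.0])   # inputs
y = np.array([1.0, 3.0, 5.0, 7.0])   # labelled targets (y = 2x + 1)
w, b, lr = 0.0, 0.0, 0.05

for epoch in range(2000):
    pred = w * X + b
    error = pred - y
    loss = np.mean(error ** 2)       # the loss function being minimised
    w -= lr * np.mean(error * X)     # gradient-style weight update
    b -= lr * np.mean(error)
print(w, b, loss)                    # approaches w = 2, b = 1, loss ~ 0
```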
  • 43. Model of ANN Cont. 1. Supervised Learning: • However, training supervised learning models can be very time-intensive. • It also cannot cluster or classify data on its own.
  • 44. Model of ANN Cont. 2. Unsupervised Learning: • Unsupervised learning algorithms analyze and cluster unlabelled datasets. • It discovers hidden patterns or data groupings without the need for human intervention. • Its ability to discover similarities and differences in information makes it the ideal solution for exploratory data analysis, cross-selling strategies, customer segmentation, and image recognition.
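To make the clustering idea concrete, here is a tiny k-means-style sketch (k-means is one standard clustering algorithm; it is not named on the slide, and the data is synthetic):

```python
import numpy as np

rng = np.random.default_rng(2)
data = np.concatenate([rng.normal(0, 0.5, (20, 2)),
                       rng.normal(5, 0.5, (20, 2))])  # two unlabelled groups

centres = np.array([data[0], data[-1]])  # one seed point from each end
for _ in range(10):
    # Assign each point to its nearest centre, then recompute the centres
    labels = np.argmin(np.linalg.norm(data[:, None] - centres, axis=2), axis=1)
    centres = np.array([data[labels == k].mean(axis=0) for k in range(2)])
print(centres)  # the two groupings are discovered without any labels
```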
  • 45. Model of ANN Cont. 2. Unsupervised Learning: • However, it has high computational complexity due to the high volume of training data. • There is also a higher risk of inaccurate results, since there are no labels to validate against.
  • 46. Model of ANN Cont. 3. Reinforcement Learning: • It is a feedback-based learning technique in which an agent learns to behave in an environment by performing actions and seeing their results. • For each good action, the agent gets positive feedback. • For each bad action, the agent gets negative feedback or a penalty. • The agent learns automatically from this feedback, without any labelled data. • So the agent is bound to learn from its experience only. • The primary goal is to improve performance by collecting the maximum positive reward.
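The agent-environment feedback loop can be sketched as a toy two-action problem (the environment, exploration rate, and update rule here are hypothetical placeholders, not slide material):

```python
import random

def environment(action):
    # Toy environment: action 1 is "good" (+1 reward), action 0 is "bad" (-1)
    return 1 if action == 1 else -1

values = [0.0, 0.0]  # the agent's running estimate of each action's worth
for step in range(200):
    explore = random.random() < 0.1
    action = random.choice([0, 1]) if explore else values.index(max(values))
    reward = environment(action)                       # positive/negative feedback
    values[action] += 0.1 * (reward - values[action])  # learn from experience
print(values)  # the agent learns to prefer the rewarded action
```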
  • 47. Model of ANN Cont. 3. Reinforcement Learning: • The main challenge in reinforcement learning lies in preparing the simulation environment, which is highly dependent on the task to be performed. • Another challenge is getting stuck in a local optimum: the agent performs the task, but not in the optimal or required way.
  • 48. Application of Artificial Neural Networks • The ability of a neural network to learn, and to adjust its structure over time, is what makes it so useful in the field of artificial intelligence. • Here are some standard uses of neural networks: • Pattern Recognition: the most common application. Examples are facial recognition, optical character recognition, etc. • Time Series Prediction: neural networks can be used to make predictions. Will the stock rise or fall tomorrow? Will it rain or be sunny? • Signal Processing: hearing aids need to filter out unnecessary noise and amplify the important sounds. Neural networks can be trained to process an audio signal and filter it appropriately.
  • 49. Application of Artificial Neural Networks • Here are some standard uses of neural networks: • Control: in self-driving cars, neural networks are often used to manage the steering decisions of physical (or simulated) vehicles. • Soft Sensors: a soft sensor refers to the process of analysing a collection of many measurements. Neural networks can be employed to process the input data from many individual sensors and evaluate them as a whole. • Anomaly Detection: because neural networks are so good at recognising patterns, they can also be trained to generate an output when something occurs that doesn't fit the pattern. Think of a neural network monitoring your daily routine over a long period of time; after learning the patterns of your behaviour, it could alert you when something is amiss.
  • 50. Summary • The artificial neuron concept has been introduced. • The major points to recall are as follows: • Biological neurons receive signals (or information), integrate incoming signals (to determine whether or not the information should be passed along), and communicate signals to target cells (other neurons, muscles, or glands). • Inspired by the biological neuron, the artificial neuron receives inputs from a source, combines them linearly, performs a generally nonlinear operation on the combination, and outputs the final result. • Weights are used to express the importance of a specific input. • Bias is used to shift the activation function, which can improve the classification model's accuracy.
  • 51. Summary cont. • The major points to recall are as follows (cont.): • Three types of neural network activation functions are used to activate a perceptron. • Non-linear activation functions are the most useful, since they allow the model to create complex mappings between the network's inputs and outputs. • Artificial Neural Networks (ANNs) are able to estimate any sort of function, no matter how complex, and are composed of layers of nodes: an input layer, one or more hidden layers, and an output layer.
  • 52. Summary cont. • The major points to recall are as follows (cont.): • The model of the neuron is specified by the net function and the activation function. • There are two basic types of neuron connection architecture: Feedforward Networks and Feedback Networks. • Learning in ANNs can be classified into three categories: Supervised Learning, Unsupervised Learning, and Reinforcement Learning.