The document discusses various activation functions used in neural networks, including Tanh, ReLU, Leaky ReLU, Sigmoid, and Softmax. It explains that activation functions introduce non-linearity and allow neural networks to learn complex patterns. Tanh squashes outputs between -1 and 1, while ReLU sets negative values to zero. Leaky ReLU allows a small negative slope, addressing the "dying ReLU" problem. Sigmoid and Softmax transform outputs into the 0-1 range for classification problems. Activation functions determine whether a neuron's output is important for prediction.
Introduction by Mohamed Essam and Nourhan Ahmed. Overview of simple neural networks, including layers and an example calculation of node outputs in a neural network.
Defines activation functions, explaining their necessity for introducing non-linearity. Discusses Tanh function, its graph, and example calculations.
Introduction to ReLU (Rectified Linear Unit) function, its computational efficiency, and how it only activates neurons when outputs are >0.
Explains role of activation functions in neuron firing, mapping outputs, and determining neuron importance for predictions.
Discusses the Dying ReLU problem, limitations of ReLU, and the advantages of Leaky ReLU, which prevents dead neurons during training.
Introduction to the Sigmoid function, why it's used for probabilities, and its role in adding non-linearity within neural networks.
Describes the Softmax function, its transformation of input values into probabilities, comparison with Sigmoid, and its application in neural networks.
Attribution credits for the presentation, thanking the audience and inviting questions.
Why do neural networks need it?
● The purpose of an activation function is to add non-linearity to the neural network.
● An activation function is an additional step at each layer, but its computation is worth it. Here is why:
● Assume we have a neural network working without activation functions. In that case, every neuron will only be performing a linear transformation on the inputs; although the neural network becomes simpler, learning any complex task is impossible, and our model would be just a linear regression model (see the sketch below this list).
● Activation functions:
○ Tanh()
○ ReLU
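As a rough illustration of that point, the sketch below (NumPy, random weights, and the layer shapes are all assumptions for the example, not taken from the slides) shows that two stacked layers with no activation collapse into a single equivalent linear map, while inserting a non-linearity such as tanh breaks that equivalence.

```python
# Minimal sketch: a "deep" network without activations is just one linear layer.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))                      # a small batch of 4 inputs, 3 features each
W1, b1 = rng.normal(size=(3, 5)), rng.normal(size=5)
W2, b2 = rng.normal(size=(5, 2)), rng.normal(size=2)

# Two "layers" stacked with no activation in between
h = x @ W1 + b1
y_no_activation = h @ W2 + b2

# The same result from a single equivalent linear layer
W_eq, b_eq = W1 @ W2, b1 @ W2 + b2
y_single_layer = x @ W_eq + b_eq

print(np.allclose(y_no_activation, y_single_layer))   # True: the extra layer adds nothing

# Inserting a non-linearity (tanh here) breaks that equivalence
y_with_activation = np.tanh(h) @ W2 + b2
```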
Tanh():
● The output of this function is in the range of -1 to 1.
● In Tanh, the larger the input (more positive), the closer the output value will be to 1.0, whereas the smaller the input (more negative), the closer the output will be to -1.0, as the sketch below the graph illustrates.
(The Tanh Activation Function Graph)
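A minimal sketch of that behaviour, assuming NumPy; the sample input values are illustrative only.

```python
# tanh squashes any real input into the open interval (-1, 1).
import numpy as np

def tanh(x):
    """tanh(x) = (e^x - e^-x) / (e^x + e^-x), output in (-1, 1)."""
    return np.tanh(x)

for value in [-5.0, -1.0, 0.0, 1.0, 5.0]:
    print(f"tanh({value:+.1f}) = {tanh(value):+.4f}")
# Large positive inputs approach +1; large negative inputs approach -1.
```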
ReLU:
ReLU stands for Rectified Linear Unit.
The ReLU function does not activate all the neurons at the same time.
A neuron is deactivated only if the output of its linear transformation is less than 0.
Because only a certain number of neurons are activated, the ReLU function is far more computationally efficient compared to the tanh function.
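A small NumPy sketch of ReLU as described above; the pre-activation values are made-up examples.

```python
# ReLU zeroes negative pre-activations, so only neurons with a positive
# linear output stay "active".
import numpy as np

def relu(x):
    """ReLU(x) = max(0, x), applied element-wise."""
    return np.maximum(0.0, x)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])   # example pre-activations
print(relu(z))                               # [0.  0.  0.  0.5 2. ]
# A single element-wise max is cheaper than the exponentials needed by tanh.
```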
Cont… ReLU:
● The neural network is characterized by:
○ Pattern of connections between neurons (Architecture)
○ Activation function
○ Method of determining the weights of the connections (Training, Learning)
Mathematically, ReLU can be represented as: f(x) = max(0, x).
Activation function
An Activation Function decides whether a neuron should be activated or not. This means that it will decide whether the neuron's input to the network is important or not in the process of prediction, using simpler mathematical operations.
Mathematically, it can be represented as: output = f(Σᵢ wᵢ·xᵢ + b), where f is the activation function.
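As an illustration, the sketch below (NumPy assumed; the inputs, weights, and bias are made-up values) computes a single neuron's output by passing its summed weighted input through an activation.

```python
# One neuron: weighted sum of inputs, then an activation decides how much
# of that signal is passed on.
import numpy as np

def neuron_output(inputs, weights, bias, activation=np.tanh):
    """output = activation(sum_i w_i * x_i + b)"""
    z = np.dot(weights, inputs) + bias   # summed weighted input
    return activation(z)

x = np.array([0.5, -1.2, 3.0])           # example inputs (assumed values)
w = np.array([0.4, 0.1, -0.6])           # example weights (assumed values)
print(neuron_output(x, w, bias=0.2))
```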
Activation function
Activation (firing) of the neuron takes place when the neuron is stimulated by pressure, heat, light, or chemical information from other cells. (The type of stimulation necessary to produce firing depends on the type of neuron.)
Depending on the nature and intensity of these input signals, the brain processes them and decides whether the neuron should be activated ("fired") or not.
Activation function
❏ The primary role of the activation function is to transform the summed weighted input from the node into an output value to be fed to the next hidden layer or used as output.
❏ It is used to determine the output of the neural network, like yes or no. It maps the resulting values into the range 0 to 1 or -1 to 1, etc. (depending upon the function), as the sketch below shows.
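A quick NumPy sketch of the two ranges mentioned above; the sample pre-activation values are illustrative.

```python
# Sigmoid maps values into (0, 1); tanh maps them into (-1, 1).
import numpy as np

z = np.linspace(-6, 6, 5)                  # a few example pre-activation values
sigmoid = 1.0 / (1.0 + np.exp(-z))
tanh = np.tanh(z)

print(np.round(sigmoid, 3))                # every value falls in (0, 1)
print(np.round(tanh, 3))                   # every value falls in (-1, 1)
```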
Leaky ReLU - activation function
Limitation faced by ReLU:
The negative side of the graph makes the gradient value zero. For this reason, during the backpropagation process, the weights and biases for some neurons are not updated. This can create dead neurons which never get activated.
● All the negative input values become zero immediately, which decreases the model's ability to fit or train on the data properly.
● Leaky ReLU is an improved version of the ReLU function that solves the dying ReLU problem, as it has a small positive slope in the negative area.
Leaky ReLU - activation function
Advantage of Leaky ReLU:
The advantages of Leaky ReLU are the same as those of ReLU, with the addition that it enables backpropagation even for negative input values.
By making this minor modification for negative input values, the gradient on the left side of the graph comes out to be a non-zero value. Therefore, we would no longer encounter dead neurons in that region.
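A minimal NumPy sketch of Leaky ReLU; the negative-side slope of 0.01 is a commonly used default assumed here, not a value taken from the slides.

```python
# Leaky ReLU keeps a small non-zero gradient on the negative side,
# so neurons there can still be updated during backpropagation.
import numpy as np

def leaky_relu(x, alpha=0.01):
    """LeakyReLU(x) = x if x > 0, else alpha * x."""
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.01):
    """Gradient is 1 on the positive side and alpha (not 0) on the negative side."""
    return np.where(x > 0, 1.0, alpha)

z = np.array([-3.0, -0.5, 0.5, 3.0])
print(leaky_relu(z))        # [-0.03  -0.005  0.5    3.   ]
print(leaky_relu_grad(z))   # [ 0.01   0.01   1.     1.   ]
```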
Sigmoid - activation function
Sigmoid function:
The main reason we use the sigmoid function is that its output exists between 0 and 1. Therefore, it is especially used for models where we have to predict a probability as an output. Since the probability of anything exists only in the range of 0 to 1, sigmoid is the right choice.
Mathematically, it can be represented as: σ(x) = 1 / (1 + e^(-x)).
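A small NumPy sketch of the sigmoid described above; the sample inputs are illustrative.

```python
# sigma(x) = 1 / (1 + e^-x): the output always lies in (0, 1),
# so it can be read as a probability.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for value in [-4.0, 0.0, 4.0]:
    print(f"sigmoid({value:+.1f}) = {sigmoid(value):.4f}")
# -4 -> ~0.018, 0 -> 0.5, +4 -> ~0.982
```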
Sigmoid - activation function
Activation purpose:
Well, the purpose of an activation function is to add non-linearity to the neural network.
Softmax - activation function
The softmax function turns a vector of K real values into a vector of K real values that sum to 1. The input values can be positive, negative, zero, or greater than one, but the softmax transforms them into values between 0 and 1 so that they can be interpreted as probabilities. If one of the inputs is small or negative, the softmax turns it into a small probability, and if an input is large, it turns it into a large probability, but it will always remain between 0 and 1.
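A minimal NumPy sketch of softmax over a vector of K real values; subtracting the maximum before exponentiating is a common numerical-stability trick assumed here, and the example inputs are made up.

```python
# Softmax: K real values in, K probabilities out, summing to 1.
import numpy as np

def softmax(v):
    """Turn a vector of K reals into K values in (0, 1) that sum to 1."""
    shifted = v - np.max(v)          # improves numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

logits = np.array([2.0, -1.0, 0.5, 3.0])   # inputs may be negative or greater than 1
probs = softmax(logits)
print(np.round(probs, 3), probs.sum())     # probabilities that sum to 1
```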
Softmax - activation function
Softmax function vs Sigmoid function:
As mentioned above, the softmax function and the sigmoid function are similar. The softmax operates on a vector, while the sigmoid takes a scalar.
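A short NumPy sketch of that contrast: softmax normalises the whole vector jointly so its outputs sum to 1, while sigmoid squashes each value independently; the example scores are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(v):
    e = np.exp(v - np.max(v))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])
print(softmax(scores), softmax(scores).sum())   # joint normalisation: sums to 1
print(sigmoid(scores), sigmoid(scores).sum())   # each value in (0, 1), sum unconstrained
```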
CREDITS: This presentation template was created by Slidesgo, including icons by Flaticon, and infographics & images by Freepik.
icons by Flaticon, and infographics & images by Freepik
THANKS
Do you have any questions?
Please keep this slide for attribution