This document provides an overview of deep learning and some key concepts in neural networks. It discusses how neural networks work by taking inputs, multiplying them by weights, applying an activation function, and using backpropagation to update the weights. It describes common activation functions like sigmoid and different types of neural networks like CNNs and RNNs. For CNNs specifically, it explains concepts like convolution using filters, padding input images to prevent information loss, and max pooling layers to make predictions invariant to position or scale.
2. Agenda
Introduction
How Neural Networks Work
Activation Function
Neural Network with Back Propagation
Different types of deep learning algorithms
Convolutional Neural Network
What is Convolution
Padding in CNN
Max Pooling Layer
3. Introduction
Deep learning is a technique that essentially mimics the human brain.
Scientists and researchers asked whether we can make machines learn in the same way,
and that is where the concept of deep learning came from, leading to the invention of the
Neural Network.
4. How Neural Networks Work
Consider features 𝑥1, 𝑥2, 𝑥3 as the input layer for a binary classification problem.
Now, let us understand what kind of processing the hidden layer does and what the
importance of the weights 𝑤1, 𝑤2, 𝑤3 is.
5. Continue…
As soon as the inputs are given, they are multiplied by their respective weights, and these
products in turn become the inputs to the hidden layer, where the activation function is triggered.
When 𝑤1, 𝑤2, 𝑤3 are assigned, the weighted inputs pass to the hidden neuron. Then two types
of operation usually happen.
Step 1: The weighted summation of the inputs:
y = Σᵢ₌₁ⁿ 𝑤ᵢ𝑥ᵢ = 𝑤1𝑥1 + 𝑤2𝑥2 + 𝑤3𝑥3
6. Continue…
Step 2: Before the activation function is applied, the bias is added to the summation:
y = 𝑤1𝑥1 + 𝑤2𝑥2 + 𝑤3𝑥3 + 𝑏
z = 1 / (1 + 𝑒⁻ʸ)  (the sigmoid activation function)
Z = z × 𝑤4
If it is a classification problem, then 0 or 1 will be obtained.
This is an example of forward propagation.
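The two steps above can be sketched in pure Python. The concrete weight, bias, and input values below are illustrative assumptions, since the slides do not fix any numbers:

```python
import math

def sigmoid(y):
    """Sigmoid activation: squashes y into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-y))

def forward(x, w, b, w4):
    """One forward pass through a single hidden neuron.

    Step 1: weighted sum of the inputs.
    Step 2: add the bias, apply the sigmoid activation,
            then weight the hidden output for the output layer.
    """
    y = sum(wi * xi for wi, xi in zip(w, x)) + b   # y = w1*x1 + w2*x2 + w3*x3 + b
    z = sigmoid(y)                                 # hidden-neuron activation
    return z * w4                                  # output contribution Z = z * w4

# Illustrative (assumed) values:
Z = forward(x=[2.0, 4.0, 8.0], w=[0.1, 0.2, 0.05], b=0.5, w4=1.0)
print(round(Z, 4))
```

Whatever values are chosen, the sigmoid keeps the hidden output strictly between 0 and 1 before it is weighted by 𝑤4.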
7. Activation function
The activation function is a mathematical “gate” in between the input feeding the current
neuron and its output going to the next layer. It can be as simple as a step function that turns
the neuron output on and off depending on a rule or threshold.
Example: I have two hands. Suppose I place a hot object on my right hand. The corresponding
weights become higher, the activation function is applied, and those neurons get activated
and pass the information to my brain; I then respond to the stimulus and remove the
object. My left hand, meanwhile, does not get activated and passes no information to the
brain. This is how the whole human nervous system works.
Sigmoid Function: σ(x) = 1 / (1 + 𝑒⁻ʸ), where y = Σᵢ₌₁ⁿ 𝑤ᵢ𝑥ᵢ + 𝑏
This transforms the value into the range between 0 and 1. If the result is < 0.5, it is
considered 0; here 0.5 is the threshold.
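A minimal sketch of the sigmoid and its 0.5 threshold (pure Python; the sample inputs are arbitrary):

```python
import math

def sigmoid(x):
    """sigma(x) = 1 / (1 + e^(-x)); the output always lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def classify(y, threshold=0.5):
    """Map the sigmoid output to a class label: 0 below the threshold, else 1."""
    return 1 if sigmoid(y) >= threshold else 0

print(classify(-2.0))  # sigmoid(-2) is about 0.12, below the 0.5 threshold
print(classify(3.0))   # sigmoid(3) is about 0.95, above the 0.5 threshold
```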
8. Neural Network with Back Propagation
Let us consider a dataset:
𝒙𝟏 (Play)   𝒙𝟐 (Study)   𝒙𝟑 (Sleep)   O/P (y)
2h          4h           8h           1
Forward propagation: let the inputs be 𝑥1, 𝑥2, 𝑥3. These inputs pass to the hidden
neuron, where two important operations take place:
y = [𝑤1𝑥1 + 𝑤2𝑥2 + 𝑤3𝑥3] + 𝑏
z = Act(y), using the sigmoid activation function
*Only one hidden neuron is considered for this training example.
9. Continue…
Here the loss value is high and the prediction is completely wrong.
Now the weights have to be adjusted in such a way that the predicted output becomes 1.
This is done using an optimizer: to reduce the loss value, backpropagation needs
to be used.
Back Propagation: while doing backpropagation, the weights get updated:
𝑤4(new) = 𝑤4(old) − α · ∂L/∂𝑤4
10. Continue…
Here the learning rate α should be a small value, e.g. 0.001.
This small learning rate helps gradient descent reach the global minimum,
which is possible only with the optimizer.
After updating 𝑤4, the other weights 𝑤1, 𝑤2, 𝑤3 need to be updated in the same way:
𝑤3(new) = 𝑤3(old) − α · ∂L/∂𝑤3
Once the values are updated, forward propagation starts again. This iterates until the
loss reduces to the point where the prediction ŷ equals the actual y.
Since there is a single record, a loss function is defined. If there are multiple
records, a cost function needs to be defined:
Σᵢ₌₁ⁿ (y − ŷ)²
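The update rule and the forward/update iteration can be sketched as follows. The single-weight squared-error loss and the numerical gradient are simplifying assumptions made for illustration; they are not the exact setup of the slides:

```python
# Gradient-descent sketch of w_new = w_old - alpha * dL/dw on one record.

def loss(w, x, y_true):
    """Squared-error loss for one record with a linear prediction y_hat = w * x."""
    y_hat = w * x
    return (y_true - y_hat) ** 2

def update(w, x, y_true, alpha=0.001, eps=1e-6):
    """One gradient-descent step using a central-difference numerical gradient."""
    grad = (loss(w + eps, x, y_true) - loss(w - eps, x, y_true)) / (2 * eps)
    return w - alpha * grad

w = 0.0
for _ in range(5000):            # alternate forward pass and weight update
    w = update(w, x=2.0, y_true=1.0)
print(round(w, 3))               # w converges toward 0.5, since 0.5 * 2.0 = 1.0
```

The small learning rate (0.001) is exactly why many iterations are needed before the loss approaches zero.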
11. Continue…
The first and simplest type of neural network is called the perceptron.
There were some problems with the perceptron: it was not able to learn
properly because of the concepts it applied.
But later, in the 1980s, Geoffrey Hinton and his colleagues popularized the concept called
backpropagation. ANNs, CNNs, and RNNs then became efficient enough that many companies
use them and have developed a lot of applications.
12. Different types of deep learning algorithms
Some different types of deep learning algorithms are:
Artificial Neural Network (ANN)
Convolutional Neural Network (CNN)
Recurrent Neural Networks (RNNs)
Long Short-Term Memory Networks (LSTMs)
Stacked Auto-Encoders
Deep Boltzmann Machine (DBM)
Deep Belief Networks (DBN)
13. Continue…
The most popular deep learning algorithms are:
Artificial Neural Network (ANN)
Convolutional Neural Network (CNN)
Recurrent Neural Networks (RNNs)
14. Continue…
Artificial Neural Network: Artificial Neural Network, or ANN, is a group of multiple
perceptrons/ neurons at each layer. ANN is also known as a Feed-Forward Neural
network because inputs are processed only in the forward direction.
Convolutional Neural Network: These CNN models are being used across different applications
and domains, and they’re especially prevalent in image and video processing projects.
Recurrent Neural Network: An RNN captures the sequential information present in the input
data, i.e. the dependency between the words in a text, while making predictions.
16. Convolutional Neural Network
A convolutional neural network is able to do a lot of things, like object detection,
object classification, object recognition, and many more.
Now we will understand what exactly a convolutional neural network is and how it
works, and first we will see how image recognition basically happens with respect
to the human brain.
At the back part of the human head we have the cerebral cortex; the brain is
divided into four lobes, and this back region of the cortex contains something
called the visual cortex, which is responsible for processing the images we see.
17. Continue…
Suppose I am seeing a cat. That information passes through my sensory organ, the
eyes; it then travels through various neurons until it reaches the visual cortex,
which has multiple layers: V1, V2, V3, V4, V5, and V6.
These layers play a very important role.
For example, the V1 layer is responsible for finding the edges of the image. The
information then goes to the next layer, V2, where more information is gathered: whether
the object is moving, whether there is any other object apart from the cat, and so on.
Likewise the information passes from layer to layer (V1 to V2, V2 to V3, and so on)
until the final output is produced.
I took this layered example in order to explain the concept called filters, or kernels, in
CNNs.
18. What is Convolution
Before understanding convolution, we will cover some basic things about images.
Basically, an image is represented as a grid of pixels: 200x200, 10x10, 6x6, 2x2, etc.
Suppose I have a grayscale image of 4x4 pixels. A grayscale image has only black-to-white
intensities, and each pixel value ranges from 0 to 255. For example:
255   0 125   0
  0   0   3  17
  0 100  35 200
  0   0  34 147
(4x4)
Similarly, there are also colored images on the RGB scale. Specifying an RGB image of
the same size requires three channels, so its shape is 4x4x3.
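These layouts can be sketched with plain nested lists; the grayscale values are the ones from the slide, while the RGB values are placeholders:

```python
# How images are stored as pixel arrays (values 0-255), using nested lists.

# The 4x4 grayscale image from the slide: one intensity value per pixel.
gray = [
    [255,   0, 125,   0],
    [  0,   0,   3,  17],
    [  0, 100,  35, 200],
    [  0,   0,  34, 147],
]

# An RGB image of the same size is 4x4x3: each pixel holds three channel
# values (red, green, blue). Here every pixel is an assumed mid-gray.
rgb = [[[128, 128, 128] for _ in range(4)] for _ in range(4)]

print(len(gray), len(gray[0]))                  # 4 4   -> shape 4x4
print(len(rgb), len(rgb[0]), len(rgb[0][0]))    # 4 4 3 -> shape 4x4x3
```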
19. Continue…
The term convolution refers to the mathematical combination of two functions to
produce a third function. It merges two sets of information. In the case of a CNN,
the convolution is performed on the input data with the use of a filter or kernel (these
terms are used interchangeably) to then produce a feature map.
Now, let us consider a particular image with pixel values 0 and 1. If I apply a vertical
edge filter to any image, it will detect all vertical edges. Let us see how this
convolution operation works.
20. Continue…
Now we see that with a 6x6 image and a 3x3 filter, the output is 4x4.
Let the 6x6 image be n = 6 and the 3x3 filter f = 3. The formula for the output size is:
n − f + 1 = 6 − 3 + 1 = 4
We can observe one thing here: the image is 6x6 but the output is 4x4, which means we
are losing some information in the output image. If we need the output image to be the
same size, we implement a concept called padding.
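A minimal "valid" convolution sketch illustrating the n − f + 1 output size. The vertical-edge kernel values are an assumption (a common Prewitt-style choice), since the slides do not specify them:

```python
# Stride-1 convolution with no padding over a square image.

def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and
    return a feature map of size (n - f + 1) x (n - f + 1)."""
    n, f = len(image), len(kernel)
    out_size = n - f + 1
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            acc = sum(image[i + a][j + b] * kernel[a][b]
                      for a in range(f) for b in range(f))
            row.append(acc)
        out.append(row)
    return out

# 6x6 image of 0s and 1s with a vertical edge down the middle.
image = [[1, 1, 1, 0, 0, 0] for _ in range(6)]
kernel = [[1, 0, -1],        # assumed vertical-edge detector values
          [1, 0, -1],
          [1, 0, -1]]

fmap = convolve2d(image, kernel)
print(len(fmap), len(fmap[0]))   # 4 4  -> n - f + 1 = 6 - 3 + 1
print(fmap[0])                   # strong responses where the edge sits
```

The large values in the middle columns of the feature map mark exactly where the 1-to-0 transition (the vertical edge) occurs.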
21. Padding in CNN
I have a 6x6 pixel image; applying a 3x3 vertical edge filter gives a 4x4 pixel output.
This type of convolution operation is done just using strides.
Here, the filter size may change according to what kind of operation is required.
Let the 6x6 image be n = 6 and the 3x3 filter f = 3. If we need the output to be the
same size as the original image, we implement a concept called padding. To find the
required input size, solve the output-size formula for n:
n − f + 1 = 6
n = 6 + f − 1
n = 6 + 3 − 1
n = 8
22. Continue…
Now we use a technique called padding. What padding does is: for a 6x6 pixel image, it
adds a border of 1 pixel on every side, making it 8x8 pixels.
23. Continue…
It is basically like building a compound wall around a house: we will not lose any
information from the edges.
Now, what values can we fill in the padding? There are basically two types:
1. Zero padding
2. Replacing with neighboring pixel values
The best and most commonly used technique is zero padding.
Now, the formula for the output size with padding p is:
n + 2p − f + 1
= 6 + 2(1) − 3 + 1
= 8 − 3 + 1 = 6
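Zero padding can be sketched directly; the all-ones 6x6 image below is a placeholder:

```python
# Pad a 6x6 image with p = 1 zeros on every side, so that a 3x3 "valid"
# convolution gives back a 6x6 output (n + 2p - f + 1 = 6).

def zero_pad(image, p):
    """Surround the image with a border of p rows/columns of zeros."""
    n = len(image)
    padded = [[0] * (n + 2 * p) for _ in range(n + 2 * p)]
    for i in range(n):
        for j in range(n):
            padded[i + p][j + p] = image[i][j]
    return padded

image = [[1] * 6 for _ in range(6)]   # any 6x6 image
padded = zero_pad(image, p=1)

n, p, f = 6, 1, 3
print(len(padded), len(padded[0]))    # 8 8
print(n + 2 * p - f + 1)              # 6 -> the output size matches the input
```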
24. Continue…
Usually, when the human brain sees faces, some of the neurons automatically get
triggered. Similarly, we should try to make our convolutional neural network behave
like that.
Suppose one image contains multiple cat faces; then the CNN kernels should
automatically get triggered and be able to detect those multiple faces. This step is
done by the max pooling layer.
25. Max Pooling Layer
Suppose I have a 4x4 pixel image and apply a 2x2 filter (with a stride of 1). Then I get
a 3x3 output:
n = 4, f = 2; n − f + 1
4 − 2 + 1 = 3
26. Continue…
Now let us try to understand what exactly max pooling is. Here, a concept called
location invariance is introduced.
Suppose I have multiple cat images and my filter is used to detect the faces. As we go
up into the higher-level layers, these faces should be detected clearly and precisely
regardless of their position; for this we use a max pooling layer.
Now, how does this max pooling layer work? Consider a max pooling layer with a 2x2
filter size and a stride of 2.
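A sketch of that 2x2, stride-2 max pooling over an assumed 4x4 feature map:

```python
# Max pooling: take the maximum over each 2x2 window, moving by stride 2.
# Because only the max survives, small shifts of a strong response inside a
# window leave the output unchanged -- this is the location invariance.

def max_pool(image, f=2, stride=2):
    """Take the max over each f x f window, moving by `stride`."""
    n = len(image)
    out = []
    for i in range(0, n - f + 1, stride):
        row = []
        for j in range(0, n - f + 1, stride):
            window = [image[i + a][j + b] for a in range(f) for b in range(f)]
            row.append(max(window))
        out.append(row)
    return out

feature_map = [        # assumed 4x4 feature map values
    [1, 3, 2, 1],
    [4, 6, 5, 0],
    [2, 1, 9, 8],
    [0, 2, 7, 3],
]
print(max_pool(feature_map))   # [[6, 5], [2, 9]]
```

Each 2x2 block of the input collapses to its single strongest activation, so the 4x4 map shrinks to 2x2 while keeping the strongest face responses.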