A convolutional neural network (CNN) is a type of neural network that specializes in processing grid-like data such as images. CNNs take advantage of the 2D structure of images by using small filters that are convolved across the input, resulting in feature maps. The core layers of a CNN are convolutional layers, ReLU layers, pooling layers, and fully connected layers. Convolutional layers apply filters to extract features, ReLU layers introduce nonlinearity, pooling layers downsample the data to reduce dimensionality, and fully connected layers perform classification. CNNs are well-suited for computer vision tasks due to their ability to learn translation-invariant features directly from images.
2. Introduction
• In the previous slides we learned the basics of deep neural networks, their types, and their use cases.
• In this section we will learn about one of those types: the Convolutional Neural Network (CNN) architecture.
3. What is CNN ?
• A Convolutional Neural Network, also known as a CNN or ConvNet, is a type of feed-forward neural network that specializes in processing data with a grid-like topology, such as an image.
• A digital image is a representation of visual data: a grid of pixels, each holding one or more intensity values.
• Because images have this grid structure, CNNs are widely used for image classification.
• The architecture of CNN is designed to take
advantage of the 2D structure of an input
image.
• A basic CNN consists of one or more convolution layers (often each followed by a pooling step), followed by one or more fully connected layers as in a standard multilayer neural network.
4. Motivation behind CNN ?
• Consider an image of size 200x200x3 (200 wide, 200 high, 3 color channels)
A single fully-connected neuron in the first hidden layer of a regular neural network would have
200x200x3 = 120000 weights.
A layer contains many such neurons, so the parameter count grows very quickly. This full connectivity is wasteful, and the huge
number of parameters would quickly lead to overfitting.
• However, in a CNN the neurons in a layer are only connected to a small region of
the layer before it (discussed later) instead of to all of its neurons in a fully connected
manner.
The final output layer would have dimensions 1x1xN, because by the end of the CNN
architecture we will reduce the full image into a single vector of class scores (for N classes),
arranged along the depth dimension.
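The weight counts behind this motivation can be checked with a few lines of arithmetic. The 5x5 local region is an assumed size used only for illustration; the slide does not specify one.

```python
# Weights needed by one neuron that sees the whole 200x200x3 image.
width, height, channels = 200, 200, 3
fully_connected_weights = width * height * channels

# Weights needed by one neuron that sees only a small local region.
# The 5x5 region size is an assumption made for this sketch.
region = 5
local_weights = region * region * channels

print(fully_connected_weights)
print(local_weights)
```

A locally connected neuron needs orders of magnitude fewer weights than a fully connected one on the same image.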
5. MLP vs CNN ?
Multi-layered perceptron: all layers are fully
connected
Convolutional Neural Network with partially
connected Convolution layer
6. MLP vs CNN ?
Multi-layered perceptron: a regular 3-layer
neural network
Convolutional Neural Network: arranges its neurons in 3 dimensions (width, height, depth), as visualized in the figure.
Because of this 3-D arrangement of neurons, a CNN is well adapted to the properties of images:
• Pixel position and neighborhood have semantic meanings
• Elements of interest can appear anywhere in the image
7. How CNN works – What computer sees
• For example, a CNN can take an image and classify it as an ‘X’ or an ‘O’.
• In the simple case, an ‘X’ would look like
• But what about a trickier case
• If the computer compares the images pixel by pixel, the pattern does not match exactly, so it will not be able to classify the trickier image as an ‘X’.
Using a CNN, we can overcome this issue by matching small local features instead of the whole image.
8. CNN layers
• A CNN consists of four basic layers
• Convolutional layer (CONV) will compute the output of neurons that are connected to local regions in the input, each computing a dot product between its weights and the small region it is connected to in the input volume.
• RELU layer (already discussed in ANN) will apply an elementwise activation function, such as the max(0,x) thresholding at zero. This leaves the size of the volume unchanged and introduces non-linearity into the network.
• Pooling (POOL) layer will perform a down sampling operation along the spatial dimensions (width, height). DROPOUT is sometimes added as well, but it is a regularization technique (see the Dropout Layer slide), not a down sampling operation.
• Fully-connected layer (FC) will compute the class scores, resulting in a volume of size [1x1xN], where each of the N numbers corresponds to the score of one of the N categories.
9. Convolutional Layer
• The convolution layer (CONV) uses filters that perform convolution operations as they scan the input I along its dimensions. Its hyperparameters include the filter size F and the stride S. The resulting output O is called a feature map or activation map.
• The convolution layer identifies patterns (features) rather than individual pixels.
• The role of the ConvNet is to reduce the images into a form which is easier to process, without losing features which are critical for getting a good prediction.
10. What is the Convolution operation?
• At each filter position, convolution is the summation of the element-wise product of 2 matrices (the region of the input image under the filter, and the filter itself).
• Let us consider an image region ‘X’ & a filter ‘Y’ of the same size (more about filters will be covered later). Both X & Y are matrices (the image X is expressed as pixel values). When we multiply ‘X’ by the filter ‘Y’ element by element, we get a matrix, say ‘Z’.
• Finally, we compute the sum of all the elements in ‘Z’ to get a scalar number, which becomes one entry of the output.
image X
kernel Y
Convolution operation
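The single step described on this slide can be sketched in plain Python. This is a minimal illustration, not a library routine: the function name `patch_response` and the example matrices are made up for this sketch, and X is assumed to be an image region with the same size as the filter Y.

```python
def patch_response(patch, kernel):
    """Sum of the element-wise product of two equally sized matrices."""
    # Z holds the element-wise products of the image region and the filter.
    z = [[p * k for p, k in zip(patch_row, kernel_row)]
         for patch_row, kernel_row in zip(patch, kernel)]
    # Summing every element of Z gives one scalar of the output.
    return sum(sum(row) for row in z)


# Hypothetical 3x3 image region (X) and filter (Y).
x = [[1, 0, 1],
     [0, 1, 0],
     [1, 0, 1]]
y = [[1, 0, 1],
     [0, 1, 0],
     [1, 0, 1]]
result = patch_response(x, y)
print(result)
```

Strictly speaking, this operation (without flipping the filter) is cross-correlation, which is what most deep learning libraries compute under the name convolution.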
11. Convolutional Layer - Filters/Kernels
• A filter provides a measure of how closely a patch or region of the input resembles a feature. A feature may be any prominent aspect: a vertical edge, a horizontal edge, an arch, a diagonal, etc.
• A filter acts as a single template or pattern, which, when convolved across the input, finds
similarities between the stored template & different locations/regions in the input image.
• To perform convolution operation, slide the filter over the width and height of the input image
and perform summation of the element-wise product.
• If the input image size is ‘n x n’ & the filter size is ‘f x f’ (with a stride of 1 and no padding):
• Output size = (n – f + 1) x (n – f + 1)
• For example, with n = 5 and f = 3: Output size = (5-3+1) x (5-3+1) = 3x3
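The sliding procedure and the output-size formula can be checked with a small sketch. It assumes a square image, a square filter, a stride of 1 and no padding; the function name `convolve2d` and the example values (a 5x5 image with a vertical edge, and a simple vertical-edge filter) are chosen only for illustration.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    n, f = len(image), len(kernel)
    out = n - f + 1  # output size from the formula above
    result = []
    for i in range(out):
        row = []
        for j in range(out):
            total = 0
            for a in range(f):
                for b in range(f):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        result.append(row)
    return result


# Hypothetical 5x5 image: bright on the left, dark on the right.
image = [[1, 1, 1, 0, 0] for _ in range(5)]
# Hypothetical 3x3 vertical-edge filter.
kernel = [[1, 0, -1] for _ in range(3)]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

The feature map is 3x3, and its larger values mark the positions where the filter overlaps the edge.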
12. Filter hyperparameters - Padding
• Sometimes it is convenient to pad the input volume with zeros around the border.
• Zero padding allows us to preserve the spatial size of the output volumes.
• Why do we do Padding?
• Every time we apply a convolution operator, our image shrinks, and pixels near the border are covered by fewer filter positions than pixels in the centre. So, we lose information, which is one of the downsides of convolution.
• So, to fix these problems, we can ‘pad’ the image.
One-pixel zero padding on a 5x5 image
• Let p be the padding. In this example, p = 1 because we padded all around the input image with an extra border of 1 pixel.
• Output size = (n + 2p – f + 1) x (n + 2p – f + 1), where n is the image dimension, p is the padding and f is the filter size
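Zero padding and its effect on the output size can be sketched as follows. The helper names `zero_pad` and `output_size` are hypothetical, and the sketch assumes a square image, a stride of 1, and the same padding p on every side.

```python
def zero_pad(image, p):
    """Return a copy of a square image surrounded by p rows/columns of zeros."""
    width = len(image) + 2 * p
    padded = [[0] * width for _ in range(p)]          # top border
    for row in image:
        padded.append([0] * p + list(row) + [0] * p)  # left and right borders
    padded.extend([0] * width for _ in range(p))      # bottom border
    return padded


def output_size(n, f, p=0):
    """Output dimension for an n x n image, f x f filter and padding p (stride 1)."""
    return n + 2 * p - f + 1


image = [[1] * 5 for _ in range(5)]  # hypothetical 5x5 image of ones
padded = zero_pad(image, 1)
print(len(padded), len(padded[0]))   # size of the padded image
print(output_size(5, 3, p=1))        # output size after a 3x3 convolution
```

With p = 1 and a 3x3 filter, the padded image is 7x7 and the output keeps the original 5x5 size.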
13. Types of Padding
• There are two common choices for padding: Valid convolutions & Same convolutions.
a) Valid convolutions - This means no padding. Thus, in this case, an (n x n) image convolved with an (f x f) filter gives an (n-f+1) x (n-f+1) dimensional output.
b) Same convolutions - In this case, the padding is chosen so that the output size is the same as the input image size. When we pad by ‘p’ pixels, the size of the input image changes from (n x n) to (n+2p) x (n+2p), and the output size becomes (n + 2p – f + 1) x (n + 2p – f + 1).
The amount of padding to be done should be such that the output image after convolution
matches the size of the input image.
Let, n x n = Original input image size, p = Padding
(n+2p) x (n+2p) = Size of padded input image
(n+2p–f+1) x (n+2p-f+1) = Size of output image after convolving padded image
To avoid shrinkage of the original input image, we choose the padding size ‘p’ so that
Output size after convolving padded image = Original input image size
n + 2p – f + 1 = n, which gives p = (f – 1) / 2
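Setting the output size equal to the input size, n + 2p – f + 1 = n, gives p = (f – 1) / 2 for a stride of 1. The sketch below evaluates that expression for a few filter sizes; the function name `same_padding` is hypothetical, and `Fraction` is used only so that a non-integer result stays exact.

```python
from fractions import Fraction


def same_padding(f):
    """Padding per side that keeps the output size equal to the input size (stride 1)."""
    return Fraction(f - 1, 2)


for f in (3, 5, 6):
    p = same_padding(f)
    whole = p.denominator == 1  # True when p is a whole number of pixels
    print(f"f = {f}: p = {p} ({'symmetric' if whole else 'needs asymmetric padding'})")
```

Odd filter sizes give a whole number of pixels, while an even size such as 6 gives a fractional value that cannot be applied equally on every side.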
14. How is the Filter Size Decided?
• By convention, the value of ‘f’, i.e., the filter size, is usually odd in computer vision. There are 2 likely reasons:
• If the value of ‘f’ is even, we may need asymmetric padding (see the previous slide). Let us say that the size of the filter, ‘f’, is 6. Then, by using the padding equation, we get a padding size of 2.5, which is not a whole number of pixels.
Let n x n = 10 x 10 = Original input image size, p = Padding and f = 6
Output image = (10+2p–6+1) x (10+2p-6+1) = 10x10
because we want our output image to be the same size as the input,
and we get p = 2.5, which cannot be applied equally on every side.
• The 2nd reason for choosing an odd size filter such as a 3×3 or a 5×5 filter is that it has a central position, & at times it is nice to have such a distinguished pixel to refer to the filter's location.
15. Filter hyperparameters - Stride
• For a convolutional or a pooling operation, the stride S denotes the number of pixels by
which the window moves after each operation.
• In simple words, the stride is the step size by which the filter moves horizontally & vertically over the pixels of the input image during convolution.
• Let n x n = Original input image size, p = padding, f = filter size and s = stride
Output image size = [{(n + 2p - f) / s} + 1] x [{(n + 2p - f) / s} + 1], where the division is usually rounded down when (n + 2p - f) is not a multiple of s
Convolution Operation with Stride Length = 2
Stride during convolution
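The stride formula can be evaluated with a one-line helper. This is a sketch: the name `conv_output_size` is hypothetical, and it assumes the common convention of rounding the division down when the stride does not fit evenly.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output dimension for an n x n input, f x f filter, padding p and stride s."""
    # Floor division implements the "round down" convention assumed here.
    return (n + 2 * p - f) // s + 1


print(conv_output_size(5, 3))            # no padding, stride 1
print(conv_output_size(7, 3, s=2))       # stride 2
print(conv_output_size(5, 3, p=1, s=1))  # 'same' padding for a 3x3 filter
```

With s = 1 and p = 0 the formula reduces to n – f + 1, matching the earlier slides.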
16. Convolutions over RGB images
• Consider an RGB image of size 6×6. Since it’s an RGB image, its dimension is 6x6x3, where the 3 corresponds to the three color channels: Red, Green & Blue. We can imagine this as a 3-D volume made of a stack of three 6x6 images.
• For 3-D images, we need 3-D filters, i.e., the filter itself will also have three layers corresponding to the red, green & blue channels, like those of the input RGB image.
Convolution over volume
• We 1st place the 3x3x3 filter in the upper left most position, the same as in 2-D. This filter has 27 parameters, or numbers (9 in each channel).
• We take each of these 27 numbers & multiply
them with the corresponding numbers from the
image’s red, green & blue channels.
• Then we add up all those numbers & this gives
us the 1st number in the output image.
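Convolution over a volume can be sketched by extending the 2-D sum to run over the channels as well. The layout `[height][width][channels]`, the function name `convolve_volume` and the constant example values are assumptions made for this illustration; stride 1 and no padding are assumed.

```python
def convolve_volume(image, kernel):
    """Convolve an n x n x c image with an f x f x c filter (stride 1, no padding)."""
    n, f, c = len(image), len(kernel), len(kernel[0][0])
    out = n - f + 1
    # Each output number sums f * f * c products (27 for a 3x3x3 filter).
    return [[sum(image[i + a][j + b][ch] * kernel[a][b][ch]
                 for a in range(f) for b in range(f) for ch in range(c))
             for j in range(out)]
            for i in range(out)]


# Hypothetical 6x6 RGB image where every pixel is (R, G, B) = (1, 2, 3).
image = [[[1, 2, 3] for _ in range(6)] for _ in range(6)]
# Hypothetical 3x3x3 filter of ones.
kernel = [[[1, 1, 1] for _ in range(3)] for _ in range(3)]

feature_map = convolve_volume(image, kernel)
print(len(feature_map), len(feature_map[0]))  # spatial size of the output
print(feature_map[0][0])                      # first number in the output
```

Note that the output is a single 2-D map: the three channels are summed into one number per position.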
18. Multiple Filters for Multiple Features
• We can use multiple filters to detect various features simultaneously.
• Let us consider the following example, in which we want to detect a vertical edge & a curve in the input RGB image.
• We will have to use two different filters for this task, and the output will thus have two feature maps, one per filter.
Convolution using multiple filters
• Let us understand the dimensions mathematically
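The dimensions can be worked out with a small helper. This is a sketch under the assumptions used on the earlier slides (square input, square filters, floor division for the stride); the function name `conv_shapes` is hypothetical. Each filter must have the same depth as the input, and the output depth equals the number of filters.

```python
def conv_shapes(n, channels, f, num_filters, p=0, s=1):
    """Return (filter_shape, output_shape) for a convolution layer."""
    out = (n + 2 * p - f) // s + 1
    filter_shape = (f, f, channels)         # every filter spans all input channels
    output_shape = (out, out, num_filters)  # one feature map per filter
    return filter_shape, output_shape


# 6x6 RGB image convolved with two 3x3x3 filters (e.g. vertical edge and curve).
filter_shape, output_shape = conv_shapes(n=6, channels=3, f=3, num_filters=2)
print(filter_shape)
print(output_shape)
```

For this example, two 3x3x3 filters applied to a 6x6x3 image produce a 4x4x2 output volume.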
19. Some important concepts
• The filters are learned during training (i.e., during backpropagation). Hence, the individual
values of the filters are often called the weights of CNN.
• A neuron is a filter whose weights are learned during training. E.g., a (3,3,3) filter (or neuron) has 27 weights. Each neuron looks at a particular region in the input (i.e., its ‘receptive field’).
• A feature map is a collection of multiple neurons, each looking at different inputs with the same weights.
• All neurons in a feature map extract the same feature (but from different input regions). It is called a ‘feature map’ because it maps where a particular feature is found in the image.
20. ReLU Layer
• ReLU is a piecewise linear function that will output the input directly if it is positive; otherwise, it will output zero.
• A key property is that ReLU does not activate all the neurons at the same time: neurons with a negative input output zero.
• Mathematically it can be represented as: f(x) = max(0, x)
• The derivative of the function is: f'(x) = 1 if x > 0, and 0 if x < 0 (it is undefined at x = 0, where 0 is commonly used)
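ReLU and its derivative, as described in words above, can be written as two short functions. This is a minimal sketch; taking the derivative at exactly x = 0 to be 0 is a convention assumed here, since the function is not differentiable at that point.

```python
def relu(x):
    """Output the input if it is positive, otherwise zero."""
    return max(0.0, x)


def relu_derivative(x):
    """Slope of ReLU: 1 for positive inputs, 0 otherwise (0 assumed at x = 0)."""
    return 1.0 if x > 0 else 0.0


# Apply ReLU elementwise to a hypothetical row of a feature map.
values = [-2.0, -0.5, 0.0, 1.5, 3.0]
print([relu(v) for v in values])
print([relu_derivative(v) for v in values])
```

Applied elementwise, ReLU leaves the size of the feature map unchanged and zeroes out the negative responses.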
21. Pooling Layer
• A pooling layer is another essential building block of a CNN. It summarizes whether a particular region in the image has the feature we are interested in or not.
• The pooling layer (POOL) is a down sampling operation, typically applied after a convolution layer, which provides some spatial invariance.
• The two most popular aggregate functions used in pooling are ‘max’ & ‘average’:
a) Max pooling – If any of the patches says something firmly about the presence of a particular feature, then the pooling layer counts that feature as ‘detected’. It preserves detected features and is the most commonly used.
b) Average pooling – If one patch says something very firmly but the other ones disagree, average pooling takes the average of them all. It down samples the feature map and is used in LeNet.
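Both pooling variants can be sketched with one function. The name `pool2d`, the 2x2 window with stride 2, and the example feature map are assumptions chosen for illustration; a square feature map is assumed.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Down sample a square feature map with max or average pooling."""
    n = len(feature_map)
    out = (n - size) // stride + 1
    pooled = []
    for i in range(out):
        row = []
        for j in range(out):
            # Collect the values inside the current pooling window.
            window = [feature_map[i * stride + a][j * stride + b]
                      for a in range(size) for b in range(size)]
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map.
fm = [[1, 3, 2, 4],
      [5, 6, 1, 2],
      [7, 2, 9, 0],
      [1, 4, 3, 8]]

max_pooled = pool2d(fm, mode="max")
avg_pooled = pool2d(fm, mode="average")
print(max_pooled)
print(avg_pooled)
```

Each 2x2 window is reduced to one number, so the 4x4 map becomes 2x2 in both cases.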
22. Pooling Layer – Advantage and Disadvantage
• Advantages
• Pooling has the advantage of making the representation more compact by reducing the spatial size of the feature maps, thereby reducing the computation and the number of parameters to be learnt in the layers that follow.
• Pooling reduces only the height & width of the feature map, not the number of channels
• Disadvantage
• Pooling also loses a lot of information, which is often considered a potential disadvantage
23. Dropout Layer
• Large neural nets trained on relatively small datasets can overfit the training data which
results in poor performance when the model is evaluated on new data.
• Dropout is a regularization method that approximates training a large number of neural
networks with different architectures in parallel.
• During training, some number of layer outputs are randomly ignored or “dropped out” in this layer.
• Dropout has the effect of making the training
process noisy, forcing nodes within a layer to
probabilistically take on more or less responsibility
for the inputs.
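The random dropping of outputs during training can be sketched as follows. This sketch uses the "inverted dropout" variant, which rescales the surviving outputs by 1 / (1 - rate) so that their expected value is unchanged; that scaling step, the function name `dropout`, and the seeded random generator are assumptions of this example rather than something stated on the slide.

```python
import random


def dropout(activations, rate, rng):
    """Randomly zero each activation with probability `rate` (training time only)."""
    keep = 1.0 - rate
    # Surviving activations are scaled up so the expected sum stays the same.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # seeded only to make the sketch repeatable
layer_outputs = [1.0] * 10
dropped = dropout(layer_outputs, rate=0.5, rng=rng)
print(dropped)
```

At prediction time dropout is switched off and the layer outputs are passed through unchanged.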
24. Fully connected Layer
• Fully connected layers are the normal flat
feed-forward neural network layers.
• These layers may use a non-linear activation function; the final one mostly uses a softmax activation function in order to predict classes.
• To compute the output, we first flatten all the output 2-D matrices (feature maps) into a single 1-D array.
25. Fully connected Layer
• A summation of the products of inputs and weights at each output node determines the final prediction, the same as in a feed-forward network.
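The flatten, weighted-sum and softmax steps of the fully connected stage can be sketched together. All names (`flatten`, `fully_connected`, `softmax`) and all numeric values below are hypothetical; in a trained network the weights and biases would be learned.

```python
import math


def flatten(feature_maps):
    """Arrange a list of 2-D feature maps as one 1-D list."""
    return [v for fm in feature_maps for row in fm for v in row]


def fully_connected(inputs, weights, biases):
    """One score per output node: sum of inputs times weights, plus a bias."""
    return [sum(w * x for w, x in zip(w_row, inputs)) + b
            for w_row, b in zip(weights, biases)]


def softmax(scores):
    """Turn class scores into probabilities that sum to 1."""
    m = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Two hypothetical 2x2 feature maps and a 2-class output layer.
feature_maps = [[[1, 0], [0, 1]],
                [[0, 2], [2, 0]]]
inputs = flatten(feature_maps)
weights = [[1] * 8, [0] * 8]  # made-up weights, one row per class
biases = [0, 0]

scores = fully_connected(inputs, weights, biases)
probabilities = softmax(scores)
print(inputs)
print(scores)
print(probabilities)
```

The class with the largest probability is taken as the prediction.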
26. Understanding the complexity of the CNN
• In order to assess the complexity of a model, it is often useful to determine the number of
parameters that its architecture will have. In a given layer of a convolutional neural network, it
is done as follows:
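One common way to count parameters is sketched below. It assumes one bias per filter in a convolution layer and one bias per output node in a fully connected layer; the function names are hypothetical. Pooling layers have no learnable parameters.

```python
def conv_params(f, in_channels, num_filters):
    """Weights plus one bias for each f x f x in_channels filter."""
    return (f * f * in_channels + 1) * num_filters


def pool_params():
    """Pooling only aggregates values, so there is nothing to learn."""
    return 0


def fc_params(n_inputs, n_outputs):
    """One weight per input plus one bias for each output node."""
    return (n_inputs + 1) * n_outputs


print(conv_params(f=3, in_channels=3, num_filters=2))  # two 3x3x3 filters
print(pool_params())
print(fc_params(n_inputs=200 * 200 * 3, n_outputs=1))  # one neuron on a 200x200x3 image
```

The last line shows why full connectivity on raw images is expensive: a single neuron already needs 120000 weights plus a bias.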
27. How image recognition works with CNN ?
• So far we have seen the different components of a CNN. Now let us see how these components work together in a CNN to identify an image of a bird.
34. Different CNN Architectures
• Various CNN architectures are available, and they have been key in building the algorithms that power, and will continue to power, AI for the foreseeable future. Some of them are listed below:
35. Summary
• In this section we learned
• Basics of CNN
• How a CNN differs from other ML algorithms
• The layers of a CNN
• How a CNN classifies/recognizes images
• Different CNN architectures
• In the next section we will learn about Recurrent Neural Networks