A convolutional neural network (CNN, or ConvNet) is a class of deep, feed-forward artificial neural networks that has Successfully been applied to analyzing visual imagery
2. CNN
• A convolutional neural network (CNN,
or ConvNet) is a class of deep, feed-
forward artificial neural networks that has
Successfully been applied to analyzing visual imagery.
• A CNN consists of an input and an output layer, as
well as multiple hidden layers. The hidden layers are
either convolutional, pooling or fully connected.
3. Convolutional Layer:
• Convolutional layers apply a convolution operation
to the input, passing the result to the next layer. The
convolution emulates the response of an individual
neuron to visual stimuli.
• Each convolutional neuron processes data only for
its receptive field. Tiling allows CNNs to
tolerate translation of the input image.
4.
5. Pooling Layer:
• Convolutional networks may include local or global
pooling layers , which combine the outputs of
neuron clusters at one layer into a single neuron in
the next layer for minimizing the risk.
6.
7. Fully Connected layer:
• Fully connected layers connect every neuron in one
layer to every neuron in another layer. It is in
principle the same as the traditional multi-layer
perceptron neural network.
8.
9. EXAMPLE
3D volumes of neurons. Convolutional Neural Networks take advantage
of the fact that the input consists of images and they constrain the
architecture in a more sensible way.
• In particular, unlike a regular Neural Network, the layers of a
ConvNet have neurons arranged in 3 dimensions: width, height,
depth.
10. EXAMPLE
• For example, the input images in CIFAR-10 are an input
volume of activations, and the volume has dimensions
32x32x3 (width, height, depth respectively). As we will
soon see, the neurons in a layer will only be connected to
a small region of the layer before it, instead of all of the
neurons in a fully-connected manner. Moreover, the final
output layer would for CIFAR-10 have dimensions
1x1x10, because by the end of the ConvNet architecture
we will reduce the full image into a single vector of class
scores, arranged along the depth dimension. Here is a
visualization: