MACHINE LEARNING – CONVOLUTIONAL NEURAL NETWORK
Basic Structure of CNN
• Input Layer: Accepts input images as pixel data.
• Convolutional Layer: Applies filters to extract features.
• ReLU Layer: Introduces non-linearity to the network.
• Pooling Layer: Reduces spatial dimensions of feature maps.
• Fully Connected Layer: Final layer for classification.
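As a concrete illustration, here is a minimal sketch of this five-stage pipeline in Python using PyTorch (the slides name no framework, and the layer sizes and 28x28 grayscale input below are hypothetical):

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),  # convolutional layer: 8 filters
    nn.ReLU(),                                  # non-linearity
    nn.MaxPool2d(2),                            # pooling: 28x28 -> 14x14
    nn.Flatten(),                               # prepare for the dense layer
    nn.Linear(8 * 14 * 14, 10),                 # fully connected classifier
)

x = torch.randn(1, 1, 28, 28)  # one fake grayscale image as pixel data
print(model(x).shape)          # torch.Size([1, 10]) -- class scores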
Convolutional Layer
• Filters/Kernels: Detect specific features in input images.
• Stride: Controls the movement of filters across the input.
• Padding: Adds pixels around the input to maintain dimensions.
• Output: Produces feature maps indicating detected features.
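A minimal NumPy sketch of how a filter slides across an input, with stride as a parameter (the 5x5 image and vertical-edge kernel are made-up examples; as in most CNN libraries, this computes cross-correlation):

import numpy as np

def conv2d(image, kernel, stride=1):
    """Valid cross-correlation of a 2-D image with a 2-D kernel."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)  # filter response here
    return out

image = np.arange(25, dtype=float).reshape(5, 5)  # toy 5x5 input
edge = np.array([[1., 0., -1.]] * 3)              # vertical-edge filter
print(conv2d(image, edge).shape)            # (3, 3) with stride 1
print(conv2d(image, edge, stride=2).shape)  # (2, 2): larger stride, smaller map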
Padding in CNN
• Zero Padding: Adds zeros around the input image to preserve dimensions.
• Valid Padding: No padding; reduces the size of output feature maps.
• Role: Helps preserve edge information during convolution.
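A small NumPy illustration of the two padding modes and their effect on output size (the 5x5 image and 3x3 filter are hypothetical):

import numpy as np

image = np.ones((5, 5))
kernel_size = 3

# Zero padding: one ring of zeros lets a 3x3 filter cover edge pixels
# and keeps the output the same size as the input ("same" padding).
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)
same_out = (padded.shape[0] - kernel_size) + 1   # 5: dimensions preserved

# Valid padding: no padding, so the output shrinks.
valid_out = (image.shape[0] - kernel_size) + 1   # 3: smaller feature map

print(padded.shape, same_out, valid_out)  # (7, 7) 5 3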
Pooling Layer
• Purpose: Reduces dimensionality and computation in the network.
• Max Pooling: Selects the maximum value from each pooling region.
• Average Pooling: Takes the average value from each pooling region.
• Impact: Retains important features while reducing overfitting.
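A NumPy sketch of both pooling modes over non-overlapping 2x2 regions (the feature-map values are made up):

import numpy as np

def pool2d(fmap, size=2, mode="max"):
    """Non-overlapping pooling over a 2-D feature map (stride == size)."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    blocks = fmap[:h*size, :w*size].reshape(h, size, w, size)
    # Reduce each size x size region to a single value.
    return blocks.max(axis=(1, 3)) if mode == "max" else blocks.mean(axis=(1, 3))

fmap = np.array([[1., 3., 2., 4.],
                 [5., 6., 7., 8.],
                 [3., 2., 1., 0.],
                 [1., 2., 3., 4.]])
print(pool2d(fmap, mode="max"))  # [[6. 8.] [3. 4.]]
print(pool2d(fmap, mode="avg"))  # [[3.75 5.25] [2.   2.  ]]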
Basic Mathematics of CNN (B&W Image)
• Convolution: Applies a filter matrix across the image to detect features.
• Example: Sliding a 3x3 filter over a grayscale image, producing a feature map (worked through in the sketch below).
• ReLU: Applies non-linearity after convolution.
• Pooling: Reduces the size of the resulting feature map.
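A worked NumPy example of these steps, assuming a toy 6x6 grayscale image with a vertical edge and a hand-picked 3x3 edge filter:

import numpy as np

# Toy grayscale image: dark left half (0), bright right half (1).
image = np.zeros((6, 6))
image[:, 3:] = 1.0
kernel = np.array([[-1., 0., 1.]] * 3)   # 3x3 vertical-edge filter

# Convolution: slide the 3x3 filter over every 3x3 patch (stride 1, valid).
fmap = np.array([[np.sum(image[i:i+3, j:j+3] * kernel)
                  for j in range(4)] for i in range(4)])
print(fmap[0])        # [0. 3. 3. 0.] -- strong response at the edge

relu = np.maximum(fmap, 0.0)   # ReLU: zero out negative responses

# 2x2 max pooling halves the 4x4 map to 2x2.
pooled = relu.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)         # [[3. 3.] [3. 3.]]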
Basic Mathematics of CNN (Colored Image)
• Convolution: Applies the filter across each RGB channel.
• Result: Produces a combined feature map from all channels.
• Example: Sliding a filter across an RGB image and summing the per-channel feature maps (see the sketch below).
• Pooling: Reduces the size of the resulting feature map while preserving important information.
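A sketch of the per-channel convolve-then-sum idea, using a random toy RGB image (the channels-first layout is an assumption):

import numpy as np

rgb = np.random.rand(3, 6, 6)            # toy RGB image, channels-first
kernel = np.array([[-1., 0., 1.]] * 3)   # same 3x3 filter for each channel

def conv_valid(ch, k):
    """Valid 3x3 convolution of one channel, stride 1."""
    return np.array([[np.sum(ch[i:i+3, j:j+3] * k)
                      for j in range(4)] for i in range(4)])

# Convolve each channel separately, then sum: one combined feature map.
combined = sum(conv_valid(rgb[c], kernel) for c in range(3))
print(combined.shape)  # (4, 4): one map, not three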
Fully Connected Layer
• Purpose: Flattens the feature maps into a single vector that feeds the fully connected layer.
• Function: Combines features for final classification.
• Uses: Softmax or sigmoid activation functions for the output.
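A minimal PyTorch sketch of flattening followed by a fully connected layer and softmax (the input sizes and 10-class output are hypothetical):

import torch
import torch.nn as nn

# Hypothetical sizes: 8 feature maps of 14x14 from the last pooling layer.
head = nn.Sequential(
    nn.Flatten(),                # (N, 8, 14, 14) -> (N, 1568)
    nn.Linear(8 * 14 * 14, 10),  # combine features into class scores
)
logits = head(torch.randn(2, 8, 14, 14))
probs = torch.softmax(logits, dim=1)  # softmax for multi-class output
print(probs.sum(dim=1))               # each row sums to 1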
LeNet-5 Architecture
• Designed for handwritten digit recognition (MNIST dataset).
• Structure: 2 convolutional layers, 2 subsampling layers, 2 fully connected layers.
• Key Feature: Simple and efficient; an early CNN model.
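A PyTorch sketch of this layout; max pooling and tanh stand in for the original trainable subsampling and activations:

import torch
import torch.nn as nn

lenet5 = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5), nn.Tanh(),   # C1: 32x32 -> 28x28
    nn.MaxPool2d(2),                             # S2: 28x28 -> 14x14
    nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(),  # C3: 14x14 -> 10x10
    nn.MaxPool2d(2),                             # S4: 10x10 -> 5x5
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120), nn.Tanh(),       # fully connected layers
    nn.Linear(120, 10),                          # 10 digit classes
)
print(lenet5(torch.randn(1, 1, 32, 32)).shape)   # torch.Size([1, 10])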
AlexNet Architecture
• Winner of the ImageNet competition in 2012.
• Structure: 5 convolutional layers, 3 fully connected layers.
• Features: Uses ReLU, dropout, and data augmentation.
• Impact: Revolutionized deep learning and computer vision.
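This structure can be inspected directly in torchvision (assumed available):

from torchvision import models

# 5 conv layers live in .features; 3 fully connected layers
# (with dropout) live in .classifier.
alexnet = models.alexnet(weights=None)  # architecture only, no download
print(alexnet)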
VGG-16 Architecture
• Uses 16 layers (13 convolutional, 3 fully connected).
• Features: Smaller filters (3x3) stacked in a deeper network.
• Strength: Achieves high accuracy with a simple, uniform structure.
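The 13-conv, all-3x3 claim can likewise be verified against torchvision's VGG-16:

import torch.nn as nn
from torchvision import models

vgg16 = models.vgg16(weights=None)  # architecture only, no download
convs = [m for m in vgg16.features if isinstance(m, nn.Conv2d)]
print(len(convs), {m.kernel_size for m in convs})  # 13 {(3, 3)}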
ResNet Architecture
• Introduces Residual Learning to combat vanishing gradients.
• Structure: Skip connections (shortcuts) between layers.
• Impact: Allows very deep networks (e.g., ResNet-50, ResNet-101).
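A minimal PyTorch sketch of a residual block; real ResNet blocks also use batch normalization, omitted here for brevity:

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: output = F(x) + x (skip connection)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))
        return self.relu(out + x)  # shortcut: gradients can bypass F(x)

block = ResidualBlock(16)
print(block(torch.randn(1, 16, 8, 8)).shape)  # torch.Size([1, 16, 8, 8])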
Inception (GoogLeNet) Architecture
• Introduces Inception modules: parallel convolutional filters.
• Structure: Multiple filter sizes (1x1, 3x3, 5x5) in parallel.
• Impact: Efficient and scalable for large-scale image recognition.
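A simplified PyTorch sketch of the parallel-branch idea; the real GoogLeNet module also uses 1x1 channel reductions and a pooling branch:

import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    """Parallel 1x1, 3x3, 5x5 branches, concatenated along channels."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, out_ch, 1)             # 1x1 branch
        self.b3 = nn.Conv2d(in_ch, out_ch, 3, padding=1)  # 3x3 branch
        self.b5 = nn.Conv2d(in_ch, out_ch, 5, padding=2)  # 5x5 branch

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)

m = InceptionModule(16, 8)
print(m(torch.randn(1, 16, 8, 8)).shape)  # torch.Size([1, 24, 8, 8])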
Transfer Learning
• Concept: Reuses a model pre-trained on one task for a new but related task.
• Benefits: Speeds up training, requires less data, and improves performance.
• Example: Using a pre-trained model like ResNet for a new image classification task.
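A common PyTorch recipe for this, assuming torchvision is available; the 5-class head is hypothetical:

import torch.nn as nn
from torchvision import models

# Load a ResNet pre-trained on ImageNet, freeze its features, and swap
# the final layer for a new 5-class task.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False          # keep pre-trained features fixed

model.fc = nn.Linear(model.fc.in_features, 5)  # new trainable head
# Only model.fc's parameters are updated during fine-tuning.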
Object Localization
• Purpose: Identifies the location of objects within an image.
• Methods: Bounding box regression, Region Proposal Networks (RPNs).
• Applications: Object detection, image segmentation.
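A minimal PyTorch sketch of the bounding-box-regression idea: one head classifies the object while a parallel head regresses 4 box coordinates (all sizes hypothetical):

import torch
import torch.nn as nn

backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
)
features = backbone(torch.randn(1, 3, 32, 32))  # (1, 4096)
head_cls = nn.Linear(features.shape[1], 10)     # what the object is
head_box = nn.Linear(features.shape[1], 4)      # where it is: (x, y, w, h)
print(head_cls(features).shape, head_box(features).shape)  # (1, 10) (1, 4)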
Landmark Detection
• Definition: Detects specific key points or landmarks within an image.
• Applications: Facial recognition, medical imaging (e.g., key anatomical points).
• Methods: CNNs are used to detect and regress the positions of landmarks.
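A PyTorch sketch of landmark regression, assuming 5 hypothetical key points, each predicted as an (x, y) pair:

import torch
import torch.nn as nn

num_landmarks = 5
net = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(8 * 16 * 16, num_landmarks * 2),  # 2 coordinates per landmark
)
coords = net(torch.randn(1, 1, 32, 32)).view(-1, num_landmarks, 2)
print(coords.shape)  # torch.Size([1, 5, 2]): (x, y) for each landmark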
Conclusion
• CNNs have revolutionized computer vision tasks.
• Architectures like LeNet, AlexNet, VGG, ResNet, and Inception paved the way for modern image processing.
• Transfer learning, object localization, and landmark detection expand the versatility of CNNs.
