3. Introduction
A convolution is a mathematical operation between two functions
that produces a third function. In image processing, it is carried out
by visiting each pixel and performing a calculation with that pixel and
its neighbours.
In a convolutional neural network, the kernel is simply a filter
that is used to extract features from the images.
A convolution can be used for a wide variety of tasks, such as
calculating derivatives, detecting edges and applying blurs. All of
this is done with a "convolution kernel".
The kernel defines the size of the convolution, the weights
applied, and an anchor point, usually positioned at the center.
December 28, 2020 3 / 20
4. Convolution filters
An image can be seen as a matrix I, where I(x, y) is the brightness
of the pixel located at coordinates (x, y).
A convolution product is computed between the matrix I and a
kernel matrix K which represents the type of filter.
K can be of size 3 × 3 or 5 × 5. The result of this product will be
the new brightness of the pixel (x, y).
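$$
I * K =
\begin{bmatrix}
I(1,1) & I(1,2) & \cdots & I(1,n) \\
\vdots & \vdots & \ddots & \vdots \\
I(m,1) & I(m,2) & \cdots & I(m,n)
\end{bmatrix}
*
\begin{bmatrix}
K(1,1) & K(1,2) & K(1,3) \\
K(2,1) & K(2,2) & K(2,3) \\
K(3,1) & K(3,2) & K(3,3)
\end{bmatrix}
$$
where
$$
(I * K)_{x,y} = \sum_{i=-1}^{1} \sum_{j=-1}^{1} I(x+i,\, y+j) \cdot K(i,j)
$$
In this sum the kernel indices i and j are offsets from the kernel
center, so K(0, 0) is the central weight.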
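The sum above can be sketched in a few lines of NumPy. This is a minimal illustration rather than an optimized implementation: the function name `apply_kernel_3x3` is hypothetical, the kernel is indexed relative to its center as in the formula, and border pixels (which lack a full neighbourhood) are left at zero, which is only one of several possible border policies.

```python
import numpy as np

def apply_kernel_3x3(image, kernel):
    """Compute sum over i, j in {-1, 0, 1} of I(x+i, y+j) * K(i, j) for interior pixels."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    out = np.zeros_like(image)  # assumption: border pixels stay at zero
    rows, cols = image.shape
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            # 3x3 window centered on (x, y); kernel[i + 1, j + 1] plays the role of K(i, j)
            window = image[x - 1:x + 2, y - 1:y + 2]
            out[x, y] = np.sum(window * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
identity = np.array([[0, 0, 0],
                     [0, 1, 0],
                     [0, 0, 0]])
result = apply_kernel_3x3(image, identity)
```

With the identity kernel, every interior pixel of `result` equals the corresponding pixel of `image`, which is a quick sanity check on the indexing.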
5. Edge Detection
Edge detection refers to the process of identifying and locating sharp
discontinuities in an image.
Edges are significant local changes in the image and are important
features for analyzing images.
An edge point is a point in an image with coordinates [i,j] at the
location of a significant local intensity change in the image.
Several edge detection methods are available in computer
vision, including:
1. Sobel Operator
2. Prewitt Operator
3. Laplacian Operator
4. Canny’s Edge Detection Algorithm
6. Continue...
Main steps in Edge Detection
Smoothing: Suppress as much noise as possible, without destroying
true edges.
Enhancement: Apply differentiation to enhance the quality of edges
(i.e., sharpening).
Threshold: Determine which edge pixels should be discarded as noise
and which should be retained (i.e., threshold edge magnitude).
Localization: Determine the exact edge location. Edge thinning and
linking are usually required in this step.
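The first three steps can be sketched as follows. This is an illustrative sketch under stated assumptions: a 3×3 averaging mask for smoothing, the Sobel masks for enhancement, and a hypothetical fixed threshold of 1.0. Localization (thinning and linking) is omitted, and all function names are invented for the example.

```python
import numpy as np

def filter3x3(image, mask):
    """Apply a 3x3 mask at every pixel; borders are handled by replicating edge pixels."""
    padded = np.pad(image, 1, mode="edge")
    out = np.empty(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # padded[i:i+3, j:j+3] is the 3x3 window centered on image[i, j]
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * mask)
    return out

def detect_edges(image, threshold=1.0):
    image = np.asarray(image, dtype=float)
    # 1. Smoothing: 3x3 local average
    smoothed = filter3x3(image, np.full((3, 3), 1.0 / 9.0))
    # 2. Enhancement: gradient magnitude from the Sobel masks
    gx = filter3x3(smoothed, np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]))
    gy = filter3x3(smoothed, np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]]))
    magnitude = np.hypot(gx, gy)
    # 3. Thresholding: keep only pixels with a strong response
    return magnitude > threshold

image = np.zeros((6, 8))
image[:, 4:] = 10.0          # vertical step edge between columns 3 and 4
edges = detect_edges(image)
```

Because smoothing widens the step, the detected edge here is several pixels thick, which is why a localization step is usually needed afterwards.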
7. Edge Detection using Derivative (Gradient)
Edge detection is essentially the operation of detecting significant
local changes in an image.
The gradient is a measure of change in a function, and an image can
be considered to be an array of samples of some continuous function
of image intensity.
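The first derivative of an image can be computed using the gradient:
$$
\nabla f = G[f(x, y)] =
\begin{bmatrix}
\dfrac{\partial f}{\partial x} \\[2mm]
\dfrac{\partial f}{\partial y}
\end{bmatrix}
$$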
There are two important properties associated with the gradient: (1)
the vector G[f(x, y)] points in the direction of the maximum rate of
increase of the function f(x, y), and (2) the magnitude of the
gradient, given by
8. Continue...
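$$
\left| G[f(x, y)] \right| = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}
$$
equals the maximum rate of increase of f(x, y) per unit distance in the
direction of G.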
From vector analysis, the direction of the gradient is defined as:
$$
\alpha(x, y) = \tan^{-1}\left(\frac{\partial f / \partial y}{\partial f / \partial x}\right)
$$
where the angle α is measured with respect to the x axis.
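For a digital image, the partial derivatives are approximated by
the quantities Gx and Gy:
$$
G_x(x, y) \approx \frac{\partial f}{\partial x}, \qquad G_y(x, y) \approx \frac{\partial f}{\partial y}
$$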
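Given values of Gx and Gy, the magnitude and direction can be computed as in the sketch below. The function name is hypothetical. `np.arctan2` is used instead of a plain inverse tangent as a design choice: it returns the angle measured from the x axis, handles Gx = 0, and resolves the quadrant.

```python
import numpy as np

def gradient_magnitude_direction(gx, gy):
    """Return (magnitude, direction in radians from the x axis) of the gradient."""
    magnitude = np.hypot(gx, gy)    # sqrt(gx**2 + gy**2)
    direction = np.arctan2(gy, gx)  # angle whose tangent is gy / gx
    return magnitude, direction

mag, ang = gradient_magnitude_direction(3.0, 4.0)
```

The function also accepts whole arrays of Gx and Gy values, in which case it returns one magnitude and one angle per pixel.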
9. Continue...
Consider the arrangement of pixels about the pixel [i, j]:
3x3 Neighbourhood :
a0 a1 a2
a7 [i, j] a3
a6 a5 a4
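The partial derivatives ∂f/∂x and ∂f/∂y can be computed by:
$$
G_x = (a_2 + c\,a_3 + a_4) - (a_0 + c\,a_7 + a_6)
$$
$$
G_y = (a_0 + c\,a_1 + a_2) - (a_6 + c\,a_5 + a_4)
$$
where c is a constant that weights the pixels nearest the center.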
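The two formulas can be evaluated directly from the eight neighbours. The sketch below assumes the image is a NumPy array `f` indexed as `f[row, column]` with a0 at the top-left of the neighbourhood; the function name and the test image are hypothetical.

```python
import numpy as np

def neighbourhood_gradient(f, i, j, c):
    """Gx and Gy at pixel [i, j] from its eight neighbours a0..a7."""
    a0, a1, a2 = f[i - 1, j - 1], f[i - 1, j], f[i - 1, j + 1]
    a7, a3 = f[i, j - 1], f[i, j + 1]
    a6, a5, a4 = f[i + 1, j - 1], f[i + 1, j], f[i + 1, j + 1]
    gx = (a2 + c * a3 + a4) - (a0 + c * a7 + a6)  # right column minus left column
    gy = (a0 + c * a1 + a2) - (a6 + c * a5 + a4)  # top row minus bottom row
    return gx, gy

# Vertical edge: dark on the left, bright in the right-hand column
f = np.array([[0, 0, 1],
              [0, 0, 1],
              [0, 0, 1]], dtype=float)
gx, gy = neighbourhood_gradient(f, 1, 1, c=2)
```

For this vertical edge only Gx responds, since the top and bottom rows are identical.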
10. Sobel Operator
The Sobel filter is used for edge detection. It works by calculating
the gradient of image intensity at each pixel within the image. It
finds the direction of the largest increase in intensity (from dark to
light) and the rate of change in that direction.
One way to avoid having the gradient calculated about an interpolated
point between pixels is to use a 3 × 3 neighborhood for the gradient
calculations.
The Sobel operator is the magnitude of the gradient computed by:
$$
M = \sqrt{G_x^2 + G_y^2}
$$
where the partial derivatives are computed by
$$
G_x = (a_2 + c\,a_3 + a_4) - (a_0 + c\,a_7 + a_6)
$$
$$
G_y = (a_0 + c\,a_1 + a_2) - (a_6 + c\,a_5 + a_4)
$$
11. Continue...
With the constant c = 2.
Gx and Gy can be implemented using convolution masks:
Gx :
-1 0 1
-2 0 2
-1 0 1
Gy :
1 2 1
0 0 0
-1 -2 -1
Note that this operator places an emphasis on pixels that are closer
to the center of the mask.
The Sobel operator is one of the most commonly used edge
detectors.
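As a rough sketch, the two masks can be applied to a small test image and combined into the magnitude M. The function name and the step-edge test image are assumptions made for illustration, and border pixels are simply left at zero.

```python
import numpy as np

SOBEL_GX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_GY = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=float)

def sobel_magnitude(image):
    """Gradient magnitude M = sqrt(Gx^2 + Gy^2) at interior pixels."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    magnitude = np.zeros((rows, cols))  # assumption: borders stay at zero
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            window = image[i - 1:i + 2, j - 1:j + 2]
            gx = np.sum(window * SOBEL_GX)
            gy = np.sum(window * SOBEL_GY)
            magnitude[i, j] = np.sqrt(gx ** 2 + gy ** 2)
    return magnitude

image = np.zeros((5, 6))
image[:, 3:] = 1.0           # vertical step edge between columns 2 and 3
m = sobel_magnitude(image)
```

The response is nonzero only in the two columns on either side of the step and zero in the flat regions.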
12. Prewitt operator
The Prewitt operator calculates the gradient of the image intensity at
each point, giving the direction of the largest possible increase in
intensity (from dark to light) and the rate of change in that direction.
The Prewitt operator uses the same equations as the Sobel operator,
except that the constant c = 1. Therefore:
Gx :
-1 0 1
-1 0 1
-1 0 1
Gy :
1 1 1
0 0 0
-1 -1 -1
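Since the two operators differ only in the constant c, both pairs of masks can be generated by one helper. This is a small sketch; the helper name `gradient_masks` and the test window are hypothetical.

```python
import numpy as np

def gradient_masks(c):
    """Return the (Gx, Gy) masks for a given center weight c."""
    gx = np.array([[-1, 0, 1], [-c, 0, c], [-1, 0, 1]], dtype=float)
    gy = np.array([[1, c, 1], [0, 0, 0], [-1, -c, -1]], dtype=float)
    return gx, gy

prewitt_gx, prewitt_gy = gradient_masks(1)  # Prewitt: c = 1
sobel_gx, sobel_gy = gradient_masks(2)      # Sobel: c = 2

# Response of each Gx mask to the same vertical step edge
window = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [0, 0, 1]], dtype=float)
prewitt_response = np.sum(window * prewitt_gx)
sobel_response = np.sum(window * sobel_gx)
```

On this window the Prewitt response is 3 and the Sobel response is 4, reflecting the extra weight Sobel gives to the pixels nearest the center.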
13. Laplacian Operator
The Laplacian is the two-dimensional equivalent of the second
derivative. The formula for the Laplacian of a function f(x, y) is
$$
\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}
$$
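The simplest gradient approximation is
$$
G_x \cong f[i, j+1] - f[i, j]
$$
$$
G_y \cong f[i, j] - f[i+1, j]
$$
The second derivatives are obtained by differentiating once more:
$$
\frac{\partial^2 f}{\partial x^2} = \frac{\partial G_x}{\partial x}, \qquad
\frac{\partial^2 f}{\partial y^2} = \frac{\partial G_y}{\partial y}
$$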
14. Continue...
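Applying the same difference to Gx and shifting the result so that it
is centered on [i, j] gives
$$
\frac{\partial^2 f}{\partial x^2} = f[i, j+1] - 2f[i, j] + f[i, j-1]
$$
which is the desired approximation to the second partial derivative
centered about [i, j]. Similarly,
$$
\frac{\partial^2 f}{\partial y^2} = f[i+1, j] - 2f[i, j] + f[i-1, j]
$$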
By combining these two equations into a single operator, the
following mask can be used to approximate the Laplacian:
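Laplacian:
$$
\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}
\approx f[i+1, j] + f[i-1, j] + f[i, j+1] + f[i, j-1] - 4f[i, j]
$$
$$
\nabla^2 \approx
\begin{bmatrix}
0 & 1 & 0 \\
1 & -4 & 1 \\
0 & 1 & 0
\end{bmatrix}
$$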
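The combined approximation can be checked numerically. In the sketch below the function name and the two test images (a linear ramp and a step edge) are assumptions chosen for illustration; border pixels are left at zero.

```python
import numpy as np

def laplacian(image):
    """f[i+1,j] + f[i-1,j] + f[i,j+1] + f[i,j-1] - 4 f[i,j] at interior pixels."""
    f = np.asarray(image, dtype=float)
    out = np.zeros_like(f)  # assumption: borders stay at zero
    # Slicing form of applying the 3x3 Laplacian mask to every interior pixel
    out[1:-1, 1:-1] = (f[2:, 1:-1] + f[:-2, 1:-1]
                       + f[1:-1, 2:] + f[1:-1, :-2]
                       - 4.0 * f[1:-1, 1:-1])
    return out

ramp = np.tile(np.arange(6, dtype=float), (5, 1))  # intensity grows linearly with j
step = np.zeros((5, 6))
step[:, 3:] = 1.0                                   # step edge between columns 2 and 3

lap_ramp = laplacian(ramp)
lap_step = laplacian(step)
```

The ramp gives zero everywhere, since a second derivative ignores constant slopes, while the step gives +1 on the dark side and -1 on the bright side, so the edge lies at the sign change.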
15. Linear Filter
Linear smoothing filters are good at removing Gaussian noise and, in
most cases, other types of noise as well.
The same pattern of weights is used in each window, which means that
the linear filter is spatially invariant and can be implemented using a
convolution mask.
The simplest linear filter is implemented by a local averaging
operation where the value of each pixel is replaced by the average of
all the values in the local neighborhood:
$$
h[i, j] = \frac{1}{M} \sum_{(k,l) \in N} f[k, l]
$$
where M is the total number of pixels in the neighborhood N.
For a 3 × 3 neighborhood (M = 9), the corresponding mask is
$$
\frac{1}{9} \times
\begin{bmatrix}
1 & 1 & 1 \\
1 & 1 & 1 \\
1 & 1 & 1
\end{bmatrix}
$$
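A minimal sketch of the local averaging operation for a 3×3 neighbourhood follows. The function name is hypothetical, and border pixels are copied unchanged, which is an assumption made here to keep the example short.

```python
import numpy as np

def mean_filter_3x3(image):
    """Replace each interior pixel by the average of its 3x3 neighbourhood (M = 9)."""
    f = np.asarray(image, dtype=float)
    out = f.copy()  # assumption: border pixels are kept as they are
    for i in range(1, f.shape[0] - 1):
        for j in range(1, f.shape[1] - 1):
            out[i, j] = f[i - 1:i + 2, j - 1:j + 2].sum() / 9.0
    return out

noisy = np.zeros((5, 5))
noisy[2, 2] = 9.0            # a single bright "noise" pixel
smoothed = mean_filter_3x3(noisy)
```

The isolated spike of 9 is spread into a 3×3 patch of ones: its peak is reduced while the total intensity is preserved.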
16. Gaussian Kernel
In image processing filters are mainly used to suppress either the
high frequencies in the image, i.e. smoothing the image, or the low
frequencies, i.e. enhancing or detecting edges in the image.
An image can be filtered either in the frequency or in the spatial
domain.
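In the frequency domain, filtering multiplies the transform of the
image by the transfer function of the filter. The corresponding
process in the spatial domain is to convolve the input image f(i, j)
with the filter function h(i, j). This can be written as:
$$
g(i, j) = h(i, j) * f(i, j)
$$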
The discrete convolution can be defined as a ‘shift and multiply’
operation, where we shift the kernel over the image and multiply its
value with the corresponding pixel values of the image. For a square
kernel with size M× M, we can calculate the output image with the
following formula:
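A standard way to write this sum, assuming M is odd and the kernel h is indexed by offsets from its center, is:

```latex
g(i, j) = \sum_{m=-(M-1)/2}^{(M-1)/2} \; \sum_{n=-(M-1)/2}^{(M-1)/2} h(m, n)\, f(i - m,\, j - n)
```

For kernels that are symmetric about their center, such as the averaging and Gaussian kernels, this gives the same result as the sum over f(i + m, j + n).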
17. Calculating Gaussian Convolution Kernels
The Gaussian filter uses the Gaussian function in the kernel of the
filter:
$$
G(x, y) = \frac{1}{2\pi\sigma^2} \, e^{-\frac{x^2 + y^2}{2\sigma^2}}
$$
This is a weighted filter that gives more importance to the
central pixels. The parameter σ controls how quickly the weights fall
off away from the center: a small σ concentrates the weight on the
center, while a large σ spreads it more evenly.
In the formula,
x, y = the pixel coordinates within the image.
e = Euler's number, a mathematical constant approximately equal to
2.718.
σ = the standard deviation of the Gaussian, a weighting factor
specified by the user.
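The formula translates directly into code. The sketch below uses a hypothetical function name and two arbitrary σ values (0.5 and 5.5) to illustrate how σ changes the weight of a neighbouring pixel relative to the center.

```python
import math

def gaussian(x, y, sigma):
    """Value of the 2D Gaussian at offset (x, y) for standard deviation sigma."""
    return math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)

# Weight of a direct neighbour relative to the center pixel
ratio_small_sigma = gaussian(1, 0, 0.5) / gaussian(0, 0, 0.5)
ratio_large_sigma = gaussian(1, 0, 5.5) / gaussian(0, 0, 5.5)
```

With the small σ the neighbour receives only a small fraction of the center weight, while with the large σ it receives almost the same weight, so the filter behaves more like a plain average.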
18. Continue...
When calculating the kernel elements, the coordinate values
expressed by x and y should reflect the distance in pixels from the
middle pixel.
In order to gain a better grasp on the Gaussian kernel formula we
can implement the formula in steps.
For a 3×3 kernel with a weighting value of σ = 5.5, the calculation
starts with one term per kernel element, as follows:
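$$
\begin{bmatrix}
\frac{1}{2\pi(5.5)^2} e^{-\frac{1^2 + 1^2}{2(5.5)^2}} &
\frac{1}{2\pi(5.5)^2} e^{-\frac{0^2 + 1^2}{2(5.5)^2}} &
\frac{1}{2\pi(5.5)^2} e^{-\frac{1^2 + 1^2}{2(5.5)^2}} \\[2mm]
\frac{1}{2\pi(5.5)^2} e^{-\frac{1^2 + 0^2}{2(5.5)^2}} &
\frac{1}{2\pi(5.5)^2} e^{-\frac{0^2 + 0^2}{2(5.5)^2}} &
\frac{1}{2\pi(5.5)^2} e^{-\frac{1^2 + 0^2}{2(5.5)^2}} \\[2mm]
\frac{1}{2\pi(5.5)^2} e^{-\frac{1^2 + 1^2}{2(5.5)^2}} &
\frac{1}{2\pi(5.5)^2} e^{-\frac{0^2 + 1^2}{2(5.5)^2}} &
\frac{1}{2\pi(5.5)^2} e^{-\frac{1^2 + 1^2}{2(5.5)^2}}
\end{bmatrix}
$$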
19. Continue...
The calculated values of each kernel element:
0.00509 0.00517 0.00509
0.00517 0.00526 0.00517
0.00509 0.00517 0.00509
An important requirement at this point is that the elements of the
kernel must sum to one.
At this point the sum of the kernel elements is 0.046322.
The kernel values should therefore be updated by multiplying each
element by one divided by the current kernel sum.
k =
0.1098 0.1117 0.1098
0.1117 0.1135 0.1117
0.1098 0.1117 0.1098
We have now calculated a 3×3 Gaussian blur kernel matrix
with a weight value (σ) of 5.5.
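The whole calculation can be reproduced with a short NumPy sketch. The function name `gaussian_kernel` is hypothetical, and x and y are taken as offsets from the middle pixel, as described above. Small differences in the last digit relative to the tables are to be expected from rounding.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Return (raw, normalized) Gaussian kernels of shape size x size (size odd)."""
    half = size // 2
    offsets = np.arange(-half, half + 1)   # distances from the middle pixel
    x, y = np.meshgrid(offsets, offsets)
    raw = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2)
    normalized = raw / raw.sum()           # scale so that the elements sum to one
    return raw, normalized

raw, kernel = gaussian_kernel(3, 5.5)
print(raw)          # unnormalized kernel elements
print(raw.sum())    # kernel sum before normalization
print(kernel)       # normalized 3x3 Gaussian blur kernel
```

Because σ = 5.5 is large compared with the 3×3 window, the normalized weights are nearly equal, so this particular kernel behaves almost like a plain averaging filter.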