This document contains answers to multiple questions about image processing concepts. For question 22a, the kernel formed by the outer product of vectors v and wT is determined to be separable. For question 22b, it is explained that a separable kernel w can be decomposed into two simpler kernels w1 and w2 such that w = w1 * w2. This allows the convolution to be computed more efficiently in two steps by first convolving w1 with the image and then convolving the result with w2, requiring fewer operations than a direct convolution with w.
DIP Homework: 4
Q- 1 Give a single intensity transformation function for spreading the intensities of an image
so the lowest intensity is 0 and the highest intensity is L-1?
Answer: Let f denote the original image. First subtract the minimum value of f, denoted fmin, from
f to yield a function whose minimum value is 0:
$g_1 = f - f_{\min}$
Next divide g1 by its maximum value to yield a function in the range [0,1] and multiply the result
by L −1 to yield a function with values in the range [0, L −1]:
$g = \frac{L - 1}{\max(g_1)}\, g_1 = \frac{L - 1}{\max(f - f_{\min})}\, (f - f_{\min})$
Keep in mind that fmin is a scalar and f is an image.
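The transformation above can be sketched in a few lines of Python. This is only an illustration, not part of the original solution: the function name stretch_intensities, the list-of-rows image representation, and the default of L = 256 levels are assumptions made for the example.

```python
def stretch_intensities(image, levels=256):
    """Linearly map the intensities of `image` onto [0, levels - 1].

    `image` is a list of rows (an assumed representation); `levels`
    plays the role of L in the formula above.
    """
    f_min = min(min(row) for row in image)            # scalar f_min
    g1 = [[p - f_min for p in row] for row in image]  # g1 = f - f_min
    g1_max = max(max(row) for row in g1)
    if g1_max == 0:
        # A constant image has nothing to stretch; the formula would divide by 0.
        raise ValueError("image is constant, so max(g1) = 0")
    return [[(levels - 1) * p / g1_max for p in row] for row in g1]


f = [[50, 100], [150, 200]]  # arbitrary test image
g = stretch_intensities(f)
print(g)
```

The guard for a constant image is a design choice of this sketch; the formula itself is undefined in that case.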
Q- 4 Do the following?
a- Propose a method for extracting the bit planes of an image based on converting the
values of its pixels to binary.
Answer: Converting decimal numbers to binary automatically creates the bit planes. For
example, the values of the pixels in an 8-bit image range from 0 to 255. If you converted
those numbers to binary, each value would be converted to 8 binary bits. The result would
thus be a cube of size M×N×8. But this is exactly the set of bit planes of the image. In other
words, you can look at the conversion to binary of an n-bit image as being represented by a
stack of n binary planes, descending from most significant to least significant. The trick in
implementing this approach is being able to extract the individual planes. The approach would
vary depending on the computer language used. For instance, in MATLAB we would use function
dec2bin to convert an image with decimal values to a binary representation. After that, some
additional manipulation of the resulting array would be needed to extract the individual bit
planes.
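As a rough illustration of the extraction step in a language other than MATLAB, the following Python sketch pulls out each plane with a shift and a mask instead of building binary strings. The function names, the list-of-rows image format, and the small test image are assumptions made for this example; note that the planes are indexed here from the LSB upward.

```python
def bit_planes(image, n_bits):
    """Return n_bits binary planes; planes[0] is the LSB plane."""
    return [[[(p >> k) & 1 for p in row] for row in image]
            for k in range(n_bits)]


def from_bit_planes(planes):
    """Rebuild the image by weighting plane k with 2**k."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [[sum(planes[k][i][j] << k for k in range(len(planes)))
             for j in range(cols)] for i in range(rows)]


image = [[5, 12], [3, 10]]   # arbitrary 4-bit test image
planes = bit_planes(image, 4)
for k, plane in enumerate(planes):
    print("bit", k, plane)
```

Rebuilding the image with from_bit_planes is a convenient check that no plane was lost or misordered.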
b- Find all the bit planes of the following 4-bit image.
0 1 8 6
2 2 1 1
1 15 14 10
Answer: Here we need to convert each number to binary, and to form the planes by looking
at the least significant bit (LSB), the next significant bit, and so on, as follows:
LSB plane    Bit 1 plane    Bit 2 plane    MSB plane
0 1 0 0      0 0 0 1        0 0 0 1        0 0 1 0
0 0 1 1      1 1 0 0        0 0 0 0        0 0 0 0
1 1 0 0      0 1 1 0        0 1 1 1        0 1 1 1
1 0 1 0      1 1 0 1        0 1 0 0        0 0 1 0
where the rightmost bit plane contains the most significant bit (MSB). For example, the 4th element
in the first row of each plane corresponds to the sequence 0 1 1 0 from the LSB to the MSB. Read
from the MSB down, this is 0110, the binary representation of decimal number 6, which is the value
of the pixel at that location in the given image.
Q- 10 Two images, f (x, y) and g (x, y), have unnormalized histograms hf and hg. Give the
conditions (on the values of the pixels in f and g) under which you can determine the
histograms of the images formed as follows:
a- f (x, y) + g (x, y)
b- f (x, y) - g (x, y)
c- f (x, y) * g (x, y)
d- f (x, y) ÷ g (x, y)
Answer: The purpose of this simple problem is to think of the meaning of histograms and
arrive at the conclusion that histograms carry no information about spatial properties of
images. Thus, the only time that the histogram of the images formed by the operations
shown in the problem statement can be determined in terms of the original histograms is
when one or both of the images is (are) constant. In (d) we have the additional requirement
that none of the pixels of g (x, y) can be 0. Assume for convenience that the histograms are
not normalized, so that, for example, hf (rk) is the number of pixels in f (x, y) having gray
level rk. Assume also that all the pixels in g (x, y) have constant value c. The pixels of both
images are assumed to be positive. Finally, let uk denote the gray levels of the pixels of the
images formed by any of the arithmetic operations given in the problem statement. Under
the preceding set of conditions, the histograms are determined as follows:
a- The histogram hsum (uk) of the sum is obtained by letting uk = rk + c, and hsum (uk) = hf
(rk) ∀k. In other words, the values (height) of the components of hsum are the same as
the components of hf , but their locations on the gray axis are shifted right by an amount
c.
b- Similarly, the histogram hdiff (uk) of the difference has the same components as hf but
their locations are moved left by an amount c as a result of the subtraction operation.
c- Following the same reasoning, the values (heights) of the components of histogram
hprod (uk) of the product are the same as hf, but their locations are at uk = c * rk. Note
that while the spacing between components of the resulting histograms in (a) and (b)
was not affected, the spacing between components of hprod (uk) will be spread out by a
factor of c.
d- Finally, assuming that c ≠ 0, the components of hdiv (uk) are the same as those of hf, but
their locations will be at uk = rk / c. Thus, the spacing between components of hdiv (uk)
will be compressed by a factor of 1/c. The preceding solutions are applicable if
image f (x, y) also is constant. In this case the four histograms just discussed would
each have only one component. Their locations would be affected as described in (a)
through (d).
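The four cases can be checked numerically with a small Python sketch. It is an illustration under assumed values (a 2 × 3 test image and c = 4), and the helper names histogram and relocate are hypothetical. Each predicted histogram is built from hf alone by moving every component from rk to uk, and is then compared with the histogram computed directly from the result image.

```python
from collections import Counter


def histogram(image):
    """Unnormalized histogram: intensity -> number of pixels."""
    return Counter(p for row in image for p in row)


def relocate(h_f, op):
    """Predict the histogram of op(f) from h_f alone by moving each
    component from r_k to u_k = op(r_k); the heights are unchanged."""
    h = Counter()
    for r_k, count in h_f.items():
        h[op(r_k)] += count
    return h


f = [[1, 2, 2], [3, 3, 3]]   # arbitrary positive test image
c = 4                        # constant value of every pixel of g
h_f = histogram(f)

operations = {
    "sum": lambda r: r + c,
    "diff": lambda r: r - c,
    "prod": lambda r: r * c,
    "div": lambda r: r / c,
}
results = {}
for name, op in operations.items():
    direct = histogram([[op(p) for p in row] for row in f])
    results[name] = (direct == relocate(h_f, op))
print(results)
```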
Q- 12 An image with intensities in the range [0,1] has the PDF, pr (r), shown in the following
figure. It is desired to transform the intensity levels of this image so that they will have the
specified pz (z) shown in the figure. Assume continuous quantities, and find the
transformation (expressed in terms of r and z) that will accomplish this.
Answer: The first step is to obtain the histogram equalization transformation:
$s = T(r) = \int_0^r p_r(w)\,dw = \int_0^r (-2w + 2)\,dw = -r^2 + 2r.$
Next we find
$v = G(z) = \int_0^z p_z(w)\,dw = \int_0^z 2w\,dw = z^2.$
Finally,
$z = G^{-1}(v) = \pm\sqrt{v}.$
But only positive intensity levels are allowed, so $z = \sqrt{v}$. Then, we replace v with s, which in turn
is $-r^2 + 2r$, and we have
$z = \sqrt{-r^2 + 2r}.$
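A quick numerical sanity check of this result can be written in Python. The sketch assumes the two densities that appear in the integrals above, pr (r) = −2r + 2 and pz (z) = 2z on [0, 1]; the function names and the 101-point grid are choices made for this example. It verifies that G (z) equals T (r) when z is computed from r with the derived transformation.

```python
import math


def T(r):
    """CDF of p_r(r) = -2r + 2 on [0, 1]."""
    return -r * r + 2 * r


def G(z):
    """CDF of p_z(z) = 2z on [0, 1]."""
    return z * z


def r_to_z(r):
    """Derived mapping z = G^{-1}(T(r)) = sqrt(-r^2 + 2r)."""
    return math.sqrt(T(r))


grid = [k / 100 for k in range(101)]
# If the mapping is right, G(z) and T(r) agree for every r on the grid.
max_error = max(abs(G(r_to_z(r)) - T(r)) for r in grid)
print(max_error)
```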
Q- 18 You are given the following kernel and image:
a- Give a sketch of the area encircled by the large ellipse in Fig. 3.28 when the kernel is
centered at point (2,3) (2nd row, 3rd col) of the image shown above. Show specific
values of w and f.
Answer:
The correlation consists of moving the center of a kernel over an image, and computing the sum
of products at each location. The mechanics of spatial convolution are the same, except that the
correlation kernel is rotated by 180°. Thus, when the values of a kernel are symmetric about its
center, correlation and convolution yield the same result.
b- Compute the convolution w*f using the minimum zero padding needed. Show the
details of your computations when the kernel is centered on point (2,3) of f; and then
show the final full convolution result.
Answer:
Filter kernel, w (s, t), showing the kernel coefficients:
w (-1,-1) w (-1,0) w (-1,1)
w (0,-1) w (0,0) w (0,1)
w (1,-1) w (1,0) w (1,1)
Pixel values under the kernel when it is centered on (x, y):
f (x-1, y-1) f (x-1, y) f (x-1, y+1)
f (x, y-1) f (x, y) f (x, y+1)
f (x+1, y-1) f (x+1, y) f (x+1, y+1)
Figure: filter kernel coefficients and the corresponding pixels in the image.
The figure illustrates the mechanics of linear spatial filtering using a 3 × 3 kernel. At any point (x, y) in the
image, the response, g (x, y), of the filter is the sum of products of the kernel coefficients
and the image pixels encompassed by the kernel:
$g(x, y) = w(-1,-1) f(x-1, y-1) + w(-1,0) f(x-1, y) + \cdots + w(0,0) f(x, y) + \cdots + w(1,1) f(x+1, y+1)$
As coordinates x and y are varied, the center of the kernel moves from pixel to pixel,
generating the filtered image, g, in the process.
Observe that the center coefficient of the kernel, w (0, 0), aligns with the pixel at location
(x, y). For a kernel of size m × n, we assume that m = 2a + 1 and n = 2b + 1, where a and
b are nonnegative integers. This means that our focus is on kernels of odd size in both
coordinate directions. In general, linear spatial filtering of an image of size M × N with a
kernel of size m × n is given by the expression
$g(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t)\, f(x + s, y + t)$
where x and y are varied so that the center (origin) of the kernel visits every pixel in f once.
For a fixed value of (x, y), this expression implements the sum of products shown above, but for
a kernel of arbitrary odd size.
Figure: 1-D correlation and convolution of a kernel w = 1 2 1 with a function f consisting of 0s
and a single 1. For convolution, w is rotated by 180° before the starting position alignment.
Correlation result: 0 1 2 1 0. Convolution result: 0 1 2 1 0.
The first correlation value is the sum of products in the initial position, computed with x = 0:
$g(0) = \sum_{s=-1}^{1} w(s)\, f(s + 0) = 0$
c- Repeat (b), but for correlation, w ☆ f.
Answer:
To obtain the second value of correlation, we shift the relative positions of w and f one pixel
location to the right and compute the sum of products again. The result is g (1) = 1, as
shown in the leftmost nonzero location of the correlation result. When x = 2, we obtain
g (2) = 2. When x = 3, we get g (3) = 1.
Summarizing the preceding discussion in equation form, the correlation of a kernel w of size m ×
n with an image f (x, y), denoted as (w ☆ f) (x, y), is given by the linear spatial filtering
expression, which we repeat here for convenience:
$(w ☆ f)(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t)\, f(x + s, y + t)$
Because our kernels do not depend on (x, y), we will sometimes make this fact explicit by writing
the left side of the preceding equation as w ☆ f (x, y).
Figure: original image with the kernel origin marked, padded f, initial position for w, and
correlation result.
Similarly, the convolution of a kernel w of size m × n with an image f (x, y), denoted as
(w * f) (x, y), is defined as
$(w * f)(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t)\, f(x - s, y - t)$
where the minus signs align the coordinates of f and w when one of the functions is rotated by
180°. This equation implements the sum of products process to which we refer throughout the
book as linear spatial filtering. That is, linear spatial filtering and spatial convolution are
synonymous.
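The relationship between the two operations can be made concrete with a short Python sketch. It is a minimal illustration, not the solution to part (b) or (c): the function names are hypothetical, the output is kept the same size as f (zero padding is implied by skipping out-of-range pixels), and the kernel and the impulse image are made-up test data. Convolution is obtained by correlating with the kernel rotated by 180°.

```python
def correlate2d(w, f):
    """Same-size correlation of kernel w (odd dimensions) with image f.
    Pixels outside f are treated as 0, which is equivalent to zero padding."""
    a, b = len(w) // 2, len(w[0]) // 2
    M, N = len(f), len(f[0])
    g = [[0] * N for _ in range(M)]
    for x in range(M):
        for y in range(N):
            total = 0
            for s in range(-a, a + 1):
                for t in range(-b, b + 1):
                    i, j = x + s, y + t
                    if 0 <= i < M and 0 <= j < N:
                        total += w[s + a][t + b] * f[i][j]
            g[x][y] = total
    return g


def rotate180(w):
    """Rotate a kernel by 180 degrees (flip rows and columns)."""
    return [row[::-1] for row in w[::-1]]


def convolve2d(w, f):
    """Convolution is correlation with the kernel rotated by 180 degrees."""
    return correlate2d(rotate180(w), f)


w = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # deliberately non-symmetric kernel
f = [[0] * 5 for _ in range(5)]
f[2][2] = 1                             # discrete unit impulse at the center
corr = correlate2d(w, f)
conv = convolve2d(w, f)
for row in conv:
    print(row)
```

With an impulse as input, convolution places a copy of the kernel at the impulse, while correlation places a copy rotated by 180°. For a kernel that is symmetric about its center the two outputs coincide.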
Q- 22 Answer the following:
a- If v = [1 2 1]T and wT = [2 1 1 3], is the kernel formed by vwT separable?
b- The following kernel is separable. Find w1 and w2 such that w = w1 * w2.
Answer:
a- If v = [1 2 1]T and wT = [2 1 1 3], then
vwT =
2 1 1 3
4 2 2 6
2 1 1 3
Yes, this kernel is separable because it can be expressed as the outer product
of two vectors. A separable kernel of size m × n can be expressed as the outer product
of two vectors, v and w:
$w = \mathbf{v}\mathbf{w}^T$
where v and w are vectors of size m × 1 and n × 1, respectively. For a square kernel
of size m × m, we can write:
$w = \mathbf{v}\mathbf{v}^T$
It turns out that the product of a column vector and a row vector is the same as the
2-D convolution of the vectors.
Note: Typically, if the kernel size is m × n, we need only (m + n) multiplications and
(m + n − 2) additions per pixel instead of mn multiplications and mn − 1 additions for a
non-separable 2-D filter. Often the term “MAP” is preferred (multiplications and
accumulations per pixel): there are (m + n) MAP for a separable filter instead of mn
MAP for a non-separable filter.
b- The importance of separable kernels lies in the computational advantages that result
from the associative property of convolution. If we have a kernel w that can be
decomposed into two simpler kernels, such that w = w1 * w2, then it follows from the
commutative and associative properties of convolution that
$w * f = (w_1 * w_2) * f = (w_2 * w_1) * f = w_2 * (w_1 * f) = (w_1 * f) * w_2$
For an image of size M × N and a kernel of size m × n, direct convolution requires on the
order of MNmn multiplications and additions. With a separable kernel, the first
convolution, w1 * f, requires on the order of MNm multiplications and
additions because w1 is of size m × 1. The result is of size M × N, so the convolution
of w2 with the result requires MNn such operations, for a total of MN (m + n)
multiplication and addition operations. Thus, the computational advantage of
performing convolution with a separable, as opposed to a non-separable, kernel is
defined as:
$C = \frac{MNmn}{MN(m + n)} = \frac{mn}{m + n}$
The objective is to find two 1-D kernels, w1 and w2, in order to implement 1-D
convolution. Let E be the value of any nonzero element of the kernel, and let c and r be the
column and row of the kernel that contain that element. In terms of the preceding notation,
w1 = c = v and w2 = r/E = wT. For circularly symmetric kernels, the column through the center
of the kernel describes the entire kernel; that is, w = vvT/c, where the scalar c is the value
of the center coefficient. Then, the 1-D components are w1 = v and w2 = vT/c.
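The decomposition steps can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function names outer, separate, and advantage are hypothetical, kernels are lists of rows with integer entries, and exact fractions are used so the equality test is not affected by rounding. The kernel from part (a) serves as test data.

```python
from fractions import Fraction


def outer(v, w):
    """Outer product v w^T of two vectors given as lists."""
    return [[vi * wj for wj in w] for vi in v]


def separate(kernel):
    """Return (w1, w2) with outer(w1, w2) == kernel, or None if the kernel
    is not separable. Entries are assumed to be integers."""
    for row in kernel:
        for j, e in enumerate(row):
            if e != 0:                                    # E: any nonzero element
                c = [r[j] for r in kernel]                # column through E
                r_over_e = [Fraction(x, e) for x in row]  # row through E, divided by E
                return (c, r_over_e) if outer(c, r_over_e) == kernel else None
    return None  # an all-zero kernel is treated as not separable in this sketch


def advantage(m, n):
    """Computational advantage C = mn / (m + n) of a separable m x n kernel."""
    return m * n / (m + n)


kernel = outer([1, 2, 1], [2, 1, 1, 3])
w1, w2 = separate(kernel)
print(kernel)
print(w1, w2)
print(advantage(11, 11))
```

The pair returned for a separable kernel is not unique: any nonzero element E gives a valid decomposition, and the two vectors differ only by a scale factor that is shared between them.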
Q- 26 The two images shown in the following figure are quite different, but their histograms
are the same. Suppose that each image is blurred using a 3 × 3 box kernel.
a- Would the histograms of the blurred images still be equal? Explain.
b- If your answer is no, either sketch the two histograms or give two tables detailing the
histogram components.
In the figure, the left image has white intensities on the left and black intensities on the
right, and the right image is a chessboard. You can assume all the black regions have an
intensity of 0, and the white regions have an intensity of 255.
Answer:
a- No. The number of boundary points between the black and white regions is much larger in
the image on the right. When the images are blurred, the boundary points will give rise
to a larger number of different values for the image on the right, so the histograms of
the two blurred images will be different.
b- To handle the border effects, we surround the image with a border of 0s. We assume
that the image is of size N×N (the fact that the image is square is evident from the right
image in the problem statement). Blurring is implemented by a 3×3 mask whose
coefficients are 1/9. You can write code to calculate and plot the histograms for both
figures. Please note that different choices of N would lead to different answers for the
right figure. A larger N would result in a larger number of 0/255 values in the output
histogram, while a smaller N would result in a smaller number of extreme values. But
you should make sure that the sum of the histogram bins equals N×N.
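One possible version of such code is sketched below in Python. Several details are assumptions made only for this example and are not given in the problem: N = 8, chessboard squares of 2 × 2 pixels (the hypothetical parameter cell), and rounding of the blurred values to integers before counting. Pixels outside the image are treated as 0, which matches the border of 0s described above.

```python
from collections import Counter


def half_image(n):
    """Left half white (255), right half black (0)."""
    return [[255 if col < n // 2 else 0 for col in range(n)] for _ in range(n)]


def chessboard(n, cell):
    """Chessboard with squares of cell x cell pixels (cell is an assumption)."""
    return [[255 if ((row // cell) + (col // cell)) % 2 == 0 else 0
             for col in range(n)] for row in range(n)]


def box_blur(image):
    """3 x 3 box blur (coefficients 1/9) with a border of 0s."""
    rows, cols = len(image), len(image[0])
    out = []
    for x in range(rows):
        out_row = []
        for y in range(cols):
            total = 0
            for s in (-1, 0, 1):
                for t in (-1, 0, 1):
                    i, j = x + s, y + t
                    if 0 <= i < rows and 0 <= j < cols:
                        total += image[i][j]
            out_row.append(round(total / 9))  # rounded so bins are integers
        out.append(out_row)
    return out


def histogram(image):
    return Counter(p for row in image for p in row)


N = 8
left, right = half_image(N), chessboard(N, 2)
h_left, h_right = histogram(box_blur(left)), histogram(box_blur(right))
print(sorted(h_left.items()))
print(sorted(h_right.items()))
```

Printing the sorted items gives the table of histogram components asked for in part (b) for this particular choice of N and square size.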
Q- 38 In a given application, a smoothing kernel is applied to input image to reduce noise,
then a Laplacian kernel is applied to enhance fine details. Would the result be the same if
the order of these operations is reversed?
Answer: A kernel is a mask that is applied to an image for blurring, sharpening, edge detection,
etc. A smoothing kernel is used to remove high spatial frequency noise from the input image.
The result would be the same if the order of these operations were reversed, because
averaging and the Laplacian are both linear operations implemented as convolutions, and
convolution is commutative and associative. The Laplacian is a linear operator because
derivatives of any order are linear operations and the Laplacian is a sum of second derivatives.
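The claim can be checked on a small example with the following Python sketch. It rests on assumptions chosen for the illustration: the function name conv2d_full is hypothetical, the box kernel is left unnormalized (all 1s) so that the arithmetic stays in exact integers, the Laplacian kernel is the common 4-neighbor form, and full convolution is used so that no intermediate values are discarded at the borders.

```python
def conv2d_full(a, b):
    """Full 2-D convolution of two arrays given as lists of rows."""
    rows = len(a) + len(b) - 1
    cols = len(a[0]) + len(b[0]) - 1
    out = [[0] * cols for _ in range(rows)]
    for i, row_a in enumerate(a):
        for j, val_a in enumerate(row_a):
            for k, row_b in enumerate(b):
                for l, val_b in enumerate(row_b):
                    out[i + k][j + l] += val_a * val_b
    return out


box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]          # smoothing, without the 1/9 factor
laplacian = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]   # 4-neighbor Laplacian
f = [[3, 7, 1, 4], [9, 2, 8, 5], [6, 0, 4, 2]]   # arbitrary test image

smooth_then_sharpen = conv2d_full(laplacian, conv2d_full(box, f))
sharpen_then_smooth = conv2d_full(box, conv2d_full(laplacian, f))
print(smooth_then_sharpen == sharpen_then_smooth)
```

If each intermediate result is cropped back to the image size, values near the border can depend on the padding used at each step, so the comparison is cleanest with full convolution.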