557 480-486

Motion Estimation Algorithms in Video Super-Resolution
Anand Deshpande1, Prashant P. Patavardhan2 and D. H. Rao3
1 Research Scholar, Dept. of E&C Engg., Gogte Institute of Technology, Belgaum / Associate Professor, Dept. of E&C, Angadi Institute of Technology and Management, Belgaum, India
E-mail: deshpande.anandb@gmail.com
2 Professor, Dept. of E&C Engg., Gogte Institute of Technology, Belgaum, India
E-mail: prashantgemini73@gmail.com
3 Dean, Faculty of Engineering / Professor, Dept. of PG Studies, Visvesvaraya Technological University, Belgaum, India
E-mail: dr.raodh@gmail.com
Abstract— Super-resolution (SR) is the process of obtaining a high resolution (HR) image or
a sequence of HR images from a set of low resolution (LR) observations. Block matching
algorithms are used for motion estimation to obtain the motion vectors between
frames in super-resolution. This paper discusses the implementation and comparison of two
block matching algorithms, Exhaustive Search (ES) and Spiral Search (SS). The
advantages of each algorithm are given in terms of motion estimation
computational complexity and Peak Signal to Noise Ratio (PSNR). The Spiral Search
algorithm achieves a PSNR close to that of Exhaustive Search in less computation time.
The algorithms evaluated in this paper are widely used
in video super-resolution and have also been used in implementing video standards
such as H.263, MPEG-4 and H.264.
Index Terms— Super-resolution, motion estimation, block matching, exhaustive search,
spiral search, PSNR.
I. INTRODUCTION
Optical sensors created a new era of imaging in which optical images could be efficiently captured by
sensors and stored as digital information. The resolution of the captured image depends on the size and
number of these sensors. Improving the sensor itself is not always a feasible way to increase
resolution. For example, spatial resolution [1] can be increased by reducing the pixel size through
sensor manufacturing techniques. As the pixel size decreases, however, the amount of light available to each pixel also
decreases, and the resulting shot noise seriously degrades the image quality. Another approach to enhancing
the spatial resolution is to increase the chip size, which leads to an increase in capacitance. This approach is
not considered effective because a large capacitance [2] makes it difficult to speed up the charge
transfer rate. To address this issue, the image processing community is developing a collection of algorithms
known as super-resolution for generating high-resolution (HR) imagery from systems having lower-resolution
(LR) imaging sensors. These algorithms combine a collection of low-resolution images containing
aliasing artifacts and restore a high-resolution image. The original image can be reconstructed by
choosing a magnification factor, L, for the desired HR image, where L = HR image resolution / LR image
resolution. The value of the magnification factor depends on the number of non-redundant LR images that
are available. The observation model that relates the original HR image to the observed LR images is
shown in Fig. 1.
DOI: 02.ITC.2014.5.557
© Association of Computer Electronics and Electrical Engineers, 2014
Proc. of Int. Conf. on Recent Trends in Information, Telecommunication and Computing, ITC
Figure 1. Super-resolution observation model
Here, X denotes the continuous scene, and $X_s$ is the desired HR image sampled above the Nyquist rate from
the band-limited continuous scene. The output $Y_k$ is the kth observed LR image from the image sensor. The
observation model is represented as:
$Y_k = D B_k M_k X_s + N_k, \quad k = 1, 2, 3, \ldots$ (1)
where D is a down-sampling operator, $B_k$ contains the blur for the kth LR image, $M_k$ contains the motion
information that transforms the kth LR image onto the HR image grid, and $N_k$ is the noise in the kth LR image.
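The observation model in equation (1) can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the motion operator is taken to be an integer translation, the blur is a 3x3 mean filter standing in for the real point spread function, and the function names (`shift`, `box_blur`, `downsample`, `observe`) are hypothetical.

```python
import random

def shift(img, dx, dy):
    """M_k (assumed): move the image content by (dx, dy) pixels, replicating edges."""
    h, w = len(img), len(img[0])
    return [[img[min(max(y - dy, 0), h - 1)][min(max(x - dx, 0), w - 1)]
             for x in range(w)] for y in range(h)]

def box_blur(img):
    """B_k (assumed): 3x3 mean filter with edge replication."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0.0
            for j in (-1, 0, 1):
                for i in (-1, 0, 1):
                    acc += img[min(max(y + j, 0), h - 1)][min(max(x + i, 0), w - 1)]
            row.append(acc / 9.0)
        out.append(row)
    return out

def downsample(img, factor):
    """D: keep every `factor`-th pixel in each direction."""
    return [row[::factor] for row in img[::factor]]

def observe(hr, dx, dy, factor, noise_sigma, rng):
    """One LR observation: Y_k = D B_k M_k X_s + N_k with Gaussian noise N_k."""
    lr = downsample(box_blur(shift(hr, dx, dy)), factor)
    return [[v + rng.gauss(0.0, noise_sigma) for v in row] for row in lr]

# An 8x8 synthetic HR image observed with a one-pixel shift and factor L = 2.
hr = [[float((x + y) % 256) for x in range(8)] for y in range(8)]
lr = observe(hr, dx=1, dy=0, factor=2, noise_sigma=0.0, rng=random.Random(0))
```

Calling `observe` several times with different shifts yields the set of LR observations that a super-resolution algorithm would take as input.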
Motion estimation plays a major role in super-resolution. It estimates the shift of each LR image
relative to the reference LR image. Motion estimation algorithms commonly use the block matching method because it
provides a flexible trade-off between complexity and motion vector quality [3]. The subjective quality of the
HR image suffers from artifacts that are generated during the fusion process by
erroneous motion vectors (MVs), and with
erroneous MVs, SR may give worse results [4][5][6]. Therefore, although accurate
motion vectors are necessary to increase the spatial resolution, it is even more critical to detect invalid
motion vectors in order to prevent artifacts in the HR image. Most motion estimation algorithms [4]
are too complex to be used in practical applications. Many applications require a real-time approach, whereas
most algorithms take from several minutes to several hours to estimate the motion between two images. A
low-complexity approach is required to address this problem. Such an approach requires a block-based
motion estimation algorithm and low-complexity priors, and the algorithm must converge in a
small number of iterations. While a block-based motion estimation algorithm reduces the computational
complexity, it also presents its own set of challenges.
This paper discusses the exhaustive search and spiral search motion estimation algorithms along with
simulation results. Section II explains block matching in general. Section III describes the proposed system,
including ES and SS. Section IV presents simulation results and discussion. Section V gives concluding remarks, followed by
references.
II. BLOCK MATCHING MOTION ESTIMATION
Block-matching algorithms are a very popular approach for estimating the motion between frames in
an image sequence. Block matching relies on the translation-motion model and the brightness-constancy
assumption to estimate the motion of blocks between image pairs. The actual motion can only be
approximated as a translation for small displacements, and the brightness-constancy assumption does not
hold for illumination changes due to non-uniform lighting, shadows, etc. Block matching is also sensitive to
block size. Large blocks are needed to avoid local minima; however, large blocks produce poorer matches
than small blocks. Even with the limitations of the translation-motion model and the brightness-constancy
assumption, block-matching algorithms perform well in terms of matching blocks.
Block matching algorithms make use of the brightness constancy assumption, which assumes that image
pixels retain their luminance values over a spatio-temporal displacement path, i.e.
$I(x, y, t) = I(x + \Delta x, y + \Delta y, t + \Delta t)$ (2)
where $I(x, y, t)$ is a continuous representation of the pixel luminance; $\Delta x$ and $\Delta y$ represent the spatial shift;
and $\Delta t$ represents the temporal shift. The brightness constancy assumption is violated when the illumination
of the scene changes between successive images; however, it is generally valid for small spatio-temporal
displacements [7]. To make use of the brightness constancy assumption, block-matching algorithms divide
the image into square regions generally referred to as blocks. To reduce complexity, the image is usually
divided into blocks of fixed size or variable size [8][9][10].

With the image divided into macro-blocks (MBs) and blocks of predetermined size, the task of the block
matching algorithm is to locate the block in the adjacent image that best matches the block in the reference
image, and to create a vector that represents the movement of the block from one location to the other. The adjacent
image may fall before (backward block matching) or after (forward block matching) the reference image. The
search area for a good macro-block match is constrained to p pixels on all four sides of the corresponding
macro-block in the previous frame. This p is called the search parameter. Larger motions require a larger p, and
the larger the search parameter, the more computationally expensive the process of motion estimation
becomes. Usually the macro-block is taken as a square of side 16 pixels, and the search parameter p is 7
pixels.
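To give a feel for the cost implied by these typical values, the following worked arithmetic counts the pixel comparisons needed when every candidate position is evaluated. It assumes a QCIF luminance frame of 176 x 144 pixels and ignores the clipping of the search window at the frame border, so the per-frame figure is an upper bound.

```latex
\begin{align*}
\text{candidate positions per macro-block} &= (2p + 1)^2 = (2 \cdot 7 + 1)^2 = 225 \\
\text{pixel differences per candidate} &= 16 \times 16 = 256 \\
\text{pixel differences per macro-block} &= 225 \times 256 = 57{,}600 \\
\text{macro-blocks per QCIF frame} &= \tfrac{176}{16} \times \tfrac{144}{16} = 11 \times 9 = 99 \\
\text{pixel differences per frame} &\le 99 \times 57{,}600 = 5{,}702{,}400
\end{align*}
```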
Correlation-based measures, called “cost functions”, are used to find the best match [11]. Of the
various cost functions, the most popular and least computationally expensive is the Sum of Absolute
Difference (SAD) or Sum of Absolute Error (SAE), given by equation (3). Another cost function is the Mean
Squared Error (MSE), given by equation (4).
Sum of Absolute Error:
$\mathrm{SAE} = \sum_{i=1}^{M} \sum_{j=1}^{N} |C_{ij} - R_{ij}|$ (3)
Mean Squared Error:
$\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} (C_{ij} - R_{ij})^2$ (4)
where M x N is the size of the macro-block, and $C_{ij}$ and $R_{ij}$ are the pixels being compared in the current
macro-block and the reference macro-block, respectively. The displacement of the candidate block that minimizes the SAE
becomes the motion vector for the current block.
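As a minimal sketch of the two cost functions, assuming blocks are stored as lists of rows of equal size (the function names `sae` and `mse` are illustrative, not taken from the paper):

```python
def sae(cur_block, ref_block):
    """Sum of absolute errors between two equally sized blocks (equation 3)."""
    return sum(abs(c - r)
               for cur_row, ref_row in zip(cur_block, ref_block)
               for c, r in zip(cur_row, ref_row))

def mse(cur_block, ref_block):
    """Mean squared error between two equally sized M x N blocks (equation 4)."""
    m, n = len(cur_block), len(cur_block[0])
    total = sum((c - r) ** 2
                for cur_row, ref_row in zip(cur_block, ref_block)
                for c, r in zip(cur_row, ref_row))
    return total / (m * n)

# A 2x2 example: the pixel differences are 1, 0, -1 and -4.
cur_block = [[10, 12], [14, 16]]
ref_block = [[9, 12], [15, 20]]
example_sae = sae(cur_block, ref_block)   # 1 + 0 + 1 + 4 = 6
example_mse = mse(cur_block, ref_block)   # (1 + 0 + 1 + 16) / 4 = 4.5
```

The SAE needs no multiplications, which is why it is the cheaper of the two to evaluate at every candidate position.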
To maximize the probability of choosing the correct MV with the SAE metric, it is required to consider the
following:
1. Choice of p for the search parameter range.
2. Block size.
3. Initializing the search.
To evaluate and compare the systems, the quality of the video images displayed to
the viewer must be measured. Visual quality measurement is a difficult and imprecise task because it is
subjective and is influenced by many factors. The
complexity and cost of subjective quality measurement make it attractive to measure quality
automatically using an algorithm. Objective (algorithmic) quality measures give quantitative values.
The most widely used measure is Peak Signal to Noise Ratio (PSNR).
$\mathrm{PSNR} = 10 \log_{10} \frac{(2^n - 1)^2}{\mathrm{MSE}}$ (5)
where $(2^n - 1)^2$ is the square of the highest possible signal value in the image, and n is the number of bits per
image sample.
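Equation (5) translates directly into code. The sketch below assumes the MSE has already been computed and treats a zero MSE (identical images) as infinite PSNR, a convention the paper does not state; the function name `psnr` is illustrative.

```python
import math

def psnr(mse_value, bits=8):
    """PSNR in dB for n-bit samples, following equation (5)."""
    if mse_value == 0:
        return math.inf  # identical images (assumed convention)
    peak_squared = (2 ** bits - 1) ** 2
    return 10.0 * math.log10(peak_squared / mse_value)

# For 8-bit samples the peak value is 255, so an MSE of 255**2 gives 0 dB.
zero_db = psnr(255.0 ** 2)
```

Because the scale is logarithmic, halving the MSE raises the PSNR by about 3 dB.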
The block matching framework discussed above motivates algorithms that provide good PSNR at acceptable cost. Two
motion estimation algorithms, Exhaustive Search and Spiral Search, have been implemented for super-resolution
and are discussed in the next section.
III. PROPOSED SYSTEM
The proposed system contains the following blocks.
A) Sampling: The continuous input frame is sampled above the Nyquist rate.
B) Motion estimation: This block contains two search algorithms. They are Exhaustive search and Spiral
search algorithms.
A. Exhaustive Search
This algorithm, also known as Full Search [12], is the most computationally expensive block-matching
algorithm of all. It calculates the cost function at each possible location in the search window.
As a result, it finds the best possible match and gives the highest PSNR of any block-matching
algorithm. Fast block matching algorithms try to achieve the same PSNR with as little
computation as possible. Full search [12] motion estimation involves evaluating equation (3) (SAD) at each
point in the search window. The first search location is at the top-left of the window, and the search proceeds
in raster order until all positions have been evaluated. Full search is guaranteed to find the
minimum SAD in the search window, but it is computationally intensive since the error measure must be
calculated at every one of $(2S + 1)^2$ locations, where S is the maximum displacement searched in each direction.
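The raster-order scan described above can be sketched as follows. This is an illustrative implementation under stated assumptions: frames are lists of rows, candidates that fall partly outside the frame are skipped, and ties keep the first minimum found. The names `block_sad` and `full_search` are hypothetical.

```python
def block_sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy)."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))

def full_search(cur, ref, bx, by, n, s):
    """Evaluate every displacement in [-s, s] x [-s, s] in raster order and
    return (best motion vector, its SAD)."""
    h, w = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-s, s + 1):          # top row of the window first
        for dx in range(-s, s + 1):      # left to right within a row
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + n > w or y + n > h:
                continue                 # candidate leaves the frame (assumed rule)
            cost = block_sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost

# Synthetic test: `cur` equals `ref` displaced by 2 pixels in x and 1 pixel in y.
ref = [[16 * y + x for x in range(12)] for y in range(12)]
cur = [[ref[min(y + 1, 11)][min(x + 2, 11)] for x in range(12)] for y in range(12)]
mv, cost = full_search(cur, ref, bx=4, by=4, n=4, s=3)
```

On this synthetic pair the search recovers the displacement (2, 1) with a SAD of 0 after visiting all 49 candidates of the 7 x 7 window.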
B. Spiral Search
Most image sequences have smooth motion and high spatial correlation (the tendency of pixels
that are near each other to have similar values). It is therefore quite likely that the motion
vector of a block is close to the motion vectors of its neighbors. Hence, the search window center can be
predicted using the motion vectors of the predictor blocks. The motion vectors of three neighboring
macro-blocks (one to the left, one above and one above right) are used as predictors for the motion vector of the
current macro-block. The prediction [11] is formed by taking the median of the three motion vectors. The
prediction error between the actual motion vector and the predicted value in the horizontal and
vertical directions is coded as shown in Fig. 2.
        MV2    MV3
MV1     MV
MV: current motion vector; MV1, MV2, MV3: predictors
Prediction = median(MV1, MV2, MV3)
Figure 2. Prediction of motion vectors
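The median prediction of Fig. 2 can be sketched in a few lines. The sketch assumes the median is taken separately for the horizontal and vertical components, which matches the per-direction prediction error described in the text but is not spelled out there; the function names are illustrative.

```python
def median3(a, b, c):
    """Median of three scalars."""
    return sorted((a, b, c))[1]

def predict_mv(mv1, mv2, mv3):
    """Component-wise median of the left (mv1), above (mv2) and
    above-right (mv3) motion vectors, each given as (dx, dy)."""
    return (median3(mv1[0], mv2[0], mv3[0]),
            median3(mv1[1], mv2[1], mv3[1]))

def prediction_error(actual_mv, predicted_mv):
    """Horizontal and vertical difference between actual and predicted MV."""
    return (actual_mv[0] - predicted_mv[0], actual_mv[1] - predicted_mv[1])

predicted = predict_mv((2, -1), (4, 0), (3, 5))   # medians of (2, 4, 3) and (-1, 0, 5)
error = prediction_error((3, 1), predicted)
```

Here the predicted vector is (3, 0) and the prediction error for an actual vector of (3, 1) is (0, 1).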
Special cases are needed for MBs whose predictors lie outside the picture boundary or the
group of blocks (GOB) boundary. These special cases are shown in Fig. 3.
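Case 1 (left predictor outside the boundary): MV1 = (0, 0); MV2 and MV3 unchanged
Case 2 (both upper predictors outside the boundary): MV2 = MV1 and MV3 = MV1
Case 3 (above-right predictor outside the boundary): MV3 = (0, 0); MV1 and MV2 unchanged
Boundary shown: picture boundary or GOB boundary
Figure 3. Special cases of motion vector prediction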
Whenever one of the predictor MBs lies outside the picture boundary, its motion vector is replaced by (0, 0); however,
when two predictor MBs lie outside, their motion vectors are replaced by the motion vector of the third MB. This avoids
having two of the motion vectors replaced by zeros, in which case the median
operation would always return (0, 0). The spiral search algorithm [13] uses the motion vectors of the predictor blocks to
obtain a predicted search window center. It then uses the SAD values of these predictor blocks to achieve a
variable window size. The SAD is computed starting at the center of the search window and moving outward
in a spiral. The process stops once the SAD falls below a threshold value. This threshold is the
parameter that controls the size of the window, and it is set to
the median of the SAD values of the predictor blocks. The median operation helps to
suppress the effect of the SAD value of any uncorrelated block in the neighborhood.
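The border rule and the early-terminating spiral scan described above can be sketched as follows. This is a hedged illustration rather than the implementation of [13]: the exact spiral path (ring by ring around the centre), the skipping of candidates outside the frame, the use of `None` for a predictor outside the boundary, and the fallback when all three predictors are missing are assumptions, and all function names are hypothetical.

```python
def resolve_predictors(mv1, mv2, mv3):
    """Apply the border rule; None marks a predictor outside the boundary."""
    mvs = [mv1, mv2, mv3]
    missing = sum(m is None for m in mvs)
    if missing == 1:
        return [(0, 0) if m is None else m for m in mvs]
    if missing == 2:
        only = next(m for m in mvs if m is not None)
        return [only, only, only]
    if missing == 3:
        return [(0, 0)] * 3   # not covered by the text; assumed fallback
    return mvs

def spiral_offsets(s):
    """Yield (dx, dy) offsets from the centre outward, one square ring at a time."""
    yield (0, 0)
    for r in range(1, s + 1):
        x, y = -r, -r
        for step_x, step_y in ((1, 0), (0, 1), (-1, 0), (0, -1)):
            for _ in range(2 * r):
                yield (x, y)
                x, y = x + step_x, y + step_y

def spiral_search(cur, ref, bx, by, n, s, centre, threshold):
    """Scan outward from the predicted centre and stop as soon as a SAD falls
    below `threshold`. Returns (best MV, its SAD, candidates evaluated)."""
    h, w = len(ref), len(ref[0])
    best_mv, best_cost, evaluated = None, None, 0
    for ox, oy in spiral_offsets(s):
        dx, dy = centre[0] + ox, centre[1] + oy
        x, y = bx + dx, by + dy
        if x < 0 or y < 0 or x + n > w or y + n > h:
            continue
        cost = sum(abs(cur[by + j][bx + i] - ref[y + j][x + i])
                   for j in range(n) for i in range(n))
        evaluated += 1
        if best_cost is None or cost < best_cost:
            best_mv, best_cost = (dx, dy), cost
        if cost < threshold:
            break
    return best_mv, best_cost, evaluated

# Same synthetic frames as a (2, 1) displacement; predicted centre is (1, 1)
# and the threshold stands in for the median SAD of the predictor blocks.
ref = [[16 * y + x for x in range(12)] for y in range(12)]
cur = [[ref[min(y + 1, 11)][min(x + 2, 11)] for x in range(12)] for y in range(12)]
result = spiral_search(cur, ref, bx=4, by=4, n=4, s=3, centre=(1, 1), threshold=10)
```

With a predicted centre one pixel away from the true displacement, the scan stops after 5 candidates instead of the 49 a full scan of the same window would evaluate.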
C. Blur
Blur is a natural property of all image acquisition devices, caused by the imperfections of their optical
systems. It can also be caused by factors such as motion blur or atmospheric blur. Lens blur can be modeled by
convolving the image with a mask corresponding to the optical system's Point Spread Function. A Gaussian
blur model is used in this work: the image is convolved with a two-dimensional Gaussian of size G x G and
standard deviation σ. Since blurring is applied to the image in vector form, the convolution is replaced by a matrix
multiplication.
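A minimal sketch of the Gaussian blur model follows. It applies the kernel by direct convolution rather than the matrix form mentioned above, and it assumes an odd kernel size, a kernel normalized to sum to 1, and edge replication at the image border; none of these details are given in the paper, and the function names are illustrative.

```python
import math

def gaussian_kernel(g, sigma):
    """Normalized g x g Gaussian mask (g assumed odd)."""
    c = (g - 1) / 2.0
    k = [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2.0 * sigma ** 2))
          for x in range(g)] for y in range(g)]
    total = sum(sum(row) for row in k)
    return [[v / total for v in row] for row in k]

def gaussian_blur(img, kernel):
    """Convolve `img` with a symmetric kernel, replicating edge pixels."""
    g = len(kernel)
    r = g // 2
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(g):
                for i in range(g):
                    yy = min(max(y + j - r, 0), h - 1)
                    xx = min(max(x + i - r, 0), w - 1)
                    acc += kernel[j][i] * img[yy][xx]
            out[y][x] = acc
    return out

kernel = gaussian_kernel(5, 1.0)
impulse = [[0.0] * 7 for _ in range(7)]
impulse[3][3] = 1.0
spread = gaussian_blur(impulse, kernel)   # the impulse becomes a copy of the kernel
```

Blurring a single bright pixel reproduces the kernel itself, which is a quick way to inspect the assumed point spread function.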
IV. RESULT
The Exhaustive Search and Spiral Search algorithms have been implemented and evaluated on
different test videos. The performance is evaluated on two counts: PSNR and computational time. The ES
and SS algorithms are implemented in C using Visual Studio on an Intel Core i3 machine. The
motion estimation algorithms have been tested on three Quarter Common Intermediate Format (QCIF)
resolution test video sequences, as shown in Table 1.
TABLE NO. I. VIDEO SEQUENCES USED FOR PERFORMANCE ANALYSIS: (A) CARPHONE (B) FOREMAN (C) CLAIRE
| S. No. | Video filename | No. of frames | Details |
|---|---|---|---|
| (a) | Carphone | 380 | Moderate motion in background, and no motion in camera |
| (b) | Foreman | 400 | Motion in camera and background |
| (c) | Claire | 490 | No motion in background as well as camera |
The motion vectors and the reference frame are sent to the motion compensation block, where a new frame is
generated using the MVs and the reference frame. This frame is compared with the current frame to obtain the PSNR.
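The evaluation step just described can be sketched as follows, assuming one motion vector per n x n block, edge replication for vectors that point outside the reference frame, and 8-bit samples. The function names `motion_compensate` and `frame_psnr` are illustrative and not taken from the paper.

```python
import math

def motion_compensate(ref, mvs, n):
    """Build the predicted frame: each n x n block is copied from `ref` at the
    position displaced by that block's motion vector (dx, dy)."""
    h, w = len(ref), len(ref[0])
    pred = [[0] * w for _ in range(h)]
    for by in range(0, h, n):
        for bx in range(0, w, n):
            dx, dy = mvs[by // n][bx // n]
            for j in range(min(n, h - by)):
                for i in range(min(n, w - bx)):
                    y = min(max(by + j + dy, 0), h - 1)   # clamp to the frame
                    x = min(max(bx + i + dx, 0), w - 1)
                    pred[by + j][bx + i] = ref[y][x]
    return pred

def frame_psnr(cur, pred, bits=8):
    """PSNR in dB between the current frame and the motion-compensated frame."""
    h, w = len(cur), len(cur[0])
    err = sum((cur[y][x] - pred[y][x]) ** 2 for y in range(h) for x in range(w))
    if err == 0:
        return math.inf
    return 10.0 * math.log10((2 ** bits - 1) ** 2 / (err / (h * w)))

# An 8x8 frame split into four 4x4 blocks, all displaced by one pixel in x.
ref = [[10 * y + x for x in range(8)] for y in range(8)]
mvs = [[(1, 0), (1, 0)], [(1, 0), (1, 0)]]
pred = motion_compensate(ref, mvs, n=4)
quality = frame_psnr(ref, pred)
```

Averaging `frame_psnr` over all frames of a sequence would give a per-sequence figure of the kind reported in the results table.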
The objective performance of the ES and SS algorithms is shown in Table 2.
TABLE NO. II. PERFORMANCE ANALYSIS OF EXHAUSTIVE SEARCH AND SPIRAL SEARCH ALGORITHMS
| S. No. | Filename | Exhaustive Search PSNR (dB) | Exhaustive Search CPU time (seconds) | Spiral Search PSNR (dB) | Spiral Search CPU time (seconds) |
|---|---|---|---|---|---|
| (a) | Carphone | 43.9599 | 55 | 43.9485 | 48 |
| (b) | Foreman | 43.1578 | 67 | 43.1523 | 60 |
| (c) | Claire | 45.8718 | 62 | 45.8656 | 43 |
From the table it can be seen that the full search algorithm takes more execution time than spiral search and
achieves a slightly higher PSNR (in dB) than spiral search. The PSNR of the Claire video sequence is higher than that of the other
video sequences because it contains less motion.
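The size of this trade-off can be read directly from the reported figures. The short script below only restates the PSNR and CPU-time values from the performance table and derives the PSNR difference and the relative time saving of spiral search; the variable names are illustrative.

```python
# (ES PSNR dB, ES CPU seconds, SS PSNR dB, SS CPU seconds) from the results table
results = {
    "Carphone": (43.9599, 55, 43.9485, 48),
    "Foreman": (43.1578, 67, 43.1523, 60),
    "Claire": (45.8718, 62, 45.8656, 43),
}

summary = {}
for name, (psnr_es, time_es, psnr_ss, time_ss) in results.items():
    psnr_loss = psnr_es - psnr_ss                        # dB given up by SS
    time_saving = 100.0 * (time_es - time_ss) / time_es  # percent of ES time saved
    summary[name] = (psnr_loss, time_saving)
    print(f"{name}: PSNR loss {psnr_loss:.4f} dB, time saving {time_saving:.1f}%")
```

For these three sequences spiral search gives up between 0.0055 dB and 0.0114 dB of PSNR while saving roughly 10.4% (Foreman), 12.7% (Carphone) and 30.6% (Claire) of the exhaustive search time.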
The PSNR of the ES and SS versus the frame number are plotted as shown in Fig. 4.
It can be seen that the Claire video shows better PSNR than the Carphone and Foreman videos for
both search algorithms. The spiral search algorithm gives nearly the same PSNR as exhaustive search with a shorter
execution time. The super-resolution model using the spiral search algorithm for motion estimation therefore offers a
better trade-off between quality and computation than the exhaustive search algorithm.

Figure 4. PSNR Comparison of search algorithm for input video (a) Carphone (b) Foreman (c) Claire
V. CONCLUSION
The motion estimation algorithms Exhaustive Search and Spiral Search for video super-resolution are
implemented and tested on different video test sequences. The performance of the algorithms is evaluated
in terms of PSNR and execution time. From the test results it can be seen that spiral search takes less execution time
than full search and achieves an average PSNR very close to that of full search. Both algorithms provide better
PSNR for the video with no background motion (Claire) than for the video with background motion but no camera motion
(Carphone). It can be concluded that the spiral search algorithm offers a good cost/performance trade-off for
motion estimation in super-resolution.
[Figure 4 plots: panels Carphone, Foreman and Claire; x-axis: Number of Frames; y-axis: SNR; series: Full Search and Spiral Search]
REFERENCES
[1] Subhasis Chaudhuri, “Super Resolution Imaging,” Kluwer Academic Publishers, pp. 1-44, 2002.
[2] Sung Cheol Park, Min Kyu Park, and Moon Gi Kang, “Super-Resolution Image Reconstruction: A Technical Overview,” IEEE Signal Processing Magazine, May 2003.
[3] Michael Santoro, “Valid Motion estimation for super-resolution image reconstruction,” Ph.D. dissertation, School of Electrical and Computer Engineering, Georgia Institute of Technology, USA, 2012.
[4] G. Callico, S. Lopez, O. Sosa, J. Lopez, and R. Sarmiento, “Analysis of fast block matching motion estimation algorithms for video super-resolution systems,” IEEE Transactions on Consumer Electronics, vol. 54, pp. 1430–1438, Aug. 2008.
[5] S. C. Park, M. K. Park, and M. G. Kang, “Super-resolution image reconstruction: a technical overview,” IEEE Signal Processing Magazine, vol. 20, pp. 21–36, May 2003.
[6] P. Hill, T. Chiew, D. Bull, and C. Canagarajah, “Interpolation free subpixel accuracy motion estimation,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 16, pp. 1519–1526, Dec. 2006.
[7] M. Chan, Y. Yu, and A. Constantinides, “Variable size block matching motion compensation with applications to video coding,” IEEE Proceedings on Communications, Speech and Vision, vol. 137, pp. 205–212, Aug. 1990.
[8] CCITT, “Codec for audiovisual services at n x 384 kbits/s,” Fascicle III.5, Rec. H.261, 1988.
[9] Z. Ahmed, A. Hussain and D. Al-Jumeily, “Fast Computations of Full Search Block Matching Motion Estimation (FCFS),” Proceedings of PGNeT Conference, 2011.
[10] M. Ahmadi and M. Azadfar, “Implementation of fast motion estimation algorithms & comparison with full search method in H.264,” IJCSNS International Journal of Computer Science & Network Security, vol. 8, no. 3, pp. 139-143, 2008.
[11] Tsuhan Chen, Deepak Turaga and Mohamed Alkanhal, “Correlation Based Search Algorithms for Motion Estimation,” Picture Coding Symposium, Portland, April 1999.
[12] Aroh Barjatya, “Block Matching Algorithms for Motion Estimation,” Technical report, Dept. of ECE, Utah State University, April 2004.
