
IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 2, FEBRUARY 2010 399

Selective Data Pruning-Based Compression Using
High-Order Edge-Directed Interpolation
Dung T. Võ, Member, IEEE, Joel Solé, Member, IEEE, Peng Yin, Member, IEEE, Cristina Gomila, Member, IEEE,
and Truong Q. Nguyen, Fellow, IEEE

Abstract: This paper proposes a selective data pruning-based compression scheme to improve the rate-distortion relation of compressed images and video sequences. The original frames are pruned to a smaller size before compression. After decoding, they are interpolated back to their original size by an edge-directed interpolation method. The data pruning phase is optimized to obtain the minimal distortion in the interpolation phase. Furthermore, a novel high-order interpolation is proposed to adapt the interpolation to several edge directions in the current frame. This high-order filtering uses more surrounding pixels in the frame than the fourth-order edge-directed method and it is more robust. The algorithm is also considered for multiframe-based interpolation by using spatio-temporally surrounding pixels coming from the previous frame. Simulation results are shown for both image interpolation and coding applications to validate the effectiveness of the proposed methods.

Index Terms: Data pruning, edge-directed interpolation, spatial-temporal interpolation, video compression.

Manuscript received January 13, 2009; revised September 17, 2009. First published November 03, 2009; current version published January 15, 2010. This work was done while D. T. Võ was with Thomson Corporate Research and University of California at San Diego. This work was supported in part by Texas Instruments, Inc. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Hsueh-Ming Hang.
D. T. Võ is with the Digital Media Solutions Lab, Samsung Information Systems America, Irvine, CA 92612 USA (e-mail: dung.vo@samsung.com).
J. Solé, P. Yin, and C. Gomila are with the Thomson Corporate Research, Princeton, NJ 08540 USA (e-mail: joel.sole@thomson.net; peng.yin@thomson.net; cristina.gomila@thomson.net).
T. Q. Nguyen is with the Department of Electrical and Computer Engineering, University of California at San Diego, La Jolla, CA 92093-0407 USA (e-mail: nguyent@ece.ucsd.edu).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIP.2009.2035845
1057-7149/$26.00 © 2010 IEEE

I. INTRODUCTION

Nowadays, the demand for higher quality video is growing very fast. Video tends toward higher resolution, higher frame-rate and higher bit-depth. New technologies to further reduce bit-rate are strongly demanded to combat the bit-rate increase of this high definition video, especially to meet the network and communication transmission constraints. In video coding, there are two main directions to reduce compression bit-rate. One direction is to improve the compression technology and the other one is to perform a preprocessing step that improves the subsequent compression.

The first direction can be viewed from the development of the MPEG video coding standard, from MPEG-1 to H.264/MPEG-4 AVC. For most video coding standards, increasing quantization step size is used to reduce bit-rate [1]. However, this technique can result in blocking and other coding artifacts due to the loss of high frequency details. In the second direction, common techniques are low-pass filtering or downsampling (which can be seen as a filtering process) followed by reconstructing or upsampling at the decoder. For example, low-pass filters were adaptively used based on the Human Visual System to eliminate high frequency information in [2] or to simplify the contextual information in [3]. Also, to reduce the bit-rate, some digital television systems uniformly downsized the original sequence and upsized it after decoding. These methods contain a pruning phase to reduce the amount of data to compress and a reconstructing phase to recover the dropped data. The reconstructed video applying these techniques looked blurred because they were designed to eliminate high-frequency information with the low-pass filter in the preprocessing step or with the anti-aliasing filter before downsizing.

This paper proposes a novel data pruning-based compression scheme to reduce the bit-rate while still keeping a high quality reconstructed frame. The original frames are first optimally pruned to a smaller size by adaptively dropping rows or columns prior to encoding. At the final stage, an interpolation phase is implemented to reconstruct the decoded frames to their original size. By avoiding filtering the remaining rows and columns, the reconstructed frames can still achieve high quality from a lower bit-rate.

Main applications of interpolation are upsampling, demosaicking and displaying for different video formats. For resolution enhancement, the interpolation is implemented to overcome the limitation of low resolution imaging. A wide range of interpolation methods has been discussed, starting from conventional bilinear and bicubic interpolations to sophisticated iterative methods such as projection onto convex sets (POCS) [4] and nonconvex nonlinear partial differential equations [5]. To avoid the jaggedness artifacts occurring along edges, edge-oriented interpolation methods were performed using Markov random field [6] or the low resolution (LR) image covariance [7]. Furthermore, [8] proposed a 2-D piecewise autoregressive model and a soft-decision estimation to interpolate the missing pixels in a group. This method required a 12 × 12 matrix inversion and can cause artifacts in the output image when the matrix is badly conditioned. A combination of directional filtering and data fusion was also discussed in [9] to estimate missing high resolution (HR) pixels by a linear minimum mean square error estimation. Another group of interpolation algorithms used different kinds of transforms to predict the fine structure of the HR image from


its LR version. Instead of directly interpolating the HR image in pixel domain, zeros were initially padded for the high frequencies from the wavelet transform [10] and the contourlet transform [11]. These algorithms were then iterated under the constraints of sparsity and the similarity of low pass output of the LR and HR images.

In demosaicking, interpolation is applied to reconstruct the missing color component due to the color-filtered image sensor. The full-resolution color image can be achieved from the Bayer color filter array by interpolating the (R,G,B) planes separately as in [7] or jointly as in [12], [13]. Processing the color planes independently helps avoid the misregistration between color planes but ignores the color planes' dependency. For joint color plane interpolation, the green pixels are first interpolated from the candidates of horizontal and vertical interpolation. After that, red and blue pixels are reconstructed based on the color differences and with the assumption that these differences are flat over small areas. An iterative algorithm for demosaicking using the color difference is discussed in [14]. Interpolation is also required when video sequences are displayed in frame sizes other than their original frame size. In [15], the decoded frame in an unsuitable frame size is upsized and downsized to achieve the arbitrary target frame size in a pixel domain transcoder.

When interpolation is used along with data pruning, the method needs to adapt to the way of pruning the data and to the structure of surrounding pixels. For instance, there are pruning cases in which only rows or only columns are dropped and upsampling in only one direction is required. This paper develops a high-order edge-directed interpolation scheme to deal with these cases. The algorithm is also considered for the cases of dropping both rows and columns. Furthermore, instead of using only spatially neighboring pixels for image interpolation, the algorithm is extended for cases of video interpolation using spatio-temporally neighboring pixels.

The paper is organized as follows. Section II introduces the data pruning-based compression method. Section III derives an optimal data pruning algorithm. The high-order edge-directed interpolation methods corresponding to the data pruning-based compression scheme are described in Section IV. Results for interpolation and coding applications are presented in Section V. Finally, Section VI gives the concluding remarks and discusses future work.

II. DATA PRUNE-BASED COMPRESSION

The block diagram of the data pruning-based compression for one frame is shown in Fig. 1. At first, the original frame is pruned to a frame of smaller size by removing a number of rows and columns. The purpose of data pruning is to reduce the number of bits representing the stored or compressed frame. Then, a frame having the original size is reconstructed by interpolating the decoded pruned frame. The conventional data pruning-based compression methods reduce the frame size by a factor of 2 in both the horizontal and vertical directions by dropping half of the columns and rows. Because of aliasing, interpolation after downsizing causes jaggedness artifacts, especially for detail areas with high frequencies. Only 25% of the data is kept in the pruning phase, a fact that also prevents achieving a reconstructed frame with high quality, even without compression.

Fig. 1. Block diagram of the data pruned-based compression.

In data pruning-based compression for video, downsizing in both the spatial and temporal directions is applied to further reduce the bit-rate. In temporal data pruning-based compression, the frame-rate is usually reduced by half and is later reconstructed by motion compensated frame interpolation (MCFI) methods [16], [17]. For fast motion video sequences or for frames at a scene change, these methods typically cause blocking, flickering and ghosting artifacts. The rate-distortion (R-D) relation of these data pruned compressed sequences is much lower than that of the directly compressed sequences due to the high percentage of data loss (up to 87.5%) and the limitation of current video interpolation methods.

Uniformly pruning image or video sequences ignores the data-dependent artifacts caused by the interpolation phase. In this paper, the proposed data prune-based compression method adapts to the error resulting from the interpolation phase. Data which can be reconstructed with less error has higher priority to be dropped than data which causes higher error during the interpolation. The proposed data pruning phase and its corresponding interpolation phase in Fig. 1 will be discussed in Sections III and IV, respectively.

III. OPTIMAL DATA PRUNING

The block diagram of the data pruning phase for one frame is shown in Fig. 2. Only the even rows and columns may be discarded, while the odd rows and columns are always kept for later interpolation. To simplify the analysis, the compression stage in Fig. 1 is ignored. In this phase, the original frame is selectively decimated to the LR frame for the cases of dropping all the even rows, all the even columns, and all the even rows and columns. Then, for each of these 3 downsampling scenarios, the LR frame is interpolated back to the HR frame based on all odd rows and columns (upscaling by ratio of 2 × 2) or all odd rows (upscaling by ratio of 2 × 1) or all odd columns (upscaling by ratio of 1 × 2). Finally, these 3 reconstructed frames are compared to the original frame in order to decide the best downsampling scenario and the number of even rows and columns to be dropped before compression. Because of the decimation and interpolation, the reconstructed frame is different from its original frame. The principle of the algorithm is that the even rows and columns in the reconstructed frame that have the least error compared to their corresponding rows and columns in the original frame are chosen to be dropped. The mean squared error (MSE) between the original and reconstructed frames is defined as

(1)

Fig. 2. Block diagram of the data pruning phase.

Given a target maximum MSE, the data pruning is optimized to discard the maximum number of pixels while keeping the overall MSE of dropping rows and columns less than the target, that is

Fig. 3. Data pruning for the 1st frame of Akiyo sequence. (a) Lines indicated
for pruning. (b) Pruned frame.

(2)
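As a hedged illustration of the quantities behind (1) and (2), the MSE and its link to a target PSNR can be written as below. The notation is an assumption chosen for this sketch and is not taken from the paper: $I$ and $I'$ denote the original and reconstructed frames of size $M \times N$, samples are assumed to be 8-bit (peak value 255), and $\mathrm{PSNR}_{\min}$ and $\mathrm{MSE}_{\max}$ are illustrative names for the target quality and the corresponding error bound.

```latex
\mathrm{MSE} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\bigl(I(i,j)-I'(i,j)\bigr)^{2},
\qquad
\mathrm{PSNR} = 10\log_{10}\frac{255^{2}}{\mathrm{MSE}}
\;\Longrightarrow\;
\mathrm{MSE}_{\max} = \frac{255^{2}}{10^{\mathrm{PSNR}_{\min}/10}}
```

Under these assumptions, a target of 50 dB corresponds to $\mathrm{MSE}_{\max} = 65025 / 10^{5} \approx 0.65$.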

The location of the dropped rows and columns is indicated by quence Akiyo. In Fig. 3(a), the white lines indicate the dropped
and , respectively. If the even column is dropped, lines with the target dB. The frame size is re-
then , otherwise . These indices are stored duced from the standard definition 720 480 to 464 320. The
as side information in the coded bitstream and are used for re- data pruned frame in Fig. 3(b) is more compact and it requires a
constructing the decoded frame. The same algorithm is applied smaller compressed bitstream than the original frame. Most of
to rows. dropped lines are located in flat areas where the aliasing does
The line mean square error for one dropped column not happen.
is defined as For video sequences, the algorithm is extended by dropping
the same lines over frames in the whole group of picture
. In this case, the is defined as
(3)

and similarly for rows. From (2), lines with smaller
have higher priority to be dropped than lines with larger (6)
. Assume that the rows and columns that are where and are the original and reconstructed video se-
dropped have the smallest and that the maximum quences, respectively, and is the number of frames in the
of these lines is . Then, the overall . The for one dropped column is also extended
in (1) becomes the averaged of all dropped pixels [see in the temporal direction as
(4), shown at the bottom of the page]. Therefore, the condition
in (2) can be tightened to (7)

and similarly for rows. This case leads to the same condition as
in (5).
(5)
IV. HIGH-ORDER EDGE-DIRECTED INTERPOLATION
where is the target minimal that the recon- This section proposes a high-order edge-directed interpola-
structed frame has to achieve. An example of the proposed op- tion method to interpolate the downsized frames in Fig. 1
timal data pruning is shown in Fig. 3 for the 1st frame of the se- and the data pruned frames in Fig. 2. In [7], the fourth-order

(4)
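The line-selection idea of this section (drop the even lines with the smallest line error while the overall MSE stays under the target) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it handles columns only, it uses a plain average of the two neighbouring columns as a stand-in for the edge-directed interpolator, it assumes 8-bit samples, and the names `column_lmse` and `select_columns_to_drop` are hypothetical.

```python
import numpy as np


def column_lmse(frame, j):
    """Line MSE of column j when rebuilt from columns j-1 and j+1.

    The neighbour average is only a stand-in for NEDI (assumption).
    """
    estimate = 0.5 * (frame[:, j - 1] + frame[:, j + 1])
    return float(np.mean((frame[:, j] - estimate) ** 2))


def select_columns_to_drop(frame, target_psnr_db, peak=255.0):
    """Greedily pick droppable columns, smallest line error first."""
    frame = np.asarray(frame, dtype=np.float64)
    n_cols = frame.shape[1]
    # Target PSNR converted to a bound on the overall MSE (8-bit assumption).
    mse_max = peak ** 2 / 10.0 ** (target_psnr_db / 10.0)
    # Even columns in 1-based numbering are odd 0-based indices; the last
    # column is skipped here because it has no right-hand neighbour.
    candidates = range(1, n_cols - 1, 2)
    errors = sorted((column_lmse(frame, j), j) for j in candidates)
    dropped, error_sum = [], 0.0
    for lmse, j in errors:
        # Only dropped columns contribute error, so the overall MSE is the
        # sum of their line errors divided by the number of columns.
        if (error_sum + lmse) / n_cols >= mse_max:
            break
        dropped.append(j)
        error_sum += lmse
    return sorted(dropped)
```

Because the candidates are visited in increasing order of line error, stopping at the first violation yields the largest set of columns that still meets the bound in this simplified setting.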



Fig. 4. Block diagram of the single frame-based interpolation phase.

new edge-directed interpolation (NEDI-4) is used to upsize only
for the 2 2 ratio. This interpolation can orient to edges in
2 directions and causes some artifacts in the intersections of
more than 2 edges. The proposed methods are higher order in-
terpolations that can adapt to more edge directions. For single Fig. 5. Model parameters of sixth-order and eighth-order edge-directed inter-
frame-based interpolation, the sixth-order edge-directed interpolation. (a) NEDI-6. (b) NEDI-8.
polation and eighth-order interpolation are developed for in-
terpolating the cases with ratio 1 2 or 2 1 (dropping only , the optimal minimizing the MSE between the interpo-
rows or only columns) and ratio 2 2 (dropping both rows and lated and original pixels in can be calculated by
columns), respectively. For multiframe-based interpolation, the
ninth-order edge-directed interpolation is discussed for interpo-
(9)
lating the case with ratio 1 2 or 2 1 over all the frames of a
GOP (dropping only rows or only columns).
The geometric duality assumption [18] states that the model
A. Single Frame-Based Interpolation vector can be considered constant for different scales and
Because the similar interpolation method is used for and so, it can be estimated from the LR pixels by
, this section will only discuss the case of interpolating .
The block diagram of the interpolation phase is shown in Fig. 4.
First, is expanded to of size by inserting a line
of zeros at the line of if its indicator value
for columns or for rows. is selectively down-
sampled by 1 2, 2 1 or 2 2 ratio to form depending (10)
on the chosen data pruning scheme. Then, are directionally
interpolated to the HR frame of size . Finally, the where are 6-neighboring LR pixels of and
indicators and determine whether the lines in the final is the LR model parameter vector as shown in Fig. 5(a).
reconstructed frame are selected from the interpolated or from contains the edge-directed information which is applied to the
the data pruned frame HR scale for interpolation. The optimal minimum MSE linear
is then obtained by
if
otherwise. (11)

1) Sixth-Order Edge-Directed Interpolation (NEDI- ): The where is the vector of all mapped LR pixels
same NEDI-6 is implemented for case of single frame-based in and is a matrix. The elements of the column
interpolation with upsampling ratios of 1 2 or 2 1. For the of are the 6-neighboring pixels of shown in
case of ratio 1 2, the pixel indexes are classified to Fig. 5(a).
indexes for odd columns and indexes for odd columns 2) Eighth-Order Edge-Directed Interpolation (NEDI- ):
. The columns of are mapped to the odd columns of the This section develops an algorithm to deal with single
HR frame of size by . The frame-based interpolation for the case of upsampling with
even columns of are interpolated from the odd columns by ratio of 2 2. Similar to NEDI-6, the pixels in corre-
a sixth-order interpolation sponding to the LR pixels downsampling by 2 2 in are
extracted to form the LR frame of size . The
interpolation is performed using NEDI-4 as in [7] for the first
round and NEDI-8 for the second round. The interpolation
schemes of NEDI-4 and NEDI-8 are shown in Fig. 5(b), where
(8) the solid circles are the mapped LR pixels and the other pixels
are the HR pixels to be interpolated. Using the quincunx
where is the vector of sixth-order model parameters and sublattice, two passes are performed in the first round. In the
is the vector of 6-neighboring pixels of as shown in first pass, NEDI-4 is used to interpolate type 1 pixels (squares
Fig. 5(a). In this figure, the solid circles are the mapped LR with lines) from the LR pixels (solid circles). In the second
pixels while the circles are the HR pixels needed to be inter- pass, type 2 pixels (squares) and type 3 pixels (circles) are
polated. Assuming that is nearly constant in a local window interpolated from type 1 and LR pixels.
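The least-squares estimation of the sixth-order model parameters in (10) and (11) can be illustrated with a short sketch. The neighbour layout is an assumption made for this example (three pixels in the column to the left and three in the column to the right of the pixel of interest), and the function names are hypothetical. A least-squares solver is used instead of an explicit matrix inverse, which is a design choice of this sketch to cope with badly conditioned windows.

```python
import numpy as np


def six_neighbours(img, i, j_left, j_right):
    """Three pixels from column j_left and three from j_right around row i."""
    rows = (i - 1, i, i + 1)
    return np.array([img[r, j_left] for r in rows] +
                    [img[r, j_right] for r in rows])


def estimate_weights(lr, positions):
    """Fit the model at LR scale: each LR pixel is regressed on its
    6 LR neighbours (geometric duality assumption)."""
    C = np.array([six_neighbours(lr, i, j - 1, j + 1) for i, j in positions])
    y = np.array([lr[i, j] for i, j in positions])
    # Minimum-norm least-squares solution of C @ alpha = y.
    alpha, *_ = np.linalg.lstsq(C, y, rcond=None)
    return alpha


def interpolate_between(lr, i, j, alpha):
    """Missing HR pixel lying between LR columns j and j+1 on row i."""
    return float(six_neighbours(lr, i, j, j + 1) @ alpha)
```

In use, `positions` would be the LR pixels inside the local window, and the resulting weights would be applied to the HR pixels at the center of that window.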



Fig. 6. Block diagram of the proposed multiframe-based interpolation for case of upsampling with ratio 1 2 2.
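The block matching step of the multiframe scheme, together with the threshold test on the sum of absolute differences (SAD), might look like the sketch below. It is a full-search illustration under assumptions: the search range, the per-pixel normalization of the threshold, and the names `best_match` and `use_temporal` are all hypothetical and not taken from the paper.

```python
import numpy as np


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return float(np.abs(a.astype(np.float64) - b.astype(np.float64)).sum())


def best_match(cur, prev, top, left, size, search):
    """Full search in prev for the block of cur at (top, left).

    Returns (lowest SAD, (dy, dx)) over displacements within +/- search.
    """
    block = cur[top:top + size, left:left + size]
    best = (float("inf"), (0, 0))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > prev.shape[0] or x + size > prev.shape[1]:
                continue  # candidate block falls outside the previous frame
            cost = sad(block, prev[y:y + size, x:x + size])
            if cost < best[0]:
                best = (cost, (dy, dx))
    return best


def use_temporal(cost, size, threshold_per_pixel):
    """Use the spatio-temporal result only when the match is close enough
    (per-pixel threshold is an assumption of this sketch)."""
    return cost / (size * size) < threshold_per_pixel
```

When `use_temporal` returns false, the single frame-based result would be kept for that block, so that unrelated pixels from the previous frame do not contribute to the output.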

Having an initial estimation of all the 8-neighboring pixels, based on the sum of absolute difference between
NEDI-8 is implemented to get extra information from 4 direc- the current block and its matching block
tions in the second round. In this round, the model parame-
ters can be directly estimated from the HR pixels. Therefore,
the overfitting problem of NEDI-4 is reduced while considering
more edge orientations. For the sake of interpolation consis-
tency, NEDI-8 is applied to the pixels of type 3, 2, and 1 as in this (12)
order. The fourth-order model parameters and eighth-order
model parameters for HR scale are shown in Fig. 5(b). The where is the block of pixels of interest that includes the HR
optimal is similarly calculated by (11), where is the vector pixels needed to interpolate and is the motion vector
of all HR pixels in , and matrix is of block . is calculated by
employed, which is a matrix whose column is com-
if
posed of the 8-neighboring pixels of .
otherwise

B. Multiframe-Based Interpolation where is the threshold to determine whether is chosen
For multiframe interpolation, using single-frame-based inter- from or . The final reconstructed frame is
polation algorithm such as NEDI-6 or NEDI-8 can result in selected from the interpolated frame or the data pruned frame
temporal inconsistency. This comes from ignoring of temporal by the indicators and
correlation of the single-frame-based interpolation. A spatio-
temporal interpolation method is proposed in this subsection
to reduce the flickering effect. To interpolate one HR pixel in if
the current frame, extra surrounding pixels from the previous otherwise.
frame are used together with its surrounding pixels in the cur- 2) Ninth-Order Edge-Directed Interpolation (NEDI- ): In
rent frame. A multiframe-based ninth-order edge-directed inter- NEDI-9, besides the 6 surrounding pixels in the current frame,
polation (NEDI-9) method is discussed for the case of dropping 3 more pixels in the matching block of previous frame are used.
all the even columns over frames of the whole GOP. A similar The interpolation phase is implemented as shown in Fig. 4. The
algorithm can be applied to the cases of dropping all even rows interpolated pixel is the weighted average
or both even columns and rows.
1) Spatio-Temporal Interpolation Scheme: The block dia-
gram of the multiframe-based interpolation is shown in Fig. 6.
First, the current compressed data pruned frame is ex-
panded to of the original size by inserting zeros as in Sub-
section IV-A. Then, is single frame-based interpolated to
using NEDI-6 as in (8). Assume that the previous inter-
polated frame is , a block-based motion estimation and
motion compensation are used to align the block of pixels of in-
terest in to its matching block in . Interpolating (13)
the current frame and motion estimating based on larger blocks
help to achieve more accurate motion vectors, especially for the where is the vector of ninth-order model parameters and
compressed sequence. The reason is that the interpolated pixels is the vector of 6-spatial neighboring pixels and 3-spatio-tem-
have less artifacts than the LR pixels after the “filter-like” in- poral neighboring pixels of , and is the
terpolation phase. Based on and its motion compensated motion vector of the current block. The interpolation scheme for
frame from , is spatio-temporally interpolated NEDI-9 is shown in Fig. 7(a), where solid circles represent the
using NEDI-9. available pixels and blank circles represent the pixels to be inter-
If the matching block is very different from the current polated. Equation (13) includes one term for the spatial pixels as
blocks, the spatio-temporal pixels should not be used, thus in NEDI-6 and the other term for the spatio-temporal pixels in
preventing the un-related pixels in the previous frame from the previous interpolated frame . The output is edge-di-
contributing to the output. and are combined to rected by the first term and temporal-consistent-directed by the



pling by 2 in both directions. In this simulation, for upsampling
with ratio 1 2 for these methods, only the LR pixels located
in an even row and column (solid circles as plotted in Fig. 5(b)
are used to interpolate the pixels in even columns (square with
lines and circle). The remaining available LR pixels (square)
are ignored. For NEDI-6 and NEDI-9, a window size of 17 17
pixels is chosen for the model parameter estimation. Only 6
HR pixels at the center of the window are interpolated using
these model parameters. is shifted by (4,4) pixels over the
frame to interpolate all HR pixels. For NEDI-9, for motion
estimation is set to 16 16 and the threshold is experimentally
chosen to be . This helps achieving the highest PSNR
for the interpolated frames of different sequences. A particular
result is shown in Fig. 8 for a zoomed part of a frame of the
Foreman sequence. The PSNR values of the interpolated frames
Fig. 7. Model parameters of 9th order edge-directed interpolation. (a) Interpo-
lation scheme. (b) Parameter estimation. using bicubic, sinc, autoregression, NEDI-6 and NEDI-9 in-
terpolation are 38.86 dB, 38.76 dB, 37.39 dB, 39.31 dB and
39.42 dB, respectively. These results validate the effectiveness
second term. The second term helps reducing the flickering ef-
of NEDI-6 and NEDI-9 for edge-directed interpolation, since
fect of using only frame-based interpolation.
less jaggedness and higher PSNR are attained compared to
The model vector is estimated from its LR model vector
the other methods. Comparing to NEDI-6 using only spatial
, where is shown in Fig. 7(b). In this case, for the spatial
pixels, NEDI-9 using both spatial and spatio-temporal pixels
parameters, the geometric duality is assumed as in NEDI-6.
achieves better visual quality and higher PSNR. When played
This assumption is not needed for the spatio-temporal param-
as a video sequence, interpolated sequence using NEDI-9 also
eters, because all pixels in the previous frame are available.
has less flickering artifacts and a higher quality consistent in
These parameters are finally estimated as in (11) where is
the temporal direction than the single frame-based interpolated
the vector of all mapped LR pixels in
sequence using NEDI-6. Because of the ME part, NEDI-9
and is a matrix whose column is composed
has higher complexity and requires longer running time than
of the 9-spatial and spatio-temporal neighboring pixels of
NEDI-6. For the 2nd frame of Foreman sequence, the running
. The 9-spatial and spatio-temporal neigh-
times are 0.72 s, 0.34 s, 6.59 s, 28.76 s, 433.36 s, and 4690.42
boring pixels of are defined as s for bicubic, sinc, autoregression, NEDI-4, NEDI-6, and
NEDI-9 methods. Note that sinc and autoregression methods
are in C code while the other methods are written using Matlab.
The simulation is run on laptop with Intel 1.83-GHz CPU and
1-GB RAM.
2) Eighth-Order Edge-Directed Interpolation for Up-
sampling With Ratio 2 2: For the proposed NEDI-8, the
comparison is performed with the Shan’s method [19], bicubic,
sinc, and NEDI-4 methods. For NEDI-4 and NEDI-8, the
window size is chosen to be 17 17 and only 4 HR pixels at
the center of are interpolated using these model parameters.
is also shifted by (4,4) pixels over the frame to interpolate
all HR pixels, like in the NEDI-6 and NEDI-9 cases. The frame
V. SIMULATION RESULTS is expanded by reflecting these pixels over the borders in order
to enhance the pixels near the frame borders in the proposed
A. High-Order Edge-Directed Interpolation NEDI-8.
Simulations are performed to compare the proposed PSNR values are shown in Table I for sequences with
high-order edge-directed interpolations with other interpo- different resolutions. To perform a fair comparison to other
lation methods for a wide range of data in different formats. methods that use bilinear interpolation for pixels near the
Both cases of upsampling with ratio of 1 2 and 2 2 are borders, pixels at 5 lines or fewer away from the border are
considered. not counted for the PSNR computation. The Table I shows
1) Sixth-Order and Ninth-Order Edge-Directed Interpola- that NEDI-8 has the highest average PSNR value. The average
tion for Upsampling With Ratio 1 2: The original frames PSNR of NEDI-8 is 3.930 dB, 1.054 dB, 1.198 dB, 0.732 dB
are downsampled by 2 in the horizontal direction (dropping higher than the average PSNR value of Shan’s method, bicubic,
all even columns). The downsized frames are then interpolated sinc, and NEDI-4, respectively.
using bicubic, sinc, autoregression method [8] and the proposed The visual results for a selected part of the Foreman sequence
NEDI-6 and NEDI-9 interpolation. Note that other interpolation are shown in Fig. 9. The result using the sinc-based interpola-
methods, such as [7] and [8], can only be applied for downsam- tion has a lot of jaggedness [Fig. 9(b)]. While the NEDI-4 inter-



Fig. 8. Comparison of NEDI-6 and NEDI-9 to other methods. (a) Original. (b) Bicubic. (c) Sinc. (d) Autoregression. (e) NEDI-6. (f) NEDI-9.

Fig. 9. Comparison of NEDI-8 to other methods. (a) Original. (b) Sinc. (c) NEDI-4. (d) NEDI-8.
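For the PSNR comparison in Table I, pixels close to the frame border are excluded. A minimal sketch of such a measurement is given here, assuming 8-bit frames and a border width of 5 lines; the exact width and the function name `psnr_ignoring_border` are assumptions of this example.

```python
import numpy as np


def psnr_ignoring_border(reference, test, border=5, peak=255.0):
    """PSNR in dB computed on the frame interior only."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    h, w = ref.shape
    # Crop the same border from both frames before measuring the error.
    ref = ref[border:h - border, border:w - border]
    tst = tst[border:h - border, border:w - border]
    mse = float(np.mean((ref - tst) ** 2))
    if mse == 0.0:
        return float("inf")  # identical interiors
    return float(10.0 * np.log10(peak ** 2 / mse))
```

Cropping both frames identically keeps the comparison fair between methods that treat border pixels differently.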

TABLE I
PSNR COMPARISON (IN dB ) size 352 288 to 304 288. An H.264/AVC codec is used to
intra code the frames with . NEDI-6 is used for
the edge-directed interpolation. Each even rows and columns
require one bit to indicate whether it is kept or dropped. Such
as for the frame of size 352 288, a total of
bits is used to indicate the dropped even lines. These bits are
sent as side information in the coded bitstream. For compar-
ison, other data pruning-based methods using sinc, bicubic,
polation has signiﬁcant less jaggedness, the interpolated frame autoregression, and NEDI-4 interpolation are also given.
in Fig. 9(c) still shows jaggedness along the strong edges. Be- The R-D curves are plotted in Fig. 10(a) and their zoomed
cause NEDI-4 only uses pixels of 2 directions, artifacts can be in parts are plotted in Fig. 10(b). The percentage of bit saving
observed at the intersections of more than 2 edges. On the other between the H264/AVC compression sequence and the NEDI-6
hand, the NEDI-8 interpolated frame in Fig. 9(d) achieves the data pruning-based compression at the same values of is
best quality with least jaggedness. Using pixels in 4 directions, plotted in Fig. 10(c). The result in Fig. 10(a) shows that the data
the NEDI-8 interpolation also has less artifacts at the intersec- pruning-based compression using NEDI-6 is better than data
tion of more than 2 edges. With respect to objective quality, the pruning-based compression using sinc, bicubic, autoregression
proposed NEDI-8 has the highest PSNR values for all the se- and NEDI-4 methods. The data pruning-based compression
quences across different resolutions. Because of the extra round achieves a better R-D than H264/AVC in the range 31–41 dB.
in the proposed NEDI-8, its running time is longer than NEDI-4. In this range, at the same bit-rate, the PSNR value of data
For Foreman image, the running times are 0.45 s, 0.13 s, 11.56 pruning-based compression is about 0.3–0.5 dB higher than
s, and 65.90 s for bicubic, sinc, NEDI-4 and NEDI-8 methods. the PSNR value of H.264/AVC compression. At the same
Note that sinc method is in C code while the other methods are PSNR, the data pruning-based compression saves about 5% of
written using Matlab software. bit-rate comparing to bit-rate of the H.264/AVC compression.
As shown in Fig. 10(c), at the same QI, the percentage of bit
B. Data Pruning-Based Compression saving is about 4.2%–6.6%. The reconstructed frames using
1) Single-Frame Data Pruning-Based Compression: The data pruning-based compression with sinc, bicubic, autoregres-
simulation in this section veriﬁes the validity of the data sion, NEDI-4 and NEDI-6 methods are shown in Fig. 11(b)–(f)
pruning-based compression method for single frames. This and their zoomed in part are shown in Fig. 12(b)–(f). The data
data pruning-based compression is applied to the compression pruned frames are compressed with and the corre-
of images or intra frames. The target is set to 50 dB. sponding bit-rate is 1.36 Mbps. The PSNR of the reconstructed
Subsequently, the algorithm prunes the frames of Foreman of frame using data pruning-based compression with sinc, bicubic,



Fig. 10. Comparison results for R-D curves of single frame data pruning-based compression. (a) Whole R-D curves. (b) One zoomed in part of (a). (c) Percentage
of bit saving.
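The side-information cost of the single-frame case can be checked with a short calculation. It assumes exactly one flag bit per even row and per even column with no further entropy coding, and the symbol $B_{\text{side}}$ is introduced only for this illustration. For a frame with 352 columns and 288 rows:

```latex
B_{\text{side}} = \frac{288}{2} + \frac{352}{2} = 144 + 176 = 320 \ \text{bits per frame}
```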

Fig. 11. Comparison of NEDI-6 to other interpolation methods in case of single frame data pruning-based compression. (a) Original. (b) Sinc. (c) Bicubic.
(d) Autoregression (37.78 dB). (e) NEDI-4. (f) NEDI-6.

autoregression, NEDI-4 and NEDI-6 methods are 37.79 dB, 37.80 dB, 37.78 dB, 37.42 dB, and 37.91 dB, respectively. The results show that the reconstructed frame using NEDI-6 in Fig. 11 has fewer artifacts than other methods. Because the reconstructed frames using autoregression and NEDI-4 are not based on the LR pixels located at even rows, the HR pixels are not consistent with each other and cause some artifacts at the teeth areas in Fig. 11(d) and (e).

An additional simulation is performed to analyze the effect of the target PSNR on the pruned frame size and the R-D curve of the data pruning-based compression. The results in Table II show that when the target PSNR decreases, more data is considered to be dropped while the PSNR range having a better R-D curve shrinks. The highest average PSNR improvement is obtained when the target PSNR is set to 50 dB. The table also shows that the compressed bit-rate saving increases when the target PSNR decreases.

2) Multiframe Data Pruning-Based Compression: The data pruning approach is applied to video compression. An experiment is performed in which a GOP of 15 frames of Akiyo is pruned for a given target PSNR. Three downsampling scenarios of dropping all even rows, all even columns, and all even rows and columns, then using the interpolation scenarios of factors of 1 × 2, 2 × 1 and 2 × 2, are considered to determine the best number of lines to be dropped. Simulation shows that dropping 160 columns and keeping all rows is the best solution, which achieves the most dropped pixels while still keeping the PSNR of the reconstructed frame higher than 45 dB. As a consequence, the frame size is reduced from 720 × 480 to 320 × 480. An H.264 codec is then applied. The line mean square error is averaged over the whole GOP, so that the same lines are dropped for all the frames. In this way, the side information to determine the dropped lines is greatly reduced. The extra bit-rate is 1.2 Kbps for the whole



Fig. 12. One zoomed in part of Fig. 11. (a) Original. (b) Sinc. (c) Bicubic. (d) Autoregression. (e) NEDI-4. (f) NEDI-6.

TABLE II
PSNR COMPARISON (IN dB)

GOP, which again is very small compared to the total bit-rate of the compressed bitstream. For interpolation, single frame-based NEDI-6 is used for the first I frame, while multiframe-based NEDI-9 is employed for the following frames. For comparison, the data pruning scheme is applied to the sequence down- and up-sized by 2 × 2 with the uniform sinc interpolation.
The R-D curves are shown in Fig. 13(a), while Fig. 13(b) shows zoomed-in parts. These results show that the R-D curve of the sinc data-pruned method is consistently below the curve of the optimal data pruning method. The proposed method is better than H.264/AVC in the range 32–37.5 dB. The PSNR improvement at the same bit-rate is around 0.3–0.7 dB in this range. As shown in Fig. 13(c), the bit-rate saving of the optimal data pruning-based compressed sequence is 23%–36% compared to H.264/AVC using the same quantization step size. Even at the same bit-rate and PSNR values, the reconstructed frames have fewer artifacts because they are compressed with smaller quantization step and . Fig. 14 shows the comparison between the H.264/AVC compressed frame and the optimal data pruning-based compressed frame at the quantization level of 35 and 32, respectively. These sequences have nearly the same bit-rate of 92 Kbps and 94 Kbps and the same PSNR of 37.83 dB and 37.91 dB, respectively, for the H.264/AVC and the proposed data pruning-based compressed sequences. Results show that the proposed data pruning-based compressed frame in Fig. 14(b) has higher visual quality and fewer artifacts than the H.264/AVC compressed frame in Fig. 14(a). This merit can be explained by the interpolation phase, which helps reduce the blocking and ringing artifacts, and by the smaller quantization step level. Because of the ’filter-like’ interpolation, the reconstructed sequence at low bit-rate has fewer blocking artifacts than the directly compressed sequence with a high compression level.
Both the PSNR curves and the visual results validate the effectiveness of the proposed data pruning-based compression. The proposed algorithm requires an interpolation step in the data pruning and reconstruction phases, so the complexity of data pruning-based compression is higher than that of normal compression. However, the coding and decoding time of the proposed method decreases proportionally to the size reduction of the data pruned frame. For example, when pruning from the original frame size of 720 × 480 to 320 × 480, both the encoding and decoding time for the data pruned sequence is only 50% of the encoding and decoding time for the original sequence. Additional simulations show that, to further reduce the running time in the encoding phase, a simple interpolator such as a bilinear interpolator can be applied at the data pruning phase in Fig. 2 while nearly keeping the same performance as when using high-order edge-directed interpolators. For structure , the same data pruning phase for structure can be applied without any modification. The B frames require a smaller number of bits for compression, and the extra bits for indicating the dropped lines become significant compared with the bits for coding the frame. So



Fig. 13. Comparison results for multiframe data pruning-based compression. (a) R-D curves. (b) Zoom in of 13 (a). (c) Percentage of bit saving.
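The pruning test described for the multiframe experiment (drop lines, interpolate back, and accept the pruning only if the reconstruction quality stays above the target) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: it drops every even column (1-based numbering) uniformly, uses a plain linear interpolator in place of NEDI-6 or NEDI-9, and averages the PSNR over the GOP, which is an assumption because the averaged quantity is not legible in the text. All function names are hypothetical.

```python
import math

def drop_even_columns(frame):
    """Keep 0-based columns 0, 2, 4, ... (the odd columns in 1-based numbering)."""
    return [row[0::2] for row in frame]

def upsample_columns_linear(pruned, width):
    """Rebuild `width` columns; each dropped column is the mean of its
    horizontal neighbours (linear interpolation, a stand-in for NEDI)."""
    out = []
    for row in pruned:
        full = []
        for x in range(width):
            if x % 2 == 0:
                full.append(row[x // 2])  # retained pixel
            else:
                left = row[x // 2]
                # At the right border there is no right neighbour: repeat the left one.
                right = row[x // 2 + 1] if x // 2 + 1 < len(row) else left
                full.append((left + right) / 2.0)
        out.append(full)
    return out

def frame_psnr(a, b, peak=255.0):
    """PSNR in dB between two frames stored as lists of rows."""
    n = sum(len(r) for r in a)
    mse = sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def can_prune_columns(frames, target_psnr):
    """True if the PSNR averaged over the GOP stays at or above the target,
    so that the same columns can be dropped in every frame."""
    scores = []
    for f in frames:
        rec = upsample_columns_linear(drop_even_columns(f), len(f[0]))
        scores.append(frame_psnr(f, rec))
    return sum(scores) / len(scores) >= target_psnr
```

Making one decision per GOP rather than per frame mirrors the point made in the text that sharing the dropped lines across frames keeps the side information small.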

Fig. 14. Comparison for H.264/AVC compression and optimal data pruning-based compression with same bit-rate and PSNR values. (a) H.264/AVC. (b) Optimal
data pruning-based.
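To give a sense of what the smaller quantization level in Fig. 14 means, the short Python sketch below compares the step sizes of two quantization levels. It assumes that the quoted levels 35 and 32 are H.264 quantization parameters (QP), for which the quantization step size doubles with every increase of 6; the function name `qstep_ratio` is hypothetical.

```python
def qstep_ratio(qp_a, qp_b):
    """Ratio Qstep(qp_a) / Qstep(qp_b), assuming the H.264 rule that the
    quantization step size doubles for every increase of 6 in QP."""
    return 2.0 ** ((qp_a - qp_b) / 6.0)

# Under this assumption the step at level 35 is about 1.41 times the step at level 32.
ratio = qstep_ratio(35, 32)
```

Under this assumption, the pruned sequence is quantized with a step roughly 29% smaller than the directly compressed one at a similar bit-rate.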

Fig. 15. Comparison for H.264/AVC compression and optimal data pruning-based compression with same bit-rate and PSNR values. (a) H.264/AVC. (b) Optimal data pruning-based.

the R-D improvement using structure is better than the R-D improvement using structure . All simulation results can be found at http://videoprocessing.ucsd.edu/~dungvo/dataprune.html.

VI. CONCLUSION

The paper proposed a novel data pruning-based compression method to reduce the bit-rate. High-order edge-directed interpolations using more surrounding pixels are also discussed to adapt to different data pruning schemes. The results show that these high-order edge-directed interpolation methods help reduce the jaggedness along strong edges as well as the artifacts at the intersection areas. The proposed optimal data pruning-based compression achieves a better R-D relation than the conventional data pruning-based compression at low and medium compression levels. The NEDI-6 and NEDI-9 for upsampling only rows can also be applied for de-interlacing. For the same sequence, the R-D performance of single frame data pruning-based compression is much better than the R-D performance of multiframe data pruning-based compression. This is because, with the same target PSNR, a higher percentage of data can be dropped for a single image than for a video sequence. Another reason is that the same rows/columns are dropped over all frames in the GOP, and more bits are required to compress the objects moving over the dropped lines.
In future work, the location of the dropped lines should be adaptive to the motion of the moving objects. Instead of using only the pixels at odd indices, high-order edge-directed interpolation methods may use more available pixels to estimate the model parameters more accurately. Additionally, the objective function of the data pruning algorithm may be extended to consider the coding efficiency of dropping these pixels to further improve the R-D curve. A more efficient data pruning-based compression that drops whole frames can also be considered, using MCFI methods for video sequences with fast motions.

ACKNOWLEDGMENT

The authors would like to thank Y. Zheng for the interesting discussions at Thomson Corporate Research.

REFERENCES

[1] Advanced Video Coding for Generic Audiovisual Services, 2005.
[2] N. Vasconcelos and F. Dufaux, “Pre and post-filtering for low bit-rate video coding,” in Proc. IEEE Conf. Image Process., Oct. 1997, vol. 1, pp. 291–294.



[3] A. Cavallaro, O. Steiger, and T. Ebrahimi, “Perceptual prefiltering for video coding,” in Proc. IEEE Int. Symp. Int. Multimedia, Video and Speech Processing, Oct. 2004, pp. 510–513.
[4] K. Ratakonda and N. Ahuja, “POCS based adaptive image magnification,” in Proc. IEEE Conf. Image Process., Oct. 1998, vol. 3, pp. 203–207.
[5] Y. Cha and S. Kim, “Edge-forming methods for color image zooming,” IEEE Trans. Image Process., vol. 15, no. 8, pp. 2315–2323, Aug. 2006.
[6] M. Li and T. Q. Nguyen, “Markov random field model-based edge-directed image interpolation,” IEEE Trans. Image Process., vol. 17, no. 7, pp. 1121–1128, Jul. 2008.
[7] X. Li and M. T. Orchard, “New edge-directed interpolation,” IEEE Trans. Image Process., vol. 10, no. 10, pp. 1521–1527, Oct. 2001.
[8] X. Zhang and X. Wu, “Image interpolation by adaptive 2-D autoregressive modeling and soft-decision estimation,” IEEE Trans. Image Process., vol. 17, no. 6, pp. 887–896, Jun. 2008.
[9] L. Zhang and X. Wu, “An edge-guided image interpolation algorithm via directional filtering and data fusion,” IEEE Trans. Image Process., vol. 15, no. 8, pp. 2226–2238, Aug. 2006.
[10] N. Mueller, Y. Lu, and M. N. Do, “Image interpolation using multiscale geometric representations,” in SPIE Conf. Electronic Imaging, Feb. 2007, vol. 6498.
[11] N. Mueller and T. Q. Nguyen, “Image interpolation using classification and stitching,” presented at the IEEE Conf. Image Process., Oct. 2008.
[12] S. C. P. I. K. Tam, “Effective color interpolation in CCD color filter arrays using signal correlation,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 3, pp. 503–513, Jun. 2003.
[13] D. Menon, S. Andriani, and G. Calvagno, “Demosaicing with directional filtering and a posteriori decision,” IEEE Trans. Image Process., vol. 16, no. 1, pp. 132–141, Jan. 2007.
[14] X. Li, “Demosaicing by successive approximation,” IEEE Trans. Image Process., vol. 14, no. 3, pp. 370–379, Mar. 2005.
[15] G. Shen, B. Zeng, Y.-Q. Zhang, and M. L. Liou, “Transcoder with arbitrarily resizing capability,” in Proc. IEEE Int. Symposium Circuits Syst., May 2001, vol. 5, pp. 22–28.
[16] B. Choi, J. Han, C. Kim, and S. Ko, “Motion-compensated frame interpolation using bilateral motion estimation and adaptive overlapped block motion compensation,” IEEE Trans. Image Process., vol. 17, no. 4, pp. 407–416, Apr. 2007.
[17] A. Huang and T. Nguyen, “A multistage motion vector processing method for motion-compensated frame interpolation,” IEEE Trans. Image Process., vol. 17, no. 5, pp. 694–708, May 2008.
[18] S. G. Mallat, A Wavelet Tour of Signal Processing. New York: Academic, 1998.
[19] Q. Shan, Z. Li, J. Jia, and C. Tang, “Fast image/video upsampling,” ACM Transactions on Graphics (SIGGRAPH ASIA 2008), vol. 27, 2008.

Dung T. Võ (S’06–M’09) received the B.S. and M.S. degrees from Ho Chi Minh City University of Technology, Vietnam, in 2002 and 2004, respectively, and the Ph.D. degree from the University of California at San Diego, La Jolla, in 2009.
He has been a Fellow of the Vietnam Education Foundation (VEF) since 2005 and has been on the teaching staff of Ho Chi Minh City University of Technology since 2002. He interned at Mitsubishi Electric Research Laboratories (MERL), Cambridge, MA, and Thomson Corporate Research, Princeton, NJ, in the summers of 2007 and 2008, respectively. He has been a senior research engineer at the Digital Media Solutions Lab, Samsung Information Systems America (Samsung US R&D Center), Irvine, CA, since 2009. His research interests are algorithms and applications for image and video coding and postprocessing.

Joel Solé (M’02) received the M.S. degrees in telecommunications engineering from the Technical University of Catalonia (UPC), Barcelona, Spain, and the Ecole Nationale Supérieure des Télécommunications (ENST), Paris, France, in 2001, and the Ph.D. degree from the UPC in 2006.
He is currently a member of the technical staff at Corporate Research, Thomson, Inc., Princeton, NJ. Dr. Solé’s research interests focus on advanced video coding and signal processing.

Peng Yin (M’02) received the B.E. degree in electrical engineering from the University of Science and Technology of China in 1996 and the Ph.D. degree in electrical engineering from Princeton University, Princeton, NJ, in 2002.
She is currently a senior member of the technical staff at Corporate Research, Thomson, Inc., Princeton, NJ. Her current research interest is mainly on image and video compression. Her previous research is on video transcoding, error concealment, and data hiding. She is actively involved in JVT/MPEG standardization process.
Dr. Yin received the IEEE Circuits and Systems Society Best Paper Award for her article in the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY in 2003.

Cristina Gomila (M’01) received the M.S. degree in telecommunication engineering from the Technical University of Catalonia, Spain, in 1997, and the Ph.D. degree from the Ecole des Mines de Paris, France, in 2001.
She then joined Thomson, Inc., Corporate Research Princeton, Princeton, NJ. She was a core member in the development of Thomson’s Film Grain Technology and actively contributed to several MPEG standardization efforts, including AVC and MVC. Since 2005, she has managed the Compression Research Group at Thomson CR Princeton. Her current research interests focus on advanced video coding for professional applications.

Truong Q. Nguyen (F’06) is currently a Professor at the ECE Department, University of California at San Diego, La Jolla. He is the coauthor (with Prof. G. Strang) of the popular textbook Wavelets and Filter Banks (Wellesley-Cambridge Press, 1997) and the author of several Matlab-based toolboxes on image compression, electrocardiogram compression, and filter bank design. He has over 200 publications. His research interests are video processing algorithms and their efficient implementation.
Prof. Nguyen received the IEEE TRANSACTIONS ON SIGNAL PROCESSING Paper Award (Image and Multidimensional Processing area) for the paper he co-wrote with Prof. P. P. Vaidyanathan on linear-phase perfect-reconstruction filter banks (1992). He received the NSF Career Award in 1995 and is currently the Series Editor (Digital Signal Processing) for Academic Press. He served as Associate Editor for the IEEE TRANSACTIONS ON SIGNAL PROCESSING (1994–1996), the IEEE SIGNAL PROCESSING LETTERS (2001–2003), the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS (1996–1997, 2001–2004), and the IEEE TRANSACTIONS ON IMAGE PROCESSING (2004–2005).

