2010 15 vo


Image processing


  1. IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 2, FEBRUARY 2010, p. 399

     Selective Data Pruning-Based Compression Using High-Order Edge-Directed Interpolation

     Dung T. Võ, Member, IEEE, Joel Solé, Member, IEEE, Peng Yin, Member, IEEE, Cristina Gomila, Member, IEEE, and Truong Q. Nguyen, Fellow, IEEE

     Abstract—This paper proposes a selective data pruning-based compression scheme to improve the rate-distortion relation of compressed images and video sequences. The original frames are pruned to a smaller size before compression. After decoding, they are interpolated back to their original size by an edge-directed interpolation method. The data pruning phase is optimized to obtain the minimal distortion in the interpolation phase. Furthermore, a novel high-order interpolation is proposed to adapt the interpolation to several edge directions in the current frame. This high-order filtering uses more surrounding pixels in the frame than the fourth-order edge-directed method and it is more robust. The algorithm is also considered for multiframe-based interpolation by using spatio-temporally surrounding pixels coming from the previous frame. Simulation results are shown for both image interpolation and coding applications to validate the effectiveness of the proposed methods.

     Index Terms—Data pruning, edge-directed interpolation, spatial-temporal interpolation, video compression.

     Manuscript received January 13, 2009; revised September 17, 2009. First published November 03, 2009; current version published January 15, 2010. This work was done while D. T. Võ was with Thomson Corporate Research and University of California at San Diego. This work was supported in part by Texas Instruments, Inc. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Hsueh-Ming Hang.
     D. T. Võ is with the Digital Media Solutions Lab, Samsung Information Systems America, Irvine, CA 92612 USA (e-mail: dung.vo@samsung.com).
     J. Solé, P. Yin, and C. Gomila are with the Thomson Corporate Research, Princeton, NJ 08540 USA (e-mail: joel.sole@thomson.net; peng.yin@thomson.net; cristina.gomila@thomson.net).
     T. Q. Nguyen is with the Department of Electrical and Computer Engineering, University of California at San Diego, La Jolla, CA 92093-0407 USA (e-mail: nguyent@ece.ucsd.edu).
     Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
     Digital Object Identifier 10.1109/TIP.2009.2035845
     1057-7149/$26.00 © 2010 IEEE
     Authorized licensed use limited to: Univ of Calif San Diego. Downloaded on April 02, 2010 at 13:47:56 EDT from IEEE Xplore. Restrictions apply.

     I. INTRODUCTION

     Nowadays, the demand for higher quality video is growing very fast. Video is moving toward higher resolution, higher frame-rate and higher bit-depth. New technologies to further reduce bit-rate are strongly demanded to combat the bit-rate increase of this high definition video, especially to meet the network and communication transmission constraints. In video coding, there are two main directions to reduce compression bit-rate. One direction is to improve the compression technology and the other one is to perform a preprocessing step that improves the subsequent compression.

     The first direction can be viewed from the development of the MPEG video coding standard, from MPEG-1 to H.264/MPEG-4 AVC. For most video coding standards, increasing quantization step size is used to reduce bit-rate [1]. However, this technique can result in blocking and other coding artifacts due to the loss of high frequency details. In the second direction, common techniques are low-pass filtering or downsampling (which can be seen as a filtering process) followed by reconstructing or upsampling at the decoder. For example, low-pass filters were adaptively used based on the Human Visual System to eliminate high frequency information in [2] or to simplify the contextual information in [3]. Also, to reduce the bit-rate, some digital television systems uniformly downsized the original sequence and upsized it after decoding. These methods contain a pruning phase to reduce the amount of data to compress and a reconstructing phase to recover the dropped data. The reconstructed video produced by these techniques looked blurred because they were designed to eliminate high-frequency information with the low-pass filter in the preprocessing step or with the anti-aliasing filter before downsizing.

     This paper proposes a novel data pruning-based compression scheme to reduce the bit-rate while still keeping a high quality reconstructed frame. The original frames are first optimally pruned to a smaller size by adaptively dropping rows or columns prior to encoding. At the final stage, an interpolation phase is implemented to reconstruct the decoded frames to their original size. By avoiding filtering the remaining rows and columns, the reconstructed frames can still achieve high quality from a lower bit-rate.

     Main applications of interpolation are upsampling, demosaicking and displaying for different video formats. For resolution enhancement, the interpolation is implemented to overcome the limitation of low resolution imaging. A wide range of interpolation methods has been discussed, starting from conventional bilinear and bicubic interpolations to sophisticated iterative methods such as projection onto convex sets (POCS) [4] and nonconvex nonlinear partial differential equations [5]. To avoid the jaggedness artifacts occurring along edges, edge-oriented interpolation methods were performed using Markov random field [6] or the low resolution (LR) image covariance [7]. Furthermore, [8] proposed a 2-D piecewise autoregressive model and a soft-decision estimation to interpolate the missing pixels in a group. This method required a 12 × 12 matrix inversion and can cause artifacts in the output image when the matrix is badly conditioned. A combination of directional filtering and data fusion was also discussed in [9] to estimate missing high resolution (HR) pixels by a linear minimum mean square error estimation. Another group of interpolation algorithms used different kinds of transforms to predict the fine structure of the HR image from
  2. its LR version. Instead of directly interpolating the HR image in pixel domain, zeros were initially padded for the high frequencies from the wavelet transform [10] and the contourlet transform [11]. These algorithms were then iterated under the constraints of sparsity and the similarity of low pass output of the LR and HR images.

     In demosaicking, interpolation is applied to reconstruct the missing color component due to the color-filtered image sensor. The full-resolution color image can be achieved from the Bayer color filter array by interpolating the (R,G,B) planes separately as in [7] or jointly as in [12], [13]. Processing the color planes independently helps avoid the misregistration between color planes but ignores the color planes' dependency. For joint color plane interpolation, the green pixels are first interpolated from the candidates of horizontal and vertical interpolation. After that, red and blue pixels are reconstructed based on the color differences, with the assumption that these differences are flat over small areas. An iterative algorithm for demosaicking using the color difference is discussed in [14]. Interpolation is also required when video sequences are displayed in frame sizes other than their original frame size. In [15], the decoded frame in an unsuitable frame size is upsized and downsized to achieve the arbitrary target frame size in a pixel domain transcoder.

     When interpolation is used along with data pruning, the method needs to adapt to the way of pruning the data and to the structure of surrounding pixels. For instance, there are pruning cases in which only rows or only columns are dropped and upsampling in only one direction is required. This paper develops a high-order edge-directed interpolation scheme to deal with these cases. The algorithm is also considered for the cases of dropping both rows and columns. Furthermore, instead of using only spatially neighboring pixels for image interpolation, the algorithm is extended for cases of video interpolation using spatio-temporally neighboring pixels.

     The paper is organized as follows. Section II introduces the data pruning-based compression method. Section III derives an optimal data pruning algorithm. The high-order edge-directed interpolation methods corresponding to the data pruning-based compression scheme are described in Section IV. Results for interpolation and coding applications are presented in Section V. Finally, Section VI gives the concluding remarks and discusses future work.

     II. DATA PRUNE-BASED COMPRESSION

     Fig. 1. Block diagram of the data pruned-based compression.

     The block diagram of the data pruning-based compression for one frame is shown in Fig. 1. At first, the original frame is pruned to a frame of smaller size by dropping a number of rows and columns. The purpose of data pruning is to reduce the number of bits representing the stored or compressed frame. Then, a frame having the original size is reconstructed by interpolating the decoded pruned frame. The conventional data pruning-based compression methods reduce the frame size by a factor of 2 in both horizontal and vertical directions by dropping half of the columns and rows. Because of aliasing, interpolation after downsizing causes jaggedness artifacts, especially for detail areas with high frequencies. Only 25% of the data is kept in the pruning phase, a fact that also prevents achieving a reconstructed frame with high quality, even without compression.

     In data pruning-based compression for video, downsizing in both spatial and temporal directions is applied to further reduce the bit-rate. In temporal data pruning-based compression, the frame-rate is usually reduced by half and is later reconstructed by motion compensated frame interpolation (MCFI) methods [16], [17]. For fast motion video sequences or for frames at a scene change, these methods typically cause blocking, flickering and ghosting artifacts. The rate-distortion (R-D) relation of these data pruned compressed sequences is much lower than that of the directly compressed sequences due to the high percentage of data loss (up to 87.5%) and the limitation of current video interpolation methods.

     Uniformly pruning image or video sequences ignores the data-dependent artifacts caused by the interpolation phase. In this paper, the proposed data prune-based compression method adapts to the error resulting from the interpolation phase. Data which can be reconstructed with less error has higher priority to be dropped than data which causes higher error during the interpolation. The proposed data pruning phase and its corresponding interpolation phase in Fig. 1 will be discussed in Sections III and IV, respectively.

     III. OPTIMAL DATA PRUNING

     The block diagram of the data pruning phase for one frame is shown in Fig. 2. Only the even rows and columns may be discarded, while the odd rows and columns are always kept for later interpolation. To simplify the analysis, the compression stage in Fig. 1 is ignored. In this phase, the original frame is selectively decimated to the LR frame for the cases of dropping all the even rows, all the even columns, and all the even rows and columns. Then, for each of these 3 downsampling scenarios, the LR frame is interpolated back to the HR frame based on all odd rows and columns (upscaling by ratio of 2 × 2) or all odd rows (upscaling by ratio of 2 × 1) or all odd columns (upscaling by ratio of 1 × 2). Finally, these 3 reconstructed frames are compared to the original frame in order to decide the best downsampling scenario and the number of even rows and columns to be dropped before compression. Because of the decimation and interpolation, the reconstructed frame is different from its original frame. The principle of the algorithm is that the even rows and columns in the reconstructed frame that have the least error compared to their corresponding rows and columns in the original frame are chosen to be dropped. The mean squared error (MSE) between the original and reconstructed frames, that is, the average squared pixel difference over the frame, is defined in (1).

     Given a target MSE, the data pruning is optimized to discard the maximum number of pixels while keeping the
  3. overall MSE of dropping rows and columns less than the target, as stated in (2). The location of the dropped rows and columns is indicated by a row indicator and a column indicator, respectively. The indicator of an even column takes one value if the column is dropped and another value otherwise. These indices are stored as side information in the coded bitstream and are used for reconstructing the decoded frame. The same algorithm is applied to rows.

     Fig. 2. Block diagram of the data pruning phase.

     Fig. 3. Data pruning for the 1st frame of Akiyo sequence. (a) Lines indicated for pruning. (b) Pruned frame.

     The line mean square error (LMSE) for one dropped column is defined in (3), and similarly for rows. From (2), lines with smaller LMSE have higher priority to be dropped than lines with larger LMSE. Assume that the rows and columns that are dropped have the smallest LMSE and consider the maximum LMSE among these lines. Then, the overall MSE in (1) becomes the averaged LMSE of all dropped pixels, as given in (4). Therefore, the condition in (2) can be tightened to (5), which is expressed through the target minimal PSNR that the reconstructed frame has to achieve. An example of the proposed optimal data pruning is shown in Fig. 3 for the 1st frame of the sequence Akiyo. In Fig. 3(a), the white lines indicate the lines dropped for the chosen target PSNR. The frame size is reduced from the standard definition 720 × 480 to 464 × 320. The data pruned frame in Fig. 3(b) is more compact and it requires a smaller compressed bitstream than the original frame. Most of the dropped lines are located in flat areas where aliasing does not happen.

     For video sequences, the algorithm is extended by dropping the same lines over the frames in the whole group of pictures (GOP). In this case, the MSE is defined in (6) between the original and reconstructed video sequences, averaged over the number of frames in the GOP. The LMSE for one dropped column is also extended in the temporal direction as in (7), and similarly for rows. This case leads to the same condition as in (5).

     IV. HIGH-ORDER EDGE-DIRECTED INTERPOLATION

     This section proposes a high-order edge-directed interpolation method to interpolate the downsized frames in Fig. 1 and the data pruned frames in Fig. 2. In [7], the fourth-order
  4. new edge-directed interpolation (NEDI-4) is used to upsize only for the 2 × 2 ratio. This interpolation can orient to edges in 2 directions and causes some artifacts at the intersections of more than 2 edges. The proposed methods are higher order interpolations that can adapt to more edge directions. For single frame-based interpolation, the sixth-order edge-directed interpolation and eighth-order interpolation are developed for interpolating the cases with ratio 1 × 2 or 2 × 1 (dropping only rows or only columns) and ratio 2 × 2 (dropping both rows and columns), respectively. For multiframe-based interpolation, the ninth-order edge-directed interpolation is discussed for interpolating the case with ratio 1 × 2 or 2 × 1 over all the frames of a GOP (dropping only rows or only columns).

     Fig. 4. Block diagram of the single frame-based interpolation phase.

     Fig. 5. Model parameters of sixth-order and eighth-order edge-directed interpolation. (a) NEDI-6. (b) NEDI-8.

     A. Single Frame-Based Interpolation

     Because a similar interpolation method is used for the downsized frames and the data pruned frames, this section will only discuss the case of interpolating the data pruned frame. The block diagram of the interpolation phase is shown in Fig. 4. First, the data pruned frame is expanded to the original frame size by inserting a line of zeros at every line that its indicator, for columns or for rows, marks as dropped. The expanded frame is selectively downsampled by 1 × 2, 2 × 1 or 2 × 2 ratio to form the LR frame, depending on the chosen data pruning scheme. Then, the LR frame is directionally interpolated to the HR frame of the original size. Finally, the row and column indicators determine whether each line in the final reconstructed frame is selected from the interpolated frame (if the line was dropped) or from the data pruned frame (otherwise).

     1) Sixth-Order Edge-Directed Interpolation (NEDI-6): The same NEDI-6 is implemented for the case of single frame-based interpolation with upsampling ratios of 1 × 2 or 2 × 1. For the case of ratio 1 × 2, the pixel indexes are classified into indexes for odd columns and indexes for even columns. The columns of the LR frame are mapped to the odd columns of the HR frame. The even columns of the HR frame are interpolated from the odd columns by the sixth-order interpolation in (8), a weighted sum in which the vector of sixth-order model parameters multiplies the vector of 6-neighboring pixels of the pixel to be interpolated, as shown in Fig. 5(a). In this figure, the solid circles are the mapped LR pixels while the circles are the HR pixels that need to be interpolated. Assuming that the model vector is nearly constant in a local window, the optimal model vector minimizing the MSE between the interpolated and original pixels in the window can be calculated by (9).

     The geometric duality assumption [18] states that the model vector can be considered constant for different scales and so, it can be estimated from the LR pixels by (10), which uses the 6-neighboring LR pixels of each LR pixel and the LR model parameter vector shown in Fig. 5(a). The LR model parameter vector contains the edge-directed information which is applied to the HR scale for interpolation. The optimal minimum MSE linear estimate of this vector is then obtained by (11) from the vector of all mapped LR pixels in the window and a matrix whose columns are composed of the 6-neighboring pixels of these LR pixels, as shown in Fig. 5(a).

     2) Eighth-Order Edge-Directed Interpolation (NEDI-8): This section develops an algorithm to deal with single frame-based interpolation for the case of upsampling with ratio of 2 × 2. Similar to NEDI-6, the pixels corresponding to the LR pixels of downsampling by 2 × 2 are extracted to form the LR frame. The interpolation is performed using NEDI-4 as in [7] for the first round and NEDI-8 for the second round. The interpolation schemes of NEDI-4 and NEDI-8 are shown in Fig. 5(b), where the solid circles are the mapped LR pixels and the other pixels are the HR pixels to be interpolated. Using the quincunx sublattice, two passes are performed in the first round. In the first pass, NEDI-4 is used to interpolate type 1 pixels (squares with lines) from the LR pixels (solid circles). In the second pass, type 2 pixels (squares) and type 3 pixels (circles) are interpolated from type 1 and LR pixels.
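The NEDI-6 steps described in Section IV-A.1 (estimate six weights from LR pixels in a local window under geometric duality, then apply them at the HR scale) can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the exact neighbor layout of Fig. 5(a) is not recoverable from the text, so the sketch assumes the six neighbors are the three vertically adjacent pixels in each of the two nearest known columns; the function names, the `radius` parameter, the per-pixel window, the edge-replicating padding, and the use of `numpy.linalg.lstsq` as the least-squares solver are likewise choices made here for illustration. Indices are 0-based, so the known columns are HR columns 0, 2, 4, ....

```python
import numpy as np


def _six_neighbors(img, i, j, left, right):
    """Rows i-1..i+1 of columns j+left and j+right (assumed NEDI-6 layout)."""
    return np.concatenate([img[i - 1:i + 2, j + left], img[i - 1:i + 2, j + right]])


def nedi6_upsample_columns(lr, radius=4):
    """Upsample by 1 x 2: keep the LR columns and fill the columns in between.

    Hypothetical helper: for every missing pixel, six weights are fitted by
    least squares on the LR pixels of a (2*radius+1)^2 window, then applied
    to the six known pixels around the missing one.
    """
    lr = np.asarray(lr, dtype=float)
    height, width = lr.shape
    pad = radius + 1
    p = np.pad(lr, pad, mode="edge")      # replicate borders (an assumption)
    hr = np.empty((height, 2 * width))
    hr[:, 0::2] = lr                      # known columns are copied unchanged

    for i in range(height):
        for j in range(width):
            ci, cj = i + pad, j + pad     # same pixel in padded coordinates
            rows, targets = [], []
            # LR scale: each LR pixel in the window is predicted from its
            # six LR neighbors one column to the left and right.
            for a in range(ci - radius, ci + radius + 1):
                for b in range(cj - radius, cj + radius + 1):
                    rows.append(_six_neighbors(p, a, b, -1, +1))
                    targets.append(p[a, b])
            alpha, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
            # HR scale (geometric duality): reuse the weights on the two
            # known columns that surround the missing pixel.
            hr[i, 2 * j + 1] = alpha @ _six_neighbors(p, ci, cj, 0, +1)
    return hr
```

Calling `nedi6_upsample_columns(frame, radius=8)` would correspond to the 17 × 17 estimation window mentioned in the simulations, although the paper shares one set of weights among several center pixels and shifts the window, whereas this sketch re-estimates the weights for every pixel for simplicity. A practical version would also fall back to a simpler interpolator when the least-squares system is badly conditioned.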
  5. Having an initial estimation of all the 8-neighboring pixels, NEDI-8 is implemented to get extra information from 4 directions in the second round. In this round, the model parameters can be directly estimated from the HR pixels. Therefore, the overfitting problem of NEDI-4 is reduced while considering more edge orientations. For the sake of interpolation consistency, NEDI-8 is applied to the pixels of type 3, 2, and 1, in this order. The fourth-order model parameters and eighth-order model parameters for the HR scale are shown in Fig. 5(b). The optimal eighth-order model vector is similarly calculated by (11), using the vector of all HR pixels in the window and a matrix whose columns are composed of the 8-neighboring pixels of these HR pixels.

     Fig. 6. Block diagram of the proposed multiframe-based interpolation for case of upsampling with ratio 1 × 2.

     B. Multiframe-Based Interpolation

     For multiframe interpolation, using a single-frame-based interpolation algorithm such as NEDI-6 or NEDI-8 can result in temporal inconsistency, because single-frame-based interpolation ignores the temporal correlation. A spatio-temporal interpolation method is proposed in this subsection to reduce the flickering effect. To interpolate one HR pixel in the current frame, extra surrounding pixels from the previous frame are used together with its surrounding pixels in the current frame. A multiframe-based ninth-order edge-directed interpolation (NEDI-9) method is discussed for the case of dropping all the even columns over the frames of the whole GOP. A similar algorithm can be applied to the cases of dropping all even rows or both even columns and rows.

     1) Spatio-Temporal Interpolation Scheme: The block diagram of the multiframe-based interpolation is shown in Fig. 6. First, the current compressed data pruned frame is expanded to the original size by inserting zeros as in Subsection IV-A. Then, the expanded frame is single frame-based interpolated using NEDI-6 as in (8). Given the previous interpolated frame, block-based motion estimation and motion compensation are used to align the block of pixels of interest in the current frame to its matching block in the previous frame. Interpolating the current frame and estimating motion on larger blocks help to achieve more accurate motion vectors, especially for the compressed sequence. The reason is that the interpolated pixels have fewer artifacts than the LR pixels after the "filter-like" interpolation phase. Based on the current frame and the motion compensated previous frame, the current frame is spatio-temporally interpolated using NEDI-9.

     If the matching block is very different from the current block, the spatio-temporal pixels should not be used, thus preventing unrelated pixels in the previous frame from contributing to the output. The two candidate frames are therefore combined block by block based on the sum of absolute differences (SAD) between the current block and its matching block, defined in (12) over the block of pixels of interest, which includes the HR pixels to be interpolated, and the motion vector of that block. The SAD is compared with a threshold to determine from which of the two candidates each block is chosen. The final reconstructed frame is then selected, line by line, from the interpolated frame or the data pruned frame according to the row and column indicators.

     2) Ninth-Order Edge-Directed Interpolation (NEDI-9): In NEDI-9, besides the 6 surrounding pixels in the current frame, 3 more pixels in the matching block of the previous frame are used. The interpolation phase is implemented as shown in Fig. 4. The interpolated pixel is the weighted average in (13), formed from the vector of ninth-order model parameters and the vector of 6 spatial neighboring pixels and 3 spatio-temporal neighboring pixels of the pixel to be interpolated, where the spatio-temporal neighbors are located through the motion vector of the current block. The interpolation scheme for NEDI-9 is shown in Fig. 7(a), where solid circles represent the available pixels and blank circles represent the pixels to be interpolated. Equation (13) includes one term for the spatial pixels as in NEDI-6 and the other term for the spatio-temporal pixels in the previous interpolated frame. The output is edge-directed by the first term and temporal-consistent-directed by the
  6. second term. The second term helps reduce the flickering effect of using only frame-based interpolation.

     Fig. 7. Model parameters of 9th order edge-directed interpolation. (a) Interpolation scheme. (b) Parameter estimation.

     The model vector is estimated from its LR model vector, which is shown in Fig. 7(b). In this case, for the spatial parameters, the geometric duality is assumed as in NEDI-6. This assumption is not needed for the spatio-temporal parameters, because all pixels in the previous frame are available. These parameters are finally estimated as in (11), using the vector of all mapped LR pixels in the window and a matrix whose columns are composed of the 9 spatial and spatio-temporal neighboring pixels of these LR pixels.

     V. SIMULATION RESULTS

     A. High-Order Edge-Directed Interpolation

     Simulations are performed to compare the proposed high-order edge-directed interpolations with other interpolation methods for a wide range of data in different formats. Both cases of upsampling with ratio of 1 × 2 and 2 × 2 are considered.

     1) Sixth-Order and Ninth-Order Edge-Directed Interpolation for Upsampling With Ratio 1 × 2: The original frames are downsampled by 2 in the horizontal direction (dropping all even columns). The downsized frames are then interpolated using bicubic, sinc, the autoregression method [8] and the proposed NEDI-6 and NEDI-9 interpolation. Note that other interpolation methods, such as [7] and [8], can only be applied for downsampling by 2 in both directions. In this simulation, for upsampling with ratio 1 × 2 for these methods, only the LR pixels located in an even row and column (solid circles as plotted in Fig. 5(b)) are used to interpolate the pixels in even columns (square with lines and circle). The remaining available LR pixels (square) are ignored. For NEDI-6 and NEDI-9, a window size of 17 × 17 pixels is chosen for the model parameter estimation. Only 6 HR pixels at the center of the window are interpolated using these model parameters. The window is shifted by (4,4) pixels over the frame to interpolate all HR pixels. For NEDI-9, the block size for motion estimation is set to 16 × 16 and the threshold is chosen experimentally to achieve the highest PSNR for the interpolated frames of different sequences. A particular result is shown in Fig. 8 for a zoomed part of a frame of the Foreman sequence. The PSNR values of the interpolated frames using bicubic, sinc, autoregression, NEDI-6 and NEDI-9 interpolation are 38.86 dB, 38.76 dB, 37.39 dB, 39.31 dB and 39.42 dB, respectively. These results validate the effectiveness of NEDI-6 and NEDI-9 for edge-directed interpolation, since less jaggedness and higher PSNR are attained compared to the other methods. Compared to NEDI-6 using only spatial pixels, NEDI-9 using both spatial and spatio-temporal pixels achieves better visual quality and higher PSNR. When played as a video sequence, the interpolated sequence using NEDI-9 also has fewer flickering artifacts and a more consistent quality in the temporal direction than the single frame-based interpolated sequence using NEDI-6. Because of the motion estimation (ME) part, NEDI-9 has higher complexity and requires longer running time than NEDI-6. For the 2nd frame of the Foreman sequence, the running times are 0.72 s, 0.34 s, 6.59 s, 28.76 s, 433.36 s, and 4690.42 s for the bicubic, sinc, autoregression, NEDI-4, NEDI-6, and NEDI-9 methods, respectively. Note that the sinc and autoregression methods are in C code while the other methods are written using Matlab. The simulation is run on a laptop with an Intel 1.83-GHz CPU and 1-GB RAM.

     2) Eighth-Order Edge-Directed Interpolation for Upsampling With Ratio 2 × 2: For the proposed NEDI-8, the comparison is performed with Shan's method [19], bicubic, sinc, and NEDI-4 methods. For NEDI-4 and NEDI-8, the window size is chosen to be 17 × 17 and only 4 HR pixels at the center of the window are interpolated using these model parameters. The window is also shifted by (4,4) pixels over the frame to interpolate all HR pixels, as in the NEDI-6 and NEDI-9 cases. The frame is expanded by reflecting pixels over the borders in order to enhance the pixels near the frame borders in the proposed NEDI-8.

     PSNR values are shown in Table I for sequences with different resolutions. To perform a fair comparison to other methods that use bilinear interpolation for pixels near the borders, pixels at 5 lines or fewer away from the border are not counted for the PSNR computation. Table I shows that NEDI-8 has the highest average PSNR value. The average PSNR of NEDI-8 is 3.930 dB, 1.054 dB, 1.198 dB, and 0.732 dB higher than the average PSNR value of Shan's method, bicubic, sinc, and NEDI-4, respectively.

     The visual results for a selected part of the Foreman sequence are shown in Fig. 9. The result using the sinc-based interpolation has a lot of jaggedness [Fig. 9(b)]. While the NEDI-4 inter-
  7. polation has significantly less jaggedness, the interpolated frame in Fig. 9(c) still shows jaggedness along the strong edges. Because NEDI-4 only uses pixels of 2 directions, artifacts can be observed at the intersections of more than 2 edges. On the other hand, the NEDI-8 interpolated frame in Fig. 9(d) achieves the best quality with the least jaggedness. Using pixels in 4 directions, the NEDI-8 interpolation also has fewer artifacts at the intersection of more than 2 edges. With respect to objective quality, the proposed NEDI-8 has the highest PSNR values for all the sequences across different resolutions. Because of the extra round in the proposed NEDI-8, its running time is longer than that of NEDI-4. For the Foreman image, the running times are 0.45 s, 0.13 s, 11.56 s, and 65.90 s for the bicubic, sinc, NEDI-4 and NEDI-8 methods, respectively. Note that the sinc method is in C code while the other methods are written using Matlab software.

     Fig. 8. Comparison of NEDI-6 and NEDI-9 to other methods. (a) Original. (b) Bicubic. (c) Sinc. (d) Autoregression. (e) NEDI-6. (f) NEDI-9.

     Fig. 9. Comparison of NEDI-8 to other methods. (a) Original. (b) Sinc. (c) NEDI-4. (d) NEDI-8.

     TABLE I. PSNR COMPARISON (IN dB)

     B. Data Pruning-Based Compression

     1) Single-Frame Data Pruning-Based Compression: The simulation in this section verifies the validity of the data pruning-based compression method for single frames. This data pruning-based compression is applied to the compression of images or intra frames. The target PSNR is set to 50 dB. Subsequently, the algorithm prunes the frames of Foreman from size 352 × 288 to 304 × 288. An H.264/AVC codec is used to intra code the frames. NEDI-6 is used for the edge-directed interpolation. Each even row and column requires one bit to indicate whether it is kept or dropped. For example, a frame of size 352 × 288 has 176 even columns and 144 even rows, so 320 bits indicate the dropped even lines. These bits are sent as side information in the coded bitstream. For comparison, other data pruning-based methods using sinc, bicubic, autoregression, and NEDI-4 interpolation are also given.

     The R-D curves are plotted in Fig. 10(a) and their zoomed in parts are plotted in Fig. 10(b). The percentage of bit saving between the H.264/AVC compressed sequence and the NEDI-6 data pruning-based compression at the same values of QI is plotted in Fig. 10(c). The result in Fig. 10(a) shows that the data pruning-based compression using NEDI-6 is better than data pruning-based compression using sinc, bicubic, autoregression and NEDI-4 methods. The data pruning-based compression achieves a better R-D than H.264/AVC in the range 31–41 dB. In this range, at the same bit-rate, the PSNR value of data pruning-based compression is about 0.3–0.5 dB higher than the PSNR value of H.264/AVC compression. At the same PSNR, the data pruning-based compression saves about 5% of the bit-rate compared to the H.264/AVC compression. As shown in Fig. 10(c), at the same QI, the percentage of bit saving is about 4.2%–6.6%. The reconstructed frames using data pruning-based compression with sinc, bicubic, autoregression, NEDI-4 and NEDI-6 methods are shown in Fig. 11(b)–(f) and their zoomed in parts are shown in Fig. 12(b)–(f). The data pruned frames are compressed at a corresponding bit-rate of 1.36 Mbps. The PSNR of the reconstructed frame using data pruning-based compression with sinc, bicubic,
  8. 8. Fig. 10. Comparison results for R-D curves of single frame data pruning-based compression. (a) Whole R-D curves. (b) One zoomed-in part of (a). (c) Percentage of bit saving.
Fig. 11. Comparison of NEDI-6 to other interpolation methods in case of single frame data pruning-based compression. (a) Original. (b) Sinc. (c) Bicubic. (d) Autoregression (37.78 dB). (e) NEDI-4. (f) NEDI-6.
autoregression, NEDI-4 and NEDI-6 methods are 37.79 dB, 37.80 dB, 37.78 dB, 37.42 dB, and 37.91 dB, respectively. The results show that the reconstructed frame using NEDI-6 in Fig. 11 has fewer artifacts than the other methods. Because the reconstructed frames using autoregression and NEDI-4 are not based on the LR pixels located at even rows, the HR pixels are not consistent with each other and cause some artifacts in the teeth areas in Fig. 11(d) and (e).
An additional simulation is performed to analyze the effect of the target PSNR on the pruned frame size and the R-D curve of the data pruning-based compression. The results in Table II show that when the target PSNR decreases, more data is considered for dropping, while the PSNR range with a better R-D curve shrinks. The best case, with the highest average PSNR improvement of […] dB, is obtained when the target PSNR is set to 50 dB. The table also shows that the compressed bit-rate saving increases when the target PSNR decreases.
2) Multiframe Data Pruning-Based Compression: The data pruning approach is applied to video compression. An experiment is performed in which a GOP of 15 frames of Akiyo is pruned with a target PSNR of […] dB. Three downsampling scenarios are considered to determine the best number of lines to be dropped: dropping all even rows, all even columns, or all even rows and columns, followed by interpolation by factors of 1 × 2, 2 × 1, and 2 × 2. Simulation shows that dropping 160 columns and keeping all rows is the best solution, which achieves the most dropped pixels while still keeping the PSNR of the reconstructed frame higher than 45 dB. As a consequence, the frame size is reduced from 720 × 480 to 320 × 480. An H.264 codec is applied with the GOP structure […] and […]. The […] is averaged over the whole GOP, so that the same lines are dropped for all the frames. In this way, the side information to determine the dropped lines is greatly reduced. The extra bit-rate is 1.2 Kbps for the whole
  9. 9. Fig. 12. One zoomed-in part of Fig. 11. (a) Original. (b) Sinc. (c) Bicubic. (d) Autoregression. (e) NEDI-4. (f) NEDI-6.
TABLE II. PSNR COMPARISON (IN dB)
GOP, which again is very small compared to the total bit-rate of the compressed bitstream. For interpolation, single frame-based NEDI-6 is used for the first I frame while multiframe-based NEDI-9 is employed for the following frames. For comparison, the data pruning scheme is applied to the sequence down- and up-sized by 2 × 2 with the uniform sinc interpolation.
The R-D curves are shown in Fig. 13(a), while Fig. 13(b) shows zoomed-in parts. These results show that the R-D curve of the sinc data-pruned method is consistently below the curve of the optimal data pruning method. The proposed method is better than H.264/AVC in the range 32–37.5 dB. The PSNR improvement at the same bit-rate is around 0.3–0.7 dB in this range. As shown in Fig. 13(c), the percentage of bit-rate saving of the optimal data pruning-based compressed sequence is 23%–36% compared to H.264/AVC using the same quantization step size. Even with the same bit-rate and PSNR values, the reconstructed frames have fewer artifacts because they are compressed with a smaller quantization step and […]. Fig. 14 shows the comparison between the H.264/AVC compressed frame and the optimal data pruning-based compressed frame at the quantization levels of 35 and 32, respectively. These sequences have nearly the same bit-rates of 92 Kbps and 94 Kbps and nearly the same PSNR of 37.83 dB and 37.91 dB, respectively, for the H.264/AVC and the proposed data pruning-based compressed sequences. Results show that the proposed data pruning-based compressed frame in Fig. 14(b) has higher visual quality and fewer artifacts than the H.264/AVC compressed frame in Fig. 14(a). This merit can be explained by the interpolation phase, which helps reduce the blocking and ringing artifacts, and by the smaller quantization step level. Because of the 'filter-like' interpolation, the reconstructed sequence at low bit-rate has fewer blocking artifacts than the directly compressed sequence with a high compression level.
Both the PSNR curves and the visual results validate the effectiveness of the proposed data pruning-based compression. The proposed algorithm requires an interpolation step in the data pruning and reconstruction phases, so the complexity of data pruning-based compression is higher than that of normal compression. However, the coding and decoding time of the proposed method decreases proportionally to the size reduction of the data pruned frame. For example, when pruning from the original frame size of 720 × 480 to 320 × 480, both the encoding and decoding times for the data pruned sequence are only 50% of the encoding and decoding times for the original sequence. Additional simulations show that, to further reduce the running time in the encoding phase, a simple interpolator such as a bilinear interpolator can be applied at the data pruning phase in Fig. 2 while nearly keeping the same performance as when using high-order edge-directed interpolators. For structure […], the same data pruning phase as for structure […] can be applied without any modification. The B frames require a smaller number of bits for compression, and the extra bits for indicating the dropped lines become significant compared to the bits for coding the frame. So
  10. 10. Fig. 13. Comparison results for multiframe data pruning-based compression. (a) R-D curves. (b) Zoom in of 13(a). (c) Percentage of bit saving.
Fig. 14. Comparison for H.264/AVC compression and optimal data pruning-based compression with same bit-rate and PSNR values. (a) H.264/AVC. (b) Optimal data pruning-based.
Fig. 15. Comparison for H.264/AVC compression and optimal data pruning-based compression with same bit-rate and PSNR values. (a) H.264/AVC. (b) Optimal data pruning-based.
the R-D improvement using structure […] is better than the R-D improvement using structure […]. All simulation results can be found at http://videoprocessing.ucsd.edu/~dungvo/dataprune.html.
VI. CONCLUSION
The paper proposed a novel data pruning-based compression method to reduce the bit-rate. High-order edge-directed interpolations using more surrounding pixels are also discussed to adapt to different data pruning schemes. The results show that these high-order edge-directed interpolation methods help reduce the jaggedness along strong edges as well as the artifacts at the intersection areas. The proposed optimal data pruning-based compression achieves a better R-D relation than the conventional data pruning-based compression at the low and medium compression levels. The NEDI-6 and NEDI-9 for upsampling only rows can also be applied for de-interleaving. For the same sequence, the R-D performance of single frame data pruning-based compression is much better than the R-D performance of multiframe data pruning-based compression. This is because, with the same target PSNR, a higher percentage of data can be dropped for a single image than for a video sequence. Another reason is that the same rows/columns are dropped over all frames in the GOP and more bits are required to compress the objects moving over the dropped lines.
In future work, the location of the dropped lines should be adaptive to the motion of the moving objects. Instead of using only the pixels at odd indices, high-order edge-directed interpolation methods may use more available pixels to estimate the model parameters more accurately. Additionally, the objective function of the data pruning algorithm may be extended to consider the coding efficiency of dropping these pixels to further improve the R-D curve. A more efficient data pruning-based compression that drops whole frames can also be considered using MCFI methods for video sequences with fast motion.
ACKNOWLEDGMENT
The authors would like to thank Y. Zheng for the interesting discussions at Thomson Corporate Research.
REFERENCES
[1] Advanced Video Coding for Generic Audiovisual Services, 2005.
[2] N. Vasconcelos and F. Dufaux, “Pre and post-filtering for low bit-rate video coding,” in Proc. IEEE Conf. Image Process., Oct. 1997, vol. 1, pp. 291–294.
  11. 11. [3] A. Cavallaro, O. Steiger, and T. Ebrahimi, “Perceptual prefiltering for video coding,” in Proc. IEEE Int. Symp. Int. Multimedia, Video and Speech Processing, Oct. 2004, pp. 510–513.
[4] K. Ratakonda and N. Ahuja, “POCS based adaptive image magnification,” in Proc. IEEE Conf. Image Process., Oct. 1998, vol. 3, pp. 203–207.
[5] Y. Cha and S. Kim, “Edge-forming methods for color image zooming,” IEEE Trans. Image Process., vol. 15, no. 8, pp. 2315–2323, Aug. 2006.
[6] M. Li and T. Q. Nguyen, “Markov random field model-based edge-directed image interpolation,” IEEE Trans. Image Process., vol. 17, no. 7, pp. 1121–1128, Jul. 2008.
[7] X. Li and M. T. Orchard, “New edge-directed interpolation,” IEEE Trans. Image Process., vol. 10, no. 10, pp. 1521–1527, Oct. 2001.
[8] X. Zhang and X. Wu, “Image interpolation by adaptive 2-D autoregressive modeling and soft-decision estimation,” IEEE Trans. Image Process., vol. 17, no. 6, pp. 887–896, Jun. 2008.
[9] L. Zhang and X. Wu, “An edge-guided image interpolation algorithm via directional filtering and data fusion,” IEEE Trans. Image Process., vol. 15, no. 8, pp. 2226–2238, Aug. 2006.
[10] N. Mueller, Y. Lu, and M. N. Do, “Image interpolation using multiscale geometric representations,” in SPIE Conf. Electronic Imaging, Feb. 2007, vol. 6498.
[11] N. Mueller and T. Q. Nguyen, “Image interpolation using classification and stitching,” presented at the IEEE Conf. Image Process., Oct. 2008.
[12] S. C. P. I. K. Tam, “Effective color interpolation in CCD color filter arrays using signal correlation,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 3, pp. 503–513, Jun. 2003.
[13] D. Menon, S. Andriani, and G. Calvagno, “Demosaicing with directional filtering and a posteriori decision,” IEEE Trans. Image Process., vol. 16, no. 1, pp. 132–141, Jan. 2007.
[14] X. Li, “Demosaicing by successive approximation,” IEEE Trans. Image Process., vol. 14, no. 3, pp. 370–379, Mar. 2005.
[15] G. Shen, B. Zeng, Y.-Q. Zhang, and M. L. Liou, “Transcoder with arbitrarily resizing capability,” in Proc. IEEE Int. Symposium Circuits Syst., May 2001, vol. 5, pp. 22–28.
[16] B. Choi, J. Han, C. Kim, and S. Ko, “Motion-compensated frame interpolation using bilateral motion estimation and adaptive overlapped block motion compensation,” IEEE Trans. Image Process., vol. 17, no. 4, pp. 407–416, Apr. 2007.
[17] A. Huang and T. Nguyen, “A multistage motion vector processing method for motion-compensated frame interpolation,” IEEE Trans. Image Process., vol. 17, no. 5, pp. 694–708, May 2008.
[18] S. G. Mallat, A Wavelet Tour of Signal Processing. New York: Academic, 1998.
[19] Q. Shan, Z. Li, J. Jia, and C. Tang, “Fast image/video upsampling,” ACM Transactions on Graphics (SIGGRAPH ASIA 2008), vol. 27, 2008.
Dung T. Võ (S’06–M’09) received the B.S. and M.S. degrees from Ho Chi Minh City University of Technology, Vietnam, in 2002 and 2004, respectively, and the Ph.D. degree from the University of California at San Diego, La Jolla, in 2009. He has been a Fellow of the Vietnam Education Foundation (VEF) since 2005 and has been on the teaching staff of Ho Chi Minh City University of Technology since 2002. He interned at Mitsubishi Electric Research Laboratories (MERL), Cambridge, MA, and Thomson Corporate Research, Princeton, NJ, in the summers of 2007 and 2008, respectively. He has been a senior research engineer at the Digital Media Solutions Lab, Samsung Information Systems America (Samsung US R&D Center), Irvine, CA, since 2009. His research interests are algorithms and applications for image and video coding and postprocessing.
Joel Solé (M’02) received the M.S. degrees in telecommunications engineering from the Technical University of Catalonia (UPC), Barcelona, Spain, and the Ecole Nationale Supérieure des Télécommunications (ENST), Paris, France, in 2001, and the Ph.D. degree from the UPC in 2006. He is currently a member of the technical staff at Corporate Research, Thomson, Inc., Princeton, NJ. Dr. Solé's research interests focus on advanced video coding and signal processing.
Peng Yin (M’02) received the B.E. degree in electrical engineering from the University of Science and Technology of China in 1996 and the Ph.D. degree in electrical engineering from Princeton University, Princeton, NJ, in 2002. She is currently a senior member of the technical staff at Corporate Research, Thomson, Inc., Princeton, NJ. Her current research interest is mainly on image and video compression. Her previous research is on video transcoding, error concealment, and data hiding. She is actively involved in the JVT/MPEG standardization process. Dr. Yin received the IEEE Circuits and Systems Society Best Paper Award for her article in the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY in 2003.
Cristina Gomila (M’01) received the M.S. degree in telecommunication engineering from the Technical University of Catalonia, Spain, in 1997, and the Ph.D. degree from the Ecole des Mines de Paris, France, in 2001. She then joined Thomson, Inc., Corporate Research Princeton, Princeton, NJ. She was a core member in the development of Thomson’s Film Grain Technology and actively contributed to several MPEG standardization efforts, including AVC and MVC. Since 2005, she has managed the Compression Research Group at Thomson CR Princeton. Her current research interests focus on advanced video coding for professional applications.
Truong Q. Nguyen (F’06) is currently a Professor at the ECE Department, University of California at San Diego, La Jolla. He is the coauthor (with Prof. G. Strang) of the popular textbook Wavelets and Filter Banks (Wellesley-Cambridge Press, 1997) and the author of several Matlab-based toolboxes on image compression, electrocardiogram compression, and filter bank design. He has over 200 publications. His research interests are video processing algorithms and their efficient implementation. Prof. Nguyen received the IEEE TRANSACTIONS ON SIGNAL PROCESSING Paper Award (Image and Multidimensional Processing area) for the paper he co-wrote with Prof. P. P. Vaidyanathan on linear-phase perfect-reconstruction filter banks (1992). He received the NSF Career Award in 1995 and is currently the Series Editor (Digital Signal Processing) for Academic Press. He served as Associate Editor for the IEEE TRANSACTIONS ON SIGNAL PROCESSING (1994–1996), the IEEE SIGNAL PROCESSING LETTERS (2001–2003), the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS (1996–1997, 2001–2004), and the IEEE TRANSACTIONS ON IMAGE PROCESSING (2004–2005).
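Illustrative note on the side information of Section V-B. The experiments state that each even row and column needs one bit to signal whether it is kept or dropped, and that in the multiframe case the same lines are dropped for a whole GOP. The following is a minimal sketch of that accounting, not the authors' implementation. The function names are hypothetical, lines are assumed to be indexed from 1 (so a dimension of n has n // 2 even lines), and the 30 frames/s rate is an assumption that the transcript does not state.

```python
def side_info_bits(width: int, height: int) -> int:
    """Flag bits for one frame: one bit per even column and per even row.

    Assumes 1-based line indices, so a dimension of n has n // 2 even lines.
    """
    return width // 2 + height // 2


def side_info_bitrate_bps(width: int, height: int,
                          gop_length: int, frame_rate: float) -> float:
    """Overhead in bit/s when one set of flags is sent per GOP.

    frame_rate is an assumed parameter; it is not given in the text.
    """
    gop_duration_s = gop_length / frame_rate
    return side_info_bits(width, height) / gop_duration_s


if __name__ == "__main__":
    # Single-frame case: a 352 x 288 frame.
    print(side_info_bits(352, 288))                    # 320
    # Multiframe case: 720 x 480 frames, GOP of 15, assumed 30 frames/s.
    print(side_info_bitrate_bps(720, 480, 15, 30.0))   # 1200.0
```

Under these assumptions the multiframe overhead comes to 1200 bit/s, which is in line with the extra bit-rate of 1.2 Kbps quoted for the Akiyo GOP experiment.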