OBTAINING SUPER-RESOLUTION IMAGES BY COMBINING LOW-RESOLUTION IMAGES WITH HIGH-FREQUENCY INFORMATION DERIVED FROM TRAINING IMAGES

ABSTRACT

In this paper, we propose a new algorithm to estimate a super-resolution image from a given low-resolution image by adding high-frequency information extracted from natural high-resolution images in a training dataset. The high-frequency information is selected from the training dataset in two steps: first, a nearest-neighbor search, which can be implemented on the GPU, selects the closest images from the training dataset; second, a sparse-representation algorithm estimates a weight parameter for combining the high-frequency information of the selected images. This simple but powerful super-resolution algorithm produces state-of-the-art results: we demonstrate, qualitatively and quantitatively, that it outperforms existing state-of-the-art super-resolution algorithms.



International Journal of Computer Science & Information Technology (IJCSIT) Vol 5, No 2, April 2013
DOI: 10.5121/ijcsit.2013.5202

Emil Bilgazyev, Nikolaos Tsekos, and Ernst Leiss
Department of Computer Science, University of Houston, TX, USA
eismailov2@uh.edu, ntsekos@cs.uh.edu, eleiss@uh.edu

KEYWORDS

Super-resolution, face recognition, sparse representation.

1. INTRODUCTION

Recent advances in electronics, sensors, and optics have led to the widespread availability of video-based surveillance and monitoring systems. Many imaging devices, such as cameras, camcorders, and surveillance cameras, have limited achievable resolution due to factors such as the quality of the lenses and the limited number of sensors in the camera. Increasing the quality of the lenses or the number of sensors also increases the cost of the device, and in some cases the desired resolution may still not be achievable with current technology.
However, many applications, ranging from security to broadcasting, are driving the need for higher-resolution images and videos for better visualization [1]. The idea behind super-resolution is to enhance a low-resolution input image such that both the spatial resolution (the total number of independent pixels within the image) and the pixel resolution (the total number of pixels) are improved.
In this paper, we propose a new approach to estimate a super-resolution image by combining a given low-resolution image with high-frequency information obtained from training images (Fig. 1). A nearest-neighbor search algorithm is used to select the closest images from the training dataset, and a sparse-representation algorithm is used to estimate a weight parameter for combining the high-frequency information of the selected images. The main motivation of our approach is that the high-frequency information helps to obtain sharp edges in the reconstructed images (see Fig. 1).

Figure 1: Depiction of (a) the super-resolution image obtained by combining (b) a given low-resolution image and (c) high-frequency information estimated from a natural high-resolution training dataset.

The rest of the paper is organized as follows: previous work is presented in Section 2, a description of our proposed method in Section 3, the implementation details in Section 4, and experimental results of the proposed algorithm as well as other algorithms in Section 5. Finally, Section 6 summarizes our findings and concludes the paper.

2. PREVIOUS WORK

In this section, we briefly review existing techniques for super-resolution of low-resolution images for general and domain-specific purposes. In recent years, several methods have been proposed that address the issue of image resolution. Existing super-resolution (SR) algorithms can be classified into two classes: multi-frame-based and example-based algorithms [8]. Multi-frame-based methods compute a high-resolution (HR) image from a set of low-resolution (LR) images from any domain [6]. The key assumption of multi-frame-based super-resolution methods is that the input LR images overlap and each LR image contains information beyond that contained in the other LR images.
Multi-frame-based SR methods then combine these sets of LR images into one image so that all information is contained in a single output SR image. Additionally, these methods perform super-resolution with the general goal of improving the quality of the image so that the resulting higher-resolution image is also visually pleasing. Example-based methods compute an HR counterpart of a single LR image from a known domain [2, 13, 18, 10, 14]. These methods learn observed information targeted to a specific domain and thus can exploit prior knowledge to obtain superior results specific to that domain. Our approach belongs to this category; we use a training database to improve the reconstruction output.

Moreover, domain-specific SR methods targeting the same domain differ considerably from each other in the way they model and apply a priori knowledge about natural images. Yang et al. [22] introduced a method to reconstruct SR images using a sparse representation of the input LR images. However, the performance of these example-based SR methods degrades rapidly if the magnification factor is more than 2. In addition, the performance of these SR methods is highly dependent on the size of the training database.
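The first stage of the approach previewed above, selecting the closest training patches within a radius, can be illustrated with a minimal brute-force sketch. Function name and data layout are my own assumptions; the paper runs this search on the GPU over wavelet low bands of image patches.

```python
def radius_neighbors(query, patches, radius):
    """Return the indexes of training patches whose squared Euclidean
    distance to the query patch is within the given radius."""
    hits = []
    for idx, patch in enumerate(patches):
        d2 = sum((q - p) ** 2 for q, p in zip(query, patch))
        if d2 <= radius:
            hits.append(idx)
    return hits

# Toy example: three 2-pixel "patches" stored as flat vectors.
patches = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
print(radius_neighbors([0.5, 0.5], patches, 1.0))  # -> [0, 1]
```

Because every patch is tested independently, this loop parallelizes trivially, which is what makes the GPU implementation mentioned in the paper natural.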
Freeman et al. [8] proposed an example-based learning strategy that applies to generic images, where the LR-to-HR relationship is learned using a Markov Random Field (MRF). Sun et al. [17] extended this approach by using Primal Sketch priors to reconstruct edge regions and corners by deblurring them. The main drawback of these methods is that they require a large database of LR and HR image pairs to train the MRF. Chang et al. [3] used the Locally Linear Embedding (LLE) manifold-learning approach to map the local geometry of HR images to LR images, under the assumption that the manifolds of LR and HR images are similar, and reconstructed an SR image using K neighbors. However, the manifold of synthetic LR images generated from HR images is not similar to the manifold of real-world LR images, which are captured under different environments and camera settings. Also, using a fixed number of neighbors to reconstruct an SR image usually results in blurring effects, such as artifacts at the edges, due to over- or under-fitting.

Figure 2: Depiction of the pipeline for the proposed super-resolution algorithm.

Another family of approaches is derived from the multi-frame-based approach and reconstructs an SR image from a single LR image [7, 12, 19]. These approaches learn the co-occurrence of patches within the image, from which the correspondence between LR and HR is predicted. They cannot be used to reconstruct an SR image from a single LR facial image, due to the limited number of similar patches within a facial image.

SR reconstruction based on wavelet analysis has been shown to be well suited for reconstruction, denoising, and deblurring, and has been used in a variety of application domains including biomedical imaging [11], biometrics [5], and astronomy [20]. In addition, it provides an accurate and sparse representation of images that consist of smooth regions with isolated abrupt changes [16]. In our method, we propose to take advantage of a wavelet-decomposition-based approach in conjunction with compressed-sensing techniques to improve the quality of the super-resolution output.
3. METHODOLOGY

Table 1. Notation used in this paper

    Symbol        Description
    X             Collection of training images
    X_i           i-th training image (in R^{m×n})
    x_{i,j}       j-th patch of the i-th image (in R^{k×l})
    ε             Threshold value
    α_i           Sparse representation of the i-th patch
    ||·||_p       l_p-norm
    D             Dictionary
    W, W⁻¹        Forward and inverse wavelet transforms
    ψ(·), φ(·)    High- and low-frequency information of an image
    [·,·]         Concatenation of vectors or matrices

Let X_i ∈ R^{m×n} be the i-th image of the training dataset X = {X_i : i = 0...N}, and let x_{i,j} ∈ R^{k×l} be the j-th patch of an image X_i = {x_{i,j} : j = 0...M}. The wavelet transform of an image patch x returns low- and high-frequency information:

    W(x) = [φ(x), ψ(x)],                                        (1)

where W is the forward wavelet transform, φ(x) is the low-frequency information, and ψ(x) is the high-frequency information of the image patch x. Taking the inverse wavelet transform of the high- and low-frequency information of the original image (without any processing) returns the original image:

    x = W⁻¹([φ(x), ψ(x)]),                                      (2)

where W⁻¹ is the inverse wavelet transform. If we use the Haar wavelet transform with its coefficients being 0.5 instead of √2 (not a quadrature-mirror filter), then the low-frequency information of an image x is in fact a low-resolution version of x in which four neighboring pixels are averaged; in other words, it is similar to down-sampling x by a factor of 2 with nearest-neighbor interpolation, while the high-frequency information ψ(x) is similar to the horizontal, vertical, and diagonal gradients of x.

Assume that, for a given low-resolution image patch y_i, which is the i-th patch of an image y, we can find a similar patch x_j (j = 0...NM) among the natural image patches. Then, by combining y_i with the high-frequency information ψ(x_j) of the high-resolution patch x_j and taking the inverse wavelet transform, we obtain the super-resolution patch y* (see Fig. 1):

    y* = W⁻¹([y_i, ψ(x_j)]),   ||y_i − φ(x_j)||₂² ≤ ε₀,          (3)

where ε₀ is a small nonnegative value.

It is not guaranteed that we will always find an x_j such that ||y_i − φ(x_j)||₂² ≤ ε₀; thus, we introduce an approach that first selects a few of the closest low-resolution patches φ(x_j) from the training dataset and then estimates a weight for each selected patch, which is used to combine the high-frequency information ψ(x_j) of the training patches.

To find the closest matches to the low-resolution input patch y_i, we use a nearest-neighbor search:

    c = {c_i : ||y_i − φ(x_{c_i})||₂² ≤ ε₁, ∀ x_{c_i} ∈ X},      (4)

where c is a vector containing the indexes c_i of the training patches closest to the input patch y_i, and ε₁ is the radius threshold of the nearest-neighbor search. After selecting the closest matches to y_i, we build two dictionaries from the selected patches x_j: the first is the concatenation of the low-frequency information of the training patches, φ(x_j), and is used to estimate a weight parameter; the second is the concatenation of the high-frequency information of the training patches, ψ(x_j):

    D_i^φ = {φ(x_j) : j ∈ c},   D_i^ψ = {ψ(x_j) : j ∈ c}.        (5)

We use a sparse-representation algorithm [21] to estimate the weight parameter. The sparse representation α_i of the input image patch y_i with respect to the dictionary D_i^φ is used as the weight for fusing the high-frequency information of the training patches (D_i^ψ):

    α_i = argmin_{α_i} ||y_i − D_i^φ α_i||₂ + λ ||α_i||₁.        (6)

The sparse-representation algorithm (Eq. 6) estimates y_i by fusing a few atoms (columns) of the dictionary D_i^φ, assigning non-zero weights to these atoms. The result is the sparse representation α_i, which has only a few non-zero elements.
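To make the weight-estimation step concrete: Eq. (6) calls for an l1-regularized solver, as in [21]. As a hedged stand-in, the sketch below uses simple matching pursuit (my own simplification, not the solver used in the paper) to obtain a sparse weight vector, and then reuses those weights to fuse high-frequency atoms. Dictionary atoms are stored as flat lists.

```python
def greedy_sparse_weights(y, atoms, n_nonzero=2):
    """Sparse weights alpha with y ~ sum_j alpha[j] * atoms[j],
    via matching pursuit (illustrative stand-in for an l1 solver)."""
    residual = list(y)
    alpha = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        # pick the atom most correlated with the current residual
        best, best_corr = 0, 0.0
        for j, atom in enumerate(atoms):
            corr = sum(r * a for r, a in zip(residual, atom))
            if abs(corr) > abs(best_corr):
                best, best_corr = j, corr
        norm2 = sum(a * a for a in atoms[best]) or 1.0
        w = best_corr / norm2
        alpha[best] += w
        residual = [r - w * a for r, a in zip(residual, atoms[best])]
    return alpha

def fuse_high_freq(high_atoms, alpha):
    """Combine high-frequency atoms with the weights estimated on the
    low band (the role alpha plays when reconstructing the patch)."""
    out = [0.0] * len(high_atoms[0])
    for weight, atom in zip(alpha, high_atoms):
        for i, v in enumerate(atom):
            out[i] += weight * v
    return out
```

With an orthogonal toy dictionary, e.g. `greedy_sparse_weights([2.0, 3.0], [[1.0, 0.0], [0.0, 1.0]])`, the weights recover the exact coefficients, and `fuse_high_freq` applies them to the paired high-frequency atoms.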
In other words, the input image patch y_i can be represented by combining a few atoms of D_i^φ (y_i ≈ D_i^φ α_i) with the weight parameter α_i; similarly, the high-frequency information of the training patches, D_i^ψ, can be combined with the same weight parameter α_i to estimate the unknown high-frequency information of the input image patch y_i:

    y_i* = W⁻¹([y_i, D_i^ψ α_i]),                                (7)

where y_i* is the output (super-resolution) image patch and W⁻¹ is the inverse wavelet transform.

Figure 2 depicts the pipeline of the proposed algorithm. In the training step, we extract patches from the high-resolution training images and compute low-frequency information (which becomes the low-resolution training image patches) and high-frequency information for each patch. In the reconstruction step, given an input low-resolution image y, we extract a patch y_i and find its nearest neighbors c within the radius ε₁ (this can be sped up using a GPU). From the selected neighbors c we construct the low-frequency dictionary D_i^φ and the high-frequency dictionary D_i^ψ: the low-frequency dictionary is used to estimate the sparse representation α_i of the input low-resolution patch y_i with respect to the selected neighbors, and the atoms (columns) of the high-frequency dictionary are fused using the sparse representation α_i as the weight parameter. Finally, taking the inverse wavelet transform W⁻¹ of the given low-resolution image patch y_i together with the fused high-frequency information yields the super-resolution patch y_i*. Iteratively repeating the reconstruction step (red-dotted block in Fig. 2) for each patch of the low-resolution image y, we obtain the super-resolution image y*.

4. IMPLEMENTATION DETAILS

In this section we explain the implementation details. As pointed out in Section 3, for each training image we extract patches X_i = {x_{i,j} : j = 0...M} with x_{i,j} ∈ R^{k×l}.
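Patch extraction can be sketched as follows; the gradient-based filtering that this section goes on to describe is folded in, with a sum of absolute finite differences standing in for the paper's gradient-variation measure (the threshold argument plays the role of ε₂; names are mine).

```python
def grad_energy(patch):
    """Sum of absolute horizontal and vertical finite differences,
    a stand-in for the gradient-variation measure used to drop
    flat patches."""
    g = 0.0
    for r in range(len(patch)):
        for c in range(len(patch[0]) - 1):
            g += abs(patch[r][c + 1] - patch[r][c])
    for r in range(len(patch) - 1):
        for c in range(len(patch[0])):
            g += abs(patch[r + 1][c] - patch[r][c])
    return g

def extract_patches(img, k, thresh):
    """Overlapping k x k patches (sliding window, one-pixel step);
    patches below the gradient threshold are skipped."""
    h, w = len(img), len(img[0])
    kept = []
    for r in range(h - k + 1):
        for c in range(w - k + 1):
            patch = [row[c:c + k] for row in img[r:r + k]]
            if grad_energy(patch) >= thresh:
                kept.append(((r, c), patch))
    return kept
```

On a mostly flat toy image, only windows covering an intensity change survive the filter, which is the storage-reduction effect the section describes.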
The number M depends on the window function, which determines how the patches are selected. There are two ways to select patches from an image: selecting distinct patches, where two consecutive patches do not overlap, and selecting overlapping patches (sliding window), where two consecutive patches overlap. Since the l₂-norm in the nearest-neighbor search is sensitive to shifts, we slide the window by one pixel in the horizontal or vertical direction, so two consecutive patches overlap by (k−1)×l or k×(l−1) pixels, where x_{i,j} ∈ R^{k×l}. Storing these patches requires an enormous amount of space, N × (m−k) × (n−l) × k × l, where N is the number of training images and X_i ∈ R^{m×n}. For example, with 1,000 natural images in the training dataset, each with a resolution of 1000×1000 pixels, storing 40×40 patches requires 1.34 TB of storage, which would be inefficient and computationally expensive. To reduce the number of patches, we remove patches that contain no or very few gradients, keeping only patches with ||∇x_{i,j}||₂² ≥ ε₂, where ∇ is the sum of gradients along the vertical and horizontal directions (∇x_{i,j} = ∂x_{i,j}/∂x + ∂x_{i,j}/∂y) and ε₂ is the threshold value that filters out patches with little gradient variation. Similarly, we calculate the gradients of the input low-resolution patches y_i; if they are below the threshold ε₂, we upsample them using bicubic interpolation and perform no super-resolution reconstruction on those patches. To improve computation speed, the nearest-neighbor search can be performed on the GPU, and since all low-resolution patches are processed independently of each other, multi-threaded processing can be used for each super-resolution patch reconstruction.

In the wavelet transform, we used [0.5, 0.5] for the low-pass filter and [−0.5, 0.5] for the high-pass filter, from which the 2-D filters are created. These filters are not quadrature-mirror filters (they are nonorthogonal); thus, during the inverse wavelet transform we need to multiply the output by 4. The reason for choosing these filter values is that the low-frequency information (analysis part) of the forward wavelet transform is then the same as down-sampling the signal by a factor of 2 with nearest-neighbor interpolation, which is what the nearest-neighbor search uses.

During the experiments, all color images are converted to YCbCr, and only the luminance component (Y) is used. For display, the blue- and red-difference chroma components (Cb and Cr) of the input low-resolution image are up-sampled and combined with the super-resolution image to obtain the color image y* (Fig. 2).

Note that we can reduce the storage space for the patches to zero by extracting the patches of the training images during reconstruction. This can be accomplished by changing the neighbor-search algorithm, and it can be implemented on the GPU.
During the neighbor search, each GPU thread is assigned to extract the low-frequency information φ(x_{l,j}) and high-frequency information ψ(x_{l,j}) at an assigned position j of a training image X_l; we compute the distance to the input low-resolution image patch y_i, and if the distance is within the threshold ε₁, the GPU thread returns the high-frequency information ψ(x_{l,j}), which is used to construct D_i^ψ:

    D_i^ψ = {ψ(x_{c_i}) : ||y_i − φ(x_{c_i})||₂² ≤ ε₁, ∀ x_{c_i} ∈ X}.   (8)

As the threshold (radius) value for the nearest-neighbor search we used 0.5 for natural images and 0.3 for facial images. Both the low-frequency information of the training image patches and the input image patches are normalized before calculating the Euclidean distance. We selected these values experimentally, as the values at which we obtained the highest SNR and lowest MSE. The Euclidean distance in the nearest-neighbor search is sensitive to noise, but in our approach the main goal of this step is only to reduce the number of training patches considered close to the input patch. Thus, we take a relatively high threshold value for the nearest-neighbor search, select the closest matches, and then perform the sparse representation on them. Note that the sparse-representation estimation (Eq. 6) estimates the input patch from the training patches in a way that accounts for noise [14]. Reducing the storage space slightly increases the super-resolution reconstruction time, since the wavelet transform is computed during the reconstruction.
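The filter pair described in this section ([0.5, 0.5] low-pass, [−0.5, 0.5] high-pass, with the ×4 gain applied at synthesis) can be sanity-checked with a short sketch; function names are mine, and the image is a list of lists with even dimensions. The low band comes out as the exact 2×2 block average, matching the nearest-neighbor down-sampling claim.

```python
def haar05_forward(img):
    """Single-level 2-D Haar-style transform with 0.5 coefficients.
    Returns the low band and the (horizontal, vertical, diagonal)
    detail bands, each half-size."""
    h, w = len(img), len(img[0])
    low, lh, hl, hh = [], [], [], []
    for r in range(0, h, 2):
        lo, d1, d2, d3 = [], [], [], []
        for c in range(0, w, 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            lo.append(0.25 * (a + b + d + e))   # 2x2 block average
            d1.append(0.25 * (-a + b - d + e))  # horizontal detail
            d2.append(0.25 * (-a - b + d + e))  # vertical detail
            d3.append(0.25 * (a - b - d + e))   # diagonal detail
        low.append(lo); lh.append(d1); hl.append(d2); hh.append(d3)
    return low, (lh, hl, hh)

def haar05_inverse(low, details):
    """Inverse of haar05_forward; summing the four quarter-scale
    subband terms realizes the x4 synthesis gain the text requires."""
    lh, hl, hh = details
    n, m = len(low), len(low[0])
    img = [[0.0] * (2 * m) for _ in range(2 * n)]
    for r in range(n):
        for c in range(m):
            L, H1, H2, H3 = low[r][c], lh[r][c], hl[r][c], hh[r][c]
            img[2 * r][2 * c] = L - H1 - H2 + H3
            img[2 * r][2 * c + 1] = L + H1 - H2 - H3
            img[2 * r + 1][2 * c] = L - H1 + H2 - H3
            img[2 * r + 1][2 * c + 1] = L + H1 + H2 + H3
    return img
```

For `[[1.0, 2.0], [3.0, 4.0]]` the low band is `[[2.5]]` (the block average), and the inverse transform reproduces the input exactly, mirroring Eqs. (1)-(2).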
5. EXPERIMENTAL RESULTS

We performed experiments on a variety of images to test the performance of our approach (HSR) as well as other super-resolution algorithms: BCI [1], SSR [22], and MSR [8]. We conducted two types of experiments.

First, we experimented on the Berkeley Segmentation Dataset 500 [15]. It contains natural images, which are divided into two groups: the first group (Fig. 3(a)) is used to train the super-resolution algorithms (except BCI), and the second group (Fig. 3(b)) is used to test their performance. To measure the performance of the algorithms, we use the mean-square error (MSE) and the signal-to-noise ratio (SNR), which measure the difference between the ground-truth and reconstructed images.

The second type of experiment is performed on facial images (Fig. 4), where a face recognition system is used as the metric to demonstrate the performance of the super-resolution algorithms.

Figure 3: Depiction of Berkeley Segmentation Dataset 500 images used for (a) training and (b) testing the super-resolution algorithms.

Figure 4: Depiction of facial images used for (a) training and (b) testing the super-resolution algorithms.
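The paper does not spell out its SNR and MSE formulas; the sketch below uses the common definitions (MSE as the mean squared pixel difference, SNR as 10·log10 of signal power over error power), which should be treated as an assumption rather than the authors' exact implementation.

```python
import math

def mse(ref, est):
    """Mean squared error between two equal-size grayscale images."""
    n = sum(len(row) for row in ref)
    return sum((x - y) ** 2
               for rr, re in zip(ref, est)
               for x, y in zip(rr, re)) / n

def snr_db(ref, est):
    """Signal-to-noise ratio in dB: reference power over error power."""
    sig = sum(x * x for row in ref for x in row)
    err = sum((x - y) ** 2
              for rr, re in zip(ref, est)
              for x, y in zip(rr, re))
    return float("inf") if err == 0 else 10.0 * math.log10(sig / err)
```

Higher SNR and lower MSE both indicate a reconstruction closer to the ground truth, which is how Table 5.1 should be read.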
5.1 Results on Natural Images

In Figure 5, we show the output of the proposed super-resolution algorithm, BCI, SSR, and MSR. The red rectangle is zoomed in and displayed in Figure 6. In this figure we focus on the effect of super-resolution algorithms on low-level patterns (the fur of the bear). Most super-resolution algorithms tend to improve the sharpness of edges along the borders of objects, which looks good to human eyes, while low-level patterns are ignored. One can see that the output of BCI is smooth (Fig. 5(b)), and from the zoomed-in region (Fig. 6(b)) it can be noticed that the edges along the border of the object are smoothed; similarly, the pattern inside the region is also smooth. This is because BCI interpolates neighboring pixel values in the low-resolution image to introduce new pixel values in the high-resolution image. This is the same as taking the inverse wavelet transform of a given low-resolution image with its high-frequency information set to zero; thus the reconstructed image will not contain any sharp edges. The result of MSR has sharp edges, but it contains block artifacts (Fig. 5(c)): the edges around the border of the object are sharp, but the patterns inside the region are smoothed and block artifacts are introduced (Fig. 6(c)). On the other hand, the result of SSR does not contain sharp edges along the border of the object, but it contains sharper patterns compared to BCI and MSR (Fig. 5(d)). The result of the proposed super-resolution algorithm has sharp edges, sharp patterns, and fewer artifacts than the other methods (Fig. 5(e) and Fig. 6(e)), and visually it is more similar to the ground-truth image (Fig. 5(f) and Fig. 6(f)).

Figure 7 shows the performance of the super-resolution algorithms on a different image with fewer patterns.
One can see that the output of BCI is still smooth along the borders, while inside the region it is clearer. The output of MSR looks better for images with fewer patterns, where it tends to reconstruct the edges along the borders.

Figure 5: Depiction of low-resolution, super-resolution, and original high-resolution images. (a) Low-resolution image, (b) output of BCI, (c) output of SSR, (d) output of MSR, (e) output of the proposed algorithm, and (f) original high-resolution image. The solid red rectangles represent the regions that are magnified and displayed in Figure 6 for better visualization. One can see that the output of the proposed algorithm has sharper patterns compared to the other SR algorithms.
Figure 6: Depiction of a region (red rectangle in Figure 5) of (a) the low-resolution image and the output of (b) BCI, (c) SSR, (d) MSR, (e) the proposed algorithm, and (f) the original high-resolution image. Notice that the proposed algorithm produces sharper patterns compared to the other SR algorithms.

Figure 7: Depiction of low-resolution, super-resolution, and original high-resolution images. (a) Low-resolution image, (b) output of BCI, (c) output of SSR, (d) output of MSR, (e) output of the proposed algorithm, and (f) original high-resolution image. The solid yellow and red rectangles represent the regions that are magnified and displayed on the right side of each image for better visualization. One can see that the output of the proposed algorithm has better visual quality compared to the other SR algorithms.
In the output of SSR, one can see that the edges on the borders are smooth and that the interior regions have ringing artifacts. The SSR algorithm builds dictionaries from high-resolution and low-resolution image patches by reducing the number of atoms (columns) of the dictionaries, under the constraint that the dictionaries can still represent the image patches in the training dataset with minimal error. This is similar to compression or dimension reduction, where we try to preserve the structure of the signal rather than its details, so artifacts can appear during reconstruction.[1]

[1] The lower frequencies of a signal contribute more to the difference between the original and reconstructed signals than the higher frequencies. For example, if we remove the DC component (0 Hz) from either the original or the reconstructed signal, the difference between them becomes very large. Thus, keeping the lower frequencies helps to preserve the structure and keeps the difference between the original and reconstructed signals small.

We also computed the average SNR and MSE to measure the performance of the super-resolution algorithms quantitatively. Table 5.1 lists the average SNR and MSE values for BCI, SSR, MSR, and HSR. Notice that the proposed algorithm achieves the highest signal-to-noise ratio and the lowest mean squared error.

Table 5.1: Experimental Results

    Dist. Metric |   BCI      SSR      MSR      HSR
    SNR (dB)     |   23.08    24.76    18.46    25.34
    MSE          |   5.45     5.81     12.01    3.95

5.2. Results on Facial Images

We conducted experiments on the Surveillance Cameras Face database (SCface) [9]. This database contains 4,160 static images of 130 subjects. The images were acquired in an uncontrolled indoor environment using five surveillance cameras of various qualities and ranges. For each of these cameras, one image of each subject was acquired at three distances: 4.2 m, 2.6 m, and 1.0 m. Another set of images was acquired with a mug-shot camera: nine images per subject, ranging from the left to the right profile in equal steps of 22.5 degrees, including a frontal mug-shot image at 0 degrees. The database contains images in the visible and infrared spectrum, and the images from cameras of different quality mimic real-world conditions. The high-resolution images are used as the gallery (Fig. 4(a)), while the images captured by a visible-spectrum camera at a distance of 4.2 m are used as the probe (Fig. 4(b)). Since the SCface dataset consists of two types of images, high-resolution images and surveillance images, we used the high-resolution images to train the SR methods and the surveillance images as probes.

We used the sparse representation-based face recognition method proposed by Wright et al. [21] to test the performance of the super-resolution algorithms. It has been shown that the performance of face recognition systems relies on the low-level information (high-frequency information) of facial images [4]. High-level information, i.e., the overall structure of the face, affects the performance of face recognition systems less than low-level information, unless we compare human faces with objects of very different structure, such as monkeys, lions, or cars.
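The classifier of Wright et al. [21] codes a probe image as a sparse combination of gallery images and assigns the class with the smallest reconstruction residual. The sketch below illustrates only the residual-based decision rule, not the sparse coding itself: with a single gallery vector per subject, the l1 solver is replaced by an independent per-class least-squares fit. All subject IDs and feature vectors are hypothetical, and the paper's actual features are not assumed.

```python
def classify_by_residual(gallery, probe):
    """Assign the probe to the class whose (single) gallery vector best
    reconstructs it in the least-squares sense, i.e. minimal ||y - c*a||."""
    def residual(atom, y):
        # Least-squares coefficient for a single atom: c = <a, y> / <a, a>.
        c = sum(a * b for a, b in zip(atom, y)) / sum(a * a for a in atom)
        return sum((b - c * a) ** 2 for a, b in zip(atom, y)) ** 0.5
    return min(gallery, key=lambda cid: residual(gallery[cid], probe))

# Hypothetical 3-D feature vectors, one gallery image per subject.
gallery = {"s1": [1.0, 0.0, 0.0], "s2": [0.0, 1.0, 1.0], "s3": [1.0, 1.0, 0.0]}
probe = [0.1, 0.9, 1.1]  # closest in direction to subject s2

print(classify_by_residual(gallery, probe))  # s2
```

Note that with several gallery images per class, the residual would instead be computed against the best sparse combination of all gallery columns, which is what makes the full method of Wright et al. robust.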
Most human faces have similar structures: two eyes, one nose, two eyebrows, etc. In low-resolution facial images, the edges (high-frequency information) around the eyes, eyebrows, nose, mouth, etc., are lost, which decreases the performance of face recognition systems [21]. Even for humans it is very difficult to recognize a person from a low-resolution image (see Fig. 4). Figure 8 depicts the given low-resolution images and the corresponding super-resolution outputs. The rank-1 face recognition accuracies for LR, BCI, SSR, MSR, and the proposed algorithm are 2%, 18%, 13%, 16%, and 21%, respectively. Overall, the face recognition accuracy is low, but compared to the face recognition performance on the low-resolution images, we can conclude that the super-resolution algorithms improve the recognition accuracy.

6. CONCLUSION

We have proposed a novel approach to reconstruct super-resolution images for better visual quality as well as for face recognition purposes, which can also be applied to other fields. We presented a sparse representation-based SR method to recover the high-frequency components of an SR image.

Figure 8: Depiction of the LR images and the SR outputs: (a) low-resolution image, output of (b) BCI, (c) SSR, and (d) the proposed method. For this experiment we used a patch size of 10×10 pixels; increasing the patch size introduces ringing artifacts, which can be seen in the reconstructed image (d). Quantitatively, in terms of face recognition, our proposed super-resolution algorithm outperforms the other super-resolution algorithms.

We demonstrated the superiority of our method over existing state-of-the-art super-resolution methods for the task of face recognition in low-resolution images obtained from real-world surveillance data, as well as better performance in terms of MSE and SNR.
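For reference, the MSE and SNR comparisons follow standard definitions; the paper does not spell out its exact convention, so the sketch below assumes the common one (mean squared pixel error, and mean signal power over error power in dB) on small hypothetical pixel rows rather than real images. The last lines also illustrate the earlier footnote: removing the DC component (mean) from one of the two signals inflates the error dramatically.

```python
import math

def mse(ref, rec):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((r - s) ** 2 for r, s in zip(ref, rec)) / len(ref)

def snr_db(ref, rec):
    """Signal-to-noise ratio in dB: reference signal power over error power."""
    signal_power = sum(r * r for r in ref) / len(ref)
    return 10.0 * math.log10(signal_power / mse(ref, rec))

# Hypothetical reference and reconstructed pixel rows (0-255 range).
ref = [100, 120, 130, 140, 150, 160]
rec = [101, 119, 131, 138, 151, 159]

print(round(mse(ref, rec), 2))     # 1.5
print(round(snr_db(ref, rec), 2))  # 40.83

# Subtracting the DC component from only one signal makes the error explode,
# because the low frequencies carry most of the signal's structure.
dc = sum(rec) / len(rec)
rec_no_dc = [p - dc for p in rec]
print(mse(ref, rec_no_dc) > 1000 * mse(ref, rec))  # True
```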
We conclude that by having more than one training image per subject we can significantly improve the visual quality of the proposed super-resolution output, as well as the recognition accuracy.

REFERENCES

[1] M. Ao, D. Yi, Z. Lei, and S. Z. Li. Handbook of Remote Biometrics, chapter Face Recognition at a Distance: System Issues, pages 155–167. Springer London, 2009.
[2] S. Baker and T. Kanade. Hallucinating faces. In Proc. IEEE International Conference on Automatic Face and Gesture Recognition, pages 83–88, Grenoble, France, March 28-30, 2002.
[3] H. Chang, D. Y. Yeung, and Y. Xiong. Super-resolution through neighbor embedding. In Proc. IEEE International Conference on Computer Vision and Pattern Recognition, pages 275–282, Washington DC, 27 June-2 July 2004.
[4] G. Chen and W. Xie. Pattern recognition using dual-tree complex wavelet features and SVM. In Proc. Canadian Conference on Electrical and Computer Engineering, pages 2053–2056, 2008.
[5] A. Elayan, H. Ozkaramanli, and H. Demirel. Complex wavelet transform-based face recognition. EURASIP Journal on Advances in Signal Processing, 10(1):1–13, Jan. 2008.
[6] S. Farsiu, M. D. Robinson, M. Elad, and P. Milanfar. Fast and robust multiframe super resolution. IEEE Transactions on Image Processing, 13(10):1327–1344, 2004.
[7] W. T. Freeman, T. R. Jones, and E. C. Pasztor. Learning low-level vision. International Journal of Computer Vision, 40(1):25–47, 2000.
[8] W. T. Freeman and C. Liu. Markov Random Fields for Super-Resolution and Texture Synthesis, chapter 10, pages 1–30. MIT Press, 2011.
[9] M. Grgic, K. Delac, and S. Grgic. SCface - surveillance cameras face database. Multimedia Tools and Applications, 51(3):863–879, 2011.
[10] P. H. Hennings-Yeomans. Simultaneous Super-Resolution and Recognition. PhD thesis, Carnegie Mellon University, Pittsburgh, PA, USA, 2008.
[11] J. T. Hsu, C. C. Yen, C. C. Li, M. Sun, B. Tian, and M. Kaygusuz. Application of wavelet-based POCS super-resolution for cardiovascular MRI image enhancement. In Proc. International Conference on Image and Graphics, pages 572–575, Hong Kong, China, Dec. 18-20, 2004.
[12] K. I. Kim and Y. Kwon. Single-image super-resolution using sparse regression and natural image prior. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(6):1127–1133, 2010.
[13] C. Liu, H. Y. Shum, and C. S. Zhang. A two-step approach to hallucinating faces: global parametric model and local nonparametric model. In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 192–198, San Diego, CA, USA, Jun. 20-26, 2005.
[14] J. Mairal, M. Elad, and G. Sapiro. Sparse representation for color image restoration. IEEE Transactions on Image Processing, 17(1):53–69, Jan. 2008.
[15] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proc. 8th International Conference on Computer Vision, volume 2, pages 416–423, July 2001.
[16] G. Pajares and J. M. Cruz. A wavelet-based image fusion tutorial. Pattern Recognition, 37(9):1855–1872, Sep. 2004.
[17] J. Sun, N. N. Zheng, H. Tao, and H. Shum. Image hallucination with primal sketch priors. In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 2, pages 729–736, Madison, WI, Jun. 18-20, 2003.
[18] J. Wang, S. Zhu, and Y. Gong. Resolution enhancement based on learning the sparse association of image patches. Pattern Recognition Letters, 31(1):1–10, Jan. 2010.
[19] Q. Wang, X. Tang, and H. Shum. Patch based blind image super-resolution. In Proc. IEEE International Conference on Computer Vision, Beijing, China, Oct. 17-20, 2005.
[20] R. Willet, I. Jermyn, R. Nowak, and J. Zerubia. Wavelet-based super resolution in astronomy. In Proc. Astronomical Data Analysis Software and Systems, volume 314, pages 107–116, Strasbourg, France, 2003.
[21] J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma. Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(2):210–227, Feb. 2009.
[22] J. Yang, J. Wright, T. Huang, and Y. Ma. Image super-resolution as sparse representation of raw image patches. In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Anchorage, AK, Jun. 23-28, 2008.
