Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Supervised Facial Recognition Based on Multi- Resolution Analysis and Feature Alignment

184 views

Published on

IEEE ARTICULE TEMAS ITI SQL SERVER REDES ANALISIS DISEÑO IMPLEMENTACION ITI ARTICULOS PDF ALGORITMOS GENETICOS IA B.D DOCUMENTACION REPORTE REDES NEURONALES IA INTELIGENCIA ARTIFICIAL ANGEL WHA MIGUEL ANGEL GARCIA WHA UPV UNIVERSIDAD POLITECNICA DE VICTORIA GRAFOS NODOS PROGRAMACION LATEX WORD REPORTE PLANTILLA PRACTICA PROYECTO ITI COMPUTACION SISTEMAS WEB ESCRITORIO

Published in: Technology
  • Be the first to comment

  • Be the first to like this

Supervised Facial Recognition Based on Multi- Resolution Analysis and Feature Alignment

  1. 1. (2) (1) Supervised Facial Recognition Based on Multi- Resolution Analysis and Feature Alignment Ahmed Aldhahab, George Atia, and Wasfy B. Mikhael Department of Electrical Engineering and Computer Science University of Central Florida Orlando, FL USA Aldhahab2012@knights.ucf.edu, George.Atia@ucf.edu, and Wasfy.Mikhael@ucf.edu Abstract— A new supervised algorithm for face recognition based on the integration of Two-Dimensional Discrete Multiwavelet Transform (2-D DMWT), 2-D Radon Transform, and 2-D Discrete Wavelet Transform (2-D DWT) is proposed1 . In the feature extraction step, Multiwavelet filter banks are used to extract useful information from the face images. The extracted information is then aligned using the Radon Transform, and localized into a single band using 2-D DWT for efficient sparse data representation. This information is fed into a Neural Network based classifier for training and testing. The proposed method is tested on three different databases, namely, ORL, YALE and subset fc of FERET, which comprise different poses and lighting conditions. It is shown that this approach can significantly improve the classification performance and the storage requirements of the overall recognition system. I. INTRODUCTION The design of efficient facial recognition systems received renewed attention in recent years on account of the ever- increasing interest in image processing, computer vision, and pattern recognition for security and big data applications. The efficiency of a recognition system depends on several metrics, including the recognition rate, the computational complexity and the storage requirements. There is a large body of literature on image recognition and numerous algorithms have been proposed. We briefly describe some of the techniques that are most relevant to our work. Image classification based on color information, shape and texture for aircraft color images (RGB) is studied in [1]. There, feature extraction is achieved by applying a Wavelet Transform based on Daubechies 4 and the first order color moment to subimages obtained from the original image. The extracted features are then fed into a Neural Network for classification. In [2], the authors present various feature extraction techniques based on Stationary Multiwavelet Transform (SMWT), a Histogram Based method (HBM), and Principle Component Analysis (PCA). For the ORL database, recognition rates of 89.5%, 94.5%, and 63% were reported for the SWMT, HBM, and PCA, respectively. An algorithm based on Gabor Wavelet coefficients fusion is proposed in [3] and applied to the ORL database. Using coefficients fusion, considerable performance gains were observed. Another approach based on PCA and Neural Networks was proposed in [4]. PCA is used to reduce the dimensionality for feature 1 This work was supported in part by NSF grant CCF-1320547. extraction, and then a Neural Network is trained using a back propagation training algorithm (BPTA). This approach was tested on the YALE database and a performance of 96.5% was reported. In this paper, a new algorithm based on the combination of 2-D DMWT, 2-D RT, and 2-D DWT is presented. The proposed system is shown to significantly improve the storage requirement, the computational complexity, the execution time and the classification performance. Our approach is tested on 3 different databases that have a large number of persons, poses, and lighting conditions. The rest of the paper is organized as follows. In Section II, some background about the transform methods is presented. The proposed system is presented in Section III. The experimental results and the analysis of the proposed approach are presented in Section IV. We conclude in Section V. II. BACKGROUND The tradeoff between time and frequency resolutions is a well-recognized characteristic of windowed Fourier Transform. In particular, a window of infinite length leads to good frequency resolution, but no time information. On the other hand, a narrower window leads to better time resolution, but poorer frequency resolution. The Discrete Wavelet Transform (DWT) alleviates this problem by considering two sets of functions, namely, wavelet and scaling functions. A. Discrete Wavelet and Multiwavelet Transform The wavelet and scaling functions are associated with low pass and high pass filters, respectively. An incoming signal x[n] is passed through these filters and half of the samples are eliminated according to the Nyquist’s rule. This process is called one-level decomposition and can be described by [5]: ∑ . 2 ∑ . 2 Where ylow and yhigh are the output from the low and high pass filters, respectively. H[.] and L[.] are the high pass and low pass filters, respectively. x[.] represents the input signal and in this paper will represent the input image. The theory of Multiwavelet is based on the idea of multiresolution analysis (MRA). There, the signal is analyzed at different resolutions. In contrast to the standard scalar wavelet, the Multiwavelet uses several scaling and wavelet 978-1-4799-4132-2/14/$31.00 ©2014 IEEE 137
  2. 2. functions. Consider Multiwavelet case with K scaling and K wavelet functions. In vector form: , , … . . , , … . . Where Φ and Ψ are the multiscaling and wavelet functions, respectively. The dilation and wavelet equations for Multiwavelet can be written as: √2 ∑ 2 √2 ∑ 2 Where and are matrix filters for each integer k [6]. One of the famous Multiwavelet filters is the GHM filter due to Geronimo, Hardian, and Massopust [7]. Their system with 2 consists of two scaling functions and two wavelet functions . For the GHM the multiscaling functions have four scaling matrices , , , and . Also, G for GHM has four wavelet matrices , , , and as shown below [8]: √ √ , √ 0 √ 0 0 √ , 0 0 0 √ √ , √ √ √ √ , 0 √ 0 These matrix filters are used to construct the transformation matrix T, which is applied to the images in the 2-D DMWT. This transformation matrix can be written as: 0 0 0 0 0 0 0 0 The reader is referred to [8, 11] for further details about Multiwavelet analysis. B. Radon Transform The Radon Transform R(z,θ) of an image f(x,y) is defined as a slant or slopping line with angle θ from the y axis and with distance z from the origin [9]. It is a very powerful tool to capture all the useful directional features of images. In image processing, the Radon Transform is the projection of an image f(x, y) along the specified direction or angles. The result of the Radon Transform is represented by the sum of all intensities along these directions or angles. The Radon Transform for a two-dimensional image f(x, y) is: , , Where δ(.) is the Dirac delta function, ∞, ∞ the perpendicular distance from the origin, and 0,2 an angle formed by the distance vector as shown in Fig. 1 below [10]: Fig. 1. The Radon transform III. PROPOSED ALGORITHM The proposed algorithm for face recognition consists of 3 phases: preprocessing, feature extraction, and classification. A. Preprocessing The preprocessing phase consists of two steps: a. The first step aims to convert different image databases with different dimensions into a general common dimension (128×128). In this study, we consider three databases, with dimensions shown in Table I. TABLE I. THE DIMENSIONS OF THE DATABASES Databases ORL YALE Subset fc of FERET Size 112×92 243×320 384×256 Since the databases have different dimensions, and since the algorithm used in this paper requires dimensions that are power of two, all the databases (images) are converted to a common dimension (128×128). b. All images use a uint8 data type, which is not suitable for the transforms used in this algorithm. Therefore, the second step is to convert all images from (X×Y uint8) to (128×128 double). B. Feature extraction To extract useful features from the images of the considered databases we apply 2-D DMWT, 2-D RT, and 2-D DWT. Mutliresolution analysis based on Multiwavelet is used to extract the salient features, which are then aligned using the Radon Transform for efficient data representation. Since the filter banks in MWT are matrix-valued filters, the data needs to be preprocessed to obtain vector inputs for these filters [11]. Herein, we adopt a critically sampled scheme for preprocessing as in [11, 12], suited for the 2-D image data as described next. Algorithm for computing the 2-D DMWT coefficients using a critically sampled scheme for preprocessing: 1. Construction of the T matrix shown in equation (11). 2. Preprocessing the rows: we obtain a matrix from an input image. The odd rows of the resultant matrix correspond to the original matrix and the even rows are (4) Y X f(x,y) R(r, ) r (3) (5) (6) (7) (8) (9) (10) (11) (12) 138
  3. 3. FERETYALEORL properly scaled versions of the input rows [11]. These serve as vector-valued inputs to the aforementioned GHM filter banks. 3. Transformation of the image rows: first, we apply the transformation T to the preprocessed matrix from step 2. Second, we permute the resulting matrix rows by rearranging the row pairs 1, 2 and 5, 6, …, N−3, N−2 after each other in the upper half, then the row pairs 3, 4 and 7, 8… N−1, N below them in the lower half. 4. Preprocessing the Columns: the same preprocessing procedure is repeated here for the columns by transposing the matrix from step 3 and repeating step 2. 5. Transformation of the image Columns: we apply the transformation T to the resultant matrix from step 4, and then permute the resulting matrix rows as in step 3. 6. The matrix from step 5 is transposed and we permute the coefficients for the resultant matrix. The obtained coefficients represent a one level decomposition of the 2-D DMWT. An example of applying 2-D DMWT on the ORL, YALE, and subset fc of FERET databases is shown in Fig. 2. Fig. 2. Application of 2-D DMWT on different databases In Fig. 2, the resultant image is sub divided into four sub bands and each sub band is further divided into four sub images as shown. We can see that the useful information exists only in the first subband, which contains all the energy from the original image. After extracting the features using 2-D DMWT, the Radon Transform is applied to the first subband. The final step of feature extraction is applying the 2-D DWT on each of the resultant sub-images to localize the information into a single band, which is the low frequency band as shown in Fig. 3. Fig. 3. Application of 2-D DWT on the resultant od 2D RT The final matrix is now fed into a Neural Network classifier for training and testing. C. Recognition phase The algorithm we propose for training is based on the Back Propagation Training Algorithm for Neural Networks. The matrix of features is first converted to a 1-D form that is suitable for training the neural network. Since the proposed approach is supervised we need to choose a desired output vector for each database. There are 40 desired output vectors for ORL, 15 desired output vectors for YALE, and 200 desired output vectors for the subset fc of FERET database. The Neural Network consists of three layers, namely, an input layer, a hidden layer and an output layer. IV. EXPERIMENTAL RESULTS The results of the proposed algorithm are presented in this section. The combination of 2-D DMWT, 2-D RT, and 2-D DWT are presented and compared with the combination of 2-D DMWT and 2-D RT and the combination of 2-D DMWT and 2-D DWT. All the algorithms were tested using the ORL, YALE, and subset fc of FERET databases. A. Experimental Results for the ORL Database. The ORL database consists of images for 40 persons, each with 10 different poses. Let P be the number of poses that are used for training. Hence, 10-P poses are used for testing. In the training phase, we use P=1, P=3, and P=5 poses (of the 40 persons). Table II summarizes the performance of the 3 aforementioned algorithms. It is shown that the system that uses the 3 transforms significantly outperforms the other systems. TABLE II. RECOGNITION RATE OF ORL DATABASE NO. of poses used in training mode NO. of poses used in testing mode Recognition rate for training mode Recognition rate for testing mode 2-D DMWT and 2-D RT 2-D DMWT, and 2-D DWT proposed system P=1 P=9 100% 67.78% 72.5% 90.56% P=3 P=7 100% 75.35% 83.21% 96.07% P=5 P=5 100% 83.5% 89.5% 99.5% Size of the resultant matrix 64×64 32×32 32×32 As shown, even when a small number of poses is used, the algorithm achieves good recognition accuracy. As we increase the size of the training data, the performance is improved. The proposed system also reduces the storage requirements. The size of the feature matrix is shown in the last row of Table II. B. Experimental Results for the YALE Database. The YALE database consists of images for 15 persons each with 11 different poses. Therefore, the total number of poses used to test the system is 165 poses. In the training phase, we use P=1, P=3, and P=5 poses. Table III summarizes the performance of the 3 aforementioned algorithms. Again, it is shown that the system that uses the 3 transforms significantly outperforms the other systems. TABLE III. RECOGNITION RATE OF YALE DATABASE NO. of poses used in training mode NO. of poses used in testing mode Recognition rate for training mode Recognition rate for testing mode 2-D DMWT and 2-D RT 2-D DMWT, and 2-D DWT proposed system P=1 P=10 100% 58.67% 60.67% 88.67% P=3 P=8 100% 81.67% 85.83% 97.5% P=5 P=6 100% 87.78% 92.22% 98.89% Size of the resultant matrix 64×64 32×32 32×32 ORL FERETYALE 139
  4. 4. As before, the algorithm achieves good recognition accuracy even with a small number of poses and reduces the storage requirements. The size of the feature matrix is shown in the last row of Table III. C. Experimental Results for the subset fc of FERET Database. The subset fc of FERET database consists of images for 200 persons each with 11 different poses. Therefore, the total number of poses used to test the system is 2200 poses. In the training phase, we use P=1, P=3, and P=5 poses. Table IV summarizes the performance of the 3 aforementioned algorithms. It is shown that the system that uses the 3 transforms significantly outperforms the other systems. TABLE IV. RECOGNITION RATE OF SUBSET FC OF FERET DATABASE NO. of poses used in training mode NO. of poses used in testing mode Recognition rate for training mode Recognition rate for testing mode 2-D DMWT and 2-D RT 2-D DMWT, and 2-D DWT proposed system P=1 P=10 100% 62.25% 64.4% 89.5% P=3 P=8 100% 75.94% 80.56% 95.18% P=5 P=6 100% 84.08% 90.75% 97.91% Size of the resultant matrix 64×64 32×32 32×32 The results exhibit the same behavior as for the previous databases. Specifically, high recognition rates are achieved by the system that adopts the three transforms and the performance can be further improved with more training. The performance of the neural network during the training phase can affect the overall performance of the proposed algorithm. Choosing the number of hidden layers, the number of neurons in the hidden layers, the type of the activation function, and the training function plays an important role on the overall performance of the system. In this study, we used one hidden layer with 1024 neurons for the ORL and YALE databases. For subset fc of FERET we used 2 hidden layers with 2096 and 1024 neurons for the first and the second layer, respectively. The activation function used is the hyperbolic tangent sigmoid and Back Propagation is used for training. We plot the MSE versus the number of iterations for the 3 proposed algorithms during the training phase. An example of the performance of the Neural Network for the ORL database is shown in Fig. 4. It is shown that the proposed system that combines the three transforms achieves the fastest convergence and attains a MSE of 10^-7. V. CONCLUSION In this paper, we proposed a new supervised classification algorithm that uses three consecutive transforms (2-D DMWT, 2-D RT, and 2-D DWT) and a neural network for facial recognition. Combining these transforms led to a significant improvement in the classification performance, as well as the storage requirement and the execution time of the recognition task. An extensive study of the performance of the proposed algorithm was conducted using three databases (ORL, YALE and subset fc of FERET), which have different poses and lighting conditions. The classification rates achieved were 99.5%, 98.667%, and 97.91%. Fig. 4. MSE performance of the NNT REFERENCES [1] M. Lotfi, A. Solimani, A. Dargazany, H. Afzal, and M. Bandarabadi, “Combining Wavelet Transform and Neural Network for Image Classification,” 41st IEEE SSST, pp 44-48, March 2009. [2] T. Ismaeel, A. Kamil, and A. Naji, “Human Face Recognition Using Stationay Multiwavelet Transform,”International Journal of Computer Applicatios, vol. 72, No.1, May 2013, pp 23-32. [3] Z. Delong and Z. Junbin, “Face Recognition Algorithm based on Gabor Wavelet Coefficients Fusion,” IEEE World Congress on Computer Science and Information Engineering, vol. 6, pp 564-567, March 31- April 2 2009. [4] P. Latha, L. Ganesan, and S. Annadurai, “ Face Recognition Using Neural Networks,” SPIJ, vol. 3, issue 5, November 2009, pp 153-160. [5] E. Varadharajan and M. Rajkumari, “Discrete Wavelet Transform Based Spectrum Sensing in Cognitive Radios Using Eigen Filter,” IJAET, vol. III, Issue I, March 2012, pp 389-391. [6] M. Martin and A. Bell, “New Image Compression Techniques Using Multiwavelets and Multiwavelet Packets,” IEEE Transaction on Image Processing, vol. 10, No. 4, pp 500-510, April 2001. [7] J. Geronimo, D. Hardian, and P. Massopust, “Fractal Functions and Wavelet Expansions Based on several Scaling Functions,” Journal of Approximation Theory, vol. 78, September 1994, pp 373-401. [8] V. Strela and A. Walden, “Orthogonal and Biorthogonal Multiwavelets for Signal Denoising and Image Compression”, Proceedings of SPIE, Issue 1, November 1998, pp 96-107. [9] L. Cai and S. Du, “Rotation, Scale and Translation Invariant Image Watermarking Using Radon Transform and Fourier Transform,” IEEE 6th CAS Symposium on Emerging Technologies: Frontiers of Mobile & Wireless Communication, Shanghai. China, pp 281-284, May 31-June 2 2004. [10] P. Rai and P. Khanna, “Gender Classification Using Radon and Wavelet Transforms,” IEEE 5th ICIIS, India, pp 448-451, July 29- Augest 01 2010. [11] V. Strela, P. Heller, G. Strang, P. Topiwala, and C. Heil, “The Application of Multiwavelet Filterbanks to Image Processing,” IEEE Tran. On Image Processing, Vol. 8, No. 4, April 1999, pp 548-563. [12] W. Al-Jouhar and T. Abbas, “Feature Combination and Mapping Using Multiwavelet Transform,” IASJ, AL-Rafidain University College, Baghdad. Iraq, Issue 19, No. 3, 2006, pp 13-34. 2-D DMWT and 2-D DWT 2-D DMWT and 2-D RTThe Proposed System 140

×