1152 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 20, NO. 4, APRIL 2011

Face Recognition System Using Multiple Face Model of Hybrid Fourier Feature Under Uncontrolled Illumination Variation

Wonjun Hwang, Haitao Wang, Hyunwoo Kim, Member, IEEE, Seok-Cheol Kee, and Junmo Kim, Member, IEEE

Abstract—The authors present a robust face recognition system for large-scale data sets taken under uncontrolled illumination variations. The proposed face recognition system consists of a novel illumination-insensitive preprocessing method, a hybrid Fourier-based facial feature extraction, and a score fusion scheme. First, in the preprocessing stage, a face image is transformed into an illumination-insensitive image, called an "integral normalized gradient image," by normalizing and integrating the smoothed gradients of a facial image. Then, for feature extraction of complementary classifiers, multiple face models based upon hybrid Fourier features are applied. The hybrid Fourier features are extracted from different Fourier domains in different frequency bandwidths, and each feature is individually classified by linear discriminant analysis. In addition, multiple face models are generated from plural normalized face images that have different eye distances. Finally, to combine the scores from the multiple complementary classifiers, a log-likelihood ratio-based score fusion scheme is applied. The proposed system is evaluated with the face recognition grand challenge (FRGC) experimental protocols, a large publicly available data set. Experimental results on the FRGC version 2.0 data sets show that the proposed method achieves an average verification rate of 81.49% on 2-D face images under various environmental variations such as illumination changes, expression changes, and time lapse.

Index Terms—Face recognition, face recognition grand challenge, feature extraction, preprocessing, score fusion.

I.
INTRODUCTION

Automatic face recognition is an important vision task with many practical applications such as biometrics, video surveillance, image retrieval, and human-computer interaction. One major issue for face recognition is how to ensure recognition accuracy for a large data set captured in various conditions. Several face data sets have been collected to compare different algorithms with the same protocols on the same data. They include the face recognition technology (FERET) [1], face recognition vendor test (FRVT) [2], [3], and face authentication test [4].

[Manuscript received October 07, 2009; revised March 12, 2010 and August 26, 2010; accepted September 06, 2010. Date of publication October 04, 2010; date of current version March 18, 2011. This work was supported in part by the National Research Foundation of Korea under Grants 20100013490 and 20100029372. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Arun Ross. W. Hwang is with the Mechatronics and Manufacturing Technology Center, Samsung Electronics Co., 443-742 Suwon, Korea (e-mail: wj.hwang@samsung.com). H. Wang is with the SAIT Beijing Laboratory, Samsung Advanced Institute of Technology, Beijing 100102, China (e-mail: ht.wang@samsung.com). H. Kim is with the Department of New Media, Korean German Institute of Technology, Seoul 121-270, Korea (e-mail: hwkim@kgit.ac.kr). S.-C. Kee is with the Electronic R&D Center, Mando Corporation, Gunpo City 449-901, Korea (e-mail: sckee@mando.com). J. Kim is with the Department of Electrical Engineering, KAIST, Daejeon 305-701, Korea (e-mail: junmo@ee.kaist.ac.kr). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIP.2010.2083674]
Most recently, the face recognition grand challenge (FRGC) [5] has been designed to improve the accuracy of recognition systems on a large-scale data set, particularly focused on verification of the person rather than identification. The FRGC data set contains face images collected in different settings (controlled studio versus uncontrolled illumination conditions) with two different facial expressions (neutral versus smiling) taken over several months.

In FRGC, the main issue is how to match two face images of the same person under different conditions: one is taken in a controlled studio setting while the other is captured under uncontrolled illumination conditions such as hallways, atria, or outdoors. To overcome the uncontrolled environmental problems, we introduce a systematic approach that combines multiple classifiers with complementary features instead of improving the accuracy of a single classifier. Illumination-insensitive preprocessing and a score fusion technique are incorporated into the proposed face recognition system.

A. Preprocessing

Illumination variation is the main obstacle for face recognition, since face image appearances of the same person change under different illuminations. Sometimes the differences between images of the same person under different illuminations are greater than those between different persons under the same illumination. This problem is serious in face recognition, especially when appearance-based methods are applied: even a small variation in illumination direction can significantly change the appearance-based face model. A number of preprocessing algorithms [6]-[10] that minimize the effect of illumination changes for face recognition have been developed, and many of these developments rely on 3-D face model training stages.

Belhumeur and Kriegman [6] proved that face images with the same pose under different illumination conditions form a convex cone, called an illumination cone.
Ramamoorthi and Hanrahan [7] applied spherical harmonic representation to explain the low dimensionality of differently illuminated face images. The synthesis and recognition results of the illumination cone and spherical harmonics cast light on robust face recognition under various illuminations. Shashua and Riklin-Raviv
[8] designed a quotient image algorithm for dealing with illumination changes in face recognition. It is a practical algorithm for extracting an illumination-invariant representation. However, the application range of those algorithms is limited by the requirement of a 3-D face model, since such a model is not available in many practical applications due to modeling complexity and costs.

Based upon Land's Retinex [11], Jobson et al. [12] and Gross and Brajovic [13] developed reflectance estimation methods using the ratio of an original image to its smoothed version. The difference between the two Retinex-based algorithms is that Jobson's filter is isotropic while Gross and Brajovic's filter is anisotropic. Since those approaches do not need a 3-D or 2-D model, they are relatively simple to implement and are generic. Similarly, Wang et al. [9] introduced the self-quotient image (SQI) method that extracts intrinsic, illumination-invariant features from a face image based upon the quotient image technique. The authors assumed that the intrinsic part of the image is mainly located in a high-frequency region and that, according to the Lambertian model, the intrinsic part and the extrinsic part can be theoretically analyzed. The SQI method can remove most of the shaded parts of a face image. However, this method applies direct division operations to image intensity and produces some noise and halo effects in step regions. Recently, Li et al.
[10] presented an image-based technique that employs the logarithmic total variation model to factorize each of two aligned face images into an illumination-dependent component and an illumination-invariant component.

In this paper, the integral normalized gradient image (INGI) method is proposed to overcome unexpected illumination changes in face recognition with limited side effects such as image noise and the halo effect. Based upon intrinsic and extrinsic factor definitions, we first normalize the gradients with a smoothed image and then integrate the division results with the anisotropic diffusion method [14]. Through this procedure, we can compensate for unexpected artifacts and obtain smoother and more natural output images for face recognition.

B. Feature Extraction

Features to be used for person classification are extracted to identify any invariance in the face images against environmental changes. In this application, appearance-based subspace representations have been implemented for face recognition. They include principal component analysis (PCA) [15], local feature analysis (LFA) [16], linear discriminant analysis (LDA) [17], and independent component analysis (ICA) [18]. Those methods are derived from ensemble statistics of the given training images and are easy to implement for practical applications. In addition, these methods require no heavy computation beyond vector projections, while structural schemes such as dynamic link architecture [19] and elastic bunch graph matching [20] perform intense template matching on relatively high-resolution images to localize fiducial points near the eyes, nose, mouth, etc. Recently, Chang et al. [21] demonstrated that sufficient discriminatory information persists at ultra-low resolution to enable a computer to recognize specific human faces.

Kumar et al.
[22] developed a frequency-domain feature of face images for recognition by proposing a cross-correlation method based upon the fast Fourier transform. Savvides et al. [23] further extended the correlation filter and developed a hybrid PCA-correlation filter called "Corefaces" that performs robust illumination-tolerant face recognition. Xie et al. [24] applied a nonlinear correlation filter for redundant class-dependence feature analysis in the frequency domain. Another successful approach based upon Fourier features is the advanced face descriptor (AFD) [25], which was promoted as an international standard by the moving picture experts group (MPEG) after long-term competitions for a new face descriptor for MPEG-7 [26]. This approach is designed for the metadata of small face images in multimedia content with compact descriptor sizes, and its algorithms were derived from Fourier and intensity features based upon the cascaded LDA method.

Recently, there have been many studies [27]-[30] on face recognition for the FRGC. For example, Yang and Liu [27] presented a color image discriminant (CID) model that seeks to unify the color image representation and recognition tasks into one framework. The proposed models have two sets of variables: a set of color component combination coefficients for color image representation and multiple projection basis vectors for color image discrimination. Liu and Liu [28] proposed hybrid color space-based face recognition, which consists of patch-based Gabor classifiers on a high-resolution image, local binary pattern (LBP) feature-based classifiers, and discrete cosine transform (DCT)-based classifiers on a low-resolution image. Tan and Triggs [29] suggested the use of heterogeneous feature sets such as Gabor wavelets and LBP features, each projected into a reduced-dimension space by PCA. Su et al. [30] proposed a hierarchical framework mixing global and local classifiers.
The global classifier is based upon Fourier features originating from a low-resolution image while the local classifiers are organized by a combination of patch-based Gabor features from a high-resolution facial image.

Current studies on face recognition can be largely classified into two classes of approaches: local feature-based methods and global feature-based methods. The local feature-based methods [28]-[30] analyze a plurality of local features, such as Gabor wavelet features extracted from a high-resolution image, but most of these approaches impose a heavy computational burden on the target device, in particular on mobile devices, which have low computational power. On the other hand, the global feature-based approaches [27]-[30] extract features, for example Fourier and DCT features, from a low-resolution image. Because the basic framework of the global feature-based methods is simple, they are well suited to devices with low computational power, in spite of their limited accuracy compared with the local feature-based methods.

In this paper, we extend the AFD [25] to handle a large number of uncontrolled face images effectively. The learning process is done independently over various selected frequency bands using a 2-D discrete Fourier transform. This feature extraction framework is introduced in order to remove frequency parts that are unnecessary for face recognition. Three types of Fourier feature domain, namely the concatenated real and imaginary components, the Fourier spectrum, and
the phase angle, are represented. Three different frequency bandwidths are also designed to extract more complementary frequency features. As for complementary features, Wang and Tang [31] proved that random sampling of feature vectors and training samples improves classifier performance. We do not depend upon random feature sampling but instead design several complementary features based upon frequency analysis. Moreover, we construct multiple face models, three face models with different eye distances within a regular face image region. While component-based schemes [32], [33] perform local analysis of important facial components, our proposed multiple face models focus on compensatory face models in imitation of human perception, which uses both internal features (e.g., the mutual spatial configuration of facial components) and external facial features (e.g., hair and jaw-line) [34] for face recognition. With multiple complementary features and classifiers, we can expect to achieve a more general face recognition system than is possible with a single feature and classifier.

C. Score Fusion

As we have a set of complementary classifiers, we build a unified classifier combining them. The purpose here is to construct a strong classifier by suitably combining a set of classifiers. To this end, we want to keep as much of the information each classifier extracts as possible, and at the same time the combination should be easy to implement. The information each classifier extracts is well summarized in the score that classifier produces. Hence, combining the classifiers can be achieved by processing the set of scores produced by the component classifiers and generating a new single score value.
We call this process "score fusion." Previous methods for score fusion include the sum rule, product rule, weighted sum, Bayesian method, and voting [35]-[37].

In this paper, we consider a score fusion method based upon a probabilistic approach, namely the log-likelihood ratio (LLR), for face recognition. If the ground-truth distributions of the scores are known, LLR-based score fusion is optimal. However, the true distributions are unknown, so we have to estimate them. We propose a simple approximation of the optimal score fusion based upon parametric estimation of the score distributions from the training data set.

The rest of this paper is organized as follows. Illumination-insensitive representation as a preprocessing method is presented in Section II. LDA based upon Fourier features and the proposed algorithm, hybrid Fourier feature-based LDA from multiple face models, are explained in Sections III and IV, respectively. In Section V, log-likelihood ratio-based score fusion is presented. The experimental results and discussions are given in Section VI. Last, the conclusion is summarized in Section VII.

II. ILLUMINATION INSENSITIVE REPRESENTATION

Compared to the controlled illumination changes in the studio (indoors, same day, overhead), achieving high recognition accuracy in an uncontrolled illumination situation (outdoors, different days) is hard, as mentioned in the FRVT 2002 report [3]. The main reason is that the image distortion caused by illumination changes makes images of different persons in the same illumination conditions more similar than images of the same person under various illumination changes. Consequently, we propose a novel illumination-invariant preprocessing algorithm¹ based upon local analysis, as opposed to global analysis such as histogram equalization, to deal with the uncontrolled illumination situation.

Fig. 1. Under the assumption of the Lambertian reflectance model, an image consists of a 3-D shape, texture, and illumination.

A.
Illumination Analysis

Assuming the Lambertian reflectance model [8], the grayscale intensity image I(x, y) of a 3-D object is represented by

I(x, y) = ρ(x, y) n(x, y)^T s    (1)

where ρ(x, y) is the surface texture associated with point (x, y) in the image, n(x, y) is the surface normal direction (shape) associated with point (x, y) in the image, and s is the light source direction whose magnitude is the light source intensity. This relationship is illustrated in Fig. 1.

Except for the nose region, most parts of a human face are relatively flat and continuous, and faces of different persons have similar 3-D shapes n(x, y). That is, imposing a different person's facial texture on the 3-D shape of a generic face does not seriously affect the identity of each person. From the viewpoint of face recognition, how to obtain the raw facial texture information is important. The quotient image method [8] has taken advantage of this property for extracting illumination-invariant features. We can also regard n(x, y)^T s as the illumination-sensitive part of the imaging model.

We define ρ(x, y) as the intrinsic factor and n(x, y)^T s as the extrinsic factor for face recognition. The intrinsic factor is illumination free and represents the identity of a face image, whereas the extrinsic factor is very sensitive to illumination variations, and only partial identity information is included in the 3-D shape n(x, y). Furthermore, the illumination problem is a well-known ill-posed problem: without any additional assumption or constraint, no analytical solution can be deduced from an input 2-D image alone. Previous approaches, such as the illumination cone [6] and the spherical harmonic method [7], directly take the 3-D shape n(x, y) as a known parameter or as parameters that can be estimated from training data. However, these approaches are not available in many practical systems, because the predefined 3-D face model is complex, and it does not always work properly under unexpected conditions.
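As a concrete illustration, the intrinsic/extrinsic decomposition of (1) can be exercised with a small numpy sketch; the albedo, normals, and light vectors below are illustrative values, not data from the paper:

```python
import numpy as np

# Lambertian model of (1): I(x, y) = rho(x, y) * n(x, y)^T s.
# rho is the intrinsic texture; n^T s is the extrinsic, illumination-sensitive factor.
rng = np.random.default_rng(0)
h, w = 4, 4
rho = rng.uniform(0.2, 1.0, (h, w))        # intrinsic surface texture
n = np.zeros((h, w, 3))
n[..., 2] = 1.0                            # flat surface: normals face the camera

def render(s):
    """Render the image under light vector s (its magnitude is the intensity)."""
    return rho * np.clip(n @ s, 0.0, None)  # per-pixel n^T s, clipped at shadow

img_a = render(np.array([0.0, 0.0, 1.0]))  # frontal light, unit intensity
img_b = render(np.array([0.0, 0.0, 0.5]))  # same light direction, half intensity

# The ratio of the two renderings cancels rho and isolates the extrinsic factor.
ratio = img_b / img_a
```

Dividing two renderings of the same surface cancels the intrinsic factor ρ and leaves only the extrinsic scaling, which is the property that the quotient-image family of methods exploits.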
In the case of the quotient image algorithm, though it does not need 3-D information, it works under the assumption of a face image illuminated by a single point light source without shadow [9].

Our definitions of intrinsic and extrinsic factors are also based upon the Lambertian reflectance model with point lighting sources. This definition can be extended to any type of

¹The preprocessing algorithm has also been presented in [38] by the authors.
lighting source by a combination of point lighting sources as follows:

I(x, y) = ρ(x, y) n(x, y)^T Σ_k s_k    (2)

Fig. 2. Edge information of the image is very sensitive to illumination changes. (a) Sample images under different illumination conditions. (b) Corresponding edge maps.

According to the previously mentioned assumptions, it is obvious that an illumination-insensitive image can be obtained by enhancing the intrinsic factor and suppressing the extrinsic factor in the input image; the proposed approach is based upon this idea.

B. Integral Normalized Gradient Image (INGI)

We make the following assumptions: 1) most of the intrinsic factor lies in the high spatial frequency domain, and 2) most of the extrinsic factor lies in the low spatial frequency domain. Considering the first assumption, one might use a high-pass filter to extract the intrinsic factor, but it has been proved [39] that this kind of filter is not robust to illumination variations, as shown in Fig. 2. In addition, a high-pass filtering operation risks removing some of the useful intrinsic factor. Hence, we propose an alternative approach, namely, a gradient operation. The gradient operation is written as

∇I = ∇(ρ n^T s) ≈ (n^T s) ∇ρ    (3)

where the approximation comes from the assumptions that both the surface normal direction (shape) and the light source direction vary slowly across the image, whereas the surface texture varies fast. The scaling factor n^T s is the extrinsic factor of our imaging model. Fig. 3 shows sample gradient maps. The Retinex methods [12], [13] and the SQI method [9] used smoothed images as the estimate of this extrinsic factor. We use the same approach to estimate the extrinsic part

Ŝ(x, y) = (W * I)(x, y)    (4)

where W is a smoothing kernel and * denotes convolution. To overcome the illumination sensitivity, we normalize the gradient map ∇I = (I_x, I_y) with the following equation:

N_x = I_x / Ŝ,  N_y = I_y / Ŝ    (5)

Fig. 3. Structure of the integral normalized gradient image.

Fig. 4.
Toy examples of reconstruction methods: (a) the original image, (b) the image recovered by the isotropic method, and (c) the image recovered by the anisotropic method.

Because Ŝ can be taken as an estimate of the extrinsic factor, the illumination effect is reduced in the gradient map after this normalization.

After the normalization, the texture information in the normalized image is still not apparent enough, as shown in Fig. 3. In addition, the division operation of (5) may intensify unexpected noise terms. To recover the rich texture and remove the noise at the same time, we integrate the normalized gradients N_x and N_y with the anisotropic diffusion method [14], [38], which we explain in the following, and finally acquire the reconstructed image shown in Fig. 3.

C. Image Reconstruction

The last INGI procedure is to recover a grayscale image from the normalized gradient maps. Given an initial grayscale value at one point of an image, we can estimate the grayscale Î(x, y) at any point by an integration method, such as the iterative isotropic diffusion

Î^{t+1}(x, y) = Î^t(x, y) + λ Σ_{d∈{N,S,E,W}} [∇_d Î^t(x, y) − N_d(x, y)]    (6)

where t is the iteration number, Î^t is the iteratively reconstructed image, ∇_d Î^t(x, y) is the finite difference toward the neighbor in direction d, N_d(x, y) is the component of the normalized gradient field (N_x, N_y) along direction d, and usually the initialization is

Î^0(x, y) = I(x, y)    (7)

However, this isotropic method has one shortcoming: it blurs the step-edge regions of an image, as shown in Fig. 4(b). To
overcome this weakness, we adopt an anisotropic approach [14]:

Î^{t+1}(x, y) = Î^t(x, y) + λ Σ_{d∈{N,S,E,W}} c_d(x, y) [∇_d Î^t(x, y) − N_d(x, y)]    (8)

where c_d(x, y) is the edge-preserving conductance coefficient of [14] and λ is a scaling factor that controls the update speed. If λ is too large, we may not get stable results; it is set to 0.25 in this paper.

Compared with the result in Fig. 4(b), the image recovered in Fig. 4(c) is more edge-preserving and numerically stable. As complete removal of the illumination variations can lead to loss of useful information for face recognition, we fuse the reconstructed image with the original input image. The final image can be derived by

I_out(x, y) = α I(x, y) + (1 − α) Î(x, y)    (9)

where Î is the final reconstructed image from (8) and α is the weighting parameter, 0 ≤ α ≤ 1.

III. LDA BASED UPON FOURIER FEATURE

In this paper, we propose an extended Fourier feature-based recognition scheme for FRGC motivated by the MPEG-7 AFD [25].

A. 2-D Discrete Fourier Features for Face Recognition

The Fourier transform has played a key role in image processing applications for many years because of its wide range of possibilities [40]. To analyze facial features in the Fourier domain, we apply the Fourier transform to a spatial face image. A 2-D face image f(x, y) of size M × N can be transformed into the frequency domain by

F(u, v) = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) exp(−j2π(ux/M + vy/N))    (10)

where u = 0, 1, ..., M−1 and v = 0, 1, ..., N−1 are frequency variables. The Fourier transform of a real function is generally complex; that is,

F(u, v) = R(u, v) + jI(u, v)    (11)

where R(u, v) and I(u, v) are the real and imaginary components of F(u, v), respectively. The magnitude function, called the Fourier spectrum, is

|F(u, v)| = [R²(u, v) + I²(u, v)]^{1/2}    (12)

and the phase function is defined as

φ(u, v) = arctan(I(u, v) / R(u, v))    (13)

To represent the image as a feature vector, the magnitude coefficients from (12) are widely used instead of the phase values. This is largely because a small spatial displacement in an image changes the phase values drastically while the magnitude varies smoothly when there is no compensation for the phase shift [41].
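The three feature domains described here are straightforward to compute with a standard FFT. The sketch below uses numpy on an illustrative window image; the helper name is ours, not the paper's:

```python
import numpy as np

# Hedged sketch of the three Fourier-feature domains: concatenated
# real/imaginary parts, the magnitude spectrum of (12), and the cosine
# of the phase of (13).
def fourier_features(img):
    F = np.fft.fft2(img)                                     # 2-D DFT, as in (10)
    x_ri = np.concatenate([F.real.ravel(), F.imag.ravel()])  # RI feature
    x_spec = np.abs(F).ravel()                               # Fourier spectrum
    x_phase = np.cos(np.angle(F)).ravel()                    # cosine of phase
    return x_ri, x_spec, x_phase

img = np.outer(np.hanning(8), np.hanning(8))
x_ri, x_spec, x_phase = fourier_features(img)

# The magnitude is invariant to circular displacement while the phase is not,
# which is the sensitivity argument made for spectrum features in the text.
shifted = np.roll(img, 1, axis=1)
same_spec = np.allclose(np.abs(np.fft.fft2(img)), np.abs(np.fft.fft2(shifted)))
```

The `same_spec` check mirrors the misalignment argument: shifting the image leaves the spectrum untouched, so magnitude features tolerate imprecise eye localization better than raw phase.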
In the case of face recognition, this displacement often occurs when an eye detector finds imprecise eye positions and causes a slight misalignment of the face image at the normalization stage. On the other hand, Savvides et al. [42] recently showed that the phase coefficient-based method is invariant to illumination variations, is tolerant to occlusion problems, and that eigenphases give better results than eigenfaces.

In this paper, we introduce three different Fourier features extracted from the real and imaginary component domain, the Fourier spectrum domain, and the phase angle domain. To avoid the angular wrap-around problem in the phase angle, we actually use the cosine values of (13):

x^{RI} = [R(u, v), I(u, v)]    (14)

x^{S} = |F(u, v)|    (15)

x^{P} = cos φ(u, v)    (16)

B. LDA

LDA [17] is a supervised learning method that finds a linear projection into a subspace; it maximizes the between-class scatter while minimizing the within-class scatter of the projected data. According to this objective, two scatter matrices, the between-class scatter matrix S_B and the within-class scatter matrix S_W, are defined as

S_B = Σ_{i=1}^{C} N_i (m_i − m)(m_i − m)^T    (17)

S_W = Σ_{i=1}^{C} Σ_{x_k ∈ C_i} (x_k − m_i)(x_k − m_i)^T    (18)

where the set of training data {x_k} has C total classes, m is the sample mean for the entire data set, m_i is the sample mean for the i-th class C_i, and N_i is the number of samples in class C_i.

To maximize the between-class scatter and minimize the within-class scatter, the transformation matrix W_opt is formulated as

W_opt = argmax_W |W^T S_B W| / |W^T S_W W|    (19)

In face recognition, when dealing with high-dimensional image data, the within-class scatter matrix S_W is often singular [6]. To overcome this problem, PCA is first applied to the sample data to reduce its dimensionality. In this paper, we call this combination PCLDA.
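A minimal PCLDA sketch, assuming a plain PCA-then-LDA pipeline on toy Gaussian classes (the paper's training data, dimensions, and protocol are not reproduced here):

```python
import numpy as np

# PCLDA: PCA first so the within-class scatter is nonsingular, then LDA
# via the scatter matrices of (17)-(18) and the criterion of (19).
def pclda(X, y, pca_dim, lda_dim):
    mu = X.mean(axis=0)
    Xc = X - mu
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)   # PCA basis
    P = Vt[:pca_dim].T
    Z = Xc @ P                                          # data in PCA subspace
    m = Z.mean(axis=0)
    Sb = np.zeros((pca_dim, pca_dim))
    Sw = np.zeros_like(Sb)
    for c in np.unique(y):                              # scatter matrices
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        Sb += len(Zc) * np.outer(mc - m, mc - m)
        Sw += (Zc - mc).T @ (Zc - mc)
    # maximize |W^T Sb W| / |W^T Sw W| via eigenvectors of Sw^{-1} Sb
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(evals.real)[::-1]
    W = evecs[:, order[:lda_dim]].real
    return P @ W, mu        # combined transform and mean, as used at test time

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.1, (20, 10)) for c in (0.0, 1.0, 2.0)])
y = np.repeat([0, 1, 2], 20)
W, mu = pclda(X, y, pca_dim=5, lda_dim=2)
proj = (X - mu) @ W
```

With three classes, S_B has rank at most two, so at most two discriminative directions exist; the PCA step merely guarantees that S_W is invertible before the generalized eigenproblem is solved.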
Fig. 5. Feature regions are selected according to different frequency bands in the Fourier features. The upper left point (0, 0) of all quadrangles is the lowest frequency, and the notations are B1, B2, and B3.

IV. MULTIPLE FACE MODELS BASED UPON HYBRID FOURIER FEATURE

In this paper, we apply the complementary scheme [31] to the face recognition system with selective frequency bandwidths and multiple face models based upon different eye distances [43]. To gain more powerful discriminating features, we extract multiblock Fourier features [25]. We first divide an input image into several blocks and then apply a 2-D discrete Fourier transform to each block. The Fourier features extracted from the blocks by band selection rules are finally concatenated.

A. Frequency Band Selection in Fourier Feature

From the viewpoint of psychophysiology and neurophysiology, the human recognition system selectively utilizes the frequency information of a face image depending upon the recognition task. For example, Sinha et al. [34] observed that humans can recognize familiar faces in low-resolution images and that high-frequency information by itself does not lead to good face recognition performance. According to a survey by Chellappa et al. [44], low, band-pass, and high frequency components may play different roles depending upon the specific recognition task. For example, gender classification can be accomplished using only low-frequency components, whereas the identification task needs more high-frequency components. As to hybrid images, also known as "Dr. Angry and Mr. Smile," Schyns and Oliva [45] demonstrated that low and high spatial frequencies are perceived differently at different viewing distances.
In this regard, human beings have the ability to select the frequency band relevant to their purposes.

Returning to the image-based face recognition system: since the dimensionality of a face image is large, an image may contain less informative signal that disturbs the recognition procedure. For this reason, analysis of the face model in the Fourier frequency domain provides more opportunities to select useful information, assuming that we know a priori which frequency bands are important. In this paper, we propose three frequency bands (B1, B2, and B3), as shown in Fig. 5.

The proposed band selection uses different frequency ranges, each beginning at the lowest frequency, because high frequency by itself does not play a pivotal role in identifying face images. B1, whose range is from 0 to 1/6 and from 5/6 to 1 of the image size, focuses on lower frequency information, for example, the outline structure of a face image. Second, B2 covers higher frequency information, ranging from 0 to 2/6 and from 4/6 to 1, so that we can analyze finer descriptions of a face image excluded from the former band. The full-frequency band B3 finally spans the image's full description.

Fig. 6. Ratios of between-class to within-class variability of the Fourier features, defined as r = σ²_B / σ²_W, where σ²_B is the variance of the class means and σ²_W is the sum of the variances within each class [18]. The darker the region, the lower the discriminative power, and vice versa. (a) (Discriminant power of the real component + discriminant power of the imaginary component)/2. (b) Fourier spectrum. (c) Phase angle.

In terms of the average discriminant powers of Fig. 6, the lower frequency band generally has good separability for all three types of Fourier features. The real and imaginary components are best, while the phase angle information has the lowest average discriminant powers.
This figure reveals that all the proposed Fourier features and their corresponding frequency bands have different characteristics.

B. Hybrid Fourier Feature for Face Recognition

We have three different Fourier feature domains, namely the real and imaginary component (RI) domain, the Fourier spectrum domain, and the phase angle domain. Now we present how we apply the three frequency band selections B1, B2, and B3 to the three Fourier feature domains. The discriminant powers shown in Fig. 6 give a hint for how to apply the band selection to each Fourier domain. The RI domain has more powerful descriptions to distinguish faces than the other domains, so we apply B3 to it. On the other hand, the spectrum and phase domains do not make use of the highest frequency region, because the discriminating power of the highest frequency parts in these Fourier domains is small. Moreover, the higher frequency information of the phase angle is sensitive to small spatial changes, and thus only B1 is adopted for it. In this respect, this selection procedure for phase information is a kind of compensation for the susceptible phase coefficients. Consequently, B1 and B2 are additionally used to describe the face model with the spectrum domain features. The whole procedure of the proposed method is summarized in Fig. 7. All Fourier features are independently projected into discriminative subspaces by PCLDA. For example, one output of the RI feature with band B1 is derived by

y_{B1}^{RI} = W_{B1}^{RI T} (x_{B1}^{RI} − m_{B1}^{RI})    (20)

where W_{B1}^{RI} is the transformation matrix of PCLDA learned from a given training set and m_{B1}^{RI} is its mean vector. In sequence, the three outputs of the different frequency bands are concatenated as follows:

y^{RI} = [y_{B1}^{RI}; y_{B2}^{RI}; y_{B3}^{RI}]    (21)
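Under one plausible reading of the band assignment above (RI over B1, B2, and B3; spectrum over B1 and B2; phase over B1), the band selection and concatenation can be sketched as follows. The corner-quadrant masks, the 12 × 12 image, and the omission of the learned PCLDA projections of (20) are all simplifications for illustration:

```python
import numpy as np

# Corner-quadrant mask for a low-frequency band in the unshifted DFT layout:
# the four corners of the spectrum hold the lowest frequencies.
def band_mask(shape, frac):
    h, w = shape
    keep = np.zeros(shape, dtype=bool)
    ku, kv = int(h * frac), int(w * frac)
    keep[:ku, :kv] = True
    keep[:ku, w - kv:] = True
    keep[h - ku:, :kv] = True
    keep[h - ku:, w - kv:] = True
    return keep

# Band fractions follow the text: B1 keeps 0-1/6 and 5/6-1 of each axis,
# B2 keeps 0-2/6 and 4/6-1, and B3 (frac = 1/2) keeps everything.
def hybrid_feature(img):
    F = np.fft.fft2(img)
    def ri(mask):
        return np.concatenate([F.real[mask], F.imag[mask]])
    y_ri = np.concatenate([ri(band_mask(F.shape, f)) for f in (1/6, 2/6, 1/2)])
    y_spec = np.concatenate([np.abs(F)[band_mask(F.shape, f)] for f in (1/6, 2/6)])
    y_phase = np.cos(np.angle(F))[band_mask(F.shape, 1/6)]
    return np.concatenate([y_ri, y_spec, y_phase])   # augmented feature

z = hybrid_feature(np.outer(np.hanning(12), np.hanning(12)))
```

In the full system each band-limited feature would first pass through its own learned PCLDA projection before concatenation; the sketch skips that step to show only the band bookkeeping.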
Fig. 7. Structure of hybrid Fourier based upon PCLDA.

In the other domains, the outputs are calculated likewise:

y_|F| = [(y_|F|^B1)^T, (y_|F|^B2)^T]^T    (22)

y_Θ = y_Θ^B1    (23)

The final augmented feature consists of the three complementary features, and the notation is

z = [y_RI^T, y_|F|^T, y_Θ^T]^T    (24)

In the end, z is projected into another PCLDA space in order to reduce the dimensionality of the augmented feature:

y = W_z^T (z - m_z)    (25)

C. Multiple Face Model for Robust Face Recognition

In computer vision tasks, internal facial components have been commonly employed, because external features (e.g., hair) are too variable for face recognition. However, in the case of humans, Sinha's results [34] showed that both internal and external facial cues are important; moreover, the human visual system sometimes makes strong use of the overall head shape to determine facial identity. In this respect, we propose a multiple face model that consists of three face models with different eye distances in the same image size. It is designed to imitate the human visual system and examines a face image from the internal facial components to the external facial shape. Three face models of the same image size, 46 × 56, are constructed with three different eye distances, and the integral normalized gradient image method is applied to each normalized face image. We finally have fine, dominant, and coarse face models, as shown in Fig. 8. The fine face model is formed to analyze the internal components of a face, such as the eyes, nose, and mouth, while the coarse face model includes the general structure of a face and external components such as hair, ears, and jaw-line.

Fig. 8. Structure of Fourier-based LDA with multiple face models.
The last one, the dominant face model, is a compromise between the fine model and the coarse model. Since each captures its own aspects of a face, every face model can play a complementary role for the others in the face recognition system. For example, the fine face model is robust to background and hair-style changes but sensitive to pose changes; the coarse face model shows the opposite tendency.

In the end, we have three different classifiers, and each similarity score is calculated by a normalized correlation. The normalized score between two features y_1 and y_2 in the k-th classifier is defined as

s_k(y_1, y_2) = (y_1 · y_2) / (||y_1|| ||y_2||)    (26)

V. SCORE FUSION BASED UPON LOG-LIKELIHOOD RATIO

In this section, we present two ways to calculate the score fusion: an equal error rate (EER)-based method and an LLR-based method. The EER-based score fusion computes a weighted sum of scores, where each weight is a measure of the discriminating power of the component classifier. The LLR-based score fusion, on the other hand, is motivated by Bayesian theory, and LLR-based score fusion is known to be optimal in the Neyman-Pearson sense [37].

A. Score Fusion Based Upon Weighted Sum Method

One way to combine the scores is to compute a weighted sum as follows:

s = Σ_k w_k s_k    (27)

where the weight w_k is the amount of confidence we have in the k-th classifier and its score s_k. In this work, we use 1/EER as a measure of such confidence. Thus, we have the new score

s = Σ_k (1/EER_k) s_k    (28)

B. Score Fusion Based Upon Log-Likelihood Ratio

We interpret the set of scores as a feature vector from which we perform the classification task. Suppose we have a set of
scores s_1, …, s_n computed by n classifiers. Now the problem is to decide whether the query-target pair is from the same person or not based upon these scores. We can cast this problem as the following hypothesis test:

H_0: (s_1, …, s_n) ~ p(s_1, …, s_n | diff)
H_1: (s_1, …, s_n) ~ p(s_1, …, s_n | same)    (29)

where p(s_1, …, s_n | diff) is the distribution of the scores when the query and target are from different persons, and p(s_1, …, s_n | same) is the distribution of the scores when the query and target are from the same person.

Fig. 9. (a) and (b) depict the score distributions of two classifiers. The upper plots are (a) p(s_1 | diff) and (b) p(s_2 | diff); the lower plots are (a) p(s_1 | same) and (b) p(s_2 | same). In (c), the blue and red colors are p(s_1, s_2 | same) and p(s_1, s_2 | diff), respectively.

Fig. 9 gives an example of such distributions and provides the intuition behind the benefits of using multiple scores generated by multiple classifiers. Suppose we have two classifiers that produce two scores s_1 and s_2. Fig. 9(a) shows p(s_1 | diff) and p(s_1 | same), the distributions of a single score. The region between the two vertical lines is where the two distributions overlap. Intuitively speaking, a classification error can occur when the score falls within this overlapped region, and the smaller this overlapped region, the smaller the probability of classification error. Likewise, Fig. 9(b) shows p(s_2 | diff) and p(s_2 | same). Fig. 9(c) shows how the pair of the two scores is distributed. The upper left of Fig. 9(c) is the scatter plot of (s_1, s_2) when the query and target are from the same person, and the upper right of Fig. 9(c) is the scatter plot of (s_1, s_2) when the query and target are from different persons. The bottom of Fig. 9(c) shows how the two scatter plots overlap. Compared with Fig.
9(a) and (b), we can see that the probability of overlap can be reduced by jointly considering the two scores s_1 and s_2, which suggests that hypothesis testing based upon the two scores s_1 and s_2 is better than hypothesis testing based upon a single score s_1 or s_2.

If we know the two densities p(s_1, …, s_n | diff) and p(s_1, …, s_n | same), the log-likelihood ratio test achieves the highest verification rate for a given false accept rate, according to the Neyman-Pearson lemma [46]:

log [ p(s_1, …, s_n | same) / p(s_1, …, s_n | diff) ] ≷ τ    (30)

However, the true densities p(s_1, …, s_n | diff) and p(s_1, …, s_n | same) are unknown, so we need to estimate them by observing scores computed from query-target pairs in the training data. One way to estimate these densities is to use a nonparametric density estimate, such as the Parzen density estimate [37]. In this work, we use parametric density estimation in order to avoid over-fitting and to reduce computational complexity. In particular, we model the distribution of s_k given "diff" as a Gaussian random variable with mean m_k^diff and variance (σ_k^diff)^2, and model s_1, …, s_n given "diff" as independent Gaussian random variables with density

p(s_1, …, s_n | diff) = Π_{k=1}^{n} N(s_k; m_k^diff, (σ_k^diff)^2)    (31)

where N(s; m, σ^2) is the Gaussian density function. The parameters (m_k^diff, σ_k^diff) are estimated from the scores of the k-th classifier corresponding to non-match query-target pairs in the training database. Similarly, we approximate the density of s_1, …, s_n given "same" by Π_{k=1}^{n} N(s_k; m_k^same, (σ_k^same)^2), and the parameters (m_k^same, σ_k^same) are computed from the scores of the k-th classifier corresponding to match query-target pairs in the training database.

Now we define the fused score to be the log-likelihood ratio, which is given by

s = Σ_{k=1}^{n} [ (s_k - m_k^diff)^2 / (2 (σ_k^diff)^2) - (s_k - m_k^same)^2 / (2 (σ_k^same)^2) ] + c    (32)

where c is a constant. Note that the new score is a quadratic function of the original set of scores.

C. Comparison of the Two Score Fusion Methods

Let us briefly compare the two score fusion methods. The weighted sum method based upon the EER is heuristic but intuitively appealing and easy to implement. In addition, this method has the advantage of being robust to differences in statistics between the training data and the test data.
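Under the independence assumption of (31), the fused score (32) reduces to a sum of per-classifier quadratic terms. The sketch below illustrates the estimation and fusion steps on made-up score distributions for two classifiers; the numbers and sample sizes are our assumptions, not the paper's data.

```python
import numpy as np

def fit_gaussians(scores):
    """Per-classifier mean and std of scores;
    rows are query-target pairs, columns are classifiers."""
    return scores.mean(axis=0), scores.std(axis=0)

def llr_fuse(s, m_same, sd_same, m_diff, sd_diff):
    """Fused score = log p(s|same) - log p(s|diff) under independent
    Gaussians per classifier, as in (31)-(32)."""
    def logp(v, m, sd):
        return -0.5 * ((v - m) / sd) ** 2 - np.log(sd)
    return float(np.sum(logp(s, m_same, sd_same) - logp(s, m_diff, sd_diff)))

rng = np.random.default_rng(1)
# Synthetic training scores for two classifiers.
same = rng.normal([0.6, 0.7], 0.1, size=(500, 2))   # match pairs
diff = rng.normal([0.2, 0.3], 0.1, size=(500, 2))   # non-match pairs
m_s, sd_s = fit_gaussians(same)
m_d, sd_d = fit_gaussians(diff)

fused = llr_fuse(np.array([0.65, 0.72]), m_s, sd_s, m_d, sd_d)
# A score pair near the match means yields a positive fused LLR.
```

Because only means and variances are stored per classifier, the fusion adds negligible cost at test time, which is consistent with the paper's motivation for preferring the parametric model over a Parzen estimate.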
Even though the training data and test data have different statistics, the relative strength of the component classifiers is unlikely to change significantly, suggesting that the weighting still makes sense. One drawback of the weighted sum method is that the scores generated by the component classifiers may have different physical or statistical meanings and different ranges. Hence, we should make sure that the range of each score is normalized appropriately. The LLR-based fusion method is more principled than the weighted sum method in that the LLR-based fusion method
is derived from the optimal likelihood ratio test. The method's decision boundary is nonlinear and can, therefore, perform more complex classification.

However, the score fusion in (32) depends upon the parameters that we estimate from the training data. Thus, this method is more sensitive to discrepancies between the statistics of the training data and the test data than the weighted sum method. For instance, the score fusion in (32) is affected by shifts of the mean parameters m_k^diff and m_k^same.

In our experiments, the LLR-based fusion resulted in a higher verification rate at a false accept rate (FAR) of 0.1%. Hence, a rule of thumb is to use the LLR-based fusion method when we expect that the mean parameters estimated from the training data are good estimates of the true parameters of the test data.

VI. EXPERIMENTAL RESULTS AND DISCUSSIONS

A. FRGC Evaluation Protocol

FRGC [5] provides three components as an evaluation framework: image data sets, experimental protocols, and the infrastructure. The FRGC data corpus contains high-resolution 2-D still images taken under controlled lighting conditions as well as under unstructured illumination. The still images were taken with a 4-megapixel digital camera, and the resolutions are either 1704 × 2272 or 1200 × 1600. The data corpus is divided into training and validation partitions. The data in the training partition were collected in the 2002-2003 academic year. The training set consists of 12,776 images from 222 subjects, with 6,388 controlled still images and 6,388 uncontrolled still images. Images in the validation partition were collected during the 2003-2004 academic year, with time elapse. The validation set contains images from 466 subjects.

The FRGC evaluation protocol consists of six experiments, but we focus on the two experiments, Experiments 1 and 4, that involve 2-D still images.
The other protocols are designed for 3-D face recognition or enrollment of multiple 2-D images. Experiment 1 measures the performance on 16,028 frontal facial images taken under controlled illumination, and Experiment 4 measures recognition performance for 8,014 uncontrolled frontal still images versus 16,028 controlled images. Experiment 1 is designed to measure the performance of traditional frontal face recognition, but Experiment 4 is a more practical protocol because it contains unexpected illumination changes, blurred images, and some occlusions. Table I summarizes the protocols of Experiments 1 and 4, and Fig. 10 shows sample images.

The verification performance is characterized by two statistics: the verification rate (VR) and the false acceptance rate (FAR). The FAR is computed from comparisons of faces of different people, defined as "non-match." On the other hand, the VR is computed from comparisons of two facial images of the same person, defined as "match." The baseline algorithm and its performance are provided with the FRGC evaluation tools; the BEE baseline algorithm is PCA [5]. The performance is reported on a receiver operating characteristic (ROC) curve that shows the tradeoff between the VR and the FAR. Three ROC experimental results are reported: the images for ROC 1, ROC 2, and ROC 3 are collected within semesters, within a year, and between semesters, respectively.

TABLE I. DATABASE PROTOCOLS OF FRGC VER. 2.0 EXPERIMENTS. [C] AND [U] DENOTE THE CONTROLLED AND THE UNCONTROLLED ILLUMINATION CONDITIONS, RESPECTIVELY.

Fig. 10. Example face images of the FRGC data set. The controlled images are in the first and second columns, and the uncontrolled images are in the other columns.

The evaluation tools ensure that results from different algorithms are computed on the same data sets and that performance scores are generated by the same protocol.
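The two statistics can be computed directly from sets of match and non-match scores. The following sketch is our assumption about a straightforward implementation, not the BEE tool itself: the decision threshold is chosen as the (1 - FAR) quantile of the non-match scores, and the VR is the fraction of match scores that clear it. The synthetic score distributions are made up for illustration.

```python
import numpy as np

def vr_at_far(match, nonmatch, far=0.001):
    """Verification rate at a given FAR: the threshold is set so that
    the fraction of non-match scores at or above it equals `far`."""
    thr = np.quantile(nonmatch, 1.0 - far)
    return float(np.mean(match >= thr))

rng = np.random.default_rng(2)
match = rng.normal(0.7, 0.1, 10000)     # same-person scores
nonmatch = rng.normal(0.2, 0.1, 10000)  # different-person scores

vr = vr_at_far(match, nonmatch, far=0.001)  # VR at FAR = 0.1%
```

Sweeping `far` over (0, 1] and plotting `vr_at_far` against it traces exactly the ROC curve that the FRGC protocol reports.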
The VRs at FAR = 0.1% of the baseline algorithm are about 78% and 16% for ROC 1 of Experiments 1 and 4, respectively.

Even though FRGC provides high-resolution images together with ground truth locations of four fiducial points, namely, the two eyes, nose, and mouth, to give more chances to improve recognition performance, we down-sampled the face images into small images of size 46 × 56 such that the two eye points are located at specified positions. This is largely because face recognition systems are often implemented on consumer electronics devices, which still have low computational power, and because eye locations are easier to find than the other fiducial points in real situations.

B. Experimental Results on the Integral Normalized Gradient Image

Yang and Liu [27] proposed the color image discriminant (CID) model for a more effective representation of the color image for face recognition. In this paper, we adopt their approach to make a salient single-channel image I from a color image, with the differences that we use our own weighting parameters and that we apply our novel INGI method to the single-channel image I. We call this scheme the INGI-Color method. The single-channel image is given as a weighted sum as follows:

I = w_R R + w_G G + w_B B    (33)

where R, G, and B are the red, green, and blue channel images, respectively. The weight parameters w_R, w_G, and w_B are calculated from the discriminant powers of the RGB channel images of the training samples. We divide the training image set into two sets: the target set, which contains 6,681 controlled training images, and the query set, which consists of 6,095 uncontrolled training images. With the training target and query sets, we can measure the performance of the red, green, and blue color channels, respectively, and the weight parameters,
w_R, w_G, and w_B, are given by the reciprocal values of the EERs.

Fig. 11. Preprocessing example images. The controlled images are in the first and third columns, and the images in the other columns are the uncontrolled images. The preprocessing methods are (a) gray image, (b) histogram equalization, (c) Retinex, (d) INGI-Gray, the INGI method applied to gray images, and (e) INGI-Color, the INGI method applied to the novel image made by the weighted summation of the color channel images.

From the sample images in Fig. 11, we can see that the proposed INGI methods can alleviate the image distortion caused by unexpected illumination changes and enrich the texture information of a face image for recognition. For example, the shadows of the deep-set eyes in Fig. 11(a) are removed by the proposed INGI methods, as shown in Fig. 11(d) and (e), but Fig. 11(b) shows that histogram equalization could not solve this problem adequately.

For the performance comparison of the preprocessing methods, we use a simple PCLDA-based classifier with five different preprocessing methods: no preprocessing, i.e., using the raw image itself; histogram equalization; Retinex; INGI-Gray (the proposed INGI method applied to the gray image); and INGI-Color (the proposed INGI method applied to the single-channel image I made by (33)). Fig. 12(b) shows that the INGI-Color method performs best in the Experiment 4 protocol, but the INGI-Color method is worse than INGI-Gray in Experiment 1. This is largely because the parameters of (33) come from a condition similar to that of Experiment 4. On the other hand, histogram equalization is the best in the Experiment 1 protocol but not in the Experiment 4 protocol.
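Returning to (33), the reciprocal-EER channel weighting can be sketched as follows. The EER values are made up, and normalizing the weights to sum to one is our convenience assumption (it keeps the output in the input intensity range); the paper specifies only that the weights are the reciprocal EERs.

```python
import numpy as np

def ingi_color_channel(img_rgb, eers):
    """Single-channel image of (33): I = w_R*R + w_G*G + w_B*B, with
    each w_c proportional to 1/EER_c of the corresponding channel.
    Normalizing the weights is a convenience assumption, not from
    the paper."""
    w = 1.0 / np.asarray(eers, dtype=float)
    w /= w.sum()
    # Contract the trailing RGB axis against the weight vector.
    return np.tensordot(img_rgb, w, axes=([-1], [0]))

rng = np.random.default_rng(3)
img = rng.random((56, 46, 3))                            # H x W x RGB
gray = ingi_color_channel(img, eers=[0.12, 0.10, 0.15])  # made-up EERs
```

The lowest-EER channel (here green) receives the largest weight, so the channel that discriminates identities best dominates the single-channel image that INGI then processes.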
With histogram equalization, the histogram of a controlled image, which is often concentrated in a narrow range of intensities, is stretched over the entire intensity range, whereas that of an uncontrolled image, which in general already spans a large range, has little room for such an extension. Therefore, the enriched textures of the controlled images contribute to the good accuracy in the Experiment 1 protocol. On the other hand, in the Experiment 4 protocol, histogram equalization makes the controlled and uncontrolled images of the same person more different. In the case of the Retinex method, it does not produce good results in Experiments 1 and 4, from which we can infer that some of the information useful for face recognition is lost while trying to remove most of the image variation induced by illumination changes. It, thus, seems natural that we employ the INGI-Color method to overcome the uncontrolled illumination changes in this paper.

Fig. 12. Face recognition performances of the PCLDA by preprocessing method in (a) Experiment 1 and (b) Experiment 4.

Fig. 13. Face recognition performances of the individual elements of the hybrid Fourier feature based upon one face model. For simplicity, we compare only the performances for ROC 1 in Experiment 4.

C. Experimental Results on the Hybrid Fourier Feature-Based Multiple Face Model

Fig. 13 shows the VRs of the individual Fourier features, ranging from 40.13% to 63.42%, on a single face model. As indicated in Fig. 6, the features from the RI domain perform the best, while the worst result comes from the phase angle domain. The augmented feature z consists of all these Fourier features from
the three domains and their frequency bands. It gives 69.89% accuracy, which is approximately 6% better than the best single feature in Experiment 4, although its dimensionality is six times that of a single Fourier feature. From this result, we can safely conclude that the proposed Fourier features extracted from different Fourier domains and frequency bands have complementary characteristics despite the extended feature dimensionality.

TABLE II. FACE RECOGNITION PERFORMANCES OF THE BEE BASELINE AND THE PROPOSED METHODS IN EXPERIMENTS 1 AND 4.

Fig. 14. Face recognition performances of the three face models and the merged face models in both experiments. (a) Experiment 1. (b) Experiment 4. The multiple face model results are fused by the weighted sum rule.

In the case of the proposed multiple face model, Fig. 14 shows the recognition performances of the fine, dominant, and coarse face models; the dominant face model performs best, followed in order by the coarse face model and the fine face model. The fine face model contains relatively redundant facial texture and leaves insufficient discriminative features for face recognition; in addition, the coarse face model is too complex to build an ideal transformation matrix within linear subspaces. In spite of the weak performances of these two face models, when they are merged with the dominant face model by the weighted sum rule, they contribute to a performance increase. To be specific, while the maximum VR (ROC 1) of the dominant face model is 93.34% in Experiment 1 and 69.97% at FAR = 0.1% in Experiment 4, the multiple face model achieves 95.32% and 77.55%, respectively.

Fig. 15. ROC curves for Experiment 1 corresponding to the multiple face model-based face recognition fused by the weighted sum and by the LLR.

D.
Experimental Results on Log-Likelihood Ratio (LLR)-Based Score Fusion

How to merge the outputs of the individual classifiers to obtain good performance is another challenging problem in face recognition. All parameters of the LLR and weighted sum methods are calculated in advance from training samples. The results of each method for Experiments 1 and 4 are shown in Table II, Fig. 15, and Fig. 16.² Recognition accuracy always benefits from merging the three face models, and the LLR achieves the best performance in both Experiments 1 and 4. Note that all merged methods show more performance improvement in Experiment 4 than in Experiment 1; because the performance in Experiment 1 is already good, there is little room for significant improvement.

E. Performance Comparisons With Previous Work

As summarized in Table III, most previous methods have tried to combine a local-feature-based classifier and a global-feature-based classifier for better face recognition accuracy. Regarding the performance of the individual classifiers, the local Gabor features extracted from high-resolution images, over 120 × 120, achieve better accuracy than the global features from low-resolution images, under 60 × 60. In detail, the former

²The final ROC curves of the proposed method in Experiments 1 and 4 can be found at http://ispl.korea.ac.kr/~wjhwang/project/2010/TIP.html.
TABLE III. PERFORMANCE COMPARISONS WITH PREVIOUSLY PUBLISHED METHODS ON EXPERIMENT 4 OF THE FRGC VER. 2.0 DATABASE. THE ACCURACY IS MEASURED AS THE VR AT FAR = 0.1% FOR ROC 3. THE RESULTS OF THE OTHER METHODS ARE CITED DIRECTLY FROM THE CORRESPONDING PAPERS.

Fig. 16. ROC curves for Experiment 4 corresponding to the multiple face model-based face recognition fused by the weighted sum and by the LLR.

methods, i.e., Gabor features [28]-[30], show 73.5%-84.12% VR at FAR = 0.1% for ROC 3, and the latter methods, for example, Fourier, DCT, and LBP features [27]-[30], achieve 51%-78.26% VR. However, these works do not discard the global classifiers, because the global features have their own complementary characteristics with respect to the local features, and combining the two kinds of features has led to better face recognition accuracy.

The best accuracy, 92.40% VR at FAR = 0.1% for ROC 3, was achieved by Liu's method [28] using one high-resolution and two low-resolution color-scale images. The basic framework of Liu's method is the combination of three different features extracted from novel color channels to improve face recognition accuracy. However, their algorithm relies on a large number of nontrivial classifiers, giving rise to increased computational complexity.

TABLE IV. AVERAGE COMPUTATIONAL TIMES IN EACH MODULE OF THE PROPOSED METHOD, MEASURED ON A 2.66 GHz SINGLE-PROCESSOR PC. THE PREPROCESSING DOES NOT INCLUDE THE FACE NORMALIZATION MODULE.

Among the face recognition methods based upon global features extracted from small images, the proposed method performs best.
For example, the proposed method, which consists of only three global classifiers, shows better performance, 81.14% VR at FAR = 0.1% for ROC 3, than the other global methods [27]-[30], which achieved 78.26%, 69.38%, approximately 73.5%, and 51% VR, respectively.

To demonstrate that the local-feature-based methods have much larger computational complexity than the proposed method, we evaluated the computational times for the proposed method and for the feature extraction stage of each local-feature-based method. We considered the feature extraction stage because it is expected to be the bottleneck of the local-feature-based methods. As Tables IV and V show, for the local-feature-based methods, the feature extraction stage alone takes about 2,000 ms for [28], [29] and 400-17,000 ms for [30], which is much larger than 70 ms, the total computational time of the proposed method. The test platform is a 2.66 GHz single-processor PC with 4 GB of RAM. In the case of [30] in Table V, 30 local patches are selected from a given image, where the size of each patch is somewhere between 16 × 16 and 64 × 64
TABLE V. AVERAGE COMPUTATIONAL TIMES OF ONLY THE FEATURE EXTRACTION FUNCTION. THE VRS OF THE OTHER METHODS ARE CITED DIRECTLY FROM THE CORRESPONDING PAPERS.

but the exact patch sizes are not specified. Hence, we evaluated the computational times for the cases where all patch sizes are at the minimum, 16 × 16, or at the maximum, 64 × 64, and the resulting range of computational times is shown in Table V.

In general, local-feature-based methods are suitable for high-resolution images, and their performance depends upon the available image resolution, as demonstrated in Fig. 11 of [30]. In that figure, the accuracy of the local Gabor feature-based method drops as the size of the normalized face image decreases. In particular, the verification rate of the method in [30] decreases from 83% to 68% as the image size changes from 128 × 160 to 48 × 60, as shown in Table V. This is largely because Gabor wavelet-based methods are commonly used for analyzing the fine features in a high-resolution facial image.

Consequently, the proposed method will be useful for scenarios where the input images have small resolution, and it has the benefit of lower computational complexity compared with the local-feature-based methods.

VII. CONCLUSION

We have presented a complete face recognition system, with preprocessing, feature extraction, classification, and score fusion methods, for uncontrolled illumination situations. First, we proposed a preprocessing method based upon an analysis of the face imaging model with definitions of the intrinsic and extrinsic factors of a human face, and proposed the INGI method as an illumination-insensitive representation for face recognition. We also proposed hybrid Fourier-based classifiers with multiple face models, which consist of three Fourier domains: the concatenated real and imaginary components, the Fourier spectrum, and the phase angle.
The Fourier features are extracted from each domain within its own proper frequency bands, and to gain the maximum discriminant power between classes, each feature is projected into a linear discriminative subspace with the PCLDA scheme. We built multiple face models, namely, fine, dominant, and coarse face models, which have the same image size but different eye distances. The multiple face models always perform better than the dominant face model alone. Moreover, to effectively utilize the several classifiers, we proposed a score fusion method based upon the LLR at the final stage of the face recognition system. With the proposed method, we achieved an average 95.13% verification accuracy in Experiment 1 and an average 81.49% in Experiment 4. Compared with the other global feature-based algorithms, the proposed system demonstrated successful face recognition accuracy under uncontrolled illumination situations.

REFERENCES

[1] P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, "The FERET evaluation methodology for face recognition algorithms," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 10, pp. 1090-1104, Oct. 2000.
[2] D. M. Blackburn, M. Bone, and P. J. Phillips, "Facial recognition vendor test 2000 evaluation report," Dec. 2000 [Online]. Available: http://www.frvt.org/
[3] P. Phillips, P. Grother, R. Micheals, D. Blackburn, E. Tabassi, and M. Bone, "Face recognition vendor test 2002: Evaluation report," 2003 [Online]. Available: http://www.frvt.org/
[4] K. Messer, J. Kittler, M. Sadeghi, M. Hamouz, A. Kostin et al., "Face authentication test on the BANCA database," in Proc. Int. Conf. Pattern Recognit., Aug. 2004, vol. 4, pp. 523-532.
[5] P. J. Phillips, P. J. Flynn, T. Scruggs, K. Bowyer, J. Chang, K. Hoffman, J. Marques, J. Min, and W. Worek, "Overview of the face recognition grand challenge," in Proc. IEEE Comput. Vis. Pattern Recognit., Jun. 2005, vol. 1, pp. 947-954.
[6] P. N. Belhumeur and D. J.
Kriegman, "What is the set of images of an object under all possible lighting conditions?," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 1996, pp. 270-277.
[7] R. Ramamoorthi and P. Hanrahan, "On the relationship between radiance and irradiance: Determining the illumination from images of a convex Lambertian object," J. Opt. Soc. Amer., vol. 18, no. 10, pp. 2448-2459, 2001.
[8] A. Shashua and T. Riklin-Raviv, "The quotient image: Class-based re-rendering and recognition with varying illuminations," IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 2, pp. 129-139, Feb. 2001.
[9] H. Wang, S. Li, and Y. Wang, "Generalized quotient image," in Proc. IEEE Comput. Vis. Pattern Recognit., Jul. 2004, vol. 2, pp. 498-505.
[10] Q. Li, W. Yin, and Z. Deng, "Image-based face illumination transferring using logarithmic total variation models," Int. J. Comput. Graph., vol. 26, no. 1, pp. 41-49, Nov. 2009.
[11] E. H. Land, "The Retinex theory of color vision," Sci. Amer., vol. 237, no. 6, pp. 108-128, Dec. 1977.
[12] D. J. Jobson, Z. Rahman, and G. A. Woodell, "Properties and performance of a center/surround Retinex," IEEE Trans. Image Process., vol. 6, no. 3, pp. 451-462, Mar. 1997.
[13] R. Gross and V. Brajovie, "An image preprocessing algorithm for illumination invariant face recognition," in Proc. 4th Int. Conf. Audio Video Based Biometric Person Authentication, 2003, vol. 2688/2003, pp. 10-18.
[14] J. Malik and P. Perona, "Scale-space and edge detection using anisotropic diffusion," IEEE Trans. Pattern Anal. Mach. Intell., vol. 12, no. 7, pp. 629-639, Jul. 1990.
[15] M. A. Turk and A. P. Pentland, "Eigenfaces for recognition," J. Cogn. Neurosci., vol. 3, no. 1, pp. 71-86, 1991.
[16] P. S. Penev and J. J. Atick, "Local feature analysis: A general statistical theory for object representation," Network: Comput. Neural Syst., vol. 7, no. 3, pp. 477-500, 1996.
[17] P. N. Belhumeur, J. P. Hespanha, and D. J.
Kriegman, "Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 711-720, Jul. 1997.
[18] M. S. Bartlett, Face Image Analysis by Unsupervised Learning. Norwell, MA: Kluwer, 2001.
[19] M. Lades, J. C. Vorbruggen, J. Buhmann, J. Lange, C. von der Malsburg, R. P. Wurtz, and W. Konen, "Distortion invariant object recognition in the dynamic link architecture," IEEE Trans. Comput., vol. 42, no. 3, pp. 300-311, Mar. 1993.
[20] L. Wiskott, J. M. Fellous, N. Kruger, and C. von der Malsburg, "Face recognition by elastic bunch graph matching," IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 775-779, Jul. 1997.
[21] J. Chang, M. Kirby, H. Kley, C. Peterson, B. Draper, and R. Beveridge, "Recognition of digital images of the human face at ultra low resolution via illumination spaces," in Proc. Asian Conf. Comput. Vis., Nov. 2007, vol. 4844/2007, pp. 733-743.
[22] B. V. Kumar, M. Savvides, K. Venkataramani, and X. Xie, "Spatial frequency domain image processing for biometric recognition," in Proc. IEEE Int. Conf. Image Process., 2002, vol. 1, pp. 53-56.
[23] M. Savvides, B. Kumar, and P. Khosla, "Corefaces: Robust shift invariant PCA based correlation filter for illumination tolerant face recognition," in Proc. IEEE Comput. Vis. Pattern Recognit., Jun. 2004, vol. 2, pp. 834-841.
[24] C. Xie, M. Savvides, and B. V. Kumar, "Kernel correlation filter based redundant class-dependence feature analysis (KCFA) on FRGC2.0 data," Analysis and Modeling of Faces and Gestures, LNCS 3723, vol. 3723/2005, pp. 32-43, 2005.
[25] Advanced Face Descriptor Using Fourier and Intensity LDA Features, ISO/IEC JTC1/SC29/WG11-MPEG-8998, Oct. 2002.
[26] W. Hwang and S. Kee, "International standardization on face recognition technology," Adv. Biomet. Person Authent., vol. 3338/2005, pp. 349-357, Dec. 2004.
[27] J. Yang and C. Liu, "Color image discriminant models and algorithms for face recognition," IEEE Trans. Neural Netw., vol. 19, no. 12, pp. 2088-2098, Dec. 2008.
[28] Z. Liu and C. Liu, "Robust face recognition using color information," Adv. Biomet., vol. 5558/2009, pp. 122-131, 2009.
[29] X. Tan and B. Triggs, "Fusing Gabor and LBP feature sets for kernel-based face recognition," in Proc. IEEE Int. Workshop Anal. Model. Face Gestures, 2007, pp. 235-249.
[30] Y. Su, S. Shan, X. Chen, and W. Gao, "Hierarchical ensemble of global and local classifiers for face recognition," IEEE Trans. Image Process., vol. 18, no. 8, pp. 1885-1896, Aug. 2009.
[31] X. Wang and X. Tang, "Random sampling LDA for face recognition," in Proc. IEEE Comput. Vis. Pattern Recognit., 2004, vol. 2.
[32] B. Heisele, P. Ho, J. Wu, and T. Poggio, "Face recognition: Component-based versus global approaches," Comput. Vis. Image Understand., vol. 91, no. 1/2, pp. 6-21, 2003.
[33] T. Kim, H. Kim, W. Hwang, and J. Kittler, "Component-based LDA face description for image retrieval and MPEG-7 standardisation," Image Vis. Comput., vol. 23, no. 7, pp. 631-642, Jul. 2005.
[34] P. Sinha, B. J. Balas, Y. Ostrovsky, and R. Russell, "Face recognition by humans: 19 results all computer vision researchers should know about," Proc. IEEE, vol. 94, no. 11, pp. 1948-1962, Nov. 2006.
[35] A. Jain, K. Nandakumar, and A. Ross, "Score normalization in multimodal biometric systems," Pattern Recognit., vol. 38, no. 12, pp. 2270-2285, Dec. 2005.
[36] J. Kittler, M. Hatef, R. P. Duin, and J. Matas, "On combining classifiers," IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 3, pp. 226-239, Mar. 1998.
[37] S. Prabhakar and A. K.
Jain, “Decision-level fusion in fingerprint ver-ification,” Pattern Recognit., vol. 35, no. 4, pp. 861–873, 2002.[38] S. Samsung, “Integral normalized gradient image—A novel illu-mination insensitive representation,” in Proc. IEEE Workshop FaceRecognit. Grand Challenge Exper., Jun. 2005, pp. 166–172.[39] Y. Adnin, Y. Moses, and S. Ullman, “Face recognition: The problemof compensating for changes in illumination direction,” IEEE Trans.Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 721–732, Jul. 1997.[40] R. C. Gonzalez and R. E. Woods, Digital Image Processing. Reading,MA: Addison-Wesley, 1992.[41] L. Wiskott, J. M. Fellous, N. Kruger, and C. von der Malsburg, “Facerecognition and gender determination,” in Proc. Int. Workshop FaceGesture Recognit., 1995, pp. 92–97.[42] M.Savvides,B.V.Kumar,andP.Khosla,“Eigenphasesvs.eigenfaces,”in Proc. Int. Conf. Pattern Recognit., Aug. 2004, vol. 3, pp. 810–813.[43] W. Hwang, G. Park, J. Lee, and S. Kee, “Multiple face model of hybridfourier feature for large face image set,” in Proc. IEEE Comput. Vis.Pattern Recognit., Jun. 2006, vol. 2, pp. 1574–1581.[44] R. Chellappa, C. L. Wilson, and S. Sirohey, “Human and machinerecognition of faces: A survey,” Proc. IEEE, vol. 83, no. 5, pp. 705–741,May 1995.[45] P. G. Schyns and A. Oliva, “Dr. Angry and Mr. Smile: When catego-rization flexibly modifies the perception of faces in rapid visual presen-tations,” Cognition, vol. 69, no. 3, pp. 243–265, Jan. 1999.[46] T. M. Cover and J. A. Thomas, Elements of Information Theory.Hoboken, NJ: Wiley, 1991.Wonjun Hwang received the B.S. and M.S. degreesfrom the Department of Electronics Engineering,Korea University, Seoul, Korea, in 1999 and 2001,respectively.From 2001 to 2008, he worked as a Researcherfor the Face Recognition Group, Samsung AdvancedInstitute of Technology (SAIT). In 2001–2004, hecontributed to the promotion of Advanced FaceDescriptor, Samsung and NEC joint proposal, toMPEG-7 international standardization. 
From 2004 to 2006, he worked on developing the SAIT face recognition engine for the Face Recognition Grand Challenge (FRGC) and Face Recognition Vendor Test (FRVT) 2006, which achieved the best results under the uncontrolled illumination situation at both FRGC and FRVT 2006. In 2007, he developed the real-time face recognition system for the Samsung cellular phone SGH-V920. He is currently a senior research engineer of the Mechatronics Manufacturing Technology Center at Samsung Electronics Co., Ltd., now working on face recognition for the Samsung humanoid robot, RoboRay. His current research interests are in face recognition, human gesture recognition, object recognition, and computer vision.

Haitao Wang received the M.S. degree in engineering from the University of Science and Technology Beijing, in 2001, and the Ph.D. degree in computer science from the Institute of Automation, Chinese Academy of Sciences, Beijing, China, in 2004.
He is currently a Senior Researcher of the SAIT Beijing Lab, Samsung Advanced Institute of Technology (SAIT). His research interests include 2-D-to-3-D conversion, stereo matching, multiview 3-D reconstruction, and 3-D display.

Hyunwoo Kim (M'10) received the B.S. degree in electronic communication engineering from Hanyang University, Seoul, Korea, in 1994, and the M.S. and Ph.D. degrees in electrical and computer engineering from POSTECH, Pohang, Korea, in 1996 and 2001, respectively.
In 2000, he was a Visiting Scientist in the Institute for Robotics and Intelligent Systems, University of Southern California, Los Angeles, CA. He worked at Samsung Electronics Co., Ltd. (2001–2007) and Korean German Industrial Park Co., Ltd. (2007–2008) in the field of image processing, computer vision, robotics, and human-computer interaction (HCI).
In 2009, he joined the Department of New Media at the Korean German Institute of Technology, Seoul, Korea, where he is currently an Assistant Professor, after working for the Intelligent System Research Center at Sungkyunkwan University (2008–2009) as a Research Associate Professor. His current research interests include computer vision, machine learning, cognitive robotics, and augmented reality.

Seok-Cheol Kee received the B.S. and M.S. degrees in control and instrumentation engineering and the Ph.D. degree in electrical engineering, all from Seoul National University, Seoul, Korea, in 1987, 1989, and 2002, respectively.
From 1989 to 2007, he worked as a Principal Research Staff in the Samsung Advanced Institute of Technology (SAIT). From 2007 to 2010, he worked as a Director of the Robot Research Center at Robotever, Inc. Since June 2010, he has been with Mando Co., where he is now head of the Electronic R&D Center. His research interests are in computer vision, image processing, biometrics for multimedia applications and human-computer interaction, sensor fusion, and embedded control systems for robot and automotive applications.

Junmo Kim (S'01-M'05) received the B.S. degree from Seoul National University, Seoul, Korea, in 1998, and the M.S. and Ph.D. degrees from the Massachusetts Institute of Technology (MIT), Cambridge, in 2000 and 2005, respectively.
From 2005 to 2009, he was with the Samsung Advanced Institute of Technology (SAIT), Korea, as a Research Staff Member. He joined the faculty of KAIST in 2009, where he is currently an Assistant Professor of electrical engineering. His research interests are in image processing, computer vision, statistical signal processing, and information theory.