HWANG et al.: FACE RECOGNITION SYSTEM USING MULTIPLE FACE MODEL OF HYBRID FOURIER FEATURE 1153

designed a quotient image algorithm for dealing with illumination changes in face recognition. It is a practical algorithm for extracting an illumination-invariant representation. However, the application range of those algorithms is limited by the requirement of a 3-D face model, since a 3-D model is not available in many practical applications due to modeling complexity and cost. Based upon Land's Retinex, Jobson et al. and Gross and Brajovic developed reflectance estimation methods using the ratio of an original image to its smoothed version. The difference between the two Retinex-based algorithms is that Jobson's filter is isotropic while Gross and Brajovic's filter is anisotropic. Since those approaches need neither a 3-D nor a 2-D model, they are relatively simple to implement and are generic. Similarly, Wang et al. introduced the self-quotient image (SQI) method, which extracts intrinsic, illumination-invariant features from a face image based upon the quotient image technique. The authors assumed that the intrinsic part of the image is mainly located in the high-frequency region and that, according to the Lambertian model, the intrinsic part and the extrinsic part can be analyzed theoretically. The SQI method can remove most of the shaded parts of a face image. However, this method applies direct division operations to image intensity and introduces some noise and halo effects in step regions. Recently, Li et al.
presented an image-based technique that employs the logarithmic total variation model to factorize each of two aligned face images into an illumination-dependent component and an illumination-invariant component.

In this paper, the illumination-insensitive integral normalized gradient image (INGI) method is proposed to overcome unexpected illumination changes in face recognition with limited side effects such as image noise and the halo effect. Based upon the intrinsic- and extrinsic-factor definitions, we first normalize the gradients with a smoothed image and then integrate the division results with the anisotropic diffusion method. Through this procedure, we can compensate for unexpected artifacts and obtain smoother and more natural output images for face recognition.

B. Feature Extraction

Features to be used for person classification are extracted to identify any invariance in the face images against environmental changes. In this application, appearance-based subspace representations have been implemented in face recognition. They include principal component analysis (PCA), local feature analysis (LFA), linear discriminant analysis (LDA), and independent component analysis (ICA). Those methods are derived from ensemble statistics of the given training images and are easy to implement in practical applications. In addition, these methods impose no heavy computational burden beyond vector projections, while other structure-based schemes, such as dynamic link architecture and elastic bunch graph matching, perform intensive template matching on relatively high-resolution images to localize fiducial points near the eyes, nose, mouth, etc. Recently, Chang et al. demonstrated that sufficient discriminatory information persists at ultra-low resolution to enable a computer to recognize specific human faces.

Kumar et al.
developed a frequency-domain feature of face images for recognition by proposing a cross-correlation method based upon the fast Fourier transform. Savvides et al. further extended a correlation filter and developed a hybrid PCA-correlation filter called "Corefaces," which performed robust illumination-tolerant face recognition. Xie et al. applied a nonlinear correlation filter for redundant class-dependence feature analysis in the frequency domain. Another successful approach based upon the Fourier feature is the advanced face descriptor (AFD), which was promoted as an international standard by the Moving Picture Experts Group (MPEG) society after long-term competitions for a new face descriptor of MPEG-7. This approach is designed for the metadata of small face images in multimedia content with compact descriptor sizes, and its algorithms were derived from Fourier and intensity features based upon the cascaded LDA method.

Recently, there have been many studies on face recognition for the FRGC. For example, Yang and Liu presented a color image discriminant (CID) model that seeks to unify the color image representation and recognition tasks into one framework. The proposed models have two sets of variables: a set of color-component combination coefficients for color image representation and multiple projection basis vectors for color image discrimination. Liu and Liu proposed hybrid color-space-based face recognition, which consists of patch-based Gabor classifiers on a high-resolution image, local binary pattern (LBP) feature-based classifiers, and discrete cosine transform (DCT)-based classifiers on a low-resolution image. Tan and Triggs suggested the use of heterogeneous feature sets such as Gabor wavelets and LBP features, each projected into a reduced-dimension space by PCA. Su et al. proposed a hierarchical framework mixing global and local classifiers.
The global classifier is based upon Fourier features originating from a low-resolution image, while the local classifiers are organized as a combination of patch-based Gabor features from a high-resolution facial image.

Current studies on face recognition can be largely classified into two different classes of approaches: the local feature-based methods and the global feature-based methods. The local feature-based methods analyze a plurality of local features, such as Gabor wavelet features extracted from a high-resolution image, but most of these approaches impose a heavy computational burden on the target device, in particular on mobile devices, which have low computational power. On the other hand, the global feature-based approaches extract features, for example, Fourier and DCT features, from a low-resolution image. Because the basic framework of the global feature-based methods is simple, they are well suited to devices with low computational power, in spite of their limited accuracy compared with the local feature-based methods.

In this paper, we extend the AFD to handle a large number of uncontrolled face images effectively. The learning process is performed independently on various selected frequency bands using a 2-D discrete Fourier transform. This feature extraction framework is introduced in order to remove frequency parts that are unnecessary for face recognition as the occasion demands. Three types of Fourier feature domain, concatenated real and imaginary components, Fourier spectrums, and
1154 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 20, NO. 4, APRIL 2011

the phase angle, are represented. Three different frequency bandwidths are also designed to extract more complementary frequency features. As for complementary features, Wang and Tang proved that random sampling of feature vectors and training samples improves classifier performance. We do not depend upon random feature sampling but instead design several complementary features based upon frequency analysis. Moreover, we construct multiple face models, that is, three face models with different eye distances within a regular face image region. While component-based schemes perform local analysis of important facial components, our proposed multiple face models focus on compensatory face models in imitation of human perception, which uses both internal features (e.g., the mutual spatial configuration of facial components) and external facial features (e.g., hair and jaw-line) for face recognition. With multiple complementary features and classifiers, we can expect to achieve a more general face recognition system than is possible with a single feature and classifier.

C. Score Fusion

As we have a set of complementary classifiers, we build a unified classifier combining them. The purpose here is to construct a strong classifier by suitably combining a set of classifiers. To this end, we want to keep as much of the information each classifier extracts as possible, and at the same time the combination should be easy to implement. The information each classifier extracts is well summarized in the score each classifier produces. Hence, combining the classifiers can be achieved by processing the set of scores produced by the component classifiers and generating a new single score value.
We call this process "score fusion." Previous methods for score fusion include the sum rule, product rule, weighted sum, Bayesian method, and voting.

In this paper, we consider a score fusion method based upon a probabilistic approach, namely, the log-likelihood ratio (LLR), for face recognition. If the ground-truth distributions of the scores are known, LLR-based score fusion is optimal. However, the true distributions are unknown, so we have to estimate them. We propose a simple approximation of the optimal score fusion based upon parametric estimation of the score distributions from the training data set.

The rest of this paper is organized as follows: illumination-insensitive representation as a preprocessing method is presented in Section II. LDA based upon the Fourier feature and the proposed algorithm, hybrid Fourier feature-based LDA from multiple face models, are explained in Sections III and IV, respectively. In Section V, log-likelihood-ratio-based score fusion is presented. The experimental results and discussions are given in Section VI. Last, the conclusion is summarized in Section VII.

II. ILLUMINATION INSENSITIVE REPRESENTATION

Compared to the controlled illumination changes in the studio (indoors, same day, overhead), achieving high recognition accuracy in an uncontrolled illumination situation (outdoors, different day) is hard, as mentioned in the FRVT 2002 report. The main reason is that the image distortion caused by illumination changes makes images of different persons under the same illumination conditions more similar than images of the same person under various illumination changes. Consequently, we propose a novel illumination-invariant preprocessing algorithm¹ based upon local analysis, as opposed to global analysis such as histogram equalization, to deal with the uncontrolled illumination situation.

Fig. 1. Under the assumption of the Lambertian reflectance model, an image consists of a 3-D shape, texture, and illumination.

A.
Illumination Analysis

Assuming the Lambertian reflectance model, the grayscale intensity image $I(x,y)$ of a 3-D object is represented by

$$I(x,y) = \rho(x,y)\,\mathbf{n}(x,y)^{T}\mathbf{s} \qquad (1)$$

where $\rho(x,y)$ is the surface texture associated with point $(x,y)$ in the image, $\mathbf{n}(x,y)$ is the surface normal direction (shape) associated with point $(x,y)$ in the image, and $\mathbf{s}$ is the light source direction, whose magnitude is the light source intensity. This relationship is illustrated in Fig. 1.

Except for the nose region, most parts of a human face are relatively flat and continuous, and all faces of different persons have similar 3-D shapes. That is, imposing a different person's facial texture on the 3-D shape of a generic face does not seriously affect the identity of each person. From the viewpoint of face recognition, how to obtain the raw facial texture information is important. The quotient image method has taken advantage of this property for extracting illumination-invariant features. We can also assume that $\mathbf{n}^{T}\mathbf{s}$ is the illumination-sensitive part of the imaging model.

We define $\rho$ as the intrinsic factor and $\mathbf{n}^{T}\mathbf{s}$ as the extrinsic factor for face recognition. The intrinsic factor is illumination free and represents the identity of a face image, whereas the extrinsic factor is very sensitive to illumination variations, and only partial identity information is included in the 3-D shape.

Furthermore, the illumination problem is a well-known ill-posed problem. Without any additional assumption or constraint, no analytical solution can be deduced from an input 2-D image alone. Previous approaches, such as the illumination cone and the spherical harmonic method, directly take the 3-D shape $\mathbf{n}$ as a known parameter or as parameters estimated from training data. However, these approaches are not available in many practical systems, because the predefined 3-D face model is complex, and it does not always work properly under unexpected conditions.
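As an illustration, the imaging model of (1) can be sketched numerically. The image size, albedo, normals, and lighting below are synthetic values chosen for the example, not data from the paper:

```python
import numpy as np

# Sketch of the Lambertian model in (1): I(x, y) = rho(x, y) * n(x, y)^T s.
H, W = 4, 4
rho = np.full((H, W), 0.8)                 # surface texture (albedo), intrinsic factor
n = np.zeros((H, W, 3))
n[..., 2] = 1.0                            # surface normals, all facing the camera (+z)
s = np.array([0.0, 0.0, 2.0])              # light direction scaled by source intensity

shading = np.clip(n @ s, 0.0, None)        # extrinsic factor n^T s, clipped at shadow
I = rho * shading                          # rendered intensity image
print(I[0, 0])                             # 0.8 * 2.0 = 1.6
```

Doubling the light intensity scales every pixel by two while the intrinsic factor `rho` is unchanged, which is why the extrinsic factor carries the illumination sensitivity.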
In the case of the quotient image algorithm, though it does not need the 3-D information, it works under the assumption of a face image illuminated by a single point light source without shadow.

Our definitions of intrinsic and extrinsic factors are also based upon the Lambertian reflectance model with point lighting sources. This definition can be extended to any type of

¹The preprocessing algorithm has also been presented previously by the authors.
lighting source by a combination of point lighting sources as follows:

$$I(x,y) = \rho(x,y) \sum_{i} \mathbf{n}(x,y)^{T}\mathbf{s}_{i}. \qquad (2)$$

According to the previously mentioned assumptions, it is obvious that an illumination-insensitive image could be obtained by enhancing the intrinsic factor and suppressing the extrinsic factor in the input image; the proposed approach is based upon this idea.

B. Integral Normalized Gradient Image (INGI)

We can make the following assumptions: 1) most of the intrinsic factor is in the high spatial frequency domain, and 2) most of the extrinsic factor is in the low spatial frequency domain. Considering the first assumption, one might use a high-pass filter to extract the intrinsic factor, but it has been proved that this kind of filter is not robust to illumination variations, as shown in Fig. 2. In addition, a high-pass filtering operation risks removing some of the useful intrinsic factor. Hence, we propose an alternative approach, namely, employing a gradient operation. The gradient operation is written as

$$\nabla I = (\mathbf{n}^{T}\mathbf{s})\,\nabla\rho + \rho\,\nabla(\mathbf{n}^{T}\mathbf{s}) \approx (\mathbf{n}^{T}\mathbf{s})\,\nabla\rho \qquad (3)$$

where the approximation comes from the assumptions that both the surface normal direction (shape) and the light source direction vary slowly across the image, whereas the surface texture varies quickly. The scaling factor $\mathbf{n}^{T}\mathbf{s}$ is the extrinsic factor of our imaging model. Fig. 3 shows sample gradient maps. The Retinex method and the SQI method used smoothed images as the estimation of this extrinsic factor. We also use the same approach to estimate the extrinsic part:

$$W = G * I \qquad (4)$$

where $G$ is a smoothing kernel and $*$ denotes convolution. To overcome the illumination sensitivity, we normalize the gradient map $(\nabla_{x}I, \nabla_{y}I)$ with the following equation:

$$(N_{x}, N_{y}) = \left(\frac{\nabla_{x}I}{W},\ \frac{\nabla_{y}I}{W}\right) \qquad (5)$$

Fig. 2. Edge information of the image is very sensitive to illumination changes. (a) Sample images under different illumination conditions. (b) Corresponding edge maps.

Fig. 3. Structure of the integral normalized gradient image.

Fig. 4.
Toy examples of reconstruction methods: (a) original image, (b) image recovered by the isotropic method, and (c) image recovered by the anisotropic method.

Because $W$ can be taken as an estimate of the extrinsic factor, the illumination effect is reduced in the gradient map after this normalization.

After the normalization, the texture information in the normalized image $(N_{x}, N_{y})$ is still not apparent enough, as shown in Fig. 3. In addition, the division operation of (5) may intensify unexpected noise terms. To recover the rich texture and remove the noise at the same time, we integrate the normalized gradients $N_{x}$ and $N_{y}$ with the anisotropic diffusion method, which we explain in the following, and finally acquire the reconstructed image as shown in Fig. 3.

C. Image Reconstruction

The last INGI procedure is to recover a grayscale image from the normalized gradient maps. Given an initial grayscale value of one point in an image, we can estimate the grayscale of any point by an integration method, such as an iterative isotropic diffusion method, as follows:

(6)

where $t$ is an iteration number, $I^{t}$ is the iteratively reconstructed image, and the initial value is usually given by

(7)

However, this isotropic method has one shortcoming: it blurs the step-edge regions of an image, as shown in Fig. 4(b). To
overcome this weakness, we adopt an anisotropic approach

(8)

where $\lambda$ is a scaling factor that controls the update speed. If $\lambda$ is too large, we may not get stable results; it is set to 0.25 in this paper.

Compared with the result in Fig. 4(b), the image recovered by the anisotropic method in Fig. 4(c) is more edge-preserving and numerically stable. Because complete removal of the illumination variations can lead to a loss of information useful for face recognition, we fuse the reconstructed image with the original input image. The output image can be derived by

(9)

where $R$ is the final reconstructed image from (8) and $\alpha$ is the weighting parameter.

III. LDA BASED UPON FOURIER FEATURE

In this paper, we propose an extended Fourier feature-based recognition scheme for the FRGC, motivated by the MPEG-7 AFD.

A. 2-D Discrete Fourier Features for Face Recognition

The Fourier transform has played a key role in image processing applications for many years because of its wide range of possibilities. To analyze facial features in the Fourier domain, we apply the Fourier transform to a spatial face image. A 2-D face image $f(x,y)$ of size $M \times N$ can be transformed into the frequency domain by

$$F(u,v) = \sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)\, e^{-j2\pi\left(\frac{ux}{M} + \frac{vy}{N}\right)} \qquad (10)$$

where $u = 0, 1, \ldots, M-1$ and $v = 0, 1, \ldots, N-1$ are frequency variables. The Fourier transform of a real function is generally complex; that is,

$$F(u,v) = R(u,v) + jI(u,v) \qquad (11)$$

where $R(u,v)$ and $I(u,v)$ are the real and imaginary components of $F(u,v)$, respectively. The magnitude function, called the Fourier spectrum, is

$$|F(u,v)| = \sqrt{R(u,v)^{2} + I(u,v)^{2}} \qquad (12)$$

and the phase function is defined as

$$\phi(u,v) = \tan^{-1}\left(\frac{I(u,v)}{R(u,v)}\right). \qquad (13)$$

To represent the image as a feature vector, the magnitude coefficients from (12) are widely used instead of the phase values. This is largely because a small spatial displacement in an image changes the phase values drastically, while the magnitude varies smoothly when there is no compensator for the phase shift. In the case of face recognition, this displacement often occurs when an eye detector finds imprecise eye positions and causes a slight misalignment of the face image at the normalization stage.
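The different sensitivities of the magnitude (12) and the phase (13) to such displacements can be checked with a short numerical sketch. A circular shift is used here, which makes the magnitude exactly invariant; the 46×56 face size follows the paper, while the random image is a stand-in for a normalized face:

```python
import numpy as np

# Magnitude vs. phase sensitivity to a 1-pixel alignment error.
rng = np.random.default_rng(0)
img = rng.random((46, 56))             # stand-in for a 46x56 normalized face
shifted = np.roll(img, 1, axis=1)      # mimic a 1-pixel eye-alignment error

F0, F1 = np.fft.fft2(img), np.fft.fft2(shifted)
mag_diff = np.abs(np.abs(F0) - np.abs(F1)).max()      # spectrum change
phase_diff = np.abs(np.angle(F0) - np.angle(F1)).max()  # phase change

print(mag_diff < 1e-6, phase_diff > 1.0)   # True True: magnitude stable, phase not
```

The shift theorem explains the result: a displacement multiplies $F(u,v)$ by a unit-magnitude phase factor, leaving $|F(u,v)|$ unchanged while rotating $\phi(u,v)$.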
On the other hand, Savvides et al. recently showed that the phase coefficient-based method is invariant to illumination variations and tolerant of occlusion problems and that eigenphases give better results than eigenfaces.

In this paper, we introduce three different Fourier features, extracted from the real and imaginary component domain, the Fourier spectrum domain, and the phase angle domain. To avoid the angular wrap-around problem in the phase angle $\phi(u,v)$, we actually use the cosine values of (13):

$$\mathbf{x}_{RI} = [\,R(u,v),\ I(u,v)\,] \qquad (14)$$

$$\mathbf{x}_{S} = |F(u,v)| \qquad (15)$$

$$\mathbf{x}_{P} = \cos\phi(u,v). \qquad (16)$$

B. LDA

LDA is a supervised learning method that finds a linear projection onto a subspace; it maximizes the between-class scatter while minimizing the within-class scatter of the projected data. According to this objective, two scatter matrices, the between-class scatter matrix $S_{B}$ and the within-class scatter matrix $S_{W}$, are defined as

$$S_{B} = \sum_{i=1}^{C} N_{i}\,(\mathbf{m}_{i} - \mathbf{m})(\mathbf{m}_{i} - \mathbf{m})^{T} \qquad (17)$$

$$S_{W} = \sum_{i=1}^{C} \sum_{\mathbf{x} \in X_{i}} (\mathbf{x} - \mathbf{m}_{i})(\mathbf{x} - \mathbf{m}_{i})^{T} \qquad (18)$$

where the set of training data $X = \{X_{1}, \ldots, X_{C}\}$ has $C$ total classes, $\mathbf{m}$ is the sample mean for the entire data set, $\mathbf{m}_{i}$ is the sample mean for the $i$th class $X_{i}$, and $N_{i}$ is the number of samples of class $X_{i}$. To maximize the between-class scatter and minimize the within-class scatter, the transformation matrix $W_{\text{opt}}$ is formulated as

$$W_{\text{opt}} = \arg\max_{W} \frac{\left|W^{T} S_{B} W\right|}{\left|W^{T} S_{W} W\right|}. \qquad (19)$$

In face recognition, when dealing with high-dimensional image data, the within-class scatter matrix is often singular. To overcome this problem, PCA is first applied to the sample data to reduce its dimensionality. In this paper, we call this combination PCLDA.
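A minimal PCLDA sketch of (17)-(19) follows, with toy Gaussian clusters standing in for face features; the dimensionalities and class counts are arbitrary illustrations:

```python
import numpy as np

# PCLDA sketch: PCA first (to avoid a singular within-class scatter),
# then LDA per (17)-(19) in the reduced space. Toy data, not FRGC.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 1.0, (20, 10)) for c in (0.0, 3.0, 6.0)])
y = np.repeat([0, 1, 2], 20)

# PCA: project onto the leading principal components.
mean = X.mean(axis=0)
Xc = X - mean
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
P = Vt[:5].T                         # keep 5 components
Z = Xc @ P

# LDA: between-class (17) and within-class (18) scatter in PCA space.
m = Z.mean(axis=0)
Sb = np.zeros((5, 5))
Sw = np.zeros((5, 5))
for c in np.unique(y):
    Zc = Z[y == c]
    mc = Zc.mean(axis=0)
    Sb += len(Zc) * np.outer(mc - m, mc - m)
    Sw += (Zc - mc).T @ (Zc - mc)

# (19): directions maximizing |W^T Sb W| / |W^T Sw W| via Sw^{-1} Sb.
evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
order = np.argsort(evals.real)[::-1]
W = evecs[:, order[:2]].real         # at most C - 1 = 2 useful directions

features = Z @ W                     # discriminative subspace features
print(features.shape)                # (60, 2)
```

The eigen-decomposition of $S_W^{-1} S_B$ is the standard route to (19); the PCA step makes `Sw` well conditioned, which is the role PCLDA plays for high-dimensional image data.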
Fig. 5. Feature regions are selected according to different frequency bands in the Fourier features. The upper left point of all quadrangles, (0, 0), is the lowest frequency, and the notations are $B_{1}$, $B_{2}$, and $B_{3}$.

IV. MULTIPLE FACE MODELS BASED UPON HYBRID FOURIER FEATURE

In this paper, we apply a complementary scheme to the face recognition system with selective frequency bandwidths and multiple face models based upon different eye distances. To gain more powerful discriminating features, we extract multiblock Fourier features: we first divide an input image into several blocks, and then we apply a 2-D discrete Fourier transform to each block. The Fourier features extracted from the blocks by the band selection rules are finally concatenated.

A. Frequency Band Selection in the Fourier Feature

From the viewpoint of psychophysiology and neurophysiology, the human recognition system selectively utilizes the frequency information of a face image depending upon the recognition task. For example, Sinha et al. observed that humans can recognize familiar faces in low-resolution images and that high-frequency information by itself does not lead to good face recognition performance. According to a survey by Chellappa et al., depending upon the specific recognition task, low, band-pass, and high frequency components may play different roles. For example, gender classification can be accomplished using only low-frequency components, whereas the identification task needs more high-frequency components. As to hybrid images, also known as "Dr. Angry and Mr. Smile," Schyns and Oliva demonstrated that low spatial frequencies and high spatial frequencies are perceived differently at different viewing distances.
In this regard, human beings have the ability to select the proper frequency band relevant to their purposes.

Looking back at the image-based face recognition system, since the dimensionality of a face image is large, an image may contain less informative signals that disturb the recognition procedure. For this reason, analysis of the face model in the Fourier frequency domain provides more opportunities to select useful information, assuming that we know a priori which frequency bands are important. In this paper, we propose three frequency bands ($B_{1}$, $B_{2}$, and $B_{3}$), as shown in Fig. 5.

The proposed band selection uses different frequency ranges beginning at the lowest frequency, because high frequency by itself does not play a pivotal role in identifying face images. The first band, $B_{1}$, whose range is from 0 to 1/6 and from 5/6 to 1 of the image size, focuses on lower frequency information and, for example, outlines the structure of a face image. Second, $B_{2}$ covers higher frequency information, ranging from 0 to 2/6 and from 4/6 to 1, so that we can analyze finer descriptions of a face image excluded from the former band. The full-frequency band, $B_{3}$, finally spans the image's full description.

In terms of the average ratios of the discriminant powers in Fig. 6, the lower frequency band generally has good separability values for all three types of Fourier features. The real and imaginary components are the best, while the phase angle information has the lowest average discriminant power. This figure reveals that all the proposed Fourier features and corresponding frequency bands have different characteristics.

Fig. 6. Ratios of between-class to within-class variability of the Fourier features, defined as $r = \sigma_{B}/\sigma_{W}$, where $\sigma_{B}$ is the variance of the class means and $\sigma_{W}$ is the sum of the variances within each class. The darker the region, the lower the discriminative power, and vice versa. (a) (Discriminant power of the real component + discriminant power of the imaginary component)/2. (b) Fourier spectrum. (c) Phase angle.

B.
Hybrid Fourier Feature for Face Recognition

We have three different Fourier feature domains, namely, the real and imaginary component (RI) domain, the Fourier spectrum domain, and the phase angle domain. Now we present how we apply the three frequency band selections $B_{1}$, $B_{2}$, and $B_{3}$ to the three Fourier feature domains. The discriminant powers shown in Fig. 6 give a hint for how to apply the band selection to each Fourier domain. The RI domain has more powerful descriptions to distinguish faces than the other domains, so we apply all three bands to it. On the other hand, the Fourier spectrum and phase angle domains do not make use of the highest frequency region, because the discriminating power of the highest frequency parts in these Fourier domains is small. Moreover, the higher frequency information of the phase angle is sensitive to small spatial changes, and thus only the lowest band is adopted for it. In this respect, this selection procedure for the phase information is a kind of compensation for the susceptible phase coefficients. Consequently, the lower bands are additionally used to describe the face model with spectrum- and phase-domain features. The whole procedure of the proposed method is summarized in Fig. 7. All Fourier features are independently projected into discriminative subspaces by PCLDA. For example, one output of the RI domain with band $B_{1}$ is derived by

$$\mathbf{y}_{B_{1}}^{RI} = W_{B_{1}}^{RI\,T}\left(\mathbf{x}_{B_{1}}^{RI} - \mathbf{m}_{B_{1}}^{RI}\right) \qquad (20)$$

where $W_{B_{1}}^{RI}$ is the transformation matrix of PCLDA learned from a given training set and $\mathbf{m}_{B_{1}}^{RI}$ is its mean vector. In sequence, the three outputs of the different frequency bands are concatenated as follows:

$$\mathbf{y}^{RI} = \left[\,\mathbf{y}_{B_{1}}^{RI};\ \mathbf{y}_{B_{2}}^{RI};\ \mathbf{y}_{B_{3}}^{RI}\,\right]. \qquad (21)$$
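The band-limited features fed into (20) can be sketched as frequency masks. The wrap-around ranges (0 to 1/6 plus 5/6 to 1, and so on) follow Fig. 5; the helper name and the per-axis mask construction are illustrative assumptions:

```python
import numpy as np

# Frequency-band masks for B1-B3. B1 keeps index fractions in [0, 1/6)
# and [5/6, 1) along each axis (the low frequencies, given FFT wrap-around),
# B2 widens this to [0, 2/6) and [4/6, 1), and B3 keeps everything.
def band_mask(M, N, lo, hi):
    """Mask keeping coefficients whose index fraction is in [0, lo) or [hi, 1)."""
    u = np.arange(M)[:, None] / M
    v = np.arange(N)[None, :] / N
    return ((u < lo) | (u >= hi)) & ((v < lo) | (v >= hi))

M, N = 46, 56                        # normalized face size used in the paper
B1 = band_mask(M, N, 1 / 6, 5 / 6)
B2 = band_mask(M, N, 2 / 6, 4 / 6)
B3 = np.ones((M, N), dtype=bool)

img = np.random.default_rng(2).random((M, N))   # stand-in face image
F = np.fft.fft2(img)
feat_B1 = np.concatenate([F.real[B1], F.imag[B1]])  # RI feature for band B1
print(B1.sum() < B2.sum() < B3.sum())               # nested bandwidths: True
```

Each masked feature vector would then be centered and projected by its own PCLDA matrix as in (20) before the concatenation of (21).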
Fig. 7. Structure of the hybrid Fourier feature based upon PCLDA.

In the other domains, the outputs are calculated in the same manner:

(22)

(23)

The final augmented feature consists of the three different complementary features, with the notation

$$\mathbf{z} = \left[\,\mathbf{y}^{RI};\ \mathbf{y}^{S};\ \mathbf{y}^{P}\,\right]. \qquad (24)$$

In the end, $\mathbf{z}$ is projected into another PCLDA space in order to reduce the dimensionality of the augmented feature:

(25)

C. Multiple Face Model for Robust Face Recognition

In computer vision tasks, internal facial components have commonly been employed, because external features (e.g., hair) are too variable for face recognition. However, in the case of humans, Sinha's results showed that both internal and external facial cues are important and, moreover, that the human visual system sometimes makes strong use of the overall head shape in order to determine facial identity. In this respect, we propose a multiple face model that consists of three face models with different eye distances in the same image size. It is designed to imitate the human visual system and examines a face image from the internal facial components to the external facial shape. Three face models of the same image size, 46×56, are constructed with three different eye distances, and the integral normalized gradient image method is applied to each normalized face image. We finally have fine, dominant, and coarse face models, as shown in Fig. 8. The fine face model is formed to analyze the internal components of a face, such as the eyes, nose, and mouth, while the coarse face model includes the general structure of a face and external components such as the hair, ears, and jaw-line.

Fig. 8. Structure of Fourier-based LDA with multiple face models.
The last one, the dominant face model, is a compromise between the fine model and the coarse model. Since they all have their own individual aspects for analysis, each face model can play a role complementary to the others in the face recognition system. For example, the fine face model is robust to background and hair-style changes but sensitive to pose changes, while the coarse face model shows the opposite tendency.

In the end, we have three different classifiers, and each similarity score is calculated by a normalized correlation. The normalized score between two features $\mathbf{a}$ and $\mathbf{b}$ in the $k$th classifier is defined as

$$S_{k}(\mathbf{a}, \mathbf{b}) = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|\,\|\mathbf{b}\|}. \qquad (26)$$

V. SCORE FUSION BASED UPON LOG-LIKELIHOOD RATIO

In this section, we present two ways to calculate the score fusion: the equal error rate (EER)-based and the LLR-based methods. The EER-based score fusion computes a weighted sum of scores, where each weight is a measure of the discriminating power of the component classifier. The LLR-based score fusion, on the other hand, is motivated by Bayesian theory, and it is known that LLR-based score fusion is optimal in the Neyman-Pearson sense.

A. Score Fusion Based Upon the Weighted Sum Method

One way to combine the scores is to compute a weighted sum as follows:

$$s_{\text{new}} = \sum_{k} w_{k}\, s_{k} \qquad (27)$$

where the weight $w_{k}$ is the amount of confidence we have in the $k$th classifier and its score $s_{k}$. In this work, we use 1/EER as the measure of such confidence. Thus, we have a new score

$$s_{\text{new}} = \sum_{k} \frac{1}{\text{EER}_{k}}\, s_{k}. \qquad (28)$$

B. Score Fusion Based Upon the Log-Likelihood Ratio

We interpret the set of scores as a feature vector from which we perform the classification task. Suppose we have a set of
Fig. 9. (a) and (b) depict the score distributions of two classifiers. The upper plots are (a) $p(s_{1}\mid\text{diff})$ and (b) $p(s_{2}\mid\text{diff})$, the lower plots are (a) $p(s_{1}\mid\text{same})$ and (b) $p(s_{2}\mid\text{same})$, and in (c) the blue color and the red color are $p(s_{1}, s_{2}\mid\text{same})$ and $p(s_{1}, s_{2}\mid\text{diff})$, respectively.

scores $\mathbf{s} = (s_{1}, \ldots, s_{K})$ computed by $K$ classifiers. Now the problem is to decide whether the query-target pair is from the same person or not based upon these scores. We can cast this problem as the following hypothesis testing:

$$H_{\text{diff}}: \mathbf{s} \sim p(\mathbf{s}\mid\text{diff}), \qquad H_{\text{same}}: \mathbf{s} \sim p(\mathbf{s}\mid\text{same}) \qquad (29)$$

where $p(\mathbf{s}\mid\text{diff})$ is the distribution of the scores when the query and target are from different persons, and $p(\mathbf{s}\mid\text{same})$ is the distribution of the scores when the query and target are from the same person.

Fig. 9 gives an example of such distributions and provides the intuition behind the benefits of using multiple scores generated by multiple classifiers. Suppose we have two classifiers and they produce two scores $s_{1}$ and $s_{2}$. Fig. 9(a) shows $p(s_{1}\mid\text{diff})$ and $p(s_{1}\mid\text{same})$, the distributions of a single score. The region between the two vertical lines is where the two distributions overlap. Intuitively speaking, a classification error can occur when the score falls within this overlapped region, and the smaller this overlapped region, the smaller the probability of classification error. Likewise, Fig. 9(b) shows $p(s_{2}\mid\text{diff})$ and $p(s_{2}\mid\text{same})$. Fig. 9(c) shows how the pair of the two scores $(s_{1}, s_{2})$ is distributed. The upper left of Fig. 9(c) is the scatter plot of $(s_{1}, s_{2})$ when the query and target are from the same person, and the upper right of Fig. 9(c) is the scatter plot of $(s_{1}, s_{2})$ when the query and target are from different persons. The bottom of Fig. 9(c) shows how the two scatter plots overlap. Compared with Fig.
9(a) and (b), we can see that the probability of overlap can be reduced by jointly considering the two scores $s_{1}$ and $s_{2}$, which suggests that hypothesis testing based upon the two scores $s_{1}$ and $s_{2}$ is better than hypothesis testing based upon a single score $s_{1}$ or $s_{2}$.

If we know the two densities $p(\mathbf{s}\mid\text{diff})$ and $p(\mathbf{s}\mid\text{same})$, the log-likelihood ratio test achieves the highest verification rate for a given false accept rate according to the Neyman-Pearson lemma:

$$\log\frac{p(\mathbf{s}\mid\text{same})}{p(\mathbf{s}\mid\text{diff})} \underset{\text{diff}}{\overset{\text{same}}{\gtrless}} \tau. \qquad (30)$$

However, the true densities $p(\mathbf{s}\mid\text{diff})$ and $p(\mathbf{s}\mid\text{same})$ are unknown, so we need to estimate these densities by observing scores computed from query-target pairs in the training data. One way to estimate these densities is to use a nonparametric density estimate, such as the Parzen density estimate. In this work, we use parametric density estimation in order to avoid over-fitting and to reduce computational complexity. In particular, we model each $s_{k}$ given diff as a Gaussian random variable with mean $\mu_{k}^{\text{diff}}$ and variance $\sigma_{k}^{\text{diff}\,2}$, and model $(s_{1}, \ldots, s_{K})$ given diff as independent Gaussian random variables with density

$$p(\mathbf{s}\mid\text{diff}) = \prod_{k=1}^{K} g\!\left(s_{k};\ \mu_{k}^{\text{diff}},\ \sigma_{k}^{\text{diff}\,2}\right) \qquad (31)$$

where $g(s; \mu, \sigma^{2})$ is the Gaussian density function. The parameters $\mu_{k}^{\text{diff}}$ and $\sigma_{k}^{\text{diff}}$ are estimated from the scores of the $k$th classifier corresponding to the non-match query-target pairs in the training database. Similarly, we approximate the density of $\mathbf{s}$ given same by $\prod_{k} g(s_{k};\ \mu_{k}^{\text{same}},\ \sigma_{k}^{\text{same}\,2})$, whose parameters $\mu_{k}^{\text{same}}$ and $\sigma_{k}^{\text{same}}$ are computed from the scores of the $k$th classifier corresponding to the match query-target pairs in the training database.

Now we define the fused score to be the log-likelihood ratio, which is given by

$$s_{\text{fused}} = \log\frac{p(\mathbf{s}\mid\text{same})}{p(\mathbf{s}\mid\text{diff})} = \sum_{k=1}^{K}\left[\frac{\left(s_{k} - \mu_{k}^{\text{diff}}\right)^{2}}{2\,\sigma_{k}^{\text{diff}\,2}} - \frac{\left(s_{k} - \mu_{k}^{\text{same}}\right)^{2}}{2\,\sigma_{k}^{\text{same}\,2}}\right] + c \qquad (32)$$

where $c$ is a constant. Note that the new score is a quadratic function of the original set of scores.

C. Comparison of the Two Score Fusion Methods

Let us briefly compare the two score fusion methods. The weighted sum method based upon the EER is heuristic but intuitively appealing and easy to implement. In addition, this method has the advantage of being robust to differences in statistics between the training data and the test data.
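The two fusion rules can be sketched as follows; the distribution parameters and EER values below are illustrative stand-ins for quantities that would be estimated from training scores:

```python
import numpy as np

# Score fusion sketches: the weighted sum (28) with 1/EER weights and the
# quadratic Gaussian LLR (32). All parameters are made-up examples.
mu_same  = np.array([0.70, 0.65])    # per-classifier match-score means
sig_same = np.array([0.10, 0.12])    # per-classifier match-score std devs
mu_diff  = np.array([0.30, 0.25])    # non-match score means
sig_diff = np.array([0.10, 0.15])    # non-match score std devs
eer      = np.array([0.05, 0.08])    # per-classifier equal error rates

def fuse_weighted(s):
    """Weighted sum (28): confidence weight 1/EER per classifier."""
    return np.sum(s / eer)

def fuse_llr(s):
    """LLR score (32) under independent Gaussian score models; the
    log-sigma term plays the role of the constant c."""
    return np.sum((s - mu_diff) ** 2 / (2 * sig_diff ** 2)
                  - (s - mu_same) ** 2 / (2 * sig_same ** 2)
                  + np.log(sig_diff / sig_same))

match = np.array([0.72, 0.68])       # a plausible same-person score pair
nonmatch = np.array([0.28, 0.22])    # a plausible different-person pair
print(fuse_llr(match) > fuse_llr(nonmatch))            # True
print(fuse_weighted(match) > fuse_weighted(nonmatch))  # True
```

Note how `fuse_llr` is quadratic in the scores, giving the nonlinear decision boundary discussed below (32), whereas `fuse_weighted` remains linear.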
Even though the training data and the test data have different statistics, the relative strength of the component classifiers is less likely to change significantly, suggesting that the weighting still makes sense. One drawback of the weighted sum method is that the scores generated by the component classifiers may have different physical or statistical meanings and different ranges. Hence, we should make sure that the range of each score is normalized appropriately. The LLR-based fusion method is more principled than the weighted sum method in that the LLR-based fusion method
1160 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 20, NO. 4, APRIL 2011

is derived from the optimal likelihood ratio test. The method's decision boundary is nonlinear, thereby being able to perform more complex classification.

However, the score fusion in (32) depends upon the parameters that we estimate from the training data. Thus, this method is more sensitive to discrepancies between the statistics of the training data and the test data than the weighted sum method. For instance, the score fusion in (32) is affected by a shift of the mean parameters $\mu_{\mathrm{diff},i}$ and $\mu_{\mathrm{same},i}$.

In our experiments, the LLR-based fusion resulted in a higher verification rate when the false accept rate (FAR) is 0.1%. Hence, a rule of thumb is to use the LLR-based fusion method when we expect that the mean parameters estimated from the training data are good estimates of the true parameters of the test data.

VI. EXPERIMENTAL RESULTS AND DISCUSSIONS

A. FRGC Evaluation Protocol

FRGC provides three components as an evaluation framework: image data sets, experimental protocols, and the infrastructure. The FRGC data corpus contains high-resolution 2-D still images taken under controlled lighting conditions as well as under unstructured illumination. The still images were taken with a 4-megapixel digital camera, and the resolutions are either 1704 × 2272 or 1200 × 1600. The data corpus is divided into training and validation partitions. The data in the training partition was collected in the 2002–2003 academic year. The training set consists of 12,776 images from 222 subjects, with 6,388 controlled still images and 6,388 uncontrolled still images. Images in the validation partition were collected during the 2003–2004 academic year, with time elapse. The validation set contains images from 466 subjects.

The FRGC evaluation protocol consists of six experiments, but we focus on two, Experiments 1 and 4, which involve 2-D still images.
The other protocols are designed for 3-D face recognition or for enrollment of multiple 2-D images. Experiment 1 measures the performance for 16,028 frontal facial images taken under controlled illumination, and Experiment 4 is designed to measure recognition performance for 8,014 uncontrolled frontal still images versus 16,028 controlled images. Experiment 1 is designed to measure the performance of traditional frontal face recognition, but Experiment 4 is a more practical protocol because it contains unexpected illumination changes, blurred images, and some occlusions. Table I summarizes the protocols of Experiments 1 and 4, and Fig. 10 shows sample images.

The verification performance is characterized by two statistics: the verification rate (VR) and the false acceptance rate (FAR). The FAR is computed from comparisons of faces of different people, defined as "non-match." On the other hand, the VR is computed from comparisons of two facial images of the same person, defined as "match." The baseline algorithm and its performance are provided with the FRGC evaluation tools. The BEE baseline algorithm is PCA. The performance is reported on a receiver operating characteristic (ROC) curve that shows the tradeoff between the VR and the FAR. Three ROC experimental results are reported: the images for ROC 1, ROC 2, and ROC 3 are collected within semesters, within a year, and between semesters, respectively. The evaluation tools ensure that results from different algorithms are computed on the same data sets and that performance scores are generated by the same protocol.

TABLE I: DATABASE PROTOCOLS OF FRGC VER. 2.0 EXPERIMENTS. [C] AND [U] MEAN THE CONTROLLED SITUATION AND THE UNCONTROLLED ILLUMINATION CONDITION, RESPECTIVELY.

Fig. 10. Example face images of the FRGC data set. The controlled images are in the first and second columns, and the uncontrolled images are in the other columns.
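The VR and FAR statistics described above can be computed directly from match and non-match similarity scores. The sketch below is illustrative only (it is not FRGC's BEE evaluation code) and assumes that a larger score means a more likely match; the threshold is chosen so the FAR on the non-match set equals the requested operating point.

```python
import numpy as np

def verification_rate(match_scores, threshold):
    """Fraction of genuine (match) comparisons accepted at the threshold."""
    return float(np.mean(np.asarray(match_scores) >= threshold))

def false_accept_rate(nonmatch_scores, threshold):
    """Fraction of impostor (non-match) comparisons accepted at the threshold."""
    return float(np.mean(np.asarray(nonmatch_scores) >= threshold))

def threshold_at_far(nonmatch_scores, far=0.001):
    """Threshold whose FAR on the non-match scores is (about) the given value."""
    return float(np.quantile(np.asarray(nonmatch_scores), 1.0 - far))

# Synthetic impostor and genuine score populations for illustration.
nonmatch = np.random.default_rng(1).normal(0.0, 1.0, 10000)
match = np.random.default_rng(2).normal(4.0, 1.0, 10000)

t = threshold_at_far(nonmatch, far=0.001)        # operating point: FAR = 0.1%
vr = verification_rate(match, t)                 # VR at that operating point
assert false_accept_rate(nonmatch, t) <= 0.001 + 1e-9
```

Sweeping the threshold and plotting VR against FAR traces out exactly the ROC curve that the FRGC protocol reports.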
The VRs at FAR = 0.1% of the baseline algorithm are about 78% and 16% for ROC 1 of Experiments 1 and 4, respectively.

Even though FRGC provides high-resolution images together with ground-truth locations of four fiducial points, namely two eye points, a nose point, and a mouth point, to give more chances to improve the recognition performance, we have down-sampled the face images into small images of size 46 × 56 such that the two eye points are located at specified positions. This is largely because face recognition systems are often implemented on consumer electronics devices, which still have low computational power, and eye locations are easier to find than the other fiducial points in real situations.

B. Experimental Results on Integral Normalized Gradient Image

Yang and Liu proposed the color image discriminant (CID) model for a more effective representation of the color image for face recognition. In this paper, we adopt their approach to make a salient single-channel image from a color image, where the differences are that we use our own weighting parameters and that we apply our novel INGI method to the single-channel image. We call this proposed scheme the INGI-Color method. The single-channel image is given as a weighted sum as follows:

$$I = w_R R + w_G G + w_B B, \qquad (33)$$

where $R$, $G$, and $B$ are the red, green, and blue channel images, respectively. The weight parameters, $w_R$, $w_G$, and $w_B$, are calculated from the discriminant powers of the RGB channel images of the training samples. We divide the training image set into two sets: the target set, which contains 6,681 controlled training images, and the query set, which consists of 6,095 uncontrolled training images. With the training target and query sets, we can measure the performances of the red, green, and blue color channels, respectively, and the weight parameters,
HWANG et al.: FACE RECOGNITION SYSTEM USING MULTIPLE FACE MODEL OF HYBRID FOURIER FEATURE 1161

$w_R$, $w_G$, and $w_B$, are given by the reciprocal values of the corresponding EERs.

Fig. 11. Preprocessing example images. The controlled images are in the first and third columns, and the images in the other columns are the uncontrolled images. The preprocessing methods are (a) gray image, (b) histogram equalization, (c) Retinex, (d) INGI-Gray, the INGI method applied to gray images, and (e) INGI-Color, the INGI method applied to the novel image made by the weighted summation of color channel images.

From the sample images of Fig. 11, we can see that the proposed INGI methods can alleviate the image distortion caused by unexpected illumination changes and enrich the texture information of a face image for recognition. For example, the shadow of the deep-set eyes in Fig. 11(a) is removed by the proposed INGI methods, as shown in Fig. 11(d) and (e), but Fig. 11(b) shows that histogram equalization could not solve this problem adequately.

For the performance comparison of the preprocessing methods, we use a simple PCLDA-based classifier with five different preprocessing methods: no preprocessing, i.e., using the raw image itself, histogram equalization, Retinex, INGI-Gray (the proposed INGI method applied to the gray image), and INGI-Color [the proposed INGI method applied to the single-channel image made by (33)]. Fig. 12(b) shows that the INGI-Color method performs best under the Experiment 4 protocol, but the INGI-Color method is worse than INGI-Gray in Experiment 1. This is largely because the parameters of (33) come from a condition similar to that of Experiment 4. On the other hand, histogram equalization is the best under the Experiment 1 protocol but not under the Experiment 4 protocol.
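The weighted-sum construction of (33) can be sketched as follows. This is a minimal illustration with made-up per-channel EER values; the weight normalization to unit sum and all names are our assumptions, not the paper's implementation.

```python
import numpy as np

def single_channel_image(rgb, eer_r, eer_g, eer_b):
    """Combine an (H, W, 3) image into one channel as in (33).

    The weights are the reciprocals of the per-channel EERs measured
    on the training set, normalized here to sum to 1 (an assumption).
    """
    w = np.array([1.0 / eer_r, 1.0 / eer_g, 1.0 / eer_b])
    w = w / w.sum()  # normalized weights w_R, w_G, w_B
    return rgb[..., 0] * w[0] + rgb[..., 1] * w[1] + rgb[..., 2] * w[2]

# A random 46 x 56 color face crop stands in for a real training image.
rgb = np.random.default_rng(3).uniform(0, 255, size=(56, 46, 3))
gray = single_channel_image(rgb, eer_r=0.08, eer_g=0.06, eer_b=0.12)
assert gray.shape == (56, 46)
```

With unit-sum weights, the output stays in the same intensity range as the inputs, so the result can feed the INGI preprocessing stage unchanged.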
By histogram equalization, the histogram of a controlled image, which is often concentrated in a narrow range of intensity, is extended to the entire intensity range, whereas the histogram of an uncontrolled image, which in general already spans a large range, has little room for such an extension. Therefore, the enriched textures of the controlled images contribute to the good accuracy under the Experiment 1 protocol. On the other hand, under the Experiment 4 protocol, histogram equalization makes the controlled and uncontrolled images of the same person more different. The Retinex method does not produce good results in Experiments 1 and 4, and we can infer that some of the information useful for face recognition is also lost while trying to remove most of the image variation induced by illumination variation. It thus seems natural that we employ the INGI-Color method to overcome the uncontrolled illumination changes in this paper.

Fig. 12. Face recognition performances of the PCLDA by the preprocessing methods in (a) Experiment 1 and (b) Experiment 4.

Fig. 13. Face recognition performances of individual elements of the hybrid Fourier feature based upon face model. For simplicity, we compare only the performances of ROC1 in Experiment 4.

C. Experimental Results on Hybrid Fourier Feature-Based Multiple Face Model

Fig. 13 shows the VRs of individual Fourier features, ranging from 40.13% to 63.42%, on the face model. As indicated in Fig. 6, the features from one of the Fourier domains perform the best, while the worst comes from another domain. The augmented feature consists of all of these Fourier features.
TABLE II: FACE RECOGNITION PERFORMANCES OF THE BEE BASELINE AND THE PROPOSED METHODS IN EXPERIMENTS 1 AND 4.

Fig. 14. Face recognition performances of the three face models and the merged face models in both experiments. (a) Experiment 1. (b) Experiment 4. The multiple face model results are fused by the weighted sum rule.

It gives 69.89% accuracy, which is approximately 6% better than the best single feature in Experiment 4, though its dimensionality is six times that of a single Fourier feature. From this result, we can safely conclude that the proposed Fourier features extracted from different Fourier domains and frequency bands have complementary characteristics, in spite of the extended feature dimensionality.

In the case of the proposed multiple face model, Fig. 14 shows the recognition performances of the fine, dominant, and coarse face models; the dominant face model performs the best, followed in order by the coarse face model and the fine face model. The fine face model retains relatively redundant facial texture and leaves insufficient discriminative features for face recognition; in addition, the coarse face model is too complex to build an ideal transformation matrix within linear subspaces. In spite of the weak performances of these two face models, when they are merged with the dominant face model by the weighted sum rule, they contribute to the performance increase. To be specific, while the maximum VR (ROC1) of the dominant face model is 93.34% in Experiment 1 and 69.97% at FAR = 0.1% in Experiment 4, the multiple face model achieves 95.32% and 77.55%, respectively.

Fig. 15. ROC curves for Experiment 1 corresponding to the multiple face model-based face recognitions fused by weighted sum and by LLR.

D. Experimental Results on Log-Likelihood Ratio (LLR)-Based Score Fusion

How to merge the outputs of individual classifiers to achieve good performance is another challenging problem in face recognition.
All parameters of the LLR and weighted sum methods are calculated in advance from training samples. The results of each method for Experiments 1 and 4 are shown in Table II, Fig. 15, and Fig. 16.² Recognition accuracy always benefits from merging the three face models, and the LLR fusion achieves the best performance in both Experiments 1 and 4. Note that all merged methods show more performance improvement in Experiment 4 than in Experiment 1: because the performance in Experiment 1 is already good, there is little room for significant improvement.

E. Performance Comparisons With Previous Work

As summarized in Table III, most of the methods have tried to combine a local-feature-based classifier and a global-feature-based classifier for better face recognition accuracy. Regarding the performance of individual classifiers, the local Gabor features extracted from high-resolution images, over 120 × 120, achieve better accuracy than the global features from low-resolution images, under 60 × 60. In detail, the former

²The final ROC curves of the proposed method in Experiments 1 and 4 can be found at http://ispl.korea.ac.kr/~wjhwang/project/2010/TIP.html.
TABLE III: PERFORMANCE COMPARISONS WITH PREVIOUSLY PUBLISHED METHODS ON EXPERIMENT 4 OF THE FRGC VER. 2.0 DATABASE. THE ACCURACY IS MEASURED AS VR AT FAR = 0.1% FOR ROC3. THE RESULTS OF OTHER METHODS ARE DIRECTLY CITED FROM THE CORRESPONDING PAPERS.

Fig. 16. ROC curves for Experiment 4 corresponding to the multiple face model-based face recognitions fused by the weighted sum and by LLR.

methods, based on Gabor features, show 73.5%–84.12% VR at FAR = 0.1% for ROC3, and the latter methods, for example, those based on Fourier, DCT, and LBP features, achieve 51%–78.26% VR. However, the global classifiers are not discarded, because the global features have their own complementary characteristics with respect to the local features, and combining the two kinds of features has led to better face recognition accuracy.

The best accuracy, 92.40% VR at FAR = 0.1% for ROC3, was achieved by Liu's method using one high-resolution and two low-resolution color images. The basic framework of Liu's method is based upon the combination of three different features extracted from the proposed novel color channels for improving the accuracy of face recognition. However,

TABLE IV: AVERAGE COMPUTATIONAL TIMES IN EACH MODULE OF THE PROPOSED METHOD, MEASURED ON A 2.66 GHz SINGLE-PROCESSOR PC. THE PREPROCESSING DOES NOT INCLUDE THE FACE NORMALIZATION MODULE.

their algorithm relies on a large number of nontrivial classifiers, giving rise to increased computational complexity.

Among the face recognition methods based upon global features extracted from small-size images, the proposed method performs best.
For example, the proposed method, which consists of only the three global classifiers, shows a better performance, 81.14% VR at FAR = 0.1% for ROC3, than the other global methods, which achieved 78.26%, 69.38%, approximately 73.5%, and 51% VRs, respectively.

To demonstrate that the local feature-based methods have much larger computational complexity than the proposed method, we evaluated the computational times for the proposed method and for the feature extraction stage of each local feature-based method. We considered the feature extraction stage since this stage is expected to be the bottleneck of the local feature-based methods. As Tables IV and V show, for the local feature-based methods, the feature extraction stage alone takes about 2000 ms for some of the methods and 400–17,000 ms for another, which is much larger than 70 ms, the total computational time of the proposed method. The test platform is a 2.66 GHz single-processor PC with 4 GB of RAM. In the case shown in Table V, 30 local patches are selected from a given image, where the size of each patch is somewhere between 16 × 16 and 64 × 64
TABLE V: AVERAGE COMPUTATIONAL TIMES OF THE FEATURE EXTRACTION FUNCTION ONLY. THE VRs OF THE OTHER METHODS ARE CITED DIRECTLY FROM THE CORRESPONDING PAPERS.

but the exact patch sizes are not specified. Hence, we evaluated the computational times for the cases where all patch sizes are at the minimum, 16 × 16, or at the maximum, 64 × 64, and the resulting range of computational times is shown in Table V.

In general, local feature-based methods are suitable for high-resolution images, and their performance depends upon the available image resolution, as demonstrated in Fig. 11 of the cited work. In that figure, the accuracy of the local Gabor feature-based method drops as the size of the normalized face image decreases. In particular, the verification rate for that method decreases from 83% to 68% as the image size changes from 128 × 160 to 48 × 60, as shown in Table V. This is largely because the Gabor wavelet-based method is commonly used for analyzing the fine features in a high-resolution facial image.

Consequently, the proposed method will be useful for scenarios where the input images have small resolution, and it has the benefit of lower computational complexity compared to the local feature-based methods.

VII. CONCLUSION

We have presented a whole face recognition system with preprocessing, feature extraction, classification, and score fusion methods for uncontrolled illumination situations. First, we proposed a preprocessing method based upon the analysis of the face imaging model with definitions of intrinsic and extrinsic factors of a human face, and proposed the INGI method as an illumination-insensitive representation for face recognition. We also proposed hybrid Fourier-based classifiers with multiple face models, which basically consist of three Fourier domains: concatenated real and imaginary components, Fourier spectrum, and phase angle.
The Fourier features are extracted from each domain within its own proper frequency bands, and to gain the maximum discriminant power among the classes, each feature is projected into the linear discriminative subspace with the PCLDA scheme. We build multiple face models, namely, fine, dominant, and coarse face models; they have the same image size with different eye distances. The multiple face models always perform better than the dominant face model alone. Moreover, to effectively utilize the several classifiers, we proposed the score fusion method based upon the LLR at the final stage of the face recognition system. With the proposed method, we achieved an average 95.13% verification accuracy in Experiment 1 and an average 81.49% in Experiment 4. Compared with the other global feature-based algorithms, the proposed system demonstrated successful face recognition accuracy under uncontrolled illumination situations.

REFERENCES

P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, "The FERET evaluation methodology for face recognition algorithms," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 10, pp. 1090–1104, Oct. 2000.
D. M. Blackburn, M. Bone, and P. J. Phillips, "Facial recognition vendor test 2000 evaluation report," Dec. 2000 [Online]. Available: http://www.frvt.org/
P. Phillips, P. Grother, R. Micheals, D. Blackburn, E. Tabassi, and M. Bone, "Face recognition vendor test 2002: Evaluation report," 2003 [Online]. Available: http://www.frvt.org/
K. Messer, J. Kittler, M. Sadeghi, M. Hamouz, A. Kostin et al., "Face authentication test on the BANCA database," in Proc. Int. Conf. Pattern Recognit., Aug. 2004, vol. 4, pp. 523–532.
P. J. Phillips, P. J. Flynn, T. Scruggs, K. Bowyer, J. Chang, K. Hoffman, J. Marques, J. Min, and W. Worek, "Overview of the face recognition grand challenge," in Proc. IEEE Comput. Vis. Pattern Recognit., Jun. 2005, vol. 1, pp. 947–954.
P. N. Belhumeur and D. J.
Kriegman, "What is the set of images of an object under all possible lighting conditions?," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 1996, pp. 270–277.
R. Ramamoorthi and P. Hanrahan, "On the relationship between radiance and irradiance: Determining the illumination from images of a convex Lambertian object," J. Opt. Soc. Amer., vol. 18, no. 10, pp. 2448–2459, 2001.
A. Shashua and T. Riklin-Raviv, "The quotient image: Class-based re-rendering and recognition with varying illuminations," IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 2, pp. 129–139, Feb. 2001.
H. Wang, S. Li, and Y. Wang, "Generalized quotient image," in Proc. IEEE Comput. Vis. Pattern Recognit., Jul. 2004, vol. 2, pp. 498–505.
Q. Li, W. Yin, and Z. Deng, "Image-based face illumination transferring using logarithmic total variation models," Int. J. Comput. Graph., vol. 26, no. 1, pp. 41–49, Nov. 2009.
E. H. Land, "The Retinex theory of color vision," Sci. Amer., vol. 237, no. 6, pp. 108–128, Dec. 1977.
D. J. Jobson, Z. Rahman, and G. A. Woodell, "Properties and performance of a center/surround Retinex," IEEE Trans. Image Process., vol. 6, no. 3, pp. 451–462, Mar. 1997.
R. Gross and V. Brajovic, "An image preprocessing algorithm for illumination invariant face recognition," in Proc. 4th Int. Conf. Audio- and Video-Based Biometric Person Authentication, 2003, vol. 2688/2003, pp. 10–18.
P. Perona and J. Malik, "Scale-space and edge detection using anisotropic diffusion," IEEE Trans. Pattern Anal. Mach. Intell., vol. 12, no. 7, pp. 629–639, Jul. 1990.
M. A. Turk and A. P. Pentland, "Eigenfaces for recognition," J. Cogn. Neurosci., vol. 3, no. 1, pp. 71–86, 1991.
P. S. Penev and J. J. Atick, "Local feature analysis: A general statistical theory for object representation," Network: Comput. Neural Syst., vol. 7, no. 3, pp. 477–500, 1996.
P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection," IEEE Trans. Pattern Anal. Mach.
Intell., vol. 19, no. 7, pp. 711–720, Jul. 1997.
M. S. Bartlett, Face Image Analysis by Unsupervised Learning. Norwell, MA: Kluwer, 2001.
M. Lades, J. C. Vorbruggen, J. Buhmann, J. Lange, C. von der Malsburg, R. P. Wurtz, and W. Konen, "Distortion invariant object recognition in the dynamic link architecture," IEEE Trans. Comput., vol. 42, no. 3, pp. 300–311, Mar. 1993.
L. Wiskott, J. M. Fellous, N. Kruger, and C. von der Malsburg, "Face recognition by elastic bunch graph matching," IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 775–779, Jul. 1997.
J. Chang, M. Kirby, H. Kley, C. Peterson, B. Draper, and R. Beveridge, "Recognition of digital images of the human face at ultra low resolution via illumination spaces," in Proc. Asian Conf. Comput. Vis., Nov. 2007, vol. 4844/2007, pp. 733–743.
B. V. Kumar, M. Savvides, K. Venkataramani, and X. Xie, "Spatial frequency domain image processing for biometric recognition," in Proc. IEEE Int. Conf. Image Process., 2002, vol. 1, pp. 53–56.
M. Savvides, B. Kumar, and P. Khosla, "Corefaces—Robust shift-invariant PCA-based correlation filter for illumination tolerant face recognition," in Proc. IEEE Comput. Vis. Pattern Recognit., Jun. 2004, vol. 2, pp. 834–841.
C. Xie, M. Savvides, and B. V. Kumar, "Kernel correlation filter based redundant class-dependence feature analysis (KCFA) on FRGC2.0 data," in Analysis and Modeling of Faces and Gestures, LNCS 3723, 2005, pp. 32–43.
Advanced Face Descriptor Using Fourier and Intensity LDA Features, ISO/IEC JTC1/SC29/WG11-MPEG-8998, Oct. 2002.
W. Hwang and S. Kee, "International standardization on face recognition technology," Adv. Biometric Person Authent., vol. 3338/2005, pp. 349–357, Dec. 2004.
J. Yang and C. Liu, "Color image discriminant models and algorithms for face recognition," IEEE Trans. Neural Netw., vol. 19, no. 12, pp. 2088–2098, Dec. 2008.
Z. Liu and C. Liu, "Robust face recognition using color information," Adv. Biometrics, vol. 5558/2009, pp. 122–131, 2009.
X. Tan and B. Triggs, "Fusing Gabor and LBP feature sets for kernel-based face recognition," in Proc. IEEE Int. Workshop Anal. Model. Faces Gestures, 2007, pp. 235–249.
Y. Su, S. Shan, X. Chen, and W. Gao, "Hierarchical ensemble of global and local classifiers for face recognition," IEEE Trans. Image Process., vol. 18, no. 8, pp. 1885–1896, Aug. 2009.
X. Wang and X. Tang, "Random sampling LDA for face recognition," in Proc. IEEE Comput. Vis. Pattern Recognit., 2004, vol. 2.
B. Heisele, P. Ho, J. Wu, and T. Poggio, "Face recognition: Component-based versus global approaches," Comput. Vis. Image Understand., vol. 91, no. 1/2, pp. 6–21, 2003.
T. Kim, H. Kim, W. Hwang, and J. Kittler, "Component-based LDA face description for image retrieval and MPEG-7 standardisation," Image Vis. Comput., vol. 23, no. 7, pp. 631–642, Jul. 2005.
P. Sinha, B. J. Balas, Y. Ostrovsky, and R. Russell, "Face recognition by humans: 19 results all computer vision researchers should know about," Proc. IEEE, vol. 94, no. 11, pp. 1948–1962, Nov. 2006.
A. Jain, K. Nandakumar, and A. Ross, "Score normalization in multimodal biometric systems," Pattern Recognit., vol. 38, no. 12, pp. 2270–2285, Dec. 2005.
J. Kittler, M. Hatef, R. P. Duin, and J. Matas, "On combining classifiers," IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 3, pp. 226–239, Mar. 1998.
S. Prabhakar and A. K. Jain, "Decision-level fusion in fingerprint verification," Pattern Recognit., vol. 35, no. 4, pp.
861–873, 2002.
S. Samsung, "Integral normalized gradient image—A novel illumination insensitive representation," in Proc. IEEE Workshop Face Recognit. Grand Challenge Exper., Jun. 2005, pp. 166–172.
Y. Adini, Y. Moses, and S. Ullman, "Face recognition: The problem of compensating for changes in illumination direction," IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 721–732, Jul. 1997.
R. C. Gonzalez and R. E. Woods, Digital Image Processing. Reading, MA: Addison-Wesley, 1992.
L. Wiskott, J. M. Fellous, N. Kruger, and C. von der Malsburg, "Face recognition and gender determination," in Proc. Int. Workshop Face Gesture Recognit., 1995, pp. 92–97.
M. Savvides, B. V. Kumar, and P. Khosla, "Eigenphases vs. eigenfaces," in Proc. Int. Conf. Pattern Recognit., Aug. 2004, vol. 3, pp. 810–813.
W. Hwang, G. Park, J. Lee, and S. Kee, "Multiple face model of hybrid Fourier feature for large face image set," in Proc. IEEE Comput. Vis. Pattern Recognit., Jun. 2006, vol. 2, pp. 1574–1581.
R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: A survey," Proc. IEEE, vol. 83, no. 5, pp. 705–741, May 1995.
P. G. Schyns and A. Oliva, "Dr. Angry and Mr. Smile: When categorization flexibly modifies the perception of faces in rapid visual presentations," Cognition, vol. 69, no. 3, pp. 243–265, Jan. 1999.
T. M. Cover and J. A. Thomas, Elements of Information Theory. Hoboken, NJ: Wiley, 1991.

Wonjun Hwang received the B.S. and M.S. degrees from the Department of Electronics Engineering, Korea University, Seoul, Korea, in 1999 and 2001, respectively.

From 2001 to 2008, he worked as a Researcher for the Face Recognition Group, Samsung Advanced Institute of Technology (SAIT). In 2001–2004, he contributed to the promotion of the Advanced Face Descriptor, a Samsung and NEC joint proposal, to MPEG-7 international standardization.
From 2004 to 2006, he worked on developing the SAIT face recognition engine for the Face Recognition Grand Challenge (FRGC) and the Face Recognition Vendor Test (FRVT) 2006, which achieved the best results under the uncontrolled illumination situation at both FRGC and FRVT 2006. In 2007, he developed the real-time face recognition system for the Samsung cellular phone SGH-V920. He is currently a Senior Research Engineer at the Mechatronics Manufacturing Technology Center, Samsung Electronics Co., Ltd., now working on face recognition for the Samsung humanoid robot, RoboRay. His current research interests are in face recognition, human gesture recognition, object recognition, and computer vision.

Haitao Wang received the M.S. degree in engineering from the University of Science and Technology Beijing, in 2001, and the Ph.D. degree in computer science from the Institute of Automation, Chinese Academy of Sciences, Beijing, China, in 2004.

He is currently a Senior Researcher at the SAIT Beijing Lab, Samsung Advanced Institute of Technology (SAIT). His research interests include 2-D-to-3-D conversion, stereo matching, multi-view 3-D reconstruction, and 3-D display.

Hyunwoo Kim (M'10) received the B.S. degree in electronic communication engineering from Hanyang University, Seoul, Korea, in 1994, and the M.S. and Ph.D. degrees in electrical and computer engineering from POSTECH, Pohang, Korea, in 1996 and 2001, respectively.

In 2000, he was a Visiting Scientist in the Institute for Robotics and Intelligent Systems, University of Southern California, Los Angeles, CA. He worked at Samsung Electronics Co., Ltd. (2001–2007) and Korean German Industrial Park Co., Ltd. (2007–2008) in the fields of image processing, computer vision, robotics, and human-computer interaction (HCI).
In 2009, he joined the Department of New Media at the Korean German Institute of Technology, Seoul, Korea, where he is currently an Assistant Professor, after working for the Intelligent System Research Center at Sungkyunkwan University (2008–2009) as a Research Associate Professor. His current research interests include computer vision, machine learning, cognitive robotics, and augmented reality.

Seok-Cheol Kee received the B.S. and M.S. degrees in control and instrumentation engineering and the Ph.D. degree in electrical engineering, all from Seoul National University, Seoul, Korea, in 1987, 1989, and 2002, respectively.

From 1989 to 2007, he worked as a Principal Research Staff member at the Samsung Advanced Institute of Technology (SAIT). From 2007 to 2010, he worked as Director of the Robot Research Center at Robotever, Inc. Since June 2010, he has been with Mando Co., where he is now head of the Electronics R&D Center. His research interests are in computer vision, image processing, biometrics for multimedia applications, human-computer interaction, sensor fusion, and embedded control systems for robot and automotive applications.

Junmo Kim (S'01–M'05) received the B.S. degree from Seoul National University, Seoul, Korea, in 1998, and the M.S. and Ph.D. degrees from the Massachusetts Institute of Technology (MIT), Cambridge, in 2000 and 2005, respectively.

From 2005 to 2009, he was with the Samsung Advanced Institute of Technology (SAIT), Korea, as a Research Staff Member. He joined the faculty of KAIST in 2009, where he is currently an Assistant Professor of electrical engineering. His research interests are in image processing, computer vision, statistical signal processing, and information theory.