A Robust & Fast Face Detection System


Human face detection is a significant problem in image processing and is usually the first step in face recognition and visual surveillance. This paper presents the details of a face detection approach, based on facial features and a Support Vector Machine (SVM), that is implemented to achieve accurate face detection in group color images. In the first step, the proposed approach quickly separates skin color regions from the background and from non-skin color regions using the YCbCr color space transformation. After the skin regions are detected, the images are processed with wavelet transforms (WT) and discrete cosine transforms (DCT), which yield 30×30 pixel sub-images. These sub-images are then given to the SVM classifier as input. The SVM separates the non-face regions from the remaining regions obtained in the previous steps, exploiting the large difference between face regions and non-face regions. The experimental results on different types of group color images show that this approach improves detection speed, lowers the false detection rate, and detects faces in different color images.


ACEEE Int. J. on Signal & Image Processing, Vol. 01, No. 03, Dec 2010
© 2010 ACEEE, DOI: 01.IJSIP.01.03.135

Ritu Verma, Anupam Agrawal and Shanu Sharma
Indian Institute of Information Technology Allahabad, INDIA
Email: rituverma1021@gmail.com
Indian Institute of Information Technology Allahabad, INDIA
Email: {anupam@iiita.ac.in, shanu.sharma1611@gmail.com}

Index Terms: Face Detection; Skin Color Detection; Wavelet Transform; Discrete Cosine Transform; Support Vector Machine.

I. INTRODUCTION

A face detection system determines the locations and sizes of human faces in arbitrary digital images. It detects facial features and ignores everything else, such as buildings and trees. Recently, researchers have proposed detecting faces by combining features and color to obtain high performance and high speed [1], [4] and [13]. Detecting faces is a crucial step in identification applications such as airport security and law enforcement. Most face recognition and face tracking algorithms assume that the initial face localization is known. A good approach should be fast, achieve a high detection ratio, and cope with faces against complex backgrounds.

This paper presents the implementation of a robust face detection algorithm based on facial features and a linear support vector machine (LSVM). The algorithm deals with several complexities while providing high speed and a high detection ratio. These complexities include finding the number of faces in a group image, varying illumination, occlusion, and a complex background in the input image.

Skin color is a significant feature of a face. It clusters strongly in the YCbCr and HSI color spaces [1]. In YCbCr, Y stands for the "luma" (luminance), that is, the brightness. Cb and Cr stand for the "color difference" of blue minus luma (B-Y) and red minus luma (R-Y), respectively. Skin color is a reliable cue because it is not affected by body posture or facial expression, and it is easily distinguished from the background color. Hence face detection approaches based on skin color are widely used. However, skin color information alone is not sufficient to detect faces precisely. When several faces are very near each other, when face regions and other body regions are close together, or when a skin-like background is connected to the face, the false detection ratio often increases. This problem can be handled by rejecting the false candidate regions with statistical methods. In this face detection system the face sub-images are very small, so statistical learning is used. Statistical learning theory is currently the best theory for small-sample statistical estimation and prediction learning. SVM theory is built on statistical learning theory; its objective is to solve the problem of classification with small samples.

The rest of the paper is organized as follows. Section II summarizes the literature on systems similar to ours and describes a few face detection methods with their merits and demerits. Section III explains the details of the implementation and the methods we have used. Section IV discusses the results of this face detection approach on various types of images, and Section V gives the conclusion and the scope for future work.

II. RELATED WORK

Face detection has been an open challenge for many years, and various solutions have been proposed under the different categories discussed below. Face detection is not an easy task, as detection is affected by many internal and external factors.

The main face detection methods are as follows:

A. Knowledge-Based Method:

In this method the relationships between the facial features of a test image are used to represent the content of a face, which is encoded as a set of rules that are applied down to the finest scale. It is a top-down approach [5]. The merits and demerits of the knowledge-based method are as follows:

Merits
• It is simple to describe the features of a face and their relationships using simple rules.
• Using the coded rules, the facial features of an image are extracted first, and then candidate faces are identified.
Demerits
• Translating human knowledge into precise rules is very difficult.
• General rules may produce many false positives.

B. Template Matching Method:

This method is based on finding the correlation between a test sub-image and predefined, stored face patterns. The predefined images may be the whole face or individual facial features such as the nose, eyes, mouth, eyebrows, and lips [5].

Algorithms used under this method are:

Predefined Face Templates:
Several templates are stored for the whole face, for individual parts of the face, or for both.

Deformable Templates:
An elastic facial feature model is stored as a reference model, and the deformable template model is fitted to the object of interest.

The merits and demerits of the template matching method are as follows:

Merits
• It is simple and easy to implement.

Demerits
• Templates have to be initialized near the face images.
• It is difficult to enumerate templates for different poses.

C. Feature-Invariant Approach:

This approach relies on structural features of the face that do not change under different conditions, such as varying camera viewpoints, pose angles, and/or illumination conditions.

Algorithms used under this approach are:

Colour-Based Approach:
The colour-based approach is also called the skin-model based method. It is based on the fact that the skin colours of different races cluster in a single region, and it uses skin colour as an indication of the presence of human beings [1], [4] and [6].

Facial-Feature Based Approach:
In this method global and/or detailed features are used for face detection. It has become popular in recent years. The global features (e.g. skin, size and shape) are first used to detect candidate areas, which are then tested using detailed features (e.g. eyes, nose, and lips) [13].

The merits and demerits of the feature-invariant approach are as follows:

Merits
• Features are invariant to different poses and orientations of the faces.

Demerits
• It is difficult to locate facial features because of various complexities (illumination, occlusion etc.) in an image.
• It is difficult to detect features against a complex background.

D. Appearance-Based Method:

This method learns the templates from a set of training images. It finds the relevant characteristics of faces and non-faces by using statistical analysis and machine learning techniques [3] and [7].

Algorithms used under this method are:

Eigen Faces:
Eigenfaces are eigenvectors; various algorithms are used to approximate the eigenvectors of the autocorrelation matrix of a candidate image [19].

Neural Network:
A network of simple elements (neurons), called nodes, is used to perform functions in parallel. The idea of the neural network came from the central nervous system. These networks are trained to detect faces by providing them with face and non-face samples [15].

Support Vector Machine:
Support vector machines are learning machines that perform binary classification. The idea is to enlarge the margin between the vectors of the negative and positive sets and obtain an optimal boundary that separates the two sets of vectors [8] and [14].

Hidden Markov Model:
Abbreviated as HMM, it can be considered a simple dynamic Bayesian network. A Hidden Markov Model is a class of statistical model that uses the statistical properties of a signal to model the system that produced it. The Markov parameters are to be estimated from the observed parameters [16].

The merits and demerits of the appearance-based method are as follows:

Merits
• It uses powerful machine learning algorithms and has demonstrated good empirical results.
• It can detect faces in various poses and orientations.

Demerits
• It usually requires searching over space (position) and scale.
• It requires a large number of positive and negative examples.

III. DETAILS OF THE APPROACH IMPLEMENTED

The flow chart of the proposed approach is shown in Figure 1.
[Flow chart: Input image → Color Space Based Segmentation → Morphology Based Operation → Discrete Wavelet Transform → Discrete Cosine Transform → Classification by SVM → Output image]

Figure 1: Flow chart of the approach used for face detection

Steps for Face Detection:
1. First give an RGB image as the input image to the skin color model.
2. The skin color model converts the RGB image to the YCbCr color space [18].
3. To handle varying lighting conditions, convert this output image to the YC'bC'r color space using the elliptical formula [7].
4. To reduce the effect of noise, filter this image with a 3×3 low pass filter, and then apply a morphological (dilation) operation to get a binary image [18].
5. Find the skin regions based on the above binary image.
6. The discrete wavelet transform (DWT) decomposes the given input image into a set of sub-bands of different resolutions and selects the low frequency parts. The newly generated top-left low frequency sub-band is nearly equal to the original image [18].
7. Pass the output of the DWT to the DCT and use a 30×30 window to pick out the significant information of the signal energy [11].
8. A Support Vector Machine is used for classification; it constructs an optimal hyper-plane with the maximum margin of separation between the face and non-face classes [8]. We take 30×30 windows as input and classify them as faces or non-faces.
9. Obtain the final face detected output image.

Details of the main components of the approach are given below:

A. Skin Color Model And Segmentation:

In order to apply this method in a real-time system, skin color detection is adopted; de-noising and lighting compensation are the initial steps of the skin color model, because the lighting condition and noise have a great effect on skin color detection. The YCbCr color space transformation is faster than the other approaches and is popularly used in skin color detection [2]. The YCbCr color space was developed for television systems; because it separates luminance from color, it is widely used in MPEG, JPEG and other image and video compression standards.

First, a linear conversion from the RGB color space to the YCbCr color space is performed. To further reduce the lighting effect and to obtain a good skin color cluster, a segmented non-linear conversion algorithm [7] is then used, which converts the YCbCr color space into the YC'bC'r color space. Segmented skin color regions are obtained by the elliptical cluster method for the skin tones in the transformed YC'bC'r space. It is described in equations (1) and (2) as given below [7].

$$\frac{(x - ec_x)^2}{a^2} + \frac{(y - ec_y)^2}{b^2} \le 1 \qquad (1)$$

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} C'_b - c_x \\ C'_r - c_y \end{bmatrix} \qquad (2)$$

where a = 25.39, b = 14.03, ec_x = 1.60, ec_y = 2.41, θ = 2.53 (in radians), c_x = 109.38, and c_y = 152.02 are computed in the YC'bC'r space [7].

The images obtained after the lighting compensation technique are filtered with a 3×3 low pass filter [18] to minimize the effect of noise. If a pixel then satisfies equation (1) of the elliptical cluster method (YC'bC'r color space), it is marked as 1 and considered a skin color pixel. Otherwise, it is marked as 0 and considered a non-skin color pixel. This process produces a binary output image. Finally, the skin color regions can be detected accurately after a morphological (dilation) operation [18].

B. Discrete Wavelet Transform:

To reduce the training time and the SVM dimension, the samples are compressed by a wavelet transform (WT). Here we use the discrete wavelet transform, which is based on sub-band coding and yields a fast computation of the WT [12]. It is easy to implement and reduces the computation time and resources required.

The discrete wavelet transform decomposes the input image frame into a set of sub-bands of different resolutions. The newly generated low frequency sub-band is nearly equal to the original frame. The DWT is a time-scale representation of the digital signal and is obtained by digital filtering techniques [18]. The amount of information present in the signal is termed the resolution of the signal; it is determined by the filtering operations, while the scale is set by up-sampling and down-sampling. The DWT is represented by a tree of low pass and high pass filters, in which only the low pass output is transformed again at each step. The original signal is thus repeatedly decomposed into parts of lower resolution, and the high frequency components are not analyzed further.

Wavelet coefficients are grouped into wavelet blocks, in which the sub-images hold the horizontal, vertical and diagonal edges of the original image, as shown in Figure 2. The upper-left sub-image is the highest-level low pass sub-image. The concept of a wavelet block gives an association between the coefficients and what they represent spatially in the frame [10].

Figure 2: Wavelet blocks are a reconstruction of the wavelet coefficients. This is a four-level discrete wavelet transform [10]
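As a minimal sketch of sections A and B above, the Python code below marks skin pixels with the elliptical cluster test and then takes the low frequency sub-band of a one-level wavelet decomposition. Several things here are assumptions and not the authors' implementation: the function names `rgb_to_cbcr`, `is_skin`, `skin_mask` and `haar_ll` are hypothetical; the RGB to YCbCr conversion uses the ITU-R BT.601 coefficients; the non-linear YCbCr to YC'bC'r lighting compensation is skipped, so the ellipse is applied to Cb and Cr directly; the ellipse is used as an "inside or on the boundary" test; and the Haar wavelet stands in for the wavelet, which the text does not name.

```python
import math

# Ellipse parameters quoted in the text for the YC'bC'r space [7].
A, B = 25.39, 14.03
ECX, ECY = 1.60, 2.41
THETA = 2.53                # radians
CX, CY = 109.38, 152.02


def rgb_to_cbcr(r, g, b):
    """Chroma (Cb, Cr) of one 8-bit RGB pixel (BT.601 coefficients, assumed)."""
    r, g, b = r / 255.0, g / 255.0, b / 255.0
    cb = 128.0 - 37.797 * r - 74.203 * g + 112.0 * b
    cr = 128.0 + 112.0 * r - 93.786 * g - 18.214 * b
    return cb, cr


def is_skin(r, g, b):
    """True if the pixel's chroma lies inside the skin ellipse.

    Assumption: Cb' = Cb and Cr' = Cr, i.e. the non-linear lighting
    compensation step is omitted in this sketch.
    """
    cb, cr = rgb_to_cbcr(r, g, b)
    # Rotate the chroma coordinates about the cluster centre (cx, cy).
    x = math.cos(THETA) * (cb - CX) + math.sin(THETA) * (cr - CY)
    y = -math.sin(THETA) * (cb - CX) + math.cos(THETA) * (cr - CY)
    # Inside-ellipse test.
    return (x - ECX) ** 2 / A ** 2 + (y - ECY) ** 2 / B ** 2 <= 1.0


def skin_mask(image):
    """Binary skin map (1 = skin, 0 = non-skin) for rows of (r, g, b) tuples."""
    return [[1 if is_skin(*px) else 0 for px in row] for row in image]


def haar_ll(gray):
    """Low frequency (LL) sub-band of a one-level 2-D Haar transform.

    `gray` is a 2-D list of numbers; an odd trailing row or column is dropped.
    Assumption: Haar wavelet with orthonormal scaling (sum of 2x2 block / 2).
    """
    h, w = len(gray) // 2 * 2, len(gray[0]) // 2 * 2
    return [[(gray[i][j] + gray[i][j + 1]
              + gray[i + 1][j] + gray[i + 1][j + 1]) / 2.0
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]


if __name__ == "__main__":
    # One skin-toned pixel and one pure blue pixel.
    print(skin_mask([[(224, 172, 150), (0, 0, 255)]]))
    # A flat 4x4 image shrinks to a flat 2x2 approximation.
    print(haar_ll([[1, 1, 1, 1]] * 4))
```

Each call to `haar_ll` halves both image dimensions, so applying it repeatedly to its own output mimics the multi-level decomposition in which only the low pass branch is transformed again. The 3×3 low pass filtering and the dilation step are left out to keep the sketch short.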
C. Collecting Training Sample:

In previous methods, training samples are collected directly from a database, and the non-face samples are selected from scenery images such as buildings, plants, trees and so on, which narrows the selection scope. Here, in contrast, the training samples are selected after processing with the color transform, de-noising, detection of skin color regions and so on. We use 12 images for testing, collected from a personal digital camera and from the database [17]. After the initial steps (color space transformation, lighting compensation and detection of skin regions) we get scaled images. From the scaled images we extract 30×30 pixel sub-images; we get around 700 sub-images from the 12 testing images and sort them into 150 faces and 550 non-faces.

D. Discrete Cosine Transform:

The DCT is a good example of transform coding [18]. The JPEG image standard uses the DCT as its basis. The discrete cosine transform relocates the high-valued energies (information) to the upper left corner of the image, and the lesser energies are relocated to other areas [11]. The discrete cosine transform has a near-optimal energy compaction property [9]. It separates the given image into sub-bands (parts of the image) of differing importance with respect to visual quality. The DCT offers good feature extraction and excellent data compression at a low computational cost. It makes detection robust to lighting effects and variations.

Energy compaction is the main property of the DCT [11]. The effectiveness of a transformation scheme can be gauged directly by its ability to pack the input data into as few coefficients as possible. This allows the quantizer to discard coefficients with relatively small amplitudes and still reconstruct the image without visible distortion. The DCT exhibits excellent energy compaction for highly correlated sub-images. Transform coding relies on the fact that the pixels in an image exhibit a certain level of correlation with their neighboring pixels. The same holds in video transmission, where adjacent pixels in consecutive frames show very high correlation. We take the output of the Discrete Wavelet Transform as the input to the Discrete Cosine Transform and use a 30×30 window to pick out the significant information of the signal energy. The sample feature vector is extracted and compacted by the DCT [7].

E. Support Vector Machine:

An SVM is a supervised machine learning technique that is applicable to classification and regression. Support vector machine theory was developed by Vladimir Vapnik and his team in 1995 at AT&T Bell Laboratories. Its principle is based on structural risk minimization, so it has very good generalization ability [8]. Generalization means the summation of data and knowledge.

The main aim of statistical learning theory is to present a framework for studying the problem of inference, that is, of gaining knowledge, making predictions, making decisions or constructing models from a set of data. The SVM method adopts a kernel function, so it is able to overcome the dimension problem and is well suited to non-linear problems. An LSVM classifier is designed for the classification, and LibSVM [8] is used to train it on the samples. The linear kernel function is adopted here:

$$K(x_i, x_j) = \langle x_i, x_j \rangle \qquad (3)$$

In a binary classification with l sample points:

$$(x_i, y_i), \quad i = 1, 2, 3, \ldots, l \qquad (4)$$

where $x_i \in R^n$ is a sample and $y_i \in \{+1, -1\}$ is its class label [7]. This system finds faces by exhaustively scanning an image for face-like patterns at several possible scales: it divides the original image into overlapping sub-images and assigns each of them to the appropriate class, face or non-face, using the support vector machine. Figure 3 shows the geometrical interpretation of what the support vector machine provides in the framework of face detection. The support vector machine is used in the classification step, which is the essential part of the work. The support vector machine classifies all window patterns, and if the class of a window is face, a square is drawn around the face in the output image.

[Figure 3 panel labels: Non-Faces, Faces]

Figure 3: The SVM separates the face and non-face classes (geometrical interpretation). The patterns are real support vectors obtained after training the system [8]

IV. EXPERIMENTAL RESULTS

Here the proposed methodology is evaluated on a face image database constructed for face detection from personal photo collections and from the internet [17]. The color images in the database were taken under different complexities, such as varying illumination conditions and occlusion in group photographs with complex backgrounds. The approach achieves a detection rate of 87.65% and takes between 9.38 sec and 11.97 sec to process an image. The face detection time depends on the complexity of the testing color image. Further, the approach is able to detect multiple faces with a broad range of facial variations in an image.

A. Discussion of the output images shown in section B:

1. The first input image is the original RGB image, which we get either from the personal dataset or from the internet datasets [17], with different complexities. For example, the given input image1 has varying illumination over different faces and a complex background.
2. Perform low pass filtering to reduce the effect of noise and, to handle varying lighting conditions, apply the elliptical formula (as discussed above) to the input image. From this we get the binary skin map image.
3. The third image shows the skin region detected image of the input RGB image. Here we separate the background of the image from the skin color regions.
4. For the fourth image, perform the dilation operation (a morphological operation) on the 2nd image, the skin map image. The dilation operation accepts structuring element objects, known as STRELs [18].
5. The fifth image shows the dilated skin region detected image of the input image, obtained after applying the above operations to the 4th image.
6. Apply the discrete wavelet transform to get the sixth, scaled image.
7. After getting the scaled image, apply the discrete cosine transform. In this process the image is divided into 30×30 sub-images, and we train on all sub-images as face or non-face sub-images.
8. In the seventh image, the Support Vector Machine (SVM) is used for classification; it constructs an optimal hyper-plane with the maximum margin of separation between the face and non-face classes.
9. Finally we obtain the final face detected output image (image8) after classification, in which the faces are enclosed in boxes.

Here, we have collected 12 testing color images of different sizes and different complexities. Of these 12 testing group color images, the first six images (1 to 6) are taken from a personal digital camera and the next six images (7 to 12) are taken from the face detection dataset "Bao Face Database" [17]. There are 81 faces in total in the 12 images, of which 71 are detected successfully. This approach gives an accuracy of 87.65% at a good speed. Once trained on the face and non-face samples, it detects the faces in an image in between 9.38 sec and 11.97 sec. The detection time depends on the complexity of the image. Table I and Table II show the results of finding faces in the different input images.

TABLE I: FACE DETECTION RESULTS ON THE SIX PERSONAL TESTING COLOR IMAGES (1 TO 6).

Sr. no.   Number of faces   Correct detection   Missing detection   Detection time
          in image          of faces            of faces            of faces (sec)
1         6                 6                   0                   9.87
2         6                 6                   0                   10.16
3         6                 6                   0                   9.77
4         5                 5                   0                   9.64
5         6                 6                   0                   9.38
6         4                 3                   1                   11.97

TABLE II: FACE DETECTION RESULTS ON THE SIX DATABASE TESTING COLOR IMAGES (7 TO 12).

Sr. no.   Number of faces   Correct detection   Missing detection   Detection time
          in image          of faces            of faces            of faces (sec)
7         12                8                   4                   10.24
8         9                 6                   3                   9.99
9         8                 8                   0                   9.96
10        5                 5                   0                   10.88
11        7                 5                   2                   11.44
12        7                 7                   0                   11.20

The complexities in the different input images shown in section B and section C below are:
1. Image1 has varying illumination over different faces and a complex background (skin-like background).
2. Image9 has occlusion and a complex background.
3. Image10 has tilted faces.

B. The output images (2 to 8) generated by the various steps on input image (1) are given below:

Image1. The original RGB image
Image2. Skin map image
Image3. Skin region detected image
Image4. Dilated skin map image
Image5. Dilated skin region detected image
Image6. Scaled image after applying DWT
Image7. Classification by SVM image
Image8. Final face detected image
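The aggregate figures quoted in this section can be re-derived from the per-image rows of Table I and Table II. The short Python sketch below does that arithmetic; the names `RESULTS` and `summarize` are hypothetical and introduced only for illustration, and the data are copied from the two tables.

```python
# (faces in image, faces correctly detected, detection time in seconds),
# one tuple per test image, copied from Table I and Table II.
RESULTS = [
    (6, 6, 9.87), (6, 6, 10.16), (6, 6, 9.77),
    (5, 5, 9.64), (6, 6, 9.38), (4, 3, 11.97),     # images 1-6 (personal)
    (12, 8, 10.24), (9, 6, 9.99), (8, 8, 9.96),
    (5, 5, 10.88), (7, 5, 11.44), (7, 7, 11.20),   # images 7-12 (database)
]


def summarize(results):
    """Aggregate per-image counts into overall detection statistics."""
    total = sum(faces for faces, _, _ in results)
    detected = sum(found for _, found, _ in results)
    times = [seconds for _, _, seconds in results]
    return {
        "total_faces": total,
        "detected": detected,
        "missed": total - detected,
        "detection_rate_pct": round(100.0 * detected / total, 2),
        "min_time_s": min(times),
        "max_time_s": max(times),
    }


if __name__ == "__main__":
    for key, value in summarize(RESULTS).items():
        print(key, value)
```

The tables list missed faces but not false positives, so only the detection rate and the timing range can be recomputed this way; a false detection rate cannot be derived from these rows.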
C. The output for more images with different complexities:

Image9. Face detected image
Image10. Face detected image

V. CONCLUSION AND FUTURE WORK

This paper discusses a robust and fast face detection approach whose implementation is based on facial features and LibSVM. Statistical learning theory depends on the training samples. Because the samples are selected from the skin color regions found by the non-linear conversion, the quality of the samples and the performance of the classifier are improved. We use the discrete wavelet transform for compression and the discrete cosine transform for extracting the feature vectors of the sample images, so the matching time and the training difficulty of the support vector machine are reduced and the algorithm is sped up. The results show that the algorithm achieves good detection accuracy (around 87.65%), a lower false detection rate and improved speed, which makes the algorithm highly robust.

Further, the present work may be extended to reduce the false detection rate, solve the problem of shifted boxes and improve its accuracy for face recognition.

ACKNOWLEDGEMENT

The authors would like to express sincere gratitude to the Director of the Institution, Dr. M.D Tiwari, for providing excellent computational facilities and a stimulating work environment for carrying out the research work.

REFERENCES

[1] R.L. Hsu, M. Abdel-Mottaleb and A.K. Jain, "Face Detection in Color Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 24, no. 5, May 2002, pp. 696-706.
[2] Yu-Ting Pai, Shanq-Jang Ruan, Mon-Chau Shie and Yi-Chi Liu, "A Simple and Accurate Color Face Detection Algorithm in Complex Background," IEEE International Conference on Multimedia and Expo, ICME-2006, pp. 1545-1548.
[3] Michael J. Jones and Paul Viola, "Fast Multi-view Face Detection," TR2003-96, August 2003. Available on: http://www.merl.com/reports/docs/TR2003-96.pdf. (Accessed on 5th Feb 2010).
[4] Yepeng Guan and Lin Yang, "An unsupervised face detection based on skin color and geometric information," Sixth International Conference on Intelligent Systems Design and Applications, (ISDA06), Volume 2, 2006, pp. 272-276.
[5] Muhammad Usman Ghani Khan and Atif Saeed, "Human detection in vedios," Journal of Theoretical and Applied Information Technology, 2009, pp. 212-220.
[6] C. Garcia, G. Zikos and G. Tziritas, "Face Detection in Color Images Using Wavelet Packet Analysis," IEEE Conference on Multimedia Computing and Systems, volume 1, 1999, pp. 703-708.
[7] Jinxin Ruan and Junxun Yin, "Face Detection Based on Facial Features and Linear Support Vector Machines," International Conference on Communication Software and Networks, 2009, pp. 371-375.
[8] C. C. Chang and C.J. Lin, "LIBSVM - A Library for support Vector Machines," 2008. Available on: http://www.csie.ntu.edu.tw/~cjlin/libsvm/ (Accessed 10th Feb 2010).
[9] Chai Beng Seow, Regina Gani and Varun Jeoti, "Wavelet-DCT based image coder for video coding applications," International Conference on Intelligent and Advanced Systems, ICIAS, 2007, pp. 748-752.
[10] J. Karlekar and U. B. Desai, "Finding faces in color images using wavelet transform," International Conference on Image Analysis and Processing, 1999, pp. 1085-1088.
[11] "The Discrete Cosine Transform (DCT): Theory and Applications," Available on: http://www.wisnet.seecs.nust.edu.pk/publications/tech_reports/DCT_TR802.pdf (Accessed 15th Mar 2010).
[12] R. de Queiroz, C. K. Choi, Y. Huh and K. R. Rao, "Wavelet Transforms in a JPEG-Like Image Coder," IEEE Transaction on Circuit and Systems for Video Technology, volume 7, no. 2, April 1997, pp. 419-424.
[13] Tse-Wei Chen, Shou-Chieh Hsu and Shao-Yi Chien, "Automatic Feature-Based Face Scoring in Surveillance Systems," Ninth IEEE International Symposium on Multimedia, ISM, 2007, pp. 139-146.
[14] E. Osuna, R. Freund and F. Girosi, "Training Support Vector Machines: An Application to Face Detection," IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1997, pp. 130-136.
[15] L. Mostafa and S. Abdelazeem, "Face Detection based on Skin Color using Neural Networks," First ICGST International Conference on Graphics, Vision and Image Processing GVIP 2005, pp. 53-58.
[16] Yi-Qiong, Bi-Cheng Li and Bo Wang, "Face detection and recognition using neural network and hidden markov models," International conference on neural networks and signal processing, 2003, pp. 228-231.
[17] Face detection datasets "Bao Face Database", Available on: http://www.facedetection.com/facedetection/datasets.htm (Accessed 10th May 2010).
[18] Rafael C. Gonzalez, Richard E. Woods and Steven L Eddins, "Digital Image Processing using MATLAB", Pearson Education Inc., 2005.
[19] Hossein Sahoolizadeh and Youness Aliyari Ghassabeh, "Face recognition using eigen-faces, fisher-faces and neural networks," 7th IEEE International Conference on Cybernetic Intelligent Systems, 2008, pp