Component-Based Ethnicity Identification
from Facial Images
A. Boyseens and S. Viriri
School of Maths, Statistics and Computer Science,
University of KwaZulu-Natal, Durban, South Africa
{210501411,viriris}@ukzn.ac.za
Abstract. This paper presents an exhaustive component-based analysis
to identify ethnicity from facial images. The different ethnic groups
identified are Asian, African, African American, Asian Middle East,
Caucasian and Other. The classification techniques investigated include
Decision Trees, Naïve Bayes, Random Forest and K-Nearest Neighbor. Naïve
Bayes achieved 84.7 % and 85.6 % accuracy rates for African ethnicity and
Asian ethnicity identification, respectively. Decision Trees achieved an
85.8 % identification rate for African American ethnicity, while K-Nearest
Neighbor achieved 86.8 % for Asian Middle East ethnicity and Random
Forest achieved 90.8 % for Caucasian ethnicity. This
research work achieved an overall ethnicity identification rate of 86.6 %.
1 Introduction
This paper investigates methods of analysing and identifying the ethnicity
shown in a facial image. The ethnicities considered are Asian, African,
African American, Asian Middle East, Caucasian and Other. The aim of this
research work is to investigate a model for efficient ethnicity
identification using facial components.
Ethnicity is a socially defined category of people who identify with each
other based on common social, cultural and ancestral backgrounds [1]. Ethnic
facial recognition is an important biometric authentication technology.
Facial recognition has various applications, including law enforcement,
security systems and biometric systems, which motivates the need for an
ethnic facial recognition system.
2 Related Work
Lu and Jain [2] described techniques that use Nearest Neighbor (NN) and
Linear Discriminant Analysis (LDA) for ethnicity identification from facial
images. The two classes identified are Asian and Non-Asian ethnicity. The
feature extraction techniques used were Hu Moments and Zernike Moments. The
experimental results achieved an average accuracy rate of 86 %. One shortfall
is that the method only classified two classes, Asian and Non-Asian
ethnicity. Another shortfall was that the Product Rule was used to achieve an
integrated strategy to combine outputs, and the results achieved are
estimates due to the extensive cross-validation that was used.
© Springer International Publishing AG 2016
L.J. Chmielewski et al. (Eds.): ICCVG 2016, LNCS 9972, pp. 293–303, 2016.
DOI: 10.1007/978-3-319-46418-3_26
Buchala et al. [3] discussed the effects that Principal Component Analysis
(PCA) has on the identification of gender, ethnicity, age and identity from
facial images. Three classes are identified: Caucasian, African American and
East Asian. The feature extraction technique was Elastic Bunch Graph. The
experimental results achieved an average accuracy for ethnicity of 81.67 %.
This paper had the following shortfalls: it only classified three different
classes (Caucasian, African American and East Asian), and the authors assumed
that using Linear Discriminant Analysis (LDA) would achieve better results
than PCA.
Tin and Sein [4] analysed the use of Nearest Neighbor (NN) and Principal
Component Analysis (PCA) to achieve ethnicity identification from facial
images. Their paper distinguishes between two classes, Myanmar and
Non-Myanmar ethnicities. The feature extraction technique used was Hu
Moments. The experimental results obtained by the paper are on average 92 %
ethnicity identification accuracy. The shortcoming was that it only
classified two different classes, Myanmar and Non-Myanmar. The authors
assumed that more images are needed to achieve a better result and that the
system needs to be more identity-sensitive to features that are closer
together.
3 Methods and Techniques
The component-based ethnicity identification system is depicted in Fig. 1.
3.1 Facial Components
The facial components are extracted using the Haar Transformation [5], which
is real and orthogonal. The Haar Transformation $HT^n(f)$ of an $n$-input
function $X^n(f)$ is the $2^n$ element vector described in Eq. (1). The Haar
Transformation cross-multiplies a function with the Haar matrix, which
contains Haar functions with different widths at different locations. It is
calculated at two levels, decomposing the discrete signal into two components
at half of the original length [5].

$$HT^n(f) = H^n X^n(f) \tag{1}$$

where $n$ is the number of elements in the function, $H^n$ is the Haar matrix
and $X^n(f)$ is the $2^n$ element vector. The components that are extracted
are the left eye, right eye, mouth, nose, chin, forehead, left cheek and right
cheek.
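The two-level decomposition described above can be illustrated on a 1-D discrete signal. This is a minimal sketch, assuming the orthonormal averaging/differencing form of the Haar step; the function names `haar_step` and `haar_two_levels` are hypothetical, and the sketch is not the component extractor used in this work.

```python
import math


def haar_step(signal):
    """One Haar level: split an even-length signal into two half-length parts.

    Assumption: orthonormal scaling by 1/sqrt(2), so signal energy is preserved.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    # Pairwise sums give the approximation, pairwise differences the detail.
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_two_levels(signal):
    """Apply the step twice; the second level decomposes the approximation."""
    approx1, detail1 = haar_step(signal)
    approx2, detail2 = haar_step(approx1)
    return approx2, detail2, detail1
```

For a constant signal every detail coefficient is zero, which is a quick sanity check on the implementation.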
3.2 Feature Extraction
From each component a feature vector is obtained. Different textural and
structural (geometrical) feature extraction techniques were analysed, namely
7 Hu Moments, Zernike Moments, Local Binary Pattern (LBP), Gabor Filter and
Haralick Texture Moments, in order to select the most suitable feature
extraction technique for each component.
Fig. 1. Overview of ethnicity identification system
The Hu Moments are a set of invariant moments that characterize an image
region regardless of its scale, position, size and orientation. They are
computed by normalizing the central moments through order 3 [6].
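As an illustration of how normalized central moments lead to invariants, the sketch below computes only the first two of the seven Hu Moments, using the standard definitions $\phi_1 = \eta_{20} + \eta_{02}$ and $\phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$. The function name `hu_first_two` and the list-of-lists image representation are assumptions made for this example.

```python
def hu_first_two(image):
    """Return the first two Hu invariants of a 2-D intensity grid.

    image: list of rows of non-negative numbers (row index = y, column = x).
    """
    m00 = sum(sum(row) for row in image)
    if m00 == 0:
        raise ValueError("image has zero total intensity")
    # Centroid of the intensity distribution.
    cx = sum(x * v for row in image for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(image) for v in row) / m00

    def mu(p, q):
        # Central moment of order (p, q): translation invariant.
        return sum(((x - cx) ** p) * ((y - cy) ** q) * v
                   for y, row in enumerate(image) for x, v in enumerate(row))

    def eta(p, q):
        # Normalized central moment: adds scale invariance.
        return mu(p, q) / m00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2
```

Shifting the same pattern inside a larger grid leaves both values unchanged, which is the translation invariance the text refers to.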
The Zernike Moments are used to overcome the redundancy that certain
geometric moments exhibit [6]. They are a class of orthogonal moments which
are rotationally invariant and effective in image representation. Zernike
polynomials are a set of complex, orthogonal polynomials defined on the
interior of the unit circle. The general form of the Zernike Moments is
defined in Eq. (2).

$$Z_{nm}(x, y) = Z_{nm}(\rho, \theta) = R_{nm}(\rho)\, e^{jm\theta} \tag{2}$$

where $x$, $y$, $\rho$ and $\theta$ correspond to Cartesian and polar
coordinates respectively, $n \in \mathbb{Z}^+$ and $m \in \mathbb{Z}$,
constrained to $n - m$ even and $m \le n$.

$$R_{nm}(\rho) = \sum_{k=0}^{\frac{n-m}{2}} \frac{(-1)^k (n-k)!}{k!\left(\frac{n+m}{2}-k\right)!\left(\frac{n-m}{2}-k\right)!}\, \rho^{n-2k} \tag{3}$$

where $R_{nm}(\rho)$ is a radial polynomial and $k$ is the summation index.
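The radial polynomial of Eq. (3) translates directly into a short function. This is a minimal sketch, assuming non-negative $m$ so that the factorials are defined; the name `zernike_radial` is hypothetical.

```python
from math import factorial


def zernike_radial(n, m, rho):
    """Evaluate the radial polynomial R_nm(rho) of Eq. (3).

    Assumption: 0 <= m <= n with n - m even, matching the stated constraints.
    """
    if m < 0 or m > n or (n - m) % 2 != 0:
        raise ValueError("requires 0 <= m <= n and n - m even")
    total = 0.0
    for k in range((n - m) // 2 + 1):
        numerator = (-1) ** k * factorial(n - k)
        denominator = (factorial(k)
                       * factorial((n + m) // 2 - k)
                       * factorial((n - m) // 2 - k))
        total += numerator / denominator * rho ** (n - 2 * k)
    return total
```

For example, $R_{20}(\rho) = 2\rho^2 - 1$, so `zernike_radial(2, 0, 0.5)` evaluates to $-0.5$.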
The Haralick Texture Moments are texture features that can be used to
analyze the spatial distribution of the image's texture [6] at different
spatial positions and angles. Four of these Haralick Texture Moments are
computed: Energy, Entropy, Correlation and Homogeneity.
Entropy reflects the disorder and complexity of the image's texture. It is
defined in Eq. (4).

$$\text{Entropy} = \sum_{i,j} \hat{f}(i, j) \log \hat{f}(i, j) \tag{4}$$

where $\hat{f}(i, j)$ is the $[i, j]$ entry of the gray-level value of the
image matrix and $i$ and $j$ are points on the image matrix.
Energy is the measure of local homogeneity and is the opposite of Entropy.
It shows the uniformity of the image's texture and is computed using Eq. (5).

$$\text{Energy} = \sum_{i,j} \hat{f}(i, j)^2 \tag{5}$$

where $\hat{f}(i, j)$ is the $[i, j]$ entry of the gray-level value of the
image matrix and $i$ and $j$ are points on the image matrix.
Homogeneity reflects the evenness of the image's texture and the scale of
local changes in that texture. High homogeneity indicates no change between
regions with regard to the image's texture. It is defined in Eq. (6).

$$\text{Homogeneity} = \sum_{i} \sum_{j} \frac{1}{1 + (i - j)^2}\, \hat{f}(i, j) \tag{6}$$

where $\hat{f}(i, j)$ is the $[i, j]$ entry of the gray-level value of the
image matrix and $i$ and $j$ are points on the image matrix.
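Correlation is the consistency of the image's texture, which is described in
Eq. (7).

$$\text{Correlation} = \sum_{i,j} \frac{(i - \mu_i)(j - \mu_j)\, \hat{f}(i, j)}{\sigma_i \sigma_j} \tag{7}$$

in which $\mu_i$, $\mu_j$, $\sigma_i$ and $\sigma_j$ are described as:

$$\mu_i = \sum_{i=1}^{n} \sum_{j=1}^{n} i\, \hat{f}(i, j), \qquad \mu_j = \sum_{i=1}^{n} \sum_{j=1}^{n} j\, \hat{f}(i, j) \tag{8}$$

$$\sigma_i = \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{n} (i - \mu_i)^2\, \hat{f}(i, j)}, \qquad \sigma_j = \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{n} (j - \mu_j)^2\, \hat{f}(i, j)} \tag{9}$$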
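The four texture measures of Eqs. (4) to (9) can be sketched together. The example assumes the matrix $\hat{f}$ is square, non-negative and normalized to sum to 1, with $i$ and $j$ running from 1 to $n$; the function name `haralick_features` and the dictionary layout are hypothetical.

```python
import math


def haralick_features(f):
    """Compute energy, entropy, homogeneity and correlation of a matrix f.

    Assumptions: f is an n x n list of lists whose entries sum to 1.
    Entropy follows Eq. (4) as written; many references negate this sum.
    """
    n = len(f)
    cells = [(i, j, f[i - 1][j - 1])
             for i in range(1, n + 1) for j in range(1, n + 1)]
    energy = sum(v ** 2 for _, _, v in cells)
    # Zero entries are skipped, using the convention 0 * log(0) = 0.
    entropy = sum(v * math.log(v) for _, _, v in cells if v > 0)
    homogeneity = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    sigma_i = math.sqrt(sum((i - mu_i) ** 2 * v for i, _, v in cells))
    sigma_j = math.sqrt(sum((j - mu_j) ** 2 * v for _, j, v in cells))
    if sigma_i * sigma_j == 0:
        correlation = float("nan")  # undefined for a constant row or column
    else:
        correlation = sum((i - mu_i) * (j - mu_j) * v
                          for i, j, v in cells) / (sigma_i * sigma_j)
    return {"energy": energy, "entropy": entropy,
            "homogeneity": homogeneity, "correlation": correlation}
```

A matrix with all its mass on the diagonal gives homogeneity 1 and correlation 1, the extreme case of a perfectly regular texture.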
The Gabor Filters are geometric moments which are the product of an
elliptical Gaussian and a sinusoid [7]. The Gabor elementary function can be
defined as the product of a pulse with a harmonic oscillation of a given
frequency.

$$g(t) = e^{-\alpha^2 (t - t_0)^2}\, e^{-i 2\pi (f - f_0) + \phi} \tag{10}$$

where $\alpha$ is the time duration of the Gaussian envelope, $t_0$ denotes
the centroid, $f_0$ is the frequency of the sinusoid and $\phi$ denotes the
phase shift.
Local Binary Pattern (LBP) is a geometric moment operator that describes the
surroundings of a pixel by obtaining a bit code for that pixel [8]. It is
defined using Eq. (11).

$$C = \sum_{k=0}^{7} 2^k b_k \tag{11}$$

$$b_k = \begin{cases} 1, & t_k \ge C \\ 0, & t_k < C \end{cases} \qquad k = 0, \ldots, 7$$

where $t_k$ is the gray-scale value, $b_k$ is a binary variable taking the
value 1 or 0 and $C$ is a constant value.
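A small sketch of the bit-code construction in Eq. (11) for a single 3 x 3 neighbourhood follows. It assumes the common LBP convention in which each neighbour $t_k$ is thresholded against the centre pixel, and it fixes a clockwise neighbour ordering starting at the top-left; both choices, and the name `lbp_code`, are assumptions for illustration rather than details stated above.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code of the centre pixel of a 3 x 3 patch.

    Assumptions: threshold is the centre pixel; neighbours are visited
    clockwise from the top-left corner, so that corner contributes 2**0.
    """
    if len(patch) != 3 or any(len(row) != 3 for row in patch):
        raise ValueError("patch must be 3 x 3")
    centre = patch[1][1]
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for k, t_k in enumerate(neighbours):
        b_k = 1 if t_k >= centre else 0  # binary variable b_k
        code += (2 ** k) * b_k           # weighted sum of Eq. (11)
    return code
```

A feature vector for a component would typically be a histogram of these codes over all pixels of the component, which this sketch does not build.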
3.3 Classification
A number of different supervised machine learning algorithms are used in
order to identify the ethnicity of the facial images. The classification
techniques used are Naïve Bayes, K-Nearest Neighbor, Decision Trees and
Random Forest.
The Decision Trees use recursive partitioning to separate the dataset by
finding the best variable [9] and using the selected variable to split the
data. The entropy, defined in Eqs. (12) and (13), is then used to calculate
the difference that variable would make to the results if it is chosen. If
the entropy is 0 then that variable is perfect to use; otherwise a new
variable needs to be selected.

$$H(D) = -\sum_{i=1}^{k} P(C_i|D) \log_k P(C_i|D) \tag{12}$$

where $H(D)$ is the entropy of a sample $D$ with respect to a target variable
of $k$ possible classes $C_i$.

$$P(C_i|D) = \frac{\text{number of correct observations for that class}}{\text{total observations for that class}} \tag{13}$$

where the probability of class $C_i$ in $D$ is obtained directly from the
dataset.
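The entropy of Eq. (12) can be sketched as follows. The example takes $P(C_i|D)$ to be the relative frequency of class $C_i$ among the labels in $D$, which is an assumption about how Eq. (13) is applied; the function name `entropy` and its `k` parameter are hypothetical.

```python
import math
from collections import Counter


def entropy(labels, k=None):
    """Entropy H(D) of a list of class labels, using logarithm base k.

    Assumption: P(C_i|D) is the fraction of labels in D equal to C_i.
    If k is not given, the number of distinct classes present is used.
    """
    counts = Counter(labels)
    if k is None:
        k = len(counts)
    if k < 2:
        return 0.0  # a single class carries no uncertainty
    total = len(labels)
    h = 0.0
    for count in counts.values():
        p = count / total
        h -= p * math.log(p, k)
    return h
```

With base $k$ the value ranges from 0 for a pure split to 1 for classes that are equally represented, so a candidate split with entropy 0 separates the classes perfectly.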
Naïve Bayes classifies an instance by assuming that the presence or absence
of a particular feature is unrelated to the presence or absence of any other
feature, given the class variable [9,10]. This is done by calculating the
probability with which it occurred, as defined in Eq. (14).

$$P(y \mid x_1, \ldots, x_n) = \frac{P(y) \prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, \ldots, x_n)} \tag{14}$$

where $y$ is a class value, $x_1, \ldots, x_n$ are the attributes and $n$ is
the number of attributes.
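To make the probability calculation concrete, the sketch below estimates the class posterior from counts. It assumes discrete-valued attributes and applies no smoothing, whereas the feature vectors in this work are numeric; the names `train_naive_bayes` and `posterior` are hypothetical.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, attribute index) -> counts
    for x, y in zip(rows, labels):
        for i, value in enumerate(x):
            feature_counts[(y, i)][value] += 1
    return class_counts, feature_counts


def posterior(x, class_counts, feature_counts):
    """Return P(y | x_1..x_n) for every class y, normalized to sum to 1."""
    total = sum(class_counts.values())
    scores = {}
    for y, n_y in class_counts.items():
        score = n_y / total  # prior P(y)
        for i, value in enumerate(x):
            score *= feature_counts[(y, i)][value] / n_y  # P(x_i | y)
        scores[y] = score
    evidence = sum(scores.values())  # P(x_1..x_n), the normalizing term
    if evidence == 0:
        return {y: 0.0 for y in scores}
    return {y: s / evidence for y, s in scores.items()}
```

The predicted class is the key with the largest posterior, for example `max(result, key=result.get)`.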
K-Nearest Neighbor (KNN) classifies a case by a majority vote of its
neighbors, with the case being assigned to the class most common amongst its
K nearest neighbors in the dataset. Nearness is measured by a distance
function, for example the Euclidean distance defined in Eq. (15).

$$d = \sqrt{\sum_{i=1}^{n} (x_i - q_i)^2} \tag{15}$$

where $n$ is the size of the data, $x_i$ is an element in the dataset and
$q_i$ is a central point.
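The majority vote and the distance of Eq. (15) can be sketched in a few lines. The function names `euclidean` and `knn_predict` and the default of three neighbours are assumptions for illustration; the value of K used in this work is not stated.

```python
import math
from collections import Counter


def euclidean(x, q):
    """Euclidean distance of Eq. (15) between two equal-length vectors."""
    return math.sqrt(sum((xi - qi) ** 2 for xi, qi in zip(x, q)))


def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

Sorting the whole training set is the simplest approach; for large datasets a spatial index would usually replace it.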
4 Results and Discussion
This paper used a union of four different facial image databases. The total
dataset contained 1300 facial images of 900 subjects. The subjects were
divided into six different ethnic groups (Asian, African, African American,
Asian Middle East, Caucasian and Other). The Asian dataset was composed of
Yale [11] and FERET [12]; the African dataset was composed of MUCT [13]; the
African American dataset was composed of Yale [11], ORL [14] and FERET [12];
the Caucasian dataset was composed of Yale [11], ORL [14], FERET [12] and
MUCT [13]; the Asian Middle East dataset was composed of ORL [14], Yale [11],
FERET [12] and MUCT [13]; and the Other dataset was composed of FERET [12].
Analysis was done to determine which feature extraction technique would
achieve the highest True Positive Rate (TPR) for ethnicity identification.
The results were obtained by taking the feature vector for each component
and classifying it using K-Nearest Neighbor, to observe which feature vector
obtained the highest True Positive Rate. The results are shown in Table 1.
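For reference, the True Positive Rate of a class is the fraction of samples of that class that the classifier labels correctly, TP / (TP + FN). A minimal sketch follows; the function name `true_positive_rate` and the list-based inputs are hypothetical.

```python
def true_positive_rate(actual, predicted, positive):
    """TPR of one class: correctly labelled samples over all samples of it."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no samples of the positive class")
    return tp / (tp + fn)
```

Evaluating this once per ethnic group, for each component and feature extraction technique, would produce a grid of rates with the shape of Table 1.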
Table 1. Accuracy rates per component per feature extraction technique

Component    7 Hu moments  Zernike moments  LBP      Gabor filter  Haralick texture moments
Nose         85.0 %        80.1 %           62.7 %   82.5 %        66.5 %
Left eye     57.2 %        76.4 %           83.7 %   73.4 %        77.1 %
Right eye    62.7 %        76.3 %           37.6 %   71.8 %        63.4 %
Mouth        89.5 %        72.1 %           76.9 %   88.2 %        70.1 %
Forehead     45.2 %        79.5 %           81.0 %   70.6 %        82.3 %
Chin         80.5 %        72.6 %           47.5 %   66.8 %        69.8 %
Left cheek   60.4 %        52.6 %           30.5 %   66.7 %        72.8 %
Right cheek  62.5 %        71.5 %           22.7 %   81.8 %        84.2 %
Zernike Moments is a geometric feature extraction technique which achieved
a TPR of 76.3 % for the right eye. LBP is a geometric feature extraction
technique which achieved a TPR of 83.7 % for the left eye. These two
geometric feature extraction techniques achieved high results as the eyes
are structurally different in size and shape between ethnic groups.
Gabor Filter (88.2 %) and 7 Hu Moments (89.5 %) achieved a high TPR for the
mouth component. Both feature extraction techniques are texture based. They
achieved high results as the mouth differs in colour, shape and coarseness
between ethnic groups.
The forehead (82.3 % TPR), left cheek (72.8 % TPR) and right cheek (84.2 %
TPR) have different textures, colours and gradients. Haralick Texture
Moments achieved high results for these components because they calculate
the correlation and homogeneity for each pixel.
7 Hu Moments is a textural feature extraction technique which achieved
80.5 % for the chin and 85.0 % for the nose. This is because these
components need to be identified using the texture of the component, which
7 Hu Moments captures.
4.1 African Ethnicity Results
Buchala et al. [3] showed that for African ethnicity the accuracy achieved
was on average 80 % using Principal Component Analysis (PCA). These results
were obtained from 320 images from the FERET dataset. In this work, Naïve
Bayes produced the best results for the whole database, and K-Nearest
Neighbor also produced high results, as shown in Fig. 2.
Fig. 2. Accuracy achieved for African ethnicity
4.2 African American Ethnicity Results
Buchala et al. [3] showed that for African American ethnicity the accuracy
achieved was on average 80 % using Principal Component Analysis (PCA). The
results obtained in this work are shown in Fig. 3. The Decision Trees
achieved the best result of 85.8 %.
4.3 Asian Ethnicity Results
Lu and Jain [2] obtained an average of 97 % for Nearest Neighbor and 95 %
for Linear Discriminant Analysis (LDA) for Asian identification. In this
work, Naïve Bayes produced the best accuracy rate of 86 %, and K-Nearest
Neighbor produced an accuracy rate of 75 %, as shown in Fig. 4.
Fig. 3. Accuracy achieved for African American ethnicity
Fig. 4. Accuracy achieved for Asian ethnicity
4.4 Asian Middle East Ethnicity Results
The results obtained by Buchala et al. [3] for Asian Middle East ethnicity
were on average 83 % using Principal Component Analysis (PCA). These results
were obtained from 363 images from the FERET dataset. In this work,
K-Nearest Neighbor was the best machine learning algorithm for identifying
Asian Middle East ethnicity for all images, producing 86.8 %, as shown in
Fig. 5.
Fig. 5. Accuracy achieved for Asian Middle East ethnicity
4.5 Caucasian Ethnicity Results
The results of Buchala et al. [3] showed that for Caucasian ethnicity the
accuracy achieved was 82 % using Principal Component Analysis (PCA). These
results were obtained using 1758 images from the FERET dataset. In this
work, Random Forest was best at identifying Caucasian ethnicity for all
images, producing 90.8 %. Random Forest performed well here because most of
the testing data was made up of Caucasian images and the decision rules
produced easily classified the Caucasian ethnicity. After testing it was
found that all datasets produced on average a 90 % ethnicity identification
accuracy, as shown in Fig. 6.
Fig. 6. Accuracy achieved for caucasian ethnicity
4.6 Other Ethnicity Results
Tin and Sein [4] achieved on average 93 % for Nearest Neighbor (NN) and
96 % for Principal Component Analysis (PCA) for Other ethnicity. These
results were obtained from 250 images collected from the Internet. In this
work, K-Nearest Neighbor gave the best accuracy rate of 80 %, as shown in
Fig. 7.
Fig. 7. Accuracy achieved for other ethnicity
4.7 Results of All Six Ethnicities
A number of empirical experiments were carried out to investigate which
machine learning algorithm would be most suitable for ethnicity
identification. The dataset testing sizes ranged from 10 % to 100 % of the
original dataset and were tested against the different machine learning
algorithms. These training datasets were filled with randomly chosen images
from the original dataset.
The tests showed that the Decision Tree machine learning algorithm achieved
an 86.6 % ethnicity identification rate. The worst machine learning
algorithm was Random Forest, which achieved a 70 % ethnicity identification
rate. This is a variation of 6 % between the worst and best ethnicity
identification rates, as shown in Fig. 8.
Fig. 8. Accuracy achieved for all six ethnicity
Table 2 shows the comparison between the results achieved by related works
and our work for ethnicity identification. The results obtained for Asian
ethnicity identification were lower than those of Lu and Jain [2], possibly
due to the larger dataset used in this work.
Table 2. Comparison of related works results to our results for ethnicity identification

Work                Asian    African  African American  Asian Middle East  Caucasian  Other
Lu and Jain [2]     97.7 %   -        -                 -                  -          -
Buchala et al. [3]  -        80.2 %   80.0 %            83.1 %             82.0 %     -
Tin and Sein [4]    -        -        -                 -                  -          96.0 %
Our Work            85.6 %   84.7 %   85.8 %            86.8 %             90.8 %     82.3 %
5 Conclusion
This paper presented component-based ethnicity identification from facial
images using machine learning algorithms. The ethnicities that were
identified were Asian, African, African American, Asian Middle East,
Caucasian and Other. The feature vectors obtained were Haralick Texture
Moments for the forehead, left cheek and right cheek; Zernike Moments for
the right eye; LBP for the left eye; Gabor Filter for the mouth; and 7 Hu
Moments for the chin, mouth and nose. These were fused and normalized to
obtain the results. Naïve Bayes achieved an identification rate of 84.7 %
for African ethnicity and 85.6 % for Asian ethnicity. Decision Tree achieved
85.8 % for African American ethnicity. K-Nearest Neighbor achieved 86.8 %
for Asian Middle East ethnicity and 82.3 % for Other ethnicity. Random
Forest achieved 90.8 % for Caucasian ethnicity. This research achieved an
overall ethnicity identification rate of 86.6 %.
References
1. Lu, X.: Image analysis for face recognition. Personal notes, 5 May (2003)
2. Lu, X., Jain, A.K.: Ethnicity identification from face images. In: Defense and Security, International Society for Optics and Photonics, pp. 114–123 (2004)
3. Buchala, S., Davey, N., Gale, T.M., Frank, R.J.: Principal component analysis of gender, ethnicity, age, and identity of face images. In: Proceedings of IEEE ICMI (2005)
4. Tin, H.H.K., Sein, M.M.: Race identification for face images. ACEEE Int. J. Inform. Tech. 1(02) (2011)
5. Mulcahy, C.: Image compression using the Haar wavelet transform. Spelman Sci. Math. J. 1(1), 22–31 (1997)
6. Teague, M.R.: Image analysis via the general theory of moments. JOSA 70(8), 920–930 (1980)
7. Berisha, S.: Image classification using Gabor filters and machine learning (2009)
8. Salah, S.H., Du, H., Al-Jawad, N.: Fusing local binary patterns with wavelet features for ethnicity identification. In: Proceedings of IEEE International Conference on Signal Image Process, vol. 21, pp. 416–422 (2013)
9. Domingos, P.: A few useful things to know about machine learning. Commun. ACM 55(10), 78–87 (2012)
10. Lowd, D., Domingos, P.: Naive Bayes models for probability estimation. In: Proceedings of the 22nd International Conference on Machine Learning, pp. 529–536. ACM (2005)
11. Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J.: Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 19(7), 711–720 (1997)
12. Phillips, P.J., Wechsler, H., Huang, J., Rauss, P.J.: The FERET database and evaluation procedure for face-recognition algorithms. Image Vis. Comput. 16(5), 295–306 (1998)
13. Milborrow, S., Morkel, J., Nicolls, F.: The MUCT landmarked face database. In: Pattern Recognition Association of South Africa (2010)
14. Samaria, F.S., Harter, A.C.: Parameterisation of a stochastic model for human face identification. In: Proceedings of the Second IEEE Workshop on Applications of Computer Vision, pp. 138–142. IEEE (1994)

fundamental of entomology all in one topics of entomologyDrAnita Sharma
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.Nitya salvi
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfrohankumarsinghrore1
 

Recently uploaded (20)

Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomology
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 

Component-Based Ethnicity Identification from Facial Images.pdf

  • 1. Component-Based Ethnicity Identification from Facial Images
    A. Boyseens and S. Viriri(B), School of Maths, Statistics and Computer Science, University of KwaZulu-Natal, Durban, South Africa {210501411,viriris}@ukzn.ac.za
    Abstract. This paper presents an exhaustive component-based analysis to identify ethnicity from facial images. The ethnic groups identified are Asian, African, African American, Asian Middle East, Caucasian and Other. The classification techniques investigated include Decision Trees, Naïve Bayes, Random Forest and K-Nearest Neighbor. Naïve Bayes achieved accuracy rates of 84.7 % and 85.6 % for African and Asian ethnicity identification, respectively. Decision Trees achieved 85.8 % for African American ethnicity, K-Nearest Neighbor achieved 86.8 % for Asian Middle East ethnicity, and Random Forest achieved 90.8 % for Caucasian ethnicity. This research work achieved an overall ethnicity identification rate of 86.6 %.
    1 Introduction
    This paper investigates methods of analysing and identifying the ethnicity of a facial image. The ethnicities considered are Asian, African, African American, Asian Middle East, Caucasian and Other. The aim of this research work is to investigate a model for efficient ethnicity identification using facial components. Ethnicity is a socially defined category of people who identify with each other based on common social, cultural and ancestral backgrounds [1]. Ethnic facial recognition is an important biometric authentication technology. Facial recognition has various applications, including law enforcement, security systems and biometric systems, hence the need for an ethnic facial recognition system.
    2 Related Work
    Lu and Jain [2] described techniques that use Nearest Neighbor (NN) and Linear Discriminant Analysis (LDA) for ethnicity identification from facial images. The two classes identified are Asian and Non-Asian. The feature extraction techniques used were Hu Moments and Zernike Moments. The experiments achieved an average accuracy rate of 86 %. One shortcoming is that only two classes, Asian and Non-Asian, were classified. Another is that the Product Rule was used as an integration strategy to combine outputs, and the reported results are estimates because of the extensive cross-validation that was used.
    © Springer International Publishing AG 2016. L.J. Chmielewski et al. (Eds.): ICCVG 2016, LNCS 9972, pp. 293–303, 2016. DOI: 10.1007/978-3-319-46418-3_26
  • 2. Buchala et al. [3] discussed the effect that Principal Component Analysis (PCA) has on the identification of gender, ethnicity, age and identity from facial images. Three classes are identified: Caucasian, African American and East Asian. The feature extraction technique was the Elastic Bunch Graph. The experiments achieved an average ethnicity accuracy of 81.67 %. The shortcomings of this paper are that it classified only three classes (Caucasian, African American and East Asian) and that the authors assumed, without testing, that Linear Discriminant Analysis (LDA) would achieve better results than PCA.
    Tin and Sein [4] analysed the use of Nearest Neighbor (NN) and Principal Component Analysis (PCA) for ethnicity identification from facial images. Their paper distinguished between two classes, Myanmar and Non-Myanmar. The feature extraction technique used was Hu Moments. The experiments achieved an average ethnicity identification accuracy of 92 %. The shortcoming is that only two classes, Myanmar and Non-Myanmar, were classified. The authors assumed that more images are needed to achieve a better result and that the system needs to be more sensitive to features that are close together.
    3 Methods and Techniques
    The component-based ethnicity identification system is depicted in Fig. 1.
    3.1 Facial Components
    The facial components are extracted using the Haar Transformation [5], which is real and orthogonal. The Haar Transformation $HT_n(f)$ of an N-input function $X_n(f)$ is the $2^n$-element vector described in Eq. (1). The Haar Transformation multiplies a function by the Haar matrix, which contains Haar functions of different widths at different locations. It is calculated at two levels, each of which decomposes the discrete signal into two components of half the original length [5].
    $HT_n(f) = H_n X_n(f)$ (1)
    where $n$ is the number of elements in the function, $H_n$ is the Haar matrix and $X_n(f)$ is the $2^n$-element vector. The components that are extracted are the left eye, right eye, mouth, nose, chin, forehead, left cheek and right cheek.
    3.2 Feature Extraction
    A feature vector is obtained from each component. Several textural and structural (geometrical) feature extraction techniques were analysed, namely 7 Hu Moments, Zernike Moments, Local Binary Pattern (LBP), Gabor Filter and Haralick Texture Moments, in order to find the most suitable technique for each component.
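The two-level decomposition described in Section 3.1 can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the function names `haar_step` and `haar_two_levels` are hypothetical, the orthonormal scaling by the square root of 2 is an assumption, and a real image would need the same step applied along rows and columns.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform on an even-length list.

    Returns (approximation, detail), each half the length of the input.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Pairwise scaled sums give the low-frequency (approximation) half.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Pairwise scaled differences give the high-frequency (detail) half.
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_two_levels(signal):
    """Apply the Haar step twice: once to the signal, once to its approximation."""
    approx1, detail1 = haar_step(signal)
    approx2, detail2 = haar_step(approx1)
    return approx2, detail2, detail1


if __name__ == "__main__":
    a2, d2, d1 = haar_two_levels([4.0, 4.0, 2.0, 2.0])
    print(a2, d2, d1)
```

Because the transform is orthogonal, the sum of squared coefficients equals the sum of squared input samples, which is a quick sanity check on any implementation.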
  • 3. Fig. 1. Overview of ethnicity identification system
    The Hu Moments are a set of invariant moments that characterize a shape regardless of its scale, position, size and orientation. They are computed by normalizing the central moments through order 3 [6].
    The Zernike Moments are used to overcome the redundancy exhibited by certain geometric moments [6]. They are a class of orthogonal moments which are rotation invariant and effective in image representation. Zernike polynomials are a set of complex, orthogonal polynomials defined over the interior of the unit circle. The general form of the Zernike Moments is defined in Eq. (2).
    $Z_{nm}(x, y) = Z_{nm}(\rho, \theta) = R_{nm}(\rho)\, e^{jm\theta}$ (2)
    where $(x, y)$ and $(\rho, \theta)$ are the Cartesian and polar coordinates respectively, $n \in \mathbb{Z}^{+}$ and $m \in \mathbb{Z}$, constrained to $n - |m|$ even and $|m| \le n$, and
    $R_{nm}(\rho) = \sum_{k=0}^{(n-|m|)/2} \frac{(-1)^{k}\,(n-k)!}{k!\,\left(\frac{n+|m|}{2}-k\right)!\,\left(\frac{n-|m|}{2}-k\right)!}\,\rho^{\,n-2k}$ (3)
    where $R_{nm}(\rho)$ is a radial polynomial and $k$ is the summation index.
    The Haralick Texture Moments are texture features that can be used to analyze the spatial distribution of the image's texture [6] at different spatial positions and angles. Four of these Haralick Texture Moments are computed: Energy, Entropy, Correlation and Homogeneity.
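As a worked illustration of the radial polynomial in Eq. (3), the following sketch evaluates $R_{nm}(\rho)$ directly from the factorial sum. The function name `zernike_radial` is hypothetical, and the use of $|m|$ follows the standard definition of the Zernike radial polynomial rather than anything stated explicitly in the text.

```python
from math import factorial


def zernike_radial(n, m, rho):
    """Evaluate the Zernike radial polynomial R_nm at radius rho (0 <= rho <= 1)."""
    m = abs(m)
    if n < 0 or m > n or (n - m) % 2 != 0:
        raise ValueError("require n >= 0, |m| <= n and n - |m| even")
    total = 0.0
    for k in range((n - m) // 2 + 1):
        numerator = (-1) ** k * factorial(n - k)
        denominator = (
            factorial(k)
            * factorial((n + m) // 2 - k)
            * factorial((n - m) // 2 - k)
        )
        total += numerator / denominator * rho ** (n - 2 * k)
    return total


if __name__ == "__main__":
    # R_20(rho) = 2*rho^2 - 1 and R_31(rho) = 3*rho^3 - 2*rho
    print(zernike_radial(2, 0, 0.5), zernike_radial(3, 1, 0.5))
```

A convenient check is that every valid radial polynomial equals 1 at the edge of the unit circle.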
  • 4. Entropy reflects the disorder and complexity of the image's texture. It is defined in Eq. (4).
    $\text{Entropy} = \sum_{i,j} \hat{f}(i, j)\,\log \hat{f}(i, j)$ (4)
    where $\hat{f}(i, j)$ is the $[i, j]$ entry of the gray-level image matrix and $i$ and $j$ are points on the image matrix.
    Energy is the measure of local homogeneity and is the opposite of Entropy. It shows the uniformity of the image's texture and is computed using Eq. (5).
    $\text{Energy} = \sum_{i,j} \hat{f}(i, j)^{2}$ (5)
    where $\hat{f}(i, j)$ is defined as in Eq. (4).
    Homogeneity reflects the evenness of the image's texture and the scale of local changes in that texture. High homogeneity indicates little change in texture between regions. It is defined in Eq. (6).
    $\text{Homogeneity} = \sum_{i}\sum_{j} \frac{1}{1 + (i - j)^{2}}\,\hat{f}(i, j)$ (6)
    where $\hat{f}(i, j)$ is defined as in Eq. (4).
    Correlation is the consistency of the image's texture, as described in Eq. (7).
    $\text{Correlation} = \sum_{i,j} \frac{(i - \mu_i)(j - \mu_j)\,\hat{f}(i, j)}{\sigma_i \sigma_j}$ (7)
    in which $\mu_i$, $\mu_j$, $\sigma_i$ and $\sigma_j$ are defined as:
    $\mu_i = \sum_{i=1}^{n}\sum_{j=1}^{n} i\,\hat{f}(i, j), \qquad \mu_j = \sum_{i=1}^{n}\sum_{j=1}^{n} j\,\hat{f}(i, j)$ (8)
    $\sigma_i = \sum_{i=1}^{n}\sum_{j=1}^{n} (i - \mu_i)^{2}\,\hat{f}(i, j), \qquad \sigma_j = \sum_{i=1}^{n}\sum_{j=1}^{n} (j - \mu_j)^{2}\,\hat{f}(i, j)$ (9)
    The Gabor Filters are geometric moments formed as the product of an elliptical Gaussian and a sinusoid [7]. The Gabor elementary function can be defined as the product of a pulse with a harmonic oscillation of a given frequency.
    $g(t) = e^{-\alpha^{2}(t - t_0)^{2}}\, e^{-i 2\pi (f - f_0) + \phi}$ (10)
    where $\alpha$ is the time duration of the Gaussian envelope, $t_0$ denotes the centroid, $f_0$ is the frequency of the sinusoid and $\phi$ denotes the phase shift.
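The four Haralick measures above can be sketched on a small example. The code below is an illustration under stated assumptions, not the authors' pipeline: it treats $\hat{f}(i, j)$ as a normalised gray-level co-occurrence matrix built from horizontally adjacent pixels, it uses the conventional negative sign for entropy and square-root form for the standard deviations (which may differ from the equations as printed), and the names `cooccurrence` and `haralick` are hypothetical.

```python
import math


def cooccurrence(image, levels):
    """Normalised co-occurrence matrix of horizontally adjacent gray levels."""
    counts = [[0.0] * levels for _ in range(levels)]
    total = 0
    for row in image:
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1.0
            total += 1
    return [[c / total for c in r] for r in counts]


def haralick(p):
    """Energy, entropy, homogeneity and correlation of a normalised matrix p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    energy = sum(v * v for _, _, v in cells)
    # Conventional sign: entropy is non-negative; zero entries are skipped.
    entropy = -sum(v * math.log(v) for _, _, v in cells if v > 0)
    homogeneity = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * v for i, _, v in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * v for _, j, v in cells))
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    # Correlation is undefined for a constant texture; report 0.0 in that case.
    correlation = cov / (sd_i * sd_j) if sd_i > 0 and sd_j > 0 else 0.0
    return {
        "energy": energy,
        "entropy": entropy,
        "homogeneity": homogeneity,
        "correlation": correlation,
    }


if __name__ == "__main__":
    checkerboard = [[0, 1, 0, 1], [1, 0, 1, 0]]
    print(haralick(cooccurrence(checkerboard, levels=2)))
```

On the two-level checkerboard every horizontal neighbour pair alternates, so the correlation is exactly -1 and the energy and homogeneity are both 0.5.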
  • 5. Local Binary Pattern (LBP) is a geometric moment operator that describes the surroundings of a pixel by obtaining a bit code for that pixel [8]. It is defined in Eq. (11).
    $C = \sum_{k=0}^{7} 2^{k}\, b_k, \qquad b_k = \begin{cases} 1, & t_k \ge C \\ 0, & t_k < C \end{cases}$ (11)
    where $t_k$ is the gray-scale amount, $b_k$ is a binary variable taking the value 1 or 0, and $C$ is a constant value of 0 and 1.
    3.3 Classification
    A number of different supervised and unsupervised machine learning algorithms are used to identify the ethnicity of the facial images. The classification techniques used are Naïve Bayes, K-Nearest Neighbor and Decision Tree.
    The Decision Trees use recursive partitioning to separate the dataset by finding the best variable [9] and using the selected variable to split the data. The entropy, defined in Eqs. (12) and (13), is then used to calculate the difference that the variable would make to the results if it were chosen. If the entropy is 0 the variable is perfect to use; otherwise a new variable needs to be selected.
    $H(D) = -\sum_{i=1}^{k} P(C_i \mid D)\,\log_k P(C_i \mid D)$ (12)
    where $H(D)$ is the entropy of a sample $D$ with respect to a target variable with $k$ possible classes $C_i$.
    $P(C_i \mid D) = \frac{\text{number of correct observations for that class}}{\text{total observations for that class}}$ (13)
    where the probability of class $C_i$ in $D$ is obtained directly from the dataset.
    Naïve Bayes classifies an instance by assuming that the presence or absence of a particular feature is unrelated to the presence or absence of any other feature, given the class variable [9,10]. This is done by calculating the probability defined in Eq. (14).
    $P(y \mid x_1, \ldots, x_n) = \frac{P(y)\,\prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, \ldots, x_n)}$ (14)
    where $y$ is a class value, $x_1, \ldots, x_n$ are the attributes and $n$ is the number of attributes.
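The entropy criterion of Eq. (12) can be made concrete with a small sketch. The function name `class_entropy` is hypothetical, and the probabilities are taken here as plain class proportions in the sample, which is an assumption and not necessarily the quantity the authors compute in Eq. (13).

```python
import math
from collections import Counter


def class_entropy(labels, k=None):
    """Entropy of a list of class labels, with logarithm base k.

    k defaults to the number of distinct classes present. With base k the
    result lies between 0 (one class only) and 1 (all k classes equally likely).
    """
    counts = Counter(labels)
    if k is None:
        k = len(counts)
    if k < 2 or not labels:
        return 0.0  # a single class carries no uncertainty
    n = len(labels)
    return -sum((c / n) * math.log(c / n, k) for c in counts.values())


if __name__ == "__main__":
    print(class_entropy(["A", "A", "B", "B"]))   # evenly mixed node
    print(class_entropy(["A", "A", "A", "A"]))   # pure node
```

A split that produces child nodes with entropy 0 separates the classes completely, which matches the statement above that a variable yielding zero entropy is the ideal choice.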
  • 6. K-Nearest Neighbor (KNN) classifies a case by a majority vote of its neighbors, with the case being assigned to the class most common among its K nearest neighbors. Nearness is measured by a distance function, for example the Euclidean distance defined in Eq. (15).
    $d = \sqrt{\sum_{i=1}^{n} (x_i - q_i)^{2}}$ (15)
    where $n$ is the size of the data, $x_i$ is an element in the dataset and $q_i$ is the corresponding element of the central (query) point.
    4 Results and Discussion
    This paper used a union of four different facial image databases. The total dataset contained 1300 facial images of 900 subjects. The subjects were divided into six different ethnic groups (Asian, African, African American, Asian Middle East, Caucasian and Other). The Asian dataset was composed of Yale [11] and FERET [12]; the African dataset of MUCT [13]; the African American dataset of Yale [11], ORL [14] and FERET [12]; the Caucasian dataset of Yale [11], ORL [14], FERET [12] and MUCT [13]; the Asian Middle East dataset of ORL [14], Yale [11], FERET [12] and MUCT [13]; and the Other dataset of FERET [12].
    An analysis was carried out to determine which feature extraction technique would achieve the highest True Positive Rate (TPR) for ethnicity identification. The feature vector for each component was classified using K-Nearest Neighbor to observe which feature vector obtained the highest TPR. The results are shown in Table 1.
    Table 1. Accuracy rates per component per feature extraction technique
    Component   | 7 Hu moments | Zernike moments | LBP    | Gabor filter | Haralick texture moments
    Nose        | 85.0 %       | 80.1 %          | 62.7 % | 82.5 %       | 66.5 %
    Left eye    | 57.2 %       | 76.4 %          | 83.7 % | 73.4 %       | 77.1 %
    Right eye   | 62.7 %       | 76.3 %          | 37.6 % | 71.8 %       | 63.4 %
    Mouth       | 89.5 %       | 72.1 %          | 76.9 % | 88.2 %       | 70.1 %
    Forehead    | 45.2 %       | 79.5 %          | 81.0 % | 70.6 %       | 82.3 %
    Chin        | 80.5 %       | 72.6 %          | 47.5 % | 66.8 %       | 69.8 %
    Left cheek  | 60.4 %       | 52.6 %          | 30.5 % | 66.7 %       | 72.8 %
    Right cheek | 62.5 %       | 71.5 %          | 22.7 % | 81.8 %       | 84.2 %
    Zernike Moments is a geometric feature extraction technique which achieved a TPR of 76.3 % for the right eye. LBP is a geometric feature extraction technique which achieved a TPR of 83.7 % for the left eye. These two geometric feature extraction techniques achieved high results because the eyes differ structurally in size and shape between ethnic groups.
  • 7. The Gabor Filter (88.2 %) and 7 Hu Moments (89.5 %) achieved a high TPR for the mouth component. Both feature extraction techniques are texture based. They achieved high results because the mouth differs in colour, shape and coarseness between ethnic groups. The forehead (82.3 % TPR), left cheek (72.8 % TPR) and right cheek (84.2 % TPR) have different textures, colours and gradients. These components achieved high results with Haralick Texture Moments because that technique calculates the correlation and homogeneity for each pixel. 7 Hu Moments is a textural feature extraction technique which achieved 80.5 % for the chin and 85.0 % for the nose. This is because these components need to be identified by their texture, which 7 Hu Moments captures.
    4.1 African Ethnicity Results
    Buchala et al. [3] showed that for African ethnicity the accuracy achieved was on average 80 % using Principal Component Analysis (PCA). These results were obtained from a dataset of 320 images from the FERET dataset. Naïve Bayes produced the best results for the whole database. K-Nearest Neighbor also produced high results, as shown in Fig. 2.
    Fig. 2. Accuracy achieved for African ethnicity
    4.2 African American Ethnicity Results
    Buchala et al. [3] showed that for African American ethnicity the accuracy achieved was on average 80 % using Principal Component Analysis (PCA). The results obtained in this work are shown in Fig. 3. The Decision Trees achieved the best result of 85.8 %.
    4.3 Asian Ethnicity Results
    Lu and Jain [2] obtained an average of 97 % for Nearest Neighbor and 95 % for Linear Discriminant Analysis (LDA) for Asian identification. In this work Naïve Bayes produced the best accuracy rate of 86 %, and K-Nearest Neighbor produced an accuracy rate of 75 %, as shown in Fig. 4.
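The K-Nearest Neighbor rule with the Euclidean distance of Eq. (15), which is used both for the per-component analysis in Table 1 and as one of the compared classifiers, can be sketched as follows. This is a toy illustration rather than the authors' code: the names `euclidean` and `knn_predict`, the default `k=3`, and the two-dimensional feature vectors with placeholder labels are all assumptions.

```python
import math
from collections import Counter


def euclidean(x, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((xi - qi) ** 2 for xi, qi in zip(x, q)))


def knn_predict(train, query, k=3):
    """Predict a label for query by majority vote among its k nearest samples.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Placeholder feature vectors and labels, for illustration only.
    train = [
        ([0.0, 0.0], "class_a"),
        ([0.0, 1.0], "class_a"),
        ([1.0, 0.0], "class_a"),
        ([5.0, 5.0], "class_b"),
        ([5.0, 6.0], "class_b"),
    ]
    print(knn_predict(train, [0.2, 0.1], k=3))
```

In practice the feature vectors would be the fused, normalised per-component features, and the choice of `k` would be tuned on held-out data.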
  • 8. Fig. 3. Accuracy achieved for African American ethnicity
    Fig. 4. Accuracy achieved for Asian ethnicity
    4.4 Asian Middle East Ethnicity Results
    Buchala et al. [3] obtained on average 83 % for Asian Middle East ethnicity using Principal Component Analysis (PCA). These results were obtained from a dataset of 363 images from the FERET dataset. In this work K-Nearest Neighbor was the best of the machine learning algorithms tested for identifying the Asian Middle East ethnicity across all images, achieving 86.8 %, as shown in Fig. 5.
    Fig. 5. Accuracy achieved for Asian Middle East ethnicity
  • 9. 4.5 Caucasian Ethnicity Results
    Buchala et al. [3] showed that for Caucasian ethnicity the accuracy achieved was 82 % using Principal Component Analysis (PCA). These results were obtained using 1758 images from the FERET dataset. In this work Random Forest was the best technique for identifying the Caucasian ethnicity across all images, achieving 90.8 %. Random Forest performed well here because most of the testing data was made up of Caucasian images, so the decision rules it produced easily classified the Caucasian ethnicity. After testing it was found that all datasets produced on average 90 % ethnicity identification accuracy, as shown in Fig. 6.
    Fig. 6. Accuracy achieved for Caucasian ethnicity
    4.6 Other Ethnicity Results
    Tin and Sein [4] achieved on average 93 % for Nearest Neighbor (NN) and 96 % for Principal Component Analysis (PCA) for other ethnicities. These results were obtained from 250 images collected from the Internet. In this work K-Nearest Neighbor achieved the best accuracy rate of 80 %, as shown in Fig. 7.
    Fig. 7. Accuracy achieved for other ethnicity
  • 10. 4.7 Results of All Six Ethnicities
    A number of empirical experiments were carried out to investigate which machine learning algorithm would be most suitable for ethnicity identification. The dataset testing sizes ranged from 10 % to 100 % of the original dataset and were tested against the different machine learning algorithms. These training datasets were filled with randomly chosen images from the original dataset. The tests showed that the Decision Tree machine learning algorithm achieved an 86.6 % ethnicity detection rate. The worst machine learning algorithm was Random Forest, which achieved a 70 % ethnicity identification rate. This is a variation of 6 % between the worst and best ethnicity identification rates, as shown in Fig. 8.
    Fig. 8. Accuracy achieved for all six ethnicities
    Table 2 compares the results achieved by related works for ethnicity identification with the results of this work. The results obtained for Asian ethnicity identification were lower than those of Lu and Jain [2], possibly due to the larger dataset used in this work.
    Table 2. Comparison of related works' results to our results for ethnicity identification
    Work               | Asian  | African | African American | Asian Middle East | Caucasian | Other
    Lu and Jain [2]    | 97.7 % | -       | -                | -                 | -         | -
    Buchala et al. [3] | -      | 80.2 %  | 80.0 %           | 83.1 %            | 82.0 %    | -
    Tin and Sein [4]   | -      | -       | -                | -                 | -         | 96.0 %
    Our Work           | 85.6 % | 84.7 %  | 85.8 %           | 86.8 %            | 90.8 %    | 82.3 %
    5 Conclusion
    This paper presented component-based ethnicity identification from facial images using machine learning algorithms. The ethnicities identified were Asian, African, African American, Asian Middle East, Caucasian and Other.
  • 11. The feature vectors used were Haralick Texture Moments for the forehead, left cheek and right cheek; Zernike Moments for the right eye; LBP for the left eye; the Gabor Filter for the mouth; and 7 Hu Moments for the chin, mouth and nose. These were fused and normalized to obtain the results. Naïve Bayes achieved an identification rate of 84.7 % for African ethnicity and 85.6 % for Asian ethnicity. Decision Tree achieved 85.8 % for African American ethnicity. K-Nearest Neighbor achieved 86.8 % for Asian Middle East ethnicity and 82.3 % for Other ethnicity. Random Forest achieved 90.8 % for Caucasian ethnicity. This research achieved a total ethnicity identification rate of 86.6 %.
    References
    1. Lu, X.: Image analysis for face recognition. Personal notes, 5 May (2003)
    2. Lu, X., Jain, A.K.: Ethnicity identification from face images. In: Defense and Security, International Society for Optics and Photonics, pp. 114–123 (2004)
    3. Buchala, S., Davey, N., Gale, T.M., Frank, R.J.: Principal component analysis of gender, ethnicity, age, and identity of face images. In: Proceedings of IEEE ICMI (2005)
    4. Tin, H.H.K., Sein, M.M.: Race identification for face images. ACEEE Int. J. Inform. Tech. 1(02) (2011)
    5. Mulcahy, C.: Image compression using the Haar wavelet transform. Spelman Sci. Math. J. 1(1), 22–31 (1997)
    6. Teague, M.R.: Image analysis via the general theory of moments. JOSA 70(8), 920–930 (1980)
    7. Berisha, S.: Image classification using Gabor filters and machine learning (2009)
    8. Salah, S.H., Du, H., Al-Jawad, N.: Fusing local binary patterns with wavelet features for ethnicity identification. In: Proceedings of IEEE International Conference on Signal Image Process, vol. 21, pp. 416–422 (2013)
    9. Domingos, P.: A few useful things to know about machine learning. Commun. ACM 55(10), 78–87 (2012)
    10. Lowd, D., Domingos, P.: Naive Bayes models for probability estimation. In: Proceedings of the 22nd International Conference on Machine Learning, pp. 529–536. ACM (2005)
    11. Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J.: Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 19(7), 711–720 (1997)
    12. Phillips, P.J., Wechsler, H., Huang, J., Rauss, P.J.: The FERET database and evaluation procedure for face-recognition algorithms. Image Vis. Comput. 16(5), 295–306 (1998)
    13. Milborrow, S., Morkel, J., Nicolls, F.: The MUCT landmarked face database. In: Pattern Recognition Association of South Africa (2010)
    14. Samaria, F.S., Harter, A.C.: Parameterisation of a stochastic model for human face identification. In: Proceedings of the Second IEEE Workshop on Applications of Computer Vision, pp. 138–142. IEEE (1994)