Facial Expression Recognitino

Facial Expression Recognition System
Team ID- 01
Session: M.C.S. Fall 2016
Project Advisor: Mr. Amir Jamshaid
Submitted By
Name of Student Mehwish S. Khan
RollNo IU-16-M2Mr022
Department of Computer Science & IT
Bahawalnagar Campus
The Islamia University of Bahawalpur

ii
© Department of Computer Science &IT(BWN), The Islamia University Of Bahawalpur.
STATEMENT OF SUBMISSION
This is to certify that Mehwish S. Khan Roll No. IU-16-M2Mr022 has successfully completed
the final project named as: Facial Expression Recognition System, at The Islamia University
Of Bahawalpur to fulfill the partial requirement of the degree of Masters in Computer Science.
_____________________
Project Office Supervisor
IUB, Bahawalnagar
____________________________ ________________________
Project Primary Advisor Project Examiner
Designation Designation
IUB, Bahawalnagar

iii
Proofreading Certificate
It is to certify that I have read the document meticulously and circumspectly. I am convinced that
the resultant project does not contain any spelling, punctuation or grammatical mistakes as such.
All in all, I find this document well organized and I am in no doubt that its objectives have been
successfully met.
_____________________
Mr. Amir Jamshaid
CS & IT
Assistant Professor, IUB, Bahawalnagar Campus

iv
Acknowledgement
Firstly, I would like to thank my supervisor, Mr. Amir Jamshaid, for his constant support,
feedback and guidance throughout the entire development of this project.
I would also like to thank my husband Sunawar Khan for all the support they have provided me
during my years of university and guidance throughout the entire development of this project.
I’m also thankful to my friends whose silent support led us to complete our project.
1- Miss. Tahira
2- Miss. Bushra
Date:
24 May, 2018.

v
Abstract
The problem of automatic recognition of facial expressions is still an ongoing
research, and it relies on advancements in Image Processing and Computer Vision
techniques. Such systems have a variety of interesting applications, from human-
computer interaction, to robotics and computer animations. Their aim is to provide
robustness and high accuracy, but also to cope with variability in the environment
and adapt to real time scenarios.
This project proposes an automatic facial expression recognition system, capable
of distinguishing the seven universal emotions: disgust, anger, fear, happiness,
sadness, surprise and normal. It is designed to be person independent and tailored
only for static images. The system integrates a face detection mechanism using
Viola-Jones algorithm, uses uniform Gabor features for feature extraction and
performs classification using a Multi-Layer Feed Forward Neural Network model.

vi
Table of Contents
Title Page ........................................................................................................................................ i
STATEMENT OF SUBMISSION............................................................................................... ii
Proofreading Certificate..............................................................................................................iii
Acknowledgement........................................................................................................................ iv
Abstract.......................................................................................................................................... v
List of Table.................................................................................................................................. ix
List of Figures............................................................................................................................... ix
List of Equations .......................................................................................................................... ix
Chapter 1 ....................................................................................................................................... 1
INTRODUCTION......................................................................................................................... 1
2. The Importance of Facial Recognition ................................................................................ 1
3. Expressions and Emotions................................................................................................... 2
4. Facial Expressions Evolutionary Reasons ........................................................................... 5
4.1 Anger ................................................................................................................................. 5
4.2 Disgust............................................................................................................................... 5
4.3 Fear.................................................................................................................................... 5
4.4 Surprise.............................................................................................................................. 5
4.5 Sadness .............................................................................................................................. 5
4.6 Contempt: .......................................................................................................................... 6
4.7 Happiness........................................................................................................................... 6
5. Context................................................................................................................................. 6
6. Scope and Objectives........................................................................................................... 7
7. Achievements....................................................................................................................... 7
8. Overview of Dissertation ..................................................................................................... 7
9. Objectives ............................................................................................................................ 8
Chapter 2 ....................................................................................................................................... 9
BACKGROUND AND LITERATURE SURVEY..................................................................... 9
1. State-of-The-Art................................................................................................................... 9

vii
2. Two Main Approaches for Facial Expression Analysis on Still Images ............................. 9
2.1 Geometric and Appearance Based Parameterizations................................................ 10
2.2 More about Appearance Based Parameterizations ..................................................... 11
3. Current state-of-the-art approaches.................................................................................... 12
4. Face detection algorithms .................................................................................................. 12
5. Feature extraction algorithms ............................................................................................ 13
6. Classification algorithms ................................................................................................... 14
Chapter 3 ..................................................................................................................................... 15
REQUIREMENTS...................................................................................................................... 15
Chapter Overview....................................................................................................................... 15
1. Requirements Elicitation.................................................................................................... 15
2. Functional and non-functional requirements ..................................................................... 15
3. Use case Diagram .............................................................................................................. 16
Chapter 4 ..................................................................................................................................... 17
DESIGN ....................................................................................................................................... 17
Chapter overview........................................................................................................................ 17
1. Design Methodologies ....................................................................................................... 17
2. Architectural Design .......................................................................................................... 18
3. Dependencies..................................................................................................................... 19
4. Interface Design................................................................................................................. 19
5. Image Detection................................................................................................................. 20
Chapter 5 ..................................................................................................................................... 21
IMPLEMENTATION ................................................................................................................ 21
1. Implementation tools ......................................................................................................... 21
2. Dataset................................................................................................................................ 21
3. Methodology...................................................................................................................... 21
3.1 Face detection............................................................................................................. 22
3.2 Viola-Jones Object Detection algorithm .................................................................... 22

viii
3.3 Detection of faces using Viola-Jones ......................................................................... 24
3.4 Feature extraction ....................................................................................................... 25
3.5 Pre-processing ............................................................................................................ 25
3.6 Gabor Coefficients...................................................................................................... 25
3.6.1 Ada-boost Classifier with Feature Selection........................................................... 26
4. Classification...................................................................................................................... 28
4.1 Multi-Layer Feed Forward Neural Network .............................................................. 28
5. System Walkthrough.......................................................................................................... 29
Chapter 6 ..................................................................................................................................... 32
TESTING AND EVALUATION OF THE RESULT.............................................................. 32
1. Evaluation methods............................................................................................................ 32
1.1 Cross validation Method description.......................................................................... 32
2. Analysis of the Result ........................................................................................................ 33
3. Confusion Matrix............................................................................................................... 33
3.1 Methods description ................................................................................................... 33
3.2 Analysis of the Result................................................................................................. 34
Chapter 7 ..................................................................................................................................... 35
CONCLUSION ........................................................................................................................... 35
1. Project Achievements ........................................................................................................ 35
2. Challenges.......................................................................................................................... 35
3. Future Work....................................................................................................................... 36
4. Concluding Remarks and Reflections................................................................................ 36
Literature Cited .......................................................................................................................... 38

ix
List of Table
Table 1:- Functional and Non-Functional Requirement............................................................... 16
Table 2:- Gabor Filter Parameter Table........................................................................................ 26
Table 3:- General Format for a Confusion Matrix........................................................................ 33
Table 4:- Confusion Matrix Obtained After Testing .................................................................... 34
List of Figures
Figure 1:- Feature Extraction Techniques..................................................................................... 13
Figure 2:- Classification Algorithm.............................................................................................. 14
Figure 3:- Use Case Diagram........................................................................................................ 16
Figure 4:- Proposed Block Diagram ............................................................................................. 18
Figure 5:- Components of the System .......................................................................................... 18
Figure 6:- Graphical User Interface of the System ....................................................................... 20
Figure 7:- Face Detection.............................................................................................................. 20
Figure 8:- Constituent Module of the Image................................................................................. 21
Figure 9:- General Form of Viola Jones Object Detection Algorithm ......................................... 23
Figure 10:- Face Detection............................................................................................................ 24
Figure 11:- Region Detecting by the System................................................................................ 24
Figure 12:-System Walkthrough - Open Model ........................................................................... 30
Figure 13:- GUI Face Detection and Localization........................................................................ 31
List of Equations
Equation 1:- Gabor Filter.............................................................................................................. 25
Equation 2:- Gabor Feature........................................................................................................... 26

x
This page intentionally left blank

1
Chapter 1
INTRODUCTION
Trying to interpret a person's emotional state in a nonverbal form, usually requires
decoding his/hers facial expression. Many times, body languages and especially facial
expressions, tell us more than words about one's state of mind. Face plays significant role in
social communication. This is a 'window' to human personality, emotions and thoughts.
According to the psychological research conducted by Mehrabian(Mehrabian, 1968)nonverbal
part is the most informative channel in social communication. Verbal part contributes about 7%
of the message, vocal – 34% and facial expression about 55%. Due to that, face is a subject of
study in many areas of science such as psychology, behavioral science, medicine and finally
computer science. In the field of computer science much effort is put to explore the ways of
automation the process of face detection and segmentation. Several approaches addressing the
problem of facial feature extraction have been proposed. The main issue is to provide appropriate
face representation, which remains robust with respect to diversity of facial appearances. For this
project I have performed an experiment which serves multiple purposes: Finding out, once and
for all, who "reads" facial expressions better- Men or Women, and if so, suggesting an answer
for the question- why do those differences exist? Revealing special features for recognizing
classically defined facial expressions and answering the question- which facial cues help us the
most decipher facial expressions?
Moreover, I will try to justify those features from an evolutionary point of view.
2. The Importance of Facial Recognition
Understanding the human facial expressions and the study of expressions has many
aspects, from computer analysis, emotion recognition, lie detectors, airport security, nonverbal
communication and even the role of expressions in art.
Improving the skills of reading expressions is an important step towards successful
relations.

2
3. Expressions and Emotions
A facial expression is a gesture executed with the facial muscles, which convey the
emotional state of the subject to observers. An expression sends a message about a person's
internal feeling. In Hebrew, the word for "face”, has the same letters as the word represents
"within" or "inside". That similarity implies about the facial expression most important role-
being a channel of nonverbal communication.
Facial expressions are a primary means of conveying nonverbal information among
humans, though many animal species display facial expressions too. Although human developed
a very wide range and powerful of verbal languages, facial expression role in interactions
remains essential, and sometimes even critical.
Expressions and emotions go hand in hand, i.e. special combinations of face muscular
actions reflect a particular emotion. For certain emotions, it is very hard, and maybe even
impossible, to avoid it's fitting facial expression.
For example, a person who is trying to ignore his boss's annoying offensive comment by
keeping a neutral expression might nevertheless show a brief expression of anger. This
phenomenon of a brief, involuntary facial expression shown on the face of humans according to
emotions experienced is called 'micro expression'.
Micro expressions express the seven universal emotions: happiness, sadness, anger,
surprise, contempt, fear and disgust and normal. However, Paul Ekman, a Jewish American
psychologist who was a pioneer in the study of emotions and their relation to facial expressions,
expanded the list of classical emotions. Ekman has added to the list of emotions nine more:
amusement, shame, embarrassment, excitement, pride, guilt, relief, satisfaction and
pleasure.
Micro expression is lasting only 1/25-1/15 of a second. Nonetheless, capturing it can
illuminate one's real feelings, whether he wants it or not. That is exactly what Paul Ekman did.

3
Back in the 80's, Ekman was already known as a specialist for study of facial expressions,
when approached by a psychiatrist, asking if Ekman has the ability to detect liars. The
psychiatrist wanted to detect if a patient is lying by threatening to suicide. Ekman watched a tape
of a patient over and over again, looking for a clue until he found a split second of desperation,
meaning that the patient's threat wasn't empty. Since then, Ekman have found those critical split
seconds in almost every liar's documentation. The leading character in the TV series "Lie to me"
is based on Paul Ekman himself, the man who dedicated his life to read people's expressions- the
"human polygraph".
The research of facial expressions and emotions began many years before Ekman's work.
Charles Darwin published his book, called "The Expression of the Emotions in Man and
Animals" in 1872. This book was dedicated to nonverbal patterns in humans and animals and to
the source of expressions.
Darwin's two former books- "The Descent of Man, and Selection in Relation to Sex" and
"On the Origin of Species" represented the idea that man did not came into existence in his
present condition, but in a gradual process- Evolution.
This was, of course, a revolutionary theory since in the middle of the 19th
century no one
believed that man and animal "obey to the same rules of nature".
Darwin's work attempted to find parallels between behaviors and expressions in animals
and humans. Ekman's work supports Darwin's theory about universality of facial expressions,
even across cultures.
The main idea of "The Expression of the Emotions in Man and Animals" is that the
source of nonverbal expressions of man and animals is functional, and not communicative, as
we may have thought. This means that facial expressions creation was not for communication
purposes, but for something else.
An important observation was that individuals who were born blind had similar facial
expressions to individuals who were born with the ability to see.

4
This observation was intended to contradict Sir Charles Bell's idea (a Scottish surgeon,
anatomist, neurologist and philosophical theologian, who influenced Darwin's work), who
claimed that human facial muscles were created to provide humans the unique option to express
emotions, meaning, for communicational reasons.
According to Darwin, there are three "chief principles", which are three general
principles of expression:
The first one is called "principle of serviceable habits". He described it as a habit that was
reinforced at the beginning and then inherited by offspring.
For example: he noticed a serviceable habit of raising the eyebrows in order to increase
the vision field. He connected it to a person who is trying to remember something, while
performing those actions, as though he could "see" what he is trying to remember.
The second principle is called "antithesis". Darwin suggested that some actions or habits
might not be serviceable themselves, but carried out only because they are opposite in nature to a
serviceable habit.
I have found this principle very interesting, and I will go into more detail later on.
The third principle is called "The principle of actions due to the constitution of the
Nervous System". This principle is independent from will or a certain extent of habit. For
example: Darwin noticed that animals rarely make noises, but in special circumstances, like fear
or pain they response by making involuntary noises.
Automatic facial expression recognition has been used in various real life applications
such as security systems, interactive computer simulations/designs, computer graphics,
psychology and computer vision. In this project, the aim is to implement binary and multi-class
face expression analysis algorithms based primarily on ‘Gabor facial feature’ by evaluating
features such as Haar-like, Gabor, Haar wavelet coefficients; and making use of classifiers like
MFNN combined with feature selection methods such as Adaboost to be used in automated
systems in real-life applications based on learning from examples.

5
4. Facial Expressions Evolutionary Reasons
A common assumption is that facial expressions initially served a functional role and not
a communicative one. I will try to justify each one of the seven classical expressions with its
functional initially role:
4.1 Anger: involves three main features- teeth revealing, eyebrows down and inner side
tightening, squinting eyes. The function is clear- preparing for attack. The teeth are
ready to bite and threaten enemies, eyes and eyebrows squinting to protect the eyes, but
not closing entirely in order to see the enemy.
4.2 Disgust: involves wrinkled nose and mouth. Sometimes even involves tongue
coming out. This expression mimics a person that tasted bad food and wants to spit it
out, or smelling foul smell.
4.3 Fear: involves widened eyes and sometimes open mouth. The function- opening
the eyes so wide is supposed to help increasing the visual field (though studies show
that it doesn't actually do so) and the fast eye movement, which can assist finding
threats. Opening the mouth enables to breath quietly and by that not being revealed by
the enemy.
4.4 Surprise: very similar to the expression of fear. Maybe because a surprising
situation can frighten us for a brief moment, and then it depends whether the surprise is
a good or a bad one. Therefore the function is similar.
4.5 Sadness: involves a slight pulling down of lip corners, inner side of eyebrows is
rising. Darwin explained this expression by suppressing the will to cry. The control
over the upper lip is greater than the control over the lower lip, and so the lower lip
drops. during a cry, the eyes are closed in order to protect them from blood pressure
that accumulates in the face. So, when we have the urge to cry and we want to stop it,
the eyebrows are rising to prevent the eyes

6
4.6 Contempt: involves lip corner to rise only on one side of the face. Sometimes only
one eyebrow rises. This expression might look like half surprise, half happiness. This
can imply the person who receives this look that we are surprised by what he said or did
(not in a good way) and that we are amused by it. This is obviously an offensive
expression that leaves the impression that a person is superior to another person.
4.7 Happiness: usually involves a smile- both corner of the mouth rising, the eyes
are squinting and wrinkles appear at eyes corners. The initial functional role of the
smile, which represents happiness, remains a mystery. Some biologists believe that
smile was initially a sign of fear. Monkeys and apes clenched teeth in order to show
predators that they are harmless. A smile encourages the brain to release endorphins
that assist lessening pain and resemble a feeling of well being. Those good feeling that
one smile can produce can help dealing with the fear. A smile can also produce positive
feelings for someone who is witness to the smile, and might even get him to smile too.
Newborn babies have been observed to smile involuntarily, or without any external
stimuli while they are sleeping. A baby's smile helps his parents to connect with him and get
attached to him. It makes sense that for evolutionary reasons, an involuntary smile of a baby
helps creating positive feelings for the parents, so they wouldn't abandon their offspring.
5. Context
Since facial expression analysis is used in various applications in real-life and is one of
the topics of interest in pattern recognition/classification areas, it has been taken into
consideration by various researchers in various methodologies. For example, some
researchers(Bartlett, Hager, Ekmen, & Sejnowski, 1999) (Bartlett M. , Littlewort, Lainscek,
Fsasel, & Movellen, 2004)have taken the whole face into account without dividing it into sub-
regions or sub-units for processing while some others came up with sub-sections (Ekmen &
Friesen, The Facial Actin Coding System: A Technique for the Measurement of Facial
Movement, 1978) for the implementations of their methods. Different parameterization
techniques, all aiming to solve the classification problem in the most efficient way possible,
havebeen introduced and used together with the above expressed methods.
To give an example to the application areas of facial expression recognition, one might
thinkabout the computer simulations and animations that are carried out in movies/cinema and in

7
computer games. Recognition of the expression on face can further be used in various other
aspects, such as in projects where a driver’s expression is examined to decide whether he is tired
and there foreran alert should be displayed to fulfill the requirements of a safe drive.
6. Scope and Objectives
The project mainly aims to come up with a solution to the facial expression recognition
problem by dividing it into sub-problems of classifications of some specific ‘Facial Features’.
For this, different methodologies and techniques for feature extraction,normalization,
selection and classification are considered. The resulting system comes up withsolutions to these
problems as well as taking the computational complexity and timing issues intoconsideration.
7. Achievements
In this project, several algorithms for extraction of feature vectors, followed by selection
and classification methods in viola Jones and Gabor Feature Extractor have been tried on
different datasets. The binary classification results for the Face have been observed to be better
than the current results in the literature. Achievement of a classification rate about 94.5%.On the
other hand, the results for the multi-class classification scheme for a large number ofclasses.
One of the other main achievements is the speed obtained as a result of using MFNN the
multi-class classification part, which can hardly be achieved by making use of other technique in
the literature.
8. Overview of Dissertation
In Chapter 2; ‘the state-of-the-art’ which includes information about different
methodologies and approaches for facial expression recognition in the literature has been
examined. In the chapters numbered 3 to 7, the system that has been used in the implementation
of the project has been introduced in detail together with its technical and theoretical parts. The
chapters are mainly about:
Multi-Layer Feed Forward Neural Network, Image Normalization, Feature Extraction,
Feature Selection and Classification. The experiments carried out on different datasets together
with their results and discussions are presented. The final chapter, Conclusion, summarizes and

8
evaluates the project, and addresses to some future work that can be carried outin order to
improve the current results.
9. Objectives
The objective of this report is to outline the problem of facial expression recognition, that
is a great challenge in the area of computer vision. Advantages of creating a fully automatic
system for facial action analysis are constant motivation for exploring this field of science and
will be mentioned in this thesis.
The system that is designed for automatic analysis of facial actions is usually called
Facial Expression Recognition System (FERS). The FER system is composed of 3 main
elements: face detection, feature extraction and expression recognition. Different methods were
proposed for each stage of the system, however, only the major ones will be mentioned in the
report. More in-depth study and comparison of related work can be found in surveys done by
Pantic and Rothkrantz(Pantic & Rothkrantz)as well as by Zeng et al. (Mase, 1991).
Firstly, I would like to outline the basic idea of the FER system and explain the most
important issues which should be taken under consideration in the process of system design and
development. Then, each FER system stage will be described in details, namely: main task
typical problems and proposed methods. Furthermore, the recent advances in the area of facial
expression analysis will be listed. Finally, some exemplary applications of FER systems will be
mentioned to show that they are widely used in many fields of science as well as in everyday
life.

9
Chapter 2
BACKGROUND AND LITERATURE SURVEY
1. State-of-The-Art
Introduction in the literature, when facial expression analysis is considered; two main
different approaches, both of which include two different methodologies, exist. Dividing the face
into separate action units or keeping it as a whole for further processing appears to be the first
and the primary distinction between the main approaches. In both of these approaches, two
different methodologies, namely the ‘Geometric-based’ and the ‘Appearance-based’
parameterizations, can be used. In the following subtitles, details of the two approaches and the
two methodologies have been further discussed.
2. Two Main Approaches for Facial Expression Analysis on Still Images
The two main approaches can be summarized as follows:
Making use of the whole frontal face image and processing it in order to end up with the
classifications of 6 universal facial expression prototypes: disgust, fear, joy, surprise, sadness and
anger; outlines the first approach. Here, it is assumed that each of the abovementioned emotions
have characteristic expressions on face and that’s why recognition of them is necessary and
sufficient. Ekman, Friesen (Ekmen & Friesen, 1976) and Izard (Izard , Doughberty, & Hembree,
1983)have proposed the seats in their related work and Bartlett, Littlewort et al (Bartlett, Hager,
Ekmen, & Sejnowski, 1999) (Bartlett M. , Littlewort, Lainscek, Fsasel, & Movellen, 2004) have
used the method for fully automatic recognition systems.
Instead of using the face images as a whole, dividing them into some sub-sections for
further processing forms up the main idea of the second approach for facial expression analysis.
As expression is more related with subtle changes of some discrete features such as eyes,
eyebrows and lip corners; these fine-grained changes are used for analyzing automate
recognition. This approach has been presented to be the ‘Facial Action Coding System’, which is
first developed by Ekman and Friesen (Ekmen & Friesen, The Facial Actin Coding System: A
Technique for the Measurement of Facial Movement, 1978), for describing facial expressions by

10
44 different Action Units (AU’s) existing on face. The advantage is that; this decomposition
widens the range of applications of face expression recognition. This is due to ending up with
individual features to be used in/with different processing areas/method so the than just having
the 6 universal facial expression prototypes. Most of the current work done on facial expression
analysis makes use of these action units.
It should be mentioned that, there are also some other methods in which neither the
frontal face image as a whole nor the all of 44 action units themselves, but some other criterion
such as the manually selected regions on face (Mase, 1991)or surface regions of facial features
(Yacoob & Davis, 1996)are used for the recognition of the facial expression.
2.1 Geometric and Appearance Based Parameterizations
There are two main methods that are used in both of the above explained approaches:
1. Geometric Based Parameterization is an old way which consists of
tracking and processing the motions of some spots on image sequences, firstly presented
by Suwa to recognize facial expressions (Suwa, Sugie, & Fujimora, 1978). Cohn and
Kanade later on tried geometrical modeling and tracking of facial features by claiming
that each AU is presented with a specific set of facial muscles. In general, facial motion
parameters (Mase, 1991)(Yacoob & Davis, 1996)and the tracked spatial positioning &
shapes of some special points (Lanitis, Taylor, & Cootes, 1997) (Kapoor, Qi, & Picard,
2003)on face, are used as feature vectors for the geometric based method. These feature
vectors are then used for classification. The following might be regarded as the
disadvantages of this method:
 The approximate locations of individual face features are detected
automatically in the initial frame; but, in order to carry out template based tracking, the
contours of these features and components have to be adjusted manually in this frame.
(And this process should be carried out for each individual subject)
 The problems of robustness and difficulties come out in cases of pose and
illumination changes while the tracking is applied on images.
 As actions & expressions tend to change both in morphological and in
dynamical senses ,it becomes hard to estimate general parameters for movement and

11
displacement. Therefore, ending up with robust decisions for facial actions under these
varying conditions becomes to be difficult (Donato, Bartlett, Ekmen, & Sejnowksi,
1999).
2. Rather than tracking spatial points and using positioning and movement
parameters that vary within time, color (pixel) information of related regions of face are
processed in Appearance Based Parameterizations; in order to obtain the parameters that
are going to form the feature vectors. Different features such as gabor, haar wavelet
coefficients, together with feature extraction and selection methods such as PCA1
, LDA2
,
and Adaboost are used within this framework. Example research can be found in
(Whitehill & Omlin, 2006)(Bartlett M. , Littlewort, Lainscek, Fsasel, & Movellen,
2004).The combination of the Geometric and Appearance based methods have also been
used I some work in the literature. For example, Zhang(Zhang, 1999)has tracked some
fiducially points on the face images while also taking the Gabor wavelets of these points
into account for the facial expression recognition.
2.2 More about Appearance Based Parameterizations
One of the most successful approaches for expression recognition by making use of
appearance based parameterizations consists of applying Gabor filters for feature extractions of
AU’s, and then using support vector machines to classify them. But, Omlin and
Whitehill(Whitehill & Omlin, 2006)have shown that although the recognition rates for this
method are satisfactory, the approach is very inefficient in memory usage and very slow due to
the high redundancy of Gabor representation. They proposed using Haar features instead of
Gabor, as Haar features can be extracted quickly without a need of Fourier transform in contrast
to Gabor. Secondly, for the classifier part, they made use of Ada-boost, which performs feature
selection at the same time with classification. As for the advantages, they have shown that Ada-
boost as a classifier works 300 times faster than SVM3
and that Haar coefficients+ Ada-boost
yields mostly better results than Gabor wavelets (five spatial frequencies and eight orientations)
1
2
Linear Discriminant Analysis
3
Support Vector Machine

12
+ Support Vector Machines. In the project, in order to avoid the above mentioned disadvantages
of Geometric based methods, Appearance based methods have been decided to be used. Within
the implementation of the binary classification scheme of each class; different from Omlin and
Whitehill’s suggested methods, Ada-boost is just used as a feature selection method rather than
a method for carrying out classification together with feature selection. The classification itself is
achieved by Support Vector Machines. This scheme gives out better average classification
results; although the classification time comes out to be a little bit slower than before. Apart from
Haar and Gabor wavelets, ‘Haar-Like’ features are also used in the project as feature vectors and
the difference in terms of accuracy is measured and reported. When the multi-class classification
problem is taken into consideration, the algorithm implemented uses the ‘Error correcting output
code (ECOC)’ technique which is combined with the Ada-boost feature selection and Support
Vector Machines classification techniques, together with an application of Bootstrapping on the
training data. ECOC is examined to be giving out much smaller amount of classification time for
the multiclass AU4
recognition problems although the classification rates comes out to be less
robust.
3. Current state-of-the-art approaches
As brief mentioned before, an automatic FER system is composed of three major
components. According to Pantic (Leon & Pantic, 2000), depending on the type of images that
are being used, we can talk about extracting the facial information as "localizing the face and its
features", in the context of static images, and "tracking the face and its features" forvideos.
4. Face detection algorithms
For solving the first step, which is face identification, various methods have been
proposed. In the case of static images, the most commonly used technique is called Viola-Jones,
which achieves fast and reliable detection for frontal faces (Viola & Micheal, Robust real-time
face detection, 2004).Among other localization techniques, there is a neural network-based face
detection solution by Rowley et al (Takeo, Henry, & Rowlwy, 1998)and a statistical method for
3D object detection applied to faces and cars, by Schneiderman (Takeo & Henry, 2000).Face
4
Action Unit

13
tracking in image sequences use other types of approaches, which rely on constructing 3D face
models. Some popular examples are the 3D Candied face model (J., 2001) and Piecewise Bezier
Volume Deformation tracker (PBVD) .
5. Feature extraction algorithms
The next and most important step is feature extraction, which can determine the performance,
efficiency and scalability of the system. The main goal is mapping the face pixels into a higher
level representation, in order to capture the most relevant properties of the image and reduce the
dimension of data. There are three types of approaches that appear in the literature, which
depend on data and goal of the system.
Figure 1:- Feature Extraction Techniques
Firstly, geometric or feature based techniques are concerned with identifying specific areas
or landmarks of the face. They are more computationally expensive, but they can also be more
robust and accurate, especially if there is variation in size or orientation. An example would
Active Shape Models, also known as ASM, which is popular for face and medical imaging
applications. They are statistical models that learn the shape of objects and iteratively get
adjusted on a new example, in this case, on a face. However, they can be highly sensitive to
image brightness or noise. Improved results are achieved with Appearance Active Models
(AAM), a more elaborated version of ASM which also incorporates texture information for
building an object model.
The second approach does not treat the face as individual parts, but analyses the face as a
whole. These are known as appearance or holistic methods. One of the most popular algorithms
in the literature are the Gabor Wavelets, which can achieve excellent results in recognizing facial
expressions. An interesting system developed by Bartlett et al in 2003 (Bartlett & Littlewort,
2003) uses this method and it has been deployed on several platforms. A major downside for real

14
time applications is the high computational complexity and memory storage, even though it is
usually combined with a dimension reduction technique. An alternative approach, originally used
for texture analysis but which recently gained popularity in the current context, is Local Binary
Patterns(LBP). This technique has a great advantage in terms of time complexity, while
exhibiting high discriminating capabilities and tolerance against illumination changes.
There is also a third approach, perhaps the best in tackling feature extraction, which consists
of a combination of the previous methods, usually known in literature as hybrid techniques.
Geometric procedures such as AAM are being used for automatic identification of important
facial areas on which holistic methods, such as LBP, are applied.
6. Classification algorithms
Once a higher representation of the face is obtained, a classification process is applied. A
set of faces and their corresponding labels are fed into a classifier, which upon training, it learns
and predicts the emotion class for a new face. There is a large variety of classifiers that are used
in literature, and choosing which one to use depends on criteria, such as: type and size of data,
computational complexity, importance of robustness and overall outcome.
Figure 2:- Classification Algorithm
One of the most popular methods are Multi-Layer Feed Forward Neural Network, greatly
used for their results and high generalization capabilities, but suited for binary classification
problems. Alternatively, powerful, exiles and capable of training complex functions are Artificial
Neural Networks, which are also naturally multi-class algorithms, or Random Forests. Cohen et
al. (Cohen, Sebe, & Chen, 2003) suggests using dynamic classifiers, such as Hidden Markov
Models. This method is proposed for person-dependent systems, as it is more sensitive to
temporal pattern changes, in the case of videos. Studies by the same authors also recommend
using static classifiers, such as Tree Augmented Naive Bayes, for person-independent scenarios.

15
Chapter 3
REQUIREMENTS
Chapter Overview
This chapter presents the first step taken to create the proposed system, which is focused on
understanding what it is expected from the system, who are the stakeholders and how the user is
meant to interact with the delivered services. To support this process, a list of requirements and a
use case diagram are provided.
1. Requirements Elicitation
It is very important to grasp the scope of the system, what is the core functionality and what
represents a 'nice to have' feature. The elicitation step involves gathering clear and precise
requirements, in order to model the system and its characteristics, a process that can be very
complex in software development. Due to the fact that this project is mainly focused on research
and lesson providing a user oriented tool, it only uses the main techniques for analyzing the
system's requirements.
2. Functional and non-functional requirements
When collecting and analyzing the requirements of a software, there are two aspects that
need to be considered. The functional one refers to the features that the system needs
to deliver, while the non-functional aspect takes into account constraints and how the
system should behave.
Table 3.1 lists the requirements of the proposed solution.
Functional Requirement Non-Functional Requirement
The system should classify an image into one
of 7 emotions
The system should be implemented in Matlab
The system should include an automatic face
detection algorithm
The system should produce graphs showing
performance
The system should allow for manual face
extraction
The system’s GUI should be simple and clear.
The system should include techniques for

16
extraction of meaningful facial features
The system should deliver a trained classifier
The system should deliver a simple GUI.
The system should allow to user to upload new
images.
Table 1:- Functional and Non-Functional Requirement
3. Use case Diagram
Figure 3:- Use Case Diagram

17
Chapter 4
DESIGN
Chapter overview
This chapter presents the FER system from a large-grain perspective, depicting the
designprocesses and its constituent parts, which have all led to the current implementationof the
system.
1. Design Methodologies
Software design is one of the most important steps of software life-cycle, as it provides the
processes for transitioning the user requirements into the system's implementation. There are a
series of methodologies that can be adopted, which highly depend on the type of project, team
and available resources.
The proposed FER system has been developed by following an approach known as prototyping".
According to Bedouin-Lafon (Bedouin-Lafon & Mackey, 2003), a prototype is a tangible art
effect, not an abstract description that requires interpretation". The idea behind this, is creating a
series of uncompleted versions of the software, until the expected final solution is achieved.
From the various approaches of this design principle, I have chosen to use evolutionary
prototyping. It implies creating prototypes that will emerge as building blocks for the final
software; therefore, the system is re-evaluated and enhanced after each version to provide a more
functionality or a more accurate performance.

18
Figure 4:- Proposed Block Diagram
2. Architectural Design
The proposed FER system has been developed as a standalone application, with no
communication with other services or applications. The most important component incorporates
all the functionality, which is itself divided into 3 modules, while the second one is a simple
Graphical User Interface that allows user access to the system's features.
The elements of the system are illustrated in Figure 3
Image
Database
Face Detection Feature
Extraction
Emotion
Classification
Graphical User
Interface
Model FERS System
Figure 5:- Components of the System

19
3. Dependencies
The external component consisting of a dataset of images, is used for training a model
recognizes facial emotions and testing its performance. This acts as a dependency for the system
because if the data is scarce, has a bad quality or varies significantly, the system will be poorly
trained, hence will achieve inaccurate results. Some of the solution's functionality is achieved
through available libraries. This leads to another important dependency, created by the use of a
face detection library, who accuracy has an impact on the overall performance.
4. Interface Design
The system's GUI has the role of allowing the user to use the available features, rather
than providing an extensive and modern software front-end. Being a secondary component, it has
been built as a simple, yet user friendly interface, which integrates all the major functionalities.
Both the design and development of the GUI were performed in Matlab's GUI design
environment, known as GUIDE. The very first scheme was primitive in terms of features,
consisting only of two buttons which allowed the user to upload a new image, respectively to
request the system to perform the classification, which would result in displaying the name of the
predicted label.
Figure 4 presents the final version GUI, along with brief descriptions of its
elements.

20
Figure 6:- Graphical User Interface of the System
5. Image Detection
Viola and jones system (Viola & Jones, Robust Real Time Face Detection, 2004) implemented to
detect the face component. It detects exact location of face. Fig. 5 shows sample face detection
by the Viola–Jones classifier.
Figure 7:- Face Detection

21
Chapter 5
IMPLEMENTATION
Chapter overview
This chapter presents the process of developing the proposed Facial Expression Recognition
System. It firstly introduces the tools that were used and the database, followed by a detailed
description of the implementation of the functionality and the interface. It concludes with a
system walkthrough and a brief presentation of the prototypic stages.
1. Implementation tools
As the goal of the project was highly targeted towards research, the entire system was developed
in Matlab, a high level language and scientific environment. Its capabilities are enhanced by
using the integration with OpenCV, a library of functions mainly aimed towards Computer
Vision usage.
2. Dataset
JAFFEE and Yale The system classifies images of people expressing one of the basic
Seven emotions: disgust, anger, fear, happiness, sadness or surprise. The dataset used for training
and testing the system was chosen out of the free and publicly available datasets on the web,
namely
3. Methodology
The proposed solution for the Automatic Face Expression Recognition System is
composed of a series of modules, with well-defined properties and actions, that follow sequential
processes. If we look at the system from a high grain perspective, its main attributes are
identifying the face from a given image, mapping the face pixels into a higher representation and
ultimately decide the emotion class. The sequence of steps undertaken by the system is depicted
in Figure 6.
Face
Detection
Preprocessing
Feature
Extraction
Feature
Selection
Classification
Figure 8:- Constituent Module of the Image

22
3.1 Face detection
Detecting the region of interest represents an essential part of any recognition
system. Ideally, this process has to be performed automatically and with a very low false
positive rate. One of the most famous frameworks for object detection that is currently
being used is called Viola-Jones.
3.2 Viola-Jones Object Detection algorithm
In (Viola & Jones, Robust Real Time Face Detection, 2004), Viola and Jones
proposed a new algorithm for object detection, widely used for face detection. Their
novel approach attained better results compared to previous methodologies, achieving
fast detection and a low false positive rate. The first stage of the system consists of
computing and extracting the so called Haar like features. Because computing the feature
values would be an expensive operation, a new concept called 'Integral Image', was
introduced, which allowed for constant time computations. This median representation
enabled a fast and easy way for obtaining the feature values. However, deriving all the
possible features would be very expensive. Therefore, a feature selection process was
proposed, which applies a modified version of the Ada-Boost technique. This machine
learning boosting algorithm was used to create a strong classifier, out of a series of weak
classifiers (models which perform slightly better than a random guess) and a scheme of
associated weights.
Lastly, a cascade of classifiers was used, in which the first classifiers are simple and use
to discard non-faces, and the stronger classifiers were used for sub-windows that might
Be faces.

23
Figure 9:- General Form of Viola Jones Object Detection Algorithm
3.2.1 Rectangle features:
 Value = Σ (pixels in black area) - Σ (pixels in white area)
 Three types: two-, three-, four-rectangles, Viola & Jones used two-rectangle features
 For example: the difference in brightness between the white &black rectangles over a
specific area
 Each feature is related to a special location in the sub-window
3.2.2 Haar Like Features
All human faces share some similar properties. These regularities may be matched
using Haar Features.
A few properties common to human faces:
 The eye region is darker than the upper-cheeks.
 The nose bridge region is brighter than the eyes.
Composition of properties forming matchable facial features:
 Location and size: eyes, mouth, bridge of nose
 Value: oriented gradients of pixel intensities
The four features matched by this algorithm are then sought in the image of a face
(shown at right).

24
3.3 Detection of faces using Viola-Jones
Due to its efficiency and universality, I have chosen the Viola-Jones algorithm for
this project, in order to detect and extract the faces. This detects the region of the eyes,
which is then used to adjust the left and right margins of the face window, to ensure equal
distance between eyes and the sides of the face. In this way, unnecessary information
(such as hair, ears, background) is discarded and the extracted faces will have normalized
positions. The first two pictures of Figure 7 show the face and eyes regions returned by
the Viola-Jones detectors, outlining the side areas with non-essential elements, while the
third image displays the area which is ultimately extracted.
Figure 10:- Face Detection
Figure 11:- Region Detecting by the System

25
The extracted faces are then resized to a standard dimension of 100x100 pixels in
8-bit Gray scale in 14.5sec time complexity and stored in a new face dataset, which is
used in the next modules, for feature extraction and classification.
3.4 Feature extraction
Feature extraction is one of the most important stages for any classification
system. The choice of algorithms depends not only on the computational properties, but
also on the type of data. As a result, the algorithm that I have chosen to perform feature
extraction is Gabor feature extractor which is widely known not only for its
computational efficiency, but also for its robustness against illumination changes. To
increase its performance, the images are firstly taken though a pre-processing step.
3.5 Pre-processing
Raw image data can be corrupted by noise or other unwanted effects, even if the
camera or environment remains unchanged. Therefore, before doing any processing to
extract meaningful information, the quality of the images has to be improved though a
series of operations, known under the term of pre-processing. This solution applies a pre
processing technique called Contrast-limited adaptive histogram equalization, by using
the Matlab's built-in function, 'adapthisteq', which is used for its property of improving
the local contrast in the face images.
3.6 Gabor Coefficients
In the spatial domain by Gaussian function Gabor filter is a complex exponential
modulated. The 2-D Gabor filters are spatial sinusoids localized by Gaussian window,
and they can be created to be selective for orientation, localization, and frequency as
well. Spatial relations are preserved details about spatial relation in the process by Gabor
filter and it is very flexible to demonstrate images.
𝐺(𝑥, 𝑦, 𝜃, 𝑢, 𝜎) =
1
2𝜋𝜎2
− exp {−
𝑥2
+ 𝑦2
2𝜎2
⁡} exp{2πi(ux cos θ + uy sin θ)⁡}
Equation 1:- Gabor Filter

26
Where 𝐺(𝑥, 𝑦)specify the impulse position
Parameter Symbol Values
Orientation 𝜃
{0,
𝜋
8
,
2𝜋
8
,
3𝜋
8
,
4𝜋
8
,
5𝜋
8
,
6𝜋
8
,
7𝜋
8
}
Wavelength 𝜆 {4⁡⁡, √2
4
, 8, √2
8
, 16}
Standard Deviation 𝜎 Standard deviation of 2D
Gaussian Envelop
Table 2:- Gabor Filter Parameter Table
A set of Gabor filters is used with 5 spatial frequencies and 8 distinct orientations;
this makes 40 different Gabor filters.
𝐺̃(𝑥, 𝑦, 𝜃, 𝑢, 𝜎) = 𝐺[𝑥, 𝑦, 𝜃, 𝑢, 𝜎] −
∑ ∑ 𝐺(𝑥, 𝑦, 𝜃, 𝑢, 𝜎)𝑛
𝑗=−𝑛
𝑛
𝑖=−𝑛
(2𝑛 + 1)2
Equation 2:- Gabor Feature
This function is used for filtering and feature extraction, where (2𝑛 + 1)2
is the size of
filter.
3.6.1 Ada-boost Classifier with Feature Selection
Ada-boost classifier, which is slightly modified by Viola and Jones (Viola &
Micheal, Robust real-time face detection, 2004) to perform feature selection at the same
time with classification, is explained in this section. In this algorithm, different from the
original one, the features are taken into consideration independently within each single
run. A binary classifier is created from the feature which can best discriminate among the
classes, and the weights are updated accordingly.
That is to say, for a manually decided number of features times:
Calculate the probability distribution of the data according to the weights. Set the
initial distributions & weights to be equal to each other if you don’t have prior knowledge
about them. In each iteration, call the Weak Learn algorithm M times to learn the N* M
data (N = number of data, M = number of features) according to a single feature each
time.

27
Apply each of t the trained M trained functions on the training set to calculate the
individual training errors (error per pattern is equal to1 if it is misclassified, 0
otherwise).Out of the M functions select the one giving out the least error and calculate
its coding
loss function to be used in resetting the weights of the distribution.
Just same as the original Ada-boost the loss function, which resets the weights, give
priority to the examples which are wrongly classified (which are ‘difficult’ to classify) by
increasing their weights; and does the vice versa for the correctly classified patterns.
Go to the first step.
The final hypothesis is again the weighted average of the all hypothesis reached by the
Weak-Learn Algorithm in each iteration, which is equal to the number of selected
features. Simply, in both of the Ada-boost algorithms, by making calls to the Weak Learn
multiple times through altering the distribution over the feature domain each time so that
the probability of “harder” samples parts of the space is increased, the aim is to force the
weak learning algorithm to generate new hypothesis that makes less mistakes on these
parts(Freund & Schapire, 1997). Ada-boost is also claimed to work more powerfully as a
classifier than the boost-by-majority algorithms in which best prediction of each network
is selected and then combined with a majority rule, as its final hypothesis’ accuracy
depends on all of the hypothesis that have been returned by Weak Learn In other words,
the final error of Ada-boost depends on the errors obtained from each Weak Learn
hypotheses whereas the errors of the other boosting algorithms depend on the maximal
error of the weakest hypothesis only. These algorithms therefore don’t have the
advantage of making use other hypotheses whose errors are smaller (Freund & Schapire,
1997)Classification. Different classification methods to be used in binary and multi-class
classification task are explained in this section.

28
𝑆𝑒𝑙𝑒𝑐𝑡𝑖𝑜𝑛⁡𝐴𝑙𝑔𝑜𝑟𝑖𝑡ℎ𝑚
 𝐼𝑛𝑖𝑡𝑖𝑎𝑙𝑖𝑧𝑒⁡𝑆𝑎𝑚𝑝𝑙𝑒⁡𝐷𝑖𝑠𝑡𝑟𝑖𝑏𝑢𝑡𝑖𝑜𝑛
 𝐹𝑜𝑟⁡𝑡ℎ𝑒⁡𝑖𝑡𝑒𝑟𝑎𝑡𝑖𝑜𝑛⁡𝑡⁡ = ⁡1, 2, . . . , 𝑇, 𝑤ℎ𝑒𝑟𝑒⁡𝑇⁡𝑖𝑠⁡𝑡ℎ𝑒⁡𝑓𝑖𝑛𝑎𝑙⁡𝑖𝑡𝑒𝑟𝑎𝑡𝑖𝑜𝑛
 𝑁𝑜𝑟𝑚𝑎𝑙𝑖𝑧𝑒⁡𝑡ℎ𝑒⁡𝑊𝑒𝑖𝑔ℎ𝑡
 𝑇𝑟𝑎𝑖𝑛⁡𝑎⁡𝑤𝑒𝑎𝑘⁡𝐶𝑙𝑎𝑠𝑖𝑓𝑖𝑒𝑟
 𝑆𝑒𝑙𝑒𝑐𝑡⁡𝑡ℎ𝑒⁡ℎ𝑦𝑝𝑜𝑡ℎ𝑒𝑠𝑖𝑠
 𝐶𝑜𝑚𝑝𝑢𝑡𝑒⁡𝑡ℎ𝑒⁡𝑤𝑒𝑖𝑔ℎ𝑡
 𝑈𝑝𝑑𝑎𝑡𝑒⁡𝑡ℎ𝑒⁡𝑤𝑒𝑖𝑔ℎ𝑡⁡𝑑𝑖𝑠𝑡𝑟𝑖𝑏𝑢𝑡𝑖𝑜𝑛
 𝐹𝑖𝑛𝑎𝑙⁡𝑆𝑒𝑙𝑒𝑐𝑡𝑖𝑜𝑛⁡𝑓𝑒𝑎𝑡𝑢𝑟𝑒⁡𝐻𝑦𝑝𝑜𝑡ℎ𝑒𝑠𝑖𝑠
4. Classification
The last stage of the system consists of a model that is trained to perform emotion
classification on new images. It uses a Machine Learning classifier called Multi-layer
Neural Network(MNN), which takes the output of the feature extraction module, the
feature vectors, and learns the patterns that differentiate one emotion from the other. This
subsection first introduces the concept of MNN and continues with a detailed explanation
of how they are used in the system.
4.1 Multi-Layer Feed Forward Neural Network
The selected features are fed into the constructed neural network to train it to
identify the seven universal facial expressions. The architecture is a 3-layer feed-forward
neural network and trained by a back-propagation algorithm (Bouzalmat, Belghini,
Zarghili, Kharroubi, & Majda, 2011). The back propagation algorithm basically
replicates its input to its output via a narrow conduit of hidden units. The hidden units
extract regularities from the inputs because they are completely connected to the inputs.
Every network was trained to give the maximum value of 1 for exact facial expression
and 0 for all other expressions. The input layer has 7 nodes, each for each facial
expression whiles the hidden layer had 49 neurons; each expression for 7 neurons. We
chose 7 neurons to compensate for the target output of seven facial expressions. This was
the case for the seven prototypical facial expressions, which was validated by the use of

29
the JAFFE facial expression database. Since the experiment was also validated using the
Yale database, where four expressions were used there was a slight modification in the
construction of the network for this application. Here, the hidden layer neurons were
settled at 16 each facial expression dedicated for 4 neurons and 4 neurons in the output
layer.
 𝑃𝑟𝑜𝑐𝑒𝑠𝑠⁡𝑜𝑓⁡𝑇𝑟𝑎𝑖𝑛𝑖𝑛𝑔⁡𝐼𝑛𝑣𝑜𝑙𝑣𝑒𝑠
 𝑊𝑒𝑖𝑔ℎ𝑡⁡𝐼𝑛𝑖𝑡𝑖𝑎𝑙𝑖𝑧𝑎𝑡𝑖𝑜𝑛
 𝐶𝑎𝑙𝑐𝑢𝑙𝑎𝑡𝑖𝑜𝑛⁡𝑜𝑓⁡𝐴𝑐𝑡𝑖𝑣𝑎𝑡𝑖𝑜𝑛⁡𝐹𝑢𝑛𝑐𝑡𝑖𝑜𝑛
 𝑊𝑒𝑖𝑔ℎ𝑡⁡𝐴𝑑𝑗𝑢𝑠𝑡𝑚𝑒𝑛𝑡
 𝑊𝑒𝑖𝑔ℎ𝑡⁡𝐴𝑑𝑎𝑝𝑡𝑖𝑜𝑛
 𝑇𝑒𝑠𝑡𝑖𝑛𝑔⁡𝑓𝑜𝑟⁡𝐶𝑜𝑛𝑣𝑒𝑟𝑔𝑒𝑛𝑐𝑒⁡𝑜𝑓⁡𝑁/𝑊
 𝐼𝑓⁡𝐸𝑟𝑟𝑜𝑟⁡𝑇ℎ𝑒𝑛⁡𝐼𝑡’𝑙𝑙⁡𝐵𝑒⁡𝑀𝑖𝑛𝑖𝑚𝑖𝑧𝑒𝑑
 𝑃𝑟𝑒𝑣𝑖𝑜𝑢𝑠⁡𝑊𝑒𝑖𝑔ℎ𝑡⁡𝑊𝑖𝑙𝑙⁡𝐵𝑒⁡𝐶ℎ𝑎𝑛𝑔𝑒⁡
 𝑈𝑝𝑑𝑎𝑡𝑒⁡𝑃𝑟𝑒𝑣𝑖𝑜𝑢𝑠⁡𝑊𝑒𝑖𝑔ℎ𝑡.
5. System Walkthrough
The main functionality of the system is the ability of uploading an image of a person, expressing
one of the seven basic emotions, and requesting a prediction for its category. Therefore, the first
feature that is enabled is the upload panel. The user has the choice of loading an image from the
file drive, or alternatively, he uses the computer's camera

30
Figure 12:-System Walkthrough - Open Model
Once the image is uploaded into the system, it can perform face detection, displayed in Figure
10, This is done automatically, as described in the 'Face Detection' section, butin some scenarios,
due to poor image quality (usually in the case of using the computer's camera to take a picture),
the algorithm might fail. In order to cope with such situations, the user has the option to
manually select the face region and still use the system to classify the image.

31
Figure 13:- GUI Face Detection and Localization
Finally, the Express Recognition button performs classification of the face, displaying the
predicted label and its description in the Facial Expression Recognition panel. The Probability
Estimates" panel offers a graphical visualization of the posterior probabilities of each class,
outlining the system's belief in all the emotions, not only in the predicted one (Figure10)

32
Chapter 6
TESTING AND EVALUATION OF THE RESULT
Chapter overview
This chapter presents a detailed analysis of the system from the perspective of its performance
and robustness. It firstly introduces the methods that were used to test the system and it then
presents the obtained results. Finally, it presents an alternative classification method and offers a
comparison of the outcomes.
1. Evaluation methods
1.1 Cross validation Method description
Adopting a validation technique is not only essential for estimating the
performance of the system, but also to be able to compare different models or versions of
the same model, by modifying its parameters. Moreover, one of the biggest problems that
systems which use Machine Learning algorithms have, is getting a big enough dataset.
Having such large dataset would ensure proper training of the agent and would allow to
use the other half of the examples for a robust performance evaluation.
To overcome this limitation, a widely used validation technique in Machine
Learning, is cross-validation. This method replies on repetitive divisions of the dataset,
into training and testing subsets, in order to avoid over-fitting and capture the prediction
error that the model exhibits. These are various adaptations of this technique, with
different chemes of partition; this project uses the so called k-fold cross validation. With
this method, the dataset is randomly split into K equal subsets, out of which K-1subsets
are used for the training phase, while the remaining fold is used for testing. This process
in repeated K times, such that in every round of testing, a different subset is used for
validation.

33
2. Analysis of the Result
Analysis of results the proposed system uses a 10-fold cross validation method for
splitting the 306 examples; therefore it performs testing on approximately 30 examples each
round. Following the validation phase, an average accuracy of 86% is achieved for the
implemented model. validation process; during one respectively ten rounds of testing. The graph
depicts the average accuracies and the associated error bars, which outline how many the results,
vary. The difference in values for each test, observed, is due to the random division of the
examples in each fold. In some partitioning, the training set might contain too few examples
from one class; hence have a poorer performance on classifying examples from that specific
category.
3. Confusion Matrix
3.1 Methods description
It is very important to analyses the model not only from its performance
perspective, but also to investigate how it behaves for each individual classes. Does it
predict perfectly for class A but always miss-classifies class C, or does it have a fair
performance for all classes? A very common approach to detect such issues is using a
confusion matrix. This is a table-like representation of the predictions that the model
outputs during testing, illustrating how the examples have been classified. Table 3
outlines the general format of this type of representation, for a binary problem.
Predict Class
Positive Negative
True Class Positive True Positive False Negative
Negative False Positive True Negative
Table 3:- General Format for a Confusion Matrix

34
3.2 Analysis of the Result
The confusion matrix that has been obtained during 10 rounds of testing the
system, where the rows correspond to the true labels and the columns to the predicted
ones. Therefore, looking at the rows, we can not only observe how many instances have
been correctly classified, but also which were the classes assigned to the misclassified
examples. For example, 'surprise' is well labelled (80 times, out of 82) as well as 'disgust'
(54 out of 58). On the other hand, 'sadness' is often confused with 'anger', while 'fear' is
mostly mislabeled.
Predicted Label
Anger Disgust Fear Happiness Sadness Surprise Normal
Actual
Label
Anger 38 3 0 1 3 0 3
Disgust 3 54 0 1 0 0 0
Fear 2 2 9 3 3 4 1
Happiness 0 0 1 687 0 0 2
Sadness 0 0 0 0 17 2 4
Surprise 1 0 2 0 0 80 5
Normal 2 1 1 1 5 9 6
Table 4:- Confusion Matrix Obtained After Testing

35
Chapter 7
CONCLUSION
Chapter overview
This final chapter summaries the achievements of the project as well as the challenges that
were faced during its development. It also provides an outline of the possible improvements and
their applicability, concluding with final remarks.
1. Project Achievements
The proposed solution delivers a recognizer system for facial expressions. The most
important achievement consists in the integrated functionalities and the obtained results. The
system includes an automatic face detection mechanism and implements feature extraction
techniques, tailored for the current problem. A Multi-Layer Neural Network Modelmodel is
trained on examples of faces and extended to support classification. This successfully expands
the system with the capability of classifying all six emotions, ultimately achieving an accuracy of
86%. The functionality can be easily accessed by the user through a simple, yet intuitive GUI,
which provides the ability of uploading an image and requesting a classification.
2. Challenges
Prior to implementing the system, one of the first challenges of the project was choosing
the algorithms for each individual module, because the selection had to consider the following:
integration of techniques, restriction to the time allowed for the project development,speed of
computation and ultimately, achieving a good system performance. Other difficulties were met
during the feature extraction phase, while implementing the gabor feature algorithm and finding
a suitable way of partitioning the face into meaningful regions. Extending the ada-booswere also
non-trivial tasks.

36
3. Future Work
There are a series of approaches that would either increase the performance of the
system or extent its functionality. A major improvement would be replacing the current method
of face division with one of the geometric techniques, Active Shape Models or Active
Appearance Models. These techniques allow identification of landmark points, surrounding
important face regions, such as eyes, nose and mouth. This enables feature extraction to be
applied only on key areas, hence improve the results or even eliminate the need of using a face
detection mechanism. One of the limitations of the proposed system is that it only allows
recognition for static images. Therefore, a significant advancement would be adding the ability
of modeling the temporal component of expressions, hence analyzing video input as well. This
would imply using alternative algorithms such as Piecewise Bezier Volume Deformation
(PBVD) tracker, for face tracking, and Hidden Markov Models for classification. Moreover, the
solution has a restricted number of classes that it is able to predict. To overcome this, a more
appropriate approach would be using the FACS parameterized system, hence regard each
emotion as a set of Action Units describing movements of face muscles. This allows both the
extension of the subset of classes to more than 6, and a more in-depth emotion characterization.
4. Concluding Remarks and Reflections
Finally, this report demonstrates the achievements of the project, but also presents an assessment
of the performance and reliability. Overall, the proposed solution has delivered a system capable
of classifying the six basic universal emotions, with an averageaccuracy of 86%. It extensively
makes use of Image Processing and Machine Learning techniques, to evaluate still images and
derive suitable features, such that, presented with a new example, it would be able to recognize
the expressed emotion. Personally, this project had a significant contribution for improving my
knowledge of Computer Vision and Machine Learning methodologies and for understanding the
challenges and limitations of image interpretation. Moreover, it helped me in developing a
systematic approach for building such a system, including planning, design and documentation,
within a restricted amount of time. To conclude with, I believe that the current solution
succeeded to meet the project's requirements and its deliverables. Even though it has a series of

37
limitations, it allows for further extensions, which would enable a more in-depth analysis and
understanding of human behavior, through facial emotions.

38
Literature Cited
Bartlett, M., & Littlewort, G. (2003). Real Time Face Detection and Facial Expression
Recognition. Computer Vision and Pattern Recognition, 5, 53-58.
Bartlett, M., Hager, J., Ekmen, P., & Sejnowski, T. (1999). Measuring Facial Expressin by
Computer. Image Analysis, Psychology, 36, 253-264.
Bartlett, M., Littlewort, G., Lainscek, C., Fsasel, .., & Movellen, J. (2004). Machine Learnin
Methods for fully Automatic Recogniton of Facail Expression and Facial Actions. (IEEE,
Ed.) 592-597.
Bedouin-Lafon, & Mackey, W. (2003). The Human Computer Inteaction Handbook. In Bedouin-
Lafon, & W. Mackey.
Cohen, I., Sebe, N., & Chen, L. (2003). Facial Expression Recognition from Video Sequences.
Computer Vision and Pattern Recognition, 160-187.
Donato, G., Bartlett, M., Ekmen, P., & Sejnowksi. (1999). Classifying Facial Actions. Pattern
Analysis and machine Intelligence, 121-128.
Ekmen, P., & Friesen, W. (1976). Picture of Facial Affect.
Ekmen, P., & Friesen, W. (1978). The Facial Actin Coding System: A Technique for the
Measurement of Facial Movement.
Freund, Y., & Schapire, R. E. (1997). Decision-Theoretic Generalization of On-Line Learning
and an Application to Boosting. Journal of Computer and System Sciences , 119-139.
Izard , C., Doughberty, L., & Hembree, E. (1983). A system for Identifying Affect Expression by
Holistic Judgement.
J., O. K. (2001). An Updated Parametrized Report.
Jalin, A., & Farukhina , F. (1991). Unsupervised Texture Segmentation using Gabor Filters.
Patttern Recognition, 24(12), 1167-1186.

39
Kapoor, A., Qi, Y., & Picard, R. (2003). Fully Automatic Upper Facial Action Recognition.
Workshop on Analysis and Modeling of Faces and Gestures.
Lanitis, A., Taylor, C., & Cootes, T. (1997, July). Automatic Interpretation and Coding of Face
Images Using Flexible Models. Pattern Analysis and Machine Intelligence, 19(7), 743-
756.
Lee, C., & Wang, S. (1999). Fingerprint Feature Extraction Using Gabor Filters. Electronics
Letters,, 35(4), 288-290.
Leon, J., & Pantic, M. R. (2000). Automatic analysis of facial expressions: State of the Art.
Pattern Analysis and Machine Intelligence, 1424-1445.
Mase, K. (1991). Recognition of Facial Expression from Optical Flow. E74(10), 3474-3483.
Mehrabian, A. (1968). Communication without words. 2(4), 53-56.
Movellan. (1996). Tutorial on Gabor Filters.
Pantic, M., & Rothkrantz, L. (n.d.). Automatic Facial Expression Recognition: The State of the
Art. 22(12).
Shen, L., Bai, L., & Fairhurst, M. (2006). Gabor Wavelets and General Discriminant Analysis
for Face Identification and Verification. Image Vision Computing, 25(5), 553-563.
Suwa, M., Sugie, N., & Fujimora, K. (1978, July). Preliminary Notes on Pattern Recognition of
Human Expression Recognition. Pattern Recognition, 408-410.
Takeo, k., & Henry, S. (2000). A Statistical Method for 3rd Onbject Detection applied to face
and cars. 746-754.
Takeo, K., Henry, A., & Rowlwy. (1998). Neural Network Based Face Detection. Pattern
Analysis and Machine Intelligence, 20(1), 23-28.
Viola, P., & Jones, M. J. (2004). Robust Real Time Face Detection. International Journal of
Computer Vision, 57(2), 137-154.

40
Viola, P., & Micheal, J. (2004). Robust real-time face detection. International Journal of
Compueter Vision, 57(2), 137-154.
Whitehill, J., & Omlin, C. (2006). Haar Featrures for FACS AU Recognition . Automatci face
and Gesture Recognition.
Yacoob, Y., & Davis, L. (1996, June). Recognizing Human Facial Expression from Long Image
Sequence using Optical Flow. Pattern Analysis and Machine Intelligence, 18, 636-642.
Zhan, Y., Niu, D., & Cao, P. (2004). Facial Expression Recognition Based on Gabor Wavelet
Transformation and Elastic Templates Matching. IC on Image and Graphics, 254-257.
Zhang, Z. (1999). Feature Based Facial Expression Recognition: Senstive Analysis and
Experiment with Multilayer Perceptron. Pattern Recognition and Artifical Intelligence,
13(6), 893-911.

41
APPENDIX 1 – USER GUIDE
In this section, brief explanations of the Matlab source codes that are provided within the CD of
the project are given.
AdaBoost.m: Given the dataset together with the labels, apply the boosting and return the
trainingerror.
AdaBoostClassify.m:
Adafacuakexp.m:
Addtodatabase.m:
Derive_gauss.m:
Expressionclassification.m:
Expressain.m
Findfeature.m
Gaborfilters1.m:
Gaborfilters.m:
Mykurtosis.m
Mystd.m:
SingleWeakLearner.m:
SingleWeakLearnerROC.m:
StrongClassify.m:
Trovagauss.m:
WeakClassify.m
WeakClassifyBatch.m
WeakClassifyROC.m:
WeakLearner.m
Zigzagdct.m:

42
APPENDIX B – Abbreviation
AAM: Active Appearance Model
AU: Action Unit
CK: Cohn-Kanade
DCT: Discrete Cosine Transform
EFM: Enhanced Fisher Linear Discriminant Analysis
EICA: Enhanced Independent Component Analysis
EDM: Euclidian Distance Measure
FLDA: Fisher Linear Discriminant Analysis
FACS: Facial Action Coding System
FER: Facial Expression Recognition
FA: Facial Action
FER: Facial Expression Recognition
FR: Face Recognition
FRS: Face Recognition System
FFT: Fast Fourier Transform
HMM: Hidden Markov Model
ICA: Independent Component Analysis
JAFFE: Japanese Female Facial Expression
KNN: K nearestNeighbor
LED: Light Emitting Diode
LBP: Local Binary Pattern
NN: Nearest Neighbor
PCA: Principal Component Analysis
ROC: Receiver Operator Characteristic
SVM: Support Vector Machine

43
3 Modules ofSystem, 19
Action Units, 11
Active Appearance Models, 37
Active Shape Models, 37
Anger, 5
Appearance Based Feature’, 10
Charles Darwin Theory, 3
Classification Algorithms, 15
Classifying Results, 36
communication, 1
Cross validation Method, 33
Design Methodologies, 18
Disgust, 5
Equation Gabor Filter, 27
Evolutionary Reasons, 5
Face detection, 24
Fear, 5
Feature Extraction, 22
Feature extraction Algorithms, 14
Feature Extraction Techniques, 12
FERS
Facial Expression Recognition System, 8
Functional Requirement, 16
Gabor Coefficients, 27
Gabor Feature, 27
Geometric Based Feature, 10
Guide Matlab, 20
Happiness, 6
Human Polygraph, 3
Image Detection, 21
Methodology, 23
Multi-layer Neural Network(MNN), 30
Nine Emotions, 2
Non-Functional Requirement, 16
nonverbal communication, 2
Parameterizations, 10
Project Achievements, 36
Requirements Elicitation, 16
Sadness, 6
seven universal emotions, 2
social communication, 1
Surprise, 5
System Walkthrough, 31
Three General Principles, 4
Tracking Spatial Points, 12
Validation Process, 34
Viola-Jones, 24
WeakLearn Algorithm, 28

Facial Expression Recognitino

More Related Content

What's hot

Similar to Facial Expression Recognitino

More from International Islamic University

Recently uploaded

Facial Expression Recognitino