RECOGNISING EMOTIONS FROM AN ENSEMBLE OF FEATURES
Human-Robot Interaction is attracting increasing attention nowadays. To help robots socialize with humans, understanding the facial gestures and visual cues of an individual is essential.
It allows a robot to understand the expressions of humans, in turn enhancing its effectiveness in performing various tasks.
It serves as a measurement system for behavioural science.
It enables socially intelligent software tools.
Challenges in Recognizing Facial Expressions
Pose and Frequent head movements
Presence of structural components
Subtle facial deformation
Ambiguity and uncertainty in face motion measurement
Describing Facial Expressions
Determine the amplitude of the expression in terms of intensity levels,
where the levels correspond to some measure of the extent to which the
expression is present on the face.
Splitting the expression into three temporal phases: onset, apex and offset.
The image is converted into grayscale.
Image enhancement is performed using a Gaussian filter for noise removal.
After this normalization, the image will be fairly flat, apart from noise and blurring in the shadowed regions and jagged artifacts along strong edges.
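As a rough illustration, the pre-processing above might look like the following OpenCV sketch (the file name and kernel size are assumptions, not taken from the text):

    import cv2

    # Load the face image (file name is illustrative)
    image = cv2.imread("face.jpg")

    # Convert to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Gaussian filter for noise removal; the 5x5 kernel is an assumed choice
    smoothed = cv2.GaussianBlur(gray, (5, 5), 0)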
1.1 Head Pose Identification
Inputs: images captured from a camera.
1.3 Face Tracking
Face tracking involves the separation of the face as a feature space
from the raw image or a video.
One reliable method of face tracking uses skin-color segmentation in the YCrCb color space.
The YCrCb color space is widely used for digital video. In this format,
luminance information is stored as a single component (Y) and
chrominance information is stored as two color-difference
components (Cb and Cr).
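A minimal sketch of skin-color segmentation in YCrCb; the Cr/Cb bounds below are commonly used illustrative values, not taken from the text:

    import cv2
    import numpy as np

    frame = cv2.imread("frame.jpg")
    # Convert BGR to YCrCb: Y carries luminance, Cr/Cb carry chrominance
    ycrcb = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)

    # Assumed skin-tone bounds on (Y, Cr, Cb)
    lower = np.array([0, 133, 77], dtype=np.uint8)
    upper = np.array([255, 173, 127], dtype=np.uint8)

    # Binary mask of candidate skin (face) pixels
    mask = cv2.inRange(ycrcb, lower, upper)
    face_region = cv2.bitwise_and(frame, frame, mask=mask)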
1.4 Face Part Identification
1.4.1 Eye Identification:
Eyes display strong vertical edges (horizontal transitions) due to the iris
and the white of the eye.
The Sobel mask can be applied to an image and the horizontal
projection of vertical edges can be obtained to determine the Y
coordinate of the eyes.
Sobel edge detection is applied to the upper half of the face image
and the sum of each row is plotted as a horizontal projection.
[Figure: vertical-edge and horizontal-edge maps of the face]
The peak with the lower intensity value in the horizontal projection of
intensity is selected as the Y coordinate of the eyes.
A pair of regions that satisfy certain geometric conditions (G < 60) are
selected as eyes from those regions.
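A sketch of the eye-row search described above, assuming a grayscale face image (the Sobel kernel size is an illustrative choice):

    import cv2
    import numpy as np

    face = cv2.imread("face.jpg", cv2.IMREAD_GRAYSCALE)
    upper_half = face[: face.shape[0] // 2]

    # dx=1 responds to horizontal intensity transitions, i.e. vertical edges
    edges = cv2.Sobel(upper_half, cv2.CV_64F, 1, 0, ksize=3)

    # Horizontal projection: sum of edge magnitudes in each row
    projection = np.abs(edges).sum(axis=1)

    # Row with the strongest vertical-edge response ~ Y coordinate of the eyes
    eye_y = int(np.argmax(projection))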
1.4.2 Eyebrow Identification:
Two rectangular regions in the edge image which lie directly
above each of the eye regions are selected as initial eyebrow candidates.
The edge images of these two areas are obtained for further processing.
These edge images are then dilated and the holes in them are filled.
1.4.3 Mouth Identification
Since the lips contain a larger amount of red than other parts of the skin, a color
filter is applied to enlarge the difference between the lips and the face.
Since the eye regions are known, the image region below the eyes
in the face image is processed, and the regions which satisfy the condition
1.2 ≤ R/G ≤ 1.5 are selected as the mouth.
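A sketch of the stated R/G color filter for the mouth region; the eye row below is an assumed value carried over from the eye-identification step:

    import cv2
    import numpy as np

    face = cv2.imread("face.jpg").astype(np.float32)
    b, g, r = cv2.split(face)

    # Pixels whose red/green ratio falls in the stated band 1.2 <= R/G <= 1.5
    ratio = r / (g + 1e-6)  # small epsilon avoids division by zero
    mouth_mask = ((ratio >= 1.2) & (ratio <= 1.5)).astype(np.uint8) * 255

    eye_y = 120             # assumed Y coordinate of the eyes
    mouth_mask[:eye_y] = 0  # keep only the region below the eyes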
AUs (Action Units) are considered to be the smallest visually
discernible facial movements.
As AUs are independent of any interpretation,
they can be used as the basis for recognition
of basic emotions.
However, both the timing and the duration of
the various AUs are important for the
interpretation of human facial behavior.
The Facial Action Coding System is an unambiguous means of describing all
possible movements of the face in terms of 46 action units.
2.1 Localization of Action Units
The minor axis is a feature of the eye that varies for each emotion.
The major axis of the eye is more or less fixed for a particular person
in varied emotions. The ellipse can be parameterized by its minor
(2b) and major axes (2a).
From the edge-detected eye image, the minor-axis parameter b is
computed from the uppermost and lowermost positions of the white
pixels vertically.
The optimization is performed more than six times for each
emotion to reach a consistent value of b.
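A sketch of the minor-axis measurement, assuming an edge-detected eye image where edge pixels are white (the Canny thresholds are illustrative):

    import cv2
    import numpy as np

    eye = cv2.imread("eye.jpg", cv2.IMREAD_GRAYSCALE)
    edges = cv2.Canny(eye, 100, 200)

    # Uppermost and lowermost rows that contain white (edge) pixels
    rows = np.where(edges.any(axis=1))[0]

    # Half the vertical span gives b, the semi-minor axis of the eye ellipse
    b = (rows.max() - rows.min()) / 2.0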
2.2 Facial Point Tracking
The motion seen on the skin surface at each muscle location was
compared to a predetermined axis of motion along which each
muscle expands and contracts, allowing estimates of the activity of
each muscle to be made.
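As a minimal sketch, this activity estimate can be read as the projection of the observed skin-motion vector onto the muscle's predetermined axis (all vectors below are illustrative values):

    import numpy as np

    motion = np.array([1.8, -0.4])  # observed motion at the muscle location (assumed)
    axis = np.array([0.6, -0.8])    # unit axis along which the muscle expands/contracts

    # Signed magnitude of the motion along the muscle's axis ~ muscle activity
    activity = float(np.dot(motion, axis))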
Feature-based approaches (Active Shape Models)
Facial Action Coding System (FACS):
9 – Upper face
18 – Lower face
5 – Miscellaneous
11 – Head position
9 – Eye position
14 – Miscellaneous
The Emotion Quadrants
We can translate facial muscle movements into FAPs (Facial Animation Parameters).
The selected FPs (feature points) can be automatically detected from real images
or video sequences.
In the next step, the range of variation of each FAP is estimated.
2.3 Evolving Feature Points
Once facial motion has been determined, it is necessary to place the
motion signatures into the correct class of facial expression.
We translate facial muscle movements into Facial Action Points along
the emotion quadrants.
A classification method is then used to distinguish between the emotions.
All these approaches have focused on classifying the six universal emotions.
Classifiers are concerned with finding the optimal hyperplane that
separates the classes in the feature space; the optimal hyperplane
is the one with the maximum margin between the classes.
Some commonly used classifiers are:
Support Vector Machines
Support Vector Machine (SVM) is a successful and effective
statistical machine learning approach to classification. SVM is a linear
classifier that separates the classes in feature space by using a
maximum-margin hyperplane.
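A minimal scikit-learn sketch of such a maximum-margin classifier; the feature vectors and labels below are random placeholders standing in for the extracted facial features:

    import numpy as np
    from sklearn.svm import SVC

    X = np.random.rand(60, 41)            # 60 samples of 41 facial features (assumed)
    y = np.random.randint(0, 7, size=60)  # labels for 7 emotion classes

    clf = SVC(kernel="linear")            # finds the maximum-margin hyperplane
    clf.fit(X, y)
    prediction = clf.predict(X[:1])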
AdaBoost
AdaBoost is similar in use to the SVM algorithm. AdaBoost maintains a
probability distribution, the weights W, over the gathered samples.
Multi-Layer Perceptron
MLP is a network model composed of an input layer, an output layer
and several hidden layers. Each unit in the hidden and output layers
performs two computations: it first calculates its weighted input,
and then passes that value through the activation
function to obtain the unit's output.
ANN as a Classifier
The extracted feature points are
processed to obtain the inputs for
the neural network.
The neural network is trained so
that the emotions neutral,
happiness, sadness, anger,
disgust, surprise and fear are
recognized.
There may be roughly 41 input nodes.
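A sketch of such a network using scikit-learn, assuming 41 inputs and the 7 emotion classes listed above (the hidden-layer size and training data are illustrative):

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    X = np.random.rand(100, 41)            # 41 feature-point inputs per sample (assumed)
    y = np.random.randint(0, 7, size=100)  # 7 emotion labels: neutral ... fear

    # Each hidden/output unit computes a weighted input, then an activation
    net = MLPClassifier(hidden_layer_sizes=(30,), activation="logistic", max_iter=500)
    net.fit(X, y)
    emotion = net.predict(X[:1])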
Literature Survey

"An Approach To Automatic Recognition Of Spontaneous Facial Actions" – B. Braathen et al.
This paper describes a method of warping images into 3D canonical views, followed by machine learning techniques for emotion identification.

"Facial Expression Database And Facial Expression Recognition" – Yu-Li Xue et al.
Using human-machine interaction, various attributes of people are stored in a database and later used for further recognition of expressions.

"Estimation Of The Temporal Dynamics Of Facial Expression" – Jane Reilly et al.
Using the temporal dynamics of the image, locally linear embedding is performed for emotion identification.

"A Unified Probabilistic Framework For Spontaneous Facial Action Modeling" – Yang Tong et al.
Recognition is done with a probabilistic facial action model based on a Dynamic Bayesian Network (DBN) that simultaneously represents rigid and nonrigid facial motions, their spatiotemporal dependencies, and their image measurements.

"Real-Time Facial Expression Recognition With Illumination-Corrected Image Sequences" – He Li et al.
The image of a face is represented by a low-dimensional vector that results from projecting the illumination-corrected image onto a low-dimensional expression manifold, which favours robust identification of features.

"Facial Expression Recognition Based On Weighted Principal Component Analysis And Support Vector Machines" – Zhiguo Niu et al.
The approach is based on the distribution of action units in the different facial areas to determine the weights used to extract the facial expression features.

"Fully Automatic Recognition Of Temporal Phases Of Facial Actions"
The algorithm uses a facial point detector to automatically localize 20 facial points. These points are tracked through a sequence of images using particle filtering with factorized likelihoods. Support vector machines are then applied to temporal activation models based on the tracking data.