CHAPTER 1 
INTRODUCTION 
1.1 A Brief Description
Virtual learning is increasing day by day, and Human Computer Interaction is a necessity for making virtual learning a better experience. The emotions of a person play a major role in the learning process. Hence the proposed work detects the emotions of a person from his or her facial expressions.
For a facial expression to be detected, the face location and area must be known; therefore, in most cases emotion detection algorithms start with face detection, taking into account the fact that facial emotions are mostly depicted using the mouth. Consequently, algorithms for eye and mouth detection and tracking are necessary in order to provide the features for subsequent emotion recognition. In this project we propose a detection system for natural emotion recognition.
1.2 Need For Face Detection
Human activity is a major concern in a wide variety of applications such as video surveillance, human computer interfaces, face recognition and face database management. Most face recognition algorithms assume that the face location is known. Similarly, face-tracking algorithms often assume that the initial face location is known. An efficient face detection algorithm is therefore needed to improve the efficiency of face recognition systems.
1.3 Need For Emotion Detection
Human beings communicate through facial emotions in day-to-day interactions with others. For a human, perceiving the emotions of a fellow human is natural and inherently accurate, and humans express their inner state of mind through emotions. Many times, an emotion indicates that a person needs help. Having a computer recognise emotions is an important research topic in Human Computer Interfacing (HCI). Such an interface can be a welcome method for the physically disabled, for those who are unable to express their requirements by voice or other means, and especially for those who are confined to bed. Human emotion can be detected through facial actions or through biosensors. Facial actions are imaged through still or video cameras. From still images taken at discrete times, the changes in the eye and mouth areas can be exposed; measuring and analysing such changes leads to the determination of human emotions.
1.4 Existing Face Detection Approaches 
1.4.1 Feature Invariant Methods 
These methods aim to find structural features that exist even when 
the pose, viewpoint, or lighting conditions vary, and then use these to 
locate faces. These methods are designed mainly for face localization. 
Texture 
Human faces have a distinct texture that can be used to separate them from other objects. The textures are computed using second-order statistical features on sub-images of 16×16 pixels. Three types of features are considered: skin, hair, and others. To infer the presence of a face from the texture labels, the votes of occurrence of hair and skin textures are used. Colour information can also be incorporated into the face-texture model; using it, a scanning scheme for face detection in colour scenes enhances the orange-like parts, including the face areas. One advantage of this approach is that it can detect faces which are not upright or which have features such as beards and glasses.
Skin Colour 
Human skin colour has been used and proven to be an effective feature in many applications, from face detection to hand tracking. Although different people have different skin colours, several studies have shown that the major difference lies largely in intensity rather than in chrominance. Several colour spaces have been utilized to label pixels as skin, including RGB, normalized RGB, HSV, YCbCr, YIQ, YES, CIE XYZ and CIE LUV.
1.4.2 Template Matching Methods 
In template matching, a standard face pattern is manually predefined or parameterized by a function. Given an input image, the correlation values with the standard patterns are computed independently for the face contour, eyes, nose, and mouth. The existence of a face is determined based on these correlation values. This approach has the advantage of being simple to implement. However, it has proven inadequate for face detection, since it cannot effectively deal with variation in scale, pose, and shape. Multiresolution, multiscale, sub-template and deformable-template methods have subsequently been proposed to achieve scale and shape invariance.
Predefined Face Template 
In this approach several sub-templates for the nose, eyes, mouth and face contour are used to model a face. Each sub-template is defined in terms of line segments. Lines in the input image are extracted based on the greatest gradient change and then matched against the sub-templates. The correlations between sub-images and the contour templates are computed first to detect candidate locations of faces. Then matching with the other sub-templates is performed at the candidate positions. In other words, the first phase determines the focus of attention or region of interest, and the second phase examines the details to determine the existence of a face.
1.4.3 Appearance Based Methods 
In appearance-based methods the templates are learned from examples in images. In general, appearance-based methods rely on techniques from statistical analysis and machine learning to find the relevant characteristics of face and non-face images. The learned characteristics take the form of distribution models that are subsequently used for face detection.
1.5 Existing Emotion Detection Approaches 
1.5.1 Genetic Algorithm 
The eye feature plays a vital role in classifying facial emotion using a Genetic Algorithm. The acquired images must go through a few pre-processing steps such as grayscale conversion, histogram equalization and filtering. A Genetic Algorithm methodology then estimates the emotion from the eye feature alone. Observation of various emotions leads to a unique characteristic of the eye: it exhibits ellipses of different parameters in each emotion. The Genetic Algorithm is adopted to optimize the ellipse characteristics of the eye features. Its processing time varies for each emotion.
1.5.2 Neural Network 
Neural networks have found profound success in the area of pattern recognition. By repeatedly showing a neural network inputs that have been classified into groups, the network can be trained to discern the criteria used to classify them, and it can do so in a generalized manner, allowing successful classification of new inputs not used during training. With the explosion of research on emotions in recent years, the application of pattern recognition technology to emotion detection has become increasingly interesting. Emotion has become an important interface for communication between human and machine, and it plays a basic role in rational decision-making, learning, perception, and various cognitive tasks.
Human emotion can be detected from physiological measurements or from facial expression. Since humans engage the same facial muscles when expressing a particular emotion, the emotion can be quantified. Primary emotions such as anger, disgust, fear, happiness, sadness and surprise can be classified using a Neural Network.
1.5.3 Feature Point Extraction 
Template Matching 
An interesting approach to the problem of automatic facial feature extraction is a technique based on template prototypes, which are represented in 2-D space in grayscale format. This technique is, to some extent, easy to use, but also effective. It uses correlation as the basic tool for comparing the template with the part of the image that we wish to recognize. An interesting question that arises is the behaviour of template matching recognition at different resolutions; this involves multi-resolution representations through the use of Gaussian pyramids. Experiments proved that very high resolutions are not needed for template matching recognition: templates of 36×36 pixels, for example, proved sufficient. This shows that template matching is not as computationally complex as originally imagined.
The face detection algorithm starts by scanning the given image with the SSR filter and locating face candidates. It then assembles candidates that are close to each other using connected components (treating fewer candidates means less processing time, which matters in a real-time application). We take the centre of each cluster and extract a template based on this centre; the template is passed to a Support Vector Machine, which tells us whether it is a face or not. If it is, we locate the eyes, and then the nose.
Face detection techniques fall into two categories:
1. Feature-based approach
2. Image-based approach.
Template matching provides the basis for the human face detection system.
1. Feature-Based Technique:
Techniques in the first category make use of apparent properties of the face such as face geometry, skin colour, and motion. Although a feature-based technique can achieve high speed in face detection, it suffers from poor reliability under varying lighting conditions.
2. Image-Based Technique:
The image-based approach takes advantage of recent advances in pattern recognition theory. Most image-based approaches apply a window scanning technique for detecting faces, which requires heavy computation.
To achieve a high-speed and reliable face detection system, we propose a method which combines the feature-based and image-based approaches using the SSR filter.
1.5.4 Template Matching
Template matching is a technique in digital image processing for finding small parts of an image which match a template image, or a way to detect edges in images. The basic method of template matching uses a convolution mask (the template), tailored to the specific feature of the search image that we want to detect. The technique can easily be performed on grey images or edge images. The convolution output will be highest at places where the image structure matches the mask structure, i.e., where large image values get multiplied by large mask values.
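As an illustration, the following is a minimal Matlab sketch of template matching using normalized cross-correlation; the file names 'face.png' and 'eye_template.png' are hypothetical.

    % Minimal template matching sketch with normalized cross-correlation.
    img      = rgb2gray(imread('face.png'));
    template = rgb2gray(imread('eye_template.png'));

    c = normxcorr2(template, img);           % correlation surface
    [peak, idx]    = max(c(:));              % strongest match
    [yPeak, xPeak] = ind2sub(size(c), idx);

    % Convert the correlation peak to the template's top-left corner.
    yTop = yPeak - size(template, 1) + 1;
    xTop = xPeak - size(template, 2) + 1;
    fprintf('Best match at (%d, %d), score %.2f\n', xTop, yTop, peak);

In practice, a threshold on the peak value decides whether the feature is present at all.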
Eyes and Nose Detection Using the SSR Filter

A real-time face detection algorithm uses the Six-Segmented Rectangular (SSR) filter for eye and nose detection. The SSR filter is a rectangle divided into six segments, as illustrated in Figure 1.1.

Figure 1.1 SSR Filter
At the beginning, a rectangle is scanned throughout the input image. This rectangle is segmented into the six segments shown in Figure 1.1. The SSR filter is used to detect the Between-the-Eyes (BTE) point based on two characteristics of face geometry. Detection of the BTE is based on the image characteristics of this area of the face: the intensity of the BTE region closely resembles a hyperbolic surface, as shown in Figure 1.2, and the BTE is the saddle point on that surface. A rotationally invariant filter can thus be devised for detecting the BTE area.
Figure 1.2 Determination of BTE 
The nose search area is usually taken to be 2/3 of the value of L, as shown in Figure 1.3, where L is the approximate distance between the two eyes and also from eye level to the nose.

Figure 1.3 Nose Tip Search Area Relative to Eyes

The BTE area of a human face resembles a hyperbolic surface; the proposed work uses this hyperbolic model to describe the BTE region, and the centre of the BTE is the saddle point on the surface.
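The following Matlab sketch illustrates one common formulation of the SSR filter using an integral image: the upper-centre segment (the bright nose bridge) is required to be brighter than the upper-left and upper-right segments (the dark eyes), and the eyes darker than the cheeks below. The window size, scan step and brightness tests are illustrative assumptions, not values taken from the cited system.

    % SSR filter sketch using an integral image (one common formulation).
    gray = double(rgb2gray(imread('face.png')));     % hypothetical input
    ii = padarray(cumsum(cumsum(gray, 1), 2), [1 1], 0, 'pre');

    % Sum of the rectangle with top-left (r, c), height hh and width ww.
    rectSum = @(r, c, hh, ww) ii(r+hh, c+ww) - ii(r, c+ww) ...
                            - ii(r+hh, c) + ii(r, c);

    w = 60; h = 30;                  % SSR window: 3 columns x 2 rows
    sw = w/3; sh = h/2;
    candidates = [];
    for r = 1:4:size(gray, 1) - h
        for c = 1:4:size(gray, 2) - w
            S1 = rectSum(r,    c,      sh, sw);   % upper-left  (eye)
            S2 = rectSum(r,    c+sw,   sh, sw);   % upper-centre (bridge)
            S3 = rectSum(r,    c+2*sw, sh, sw);   % upper-right (eye)
            S4 = rectSum(r+sh, c,      sh, sw);   % lower-left  (cheek)
            S6 = rectSum(r+sh, c+2*sw, sh, sw);   % lower-right (cheek)
            if S2 > S1 && S2 > S3 && S4 > S1 && S6 > S3
                candidates(end+1, :) = [c + w/2, r + sh];  % BTE candidate
            end
        end
    end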
Blobs 
Blobs provide a complementary description of image structures in 
terms of regions, as opposed to corners that are more point-like. 
Nevertheless, blob descriptors often contain a preferred point (a local 
maximum of an operator response or a centre of gravity) which means 
that many blob detectors may also be regarded as interest point 
operators. Blob detectors can detect areas in an image which are too 
smooth to be detected by a corner detector. 
Gabor Filtering 
Gabor filtering can be used in a facial recognition system. The neighbourhood of a pixel may be described by the responses of a group of Gabor filters at different frequencies and orientations, all referenced to that pixel. In that way, a feature vector containing the responses of those filters may be formed.
Automated Facial Feature Extraction 
In this approach, as far as frontal images are concerned, the fundamental concept behind the automated localization of the predetermined points consists of two steps: the hierarchical and reliable selection of specific blocks of the image, followed by a standardized procedure for the detection of the required benchmark points. For the former process to be successful, a dependable method of approach is needed. The detection of a block describing a facial feature relies on a previously and reliably detected feature. Following this reasoning, the most significant characteristic (the base of the cascade routine) has to be chosen. The importance of each of the commonly used facial features for face recognition has already been studied by other researchers; surveys proved the eyes to be the most dependable and most easily located of all facial features, and as such they were used. The techniques that were developed and tried separately utilize a combination of template matching and Gabor filtering.
The Hybrid Method 
The search for the desired feature blocks is performed by a simple template matching procedure. Each feature prototype is selected from one of the frontal images of the face base. The comparison criterion used is the maximum correlation coefficient between the prototype and the repeatedly examined blocks of a carefully restricted area of the face.
In order for the search area to be tightly and functionally limited, knowledge of human face physiology has been applied, without hindering the satisfactory performance of the algorithm in cases of small violations of the initial limitations. However, final block selection by this method alone has not always been successful, so a measure of reliability was needed. For that reason, Gabor filtering was deemed a suitable tool: as can be deduced mathematically from the filter's form, it ensures simultaneously optimal localization in the spatial domain as well as in the frequency domain.
The filter is applied both to the localized area and to the template at four different spatial frequencies. Its response is regarded as valid only if its amplitude exceeds a saliency threshold. The area with the minimum phase distance from its template is considered the most reliably traced block.
1.5.5 Preprocessing and Postprocessing of Images 
Image Processing Toolbox provides reference-standard algorithms for pre-processing and post-processing tasks that solve frequent system problems such as interfering noise, low dynamic range, out-of-focus optics, and differences in colour representation between input and output devices. Using region-of-interest tools, items in the original image can be selected to create a mask. The image enhancement techniques in Image Processing Toolbox enable the user to increase the signal-to-noise ratio and accentuate image features by modifying the colours or intensities of an image.
With the toolbox we can (a sketch of these operations follows the list):
· Perform histogram equalization
· Perform decorrelation stretching
· Remap the dynamic range
· Adjust the gamma value
· Perform linear, median or adaptive filtering.
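A minimal Matlab sketch of these operations, using standard Image Processing Toolbox functions, is given below; 'frame.png' is a hypothetical input image and the parameter values are illustrative.

    % Enhancement operations from the list above.
    rgb  = imread('frame.png');                        % hypothetical input
    gray = rgb2gray(rgb);

    eq    = histeq(gray);                              % histogram equalization
    dstr  = decorrstretch(rgb);                        % decorrelation stretching
    remap = imadjust(gray, stretchlim(gray), []);      % remap the dynamic range
    gam   = imadjust(gray, [], [], 0.7);               % gamma adjustment
    medf  = medfilt2(gray, [3 3]);                     % median filtering
    wien  = wiener2(gray, [5 5]);                      % adaptive (Wiener) filtering

    figure, imshow(eq)                                 % inspect each result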
1.5.6 Typical Tasks of Computer Vision 
Each of the application areas of computer vision systems employs a range of computer vision tasks, more or less well-defined measurement or processing problems, which can be solved using a variety of methods. Some examples of typical computer vision tasks are presented below.
Recognition 
The classical problem in computer vision, image processing and machine vision is determining whether or not the image data contain some specific object, feature, or activity. This task can normally be solved robustly and effortlessly by a human, but it is still not satisfactorily solved in computer vision for the general case: arbitrary objects in arbitrary situations. Existing methods can at best solve it only for specific objects, such as simple geometric objects (e.g., polyhedra), human faces, printed or hand-written characters, or vehicles, and in specific situations, typically described in terms of well-defined illumination, background, and pose of the object relative to the camera.
Different varieties of the recognition problem are described in the literature:

Recognition: one or several pre-specified or learned objects or object classes are recognized, usually together with their 2D positions in the image or 3D poses in the scene.

Identification: an individual instance of an object is recognized, for example a specific person's face or fingerprint, or a specific vehicle.

Detection: relatively simple and fast computations are sometimes used for finding smaller regions of interesting image data.
CHAPTER 2 
LITERATURE SURVEY 
Jarkiewicz et al. [1] propose an emotion detection system where analysis is done using a Haar-like detector and face detection using a hybrid approach. The technique proposed is to localize seventeen characteristic points on the face; based on their displacements, certain emotions can be recognized automatically. An improvement over this method is the feature extraction technique.
A face detection algorithm for colour images is proposed by Zhao et al. [2]. This work is based on an adaptive threshold and a chroma chart giving the probability of skin colours. By identifying the skin region, the facial part of the image can be located. When used with the feature extraction technique, this method yields better results.
Maglogiannis et al. [3] present an integrated system for emotion detection. The system uses colour images and is composed of three modules. The first module implements skin detection, using Markov random fields for image segmentation and face detection. A second module is responsible for eye and mouth detection and extraction; this module uses the HSV colour space of the specified eye and mouth regions. The third module detects the emotions pictured in the eyes and mouth using edge detection and by measuring the gradient of the eye and mouth regions.
A detailed experimental study of face detection algorithms based on skin colour has been made by Singh et al. [4]. Three colour spaces, RGB, YCbCr and HSI, are of main concern. The algorithms in these three colour spaces have been compared and then combined to obtain a new skin-colour-based face detection algorithm with higher accuracy.
A survey by Yang et al [5] categorizes and evaluates the various 
face detection algorithms. Other relevant issues such as benchmarking, 
data collection and evaluation techniques have also been discussed. The 
algorithms have been analysed and their limitations have been identified. 
The Eigenface method [6], which uses principal components analysis for dimensionality reduction, yields projection directions that maximize the total scatter across all classes, i.e., across all images of all faces. In choosing the projection which maximizes total scatter, principal components analysis retains unwanted variations due to lighting and facial expression. The Eigenface method is also based on linearly projecting the image space onto a low-dimensional feature space.
The Bunch Graph technique [7] has been fairly reliable at determining facial attributes from single images, such as gender or the presence of glasses or a beard. If this technique were developed to extract independent and stable personal attributes, such as age, race or gender, recognition from large databases could be improved and sped up considerably by preselecting corresponding sectors of the database.

Image deblurring algorithms include blind, Lucy-Richardson, Wiener and regularized-filter deconvolution, as well as conversions between point spread and optical transfer functions.
The Fisherfaces method [8], a derivative of Fisher's Linear Discriminant (FLD), maximizes the ratio of between-class scatter to within-class scatter, and appears to be the best at extrapolating and interpolating over variation in lighting, although the Linear Subspace method is a close second. By contrast, the Eigenface method, which uses principal components analysis, yields projection directions that maximize the total scatter and therefore retains the unwanted variations noted above.
Cheng-Chin Chiang et al. [9] present a real-time face detection algorithm for locating faces in images and videos. This algorithm finds not only the face regions but also the precise locations of facial components such as the eyes and lips. The algorithm starts from the extraction of skin pixels based on rules derived from a simple quadratic polynomial model. Interestingly, with a minor modification, this polynomial model is also applicable to the extraction of lips. The benefits of applying these two similar polynomial models are twofold: first, much computation time is saved; second, both extraction processes can be performed simultaneously in one scan of the image or video frame. The eye components are then extracted after the extraction of skin pixels and lips. Afterwards, the algorithm removes falsely extracted components by verifying them against rules derived from the spatial and geometrical relationships of facial components. Finally, the precise face regions are determined accordingly. According to the experimental results, the proposed algorithm exhibits satisfactory performance in terms of both accuracy and speed for detecting faces with wide variations in size, scale, orientation, colour, and expression.
Hironori Yamauchi [9] proposed biometric security using face recognition for industrial use; current systems for face recognition often use either SVM or AdaBoost techniques for the face detection part and PCA for the face recognition part.
In 'Robust real-time face tracking for the analysis of human behaviour', Damien Douxchamp and Nick Campbell [10] presented a real-time system for face detection, tracking and characterization from omnidirectional video. Viola-Jones is used as the basis for face detection, and various filters are then applied to eliminate false positives. Gaps between two detections of a face by the Viola-Jones algorithm are filled using colour-based tracking.
Shinjiro Kawato and Nobuji Tetsutani [11] proposed scale-adaptive face detection and tracking for detecting and tracking faces in video sequences in real time, applicable to a wide range of face scales. Fast extraction of face candidates is done with Six-Segmented Rectangular (SSR) filters, and face verification by a support vector machine.
In 'Real-Time Face Detection Using Six-Segmented Rectangular Filter (SSR Filter)', Oraya Sawettanusorn et al. [12] proposed a real-time face detection algorithm using the Six-Segmented Rectangular (SSR) filter, distance information, and a template matching technique. The Between-the-Eyes point is selected as the face representative because its characteristics are common to most people and are easily seen over a wide range of face orientations. The image is scanned with the six-segment rectangle throughout the face image.
A research work by Li Zhang et al. [13] concentrates on intelligent neural-network-based facial emotion recognition and Latent Semantic Analysis based topic detection for a humanoid robot. The work first incorporates the Facial Action Coding System, which describes physical cues and anatomical knowledge of facial behaviour, for the detection of the neutral state and six basic emotions from real-time posed facial expressions. Feedforward neural networks (NN) are used to implement upper and lower facial Action Unit (AU) analyzers to recognize six upper and eleven lower facial actions, including Inner and Outer Brow Raiser, Lid Tightener, Lip Corner Puller, Upper Lip Raiser, Nose Wrinkler, Mouth Stretch, etc. An artificial-neural-network-based facial emotion recognizer subsequently accepts the derived 17 Action Units as inputs to decode the neutral state and six basic emotions from facial expressions. Moreover, in order for the robot to make appropriate responses based on the detected affective facial behaviours, Latent Semantic Analysis is used to focus on the underlying semantic structures of the data and go beyond linguistic restrictions to identify topics embedded in the users' conversations. The overall development is integrated with a modern humanoid robot platform under its Linux C++ SDKs. The work presented shows great potential for developing personalized intelligent agents/robots with emotion and social intelligence.
CHAPTER 3 
PROBLEM DEFINITION 
The aim of this project is to detect the human facial emotions happiness, sadness and surprise. This is done by first detecting the face in an image, using the skin colour detection technique. This is followed by image segmentation and feature extraction, in which the eye and mouth parts are extracted. The emotions are then detected from the eye and mouth variances. From the position of the eyes, emotions are inferred: if the person is happy or sad, the eyes will be open, and when a person is surprised, the eyes will be wide open. Similarly, for the lips the shape and colour properties are important. Depending on the shape of the lips, emotions are detected: if the lips are closed and curved upwards, this indicates happiness; if the lips are open, this indicates surprise, and so on. Therefore, based on facial features such as the eyes and mouth, emotions are detected and recognized.
CHAPTER 4 
FACIAL EMOTION DETECTION AND RECOGNITION 
4.1 Overview of the Algorithm 
Our project proposes an emotion detection system in which the facial emotions happy, sad and surprised are detected. First the face is detected in an image using a skin colour model. This is followed by the extraction of features such as the eyes and mouth, which are used in further processing to detect the emotion. For detecting the emotion we take into account the fact that emotions are chiefly represented by mouth expressions; this is done using the shape and colour properties of the lips.
4.1.1 Video Fragmentation 
The input video of an e-learning student is acquired using an image acquisition device and stored in a database. This video is extracted and fragmented into several frames in order to detect the emotions of the e-learning student and thereby improve the virtual learning environment. The video acquisition feature records and registers the ongoing emotional changes of the e-learning student, and the resulting emotions are detected by mapping the changes in the eye and lip regions. The videos are recorded into a database before processing, which makes it possible to analyse the changes of emotion for a particular subject or during a particular time of the day.
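A minimal Matlab sketch of this fragmentation step is shown below; the file name 'student.avi' and the one-frame-per-second sampling interval are assumptions for illustration.

    % Fragment a recorded video into frames for emotion analysis.
    vr   = VideoReader('student.avi');       % hypothetical recording
    step = round(vr.FrameRate);              % one fragment per second

    for k = 1:step:vr.NumberOfFrames
        frame = read(vr, k);                 % RGB frame k
        imwrite(frame, sprintf('frame_%04d.png', k));   % store fragment
    end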
Frame rate and motion blur are important aspects of video quality.
Motion blur is a natural effect when filming the world in discrete time intervals. When a film is recorded at 25 frames per second, each frame has an exposure time of up to 40 milliseconds (1/25 second). All the changes in the scene over those 40 milliseconds blend into the final frame. Without motion blur, animation appears to jump and does not look fluid. When the frame rate of a movie is too low, the mind is no longer convinced that the contents of the movie are continuous, and the movie appears to jump (also called strobing).
The human eye and its brain interface, the human visual system, can process 10 to 12 separate images per second, perceiving them individually, but the threshold of perception is more complex, with different stimuli having different thresholds: the average shortest noticeable dark period, such as the flicker of a cathode ray tube monitor or fluorescent lamp, is 16 milliseconds, while a single-millisecond visual stimulus may have a perceived duration between 100 ms and 400 ms due to persistence of vision in the visual cortex. This may cause images perceived within this duration to appear as one stimulus; for example, a 10 ms green flash of light immediately followed by a 10 ms red flash is perceived as a single yellow flash.
4.1.2 Face Detection 
The first step in face detection is to build a skin colour model. After the skin colour model is produced, the test image is skin-segmented (yielding a binary image) and the face is detected. The result of face detection is processed by a decision function based on the chroma components (Cr and Cb from YCbCr, and Hue from HSV). Before the result is passed to the next module, it is cropped according to the skin mask; small background areas which could lead to errors during the next stages are deleted. A model image of face detection with the bounding box is illustrated below in Figure 4.1.
Figure 4.1 Face Detection 
4.1.3 Feature Extraction 
After the face has been detected, the next step is feature extraction, in which the eyes and mouth are extracted from the detected face. For eye extraction, two eye maps are created, a chrominance eye map and a luminance eye map. The two maps are then combined to locate the eyes in a face image, as shown in Figure 4.2.
Figure 4.2 Feature Detection 
To locate the mouth region, we use the fact that it contains stronger red components and weaker blue components than other facial regions (Cr > Cb), and thus the mouth map is constructed. Based on this, the mouth region is extracted. Finally, the eyes and mouth extracted from the face image according to the maps are passed on to the next module of our algorithm.
4.1.4 Emotion Detection 
The last module is emotion detection. This module makes use of the fact that emotions are expressed mostly through the eye and mouth expressions, as shown in Figure 4.3. Emotion detection from lip images is based on the colour and shape properties of human lips. Having a binary lip image, shape detection can be performed; thus, depending on the shape of the lips and other morphological properties, the emotions are detected. A computer is being taught to interpret human emotions based on lip patterns, according to research published in the International Journal of Artificial Intelligence and Soft Computing. Such a system could improve the way we interact with computers, and perhaps allow disabled people to use computer-based communication devices, such as voice synthesizers, more effectively and efficiently.
Figure 4.3 Emotion Detection 
4.2 Architectural Design 
The architectural diagram shows the overall working of the system: a captured colour image sample is taken as the input, processed using image processing tools, and analysed to locate facial features such as the eyes and mouth, which are further processed to recognize the emotion of the person. After the localization of the facial features, the next step is to localize the characteristic points on the face. This is followed by the feature extraction process, in which features such as the eyes and mouth are extracted.
Based on the variations of the eyes and mouth, the emotion of a person is detected and recognized. For a person who is happy, the eyes will be open and the lips will be closed and curved upwards, whereas for a person who is sad, the eyes will be open and the lips will be closed and curved downwards. Similarly, for a person who is surprised, the eyes will be wide open, there will be a considerable displacement of the eyebrows from the eyes, and the mouth will be wide open. Based on the above measures, the mood exhibited by a person is detected and recognized.
Figure 4.4 shows the overall working of the system, where the input is the image and the output is the recognized emotion: happy, sad or surprised.
Figure 4.4 – Architectural Diagram 
CHAPTER 5 
REQUIREMENT ANALYSIS 
The Software Requirements Specification is based on the problem definition. Ideally, the requirements specification states the "what" of the software product without implying the "how" of the software design; it specifies which features the product must provide, not how the design will provide them.
5.1 Product Requirements 
5.1.1 Input Requirements 
The input for this work is the video of an e-learning student, which 
may contain the human face. 
5.1.2 Output Requirements 
The output is the detected facial emotion such as happy, sad, and 
surprised. 
5.2 Resource Requirements 
The hardware configuration required is shown in Table 5.1, and the software configuration required to run this software is shown in Table 5.2.
5.2.1 Hardware Requirements 
Table 5.1 – Hardware Requirements 
S.No   Feature       Configuration
1      CPU           Intel Core 2 Duo processor
2      Main memory   1 GB RAM
3      Hard disk     60 GB disk size
The configuration in Table 5.1 is the minimum hardware requirement for the proposed system.
5.2.2 Software Requirements 
Table 5.2 – Software Requirements 
S.No   Software   Version
1      Windows    7
2      Matlab     R2012a
3      Picasa     3

The proposed system is executed using Windows 7, Matlab R2012a and Picasa 3, as shown in Table 5.2.
CHAPTER 6 
DEVELOPMENT PROCESS AND DOCUMENTATION 
6.1 Face Detection 
Face detection is used in biometrics, often as part of, or together with, a facial recognition system. It is also used in video surveillance, human computer interfaces and image database management. Some recent digital cameras use face detection for autofocus. Face detection is also useful for selecting regions of interest in photo slideshows that use a pan-and-scale Ken Burns effect.
Face detection can be regarded as a specific case of object-class 
detection. In object-class detection, the task is to find the locations and 
sizes of all objects in an image that belong to a given class. Examples 
include upper torsos, pedestrians, and cars. 
Face detection can be regarded as a more general case of face 
localization. In face localization, the task is to find the locations and 
sizes of a known number of faces. In face detection, one does not have 
this additional information. 
6.1.1 Sample Collection 
Sample skin-coloured pixels are collected from images of people belonging to different races. Each pixel is carefully chosen from the images so that regions which do not belong to the skin colour are not included.
6.1.2 Chroma Chart Preparation 
The chroma chart shown in Figure 6.1 is the distribution of the skin colour of different people over the chromatic colour space.
Figure 6.1 – Chroma Chart Diagram 
Here the chromatic colour is taken in the (Cb, Cr) colour space. Normally images are stored in the (R, G, B) format, so a suitable conversion is needed to bring them into the YCbCr colour space. The collected sample pixel values are converted from the (R, G, B) colour space to the YCbCr colour space, and a chart is drawn by taking Cb along the x-axis and Cr along the y-axis. The resulting chart shows the distribution of the skin colour of different people. The intensity (Y) component is not considered because it has very little effect on the chrominance variation.
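A minimal Matlab sketch of the chart construction is given below; 'skin_samples.png' is a hypothetical image assumed to contain only skin-coloured pixels.

    % Plot the (Cb, Cr) distribution of sample skin pixels.
    rgb = imread('skin_samples.png');        % hypothetical skin samples
    ycc = rgb2ycbcr(rgb);                    % convert from RGB to YCbCr

    Cb = double(ycc(:, :, 2));
    Cr = double(ycc(:, :, 3));

    scatter(Cb(:), Cr(:), 1, '.');           % the cluster is the model
    xlabel('Cb'); ylabel('Cr'); title('Skin chroma chart');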
6.1.3 Skin Colour Model 
The skin-likelihood image is obtained using the developed skin colour model, which is the distribution of skin colour over the chromatic colour space. Each pixel in the given input image is compared with the skin colour model. If the particular chrominance pair is present in the model, the pixel is made white; this is achieved by assigning the red, green and blue components of the pixel the value 255. If the chrominance pair is not present, the pixel is made black by assigning the red, green and blue components the value 0.

The result of face detection is first processed by a decision function based on the chroma components (Cr and Cb from YCbCr, and Hue from HSV). If the following conditions are true for a pixel, it is marked as skin area: 140 < Cr < 165 and 140 < Cb < 195. The obtained image is a binary image in which the white regions show the possible skin-coloured regions and the black regions show the non-skin regions. Before the result is passed to the next module, it is cropped according to the skin mask; small background areas which could lead to errors during the next stages are deleted.
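The following Matlab sketch implements this segmentation with the thresholds quoted above; 'input.png', the minimum blob size and the choice of the largest region as the face candidate are illustrative assumptions.

    % Skin segmentation and cropping to the face candidate.
    rgb = imread('input.png');               % hypothetical test image
    ycc = rgb2ycbcr(rgb);
    Cb  = double(ycc(:, :, 2));
    Cr  = double(ycc(:, :, 3));

    skin = (Cr > 140 & Cr < 165) & (Cb > 140 & Cb < 195);   % binary mask

    skin  = bwareaopen(skin, 500);           % delete small background areas
    stats = regionprops(skin, 'Area', 'BoundingBox');
    [~, idx] = max([stats.Area]);            % keep the largest skin region
    face = imcrop(rgb, stats(idx).BoundingBox);   % crop to the skin mask
    imshow(face)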
6.2 Feature Extraction 
Feature extraction is the process of detecting the required features in the face and extracting them by cropping or a similar technique.
6.2.1 Eye Detection 
Two separate eye maps are built, one from the chrominance component and one from the luminance component. These two maps are then combined into a single eye map. The eye map from the chrominance is based on the fact that high Cb and low Cr values are found around the eyes, and is constructed with the following formula:

EyeMapChr = (1/3) * ( Cb^2 + (255 - Cr)^2 + Cb/Cr )

Eyes usually contain both dark and bright pixels in the luminance component, so grayscale operators can be designed to emphasize brighter and darker pixels around the eye regions. Such operators are dilation and erosion; we use grayscale dilation and erosion with a spherical structuring element to construct the eye map from the luminance.

The eye map from the chrominance is then combined with the eye map from the luminance by an AND (multiplication) operation:

EyeMap = EyeMapChr AND EyeMapLum

The resulting eye map is then dilated and normalized to brighten the eyes and suppress other facial areas. With an appropriate choice of threshold, we can then track the location of the eye region.
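A minimal Matlab sketch of the eye map construction follows. The luminance map form dilate(Y)./(erode(Y)+1), the flat disk structuring element standing in for the spherical one, and the final threshold are assumptions for illustration.

    % Eye map from chrominance and luminance, combined by multiplication.
    ycc = rgb2ycbcr(imread('face.png'));     % hypothetical face crop
    Y  = double(ycc(:, :, 1));
    Cb = double(ycc(:, :, 2));
    Cr = double(ycc(:, :, 3));

    eyeMapChr = (Cb.^2 + (255 - Cr).^2 + Cb ./ (Cr + eps)) / 3;
    eyeMapChr = mat2gray(eyeMapChr);

    se = strel('disk', 3);                   % flat disk for simplicity
    eyeMapLum = imdilate(Y, se) ./ (imerode(Y, se) + 1);   % assumed form
    eyeMapLum = mat2gray(eyeMapLum);

    eyeMap = mat2gray(eyeMapChr .* eyeMapLum);   % AND as multiplication
    eyeMap = imdilate(eyeMap, se);           % brighten the eye regions

    eyes = eyeMap > 0.8;                     % threshold is an assumption
    imshow(eyes)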
6.2.2 Mouth Detection 
To locate the mouth region, we use the fact that it contains stronger red components and weaker blue components than other facial regions (Cr > Cb), so the mouth map is constructed as follows:

η = 0.95 * ( (1/k) Σ Cr(x,y)^2 ) / ( (1/k) Σ Cr(x,y)/Cb(x,y) )

MouthMap = Cr^2 * ( Cr^2 - η * Cr/Cb )

where k is the number of pixels in the face region.
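A minimal Matlab sketch of this mouth map is given below; the normalization of Cr^2 and Cr/Cb to [0, 1] and the final threshold are assumptions.

    % Mouth map over the face crop, as given in the formula above.
    ycc = rgb2ycbcr(imread('face.png'));     % hypothetical face crop
    Cb  = double(ycc(:, :, 2));
    Cr  = double(ycc(:, :, 3));

    Cr2  = mat2gray(Cr.^2);                  % normalized Cr^2
    CrCb = mat2gray(Cr ./ (Cb + eps));       % normalized Cr/Cb

    k   = numel(Cr);                         % number of pixels in the face
    eta = 0.95 * (sum(Cr2(:)) / k) / (sum(CrCb(:)) / k);

    mouthMap = Cr2 .* (Cr2 - eta * CrCb);
    mouth    = mat2gray(mouthMap) > 0.7;     % threshold is an assumption
    imshow(mouth)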
The mouth detection results for happy and surprised examples are shown in Figure 6.2.

Figure 6.2 – Mouth Detection Diagram
6.3 Emotion Detection 
Emotion detection from lip images is based on the colour and shape properties of human lips. For this task we assume we already have a rectangular colour image containing the lips and surrounding skin (with as little skin as possible). Given this, we can extract a binary image of the lips, which gives us the necessary information about the shape.
To extract a binary image of the lips, a double-threshold approach was used. First, a binary image (the mask) containing objects similar to lips is extracted; the mask is extracted in such a way that it contains a set of pixels equal to or greater than the exact set of lip pixels. Then another image (the marker) is generated by extracting the pixels which contain lips with the highest probability. Finally, the mask image is reconstructed using the marker image to make the results more accurate.
Having a binary lip image, shape detection can be performed. Some lip features of a face expressing certain emotions are obvious: the side corners of happy lips are higher relative to the lip centre than they are for serious or sad lips. One way to express this mathematically is to find the leftmost and rightmost pixels (the lip corners), draw a line between them, and calculate the position of the lip centre with respect to that line: the lower the centre lies below the line, the happier the lips are. Another morphological lip property that can be extracted is mouth openness; open lips imply certain emotions, usually happiness and surprise.
For example (surprised and happy):

1. Based on the original binary image, the first step is to remove small areas, which is done with the 'sizethre(x,y,'z')' function.

2. In the second step, a morphological closing (imclose(bw,se)) with a 'disk' structuring element is performed.

3. In the third step, some properties of the image regions are measured (blob analysis). More precisely:

A 'BoundingBox' is calculated, the smallest rectangle containing the region (in our case the green box). In digital image processing, the bounding box is merely the coordinates of the rectangular border that fully encloses a digital image when it is placed over a page, a canvas, a screen or another similar two-dimensional background.

'Extrema' are calculated, an 8-by-2 matrix that specifies the extrema points in the region. Each row of the matrix contains the x- and y-coordinates of one of the points, in the format [top-left top-right right-top right-bottom bottom-right bottom-left left-bottom left-top] (in our case the cyan dots).

A 'Centroid' is calculated, a 1-by-ndims(L) vector that specifies the centre of mass of the region (in our case the blue star).

The decision is then based on the following (a sketch of this pipeline is given after the list):

1. p_poly_dist: calculates the distance (shown as a red line) between the centroid and the line through the left-top and right-top extrema.

2. lipratio: the ratio between the width and the height of the bounding box.

3. lip_sign: a positive/negative number, calculated to detect whether the left-top/right-top line runs over or under the centroid.

4. The decision is made whether the mood is 'happy', 'sad' or 'surprised'.
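A minimal Matlab sketch of this decision pipeline follows; bwareaopen stands in for the custom 'sizethre' function, 'lips_binary.png' is a hypothetical binary lip image, and the corner rows and thresholds in the final decision are illustrative assumptions.

    % Lip shape analysis and mood decision.
    bw = imread('lips_binary.png') > 0;      % hypothetical binary lip image

    bw = bwareaopen(bw, 50);                 % 1. remove small areas
    bw = imclose(bw, strel('disk', 5));      % 2. morphological closing

    % 3. Blob analysis on the largest region (assumed to be the lips).
    stats = regionprops(bw, 'Area', 'BoundingBox', 'Extrema', 'Centroid');
    [~, idx] = max([stats.Area]);
    s = stats(idx);

    corners = s.Extrema([8 3], :);           % left-top and right-top points
    v = corners(2, :) - corners(1, :);       % corner-to-corner line
    w = s.Centroid - corners(1, :);
    lip_sign = v(1)*w(2) - v(2)*w(1);        % > 0: centre below the line
                                             % (image y grows downwards)
    lipratio = s.BoundingBox(3) / s.BoundingBox(4);   % width / height

    % 4. Decide the mood.
    if lipratio < 1.5                        % tall bounding box: mouth open
        mood = 'surprised';
    elseif lip_sign > 0                      % lip centre sags below corners
        mood = 'happy';
    else
        mood = 'sad';
    end
    disp(mood)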
After reviewing some illumination correction (colour constancy) algorithms, we decided to use the Max-RGB (also known as White Patch) algorithm. This algorithm assumes that in every image there is a white patch, which is then used as a reference for the prevailing illumination. A more accurate Colour by Correlation algorithm was also considered, but it requires building a precise colour-illumination correlation table under controlled conditions, which would be beyond the scope of this task.
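A minimal Matlab sketch of Max-RGB correction is shown below: each channel is rescaled so that its brightest value, assumed to come from a white patch, maps to white.

    % Max-RGB (White Patch) illumination correction.
    rgb = im2double(imread('input.png'));    % hypothetical input

    for ch = 1:3
        m = max(max(rgb(:, :, ch)));         % brightest value per channel
        rgb(:, :, ch) = rgb(:, :, ch) / m;   % assume it reflects white
    end
    imshow(rgb)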
As face detection is always the first step in these recognition or transmission systems, its performance puts a strict limit on the achievable performance of the whole system. Ideally, a good face detector should accurately extract all faces in images regardless of their positions, scales, orientations, colours, shapes, poses, expressions and lighting conditions. However, for the current state of the art in image processing technology, this goal is a big challenge; for this reason, many face detectors deal only with upright, frontal faces in well-constrained environments.
This lip emotion detection algorithm has one restriction: the face cannot be rotated more than 90 degrees, since the corner detection would then obviously fail.
CHAPTER 7 
EXPERIMENTAL RESULTS 
7.1 General 
The results obtained after successful implementation of the project are given in this chapter, on a step-by-step basis.
7.2 Chroma Chart 
The chroma chart displayed in Figure 7.1 shows the distribution of the skin colour of different people over the chromatic colour space. Here the chromatic colour is taken in the (Cb, Cr) colour space. The intensity (Y) component is not considered because it has very little effect on the chrominance variation.
Figure 7.1 – Chroma Chart 
7.3 Result Analysis 
This section gives the overall efficiency of the proposed system at each step. The system was analysed for its detection rate and the time taken at each stage for a specified number of input images. Three stages were considered: skin detection, face detection (eyes and mouth), and emotion detection and recognition. At each of these stages the detection rate and the time taken were calculated. The results are tabulated in Table 7.1.
Table 7.1 – Result Analysis

STAGE                               DETECTION RATE (%)   NUMBER OF IMAGES   TIME (s)
SKIN DETECTION                      94.44                17                 1.4
FACE DETECTION (EYES AND MOUTH)     83.33                15                 1
EMOTION DETECTION AND RECOGNITION   88.88                16                 0.5

According to the table, 17 image samples were taken to determine the skin detection rate, and it was found that out of the 17 images, skin was detected in 16, giving a detection rate of 94.44% with an average time of 1.4 seconds per image. The face detection rate was calculated over 15 images, of which faces were detected successfully in 12, giving a detection rate of 83.33% with an average time of 1 second per image. Similarly, the emotion detection and recognition rate was calculated over 16 images, of which exact emotions were detected and recognized in 14, giving a detection rate of 88.88% with an average time of 0.5 seconds per image.
The video fragmentation rate of a video depends on the duration and length of the original video; the frames-per-second (fps) rate depends on the time span of the video. Frame rate (also known as frame frequency) is the frequency (rate) at which an imaging device produces unique consecutive images called frames. The term applies equally well to film and video cameras, computer graphics, and motion capture systems. Frame rate is most often expressed in frames per second (fps) and, for progressive-scan monitors, in hertz (Hz). If a video with a greater time span is given, the interval between the fragments remains constant. For every fragment produced, the emotion of the person is detected, thereby giving an indication of the intervals at which changes of emotion occur and narrowing down the corresponding reason for their occurrence.
CHAPTER 8 
CONCLUSION AND FUTURE WORK 
Conclusion 
The proposed system utilizes feature extraction techniques and determines the emotion of a person based on the facial features, namely the eyes and lips. The emotion exhibited by a person is determined with good accuracy, and the system is user friendly.
Face-Detection and Segmentation 
In this project we have proposed an emotion detection and recognition system for colour images. Although our application is constructed only for full-frontal pictures with one person per picture, face detection is necessary to decrease the area of interest for further processing in order to achieve the best results. Detecting the skin of a face in an image is genuinely hard due to the variance of illumination: the success of correct detection depends greatly on the light sources and the illumination properties of the environment in which the picture is taken.
Emotion Detection 
The major difficulty of the approach used is determining the right hue threshold range for lip extraction. Lip colours vary mostly according to the subject's race, the presence of make-up, and the illumination under which the photo was taken. The latter is the least problematic, since illumination correction algorithms exist.
Future Enhancements 
Future work includes enhancing the system so that it can detect the emotions of a person even in complex backgrounds under different illumination conditions, and eliminating the lip colour constraint in colour images. Another criterion that can be worked on is to support more emotions than happy, sad and surprised.
APPENDIX 1 
SCREENSHOTS 
SCREEN 1 : The detected face for the given video input. 
SCREEN 2: The interface which is used to select the input image. 
SCREEN 3: The image which is given as reference.
SCREEN 4: The image to be tested. 
SCREEN 5: The smoothened reference image. 
SCREEN 6: The test image after smoothening. 
SCREEN 7: The image after the detection of edges. 
SCREEN 8: The result screen, which displays the end result of the system: the emotion portrayed by the person in the image.
REFERENCES 
[1] J. Jarkiewicz, R. Kocielnik and K. Marasek, "Anthropometric Facial Emotion Recognition", Novel Interaction Methods and Techniques, Lecture Notes in Computer Science, Volume 5611, 2009.

[2] L. Zhao, X. LinSun, J. Liu and X. Hexu, "Face Detection Based on Skin Colour", Proceedings of the Third International Conference on Machine Learning and Cybernetics, Shanghai, 2004.

[3] I. Maglogiannis, D. Vouyioukas and C. Aggelopoulos, "Face Detection and Recognition of Natural Human Emotion Using Markov Random Fields", Personal and Ubiquitous Computing, 2009.

[4] M. H. Yang, D. J. Kriegman and N. Ahuja, "Detecting Faces in Images", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, 2002.

[5] Pedro J. Muñoz-Merino, Carlos Delgado Kloos and Mario Muñoz-Organero, "Enhancement of Student Learning Through the Use of a Hinting Computer e-Learning System and Comparison With Human Teachers", IEEE journal, vol. 52, 2011.

[6] Emily Mower, Maja J. Mataric and Shrikanth Narayanan, "A Framework for Automatic Human Emotion Classification Using Emotion Profiles", IEEE journal, vol. 23, 2011.

[7] Xiaogang Wang and Xiaoou Tang, "Face Photo-Sketch Synthesis and Recognition", IEEE transactions, 2009.

[8] Yan Tong, Jixu Chen and Qiang Ji, "A Unified Probabilistic Framework for Spontaneous Facial Action Modelling", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 2, 2010.

[9] L. S. Chen and T. S. Huang, "Emotional Expressions in Audiovisual Human Computer Interaction", IEEE International Conference, Volume 1, 2000.

[10] L. C. De Silva and P. C. Ng, "Bimodal Emotion Recognition", Fourth IEEE International Conference, 2000.

Intellectual Person Identification Using 3DMM, GPSO and Genetic AlgorithmIntellectual Person Identification Using 3DMM, GPSO and Genetic Algorithm
Intellectual Person Identification Using 3DMM, GPSO and Genetic Algorithm
IJCSIS Research Publications
 
Face detection using template matching
Face detection using template matchingFace detection using template matching
Face detection using template matchingBrijesh Borad
 
Real time facial expression analysis using pca
Real time facial expression analysis using pcaReal time facial expression analysis using pca
Real time facial expression analysis using pca
International Journal of Science and Research (IJSR)
 
Efficient Facial Expression and Face Recognition using Ranking Method
Efficient Facial Expression and Face Recognition using Ranking MethodEfficient Facial Expression and Face Recognition using Ranking Method
Efficient Facial Expression and Face Recognition using Ranking Method
IJERA Editor
 
A Study on Face Recognition Technique based on Eigenface
A Study on Face Recognition Technique based on EigenfaceA Study on Face Recognition Technique based on Eigenface
A Study on Face Recognition Technique based on Eigenface
sadique_ghitm
 
AN IMPROVED TECHNIQUE FOR HUMAN FACE RECOGNITION USING IMAGE PROCESSING
AN IMPROVED TECHNIQUE FOR HUMAN FACE RECOGNITION USING IMAGE PROCESSINGAN IMPROVED TECHNIQUE FOR HUMAN FACE RECOGNITION USING IMAGE PROCESSING
AN IMPROVED TECHNIQUE FOR HUMAN FACE RECOGNITION USING IMAGE PROCESSING
ijiert bestjournal
 
IRJET - Emotionalizer : Face Emotion Detection System
IRJET - Emotionalizer : Face Emotion Detection SystemIRJET - Emotionalizer : Face Emotion Detection System
IRJET - Emotionalizer : Face Emotion Detection System
IRJET Journal
 
IRJET- Emotionalizer : Face Emotion Detection System
IRJET- Emotionalizer : Face Emotion Detection SystemIRJET- Emotionalizer : Face Emotion Detection System
IRJET- Emotionalizer : Face Emotion Detection System
IRJET Journal
 
Facial emotion recognition using enhanced multi-verse optimizer method
Facial emotion recognition using enhanced multi-verse optimizer methodFacial emotion recognition using enhanced multi-verse optimizer method
Facial emotion recognition using enhanced multi-verse optimizer method
IJECEIAES
 
IRJET- A Review on Various Approaches of Face Recognition
IRJET- A Review on Various Approaches of Face RecognitionIRJET- A Review on Various Approaches of Face Recognition
IRJET- A Review on Various Approaches of Face Recognition
IRJET Journal
 
Fl33971979
Fl33971979Fl33971979
Fl33971979
IJERA Editor
 
Fl33971979
Fl33971979Fl33971979
Fl33971979
IJERA Editor
 
Ijarcet vol-2-issue-4-1352-1356
Ijarcet vol-2-issue-4-1352-1356Ijarcet vol-2-issue-4-1352-1356
Ijarcet vol-2-issue-4-1352-1356Editor IJARCET
 
Facial expression using 3 d animation
Facial expression using 3 d animationFacial expression using 3 d animation
Facial expression using 3 d animationiaemedu
 
Facial expression using 3 d animation
Facial expression using 3 d animationFacial expression using 3 d animation
Facial expression using 3 d animationiaemedu
 
Facial expression using 3 d animation
Facial expression using 3 d animationFacial expression using 3 d animation
Facial expression using 3 d animationIAEME Publication
 
Facial expression using 3 d animation
Facial expression using 3 d animationFacial expression using 3 d animation
Facial expression using 3 d animationiaemedu
 

Similar to Final Year Project - Enhancing Virtual Learning through Emotional Agents (Document) (20)

EMOTION RECOGNITION FROM FACIAL EXPRESSION BASED ON BEZIER CURVE
EMOTION RECOGNITION FROM FACIAL EXPRESSION BASED ON BEZIER CURVEEMOTION RECOGNITION FROM FACIAL EXPRESSION BASED ON BEZIER CURVE
EMOTION RECOGNITION FROM FACIAL EXPRESSION BASED ON BEZIER CURVE
 
Paper id 24201475
Paper id 24201475Paper id 24201475
Paper id 24201475
 
Face expression recognition using Scaled-conjugate gradient Back-Propagation ...
Face expression recognition using Scaled-conjugate gradient Back-Propagation ...Face expression recognition using Scaled-conjugate gradient Back-Propagation ...
Face expression recognition using Scaled-conjugate gradient Back-Propagation ...
 
Intellectual Person Identification Using 3DMM, GPSO and Genetic Algorithm
Intellectual Person Identification Using 3DMM, GPSO and Genetic AlgorithmIntellectual Person Identification Using 3DMM, GPSO and Genetic Algorithm
Intellectual Person Identification Using 3DMM, GPSO and Genetic Algorithm
 
Face detection using template matching
Face detection using template matchingFace detection using template matching
Face detection using template matching
 
Real time facial expression analysis using pca
Real time facial expression analysis using pcaReal time facial expression analysis using pca
Real time facial expression analysis using pca
 
Efficient Facial Expression and Face Recognition using Ranking Method
Efficient Facial Expression and Face Recognition using Ranking MethodEfficient Facial Expression and Face Recognition using Ranking Method
Efficient Facial Expression and Face Recognition using Ranking Method
 
A Study on Face Recognition Technique based on Eigenface
A Study on Face Recognition Technique based on EigenfaceA Study on Face Recognition Technique based on Eigenface
A Study on Face Recognition Technique based on Eigenface
 
AN IMPROVED TECHNIQUE FOR HUMAN FACE RECOGNITION USING IMAGE PROCESSING
AN IMPROVED TECHNIQUE FOR HUMAN FACE RECOGNITION USING IMAGE PROCESSINGAN IMPROVED TECHNIQUE FOR HUMAN FACE RECOGNITION USING IMAGE PROCESSING
AN IMPROVED TECHNIQUE FOR HUMAN FACE RECOGNITION USING IMAGE PROCESSING
 
IRJET - Emotionalizer : Face Emotion Detection System
IRJET - Emotionalizer : Face Emotion Detection SystemIRJET - Emotionalizer : Face Emotion Detection System
IRJET - Emotionalizer : Face Emotion Detection System
 
IRJET- Emotionalizer : Face Emotion Detection System
IRJET- Emotionalizer : Face Emotion Detection SystemIRJET- Emotionalizer : Face Emotion Detection System
IRJET- Emotionalizer : Face Emotion Detection System
 
Facial emotion recognition using enhanced multi-verse optimizer method
Facial emotion recognition using enhanced multi-verse optimizer methodFacial emotion recognition using enhanced multi-verse optimizer method
Facial emotion recognition using enhanced multi-verse optimizer method
 
IRJET- A Review on Various Approaches of Face Recognition
IRJET- A Review on Various Approaches of Face RecognitionIRJET- A Review on Various Approaches of Face Recognition
IRJET- A Review on Various Approaches of Face Recognition
 
Fl33971979
Fl33971979Fl33971979
Fl33971979
 
Fl33971979
Fl33971979Fl33971979
Fl33971979
 
Ijarcet vol-2-issue-4-1352-1356
Ijarcet vol-2-issue-4-1352-1356Ijarcet vol-2-issue-4-1352-1356
Ijarcet vol-2-issue-4-1352-1356
 
Facial expression using 3 d animation
Facial expression using 3 d animationFacial expression using 3 d animation
Facial expression using 3 d animation
 
Facial expression using 3 d animation
Facial expression using 3 d animationFacial expression using 3 d animation
Facial expression using 3 d animation
 
Facial expression using 3 d animation
Facial expression using 3 d animationFacial expression using 3 d animation
Facial expression using 3 d animation
 
Facial expression using 3 d animation
Facial expression using 3 d animationFacial expression using 3 d animation
Facial expression using 3 d animation
 

Recently uploaded

Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation & Control
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Dr.Costas Sachpazis
 
Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
TeeVichai
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
AafreenAbuthahir2
 
Event Management System Vb Net Project Report.pdf
Event Management System Vb Net  Project Report.pdfEvent Management System Vb Net  Project Report.pdf
Event Management System Vb Net Project Report.pdf
Kamal Acharya
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
JoytuBarua2
 
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSETECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
DuvanRamosGarzon1
 
Courier management system project report.pdf
Courier management system project report.pdfCourier management system project report.pdf
Courier management system project report.pdf
Kamal Acharya
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdfCOLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
Kamal Acharya
 
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
H.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdfH.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdf
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
Halogenation process of chemical process industries
Halogenation process of chemical process industriesHalogenation process of chemical process industries
Halogenation process of chemical process industries
MuhammadTufail242431
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
karthi keyan
 
Vaccine management system project report documentation..pdf
Vaccine management system project report documentation..pdfVaccine management system project report documentation..pdf
Vaccine management system project report documentation..pdf
Kamal Acharya
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
seandesed
 
addressing modes in computer architecture
addressing modes  in computer architectureaddressing modes  in computer architecture
addressing modes in computer architecture
ShahidSultan24
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
VENKATESHvenky89705
 
Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.
PrashantGoswami42
 
Automobile Management System Project Report.pdf
Automobile Management System Project Report.pdfAutomobile Management System Project Report.pdf
Automobile Management System Project Report.pdf
Kamal Acharya
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
Kamal Acharya
 

Recently uploaded (20)

Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
 
Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
 
Event Management System Vb Net Project Report.pdf
Event Management System Vb Net  Project Report.pdfEvent Management System Vb Net  Project Report.pdf
Event Management System Vb Net Project Report.pdf
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
 
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSETECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
 
Courier management system project report.pdf
Courier management system project report.pdfCourier management system project report.pdf
Courier management system project report.pdf
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
 
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdfCOLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
 
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
H.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdfH.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdf
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
 
Halogenation process of chemical process industries
Halogenation process of chemical process industriesHalogenation process of chemical process industries
Halogenation process of chemical process industries
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
 
Vaccine management system project report documentation..pdf
Vaccine management system project report documentation..pdfVaccine management system project report documentation..pdf
Vaccine management system project report documentation..pdf
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
 
addressing modes in computer architecture
addressing modes  in computer architectureaddressing modes  in computer architecture
addressing modes in computer architecture
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
 
Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.
 
Automobile Management System Project Report.pdf
Automobile Management System Project Report.pdfAutomobile Management System Project Report.pdf
Automobile Management System Project Report.pdf
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
 

Final Year Project - Enhancing Virtual Learning through Emotional Agents (Document)

Several colour spaces have been utilized to label pixels as skin, including RGB, normalized RGB, HSV, YCbCr, YIQ, YES, CIE XYZ and CIE LUV.

1.4.2 Template Matching Methods

In template matching, a standard face pattern is manually predefined or parameterized by a function. Given an input image, the correlation values with the standard patterns are computed independently for the face contour, eyes, nose and mouth, and the existence of a face is determined from these correlation values. This approach has the advantage of being simple to implement. However, it has proven inadequate for face detection on its own, since it cannot effectively deal with variation in scale, pose and shape. Multiresolution, multiscale, sub-template and deformable-template methods have subsequently been proposed to achieve scale and shape invariance.

Predefined Face Template

In this approach several sub-templates for the nose, eyes, mouth and face contour are used to model a face. Each sub-template is defined in terms of line segments. Lines in the input image are extracted based on greatest gradient change and then matched against the sub-templates. The correlations between sub-images and contour templates are computed first to detect candidate face locations; matching with the other sub-templates is then performed at the candidate positions. In other words, the first phase determines the focus of attention or region of interest, and the second phase examines the details to determine the existence of a face.

1.4.3 Appearance Based Methods

In appearance based methods the templates are learned from example images. In general, appearance based methods rely on techniques from statistical analysis and machine learning to find the relevant characteristics of face and non-face images. The learned characteristics take the form of distribution models that are subsequently used for face detection.
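As an illustration of the correlation-based matching described above, the following MATLAB sketch locates the best match of a feature template in a face image. The filenames and the 0.6 acceptance threshold are placeholder assumptions, not values taken from this project.

   face = rgb2gray(imread('face.png'));      % placeholder input image
   tmpl = rgb2gray(imread('eye_template.png'));
   c = normxcorr2(tmpl, face);               % normalized cross-correlation
   [peak, idx] = max(c(:));                  % strongest correlation value
   [ypeak, xpeak] = ind2sub(size(c), idx);
   % Top-left corner of the matched region in the face image:
   yoff = ypeak - size(tmpl, 1) + 1;
   xoff = xpeak - size(tmpl, 2) + 1;
   isMatch = peak > 0.6;                     % assumed acceptance threshold

A peak near 1 indicates a near-exact match; lowering the threshold trades reliability for tolerance to scale and pose changes, which is exactly the weakness of template matching noted above.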
1.5 Existing Emotion Detection Approaches

1.5.1 Genetic Algorithm

The eye features play a vital role in classifying facial emotion using a Genetic Algorithm. The acquired images must first go through a few pre-processing steps such as grayscale conversion, histogram equalization and filtering. A Genetic Algorithm methodology then estimates the emotion from the eye features alone. Observation of various emotions leads to a unique characteristic of the eye: the eye exhibits an ellipse with different parameters in each emotion. The Genetic Algorithm is adopted to optimize the ellipse parameters that characterize the eye features. The processing time of the Genetic Algorithm varies for each emotion.

1.5.2 Neural Network

Neural networks have found profound success in the area of pattern recognition. By repeatedly showing a neural network inputs that have been classified into groups, the network can be trained to discern the criteria used to classify, and it can do so in a generalized manner, allowing successful classification of new inputs not used during training. With the explosion of research on emotions in recent years, the application of pattern recognition technology to emotion detection has become increasingly interesting. Since emotion has become an important interface for communication between human and machine, it plays a basic role in rational decision-making, learning, perception and various cognitive tasks.

Human emotion can be detected from physiological measurements or from facial expression. Since humans engage the same facial muscles when expressing a particular emotion, the emotion can be quantified. Primary emotions such as anger, disgust, fear, happiness, sadness and surprise can be classified using a neural network.
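A minimal sketch of how such a classifier could be trained with MATLAB's Neural Network Toolbox follows. The hidden-layer size and the variable names are illustrative assumptions: features is taken to be an N-by-M matrix of M training samples (N features per face), and targets a 6-by-M one-hot matrix over the six primary emotions.

   net = patternnet(10);                  % one hidden layer of 10 neurons (assumed size)
   net = train(net, features, targets);   % supervised training
   scores = net(features(:, 1));          % class scores for one sample
   [~, emotion] = max(scores);            % index of the predicted emotion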
1.5.3 Feature Point Extraction Using Template Matching

An interesting approach to the problem of automatic facial feature extraction is a technique based on template prototypes, portrayed in 2-D space in grayscale format. This technique is, to some extent, easy to use, but also effective. It uses correlation as the basic tool for comparing the template with the part of the image that we wish to recognize. An interesting question that arises is the behaviour of template-matching recognition at different resolutions, which involves multi-resolution representations through the use of Gaussian pyramids. Experiments have shown that very high resolutions are not needed: templates of 36x36 pixels, for example, proved sufficient. This shows that template matching is not as computationally complex as might be imagined.

The face detection algorithm implemented here starts by scanning the given image with the SSR filter and locating the face candidates. It then groups candidates that are close to each other using connected components (fewer candidates mean less processing time, which matters in a real-time application). The centre of each cluster is taken and a template is extracted around that centre; the template is passed to a Support Vector Machine, which decides whether it is a face or not. If it is, the eyes are located, and then the nose.
Face detection techniques fall into two categories, and template matching provides the basis of the human face detection system:

1. Feature Based Technique: Techniques in this category make use of apparent properties of the face such as face geometry, skin colour and motion. Feature based techniques can achieve high speed in face detection, but they suffer from poor reliability under varying lighting conditions.

2. Image Based Technique: The image based approach takes advantage of current advances in pattern recognition theory. Most image based approaches apply a window scanning technique for detecting faces, which requires a large amount of computation.

To achieve a fast and reliable face detection system, the proposed method combines the feature based and image based approaches using the SSR filter.
1.5.4 Template Matching

Template matching is a technique in digital image processing for finding small parts of an image which match a template image, or for detecting edges in images. The basic method uses a convolution mask (template) tailored to the specific feature of the search image that we want to detect. The technique can easily be performed on grey images or edge images. The convolution output will be highest at places where the image structure matches the mask structure, where large image values are multiplied by large mask values.

Eyes and Nose Detection Using the SSR Filter

A real-time face detection algorithm uses a Six-Segmented Rectangular (SSR) filter for eye and nose detection. The SSR filter is a rectangle divided into six segments, as illustrated in Figure 1.1.

Figure 1.1 SSR Filter
To begin, the rectangle is scanned across the input image. The rectangle is segmented into six parts, as shown above. The SSR filter is used to detect the Between-the-Eyes (BTE) point based on two characteristics of face geometry.

BTE - Between The Eyes

The detection of the BTE is based on the image characteristics of this area of the face. The intensity surface of the BTE image closely resembles a hyperbolic surface, as shown in Figure 1.2; the BTE is the saddle point of that surface. A rotationally invariant filter can thus be devised for detecting the BTE area.
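A sketch of evaluating one SSR filter position with an integral image is given below; the integral image makes every rectangular sum cost four array lookups, which is what makes the filter usable in real time. The 3-column by 2-row segment layout follows Figure 1.1, and the two brightness conditions in the final test are our reading of the face-geometry characteristics (eye segments darker than the between-the-eyes and cheek segments); they should be checked against the original SSR papers.

   I  = double(rgb2gray(imread('face.png')));               % placeholder input
   ii = padarray(cumsum(cumsum(I, 1), 2), [1 1], 0, 'pre'); % integral image
   % Sum of the rectangle with top-left (r1,c1) and bottom-right (r2,c2):
   rsum = @(r1, c1, r2, c2) ii(r2+1, c2+1) - ii(r1, c2+1) ...
                          - ii(r2+1, c1) + ii(r1, c1);
   h = 8; w = 8; r = 50; c = 50;          % assumed segment size and position
   S = zeros(2, 3);                       % the six segment sums
   for i = 1:2
       for j = 1:3
           S(i, j) = rsum(r+(i-1)*h, c+(j-1)*w, r+i*h-1, c+j*w-1);
       end
   end
   % Assumed BTE test: upper-left/right segments (eyes) darker than the
   % upper-centre (between the eyes) and than the segments below (cheeks).
   isCandidate = S(1,1) < S(1,2) && S(1,3) < S(1,2) && ...
                 S(1,1) < S(2,1) && S(1,3) < S(2,3);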
Figure 1.2 Determination of the BTE

The nose search area is usually taken to be 2/3 of the value of L, as shown in Figure 1.3, where L is the approximate distance between the two eyes and also approximates the distance from eye level to the nose.

Figure 1.3 Nose Tip Search Area Relative to the Eyes

The BTE area of the human face resembles a hyperbolic surface. The proposed work uses this hyperbolic model to describe the BTE region; the centre of the BTE is then the saddle point of the surface.

Blobs

Blobs provide a complementary description of image structures in terms of regions, as opposed to corners, which are more point-like. Nevertheless, blob descriptors often contain a preferred point (a local maximum of an operator response, or a centre of gravity), which means that many blob detectors may also be regarded as interest point operators. Blob detectors can detect areas in an image which are too smooth to be detected by a corner detector.
Gabor Filtering

Gabor filtering can be used in a facial recognition system. The neighbourhood of a pixel may be described by the responses of a group of Gabor filters at different frequencies and orientations, all referenced to that pixel. In this way a feature vector containing the responses of those filters may be formed.
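A minimal sketch of such a filter bank is shown below, with the kernels built by hand (MATLAB of this era has no built-in Gabor function). The kernel size, sigma, frequencies and orientations are illustrative assumptions, as is the example pixel position.

   I = im2double(rgb2gray(imread('face.png')));    % placeholder input
   [x, y] = meshgrid(-15:15, -15:15);              % 31x31 kernel support
   sigma = 5;
   feat = [];
   for f = [0.1 0.2]                    % spatial frequencies (assumed)
       for theta = 0:pi/4:3*pi/4        % four orientations
           xr = x*cos(theta) + y*sin(theta);
           g  = exp(-(x.^2 + y.^2) / (2*sigma^2)) .* cos(2*pi*f*xr);
           R  = conv2(I, g, 'same');    % response of one filter
           feat(end+1) = R(120, 160);   % response at an example pixel
       end
   end
   % 'feat' is the Gabor feature vector for the chosen pixel.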
Automated Facial Feature Extraction

In this approach, as far as frontal images are concerned, the automated localization of the predetermined points consists of two steps: the hierarchic and reliable selection of specific blocks of the image, and subsequently a standardized procedure for detecting the required benchmark points within them. For the first of these processes to succeed, a reliable strategy is needed: the detection of a block describing a facial feature relies on a previously, successfully detected feature. Following this reasoning, the most significant characteristic, which serves as the starting point of the cascade routine, must be chosen. The importance of each commonly used facial feature for face recognition has already been studied by other researchers; their surveys proved the eyes to be the most dependable and most easily located of all facial features, and as such they were used here. The techniques that were developed and tested separately utilize a combination of template matching and Gabor filtering.

The Hybrid Method

The initial search for the desired feature blocks is performed by a simple template matching procedure. Each feature prototype is selected from one of the frontal images of the face database. The comparison criterion used is the maximum correlation coefficient between the prototype and the successively examined blocks of a carefully restricted area of the face. To limit the search area sharply but safely, knowledge of human face physiology is applied, without hindering the performance of the algorithm in cases of small violations of the initial limitations.

However, final block selection by this method alone is not always successful; hence a measure of reliability is needed, and Gabor filtering was deemed a suitable tool. As can be deduced mathematically from the filter's form, it ensures simultaneously optimal localization in the spatial domain as well as in the frequency domain. The filter is applied both to the localized area and to the template at four different spatial frequencies. Its response is regarded as valid only if its amplitude exceeds a saliency threshold. The area with minimum phase distance from its template is considered the most reliably traced block.
1.5.5 Preprocessing and Postprocessing of Images

The Image Processing Toolbox provides reference-standard algorithms for pre-processing and post-processing tasks that solve frequent system problems such as interfering noise, low dynamic range, out-of-focus optics, and differences in colour representation between input and output devices. Region-of-interest tools can be used to create a mask by selecting items in the original image.

The image enhancement techniques in the Image Processing Toolbox enable the user to increase the signal-to-noise ratio and accentuate image features by modifying the colours or intensities of an image (a few of these operations are sketched in code below). We can:

· Perform histogram equalization
· Perform decorrelation stretching
· Remap the dynamic range
· Adjust the gamma value
· Perform linear, median or adaptive filtering
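The sketch below exercises the listed operations with standard Image Processing Toolbox calls; the filename and the filter window sizes are placeholders.

   I = imread('frame.png');               % placeholder input image
   G = rgb2gray(I);
   e1 = histeq(G);                        % histogram equalization
   e2 = decorrstretch(I);                 % decorrelation stretching (colour)
   e3 = imadjust(G);                      % remap the dynamic range
   e4 = imadjust(G, [], [], 0.8);         % gamma adjustment
   e5 = medfilt2(G, [3 3]);               % median (nonlinear) filtering
   e6 = wiener2(G, [5 5]);                % adaptive (Wiener) filtering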
1.5.6 Typical Tasks of Computer Vision

Each application area of computer vision employs a range of computer vision tasks: more or less well-defined measurement or processing problems that can be solved by a variety of methods. Some typical computer vision tasks are presented below.

Recognition

The classical problem in computer vision, image processing and machine vision is determining whether or not the image data contain some specific object, feature or activity. This task can normally be solved robustly and without effort by a human, but it is still not satisfactorily solved in computer vision for the general case: arbitrary objects in arbitrary situations. Existing methods can at best solve the problem only for specific objects, such as simple geometric objects (e.g., polyhedra), human faces, printed or handwritten characters, or vehicles, and in specific situations, typically described in terms of well-defined illumination, background, and pose of the object relative to the camera.

Several varieties of the recognition problem are described in the literature:

Recognition: one or several pre-specified or learned objects or object classes are recognized, usually together with their 2D positions in the image or 3D poses in the scene.

Identification: an individual instance of an object is recognized, for example a specific person's face or fingerprint, or a specific vehicle.

Detection: based on relatively simple and fast computations, detection is sometimes used for finding smaller regions of interesting image data.
CHAPTER 2

LITERATURE SURVEY

Jarkiewicz et al. [1] propose an emotion detection system in which analysis is done using a Haar-like detector and face detection is done using a hybrid approach. The technique localizes seventeen characteristic points on the face, and based on their displacements certain emotions can be recognized automatically. An improvement over this method is the feature extraction technique.

A face detection algorithm for colour images is proposed by Zhao et al. [2]. This work is based on an adaptive threshold and a chroma chart that shows the probability of skin colours. By identifying the skin region, the facial part can be located in the image. This technique, when used with the feature extraction technique, yields better results.

Maglogiannis et al. [3] present an integrated system for emotion detection. The system uses colour images and is composed of three modules. The first module implements skin detection, using Markov random fields for image segmentation and face detection. The second module is responsible for eye and mouth detection and extraction; it uses the HSV colour space of the specified eye and mouth regions. The third module detects the emotions pictured in the eyes and mouth using edge detection and by measuring the gradient of the eye and mouth regions.
A detailed experimental study of face detection algorithms based on skin colour has been made by Singh et al. [4]. Three colour spaces, RGB, YCbCr and HSI, are of main concern. The algorithms for these three colour spaces have been compared and then combined to obtain a new skin-colour-based face detection algorithm which gives higher accuracy.

A survey by Yang et al. [5] categorizes and evaluates the various face detection algorithms. Other relevant issues, such as benchmarking, data collection and evaluation techniques, are also discussed. The algorithms are analysed and their limitations identified.

The Eigenface method [6], which uses principal components analysis for dimensionality reduction, yields projection directions that maximize the total scatter across all classes, i.e., across all images of all faces. In choosing the projection which maximizes total scatter, principal components analysis retains unwanted variations due to lighting and facial expression. The Eigenface method is based on linearly projecting the image space onto a low-dimensional feature space.

The Bunch Graph technique [7] has been fairly reliable for determining facial attributes from single images, such as gender or the presence of glasses or a beard. If this technique were developed to extract independent and stable personal attributes, such as age, race or gender, recognition from large databases could be improved and sped up considerably by preselecting corresponding sectors of the database.

Image deblurring algorithms include blind, Lucy-Richardson, Wiener and regularized filter deconvolution, as well as conversions between point spread functions and optical transfer functions.
The Fisherfaces method [8], a derivative of Fisher's Linear Discriminant (FLD), maximizes the ratio of between-class scatter to within-class scatter and appears to be the best at extrapolating and interpolating over variation in lighting, although the Linear Subspace method is a close second.

Cheng-Chin Chiang et al. [9] present a real-time face detection algorithm for locating faces in images and videos, which finds not only the face regions but also the precise locations of facial components such as the eyes and lips. The algorithm starts from the extraction of skin pixels based upon rules derived from a simple quadratic polynomial model. Interestingly, with a minor modification, this polynomial model is also applicable to the extraction of lips. The benefits of applying these two similar polynomial models are twofold. First, much computation time is saved. Second, both extraction processes can be performed simultaneously in one scan of the image or video frame. The eye components are then extracted after the extraction of skin pixels and lips. Afterwards, the algorithm removes falsely extracted components by verification against rules derived from the spatial and geometrical relationships of facial components. Finally, the precise face regions are determined accordingly. According to the experimental results, the algorithm exhibits satisfactory performance in terms of both accuracy and speed for detecting faces with wide variations in size, scale, orientation, colour and expression.
Hironori Yamauchi [9] proposed bio-security using face recognition for industrial use, describing current face recognition systems, which often use either SVM or AdaBoost techniques for the face detection stage and PCA for the face recognition stage.

In Robust Real-Time Face Tracking for the Analysis of Human Behaviour, Damien Douxchamps and Nick Campbell [10] present a real-time system for face detection, tracking and characterization from omni-directional video. Viola-Jones is used as the basis for face detection, and various filters are then applied to eliminate false positives. Gaps between two detections of a face by the Viola-Jones algorithm are filled using colour-based tracking.

Shinjiro Kawato and Nobuji Tetsutani [11] proposed scale-adaptive face detection and tracking of faces in video sequences in real time, applicable to a wide range of face scales. Fast extraction of face candidates is done with Six-Segmented Rectangular (SSR) filters, and face verification by a support vector machine.
In Real-Time Face Detection Using a Six-Segmented Rectangular Filter (SSR Filter), Oraya Sawettanusorn et al. [12] proposed a real-time face detection algorithm using the Six-Segmented Rectangular (SSR) filter, distance information and a template matching technique. Between-the-Eyes is selected as the face representative because its characteristics are common to most people and are easily seen over a wide range of face orientations. The filter rectangle is scanned throughout the face image and divided into six segments.

Research by Li Zhang et al. [13] concentrates on intelligent neural-network-based facial emotion recognition and Latent Semantic Analysis based topic detection for a humanoid robot. The work first incorporates the Facial Action Coding System, which describes physical cues and anatomical knowledge of facial behaviour, for the detection of the neutral state and the six basic emotions from real-time posed facial expressions. Feedforward neural networks (NN) are used to implement upper and lower facial Action Unit (AU) analysers that recognize six upper and eleven lower facial actions, including Inner and Outer Brow Raiser, Lid Tightener, Lip Corner Puller, Upper Lip Raiser, Nose Wrinkler and Mouth Stretch. A neural-network-based facial emotion recognizer then accepts the derived 17 Action Units as inputs to decode the neutral state and the six basic emotions from facial expressions. Moreover, in order for the robot to make appropriate responses based on the detected affective facial behaviours, Latent Semantic Analysis is used to focus on the underlying semantic structure of the data and to go beyond linguistic restrictions to identify topics embedded in the users' conversations. The overall development is integrated with a modern humanoid robot platform under its Linux C++ SDKs. The work presented there shows great potential in developing personalized intelligent agents and robots with emotional and social intelligence.
CHAPTER 3

PROBLEM DEFINITION

The aim of this project is to detect the human facial emotions happiness, sadness and surprise. This is done by first detecting the face in an image using a skin colour detection technique, followed by image segmentation and feature extraction, in which the eye and mouth regions are extracted. The emotions are then detected from the variations of the eyes and mouth.

From the state of the eyes, emotions are detected: if the person is happy or sad the eyes will be open, and when a person is surprised the eyes will be wide open. Similarly, for the lips, the shape and colour properties are important. Depending on the shape of the lips, emotions are detected; for instance, lips that are closed and curved upwards indicate happiness, while open lips indicate surprise. Therefore, based on facial features such as the eyes and mouth, emotions are detected and recognized.
CHAPTER 4

FACIAL EMOTION DETECTION AND RECOGNITION

4.1 Overview of the Algorithm

Our project proposes an emotion detection system in which the facial emotions happy, sad and surprised are detected. First the face is detected in an image using the skin colour model. This is followed by extraction of features such as the eyes and mouth, which are used in further processing to detect the emotion. For detecting the emotion, we take into account the fact that emotions are chiefly represented by mouth expressions; this is done using the shape and colour properties of the lips.

4.1.1 Video Fragmentation

The input video of an e-learning student is acquired using an image acquisition device and stored in a database. This video is then fragmented into individual frames in order to detect the emotions of the e-learning student and thereby improve the virtual learning environment.
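A minimal sketch of this fragmentation step is shown below; the filename and the one-frame-per-second sampling interval are assumptions for illustration.

   vr = VideoReader('session.avi');            % placeholder filename
   step = round(vr.FrameRate);                 % about one frame per second
   for k = 1:step:vr.NumberOfFrames
       frame = read(vr, k);                    % RGB frame k
       imwrite(frame, sprintf('frame_%04d.png', k));
   end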
The video acquisition feature is used to record and register the ongoing emotional changes of the e-learning student; the resulting emotions are detected by mapping the changes in the eye and lip regions. The videos are recorded into a database before processing, making it possible to analyse the changes of emotion for a particular subject, or during a particular time of day.

Frame rate and motion blur are important aspects of video quality. Motion blur is a natural effect when filming the world in discrete time intervals: when a film is recorded at 25 frames per second, each frame has an exposure time of up to 40 milliseconds (1/25 second), and all changes in the scene over those 40 milliseconds blend into the final frame. Without motion blur, animation appears to jump and does not look fluid. When the frame rate of a movie is too low, the mind is no longer convinced that the contents of the movie are continuous, and the movie appears to jump (also called strobing).
The human eye and its brain interface, the human visual system, can process 10 to 12 separate images per second, perceiving them individually. The threshold of perception is more complex, however, with different stimuli having different thresholds: the average shortest noticeable dark period, such as the flicker of a cathode-ray-tube monitor or a fluorescent lamp, is 16 milliseconds, while a single-millisecond visual stimulus may have a perceived duration of between 100 ms and 400 ms due to persistence of vision in the visual cortex. Images perceived within this duration may appear as one stimulus; a 10 ms green flash of light immediately followed by a 10 ms red flash, for example, is perceived as a single yellow flash.

4.1.2 Face Detection

The first step of face detection is to build a skin colour model. Once the skin colour model is produced, the test image is skin-segmented into a binary image and the face is detected. The result of face detection is processed by a decision function based on the chroma components (Cr and Cb from YCbCr, and Hue from HSV). Before the result is passed to the next module, it is cropped according to the skin mask; small background areas which could lead to errors during the next stages are deleted. A model image of face detection with the bounding box is illustrated below in Figure 4.1.
Figure 4.1 Face Detection

4.1.3 Feature Extraction

After the face has been detected, the next step is feature extraction, in which the eyes and mouth are extracted from the detected face. Eye extraction is done by creating two eye maps, a chrominance eye map and a luminance eye map. The two maps are then combined to locate the eyes in the face image, as shown in Figure 4.2.
Figure 4.2 Feature Detection

To locate the mouth region, we use the fact that it contains stronger red components and weaker blue components than other facial regions (Cr > Cb), and the mouth map is constructed on this basis. The mouth region is extracted accordingly. Finally, the eyes and mouth extracted from the face image according to the maps are passed to the next module of the algorithm.

4.1.4 Emotion Detection

The last module is emotion detection. This module makes use of the fact that emotions are expressed chiefly through eye and mouth expressions, as shown in Figure 4.3. Emotion detection from lip images is based on the colour and shape properties of human lips: given a binary lip image, shape detection can be performed, and the emotions are detected from the shape of the lips and other morphological properties.

A computer is being taught to interpret human emotions based on lip patterns, according to research published in the International Journal of Artificial Intelligence and Soft Computing. Such a system could improve the way we interact with computers, and perhaps allow disabled people to use computer-based communication devices, such as voice synthesizers, more effectively and more efficiently.
Figure 4.3 Emotion Detection

4.2 Architectural Design

The architectural diagram shows the overall working of the system: a captured colour image sample is taken as input, processed using image processing tools, and analysed to locate facial features such as the eyes and mouth, which are then processed further to recognize the emotion of the person.

After the localization of the facial features, the next step is to localize the characteristic points on the face. This is followed by the feature extraction process, in which features such as the eyes and mouth are extracted. Based on the variations of the eyes and mouth, the emotion of the person is detected and recognized. For a person who is happy, the eyes will be open and the lips closed and curved upwards, whereas for a person who is sad, the eyes will be open and the lips closed and curved downwards. Similarly, for a person who is surprised, the eyes will be wide open, there will be a considerable displacement of the eyebrows from the eyes, and the mouth will be wide open. Based on these measures, the mood exhibited by the person is detected and recognized.

Figure 4.4 shows the overall working of the system, where the input is the image and the output is the recognized emotion: happy, sad or surprised.
Figure 4.4 Architectural Diagram

CHAPTER 5

REQUIREMENT ANALYSIS
The Software Requirements Specification is based on the problem definition. Ideally, the requirements specification states the "what" of the software product without implying the "how"; it is the software design that specifies how the product will provide the required features.

5.1 Product Requirements

5.1.1 Input Requirements

The input to this work is the video of an e-learning student, which may contain a human face.

5.1.2 Output Requirements

The output is the detected facial emotion: happy, sad or surprised.

5.2 Resource Requirements

The hardware configuration required is shown in Table 5.1, and the software configuration required to run this software is shown in Table 5.2.
5.2.1 Hardware Requirements

Table 5.1 Hardware Requirements

S.No   Feature       Configuration
1      CPU           Intel Core 2 Duo processor
2      Main memory   1 GB RAM
3      Hard disk     60 GB disk size

The configuration in Table 5.1 is the minimum hardware requirement for the proposed system.

5.2.2 Software Requirements

Table 5.2 Software Requirements

S.No   Software   Version
1      Windows    7
2      Matlab     R2012a
3      Picasa     3

The proposed system is executed using Windows 7, Matlab R2012a and Picasa 3, as shown in Table 5.2.

CHAPTER 6

DEVELOPMENT PROCESS AND DOCUMENTATION
6.1 Face Detection

Face detection is used in biometrics, often as part of, or together with, a facial recognition system. It is also used in video surveillance, human computer interfaces and image database management. Some recent digital cameras use face detection for autofocus. Face detection is also useful for selecting regions of interest in photo slideshows that use a pan-and-scale Ken Burns effect.

Face detection can be regarded as a specific case of object-class detection. In object-class detection, the task is to find the locations and sizes of all objects in an image that belong to a given class; examples include upper torsos, pedestrians and cars. Face detection can also be regarded as a more general case of face localization: in face localization, the task is to find the locations and sizes of a known number of faces, whereas in face detection one does not have this additional information.

6.1.1 Sample Collection

Sample skin-coloured pixels are collected from images of people belonging to different races. Each pixel is carefully chosen from the images so that regions not belonging to skin colour are not included.

6.1.2 Chroma Chart Preparation

The chroma chart shown in Figure 6.1 is the distribution of the skin colour of different people over the chromatic colour space.
Figure 6.1 Chroma Chart Diagram

Here the chromatic colour is taken in the (Cb, Cr) colour space. Normally images are stored in (R, G, B) format, so a suitable conversion to the YCbCr colour space is needed. The collected sample pixel values are converted from the (R, G, B) colour space to the YCbCr colour space, and a chart is drawn by taking Cb along the x-axis and Cr along the y-axis. The resulting chart shows the distribution of the skin colour of different people. The intensity (Y) component is not considered because it has very little effect on the chrominance variation.
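A sketch of building the chart is shown below, assuming skinPixels is an N-by-3 uint8 array of hand-picked RGB skin samples gathered as described in Section 6.1.1.

   ycc = rgb2ycbcr(reshape(skinPixels, [], 1, 3));  % RGB -> YCbCr
   Cb  = double(ycc(:, 1, 2));
   Cr  = double(ycc(:, 1, 3));
   % 2-D histogram over the 256x256 chroma plane (Y is ignored):
   chart = accumarray([Cr + 1, Cb + 1], 1, [256 256]);
   imagesc(chart); axis xy; xlabel('Cb'); ylabel('Cr');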
6.1.3 Skin Colour Model

The skin-likelihood image is obtained using the developed skin colour model, which is the distribution of skin colour over the chromatic colour space. Each pixel in the given input image is compared with the skin colour model. If its chrominance pair is present in the model, the pixel is made white, by assigning 255 to its red, green and blue components; if the chrominance pair is not present, the pixel is made black, by assigning 0 to its red, green and blue components.

The result of face detection is first processed by a decision function based on the chroma components (Cr and Cb from YCbCr, and Hue from HSV). A pixel is marked as skin if both of the following conditions are true: 140 < Cr < 165 and 140 < Cb < 195. The resulting image is a binary image in which the white regions show the possible skin-coloured regions and the black regions show the non-skin regions. Before the result is passed to the next module, it is cropped according to the skin mask; small background areas which could lead to errors during the next stages are deleted.
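The following sketch applies the stated Cr/Cb ranges and crops to the largest skin component; the filename, the 200-pixel noise threshold, and the largest-component heuristic are assumptions.

   RGB = imread('face.png');                       % placeholder input
   ycc = rgb2ycbcr(RGB);
   Cb  = double(ycc(:,:,2));
   Cr  = double(ycc(:,:,3));
   skin = (Cr > 140 & Cr < 165) & (Cb > 140 & Cb < 195);   % binary mask
   skin = bwareaopen(skin, 200);                   % delete small areas
   CC = bwconncomp(skin);
   [~, biggest] = max(cellfun(@numel, CC.PixelIdxList));   % assume face
   stats = regionprops(CC, 'BoundingBox');
   faceRegion = imcrop(RGB, stats(biggest).BoundingBox);   % cropped face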
6.2 Feature Extraction

Feature extraction is the process of detecting the required features of the face and extracting them by cropping or a similar technique.

6.2.1 Eye Detection

Two separate eye maps are built, one from the chrominance component and one from the luminance component, and the two maps are then combined into a single eye map. The eye map from the chrominance is based on the fact that high Cb and low Cr values are found around the eyes; it is constructed with the formula

EyeMapChr = (1/3) * (Cb^2 + (255 - Cr)^2 + Cb/Cr)

Eyes usually contain both dark and bright pixels in the luminance component, so grey-scale operators can be designed to emphasise brighter and darker pixels in the luminance component around the eye regions. Such operators are dilation and erosion; we use grey-scale dilation and erosion with a spherical structuring element to construct the luminance eye map. The eye map from the chrominance is then combined with the eye map from the luminance by an AND (multiplication) operation:

EyeMap = EyeMapChr AND EyeMapLum

The resulting eye map is then dilated and normalised to brighten the eyes and suppress other facial areas. With an appropriate choice of threshold, the location of the eye region can then be tracked.
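A sketch of this construction is given below. The chrominance formula follows the text; the dilation-over-erosion form of the luminance map, the normalisations and the 0.7 threshold are assumptions chosen for illustration, not values taken from the report.

    % Eye map from chrominance and luminance (sketch).
    ycc = rgb2ycbcr(imread('face.jpg'));               % cropped face region (placeholder)
    Y  = double(ycc(:,:,1));
    Cb = double(ycc(:,:,2));  Cr = double(ycc(:,:,3));
    EyeMapChr = (1/3) * (Cb.^2 + (255 - Cr).^2 + Cb ./ Cr);
    EyeMapChr = EyeMapChr / max(EyeMapChr(:));         % normalise to [0,1]
    se = strel('ball', 5, 5);                          % non-flat ("spherical") element
    EyeMapLum = imdilate(Y, se) ./ (imerode(Y, se) + 1);  % bright vs. dark contrast
    EyeMapLum = EyeMapLum / max(EyeMapLum(:));
    EyeMap = EyeMapChr .* EyeMapLum;                   % AND via multiplication
    EyeMap = imdilate(EyeMap, strel('disk', 3));       % brighten the eye regions
    eyes = EyeMap > 0.7 * max(EyeMap(:));              % threshold (assumed value)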
6.2.2 Mouth Detection

To locate the mouth region, we use the fact that it contains stronger red components and weaker blue components than other facial regions (Cr > Cb). The mouth map is constructed as follows:

n = 0.95 * ((1/k) * sum(Cr(x,y)^2)) / ((1/k) * sum(Cr(x,y)/Cb(x,y)))

MouthMap = Cr^2 * (Cr^2 - n * Cr/Cb)^2

where k is the number of pixels in the face region. The mouth detection results for happy and surprised faces are shown in Figure 6.2.

Figure 6.2 – Mouth Detection Diagram (happy and surprised panels)
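The map can be computed as below. This is a sketch: the normalisation of Cr^2 and Cr/Cb to [0,255] before forming the ratio, and the final 0.6 threshold, are assumptions carried over from the usual form of this map rather than values stated in the report.

    % Mouth map from the red/blue chrominance contrast (sketch).
    ycc = rgb2ycbcr(imread('face.jpg'));               % cropped face region (placeholder)
    Cb = double(ycc(:,:,2));  Cr = double(ycc(:,:,3));
    Cr2  = Cr.^2;       Cr2  = 255 * Cr2  / max(Cr2(:));
    CrCb = Cr ./ Cb;    CrCb = 255 * CrCb / max(CrCb(:));
    n = 0.95 * mean(Cr2(:)) / mean(CrCb(:));           % k cancels in the ratio of means
    MouthMap = Cr2 .* (Cr2 - n * CrCb).^2;
    MouthMap = MouthMap / max(MouthMap(:));
    mouth = MouthMap > 0.6;                            % threshold (assumed value)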
6.3 Emotion Detection

Emotion detection from lip images is based on the colour and shape properties of human lips. For this task we assume we already have a rectangular colour image containing the lips and surrounding skin (with as little skin as possible). Given this, we can extract a binary image of the lips, which gives us the necessary information about the shape. To extract the binary lip image, a double-threshold approach is used. First, a binary image (the mask) containing objects similar to lips is extracted; the mask is extracted in such a way that it contains a superset of the exact set of lip pixels. Then another image (the marker) is generated by extracting the pixels which contain lips with the highest probability. Finally, the mask image is reconstructed using the marker image to make the result more accurate, as in the sketch below.
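A sketch of this double-threshold extraction, using Matlab's imreconstruct for the morphological reconstruction. The hue bounds and the file name mouth_crop.jpg are illustrative assumptions; the actual thresholds depend on lighting and skin tone, as discussed in the conclusion.

    % Double-threshold lip extraction (sketch).
    lipRegion = imread('mouth_crop.jpg');              % rectangular crop around the lips (placeholder)
    hsv = rgb2hsv(lipRegion);
    hue = hsv(:,:,1);
    mask   = (hue > 0.90) | (hue < 0.10);              % loose bound: superset of the lip pixels
    marker = (hue > 0.95) | (hue < 0.05);              % strict bound: high-confidence lip pixels
    marker = marker & mask;                            % the marker must lie inside the mask
    lips = imreconstruct(marker, mask);                % grow the marker inside the mask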
Having a binary lip image, shape detection can be performed. Some lip features of a face expressing a certain emotion are obvious: the side corners of happy lips are higher, relative to the lip centre, than those of serious or sad lips. One way to express this mathematically is to find the leftmost and rightmost pixels (the lip corners), draw a line between them and calculate the position of the lip centre with respect to that line: the lower the centre lies below the line, the happier the lips. Another morphological lip property that can be extracted is mouth openness; open lips imply certain emotions, usually happiness and surprise. For example, for surprised and happy faces the processing proceeds as follows (a code sketch of the decision step follows at the end of this section):

1. Starting from the original binary image, small areas are removed, which is done with the 'sizethre(x,y,'z')' function.

2. A morphological closing (imclose(bw,se)) with a 'disk' structuring element is applied.

3. Some properties of the image regions are measured (blob analysis). More precisely:

   A 'BoundingBox' is calculated, the smallest rectangle containing the region (in our case the green box). In digital image processing, the bounding box is merely the coordinates of the rectangular border that fully encloses a digital image when it is placed over a page, a canvas, a screen or another similar two-dimensional background.

   'Extrema' are calculated: an 8-by-2 matrix that specifies the extrema points in the region. Each row of the matrix contains the x- and y-coordinates of one of the points, in the order [top-left, top-right, right-top, right-bottom, bottom-right, bottom-left, left-bottom, left-top] (in our case the cyan dots).

   A 'Centroid' is calculated: a 1-by-ndims(L) vector that specifies the centre of mass of the region (in our case the blue star). From these measurements the decision is derived using:

   1. p_poly_dist, which calculates the distance (shown as a red line) between the centroid and the line through the left-top and right-top extrema.
   2. lipratio, the ratio between the width and height of the bounding box.

   3. lip_sign, a positive or negative number which is calculated to detect whether the left-top/right-top line runs over or under the centroid.

4. The decision is then made whether the mood is 'happy', 'sad' or 'surprised'.

After reviewing some illumination correction (colour constancy) algorithms, we decided to use the Max-RGB (also known as "white patch") algorithm. This algorithm assumes that in every image there is a white patch, which is then used as a reference for the present illumination. A more accurate Colour by Correlation algorithm was also considered, but it requires building a precise colour-illumination correlation table under controlled conditions, which would be beyond the scope of this task.

As face detection is always the first step in such recognition or transmission systems, its performance puts a strict limit on the achievable performance of the whole system. Ideally, a good face detector should accurately extract all faces in images regardless of their positions, scales, orientations, colours, shapes, poses, expressions and lighting conditions. For the current state of the art in image processing, however, this goal is a big challenge, which is why many face detectors deal only with upright, frontal faces in well-constrained environments. This lip emotion detection algorithm has one restriction: the face cannot be rotated more than 90 degrees, since the corner detection would then fail.
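The sketch below strings the four steps together, continuing from the binary lip image produced earlier. sizethre, p_poly_dist, lipratio and lip_sign are the report's own helper routines; here the standard toolbox calls bwareaopen and regionprops and some inline geometry stand in for them, and the decision thresholds are assumptions chosen only for illustration.

    % Shape-based mood decision from the binary lip image (sketch).
    bw = bwareaopen(lips, 50);                 % step 1: remove small areas (stands in for sizethre)
    bw = imclose(bw, strel('disk', 5));        % step 2: morphological closing
    s  = regionprops(bw, 'BoundingBox', 'Extrema', 'Centroid');  % step 3: blob analysis
    bb = s(1).BoundingBox;  ex = s(1).Extrema;  c = s(1).Centroid;  % assumes one lip region
    p1 = ex(8,:);  p2 = ex(3,:);               % left-top and right-top extrema
    % y of the corner line at the centroid's x (image y grows downward):
    yline = p1(2) + (p2(2) - p1(2)) * (c(1) - p1(1)) / (p2(1) - p1(1));
    ratio = bb(3) / bb(4);                     % bounding-box width/height (lipratio)
    % Step 4: decision; the thresholds 2.5 and 2 are assumed for illustration.
    if c(2) > yline && ratio > 2.5
        mood = 'happy';                        % corners higher than the centre, wide mouth
    elseif ratio < 2
        mood = 'surprised';                    % tall, open mouth
    else
        mood = 'sad';
    end

The Max-RGB correction mentioned above is equally short: each channel is rescaled so that its brightest value maps to full intensity, on the assumption that the brightest patch in the scene is white.

    % Max-RGB ("white patch") illumination correction (sketch).
    img = im2double(imread('input.jpg'));      % placeholder file name
    for ch = 1:3
        img(:,:,ch) = img(:,:,ch) / max(max(img(:,:,ch)));
    end
    corrected = im2uint8(img);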
CHAPTER 7
EXPERIMENTAL RESULTS

7.1 General

The results obtained after successful implementation of the project are given in this chapter, on a step-by-step basis.

7.2 Chroma Chart

The chroma chart displayed in Figure 7.1 is the distribution of the skin colour of different people over the chromatic colour space. Here the chromatic colour is taken in the (Cb, Cr) colour space; the intensity (Y) component is not considered because it has very little effect on the chrominance variation.

Figure 7.1 – Chroma Chart

7.3 Result Analysis

This section gives the overall efficiency of the proposed system, measured
at each step. The system was analysed for its detection rate and the time taken at each stage for a specified number of input images. Three stages were considered: skin detection; face-feature detection (eyes and mouth); and emotion detection and recognition. At each of these stages the detection rate and the time taken were calculated. The results are tabulated in Table 7.1.

Table 7.1 – Result Analysis

STAGES                               DETECTION RATE (%)   NUMBER OF IMAGES   TIME (s)
SKIN DETECTION                       94.44                17                 1.4
FACE DETECTION (EYES AND MOUTH)      83.33                15                 1
EMOTION DETECTION AND RECOGNITION    88.88                16                 0.5

According to the table, 17 image samples were taken to determine the skin detection rate, and skin was detected in 16 of them, giving a detection rate of 94.44% with an average time of 1.4 seconds per image. The face detection rate was
calculated for 15 images, out of which faces were detected successfully in 12, giving a detection rate of 83.33% with an average time of 1 second per image. Similarly, the emotion detection and recognition rate was calculated for 16 images; exact emotions were detected and recognised for 14 of them, giving a detection rate of 88.88% with an average time of 0.5 seconds per image.

The video fragmentation rate depends on the duration of the original video, and the frames-per-second rate depends on its time span. Frame rate (also known as frame frequency) is the frequency at which an imaging device produces unique consecutive images, called frames. The term applies equally well to film and video cameras, computer graphics and motion capture systems. Frame rate is most often expressed in frames per second (fps) and, for progressive-scan monitors, in hertz (Hz). If a video with a greater time span is given, the interval between the fragments remains constant. For every fragment produced, the emotion of the person is detected, giving an indication of the intervals at which changes of emotion occur and narrowing down the corresponding reason for their occurrence.

CHAPTER 8
CONCLUSION AND FUTURE WORK
Conclusion

The proposed system uses feature extraction techniques and determines the emotion of a person based on facial features, namely the eyes and lips. The emotion exhibited by a person is determined with good accuracy, and the system is user friendly.

Face Detection and Segmentation

In this project we have proposed an emotion detection and recognition system for colour images. Although the application is constructed only for full-frontal pictures with one person per picture, face detection is necessary to decrease the area of interest for further processing in order to achieve the best results. Detecting the skin of a face in an image is a hard task because of the variance of illumination; the success of detection depends greatly on the light sources and illumination properties of the environment in which the picture is taken.

Emotion Detection

The major difficulty of the approach used is determining the right hue threshold range for lip extraction. Lip colours vary mostly according to the subject's race, the presence of make-up and the illumination under which the photo was taken. The last of these is the least problematic, since illumination correction algorithms exist.

Future Enhancements
Future work includes enhancing the system so that it can detect the emotions of a person even in complex backgrounds with different illumination conditions, and eliminating the lip-colour constraint in coloured images. Another criterion that can be worked on is projecting more emotions than happy, sad and surprised.

APPENDIX 1
SCREENSHOTS
SCREEN 1: The detected face for the given video input.

SCREEN 2: The interface which is used to select the input image.

SCREEN 3: The image which is given as reference.

SCREEN 4: The image to be tested.

SCREEN 5: The smoothened reference image.

SCREEN 6: The test image after smoothening.

SCREEN 7: The image after the detection of edges.

SCREEN 8: The result screen, which displays the end result of the system: the emotion portrayed by the person in the image.