This is the 2nd key paper: Facial Feature Tracking under Varying Facial Expressions and Face Poses based on Restricted Boltzmann Machines
For faces with expression
1. JAMES COOK AUSTRALIA INSTITUTE OF HIGHER LEARNING IN SINGAPORE
HEALTH DIAGNOSTIC BY ANALYSING FACE IMAGES USING MOBILE DEVICES
Instructor : Dr. Insu Song
Student : Ho Thi Hoang Yen
Email:
2. INTRODUCTION
Previously : A robust, highly accurate method for detecting
20 facial points in images of expressionless faces
3. BACKGROUND
A restricted Boltzmann machine (RBM) is a generative stochastic
artificial neural network that can learn a probability distribution over its
set of inputs.
Inventor : under the name Harmonium by Paul Smolensky in 1986.
Fast learning algorithms : mid-2000s by Geoffrey Hinton &
collaborators.
RBMs have found applications in dimensionality reduction,
classification, collaborative filtering, feature learning and topic
modelling.
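As a rough illustration of what "learning a probability distribution over its inputs" means, here is a minimal sketch of a binary RBM trained with one-step contrastive divergence (CD-1), the kind of fast learning rule associated with Hinton's work. The class name `BinaryRBM`, the toy patterns, and the hyperparameters are assumptions made for this example only; the paper's models work on facial feature point coordinates and are not reproduced here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class BinaryRBM:
    """Toy RBM with binary visible and hidden units (illustrative only)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data.
        ph0 = self.hidden_probs(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one Gibbs step back to the visibles and up again.
        pv1 = self.visible_probs(h0)
        ph1 = self.hidden_probs(pv1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)
        # Mean squared reconstruction error, measured before the update.
        return float(np.mean((v0 - pv1) ** 2))

# Hypothetical toy data: two binary patterns, each repeated 10 times.
patterns = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]], dtype=float)
data = np.repeat(patterns, 10, axis=0)

rbm = BinaryRBM(n_visible=6, n_hidden=4)
errors = [rbm.cd1_step(data) for _ in range(1000)]
recon = rbm.visible_probs(rbm.hidden_probs(data))
```

The reconstruction error is only a convenient progress indicator here; it is not the quantity CD-1 actually optimizes.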
5. INTRODUCTION
Other methods :
- track facial feature points independently or
- build a shape model to capture the variations of face shape or
- appearance regardless of the facial expressions and face poses
This method : capture the distinctions & variations of face shapes
due to facial expression and pose change in a UNIFIED
framework
6. CONTENT - METHODOLOGY
1. Related work
2. FrontalRBM & PoseRBM
3. Facial feature tracking based on face shape prior
model
4. Experimental results
7. 1. RELATED WORK
Facial feature localization:
2 categories :
• Without shape prior models : track each facial feature
point independently and ignore the prior knowledge
about the face => sensitive to expression & pose
• With shape prior models : capture the dependence
between facial feature points by explicitly modeling the
general properties as well as the variations of facial
shape or appearance
8. 1. RELATED WORK
Facial feature localization:
Recent methods:
• Active Shape Model (ASM) [2] and Active Appearance Model
(AAM) : linear generative models
• Facial point detection using boosted regression and graph
models : facial feature points are detected independently based
on the response of the support vector regressor.
• Gaussian Process Latent Variable model : a single Gaussian
is used for each facial component.
• Multi-State Facial Component Model of Tian and Cohn
• ….
9. 1. RELATED WORK
Restricted Boltzmann Machines based shape prior
model:
• Deep Belief Networks(DBNs)-like model : S. Eslami, N.
Heess, and J. Winn. (2012) - a strong model of object
shape.
• Implicit mixture of Conditional Restricted Boltzmann
Machines (imRBM) : G. Taylor, L. Sigal, D. Fleet, and G. Hinton
(2010) - capture human poses and motions
under different activities such as walking, running, etc.
• …
10. CONTENT - METHODOLOGY
1. Related work
2. FrontalRBM & PoseRBM
3. Facial feature tracking based on face shape prior
model
4. Experimental results
11. 2. FRONTAL-RBM & POSE-RBM
• the locations of facial feature points for frontal face when subjects show different facial expressions
• the corresponding locations of facial feature points for non-frontal face under the same facial expression
• H1 & H2 are two sets of hidden nodes
12. FACIAL FEATURE TRACKING BASED ON FACE
SHAPE PRIOR MODEL
Gaussian assumption : estimate the prior probability by
calculating the mean vector μp and covariance matrix Σp
from the samples.
Kernel Density Estimation (KDE) : estimate the prior probability
non-parametrically from the same samples.
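The two estimators named on this slide can be sketched as follows. This is a minimal NumPy illustration, assuming a set of shape samples is already available; the function names, the ridge term, the bandwidth, and the 4-dimensional toy "shapes" are assumptions for the example and are not taken from the paper.

```python
import numpy as np

def fit_gaussian(samples, ridge=1e-6):
    """Estimate the mean vector and covariance matrix from the samples."""
    mu = samples.mean(axis=0)
    # A small ridge keeps the covariance invertible (an assumption of this sketch).
    sigma = np.cov(samples, rowvar=False) + ridge * np.eye(samples.shape[1])
    return mu, sigma

def gaussian_logpdf(x, mu, sigma):
    """Log-density of x under N(mu, sigma)."""
    d = mu.shape[0]
    diff = x - mu
    _, logdet = np.linalg.slogdet(sigma)
    maha = diff @ np.linalg.solve(sigma, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def kde_logpdf(x, samples, bandwidth):
    """Log-density of x under an isotropic Gaussian kernel density estimate."""
    n, d = samples.shape
    sq = np.sum((samples - x) ** 2, axis=1) / bandwidth ** 2
    log_k = -0.5 * sq - 0.5 * d * np.log(2.0 * np.pi) - d * np.log(bandwidth)
    m = log_k.max()  # log-sum-exp for numerical stability
    return m + np.log(np.exp(log_k - m).sum()) - np.log(n)

# Hypothetical samples: 500 noisy copies of a 4-dimensional "mean shape".
rng = np.random.default_rng(0)
true_mean = np.array([10.0, 20.0, 30.0, 20.0])
samples = rng.normal(loc=true_mean, scale=1.0, size=(500, 4))

mu, sigma = fit_gaussian(samples)
near, far = true_mean, true_mean + 5.0
print(gaussian_logpdf(near, mu, sigma) > gaussian_logpdf(far, mu, sigma))
print(kde_logpdf(near, samples, 0.5) > kde_logpdf(far, samples, 0.5))
```

The Gaussian fit summarizes the samples with a single mode, while the KDE keeps every sample and can follow a multi-modal distribution at the cost of choosing a bandwidth.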
13. CONTENT - METHODOLOGY
1. Related work
2. FrontalRBM & PoseRBM
3. Facial feature tracking based on face shape prior
model
4. Experimental results
16. RESULT
Experiments on MMI database: comparable to Facial
point detection using boosted regression and graph
models (average detection error of 5.3 on 400 images).
19 : Robust facial feature tracking under varying face pose and facial expression
(Y. Tong, Y. Wang, Z. Zhu, and Q. Ji. - Nov 2007)
21. CONCLUSION
• Improving The Accuracy And Robustness Of Facial Feature Tracking
Under Simultaneous Pose And Expression Variations
• 1st : A face shape prior model to capture the face shape patterns under
varying facial expressions for near-frontal face based on deep belief
networks
• 2nd : Extend the frontal face prior model by a 3-way RBM to capture face
shape patterns under simultaneous expression and pose variation.
• 3rd : Systematically combine the face prior models with image
measurements of facial feature points to perform facial feature point
tracking.
22. SWOT
• STRENGTH ?
Experiments compare against many methods, with good results.
• WEAKNESS?
The steps of the methods are not very clear.
There is no specific correct detection rate.
• OPPORTUNITY ?
Can be very useful for face detection related programs
• THREAT ?
There are more than 6 basic expressions.
The training data must be labelled manually.
23. OPINION
This paper has described a good method for face
detection under varying expression, especially with
occlusions.
It is valuable for all kinds of research related to faces,
to systems for interaction between humans & computers, to
face recognition, and to FACE ANALYSIS FOR HEALTH
PURPOSES.
Editor's Notes
Good morning everybody,
Code : http://deeplearning.net/tutorial/DBN.html
Previously in this literature review, I introduced a paper on a robust, highly accurate method for detecting 20 facial points in images of expressionless faces.
Today, I’ll continue with a paper on “facial feature tracking under varying facial expressions”.
RBMs were initially invented under the name Harmonium by Paul Smolensky in 1986,
but only rose to prominence after Geoffrey Hinton and collaborators invented fast learning algorithms for them in the mid-2000s.
RBMs have found applications in dimensionality reduction, classification, collaborative filtering, feature learning, and topic modelling.
This paper presents work that effectively tracks facial feature points using face shape prior models constructed from RBMs.
The facial feature tracker can track 26 facial feature points (Fig. 1 (a))
even if faces have different facial expressions, varying poses, or occlusion.
Unlike previous works, which track facial feature points independently or build a shape model that captures the variations of face shape or appearance regardless of facial expression and face pose,
the proposed model captures the distinctions as well as the variations of face shapes due to facial expression and pose change in a unified framework.
Methods without a shape prior model track each facial feature point independently and ignore prior knowledge about the face. As a result, they are usually sensitive to changes in facial expression and face pose, occlusion, etc.
On the other hand, methods with a shape prior model capture the dependence between facial feature points by explicitly modeling the general properties as well as the variations of facial shape or appearance.
The authors also review other methods from recent articles.
In real world situations, faces usually vary in facial expressions and poses. These natural movements make facial feature tracking even more difficult. To solve this problem,
Tian and Cohn [9] propose a multi-state facial component model, where the state is selected by tracking a few control points. As a result, the accuracy of their method critically relies on how accurately and reliably the control points are tracked.
Tong et al. [19] propose a model to capture the different states of facial components like mouth open and mouth closed. In addition, they project the frontal face to face with poses to handle the varying poses problem.
However, during tracking, they need to dynamically and explicitly estimate the state of local components and switch between different models.
In [3], Dantone and Van Gool take into account the pose variations and build sets of conditional regression forests on different poses.
In [1], instead of using a parametric model, Belhumeur et al. present methods that represent face shape variations with non-parametric training data.
Eslami et al. propose a strong model of object shape based on Boltzmann Machines.
Specifically, they build a Deep Belief Network (DBN)-like model, but with only locally shared weights in the first hidden layer, to represent the shapes of horses and motorbikes.
The samples drawn from the model look realistic and generalize well.
The mixture nature of im-RBM makes it possible to learn a single model to represent the human poses and motions under different activities such as walking, running, etc.
Through experiments, the authors found that human face shapes follow patterns, but these patterns depend on the facial expression.
These are 6 basic facial expressions
To capture these patterns, they propose a face shape prior model based on Deep Belief Networks, called FrontalRBM in the paper: a two-layer DBN explicitly captures the face shape patterns under different facial expressions.
Part 2 : the 3-way RBM model captures the transition between the facial feature locations for frontal face and corresponding non-frontal face.
The two parts can be trained separately.
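For orientation, a generic three-way (gated) RBM is usually written with an energy that couples two sets of visible units and one set of hidden units through a weight tensor. The form below is the textbook binary-unit version; the symbols x (frontal points), y (non-frontal points), h (hidden nodes), W, a, b, c are assumptions for this note, and the paper's exact parameterization (for example, real-valued units) may differ.

```latex
E(\mathbf{x}, \mathbf{y}, \mathbf{h})
  = -\sum_{i,j,k} W_{ijk}\, x_i\, y_j\, h_k
    - \sum_i a_i x_i - \sum_j b_j y_j - \sum_k c_k h_k,
\qquad
P(\mathbf{x}, \mathbf{y}, \mathbf{h})
  = \frac{\exp\bigl(-E(\mathbf{x}, \mathbf{y}, \mathbf{h})\bigr)}{Z}
```

The multiplicative term is what lets the hidden nodes model how one set of points relates to the other, rather than modeling each set on its own.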
The Gaussian assumption and kernel density estimation (KDE) are used to track facial feature points based on the face shape prior model.
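One common way to combine a noisy measurement with a Gaussian prior is the closed-form Gaussian posterior mean, sketched below. This is an illustration under stated assumptions (measurement = true shape + zero-mean Gaussian noise, Gaussian shape prior); the function name `fuse_gaussian` and the toy numbers are hypothetical, and the paper's exact formulation may differ.

```python
import numpy as np

def fuse_gaussian(m, sigma_m, mu_p, sigma_p):
    """Posterior mean of x given m = x + noise.

    Assumes noise ~ N(0, sigma_m) and prior x ~ N(mu_p, sigma_p).
    """
    # The prior mean is pulled toward the measurement by a gain that
    # depends on how uncertain the prior is relative to the measurement.
    return mu_p + sigma_p @ np.linalg.solve(sigma_p + sigma_m, m - mu_p)

# Hypothetical 2-D example: one feature point (x, y).
mu_p = np.array([0.0, 0.0])      # prior mean from the shape model
m = np.array([2.0, 4.0])         # measured location from the image
identity = np.eye(2)

# Equal confidence in prior and measurement gives the midpoint.
equal = fuse_gaussian(m, identity, mu_p, identity)
# A very noisy measurement leaves the estimate close to the prior mean.
noisy = fuse_gaussian(m, 1e6 * identity, mu_p, identity)
print(equal, noisy)
```

With a KDE prior there is no such closed form in general, so the estimate would have to be found numerically.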
We show the experiments using synthetic data,
sequences from the extended Cohn-Kanade database (CK+) [10], the MMI facial expression database [15], the American Sign Language (ASL) database [24] and the ISL Facial Expression database.
Synthetic data are used in data mining, for example for testing and training fraud detection systems.
Such synthetic data help teach a system how to react to certain situations or criteria.
(a) and (c) are faces with an outlier (left eyebrow tip) and corrupted points on the left half of the face.
(b) and (d) are the results after correction.
FrontalRBM shows strong power as a face shape prior model.
The CK+ database contains facial behavior videos of 123 subjects showing 7 basic facial expressions including anger, disgust, fear, happiness, sadness, surprise, and contempt.
They compared the result with the Active Appearance Models of Matthews & Baker (2004).
MMI database : there are 196 sequences of 27 subjects with 6 basic facial expressions. Some subjects may wear glasses.
By incorporating FrontalRBM as the face shape prior model, the overall errors decrease by 16.50% and 15.13% when using the Gaussian assumption and KDE, respectively, to combine the measurement and the model.
This result is comparable to the state-of-the-art research (Facial point detection using boosted regression and graph models), which reports an average detection error of 5.3 on 400 images selected from not only the MMI database but also the FERET database.
Gabor approach for local feature extraction outperformed PCA (Principal Component Analysis), FLD (Fisher’s Linear Discriminant) and LFA (Local Feature Analysis)