This presentation proposes a pose- and occlusion-robust face alignment method using multiple shape models and partial inference. Shapes are represented with point distribution models, and multiple shape models handle various poses and expressions. Local features are detected hierarchically using a modified census transform and AdaBoost; transformation and shape parameters are then hypothesized via partial inference to estimate both visible and invisible features. Experimental results on public databases show that the method achieves accurate alignment across poses, expressions, and occlusions.
Slide 4 — Introduction
What is face alignment?
• Face alignment extracts facial feature points (eyebrow, eye, nose, mouth, and chin) from a given image.
* "The POSTECH Face Database (PF07) and Performance Evaluation", FG 2008
Slide 5 — Introduction
Why is it important?
• Face alignment is a prerequisite for many face-related problems, e.g. face recognition, facial expression recognition (angry, happy, surprise, neutral), and head pose estimation (-25° to +25°).
Slide 8 — Previous work
Two approaches
• 1. Discriminative approach
  • Active Shape Model: the shape parameters are iteratively updated by locally finding the best nearby match for each feature point.
• 2. Generative approach
  • Active Appearance Model: the shape parameters are iteratively updated by minimizing the error between the appearance instance and the input image.
Slide 9 — Previous work
1. Discriminative approach
• Constrained Local Model [1]: feature detector — linear SVM; alignment algorithm — mean-shifts.
• Bayesian Tangent Shape Model [2]: feature detector — gradient along the normal vector; alignment algorithm — Bayesian inference.
• Both assume that all feature points are visible.
• Wrongly detected feature points cause alignment to fail.
[1] Jason et al., "Face Alignment through Subspace Constrained Mean-Shifts", ICCV 2009
[2] Yi et al., "Bayesian Tangent Shape Model: Estimating Shape and Pose Parameters via Bayesian Inference", CVPR 2003
Slide 10 — Previous work
2. Generative approach
• Boosted Appearance Model [3]: appearance model — Haar-like features and boosting; weak classifiers discriminate aligned from misaligned images.
• Fourier Active Appearance Model [4]: appearance model — Fourier-transformed appearance; alignment algorithm — gradient descent.
• Due to the high-dimensional solution space, these methods have many local minima.
• They need good initialization, e.g. by eye detection.
[3] Xiaoming Liu, "Generic Face Alignment using Boosted Appearance Model", CVPR 2007
[4] Rajitha et al., "Fourier Active Appearance Models", ICCV 2011
Slide 12 — Proposed method
Motivation
• We follow the discriminative approach.
• Determine whether each feature point is visible or not.
• Only visible feature points are involved in the alignment step.
• Invisible feature points are estimated from the visible ones using the partial inference (PI) algorithm.
• Using multiple shape models, we handle the pose problem.
We propose pose- and occlusion-robust face alignment!
[Figure: feature points labeled visible / invisible]
Slide 13 — Proposed method
Shape Representation
• Point Distribution Model
• A non-rigid shape is represented by a linear combination of shape bases added to the mean shape, followed by a similarity transform:
  x = s R (x̄ + Φq) + t
  where x̄ is the mean shape, Φ the shape eigenvectors, q the shape parameter, s the scale, R the rotation, and t the translation (x, y).
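The point distribution model above can be sketched in NumPy. The array layouts (mean shape as an (N, 2) array, eigenvectors stacked x,y per point in a (2N, K) matrix) are illustrative assumptions, not the slides' exact conventions.

```python
import numpy as np

def generate_shape(mean_shape, Phi, q, s, theta, t):
    """Instance a PDM shape: x = s * R(theta) @ (mean + Phi q) + t.

    mean_shape: (N, 2) mean landmark positions
    Phi:        (2N, K) shape eigenvectors, x and y stacked per point
    q:          (K,) shape parameters
    """
    deform = mean_shape + (Phi @ q).reshape(-1, 2)   # non-rigid part
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return s * deform @ R.T + t                       # similarity transform

# With q = 0 and the identity transform, the instance is the mean shape.
mean = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
Phi = np.zeros((6, 2))
x = generate_shape(mean, Phi, np.zeros(2), 1.0, 0.0, np.zeros(2))
```

Scaling by s, rotating by theta, or adding a nonzero q deforms this base shape while keeping the same landmark ordering.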
Slide 14 — Proposed method
Formulation
• Shape model with parameter p = {s, R, q, t}
• Energy function: a sum over the N local features of per-point alignment errors, gated by a visibility indicator v_i that denotes whether the i-th feature point is aligned (visible) or not.
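The exact energy expression is not recoverable from this transcript; one plausible instantiation, consistent with the PDM on the previous slide, sums squared distances between detected points y_i and the model's predicted points, with occluded points excluded by v_i:

```python
import numpy as np

def alignment_energy(y, v, mean_shape, Phi, s, theta, q, t):
    # Assumed form: E(p) = sum_i v_i * ||y_i - (s R (mean_i + (Phi q)_i) + t)||^2
    # where v_i = 0 removes occluded points from the sum.
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    model = s * (mean_shape + (Phi @ q).reshape(-1, 2)) @ R.T + t
    return float((v * ((y - model) ** 2).sum(axis=1)).sum())

mean = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
Phi = np.zeros((6, 1))
y = mean.copy()
y[2] += 5.0   # a badly detected (occluded) point
```

Marking the bad point invisible (v = [1, 1, 0]) zeroes its contribution, which is exactly why partial inference can ignore occluded detections.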
Slide 15 — Proposed method
Multiple Shape Models
• To cover various poses and expressions, we build multiple shape models.
• We build eigenvectors for the n-th pose and m-th expression.
• Given n and m, the shape is generated from the corresponding model.
Slide 16 — Proposed method
Formulation with multiple shape models
• The energy function is evaluated per shape model (n, m); the best-fitting model is selected.
Slide 17 — Proposed method
Algorithm Overview
[Input] → Face Detection → Local Feature Detection → [Hypothesis-and-test: Hypothesizing Transformation Parameters → Hypothesizing Shape Parameters → Model Hypotheses Evaluation] → [Output]
Slide 18 — Proposed method
Local Feature Detection (highlighted stage of the pipeline overview)
Slide 19 — Proposed method: Local feature detection
Local Feature Detection
• Goal: detect feature point candidates, each with a Gaussian model.
• Based on the MCT+AdaBoost algorithm [5],
• we propose a hierarchical MCT to increase detection performance.
[5] Jun and Kim, "Robust Real-Time Face Detection Using Face Certainty Map", ICB, 2007
Slide 20 — Proposed method: Local feature detection
Feature Descriptor
• Modified Census Transform (MCT)
• Over a 3×3 neighborhood with intensities I_1 ... I_9, compare each pixel with the local mean:
  M = (1/9) Σ_{x=1..9} I_x
  B_x = 1 if I_x > M, 0 otherwise
  C = Σ_{x=1..9} B_x · 2^(x−1)
• Example: for the patch
  102 105 118
  120 111 101
  123 119 109
  the mean is M = 112, and thresholding against it gives the binary pattern
  0 0 1
  1 0 0
  1 1 0
  whose bits concatenate into the MCT code C.
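The transform on the example patch can be sketched as follows; the bit ordering (top-left pixel as the least-significant bit) is an assumption of this sketch.

```python
import numpy as np

def mct(patch):
    """Modified census transform of a 3x3 patch.

    Bit order is an assumption here: B_1 (top-left) is the least-significant bit.
    """
    m = patch.mean()                          # M = (1/9) sum I_x
    bits = (patch.ravel() > m).astype(int)    # B_x = 1 if I_x > M
    return int(bits @ (2 ** np.arange(9)))    # C = sum B_x 2^(x-1)

patch = np.array([[102, 105, 118],
                  [120, 111, 101],
                  [123, 119, 109]])
code = mct(patch)   # mean is 112; pixels 118, 120, 123, 119 exceed it
```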
Slide 21 — Proposed method: Local feature detection
Feature Descriptor
• Modified Census Transform (MCT): transformed result (gray image → MCT image)
• MCT is a point feature:
  • it represents local intensity differences,
  • but it is very sensitive to noise.
Slide 22 — Proposed method: Local feature detection
Feature Descriptor
We propose the hierarchical MCT:
• a regional feature
  • that represents regional differences
  • and is robust to noise.
• The neighborhood is partitioned into regions, each region is averaged, and the MCT is computed on the region averages.
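A minimal sketch of the hierarchical variant as described (partition, average, then MCT over the regional means); the 2×2 cell size and 3×3 grid here are illustrative assumptions.

```python
import numpy as np

def hierarchical_mct(img, cell=2):
    """Average-pool the image into a 3x3 grid of cells, then apply the MCT
    to the cell means (regional rather than pointwise comparison)."""
    h = 3 * cell
    img = img[:h, :h].astype(float)
    # mean of each (cell x cell) block -> 3x3 grid of regional averages
    means = img.reshape(3, cell, 3, cell).mean(axis=(1, 3))
    m = means.mean()
    bits = (means.ravel() > m).astype(int)
    return int(bits @ (2 ** np.arange(9)))
```

Because each bit now summarizes a whole cell, a single noisy pixel rarely flips the descriptor, which is the robustness argument on this slide.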
Slide 23 — Proposed method: Local feature detection
Training procedure
• Hierarchical MCT + AdaBoost
• [Diagram] An image pyramid (window sizes 35, 25, 15, 5) is built from the input image via integral images; the MCT vectors from all levels are concatenated and fed to AdaBoost training.
Slide 24 — Proposed method: Local feature detection
Feature Response
• [Figure] AdaBoost feature responses on a training image and a test image for four feature descriptors: conventional LBP, conventional MCT, hierarchical LBP, and hierarchical MCT.
Slide 25 — Proposed method: Local feature detection
Process of local feature detection
[Input] → search region → hierarchical MCT → AdaBoost response → regressed response
How do we obtain feature point candidates?
Slide 26 — Proposed method: Local feature detection
Representation of Feature Response
• How to obtain feature point candidates?
• Local maximum points of the response within the candidate search region: each local maximum with a positive response becomes the center of a segmented region.
Slide 27 — Proposed method: Local feature detection
Representation of Feature Response
• How to obtain feature point candidates?
• We compute the distribution of each segmented region by fitting a convex quadratic function to the inverted feature response, where the k-th segmented region of the i-th feature point has its own centroid.
• This gives each feature candidate a distribution and a centroid.
• Candidates are modeled as independent Gaussian distributions, with a Kronecker-delta indicator for visibility.
Slide 28 — Proposed method: Local feature detection
Feature clustering
• A mouth corner's appearance varies according to facial expression (neutral, smile, surprise).
• Detection performance degrades when a single detector is trained for all mouth shapes and appearances.
Slide 29 — Proposed method: Local feature detection
Feature clustering
• Train a separate detector on each clustered feature.
• Run all detectors and combine their results.
Slide 30 — Proposed method: Local feature detection
Local feature detection pipeline: [Input] → [Search region] → [AdaBoost response] → [Candidates with Gaussians] → [Output of detection]
Slide 31 — Proposed method
Hypothesizing Transformation Parameters (highlighted stage of the pipeline overview)
Slide 32 — Proposed method: Hypothesizing transformation parameters
• Goal: find the best combination of local feature point candidates that represents the input image well.
• Assumption for occlusion:
  • We assume that at least half of the feature points are not occluded.
  • Let N be the total number of feature points.
  • Then N/2 feature points can be assumed visible.
Slide 33 — Proposed method: Hypothesizing transformation parameters
• Coarse-to-fine approach
  – The hypothesis space over feature-point visibility is huge.
  – Partial Inference (PI) algorithm:
    1. Transformation parameters (s, R, t) are estimated by RANSAC.
    2. Shape parameters (q) are estimated, and the transformation parameters are updated, again by RANSAC.
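Step 1 (estimating s, R, t by RANSAC) can be sketched as below. The minimal two-point similarity solve, the iteration count, and the inlier threshold are illustrative assumptions; candidates that never fall within the threshold play the role of occluded points.

```python
import numpy as np

def similarity_from_two_points(src, dst):
    # Solve dst = s R src + t exactly from two correspondences.
    v_s, v_d = src[1] - src[0], dst[1] - dst[0]
    s = np.linalg.norm(v_d) / np.linalg.norm(v_s)
    ang = np.arctan2(v_d[1], v_d[0]) - np.arctan2(v_s[1], v_s[0])
    R = np.array([[np.cos(ang), -np.sin(ang)],
                  [np.sin(ang),  np.cos(ang)]])
    return s, R, dst[0] - s * R @ src[0]

def ransac_similarity(src, dst, iters=200, thresh=3.0, seed=0):
    # Hypothesize (s, R, t) from random minimal pairs and keep the hypothesis
    # that explains the most candidates within `thresh` pixels (the inliers).
    rng = np.random.default_rng(seed)
    best, best_count = None, -1
    for _ in range(iters):
        i, j = rng.choice(len(src), size=2, replace=False)
        s, R, t = similarity_from_two_points(src[[i, j]], dst[[i, j]])
        proj = s * src @ R.T + t
        inliers = np.linalg.norm(proj - dst, axis=1) < thresh
        if inliers.sum() > best_count:
            best, best_count = (s, R, t, inliers), int(inliers.sum())
    return best
```

The returned inlier mask is what the visibility indicator v then records: points outside the consensus are treated as occluded and left to be hallucinated by the shape model.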
Slide 35 — Proposed method
Hypothesizing Shape Parameters (highlighted stage of the pipeline overview)
Slide 36 — Proposed method: Hypothesizing shape parameters
• From the selected feature points, we calculate the parameters p in closed form.
• A visibility indicator marks which points contribute.
• The selected candidates' Gaussian parameters (centroids and distributions) enter the closed-form solution.
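The slides' closed form also weights points by the candidates' Gaussian parameters; as a simplified sketch under that caveat, here is an unweighted least-squares solve for q over the visible points only, with the similarity transform held fixed (array layouts as assumed earlier: mean shape (N, 2), eigenvectors (2N, K)).

```python
import numpy as np

def fit_shape_params(y, vis, mean_shape, Phi, s, R, t):
    """Closed-form shape parameters q from the visible points only,
    with the similarity transform (s, R, t) held fixed."""
    target = ((y - t) @ R) / s                 # back-project: (1/s) R^T (y_i - t)
    resid = (target - mean_shape)[vis].ravel() # deviation from the mean shape
    K = Phi.shape[1]
    A = Phi.reshape(len(mean_shape), 2, K)[vis].reshape(-1, K)
    q, *_ = np.linalg.lstsq(A, resid, rcond=None)
    return q
```

Invisible points simply drop out of the system; as long as the visible rows of Phi keep full rank, q is still recovered, which is how occluded landmarks get hallucinated from the visible ones.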
Slide 38 — Proposed method
Hypothesizing for all poses and expressions
• Run the two hypothesizing steps for every shape model (each face pose and expression).
Slide 39 — Proposed method
Model Hypotheses Evaluation (highlighted stage of the pipeline overview)
Slide 40 — Proposed method
Model Hypotheses Evaluation
• We select the best pose and expression among all the hypotheses.
• The hypothesis error is the mean error of the inliers (E) divided by the number of inliers (v).

  Num. of inliers  | 54     | 52   | 43   | 40
  Error of inliers | 2.9755 | 3.23 | 3.37 | 2.95
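The selection rule applied to the four hypotheses in the table can be sketched as:

```python
# Score each model hypothesis by mean inlier error / number of inliers;
# more inliers and lower error both make a hypothesis better.
num_inliers = [54, 52, 43, 40]
inlier_error = [2.9755, 3.23, 3.37, 2.95]
scores = [e / v for e, v in zip(inlier_error, num_inliers)]
best = min(range(len(scores)), key=scores.__getitem__)
```

Note that the last hypothesis has the lowest raw error (2.95) but loses anyway: dividing by the inlier count penalizes hypotheses that explain few points.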
Slide 43 — Experimental results
Training database
• CMU Multi-PIE [7]
  • Various poses, expressions, and illuminations
  • We used 10,948 of the 750,000 images.
• 5 pose models
  • 0°, 15°–30°, 30°–45° (70 feature points)
  • 60°–75° and 75°–90° (40 feature points)
• 2 expression models
  • neutral and smile
  • surprise
[7] Ralph et al., "Guide to the CMU Multi-PIE database", Technical report, CMU, 2007
Slide 44 — Experimental results
Test databases
• AR DB [8]
  • Occlusion (sunglasses and scarf)
• CMU Multi-PIE
  • Various poses, expressions, and illuminations
  • Used for artificial occlusion
• LFPW (Labeled Face Parts in the Wild) [9]
  • Various poses, expressions, illuminations, and partial occlusions
  • 29 feature points
  • Used to compare our algorithm with a state-of-the-art method
[8] A.M. Martinez and R. Benavente, "The AR Face Database", CVC Technical Report #24, June 1998
[9] P. Belhumeur et al., "Localizing parts of faces using a consensus of exemplars", IEEE CVPR, 2011
Slide 45 — Experimental results
Alignment Accuracy
• Normalized error
  • Euclidean distance between an aligned feature and the ground truth, divided by the face size.
  • Example: a normalized error of 0.01 on a 100-pixel face means the aligned feature is only one pixel from the ground truth.
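The metric in the worked example, as a one-liner:

```python
import math

def normalized_error(pred, truth, face_size):
    """Euclidean distance between an aligned point and the ground truth,
    divided by the face size."""
    return math.dist(pred, truth) / face_size

# A feature one pixel off on a 100-pixel face gives 0.01.
err = normalized_error((50.0, 60.0), (50.0, 61.0), 100.0)
```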
Slide 46 — Experimental results
AR database
• Test result (60 images)
Slide 47 — Experimental results
AR database
• Normalized mean error by occlusion type, and cumulative error curves

  Occlusion type | Normalized mean error
  Non-occluded   | 0.0226
  Scarf          | 0.0258
  Sunglasses     | 0.0338
Slide 48 — Experimental results
CMU Multi-PIE Database
• Test for pose (321 images)
Slide 49 — Experimental results
CMU Multi-PIE Database
• Normalized mean error by pose, and cumulative error curves

  Pose | Error     Pose | Error
  0°   | 0.0263    60°  | 0.0352
  15°  | 0.0253    75°  | 0.0336
  30°  | 0.0273    90°  | 0.0368
  45°  | 0.0267

* 60°–90° performs slightly worse than 0°–45°: a large portion of the facial features is covered by hair, so the number of visible feature points detected is too small to hallucinate the correct facial shape.
Slide 50 — Experimental results
CMU Multi-PIE Database
• Test for artificial occlusion
  • The face area is divided into a 5-by-5 grid.
  • Among the 25 regions, 1 to 15 regions are selected randomly and filled with black.
  • From 8 occluded regions onward, the occlusion covers over 50% of the feature points.
  • 2,100 images
Slide 51 — Experimental results
CMU Multi-PIE Database
• Test result
Slide 52 — Experimental results
CMU Multi-PIE Database
• Normalized error by pose
  • For the profile (60°–90°) views, even small occlusions hurt alignment badly because there are fewer strong features such as eyes, mouth, and nostrils.
  • With respect to the mean error, however, the proposed method stays stable up to 7 occluded regions, which corresponds to nearly 50% occlusion.
Slide 53 — Experimental results
LFPW database
• Mean error over inter-ocular distance for 21 feature points
• 240 of the 300 images
* P. Belhumeur et al., "Localizing parts of faces using a consensus of exemplars", IEEE CVPR, 2011
Slide 55 — Conclusion
• We proposed a pose- and occlusion-robust face alignment method.
  • To solve the pose problem, we used multiple shape models.
  • To solve the occlusion problem, we proposed the partial inference (PI) algorithm.
    • We explicitly determine which parts are occluded.
  • We proposed hierarchical MCT + AdaBoost as the local feature detector to improve detection performance.
Slide 57 — Future work
• We will combine the generative approach (Active Appearance Model) with the discriminative approach (local feature detector).
• Current facial feature tracking:
  • AAM with temporal matching, template update, and motion estimation.
Slide 58 — Future work
• Problem in facial feature tracking: the drift problem.
• [Diagram] The AAM iteratively updates its parameters by minimizing the appearance error, arg min over (p, α) of E_AAM(I_n, A, p, α); p and α are updated until a stopping condition, yielding the aligned shape x = x_0 + Σ_i p_i s_i.
Slide 59 — Future work
• Using the local feature detection result,
  • we can constrain the feature points aligned by the AAM toward the local feature detector's output.
Slide 60 — Future work
• [Diagram] Point-constrained AAM fitting on input I_n: the local feature detector supplies selected feature points (x_1, y_1) ... (x_n, y_n); a point error term E_pts is added to the appearance error in arg min over (p, α) of E_AAM(I_n, A, p, α), and p and α are updated until the stopping condition, yielding x = x_0 + Σ_i p_i s_i.
Slide 61 — Future work
• Using the local feature detection result, we can build a validation matrix of the AAM for robust fitting.
• After alignment:
  • We run the feature detector on the aligned feature points.
  • We determine whether each point is occluded or not.
  • Based on this feature-occlusion information, we build the validation matrix of the AAM for robust fitting.
  • The validation matrix is used for robust AAM fitting on the next input image.
Slide 62 — Future work
• [Diagram] After the point-constrained fit of frame I_n, an occlusion decision is made per point (x_1 positive, x_2 negative, ..., x_n positive) and stored in the validation matrix.
Slide 63 — Future work
• [Diagram] For the next frame I_{n+1}, the validation matrix weights the appearance error (robust appearance error) inside the point-constrained AAM fitting, so occluded points from the previous frame no longer corrupt the fit.