This presentation proposes a pose- and occlusion-robust face alignment method using multiple shape models and partial inference. Shapes are represented with point distribution models, and multiple shape models handle various poses and expressions. Local features are detected hierarchically using a modified census transform and AdaBoost; transformation and shape parameters are then hypothesized via partial inference to estimate both visible and invisible features. Experimental results on public databases show that the method achieves accurate alignment across poses, expressions, and occlusions.
Slide 4 — Introduction
What is face alignment?
• Face alignment extracts facial feature points (eyebrow, eye, nose, mouth, and chin) from a given image.
* "The POSTECH Face Database (PF07) and Performance Evaluation", FG 2008
Slide 5 — Introduction
Why is it important?
• Face alignment is a prerequisite for many face-related problems, e.g. face recognition, facial expression recognition (angry, happy, surprise, neutral), and head pose estimation (-25° to +25°).
Slide 8 — Previous work
Two approaches
• 1. Discriminative approach
  • Active Shape Model: the shape parameters are iteratively updated by locally finding the best nearby match for each feature point.
• 2. Generative approach
  • Active Appearance Model: the shape parameters are iteratively updated by minimizing the error between the appearance instance and the input image.
Slide 9 — Previous work
1. Discriminative approach
• Constrained Local Model [1]: feature detector — linear SVM; alignment algorithm — mean-shifts.
• Bayesian Tangent Shape Model [2]: feature detector — gradient along the normal vector; alignment algorithm — Bayesian inference.
• Both assume that all feature points are visible.
• Wrongly detected feature points cause alignment to fail.
[1] Jason et al., "Face Alignment through Subspace Constrained Mean-Shifts", ICCV 2009
[2] Yi et al., "Bayesian Tangent Shape Model: Estimating Shape and Pose Parameters via Bayesian Inference", CVPR 2003
Slide 10 — Previous work
2. Generative approach
• Boosted Appearance Model [3]: appearance model — Haar-like features and boosting; weak classifiers discriminate aligned from misaligned images.
• Fourier Active Appearance Model [4]: appearance model — Fourier-transformed appearance; alignment algorithm — gradient descent.
• Due to the high-dimensional solution space, these methods have many local minima.
• They need good initialization, e.g. by eye detection.
[3] Xiaoming Liu, "Generic Face Alignment using Boosted Appearance Model", CVPR 2007
[4] Rajitha et al., "Fourier Active Appearance Models", ICCV 2011
Slide 12 — Proposed method
Motivation
• We follow the discriminative approach.
• Determine whether each feature point is visible or not.
• Only visible feature points are involved in the alignment step.
• Invisible feature points are estimated from the visible ones using the partial inference (PI) algorithm.
• Using multiple shape models, we handle the pose problem.
We propose pose- and occlusion-robust face alignment!
[Figure: feature points labeled visible / invisible]
Slide 13 — Proposed method
Shape Representation
• Point Distribution Model
• A non-rigid shape is represented by a linear combination of shape bases added to the mean shape, followed by a similarity transform:
  x = s R (x̄ + Φq) + t
  where x̄ is the mean shape, Φ the shape eigenvectors, q the shape parameter, s the scale, R the rotation, and t the translation (x, y).
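The point distribution model above can be sketched in NumPy. The array layouts (mean shape as an (N, 2) array, eigenvectors stacked x,y per point in a (2N, K) matrix) are illustrative assumptions, not the slides' exact conventions.

```python
import numpy as np

def generate_shape(mean_shape, Phi, q, s, theta, t):
    """Instance a PDM shape: x = s * R(theta) @ (mean + Phi q) + t.

    mean_shape: (N, 2) mean landmark positions
    Phi:        (2N, K) shape eigenvectors, x and y stacked per point
    q:          (K,) shape parameters
    """
    deform = mean_shape + (Phi @ q).reshape(-1, 2)   # non-rigid part
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return s * deform @ R.T + t                       # similarity transform

# With q = 0 and the identity transform, the instance is the mean shape.
mean = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
Phi = np.zeros((6, 2))
x = generate_shape(mean, Phi, np.zeros(2), 1.0, 0.0, np.zeros(2))
```

Scaling by s, rotating by theta, or adding a nonzero q deforms this base shape while keeping the same landmark ordering.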
Slide 14 — Proposed method
Formulation
• Shape model with parameter p = {s, R, q, t}
• Energy function: a sum over the N local features of per-point alignment errors, gated by a visibility indicator v_i that denotes whether the i-th feature point is aligned (visible) or not.
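The exact energy expression is not recoverable from this transcript; one plausible instantiation, consistent with the PDM on the previous slide, sums squared distances between detected points y_i and the model's predicted points, with occluded points excluded by v_i:

```python
import numpy as np

def alignment_energy(y, v, mean_shape, Phi, s, theta, q, t):
    # Assumed form: E(p) = sum_i v_i * ||y_i - (s R (mean_i + (Phi q)_i) + t)||^2
    # where v_i = 0 removes occluded points from the sum.
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    model = s * (mean_shape + (Phi @ q).reshape(-1, 2)) @ R.T + t
    return float((v * ((y - model) ** 2).sum(axis=1)).sum())

mean = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
Phi = np.zeros((6, 1))
y = mean.copy()
y[2] += 5.0   # a badly detected (occluded) point
```

Marking the bad point invisible (v = [1, 1, 0]) zeroes its contribution, which is exactly why partial inference can ignore occluded detections.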
Slide 15 — Proposed method
Multiple Shape Models
• To cover various poses and expressions, we build multiple shape models.
• We build eigenvectors for the n-th pose and m-th expression.
• Given n and m, the shape is generated from the corresponding model.
Slide 16 — Proposed method
Formulation with multiple shape models
• The energy function is evaluated per shape model (n, m); the best-fitting model is selected.
Slide 17 — Proposed method
Algorithm Overview
[Input] → Face Detection → Local Feature Detection → [Hypothesis-and-test: Hypothesizing Transformation Parameters → Hypothesizing Shape Parameters → Model Hypotheses Evaluation] → [Output]
Slide 18 — Proposed method
Local Feature Detection (highlighted stage of the pipeline overview)
Slide 19 — Proposed method: Local feature detection
Local Feature Detection
• Goal: detect feature point candidates, each with a Gaussian model.
• Based on the MCT+AdaBoost algorithm [5],
• we propose a hierarchical MCT to increase detection performance.
[5] Jun and Kim, "Robust Real-Time Face Detection Using Face Certainty Map", ICB, 2007
Slide 20 — Proposed method: Local feature detection
Feature Descriptor
• Modified Census Transform (MCT)
• Over a 3×3 neighborhood with intensities I_1 ... I_9, compare each pixel with the local mean:
  M = (1/9) Σ_{x=1..9} I_x
  B_x = 1 if I_x > M, 0 otherwise
  C = Σ_{x=1..9} B_x · 2^(x−1)
• Example: for the patch
  102 105 118
  120 111 101
  123 119 109
  the mean is M = 112, and thresholding against it gives the binary pattern
  0 0 1
  1 0 0
  1 1 0
  whose bits concatenate into the MCT code C.
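The transform on the example patch can be sketched as follows; the bit ordering (top-left pixel as the least-significant bit) is an assumption of this sketch.

```python
import numpy as np

def mct(patch):
    """Modified census transform of a 3x3 patch.

    Bit order is an assumption here: B_1 (top-left) is the least-significant bit.
    """
    m = patch.mean()                          # M = (1/9) sum I_x
    bits = (patch.ravel() > m).astype(int)    # B_x = 1 if I_x > M
    return int(bits @ (2 ** np.arange(9)))    # C = sum B_x 2^(x-1)

patch = np.array([[102, 105, 118],
                  [120, 111, 101],
                  [123, 119, 109]])
code = mct(patch)   # mean is 112; pixels 118, 120, 123, 119 exceed it
```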
Slide 21 — Proposed method: Local feature detection
Feature Descriptor
• Modified Census Transform (MCT): transformed result (gray image → MCT image)
• MCT is a point feature:
  • it represents local intensity differences,
  • but it is very sensitive to noise.
Slide 22 — Proposed method: Local feature detection
Feature Descriptor
We propose the hierarchical MCT:
• a regional feature
  • that represents regional differences
  • and is robust to noise.
• The neighborhood is partitioned into regions, each region is averaged, and the MCT is computed on the region averages.
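A minimal sketch of the hierarchical variant as described (partition, average, then MCT over the regional means); the 2×2 cell size and 3×3 grid here are illustrative assumptions.

```python
import numpy as np

def hierarchical_mct(img, cell=2):
    """Average-pool the image into a 3x3 grid of cells, then apply the MCT
    to the cell means (regional rather than pointwise comparison)."""
    h = 3 * cell
    img = img[:h, :h].astype(float)
    # mean of each (cell x cell) block -> 3x3 grid of regional averages
    means = img.reshape(3, cell, 3, cell).mean(axis=(1, 3))
    m = means.mean()
    bits = (means.ravel() > m).astype(int)
    return int(bits @ (2 ** np.arange(9)))
```

Because each bit now summarizes a whole cell, a single noisy pixel rarely flips the descriptor, which is the robustness argument on this slide.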
Slide 23 — Proposed method: Local feature detection
Training procedure
• Hierarchical MCT + AdaBoost
• [Diagram] An image pyramid (window sizes 35, 25, 15, 5) is built from the input image via integral images; the MCT vectors from all levels are concatenated and fed to AdaBoost training.
Slide 24 — Proposed method: Local feature detection
Feature Response
• [Figure] AdaBoost feature responses on a training image and a test image for four feature descriptors: conventional LBP, conventional MCT, hierarchical LBP, and hierarchical MCT.
Slide 25 — Proposed method: Local feature detection
Process of local feature detection
[Input] → search region → hierarchical MCT → AdaBoost response → regressed response
How do we obtain feature point candidates?
Slide 26 — Proposed method: Local feature detection
Representation of Feature Response
• How to obtain feature point candidates?
• Local maximum points of the response within the candidate search region: each local maximum with a positive response becomes the center of a segmented region.
Slide 27 — Proposed method: Local feature detection
Representation of Feature Response
• How to obtain feature point candidates?
• We compute the distribution of each segmented region by fitting a convex quadratic function to the inverted feature response, where the k-th segmented region of the i-th feature point has its own centroid.
• This gives each feature candidate a distribution and a centroid.
• Candidates are modeled as independent Gaussian distributions, with a Kronecker-delta indicator for visibility.
Slide 28 — Proposed method: Local feature detection
Feature clustering
• A mouth corner's appearance varies according to facial expression (neutral, smile, surprise).
• Detection performance degrades when a single detector is trained for all mouth shapes and appearances.
Slide 29 — Proposed method: Local feature detection
Feature clustering
• Train a separate detector on each clustered feature.
• Run all detectors and combine their results.
Slide 30 — Proposed method: Local feature detection
Local feature detection pipeline: [Input] → [Search region] → [AdaBoost response] → [Candidates with Gaussians] → [Output of detection]
Slide 31 — Proposed method
Hypothesizing Transformation Parameters (highlighted stage of the pipeline overview)
Slide 32 — Proposed method: Hypothesizing transformation parameters
• Goal: find the best combination of local feature point candidates that represents the input image well.
• Assumption for occlusion:
  • We assume that at least half of the feature points are not occluded.
  • Let N be the total number of feature points.
  • Then N/2 feature points can be assumed visible.
Slide 33 — Proposed method: Hypothesizing transformation parameters
• Coarse-to-fine approach
  – The hypothesis space over feature-point visibility is huge.
  – Partial Inference (PI) algorithm:
    1. Transformation parameters (s, R, t) are estimated by RANSAC.
    2. Shape parameters (q) are estimated, and the transformation parameters are updated, again by RANSAC.
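Step 1 (estimating s, R, t by RANSAC) can be sketched as below. The minimal two-point similarity solve, the iteration count, and the inlier threshold are illustrative assumptions; candidates that never fall within the threshold play the role of occluded points.

```python
import numpy as np

def similarity_from_two_points(src, dst):
    # Solve dst = s R src + t exactly from two correspondences.
    v_s, v_d = src[1] - src[0], dst[1] - dst[0]
    s = np.linalg.norm(v_d) / np.linalg.norm(v_s)
    ang = np.arctan2(v_d[1], v_d[0]) - np.arctan2(v_s[1], v_s[0])
    R = np.array([[np.cos(ang), -np.sin(ang)],
                  [np.sin(ang),  np.cos(ang)]])
    return s, R, dst[0] - s * R @ src[0]

def ransac_similarity(src, dst, iters=200, thresh=3.0, seed=0):
    # Hypothesize (s, R, t) from random minimal pairs and keep the hypothesis
    # that explains the most candidates within `thresh` pixels (the inliers).
    rng = np.random.default_rng(seed)
    best, best_count = None, -1
    for _ in range(iters):
        i, j = rng.choice(len(src), size=2, replace=False)
        s, R, t = similarity_from_two_points(src[[i, j]], dst[[i, j]])
        proj = s * src @ R.T + t
        inliers = np.linalg.norm(proj - dst, axis=1) < thresh
        if inliers.sum() > best_count:
            best, best_count = (s, R, t, inliers), int(inliers.sum())
    return best
```

The returned inlier mask is what the visibility indicator v then records: points outside the consensus are treated as occluded and left to be hallucinated by the shape model.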
Slide 35 — Proposed method
Hypothesizing Shape Parameters (highlighted stage of the pipeline overview)
Slide 36 — Proposed method: Hypothesizing shape parameters
• From the selected feature points, we calculate the parameters p in closed form.
• A visibility indicator marks which points contribute.
• The selected candidates' Gaussian parameters (centroids and distributions) enter the closed-form solution.
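The slides' closed form also weights points by the candidates' Gaussian parameters; as a simplified sketch under that caveat, here is an unweighted least-squares solve for q over the visible points only, with the similarity transform held fixed (array layouts as assumed earlier: mean shape (N, 2), eigenvectors (2N, K)).

```python
import numpy as np

def fit_shape_params(y, vis, mean_shape, Phi, s, R, t):
    """Closed-form shape parameters q from the visible points only,
    with the similarity transform (s, R, t) held fixed."""
    target = ((y - t) @ R) / s                 # back-project: (1/s) R^T (y_i - t)
    resid = (target - mean_shape)[vis].ravel() # deviation from the mean shape
    K = Phi.shape[1]
    A = Phi.reshape(len(mean_shape), 2, K)[vis].reshape(-1, K)
    q, *_ = np.linalg.lstsq(A, resid, rcond=None)
    return q
```

Invisible points simply drop out of the system; as long as the visible rows of Phi keep full rank, q is still recovered, which is how occluded landmarks get hallucinated from the visible ones.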
Slide 38 — Proposed method
Hypothesizing for all poses and expressions
• Run the two hypothesizing steps for every shape model (each face pose and expression).
Slide 39 — Proposed method
Model Hypotheses Evaluation (highlighted stage of the pipeline overview)
Slide 40 — Proposed method
Model Hypotheses Evaluation
• We select the best pose and expression among all the hypotheses.
• The hypothesis error is the mean error of the inliers (E) divided by the number of inliers (v).

  Num. of inliers  | 54     | 52   | 43   | 40
  Error of inliers | 2.9755 | 3.23 | 3.37 | 2.95
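The selection rule applied to the four hypotheses in the table can be sketched as:

```python
# Score each model hypothesis by mean inlier error / number of inliers;
# more inliers and lower error both make a hypothesis better.
num_inliers = [54, 52, 43, 40]
inlier_error = [2.9755, 3.23, 3.37, 2.95]
scores = [e / v for e, v in zip(inlier_error, num_inliers)]
best = min(range(len(scores)), key=scores.__getitem__)
```

Note that the last hypothesis has the lowest raw error (2.95) but loses anyway: dividing by the inlier count penalizes hypotheses that explain few points.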
Slide 43 — Experimental results
Training database
• CMU Multi-PIE [7]
  • Various poses, expressions, and illuminations
  • We used 10,948 of the 750,000 images.
• 5 pose models
  • 0°, 15°–30°, 30°–45° (70 feature points)
  • 60°–75° and 75°–90° (40 feature points)
• 2 expression models
  • neutral and smile
  • surprise
[7] Ralph et al., "Guide to the CMU Multi-PIE database", Technical report, CMU, 2007
Slide 44 — Experimental results
Test databases
• AR DB [8]
  • Occlusion (sunglasses and scarf)
• CMU Multi-PIE
  • Various poses, expressions, and illuminations
  • Used for artificial occlusion
• LFPW (Labeled Face Parts in the Wild) [9]
  • Various poses, expressions, illuminations, and partial occlusions
  • 29 feature points
  • Used to compare our algorithm with a state-of-the-art method
[8] A.M. Martinez and R. Benavente, "The AR Face Database", CVC Technical Report #24, June 1998
[9] P. Belhumeur et al., "Localizing parts of faces using a consensus of exemplars", IEEE CVPR, 2011
Slide 45 — Experimental results
Alignment Accuracy
• Normalized error
  • Euclidean distance between an aligned feature and the ground truth, divided by the face size.
  • Example: a normalized error of 0.01 on a 100-pixel face means the aligned feature is only one pixel from the ground truth.
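The metric in the worked example, as a one-liner:

```python
import math

def normalized_error(pred, truth, face_size):
    """Euclidean distance between an aligned point and the ground truth,
    divided by the face size."""
    return math.dist(pred, truth) / face_size

# A feature one pixel off on a 100-pixel face gives 0.01.
err = normalized_error((50.0, 60.0), (50.0, 61.0), 100.0)
```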
Slide 46 — Experimental results
AR database
• Test result (60 images)
Slide 47 — Experimental results
AR database
• Normalized mean error by occlusion type, and cumulative error curves

  Occlusion type | Normalized mean error
  Non-occluded   | 0.0226
  Scarf          | 0.0258
  Sunglasses     | 0.0338
Slide 48 — Experimental results
CMU Multi-PIE Database
• Test for pose (321 images)
Slide 49 — Experimental results
CMU Multi-PIE Database
• Normalized mean error by pose, and cumulative error curves

  Pose | Error     Pose | Error
  0°   | 0.0263    60°  | 0.0352
  15°  | 0.0253    75°  | 0.0336
  30°  | 0.0273    90°  | 0.0368
  45°  | 0.0267

* 60°–90° performs slightly worse than 0°–45°: a large portion of the facial features is covered by hair, so the number of visible feature points detected is too small to hallucinate the correct facial shape.
Slide 50 — Experimental results
CMU Multi-PIE Database
• Test for artificial occlusion
  • The face area is divided into a 5-by-5 grid.
  • Among the 25 regions, 1 to 15 regions are selected randomly and filled with black.
  • From 8 occluded regions onward, the occlusion covers over 50% of the feature points.
  • 2,100 images
Slide 51 — Experimental results
CMU Multi-PIE Database
• Test result
Slide 52 — Experimental results
CMU Multi-PIE Database
• Normalized error by pose
  • For the profile (60°–90°) views, even small occlusions hurt alignment badly because there are fewer strong features such as eyes, mouth, and nostrils.
  • With respect to the mean error, however, the proposed method stays stable up to 7 occluded regions, which corresponds to nearly 50% occlusion.
Slide 53 — Experimental results
LFPW database
• Mean error over inter-ocular distance for 21 feature points
• 240 of the 300 images
* P. Belhumeur et al., "Localizing parts of faces using a consensus of exemplars", IEEE CVPR, 2011
Slide 55 — Conclusion
• We proposed a pose- and occlusion-robust face alignment method.
  • To solve the pose problem, we used multiple shape models.
  • To solve the occlusion problem, we proposed the partial inference (PI) algorithm.
    • We explicitly determine which parts are occluded.
  • We proposed hierarchical MCT + AdaBoost as the local feature detector to improve detection performance.
Slide 57 — Future work
• We will combine the generative approach (Active Appearance Model) with the discriminative approach (local feature detector).
• Current facial feature tracking:
  • AAM with temporal matching, template update, and motion estimation.
Slide 58 — Future work
• Problem in facial feature tracking: the drift problem.
• [Diagram] The AAM iteratively updates its parameters by minimizing the appearance error, arg min over (p, α) of E_AAM(I_n, A, p, α); p and α are updated until a stopping condition, yielding the aligned shape x = x_0 + Σ_i p_i s_i.
Slide 59 — Future work
• Using the local feature detection result,
  • we can constrain the feature points aligned by the AAM toward the local feature detector's output.
Slide 60 — Future work
• [Diagram] Point-constrained AAM fitting on input I_n: the local feature detector supplies selected feature points (x_1, y_1) ... (x_n, y_n); a point error term E_pts is added to the appearance error in arg min over (p, α) of E_AAM(I_n, A, p, α), and p and α are updated until the stopping condition, yielding x = x_0 + Σ_i p_i s_i.
Slide 61 — Future work
• Using the local feature detection result, we can build a validation matrix of the AAM for robust fitting.
• After alignment:
  • We run the feature detector on the aligned feature points.
  • We determine whether each point is occluded or not.
  • Based on this feature-occlusion information, we build the validation matrix of the AAM for robust fitting.
  • The validation matrix is used for robust AAM fitting on the next input image.
Slide 62 — Future work
• [Diagram] After the point-constrained fit of frame I_n, an occlusion decision is made per point (x_1 positive, x_2 negative, ..., x_n positive) and stored in the validation matrix.
Slide 63 — Future work
• [Diagram] For the next frame I_{n+1}, the validation matrix weights the appearance error (robust appearance error) inside the point-constrained AAM fitting, so occluded points from the previous frame no longer corrupt the fit.