All pose face alignment robust to occlusion

My PhD proposal


  1. Pose & Occlusion Robust Face Alignment Using Multiple Shape Models and Partial Inference
     PhD Thesis Proposal. Jongju Shin [jjshin@postech.ac.kr]. Advisor: Daijin Kim. 2013.01.03. I.M. Lab., Dept. of CSE
  2. Outline
     • Introduction
     • Previous Work
     • Proposed Method
       • Shape Representation
       • Formulation
       • Multiple Shape Models
       • Local Feature Detection
       • Hypothesizing Transformation Parameters
       • Hypothesizing Shape Parameters
       • Model Hypotheses Evaluation
     • Experimental Results
     • Conclusion
     • Future Work
  3. INTRODUCTION
  4. Introduction: What is face alignment?
     • Face alignment extracts facial feature points (eyebrows, eyes, nose, mouth, and chin) from a given image.
     * "The POSTECH Face Database (PF07) and Performance Evaluation", FG 2008
  5. Introduction: Why is it important?
     • Face alignment is a prerequisite for many face-related problems.
     • Examples: face recognition, facial expression recognition (angry, happy, surprise, neutral), and head pose estimation (-25°, 0°, +25°).
  6. Introduction: Challenges
     • Illumination, pose, expression, and occlusion
  7. PREVIOUS WORK
  8. Previous work: Two approaches
     • 1. Discriminative approach
       • Active Shape Model
       • The shape parameters are iteratively updated by locally finding the best nearby match for each feature point.
     • 2. Generative approach
       • Active Appearance Model
       • The shape parameters are iteratively updated by minimizing the error between the appearance instance and the input image.
  9. Previous work: 1. Discriminative approach
     • Constrained Local Model [1]
       • Feature detector: linear SVM
       • Alignment algorithm: mean-shifts
     • Bayesian Tangent Shape Model [2]
       • Feature detector: gradient along the normal vector
       • Alignment algorithm: Bayesian inference
     • Both assume that all the feature points are visible.
     • Wrongly detected feature points cause the alignment to fail.
     [1] J. Saragih et al., "Face Alignment through Subspace Constrained Mean-Shifts", ICCV 2009
     [2] Y. Zhou et al., "Bayesian Tangent Shape Model: Estimating Shape and Pose Parameters via Bayesian Inference", CVPR 2003
  10. Previous work: 2. Generative approach
     • Boosted Appearance Model [3]
       • Appearance model: Haar-like features and boosting
       • Weak classifier: discriminates aligned images from not-aligned images
     • Fourier Active Appearance Model [4]
       • Appearance model: Fourier-transformed appearance
       • Alignment algorithm: gradient descent
     • Because the solution space is high dimensional, it has a large number of local minima.
     • They need good initialization by eye detection.
     [3] X. Liu, "Generic Face Alignment using Boosted Appearance Model", CVPR 2007
     [4] R. Navarathna et al., "Fourier Active Appearance Models", ICCV 2011
  11. PROPOSED METHOD
  12. Proposed method: Motivation
     • We follow the discriminative approach.
       • Determine whether each feature point is visible or not.
       • Only visible feature points are involved in the alignment step.
       • Invisible feature points are estimated from the visible feature points using the partial inference (PI) algorithm.
     • Using multiple shape models, we solve the pose problem.
     • We propose pose and occlusion robust face alignment. (Figure labels: visible, invisible.)
  13. Proposed method: Shape Representation
     • Point Distribution Model
     • The non-rigid shape is represented by a linear combination of shape bases added to the mean shape.
     • Symbols defined on the slide: mean shape, eigenvectors (shape bases), shape parameter, scale, rotation, and translation (x, y).
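The slide's own equation did not survive in this transcript. As a hedged sketch, a Point Distribution Model with the listed ingredients is commonly written in the form below. The symbols $\bar{x}_i$ (mean position of the i-th point) and $\Phi_i$ (rows of the eigenvector matrix for the i-th point) are notation assumed here, while $s$, $R$, $q$, and $t$ follow the parameter set p = {s, R, q, t} named on the Formulation slide.

```latex
x_i = s\,R\,\bigl(\bar{x}_i + \Phi_i\, q\bigr) + t
```

Under this reading, $q$ deforms the mean shape non-rigidly, and the pair of scale and rotation together with the translation places the deformed shape in the image.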
  14. Proposed method: Formulation
     • Shape model with parameters p = {s, R, q, t}
     • Energy function: it includes an indicator of whether each feature point is aligned (visible) or not, and the number of local features.
  15. Proposed method: Multiple Shape Models
     • To cover various poses and expressions, we build multiple shape models.
     • We build eigenvectors for the nth pose and the mth expression.
     • Given n and m, the shape is expressed with the eigenvectors of that model.
  16. Proposed method: Formulation with multiple shape models
     • Energy function
  17. Proposed method: Algorithm Overview
     • Diagram: [Input] → Face Detection → Local Feature Detection → [Hypothesis-and-test: Hypothesizing Transformation Parameters, Hypothesizing Shape Parameters] → Model Hypotheses Evaluation → [Output]
  18. Proposed method: Local Feature Detection (algorithm overview diagram repeated)
  19. Proposed method: Local feature detection
     • Goal: detect feature point candidates with a Gaussian model.
     • Based on the MCT+Adaboost algorithm [5].
     • We propose Hierarchical MCT to increase detection performance.
     [5] Jun and Kim, "Robust Real-Time Face Detection Using Face Certainty Map", ICB, 2007
  20. Proposed method: Local feature detection: Feature Descriptor
     • Modified Census Transform (MCT) over a 3×3 window with intensities $I_1, \dots, I_9$ and bits $B_1, \dots, B_9$:
       $M = \frac{1}{9}\sum_{x=1}^{9} I_x$, $B_x = 1$ if $I_x > M$ and $B_x = 0$ otherwise, $C = \sum_{x=1}^{9} B_x \cdot 2^{x}$
     • Example values on the slide: intensities (102 105 118; 120 111 101; 123 119 109), bits (0 0 0; 1 0 0; 1 1 0), code $011100000_2 = 224$
  21. Proposed method: Local feature detection: Feature Descriptor
     • Modified Census Transform (MCT)
       • Transformed result (figure: gray image and its MCT image)
       • MCT is a point feature
       • Represents local intensity differences
       • Very sensitive to noise
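A minimal sketch of the MCT computation described on the two slides above, in plain Python. The function name `mct_code`, the strict `>` comparison, and the bit order (row-major, first pixel as most significant bit) are assumptions made for illustration; the slides do not fix the bit order unambiguously.

```python
def mct_code(patch):
    """Return the 9-bit MCT code of a 3x3 patch given as a list of 3 rows."""
    pixels = [value for row in patch for value in row]
    if len(pixels) != 9:
        raise ValueError("expected a 3x3 patch")
    mean = sum(pixels) / 9.0
    code = 0
    for value in pixels:
        # Assumed bit order: row-major, first pixel is the most significant bit.
        code = (code << 1) | (1 if value > mean else 0)
    return code


patch = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
print(format(mct_code(patch), "09b"))  # bits for pixels brighter than the mean (50)
```

Because every bit depends on a single pixel compared with the window mean, one noisy pixel can flip a bit, which is the sensitivity the slide refers to.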
  22. Proposed method: Local feature detection: Feature Descriptor
     • We propose Hierarchical MCT
       • Regional feature
       • Represents regional differences
       • Robust to noise
     • Procedure (figure): partition the region into cells $I_1, \dots, I_9$, average each cell, then apply MCT: $C = \sum_{x=1}^{9} B_x \cdot 2^{x}$
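The partition, average, and MCT steps named on this slide can be sketched as follows. This is an illustrative reading, not the author's implementation: the names `block_means` and `hierarchical_mct_code` are hypothetical, the region is assumed square with a side divisible by 3, the bit order is assumed row-major, and cell averages are computed by direct summation rather than with an integral image.

```python
def block_means(region, grid=3):
    """Average a square region over a grid x grid partition."""
    size = len(region)
    if size == 0 or size % grid != 0 or any(len(row) != size for row in region):
        raise ValueError("region must be square with a side divisible by grid")
    step = size // grid
    means = []
    for by in range(grid):
        row_means = []
        for bx in range(grid):
            total = 0
            for y in range(by * step, (by + 1) * step):
                for x in range(bx * step, (bx + 1) * step):
                    total += region[y][x]
            row_means.append(total / float(step * step))
        means.append(row_means)
    return means


def hierarchical_mct_code(region):
    """MCT applied to the 3x3 cell averages of a region (assumed bit order: row-major)."""
    cells = [m for row in block_means(region, 3) for m in row]
    mean = sum(cells) / 9.0
    code = 0
    for m in cells:
        code = (code << 1) | (1 if m > mean else 0)
    return code


# A 6x6 region: dark upper part, bright bottom band, one noisy pixel at the top left.
region = [[0] * 6 for _ in range(4)] + [[9] * 6 for _ in range(2)]
region[0][0] = 4
print(format(hierarchical_mct_code(region), "09b"))
```

In this example the noisy pixel only raises its cell average from 0 to 1, so the code is the same as for the noise-free region, which illustrates why averaging over cells is less sensitive to noise than the point-wise MCT.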
  23. Proposed method: Local feature detection: Training procedure
     • Hierarchical MCT + Adaboost
     • Diagram: input image → image pyramid by integral image (figure labels: 35, 25, 15, 5) → concatenated MCT vector → Adaboost training
  24. Proposed method: Local feature detection: Feature Response
     • Feature response by Adaboost with different feature descriptors
     • Figure: responses on a training image and a test image for conventional LBP, conventional MCT, hierarchical LBP, and hierarchical MCT
  25. Proposed method: Local feature detection: Process of local feature detection
     • Diagram: [Input] search region → Hierarchical MCT → Adaboost response → regressed response
     • How to obtain feature point candidates?
  26. Proposed method: Local feature detection: Representation of Feature Response
     • How to obtain feature point candidates?
       • Local maximum points in the candidate search region: points $x$ with $p(x) \ge p(y)$ for all $y \in \Omega_x$ and $p(x) > 0$, where $x$ is the center of $\Omega_x$
     • Figure: [Input], response, segmented region
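A small sketch of picking local maximum points from a response map, as described on this slide. The function name `local_maxima`, the square window with a `radius` parameter, and the list-of-lists response format are assumptions for illustration.

```python
def local_maxima(response, radius=1):
    """Return (row, col) positions whose positive response is the maximum of their window."""
    height = len(response)
    width = len(response[0]) if height else 0
    peaks = []
    for r in range(height):
        for c in range(width):
            value = response[r][c]
            if value <= 0:
                continue  # only positive responses can be candidates
            is_peak = True
            for rr in range(max(0, r - radius), min(height, r + radius + 1)):
                for cc in range(max(0, c - radius), min(width, c + radius + 1)):
                    if response[rr][cc] > value:
                        is_peak = False
                        break
                if not is_peak:
                    break
            if is_peak:
                # Ties count as peaks, so a flat plateau yields several candidates.
                peaks.append((r, c))
    return peaks


response = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.2, 0.0],
    [0.0, 0.2, 0.0, 0.6],
    [0.0, 0.0, 0.1, 0.0],
]
print(local_maxima(response))
```

Each returned position would then seed one segmented response region, which is the unit the next slide models with a Gaussian.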
  27. Proposed method: Local feature detection: Representation of Feature Response
     • How to obtain feature point candidates?
       • We compute the distribution of each segmented region through a convex quadratic function.
       • Symbols defined on the slide: the kth segmented region of the ith feature point, the centroid of that region, and the inverted feature response function.
       • We obtain each feature candidate's distribution and centroid.
     • Independent Gaussian distributions, with a Kronecker delta function indicating which candidate is visible.
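The slide fits a convex quadratic function to the inverted response, and the details of that fit are not preserved here. As a deliberately simpler stand-in, the sketch below summarizes a segmented region by its response-weighted centroid and 2×2 covariance, which also yields a Gaussian per candidate. The function name `gaussian_from_region` and the input format are hypothetical.

```python
def gaussian_from_region(points, weights):
    """Weighted centroid and 2x2 covariance of (x, y) points in one segmented region.

    This is a moment-based substitute for the convex quadratic fit named on the
    slide, not a reproduction of it.
    """
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    mean_x = sum(w * x for (x, _), w in zip(points, weights)) / total
    mean_y = sum(w * y for (_, y), w in zip(points, weights)) / total
    var_xx = sum(w * (x - mean_x) ** 2 for (x, _), w in zip(points, weights)) / total
    var_yy = sum(w * (y - mean_y) ** 2 for (_, y), w in zip(points, weights)) / total
    cov_xy = sum(w * (x - mean_x) * (y - mean_y)
                 for (x, y), w in zip(points, weights)) / total
    return (mean_x, mean_y), ((var_xx, cov_xy), (cov_xy, var_yy))


# Four pixels of one region with equal response values.
centroid, covariance = gaussian_from_region(
    [(0, 0), (2, 0), (0, 2), (2, 2)], [1.0, 1.0, 1.0, 1.0])
print(centroid, covariance)
```

A tight covariance marks a confidently localized candidate, while a wide one lets later stages give that candidate less weight.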
  28. Proposed method: Local feature detection: Feature clustering
     • The appearance of the mouth corner varies with facial expression.
     • Detection performance degrades when a single detector is trained for all the mouth shapes and appearances.
     • Figure: neutral, smile, surprise
  29. Proposed method: Local feature detection: Feature clustering
     • Train one detector for each feature cluster.
     • Run the detectors and combine the results.
  30. Proposed method: Local feature detection
     • Diagram: [Input] → [Search region] → [Adaboost response] → [Candidates with Gaussian] (output of detection)
  31. Proposed method: Hypothesizing Transformation Parameters (algorithm overview diagram repeated)
  32. Proposed method: Hypothesizing transformation parameters
     • Goal: find the best combination of the local feature point candidates that represents the input image well. (Figure: feature point candidates.)
     • Assumption for occlusion
       • We assume that at least half of the feature points are not occluded.
       • Let N be the total number of feature points.
       • N/2 feature points can be assumed to be visible.
  33. Proposed method: Hypothesizing transformation parameters
     • Coarse-to-fine approach
       – The hypothesis space of feature point visibility is huge.
       – Partial Inference (PI) algorithm
         • 1. Transformation parameters (s, R, t) are estimated by RANSAC.
         • 2. Shape parameters (q) are estimated, and the transformation parameters are updated, by RANSAC.
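Step 1 of the PI algorithm estimates (s, R, t) by RANSAC. The sketch below shows a generic RANSAC loop for a 2D similarity transform, not the PI algorithm itself: it assumes one detected candidate per feature point (the slides consider several candidates per point), represents points as complex numbers so that scale and rotation become a single complex factor, and uses hypothetical names (`fit_similarity`, `ransac_similarity`) and an arbitrary inlier threshold.

```python
import random


def fit_similarity(src_pair, dst_pair):
    """Similarity transform w = a * z + b from two correspondences (complex form)."""
    z1, z2 = src_pair
    w1, w2 = dst_pair
    if z1 == z2:
        return None  # degenerate sample
    a = (w1 - w2) / (z1 - z2)  # a encodes scale and rotation: a = s * exp(i * theta)
    b = w1 - a * z1            # b encodes the translation
    return a, b


def ransac_similarity(model_pts, detected_pts, threshold=2.0, iterations=200, seed=0):
    """Return ((a, b), inlier_indices) for the hypothesis with the most inliers."""
    rng = random.Random(seed)
    src = [complex(x, y) for x, y in model_pts]
    dst = [complex(x, y) for x, y in detected_pts]
    best_fit, best_inliers = None, []
    for _ in range(iterations):
        i, j = rng.sample(range(len(src)), 2)
        fit = fit_similarity((src[i], src[j]), (dst[i], dst[j]))
        if fit is None:
            continue
        a, b = fit
        inliers = [k for k in range(len(src))
                   if abs(a * src[k] + b - dst[k]) <= threshold]
        if len(inliers) > len(best_inliers):
            best_fit, best_inliers = fit, inliers
    return best_fit, best_inliers


# Mean-shape points and their detections under scale 2, rotation 90 degrees,
# translation (10, 5); detections 2 and 6 are corrupted to mimic occlusion.
model = [(0, 0), (10, 0), (20, 0), (0, 10), (10, 10), (20, 10), (0, 20), (20, 20)]
detected = [(10 - 2 * y, 5 + 2 * x) for x, y in model]
detected[2] = (100, 100)
detected[6] = (-80, -60)
(a, b), inliers = ransac_similarity(model, detected)
print(abs(a), inliers)
```

The inlier set doubles as a visibility hypothesis: points that disagree with the best transform are treated as occluded or misdetected, which matches the role of the assumption that at least half of the points are visible.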
  34. Proposed method: Hypothesizing transformation parameters
     • Algorithm 1. Partial Inference (PI) algorithm for transformation parameters (figure: PI algorithm)
  35. Proposed method: Hypothesizing Shape Parameters (algorithm overview diagram repeated)
  36. Proposed method: Hypothesizing Shape Parameters
     • From the selected feature points, we calculate the parameters p in closed form.
     • The closed form uses a visibility indicator and the Gaussian parameters of the selected candidates.
  37. Proposed method: Hypothesizing shape parameters
     • Algorithm 2. Partial Inference (PI) algorithm for shape parameters (figure: selected feature points, hallucinated shape)
  38. Proposed method: Hypothesizing for all poses and expressions
     • Run the two hypothesizing steps for all shape models (of face pose and expression).
  39. Proposed method: Model Hypotheses Evaluation (algorithm overview diagram repeated)
  40. Proposed method: Model Hypotheses Evaluation
     • We should select the best pose and expression from all the hypotheses.
     • Hypothesis error is the mean error of inliers (E) over the number of inliers (v).
     • Example (four hypotheses): number of inliers 54, 52, 43, 40; error of inliers 2.9755, 3.23, 3.37, 2.95
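A short sketch of this selection rule using the four hypotheses listed on the slide. It assumes the "error of inliers" row is E and that the score is E divided by v; the function names are hypothetical.

```python
def hypothesis_score(inlier_error, num_inliers):
    """Score = mean inlier error E divided by inlier count v (lower is better)."""
    if num_inliers <= 0:
        return float("inf")
    return inlier_error / num_inliers


def select_best(hypotheses):
    """Return the index of the hypothesis with the lowest score."""
    return min(range(len(hypotheses)),
               key=lambda i: hypothesis_score(*hypotheses[i]))


# (error of inliers, number of inliers) as listed on the slide.
hypotheses = [(2.9755, 54), (3.23, 52), (3.37, 43), (2.95, 40)]
best = select_best(hypotheses)
print(best, round(hypothesis_score(*hypotheses[best]), 4))
```

Dividing by the inlier count rewards hypotheses that explain more feature points: the fourth hypothesis has the lowest raw error, but under this reading the first one is selected because it is supported by more inliers.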
  41. Video
  42. EXPERIMENTAL RESULTS
  43. Experimental results: Training database
     • CMU Multi-PIE [7]
       • Various poses, expressions, and illumination
       • We used 10,948 images out of 750,000 images
     • 5 pose models
       • 0°, 15°~30°, 30°~45° (70 feature points)
       • 60°~75°, and 75°~90° (40 feature points)
     • 2 expression models
       • Neutral and smile
       • Surprise
     [7] R. Gross et al., "Guide to the CMU Multi-PIE database", Technical report, CMU, 2007
  44. Experimental results: Test database
     • AR DB [8]
       • Occlusion (sunglasses and scarf)
     • CMU Multi-PIE
       • Various poses, expressions, and illumination
       • Used for artificial occlusion
     • LFPW (Labeled Face Parts in the Wild) [9]
       • Various poses, expressions, illumination, and partial occlusion
       • 29 feature points
       • Used to compare our algorithm with another state-of-the-art method
     • Figure: sample images from AR DB and LFPW
     [8] A.M. Martinez and R. Benavente, "The AR Face Database", CVC Technical Report #24, June 1998
     [9] P. Belhumeur et al., "Localizing parts of faces using a consensus of exemplars", IEEE CVPR, 2011
  45. Experimental results: Alignment Accuracy
     • Normalized error
       • Euclidean distance between the aligned feature and the ground truth, relative to the face size.
       • For example, a normalized error of 0.01 on a face 100 pixels in size means the aligned feature is only one pixel from the ground truth.
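The normalized error can be sketched in a few lines. The function names and the use of a single scalar `face_size` in pixels are assumptions; the slide does not say how face size is measured.

```python
import math


def normalized_error(aligned, ground_truth, face_size):
    """Euclidean distance between two (x, y) points divided by the face size."""
    if face_size <= 0:
        raise ValueError("face_size must be positive")
    dx = aligned[0] - ground_truth[0]
    dy = aligned[1] - ground_truth[1]
    return math.hypot(dx, dy) / face_size


def normalized_mean_error(aligned_pts, ground_truth_pts, face_size):
    """Average normalized error over corresponding feature points."""
    errors = [normalized_error(a, g, face_size)
              for a, g in zip(aligned_pts, ground_truth_pts)]
    return sum(errors) / len(errors)


# The slide's example: a one-pixel offset on a 100-pixel face.
print(normalized_error((50, 51), (50, 50), 100))
```

Normalizing by face size makes errors comparable across images in which the face appears at different scales.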
  46. Experimental results: AR database
     • Test result
       • 60 images
  47. Experimental results: AR database
     • Normalized error for occlusion type
     • Cumulative error (plot)
     • Normalized mean error by occlusion type: no occlusion 0.0226; scarf 0.0258; sunglasses 0.0338
  48. Experimental results: CMU Multi-PIE Database
     • Test result
       • Test for pose
       • 321 images
  49. Experimental results: CMU Multi-PIE Database
     • Normalized mean error
     • Cumulative error for pose (plot)
     • Normalized mean error by pose: 0° 0.0263; 15° 0.0253; 30° 0.0273; 45° 0.0267; 60° 0.0352; 75° 0.0336; 90° 0.0368
     • 60°~90° is slightly worse than 0°~45°. Since a large portion of the facial features is covered by hair, the number of visible feature points detected is too small to hallucinate the correct facial shape.
  50. Experimental results: CMU Multi-PIE Database
     • Test for artificial occlusion
       • The face area is divided into a 5-by-5 grid.
       • Among the 25 regions, 1 to 15 regions are selected randomly and filled with black.
       • From 8 occluded regions onward, more than 50% of the feature points are occluded.
       • 2,100 images
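The artificial occlusion protocol can be sketched as below for a grayscale image stored as a list of rows. The function name `occlude_regions`, the image format, and the seeding are assumptions for illustration; the slide specifies only the 5-by-5 grid, the random choice of regions, and the black fill.

```python
import random


def occlude_regions(image, num_regions, grid=5, seed=None):
    """Return a copy of image with num_regions random grid cells set to 0 (black).

    Also returns the sorted indices of the chosen cells (row-major, 0..grid*grid-1).
    """
    height, width = len(image), len(image[0])
    rng = random.Random(seed)
    cells = rng.sample(range(grid * grid), num_regions)
    occluded = [row[:] for row in image]
    for cell in cells:
        gy, gx = divmod(cell, grid)
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        x0, x1 = gx * width // grid, (gx + 1) * width // grid
        for y in range(y0, y1):
            for x in range(x0, x1):
                occluded[y][x] = 0
    return occluded, sorted(cells)


# A 10x10 white face crop with 7 of the 25 regions blacked out.
face = [[255] * 10 for _ in range(10)]
occluded, cells = occlude_regions(face, 7, seed=1)
print(cells, sum(v == 0 for row in occluded for v in row))
```

Sweeping `num_regions` from 1 to 15 over the test images reproduces the occlusion levels described on the slide.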
  51. Experimental results: CMU Multi-PIE Database
     • Test result
  52. Experimental results: CMU Multi-PIE Database
     • Normalized error for pose
       • For the profile (60°~90°) view, even a small occlusion affects the alignment badly because there are fewer strong features such as eyes, mouth, and nostrils.
       • However, with respect to the mean error, the proposed method shows stable alignment up to occlusion degree 7, which is nearly 50% occlusion.
  53. Experimental results: LFPW database
     • Mean error over inter-ocular distance for 21 feature points
     • 240 of 300 images
     * P. Belhumeur et al., "Localizing parts of faces using a consensus of exemplars", IEEE CVPR, 2011
  54.
  55. Conclusion
     • We proposed a pose and occlusion robust face alignment method.
     • To solve the pose problem, we used multiple shape models.
     • To solve the occlusion problem, we proposed the partial inference (PI) algorithm.
     • We explicitly determine which part is occluded.
     • We proposed Hierarchical MCT+Adaboost as the local feature detector to improve detection performance.
  56. FUTURE WORK
  57. Future work
     • We combine the generative approach (Active Appearance Model) with the discriminative approach (local feature detector).
     • Current facial feature tracking
       • AAM with temporal matching, template update, and motion estimation
  58. Future work
     • Problem in facial feature tracking
       • Drift problem
     • Diagram (iterative update): [Input] → appearance error $\arg\min_{p,\alpha} E_{AAM}(I_n, A, p, \alpha)$ → update parameters $p \leftarrow p + \Delta p$, $\alpha \leftarrow \alpha + \Delta\alpha$, $x = x_0 + \sum_i p_i s_i$ → condition → [Output]
  59. Future work
     • Using the local feature detection result, we can constrain the feature points aligned by AAM to the local feature detector.
  60. Future work
     • Diagram (iterative update with point constraint): [Input $I_n$] → local feature detector → feature point selection → point error $E_{pts} = (x_1 - y_1, x_2 - y_2, \dots, x_n - y_n)^T$
     • The point error is combined with the appearance error $\arg\min_{p,\alpha} E_{AAM}(I_n, A, p, \alpha)$; update parameters $p \leftarrow p + \Delta p$, $\alpha \leftarrow \alpha + \Delta\alpha$, $x = x_0 + \sum_i p_i s_i$ → condition → [Output]
  61. Future work
     • Using the local feature detection result, we can make a validation matrix of AAM for robust fitting.
     • After alignment:
       • We run the feature detector on the aligned feature points.
       • We determine whether each point is occluded or not.
       • Based on the feature-occlusion information, we make the validation matrix of AAM for robust fitting.
       • The validation matrix is used for robust AAM from the next input image.
  62. Future work
     • Diagram: the pipeline of slide 60 for [Input $I_n$], extended with an occlusion decision from the local feature detector ($x_1$: pos., $x_2$: neg., …, $x_n$: pos.) that produces the validation matrix
  63. Future work
     • Diagram: the same pipeline for [Input $I_{n+1}$], where the validation matrix is applied to obtain a robust appearance error $\arg\min_{p,\alpha} E_{AAM}(I_n, A, p, \alpha)$ and the occlusion decision is updated
  64. Thank you.
