Moshe Guttmann's slides on eigenface


  1. Eigenfaces: developed in 1991 by M. Turk & A. Pentland, based on PCA. Fisherfaces: developed in 1997 by P. Belhumeur et al., based on Fisher's LDA. (Moshe Guttmann)
  2. Eigenfaces – Goal: face identification. [Figure: an input image matched against a basic face set (the face-space basis). Source: Alexander Roth - http://isl.ira.uka.de/~nickel/mmseminar04/A_Roth%20-%20Face%20Recognition.ppt]
  3. Eigenfaces – Proposition: given an input vector Z = [z_1 z_2 … z_n]^T and a face training set A = [a_1 a_2 … a_n]^T, how do we identify the face Z? Find the training face A' minimizing the distance ||Z − A'||; the A' that minimizes it is most probably the same face as Z.
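A minimal NumPy sketch of this raw nearest-face comparison (the function name `nearest_face` and the layout of one flattened face per column of `A` are illustrative choices, not from the slides):

```python
import numpy as np

def nearest_face(Z, A):
    """Index of the training face closest to Z in plain Euclidean distance.

    Z : (N,) flattened input face
    A : (N, M) training faces, one flattened face per column
    """
    dists = np.linalg.norm(A - Z[:, None], axis=0)   # ||Z - a_i|| for each column a_i
    return int(np.argmin(dists))
```

This pixel-space comparison is exactly what the following slides improve on by first reducing the dimension.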
  4. Eigenfaces – Problems: noise on the samples (both the input and the training set), so we cannot tell which coefficients are noise and which represent the face; different illumination/brightness of the image; facial expression; and what about the background?
  5. Eigenfaces – Solution: if X is an N-dimensional vector, lower the dimension of X to an n-dimensional vector (n ≪ N). Requirements for the dimension reduction: minimize the reconstruction error and minimize the correlation between the basis vectors.
  6. Eigenfaces – Dimension reduction: how to choose the basis for the new n-dimensional space? Principal Component Analysis (PCA), a.k.a. the Karhunen-Loève transform.
  7. Eigenfaces – PCA, theory and intuition: rotate the data so that its primary axes lie along the axes of the coordinate space and move it so that its center of mass lies at the origin. [Figure: a 2-D point cloud with principal axes e_1, e_2, before and after the PCA rotation.]
  8. Eigenfaces – PCA, the goal formally stated. Input: a matrix X = [x_1 | … | x_M] of size N×M, holding M samples in N-dimensional space. Look for: an N×n projection matrix W (n ≪ N) such that Y = [y_1 | … | y_M] (n×M) = W^T X and the correlation is minimized. [Figure: each x_i in X is mapped to y_i in Y by W.]
  9. Eigenfaces – PCA. Define the covariance (scatter) matrix of the input samples as Cov(x) = Σ_i (x_i − μ)(x_i − μ)^T, where μ is the sample mean. Letting X' = [x_1 − μ, …, x_M − μ], the above expression can be rewritten simply as Cov(x) = X'X'^T.
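A short sketch of this scatter-matrix computation, assuming the slide's layout of one N-dimensional sample per column of `X` (the helper name `scatter_matrix` is ours):

```python
import numpy as np

def scatter_matrix(X):
    """Covariance (scatter) matrix of the samples, as defined on the slide.

    X : (N, M) matrix with one N-dimensional sample per column.
    Returns the N x N matrix X' X'^T, where X' holds the mean-centred samples.
    """
    mu = X.mean(axis=1, keepdims=True)   # sample mean, shape (N, 1)
    Xc = X - mu                          # X' = [x_1 - mu, ..., x_M - mu]
    return Xc @ Xc.T                     # N x N scatter matrix
```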
  10. Eigenfaces – PCA. Properties of the covariance matrix: Cov is symmetric and of dimension N×N; the diagonal contains the variance of each parameter (i.e. element Cov_ii is the variance in the i-th direction); each off-diagonal element Cov_ij is the covariance between directions i and j, i.e. how correlated they are (a value of zero indicates that the two dimensions are uncorrelated).
  11. Eigenfaces – PCA cont'd. PCA goal, revised. Look for a projection matrix W such that y = [y_1 … y_n]^T = W^T [x_1 … x_N]^T and the correlation is minimized, or equivalently: Cov(y) is diagonal and trace(Cov(y)) is maximal. Note that Cov(y) can be expressed via Cov(x) and W as Cov(y) = W^T Cov(x) W.
  12. Eigenfaces – Principal Component Analysis (PCA). Selecting the optimal W: how do we find a W such that Cov(y) is diagonal and trace(Cov(y)) is maximal? Since λ_i w_i = Cov(x) w_i, choose W_opt to be the eigenvector matrix W_opt = [w_1 | … | w_n], where {w_i | i = 1, …, n} are the leading eigenvectors of Cov(x).
  13. Eigenfaces – PCA cont'd. Explaining the theory: each eigenvalue represents the total variance in its dimension. So throwing away the least significant eigenvectors in W_opt means throwing away the least significant variance information.
  14. Eigenfaces – Principal Component Analysis (PCA). To find a more convenient coordinate system one needs to: calculate the sample mean μ; subtract it from all samples x_i; calculate the covariance matrix of the resulting samples; find the set of eigenvectors of the covariance matrix; and create W_opt, the projection matrix, by taking the calculated eigenvectors as columns.
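The recipe above maps directly onto a few lines of NumPy; in the sketch below the function name `pca_basis` is illustrative. Note that for real face images, where N is the number of pixels, Turk and Pentland compute the eigenvectors of the much smaller M×M matrix X'^T X' instead; the direct N×N version is shown here only for clarity.

```python
import numpy as np

def pca_basis(X, n):
    """Mean, centring, covariance, eigenvectors, W_opt, as on the slide.

    X : (N, M) data matrix, one sample per column
    n : number of principal components to keep (n << N)
    Returns (mu, W_opt) with W_opt of shape (N, n).
    """
    mu = X.mean(axis=1, keepdims=True)
    Xc = X - mu
    cov = Xc @ Xc.T                          # N x N covariance (scatter) matrix
    vals, vecs = np.linalg.eigh(cov)         # symmetric matrix, eigenvalues ascending
    order = np.argsort(vals)[::-1][:n]       # indices of the n largest eigenvalues
    W_opt = vecs[:, order]                   # columns are the leading eigenvectors
    return mu, W_opt
```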
  15. Eigenfaces – PCA. Any point x_i can now be projected to the corresponding point y_i by y_i = W_opt^T (x_i − μ), and conversely, since the columns of W_opt are orthonormal (W_opt^T W_opt = I), x_i is recovered by W_opt y_i + μ, exactly when n = N and approximately otherwise. [Figure: projection from X to Y and back.]
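A sketch of the projection and (approximate) reconstruction, reusing the `mu` and `W_opt` produced by the previous sketch; when n < N the reconstruction is only the closest point to x_i within the face space:

```python
import numpy as np

def project(x, mu, W_opt):
    """y = W_opt^T (x - mu): coordinates of x in the reduced n-dimensional space."""
    return W_opt.T @ (x - mu)

def reconstruct(y, mu, W_opt):
    """W_opt y + mu: back-projection into the original N-dimensional space."""
    return W_opt @ y + mu
```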
  16. Eigenfaces – PCA. Data loss: sample points can still be projected via the new N×n projection matrix W_opt and reconstructed, but some information is lost. [Figure: 2-D data projected to 1-D by W_opt^T(x_i − μ) and reconstructed by W_opt y_i + μ.]
  17. Eigenfaces – PCA. Data loss: it can be shown that the mean square error between x_i and its reconstruction using only the m principal eigenvectors is given by the sum of the discarded eigenvalues, Σ_{i=m+1}^{N} λ_i.
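A quick numerical check of this claim on synthetic data (the toy dimensions and names are arbitrary): with the scatter-matrix convention of slide 9, the total squared reconstruction error over the sample equals the sum of the discarded eigenvalues; dividing both sides by M gives the per-sample mean.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, m = 10, 200, 3                             # dimension, samples, kept components
X = rng.standard_normal((N, N)) @ rng.standard_normal((N, M))

mu = X.mean(axis=1, keepdims=True)
Xc = X - mu
S = Xc @ Xc.T                                    # scatter matrix, as on slide 9
vals, vecs = np.linalg.eigh(S)
vals, vecs = vals[::-1], vecs[:, ::-1]           # sort eigenvalues descending
W = vecs[:, :m]                                  # keep the m leading eigenvectors

recon_err = np.sum((Xc - W @ (W.T @ Xc)) ** 2)   # total squared reconstruction error
print(np.isclose(recon_err, vals[m:].sum()))     # equals the sum of discarded eigenvalues
```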
  18. Eigenfaces – the real deal. Preparing the training set: get a training set of faces (each an N×1 vector); calculate μ; build the covariance matrix; get the n largest eigenvalues/eigenvectors of Cov; build W.
  19. Eigenfaces – the real deal. Identifying a new image: get the new image (N×1); calculate X' (subtract the mean); project it onto the “face space”; find the nearest neighbor to the projection among the projected training faces; the training face that minimizes this distance is most probably the same face as X.
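A sketch of this identification step, assuming `mu` and `W_opt` come from the PCA sketch above and training faces are stored one per column of `A` (the variable names are ours):

```python
import numpy as np

def identify(z, A, mu, W_opt):
    """Project the new face z and all training faces onto the face space,
    then return the index of the nearest training face.

    z : (N,) new flattened face image
    A : (N, M) training faces, one flattened face per column
    """
    y = W_opt.T @ (z.reshape(-1, 1) - mu)        # projection of the new image
    Y_train = W_opt.T @ (A - mu)                 # projections of the training faces
    dists = np.linalg.norm(Y_train - y, axis=0)
    return int(np.argmin(dists))
```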
  20. Eigenfaces – the real deal. Get a training set of faces A, B, … [Figure from Turk & Pentland – Eigenfaces for recognition.]
  21. Eigenfaces – example. Average of the training set, μ. [Figure from Turk & Pentland – Eigenfaces for recognition.]
  22. Eigenfaces – example. Seven of the eigenfaces calculated from the training set. [Figure from Turk & Pentland – Eigenfaces for recognition.]
  23. Eigenfaces – example. Face identification on a new image X. [Figure: the input image and its “face space” projection. Turk & Pentland – Eigenfaces for recognition.]
  24. Eigenfaces – example. Face identification on a new image Z. [Figure: the input image and its “face space” projection. Turk & Pentland – Eigenfaces for recognition.]
  25. Eigenfaces – experiments. Face identification statistics on the Yale Database. [From P. Belhumeur et al. – Eigenfaces vs. Fisherfaces.]
  26. Eigenfaces – problems. Problems with eigenfaces: sensitive to illumination; sensitive to head pose; sensitive to rotation, scale and translation; sensitive to different facial expressions; background interference.
  27. Eigenfaces – problems. PCA problems: PCA is not always an optimal dimensionality-reduction procedure. [Figure from http://network.ku.edu.tr/~yyemez/ecoe508/PCA_LDA.pdf]
  28. Fisherfaces – LDA. Fisher's Linear Discriminant Analysis. Purpose: separate data clusters. [Figure: a projection giving poor separation vs. one giving good separation. Source: http://www.wisdom.weizmann.ac.il/mathusers/ronen/course/spring01/Presentations/Hassner%20Zelnik-Manor%20-%20PCA.ppt]
  29. Fisherfaces. Solution: find a better dimension-reduction method. [Figure from http://network.ku.edu.tr/~yyemez/ecoe508/PCA_LDA.pdf]
  30. Fisherfaces – LDA. Solution: find a better dimension-reduction method. Two-class example: project the data onto a direction w and use the distance between the projected class means as the separation function. Goal: maximize |m_1 − m_2|. [Source: http://www.cs.huji.ac.il/course/2005/iml/handouts/class8-PCA-LDA-CCA.pdf]
  31. Fisherfaces – LDA. Solution: find a better dimension-reduction method. Two-class example, separation function. Goal, revised: maximize the Fisher criterion J(w) = (m_1 − m_2)^2 / (s_1^2 + s_2^2), the squared distance between the projected class means divided by the sum of the projected within-class scatters. [Source: http://www.cs.huji.ac.il/course/2005/iml/handouts/class8-PCA-LDA-CCA.pdf]
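For the two-class case the maximizer of J(w) has the closed form w ∝ S_W^{-1}(m_1 − m_2); a short NumPy sketch (the function name and the assumption that the within-class scatter is nonsingular are ours; the slides discuss the singular case later):

```python
import numpy as np

def fisher_direction(X1, X2):
    """Two-class Fisher discriminant direction maximizing J(w).

    X1, X2 : (N, M1) and (N, M2) arrays, one sample per column.
    Assumes the within-class scatter S_W is nonsingular.
    """
    m1 = X1.mean(axis=1)
    m2 = X2.mean(axis=1)
    D1 = X1 - m1[:, None]
    D2 = X2 - m2[:, None]
    S_W = D1 @ D1.T + D2 @ D2.T             # within-class scatter
    w = np.linalg.solve(S_W, m1 - m2)       # w proportional to S_W^{-1} (m1 - m2)
    return w / np.linalg.norm(w)
```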
  32. Fisherfaces – LDA. Fisher's Linear Discriminant Analysis, solution: maximize the between-class scatter while minimizing the within-class scatter. Fisherfaces, solution: widen the class so that a class represents a person, with r images of the same person's face.
  33. Fisherfaces – LDA. Linear Discriminant Analysis setup: M images, C classes; average per class μ_i = (1/M_i) Σ_{x_k ∈ class i} x_k, where M_i is the number of images in class i; total average μ = (1/M) Σ_{k=1}^{M} x_k.
  34. Fisherfaces – LDA. Within-class scatter matrix S_W = Σ_{i=1}^{C} Σ_{x_k ∈ class i} (x_k − μ_i)(x_k − μ_i)^T; between-class scatter matrix S_B = Σ_{i=1}^{C} M_i (μ_i − μ)(μ_i − μ)^T.
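A direct transcription of these two definitions into NumPy (the data layout and the helper name `scatter_matrices` are our choices):

```python
import numpy as np

def scatter_matrices(X, labels):
    """Within-class (S_W) and between-class (S_B) scatter matrices, as defined above.

    X      : (N, M) data matrix, one sample per column
    labels : length-M array with the class label of each sample
    """
    labels = np.asarray(labels)
    N = X.shape[0]
    mu = X.mean(axis=1, keepdims=True)                  # total average
    S_W = np.zeros((N, N))
    S_B = np.zeros((N, N))
    for c in np.unique(labels):
        Xc = X[:, labels == c]
        mu_c = Xc.mean(axis=1, keepdims=True)           # average of class c
        D = Xc - mu_c
        S_W += D @ D.T
        S_B += Xc.shape[1] * (mu_c - mu) @ (mu_c - mu).T
    return S_W, S_B
```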
  35. Fisherfaces – LDA. After projecting onto the “face space” with a matrix W, the within-class scatter matrix becomes W^T S_W W and the between-class scatter matrix becomes W^T S_B W.
  36. Fisherfaces – LDA. Example: [Figure showing a projection with good class separation. Source: http://www.wisdom.weizmann.ac.il/mathusers/ronen/course/spring01/Presentations/Hassner%20Zelnik-Manor%20-%20PCA.ppt]
  37. Fisherfaces – LDA. Problem: finding the transformation matrix. Goal: maximize the ratio of the projected between-class scatter to the projected within-class scatter, W_opt = argmax_W |W^T S_B W| / |W^T S_W W|. Such a transformation retains class separability while reducing the variation due to sources other than identity (e.g. illumination).
  38. Fisherfaces – LDA. Solution – Fisherfaces: the linear transformation is given by a matrix W whose columns a_i are eigenvectors of S_W^{-1} S_B. From the generalized eigenvalue problem S_B a_i = λ_i S_W a_i we get that a_i is an eigenvector of S_W^{-1} S_B. Choose the eigenvectors with the k largest eigenvalues; these eigenvectors give the directions of maximum discrimination.
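Numerically this is the generalized symmetric eigenproblem S_B a = λ S_W a. A sketch using `scipy.linalg.eigh`, which solves exactly that form when S_W is positive definite (the next slide explains why that assumption can fail):

```python
import numpy as np
from scipy.linalg import eigh

def lda_projection(S_W, S_B, k):
    """Return the k eigenvectors of S_W^{-1} S_B with the largest eigenvalues.

    Solved as the generalized problem S_B a = lambda S_W a; S_W must be
    nonsingular (positive definite) for this call to work.
    """
    vals, vecs = eigh(S_B, S_W)              # generalized symmetric eigenproblem
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order]
```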
  39. Fisherfaces – LDA. Limitations of LDA: the matrix S_W^{-1} S_B has at most C−1 nonzero eigenvalues, so the upper limit of the LDA dimension reduction is C−1; the matrix S_W^{-1} does not always exist. To guarantee that S_W is not singular, at least N+C training samples are needed (N is the image dimension), which is not practical!
  40. Fisherfaces. Solution: Fisherfaces = PCA + LDA. PCA is first applied to the data set to reduce its dimension (M' ≤ M − C); LDA is then applied to further reduce the dimension (C' ≤ C − 1).
  41. Fisherfaces – the real deal. Preparing the training set: get a training set of faces (each N×1); calculate μ; find the PCA projection matrix W_pca using Cov(x); find the LDA projection matrix using the scatter of W_pca^T X; combine them into W.
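Putting the two stages together, a self-contained NumPy/SciPy sketch of this training recipe (the names `fisherfaces`, `W_pca`, `W_lda` are ours; the PCA dimension M − C and LDA dimension C − 1 follow slide 40):

```python
import numpy as np
from scipy.linalg import eigh

def fisherfaces(X, labels):
    """Fisherfaces training sketch: PCA down to M - C dimensions, then LDA down to C - 1.

    X : (N, M) training faces, one flattened face per column
    labels : length-M array of person ids
    Returns (mu, W) with W an N x (C - 1) projection matrix.
    """
    labels = np.asarray(labels)
    M = X.shape[1]
    classes = np.unique(labels)
    C = len(classes)

    # PCA step: keep M - C components so the projected within-class scatter is nonsingular.
    mu = X.mean(axis=1, keepdims=True)
    Xc = X - mu
    vals, vecs = np.linalg.eigh(Xc @ Xc.T)
    W_pca = vecs[:, np.argsort(vals)[::-1][:M - C]]
    Y = W_pca.T @ Xc                                   # training data in the PCA space

    # LDA step: scatter matrices in the reduced space, then the generalized
    # eigenvectors of S_B a = lambda S_W a with the C - 1 largest eigenvalues.
    mu_y = Y.mean(axis=1, keepdims=True)
    d = Y.shape[0]
    S_W = np.zeros((d, d))
    S_B = np.zeros((d, d))
    for c in classes:
        Yc = Y[:, labels == c]
        mc = Yc.mean(axis=1, keepdims=True)
        D = Yc - mc
        S_W += D @ D.T
        S_B += Yc.shape[1] * (mc - mu_y) @ (mc - mu_y).T
    lvals, lvecs = eigh(S_B, S_W)
    W_lda = lvecs[:, np.argsort(lvals)[::-1][:C - 1]]
    return mu, W_pca @ W_lda
```

A new image would then be identified exactly as on the next slide: project it with the returned W and take the nearest projected training face.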
  42. Fisherfaces – the real deal. Identifying a new image: get the new image (N×1); calculate X' (subtract the mean); project it onto the “face space”; find the nearest neighbor to the projection; the training face that minimizes this distance is most probably the same face as X.
  43. Fisherfaces – experiments. Face identification statistics on the Harvard Database. [From P. Belhumeur et al. – Eigenfaces vs. Fisherfaces.]
  44. Fisherfaces – experiments. Face identification statistics on the Harvard Database. [From P. Belhumeur et al. – Eigenfaces vs. Fisherfaces.]
  45. Fisherfaces – experiments. [From P. Belhumeur et al. – Eigenfaces vs. Fisherfaces.]
  46. Fisherfaces – experiments. [From P. Belhumeur et al. – Eigenfaces vs. Fisherfaces.]
  47. Fisherfaces – experiments. [From P. Belhumeur et al. – Eigenfaces vs. Fisherfaces.]
  48. Bibliography. Eigenfaces: M. Turk and A. Pentland (1991), “Face recognition using eigenfaces”, Proc. IEEE Conference on Computer Vision and Pattern Recognition, 586–591; M. Turk and A. Pentland (1991), “Eigenfaces for recognition”, Journal of Cognitive Neuroscience 3(1): 71–86. Fisherfaces: Peter N. Belhumeur, João P. Hespanha, David J. Kriegman, “Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection”, IEEE Transactions on Pattern Analysis and Machine Intelligence. Class notes on PCA & LDA: Introduction to Machine Learning (Amnon Shashua), Lecture 8: Spectral Analysis I: PCA, LDA, CCA.
  49. Appendix – PCA proof. Given a sample of n observations on a vector of p variables x = (x_1, …, x_p)^T, define the first principal component of the sample by the linear transformation z_1 = a_1^T x = Σ_{j=1}^{p} a_{j1} x_j, where the vector a_1 = (a_{11}, …, a_{p1})^T is chosen such that var[z_1] is maximum.
  50. Appendix – PCA proof cont'd. Likewise, define the k-th PC of the sample by the linear transformation z_k = a_k^T x, where the vector a_k = (a_{1k}, …, a_{pk})^T is chosen such that var[z_k] is maximum, subject to cov[z_k, z_l] = 0 for k > l ≥ 1 and to a_k^T a_k = 1.
  51. Appendix – PCA proof cont'd. To find a_1, first note that var[z_1] = E[(z_1 − E[z_1])^2] = a_1^T S a_1, where S is the covariance matrix for the variables x.
  52. Appendix – PCA proof cont'd. To find a_1, maximize a_1^T S a_1 subject to a_1^T a_1 = 1. Let λ be a Lagrange multiplier; then maximize a_1^T S a_1 − λ(a_1^T a_1 − 1). By differentiating, S a_1 − λ a_1 = 0; therefore a_1 is an eigenvector of S corresponding to the eigenvalue λ.
  53. Appendix – PCA proof cont'd. We have maximized var[z_1] = a_1^T S a_1 = λ a_1^T a_1 = λ, so λ is the largest eigenvalue of S. The first PC retains the greatest amount of variation in the sample.
  54. Appendix – PCA proof cont'd. To find the next coefficient vector a_2, maximize var[z_2] = a_2^T S a_2 subject to cov[z_2, z_1] = 0 and to a_2^T a_2 = 1. First note that cov[z_2, z_1] = a_1^T S a_2 = λ_1 a_1^T a_2. Then let λ and φ be Lagrange multipliers, and maximize a_2^T S a_2 − λ(a_2^T a_2 − 1) − φ a_2^T a_1.
  55. Appendix – PCA proof cont'd. We find that a_2 is also an eigenvector of S, whose eigenvalue λ_2 is the second largest. In general, the k-th largest eigenvalue of S is the variance of the k-th PC, and the k-th PC retains the k-th greatest fraction of the variation in the sample.
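A small numerical check of the appendix's conclusion on synthetic data (nothing here is specific to faces; the toy sizes are arbitrary): the sample variance of each principal component equals the corresponding eigenvalue of the covariance matrix S.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 4)) @ rng.standard_normal((4, 4))   # n=500 observations, p=4 variables

S = np.cov(X, rowvar=False, bias=True)          # p x p covariance matrix of the variables
vals, vecs = np.linalg.eigh(S)
vals, vecs = vals[::-1], vecs[:, ::-1]          # eigenvalues in descending order

Z = (X - X.mean(axis=0)) @ vecs                 # principal components z_k = a_k^T x
print(np.allclose(Z.var(axis=0), vals))         # var[z_k] equals the k-th eigenvalue
```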
