# Face recognition and deep learning by Dr. Sanparith Marukatat, NECTEC


Face recognition and deep learning by Dr. Sanparith Marukatat, NECTEC

The Graduate School of Applied Statistics, National Institute of Development Administration (NIDA), together with Data Science Thailand, organized The First NIDA Business Analytics and Data Sciences Contest/Conference.


### Face recognition and deep learning by Dr. Sanparith Marukatat, NECTEC

1. Face Recognition & Deep Learning (sanparith.marukatat@nectec.or.th)
2. Standard procedure
   • Image capturing: camera, webcam, surveillance
   • Face detection: locate faces in the image
   • Face alignment: normalize size, rectify rotation
   • Face matching
     • 1:1 face verification
     • 1:N face recognition
3. Viola-Jones Haar-like detector (OpenCV haarcascade_frontalface_alt2.xml); detected faces range from ~35x35 to 80x80 pixels. Typical failure cases: face too small, occlusion, rotation. Recognition = compare these detected faces to known faces (a detection sketch follows).
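A minimal sketch of the detection step with OpenCV's pre-trained cascade (the file named on the slide); the input image path and the size bounds are illustrative assumptions:

```python
import cv2

# Load the Haar cascade named on the slide from OpenCV's bundled data.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_alt2.xml")

img = cv2.imread("photo.jpg")  # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# minSize/maxSize mirror the ~35x35 to 80x80 pixel range from the slide.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5,
                                 minSize=(35, 35), maxSize=(80, 80))
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detected.jpg", img)
```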
4. Controlled environment: captured face of 218x218 pixels; a Viola-Jones eye detector gives eye distance = 81 pixels and eye angle = -0.7 degrees. After alignment: face size = 180x200 pixels, eye distance = 100 pixels, eye angle = 0 degrees (see the sketch below).
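A sketch of that alignment, assuming eye coordinates from a Viola-Jones eye detector; the helper name and the canonical eye distance of 100 pixels follow the numbers on the slide:

```python
import cv2
import numpy as np

def align_face(img, left_eye, right_eye, out_size=(180, 200), eye_dist=100.0):
    """Rotate and scale so the eyes are horizontal and eye_dist apart
    (hypothetical helper; translation to a canonical eye position is
    omitted for brevity)."""
    (lx, ly), (rx, ry) = left_eye, right_eye
    angle = np.degrees(np.arctan2(ry - ly, rx - lx))  # e.g. -0.7 degrees
    scale = eye_dist / np.hypot(rx - lx, ry - ly)     # e.g. 100/81
    center = ((lx + rx) / 2.0, (ly + ry) / 2.0)
    M = cv2.getRotationMatrix2D(center, angle, scale)
    return cv2.warpAffine(img, M, out_size)           # 180x200 output
```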
5. Comparing faces
   • Face image: bitmap of 180x200 pixels, grayscale (0-255), i.e. 36,000 values per face image
   • Given two face images x1 and x2, candidate pixel-wise differences:
     • x1(x,y) - x2(x,y)
     • |x1(x,y) - x2(x,y)|
     • (x1(x,y) - x2(x,y))^2
   • What should be used?
6. Basic maths
   • 1 face image = 1 vector of 36,000 dimensions (d), i.e. a matrix with 1 column
   • Distances (computed in the sketch below):
     • Euclidean distance
     • Norm-p distance
     • Norm-1 distance
     • Norm-infinity distance
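The candidate distances from these two slides as a minimal numpy sketch; random vectors stand in for real face images:

```python
import numpy as np

# Each 180x200 grayscale face flattens to one 36,000-dim vector.
x1 = np.random.randint(0, 256, 36000).astype(float)
x2 = np.random.randint(0, 256, 36000).astype(float)

d_euclid = np.linalg.norm(x1 - x2)                 # Euclidean (norm-2)
d_norm1  = np.linalg.norm(x1 - x2, ord=1)          # norm-1: sum of |diff|
p = 3                                              # example norm-p
d_normp  = np.sum(np.abs(x1 - x2) ** p) ** (1 / p)
d_inf    = np.linalg.norm(x1 - x2, ord=np.inf)     # norm-infinity: max |diff|
```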
7. Pixel importance and projection
   • Not all pixels have the same importance
   • Pixel with low variation -> not important
   • Pixel with large variation -> could be important
   • Projection: when ||w|| = 1, w^T x is the projection of x onto the axis w
8. Subspace projection
   • What should the axis w be?
   • How many axes do we need?
9. Principal Component Analysis, PCA (1)
   • Basic idea: measure of information = variance
   • Variance of real numbers z1,…,zN: Var = (1/N) Σt (zt - z̄)^2
   • Given a set of face vectors x1,…,xN and an axis w, the variance of w^T x1,…,w^T xN is w^T C w, where C = (1/N) Σt (xt - x̄)(xt - x̄)^T is the covariance matrix
10. Principal Component Analysis, PCA (2)
   • The best axis w is obtained by maximizing w^T C w under the constraint ||w|| = 1
   • Such a w is an eigenvector of C: Cw = a w
   • The variance w^T C w = a is the corresponding eigenvalue
   • PCA:
     • Construct the covariance matrix C
     • Eigen-decompose C
     • Select the eigenvectors with the m largest eigenvalues (see the sketch below)
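A minimal PCA sketch following those three steps; X is a hypothetical N x d matrix with one face vector per row:

```python
import numpy as np

def pca(X, m):
    mean = X.mean(axis=0)
    Xc = X - mean
    C = Xc.T @ Xc / len(X)               # d x d covariance matrix
    vals, vecs = np.linalg.eigh(C)       # eigh: C is symmetric
    order = np.argsort(vals)[::-1][:m]   # indices of m largest eigenvalues
    return mean, vecs[:, order]          # d x m projection matrix W

# Projection of a face x onto the m axes: coeffs = (x - mean) @ W
# Note: for d = 36,000 this C is enormous; the next slide's trick avoids it.
```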
11. Eigenface (1)
   • What is the problem with face data? The covariance matrix is d x d (36,000 x 36,000), far too large to eigen-decompose directly
   • Solution: with N training images (N << d), eigen-decompose the small N x N dot-product matrix instead and map its eigenvectors back to image space
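A sketch of that shortcut: an eigenvector v of the N x N matrix A A^T maps back to an eigenvector u = A^T v of the d x d covariance matrix, with the same eigenvalue up to scaling; X is again a hypothetical N x d matrix:

```python
import numpy as np

def eigenfaces(X, m):
    """X: hypothetical N x d matrix of face vectors, N << d."""
    mean = X.mean(axis=0)
    A = X - mean                         # N x d
    G = A @ A.T / len(X)                 # small N x N dot-product matrix
    vals, V = np.linalg.eigh(G)
    order = np.argsort(vals)[::-1][:m]
    U = A.T @ V[:, order]                # map back to image space: d x m
    U /= np.linalg.norm(U, axis=0)       # normalize each eigenface
    return mean, U
```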
12. Eigenface (2)
   • We work with vectors of projected values
   • Enrollment: project each known face x1, x2, …, x40 to obtain a stored template
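A sketch of enrollment and 1:N matching on top of the projected coefficients; the helper names and the gallery layout are illustrative assumptions:

```python
import numpy as np

def template(face_vec, mean, U):
    """Project one face onto the eigenfaces to get its stored template."""
    return (face_vec - mean) @ U

# Enrollment (hypothetical): gallery = {person_id: template(face, mean, U), ...}

def identify(probe, gallery, mean, U):
    """1:N recognition: return the enrolled ID with the closest template."""
    t = template(probe, mean, U)
    return min(gallery, key=lambda pid: np.linalg.norm(gallery[pid] - t))
```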
13. Eigenface (3)
   • Vector of raw intensities: 36,000 dimensions
   • Vector of Eigenface coefficients: 10 dimensions
   • Eigenfaces with large eigenvalues capture large variation; those with small eigenvalues mostly capture noise
14. Related techniques
   • Fisherface (LDA)
   • Nullspace LDA
   • Laplacianface
   • Locality Sensitive Discriminant Analysis
   • 2DPCA
   • 2DLDA
   • 2DPCA+2DLDA
15. Results on the ORL face database (~10 years ago)

   | Technique | Accuracy (%) | #dim |
   |---|---|---|
   | Eigenface | 90-95 | 200 |
   | Fisherface | 91-97 | 50 |
   | NLDA | 92-97 | 40 |
   | Laplacianface | 89-95 | 50 |
   | LSDA | 91-97 | 50 |
   | 2DPCA | 91.5 | |
   | 2DLDA | 90.5 | |
   | 2DPCA+2DLDA | 93.5 | |
16. Limitations
   • Occlusion: glasses, beard
   • Lighting conditions
   • Facial expression
   • Pose
   • Make-up
17. Evaluation
   • Accuracy: find the closest template and check the ID
   • Verification (access control): live captured image vs. stored image
   • We have a distance -> should we accept or not?
   • False Accept (FA) vs. False Reject (FR)
   • From a set of face images: compute the distances between all pairs, then select the threshold T that gives 0 FA and X FR (see the sketch below)
   • (Figure: histogram of number of tries vs. distance, with threshold T)
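A sketch of that threshold selection, assuming two hypothetical arrays of pairwise distances: genuine (same-ID pairs) and impostor (different-ID pairs):

```python
import numpy as np

def fa_fr(genuine, impostor, T):
    """Rates at threshold T: accept a pair when its distance is <= T."""
    fa = np.mean(impostor <= T)   # impostor pair accepted: false accept
    fr = np.mean(genuine > T)     # genuine pair rejected: false reject
    return fa, fr

def zero_fa_threshold(genuine, impostor):
    """Largest T with 0 false accepts, and the FA/FR rates it costs."""
    T = impostor.min() - 1e-9     # just below the closest impostor pair
    return T, fa_fr(genuine, impostor, T)
```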
18. Labeled Faces in the Wild (LFW)
   • Large number of subjects (>5,000)
   • Unconstrained conditions
   • Human performance: 97-99%
   • Traditional methods fail
   • New alignment technique: funneling
19. LFW results (figure: benchmark results for methods that use outside data to train the model)
20. Deep Learning
21. Neural network timeline: McCulloch & Pitts neuron model (1943) -> perceptron limitations (1969) -> backprop algorithm (1970s-80s) -> SVM (1992) -> Deep Learning (2006)
22. • Return of the neural network • Focus on deep structures • Takes advantage of today's computing power
23. Neural networks (1)
   • Neurons are connected via synapses
   • A neuron receives signals from other neurons
   • When its activation reaches a threshold, it fires a signal to other neurons
   • http://en.wikipedia.org/wiki/Neuron
24. Neural networks (2)
   • Universal approximator
   • Classical structure: MLP (hyperparameters: #hidden nodes, learning rate)
   • Backprop algorithm (a one-step sketch follows)
     • Gradient: the direction of change that increases the value of the objective function, i.e. the vector of partial derivatives w.r.t. each parameter
     • Works on all structures and all objective functions
     • Issues: stopping criteria, local optima, vanishing/exploding gradients
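A one-step backprop sketch for such an MLP; the layer sizes, tanh hidden units, and squared-error objective are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(36000, 32)) * 0.01   # input -> hidden weights
W2 = rng.normal(size=(32, 1)) * 0.01       # hidden -> output weights
lr = 0.01                                  # learning rate

def step(x, y):
    """One gradient step on example (x, y); x is a 36,000-dim face vector."""
    global W1, W2
    h = np.tanh(x @ W1)                    # hidden activations
    err = h @ W2 - y                       # d(objective)/d(output)
    gW2 = np.outer(h, err)                 # gradient w.r.t. W2
    gh = W2 @ err                          # backprop through the output layer
    gW1 = np.outer(x, gh * (1 - h ** 2))   # ...and through the tanh
    W1 -= lr * gW1                         # follow the negative gradient
    W2 -= lr * gW2
```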
25. Deep Learning
   • 2006, Hinton et al.: layer-by-layer construction -> pre-training
   • Stacks of RBMs, stacks of autoencoders
   • Convolutional NN (CNN): shared weights
   • Takes advantage of GPUs
26. CNNs today (a minimal sketch follows)
   • Common components: convolution layers, max-pooling layers, ReLU
   • Drop-out; augmenting training data by sampling and flipping
   • GPU
   • Tools: Caffe, TensorFlow, Theano, Torch
   • Structures: LeNet, AlexNet, GoogLeNet
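A minimal sketch combining those components in TensorFlow/Keras (one of the tools named); the input size and class count are assumptions, not from the slides:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 1)),
    tf.keras.layers.Conv2D(16, 5, activation="relu"),  # convolution + ReLU
    tf.keras.layers.MaxPooling2D(2),                   # max-pooling
    tf.keras.layers.Conv2D(32, 5, activation="relu"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),                      # drop-out
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```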
27. LeNet
28. LeNet, AlexNet
29. LeNet, AlexNet, GoogLeNet
30. LeNet, AlexNet, GoogLeNet, and Microsoft's deep residual network: 150 layers!
31. DeepID (Sun et al., CVPR 2014)
   • 160 dims per region, 60 regions, plus flipped copies: 160 x 60 x 2 = 19,200 dimensions!!
   • Used as input to another model
   • CelebFaces
   • Refine training
32. Deep learning = learning techniques for deep structures + big data + computing power (GPU, etc.)