Face Recognition
&
Deep Learning
sanparith.marukatat@nectec.or.th
Standard procedure
• Image capturing: camera, webcam, surveillance
• Face detection: locate faces in the image
• Face alignment: normalize size, rectify rotation
• Face matching
• 1:1 Face verification
• 1:N Face recognition
Viola-Jones Haar-like detector

(OpenCV haarcascade_frontalface_alt2.xml)
[Detection example: detected face sizes ~35x35 to 80x80 pixels; misses occur when a face is too small, occluded, or rotated]
Recognition = compare these faces to known faces
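As a rough illustration of the detection step, here is a minimal sketch using OpenCV's Python bindings and the cascade file named above. The input filename and the detector parameters are illustrative assumptions, not values from the talk.

```python
import cv2

# Minimal Viola-Jones detection sketch; "group.jpg" is a hypothetical input.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_alt2.xml")

img = cv2.imread("group.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# minSize/maxSize roughly match the 35x35 to 80x80 range mentioned above
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5,
                                 minSize=(35, 35), maxSize=(80, 80))
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```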
Controlled environment
• Detected face: 218x218 pixels
• Viola-Jones eye detector: eye distance = 81 pixels, eye angle = -0.7 degrees
• After alignment: face size = 180x200 pixels, eye distance = 100 pixels, eye angle = 0 degrees
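A minimal alignment sketch, assuming the eye centers have already been located (e.g. by the Viola-Jones eye detector). The helper name align_face and the rotation-about-the-eye-midpoint choice are ours; a real pipeline would also translate the eyes to fixed canonical positions.

```python
import numpy as np
import cv2

def align_face(gray, left_eye, right_eye,
               target_dist=100.0, out_size=(180, 200)):
    """Rotate and scale so the eyes are horizontal and target_dist apart.
    left_eye/right_eye are (x, y) pixel coordinates."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = np.degrees(np.arctan2(dy, dx))      # e.g. -0.7 degrees
    dist = np.hypot(dx, dy)                     # e.g. 81 pixels
    scale = target_dist / dist                  # bring eyes 100 px apart
    center = ((left_eye[0] + right_eye[0]) / 2.0,
              (left_eye[1] + right_eye[1]) / 2.0)
    M = cv2.getRotationMatrix2D(center, angle, scale)
    return cv2.warpAffine(gray, M, out_size)    # 180x200 output
```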
Comparing faces
• Face image
• Bitmap of size 180x200 pixels
• Grayscale (0-255)
• 36,000 values/face image
• Given 2 face images x1 and x2, candidate pixel-wise differences:
• x1(x,y) - x2(x,y)
• | x1(x,y) - x2(x,y) |
• (x1(x,y) - x2(x,y))²
• What should be used?
Basic Maths
• 1 Face image = 1 vector
• 36,000 dimensions (d)
• matrix with 1 column
• Distance
• Euclidean distance
• Norm-p distance
• Norm-1 distance
• Norm-infinity distance
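These distances are one-liners in numpy; a short sketch with random stand-in vectors in place of real face images:

```python
import numpy as np

# x1, x2: two face images flattened to d = 36,000-dim vectors
x1 = np.random.rand(36000)   # stand-ins for real face vectors
x2 = np.random.rand(36000)

d1   = np.sum(np.abs(x1 - x2))        # norm-1: sum of absolute differences
d2   = np.sqrt(np.sum((x1 - x2)**2))  # Euclidean (norm-2) distance
dinf = np.max(np.abs(x1 - x2))        # norm-infinity: largest single difference
# General norm-p: np.linalg.norm(x1 - x2, ord=p)
```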
Pixel importance and projection
• Not all pixels have the same importance
• Pixel with low variation -> not important
• Pixel with large variation -> could be important
Projection
• When ||w|| = 1, wᵀx is the projection of x onto the axis w
Subspace projection
• What should be the axis w?
• How many axes do we need?
Principal Component Analysis
PCA (1)
• Basic idea
• Measure of information = variance
• Variance of real numbers z1,…,zN: Var = (1/N) Σt (zt - m)², where m is the mean of the zt
• Given a set of face vectors x1,…,xN and an axis w, the variance of wᵀx1,…,wᵀxN is wᵀCw
• C is the covariance matrix: C = (1/N) Σt (xt - μ)(xt - μ)ᵀ, with μ the mean face vector
Principal Component Analysis
PCA (2)
• The best axis w is obtained by maximizing wᵀCw under the constraint ||w|| = 1
• w is an eigenvector of C: Cw = a w
• The variance wᵀCw = a is the eigenvalue corresponding to w
• PCA
• Construct Covariance matrix C
• Eigen-decompose C
• Select the m eigenvectors with the largest eigenvalues (see the sketch below)
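A minimal numpy sketch of these three steps; the function name pca and the use of eigh are our choices.

```python
import numpy as np

def pca(X, m):
    """X: N x d matrix with one face vector per row.
    Returns the mean face and the m axes with largest variance."""
    mean = X.mean(axis=0)
    Xc = X - mean                      # center the data
    C = Xc.T @ Xc / len(X)             # d x d covariance matrix
    vals, vecs = np.linalg.eigh(C)     # eigh: C is symmetric
    order = np.argsort(vals)[::-1]     # eigenvalues in descending order
    return mean, vecs[:, order[:m]]    # columns are the best axes w
```

Note that with d = 36,000 the d x d matrix C has over a billion entries, which is exactly the problem the next slide raises.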
Eigenface (1)
• What is the problem with face data? The covariance matrix C is dxd = 36,000x36,000, far too large to eigendecompose
• Solution: eigendecompose the NxN dot-product matrix of the N training images instead (N << d), then map its eigenvectors back to d dimensions (sketched below)
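A sketch of that trick, assuming the N training images are stacked as rows of X: if Gv = av for the NxN matrix G = (1/N) Xc Xcᵀ, then Xcᵀv is an eigenvector of C with the same eigenvalue.

```python
import numpy as np

def eigenfaces(X, m):
    """Eigenface trick: eigendecompose the N x N dot-product matrix
    instead of the d x d covariance (here N << d = 36,000)."""
    mean = X.mean(axis=0)
    Xc = X - mean                          # N x d, centered
    G = Xc @ Xc.T / len(X)                 # N x N dot-product matrix
    vals, vecs = np.linalg.eigh(G)
    order = np.argsort(vals)[::-1][:m]
    U = Xc.T @ vecs[:, order]              # map back to d-dim eigenvectors
    U /= np.linalg.norm(U, axis=0)         # renormalize each axis to ||w|| = 1
    return mean, U
```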
Eigenface (2)
• We work with vectors of projected values: the coefficients of a face on the eigenfaces x1, x2, …, x40
• Enrollment: project the face image onto the eigenfaces and store the coefficient vector as the template
Eigenface (3)
• Vector of raw intensities: 36,000 dimensions
• Vector of Eigenface coefficients: 10 dimensions
• Eigenfaces with large eigenvalues capture large variation
• Eigenfaces with small eigenvalues mostly capture noise
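A small sketch of enrollment and 1:1 matching on the coefficient vectors. Here mean and U are assumed to come from the eigenfaces sketch above, and choosing the threshold is the subject of the Evaluation slide below.

```python
import numpy as np

def enroll(face, mean, U):
    """Project a raw 36,000-dim face onto the eigenfaces; the small
    coefficient vector is stored as the template."""
    return U.T @ (face - mean)

def match(probe, template, mean, U, threshold):
    """1:1 verification: compare a live face to a stored template."""
    d = np.linalg.norm(enroll(probe, mean, U) - template)
    return d < threshold   # accept if the distance is small enough
```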
Related techniques
• Fisherface (LDA)
• Nullspace LDA
• Laplacianface
• Locality Sensitive Discriminant Analysis
• 2DPCA
• 2DLDA
• 2DPCA+2DLDA
Results on ORL (~10 years ago)

Technique        Accuracy (%)   #dim
Eigenface        90-95          200
Fisherface       91-97          50
NLDA             92-97          40
Laplacianface    89-95          50
LSDA             91-97          50
2DPCA            91.5           -
2DLDA            90.5           -
2DPCA+2DLDA      93.5           -
Limitations
• Occlusion: glasses, beard
• Lighting condition
• Facial expression
• Pose
• Make-up
Evaluation
• Accuracy: find closest template and check the ID
• Verification (access control)
• Live captured image vs. stored image
• We have a distance -> should we accept or not?
• False Accept (FA) vs. False Reject (FR)
• From a set of face images
• Compute distances between all pairs
• Select the threshold T that gives 0 FA and X FR
[Figure: histogram of number of tries vs. distance, with threshold T]
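A minimal sketch of counting FA and FR at a threshold T, given precomputed genuine (same person) and impostor (different people) pair distances; the variable names are ours.

```python
import numpy as np

def fa_fr(genuine, impostor, T):
    """genuine: distances between image pairs of the SAME person,
    impostor: distances between pairs of DIFFERENT people.
    Returns (false accepts, false rejects) at threshold T."""
    fa = np.sum(np.asarray(impostor) < T)   # impostors wrongly accepted
    fr = np.sum(np.asarray(genuine) >= T)   # genuine pairs wrongly rejected
    return fa, fr

# "0 FA and X FR": take the largest T for which fa_fr(...)[0] == 0,
# i.e. T just below the smallest impostor distance, and report the FR count.
```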
Labeled Faces in the Wild
• Large number of subjects (>5,000)
• Unconstrained conditions
• Human performance 97-99%
• Traditional methods fail
• New alignment technique: funneling
LFW results
Use outside data to train the model
Deep Learning
Neural Network timeline
• 1943: McCulloch & Pitts neuron model
• 1969: Perceptron limitations
• 1970s-80s: Backprop algorithm
• 1992: SVM
• 2006: Deep Learning
• Return of Neural Network
• Focus on Deep Structure
• Take advantage of today's computing power
Neural Networks (1)
• Neurons are connected via synapses
• A neuron receives signals from other neurons
• When its activation reaches a threshold, it fires a signal to other neurons
http://en.wikipedia.org/wiki/Neuron
Neural Networks (2)
• Universal approximator
• Classical structure: MLP
• #hidden nodes, learning rate
• Backprop algorithm (a numeric sketch follows this list)
• Gradient
• Direction of change that increases the value of the objective function
• Vector of partial derivatives with respect to each parameter
• Works on all structures and all objective functions
• Issues: stopping criteria, local optima, vanishing/exploding gradients
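A minimal numeric sketch of one backprop/gradient-descent step for a tiny one-hidden-layer MLP with a squared-error objective; the shapes and learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=(4, 1)), np.array([[1.0]])
W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(1, 3))
lr = 0.1

h = np.tanh(W1 @ x)                  # forward pass: hidden layer
y_hat = W2 @ h                       # forward pass: output
err = y_hat - y                      # d(objective)/d(y_hat)

dW2 = err @ h.T                      # backprop: chain rule, layer by layer
dh = W2.T @ err
dW1 = (dh * (1 - h**2)) @ x.T        # tanh'(a) = 1 - tanh(a)^2

W2 -= lr * dW2                       # step AGAINST the gradient,
W1 -= lr * dW1                       # since the gradient points uphill
```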
Deep Learning
• 2006 Hinton et al.: layer by layer construction -> pre-training
• Stack of RBMs, Stack of Autoencoders
• Convolutional NN (CNN)
• Shared weights
• Take advantage of GPU
CNN today
• Common components
• Convolution layer, Max-pooling layer
• ReLU
• Drop-out; data augmentation (sampling + flipping training data)
• GPU
• Tools: Caffe, TensorFlow, Theano, Torch
• Structure: LeNet, AlexNet, GoogLeNet
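A minimal LeNet-style sketch in tf.keras (TensorFlow is one of the tools listed) combining the components above; the layer sizes and the 40-class output are illustrative assumptions, not the talk's model.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 5, activation="relu",      # convolution + ReLU
                           input_shape=(180, 200, 1)),
    tf.keras.layers.MaxPooling2D(2),                      # max-pooling
    tf.keras.layers.Conv2D(64, 5, activation="relu"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.5),                         # drop-out
    tf.keras.layers.Dense(40, activation="softmax")       # e.g. 40 subjects
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```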
LeNet
AlexNet
GoogLeNet
Microsoft deep residual network: 150 layers!
DeepID

(Sun et al. CVPR 2014)
• 160-dim feature per region, 60 regions, original + flipped crops
• 160 x 60 x 2 = 19,200 dimensions!!
• Input to other model
• CelebFace
• Refine training
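A sketch of the dimension arithmetic only; deepid_net is a hypothetical stand-in for the trained network, not the authors' code.

```python
import numpy as np

def deepid_features(regions, deepid_net):
    """regions: 60 face-region crops. Concatenate a 160-dim feature per
    crop, for the original and horizontally flipped versions."""
    feats = []
    for patch in regions:
        feats.append(deepid_net(patch))           # 160-dim feature
        feats.append(deepid_net(patch[:, ::-1]))  # flipped copy
    return np.concatenate(feats)                  # 160 * 60 * 2 = 19,200 dims
```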
• Learning techniques for deep structures
• Big data
• Computing power: GPU, etc.

Face recognition and deep learning, by Dr. Sanparith Marukatat, NECTEC
