PR 127: FaceNet

FaceNet: A Unified Embedding for
Face Recognition and Clustering
PR-127
PR12 Season 2
Taeoh Kim, Tensorflow-KR
Image/Video Pattern Recognition Lab
School of Electrical & Electronic Engineering

References
Yonsei - Image/Video Pattern Recognition Lab / PR-127: FaceNet
• Deep Face Recognition: A Survey, M. Wang et al.
• FaceNet: A Unified Embedding for Face Recognition and Clustering, CVPR 2015, Google

Background 1: FR System
Figure from Deep Face Recognition: A Survey Paper

Background 1: FR System: Today

Background 1: FR History
Citations: >1500, >500, >100

Background 1: FR History
Today
Maybe Next..?

Background 2: FR Approach
The Gallery: Target IDs
The Probe: Test ID
Identification: One-to-many, ID?
Verification: One-to-one, Yes/No?
Training using Identification Loss
Fine-tuning using Id/Verification Loss, and Test
Figure from SphereFace Paper

Background 3: DFR History

Deep Learning Breakthrough
Figures from DFD & DeepFace Papers
• Hand-crafted Learned Filter based (DFD): 84.02% Acc., TPAMI 2014
• AlexNet based (DeepFace): 97.35% Acc., CVPR 2014

DeepFace (by Facebook, CVPR 2014)
Figure from DeepFace Paper
• Training with Cross-Entropy Softmax Loss (Classification Loss)
• Fine-tune a Feature Representation using Chi-square Distance / Siamese Net
$$\chi^2(f_1, f_2) = \sum_i w_i \frac{(f_1[i] - f_2[i])^2}{f_1[i] + f_2[i]}$$
$$y = \sigma\left(\chi^2(f_1, f_2)\right)$$
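The weighted chi-square distance above can be sketched in a few lines of plain Python. This is only an illustration of the formula on the slide, not DeepFace's code: the function names, the `eps` guard against division by zero, and the use of plain lists are assumptions made for this example.

```python
import math


def weighted_chi_square(f1, f2, w, eps=1e-12):
    """Weighted chi-square distance: sum_i w_i * (f1[i] - f2[i])^2 / (f1[i] + f2[i]).

    eps is an assumption of this sketch; it only avoids dividing by zero
    when both features are zero at the same index.
    """
    return sum(
        wi * (a - b) ** 2 / (a + b + eps)
        for wi, a, b in zip(w, f1, f2)
    )


def sigmoid_score(f1, f2, w):
    """y = sigma(chi^2(f1, f2)), as written on the slide."""
    return 1.0 / (1.0 + math.exp(-weighted_chi_square(f1, f2, w)))
```

In DeepFace the weights `w` are learned; here they are simply passed in by the caller.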

Ref) Siamese Network
Figure from Andrew Ng’s Course

DeepFace (by Facebook, CVPR 2014)

DeepID2 (by HKU, NIPS 2014)

Citations: >1500, >500, >100

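$$L = \mathrm{Softmax} + \mathrm{Verification}$$
$$\mathrm{Verif} = \begin{cases} \frac{1}{2}\|f_i - f_j\|_2^2 & \text{if Same ID} \\ \frac{1}{2}\max\left(0,\; m - \|f_i - f_j\|_2\right)^2 & \text{if Diff. ID} \end{cases}$$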
Figure from DeepID2 Paper

Verification Loss in Euclidean Space
• Same ID term: Compresses Intra-ID Variance
• Diff. ID term: Enlarges Inter-ID Variance
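A minimal sketch of the DeepID2 verification term for a single pair, to make the two cases concrete. The function names and the default margin `m=1.0` are assumptions for illustration only; the slide does not give a margin value.

```python
import math


def l2_distance(fi, fj):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fi, fj)))


def verification_loss(fi, fj, same_id, m=1.0):
    """Pairwise verification loss.

    Same ID:  0.5 * d^2              (pulls the pair together)
    Diff. ID: 0.5 * max(0, m - d)^2  (pushes the pair apart until d >= m)
    The margin m is a hypothetical default, not a value from the slides.
    """
    d = l2_distance(fi, fj)
    if same_id:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, m - d) ** 2
```

A different-ID pair that is already farther apart than `m` contributes zero loss, so only pairs inside the margin are pushed apart.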

• Fine-tune for Verification using Joint Bayesian (98.97% → 99.15%)
Table from DeepID2 Paper

FaceNet (by Google, CVPR 2015)

Table from LFW Survey Paper

Before FaceNet
- Training using Identification Loss (+ Contrastive Loss)
- Fine Tune (using Metric Learning / Joint Bayesian)
FaceNet
- Training using Metric Learning

Metric Learning

• Very Big Data (~260M)
• Very Deep Network (ZFNet, GoogLeNet)
• No Face Alignment
• Single Model
• Verification Loss (Metric) Only: Face L2 Embedding
- Verification: Thresholding the Distance between the two embeddings
- Identification: k-NN Classification
- Clustering: k-means Clustering

Verification = Thresholding

Identification = k-NN Classification

Clustering = k-means
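Once faces are mapped to L2 embeddings, verification and identification reduce to simple distance operations. The sketch below assumes the embeddings are already computed; the function names, the `(identity, embedding)` gallery layout, and any threshold passed in are hypothetical choices for this example.

```python
import math
from collections import Counter


def l2_normalize(v):
    """Scale a vector to unit length, as in an L2 embedding."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_distance(u, v):
    """Squared Euclidean distance between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def verify(emb_a, emb_b, threshold):
    """Verification: same person if the distance is below the threshold."""
    return squared_distance(emb_a, emb_b) < threshold


def identify(probe, gallery, k=1):
    """Identification: majority vote over the k nearest gallery embeddings.

    gallery is a list of (identity, embedding) pairs.
    """
    nearest = sorted(gallery, key=lambda item: squared_distance(probe, item[1]))[:k]
    votes = Counter(identity for identity, _ in nearest)
    return votes.most_common(1)[0][0]
```

Clustering works on the same embeddings: any standard k-means routine can be run on them directly, so it is not repeated here.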

Triplet Loss

Figure from FaceNet Paper

Triplet Loss
Minimize $\|f(x_i^a) - f(x_i^p)\|_2^2$ (Anchor vs. Positive)
Figures from LFW Dataset

Triplet Loss
Minimize $\alpha - \|f(x_i^a) - f(x_i^n)\|_2^2$ (Anchor vs. Negative)

Triplet Loss
Minimize $\left[\, \|f(x_i^a) - f(x_i^p)\|_2^2 - \|f(x_i^a) - f(x_i^n)\|_2^2 + \alpha \,\right]_+$ for all triplets $i$ (Anchor, Positive, Negative)
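The hinge form of the triplet loss can be sketched directly from the formula. This is a plain-Python illustration under stated assumptions: the function names are hypothetical and the default margin `alpha=0.2` is only an illustrative value.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, alpha=0.2):
    """[ ||a - p||^2 - ||a - n||^2 + alpha ]_+ for one triplet."""
    d_ap = squared_distance(anchor, positive)
    d_an = squared_distance(anchor, negative)
    return max(0.0, d_ap - d_an + alpha)


def batch_triplet_loss(triplets, alpha=0.2):
    """Sum of the per-triplet losses over (anchor, positive, negative) tuples."""
    return sum(triplet_loss(a, p, n, alpha) for a, p, n in triplets)
```

A triplet whose negative is already farther from the anchor than the positive by at least `alpha` gives zero loss, which is why the choice of triplets matters for training.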

Triplet Loss Training Issue
• Large batch size for Anchor-Positive-Negative Balance
• Hard Positive in Mini-Batch: $\arg\max_{x_i^p} \|f(x_i^a) - f(x_i^p)\|_2^2$ → All Anchor-Positive
• Hard Negative in Mini-Batch: $\arg\min_{x_i^n} \|f(x_i^a) - f(x_i^n)\|_2^2$ → Semi-hard: $\|f(x_i^a) - f(x_i^p)\|_2^2 < \|f(x_i^a) - f(x_i^n)\|_2^2$
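The semi-hard condition can be illustrated with a small selection routine for one anchor-positive pair. Picking the closest negative among those that satisfy the condition is an assumption of this sketch (other selection rules are possible), and the function names are hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def select_semi_hard_negative(anchor, positive, negatives):
    """Return the index of the closest negative that is still farther
    from the anchor than the positive, or None if no such negative exists.

    Semi-hard condition: ||a - p||^2 < ||a - n||^2.
    """
    d_ap = squared_distance(anchor, positive)
    semi_hard = [
        (squared_distance(anchor, n), idx)
        for idx, n in enumerate(negatives)
        if squared_distance(anchor, n) > d_ap
    ]
    if not semi_hard:
        return None
    # min over (distance, index) tuples picks the smallest distance.
    return min(semi_hard)[1]
```

Negatives that are closer to the anchor than the positive are skipped here, which matches the move from the hardest negative to a semi-hard one on the slide.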

FaceNet: Results

VGGFace (by Oxford, BMVC 2015)

VGGFace (by Oxford, BMVC 2015)
• VGGFace Dataset (Publicly Available)
• Softmax Loss + Triplet Loss
• Compact, SOTA Performance
LFW Test Performance
Method   | Images | Networks | Acc. (%)
DeepFace | 4M     | 3        | 97.35
DeepID3  | -      | 200      | 99.47
FaceNet  | 200M   | 1        | 99.63
VGGFace  | 2.6M   | 1        | 98.95
YTF Test Performance
Method   | Images | Networks | Acc. (%)
DeepFace | 4M     | 3        | 91.4
DeepID3  | -      | 200      | 93.2
FaceNet  | 200M   | 1        | 95.1
VGGFace  | 2.6M   | 1        | 97.3
Figure from VGGFace Paper

Discussion / Practical Issues
• In FR, Dataset & Pre-processing are Very, Very… Important
• Metric Learning vs Classification Loss depends on Applications
• Open-set / Closed-set
• Identification (Facebook) / Verification (Security, Face ID)
• Normal FR Research is Saturated

Face Recognition vs Object Recognition
• Small Inter-class Variations / Large Intra-class Variations (Pose, Emotion, Age)
• Small Discriminant Features / Low Resolution / Hard Occlusions
• Different Scenarios

Metric Learning in Computer Vision = One-shot Learning
• Face Recognition
• Image Retrieval
• Person Re-Identification
• Scene Recognition
• Object Tracking

FR Research Issues

References
https://github.com/davidsandberg/facenet
TensorFlow implementation of FaceNet

Thank You!
Image/Video Pattern Recognition Lab
School of Electrical & Electronic Engineering
