FaceNet: A Unified Embedding for
Face Recognition and Clustering
PR-127
PR12 Season 2
Taeoh Kim, Tensorflow-KR
Image/Video Pattern Recognition Lab
School of Electrical & Electronic Engineering
References
Yonsei - Image/Video Pattern Recognition LabPR-127: FaceNet
• Deep Face Recognition: A Survey, M. Wang et al.
• FaceNet: A Unified Embedding for Face Recognition and Clustering, CVPR 2015, Google
Background 1: FR System
Figure from Deep Face Recognition: A Survey Paper
Background 1: FR System: Today
Figure from Deep Face Recognition: A Survey Paper
Background 1: FR History
Figure from Deep Face Recognition: A Survey Paper (legend: citations >1500, >500, >100)
Background 1: FR History
Figure from Deep Face Recognition: A Survey Paper
Today
Maybe Next..?
Background 2: FR Approach
The Gallery: Target IDs
The Probe: Test ID
Identification: One-to-many, ID?
Verification: One-to-one, Yes/No?
• Training using Identification Loss
• Fine-tuning using Identification/Verification Loss, then Test
Figure from SphereFace Paper
Background 3: DFR History
Figure from Deep Face Recognition: A Survey Paper
Deep Learning Breakthrough
Figures from DFD & DeepFace Papers
• DFD: Hand-crafted Learned Filter based, 84.02% Acc. (TPAMI 2014)
• DeepFace: AlexNet based, 97.35% Acc. (CVPR 2014)
DeepFace (by Facebook, CVPR 2014)
Figure from DeepFace Paper
• Training with Cross-Entropy Softmax Loss (Classification Loss)
• Fine-tune a Feature Representation using Chi-square Distance / Siamese Net
$$\chi^2(f_1, f_2) = \sum_i w_i \frac{(f_1[i] - f_2[i])^2}{f_1[i] + f_2[i]}$$
$$y = \sigma\left(\chi^2(f_1, f_2)\right)$$
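The weighted chi-square similarity above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the feature vectors are taken to be non-negative, the weights `w` are hypothetical placeholders (the slide gives no values), and the `eps` guard for all-zero dimensions is an addition not shown on the slide.

```python
import math


def weighted_chi_square(f1, f2, w, eps=1e-12):
    """Weighted chi-square distance: sum_i w_i * (f1[i] - f2[i])^2 / (f1[i] + f2[i])."""
    total = 0.0
    for a, b, wi in zip(f1, f2, w):
        denom = a + b
        if denom > eps:  # assumption: skip dimensions where both features are zero
            total += wi * (a - b) ** 2 / denom
    return total


def sigmoid(z):
    """Logistic function used to squash the distance into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Toy non-negative features; the weights are hypothetical, not learned values.
f1 = [0.2, 0.0, 0.8]
f2 = [0.1, 0.0, 0.9]
w = [1.0, 1.0, 1.0]

d = weighted_chi_square(f1, f2, w)
y = sigmoid(d)  # y = sigma(chi^2(f1, f2)), as written on the slide
```

In a trained system the weights would be learned from labeled pairs; here they are fixed to 1 purely to keep the sketch self-contained.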
Ref) Siamese Network
Figure from Andrew Ng’s Course
DeepFace (by Facebook, CVPR 2014)
DeepID2 (by CUHK, NIPS 2014)
Figure from Deep Face Recognition: A Survey Paper (legend: citations >1500, >500, >100)
DeepID2 (by CUHK, NIPS 2014)
$$L = \text{Softmax} + \text{Verification}$$
$$\text{Verif} = \begin{cases} \frac{1}{2} \lVert f_i - f_j \rVert^2 & \text{if same ID} \\ \frac{1}{2} \max\left(0,\; m - \lVert f_i - f_j \rVert\right)^2 & \text{if different ID} \end{cases}$$
• Distances are measured in Euclidean space
• Same-ID term: compresses intra-ID variance
• Different-ID term: enlarges inter-ID variance
Figure from DeepID2 Paper
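A minimal sketch of the verification term, assuming features are plain Python lists and the distance is the Euclidean norm. The margin value `m=1.0` is a hypothetical default chosen for illustration, not a value taken from the slides.

```python
import math


def l2_distance(fi, fj):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fi, fj)))


def verification_loss(fi, fj, same_id, m=1.0):
    """Contrastive-style verification loss for one pair of features.

    Same ID: penalize any distance (pulls the pair together).
    Different ID: penalize only pairs closer than the margin m (pushes them apart).
    """
    d = l2_distance(fi, fj)
    if same_id:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, m - d) ** 2


# A same-ID pair that is far apart is penalized heavily ...
loss_same = verification_loss([0.0, 0.0], [3.0, 4.0], same_id=True)
# ... while a different-ID pair beyond the margin contributes nothing.
loss_diff = verification_loss([0.0, 0.0], [3.0, 4.0], same_id=False)
```

The two branches map directly onto the slide annotations: the first compresses intra-ID variance, the second enlarges inter-ID variance up to the margin.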
DeepID2 (by CUHK, NIPS 2014)
• Fine-tune for Verification using Joint Bayesian (98.97% → 99.15%)
Table from DeepID2 Paper
Figure from Deep Face Recognition: A Survey Paper (legend: citations >1500, >500, >100)
FaceNet (by Google, CVPR 2015)
FaceNet (by Google, CVPR 2015)
Table from LFW Survey Paper
FaceNet (by Google, CVPR 2015)
Before FaceNet
- Training using Identification Loss (+ Contrastive Loss)
- Fine Tune (using Metric Learning / Joint Bayesian)
FaceNet
- Training using Metric Learning
Metric Learning
FaceNet (by Google, CVPR 2015)
• Very Big Data (~260M)
• Very Deep Network (ZFNet, GoogLeNet)
• No Face Alignment
• Single Model
• Verification Loss (Metric) Only: Face L2 Embedding
- Verification: Thresholding the Distance between the two embeddings
- Identification: k-NN Classification
- Clustering: k-means Clustering
Verification
= Thresholding
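Verification by thresholding can be sketched as follows. The embeddings here are toy lists rather than real network outputs, they are assumed to be non-zero, and the threshold of `1.0` is a hypothetical placeholder that would normally be tuned on held-out pairs.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_l2(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def same_person(emb_a, emb_b, threshold=1.0):
    """Return True when the normalized embeddings are closer than the threshold."""
    return squared_l2(l2_normalize(emb_a), l2_normalize(emb_b)) < threshold


close_pair = same_person([1.0, 0.0], [2.0, 0.1])   # nearly the same direction
far_pair = same_person([1.0, 0.0], [-1.0, 0.0])    # opposite directions
```

For unit-norm embeddings the squared distance lies in [0, 4], so any useful threshold falls inside that range.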
Identification
= K-NN Classification
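Identification as k-NN classification might look like the sketch below. The gallery layout (a list of `(embedding, identity)` pairs), the names, and the toy 2-D embeddings are all assumptions made for illustration.

```python
from collections import Counter


def squared_l2(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def identify(probe, gallery, k=1):
    """Return the majority identity among the k gallery embeddings nearest to the probe.

    gallery is a list of (embedding, identity) pairs.
    """
    nearest = sorted(gallery, key=lambda item: squared_l2(probe, item[0]))[:k]
    votes = Counter(identity for _, identity in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical gallery of enrolled identities.
gallery = [
    ([0.0, 0.0], "alice"),
    ([0.1, 0.0], "alice"),
    ([1.0, 1.0], "bob"),
]

who_k3 = identify([0.05, 0.0], gallery, k=3)
who_k1 = identify([0.9, 1.0], gallery, k=1)
```

An open-set system would additionally reject probes whose nearest distance exceeds a threshold; that step is omitted here.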
Clustering
= K-means
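Clustering embeddings with k-means can be sketched in plain Python. This is a simplified illustration: it initializes centroids from the first `k` points (an assumption made to keep the example deterministic) and runs a fixed number of iterations instead of checking convergence.

```python
def squared_l2(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, iters=20):
    """Plain k-means: alternate nearest-centroid assignment and centroid update."""
    centroids = [list(p) for p in points[:k]]  # naive deterministic init (assumption)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_l2(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Toy embeddings forming two well-separated groups.
points = [[0.0, 0.0], [5.0, 5.0], [0.1, 0.0], [5.1, 5.0]]
labels, centroids = kmeans(points, k=2)
```

With face embeddings, each resulting cluster would ideally collect images of one person; the number of clusters `k` has to be chosen by the user.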
FaceNet (by Google, CVPR 2015)
• The Verification Loss (Metric) used to learn the L2 embedding is the Triplet Loss
FaceNet (by Google, CVPR 2015)
Figure from FaceNet Paper
Triplet Loss
Minimize $\lVert f(x_i^a) - f(x_i^p) \rVert_2^2$
Anchor / Positive
Figures from LFW Dataset
Triplet Loss
Minimize $\alpha - \lVert f(x_i^a) - f(x_i^n) \rVert_2^2$
Anchor / Negative
Figures from LFW Dataset
Triplet Loss
Minimize $\sum_i \left[ \lVert f(x_i^a) - f(x_i^p) \rVert_2^2 - \lVert f(x_i^a) - f(x_i^n) \rVert_2^2 + \alpha \right]_+$ for all triplets
Anchor / Negative / Positive
Figures from LFW Dataset
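The triplet loss can be written directly from its definition. The sketch below uses plain lists as stand-ins for embeddings; the margin `alpha=0.2` is a hypothetical default and the example triplets are made up.

```python
def squared_l2(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, alpha=0.2):
    """Hinge on (anchor-positive distance) - (anchor-negative distance) + margin."""
    d_ap = squared_l2(anchor, positive)
    d_an = squared_l2(anchor, negative)
    return max(0.0, d_ap - d_an + alpha)


def batch_triplet_loss(triplets, alpha=0.2):
    """Sum the per-triplet loss over (anchor, positive, negative) tuples."""
    return sum(triplet_loss(a, p, n, alpha) for a, p, n in triplets)


triplets = [
    ([0.0, 0.0], [0.0, 0.0], [1.0, 0.0]),  # easy: negative is far, loss is 0
    ([0.0, 0.0], [1.0, 0.0], [1.0, 0.0]),  # tie: only the margin remains
    ([0.0, 0.0], [1.0, 0.0], [0.0, 0.0]),  # violated: negative closer than positive
]
total = batch_triplet_loss(triplets)
```

The first triplet shows why easy triplets contribute no gradient, which motivates the triplet selection discussed on the next slide.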
Triplet Loss Training Issue
• Large batch size for Anchor-Positive-Negative Balance
• Hard Positive in Mini-Batch: $\arg\max_{x_i^p} \lVert f(x_i^a) - f(x_i^p) \rVert_2^2$ → use all Anchor-Positive pairs
• Hard Negative in Mini-Batch: $\arg\min_{x_i^n} \lVert f(x_i^a) - f(x_i^n) \rVert_2^2$ → use Semi-hard negatives: $\lVert f(x_i^a) - f(x_i^p) \rVert_2^2 < \lVert f(x_i^a) - f(x_i^n) \rVert_2^2$
Figure from FaceNet Paper
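Semi-hard negative selection within a mini-batch can be sketched as below. It implements only the condition shown on the slide (the negative must be farther from the anchor than the positive) and then picks the closest such negative. Returning `None` when no candidate qualifies is an assumption of this sketch, as is the function name.

```python
def squared_l2(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def select_semi_hard_negative(anchor, positive, negatives):
    """Return the index of the closest negative that is still farther than the positive.

    Returns None when every negative is closer than the positive (assumption).
    """
    d_ap = squared_l2(anchor, positive)
    candidates = [
        (squared_l2(anchor, neg), idx)
        for idx, neg in enumerate(negatives)
    ]
    semi_hard = [(d, idx) for d, idx in candidates if d > d_ap]
    if not semi_hard:
        return None
    return min(semi_hard)[1]


anchor = [0.0, 0.0]
positive = [1.0, 0.0]                              # squared distance 1.0
negatives = [[0.5, 0.0], [1.2, 0.0], [3.0, 0.0]]   # squared distances 0.25, 1.44, 9.0

chosen = select_semi_hard_negative(anchor, positive, negatives)
```

The first negative is harder than the positive and is skipped, so the second one is selected; picking the very hardest negative instead is the `argmin` shown in the bullet above.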
FaceNet: Results
Figure from FaceNet Paper
Figure from Deep Face Recognition: A Survey Paper (legend: citations >1500, >500, >100)
VGGFace (by Oxford, BMVC 2015)
LFW Test Performance
Method     Images   Networks   Acc. (%)
DeepFace   4M       3          97.35
DeepID3    -        200        99.47
FaceNet    200M     1          99.63
VGGFace    2.6M     1          98.95
YTF Test Performance
Method     Images   Networks   Acc. (%)
DeepFace   4M       3          91.4
DeepID3    -        200        93.2
FaceNet    200M     1          95.1
VGGFace    2.6M     1          97.3
• VGGFace Dataset (Publicly Available)
• Softmax Loss + Triplet Loss
• Compact, SOTA Performance
Figure from VGGFace Paper
Figure from Deep Face Recognition: A Survey Paper (legend: citations >1500, >500, >100)
Discussion / Practical Issues
• In FR, the Dataset & Pre-processing are Very Important
• Metric Learning vs Classification Loss depends on the Application
- Open-set / Closed-set
- Identification (Facebook) / Verification (Security, Face ID)
• Normal FR Research is Saturated
Face Recognition vs Object Recognition
• Small Inter-class Variations / Large Intra-class Variations (Pose, Emotion, Age)
• Small Discriminant Features / Low Resolution / Hard Occlusions
• Different Scenarios
Metric Learning in Computer Vision = One-shot Learning
• Face Recognition
• Image Retrieval
• Person Re-Identification
• Scene Recognition
• Object Tracking
FR Research Issues
Figure from Deep Face Recognition: A Survey Paper
References
https://github.com/davidsandberg/facenet
TensorFlow implementation of FaceNet
Thank You!
Image/Video Pattern Recognition Lab
School of Electrical & Electronic Engineering

Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
 
artificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptxartificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptx
 
Software Quality Assurance-se412-v11.ppt
Software Quality Assurance-se412-v11.pptSoftware Quality Assurance-se412-v11.ppt
Software Quality Assurance-se412-v11.ppt
 

PR 127: FaceNet

  • 1. FaceNet: A Unified Embedding for Face Recognition and Clustering PR-127 PR12 Season 2 Taeoh Kim, Tensorflow-KR Image/Video Pattern Recognition Lab School of Electrical & Electronic Engineering
  • 2. References (Yonsei - Image/Video Pattern Recognition Lab, PR-127: FaceNet) • Deep Face Recognition: A Survey, M. Wang et al. • FaceNet: A Unified Embedding for Face Recognition and Clustering, CVPR 2015, Google
  • 3. Background 1: FR System (figure from the Deep Face Recognition: A Survey paper)
  • 4. Background 1: FR System: Today (figure from the Deep Face Recognition: A Survey paper)
  • 5. Background 1: FR History (figure from the Deep Face Recognition: A Survey paper; legend: citations >1500, >500, >100)
  • 6. Background 1: FR History (figure from the Deep Face Recognition: A Survey paper; annotations: "Today", "Maybe Next..?")
  • 7. Background 2: FR Approach • The Gallery: Target IDs • The Probe: Test ID • Identification: One-to-many, ID? • Verification: One-to-one, Yes/No? • Training using Identification Loss • Fine-tuning using Id/Verification Loss and Test (figure from the SphereFace paper)
  • 8. Background 3: DFR History (figure from the Deep Face Recognition: A Survey paper)
  • 9. Deep Learning Breakthrough • Hand-crafted Learned Filter based: 84.02% Acc., TPAMI 2014 • AlexNet based: 97.35% Acc., CVPR 2014 (figures from the DFD and DeepFace papers)
  • 10. DeepFace (by Facebook, CVPR 2014) • Training with Cross-Entropy Softmax Loss (Classification Loss) • Fine-tune a Feature Representation using Chi-square Distance / Siamese Net: $\chi^2(f_1, f_2) = \sum_i w_i \frac{(f_1[i] - f_2[i])^2}{f_1[i] + f_2[i]}$, $y = \sigma(\chi^2(f_1, f_2))$ (figure from the DeepFace paper)
  • 11. Ref) Siamese Network (figure from Andrew Ng's course)
  • 12. DeepFace (by Facebook, CVPR 2014)
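The weighted chi-square distance on slide 10 can be sketched in a few lines of plain Python. This is only an illustration of the formula as written on the slide, not the DeepFace implementation: the function name `weighted_chi_square`, the `eps` guard against empty bins, and the example weights are assumptions made here. In practice the weights $w_i$ would be learned rather than set by hand.

```python
def weighted_chi_square(f1, f2, weights, eps=1e-12):
    """Weighted chi-square distance between two non-negative feature vectors.

    Computes sum_i w_i * (f1[i] - f2[i])**2 / (f1[i] + f2[i]).
    Terms whose denominator is (near) zero are skipped; that guard is an
    assumption of this sketch, not something stated on the slide.
    """
    if not (len(f1) == len(f2) == len(weights)):
        raise ValueError("f1, f2 and weights must have the same length")
    total = 0.0
    for a, b, w in zip(f1, f2, weights):
        denom = a + b
        if denom > eps:
            total += w * (a - b) ** 2 / denom
    return total


# Hypothetical features and weights, for illustration only.
distance = weighted_chi_square([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])
print(distance)
```

Identical feature vectors give a distance of zero, and a weight of zero removes a feature dimension from the comparison entirely.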
  • 13. DeepID2 (by CUHK, NIPS 2014)
  • 14. (figure from the Deep Face Recognition: A Survey paper; legend: citations >1500, >500, >100)
  • 15. DeepID2 (by CUHK, NIPS 2014) • $L = \mathrm{Softmax} + \mathrm{Verification}$ • $\mathrm{Verif} = \begin{cases} \frac{1}{2}\|f_i - f_j\|^2 & \text{if same ID} \\ \frac{1}{2}\max(0,\, m - \|f_i - f_j\|)^2 & \text{if diff. ID} \end{cases}$ (figure from the DeepID2 paper)
  • 16. DeepID2 (by CUHK, NIPS 2014): same loss as slide 15, annotated "Euclidean Space" (figure from the DeepID2 paper)
  • 17. DeepID2 (by CUHK, NIPS 2014): same loss as slide 15, with the same-ID term annotated "Compresses Intra-ID Variance" (figure from the DeepID2 paper)
  • 18. DeepID2 (by CUHK, NIPS 2014): same loss as slide 15, with the diff.-ID term annotated "Enlarges Inter-ID Variance" (figure from the DeepID2 paper)
  • 19. DeepID2 (by CUHK, NIPS 2014) • Fine-tune for Verification using Joint Bayesian (98.97% → 99.15%) (table from the DeepID2 paper)
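The DeepID2 verification term on slides 15 to 18 is a contrastive loss on a pair of features. The sketch below follows the two cases on the slide. The names `l2_distance` and `verification_loss` and the default `margin=1.0` are assumptions for illustration; the softmax (identification) part of $L$ is left out.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verification_loss(f_i, f_j, same_id, margin=1.0):
    """Contrastive verification term for one pair of features.

    Same ID:       0.5 * ||f_i - f_j||^2            (pulls the pair together)
    Different ID:  0.5 * max(0, m - ||f_i - f_j||)^2 (pushes apart up to margin m)
    """
    d = l2_distance(f_i, f_j)
    if same_id:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


# Hypothetical 2-D features, for illustration only.
print(verification_loss([0.0, 0.0], [3.0, 4.0], same_id=True))
print(verification_loss([0.0, 0.0], [3.0, 4.0], same_id=False, margin=1.0))
```

The same-ID case is what the slides call compressing intra-ID variance. The different-ID case only contributes while the pair is closer than the margin, which is what enlarges inter-ID variance.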
  • 20. (figure from the Deep Face Recognition: A Survey paper; legend: citations >1500, >500, >100)
  • 21. FaceNet (by Google, CVPR 2015)
  • 22. FaceNet (by Google, CVPR 2015) (table from the LFW survey paper)
  • 23. FaceNet (by Google, CVPR 2015) • Before FaceNet: - Training using Identification Loss (+ Contrastive Loss) - Fine-tune (using Metric Learning / Joint Bayesian) • FaceNet: - Training using Metric Learning
  • 24. Metric Learning
  • 25. FaceNet (by Google, CVPR 2015) • Very Big Data (~260M) • Very Deep Network (ZFNet, GoogLeNet) • No Face Alignment • Single Model • Verification Loss (Metric) Only: Face L2 Embedding - Verification: Thresholding the distance between the two embeddings - Identification: k-NN Classification - Clustering: k-means Clustering
  • 26. Verification = Thresholding
  • 27. Identification = k-NN Classification
  • 28. Clustering = k-means
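Slides 25 to 27 reduce verification and identification to simple operations on embeddings. A minimal sketch, assuming the embeddings have already been computed by some model: the helper names (`squared_l2`, `verify`, `identify`), the threshold value, and the toy gallery are all hypothetical. Clustering with k-means (slide 28) is omitted to keep the example short.

```python
from collections import Counter


def squared_l2(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def verify(emb_a, emb_b, threshold):
    """Verification: same person if the embedding distance is below a threshold."""
    return squared_l2(emb_a, emb_b) < threshold


def identify(probe, gallery, k=1):
    """Identification: majority vote among the k nearest gallery embeddings.

    gallery is a list of (label, embedding) pairs.
    """
    nearest = sorted(gallery, key=lambda item: squared_l2(probe, item[1]))[:k]
    votes = Counter(label for label, _ in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D "embeddings"; real face embeddings have many more dimensions.
gallery = [
    ("alice", [1.0, 0.0]),
    ("bob", [0.0, 1.0]),
    ("alice", [0.9, 0.1]),
]
print(verify([1.0, 0.0], [0.9, 0.1], threshold=0.5))
print(identify([0.8, 0.2], gallery, k=3))
```

The threshold trades false accepts against false rejects, so in practice it would be tuned on held-out pairs rather than fixed as it is here.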
  • 29. FaceNet (by Google, CVPR 2015): same bullets as slide 25, plus the label "Triplet Loss"
  • 30. FaceNet (by Google, CVPR 2015) (figure from the FaceNet paper)
  • 31. Triplet Loss • Anchor, Positive: Minimize $\|f(x_i^a) - f(x_i^p)\|_2^2$ (figures from the LFW dataset)
  • 32. Triplet Loss • Anchor, Negative: Minimize $\alpha - \|f(x_i^a) - f(x_i^n)\|_2^2$ (figures from the LFW dataset)
  • 33. Triplet Loss • Anchor, Positive, Negative: Minimize $\left[\|f(x_i^a) - f(x_i^p)\|_2^2 - \|f(x_i^a) - f(x_i^n)\|_2^2 + \alpha\right]_+$ for all triplets (figures from the LFW dataset)
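The triplet loss built up on slides 31 to 33 can be written directly from the formula. The sketch below is plain Python over lists of floats. The names `triplet_loss` and `batch_triplet_loss` and the default `alpha=0.2` are illustrative assumptions, and a real training loop would compute this on batched tensors with automatic differentiation.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, alpha=0.2):
    """Hinge on (anchor-positive distance) - (anchor-negative distance) + margin."""
    d_ap = squared_l2(anchor, positive)
    d_an = squared_l2(anchor, negative)
    return max(0.0, d_ap - d_an + alpha)


def batch_triplet_loss(triplets, alpha=0.2):
    """Sum of the per-triplet losses over (anchor, positive, negative) tuples."""
    return sum(triplet_loss(a, p, n, alpha) for a, p, n in triplets)


# Hypothetical 2-D embeddings, for illustration only.
easy = ([0.0, 0.0], [1.0, 0.0], [0.0, 3.0])   # negative is already far away
hard = ([0.0, 0.0], [2.0, 0.0], [1.0, 0.0])   # negative is closer than positive
print(batch_triplet_loss([easy, hard], alpha=0.5))
```

A triplet whose negative is already farther than the positive by at least the margin contributes zero loss, which is why the choice of triplets matters during training.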
  • 34. Triplet Loss Training Issue • Large batch size for Anchor-Positive-Negative Balance • Hard Positive in Mini-Batch: $\arg\max_{x_i^p} \|f(x_i^a) - f(x_i^p)\|_2^2$ • Hard Negative in Mini-Batch: $\arg\min_{x_i^n} \|f(x_i^a) - f(x_i^n)\|_2^2$ (figure from the FaceNet paper)
  • 35. Triplet Loss Training Issue • Large batch size for Anchor-Positive-Negative Balance • Hard Positive in Mini-Batch: $\arg\max_{x_i^p} \|f(x_i^a) - f(x_i^p)\|_2^2$ → All Anchor-Positive • Hard Negative in Mini-Batch: $\arg\min_{x_i^n} \|f(x_i^a) - f(x_i^n)\|_2^2$ → Semi-hard: $\|f(x_i^a) - f(x_i^p)\|_2^2 < \|f(x_i^a) - f(x_i^n)\|_2^2$ (figure from the FaceNet paper)
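The semi-hard condition on slide 35 can be turned into a small selection routine. This is a sketch under stated assumptions, not the FaceNet training code: `pick_semi_hard_negative` is a hypothetical helper that, among the negatives farther from the anchor than the positive, returns the index of the closest one. The optional `alpha` upper bound (negative still inside the margin) is an extra filter added here and does not appear on the slide.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pick_semi_hard_negative(anchor, positive, negatives, alpha=None):
    """Return the index of the closest negative with d(a, n) > d(a, p).

    If alpha is given, also require d(a, n) < d(a, p) + alpha (assumption of
    this sketch). Returns None when no negative satisfies the condition.
    """
    d_ap = squared_l2(anchor, positive)
    candidates = []
    for idx, neg in enumerate(negatives):
        d_an = squared_l2(anchor, neg)
        if d_an > d_ap and (alpha is None or d_an < d_ap + alpha):
            candidates.append((d_an, idx))
    if not candidates:
        return None
    return min(candidates)[1]


# Hypothetical mini-batch: one anchor-positive pair and three negatives.
anchor, positive = [0.0, 0.0], [1.0, 0.0]
negatives = [[0.5, 0.0], [2.0, 0.0], [1.5, 0.0]]
print(pick_semi_hard_negative(anchor, positive, negatives))
```

The first negative in the example is closer to the anchor than the positive, so it is the hardest negative and is deliberately skipped. Slide 35 replaces the plain argmin with the semi-hard rule.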
  • 36. FaceNet: Results (figure from the FaceNet paper)
  • 37. (figure from the Deep Face Recognition: A Survey paper; legend: citations >1500, >500, >100)
  • 38. VGGFace (by Oxford, BMVC 2015)
  • 39. VGGFace (by Oxford, BMVC 2015) • VGGFace Dataset (Publicly Available) • Softmax Loss + Triplet Loss • Compact, SOTA Performance • LFW and YTF Test Performance (from the VGGFace paper):
      Method    Images  Networks  LFW Acc.  YTF Acc.
      DeepFace  4M      3         97.35     91.4
      DeepID3   -       200       99.47     93.2
      FaceNet   200M    1         99.63     95.1
      VGGFace   2.6M    1         98.95     97.3
  • 40. (figure from the Deep Face Recognition: A Survey paper; legend: citations >1500, >500, >100)
  • 41. Discussion / Practical Issues • In FR, Dataset & Pre-processing are Very, Very Important • Metric Learning vs Classification Loss depends on the Application: - Open-set / Closed-set - Identification (Facebook) / Verification (Security, Face ID) • Normal FR Research is Saturated
  • 42. Face Recognition vs Object Recognition • Small Inter-class Variations / Large Intra-class Variations (Pose, Emotion, Age) • Small Discriminant Features / Low Resolution / Hard Occlusions • Different Scenarios
  • 43. Metric Learning in Computer Vision = One-shot Learning • Face Recognition • Image Retrieval • Person Re-Identification • Scene Recognition • Object Tracking
  • 44. FR Research Issues (figure from the Deep Face Recognition: A Survey paper)
  • 45. References • https://github.com/davidsandberg/facenet (TensorFlow implementation of FaceNet)
  • 46. Thank You! Image/Video Pattern Recognition Lab School of Electrical & Electronic Engineering