FaceNet: A Unified Embedding for Face Recognition and Clustering
13th August 2021
Types of Face Recognition
• Face Verification
• Face Identification
• Face Clustering
Uses of Face Recognition
• Protecting Problem Gamblers
• Buying burgers
• Speeding up hotel check-in
• Etc.
FaceNet Model Overview
• FaceNet provides a unified embedding for face
recognition, verification, and clustering tasks.
• Developed by Schroff et al. at Google and published
in their 2015 paper.
• FaceNet learns a mapping from face images to a
compact Euclidean space where distances directly
correspond to a measure of face similarity.
• A deep CNN is trained to optimize the embedding itself
using a novel online triplet mining method.
• These face embeddings achieved state-of-the-art
results on standard face recognition benchmark
datasets, cutting the error rate by 30% compared with
the best published result at the time (2015).
FaceNet Architecture
• The FaceNet model is invariant to pose, illumination, and other variations.
• FaceNet uses a 22-layer deep CNN that is trained to directly output a 128-dimensional embedding.
• The network is trained such that the squared L2 distance between embeddings corresponds to face
similarity.
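To make the distance idea concrete, here is a minimal sketch of face verification by thresholding the squared L2 distance. The toy 4-D vectors stand in for 128-D embeddings, the threshold of 1.0 is an illustrative assumption, and the unit-length normalization follows the FaceNet paper's constraint rather than anything stated on the slide.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (assumed constraint on embeddings)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def squared_l2(a, b):
    """Squared Euclidean distance between two embeddings of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def same_person(emb_a, emb_b, threshold=1.0):
    """Verification: accept the pair when the distance is within the threshold.

    The threshold is a hypothetical value chosen for this toy example.
    """
    return squared_l2(emb_a, emb_b) <= threshold

# Toy vectors standing in for embeddings produced by a trained network.
alice_1 = l2_normalize([0.9, 0.1, 0.0, 0.1])
alice_2 = l2_normalize([0.8, 0.2, 0.1, 0.1])
bob = l2_normalize([0.0, 0.1, 0.9, 0.2])

print(same_person(alice_1, alice_2))  # the two "alice" vectors are close
print(same_person(alice_1, bob))      # "bob" is far from "alice"
```

In practice the threshold would be tuned on held-out pairs rather than fixed by hand.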
Triplet Loss
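• Intuition: an anchor image should be closer to positive
images (same identity) than to negative images (different identity).
• Thus, for every triplet $(x_i^a, x_i^p, x_i^n)$ with embedding $f$ and margin $\alpha$, we want:
$$\|f(x_i^a) - f(x_i^p)\|_2^2 + \alpha < \|f(x_i^a) - f(x_i^n)\|_2^2$$
• The loss that is being minimized:
$$L = \sum_{i=1}^{N} \left[ \|f(x_i^a) - f(x_i^p)\|_2^2 - \|f(x_i^a) - f(x_i^n)\|_2^2 + \alpha \right]_+$$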
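As a rough illustration, the triplet loss for a single (anchor, positive, negative) triplet can be sketched as a hinge on the difference of squared distances plus a margin. The 2-D vectors and the default margin of 0.2 are assumptions made for this example.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on d(a, p) - d(a, n) + margin.

    The loss is zero once the negative is farther from the anchor
    than the positive by at least the margin.
    """
    gap = squared_l2(anchor, positive) - squared_l2(anchor, negative) + margin
    return max(gap, 0.0)

anchor = [0.0, 1.0]
positive = [0.1, 0.9]
easy_negative = [1.0, 0.0]   # already far from the anchor
hard_negative = [0.2, 0.9]   # almost as close as the positive

print(triplet_loss(anchor, positive, easy_negative))
print(triplet_loss(anchor, positive, hard_negative))
```

The easy negative contributes no loss, so only triplets like the second one drive learning, which is why triplet selection matters.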
Triplet Selection
• Choose "hard" triplets: the positives farthest from the anchor and the negatives closest to it.
• However, computing the hard positives and hard negatives over the entire dataset is
computationally infeasible.
• To avoid this, generate triplets online: for each anchor image, select the hard positive (argmax of the
distance) and the hard negative (argmin of the distance) from a mini-batch rather than from the entire dataset.
• Training data is sampled such that around 40 images are selected per identity for each mini-batch,
and randomly sampled negative faces are added to each mini-batch.
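The online selection described above can be sketched as follows: within one mini-batch, each anchor is paired with its farthest same-identity sample and its closest different-identity sample. The function name, the toy batch, and the identity labels are hypothetical, and this sketch does not reproduce the exact mining variant used by the authors.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mine_hard_triplets(embeddings, labels):
    """Pick, per anchor, the hard positive (argmax) and hard negative (argmin).

    Returns a list of (anchor_idx, positive_idx, negative_idx) tuples,
    all indices referring to positions inside the mini-batch.
    """
    triplets = []
    for a, (emb_a, lab_a) in enumerate(zip(embeddings, labels)):
        positives = [i for i, lab in enumerate(labels) if lab == lab_a and i != a]
        negatives = [i for i, lab in enumerate(labels) if lab != lab_a]
        if not positives or not negatives:
            continue  # an anchor needs at least one positive and one negative
        hard_pos = max(positives, key=lambda i: squared_l2(emb_a, embeddings[i]))
        hard_neg = min(negatives, key=lambda i: squared_l2(emb_a, embeddings[i]))
        triplets.append((a, hard_pos, hard_neg))
    return triplets

# Toy mini-batch: three samples of identity "A", two of identity "B".
batch = [[0.0, 0.0], [0.1, 0.0], [0.5, 0.0], [0.6, 0.0], [2.0, 0.0]]
ids = ["A", "A", "A", "B", "B"]
print(mine_hard_triplets(batch, ids))
```

Because the search is restricted to the mini-batch, the cost grows with the batch size rather than with the size of the dataset.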
Deep Convolutional Networks
• The CNN is trained using Stochastic Gradient Descent (SGD) with standard backprop and
AdaGrad.
• The authors of FaceNet explored two types of architecture, which differ in the number
of parameters and FLOPS.
• Model 1 uses the Zeiler & Fergus architecture and results in a model 22 layers deep. It has
a total of 140 million parameters and requires around 1.6 billion FLOPS per image.
• Model 2 is based on GoogLeNet-style Inception models, which have about 20× fewer parameters
(around 6.6M-7.5M) and up to 5× fewer FLOPS (between 500M and 1.6B).
Zeiler & Fergus-Inspired Architecture
• Consists of multiple interleaved layers of
convolutions, non-linear activations, local
response normalizations, and max
pooling layers (with several additional
1x1xd convolutional layers throughout).
• The 1x1 conv layers are inspired by
cross-channel parametric pooling.
Inception-Inspired Architecture
Dataset and Evaluation
• The model is evaluated on 4 different datasets. Where applicable, the validation rate (VAL)
and false accept rate (FAR) are computed:
1. Hold-out Test Set: 1M images with the same distribution as the training set, divided into 5
subsets. VAL and FAR are calculated on 100k x 100k image pairs.
2. Personal Photos: 12k images, with FAR and VAL calculated on 12k x 12k image pairs.
3. Labeled Faces in the Wild (LFW): the de facto academic test set for face recognition. FAR and
VAL are not calculated.
4. YouTube Faces DB: the setup is similar to LFW, but pairs of videos are used instead of pairs of images.
FAR and VAL are not calculated.
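For reference, here is a small sketch of how VAL and FAR could be computed at a given distance threshold. It assumes the definitions from the FaceNet paper: VAL is the fraction of same-identity pairs accepted, and FAR is the fraction of different-identity pairs accepted. The pairs and the threshold are toy values.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def val_and_far(pairs, threshold):
    """Return (VAL, FAR) for pairs of the form (emb_i, emb_j, is_same).

    Assumes at least one same-identity and one different-identity pair.
    """
    true_accepts = false_accepts = n_same = n_diff = 0
    for emb_i, emb_j, is_same in pairs:
        accepted = squared_l2(emb_i, emb_j) <= threshold
        if is_same:
            n_same += 1
            true_accepts += accepted
        else:
            n_diff += 1
            false_accepts += accepted
    return true_accepts / n_same, false_accepts / n_diff

pairs = [
    ([0.0, 0.0], [0.1, 0.0], True),   # same identity, close
    ([0.0, 0.0], [0.9, 0.0], True),   # same identity, far
    ([0.0, 0.0], [0.3, 0.0], False),  # different identity, close
    ([0.0, 0.0], [1.5, 0.0], False),  # different identity, far
]
val, far = val_and_far(pairs, threshold=0.5)
print(val, far)
```

Lowering the threshold reduces FAR at the cost of VAL, so results are usually quoted as VAL at a fixed FAR.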
Experiments with FaceNet
• Accuracy on Different Models
• Embedding Dimensionality
• JPEG Compression
• Image Size
Performance on LFW
• Achieved a record-breaking classification accuracy of 99.63%.
• This reduces the error reported for DeepFace by more than a factor of 7.
• This is the performance of model NN1, but even the much smaller NN3 achieves
performance that is not statistically significantly different.
Performance on YouTube Faces DB
• The classification accuracy achieved is 95.12% (state of the art).
• The previous best effort, DeepId2+ (Sun et al.), achieved 93.2%.
Face Clustering
• These embeddings can be used to cluster a user's
personal photos into groups of people with the same
identity.
• The figure shows one cluster in a user's personal photo
collection, generated using agglomerative clustering.
• It clearly showcases the embedding's invariance to
occlusion, lighting, pose, and even age.
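A bare-bones sketch of agglomerative clustering over embeddings is shown below. The slides do not state the linkage or the stopping rule, so single linkage and a distance threshold are assumptions made here, and the 2-D points are toy stand-ins for face embeddings.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def agglomerative_clusters(embeddings, threshold):
    """Repeatedly merge the two closest clusters (single linkage).

    Stops when the closest pair of clusters is farther apart than
    the threshold. Returns a list of sorted index lists.
    """
    clusters = [[i] for i in range(len(embeddings))]

    def linkage(c1, c2):
        # Single linkage: distance between the closest members.
        return min(squared_l2(embeddings[i], embeddings[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        dist, x, y = min(
            ((linkage(clusters[x], clusters[y]), x, y)
             for x in range(len(clusters))
             for y in range(x + 1, len(clusters))),
            key=lambda t: t[0],
        )
        if dist > threshold:
            break
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return clusters

photos = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [1.1, 0.9], [0.05, 0.0]]
print(agglomerative_clusters(photos, threshold=0.1))
```

Unlike k-means, this approach does not need the number of people in the photo collection to be known in advance.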
How to Apply FaceNet
1. Create a folder with more than one image per person. The images should have fairly good
resolution and need not be cropped.
2. Images should be in grayscale and scaled accordingly.
3. Use pre-trained models to detect faces and create a bounding box around each one.
4. Train by passing the cropped images to FaceNet to learn the embeddings.
5. For testing, pass a new image that is not in the database.
6. Compute the face embedding of the new image using the same network as above, then
compare this embedding with the stored embeddings.
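The final comparison step can be sketched as a nearest-neighbour lookup over stored embeddings. Face detection and the network itself are out of scope here: the vectors below are hypothetical placeholders for embeddings the network would produce, and the `identify` helper, the database layout, and the threshold are all assumptions for illustration.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def identify(query, database, threshold=1.0):
    """Return (name, distance) for the closest stored embedding.

    database maps a person's name to a list of their stored embeddings.
    The name is None when even the best match exceeds the threshold.
    """
    best_name, best_dist = None, float("inf")
    for name, embeddings in database.items():
        for emb in embeddings:
            dist = squared_l2(query, emb)
            if dist < best_dist:
                best_name, best_dist = name, dist
    if best_dist > threshold:
        return None, best_dist
    return best_name, best_dist

# Placeholder embeddings, more than one per person as in step 1.
database = {
    "alice": [[1.0, 0.0], [0.9, 0.1]],
    "bob": [[0.0, 1.0], [0.1, 0.9]],
}
print(identify([0.95, 0.05], database))   # close to the "alice" entries
print(identify([-1.0, -1.0], database))   # far from everyone in the database
```

Returning None for distant queries lets the system report an unknown face instead of forcing a match.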
Summary and Conclusions
• State-of-the-art face recognition performance using only 128 bytes per
face.
• Minimal alignment is required on the input dataset (a tight crop around the
face area), unlike DeepFace (FAIR), which performs 3D alignment.
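The 128-byte figure is consistent with storing one byte per dimension of a 128-D embedding. The sketch below assumes a simple uniform quantization of unit-norm components from [-1, 1] to 0..255. This scheme is an illustrative assumption and is not described in the slides.

```python
import math

def quantize(embedding):
    """Map each component from [-1, 1] to one byte (0..255)."""
    return bytes(min(255, max(0, round((x + 1.0) * 127.5))) for x in embedding)

def dequantize(packed):
    """Approximate inverse of quantize."""
    return [b / 127.5 - 1.0 for b in packed]

dim = 128
embedding = [1.0 / math.sqrt(dim)] * dim  # toy unit-norm vector
packed = quantize(embedding)
restored = dequantize(packed)

print(len(packed))  # storage size in bytes
print(max(abs(a - b) for a, b in zip(embedding, restored)))  # worst-case error
```

With this scheme the per-component rounding error is bounded by half a quantization step, so distances between embeddings change only slightly.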
Related works
• Y. Taigman, M. Yang, M. Ranzato, and L. Wolf. DeepFace: Closing the
gap to human-level performance in face verification.
• Y. Sun, X. Wang, and X. Tang. Deeply learned face representations are
sparse, selective, and robust.
Appendix
Computation vs. Accuracy Trade-off
• Between 100M and 200M training face thumbnails, covering about 8M identities, are used.
• Pre-processing: faces are detected and a tight bounding box is generated around each face. The thumbnails
are then resized to the input size of the respective network, which varies from 96x96 to 224x224.
• The graph shows a strong correlation between FLOPS and accuracy. There is no comparable correlation
between accuracy and the number of parameters.
• NN1 and NN2 differ in the number of parameters by a factor of
20, but they achieve comparable performance.
• NNS2, a tiny version of NN2, can be run on a mobile phone.
Sensitivity to Image Quality
• The models are robust to JPEG
compression and perform well even
at a JPEG quality of 20.
• Performance drops only slightly at an
input image size of 120x120 and remains
acceptable even at 80x80.
Embedding Dimensionality
• They experimented with several embedding dimensionalities
and chose 128-D, as it performed best.
• Larger embeddings were expected to perform
better, but they may require more training
to do so.
• Smaller embedding dimensions could be
employed on mobile devices, with a minor loss of
accuracy.
Amount of Training Data
• A smaller model with an input size
of 96x96 was used for
this analysis. It has the same
architecture as NN2 but
without the 5x5 convolutions in the
Inception modules.

More Related Content

What's hot

An introduction to Deep Learning
An introduction to Deep LearningAn introduction to Deep Learning
An introduction to Deep LearningJulien SIMON
 
Attention is All You Need (Transformer)
Attention is All You Need (Transformer)Attention is All You Need (Transformer)
Attention is All You Need (Transformer)Jeong-Gwan Lee
 
DeepFake Detection: Challenges, Progress and Hands-on Demonstration of Techno...
DeepFake Detection: Challenges, Progress and Hands-on Demonstration of Techno...DeepFake Detection: Challenges, Progress and Hands-on Demonstration of Techno...
DeepFake Detection: Challenges, Progress and Hands-on Demonstration of Techno...Symeon Papadopoulos
 
Understanding RNN and LSTM
Understanding RNN and LSTMUnderstanding RNN and LSTM
Understanding RNN and LSTM健程 杨
 
Convolutional neural network
Convolutional neural networkConvolutional neural network
Convolutional neural networkMojammilHusain
 
Masked Autoencoders Are Scalable Vision Learners.pptx
Masked Autoencoders Are Scalable Vision Learners.pptxMasked Autoencoders Are Scalable Vision Learners.pptx
Masked Autoencoders Are Scalable Vision Learners.pptxSangmin Woo
 
DEEPFAKE DETECTION TECHNIQUES: A REVIEW
DEEPFAKE DETECTION TECHNIQUES: A REVIEWDEEPFAKE DETECTION TECHNIQUES: A REVIEW
DEEPFAKE DETECTION TECHNIQUES: A REVIEWvivatechijri
 
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Simplilearn
 
Research of adversarial example on a deep neural network
Research of adversarial example on a deep neural networkResearch of adversarial example on a deep neural network
Research of adversarial example on a deep neural networkNAVER Engineering
 
face recognition system using LBP
face recognition system using LBPface recognition system using LBP
face recognition system using LBPMarwan H. Noman
 
PR 127: FaceNet
PR 127: FaceNetPR 127: FaceNet
PR 127: FaceNetTaeoh Kim
 
3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving II3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving IIYu Huang
 
Convolutional neural network
Convolutional neural network Convolutional neural network
Convolutional neural network Yan Xu
 
Detection and recognition of face using neural network
Detection and recognition of face using neural networkDetection and recognition of face using neural network
Detection and recognition of face using neural networkSmriti Tikoo
 
Convolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNetConvolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNetSungminYou
 
Deep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksDeep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksChristian Perone
 
Deformable Convolutional Network (2017)
Deformable Convolutional Network (2017)Deformable Convolutional Network (2017)
Deformable Convolutional Network (2017)Terry Taewoong Um
 

What's hot (20)

An introduction to Deep Learning
An introduction to Deep LearningAn introduction to Deep Learning
An introduction to Deep Learning
 
Attention is All You Need (Transformer)
Attention is All You Need (Transformer)Attention is All You Need (Transformer)
Attention is All You Need (Transformer)
 
DeepFake Detection: Challenges, Progress and Hands-on Demonstration of Techno...
DeepFake Detection: Challenges, Progress and Hands-on Demonstration of Techno...DeepFake Detection: Challenges, Progress and Hands-on Demonstration of Techno...
DeepFake Detection: Challenges, Progress and Hands-on Demonstration of Techno...
 
Understanding RNN and LSTM
Understanding RNN and LSTMUnderstanding RNN and LSTM
Understanding RNN and LSTM
 
Convolutional neural network
Convolutional neural networkConvolutional neural network
Convolutional neural network
 
Face Recognition
Face RecognitionFace Recognition
Face Recognition
 
Masked Autoencoders Are Scalable Vision Learners.pptx
Masked Autoencoders Are Scalable Vision Learners.pptxMasked Autoencoders Are Scalable Vision Learners.pptx
Masked Autoencoders Are Scalable Vision Learners.pptx
 
DEEPFAKE DETECTION TECHNIQUES: A REVIEW
DEEPFAKE DETECTION TECHNIQUES: A REVIEWDEEPFAKE DETECTION TECHNIQUES: A REVIEW
DEEPFAKE DETECTION TECHNIQUES: A REVIEW
 
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
 
Research of adversarial example on a deep neural network
Research of adversarial example on a deep neural networkResearch of adversarial example on a deep neural network
Research of adversarial example on a deep neural network
 
face recognition system using LBP
face recognition system using LBPface recognition system using LBP
face recognition system using LBP
 
PR 127: FaceNet
PR 127: FaceNetPR 127: FaceNet
PR 127: FaceNet
 
Cnn
CnnCnn
Cnn
 
3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving II3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving II
 
Final year ppt
Final year pptFinal year ppt
Final year ppt
 
Convolutional neural network
Convolutional neural network Convolutional neural network
Convolutional neural network
 
Detection and recognition of face using neural network
Detection and recognition of face using neural networkDetection and recognition of face using neural network
Detection and recognition of face using neural network
 
Convolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNetConvolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNet
 
Deep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksDeep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural Networks
 
Deformable Convolutional Network (2017)
Deformable Convolutional Network (2017)Deformable Convolutional Network (2017)
Deformable Convolutional Network (2017)
 

Similar to Face Detection.pptx

Decomposing image generation into layout priction and conditional synthesis
Decomposing image generation into layout priction and conditional synthesisDecomposing image generation into layout priction and conditional synthesis
Decomposing image generation into layout priction and conditional synthesisNaeem Shehzad
 
Handwritten Digit Recognition and performance of various modelsation[autosaved]
Handwritten Digit Recognition and performance of various modelsation[autosaved]Handwritten Digit Recognition and performance of various modelsation[autosaved]
Handwritten Digit Recognition and performance of various modelsation[autosaved]SubhradeepMaji
 
Robustness of compressed CNNs
Robustness of compressed CNNsRobustness of compressed CNNs
Robustness of compressed CNNsKaushalya Madhawa
 
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNN
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNNAutomatic Detection of Window Regions in Indoor Point Clouds Using R-CNN
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNNZihao(Gerald) Zhang
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用CHENHuiMei
 
Pixel Recurrent Neural Networks
Pixel Recurrent Neural NetworksPixel Recurrent Neural Networks
Pixel Recurrent Neural Networksneouyghur
 
Improving Hardware Efficiency for DNN Applications
Improving Hardware Efficiency for DNN ApplicationsImproving Hardware Efficiency for DNN Applications
Improving Hardware Efficiency for DNN ApplicationsChester Chen
 
Image Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyImage Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyNUPUR YADAV
 
PR-355: Masked Autoencoders Are Scalable Vision Learners
PR-355: Masked Autoencoders Are Scalable Vision LearnersPR-355: Masked Autoencoders Are Scalable Vision Learners
PR-355: Masked Autoencoders Are Scalable Vision LearnersJinwon Lee
 
High quality single shot capture of facial geometry
High quality single shot capture of facial geometryHigh quality single shot capture of facial geometry
High quality single shot capture of facial geometryBrohi Aijaz Ali
 
cvpresentation-190812154654 (1).pptx
cvpresentation-190812154654 (1).pptxcvpresentation-190812154654 (1).pptx
cvpresentation-190812154654 (1).pptxPyariMohanJena
 
ppt 20BET1024.pptx
ppt 20BET1024.pptxppt 20BET1024.pptx
ppt 20BET1024.pptxManeetBali
 
FaceNet: A Unified Embedding for Face Recognition and Clustering
FaceNet: A Unified Embedding for Face Recognition and ClusteringFaceNet: A Unified Embedding for Face Recognition and Clustering
FaceNet: A Unified Embedding for Face Recognition and ClusteringWilly Marroquin (WillyDevNET)
 
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr TeterwakLearn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr TeterwakPyData
 
A Fully Progressive approach to Single image super-resolution
A Fully Progressive approach to Single image super-resolution A Fully Progressive approach to Single image super-resolution
A Fully Progressive approach to Single image super-resolution Mohammed Ashour
 
Real time multi face detection using deep learning
Real time multi face detection using deep learningReal time multi face detection using deep learning
Real time multi face detection using deep learningReallykul Kuul
 
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
Deep learning for image super resolution
Deep learning for image super resolutionDeep learning for image super resolution
Deep learning for image super resolutionPrudhvi Raj
 

Similar to Face Detection.pptx (20)

Decomposing image generation into layout priction and conditional synthesis
Decomposing image generation into layout priction and conditional synthesisDecomposing image generation into layout priction and conditional synthesis
Decomposing image generation into layout priction and conditional synthesis
 
lec6a.ppt
lec6a.pptlec6a.ppt
lec6a.ppt
 
Handwritten Digit Recognition and performance of various modelsation[autosaved]
Handwritten Digit Recognition and performance of various modelsation[autosaved]Handwritten Digit Recognition and performance of various modelsation[autosaved]
Handwritten Digit Recognition and performance of various modelsation[autosaved]
 
Robustness of compressed CNNs
Robustness of compressed CNNsRobustness of compressed CNNs
Robustness of compressed CNNs
 
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNN
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNNAutomatic Detection of Window Regions in Indoor Point Clouds Using R-CNN
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNN
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用
 
Pixel Recurrent Neural Networks
Pixel Recurrent Neural NetworksPixel Recurrent Neural Networks
Pixel Recurrent Neural Networks
 
Improving Hardware Efficiency for DNN Applications
Improving Hardware Efficiency for DNN ApplicationsImproving Hardware Efficiency for DNN Applications
Improving Hardware Efficiency for DNN Applications
 
Image Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyImage Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A survey
 
PR-355: Masked Autoencoders Are Scalable Vision Learners
PR-355: Masked Autoencoders Are Scalable Vision LearnersPR-355: Masked Autoencoders Are Scalable Vision Learners
PR-355: Masked Autoencoders Are Scalable Vision Learners
 
High quality single shot capture of facial geometry
High quality single shot capture of facial geometryHigh quality single shot capture of facial geometry
High quality single shot capture of facial geometry
 
cvpresentation-190812154654 (1).pptx
cvpresentation-190812154654 (1).pptxcvpresentation-190812154654 (1).pptx
cvpresentation-190812154654 (1).pptx
 
ppt 20BET1024.pptx
ppt 20BET1024.pptxppt 20BET1024.pptx
ppt 20BET1024.pptx
 
FaceNet: A Unified Embedding for Face Recognition and Clustering
FaceNet: A Unified Embedding for Face Recognition and ClusteringFaceNet: A Unified Embedding for Face Recognition and Clustering
FaceNet: A Unified Embedding for Face Recognition and Clustering
 
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr TeterwakLearn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
 
A Fully Progressive approach to Single image super-resolution
A Fully Progressive approach to Single image super-resolution A Fully Progressive approach to Single image super-resolution
A Fully Progressive approach to Single image super-resolution
 
Image captioning
Image captioningImage captioning
Image captioning
 
Real time multi face detection using deep learning
Real time multi face detection using deep learningReal time multi face detection using deep learning
Real time multi face detection using deep learning
 
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018
 
Deep learning for image super resolution
Deep learning for image super resolutionDeep learning for image super resolution
Deep learning for image super resolution
 

Recently uploaded

Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...amitlee9823
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...amitlee9823
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsJoseMangaJr1
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...amitlee9823
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...amitlee9823
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...amitlee9823
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangaloreamitlee9823
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...amitlee9823
 

Recently uploaded (20)

(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 

Face Detection.pptx

  • 1. FaceNet: A Unified Embedding for Face Recognition and Clustering 13th August 2021
  • 2. Types of Face recognition • Face Verification • Face Identification • Face Clustering
  • 3. Uses of Face Recognition • Protecting Problem Gamblers • Buying burgers • Speeding up hotel check-in • Etc.
  • 4. FaceNet Model Overview • FaceNet provides a unified embedding for face recognition, verification and clustering tasks. • Developed by Schroff et al. at Google and described in their 2015 paper. • FaceNet learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity. • A deep CNN is trained to optimize the embedding itself using a novel online triplet mining method. • These face embeddings achieved state-of-the-art results on standard face recognition benchmark datasets (cutting the error rate by 30% relative to the best published result as of 2015).
  • 5. FaceNet Architecture • The FaceNet model is invariant to pose, illumination, and other variational conditions. • FaceNet uses a 22-layer deep CNN that directly trains its output to be a 128-dimensional embedding. • The network is trained such that the squared L2 distance between the embeddings corresponds to face similarity.
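  • 6. Triplet Loss • Intuition: an anchor image should be closer to positive images (same identity) than to negative images (different identity). • Thus, we want: $\|f(x_i^a) - f(x_i^p)\|_2^2 + \alpha < \|f(x_i^a) - f(x_i^n)\|_2^2$ for all triplets $(x_i^a, x_i^p, x_i^n) \in \mathcal{T}$ • The loss that is being minimized: $L = \sum_{i=1}^{N} \left[ \|f(x_i^a) - f(x_i^p)\|_2^2 - \|f(x_i^a) - f(x_i^n)\|_2^2 + \alpha \right]_+$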
  • 7. Triplet Selection • Choose “hard-to-get” triplets. • However, it is computationally infeasible to compute hard positives and hard negatives over the entire dataset. • To avoid this, generate triplets online: for each anchor image, select the hard positive (argmax of distance) and the hard negative (argmin of distance) from a mini-batch rather than from the entire dataset. • They sample training data such that around 40 images are selected per identity for each mini-batch, and randomly sampled negative faces are added to each mini-batch.
  • 8. Deep Convolutional Networks • The CNN is trained using Stochastic Gradient Descent (SGD) with standard backprop and AdaGrad. • The inventors of FaceNet explored 2 types of architecture, which differ in the number of parameters and FLOPS. • Model 1 uses the Zeiler & Fergus architecture and results in a model 22 layers deep. It has a total of 140 million parameters and requires around 1.6 billion FLOPS per image. • Model 2 is based on GoogLeNet-style Inception models, which have 20× fewer parameters (around 6.6M-7.5M) and up to 5× fewer FLOPS (between 500M and 1.6B).
  • 9. Zeiler & Fergus-Inspired Architecture • Consists of multiple interleaved layers of convolutions, non-linear activations, local response normalizations, and max pooling layers (with several additional 1x1xd convolutional layers throughout). • The 1x1 conv layer is inspired by cross-channel parametric pooling.
  • 11. Dataset and Evaluation • The model is evaluated on 4 different datasets, reporting the validation rate (VAL) and false accept rate (FAR) where applicable: 1. Hold-out Test Set: 1M images having the same distribution as the training set, divided into 5 subsets. VAL and FAR are calculated on 100k x 100k image pairs. 2. Personal Photos: 12k images, with FAR and VAL calculated for 12k x 12k image pairs. 3. Labeled Faces in the Wild (LFW): the de facto academic test set for face recognition. FAR and VAL are not calculated. 4. YouTube Faces DB: the setup is similar to LFW, but pairs of videos are used instead of pairs of images. FAR and VAL are not calculated.
  • 12. Experiments with FaceNet • Accuracy on Different Models • Embedding Dimensionality • JPEG Compression • Image Size
  • 13. Performance on LFW • Achieved record-breaking classification accuracy of 99.63%. • This reduces the error reported for DeepFace by more than a factor of 7. • This is the performance of model NN1, but even the much smaller NN3 achieves performance that is not statistically significantly different. Performance on YouTube Faces DB • Classification accuracy achieved is 95.12% (state-of-the-art). • The previous best, DeepId2+ (Sun et al.), had achieved 93.2%.
  • 14. Face Clustering • These embeddings can be used to cluster a user's personal photos into groups of people with the same identity. • The figure shows one cluster in a user's personal photo collection, generated using agglomerative clustering. • It is a clear showcase of the incredible invariance to occlusion, lighting, pose and even age.
  • 15. How to Apply FaceNet • Create a folder with images (>1) per person. The images should have fairly good resolution and need not necessarily be cropped. • Images should be in grayscale and scaled accordingly. • Use pre-trained models to detect faces and create a bounding box. • Train by passing cropped images to FaceNet to learn the embeddings. • For testing, pass a new image which is not in our database. Compute the face embedding using the same network we used above and then compare this embedding with the rest of the embeddings we have.
  • 16. Summary and Conclusions • State-of-the-art face recognition performance using only 128 bytes per face. • Minimal alignment is required on the input dataset (a tight crop around the face area), unlike DeepFace (FAIR), which performs 3D alignment.
  • 17. Related works • Y. Taigman, M. Yang, M. Ranzato, and L. Wolf. Deepface: Closing the gap to human-level performance in face verification. • Y. Sun, X. Wang, and X. Tang. Deeply learned face representations are sparse, selective, and robust.
  • 19. Computation vs. Accuracy Trade-off • Between 100M and 200M training face thumbnails, covering about 8M identities, are used. • Pre-processing: detecting faces and generating a tight bounding box around each face. Thumbnails are resized to the input size of the network, which varies from 96x96 to 224x224. • The graph shows a strong correlation between FLOPS and accuracy achieved. There is no such correlation between accuracy and the number of parameters. • NN1 and NN2 differ in number of parameters by a factor of 20, but they achieve comparable performance. • NNS2, a tiny version of NN2, can be run on a mobile phone.
  • 20. Sensitivity to Image Quality • Their models are robust to JPEG compression and perform well even at a JPEG quality of 20. • The performance drop is very small with a 120x120 input image size and remains acceptable even at 80x80.
  • 21. Embedding Dimensionality • They experimented with several dimensionalities and chose 128-D, as it was the best performing. • Larger dimensionalities were expected to perform at least as well, but they may require more training to reach the same accuracy. • Smaller embedding dimensions could be employed on mobile devices, with minor loss of accuracy.
  • 22. Amount of Training Data • A smaller model with an input size of 96x96 was employed for this analysis. It has the same architecture as NN2 but without the 5x5 conv. in the inception module.
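
The triplet loss and online triplet selection from slides 6 and 7 can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`squared_l2`, `triplet_loss`, `batch_triplet_loss`) are hypothetical, embeddings are plain Python lists instead of CNN outputs, and the selection rule is the simplified one described in the notes (all anchor-positive pairs in the mini-batch, paired with the hardest in-batch negative). The margin default of 0.2 is the value quoted in the editor's notes.

```python
def squared_l2(u, v):
    """Squared Euclidean (L2) distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, alpha=0.2):
    """Hinge-style triplet loss for a single (anchor, positive, negative) triplet.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least the margin alpha.
    """
    gap = squared_l2(anchor, positive) - squared_l2(anchor, negative) + alpha
    return max(gap, 0.0)


def batch_triplet_loss(embeddings, labels, alpha=0.2):
    """Mean triplet loss over one mini-batch with online triplet selection.

    Assumption for this sketch: every anchor-positive pair in the batch is
    used, and each anchor is paired with its hardest negative, i.e. the
    different-identity embedding closest to the anchor (argmin of distance).
    """
    total, count = 0.0, 0
    n = len(embeddings)
    for a in range(n):
        negatives = [j for j in range(n) if labels[j] != labels[a]]
        if not negatives:
            continue  # no negative available for this anchor in the batch
        hardest = min(negatives,
                      key=lambda j: squared_l2(embeddings[a], embeddings[j]))
        for p in range(n):
            if p != a and labels[p] == labels[a]:
                total += triplet_loss(embeddings[a], embeddings[p],
                                      embeddings[hardest], alpha)
                count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    # Toy 2-D "embeddings" for two identities (labels 0 and 1).
    batch = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-1.0, 0.0]]
    ids = [0, 0, 1, 1]
    print(batch_triplet_loss(batch, ids))
```

In a real training loop the embeddings would come from the network's forward pass and the loss would be backpropagated; the sketch only shows how the triplets are formed and scored within a mini-batch.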

Editor's Notes

  1. The model extracts high-quality features from the face and predicts a face embedding. Previous papers typically used a CNN followed by PCA for dimensionality reduction and then an SVM for classification. Some first warped faces into a canonical frontal view and then trained a CNN.
  2. FaceNet treats the CNN architecture as a black box. FaceNet doesn't define any new algorithm. Rather, it just creates the embeddings, which can be directly used for face recognition, verification and clustering.
  3. Here α is a margin that is enforced between positive and negative pairs. T is the set of all possible triplets in the training set and has cardinality N. In other words, we want the distance between the embedding of our anchor image and the embeddings of our positive images to be smaller than the distance between the embedding of our anchor image and the embeddings of our negative images. Alpha is essentially a threshold on the gap between those two distances. If, say, alpha is set to 0.5, then the anchor-negative distance must exceed the anchor-positive distance by at least 0.5.
  4. Generating all possible triplets would result in many triplets that are easily satisfied (i.e. fulfill the constraint in Eq. (1)). These triplets would not contribute to the training and would result in slower convergence, as they would still be passed through the network. It is crucial to select hard triplets that are active and can therefore contribute to improving the model. This process speeds up convergence as our model learns useful representations. This triplet selection technique ensures consistently increasing difficulty of triplets as the network trains. Instead of picking the “hardest” positive for a given anchor, they used all the anchor-positive pairs within the batch while still selecting hard negatives; they found that this leads to a more stable and faster-converging solution.
  5. AdaGrad is used to generate variable learning rates. Fixed learning rates do not work well in deep learning. In CNNs, where each layer detects a different kind of feature (edges, patterns etc.), a fixed learning rate works poorly, as different layers in the network require different learning rates to work optimally. The best model may be different depending on the application. E.g. a model running in a datacenter can have many parameters and require a large number of FLOPS, whereas a model running on a mobile phone needs to have few parameters, so that it can fit into memory. The initial learning rate is 0.05, the margin is set to 0.2 and ReLU is chosen as the activation function.
  6. (which recently won the ImageNet competition in 2014)
  7. A squared L2 distance threshold D(xi, xj) is used to classify a pair of face images as same or different. All face pairs (i, j) of the same identity are denoted with Psame, whereas all pairs of different identities are denoted with Pdiff.
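
The evaluation described in the last note can be sketched as follows, assuming the usual definitions: VAL is the fraction of same-identity pairs (Psame) whose squared L2 distance falls at or below the threshold, and FAR is the fraction of different-identity pairs (Pdiff) that are wrongly accepted at the same threshold. The function names and the toy data are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def squared_l2(u, v):
    """Squared Euclidean (L2) distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def val_far(embeddings, labels, threshold):
    """Return (VAL, FAR) for all unordered pairs at a given distance threshold.

    A pair is "accepted" (classified as the same identity) when its squared
    L2 distance is at most `threshold`.
    """
    same_total = diff_total = 0
    true_accepts = false_accepts = 0
    for i, j in combinations(range(len(embeddings)), 2):
        accepted = squared_l2(embeddings[i], embeddings[j]) <= threshold
        if labels[i] == labels[j]:      # pair belongs to Psame
            same_total += 1
            true_accepts += accepted
        else:                           # pair belongs to Pdiff
            diff_total += 1
            false_accepts += accepted
    val = true_accepts / same_total if same_total else 0.0
    far = false_accepts / diff_total if diff_total else 0.0
    return val, far


if __name__ == "__main__":
    # Toy 2-D "embeddings" for two identities (labels 0 and 1).
    batch = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-1.0, 0.0]]
    ids = [0, 0, 1, 1]
    print(val_far(batch, ids, threshold=1.0))
```

Sweeping `threshold` over a range of values traces out the trade-off between the two rates, which is how a VAL figure at a fixed FAR would be read off.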