PR-185
RetinaFace: Single-stage Dense Face Localisation in the Wild
visionNoob
Deng, Jiankang, et al. "RetinaFace: Single-stage Dense Face Localisation in the Wild." arXiv preprint arXiv:1905.00641 (2019).
(Submitted on 2 May 2019 (v1), last revised 4 May 2019 (this version, v2))
Face Detection
state-of-the-art face detection
Definition : face localization
Broader definition : face localization + landmark detection + pixel-wise face parsing + 3d reconstruction
0. Face Recognition
Naïve Example : Face Verification
[Figure: each of two face images goes through Preprocessing -> Encoder -> embedding in ℝ^d -> L2 norm -> unit vector; the similarity of the two unit vectors lies in [0, 1]]
if similarity > threshold:
    same
else:
    not the same
Preprocessing
[Figure: Detecting (1. facial location, 2. facial landmarks) -> ROI region -> face registration to a 112px × 112px crop]
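The verification pipeline on this slide (encoder output, L2 normalisation, similarity, threshold) can be sketched in a few lines. This is a minimal illustration, not the presenter's code: the function names and the default threshold of 0.5 are hypothetical, and it treats a larger similarity as "same person", which is the usual convention for cosine similarity.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (the "L2norm" step on the slide)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(emb_a, emb_b):
    """Dot product of the two unit vectors."""
    unit_a, unit_b = l2_normalize(emb_a), l2_normalize(emb_b)
    return sum(x * y for x, y in zip(unit_a, unit_b))


def is_same_person(emb_a, emb_b, threshold=0.5):
    """Decide 'same' when the similarity exceeds the threshold.

    threshold=0.5 is a placeholder, not a value taken from the slides.
    """
    return cosine_similarity(emb_a, emb_b) > threshold
```

In practice the embeddings would come from the encoder applied to the aligned 112px crops, and the threshold would be tuned on a validation set.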
1. Introduction
1.2 RetinaFace
face localization (bbox) + face landmarks (key points) + dense localization mask
1. Introduction
1.3 Main Contributions
1. Based on a single-stage design, we propose a novel pixel-wise face localisation
method named RetinaFace, which employs a multi-task learning strategy to
simultaneously predict face score, face box, five facial landmarks, and 3D position and
correspondence of each facial pixel.
2. On the WIDER FACE hard subset, RetinaFace outperforms the AP of the state of the
art two-stage method.
3. On the IJB-C dataset, RetinaFace helps to improve ArcFace’s verification accuracy.
4. By employing light-weight backbone networks, RetinaFace can run real-time on a
single CPU core for a VGA-resolution image.
5. Extra annotations and code have been released to facilitate future research.
WIDER Face & Person Challenge 2019
- Track 1: Face Detection
- Track 2: Pedestrian Detection
- Track 3: Cast Search by Portrait
- Track 4: Person Search by Language
http://wider-challenge.org/2019.html
2. Related Work
2.1 Image Pyramid vs Feature Pyramid
2. Related Work
2.1. Image pyramid vs. feature pyramid
2.2. Two-stage vs. single-stage
2.3. Context Modelling
2.4. Multi-task Learning
Hao, Zekun, et al. "Scale-aware face detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
[Figure: image pyramid vs. feature pyramid]
2. Related Work
2.2 Two-stage vs. single-stage
2. Related Work
2.3 Context Modeling
Context Module
To enhance the model’s contextual reasoning power.
Deformable Convolutional Network
J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, and Y. Wei. Deformable convolutional networks. In ICCV, 2017.
X. Zhu, H. Hu, S. Lin, and J. Dai. Deformable convnets v2: More deformable, better results. arXiv:1811.11168, 2018.
2. Related Work
2.4 Multi-task Learning
He, Kaiming, et al. "Mask r-cnn." Proceedings of the IEEE international conference on computer vision. 2017.
Mask R-CNN
Multi-task learning
3. RetinaFace
3.1. Multi-task Loss
3. RetinaFace
3.1. Multi-task loss
3.2. Dense Regression Branch
Multi-task learning
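The slide names the multi-task loss without writing it out. As a rough sketch of how such a loss is usually combined: the classification term applies to every anchor, while the box, landmark and dense-regression terms apply only to positive anchors and are down-weighted. The function name and argument names below are hypothetical. The default weights (0.25, 0.1, 0.01) are the balancing values reported in the RetinaFace paper, not something stated on this slide, so treat them as an assumption to check against the paper.

```python
def multi_task_loss(l_cls, l_box, l_pts, l_pixel, is_positive,
                    lambda_box=0.25, lambda_pts=0.1, lambda_pixel=0.01):
    """Combine the four per-anchor loss terms into one scalar.

    l_cls    face / not-face classification loss
    l_box    box regression loss
    l_pts    five-landmark regression loss
    l_pixel  dense regression loss
    The three regression terms are switched off for negative anchors.
    """
    p_star = 1.0 if is_positive else 0.0
    return (l_cls
            + lambda_box * p_star * l_box
            + lambda_pts * p_star * l_pts
            + lambda_pixel * p_star * l_pixel)
```

For a negative anchor only the classification loss contributes, which is why the landmark and dense branches need enough positive anchors to receive a useful training signal.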
3. RetinaFace
3.2. Dense Regression Branch
Zhou, Yuxiang, et al. "Dense 3D Face Decoding over 2500FPS: Joint Texture & Shape Convolutional Mesh Decoders." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.
4. Experiments
4.1 Dataset
WIDER face (hard)
- 32,203 images, 393,703 face bboxes
(with a high degree of variability in scale, pose, expression, occlusion and illumination)
[Example images: car accident, couple, concert]
4.1. Dataset
4.2. Implementation details
4.3. Ablation Study
4.4. Face box Accuracy
4.5. Five Facial Landmark Accuracy
4.6. Dense Facial Landmark Accuracy
4.7. Face Recognition Accuracy
4.8. Inference Efficiency
Extra Annotation
- Facial landmarks (eye centres, nose tip and mouth corners)
- 84.6k faces on the training set and 18.5k faces on the validation set.
4. Experiments
4.2 Implementation details
1. Feature pyramid
2. Context module
3. Anchor setting
4. Data augmentation
5. Training detail
6. Testing detail
# of anchors * (2 + 4 + 10 + 128 + 7 + 9)
Conv -> DCN
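The channel count quoted above is plain arithmetic, and a short sketch makes the breakdown explicit. The labels attached to each number are an interpretation based on the paper (score, box, five landmarks, and the mesh decoder's shape/texture, camera and illumination parameters), not something the slide spells out; the names `HEAD_DIMS` and `head_channels` are hypothetical.

```python
# Outputs predicted per anchor, following the sum on the slide.
# The meaning given to each number is an assumption, see the note above.
HEAD_DIMS = {
    "face_score": 2,       # face / not face
    "box": 4,              # bounding-box regression
    "landmarks": 10,       # 5 landmarks x (x, y)
    "shape_texture": 128,  # shape and texture parameters
    "camera": 7,           # camera parameters
    "illumination": 9,     # illumination parameters
}


def head_channels(num_anchors):
    """Output channels of the prediction head at one feature-map location."""
    return num_anchors * sum(HEAD_DIMS.values())
```

With three anchors per location this gives 3 * 160 = 480 output channels per pyramid level.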
Anchor setting
- Scale step at 2^(1/3) and the aspect ratio at 1:1
- With the input image size at 640 × 640, the anchors cover scales from 16 × 16 to 406 × 406 across the feature pyramid levels.
- In total there are 102,300 anchors, and 75% of them are from P2.
- OHEM (online hard example mining) with a positive : negative ratio of 1:3
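The anchor numbers on this slide can be checked with a few lines of arithmetic. The sketch below assumes five pyramid levels P2 to P6 with strides 4, 8, 16, 32 and 64, three anchor scales per level (one octave at a step of 2^(1/3)), and a smallest anchor of 16 on P2. The strides and constant names are assumptions for illustration; they are not listed on the slide.

```python
INPUT_SIZE = 640
# Assumed strides for P2..P6 (not stated on the slide).
STRIDES = {"P2": 4, "P3": 8, "P4": 16, "P5": 32, "P6": 64}
SCALES_PER_LEVEL = 3   # scale step 2^(1/3) gives three scales per octave
BASE_SIZE = 16         # smallest anchor side, on P2


def anchor_counts():
    """Number of anchors on each pyramid level for a square input."""
    counts = {}
    for level, stride in STRIDES.items():
        cells = (INPUT_SIZE // stride) ** 2
        counts[level] = cells * SCALES_PER_LEVEL
    return counts


def anchor_scales():
    """All anchor side lengths, from the smallest on P2 to the largest on P6."""
    n = len(STRIDES) * SCALES_PER_LEVEL
    return [BASE_SIZE * 2 ** (k / 3) for k in range(n)]


counts = anchor_counts()
total = sum(counts.values())
p2_share = counts["P2"] / total
scales = anchor_scales()
print(total, round(p2_share, 3), round(scales[0]), round(scales[-1]))
```

Under these assumptions the script reproduces the slide: 102,300 anchors in total, about 75% of them on P2, and side lengths running from 16 to roughly 406.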
Data augmentation
- Random crop
- Horizontal flip
- Photo-metric color distortion
Training Details
- SGD (momentum at 0.9, weight decay at 0.0005, batch size of 8 × 4) on four NVIDIA Tesla P40 (24GB) GPUs.
- The learning rate starts from 10^-3, rises to 10^-2 after 5 epochs, and is then divided by 10 at epochs 55 and 68.
- Training terminates at 80 epochs.
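The learning-rate schedule described above (warm-up, then two step decays) can be written as a small function. This is a sketch: the slide gives only the start and end of the warm-up, so the linear ramp is an assumption, and the function and parameter names are hypothetical.

```python
def learning_rate(epoch, warmup_epochs=5, base_lr=1e-2, start_lr=1e-3,
                  decay_epochs=(55, 68), total_epochs=80):
    """Learning rate at a given (possibly fractional) epoch index."""
    if not 0 <= epoch < total_epochs:
        raise ValueError("epoch outside the training run")
    if epoch < warmup_epochs:
        # Linear warm-up is an assumption; the slide only gives the end points.
        frac = epoch / warmup_epochs
        return start_lr + frac * (base_lr - start_lr)
    # Divide by 10 once for every decay milestone already reached.
    drops = sum(1 for milestone in decay_epochs if epoch >= milestone)
    return base_lr / (10 ** drops)
```

For example, `learning_rate(0)` returns the starting rate, `learning_rate(5)` the peak rate, and any epoch from 68 onwards the peak rate divided by 100.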
Testing Details
- Flip and multi-scale testing (short edge of the image at [500, 800, 1100, 1400, 1700]).
- Box voting with an IoU threshold of 0.4 (plain NMS also works).
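Since the slide notes that NMS can stand in for box voting, here is a minimal greedy NMS sketch at the same IoU threshold of 0.4. It is plain Python for illustration rather than the released implementation; boxes are assumed to be `(x1, y1, x2, y2)` tuples, and box voting itself (averaging overlapping boxes) is not implemented here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.4):
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

With multi-scale and flip testing, the detections from all passes would be pooled into one list of boxes and scores before this step.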
4. Experiments – Ablation study
[Overview diagram]
- WIDER Face Dataset (easy, medium, hard)
- RetinaFace: face detection, face 5 landmarks detection, face 3D reconstruction
- Extra supervision
- Lightweight backbone (MobileNet) -> Realtime inference
- SOTA (AP 91.4%)
- ArcFace (with RetinaFace), IJB-C Dataset: better verification accuracy
4. Experiments
4.3. Ablation Study
[Table: results reported at IoU=0.5:0.05:0.95 and at IoU=0.5]
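The notation IoU=0.5:0.05:0.95 means the metric is averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05, whereas IoU=0.5 uses a single threshold. A small sketch of that averaging follows; the function names are hypothetical and the AP values would come from an evaluation script that is not shown.

```python
def iou_thresholds(start=0.5, stop=0.95, step=0.05):
    """Expand the start:step:stop notation into an explicit list."""
    n = int(round((stop - start) / step)) + 1
    return [round(start + k * step, 2) for k in range(n)]


def mean_ap(ap_per_threshold):
    """Average AP over thresholds; the argument maps IoU threshold -> AP."""
    return sum(ap_per_threshold.values()) / len(ap_per_threshold)
```

Averaging over the stricter thresholds rewards tighter boxes, so it is a harder metric than AP at IoU=0.5 alone.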
He, Kaiming, et al. "Mask r-cnn." Proceedings of the IEEE international conference on computer vision. 2017.
From Mask r-cnn
4. Experiments
4.4. Face box Accuracy (WIDER face)
4. Experiments
4.5. Five Facial Landmark Accuracy
[Figure: normalised mean errors (NME) and cumulative error distribution (CED)]
https://pdfs.semanticscholar.org/b4d2/151e29fb12dbe5d164b430273de65103d39b.pdf
Failure rate: 26.31% (MTCNN) -> 9.37% (RetinaFace)
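The two landmark metrics named on this slide, NME and CED, can be sketched as follows. The normalising length is passed in explicitly because the slide does not say which one is used (a face-box size such as sqrt(W × H) is one common choice). The 10% failure threshold is likewise an assumption, and all function names are hypothetical.

```python
import math


def nme(pred, gt, normaliser):
    """Normalised mean error: mean point-to-point distance / normaliser."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    errors = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(errors) / len(errors) / normaliser


def ced(nme_values, thresholds):
    """Cumulative error distribution: fraction of samples at or below each threshold."""
    n = len(nme_values)
    return [sum(v <= t for v in nme_values) / n for t in thresholds]


def failure_rate(nme_values, threshold=0.10):
    """Fraction of samples whose NME exceeds the threshold (assumed 10%)."""
    return sum(v > threshold for v in nme_values) / len(nme_values)
```

The failure rate is one minus the CED value at the chosen threshold, so a curve that rises faster corresponds to a lower failure rate.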
4. Experiments
4.6. Dense Facial Landmark Accuracy
4. Experiments
4.7. Face Recognition Accuracy
4. Experiments
4.8. Inference Efficiency
https://github.com/deepinsight/insightface/tree/master/RetinaFace
Yoo, YoungJoon, Dongyoon Han, and Sangdoo Yun. "EXTD: Extremely Tiny Face Detector via Iterative Filter Reuse." arXiv preprint arXiv:1906.06579 (2019).
5. Conclusion
[Overview diagram]
- WIDER Face Dataset (easy, medium, hard)
- RetinaFace: face detection, face 5 landmarks detection, face 3D reconstruction
- Extra supervision
- Lightweight backbone (MobileNet) -> Realtime inference
- SOTA (AP 91.4%)
- ArcFace (with RetinaFace), IJB-C Dataset: better verification accuracy
Code is available at https://github.com/deepinsight/insightface (MXNet)
Lightweight Face Recognition Challenge
https://ibug.doc.ic.ac.uk/resources/lightweight-face-recognition-challenge-workshop/
Discussion

More Related Content

What's hot

PR-302: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
PR-302: NeRF: Representing Scenes as Neural Radiance Fields for View SynthesisPR-302: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
PR-302: NeRF: Representing Scenes as Neural Radiance Fields for View SynthesisHyeongmin Lee
 
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis taeseon ryu
 
A Comparison of Loss Function on Deep Embedding
A Comparison of Loss Function on Deep EmbeddingA Comparison of Loss Function on Deep Embedding
A Comparison of Loss Function on Deep EmbeddingCenk Bircanoğlu
 
Introduction_to_DEEP_LEARNING.ppt
Introduction_to_DEEP_LEARNING.pptIntroduction_to_DEEP_LEARNING.ppt
Introduction_to_DEEP_LEARNING.pptSwatiMahale4
 
Physically Based Rendering
Physically Based RenderingPhysically Based Rendering
Physically Based RenderingKoray Hagen
 
Modern face recognition with deep learning
Modern face recognition with deep learningModern face recognition with deep learning
Modern face recognition with deep learningmarada0033
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networksDing Li
 
Deep Belief Networks
Deep Belief NetworksDeep Belief Networks
Deep Belief NetworksHasan H Topcu
 
Face Recognition Methods based on Convolutional Neural Networks
Face Recognition Methods based on Convolutional Neural NetworksFace Recognition Methods based on Convolutional Neural Networks
Face Recognition Methods based on Convolutional Neural NetworksElaheh Rashedi
 
Attn-gan : fine-grained text to image generation
Attn-gan :  fine-grained text to image generationAttn-gan :  fine-grained text to image generation
Attn-gan : fine-grained text to image generationKyuYeolJung
 
Face recognition v1
Face recognition v1Face recognition v1
Face recognition v1San Kim
 
PR-315: Taming Transformers for High-Resolution Image Synthesis
PR-315: Taming Transformers for High-Resolution Image SynthesisPR-315: Taming Transformers for High-Resolution Image Synthesis
PR-315: Taming Transformers for High-Resolution Image SynthesisHyeongmin Lee
 
Learning Theory 101 ...and Towards Learning the Flat Minima
Learning Theory 101 ...and Towards Learning the Flat MinimaLearning Theory 101 ...and Towards Learning the Flat Minima
Learning Theory 101 ...and Towards Learning the Flat MinimaSangwoo Mo
 
MobileNet Review | Mobile Net Research Paper Review | MobileNet v1 Paper Expl...
MobileNet Review | Mobile Net Research Paper Review | MobileNet v1 Paper Expl...MobileNet Review | Mobile Net Research Paper Review | MobileNet v1 Paper Expl...
MobileNet Review | Mobile Net Research Paper Review | MobileNet v1 Paper Expl...Laxmi Kant Tiwari
 
Image inpainting
Image inpaintingImage inpainting
Image inpaintingKhoaBiNgcng
 
Customizing LLMs
Customizing LLMsCustomizing LLMs
Customizing LLMsJim Steele
 
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Universitat Politècnica de Catalunya
 
Survey of Super Resolution Task (SISR Only)
Survey of Super Resolution Task (SISR Only)Survey of Super Resolution Task (SISR Only)
Survey of Super Resolution Task (SISR Only)MYEONGGYU LEE
 
Presentation on Neural Style Transfer
Presentation on Neural Style TransferPresentation on Neural Style Transfer
Presentation on Neural Style TransferSanjoy Datta
 

What's hot (20)

PR-302: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
PR-302: NeRF: Representing Scenes as Neural Radiance Fields for View SynthesisPR-302: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
PR-302: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
 
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
 
A Comparison of Loss Function on Deep Embedding
A Comparison of Loss Function on Deep EmbeddingA Comparison of Loss Function on Deep Embedding
A Comparison of Loss Function on Deep Embedding
 
Introduction_to_DEEP_LEARNING.ppt
Introduction_to_DEEP_LEARNING.pptIntroduction_to_DEEP_LEARNING.ppt
Introduction_to_DEEP_LEARNING.ppt
 
Physically Based Rendering
Physically Based RenderingPhysically Based Rendering
Physically Based Rendering
 
Modern face recognition with deep learning
Modern face recognition with deep learningModern face recognition with deep learning
Modern face recognition with deep learning
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks
 
Deep Belief Networks
Deep Belief NetworksDeep Belief Networks
Deep Belief Networks
 
Face Recognition Methods based on Convolutional Neural Networks
Face Recognition Methods based on Convolutional Neural NetworksFace Recognition Methods based on Convolutional Neural Networks
Face Recognition Methods based on Convolutional Neural Networks
 
Attn-gan : fine-grained text to image generation
Attn-gan :  fine-grained text to image generationAttn-gan :  fine-grained text to image generation
Attn-gan : fine-grained text to image generation
 
Face recognition v1
Face recognition v1Face recognition v1
Face recognition v1
 
PR-315: Taming Transformers for High-Resolution Image Synthesis
PR-315: Taming Transformers for High-Resolution Image SynthesisPR-315: Taming Transformers for High-Resolution Image Synthesis
PR-315: Taming Transformers for High-Resolution Image Synthesis
 
Learning Theory 101 ...and Towards Learning the Flat Minima
Learning Theory 101 ...and Towards Learning the Flat MinimaLearning Theory 101 ...and Towards Learning the Flat Minima
Learning Theory 101 ...and Towards Learning the Flat Minima
 
MobileNet Review | Mobile Net Research Paper Review | MobileNet v1 Paper Expl...
MobileNet Review | Mobile Net Research Paper Review | MobileNet v1 Paper Expl...MobileNet Review | Mobile Net Research Paper Review | MobileNet v1 Paper Expl...
MobileNet Review | Mobile Net Research Paper Review | MobileNet v1 Paper Expl...
 
Image inpainting
Image inpaintingImage inpainting
Image inpainting
 
Transfer Learning
Transfer LearningTransfer Learning
Transfer Learning
 
Customizing LLMs
Customizing LLMsCustomizing LLMs
Customizing LLMs
 
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
 
Survey of Super Resolution Task (SISR Only)
Survey of Super Resolution Task (SISR Only)Survey of Super Resolution Task (SISR Only)
Survey of Super Resolution Task (SISR Only)
 
Presentation on Neural Style Transfer
Presentation on Neural Style TransferPresentation on Neural Style Transfer
Presentation on Neural Style Transfer
 

Similar to PR-185: RetinaFace: Single-stage Dense Face Localisation in the Wild

Report face recognition : ArganRecogn
Report face recognition :  ArganRecognReport face recognition :  ArganRecogn
Report face recognition : ArganRecognIlyas CHAOUA
 
Deep learning for understanding faces
Deep learning for understanding facesDeep learning for understanding faces
Deep learning for understanding facessieubebu
 
IRJET- Prediction of Facial Attribute without Landmark Information
IRJET-  	  Prediction of Facial Attribute without Landmark InformationIRJET-  	  Prediction of Facial Attribute without Landmark Information
IRJET- Prediction of Facial Attribute without Landmark InformationIRJET Journal
 
Real time multi face detection using deep learning
Real time multi face detection using deep learningReal time multi face detection using deep learning
Real time multi face detection using deep learningReallykul Kuul
 
Face and Eye Detection Varying Scenarios With Haar Classifier_2015
Face and Eye Detection Varying Scenarios With Haar Classifier_2015Face and Eye Detection Varying Scenarios With Haar Classifier_2015
Face and Eye Detection Varying Scenarios With Haar Classifier_2015Showrav Mazumder
 
IRJET- A Survey on Facial Expression Recognition Robust to Partial Occlusion
IRJET- A Survey on Facial Expression Recognition Robust to Partial OcclusionIRJET- A Survey on Facial Expression Recognition Robust to Partial Occlusion
IRJET- A Survey on Facial Expression Recognition Robust to Partial OcclusionIRJET Journal
 
Realtime face matching and gender prediction based on deep learning
Realtime face matching and gender prediction based on deep learningRealtime face matching and gender prediction based on deep learning
Realtime face matching and gender prediction based on deep learningIJECEIAES
 
IRJET - A Review on Face Recognition using Deep Learning Algorithm
IRJET -  	  A Review on Face Recognition using Deep Learning AlgorithmIRJET -  	  A Review on Face Recognition using Deep Learning Algorithm
IRJET - A Review on Face Recognition using Deep Learning AlgorithmIRJET Journal
 
Real-time eyeglass detection using transfer learning for non-standard facial...
Real-time eyeglass detection using transfer learning for  non-standard facial...Real-time eyeglass detection using transfer learning for  non-standard facial...
Real-time eyeglass detection using transfer learning for non-standard facial...IJECEIAES
 
Iris & Peri-ocular Recognition
Iris & Peri-ocular RecognitionIris & Peri-ocular Recognition
Iris & Peri-ocular RecognitionShashank Dhariwal
 
Long-term Face Tracking in the Wild using Deep Learning
Long-term Face Tracking in the Wild using Deep LearningLong-term Face Tracking in the Wild using Deep Learning
Long-term Face Tracking in the Wild using Deep LearningElaheh Rashedi
 
Multimodal Biometrics Recognition from Facial Video via Deep Learning
Multimodal Biometrics Recognition from Facial Video via Deep Learning Multimodal Biometrics Recognition from Facial Video via Deep Learning
Multimodal Biometrics Recognition from Facial Video via Deep Learning cscpconf
 
MULTIMODAL BIOMETRICS RECOGNITION FROM FACIAL VIDEO VIA DEEP LEARNING
MULTIMODAL BIOMETRICS RECOGNITION FROM FACIAL VIDEO VIA DEEP LEARNINGMULTIMODAL BIOMETRICS RECOGNITION FROM FACIAL VIDEO VIA DEEP LEARNING
MULTIMODAL BIOMETRICS RECOGNITION FROM FACIAL VIDEO VIA DEEP LEARNINGcsandit
 
Secure System based on Dynamic Features of IRIS Recognition
Secure System based on Dynamic Features of IRIS RecognitionSecure System based on Dynamic Features of IRIS Recognition
Secure System based on Dynamic Features of IRIS Recognitionijsrd.com
 
Criminal Detection System
Criminal Detection SystemCriminal Detection System
Criminal Detection SystemIntrader Amit
 
IRJET- Face Spoofing Detection Based on Texture Analysis and Color Space Conv...
IRJET- Face Spoofing Detection Based on Texture Analysis and Color Space Conv...IRJET- Face Spoofing Detection Based on Texture Analysis and Color Space Conv...
IRJET- Face Spoofing Detection Based on Texture Analysis and Color Space Conv...IRJET Journal
 

Similar to PR-185: RetinaFace: Single-stage Dense Face Localisation in the Wild (20)

Semantic 3DTV Content Analysis and Description
Semantic 3DTV Content Analysis and DescriptionSemantic 3DTV Content Analysis and Description
Semantic 3DTV Content Analysis and Description
 
Report face recognition : ArganRecogn
Report face recognition :  ArganRecognReport face recognition :  ArganRecogn
Report face recognition : ArganRecogn
 
Deep learning for understanding faces
Deep learning for understanding facesDeep learning for understanding faces
Deep learning for understanding faces
 
IRJET- Prediction of Facial Attribute without Landmark Information
IRJET-  	  Prediction of Facial Attribute without Landmark InformationIRJET-  	  Prediction of Facial Attribute without Landmark Information
IRJET- Prediction of Facial Attribute without Landmark Information
 
Real time multi face detection using deep learning
Real time multi face detection using deep learningReal time multi face detection using deep learning
Real time multi face detection using deep learning
 
Deep Learning for Computer Vision: Face Recognition (UPC 2016)
Deep Learning for Computer Vision: Face Recognition (UPC 2016)Deep Learning for Computer Vision: Face Recognition (UPC 2016)
Deep Learning for Computer Vision: Face Recognition (UPC 2016)
 
Real time facial expression analysis using pca
Real time facial expression analysis using pcaReal time facial expression analysis using pca
Real time facial expression analysis using pca
 
Face and Eye Detection Varying Scenarios With Haar Classifier_2015
Face and Eye Detection Varying Scenarios With Haar Classifier_2015Face and Eye Detection Varying Scenarios With Haar Classifier_2015
Face and Eye Detection Varying Scenarios With Haar Classifier_2015
 
IRJET- A Survey on Facial Expression Recognition Robust to Partial Occlusion
IRJET- A Survey on Facial Expression Recognition Robust to Partial OcclusionIRJET- A Survey on Facial Expression Recognition Robust to Partial Occlusion
IRJET- A Survey on Facial Expression Recognition Robust to Partial Occlusion
 
Realtime face matching and gender prediction based on deep learning
Realtime face matching and gender prediction based on deep learningRealtime face matching and gender prediction based on deep learning
Realtime face matching and gender prediction based on deep learning
 
IRJET - A Review on Face Recognition using Deep Learning Algorithm
IRJET -  	  A Review on Face Recognition using Deep Learning AlgorithmIRJET -  	  A Review on Face Recognition using Deep Learning Algorithm
IRJET - A Review on Face Recognition using Deep Learning Algorithm
 
Real-time eyeglass detection using transfer learning for non-standard facial...
Real-time eyeglass detection using transfer learning for  non-standard facial...Real-time eyeglass detection using transfer learning for  non-standard facial...
Real-time eyeglass detection using transfer learning for non-standard facial...
 
Iris & Peri-ocular Recognition
Iris & Peri-ocular RecognitionIris & Peri-ocular Recognition
Iris & Peri-ocular Recognition
 
Long-term Face Tracking in the Wild using Deep Learning
Long-term Face Tracking in the Wild using Deep LearningLong-term Face Tracking in the Wild using Deep Learning
Long-term Face Tracking in the Wild using Deep Learning
 
Multimodal Biometrics Recognition from Facial Video via Deep Learning
Multimodal Biometrics Recognition from Facial Video via Deep Learning Multimodal Biometrics Recognition from Facial Video via Deep Learning
Multimodal Biometrics Recognition from Facial Video via Deep Learning
 
MULTIMODAL BIOMETRICS RECOGNITION FROM FACIAL VIDEO VIA DEEP LEARNING
MULTIMODAL BIOMETRICS RECOGNITION FROM FACIAL VIDEO VIA DEEP LEARNINGMULTIMODAL BIOMETRICS RECOGNITION FROM FACIAL VIDEO VIA DEEP LEARNING
MULTIMODAL BIOMETRICS RECOGNITION FROM FACIAL VIDEO VIA DEEP LEARNING
 
Introducing Set Of Internal Parameters For Laplacian Faces
Introducing Set Of Internal Parameters For Laplacian FacesIntroducing Set Of Internal Parameters For Laplacian Faces
Introducing Set Of Internal Parameters For Laplacian Faces
 
Secure System based on Dynamic Features of IRIS Recognition
Secure System based on Dynamic Features of IRIS RecognitionSecure System based on Dynamic Features of IRIS Recognition
Secure System based on Dynamic Features of IRIS Recognition
 
Criminal Detection System
Criminal Detection SystemCriminal Detection System
Criminal Detection System
 
IRJET- Face Spoofing Detection Based on Texture Analysis and Color Space Conv...
IRJET- Face Spoofing Detection Based on Texture Analysis and Color Space Conv...IRJET- Face Spoofing Detection Based on Texture Analysis and Color Space Conv...
IRJET- Face Spoofing Detection Based on Texture Analysis and Color Space Conv...
 

More from jaewon lee

PR-199: SNIPER:Efficient Multi Scale Training
PR-199: SNIPER:Efficient Multi Scale TrainingPR-199: SNIPER:Efficient Multi Scale Training
PR-199: SNIPER:Efficient Multi Scale Trainingjaewon lee
 
PR-146: CornerNet detecting objects as paired keypoints
PR-146: CornerNet detecting objects as paired keypointsPR-146: CornerNet detecting objects as paired keypoints
PR-146: CornerNet detecting objects as paired keypointsjaewon lee
 
PR 171: Large margin softmax loss for Convolutional Neural Networks
PR 171: Large margin softmax loss for Convolutional Neural NetworksPR 171: Large margin softmax loss for Convolutional Neural Networks
PR 171: Large margin softmax loss for Convolutional Neural Networksjaewon lee
 
PR157: Best of both worlds: human-machine collaboration for object annotation
PR157: Best of both worlds: human-machine collaboration for object annotationPR157: Best of both worlds: human-machine collaboration for object annotation
PR157: Best of both worlds: human-machine collaboration for object annotationjaewon lee
 
PR-122: Can-Creative Adversarial Networks
PR-122: Can-Creative Adversarial NetworksPR-122: Can-Creative Adversarial Networks
PR-122: Can-Creative Adversarial Networksjaewon lee
 
Pytorch kr devcon
Pytorch kr devconPytorch kr devcon
Pytorch kr devconjaewon lee
 
PR-134 How Does Batch Normalization Help Optimization?
PR-134 How Does Batch Normalization Help Optimization?PR-134 How Does Batch Normalization Help Optimization?
PR-134 How Does Batch Normalization Help Optimization?jaewon lee
 
PR-110: An Analysis of Scale Invariance in Object Detection – SNIP
PR-110: An Analysis of Scale Invariance in Object Detection – SNIPPR-110: An Analysis of Scale Invariance in Object Detection – SNIP
PR-110: An Analysis of Scale Invariance in Object Detection – SNIPjaewon lee
 

PR-185: RetinaFace: Single-stage Dense Face Localisation in the Wild

  • 1. PR-185. RetinaFace: Single-stage Dense Face Localisation in the Wild. visionNoob. Deng, Jiankang, et al. "RetinaFace: Single-stage Dense Face Localisation in the Wild." arXiv preprint arXiv:1905.00641 (2019). Submitted on 2 May 2019 (v1), last revised 4 May 2019 (v2).
  • 2. Face Detection: state-of-the-art face detection. Definition: face localization. Broader definition: face localization + landmark detection + pixel-wise face parsing + 3D reconstruction.
  • 3. 0. Face Recognition. Naïve example: face verification. Each of the two input images goes through Preprocessing -> Encoder (embedding in ℝ^d) -> L2 norm (unit vector). The similarity of the two unit vectors lies in [0, 1]: if similarity > threshold, same; else, not the same.
  • 4. 0. Face Recognition. Naïve example: face verification. Preprocessing: Detecting (1. facial location, 2. facial landmarks) -> ROI region -> Face Registration (112px × 112px) -> Encoder (embedding in ℝ^d).
  • 6. 1. Introduction. 1.2 RetinaFace: face localization (bbox) + face landmarks (key points) + dense localization mask.
  • 7.
  • 8. 1. Introduction. 1.3 Main Contributions. (1) Based on a single-stage design, we propose a novel pixel-wise face localisation method named RetinaFace, which employs a multi-task learning strategy to simultaneously predict face score, face box, five facial landmarks, and 3D position and correspondence of each facial pixel. (2) On the WIDER FACE hard subset, RetinaFace outperforms the AP of the state-of-the-art two-stage method. (3) On the IJB-C dataset, RetinaFace helps to improve ArcFace’s verification accuracy. (4) By employing light-weight backbone networks, RetinaFace can run in real time on a single CPU core for a VGA-resolution image. (5) Extra annotations and code have been released to facilitate future research.
  • 9. WIDER Face & Person Challenge 2019. Track 1: Face Detection; Track 2: Pedestrian Detection; Track 3: Cast Search by Portrait; Track 4: Person Search by Language. http://wider-challenge.org/2019.html
  • 10. 2. Related Work. 2.1 Image Pyramid vs. Feature Pyramid. Section outline: 2.1 Image pyramid vs. feature pyramid; 2.2 Two-stage vs. single-stage; 2.3 Context modelling; 2.4 Multi-task learning. Figure labels: Feature Pyramid, Image Pyramid. Reference: Hao, Zekun, et al. "Scale-aware face detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
  • 11. 2. Related Work. 2.2 Two-stage vs. single-stage.
  • 12. 2. Related Work. 2.3 Context Modelling. Context module: to enhance the model’s contextual reasoning power.
  • 13. 2. Related Work. 2.3 Context Modelling. Deformable Convolutional Network. References: J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, and Y. Wei. Deformable convolutional networks. In ICCV, 2017; X. Zhu, H. Hu, S. Lin, and J. Dai. Deformable convnets v2: More deformable, better results. arXiv:1811.11168, 2018.
  • 14. 2. Related Work. 2.4 Multi-task Learning. Example of multi-task learning: Mask R-CNN. Reference: He, Kaiming, et al. "Mask R-CNN." Proceedings of the IEEE International Conference on Computer Vision. 2017.
  • 15. 3. RetinaFace. 3.1 Multi-task Loss. Section outline: 3.1 Multi-task loss; 3.2 Dense regression branch. Multi-task learning.
  • 16. 3. RetinaFace. 3.2 Dense Regression Branch. Reference: Zhou, Yuxiang, et al. "Dense 3D Face Decoding over 2500FPS: Joint Texture & Shape Convolutional Mesh Decoders." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.
  • 17. 4. Experiments. 4.1 Dataset. WIDER FACE (hard): 32,203 images, 393,703 face bboxes (with a high degree of variability in scale, pose, expression, occlusion and illumination).
  • 18. 4. Experiments. 4.1 Dataset. WIDER FACE (hard): 32,203 images, 393,703 face bboxes (with a high degree of variability in scale, pose, expression, occlusion and illumination). Example images: car accident, couple, concert. Section outline: 4.1 Dataset; 4.2 Implementation details; 4.3 Ablation study; 4.4 Face box accuracy; 4.5 Five facial landmark accuracy; 4.6 Dense facial landmark accuracy; 4.7 Face recognition accuracy; 4.8 Inference efficiency.
  • 19. 4. Experiments. 4.1 Dataset. Extra annotation: facial landmarks (eye centres, nose tip and mouth corners) on 84.6k faces in the training set and 18.5k faces in the validation set.
  • 20. 4. Experiments. 4.2 Implementation details: 1. Feature pyramid; 2. Context module; 3. Anchor setting; 4. Data augmentation; 5. Training detail; 6. Testing detail. Conv -> DCN. Output size: # of anchors * (2 + 4 + 10 + 128 + 7 + 9).
  • 21. 4. Experiments. 4.2 Implementation details. Anchor setting: scale step of 2^(1/3) and aspect ratio of 1:1. With an input image size of 640 × 640, the anchors cover scales from 16 × 16 to 406 × 406 on the feature pyramid levels. In total there are 102,300 anchors, and 75% of these anchors are from P2. OHEM with a positive-to-negative ratio of 1:3.
  • 22. 4. Experiments. 4.2 Implementation details. Data augmentation: random crop, horizontal flip, photometric colour distortion. Training details: SGD (momentum 0.9, weight decay 0.0005, batch size 8 × 4) on four NVIDIA Tesla P40 (24GB) GPUs. The learning rate starts from 10^-3, rises to 10^-2 after 5 epochs, and is then divided by 10 at epochs 55 and 68; training terminates at 80 epochs. Testing details: flip as well as multi-scale strategies (short edge of the image at [500, 800, 1100, 1400, 1700]); box voting with an IoU threshold of 0.4 (or NMS is okay).
  • 23. 4. Experiments: Ablation Study. Overview: RetinaFace on the WIDER Face dataset (easy, medium, hard): face detection, face 5-landmark detection and face 3D reconstruction as extra supervision; SOTA (AP 91.4%). Lightweight backbone (MobileNet) -> real-time inference. ArcFace (with RetinaFace) on the IJB-C dataset: better verification accuracy.
  • 24. 4. Experiments. 4.3 Ablation Study. Table annotations: IoU=0.5:0.05:0.95 and IoU=0.5.
  • 25. 4. Experiments. 4.3 Ablation Study. Table annotations: IoU=0.5:0.05:0.95 and IoU=0.5. From Mask R-CNN: He, Kaiming, et al. "Mask R-CNN." Proceedings of the IEEE International Conference on Computer Vision. 2017.
  • 26. 4. Experiments: Face Box Accuracy.
  • 27. 4. Experiments. 4.4 Face Box Accuracy (WIDER FACE).
  • 28. 4. Experiments: Five Facial Landmarks Accuracy.
  • 29. 4. Experiments. 4.5 Five Facial Landmark Accuracy. Metrics: normalised mean error (NME) and cumulative error distribution (CED). Highlighted values: 26.31%, 9.37%. https://pdfs.semanticscholar.org/b4d2/151e29fb12dbe5d164b430273de65103d39b.pdf
  • 30. 4. Experiments: Dense Facial Landmark Accuracy.
  • 31. 4. Experiments. 4.6 Dense Facial Landmark Accuracy.
  • 32. 4. Experiments: Face Recognition Accuracy.
  • 33. 4. Experiments. 4.7 Face Recognition Accuracy.
  • 34. 4. Experiments: Inference Efficiency.
  • 35. 4. Experiments. 4.8 Inference Efficiency. https://github.com/deepinsight/insightface/tree/master/RetinaFace
  • 36. 4. Experiments. 4.8 Inference Efficiency. https://github.com/deepinsight/insightface/tree/master/RetinaFace Reference: Yoo, YoungJoon, Dongyoon Han, and Sangdoo Yun. "EXTD: Extremely Tiny Face Detector via Iterative Filter Reuse." arXiv preprint arXiv:1906.06579 (2019).
  • 37. 4. Experiments. 4.8 Inference Efficiency. https://github.com/deepinsight/insightface/tree/master/RetinaFace Reference: Yoo, YoungJoon, Dongyoon Han, and Sangdoo Yun. "EXTD: Extremely Tiny Face Detector via Iterative Filter Reuse." arXiv preprint arXiv:1906.06579 (2019).
  • 38. 5. Conclusion. RetinaFace on the WIDER Face dataset (easy, medium, hard): face detection, face 5-landmark detection and face 3D reconstruction as extra supervision; SOTA (AP 91.4%). Lightweight backbone (MobileNet) -> real-time inference. ArcFace (with RetinaFace) on the IJB-C dataset: better verification accuracy. Code is available at https://github.com/deepinsight/insightface (MXNet).
  • 40. Lightweight Face Recognition Challenge https://ibug.doc.ic.ac.uk/resources/lightweight-face-recognition-challenge-workshop/
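
The naïve face verification flow on slide 3 (encode, L2-normalise, compare against a threshold) can be sketched as follows. This is only an illustration: the function names and the threshold value 0.5 are hypothetical and do not come from the slides, and the embeddings are assumed to have already been produced by some encoder. Note that the raw cosine of two unit vectors can in general be negative, while the slide shows the similarity range as [0, 1].

```python
import math


def l2_normalize(vec):
    """Scale an embedding to unit length (the "L2norm" box on the slide)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(emb_a, emb_b):
    """Dot product of the two unit vectors."""
    a = l2_normalize(emb_a)
    b = l2_normalize(emb_b)
    return sum(x * y for x, y in zip(a, b))


def verify(emb_a, emb_b, threshold=0.5):
    """Return True when the pair is judged to be the same identity.

    threshold=0.5 is a placeholder; a real system would tune it on
    validation data.
    """
    return cosine_similarity(emb_a, emb_b) > threshold
```

A higher similarity means the two embeddings point in a more similar direction, so the "same" branch is taken when the similarity exceeds the threshold.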
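
The anchor figures on slide 21 (102,300 anchors for a 640 × 640 input, 75% of them from P2, scales from 16 to 406) can be checked with a little arithmetic. The sketch below assumes five pyramid levels P2 to P6 with strides 4, 8, 16, 32 and 64, and three square anchors per location; these settings are not stated on the slide and are labelled as assumptions in the code.

```python
# Assumed settings (not stated on the slide): strides of P2..P6 and
# three 1:1 anchors per feature-map location.
STRIDES = {"P2": 4, "P3": 8, "P4": 16, "P5": 32, "P6": 64}
ANCHORS_PER_LOCATION = 3


def anchor_counts(input_size=640):
    """Number of anchors contributed by each pyramid level."""
    counts = {}
    for level, stride in STRIDES.items():
        side = input_size // stride  # feature-map width and height
        counts[level] = side * side * ANCHORS_PER_LOCATION
    return counts


def anchor_scales(base=16.0, num_scales=15):
    """Anchor side lengths with a scale step of 2^(1/3), starting at 16."""
    return [base * 2 ** (k / 3) for k in range(num_scales)]


counts = anchor_counts()
total = sum(counts.values())
p2_share = counts["P2"] / total
scales = anchor_scales()

print(counts)
print("total anchors:", total)
print("share from P2:", round(p2_share, 4))
print("smallest / largest scale:", scales[0], round(scales[-1]))
```

Under these assumptions the per-level counts are 76,800, 19,200, 4,800, 1,200 and 300, which sum to the 102,300 quoted on the slide, and the largest of the fifteen scales rounds to 406.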
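
The learning-rate schedule from the training details on slide 22 can be written as a small piecewise function. This is a sketch, not the released training code: the slide only says the rate rises from 10^-3 to 10^-2 after 5 epochs, so the linear shape of the warm-up is an assumption, and `learning_rate` is a hypothetical name.

```python
def learning_rate(epoch):
    """Learning rate at a given (possibly fractional) epoch in [0, 80)."""
    if epoch < 0 or epoch >= 80:
        raise ValueError("training runs for epochs 0 <= epoch < 80")
    if epoch < 5:
        # Warm-up from 1e-3 to 1e-2; linear interpolation is assumed.
        return 1e-3 + (1e-2 - 1e-3) * epoch / 5
    if epoch < 55:
        return 1e-2
    if epoch < 68:
        return 1e-3  # divided by 10 at epoch 55
    return 1e-4      # divided by 10 again at epoch 68


for e in (0, 5, 54, 55, 68, 79):
    print(e, learning_rate(e))
```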