Survey of Face Detection Approaches
Yurii Pashchenko
DataScience Lab, Odessa, 2017
Classification vs. Detection
http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf
Evaluation
Evaluation metric. Receiver Operating Characteristic (ROC)
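A ROC curve for a detector can be traced by sweeping the score threshold over all detections. The sketch below is a minimal illustration, assuming each detection has already been matched against the ground truth and carries a flag saying whether it hit a face; the function name `roc_points` and the input layout are hypothetical and do not come from any benchmark toolkit.

```python
def roc_points(detections, num_faces):
    """Trace ROC points for a face detector.

    detections: list of (score, is_true_positive) pairs (hypothetical layout).
    num_faces:  total number of annotated faces in the dataset.
    Returns a list of (false_positive_count, detection_rate) points,
    one per detection, from the strictest threshold to the loosest.
    """
    points = []
    true_pos = 0
    false_pos = 0
    # Lowering the threshold admits detections in order of decreasing score.
    for _score, is_tp in sorted(detections, key=lambda d: d[0], reverse=True):
        if is_tp:
            true_pos += 1
        else:
            false_pos += 1
        points.append((false_pos, true_pos / num_faces))
    return points


if __name__ == "__main__":
    dets = [(0.9, True), (0.8, False), (0.7, True), (0.3, False)]
    for fp, rate in roc_points(dets, num_faces=4):
        print(fp, rate)
```

Plotting the detection rate against the number of false positives gives the kind of curve used to compare detectors on a benchmark.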
Benchmarks
● FDDB
● AFW
● PascalFace
● IJB-A
● MALF
● WIDER Face
FDDB: A Benchmark for Face Detection in Unconstrained Settings
● 2 845 images with a total of 5 171 faces;
● a wide range of difficulties:
○ occlusions
○ different poses
○ low resolution
○ out-of-focus faces
● face regions specified as ellipses;
● both grayscale and color images.
http://vis-www.cs.umass.edu/fddb/
FDDB. Annotation
http://vis-www.cs.umass.edu/fddb/fddb.pdf
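Because FDDB describes faces as ellipses while most detectors output rectangles, a conversion step is often needed. The sketch below computes the tight axis-aligned bounding box of a rotated ellipse. It assumes each annotation provides the major-axis radius, minor-axis radius, the angle of the major axis against the x-axis in radians, and the center coordinates; this field layout is an assumption and should be checked against the dataset's README. The function name `ellipse_to_bbox` is hypothetical.

```python
import math


def ellipse_to_bbox(major_radius, minor_radius, angle, center_x, center_y):
    """Return (x_min, y_min, x_max, y_max) of the tight box around an ellipse.

    angle is the rotation of the major axis relative to the x-axis, in
    radians (assumed convention).
    """
    cos_a = math.cos(angle)
    sin_a = math.sin(angle)
    # Half-extents of a rotated ellipse along the image axes.
    half_w = math.sqrt((major_radius * cos_a) ** 2 + (minor_radius * sin_a) ** 2)
    half_h = math.sqrt((major_radius * sin_a) ** 2 + (minor_radius * cos_a) ** 2)
    return (center_x - half_w, center_y - half_h,
            center_x + half_w, center_y + half_h)


if __name__ == "__main__":
    # An upright face: major axis vertical (angle = pi / 2).
    print(ellipse_to_bbox(60.0, 40.0, math.pi / 2, 100.0, 120.0))
```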
FDDB. Evaluation
IARPA Janus Benchmark A (IJB-A)
• 5 712 images and 2 085 videos, with an average of 11.4 images and 4.2 videos per subject
• full pose variation
• joint use for face recognition and face detection benchmarking
• a mix of images and videos
• wider geographic variation of subjects
• landmark locations
Brendan F. Klare, Emma Taborsky, Austin Blanton, Jordan Cheney, Kristen Allen, Patrick Grother, Alan Mah, Mark Burge, and Anil K. Jain. Pushing the Frontiers of Unconstrained Face Detection and Recognition: IARPA Janus Benchmark A // Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2015, pp. 1931–1939.
IJB-A. Evaluation
* False Accept and Detection Rate are computed per image
WIDER FACE: A Face Detection Benchmark
• 32 203 images with 393 703 labeled faces, 10 times larger than the previous largest face detection dataset
• Faces vary widely in appearance, pose, and scale
• Multiple annotated attributes (occlusion, pose, and event categories) allow in-depth analysis of existing algorithms
http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/
WIDER FACE. Annotations
https://arxiv.org/pdf/1511.06523.pdf
WIDER FACE. Evaluation results
Comparison of Face Detection Datasets
https://arxiv.org/pdf/1511.06523.pdf
Viola-Jones Object Detector
• Very popular for Human Face Detection
• May be trained for Cat and Dog Face detection
• Freely available in the OpenCV library (http://opencv.org)
O. Parkhi, A. Vedaldi, C. V. Jawahar, and A. Zisserman. The Truth about Cats and Dogs // Proceedings of the International Conference on Computer Vision (ICCV), 2011.
J. Liu, A. Kanazawa, D. Jacobs, P. Belhumeur. Dog Breed Classification Using Part Localization // Lecture Notes in Computer Science, Volume 7572, 2012, pp. 172-185.
Main Principles
● Scanning window
● Features
● Integral image
● Boosted feature selection
● Cascaded classifier
P. A. Viola, M. J. Jones. Rapid Object Detection Using a Boosted Cascade of Simple Features // CVPR, issue 1, 2001, pp. 511–518.
Scanning window
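The scanning-window idea can be sketched as a generator that slides a square window over the image at several sizes. This is a minimal illustration only: the base size of 24 pixels follows the 24 x 24 window mentioned in this deck, while the scale step of 1.25 and the stride of 10% of the window size are assumed values chosen for the example.

```python
def sliding_windows(image_w, image_h, base=24, scale_step=1.25, stride_frac=0.1):
    """Yield (x, y, size) for square windows covering the image at all scales.

    scale_step and stride_frac are illustrative defaults, not values taken
    from a specific detector.
    """
    size = base
    while size <= min(image_w, image_h):
        stride = max(1, int(size * stride_frac))
        for y in range(0, image_h - size + 1, stride):
            for x in range(0, image_w - size + 1, stride):
                yield (x, y, size)
        # Grow the window; always advance by at least one pixel so the loop ends.
        size = max(size + 1, int(round(size * scale_step)))


if __name__ == "__main__":
    windows = list(sliding_windows(320, 240))
    print(len(windows), "candidate windows for a 320 x 240 image")
```

Every yielded window is then passed to the classifier, which is why the per-window cost has to be extremely low.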
Integral Image
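An integral image stores, at each position, the sum of all pixels above and to the left, so the sum over any rectangle needs only four lookups. The sketch below uses plain Python lists; the function names `integral_image`, `rect_sum`, and `two_rect_feature` are hypothetical, and the two-rectangle feature shown is just one example of a Haar-like feature.

```python
def integral_image(img):
    """Build an integral image with one extra row and column of zeros.

    img is a list of rows of pixel intensities; ii[y][x] is the sum of
    img[0:y][0:x].
    """
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Haar-like feature: left half minus right half of a (2w x h) rectangle."""
    return rect_sum(ii, x, y, w, h) - rect_sum(ii, x + w, y, w, h)


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    ii = integral_image(image)
    print(rect_sum(ii, 1, 1, 2, 2))
```

The cost of `rect_sum` does not depend on the rectangle size, which is what makes evaluating features at every scale affordable.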
Features
● Available features:
○ Haar
○ LBP
○ HOG
● Too many features!
○ location, scale, type
○ 180,000+ possible features associated with each 24 x 24 window
● Not all of them are useful!
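The size of the feature pool can be checked by enumerating every position and scale of each feature shape inside the window. The sketch below assumes only five basic upright Haar shapes (two-, three-, and four-rectangle); that choice is an assumption, since the deck does not say which shapes are counted. With these five shapes the count for a 24 x 24 window is somewhat lower than the 180,000+ quoted above, and the exact figure depends on which shapes are included.

```python
# Base cell layout (width, height) of each assumed Haar-like shape.
BASIC_SHAPES = ((2, 1), (1, 2), (3, 1), (1, 3), (2, 2))


def count_haar_features(window=24, shapes=BASIC_SHAPES):
    """Count all placements of the given shapes at all integer scales."""
    total = 0
    for base_w, base_h in shapes:
        # Widths and heights must be integer multiples of the base layout.
        for w in range(base_w, window + 1, base_w):
            for h in range(base_h, window + 1, base_h):
                total += (window - w + 1) * (window - h + 1)
    return total


if __name__ == "__main__":
    print(count_haar_features())
```

Either way, the pool is far larger than the 576 pixels of the window, which motivates selecting a small useful subset.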
Feature selection
● Idea: combine several weak classifiers to generate a strong classifier
$$\alpha_1 h_1 + \alpha_2 h_2 + \alpha_3 h_3 + \dots + \alpha_T h_T \gtrless T_{\mathrm{threshold}}$$
● Weak classifier: (feature, threshold), with output $h_t = 1$ or $0$
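The weighted vote above can be sketched in a few lines. This is an illustration under stated assumptions: each weak classifier compares one feature value with its threshold, the `polarity` parameter (which flips the direction of the comparison) is an addition for the example, and the default strong threshold of half the sum of the weights is an assumed convention. All function names are hypothetical.

```python
def weak_classify(feature_value, threshold, polarity=1):
    """Return 1 if the feature passes the threshold test, else 0."""
    return 1 if polarity * feature_value < polarity * threshold else 0


def strong_classify(feature_values, weak_params, alphas, strong_threshold=None):
    """Weighted vote of weak classifiers compared with a threshold.

    feature_values: one feature value per weak classifier.
    weak_params:    (threshold, polarity) per weak classifier.
    alphas:         weight of each weak classifier.
    """
    if strong_threshold is None:
        # Assumed default: half of the total weight.
        strong_threshold = 0.5 * sum(alphas)
    score = sum(
        alpha * weak_classify(value, thr, pol)
        for alpha, value, (thr, pol) in zip(alphas, feature_values, weak_params)
    )
    return 1 if score >= strong_threshold else 0


if __name__ == "__main__":
    params = [(5, 1), (5, 1), (5, 1)]
    print(strong_classify([3, 10, 1], params, alphas=[1.0, 0.5, 0.25]))
```

Boosting chooses which features become weak classifiers and sets their weights; the sketch covers only the evaluation side.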
Cascaded Classifier
● A 1-feature classifier achieves a 100% detection rate and about a 50% false positive rate.
● A 5-feature classifier achieves a 100% detection rate and a 40% false positive rate (20% cumulative), using data from the previous stage.
● A 20-feature classifier achieves a 100% detection rate with a 10% false positive rate (2% cumulative).
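The cumulative figures follow from multiplying the per-stage rates, because a window must pass every stage to survive. The sketch below reproduces that arithmetic and shows the early-rejection loop; the function names and the toy stage functions in the usage example are hypothetical.

```python
def cumulative_rates(stages):
    """Multiply per-stage (detection_rate, false_positive_rate) pairs.

    Returns the cumulative (detection_rate, false_positive_rate) after
    each stage.
    """
    detection, false_pos = 1.0, 1.0
    result = []
    for stage_detection, stage_false_pos in stages:
        detection *= stage_detection
        false_pos *= stage_false_pos
        result.append((detection, false_pos))
    return result


def cascade_classify(window, stage_functions):
    """Reject a window as soon as any stage says 'not a face'."""
    for stage in stage_functions:
        if not stage(window):
            return False
    return True


if __name__ == "__main__":
    stages = [(1.0, 0.5), (1.0, 0.4), (1.0, 0.1)]
    for detection, false_pos in cumulative_rates(stages):
        print(f"detection {detection:.0%}, false positives {false_pos:.0%}")
```

Since most windows contain no face, most of them are discarded by the first cheap stages and never reach the expensive ones.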
Viola-Jones Pipeline
https://habrahabr.ru/post/133826/
Viola-Jones. Evaluation Results on FDDB
A Convolutional Neural Network Cascade for Face Detection
● 12-net
● 12-calibration-net
● 24-net
● 24-calibration-net
● 48-net
● 48-calibration-net
http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Li_A_Convolutional_Neural_2015_CVPR_paper.pdf
Cascade CNN. Calibration Net
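● The calibration net adjusts the position and scale of each detection window.
● It chooses among N = 45 calibration patterns, formed by all combinations of a set of scale changes and x/y offset changes.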
http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Li_A_Convolutional_Neural_2015_CVPR_paper.pdf
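A set of 45 calibration patterns can be enumerated as the product of 5 scale changes and 3 x 3 offset changes. The concrete values and the adjustment formula in the sketch below are assumptions recalled from the cited paper; they are not preserved in this text and should be checked against the paper before use. The names `PATTERNS` and `calibrate` are hypothetical.

```python
from itertools import product

# Assumed pattern values (not taken from this deck; verify against the paper).
SCALES = (0.83, 0.91, 1.0, 1.10, 1.21)
X_OFFSETS = (-0.17, 0.0, 0.17)
Y_OFFSETS = (-0.17, 0.0, 0.17)

# All combinations of scale change and x/y offset change: 5 * 3 * 3 = 45.
PATTERNS = list(product(SCALES, X_OFFSETS, Y_OFFSETS))


def calibrate(window, pattern):
    """Adjust a window (x, y, w, h) with one pattern (scale, x_off, y_off).

    Assumed adjustment: shift by the offset times the rescaled size,
    then rescale width and height.
    """
    x, y, w, h = window
    scale, x_off, y_off = pattern
    return (x - x_off * w / scale, y - y_off * h / scale, w / scale, h / scale)


if __name__ == "__main__":
    print(len(PATTERNS))
    print(calibrate((100, 100, 48, 48), PATTERNS[0]))
```

In the cascade, the calibration net predicts a score for each pattern, and the window is corrected before it is passed to the next detection net.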
Cascade CNN. Evaluation Results on FDDB
~14 fps on CPU, ~100 fps on GPU
http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Li_A_Convolutional_Neural_2015_CVPR_paper.pdf
Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks (MTCNN)
• Improves on the previous approach (Cascade CNN)
• Joint face detection and alignment
• Online hard sample mining
• Multi-source training
https://arxiv.org/pdf/1604.02878.pdf
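Online hard sample mining can be sketched as keeping only the highest-loss samples of each mini-batch for the backward pass. The sketch below is a minimal illustration; the keep ratio of 0.7 is an assumed value for the example, and the function name `hard_sample_indices` is hypothetical.

```python
def hard_sample_indices(losses, keep_ratio=0.7):
    """Return the indices of the hardest samples in a mini-batch.

    losses:     per-sample loss values from the forward pass.
    keep_ratio: fraction of the batch to keep (assumed value).
    """
    if not losses:
        return []
    keep = max(1, int(len(losses) * keep_ratio))
    # Rank samples from highest loss (hardest) to lowest loss (easiest).
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(ranked[:keep])


if __name__ == "__main__":
    batch_losses = [0.1, 0.9, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 0.05]
    print(hard_sample_indices(batch_losses))
```

Only the selected samples contribute gradients, so easy examples stop dominating training without a separate offline mining pass.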
MTCNN. Evaluation on FDDB and WIDER
https://arxiv.org/pdf/1604.02878.pdf
Faster R-CNN
Region proposal network
Bootstrapping Face Detection with Hard Negative Examples
• ResNet-50
• Foreground RoI: IoU ≥ 0.5
• Background RoI: IoU in the interval [0.1, 0.5)
• Background-to-foreground RoI ratio: 3:1
• Hard negative mining
https://arxiv.org/pdf/1608.02236.pdf
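The RoI thresholds and the 3:1 balance listed above can be sketched as a labeling step followed by a sampling step. This is an illustration, assuming the thresholds refer to intersection over union (IoU) with the best-matching ground-truth box; the batch size of 128 RoIs, the box layout (x1, y1, x2, y2), and all function names are hypothetical.

```python
import random


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_roi(roi, gt_boxes, fg_thr=0.5, bg_lo=0.1):
    """Label an RoI as 'fg', 'bg', or 'ignore' by its best IoU."""
    best = max((iou(roi, gt) for gt in gt_boxes), default=0.0)
    if best >= fg_thr:
        return "fg"
    if best >= bg_lo:
        return "bg"
    return "ignore"


def sample_rois(labels, batch_size=128, bg_per_fg=3, rng=random):
    """Pick RoI indices so that foreground fills at most 1 / (bg_per_fg + 1)
    of the batch and background fills the rest."""
    fg = [i for i, label in enumerate(labels) if label == "fg"]
    bg = [i for i, label in enumerate(labels) if label == "bg"]
    n_fg = min(len(fg), batch_size // (bg_per_fg + 1))
    n_bg = min(len(bg), batch_size - n_fg)
    return rng.sample(fg, n_fg), rng.sample(bg, n_bg)


if __name__ == "__main__":
    faces = [(10, 10, 50, 50)]
    rois = [(12, 12, 52, 52), (30, 10, 70, 50), (200, 200, 240, 240)]
    print([label_roi(r, faces) for r in rois])
```

Hard negative mining then feeds false positives from a trained model back in as additional background examples.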
Face Detection using Deep Learning: An Improved Faster RCNN Approach (DeepIR)
• VGG16 architecture
• Hard negative mining
• Feature concatenation
• Multi-scale training
https://arxiv.org/pdf/1701.08289.pdf
DeepIR. Evaluation on FDDB
https://arxiv.org/pdf/1701.08289.pdf
Finding Tiny Faces (HR-ER)
https://arxiv.org/pdf/1612.04402.pdf
HR-ER. Approach
What about context?
https://arxiv.org/pdf/1612.04402.pdf
HR-ER. Evaluation on WIDER and FDDB
https://arxiv.org/pdf/1612.04402.pdf
THANK YOU FOR YOUR ATTENTION!
e-mail: yurii.pashchenko@ring.com
skype: george.pashchenko
