DataScience Lab, May 13, 2017
An Overview of Face Detection Methods for Images
Yuriy Pashchenko (Research Engineer, Ring Labs)
In this talk we survey the newest and most popular face detection methods, including Viola-Jones, Faster R-CNN, and MTCNN, among others. We discuss the main criteria for evaluating algorithm quality as well as the benchmark datasets, including FDDB, WIDER, and IJB-A.
All materials: http://datascience.in.ua/report2017
6. FDDB: A Benchmark for Face Detection in Unconstrained Settings
● 2 845 images with a total of 5 171 faces;
● a wide range of difficulties:
○ occlusions
○ different poses
○ low resolution
○ out-of-focus faces
● face regions are specified as elliptical regions
● both grayscale and color images.
http://vis-www.cs.umass.edu/fddb/
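Since FDDB marks faces as ellipses while most detectors output rectangles, a conversion step is often needed before comparing the two. The following is a minimal sketch of one such conversion, not part of the talk. The function name `ellipse_to_bbox` and the parameter order (major-axis radius, minor-axis radius, angle of the major axis in radians, center x, center y) are assumptions made for illustration; check the dataset's own documentation for the exact annotation layout.

```python
import math

def ellipse_to_bbox(major_r, minor_r, angle, cx, cy):
    """Return (x_min, y_min, x_max, y_max) of the tightest axis-aligned box
    around an ellipse whose major axis is rotated by `angle` radians
    relative to the x-axis."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Half-extents of a rotated ellipse along the x and y axes.
    half_w = math.sqrt((major_r * cos_a) ** 2 + (minor_r * sin_a) ** 2)
    half_h = math.sqrt((major_r * sin_a) ** 2 + (minor_r * cos_a) ** 2)
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)
```

For an upright face the major axis is close to vertical (angle near ±π/2), so the box is roughly twice the minor radius wide and twice the major radius tall.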
9. IARPA Janus Benchmark A (IJB-A)
• 5 712 images and 2 085 videos, with an average of 11.4 images and 4.2 videos per subject
• full pose variation
• joint use for face recognition and face detection benchmarking
• a mix of images and videos
• wider geographic variation of subjects
• landmark locations
Brendan F. Klare, Emma Taborsky, Austin Blanton, Jordan Cheney, Kristen Allen, Patrick Grother, Alan Mah, Mark Burge, and Anil K. Jain. 2015. Pushing the frontiers of unconstrained face detection and recognition: IARPA Janus Benchmark A. In Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 1931–1939.
11. WIDER FACE: A Face Detection Benchmark
• It consists of 32 203 images with 393 703 labeled faces, which is 10 times larger than the current largest face detection dataset
• The faces vary widely in appearance, pose, and scale
• Multiple attributes are annotated: occlusion, pose, and event categories, which allows in-depth analysis of existing algorithms.
http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/
14. Comparison of Face Detection Datasets
https://arxiv.org/pdf/1511.06523.pdf
15. Viola-Jones Object Detector
• Very popular for human face detection
• Can also be trained for cat and dog face detection
• Freely available in the OpenCV library (http://opencv.org)
O. Parkhi, A. Vedaldi, C. V. Jawahar, and A. Zisserman. The Truth about Cats and Dogs // Proceedings of the International Conference on Computer Vision (ICCV), 2011.
J. Liu, A. Kanazawa, D. Jacobs, P. Belhumeur. Dog Breed Classification Using Part Localization // Lecture Notes in Computer Science, Volume 7572, 2012, pp. 172-185.
16. Main Principles
● Scanning window
● Features
● Integral image
● Boosted feature selection
● Cascaded classifier
P.A. Viola, M.J. Jones, Rapid object detection using a boosted cascade of simple features, in: CVPR, issue 1,
2001, pp. 511–518.
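To make the "integral image" and "features" principles concrete, here is a minimal sketch in plain Python. It builds a summed-area table, reads the sum of any rectangle with four lookups, and evaluates one two-rectangle Haar-like feature. The function names and the (x, y, w, h) argument layout are illustrative assumptions, not code from the talk or from OpenCV.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column:
    ii[y][x] is the sum of img[r][c] over all r < y and c < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left corner (x, y), in four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def haar_two_rect_horizontal(ii, x, y, w, h):
    """Two-rectangle feature: left half minus right half (w must be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

Because every rectangle sum costs the same four lookups regardless of its size, the scanning window can evaluate features at any scale in constant time per feature.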
19. Features
● Available features:
○ HAAR
○ LBP
○ HOG
● Too many features!
○ location, scale, type
○ 180,000+ possible features associated with each 24 x 24 window
● Not all of them are useful!
20. Feature selection
● Idea: combine several weak classifiers to generate a strong classifier
$\alpha_1 h_1 + \alpha_2 h_2 + \alpha_3 h_3 + \dots + \alpha_T h_T \gtrless T_{\text{threshold}}$
Each weak classifier $h_i$ is a (feature, threshold) pair with $h_i = 1$ or $0$.
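The weighted vote on this slide can be sketched as follows. This is a minimal illustration, not the talk's code: the `polarity` parameter and the default strong threshold of half the sum of the weights follow the usual description of the Viola-Jones boosting procedure and are labeled here as assumptions; all function names are hypothetical.

```python
def weak_classify(feature_value, threshold, polarity=1):
    """Decision stump on one feature: outputs 1 or 0.
    `polarity` (+1 or -1, an assumption) selects the inequality direction."""
    return 1 if polarity * feature_value < polarity * threshold else 0

def strong_classify(feature_values, stumps, alphas, strong_threshold=None):
    """Weighted vote: sum(alpha_i * h_i) compared against a threshold.
    `stumps` is a list of (threshold, polarity) pairs, one per feature."""
    if strong_threshold is None:
        # Assumed default: half of the total weight.
        strong_threshold = 0.5 * sum(alphas)
    score = sum(
        alpha * weak_classify(value, thr, pol)
        for alpha, value, (thr, pol) in zip(alphas, feature_values, stumps)
    )
    return 1 if score >= strong_threshold else 0
```

Lowering `strong_threshold` trades more false positives for a higher detection rate, which is the knob the cascade stages rely on.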
21. Cascaded Classifier
● A 1-feature classifier achieves a 100% detection rate and about a 50% false positive rate.
● A 5-feature classifier achieves a 100% detection rate and a 40% false positive rate (20% cumulative), using data from the previous stage.
● A 20-feature classifier achieves a 100% detection rate with a 10% false positive rate (2% cumulative).
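The cumulative figures above are simply the product of the per-stage rates, since a window must pass every earlier stage to reach the next one. A small sketch of that arithmetic, together with the early-rejection loop that makes a cascade fast, is shown below; the function names are illustrative and the stage classifiers are passed in as plain callables.

```python
from itertools import accumulate
from operator import mul

def cumulative_rates(stage_rates):
    """Running product of per-stage rates (detection or false positive)."""
    return list(accumulate(stage_rates, mul))

def run_cascade(window, stages):
    """Apply stage classifiers in order; reject at the first stage that fails."""
    for stage in stages:
        if not stage(window):
            return False
    return True

# Per-stage rates from the slide: 1-, 5-, and 20-feature classifiers.
stage_fpr = [0.5, 0.4, 0.1]
stage_tpr = [1.0, 1.0, 1.0]
cum_fpr = cumulative_rates(stage_fpr)  # approximately 0.5, 0.2, 0.02
cum_tpr = cumulative_rates(stage_tpr)  # stays at 1.0
```

Most windows in an image are background, so they are discarded by the cheap first stages and only a small fraction ever reaches the larger classifiers.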
24. A Convolutional Neural Network Cascade for Face Detection
● 12-net
● 12-calibration-net
● 24-net
● 24-calibration-net
● 48-net
● 48-calibration-net
http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Li_A_Convolutional_Neural_2015_CVPR_paper.pdf
25. Cascade CNN. Calibration Net
The calibration pattern adjusts the window to be
N = 45 patterns, formed by all combinations of
http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Li_A_Convolutional_Neural_2015_CVPR_paper.pdf
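The slide text does not preserve the adjustment formula or the value sets, so the sketch below only illustrates the idea of "N = 45 patterns formed by all combinations". The specific scale and offset values, the adjustment rule in `calibrate`, and all names are assumptions based on a recollection of the linked paper; verify them against the paper before relying on them.

```python
from itertools import product

# Assumed value sets (5 scales x 3 x-offsets x 3 y-offsets = 45 patterns).
SCALES = (0.83, 0.91, 1.0, 1.10, 1.21)
X_OFFSETS = (-0.17, 0.0, 0.17)
Y_OFFSETS = (-0.17, 0.0, 0.17)
PATTERNS = list(product(SCALES, X_OFFSETS, Y_OFFSETS))

def calibrate(box, pattern):
    """Adjust a window (x, y, w, h) by one calibration pattern (s, dx, dy).
    Assumed rule: shift by a fraction of the rescaled size, then rescale."""
    x, y, w, h = box
    s, dx, dy = pattern
    return (x - dx * w / s, y - dy * h / s, w / s, h / s)
```

The calibration net is then a 45-way classifier over these patterns, and the predicted pattern (or an average of the confident ones) is used to move the detection window.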
26. Cascade CNN. Evaluation Results on FDDB
~14 fps on CPU, ~100 fps on GPU
http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Li_A_Convolutional_Neural_2015_CVPR_paper.pdf
27. Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks (MTCNN)
• Improves on the previous approach (Cascade CNN)
• Joint face detection and alignment
• Online hard sample mining
• Multi-source training
https://arxiv.org/pdf/1604.02878.pdf
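Online hard sample mining can be sketched as follows: within each mini-batch, the per-sample losses are ranked and only the hardest ones contribute to the update. This is a framework-free illustration with hypothetical function names; the default `keep_ratio` of 0.7 is an assumption taken from a recollection of the linked paper, and the list of losses is assumed to be non-empty.

```python
def hard_sample_indices(losses, keep_ratio=0.7):
    """Indices of the keep_ratio fraction of samples with the largest loss."""
    n_keep = max(1, round(keep_ratio * len(losses)))
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(order[:n_keep])

def mined_loss(losses, keep_ratio=0.7):
    """Mean loss over the hard samples only; easy samples are ignored."""
    kept = hard_sample_indices(losses, keep_ratio)
    return sum(losses[i] for i in kept) / len(kept)
```

In a real training loop the same selection would be applied to the per-sample loss tensor before backpropagation, so that easy samples produce no gradient.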
31. Bootstrapping Face Detection with Hard Negative Examples
• ResNet-50
• Foreground RoI: IoU threshold ≥ 0.5
• Background RoI: IoU in the interval [0.1, 0.5)
• Balancing bg-fg RoIs: 3:1
• Hard negative mining
https://arxiv.org/pdf/1608.02236.pdf
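The RoI thresholds and the 3:1 balance listed above can be sketched as a labeling and sampling step. This is an illustration under stated assumptions, not the paper's implementation: the overlap measure is assumed to be intersection over union (IoU) with ground-truth boxes in (x1, y1, x2, y2) form with positive area, and the names, the `batch_size` of 128, and the seed are hypothetical.

```python
import random

def iou(box, gt):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    ih = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = iw * ih
    area_box = (box[2] - box[0]) * (box[3] - box[1])
    area_gt = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return inter / (area_box + area_gt - inter)

def label_rois(rois, gt_boxes, fg_thr=0.5, bg_lo=0.1):
    """1 = foreground (IoU >= fg_thr), 0 = background (bg_lo <= IoU < fg_thr),
    -1 = ignored (IoU < bg_lo)."""
    labels = []
    for roi in rois:
        best = max((iou(roi, gt) for gt in gt_boxes), default=0.0)
        labels.append(1 if best >= fg_thr else 0 if best >= bg_lo else -1)
    return labels

def sample_rois(labels, batch_size=128, bg_per_fg=3, seed=0):
    """Pick foreground and background indices at a bg:fg ratio of bg_per_fg:1."""
    rng = random.Random(seed)
    fg = [i for i, lab in enumerate(labels) if lab == 1]
    bg = [i for i, lab in enumerate(labels) if lab == 0]
    n_fg = min(len(fg), batch_size // (bg_per_fg + 1))
    n_bg = min(len(bg), bg_per_fg * n_fg)
    return rng.sample(fg, n_fg), rng.sample(bg, n_bg)
```

Hard negative mining would then replace the random background draw with the background RoIs the current model scores most confidently as faces.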
32. Face Detection using Deep Learning: An Improved Faster RCNN Approach (DeepIR)
• VGG16 architecture
• Hard negative mining
• Feature concatenation
• Multi-scale training
https://arxiv.org/pdf/1701.08289.pdf