Long-term Face Tracking in the Wild using Deep Learning

Presented by:
Elaheh Rashedi
Advisor:
Xuewen Chen
Department of Computer Science
Wayne State University
KDD Workshop on Large-scale Deep Learning for Data Mining
August 2016

Outline
• Introduction
– Long Term Tracking Algorithm
– Face Tracking Algorithm
– Tracking Challenges
• Related Work
• Proposed Methodology
– Detection Verification Tracking (DVT)
• DL based Face Detection
• CNN based Face Verification
• Multi-patch based Face Tracking
– System Framework
– Demonstration
• Experiments and Results
• Future Work

Long-Term Tracking Algorithm
• Common Steps
– Select a video
– Draw a bounding box around the target
– Distinguish the object from the background
– Track the object around the same region in the next frame
Ref [1]
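The common steps above can be sketched as a small tracking loop. This is a minimal illustration, not the method of Ref [1]: frames are plain 2D lists of intensities, the target's appearance is a fixed template taken from the first bounding box, and matching uses a sum of absolute differences. The function names and the `radius` parameter are hypothetical.

```python
def crop(frame, x, y, w, h):
    """Cut the w-by-h window whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in frame[y:y + h]]

def sad(a, b):
    """Sum of absolute differences between two equally sized windows."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def track(frames, box, radius=2):
    """Follow the target given by box = (x, y, w, h) in frames[0]."""
    x, y, w, h = box
    template = crop(frames[0], x, y, w, h)  # appearance of the target
    path = [box]
    for frame in frames[1:]:
        best = None
        # Search only around the previous location of the target.
        for ny in range(max(0, y - radius), min(len(frame) - h, y + radius) + 1):
            for nx in range(max(0, x - radius), min(len(frame[0]) - w, x + radius) + 1):
                cost = sad(template, crop(frame, nx, ny, w, h))
                if best is None or cost < best[0]:
                    best = (cost, nx, ny)
        _, x, y = best
        path.append((x, y, w, h))
    return path

def make_frame(size, x, y):
    """Synthetic frame: dark background with a bright 2x2 target at (x, y)."""
    frame = [[0] * size for _ in range(size)]
    for dy in range(2):
        for dx in range(2):
            frame[y + dy][x + dx] = 255
    return frame

frames = [make_frame(8, 1, 1), make_frame(8, 2, 2), make_frame(8, 4, 3)]
path = track(frames, (1, 1, 2, 2))
print(path)
```

Because the search is limited to a small window around the previous box, a tracker of this kind loses the target as soon as it moves too far or leaves the view, which is the long-term tracking problem discussed in the next slides.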

Face Tracking Algorithms
• Specialized for tracking faces
• Common approaches:
– Using face detection
– Using facial landmark localization

Tracking Challenges
• Tracking is challenging on noisy real-world videos
• Trackers are often not robust against:
– Appearance changes
– Occlusion
– Fast motion
– Illumination changes
– Background clutter

Tracking Challenges (cont.)
• Sensitive to the initialization of the target
• Not able to handle all situations
• Long term tracking challenge:
– Not reliable in cases where the object leaves the view

Related Work
• TLD: Tracking-Learning-Detection
Figure: TLD flow chart, Ref [2]

Proposed Methodology
• Model
– Detection-Verification-Tracking
• Goal
– Long term face tracking
– Targets in wild (unconstrained) videos
• Employ
– Deep learning based face detection
– CNN based face verification
– Multi-patch based tracking
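One way the three components could be combined is sketched below. The slides do not spell out the control flow, so the loop is an assumption: track while the target is held, and fall back to detection followed by verification when it is lost. The callables `detect`, `verify` and `track`, and the toy frames, are hypothetical stand-ins for the real detector, verifier and tracker.

```python
def dvt(frames, detect, verify, track):
    """Return one target position (or None) per frame.

    detect(frame)          -> list of candidate face positions
    verify(frame, box)     -> True if the candidate is the target face
    track(frame, prev_box) -> new position, or None when the track is lost
    """
    results = []
    box = None
    for frame in frames:
        if box is not None:
            box = track(frame, box)  # tracking stage
        if box is None:
            # Detection stage followed by the verification stage.
            box = next((b for b in detect(frame) if verify(frame, b)), None)
        results.append(box)
    return results

# Toy frames: each maps a character name to a face position (x, y).
frames = [
    {"roy": (10, 10), "moss": (50, 10)},
    {"roy": (12, 10), "moss": (50, 12)},
    {"moss": (52, 12)},                    # the target leaves the view
    {"moss": (11, 10), "roy": (40, 20)},   # the target reappears elsewhere
]

def detect(frame):
    return list(frame.values())

def verify(frame, box):
    return frame.get("roy") == box

def track(frame, prev):
    # Accept the first face within 5 pixels of the previous position.
    for pos in frame.values():
        if abs(pos[0] - prev[0]) <= 5 and abs(pos[1] - prev[1]) <= 5:
            return pos
    return None

boxes = dvt(frames, detect, verify, track)
print(boxes)
```

In this toy run the target is reported as absent in the third frame and picked up again in the fourth, even though another face appears near its old position.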

DL based Face Detection
• Model
– Cascade architecture built on CNN
• CNN structure:
– 3 CNNs for faces vs. non-faces (binary classification)
– 3 CNNs for bounding box calibration (multiclass classification)
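The cascade idea can be illustrated with a short sketch. It assumes that each face vs. non-face classifier is followed by a calibration step and that later stages are stricter; the slide only states that there are three networks of each kind. The integer `face_score`, the thresholds and the shift classes are hypothetical stand-ins for the CNN outputs.

```python
def run_cascade(windows, stages):
    """Pass candidate windows through (classifier, calibrator) stage pairs."""
    for is_face, calibrate in stages:
        # Reject non-faces first, then adjust the surviving windows.
        windows = [calibrate(w) for w in windows if is_face(w)]
    return windows

# Toy setup: a window is just its x position and the true face is at x = 10.
def face_score(x):
    """Stand-in for a CNN face score, an integer from 0 to 10."""
    return max(0, 10 - abs(x - 10))

SHIFTS = (0, -2, 2)  # calibration classes: one candidate shift per class

def make_classifier(threshold):
    """Binary decision: face if the score reaches the threshold."""
    return lambda x: face_score(x) >= threshold

def calibrate(x):
    """Multiclass decision: pick the shift class that fits the face best."""
    return x + max(SHIFTS, key=lambda s: face_score(x + s))

stages = [(make_classifier(t), calibrate) for t in (3, 6, 8)]
survivors = run_cascade([0, 4, 9, 15, 30], stages)
print(survivors)
```

The early, cheap stages discard the windows at 0 and 30 immediately, so the stricter stages only see the candidates that are close to the face.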

DL based Face Detection (cont.)
Ref [3]

CNN based Face Verification
• Convolutional Neural Network: 37 layers
• Feature vector dimension: 4096
• Pre-trained network based on MatConvNet (VGG)
• Verification steps:
– Resize the target face to 224x224 pixels
– Create the query feature vector from the target face
– Extract features for each detected face
– Compute the cosine similarity between the query and each face
– Compare the similarity to a threshold
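The last two verification steps can be written down in a few lines. This sketch uses short toy vectors in place of the CNN features, the default threshold of 0.75 is the similarity threshold reported on the Results slide, and treating a similarity equal to the threshold as a match is an assumption.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_a * norm_b)

def is_same_face(query, candidate, threshold=0.75):
    """Accept the candidate when it is similar enough to the query."""
    return cosine_similarity(query, candidate) >= threshold

# Toy 4-dimensional features standing in for the CNN feature vectors.
query = [1.0, 0.0, 2.0, 1.0]
same_person = [0.9, 0.1, 2.1, 0.8]
other_person = [0.0, 2.0, 0.1, 0.0]

print(is_same_face(query, same_person))
print(is_same_face(query, other_person))
```

Cosine similarity ignores the length of the vectors, so two faces are compared by the direction of their features rather than by their overall magnitude.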

Multi-patch based Tracking
• Employs multiple patches around the target
• Categorizes patches as reliable or non-reliable
• Tracks the reliable patches
• Ignores the non-reliable patches
• The result is the average of the reliable patches
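The averaging step might look like the sketch below. How patches are scored as reliable is not described on the slide, so the `reliability` field and the `min_reliability` cut-off are hypothetical; only the idea of averaging the reliable patches and ignoring the rest comes from the slide.

```python
def fuse_patches(patches, min_reliability=0.5):
    """Average the position estimates of the reliable patches.

    Each patch is a dict with an estimated target position ("x", "y")
    and a "reliability" score. Returns None if no patch is reliable.
    """
    reliable = [p for p in patches if p["reliability"] >= min_reliability]
    if not reliable:
        return None
    n = len(reliable)
    return (sum(p["x"] for p in reliable) / n,
            sum(p["y"] for p in reliable) / n)

patches = [
    {"x": 100, "y": 50, "reliability": 0.9},
    {"x": 104, "y": 54, "reliability": 0.8},
    {"x": 96, "y": 52, "reliability": 0.7},
    {"x": 300, "y": 10, "reliability": 0.1},  # drifted patch, ignored
]

print(fuse_patches(patches))
```

Dropping the unreliable patch keeps a single drifted estimate from pulling the fused position away from the target.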

System Framework
Figure: verification-based face tracking flow chart

Demonstration
Figure: the user pauses the video and selects the target face to be tracked


Results
• Implemented in MATLAB R2015b
• Uses the MatConvNet toolbox
• Thresholds
– Similarity threshold: 0.75
– Skip time: 3 s
• Running time
– About twice the video duration

Experiments and Results

Character: Roy
Method                Precision   Recall
TLD                   0.7         0.37
Face-TLD              0.75        0.54
DVT (the proposed)    0.95        0.75

Table 1: Comparison of TLD, Face-TLD and the proposed DVT method in terms of
precision and recall on the sitcom IT-Crowd (first series, first
episode).
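For reference, precision and recall for a tracker are commonly computed from frame counts as sketched below. The slides do not state how a reported box is judged correct, so this definition is an assumption, and the counts in the example are made up for illustration rather than taken from the experiment.

```python
def precision_recall(correct, reported, visible):
    """Precision and recall from per-frame counts.

    correct  -- frames where the tracker reported the target correctly
    reported -- frames where the tracker reported any box
    visible  -- frames where the target is actually visible
    """
    precision = correct / reported if reported else 0.0
    recall = correct / visible if visible else 0.0
    return precision, recall

# Illustrative counts only.
print(precision_recall(correct=6, reported=8, visible=12))
```

Under this definition a tracker that stays silent when unsure keeps its precision high at the cost of recall, which is one way to read the gap between the two columns in Table 1.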

Future Work
• Using reliable frames to learn the model
– Useful for long term tracking
– Less sensitive to the initial selection of the target
• Learning the similarity threshold
– SVM

References
1. Khan, Zulfiqar Hasan, Irene Yu-Hua Gu, and Andrew G. Backhouse. "A robust particle filter-based method for tracking single visual object through
complex scenes using dynamical object shape and appearance similarity." Journal of Signal Processing Systems 65, no. 1 (2011): 63-79.
2. Kalal, Zdenek, Krystian Mikolajczyk, and Jiri Matas. "Tracking-learning-detection." IEEE Transactions on Pattern Analysis and Machine Intelligence 34, no. 7
(2012): 1409-1422.
3. Li, Haoxiang, Zhe Lin, Xiaohui Shen, Jonathan Brandt, and Gang Hua. "A convolutional neural network cascade for face detection." In Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition, pp. 5325-5334. 2015.
4. Li, Yang, Jianke Zhu, and Steven CH Hoi. "Reliable patch trackers: Robust visual tracking by exploiting reliable patches." In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, pp. 353-361. 2015.
5. http://www.vlfeat.org/matconvnet/
6. Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, “Deepface: Closing the gap to human-level performance in face verification,” in The IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), June 2014.
7. S. Chopra, R. Hadsell, and Y. LeCun, “Learning a similarity metric discriminatively, with application to face verification,” in Computer Vision and Pattern
Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, pp. 539–546, 2005.
8. Y. Sun, Y. Chen, X. Wang, and X. Tang, “Deep learning face representation by joint identification verification,” in Advances in Neural Information
Processing Systems, pp. 1988–1996, 2014.
9. Y. Sun, X. Wang, and X. Tang, “Deep learning face representation from predicting 10,000 classes,” in Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition, pp. 1891–1898, 2014.
10. Y. Sun, D. Liang, X. Wang, and X. Tang, “Deepid3: Face recognition with very deep neural networks,” arXiv preprint arXiv:1502.00873, 2015.
11. Y. Sun, X. Wang, and X. Tang, “Deeply learned face representations are sparse, selective, and robust,” in Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition, pp. 2892–2900, 2015.
12. F. Schroff, D. Kalenichenko, and J. Philbin, “Facenet: A unified embedding for face recognition and clustering,” in The IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), June 2015.

Thank you!
Contact us:
Elaheh.Rashedi@wayne.edu
