Learning a Multi-Center
Convolutional Network for
Unconstrained Face Alignment
Zhiwen Shao, Hengliang Zhu, Yangyang Hao,
Min Wang, and Lizhuang Ma
Shanghai Jiao Tong University
Introduction
Face Alignment
Detecting facial landmarks like pupil
centers, nose tip, mouth corners
Unconstrained scenarios including severe
occlusions and large face variations
Challenges
 Methods based on low-level handcrafted features have a
limited capacity to represent highly complex faces
Deep convolutional network
 A nonlinear regression problem, which transforms
appearance to shape
Motivation
Cascaded CNN [1], Zhou et al. [2], CFAN [3], and CDAN [4]
employ cascaded deep networks to refine predicted shapes
Previous Deep Learning Methods
time-consuming training processes
high model complexity
[1] Y. Sun, X. Wang, and X. Tang, “Deep convolutional network cascade for facial point detection,” in IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2013, pp. 3476–3483.
[2] E. Zhou, H. Fan, Z. Cao, Y. Jiang, and Q. Yin, “Extensive facial landmark localization with coarse-to-fine convolutional network cascade,” in IEEE International Conference on Computer Vision Workshops. IEEE, 2013, pp. 386–391.
[3] J. Zhang, S. Shan, M. Kan, and X. Chen, “Coarse-to-fine auto-encoder networks (CFAN) for real-time face alignment,” in European Conference on Computer Vision. Springer, 2014, pp. 1–16.
[4] R. Weng, J. Lu, Y.-P. Tan, and J. Zhou, “Learning cascaded deep auto-encoder networks for face alignment,” IEEE Transactions on Multimedia, vol. 18, no. 10, pp. 2066–2078, 2016.
Multiple networks based
Previous Deep Learning Methods
Single network based
TCDCN [5] needs extra labels of facial attributes for training samples, which limits the universality of this method
Our goal: one single network without auxiliary information
[5] Z. Zhang, P. Luo, C. C. Loy, and X. Tang, “Learning deep representation for face alignment with
auxiliary attributes,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no.
5, pp. 918–930, 2016.
Methodology
Structural Correlations
Examples: chin is occluded; right contour is invisible
 Unconstrained faces with partial occlusion and large pose
Landmarks in the same local region have similar properties
including occlusion and visibility
Face Partition
Partition of facial landmarks for different labeling patterns (29 landmarks and 68 landmarks)
Seven clusters: left eye, right eye, nose, mouth, left contour, chin, and right contour
Network Architecture
 Shared layers
 Multiple center-specific shape prediction layers
Network Architecture
 Shared layers
• Eight convolutional layers and one fully-connected layer
• Each max-pooling layer follows a stack of two convolutional layers
Network Architecture
Multiple center-specific shape prediction layers
• Each cluster of facial landmarks is treated as a separate center
• Each layer estimates x and y coordinates of all n facial landmarks
• Each layer focuses on the shape estimation of a specific face region
Loss Function
Weighted inter-ocular distance normalized Euclidean loss
$$E = \sum_{j=1}^{n} w_j \left[ (f_{2j-1} - \hat{f}_{2j-1})^2 + (f_{2j} - \hat{f}_{2j})^2 \right] / (2 d^2)$$
$w_j$: weight of the j-th landmark
$f$: ground truth coordinates
$\hat{f}$: predicted coordinates
$d$: ground truth inter-ocular distance
the first center-specific layer:
larger weights for landmarks around the left eye
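The loss on this slide sums, over all n landmarks, the weighted squared x and y errors and divides by twice the squared inter-ocular distance. A minimal sketch in plain Python follows. The function name, the flat `[x_1, y_1, ..., x_n, y_n]` layout, and the example values are assumptions for illustration, not taken from the authors' code.

```python
def weighted_normalized_loss(f_true, f_pred, weights, d):
    """Weighted inter-ocular distance normalized Euclidean loss.

    f_true, f_pred: flat coordinate lists [x_1, y_1, ..., x_n, y_n] (assumed layout)
    weights: n per-landmark weights w_j
    d: ground truth inter-ocular distance
    """
    if len(f_true) != len(f_pred) or len(f_true) != 2 * len(weights):
        raise ValueError("expected 2n coordinates and n weights")
    total = 0.0
    for j, w in enumerate(weights):
        # 0-based positions 2j and 2j+1 correspond to f_{2j-1} and f_{2j} on the slide
        dx = f_true[2 * j] - f_pred[2 * j]
        dy = f_true[2 * j + 1] - f_pred[2 * j + 1]
        total += w * (dx * dx + dy * dy)
    return total / (2.0 * d * d)


# Hypothetical two-landmark example
loss_uniform = weighted_normalized_loss([0, 0, 10, 0], [1, 0, 10, 2], [1.0, 1.0], d=10.0)
loss_weighted = weighted_normalized_loss([0, 0, 10, 0], [1, 0, 10, 2], [2.0, 0.5], d=10.0)
```

With uniform weights the function reduces to an ordinary normalized Euclidean loss. Raising the weight of one landmark makes its error count for more in the total.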
Network Architecture
Multi-Center Learning
Basic Model
Network Architecture
Multi-Center Learning
Reinforcement
for Each Center
Network Architecture
Multi-Center Learning
Combined Model
Weight Computation
 Multiple relationship
$w_{P^{(i)c}} = \eta \, w_{P^{(i)m}}$
$P^{(i)c}$: set of center-specific landmarks
$P^{(i)m}$: set of remaining minor landmarks
$\eta$: amplification factor
Different fine-tuning steps have different center-specific and minor facial landmarks
Consistent with the basic model
$w_{P^{(i)c}} |P^{(i)c}| + w_{P^{(i)m}} (n - |P^{(i)c}|) = n$
$|\cdot|$: number of elements in a set
During the i-th fine-tuning step
$w_{P^{(i)c}} = \eta n / [(\eta - 1) |P^{(i)c}| + n]$
$w_{P^{(i)m}} = n / [(\eta - 1) |P^{(i)c}| + n]$
Other centers are given relatively small weights rather than zero, to utilize implicit structural correlations among different parts and to search for the solution smoothly
Landmarks from the same cluster have similar properties and share an identical weight
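The two constraints above (center-specific weights are an amplification factor times the minor weights, and all n weights sum to n as in the basic model) fix both weights for a fine-tuning step. A small sketch of that computation follows. The function name and the example values (29 landmarks, a cluster of 4, amplification factor 10) are illustrative assumptions.

```python
def center_weights(n, center_size, eta):
    """Return (w_center, w_minor) for one fine-tuning step.

    n: total number of landmarks
    center_size: number of center-specific landmarks in this step
    eta: amplification factor (w_center = eta * w_minor)
    """
    if not 0 < center_size <= n:
        raise ValueError("center_size must be in 1..n")
    if eta <= 0:
        raise ValueError("eta must be positive")
    denom = (eta - 1.0) * center_size + n
    w_minor = n / denom
    w_center = eta * w_minor
    return w_center, w_minor


# Hypothetical step: 29 landmarks, 4 of them in the current center, eta = 10
w_c, w_m = center_weights(n=29, center_size=4, eta=10.0)
# The weights still sum to n, matching the basic model where every weight is 1
total = w_c * 4 + w_m * (29 - 4)
```

Setting `eta` to 1 recovers the basic model, where every landmark has weight 1.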
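Combined Model
High-level representation: $\mathbf{x} = (x_0, x_1, \cdots, x_D)^T \in \mathbb{R}^{(D+1) \times 1}$ $(D = 1024)$
Weight matrix: $\mathbf{W} = (\mathbf{w}_1, \mathbf{w}_2, \cdots, \mathbf{w}_{2n}) \in \mathbb{R}^{(D+1) \times 2n}$
$\mathbf{w}_k = (w_{0k}, w_{1k}, \cdots, w_{Dk})^T, \quad k = 1, \cdots, 2n$
$\hat{f}_{2j-1} = \mathbf{w}_{2j-1}^T \mathbf{x}$
$\hat{f}_{2j} = \mathbf{w}_{2j}^T \mathbf{x}$
Weight matrix of the i-th center-specific layer: $\mathbf{W}^i$
$\mathbf{w}_{2j-1}^{combined} = \mathbf{w}_{2j-1}^{i}$
$\mathbf{w}_{2j}^{combined} = \mathbf{w}_{2j}^{i}$
$i = 1, \cdots, m, \quad j \in P^{(i)c}$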
Combined Model
Combined model: $\Theta^S \cup \mathbf{W}^{combined}$
Complexity is the same as that of the basic model
Improves the localization performance by exploiting the advantage of each center-specific solution
Our multi-center learning algorithm takes full advantage of each stage and searches for the optimal solution smoothly
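The combination step can be read as column selection: for each center, the two weight columns (x and y) of every landmark in that center's cluster are copied from that center's layer into one combined matrix. A minimal sketch with plain Python lists follows. The data layout (a layer as a list of 2n columns), the function names, and the toy numbers are assumptions for illustration only.

```python
def combine_center_layers(center_layers, clusters):
    """Assemble the combined weight columns from m center-specific layers.

    center_layers: list of m layers, each a list of 2n columns of length D+1
    clusters: list of m lists of 0-based landmark indices (the center-specific sets)
    """
    if len(center_layers) != len(clusters):
        raise ValueError("need one cluster per center-specific layer")
    combined = [None] * len(center_layers[0])
    for layer, cluster in zip(center_layers, clusters):
        for j in cluster:
            # 0-based columns 2j and 2j+1 hold the x and y weights of landmark j
            combined[2 * j] = list(layer[2 * j])
            combined[2 * j + 1] = list(layer[2 * j + 1])
    if any(col is None for col in combined):
        raise ValueError("clusters must cover every landmark")
    return combined


def predict(columns, x):
    """Each predicted coordinate is the dot product of one weight column with x."""
    return [sum(w * xi for w, xi in zip(col, x)) for col in columns]


# Toy case: n = 2 landmarks, D + 1 = 2 features, m = 2 centers
layer_1 = [[1.0, 1.0]] * 4   # hypothetical weights of the first center-specific layer
layer_2 = [[2.0, 2.0]] * 4   # hypothetical weights of the second center-specific layer
combined = combine_center_layers([layer_1, layer_2], clusters=[[0], [1]])
shape = predict(combined, x=[1.0, 3.0])
```

Because the combined matrix has the same shape as a single prediction layer, inference costs the same as in the basic model, which is the point made on the slide.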
Experiments
Datasets
COFW
occluded dataset in the wild
1345 training images
507 testing images
IBUG
large appearance variations
3148 training images
135 testing images
Evaluation Metric
• Inter-ocular distance normalized mean error
• Cumulative error distribution (CED) curves
• Failure rate (a failure is a mean error larger than 10%)
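The three metrics can be sketched as follows. This is an illustrative implementation, not the authors' evaluation script: the function names are hypothetical, landmarks are assumed to be `(x, y)` tuples, and the inter-ocular distance `d` is passed in by the caller because the slides do not say which eye points define it.

```python
import math


def normalized_mean_error(gt, pred, d):
    """Mean point-to-point error over landmarks, divided by inter-ocular distance d."""
    dists = [math.hypot(gx - px, gy - py) for (gx, gy), (px, py) in zip(gt, pred)]
    return sum(dists) / (len(dists) * d)


def failure_rate(errors, threshold=0.10):
    """Fraction of images whose mean error is larger than the threshold (10%)."""
    return sum(e > threshold for e in errors) / len(errors)


def ced_curve(errors, thresholds):
    """Fraction of images with error at or below each threshold."""
    return [sum(e <= t for e in errors) / len(errors) for t in thresholds]


# Hypothetical per-image errors, expressed as fractions of the inter-ocular distance
errors = [0.05, 0.12, 0.10, 0.20]
rate = failure_rate(errors)
curve = ced_curve(errors, thresholds=[0.1, 0.2])
```

The failure rate uses a strict comparison, so an image with an error of exactly 10% is not counted as a failure, in line with "larger than 10%".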
Validation of Multi-Center Learning Algorithm
Mean Error (%) and Failure Rate (%)
Method      COFW Mean   COFW Failure   IBUG Mean   IBUG Failure
Basic       6.26        3.16           9.23        33.33
Combined    6.08        2.96           8.87        25.93
improve the accuracy and robustness
good performance of basic model
effectiveness of our network
reinforce the learning for each local face region
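To make the gain of the combined model over the basic model easier to read, the relative reduction can be computed from the table values. The sketch below uses only the four pairs of numbers reported on the slide. The helper name and the dictionary layout are assumptions.

```python
def relative_reduction(basic, combined):
    """Relative reduction in percent when moving from the basic to the combined model."""
    return 100.0 * (basic - combined) / basic


# (basic, combined) pairs copied from the table above
table = {
    "COFW mean error": (6.26, 6.08),
    "COFW failure rate": (3.16, 2.96),
    "IBUG mean error": (9.23, 8.87),
    "IBUG failure rate": (33.33, 25.93),
}

for name, (basic, combined) in table.items():
    print(f"{name}: {relative_reduction(basic, combined):.2f}% lower")
```

The largest relative change is in the IBUG failure rate, which is consistent with the slide's claim that the combined model improves robustness as well as accuracy.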
Validation of Multi-Center Learning Algorithm
Mean error for different clusters on COFW
Comparison with Other Methods
Method COFW IBUG (mean error, %)
ESR 11.2 17.00
SDM 11.14 15.40
RCPR 8.5 17.26
CFAN - 16.78
LBF - 11.98
cGPRT - 11.03
CFSS - 9.98
TCDCN 8.05 8.60
CFT 6.33 10.06
Wu et al. 5.93 -
MCNet 6.08 8.87
Comparison with Other Methods
Comparison with Other Methods
COFW
Comparison with Other Methods
IBUG
Comparison with Other Methods
Deep model Speed (FPS) CPU
Cascaded CNN 5 single core, i5-6200U 2.3GHz
CFAN* 43 i7-3770 3.4GHz
CDAN* 50 i5 3.2GHz
TCDCN 50 single core, i5-6200U 2.3GHz
CFT 31 single core, i5-6200U 2.3GHz
MCNet 67 single core, i5-6200U 2.3GHz
Time of face detection is excluded
Conclusions
 We propose a novel multi-center convolutional network, which
exploits the representation power of each center
 We propose the reinforcement for each center to improve the
shape estimation precision of each facial part
 Comprehensive experiments demonstrate that our method
achieves real-time and competitive performance compared to
other state-of-the-art techniques
Code
 Matlab
https://github.com/ZhiwenShao/MCNet
 C++
https://github.com/ZhiwenShao/MCNet-Extension
Thank you!

  • 4. Challenges: Unconstrained scenarios including severe occlusions and large face variations
  • 5. Motivation: Face alignment is a nonlinear regression problem, which transforms appearance to shape. Methods based on low-level handcrafted features have a limited capacity to represent highly complex faces, so a deep convolutional network is used to model the mapping.
  • 6. Previous Deep Learning Methods (multiple networks based): Cascaded CNN [1], Zhou et al. [2], CFAN [3], and CDAN [4] employ cascaded deep networks to refine predicted shapes. Drawbacks: time-consuming training processes, high model complexity.
    [1] Y. Sun, X. Wang, and X. Tang, “Deep convolutional network cascade for facial point detection,” in IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2013, pp. 3476–3483.
    [2] E. Zhou, H. Fan, Z. Cao, Y. Jiang, and Q. Yin, “Extensive facial landmark localization with coarse-to-fine convolutional network cascade,” in IEEE International Conference on Computer Vision Workshops. IEEE, 2013, pp. 386–391.
    [3] J. Zhang, S. Shan, M. Kan, and X. Chen, “Coarse-to-fine auto-encoder networks (CFAN) for real-time face alignment,” in European Conference on Computer Vision. Springer, 2014, pp. 1–16.
    [4] R. Weng, J. Lu, Y.-P. Tan, and J. Zhou, “Learning cascaded deep auto-encoder networks for face alignment,” IEEE Transactions on Multimedia, vol. 18, no. 10, pp. 2066–2078, 2016.
  • 7. Previous Deep Learning Methods (single network based): TCDCN [5] needs extra labels of facial attributes for samples, which limits the universality of this method. Ours: one single network without auxiliary information.
    [5] Z. Zhang, P. Luo, C. C. Loy, and X. Tang, “Learning deep representation for face alignment with auxiliary attributes,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 5, pp. 918–930, 2016.
  • 9. Structural Correlations: Unconstrained faces with partial occlusion and large pose (examples: chin is occluded; right contour is invisible). Landmarks in the same local region have similar properties including occlusion and visibility.
  • 10. Face Partition: Partition of facial landmarks for different labeling patterns (29 landmarks, 68 landmarks). Clusters: left eye, right eye, nose, mouth, left contour, chin, and right contour.
  • 11. Network Architecture: shared layers; multiple center-specific shape prediction layers.
  • 12. Network Architecture (shared layers): eight convolutional layers and one fully-connected layer; each max-pooling layer follows a stack of two convolutional layers.
  • 13. Network Architecture (multiple center-specific shape prediction layers): each cluster of facial landmarks is treated as a separate center; each layer estimates x and y coordinates of all n facial landmarks, focusing on the shape estimation of a specific face region.
  • 14. Loss Function: weighted inter-ocular distance normalized Euclidean loss
    $E = \sum_{j=1}^{n} w_j \left[ (f_{2j-1} - \hat{f}_{2j-1})^2 + (f_{2j} - \hat{f}_{2j})^2 \right] / (2 d^2)$
    where $w_j$ is the weight of the j-th landmark, $f$ the ground truth coordinates, $\hat{f}$ the predicted coordinates, and $d$ the ground truth inter-ocular distance. The first center-specific layer: larger weights for landmarks around the left eye.
  • 21. Weight Computation (during the i-th fine-tuning step). Multiple relationship:
    $w_{P_c^{(i)}} = \eta \, w_{P_m^{(i)}}$
    where $P_c^{(i)}$ is the set of center-specific landmarks, $P_m^{(i)}$ the set of remaining minor landmarks, and $\eta$ the amplification factor. Different fine-tuning steps have different center-specific and minor facial landmarks. Consistent with the basic model:
    $w_{P_c^{(i)}} \, |P_c^{(i)}| + w_{P_m^{(i)}} \, (n - |P_c^{(i)}|) = n$
    where $|\cdot|$ is the number of elements in a set.
  • 22. Weight Computation (during the i-th fine-tuning step):
    $w_{P_c^{(i)}} = \eta n / [(\eta - 1) |P_c^{(i)}| + n]$
    $w_{P_m^{(i)}} = n / [(\eta - 1) |P_c^{(i)}| + n]$
    Other centers with relatively small weights rather than zero: utilize implicit structural correlations among different parts, search the solution smoothly. Landmarks from the same cluster have similar properties: share an identical weight.
  • 23. Combined Model. High-level representation:
    $\mathbf{x} = (x_0, x_1, \ldots, x_D)^T \in \mathbb{R}^{(D+1) \times 1}$ ($D = 1024$)
    Weight matrix:
    $\mathbf{W} = (\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_{2n}) \in \mathbb{R}^{(D+1) \times 2n}$, with $\mathbf{w}_k = (w_{0k}, w_{1k}, \ldots, w_{Dk})^T$, $k = 1, \ldots, 2n$
    Predictions:
    $\hat{f}_{2j-1} = \mathbf{w}_{2j-1}^T \mathbf{x}$, $\hat{f}_{2j} = \mathbf{w}_{2j}^T \mathbf{x}$
    With $\mathbf{W}^i$ the weight matrix of the i-th center-specific layer:
    $\mathbf{w}_{2j-1}^{combined} = \mathbf{w}_{2j-1}^{i}$, $\mathbf{w}_{2j}^{combined} = \mathbf{w}_{2j}^{i}$, for $i = 1, \ldots, m$ and $j \in P_c^{(i)}$
  • 24. Combined Model: $\Theta^{S} \cup \mathbf{W}^{combined}$. Complexity is the same as the basic model. Improves the location performance by exploiting the advantage of each center-specific solution. Our multi-center learning algorithm takes full advantage of each stage and searches the optimal solution smoothly.
  • 26. Datasets. COFW (occluded dataset in the wild): 1345 training images, 507 testing images. IBUG (large appearance variations): 3148 training images, 135 testing images.
  • 27. Evaluation Metric: inter-ocular distance normalized mean error; cumulative errors distribution (CED) curves; failure rate (failure: mean error larger than 10%).
  • 28. Validation of Multi-Center Learning Algorithm. Mean Error (%) and Failure Rate (%):
      Method      COFW Mean   COFW Failure   IBUG Mean   IBUG Failure
      Basic       6.26        3.16           9.23        33.33
      Combined    6.08        2.96           8.87        25.93
    Good performance of basic model: effectiveness of our network. Reinforce the learning for each local face region: improve the accuracy and robustness.
  • 29. Validation of Multi-Center Learning Algorithm Mean error for different clusters on COFW
  • 30. Comparison with Other Methods
      Method      COFW     IBUG
      ESR         11.2     17.00
      SDM         11.14    15.40
      RCPR        8.5      17.26
      CFAN        -        16.78
      LBF         -        11.98
      cGPRT       -        11.03
      CFSS        -        9.98
      TCDCN       8.05     8.60
      CFT         6.33     10.06
      Wu et al.   5.93     -
      MCNet       6.08     8.87
  • 32. Comparison with Other Methods COFW
  • 33. Comparison with Other Methods IBUG
  • 34. Comparison with Other Methods
      Deep model     Speed (FPS)   CPU
      Cascaded CNN   5             single core, i5-6200U 2.3GHz
      CFAN*          43            i7-3770 3.4GHz
      CDAN*          50            i5 3.2GHz
      TCDCN          50            single core, i5-6200U 2.3GHz
      CFT            31            single core, i5-6200U 2.3GHz
      MCNet          67            single core, i5-6200U 2.3GHz
    Time of face detection is excluded.
  • 35. Conclusions: (1) We propose a novel multi-center convolutional network, which exploits the representation power of each center. (2) We propose the reinforcement for each center to improve the shape estimation precision of each facial part. (3) Comprehensive experiments demonstrate that our method achieves real-time and competitive performance compared to other state-of-the-art techniques.
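
The weight computation on slides 21 and 22 and the loss on slide 14 can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names, the 0-based landmark indices, and the example values (29 landmarks, four of them in the current center, an amplification factor of 10) are assumptions chosen for the example. The slides do not state a value for the amplification factor.

```python
import numpy as np


def center_weights(n, center_idx, eta):
    """Per-landmark weights for one fine-tuning step.

    n          : total number of landmarks
    center_idx : 0-based indices of the center-specific landmarks P_c
    eta        : amplification factor, so that w_c = eta * w_m
    """
    size_c = len(center_idx)
    denom = (eta - 1.0) * size_c + n
    weights = np.full(n, n / denom)              # minor landmarks: w_m
    weights[list(center_idx)] = eta * n / denom  # center landmarks: w_c
    return weights


def weighted_loss(f_true, f_pred, weights, d):
    """Weighted inter-ocular distance normalized Euclidean loss.

    f_true, f_pred : arrays of length 2n; each consecutive pair holds the
                     two coordinates of one landmark
    weights        : array of length n (one weight per landmark)
    d              : ground truth inter-ocular distance
    """
    diff = np.asarray(f_true, dtype=float) - np.asarray(f_pred, dtype=float)
    per_landmark = (diff.reshape(-1, 2) ** 2).sum(axis=1)
    return float((weights * per_landmark).sum() / (2.0 * d ** 2))


# Hypothetical example: 29 landmarks, the first 4 treated as the current center.
w = center_weights(n=29, center_idx=[0, 1, 2, 3], eta=10.0)
print(w.sum())  # the consistency constraint requires this to equal n
```

Because the weights always sum to n, a fine-tuning step with these weights keeps the loss on the same scale as the basic model, where every landmark has weight 1.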

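The construction of the combined model on slides 23 and 24 amounts to copying, for every landmark, the two weight columns from the center-specific layer whose center contains that landmark. The sketch below illustrates this under stated assumptions: the function names and the tiny matrix sizes are hypothetical, landmark indices are 0-based (so columns 2j and 2j+1 here correspond to $\mathbf{w}_{2j-1}$ and $\mathbf{w}_{2j}$ on the slides), and the clusters are assumed to partition all n landmarks.

```python
import numpy as np


def combine_layers(center_layers, clusters, n):
    """Build the combined shape prediction matrix.

    center_layers : list of m arrays W^i, each of shape (D + 1, 2n)
    clusters      : list of m lists of 0-based landmark indices, where
                    clusters[i] is the center of the i-th layer
    n             : total number of landmarks
    """
    rows = center_layers[0].shape[0]
    combined = np.zeros((rows, 2 * n))
    for layer, cluster in zip(center_layers, clusters):
        for j in cluster:
            # Take both coordinate columns of landmark j from its own center.
            combined[:, 2 * j] = layer[:, 2 * j]
            combined[:, 2 * j + 1] = layer[:, 2 * j + 1]
    return combined


def predict_shape(weight_matrix, x):
    """Predicted coordinates: one dot product w_k^T x per column k."""
    return weight_matrix.T @ x


# Hypothetical toy sizes: 3 landmarks, 2 centers, D + 1 = 3 features.
layers = [np.full((3, 6), 1.0), np.full((3, 6), 2.0)]
W_combined = combine_layers(layers, clusters=[[0], [1, 2]], n=3)
print(W_combined.shape)
```

The combined matrix has the same shape as a single shape prediction layer, which is why the slides note that the combined model has the same complexity as the basic model.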
Editor's Notes

  1. Good morning, everyone. I am Zhiwen Shao. I come from Shanghai Jiao Tong University. In our paper, we propose a Multi-Center Convolutional Network to achieve face alignment.
  2. I first show the background of face alignment
  3. These images illustrate the results of face alignment.
  4. We can observe that these face images are very challenging. They have severe occlusions and large variations of pose, expression, and illumination. Our goal is to develop an efficient method to handle unconstrained faces.
  5. Face alignment can be regarded as a nonlinear regression problem, which transforms appearance to shape. Most conventional methods are based on low-level handcrafted features, so they have a limited capacity to represent complex faces. A deep convolutional network has an outstanding representation ability, so we use it to model the highly nonlinear function.
  6. There are two types of deep learning methods. The first type is based on multiple networks. These methods employ cascaded deep networks to refine predicted shapes successively. Their training processes are complicated and time-consuming, and they have high computational cost and model complexity due to the use of multiple networks.
  7. A very typical method is TCDCN. It trains only one deep network, but it needs extra labels of facial attributes for training samples. This limits the universality of this method. In contrast, our method uses one single network without auxiliary information.
  8. Next, I introduce our method in detail.
  9. Partial occlusion and large pose are main characteristics of unconstrained faces. We discover that each facial landmark is not isolated but highly correlated with adjacent landmarks. There are two examples. In the left figure, facial landmarks along the chin are all occluded. And the right figure shows that landmarks on the right side of the face are almost invisible. Therefore, landmarks in the same local face region have similar properties including occlusion and visibility.
  10. We analyze the structure of a face, and partition it into seven clusters: left eye, right eye, nose, mouth, left contour, chin, and right contour. As shown in these two figures, different labeling patterns of 29 and 68 facial landmarks are partitioned into 5 and 7 clusters respectively. Each cluster contains structurally relevant facial landmarks.
  11. This is the structure of our multi-center convolutional network. Our network consists of shared layers and multiple center-specific shape prediction layers.
  12. The shared layers contain eight convolutional layers and one fully-connected layer. Each max-pooling layer follows a stack of two convolutional layers. Stacking convolutional layers, as proposed by VGGNet, is effective for feature learning.
  13. According to the evaluation metric, we use weighted inter-ocular distance normalized Euclidean loss
  14. We first pre-train a basic model with shared layers and one shape prediction layer.
  15. Corresponding to Step 1
  16. We further fine-tune each center-specific layer respectively
  17. Corresponding to Step 2 to Step 6. Based on the pre-trained model, our network keeps the shared layers and initializes each center-specific layer with the shape prediction parameters. There are m branches of center-specific layers at the end of our network. The fine-tuning of the center-specific layers is mutually independent.
  18. Shared layers and integrated shape prediction layer constitute the combined model
  19. Corresponding to Step 7. We obtain the integrated shape prediction layer by combining the corresponding parameters from each center-specific layer.
  20. We assume a multiplicative relationship between the two weights. To be consistent with the basic model, we keep the weights conforming to this formula, which ensures that the weights sum to n.
  21. By solving the two equations, we obtain the respective weights. When emphasizing the detection of the current center, we still consider the other centers, with relatively small weights rather than zero. This is beneficial for utilizing implicit structural correlations among different facial parts and for searching the solution smoothly.
  22. Next, I present the experiments.
  23. Euclidean distance between two pupil centers
  24. We show the mean error of each cluster for the basic model and the combined model on the COFW dataset. It can be observed that the combined model improves the detection performance of each cluster.
  25. We report the results of our method MCNet and previous works. We can see that our method outperforms most state-of-the-art methods. It is worth noting that TCDCN obtains better performance than our method on IBUG, partly owing to its larger training data. Although occlusions are not detected explicitly, we achieve an outstanding performance on par with Wu et al. on the COFW benchmark.
  26. We plot the CED curves for our method and several state-of-the-art methods. It is observed that our method achieves competitive performance on both benchmarks. Our method achieves better performance at high levels of normalized mean error. Therefore, our method is strongly robust to unconstrained environments.
  27. Here are several images from COFW. We can see that our method achieves higher accuracy than RCPR and CFT in the details. Benefiting from utilizing structural correlations among different facial parts, our method is robust to severe occlusions.
  28. We also show example images from IBUG where our method MCNet outperforms LBF and CFSS. Our method also achieves higher accuracy in the details. Therefore, our method demonstrates superior capability of handling severe occlusions and complex variations of pose, expression, and illumination.
  29. To obtain a more comprehensive comparison, we present the average running speed of different deep learning methods for face alignment. We evaluate these methods on a single core i5-6200U 2.3GHz CPU with 1000 face images. Since CFAN and CDAN do not share their code, we use their published speed results. Both TCDCN and our method MCNet are based on only one network, so they show relatively quick speed. Cascaded CNN, CFAN, and CDAN employ multiple networks, so they cost more running time. Our method only takes 15 ms on average to process one face, profiting from the low model complexity and computational cost of our network. We believe that our method can be extended to real-time facial landmark tracking in unconstrained scenarios.
  30. Finally, I conclude our paper.
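
The evaluation metrics on slide 27 (inter-ocular distance normalized mean error, failure rate with a 10% threshold, and CED curves) could be computed roughly as follows. This is a minimal sketch with hypothetical function names and made-up toy numbers. It assumes landmarks are stored as (n, 2) arrays and that the normalizing distance is the Euclidean distance between the two ground truth pupil centers, as mentioned in note 23.

```python
import numpy as np


def normalized_mean_error(pred, gt, left_pupil, right_pupil):
    """Mean landmark error of one face divided by the inter-ocular distance.

    pred, gt                : arrays of shape (n, 2)
    left_pupil, right_pupil : ground truth pupil centers, shape (2,)
    """
    d = np.linalg.norm(np.asarray(left_pupil, dtype=float)
                       - np.asarray(right_pupil, dtype=float))
    per_landmark = np.linalg.norm(np.asarray(pred, dtype=float)
                                  - np.asarray(gt, dtype=float), axis=1)
    return float(per_landmark.mean() / d)


def summarize(errors, threshold=0.10):
    """Return (mean error in %, failure rate in %) over a test set.

    A face counts as a failure when its mean error exceeds the threshold.
    """
    errors = np.asarray(errors, dtype=float)
    return float(errors.mean() * 100.0), float((errors > threshold).mean() * 100.0)


def ced_curve(errors, thresholds):
    """Fraction of faces whose error is at most each threshold."""
    errors = np.asarray(errors, dtype=float)
    return np.array([(errors <= t).mean() for t in thresholds])


# Toy usage with made-up per-face errors (not results from the slides).
mean_pct, failure_pct = summarize([0.05, 0.15, 0.08, 0.20])
curve = ced_curve([0.05, 0.15, 0.08, 0.20], thresholds=np.linspace(0.0, 0.3, 4))
```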