May Internship Challenge: A User Authentication System Using Only Image Data, Combining Face Recognition and Hand Tracking
1. Internship Challenge Presentation:
A User Authentication System Using Only Image Data:
Combining Face Recognition and Hand Tracking
Author: Mitsuhiko Nozawa
Date: 2021/06/21
2. Agenda
1. Overview: a brief explanation of the application
2. What is an Authentication System?
3. Proposed Method
4. Experiment Results
5. Problems of the Proposed Method
6. Future Work
3. Theme:
❏ User Authentication System with Face Recognition and Hand Tracking
Application functions:
❏ Using a face image, verify whether the user is registered
❏ Then receive a password through hand tracking and authenticate the user
Use case:
❏ Office entrances, etc.
Motivation:
❏ Face recognition is a very active research area, but its real-world use in authentication systems did not seem widespread to me.
❏ Inspired by ReID (person re-identification)
Overview (brief explanation of my app)
[Figure: the digits 1 to 9 laid out in a 3×3 grid]
4. Role
❏ Judge whether a person belongs to a particular group
Common forms
❏ ID and password
❏ Card key
❏ Biometric authentication
Why do you need it?
❏ There is demand for restricting use to registered users only
❏ For example, a company holds a lot of confidential information, and if entry to the office is unrestricted, that information may leak.
Problems
❏ ID and password: management cost for each member
❏ Card key: risk of loss, requires some physical contact
What is an Authentication System?
5. Biometric Authentication
❏ Authenticates a person using biological features unique to that person
❏ e.g., vein authentication, fingerprint authentication, face authentication
❏ Used for unlocking smartphones
Among these, face authentication has recently become mainstream.
❏ Face ID on iPhone
❏ Face ID is used in the smartphone apps of financial institutions
However, these face authentication systems use 3D features (depth) obtained from a sensor camera,
and a sensor camera is expensive compared with a normal camera.
What is an Authentication System?
6. My initial application idea:
❏ A face authentication system using only image data from a normal camera
Advantages:
❏ No risk of loss, comparatively inexpensive, completely contactless
Threat: spoofing
❏ Pictures, videos, dummies made with a 3D printer (these have reportedly defeated iPhone's Face ID)
❏ This problem is very difficult to solve with image data alone (a cat-and-mouse game)
・I gave up on authenticating with the face image alone and decided to add another piece of user-specific information
❏ Eye tracking: risk of spoofing with a video
❏ Gestures: recognizing complex gestures seems difficult
❏ Password input via hand tracking → the algorithm is simple, and a simpler method is better
My final application idea:
❏ A face recognition and hand tracking password authentication system using only image data
Proposed Method (Application Idea)
7. Face Recognition
❏ Get the features of the input user and the registered users from a pretrained feature extractor
❏ Calculate the cosine similarity between the input user and each registered user
❏ If the maximum similarity exceeds a threshold, the input user is regarded as the registered user with that maximum similarity
❏ Otherwise, the input user is regarded as not registered
❏ Training of the feature extractor: deep metric learning (ArcFace)
Proposed Method (Implementation)
[Figure: the input image passes through the feature extractor to produce the input_user embedding. Similarity is calculated between it and registered_user1_embedding, registered_user2_embedding, ..., registered_userN_embedding, giving the max_sim user.]
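The matching steps on this slide can be sketched in a few lines. This is a minimal illustration, not the code used in the application: plain Python lists stand in for the embeddings from the feature extractor, and the names `identify_user`, `registered`, and the threshold value 0.5 are assumptions made for the example. Embeddings are assumed to be non-zero vectors.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify_user(input_embedding, registered_embeddings, threshold):
    """Return the registered user with the highest similarity,
    or None when that similarity does not exceed the threshold."""
    best_user, max_sim = None, -1.0
    for user, embedding in registered_embeddings.items():
        sim = cosine_similarity(input_embedding, embedding)
        if sim > max_sim:
            best_user, max_sim = user, sim
    return best_user if max_sim > threshold else None

# Toy 3-dimensional embeddings (hypothetical values).
registered = {"user1": [1.0, 0.0, 0.0], "user2": [0.0, 1.0, 0.0]}
print(identify_user([0.9, 0.1, 0.0], registered, threshold=0.5))
print(identify_user([0.0, 0.0, 1.0], registered, threshold=0.5))
```

The threshold trades off false accepts against false rejects, so in practice it would be chosen on validation pairs rather than fixed by hand.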
8. Learning a mapping from input data to feature vectors in a multidimensional space
Similar inputs should be mapped close to each other, and dissimilar inputs far apart.
About Metric Learning
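[Figure: the backbone model maps the input to x, shape (emb_size, 1), the mapping of the input data in a multidimensional space. x and W, shape (emb_size, class_num), are each normalized, and matmul gives W^T x, shape (class_num, 1): the cosine similarities between the input data mapping (x) and the features of each class (W). Softmax is applied to W^T x, and Cross Entropy is computed against the label Y.]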
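The computation in this diagram, with the angular margin that ArcFace adds for the true class, can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the training code used here: the function name `arcface_loss`, the scale `s=30.0`, and the margin `m=0.5` are chosen for the example, and the sketch omits the special handling some implementations apply when the angle plus the margin exceeds π.

```python
import math

def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def arcface_loss(x, class_weights, label, s=30.0, m=0.5):
    """Cross entropy over scaled cosine logits, with margin m added to the
    angle of the true class.

    x: embedding from the backbone (length emb_size)
    class_weights: class_num vectors of length emb_size (the columns of W)
    s, m: scale and angular margin (illustrative values, an assumption)
    """
    x_hat = l2_normalize(x)
    logits = []
    for j, w in enumerate(class_weights):
        w_hat = l2_normalize(w)
        cos = sum(a * b for a, b in zip(w_hat, x_hat))
        cos = max(-1.0, min(1.0, cos))  # guard acos against rounding
        if j == label:
            cos = math.cos(math.acos(cos) + m)  # penalize the true class
        logits.append(s * cos)
    # Numerically stable softmax cross entropy.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]

weights = [[1.0, 0.0], [0.0, 1.0]]  # two classes, emb_size = 2
print(arcface_loss([1.0, 1.0], weights, label=0, m=0.0))  # plain cosine softmax
print(arcface_loss([1.0, 1.0], weights, label=0, m=0.5))  # with angular margin
```

With `m=0.0` the function reduces to the normalize, matmul, softmax, cross-entropy path drawn in the diagram. A positive margin makes the true class harder to satisfy, which pushes embeddings of the same identity closer together.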
9. Hand Tracking Password Authentication
❏ I had no time to build an original model…
❏ So I use the MediaPipe Hands library created by Google
❏ First, draw circle masks on the input image, each holding one character
❏ Get the coordinates of the index fingertip from the model
❏ If the coordinates fall inside a particular circle, return that circle's character
Proposed Method (Implementation)
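The circle test in the last two bullets can be sketched as below. The layout helper `build_keypad`, the 3×3 arrangement, and all pixel values are assumptions for illustration; the fingertip point is taken as already available in pixel coordinates.

```python
def build_keypad(characters, radius, spacing, origin=(100, 100), columns=3):
    """Place one circle per character on a grid.
    Returns {character: (center_x, center_y, radius)} in pixels."""
    keypad = {}
    for i, ch in enumerate(characters):
        row, col = divmod(i, columns)
        keypad[ch] = (origin[0] + col * spacing, origin[1] + row * spacing, radius)
    return keypad

def character_at(point, keypad):
    """Return the character whose circle contains the point, or None."""
    px, py = point
    for ch, (cx, cy, r) in keypad.items():
        # Inside the circle when the squared distance is within r squared.
        if (px - cx) ** 2 + (py - cy) ** 2 <= r ** 2:
            return ch
    return None

keypad = build_keypad("123456789", radius=40, spacing=120)
print(character_at((225, 215), keypad))  # near the centre of the "5" circle
print(character_at((160, 160), keypad))  # in the gap between circles
```

Keeping the spacing larger than twice the radius guarantees the circles never overlap, so a fingertip position maps to at most one character.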
10. A hand and hand-skeleton detection model created by Google.
The model is designed to run on mobile devices, so it is fast and light.
It produces its final results in 2 stages:
1. Detect the palm in the input image and crop the image using the bounding box
2. From the cropped image, predict 21 hand skeleton points
I use landmark number 8, INDEX_FINGER_TIP, to enter the password.
About MediaPipe Hands
reference: https://google.github.io/mediapipe/solutions/hands.html
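MediaPipe Hands reports landmark x and y normalized to [0, 1] by the image width and height, so the fingertip has to be converted to pixels before it can be compared with positions on the image. The sketch below assumes the 21 landmarks have already been read out of the model's output as plain (x, y) pairs; the helper name `index_fingertip_pixel` is hypothetical and the MediaPipe call itself is not shown.

```python
INDEX_FINGER_TIP = 8  # landmark index of the index fingertip

def index_fingertip_pixel(landmarks, image_width, image_height):
    """Convert the normalized (x, y) of landmark 8 to integer pixel coordinates.

    landmarks: sequence of 21 (x, y) pairs normalized to [0, 1]
    Returns None when the hand is incomplete or the tip is outside the frame.
    """
    if len(landmarks) != 21:
        return None
    x, y = landmarks[INDEX_FINGER_TIP]
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        return None
    # Clamp so that x == 1.0 or y == 1.0 still maps to a valid pixel.
    px = min(int(x * image_width), image_width - 1)
    py = min(int(y * image_height), image_height - 1)
    return px, py

hand = [(0.5, 0.5)] * 21              # placeholder landmarks
hand[INDEX_FINGER_TIP] = (0.25, 0.75)
print(index_fingertip_pixel(hand, 640, 480))
```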
11. Proposed Method - Overall View
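[Figure: overall view. Face Recognition Phase: input image → feature extract → calculate similarity against the registered user features → match? (Y/N). Hand Tracking Phase: detect hand → input password, compared with the registered password → match? (Y/N) → Authenticated! The legend distinguishes data flow from control flow.]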
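The two-phase control flow can be summarized in a short sketch. It assumes the face recognition phase has already produced either a registered user name or None, and that the hand tracking phase has produced the entered password as a string; the function name `authenticate` and the in-memory password table are hypothetical, and a real system would store password hashes rather than plain text.

```python
import hmac

def authenticate(matched_user, entered_password, stored_passwords):
    """Two-phase check: face match first, then the hand-tracked password."""
    # Face recognition phase: None means no registered user matched.
    if matched_user is None:
        return False
    expected = stored_passwords.get(matched_user)
    if expected is None:
        return False
    # Hand tracking phase: constant-time comparison of the two passwords.
    return hmac.compare_digest(entered_password.encode(), expected.encode())

stored = {"user1": "2580"}  # hypothetical registered password
print(authenticate("user1", "2580", stored))
print(authenticate("user1", "1234", stored))
print(authenticate(None, "2580", stored))
```

The password is only checked against the user chosen by the face match, so a correct password for a different user does not authenticate.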
12. Training the feature extractor for face recognition
Dataset:
❏ CASIA WebFace (for training): 10,575 identities, 494,414 images
❏ LFW (for testing): 5,749 identities, 13,233 images
Metric
❏ AUC calculated over all image pairs (every 2-combination of the N images, labeled match or non-match)
Training Strategy
❏ All images are cropped to (112, 112) using the MTCNN face detector
❏ Batch size is 512
❏ ResNet50 is adopted as the backbone model
❏ 5% of the training data is used as validation data
Experiment
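The pairwise AUC metric can be sketched as follows. This is an illustrative brute-force version, not the evaluation script used for the experiments: it scores every 2-combination of images by cosine similarity and computes AUC as the fraction of (matching pair, non-matching pair) comparisons in which the matching pair scores higher, counting ties as half. The name `pairwise_auc` and the toy data are assumptions.

```python
import math
from itertools import combinations

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def pairwise_auc(embeddings, identities):
    """AUC over all image pairs, labeled match (same identity) or non-match."""
    match_scores, non_match_scores = [], []
    for i, j in combinations(range(len(embeddings)), 2):
        score = cosine_similarity(embeddings[i], embeddings[j])
        if identities[i] == identities[j]:
            match_scores.append(score)
        else:
            non_match_scores.append(score)
    if not match_scores or not non_match_scores:
        raise ValueError("need at least one matching and one non-matching pair")
    wins = 0.0
    for p in match_scores:
        for n in non_match_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(match_scores) * len(non_match_scores))

embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(pairwise_auc(embeddings, ["a", "a", "b", "b"]))  # identities well separated
print(pairwise_auc(embeddings, ["a", "b", "a", "b"]))  # identities mixed up
```

The double loop is quadratic in the number of pairs, so for a test set the size of LFW a rank-based AUC computation would be the practical choice.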
13. Experiment Results
I could not fully reproduce the experimental results of the paper.
Possible causes:
❏ More data is needed
❏ The program has a bug
❏ The backbone model is not sufficient
experiment  optimizer  lr    scheduling           loss function                 AUC
exp_000     SGD        0.1   Multi Step [20, 28]  Cross Entropy                 - (too bad)
exp_001     Adam       5e-4  Multi Step [20, 28]  Cross Entropy                 0.827
exp_002     Adam       5e-4  no scheduling        Cross Entropy                 0.616
exp_003     Adam       5e-4  no scheduling        Class Weighted Cross Entropy  0.633
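For readers unfamiliar with the "Multi Step [20, 28]" entry, a multi-step schedule multiplies the learning rate by a fixed factor each time training reaches one of the listed milestones. The sketch below assumes the milestones are epochs and uses a decay factor of 0.1; the slides do not state the factor, so `gamma` here is an assumption, as is the function name.

```python
def multi_step_lr(base_lr, epoch, milestones=(20, 28), gamma=0.1):
    """Learning rate at a given epoch: base_lr times gamma for every
    milestone that has been reached (gamma is an assumed value)."""
    reached = sum(1 for milestone in milestones if epoch >= milestone)
    return base_lr * gamma ** reached

for epoch in (0, 19, 20, 27, 28):
    print(epoch, multi_step_lr(5e-4, epoch))
```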
14. ● Hand tracking password authentication takes a long time
❏ The more users there are at the same time, the higher the time cost.
● Other people can see the password while the user is entering it
❏ The application has to be used where other people cannot watch...
● The input user may be wearing a mask, glasses, and so on 😷
❏ Save multiple images per user
❏ Use data augmentation at registration to add such accessories to the user's image
Problems of Application
15. ● I proposed a user authentication system that uses only image data, combining face recognition and hand tracking
● This application solves some existing problems
❏ Risk of loss
❏ Cost of a sensor camera for face recognition
❏ Spoofing
● But there are 2 severe problems
❏ Time cost of password input
❏ Password exposure
● So if there are not many users and the input can be kept out of other people's sight, the application will work well.
Conclusion - Future Work
16. ArcFace
Deng, Jiankang, et al. "ArcFace: Additive Angular Margin Loss for Deep Face Recognition." 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2019.
MediaPipe Hands
Zhang, Fan, et al. "MediaPipe Hands: On-device Real-time Hand Tracking." arXiv preprint arXiv:2006.10214 (2020).
Reference