SlideShare a Scribd company logo
1 of 50
ONE SHOT LEARNING
FOR RECOGNITION
Keren Ofek
Today’s Agenda
■ Similarity and how to measure it?
■ Embedding
■ One-Shot learning Challenges
■ Siamese Network
■ DeepNet
■ Triplets
■ FaceNet
What is one shot learning?
■ One-shot learning aims to learn information about
object categories from few training examples.
■ The idea is to understand the similarity between the
detected object to an known object.
Image Source: Google
Similarity
Image Source: Google
How can we measure similarity?
■ We will find a function that quantifies a “distance”
between every pair of elements in a set
Non-negativity: f(x, y) ≥ 0
Identity of Discernible: f(x, y) = 0 <=> x = y
Symmetry: f(x, y) = f(y, x)
Triangle Inequality: f(x, z) ≤ f(x, y) + f(y, z)
Which Distance Metric to choose?
■ Pre-defined Metrics
Metrics which are fully
specified without the
knowledge of data.
E.g. Euclidian Distance:
f(𝒙 ,𝒚)= 𝒊 𝒙𝒊 − 𝒚𝒊
𝟐
■ Learned Metrics
Metrics which can only be
defined with the knowledge
of the data:
■ Un-Supervised Learning
Or
■ Supervised Learning
Based on slide from: February-16 Lecture http://slazebni.cs.illinois.edu/spring17/
Un-supervised distance metric: :
Mahalanobis Distance : f(x, y) = (x - y) T∑-1(x -
y)
where ∑ is the
mean-subtracted
covariance matrix
of all data points
From: Gene expression data clustering and
visualization based on a binary hierarchical clustering
framework
Un-supervised distance metric:
• 2-step procedure:
• Apply some supervised domain transform:
• Then use one of the un-supervised metrics for
performing the mapping.
Bellet, A., Habrard, A. and Sebban, A survey on metric learning for feature vectors and structured data
Embedding Process
■ The inputs are mapped to vectors in a low-dimensional space
■ low-dimensional space is called Target Space / Taken Space
■ In the taken Space, there is insensitivity
intra category changes, but sensitivity to inter category changes
https://hackernoon.com/latent-space-visualization-deep-learning-bits-2-bd09a46920df
Linear Discriminant Analysis (LDA)
■ LDA based upon the concept of searching for a linear combination
of variables that best separates two classes:
– Minimizes variance within the class
– Maximizes variance between classes
Fisher
Ratio
■ LDA is an example to understand
the motivation of embedding
=
https://analyticsdefined.com/introduction-linear-discriminant-analysis/
Linear Discriminant Analysis (LDA)
https://analyticsdefined.com/introduction-linear-discriminant-analysis/
The One-Shot learning challenge
■ There are lots of categories
■ The Number of categories is not always known
■ The number of samples in each category is small
 One shot learning is super relevant in the field of computer vision:
recognize objects in images from a single example.
 Linear Models are not relevant in this field.
Face Recognition challenges
Verification Recognition Clustering
The Method - Using CNN:
■ The Idea is to learn a function that maps input patterns to target
space.
■ non-linear mapping that can map any input vector to its
corresponding low-dimensional version.
■ The distance in the target space approximates the “semantic”
distance in the input space.
■ A discriminative training method - extract information about the
problem from the available data, without requiring specific
information about the categories.
■ The training will be on pairs of samples.
Energy-based models (EBM)
■ EBM capture dependencies between variables by associating a
scalar energy to each configuration of the variables.
■ No requirement for proper normalization
■ Provide considerably more flexibility in the design of architectures
and training criteria than probabilistic approaches
LeCun, Chopra, Hadsell, Ranzato, Huang, A Tutorial on Energy-Based Learning, 2006
Siamese Network Architecture
Koch, Zemel, Salakhutdinov, Siamese Neural Networks for One-shot Image Recognition, 2015
Similarity Metric
- Same Category
Genuine Pair
Minimizes
- Different
Categories
Impostor Pair
Maximizes
We seek to find a value of the parameter W such that:
 Contrastive term is needed to ensure: not only that the energy for a pair of
inputs from the same category is low, but also that the energy for a pair from
different categories is large.
Siamese Neural Networks
Koch, Zemel, Salakhutdinov, Siamese Neural Networks for One-shot Image Recognition, 2015
Siamese Neural Networks
One-shot Image Recognition
Koch, Zemel, Salakhutdinov, Siamese Neural Networks for One-shot Image Recognition, 2015
Hinge Loss:
Positive pair
Embedding Space
CNN
CNN
L(xp, xq) = ||xp – xq||2
Loss
Positive
Query
https://www.cs.cornell.edu/~kb/publications/SIG15ProductNet.pdf
Hinge Loss
Negative Pair
Embedding Space
CNN
Margin (m)
CNN
Loss = 0
L(xp, xq) = max(0, m2 - || xp – xq || 2 )
Neg
Query
https://www.cs.cornell.edu/~kb/publications/SIG15ProductNet.pdf
Hinge Loss
Negative Pair
Embedding Space
CNN
Margin (m)
CNN
Loss
L(xp, xq) = max(0, m2 - || xp – xq || 2 )
Neg
Query
https://www.cs.cornell.edu/~kb/publications/SIG15ProductNet.pdf
Visualization of Learned Features
Ahmed, Jones, Marks, An improved deep learning architecture for person re-identification, 2015
Positive Pair Negative Pair
Example: Omniglot dataset
■ Omniglot contains examples from 50 alphabets:
• well-established international languages (Latin and
Korean)
• lesser known local dialects.
Koch, Zemel, Salakhutdinov, Siamese Neural Networks for One-shot Image Recognition, 2015
Omniglot dataset: classification task
Input:
Classes:
Results:
Koch, Zemel, Salakhutdinov, Siamese Neural Networks for One-shot Image Recognition, 2015
Omniglot dataset: Verification task
Results:
■ Training on 3 data set sizes with 30,000, 90,000, and 150,000
■ Sampling random same and different pairs.
■ Generating 8 random affine distortions for each category
Koch, Zemel, Salakhutdinov, Siamese Neural Networks for One-shot Image Recognition, 2015
DeepFace (Facebook, 2014)
■ The conventional pipeline:
Detect ⇒ align ⇒ represent ⇒ classify
■ Face alignment: Transform a face to be in a canonical
pose
■ Face representation: Find a representation of a face
which is suitable for follow-up tasks (small size,
computationally cheap to compare, invariant to
irrelevant changes)
■ 3D face modeling
■ A nine-layer deep neural network
■ More than 120 million parameters
Taigman; Yang; Ranzato, Wolf, DeepFace: Closing the Gap to Human-Level Performance in Face
The alignment process - 'Frontalization'
Taigman; Yang; Ranzato, Wolf, DeepFace: Closing the Gap to Human-Level Performance in Face
The alignment process - 'Frontalization'
(a) 2D Alignment - detecting 6 fiducial points inside the detection
crop, centered at the center of the eyes, tip of the nose and
mouth locations
(b) 2D Alignment - aligned crop: composing the final 2D
transformation
Taigman; Yang; Ranzato, Wolf, DeepFace: Closing the Gap to Human-Level Performance in Face
The alignment process - 'Frontalization'
(c) 2D Alignment - localizing additional 67 fiducial points
(d) 3D Alignment - The reference 3D shape transformed to the 2D-
aligned crop image-plane.
Taigman; Yang; Ranzato, Wolf, DeepFace: Closing the Gap to Human-Level Performance in Face
The alignment process - 'Frontalization '
(e) 2D-3D Alignment - Triangle visibility w.r.t. to the fitted 3D-2D
camera; darker triangles are less visible.
(f) 3D Alignment - The 67 fiducial points induced by the 3D model
that are used to direct the piece-wise affine wrapping.
(g) 2D Alignment - The final frontalized crop
Taigman; Yang; Ranzato, Wolf, DeepFace: Closing the Gap to Human-Level Performance in Face
The DeepFace architecture
C1, M2, C3 : Extract low-level features (simple edges and texture)
L4, L5, L6: Locally connected Layers
L7, L8: Fully connected Layers
Taigman; Yang; Ranzato, Wolf, DeepFace: Closing the Gap to Human-Level Performance in Face
The training process
■ We use cross-entropy as the loss function:
(K is the index of the true label for a given input)
■ The loss is minimized over the parameters by computing the
gradient of L with respect to the parameters and by updating the
parameters using stochastic gradient descent (SGD).
■ The gradients are computed by standard backpropagation of the
error.
Face verification
■ In order to tell whether two images of faces show the same person
they try three different methods.
■ Each of these relies on the vector extracted by the first fully
connected layer in the network (4096d).
■ Let these vectors be f1 (image 1) and f2 (image 2). The methods
are then:
1. Inner product
2. Weighted X^2 (chi-squared) distance
3. Siamese network.
Results
■ LFW (Labeled Faces in the Wild): 13,323 web photos of
5,749 celebrities
Human Performance : 97.5% accuracy
The DeepFace Performance: 97.35% accuracy
■ YTF (YouTube Faces): 3425 YouTube videos of 1,595
subjects
The DeepFace Performance: 92.5% accuracy
(reduces the previous error by more than 50%!)
Triplets Network
■ The Triplet Loss minimizes the distance between an anchor and a
positive, both of which have the same identity, and maximizes the
distance between the anchor and a negative of a different identity.
Triplet Network
Hoffer, Ailon, DEEP METRIC LEARNING USING TRIPLET NETWORK , 2015
𝑁𝑒𝑡 𝑥𝑞 − 𝑁𝑒𝑡(𝑥𝑛) 2
𝑁𝑒𝑡 𝑥𝑞 − 𝑁𝑒𝑡(𝑥𝑝) 2
𝑁𝑒𝑡 𝑁𝑒𝑡 𝑁𝑒𝑡
Comparator
𝑥𝑛 𝑥𝑞 𝑥𝑝
FaceNet (Google, 2015)
■ FaceNet approach is that they use Euclidean embedding
space to find the similarity or difference between faces.
■ The method uses a deep convolutional network trained
to directly optimize the embedding itself, rather than an
intermediate bottleneck layer as in previous deep
learning approaches.
■ They used large-scale dataset
https://arxiv.org/pdf/1503.03832v3.pdf
The training process
The network consists of a batch input layer and a deep CNN
followed by L2 normalization, which results in the face
embedding. This is followed by the triplet loss during training.
Schroff, Kalenichenko, Philbin, FaceNet: A Unified Embedding for Face Recognition and Clustering, 2015
The training process
■ We learn an embedding f(x), from an image x into a feature space
R d , such that the squared distance between all faces of the same
identity is small, whereas the squared distance between a pair of
face images from different identities is large.
■ Loss Function:
𝑓 - the embedding function
𝐿 =
𝑖
𝑁
𝑓 𝑥𝑞,𝑖 − 𝑓(𝑥𝑝,𝑖)
2
2
− 𝑓 𝑥𝑞,𝑖 − 𝑓 𝑥𝑛,𝑖 2
2
+ 𝛼
+
Triplets Selection
■ In order to ensure fast convergence it is crucial to select triplets that violate
the triplet constraint in
■ We could select (𝑥𝑝,𝑖) and 𝑥𝑛,𝑖 :
(1) hard positive- argmax (𝑥𝑝,𝑖) 𝑓 𝑥𝑞,𝑖 − 𝑓(𝑥𝑝,𝑖) 2
2
(2) hard negative- argmin (𝑥𝑝,𝑖) 𝑓 𝑥𝑞,𝑖 − 𝑓(𝑥𝑛,𝑖) 2
2
■ But, Choose all anchor-positive pairs in a mini-batch while selecting only
hard-negatives gave better results.
■ To avoid local minima, we use ”semi-hard” exemplars:
𝑥𝑛,𝑖 such that
𝑓 𝑥𝑞,𝑖 − 𝑓(𝑥𝑝,𝑖)
2
2
+ 𝛼 < 𝑓 𝑥𝑞,𝑖 − 𝑓
2
2
𝑓 𝑥𝑞,𝑖 − 𝑓(𝑥𝑝,𝑖) 2
2
< 𝑓 𝑥𝑞,𝑖 − 𝑓 2
2
Embeddin
g location
of anchor
Hard
Positive
Hard
Negative
Triplets
Negative Space
Positive Space
The Model structure
(after the training)
Input
DISTANCE
The FaceNet architecture (NN1)
Schroff, Kalenichenko, Philbin, FaceNet: A Unified Embedding for Face Recognition and Clustering, 2015
The FaceNet architecture (NN2)
Schroff, Kalenichenko, Philbin, FaceNet: A Unified Embedding for Face Recognition and Clustering, 2015
The results – New records!
■ Performance on Youtube Faces DB: 95.12% accuracy
■ Performance on Labeled Faces in the Wild DB: 99.63%
accuracy
■ Sensitivity to Image Quality:
(The CNN was trained on 220x220 input images)
Schroff, Kalenichenko, Philbin, FaceNet: A Unified Embedding for Face Recognition and Clustering, 2015
# Pixels Val-Rate
1,600 37.8%
6,400 79.5%
14,400 84.5%
25,600 85.7%
65,539 86.4%
FaceNet – Image clustering
Schroff, Kalenichenko, Philbin, FaceNet: A Unified Embedding for Face Recognition and Clustering, 2015
Summary
The subjects we covered:
■ Embedding - Similarity in taken space
■ One-Shot learning Challenges
■ Network principles: Siamese Network and Triplets
■ Network in used: DeepNet and FaceNet
References
■ Chopra, Hadsell, LeCun, Learning a Similarity Metric Discriminatively, with Application to
Face Verification , 2005
■ Taigman; Yang; Ranzato, Wolf, DeepFace: Closing the Gap to Human-Level Performance
in Face Verification, 2014
■ Schroff, Kalenichenko, Philbin, FaceNet: A Unified Embedding for Face Recognition and
Clustering, 2015
■ Koch, Zemel, Salakhutdinov, Siamese Neural Networks for One-shot Image Recognition,
2015
■ Hermans, Beyer, Leibe, In Defense of the Triplet Loss for Person Re-Identification, 2017
■ LeCun, Chopra, Hadsell, Ranzato, Huang, A Tutorial on Energy-Based Learning, 2006
■ Bell, Bala, Learning visual similarity for product design with convolutional neural
networks, 2015
■ Ahmed, Jones, Marks, An improved deep learning architecture for person re-
identification, 2015
■ Nando de Freias, Max-margin learning, transfer and memory networks, 2015
■ Jagannatha Rao, Wang, Cottrell, A Deep Siamese Neural Network Learns the Human-
Perceived Similarity Structure of Facial Expressions Without Explicit Categories, 2011
■ Hoffer, Ailon, DEEP METRIC LEARNING USING TRIPLET NETWORK , 2015
■ Bellet, Habrard, Sebban, A survey on metric learning for feature vectors and structured
THANKS!
Keren Ofek

More Related Content

Similar to one shot15729752 Deep Learning for AI and DS

Generative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural NetworksGenerative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural NetworksDenis Dus
 
Mirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image ProcessingMirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image ProcessingMeetupDataScienceRoma
 
Computer Vision - Real Time Face Recognition using Open CV and Python
Computer Vision - Real Time Face Recognition using Open CV and PythonComputer Vision - Real Time Face Recognition using Open CV and Python
Computer Vision - Real Time Face Recognition using Open CV and PythonAkash Satamkar
 
Visualizing the Model Selection Process
Visualizing the Model Selection ProcessVisualizing the Model Selection Process
Visualizing the Model Selection ProcessBenjamin Bengfort
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273Abutest
 
Yulia Honcharenko "Application of metric learning for logo recognition"
Yulia Honcharenko "Application of metric learning for logo recognition"Yulia Honcharenko "Application of metric learning for logo recognition"
Yulia Honcharenko "Application of metric learning for logo recognition"Fwdays
 
物件偵測與辨識技術
物件偵測與辨識技術物件偵測與辨識技術
物件偵測與辨識技術CHENHuiMei
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in ImagesAnil Kumar Gupta
 
Avihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slidesAvihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slideswolf
 
Anomaly detection using deep one class classifier
Anomaly detection using deep one class classifierAnomaly detection using deep one class classifier
Anomaly detection using deep one class classifier홍배 김
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用CHENHuiMei
 
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr TeterwakLearn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr TeterwakPyData
 
Generational Adversarial Neural Networks - Essential Reference
Generational Adversarial Neural Networks - Essential ReferenceGenerational Adversarial Neural Networks - Essential Reference
Generational Adversarial Neural Networks - Essential ReferenceGokul Alex
 
Machine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersMachine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersAlbert Y. C. Chen
 
Report face recognition : ArganRecogn
Report face recognition :  ArganRecognReport face recognition :  ArganRecogn
Report face recognition : ArganRecognIlyas CHAOUA
 
Throttling Malware Families in 2D
Throttling Malware Families in 2DThrottling Malware Families in 2D
Throttling Malware Families in 2DMohamed Nassar
 
Face recognition using artificial neural network
Face recognition using artificial neural networkFace recognition using artificial neural network
Face recognition using artificial neural networkSumeet Kakani
 
Deep learning from a novice perspective
Deep learning from a novice perspectiveDeep learning from a novice perspective
Deep learning from a novice perspectiveAnirban Santara
 
GAN Deep Learning Approaches to Image Processing Applications (1).pptx
GAN Deep Learning Approaches to Image Processing Applications (1).pptxGAN Deep Learning Approaches to Image Processing Applications (1).pptx
GAN Deep Learning Approaches to Image Processing Applications (1).pptxRMDAcademicCoordinat
 

Similar to one shot15729752 Deep Learning for AI and DS (20)

AI and Deep Learning
AI and Deep Learning AI and Deep Learning
AI and Deep Learning
 
Generative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural NetworksGenerative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural Networks
 
Mirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image ProcessingMirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image Processing
 
Computer Vision - Real Time Face Recognition using Open CV and Python
Computer Vision - Real Time Face Recognition using Open CV and PythonComputer Vision - Real Time Face Recognition using Open CV and Python
Computer Vision - Real Time Face Recognition using Open CV and Python
 
Visualizing the Model Selection Process
Visualizing the Model Selection ProcessVisualizing the Model Selection Process
Visualizing the Model Selection Process
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273A
 
Yulia Honcharenko "Application of metric learning for logo recognition"
Yulia Honcharenko "Application of metric learning for logo recognition"Yulia Honcharenko "Application of metric learning for logo recognition"
Yulia Honcharenko "Application of metric learning for logo recognition"
 
物件偵測與辨識技術
物件偵測與辨識技術物件偵測與辨識技術
物件偵測與辨識技術
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in Images
 
Avihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slidesAvihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slides
 
Anomaly detection using deep one class classifier
Anomaly detection using deep one class classifierAnomaly detection using deep one class classifier
Anomaly detection using deep one class classifier
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用
 
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr TeterwakLearn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
 
Generational Adversarial Neural Networks - Essential Reference
Generational Adversarial Neural Networks - Essential ReferenceGenerational Adversarial Neural Networks - Essential Reference
Generational Adversarial Neural Networks - Essential Reference
 
Machine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersMachine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional Managers
 
Report face recognition : ArganRecogn
Report face recognition :  ArganRecognReport face recognition :  ArganRecogn
Report face recognition : ArganRecogn
 
Throttling Malware Families in 2D
Throttling Malware Families in 2DThrottling Malware Families in 2D
Throttling Malware Families in 2D
 
Face recognition using artificial neural network
Face recognition using artificial neural networkFace recognition using artificial neural network
Face recognition using artificial neural network
 
Deep learning from a novice perspective
Deep learning from a novice perspectiveDeep learning from a novice perspective
Deep learning from a novice perspective
 
GAN Deep Learning Approaches to Image Processing Applications (1).pptx
GAN Deep Learning Approaches to Image Processing Applications (1).pptxGAN Deep Learning Approaches to Image Processing Applications (1).pptx
GAN Deep Learning Approaches to Image Processing Applications (1).pptx
 

More from ManiMaran230751

Deep-Learning-2017-Lecture ML DL RNN.ppt
Deep-Learning-2017-Lecture  ML DL RNN.pptDeep-Learning-2017-Lecture  ML DL RNN.ppt
Deep-Learning-2017-Lecture ML DL RNN.pptManiMaran230751
 
14889574 dl ml RNN Deeplearning MMMm.ppt
14889574 dl ml RNN Deeplearning MMMm.ppt14889574 dl ml RNN Deeplearning MMMm.ppt
14889574 dl ml RNN Deeplearning MMMm.pptManiMaran230751
 
12337673 deep learning RNN RNN DL ML sa.ppt
12337673 deep learning RNN RNN DL ML sa.ppt12337673 deep learning RNN RNN DL ML sa.ppt
12337673 deep learning RNN RNN DL ML sa.pptManiMaran230751
 
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.pptHADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.pptManiMaran230751
 
GNA 13552928 deep learning for GAN a.ppt
GNA 13552928 deep learning for GAN a.pptGNA 13552928 deep learning for GAN a.ppt
GNA 13552928 deep learning for GAN a.pptManiMaran230751
 
Reinforcemnet Leaning in ML and DL.pptx
Reinforcemnet Leaning in ML and  DL.pptxReinforcemnet Leaning in ML and  DL.pptx
Reinforcemnet Leaning in ML and DL.pptxManiMaran230751
 
24.09.2021 Reinforcement Learning Algorithms.pptx
24.09.2021 Reinforcement Learning Algorithms.pptx24.09.2021 Reinforcement Learning Algorithms.pptx
24.09.2021 Reinforcement Learning Algorithms.pptxManiMaran230751
 
The Stochastic Network Calculus: A Modern Approach.pptx
The Stochastic Network Calculus: A Modern Approach.pptxThe Stochastic Network Calculus: A Modern Approach.pptx
The Stochastic Network Calculus: A Modern Approach.pptxManiMaran230751
 
Open Access and IR along with Quality Indicators.pptx
Open Access and IR along with Quality Indicators.pptxOpen Access and IR along with Quality Indicators.pptx
Open Access and IR along with Quality Indicators.pptxManiMaran230751
 

More from ManiMaran230751 (11)

Deep-Learning-2017-Lecture ML DL RNN.ppt
Deep-Learning-2017-Lecture  ML DL RNN.pptDeep-Learning-2017-Lecture  ML DL RNN.ppt
Deep-Learning-2017-Lecture ML DL RNN.ppt
 
14889574 dl ml RNN Deeplearning MMMm.ppt
14889574 dl ml RNN Deeplearning MMMm.ppt14889574 dl ml RNN Deeplearning MMMm.ppt
14889574 dl ml RNN Deeplearning MMMm.ppt
 
12337673 deep learning RNN RNN DL ML sa.ppt
12337673 deep learning RNN RNN DL ML sa.ppt12337673 deep learning RNN RNN DL ML sa.ppt
12337673 deep learning RNN RNN DL ML sa.ppt
 
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.pptHADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
 
GNA 13552928 deep learning for GAN a.ppt
GNA 13552928 deep learning for GAN a.pptGNA 13552928 deep learning for GAN a.ppt
GNA 13552928 deep learning for GAN a.ppt
 
Reinforcemnet Leaning in ML and DL.pptx
Reinforcemnet Leaning in ML and  DL.pptxReinforcemnet Leaning in ML and  DL.pptx
Reinforcemnet Leaning in ML and DL.pptx
 
24.09.2021 Reinforcement Learning Algorithms.pptx
24.09.2021 Reinforcement Learning Algorithms.pptx24.09.2021 Reinforcement Learning Algorithms.pptx
24.09.2021 Reinforcement Learning Algorithms.pptx
 
The Stochastic Network Calculus: A Modern Approach.pptx
The Stochastic Network Calculus: A Modern Approach.pptxThe Stochastic Network Calculus: A Modern Approach.pptx
The Stochastic Network Calculus: A Modern Approach.pptx
 
Open Access and IR along with Quality Indicators.pptx
Open Access and IR along with Quality Indicators.pptxOpen Access and IR along with Quality Indicators.pptx
Open Access and IR along with Quality Indicators.pptx
 
Fundamentals of RL.pptx
Fundamentals of RL.pptxFundamentals of RL.pptx
Fundamentals of RL.pptx
 
Acoustic Model.pptx
Acoustic Model.pptxAcoustic Model.pptx
Acoustic Model.pptx
 

Recently uploaded

AI and Ecology - The H4rmony Project.pptx
AI and Ecology - The H4rmony Project.pptxAI and Ecology - The H4rmony Project.pptx
AI and Ecology - The H4rmony Project.pptxNeoV2
 
Call Girls Sarovar Portico Naraina Hotel, New Delhi 9873777170
Call Girls Sarovar Portico Naraina Hotel, New Delhi 9873777170Call Girls Sarovar Portico Naraina Hotel, New Delhi 9873777170
Call Girls Sarovar Portico Naraina Hotel, New Delhi 9873777170simranguptaxx69
 
9873940964 Full Enjoy 24/7 Call Girls Near Shangri La’s Eros Hotel, New Delhi
9873940964 Full Enjoy 24/7 Call Girls Near Shangri La’s Eros Hotel, New Delhi9873940964 Full Enjoy 24/7 Call Girls Near Shangri La’s Eros Hotel, New Delhi
9873940964 Full Enjoy 24/7 Call Girls Near Shangri La’s Eros Hotel, New Delhidelih Escorts
 
办理学位证(KU证书)堪萨斯大学毕业证成绩单原版一比一
办理学位证(KU证书)堪萨斯大学毕业证成绩单原版一比一办理学位证(KU证书)堪萨斯大学毕业证成绩单原版一比一
办理学位证(KU证书)堪萨斯大学毕业证成绩单原版一比一F dds
 
9873940964 High Profile Call Girls Delhi |Defence Colony ( MAYA CHOPRA ) DE...
9873940964 High Profile  Call Girls  Delhi |Defence Colony ( MAYA CHOPRA ) DE...9873940964 High Profile  Call Girls  Delhi |Defence Colony ( MAYA CHOPRA ) DE...
9873940964 High Profile Call Girls Delhi |Defence Colony ( MAYA CHOPRA ) DE...Delhi Escorts
 
VIP Kolkata Call Girl Kalighat 👉 8250192130 Available With Room
VIP Kolkata Call Girl Kalighat 👉 8250192130  Available With RoomVIP Kolkata Call Girl Kalighat 👉 8250192130  Available With Room
VIP Kolkata Call Girl Kalighat 👉 8250192130 Available With Roomdivyansh0kumar0
 
5 Wondrous Places You Should Visit at Least Once in Your Lifetime (1).pdf
5 Wondrous Places You Should Visit at Least Once in Your Lifetime (1).pdf5 Wondrous Places You Should Visit at Least Once in Your Lifetime (1).pdf
5 Wondrous Places You Should Visit at Least Once in Your Lifetime (1).pdfsrivastavaakshat51
 
Dwarka Call Girls 9643097474 Phone Number 24x7 Best Services
Dwarka Call Girls 9643097474 Phone Number 24x7 Best ServicesDwarka Call Girls 9643097474 Phone Number 24x7 Best Services
Dwarka Call Girls 9643097474 Phone Number 24x7 Best Servicesnajka9823
 
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012sapnasaifi408
 
Call Girls Ahmedabad 7397865700 Ridhima Hire Me Full Night
Call Girls Ahmedabad 7397865700 Ridhima Hire Me Full NightCall Girls Ahmedabad 7397865700 Ridhima Hire Me Full Night
Call Girls Ahmedabad 7397865700 Ridhima Hire Me Full Nightssuser7cb4ff
 
Gwalior Call Girls 7001305949 WhatsApp Number 24x7 Best Services
Gwalior Call Girls 7001305949 WhatsApp Number 24x7 Best ServicesGwalior Call Girls 7001305949 WhatsApp Number 24x7 Best Services
Gwalior Call Girls 7001305949 WhatsApp Number 24x7 Best Servicesnajka9823
 
Determination of antibacterial activity of various broad spectrum antibiotics...
Determination of antibacterial activity of various broad spectrum antibiotics...Determination of antibacterial activity of various broad spectrum antibiotics...
Determination of antibacterial activity of various broad spectrum antibiotics...Open Access Research Paper
 
NO1 Famous Amil In Karachi Best Amil In Karachi Bangali Baba In Karachi Aamil...
NO1 Famous Amil In Karachi Best Amil In Karachi Bangali Baba In Karachi Aamil...NO1 Famous Amil In Karachi Best Amil In Karachi Bangali Baba In Karachi Aamil...
NO1 Famous Amil In Karachi Best Amil In Karachi Bangali Baba In Karachi Aamil...Amil baba
 
Philippines-Native-Chicken.pptx file copy
Philippines-Native-Chicken.pptx file copyPhilippines-Native-Chicken.pptx file copy
Philippines-Native-Chicken.pptx file copyKristineRoseCorrales
 
Call In girls Connaught Place (DELHI)⇛9711147426🔝Delhi NCR
Call In girls Connaught Place (DELHI)⇛9711147426🔝Delhi NCRCall In girls Connaught Place (DELHI)⇛9711147426🔝Delhi NCR
Call In girls Connaught Place (DELHI)⇛9711147426🔝Delhi NCRjennyeacort
 
Hi FI Call Girl Ahmedabad 7397865700 Independent Call Girls
Hi FI Call Girl Ahmedabad 7397865700 Independent Call GirlsHi FI Call Girl Ahmedabad 7397865700 Independent Call Girls
Hi FI Call Girl Ahmedabad 7397865700 Independent Call Girlsssuser7cb4ff
 

Recently uploaded (20)

AI and Ecology - The H4rmony Project.pptx
AI and Ecology - The H4rmony Project.pptxAI and Ecology - The H4rmony Project.pptx
AI and Ecology - The H4rmony Project.pptx
 
Escort Service Call Girls In Shakti Nagar, 99530°56974 Delhi NCR
Escort Service Call Girls In Shakti Nagar, 99530°56974 Delhi NCREscort Service Call Girls In Shakti Nagar, 99530°56974 Delhi NCR
Escort Service Call Girls In Shakti Nagar, 99530°56974 Delhi NCR
 
Call Girls Sarovar Portico Naraina Hotel, New Delhi 9873777170
Call Girls Sarovar Portico Naraina Hotel, New Delhi 9873777170Call Girls Sarovar Portico Naraina Hotel, New Delhi 9873777170
Call Girls Sarovar Portico Naraina Hotel, New Delhi 9873777170
 
Hot Sexy call girls in Nehru Place, 🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Nehru Place, 🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Nehru Place, 🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Nehru Place, 🔝 9953056974 🔝 escort Service
 
9873940964 Full Enjoy 24/7 Call Girls Near Shangri La’s Eros Hotel, New Delhi
9873940964 Full Enjoy 24/7 Call Girls Near Shangri La’s Eros Hotel, New Delhi9873940964 Full Enjoy 24/7 Call Girls Near Shangri La’s Eros Hotel, New Delhi
9873940964 Full Enjoy 24/7 Call Girls Near Shangri La’s Eros Hotel, New Delhi
 
办理学位证(KU证书)堪萨斯大学毕业证成绩单原版一比一
办理学位证(KU证书)堪萨斯大学毕业证成绩单原版一比一办理学位证(KU证书)堪萨斯大学毕业证成绩单原版一比一
办理学位证(KU证书)堪萨斯大学毕业证成绩单原版一比一
 
9873940964 High Profile Call Girls Delhi |Defence Colony ( MAYA CHOPRA ) DE...
9873940964 High Profile  Call Girls  Delhi |Defence Colony ( MAYA CHOPRA ) DE...9873940964 High Profile  Call Girls  Delhi |Defence Colony ( MAYA CHOPRA ) DE...
9873940964 High Profile Call Girls Delhi |Defence Colony ( MAYA CHOPRA ) DE...
 
VIP Kolkata Call Girl Kalighat 👉 8250192130 Available With Room
VIP Kolkata Call Girl Kalighat 👉 8250192130  Available With RoomVIP Kolkata Call Girl Kalighat 👉 8250192130  Available With Room
VIP Kolkata Call Girl Kalighat 👉 8250192130 Available With Room
 
5 Wondrous Places You Should Visit at Least Once in Your Lifetime (1).pdf
5 Wondrous Places You Should Visit at Least Once in Your Lifetime (1).pdf5 Wondrous Places You Should Visit at Least Once in Your Lifetime (1).pdf
5 Wondrous Places You Should Visit at Least Once in Your Lifetime (1).pdf
 
Model Call Girl in Rajiv Chowk Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Rajiv Chowk Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Rajiv Chowk Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Rajiv Chowk Delhi reach out to us at 🔝9953056974🔝
 
Dwarka Call Girls 9643097474 Phone Number 24x7 Best Services
Dwarka Call Girls 9643097474 Phone Number 24x7 Best ServicesDwarka Call Girls 9643097474 Phone Number 24x7 Best Services
Dwarka Call Girls 9643097474 Phone Number 24x7 Best Services
 
Sexy Call Girls Patel Nagar New Delhi +918448380779 Call Girls Service in Del...
Sexy Call Girls Patel Nagar New Delhi +918448380779 Call Girls Service in Del...Sexy Call Girls Patel Nagar New Delhi +918448380779 Call Girls Service in Del...
Sexy Call Girls Patel Nagar New Delhi +918448380779 Call Girls Service in Del...
 
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
 
Call Girls Ahmedabad 7397865700 Ridhima Hire Me Full Night
Call Girls Ahmedabad 7397865700 Ridhima Hire Me Full NightCall Girls Ahmedabad 7397865700 Ridhima Hire Me Full Night
Call Girls Ahmedabad 7397865700 Ridhima Hire Me Full Night
 
Gwalior Call Girls 7001305949 WhatsApp Number 24x7 Best Services
Gwalior Call Girls 7001305949 WhatsApp Number 24x7 Best ServicesGwalior Call Girls 7001305949 WhatsApp Number 24x7 Best Services
Gwalior Call Girls 7001305949 WhatsApp Number 24x7 Best Services
 
Determination of antibacterial activity of various broad spectrum antibiotics...
Determination of antibacterial activity of various broad spectrum antibiotics...Determination of antibacterial activity of various broad spectrum antibiotics...
Determination of antibacterial activity of various broad spectrum antibiotics...
 
NO1 Famous Amil In Karachi Best Amil In Karachi Bangali Baba In Karachi Aamil...
NO1 Famous Amil In Karachi Best Amil In Karachi Bangali Baba In Karachi Aamil...NO1 Famous Amil In Karachi Best Amil In Karachi Bangali Baba In Karachi Aamil...
NO1 Famous Amil In Karachi Best Amil In Karachi Bangali Baba In Karachi Aamil...
 
Philippines-Native-Chicken.pptx file copy
Philippines-Native-Chicken.pptx file copyPhilippines-Native-Chicken.pptx file copy
Philippines-Native-Chicken.pptx file copy
 
Call In girls Connaught Place (DELHI)⇛9711147426🔝Delhi NCR
Call In girls Connaught Place (DELHI)⇛9711147426🔝Delhi NCRCall In girls Connaught Place (DELHI)⇛9711147426🔝Delhi NCR
Call In girls Connaught Place (DELHI)⇛9711147426🔝Delhi NCR
 
Hi FI Call Girl Ahmedabad 7397865700 Independent Call Girls
Hi FI Call Girl Ahmedabad 7397865700 Independent Call GirlsHi FI Call Girl Ahmedabad 7397865700 Independent Call Girls
Hi FI Call Girl Ahmedabad 7397865700 Independent Call Girls
 

one shot15729752 Deep Learning for AI and DS

  • 1. ONE SHOT LEARNING FOR RECOGNITION Keren Ofek
  • 2. Today’s Agenda ■ Similarity and how to measure it? ■ Embedding ■ One-Shot learning Challenges ■ Siamese Network ■ DeepNet ■ Triplets ■ FaceNet
  • 3. What is one shot learning? ■ One-shot learning aims to learn information about object categories from few training examples. ■ The idea is to understand the similarity between the detected object to an known object. Image Source: Google
  • 5. How can we measure similarity? ■ We will find a function that quantifies a “distance” between every pair of elements in a set Non-negativity: f(x, y) ≥ 0 Identity of Discernible: f(x, y) = 0 <=> x = y Symmetry: f(x, y) = f(y, x) Triangle Inequality: f(x, z) ≤ f(x, y) + f(y, z)
  • 6. Which Distance Metric to choose? ■ Pre-defined Metrics Metrics which are fully specified without the knowledge of data. E.g. Euclidian Distance: f(𝒙 ,𝒚)= 𝒊 𝒙𝒊 − 𝒚𝒊 𝟐 ■ Learned Metrics Metrics which can only be defined with the knowledge of the data: ■ Un-Supervised Learning Or ■ Supervised Learning Based on slide from: February-16 Lecture http://slazebni.cs.illinois.edu/spring17/
  • 7. Un-supervised distance metric: : Mahalanobis Distance : f(x, y) = (x - y) T∑-1(x - y) where ∑ is the mean-subtracted covariance matrix of all data points From: Gene expression data clustering and visualization based on a binary hierarchical clustering framework
  • 8. Un-supervised distance metric: • 2-step procedure: • Apply some supervised domain transform: • Then use one of the un-supervised metrics for performing the mapping. Bellet, A., Habrard, A. and Sebban, A survey on metric learning for feature vectors and structured data
  • 9. Embedding Process ■ The inputs are mapped to vectors in a low-dimensional space ■ low-dimensional space is called Target Space / Taken Space ■ In the taken Space, there is insensitivity intra category changes, but sensitivity to inter category changes https://hackernoon.com/latent-space-visualization-deep-learning-bits-2-bd09a46920df
  • 10. Linear Discriminant Analysis (LDA) ■ LDA based upon the concept of searching for a linear combination of variables that best separates two classes: – Minimizes variance within the class – Maximizes variance between classes Fisher Ratio ■ LDA is an example to understand the motivation of embedding = https://analyticsdefined.com/introduction-linear-discriminant-analysis/
  • 11. Linear Discriminant Analysis (LDA) https://analyticsdefined.com/introduction-linear-discriminant-analysis/
  • 12. The One-Shot learning challenge ■ There are lots of categories ■ The Number of categories is not always known ■ The number of samples in each category is small  One shot learning is super relevant in the field of computer vision: recognize objects in images from a single example.  Linear Models are not relevant in this field.
  • 14. The Method - Using CNN: ■ The Idea is to learn a function that maps input patterns to target space. ■ non-linear mapping that can map any input vector to its corresponding low-dimensional version. ■ The distance in the target space approximates the “semantic” distance in the input space. ■ A discriminative training method - extract information about the problem from the available data, without requiring specific information about the categories. ■ The training will be on pairs of samples.
  • 15. Energy-based models (EBM) ■ EBM capture dependencies between variables by associating a scalar energy to each configuration of the variables. ■ No requirement for proper normalization ■ Provide considerably more flexibility in the design of architectures and training criteria than probabilistic approaches LeCun, Chopra, Hadsell, Ranzato, Huang, A Tutorial on Energy-Based Learning, 2006
  • 16. Siamese Network Architecture Koch, Zemel, Salakhutdinov, Siamese Neural Networks for One-shot Image Recognition, 2015
  • 17. Similarity Metric - Same Category Genuine Pair Minimizes - Different Categories Impostor Pair Maximizes We seek to find a value of the parameter W such that:  Contrastive term is needed to ensure: not only that the energy for a pair of inputs from the same category is low, but also that the energy for a pair from different categories is large.
  • 18. Siamese Neural Networks Koch, Zemel, Salakhutdinov, Siamese Neural Networks for One-shot Image Recognition, 2015
  • 19. Siamese Neural Networks One-shot Image Recognition Koch, Zemel, Salakhutdinov, Siamese Neural Networks for One-shot Image Recognition, 2015
  • 20. Hinge Loss: Positive pair Embedding Space CNN CNN L(xp, xq) = ||xp – xq||2 Loss Positive Query https://www.cs.cornell.edu/~kb/publications/SIG15ProductNet.pdf
  • 21. Hinge Loss Negative Pair Embedding Space CNN Margin (m) CNN Loss = 0 L(xp, xq) = max(0, m2 - || xp – xq || 2 ) Neg Query https://www.cs.cornell.edu/~kb/publications/SIG15ProductNet.pdf
  • 22. Hinge Loss Negative Pair Embedding Space CNN Margin (m) CNN Loss L(xp, xq) = max(0, m2 - || xp – xq || 2 ) Neg Query https://www.cs.cornell.edu/~kb/publications/SIG15ProductNet.pdf
  • 23. Visualization of Learned Features Ahmed, Jones, Marks, An improved deep learning architecture for person re-identification, 2015 Positive Pair Negative Pair
  • 24. Example: Omniglot dataset ■ Omniglot contains examples from 50 alphabets: • well-established international languages (Latin and Korean) • lesser known local dialects. Koch, Zemel, Salakhutdinov, Siamese Neural Networks for One-shot Image Recognition, 2015
  • 25. Omniglot dataset: classification task Input: Classes: Results: Koch, Zemel, Salakhutdinov, Siamese Neural Networks for One-shot Image Recognition, 2015
  • 26. Omniglot dataset: Verification task Results: ■ Training on 3 data set sizes with 30,000, 90,000, and 150,000 ■ Sampling random same and different pairs. ■ Generating 8 random affine distortions for each category Koch, Zemel, Salakhutdinov, Siamese Neural Networks for One-shot Image Recognition, 2015
  • 27. DeepFace (Facebook, 2014) ■ The conventional pipeline: Detect ⇒ align ⇒ represent ⇒ classify ■ Face alignment: Transform a face to be in a canonical pose ■ Face representation: Find a representation of a face which is suitable for follow-up tasks (small size, computationally cheap to compare, invariant to irrelevant changes) ■ 3D face modeling ■ A nine-layer deep neural network ■ More than 120 million parameters Taigman; Yang; Ranzato, Wolf, DeepFace: Closing the Gap to Human-Level Performance in Face
  • 28. The alignment process - 'Frontalization' Taigman; Yang; Ranzato, Wolf, DeepFace: Closing the Gap to Human-Level Performance in Face
  • 29. The alignment process - 'Frontalization' (a) 2D Alignment - detecting 6 fiducial points inside the detection crop, centered at the center of the eyes, tip of the nose and mouth locations (b) 2D Alignment - aligned crop: composing the final 2D transformation Taigman; Yang; Ranzato, Wolf, DeepFace: Closing the Gap to Human-Level Performance in Face
  • 30. The alignment process - 'Frontalization' (c) 2D Alignment - localizing additional 67 fiducial points (d) 3D Alignment - The reference 3D shape transformed to the 2D- aligned crop image-plane. Taigman; Yang; Ranzato, Wolf, DeepFace: Closing the Gap to Human-Level Performance in Face
  • 31. The alignment process - 'Frontalization ' (e) 2D-3D Alignment - Triangle visibility w.r.t. to the fitted 3D-2D camera; darker triangles are less visible. (f) 3D Alignment - The 67 fiducial points induced by the 3D model that are used to direct the piece-wise affine wrapping. (g) 2D Alignment - The final frontalized crop Taigman; Yang; Ranzato, Wolf, DeepFace: Closing the Gap to Human-Level Performance in Face
  • 32. The DeepFace architecture C1, M2, C3 : Extract low-level features (simple edges and texture) L4, L5, L6: Locally connected Layers L7, L8: Fully connected Layers Taigman; Yang; Ranzato, Wolf, DeepFace: Closing the Gap to Human-Level Performance in Face
  • 33. The training process ■ We use cross-entropy as the loss function: (K is the index of the true label for a given input) ■ The loss is minimized over the parameters by computing the gradient of L with respect to the parameters and by updating the parameters using stochastic gradient descent (SGD). ■ The gradients are computed by standard backpropagation of the error.
  • 34. Face verification ■ In order to tell whether two images of faces show the same person they try three different methods. ■ Each of these relies on the vector extracted by the first fully connected layer in the network (4096d). ■ Let these vectors be f1 (image 1) and f2 (image 2). The methods are then: 1. Inner product 2. Weighted X^2 (chi-squared) distance 3. Siamese network.
  • 35. Results ■ LFW (Labeled Faces in the Wild): 13,323 web photos of 5,749 celebrities Human Performance : 97.5% accuracy The DeepFace Performance: 97.35% accuracy ■ YTF (YouTube Faces): 3425 YouTube videos of 1,595 subjects The DeepFace Performance: 92.5% accuracy (reduces the previous error by more than 50%!)
  • 36. Triplets Network ■ The Triplet Loss minimizes the distance between an anchor and a positive, both of which have the same identity, and maximizes the distance between the anchor and a negative of a different identity.
  • 37. Triplet Network Hoffer, Ailon, DEEP METRIC LEARNING USING TRIPLET NETWORK , 2015 𝑁𝑒𝑡 𝑥𝑞 − 𝑁𝑒𝑡(𝑥𝑛) 2 𝑁𝑒𝑡 𝑥𝑞 − 𝑁𝑒𝑡(𝑥𝑝) 2 𝑁𝑒𝑡 𝑁𝑒𝑡 𝑁𝑒𝑡 Comparator 𝑥𝑛 𝑥𝑞 𝑥𝑝
  • 38. FaceNet (Google, 2015) ■ FaceNet approach is that they use Euclidean embedding space to find the similarity or difference between faces. ■ The method uses a deep convolutional network trained to directly optimize the embedding itself, rather than an intermediate bottleneck layer as in previous deep learning approaches. ■ They used large-scale dataset https://arxiv.org/pdf/1503.03832v3.pdf
  • 39. The training process The network consists of a batch input layer and a deep CNN followed by L2 normalization, which results in the face embedding. This is followed by the triplet loss during training. Schroff, Kalenichenko, Philbin, FaceNet: A Unified Embedding for Face Recognition and Clustering, 2015
  • 40. The training process ■ We learn an embedding f(x), from an image x into a feature space R d , such that the squared distance between all faces of the same identity is small, whereas the squared distance between a pair of face images from different identities is large. ■ Loss Function: 𝑓 - the embedding function 𝐿 = 𝑖 𝑁 𝑓 𝑥𝑞,𝑖 − 𝑓(𝑥𝑝,𝑖) 2 2 − 𝑓 𝑥𝑞,𝑖 − 𝑓 𝑥𝑛,𝑖 2 2 + 𝛼 +
  • 41. Triplets Selection ■ In order to ensure fast convergence it is crucial to select triplets that violate the triplet constraint in ■ We could select (𝑥𝑝,𝑖) and 𝑥𝑛,𝑖 : (1) hard positive- argmax (𝑥𝑝,𝑖) 𝑓 𝑥𝑞,𝑖 − 𝑓(𝑥𝑝,𝑖) 2 2 (2) hard negative- argmin (𝑥𝑝,𝑖) 𝑓 𝑥𝑞,𝑖 − 𝑓(𝑥𝑛,𝑖) 2 2 ■ But, Choose all anchor-positive pairs in a mini-batch while selecting only hard-negatives gave better results. ■ To avoid local minima, we use ”semi-hard” exemplars: 𝑥𝑛,𝑖 such that 𝑓 𝑥𝑞,𝑖 − 𝑓(𝑥𝑝,𝑖) 2 2 + 𝛼 < 𝑓 𝑥𝑞,𝑖 − 𝑓 2 2 𝑓 𝑥𝑞,𝑖 − 𝑓(𝑥𝑝,𝑖) 2 2 < 𝑓 𝑥𝑞,𝑖 − 𝑓 2 2
  • 43. The Model structure (after the training) Input DISTANCE
  • 44. The FaceNet architecture (NN1) Schroff, Kalenichenko, Philbin, FaceNet: A Unified Embedding for Face Recognition and Clustering, 2015
  • 45. The FaceNet architecture (NN2) Schroff, Kalenichenko, Philbin, FaceNet: A Unified Embedding for Face Recognition and Clustering, 2015
  • 46. The results – New records! ■ Performance on Youtube Faces DB: 95.12% accuracy ■ Performance on Labeled Faces in the Wild DB: 99.63% accuracy ■ Sensitivity to Image Quality: (The CNN was trained on 220x220 input images) Schroff, Kalenichenko, Philbin, FaceNet: A Unified Embedding for Face Recognition and Clustering, 2015 # Pixels Val-Rate 1,600 37.8% 6,400 79.5% 14,400 84.5% 25,600 85.7% 65,539 86.4%
  • 47. FaceNet – Image clustering Schroff, Kalenichenko, Philbin, FaceNet: A Unified Embedding for Face Recognition and Clustering, 2015
  • 48. Summary The subjects we covered: ■ Embedding - Similarity in taken space ■ One-Shot learning Challenges ■ Network principles: Siamese Network and Triplets ■ Network in used: DeepNet and FaceNet
  • 49. References ■ Chopra, Hadsell, LeCun, Learning a Similarity Metric Discriminatively, with Application to Face Verification , 2005 ■ Taigman; Yang; Ranzato, Wolf, DeepFace: Closing the Gap to Human-Level Performance in Face Verification, 2014 ■ Schroff, Kalenichenko, Philbin, FaceNet: A Unified Embedding for Face Recognition and Clustering, 2015 ■ Koch, Zemel, Salakhutdinov, Siamese Neural Networks for One-shot Image Recognition, 2015 ■ Hermans, Beyer, Leibe, In Defense of the Triplet Loss for Person Re-Identification, 2017 ■ LeCun, Chopra, Hadsell, Ranzato, Huang, A Tutorial on Energy-Based Learning, 2006 ■ Bell, Bala, Learning visual similarity for product design with convolutional neural networks, 2015 ■ Ahmed, Jones, Marks, An improved deep learning architecture for person re- identification, 2015 ■ Nando de Freias, Max-margin learning, transfer and memory networks, 2015 ■ Jagannatha Rao, Wang, Cottrell, A Deep Siamese Neural Network Learns the Human- Perceived Similarity Structure of Facial Expressions Without Explicit Categories, 2011 ■ Hoffer, Ailon, DEEP METRIC LEARNING USING TRIPLET NETWORK , 2015 ■ Bellet, Habrard, Sebban, A survey on metric learning for feature vectors and structured

Editor's Notes

  1. https://hackernoon.com/latent-space-visualization-deep-learning-bits-2-bd09a46920df
  2. שיטות ישנות נוספות: הפיכת תמונה להיסטוגרמה של צבעים שימוש בL1 הבעיה בשימוש בשיטות הישנות היא שמראש אנחנו קובעים מהו הembedding ומהי פונקציית המרחק. לא תמיד יש התאמה בין המטרה שלנו (זיהוי תמונה) לבין הדרך הזו. למשל זיהוי אותו עצם על רקע שונה לגמרי / היפוך של התמונה יניב היסטוגרמה שונה לחלוטין.
  3. https://hackernoonhttps://www.youtube.com/watch?v=azXCzI57Yfc.com/latent-space-visualization-deep-learning-bits-2-bd09a46920df
  4. נבצע את האמבדינג ובחירת המטריקה יחד, כחלק מ CNN בניגוד לשיטות הישנות כמו היסטוגרמה של צבעים ומרחק אוקלידי.
  5. The model does not learn the classification categories
  6. : The energy and the loss can be made zero by simply making Gw(x1) a constant function. Therefore our loss function needs a contrastive term to ensure not only that the energy for a pair of inputs from the same category is low, but also that the energy for a pair from different categories is large. This problem does not occur with properly normalized probabilistic models because making the probability of a particular pair high automatically makes the probability of other pairs low.
  7. Siamese Neural Networks One-shot Image Recognition P is a logistic function
  8. Euclidian Loss Bell, S. and Bala, K., 2015. Learning visual similarity for product design with convolutional neural networks. ACM Transactions on Graphics (TOG), 34(4), p.98.
  9. Bell, S. and Bala, K., 2015. Learning visual similarity for product design with convolutional neural networks. ACM Transactions on Graphics (TOG), 34(4), p.98.
  10. Bell, S. and Bala, K., 2015. Learning visual similarity for product design with convolutional neural networks. ACM Transactions on Graphics (TOG), 34(4), p.98.
  11. 20-way one-shot classification task
  12. *
  13. (h) 3D Alignment - A new view generated by the 3D model (not used in DeepFace).
  14. They use locally connected layers for the next three layers. This is like a convolutional layer but with each part of the image learning a different filter bank, which makes sense because the face has been normalized and the filters relevant for one part of the face probably is not for another. Areas between the eyes and the eyebrows exhibit very different appearance and have much higher discrimination ability compared to areas between the nose and the mouth. The use of local layers does not affect the computational burden of feature extraction, but does affect the number of parameters subject to training. Only because we have a large labeled dataset, we can afford three large locally connected layers.
  15. Backup after slide “The training process” – Deepface
  16. Let these vectors be f1 (image 1) and f2 (image 2). The methods are then: Inner product between f1 and f2. The classification (same person/not same person) is then done by a simple threshold. Weighted X^2 (chi-squared) distance. Equation, per vector component i: weight_i (f1[i] - f2[i])^2 / (f1[i] + f2[i]). The vector is then fed into an SVM. Siamese network. Means here simply that the absolute distance between f1 and f2 is calculated (|f1-f2|), each component is weighted by a learned weight and then the sum of the components is calculated. If the result is above a threshold, the faces are considered to show the same person.
  17. YTF : We employ the DeepFace-single representation directly by creating, for every pair of training videos, 50 pairs of frames, one from each video, and label these as same or not-same in accordance with the video training pair. Then a weighted χ 2 model is learned as in Sec. 4.1. Given a test-pair, we sample 100 random pairs of frames, one from each video, and use the mean value of the learned weighed similarity. The comparison with recent methods is shown in Table 4 and Fig. 4. We report an accuracy of 91.4% which reduces the error of the previous best methods by more than 50%. Note that there are about 100 wrong labels for video pairs, recently updated to the YTF webpage. After these are corrected, DeepFace-single actually reaches 92.5%. This experiment verifies again that the DeepFace method easily generalizes to a new target domain.
  18. where N is a possible set of all triplets in the dataset; f (x) - embedding, that translates image x into Euclidean space; x a j - ”anchor” image; x p j - ”positive” image, which should be close to an anchor image; x n j - ”negative” image; γ - margin between positive and negative images.
  19. α is a margin that is enforced between positive and negative pairs. Superscript a, p, n denote the anchor, positive and negative, respectively
  20. α is a margin that is enforced between positive and negative pairs. Superscript a, p, n denote the anchor, positive and negative, respectively