1
Zero-shot Image Recognition Using Relational
Matching, Adaptation and Calibration
Debasmit Das and C.S. George Lee
Assistive Robotics Technology Laboratory
School of Electrical and Computer Engineering
Purdue University, West Lafayette, IN, USA
Funding Source: National Science Foundation (IIS-1813935), NVIDIA Hardware Grant
2
Outline
• INTRODUCTION
- Problem Description.
- Previous Work.
- Challenges.
• PROPOSED APPROACH
- Relational Matching.
- Domain Adaptation.
- Scaled Calibration.
• EXPERIMENTAL RESULTS
- Comparative studies.
- Parameter sensitivity and convergence studies.
3
Introduction: Zero-Shot Learning (ZSL)
[Diagram: mapping between the feature space and the semantic space]
• Base categories (source domain) contain abundant labeled data.
• Novel categories (target domain) contain unlabeled data.
• However, class-level semantic information is available for all categories.
• Goal: find the relationship between the feature space and the semantic space.
[Example: source-domain (seen) and target-domain (unseen) categories]
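To make the feature-to-semantic relationship concrete, here is a minimal zero-shot prediction sketch (an illustration, not the paper's exact formulation): a linear map from image features to the attribute space is fit on the seen classes, and an unseen test sample is labeled by nearest-neighbor search among the unseen class embeddings. The ridge-regression fit and all variable names are assumptions made for illustration.

```python
import numpy as np

# Minimal zero-shot prediction sketch (illustrative, not the paper's exact method).
# X_seen: (n, d) seen-class image features; S_seen: (n, a) attribute vector of each sample's class.
# S_unseen: (k, a) attribute vectors of the k unseen classes.

def fit_linear_embedding(X_seen, S_seen, reg=1.0):
    """Ridge regression from feature space to semantic (attribute) space."""
    d = X_seen.shape[1]
    W = np.linalg.solve(X_seen.T @ X_seen + reg * np.eye(d), X_seen.T @ S_seen)
    return W  # (d, a) projection matrix

def predict_unseen(x, W, S_unseen):
    """Map a test feature into semantic space and take the nearest unseen class embedding."""
    s_hat = x @ W                                    # projected semantic vector, shape (a,)
    dists = np.linalg.norm(S_unseen - s_hat, axis=1)
    return int(np.argmin(dists))                     # index of the predicted unseen class
```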
4
Introduction: Related Work on ZSL
Zero-shot Learning
• Embedding methods [relate features & semantics]: Linear embedding [Bernardino et al. ICML'15], Deep embedding [Zhang et al. CVPR'17]
• Transductive approaches [use unlabeled test data]: Multiview [Fu et al. TPAMI'15], Dictionary learning [Kodirov et al. ICCV'15]
• Generative approaches [generate data]: Constrained VAE [Verma et al. CVPR'18], Feature GAN [Xian et al. CVPR'18]
• Hybrid approaches [novel class from old classes]: Semantic similarity [Zhang et al. CVPR'15], Convex combination [Norouzi et al. ICLR'13]
5
Introduction: Challenges of ZSL

Hubness
• Phenomenon where only a few candidates become nearest-neighbor predictions.
• Caused by the curse of dimensionality.
• Initially studied by Radovanovic et al. JMLR'10.

Domain Shift
• Shift between the unseen test data and the unseen semantic embeddings.
• Arises because the unseen test data are not used during training.

Seen-Class Biasedness
• In the GZSL setting, test data can come from both seen and unseen categories.
• Most unseen test data are predicted as seen categories.
• Initially studied by Chao et al. ECCV'16.
6
Proposed Approach: Proposed Solution

One-to-one and pairwise regression (addresses hubness)
• Structural matching between semantics and features.
• Implicit reduction of dimensionality.

Domain Adaptation (addresses domain shift)
• Adapts the semantic embeddings to the unseen test data.
• Uses our previous DA approach [Das & Lee EAAI'18].
• Finds correspondences between semantic embeddings and unseen test samples.

Calibration (addresses seen-class biasedness)
• Scaled calibration to reduce the scores of seen classes.
• Implicit reduction of the variance of seen classes.
7
Proposed Approach: Proposed Framework
8
Proposed Approach: Relational Matching
• First, match each seen sample to its corresponding semantic embedding.
• Second, match the structure (pairwise distance matrix) of the seen prototypes to that of the semantic embeddings.
[Equations: one-to-one regression term and pairwise regression term; minimized with gradient descent]
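Because the equations appear only as images on the slide, the following is a hedged sketch of how the two terms could be combined and minimized with gradient descent. The weight beta, the use of torch.cdist, and the choice to compare prototype distances after projection are assumptions, not the paper's exact formulation.

```python
import torch

# Hedged sketch of the two relational-matching terms (weighting and distance
# definitions are illustrative assumptions).
# X: (n, d) seen features; S: (n, a) matching semantic embeddings;
# P: (c, d) seen-class prototypes; A: (c, a) seen-class semantic embeddings.

def relational_matching_loss(W, X, S, P, A, beta=1.0):
    # One-to-one regression: each projected sample should match its class embedding.
    one_to_one = ((X @ W - S) ** 2).sum(dim=1).mean()

    # Pairwise regression: the pairwise-distance structure of the projected prototypes
    # should match the pairwise-distance structure of the semantic embeddings.
    D_proj = torch.cdist(P @ W, P @ W)   # (c, c) distances after projection
    D_sem = torch.cdist(A, A)            # (c, c) distances between class embeddings
    pairwise = ((D_proj - D_sem) ** 2).mean()

    return one_to_one + beta * pairwise

# Minimized with plain gradient descent, as on the slide, e.g.:
# W = torch.randn(d, a, requires_grad=True)
# opt = torch.optim.SGD([W], lr=1e-3)
# loss = relational_matching_loss(W, X, S, P, A); opt.zero_grad(); loss.backward(); opt.step()
```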
9
Proposed Approach: Domain Adaptation
• Adapt the unseen semantic embeddings (A) to be close to the unseen test data (U).
• Find correspondences (C) between each data point and a semantic embedding, with class regularization.
[Equations: correspondence-based loss with group-lasso regularization; optimized with the conditional-gradient method]
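The loss and optimizer are likewise shown only as images, so below is a simplified, hedged sketch of correspondence estimation with a Frank-Wolfe (conditional-gradient) loop. The row-stochastic constraint on C, the smoothed column-wise group penalty, and the step-size rule are assumptions and differ from the exact formulation of [Das & Lee EAAI'18].

```python
import numpy as np

# Hedged sketch of correspondence estimation via conditional gradient (Frank-Wolfe).
# U: (m, a) unseen test samples already projected into the semantic space (assumption);
# A: (k, a) unseen class semantic embeddings.

def frank_wolfe_correspondences(U, A, lam=0.1, iters=50, eps=1e-8):
    m, k = U.shape[0], A.shape[0]
    C = np.full((m, k), 1.0 / k)                     # start with uniform correspondences
    for t in range(iters):
        resid = C @ A - U                            # correspondence-based fitting error
        grad = 2.0 * resid @ A.T                     # gradient of ||U - C A||_F^2
        col_norms = np.sqrt((C ** 2).sum(axis=0)) + eps
        grad += lam * C / col_norms                  # smoothed column-wise group penalty
        # Linear minimization oracle over row-stochastic matrices: one-hot per row.
        S = np.zeros_like(C)
        S[np.arange(m), grad.argmin(axis=1)] = 1.0
        gamma = 2.0 / (t + 2.0)                      # standard Frank-Wolfe step size
        C = (1 - gamma) * C + gamma * S
    return C                                         # soft assignments of samples to classes
```

Given C, one simple (assumed, not necessarily the paper's) adaptation step is to move each unseen embedding toward the weighted mean of its matched test samples.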
10
Proposed Approach: Scaled Calibration
• Modify the nearest-neighbor Euclidean distance scores.
• Euclidean distance scores for seen classes are scaled, while those of unseen classes are kept the same.
[Equation: seen-, unseen-, and total-class distance scores]
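A minimal sketch of the scaled-calibration decision rule described above; applying a multiplicative factor gamma to the seen-class distances is an assumption about how the scaling is realized, not the paper's exact definition.

```python
import numpy as np

def calibrated_predict(x, prototypes, seen_mask, gamma=1.2):
    """prototypes: (c, a) class embeddings for all classes; seen_mask: (c,) True for seen classes.
    gamma > 1 inflates seen-class distances (i.e., reduces their scores), pushing
    GZSL predictions away from seen classes; unseen-class distances are unchanged."""
    dists = np.linalg.norm(prototypes - x, axis=1)   # nearest-neighbor Euclidean scores
    dists[seen_mask] *= gamma                        # scale seen classes only
    return int(np.argmin(dists))
```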
11
Experimental Results: Datasets and Comparative Study
• Animals with Attributes (AwA2) [Lampert et al. TPAMI'14] (attributes: 85, seen classes: 40, unseen classes: 10)
• Pascal & Yahoo (aPY) [Farhadi et al. CVPR'09] (attributes: 64, seen classes: 20, unseen classes: 12)
• Caltech-UCSD Birds (CUB) [Welinder et al. '10] (attributes: 312, seen classes: 150, unseen classes: 50)
• Scene Understanding (SUN) [Patterson et al. CVPR'12] (attributes: 102, seen classes: 645, unseen classes: 72)
Comparative study: comparison with previous work on the four datasets. Abbreviations used in the results:
tr – Unseen-class accuracy in the traditional setting
u – Unseen-class accuracy in the generalized setting
s – Seen-class accuracy in the generalized setting
H – Harmonic mean of u and s (computed as in the snippet below)
R – Relational Matching
RA – Relational Matching + Domain Adaptation
RC – Relational Matching + Scaled Calibration
RAC – Relational Matching + Domain Adaptation + Scaled Calibration
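For reference, the harmonic mean H is the standard GZSL summary metric; a minimal computation (the function name is illustrative):

```python
def harmonic_mean_H(u, s):
    """GZSL harmonic mean of unseen-class accuracy u and seen-class accuracy s."""
    return 0.0 if (u + s) == 0 else 2.0 * u * s / (u + s)

# Example: u = 0.40, s = 0.70  ->  H = 2 * 0.40 * 0.70 / 1.10 ≈ 0.509
```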
12
Experimental Results: Sensitivity Studies I
Effect of the calibration factor
Effect of the structural matching weight
13
Experimental Results: Sensitivity Studies II
Effect of changing the proportion of seen classes
Effect of changing the number of test samples
(AwA2 and SUN datasets)
14
Experimental Results: Convergence Analysis
Convergence results on the AwA2 dataset
Convergence results on the SUN dataset
Effect of the number of epochs on test accuracy
15
Experimental Results: Visualization & Hubness
Feature Visualization
[Plots of unseen features, seen features, unseen semantic embeddings, and seen semantic embeddings, without vs. with domain adaptation]
Hubness Measurement
Hubness is measured using the skewness of the nearest-neighbor (NN) prediction distribution.
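A small sketch of the hubness measure mentioned above: count how often each class embedding appears as the nearest-neighbor prediction over the test set, then take the skewness of those counts. Using k = 1 and Euclidean distance here is an assumption.

```python
import numpy as np
from scipy.stats import skew

def hubness_skewness(test_features, class_embeddings):
    """test_features: (m, a) test samples already projected into the semantic space (assumption);
    class_embeddings: (c, a) class semantic embeddings."""
    dists = np.linalg.norm(test_features[:, None, :] - class_embeddings[None, :, :], axis=2)
    nn = dists.argmin(axis=1)                              # nearest class embedding per sample
    counts = np.bincount(nn, minlength=len(class_embeddings))
    return float(skew(counts))                             # large positive skew -> a few hub classes
```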
16
Conclusion
• A three-step approach to ZSL with structural matching, domain adaptation, and calibration.
• Evaluated on four challenging ZSL datasets, where it yields substantial performance improvements.
• Domain adaptation is found to be the most effective component; hubness is also reduced.
Future Work
• Distinguish between novel and base categories, and investigate generative models.
17
THANK YOU
Any Questions?

Editor's Notes

  • #3 Introduction – introduction to the problem and previous work. Our method – problem formulation, optimization, and the proposed solution. Experimental results – experimental results and discussions.
  • #4 Give an example of a classification setting
  • #5 Just talk about the limitations of these methods and no more details
  • #6 Talk in details and study about hubness.
  • #7 Combination of embedding and transductive approaches, except that post-processing is used instead of direct transductive learning.
  • #8 3 step procedure
  • #9 Mention that a local method may have better accuracy, but it might be slower.
  • #10 Mention that a local method may have better accuracy, but it might be slower.
  • #11 Mention that a local method may have better accuracy, but it might be slower.
  • #12 Explain
  • #13 Explain
  • #14 Explain
  • #15 Explain
  • #16 Explain