Meta-Learning for Representation Change in Task-specific Updates
Defense for the Master's Thesis
2020. 12. 14
Hyungjun Yoo (Advisor : Se-Young Yun)
Graduate School of Knowledge Service Engineering, OSI Lab
Table of Contents
Purpose of the Research
Leverage representation change by freezing the head and updating only the body during task-specific adaptation, for efficient meta-learning.
1. Introduction : What is Meta-Learning?
2. Problem Setting : Few-shot Classification, MAML, ANIL
3. BOIL Algorithm : BOIL, Domain-agnostic adaptation
4. Representation Change in BOIL : Cosine similarity, CKA, Empirical Analysis, Disconnection trick (ResNet)
5. Conclusion : Research contributions
What is Meta-learning?
Introduction
• Deep neural networks need large labeled datasets and require long training times.
• Humans, in contrast, can learn from only a few samples and can reuse previous knowledge to learn new tasks.
(Figure: examples of large networks, e.g., AlexNet, Residual Net, Inception V3.)
What is Meta-learning?
Introduction
• Meta-Learning (learning to learn) : tries to make DNNs mimic human intelligence
• The model is able to learn from only a few examples in each task
• A model that learns to learn from previous similar tasks can quickly learn a new task
• Why learning to learn?
• The ability to effectively reuse data from other tasks
• The potential to replace manual engineering of architectures, hyperparameters, etc.
• The ability to quickly adapt to unexpected scenarios (inevitable failures, long tails)
• Problem Domain
• Few-shot classification
Main Approaches of Meta-learning
Introduction

             Metric Based               Model Based                 Optimization Based
Key Idea     Metric learning            RNN, external memory        Gradient descent
Key Papers   Matching Net               MANN                        MAML
             (Vinyals et al., 2016)     (Santoro et al., 2016)      (Finn et al., 2017)
             Prototypical Net           Meta Network                Reptile
             (Snell et al., 2017)       (Munkhdalai & Yu, 2017)     (Nichol et al., 2018)
Strength     Simple;                    Applicable to any           Lends well to OOD tasks;
             entirely feedforward       baseline model              model agnostic
Weakness     Hard to generalize to      Excessive computation       Often data-inefficient;
             varying example sizes;     and parameters              training instability;
             restricted domain                                      second-order optimization
Few-shot learning (n-way k-shot classification)
Problem Setting
(Figure: example tasks $\tau_1$ and $\tau_2$, each consisting of a support set $D^{spt}_{\tau_i}$ and a query set $D^{qry}_{\tau_i}$.)
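To make the episode structure concrete, the following is a minimal sketch of n-way k-shot task sampling. The `dataset` layout (a dict mapping each class label to its examples) and all names are illustrative assumptions, not the thesis code.

```python
import random

def sample_episode(dataset, n_way=5, k_shot=1, n_query=15):
    """Sample one n-way k-shot task: a support set and a query set.

    `dataset` is assumed to be a mapping {class_label: [examples]};
    this helper and its signature are illustrative only.
    """
    classes = random.sample(list(dataset.keys()), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        examples = random.sample(dataset[cls], k_shot + n_query)
        # Labels are re-indexed per episode (0..n_way-1), as in standard few-shot setups.
        support += [(x, episode_label) for x in examples[:k_shot]]
        query += [(x, episode_label) for x in examples[k_shot:]]
    return support, query
```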
Meta-learning Framework (MAML)
Problem Setting
• Model-Agnostic Meta-Learning (MAML, Finn et al., 2017)
• MAML uses $b$ (meta-batch size) tasks in a single iteration, and each iteration consists of two steps, the inner loop and the outer loop.
• Two steps of parameter update
• Inner loop : starting from the meta-initialization $\theta_0$, update task-specifically using the support set:
  $\theta_{\tau_i} = \theta_0 - \alpha \nabla_{\theta_0} L_{\tau_i}(f_{\theta_0}, D^{spt}_{\tau_i})$
• Outer loop : update the meta-initialization with the average of the task-specific losses on the query sets:
  $\theta_0 \leftarrow \theta_0 - \beta \nabla_{\theta_0} \sum_{\tau_i \sim p(T)}^{\tau_b} L_{\tau_i}(f_{\theta_{\tau_i}}, D^{qry}_{\tau_i})$
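As a concrete illustration, here is a minimal first-order PyTorch sketch of one MAML meta-iteration with a single inner step. The thesis and Finn et al. (2017) use the full second-order update that differentiates through the inner loop; the function and tensor names, and the default learning rates (taken from the 4conv column of the table later in this deck), are assumptions.

```python
import torch

def maml_step(model, tasks, alpha=0.5, beta=0.001):
    """One first-order MAML meta-iteration over a meta-batch of tasks.

    Each task is (support_x, support_y, query_x, query_y);
    `model` is any torch.nn.Module producing class logits.
    """
    loss_fn = torch.nn.functional.cross_entropy
    theta0 = [p.detach().clone() for p in model.parameters()]
    meta_grads = [torch.zeros_like(p) for p in model.parameters()]

    for sx, sy, qx, qy in tasks:
        # Inner loop: one task-specific step from the meta-initialization theta0.
        inner_loss = loss_fn(model(sx), sy)
        grads = torch.autograd.grad(inner_loss, list(model.parameters()))
        with torch.no_grad():
            for p, g in zip(model.parameters(), grads):
                p -= alpha * g
        # Outer loop: query-set gradient evaluated at the adapted parameters.
        outer_loss = loss_fn(model(qx), qy)
        grads = torch.autograd.grad(outer_loss, list(model.parameters()))
        for mg, g in zip(meta_grads, grads):
            mg += g
        # Restore theta0 before adapting to the next task.
        with torch.no_grad():
            for p, p0 in zip(model.parameters(), theta0):
                p.copy_(p0)

    # Meta-update: theta0 <- theta0 - beta * (sum of query-set gradients).
    with torch.no_grad():
        for p, mg in zip(model.parameters(), meta_grads):
            p -= beta * mg
```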
Representation Reuse and Change
Problem Setting
• Rapid Learning or Feature Reuse? (Raghu et al., 2020)
• Divide the network into two parts : $\theta = (\theta^{ext}, \theta^{cls})$
  • body ($\theta^{ext}$, extractor, conv layers)
  • head ($\theta^{cls}$, classifier, fully connected layer)
• Representation Change / Representation Reuse
  • Rapid learning : representations from the body change significantly after the inner update.
    → Representation change
  • Feature reuse : representations from the body change only negligibly after the inner update and are reused.
    → Representation reuse
• The dominant factor in MAML's effectiveness is representation reuse.
  • The meta-trained model's body can already extract good representations before the inner update, and the representations barely change after the task-specific update (inner loop).
• This observation suggests a more computationally efficient update rule.
• ANIL : head-only update in the inner loop
  • Inner loop updates :
    $\theta^{ext}_{\tau_i} = \theta^{ext}_0$ (no update in the inner loop)
    $\theta^{cls}_{\tau_i} = \theta^{cls}_0 - \alpha_h \nabla_{\theta_0} L_{\tau_i}(f_{\theta_0}, D^{spt}_{\tau_i})$
  • Outer loop updates :
    $\theta_0 \leftarrow \theta_0 - \beta \nabla_{\theta_0} \sum_{\tau_i \sim p(T)} L_{\tau_i}(f_{\theta_{\tau_i}}, D^{qry}_{\tau_i})$
BOIL (Body Only update in Inner Loop) Algorithm
BOIL Algorithm
• The reverse of ANIL : in the inner loop, the classifier (head) is fixed and only the body is updated.
• Inner loop updates :
  $\theta^{ext}_{\tau_i} = \theta^{ext}_0 - \alpha_b \nabla_{\theta_0} L_{\tau_i}(f_{\theta_0}, D^{spt}_{\tau_i})$
  $\theta^{cls}_{\tau_i} = \theta^{cls}_0$ (no update)
• Outer loop updates :
  $\theta_0 \leftarrow \theta_0 - \beta \nabla_{\theta_0} \sum_{\tau_i \sim p(T)} L_{\tau_i}(f_{\theta_{\tau_i}}, D^{qry}_{\tau_i})$
• Learning rates for each algorithm:

         4conv network            ResNet-12
         MAML   ANIL   BOIL      MAML    ANIL    BOIL
  α_b    0.5    0.0    0.5       0.3     0.0     0.3
  α_h    0.5    0.5    0.0       0.3     0.3     0.0
  β_b    0.001  0.001  0.001     0.0006  0.0006  0.0006
  β_h    0.001  0.001  0.001     0.0006  0.0006  0.0006
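The learning-rate table above reduces all three algorithms to one mechanism: separate inner learning rates for the body and the head, where a zero rate freezes that part. A hedged PyTorch sketch of this inner adaptation, assuming the model exposes a `head`-prefixed parameter group (the name is an assumption, not fixed by the thesis):

```python
import torch

def inner_adapt(model, support_x, support_y, alpha_body, alpha_head, n_steps=1):
    """Task-specific adaptation with per-part inner learning rates.

    MAML: alpha_body = alpha_head > 0; ANIL: alpha_body = 0;
    BOIL: alpha_head = 0 (head frozen, body only).
    """
    for _ in range(n_steps):
        loss = torch.nn.functional.cross_entropy(model(support_x), support_y)
        grads = torch.autograd.grad(loss, list(model.parameters()))
        with torch.no_grad():
            for (name, p), g in zip(model.named_parameters(), grads):
                lr = alpha_head if name.startswith("head") else alpha_body
                p -= lr * g  # a zero learning rate freezes that part
```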
BOIL (Body Only update in Inner Loop) Algorithm
BOIL Algorithm
• Representation change in BOIL through task-specific adaptation (inner-loop update)
• Difference in task-specific (inner) updates between MAML/ANIL and BOIL
  • (a) MAML / ANIL : mainly update the head, with a negligible change in the body (extractor); hence, the representations in the feature space are almost identical before and after the inner loop.
  • (b) BOIL : updates only the body, without changing the head, during the inner updates; the representations in the feature space change significantly to fit the fixed decision boundaries.
(Figure: decision boundaries and representations before and after the inner loop for (a) MAML/ANIL and (b) BOIL.)
Necessity of Representation Change : Domain-agnostic Adaptation
BOIL Algorithm
• The goal of meta-learning : the ability to adapt even to environments where the source and target are significantly different.
• When there are no strong similarities between the source and target domains, reusing representations that are good for the source domain can yield imperfect representations for the target domain.
• Therefore, the ability to adapt well to other target domains, i.e., the ability to update task-specifically in response to unseen tasks and to change the representation during the inner loop (representation change), is necessary.
(Figure: (a) same-domain adaptation, source: mini-ImageNet training classes, target: mini-ImageNet test classes; (b) cross-domain (domain-agnostic) adaptation, source: mini-ImageNet training classes, target: CUB (birds only) test classes.)
Superiority of BOIL in Domain-agnostic Adaptation
BOIL Algorithm
• We divide the datasets into a General domain (mini-ImageNet, tiered-ImageNet) and a Specific domain (Cars, CUB) based on class granularity.
• Adaptation types : [General → General], [General → Specific], [Specific → General], [Specific → Specific] (to make the evaluation realistic and more difficult).
• In all settings, BOIL outperforms MAML/ANIL by a large margin via representation change.
Cosine Similarity of Representations in BOIL
Representation Change in BOIL
• Cosine similarity : we compute the cosine similarity between the representations, after every convolution module, of a query set with 5 classes and 15 samples per class from mini-ImageNet.
• The orange line is the average cosine similarity between samples of the same class, and the blue line is the average cosine similarity between samples of different classes.
• MAML/ANIL :
  • The patterns show no noticeable difference before and after the update.
  → The effectiveness of MAML/ANIL leans heavily on the meta-initialized body, not on the task-specific adaptation. (representation reuse)
• BOIL :
  • Before adaptation, BOIL's meta-initialized body cannot distinguish the classes.
  • After adaptation, the similarity between different classes decreases rapidly at conv4.
  → The body learns to distinguish the classes through adaptation. (representation change)
(Figure: cosine similarity before and after adaptation for (a) MAML, (b) ANIL, (c) BOIL.)
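For reference, here is a small PyTorch sketch of the class-wise cosine-similarity statistic described above (same-class pairs vs. different-class pairs). Flattening the conv activations into vectors and all names are assumptions, not the exact thesis code.

```python
import torch
import torch.nn.functional as F

def classwise_cosine(features, labels):
    """Average pairwise cosine similarity within and across classes.

    `features` is (N, D), the flattened activations of one conv module;
    `labels` is (N,).
    """
    sim = F.cosine_similarity(features.unsqueeze(1), features.unsqueeze(0), dim=-1)
    same = labels.unsqueeze(1) == labels.unsqueeze(0)
    off_diag = ~torch.eye(len(labels), dtype=torch.bool)  # drop self-similarity
    same_cls = sim[same & off_diag].mean()   # orange line: same-class pairs
    diff_cls = sim[~same].mean()             # blue line: different-class pairs
    return same_cls, diff_cls
```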
CKA of Representations in BOIL
Representation Change in BOIL
• CKA : we compute CKA values between the representations of the query set before and after adaptation. A CKA close to 1 means the two representations are almost identical.
• MAML/ANIL : CKA shows that MAML/ANIL do not change the representations in the body.
• BOIL : BOIL changes the representation of the last conv layer. This indicates that BOIL rapidly learns the task through representation change.
(Figure: layer-wise CKA before vs. after adaptation on (a) mini-ImageNet and (b) Cars.)
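Linear CKA (Kornblith et al., 2019) can be computed as below. This is a generic sketch for two activation matrices of the same examples, e.g., before and after the inner-loop adaptation, not the exact evaluation code of the thesis.

```python
import torch

def linear_cka(x, y):
    """Linear CKA between two representations `x` (N, D1) and `y` (N, D2)."""
    x = x - x.mean(dim=0, keepdim=True)  # center each feature dimension
    y = y - y.mean(dim=0, keepdim=True)
    # CKA(X, Y) = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = torch.norm(y.t() @ x, p="fro") ** 2
    norm_x = torch.norm(x.t() @ x, p="fro")
    norm_y = torch.norm(y.t() @ y, p="fro")
    return cross / (norm_x * norm_y)
```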
Empirical Analysis of BOIL
Representation Change in BOIL
• NIL-testing : during meta-testing, we build class prototypes from the support set and classify the query set by similarity to these prototypes, in order to measure the quality of the representations produced by the body.
• With the head : before adaptation, all algorithms, on both the same domain and the cross domain, are unable to distinguish the classes (chance-level accuracy, 20%). After adaptation, BOIL far outperforms the other algorithms. This means that the representation change of BOIL is more effective than the representation reuse of MAML/ANIL.
• Without the head : MAML and ANIL already generate representations sufficient for classification, and adaptation makes little or no difference. BOIL shows a steep performance improvement through adaptation on both the same domain and the cross domain. This implies that the body of BOIL can be task-specifically updated.
(Figure: (a) toy example of NIL-testing (3-way 5-shot).)
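A minimal sketch of the NIL-testing protocol as described above, assuming (as in prototype-based methods) that a class prototype is the mean of that class's support-set features; `body` and all names are assumptions, not the thesis implementation.

```python
import torch
import torch.nn.functional as F

def nil_test(body, support_x, support_y, query_x, n_way):
    """Label queries by cosine similarity to support-set class prototypes."""
    with torch.no_grad():
        s_feat = body(support_x)                       # (n_way * k_shot, D)
        q_feat = body(query_x)                         # (n_query, D)
        # Class prototype = mean feature of that class's support samples.
        protos = torch.stack([s_feat[support_y == c].mean(dim=0)
                              for c in range(n_way)])  # (n_way, D)
        sims = F.cosine_similarity(q_feat.unsqueeze(1), protos.unsqueeze(0), dim=-1)
        return sims.argmax(dim=1)                      # predicted class per query
```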
Ablation Study of Learning Layers
Representation Change in BOIL
• Ablation study : we train multiple consecutive layers in the inner loop, with and without the head (see the sketch after this list).
• We consistently observe that learning with the head is far from the best accuracy. → Freezing the head is crucial.
• We also find several settings that skip the lower-level layers in the inner loop and perform slightly better than BOIL. We believe each pair of network architecture and data set has its own best layer combination; when it is feasible to search for that combination with large computing power, BOIL can be further improved.
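For clarity, the following sketch enumerates the ablation grid implied above: every run of consecutive conv layers, each tried with and without the head. The layer names are hypothetical; each setting can then drive an inner-loop update like the `inner_adapt` sketch earlier, with non-selected parts frozen.

```python
LAYERS = ["conv1", "conv2", "conv3", "conv4"]  # hypothetical layer names of the 4conv network

def ablation_settings():
    """Enumerate inner-loop settings: every consecutive layer range, with/without head."""
    settings = []
    for i in range(len(LAYERS)):
        for j in range(i, len(LAYERS)):
            block = LAYERS[i:j + 1]
            settings.append(block + ["head"])  # update these layers and the head
            settings.append(block)             # update these layers only (head frozen)
    return settings
```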
BOIL in a Larger Network (Residual Network)
Representation Change in BOIL
• We explore BOIL's applicability to a deeper network with a wiring structure, ResNet-12. In general, deeper networks use feature-wiring structures, e.g., skip connections, to facilitate feature propagation.
• Disconnection trick : we propose a simple trick, disconnecting the last skip connection, to reinforce the representation change in the high-level body (see the sketch below).
(Figure: the last residual block (a) with and (b) without the last skip connection (disconnection trick).)
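A minimal sketch of the disconnection trick on a simplified residual block. Since the conclusion describes the trick as removing the back-propagation path of the last skip connection, one plausible reading is to keep the skip in the forward pass but detach it; the block structure here is simplified relative to ResNet-12 and all names are assumptions.

```python
import torch.nn as nn

class LastResBlock(nn.Module):
    """Simplified residual block illustrating the disconnection trick.

    skip_mode="full": standard skip connection.
    skip_mode="forward_only": skip kept in the forward pass but detached,
    so no gradient flows back through it (one reading of the trick).
    skip_mode="none": skip connection removed entirely.
    """
    def __init__(self, channels, skip_mode="full"):
        super().__init__()
        self.skip_mode = skip_mode
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.LeakyReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.act = nn.LeakyReLU()

    def forward(self, x):
        out = self.conv(x)
        if self.skip_mode == "full":
            out = out + x
        elif self.skip_mode == "forward_only":
            out = out + x.detach()  # remove the back-propagation path only
        return self.act(out)
```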
Representation Change via the Disconnection Trick
Representation Change in BOIL
(Figure: representation change in the last block of ResNet-12, BOIL with vs. without the last skip connection.)
* LSC : last skip connection
Main Contributions of the Research
Conclusion
• We emphasize the necessity of representation change for meta-learning through cross-domain adaptation experiments.
• We propose a simple but effective meta-learning algorithm that updates only the Body (extractor) of the model in the Inner Loop (BOIL). We empirically show that BOIL improves performance on all benchmark data sets, and that the improvement is particularly pronounced on fine-grained data sets and in cross-domain adaptation.
• We demonstrate, using cosine similarity and CKA, that BOIL enjoys representation reuse in the low-/mid-level body and representation change in the high-level body, and we empirically analyze the effectiveness of BOIL's body through an ablation study on the learning layers.
• For ResNet architectures, we propose a disconnection trick that removes the back-propagation path of the last skip connection. The disconnection trick strengthens representation change in the high-level body.
Thank you.
Reference
• Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pp. 1126–1135. JMLR.org, 2017.
• Aniruddh Raghu, Maithra Raghu, Samy Bengio, and Oriol Vinyals. Rapid learning or feature reuse? Towards understanding the effectiveness of MAML. In International Conference on Learning Representations, 2020.
• Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Daan Wierstra, et al. Matching networks for one shot learning. In Advances in Neural Information Processing Systems, pp. 3630–3638, 2016.
• Boris Oreshkin, Pau Rodríguez López, and Alexandre Lacoste. TADAM: Task dependent adaptive metric for improved few-shot learning. In Advances in Neural Information Processing Systems, pp. 721–731, 2018.
• Simon Kornblith, Mohammad Norouzi, Honglak Lee, and Geoffrey Hinton. Similarity of neural network representations revisited. arXiv preprint arXiv:1905.00414, 2019.
• Leland McInnes, John Healy, Nathaniel Saul, and Lukas Großberger. UMAP: Uniform manifold approximation and projection. Journal of Open Source Software, 3(29), 2018.
Model Implementations and Datasets
Appendix
• Model Implementations
• 4conv network (Vinyals et al., 2016) : 4 conv modules [3×3 conv layer (64 filters), BN, ReLU, 2×2 max-pool]; a sketch follows below.
• ResNet-12 (Oreshkin et al., 2018) : 4 residual blocks [3 conv modules, BN, Leaky ReLU]
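A minimal PyTorch sketch of the 4conv backbone described above, with a linear head sized for 84×84 inputs. The `body`/`head` attribute names and the head construction are assumptions for illustration.

```python
import torch.nn as nn

def conv_module(in_ch, out_ch=64):
    """One module of the 4conv network: 3x3 conv (64 filters), BN, ReLU, 2x2 max-pool."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(),
        nn.MaxPool2d(2),
    )

class FourConv(nn.Module):
    """Sketch of the 4conv few-shot backbone (Vinyals et al., 2016) with a linear head."""
    def __init__(self, n_way=5, in_ch=3):
        super().__init__()
        self.body = nn.Sequential(*[conv_module(in_ch if i == 0 else 64) for i in range(4)])
        # For 84x84 inputs, four 2x2 max-pools leave a 5x5 feature map (64 channels).
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(64 * 5 * 5, n_way))

    def forward(self, x):
        return self.head(self.body(x))
```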
• Datasets
Datasets                    miniImageNet               tieredImageNet             Cars                     CUB
Source                      Russakovsky et al. (2015)  Russakovsky et al. (2015)  Krause et al. (2013)     Welinder et al. (2010)
Image size                  84×84                      84×84                      84×84                    84×84
Fineness                    Coarse-grained             Coarse-grained             Fine-grained             Fine-grained
# meta-training classes     64                         351                        98                       100
# meta-validation classes   16                         97                         49                       50
# meta-testing classes      20                         160                        49                       50
Split setting               Vinyals et al. (2016)      Ren et al. (2018)          Tseng et al. (2020)      Hilliard et al. (2018)

Datasets                    FC100                      CIFAR-FS                   VGG-Flower                    Aircraft
Source                      Krizhevsky et al. (2009)   Krizhevsky et al. (2009)   Nilsback & Zisserman (2008)   Maji et al. (2013)
Image size                  32×32                      32×32                      32×32                         32×32
Fineness                    Coarse-grained             Coarse-grained             Fine-grained                  Fine-grained
# meta-training classes     60                         64                         71                            70
# meta-validation classes   20                         16                         16                            15
# meta-testing classes      20                         20                         15                            15
Split setting               Bertinetto et al. (2018)   Oreshkin et al. (2018)     Na et al. (2019)              Na et al. (2019)
Visualization with UMAP
Appendix
• UMAP visualization results on the benchmark data sets (training domain → test domain)
(Figures: mini-ImageNet → mini-ImageNet, CUB → Cars, tieredImageNet → miniImageNet.)

Representation Change via the Disconnection Trick
Appendix
(a) Cosine similarity in block4 (the last block) of ResNet-12, BOIL with the last skip connection (LSC).
(b) Cosine similarity in block4 (the last block) of ResNet-12, BOIL without the last skip connection (LSC).
More Related Content

Similar to BOIL: Towards Representation Change for Few-shot Learning

Preliminary Exam Slides
Preliminary Exam SlidesPreliminary Exam Slides
Preliminary Exam SlidesDebasmit Das
 
NTU DBME5028 Week8 Transfer Learning
NTU DBME5028 Week8 Transfer LearningNTU DBME5028 Week8 Transfer Learning
NTU DBME5028 Week8 Transfer LearningSean Yu
 
Transfer Learning: Breve introducción a modelos pre-entrenados.
Transfer Learning: Breve introducción a modelos pre-entrenados.Transfer Learning: Breve introducción a modelos pre-entrenados.
Transfer Learning: Breve introducción a modelos pre-entrenados.Fernando Constantino
 
PhD Defense Slides
PhD Defense SlidesPhD Defense Slides
PhD Defense SlidesDebasmit Das
 
Large Scale GAN Training for High Fidelity Natural Image Synthesis
Large Scale GAN Training for High Fidelity Natural Image SynthesisLarge Scale GAN Training for High Fidelity Natural Image Synthesis
Large Scale GAN Training for High Fidelity Natural Image SynthesisSeunghyun Hwang
 
Naver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNaver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNAVER Engineering
 
Review : Rethinking Pre-training and Self-training
Review : Rethinking Pre-training and Self-trainingReview : Rethinking Pre-training and Self-training
Review : Rethinking Pre-training and Self-trainingDongmin Choi
 
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...Antonio Tejero de Pablos
 
End-to-end deep auto-encoder for segmenting a moving object with limited tra...
End-to-end deep auto-encoder for segmenting a moving object  with limited tra...End-to-end deep auto-encoder for segmenting a moving object  with limited tra...
End-to-end deep auto-encoder for segmenting a moving object with limited tra...IJECEIAES
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...IEEEFINALYEARSTUDENTPROJECT
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...IEEEMEMTECHSTUDENTSPROJECTS
 
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...IEEEFINALYEARSTUDENTPROJECTS
 
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in R
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in RGentle Introduction: Bayesian Modelling and Probabilistic Programming in R
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in RMarco Wirthlin
 
Comparing Incremental Learning Strategies for Convolutional Neural Networks
Comparing Incremental Learning Strategies for Convolutional Neural NetworksComparing Incremental Learning Strategies for Convolutional Neural Networks
Comparing Incremental Learning Strategies for Convolutional Neural NetworksVincenzo Lomonaco
 
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...MLAI2
 
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingSangwoo Mo
 
A hybrid constructive algorithm incorporating teaching-learning based optimiz...
A hybrid constructive algorithm incorporating teaching-learning based optimiz...A hybrid constructive algorithm incorporating teaching-learning based optimiz...
A hybrid constructive algorithm incorporating teaching-learning based optimiz...IJECEIAES
 

Similar to BOIL: Towards Representation Change for Few-shot Learning (20)

Preliminary Exam Slides
Preliminary Exam SlidesPreliminary Exam Slides
Preliminary Exam Slides
 
NTU DBME5028 Week8 Transfer Learning
NTU DBME5028 Week8 Transfer LearningNTU DBME5028 Week8 Transfer Learning
NTU DBME5028 Week8 Transfer Learning
 
Transfer Learning: Breve introducción a modelos pre-entrenados.
Transfer Learning: Breve introducción a modelos pre-entrenados.Transfer Learning: Breve introducción a modelos pre-entrenados.
Transfer Learning: Breve introducción a modelos pre-entrenados.
 
PhD Defense Slides
PhD Defense SlidesPhD Defense Slides
PhD Defense Slides
 
Large Scale GAN Training for High Fidelity Natural Image Synthesis
Large Scale GAN Training for High Fidelity Natural Image SynthesisLarge Scale GAN Training for High Fidelity Natural Image Synthesis
Large Scale GAN Training for High Fidelity Natural Image Synthesis
 
Naver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNaver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltc
 
Review : Rethinking Pre-training and Self-training
Review : Rethinking Pre-training and Self-trainingReview : Rethinking Pre-training and Self-training
Review : Rethinking Pre-training and Self-training
 
Tutorial inns2019 full
Tutorial inns2019 fullTutorial inns2019 full
Tutorial inns2019 full
 
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
 
End-to-end deep auto-encoder for segmenting a moving object with limited tra...
End-to-end deep auto-encoder for segmenting a moving object  with limited tra...End-to-end deep auto-encoder for segmenting a moving object  with limited tra...
End-to-end deep auto-encoder for segmenting a moving object with limited tra...
 
Conv xg
Conv xgConv xg
Conv xg
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
 
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
 
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in R
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in RGentle Introduction: Bayesian Modelling and Probabilistic Programming in R
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in R
 
Comparing Incremental Learning Strategies for Convolutional Neural Networks
Comparing Incremental Learning Strategies for Convolutional Neural NetworksComparing Incremental Learning Strategies for Convolutional Neural Networks
Comparing Incremental Learning Strategies for Convolutional Neural Networks
 
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
 
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing
 
A hybrid constructive algorithm incorporating teaching-learning based optimiz...
A hybrid constructive algorithm incorporating teaching-learning based optimiz...A hybrid constructive algorithm incorporating teaching-learning based optimiz...
A hybrid constructive algorithm incorporating teaching-learning based optimiz...
 

Recently uploaded

DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一F La
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 

Recently uploaded (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 

BOIL: Towards Representation Change for Few-shot Learning

  • 1. Meta-Learning for the representation change in task-specific update Defence for the Master’s Thesis 2020. 12. 14 Hyungjun Yoo (Advisor : Se-Young Yun) Graduate School of Knowledge Service Engineering, OSI Lab
  • 2. Table of Contents Purpose of the Research Leverage representation change through freezing head and updating body only in task-specific adaptation for efficient Meta-learning 1. Introduction : What is Meta-Learning? 2. Problem Setting : Few-shot Classification, MAML, ANIL 3. BOIL Algorithm : BOIL, Domain-agnostic adaptation 4. Representation Change in BOIL : Cosine similarity, CKA, Empirical Analysis, Disconnection trick (ResNet) 5. Conclusion : Research contributions
  • 3. Hyungjun Yoo Defense for the Master’s Thesis 3 / 46 14th December 2020 • Deep Neural Network needs a large labeled dataset and requires long training time. What is Meta-learning? Introduction AlexNet Residual Net Inception V3
  • 4. Hyungjun Yoo Defense for the Master’s Thesis 4 / 46 14th December 2020 • Deep Neural Network needs a large labeled dataset and requires long training time. • But human can learn with only few samples, and also, human can use previous knowledge to learn new task. What is Meta-learning? Introduction AlexNet Residual Net Inception V3
  • 5. Hyungjun Yoo Defense for the Master’s Thesis 5 / 46 14th December 2020 What is Meta-learning? Introduction • Meta-Learning (Learning to learn) : tries to make DNN mimic human intelligence • Model is able to learn with few shot examples in each task • Model which learns to learn from the previous similar tasks can quickly learns new task • Why learning to learn? • Advantages to effectively reuse data on other tasks • Applicability replace manual engineering of architecture, hyperparameters, etc. • Learning property to quickly adapt to unexpected scenarios (inevitable failures, long tail) • Problem Domains • Few-shot classification
  • 6. Hyungjun Yoo Defense for the Master’s Thesis 6 / 46 14th December 2020 Main Approaches of Meta-learning Introduction Metric Based Model Based Optimization Based Key Idea Metric learning RNN, external memory Gradient Descent Key Papers Matching Net (Vinyals et al., 2016) Prototypical Net (Snell et al., 2017) MANN (Santoro et al., 2016) Meta Network (Munkhdalai & Yu, 2017) MAML (Finn et al., 2017) Reptile (Nichol et al., 2018) Strength Simple Entirely feedforward Applicable to any baseline model Lends well to OOD tasks Model Agnostic Weakness Hard to generalize to varying example size Restricted domain Excessive computation and parameters Often data-inefficient Training instability Second-order optimization
  • 7. Hyungjun Yoo Defense for the Master’s Thesis 7 / 46 14th December 2020 Few-shot learning (𝒏-way 𝒌-shot classification) Problem Setting Query Set, 𝐷𝜏1 𝑞𝑟𝑦 Support Set, 𝐷𝜏1 𝑠𝑝𝑡 Query Set, 𝐷𝜏2 𝑞𝑟𝑦 Support Set, 𝐷𝜏2 𝑠𝑝𝑡 , 𝝉𝟏 , 𝝉𝟐 . , 𝝉𝟏 . Query Set, 𝐷𝜏1 𝑞𝑟𝑦 Support Set, 𝐷𝜏1 𝑠𝑝𝑡
  • 8. Hyungjun Yoo Defense for the Master’s Thesis 8 / 46 14th December 2020 Meta-learning Framework (MAML) • Model-Agnostic Meta-Learning (MAML, Finn et al., 2017) • MAML uses 𝑏(meta-batch size) tasks for a single iteration, and each iteration consists of two steps, inner loop and outer loop. • Two steps of parameter update • Inner loop : from meta-initialization(𝜃0), task-specifically updates using support set Problem Setting inner loop : 𝜃𝜏𝑖 = 𝜃0 − 𝜕∇𝜃0 𝐿𝜏𝑖 , 𝑓𝜃0 , 𝐷𝜏𝑖 𝑠𝑝𝑡
  • 9. Hyungjun Yoo Defense for the Master’s Thesis 9 / 46 14th December 2020 Meta-learning Framework (MAML) • Model-Agnostic Meta-Learning (MAML, Finn et al., 2017) • MAML uses 𝑏(meta-batch size) tasks for a single iteration, and each iteration consists of two steps, inner loop and outer loop. • Two steps of parameter update • Inner loop : from meta-initialization(𝜃0), task-specifically updates using support set • Outer loop : updates meta-initialization with average of the task-specific losses using query set Problem Setting inner loop : outer loop : 𝜃𝜏𝑖 = 𝜃0 − 𝜕∇𝜃0 𝐿𝜏𝑖 , 𝑓𝜃0 , 𝐷𝜏𝑖 𝑠𝑝𝑡 𝜃0 = 𝜃0 − 𝛽∇𝜃0 ෍ 𝜏𝑖~𝑝(𝑇) 𝜏𝑏 𝐿𝜏𝑖 , 𝑓𝜃𝜏𝑖 , 𝐷𝜏𝑖 𝑞𝑟𝑦
  • 10. Hyungjun Yoo Defense for the Master’s Thesis 10 / 46 14th December 2020 Representation Reuse and Change • Rapid Learning or Feature Reuse? (Raghu et al., 2020) Problem Setting • Divide the network with two parts : 𝜃 = 𝜃𝑒𝑥𝑡 , 𝜃𝑐𝑙𝑠 • body (𝜃𝑒𝑥𝑡 , extractor, conv layers) • head (𝜃𝑐𝑙𝑠 , classifier, fully connected layer)
  • 11. Hyungjun Yoo Defense for the Master’s Thesis 11 / 46 14th December 2020 Representation Reuse and Change • Rapid Learning or Feature Reuse? (Raghu et al., 2020) Problem Setting • Divide the network with two parts : 𝜃 = 𝜃𝑒𝑥𝑡 , 𝜃𝑐𝑙𝑠 • Representation Change / Representation Reuse • Rapid Learning : Representations from body are significantly changed after inner update. → Representation Change • Feature Reuse : Representations from body are negligibly changed after inner update and reused. → Representation reuse
  • 12. Hyungjun Yoo Defense for the Master’s Thesis 12 / 46 14th December 2020 Representation Reuse and Change • Rapid Learning or Feature Reuse? (Raghu et al., 2020) Problem Setting • Divide the network with two parts : 𝜃 = 𝜃𝑒𝑥𝑡 , 𝜃𝑐𝑙𝑠 • Representation Change / Representation Reuse • The dominant factor of MAML’s effectiveness is representation reuse. • Meta-trained model’s body is already able to extract good representations before inner update, and representations are not changed after task- specific update (inner loop).
  • 13. Hyungjun Yoo Defense for the Master’s Thesis 13 / 46 14th December 2020 Representation Reuse and Change • Rapid Learning or Feature Reuse? (Raghu et al., 2020) Problem Setting • Divide the network with two parts : 𝜃 = 𝜃𝑒𝑥𝑡 , 𝜃𝑐𝑙𝑠 • Representation Change / Representation Reuse • The dominant factor of MAML’s effectiveness is representation reuse. • Suggest more computationally efficient update rule • ANIL : Head only update in inner loop • Inner loop updates : 𝜃𝜏𝑖 𝑒𝑥𝑡 = 𝜃0 𝑒𝑥𝑡 (No update in inner loop) 𝜃𝜏𝑖 𝑐𝑙𝑠 = 𝜃0 𝑐𝑙𝑠 − 𝜕ℎ∇𝜃0 𝐿𝜏𝑖 , 𝑓𝜃0 , 𝐷𝜏𝑖 𝑠𝑝𝑡 • Outer loop updates : 𝜃0 = 𝜃0 − 𝛽∇𝜃0 σ𝜏𝑖~𝑝(𝑇) 𝐿𝜏𝑖 , 𝑓𝜃𝜏𝑖 , 𝐷𝜏𝑖 𝑞𝑟𝑦
  • 14. Hyungjun Yoo Defense for the Master’s Thesis 14 / 46 14th December 2020 • The reverse version of ANIL : In inner loop, classifier is fixed, and body is only updated BOIL(Body Only update in Inner Loop) Algorithm BOIL Algorithm • Inner loop updates : 𝜃𝜏𝑖 𝑒𝑥𝑡 = 𝜃0 𝑒𝑥𝑡 − 𝜕𝑏∇𝜃0 𝐿𝜏𝑖 , 𝑓𝜃0 , 𝐷𝜏𝑖 𝑠𝑝𝑡 𝜃𝜏𝑖 𝑐𝑙𝑠 = 𝜃0 𝑐𝑙𝑠 (No update) • Outer loop updates : 𝜃0 = 𝜃0 − 𝛽∇𝜃0 σ𝜏𝑖~𝑝(𝑇) 𝐿𝜏𝑖 , 𝑓𝜃𝜏𝑖 , 𝐷𝜏𝑖 𝑞𝑟𝑦 • The learning rates according to the algorithms 4conv network ResNet-12 MAML ANIL BOIL MAML ANIL BOIL 𝛼𝑏 0.5 0.0 0.5 0.3 0.0 0.3 𝛼ℎ 0.5 0.5 0.0 0.3 0.3 0.0 𝛽𝑏 0.001 0.001 0.001 0.0006 0.0006 0.0006 𝛽ℎ 0.001 0.001 0.001 0.0006 0.0006 0.0006
  • 15. Hyungjun Yoo Defense for the Master’s Thesis 15 / 46 14th December 2020 • Representation change in BOIL through task-specific update • Difference in task-specific (inner) updates between MAML/ANIL and BOIL • (a) MAML / ANIL : mainly updates the head with a negligible change in body (extractor). hence, representations on the feature space are almost identical. BOIL(Body Only update in Inner Loop) Algorithm BOIL Algorithm (a) MAML/ANIL. Decision boundaries Representations Inner loop
  • 16. Hyungjun Yoo Defense for the Master’s Thesis 16 / 46 14th December 2020 • Representation change in BOIL through task-specific adaptation (inner loop update) • Difference in task-specific (inner) updates between MAML/ANIL and BOIL • (a) MAML / ANIL : mainly updates the head with a negligible change in body (extractor). hence, representations on the feature space are almost identical. • (b) BOIL : updates only the body without changing the head through inner updates. Representations on the feature space change significantly following the fixed decision boundaries. BOIL(Body Only update in Inner Loop) Algorithm BOIL Algorithm (a) MAML/ANIL. Decision boundaries Representations Inner loop (b) BOIL. Inner loop
  • 17. Hyungjun Yoo Defense for the Master’s Thesis 17 / 46 14th December 2020 • The goal of meta-learning : Ability to adapt to environment where the source and target are even very different Necessity of Representation Change : Domain-agnostic adaptation BOIL Algorithm Source Domain : mini-ImageNet (training classes) Target Domain : mini-ImageNet (test classes) (a) Same Domain Adaptation (b) Cross Domain Adaptation (Domain-Agnostic) Source Domain : mini-ImageNet (training classes) Target Domain : CUB (Bird only) (test classes)
  • 18. Hyungjun Yoo Defense for the Master’s Thesis 18 / 46 14th December 2020 • The goal of meta-learning : Ability to adapt to environment where the source and target are significantly different • When there are no strong similarities between the source and target domains, representation reuse using good representations for the source domain could be imperfect representations for the target domain. Necessity of Representation Change : Domain-agnostic adaptation BOIL Algorithm Source Domain : mini-ImageNet (training classes) Target Domain : mini-ImageNet (test classes) (a) Same Domain Adaptation (b) Cross Domain Adaptation (Domain-Agnostic) Source Domain : mini-ImageNet (training classes) Target Domain : CUB (Bird only) (test classes)
  • 19. Hyungjun Yoo Defense for the Master’s Thesis 19 / 46 14th December 2020 • The goal of meta-learning : Ability to adapt to environment where the source and target are significantly different • When there are no strong similarities between the source and target domains, representation reuse using good representations for the source domain could be imperfect representations for the target domain. • Therefore, the ability to adapt well to other target domain, i.e. the ability to task-specifically update in response to unseen tasks and change the representation(=representation change) during inner loop, is necessary. Source Domain : mini-ImageNet (training classes) Target Domain : mini-ImageNet (test classes) (a) Same Domain Adaptation (b) Cross Domain Adaptation (Domain-Agnostic) Source Domain : mini-ImageNet (training classes) Target Domain : CUB (Bird only) (test classes) Necessity of Representation Change : Domain-agnostic adaptation BOIL Algorithm
  • 20. Hyungjun Yoo Defense for the Master’s Thesis 20 / 46 14th December 2020 Superiority of BOIL in Domain-agnostic Adaptation • We divide the types of datasets into General (mini-ImageNet, tiered-ImageNet) domain and Specific (Cars, CUB) domain based on fineness. BOIL Algorithm
  • 21. Hyungjun Yoo Defense for the Master’s Thesis 21 / 46 14th December 2020 Superiority of BOIL in Domain-agnostic Adaptation • We divide the types of datasets into General (mini-ImageNet, tiered-ImageNet) domain and Specific (Cars, CUB) domain based on fineness. • Adaptation types : [General → General], [General → Specific], [Specific → General], [Specific → Specific] (in order to make the situation realistic and make it more difficult) BOIL Algorithm
  • 22. Hyungjun Yoo Defense for the Master’s Thesis 22 / 46 14th December 2020 Superiority of BOIL in Domain-agnostic Adaptation • We divide the types of datasets into General (mini-ImageNet, tiered-ImageNet) domain and Specific (Cars, CUB) domain based on fineness. • Adaptation types : [General → General], [General → Specific], [Specific → General], [Specific → Specific] (in order to make the situation realistic and make it more difficult) • With all settings, BOIL overwhelms MAML/ANIL via representation change. BOIL Algorithm
  • 23. Hyungjun Yoo Defense for the Master’s Thesis 23 / 46 14th December 2020 Cosine Similarity of Representations of BOIL Representation Change in BOIL • Cosine Similarity : We calculate Cosine Similarity between the representations of a query set including 5 classes and 15 samples per class from mini-ImageNet after every convolution module. • The orange line represents the average of the cosine similarity between the samples having the same class, and the blue line represents the average of the cosine similarity between the samples having different classes. (a) MAML (b) ANIL
  • 24. Hyungjun Yoo Defense for the Master’s Thesis 24 / 46 14th December 2020 Cosine Similarity of Representations of BOIL Representation Change in BOIL • MAML/ANIL : • Their patterns do not show any noticeable difference before or after update. → The effectiveness of MAML/ANIL heavily leans on the meta-initialized body, not the task-specific adaptation. (representation reuse) (a) MAML (b) ANIL
  • 25. Hyungjun Yoo Defense for the Master’s Thesis 25 / 46 14th December 2020 Cosine Similarity of Representations of BOIL Representation Change in BOIL • BOIL : • Before adaptation, BOIL’s meta-initialized body cannot distinguish the classes. • After adaptation, the similarity of the different classes rapidly decrease on conv4. → The body can distinguish the classes through adaptation. (representation change) (c) BOIL
  • 26. Hyungjun Yoo Defense for the Master’s Thesis 26 / 46 14th December 2020 CKA of Representations of BOIL • CKA : We calculate CKA values of representations of query set before and after adaptation. When the CKA between two representations is close to 1, the representations are almost identical. Representation Change in BOIL (a) On mini-ImageNet dataset. (b) On Cars dataset.
  • 27. Hyungjun Yoo Defense for the Master’s Thesis 27 / 46 14th December 2020 CKA of Representations of BOIL • MAML/ANIL : CKA shows that the MAML/ANIL algorithms do not change the representation in the body. • BOIL : BOIL changes the representation of the last conv layer. This result indicates that the BOIL algorithm rapidly learns task through representation change. Representation Change in BOIL (a) On mini-ImageNet dataset. (b) On Cars dataset.
• 28. Empirical Analysis of BOIL — Representation Change in BOIL
• NIL-testing: in meta-testing, we build class prototypes from the support set and classify the query set by similarity to the prototypes, in order to measure how effective the representations produced by the body are.
(a) Toy example of NIL-testing (3-way 5-shot).
• 32. Empirical Analysis of BOIL — Representation Change in BOIL
• With the head: before adaptation, no algorithm can distinguish the classes on either the same- or the cross-domain (20%, i.e. chance level for 5-way). After adaptation, BOIL clearly outperforms the other algorithms, which means that BOIL's representation change is more effective than the representation reuse of MAML/ANIL.
• Without the head: MAML and ANIL already generate representations sufficient for classification, and adaptation makes little or no difference. BOIL shows a steep performance improvement through adaptation on both the same- and the cross-domain, which implies that the body of BOIL can be task-specifically updated (a sketch of NIL-testing follows below).
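A minimal sketch of NIL-testing as described above, assuming the support/query features have already been extracted by the (adapted or unadapted) body and flattened per sample; names and shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def nil_accuracy(sup_feats, sup_labels, qry_feats, qry_labels, n_way=5):
    """Classify queries by cosine similarity to class prototypes; no head."""
    protos = torch.stack([sup_feats[sup_labels == c].mean(dim=0)
                          for c in range(n_way)])            # (n_way, d)
    sim = F.normalize(qry_feats, dim=1) @ F.normalize(protos, dim=1).t()
    pred = sim.argmax(dim=1)                                 # nearest prototype
    return (pred == qry_labels).float().mean().item()
```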
• 35. Ablation Study of Learning Layers — Representation Change in BOIL
• Ablation study: we train multiple consecutive layers in the inner loop, with and without the head.
• We consistently observe that learning with the head is far from the best accuracy. → Freezing the head is crucial.
• We also find several settings that skip the lower-level layers in the inner loop and perform slightly better than BOIL. We believe each pair of neural network architecture and dataset has its own best layer combination; given enough computing power to search for that combination, BOIL could be improved further (a sketch follows below).
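The layer ablation can be expressed as a per-layer inner learning-rate table; the table below is a hypothetical example of one such setting (skipping conv1), not the thesis configuration.

```python
# Hypothetical per-layer inner-loop learning rates for a 4-conv network:
# 0.0 freezes a layer during task-specific adaptation.
inner_lrs = {
    "conv1": 0.0,   # skipping a low-level layer sometimes helps slightly
    "conv2": 0.5,
    "conv3": 0.5,
    "conv4": 0.5,
    "head":  0.0,   # learning with the head was consistently worse
}

def per_layer_update(named_params, grads, lrs=inner_lrs):
    # Assumes parameter names like "conv1.weight", "head.bias".
    return [p - lrs[name.split(".")[0]] * g
            for (name, p), g in zip(named_params, grads)]
```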
• 37. BOIL on a Larger Network (Residual Network) — Representation Change in BOIL
• We explore BOIL's applicability to a deeper network with a wiring structure, ResNet-12. Deeper networks generally use feature-wiring structures, e.g. skip connections, to facilitate feature propagation.
• Disconnection trick: we propose a simple trick that disconnects the last skip connection in order to reinforce representation change in the high level of the body (a sketch follows below).
(a) With the last skip connection. (b) Without the last skip connection (disconnection trick).
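A sketch of the disconnection trick on the last residual block; the module structure is an assumption, but the key point, dropping the identity path so gradients from the head reach the high-level body only through the conv path, matches the slide.

```python
import torch.nn as nn

class LastResBlock(nn.Module):
    """Final block of ResNet-12 with an optional (last) skip connection."""
    def __init__(self, convs: nn.Module, shortcut: nn.Module, use_lsc: bool):
        super().__init__()
        self.convs, self.shortcut, self.use_lsc = convs, shortcut, use_lsc

    def forward(self, x):
        out = self.convs(x)
        if self.use_lsc:                  # (a) ordinary residual block
            out = out + self.shortcut(x)
        return out                        # (b) disconnection trick: use_lsc=False
```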
• 38. Representation Change via the Disconnection Trick — Representation Change in BOIL
* LSC: last skip connection
• 42. Main Contributions of the Research — Conclusion
• We emphasize the necessity of representation change for meta-learning through cross-domain adaptation experiments.
• We propose a simple but effective meta-learning algorithm that updates the Body (extractor) of the model Only in the Inner Loop (BOIL). We empirically show that BOIL improves performance on all benchmark datasets, and that the improvement is particularly noticeable on fine-grained datasets and in cross-domain adaptation.
• Using cosine similarity and CKA, we demonstrate that BOIL enjoys representation reuse in the low-/mid-level body and representation change in the high-level body, and we empirically analyze the effectiveness of BOIL's body through an ablation study on the learning layers.
• For ResNet architectures, we propose a disconnection trick that removes the back-propagation path through the last skip connection, strengthening representation change in the high-level body.
• 44. References
• Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning, pp. 1126–1135. JMLR.org, 2017.
• Aniruddh Raghu, Maithra Raghu, Samy Bengio, and Oriol Vinyals. Rapid learning or feature reuse? Towards understanding the effectiveness of MAML. In International Conference on Learning Representations, 2020.
• Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Daan Wierstra, et al. Matching networks for one shot learning. In Advances in Neural Information Processing Systems, pp. 3630–3638, 2016.
• Boris Oreshkin, Pau Rodríguez López, and Alexandre Lacoste. TADAM: Task dependent adaptive metric for improved few-shot learning. In Advances in Neural Information Processing Systems, pp. 721–731, 2018.
• Simon Kornblith, Mohammad Norouzi, Honglak Lee, and Geoffrey Hinton. Similarity of neural network representations revisited. arXiv preprint arXiv:1905.00414, 2019.
• Leland McInnes, John Healy, Nathaniel Saul, and Lukas Großberger. UMAP: Uniform manifold approximation and projection. Journal of Open Source Software, 3(29), 2018.
• 45. Appendix: Model Implementations and Datasets
• Model implementations (a sketch follows below)
• 4-conv network (Vinyals et al., 2016): 4 conv modules [3×3 conv layer (64 filters), BN, ReLU, 2×2 max-pool].
• ResNet-12 (Oreshkin et al., 2018): 4 residual blocks [3 conv modules, BN, Leaky ReLU].
• Datasets

| Dataset        | Source                      | Image size | Fineness     | # meta-training classes | # meta-validation classes | # meta-testing classes | Split setting            |
| miniImageNet   | Russakovsky et al. (2015)   | 84×84      | Coarse-grain | 64                      | 16                        | 20                     | Vinyals et al. (2016)    |
| tieredImageNet | Russakovsky et al. (2015)   | 84×84      | Coarse-grain | 351                     | 97                        | 160                    | Ren et al. (2018)        |
| Cars           | Krause et al. (2013)        | 84×84      | Fine-grain   | 98                      | 49                        | 49                     | Tseng et al. (2020)      |
| CUB            | Welinder et al. (2010)      | 84×84      | Fine-grain   | 100                     | 50                        | 50                     | Hilliard et al. (2018)   |
| FC100          | Krizhevsky et al. (2009)    | 32×32      | Coarse-grain | 60                      | 20                        | 20                     | Bertinetto et al. (2018) |
| CIFAR-FS       | Krizhevsky et al. (2009)    | 32×32      | Coarse-grain | 64                      | 16                        | 20                     | Oreshkin et al. (2018)   |
| VGG-Flower     | Nilsback & Zisserman (2008) | 32×32      | Fine-grain   | 71                      | 16                        | 15                     | Na et al. (2019)         |
| Aircraft       | Maji et al. (2013)          | 32×32      | Fine-grain   | 70                      | 15                        | 15                     | Na et al. (2019)         |
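A sketch of the 4-conv backbone following the module description above (3×3 conv with 64 filters, BN, ReLU, 2×2 max-pool); the linear-head dimensions assume 84×84 inputs and are illustrative.

```python
import torch.nn as nn

def conv_module(in_ch, out_ch=64):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

class ConvNet4(nn.Module):
    def __init__(self, n_way=5):
        super().__init__()
        self.body = nn.Sequential(conv_module(3), conv_module(64),
                                  conv_module(64), conv_module(64))
        # 84x84 input -> 5x5x64 feature map after four 2x2 max-pools
        self.head = nn.Linear(64 * 5 * 5, n_way)

    def forward(self, x):
        return self.head(self.body(x).flatten(1))
```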
• 46. Appendix: Visualization with UMAP
• UMAP visualization of the benchmark datasets (training domain → test domain): mini-ImageNet → mini-ImageNet, CUB → Cars, tieredImageNet → miniImageNet.
• 47. Appendix: Representation Change via the Disconnection Trick
(a) Cosine similarity in block4 (the last block) of ResNet-12, BOIL with LSC.
(b) Cosine similarity in block4 (the last block) of ResNet-12, BOIL without LSC.