SlideShare a Scribd company logo
1 of 34
Download to read offline
Meta-GMVAE: Mixture of Gaussian VAEs for
Unsupervised Meta-Learning
Dong Bok Lee1, Dongchan Min1, Seanie Lee1, and Sung Ju Hwang1,2
KAIST1, AITRICS2, Seoul
Introduction
Unsupervised learning aims to learn meaningful representations from unlabeled data
that can be transferred to down-stream tasks.
Image
Unsupervised
Learning
Image recognition
…
Image segmentation
Representation
Introduction
Meta-learning shares the spirit of unsupervised learning in that they seek to learn more
effective learning procedure than learning from scratch.
Unsupervised Learning
Image recognition
…
Image segmentation
Representation
Meta-Learning
Task1: Tiger vs Cat
…
Task2: Car vs Bicycle
Model
…
…
Introduction
The fundamental difference of the two is that the most meta-learning approaches are
supervised, assuming full access to the labels.
Prototypical Network [1]
[1] Jake Snell, Kevin Swersky, and Richard S. Zemel: “Prototypical Networks for Few-shot learning”, NeuRIPS 2017
[2] Chelsea Finn, Pieter Abbeel, and Sergey Levine: “Model-Agnostic Meta-learning for Fast Adaptation of Deep Networks”, ICML 2018
MAML [2]
Supervision
Introduction
Due to the assumption of supervision, existing meta-learning methods have limitation:
they requires massive amounts of human efforts on labeling.
Labeling
Cat Dog
Car
Introduction
To overcome this, two recent works have proposed unsupervised meta-learning:
Unlabeled dataset
Unsupervised Meta-training Supervised Meta-test
Task 1
Support Set
Task N
Support Set
…
Query Set
Query Set
Introduction
They focus on constructing supervised meta-training dataset by pseudo labeling with
clustering and augmentation.
Unlabeled dataset
Unsupervised Meta-training Supervised Meta-test
Task 1
Support Set
Task N
Support Set
…
Query Set
Query Set
Pseudo Labeling
Method
In this work, we have focused on developing a principled unsupervised meta-learning
method, namely Meta-GMVAE.
The main idea is to bridge the gap between the process of unsupervised meta-training
and that of supervised meta-test.
Method (Unsupervised Meta-training)
Specifically, we start from Variational Autoencoder [5], where its prior is modelled with
Gaussian Mixtures.
The assumption is that each modality can represent a label at meta-test.
Its generative process is as follows:
1.
2.
3.
[5] Diederak P. Kingma, and Max Welling: “Auto-encoding Varitional Bayes”, ICLR 2013
A graphical illustration of GMVAE
Method (Unsupervised Meta-training)
However, the difference from the previous work on GMVAE [6] is that they fix the prior
parameters since they target a single task learning.
[6] Nat Dilokthanakul et al: “Deep unsupervised clustering with Gaussian Mixture Variational Autoencoder”, arXiv 2016
1.
2.
3.
A graphical illustration of GMVAE
This part is fixed
Method (Unsupervised Meta-training)
To learn the set-dependent multi-modalities, we assume that there exists a parameter
for each episode dataset which is randomly drawn.
A graphical illustration of Meta-GMVAE
A graphical illustration of GMVAE
Method (Unsupervised Meta-training)
Then we derive the following variational lower bound for the marginal log-likelihood:
Method (Unsupervised Meta-training)
Here we use i.i.d assumption on the data log-likelihood:
where the number of datapoints is 𝑀.
Method (Unsupervised Meta-training)
Then we introduce Gaussian Mixture prior and set-dependent variational posterior:
where z is the latent variable.
Method (Unsupervised Meta-training)
Here we model set-dependent posterior to encode each data within the given dataset
into the latent space.
Specifically, this is implemented by TransfomerEncoder proposed by Vaswani et al. [7].
[7] Ashish Vaswani et al: “Attention is All You Need”, NeuRIPS 2017
Method (Unsupervised Meta-training)
Then we derive the variational lower bound as follows:
Method (Unsupervised Meta-training)
Finally, we estimate the variational lowerbound with Monte Carlo estimation.
where N is the size of Monte Carlo samples.
Method (Unsupervised Meta-training)
In our setting, the prior parameter characterizes the given dataset.
To obtain the parameter that optimally explain the given dataset, we propose to locally
maximize the lowerbound as follows:
where
This leads to the MLE solution of the prior distribution.
Method (Unsupervised Meta-training)
However, we do not have an analytic MLE solution of GMM.
To this end, we propose to obtain optimal using EM algorithm as follows:
where we tie the covariance matrix with identity matrix.
Method (Unsupervised Meta-training)
Then the training objective of Meta-GMVAE is as follows:
where is obtained by performing the EM algorithm on the latent MC samples.
Method (Supervised Meta-test)
However, there is no guarantee that each modality obtained by EM algorithm
corresponds to the label.
To overcome this, we use support set as observations of EM algorithm.
A graphical illustration of
predicting labels
A graphical illustration of Meta-GMVAE
Method (Supervised Meta-test)
Then, we perform semi-supervised EM to obtain prior parameters as follows:
where the support set is considered as labeled data.
Method (Supervised Meta-test)
Finally, we predict labels of query set using the aggregated posterior as follows:
where we reuse the Monte Carlo samples.
Method (SimCLR)
The ability to generate samples may not be necessary for discriminative tasks.
Moreover, it is known to be challenging to train generative models on real data.
Therefore, we propose to use features pretrained by SimCLR [8] as an input.
[8] Ting Chen et al: “A Simple Framework for Constrastive Learning of Visual Representations”, ICML 2020
A graphical illustration of SimCLR Image recognition accuracy on ImageNet
Experimental Setups (Dataset)
We ran experiment using two benchmark datasets:
1) Omniglot dataset
28 x 28 resolution
grayscale
1200 meta-training classes
323 meta-test classes
Experimental Setups (Dataset)
We ran experiment using two benchmark datasets:
2) Mini-ImageNet dataset
84 x 84 resolution
RGB
64 meta-training classes
20 meta-test classes
Experimental Setups (Baselines)
We compare our Meta-GMVAE with four baselines as follows:
1) Prototypical Networks (oracle) [1]: metric-based meta-learning with supervision.
2) MAML (oracle) [2]: gradient-based meta-learning with supervision.
3) CACTUS [3]: constructing pseudo tasks by clustering on deep embedding space.
4) UMTRA [4]: constructing pseudo tasks using augmentation.
[1] Jake Snell, Kevin Swersky, and Richard S. Zemel: “Prototypical Networks for Few-shot learning”, NeuRIPS 2017
[2] Chelsea Finn, Pieter Abbeel, and Sergey Levine: “Model-Agnostic Meta-learning for Fast Adaptation of Deep Networks”, ICML 2018
[3] Kyle Hsu, Sergey Levine, and Chelsea Finn: “Unsupervised Learning vis Meta-Learning”, ICLR 2018
[4] Siavash Khodadadeh, Ladislau Boloni, and Mubarak Shaloni: “Unsupervised Meta-Learning for Few-shot Classification”, NeuRIPS 2019
Experimental Results (Few-shot Classification)
Meta-GMVAE obtains better performance than existing unsupervised methods in 5
settings and match the performance in 3 settings.
The few-shot classification results (way, shot) on the Omiglot and Mini-ImageNet Datasets
Experimental Results (Few-shot Classification)
Interestingly, Meta-GMVAE even obtains better performance than MAML on a certain
setting (i.e., (5,1) on Omniglot), while utilizing as small as 0.1% of labels.
The few-shot classification results (way, shot) on the Omiglot and Mini-ImageNet Datasets
Experimental Results (Visualization)
The below figures show how Meta-GMVAE learns and realizes class-concept.
At meta-train, Meta-GMVAE captures the similar visual structure but not class-concept.
However, it easily realizes the class-concept at meta-test.
The samples obtained and generated for each mode of Meta-GMVAE
Experimental Results (Ablation Study)
We conduct ablation study on Meta-GMVAE by eliminating each component.
The below shows that all the components are critical to the performance.
The results of ablation study on Meta-GMVAE
Experimental Results (Cross-way)
In more realistic settings, we can not know the way at meta-test time.
Meta-GMVAE also shows its robustness on the cross-way classification experiments.
The results of cross-way 1-shot experiments on Omniglot dataset
Experimental Results (Cross-way Visualization)
We further visualize the latent space for the cross-shot generalization experiment.
The below figure shows that Meta-GMVAE trained 20-way can cluster 5-way task.
The visualization of the latent space
Conclusion
1. We propose a principled meta-learning model, namely Meta-GMVAE, which meta-
learns the set-conditioned prior and posterior network for a VAE.
2. We propose to learn the multi-modal structure of a given dataset with the Gaussian
mixture prior, such that it can adapt to a novel dataset via the EM algorithm.
3. We show that Meta-GMVAE largely outperforms relevant unsupervised meta-
learning baselines on two benchmark datasets, while obtaining even better
performance than a supervised meta-learning model under a specific setting.

More Related Content

What's hot

Reinforcement learning
Reinforcement learningReinforcement learning
Reinforcement learningDongHyun Kwak
 
Meta-learning and the ELBO
Meta-learning and the ELBOMeta-learning and the ELBO
Meta-learning and the ELBOYoonho Lee
 
머피의 머신러닝 13 Sparse Linear Model
머피의 머신러닝 13 Sparse Linear Model머피의 머신러닝 13 Sparse Linear Model
머피의 머신러닝 13 Sparse Linear ModelJungkyu Lee
 
On First-Order Meta-Learning Algorithms
On First-Order Meta-Learning AlgorithmsOn First-Order Meta-Learning Algorithms
On First-Order Meta-Learning AlgorithmsYoonho Lee
 
Intro to Neural Networks
Intro to Neural NetworksIntro to Neural Networks
Intro to Neural NetworksDean Wyatte
 
Expectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture ModelsExpectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture Modelspetitegeek
 
OK08-klasifikasi-dgn-naive-bayes.ppt
OK08-klasifikasi-dgn-naive-bayes.pptOK08-klasifikasi-dgn-naive-bayes.ppt
OK08-klasifikasi-dgn-naive-bayes.pptIndra Hermawan
 
Beginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix FactorizationBeginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix FactorizationBenjamin Bengfort
 
[GomGuard] 뉴런부터 YOLO 까지 - 딥러닝 전반에 대한 이야기
[GomGuard] 뉴런부터 YOLO 까지 - 딥러닝 전반에 대한 이야기[GomGuard] 뉴런부터 YOLO 까지 - 딥러닝 전반에 대한 이야기
[GomGuard] 뉴런부터 YOLO 까지 - 딥러닝 전반에 대한 이야기JungHyun Hong
 
PRML輪読#7
PRML輪読#7PRML輪読#7
PRML輪読#7matsuolab
 
UNIT 1 Machine Learning [KCS-055] (1).pptx
UNIT 1 Machine Learning [KCS-055] (1).pptxUNIT 1 Machine Learning [KCS-055] (1).pptx
UNIT 1 Machine Learning [KCS-055] (1).pptxRohanPathak30
 
Meta learning with memory augmented neural network
Meta learning with memory augmented neural networkMeta learning with memory augmented neural network
Meta learning with memory augmented neural networkKaty Lee
 
Exploring Simple Siamese Representation Learning
Exploring Simple Siamese Representation LearningExploring Simple Siamese Representation Learning
Exploring Simple Siamese Representation LearningSungchul Kim
 
Safe Reinforcement Learning
Safe Reinforcement LearningSafe Reinforcement Learning
Safe Reinforcement LearningDongmin Lee
 
Latent factor models for Collaborative Filtering
Latent factor models for Collaborative FilteringLatent factor models for Collaborative Filtering
Latent factor models for Collaborative Filteringsscdotopen
 

What's hot (20)

Reinforcement learning
Reinforcement learningReinforcement learning
Reinforcement learning
 
Meta-learning and the ELBO
Meta-learning and the ELBOMeta-learning and the ELBO
Meta-learning and the ELBO
 
머피의 머신러닝 13 Sparse Linear Model
머피의 머신러닝 13 Sparse Linear Model머피의 머신러닝 13 Sparse Linear Model
머피의 머신러닝 13 Sparse Linear Model
 
APRIORI ALGORITHM -PPT.pptx
APRIORI ALGORITHM -PPT.pptxAPRIORI ALGORITHM -PPT.pptx
APRIORI ALGORITHM -PPT.pptx
 
On First-Order Meta-Learning Algorithms
On First-Order Meta-Learning AlgorithmsOn First-Order Meta-Learning Algorithms
On First-Order Meta-Learning Algorithms
 
Fuzzy Clustering(C-means, K-means)
Fuzzy Clustering(C-means, K-means)Fuzzy Clustering(C-means, K-means)
Fuzzy Clustering(C-means, K-means)
 
Intro to Neural Networks
Intro to Neural NetworksIntro to Neural Networks
Intro to Neural Networks
 
Expectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture ModelsExpectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture Models
 
OK08-klasifikasi-dgn-naive-bayes.ppt
OK08-klasifikasi-dgn-naive-bayes.pptOK08-klasifikasi-dgn-naive-bayes.ppt
OK08-klasifikasi-dgn-naive-bayes.ppt
 
Beginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix FactorizationBeginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix Factorization
 
PRML chapter7
PRML chapter7PRML chapter7
PRML chapter7
 
[GomGuard] 뉴런부터 YOLO 까지 - 딥러닝 전반에 대한 이야기
[GomGuard] 뉴런부터 YOLO 까지 - 딥러닝 전반에 대한 이야기[GomGuard] 뉴런부터 YOLO 까지 - 딥러닝 전반에 대한 이야기
[GomGuard] 뉴런부터 YOLO 까지 - 딥러닝 전반에 대한 이야기
 
Matrix Factorization
Matrix FactorizationMatrix Factorization
Matrix Factorization
 
PRML輪読#7
PRML輪読#7PRML輪読#7
PRML輪読#7
 
UNIT 1 Machine Learning [KCS-055] (1).pptx
UNIT 1 Machine Learning [KCS-055] (1).pptxUNIT 1 Machine Learning [KCS-055] (1).pptx
UNIT 1 Machine Learning [KCS-055] (1).pptx
 
Xgboost
XgboostXgboost
Xgboost
 
Meta learning with memory augmented neural network
Meta learning with memory augmented neural networkMeta learning with memory augmented neural network
Meta learning with memory augmented neural network
 
Exploring Simple Siamese Representation Learning
Exploring Simple Siamese Representation LearningExploring Simple Siamese Representation Learning
Exploring Simple Siamese Representation Learning
 
Safe Reinforcement Learning
Safe Reinforcement LearningSafe Reinforcement Learning
Safe Reinforcement Learning
 
Latent factor models for Collaborative Filtering
Latent factor models for Collaborative FilteringLatent factor models for Collaborative Filtering
Latent factor models for Collaborative Filtering
 

Similar to Meta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-Learning

Learning with Relative Attributes
Learning with Relative AttributesLearning with Relative Attributes
Learning with Relative AttributesVikas Jain
 
Evaluation of a hybrid method for constructing multiple SVM kernels
Evaluation of a hybrid method for constructing multiple SVM kernelsEvaluation of a hybrid method for constructing multiple SVM kernels
Evaluation of a hybrid method for constructing multiple SVM kernelsinfopapers
 
Meta Dropout: Learning to Perturb Latent Features for Generalization
Meta Dropout: Learning to Perturb Latent Features for Generalization Meta Dropout: Learning to Perturb Latent Features for Generalization
Meta Dropout: Learning to Perturb Latent Features for Generalization MLAI2
 
A scalable collaborative filtering framework based on co-clustering
A scalable collaborative filtering framework based on co-clusteringA scalable collaborative filtering framework based on co-clustering
A scalable collaborative filtering framework based on co-clusteringlau
 
TWCC22_PPT_v3_KL.pdf quantum computer, quantum computer , quantum computer ,...
TWCC22_PPT_v3_KL.pdf  quantum computer, quantum computer , quantum computer ,...TWCC22_PPT_v3_KL.pdf  quantum computer, quantum computer , quantum computer ,...
TWCC22_PPT_v3_KL.pdf quantum computer, quantum computer , quantum computer ,...Kuan-Tsae Huang
 
A simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representationsA simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representationsDevansh16
 
A Low Rank Mechanism to Detect and Achieve Partially Completed Image Tags
A Low Rank Mechanism to Detect and Achieve Partially Completed Image TagsA Low Rank Mechanism to Detect and Achieve Partially Completed Image Tags
A Low Rank Mechanism to Detect and Achieve Partially Completed Image TagsIRJET Journal
 
Learning to rank image tags with limited training examples
Learning to rank image tags with limited training examplesLearning to rank image tags with limited training examples
Learning to rank image tags with limited training examplesCloudTechnologies
 
Review : Prototype Mixture Models for Few-shot Semantic Segmentation
Review : Prototype Mixture Models for Few-shot Semantic SegmentationReview : Prototype Mixture Models for Few-shot Semantic Segmentation
Review : Prototype Mixture Models for Few-shot Semantic SegmentationDongmin Choi
 
Convolutional auto-encoded extreme learning machine for incremental learning ...
Convolutional auto-encoded extreme learning machine for incremental learning ...Convolutional auto-encoded extreme learning machine for incremental learning ...
Convolutional auto-encoded extreme learning machine for incremental learning ...IJECEIAES
 
PaperReview_ “Few-shot Graph Classification with Contrastive Loss and Meta-cl...
PaperReview_ “Few-shot Graph Classification with Contrastive Loss and Meta-cl...PaperReview_ “Few-shot Graph Classification with Contrastive Loss and Meta-cl...
PaperReview_ “Few-shot Graph Classification with Contrastive Loss and Meta-cl...AkankshaRawat53
 
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...MLAI2
 
LEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLES - IEEE PROJECTS I...
LEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLES  - IEEE PROJECTS I...LEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLES  - IEEE PROJECTS I...
LEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLES - IEEE PROJECTS I...Nexgen Technology
 
Learning to rank image tags with limited
Learning to rank image tags with limitedLearning to rank image tags with limited
Learning to rank image tags with limitednexgentech15
 
LEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLES
LEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLESLEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLES
LEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLESnexgentechnology
 
IEEE 2015 Matlab Projects
IEEE 2015 Matlab ProjectsIEEE 2015 Matlab Projects
IEEE 2015 Matlab ProjectsVijay Karan
 
Segmentation of Images by using Fuzzy k-means clustering with ACO
Segmentation of Images by using Fuzzy k-means clustering with ACOSegmentation of Images by using Fuzzy k-means clustering with ACO
Segmentation of Images by using Fuzzy k-means clustering with ACOIJTET Journal
 
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...CSCJournals
 

Similar to Meta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-Learning (20)

Learning with Relative Attributes
Learning with Relative AttributesLearning with Relative Attributes
Learning with Relative Attributes
 
Evaluation of a hybrid method for constructing multiple SVM kernels
Evaluation of a hybrid method for constructing multiple SVM kernelsEvaluation of a hybrid method for constructing multiple SVM kernels
Evaluation of a hybrid method for constructing multiple SVM kernels
 
Meta Dropout: Learning to Perturb Latent Features for Generalization
Meta Dropout: Learning to Perturb Latent Features for Generalization Meta Dropout: Learning to Perturb Latent Features for Generalization
Meta Dropout: Learning to Perturb Latent Features for Generalization
 
A scalable collaborative filtering framework based on co-clustering
A scalable collaborative filtering framework based on co-clusteringA scalable collaborative filtering framework based on co-clustering
A scalable collaborative filtering framework based on co-clustering
 
TWCC22_PPT_v3_KL.pdf quantum computer, quantum computer , quantum computer ,...
TWCC22_PPT_v3_KL.pdf  quantum computer, quantum computer , quantum computer ,...TWCC22_PPT_v3_KL.pdf  quantum computer, quantum computer , quantum computer ,...
TWCC22_PPT_v3_KL.pdf quantum computer, quantum computer , quantum computer ,...
 
A simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representationsA simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representations
 
A Low Rank Mechanism to Detect and Achieve Partially Completed Image Tags
A Low Rank Mechanism to Detect and Achieve Partially Completed Image TagsA Low Rank Mechanism to Detect and Achieve Partially Completed Image Tags
A Low Rank Mechanism to Detect and Achieve Partially Completed Image Tags
 
Deep learning-practical
Deep learning-practicalDeep learning-practical
Deep learning-practical
 
Learning to rank image tags with limited training examples
Learning to rank image tags with limited training examplesLearning to rank image tags with limited training examples
Learning to rank image tags with limited training examples
 
Review : Prototype Mixture Models for Few-shot Semantic Segmentation
Review : Prototype Mixture Models for Few-shot Semantic SegmentationReview : Prototype Mixture Models for Few-shot Semantic Segmentation
Review : Prototype Mixture Models for Few-shot Semantic Segmentation
 
Convolutional auto-encoded extreme learning machine for incremental learning ...
Convolutional auto-encoded extreme learning machine for incremental learning ...Convolutional auto-encoded extreme learning machine for incremental learning ...
Convolutional auto-encoded extreme learning machine for incremental learning ...
 
SVM & MLP on Matlab program
 SVM & MLP on Matlab program  SVM & MLP on Matlab program
SVM & MLP on Matlab program
 
PaperReview_ “Few-shot Graph Classification with Contrastive Loss and Meta-cl...
PaperReview_ “Few-shot Graph Classification with Contrastive Loss and Meta-cl...PaperReview_ “Few-shot Graph Classification with Contrastive Loss and Meta-cl...
PaperReview_ “Few-shot Graph Classification with Contrastive Loss and Meta-cl...
 
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
 
LEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLES - IEEE PROJECTS I...
LEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLES  - IEEE PROJECTS I...LEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLES  - IEEE PROJECTS I...
LEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLES - IEEE PROJECTS I...
 
Learning to rank image tags with limited
Learning to rank image tags with limitedLearning to rank image tags with limited
Learning to rank image tags with limited
 
LEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLES
LEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLESLEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLES
LEARNING TO RANK IMAGE TAGS WITH LIMITED TRAINING EXAMPLES
 
IEEE 2015 Matlab Projects
IEEE 2015 Matlab ProjectsIEEE 2015 Matlab Projects
IEEE 2015 Matlab Projects
 
Segmentation of Images by using Fuzzy k-means clustering with ACO
Segmentation of Images by using Fuzzy k-means clustering with ACOSegmentation of Images by using Fuzzy k-means clustering with ACO
Segmentation of Images by using Fuzzy k-means clustering with ACO
 
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...
 

More from MLAI2

Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Unce...
Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Unce...Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Unce...
Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Unce...MLAI2
 
Online Hyperparameter Meta-Learning with Hypergradient Distillation
Online Hyperparameter Meta-Learning with Hypergradient DistillationOnline Hyperparameter Meta-Learning with Hypergradient Distillation
Online Hyperparameter Meta-Learning with Hypergradient DistillationMLAI2
 
Online Coreset Selection for Rehearsal-based Continual Learning
Online Coreset Selection for Rehearsal-based Continual LearningOnline Coreset Selection for Rehearsal-based Continual Learning
Online Coreset Selection for Rehearsal-based Continual LearningMLAI2
 
Representational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual LearningRepresentational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual LearningMLAI2
 
Sequential Reptile_Inter-Task Gradient Alignment for Multilingual Learning
Sequential Reptile_Inter-Task Gradient Alignment for Multilingual LearningSequential Reptile_Inter-Task Gradient Alignment for Multilingual Learning
Sequential Reptile_Inter-Task Gradient Alignment for Multilingual LearningMLAI2
 
Skill-Based Meta-Reinforcement Learning
Skill-Based Meta-Reinforcement LearningSkill-Based Meta-Reinforcement Learning
Skill-Based Meta-Reinforcement LearningMLAI2
 
Edge Representation Learning with Hypergraphs
Edge Representation Learning with HypergraphsEdge Representation Learning with Hypergraphs
Edge Representation Learning with HypergraphsMLAI2
 
Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Genera...
Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Genera...Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Genera...
Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Genera...MLAI2
 
Mini-Batch Consistent Slot Set Encoder For Scalable Set Encoding
Mini-Batch Consistent Slot Set Encoder For Scalable Set EncodingMini-Batch Consistent Slot Set Encoder For Scalable Set Encoding
Mini-Batch Consistent Slot Set Encoder For Scalable Set EncodingMLAI2
 
Task Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningTask Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningMLAI2
 
Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint L...
Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint L...Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint L...
Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint L...MLAI2
 
Accurate Learning of Graph Representations with Graph Multiset Pooling
Accurate Learning of Graph Representations with Graph Multiset PoolingAccurate Learning of Graph Representations with Graph Multiset Pooling
Accurate Learning of Graph Representations with Graph Multiset PoolingMLAI2
 
Contrastive Learning with Adversarial Perturbations for Conditional Text Gene...
Contrastive Learning with Adversarial Perturbations for Conditional Text Gene...Contrastive Learning with Adversarial Perturbations for Conditional Text Gene...
Contrastive Learning with Adversarial Perturbations for Conditional Text Gene...MLAI2
 
Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Le...
Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Le...Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Le...
Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Le...MLAI2
 
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and ArchitecturesMetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and ArchitecturesMLAI2
 
Adversarial Self-Supervised Contrastive Learning
Adversarial Self-Supervised Contrastive LearningAdversarial Self-Supervised Contrastive Learning
Adversarial Self-Supervised Contrastive LearningMLAI2
 
Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...
Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...
Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...MLAI2
 
Neural Mask Generator : Learning to Generate Adaptive Word Maskings for Langu...
Neural Mask Generator : Learning to Generate Adaptive WordMaskings for Langu...Neural Mask Generator : Learning to Generate Adaptive WordMaskings for Langu...
Neural Mask Generator : Learning to Generate Adaptive Word Maskings for Langu...MLAI2
 
Cost-effective Interactive Attention Learning with Neural Attention Process
Cost-effective Interactive Attention Learning with Neural Attention ProcessCost-effective Interactive Attention Learning with Neural Attention Process
Cost-effective Interactive Attention Learning with Neural Attention ProcessMLAI2
 
Adversarial Neural Pruning with Latent Vulnerability Suppression
Adversarial Neural Pruning with Latent Vulnerability SuppressionAdversarial Neural Pruning with Latent Vulnerability Suppression
Adversarial Neural Pruning with Latent Vulnerability SuppressionMLAI2
 

More from MLAI2 (20)

Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Unce...
Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Unce...Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Unce...
Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Unce...
 
Online Hyperparameter Meta-Learning with Hypergradient Distillation
Online Hyperparameter Meta-Learning with Hypergradient DistillationOnline Hyperparameter Meta-Learning with Hypergradient Distillation
Online Hyperparameter Meta-Learning with Hypergradient Distillation
 
Online Coreset Selection for Rehearsal-based Continual Learning
Online Coreset Selection for Rehearsal-based Continual LearningOnline Coreset Selection for Rehearsal-based Continual Learning
Online Coreset Selection for Rehearsal-based Continual Learning
 
Representational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual LearningRepresentational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual Learning
 
Sequential Reptile_Inter-Task Gradient Alignment for Multilingual Learning
Sequential Reptile_Inter-Task Gradient Alignment for Multilingual LearningSequential Reptile_Inter-Task Gradient Alignment for Multilingual Learning
Sequential Reptile_Inter-Task Gradient Alignment for Multilingual Learning
 
Skill-Based Meta-Reinforcement Learning
Skill-Based Meta-Reinforcement LearningSkill-Based Meta-Reinforcement Learning
Skill-Based Meta-Reinforcement Learning
 
Edge Representation Learning with Hypergraphs
Edge Representation Learning with HypergraphsEdge Representation Learning with Hypergraphs
Edge Representation Learning with Hypergraphs
 
Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Genera...
Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Genera...Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Genera...
Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Genera...
 
Mini-Batch Consistent Slot Set Encoder For Scalable Set Encoding
Mini-Batch Consistent Slot Set Encoder For Scalable Set EncodingMini-Batch Consistent Slot Set Encoder For Scalable Set Encoding
Mini-Batch Consistent Slot Set Encoder For Scalable Set Encoding
 
Task Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningTask Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive Learning
 
Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint L...
Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint L...Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint L...
Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint L...
 
Accurate Learning of Graph Representations with Graph Multiset Pooling
Accurate Learning of Graph Representations with Graph Multiset PoolingAccurate Learning of Graph Representations with Graph Multiset Pooling
Accurate Learning of Graph Representations with Graph Multiset Pooling
 
Contrastive Learning with Adversarial Perturbations for Conditional Text Gene...
Contrastive Learning with Adversarial Perturbations for Conditional Text Gene...Contrastive Learning with Adversarial Perturbations for Conditional Text Gene...
Contrastive Learning with Adversarial Perturbations for Conditional Text Gene...
 
Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Le...
Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Le...Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Le...
Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Le...
 
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and ArchitecturesMetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
 
Adversarial Self-Supervised Contrastive Learning
Adversarial Self-Supervised Contrastive LearningAdversarial Self-Supervised Contrastive Learning
Adversarial Self-Supervised Contrastive Learning
 
Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...
Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...
Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...
 
Neural Mask Generator : Learning to Generate Adaptive Word Maskings for Langu...
Neural Mask Generator : Learning to Generate Adaptive WordMaskings for Langu...Neural Mask Generator : Learning to Generate Adaptive WordMaskings for Langu...
Neural Mask Generator : Learning to Generate Adaptive Word Maskings for Langu...
 
Cost-effective Interactive Attention Learning with Neural Attention Process
Cost-effective Interactive Attention Learning with Neural Attention ProcessCost-effective Interactive Attention Learning with Neural Attention Process
Cost-effective Interactive Attention Learning with Neural Attention Process
 
Adversarial Neural Pruning with Latent Vulnerability Suppression
Adversarial Neural Pruning with Latent Vulnerability SuppressionAdversarial Neural Pruning with Latent Vulnerability Suppression
Adversarial Neural Pruning with Latent Vulnerability Suppression
 

Recently uploaded

Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 

Recently uploaded (20)

Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 

Meta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-Learning

  • 1. Meta-GMVAE: Mixture of Gaussian VAEs for Unsupervised Meta-Learning Dong Bok Lee1, Dongchan Min1, Seanie Lee1, and Sung Ju Hwang1,2 KAIST1, AITRICS2, Seoul
  • 2. Introduction Unsupervised learning aims to learn meaningful representations from unlabeled data that can be transferred to down-stream tasks. Image Unsupervised Learning Image recognition … Image segmentation Representation
  • 3. Introduction Meta-learning shares the spirit of unsupervised learning in that they seek to learn more effective learning procedure than learning from scratch. Unsupervised Learning Image recognition … Image segmentation Representation Meta-Learning Task1: Tiger vs Cat … Task2: Car vs Bicycle Model … …
  • 4. Introduction The fundamental difference of the two is that the most meta-learning approaches are supervised, assuming full access to the labels. Prototypical Network [1] [1] Jake Snell, Kevin Swersky, and Richard S. Zemel: “Prototypical Networks for Few-shot learning”, NeuRIPS 2017 [2] Chelsea Finn, Pieter Abbeel, and Sergey Levine: “Model-Agnostic Meta-learning for Fast Adaptation of Deep Networks”, ICML 2018 MAML [2] Supervision
  • 5. Introduction Due to the assumption of supervision, existing meta-learning methods have limitation: they requires massive amounts of human efforts on labeling. Labeling Cat Dog Car
  • 6. Introduction To overcome this, two recent works have proposed unsupervised meta-learning: Unlabeled dataset Unsupervised Meta-training Supervised Meta-test Task 1 Support Set Task N Support Set … Query Set Query Set
  • 7. Introduction They focus on constructing supervised meta-training dataset by pseudo labeling with clustering and augmentation. Unlabeled dataset Unsupervised Meta-training Supervised Meta-test Task 1 Support Set Task N Support Set … Query Set Query Set Pseudo Labeling
  • 8. Method In this work, we have focused on developing a principled unsupervised meta-learning method, namely Meta-GMVAE. The main idea is to bridge the gap between the process of unsupervised meta-training and that of supervised meta-test.
  • 9. Method (Unsupervised Meta-training) Specifically, we start from Variational Autoencoder [5], where its prior is modelled with Gaussian Mixtures. The assumption is that each modality can represent a label at meta-test. Its generative process is as follows: 1. 2. 3. [5] Diederak P. Kingma, and Max Welling: “Auto-encoding Varitional Bayes”, ICLR 2013 A graphical illustration of GMVAE
  • 10. Method (Unsupervised Meta-training) However, the difference from the previous work on GMVAE [6] is that they fix the prior parameters since they target a single task learning. [6] Nat Dilokthanakul et al: “Deep unsupervised clustering with Gaussian Mixture Variational Autoencoder”, arXiv 2016 1. 2. 3. A graphical illustration of GMVAE This part is fixed
  • 11. Method (Unsupervised Meta-training) To learn the set-dependent multi-modalities, we assume that there exists a parameter for each episode dataset which is randomly drawn. A graphical illustration of Meta-GMVAE A graphical illustration of GMVAE
  • 12. Method (Unsupervised Meta-training) Then we derive the following variational lower bound for the marginal log-likelihood:
  • 13. Method (Unsupervised Meta-training) Here we use i.i.d assumption on the data log-likelihood: where the number of datapoints is 𝑀.
  • 14. Method (Unsupervised Meta-training) Then we introduce Gaussian Mixture prior and set-dependent variational posterior: where z is the latent variable.
  • 15. Method (Unsupervised Meta-training) Here we model set-dependent posterior to encode each data within the given dataset into the latent space. Specifically, this is implemented by TransfomerEncoder proposed by Vaswani et al. [7]. [7] Ashish Vaswani et al: “Attention is All You Need”, NeuRIPS 2017
  • 16. Method (Unsupervised Meta-training) Then we derive the variational lower bound as follows:
  • 17. Method (Unsupervised Meta-training) Finally, we estimate the variational lowerbound with Monte Carlo estimation. where N is the size of Monte Carlo samples.
  • 18. Method (Unsupervised Meta-training) In our setting, the prior parameter characterizes the given dataset. To obtain the parameter that optimally explain the given dataset, we propose to locally maximize the lowerbound as follows: where This leads to the MLE solution of the prior distribution.
  • 19. Method (Unsupervised Meta-training) However, we do not have an analytic MLE solution of GMM. To this end, we propose to obtain optimal using EM algorithm as follows: where we tie the covariance matrix with identity matrix.
  • 20. Method (Unsupervised Meta-training) Then the training objective of Meta-GMVAE is as follows: where is obtained by performing the EM algorithm on the latent MC samples.
  • 21. Method (Supervised Meta-test) However, there is no guarantee that each modality obtained by EM algorithm corresponds to the label. To overcome this, we use support set as observations of EM algorithm. A graphical illustration of predicting labels A graphical illustration of Meta-GMVAE
  • 22. Method (Supervised Meta-test) Then, we perform semi-supervised EM to obtain prior parameters as follows: where the support set is considered as labeled data.
  • 23. Method (Supervised Meta-test) Finally, we predict labels of query set using the aggregated posterior as follows: where we reuse the Monte Carlo samples.
  • 24. Method (SimCLR) The ability to generate samples may not be necessary for discriminative tasks. Moreover, it is known to be challenging to train generative models on real data. Therefore, we propose to use features pretrained by SimCLR [8] as an input. [8] Ting Chen et al: “A Simple Framework for Constrastive Learning of Visual Representations”, ICML 2020 A graphical illustration of SimCLR Image recognition accuracy on ImageNet
  • 25. Experimental Setups (Dataset) We ran experiment using two benchmark datasets: 1) Omniglot dataset 28 x 28 resolution grayscale 1200 meta-training classes 323 meta-test classes
  • 26. Experimental Setups (Dataset) We ran experiment using two benchmark datasets: 2) Mini-ImageNet dataset 84 x 84 resolution RGB 64 meta-training classes 20 meta-test classes
  • 27. Experimental Setups (Baselines) We compare our Meta-GMVAE with four baselines as follows: 1) Prototypical Networks (oracle) [1]: metric-based meta-learning with supervision. 2) MAML (oracle) [2]: gradient-based meta-learning with supervision. 3) CACTUS [3]: constructing pseudo tasks by clustering on deep embedding space. 4) UMTRA [4]: constructing pseudo tasks using augmentation. [1] Jake Snell, Kevin Swersky, and Richard S. Zemel: “Prototypical Networks for Few-shot learning”, NeuRIPS 2017 [2] Chelsea Finn, Pieter Abbeel, and Sergey Levine: “Model-Agnostic Meta-learning for Fast Adaptation of Deep Networks”, ICML 2018 [3] Kyle Hsu, Sergey Levine, and Chelsea Finn: “Unsupervised Learning vis Meta-Learning”, ICLR 2018 [4] Siavash Khodadadeh, Ladislau Boloni, and Mubarak Shaloni: “Unsupervised Meta-Learning for Few-shot Classification”, NeuRIPS 2019
  • 28. Experimental Results (Few-shot Classification) Meta-GMVAE obtains better performance than existing unsupervised methods in 5 settings and match the performance in 3 settings. The few-shot classification results (way, shot) on the Omiglot and Mini-ImageNet Datasets
  • 29. Experimental Results (Few-shot Classification) Interestingly, Meta-GMVAE even obtains better performance than MAML on a certain setting (i.e., (5,1) on Omniglot), while utilizing as small as 0.1% of labels. The few-shot classification results (way, shot) on the Omiglot and Mini-ImageNet Datasets
  • 30. Experimental Results (Visualization) The below figures show how Meta-GMVAE learns and realizes class-concept. At meta-train, Meta-GMVAE captures the similar visual structure but not class-concept. However, it easily realizes the class-concept at meta-test. The samples obtained and generated for each mode of Meta-GMVAE
  • 31. Experimental Results (Ablation Study) We conduct ablation study on Meta-GMVAE by eliminating each component. The below shows that all the components are critical to the performance. The results of ablation study on Meta-GMVAE
  • 32. Experimental Results (Cross-way) In more realistic settings, we can not know the way at meta-test time. Meta-GMVAE also shows its robustness on the cross-way classification experiments. The results of cross-way 1-shot experiments on Omniglot dataset
  • 33. Experimental Results (Cross-way Visualization) We further visualize the latent space for the cross-shot generalization experiment. The below figure shows that Meta-GMVAE trained 20-way can cluster 5-way task. The visualization of the latent space
  • 34. Conclusion 1. We propose a principled meta-learning model, namely Meta-GMVAE, which meta- learns the set-conditioned prior and posterior network for a VAE. 2. We propose to learn the multi-modal structure of a given dataset with the Gaussian mixture prior, such that it can adapt to a novel dataset via the EM algorithm. 3. We show that Meta-GMVAE largely outperforms relevant unsupervised meta- learning baselines on two benchmark datasets, while obtaining even better performance than a supervised meta-learning model under a specific setting.