1. Few-shot learning – an overview
Hannes Fassold, JOANNEUM RESEARCH
2021-09-01
2. Introduction & Motivation
• Motivation
• State-of-the-art DL algorithms need a large annotated dataset for training
• Image classification: hundreds to thousands of images for each class
(ImageNet dataset has ~ 1,000 images for each class)
• Often not possible (or viable) to gather and annotate such a large dataset
• There may not be enough data samples for each object class
• E.g. for training a vision-based defect detector for industrial inspection:
Defects occur (naturally) only very rarely
• It may be too costly to manually annotate all data samples in a large dataset
• MS COCO dataset: 2.5 million human-labeled object instances
(bounding boxes, segmentation mask) in 328,000 images
3. Introduction & Motivation
• Variety of approaches to work around the problem of “data scarcity”
• Transfer learning
• Fine-tune a pretrained model (trained on e.g. ImageNet) on your data
• Synthetic training dataset generation
• E.g. for object detection: randomly overlay objects on real backgrounds
• Domain transfer / adaptation
• Transfer an existing available dataset from its original domain
to your domain (e.g. photo -> cartoon)
• Semi-supervised, self-supervised & unsupervised learning
• Additionally employ a larger unlabelled dataset as a support set
• Few-shot learning (this talk)
• Methods designed specifically to handle only a few samples per class
4. Few-shot learning - for image classification
• Task: Classify an image into one of N classes
• Using only a few (1 – 10) samples for each class
• Very active research field with a lot of progress
• Categorization of methods
• Data augmentation / ‘hallucination’ methods
• Generate more samples in various ways
from the few available ones
• Metric learning / embedding methods
• Embed sample (features) in a metric space
and do the classification in this space
• Meta-learning / optimization methods
• Quickly adapt an existing learner (classifier) in a few
“meta-learning” steps, so that it is able to classify the novel classes
5. Delta-Encoder
• Example for a data augmentation method [NeurIPS 2018]
• Utilizes a variant of an auto-encoder
• Encoder learns transferable deformations
between pairs of samples of the same class
• Decoder applies these deformations to synthesize novel
samples (in feature space) from a reference sample
Source: https://arxiv.org/pdf/1806.04734.pdf
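The core idea can be illustrated with a much-simplified sketch in feature space: the learned non-linear encoder/decoder pair is replaced here by a plain vector difference between two same-class embeddings, which is then re-applied to a one-shot reference embedding of a novel class. All values and dimensions are made up for illustration; the actual Delta-Encoder learns these deformations with trained networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feature vectors (stand-ins for CNN embeddings) of a *seen* class:
# two samples of the same class, differing by an intra-class deformation.
x_a = rng.normal(size=8)          # sample A of a seen class
x_b = x_a + np.full(8, 0.5)       # sample B: A plus some deformation

# The Delta-Encoder learns such deformations non-linearly; this linear
# sketch simply takes the raw difference as the "delta".
delta = x_b - x_a

# Reference embedding of a *novel* class (only one shot available).
ref = rng.normal(size=8)

# "Hallucinate" a new sample of the novel class (in feature space)
# by re-applying the deformation to the reference embedding.
synthesized = ref + delta
print(synthesized.shape)  # (8,)
```

The synthesized embeddings (many, from many deltas) are then used as extra training samples for a classifier over the novel classes.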
6. Prototypical networks
• Example for a metric learning method [NeurIPS 2017]
• Key idea
• Learn a non-linear mapping (via a neural network) of
the input data samples into an embedding (feature) space
• Simple NN composed of 4 blocks, each block is:
(convolution –> batchnorm –> relu –> maxpool)
• Each class has its ‘prototype’, calculated as the mean of
the embeddings of all its class samples in feature space
• Classification of a new query image is done simply by
finding its nearest class prototype in feature space
• The ordinary Euclidean distance is employed
Source: https://arxiv.org/abs/1703.05175
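The classification rule itself is simple enough to sketch directly. The toy 2-D "embeddings" below stand in for the output of the embedding CNN; class names and values are made up for illustration.

```python
import numpy as np

def classify_by_prototype(support, query):
    """support: dict class -> (n_shots, dim) embeddings; query: (dim,) embedding."""
    # Prototype of each class = mean of its support embeddings.
    prototypes = {c: emb.mean(axis=0) for c, emb in support.items()}
    # Assign the query to the class with the nearest prototype
    # (Euclidean distance in the embedding space).
    return min(prototypes, key=lambda c: np.linalg.norm(query - prototypes[c]))

# Toy 2-shot support set with 2-D embeddings (assumed CNN outputs).
support = {
    "cat": np.array([[0.9, 0.1], [1.1, -0.1]]),
    "dog": np.array([[-1.0, 0.0], [-0.8, 0.2]]),
}
print(classify_by_prototype(support, np.array([1.0, 0.05])))  # -> cat
```

Note that adding a novel class only requires computing one more prototype mean; no retraining of the embedding network is needed.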
7. Model-agnostic meta learning (MAML)
• Example for a meta-learning method [ICML 2017]
• Key idea
• Utilizes the observation that in a neural network typically
only a part of the network parameters (layers) is task-specific
• E.g. in a CNN, the lower (first) layers are usually very
general and higher (last) layers are more task-specific
• MAML optimizes (via meta-learning steps) for a network
representation which is able to quickly adapt (via
a few normal gradient steps) to the new image classes
• MAML is a very general approach
• Not only usable for few-shot (image) classification, but
also for regression, reinforcement learning etc.
Source: https://arxiv.org/pdf/1703.03400.pdf
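A minimal sketch of the inner/outer loop, using the first-order MAML approximation (second-order gradient terms dropped) on a toy family of 1-D regression tasks y = w·x. Learning rates and the task distribution are made-up illustration values, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def grad(theta, x, y):
    """Gradient of the mean squared error of the linear model y_hat = theta * x."""
    return np.mean(2 * (theta * x - y) * x)

theta = 0.0                # meta-initialization being learned
alpha, beta = 0.1, 0.05    # inner (task-adaptation) and outer (meta) learning rates

for step in range(200):
    w_task = rng.uniform(1.0, 3.0)   # sample a task: y = w_task * x
    x = rng.normal(size=10)
    y = w_task * x
    # Inner loop: one fast gradient step on the task's support set.
    theta_adapted = theta - alpha * grad(theta, x, y)
    # Outer loop: update the meta-parameters using the loss after adaptation
    # (first-order approximation: gradient taken at theta_adapted).
    theta = theta - beta * grad(theta_adapted, x, y)

# theta should settle near the middle of the task range [1, 3], i.e. an
# initialization from which one gradient step adapts quickly to any task.
print(round(float(theta), 1))
```

The same two-level loop applies unchanged to deep networks: only `grad` and the model are swapped for backpropagation through the actual architecture.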
8. Our research group activities in few-shot learning
• Few-shot object detection
• Detecting objects in an image (bounding box),
using only a few object samples for each class
• Our work in AI4Media
• Focuses on few-shot object detection serving use cases in
annotating incoming material in media production or for archiving
• The use case differs from the academic (benchmark) workflow in some ways:
• Number of samples per class may differ
• Typically, new classes are added iteratively
• Based on the method “Frustratingly Simple Few-Shot Object Detection”
• Working on a GUI for conveniently adding new object classes & samples
• Github repo at https://github.com/wbailer/few-shot-object-detection