A Survey on Deep Semi-supervised Learning
Based on: https://arxiv.org/pdf/2103.00550.pdf
Original Authors: Xiangli Yang, Zixing Song, Irwin King, Fellow, IEEE, Zenglin Xu, Senior Member, IEEE
Date Published: 23 August 2021
Introduction
What is deep semi-supervised learning (DSSL)?
● A combination of supervised and unsupervised learning that uses a small portion of labeled
examples together with a large amount of unlabeled data
● The model must learn from this data and make predictions on new examples.
● The basic procedure involves using the existing labeled data to label the rest of the unlabeled
data, thus effectively enlarging the training set.
Why is DSSL needed?
● Most real-world use cases do not have extensive labeled data, and labeling data is
challenging, requiring a considerable amount of resources, time, and effort.
General idea of SSL
Fig. 1. General idea of semi-supervised learning
(https://www.enjoyalgorithms.com/)
Learning paradigms in SSL
Two dominant learning paradigms in SSL
● Transductive
○ Transductive methods do not construct a classifier for the entire input space. The
predictions of such systems are therefore restricted to the objects on which they have
been trained, so transductive methods have no separate training and test phases.
● Inductive
○ A classifier built with inductive methods can predict any object in the input space. This
classifier may be trained using unlabeled data, but its predictions for previously unseen
objects are independent of each other once training is complete.
The majority of graph-based methods are transductive,
whereas most other types of SSL methods are inductive.
SSL Assumptions
SSL may not improve performance unless certain assumptions are made about the distribution of the data.
● Self-training assumption: Predictions with high confidence are considered to be accurate.
● Co-training assumption: Instance x has two conditionally independent views and each view is sufficient for
a classification task.
● Generative model assumption: When the number of mixture components, the prior p(y), and the conditional
distribution p(x|y) are correct, the data can be assumed to come from the mixture model.
● Cluster assumption: Two points x1 and x2 in the same cluster should belong to the same class.
● Low-density separation: The decision boundary should lie in a low-density region, not in an area of high
density.
● Manifold assumption: If two points x1 and x2 are located in a local neighborhood in the low-dimensional
manifold, they have similar class labels.
Taxonomy of DSSL
Fig. 2. The taxonomy of major deep semi-supervised learning
methods based on loss function and model design.
Important DSSL methods
● Generative models
○ Semi-supervised GANs
○ Semi-supervised VAE
● Consistency regularization methods
● Graph based methods
○ Auto-encoder based methods
○ GNN based methods
● Pseudo labeling methods
○ Disagreement based models
○ Self training models
● Hybrid methods
Generative models - Semi-supervised GANs
● A GAN is an unsupervised model. It consists of a generative model that is trained on
unlabeled data, as well as a discriminative classifier that determines the quality of the
generator.
● GANs are able to learn the distribution of real data from unlabeled samples, which
facilitates SSL. There are four main themes in how to use GANs for SSL.
○ re-using the features from the discriminator
○ using GAN-generated samples to regularize a classifier
○ learning an inference model
○ using samples produced by a GAN as additional training data.
Semi-supervised GANs - Examples
● CatGAN: The Categorical Generative Adversarial Network (CatGAN) modifies the GAN's objective function to incorporate
mutual information between observed samples and their predicted categorical distributions. The goal is to train a
discriminator which distinguishes the samples into K categories by assigning a label y to each x, instead of learning a binary
discriminator value function.
● CCGAN: Context-Conditional Generative Adversarial Networks (CCGAN) are proposed as a method of
harnessing unlabeled image data using an adversarial loss for tasks like image in-painting. Contextual information
is provided by the surrounding parts of the image, and the generator is trained to generate the pixels within a missing
patch of the image.
● Improved GAN: Improved GAN adapts GANs to the semi-supervised classification scenario by extending the
discriminator into a (K+1)-class classifier: real samples are classified into one of the K true classes, while
generated samples are assigned to the additional (K+1)-th "fake" class, so labeled, unlabeled, and generated data all contribute to training.
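To make the (K+1)-class idea concrete, below is a minimal PyTorch-style sketch of such a discriminator loss, assuming a classifier D that outputs K+1 logits; the function signature and tensor shapes are illustrative, not taken from the original implementation.

```python
import torch
import torch.nn.functional as F

def ssgan_discriminator_loss(D, x_labeled, y, x_unlabeled, x_fake, num_classes):
    """Sketch of a (K+1)-class semi-supervised GAN discriminator loss.

    D maps inputs to num_classes + 1 logits; index num_classes is the
    extra "generated/fake" class. Shapes and names are assumptions."""
    fake_class = num_classes

    # Supervised term: labeled samples should be assigned their true class.
    loss_sup = F.cross_entropy(D(x_labeled), y)

    # Unsupervised term: unlabeled samples are real, i.e. they should fall
    # into one of the first K classes with high total probability.
    logits_unl = D(x_unlabeled)
    log_p_real = (torch.logsumexp(logits_unl[:, :fake_class], dim=1)
                  - torch.logsumexp(logits_unl, dim=1))
    loss_unl = -log_p_real.mean()

    # Generated samples should be assigned to the extra fake class.
    fake_targets = torch.full((x_fake.size(0),), fake_class, dtype=torch.long)
    loss_fake = F.cross_entropy(D(x_fake), fake_targets)

    return loss_sup + loss_unl + loss_fake
```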
Generative models - Semi-supervised VAE
Variational AutoEncoders (VAEs) combine deep autoencoders and generative latent-variable models.
A VAE is trained with two objectives: a reconstruction objective between the inputs and their reconstructed
versions, and a variational objective that encourages the latent space to follow a Gaussian distribution.
SSL can benefit from VAEs for three reasons:
● Incorporating unlabeled data is a natural process
● Using latent space, it is possible to disentangle representations easily.
● It also allows us to use variational neural methods.
VAEs can be used as semi-supervised learning models in two steps.
● First, a VAE is trained using both unlabeled and labeled data in order to extract latent representations.
● The second step entails a VAE in which the latent representation is supplemented by the label vector. The
label vector contains the true labels for labeled data points and is used to construct additional latent
variables for unlabeled data.
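To illustrate the second step, here is a minimal PyTorch sketch of a VAE whose encoder and decoder are both conditioned on a one-hot label vector; the architecture and layer sizes are assumptions made for clarity, not the original model.

```python
import torch
import torch.nn as nn

class LabelConditionedVAE(nn.Module):
    """Sketch of a label-conditioned VAE: the latent code z is supplemented
    by a one-hot label vector y (true labels for labeled data, inferred
    labels for unlabeled data). Dimensions are illustrative."""

    def __init__(self, x_dim=784, y_dim=10, z_dim=32, h_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(x_dim + y_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)
        self.decoder = nn.Sequential(
            nn.Linear(z_dim + y_dim, h_dim), nn.ReLU(),
            nn.Linear(h_dim, x_dim), nn.Sigmoid())

    def forward(self, x, y_onehot):
        h = self.encoder(torch.cat([x, y_onehot], dim=1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        x_recon = self.decoder(torch.cat([z, y_onehot], dim=1))
        # KL divergence between q(z|x, y) and the standard normal prior
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
        return x_recon, kl
```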
Semi-supervised VAE - Examples
● Semi-supervised Sequential Variational Autoencoder (SSVAE): Consists of a Seq2Seq
structure and a sequential classifier. In the Seq2Seq structure, the input sequence is first
encoded by a recurrent neural network and then decoded by another recurrent neural network
conditioned on both the latent variable and the categorical label.
● Infinite VAE: A mixture of an infinite number of autoencoders, capable of scaling with data
complexity to better capture its intrinsic structure. The unsupervised generative model is
trained using unlabeled data; this model can then be combined with the available labeled
data to train a discriminative model, which is also a mixture of experts.
Consistency Regularization
● Consistency regularization is based on the idea that predictions should remain stable under
extra perturbations imposed on the input samples.
● Consistency regularization SSL methods typically use the Teacher-Student structure.
● The objective is to train the model to predict consistently on an unlabeled example and its
perturbed version.
● The model simultaneously acts as a student that learns and as a teacher that generates
targets. Since the model generates its own targets, those targets may be incorrect, yet they are
still used for the student's learning.
Consistency Regularization - Examples
● Ladder networks: This network consists of two encoders, one corrupted and one clean, and a decoder.
The corrupted encoder has Gaussian noise injected at each layer after batch normalization. An input x
passed through the clean encoder produces the output y, while passing through the corrupted encoder
produces ỹ. This ỹ is fed into the denoising decoder to reconstruct the clean output y. The
training loss is computed as the MSE between the clean activations and the reconstructed activations.
● Π-Model: This is a simplified ladder network, where the corrupted encoder is removed, and the
same network is used to obtain predictions for both corrupted and uncorrupted inputs. In this model,
two random augmentations of a sample, for both labeled and unlabeled data, are forward
propagated, and the objective is to produce consistent predictions on the two variants, i.e., to reduce the
distance between the two predictions.
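A minimal sketch of the Π-Model objective is shown below, assuming a PyTorch classifier and a stochastic augment function; the consistency weight w is ramped up over training in the original paper, and all names here are illustrative.

```python
import torch
import torch.nn.functional as F

def pi_model_loss(model, x_labeled, y, x_unlabeled, augment, w):
    """Sketch of the Pi-Model objective: cross-entropy on labeled samples
    plus an MSE consistency term between predictions on two independent
    stochastic augmentations of every sample."""
    n = x_labeled.size(0)
    x_all = torch.cat([x_labeled, x_unlabeled], dim=0)

    logits1 = model(augment(x_all))  # first augmented branch
    logits2 = model(augment(x_all))  # second augmented branch

    supervised = F.cross_entropy(logits1[:n], y)
    consistency = F.mse_loss(F.softmax(logits1, dim=1),
                             F.softmax(logits2, dim=1))
    return supervised + w * consistency
```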
Graph-Based Methods
● The main idea of graph-based semi-supervised learning (GSSL) is to extract a
graph out of the raw data, where each node represents a training sample and each
edge represents a similarity measurement between samples.
● Given labeled and unlabeled samples, the objective is to propagate the labels
from the labeled nodes to the unlabeled ones (a sketch of this propagation follows the list below).
● The GSSL methods are broadly classified into two:
○ AutoEncoder-based
○ GNN-based methods.
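The label-propagation idea can be sketched with the classic propagate-and-clamp scheme over a similarity graph; the NumPy code below is illustrative, with assumed inputs W (similarity matrix), Y (one-hot labels, zero rows for unlabeled nodes), and a labeled-node mask.

```python
import numpy as np

def label_propagation(W, Y, labeled_mask, alpha=0.99, iters=100):
    """Propagate-and-clamp label propagation sketch.

    W: (n, n) symmetric similarity (adjacency) matrix.
    Y: (n, k) one-hot label matrix; rows of unlabeled nodes are all zero.
    labeled_mask: boolean (n,) array marking the labeled nodes.
    alpha and iters are illustrative hyperparameters."""
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))            # symmetrically normalized graph
    F = Y.astype(float).copy()
    for _ in range(iters):
        F = alpha * (S @ F) + (1 - alpha) * Y  # spread labels along edges
        F[labeled_mask] = Y[labeled_mask]      # clamp the known labels
    return F.argmax(axis=1)                    # predicted class per node
```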
Graph-Based Methods - Examples
● Structural deep network embedding (SDNE) is an AutoEncoder-based method. The
framework consists of an unsupervised and a supervised part. The first is an autoencoder
designed to produce an embedding for each node by reconstructing its neighborhood. The
second part utilizes Laplacian Eigenmaps, which penalize the model when related vertices
are far apart.
● Basic GNN: With Graph Neural Networks (GNNs), a classifier is first trained to predict class
labels for the labeled nodes. It can then be applied to the unlabeled nodes based on the final
hidden state of the GNN-based model. GNNs take advantage of neural message passing, in
which messages are exchanged and updated between pairs of nodes by using neural
networks.
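A single neural message-passing step can be sketched as follows; the dense adjacency matrix, layer sizes, and class name are assumptions made for illustration, not a specific published architecture.

```python
import torch
import torch.nn as nn

class SimpleMessagePassing(nn.Module):
    """One message-passing step: each node aggregates its neighbors'
    transformed states and updates its own representation."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.msg = nn.Linear(in_dim, out_dim)              # transform neighbor states
        self.update = nn.Linear(in_dim + out_dim, out_dim)

    def forward(self, H, A):
        # H: (n, in_dim) node states, A: (n, n) adjacency matrix
        messages = A @ self.msg(H)                         # sum messages over neighbors
        return torch.relu(self.update(torch.cat([H, messages], dim=1)))
```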
Pseudo-Labeling Methods
Pseudo-labeling works in two steps.
● In the first step, a model is trained on a limited set of labeled data.
● The second step leverages the same model to create pseudo-labels on unlabeled data and adds the high-
confidence pseudo-labels as targets to the existing labeled dataset, creating additional training data.
There are two main patterns: one is to improve the performance of the whole framework based on the
disagreement of views or multiple networks, and the other is self-training.
● Disagreement-based methods train multiple learners and focus on exploiting the disagreement during the
training.
● The self-training algorithm leverages the model’s own confident predictions to produce the pseudo labels
for unlabeled data.
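A generic self-training loop might look like the sketch below, assuming a scikit-learn-style classifier with a predict_proba method; the confidence threshold and number of rounds are illustrative.

```python
import numpy as np

def self_training(model, X_lab, y_lab, X_unl, threshold=0.95, rounds=5):
    """Self-training sketch: repeatedly fit, pseudo-label the most confident
    unlabeled samples, and fold them into the training set."""
    X_train, y_train = X_lab.copy(), y_lab.copy()
    for _ in range(rounds):
        model.fit(X_train, y_train)
        if len(X_unl) == 0:
            break
        proba = model.predict_proba(X_unl)
        conf = proba.max(axis=1)
        keep = conf >= threshold                 # high-confidence predictions only
        if not keep.any():
            break                                # nothing confident enough; stop
        X_train = np.vstack([X_train, X_unl[keep]])
        y_train = np.concatenate([y_train, proba[keep].argmax(axis=1)])
        X_unl = X_unl[~keep]                     # remove newly pseudo-labeled samples
    return model
```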
Pseudo-Labeling Methods - Examples
● Pseudo-label: This is a simple and efficient SSL method that allows the network to be trained simultaneously with labeled and
unlabeled data. Initially, the model is trained with labeled data using a cross-entropy loss. The same model is then used to predict an
entire batch of unlabeled samples, and the class with the maximum predicted probability is taken as the pseudo-label.
● Noisy Student: This is a semi-supervised method that works on knowledge distillation with equal-or-larger student models. The
teacher model is first trained on labeled images to generate pseudo labels for unlabeled examples. Following this, a larger
student model is trained on the combination of labeled and pseudo-labeled samples. These combined instances are
augmented using data augmentation techniques and model noise. With several iterations of this algorithm, the student model
becomes the new teacher and relabels the unlabeled data, and the cycle is repeated.
● SimCLRv2: This is the SSL version of SimCLR (A Simple Framework for Contrastive Learning of Visual Representations).
SimCLRv2 can be summarized in three steps: task-agnostic unsupervised pre-training, supervised fine-tuning on labeled
samples, and self-training or distillation with task-specific unlabeled examples. SimCLRv2 learns representations by
minimizing a contrastive loss that maximizes agreement between differently augmented views of the same image.
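The contrastive (NT-Xent) objective behind SimCLR/SimCLRv2 can be sketched as follows; the batch layout and temperature value are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent sketch: the two embeddings of the same image attract each
    other, while all other pairs in the batch repel. z1, z2: (n, d)
    projections of two augmented views of the same n images."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # 2n unit vectors
    sim = (z @ z.t()) / temperature                      # pairwise similarities
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float('-inf'))
    # The positive for row i is its counterpart from the other view.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)
```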
Hybrid Methods
Hybrid methods combine ideas from the above-mentioned methods for performance improvement, such as:
● Pseudo-labeling
● Consistency regularization
● Entropy minimization
Hybrid Methods - Examples
● MixMatch: This method combines consistency regularization and entropy minimization in a unified
loss function. It first introduces data augmentation in both labeled and unlabeled data. Each
unlabeled sample is augmented K times, and the predictions of the different augmentations are
then averaged. To reduce entropy, the guessed labels are sharpened before the final label is produced.
After this, Mixup regularization is applied on both labeled and unlabeled data.
● FixMatch: This method combines consistency regularization and pseudo-labeling in a simplified
way. For each unlabeled image, a weak augmentation and a strong augmentation are applied to
obtain two images, and both are passed through the model to get predictions. FixMatch then applies
consistency regularization as the cross-entropy between the one-hot pseudo-labels of the weakly augmented
images and the predictions on the strongly augmented images.
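A minimal sketch of FixMatch's unlabeled loss term is given below, assuming weak_aug and strong_aug augmentation functions; the confidence threshold tau of 0.95 follows the paper's commonly cited default.

```python
import torch
import torch.nn.functional as F

def fixmatch_unlabeled_loss(model, x_unl, weak_aug, strong_aug, tau=0.95):
    """FixMatch sketch: the prediction on a weakly augmented image becomes a
    one-hot pseudo-label, kept only if its confidence exceeds tau, and is
    matched against the prediction on the strongly augmented image."""
    with torch.no_grad():
        probs = F.softmax(model(weak_aug(x_unl)), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = (conf >= tau).float()             # keep confident pseudo-labels only

    logits_strong = model(strong_aug(x_unl))
    per_sample = F.cross_entropy(logits_strong, pseudo, reduction='none')
    return (mask * per_sample).mean()
```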
Challenges
SSL, like any other machine learning method, has its own set of challenges.
One of the major challenges is that it is unknown how SSL works internally and what role various techniques, such
as data augmentation, training methods, and loss functions, play exactly.
The SSL approaches above generally operate at their best only in ideal settings, when the training dataset
meets their design assumptions. In reality, however, the distribution of the dataset is unknown, does not
necessarily meet these ideal conditions, and might produce unexpected results.
If training data is highly imbalanced, the models tend to favor the majority class, and in some cases ignore the
minority class entirely.
The use of unlabeled data may result in worse generalization performance than a model learned only with labeled
data.
Conclusion
Deep semi-supervised learning has already demonstrated remarkable results in various tasks and has garnered a lot of interest from the research community due to its important practical applications.
Thank you!
