Auto-encoding variational Bayes
Diederik P Kingma, Max Welling
Presented by : Mehdi Cherti (LAL/CNRS)
9th May 2015
Diederik P Kingma, Max Welling Auto-encoding variational Bayes
What is a generative model ?
A model of how the data X was generated
Typically, the purpose is to find a model for p(x) or p(x, y)
y can be a set of latent (hidden) variables or a set of output
variables, for discriminative problems
Training generative models
Typically, we assume a parametric form of the probability
density :
p(x|Θ)
Given an i.i.d. dataset $X = (x_1, x_2, \ldots, x_N)$, we typically do one of:
Maximum likelihood (ML): $\arg\max_\Theta p(X|\Theta)$
Maximum a posteriori (MAP): $\arg\max_\Theta p(X|\Theta) p(\Theta)$
Bayesian inference: $p(\Theta|X) = \frac{p(X|\Theta) p(\Theta)}{\int p(X|\Theta) p(\Theta) \, d\Theta}$
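To make the three options concrete, here is a minimal sketch on a toy problem. The Bernoulli likelihood, the Beta(2, 2) prior and the data (7 heads out of 10 tosses) are assumptions chosen for illustration; they are not part of the original slides.

```python
# Toy example: Bernoulli likelihood with a Beta(a, b) prior on theta.
# The model, the prior and the data are illustrative assumptions.

def ml_estimate(heads, n):
    """argmax_theta p(X | theta) for a Bernoulli model."""
    return heads / n

def map_estimate(heads, n, a, b):
    """argmax_theta p(X | theta) p(theta) with a Beta(a, b) prior (a, b > 1)."""
    return (heads + a - 1) / (n + a + b - 2)

def posterior_params(heads, n, a, b):
    """p(theta | X) is Beta(a + heads, b + tails) by conjugacy."""
    return a + heads, b + (n - heads)

heads, n = 7, 10
a, b = 2.0, 2.0
theta_ml = ml_estimate(heads, n)
theta_map = map_estimate(heads, n, a, b)
post_a, post_b = posterior_params(heads, n, a, b)
posterior_mean = post_a / (post_a + post_b)
print(theta_ml, theta_map, (post_a, post_b), posterior_mean)
```

ML and MAP each return a single value of Θ, while Bayesian inference returns a full distribution over Θ. Here the normalising integral has a closed form only because the prior is conjugate to the likelihood.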
The problem
Let x be the observed variables
We assume a latent representation z
We define $p_\Theta(z)$ and $p_\Theta(x|z)$
We want to train a generative model in the setting where:
the marginal $p_\Theta(x) = \int p_\Theta(x|z) p_\Theta(z) \, dz$ is intractable
the posterior $p_\Theta(z|x) = p_\Theta(x|z) p_\Theta(z) / p_\Theta(x)$ is intractable
the dataset is large, so we want to avoid sampling-based training procedures (e.g. MCMC)
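As an illustration of the marginal integral, the sketch below estimates p(x) by averaging p(x|z) over samples of z drawn from the prior. It assumes a one-dimensional linear-Gaussian toy model (z ~ N(0, 1), x|z ~ N(z, noise_std^2)), chosen only because its marginal is known exactly and can be used as a check. With a neural network decoder no such closed form is available, and this naive estimator typically needs many samples per data point.

```python
import math
import random

def normal_pdf(x, mean, std):
    """Density of N(mean, std^2) evaluated at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def marginal_mc(x, noise_std, n_samples, rng):
    """Estimate p(x) = integral of p(x|z) p(z) dz by averaging p(x|z) over z ~ p(z)."""
    total = 0.0
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)               # z ~ p(z) = N(0, 1)
        total += normal_pdf(x, z, noise_std)  # p(x|z) = N(x; z, noise_std^2)
    return total / n_samples

rng = random.Random(0)
x, noise_std = 1.0, 0.5
estimate = marginal_mc(x, noise_std, 100_000, rng)
# For this toy model the marginal is N(0, 1 + noise_std^2).
exact = normal_pdf(x, 0.0, math.sqrt(1.0 + noise_std ** 2))
print(estimate, exact)
```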
The proposed solution
The authors propose:
a fast training procedure that estimates the parameters Θ, for data generation
an approximation of the posterior pΘ(z|x), for data representation
an approximation of the marginal pΘ(x), for model evaluation and as a prior for other tasks
Formulation of the problem
The generative process consists of sampling z from $p_\Theta(z)$, then x from $p_\Theta(x|z)$.
Let's define:
a prior over the latent representation, $p_\Theta(z)$
a decoder $p_\Theta(x|z)$
We want to maximize the log-likelihood of the data $(x^{(1)}, x^{(2)}, \ldots, x^{(N)})$:
$\log p_\Theta(x^{(1)}, x^{(2)}, \ldots, x^{(N)}) = \sum_{i=1}^{N} \log p_\Theta(x^{(i)})$
and be able to do inference, i.e. compute the posterior $p_\Theta(z|x)$
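The two-step generative process can be sketched as follows. The decoder here is a hypothetical single linear layer with hand-picked weights `W` and `b`, followed by a sigmoid, producing Bernoulli probabilities for three binary observed dimensions. These choices are assumptions for illustration only.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

# Hypothetical decoder weights (3 observed binary dimensions, 2 latent dimensions).
W = [[2.0, -1.0], [0.5, 0.5], [-1.5, 1.0]]
b = [0.0, -0.5, 0.2]

def sample_prior(rng, latent_dim=2):
    """z ~ p(z) = N(0, I)."""
    return [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]

def decoder_probs(z):
    """Parameters of p(x|z): one Bernoulli probability per observed dimension."""
    return [sigmoid(sum(w * zj for w, zj in zip(row, z)) + bi)
            for row, bi in zip(W, b)]

def sample_x(rng):
    """Ancestral sampling: first z from the prior, then x from p(x|z)."""
    z = sample_prior(rng)
    probs = decoder_probs(z)
    x = [1 if rng.random() < p else 0 for p in probs]
    return z, x

rng = random.Random(0)
samples = [sample_x(rng) for _ in range(5)]
for z, x in samples:
    print([round(v, 2) for v in z], x)
```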
The variational lower bound
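We will learn an approximation $q_\Phi(z|x)$ of the posterior $p_\Theta(z|x)$ by maximizing a lower bound on the log-likelihood of the data
We can write:
$\log p_\Theta(x) = D_{KL}(q_\Phi(z|x) \,\|\, p_\Theta(z|x)) + L(\Theta, \Phi, x)$
where:
$L(\Theta, \Phi, x) = E_{q_\Phi(z|x)}[\log p_\Theta(x, z) - \log q_\Phi(z|x)]$
Since the KL divergence is non-negative, $L(\Theta, \Phi, x) \le \log p_\Theta(x)$: L is called the variational lower bound, and the goal is to maximize it w.r.t. all the parameters $(\Theta, \Phi)$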
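The decomposition of log p(x) into a KL term plus the lower bound can be checked numerically on a small discrete example. The sketch below assumes a latent variable with three states and made-up probabilities, so that the exact posterior is computable and both sides of the identity can be compared.

```python
import math

# Toy joint p(x, z) for one fixed x and a latent z with three states.
# All numbers are made up for illustration.
p_z = [0.5, 0.3, 0.2]             # prior p(z)
p_x_given_z = [0.10, 0.40, 0.70]  # likelihood p(x|z) for the observed x
q = [0.2, 0.3, 0.5]               # approximate posterior q(z|x)

p_xz = [pz * px for pz, px in zip(p_z, p_x_given_z)]  # joint p(x, z)
p_x = sum(p_xz)                                        # marginal p(x)
posterior = [v / p_x for v in p_xz]                    # exact posterior p(z|x)

# L = E_q[log p(x, z) - log q(z|x)]
elbo = sum(qi * (math.log(pj) - math.log(qi)) for qi, pj in zip(q, p_xz))
# D_KL(q(z|x) || p(z|x))
kl = sum(qi * (math.log(qi) - math.log(pi)) for qi, pi in zip(q, posterior))

print(math.log(p_x), kl + elbo)
```

The two printed numbers agree up to floating-point error. Setting `q` equal to `posterior` drives the KL term to zero, at which point the bound is tight.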
Estimating the lower bound gradients
We need to compute $\partial L(\Theta, \Phi, x) / \partial \Theta$ and $\partial L(\Theta, \Phi, x) / \partial \Phi$ to apply gradient-based optimization
For that, we use the reparametrisation trick: we sample a noise variable $\epsilon \sim p(\epsilon)$ and apply a deterministic function to it so that we obtain correct samples from $q_\Phi(z|x)$, meaning:
if $\epsilon \sim p(\epsilon)$, we find g such that $z = g(x, \Phi, \epsilon)$ implies $z \sim q_\Phi(z|x)$
g can be the inverse CDF of $q_\Phi(z|x)$ if $\epsilon$ is uniform on [0, 1]
With the reparametrisation trick we can rewrite L:
$L(\Theta, \Phi, x) = E_{\epsilon \sim p(\epsilon)}[\log p_\Theta(x, g(x, \Phi, \epsilon)) - \log q_\Phi(g(x, \Phi, \epsilon)|x)]$
We then estimate the gradients with Monte Carlo
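A minimal sketch of the trick on a case where the answer is known: q is assumed to be a one-dimensional Gaussian N(mu, sigma^2), and the quantity inside the expectation is replaced by the stand-in f(z) = z^2 (not the actual lower bound). The noise is sampled first, z is computed deterministically from it, and the gradient is pushed through that computation with the chain rule.

```python
import random

def reparam_gradients(mu, sigma, n_samples, rng):
    """Monte Carlo estimate of d/dmu and d/dsigma of E_{z~N(mu, sigma^2)}[z^2]."""
    grad_mu = 0.0
    grad_sigma = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)   # eps ~ p(eps) = N(0, 1)
        z = mu + sigma * eps        # z = g(eps) is distributed as N(mu, sigma^2)
        df_dz = 2.0 * z             # derivative of f(z) = z^2
        grad_mu += df_dz * 1.0      # chain rule: dz/dmu = 1
        grad_sigma += df_dz * eps   # chain rule: dz/dsigma = eps
    return grad_mu / n_samples, grad_sigma / n_samples

rng = random.Random(0)
mu, sigma = 1.5, 0.8
g_mu, g_sigma = reparam_gradients(mu, sigma, 200_000, rng)
# Exact values: E[z^2] = mu^2 + sigma^2, so the gradients are 2*mu and 2*sigma.
print(g_mu, 2 * mu, g_sigma, 2 * sigma)
```

Because the randomness sits in `eps` rather than in the parameters, the sample average of the per-sample derivatives is an unbiased estimate of the gradient of the expectation.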
A connection with auto-encoders
Note that L can also be written in this form:
$L(\Theta, \Phi, x) = -D_{KL}(q_\Phi(z|x) \,\|\, p_\Theta(z)) + E_{q_\Phi(z|x)}[\log p_\Theta(x|z)]$
We can interpret the first term as a regularizer: it keeps $q_\Phi(z|x)$ from diverging too far from the prior $p_\Theta(z)$
We can interpret the negative of the second term as the reconstruction error
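When q is a diagonal Gaussian and the prior is N(0, I) (an assumption here, matching the usual variational auto-encoder setup), the regularizer has a closed form: 0.5 * sum(mu^2 + sigma^2 - 1 - log sigma^2). The sketch below compares that formula with a Monte Carlo estimate of E_q[log q(z) - log p(z)]. The particular values of `mu` and `sigma` are arbitrary.

```python
import math
import random

def kl_to_standard_normal(mu, sigma):
    """Closed form of D_KL(N(mu, diag(sigma^2)) || N(0, I))."""
    return 0.5 * sum(m * m + s * s - 1.0 - math.log(s * s)
                     for m, s in zip(mu, sigma))

def kl_monte_carlo(mu, sigma, n_samples, rng):
    """Estimate the same KL as E_q[log q(z) - log p(z)]."""
    total = 0.0
    for _ in range(n_samples):
        for m, s in zip(mu, sigma):
            eps = rng.gauss(0.0, 1.0)
            z = m + s * eps
            # The -0.5 * log(2 * pi) constant appears in both densities and cancels.
            log_q = -0.5 * eps * eps - math.log(s)
            log_p = -0.5 * z * z
            total += log_q - log_p
    return total / n_samples

mu, sigma = [0.5, -1.0], [0.8, 1.2]
rng = random.Random(0)
exact = kl_to_standard_normal(mu, sigma)
approx = kl_monte_carlo(mu, sigma, 100_000, rng)
print(exact, approx)
```

With a closed-form KL term, only the reconstruction term needs to be estimated by sampling.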
The algorithm
Variational auto-encoders
An example model that uses the procedure described above to maximize the lower bound
In a variational auto-encoder (VAE), we choose:
$p_\Theta(z) = N(0, I)$
$p_\Theta(x|z)$:
a normal distribution for real-valued data: a neural network decoder computes $\mu$ and $\sigma$ of this distribution from z
a multivariate Bernoulli for binary data: a neural network decoder computes the probability of a 1 from z
$q_\Phi(z|x) = N(\mu(x), \sigma^2(x) I)$: a neural network encoder computes $\mu$ and $\sigma$ of $q_\Phi(z|x)$ from x
$\epsilon \sim N(0, I)$ and $z = g(x, \Phi, \epsilon) = \mu(x) + \sigma(x) \odot \epsilon$
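Putting these choices together, a single-sample estimate of the lower bound for one binary data point might look like the sketch below. To stay short it assumes single linear layers for the encoder and decoder instead of multi-layer networks, small random weights, and an encoder that outputs the log of σ. The helper names (`linear`, `init`, `elbo_estimate`) are hypothetical and are not taken from the linked implementation.

```python
import math
import random

def linear(W, b, v):
    """Affine map W v + b for list-based matrices and vectors."""
    return [sum(w * vj for w, vj in zip(row, v)) + bi for row, bi in zip(W, b)]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def init(rows, cols, rng):
    """Small random weights and zero biases (illustrative initialisation)."""
    W = [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]
    return W, [0.0] * rows

def elbo_estimate(x, params, rng):
    """Single-sample estimate of L for a Bernoulli decoder and N(0, I) prior."""
    (W_mu, b_mu), (W_ls, b_ls), (W_dec, b_dec) = params
    # Encoder: mean and log standard deviation of q(z|x).
    mu = linear(W_mu, b_mu, x)
    sigma = [math.exp(s) for s in linear(W_ls, b_ls, x)]
    # Reparametrisation: z = mu + sigma * eps with eps ~ N(0, I).
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    z = [m + s * e for m, s, e in zip(mu, sigma, eps)]
    # Decoder: Bernoulli probability of a 1 for each observed dimension.
    probs = [sigmoid(a) for a in linear(W_dec, b_dec, z)]
    log_px_given_z = sum(xi * math.log(p) + (1 - xi) * math.log(1.0 - p)
                         for xi, p in zip(x, probs))
    # Closed-form KL between q(z|x) and the N(0, I) prior.
    kl = 0.5 * sum(m * m + s * s - 1.0 - math.log(s * s)
                   for m, s in zip(mu, sigma))
    return log_px_given_z - kl, kl

rng = random.Random(0)
x_dim, z_dim = 6, 2
params = (init(z_dim, x_dim, rng), init(z_dim, x_dim, rng), init(x_dim, z_dim, rng))
x = [1, 0, 1, 1, 0, 0]
elbo, kl = elbo_estimate(x, params, rng)
print(elbo, kl)
```

Training would repeat this over minibatches and follow the gradient of the estimate with respect to all weights, typically using an automatic differentiation library rather than hand-written derivatives.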
Experiments (1)
Samples from MNIST:
Experiments (2)
2D-Latent space manifolds from MNIST and Frey datasets
Experiments (3)
Comparison of the lower bound with the Wake-Sleep algorithm:
Experiments (4)
Comparison of the marginal log-likelihood with Wake-Sleep and
Monte Carlo EM (MCEM):
Implementation : https://github.com/mehdidc/lasagnekit
