Chapter 6

Deep generative models

6.1 ∼ 6.2
6.1 Variational autoencoder
• Generative model
p(zn) = 𝒩(zn ∣ 0, I) (6.1)
p(xn ∣ zn, W) = 𝒩(xn ∣ f(zn; W), λx⁻¹ I) (6.2)
f : generative network or decoder
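
As a concrete illustration of this generative process, here is a minimal PyTorch sketch of ancestral sampling, assuming an MLP decoder and a fixed observation precision λx; the architecture and dimensions are my own assumptions, not the book's:

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Generative network f(z; W): maps a latent code to the mean of p(x | z, W)."""
    def __init__(self, latent_dim=2, data_dim=784, hidden_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, data_dim),
        )

    def forward(self, z):
        return self.net(z)

decoder = Decoder()
lambda_x = 10.0                      # assumed fixed observation precision
z = torch.randn(5, 2)                # z_n ~ N(0, I), eq. (6.1)
x = decoder(z) + torch.randn(5, 784) / lambda_x ** 0.5   # x_n ~ N(f(z_n; W), (1/lambda_x) I), eq. (6.2)
```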
• Posterior and objective
p(Z, W ∣ X) = [ p(W) ∏_{n=1}^{N} p(xn ∣ zn, W) p(zn) ] / p(X) (6.3)
DKL [q(Z, W) ∥ p(Z, W|X)] (6.4)
6.1.1 Generative and inference networks
6.1.1.1 Generative model and posterior approximation
• Mean-field approximation
q(Z, W; X, ψ, ξ) = q(Z; X, ψ)q(W; ξ) (6.5)
q(W; ξ) = ∏_{i,j,l} 𝒩(w_{i,j}^{(l)} ∣ m_{i,j}^{(l)}, v_{i,j}^{(l)}) (6.6)
q(Z; X, ψ) = ∏_{n=1}^{N} q(zn; xn, ψ) = ∏_{n=1}^{N} 𝒩(zn ∣ m(xn; ψ), diag(v(xn; ψ))) (6.7)
f(xn; ψ) = (m(xn; ψ), ln v(xn; ψ)) (6.8)
f : inference (recognition) network or encoder
• Amortized inference: the inference network f(xn; ψ) maps a new data point directly to the variational parameters for that data point, so no per-datapoint optimization is needed.
[Figure: new data → inference network f(xn; ψ) → variational parameters for the new data]
Similar idea used in the Helmholtz machine (Dayan et al., 1995); see also stochastic variational inference: http://jmlr.org/papers/v14/hoffman13a.html
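
A matching sketch of the inference network, mirroring (6.7) and (6.8): a single forward pass returns the variational mean and log-variance for any new xn, which is what makes the inference amortized. Sizes and layers are illustrative assumptions, as above:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Inference network f(xn; psi): returns (m(xn; psi), ln v(xn; psi)) as in (6.8)."""
    def __init__(self, data_dim=784, latent_dim=2, hidden_dim=256):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(data_dim, hidden_dim), nn.ReLU())
        self.mean_head = nn.Linear(hidden_dim, latent_dim)
        self.logvar_head = nn.Linear(hidden_dim, latent_dim)

    def forward(self, x):
        h = self.body(x)
        return self.mean_head(h), self.logvar_head(h)

# Amortization: no per-datapoint optimization; one call yields the variational
# parameters of q(zn; xn, psi) for any new xn.
encoder = Encoder()
m, logv = encoder(torch.randn(5, 784))
```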
• Global and local latent variables
DKL [q(Z, W; X, ψ, ξ) ∥ p(Z, W|X)]
= −{E[ln p(X, Z, W)] − E[ln q(Z; X, ψ)] − E[ln q(W; ξ)]} + ln p(X) (6.9)
∴ ln p(X) − DKL [q(Z, W; X, ψ, ξ) ∥ p(Z, W|X)]
= E[ln p(X, Z, W)] − E[ln q(Z; X, ψ)] − E[ln q(W; ξ)] = ℒ(ψ, ξ) (6.10)
Maximize ℒ(ψ, ξ) w.r.t. ψ and ξ. With a minibatch 𝒮 of size M drawn from the N data points, the stochastic estimate of the objective is
ℒ_𝒮(ψ, ξ) = (N/M) ∑_{n∈𝒮} {E[ln p(xn ∣ zn, W)] + E[ln p(zn)] − E[ln q(zn)]} + E[ln p(W)] − E[ln q(W; ξ)] (6.11)
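
As an illustration of how (6.11) is estimated in practice, the sketch below (my own, under the usual simplifications) uses one reparameterized sample per datapoint for the intractable expectations and a point estimate of W, so the E[ln p(W)] − E[ln q(W; ξ)] terms are dropped, as in the plain VAE:

```python
import torch

def minibatch_elbo(x, encoder, decoder, N, lambda_x=10.0):
    """One-sample Monte Carlo estimate of (6.11), scaled by N/M.
    Assumes a point estimate for W, so the E[ln p(W)] - E[ln q(W; xi)]
    terms are dropped (standard VAE simplification); constants are omitted."""
    M = x.shape[0]
    m, logv = encoder(x)
    std = torch.exp(0.5 * logv)
    z = m + std * torch.randn_like(std)          # reparameterized sample from q(zn)

    # E[ln p(xn | zn, W)] for the Gaussian likelihood (6.2), up to constants.
    recon = -0.5 * lambda_x * ((x - decoder(z)) ** 2).sum(dim=1)

    # E[ln p(zn)] - E[ln q(zn)] evaluated at the sample, up to constants.
    log_prior = -0.5 * (z ** 2).sum(dim=1)
    log_q = -0.5 * (logv + ((z - m) / std) ** 2).sum(dim=1)

    return (N / M) * (recon + log_prior - log_q).sum()
```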
6.1.1.2 Training by variational inference
• Gradients of parameters
∇ξ ℒ_𝒮(ψ, ξ) = (N/M) ∑_{n∈𝒮} ∇ξ E[ln p(xn ∣ zn, W)] + ∇ξ E[ln p(W)] − ∇ξ E[ln q(W; ξ)] (6.12)
∇ψ ℒ_𝒮(ψ, ξ) = (N/M) ∑_{n∈𝒮} {∇ψ E[ln p(xn ∣ zn, W)] + ∇ψ E[ln p(zn)] − ∇ψ E[ln q(zn)]} (6.13)
ξ : variational parameter of q(W; ξ)
ψ : inference network parameter of f(xn; ψ)
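
Because z is drawn via the reparameterization trick in the sketch above, autodiff of the minibatch objective yields stochastic estimates of the gradients (6.12) and (6.13) directly. A minimal training loop, again illustrative (`loader` is an assumed source of minibatches):

```python
import torch

encoder, decoder = Encoder(), Decoder()   # sketches defined above
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

N = 60_000                                # assumed dataset size
for x in loader:                          # `loader` assumed to yield (M, 784) minibatches
    loss = -minibatch_elbo(x, encoder, decoder, N)   # descend on -L_S = ascend on L_S
    opt.zero_grad()
    loss.backward()
    opt.step()
```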
6.1.2 Semi-supervised models

Labelled data: 𝒟𝒜 = {X𝒜, Y𝒜}
Unlabelled data: 𝒟𝒰 = X𝒰

6.1.2.1 M1 model

1. Train encoder and decoder with {X𝒜, X𝒰}
2. Train a supervised model with {Z𝒜, Y𝒜}, where Z𝒜 is encoded from X𝒜 with the model of step 1 (see the sketch below).
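
The two M1 stages as a schematic sketch; `train_vae` and `train_classifier` are hypothetical helpers, and X_A, Y_A, X_U stand for X𝒜, Y𝒜, X𝒰:

```python
import torch

# Stage 1: unsupervised VAE on all inputs, labelled and unlabelled.
encoder, decoder = train_vae(torch.cat([X_A, X_U]))       # hypothetical helper

# Stage 2: supervised model on the latent codes of the labelled data.
Z_A, _ = encoder(X_A)                 # use the variational means m(x; psi) as features
classifier = train_classifier(Z_A.detach(), Y_A)          # hypothetical helper
```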
6.1.2.2 M2 model
[Graphical model: shared weights W generate X𝒜 from (Y𝒜, Z𝒜) and X𝒰 from (Y𝒰, Z𝒰)]
• Generative process with shared parameter W (and shared priors p(Y) and p(Z)):
p(X𝒜, X𝒰, Y𝒜, Y𝒰, Z𝒜, Z𝒰, W)
= p(W) p(X𝒜 ∣ Y𝒜, Z𝒜, W) p(Y𝒜) p(Z𝒜) p(X𝒰 ∣ Y𝒰, Z𝒰, W) p(Y𝒰) p(Z𝒰) (6.14)
• Approximate posterior
q(Z𝒜; X𝒜, Y𝒜, ψ) = ∏_{n∈𝒜} 𝒩(zn ∣ m(xn, yn; ψ), diag(v(xn, yn; ψ))) (6.15)
q(Z𝒰; X𝒰, ψ) = ∏_{n∈𝒰} 𝒩(zn ∣ m(xn; ψ), diag(v(xn; ψ))) (6.16)
q(Y𝒰; X𝒰, ψ) = ∏_{n∈𝒰} Cat(yn ∣ π(xn; ψ)) (6.17)
m, v, π : inference networks parametrized with ψ
q(W; ξ) : Gaussian distribution parametrized with ξ
• KL-divergence
DKL[q(Y𝒰, Z𝒜, Z𝒰, W; X𝒜, Y𝒜, X𝒰, ξ, ψ) ∥ p(Y𝒰, Z𝒜, Z𝒰, W ∣ X𝒜, X𝒰, Y𝒜)]
= −ℱ(ξ, ψ) + const. (6.18)
ℱ(ξ, ψ) = ℒ𝒜(X𝒜, Y𝒜; ξ, ψ) + ℒ𝒰(X𝒰; ξ, ψ) − DKL[q(W; ξ) ∥ p(W)] (6.19)
ℒ𝒜(X𝒜, Y𝒜; ξ, ψ) = E[ln p(X𝒜 ∣ Y𝒜, Z𝒜, W)] + E[ln p(Z𝒜)] − E[ln q(Z𝒜; X𝒜, Y𝒜, ψ)] (6.20)
ℒ𝒰(X𝒰; ξ, ψ) = E[ln p(X𝒰 ∣ Y𝒰, Z𝒰, W)] + E[ln p(Y𝒰)] + E[ln p(Z𝒰)] − E[ln q(Y𝒰; X𝒰, ψ)] − E[ln q(Z𝒰; X𝒰, ψ)] (6.21)
• Maximize ℱ(ξ, ψ) w.r.t. ξ and ψ
• Extension of the objective to use labelled data with a classification likelihood:
ℱβ(ξ, ψ) = ℱ(ξ, ψ) + β ln q(Y𝒜; X𝒜, ψ) (6.22)
β : weight of the classification likelihood
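
A sketch of how ℱβ (6.22) is typically assembled per minibatch; `elbo_labelled`, `elbo_unlabelled`, and `classifier_logits` are hypothetical methods standing for Monte Carlo estimates of (6.20), (6.21), and ln π(x; ψ):

```python
import torch.nn.functional as F

def m2_loss(x_l, y_l, x_u, beta, model):
    """Negative of F_beta (6.22) on one minibatch; `model` is assumed to bundle
    the inference networks m, v, pi and the generative network."""
    L_A = model.elbo_labelled(x_l, y_l)    # estimate of (6.20), hypothetical method
    L_U = model.elbo_unlabelled(x_u)       # estimate of (6.21), hypothetical method
    # beta * ln q(Y_A; X_A, psi): cross-entropy is the negative log-likelihood.
    log_q_y = -F.cross_entropy(model.classifier_logits(x_l), y_l, reduction="sum")
    return -(L_A + L_U + beta * log_q_y)
```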
6.1.3 Applications and extensions
6.1.3.1 Extension of models
• Incorporate recurrent network and attention (DRAW)

• Convolutional VAE

• Disentangled representation learning

• Multi-modal learning with shared latent representation
(e.g., images and texts)
Explanation of DRAW with a Python implementation: https://jhui.github.io/2017/04/30/DRAW-Deep-recurrent-attentive-writer/
6.1.3.2 Importance weighted AE
ℒ_T = E_{z^{(t)} ∼ q(z^{(t)}; x)} [ ln (1/T) ∑_{t=1}^{T} p(x, z^{(t)}) / q(z^{(t)}; x) ]
≤ ln E_{z^{(t)} ∼ q(z^{(t)}; x)} [ (1/T) ∑_{t=1}^{T} p(x, z^{(t)}) / q(z^{(t)}; x) ]  (Jensen's inequality)
= ln E_{z^{(t)} ∼ q(z^{(t)}; x)} [ (1/T) ∑_{t=1}^{T} p(x ∣ z^{(t)}) p(z^{(t)}) / q(z^{(t)}; x) ]
= ln p(x) (6.23)
• Equivalent to the ELBO when T = 1

• The larger T is, the tighter the bound (Appendix A in the paper):
ln p(x) ≥ ⋯ ≥ ℒ_{t+1} ≥ ℒ_t ≥ ⋯ ≥ ℒ_1 = ℒ (6.24)
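
A sketch of the T-sample estimate of ℒ_T in (6.23), reusing the encoder/decoder sketches above; the Gaussian constants of the likelihood are dropped, which shifts the bound by a constant but leaves gradients unchanged, and the log-mean-exp is computed stably with torch.logsumexp:

```python
import math
import torch

def iwae_bound(x, encoder, decoder, T, lambda_x=10.0):
    """Monte Carlo estimate of L_T for a batch x of shape (M, D)."""
    m, logv = encoder(x)
    std = torch.exp(0.5 * logv)
    z = m + std * torch.randn(T, *m.shape)        # T reparameterized samples: (T, M, d)

    log_lik = -0.5 * lambda_x * ((x - decoder(z)) ** 2).sum(dim=-1)   # ln p(x | z), up to const
    log_prior = -0.5 * (z ** 2).sum(dim=-1)                           # ln p(z); 2*pi consts cancel
    log_q = -0.5 * (logv + ((z - m) / std) ** 2).sum(dim=-1)          # ln q(z; x)

    log_w = log_lik + log_prior - log_q           # log importance weights
    return (torch.logsumexp(log_w, dim=0) - math.log(T)).sum()        # ln (1/T) sum_t w_t
```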
