Introduction to Deep
Generative Models
Herman Dong
Music and Audio Computing Lab (MACLab),
Research Center for Information Technology Innovation,
Academia Sinica
MuseGAN
Learn about our recent work on using GANs to compose
pop songs at https://salu133445.github.io/musegan/
Hao-Wen Dong, Wen-Yi Hsiao, Li-Chia Yang and Yi-Hsuan Yang. 2017. MuseGAN: Symbolic-domain Music Generation
and Accompaniment with Multi-track Sequential Generative Adversarial Networks. arXiv preprint arXiv:1709.06298.
Outline
• Brief introduction to deep generative models
• AE (Autoencoder)
• VAE (Variational Autoencoder)
• GAN (Generative Adversarial Networks)
• AAE (Adversarial Autoencoder)
• VAE/GAN
• ADA (Adversarial Domain Adaptation)
• Reformulation
• Graphical model representation
• Connection to Wake-sleep Algorithm
AE (Autoencoder)
• minimize reconstruction loss
[Diagram: data X → encoder E → latent code E(X) → decoder D → reconstruction D(E(X)); reconstruction loss between X and D(E(X)).]
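The reconstruction objective above can be sketched numerically. This is a minimal sketch, assuming linear encoder/decoder maps with random (untrained) weights as stand-ins for learned networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear autoencoder: data (dim 4) -> latent (dim 2) -> reconstruction.
# W_enc and W_dec are hypothetical weight matrices; a real model would learn them.
W_enc = rng.normal(size=(2, 4))
W_dec = rng.normal(size=(4, 2))

def encode(x):
    return W_enc @ x          # E(X): project data into the latent space

def decode(z):
    return W_dec @ z          # D(E(X)): map the latent code back to data space

x = rng.normal(size=4)
x_hat = decode(encode(x))

# The training signal: mean squared reconstruction loss between X and D(E(X)).
reconstruction_loss = np.mean((x - x_hat) ** 2)
print(reconstruction_loss)
```

Training would adjust W_enc and W_dec by gradient descent on this loss; the sketch only evaluates it once.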
VAE (Variational Autoencoder)
• minimize reconstruction loss
• minimize distance between encoded latent distribution and prior distribution
[Diagram: data X → encoder Q → latent distribution Q(X) with mean μ and std σ; reparameterization z = μ + σ ∙ ε with ε ~ N(0, I); decoder P → reconstruction. Losses: reconstruction loss, plus a KL divergence pushing Q(X) ≈ p(z). At testing (generation), z is sampled directly from the prior p(z) and only P is used.]
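The reparameterization trick and the KL term can be written out directly. A minimal sketch, assuming a diagonal Gaussian posterior; the μ and σ values below are hypothetical encoder outputs for one input:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder outputs for one input: mean and std of Q(z|X),
# assumed to be a diagonal Gaussian (as in the standard VAE).
mu = np.array([0.5, -0.2])
sigma = np.array([0.8, 1.1])

# Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
# Sampling stays differentiable with respect to mu and sigma.
eps = rng.standard_normal(2)
z = mu + sigma * eps

# KL divergence between N(mu, diag(sigma^2)) and the prior p(z) = N(0, I),
# in closed form for diagonal Gaussians.
kl = 0.5 * np.sum(mu**2 + sigma**2 - 2.0 * np.log(sigma) - 1.0)
print(z, kl)
```

The total VAE loss would add this KL term to the reconstruction loss from the decoder.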
GAN (Generative Adversarial Network)
• minimize distance between the distribution of real data and generated samples
[Diagram: latent z ~ p(z) → generator G → sample G(z); discriminator D receives G(z) and real data X and outputs 1/0 (real/fake). Adversarial training: D maximizes log D(X) + log(1 − D(G(z))) to distinguish G(z) being fake from X being real; G minimizes log(1 − D(G(z))) to make G(z) indistinguishable from real data for D. At testing (generation), only G is used.]
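The two adversarial objectives can be evaluated on hypothetical discriminator outputs. This is a sketch of the loss computation only, not a training loop:

```python
import numpy as np

# Hypothetical discriminator outputs D(.) in (0, 1): probability of "real".
d_real = np.array([0.9, 0.8])   # D(X) on real samples
d_fake = np.array([0.2, 0.3])   # D(G(z)) on generated samples

# Discriminator objective: maximize log D(X) + log(1 - D(G(z))),
# i.e. classify X as real and G(z) as fake.
disc_objective = np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# Generator objective (non-saturating form): maximize log D(G(z)),
# i.e. make G(z) indistinguishable from real data for D.
gen_objective = np.mean(np.log(d_fake))
print(disc_objective, gen_objective)
```

In practice G and D alternate gradient steps on these objectives; the non-saturating generator form is commonly used because it gives stronger gradients early in training.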
GAN vs VAE
• GAN
• the generator aims to fool the discriminator
• the discriminator aims to distinguish generated data from real data
• output images are sharper
• higher diversity, lower stability
• VAE
• objective: reconstruct real data
• uses a pixel-to-pixel loss
• output images are more blurred
• lower diversity, higher stability
[Figure: sample images generated by a GAN and by a VAE.]
A. B. L. Larsen, S. K. Sønderby, and O. Winther. Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300, 2015.
AAE (Adversarial Autoencoder)
• minimize reconstruction loss
• minimize distance between encoded latent distribution and prior distribution
[Diagram: data X → encoder Q → latent code Q(X) → decoder P → reconstruction; reconstruction loss as in an AE. Instead of a KL term, a discriminator D receives Q(X) and samples from the prior p(z) and outputs 1/0. Adversarial training: D distinguishes Q(X) being fake from p(z) samples being real; Q makes Q(X) indistinguishable from p(z) for D.]
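The AAE plays the GAN game in latent space rather than data space. A sketch with hypothetical encoded codes and a toy fixed discriminator standing in for a learned one:

```python
import numpy as np

rng = np.random.default_rng(0)

# AAE sketch: the discriminator sees latent codes, not images.
# q_codes: hypothetical encoder outputs Q(X); p_codes: samples from the
# prior p(z). D is trained to tell them apart; the encoder is trained so
# that Q(X) matches p(z), replacing the VAE's KL term.
q_codes = rng.normal(loc=0.5, scale=1.2, size=(64, 2))  # encoded latents
p_codes = rng.standard_normal((64, 2))                   # prior samples

def toy_discriminator(z):
    # Hypothetical fixed discriminator: logistic score on the code norm.
    return 1.0 / (1.0 + np.exp(-(np.linalg.norm(z, axis=1) - 1.0)))

d_prior = toy_discriminator(p_codes)     # should be scored "real"
d_encoded = toy_discriminator(q_codes)   # should be scored "fake"

# Same adversarial objective as a GAN, applied in latent space.
disc_objective = np.mean(np.log(d_prior + 1e-8)) + np.mean(np.log(1.0 - d_encoded + 1e-8))
print(disc_objective)
```

The design choice here is that any prior p(z) one can sample from works, since no closed-form KL is needed.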
VAE/GAN
• combines a VAE and a GAN, sharing one network as VAE decoder and GAN generator
[Diagram: data X → encoder Q → latent Q(X) ≈ p(z) via KL divergence (𝓛_prior), with z = μ + σ ∙ ε; the shared decoder/generator produces reconstructions P(z) and samples G(z); discriminator D outputs 1/0. Adversarial training: D distinguishes G(z) and P(z) being fake from X being real; the generator makes G(z) and P(z) indistinguishable from real data (𝓛_GAN). The pixel-wise reconstruction loss is replaced by a reconstruction loss on D's hidden representation (𝓛_DIS).]
• 𝓛 = 𝓛_prior + 𝓛_DIS + 𝓛_GAN
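The total objective is a plain sum of the three terms. A trivial sketch with hypothetical scalar values standing in for the computed losses:

```python
# VAE/GAN sketch: the total loss combines three terms (Larsen et al. 2015).
# The scalar values below are hypothetical, standing in for computed losses.
l_prior = 0.7   # KL divergence between Q(X) and the prior p(z)
l_dis = 1.2     # reconstruction loss on the discriminator's hidden features
l_gan = 0.9     # standard GAN adversarial loss

total = l_prior + l_dis + l_gan   # L = L_prior + L_DIS + L_GAN
print(total)
```

In the actual method the three terms are routed to different parameters: the encoder is trained on 𝓛_prior and 𝓛_DIS, the decoder/generator on 𝓛_DIS and 𝓛_GAN, and the discriminator on 𝓛_GAN.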
VAE/GAN
[Figure: generation test and reconstruction test samples for VAE, VAE_DIS, VAE/GAN, and GAN, alongside the ground truth.]
A. B. L. Larsen, S. K. Sønderby, and O. Winther. Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300, 2015.
What’s going on?
[Diagram series putting the models in a common form, data X → encoder Q → latent Q(X) (+ε) → decoder P → P(Q(X)):
• AE: reconstruction loss only.
• VAE: reconstruction loss + KL divergence between Q(X) and z ~ p(z).
• AAE: reconstruction loss + adversarial divergence (discriminator D) between Q(X) and z ~ p(z).
• GAN: no encoder; adversarial loss (discriminator D) between P(z), z ~ p(z), and real data X.
• VAE/GAN: KL divergence on the latent, adversarial loss on the data, and a reconstruction loss on D's hidden representation.]
ADA (Adversarial Domain Adaptation)
• Goal: given labeled data in the source domain, classify unlabeled data in the target domain.
[Diagram: data X_src and X_tgt → feature extractor G → features G(X_src) and G(X_tgt); classifier C predicts the class from G(X_src). Without adaptation, G(X_tgt) gives bad features for C. Adversarial training: discriminator D (1/0) distinguishes G(X_tgt) as target domain from G(X_src) as source domain; G makes G(X_tgt) and G(X_src) indistinguishable for D. At testing (classification), C is applied to G(X_tgt).]
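The adversarial feature alignment can be sketched with hypothetical features and a toy linear domain discriminator (all names and values below are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# ADA sketch: G maps inputs to features; D tries to tell source features
# G(X_src) from target features G(X_tgt). Features here are hypothetical.
src_feats = rng.normal(loc=1.0, size=(32, 3))    # stands in for G(X_src)
tgt_feats = rng.normal(loc=-1.0, size=(32, 3))   # stands in for G(X_tgt)

w = np.array([1.0, 1.0, 1.0])  # toy linear domain discriminator weights

def d_prob_source(f):
    # Probability that a feature vector comes from the source domain.
    return 1.0 / (1.0 + np.exp(-(f @ w)))

# D's objective: classify source features as 1 and target features as 0.
d_obj = np.mean(np.log(d_prob_source(src_feats) + 1e-8)) + \
        np.mean(np.log(1.0 - d_prob_source(tgt_feats) + 1e-8))

# G's adversarial objective: make target features look like source ones,
# so that a classifier C trained on source features transfers to target.
g_obj = np.mean(np.log(d_prob_source(tgt_feats) + 1e-8))
print(d_obj, g_obj)
```

Once the two feature distributions are aligned, the source-trained classifier C can be applied unchanged to target features.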
On Unifying Deep Generative Models
• G_θ – θ are the parameters of the generator
• D_φ – φ are the parameters of the discriminator
• Solid line – generative process
• Dashed line – inference process
• Hollow arrow – deterministic transformation
• Red arrow – adversarial mechanism
• q_φ^r(y|x) denotes q_φ(y|x) and q_φ(1 − y|x)
[Graphical model: latent (data) z → data (feature) x → label y, with z ~ p(z), x = G_θ(z), generative distribution p_θ(x|y), its implicit form p(G_θ(z)|y), and the adversarial inference arrow q_φ^r(y|x).]
• GAN: y = 1 if x is real, y = 0 if x is fake
• ADA: y = 1 if x is in the source domain, y = 0 if x is in the target domain
On Unifying Deep Generative Models
[Graphical model for the VAE: latent z → data x → label y, with generative distribution p_θ(x|z, y), inference distribution q_η(z|x, y), and a degenerated adversarial discriminator q*^r(y|x).]
• p_θ(x|z, y) = p_θ(x|z) if y = 0 (fake), p_data(x) if y = 1 (real)
• q*(y|x) = 1 if x is real, 0 if x is fake – a perfect discriminator
• Replacing the degenerated discriminator with a learned one yields the adversary-activated VAE (AAVAE).
On Unifying Deep Generative Models
[Graphical models side by side. GAN: latent z → data x = G_θ(z), z ~ p(z), with p_θ(x|y), p(G_θ(z)|y), and q_φ^r(y|x); the code space is degenerated. InfoGAN: the same model plus an inference distribution q_η(z|x, y) that recovers the latent code.]
On Unifying Deep Generative Models
[Graphical models side by side. GAN: latent z → data x = G_θ(z), z ~ p(z), with p_θ(x|y), p(G_θ(z)|y), and q_φ^r(y|x). AAE: obtained by swapping x and z – data x → latent z = G_θ(x), x ~ p(x), with p_θ(z|y), p(G_θ(x)|y), and q_φ^r(y|z); the discriminator now operates in the latent space.]
GAN vs VAE
[Graphical models side by side. GAN: latent z → data x = G_θ(z), z ~ p(z), with p_θ(x|y), p(G_θ(z)|y), and adversarial inference q_φ^r(y|x). VAE: p_θ(x|z, y) with inference q_η(z|x, y) and the degenerated perfect discriminator q*^r(y|x).]
InfoGAN vs VAE
[Graphical models side by side. InfoGAN: x = G_θ(z), with p_θ(x|y), p(G_θ(z)|y), q_φ^r(y|x), and latent inference q_η(z|x, y). VAE: p_θ(x|z, y), q_η(z|x, y), and the degenerated discriminator q*^r(y|x).]
InfoGAN vs AAE
[Graphical models side by side. InfoGAN: latent z → data x = G_θ(z), with p_θ(x|y), p(G_θ(z)|y), q_φ^r(y|x), and q_η(z|x, y). AAE: data x → latent z = G_θ(x), with p_θ(z|y), p(G_θ(x)|y), q_φ^r(y|z), and q_η(x|z, y); AAE is InfoGAN with the roles of x and z swapped.]
Wake-sleep Algorithm
• h – general latent variables
• λ – general parameters
• θ – generator parameters
• In the wake phase, update θ by fitting p_θ(x|h) to x and h inferred by q_λ(h|x).
• In the sleep phase, update λ based on generated samples.
• VAE: h → z, λ → η
• GAN: h → y, λ → φ

Wake: max_θ E_{q_λ(h|x) p_data(x)} [ log p_θ(x|h) ]
Sleep: max_λ E_{p_θ(x|h) p(h)} [ log q_λ(h|x) ]
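The two phases can be sketched on a hypothetical 1-D linear-Gaussian model, where each phase's maximization has a closed-form least-squares solution. Everything below (the model family, the data, the loop length) is an illustrative assumption, not the algorithm from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model: generative p_theta(x|h) = N(theta * h, 1),
# recognition q_lambda(h|x) = N(lam * x, 1), prior p(h) = N(0, 1).
theta, lam = 0.5, 0.5
data = rng.normal(loc=2.0, size=200)   # stand-in for samples from p_data(x)

for _ in range(50):
    # Wake phase: infer h ~ q_lambda(h|x) on real data, then fit theta by
    # maximizing log p_theta(x|h) -- least squares for x ≈ theta * h.
    h = lam * data + rng.standard_normal(data.size)
    theta = np.sum(h * data) / np.sum(h * h)

    # Sleep phase: dream up (h, x) pairs from the generative model, then
    # fit lambda by maximizing log q_lambda(h|x) -- least squares h ≈ lam * x.
    h_dream = rng.standard_normal(200)                      # h ~ p(h)
    x_dream = theta * h_dream + rng.standard_normal(200)    # x ~ p_theta(x|h)
    lam = np.sum(h_dream * x_dream) / np.sum(x_dream * x_dream)

print(theta, lam)
```

The wake phase mirrors the VAE's reconstruction step (fit the generator to inferred latents), and the sleep phase mirrors training the inference network on generated samples, as in the GAN discriminator correspondence above.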
References
• D. P. Kingma and M. Welling. Auto-Encoding Variational Bayes. arXiv preprint arXiv:1312.6114, 2014.
• I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative Adversarial Nets. arXiv preprint arXiv:1406.2661, 2014.
• A. B. L. Larsen, S. K. Sønderby, and O. Winther. Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300, 2015.
• A. Makhzani, J. Shlens, N. Jaitly, I. Goodfellow, and B. Frey. Adversarial Autoencoders. arXiv preprint arXiv:1511.05644, 2016.
• Z. Hu, Z. Yang, R. Salakhutdinov, and E. P. Xing. On unifying deep generative models. arXiv preprint arXiv:1706.00550, 2017.
Gas_Laws_powerpoint_notes.ppt for grade 10ROLANARIBATO3
 
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tantaDashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tantaPraksha3
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PPRINCE C P
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentationtahreemzahra82
 
TOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsTOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsssuserddc89b
 
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |aasikanpl
 
Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)DHURKADEVIBASKAR
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzohaibmir069
 
Heredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsHeredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsCharlene Llagas
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxyaramohamed343013
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Cytokinin, mechanism and its application.pptx
Cytokinin, mechanism and its application.pptxCytokinin, mechanism and its application.pptx
Cytokinin, mechanism and its application.pptxVarshiniMK
 

Recently uploaded (20)

Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
Welcome to GFDL for Take Your Child To Work Day
Welcome to GFDL for Take Your Child To Work DayWelcome to GFDL for Take Your Child To Work Day
Welcome to GFDL for Take Your Child To Work Day
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.ppt
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10
 
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tantaDashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
TOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsTOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physics
 
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistan
 
Heredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsHeredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of Traits
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Cytokinin, mechanism and its application.pptx
Cytokinin, mechanism and its application.pptxCytokinin, mechanism and its application.pptx
Cytokinin, mechanism and its application.pptx
 

Introduction to Deep Generative Models

  • 1. Introduction to Deep Generative Models Herman Dong Music and Audio Computing Lab (MACLab), Research Center for Information Technology Innovation, Academia Sinica
  • 2. MuseGAN Learn about our recent work on using GANs to compose pop songs at https://salu133445.github.io/musegan/ Hao-Wen Dong, Wen-Yi Hsiao, Li-Chia Yang and Yi-Hsuan Yang. 2017. MuseGAN: Symbolic-domain Music Generation and Accompaniment with Multi-track Sequential Generative Adversarial Networks. arXiv preprint arXiv:1709.06298.
  • 3. Outline • Brief introduction to deep generative models • AE (Autoencoder) • VAE (Variational Autoencoder) • GAN (Generative Adversarial Networks) • AAE (Adversarial Autoencoder) • VAE/GAN • ADA (Adversarial Domain Adaptation) • Reformulation • Graphical model representation • Connection to the Wake-sleep Algorithm
  • 4. AE (Autoencoder) • minimize reconstruction loss [diagram: data X → encoder E → latent E(X) → decoder D → reconstruction D(E(X)); the reconstruction loss compares X with D(E(X))]
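The X → E → D → reconstruction-loss loop above can be sketched numerically. Below is a minimal linear autoencoder in NumPy (an illustrative toy on synthetic Gaussian data, not the architecture used in the deck): the encoder E and decoder D are single weight matrices, trained by plain gradient descent on the mean squared reconstruction loss between X and D(E(X)).

```python
import numpy as np

# Toy linear autoencoder (illustrative sketch; real AEs use deep nonlinear nets).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                # data X
W_e = rng.normal(scale=0.1, size=(8, 3))     # encoder E: 8-d data -> 3-d latent
W_d = rng.normal(scale=0.1, size=(3, 8))     # decoder D: 3-d latent -> 8-d data

def recon_loss(X, W_e, W_d):
    Z = X @ W_e                              # E(X): latent code
    X_hat = Z @ W_d                          # D(E(X)): reconstruction
    return np.mean((X - X_hat) ** 2)         # reconstruction loss

lr = 0.05
losses = [recon_loss(X, W_e, W_d)]
for _ in range(200):                         # plain gradient descent
    Z = X @ W_e
    g_out = 2.0 * (Z @ W_d - X) / X.size     # d(loss) / d(reconstruction)
    g_d = Z.T @ g_out                        # gradient w.r.t. decoder weights
    g_e = X.T @ (g_out @ W_d.T)              # gradient w.r.t. encoder weights
    W_d -= lr * g_d
    W_e -= lr * g_e
    losses.append(recon_loss(X, W_e, W_d))
```

Because the latent is 3-dimensional, training drives the loss toward the best rank-3 reconstruction of the 8-dimensional data.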
  • 7. VAE (Variational Autoencoder) • minimize reconstruction loss • minimize distance between encoded latent distribution and prior distribution [diagram: data X → encoder Q → (μ, σ); reparameterization z = μ + σ ∙ ε with ε ~ N(0, I); latent z → decoder P → reconstruction P(z); reconstruction loss between X and P(z); KL divergence pushes Q(X) ≈ p(z); at testing (generation) time, sample z ~ p(z) and decode with P]
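The reparameterization trick and the KL term above can be checked numerically. A minimal sketch for a single latent dimension (μ and log σ² are made-up encoder outputs, not values from the deck): sampling z = μ + σ ∙ ε with ε ~ N(0, 1) reproduces N(μ, σ²) while keeping z differentiable in μ and σ, and the KL divergence to the standard-normal prior has the closed form commonly used in VAEs.

```python
import numpy as np

# Reparameterization trick (sketch; mu and log_var stand in for encoder outputs).
rng = np.random.default_rng(0)
mu, log_var = 0.5, np.log(0.25)              # encoder output: mean 0.5, variance 0.25
eps = rng.normal(size=100_000)               # epsilon ~ N(0, 1)
z = mu + np.exp(0.5 * log_var) * eps         # z = mu + sigma * eps  ~  N(mu, sigma^2)

# Closed-form KL( N(mu, sigma^2) || N(0, 1) ): the VAE's latent regularizer.
kl = -0.5 * (1.0 + log_var - mu**2 - np.exp(log_var))
```

The empirical mean and standard deviation of `z` match (μ, σ) = (0.5, 0.5), and `kl` is strictly positive because the encoded distribution differs from the prior.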
  • 13. GAN (Generative Adversarial Network) • minimize distance between the distribution of real data and generated samples [diagram: latent z ~ p(z) → generator G → G(z); discriminator D receives real data X and generated G(z) and outputs 1/0] • adversarial training: D maximizes log D(X) + log(1 − D(G(z))), i.e. distinguishes G(z) being fake from X being real; G minimizes log(1 − D(G(z))), i.e. makes G(z) indistinguishable from real data for D • at testing (generation) time, only G is used
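The minimax objective above has a well-known closed form at the optimal discriminator: for a fixed G, D*(x) = p_data(x) / (p_data(x) + p_g(x)), and the value E[log D*(x)] + E[log(1 − D*(G(z)))] equals −log 4 exactly when p_g = p_data. The sketch below verifies this numerically on a toy 1-D example (my own setup with Gaussian densities on a grid, not from the deck).

```python
import numpy as np

# GAN value V(D, G) = E_{x~p_data}[log D(x)] + E_{x~p_g}[log(1 - D(x))],
# evaluated at the optimal discriminator D*(x) = p_data(x) / (p_data(x) + p_g(x)).
# Toy densities: p_data = N(0, 1) and a "generator" density p_g = N(mu_g, 1).
def gaussian(x, mu):
    return np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2.0 * np.pi)

def value_at_optimal_D(mu_g):
    grid = np.linspace(-10.0, 10.0, 20001)   # integration grid
    dx = grid[1] - grid[0]
    p_data, p_g = gaussian(grid, 0.0), gaussian(grid, mu_g)
    d_star = p_data / (p_data + p_g)         # optimal discriminator
    return np.sum(p_data * np.log(d_star) + p_g * np.log(1.0 - d_star)) * dx

v_matched = value_at_optimal_D(0.0)          # generator matches data: V = -log 4
v_off = value_at_optimal_D(2.0)              # mismatched generator: V > -log 4
```

When the generator matches the data, D* is 1/2 everywhere and the value is −log 4; any mismatch raises the value by twice the Jensen-Shannon divergence, which is exactly the distance the GAN minimizes.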
  • 18. GAN vs VAE • GAN • the generator aims to fool the discriminator • the discriminator aims to distinguish generated data from real data • output images are sharper • higher diversity, lower stability • VAE • objective: reconstruct real data • uses a pixel-to-pixel loss • output images are more blurred • lower diversity, higher stability [sample images: GAN vs VAE] A. B. L. Larsen, S. K. Sønderby, and O. Winther. Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300, 2015.
  • 20. AAE (Adversarial Autoencoder) • minimize reconstruction loss • minimize distance between encoded latent distribution and prior distribution [same layout as the VAE: data X → encoder Q → Q(X) ≈ p(z) → decoder P; reconstruction loss]
  • 21. AAE (Adversarial Autoencoder) • the KL divergence on the latent space is replaced by adversarial training: a discriminator D on the latent space distinguishes Q(X) being fake from prior samples z ~ p(z) being real, while the encoder makes Q(X) indistinguishable from the prior for D
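A small numeric illustration of the AAE idea (assumptions: a 2-D Gaussian prior, synthetic "codes", and a logistic-regression discriminator hand-trained in NumPy, none of which appear in the deck): when the encoder's codes match the prior p(z), a discriminator on the latent space can do no better than chance; when they do not, it separates them easily.

```python
import numpy as np

rng = np.random.default_rng(0)

def disc_accuracy(codes, prior, steps=500, lr=0.5):
    """Train a logistic-regression discriminator (codes = fake, prior = real)
    by gradient descent and return its training accuracy (a simple stand-in
    for the AAE's latent-space discriminator D)."""
    X = np.concatenate([codes, prior])
    y = np.concatenate([np.ones(len(codes)), np.zeros(len(prior))])
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid scores
        g = p - y                                # cross-entropy gradient
        w -= lr * X.T @ g / len(X)
        b -= lr * g.mean()
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return np.mean((p > 0.5) == (y == 1))

prior = rng.normal(size=(1000, 2))               # samples from p(z) = N(0, I)
bad_codes = rng.normal(loc=3.0, size=(1000, 2))  # unregularized encoder output
good_codes = rng.normal(size=(1000, 2))          # encoder matched to the prior
acc_bad = disc_accuracy(bad_codes, prior)        # far from chance
acc_good = disc_accuracy(good_codes, prior)      # near chance (0.5)
```

The encoder's adversarial objective is precisely to push `acc_bad` down toward `acc_good`, i.e. to make the latent code distribution indistinguishable from the prior.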
  • 23. VAE/GAN [VAE part: data X → encoder Q → Q(X) ≈ p(z) (KL divergence); z = μ + σ ∙ ε; decoder P → P(z); reconstruction loss]
  • 25. VAE/GAN [GAN part: the decoder P doubles as the generator G; a discriminator D outputs 1/0 on real data X versus the decoded and generated samples P(z) and G(z)] • adversarial training (ℒ_GAN): D distinguishes G(z) and P(z) being fake from X being real; the generator makes G(z) and P(z) indistinguishable from real data • ℒ_DIS: the pixel reconstruction loss is replaced by a reconstruction loss on the discriminator's hidden representation • ℒ_prior: KL divergence between the encoded latent distribution and the prior • total loss: ℒ = ℒ_prior + ℒ_DIS + ℒ_GAN
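The total loss ℒ = ℒ_prior + ℒ_DIS + ℒ_GAN can be assembled numerically. This is a sketch with made-up numbers: a random matrix `W` stands in for a frozen discriminator feature layer, a perturbed copy of x stands in for the reconstruction P(Q(x)), and the discriminator scores are fixed constants; none of these values come from the deck.

```python
import numpy as np

# VAE/GAN total objective (sketch): L = L_prior + L_dis + L_gan.
rng = np.random.default_rng(0)
x = rng.normal(size=16)                       # a "real" sample
x_tilde = x + 0.1 * rng.normal(size=16)       # its reconstruction P(Q(x)) (made up)

# L_prior: KL(q(z|x) || p(z)) for one latent dim with made-up encoder outputs.
mu, log_var = 0.2, np.log(0.5)
L_prior = -0.5 * (1.0 + log_var - mu**2 - np.exp(log_var))

# L_dis: reconstruction loss on the DISCRIMINATOR's hidden representation,
# the VAE/GAN replacement for the pixel-wise loss. W is a stand-in feature layer.
W = rng.normal(size=(16, 8))
feat = lambda v: np.tanh(v @ W)
L_dis = np.mean((feat(x) - feat(x_tilde)) ** 2)

# L_gan: standard discriminator cross-entropy with made-up scores.
d_real, d_fake = 0.8, 0.3
L_gan = -(np.log(d_real) + np.log(1.0 - d_fake))

L = L_prior + L_dis + L_gan                   # combined VAE/GAN objective
```

Measuring the reconstruction in the discriminator's feature space rather than pixel space is what lets VAE/GAN keep the VAE's stability while producing sharper samples.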
  • 31. VAE/GAN [generation test: VAE, VAE_Dis, VAE/GAN, GAN; reconstruction test: ground truth, VAE, VAE_Dis, VAE/GAN] A. B. L. Larsen, S. K. Sønderby, and O. Winther. Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300, 2015.
  • 32. VAE/GAN [full diagram with encoder Q, decoder/generator P and G, discriminator D, and losses ℒ_prior, ℒ_DIS, ℒ_GAN (ℒ = ℒ_prior + ℒ_DIS + ℒ_GAN); the extra generator path in the latent space is marked “!?”]
  • 34. What’s going on? [VAE redrawn: data X → encoder Q → Q(X) (+ε) → decoder P → P(Q(X)); reconstruction loss; KL divergence between Q(X) and z ~ p(z)]
  • 35. What’s going on? [AAE redrawn: same autoencoding path, but the KL divergence is replaced by an adversarial divergence measured by a discriminator D on the latent space]
  • 38. What’s going on? [combined picture: reconstruction loss, hidden-representation reconstruction loss, adversarial loss on the data space, and KL or adversarial divergence on the latent space]
  • 39. ADA (Adversarial Domain Adaptation) • Goal: given labeled data in the source domain, classify unlabeled data in the target domain. [diagram: data X_src → feature extractor G → features G(X_src) → classifier C → class]
  • 40. ADA (Adversarial Domain Adaptation) [without adaptation, the features G(X_tgt) of target-domain data X_tgt are bad features for the classifier trained on the source domain]
  • 41. ADA (Adversarial Domain Adaptation) • adversarial training: a discriminator D distinguishes G(X_tgt) as target domain from G(X_src) as source domain, while G makes G(X_tgt) and G(X_src) indistinguishable for D • at testing (classification) time, target features are fed to C
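A toy 1-D stand-in for the domain-alignment step above (assumption: I replace the adversarial discriminator signal with a squared mean-discrepancy penalty between domains, which plays the same role of pushing G(X_tgt) toward G(X_src)): the feature extractor learns a shift `c` for target features until the two feature distributions line up.

```python
import numpy as np

# Toy domain adaptation: align target features with source features so that a
# domain discriminator (approximated here by a mean-discrepancy penalty) is
# reduced to chance.
rng = np.random.default_rng(0)
src = rng.normal(loc=0.0, size=2000)      # source-domain features G(X_src)
tgt_raw = rng.normal(loc=2.0, size=2000)  # target-domain data, shifted by +2

c = 0.0                                   # learnable shift applied by G to the target
for _ in range(100):
    gap = (tgt_raw + c).mean() - src.mean()   # feature-mean discrepancy
    c -= 0.5 * (2.0 * gap)                    # gradient step on gap**2

aligned_gap = abs((tgt_raw + c).mean() - src.mean())
```

After alignment, `c` has absorbed the +2 domain shift, so a source-trained classifier sees target features drawn from (approximately) the same distribution it was trained on.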
  • 44. On Unifying Deep Generative Models [graphical model over latent z (the data, in ADA), data x (the feature, in ADA), and label y]
  • 46. Notation • G_θ: generator with parameters θ • D_φ: discriminator with parameters φ • p(z): prior over the latent; x = G_θ(z)
  • 47. • solid line: generative process • dashed line: inference process • hollow arrow: deterministic transformation • red arrow: adversarial mechanism • q_φ^r(y|x) denotes the pair q_φ(y|x) and q_φ(1 − y|x) • ADA: y = 1 if x is in the source domain, 0 if x is in the target domain • GAN: y = 1 if x is real, 0 if x is fake
  • 48. GAN (ADA) [graphical model: z ~ p(z); x = G_θ(z); generative process p_θ(x|y), i.e. p_{G_θ(z)}(x|y); adversarial inference q_φ^r(y|x)]
  • 49. On Unifying Deep Generative Models — VAE [graphical model over latent z, data x, and label y]
  • 51. [inference q_η(z|x, y); a perfect discriminator q*^r(y|x) = 1 if x is real, 0 if x is fake]
  • 52. [generative process p_θ(x|z, y) = p_θ(x|z) if y = 0, p_data(x) if y = 1]
  • 53. The VAE thus contains a degenerated adversarial discriminator.
  • 54. Activating the adversary yields the adversary activated VAE (AAVAE).
  • 55. On Unifying Deep Generative Models — GAN [graphical model: z ~ p(z); x = G_θ(z); generative p_θ(x|y), i.e. p_{G_θ(z)}(x|y); adversarial inference q_φ^r(y|x)]
  • 56. InfoGAN [the GAN graphical model plus latent inference q_η(z|x, y); the plain GAN has a degenerated code space]
  • 58. AAE [swap the roles of x and z in the GAN graphical model: z = G_θ(x); generative p_θ(z|y), i.e. p_{G_θ(x)}(z|y); adversarial inference q_φ^r(y|z) — this recovers the AAE]
  • 60. GAN vs VAE [side by side: GAN — z ~ p(z), x = G_θ(z), generative p_θ(x|y), adversarial inference q_φ^r(y|x); VAE — inference q_η(z|x, y), generative p_θ(x|z, y), perfect discriminator q*^r(y|x)]
  • 61. InfoGAN vs VAE [side by side: InfoGAN — x = G_θ(z), generative p_θ(x|y), adversarial inference q_φ^r(y|x), latent inference q_η(z|x, y); VAE — inference q_η(z|x, y), generative p_θ(x|z, y), perfect discriminator q*^r(y|x)]
  • 62. InfoGAN vs AAE [side by side: InfoGAN — x = G_θ(z), generative p_θ(x|y), adversarial inference q_φ^r(y|x), latent inference q_η(z|x, y); AAE — z = G_θ(x), generative p_θ(z|y), adversarial inference q_φ^r(y|z), data inference q_η(x|z, y)]
  • 63. Wake-sleep Algorithm • h: general latent variables • λ: general parameters • θ: generator parameters • Wake phase: update θ by fitting p_θ(x|h) to x and h inferred by q_λ(h|x): max_θ E_{q_λ(h|x) p_data(x)} [log p_θ(x|h)] • Sleep phase: update λ based on generated samples: max_λ E_{p_θ(x|h) p(h)} [log q_λ(h|x)] • VAE: h → z, λ → η • GAN: h → y, λ → φ
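The sleep phase can be made concrete with a linear-Gaussian toy model (my own example, not from the deck): with the generative model h ~ N(0, 1) and x | h ~ N(h, 1), the sleep-phase update fits the recognition model q_λ(h|x) = N(a ∙ x, s) to "dreamed" pairs (h, x), and the maximum-likelihood slope a recovers the true posterior-mean coefficient 1/2.

```python
import numpy as np

# Sleep phase of wake-sleep in a linear-Gaussian toy model.
rng = np.random.default_rng(0)
h = rng.normal(size=50_000)            # dream latents  h ~ p(h) = N(0, 1)
x = h + rng.normal(size=50_000)        # dream data     x ~ p_theta(x|h) = N(h, 1)

# max_lambda E_{p_theta(x|h) p(h)} [log q_lambda(h|x)] with q_lambda(h|x) = N(a*x, s):
# the optimal slope is the least-squares regression of h on x.
a = np.sum(x * h) / np.sum(x * x)      # true posterior mean is E[h|x] = x / 2
```

This is the sense in which the sleep phase trains the inference network on the model's own samples, just as the VAE trains q_η and the GAN trains the discriminator q_φ.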
  • 64. References • D. P. Kingma and M. Welling. Auto-Encoding Variational Bayes. arXiv preprint arXiv:1312.6114, 2014. • I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative Adversarial Nets. arXiv preprint arXiv:1406.2661, 2014. • A. B. L. Larsen, S. K. Sønderby, and O. Winther. Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300, 2015. • A. Makhzani, J. Shlens, N. Jaitly, I. Goodfellow, and B. Frey. Adversarial Autoencoders. arXiv preprint arXiv:1511.05644, 2016. • Z. Hu, Z. Yang, R. Salakhutdinov, and E. P. Xing. On unifying deep generative models. arXiv preprint arXiv:1706.00550, 2017.