SlideShare a Scribd company logo
1 of 64
Download to read offline
Introduction to Deep
Generative Models
Herman Dong
Music and Audio Computing Lab (MACLab),
Research Center for Information Technology Innovation,
Academia Sinica
MuseGAN
Learn about our recent work on using GAN to compose
pop song at https://salu133445.github.io/musegan/
Hao-Wen Dong, Wen-Yi Hsiao, Li-Chia Yang and Yi-Hsuan Yang. 2017. MuseGAN: Symbolic-domain Music Generation
and Accompaniment with Multi-track Sequential Generative Adversarial Networks. arXiv preprint arXiv:1709.06298.
Outline
• Brief introduction to deep generative models
• AE (Autoencoder)
• VAE (Variational Autoencoder)
• GAN (Generative Adversarial Networks)
• AAE (Adversarial Autoencoder)
• VAE/GAN
• ADA (Adversarial Domain Adaption)
• Reformulation
• Graphical model representation
• Connection to Wake-sleep Algorithm
AE (Autoencoder)
• minimize reconstruction loss
EX
D
E(X)
D(E(X))
Lossreconstruction
loss
latent
data
AE (Autoencoder)
• minimize reconstruction loss
reconstruction
loss
EX
D
E(X)
D(E(X))
Loss
latent
data
AE (Autoencoder)
• minimize reconstruction loss
reconstruction
loss
EX
D
E(X)
D(E(X))
Loss
latent
data
VAE (Variational Autoencoder)
• minimize reconstruction loss
• minimize distance between encoded latent distribution and prior distribution
QX Q(X)
PP(z) z ε
latentdata
VAE (Variational Autoencoder)
• minimize reconstruction loss
• minimize distance between encoded latent distribution and prior distribution
QX Q(X)
p(z)
Loss
Loss
PP(z) z ε
reconstruction
loss
KL
divergence
latentdata
≈
VAE (Variational Autoencoder)
• minimize reconstruction loss
• minimize distance between encoded latent distribution and prior distribution
QX Q(X)
p(z)
Loss
Loss
PP(z) z ε
reconstruction
loss
KL
divergence
latentdata
≈
VAE (Variational Autoencoder)
• minimize reconstruction loss
• minimize distance between encoded latent distribution and prior distribution
QX Q(X)
p(z)
Loss
Loss
PP(z) z ε
reconstruction
loss
KL
divergence
latentdata
μ σ
≈
VAE (Variational Autoencoder)
• minimize reconstruction loss
• minimize distance between encoded latent distribution and prior distribution
QX Q(X)
p(z)
Loss
Loss
PP(z) z ε
reconstruction
loss
KL
divergence
latentdata
μ σ
μ
=
z σ
+ ∙
ε
≈
VAE (Variational Autoencoder)
• minimize reconstruction loss
• minimize distance between encoded latent distribution and prior distribution
QX Q(X)
p(z)
Loss
Loss
PP(z) z ε
testing
(generation)
reconstruction
loss
KL
divergence
latentdata
μ σ
μ
=
z σ
+ ∙
ε
≈
GAN (Generative Adversarial Network)
• minimize distance between the distribution of real data and generated
samples
Gz~p(z) G(z)
D
X
1/0
latent data
GAN (Generative Adversarial Network)
• minimize distance between the distribution of real data and generated
samples
Gz~p(z) G(z)
D
X
1/0
adversarial
training
log(1-D(X))
+
log(D(G(z)))
log(1-D(G(z)))
Make G(z) undistinguishable
from real data for D
Distinguish G(z) being fake
from X being real
latent data
GAN (Generative Adversarial Network)
• minimize distance between the distribution of real data and generated
samples
Gz~p(z) G(z)
D
X
1/0
adversarial
training
log(1-D(X))
+
log(D(G(z)))
log(1-D(G(z)))
Make G(z) undistinguishable
from real data for D
Distinguish G(z) being fake
from X being real
latent data
GAN (Generative Adversarial Network)
• minimize distance between the distribution of real data and generated
samples
Gz~p(z) G(z)
D
X
1/0
adversarial
training
log(1-D(X))
+
log(D(G(z)))
log(1-D(G(z)))
Make G(z) undistinguishable
from real data for D
Distinguish G(z) being fake
from X being real
latent data
GAN (Generative Adversarial Network)
• minimize distance between the distribution of real data and generated
samples
Gz~p(z) G(z)
D
X
1/0
adversarial
training
log(1-D(X))
+
log(D(G(z)))
log(1-D(G(z)))
Make G(z) undistinguishable
from real data for D
Distinguish G(z) being fake
from X being real
testing
(generation)
latent data
GAN vs VAE
• GAN
• Generator aim to fool the discriminator
• Discriminator aim to distinguish generated data from real data
• output images are sharper
• higher diversity, lower stability
• VAE
• Objective: reconstruct real data
• using pixel-to-pixel loss
• output images are more blurred
• lower diversity, higher stability
GAN
VAE
A. B. L. Larsen, S. K. Sønderby, and O. Winther. Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300, 2015.
GAN vs VAE
• GAN
• Generator aim to fool the discriminator
• Discriminator aim to distinguish generated data from real data
• output images are sharper
• higher diversity, lower stability
• VAE
• Objective: reconstruct real data
• using pixel-to-pixel loss
• output images are more blurred
• lower diversity, higher stability
GAN
VAE
A. B. L. Larsen, S. K. Sønderby, and O. Winther. Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300, 2015.
AAE (Adversarial Autoencoder)
• minimize reconstruction loss
• minimize distance between encoded latent distribution and prior distribution
QX Q(X)
p(z)
Loss
PP(z) z ε
reconstruction
loss
latentdata
≈
Loss
KL
divergence
AAE (Adversarial Autoencoder)
• minimize reconstruction loss
• minimize distance between encoded latent distribution and prior distribution
QX Q(X)
p(z)
Loss
PP(z)
reconstruction
loss
latentdata
≈
D 1/0
adversarial
training
Make Q(X) undistinguishable
from real p(X) for D
Distinguish Q(X) being fake
from p(X) being real
AAE (Adversarial Autoencoder)
• minimize reconstruction loss
• minimize distance between encoded latent distribution and prior distribution
QX Q(X)
p(z)
Loss
PP(z)
reconstruction
loss
latentdata
≈
D 1/0
adversarial
training
Make Q(X) undistinguishable
from real p(X) for D
Distinguish Q(X) being fake
from p(X) being real
VAE/GAN
Q Q(X)
p(z)
Loss
PP(z) z ε
KL
divergence
latent
data
≈
Loss
reconstruction
loss
X
VAE
GAN
VAE/GAN
data
G(z) zG
latent
D1/0
X
VAE/GAN
Q Q(X)
p(z)
Loss
PP(z) z ε
KL
divergence
latent
data
≈
G(z) zG
D1/0
X
VAE/GAN
Q Q(X)
p(z)
Loss
PP(z) z ε
KL
divergence
latent
data
≈
G(z) z
D1/0
(G)
X
VAE/GAN
Q Q(X)
p(z)
Loss
PP(z) z ε
KL
divergence
latent
data
≈
G(z) z
D1/0
adversarial
training
Distinguish G(z) and P(z)
being fake from X being real
𝓛𝓛GAN
(G)
X
Make G(z) and P(z)
undistinguishable
from real data
VAE/GAN
Q Q(X)
p(z)
Loss
PP(z) z ε
KL
divergence
latent
data
≈
G(z) z
D1/0
adversarial
training
Distinguish G(z) and P(z)
being fake from X being real
𝓛𝓛GAN
Loss
𝓛𝓛DIS hidden representation
reconstruction loss
(G)
X
Make G(z) and P(z)
undistinguishable
from real data
VAE/GAN
Q Q(X)
p(z)
Loss
PP(z) z ε
KL
divergence
latent
data
≈
G(z) z
D1/0
adversarial
training
Distinguish G(z) and P(z)
being fake from X being real
𝓛𝓛GAN
Loss
𝓛𝓛DIS hidden representation
reconstruction loss
𝓛𝓛𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩
(G)
X
Make G(z) and P(z)
undistinguishable
from real data
𝓛𝓛 = 𝓛𝓛𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩 + 𝓛𝓛DIS + 𝓛𝓛GAN
VAE/GAN
Q Q(X)
p(z)
Loss
PP(z) z ε
KL
divergence
latent
data
≈
G(z) z
D1/0
adversarial
training
Distinguish G(z) and P(z)
being fake from X being real
𝓛𝓛GAN
Loss
𝓛𝓛DIS hidden representation
reconstruction loss
𝓛𝓛𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩
(G)
𝓛𝓛 = 𝓛𝓛𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩 + 𝓛𝓛DIS + 𝓛𝓛GAN
X
Make G(z) and P(z)
undistinguishable
from real data
VAE/GAN
VAE
VAEDIS
GAN/VAE
GAN
Ground
Truth
VAE
VAEDIS
VAE/GAN
Generation test Reconstruction test
A. B. L. Larsen, S. K. Sønderby, and O. Winther. Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300, 2015.
VAE/GAN
Q Q(X)
p(z)
Loss
PP(z) z ε
KL
divergence
latent
data
≈
G(z) zG
latent
D1/0
adversarial
training
Distinguish G(z) and P(z)
being fake from X being real
𝓛𝓛GAN
Loss
𝓛𝓛DIS hidden representation
reconstruction loss
𝓛𝓛𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩
(G)
𝓛𝓛 = 𝓛𝓛𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩 + 𝓛𝓛DIS + 𝓛𝓛GAN
!?
X
What’s going on?
latentdata
Loss
reconstruction
loss
P
X Q
Q(X)
(+ε)
P(Q(X))
AE
What’s going on?
latentdata
Loss
reconstruction
loss
P z~p(z)
Loss KL divergenceX Q
Q(X)
(+ε)
P(Q(X))
VAE
What’s going on?
latentdata
Loss
reconstruction
loss
P z~p(z) 1/0D
adversarial
divergence
X Q
Q(X)
(+ε)
P(Q(X))
AAE
What’s going on?
latentdata
D1/0
adversarial
loss P z~p(z)P(z)
X
GAN
What’s going on?
latentdata
D1/0
adversarial
loss
Loss
hidden
representation
reconstruction
loss
P z~p(z)P(z)
Loss KL divergenceX Q
Q(X)
(+ε)
P(Q(X))
VAE/GAN
What’s going on?
latentdata
Loss
reconstruction
loss
D1/0
adversarial
loss
Loss
hidden
representation
reconstruction
loss
P z~p(z)P(z)
Loss KL divergence
1/0D
adversarial
divergence
X Q
Q(X)
(+ε)
P(Q(X))
ADA (Adversarial Domain Adaption)
• Goal: given labeled data in source domain, aim to classify unlabeled data in
target domain.
G
G(Xsrc)
data feature
Xsrc C class
ADA (Adversarial Domain Adaption)
• Goal: given labeled data in source domain, aim to classify unlabeled data in
target domain.
G
G(Xsrc)
Xtgt
data feature
Xsrc
G(Xtgt)
C class
bad features
ADA (Adversarial Domain Adaption)
• Goal: given labeled data in source domain, aim to classify unlabeled data in
target domain.
G
G(Xsrc)
Xtgt
adversarial
training
Make G(Xtgt) and G(Xsrc)
undistinguishable for D
Distinguish G(Xtgt) as target domain
from G(Xsrc) as source domain
data feature
Xsrc
G(Xtgt) D 1/0
C class
ADA (Adversarial Domain Adaption)
• Goal: given labeled data in source domain, aim to classify unlabeled data in
target domain.
G
G(Xsrc)
Xtgt
adversarial
training
Make G(Xtgt) and G(Xsrc)
undistinguishable for D
Distinguish G(Xtgt) as target domain
from G(Xsrc) as source domain
data feature
Xsrc
G(Xtgt) D 1/0
C class
ADA (Adversarial Domain Adaption)
• Goal: given labeled data in source domain, aim to classify unlabeled data in
target domain.
G
G(Xsrc)
Xtgt
adversarial
training
Make G(Xtgt) and G(Xsrc)
undistinguishable for D
Distinguish G(Xtgt) as target domain
from G(Xsrc) as source domain
data feature
Xsrc
G(Xtgt) D 1/0
C class
testing
(classification)
On Unifying Deep Generative Models
z y
x
latent (data)
data (feature)
label
On Unifying Deep Generative Models
z y
x
latent (data)
data (feature)
label
𝒑𝒑 𝒛𝒛
On Unifying Deep Generative Models
• 𝑮𝑮𝜽𝜽 – 𝜃𝜃 are parameters in generator
• 𝑫𝑫𝝓𝝓 – 𝜙𝜙 are parameters in generator
z y
x
latent (data)
data (feature)
label
𝒑𝒑 𝒛𝒛
𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
On Unifying Deep Generative Models
• 𝑮𝑮𝜽𝜽 – 𝜃𝜃 are parameters in generator
• 𝑫𝑫𝝓𝝓 – 𝜙𝜙 are parameters in generator
• Solid line – generative process
• Dashed line – inference process
• Hollow arrow – deterministic transformation
• Red arrow – adversarial mechanism
• 𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒙𝒙 denotes 𝒒𝒒𝝓𝝓 𝒚𝒚 𝒙𝒙 and 𝒒𝒒𝝓𝝓 𝟏𝟏 − 𝒚𝒚 𝒙𝒙
z y
x
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒙𝒙
latent (data)
data (feature)
label
𝒑𝒑 𝒛𝒛
𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
𝒚𝒚 = �
𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑖𝑖𝑖𝑖 𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠 𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑
𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑖𝑖𝑖𝑖 𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑
ADA
𝒚𝒚 = �
𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟
𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓
GAN
On Unifying Deep Generative Models
• 𝑮𝑮𝜽𝜽 – 𝜃𝜃 are parameters in generator
• 𝑫𝑫𝝓𝝓 – 𝜙𝜙 are parameters in generator
• Solid line – generative process
• Dashed line – inference process
• Hollow arrow – deterministic transformation
• Red arrow – adversarial mechanism
• 𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒙𝒙 denotes 𝒒𝒒𝝓𝝓 𝒚𝒚 𝒙𝒙 and 𝒒𝒒𝝓𝝓 𝟏𝟏 − 𝒚𝒚 𝒙𝒙
z y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒙𝒙
𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚
𝒚𝒚 = �
𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟
𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓
GAN
𝒚𝒚 = �
𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑖𝑖𝑖𝑖 𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠 𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑
𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑖𝑖𝑖𝑖 𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑
ADA
latent (data)
data (feature)
label
𝒑𝒑 𝒛𝒛
𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
GAN
(ADA)
On Unifying Deep Generative Models
z y
x
latent
data
label
VAE
On Unifying Deep Generative Models
z y
x
latent
data
label
𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚
VAE
On Unifying Deep Generative Models
z y
x
𝒒𝒒∗
𝒓𝒓
𝒚𝒚 𝒙𝒙
latent
data
label
𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚 = �
𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟
𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓
perfect discriminator
VAE
On Unifying Deep Generative Models
z y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛, 𝒚𝒚
𝒒𝒒∗
𝒓𝒓
𝒚𝒚 𝒙𝒙
latent
data
label
𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚
= �
𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛 , 𝑖𝑖𝑖𝑖 𝑦𝑦 = 0
𝒑𝒑𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒙𝒙 , 𝑖𝑖𝑖𝑖 𝑦𝑦 = 1
= �
𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟
𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓
perfect discriminator
VAE
degenerated
adversarial
discriminator
On Unifying Deep Generative Models
z y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛, 𝒚𝒚
𝒒𝒒∗
𝒓𝒓
𝒚𝒚 𝒙𝒙
latent
data
label
𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚
= �
𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛 , 𝑖𝑖𝑖𝑖 𝑦𝑦 = 0
𝒑𝒑𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒙𝒙 , 𝑖𝑖𝑖𝑖 𝑦𝑦 = 1
= �
𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟
𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓
perfect discriminator
VAE
degenerated
adversarial
discriminator
adversary
activated VAE
(AAVAE)
On Unifying Deep Generative Models
z y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛, 𝒚𝒚
𝒒𝒒∗
𝒓𝒓
𝒚𝒚 𝒙𝒙
latent
data
label
𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚
= �
𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛 , 𝑖𝑖𝑖𝑖 𝑦𝑦 = 0
𝒑𝒑𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒙𝒙 , 𝑖𝑖𝑖𝑖 𝑦𝑦 = 1
= �
𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟
𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓
perfect discriminator
VAE
On Unifying Deep Generative Models
z y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒙𝒙
latent
data
labelz y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒙𝒙
𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚
latent
data
label
𝒑𝒑 𝒛𝒛
𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
GAN
𝒑𝒑 𝒛𝒛
𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚
𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
GAN
On Unifying Deep Generative Models
z y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒙𝒙
latent
data
label
𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚
InfoGAN
z y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒙𝒙
𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚
latent
data
label
𝒑𝒑 𝒛𝒛
𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
GAN
degenerated
code space
𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚
On Unifying Deep Generative Models
z y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒙𝒙
latent
data
labelz y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒙𝒙
𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚
latent
data
label
𝒑𝒑 𝒛𝒛
𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
GAN
𝒑𝒑 𝒛𝒛
𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚
𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
GAN
On Unifying Deep Generative Models
x y
z 𝒑𝒑𝜽𝜽 𝒛𝒛 𝒚𝒚
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒛𝒛
data
latent
labelz y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒙𝒙
𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚
latent
data
label
𝒑𝒑 𝒛𝒛
𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
GAN
𝒑𝒑 𝒙𝒙
𝒛𝒛 = 𝑮𝑮𝜽𝜽 𝒙𝒙
𝒑𝒑 𝑮𝑮𝜽𝜽 𝒙𝒙 𝒚𝒚
On Unifying Deep Generative Models
x y
z 𝒑𝒑𝜽𝜽 𝒛𝒛 𝒚𝒚
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒛𝒛
data
latent
label
AAE
z y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒙𝒙
𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚
latent
data
label
𝒑𝒑 𝒛𝒛
𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
GAN
𝒑𝒑 𝒙𝒙
𝒛𝒛 = 𝑮𝑮𝜽𝜽 𝒙𝒙
𝒑𝒑 𝑮𝑮𝜽𝜽 𝒙𝒙 𝒚𝒚
GAN vs VAE
z y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒙𝒙
𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚
latent
data
label
𝒑𝒑 𝒛𝒛
𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
GAN
z y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛, 𝒚𝒚
𝒒𝒒∗
𝒓𝒓
𝒚𝒚 𝒙𝒙
latent
data
𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚
VAE
label
InfoGAN vs VAE
z y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛, 𝒚𝒚
𝒒𝒒∗
𝒓𝒓
𝒚𝒚 𝒙𝒙
latent
data
𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚
VAE
labelz y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒙𝒙
latent
data
label
𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚
InfoGAN
𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚
𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
InfoGAN vs AAE
z y
x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒙𝒙
latent
data
label
𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚
InfoGAN
x y
z 𝒑𝒑𝜽𝜽 𝒛𝒛 𝒚𝒚
𝒒𝒒𝝓𝝓
𝒓𝒓
𝒚𝒚 𝒛𝒛
latent
label
𝒛𝒛 = 𝑮𝑮𝜽𝜽 𝒙𝒙
𝒑𝒑 𝑮𝑮𝜽𝜽 𝒙𝒙 𝒚𝒚
𝒒𝒒𝜼𝜼 𝒙𝒙 𝒛𝒛, 𝒚𝒚
data
𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚
AAE
𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
Wake-sleep Algorithm
• 𝒉𝒉 - general latent variables
• 𝝀𝝀 - general parameters
• 𝜽𝜽 - generator parameters
• In wake phase, update 𝜽𝜽 by fitting 𝒑𝒑𝜽𝜽 𝒙𝒙|𝒉𝒉 to 𝒙𝒙 and 𝒉𝒉 inferred by 𝒒𝒒𝝀𝝀 𝒉𝒉|𝒙𝒙 .
• In sleep phase, update 𝝀𝝀 based on generated samples.
• VAE: 𝒉𝒉 → 𝒛𝒛, 𝝀𝝀 → 𝜼𝜼
• GAN: 𝒉𝒉 → 𝒚𝒚, 𝝀𝝀 → 𝝓𝝓
Wake: max
𝜽𝜽
𝔼𝔼𝒒𝒒𝝀𝝀 𝒉𝒉|𝒙𝒙 𝒑𝒑𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒙𝒙 log 𝑝𝑝𝜃𝜃 𝑥𝑥 ℎ
Sleep: max
𝜆𝜆
𝔼𝔼𝒑𝒑𝜽𝜽 𝒙𝒙|𝒉𝒉 𝒑𝒑 𝒉𝒉 log 𝑞𝑞𝜆𝜆 ℎ 𝑥𝑥
References
• D. P. Kingma, M. Welling. Auto-Encoding Variational Bayes. arXiv preprint arXiv:1312.6114, 2014
• I. J. GoodFellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio.
Genrative Adversrial Nets. arXiv:preprint arXiv:1406.2661, 2014
• A. B. L. Larsen, S. K. Sønderby and O. Winther. Autoencoding beyond pixels using a learned
similarity metric. arXiv preprint arXiv:1512.09300, 2015
• A. Makhzani, J. Shlens, N. Jaitly, I. Goodfellow and B. Frey. Adversarial Autoencoders. arXiv preprint
arXiv:1511.05644, 2016
• Z. Hu, Z. Yang, R. Salakhutdinov and E. P. Xing. On unifying deep generative models. arXiv preprint
arXiv:1706.00550, 2017

More Related Content

What's hot

Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...
Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...
Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...Edureka!
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networksYunjey Choi
 
GANs and Applications
GANs and ApplicationsGANs and Applications
GANs and ApplicationsHoang Nguyen
 
Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...
Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...
Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...Vitaly Bondar
 
Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)Yuta Niki
 
Natural language processing (nlp)
Natural language processing (nlp)Natural language processing (nlp)
Natural language processing (nlp)Kuppusamy P
 
Basic Generative Adversarial Networks
Basic Generative Adversarial NetworksBasic Generative Adversarial Networks
Basic Generative Adversarial NetworksDong Heon Cho
 
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기NAVER Engineering
 
Brief introduction on GAN
Brief introduction on GANBrief introduction on GAN
Brief introduction on GANDai-Hai Nguyen
 
Autoencoders
AutoencodersAutoencoders
AutoencodersCloudxLab
 
Deep Learning: Application Landscape - March 2018
Deep Learning: Application Landscape - March 2018Deep Learning: Application Landscape - March 2018
Deep Learning: Application Landscape - March 2018Grigory Sapunov
 
Landscape of AI/ML in 2023
Landscape of AI/ML in 2023Landscape of AI/ML in 2023
Landscape of AI/ML in 2023HyunJoon Jung
 
Diffusion models beat gans on image synthesis
Diffusion models beat gans on image synthesisDiffusion models beat gans on image synthesis
Diffusion models beat gans on image synthesisBeerenSahu
 
Data preprocessing in Machine learning
Data preprocessing in Machine learning Data preprocessing in Machine learning
Data preprocessing in Machine learning pyingkodi maran
 
GAN - Theory and Applications
GAN - Theory and ApplicationsGAN - Theory and Applications
GAN - Theory and ApplicationsEmanuele Ghelfi
 
GANs Presentation.pptx
GANs Presentation.pptxGANs Presentation.pptx
GANs Presentation.pptxMAHMOUD729246
 
Reasoning in Description Logics
Reasoning in Description Logics  Reasoning in Description Logics
Reasoning in Description Logics R A Akerkar
 

What's hot (20)

Transformers in 2021
Transformers in 2021Transformers in 2021
Transformers in 2021
 
Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...
Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...
Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks
 
GANs and Applications
GANs and ApplicationsGANs and Applications
GANs and Applications
 
Gpt models
Gpt modelsGpt models
Gpt models
 
Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...
Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...
Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...
 
Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)
 
Natural language processing (nlp)
Natural language processing (nlp)Natural language processing (nlp)
Natural language processing (nlp)
 
Basic Generative Adversarial Networks
Basic Generative Adversarial NetworksBasic Generative Adversarial Networks
Basic Generative Adversarial Networks
 
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
 
Attention Is All You Need
Attention Is All You NeedAttention Is All You Need
Attention Is All You Need
 
Brief introduction on GAN
Brief introduction on GANBrief introduction on GAN
Brief introduction on GAN
 
Autoencoders
AutoencodersAutoencoders
Autoencoders
 
Deep Learning: Application Landscape - March 2018
Deep Learning: Application Landscape - March 2018Deep Learning: Application Landscape - March 2018
Deep Learning: Application Landscape - March 2018
 
Landscape of AI/ML in 2023
Landscape of AI/ML in 2023Landscape of AI/ML in 2023
Landscape of AI/ML in 2023
 
Diffusion models beat gans on image synthesis
Diffusion models beat gans on image synthesisDiffusion models beat gans on image synthesis
Diffusion models beat gans on image synthesis
 
Data preprocessing in Machine learning
Data preprocessing in Machine learning Data preprocessing in Machine learning
Data preprocessing in Machine learning
 
GAN - Theory and Applications
GAN - Theory and ApplicationsGAN - Theory and Applications
GAN - Theory and Applications
 
GANs Presentation.pptx
GANs Presentation.pptxGANs Presentation.pptx
GANs Presentation.pptx
 
Reasoning in Description Logics
Reasoning in Description Logics  Reasoning in Description Logics
Reasoning in Description Logics
 

Similar to Introduction to Deep Generative Models

Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18
Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18
Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18Olga Zinkevych
 
Introduction to Generative Adversarial Networks
Introduction to Generative Adversarial NetworksIntroduction to Generative Adversarial Networks
Introduction to Generative Adversarial NetworksBennoG1
 
A Walk in the GAN Zoo
A Walk in the GAN ZooA Walk in the GAN Zoo
A Walk in the GAN ZooLarry Guo
 
Otter 2016-11-28-01-ss
Otter 2016-11-28-01-ssOtter 2016-11-28-01-ss
Otter 2016-11-28-01-ssRuo Ando
 
從 VAE 走向深度學習新理論
從 VAE 走向深度學習新理論從 VAE 走向深度學習新理論
從 VAE 走向深度學習新理論岳華 杜
 
Deep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptDeep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptRMDAcademicCoordinat
 
Deep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptDeep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptsomeyamohsen2
 
Deep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptDeep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptGayathriSanthosh11
 
NTHU AI Reading Group: Improved Training of Wasserstein GANs
NTHU AI Reading Group: Improved Training of Wasserstein GANsNTHU AI Reading Group: Improved Training of Wasserstein GANs
NTHU AI Reading Group: Improved Training of Wasserstein GANsMark Chang
 
Patch Matching with Polynomial Exponential Families and Projective Divergences
Patch Matching with Polynomial Exponential Families and Projective DivergencesPatch Matching with Polynomial Exponential Families and Projective Divergences
Patch Matching with Polynomial Exponential Families and Projective DivergencesFrank Nielsen
 
Deep generative model.pdf
Deep generative model.pdfDeep generative model.pdf
Deep generative model.pdfHyungjoo Cho
 
Murphy: Machine learning A probabilistic perspective: Ch.9
Murphy: Machine learning A probabilistic perspective: Ch.9Murphy: Machine learning A probabilistic perspective: Ch.9
Murphy: Machine learning A probabilistic perspective: Ch.9Daisuke Yoneoka
 
CVPR2010: higher order models in computer vision: Part 1, 2
CVPR2010: higher order models in computer vision: Part 1, 2CVPR2010: higher order models in computer vision: Part 1, 2
CVPR2010: higher order models in computer vision: Part 1, 2zukun
 
[Japanese]Obake-GAN (Perturbative GAN): GAN with Perturbation Layers
[Japanese]Obake-GAN (Perturbative GAN): GAN with Perturbation Layers[Japanese]Obake-GAN (Perturbative GAN): GAN with Perturbation Layers
[Japanese]Obake-GAN (Perturbative GAN): GAN with Perturbation Layersyumakishi
 
VAE-type Deep Generative Models
VAE-type Deep Generative ModelsVAE-type Deep Generative Models
VAE-type Deep Generative ModelsKenta Oono
 
Detecting Bugs in Binaries Using Decompilation and Data Flow Analysis
Detecting Bugs in Binaries Using Decompilation and Data Flow AnalysisDetecting Bugs in Binaries Using Decompilation and Data Flow Analysis
Detecting Bugs in Binaries Using Decompilation and Data Flow AnalysisSilvio Cesare
 
pptx - Psuedo Random Generator for Halfspaces
pptx - Psuedo Random Generator for Halfspacespptx - Psuedo Random Generator for Halfspaces
pptx - Psuedo Random Generator for Halfspacesbutest
 
pptx - Psuedo Random Generator for Halfspaces
pptx - Psuedo Random Generator for Halfspacespptx - Psuedo Random Generator for Halfspaces
pptx - Psuedo Random Generator for Halfspacesbutest
 

Similar to Introduction to Deep Generative Models (20)

Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18
Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18
Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18
 
Introduction to Generative Adversarial Networks
Introduction to Generative Adversarial NetworksIntroduction to Generative Adversarial Networks
Introduction to Generative Adversarial Networks
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
Abstract machines for great good
Abstract machines for great goodAbstract machines for great good
Abstract machines for great good
 
A Walk in the GAN Zoo
A Walk in the GAN ZooA Walk in the GAN Zoo
A Walk in the GAN Zoo
 
Otter 2016-11-28-01-ss
Otter 2016-11-28-01-ssOtter 2016-11-28-01-ss
Otter 2016-11-28-01-ss
 
從 VAE 走向深度學習新理論
從 VAE 走向深度學習新理論從 VAE 走向深度學習新理論
從 VAE 走向深度學習新理論
 
Deep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptDeep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.ppt
 
Deep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptDeep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.ppt
 
Deep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptDeep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.ppt
 
NTHU AI Reading Group: Improved Training of Wasserstein GANs
NTHU AI Reading Group: Improved Training of Wasserstein GANsNTHU AI Reading Group: Improved Training of Wasserstein GANs
NTHU AI Reading Group: Improved Training of Wasserstein GANs
 
Patch Matching with Polynomial Exponential Families and Projective Divergences
Patch Matching with Polynomial Exponential Families and Projective DivergencesPatch Matching with Polynomial Exponential Families and Projective Divergences
Patch Matching with Polynomial Exponential Families and Projective Divergences
 
Deep generative model.pdf
Deep generative model.pdfDeep generative model.pdf
Deep generative model.pdf
 
Murphy: Machine learning A probabilistic perspective: Ch.9
Murphy: Machine learning A probabilistic perspective: Ch.9Murphy: Machine learning A probabilistic perspective: Ch.9
Murphy: Machine learning A probabilistic perspective: Ch.9
 
CVPR2010: higher order models in computer vision: Part 1, 2
CVPR2010: higher order models in computer vision: Part 1, 2CVPR2010: higher order models in computer vision: Part 1, 2
CVPR2010: higher order models in computer vision: Part 1, 2
 
[Japanese]Obake-GAN (Perturbative GAN): GAN with Perturbation Layers
[Japanese]Obake-GAN (Perturbative GAN): GAN with Perturbation Layers[Japanese]Obake-GAN (Perturbative GAN): GAN with Perturbation Layers
[Japanese]Obake-GAN (Perturbative GAN): GAN with Perturbation Layers
 
VAE-type Deep Generative Models
VAE-type Deep Generative ModelsVAE-type Deep Generative Models
VAE-type Deep Generative Models
 
Detecting Bugs in Binaries Using Decompilation and Data Flow Analysis
Detecting Bugs in Binaries Using Decompilation and Data Flow AnalysisDetecting Bugs in Binaries Using Decompilation and Data Flow Analysis
Detecting Bugs in Binaries Using Decompilation and Data Flow Analysis
 
pptx - Psuedo Random Generator for Halfspaces
pptx - Psuedo Random Generator for Halfspacespptx - Psuedo Random Generator for Halfspaces
pptx - Psuedo Random Generator for Halfspaces
 
pptx - Psuedo Random Generator for Halfspaces
pptx - Psuedo Random Generator for Halfspacespptx - Psuedo Random Generator for Halfspaces
pptx - Psuedo Random Generator for Halfspaces
 

More from Hao-Wen (Herman) Dong

Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...
Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...
Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...Hao-Wen (Herman) Dong
 
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...Hao-Wen (Herman) Dong
 
Organizing Machine Learning Projects - Repository Organization
Organizing Machine Learning Projects - Repository OrganizationOrganizing Machine Learning Projects - Repository Organization
Organizing Machine Learning Projects - Repository OrganizationHao-Wen (Herman) Dong
 
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...Hao-Wen (Herman) Dong
 
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GANRecent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GANHao-Wen (Herman) Dong
 

More from Hao-Wen (Herman) Dong (6)

Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...
Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...
Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...
 
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...
 
Organizing Machine Learning Projects - Repository Organization
Organizing Machine Learning Projects - Repository OrganizationOrganizing Machine Learning Projects - Repository Organization
Organizing Machine Learning Projects - Repository Organization
 
What is Critical in GAN Training?
What is Critical in GAN Training?What is Critical in GAN Training?
What is Critical in GAN Training?
 
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...
 
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GANRecent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN
 

Recently uploaded

SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naJASISJULIANOELYNV
 
BREEDING FOR RESISTANCE TO BIOTIC STRESS.pptx
BREEDING FOR RESISTANCE TO BIOTIC STRESS.pptxBREEDING FOR RESISTANCE TO BIOTIC STRESS.pptx
BREEDING FOR RESISTANCE TO BIOTIC STRESS.pptxPABOLU TEJASREE
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024AyushiRastogi48
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentationtahreemzahra82
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxyaramohamed343013
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensorsonawaneprad
 
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdfBUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdfWildaNurAmalia2
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxpriyankatabhane
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...lizamodels9
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPirithiRaju
 
OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024innovationoecd
 

Recently uploaded (20)

SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by na
 
BREEDING FOR RESISTANCE TO BIOTIC STRESS.pptx
BREEDING FOR RESISTANCE TO BIOTIC STRESS.pptxBREEDING FOR RESISTANCE TO BIOTIC STRESS.pptx
BREEDING FOR RESISTANCE TO BIOTIC STRESS.pptx
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdf
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensor
 
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdfBUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
 
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort ServiceHot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptx
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
 
OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024
 

Introduction to Deep Generative Models

  • 1. Introduction to Deep Generative Models Herman Dong Music and Audio Computing Lab (MACLab), Research Center for Information Technology Innovation, Academia Sinica
  • 2. MuseGAN Learn about our recent work on using GAN to compose pop song at https://salu133445.github.io/musegan/ Hao-Wen Dong, Wen-Yi Hsiao, Li-Chia Yang and Yi-Hsuan Yang. 2017. MuseGAN: Symbolic-domain Music Generation and Accompaniment with Multi-track Sequential Generative Adversarial Networks. arXiv preprint arXiv:1709.06298.
  • 3. Outline • Brief introduction to deep generative models • AE (Autoencoder) • VAE (Variational Autoencoder) • GAN (Generative Adversarial Networks) • AAE (Adversarial Autoencoder) • VAE/GAN • ADA (Adversarial Domain Adaption) • Reformulation • Graphical model representation • Connection to Wake-sleep Algorithm
  • 4. AE (Autoencoder) • minimize reconstruction loss EX D E(X) D(E(X)) Lossreconstruction loss latent data
  • 5. AE (Autoencoder) • minimize reconstruction loss reconstruction loss EX D E(X) D(E(X)) Loss latent data
  • 6. AE (Autoencoder) • minimize reconstruction loss reconstruction loss EX D E(X) D(E(X)) Loss latent data
  • 7. VAE (Variational Autoencoder) • minimize reconstruction loss • minimize distance between encoded latent distribution and prior distribution QX Q(X) PP(z) z ε latentdata
  • 8. VAE (Variational Autoencoder) • minimize reconstruction loss • minimize distance between encoded latent distribution and prior distribution QX Q(X) p(z) Loss Loss PP(z) z ε reconstruction loss KL divergence latentdata ≈
  • 9. VAE (Variational Autoencoder) • minimize reconstruction loss • minimize distance between encoded latent distribution and prior distribution QX Q(X) p(z) Loss Loss PP(z) z ε reconstruction loss KL divergence latentdata ≈
  • 10. VAE (Variational Autoencoder) • minimize reconstruction loss • minimize distance between encoded latent distribution and prior distribution QX Q(X) p(z) Loss Loss PP(z) z ε reconstruction loss KL divergence latentdata μ σ ≈
  • 11. VAE (Variational Autoencoder) • minimize reconstruction loss • minimize distance between encoded latent distribution and prior distribution QX Q(X) p(z) Loss Loss PP(z) z ε reconstruction loss KL divergence latentdata μ σ μ = z σ + ∙ ε ≈
  • 12. VAE (Variational Autoencoder) • minimize reconstruction loss • minimize distance between encoded latent distribution and prior distribution QX Q(X) p(z) Loss Loss PP(z) z ε testing (generation) reconstruction loss KL divergence latentdata μ σ μ = z σ + ∙ ε ≈
  • 13. GAN (Generative Adversarial Network) • minimize distance between the distribution of real data and generated samples Gz~p(z) G(z) D X 1/0 latent data
  • 14. GAN (Generative Adversarial Network) • minimize distance between the distribution of real data and generated samples Gz~p(z) G(z) D X 1/0 adversarial training log(1-D(X)) + log(D(G(z))) log(1-D(G(z))) Make G(z) undistinguishable from real data for D Distinguish G(z) being fake from X being real latent data
  • 15. GAN (Generative Adversarial Network) • minimize distance between the distribution of real data and generated samples Gz~p(z) G(z) D X 1/0 adversarial training log(1-D(X)) + log(D(G(z))) log(1-D(G(z))) Make G(z) undistinguishable from real data for D Distinguish G(z) being fake from X being real latent data
  • 16. GAN (Generative Adversarial Network) • minimize distance between the distribution of real data and generated samples Gz~p(z) G(z) D X 1/0 adversarial training log(1-D(X)) + log(D(G(z))) log(1-D(G(z))) Make G(z) undistinguishable from real data for D Distinguish G(z) being fake from X being real latent data
  • 17. GAN (Generative Adversarial Network) • minimize distance between the distribution of real data and generated samples Gz~p(z) G(z) D X 1/0 adversarial training log(1-D(X)) + log(D(G(z))) log(1-D(G(z))) Make G(z) undistinguishable from real data for D Distinguish G(z) being fake from X being real testing (generation) latent data
  • 18. GAN vs VAE • GAN • Generator aim to fool the discriminator • Discriminator aim to distinguish generated data from real data • output images are sharper • higher diversity, lower stability • VAE • Objective: reconstruct real data • using pixel-to-pixel loss • output images are more blurred • lower diversity, higher stability GAN VAE A. B. L. Larsen, S. K. Sønderby, and O. Winther. Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300, 2015.
  • 19. GAN vs VAE • GAN • Generator aim to fool the discriminator • Discriminator aim to distinguish generated data from real data • output images are sharper • higher diversity, lower stability • VAE • Objective: reconstruct real data • using pixel-to-pixel loss • output images are more blurred • lower diversity, higher stability GAN VAE A. B. L. Larsen, S. K. Sønderby, and O. Winther. Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300, 2015.
  • 20. AAE (Adversarial Autoencoder) • minimize reconstruction loss • minimize distance between encoded latent distribution and prior distribution QX Q(X) p(z) Loss PP(z) z ε reconstruction loss latentdata ≈ Loss KL divergence
  • 21. AAE (Adversarial Autoencoder) • minimize reconstruction loss • minimize distance between encoded latent distribution and prior distribution QX Q(X) p(z) Loss PP(z) reconstruction loss latentdata ≈ D 1/0 adversarial training Make Q(X) undistinguishable from real p(X) for D Distinguish Q(X) being fake from p(X) being real
  • 22. AAE (Adversarial Autoencoder) • minimize reconstruction loss • minimize distance between encoded latent distribution and prior distribution QX Q(X) p(z) Loss PP(z) reconstruction loss latentdata ≈ D 1/0 adversarial training Make Q(X) undistinguishable from real p(X) for D Distinguish Q(X) being fake from p(X) being real
  • 23. VAE/GAN Q Q(X) p(z) Loss PP(z) z ε KL divergence latent data ≈ Loss reconstruction loss X VAE
  • 25. VAE/GAN Q Q(X) p(z) Loss PP(z) z ε KL divergence latent data ≈ G(z) zG D1/0 X
  • 26. VAE/GAN Q Q(X) p(z) Loss PP(z) z ε KL divergence latent data ≈ G(z) z D1/0 (G) X
  • 27. VAE/GAN Q Q(X) p(z) Loss PP(z) z ε KL divergence latent data ≈ G(z) z D1/0 adversarial training Distinguish G(z) and P(z) being fake from X being real 𝓛𝓛GAN (G) X Make G(z) and P(z) undistinguishable from real data
  • 28. VAE/GAN Q Q(X) p(z) Loss PP(z) z ε KL divergence latent data ≈ G(z) z D1/0 adversarial training Distinguish G(z) and P(z) being fake from X being real 𝓛𝓛GAN Loss 𝓛𝓛DIS hidden representation reconstruction loss (G) X Make G(z) and P(z) undistinguishable from real data
  • 29. VAE/GAN Q Q(X) p(z) Loss PP(z) z ε KL divergence latent data ≈ G(z) z D1/0 adversarial training Distinguish G(z) and P(z) being fake from X being real 𝓛𝓛GAN Loss 𝓛𝓛DIS hidden representation reconstruction loss 𝓛𝓛𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩 (G) X Make G(z) and P(z) undistinguishable from real data 𝓛𝓛 = 𝓛𝓛𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩 + 𝓛𝓛DIS + 𝓛𝓛GAN
  • 30. VAE/GAN Q Q(X) p(z) Loss PP(z) z ε KL divergence latent data ≈ G(z) z D1/0 adversarial training Distinguish G(z) and P(z) being fake from X being real 𝓛𝓛GAN Loss 𝓛𝓛DIS hidden representation reconstruction loss 𝓛𝓛𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩 (G) 𝓛𝓛 = 𝓛𝓛𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩 + 𝓛𝓛DIS + 𝓛𝓛GAN X Make G(z) and P(z) undistinguishable from real data
  • 31. VAE/GAN VAE VAEDIS GAN/VAE GAN Ground Truth VAE VAEDIS VAE/GAN Generation test Reconstruction test A. B. L. Larsen, S. K. Sønderby, and O. Winther. Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300, 2015.
  • 32. VAE/GAN Q Q(X) p(z) Loss PP(z) z ε KL divergence latent data ≈ G(z) zG latent D1/0 adversarial training Distinguish G(z) and P(z) being fake from X being real 𝓛𝓛GAN Loss 𝓛𝓛DIS hidden representation reconstruction loss 𝓛𝓛𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩 (G) 𝓛𝓛 = 𝓛𝓛𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩𝐩 + 𝓛𝓛DIS + 𝓛𝓛GAN !? X
  • 34. What’s going on? latentdata Loss reconstruction loss P z~p(z) Loss KL divergenceX Q Q(X) (+ε) P(Q(X)) VAE
  • 35. What’s going on? latentdata Loss reconstruction loss P z~p(z) 1/0D adversarial divergence X Q Q(X) (+ε) P(Q(X)) AAE
  • 38. What’s going on? latentdata Loss reconstruction loss D1/0 adversarial loss Loss hidden representation reconstruction loss P z~p(z)P(z) Loss KL divergence 1/0D adversarial divergence X Q Q(X) (+ε) P(Q(X))
  • 39. ADA (Adversarial Domain Adaption) • Goal: given labeled data in source domain, aim to classify unlabeled data in target domain. G G(Xsrc) data feature Xsrc C class
  • 40. ADA (Adversarial Domain Adaption) • Goal: given labeled data in source domain, aim to classify unlabeled data in target domain. G G(Xsrc) Xtgt data feature Xsrc G(Xtgt) C class bad features
  • 41. ADA (Adversarial Domain Adaption) • Goal: given labeled data in source domain, aim to classify unlabeled data in target domain. G G(Xsrc) Xtgt adversarial training Make G(Xtgt) and G(Xsrc) undistinguishable for D Distinguish G(Xtgt) as target domain from G(Xsrc) as source domain data feature Xsrc G(Xtgt) D 1/0 C class
  • 42. ADA (Adversarial Domain Adaption) • Goal: given labeled data in source domain, aim to classify unlabeled data in target domain. G G(Xsrc) Xtgt adversarial training Make G(Xtgt) and G(Xsrc) undistinguishable for D Distinguish G(Xtgt) as target domain from G(Xsrc) as source domain data feature Xsrc G(Xtgt) D 1/0 C class
  • 43. ADA (Adversarial Domain Adaption) • Goal: given labeled data in source domain, aim to classify unlabeled data in target domain. G G(Xsrc) Xtgt adversarial training Make G(Xtgt) and G(Xsrc) undistinguishable for D Distinguish G(Xtgt) as target domain from G(Xsrc) as source domain data feature Xsrc G(Xtgt) D 1/0 C class testing (classification)
  • 44. On Unifying Deep Generative Models z y x latent (data) data (feature) label
  • 45. On Unifying Deep Generative Models z y x latent (data) data (feature) label 𝒑𝒑 𝒛𝒛
  • 46. On Unifying Deep Generative Models • 𝑮𝑮𝜽𝜽 – 𝜃𝜃 are parameters in generator • 𝑫𝑫𝝓𝝓 – 𝜙𝜙 are parameters in generator z y x latent (data) data (feature) label 𝒑𝒑 𝒛𝒛 𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
  • 47. On Unifying Deep Generative Models • 𝑮𝑮𝜽𝜽 – 𝜃𝜃 are parameters in generator • 𝑫𝑫𝝓𝝓 – 𝜙𝜙 are parameters in generator • Solid line – generative process • Dashed line – inference process • Hollow arrow – deterministic transformation • Red arrow – adversarial mechanism • 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒙𝒙 denotes 𝒒𝒒𝝓𝝓 𝒚𝒚 𝒙𝒙 and 𝒒𝒒𝝓𝝓 𝟏𝟏 − 𝒚𝒚 𝒙𝒙 z y x 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒙𝒙 latent (data) data (feature) label 𝒑𝒑 𝒛𝒛 𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚 = � 𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑖𝑖𝑖𝑖 𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠 𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑 𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑖𝑖𝑖𝑖 𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑 ADA 𝒚𝒚 = � 𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟 𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓 GAN
  • 48. On Unifying Deep Generative Models • 𝑮𝑮𝜽𝜽 – 𝜃𝜃 are parameters in generator • 𝑫𝑫𝝓𝝓 – 𝜙𝜙 are parameters in generator • Solid line – generative process • Dashed line – inference process • Hollow arrow – deterministic transformation • Red arrow – adversarial mechanism • 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒙𝒙 denotes 𝒒𝒒𝝓𝝓 𝒚𝒚 𝒙𝒙 and 𝒒𝒒𝝓𝝓 𝟏𝟏 − 𝒚𝒚 𝒙𝒙 z y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒙𝒙 𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚 𝒚𝒚 = � 𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟 𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓 GAN 𝒚𝒚 = � 𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑖𝑖𝑖𝑖 𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠 𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑 𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑖𝑖𝑖𝑖 𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑 ADA latent (data) data (feature) label 𝒑𝒑 𝒛𝒛 𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛 GAN (ADA)
  • 49. On Unifying Deep Generative Models z y x latent data label VAE
  • 50. On Unifying Deep Generative Models z y x latent data label 𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚 VAE
  • 51. On Unifying Deep Generative Models z y x 𝒒𝒒∗ 𝒓𝒓 𝒚𝒚 𝒙𝒙 latent data label 𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚 = � 𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟 𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓 perfect discriminator VAE
  • 52. On Unifying Deep Generative Models z y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛, 𝒚𝒚 𝒒𝒒∗ 𝒓𝒓 𝒚𝒚 𝒙𝒙 latent data label 𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚 = � 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛 , 𝑖𝑖𝑖𝑖 𝑦𝑦 = 0 𝒑𝒑𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒙𝒙 , 𝑖𝑖𝑖𝑖 𝑦𝑦 = 1 = � 𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟 𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓 perfect discriminator VAE
  • 53. degenerated adversarial discriminator On Unifying Deep Generative Models z y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛, 𝒚𝒚 𝒒𝒒∗ 𝒓𝒓 𝒚𝒚 𝒙𝒙 latent data label 𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚 = � 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛 , 𝑖𝑖𝑖𝑖 𝑦𝑦 = 0 𝒑𝒑𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒙𝒙 , 𝑖𝑖𝑖𝑖 𝑦𝑦 = 1 = � 𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟 𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓 perfect discriminator VAE
  • 54. degenerated adversarial discriminator adversary activated VAE (AAVAE) On Unifying Deep Generative Models z y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛, 𝒚𝒚 𝒒𝒒∗ 𝒓𝒓 𝒚𝒚 𝒙𝒙 latent data label 𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚 = � 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛 , 𝑖𝑖𝑖𝑖 𝑦𝑦 = 0 𝒑𝒑𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒙𝒙 , 𝑖𝑖𝑖𝑖 𝑦𝑦 = 1 = � 𝟏𝟏, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟 𝟎𝟎, 𝑖𝑖𝑖𝑖 𝑥𝑥 𝑖𝑖𝑖𝑖 𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓 perfect discriminator VAE
  • 55. On Unifying Deep Generative Models z y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒙𝒙 latent data labelz y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒙𝒙 𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚 latent data label 𝒑𝒑 𝒛𝒛 𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛 GAN 𝒑𝒑 𝒛𝒛 𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚 𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛 GAN
  • 56. On Unifying Deep Generative Models z y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒙𝒙 latent data label 𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚 InfoGAN z y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒙𝒙 𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚 latent data label 𝒑𝒑 𝒛𝒛 𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛 GAN degenerated code space 𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚
  • 57. On Unifying Deep Generative Models z y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒙𝒙 latent data labelz y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒙𝒙 𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚 latent data label 𝒑𝒑 𝒛𝒛 𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛 GAN 𝒑𝒑 𝒛𝒛 𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚 𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛 GAN
  • 58. On Unifying Deep Generative Models x y z 𝒑𝒑𝜽𝜽 𝒛𝒛 𝒚𝒚 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒛𝒛 data latent labelz y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒙𝒙 𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚 latent data label 𝒑𝒑 𝒛𝒛 𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛 GAN 𝒑𝒑 𝒙𝒙 𝒛𝒛 = 𝑮𝑮𝜽𝜽 𝒙𝒙 𝒑𝒑 𝑮𝑮𝜽𝜽 𝒙𝒙 𝒚𝒚
  • 59. On Unifying Deep Generative Models x y z 𝒑𝒑𝜽𝜽 𝒛𝒛 𝒚𝒚 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒛𝒛 data latent label AAE z y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒙𝒙 𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚 latent data label 𝒑𝒑 𝒛𝒛 𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛 GAN 𝒑𝒑 𝒙𝒙 𝒛𝒛 = 𝑮𝑮𝜽𝜽 𝒙𝒙 𝒑𝒑 𝑮𝑮𝜽𝜽 𝒙𝒙 𝒚𝒚
  • 60. GAN vs VAE z y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒙𝒙 𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚 latent data label 𝒑𝒑 𝒛𝒛 𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛 GAN z y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛, 𝒚𝒚 𝒒𝒒∗ 𝒓𝒓 𝒚𝒚 𝒙𝒙 latent data 𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚 VAE label
  • 61. InfoGAN vs VAE z y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒛𝒛, 𝒚𝒚 𝒒𝒒∗ 𝒓𝒓 𝒚𝒚 𝒙𝒙 latent data 𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚 VAE labelz y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒙𝒙 latent data label 𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚 InfoGAN 𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚 𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
  • 62. InfoGAN vs AAE z y x 𝒑𝒑𝜽𝜽 𝒙𝒙 𝒚𝒚 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒙𝒙 latent data label 𝒒𝒒𝜼𝜼 𝒛𝒛 𝒙𝒙, 𝒚𝒚 InfoGAN x y z 𝒑𝒑𝜽𝜽 𝒛𝒛 𝒚𝒚 𝒒𝒒𝝓𝝓 𝒓𝒓 𝒚𝒚 𝒛𝒛 latent label 𝒛𝒛 = 𝑮𝑮𝜽𝜽 𝒙𝒙 𝒑𝒑 𝑮𝑮𝜽𝜽 𝒙𝒙 𝒚𝒚 𝒒𝒒𝜼𝜼 𝒙𝒙 𝒛𝒛, 𝒚𝒚 data 𝒑𝒑 𝑮𝑮𝜽𝜽 𝒛𝒛 𝒚𝒚 AAE 𝒙𝒙 = 𝑮𝑮𝜽𝜽 𝒛𝒛
  • 63. Wake-sleep Algorithm • 𝒉𝒉 - general latent variables • 𝝀𝝀 - general parameters • 𝜽𝜽 - generator parameters • In wake phase, update 𝜽𝜽 by fitting 𝒑𝒑𝜽𝜽 𝒙𝒙|𝒉𝒉 to 𝒙𝒙 and 𝒉𝒉 inferred by 𝒒𝒒𝝀𝝀 𝒉𝒉|𝒙𝒙 . • In sleep phase, update 𝝀𝝀 based on generated samples. • VAE: 𝒉𝒉 → 𝒛𝒛, 𝝀𝝀 → 𝜼𝜼 • GAN: 𝒉𝒉 → 𝒚𝒚, 𝝀𝝀 → 𝝓𝝓 Wake: max 𝜽𝜽 𝔼𝔼𝒒𝒒𝝀𝝀 𝒉𝒉|𝒙𝒙 𝒑𝒑𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒙𝒙 log 𝑝𝑝𝜃𝜃 𝑥𝑥 ℎ Sleep: max 𝜆𝜆 𝔼𝔼𝒑𝒑𝜽𝜽 𝒙𝒙|𝒉𝒉 𝒑𝒑 𝒉𝒉 log 𝑞𝑞𝜆𝜆 ℎ 𝑥𝑥
  • 64. References • D. P. Kingma, M. Welling. Auto-Encoding Variational Bayes. arXiv preprint arXiv:1312.6114, 2014 • I. J. GoodFellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio. Genrative Adversrial Nets. arXiv:preprint arXiv:1406.2661, 2014 • A. B. L. Larsen, S. K. Sønderby and O. Winther. Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300, 2015 • A. Makhzani, J. Shlens, N. Jaitly, I. Goodfellow and B. Frey. Adversarial Autoencoders. arXiv preprint arXiv:1511.05644, 2016 • Z. Hu, Z. Yang, R. Salakhutdinov and E. P. Xing. On unifying deep generative models. arXiv preprint arXiv:1706.00550, 2017