Deep Generative Models
Chia-Wen (Sara) Cheng
cwcheng@cs.utexas.edu
Roadmap
What is a generative model?
Why study generative models?
What are popular deep generative models?
How do they work?
What is a generative model?
Generative Model vs. Discriminative Model
Given input data $x$, you want to classify the data into labels $y$
Generative Model: models the joint probability $P(x, y)$
Discriminative Model: models the posterior probability $P(y \mid x)$
Example
Data (x,y): (1,0), (1,0), (2,0), (2,1)
Joint probability $P(x, y)$:
        y=0    y=1
x=1     1/2    0
x=2     1/4    1/4
Posterior probability $P(y \mid x)$:
        y=0    y=1
x=1     1      0
x=2     1/2    1/2
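As a sanity check, here is a small Python sketch (my own illustration, not from the slides) that recovers both tables from the four datapoints:

```python
from collections import Counter

data = [(1, 0), (1, 0), (2, 0), (2, 1)]  # the four (x, y) pairs from the slide
n = len(data)

# Joint probability P(x, y): fraction of all datapoints equal to that (x, y)
joint = {pair: count / n for pair, count in Counter(data).items()}
print(joint)  # {(1, 0): 0.5, (2, 0): 0.25, (2, 1): 0.25}

# Posterior P(y | x): normalize the joint within each value of x
for x in (1, 2):
    px = sum(p for (xi, _), p in joint.items() if xi == x)
    for y in (0, 1):
        print(f"P(y={y} | x={x}) = {joint.get((x, y), 0.0) / px}")
```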
In short
Generative Model:
Based on my generation assumptions, which category is most likely to generate
this observation?
Discriminative Model:
Does not care how the data was generated; it simply categorizes a given
observation
Example 1: Classify speech into a language
Generative Model:
Learn each language and determine which language the speech belongs to
Discriminative Model:
Determine the linguistic differences without learning any language -- a much
easier task!
Example 2: Classify an animal as an elephant or a dog
Generative Model
Build models of what elephants and dogs
look like, respectively
Then match the new animal against the
elephant model and the dog model
Discriminative Model
Find a decision boundary that separates
elephants and dogs
Check on which side of the decision
boundary it falls
Summary
Generative Model
● Model P(x,y)
● Model the distribution of individual
classes
● Can “generate” synthetic data points
● E.g. Hidden Markov Model
Discriminative Model
● Model P(y|x)
● Learn the boundary between
classes
● Usually better performance
● E.g. logistic regression, SVM
Why study generative models?
Power of Generative Models
Generate
Images
P. Isola, J.-Y. Zhu, T. Zhou, and A. A.
Efros. Image-to-image translation with
conditional adversarial networks. In
CVPR, 2017.
Power of Generative Models
Generate
Images
Speech
Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen
Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew
Senior, Koray Kavukcuoglu. WaveNet: A Generative Model for Raw
Audio. arXiv preprint arXiv:1609.03499, 2016.
Power of Generative Models
Generate
Images
Speech
Handwriting
Alex Graves. Generating Sequences With Recurrent Neural
Networks. arXiv preprint arXiv:1308.0850, 2014.
Power of Generative Models
Generate
Images
Speech
Handwriting
Language
Alex Graves. Generating Sequences With Recurrent Neural
Networks. arXiv preprint arXiv:1308.0850, 2014.
Slide from:
NIPS 2016 Tutorial: Generative Adversarial Networks
http://www.iangoodfellow.com/slides/2016-12-04-NIPS.pdf
Representation Learning
Xi Chen, Yan Duan, Rein Houthooft, John Schulman,
Ilya Sutskever, Pieter Abbeel. InfoGAN: Interpretable
Representation Learning by Information Maximizing
Generative Adversarial Nets. In NIPS, 2016.
Simulation, Planning, Reasoning
Junhyuk Oh, Xiaoxiao Guo, Honglak Lee, Richard
Lewis, Satinder Singh. Action-Conditional Video
Prediction using Deep Networks in Atari Games. In
NIPS, 2015.
Slide from:
NIPS 2016 Tutorial: Generative Adversarial Networks
http://www.iangoodfellow.com/slides/2016-12-04-NIPS.pdf
What are popular deep generative models?
Three Most Popular Generative Models
Autoregressive Models
Variational Autoencoders (VAEs)
Generative Adversarial Networks (GANs)
How do they work?
Three Most Popular Generative Models
Autoregressive Models
Variational Autoencoders (VAEs)
Generative Adversarial Networks (GANs)
Maximum Likelihood
Generative model: given $n$ examples $x^{(1)}, \dots, x^{(n)}$, recover the data distribution $p_{\mathrm{data}}(x)$
Likelihood: $\prod_{i=1}^{n} p_{\mathrm{model}}\big(x^{(i)}; \theta\big)$
Model: $p_{\mathrm{model}}(x; \theta)$, trained by maximizing the log-likelihood $\sum_{i=1}^{n} \log p_{\mathrm{model}}\big(x^{(i)}; \theta\big)$
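A minimal sketch (an illustration I am adding, assuming a simple categorical model; the data is made up) showing that the log-likelihood is maximized at the empirical frequencies:

```python
import numpy as np

samples = np.array([0, 0, 1, 0, 2, 0, 1, 0])  # made-up draws from an unknown categorical p_data

def log_likelihood(theta, samples):
    # sum_i log p_model(x_i; theta) for a categorical model with parameters theta
    return np.sum(np.log(theta[samples]))

empirical = np.bincount(samples, minlength=3) / len(samples)  # [0.625, 0.25, 0.125]
uniform = np.array([1/3, 1/3, 1/3])

print(log_likelihood(empirical, samples))  # higher
print(log_likelihood(uniform, samples))    # lower: the MLE is the empirical frequencies
```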
Autoregressive Models
Explicit formula based on the chain rule: $p(x) = \prod_{i=1}^{n} p(x_i \mid x_1, \dots, x_{i-1})$
Generation:
Sample one step at a time, conditioned on all previous steps
RNN Language Modeling
Generate sentences
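A minimal PyTorch sketch of this idea (the architecture and sizes are my assumptions, not a model from the slides): each sampling step conditions on all previous steps through the recurrent state.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 50, 32, 64  # hypothetical sizes

embed = nn.Embedding(vocab_size, embed_dim)
rnn = nn.GRUCell(embed_dim, hidden_dim)
head = nn.Linear(hidden_dim, vocab_size)  # logits for p(x_t | x_1, ..., x_{t-1})

def sample(length=20, start_token=0):
    h = torch.zeros(1, hidden_dim)
    x = torch.tensor([start_token])
    tokens = []
    for _ in range(length):
        h = rnn(embed(x), h)                    # fold the new token into the state
        probs = torch.softmax(head(h), dim=-1)  # conditional distribution over the next token
        x = torch.multinomial(probs, 1).squeeze(1)  # sample one step
        tokens.append(x.item())
    return tokens

print(sample())  # the model is untrained here, so the output is random
```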
WaveNet
Generate raw audio
Demo
Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals,
Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu. WaveNet: A
Generative Model for Raw Audio. arXiv preprint arXiv:1609.03499, 2016.
PixelRNN
Generate an image pixel by pixel
Aaron van den Oord, Nal Kalchbrenner,
and Koray Kavukcuoglu. Pixel recurrent
neural networks. In ICLR, 2016.
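The raster-scan sampling loop can be sketched as follows (a toy illustration with a stand-in conditional model; the real PixelRNN computes the conditional with masked recurrent layers):

```python
import torch

H, W, levels = 8, 8, 256  # image size and number of intensity levels

def pixel_logits(image, i, j):
    # Stand-in for the network's output: logits for p(x_ij | all previous pixels).
    # A real PixelRNN computes this from the pixels above and to the left of (i, j).
    return torch.zeros(levels)  # uniform, for this toy example

image = torch.zeros(H, W, dtype=torch.long)
for i in range(H):          # raster-scan order: row by row,
    for j in range(W):      # left to right within each row
        probs = torch.softmax(pixel_logits(image, i, j), dim=0)
        image[i, j] = torch.multinomial(probs, 1).item()
print(image)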
Summary of Autoregressive Models
Advantages
Construct explicit, tractable density function
Disadvantages
Slow sample generation: samples must be produced sequentially, one step at a time
Tends to emphasize local details over global structure
Generation not controlled by a latent code
Three Most Popular Generative Models
Autoregressive Models
Variational Autoencoders (VAEs)
Generative Adversarial Networks (GANs)
Variational Autoencoders (VAEs)
For each datapoint $i$:
Draw latent variables $z_i \sim p(z)$
Draw datapoint $x_i \sim p(x \mid z_i)$
Autoencoder
[Diagram: Data → Encoder → latent representation → Decoder → Reconstruction]
Likelihood: $p(x) = \int p(x \mid z)\, p(z)\, dz$
Maximize Log-Likelihood
Learning objective: $\max_\theta \sum_i \log p_\theta\big(x^{(i)}\big)$
Integrating over all possible $z$ requires exponential time to compute
Approximate the latent variable distribution
Approximate the true posterior $p(z \mid x)$ using $q(z \mid x)$
KL divergence: $D_{\mathrm{KL}}\big(q(z \mid x) \,\|\, p(z \mid x)\big) = \mathbb{E}_{z \sim q(z \mid x)}\big[\log q(z \mid x) - \log p(z \mid x)\big]$
Objective
Maximize the variational lower bound (ELBO):
$\log p(x) \geq \mathbb{E}_{q(z \mid x)}\big[\log p(x \mid z)\big] - D_{\mathrm{KL}}\big(q(z \mid x) \,\|\, p(z)\big)$
First term: minimize reconstruction error, so that training samples have higher probability
Second term: the latent variable distribution $q(z \mid x)$ should be like the prior $p(z)$
For the mathematical details, see:
Tutorial on Variational Autoencoders
https://arxiv.org/pdf/1606.05908.pdf
VAE
[Diagram: Data → Encoder → mean $\mu$ and standard deviation $\sigma$ → sampled latent vector $z = \mu + \sigma \odot \epsilon$, with $\epsilon \sim \mathcal{N}(0, I)$ → Decoder → Reconstruction]
Sample $z$ from $q(z \mid x)$ via the reparameterization trick; maximize $\mathbb{E}_{q(z \mid x)}[\log p(x \mid z)]$, i.e., minimize the reconstruction loss
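A minimal PyTorch sketch of the VAE loss (the linear encoder/decoder and Bernoulli likelihood are simplifying assumptions of mine): the reparameterization trick makes the sampling step differentiable, and the KL term has a closed form for Gaussian $q(z \mid x)$ and prior $\mathcal{N}(0, I)$.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x_dim, z_dim = 784, 20  # toy sizes

enc = nn.Linear(x_dim, 2 * z_dim)  # outputs mu and log-variance of q(z|x)
dec = nn.Linear(z_dim, x_dim)      # outputs Bernoulli logits for p(x|z)

def vae_loss(x):
    mu, logvar = enc(x).chunk(2, dim=-1)
    eps = torch.randn_like(mu)
    z = mu + torch.exp(0.5 * logvar) * eps   # reparameterization trick: z = mu + sigma * eps
    recon = F.binary_cross_entropy_with_logits(dec(z), x, reduction="sum")
    # KL(q(z|x) || p(z)) with p(z) = N(0, I), in closed form
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl  # negative ELBO: minimize this

x = torch.rand(16, x_dim)  # toy batch of "images" with values in [0, 1]
print(vae_loss(x))
```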
Samples
Tom White. Sampling Generative Networks. arXiv preprint arXiv:1609.04468, 2016.
Problems with VAEs
There is a gap between the lower bound and the true likelihood
The objective does not directly try to make samples look like real images
Samples tend to have lower quality
Three Most Popular Generative Models
Autoregressive Models
Variational Autoencoders (VAEs)
Generative Adversarial Networks (GANs)
Generative Adversarial Networks (GANs)
A game between a generator and a discriminator
Training
Objective: $\min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log D(x)] + \mathbb{E}_{z \sim p(z)}[\log(1 - D(G(z)))]$
For each iteration:
Sample a mini-batch of fake images and a mini-batch of true images
Update D using backpropagation (ascend the objective)
Update G using backpropagation (descend the objective)
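A minimal PyTorch sketch of one such iteration (toy MLP generator and discriminator of my own choosing; the G update shown is the commonly used non-saturating variant rather than the literal minimax loss):

```python
import torch
import torch.nn as nn

z_dim, x_dim = 16, 784  # toy sizes
G = nn.Sequential(nn.Linear(z_dim, 128), nn.ReLU(), nn.Linear(128, x_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(x_dim, 128), nn.ReLU(), nn.Linear(128, 1))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(32, x_dim)     # stand-in for a mini-batch of true images
fake = G(torch.randn(32, z_dim))  # mini-batch of fake images

# Update D: push D(real) toward 1 and D(fake) toward 0
d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Update G: push D(fake) toward 1 (non-saturating generator loss)
g_loss = bce(D(fake), torch.ones(32, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```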
DCGAN
Generator network
Alec Radford, Luke Metz, Soumith Chintala. Unsupervised
Representation Learning with Deep Convolutional Generative
Adversarial Networks. In ICLR, 2016.
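A sketch of a DCGAN-style generator in PyTorch (layer sizes follow the commonly used 64×64 configuration; this is a paraphrase, not the authors' code):

```python
import torch
import torch.nn as nn

# Project a 100-dim noise vector up to a 64x64 RGB image using
# fractionally-strided convolutions, batch norm, and ReLU (DCGAN guidelines).
G = nn.Sequential(
    nn.ConvTranspose2d(100, 512, 4, 1, 0), nn.BatchNorm2d(512), nn.ReLU(),  # -> 4x4
    nn.ConvTranspose2d(512, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.ReLU(),  # -> 8x8
    nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(),  # -> 16x16
    nn.ConvTranspose2d(128, 64, 4, 2, 1),  nn.BatchNorm2d(64),  nn.ReLU(),  # -> 32x32
    nn.ConvTranspose2d(64, 3, 4, 2, 1),    nn.Tanh(),                       # -> 64x64
)

z = torch.randn(1, 100, 1, 1)
print(G(z).shape)  # torch.Size([1, 3, 64, 64])
```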
DCGANs for LSUN Bedrooms
DCGAN in TensorFlow
A great starting point for training GANs
https://github.com/carpedm20/DCGAN-tensorflow
17 hacks for training GANs
https://github.com/soumith/ganhacks
GANs for Videos
Demo
Generate Images from Captions
Elman Mansimov, Emilio Parisotto, Jimmy Lei Ba,
Ruslan Salakhutdinov.Generating Images from
Captions with Attention. In ICLR, 2016.
Image-to-Image Translation
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A.
Efros. Image-to-Image Translation with
Conditional Adversarial Networks. In CVPR, 2017.
Summary of GANs
Often subjectively regarded as producing the best samples
Difficult to train
References
● https://stackoverflow.com/questions/879432/what-is-the-difference-between-a-generative-and-discriminative-algorithm
● http://www.cedar.buffalo.edu/~srihari/CSE574/Discriminative-Generative.pdf
● https://duphan.wordpress.com/tag/generative-model/
● https://arxiv.org/pdf/1701.00160.pdf
● http://introtodeeplearning.com/6S191-Deep-Generative-Models.pdf
● https://jaan.io/what-is-variational-autoencoder-vae-tutorial/
● https://arxiv.org/pdf/1606.05908.pdf
● http://www.cs.toronto.edu/~slwang/generative_model.pdf
● https://ishmaelbelghazi.github.io/ALI

Deep Generative Models