A (very) gentle introduction to
Generative Adversarial
Networks (a.k.a. GANs)
Thomas Paula
#4 Porto Alegre ML Meetup
Who am I
2
Thomas Paula
Software Engineer
Machine Learning Engineer
Twitter
@tsp_thomas
Why study GANs?
3
“(…) (GANs) and the
variations that are now
being proposed is the most
interesting idea in the last
10 years in ML (…)”
Yann LeCun
Director of Facebook
AI Research
Source: https://www.quora.com/What-are-some-recent-and-potentially-upcoming-breakthroughs-in-deep-learning/answer/Yann-LeCun
Background
5
Recalling supervised learning
• Labelled data;
• Algorithms try to predict an output value based on a
given input;
• Examples include
• Classification algorithms such as SVM
• Regression algorithms such as Linear Regression
6
Recalling unsupervised learning
• Unlabelled data;
• Algorithms try to discover hidden structures in the data;
• Examples include
• Clustering algorithms such as K-means
• Generative models such as GANs
7
Discriminative models
• Learn a function that maps the input 𝑥 to an output 𝑦;
• Conditional probability 𝑃(𝑦|𝑥);
• Classification algorithms such as SVM.
8
Generative Models
9
• Try to learn the joint probability of the input 𝑥 and the
output 𝑦 at the same time;
• Joint probability 𝑃(𝑥, 𝑦);
• Generative statistical models such as Latent Dirichlet
Allocation.
“What I do not understand,
I cannot create.”
Richard Feynman
Nobel Prize in Physics,
1965
Source: https://openai.com/blog/generative-models/
GANs
11
What are GANs?
12
First, an intuition
Goal: produce counterfeit money
that is as similar to real money as possible.
Goal: distinguish between real and
counterfeit money.
What are GANs?
13
First, an intuition
generator – Goal: produce counterfeit money
that is as similar to real money as possible.
discriminator – Goal: distinguish between real and
counterfeit money.
What are GANs?
14
[Diagram: the generator maps a latent vector 𝑧 to a generated sample 𝑥; the discriminator receives a sample 𝑥 and decides whether it is real or fake.]
What are GANs?
15
[Diagram: 𝑧 is sampled from a Gaussian (Normal) distribution (usually); the generator maps 𝑧 to a generated instance 𝑥, which is fed to the discriminator.]
What are GANs?
16
[Diagram: the generator maps 𝑧 to a generated instance 𝑥; the discriminator receives both generated instances and training data 𝑥.]
Generative Adversarial Networks
• Created by Ian Goodfellow (OpenAI);
• Two neural networks compete (minimax game) – see the
training-loop sketch below
• Discriminative network tries to distinguish between real and
fake data
• Generative network tries to generate samples to fool the
discriminative one
• Uses a latent code (𝑧).
17
Source: Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014.
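The minimax game above fits in a few lines of code. Below is a minimal, illustrative PyTorch sketch (the network sizes, optimizer settings, and stand-in data are assumptions, not the paper's exact setup): one discriminator update to separate real from generated data, then one generator update to fool it.

    # Minimal GAN training loop sketch (PyTorch). Illustrative only: network sizes,
    # optimizer settings and the data source are assumptions, not the paper's setup.
    import torch
    import torch.nn as nn

    latent_dim, data_dim = 100, 784  # e.g. flattened 28x28 images (assumption)

    generator = nn.Sequential(
        nn.Linear(latent_dim, 256), nn.ReLU(),
        nn.Linear(256, data_dim), nn.Tanh())

    discriminator = nn.Sequential(
        nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
        nn.Linear(256, 1), nn.Sigmoid())

    opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
    bce = nn.BCELoss()

    def training_step(real_batch):
        batch_size = real_batch.size(0)
        real_labels = torch.ones(batch_size, 1)
        fake_labels = torch.zeros(batch_size, 1)

        # 1) Discriminator: maximize log D(x) + log(1 - D(G(z)))
        z = torch.randn(batch_size, latent_dim)      # latent code z
        fake_batch = generator(z).detach()           # stop gradients into G
        d_loss = bce(discriminator(real_batch), real_labels) + \
                 bce(discriminator(fake_batch), fake_labels)
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()

        # 2) Generator: fool the discriminator (non-saturating loss: maximize log D(G(z)))
        z = torch.randn(batch_size, latent_dim)
        g_loss = bce(discriminator(generator(z)), real_labels)
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()
        return d_loss.item(), g_loss.item()

    # Usage: training_step(torch.rand(64, data_dim) * 2 - 1)  # stand-in "real" data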
18
Source: Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014.
Training process
19
“It should happen until the
generator exactly reproduces
the true data distribution and
the discriminator is guessing at
random, unable to find a
difference.”
OpenAI
Source: https://openai.com/blog/generative-models/
However, some problems arise…
• Gradient descent does not always find a Nash equilibrium;
• Mode collapse: the generator starts to produce several copies of
the same image.
• Expected: max over the discriminator first, then min over the generator (min G max D)
• Problem: min over the generator first, then max over the discriminator (max D min G)
• Causes low-diversity output.
21
Sources: https://www.quora.com/What-are-the-pros-and-cons-of-using-generative-adversarial-networks-a-type-of-neural-network, and Generative Adversarial
Networks (GANs) #AIWTB 2016 video.
Some Advances
22
Since GANs’ creation in 2014 (not necessarily in chronological order)
LAPGAN
• Produces high quality samples
of natural images (at 32x32
and 64x64);
• Cascade of convolutional
neural networks with Laplacian
pyramid framework.
23
Laplacian Pyramid GAN – Emily Denton et al. (2015)
Source: https://en.wikipedia.org/wiki/Pyramid_(image_processing)
24
Source: Denton, Emily L., Soumith Chintala, and Rob Fergus. "Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks." Advances in
neural information processing systems. 2015.
DCGAN
• CNNs combined with GANs;
• Train GANs and later use learned feature extractors for
supervised tasks;
• Successful use of CNNs after LAPGAN;
• Created guidelines for stable Deep Convolutional GANs.
25
Deep Convolutional GAN – Alec Radford et al. (2016)
Source: Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv
preprint arXiv:1511.06434 (2015).
DCGAN
Used three recent changes in CNNs (see the generator sketch below):
• All convolutional network – strided convolutions;
• Elimination of fully-connected layers;
• Batch normalization.
26
Deep Convolutional GAN – Alec Radford et al. (2016)
Source: Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv
preprint arXiv:1511.06434 (2015).
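The three guidelines above map directly to an architecture. A hedged sketch of a DCGAN-style generator in PyTorch (the channel counts and the 64x64 RGB output follow the common recipe and are assumptions, not copied from the paper):

    # DCGAN-style generator sketch (PyTorch): all-convolutional (strided/transposed
    # convolutions), no fully-connected layers, batch normalization throughout.
    # Channel sizes and the 64x64 output resolution are illustrative assumptions.
    import torch
    import torch.nn as nn

    class DCGANGenerator(nn.Module):
        def __init__(self, latent_dim=100, feature_maps=64, channels=3):
            super().__init__()
            self.net = nn.Sequential(
                # z (latent_dim x 1 x 1) -> 4x4
                nn.ConvTranspose2d(latent_dim, feature_maps * 8, 4, 1, 0, bias=False),
                nn.BatchNorm2d(feature_maps * 8), nn.ReLU(True),
                # 4x4 -> 8x8
                nn.ConvTranspose2d(feature_maps * 8, feature_maps * 4, 4, 2, 1, bias=False),
                nn.BatchNorm2d(feature_maps * 4), nn.ReLU(True),
                # 8x8 -> 16x16
                nn.ConvTranspose2d(feature_maps * 4, feature_maps * 2, 4, 2, 1, bias=False),
                nn.BatchNorm2d(feature_maps * 2), nn.ReLU(True),
                # 16x16 -> 32x32
                nn.ConvTranspose2d(feature_maps * 2, feature_maps, 4, 2, 1, bias=False),
                nn.BatchNorm2d(feature_maps), nn.ReLU(True),
                # 32x32 -> 64x64, image values in [-1, 1]
                nn.ConvTranspose2d(feature_maps, channels, 4, 2, 1, bias=False),
                nn.Tanh())

        def forward(self, z):                      # z: (batch, latent_dim)
            return self.net(z.view(z.size(0), -1, 1, 1))

    # Usage: DCGANGenerator()(torch.randn(16, 100)).shape -> torch.Size([16, 3, 64, 64])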
DCGAN
27
Deep Convolutional GAN – Alec Radford et al. (2016)
Source: Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv
preprint arXiv:1511.06434 (2015).
DCGAN generator for the LSUN dataset.
28
Source: Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv
preprint arXiv:1511.06434 (2015).
29
Source: Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv
preprint arXiv:1511.06434 (2015).
DCGAN – Vector Arithmetic
Deep Convolutional GAN – Alec Radford et al. (2016)
InfoGAN
• A modification of GANs that encourages the model to learn
interpretable and meaningful representations;
• Learns disentangled representations;
• For instance, facial expression, eye color, hairstyle, presence
or absence of eyeglasses for faces
• Decomposes the input into 𝑧 (noise) and 𝑐 (latent code) – see
the sketch below.
30
InfoGAN – Xi Chen et al. (2016)
Source: Chen, Xi, et al. "Infogan: Interpretable representation learning by information maximizing generative adversarial nets." Advances in Neural Information
Processing Systems. 2016.
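On the generator side the change is small: the input becomes the concatenation of noise 𝑧 and latent code 𝑐. A minimal sketch (the dimensions are assumptions, and InfoGAN's auxiliary network that maximizes mutual information between 𝑐 and the generated sample is omitted):

    # InfoGAN-style generator input sketch (PyTorch): the generator consumes the
    # concatenation of incompressible noise z and an interpretable latent code c.
    # The mutual-information (auxiliary Q-network) term of InfoGAN is omitted;
    # all dimensions are illustrative assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    noise_dim, categorical_dim, continuous_dim = 62, 10, 2   # assumed sizes

    generator = nn.Sequential(
        nn.Linear(noise_dim + categorical_dim + continuous_dim, 256), nn.ReLU(),
        nn.Linear(256, 784), nn.Tanh())

    def sample_generator_input(batch_size):
        z = torch.randn(batch_size, noise_dim)                     # noise z
        cat = F.one_hot(torch.randint(0, categorical_dim, (batch_size,)),
                        categorical_dim).float()                   # categorical code c
        cont = torch.rand(batch_size, continuous_dim) * 2 - 1      # continuous codes in [-1, 1]
        return torch.cat([z, cat, cont], dim=1)

    fake = generator(sample_generator_input(32))   # (32, 784) generated samples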
31
Source: Chen, Xi, et al. "Infogan: Interpretable representation learning by information maximizing generative adversarial nets." Advances in Neural Information
Processing Systems. 2016.
InfoGAN
InfoGAN – Xi Chen et al. (2016)
32
Source: Chen, Xi, et al. "Infogan: Interpretable representation learning by information maximizing generative adversarial nets." Advances in Neural Information
Processing Systems. 2016.
Let’s pause for a while…
33
Paper regarding improvements on training GANs
Improved Techniques for Training GANs
• Techniques that encourage convergence of GANs – do
you recall Nash equilibrium?
• Why is it difficult?
• Cost functions are non-convex;
• Parameters are continuous;
• Parameter space is very high-dimensional.
34
Salimans et al. (2016)
Sources: Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems, 2016; and Goodfellow, Ian.
"Generative Adversarial Networks," NIPS 2016 tutorial.
Improved Techniques for Training GANs
Feature matching
• Instead of maximizing the output of the
discriminator, the generator is required to generate
data that matches the statistics of real data;
• The generator is trained to match the expected value of an
intermediate layer of the discriminator (see the sketch below).
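A sketch of a feature-matching generator loss (how the discriminator's intermediate activations are exposed is an implementation detail; the stand-in feature extractor below is an assumption):

    # Feature matching sketch (PyTorch): instead of maximizing D's output, train G so
    # that the expected intermediate features of fake data match those of real data.
    # `feature_extractor` stands in for an intermediate discriminator layer (assumption).
    import torch
    import torch.nn as nn

    feature_extractor = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2))

    def feature_matching_loss(real_batch, fake_batch):
        real_features = feature_extractor(real_batch).mean(dim=0)   # E[f(x)] over real data
        fake_features = feature_extractor(fake_batch).mean(dim=0)   # E[f(G(z))] over fakes
        return torch.mean((real_features - fake_features) ** 2)     # match the statistics

    # Usage: g_loss = feature_matching_loss(real_images, generator(z))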
35
Salimans et al. (2016)
Sources: Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems, 2016; and Goodfellow, Ian.
"Generative Adversarial Networks," NIPS 2016 tutorial.
Improved Techniques for Training GANs
Minibatch discrimination
• Allows the discriminator
to look at multiple
examples in combination;
• Helps avoid mode collapse (see the sketch below).
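A sketch of a minibatch-discrimination layer in the spirit of Salimans et al. (sizes are assumptions): each example's features are compared to every other example in the batch, and the resulting similarity statistics are appended so the discriminator can spot low-diversity batches.

    # Minibatch discrimination sketch (PyTorch). Each example's features are compared
    # (via an L1 kernel) to the rest of the batch; the similarity statistics are
    # concatenated to its features. Layer sizes are illustrative assumptions.
    import torch
    import torch.nn as nn

    class MinibatchDiscrimination(nn.Module):
        def __init__(self, in_features, out_features, kernel_dim):
            super().__init__()
            # Tensor T projecting each feature vector to out_features x kernel_dim
            self.T = nn.Parameter(torch.randn(in_features, out_features * kernel_dim) * 0.1)
            self.out_features, self.kernel_dim = out_features, kernel_dim

        def forward(self, x):                                    # x: (batch, in_features)
            m = (x @ self.T).view(-1, self.out_features, self.kernel_dim)
            diffs = m.unsqueeze(0) - m.unsqueeze(1)              # pairwise differences across the batch
            l1 = diffs.abs().sum(dim=3)                          # (batch, batch, out_features)
            similarity = torch.exp(-l1).sum(dim=1)               # (batch, out_features), incl. self-term
            return torch.cat([x, similarity], dim=1)             # augmented features

    # Usage: MinibatchDiscrimination(256, 32, 8)(torch.randn(64, 256)).shape -> (64, 288)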
36
Salimans et al. (2016)
Sources: Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems, 2016; and Goodfellow, Ian.
"Generative Adversarial Networks," NIPS 2016 tutorial.
Improved Techniques for Training GANs
Historical averaging
• Adds a term to each player's cost function that penalizes
distance from the average of past parameters (see the sketch below);
• Inspired by fictitious play (players best-respond to the historical
frequency of play).
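A sketch of a historical-averaging penalty (the incremental-mean update rule below is an assumption; the idea is simply to penalize parameters for drifting far from their running average):

    # Historical averaging sketch (PyTorch): add ||theta - average of past theta||^2
    # to a player's cost. The update rule is an assumption for illustration.
    import torch

    class HistoricalAverage:
        def __init__(self, parameters, weight=1.0):
            self.params = list(parameters)
            self.averages = [p.detach().clone() for p in self.params]
            self.weight, self.count = weight, 1

        def penalty(self):
            # Regularization term added to the cost function
            return self.weight * sum(((p - avg) ** 2).sum()
                                     for p, avg in zip(self.params, self.averages))

        def update(self):
            # Running mean of the parameters seen so far
            self.count += 1
            for p, avg in zip(self.params, self.averages):
                avg += (p.detach() - avg) / self.count

    # Usage: loss = d_loss + hist_avg.penalty(); loss.backward(); opt.step(); hist_avg.update()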
37
Salimans et al. (2016)
Sources: Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems, 2016; and Goodfellow, Ian.
"Generative Adversarial Networks," NIPS 2016 tutorial.
Improved Techniques for Training GANs
One-sided label smoothing
• Reduces the discriminator's confidence by replacing the hard
targets 0 and 1 with soft targets such as 0.1 and 0.9; the one-sided
variant smooths only the real-data target (see the sketch below);
• Prevents the discriminator from giving a very large gradient
signal to the generator.
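A sketch of a one-sided smoothed discriminator loss (the 0.9 target and the BCE loss are common choices, used here only for illustration):

    # One-sided label smoothing sketch (PyTorch): only the real-data target is softened
    # (e.g. 1.0 -> 0.9); fake targets stay at 0. Values are illustrative assumptions.
    import torch
    import torch.nn as nn

    bce = nn.BCELoss()

    def discriminator_loss(d_real, d_fake, smooth=0.9):
        real_targets = torch.full_like(d_real, smooth)   # smoothed "real" label
        fake_targets = torch.zeros_like(d_fake)          # fake label left at 0 (one-sided)
        return bce(d_real, real_targets) + bce(d_fake, fake_targets)

    # Usage: d_loss = discriminator_loss(discriminator(x_real), discriminator(x_fake.detach()))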
38
Salimans et al. (2016)
Sources: Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems, 2016; and Goodfellow, Ian.
"Generative Adversarial Networks," NIPS 2016 tutorial.
Improved Techniques for Training GANs
Virtual Batch Normalization (VBN)
• Regular batch normalization can cause strong intra-batch
correlation;
• VBN avoids the problem by normalizing each example with
statistics computed on a fixed reference batch (see the sketch below);
• It is computationally expensive.
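A simplified VBN sketch (only the reference-batch statistics are used here; the paper also combines them with the current example, which is omitted for brevity):

    # Virtual batch normalization sketch (PyTorch): normalize each example with
    # statistics from a fixed reference batch chosen at the start of training.
    # Simplified: the example itself is not mixed into the statistics (assumption).
    import torch
    import torch.nn as nn

    class VirtualBatchNorm(nn.Module):
        def __init__(self, reference_batch, eps=1e-5):
            super().__init__()
            # Statistics are fixed once, from the reference batch
            self.register_buffer("ref_mean", reference_batch.mean(dim=0, keepdim=True))
            self.register_buffer("ref_std", reference_batch.std(dim=0, keepdim=True))
            self.eps = eps

        def forward(self, x):
            return (x - self.ref_mean) / (self.ref_std + self.eps)

    # Usage: vbn = VirtualBatchNorm(torch.randn(128, 256)); vbn(torch.randn(64, 256))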
39
Salimans et al. (2016)
Sources: Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems, 2016; and Goodfellow, Ian.
"Generative Adversarial Networks," NIPS 2016 tutorial.
Coming Back to Advances
40
Since GANs’ creation in 2014 (not necessarily in chronological order)
Advances... and some Applications
41
iGAN
Source: Zhu, Jun-Yan, et al. "Generative visual manipulation on the natural image manifold." European Conference on Computer Vision. Springer International
Publishing, 2016.
Source: Gifs generated from original video (https://www.youtube.com/watch?v=9c4z6YsBGQ0).
Advances... and some Applications
42
cGAN – Conditional GAN
Source: Reed, Scott, et al. "Generative adversarial text to image synthesis." Proceedings of The 33rd International Conference on Machine Learning. Vol. 3.
2016.
43
Source: Ledig, Christian, et al. "Photo-realistic single image super-resolution using a generative adversarial network." arXiv preprint arXiv:1609.04802 (2016).
Advances... and some Applications
Super resolution
44
Source: Brock, Andrew, et al. "Neural photo editing with introspective adversarial networks." arXiv preprint arXiv:1609.07093 (2016).
Advances... and some Applications
Neural Photo Editing with Introspective Adversarial Networks
Is that all, then?
45
Nope!
• Around 72 papers cite the original GAN paper (all from
this year!)*;
• The latest is called WGAN (Wasserstein GAN)
• It was submitted to arXiv on 26 Jan!
• “We empirically show that WGANs cure the main training problems of GANs”
46
Sources: Wasserstein GAN - https://arxiv.org/abs/1701.07875. * Simple search in Google Scholar.
Conclusion
47
Bottom line (in a nutshell)
• GANs are composed of two networks that compete
• One network generates samples (generator)
• Another network differentiates between real and generated data
(discriminator)
• It is an unsupervised learning technique that is trained in
a supervised manner;
• Interesting and challenging open research questions.
48
Bottom line (in a nutshell)
• Finding a Nash equilibrium in high-dimensional, continuous,
non-convex games is an important open research
problem;
• There is no rule of thumb for evaluating generative
models.
49
“(…) in the process of
training generative
models, we will endow the
computer with an
understanding of the
world and what it is made
up of.” OpenAI
Source: https://openai.com/blog/generative-models/
Thank you!
Thomas Paula
Software Engineer
Machine Learning Engineer
Twitter
@tsp_thomas
