1
Introduction
Generative Adversarial Networks
Yunjey Choi
Korea University
DAVIAN LAB
2
Speaker Introduction Introduction
B.S. in Computer Science & Engineering at Korea University
M.S. Student in Computer Science & Engineering at Korea University (Current)
Interest: Deep Learning, TensorFlow, PyTorch
GitHub Link: https://github.com/yunjey
3
Referenced Slides Introduction
• Namju Kim. Generative Adversarial Networks (GAN)
https://www.slideshare.net/ssuser77ee21/generative-adversarial-networks-70896091
• Taehoon Kim. 지적 대화를 위한 깊고 넓은 딥러닝 (Deep and Wide Deep Learning for Intelligent Conversation)
https://www.slideshare.net/carpedm20/ss-63116251
Introduction
01
5
Branches of ML (Introduction)
Machine Learning branches into:
• Supervised Learning: labeled data, direct feedback
• Unsupervised Learning: no labeled data, no feedback, "find hidden structure"
• Semi-supervised Learning
• Reinforcement Learning: no labeled data, delayed feedback, reward signal
6
Supervised Learning (Introduction)
The discriminative model learns how to classify an input into its class.
Input image (64x64x3) → Discriminative Model → class label, e.g. (1) man or (2) woman
7
Unsupervised Learning (Introduction)
The generative model learns the distribution of the training data.
Latent code (100) → Generative Model → Image (64x64x3)
8
Generative Model
Probability Distribution (Introduction)
Probability Basics (Review): a random variable X and its probability mass function P(X), e.g. for a loaded die:

X:     1     2     3     4     5     6
P(X): 1/6   1/6   1/6   0/6   1/6   2/6

(The slide plots the same p(x) values as a bar chart over x.)
9
Generative Model
Probability Distribution (Introduction)
What if x is an actual image in the training data?
In that case, x can be represented as a (for example) 64x64x3-dimensional vector.
10
Generative Model
Probability Distribution (Introduction)
There is a probability density function p_data(x) that represents the distribution of actual images.
11
Generative Model
Probability Distribution (Introduction)
Let's take an example with a human face image dataset. Our dataset may contain few images of men with glasses.
x1 is a 64x64x3 high-dimensional vector representing a man with glasses, so the probability density value p_data(x1) is low; in denser regions of the dataset the probability density value is high.
(The slide plots p_data(x) over x, marking example points x1, x2, x3, x4.)
12
Generative Model
Probability Distribution (Introduction)
Our dataset may contain many images of women with black hair.
x2 is a 64x64x3 high-dimensional vector representing a woman with black hair.
13
Generative Model
Probability Distribution (Introduction)
Our dataset may contain very many images of women with blonde hair.
x3 is a 64x64x3 high-dimensional vector representing a woman with blonde hair, so the probability density value is very high.
14
Generative Model
Probability Distribution (Introduction)
Our dataset may not contain these strange images.
x4 is a 64x64x3 high-dimensional vector representing a very strange image, so the probability density value is almost 0.
15
Generative Model
Probability Distribution (Introduction)
The goal of the generative model is to find a p_model(x) that approximates p_data(x) well.
p_data(x): distribution of actual images. p_model(x): distribution of images generated by the model.
(The slide overlays the two density curves over x, with x1, x2, x3, x4 marked.)
Generative Adversarial Networks
02
17
Intuition in GAN (GANs)
Two networks: a Generator and a Discriminator.
• Training with real images: a real image x goes into the Discriminator, which outputs D(x), the probability (0~1) that x came from the real data. This value may be high.
• Training with fake images: a latent code z goes into the Generator, producing a fake image G(z); the Discriminator outputs D(G(z)), which may be low.
18
Intuition in GAN (GANs)
Training with real images: a real image x (64x64x3) goes into the Discriminator (a neural network), which outputs D(x). This value should be close to 1: the discriminator should classify a real image as real.
19
Intuition in GAN (GANs)
Training with fake images: a fake image G(z) (64x64x3) generated by the Generator goes into the Discriminator, which outputs D(G(z)). This value should be close to 0: the discriminator should classify a fake image as fake.
20
Intuition in GAN (GANs)
A latent code z (100) goes into the Generator (a neural network), which outputs a generated image G(z) (64x64x3); the Discriminator then outputs D(G(z)). This value should be close to 1: the generator should create an image that is indistinguishable from real to deceive the discriminator.
21
Objective Function of GAN (GANs)

Objective function:
min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]

Discriminator's view: D should maximize V(D, G).
• First term (training with real images): sample x from the real data distribution; the term is maximal when D(x) = 1, i.e. train D to classify real images as real.
• Second term (training with fake images): sample the latent code z from a Gaussian distribution; the term is maximal when D(G(z)) = 0, i.e. train D to classify fake images as fake.
22
Objective Function of GAN (GANs)

Objective function:
min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]

Generator's view: G should minimize V(D, G). The first term is independent of G; the second term is minimal when D(G(z)) = 1, i.e. train G to deceive D.
23
PyTorch Implementation (GANs)
Define the discriminator: input size 784, hidden size 128, output size 1.
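The slide shows the code as an image; a minimal sketch of a discriminator with these sizes might look like the following (the LeakyReLU activation and its slope are assumptions, not read off the slide):

```python
import torch.nn as nn

# Discriminator: flattened 28x28 MNIST image (784) -> 128 -> 1, sigmoid output in (0, 1).
D = nn.Sequential(
    nn.Linear(784, 128),
    nn.LeakyReLU(0.2),   # activation choice is an assumption
    nn.Linear(128, 1),
    nn.Sigmoid())
```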
24
PyTorch Implementation (GANs)
Discriminator: assume x is a flattened MNIST image (784 dimensions); the output is a probability (1 dimension).
25
PyTorch Implementation (GANs)
Define the generator: input size 100, hidden size 128, output size 784.
Latent code (100 dimensions) → Generator → generated image (784 dimensions)
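A matching generator sketch with these sizes (the ReLU and Tanh activations are assumptions; Tanh corresponds to images normalized to [-1, 1]):

```python
import torch.nn as nn

# Generator: latent code (100) -> 128 -> flattened image (784).
G = nn.Sequential(
    nn.Linear(100, 128),
    nn.ReLU(),
    nn.Linear(128, 784),
    nn.Tanh())   # Tanh assumes images are scaled to [-1, 1]
```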
26
PyTorch Implementation (GANs)
Binary Cross Entropy Loss(h(x), y) = -y·log h(x) - (1 - y)·log(1 - h(x))
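In PyTorch this loss is available as nn.BCELoss; a one-line sketch continuing the code above:

```python
import torch.nn as nn

# Mean over the batch of -[y*log(h(x)) + (1-y)*log(1-h(x))].
criterion = nn.BCELoss()
```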
27
PyTorch Implementation (GANs)
Optimizer for D and G.
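A sketch using one Adam optimizer per network, continuing the code above; the learning rate is an assumption, borrowed from the DCGAN guideline later in the deck:

```python
import torch

d_optimizer = torch.optim.Adam(D.parameters(), lr=0.0002)
g_optimizer = torch.optim.Adam(G.parameters(), lr=0.0002)
```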
28
PyTorch Implementation (GANs)
x is a tensor of shape (batch_size, 784).
z is a tensor of shape (batch_size, 100).
29
PyTorch Implementation (GANs)
Train the discriminator with real images (D(x) gets closer to 1) and with fake images (D(G(z)) gets closer to 0): forward pass, backward pass, and gradient descent.
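A sketch of one discriminator update, continuing the code above; x is the real batch of shape (batch_size, 784), and details such as summing the two losses and detaching the fake batch are assumptions, not taken from the slide:

```python
real_labels = torch.ones(batch_size, 1)
fake_labels = torch.zeros(batch_size, 1)

# Train D with real images: push D(x) toward 1.
d_loss_real = criterion(D(x), real_labels)

# Train D with fake images: push D(G(z)) toward 0.
z = torch.randn(batch_size, 100)
fake_images = G(z)
d_loss_fake = criterion(D(fake_images.detach()), fake_labels)  # detach so G gets no gradient here

d_loss = d_loss_real + d_loss_fake
d_optimizer.zero_grad()
d_loss.backward()
d_optimizer.step()
```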
30
PyTorch Implementation (GANs)
Train the generator to deceive the discriminator (D(G(z)) gets closer to 1): forward pass, backward pass, and gradient descent.
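A sketch of one generator update, continuing the code above; labeling the fake images as real is exactly the non-saturating loss discussed a few slides later:

```python
# Train G to deceive D: push D(G(z)) toward 1.
z = torch.randn(batch_size, 100)
fake_images = G(z)
g_loss = criterion(D(fake_images), real_labels)   # fake images labeled as real (1)

g_optimizer.zero_grad()
g_loss.backward()
g_optimizer.step()
```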
31
PyTorch Implementation (GANs)
The complete code can be found here:
https://github.com/yunjey/pytorch-tutorial
32
Non-Saturating Game (GANs)
Objective function of G: min_G E_{z~p_z(z)}[log(1 - D(G(z)))]
The curve y = log(1 - x) has a relatively small gradient at D(G(z)) = 0.
At the beginning of training, the discriminator can clearly classify the generated image as fake because the quality of the image is very low. This means that D(G(z)) is almost zero at the early stages of training.
(The slide shows images created by the generator at the beginning of training.)
33
Non-Saturating Game (GANs)
Solution for the poor gradient: a heuristically motivated modification.
min_G E_{z~p_z(z)}[log(1 - D(G(z)))]  →  max_G E_{z~p_z(z)}[log D(G(z))]
The curve y = log(x) has a very large gradient at D(G(z)) = 0.
• Practical usage: use the binary cross entropy loss with the fake images labeled as real (y = 1):
min_G E_{z~p_z(z)}[-y·log D(G(z)) - (1 - y)·log(1 - D(G(z)))] = min_G E_{z~p_z(z)}[-log D(G(z))]
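A small sanity check (not from the slides) of the two gradient magnitudes, using PyTorch autograd at a D(G(z)) value close to 0, as in early training:

```python
import torch

d = torch.tensor(0.001, requires_grad=True)   # D(G(z)) close to 0, as in early training

loss_saturating = torch.log(1 - d)            # original term: log(1 - D(G(z)))
loss_saturating.backward()
print(d.grad)                                 # ~ -1.001: gradient magnitude stays around 1

d.grad = None
loss_non_saturating = -torch.log(d)           # heuristic term: -log D(G(z))
loss_non_saturating.backward()
print(d.grad)                                 # ~ -1000: gradient magnitude ~ 1 / D(G(z))
```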
34
Theory in GAN (GANs)
• Why do GANs work? Because the training objective actually minimizes the distance between the real data distribution and the model distribution.

Objective function of GANs:
min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]

Solving min_G max_D V(D, G) is the same as solving min_G JSD(p_data || p_g), where the Jensen-Shannon divergence is
JSD(P || Q) = (1/2)·KL(P || M) + (1/2)·KL(Q || M), with M = (1/2)(P + Q) and KL the KL divergence.

Please see the Appendix for details. (The slide plots p_data(x) and p_g(x) over x.)
Variants of GAN
03
36
DCGAN Variants of
GAN
Radford et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015
• Deep Convolutional GAN (DCGAN), 2015
The authors present a convolutional GAN architecture that is still widely preferred in practice.
37
DCGAN (Variants of GAN)
• Discriminator: use (strided) convolutions with Leaky ReLU.
• Generator: use deconvolutions (transposed convolutions) with ReLU.
• No pooling layers (strided convolutions instead).
• Use batch normalization.
• Adam optimizer (lr=0.0002, beta1=0.5, beta2=0.999).
A generator sketch following these guidelines is shown below.
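A minimal generator sketch following the guidelines above; the filter counts, kernel sizes, and the 64x64x3 output resolution are assumptions, not taken from the paper figure:

```python
import torch.nn as nn

# z of shape (N, 100, 1, 1) -> 64x64x3 image, using strided transposed convolutions,
# batch normalization, and ReLU (Tanh on the output), with no pooling layers.
netG = nn.Sequential(
    nn.ConvTranspose2d(100, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(),  # 4x4
    nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(),  # 8x8
    nn.ConvTranspose2d(128, 64, 4, 2, 1),  nn.BatchNorm2d(64),  nn.ReLU(),  # 16x16
    nn.ConvTranspose2d(64, 32, 4, 2, 1),   nn.BatchNorm2d(32),  nn.ReLU(),  # 32x32
    nn.ConvTranspose2d(32, 3, 4, 2, 1),    nn.Tanh())                       # 64x64
```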
38
DCGAN Variants of
GAN
Radford et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015
• Latent vector arithmetic
39
LSGAN Variants of
GAN
Xudong Mao et al. Least Squares Generative Adversarial Networks, 2016
• Least Squares GAN (LSGAN)
Proposed a GAN model that adopts the least squares loss function for the discriminator.
40
LSGAN (Variants of GAN)
Vanilla GAN → LSGAN: remove the sigmoid non-linearity in the last layer of the discriminator.
41
LSGAN (Variants of GAN)
Vanilla GAN → LSGAN: the generator is the same as in the original GAN.
42
LSGAN (Variants of GAN)
Vanilla GAN → LSGAN: replace the cross entropy loss with a least squares (L2) loss for the discriminator.
D(x) gets closer to 1 and D(G(z)) gets closer to 0 (same as the original).
43
LSGAN (Variants of GAN)
Vanilla GAN → LSGAN: replace the cross entropy loss with a least squares (L2) loss for the generator.
D(G(z)) gets closer to 1 (same as the original). A sketch of both losses follows.
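A hedged sketch of the LSGAN losses, reusing D, G, x, and z from the earlier implementation slides but assuming the sigmoid has been removed from D's last layer; the 0/1 targets and the 1/2 factors follow a common choice of constants and are assumptions here:

```python
import torch

# Discriminator: least squares loss instead of cross entropy (D outputs raw scores).
d_out_real = D(x)
d_out_fake = D(G(z).detach())
d_loss = 0.5 * torch.mean((d_out_real - 1) ** 2) + 0.5 * torch.mean(d_out_fake ** 2)

# Generator: push D(G(z)) toward 1 with a least squares penalty.
g_loss = 0.5 * torch.mean((D(G(z)) - 1) ** 2)
```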
44
LSGAN Variants of
GAN
• Results (LSUN dataset)
Xudong Mao et al. Least Squares Generative Adversarial Networks, 2016
45
LSGAN Variants of
GAN
• Results (CelebA)
46
SGAN (Variants of GAN)
• Semi-Supervised GAN
The discriminator ends with an FC layer with softmax over 11 dimensions (10 classes + fake).
• Training with real images: for a real image of a 2, the discriminator should output the one-hot vector representing 2.
• Training with fake images: the generator produces a fake image from a latent vector z together with a class one-hot vector (e.g. representing 5); the discriminator should output the one-hot vector representing the fake label.
Augustus Odena et al. Semi-Supervised Learning with Generative Adversarial Networks, 2016
47
SGAN Variants of
GAN
• Results (Game Character)
1 2 3 4 5
The generator can create a character image that takes a certain pose.
one-hot vectors
representing class labels
The code will be available soon: https://github.com/yunjey
48
ACGAN Variants of
GAN
Augustus Odena et al. Conditional Image Synthesis with Auxiliary Classifier GANs, 2016
• Auxiliary Classifier GAN (ACGAN), 2016
Proposed a new method for improved training of GANs using class labels.
49
ACGAN (Variants of GAN)
• How does it work?
The discriminator is trained with multi-task learning and has two heads: (1) an FC layer with sigmoid that predicts real or fake, and (2) an FC layer with softmax that predicts the class label.
• Training with real images: for a real image of a 2, head (1) should say real and head (2) should output the one-hot vector representing 2.
• Training with fake images: the generator produces a fake image from a latent vector z and a class label (e.g. the one-hot vector representing 5); head (1) should say fake, while head (2) should still output the one-hot vector for the class the image was conditioned on.
Extensions of GAN
04
51
CycleGAN Extensions
• CycleGAN: Unpaired Image-to-Image Translation
Presents a GAN model that transfers an image from a source domain A to a target domain B in the absence of paired examples.
Jun-Yan Zhu et al. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, 2017
52
CycleGAN (Extensions)
• How does it work?
A real image in domain A (a zebra) goes through the generator G_AB to produce a fake image in domain B (a horse). The discriminator D_B for domain B sees real domain-B images and the fakes and predicts real or fake. The generator G_AB should generate a horse from the zebra to deceive the discriminator D_B.
53
CycleGAN (Extensions)
• How does it work?
G_BA maps the fake domain-B image back to domain A, producing a reconstructed image, and an L2 reconstruction loss is applied between the real domain-A image and the reconstruction. This forces the shape to be preserved when G_AB generates a horse image from the zebra, while D_B still predicts real or fake in domain B. A sketch of the combined loss follows.
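A hedged sketch of the G_AB-side losses described above; G_AB, G_BA, D_B, and real_A are hypothetical placeholders, the L2 reconstruction term follows the slide (the original paper uses L1), the least-squares adversarial term is one common choice, and the cycle weight of 10 is an assumption:

```python
import torch

fake_B = G_AB(real_A)                              # zebra -> horse
rec_A  = G_BA(fake_B)                              # horse -> reconstructed zebra

adv_loss   = torch.mean((D_B(fake_B) - 1) ** 2)    # fool the domain-B discriminator
cycle_loss = torch.mean((rec_A - real_A) ** 2)     # L2 reconstruction loss from the slide
g_loss = adv_loss + 10.0 * cycle_loss              # cycle weight is an assumption
```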
54
CycleGAN Extensions
• Results
Jun-Yan Zhu et al. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, 2017
55
CycleGAN Extensions
• Results
MNIST-to-SVHN and SVHN-to-MNIST
Odd columns contain real images and even columns contain generated images.
https://github.com/yunjey/mnist-svhn-transfer
56
Text2Image
Scott Reed et al. Generative Adversarial Text to Image Synthesis, 2016
• Generative Adversarial Text to Image Synthesis, 2016
Presents a novel model architecture that generates an image from a text description.
Extensions
57
Text2Image (Extensions)
• Training with (real image, right text)
The sentence "A small red bird with a black beak" is encoded as a sentence embedding (100) and concatenated with the image features at the last conv layer of the discriminator. Given the real image (128x128) and the right text, D is asked: is the image real and relevant to the sentence? The discriminator should say 'yes'.
58
Text2Image (Extensions)
• Training with (fake image, right text)
The generator takes z (100) and the sentence embedding (100) of the right text "A small red bird with a black beak" and produces a fake image (128x128). D is asked: is the image real and relevant to the sentence? The discriminator should say 'no'. The generator should create an image relevant to the sentence to deceive the discriminator.
59
Text2Image (Extensions)
• Training with (real image, wrong text)
The real image (128x128) is paired with a wrong text, "A small yellow bird with a brown beak" (sampled randomly from the training data), via its sentence embedding (100). D is asked: is the image real and relevant to the sentence? The discriminator should say 'no'. A sketch of the three discriminator terms follows.
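A hedged sketch of the three discriminator terms from the last three slides; D(image, text_embedding), G(z, text_embedding), real_image, right_text, and wrong_text are hypothetical placeholders, and the binary cross entropy criterion is reused from the earlier GAN sketch:

```python
import torch

real_labels = torch.ones(batch_size, 1)
fake_labels = torch.zeros(batch_size, 1)

d_loss  = criterion(D(real_image, right_text), real_labels)         # (real image, right text)  -> 'yes'
d_loss += criterion(D(G(z, right_text), right_text), fake_labels)   # (fake image, right text)  -> 'no'
d_loss += criterion(D(real_image, wrong_text), fake_labels)         # (real image, wrong text)  -> 'no'
```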
60
StackGAN
Han Zhang et al. StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks, 2016
• StackGAN:Text to Photo-realistic Image Synthesis
Extensions
61
StackGAN (Extensions)
• Generating 128x128 from scratch
z (100) → G → fake image (128x128); D is the discriminator for 128x128 images ('real' or 'fake', comparing against real 128x128 images). Generating a 128x128 image from scratch does not guarantee a good result.
62
StackGAN (Extensions)
• Generating 128x128 from 64x64
Stage 1: G1 generates a 64x64 image from z (100); D1 is the discriminator for 64x64 images ('real' or 'fake').
Stage 2: G2 upscales the 64x64 image to 128x128 (easier than generating from scratch); D2 is the discriminator for 128x128 images ('real' or 'fake').
Future of GAN
05
64
Convergence Measure
• Boundary Equilibrium GAN (BEGAN)
Extensions
David Berthelot et al. BEGAN: Boundary Equilibrium Generative Adversarial Networks, 2017
65
Convergence Measure
• Reconstruction Loss
Extensions
Sitao Xiang. On the effect of Batch Normalization and Weight Normalization in Generative Adversarial Network, 2017
z → G → G(z), compared against test images x.
66
Better Upsampling
• Deconvolution Checkerboard Artifacts
Extensions
http://distill.pub/2016/deconv-checkerboard/
67
Better Upsampling
• Deconvolution vs Resize-Convolution
Extensions
http://distill.pub/2016/deconv-checkerboard/
68
GAN in Supervised Learning (Extensions)
• Machine Translation (Seq2Seq)
An encoder reads the source sentence 'A B C D' and a decoder emits the target '<start> X Y Z <end>'. Should 'ABCD' be translated to 'XYZ'? GANs can also tackle this supervised learning setting.
69
GAN in Supervised Learning (Extensions)
• Machine Translation (GANs)
• Training with real sentences: given A (English) and B (Korean), the discriminator is asked "do A and B have the same meaning?" and should say 'yes'.
• Training with fake sentences: given A (English) and B (fake Korean), the discriminator is asked the same question and should say 'no'. The generator should generate a B that has the same meaning as A to deceive the discriminator.
Lijun Wu et al. Adversarial Neural Machine Translation, 2017
Thank you
Appendix
06
72
Theory in GAN (GANs)

Objective function of GANs:
min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]
where p_data is the real data distribution over high-dimensional vectors (e.g. 64x64) and p_z is a Gaussian distribution over low-dimensional vectors (e.g. 100).

Step 1: find the optimal discriminator. Fix G so that V becomes a function of D alone, and take D* = argmax_D V(D), where

V(D) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]
     = E_{x~p_data(x)}[log D(x)] + E_{x~p_g(x)}[log(1 - D(x))]
       (sample x from the model distribution p_g over high-dimensional vectors instead of sampling z from p_z)
     = ∫_x p_data(x) log D(x) dx + ∫_x p_g(x) log(1 - D(x)) dx
       (definition of expectation: E_{x~p(x)}[f(x)] = ∫_x p(x) f(x) dx, integrating over all possible x)
     = ∫_x [ p_data(x) log D(x) + p_g(x) log(1 - D(x)) ] dx
       (basic property of integrals)

Now we need the D(x) that maximizes the function inside the integral:

argmax_D  p_data(x) log D(x) + p_g(x) log(1 - D(x))

Substitute a = p_data(x), y = D(x), b = p_g(x) (note that D(x) cannot affect p_data(x) and p_g(x)), giving a log y + b log(1 - y). Differentiating with respect to y, using d/dx log f(x) = f'(x)/f(x):

a/y - b/(1 - y) = (a - (a + b)y) / (y(1 - y))

Find the point where the derivative is 0 (a local extreme): y = a/(a + b). This local maximum is the global maximum, since there are no other local extremes. Substituting a = p_data(x), y = D(x), b = p_g(x) back:

D*(x) = p_data(x) / (p_data(x) + p_g(x))
83
Theory in GAN (GANs)

Step 2: plug the optimal discriminator back in. Define C(G) = max_D V(D, G); this is the objective function that G should minimize.

C(G) = max_D V(D, G)
     = E_{x~p_data}[log D*(x)] + E_{x~p_g}[log(1 - D*(x))]
     = E_{x~p_data}[log ( p_data(x) / (p_data(x) + p_g(x)) )] + E_{x~p_g}[log ( p_g(x) / (p_data(x) + p_g(x)) )]
     = ∫_x p_data(x) log ( p_data(x) / (p_data(x) + p_g(x)) ) dx + ∫_x p_g(x) log ( p_g(x) / (p_data(x) + p_g(x)) ) dx
       (definition of expectation)
     = -log 4 + log 4 + ∫_x p_data(x) log ( p_data(x) / (p_data(x) + p_g(x)) ) dx + ∫_x p_g(x) log ( p_g(x) / (p_data(x) + p_g(x)) ) dx
     = -log 4 + ∫_x p_data(x) log ( 2·p_data(x) / (p_data(x) + p_g(x)) ) dx + ∫_x p_g(x) log ( 2·p_g(x) / (p_data(x) + p_g(x)) ) dx

How is the log 4 absorbed? Since ∫_x p_data(x) dx = 1 (property of a probability density function), we can use the trick log 2 = ∫_x p_data(x)·log 2 dx, so

log 2 + ∫_x p_data(x) log ( p_data(x) / (p_data(x) + p_g(x)) ) dx = ∫_x p_data(x) log ( 2·p_data(x) / (p_data(x) + p_g(x)) ) dx

and likewise for the p_g term; the two log 2 contributions together account for the log 4.
93
Theory in GAN (GANs)

Step 3: recognize the KL and JSD terms. By the definition of the KL divergence, KL(P || Q) = ∫_x P(x) log ( P(x) / Q(x) ) dx, each integral above is a KL divergence to the mixture (p_data + p_g)/2, e.g.

∫_x p_data(x) log ( p_data(x) / ((p_data(x) + p_g(x)) / 2) ) dx = KL(p_data || (p_data + p_g)/2)

Therefore

C(G) = V(D*, G)
     = -log 4 + KL(p_data || (p_data + p_g)/2) + KL(p_g || (p_data + p_g)/2)
     = -log 4 + 2·JSD(p_data || p_g)

Since min_G max_D V(D, G) = min_G V(D*, G) (optimal D), and G should minimize V(D*, G), optimizing V(D, G) is the same as minimizing JSD(p_data || p_g).
More Related Content

What's hot

Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)Appsilon Data Science
 
Generative Adversarial Networks (GANs)
Generative Adversarial Networks (GANs)Generative Adversarial Networks (GANs)
Generative Adversarial Networks (GANs)Amol Patil
 
Generative Adversarial Networks
Generative Adversarial NetworksGenerative Adversarial Networks
Generative Adversarial NetworksMark Chang
 
Introduction to Generative Adversarial Networks
Introduction to Generative Adversarial NetworksIntroduction to Generative Adversarial Networks
Introduction to Generative Adversarial NetworksBennoG1
 
Generative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their ApplicationsGenerative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their ApplicationsArtifacia
 
GANs and Applications
GANs and ApplicationsGANs and Applications
GANs and ApplicationsHoang Nguyen
 
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAIGenerative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAIWithTheBest
 
Introduction To Generative Adversarial Networks GANs
Introduction To Generative Adversarial Networks GANsIntroduction To Generative Adversarial Networks GANs
Introduction To Generative Adversarial Networks GANsHichem Felouat
 
Finding connections among images using CycleGAN
Finding connections among images using CycleGANFinding connections among images using CycleGAN
Finding connections among images using CycleGANNAVER Engineering
 
A Short Introduction to Generative Adversarial Networks
A Short Introduction to Generative Adversarial NetworksA Short Introduction to Generative Adversarial Networks
A Short Introduction to Generative Adversarial NetworksJong Wook Kim
 
Variational Autoencoder
Variational AutoencoderVariational Autoencoder
Variational AutoencoderMark Chang
 
Unsupervised learning represenation with DCGAN
Unsupervised learning represenation with DCGANUnsupervised learning represenation with DCGAN
Unsupervised learning represenation with DCGANShyam Krishna Khadka
 
Generative Adversarial Network (GAN)
Generative Adversarial Network (GAN)Generative Adversarial Network (GAN)
Generative Adversarial Network (GAN)Prakhar Rastogi
 
Generative Adversarial Networks and Their Applications in Medical Imaging
Generative Adversarial Networks  and Their Applications in Medical ImagingGenerative Adversarial Networks  and Their Applications in Medical Imaging
Generative Adversarial Networks and Their Applications in Medical ImagingSanghoon Hong
 
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)Universitat Politècnica de Catalunya
 
Generative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging ApplicationsGenerative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging ApplicationsKyuhwan Jung
 
Optimization for Deep Learning
Optimization for Deep LearningOptimization for Deep Learning
Optimization for Deep LearningSebastian Ruder
 
Research of adversarial example on a deep neural network
Research of adversarial example on a deep neural networkResearch of adversarial example on a deep neural network
Research of adversarial example on a deep neural networkNAVER Engineering
 
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
GANs Presentation.pptx
GANs Presentation.pptxGANs Presentation.pptx
GANs Presentation.pptxMAHMOUD729246
 

What's hot (20)

Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)
 
Generative Adversarial Networks (GANs)
Generative Adversarial Networks (GANs)Generative Adversarial Networks (GANs)
Generative Adversarial Networks (GANs)
 
Generative Adversarial Networks
Generative Adversarial NetworksGenerative Adversarial Networks
Generative Adversarial Networks
 
Introduction to Generative Adversarial Networks
Introduction to Generative Adversarial NetworksIntroduction to Generative Adversarial Networks
Introduction to Generative Adversarial Networks
 
Generative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their ApplicationsGenerative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their Applications
 
GANs and Applications
GANs and ApplicationsGANs and Applications
GANs and Applications
 
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAIGenerative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
 
Introduction To Generative Adversarial Networks GANs
Introduction To Generative Adversarial Networks GANsIntroduction To Generative Adversarial Networks GANs
Introduction To Generative Adversarial Networks GANs
 
Finding connections among images using CycleGAN
Finding connections among images using CycleGANFinding connections among images using CycleGAN
Finding connections among images using CycleGAN
 
A Short Introduction to Generative Adversarial Networks
A Short Introduction to Generative Adversarial NetworksA Short Introduction to Generative Adversarial Networks
A Short Introduction to Generative Adversarial Networks
 
Variational Autoencoder
Variational AutoencoderVariational Autoencoder
Variational Autoencoder
 
Unsupervised learning represenation with DCGAN
Unsupervised learning represenation with DCGANUnsupervised learning represenation with DCGAN
Unsupervised learning represenation with DCGAN
 
Generative Adversarial Network (GAN)
Generative Adversarial Network (GAN)Generative Adversarial Network (GAN)
Generative Adversarial Network (GAN)
 
Generative Adversarial Networks and Their Applications in Medical Imaging
Generative Adversarial Networks  and Their Applications in Medical ImagingGenerative Adversarial Networks  and Their Applications in Medical Imaging
Generative Adversarial Networks and Their Applications in Medical Imaging
 
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
 
Generative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging ApplicationsGenerative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging Applications
 
Optimization for Deep Learning
Optimization for Deep LearningOptimization for Deep Learning
Optimization for Deep Learning
 
Research of adversarial example on a deep neural network
Research of adversarial example on a deep neural networkResearch of adversarial example on a deep neural network
Research of adversarial example on a deep neural network
 
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
 
GANs Presentation.pptx
GANs Presentation.pptxGANs Presentation.pptx
GANs Presentation.pptx
 

Viewers also liked

AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017Carol Smith
 
TCP가 실패하는 상황들
TCP가 실패하는 상황들TCP가 실패하는 상황들
TCP가 실패하는 상황들ssuser7c5a40
 
Environmental impact assessment methodology
Environmental impact assessment methodologyEnvironmental impact assessment methodology
Environmental impact assessment methodologyJustin Joy
 
Environmental impact assessment
Environmental impact assessmentEnvironmental impact assessment
Environmental impact assessmentSayyid Ina
 
Environmental Impact Assessment - University of Winnipeg
Environmental Impact Assessment - University of WinnipegEnvironmental Impact Assessment - University of Winnipeg
Environmental Impact Assessment - University of WinnipegJohn Gunter
 
Deep Learning with PyTorch
Deep Learning with PyTorchDeep Learning with PyTorch
Deep Learning with PyTorchMayur Bhangale
 
Imelda Winters Presentation Environmental Impact Assessment
Imelda Winters Presentation Environmental Impact AssessmentImelda Winters Presentation Environmental Impact Assessment
Imelda Winters Presentation Environmental Impact AssessmentImelda Winters
 
Environmental Audit and Environmental Impact Assessment
Environmental Audit and Environmental Impact AssessmentEnvironmental Audit and Environmental Impact Assessment
Environmental Audit and Environmental Impact AssessmentEffah Effervescence
 
Deep learning an Introduction with Competitive Landscape
Deep learning an Introduction with Competitive LandscapeDeep learning an Introduction with Competitive Landscape
Deep learning an Introduction with Competitive LandscapeShivaji Dutta
 
Methods of eia(environmental impact assessment)
Methods of eia(environmental impact assessment)Methods of eia(environmental impact assessment)
Methods of eia(environmental impact assessment)Akhil Chibber
 
Environmental impact assessment
Environmental impact assessmentEnvironmental impact assessment
Environmental impact assessmentKashmeera N.A.
 
Environmental impact assessment (eia) By Mr Allah Dad Khan Visiting Professor...
Environmental impact assessment (eia) By Mr Allah Dad Khan Visiting Professor...Environmental impact assessment (eia) By Mr Allah Dad Khan Visiting Professor...
Environmental impact assessment (eia) By Mr Allah Dad Khan Visiting Professor...Mr.Allah Dad Khan
 
Pytorch for tf_developers
Pytorch for tf_developersPytorch for tf_developers
Pytorch for tf_developersAbdul Muneer
 
Summarization of Environmental Impact Assessment Methodology by Dr. I.M. Mis...
Summarization of Environmental Impact  Assessment Methodology by Dr. I.M. Mis...Summarization of Environmental Impact  Assessment Methodology by Dr. I.M. Mis...
Summarization of Environmental Impact Assessment Methodology by Dr. I.M. Mis...Arvind Kumar
 
ENVIRONMENTAL IMPACT ASSESSMENT - EIA
ENVIRONMENTAL IMPACT ASSESSMENT - EIAENVIRONMENTAL IMPACT ASSESSMENT - EIA
ENVIRONMENTAL IMPACT ASSESSMENT - EIASakthivel R
 
Seminar on Environmental Impact Assessment
Seminar on Environmental Impact AssessmentSeminar on Environmental Impact Assessment
Seminar on Environmental Impact Assessmentashwinpand90
 
environmental impact assessment
environmental impact assessment environmental impact assessment
environmental impact assessment Pramoda Raj
 
Environmental impact assessment concept
Environmental impact assessment conceptEnvironmental impact assessment concept
Environmental impact assessment conceptIntan Ayuna
 
Environmental impact assessment m5
Environmental impact assessment m5Environmental impact assessment m5
Environmental impact assessment m5Bibhabasu Mohanty
 

Viewers also liked (20)

AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017
 
TCP가 실패하는 상황들
TCP가 실패하는 상황들TCP가 실패하는 상황들
TCP가 실패하는 상황들
 
Matrix calculus
Matrix calculusMatrix calculus
Matrix calculus
 
Environmental impact assessment methodology
Environmental impact assessment methodologyEnvironmental impact assessment methodology
Environmental impact assessment methodology
 
Environmental impact assessment
Environmental impact assessmentEnvironmental impact assessment
Environmental impact assessment
 
Environmental Impact Assessment - University of Winnipeg
Environmental Impact Assessment - University of WinnipegEnvironmental Impact Assessment - University of Winnipeg
Environmental Impact Assessment - University of Winnipeg
 
Deep Learning with PyTorch
Deep Learning with PyTorchDeep Learning with PyTorch
Deep Learning with PyTorch
 
Imelda Winters Presentation Environmental Impact Assessment
Imelda Winters Presentation Environmental Impact AssessmentImelda Winters Presentation Environmental Impact Assessment
Imelda Winters Presentation Environmental Impact Assessment
 
Environmental Audit and Environmental Impact Assessment
Environmental Audit and Environmental Impact AssessmentEnvironmental Audit and Environmental Impact Assessment
Environmental Audit and Environmental Impact Assessment
 
Deep learning an Introduction with Competitive Landscape
Deep learning an Introduction with Competitive LandscapeDeep learning an Introduction with Competitive Landscape
Deep learning an Introduction with Competitive Landscape
 
Methods of eia(environmental impact assessment)
Methods of eia(environmental impact assessment)Methods of eia(environmental impact assessment)
Methods of eia(environmental impact assessment)
 
Environmental impact assessment
Environmental impact assessmentEnvironmental impact assessment
Environmental impact assessment
 
Environmental impact assessment (eia) By Mr Allah Dad Khan Visiting Professor...
Environmental impact assessment (eia) By Mr Allah Dad Khan Visiting Professor...Environmental impact assessment (eia) By Mr Allah Dad Khan Visiting Professor...
Environmental impact assessment (eia) By Mr Allah Dad Khan Visiting Professor...
 
Pytorch for tf_developers
Pytorch for tf_developersPytorch for tf_developers
Pytorch for tf_developers
 
Summarization of Environmental Impact Assessment Methodology by Dr. I.M. Mis...
Summarization of Environmental Impact  Assessment Methodology by Dr. I.M. Mis...Summarization of Environmental Impact  Assessment Methodology by Dr. I.M. Mis...
Summarization of Environmental Impact Assessment Methodology by Dr. I.M. Mis...
 
ENVIRONMENTAL IMPACT ASSESSMENT - EIA
ENVIRONMENTAL IMPACT ASSESSMENT - EIAENVIRONMENTAL IMPACT ASSESSMENT - EIA
ENVIRONMENTAL IMPACT ASSESSMENT - EIA
 
Seminar on Environmental Impact Assessment
Seminar on Environmental Impact AssessmentSeminar on Environmental Impact Assessment
Seminar on Environmental Impact Assessment
 
environmental impact assessment
environmental impact assessment environmental impact assessment
environmental impact assessment
 
Environmental impact assessment concept
Environmental impact assessment conceptEnvironmental impact assessment concept
Environmental impact assessment concept
 
Environmental impact assessment m5
Environmental impact assessment m5Environmental impact assessment m5
Environmental impact assessment m5
 

Similar to Generative adversarial networks

A Walk in the GAN Zoo
A Walk in the GAN ZooA Walk in the GAN Zoo
A Walk in the GAN ZooLarry Guo
 
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GANRecent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GANHao-Wen (Herman) Dong
 
Generative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural NetworksGenerative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural NetworksDenis Dus
 
Tutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksTutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksMLReview
 
Lucas Theis - Compressing Images with Neural Networks - Creative AI meetup
Lucas Theis - Compressing Images with Neural Networks - Creative AI meetupLucas Theis - Compressing Images with Neural Networks - Creative AI meetup
Lucas Theis - Compressing Images with Neural Networks - Creative AI meetupLuba Elliott
 
Generation of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.pptGeneration of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.pptDivyaGugulothu
 
Introduction to Deep Generative Models
Introduction to Deep Generative ModelsIntroduction to Deep Generative Models
Introduction to Deep Generative ModelsHao-Wen (Herman) Dong
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networksKyuri Kim
 
EuroSciPy 2019 - GANs: Theory and Applications
EuroSciPy 2019 - GANs: Theory and ApplicationsEuroSciPy 2019 - GANs: Theory and Applications
EuroSciPy 2019 - GANs: Theory and ApplicationsEmanuele Ghelfi
 
GAN in medical imaging
GAN in medical imagingGAN in medical imaging
GAN in medical imagingCheng-Bin Jin
 
Variational Autoencoded Regression of Visual Data with Generative Adversarial...
Variational Autoencoded Regression of Visual Data with Generative Adversarial...Variational Autoencoded Regression of Visual Data with Generative Adversarial...
Variational Autoencoded Regression of Visual Data with Generative Adversarial...NAVER Engineering
 
GDC2019 - SEED - Towards Deep Generative Models in Game Development
GDC2019 - SEED - Towards Deep Generative Models in Game DevelopmentGDC2019 - SEED - Towards Deep Generative Models in Game Development
GDC2019 - SEED - Towards Deep Generative Models in Game DevelopmentElectronic Arts / DICE
 
Deep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptDeep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptRMDAcademicCoordinat
 
Deep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptDeep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptsomeyamohsen2
 
Deep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptDeep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptGayathriSanthosh11
 
Synthetic Image Data Generation using GAN &Triple GAN.pptx
Synthetic Image Data Generation using GAN &Triple GAN.pptxSynthetic Image Data Generation using GAN &Triple GAN.pptx
Synthetic Image Data Generation using GAN &Triple GAN.pptxRupeshKumar301638
 
GANs Deep Learning Summer School
GANs Deep Learning Summer SchoolGANs Deep Learning Summer School
GANs Deep Learning Summer SchoolRubens Zimbres, PhD
 

Similar to Generative adversarial networks (20)

A Walk in the GAN Zoo
A Walk in the GAN ZooA Walk in the GAN Zoo
A Walk in the GAN Zoo
 
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GANRecent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN
 
Generative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural NetworksGenerative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural Networks
 
Tutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksTutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial Networks
 
Lucas Theis - Compressing Images with Neural Networks - Creative AI meetup
Lucas Theis - Compressing Images with Neural Networks - Creative AI meetupLucas Theis - Compressing Images with Neural Networks - Creative AI meetup
Lucas Theis - Compressing Images with Neural Networks - Creative AI meetup
 
Generative adversarial text to image synthesis
Generative adversarial text to image synthesisGenerative adversarial text to image synthesis
Generative adversarial text to image synthesis
 
Generation of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.pptGeneration of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.ppt
 
Introduction to Deep Generative Models
Introduction to Deep Generative ModelsIntroduction to Deep Generative Models
Introduction to Deep Generative Models
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks
 
EuroSciPy 2019 - GANs: Theory and Applications
EuroSciPy 2019 - GANs: Theory and ApplicationsEuroSciPy 2019 - GANs: Theory and Applications
EuroSciPy 2019 - GANs: Theory and Applications
 
GAN in medical imaging
GAN in medical imagingGAN in medical imaging
GAN in medical imaging
 
Variational Autoencoded Regression of Visual Data with Generative Adversarial...
Variational Autoencoded Regression of Visual Data with Generative Adversarial...Variational Autoencoded Regression of Visual Data with Generative Adversarial...
Variational Autoencoded Regression of Visual Data with Generative Adversarial...
 
GDC2019 - SEED - Towards Deep Generative Models in Game Development
GDC2019 - SEED - Towards Deep Generative Models in Game DevelopmentGDC2019 - SEED - Towards Deep Generative Models in Game Development
GDC2019 - SEED - Towards Deep Generative Models in Game Development
 
Deep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptDeep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.ppt
 
Deep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptDeep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.ppt
 
Deep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.pptDeep-Learning-2017-Lecture7GAN.ppt
Deep-Learning-2017-Lecture7GAN.ppt
 
그림 그리는 AI
그림 그리는 AI그림 그리는 AI
그림 그리는 AI
 
DiscoGAN
DiscoGANDiscoGAN
DiscoGAN
 
Synthetic Image Data Generation using GAN &Triple GAN.pptx
Synthetic Image Data Generation using GAN &Triple GAN.pptxSynthetic Image Data Generation using GAN &Triple GAN.pptx
Synthetic Image Data Generation using GAN &Triple GAN.pptx
 
GANs Deep Learning Summer School
GANs Deep Learning Summer SchoolGANs Deep Learning Summer School
GANs Deep Learning Summer School
 

Recently uploaded

%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park masabamasaba
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfonteinmasabamasaba
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park masabamasaba
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrandmasabamasaba
 
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxBUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxalwaysnagaraju26
 
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verifiedDelhi Call girls
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdfPearlKirahMaeRagusta1
 
LEVEL 5 - SESSION 1 2023 (1).pptx - PDF 123456
LEVEL 5   - SESSION 1 2023 (1).pptx - PDF 123456LEVEL 5   - SESSION 1 2023 (1).pptx - PDF 123456
LEVEL 5 - SESSION 1 2023 (1).pptx - PDF 123456KiaraTiradoMicha
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfayushiqss
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
ManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide DeckManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide DeckManageIQ
 
Generative adversarial networks

• 18. Intuition in GAN (GANs). Training with real images: the discriminator D (a neural network) takes a real image x (64x64x3) and outputs D(x). This value should be close to 1: the discriminator should classify a real image as real.
• 19. Intuition in GAN (GANs). Training with fake images: the generator G produces a fake image G(z) (64x64x3) from a latent code z, and the discriminator outputs D(G(z)). This value should be close to 0: the discriminator should classify a fake image as fake.
• 20. Intuition in GAN (GANs). Training the generator: the generator (a neural network) maps a latent code z (100) to a generated image G(z) (64x64x3). From the generator's side, D(G(z)) should be close to 1: the generator should create an image that is indistinguishable from real images in order to deceive the discriminator.
• 21. Objective Function of GAN (GANs). The objective function is $\min_G \max_D V(D,G) = \mathbb{E}_{x\sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z\sim p_z(z)}[\log(1-D(G(z)))]$. Sample $x$ from the real data distribution and the latent code $z$ from a Gaussian distribution. D should maximize $V(D,G)$: the first term is maximal when $D(x)=1$ (train D to classify real images as real), and the second term is maximal when $D(G(z))=0$ (train D to classify fake images as fake).
• 22. Objective Function of GAN (GANs). G should minimize $V(D,G)$. G is independent of the first term, and the second term is minimal when $D(G(z))=1$: train G to deceive D.
• 23. PyTorch Implementation (GANs). The following slides implement the training procedure in PyTorch: training the discriminator with real images and with fake images, and training the generator.
• 24. PyTorch Implementation (GANs). Define the discriminator with input size 784, hidden size 128, and output size 1. Assume x is an MNIST image flattened to 784 dimensions; the output is a 1-dimensional probability.
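The slide's code appears only as a screenshot, so here is a minimal sketch consistent with the sizes stated above (784 -> 128 -> 1); the LeakyReLU hidden activation is an assumption (a common choice), not something visible on the slide.

import torch
import torch.nn as nn

# Discriminator: flattened 28x28 MNIST image (784) -> hidden (128) -> probability of "real" (1)
D = nn.Sequential(
    nn.Linear(784, 128),
    nn.LeakyReLU(0.2),   # assumed hidden activation
    nn.Linear(128, 1),
    nn.Sigmoid())        # output probability in (0, 1)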
• 25. PyTorch Implementation (GANs). Define the generator with input size 100, hidden size 128, and output size 784: it maps a 100-dimensional latent code to a generated image of 784 dimensions.
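Continuing the sketch above, a generator matching the stated sizes (100 -> 128 -> 784); the ReLU and Tanh activations are assumptions, and Tanh presumes images scaled to [-1, 1].

# Generator: latent code (100) -> hidden (128) -> flattened fake image (784)
G = nn.Sequential(
    nn.Linear(100, 128),
    nn.ReLU(),           # assumed hidden activation
    nn.Linear(128, 784),
    nn.Tanh())           # assumed output activation, for images scaled to [-1, 1]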
• 26. PyTorch Implementation (GANs). Binary cross entropy loss: $\mathrm{BCE}(h(x), y) = -y\log h(x) - (1-y)\log(1-h(x))$.
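In PyTorch this loss is provided by nn.BCELoss; a tiny self-contained example (the numbers are illustrative):

import torch
import torch.nn as nn

criterion = nn.BCELoss()   # BCE(h(x), y) = -y*log(h(x)) - (1-y)*log(1 - h(x))
# Four predicted probabilities scored against all-real (y = 1) targets.
loss = criterion(torch.tensor([[0.9], [0.2], [0.7], [0.5]]), torch.ones(4, 1))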
• 27. PyTorch Implementation (GANs). Define an optimizer for D and another for G.
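A sketch of the two optimizers, reusing the D and G defined in the sketches above; lr=0.0002 is an assumption borrowed from the DCGAN settings mentioned later in the deck, since the slide's exact values are not visible.

import torch

# Separate optimizers so D and G can be updated in alternating steps.
d_optimizer = torch.optim.Adam(D.parameters(), lr=0.0002)
g_optimizer = torch.optim.Adam(G.parameters(), lr=0.0002)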
• 28. PyTorch Implementation (GANs). x is a tensor of shape (batch_size, 784) and z is a tensor of shape (batch_size, 100).
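A sketch of how such tensors could be prepared; the batch size and the random stand-in for an MNIST batch are illustrative assumptions.

import torch

batch_size = 100                             # illustrative
images = torch.rand(batch_size, 1, 28, 28)   # stand-in for one MNIST batch from a DataLoader
x = images.view(batch_size, -1)              # shape (batch_size, 784)
z = torch.randn(batch_size, 100)             # latent codes sampled from a Gaussian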
• 29. PyTorch Implementation (GANs). Train the discriminator with real and fake images (forward, backward, and gradient descent): D(x) gets closer to 1 and D(G(z)) gets closer to 0.
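A sketch of the discriminator update described above, reusing D, G, criterion, d_optimizer, x and z from the previous sketches; the .detach() call (keeping generator gradients out of this step) is standard practice and an assumption about the slide's code.

# Real images should be scored as 1, generated images as 0.
real_labels = torch.ones(batch_size, 1)
fake_labels = torch.zeros(batch_size, 1)

d_loss_real = criterion(D(x), real_labels)              # pushes D(x) toward 1
d_loss_fake = criterion(D(G(z).detach()), fake_labels)  # pushes D(G(z)) toward 0
d_loss = d_loss_real + d_loss_fake

d_optimizer.zero_grad()
d_loss.backward()
d_optimizer.step()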
• 30. PyTorch Implementation (GANs). Train the generator to deceive the discriminator (forward, backward, and gradient descent): D(G(z)) gets closer to 1.
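A sketch of the generator update, again reusing the objects defined above; note that it labels fake images as real (1), which is exactly the non-saturating loss discussed on slide 33.

z = torch.randn(batch_size, 100)
g_loss = criterion(D(G(z)), real_labels)   # pushes D(G(z)) toward 1 to deceive the discriminator

g_optimizer.zero_grad()
g_loss.backward()
g_optimizer.step()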
• 31. PyTorch Implementation (GANs). The complete code can be found at https://github.com/yunjey/pytorch-tutorial.
• 32. Non-Saturating Game (GANs). The objective function of G is $\min_G \mathbb{E}_{z\sim p_z(z)}[\log(1-D(G(z)))]$. At the beginning of training, the images created by the generator are of very low quality, so the discriminator can clearly classify them as fake; this means D(G(z)) is almost zero in the early stages. On the curve $y = \log(1-x)$ the gradient is relatively small at $D(G(z)) = 0$, so the generator receives little learning signal.
• 33. Non-Saturating Game (GANs). Solution for the poor gradient: use the heuristically motivated modification $\max_G \mathbb{E}_{z\sim p_z(z)}[\log D(G(z))]$ instead of $\min_G \mathbb{E}_{z\sim p_z(z)}[\log(1-D(G(z)))]$; on the curve $y = \log x$ the gradient is very large at $D(G(z)) = 0$. Practical usage: apply the binary cross entropy loss to the fake samples with the label set to 1, i.e. $\min_G \mathbb{E}_{z\sim p_z(z)}[-y\log D(G(z)) - (1-y)\log(1-D(G(z)))]$ with $y = 1$, which reduces to $\min_G \mathbb{E}_{z\sim p_z(z)}[-\log D(G(z))]$.
• 34. Theory in GAN (GANs). Why do GANs work? Because optimizing the objective actually minimizes the distance between the real data distribution and the model distribution: solving $\min_G \max_D V(D,G) = \mathbb{E}_{x\sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z\sim p_z(z)}[\log(1-D(G(z)))]$ is the same as solving $\min_G JSD(p_{data}\,\|\,p_g)$, where the Jensen-Shannon divergence is defined from the KL divergence as $JSD(P\|Q) = \frac{1}{2}KL(P\|M) + \frac{1}{2}KL(Q\|M)$ with $M = \frac{1}{2}(P+Q)$. Please see the Appendix for details.
• 36. DCGAN (Variants of GAN). Deep Convolutional GAN (DCGAN), 2015: the authors present a model architecture that is still widely preferred. Radford et al., Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015.
• 37. DCGAN (Variants of GAN). The discriminator uses convolutions with Leaky ReLU; the generator uses deconvolutions (transposed convolutions) with ReLU. There are no pooling layers (strided convolutions are used instead), batch normalization is used, and training uses the Adam optimizer (lr=0.0002, beta1=0.5, beta2=0.999).
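A compact sketch of a DCGAN-style generator/discriminator pair for 64x64x3 images following the recipe above; the layer counts and channel widths are illustrative assumptions, not the paper's exact architecture.

import torch
import torch.nn as nn

# Generator: z of shape (N, 100, 1, 1) -> image of shape (N, 3, 64, 64), via strided transposed convolutions.
G = nn.Sequential(
    nn.ConvTranspose2d(100, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(),  # 4x4
    nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(),  # 8x8
    nn.ConvTranspose2d(128, 64, 4, 2, 1),  nn.BatchNorm2d(64),  nn.ReLU(),  # 16x16
    nn.ConvTranspose2d(64, 32, 4, 2, 1),   nn.BatchNorm2d(32),  nn.ReLU(),  # 32x32
    nn.ConvTranspose2d(32, 3, 4, 2, 1),    nn.Tanh())                       # 64x64

# Discriminator: image (N, 3, 64, 64) -> score (N, 1, 1, 1), via strided convolutions (no pooling).
D = nn.Sequential(
    nn.Conv2d(3, 32, 4, 2, 1),    nn.LeakyReLU(0.2),                        # 32x32
    nn.Conv2d(32, 64, 4, 2, 1),   nn.BatchNorm2d(64),  nn.LeakyReLU(0.2),   # 16x16
    nn.Conv2d(64, 128, 4, 2, 1),  nn.BatchNorm2d(128), nn.LeakyReLU(0.2),   # 8x8
    nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2),   # 4x4
    nn.Conv2d(256, 1, 4, 1, 0),   nn.Sigmoid())                             # 1x1

# Adam with the hyperparameters listed on the slide.
d_optimizer = torch.optim.Adam(D.parameters(), lr=0.0002, betas=(0.5, 0.999))
g_optimizer = torch.optim.Adam(G.parameters(), lr=0.0002, betas=(0.5, 0.999))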
• 38. DCGAN (Variants of GAN). Latent vector arithmetic: arithmetic on latent vectors produces semantically meaningful changes in the generated images. Radford et al., 2015.
• 39. LSGAN (Variants of GAN). Least Squares GAN (LSGAN): proposes a GAN model that adopts the least squares loss function for the discriminator. Xudong Mao et al., Least Squares Generative Adversarial Networks, 2016.
• 40. LSGAN (Variants of GAN). Compared with the vanilla GAN, LSGAN removes the sigmoid non-linearity in the last layer of the discriminator.
• 41. LSGAN (Variants of GAN). The generator is the same as in the original GAN.
• 42. LSGAN (Variants of GAN). For the discriminator, the cross entropy loss is replaced with a least squares (L2) loss; as in the original, D(x) gets closer to 1 and D(G(z)) gets closer to 0.
• 43. LSGAN (Variants of GAN). For the generator, the cross entropy loss is likewise replaced with a least squares (L2) loss; D(G(z)) gets closer to 1, as in the original. A minimal sketch of both losses follows below.
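A minimal sketch of the least-squares losses from slides 42-43, written as standalone functions; it assumes the discriminator now ends in a plain linear layer (no sigmoid) and uses the simple 0/1 target coding (the paper also discusses other codings).

import torch
import torch.nn as nn

mse = nn.MSELoss()

def lsgan_d_loss(D, G, x, z):
    # Push D(x) toward 1 and D(G(z)) toward 0, but with an L2 loss instead of cross entropy.
    real_out = D(x)
    fake_out = D(G(z).detach())
    return mse(real_out, torch.ones_like(real_out)) + mse(fake_out, torch.zeros_like(fake_out))

def lsgan_g_loss(D, G, z):
    # Push D(G(z)) toward 1, same target as the original GAN generator loss.
    fake_out = D(G(z))
    return mse(fake_out, torch.ones_like(fake_out))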
• 44. LSGAN (Variants of GAN). Results on the LSUN dataset. Xudong Mao et al., Least Squares Generative Adversarial Networks, 2016.
• 45. LSGAN (Variants of GAN). Results on the CelebA dataset.
• 46. SGAN (Variants of GAN). Semi-Supervised GAN: the discriminator ends in a fully connected layer with softmax over 11 dimensions (10 classes + 1 fake class). When training with real images, the target is the one-hot vector of the true class (e.g. the digit 2); when training with fake images, the generator takes a latent vector z together with a one-hot class vector (e.g. representing 5) and produces a fake image, and the discriminator's target is the one-hot vector of the fake label. Augustus Odena et al., Semi-Supervised Learning with Generative Adversarial Networks, 2016.
• 47. SGAN (Variants of GAN). Results (game characters): conditioned on one-hot vectors representing class labels, the generator can create a character image that takes a certain pose. The code will be available soon: https://github.com/yunjey
• 48. ACGAN (Variants of GAN). Auxiliary Classifier GAN (ACGAN), 2016: proposes a new method for improved training of GANs using class labels. Augustus Odena et al., Conditional Image Synthesis with Auxiliary Classifier GANs, 2016.
• 49. ACGAN (Variants of GAN). How does it work? The discriminator performs multi-task learning with two heads: (1) a fully connected layer with sigmoid that decides real or fake, and (2) a fully connected layer with softmax that predicts the class label. When training with real images (e.g. a 2), head (1) should say 'real' and head (2) should predict the true class; when training with fake images, the generator takes a latent vector z and a one-hot class vector (e.g. representing 5), and head (1) should say 'fake' while head (2) predicts the class the generator was conditioned on. A sketch of such a two-headed discriminator follows below.
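A minimal sketch of the two-headed ACGAN-style discriminator described above; the shared trunk and layer sizes are illustrative assumptions.

import torch
import torch.nn as nn

class ACDiscriminator(nn.Module):
    # Shared trunk with two heads: (1) real/fake probability, (2) class logits.
    def __init__(self, num_classes=10):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2))
        self.adv_head = nn.Sequential(nn.Linear(256, 1), nn.Sigmoid())  # (1) real or fake?
        self.cls_head = nn.Linear(256, num_classes)                     # (2) class scores (softmax via CrossEntropyLoss)

    def forward(self, x):
        h = self.trunk(x)
        return self.adv_head(h), self.cls_head(h)

During training, the adversarial head would be trained with a binary cross entropy loss and the class head with a cross entropy loss over the class labels, for both real and generated images.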
• 51. CycleGAN (Extensions). CycleGAN: Unpaired Image-to-Image Translation presents a GAN model that transfers an image from a source domain A to a target domain B in the absence of paired examples. Jun-Yan Zhu et al., Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, 2017.
• 52. CycleGAN (Extensions). How does it work? The generator GAB maps a real image in domain A to a fake image in domain B, and the discriminator DB for domain B decides 'real or fake': GAB should generate a horse from the zebra to deceive DB.
• 53. CycleGAN (Extensions). A second generator GBA maps the fake domain-B image back to a reconstructed image of domain A, and an L2 reconstruction loss is applied between the original and the reconstruction. This forces the shape to be maintained when GAB generates a horse image from the zebra. A minimal sketch of both loss terms follows below.
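A minimal sketch of the generator-side objective from slides 52-53, written as a standalone function; G_AB, G_BA, D_B and the batch real_a are assumed to be defined elsewhere (e.g. as convolutional networks), and the reconstruction term follows the slide's L2 loss.

import torch
import torch.nn as nn

mse = nn.MSELoss()

def cyclegan_generator_loss(G_AB, G_BA, D_B, real_a):
    # Adversarial term: G_AB(real_a) should look like a real domain-B image to D_B.
    fake_b = G_AB(real_a)
    d_out = D_B(fake_b)
    adv_loss = mse(d_out, torch.ones_like(d_out))
    # Cycle-consistency term: mapping back with G_BA should reconstruct the original domain-A image.
    rec_a = G_BA(fake_b)
    cycle_loss = mse(rec_a, real_a)
    return adv_loss + cycle_loss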
• 54. CycleGAN (Extensions). Results. Jun-Yan Zhu et al., Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, 2017.
• 55. CycleGAN (Extensions). Results on SVHN-to-MNIST and MNIST-to-SVHN translation: odd columns contain real images and even columns contain generated images. https://github.com/yunjey/mnist-svhn-transfer
• 56. Text2Image (Extensions). Generative Adversarial Text to Image Synthesis, 2016: presents a novel model architecture that generates an image from a text description. Scott Reed et al., Generative Adversarial Text to Image Synthesis, 2016.
• 57. Text2Image (Extensions). Training with (real image, right text): a 100-dimensional sentence embedding of "A small red bird with a black beak" is concatenated with the image features at the discriminator's last conv layer. The discriminator is asked whether the image is real and relevant to the sentence, and for the real 128x128 image it should say 'yes'.
• 58. Text2Image (Extensions). Training with (fake image, right text): the generator takes a latent vector z (100) and the sentence embedding (100) and produces a fake 128x128 image. The discriminator should say 'no', while the generator should create an image relevant to the sentence to deceive the discriminator.
• 59. Text2Image (Extensions). Training with (real image, wrong text): the real 128x128 image is paired with the embedding of a wrong sentence sampled randomly from the training data (e.g. "A small yellow bird with a brown beak"), and the discriminator should say 'no'.
• 60. StackGAN (Extensions). StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks. Han Zhang et al., 2016.
• 61. StackGAN (Extensions). Generating 128x128 from scratch: a single generator maps z (100) directly to a 128x128 image, judged by a discriminator for 128x128 images ('real' or 'fake'). Generating a 128x128 image from scratch does not guarantee a good result.
• 62. StackGAN (Extensions). Generating 128x128 from 64x64: the first generator G1 maps z (100) to a 64x64 image, judged by a discriminator D1 for 64x64 images; the second generator G2 upscales the 64x64 image to 128x128, judged by a discriminator D2 for 128x128 images. Upscaling is easier than generating from scratch.
• 64. Convergence Measure (Extensions). Boundary Equilibrium GAN (BEGAN) provides a measure of training convergence. David Berthelot et al., BEGAN: Boundary Equilibrium Generative Adversarial Networks, 2017.
• 65. Convergence Measure (Extensions). Reconstruction loss: reconstruct held-out test images x through the generator G(z) and use the reconstruction error as a convergence measure. Sitao Xiang, On the Effects of Batch and Weight Normalization in Generative Adversarial Networks, 2017.
• 66. Better Upsampling (Extensions). Deconvolution causes checkerboard artifacts. http://distill.pub/2016/deconv-checkerboard/
• 67. Better Upsampling (Extensions). Deconvolution vs. resize-convolution: resizing (e.g. nearest-neighbor) followed by a convolution avoids the checkerboard artifacts. http://distill.pub/2016/deconv-checkerboard/
• 68. GAN in Supervised Learning (Extensions). Machine translation with a Seq2Seq model: the encoder reads the source sequence 'A B C D' and the decoder emits 'X Y Z' between <start> and <end> tokens. Should 'ABCD' be translated to 'XYZ'? GANs can also tackle this supervised learning setting.
• 69. GAN in Supervised Learning (Extensions). Machine translation with GANs: the discriminator is asked whether A and B have the same meaning. When training with real sentence pairs (A in English, B in Korean) it should say 'yes'; when training with fake pairs (A in English, B a generated Korean sentence) it should say 'no'. The generator should generate a B that has the same meaning as A to deceive the discriminator. Lijun Wu et al., Adversarial Neural Machine Translation, 2017.
• 72. Theory in GAN (GANs). The objective function of GANs is $\min_G \max_D V(D,G) = \mathbb{E}_{x\sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z\sim p_z(z)}[\log(1-D(G(z)))]$, where $p_{data}$ is the real data distribution over high-dimensional vectors (e.g. 64x64 images) and $p_z$ is a Gaussian distribution over low-dimensional vectors (e.g. 100 dimensions).
• 73-77. Theory in GAN (GANs). Optimal discriminator: fix $G$ so that $V$ becomes a function of $D$ alone, and look for $D^*(x) = \arg\max_D V(D)$. Sampling $x$ from the model distribution $p_g$ (over the high-dimensional image space) instead of sampling $z$ from $p_z$, the second term becomes $\mathbb{E}_{x\sim p_g(x)}[\log(1-D(x))]$. Using the definition of expectation, $\mathbb{E}_{x\sim p(x)}[f(x)] = \int_x p(x)\,f(x)\,dx$, and the basic additivity of the integral, $V(D) = \int_x \big( p_{data}(x)\log D(x) + p_g(x)\log(1-D(x)) \big)\,dx$. Now we need to find the $D(x)$ that maximizes the function inside the integral.
• 78-82. Theoretical Results (Theory in GAN). The function inside the integral is $p_{data}(x)\log D(x) + p_g(x)\log(1-D(x))$. Substitute $a = p_{data}(x)$, $b = p_g(x)$, $y = D(x)$, giving $a\log y + b\log(1-y)$ (note that $D(x)$ cannot affect $p_{data}(x)$ or $p_g(x)$). Differentiating with respect to $y$ using $\frac{d}{dx}\log f(x) = \frac{f'(x)}{f(x)}$ gives $\frac{a}{y} - \frac{b}{1-y} = \frac{a-(a+b)y}{y(1-y)}$. Setting this to 0 yields the local extreme $y = \frac{a}{a+b}$, which is the maximum; a local maximum is the global maximum when there are no other local extremes. Substituting back, $D^*(x) = \frac{p_{data}(x)}{p_{data}(x)+p_g(x)}$.
• 83-88. Theoretical Results (Theory in GAN). Objective function of G: G should minimize $C(G) = \max_D V(D,G) = \mathbb{E}_{x\sim p_{data}}[\log D^*(x)] + \mathbb{E}_{x\sim p_g}[\log(1-D^*(x))]$. Plugging in the optimal discriminator and writing the expectations as integrals, $C(G) = \int_x p_{data}(x)\log\frac{p_{data}(x)}{p_{data}(x)+p_g(x)}\,dx + \int_x p_g(x)\log\frac{p_g(x)}{p_{data}(x)+p_g(x)}\,dx$. Adding and subtracting $\log 4$ gives $C(G) = -\log 4 + \int_x p_{data}(x)\log\frac{2\,p_{data}(x)}{p_{data}(x)+p_g(x)}\,dx + \int_x p_g(x)\log\frac{2\,p_g(x)}{p_{data}(x)+p_g(x)}\,dx$. How?
• 89-92. Theoretical Results (Theory in GAN). Because a probability density integrates to 1, $\int_x p_{data}(x)\,dx = 1$, we can use the trick $\log 2 = \int_x p_{data}(x)\,\log 2\,dx$, so $\log 2 + \int_x p_{data}(x)\log\frac{p_{data}(x)}{p_{data}(x)+p_g(x)}\,dx = \int_x p_{data}(x)\log\frac{2\,p_{data}(x)}{p_{data}(x)+p_g(x)}\,dx$, and similarly for $p_g$; the added $\log 4 = 2\log 2$ is absorbed into the two integrals in exactly this way.
• 93-97. Theoretical Results (Theory in GAN). By the definition of the KL divergence, $KL(P\|Q) = \int_x P(x)\log\frac{P(x)}{Q(x)}\,dx$, each integral is a KL divergence to the mixture $\frac{p_{data}+p_g}{2}$, e.g. $\int_x p_{data}(x)\log\frac{p_{data}(x)}{(p_{data}(x)+p_g(x))/2}\,dx = KL\big(p_{data}\,\|\,\frac{p_{data}+p_g}{2}\big)$. Therefore $C(G) = -\log 4 + KL\big(p_{data}\,\|\,\frac{p_{data}+p_g}{2}\big) + KL\big(p_g\,\|\,\frac{p_{data}+p_g}{2}\big) = -\log 4 + 2\cdot JSD(p_{data}\,\|\,p_g)$.
• 98-99. Theory in GAN (GANs). With the optimal discriminator, $\min_G \max_D V(D,G) = \min_G V(D^*,G)$ and $V(D^*,G) = -\log 4 + 2\cdot JSD(p_{data}\,\|\,p_g)$, which G should minimize. Therefore optimizing $V(D,G)$ is the same as minimizing $JSD(p_{data}\,\|\,p_g)$.
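One step the slides leave implicit, stated here for completeness (it is Theorem 1 of the original GAN paper): since $JSD(p_{data}\,\|\,p_g) \ge 0$ with equality if and only if $p_g = p_{data}$, the global minimum of $C(G)$ is $C(G^*) = -\log 4$, attained exactly when the generator's distribution matches the real data distribution.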
  • 99. 99 Theory in GAN GANs = 𝐸 𝑥~𝑝 𝑑𝑎𝑡𝑎 log 𝐷∗ 𝑥 + 𝐸 𝑥~𝑝 𝑔 log(1 − 𝐷∗ (𝑥)) = න 𝑝 𝑑𝑎𝑡𝑎 𝑥 log 𝑝 𝑑𝑎𝑡𝑎(𝑥) 𝑝 𝑑𝑎𝑡𝑎(𝑥) + 𝑝 𝑔(𝑥) 𝑑𝑥 + න 𝑝 𝑔 𝑥 log 𝑝 𝑔(𝑥) 𝑝 𝑑𝑎𝑡𝑎(𝑥) + 𝑝 𝑔(𝑥) 𝑑𝑥 = −𝑙𝑜𝑔4 + 𝑙𝑜𝑔4 + න 𝑝 𝑑𝑎𝑡𝑎 𝑥 log 𝑝 𝑑𝑎𝑡𝑎(𝑥) 𝑝 𝑑𝑎𝑡𝑎(𝑥) + 𝑝 𝑔(𝑥) 𝑑𝑥 + න 𝑝 𝑔 𝑥 log 𝑝 𝑔(𝑥) 𝑝 𝑑𝑎𝑡𝑎(𝑥) + 𝑝 𝑔(𝑥) 𝑑𝑥 = −𝑙𝑜𝑔4 + න 𝑝 𝑑𝑎𝑡𝑎 𝑥 log 2 ∙ 𝑝 𝑑𝑎𝑡𝑎(𝑥) 𝑝 𝑑𝑎𝑡𝑎(𝑥) + 𝑝 𝑔(𝑥) 𝑑𝑥 + න 𝑝 𝑔 𝑥 log 2 ∙ 𝑝 𝑔(𝑥) 𝑝 𝑑𝑎𝑡𝑎(𝑥) + 𝑝 𝑔(𝑥) 𝑑𝑥 = −𝑙𝑜𝑔4 + 𝐾𝐿(𝑝 𝑑𝑎𝑡𝑎|| 𝑝 𝑑𝑎𝑡𝑎 + 𝑝 𝑔 2 ) + 𝐾𝐿(𝑝 𝑔|| 𝑝 𝑑𝑎𝑡𝑎 + 𝑝 𝑔 2 ) = −𝑙𝑜𝑔4 + 2 ∙ 𝐽𝑆𝐷(𝑝 𝑑𝑎𝑡𝑎||𝑝 𝑔) 𝑥 𝑥 𝑥 𝑥 𝑥𝑥 𝑚𝑖𝑛 𝑚𝑎𝑥 𝑉 𝐷, 𝐺 = 𝐺 𝐷 𝑚𝑖𝑛 𝑉 𝐷∗, 𝐺 𝐺 𝑉 𝐷∗, 𝐺 𝐺 𝑠ℎ𝑜𝑢𝑙𝑑 𝑚𝑖𝑛𝑖𝑚𝑖𝑧𝑒 𝐺 𝑠ℎ𝑜𝑢𝑙𝑑 𝑚𝑖𝑛𝑖𝑚𝑖𝑧𝑒 𝑂𝑝𝑡𝑖𝑚𝑖𝑧𝑖𝑛𝑔 𝑉 𝐷, 𝐺 𝑖𝑠 𝑠𝑎𝑚𝑒 𝑎𝑠 𝑚𝑖𝑛𝑖𝑚𝑖𝑧𝑖𝑛𝑔 𝐽𝑆𝐷(𝑝 𝑑𝑎𝑡𝑎||𝑝 𝑔) 𝑂𝑝𝑡𝑖𝑚𝑎𝑙 𝐷