Deep Convolutional GANs: Meaning of Latent Space
ISL Lab Seminar, Hansol Kang, 2018-10-05

Contents
• Review of GAN
• DCGAN
• Experiment
• Summary
Review of GAN
• Adversarial nets ("Generative Adversarial Networks")
  1) Global optimality: $p_g = p_{data}$
  2) Convergence of the algorithm
[Figure: goal and method. D learns to tell real samples $x \sim p_{data}(x)$ from generated samples $G(z)$, while G is trained to fool D via the minimax value function $V(G, D)$.]
DCGAN

• Introduction
In the vanilla GAN, both the generator G and the discriminator D are MLPs.*
D: "I have the strongest MLP army."
G: "So do I."
* Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015).
• Introduction (cont.)
DCGAN replaces the MLPs in both networks with CNNs.
Vanilla GAN. D: "I have the strongest MLP army." G: "So do I."
DCGAN. D and G: "We have a better CNN than an MLP." (The vanilla pair: "What are they doing?")
• Contributions (of deep convolutional GANs)
  - Generating natural images
  - Image classification using D
  - Filter visualization
  - Vector arithmetic properties
[Figure: the latent code z ("I'm very important. Who am I?") feeds a black-box generator whose samples A, B, C are judged against real data by D; this talk focuses on what z means.]
• Approach and Model Architecture
Architecture guidelines for stable deep convolutional GANs:
  - Replace any pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator).
  - Use batchnorm in both the generator and the discriminator.
  - Remove fully connected hidden layers for deeper architectures.
  - Use ReLU activation in the generator for all layers except the output, which uses Tanh.
  - Use LeakyReLU activation in the discriminator for all layers.
These guidelines are sketched in code below.
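Below is a minimal PyTorch sketch following these five guidelines, sized for 32x32 RGB images; the latent size nz = 100, the channel widths, and the 32x32 resolution are illustrative assumptions, not the paper's exact 64x64 architecture.

    import torch
    import torch.nn as nn

    class Generator(nn.Module):
        def __init__(self, nz=100, ngf=64, nc=3):
            super().__init__()
            self.net = nn.Sequential(
                # fractional-strided convolutions instead of pooling or FC layers
                nn.ConvTranspose2d(nz, ngf * 4, 4, 1, 0, bias=False),       # 1x1 -> 4x4
                nn.BatchNorm2d(ngf * 4),
                nn.ReLU(True),
                nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),  # 4x4 -> 8x8
                nn.BatchNorm2d(ngf * 2),
                nn.ReLU(True),
                nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),      # 8x8 -> 16x16
                nn.BatchNorm2d(ngf),
                nn.ReLU(True),
                nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),           # 16x16 -> 32x32
                nn.Tanh(),  # output layer: Tanh, no batchnorm
            )

        def forward(self, z):
            return self.net(z)

    class Discriminator(nn.Module):
        def __init__(self, ndf=64, nc=3):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),          # input layer: no batchnorm
                nn.LeakyReLU(0.2, inplace=True),
                nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),     # strided convs, no pooling
                nn.BatchNorm2d(ndf * 2),
                nn.LeakyReLU(0.2, inplace=True),
                nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
                nn.BatchNorm2d(ndf * 4),
                nn.LeakyReLU(0.2, inplace=True),
                nn.Conv2d(ndf * 4, 1, 4, 1, 0, bias=False),       # 4x4 -> 1x1 score
                nn.Sigmoid(),
            )

        def forward(self, x):
            return self.net(x).view(-1)

    G, D = Generator(), Discriminator()
    print(G(torch.randn(2, 100, 1, 1)).shape)  # torch.Size([2, 3, 32, 32])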
Strided convolution (downsampling in D) vs. fractional-strided, i.e. transposed, convolution (upsampling in G).
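A quick shape check of the two operations; kernel size 4, stride 2, padding 1 are the usual DCGAN choices, and the channel counts here are arbitrary:

    import torch
    import torch.nn as nn

    x = torch.randn(1, 3, 32, 32)
    down = nn.Conv2d(3, 8, kernel_size=4, stride=2, padding=1)         # strided conv: halves H, W
    up = nn.ConvTranspose2d(8, 3, kernel_size=4, stride=2, padding=1)  # transposed conv: doubles H, W

    print(down(x).shape)       # torch.Size([1, 8, 16, 16])
    print(up(down(x)).shape)   # torch.Size([1, 3, 32, 32])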
Batch normalization in both networks, except for two layers:
  - the output layer of the generator
  - the input layer of the discriminator
No fully connected layers: the classical CNN head (flatten + FC) gives way to global average pooling (GAP).
http://nmhkahn.github.io/Casestudy-CNN
The DCGAN generator likewise has no fully connected hidden layers:
https://raw.githubusercontent.com/znxlwm/pytorch-MNIST-CelebA-GAN-DCGAN/master/pytorch_DCGAN.png
Activations:
  - Generator: ReLU in all hidden layers, Tanh at the output
  - Discriminator: LeakyReLU in all hidden layers, Sigmoid at the output
http://gmelli.org/RKB/Rectified_Linear_Unit_(ReLU)_Activation_Function
• Details of Adversarial Training
• Mini-batch stochastic gradient descent (SGD) with a mini-batch size of 128
• All weights initialized from a zero-centered Normal distribution with standard deviation 0.02
• LeakyReLU slope 0.2
• Adam optimizer with lr = 0.0002 and beta1 = 0.5 (the paper lowers beta1 from the default 0.9 to stabilize training; beta2 keeps its default); a matching setup is sketched below
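A sketch of that setup, reusing the hypothetical G and D modules from the architecture sketch above; initializing the batchnorm gains around 1 follows common DCGAN implementations:

    import torch.nn as nn
    import torch.optim as optim

    def weights_init(m):
        # zero-centered Normal initialization with std 0.02
        if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
            nn.init.normal_(m.weight, 0.0, 0.02)
        elif isinstance(m, nn.BatchNorm2d):
            nn.init.normal_(m.weight, 1.0, 0.02)
            nn.init.zeros_(m.bias)

    G.apply(weights_init)
    D.apply(weights_init)

    # Adam with lr = 0.0002; beta1 lowered to 0.5, beta2 left at 0.999
    opt_g = optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
    opt_d = optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))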
[Figure: LSUN bedroom samples after 1 training epoch.]
[Figure: LSUN bedroom samples after 5 training epochs.]
• Empirical Validation of DCGAN's Capabilities
  - CIFAR-10 classification using the discriminator's convolutional features
  - Domain robustness: the features transfer even though the GAN was not trained on CIFAR-10
The same validation on the SVHN (Street View House Numbers) dataset.
• Investigating and Visualizing The Internals of The Networks
Walking in the latent space: interpolating between random z points produces smooth semantic transitions (e.g., a window gradually appearing in a wall), which suggests the model has not merely memorized the training set.
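A sketch of such a walk, assuming the G and nz = 100 from the earlier architecture sketch; linear interpolation is used for simplicity:

    import torch

    z0 = torch.randn(1, 100, 1, 1)
    z1 = torch.randn(1, 100, 1, 1)
    alphas = torch.linspace(0, 1, steps=9).view(-1, 1, 1, 1)
    z_path = (1 - alphas) * z0 + alphas * z1    # nine codes on the segment z0 -> z1
    with torch.no_grad():
        frames = G(z_path)                      # adjacent frames should change smoothly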
Visualizing the discriminator features: the trained filters activate on typical bedroom structures such as beds and windows, unlike random filters.
Forgetting to draw certain objects: individual conv filters in the generator are each "in charge of" an object class (windows, beds, lamps, doors, ...).
[Diagram: the latent code z passes through the conv filters to the generation; masking selected filters (keeping some at 1, zeroing the rest) removes the corresponding object.]
[Figure: with the filters in charge of windows dropped, the generator mostly omits windows and draws other objects in their place.]
Vector arithmetic on face samples: averaging the z vectors of three exemplars per visual concept and doing arithmetic on the averages transfers semantics, e.g. vector("smiling woman") - vector("neutral woman") + vector("neutral man") decodes to images of a smiling man; a similar "turn" vector changes face pose.
[Figures: face-sample vector-arithmetic results, shown across four slides.]
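A sketch of the recipe with stand-in latent batches; in practice the three z's per concept are chosen by visually inspecting their samples, and G with latent size 100 comes from the earlier sketch:

    import torch

    # stand-ins for z's whose generated faces matched each concept
    z_smiling_woman = torch.randn(3, 100, 1, 1).mean(0, keepdim=True)
    z_neutral_woman = torch.randn(3, 100, 1, 1).mean(0, keepdim=True)
    z_neutral_man   = torch.randn(3, 100, 1, 1).mean(0, keepdim=True)

    z = z_smiling_woman - z_neutral_woman + z_neutral_man
    with torch.no_grad():
        img = G(z)   # with real concept vectors, this tends to render a smiling man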
Experiment
• Code: https://github.com/messy-snail/GAN_PyTorch
[Code walkthrough across two slides.]
• Results #1: CelebA
[Figure: ground truth vs. generations. Vanilla GAN at epochs 1, 5, 100, where one sample still keeps recurring; DCGAN at epochs 1, 5, 30. Results are cherry-picked.]
• Results #2: LSUN
[Figure: ground truth vs. generations. Vanilla GAN at epochs 1, 5, 12; DCGAN at epochs 1, 2, 5. Results are cherry-picked.]
• Results #3: Korean idol faces, transfer attempt
[Figure: ground truth and generations for epochs 1 through 6.]
• The network was initialized with the weights and biases learned on CelebA.
• The hoped-for transfer-learning effect did not materialize, possibly because of the domain shift (Asian faces) and different image cropping.
• Results #4: Korean idol faces
[Figure: ground truth and generations at epochs 1, 5, 30, 50, 100, 150.]
• Only 10,000 images: the dataset is insufficient.
Summary
• A stable set of architectures for training generative adversarial networks
• Good representations of images for supervised learning and generative modeling
• Instability of GANs remains: training sometimes collapses a subset of filters to a single oscillating mode
• The latent code carries meaning; it is not a simple noise component.
Future work
• Paper review: Vanilla GAN, DCGAN, InfoGAN, Unrolled GAN, Wasserstein GAN, LSGAN, BEGAN, Pix2Pix, CycleGAN
• Proposed model: SpyGAN (working title)
• Tips: documentation & programming
• Mathematical study: information theory
Appendix
• Issues at the VAE Seminar (18.07.23)
  - Issue #1: Performance of VAE and GAN
  - Issue #2: Log likelihood
  - Issue #3: Dimension of the latent code
  - Issue #4: Why manifold?
Durk Kingma (machine learning researcher at OpenAI) wrote "Adam: A Method for Stochastic Optimization" and "Auto-Encoding Variational Bayes". The papers are mathematically demanding, so for an intuitive explanation I refer to this video: https://www.youtube.com/watch?v=o_peo6U7IRM ("오토인코더의 모든 것", "All About Autoencoders").
Appendix
• Issue #1: Performance of VAE and GAN
"Compared to GAN, VAE is relatively blurred and I do not know why." Look at the cost functions.

VAE: $\min_{\theta,\phi} L(x, \theta, \phi)$ with
$L(x, \theta, \phi) = -\mathbb{E}_{q_\phi(z|x)}[\log p(x \mid g_\theta(z))] + KL(q_\phi(z|x) \,\|\, p(z))$
(reconstruction error + regularization).

GAN: $\min_G \max_D V(G, D)$ with
$V(G, D) = \mathbb{E}_{x \sim p_{data}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]$.

Intuition: both objectives split into two terms; loosely, the reconstruction error plays the role of the D loss and the regularization the role of the G loss.
Appendix
• Issue #1: Performance of VAE and GAN (cont.)
VAE loss = reconstruction error + regularization (encoder E and decoder D reconstruct the input). GAN loss = G loss + D loss (D separates real from fake).
VAE vs. GAN, compared on: 1. optimization, 2. image quality, 3. generalization.
Appendix
• Issue #2: Log likelihood
Question about the log likelihood: why a summation, and why may we take the log? Because the log turns the product into a sum and is monotonically increasing, so the argmax is unchanged.
MLE (Maximum Likelihood Estimation) estimates an unknown parameter from observations, e.g. the mean and standard deviation of a Gaussian distribution:
$\hat{\theta} = \arg\max_{\theta} p(y \mid \theta)$
For i.i.d. observations the likelihood factorizes, and since $\log(x)$ is a monotonically increasing function:
$\arg\max_{\theta} \prod_i p(y_i \mid \theta) = \arg\max_{\theta} \log \prod_i p(y_i \mid \theta) = \arg\max_{\theta} \sum_i \log p(y_i \mid \theta)$
cf. for a generative model: $\arg\max_{\theta} \sum_i \log p(x_i \mid \theta)$.
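A quick numeric illustration of why the sum of logs is used in practice (the values are arbitrary):

    import numpy as np

    p = np.full(1000, 0.1)    # 1000 i.i.d. likelihood terms of 0.1
    print(np.prod(p))         # 0.0: the raw product underflows float64
    print(np.log(p).sum())    # about -2302.6: the log-likelihood stays finite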
Appendix
• Issue #3: Dimension of latent code
"Is the latent code dimension always small?" "Yes."
An autoencoder is dimensionality reduction: the encoder E compresses the high-dimensional input to the low-dimensional code we are interested in, and the decoder D reconstructs the input. (The sparse AE, whose code is higher-dimensional but mostly inactive, is the exception.)
Appendix
• Issue #4: Why manifold?
"What is a manifold, and why explain it?" "To clarify the concept and the difference between AE and VAE."
Concept of manifold: high-dimensional data concentrate near a low-dimensional subspace (= manifold); this is the manifold hypothesis, and uniform sampling of the high-dimensional space almost never lands on it.
Purpose of AE: manifold learning. The encoder E compresses onto the manifold, the decoder D reconstructs; learning is unsupervised.
Purpose of VAE: a generative model, also unsupervised with encoder E and decoder D: the correlation between generation and the manifold...
Appendix
• PyTorch (variable-size inputs)
In an Input > Conv > Pool > FC pipeline built from Conv2d(in_ch, out_ch, k_size, s, p) and reshape(bat_sz, -1), the flattened feature size depends on the input resolution, so the FC layer's shape is not fixed:
CIFAR-10 (3x32x32): torch.Size([128, 3, 32, 32]) -> torch.Size([128, 64, 16, 16]) -> torch.Size([128, 16384])
CelebA (3x178x218): torch.Size([128, 3, 218, 178]) -> torch.Size([128, 64, 109, 89]) -> torch.Size([128, 620864])
A size-agnostic head is sketched below.
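One common fix is to end the convolutional part with global average pooling, which makes the flattened size independent of the input resolution; a minimal sketch with arbitrary layer sizes:

    import torch
    import torch.nn as nn

    net = nn.Sequential(
        nn.Conv2d(3, 64, 4, 2, 1),
        nn.LeakyReLU(0.2),
        nn.AdaptiveAvgPool2d(1),   # GAP: any HxW -> 1x1
        nn.Flatten(),              # always 64 features, whatever the input size
        nn.Linear(64, 1),
    )

    print(net(torch.randn(2, 3, 32, 32)).shape)    # torch.Size([2, 1])
    print(net(torch.randn(2, 3, 218, 178)).shape)  # torch.Size([2, 1])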
