RSGAN:
Regularization-on-Sigma GAN
Chung-Il Kim*, Seungwon Jung and Eenjun Hwang
School of Electrical Engineering, Korea University
Contents
▪ Introduction
▪ RSGAN
▪ Experiments
▪ Conclusion
Generative Adversarial Nets
▪ A generator (G) produces synthetic data from random variables.
▪ A discriminator (D) receives two inputs: a real one and a fake one.
▪ D determines whether the input is authentic.
▪ Given the resulting losses, D and G are each trained by the optimizer.
[1] https://medium.com/coinmonks/celebrity-face-generation-using-gans-tensorflow-implementation-eaa2001eef86
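To make the training flow concrete, here is a minimal GAN training loop sketched in PyTorch; the toy data distribution, network sizes, and hyperparameters are illustrative assumptions, not details from the slides.

    # Minimal GAN training loop (illustrative sketch on toy 2-D data).
    import torch
    import torch.nn as nn

    z_dim, data_dim, batch = 8, 2, 64
    G = nn.Sequential(nn.Linear(z_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
    D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
    opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
    bce = nn.BCELoss()

    for step in range(1000):
        x = torch.randn(batch, data_dim) * 0.5 + 2.0  # "real" samples (toy Gaussian)
        z = torch.randn(batch, z_dim)                 # random variables fed to G
        fake = G(z)                                   # synthetic data

        # D is trained to label real data 1 and synthetic data 0
        d_loss = bce(D(x), torch.ones(batch, 1)) + bce(D(fake.detach()), torch.zeros(batch, 1))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()

        # G is trained to make D label its synthetic data as real
        g_loss = bce(D(fake), torch.ones(batch, 1))
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()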
Examples of Data Generated by Recent GANs
BEGAN (Boundary Equilibrium GAN, 2017, Google)
Training data: CelebA
Humans can hardly tell the real data from the synthetic data.
Fig 2. A total of 15 samples generated by BEGAN [2].
[2] D. Berthelot, T. Schumm, and L. Metz, "BEGAN: Boundary equilibrium generative adversarial networks," arXiv preprint arXiv:1703.10717, 2017.
Failed Examples Generated by GANs
At 737k steps, BEGAN keeps generating the same data. At 2,000k steps, it fails to learn the data distribution. This phenomenon is called 'mode collapse'.
Several different input z vectors produce the same output (possibly due to low model capacity or inadequate optimization).
Detecting mode collapse is very challenging.
Fig 3. A total of 16 samples generated by BEGAN at several training steps.
RSGAN
More stable than BEGAN in long-term learning.
Each frame = 10k steps.
Mode collapse was not observed over the full 2,400k steps.
Image Stability over Sequential Steps
32×32 images were generated.
Sets of 16 samples were monitored every 1,000 steps up to 2,400k.
At 737k steps, BEGAN starts to mode-collapse and produces distorted images.
RSGAN steadily generated diverse, human-like faces until 2,400k steps.
BEGAN Training Procedure
① Noise z is fed to the generator G, which produces a synthetic sample G(z).
② A real sample x is drawn from the real data.
③ Both x and G(z) are passed through the discriminator D (an auto-encoder), yielding outputs D(x) and D(G(z)).
④ Two reconstruction errors are computed:
'Real data error': L(x) = |D(x) − x|
'Synthetic data error': L(G(z)) = |D(G(z)) − G(z)|
BEGAN Objectives
▪ Discriminator objective
▪ Minimize the 'real data error' and maximize the 'synthetic data error':
ℒ_Discriminator = argmin[ L(x) − k_t · L(G(z)) ]
▪ Generator objective
▪ Minimize the 'synthetic data error':
ℒ_Generator = argmin[ L(G(z)) ]
▪ k_t: a proportional control variable for stable learning
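A minimal sketch of these objectives, assuming an auto-encoder discriminator D and PyTorch tensors. The closed-loop update for k_t follows the rule reported in the BEGAN paper [2]; γ and λ_k are the values reported there, and everything else is a placeholder rather than the authors' code.

    # Sketch of the BEGAN objectives with the proportional control variable k_t.
    import torch

    def ae_error(D, v):
        # L(v) = |D(v) - v|: per-batch L1 reconstruction error of the auto-encoder
        return (D(v) - v).abs().mean()

    k_t, gamma, lambda_k = 0.0, 0.5, 0.001   # gamma = equilibrium, lambda_k from [2]

    def began_losses(D, G, x, z):
        global k_t
        L_real = ae_error(D, x)              # 'real data error'      L(x)
        L_fake = ae_error(D, G(z))           # 'synthetic data error' L(G(z))
        d_loss = L_real - k_t * L_fake       # discriminator: min L(x) - k_t * L(G(z))
        g_loss = L_fake                      # generator: min L(G(z))
        # proportional control keeps D and G in equilibrium, clipped to [0, 1]
        k_t = min(max(k_t + lambda_k * (gamma * L_real.item() - L_fake.item()), 0.0), 1.0)
        return d_loss, g_loss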
RSGAN Training Procedure
① Noise z is fed to the generator G, which produces a synthetic sample G(z).
② A real sample x is drawn from the real data.
③ Both x and G(z) are passed through the discriminator D (an auto-encoder), yielding outputs D(x) and D(G(z)).
④ Two distances are computed:
'A metric on real data': m(x, D(x))
'A metric on synthetic data': m(G(z), D(G(z)))
RSGAN Objectives
▪ Discriminator objective
▪ Minimize 'a metric on real data' and maximize 'a metric on synthetic data':
ℒ_Discriminator = argmin[ m(x, D(x)) − k_t · m(G(z), D(G(z))) ]
▪ Generator objective
▪ Minimize 'a metric on synthetic data':
ℒ_Generator = argmin[ m(G(z), D(G(z))) ]
▪ k_t: a proportional control variable for stable learning
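The control structure mirrors the BEGAN objectives; only the reconstruction error L is replaced by a metric m between input and reconstruction. A placeholder sketch with m left as a parameter (one concrete choice of m, the simplified W-2 loss, is defined two slides below):

    # RSGAN objectives: same shape as BEGAN, but with a metric m(input, output).
    def rsgan_losses(D, G, x, z, m, k_t, gamma=0.5, lambda_k=0.001):
        m_real = m(x, D(x))                  # 'a metric on real data'
        m_fake = m(G(z), D(G(z)))            # 'a metric on synthetic data'
        d_loss = m_real - k_t * m_fake       # discriminator objective
        g_loss = m_fake                      # generator objective
        k_t = min(max(k_t + lambda_k * (gamma * float(m_real) - float(m_fake)), 0.0), 1.0)
        return d_loss, g_loss, k_t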
Metric
▪ A function that defines a distance between each pair of elements of a set
▪ Examples: Euclidean distance, Manhattan distance, Jensen-Shannon divergence, Wasserstein-1 distance, Wasserstein-2 distance
▪ RSGAN adopts the Wasserstein-2 distance
▪ Given data distributions P and Q, the Wasserstein-2 distance (in closed form for Gaussians) is:
W2(P, Q)² = ‖m_P − m_Q‖₂² + trace(C_P + C_Q − 2(C_Q^(1/2) C_P C_Q^(1/2))^(1/2))
m_P, m_Q: means; C_P, C_Q: covariance matrices
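For reference, this closed-form expression can be evaluated from sample statistics. The sketch below is a generic NumPy/SciPy implementation under the Gaussian assumption, not the authors' code:

    # Squared Wasserstein-2 (Frechet) distance between two Gaussians,
    # estimated from sample means and covariances.
    import numpy as np
    from scipy.linalg import sqrtm

    def w2_squared(P, Q):
        """P, Q: sample arrays of shape (num_samples, dim)."""
        m_p, m_q = P.mean(axis=0), Q.mean(axis=0)
        C_p, C_q = np.cov(P, rowvar=False), np.cov(Q, rowvar=False)
        C_q_half = sqrtm(C_q)
        covmean = sqrtm(C_q_half @ C_p @ C_q_half)  # (C_Q^(1/2) C_P C_Q^(1/2))^(1/2)
        return float(((m_p - m_q) ** 2).sum()
                     + np.trace(C_p + C_q - 2 * covmean.real))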
Wasserstein-2 Distance
▪ Comparison of BEGAN with RSGAN
▪ RSGAN considers not only means but also variances

BEGAN: L(x) = |D(x) − x|
RSGAN: m(x, D(x)) = ‖m_x − m_D(x)‖₂² + (1/n)‖σ_x − σ_D(x)‖₂²

▪ RSGAN uses a simplified W-2 distance, denoted:
RSL(P, Q) = ‖m_P − m_Q‖₂² + (1/n)‖σ_P − σ_Q‖₂²
σ_P and σ_Q: standard deviations
▪ The derivation is given in the paper.
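A minimal sketch of the simplified metric, assuming P and Q are batches of flattened samples and that means and standard deviations are taken per feature dimension (the exact axes are an assumption, since the slides do not specify them):

    # Simplified W-2 loss: RSL(P, Q) = ||m_P - m_Q||^2 + (1/n) ||sigma_P - sigma_Q||^2
    import torch

    def rsl(P, Q):
        n = P.shape[1]                           # number of feature dimensions
        m_p, m_q = P.mean(dim=0), Q.mean(dim=0)  # per-dimension means
        s_p, s_q = P.std(dim=0), Q.std(dim=0)    # per-dimension standard deviations
        return ((m_p - m_q) ** 2).sum() + ((s_p - s_q) ** 2).sum() / n

This rsl function can be passed directly as the metric m in the rsgan_losses sketch above.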
Loss Graph
▪ Dataset: CelebA at 32×32 resolution
▪ Filter sizes: 128 for D, 64 for G
▪ Learning rate: 0.00008 for D
▪ γ (equilibrium): 0.5
▪ No extra techniques used: batch normalization, dropout, transposed convolutions, skip connections, or the refinement used in BEGAN
▪ Both BEGAN and RSGAN decrease their losses and converge to some value
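Collected as a hypothetical configuration for reproducibility (the key names are invented; only the values come from this slide):

    # Hypothetical config mirroring the hyperparameters stated on this slide.
    config = {
        "dataset": "CelebA",
        "resolution": 32,      # 32x32 images
        "filters_D": 128,      # discriminator filter size
        "filters_G": 64,       # generator filter size
        "lr_D": 8e-5,          # discriminator learning rate
        "gamma": 0.5,          # equilibrium hyperparameter
        # no batch norm, dropout, transposed convolutions, skip connections,
        # or BEGAN-style refinement
    }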
Conclusion
▪ We proposed a new GAN model called RSGAN
▪ RSGAN uses the Wasserstein-2 distance as its loss metric
▪ RSGAN trains stably for up to 2,400k steps on the CelebA dataset
▪ We plan to make RSGAN more stable using depthwise separable convolutions [3]
[3] F. Chollet, "Xception: Deep learning with depthwise separable convolutions," arXiv preprint arXiv:1610.02357, 2017.
Editor's Notes
  1. Good morning. I'm Chung-il Kim, and in this presentation I will introduce my paper RSGAN, Regularization-on-Sigma GAN.
  2. The contents are as follows: an introduction to the field, RSGAN, the experiments, and the conclusion.
  3. Generative Adversarial Nets are a generative model consisting of two networks, a generator and a discriminator. The generator takes random variables as input and produces synthetic samples. The discriminator receives both real and synthetic data, determines which is real and which is fake, and derives a loss for each data group. Given these losses, the optimizer trains D and G: D trains to discriminate the two groups (minimize the real loss, maximize the fake loss), while G trains to deceive D (minimize the fake loss). Through this, D ultimately learns to distinguish real from synthetic data better, and G learns to produce synthetic data very similar to the real data.
  4. Look at the left figure: a total of 15 samples generated by BEGAN (Boundary Equilibrium Generative Adversarial Nets), proposed by Google in 2017. Humans can hardly tell the real data from the synthetic data; GAN techniques have advanced to the point of generating very refined data.
  5. However, BEGAN has one problem when trained long-term. Fig 3 shows 16 samples generated at several training steps. At 20k, 100k, and 400k steps it generates very refined images, but from 737k steps onward it keeps generating the same degenerate data, and at 2,000k steps it fails to learn the data distribution. Moreover, even though the images come from different random z vectors, they are all identical. This phenomenon is called 'mode collapse'; its cause has not yet been clarified, and detecting mode collapse is a very challenging, active research topic.
  6. RSGAN is a model that mitigates this mode collapse. I will show a video of RSGAN run under the same conditions and architecture as BEGAN: the upper part is BEGAN and the lower part is RSGAN, both trained with the same discriminator learning rate. Each frame is 10k steps, over 2,400k total training steps. In this experiment, mode collapse was not observed over the full 2,400k steps.
  7. These are captures from the video. In each row, the left image was generated by BEGAN at that step and the right image by RSGAN at the same step. Both models ran at 32×32 resolution with the same architecture and the same discriminator learning rate. BEGAN starts to mode-collapse at 737k steps and generates strange images, while RSGAN consistently generates human-like faces for the full 2,400k steps.
  8. First, I will explain the BEGAN training procedure; the RSGAN procedure follows. The BEGAN training structure is basically the same as GANs. Noise is passed through the generator, which produces synthetic data from it; real data are sampled randomly from the dataset. Both real and fake data are passed through the auto-encoder-style discriminator, which produces outputs of the same dimensionality as its inputs.
  9. The goals of D and G are as follows. D tries to reduce the input-output error for real data and, conversely, to increase that error for synthetic data. G tries to reduce the input-output error for synthetic data.
  10. Now the RSGAN procedure. The basic framework is very similar to BEGAN, but where BEGAN compares errors, RSGAN approaches the problem with the notion of a metric.
  11. The goals of D and G differ slightly, however. D tries to reduce the 'distance between data' for real inputs and outputs and to increase it for synthetic data; G tries to reduce this distance for synthetic data. In other words, with the same architecture, BEGAN reduces an error while RSGAN reduces a distance.
  12. The distance between two data distributions can be defined by various metrics, e.g., Euclidean, Manhattan, JSD, and W-1. RSGAN is a model that chooses W-2. The Wasserstein-2 distance is defined as shown, with the means and covariances of the two distributions entering as variables; the closer their means and covariances, the smaller the distance.
  13. The W-2 metric is defined as shown. It defines the distance by considering not only the means of the two distributions P and Q but also their covariances. By accounting for the degree of dispersion, RSGAN also captures the diversity of the data.
  14. This slide shows how the losses change during the experiment; lower loss indicates better learning. The hyperparameters used are as listed. In the graph on the right, both BEGAN and RSGAN are trained to decrease the losses of D and G, and both eventually converge to some value. Comparing with the generated data, however, mode collapse occurred regardless of loss convergence.
  15. In this paper we proposed a new GAN model, RSGAN. RSGAN defines its loss with the W-2 metric and showed stable learning on the CelebA dataset. We plan to make RSGAN even more stable using depthwise separable convolutions [3].