Wasserstein GAN
2017-07-12 NN #3 @ TFUG
Yuta Kashino
BakFoo, Inc. CEO
Astrophysics / Observational Cosmology
Zope / Python
Realtime Data Platform for Enterprise / Prototyping
Yuta Kashino
arXiv
stat.ML, stat.TH, cs.CV, cs.CL, cs.LG
math-ph, astro-ph
- PyCon2016
- PyCon2017 Edward
- 2017 8 TFUG
@yutakashino
https://www.slideshare.net/yutakashino/pyconjp2016
Wasserstein GAN
…
- WGAN: fixing problems of the original GAN
-
- DCGAN
-
-
Generative Adversarial Networks
GAN 1
- Generative Adversarial Networks
- Ian Goodfellow
- Studied under Bengio; worked on Theano/Pylearn2
- Google Brain
- 2016 NIPS Tutorial
- Many variants: The GAN Zoo
https://goo.gl/uC8xn2
https://github.com/hindupuravinash/the-gan-zoo
GAN 2
- GAN …
- Meow Generator
- HDCGAN, WGAN, LSGAN…
https://ajolicoeur.wordpress.com/cats/
https://github.com/hindupuravinash/the-gan-zoo
Vanilla GAN
- The Generator and the Discriminator play a min/max game (objective written out below)
- G tries to fool D with generated samples; D tries to tell real data from generated data
- In the original paper, both G and D are MLPs
https://goo.gl/vHUpqG https://goo.gl/7u4zS6
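For reference, the min/max objective from the original GAN paper (not written out on the slide above):

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]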
DCGAN
-
- Uses CNNs for both G and D
- Architecture guidelines for G/D (a minimal sketch follows below)
- No pooling or fully-connected layers; Batch Norm; Leaky ReLU
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
https://arxiv.org/abs/1511.06434
https://goo.gl/8EmZgT
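A minimal PyTorch sketch of the design rules above (strided convolutions instead of pooling/FC layers, Batch Norm, Leaky ReLU in D); the layer sizes and the 32x32 output are illustrative assumptions, not the exact architecture used in the paper's experiments.

# Minimal DCGAN-style generator and discriminator (illustrative sizes).
import torch.nn as nn

G = nn.Sequential(                                   # z (100x1x1) -> 3x32x32 image
    nn.ConvTranspose2d(100, 128, 4, 1, 0), nn.BatchNorm2d(128), nn.ReLU(),
    nn.ConvTranspose2d(128, 64, 4, 2, 1),  nn.BatchNorm2d(64),  nn.ReLU(),
    nn.ConvTranspose2d(64, 32, 4, 2, 1),   nn.BatchNorm2d(32),  nn.ReLU(),
    nn.ConvTranspose2d(32, 3, 4, 2, 1),    nn.Tanh(),
)
D = nn.Sequential(                                   # 3x32x32 image -> scalar score
    nn.Conv2d(3, 32, 4, 2, 1),   nn.LeakyReLU(0.2),
    nn.Conv2d(32, 64, 4, 2, 1),  nn.BatchNorm2d(64),  nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2),
    nn.Conv2d(128, 1, 4, 1, 0),  nn.Sigmoid(),
)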
Problems with GAN
- Training the G/D min/max game with an optimal D amounts to minimizing the JS divergence (see the identity below)
-
-
-
-
-
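For reference, the standard identity behind the JS-divergence point above (from the original GAN paper, not spelled out on the slide): with the discriminator at its optimum, the generator is effectively minimizing the JS divergence, whose gradient vanishes when P_r and P_g have (nearly) disjoint supports.

D^*(x) = \frac{p_r(x)}{p_r(x) + p_g(x)}, \qquad \min_G V(D^*, G) = 2\, JS(P_r, P_g) - 2 \log 2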
Wasserstein GAN
WGAN 1
- / /
- /
-
-
-
-
- (
WGAN 2
-
- Read-through: Wasserstein GAN
- Wasserstein GAN and the Kantorovich-Rubinstein Duality
-
https://goo.gl/7ywVwc
https://goo.gl/40eCbR
WGAN
The GAN Discriminator is replaced by a Critic that estimates the W (EM) distance
1. Why the W distance? (Theorems 1, 2)
2. How to estimate W (Theorem 3)
3. Training with the W estimate (the WGAN algorithm)
4. Experiments
WGAN
1. Why WGAN uses the W (EM) distance: four distances between distributions
- Total Variation (TV)
- Kullback-Leibler (KL) divergence
- Jensen-Shannon (JS) divergence
- Earth Mover (EM) / Wasserstein distance
\delta(P_r, P_g) = \sup_{A} |P_r(A) - P_g(A)|
KL(P_r \| P_g) = \int_x \log\!\left(\frac{P_r(x)}{P_g(x)}\right) P_r(x)\, dx
JS(P_r, P_g) = \tfrac{1}{2} KL(P_r \| P_m) + \tfrac{1}{2} KL(P_g \| P_m), \quad P_m = (P_r + P_g)/2
W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x,y) \sim \gamma}\big[\|x - y\|\big]
Example 1: comparing the four distances (learning parallel lines)
- W(P_0, P_\theta) = |\theta|: continuous in \theta and gives a usable gradient
- JS: constant \log 2 for every \theta \neq 0
- KL: +\infty for every \theta \neq 0
- TV: constant 1 for every \theta \neq 0
With Z \sim U[0, 1], P_0 the distribution of (0, Z) and P_\theta the distribution of (\theta, Z):
KL(P_0 \| P_\theta) = KL(P_\theta \| P_0) = \begin{cases} +\infty & \text{if } \theta \neq 0, \\ 0 & \text{if } \theta = 0, \end{cases}
\delta(P_0, P_\theta) = \begin{cases} 1 & \text{if } \theta \neq 0, \\ 0 & \text{if } \theta = 0, \end{cases}
JS(P_0, P_\theta) = \begin{cases} \log 2 & \text{if } \theta \neq 0, \\ 0 & \text{if } \theta = 0, \end{cases}
W(P_0, P_\theta) = |\theta|
https://goo.gl/40eCbR
Three theorems: 1 and 2
- Theorem 1: under mild assumptions on g_\theta, W(P_r, P_\theta) is continuous in \theta and differentiable almost everywhere
- Theorem 2: convergence in KL implies convergence in JS/TV, which in turn implies convergence in W; W induces the weakest topology of the four (equivalent to weak convergence)
- Theorems 1 and 2 justify using W as the GAN loss
Three theorems: 3 (Kantorovich-Rubinstein)
- Theorem 3: W can be estimated by maximizing \mathbb{E}_{x \sim P_r}[f_w(x)] - \mathbb{E}_{x \sim P_\theta}[f_w(x)] over a parametric family of K-Lipschitz functions f_w (the critic)
- The gradient of W with respect to \theta is then obtained by backpropagating through the critic
\max_{w \in \mathcal{W}} \mathbb{E}_{x \sim P_r}[f_w(x)] - \mathbb{E}_{x \sim P_\theta}[f_w(x)]
  \le \sup_{\|f\|_L \le K} \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{x \sim P_\theta}[f(x)]
  = K \cdot W(P_r, P_\theta)
\nabla_\theta W(P_r, P_\theta) = \nabla_\theta \big( \mathbb{E}_{x \sim P_r}[f_w(x)] - \mathbb{E}_{z \sim Z}[f_w(g_\theta(z))] \big)
  = -\mathbb{E}_{z \sim Z}\big[\nabla_\theta f_w(g_\theta(z))\big]
Computing the W/EM distance (1): the primal form
- The primal form is an optimal-transport problem over couplings \gamma \in \Pi(P_r, P_g)
- For discrete distributions it can be solved exactly as a linear program, e.g. with scipy.optimize.linprog (sketch below)
W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x,y) \sim \gamma}\big[\|x - y\|\big]
https://goo.gl/7ywVwc
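A minimal sketch of the primal LP (an added illustration, not the code behind the link above): the EM distance between two discrete distributions p and q on points xs, solved over the coupling gamma with scipy.optimize.linprog.

# EM/Wasserstein-1 distance between discrete distributions via linear programming.
import numpy as np
from scipy.optimize import linprog

def em_distance(p, q, xs):
    """W(p, q) = min_gamma sum_ij gamma_ij * |x_i - x_j|
    subject to rows of gamma summing to p, columns to q, gamma >= 0."""
    n = len(xs)
    # Cost vector d_ij = |x_i - x_j|, flattened to match gamma's n*n variables.
    d = np.abs(xs[:, None] - xs[None, :]).ravel()
    # Marginal constraints: sum_j gamma_ij = p_i and sum_i gamma_ij = q_j.
    A_eq = np.zeros((2 * n, n * n))
    for i in range(n):
        A_eq[i, i * n:(i + 1) * n] = 1.0   # row-sum constraint for p_i
        A_eq[n + i, i::n] = 1.0            # column-sum constraint for q_i
    b_eq = np.concatenate([p, q])
    res = linprog(d, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun

# Toy example: two distributions on the points 0..3.
xs = np.arange(4, dtype=float)
p = np.array([0.5, 0.5, 0.0, 0.0])
q = np.array([0.0, 0.0, 0.5, 0.5])
print(em_distance(p, q, xs))  # 2.0: all mass moves two positions to the right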
Computing the W/EM distance (2): the dual form (Kantorovich-Rubinstein duality)
- The constrained infimum over couplings \gamma is replaced by a supremum over 1-Lipschitz functions f
Primal: W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x,y) \sim \gamma}\big[\|x - y\|\big]
Dual: W(P_r, P_g) = \sup_{\|f\|_L \le 1} \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{x \sim P_g}[f(x)]
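As a quick sanity check of the duality (an added illustration, not from the slides): for 1-D distributions scipy.stats.wasserstein_distance returns the same value as the primal linear program sketched earlier.

import numpy as np
from scipy.stats import wasserstein_distance

# Same toy distributions as in the linprog sketch above.
xs = np.arange(4, dtype=float)
p = np.array([0.5, 0.5, 0.0, 0.0])
q = np.array([0.0, 0.0, 0.5, 0.5])
print(wasserstein_distance(xs, xs, p, q))  # 2.0, matching em_distance(p, q, xs)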
Three theorems: 3 (Kantorovich-Rubinstein), recap
- W is estimated by maximizing over a parametric family of K-Lipschitz functions f_w (the critic)
- Its gradient with respect to \theta is obtained by backpropagating through the critic
\max_{w \in \mathcal{W}} \mathbb{E}_{x \sim P_r}[f_w(x)] - \mathbb{E}_{x \sim P_\theta}[f_w(x)]
  \le \sup_{\|f\|_L \le K} \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{x \sim P_\theta}[f(x)]
  = K \cdot W(P_r, P_\theta)
\nabla_\theta W(P_r, P_\theta) = \nabla_\theta \big( \mathbb{E}_{x \sim P_r}[f_w(x)] - \mathbb{E}_{z \sim Z}[f_w(g_\theta(z))] \big)
  = -\mathbb{E}_{z \sim Z}\big[\nabla_\theta f_w(g_\theta(z))\big]
2. WGAN: the training algorithm
\nabla_\theta W(P_r, P_\theta) = \nabla_\theta \big( \mathbb{E}_{x \sim P_r}[f_w(x)] - \mathbb{E}_{z \sim Z}[f_w(g_\theta(z))] \big)
  = -\mathbb{E}_{z \sim Z}\big[\nabla_\theta f_w(g_\theta(z))\big]
PyTorch
https://goo.gl/unktzn
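A minimal, self-contained sketch of the WGAN training loop (Algorithm 1 of the paper) on toy 2-D data; it is not the code behind the link above, and the model sizes, z_dim and the toy data are illustrative assumptions. Hyperparameters (RMSProp, lr=5e-5, n_critic=5, clip 0.01) follow the paper's defaults.

# WGAN training loop: critic (no sigmoid, no log-loss) + weight clipping.
import torch
from torch import nn, optim

z_dim, n_critic, clip_value, lr = 8, 5, 0.01, 5e-5
generator = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, 2))
critic    = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))  # f_w
opt_g = optim.RMSprop(generator.parameters(), lr=lr)
opt_c = optim.RMSprop(critic.parameters(), lr=lr)

for step in range(1000):
    # --- update the critic f_w n_critic times ---
    for _ in range(n_critic):
        real = torch.randn(64, 2) + 3.0                        # toy "real" data
        fake = generator(torch.randn(64, z_dim)).detach()
        loss_c = -(critic(real).mean() - critic(fake).mean())  # maximize the W estimate
        opt_c.zero_grad(); loss_c.backward(); opt_c.step()
        for p in critic.parameters():                          # clipping -> ~K-Lipschitz f_w
            p.data.clamp_(-clip_value, clip_value)
    # --- update the generator g_theta ---
    loss_g = -critic(generator(torch.randn(64, z_dim))).mean()
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

Note the critic outputs an unbounded score and the losses are plain means with flipped signs; gradient descent on these negatives implements the maximization in the dual objective.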
3. WGAN: experiments
- The critic's W estimate correlates with sample quality, while the JS estimate tracked by a standard DCGAN does not
  [figure: estimate curves over training, DCGAN (JS) vs. WGAN (W)]
- With BatchNorm the DCGAN architecture trains fine under both; without BatchNorm the standard DCGAN breaks down while WGAN still produces reasonable samples
  [figure: generated samples, DCGAN vs. WGAN, with and without BatchNorm]
- With an MLP generator, the standard GAN mode-collapses; WGAN with the same MLP does not
  [figure: generated samples, MLP generator, DCGAN vs. WGAN]
-
-
- WGAN follow-up: improved training with a gradient penalty instead of weight clipping (sketch below)
- Whether the GAN G/D game actually learns the data distribution is examined empirically in the second paper below
Improved Training of Wasserstein GANs
https://arxiv.org/abs/1704.00028
Do GANs actually learn the distribution? An empirical study
https://arxiv.org/abs/1706.08224
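A hedged sketch of the gradient-penalty term from "Improved Training of Wasserstein GANs" (WGAN-GP), which replaces weight clipping; critic, real, fake and lambda_gp are illustrative assumptions, and the sketch assumes vector-valued samples.

import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    # Interpolate between real and generated samples.
    eps = torch.rand(real.size(0), 1)
    x_hat = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
    # Penalize deviations of the critic's gradient norm from 1.
    grads = torch.autograd.grad(critic(x_hat).sum(), x_hat, create_graph=True)[0]
    return lambda_gp * ((grads.norm(2, dim=1) - 1.0) ** 2).mean()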
WGAN
The GAN Discriminator is replaced by a Critic that estimates the W (EM) distance
1. Why the W distance? (Theorems 1, 2)
2. How to estimate W (Theorem 3)
3. Training with the W estimate (the WGAN algorithm)
4. Experiments
Questions
kashino@bakfoo.com
@yutakashino
BakFoo, Inc.
NHK NMAPS: +
BakFoo, Inc.
PyConJP 2015
Python
BakFoo, Inc.
BakFoo, Inc.
: SNS +
