Spectral Normalization for
Generative Adversarial Networks
Understanding it together with PR12
Jaejun Yoo
Clova ML / NAVER
PR12
13th May, 2018
Today’s contents
Spectral Normalization for Generative Adversarial Networks
by Takeru Miyato, Toshiki Kataoka, Masanori Koyama, Yuichi Yoshida
Feb. 2018: https://arxiv.org/abs/1802.05957
Accepted (Oral)
Rating: 8-8-8
ICLR 2018
Motivation & Contribution
“One of the challenges in the study of generative
adversarial networks is the instability of its training.”
: Proposed a novel weight normalization technique called
spectral normalization to stabilize the training of the
discriminator of GANs.
• The Lipschitz constant is the only hyper-parameter to be tuned,
and the algorithm does not require intensive tuning of it for
satisfactory performance.
• Implementation is simple and the additional computational
cost is small.
GANs are hard to train… why?
WGAN, WGAN-GP
While input-based regularizations allow for relatively easy formulations
based on samples, they also suffer from the fact that they cannot
impose regularization on the space outside of the supports of the
generator and data distributions without introducing somewhat
heuristic means.
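For context, this criticism targets sample-based penalties such as the WGAN-GP regularizer (Gulrajani et al., 2017), which is evaluated only at interpolates between real and generated samples, i.e. only near the two supports:

```latex
\[
  \lambda \,\mathbb{E}_{\hat{x} \sim p_{\hat{x}}}
    \left[ \left( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \right)^2 \right],
  \qquad
  \hat{x} = \epsilon\, x + (1 - \epsilon)\, G(z),
  \quad \epsilon \sim U[0, 1].
\]
```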
Spectral Normalization: Prerequisites
“Matrix Norm”
from Wikipedia
https://math.stackexchange.com/questions/586663/why-does-the-spectral-norm-equal-the-largest-singular-value
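Concretely, the matrix norm in question is the spectral norm (operator 2-norm), and the linked answer shows it equals the largest singular value:

```latex
\[
  \sigma(A)
  \;=\; \max_{h \neq 0} \frac{\lVert A h \rVert_2}{\lVert h \rVert_2}
  \;=\; \max_{\lVert h \rVert_2 \le 1} \lVert A h \rVert_2
  \;=\; \sigma_{\max}(A),
\]
where $\sigma_{\max}(A)$ denotes the largest singular value of $A$.
```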
Spectral Normalization
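As a rough illustration, here is a minimal NumPy sketch of the operation the method performs: estimate σ(W) by power iteration (the paper's Algorithm 1 reuses the vector u across SGD steps and runs a single iteration per step) and divide the weight by the estimate. Function and variable names here are my own.

```python
import numpy as np

def spectral_normalize(W, u, n_iters=1):
    """Return W / sigma(W), with sigma estimated by power iteration.

    W: (d_out, d_in) weight matrix.
    u: persistent (d_out,) vector, carried over between calls so a
       single iteration per training step is enough in practice.
    """
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v) + 1e-12   # approx. first right singular vector
        u = W @ v
        u /= np.linalg.norm(u) + 1e-12   # approx. first left singular vector
    sigma = u @ W @ v                    # approx. largest singular value
    return W / sigma, u

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))
u = rng.standard_normal(64)
for _ in range(5):                       # a few steps stand in for SGD updates
    W_sn, u = spectral_normalize(W, u)
print(np.linalg.svd(W_sn, compute_uv=False).max())  # close to 1.0
```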
Gradient Analysis of Spectrally Normalized Weights *
* together with Eq. (10) of the paper
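For reference, the gradient analyzed on this slide can be written as follows (paper notation: $E_{ij}$ is the matrix with a one in entry $(i,j)$ and zeros elsewhere, and $u_1$, $v_1$ are the first left and right singular vectors of $W$):

```latex
\[
  \frac{\partial \bar{W}_{\mathrm{SN}}(W)}{\partial W_{ij}}
  \;=\; \frac{1}{\sigma(W)}\, E_{ij}
        \;-\; \frac{[u_1 v_1^{\mathsf{T}}]_{ij}}{\sigma(W)^2}\, W
  \;=\; \frac{1}{\sigma(W)}
        \left( E_{ij} - [u_1 v_1^{\mathsf{T}}]_{ij}\, \bar{W}_{\mathrm{SN}} \right).
\]
```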
Why SN is better than…
… Weight Normalization
Why SN is better than…
… Gradient Penalty
• The approach has an obvious weakness of being heavily dependent on the support of
the current generative distribution. As a matter of course, the generative distribution
and its support gradually change in the course of the training, and this can
destabilize the effect of such regularization.
… Orthonormal Regularization
• While this seems to serve the same purpose as spectral normalization, orthonormal
regularization is mathematically quite different from spectral normalization,
because orthonormal regularization destroys the information about the spectrum
by setting all the singular values to one. Spectral normalization, on the other hand,
only scales the spectrum so that its maximum is one.
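A toy illustration of that distinction (my own sketch, not from the slides): spectral normalization rescales all singular values by the same factor, while orthonormalization flattens them all to one.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((6, 6))
s = np.linalg.svd(W, compute_uv=False)   # singular values, descending

s_sn = s / s[0]            # SN: spectrum shape preserved, max scaled to 1
s_ortho = np.ones_like(s)  # orthonormal limit: all singular values forced to 1

print("original :", np.round(s, 2))
print("SN       :", np.round(s_sn, 2))
print("orthonorm:", s_ortho)
```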
Results
Squared singular values of weight matrices trained with different methods
> “AS EXPECTED!”
Results
Comparison between SN and orthonormal regularization
> “SN is more stable across various network architectures”
Results
SN remains effective on a large, high-dimensional dataset!
Summary
They proposed a novel weight normalization technique
called spectral normalization to stabilize the training of the
discriminator of GANs
• in various network architectures
• in various hyperparameter settings
• in various datasets
• with an intuitive and straightforward idea
• using only relatively basic linear algebra
Practicality
Principled way
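On the practicality point, the technique later landed as a one-line wrapper in PyTorch (torch.nn.utils.spectral_norm); a small usage sketch, with a discriminator architecture chosen purely for illustration:

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Wrapping a layer registers a hook that runs one power-iteration step per
# forward pass and divides the weight by the estimated spectral norm.
D = nn.Sequential(
    spectral_norm(nn.Conv2d(3, 64, 4, stride=2, padding=1)),   # 32 -> 16
    nn.LeakyReLU(0.1),
    spectral_norm(nn.Conv2d(64, 128, 4, stride=2, padding=1)), # 16 -> 8
    nn.LeakyReLU(0.1),
    nn.Flatten(),
    spectral_norm(nn.Linear(128 * 8 * 8, 1)),
)

x = torch.randn(2, 3, 32, 32)
print(D(x).shape)  # torch.Size([2, 1])
```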
Appendix
