This is a StyleGAN2 paper review by 고형권 of the Fundamental Team.
Following the StyleGAN review last time, here is a review of the StyleGAN2 paper! StyleGAN held on to the state-of-the-art position, but the authors found that droplet-shaped artifacts occasionally produced inside the network seriously interfered with inference. On top of this, whereas AdaIN in StyleGAN normalized the mean and variance of each feature map, StyleGAN2 normalizes the convolution weights instead: the mean is dropped from AdaIN and only the standard deviation is kept, which turns out to be sufficient. The bias and noise are also moved outside the style block, so that the effects of style and noise become independent.
Previously, the influence of the noise was inversely proportional to the magnitude of the style, but now the effect of varying the noise is clearly visible. While this is not mathematically identical to instance normalization, it drives the output feature maps toward unit standard deviation, making training more stable and going a long way toward eliminating the droplet artifacts!
Thank you in advance for your interest, as always!
8. StyleGAN2 (2020 CVPR)
● In this paper, we focus all analysis solely on W, as it is the relevant latent
space from the synthesis network’s point of view.
11. The Problem of StyleGAN (Droplet artifacts)
https://www.youtube.com/watch?v=c-NJtV9Jvp0
12. B. Weight Demodulation
● AdaIN operation normalizes the mean and variance of each feature map
separately, thereby potentially destroying any information found in the
magnitudes of the features relative to each other.
● How to fix? Get rid of AdaIN structure
13. B. Weight Demodulation
● BEFORE (StyleGAN)
● W → affine transform → A
Image from original paper
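The replacement for AdaIN folds the style into the convolution weights and then rescales them so that the output statistics stay controlled without touching the feature maps directly. A minimal NumPy sketch of this modulation/demodulation step (function and variable names are my own, not the official implementation):

```python
import numpy as np

def modulated_weight(weight, style, eps=1e-8):
    """Sketch of StyleGAN2 weight (de)modulation.

    weight: conv kernel, shape (out_ch, in_ch, kh, kw)
    style:  per-input-channel scales from the affine transform A, shape (in_ch,)
    """
    # Modulation: scale the weights reading each input channel by its style.
    w = weight * style[None, :, None, None]
    # Demodulation: rescale each output channel so that, assuming the input
    # activations have unit variance, the output again has unit standard
    # deviation -- a statistical stand-in for instance normalization.
    sigma = np.sqrt((w ** 2).sum(axis=(1, 2, 3), keepdims=True) + eps)
    return w / sigma
```

After demodulation, the squared weights feeding each output channel sum to (approximately) one, which is what keeps activation magnitudes from drifting layer to layer.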
19. Perceptual Path Length score
● Perceptual Path Length score (PPL): Sampled from latent-space vectors using
interpolation → distance calculation using features of VGG
Image from original paper
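The metric takes pairs of latent vectors, interpolates between them, nudges the interpolation by a small step, and measures the perceptual distance between the two resulting images, scaled by the step size. A sketch under stated assumptions: `generate` and `embed` below are stand-ins for the generator and the VGG-based perceptual embedding from the paper, and spherical interpolation is used as the paper does in Z space (linear interpolation is used in W).

```python
import numpy as np

def slerp(a, b, t):
    """Spherical interpolation between latent vectors."""
    omega = np.arccos(np.clip(
        np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)), -1.0, 1.0))
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

def ppl(generate, embed, z_dim, n_pairs=64, eps=1e-4, seed=0):
    """Perceptual Path Length sketch (names are mine)."""
    rng = np.random.default_rng(seed)
    dists = []
    for _ in range(n_pairs):
        z1 = rng.standard_normal(z_dim)
        z2 = rng.standard_normal(z_dim)
        t = rng.uniform(0.0, 1.0 - eps)
        # Perceptual difference between two nearby points on the path.
        d = (embed(generate(slerp(z1, z2, t)))
             - embed(generate(slerp(z1, z2, t + eps))))
        # Squared distance per unit step, scaled by 1/eps^2.
        dists.append(float(np.sum(d ** 2)) / eps ** 2)
    return float(np.mean(dists))
```

A smooth, well-behaved mapping from latents to images yields small per-step distances and hence a low PPL.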
21. Why LOW PPL == GOOD Quality?
● NOT immediately obvious…
● Hypothesis:
GOOD image region (STRETCHED, large latent space)
BAD image region (SQUEEZED, small latent space)
Image from original paper
23. Why better than FID and Precision & Recall?
Geirhos et al., 2019 ICLR (oral)
24. D. Path Length Regularization
● We would like to encourage that a fixed-size step in W results in a non-zero,
fixed-magnitude change in the image.
● We can measure the deviation from this ideal empirically by stepping into
random directions in the image space and observing the corresponding w
gradients.
Augustus Odena, Jacob Buckman, Catherine Olsson, Tom B. Brown, Christopher Olah, Colin Raffel, and Ian Goodfellow.
Is generator conditioning causally related to GAN performance? CoRR, abs/1802.08768, 2018.
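The regularizer penalizes how far the magnitude of the Jacobian-vector product deviates from a running constant: roughly E[(‖J_wᵀy‖ − a)²], where y is a random image-space direction and a tracks the running mean of the norms. A toy NumPy sketch with a linear "generator" whose Jacobian is exact (the setup is my own, purely for illustration; the real implementation backpropagates through the synthesis network):

```python
import numpy as np

def path_length_penalty(norms, a):
    """E[(||J_w^T y|| - a)^2]; `a` is the running mean of the norms."""
    return float(np.mean((norms - a) ** 2))

rng = np.random.default_rng(0)
M = rng.standard_normal((64, 8))        # linear "generator": img = M @ w
y = rng.standard_normal((16, 64))       # batch of random image-space directions
norms = np.linalg.norm(y @ M, axis=1)   # ||J^T y|| per sample, since J = M here
a = norms.mean()                        # target magnitude (running average)
penalty = path_length_penalty(norms, a)
```

With a set to the batch mean, the penalty reduces to the variance of the norms, so driving it to zero is exactly asking that a fixed-size step in W produce a fixed-magnitude change in the image regardless of direction.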
25. C. Lazy Regularization
● As the resulting regularization term is somewhat expensive to compute, we
first describe a general optimization that applies to any regularization
technique.
● … performed only once every 16 mini-batches
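The trick is purely about scheduling: skip the expensive regularization term on most steps and, when it does run, scale it up by the interval so its effective strength matches applying it every step. A minimal sketch (the function name is mine):

```python
REG_INTERVAL = 16  # from the paper: regularize only every 16 mini-batches

def lazy_loss(step, main_loss, reg_loss):
    """Lazy regularization: add the (expensive) regularizer only every
    REG_INTERVAL steps, scaled by the interval to keep its effective
    weight over training unchanged."""
    if step % REG_INTERVAL == 0:
        return main_loss + REG_INTERVAL * reg_loss
    return main_loss
```

On 15 out of every 16 mini-batches only the main loss is computed, which is where the training-time savings come from.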
27. The Problem of StyleGAN (Phase artifacts)
Image from original paper
28. The Problem of StyleGAN (Phase artifacts)
https://www.youtube.com/watch?v=c-NJtV9Jvp0
29. The Problem of StyleGAN (Phase artifacts)
● … We believe the problem is that in progressive growing each resolution
serves momentarily as the output resolution, forcing it to generate maximal
frequency details, which then leads to the trained network to have
excessively high frequencies in the intermediate layers, ...
31. E. No Growing, New G & D Architecture
Image from original paper
32. E. No Growing, New G & D Architecture
Image from original paper
33. The Result of New Architecture
https://www.youtube.com/watch?v=c-NJtV9Jvp0
34. F. Large Networks for Better Performance
… To verify this, we inspected the generated images
manually and noticed that they generally lack some of
the pixel-level detail...
Image from original paper
35. F. Large Networks for Better Performance
Image from original paper
36. Deep Fake Detection Algorithm
● How can we detect generated / synthesized image?
Image from original paper
37. Deep Fake Detection Algorithm
https://www.youtube.com/watch?v=c-NJtV9Jvp0
38. How to do this? LPIPS distance
Zhang, R., Isola, P., Efros, A. A., Shechtman, E., & Wang, O. (2018). The unreasonable effectiveness of deep features as a
perceptual metric. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 586-595).
39. How to do this?
● (for StyleGAN) it appears that the
mapping from W to images is too
complex for this to succeed reliably in
practice.
● We find it encouraging that StyleGAN2
makes source attribution easier even
though the image quality has improved
significantly.
Image from original paper