3. Paper reading
Today’s paper
Shen Y, Gu J, Tang X, Zhou B. Interpreting the Latent Space of GANs for Semantic
Face Editing. In Proceedings of the IEEE/CVF Conference on Computer Vision and
Pattern Recognition 2020 (pp. 9243-9252).
● “we propose a novel framework, called InterFaceGAN”
● “for semantic face editing by interpreting the latent semantics learned by GANs.”
● StyleGAN (or PGGAN) + disentangled latent codes
4. Why this paper?
CVPR accepted paper list → grep “interpret”
● ALFRED
● A Disentangling Invertible Interpretation Network for Explaining Latent Representations
● Self-supervised Learning of Interpretable Keypoints from Unlabelled Videos
● Interpreting the Latent Space of GANs for Semantic Face Editing
● Interpretable and Accurate Fine-grained Recognition via Region Grouping
5. TL;DR
- We can change images semantically (Age, Eyeglasses, …)
- using StyleGAN or PGGAN models
9. Previous Research
StyleGAN
Key Points:
- Progressive Growing + AdaIN + Mixing Regularization.
- Uses two latent codes (z and w).
- The latent codes are disentangled.
10. Previous Research
Image2StyleGAN
Key Points:
Optimize w from original images (as in AnoGAN).
Interpolating from w1 to w2 gives morphing.
However, the change is not semantic.
We would like to edit images semantically,
e.g.) male → female
smiling → expressionless
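The morphing described above is just a straight line walk in latent space. A minimal numpy sketch of that interpolation (the function name and step count are my own, not from the paper):

```python
import numpy as np

def interpolate_latents(w1, w2, num_steps=5):
    """Linearly interpolate between two latent codes w1 and w2.

    Feeding each intermediate code to the generator yields a morph
    sequence between the two embedded images (as in Image2StyleGAN).
    Note this blends everything at once, which is why the result is
    a morph rather than a targeted semantic edit.
    """
    alphas = np.linspace(0.0, 1.0, num_steps)
    return [(1.0 - a) * w1 + a * w2 for a in alphas]
```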
14. Method
1. Given a well-trained GAN model (StyleGAN or PGGAN)
2. Generate z from an image with the model
3. Calculate the semantic scores
4. Define the separation boundary (for m semantics)
Five key facial attributes (m = 5) were analyzed in the paper:
● Pose
● Smile (Expression)
● Age
● Gender
● Eyeglasses
5. Generate an image from z_edit
↑ This is the semantically edited image.
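Step 5 above boils down to shifting the latent code along a boundary's normal direction before decoding. A minimal sketch, assuming a unit normal n obtained from the boundary in step 4 (the function name and the strength parameter alpha are illustrative, not from the paper):

```python
import numpy as np

def edit_latent(z, n, alpha):
    """Move latent code z along the unit normal n of a semantic
    boundary. alpha controls edit strength and sign (e.g. positive
    toward 'older', negative toward 'younger'); the result z_edit
    is then fed back through the generator.
    """
    return z + alpha * n
```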
17. Method
SVMs are used to define the separation boundary (for m semantics)
[Figure: latent codes z on either side of an SVM hyperplane with normal vector n, separating Male from Female]
Moving z across the boundary changes the image semantically.
However, “when there is more than one attribute,
editing one may affect another since some
semantics can be coupled with each other.”
20. Method
1. Calculate the semantic scores from generated images
a. with a ResNet50 trained on CelebA
b. the original scores are binary and multi-class labels
2. Define the separation boundary (for m semantics)
a. using SVMs
3. Move the latent codes to change the image semantically
a. by manually forcing N・N to be diagonal (called conditional manipulation)
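Conditional manipulation in step 3a can be sketched as projecting the primary edit direction onto the subspace orthogonal to a second boundary's normal, so the second attribute's score stays fixed. A minimal numpy version, assuming unit normals n1 and n2 from the SVMs (the function name is my own):

```python
import numpy as np

def conditional_direction(n1, n2):
    """Remove from n1 its component along n2, then renormalize.
    Moving along the result edits attribute 1 while staying on
    attribute 2's boundary, decoupling the two edits."""
    proj = n1 - (n1 @ n2) * n2   # subtract the component along n2
    return proj / np.linalg.norm(proj)
```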
22. Experiments
Pose and smile are almost
orthogonal to the other attributes.
Gender, age, and eyeglasses,
however, are not orthogonal
and need conditional manipulation.
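The orthogonality claim above can be checked by taking the cosine similarity between pairs of boundary normals: values near zero indicate roughly disentangled attributes, larger magnitudes indicate coupling that calls for conditional manipulation. A small sketch (not the paper's exact evaluation code):

```python
import numpy as np

def cosine_similarity(n1, n2):
    """Cosine of the angle between two boundary normals; ~0 means
    the attributes are nearly orthogonal (disentangled), larger
    magnitude means editing one will drag the other along."""
    return float(n1 @ n2 / (np.linalg.norm(n1) * np.linalg.norm(n2)))
```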
25. Conclusion
● We can convert img1 into img2 smoothly and semantically.
● With a pretrained model, training is relatively fast.
● It may not be accurate enough to generate novel images.
Note
- Is there no quantitative metric for "semantically"?
26. References
1. Shen Y, Gu J, Tang X, Zhou B. Interpreting the Latent Space of GANs for Semantic Face Editing. In Proceedings of the IEEE/CVF Conference
on Computer Vision and Pattern Recognition 2020 (pp. 9243-9252).
2. Karras T, Aila T, Laine S, Lehtinen J. Progressive Growing of GANs for Improved Quality, Stability, and Variation. arXiv preprint
arXiv:1710.10196. 2017 Oct 27.
3. Karras T, Laine S, Aila T. A Style-Based Generator Architecture for Generative Adversarial Networks. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition 2019 (pp. 4401-4410).
4. Abdal R, Qin Y, Wonka P. Image2StyleGAN: How to Embed Images into the StyleGAN Latent Space?. In Proceedings of the IEEE International
Conference on Computer Vision 2019 (pp. 4432-4441).
5. "Play with StyleGAN!! — Image Editing Without Additional Training" (in Japanese) - Qiita https://qiita.com/pacifinapacific/items/1d6cca0ff4060e12d336
6. "From GAN Basics to StyleGAN2" (in Japanese) | by akira | Medium
https://medium.com/@akichan_f/gan%E3%81%AE%E5%9F%BA%E7%A4%8E%E3%81%8B%E3%82%89stylegan2%E3%81%BE%E3%81%A7-dfd2608410b3