3. Paper reading
Today’s paper
Shen Y, Gu J, Tang X, Zhou B. Interpreting the Latent Space of GANs for Semantic
Face Editing. In Proceedings of the IEEE/CVF Conference on Computer Vision and
Pattern Recognition 2020 (pp. 9243-9252).
● “we propose a novel framework, called InterFaceGAN”
● “for semantic face editing by interpreting the latent semantics learned by GANs.”
● StyleGAN (or PGGAN) + disentangled latent codes
4. Why this paper?
CVPR accepted paper list → grep “interpret”
● ALFRED
● A Disentangling Invertible Interpretation Network for Explaining Latent Representations
● Self-supervised Learning of Interpretable Keypoints from Unlabelled Videos
● Interpreting the Latent Space of GANs for Semantic Face Editing
● Interpretable and Accurate Fine-grained Recognition via Region Grouping
5. TL;DR
- We can change images semantically (Age, Eyeglasses, …)
- using StyleGAN or PGGAN models
9. Previous Research
StyleGAN
Key Points:
- Progressive Growing + AdaIN + Mixing Regularization.
- Uses two latent codes (z and w).
- The latent codes are disentangled.
10. Previous Research
Image2StyleGAN
Key Points:
Optimize w from original images (as in AnoGAN).
Interpolating from w1 to w2 gives morphing.
However, the change is not semantic.
We would like to edit images semantically,
e.g.) male → female
smiling → expressionless
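The morphing described above is just a straight line walk in latent space. A minimal numpy sketch of that interpolation (the function name and step count are my own, not from the paper):

```python
import numpy as np

def interpolate_latents(w1, w2, num_steps=5):
    """Linearly interpolate between two latent codes w1 and w2.

    Feeding each intermediate code to the generator yields a morph
    sequence between the two embedded images (as in Image2StyleGAN).
    Note this blends everything at once, which is why the result is
    a morph rather than a targeted semantic edit.
    """
    alphas = np.linspace(0.0, 1.0, num_steps)
    return [(1.0 - a) * w1 + a * w2 for a in alphas]
```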
14. Method
1. Given a well-trained GAN model (StyleGAN or PGGAN)
2. Generate z from an image with the model
3. Calculate the semantic scores
4. Define the separation boundary (for m semantics)
Five key facial attributes (m = 5) were analyzed in the paper:
● Pose
● Smile (Expression)
● Age
● Gender
● Eyeglasses
5. Generate an image from z_edit
↑ This is the semantically edited image.
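Step 5 above boils down to shifting the latent code along a boundary's normal direction before decoding. A minimal sketch, assuming a unit normal n obtained from the boundary in step 4 (the function name and the strength parameter alpha are illustrative, not from the paper):

```python
import numpy as np

def edit_latent(z, n, alpha):
    """Move latent code z along the unit normal n of a semantic
    boundary. alpha controls edit strength and sign (e.g. positive
    toward 'older', negative toward 'younger'); the result z_edit
    is then fed back through the generator.
    """
    return z + alpha * n
```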
17. Method
SVMs are used to define the separation boundary (for m semantics)
[Figure: latent codes z on either side of an SVM hyperplane with normal vector n, separating Male from Female]
Moving z across the boundary changes the image semantically.
However, “when there is more than one attribute,
editing one may affect another since some
semantics can be coupled with each other.”
20. Method
1. Calculate the semantic scores from generated images
a. with a ResNet50 trained on CelebA
b. the original scores are binary and multi-class labels
2. Define the separation boundary (for m semantics)
a. using SVMs
3. Move the latent codes to change the image semantically
a. by manually forcing N・N to be diagonal (called conditional manipulation)
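Conditional manipulation in step 3a can be sketched as projecting the primary edit direction onto the subspace orthogonal to a second boundary's normal, so the second attribute's score stays fixed. A minimal numpy version, assuming unit normals n1 and n2 from the SVMs (the function name is my own):

```python
import numpy as np

def conditional_direction(n1, n2):
    """Remove from n1 its component along n2, then renormalize.
    Moving along the result edits attribute 1 while staying on
    attribute 2's boundary, decoupling the two edits."""
    proj = n1 - (n1 @ n2) * n2   # subtract the component along n2
    return proj / np.linalg.norm(proj)
```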
22. Experiments
Pose and smile are almost
orthogonal to the other attributes.
Gender, age, and eyeglasses,
however, are not orthogonal
and need conditional manipulation.
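The orthogonality claim above can be checked by taking the cosine similarity between pairs of boundary normals: values near zero indicate roughly disentangled attributes, larger magnitudes indicate coupling that calls for conditional manipulation. A small sketch (not the paper's exact evaluation code):

```python
import numpy as np

def cosine_similarity(n1, n2):
    """Cosine of the angle between two boundary normals; ~0 means
    the attributes are nearly orthogonal (disentangled), larger
    magnitude means editing one will drag the other along."""
    return float(n1 @ n2 / (np.linalg.norm(n1) * np.linalg.norm(n2)))
```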
25. Conclusion
● We can convert img1 into img2 smoothly and semantically.
● With a pretrained model, training is relatively fast.
● It may not be accurate enough to generate novel images.
Note
- Is there no quantitative metric for "semantically"?
26. References
1. Shen Y, Gu J, Tang X, Zhou B. Interpreting the Latent Space of GANs for Semantic Face Editing. In Proceedings of the IEEE/CVF Conference
on Computer Vision and Pattern Recognition 2020 (pp. 9243-9252).
2. Karras T, Aila T, Laine S, Lehtinen J. Progressive Growing of GANs for Improved Quality, Stability, and Variation. arXiv preprint
arXiv:1710.10196. 2017 Oct 27.
3. Karras T, Laine S, Aila T. A Style-Based Generator Architecture for Generative Adversarial Networks. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition 2019 (pp. 4401-4410).
4. Abdal R, Qin Y, Wonka P. Image2StyleGAN: How to Embed Images into the StyleGAN Latent Space?. In Proceedings of the IEEE International
Conference on Computer Vision 2019 (pp. 4432-4441).
5. "Play with StyleGAN!! — Image Editing Without Additional Training" (in Japanese) - Qiita https://qiita.com/pacifinapacific/items/1d6cca0ff4060e12d336
6. "From GAN Basics to StyleGAN2" (in Japanese) | by akira | Medium
https://medium.com/@akichan_f/gan%E3%81%AE%E5%9F%BA%E7%A4%8E%E3%81%8B%E3%82%89stylegan2%E3%81%BE%E3%81%A7-dfd2608410b3