2. Recall: Text-Guided Manipulation with StyleGAN
• Input-Dependent method requires…
  • Different directions for every Image & Text pair
  • Time consuming
Unhappy
3. Recall: Text-Guided Manipulation with StyleGAN
• Input-Agnostic method requires…
  • Universal direction for one text
  • Efficient
Unhappy
4. Recall: Text-Guided Manipulation with StyleGAN
• Two methods for discovering an Input-Agnostic Direction
  • Global Mapper 1):
    • needs optimization for every text prompt
  • Global Direction 1):
    • finds a universal direction for a given text instantly
1) StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery (ICCV 2021)
5. Recall: StyleCLIP Global Direction
• Similarity between latent space parameter and text guidance
[Figure: three parameters (1, 2, 3) of the StyleGAN Latent Space are each compared with the Text Guidance "OLD" through a similarity score, sim(parameter, "old").]
Similarity between StyleGAN Latent Space & Text cannot be computed directly!
6. Recall: StyleCLIP Global Direction
• Global Direction 1) uses a precomputed latent space dictionary
[Figure: a Dictionary maps each parameter (1, 2, 3) of the StyleGAN Latent Space to a CLIP(Image) representation, which is compared with CLIP(Text), e.g. CLIP("OLD").]
Dictionary enables direct similarity computation in CLIP space
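The dictionary lookup described above can be sketched as follows. This is a minimal illustration, not the StyleCLIP implementation: the array shapes, the `threshold` parameter, and the function names are assumptions, and the toy vectors stand in for real CLIP embeddings.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    # Scale vectors to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def direction_from_dictionary(dictionary, text_embedding, threshold=0.1):
    """dictionary: (num_channels, clip_dim), one CLIP-space vector per channel.
    text_embedding: (clip_dim,), CLIP encoding of the text guidance.
    Both are assumed to be precomputed elsewhere."""
    d = l2_normalize(dictionary)
    t = l2_normalize(text_embedding)
    similarity = d @ t  # one cosine similarity per channel
    # Keep only channels whose similarity is large enough (hypothetical rule).
    direction = np.where(np.abs(similarity) >= threshold, similarity, 0.0)
    return similarity, direction

# Toy stand-ins for CLIP vectors (2-D instead of the real embedding size).
dictionary = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
text = np.array([2.0, 0.0])
similarity, direction = direction_from_dictionary(dictionary, text, threshold=0.5)
```

In this toy setup the first channel aligns with the text, the second is unrelated and is zeroed out, and the third points the opposite way.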
7. Problem of StyleCLIP Global Direction
• Global Direction 1) computes a single-channel CLIP representation
  • Ignores multi-channel interaction in latent space!
[Figure: the Dictionary stores one CLIP(Image) representation per single parameter (1, 2, 3) of the StyleGAN Latent Space.]
8. Importance of Multi-Channel Interaction in Latent Space
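• Manipulation on channels in StyleGAN Latent Space
[Figure: manipulating one channel alone gives "No change", another alone gives "Blondish hair", and manipulating both together (+) gives "White" hair.]
Guidance: "White"
Cannot be found!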
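A toy illustration of why single-channel probing can miss such an effect. The score function below is entirely hypothetical (it is not a StyleGAN or CLIP quantity); it only models an attribute that appears when two channels move together.

```python
import numpy as np

def toy_white_hair_score(s):
    # Hypothetical attribute that responds only when channels 0 and 1
    # are changed together (a pure interaction term).
    return s[0] * s[1]

base = np.zeros(3)

# Probe each channel on its own, as a single-channel dictionary would.
single = [toy_white_hair_score(base + np.eye(3)[i]) for i in range(3)]

# Probe a multi-channel direction that moves channels 0 and 1 jointly.
joint_direction = np.array([1.0, 1.0, 0.0])
joint = toy_white_hair_score(base + joint_direction)
```

Every single-channel probe scores zero here, while the joint direction produces the attribute, so a dictionary built only from single-channel probes has nothing to match the guidance against.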
9. Learning Multi-Channel Interaction into a Dictionary
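• Optimal Case: Text and Direction pair exists
[Figure: Text & Direction pairs over three latent-space parameters for Text 1: Young, Text 2: Woman, Text 3: Glasses. Direction values shown: +0.3, -0.7, +0.0, +0.1, +0.2, +0.9, +0.9, -0.1, -0.3.]
Such Text & Direction pairs do not exist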
10. Learning Multi-Channel Interaction into a Dictionary
• Substitute Text-Direction pairs!
[Figure: Substitution. The pair (CLIP Encoded Text Guidance, Input-Agnostic Direction of Text) is substituted with the pair (CLIP Encoded Unsupervised Directions, Unsupervised Directions 1), 2)).]
1) Closed-Form Factorization of Latent Semantics in GANs, CVPR 2021
2) GANSpace: Discovering Interpretable GAN Controls, NeurIPS 2020
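As a rough sketch of how unsupervised directions can be obtained, the snippet below applies PCA to a set of sampled latent codes, in the spirit of GANSpace. The random `latents` array is a placeholder for codes sampled from a real generator, and the function name is an assumption.

```python
import numpy as np

def pca_directions(latents, num_directions):
    # Center the samples, then take the top right-singular vectors,
    # which are the principal axes sorted by explained variance.
    centered = latents - latents.mean(axis=0, keepdims=True)
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:num_directions], singular_values[:num_directions]

rng = np.random.default_rng(0)
# Placeholder latent codes: most variance lies along the first axis.
scales = np.array([5.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
latents = rng.normal(size=(1000, 8)) * scales
directions, strengths = pca_directions(latents, num_directions=3)
```

Each row of `directions` is a unit-length candidate direction that needs no text supervision; such directions are the ones paired with their CLIP encodings in the substitution above.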
13. Learning a Dictionary with Pairs
• The manipulation effect of unsupervised directions is decomposed
• The parameter relevant to each decomposed effect is identified
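One way to picture this decomposition is a least-squares fit. The sketch assumes, as a simplification not stated on the slides, that the CLIP-encoded effect of a direction is a linear combination of per-parameter dictionary entries; all arrays are synthetic stand-ins and the names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
num_pairs, num_channels, clip_dim = 200, 6, 4

# Synthetic ground truth: one CLIP-space vector per latent parameter.
true_dictionary = rng.normal(size=(num_channels, clip_dim))

# Stand-ins for unsupervised directions and their CLIP-encoded effects.
directions = rng.normal(size=(num_pairs, num_channels))
clip_effects = directions @ true_dictionary

# Decompose the effects: find D minimizing ||directions @ D - clip_effects||^2.
learned_dictionary, *_ = np.linalg.lstsq(directions, clip_effects, rcond=None)
```

Row `i` of `learned_dictionary` then describes what parameter `i` contributes to the effect. Because this toy data is noiseless and exactly linear, the fit recovers `true_dictionary`.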