Learning Input-Agnostic
Manipulation Directions in StyleGAN
with Text Guidance
Yoonjeon Kim, Hyunsu Kim, Junho Kim, Yunjey Choi, Eunho Yang
ICLR 2023
Recall: Text-Guided Manipulation with StyleGAN
• Input-dependent methods require…
  • A different direction for every image and text pair
  • Time-consuming
Unhappy
Recall: Text-Guided Manipulation with StyleGAN
• Input-agnostic methods require…
  • One universal direction per text
  • Efficient
Unhappy
Recall: Text-Guided Manipulation with StyleGAN
• Two methods for discovering an input-agnostic direction
  • Global Mapper¹⁾:
    • needs optimization for every text prompt
  • Global Direction¹⁾:
    • finds a universal direction for a given text instantly
1) StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery (ICCV 2021)
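What "input-agnostic" means in practice can be sketched as follows: a single direction per text prompt is added to every latent code, with no per-image optimization. The sizes, the random latent codes, and the names `apply_direction` and `direction_old` are illustrative assumptions, not part of StyleCLIP or of the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
num_images, num_channels = 4, 8   # toy sizes, not StyleGAN's real dimensions

# Stand-in latent codes, one row per image (assumption: random placeholders).
latents = rng.normal(size=(num_images, num_channels))

# One direction for the prompt, shared by all images (random placeholder here).
direction_old = rng.normal(size=num_channels)
direction_old /= np.linalg.norm(direction_old)


def apply_direction(latents, direction, strength=1.0):
    """Shift every latent code along the same direction (broadcast over images)."""
    return latents + strength * direction


edited = apply_direction(latents, direction_old, strength=2.0)
```

Because the direction does not depend on the image, editing a new image costs one vector addition once the direction for a text is known.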
Recall: StyleCLIP Global Direction
• Similarity between a latent-space parameter and the text guidance
[Figure: each StyleGAN latent-space parameter (1, 2, 3) is compared with the text guidance “Old”: sim(parameter 1, “Old”), sim(parameter 2, “Old”), sim(parameter 3, “Old”).]
Similarity between the StyleGAN latent space and text cannot be computed directly!
Recall: StyleCLIP Global Direction
• Global Direction¹⁾ uses a precomputed latent-space dictionary
[Figure: a dictionary maps each StyleGAN latent-space parameter (1, 2, 3) to a CLIP(Image) entry $d_1, d_2, d_3$, which is compared with CLIP(Text), e.g. CLIP(“Old”).]
The dictionary enables direct similarity computation in CLIP space
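A minimal sketch of the dictionary lookup, assuming the dictionary is a matrix with one CLIP-space row per latent-space parameter and that relevance is measured by cosine similarity with the text embedding. The random arrays stand in for real CLIP features, and `direction_from_dictionary` and its `threshold` argument are hypothetical names chosen for illustration.

```python
import numpy as np


def normalize(x, axis=-1):
    """Scale vectors to unit length along the given axis."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)


def direction_from_dictionary(dictionary, text_embedding, threshold=0.1):
    """Score each latent parameter by cosine similarity to the text embedding.

    dictionary:     (num_channels, clip_dim), one CLIP-space entry per parameter
    text_embedding: (clip_dim,)
    Parameters whose |similarity| falls below the threshold are zeroed out.
    """
    d = normalize(dictionary)
    t = normalize(text_embedding)
    relevance = d @ t                      # cosine similarity per parameter
    return np.where(np.abs(relevance) >= threshold, relevance, 0.0)


rng = np.random.default_rng(0)
dictionary = rng.normal(size=(6, 16))      # placeholder dictionary entries
# Placeholder text embedding constructed to align with parameter 2.
text_embedding = dictionary[2] + 0.01 * rng.normal(size=16)

direction = direction_from_dictionary(dictionary, text_embedding)
```

In this toy setup the largest entry of `direction` falls on parameter 2, the one whose dictionary entry the text embedding was built from.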
Problem of StyleCLIP Global Direction
• Global Direction¹⁾ computes a single-channel CLIP representation
• It ignores multi-channel interaction in the latent space!
[Figure: the dictionary maps each StyleGAN latent-space parameter (1, 2, 3) individually to its CLIP(Image) entry $d_1, d_2, d_3$.]
Importance of Multi-Channel Interaction in Latent Space
• Manipulating channels in the StyleGAN latent space
[Figure: two channel manipulations labeled “No change” and “Blondish hair”; their combination (+) is labeled “White”.]
Guidance: “White”
Cannot be found!
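A toy illustration of why probing one channel at a time can miss an effect. The function `toy_attributes` is a hypothetical readout invented for this sketch (it is not a StyleGAN or CLIP call): the "white" attribute appears only when both channels move together.

```python
import numpy as np


def toy_attributes(s):
    """Hypothetical attribute readout for a 2-channel code: (blondish, white)."""
    blondish = s[1]
    white = s[0] * s[1]        # interaction term: needs both channels at once
    return np.array([blondish, white])


base = np.zeros(2)
e0, e1 = np.eye(2)             # unit perturbations of channel 0 and channel 1

effect_ch0 = toy_attributes(base + e0) - toy_attributes(base)        # no change
effect_ch1 = toy_attributes(base + e1) - toy_attributes(base)        # blondish only
effect_both = toy_attributes(base + e0 + e1) - toy_attributes(base)  # blondish and white

# Summing the single-channel effects does not reproduce the joint effect.
missed = effect_both - (effect_ch0 + effect_ch1)
```

Here `missed` is non-zero only in the "white" coordinate, which is exactly the part a per-channel dictionary has no entry for.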
Learning Multi-Channel Interaction into a Dictionary
• Optimal case: text and direction pairs exist
[Figure: each text (Text 1: Young, Text 2: Woman, Text 3: Glasses) is paired with a direction over the three latent-space parameters; example entries: +0.3, -0.7, +0.0, +0.1, +0.2, +0.9, +0.9, -0.1, -0.3.]
Such pairs do not exist
Learning Multi-Channel Interaction into a Dictionary
• Substitute Text–Direction pairs!
Substitution:
  CLIP-encoded text guidance → CLIP-encoded unsupervised directions
  Input-agnostic direction of text → Unsupervised directions¹⁾ ²⁾
1) Closed-Form Factorization of Latent Semantics in GANs, CVPR 2021
2) GANSpace: Discovering Interpretable GAN Controls, NeurIPS 2020
Method: Unsupervised Directions and CLIP encodings
• Unsupervised directions: $\boldsymbol{\alpha}^{(1)}, \boldsymbol{\alpha}^{(2)}, \boldsymbol{\alpha}^{(3)}$
Method: Unsupervised Directions and CLIP encodings
• CLIP-encoded directions: $\boldsymbol{\phi}(\boldsymbol{\alpha}^{(1)}), \boldsymbol{\phi}(\boldsymbol{\alpha}^{(2)}), \boldsymbol{\phi}(\boldsymbol{\alpha}^{(3)})$
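The slides do not spell out how a direction is encoded with CLIP. One plausible reading, sketched below purely as an assumption, is to average the change in image embedding caused by adding the direction to many sampled latent codes. `toy_generator` and `toy_image_encoder` are random stand-ins for StyleGAN and the CLIP image encoder, and all sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, IMAGE_DIM, CLIP_DIM = 8, 32, 16   # toy sizes

# Random linear maps standing in for the real networks (assumption).
W_gen = rng.normal(size=(IMAGE_DIM, LATENT_DIM))
W_clip = rng.normal(size=(CLIP_DIM, IMAGE_DIM))


def toy_generator(s):
    """Stand-in for the generator: latent codes -> flattened 'images'."""
    return np.tanh(s @ W_gen.T)


def toy_image_encoder(images):
    """Stand-in for the CLIP image encoder: unit-norm embeddings."""
    feats = images @ W_clip.T
    return feats / np.linalg.norm(feats, axis=-1, keepdims=True)


def encode_direction(alpha, num_samples=64):
    """Mean embedding shift caused by moving sampled latents along alpha."""
    s = rng.normal(size=(num_samples, LATENT_DIM))
    shift = toy_image_encoder(toy_generator(s + alpha)) - toy_image_encoder(toy_generator(s))
    mean_shift = shift.mean(axis=0)
    return mean_shift / np.linalg.norm(mean_shift)


# Three orthonormal placeholder directions playing the role of the unsupervised ones.
alphas = np.linalg.qr(rng.normal(size=(LATENT_DIM, 3)))[0].T
phi = np.stack([encode_direction(a) for a in alphas])   # (3, CLIP_DIM)
```

In a real pipeline the unsupervised directions would come from a method such as the two cited on the previous slide rather than from a random orthonormal basis.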
Learning a Dictionary with Pairs
• The manipulation effect of unsupervised directions is decomposed
• The parameter relevant to each decomposed effect is identified
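As a rough sketch of learning a dictionary from (CLIP-encoded direction, direction) pairs, the example below fits a matrix by ridge regression so that applying it to an encoding approximately returns the paired direction. This least-squares objective is a simplified stand-in chosen for illustration, not the loss used by the authors; the data are synthetic and `fit_dictionary` is a hypothetical helper.

```python
import numpy as np


def fit_dictionary(phi, alphas, ridge=1e-3):
    """Fit D so that D @ phi_k approximates alpha_k for every pair k.

    phi:    (K, clip_dim)      encodings of the K directions
    alphas: (K, num_channels)  the directions themselves
    returns D with shape (num_channels, clip_dim)
    """
    clip_dim = phi.shape[1]
    gram = phi.T @ phi + ridge * np.eye(clip_dim)
    return np.linalg.solve(gram, phi.T @ alphas).T


rng = np.random.default_rng(0)
K, clip_dim, num_channels = 200, 16, 8            # toy sizes

# Synthetic pairs generated from a known matrix so the fit can be checked.
true_D = rng.normal(size=(num_channels, clip_dim))
phi = rng.normal(size=(K, clip_dim))
alphas = phi @ true_D.T

D = fit_dictionary(phi, alphas, ridge=1e-6)

# At inference time, a text embedding takes the place of the direction encoding.
text_embedding = rng.normal(size=clip_dim)        # placeholder for CLIP(text)
direction = D @ text_embedding                    # one value per latent parameter
```

Each row of `D` is fitted using directions that move many channels at once, which is the sense in which the pairs carry multi-channel information into a per-parameter dictionary.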
Qualitative Results
Thank you!
Any questions?

Y. Kim, ICLR 2023, MLILAB, KAISTAI
