Image-to-Image Translation:
Methods and Applications
Yingxue Pang, Jianxin Lin, Tao Qin, and Zhibo Chen
IEEE Transactions on Multimedia, 2021
2022/6/17
◼Image-to-Image translation (I2I)
I. Introduction
i. Generative models
◼Image-to-Image translation (I2I)
• Learns a mapping from an image in a source domain to a corresponding image in a target domain
• The underlying content is preserved while the domain-specific appearance changes
• Examples include style transfer, colorization, and semantic image synthesis
◼Variational Auto Encoder
[Kingma & Welling, arXiv2014]
• A decoder 𝑝𝜃(𝓍|𝓏) generates data 𝓍 from a latent variable 𝓏
• The marginal likelihood 𝑝𝜃(𝓍) is intractable to evaluate directly
• Training maximizes a variational lower bound (ELBO) on log 𝑝𝜃(𝓍) (see the sketch below)
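A minimal sketch of the VAE objective under the usual assumptions (Gaussian encoder, Bernoulli decoder, inputs in [0, 1]); the function and variable names are illustrative, not from the paper:

```python
import torch
import torch.nn.functional as F

def reparameterize(mu, logvar):
    # z = mu + sigma * eps keeps sampling differentiable
    eps = torch.randn_like(mu)
    return mu + torch.exp(0.5 * logvar) * eps

def vae_loss(x, x_recon, mu, logvar):
    # Reconstruction term: -E_q[log p_theta(x|z)] for a Bernoulli decoder
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")
    # KL(q_phi(z|x) || N(0, I)) has a closed form for a Gaussian encoder
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl  # negative ELBO: minimizing this maximizes the bound
```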
◼Generative Adversarial Networks
[Goodfellow+, NeurIPS2014]
• A generator maps random noise 𝓏 to fake samples
• A discriminator distinguishes real samples from generated ones
• The two networks are trained adversarially in a minimax game
Generative Adversarial Network (GAN)
◼Unconditional GAN
• GAN [Goodfellow+, NeurIPS2014]
• DCGAN [Radford+, ICLR2016]
• Generates samples from random noise alone, with no control over what is produced
◼Conditional GAN [Mirza & Osindero, arXiv2014]
• Both the generator and the discriminator receive an extra condition, such as a class label or an image
• The condition controls which sample is generated (see the sketch below)
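A minimal sketch of the non-saturating conditional GAN losses; the networks G and D are assumed defined elsewhere, and dropping the condition y recovers the unconditional case:

```python
import torch
import torch.nn.functional as F

def d_loss(D, G, x_real, y, z):
    # The discriminator scores (sample, condition) pairs
    real = D(x_real, y)
    fake = D(G(z, y).detach(), y)
    return (F.binary_cross_entropy_with_logits(real, torch.ones_like(real))
            + F.binary_cross_entropy_with_logits(fake, torch.zeros_like(fake)))

def g_loss(D, G, y, z):
    # Non-saturating generator loss: make D label fakes as real
    fake = D(G(z, y), y)
    return F.binary_cross_entropy_with_logits(fake, torch.ones_like(fake))
```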
◼Image-to-Image translation (I2I)
I. Introduction
ii. Evaluation metrics
◼Peak signal-to-noise ratio (PSNR)
• Pixel-level fidelity based on the mean squared error against the ground truth (see the sketch after this list)
◼Structural similarity index (SSIM) [Wang+, IEEE2004]
• Compares luminance, contrast, and structure against the ground truth
◼Inception score (IS) [Salimans+, NeurIPS2016]
• Uses an Inception classifier to score both the quality and the diversity of generated images
◼Mask-SSIM and Mask-IS [Ma+, NeurIPS2017]
• Masked variants of SSIM and IS
• Restrict evaluation to foreground regions so the background does not dominate the score
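PSNR is simple enough to state directly. A minimal NumPy sketch of the standard definition, assuming 8-bit images; this is textbook code, not from the survey:

```python
import numpy as np

def psnr(img, ref, max_val=255.0):
    # Peak signal-to-noise ratio in dB; higher means closer to the reference
    mse = np.mean((img.astype(np.float64) - ref.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```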
◼Conditional inception score (CIS) [Huang+, ECCV2018]
• An IS variant tailored to multimodal I2I
• Measures the diversity of outputs generated from a single conditioning input
◼Perceptual distance (PD) [Johnson+, ECCV2016]
• Distance between deep features of two images, which correlates better with human perception than pixel distances
◼Fréchet inception distance (FID) [Heusel+, NeurIPS2017]
• Fréchet distance between Gaussians fitted to Inception features of real and generated images (see the sketch below)
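A minimal FID sketch, assuming the Inception activations of the real and generated sets are already extracted as NumPy arrays; this follows the standard formula rather than any reference implementation:

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real, feats_fake):
    # Fit a Gaussian to each set of Inception features
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    # ||mu_r - mu_f||^2 + Tr(C_r + C_f - 2 (C_r C_f)^{1/2})
    covmean = sqrtm(cov_r @ cov_f)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # numerical noise can leave tiny imaginary parts
    return float(np.sum((mu_r - mu_f) ** 2)
                 + np.trace(cov_r + cov_f - 2.0 * covmean))
```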
◼Kernel inception distance (KID) [Binkowski+, ICLR2018]
• Squared maximum mean discrepancy (MMD) between features of an Inception network [Szegedy+, CVPR2016]
• Unlike FID, it has an unbiased estimator and needs no Gaussian assumption
◼Single image Fréchet inception distance (SIFID) [Shaham+, ICCV2019]
• Evaluates a single image pair rather than two image sets
• Computes FID on the internal feature statistics of one image
◼Learned Perceptual Image Patch Similarity (LPIPS)
[Zhang+, CVPR2018]
• Distance between deep features, calibrated on human perceptual judgments
• Lower LPIPS means two images look more similar
• Average pairwise LPIPS over generated samples is also used as a diversity score (see the usage sketch below)
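For LPIPS, the authors ship a pip package; a minimal usage sketch (the `lpips` package and its `LPIPS(net="alex")` constructor are the published API, while the random tensors are placeholders):

```python
import torch
import lpips  # pip install lpips

# AlexNet-backed LPIPS, the variant commonly used for diversity scores
loss_fn = lpips.LPIPS(net="alex")

# Inputs are NCHW tensors scaled to [-1, 1]
img0 = torch.rand(1, 3, 256, 256) * 2 - 1
img1 = torch.rand(1, 3, 256, 256) * 2 - 1
print(loss_fn(img0, img1).item())  # higher = more perceptually different
```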
◼FCN scores [Isola+, CVPR2017]
• Run a semantic segmentation network on generated images and compare against the input semantic map
• Reported as per-pixel accuracy and IoU
◼Classification accuracy [Liu+, NeurIPS2017]
• Classify translated images with a pretrained domain classifier
• Higher accuracy means outputs are more recognizable as the target domain
◼Density and Coverage (DC) [Naeem+, PMLR2020]
• Density estimates how well generated samples fall within the real-data manifold (fidelity)
• Coverage estimates how much of the real-data manifold the generated samples reach (diversity)
◼Image-to-Image translation (I2I)
II. Two-domain I2I, Multi-domain I2I
i. Two-Domain Image-To-Image Translation
Two-Domain Image-To-Image Translation
◼Representative Two-Domain I2I tasks
• Image style transfer [Zhu+, ICCV2017]
• Semantic segmentation [Park+, CVPR2017]
• Image colorization [Suárez+, CVPR2017]
◼Two-Domain I2I, grouped by supervision
• Supervised
• Trained on paired images from the two domains
• Unsupervised
• Trained on unpaired image collections from each domain
• Semi-supervised
• Uses a few paired images together with many unpaired ones
• Few-shot
• Learns from only a handful of training images
(Figures: image style transfer, image colorization)
Supervised I2I: Single-modal Output
◼Single-modal Output
• Each input image is mapped to a single output image
◼Pix2pix [Isola+, CVPR2017]
• A conditional GAN trained on paired images
• Adds an 𝐿1 reconstruction loss against the ground truth (see the sketch below)
◼Pix2PixHD [Wang+, CVPR2018]
• Scales synthesis to high resolutions up to 2048×1024 with coarse-to-fine generators and multi-scale discriminators
◼SPADE [Park+, CVPR2019]
• Spatially-adaptive normalization injects the semantic layout into the normalization layers
(Figure: SPADE [Park+, CVPR2019])
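A minimal sketch of the Pix2pix generator objective (adversarial term plus weighted 𝐿1; λ = 100 in the paper); G and D are assumed defined elsewhere:

```python
import torch
import torch.nn.functional as F

LAMBDA_L1 = 100.0  # weight used in the Pix2pix paper

def pix2pix_g_loss(D, G, x, y_real):
    y_fake = G(x)
    # The discriminator is conditioned on the input image x
    logits = D(x, y_fake)
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    recon = F.l1_loss(y_fake, y_real)  # pulls outputs toward the paired ground truth
    return adv + LAMBDA_L1 * recon
```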
Supervised I2I: Multimodal Output
◼Multimodal Outputs
• A single input can correspond to many plausible outputs
• The goal is to model this one-to-many distribution instead of a single deterministic mapping
◼BicycleGAN [Zhu+, NeurIPS2017]
• Combines cVAE-GAN [Hinton & Salakhutdinov, Science2006] and
cLR-GAN [Chen+, NeurIPS2016]
• Enforces a bijection between latent codes and outputs to avoid mode collapse
• Produces diverse yet realistic I2I results
◼PixelNN [Bansal+, ICLR2018]
• A coarse CNN prediction is refined by pixel-wise nearest-neighbor matching against training exemplars
• Yields multiple controllable outputs, one per matched exemplar
(Figure: PixelNN [Bansal+, ICLR2018])
Unsupervised I2I: Single-modal Output
◼Translation using a Cycle-consistency Constraint
• No paired supervision: translating to the other domain and back should recover the original image
• The cycle-consistency loss penalizes the difference between an input and its round-trip reconstruction (see the sketch below)
• CycleGAN [Zhu+, ICCV2017]
• Trains two generator/discriminator pairs, one per translation direction
• U-GAT-IT [Kim+, ICLR2019]
• Adds an attention module that focuses on domain-discriminative regions
• AdaLIN adaptively mixes instance and layer normalization
(Figure: CycleGAN [Zhu+, ICCV2017])
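A minimal sketch of the cycle-consistency term, with the two generators G_xy and G_yx assumed defined elsewhere; in CycleGAN this is added to the two adversarial losses:

```python
import torch.nn.functional as F

def cycle_loss(G_xy, G_yx, x, y):
    # Round trips: x -> G_xy(x) -> G_yx(G_xy(x)) and the symmetric path for y
    x_cyc = G_yx(G_xy(x))
    y_cyc = G_xy(G_yx(y))
    # An L1 penalty keeps translations content-preserving without paired data
    return F.l1_loss(x_cyc, x) + F.l1_loss(y_cyc, y)
```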
Unsupervised I2I: Single-modal Output
◼Translation beyond the Cycle-consistency Constraint
• Cycle consistency forces the output to retain enough information to reconstruct the input, which can be overly restrictive
• TraVeLGAN [Amodio & Krishnaswamy, CVPR2019]
• Replaces cycle consistency with a siamese network that preserves vector relations between image pairs across domains
• CUT [Park+, ECCV2020]
• One-sided translation trained with a patchwise contrastive loss
• Corresponding input/output patches are pulled together in feature space, other patches pushed apart (see the sketch below)
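A minimal sketch of the patchwise contrastive (InfoNCE) idea behind CUT, assuming patch features are already extracted; CUT's MLP projection heads and multi-layer sampling are omitted:

```python
import torch
import torch.nn.functional as F

def patch_nce_loss(feat_out, feat_in, tau=0.07):
    # feat_out: (N, C) features of N output patches
    # feat_in:  (N, C) features of the N input patches at the same locations
    feat_out = F.normalize(feat_out, dim=1)
    feat_in = F.normalize(feat_in, dim=1)
    logits = feat_out @ feat_in.t() / tau     # (N, N) similarity matrix
    targets = torch.arange(feat_out.size(0))  # patch i matches patch i
    return F.cross_entropy(logits, targets)
```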
Unsupervised I2I: Single-modal Output
◼Translation of Fine-grained Objects
• Attention mechanisms restrict the translation to the relevant object regions and leave the background intact
◼DAGAN [Ma+, CVPR2018]
◼Attention GAN
[Chen+, ECCV2018]
◼Attention-guided GAN
[Mejjati+, NeurIPS2017]
Unsupervised I2I: Single-modal Output
◼Translation by combining knowledge from other fields
• Imports techniques from neighboring areas into unsupervised I2I
◼RevGAN [Ouderaa & Worrall, CVPR2019]: invertible (reversible) networks
◼Art2Real [Tomei+, CVPR2019]: retrieves realistic details from memory banks of real images
◼GDWCT [Cho+, CVPR2019]: group-wise whitening-and-coloring transform from style transfer
◼NICE-GAN [Chen+, CVPR2020]: reuses part of the discriminator as the encoder
(Figure: Art2Real [Tomei+, CVPR2019])
Unsupervised I2I: Multimodal Output
◼CycleGAN-based extensions
• DSVIB [Kazemi+, NeurIPS2018]
• Augmented CycleGAN [Almahairi+, PMLR2018]
◼Disentangled representations
• Split each image into a domain-invariant content code and a domain-specific style code
• cd-GAN [Lin+, CVPR2018]
• MUNIT [Huang+, ECCV2018]
• DRIT [Lee+, ECCV2018]
• EGSC-IT [Ma+, ICLR2018]
◼Instance-level and domain-specific mappings
• INIT [Shen+, CVPR2019]
• DSMAP [Chang+, ECCV2020]
(Figure: DRIT [Lee+, ECCV2018])
Semi-supervised Image-to-Image Translation
◼TCR-SSIT [Mustafa & Mantiuk, ECCV2020]
• Transformation Consistency Regularization (TCR)
• Predictions on unlabeled data must stay consistent under input transformations, which lets unlabeled images supervise training (see the sketch below)
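A minimal sketch of the transformation-consistency idea on unlabeled images; the concrete transform set and loss weighting in TCR-SSIT differ from this toy version:

```python
import torch
import torch.nn.functional as F

def tcr_loss(model, x_unlabeled, transform):
    # The output of a transformed input should equal the transformed output
    y = model(x_unlabeled)
    y_t = model(transform(x_unlabeled))
    return F.l1_loss(y_t, transform(y))

# Example transform: a horizontal flip (geometric ops commute with I2I outputs)
hflip = lambda t: torch.flip(t, dims=[-1])
```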
Few-Shot Image-to-Image Translation
◼Learning I2I when only a few training images are available
◼Transferring GAN
[Wang+, ECCV2018]
• Pretrains a GAN on a large source dataset
• Transfers (fine-tunes) it to the small target dataset
◼MT-GAN [Lin+, arXiv2019]
• Meta-trains across many I2I tasks
• Adapts to a new translation task from a handful of examples
One-shot Image-to-Image Translation
◼Translation with a single source-domain image
• OST [Benaim & Wolf, NeurIPS2018]
• One image in the source domain plus an unpaired collection in the target domain
• Shares encoder layers across domains and selectively back-propagates to avoid overfitting the single image
• BiOST [Cohen & Wolf, ICCV2019]
• Extends OST to translate in both directions
◼Translation with a single image from each domain
• TuiGAN [Lin+, ECCV2020]
• Trains on exactly one unpaired image per domain
• Translates progressively in a coarse-to-fine manner across scales
◼Image-to-Image translation (I2I)
II. Two-domain I2I, Multi-domain I2I
ii. Multi-Domain Image-To-Image Translation
Unsupervised I2I: Single-modal Output
◼Training with multi-modules
• Building multi-domain translation out of two-domain I2I needs a generator pair per domain pair, which scales poorly
• Multi-module designs share components across domains instead
• Domain-Bank [Hui+, ICPR2018]
• Holds n domain-specific encoder/decoder pairs over a shared latent space
• ModularGAN [Zhao+, ECCV2018]
• Composes reusable translation modules at test time
Training with one generator and discriminator pair
◼Training with one generator and discriminator pair
• A single generator/discriminator pair serves all domains
• The target domain is supplied to the generator as an extra input
• StarGAN [Choi+, CVPR2018]
• Conditions the generator on a target-domain label
• An auxiliary domain classifier on the discriminator enforces the requested domain (see the sketch below)
• AttGAN [He+, IEEE TIP2019]
• Edits facial attributes under an attribute-classification constraint
• A reconstruction loss preserves attribute-independent details
• RelGAN [Wu+, ICCV2019], STGAN [Liu+, CVPR2019]
• RelGAN conditions on the relative difference between source and target attributes
• STGAN adds selective transfer units to an encoder-decoder for attribute editing
(Figure: StarGAN, RelGAN)
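A minimal sketch of StarGAN-style target-domain conditioning; the shapes and the auxiliary-classifier loss follow the paper's idea, but the helper names are illustrative:

```python
import torch
import torch.nn.functional as F

def with_domain_label(x, label):
    # Tile the one-hot target-domain label over the spatial grid and
    # concatenate it to the image as extra input channels for the generator
    n, _, h, w = x.shape
    label_map = label.view(n, -1, 1, 1).expand(n, label.size(1), h, w)
    return torch.cat([x, label_map], dim=1)

def domain_cls_loss(cls_logits, target_label):
    # Auxiliary classifier head on the discriminator: generated images
    # must be classified as the requested target domain
    return F.binary_cross_entropy_with_logits(cls_logits, target_label)
```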
Training by combining knowledge in other fields
◼Training by combining knowledge in other fields
• Brings techniques from outside I2I into multi-domain translation
• Fixed-Point GAN [Siddiquee+, ICCV2019]
• A fixed-point constraint: translating an image to its own domain must return it unchanged
• SGN [Chang+, ICCV2019]
• Introduces the sym-parameter, a user-controlled weight that mixes multiple objectives and domains
• INIT [Cao+, ECCV2020]
• ADSPM [Wu+, ICCV2019]
• Attribute-driven spontaneous motion for face attribute editing
(Figure: SGN structure)
Unsupervised I2I: Multimodal Output
◼DosGAN [Lin+, arXiv2019]
• Pretrains a CNN domain classifier and reuses it to extract domain-specific features
• Those features condition the generator, separating domain identity from content
◼GANimation [Pumarola+, ECCV2018]
• Conditions on continuous facial action units rather than discrete expression labels
• An attention mask limits edits to expression-relevant regions
◼ Disentanglement assumption
• UFDN [Liu+, NeurIPS2018]
• DMIT [Yu+, NeurIPS2019]
• StarGAN v2 [Choi+, CVPR2020]
• DRIT++ [Lee+, ECCV2018]
• GMM-UNIT [Liu+, arXiv2020]
Semi-Supervised Multi-domain I2I
◼SEMIT [Wang+, CVPR2020]
• Few-shot I2I that needs only a small number of labeled images
• Assigns pseudo-labels to unlabeled data with a noise-tolerant training scheme
◼AGUIT [Li+, arXiv2019]
1. Assigns pseudo-labels to unlabeled images via semi-supervised representation learning
2. Transfers style with AdaIN [Huang+, ICCV2017] (see the sketch below)
3. Refines results with a cycle-consistency loss and a feature-consistency loss
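AdaIN itself fits in a few lines; a sketch over NCHW feature maps, following the published formula:

```python
import torch

def adain(content, style, eps=1e-5):
    # AdaIN(x, y) = sigma(y) * (x - mu(x)) / sigma(x) + mu(y):
    # align channel-wise statistics of content features to the style features
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True) + eps
    return s_std * (content - c_mean) / c_std + s_mean
```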
Few-shot Multi-domain I2I
◼FUNIT [Liu+, ICCV2019]
• Separates a content encoder from a class (style) encoder
• Trains unsupervised I2I across many source classes
• Translates to an unseen class at test time from a few example images
◼COCO-FUNIT [Saito+, ECCV2020]
• Content-conditioned style encoder
• Mitigates the loss of input content observed in FUNIT
◼ZstGAN [Lin+, arXiv2019]
• Zero-shot I2I to domains with no training images
• Conditions on semantic attribute descriptions instead of example images
◼Image-to-Image translation (I2I)
II. Two-domain I2I, Multi-domain I2I
iii. Experimental evaluation
Experimental Evaluation
◼ Two-Domain I2I
• Pix2pix [Isola+, CVPR2017]
• BicycleGAN [Zhu+, NeurIPS2017]
• CycleGAN [Zhu+, ICCV2017]
• U-GAT-IT [Kim+, ICLR2019]
• GDWCT [Cho+, CVPR2019]
• CUT [Park+, ECCV2020]
• MUNIT [Huang+, ECCV2018]
◼ Multi-Domain I2I
• StarGAN [Choi+, CVPR2018]
• AttGAN [He+, IEEE TIP2019]
• STGAN [Liu+, CVPR2019]
• DosGAN [Lin+, arXiv2019]
• StarGANv2 [Choi+, CVPR2020]
◼UT-Zap50K [Yu & Grauman, CVPR2019]
• Shoe image dataset
• 49,826 images for training
• 200 images held out for testing
• Resized to 256×256
◼CelebA [Liu, ICCV2015]
• 202,599 face images
• 40 binary attribute labels, used as translation domains
• Split into train, val, and test at 8:1:1
• 178×218 images, center-cropped (see the preprocessing sketch below)
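A sketch of typical CelebA preprocessing with torchvision; the 178-pixel center crop matches the slide, while the 128-pixel target resolution is an assumption borrowed from common StarGAN-style setups:

```python
from torchvision import transforms

# Center-crop the 178x218 CelebA images, then resize and scale to [-1, 1]
celeba_tf = transforms.Compose([
    transforms.CenterCrop(178),
    transforms.Resize(128),   # target resolution is an assumption; models differ
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5] * 3, std=[0.5] * 3),
])
```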
◼Inception score (IS) [Salimans+, NeurIPS2016]
• Higher is better: measures quality and diversity
◼Fréchet inception distance (FID) [Heusel+, NeurIPS2017]
• Lower is better: measures realism against real-image statistics
◼LPIPS [Zhang+, CVPR2018]
• Average pairwise LPIPS over outputs measures diversity
• Higher means more diverse translations
Two-Domain I2I
(Tables: quantitative results for single-modal and multimodal models)
Two-Domain I2I
◼Single-modal
• CUT achieves the best FID and IS among the compared single-modal models
• StyleGAN
[Karras+, CVPR2019]
◼Multimodal
• The best model reaches an LPIPS diversity score of 0.047
Multi-Domain I2I
(Tables: quantitative results for single-modal and multimodal models)
Multi-Domain I2I
◼StarGANv2
• StarGANv2 obtains the best FID, IS, and LPIPS among the multi-domain models
• LPIPS diversity score of 0.336
◼Image-to-Image translation (I2I)
III. Applications and conclusion
i. Applications of I2I
ii. Summary
Application
◼
• [Park+, CVPR2019]
• [Yan+, ACM2017]
•
• [Isola+, CVPR2017]
• [Almahairi+, PMLR2018]
• [Mao+, CVPR2019]
• [Shocher+, CVPR2020]
• [Pathak+, CVPR2016]
• [Zheng+, AAAI2019]
◼Stylization and cartoon generation
• [Hicsinmez+, Image and Vision Computing, 2020]
• [Taigman+, ICLR2019]
◼
• [Zhu+, ICCV2017]
• [Kim+, ICML2017]
◼
• [Yi+, ICCV2017]
• [Park+, NeurIPS2020]
◼Image restoration and enhancement
• [Yuan+, CVPR2018]
• [Manakov+, DART2019]
• [Zhang+, TCSVT2020]
• [Engin+, CVPRW2019]
• [Madam+, ECCV2018]
◼Summary
• A survey organizing I2I methods, evaluation metrics, and applications
◼
•
◼AMT perceptual studies
• Human evaluation via Amazon Mechanical Turk (AMT)
• Workers judge which of two images looks more realistic, or whether an image is real or fake
• Complements automatic metrics, which do not always agree with human perception
Taxonomy of two-domain and multi-domain I2I
◼Inception score (IS) and Fréchet inception distance (FID)
• Two-Domain & Single-modal
•
• Two-Domain & Multimodal
• 19 FID
• Multi-Domain & Single-modal
• FID
• Multi-Domain & Multimodal
• 19
• FID