Image Translation with GAN
Presenter: Junho Cho
Junho Cho, Perception and Intelligence Lab, SNU 1
Problem statement of Image Translation
Learn a mapping G: S → T
that converts an image of source domain S to an image of target domain T
Image Translation: S and T are pair-wise labeled
Image Translation: S and T are not paired
Previously, Style Transfer (NeuralArt) was prominent
Perceptual Losses for Real-Time Style Transfer and Super-Resolution
But it depends largely on the textural information of a target style
How to learn more general Image Translation?
Generative Adversarial Network (GAN)!
Deep Convolutional GAN (DCGAN)
Two major problems of Image Translation
1. Convert to which domain?
• learn which " "?
2. How to learn the dataset?
• how to properly form dataset?
• pair-wise Supervised? or Unsupervised?
Today: presenting state-of-the-art (SOTA) image translation papers
- pix2pix: Image-to-Image Translation with Conditional Adversarial Networks (CVPR2017)
- Domain Transfer Network: Unsupervised Cross-Domain Image Generation (ICLR2017)
- CycleGAN: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
- DiscoGAN: Learning to Discover Cross-Domain Relations with Generative Adversarial Networks
1. Image-to-Image Translation with Conditional Adversarial Networks (pix2pix)
CVPR 2017
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros
Learn from pair-wise images of x and y
- BW & Color image
- Street Scene & Label
- Facade & Label
- Aerial & Map
- Day & Night
- Edges & Photo
The source image x and the target image (label) y are paired,
thus it is supervised learning.
Generator of pix2pix
G: {x, z} → y
where x: image and z: noise
Use a U-Net-shaped network
- known to be powerful at segmentation tasks
- uses spatial information from the features of the bottom layers
- uses dropout as the noise in the decoder part
Discriminator of pix2pix
Loss function
x: source image, y: target image, z: noise
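For reference, the objective as given in the pix2pix paper, writing x for the source image, y for the target image, z for the noise, and λ for the weight of the L1 term (the value of λ is left open here):

```latex
\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}\big[\log D(x, y)\big]
                         + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big]

\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_1\big]

G^{*} = \arg\min_{G} \max_{D} \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```

The adversarial term pushes G(x, z) to look realistic given x, while the L1 term keeps it close to the paired target y.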
Results
Do demo!
https://affinelayer.com/pixsrv/
2. Unsupervised Cross-Domain Image Generation (DTN)
ICLR 2017
Yaniv Taigman, Adam Polyak, Lior Wolf
Learn a mapping G: S → T
between two related domains, S and T,
without labels!
(Labels of images are usually expensive.)
Baseline model
D: discriminator, G: generator,
f: context encoder; outputs a 128-dim feature.
•
•
• f-constancy: do x and G(x) have similar context?
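1. R_GAN = max_D E_{x~S} log[1 - D(G(x))] + E_{x~T} log D(x)
2. R_CONST = E_{x~S} d(f(x), f(G(x)))
• d: distance metric, e.g. MSE
• f: "pretrained" context encoder with fixed parameters
• f can be pretrained with a classification task on S
• Minimize two risks: R_GAN and R_CONST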
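A minimal sketch of the f-constancy term, assuming it is the average distance d(f(x), f(G(x))) over source samples with d taken as MSE. The functions `toy_f` and `toy_G` are hypothetical stand-ins, not the networks from the paper:

```python
def mse(u, v):
    """Mean squared error between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def f_constancy_risk(f, G, source_batch, d=mse):
    """Average of d(f(x), f(G(x))) over a batch of source-domain samples."""
    return sum(d(f(x), f(G(x))) for x in source_batch) / len(source_batch)


# Hypothetical stand-ins: an "image" is a list of floats, its "context"
# is (mean, range), and the "generator" brightens every pixel by 1.
def toy_f(x):
    return [sum(x) / len(x), max(x) - min(x)]


def toy_G(x):
    return [v + 1.0 for v in x]


batch = [[0.0, 2.0], [1.0, 3.0]]
print(f_constancy_risk(toy_f, toy_G, batch))        # G changed the context
print(f_constancy_risk(toy_f, lambda x: x, batch))  # identity keeps the context
```

A generator that leaves the encoded context untouched drives this term to zero, which is what the constancy risk rewards.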
Experimentally, the baseline model did not produce desirable results.
Thus, a similar but more elaborate architecture is proposed.
Proposed "Domain Transfer Network (DTN)"
Two Differences from the Baseline
First, G = g∘f: the context encoder f now encodes x as f(x), and g then
generates from it: G(x) = g(f(x))
- g focuses on generating from the given context f(x)
Second, for x ∈ T, x is also encoded by f and G is applied
- f "pretrained on S" would not be as good on T as on S, but it is good enough for context encoding
- L_TID: G(x) should be similar to x
- D also takes G(x) for x ∈ T and performs ternary (3-class) classification (one real, two fakes)
Losses
D_1: generated from S? / D_2: generated from T? / D_3: sample from T?
Generator: Adversarial Loss L_GANG
Fool D into classifying G(x) as a sample from T
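The three-way discriminator and the generator's adversarial term can be sketched as cross-entropy terms on the class probabilities (D1, D2, D3) = (generated from S, generated from T, real sample from T). This is an illustrative sketch on hand-written probability vectors, not the training code of the paper:

```python
import math


def discriminator_loss(p_gen_from_s, p_gen_from_t, p_real_t):
    """Ternary cross-entropy for one triple of samples.

    Each argument is the (D1, D2, D3) probability vector that D assigns to
    G(x) with x from S, G(x) with x from T, and a real x from T.
    """
    return -(
        math.log(p_gen_from_s[0])    # should be called "generated from S"
        + math.log(p_gen_from_t[1])  # should be called "generated from T"
        + math.log(p_real_t[2])      # should be called "real sample from T"
    )


def generator_adversarial_loss(p_gen_from_s, p_gen_from_t):
    """The generator wants both of its outputs to be called real (class 3)."""
    return -(math.log(p_gen_from_s[2]) + math.log(p_gen_from_t[2]))


# An undecided discriminator spreads its probability evenly.
uniform = [1 / 3, 1 / 3, 1 / 3]
print(discriminator_loss(uniform, uniform, uniform))
print(generator_adversarial_loss(uniform, uniform))

# A discriminator that is fooled on both generated samples.
fooled = [0.05, 0.05, 0.90]
print(generator_adversarial_loss(fooled, fooled))
```

The generator's loss shrinks as D puts more probability on class 3 for generated images, which is the "fooling" described on the slide.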
Generator: L_CONST and L_TID, identity preserving
L_CONST: for x ∈ S, at the feature level
L_TID: for x ∈ T, at the pixel level
d and d_2 are both MSE in this work
• L_G is minimized over g
• L_D is minimized over D
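For reference, the loss terms as written in the DTN paper, with G = g∘f, s and t the training sets sampled from S and T, and D_i the probability D assigns to class i. The weights α, β, γ and the smoothing term L_TV come from the paper, not from these slides:

```latex
L_D = -\mathbb{E}_{x \in s} \log D_1\big(g(f(x))\big)
      -\mathbb{E}_{x \in t} \log D_2\big(g(f(x))\big)
      -\mathbb{E}_{x \in t} \log D_3(x)

L_{GANG} = -\mathbb{E}_{x \in s} \log D_3\big(g(f(x))\big)
           -\mathbb{E}_{x \in t} \log D_3\big(g(f(x))\big)

L_{CONST} = \sum_{x \in s} d\big(f(x), f(g(f(x)))\big)

L_{TID} = \sum_{x \in t} d_2\big(x, G(x)\big)

L_G = L_{GANG} + \alpha L_{CONST} + \beta L_{TID} + \gamma L_{TV}
```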
Experiments
1. Street View House Numbers (SVHN) → MNIST
2. Face → Emoji
In both cases, the S and T domains differ considerably
SVHN → MNIST
• f: 4 convs (64, 128, 256, 128 filters) / max pooling / ReLU
• input: RGB image / output: 128-dim vector
• f does not need to be a very powerful classifier
• achieves 4.95% error on the SVHN test set
• weaker on T: 23.92% error on MNIST
• learn the analogy from unlabeled examples
• g: inspired by DCGAN
• input: the SVHN-trained f's 128-D representation
• four blocks of deconv, BN, ReLU; tanh at the final layer
•
•
Evaluate DTN
Train a classifier on MNIST.
- Architecture same as f
- MNIST performance: 99.4% on the test set
Evaluate by testing the MNIST classifier on G(x) for x in the SVHN test set,
using the label of x.
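The evaluation protocol can be sketched as follows, assuming a classifier trained only on the target domain and source samples that carry labels. All names here (`transfer_accuracy`, `toy_G`, `toy_classifier`) are hypothetical and only illustrate the bookkeeping:

```python
def transfer_accuracy(G, target_classifier, source_samples, source_labels):
    """Fraction of source samples whose translation G(x) is assigned the
    source label by a classifier trained only on the target domain."""
    if len(source_samples) != len(source_labels):
        raise ValueError("samples and labels must have the same length")
    hits = sum(
        1
        for x, label in zip(source_samples, source_labels)
        if target_classifier(G(x)) == label
    )
    return hits / len(source_samples)


# Hypothetical stand-ins: a "digit image" is a (digit, style) pair, G swaps
# the style, and the classifier only understands the "mnist" style.
def toy_G(x):
    digit, _style = x
    return (digit, "mnist")


def toy_classifier(image):
    digit, style = image
    return digit if style == "mnist" else -1


samples = [(3, "svhn"), (7, "svhn"), (1, "svhn")]
labels = [3, 7, 1]
print(transfer_accuracy(toy_G, toy_classifier, samples, labels))
print(transfer_accuracy(lambda x: x, toy_classifier, samples, labels))
```

A high score means the translated images both look like the target domain and keep the digit identity of the source image.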
Unseen Digits
Study the ability of DTN to overcome the omission of a class in the training samples,
for example class '3'.
Ablation applied on
- training DTN, domain S
- training DTN, domain T
- training f
But '3' still appears when testing DTN! Compare the results.
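The ablation amounts to filtering one class out of a training set while leaving the test set untouched. A small sketch, with made-up placeholder data and a hypothetical helper `drop_class`:

```python
def drop_class(samples, labels, omitted):
    """Return copies of (samples, labels) without any example of `omitted`."""
    kept = [(x, y) for x, y in zip(samples, labels) if y != omitted]
    return [x for x, _ in kept], [y for _, y in kept]


# Hypothetical toy data: the strings stand in for digit images.
train_x = ["img_a", "img_b", "img_c", "img_d"]
train_y = [3, 8, 3, 1]
test_x, test_y = ["img_e", "img_f"], [3, 5]

ablated_x, ablated_y = drop_class(train_x, train_y, omitted=3)
print(ablated_x, ablated_y)  # training data no longer contains a '3'
print(test_y)                # the test set still contains a '3'
```

The same filter would be applied to the source set, the target set, or the data used to train f, depending on the ablation being run.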
(a) The input images. (b) Results of our DTN. (c) 3 was not in SVHN. (d) 3 was not in MNIST. (e) 3 was not shown in both SVHN and MNIST. (f) The digit 3 was not shown in SVHN, MNIST and during the training of f.
Domain Adaptation
S: labeled, T: unlabeled; we want to train a classifier for T
Train a k-NN classifier
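One plausible reading of this slide is: translate the labeled source samples with G, then use them as the training set of a k-NN classifier applied to target samples. The sketch below assumes that reading; the toy domains and `toy_G` are hypothetical:

```python
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Majority label among the k training points closest to `query`."""
    by_distance = sorted(
        (sum((a - b) ** 2 for a, b in zip(p, query)), label)
        for p, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def adapt_and_classify(G, source_points, source_labels, target_query, k=3):
    """Translate labeled source samples with G, then run k-NN in the target domain."""
    translated = [G(x) for x in source_points]
    return knn_predict(translated, source_labels, target_query, k)


# Hypothetical toy domains: the target domain is the source shifted by +10.
def toy_G(x):
    return [v + 10.0 for v in x]


source = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
labels = [0, 0, 0, 1, 1, 1]
print(adapt_and_classify(toy_G, source, labels, [10.2, 10.3]))
print(adapt_and_classify(toy_G, source, labels, [15.4, 15.1]))
```

No target labels are used anywhere: the labels travel with the source samples through G.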
Face → Emoji
• faces from Facescrub/CelebA
• emoji obtained from bitmoji.com; the dataset is not public
• preprocess the emoji with heuristics; align the faces
• f from the DeepFace pretrained network
• (Taigman et al. 2014), the authors' previous work
• f(x) is 256-dim
• outputs
• SR (Dong et al. 2015) to upscale the final output
Results !
choose via validation
The original style transfer cannot solve this task.
DTN can also perform style transfer.
DTN is more general than style transfer methods.
Limitations
• f can usually be trained in only one domain, thus asymmetric.
• Handle two domains differently.
• is bad.
• Bounded by f: needs a pre-trained context encoder.
• Is there any better way to learn context without pretraining?
• Any more tasks?
Conclusion
1. Demonstrates domain transfer as an unsupervised method.
• Can be generalized to various problems.
2. f-constancy to maintain context across domains S & T
3. Simple domain adaptation and good performance
• Inspiring work for future domain adaptation research
More open reviews at OpenReview.net
3. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (CycleGAN)
UC Berkeley (pix2pix upgrade)
&
Learning to Discover Cross-Domain Relations with Generative Adversarial Networks (DiscoGAN)
SK T-Brain
DiscoGAN & CycleGAN
Almost identical concept.
DiscoGAN came 15 days earlier. Low resolution ( )
CycleGAN has better qualitative results ( ) and quantitative experiments.
Difference from DTN
• No f-constancy. Does not need a pre-trained context encoder
• Only needs datasets S and T
DiscoGAN
Without cross-domain matching, a GAN suffers from mode collapse:
it learns a projection onto a mode in domain B, while the two domains have a one-to-one relation
Typical GAN issue: Mode collapse
Top: the ideal case; bottom: the mode-collapse failure case
Toy problem of 2-dim Gaussian mixture model
• 5 modes of domain A to 10 modes of domain B
GAN, GAN + const show injective mapping & mode collapse
DiscoGAN shows a bijective mapping and generates all 10 modes of B.
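One simple way to quantify mode collapse in such a toy setting is to count how many target modes are the nearest mode of at least one translated point. This diagnostic is an illustrative assumption, not the measurement used in the DiscoGAN paper, and the points below are made up:

```python
def covered_modes(points, modes):
    """Indices of the modes that are the nearest mode of at least one point."""
    def nearest(p):
        return min(
            range(len(modes)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(p, modes[i])),
        )
    return {nearest(p) for p in points}


# Hypothetical 2-D toy setup with three target modes.
modes_b = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
collapsed = [(0.1, -0.2), (0.3, 0.1), (-0.2, 0.2)]  # everything lands on one mode
spread = [(0.1, -0.2), (9.8, 0.3), (0.2, 10.1)]     # every mode is reached

print(len(covered_modes(collapsed, modes_b)), "of", len(modes_b), "modes covered")
print(len(covered_modes(spread, modes_b)), "of", len(modes_b), "modes covered")
```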
proposed DiscoGAN
CycleGAN makes a similar contribution on this point
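The idea shared by both methods can be sketched as a reconstruction (cycle-consistency) term: translating A → B → A should return the original sample, and likewise for B → A → B. The sketch assumes two generators `G_ab` and `G_ba` and an L1 distance (CycleGAN uses L1; the distance is treated as a free choice here). The toy generators are hypothetical stand-ins:

```python
def l1(u, v):
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u)


def cycle_consistency_loss(G_ab, G_ba, batch_a, batch_b, dist=l1):
    """dist(G_ba(G_ab(a)), a) averaged over A, plus dist(G_ab(G_ba(b)), b) over B."""
    forward = sum(dist(G_ba(G_ab(a)), a) for a in batch_a) / len(batch_a)
    backward = sum(dist(G_ab(G_ba(b)), b) for b in batch_b) / len(batch_b)
    return forward + backward


# Hypothetical toy generators on lists of floats.
def shift_up(x):
    return [v + 1.0 for v in x]


def shift_down(x):
    return [v - 1.0 for v in x]


def identity(x):
    return list(x)


batch_a = [[0.0, 1.0], [2.0, 3.0]]
batch_b = [[1.0, 2.0], [3.0, 4.0]]
print(cycle_consistency_loss(shift_up, shift_down, batch_a, batch_b))  # exact inverses
print(cycle_consistency_loss(shift_up, identity, batch_a, batch_b))    # not inverses
```

The term is zero only when the two generators undo each other, which discourages mapping many inputs onto a single output.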
Results
Code and more results at
https://github.com/SKTBrain/DiscoGAN
https://github.com/carpedm20/DiscoGAN-pytorch
CycleGAN
Uses more GAN techniques: LSGAN, and an image buffer of previously generated samples
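Both techniques can be sketched in a few lines. The buffer below keeps a history of generated images and, once full, returns an older sample half of the time; the default capacity of 50 follows the CycleGAN paper, while the 0.5 swap probability is an assumption modeled on common implementations. The least-squares losses replace the log terms of the standard GAN loss; some implementations scale them by 0.5, which is omitted here:

```python
import random


class ImageBuffer:
    """Keeps up to `capacity` previously generated images for the discriminator."""

    def __init__(self, capacity=50, rng=None):
        self.capacity = capacity
        self.images = []
        self.rng = rng if rng is not None else random.Random()

    def query(self, image):
        """Store `image` and return the image the discriminator should see."""
        if self.capacity <= 0:
            return image
        if len(self.images) < self.capacity:
            self.images.append(image)  # still filling up: pass through
            return image
        if self.rng.random() < 0.5:
            # Swap: hand back an older sample and remember the new one.
            idx = self.rng.randrange(self.capacity)
            old, self.images[idx] = self.images[idx], image
            return old
        return image  # otherwise use the fresh sample and do not store it


def lsgan_d_loss(d_real, d_fake):
    """Least-squares discriminator loss: real scores -> 1, fake scores -> 0."""
    real = sum((r - 1.0) ** 2 for r in d_real) / len(d_real)
    fake = sum(f ** 2 for f in d_fake) / len(d_fake)
    return real + fake


def lsgan_g_loss(d_fake):
    """Least-squares generator loss: fake scores -> 1."""
    return sum((f - 1.0) ** 2 for f in d_fake) / len(d_fake)


buffer = ImageBuffer(capacity=2, rng=random.Random(0))
shown = [buffer.query(name) for name in ["g1", "g2", "g3", "g4"]]
print(shown, buffer.images)
print(lsgan_d_loss([1.0, 1.0], [0.0, 0.0]), lsgan_g_loss([1.0, 1.0]))
```

Feeding the discriminator a mix of fresh and older fakes keeps it from adapting only to the generator's latest outputs.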
failure case
CycleGAN demonstrates more experiments!
Project page: https://junyanz.github.io/CycleGAN/
Code is available in Torch and PyTorch
Thank you!
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 

Image Translation with GAN

  • 1. Image Translation with GAN. Presenter: Junho Cho, Perception and Intelligence Lab, SNU
  • 2. Problem statement of Image Translation: learn a mapping that converts an image of a source domain to an image of a target domain
  • 3. Image Translation: source and target images are pair-wise labeled
  • 4. Image Translation: source and target images are not paired
  • 8. Before, Style Transfer (NeuralArt) was prominent
  • 11. Perceptual Losses for Real-Time Style Transfer and Super-Resolution
  • 12. But it largely depends on the textural information of a target style
  • 13. How to learn more general Image Translation?
  • 16. Deep Convolutional GAN (DCGAN)
  • 17. Two major problems of Image Translation: 1. Convert to which domain? • learn which mapping? 2. How to learn from the dataset? • how to properly form the dataset? • pair-wise supervised, or unsupervised?
  • 18. Today, presenting SOTA Image Translation papers: - pix2pix: Image-to-Image Translation with Conditional Adversarial Networks (CVPR2017) - Domain Transfer Network: Unsupervised Cross-Domain Image Generation (ICLR2017) - CycleGAN: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks - DiscoGAN: Learning to Discover Cross-Domain Relations with Generative Adversarial Networks
  • 19. 1. Image-to-Image Translation with Conditional Adversarial Networks (pix2pix), CVPR2017. Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros
  • 23. Learn from pair-wise images of the source and target domains: - BW & Color image - Street Scene & Label - Facade & Label - Aerial & Map - Day & Night - Edges & Photo. The source image and the target image (label) are pair-wise, thus it is Supervised Learning
  • 24. Generator of pix2pix: takes an image and noise as input. Uses a U-Net-shaped network - known to be powerful at segmentation tasks - uses spatial information from features of the bottom layers - uses dropout as noise in the decoder part
  • 25. Discriminator of pix2pix
  • 26. Loss function, defined over the source image, the target image, and noise
  • 27. Results
  • 33. Do demo! https://affinelayer.com/pixsrv/
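
As a rough illustration of the pix2pix loss on slide 26, the sketch below combines a conditional adversarial term with an L1 reconstruction term, following the objective in the pix2pix paper. The function names, the flat-list image representation, and the default `l1_weight` are assumptions made for this sketch, not code from the talk. The generator term uses the minimax form log(1 - D(x, G(x, z))) rather than the non-saturating variant often used in practice.

```python
import math

EPS = 1e-12  # guards against log(0)


def mean(values):
    return sum(values) / len(values)


def discriminator_loss(d_real_probs, d_fake_probs):
    """Negative log-likelihood of scoring real pairs as 1 and generated pairs as 0.

    Both arguments are discriminator outputs in (0, 1), one per (input, output) pair.
    """
    real_term = mean([math.log(max(p, EPS)) for p in d_real_probs])
    fake_term = mean([math.log(max(1.0 - p, EPS)) for p in d_fake_probs])
    return -(real_term + fake_term)


def generator_loss(d_fake_probs, fake_pixels, target_pixels, l1_weight=100.0):
    """Minimax adversarial term plus a weighted mean absolute error to the target."""
    adversarial = mean([math.log(max(1.0 - p, EPS)) for p in d_fake_probs])
    l1 = mean([abs(f - t) for f, t in zip(fake_pixels, target_pixels)])
    return adversarial + l1_weight * l1


# Toy usage with hand-picked numbers (hypothetical values, not real model outputs).
d_loss = discriminator_loss(d_real_probs=[0.5], d_fake_probs=[0.5])
g_loss = generator_loss(d_fake_probs=[0.5],
                        fake_pixels=[0.0, 1.0],
                        target_pixels=[0.0, 0.0])
```

The L1 term ties the generated image to its paired target, which is only possible because the pix2pix dataset is pair-wise labeled.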
  • 34. 2. Unsupervised Cross-Domain Image Generation (DTN), ICLR2017. Yaniv Taigman, Adam Polyak, Lior Wolf
  • 35. Learn a mapping between two related domains, source and target, without labels! (labels of images are usually expensive)
  • 37. Baseline model. D: discriminator, G: generator, f: context encoder. f outputs a feature (128-dim)
  • 38. f-constancy: do the input and its translation have similar context?
  • 39. • d: distance metric, e.g. MSE • f: "pretrained" context encoder with fixed parameters • f can be pretrained with a classification task on the source domain • Minimize two risks: the adversarial risk and the f-constancy risk
  • 40. Experimentally, the baseline model didn't produce desirable results. Thus, a similar but more elaborate architecture is proposed
  • 41. Proposed "Domain Transfer Network (DTN)"
  • 42. Two differences from the baseline. First, G = g ∘ f: the context encoder f now encodes the input as f(x), then g generates from it: G(x) = g(f(x)) - g focuses on generating from the given context
  • 43. Two differences from the baseline. Second, a target-domain sample is also encoded by f and passed through g - f pretrained on the source domain would not be as good on the target domain, but it is enough for context-encoding purposes - for a target-domain sample, the output of G should be similar to the input - D also takes this output and performs ternary (3-class) classification (one real, two fakes)
  • 44. Losses
  • 45. Three discriminator classes: generated from a source-domain sample? / generated from a target-domain sample? / real sample from the target domain?
  • 46. Generator: adversarial loss. Fool D into classifying generated images as samples from the target domain
  • 47. Generator: f-constancy and identity preserving. f-constancy at the feature level, identity preservation at the pixel level; both distances are MSE in this work
  • 48. One loss is minimized over the generator g, the other over the discriminator D
  • 49. Experiments: 1. Street View House Numbers (SVHN) → MNIST 2. Face → Emoji. In both cases, the source and target domains differ considerably
  • 50. SVHN → MNIST
  • 51. Context encoder f: • 4 convs (64, 128, 256, 128 filters each) / max pooling / ReLU • input RGB / output 128-dim vector • does not need to be a very powerful classifier • achieves 4.95% error on the SVHN test set • weaker on the target domain: 23.92% error on MNIST • Learn analogy of unlabeled examples
  • 52. Generator g: • Inspired by DCGAN • takes the SVHN-trained f's 128D representation • four blocks of deconv, BN, ReLU; TanH at the final layer
  • 54. Evaluate DTN: train a classifier on MNIST - Architecture same as f - MNIST performance: 99.4% on the test set. Evaluate by testing the MNIST classifier on the translated SVHN images, using the SVHN labels
  • 56. Unseen digits: study the ability of DTN to overcome the omission of a class in the samples, for example class '3'. Ablation applied on - training DTN, source domain - training DTN, target domain - training f. But '3' exists when testing DTN! Compare results.
  • 57. (a) The input images. (b) Results of our DTN. (c) 3 was not in SVHN. (d) 3 was not in MNIST. (e) 3 was not shown in both SVHN and MNIST. (f) The digit 3 was not shown in SVHN, MNIST and during the training of f.
  • 59. Domain adaptation: the source domain is labeled, the target domain is unlabeled, and we want to train a classifier for the target domain. Train a k-NN classifier
  • 60. Face → Emoji: • faces from Facescrub/CelebA • emoji obtained from bitmoji.com, not publicly released • preprocess emoji with heuristics; align faces • f from the DeepFace pretrained network (Taigman et al. 2014), the author's previous work • the output of f is 256-dim • SR (Dong et al. 2015) to upscale the final output of g
  • 61. Results! Hyperparameters chosen via validation
  • 62. Original style transfer can't solve it. DTN can also do style transfer. DTN is more general than the style transfer method.
  • 63. Limitations • f can usually be trained in only one domain, thus asymmetric • Handles the two domains differently • The reverse mapping is bad • Bounded by f: needs a pre-trained context encoder • Any better way to learn context without pretraining? • Any more tasks?
  • 64. Conclusion: 1. Demonstrates domain transfer as an unsupervised method • Can be generalized to various problems 2. f-constancy to maintain the context across the source and target domains 3. Simple domain adaptation with good performance • Inspiring work for future domain adaptation research. More open reviews at OpenReview.net
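
The two generator-side consistency terms from slides 38 and 47 can be sketched as follows, assuming G(x) = g(f(x)) and MSE for both distances, as the slides state. The toy `f` and `g` below are placeholders invented for illustration; in the talk f is a pretrained context encoder and g a DCGAN-style decoder. The adversarial term and the loss weights are left out of this sketch.

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


def dtn_consistency_losses(f, g, source_batch, target_batch):
    """Return (f-constancy loss, identity loss), each averaged over its batch.

    f-constancy: features of a source sample should survive translation,
                 i.e. f(x) should be close to f(G(x)).
    identity:    a target sample should be mapped (nearly) to itself,
                 i.e. G(x) should be close to x.
    """
    def G(x):
        return g(f(x))

    l_const = sum(mse(f(s), f(G(s))) for s in source_batch) / len(source_batch)
    l_tid = sum(mse(t, G(t)) for t in target_batch) / len(target_batch)
    return l_const, l_tid


# Hypothetical stand-ins: f averages the pixels into a 1-dim "feature",
# g paints a 4-pixel image with that single value.
def toy_f(x):
    return [sum(x) / len(x)]


def toy_g(z):
    return [z[0]] * 4


l_const, l_tid = dtn_consistency_losses(
    toy_f, toy_g,
    source_batch=[[1.0, 2.0, 3.0, 4.0]],
    target_batch=[[0.0, 0.0, 2.0, 2.0]],
)
```

With these toy functions the feature survives translation exactly, so the f-constancy term is zero, while the identity term is positive because the toy decoder cannot reproduce the target image.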
  • 65. 3. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (CycleGAN), UC Berkeley (pix2pix upgrade) & Learning to Discover Cross-Domain Relations with Generative Adversarial Networks (DiscoGAN), SK T-Brain
  • 66. DiscoGAN & CycleGAN: almost identical concept. DiscoGAN came 15 days earlier, at low resolution. CycleGAN has better qualitative results and quantitative experiments. Difference from DTN: • No f-constancy; no pre-trained context encoder needed • Only the datasets of the two domains are needed
  • 67. DiscoGAN
  • 68. DiscoGAN
  • 69. Without cross-domain matching, a GAN suffers mode collapse: it learns a projection onto a mode in the other domain, while the two domains have a one-to-one relation
  • 70. Typical GAN issue: mode collapse. Top is the ideal case, bottom is the mode-collapse failure case
  • 71. Toy problem with a 2-dim Gaussian mixture model • 5 modes of domain A to 10 modes of domain B • GAN and GAN + const show injective mapping & mode collapse • DiscoGAN shows bijective mapping & generates all 10 modes of B
  • 73. Proposed DiscoGAN
  • 74. CycleGAN has a similar contribution on this point
  • 76. Results
  • 78. Code and more results at https://github.com/SKTBrain/DiscoGAN https://github.com/carpedm20/DiscoGAN-pytorch
  • 79. CycleGAN uses more GAN techniques: LSGAN, and an image buffer of previously generated samples
  • 85. Failure case
  • 86. CycleGAN demonstrates more experiments! Project page: https://junyanz.github.io/CycleGAN/ Code available in Torch and PyTorch
  • 87. Thank you!
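
A minimal sketch of the reconstruction (cycle-consistency) idea shared by DiscoGAN and CycleGAN: translate a sample to the other domain and back, then penalise the difference from the original, in both directions. The L1 distance follows the CycleGAN paper; the function names and the toy mappings are assumptions made for illustration, and the adversarial losses are omitted.

```python
def l1(a, b):
    """Mean absolute error between two equal-length vectors."""
    return sum(abs(u - v) for u, v in zip(a, b)) / len(a)


def cycle_consistency_loss(g_ab, g_ba, batch_a, batch_b):
    """Sum of the A -> B -> A and B -> A -> B reconstruction errors.

    g_ab maps a domain-A sample to domain B; g_ba maps the other way.
    """
    forward = sum(l1(g_ba(g_ab(a)), a) for a in batch_a) / len(batch_a)
    backward = sum(l1(g_ab(g_ba(b)), b) for b in batch_b) / len(batch_b)
    return forward + backward


# Hypothetical mappings on 2-pixel "images".
def double(x):
    return [2.0 * v for v in x]


def halve(x):
    return [v / 2.0 for v in x]


def halve_plus_one(x):  # not the inverse of `double`, so cycles do not close
    return [v / 2.0 + 1.0 for v in x]


batch_a = [[1.0, 2.0]]
batch_b = [[4.0, 8.0]]

closed = cycle_consistency_loss(double, halve, batch_a, batch_b)
broken = cycle_consistency_loss(double, halve_plus_one, batch_a, batch_b)
```

When the two mappings are exact inverses the loss is zero; any mismatch in either direction raises it, which is what discourages both generators from collapsing many inputs onto one output.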