SlideShare a Scribd company logo
1 of 60
Download to read offline
Recent Progress on Utilizing Tag
Information with GANs
StarGAN & TD-GAN
Hao-Wen Dong
Dec 26, 2017
Research Center for IT Innovation, Academia Sinica
Table of contents
1. Introduction
2. StarGAN
3. TD-GAN
4. Conclusions & Discussions
1
Introduction
Main Idea
How to utilize tag information?
2
Straightforward Approach
Fig. 1: Girl Friend Factory∗: generating anime characters with
specific attributes by conditional GANs
∗https://hiroshiba.github.io/girl_friend_factory/index.html 3
Inspiring Viewpoints
StarGAN [1]
Key Assumption: images of different tags can be viewed
as different domains
Approach: multi-domain image-to-image translation
Paper:
Yunjey Choi, Minje Choi, Munyoung Kim, Jung-Woo Ha, Sunghun
Kim, and Jaegul Choo, “StarGAN: Unified Generative
Adversarial Networks for Multi-Domain Image-to-Image
Translation”, arXiv preprint arXiv:1711.09020, 2017.
4
Inspiring Viewpoints
TD-GAN [2]
Key Assumption: images and tags record the same object
from two different perspectives
Approach: enforce consistency between learned
disentangled representations for images and tags
Paper:
Chaoyue Wang, Chaohui Wang, Chang Xu, and Dacheng Tao,
“Tag Disentangled Generative Adversarial Networks for Object
Image Re-rendering”, in Proc. 36th Int. Joint Conf. on Artificial
Intelligence (IJCAI), 2017.
5
StarGAN
StarGAN - Motivation
Fig. 2: CycleGAN [3]: Cycle-Consistent Adversarial Networks
6
StarGAN - Motivation
Fig. 3: Comparison of cross-domain and multi-domain models
7
StarGAN - Qualitative Results
Fig. 4: Results on CelebA
8
StarGAN - Qualitative Results
Fig. 5: Results on CelebA via transferring knowledge learned
from RaFD
9
StarGAN - System Overview
Fig. 6: System overview
10
StarGAN - Formulation
G : (x, c) → y
D : x → (Dsrc(x), Dcls(x))
x : input image, c : domain label, y : output image
11
StarGAN - Objective Functions
Adversarial Loss
Ladv = Ex[log Dsrc(x)] + Ex,c[log(1 − Dsrc(G(x, c)))]
12
StarGAN - Objective Functions
Adversarial Loss
Ladv = Ex[log Dsrc(x)] + Ex,c[log(1 − Dsrc(G(x, c)))]
Domain Classification Loss
Lr
cls = Ex,c′ [− log Dcls(c′
|x)] (real images)
Lf
cls = Ex,c[− log Dcls(c|G(x, c))] (fake images)
12
StarGAN - Objective Functions
Adversarial Loss
Ladv = Ex[log Dsrc(x)] + Ex,c[log(1 − Dsrc(G(x, c)))]
Domain Classification Loss
Lr
cls = Ex,c′ [− log Dcls(c′
|x)] (real images)
Lf
cls = Ex,c[− log Dcls(c|G(x, c))] (fake images)
Reconstruction Loss
Lrec = Ex,c,c′ [||x − G(G(x, c), c′
)||1]
12
StarGAN - Objective Functions
Full Objective Functions
LD = −Ladv + λclsLr
cls
LG = −Ladv + λclsLf
cls + λrecLrec
13
StarGAN - Qualitative Results
Fig. 7: Qualitative comparison among different models
14
StarGAN - Multiple Datasets
CelebA dataset
(binary) Black, Blond, Brown, Male, Young, etc.
RaFD dataset
(categorical) Angry, Fearful, Happy, Sad, Disgusted, etc.
15
StarGAN - Multiple Datasets
CelebA dataset
(binary) Black, Blond, Brown, Male, Young, etc.
RaFD dataset
(categorical) Angry, Fearful, Happy, Sad, Disgusted, etc.
Fig. 8: Label vectors and mask vector
15
StarGAN - Training on Multiple Datasets
Fig. 9: System overview - multiple datasets (I)
16
StarGAN - Training on Multiple Datasets
Fig. 9 (cont.): System overview - multiple datasets (II)
17
StarGAN - Qualitative Results
Fig. 10: Results on CelebA via transferring knowledge learned
from RaFD
18
StarGAN - Mask Vector
Fig. 11: Learned role of mask vector
19
StarGAN - Quantitative Experiments
Amazon Mechanical Turk (AMT)
vote for the best generated im-
ages based on:
• perceptual realism
• quality of transfer in
attribute(s)
• preservation of a figure’s
original identity
20
StarGAN - Quantitative Results
Method Hair color Gender Aged
DIAT 9.3% 31.4% 6.9%
CycleGAN 20.0% 16.6% 13.3%
IcGAN 4.5% 12.9% 9.2%
StarGAN 66.2 39.1 70.6
Table 1: AMT perceptual evaluation (by votes)
21
StarGAN - Quantitative Results
Method H+G H+A G+A H+G+A
DIAT 20.4% 15.6% 18.7% 15.6%
CycleGAN 14.0% 12.0% 11.2% 11.9%
IcGAN 18.2% 10.9% 20.3% 20.3%
StarGAN 47.4 61.5 49.8 52.2
Table 1 (cont.): AMT perceptual evaluation (by votes)
22
TD-GAN
TD-GAN - Motivation & Goals
Key Assumption
the image and its tags record the same object from two
different perspectives, so they should share the same
disentangled representations
Goals
• to extract disentangled and interpretable
representations for both image and its tags
• to explore the consistency between the image and
its tags by integrating the tag mapping net
23
TD-GAN - Qualitative Results
Fig. 12: Multi-factor transformation
24
TD-GAN - System Overview
Disentangling network R
x → R(x)
image → disentangled representations
25
TD-GAN - System Overview
Disentangling network R
x → R(x)
image → disentangled representations
Tag mapping net g
C → g(C)
tag code → disentangled representations
25
TD-GAN - System Overview
Generative network G
g(C) or R(x) → G(g(C)) or G(R(x))
disentangled representations → re-rendered image
26
TD-GAN - System Overview
Generative network G
g(C) or R(x) → G(g(C)) or G(R(x))
disentangled representations → re-rendered image
Discriminative network D
adversarial training with G and R
26
TD-GAN - System Overview
Fig. 13: System overview
27
TD-GAN - Formulation
Let the tagged training dataset be
XL
= {(x1, C1), . . . , (x|XL|, C|XL|)} ,
where tag codes
Ci = (cide
i , cview
i , cexp
i , . . .)
and cide
i , cview
i , cexp
i , . . . are one-hot encoding vectors.
Let the untagged training dataset be XU
.
28
TD-GAN - Objective Functions
Discrepency between disentangled representations
f1(R, g) =
1
|XL|
∑
xi∈XL
||R(xi) − g(Ci)||2
2
29
TD-GAN - Objective Functions
Discrepency between disentangled representations
f1(R, g) =
1
|XL|
∑
xi∈XL
||R(xi) − g(Ci)||2
2
Discrepency between real and rendered images
f2(G, g) =
1
|XL|
∑
xi∈XL
||G(g(Ci)) − xi||2
2
29
TD-GAN - Objective Functions
Discrepency between disentangled representations
f1(R, g) =
1
|XL|
∑
xi∈XL
||R(xi) − g(Ci)||2
2
Discrepency between real and rendered images
f2(G, g) =
1
|XL|
∑
xi∈XL
||G(g(Ci)) − xi||2
2
29
TD-GAN - Objective Functions
Reconstruction Loss for untagged images
˜f1(G, R) =
1
|XU |
∑
xi∈XU
||G(R(xi)) − xi||2
2
Adversarial Loss
f3(R, G, D) = E[log D(x)] + E[log(1 − D(G(R(x))))]
30
TD-GAN - Objective Functions
Full Objective Functions
LR = min
R
λ1f1(R, g∗
) + λ3f3(R, G∗
, D∗
)
Lg = min
g
λ1f1(R∗
, g) + λ2f2(G∗
, g)
LG = min
G
λ2f2(G, g∗
) + λ3f3(R∗
, G, D∗
)
LD = max
D
λ3f3(R∗
, G∗
, D)
(R∗
, g∗
, G∗
, D∗
are fixed as the configurations
obtained from the previous iteration)
31
TD-GAN - Qualitative Results
Fig. 14: Novel view synthesis results on 3D-chairs dataset 32
TD-GAN - Semi-supervised Extension
Test three representative training settings:
• fully-500: all the 500 training models and their tags
• fully-100: only the first 100 models and their tags
• semi-(100,400): all the 500 training models but only
the tags of the first 100 models
33
TD-GAN - Qualitative Results
Fig. 15: Novel view synthesis results trained in three settings
34
TD-GAN - Quantitative Results
Fig. 16: MSE of novel view synthesis results trained in three
settings (using testing chair images under 0◦ as inputs) 35
TD-GAN - Qualitative Results
Fig. 17: Illumination transformation of human face results on
Multi-PIE dataset
36
TD-GAN - Quantitative Results
XCov TD-GAN TD-GAN
Flash-1 0.5776 0.5623 0.5280
Flash-4 5.0692 0.8972 0.8818
Flash-6 4.3991 1.2509 0.1079
Flash-9 3.4639 0.6145 0.5870
Flash-12 2.4624 0.7142 0.6973
All Flash (mean) 3.8675 0.6966 0.6667
Table 2: MSE (×10−2) of illumination transformation results
37
TD-GAN - Possible Applications
• virtual reality systems
e.g. to naturally ‘paste’ persons into a virtual
environment by re-rendering of faces for the
continuous pose, illumination directions, and
various expressions
• architecture
• simulators
• video games
• movies
• visual effects
38
Conclusions & Discussions
Conclusions & Discussions
• Assumptions matter.
39
Conclusions & Discussions
• Assumptions matter. (are they reasonable?)
• images of different tags can be viewed as different
domains
• images and tags record the same object from two
different perspectives
39
Conclusions & Discussions
• Assumptions matter. (are they reasonable?)
• images of different tags can be viewed as different
domains
• images and tags record the same object from two
different perspectives
• How to utilize tag information?
39
Conclusions & Discussions
• Assumptions matter. (are they reasonable?)
• images of different tags can be viewed as different
domains
• images and tags record the same object from two
different perspectives
• How to utilize tag information? (plenty of ways)
• multi-domain image-to-image translation
• enforce consistency between learned disentangled
representations for images and tags
39
Conclusions & Discussions
• Assumptions matter. (are they reasonable?)
• images of different tags can be viewed as different
domains
• images and tags record the same object from two
different perspectives
• How to utilize tag information? (plenty of ways)
• multi-domain image-to-image translation
• enforce consistency between learned disentangled
representations for images and tags
• Does disentangling make sense?
39
Conclusions & Discussions
• Assumptions matter. (are they reasonable?)
• images of different tags can be viewed as different
domains
• images and tags record the same object from two
different perspectives
• How to utilize tag information? (plenty of ways)
• multi-domain image-to-image translation
• enforce consistency between learned disentangled
representations for images and tags
• Does disentangling make sense? (some trade-offs?)
• interpretability vs. effectiveness (efficiency)
• information underneath entangled factors
39
Questions?
ICGAN - System Overview
Fig. 18: ICGAN [4] - system overview
DC-IGN - System Overview
Fig. 19: DC-IGN [5] - system overview
References
[1] Y. Choi, M. Choi, M. Kim, J.-W. Ha, S. Kim, and J. Choo, “StarGAN: Unified generative
adversarial networks for multi-Domain image-to-image translation,” ArXiv
preprint arXiv:1711.09020, 2017.
[2] C. Wang, C. Wang, C. Xu, and D. Tao, “Tag disentangled generative adversarial
networks for object image Re-rendering,” in Proc. 36th Int. Joint Conf. on
Artificial Intelligence (IJCAI), 2017.
[3] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired image-to-image translation
using cycle-consistent adversarial networks,” in The IEEE International
Conference on Computer Vision (ICCV), 2017.
[4] G. Perarnau, J. van de Weijer, B. Raducanu, and J. M. Álvarez, “Invertible
conditional gans for image editing,” in Proc. NIPS 2016 Workshop on Adversarial
Training, 2016.
[5] T. D. Kulkarni, W. Whitney, P. Kohli, and J. B. Tenenbaum, “Deep convolutional
inverse graphics network,” in Advances in Neural Information Processing
Systems (NIPS) 28, 2015.

More Related Content

What's hot

Introduction of DiscoGAN
Introduction of DiscoGANIntroduction of DiscoGAN
Introduction of DiscoGANSeongcheol Baek
 
Generative adversarial network_Ayadi_Alaeddine
Generative adversarial network_Ayadi_AlaeddineGenerative adversarial network_Ayadi_Alaeddine
Generative adversarial network_Ayadi_AlaeddineDeep Learning Italia
 
ECCV2010: feature learning for image classification, part 4
ECCV2010: feature learning for image classification, part 4ECCV2010: feature learning for image classification, part 4
ECCV2010: feature learning for image classification, part 4zukun
 
ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...
ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...
ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...宏毅 李
 
A Walk in the GAN Zoo
A Walk in the GAN ZooA Walk in the GAN Zoo
A Walk in the GAN ZooLarry Guo
 
ECCV2010: feature learning for image classification, part 2
ECCV2010: feature learning for image classification, part 2ECCV2010: feature learning for image classification, part 2
ECCV2010: feature learning for image classification, part 2zukun
 
Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXBenjamin Bengfort
 
A Fast and Dirty Intro to NetworkX (and D3)
A Fast and Dirty Intro to NetworkX (and D3)A Fast and Dirty Intro to NetworkX (and D3)
A Fast and Dirty Intro to NetworkX (and D3)Lynn Cherny
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)ijceronline
 
Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Benjamin Bengfort
 
LSGAN - SIMPle(Simple Idea Meaningful Performance Level up)
LSGAN - SIMPle(Simple Idea Meaningful Performance Level up)LSGAN - SIMPle(Simple Idea Meaningful Performance Level up)
LSGAN - SIMPle(Simple Idea Meaningful Performance Level up)Hansol Kang
 
CS 354 Blending, Compositing, Anti-aliasing
CS 354 Blending, Compositing, Anti-aliasingCS 354 Blending, Compositing, Anti-aliasing
CS 354 Blending, Compositing, Anti-aliasingMark Kilgard
 
Matrix decomposition and_applications_to_nlp
Matrix decomposition and_applications_to_nlpMatrix decomposition and_applications_to_nlp
Matrix decomposition and_applications_to_nlpankit_ppt
 
CS 354 Acceleration Structures
CS 354 Acceleration StructuresCS 354 Acceleration Structures
CS 354 Acceleration StructuresMark Kilgard
 
CS 354 More Graphics Pipeline
CS 354 More Graphics PipelineCS 354 More Graphics Pipeline
CS 354 More Graphics PipelineMark Kilgard
 
Encryption Quality Analysis and Security Evaluation of CAST-128 Algorithm and...
Encryption Quality Analysis and Security Evaluation of CAST-128 Algorithm and...Encryption Quality Analysis and Security Evaluation of CAST-128 Algorithm and...
Encryption Quality Analysis and Security Evaluation of CAST-128 Algorithm and...IJNSA Journal
 

What's hot (20)

그림 그리는 AI
그림 그리는 AI그림 그리는 AI
그림 그리는 AI
 
Introduction of DiscoGAN
Introduction of DiscoGANIntroduction of DiscoGAN
Introduction of DiscoGAN
 
DiscoGAN
DiscoGANDiscoGAN
DiscoGAN
 
Generative adversarial network_Ayadi_Alaeddine
Generative adversarial network_Ayadi_AlaeddineGenerative adversarial network_Ayadi_Alaeddine
Generative adversarial network_Ayadi_Alaeddine
 
ECCV2010: feature learning for image classification, part 4
ECCV2010: feature learning for image classification, part 4ECCV2010: feature learning for image classification, part 4
ECCV2010: feature learning for image classification, part 4
 
ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...
ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...
ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...
 
Networkx tutorial
Networkx tutorialNetworkx tutorial
Networkx tutorial
 
A Walk in the GAN Zoo
A Walk in the GAN ZooA Walk in the GAN Zoo
A Walk in the GAN Zoo
 
ECCV2010: feature learning for image classification, part 2
ECCV2010: feature learning for image classification, part 2ECCV2010: feature learning for image classification, part 2
ECCV2010: feature learning for image classification, part 2
 
Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkX
 
A Fast and Dirty Intro to NetworkX (and D3)
A Fast and Dirty Intro to NetworkX (and D3)A Fast and Dirty Intro to NetworkX (and D3)
A Fast and Dirty Intro to NetworkX (and D3)
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)
 
LSGAN - SIMPle(Simple Idea Meaningful Performance Level up)
LSGAN - SIMPle(Simple Idea Meaningful Performance Level up)LSGAN - SIMPle(Simple Idea Meaningful Performance Level up)
LSGAN - SIMPle(Simple Idea Meaningful Performance Level up)
 
CS 354 Blending, Compositing, Anti-aliasing
CS 354 Blending, Compositing, Anti-aliasingCS 354 Blending, Compositing, Anti-aliasing
CS 354 Blending, Compositing, Anti-aliasing
 
Gan intro
Gan introGan intro
Gan intro
 
Matrix decomposition and_applications_to_nlp
Matrix decomposition and_applications_to_nlpMatrix decomposition and_applications_to_nlp
Matrix decomposition and_applications_to_nlp
 
CS 354 Acceleration Structures
CS 354 Acceleration StructuresCS 354 Acceleration Structures
CS 354 Acceleration Structures
 
CS 354 More Graphics Pipeline
CS 354 More Graphics PipelineCS 354 More Graphics Pipeline
CS 354 More Graphics Pipeline
 
Encryption Quality Analysis and Security Evaluation of CAST-128 Algorithm and...
Encryption Quality Analysis and Security Evaluation of CAST-128 Algorithm and...Encryption Quality Analysis and Security Evaluation of CAST-128 Algorithm and...
Encryption Quality Analysis and Security Evaluation of CAST-128 Algorithm and...
 

Similar to Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN

Neo4j MeetUp - Graph Exploration with MetaExp
Neo4j MeetUp - Graph Exploration with MetaExpNeo4j MeetUp - Graph Exploration with MetaExp
Neo4j MeetUp - Graph Exploration with MetaExpAdrian Ziegler
 
Generative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural NetworksGenerative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural NetworksDenis Dus
 
Generation of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.pptGeneration of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.pptDivyaGugulothu
 
Learning with Relative Attributes
Learning with Relative AttributesLearning with Relative Attributes
Learning with Relative AttributesVikas Jain
 
Multi graph encoder
Multi graph encoderMulti graph encoder
Multi graph encoderYajie Zhou
 
Unpaired Image Translations Using GANs: A Review
Unpaired Image Translations Using GANs: A ReviewUnpaired Image Translations Using GANs: A Review
Unpaired Image Translations Using GANs: A ReviewIRJET Journal
 
Machine Learning: Classification Concepts (Part 1)
Machine Learning: Classification Concepts (Part 1)Machine Learning: Classification Concepts (Part 1)
Machine Learning: Classification Concepts (Part 1)Daniel Chan
 
Multiple Graphs: Updatable Views
Multiple Graphs: Updatable ViewsMultiple Graphs: Updatable Views
Multiple Graphs: Updatable ViewsopenCypher
 
R visualization: ggplot2, googlevis, plotly, igraph Overview
R visualization: ggplot2, googlevis, plotly, igraph OverviewR visualization: ggplot2, googlevis, plotly, igraph Overview
R visualization: ggplot2, googlevis, plotly, igraph OverviewOlga Scrivner
 
DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph KernelsDDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph Kernelsivaderivader
 
Working with images in matlab graphics
Working with images in matlab graphicsWorking with images in matlab graphics
Working with images in matlab graphicsmustafa_92
 
Image Classification using Deep Learning
Image Classification using Deep LearningImage Classification using Deep Learning
Image Classification using Deep LearningIRJET Journal
 
Transformations and actions a visual guide training
Transformations and actions a visual guide trainingTransformations and actions a visual guide training
Transformations and actions a visual guide trainingSpark Summit
 
Project_Final_Review.pdf
Project_Final_Review.pdfProject_Final_Review.pdf
Project_Final_Review.pdfDivyaGugulothu
 
Ray: A Cluster Computing Engine for Reinforcement Learning Applications with ...
Ray: A Cluster Computing Engine for Reinforcement Learning Applications with ...Ray: A Cluster Computing Engine for Reinforcement Learning Applications with ...
Ray: A Cluster Computing Engine for Reinforcement Learning Applications with ...Databricks
 
Tutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksTutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksMLReview
 
Web-Scale Graph Analytics with Apache® Spark™
Web-Scale Graph Analytics with Apache® Spark™Web-Scale Graph Analytics with Apache® Spark™
Web-Scale Graph Analytics with Apache® Spark™Databricks
 
Multiple graphs in openCypher
Multiple graphs in openCypherMultiple graphs in openCypher
Multiple graphs in openCypheropenCypher
 
AIML4 CNN lab256 1hr (111-1).pdf
AIML4 CNN lab256 1hr (111-1).pdfAIML4 CNN lab256 1hr (111-1).pdf
AIML4 CNN lab256 1hr (111-1).pdfssuserb4d806
 

Similar to Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN (20)

Neo4j MeetUp - Graph Exploration with MetaExp
Neo4j MeetUp - Graph Exploration with MetaExpNeo4j MeetUp - Graph Exploration with MetaExp
Neo4j MeetUp - Graph Exploration with MetaExp
 
Generative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural NetworksGenerative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural Networks
 
Generation of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.pptGeneration of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.ppt
 
Learning with Relative Attributes
Learning with Relative AttributesLearning with Relative Attributes
Learning with Relative Attributes
 
Multi graph encoder
Multi graph encoderMulti graph encoder
Multi graph encoder
 
Unpaired Image Translations Using GANs: A Review
Unpaired Image Translations Using GANs: A ReviewUnpaired Image Translations Using GANs: A Review
Unpaired Image Translations Using GANs: A Review
 
Machine Learning: Classification Concepts (Part 1)
Machine Learning: Classification Concepts (Part 1)Machine Learning: Classification Concepts (Part 1)
Machine Learning: Classification Concepts (Part 1)
 
Multiple Graphs: Updatable Views
Multiple Graphs: Updatable ViewsMultiple Graphs: Updatable Views
Multiple Graphs: Updatable Views
 
R visualization: ggplot2, googlevis, plotly, igraph Overview
R visualization: ggplot2, googlevis, plotly, igraph OverviewR visualization: ggplot2, googlevis, plotly, igraph Overview
R visualization: ggplot2, googlevis, plotly, igraph Overview
 
DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph KernelsDDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
 
Working with images in matlab graphics
Working with images in matlab graphicsWorking with images in matlab graphics
Working with images in matlab graphics
 
Image Classification using Deep Learning
Image Classification using Deep LearningImage Classification using Deep Learning
Image Classification using Deep Learning
 
Transformations and actions a visual guide training
Transformations and actions a visual guide trainingTransformations and actions a visual guide training
Transformations and actions a visual guide training
 
Project_Final_Review.pdf
Project_Final_Review.pdfProject_Final_Review.pdf
Project_Final_Review.pdf
 
3DRepo
3DRepo3DRepo
3DRepo
 
Ray: A Cluster Computing Engine for Reinforcement Learning Applications with ...
Ray: A Cluster Computing Engine for Reinforcement Learning Applications with ...Ray: A Cluster Computing Engine for Reinforcement Learning Applications with ...
Ray: A Cluster Computing Engine for Reinforcement Learning Applications with ...
 
Tutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksTutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial Networks
 
Web-Scale Graph Analytics with Apache® Spark™
Web-Scale Graph Analytics with Apache® Spark™Web-Scale Graph Analytics with Apache® Spark™
Web-Scale Graph Analytics with Apache® Spark™
 
Multiple graphs in openCypher
Multiple graphs in openCypherMultiple graphs in openCypher
Multiple graphs in openCypher
 
AIML4 CNN lab256 1hr (111-1).pdf
AIML4 CNN lab256 1hr (111-1).pdfAIML4 CNN lab256 1hr (111-1).pdf
AIML4 CNN lab256 1hr (111-1).pdf
 

More from Hao-Wen (Herman) Dong

Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...
Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...
Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...Hao-Wen (Herman) Dong
 
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...Hao-Wen (Herman) Dong
 
Organizing Machine Learning Projects - Repository Organization
Organizing Machine Learning Projects - Repository OrganizationOrganizing Machine Learning Projects - Repository Organization
Organizing Machine Learning Projects - Repository OrganizationHao-Wen (Herman) Dong
 
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...Hao-Wen (Herman) Dong
 
Introduction to Deep Generative Models
Introduction to Deep Generative ModelsIntroduction to Deep Generative Models
Introduction to Deep Generative ModelsHao-Wen (Herman) Dong
 

More from Hao-Wen (Herman) Dong (6)

Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...
Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...
Training Generative Adversarial Networks with Binary Neurons by End-to-end Ba...
 
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...
 
Organizing Machine Learning Projects - Repository Organization
Organizing Machine Learning Projects - Repository OrganizationOrganizing Machine Learning Projects - Repository Organization
Organizing Machine Learning Projects - Repository Organization
 
What is Critical in GAN Training?
What is Critical in GAN Training?What is Critical in GAN Training?
What is Critical in GAN Training?
 
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...
 
Introduction to Deep Generative Models
Introduction to Deep Generative ModelsIntroduction to Deep Generative Models
Introduction to Deep Generative Models
 

Recently uploaded

Twin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptxTwin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptxEran Akiva Sinbar
 
TOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsTOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsssuserddc89b
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.PraveenaKalaiselvan1
 
Module 4: Mendelian Genetics and Punnett Square
Module 4:  Mendelian Genetics and Punnett SquareModule 4:  Mendelian Genetics and Punnett Square
Module 4: Mendelian Genetics and Punnett SquareIsiahStephanRadaza
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trssuser06f238
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxNandakishor Bhaurao Deshmukh
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
TOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptxTOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptxdharshini369nike
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptArshadWarsi13
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxmalonesandreagweneth
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10ROLANARIBATO3
 
insect anatomy and insect body wall and their physiology
insect anatomy and insect body wall and their  physiologyinsect anatomy and insect body wall and their  physiology
insect anatomy and insect body wall and their physiologyDrAnita Sharma
 
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |aasikanpl
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxSwapnil Therkar
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxFarihaAbdulRasheed
 

Recently uploaded (20)

Twin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptxTwin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptx
 
TOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsTOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physics
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
 
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort ServiceHot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
 
Module 4: Mendelian Genetics and Punnett Square
Module 4:  Mendelian Genetics and Punnett SquareModule 4:  Mendelian Genetics and Punnett Square
Module 4: Mendelian Genetics and Punnett Square
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptx
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
TOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptxTOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptx
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.ppt
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10
 
insect anatomy and insect body wall and their physiology
insect anatomy and insect body wall and their  physiologyinsect anatomy and insect body wall and their  physiology
insect anatomy and insect body wall and their physiology
 
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
 

Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN

  • 1. Recent Progress on Utilizing Tag Information with GANs StarGAN & TD-GAN Hao-Wen Dong Dec 26, 2017 Research Center for IT Innovation, Academia Sinica
  • 2. Table of contents 1. Introduction 2. StarGAN 3. TD-GAN 4. Conclusions & Discussions 1
  • 4. Main Idea How to utilize tag information? 2
  • 5. Straightforward Approach Fig. 1: Girl Friend Factory∗: generating anime characters with specific attributes by conditional GANs ∗https://hiroshiba.github.io/girl_friend_factory/index.html 3
  • 6. Inspiring Viewpoints StarGAN [1] Key Assumption: images of different tags can be viewed as different domains Approach: multi-domain image-to-image translation Paper: Yunjey Choi, Minje Choi, Munyoung Kim, Jung-Woo Ha, Sunghun Kim, and Jaegul Choo, “StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation”, arXiv preprint arXiv:1711.09020, 2017. 4
  • 7. Inspiring Viewpoints TD-GAN [2] Key Assumption: images and tags record the same object from two different perspectives Approach: enforce consistency between learned disentangled representations for images and tags Paper: Chaoyue Wang, Chaohui Wang, Chang Xu, and Dacheng Tao, “Tag Disentangled Generative Adversarial Networks for Object Image Re-rendering”, in Proc. 36th Int. Joint Conf. on Artificial Intelligence (IJCAI), 2017. 5
  • 9. StarGAN - Motivation Fig. 2: CycleGAN [3]: Cycle-Consistent Adversarial Networks 6
  • 10. StarGAN - Motivation Fig. 3: Comparison of cross-domain and multi-domain models 7
  • 11. StarGAN - Qualitative Results Fig. 4: Results on CelebA 8
  • 12. StarGAN - Qualitative Results Fig. 5: Results on CelebA via transferring knowledge learned from RaFD 9
  • 13. StarGAN - System Overview Fig. 6: System overview 10
  • 14. StarGAN - Formulation G : (x, c) → y D : x → (Dsrc(x), Dcls(x)) x : input image, c : domain label, y : output image 11
  • 15. StarGAN - Objective Functions Adversarial Loss Ladv = Ex[log Dsrc(x)] + Ex,c[log(1 − Dsrc(G(x, c)))] 12
  • 16. StarGAN - Objective Functions Adversarial Loss Ladv = Ex[log Dsrc(x)] + Ex,c[log(1 − Dsrc(G(x, c)))] Domain Classification Loss Lr cls = Ex,c′ [− log Dcls(c′ |x)] (real images) Lf cls = Ex,c[− log Dcls(c|G(x, c))] (fake images) 12
  • 17. StarGAN - Objective Functions Adversarial Loss Ladv = Ex[log Dsrc(x)] + Ex,c[log(1 − Dsrc(G(x, c)))] Domain Classification Loss Lr cls = Ex,c′ [− log Dcls(c′ |x)] (real images) Lf cls = Ex,c[− log Dcls(c|G(x, c))] (fake images) Reconstruction Loss Lrec = Ex,c,c′ [||x − G(G(x, c), c′ )||1] 12
  • 18. StarGAN - Objective Functions Full Objective Functions LD = −Ladv + λclsLr cls LG = −Ladv + λclsLf cls + λrecLrec 13
  • 19. StarGAN - Qualitative Results Fig. 7: Qualitative comparison among different models 14
  • 20. StarGAN - Multiple Datasets CelebA dataset (binary) Black, Blond, Brown, Male, Young, etc. RaFD dataset (categorical) Angry, Fearful, Happy, Sad, Disgusted, etc. 15
  • 21. StarGAN - Multiple Datasets CelebA dataset (binary) Black, Blond, Brown, Male, Young, etc. RaFD dataset (categorical) Angry, Fearful, Happy, Sad, Disgusted, etc. Fig. 8: Label vectors and mask vector 15
  • 22. StarGAN - Training on Multiple Datasets Fig. 9: System overview - multiple datasets (I) 16
  • 23. StarGAN - Training on Multiple Datasets Fig. 9 (cont.): System overview - multiple datasets (II) 17
  • 24. StarGAN - Qualitative Results Fig. 10: Results on CelebA via transferring knowledge learned from RaFD 18
  • 25. StarGAN - Mask Vector Fig. 11: Learned role of mask vector 19
  • 26. StarGAN - Quantitative Experiments Amazon Mechanical Turk (AMT) vote for the best generated im- ages based on: • perceptual realism • quality of transfer in attribute(s) • preservation of a figure’s original identity 20
  • 27. StarGAN - Quantitative Results Method Hair color Gender Aged DIAT 9.3% 31.4% 6.9% CycleGAN 20.0% 16.6% 13.3% IcGAN 4.5% 12.9% 9.2% StarGAN 66.2 39.1 70.6 Table 1: AMT perceptual evaluation (by votes) 21
  • 28. StarGAN - Quantitative Results Method H+G H+A G+A H+G+A DIAT 20.4% 15.6% 18.7% 15.6% CycleGAN 14.0% 12.0% 11.2% 11.9% IcGAN 18.2% 10.9% 20.3% 20.3% StarGAN 47.4 61.5 49.8 52.2 Table 1 (cont.): AMT perceptual evaluation (by votes) 22
  • 30. TD-GAN - Motivation & Goals Key Assumption the image and its tags record the same object from two different perspectives, so they should share the same disentangled representations Goals • to extract disentangled and interpretable representations for both image and its tags • to explore the consistency between the image and its tags by integrating the tag mapping net 23
  • 31. TD-GAN - Qualitative Results Fig. 12: Multi-factor transformation 24
  • 32. TD-GAN - System Overview Disentangling network R x → R(x) image → disentangled representations 25
  • 33. TD-GAN - System Overview Disentangling network R x → R(x) image → disentangled representations Tag mapping net g C → g(C) tag code → disentangled representations 25
  • 34. TD-GAN - System Overview Generative network G g(C) or R(x) → G(g(C)) or G(R(x)) disentangled representations → re-rendered image 26
  • 35. TD-GAN - System Overview Generative network G g(C) or R(x) → G(g(C)) or G(R(x)) disentangled representations → re-rendered image Discriminative network D adversarial training with G and R 26
  • 36. TD-GAN - System Overview Fig. 13: System overview 27
  • 37. TD-GAN - Formulation Let the tagged training dataset be XL = {(x1, C1), . . . , (x|XL|, C|XL|)} , where tag codes Ci = (cide i , cview i , cexp i , . . .) and cide i , cview i , cexp i , . . . are one-hot encoding vectors. Let the untagged training dataset be XU . 28
  • 38. TD-GAN - Objective Functions Discrepency between disentangled representations f1(R, g) = 1 |XL| ∑ xi∈XL ||R(xi) − g(Ci)||2 2 29
  • 39. TD-GAN - Objective Functions Discrepency between disentangled representations f1(R, g) = 1 |XL| ∑ xi∈XL ||R(xi) − g(Ci)||2 2 Discrepency between real and rendered images f2(G, g) = 1 |XL| ∑ xi∈XL ||G(g(Ci)) − xi||2 2 29
  • 40. TD-GAN - Objective Functions Discrepency between disentangled representations f1(R, g) = 1 |XL| ∑ xi∈XL ||R(xi) − g(Ci)||2 2 Discrepency between real and rendered images f2(G, g) = 1 |XL| ∑ xi∈XL ||G(g(Ci)) − xi||2 2 29
  • 41. TD-GAN - Objective Functions Reconstruction Loss for untagged images ˜f1(G, R) = 1 |XU | ∑ xi∈XU ||G(R(xi)) − xi||2 2 Adversarial Loss f3(R, G, D) = E[log D(x)] + E[log(1 − D(G(R(x))))] 30
  • 42. TD-GAN - Objective Functions Full Objective Functions LR = min R λ1f1(R, g∗ ) + λ3f3(R, G∗ , D∗ ) Lg = min g λ1f1(R∗ , g) + λ2f2(G∗ , g) LG = min G λ2f2(G, g∗ ) + λ3f3(R∗ , G, D∗ ) LD = max D λ3f3(R∗ , G∗ , D) (R∗ , g∗ , G∗ , D∗ are fixed as the configurations obtained from the previous iteration) 31
  • 43. TD-GAN - Qualitative Results Fig. 14: Novel view synthesis results on 3D-chairs dataset 32
  • 44. TD-GAN - Semi-supervised Extension Test three representative training settings: • fully-500: all the 500 training models and their tags • fully-100: only the first 100 models and their tags • semi-(100,400): all the 500 training models but only the tags of the first 100 models 33
  • 45. TD-GAN - Qualitative Results Fig. 15: Novel view synthesis results trained in three settings 34
  • 46. TD-GAN - Quantitative Results Fig. 16: MSE of novel view synthesis results trained in three settings (using testing chair images under 0◦ as inputs) 35
  • 47. TD-GAN - Qualitative Results Fig. 17: Illumination transformation of human face results on Multi-PIE dataset 36
  • 48. TD-GAN - Quantitative Results XCov TD-GAN TD-GAN Flash-1 0.5776 0.5623 0.5280 Flash-4 5.0692 0.8972 0.8818 Flash-6 4.3991 1.2509 0.1079 Flash-9 3.4639 0.6145 0.5870 Flash-12 2.4624 0.7142 0.6973 All Flash (mean) 3.8675 0.6966 0.6667 Table 2: MSE (×10−2) of illumination transformation results 37
  • 49. TD-GAN - Possible Applications • virtual reality systems e.g. to naturally ‘paste’ persons into a virtual environment by re-rendering of faces for the continuous pose, illumination directions, and various expressions • architecture • simulators • video games • movies • visual effects 38
  • 51. Conclusions & Discussions • Assumptions matter. 39
  • 52. Conclusions & Discussions • Assumptions matter. (are they reasonable?) • images of different tags can be viewed as different domains • images and tags record the same object from two different perspectives 39
  • 53. Conclusions & Discussions • Assumptions matter. (are they reasonable?) • images of different tags can be viewed as different domains • images and tags record the same object from two different perspectives • How to utilize tag information? 39
  • 54. Conclusions & Discussions • Assumptions matter. (are they reasonable?) • images of different tags can be viewed as different domains • images and tags record the same object from two different perspectives • How to utilize tag information? (plenty of ways) • multi-domain image-to-image translation • enforce consistency between learned disentangled representations for images and tags 39
  • 55. Conclusions & Discussions • Assumptions matter. (are they reasonable?) • images of different tags can be viewed as different domains • images and tags record the same object from two different perspectives • How to utilize tag information? (plenty of ways) • multi-domain image-to-image translation • enforce consistency between learned disentangled representations for images and tags • Does disentangling make sense? 39
  • 56. Conclusions & Discussions • Assumptions matter. (are they reasonable?) • images of different tags can be viewed as different domains • images and tags record the same object from two different perspectives • How to utilize tag information? (plenty of ways) • multi-domain image-to-image translation • enforce consistency between learned disentangled representations for images and tags • Does disentangling make sense? (some trade-offs?) • interpretability vs. effectiveness (efficiency) • information underneath entangled factors 39
  • 58. ICGAN - System Overview Fig. 18: ICGAN [4] - system overview
  • 59. DC-IGN - System Overview Fig. 19: DC-IGN [5] - system overview
  • 60. References [1] Y. Choi, M. Choi, M. Kim, J.-W. Ha, S. Kim, and J. Choo, “StarGAN: Unified generative adversarial networks for multi-Domain image-to-image translation,” ArXiv preprint arXiv:1711.09020, 2017. [2] C. Wang, C. Wang, C. Xu, and D. Tao, “Tag disentangled generative adversarial networks for object image Re-rendering,” in Proc. 36th Int. Joint Conf. on Artificial Intelligence (IJCAI), 2017. [3] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired image-to-image translation using cycle-consistent adversarial networks,” in The IEEE International Conference on Computer Vision (ICCV), 2017. [4] G. Perarnau, J. van de Weijer, B. Raducanu, and J. M. Álvarez, “Invertible conditional gans for image editing,” in Proc. NIPS 2016 Workshop on Adversarial Training, 2016. [5] T. D. Kulkarni, W. Whitney, P. Kohli, and J. B. Tenenbaum, “Deep convolutional inverse graphics network,” in Advances in Neural Information Processing Systems (NIPS) 28, 2015.