1
DaeJin Kim
Research Trends in
Editing image using GAN
: Text-Adaptive GAN, Editable GAN
2
Table of Contents
1. Introduction
2. Related Work
    1. Conditional GAN
    2. AttGAN
3. Text-Adaptive GAN
    1. Reference
    2. Motivation
    3. Model Structure
    4. Text-Adaptive Discriminator
    5. Formulation
    6. Implementation Details
    7. Experiments
    8. Limitations
4. Editable GAN
    1. Reference
    2. Motivation
    3. Model Structure
    4. Connection Network
    5. Formulation
    6. Experiments
    7. Limitations
5. Editable Text-Adaptive GAN
    1. Comparison of existing GANs
    2. Key idea
    3. Model Structure
    4. Formulation
    5. Experiments
    6. Conclusion
6. Discussion
3
Generate
edited images
1. Editing Images
: Editing methods aim to manipulate single or multiple attributes of an original image, i.e., to generate a
new image with desired attributes while preserving other details.
Introduction
4
Introduction
2. Approaches to Introduce
1) Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language [2]
2) Editable Generative Adversarial Networks: Generating and Editing Faces Simultaneously [1]
3) Editable GAN + Text-Adaptive GAN (my suggestion)
a) Text-Adaptive GAN b) Editable GAN
[1] Baek, Kyungjune, et al. “Editable Generative Adversarial Networks: Generating and Editing Faces Simultaneously.”
[2] Nam, Seonghyeon, et al. “Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language”
5
Related Works
1. Conditional GAN
: Conditional GAN suggests a new framework to control the semantics of generated samples; they
formulate the problem as reproducing the conditional data distribution by training the conditional model
distribution.
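For reference, the conditional GAN objective is usually written as the two-player game below. The notation is the standard one from the conditional GAN literature rather than from these slides: x is a real sample, y the condition, z the noise vector, and the conditioning of D on y is written out explicitly.

```latex
\min_G \max_D \;
\mathbb{E}_{x, y \sim p_{\text{data}}}\big[\log D(x \mid y)\big]
+ \mathbb{E}_{z \sim p_z,\; y \sim p_y}\big[\log\big(1 - D(G(z \mid y) \mid y)\big)\big]
```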
https://github.com/hwalsuklee/tensorflow-generative-model-collections
6
Related Works
2. AttGAN
: AttGAN aims to generate a new face with desired attributes while preserving other details. Introduced in
He, Z., et al. “Arbitrary Facial Attribute Editing: Only Change What You Want”, arXiv.
Add
Glasses
Blond
Hair
7
Related Works
2. AttGAN
: Based on the encoder-decoder architecture, AttGAN applies an attribute classification constraint to the
generated image to guarantee that the desired attributes change correctly, i.e., to “change what you want”.
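A minimal sketch of what an attribute classification constraint can look like, assuming the classifier outputs one probability per binary attribute and the loss is a mean binary cross-entropy against the desired attribute vector. The function name, the attribute ordering and the example numbers are hypothetical.

```python
import numpy as np

def attribute_classification_loss(pred_probs, target_attrs, eps=1e-7):
    """Mean binary cross-entropy between predicted attribute probabilities
    and the desired binary attribute vector (assumed form of the constraint)."""
    p = np.clip(np.asarray(pred_probs, dtype=float), eps, 1.0 - eps)
    a = np.asarray(target_attrs, dtype=float)
    return float(np.mean(-(a * np.log(p) + (1.0 - a) * np.log(1.0 - p))))

# Hypothetical attribute order: [glasses, blond hair, beard]
desired = [1, 1, 0]
good = attribute_classification_loss([0.9, 0.8, 0.1], desired)  # edit matches the request
bad = attribute_classification_loss([0.1, 0.2, 0.9], desired)   # edit ignores the request
print(good, bad)
```

The generator is penalized only when the classifier disagrees with the requested attributes, which is the sense in which the constraint targets "what you want" and nothing else.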
8
Text-Adaptive GAN
1. Reference
: Nam, Seonghyeon, et al. “Text-Adaptive Generative Adversarial
Networks: Manipulating Images with Natural Language” in NeurIPS 2018
9
2. Motivation
: Text-Adaptive GAN aims to semantically modify visual attributes of an object in an image according to
the text describing the new visual appearance.
Text-Adaptive GAN
“This is a black bird with gray and white wings and a bright yellow belly and chest.”
Existing Methods:
- Synthesize novel images, not manipulate.
- Do not fully preserve text-irrelevant contents.
Proposed
10
Text-Adaptive GAN
3. Model Structure
: Model structure figure in paper.
11
3. Model Structure
: A simplified architecture of Text-Adaptive GAN.
Text-Adaptive GAN
[Figure labels: Generator; Discriminator (“Real / Fake?”); Text-Adaptive Discriminator (“Has described attributes?”); two Text Encoders, each fed “She has blond hair”; note “Learning independently”]
12
4. Text-Adaptive Discriminator
: The text-adaptive discriminator classifies each attribute independently using word-level local discriminators.
As a result, the generator receives feedback from a separate local discriminator for each visual attribute.
Text-Adaptive GAN
1) Determine whether the visual attribute related to each word w_i exists in the image.
2) Add word-level attention (softmax values) to reduce the impact of less important words.
(u: temporal average of w_i)
3) Final scores
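The three steps above can be sketched as follows. The equations themselves did not survive in this transcript, so the form used here (a sigmoid local discriminator per word, softmax attention against the average word vector u, and a final score equal to the product of local scores raised to their attention weights) is an assumption based on my reading of the TAGAN paper and should be checked against it. `W_proj` and `b_proj` are hypothetical linear maps that turn a word vector into the weight and bias of its local discriminator.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def softmax(a):
    e = np.exp(a - np.max(a))
    return e / e.sum()

def text_adaptive_score(v, words, W_proj, b_proj):
    """v: (d_img,) image feature; words: (T, d_txt) word vectors w_i."""
    # 1) Word-level local discriminators: one sigmoid classifier per word.
    weights = words @ W_proj              # (T, d_img), weight of each local discriminator
    biases = words @ b_proj               # (T,), bias of each local discriminator
    local = sigmoid(weights @ v + biases)
    # 2) Word-level attention: softmax of u . w_i, with u the average of the w_i.
    u = words.mean(axis=0)
    alpha = softmax(words @ u)
    # 3) Final score: product of local scores raised to alpha, computed in log space.
    score = float(np.exp(np.sum(alpha * np.log(local + 1e-12))))
    return score, local, alpha

rng = np.random.default_rng(0)
T, d_txt, d_img = 5, 8, 16
words = rng.standard_normal((T, d_txt))
v = rng.standard_normal(d_img)
W_proj = rng.standard_normal((d_txt, d_img)) * 0.1
b_proj = rng.standard_normal(d_txt) * 0.1
score, local, alpha = text_adaptive_score(v, words, W_proj, b_proj)
print(score, local, alpha)
```

Because the attention weights sum to 1, the final score is a weighted geometric mean of the local scores, so a word with a low attention weight has little influence on it.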
13
5. Formulation
Text-Adaptive GAN
Original x
Positive Text t: “This is a brown bird”
Negative Text t̂: “This is a black bird with gray and white wings and a bright yellow belly and chest.”
Generated G(x, t̂)
Does x have the classes described in t?
Does G(x, t̂) have the classes described in t̂?
a) Discriminator
b) Generator
log D(G(x, t))
14
6. Implementation Details
Text-Adaptive GAN
- Using a bidirectional RNN to encode the whole text
- Using the conditioning augmentation method (introduced with StackGAN) for a smooth text representation and diversity of generated outputs:
latent variables are randomly sampled from the independent Gaussian distribution $\mathcal{N}(\mu(\phi), \sigma(\phi))$.
- Using fastText for word embedding
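A minimal sketch of conditioning augmentation via the reparameterization trick, c = μ(φ) + σ(φ) ⊙ ε with ε drawn from a standard normal. The linear maps `W_mu` and `W_logsigma`, the log-sigma parameterization and all dimensions are assumptions made for illustration and are not taken from the slides.

```python
import numpy as np

def conditioning_augmentation(phi, W_mu, W_logsigma, rng):
    """Sample a conditioning vector from N(mu(phi), sigma(phi)) for a text embedding phi."""
    mu = phi @ W_mu                      # mean of the Gaussian
    sigma = np.exp(phi @ W_logsigma)     # positive standard deviation (assumed log-parameterized)
    eps = rng.standard_normal(mu.shape)  # independent standard normal noise
    return mu + sigma * eps, mu, sigma

rng = np.random.default_rng(0)
phi = rng.standard_normal(32)                    # hypothetical sentence embedding
W_mu = rng.standard_normal((32, 8)) * 0.1
W_logsigma = rng.standard_normal((32, 8)) * 0.1
c1, mu, sigma = conditioning_augmentation(phi, W_mu, W_logsigma, rng)
c2, _, _ = conditioning_augmentation(phi, W_mu, W_logsigma, rng)
print(c1, c2)
```

The same sentence yields a different conditioning vector on every draw, which is where the smoother text representation and the output diversity come from.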
15
7. Experiments
Text-Adaptive GAN
a) Accuracy, naturalness (evaluated by users), L2 reconstruction error
b) Multi-modal retrieval task on the CUB dataset
16
8. Limitations
Text-Adaptive GAN
- Cannot edit properly when the requested attributes do not match the objects in the dataset.
“This flower is blue” + (image) = Bad result
- Works well only for a few attributes.
“This bird has very small wings” + (image) = Bad result
17
Editable GAN
1. Reference
: Baek, Kyungjune, et al. “Editable Generative Adversarial
Networks: Generating and Editing Faces Simultaneously.” in ACCV 2018
18
2. Motivation
: Develop a single unified model that can simultaneously create and edit high quality face images with
desired attributes.
Editable GAN
Single model (Proposed)
+ Blond hair
Edit Attribute
Generate novel image
Blond hair
IcGAN, VAE/GAN …
AttGAN, cGAN …
Share
19
3. Model Structure
: Model structure figure in paper.
Editable GAN
20
3. Model Structure
: A simplified architecture of Editable GAN
Editable GAN
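[Figure: the Generator takes the latent vector z and the attributes y and outputs the image x. The Discriminator (“Real / Fake?”) supplies structural information and the Attribute Classifier (“Has described attributes?”) supplies attribute information to the Connection Network, which estimates the latent vector.]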
21
3. Model Structure
: Generate novel images with specific attributes.
Editable GAN
[Figure: z is sampled from a uniform distribution and y is set to a = [0, 1, …, 0, …] (Blond hair); the Generator outputs x_gen, which is passed to the Discriminator (“Real / Fake?”) and the Attribute Classifier (“Has described attributes?”). The Connection Network estimates the latent vector from their structural and attribute information.]
22
3. Model Structure
: Manipulate images with specific attributes.
Editable GAN
[Figure: the original image x_origin passes through the Discriminator and the Attribute Classifier; the Connection Network uses their structural and attribute information to estimate the original latent vector z̃. The Generator takes z̃ and y = a = [0, 1, …, 0, …] (Blond hair) and outputs the edited image x_edit, which is checked for “Real / Fake?” and “Has described attributes?”.]
23
4. Connection Network
: The Connection Network performs the inverse of the generation process. Taking f_d from the discriminator and f_c
from the classifier as input, it estimates the latent vector.
Using the connection network bypasses a disadvantage of the encoder-decoder architecture,
which overloads generator training.
Editable GAN
Discriminator (Real image?): f_d, vector used to detect fake images (Structural Information)
Classifier (Blond hair?): f_c, vector used to check classes (attributes) (Attribute Information)
f_d, f_c: output feature vector of the last fully connected layer
Connection Network: (f_d, f_c) → Latent vector
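A minimal sketch of the idea, assuming the Connection Network is a small fully connected network over the concatenation of f_d and f_c. The layer sizes, the ReLU hidden layer and the mean-squared latent reconstruction loss are illustrative assumptions and not details from the slides.

```python
import numpy as np

def init_params(d_fd, d_fc, d_hidden, d_z, rng):
    """Randomly initialized weights for a two-layer connection network (hypothetical sizes)."""
    return {
        "W1": rng.standard_normal((d_hidden, d_fd + d_fc)) * 0.1,
        "b1": np.zeros(d_hidden),
        "W2": rng.standard_normal((d_z, d_hidden)) * 0.1,
        "b2": np.zeros(d_z),
    }

def connection_network(f_d, f_c, params):
    """Estimate the latent vector from discriminator features f_d and classifier features f_c."""
    h = np.concatenate([f_d, f_c])
    h = np.maximum(0.0, params["W1"] @ h + params["b1"])  # ReLU hidden layer
    return params["W2"] @ h + params["b2"]

def latent_reconstruction_loss(z, z_est):
    """Mean squared error between the true and the estimated latent vector (assumed norm)."""
    return float(np.mean((z - z_est) ** 2))

rng = np.random.default_rng(0)
params = init_params(d_fd=64, d_fc=40, d_hidden=128, d_z=100, rng=rng)
f_d = rng.standard_normal(64)   # stand-in for the discriminator's last FC features
f_c = rng.standard_normal(40)   # stand-in for the classifier's last FC features
z = rng.uniform(-1.0, 1.0, 100) # latent vector that produced the image
z_est = connection_network(f_d, f_c, params)
print(latent_reconstruction_loss(z, z_est))
```

Because the network reads features that the discriminator and classifier already compute, the generator itself does not need an encoder, which is the point made above about avoiding the encoder-decoder overhead.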
24
5. Formulation
Editable GAN
a) Discriminator
b) Generator: Ensure reality, with the correct classes
$L_{adv} + L_{class}$ (Fool Discriminator + Right Classes?), for both the Novel Image and the Edited Image
c) Connection Network: Estimate image’s latent vector
Vector from G(z, y)
[Figure labels: (Random) z and (Classes) y → Decoder → G(z, y); x → f_d, f_c → CN → (Estimated) z̃ (Edit image)]
25
6. Experiments
Editable GAN
a) Image quality, Reconstruction performance
b) Image editing c) Image generating d) Control the strength of attribute effect
26
7. Limitations
Editable GAN
- Compared with other methods, Editable GAN is not very good at editing.
Compared to AttGAN, it does not properly preserve other details.
a) Editable GAN: Not match b) AttGAN: Match
27
Editable Text-Adaptive GAN ?
1. Comparison of Existing GANs
: Comparison of some GANs that are known to perform well at editing or generating arbitrary images.
Feature                               CGAN  AttGAN  AttnGAN  SISGAN  TAGAN  EditableGAN  ???
Implicit Classes (Natural Language)   X     X       O        O       O      X            O
Generating Arbitrary Image            O     X       O        X       X      O            O
Editing Image                         X     O       X        O       O      O            O
28
Editable Text-Adaptive GAN ?
Q. Is it possible to make a novel framework that can generate and edit images simultaneously
using natural language with reference to Editable GAN [1] and Text-Adaptive GAN [2]?
[1] Baek, Kyungjune, et al. “Editable Generative Adversarial Networks: Generating and Editing Faces Simultaneously.”
[2] Nam, Seonghyeon, et al. “Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language”
29
Editable Text-Adaptive GAN ?
2. Key idea
: Combine Editable GAN [1] and Text-Adaptive GAN [2]. The proposed framework uses the connection network
and the text-adaptive discriminator, and includes some components of both models.
[1] Baek, Kyungjune, et al. “Editable Generative Adversarial Networks: Generating and Editing Faces Simultaneously.”
[2] Nam, Seonghyeon, et al. “Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language”
+
Edit Images with natural language
Generate and edit simultaneously
30
Editable Text-Adaptive GAN ?
3. Model Structure
Decoder
Classifier
Attribute
Classifier
Text Encoder
Text Encoder
CN
Text-Adaptive
Classifier
Unconditional Loss
(Adversarial Loss)
Conditional Loss
Discriminator
Generator
𝑥
𝑧
𝑓𝑑 𝑓𝑐
31
Editable Text-Adaptive GAN ?
3. Model Structure
: Generate novel images with specific attributes in natural language.
Decoder
Classifier
Attribute
Classifier
Text Encoder
Text Encoder
CN
Text-Adaptive
Classifier
Unconditional Loss
(Adversarial Loss)
Conditional Loss
Sample from
uniform distribution
“She has blond hair”
x_gen
𝑧
32
Classifier
Attribute
Classifier
CN
Editable Text-Adaptive GAN ?
3. Model Structure
: Manipulate images with specific attributes in natural language.
Decoder
Text Encoder
Text Encoder
Text-Adaptive
Classifier
Unconditional Loss
(Adversarial Loss)
Conditional Loss
Edited
Original
x_origin
x_edit
z̃
𝑓𝑑 𝑓𝑐
“She has blond hair”
33
Editable Text-Adaptive GAN ?
4. Formulation
a) Discriminator with Text-Adaptive Classifier
$L_D = L_{adv} + \lambda_{cond} L_C^{real}$
b) Generator: Ensure reality, with the correct classes + Preserve other details in editing
$L_G = L_{adv} + \lambda_{cond} L_C^{edit} + L_C^{gen} + \lambda_{recon} L_{rec}^{image}$
More suitable for editing (Only change what you want)
c) Connection Network: Estimate image’s latent vector
$L_{CN} = \lambda_{recon} L_{rec}^{z}$
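The three objectives above can be sketched as plain functions. This is an illustration under assumptions: the individual terms (adversarial and conditional losses) are taken as already computed scalars, the image reconstruction term is assumed to be a mean absolute error, the latent one a mean squared error, and the grouping of the λ_cond weight follows the slide text as extracted. All function names and example values are hypothetical.

```python
import numpy as np

def image_reconstruction_loss(x_origin, x_recon):
    """L_rec^image: mean absolute pixel difference (assumed norm)."""
    return float(np.mean(np.abs(x_origin - x_recon)))

def latent_reconstruction_loss(z, z_est):
    """L_rec^z: mean squared difference between latent vectors (assumed norm)."""
    return float(np.mean((z - z_est) ** 2))

def discriminator_loss(l_adv, l_c_real, lam_cond):
    # L_D = L_adv + lambda_cond * L_C^real
    return l_adv + lam_cond * l_c_real

def generator_loss(l_adv, l_c_edit, l_c_gen, l_rec_image, lam_cond, lam_recon):
    # L_G = L_adv + lambda_cond * L_C^edit + L_C^gen + lambda_recon * L_rec^image
    return l_adv + lam_cond * l_c_edit + l_c_gen + lam_recon * l_rec_image

def connection_network_loss(l_rec_z, lam_recon):
    # L_CN = lambda_recon * L_rec^z
    return lam_recon * l_rec_z

x = np.zeros((4, 4, 3))
x_recon = np.full((4, 4, 3), 0.5)
l_rec_image = image_reconstruction_loss(x, x_recon)
total_g = generator_loss(1.0, 2.0, 3.0, l_rec_image, lam_cond=0.5, lam_recon=10.0)
print(l_rec_image, total_g)
```

The image-level reconstruction term is the only one that compares the output directly with the input image, which is why it is the term tied to preserving details during editing.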
34
Editable Text-Adaptive GAN ?
5. Experiments
: Still training and fine-tuning. The results below are images produced during training.
“Not good, but it works anyway”
“This flower has petals that are white
and has patches of yellow.”
“A light pink flower with pointed petals
and a yellow circle.”
Original
Image
Edited
Image
Given same
description
a) Recognize attributes in text b) Generate and Edit images simultaneously with given text
Novel
Image
35
Editable Text-Adaptive GAN ?
6. Conclusion
: The limitations of Editable Text-Adaptive GAN and what I learned.
- It has all the problems of the two existing models (Editable GAN, Text-Adaptive GAN).
: The Text-Adaptive Discriminator and the Connection Network work independently, so combining the two
models does not help solve those problems.
- Reconstruction loss on images, not just on latent vectors, works effectively even though this model is not
based on an encoder-decoder architecture.
- f_c probably contains enough attribute information. There was no major problem in training without
feeding y into the Connection Network.
Reconstruction Loss
by images
36
Discussion
- Is it possible to generate and edit images simultaneously without loss of original
information?
- Can we improve performance by integrating the structure of other models, such as
StackGAN, that were introduced for natural language input?
- Metrics to compare the modified image with the original one. (Reconstruction loss
increases greatly when the color/structure of the image is changed.)

AI as an Interface for Commercial Buildings
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 

Research Trends in Editing image using GAN (TAGAN, Editable GAN)

  • 1. DaeJin Kim. Research Trends in Editing image using GAN: Text-Adaptive GAN, Editable GAN
  • 2. Table of Contents: 1. Introduction; 2. Related Work (1. Conditional GAN, 2. AttGAN); 3. Text-Adaptive GAN (1. Reference, 2. Motivation, 3. Model Structure, 4. Text-Adaptive Discriminator, 5. Formulation, 6. Implementation Details, 7. Experiments, 8. Limitations); 4. Editable GAN (1. Reference, 2. Motivation, 3. Model Structure, 4. Connection Network, 5. Formulation, 6. Experiments, 7. Limitations); 5. Editable Text-Adaptive GAN (1. Comparison of existing GANs, 2. Key idea, 3. Model Structure, 4. Formulation, 5. Experiments, 6. Conclusion); 6. Discussion
  • 3. Introduction. 1. Editing Images: Editing methods aim to manipulate one or more attributes of an original image, i.e., to generate a new image with the desired attributes while preserving other details. [Figure: generate edited images]
  • 4. Introduction. 2. Approaches to Introduce: 1) Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language [2]; 2) Editable Generative Adversarial Networks: Generating and Editing Faces Simultaneously [1]; 3) Editable GAN + Text-Adaptive GAN (my proposal). [Figures: a) Text-Adaptive GAN, b) Editable GAN] [1] Baek, Kyungjune, et al. “Editable Generative Adversarial Networks: Generating and Editing Faces Simultaneously.” [2] Nam, Seonghyeon, et al. “Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language.”
  • 5. Related Works. 1. Conditional GAN: Conditional GAN proposes a framework for controlling the semantics of generated samples; it formulates the problem as reproducing the conditional data distribution by training a conditional model distribution. Figure source: https://github.com/hwalsuklee/tensorflow-generative-model-collections
  • 6. Related Works. 2. AttGAN: AttGAN aims to generate a new face with desired attributes while preserving other details. Introduced in He, Z., et al. “Arbitrary facial attribute editing: Only change what you want,” arXiv. [Figure examples: Add Glasses, Blond Hair]
  • 7. Related Works. 2. AttGAN: Based on an encoder-decoder architecture, AttGAN applies an attribute classification constraint to the generated image to guarantee that only the desired attributes change, i.e., to “change what you want”.
  • 8. Text-Adaptive GAN. 1. Reference: Nam, Seonghyeon, et al. “Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language,” NeurIPS 2018.
  • 9. Text-Adaptive GAN. 2. Motivation: Text-Adaptive GAN (proposed) aims to semantically modify the visual attributes of an object in an image according to text describing the new visual appearance, e.g., “This is a black bird with gray and white wings and a bright yellow belly and chest.” Existing methods synthesize novel images rather than manipulating a given one, or do not fully preserve text-irrelevant content.
  • 10. Text-Adaptive GAN. 3. Model Structure: model structure figure from the paper.
  • 11. Text-Adaptive GAN. 3. Model Structure: a simplified architecture of Text-Adaptive GAN. [Figure labels: Generator with a Text Encoder (“She has blond hair”); Discriminator (Real / Fake?); Text-Adaptive Discriminator with its own Text Encoder (“She has blond hair”; Has described attributes?); “Learning independently”]
  • 12. Text-Adaptive GAN. 4. Text-Adaptive Discriminator: The text-adaptive discriminator classifies each attribute independently using word-level local discriminators, so the generator receives feedback from a separate local discriminator for each visual attribute. 1) Each local discriminator determines whether the visual attribute related to its word exists in the image. 2) Word-level attention, computed as softmax values, reduces the impact of less important words ($u$: temporal average of the word vectors $w_i$). 3) The local results are combined into the final score.
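The three steps on this slide can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-word weights `word_weights` and biases `word_biases` are hypothetical placeholders (in a trained model they would be learned), the image is reduced to a single feature vector, and the final score is assumed here to be the attention-weighted product of the local scores, since the transcript does not preserve the formula.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def text_adaptive_score(image_feat, word_vecs, word_weights, word_biases):
    """Return (final score, local scores, attention weights)."""
    # 1) One local discriminator per word: does the attribute tied to
    #    this word appear in the image feature? (hypothetical linear form)
    local = [
        sigmoid(dot(w, image_feat) + b)
        for w, b in zip(word_weights, word_biases)
    ]
    # 2) Word-level attention: u is the average of the word vectors,
    #    and the attention is a softmax over u . w_i.
    dim = len(word_vecs[0])
    u = [sum(vec[d] for vec in word_vecs) / len(word_vecs) for d in range(dim)]
    alpha = softmax([dot(u, vec) for vec in word_vecs])
    # 3) Final score (assumed form): product of local scores raised to
    #    their attention weights, so low-attention words matter less.
    final = math.prod(p ** a for p, a in zip(local, alpha))
    return final, local, alpha


# Toy inputs: a 3-dimensional image feature and three 2-dimensional word vectors.
image_feat = [0.5, -1.0, 2.0]
word_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
word_weights = [[0.2, 0.1, 0.4], [-0.3, 0.5, 0.1], [0.0, 0.0, 1.0]]
word_biases = [0.0, 0.1, -0.5]

final, local, alpha = text_adaptive_score(
    image_feat, word_vecs, word_weights, word_biases
)
print("local scores:", local)
print("attention:", alpha)
print("final score:", final)
```

Because the attention weights sum to one, the final score in this sketch is a weighted geometric mean of the local scores, so a single confident local discriminator cannot dominate unless its word also receives high attention.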
  • 13. Text-Adaptive GAN. 5. Formulation: Original image $x$ with positive text $t$ (“This is a brown bird”); negative text $\hat{t}$ (“This is a black bird with gray and white wings and a bright yellow belly and chest.”); generated image $G(x, \hat{t})$. a) Discriminator: does $x$ have the classes described in $t$? b) Generator: does $G(x, \hat{t})$ have the classes described in $\hat{t}$? [Loss equations shown as images; only the fragment $\log D(G(x, t))$ survives in the transcript]
  • 14. Text-Adaptive GAN. 6. Implementation Details: - A bidirectional RNN encodes the whole text. - The conditioning augmentation method (introduced with StackGAN) is used for a smooth text representation and for diversity in the generated outputs: latent variables are randomly sampled from an independent Gaussian distribution $\mathcal{N}(\mu(\phi), \sigma(\phi))$. - fastText is used for word embedding.
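The sampling step of conditioning augmentation can be sketched as follows. This is a hedged illustration, not the deck's code: `mu` and `sigma` stand for $\mu(\phi)$ and $\sigma(\phi)$ and are assumed to have already been computed from the text embedding by learned layers that are not shown, and the function name is hypothetical.

```python
import random


def conditioning_augmentation(mu, sigma, rng=random):
    """Sample c = mu + sigma * eps elementwise, with eps ~ N(0, 1).

    Each dimension is sampled independently, which matches an
    independent (diagonal) Gaussian.
    """
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


# Toy statistics for one sentence (placeholder values, not real model output).
mu = [0.1, -0.4, 0.7]
sigma = [0.5, 0.2, 0.3]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
c1 = conditioning_augmentation(mu, sigma, rng)
c2 = conditioning_augmentation(mu, sigma, rng)
print(c1)
print(c2)  # a different conditioning vector for the same text
```

Drawing a fresh sample each time is what gives the same sentence slightly different conditioning vectors, which is the source of the smoothness and output diversity mentioned on the slide.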
  • 15. Text-Adaptive GAN. 7. Experiments: a) Accuracy and naturalness (evaluated by users), and L2 reconstruction error. b) Multi-modal retrieval task on the CUB dataset.
  • 16. Text-Adaptive GAN. 8. Limitations: - It cannot properly edit objects when the requested attributes do not match those in the dataset (e.g., “This flower is blue” gives a bad result). - It works well for only a few attributes (e.g., “This bird has a very small wings” gives a bad result).
  • 17. Editable GAN. 1. Reference: Baek, Kyungjune, et al. “Editable Generative Adversarial Networks: Generating and Editing Faces Simultaneously,” ACCV 2018.
  • 18. Editable GAN. 2. Motivation: Develop a single unified model that can simultaneously create and edit high-quality face images with desired attributes. [Figure labels: Edit attribute (+ Blond hair); Generate novel image (Blond hair); IcGAN, VAE/GAN, …; AttGAN, cGAN, …; Single model (proposed); Share]
  • 19. Editable GAN. 3. Model Structure: model structure figure from the paper.
  • 20. Editable GAN. 3. Model Structure: a simplified architecture of Editable GAN. [Figure labels: Generator (latent vector $z$, attributes $y$, output $x$); Discriminator (Real / Fake?); Attribute Classifier (Has described attributes?); Connection Network (takes structural information and attribute information, estimates the latent vector)]
  • 21. Editable GAN. 3. Model Structure: generate novel images with specific attributes. [Figure labels: latent vector $z$ sampled from a uniform distribution; attributes $y$ with $a = [0, 1, …, 0, …]$ (blond hair); Generator output $x_{gen}$; Discriminator (Real / Fake?); Attribute Classifier (Has described attributes?); Connection Network (structural information, attribute information, estimated latent vector)]
  • 22. Editable GAN. 3. Model Structure: manipulate images with specific attributes. [Figure labels: original image $x_{origin}$; Discriminator (Real / Fake?) and Attribute Classifier (Has described attributes?) provide structural and attribute information to the Connection Network, which estimates the original latent vector $\tilde{z}$; attributes $y$ with $a = [0, 1, …, 0, …]$ (blond hair); Generator output: edited image $x_{edit}$]
  • 23. Editable GAN. 4. Connection Network: The connection network performs the inverse of the generation process. Taking $f_d$ from the discriminator and $f_c$ from the classifier as input, it estimates the latent vector. Using the connection network avoids a disadvantage of the encoder-decoder architecture, which overloads generator training. $f_d$: the vector used to detect fake images (structural information); $f_c$: the vector used to check classes, i.e., attributes (attribute information). Both are the output feature vectors of the last fully connected layer of their respective networks (Discriminator: real image? Classifier: blond hair?).
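The data flow of the connection network can be sketched as below. This is only an illustration of the interface described on this slide: the single linear layer, the feature sizes, and the names `linear` and `connection_network` are assumptions made for the example, and the deck does not state the actual layer structure.

```python
import random


def linear(weights, bias, x):
    """Plain matrix-vector product plus bias."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]


def connection_network(f_d, f_c, weights, bias):
    """Estimate a latent vector from discriminator and classifier features.

    f_d: last fully connected feature of the discriminator (structure).
    f_c: last fully connected feature of the classifier (attributes).
    A single linear layer stands in for the real network (assumption).
    """
    features = f_d + f_c  # concatenate the two feature vectors
    return linear(weights, bias, features)


# Toy sizes: 4 structural features, 3 attribute features, 2 latent dimensions.
rng = random.Random(0)
f_d = [rng.uniform(-1, 1) for _ in range(4)]
f_c = [rng.uniform(-1, 1) for _ in range(3)]
weights = [[rng.uniform(-0.5, 0.5) for _ in range(7)] for _ in range(2)]
bias = [0.0, 0.0]

z_est = connection_network(f_d, f_c, weights, bias)
print(z_est)  # estimated latent vector for the input image
```

In an editing pass, the estimated vector would be handed to the generator together with the new attribute vector; that step is omitted here because it depends on the generator itself.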
  • 24. Editable GAN. 5. Formulation: a) Discriminator. b) Generator: ensure realism with the correct classes, using the terms $L_{adv}$ (fool the discriminator) and $L_{class}$ (right classes?) for both the novel image and the edited image. c) Connection Network: estimate the image's latent vector from the vectors $f_d$, $f_c$ of $G(z, y)$. [Figure labels: $x$ → $f_d$, $f_c$ → CN → $\tilde{z}$ (estimated; edit image); $z$ (random) and $y$ (classes) → Decoder → $G(z, y)$]
  • 25. Editable GAN. 6. Experiments: a) Image quality and reconstruction performance. b) Image editing. c) Image generation. d) Controlling the strength of an attribute's effect.
  • 26. Editable GAN. 7. Limitations: - Compared with other methods, Editable GAN is not very good at editing. Compared to AttGAN, it does not properly preserve other details. [Figure: a) Editable GAN: details do not match the original; b) AttGAN: details match]
  • 27. Editable Text-Adaptive GAN? 1. Comparison of Existing GANs: a comparison of GANs known to perform well at editing or at generating arbitrary images (O = yes, X = no).
      Feature                              | CGAN | AttGAN | AttnGAN | SISGAN | TAGAN | EditableGAN | ???
      Implicit classes (natural language)  | X    | X      | O       | O      | O     | X           | O
      Generating arbitrary images          | O    | X      | O       | X      | X     | O           | O
      Editing images                       | X    | O      | X       | O      | O     | O           | O
  • 28. Editable Text-Adaptive GAN? 1. Comparison of Existing GANs (same comparison table as slide 27). Q. Is it possible to build a novel framework that can generate and edit images simultaneously using natural language, building on Editable GAN [1] and Text-Adaptive GAN [2]? [1] Baek, Kyungjune, et al. “Editable Generative Adversarial Networks: Generating and Editing Faces Simultaneously.” [2] Nam, Seonghyeon, et al. “Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language.”
  • 29. Editable Text-Adaptive GAN? 2. Key idea: Combine Editable GAN [1] (generate and edit simultaneously) and Text-Adaptive GAN [2] (edit images with natural language). The proposed framework uses the connection network and the text-adaptive discriminator, and includes some components of both models. [1] Baek, Kyungjune, et al. “Editable Generative Adversarial Networks: Generating and Editing Faces Simultaneously.” [2] Nam, Seonghyeon, et al. “Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language.”
  • 30. Editable Text-Adaptive GAN? 3. Model Structure. [Figure labels: Generator: Decoder, Text Encoder, input $z$, output $x$; Discriminator: Classifier with Unconditional Loss (Adversarial Loss), Attribute Classifier and Text-Adaptive Classifier with Text Encoder and Conditional Loss; CN taking $f_d$ and $f_c$]
  • 31. Editable Text-Adaptive GAN? 3. Model Structure: generate novel images with specific attributes given in natural language. [Figure labels: $z$ sampled from a uniform distribution; text “She has blond hair” → Text Encoder; Decoder output $x_{gen}$; Classifier with Unconditional Loss (Adversarial Loss); Attribute Classifier and Text-Adaptive Classifier with Text Encoder and Conditional Loss; CN]
  • 32. Editable Text-Adaptive GAN? 3. Model Structure: manipulate images with specific attributes given in natural language. [Figure labels: original image $x_{origin}$ → Classifier and Attribute Classifier → $f_d$, $f_c$ → CN → $\tilde{z}$; text “She has blond hair” → Text Encoder; Decoder output: edited image $x_{edit}$; Text-Adaptive Classifier with Text Encoder; Unconditional Loss (Adversarial Loss); Conditional Loss]
  • 33. Editable Text-Adaptive GAN? 4. Formulation: a) Discriminator with Text-Adaptive Classifier: $L_D = L_{adv} + \lambda_{cond} L_C^{real}$. b) Generator (ensure realism with the correct classes, and preserve other details when editing): $L_G = L_{adv} + \lambda_{cond} L_C^{edit} + L_C^{gen} + \lambda_{recon} L_{rec}^{image}$. The image reconstruction term makes the model more suitable for editing (“only change what you want”). c) Connection Network (estimate the image's latent vector): $L_{CN} = \lambda_{recon} L_{rec}^{z}$.
  • 34. Editable Text-Adaptive GAN? 5. Experiments: Still training and fine-tuning. The results below are images produced during training (“Not good, but it works anyway”). a) Recognizing attributes in text: “This flower has petals that are white and has patches of yellow.” “A light pink flower with pointed petals and a yellow circle.” b) Generating and editing images simultaneously with the given text. [Figure: original image, edited image, and novel image given the same description]
  • 35. Editable Text-Adaptive GAN? 6. Conclusion: The limitations of Editable Text-Adaptive GAN and what I learned. - It has all the problems of the two existing models (Editable GAN, Text-Adaptive GAN): the Text-Adaptive Discriminator and the Connection Network work independently, so combining the two models does not help solve those problems. - Reconstruction loss on images, not just on latent vectors, works effectively even though this model is not based on an encoder-decoder architecture. - $f_c$ probably contains enough attribute information: there was no major problem in training without feeding $y$ into the Connection Network. [Figure: reconstruction loss on images]
  • 36. Discussion: - Is it possible to generate and edit images simultaneously without losing the original information? - Can performance be improved by integrating structures from other models designed for natural language, such as StackGAN? - What metrics should be used to compare the modified image with the original one? (Reconstruction loss increases greatly when the color or structure of the image changes.)