Published by T. Kim, M. Cha, H. Kim, J. K. Lee, and J. Kim
Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR 70, 2017
Seongcheol Baek
Reading Circle Presentation @ Hikihara Lab
Department of Electrical Engineering, Kyoto University
2019/07/19
Learning to Discover Cross-Domain Relations
with Generative Adversarial Networks
Focus of this presentation
- Recently emerging issues around GAN
- Introduction of generative adversarial networks (GAN)
- What is DiscoGAN?
Problems of interest / model architecture / mode collapse problem /
experiments / summaries / comments
Recent generative technologies
[Timeline figure, 2014 to 2019. GAN milestones: Ian J. Goodfellow invents the “generative adversarial network” (2014), Deep Convolutional GAN (DCGAN), Least Squares GAN (LSGAN), Semi-Supervised GAN, StackGAN, Auxiliary Classifier GAN (ACGAN), DiscoGAN, CycleGAN. Related events: the original AlphaGo beats a professional Go player; BW clips into color (Source: Nvidia); GauGAN (Source: Nvidia); Samsung deepfake AI fabricates a video from a single profile picture.]
Recent issues around deepfakes – security, art, etc.
A viral video in which Obama appears to insult
Donald Trump was fabricated
with FakeApp (Photo: YouTube)
A deepfake clip of Mark Zuckerberg
is being allowed to remain on Instagram
(Photo: Bill Poster UK)
- US lawmakers say AI deepfakes ‘have the potential to disrupt every facet of our society’
- At the individual level, deepfakes can be used for cyberbullying, defamation and blackmail
Edmond de Belamy: often described as the first
AI-generated artwork sold at a major auction
(created with a GAN in 2018)
What is GAN?
- Two neural networks contest with each other in a game. Given a training set, GAN learns to
generate new data with the same statistics as the training set.
- Minimax two-player game (generative model vs. discriminative model)
Minimax Problem of GAN
$$\min_G \max_D V(G, D) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
$$V(G, D) = \int_x p_{\mathrm{data}}(x) \log D(x)\, dx + \int_z p_z(z) \log(1 - D(G(z)))\, dz$$
Training of Generator: $\min_G [1 - D(G(z))] = 0$
Training of Discriminator: $\max_D D(x) = 1$ (discriminant for real data), $\max_D [1 - D(G(z))] = 1$ (discriminant for generated data)
- $V(G, D)$ has a saddle point at $D(G(z)) = \frac{1}{2}$
- $D \in [0, 1]$ indicates whether the data is fake (0) or real (1)
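The saddle-point statement above can be checked numerically on a small discrete example. This is only a sketch under stated assumptions: the distributions `p_data` and `p_g` and the helper names are made up for illustration, the second term of V is written as an expectation over generated samples x = G(z), and the closed form D*(x) = p_data(x) / (p_data(x) + p_g(x)) is the standard result from the original GAN paper rather than something shown on the slide.

```python
import numpy as np

def value(p_data, p_g, d):
    """V(G, D) = sum_x p_data(x) log D(x) + sum_x p_g(x) log(1 - D(x))."""
    return float(np.sum(p_data * np.log(d)) + np.sum(p_g * np.log(1.0 - d)))

def optimal_discriminator(p_data, p_g):
    """Pointwise maximiser of V for a fixed generator distribution p_g."""
    return p_data / (p_data + p_g)

# Hypothetical distributions over four discrete outcomes (assumption).
p_data = np.array([0.1, 0.2, 0.3, 0.4])
p_g_bad = np.array([0.4, 0.3, 0.2, 0.1])   # generator far from the data
p_g_good = p_data.copy()                    # generator matches the data

d_bad = optimal_discriminator(p_data, p_g_bad)
d_good = optimal_discriminator(p_data, p_g_good)

v_bad = value(p_data, p_g_bad, d_bad)
v_good = value(p_data, p_g_good, d_good)

print("D* when p_g = p_data:", d_good)
print("V at the saddle point:", v_good, "vs -log 4 =", -np.log(4.0))
print("V for the mismatched generator:", v_bad)
```

When the generator matches the data distribution, the best discriminator outputs 1/2 everywhere and V equals -log 4. A mismatched generator leaves the discriminator a larger value, which is what the generator's minimisation works against.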
Discover Cross-Domain Relations with GAN
Training of 2 different data sets
without explicitly paired labelling
Results of domain transfer
- Previous AI could also transfer data from one domain to another, preserving key attributes
- Previous training methods (~2016) require paired data, which is costly and hard to collect
- DiscoGAN trains on 2 different data sets without any paired data, and its results
show better performance with robustness to the mode collapse problem
(Domain A) (Domain B), with generators $G_{AB}$ (A to B) and $G_{BA}$ (B to A)
Network Models – DiscoGAN & Previous GANs
Standard GAN with GAN loss
GAN with a reconstruction loss & GAN loss
DiscoGAN
- Each generator consists of an encoder-decoder pair (input and output are images)
- The GAN loss (and the reconstruction loss, where used) is minimized during training
- In DiscoGAN, 2 coupled GANs map each domain to its counterpart domain (bijective)
Problem Formulation (1)
- Reconstruction loss measures how well the original input is reconstructed after a sequence of two
generations: $L_{CONST_A} = d(x_{ABA}, x_A)$, with $x_{AB} = G_{AB}(x_A)$ and $x_{ABA} = G_{BA}(x_{AB})$, where $d$ is a distance such as $L_1$, $L_2$, or Huber loss
- GAN loss measures how realistic the generated image is in domain B: $L_{GAN_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(x_{AB})]$
- Relaxed constraints are considered to guarantee bijection and domain transition
- Bijection: ideally $G_{AB}^{-1} = G_{BA}$ → $\min_{G_{AB}}(L_{CONST_A})$, $\min_{G_{BA}}(L_{CONST_B})$
- Domain transition: ideally $x_{AB} \in \mathbb{R}_B$, $x_{BA} \in \mathbb{R}_A$ → $\min_{D_B}(L_{D_B})$, $\min_{D_A}(L_{D_A})$
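The reconstruction loss and its candidate distances can be sketched in a few lines of NumPy. Everything concrete here is an assumption for illustration: the generators are toy linear maps rather than encoder-decoder networks, the distances are averaged over all elements, and the names `l1`, `l2`, `huber` and `reconstruction_loss` are hypothetical.

```python
import numpy as np

def l1(a, b):
    """Mean absolute error."""
    return float(np.mean(np.abs(a - b)))

def l2(a, b):
    """Mean squared error."""
    return float(np.mean((a - b) ** 2))

def huber(a, b, delta=1.0):
    """Quadratic for residuals up to delta, linear beyond it."""
    r = np.abs(a - b)
    quadratic = 0.5 * r ** 2
    linear = delta * (r - 0.5 * delta)
    return float(np.mean(np.where(r <= delta, quadratic, linear)))

def reconstruction_loss(g_ab, g_ba, x_a, d=l2):
    """d(x_ABA, x_A) with x_AB = G_AB(x_A) and x_ABA = G_BA(x_AB)."""
    x_ab = g_ab(x_a)
    x_aba = g_ba(x_ab)
    return d(x_aba, x_a)

# Toy linear "generators" acting on rows of x (assumption, not the real model).
M = np.array([[2.0, 1.0], [0.0, 1.0]])
M_inv = np.linalg.inv(M)
g_ab = lambda x: x @ M.T
g_ba_exact = lambda x: x @ M_inv.T   # true inverse of g_ab
g_ba_wrong = lambda x: x             # identity, not the inverse

x_a = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, -1.0]])
loss_exact = reconstruction_loss(g_ab, g_ba_exact, x_a)
loss_wrong = reconstruction_loss(g_ab, g_ba_wrong, x_a)

print("reconstruction loss with the inverse map:", loss_exact)
print("reconstruction loss with a non-inverse map:", loss_wrong)
```

The loss vanishes only when the second generator undoes the first, which is the sense in which minimising it pushes the pair toward the ideal $G_{AB}^{-1} = G_{BA}$.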
Problem Formulation (2)
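Generator losses are given for $G_{AB}$ and discriminator losses for $D_B$.
(a) Standard GAN with GAN loss
- Training of Generator: $L_{G_{AB}} = L_{GAN_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))]$
- Training of Discriminator: $L_{D_B} = -\mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
- Constraints Level: (none)
(b) GAN with a reconstruction loss & GAN loss
- Training of Generator: $L_{G_{AB}} = L_{GAN_B} + L_{CONST_A} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))] + d(x_{ABA}, x_A)$
- Training of Discriminator: $L_{D_B} = -\mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
- Constraints Level: doubled degrees of freedom (DOF) from (a), weaker than (a)
(c) DiscoGAN
- Training of Generator: $L_G = L_{G_{AB}} + L_{G_{BA}} = L_{GAN_B} + L_{CONST_A} + L_{GAN_A} + L_{CONST_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))] + d(x_{ABA}, x_A) - \mathbb{E}_{x_B \sim P_B}[\log D_A(G_{BA}(x_B))] + d(x_{BAB}, x_B)$
- Training of Discriminator: $L_D = L_{D_A} + L_{D_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_A(x_A)] - \mathbb{E}_{x_B \sim P_B}[\log(1 - D_A(G_{BA}(x_B)))] - \mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
- Constraints Level: doubled DOF from (b), weaker than (b)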
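To make the bookkeeping in case (c) concrete, the following sketch assembles the total generator and discriminator losses from discriminator outputs and reconstruction distances that are assumed to have been computed already. The function names and the example numbers are hypothetical, and expectations are replaced by batch means.

```python
import numpy as np

def gan_loss(d_fake):
    """-E[log D(G(x))]: small when generated samples are scored as real."""
    return float(-np.mean(np.log(d_fake)))

def discriminator_loss(d_real, d_fake):
    """-E[log D(x)] - E[log(1 - D(G(x)))]."""
    return float(-np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake)))

def discogan_losses(d_a_real, d_a_fake, d_b_real, d_b_fake, const_a, const_b):
    """Return (L_G, L_D) for the coupled model.

    d_a_fake holds D_A(G_BA(x_B)) and d_b_fake holds D_B(G_AB(x_A));
    const_a and const_b are the two reconstruction distances.
    """
    l_g = gan_loss(d_b_fake) + const_a + gan_loss(d_a_fake) + const_b
    l_d = (discriminator_loss(d_a_real, d_a_fake)
           + discriminator_loss(d_b_real, d_b_fake))
    return l_g, l_d

# Hypothetical batch where both discriminators are maximally unsure
# and both reconstructions are perfect.
half = np.full(8, 0.5)
l_g, l_d = discogan_losses(half, half, half, half, const_a=0.0, const_b=0.0)

print("L_G:", l_g)
print("L_D:", l_d)
```

With every discriminator output at 1/2 and zero reconstruction error, L_G reduces to 2 log 2 and L_D to 4 log 2, one log 2 per expectation term in the formulas above.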
Architecture of Generator
- Each generator takes an image and feeds it through an encoder-decoder pair
- Number of layers ranges from 4 to 5 depending on the domain
Domain A (resp. B) → Encoder (convolution layers) → Decoder (deconvolution layers) → Domain B (resp. A)
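A quick way to see how an encoder-decoder generator returns an image of the input size is to trace the spatial resolution layer by layer. The numbers below are assumptions rather than values stated on the slide: a 64×64 input, 4×4 kernels, stride 2, padding 1, and 4 layers on each side. The two formulas are the usual output-size rules for a strided convolution and a transposed convolution.

```python
def conv_out(n, kernel=4, stride=2, padding=1):
    """Spatial size after a strided convolution."""
    return (n + 2 * padding - kernel) // stride + 1

def deconv_out(n, kernel=4, stride=2, padding=1):
    """Spatial size after a transposed convolution (no output padding)."""
    return (n - 1) * stride - 2 * padding + kernel

def trace(size, layers):
    """Sizes seen through `layers` encoder steps, then `layers` decoder steps."""
    sizes = [size]
    for _ in range(layers):
        sizes.append(conv_out(sizes[-1]))
    for _ in range(layers):
        sizes.append(deconv_out(sizes[-1]))
    return sizes

sizes = trace(64, 4)
print(sizes)
```

Under these assumed settings each encoder layer halves the resolution and each decoder layer doubles it, so the output has the same size as the input.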
Architecture of Discriminator
- Each discriminator feeds an image through convolution layers
- The discriminator outputs a scalar through a sigmoid, indicating how real the input image is
Toy Experiment – Domain Transition Test
- In DiscoGAN, discriminator B is perfectly fooled by translated samples from domain A
- DiscoGAN prevents mode-collapse by translating into distinct well-bounded regions that do
not overlap
Panels: Initial state, Standard GAN, GAN with $L_{CONST}$, DiscoGAN
Colored points: samples in domain A
Black x’s: target modes in domain B
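One simple way to quantify what such a toy plot shows is to count how many target modes receive at least one translated sample. The sketch below is an assumed metric for illustration, not the evaluation used in the paper; the mode positions, the sample points and the radius are all hypothetical.

```python
import numpy as np

def covered_modes(samples, modes, radius):
    """Indices of modes that are the nearest mode of some sample within `radius`."""
    # Pairwise distances, shape (n_samples, n_modes).
    dists = np.linalg.norm(samples[:, None, :] - modes[None, :, :], axis=2)
    nearest = np.argmin(dists, axis=1)
    close = dists[np.arange(len(samples)), nearest] <= radius
    return sorted({int(m) for m in nearest[close]})

# Four hypothetical target modes in domain B.
modes = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])

# Translated samples that all land near the first mode (mode collapse).
collapsed = modes[0] + np.array([[0.1, 0.0], [0.0, 0.1], [-0.1, -0.1]])
# Translated samples that land near every mode.
spread = modes + 0.1

collapsed_cover = covered_modes(collapsed, modes, radius=0.5)
spread_cover = covered_modes(spread, modes, radius=0.5)

print("collapsed mapping covers modes:", collapsed_cover)
print("well-spread mapping covers modes:", spread_cover)
```

A collapsed mapping covers a single mode, while a mapping that sends samples to distinct, non-overlapping regions covers all of them.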
Mode Collapse Problem
The gradients are biased towards the mode from which
a higher number of samples are drawn to form the real training data
- The generator maps many different inputs to the same few output modes and produces unintended images; this occurs frequently in GANs
- GANs usually try to remedy this problem with additional loss terms, but it has not been fully resolved
- Other examples: communication systems, cryptography, automata, etc.
Why is DiscoGAN robust to mode collapse?
- In DiscoGAN, two coupled models are trained simultaneously. The two copies of $G_{AB}$ share
parameters, as do the two copies of $G_{BA}$
- The coupled reconstruction-loss constraints push the mapping toward a strict bijection
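The link between reconstruction and bijection can be illustrated on a tiny discrete example. The setup is an assumption made for illustration: each domain has three points indexed 0 to 2, a mapping is a tuple of indices, and the reconstruction error simply counts inputs that are not recovered. For a fixed $G_{AB}$, the sketch searches every possible $G_{BA}$ by brute force.

```python
from itertools import product

def reconstruction_errors(g_ab, g_ba):
    """Count inputs a for which G_BA(G_AB(a)) != a."""
    return sum(1 for a, b in enumerate(g_ab) if g_ba[b] != a)

def best_reconstruction_errors(g_ab):
    """Smallest error count over every possible G_BA (brute force)."""
    n = len(g_ab)
    return min(reconstruction_errors(g_ab, g_ba)
               for g_ba in product(range(n), repeat=n))

bijective = (2, 0, 1)   # each point in A gets its own point in B
partial = (1, 1, 2)     # two points in A share one point in B
collapsed = (0, 0, 0)   # every point in A lands on the same point in B

results = {name: best_reconstruction_errors(g)
           for name, g in [("bijective", bijective),
                           ("partial", partial),
                           ("collapsed", collapsed)]}
print(results)
```

Only the one-to-one mapping admits a $G_{BA}$ with zero reconstruction error. Once two inputs share an output, no inverse can recover both, so a collapsed generator always pays a reconstruction penalty.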
Real Domain Experiment – Car to Car, Face to Face
Panels: Input data, Standard GAN, GAN with $L_{CONST}$, DiscoGAN (rows: car to car, face to face)
- Reconstruction tests
- Results in DiscoGAN show higher correlations (robust to mode collapse)
Real Domain Experiment – Face Conversion
Translation of gender
Blond to black,
Black to blond hair
Glasses to non-glasses,
non-glasses to glasses
- DiscoGAN translates a specific feature while preserving other facial features
Cross-Domain Experiment (1)
Chair to car Car to face
- Note that training is implemented without any paired data
- The main attribute (azimuth) is preserved
Cross-Domain Experiment (2)
- 1-to-N problem
Handbag to sketches
Sketches to shoes
Sketches to handbags
Cross-Domain Experiment (3)
- Same style is discovered
Handbag to shoes
Shoes to handbag
Summaries
- DiscoGAN is proposed as a learning method to discover cross-domain relations without any
paired labels
- Results showed better performance with robustness to mode collapse. The symmetry
granted by coupling 2 GANs is considered to be a key factor for the dynamical robustness
Comments
- The strategy to couple two GAN models reminded me of the symmetry of dynamics. Some
correlations could be drawn to handle the stability problem…?
- This paper is giving me many ideas. It is very pleasant.
Thank you!
- Source code for simulations
Official implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks" (Github) ... https://github.com/SKTBrain/DiscoGAN
- This presentation is also available on:
https://www.slideshare.net/SeongcheolBaek/introduction-of-discogan
References
- Crux of Presentation
T. Kim, et al., Learning to Discover Cross-Domain Relations with Generative Adversarial Networks (arXiv) ... https://arxiv.org/abs/1703.05192
- Recent generative technologies
Apple announces Animoji (The Verge) … https://www.theverge.com/2017/9/12/16290210/new-iphone-emoji-animated-animoji-apple-ios-11-update
AI Can Convert Black and White Clips into Color (Nvidia Developer) ... https://news.developer.nvidia.com/ai-can-convert-black-and-white-clips-into-color/
Nvidia’s latest AI software turns rough doodles into realistic landscapes (The Verge) ... https://www.theverge.com/2019/3/19/18272602/ai-art-generation-gan-nvidia-doodle-landscapes
Deepfakes are getting easier than ever to make (The Verge) … https://www.theverge.com/2019/5/23/18637373/deepfakes-samsung-ai-research-results-single-photo-algorithm
- Recent issues around deepfakes – security, art, etc.
A viral video that appeared to show Obama calling Trump a 'dips---' shows a disturbing new trend called 'deepfakes' (Business Insider) … https://www.businessinsider.com/obama-deepfake-video-insulting-trump-2018-4
New deepfake tech turns a single photo and audio file into a singing video portrait (The Verge) ... https://www.theverge.com/2019/6/20/18692671/deepfake-technology-singing-talking-video-portrait-from-a-single-image-imperial-college-samsung
US lawmakers say AI deepfakes ‘have the potential to disrupt every facet of our society’ (The Verge) … https://www.theverge.com/2018/9/14/17859188/ai-deepfakes-national-security-threat-lawmakers-letter-intelligence-community
Deepfakes: A Threat to Individuals and National Security (Lionbridge) … https://lionbridge.ai/articles/deepfakes-a-threat-to-individuals-and-national-security/
A deepfake clip of Mark Zuckerberg is being allowed to remain on Instagram (iNews) … https://inews.co.uk/news/technology/a-deepfake-clip-of-mark-zuckerberg-is-being-allowed-to-remain-on-instagram/
Portrait of Edmond Belamy created by GAN (Wikipedia) … https://en.wikipedia.org/wiki/Edmond_de_Belamy
- Generative Adversarial Network
I. J. Goodfellow, Generative Adversarial Nets (arXiv) … https://arxiv.org/abs/1406.2661
Tutorial on Generative Adversarial Networks … https://www.slideshare.net/ckmarkohchang/generative-adversarial-networks
- Mode Collapse Problem
A. Ghosh, et al., Multi-Agent Diverse Generative Adversarial Networks (Research Gate) … https://www.researchgate.net/publication/315882247_Multi-Agent_Diverse_Generative_Adversarial_Networks

More Related Content

What's hot

Deep Generative Models
Deep Generative Models Deep Generative Models
Deep Generative Models Chia-Wen Cheng
 
PR-409: Denoising Diffusion Probabilistic Models
PR-409: Denoising Diffusion Probabilistic ModelsPR-409: Denoising Diffusion Probabilistic Models
PR-409: Denoising Diffusion Probabilistic ModelsHyeongmin Lee
 
GANs and Applications
GANs and ApplicationsGANs and Applications
GANs and ApplicationsHoang Nguyen
 
[PR12] intro. to gans jaejun yoo
[PR12] intro. to gans   jaejun yoo[PR12] intro. to gans   jaejun yoo
[PR12] intro. to gans jaejun yooJaeJun Yoo
 
Algorithm Design and Analysis - Practical File
Algorithm Design and Analysis - Practical FileAlgorithm Design and Analysis - Practical File
Algorithm Design and Analysis - Practical FileKushagraChadha1
 
Image-to-Image Translation
Image-to-Image TranslationImage-to-Image Translation
Image-to-Image TranslationJunho Kim
 
Tutorial on Deep Generative Models
 Tutorial on Deep Generative Models Tutorial on Deep Generative Models
Tutorial on Deep Generative ModelsMLReview
 
Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...
Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...
Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...Vitaly Bondar
 
BKK16-315 Graphics Stack Update
BKK16-315 Graphics Stack UpdateBKK16-315 Graphics Stack Update
BKK16-315 Graphics Stack UpdateLinaro
 
Generative Adversarial Networks and Their Applications in Medical Imaging
Generative Adversarial Networks  and Their Applications in Medical ImagingGenerative Adversarial Networks  and Their Applications in Medical Imaging
Generative Adversarial Networks and Their Applications in Medical ImagingSanghoon Hong
 
Computer vision lane line detection
Computer vision lane line detectionComputer vision lane line detection
Computer vision lane line detectionJonathan Mitchell
 
Object classification using CNN & VGG16 Model (Keras and Tensorflow)
Object classification using CNN & VGG16 Model (Keras and Tensorflow) Object classification using CNN & VGG16 Model (Keras and Tensorflow)
Object classification using CNN & VGG16 Model (Keras and Tensorflow) Lalit Jain
 
Generative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their ApplicationsGenerative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their ApplicationsArtifacia
 
Derivation of Convolutional Neural Network from Fully Connected Network Step-...
Derivation of Convolutional Neural Network from Fully Connected Network Step-...Derivation of Convolutional Neural Network from Fully Connected Network Step-...
Derivation of Convolutional Neural Network from Fully Connected Network Step-...Ahmed Gad
 
Introduction to Generative Adversarial Networks
Introduction to Generative Adversarial NetworksIntroduction to Generative Adversarial Networks
Introduction to Generative Adversarial NetworksBennoG1
 
(Paper Seminar detailed version) BART: Denoising Sequence-to-Sequence Pre-tra...
(Paper Seminar detailed version) BART: Denoising Sequence-to-Sequence Pre-tra...(Paper Seminar detailed version) BART: Denoising Sequence-to-Sequence Pre-tra...
(Paper Seminar detailed version) BART: Denoising Sequence-to-Sequence Pre-tra...hyunyoung Lee
 
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
 A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs) A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)Thomas da Silva Paula
 
Diffusion models beat gans on image synthesis
Diffusion models beat gans on image synthesisDiffusion models beat gans on image synthesis
Diffusion models beat gans on image synthesisBeerenSahu
 
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAIGenerative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAIWithTheBest
 

What's hot (20)

Deep Generative Models
Deep Generative Models Deep Generative Models
Deep Generative Models
 
PR-409: Denoising Diffusion Probabilistic Models
PR-409: Denoising Diffusion Probabilistic ModelsPR-409: Denoising Diffusion Probabilistic Models
PR-409: Denoising Diffusion Probabilistic Models
 
GANs and Applications
GANs and ApplicationsGANs and Applications
GANs and Applications
 
[PR12] intro. to gans jaejun yoo
[PR12] intro. to gans   jaejun yoo[PR12] intro. to gans   jaejun yoo
[PR12] intro. to gans jaejun yoo
 
Algorithm Design and Analysis - Practical File
Algorithm Design and Analysis - Practical FileAlgorithm Design and Analysis - Practical File
Algorithm Design and Analysis - Practical File
 
Image-to-Image Translation
Image-to-Image TranslationImage-to-Image Translation
Image-to-Image Translation
 
Open gl
Open glOpen gl
Open gl
 
Tutorial on Deep Generative Models
 Tutorial on Deep Generative Models Tutorial on Deep Generative Models
Tutorial on Deep Generative Models
 
Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...
Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...
Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Unde...
 
BKK16-315 Graphics Stack Update
BKK16-315 Graphics Stack UpdateBKK16-315 Graphics Stack Update
BKK16-315 Graphics Stack Update
 
Generative Adversarial Networks and Their Applications in Medical Imaging
Generative Adversarial Networks  and Their Applications in Medical ImagingGenerative Adversarial Networks  and Their Applications in Medical Imaging
Generative Adversarial Networks and Their Applications in Medical Imaging
 
Computer vision lane line detection
Computer vision lane line detectionComputer vision lane line detection
Computer vision lane line detection
 
Object classification using CNN & VGG16 Model (Keras and Tensorflow)
Object classification using CNN & VGG16 Model (Keras and Tensorflow) Object classification using CNN & VGG16 Model (Keras and Tensorflow)
Object classification using CNN & VGG16 Model (Keras and Tensorflow)
 
Generative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their ApplicationsGenerative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their Applications
 
Derivation of Convolutional Neural Network from Fully Connected Network Step-...
Derivation of Convolutional Neural Network from Fully Connected Network Step-...Derivation of Convolutional Neural Network from Fully Connected Network Step-...
Derivation of Convolutional Neural Network from Fully Connected Network Step-...
 
Introduction to Generative Adversarial Networks
Introduction to Generative Adversarial NetworksIntroduction to Generative Adversarial Networks
Introduction to Generative Adversarial Networks
 
(Paper Seminar detailed version) BART: Denoising Sequence-to-Sequence Pre-tra...
(Paper Seminar detailed version) BART: Denoising Sequence-to-Sequence Pre-tra...(Paper Seminar detailed version) BART: Denoising Sequence-to-Sequence Pre-tra...
(Paper Seminar detailed version) BART: Denoising Sequence-to-Sequence Pre-tra...
 
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
 A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs) A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
 
Diffusion models beat gans on image synthesis
Diffusion models beat gans on image synthesisDiffusion models beat gans on image synthesis
Diffusion models beat gans on image synthesis
 
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAIGenerative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
 

Similar to Introduction of DiscoGAN

Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Databricks
 
Tips And Tricks For Bioinformatics Software Engineering
Tips And Tricks For Bioinformatics Software EngineeringTips And Tricks For Bioinformatics Software Engineering
Tips And Tricks For Bioinformatics Software Engineeringjtdudley
 
Grokking Techtalk #38: Escape Analysis in Go compiler
 Grokking Techtalk #38: Escape Analysis in Go compiler Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking Techtalk #38: Escape Analysis in Go compilerGrokking VN
 
Class[3][5th jun] [three js]
Class[3][5th jun] [three js]Class[3][5th jun] [three js]
Class[3][5th jun] [three js]Saajid Akram
 
2 Years of Real World FP at REA
2 Years of Real World FP at REA2 Years of Real World FP at REA
2 Years of Real World FP at REAkenbot
 
VitaFlow | Mageswaran Dhandapani [Pramati]
VitaFlow | Mageswaran Dhandapani [Pramati]VitaFlow | Mageswaran Dhandapani [Pramati]
VitaFlow | Mageswaran Dhandapani [Pramati]Pramati Technologies
 
Generation of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.pptGeneration of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.pptDivyaGugulothu
 
Test-Driven Design Insights@DevoxxBE 2023.pptx
Test-Driven Design Insights@DevoxxBE 2023.pptxTest-Driven Design Insights@DevoxxBE 2023.pptx
Test-Driven Design Insights@DevoxxBE 2023.pptxVictor Rentea
 
Engine Terminology
Engine TerminologyEngine Terminology
Engine Terminologykamkill
 
Introduction to Deep Learning and Tensorflow
Introduction to Deep Learning and TensorflowIntroduction to Deep Learning and Tensorflow
Introduction to Deep Learning and TensorflowOswald Campesato
 
Andriy Shalaenko - GO security tips
Andriy Shalaenko - GO security tipsAndriy Shalaenko - GO security tips
Andriy Shalaenko - GO security tipsOWASP Kyiv
 
Advanced Graphics Workshop - GFX2011
Advanced Graphics Workshop - GFX2011Advanced Graphics Workshop - GFX2011
Advanced Graphics Workshop - GFX2011Prabindh Sundareson
 
nlp dl 1.pdf
nlp dl 1.pdfnlp dl 1.pdf
nlp dl 1.pdfnyomans1
 
Regex Considered Harmful: Use Rosie Pattern Language Instead
Regex Considered Harmful: Use Rosie Pattern Language InsteadRegex Considered Harmful: Use Rosie Pattern Language Instead
Regex Considered Harmful: Use Rosie Pattern Language InsteadAll Things Open
 
Performance #5 cpu and battery
Performance #5  cpu and batteryPerformance #5  cpu and battery
Performance #5 cpu and batteryVitali Pekelis
 
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...Taeksoo Kim
 

Similar to Introduction of DiscoGAN (20)

Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
 
Tips And Tricks For Bioinformatics Software Engineering
Tips And Tricks For Bioinformatics Software EngineeringTips And Tricks For Bioinformatics Software Engineering
Tips And Tricks For Bioinformatics Software Engineering
 
DiscoGAN
DiscoGANDiscoGAN
DiscoGAN
 
Grokking Techtalk #38: Escape Analysis in Go compiler
 Grokking Techtalk #38: Escape Analysis in Go compiler Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking Techtalk #38: Escape Analysis in Go compiler
 
Class[3][5th jun] [three js]
Class[3][5th jun] [three js]Class[3][5th jun] [three js]
Class[3][5th jun] [three js]
 
2 Years of Real World FP at REA
2 Years of Real World FP at REA2 Years of Real World FP at REA
2 Years of Real World FP at REA
 
VitaFlow | Mageswaran Dhandapani [Pramati]
VitaFlow | Mageswaran Dhandapani [Pramati]VitaFlow | Mageswaran Dhandapani [Pramati]
VitaFlow | Mageswaran Dhandapani [Pramati]
 
Generation of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.pptGeneration of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.ppt
 
Test-Driven Design Insights@DevoxxBE 2023.pptx
Test-Driven Design Insights@DevoxxBE 2023.pptxTest-Driven Design Insights@DevoxxBE 2023.pptx
Test-Driven Design Insights@DevoxxBE 2023.pptx
 
Generative AI for Reengineering Variants into Software Product Lines: An Expe...
Generative AI for Reengineering Variants into Software Product Lines: An Expe...Generative AI for Reengineering Variants into Software Product Lines: An Expe...
Generative AI for Reengineering Variants into Software Product Lines: An Expe...
 
Engine Terminology
Engine TerminologyEngine Terminology
Engine Terminology
 
Introduction to Deep Learning and Tensorflow
Introduction to Deep Learning and TensorflowIntroduction to Deep Learning and Tensorflow
Introduction to Deep Learning and Tensorflow
 
Andriy Shalaenko - GO security tips
Andriy Shalaenko - GO security tipsAndriy Shalaenko - GO security tips
Andriy Shalaenko - GO security tips
 
ELAVARASAN.pdf
ELAVARASAN.pdfELAVARASAN.pdf
ELAVARASAN.pdf
 
Advanced Graphics Workshop - GFX2011
Advanced Graphics Workshop - GFX2011Advanced Graphics Workshop - GFX2011
Advanced Graphics Workshop - GFX2011
 
Evolution of Spark APIs
Evolution of Spark APIsEvolution of Spark APIs
Evolution of Spark APIs
 
nlp dl 1.pdf
nlp dl 1.pdfnlp dl 1.pdf
nlp dl 1.pdf
 
Regex Considered Harmful: Use Rosie Pattern Language Instead
Regex Considered Harmful: Use Rosie Pattern Language InsteadRegex Considered Harmful: Use Rosie Pattern Language Instead
Regex Considered Harmful: Use Rosie Pattern Language Instead
 
Performance #5 cpu and battery
Performance #5  cpu and batteryPerformance #5  cpu and battery
Performance #5 cpu and battery
 
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
 

Recently uploaded

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 

Recently uploaded (20)

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 

Introduction of DiscoGAN

  • 1. Learning to Discover Cross-Domain Relations with Generative Adversarial Networks
      - Published by T. Kim, M. Cha, H. Kim, J. K. Lee, and J. Kim, Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR 70, 2017
      - Seongcheol Baek, Reading Circle Presentation @ Hikihara Lab, Department of Electrical Engineering, Kyoto University, 2019/07/19
  • 2. Focus of this presentation
      - Recently emerging issues around GAN
      - Introduction of generative adversarial networks (GAN)
      - What is DiscoGAN? Problems of interest / model architecture / mode collapse problem / experiments / summaries / comments
  • 3. Recent generative technologies
      - Timeline figure (2014 to 2019), items as listed on the slide: Ian J. Goodfellow invented the “generative adversarial network”; Original AlphaGo beat a professional Go player; Deep Convolutional GAN (DCGAN); Least Squares GAN (LSGAN); Semi-Supervised GAN; StackGAN; Auxiliary Classifier GAN (ACGAN); DiscoGAN; CycleGAN; BW clips into color (Source: Nvidia); GauGAN (Source: Nvidia); Samsung deepfake AI fabricates a video from a single profile pic
  • 4. Recent issues around deepfakes – security, art, etc.
      - A viral video in which Obama appears to insult Donald Trump was fabricated with FakeApp (Photo: YouTube)
      - A deepfake clip of Mark Zuckerberg is being allowed to remain on Instagram (Photo: Bill Poster UK)
      - Edmond de Belamy: the first piece of AI-generated art (created by GAN in 2018)
      - US lawmakers say AI deepfakes ‘have the potential to disrupt every facet of our society’
      - At the individual level, deepfakes can be used for cyberbullying, defamation and blackmail
  • 5. What is GAN?
      - Two neural networks contest with each other in a game. Given a training set, a GAN learns to generate new data with the same statistics as the training set.
      - Minimax two-player game (generative model vs. discriminative model)
  • 6. Minimax Problem of GAN
      - Objective: $\min_G \max_D V(G, D) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$
      - Integral form: $V(G, D) = \int_x p_{data}(x) \log D(x)\,dx + \int_z p_z(z) \log(1 - D(G(z)))\,dz$
      - Training of generator: $\min_G [1 - D(G(z))] = 0$
      - Training of discriminator: $\max_D D(x) = 1$ (discriminant for real data) and $\max_D [1 - D(G(z))] = 1$ (discriminant for generated data)
      - $V(G, D)$ has a saddle point at $D(G(z)) = \frac{1}{2}$, where $D \in [0, 1]$ scores the data as fake (0) or real (1)
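As a rough numerical illustration of the value function on slide 6, the sketch below estimates V(G, D) from discriminator scores by replacing the two expectations with sample means. The function name `gan_value` and the example scores are assumptions made for illustration; they do not come from the presentation.

```python
import math

def gan_value(d_real, d_fake):
    """Sample estimate of V(G, D) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator scores D(x) on real samples, each in (0, 1).
    d_fake: discriminator scores D(G(z)) on generated samples, each in (0, 1).
    """
    real_term = sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return real_term + fake_term

# Hypothetical scores: a confident discriminator versus a fully confused one.
confident = gan_value([0.9, 0.95, 0.99], [0.1, 0.05, 0.01])
confused = gan_value([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])

print(confident)  # a small negative number; V is bounded above by 0
print(confused)   # the value at D = 1/2, which is -log(4)
```

The confused case corresponds to the saddle point on the slide: when the discriminator outputs 1/2 everywhere, both log terms equal log(1/2).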
  • 7. Discover Cross-Domain Relations with GAN
      - Figure panels: training on 2 different data sets (Domain A, Domain B) without explicitly paired labelling, with generators $G_{AB}$ and $G_{BA}$; results of domain transfer
      - Earlier models could also transfer data from one domain to another while preserving key attributes
      - Previous training methods (~2016) require paired data, which is costly and hard to collect
      - DiscoGAN trains on 2 different data sets without any paired data, and its results show better performance with robustness to the mode collapse problem
  • 8. Network Models – DiscoGAN & Previous GANs
      - Models compared: standard GAN with GAN loss / GAN with a reconstruction loss & GAN loss / DiscoGAN
      - Each generator consists of an encoder-decoder pair (input and output are images)
      - The GAN loss (and the reconstruction loss) is minimized during training
      - In DiscoGAN, 2 coupled GANs map each domain to its counterpart domain (bijective)
  • 9. Problem Formulation (1)
      - Reconstruction loss measures how well the original input is reconstructed after a sequence of two generations: $L_{CONST_A} = d(x_{ABA}, x_A)$, where $x_{AB} = G_{AB}(x_A)$, $x_{ABA} = G_{BA}(x_{AB})$, and $d$ is a distance such as $L_1$, $L_2$, or Huber loss
      - GAN loss measures how realistic the generated image is in domain B: $L_{GAN_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(x_{AB})]$
      - Relaxed constraints are considered to guarantee bijection and domain transition
      - Bijection: ideally $G_{AB}^{-1} = G_{BA}$ → $\min_{G_{AB}}(L_{CONST_A})$, $\min_{G_{BA}}(L_{CONST_B})$
      - Domain transition: ideally $x_{AB} \in \mathbb{R}_B$, $x_{BA} \in \mathbb{R}_A$ → $\min_{D_B}(L_{D_B})$, $\min_{D_A}(L_{D_A})$
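The slide leaves the distance d in the reconstruction loss open (L1, L2, or Huber). A minimal sketch of the three choices follows, written on flat Python lists. The mean-reduced forms, the `delta` threshold, and the toy vectors are assumptions for illustration only.

```python
def l1_distance(x, y):
    """Mean absolute error between two equally sized flat lists."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

def l2_distance(x, y):
    """Mean squared error between two equally sized flat lists."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def huber_distance(x, y, delta=1.0):
    """Mean Huber loss: quadratic for small residuals, linear for large ones."""
    total = 0.0
    for a, b in zip(x, y):
        r = abs(a - b)
        total += 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
    return total / len(x)

# Hypothetical flattened "images": x_a is the input, x_aba its reconstruction.
x_a = [0.0, 0.0, 1.0, 1.0]
x_aba = [0.5, 3.0, 1.0, 1.0]

for d in (l1_distance, l2_distance, huber_distance):
    print(d.__name__, d(x_aba, x_a))
```

The large residual of 3.0 dominates the L2 value, while Huber treats it linearly, which is the usual reason to pick it when outliers are expected.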
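  • 10. Problem Formulation (2): training of the generator (in case of $G_{AB}$), training of the discriminator (in case of $D_B$), and constraints level
      - (a) Standard GAN with GAN loss
          Generator: $L_{G_{AB}} = L_{GAN_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))]$
          Discriminator: $L_{D_B} = -\mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
          Constraints level: –
      - (b) GAN with a reconstruction loss & GAN loss
          Generator: $L_{G_{AB}} = L_{GAN_B} + L_{CONST_A} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))] + d(x_{ABA}, x_A)$
          Discriminator: $L_{D_B} = -\mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
          Constraints level: doubled DOF from (a), weaker than (a)
      - (c) DiscoGAN
          Generator: $L_G = L_{G_{AB}} + L_{G_{BA}} = L_{GAN_B} + L_{CONST_A} + L_{GAN_A} + L_{CONST_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))] + d(x_{ABA}, x_A) - \mathbb{E}_{x_B \sim P_B}[\log D_A(G_{BA}(x_B))] + d(x_{BAB}, x_B)$
          Discriminator: $L_D = L_{D_A} + L_{D_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_A(x_A)] - \mathbb{E}_{x_B \sim P_B}[\log(1 - D_A(G_{BA}(x_B)))] - \mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
          Constraints level: doubled DOF from (b), weaker than (b)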
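To make the bookkeeping of row (c) concrete, the sketch below assembles the total generator and discriminator losses from precomputed discriminator scores and reconstruction distances. The function names, argument names, and example numbers are hypothetical; a real implementation would compute these quantities from network outputs.

```python
import math

def mean_neg_log(values):
    """Sample estimate of -E[log v]."""
    return -sum(math.log(v) for v in values) / len(values)

def discogan_losses(d_a_real, d_a_fake, d_b_real, d_b_fake, recon_a, recon_b):
    """Return (L_G, L_D) for the DiscoGAN row.

    d_a_real: D_A(x_A) on real A samples;  d_a_fake: D_A(G_BA(x_B)).
    d_b_real: D_B(x_B) on real B samples;  d_b_fake: D_B(G_AB(x_A)).
    recon_a:  d(x_ABA, x_A);               recon_b:  d(x_BAB, x_B).
    """
    # Generator: L_GAN_B + L_CONST_A + L_GAN_A + L_CONST_B
    loss_g = mean_neg_log(d_b_fake) + recon_a + mean_neg_log(d_a_fake) + recon_b
    # Discriminator: L_D_A + L_D_B
    loss_d_a = mean_neg_log(d_a_real) + mean_neg_log([1.0 - v for v in d_a_fake])
    loss_d_b = mean_neg_log(d_b_real) + mean_neg_log([1.0 - v for v in d_b_fake])
    return loss_g, loss_d_a + loss_d_b

# Hypothetical scores at the point where both discriminators output 1/2.
half = [0.5, 0.5]
loss_g, loss_d = discogan_losses(half, half, half, half, recon_a=0.1, recon_b=0.2)
print(loss_g, loss_d)
```

Dropping the `recon_b` and `d_a_*` terms recovers row (b), and additionally dropping `recon_a` recovers row (a).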
  • 11. Architecture of Generator
      - Each generator takes an image and feeds it through an encoder-decoder pair: encoder (convolution layers) followed by decoder (deconvolution layers), mapping Domain A (resp. B) to Domain B (resp. A)
      - The number of layers ranges from 4 to 5 depending on the domain
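The encoder-decoder shape can be checked with the standard output-size formulas for convolution and transposed convolution. The sketch below traces the spatial size through the layers. The 64x64 input, 4x4 kernels, stride 2, and padding 1 are assumptions chosen for illustration; the slide itself only states the layer count.

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Spatial size after a convolution layer."""
    return (size + 2 * padding - kernel) // stride + 1

def deconv_out(size, kernel=4, stride=2, padding=1):
    """Spatial size after a transposed convolution (deconvolution) layer."""
    return (size - 1) * stride - 2 * padding + kernel

def trace_generator(size, num_layers):
    """List the spatial size after each encoder and decoder layer."""
    encoder = [size]
    for _ in range(num_layers):
        encoder.append(conv_out(encoder[-1]))
    decoder = [encoder[-1]]
    for _ in range(num_layers):
        decoder.append(deconv_out(decoder[-1]))
    return encoder, decoder

# Assumed 64x64 input with 4 layers on each side.
encoder, decoder = trace_generator(64, 4)
print(encoder)  # [64, 32, 16, 8, 4]
print(decoder)  # [4, 8, 16, 32, 64]
```

With these settings each encoder layer halves the resolution and each decoder layer doubles it, so the output image has the same size as the input.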
  • 12. Architecture of Discriminator
      - Each discriminator feeds an image through convolution layers
      - The discriminator outputs a scalar through a sigmoid, indicating how real the input image is
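The final sigmoid turns the last layer's real-valued output (a logit) into a score in (0, 1). A small sketch of a numerically stable sigmoid is shown below; the example logits are hypothetical.

```python
import math

def sigmoid(logit):
    """Numerically stable logistic function mapping a real logit to (0, 1)."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)

# Hypothetical logits from the last convolution layer of a discriminator.
for logit in (-4.0, 0.0, 4.0):
    print(logit, sigmoid(logit))
```

Splitting on the sign of the logit avoids calling `math.exp` on a large positive argument, which would otherwise overflow.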
  • 13. Toy Experiment – Domain Transition Test
      - Figure panels: initial state / standard GAN / GAN with $L_{CONST}$ / DiscoGAN. Colored points: samples in domain A. Black x's: target modes in domain B
      - In DiscoGAN, discriminator B is perfectly fooled by translated samples from domain A
      - DiscoGAN prevents mode collapse by translating into distinct, well-bounded regions that do not overlap
  • 14. Mode Collapse Problem
      - The gradients are biased towards the mode from which a higher number of samples are drawn to form the real training data
      - The generator outputs unintended images in a different mode, which occurs prevalently in GANs
      - GANs usually try to remedy this problem through their losses; however, it has not been resolved perfectly
      - Other examples: communication system, cryptography, automaton, etc.
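One simple way to see mode collapse in a 2-D toy setting is to count how many target modes receive at least one generated sample. The sketch below does this with a nearest-mode assignment. The mode locations, the `radius` threshold, and the sample points are all made up for illustration.

```python
import math

def covered_modes(samples, modes, radius=0.5):
    """Return the set of mode indices that have at least one sample within `radius`."""
    covered = set()
    for sx, sy in samples:
        # Assign each sample to its nearest mode.
        dists = [math.hypot(sx - mx, sy - my) for mx, my in modes]
        nearest = min(range(len(modes)), key=dists.__getitem__)
        if dists[nearest] <= radius:
            covered.add(nearest)
    return covered

# Hypothetical target modes in domain B and two sets of generated points.
modes = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
collapsed = [(0.1, 0.0), (-0.1, 0.1), (0.0, -0.2), (0.2, 0.1)]
spread = [(0.1, 0.0), (3.9, 0.1), (0.0, 4.2), (4.1, 3.8)]

print(len(covered_modes(collapsed, modes)))  # 1: every sample lands on the same mode
print(len(covered_modes(spread, modes)))     # 4: all target modes are reached
```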
  • 15. Why is DiscoGAN robust to mode collapse?
      - In DiscoGAN, two coupled models are trained simultaneously. The two copies of $G_{AB}$ share parameters, as do the two copies of $G_{BA}$
      - The constraints from the coupled reconstruction losses lead to a strict bijection
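The link between reconstruction and bijection can be illustrated on a tiny discrete example: if the forward map sends two inputs to the same output, no backward map can reconstruct both, so the reconstruction loss cannot reach zero. The dictionaries below stand in for the generators and are purely hypothetical.

```python
def reconstruction_failures(g_ab, g_ba, domain_a):
    """Return the items x in domain A for which G_BA(G_AB(x)) != x."""
    return [x for x in domain_a if g_ba[g_ab[x]] != x]

domain_a = ["a1", "a2", "a3"]

# Collapsed forward map: a1 and a2 are both sent to b1, so no G_BA can undo it.
collapsed_ab = {"a1": "b1", "a2": "b1", "a3": "b3"}
best_effort_ba = {"b1": "a1", "b2": "a2", "b3": "a3"}

# One-to-one forward map with its exact inverse.
bijective_ab = {"a1": "b1", "a2": "b2", "a3": "b3"}
inverse_ba = {b: a for a, b in bijective_ab.items()}

print(reconstruction_failures(collapsed_ab, best_effort_ba, domain_a))  # ['a2']
print(reconstruction_failures(bijective_ab, inverse_ba, domain_a))      # []
```

Penalizing reconstruction failures in both directions therefore pushes the pair of maps toward a one-to-one correspondence, which is the argument made on this slide.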
  • 16. Real Domain Experiment – Car to Car, Face to Face
      - Figure panels: input data / standard GAN / GAN with $L_{CONST}$ / DiscoGAN, for car to car and face to face
      - Reconstruction tests
      - Results from DiscoGAN show higher correlations (robust to mode collapse)
  • 17. Real Domain Experiment – Face Conversion
      - Figure panels: translation of gender / blond to black and black to blond hair / glasses to non-glasses and non-glasses to glasses
      - DiscoGAN translates a specific feature while preserving the other facial features
  • 18. Cross-Domain Experiment (1)
      - Figure panels: chair to car / car to face
      - Note that training is carried out without any paired data
      - The main attribute (azimuth) is preserved
  • 19. Cross-Domain Experiment (2)
      - Figure panels: handbag to sketches / sketches to shoes / sketches to handbags
      - 1-to-N problem
  • 20. Cross-Domain Experiment (3)
      - Figure panels: handbag to shoes / shoes to handbag
      - The same style is discovered
  • 21. Summaries
      - DiscoGAN is proposed as a learning method to discover cross-domain relations without any pair labels
      - Results showed better performance with robustness to mode collapse. The symmetry granted by coupling 2 GANs is considered to be a key factor for the dynamical robustness
    Comments
      - The strategy of coupling two GAN models reminded me of the symmetry of dynamics. Some correlations could be drawn to handle the stability problem…?
      - This paper is giving me many ideas. It is very pleasant.
  • 22. Thank you!
      - Source code for simulations: official implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks" (GitHub) ... https://github.com/SKTBrain/DiscoGAN
      - This presentation is also available on: https://www.slideshare.net/SeongcheolBaek/introduction-of-discogan
  • 23. References
      - Crux of Presentation
          T. Kim, et al., Learning to Discover Cross-Domain Relations with Generative Adversarial Networks (arXiv) ... https://arxiv.org/abs/1703.05192
      - Recent generative technologies
          Apple announces Animoji (The Verge) ... https://www.theverge.com/2017/9/12/16290210/new-iphone-emoji-animated-animoji-apple-ios-11-update
          AI Can Convert Black and White Clips into Color (Nvidia Developer) ... https://news.developer.nvidia.com/ai-can-convert-black-and-white-clips-into-color/
          Nvidia’s latest AI software turns rough doodles into realistic landscapes (The Verge) ... https://www.theverge.com/2019/3/19/18272602/ai-art-generation-gan-nvidia-doodle-landscapes
          Deepfakes are getting easier than ever to make (The Verge) ... https://www.theverge.com/2019/5/23/18637373/deepfakes-samsung-ai-research-results-single-photo-algorithm
      - Recent issues around deepfakes – security, art, etc.
          A viral video that appeared to show Obama calling Trump a 'dips---' shows a disturbing new trend called 'deepfakes' (Business Insider) ... https://www.businessinsider.com/obama-deepfake-video-insulting-trump-2018-4
          New deepfake tech turns a single photo and audio file into a singing video portrait (The Verge) ... https://www.theverge.com/2019/6/20/18692671/deepfake-technology-singing-talking-video-portrait-from-a-single-image-imperial-college-samsung
          US lawmakers say AI deepfakes ‘have the potential to disrupt every facet of our society’ (The Verge) ... https://www.theverge.com/2018/9/14/17859188/ai-deepfakes-national-security-threat-lawmakers-letter-intelligence-community
          Deepfakes: A Threat to Individuals and National Security (Lionbridge) ... https://lionbridge.ai/articles/deepfakes-a-threat-to-individuals-and-national-security/
          A deepfake clip of Mark Zuckerberg is being allowed to remain on Instagram (iNews) ... https://inews.co.uk/news/technology/a-deepfake-clip-of-mark-zuckerberg-is-being-allowed-to-remain-on-instagram/
          Portrait of Edmond Belamy created by GAN (Wikipedia) ... https://en.wikipedia.org/wiki/Edmond_de_Belamy
      - Generative Adversarial Network
          I. J. Goodfellow, Generative Adversarial Nets (arXiv) ... https://arxiv.org/abs/1406.2661
          Tutorial on Generative Adversarial Networks ... https://www.slideshare.net/ckmarkohchang/generative-adversarial-networks
      - Mode Collapse Problem
          A. Ghosh, et al., Multi-Agent Diverse Generative Adversarial Networks (ResearchGate) ... https://www.researchgate.net/publication/315882247_Multi-Agent_Diverse_Generative_Adversarial_Networks