Published by T. Kim, M. Cha, H. Kim, J. K. Lee, and J. Kim
Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR 70, 2017
Seongcheol Baek
Reading Circle Presentation @ Hikihara Lab
Department of Electrical Engineering, Kyoto University
2019/07/19
Learning to Discover Cross-Domain Relations
with Generative Adversarial Networks
Focus of this presentation
- Recently emerging issues around GAN
- Introduction of generative adversarial networks (GAN)
- What is DiscoGAN?
Problems of interest / model architecture / mode collapse problem /
experiments / summaries / comments
Recent generative technologies
[Figure: timeline of generative technologies, 2014 to 2019. Entries shown: Ian J. Goodfellow invents the "generative adversarial network" (2014); Deep Convolutional GAN (DCGAN); Least Squares GAN (LSGAN); Semi-Supervised GAN; StackGAN; Auxiliary Classifier GAN (ACGAN); CycleGAN; DiscoGAN; the original AlphaGo beats a professional Go player; BW clips into color (Source: Nvidia); GauGAN (Source: Nvidia); Samsung deepfake AI fabricates a video from a single profile picture]
Recent issues around deepfakes – security, art, etc.
A viral video in which Obama appears to insult Donald Trump was fabricated with FakeApp (Photo: YouTube)
A deepfake clip of Mark Zuckerberg
is being allowed to remain on Instagram
(Photo: Bill Poster UK)
- US lawmakers say AI deepfakes ‘have the potential to disrupt every facet of our society’
- At the individual level, deepfakes can be used for cyberbullying, defamation, and blackmail
Edmond de Belamy: The first piece
of AI-generated art
(created by GAN in 2018)
What is GAN?
- Two neural networks contest with each other in a game. Given a training set, GAN learns to
generate new data with the same statistics as the training set.
- Minimax two-player game (generative model vs. discriminative model)
Minimax Problem of GAN
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
$$V(D, G) = \int_x p_{\mathrm{data}}(x) \log D(x)\,dx + \int_z p_z(z) \log(1 - D(G(z)))\,dz$$
- Training of generator: $\min_G [1 - D(G(z))] = 0$
- Training of discriminator: $\max_D D(x) = 1$ (discriminant for real data), $\max_D [1 - D(G(z))] = 1$ (discriminant for generated data)
- $V(D, G)$ has a saddle point at $D(G(z)) = \frac{1}{2}$
- $D \in [0, 1]$ indicates whether the data is fake (0) or real (1)
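The saddle point can be checked numerically. The following is a minimal sketch, assuming small discrete distributions in place of p_data and of the generator's output distribution p_g (all values are hypothetical toy numbers). It relies on the closed-form optimal discriminator D*(x) = p_data(x) / (p_data(x) + p_g(x)) from Goodfellow's original GAN paper, which the slide does not state. The second term of V is evaluated over x drawn from p_g, which is equivalent to integrating over z.

```python
import math

def optimal_discriminator(p_data, p_g):
    """D*(x) = p_data(x) / (p_data(x) + p_g(x)) for each outcome x."""
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_g)]

def value(p_data, p_g, d):
    """V(D, G) = sum_x p_data(x) log D(x) + sum_x p_g(x) log(1 - D(x))."""
    return sum(pd * math.log(dx) + pg * math.log(1.0 - dx)
               for pd, pg, dx in zip(p_data, p_g, d))

p_data = [0.1, 0.2, 0.3, 0.4]     # toy data distribution (assumed)
p_g_bad = [0.4, 0.3, 0.2, 0.1]    # generator that does not match the data
p_g_good = list(p_data)           # generator that matches the data exactly

d_bad = optimal_discriminator(p_data, p_g_bad)
d_good = optimal_discriminator(p_data, p_g_good)

v_bad = value(p_data, p_g_bad, d_bad)
v_good = value(p_data, p_g_good, d_good)

print("D* when p_g = p_data:", d_good)
print("V at the saddle point:", v_good, "vs -log 4 =", -math.log(4.0))
print("V for the mismatched generator:", v_bad)
```

Under these assumptions the discriminator output is 1/2 for every outcome once the generator matches the data, and the value function there equals -log 4, which is lower than the value reached by the mismatched generator.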
Discover Cross-Domain Relations with GAN
Training of 2 different data sets
without explicitly paired labelling
Results of domain transfer
- Previous AI could also transfer data from one domain to another, preserving key attributes
- Previous training methods (~2016) require paired data, which is costly and hard to collect
- DiscoGAN trains on 2 different data sets without any paired data, and its results show better performance and robustness to the mode collapse problem
[Figure: generator G_AB maps Domain A to Domain B; generator G_BA maps Domain B to Domain A]
Network Models – DiscoGAN & Previous GANs
Standard GAN with GAN loss
GAN with a reconstruction loss & GAN loss
DiscoGAN
- Each generator consists of an encoder-decoder pair (input and output are images)
- The GAN loss (and the reconstruction loss, where present) is minimized during training
- In DiscoGAN, 2 coupled GANs map each domain to its counterpart domain (bijective)
Problem Formulation (1)
- Reconstruction loss measures how well the original input is reconstructed after a sequence of two generations: $L_{CONST_A} = d(x_{ABA}, x_A)$, with $d$ such as $L_1$, $L_2$, or Huber loss
- GAN loss measures how realistic the generated image is in domain B: $L_{GAN_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(x_{AB})]$
- Relaxed constraints are considered to guarantee bijection and domain transition
- Bijection: ideally $G_{AB}^{-1} = G_{BA}$ → $\min_{G_{AB}} (L_{CONST_A})$, $\min_{G_{BA}} (L_{CONST_B})$
- Domain transition: ideally $x_{AB} \in \mathbb{R}_B$, $x_{BA} \in \mathbb{R}_A$ → $\min_{D_B} (L_{D_B})$, $\min_{D_A} (L_{D_A})$
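The two losses on this slide can be sketched for scalar "images". This is only an illustration: `g_ab`, `g_ba`, and `d_b` are hypothetical hand-written stand-ins for trained networks, the distance d is taken to be the squared difference, and x_AB = G_AB(x_A), x_ABA = G_BA(G_AB(x_A)).

```python
import math

def g_ab(x_a):
    """Toy generator A -> B (hypothetical): an affine map."""
    return 2.0 * x_a + 1.0

def g_ba(x_b):
    """Toy generator B -> A (hypothetical): the exact inverse of g_ab."""
    return (x_b - 1.0) / 2.0

def d_b(x_b):
    """Toy discriminator for domain B: a sigmoid score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x_b))

def reconstruction_loss(samples_a):
    """L_CONST_A, with d chosen as the squared difference between x_ABA and x_A."""
    return sum((g_ba(g_ab(x)) - x) ** 2 for x in samples_a) / len(samples_a)

def gan_loss_b(samples_a):
    """L_GAN_B = -mean(log D_B(G_AB(x_A))) over the samples."""
    return -sum(math.log(d_b(g_ab(x))) for x in samples_a) / len(samples_a)

samples_a = [-1.0, 0.0, 0.5, 2.0]   # toy samples from domain A (assumed)
l_const_a = reconstruction_loss(samples_a)
l_gan_b = gan_loss_b(samples_a)

print("L_CONST_A:", l_const_a)
print("L_GAN_B:", l_gan_b)
```

Because `g_ba` inverts `g_ab` exactly, the reconstruction loss vanishes here, while the GAN loss stays positive as long as the discriminator score is below 1.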
Problem Formulation (2)
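Generator training is shown for $G_{AB}$ and discriminator training for $D_B$.

(a) Standard GAN with GAN loss
- Training of generator: $L_{G_{AB}} = L_{GAN_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))]$
- Training of discriminator: $L_{D_B} = -\mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
- Constraints level: (none)

(b) GAN with a reconstruction loss & GAN loss
- Training of generator: $L_{G_{AB}} = L_{GAN_B} + L_{CONST_A} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))] + d(x_{ABA}, x_A)$
- Training of discriminator: $L_{D_B} = -\mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
- Constraints level: doubled DOF from (a), weaker than (a)

(c) DiscoGAN
- Training of generator: $L_G = L_{G_{AB}} + L_{G_{BA}} = L_{GAN_B} + L_{CONST_A} + L_{GAN_A} + L_{CONST_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))] + d(x_{ABA}, x_A) - \mathbb{E}_{x_B \sim P_B}[\log D_A(G_{BA}(x_B))] + d(x_{BAB}, x_B)$
- Training of discriminator: $L_D = L_{D_A} + L_{D_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_A(x_A)] - \mathbb{E}_{x_B \sim P_B}[\log(1 - D_A(G_{BA}(x_B)))] - \mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
- Constraints level: doubled DOF from (b), weaker than (b)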
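The DiscoGAN objective in (c) sums four generator terms and four discriminator terms. A minimal sketch of that bookkeeping follows, assuming scalar samples and hypothetical hand-written generators and discriminators passed in as plain functions; the helper name `discogan_losses` is illustrative and is not taken from the official implementation.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def discogan_losses(xs_a, xs_b, g_ab, g_ba, d_a, d_b, dist):
    """Return (L_G, L_D) for model (c), estimated on two unpaired batches."""
    # Generator side: fool both discriminators and reconstruct both inputs.
    l_gan_b = -mean(math.log(d_b(g_ab(x))) for x in xs_a)
    l_gan_a = -mean(math.log(d_a(g_ba(x))) for x in xs_b)
    l_const_a = mean(dist(g_ba(g_ab(x)), x) for x in xs_a)
    l_const_b = mean(dist(g_ab(g_ba(x)), x) for x in xs_b)
    l_g = l_gan_b + l_const_a + l_gan_a + l_const_b

    # Discriminator side: score real samples high and translated samples low.
    l_d_a = (-mean(math.log(d_a(x)) for x in xs_a)
             - mean(math.log(1.0 - d_a(g_ba(x))) for x in xs_b))
    l_d_b = (-mean(math.log(d_b(x)) for x in xs_b)
             - mean(math.log(1.0 - d_b(g_ab(x))) for x in xs_a))
    return l_g, l_d_a + l_d_b

# Toy scalar "domains" and hypothetical stand-ins for the trained networks.
xs_a = [0.0, 1.0, 2.0]
xs_b = [10.0, 11.0, 12.0]
toy = dict(
    g_ab=lambda x: x + 10.0,
    g_ba=lambda x: x - 10.0,
    d_a=lambda x: sigmoid(1.0 - abs(x - 1.0)),
    d_b=lambda x: sigmoid(1.0 - abs(x - 11.0)),
)
l_g, l_d = discogan_losses(xs_a, xs_b, dist=lambda u, v: (u - v) ** 2, **toy)

print("L_G:", l_g)
print("L_D:", l_d)
```

In this toy setup the two generators are exact inverses, so both reconstruction terms are zero and L_G reduces to the two GAN terms.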
Architecture of Generator
- Each generator takes an image and feeds it through an encoder-decoder pair
- Number of layers ranges from 4 to 5 depending on the domain
[Figure: image from Domain A (resp. B) → Encoder (convolution layers) → Decoder (deconvolution layers) → image in Domain B (resp. A)]
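To make the encoder-decoder shape concrete, the sketch below traces spatial sizes through four convolution layers and four deconvolution layers. The 64x64 input, 4x4 kernels, stride 2, and padding 1 are assumptions chosen for illustration; the slide only states that the number of layers ranges from 4 to 5. The size formulas are the standard ones for convolution and transposed convolution without dilation or output padding.

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Spatial size after one convolution layer."""
    return (size + 2 * padding - kernel) // stride + 1

def deconv_out(size, kernel=4, stride=2, padding=1):
    """Spatial size after one deconvolution (transposed convolution) layer."""
    return (size - 1) * stride - 2 * padding + kernel

def trace(size, layers, step):
    """List the spatial size before the first layer and after each layer."""
    sizes = [size]
    for _ in range(layers):
        sizes.append(step(sizes[-1]))
    return sizes

encoder_sizes = trace(64, 4, conv_out)                    # assumed 64x64 input
decoder_sizes = trace(encoder_sizes[-1], 4, deconv_out)   # mirror of the encoder

print("encoder:", encoder_sizes)
print("decoder:", decoder_sizes)
```

With these assumed settings each encoder layer halves the resolution and each decoder layer doubles it, so the output image has the same size as the input.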
Architecture of Discriminator
- Each discriminator feeds an image through convolution layers
- The discriminator outputs a scalar through a sigmoid, indicating how real the input image is
Toy Experiment – Domain Transition Test
- In DiscoGAN, discriminator B is perfectly fooled by translated samples from domain A
- DiscoGAN prevents mode-collapse by translating into distinct well-bounded regions that do
not overlap
[Figure panels: Initial state / Standard GAN / GAN with L_CONST / DiscoGAN]
Colored points: samples in domain A
Black x’s: target modes in domain B
Mode Collapse Problem
The gradients are biased towards the mode from which a higher number of samples are drawn to form the real training data
- The generator outputs unintended images from a different mode; this occurs prevalently in GANs
- GANs usually try to remedy this problem through additional losses; however, it has not been resolved perfectly
- Other examples: communication system, cryptography, automaton, etc.
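One simple way to quantify mode collapse in a toy setting is to count how many target modes receive at least one generated sample. The sketch below is a hypothetical diagnostic, not a metric taken from the paper or the slides; the mode locations and sample points are made-up 2-D values.

```python
def nearest_mode(point, modes):
    """Index of the target mode closest to the point (squared Euclidean distance)."""
    return min(range(len(modes)),
               key=lambda i: (point[0] - modes[i][0]) ** 2
                             + (point[1] - modes[i][1]) ** 2)

def covered_modes(points, modes):
    """Set of mode indices that attract at least one generated point."""
    return {nearest_mode(p, modes) for p in points}

modes = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]         # assumed targets
collapsed = [(0.1, -0.2), (0.3, 0.1), (-0.1, 0.2), (0.2, 0.0)]   # all near one mode
spread = [(0.1, -0.2), (3.9, 0.2), (0.2, 4.1), (4.2, 3.8)]       # one per mode

collapsed_coverage = len(covered_modes(collapsed, modes)) / len(modes)
spread_coverage = len(covered_modes(spread, modes)) / len(modes)

print("coverage (collapsed):", collapsed_coverage)
print("coverage (spread):", spread_coverage)
```

A coverage well below 1 suggests that the generator has concentrated its outputs on a few modes.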
Why is DiscoGAN robust to mode collapse?
- In DiscoGAN, two coupled models are trained together simultaneously. The $G_{AB}$'s and the $G_{BA}$'s share parameters
- The constraints from the coupled reconstruction losses lead to a strict bijection
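The link between the reconstruction loss and bijection can be illustrated with a small sketch, assuming scalar samples and a squared-error distance. If G_AB collapses two different inputs onto the same output, G_BA (being a function) can return only one value for that output, so the reconstruction loss cannot reach zero. The maps below are hypothetical toy functions.

```python
def recon_loss(samples_a, g_ab, g_ba):
    """Mean squared error between G_BA(G_AB(x)) and x."""
    return sum((g_ba(g_ab(x)) - x) ** 2 for x in samples_a) / len(samples_a)

samples_a = [-1.0, 1.0]

# Collapsed generator: both inputs land on the same point in domain B.
collapsed = lambda x: 0.0

# G_BA must return a single value c for that point; try a grid of candidates.
candidates = [c / 10.0 for c in range(-20, 21)]
best_collapsed = min(
    recon_loss(samples_a, collapsed, lambda y, c=c: c) for c in candidates
)

# One-to-one generator paired with its exact inverse.
one_to_one = lambda x: 3.0 * x
inverse = lambda y: y / 3.0
best_bijective = recon_loss(samples_a, one_to_one, inverse)

print("best reconstruction loss, collapsed G_AB:", best_collapsed)
print("reconstruction loss, bijective G_AB:", best_bijective)
```

In this example the collapsed mapping cannot do better than a loss of 1, whereas the invertible mapping reaches 0, which is why minimizing both reconstruction losses pushes the pair of generators toward a one-to-one mapping.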
Real Domain Experiment – Car to Car, Face to Face
[Figure panels: Input data / Standard GAN / GAN with L_CONST / DiscoGAN; rows: car to car, face to face]
- Reconstruction tests
- Results from DiscoGAN show higher correlations (robust to mode collapse)
Real Domain Experiment – Face Conversion
Translation of gender
Blond to black,
Black to blond hair
Glasses to non-glasses,
non-glasses to glasses
- DiscoGAN translates a specific feature while preserving the other facial features
Cross-Domain Experiment (1)
Chair to car Car to face
- Note that training is implemented without any paired data
- The main attribute (azimuth) is preserved
Cross-Domain Experiment (2)
- 1-to-N problem
Handbag to sketches
Sketches to shoes
Sketches to handbags
Cross-Domain Experiment (3)
- Same style is discovered
Handbag to shoes
Shoes to handbag
Summaries
- DiscoGAN is proposed as a learning method to discover cross-domain relations without any paired labels
- Results showed better performance with robustness to mode collapse. The symmetry granted by coupling 2 GANs is considered to be a key factor for the dynamical robustness
Comments
- The strategy to couple two GAN models reminded me of the symmetry of dynamics. Some
correlations could be drawn to handle the stability problem…?
- This paper is giving me many ideas. It is very pleasant.
Thank you!
- Source code for simulations
Official implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks" (Github) ... https://github.com/SKTBrain/DiscoGAN
- This presentation is also available on:
https://www.slideshare.net/SeongcheolBaek/introduction-of-discogan
References
- Crux of Presentation
T. Kim, et al., Learning to Discover Cross-Domain Relations with Generative Adversarial Networks (arXiv) ... https://arxiv.org/abs/1703.05192
- Recent generative technologies
Apple announces Animoji (The Verge) … https://www.theverge.com/2017/9/12/16290210/new-iphone-emoji-animated-animoji-apple-ios-11-update
AI Can Convert Black and White Clips into Color (Nvidia Developer) ... https://news.developer.nvidia.com/ai-can-convert-black-and-white-clips-into-color/
Nvidia’s latest AI software turns rough doodles into realistic landscapes (The Verge) ... https://www.theverge.com/2019/3/19/18272602/ai-art-generation-gan-nvidia-doodle-landscapes
Deepfakes are getting easier than ever to make (The Verge) … https://www.theverge.com/2019/5/23/18637373/deepfakes-samsung-ai-research-results-single-photo-algorithm
- Recent issues around deepfakes – security, art, etc.
A viral video that appeared to show Obama calling Trump a 'dips---' shows a disturbing new trend called 'deepfakes' (Business Insider) … https://www.businessinsider.com/obama-deepfake-video-insulting-trump-2018-4
New deepfake tech turns a single photo and audio file into a singing video portrait (The Verge) ... https://www.theverge.com/2019/6/20/18692671/deepfake-technology-singing-talking-video-portrait-from-a-single-image-imperial-college-samsung
US lawmakers say AI deepfakes ‘have the potential to disrupt every facet of our society’ (The Verge) … https://www.theverge.com/2018/9/14/17859188/ai-deepfakes-national-security-threat-lawmakers-letter-intelligence-community
Deepfakes: A Threat to Individuals and National Security (Lionbridge) … https://lionbridge.ai/articles/deepfakes-a-threat-to-individuals-and-national-security/
A deepfake clip of Mark Zuckerberg is being allowed to remain on Instagram (iNews) … https://inews.co.uk/news/technology/a-deepfake-clip-of-mark-zuckerberg-is-being-allowed-to-remain-on-instagram/
Portrait of Edmond Belamy created by GAN (Wikipedia) … https://en.wikipedia.org/wiki/Edmond_de_Belamy
- Generative Adversarial Network
I. J. Goodfellow, Generative Adversarial Nets (arXiv) … https://arxiv.org/abs/1406.2661
Tutorial on Generative Adversarial Networks … https://www.slideshare.net/ckmarkohchang/generative-adversarial-networks
- Mode Collapse Problem
A. Ghosh, et al., Multi-Agent Diverse Generative Adversarial Networks (Research Gate) … https://www.researchgate.net/publication/315882247_Multi-Agent_Diverse_Generative_Adversarial_Networks

Introduction of DiscoGAN

  • 1. Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. Paper by T. Kim, M. Cha, H. Kim, J. K. Lee, and J. Kim, Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR 70, 2017. Presented by Seongcheol Baek, Reading Circle Presentation @ Hikihara Lab, Department of Electrical Engineering, Kyoto University, 2019/07/19.
  • 2. Focus of this presentation - Recently emerging issues around GANs - Introduction to generative adversarial networks (GANs) - What is DiscoGAN? Problems of interest / model architecture / mode collapse problem / experiments / summaries / comments
  • 3. Recent generative technologies (timeline figure, 2014 to 2019) - Ian J. Goodfellow invented the "generative adversarial network" - Deep Convolutional GAN (DCGAN) - Least Squares GAN (LSGAN) - Semi-Supervised GAN - StackGAN - Auxiliary Classifier GAN (ACGAN) - Original AlphaGo beat a professional Go player - DiscoGAN - CycleGAN - BW clips into color (Source: Nvidia) - GauGAN (Source: Nvidia) - Samsung deepfake AI fabricates a video from a single profile picture
  • 4. Recent issues around deepfakes: security, art, etc. - A viral video in which Obama appears to insult Donald Trump was fabricated with FakeApp (Photo: YouTube) - A deepfake clip of Mark Zuckerberg is being allowed to remain on Instagram (Photo: Bill Poster UK) - Edmond de Belamy: the first piece of AI-generated art (created by a GAN in 2018) - US lawmakers say AI deepfakes 'have the potential to disrupt every facet of our society' - At the individual level, deepfakes can be used for cyberbullying, defamation, and blackmail
  • 5. What is GAN? - Two neural networks contest with each other in a game. Given a training set, a GAN learns to generate new data with the same statistics as the training set. - Minimax two-player game (generative model vs. discriminative model)
  • 6. Minimax Problem of GAN - Value function: $\min_G \max_D V(G, D) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$ - Integral form: $V(G, D) = \int_x p_{data}(x) \log D(x)\,dx + \int_z p_z(z) \log(1 - D(G(z)))\,dz$ - Training of Generator: $\min_G [1 - D(G(z))] = 0$ - Training of Discriminator: $\max_D D(x) = 1$ (discriminant for real data), $\max_D [1 - D(G(z))] = 1$ (discriminant for generated data) - $V(G, D)$ has a saddle point at $D(G(z)) = \frac{1}{2}$, where $D \in [0, 1]$ scores data from fake (0) to real (1)
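To make the value function on this slide concrete, the sketch below estimates V(G, D) from a handful of discriminator outputs. It is only an illustration: the function name `value` and the sample probabilities are hypothetical, and the expectations are replaced by plain sample means.

```python
import math

def value(d_real, d_fake):
    """Sample-mean estimate of V(G, D).

    d_real: discriminator outputs D(x) on real samples x
    d_fake: discriminator outputs D(G(z)) on generated samples
    """
    real_term = sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return real_term + fake_term

# Hypothetical outputs of a confident, correct discriminator: V is close to 0,
# the largest value it can take.
v_confident = value([0.99, 0.98], [0.01, 0.02])

# A discriminator that cannot tell real from fake outputs 1/2 everywhere,
# which gives V = log(1/2) + log(1/2) = -log 4.
v_saddle = value([0.5, 0.5], [0.5, 0.5])

print(v_confident, v_saddle)
```

The discriminator pushes V up towards 0 and the generator pushes it back down; at the saddle point the two terms are both log(1/2).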
  • 7. Discover Cross-Domain Relations with GAN - Figures: training on 2 different data sets without explicitly paired labelling, with generators $G_{AB}$ (Domain A to Domain B) and $G_{BA}$ (Domain B to Domain A); results of domain transfer - Previous AI could also transfer data from one domain to another while preserving key attributes - Previous training methods (~2016) require paired data, which is costly and hard to collect - DiscoGAN trains on 2 different data sets without any paired data, and its results show better performance with robustness to the mode collapse problem
  • 8. Network Models: DiscoGAN & Previous GANs - Panels: standard GAN with GAN loss; GAN with a reconstruction loss & GAN loss; DiscoGAN - Each generator consists of an encoder-decoder pair (input and output are images) - The GAN loss (and the reconstruction loss) is minimized during training - In DiscoGAN, 2 coupled GANs map each domain to its counterpart domain (bijective)
  • 9. Problem Formulation (1) - Notation: $x_{AB} = G_{AB}(x_A)$ and $x_{ABA} = G_{BA}(x_{AB})$ - Reconstruction loss measures how well the original input is reconstructed after a sequence of two generations: $L_{CONST_A} = d(x_{ABA}, x_A)$, where $d$ is a distance such as $L_1$, $L_2$, or Huber loss - GAN loss measures how realistic the generated image is in domain B: $L_{GAN_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(x_{AB})]$ - Relaxed constraints are considered to guarantee bijection and domain transition - Bijection: ideally $G_{AB}^{-1} = G_{BA}$ → $\min_{G_{AB}} (L_{CONST_A})$, $\min_{G_{BA}} (L_{CONST_B})$ - Domain transition: ideally $x_{AB} \in \mathbb{R}_B$, $x_{BA} \in \mathbb{R}_A$ → $\min_{D_B} (L_{D_B})$, $\min_{D_A} (L_{D_A})$
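The three distance choices named for the reconstruction loss can be compared on a small example. This is a minimal sketch, assuming images are flattened to lists of floats and that each distance is averaged over elements; the function names, the averaging, and the Huber threshold `delta=1.0` are assumptions for illustration, not taken from the slides.

```python
def l1(x, y):
    """Mean absolute error between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

def l2(x, y):
    """Mean squared error between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def huber(x, y, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones."""
    total = 0.0
    for a, b in zip(x, y):
        r = abs(a - b)
        total += 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
    return total / len(x)

x_a = [0.0, 0.5, 1.0]     # hypothetical original input
x_aba = [0.0, 0.5, 3.0]   # hypothetical reconstruction with one large error

print(l1(x_aba, x_a), l2(x_aba, x_a), huber(x_aba, x_a))
```

With a single large residual, the squared error grows fastest, while the Huber loss stays closer to the absolute error, which is the usual reason for choosing it.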
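  • 10. Problem Formulation (2) - Training of the generator (in case of $G_{AB}$), training of the discriminator (in case of $D_B$), and constraint level for each model:
      (a) Standard GAN with GAN loss
          Generator: $L_{G_{AB}} = L_{GAN_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))]$
          Discriminator: $L_{D_B} = -\mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
          Constraint level: (baseline)
      (b) GAN with a reconstruction loss & GAN loss
          Generator: $L_{G_{AB}} = L_{GAN_B} + L_{CONST_A} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))] + d(x_{ABA}, x_A)$
          Discriminator: $L_{D_B} = -\mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
          Constraint level: doubled DOF from (a), weaker than (a)
      (c) DiscoGAN
          Generator: $L_G = L_{G_{AB}} + L_{G_{BA}} = L_{GAN_B} + L_{CONST_A} + L_{GAN_A} + L_{CONST_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))] + d(x_{ABA}, x_A) - \mathbb{E}_{x_B \sim P_B}[\log D_A(G_{BA}(x_B))] + d(x_{BAB}, x_B)$
          Discriminator: $L_D = L_{D_A} + L_{D_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_A(x_A)] - \mathbb{E}_{x_B \sim P_B}[\log(1 - D_A(G_{BA}(x_B)))] - \mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
          Constraint level: doubled DOF from (b), weaker than (b)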
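The DiscoGAN row of this slide adds up two GAN losses and two reconstruction losses for the generators, and two discriminator losses. A toy sketch of that bookkeeping follows. Everything concrete in it is hypothetical: samples are single floats instead of images, the generators are simple shifts, the discriminators are hand-written sigmoid scores, and the distance is a squared error.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def discogan_losses(xs_a, xs_b, g_ab, g_ba, d_a, d_b, dist):
    """Return the generator and discriminator loss terms as a dict."""
    x_ab = [g_ab(x) for x in xs_a]    # A -> B
    x_ba = [g_ba(x) for x in xs_b]    # B -> A
    x_aba = [g_ba(x) for x in x_ab]   # A -> B -> A
    x_bab = [g_ab(x) for x in x_ba]   # B -> A -> B
    gan_b = -mean(math.log(d_b(x)) for x in x_ab)
    gan_a = -mean(math.log(d_a(x)) for x in x_ba)
    const_a = mean(dist(r, x) for r, x in zip(x_aba, xs_a))
    const_b = mean(dist(r, x) for r, x in zip(x_bab, xs_b))
    l_d_a = (-mean(math.log(d_a(x)) for x in xs_a)
             - mean(math.log(1.0 - d_a(x)) for x in x_ba))
    l_d_b = (-mean(math.log(d_b(x)) for x in xs_b)
             - mean(math.log(1.0 - d_b(x)) for x in x_ab))
    return {"gan_a": gan_a, "gan_b": gan_b,
            "const_a": const_a, "const_b": const_b,
            "L_G": gan_b + const_a + gan_a + const_b,
            "L_D": l_d_a + l_d_b}

# Hypothetical toy setup: domain A lives near 0, domain B near 10.
xs_a, xs_b = [-1.0, 0.0, 1.0], [9.0, 10.0, 11.0]
d_a = lambda x: sigmoid(3.0 - abs(x))          # high score near 0
d_b = lambda x: sigmoid(3.0 - abs(x - 10.0))   # high score near 10
sq = lambda r, x: (r - x) ** 2

# Generators that are exact inverses of each other: reconstruction is perfect.
inverse = discogan_losses(xs_a, xs_b, lambda x: x + 10.0, lambda x: x - 10.0,
                          d_a, d_b, sq)
# A collapsed G_BA that maps everything to 0: reconstruction must fail.
collapsed = discogan_losses(xs_a, xs_b, lambda x: x + 10.0, lambda x: 0.0,
                            d_a, d_b, sq)
print(inverse["L_G"], collapsed["L_G"])
```

In the collapsed case both reconstruction terms become positive even though the GAN terms stay small, which is the pressure the coupled losses put on the generators.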
  • 11. Architecture of Generator - Each generator takes an image and feeds it through an encoder-decoder pair - The number of layers ranges from 4 to 5 depending on the domain - Figure: encoder (convolution layers) followed by decoder (deconvolution layers), mapping Domain A (resp. B) to Domain B (resp. A)
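How an encoder-decoder pair shrinks and then restores the image size can be checked with the standard output-size formulas for convolution and transposed convolution (no dilation, no output padding). The slide gives no kernel sizes, strides, or input resolution, so the 4x4 kernel, stride 2, padding 1, and 64x64 input below are assumptions chosen only for illustration.

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Spatial output size of a convolution layer."""
    return (size + 2 * padding - kernel) // stride + 1

def deconv_out(size, kernel=4, stride=2, padding=1):
    """Spatial output size of a transposed convolution (deconvolution) layer."""
    return (size - 1) * stride - 2 * padding + kernel

def encoder_sizes(size, n_layers):
    sizes = [size]
    for _ in range(n_layers):
        sizes.append(conv_out(sizes[-1]))
    return sizes

def decoder_sizes(size, n_layers):
    sizes = [size]
    for _ in range(n_layers):
        sizes.append(deconv_out(sizes[-1]))
    return sizes

# Assumed 64x64 input and 4 layers on each side.
down = encoder_sizes(64, 4)
up = decoder_sizes(down[-1], 4)
print(down, up)
```

Under these assumptions each encoder layer halves the resolution and each decoder layer doubles it, so the output image has the same size as the input.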
  • 12. Architecture of Discriminator - Each discriminator feeds an image through convolution layers - The discriminator outputs a sigmoid-based scalar that tells how real the fed image is
  • 13. Toy Experiment: Domain Transition Test - Panels: initial state; standard GAN; GAN with $L_{CONST}$; DiscoGAN (colored points: samples in domain A; black x's: target modes in domain B) - In DiscoGAN, discriminator B is perfectly fooled by translated samples from domain A - DiscoGAN prevents mode collapse by translating into distinct, well-bounded regions that do not overlap
  • 14. Mode Collapse Problem - The gradients are biased towards the mode from which a higher number of samples are drawn to form the real training data - The generator outputs unintended images in a different mode, which occurs prevalently in GANs - Usually, GANs remedy this problem with losses; however, it has not been resolved perfectly - Other examples: communication system, cryptography, automaton, etc.
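One simple way to see mode collapse in a 2-D toy setting is to count how many target modes receive at least one generated sample. The helper below is a hypothetical diagnostic written for this note, not a metric from the paper; the mode positions, the samples, and the radius are made up.

```python
def covered_modes(samples, modes, radius):
    """Return the indices of modes that have a sample within `radius`."""
    covered = set()
    for sx, sy in samples:
        # Find the nearest target mode by squared Euclidean distance.
        nearest = min(
            range(len(modes)),
            key=lambda i: (sx - modes[i][0]) ** 2 + (sy - modes[i][1]) ** 2,
        )
        mx, my = modes[nearest]
        if (sx - mx) ** 2 + (sy - my) ** 2 <= radius ** 2:
            covered.add(nearest)
    return covered

modes = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]

# Hypothetical generator outputs that spread over all four modes.
healthy = [(0.1, 0.0), (3.9, 0.2), (0.0, 4.1), (4.2, 3.9)]
# Hypothetical collapsed outputs: every sample lands on the first mode.
collapsed = [(0.1, 0.0), (0.0, 0.1), (-0.1, 0.1), (0.2, 0.0)]

print(len(covered_modes(healthy, modes, 0.5)),
      len(covered_modes(collapsed, modes, 0.5)))
```

A generator can fool the discriminator in the collapsed case, since every sample looks realistic, yet most of the target distribution is never produced.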
  • 15. Why is DiscoGAN robust to mode collapse? - In DiscoGAN, two coupled models are trained together simultaneously. The $G_{AB}$'s and $G_{BA}$'s share parameters - The constraints of the coupled reconstruction losses lead to a strict bijection
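The link between the two reconstruction constraints and a bijection can be illustrated on finite sets, where a generator is just a lookup table. This is a sketch under that simplifying assumption; the element names and the helper `reconstruction_errors` are hypothetical.

```python
def reconstruction_errors(g_ab, g_ba, domain_a, domain_b):
    """Count elements that are not recovered after a round trip.

    Returns (errors for A -> B -> A, errors for B -> A -> B).
    """
    err_a = sum(1 for x in domain_a if g_ba[g_ab[x]] != x)
    err_b = sum(1 for y in domain_b if g_ab[g_ba[y]] != y)
    return err_a, err_b

domain_a = ["a1", "a2", "a3"]
domain_b = ["b1", "b2", "b3"]
g_ba = {"b1": "a1", "b2": "a2", "b3": "a3"}

# One-to-one mapping: both round trips recover every element.
g_ab_bijective = {"a1": "b1", "a2": "b2", "a3": "b3"}
# Collapsed mapping: every element of A is sent to the same mode b1.
g_ab_collapsed = {"a1": "b1", "a2": "b1", "a3": "b1"}

print(reconstruction_errors(g_ab_bijective, g_ba, domain_a, domain_b))
print(reconstruction_errors(g_ab_collapsed, g_ba, domain_a, domain_b))
```

No choice of `g_ba` can bring both counts to zero for the collapsed `g_ab`, because two different inputs that share an output cannot both be recovered. Driving both reconstruction losses to zero therefore rules out the collapsed mapping.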
  • 16. Real Domain Experiment: Car to Car, Face to Face - Panels: input data; standard GAN; GAN with $L_{CONST}$; DiscoGAN, for car to car and face to face - Reconstruction tests - Results in DiscoGAN show higher correlations (robust to mode collapse)
  • 17. Real Domain Experiment: Face Conversion - Panels: translation of gender; blond to black and black to blond hair; glasses to non-glasses and non-glasses to glasses - DiscoGAN translates a specific feature while preserving other facial features
  • 18. Cross-Domain Experiment (1) - Panels: chair to car; car to face - Note that training is implemented without any paired data - The main attribute (azimuth) is preserved
  • 19. Cross-Domain Experiment (2) - 1-to-N problem - Panels: handbag to sketches; sketches to shoes; sketches to handbags
  • 20. Cross-Domain Experiment (3) - The same style is discovered - Panels: handbag to shoes; shoes to handbag
  • 21. Summaries - DiscoGAN is proposed as a learning method to discover cross-domain relations without any paired labels - Results showed better performance with robustness to mode collapse. The symmetry granted by coupling 2 GANs is considered to be a key factor for the dynamical robustness. Comments - The strategy of coupling two GAN models reminded me of the symmetry of dynamics. Some correlations could perhaps be drawn to handle the stability problem - This paper is giving me many ideas. It is very pleasant.
  • 22. Thank you! - Source code for simulations: official implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks" (GitHub): https://github.com/SKTBrain/DiscoGAN - This presentation is also available at: https://www.slideshare.net/SeongcheolBaek/introduction-of-discogan
  • 23. References
      - Crux of Presentation
          T. Kim, et al., Learning to Discover Cross-Domain Relations with Generative Adversarial Networks (arXiv): https://arxiv.org/abs/1703.05192
      - Recent generative technologies
          Apple announces Animoji (The Verge): https://www.theverge.com/2017/9/12/16290210/new-iphone-emoji-animated-animoji-apple-ios-11-update
          AI Can Convert Black and White Clips into Color (Nvidia Developer): https://news.developer.nvidia.com/ai-can-convert-black-and-white-clips-into-color/
          Nvidia's latest AI software turns rough doodles into realistic landscapes (The Verge): https://www.theverge.com/2019/3/19/18272602/ai-art-generation-gan-nvidia-doodle-landscapes
          Deepfakes are getting easier than ever to make (The Verge): https://www.theverge.com/2019/5/23/18637373/deepfakes-samsung-ai-research-results-single-photo-algorithm
      - Recent issues around deepfakes: security, art, etc.
          A viral video that appeared to show Obama calling Trump a 'dips---' shows a disturbing new trend called 'deepfakes' (Business Insider): https://www.businessinsider.com/obama-deepfake-video-insulting-trump-2018-4
          New deepfake tech turns a single photo and audio file into a singing video portrait (The Verge): https://www.theverge.com/2019/6/20/18692671/deepfake-technology-singing-talking-video-portrait-from-a-single-image-imperial-college-samsung
          US lawmakers say AI deepfakes 'have the potential to disrupt every facet of our society' (The Verge): https://www.theverge.com/2018/9/14/17859188/ai-deepfakes-national-security-threat-lawmakers-letter-intelligence-community
          Deepfakes: A Threat to Individuals and National Security (Lionbridge): https://lionbridge.ai/articles/deepfakes-a-threat-to-individuals-and-national-security/
          A deepfake clip of Mark Zuckerberg is being allowed to remain on Instagram (iNews): https://inews.co.uk/news/technology/a-deepfake-clip-of-mark-zuckerberg-is-being-allowed-to-remain-on-instagram/
          Portrait of Edmond Belamy created by GAN (Wikipedia): https://en.wikipedia.org/wiki/Edmond_de_Belamy
      - Generative Adversarial Network
          I. J. Goodfellow, Generative Adversarial Nets (arXiv): https://arxiv.org/abs/1406.2661
          Tutorial on Generative Adversarial Networks: https://www.slideshare.net/ckmarkohchang/generative-adversarial-networks
      - Mode Collapse Problem
          A. Ghosh, et al., Multi-Agent Diverse Generative Adversarial Networks (ResearchGate): https://www.researchgate.net/publication/315882247_Multi-Agent_Diverse_Generative_Adversarial_Networks