RSGAN:
Regularization-on-Sigma GAN
Chung-Il Kim*, Seungwon Jung and Eenjun Hwang
School of Electrical Engineering, Korea University
Contents
▪ Introduction
▪ RSGAN
▪ Experiments
▪ Conclusion
Generative Adversarial Nets
▪ A generator (G) produces synthetic data from random variables.
▪ A discriminator (D) receives two inputs: a real one and a fake one.
▪ D determines whether the input is authentic.
▪ Given the resulting losses, D and G are each trained by the optimizer.
[1] https://medium.com/coinmonks/celebrity-face-generation-using-gans-tensorflow-implementation-eaa2001eef86
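To make the training flow concrete, here is a minimal GAN training loop sketched in PyTorch; the toy data distribution, network sizes, and hyperparameters are illustrative assumptions, not details from the slides.

    # Minimal GAN training loop (illustrative sketch on toy 2-D data).
    import torch
    import torch.nn as nn

    z_dim, data_dim, batch = 8, 2, 64
    G = nn.Sequential(nn.Linear(z_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
    D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
    opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
    bce = nn.BCELoss()

    for step in range(1000):
        x = torch.randn(batch, data_dim) * 0.5 + 2.0  # "real" samples (toy Gaussian)
        z = torch.randn(batch, z_dim)                 # random variables fed to G
        fake = G(z)                                   # synthetic data

        # D is trained to label real data 1 and synthetic data 0
        d_loss = bce(D(x), torch.ones(batch, 1)) + bce(D(fake.detach()), torch.zeros(batch, 1))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()

        # G is trained to make D label its synthetic data as real
        g_loss = bce(D(fake), torch.ones(batch, 1))
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()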
Examples of Data Generated by Recent GANs
BEGAN (Boundary Equilibrium GAN, 2017, Google)
Training data: CelebA
Humans can hardly tell the real data from the synthetic data.
Fig 2. A total of 15 samples generated by BEGAN [2].
[2] D. Berthelot, T. Schumm, and L. Metz, "BEGAN: Boundary equilibrium generative adversarial networks," arXiv preprint arXiv:1703.10717, 2017.
Failed Examples Generated by GANs
At 737k steps, BEGAN keeps generating the same data. At 2,000k steps, it fails to learn the data distribution. This phenomenon is called 'mode collapse'.
Several different input z vectors produce the same output (possibly due to low model capacity or inadequate optimization).
Detecting mode collapse is very challenging.
Fig 3. A total of 16 samples generated by BEGAN at several training steps.
RSGAN
More stable than BEGAN in long-term learning.
Each frame = 10k steps.
Mode collapse was not observed over the full 2,400k steps.
Image Stability over Sequential Steps
32×32 images were generated.
Sets of 16 samples were monitored every 1,000 steps up to 2,400k.
At 737k steps, BEGAN starts to mode-collapse and produces distorted images.
RSGAN steadily generated diverse, human-like faces until 2,400k steps.
BEGAN Training Procedure
① Noise z is fed to the generator G, which produces a synthetic sample G(z).
② A real sample x is drawn from the real data.
③ Both x and G(z) are passed through the discriminator D (an auto-encoder), yielding outputs D(x) and D(G(z)).
④ Two reconstruction errors are computed:
'Real data error': L(x) = |D(x) − x|
'Synthetic data error': L(G(z)) = |D(G(z)) − G(z)|
BEGAN Objectives
▪ Discriminator objective
▪ Minimize the 'real data error' and maximize the 'synthetic data error':
ℒ_Discriminator = argmin[ L(x) − k_t · L(G(z)) ]
▪ Generator objective
▪ Minimize the 'synthetic data error':
ℒ_Generator = argmin[ L(G(z)) ]
▪ k_t: a proportional control variable for stable learning
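A minimal sketch of these objectives, assuming an auto-encoder discriminator D and PyTorch tensors. The closed-loop update for k_t follows the rule reported in the BEGAN paper [2]; γ and λ_k are the values reported there, and everything else is a placeholder rather than the authors' code.

    # Sketch of the BEGAN objectives with the proportional control variable k_t.
    import torch

    def ae_error(D, v):
        # L(v) = |D(v) - v|: per-batch L1 reconstruction error of the auto-encoder
        return (D(v) - v).abs().mean()

    k_t, gamma, lambda_k = 0.0, 0.5, 0.001   # gamma = equilibrium, lambda_k from [2]

    def began_losses(D, G, x, z):
        global k_t
        L_real = ae_error(D, x)              # 'real data error'      L(x)
        L_fake = ae_error(D, G(z))           # 'synthetic data error' L(G(z))
        d_loss = L_real - k_t * L_fake       # discriminator: min L(x) - k_t * L(G(z))
        g_loss = L_fake                      # generator: min L(G(z))
        # proportional control keeps D and G in equilibrium, clipped to [0, 1]
        k_t = min(max(k_t + lambda_k * (gamma * L_real.item() - L_fake.item()), 0.0), 1.0)
        return d_loss, g_loss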
RSGAN Training Procedure
① Noise z is fed to the generator G, which produces a synthetic sample G(z).
② A real sample x is drawn from the real data.
③ Both x and G(z) are passed through the discriminator D (an auto-encoder), yielding outputs D(x) and D(G(z)).
④ Two distances are computed:
'A metric on real data': m(x, D(x))
'A metric on synthetic data': m(G(z), D(G(z)))
RSGAN Objectives
▪ Discriminator objective
▪ Minimize 'a metric on real data' and maximize 'a metric on synthetic data':
ℒ_Discriminator = argmin[ m(x, D(x)) − k_t · m(G(z), D(G(z))) ]
▪ Generator objective
▪ Minimize 'a metric on synthetic data':
ℒ_Generator = argmin[ m(G(z), D(G(z))) ]
▪ k_t: a proportional control variable for stable learning
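The control structure mirrors the BEGAN objectives; only the reconstruction error L is replaced by a metric m between input and reconstruction. A placeholder sketch with m left as a parameter (one concrete choice of m, the simplified W-2 loss, is defined two slides below):

    # RSGAN objectives: same shape as BEGAN, but with a metric m(input, output).
    def rsgan_losses(D, G, x, z, m, k_t, gamma=0.5, lambda_k=0.001):
        m_real = m(x, D(x))                  # 'a metric on real data'
        m_fake = m(G(z), D(G(z)))            # 'a metric on synthetic data'
        d_loss = m_real - k_t * m_fake       # discriminator objective
        g_loss = m_fake                      # generator objective
        k_t = min(max(k_t + lambda_k * (gamma * float(m_real) - float(m_fake)), 0.0), 1.0)
        return d_loss, g_loss, k_t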
Metric
▪ A function that defines a distance between each pair of elements of a set
▪ Examples: Euclidean distance, Manhattan distance, Jensen-Shannon divergence, Wasserstein-1 distance, Wasserstein-2 distance
▪ RSGAN adopts the Wasserstein-2 distance
▪ Given data distributions P and Q, the Wasserstein-2 distance (in closed form for Gaussians) is:
W2(P, Q)² = ‖m_P − m_Q‖₂² + trace(C_P + C_Q − 2(C_Q^(1/2) C_P C_Q^(1/2))^(1/2))
m_P, m_Q: means; C_P, C_Q: covariance matrices
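For reference, this closed-form expression can be evaluated from sample statistics. The sketch below is a generic NumPy/SciPy implementation under the Gaussian assumption, not the authors' code:

    # Squared Wasserstein-2 (Frechet) distance between two Gaussians,
    # estimated from sample means and covariances.
    import numpy as np
    from scipy.linalg import sqrtm

    def w2_squared(P, Q):
        """P, Q: sample arrays of shape (num_samples, dim)."""
        m_p, m_q = P.mean(axis=0), Q.mean(axis=0)
        C_p, C_q = np.cov(P, rowvar=False), np.cov(Q, rowvar=False)
        C_q_half = sqrtm(C_q)
        covmean = sqrtm(C_q_half @ C_p @ C_q_half)  # (C_Q^(1/2) C_P C_Q^(1/2))^(1/2)
        return float(((m_p - m_q) ** 2).sum()
                     + np.trace(C_p + C_q - 2 * covmean.real))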
Wasserstein-2 Distance
▪ Comparison of BEGAN with RSGAN
▪ RSGAN considers not only means but also variances

BEGAN: L(x) = |D(x) − x|
RSGAN: m(x, D(x)) = ‖m_x − m_D(x)‖₂² + (1/n)‖σ_x − σ_D(x)‖₂²

▪ RSGAN uses a simplified W-2 distance, denoted:
RSL(P, Q) = ‖m_P − m_Q‖₂² + (1/n)‖σ_P − σ_Q‖₂²
σ_P and σ_Q: standard deviations
▪ The derivation is given in the paper.
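A minimal sketch of the simplified metric, assuming P and Q are batches of flattened samples and that means and standard deviations are taken per feature dimension (the exact axes are an assumption, since the slides do not specify them):

    # Simplified W-2 loss: RSL(P, Q) = ||m_P - m_Q||^2 + (1/n) ||sigma_P - sigma_Q||^2
    import torch

    def rsl(P, Q):
        n = P.shape[1]                           # number of feature dimensions
        m_p, m_q = P.mean(dim=0), Q.mean(dim=0)  # per-dimension means
        s_p, s_q = P.std(dim=0), Q.std(dim=0)    # per-dimension standard deviations
        return ((m_p - m_q) ** 2).sum() + ((s_p - s_q) ** 2).sum() / n

This rsl function can be passed directly as the metric m in the rsgan_losses sketch above.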
Loss Graph
▪ Dataset: CelebA at 32×32 resolution
▪ Filter sizes: 128 for D, 64 for G
▪ Learning rate: 0.00008 for D
▪ γ (equilibrium): 0.5
▪ No extra techniques used: batch normalization, dropout, transposed convolutions, skip connections, or the refinement used in BEGAN
▪ Both BEGAN and RSGAN decrease their losses and converge to some value
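Collected as a hypothetical configuration for reproducibility (the key names are invented; only the values come from this slide):

    # Hypothetical config mirroring the hyperparameters stated on this slide.
    config = {
        "dataset": "CelebA",
        "resolution": 32,      # 32x32 images
        "filters_D": 128,      # discriminator filter size
        "filters_G": 64,       # generator filter size
        "lr_D": 8e-5,          # discriminator learning rate
        "gamma": 0.5,          # equilibrium hyperparameter
        # no batch norm, dropout, transposed convolutions, skip connections,
        # or BEGAN-style refinement
    }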
Conclusion
▪ We proposed a new GAN model called RSGAN
▪ RSGAN uses the Wasserstein-2 distance as its loss metric
▪ RSGAN trains stably for up to 2,400k steps on the CelebA dataset
▪ We plan to make RSGAN more stable using depthwise separable convolutions [3]
[3] F. Chollet, "Xception: Deep learning with depthwise separable convolutions," arXiv preprint arXiv:1610.02357, 2017.
Editor's Notes
  1. Good morning. I'm Chung-il Kim, and in this presentation I will introduce my paper RSGAN, Regularization-on-Sigma GAN.
  2. The contents are as follows: an introduction to the field, RSGAN, the experiments, and the conclusion.
  3. Generative Adversarial Nets are a generative model consisting of two networks, a generator and a discriminator. The generator takes random variables as input and produces synthetic samples. The discriminator receives both real and synthetic data, determines which is real and which is fake, and derives a loss for each data group. Given these losses, the optimizer trains D and G: D trains to discriminate the two groups (minimize the real loss, maximize the fake loss), while G trains to deceive D (minimize the fake loss). Through this, D ultimately learns to distinguish real from synthetic data better, and G learns to produce synthetic data very similar to the real data.
  4. Look at the left figure: a total of 15 samples generated by BEGAN (Boundary Equilibrium Generative Adversarial Nets), proposed by Google in 2017. Humans can hardly tell the real data from the synthetic data; GAN techniques have advanced to the point of generating very refined data.
  5. However, BEGAN has one problem when trained long-term. Fig 3 shows 16 samples generated at several training steps. At 20k, 100k, and 400k steps it generates very refined images, but from 737k steps onward it keeps generating the same degenerate data, and at 2,000k steps it fails to learn the data distribution. Moreover, even though the images come from different random z vectors, they are all identical. This phenomenon is called 'mode collapse'; its cause has not yet been clarified, and detecting mode collapse is a very challenging, active research topic.
  6. RSGAN is a model that mitigates this mode collapse. I will show a video of RSGAN run under the same conditions and architecture as BEGAN: the upper part is BEGAN and the lower part is RSGAN, both trained with the same discriminator learning rate. Each frame is 10k steps, over 2,400k total training steps. In this experiment, mode collapse was not observed over the full 2,400k steps.
  7. These are captures from the video. In each row, the left image was generated by BEGAN at that step and the right image by RSGAN at the same step. Both models ran at 32×32 resolution with the same architecture and the same discriminator learning rate. BEGAN starts to mode-collapse at 737k steps and generates strange images, while RSGAN consistently generates human-like faces for the full 2,400k steps.
  8. First, I will explain the BEGAN training procedure; the RSGAN procedure follows. The BEGAN training structure is basically the same as GANs. Noise is passed through the generator, which produces synthetic data from it; real data are sampled randomly from the dataset. Both real and fake data are passed through the auto-encoder-style discriminator, which produces outputs of the same dimensionality as its inputs.
  9. The goals of D and G are as follows. D tries to reduce the input-output error for real data and, conversely, to increase that error for synthetic data. G tries to reduce the input-output error for synthetic data.
  10. Now the RSGAN procedure. The basic framework is very similar to BEGAN, but where BEGAN compares errors, RSGAN approaches the problem with the notion of a metric.
  11. The goals of D and G differ slightly, however. D tries to reduce the 'distance between data' for real inputs and outputs and to increase it for synthetic data; G tries to reduce this distance for synthetic data. In other words, with the same architecture, BEGAN reduces an error while RSGAN reduces a distance.
  12. The distance between two data distributions can be defined by various metrics, e.g., Euclidean, Manhattan, JSD, and W-1. RSGAN is a model that chooses W-2. The Wasserstein-2 distance is defined as shown, with the means and covariances of the two distributions entering as variables; the closer their means and covariances, the smaller the distance.
  13. The W-2 metric is defined as shown. It defines the distance by considering not only the means of the two distributions P and Q but also their covariances. By accounting for the degree of dispersion, RSGAN also captures the diversity of the data.
  14. This slide shows how the losses change during the experiment; lower loss indicates better learning. The hyperparameters used are as listed. In the graph on the right, both BEGAN and RSGAN are trained to decrease the losses of D and G, and both eventually converge to some value. Comparing with the generated data, however, mode collapse occurred regardless of loss convergence.
  15. In this paper we proposed a new GAN model, RSGAN. RSGAN defines its loss with the W-2 metric and showed stable learning on the CelebA dataset. We plan to make RSGAN even more stable using depthwise separable convolutions [3].