
Generative adversarial networks

Generative Adversarial Networks (GAN) slides for a NAVER seminar talk.

Published in: Software


  1. Introduction. Generative Adversarial Networks. Yunjey Choi, Korea University, DAVIAN LAB
  2. Speaker Introduction. B.S. in Computer Science & Engineering at Korea University; M.S. student in Computer Science & Engineering at Korea University (current). Interests: Deep Learning, TensorFlow, PyTorch. GitHub: https://github.com/yunjey
  3. Referenced Slides. • Namju Kim. Generative Adversarial Networks (GAN) https://www.slideshare.net/ssuser77ee21/generative-adversarial-networks-70896091 • Taehoon Kim. Deep and Wide Deep Learning for Intelligent Conversation https://www.slideshare.net/carpedm20/ss-63116251
  4. Introduction (section 01)
  5. Branches of ML. Machine learning splits into: Supervised Learning (labeled data, direct feedback), Unsupervised Learning (no labeled data, no feedback, "find hidden structure"), Semi-supervised Learning, and Reinforcement Learning (no labeled data, delayed feedback, reward signal)
  6. Supervised Learning: Discriminative Model. The discriminative model learns how to classify an input to its class, e.g. an input image (64x64x3) to "man" or "woman"
  7. Unsupervised Learning: Generative Model. The generative model learns the distribution of the training data, mapping a latent code (100) to an image (64x64x3)
  8. Generative Model: Probability Basics (Review). A random variable X over the faces 1-6 of a die has a probability mass function P(X), e.g. P(1)=P(2)=P(3)=1/6, P(4)=0/6, P(5)=1/6, P(6)=2/6, which defines a probability distribution p(x)
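The die example above can be checked in a few lines of Python; the `pmf` dict and variable names below are illustrative stand-ins, not part of the talk:

```python
from fractions import Fraction

# Probability mass function of the die on the slide:
# P(1) = P(2) = P(3) = 1/6, P(4) = 0/6, P(5) = 1/6, P(6) = 2/6
pmf = {1: Fraction(1, 6), 2: Fraction(1, 6), 3: Fraction(1, 6),
       4: Fraction(0, 6), 5: Fraction(1, 6), 6: Fraction(2, 6)}

# A valid probability mass function sums to 1 over its support.
total = sum(pmf.values())
assert total == 1

# Expected value E[X] = sum over x of x * P(x)
expected = sum(x * p for x, p in pmf.items())
```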
  9. Generative Model: Probability Distribution. What if x is an actual image in the training data? At this point, x can be represented as a (for example) 64x64x3-dimensional vector
  10. Generative Model: Probability Distribution. There is a probability density function p_data(x) that represents the distribution of actual images
  11. Generative Model: Probability Distribution. Let's take a human face image dataset as an example. Our dataset may contain few images of men with glasses: x1 is a 64x64x3 high-dimensional vector representing a man with glasses, and its probability density value is low
  12. Generative Model: Probability Distribution. Our dataset may contain many images of women with black hair: x2 is a 64x64x3 high-dimensional vector representing a woman with black hair, and its probability density value is high
  13. Generative Model: Probability Distribution. Our dataset may contain very many images of women with blonde hair: x3 is a 64x64x3 high-dimensional vector representing a woman with blonde hair, and its probability density value is very high
  14. Generative Model: Probability Distribution. Our dataset may not contain strange images: x4 is a 64x64x3 high-dimensional vector representing a very strange image, and its probability density value is almost 0
  15. Generative Model: Probability Distribution. The goal of the generative model is to find a p_model(x) (the distribution of images generated by the model) that approximates p_data(x) (the distribution of actual images) well
  16. Generative Adversarial Networks (section 02)
  17. Intuition in GAN. The generator G maps a latent code z to a fake image G(z); the discriminator D outputs the probability (0~1) that its input came from the real data. D(x) may be high for a real image x, and D(G(z)) may be low for a fake image: D is trained with both real and fake images
  18. Intuition in GAN. Discriminator (neural network): for a real image x (64x64x3), D(x) should be close to 1; the discriminator should classify a real image as real
  19. Intuition in GAN. For a fake image G(z) generated by the generator (64x64x3), D(G(z)) should be close to 0; the discriminator should classify a fake image as fake
  20. Intuition in GAN. Generator (neural network): from a latent code z (100), the generator creates a generated image G(z) (64x64x3) for which D(G(z)) should be close to 1; the generator should create an image that is indistinguishable from real to deceive the discriminator
  21. Objective Function of GAN. min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]. Sample x from the real data distribution and the latent code z from a Gaussian distribution. D should maximize V(D, G): the first term (training with real images) is maximal when D(x) = 1, i.e. train D to classify real images as real; the second term (training with fake images) is maximal when D(G(z)) = 0, i.e. train D to classify fake images as fake
  22. Objective Function of GAN. G should minimize V(D, G). G is independent of the first term, and the second term is minimal when D(G(z)) = 1: train G to deceive D
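The two directions of the minimax game can be sanity-checked numerically. A minimal pure-Python sketch (the talk itself uses PyTorch; `value_terms` and the sample scores below are illustrative, not from the slides):

```python
import math

def value_terms(d_real, d_fake):
    """One-sample estimate of V(D, G) = log D(x) + log(1 - D(G(z))).

    d_real stands for D(x) on a real image, d_fake for D(G(z)) on a fake.
    """
    return math.log(d_real) + math.log(1.0 - d_fake)

# D maximizes V: scoring real images near 1 and fakes near 0 raises the value,
weak_d = value_terms(d_real=0.6, d_fake=0.4)      # an undecided discriminator
strong_d = value_terms(d_real=0.99, d_fake=0.01)  # a confident, correct one
assert strong_d > weak_d

# while G minimizes V: fooling D (d_fake near 1) drives the value back down.
fooled = value_terms(d_real=0.99, d_fake=0.9)
assert fooled < strong_d
```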
  23. PyTorch Implementation. Training with real images and fake images
  24. PyTorch Implementation. Define the discriminator (input size: 784, hidden size: 128, output size: 1). Assume x is MNIST (784-dimensional); the output is a probability (1-dimensional)
  25. PyTorch Implementation. Define the generator (input size: 100, hidden size: 128, output size: 784), mapping a latent code (100-dimensional) to a generated image (784-dimensional)
  26. PyTorch Implementation. Binary Cross Entropy Loss: BCE(h(x), y) = -y log h(x) - (1 - y) log(1 - h(x))
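The BCE formula on the slide connects directly to the two terms of the GAN objective. A minimal pure-Python sketch (the `bce` helper is illustrative; in PyTorch this role is played by the built-in BCE loss):

```python
import math

def bce(h, y):
    """Binary cross entropy from the slide: -y log h - (1 - y) log(1 - h)."""
    return -y * math.log(h) - (1.0 - y) * math.log(1.0 - h)

# With the real label y = 1 the loss reduces to -log h(x):
# minimizing it pushes D(x) toward 1 on real images.
assert math.isclose(bce(0.8, 1.0), -math.log(0.8))

# With the fake label y = 0 it reduces to -log(1 - h(x)):
# minimizing it pushes D(G(z)) toward 0 on fake images.
assert math.isclose(bce(0.2, 0.0), -math.log(0.8))
```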
  27. PyTorch Implementation. Optimizers for D and G
  28. PyTorch Implementation. x is a tensor of shape (batch_size, 784); z is a tensor of shape (batch_size, 100)
  29. PyTorch Implementation. Train the discriminator with real and fake images (forward, backward, and gradient descent): D(x) gets closer to 1 and D(G(z)) gets closer to 0
  30. PyTorch Implementation. Train the generator to deceive the discriminator (forward, backward, and gradient descent): D(G(z)) gets closer to 1
  31. PyTorch Implementation. The complete code can be found here: https://github.com/yunjey/pytorch-tutorial
  32. Objective Function of Generator G (Non-Saturating Game). Objective of G: min_G E_{z~p_z(z)}[log(1 - D(G(z)))]. At the beginning of training, the discriminator can clearly classify the generated image as fake because the quality of the image is very low, so D(G(z)) is almost zero at early stages of training. But for y = log(1 - x), the gradient is relatively small at D(G(z)) = 0
  33. Solution for Poor Gradient (Non-Saturating Game). Modification (heuristically motivated): instead of min_G E_{z~p_z(z)}[log(1 - D(G(z)))], use max_G E_{z~p_z(z)}[log D(G(z))]; for y = log(x), the gradient is very large at D(G(z)) = 0. Practical usage: apply the binary cross entropy loss to fake samples with the real label y = 1, i.e. min_G E_{z~p_z(z)}[-y log D(G(z)) - (1 - y) log(1 - D(G(z)))] = min_G E_{z~p_z(z)}[-log D(G(z))]
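The gradient argument above can be verified with the two closed-form derivatives; a minimal sketch (function names are my own, not from the talk):

```python
def grad_saturating(d):
    """d/dd of log(1 - d): slope of the original generator objective."""
    return -1.0 / (1.0 - d)

def grad_non_saturating(d):
    """d/dd of -log d: slope of the non-saturating generator objective."""
    return -1.0 / d

d = 0.01  # early in training, D(G(z)) is almost zero
# The saturating objective gives G almost no gradient signal there,
# while the non-saturating objective gives a very large one.
assert abs(grad_saturating(d)) < 1.1
assert abs(grad_non_saturating(d)) > 99.0
```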
  34. Theory in GAN. Why do GANs work? Because the objective actually minimizes the distance between the real data distribution and the model distribution: min_G max_D V(D, G) (the objective function of GANs) is the same as min_G JSD(p_data || p_g), where the Jensen-Shannon divergence is JSD(P || Q) = (1/2) KL(P || M) + (1/2) KL(Q || M) with M = (1/2)(P + Q). Please see the Appendix for details
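The Jensen-Shannon divergence stated above can be sketched for discrete distributions in a few lines of pure Python (illustrative helpers, not from the talk):

```python
import math

def kl(p, q):
    """KL(P || Q) = sum over x of P(x) log(P(x) / Q(x)), discrete support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """JSD(P || Q) = 1/2 KL(P || M) + 1/2 KL(Q || M), with M = (P + Q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
assert jsd(p, p) == 0.0                  # zero when the distributions match
assert jsd(p, q) > 0.0                   # positive when they differ
assert jsd(p, q) <= math.log(2) + 1e-12  # bounded by log 2 (natural log)
```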
  35. Variants of GAN (section 03)
  36. DCGAN (Deep Convolutional GAN, 2015). Radford et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015. The authors present a model that is still highly preferred
  37. DCGAN. Discriminator: convolution and Leaky ReLU; generator: deconvolution and ReLU. • No pooling layers (strided convolutions instead) • Batch normalization • Adam optimizer (lr=0.0002, beta1=0.5, beta2=0.999)
  38. DCGAN. Latent vector arithmetic. Radford et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015
  39. LSGAN (Least Squares GAN). Xudong Mao et al. Least Squares Generative Adversarial Networks, 2016. Proposes a GAN model that adopts the least squares loss function for the discriminator
  40. LSGAN vs. vanilla GAN. Remove the sigmoid non-linearity in the last layer of the discriminator
  41. LSGAN vs. vanilla GAN. The generator is the same as the original
  42. LSGAN vs. vanilla GAN. Replace the cross entropy loss with a least squares (L2) loss; as in the original, D(x) gets closer to 1 and D(G(z)) gets closer to 0
  43. LSGAN vs. vanilla GAN. Replace the cross entropy loss with a least squares (L2) loss; as in the original, D(G(z)) gets closer to 1
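A minimal per-sample sketch of the least-squares losses described above, using 0/1 targets and a 1/2 scale factor as one common coding choice (the helper names and constants are illustrative, not necessarily the paper's exact constants):

```python
def lsgan_d_loss(d_real, d_fake):
    """Least-squares discriminator loss: pull D(x) toward 1, D(G(z)) toward 0."""
    return 0.5 * (d_real - 1.0) ** 2 + 0.5 * d_fake ** 2

def lsgan_g_loss(d_fake):
    """Least-squares generator loss: pull D(G(z)) toward 1."""
    return 0.5 * (d_fake - 1.0) ** 2

# The losses are minimized exactly at the targets stated on the slides.
assert lsgan_d_loss(1.0, 0.0) == 0.0
assert lsgan_g_loss(1.0) == 0.0

# With the sigmoid removed, D's raw score can overshoot, and the quadratic
# penalty keeps growing with distance from the target instead of saturating.
assert lsgan_d_loss(3.0, 0.0) > lsgan_d_loss(2.0, 0.0)
```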
  44. LSGAN. Results (LSUN dataset). Xudong Mao et al. Least Squares Generative Adversarial Networks, 2016
  45. LSGAN. Results (CelebA)
  46. SGAN (Semi-Supervised GAN). Augustus Odena et al. Semi-Supervised Learning with Generative Adversarial Networks, 2016. The discriminator ends with an FC layer with softmax over 11 dimensions (10 classes + fake). Training with real images uses a one-hot vector for the true class (e.g. 2 or 5); training with fake images, generated from a latent vector z, uses a one-hot vector representing the fake label
  47. SGAN. Results (game character). Given one-hot vectors representing the class labels 1-5, the generator can create a character image that takes a certain pose. The code will be available soon: https://github.com/yunjey
  48. ACGAN (Auxiliary Classifier GAN, 2016). Augustus Odena et al. Conditional Image Synthesis with Auxiliary Classifier GANs, 2016. Proposes a new method for improved training of GANs using class labels
  49. ACGAN. How does it work? The discriminator is trained with multi-task learning: (1) an FC layer with sigmoid decides real or fake; (2) an FC layer with softmax predicts the class as a one-hot vector. It is trained both with real images (e.g. labeled with class 2) and with fake images generated from a latent vector z conditioned on a class (e.g. 5)
  50. Extensions of GAN (section 04)
  51. CycleGAN: Unpaired Image-to-Image Translation. Jun-Yan Zhu et al. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, 2017. Presents a GAN model that transfers an image from a source domain A to a target domain B in the absence of paired examples
  52. CycleGAN. How does it work? The generator GAB maps a real image in domain A to a fake image in domain B, and the discriminator DB for domain B decides real or fake. GAB should generate a horse from the zebra to deceive the discriminator DB
  53. CycleGAN. How does it work? GBA generates a reconstructed image of domain A from the fake image in domain B, trained with an L2 reconstruction loss. This forces the shape to be preserved when GAB generates a horse image from the zebra
  54. CycleGAN. Results. Jun-Yan Zhu et al. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, 2017
  55. CycleGAN. Results (SVHN-to-MNIST and MNIST-to-SVHN). Odd columns contain real images and even columns contain generated images. https://github.com/yunjey/mnist-svhn-transfer
  56. Text2Image. Scott Reed et al. Generative Adversarial Text to Image Synthesis, 2016. Presents a novel model architecture that generates an image from text
  57. Text2Image. Training with (real image, right text): the sentence embedding (100) of "A small red bird with a black beak" is concatenated at the last conv layer of D, which asks "is the image real and relevant to the sentence?" For a real image (128x128) with the right text, the discriminator should say 'yes'
  58. Text2Image. Training with (fake image, right text): G maps z (100) and the sentence embedding (100) of the right text to a fake image (128x128). The discriminator should say 'no'; the generator should create an image relevant to the sentence to deceive the discriminator
  59. Text2Image. Training with (real image, wrong text): the wrong text ("A small yellow bird with a brown beak", sampled randomly from the training data) is paired with a real image (128x128); the discriminator should say 'no'
  60. StackGAN: Text to Photo-realistic Image Synthesis. Han Zhang et al. StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks, 2016
  61. StackGAN. Generating 128x128 from scratch: G maps z (100) directly to a fake image (128x128), judged 'real' or 'fake' against real images by a discriminator for 128x128 images; generating from scratch does not guarantee a good result
  62. StackGAN. Generating 128x128 from 64x64: G1 generates a 64x64 image from z (100), judged by the discriminator D1 for 64x64 images; G2 upscales the 64x64 image to 128x128 (easier than generating from scratch), judged by the discriminator D2 for 128x128 images
  63. Future of GAN (section 05)
  64. Convergence Measure. • Boundary Equilibrium GAN (BEGAN). David Berthelot et al. BEGAN: Boundary Equilibrium Generative Adversarial Networks, 2017
  65. Convergence Measure. • Reconstruction loss: G maps z to G(z), which is compared against test images x. Sitao Xiang. On the Effect of Batch Normalization and Weight Normalization in Generative Adversarial Networks, 2017
  66. Better Upsampling. • Deconvolution checkerboard artifacts. http://distill.pub/2016/deconv-checkerboard/
  67. Better Upsampling. • Deconvolution vs. resize-convolution. http://distill.pub/2016/deconv-checkerboard/
  68. GAN in Supervised Learning. • Machine translation (Seq2Seq): the encoder reads 'A B C D' and the decoder emits '<start> X Y Z <end>'. Should 'ABCD' be translated to 'XYZ'? GANs can also tackle this supervised learning setting
  69. GAN in Supervised Learning. • Machine translation (GANs). Lijun Wu et al. Adversarial Neural Machine Translation, 2017. Training with real sentence pairs A (English) and B (Korean): do A and B have the same meaning? The discriminator should say 'yes'. Training with fake pairs A (English) and B (fake Korean): the discriminator should say 'no'. The generator should generate a B that has the same meaning as A to deceive the discriminator
  70. Thank you
  71. Appendix (section 06)
  72. Theory in GAN. Objective function of GANs: min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))], where p_data is the real data distribution over high-dimensional vectors (e.g. 64x64) and p_z is a Gaussian distribution over low-dimensional vectors (e.g. 100)
  73. Theory in GAN. Optimal D: fix G to make V a function of D alone, and get the D where V(D) is maximum: D*(x) = argmax_D V(D) = argmax_D E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]
  74. Theory in GAN. = E_{x~p_data(x)}[log D(x)] + E_{x~p_g(x)}[log(1 - D(x))], sampling x from p_g (the model G's distribution over high-dimensional vectors, e.g. 64x64) instead of sampling z (a 100-dimensional vector) from p_z
  75. Theory in GAN. By the definition of expectation, E_{x~p(x)}[f(x)] = ∫_x p(x) f(x) dx, so this equals ∫_x p_data(x) log D(x) dx + ∫_x p_g(x) log(1 - D(x)) dx, integrating over all possible x
  76. Theory in GAN. By a basic property of integrals, this equals ∫_x [p_data(x) log D(x) + p_g(x) log(1 - D(x))] dx
  77. Theory in GAN. Now we need to find the D(x) which makes the function inside the integral maximum
  78. Theoretical Results. D*(x) = argmax_D V(D) = argmax_D [p_data(x) log D(x) + p_g(x) log(1 - D(x))] (the function inside the integral)
  79. Theoretical Results. Substitute a = p_data(x), y = D(x), b = p_g(x): maximize a log y + b log(1 - y)
  80. Theoretical Results. Differentiate with respect to y = D(x), using d/dx log f(x) = f'(x)/f(x) and noting that D(x) cannot affect p_data(x) or p_g(x): d/dy [a log y + b log(1 - y)] = a/y - b/(1 - y) = (a - (a + b)y) / (y(1 - y))
  81. Theoretical Results. Find the point where the derivative is 0 (a local extremum): (a - (a + b)y) / (y(1 - y)) = 0 gives y = a/(a + b), where the function has its maximum value; note that this local maximum is the global maximum since there are no other local extrema
  82. Theoretical Results. Substituting back a = p_data(x), y = D(x), b = p_g(x): D*(x) = p_data(x) / (p_data(x) + p_g(x))
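The closed form y = a/(a + b) derived above can be checked numerically with a grid search; a minimal sketch with made-up densities (a = 0.7, b = 0.3 are illustrative stand-ins for p_data(x) and p_g(x) at one fixed x):

```python
import math

def inner(y, a, b):
    """The function inside the integral: a log y + b log(1 - y)."""
    return a * math.log(y) + b * math.log(1.0 - y)

a, b = 0.7, 0.3  # stand-ins for p_data(x) and p_g(x) at one fixed x
closed_form = a / (a + b)

# Grid-search the maximizer over (0, 1) and compare with a / (a + b).
grid = [i / 10000 for i in range(1, 10000)]
y_star = max(grid, key=lambda y: inner(y, a, b))
assert abs(y_star - closed_form) < 1e-3
```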
  83. Theoretical Results. Define C(G) = max_D V(D, G), the objective function of G (G should minimize C(G))
  84. Theoretical Results. With the optimal discriminator: C(G) = E_{x~p_data}[log D*(x)] + E_{x~p_g}[log(1 - D*(x))]
  85. Theoretical Results. = E_{x~p_data}[log (p_data(x) / (p_data(x) + p_g(x)))] + E_{x~p_g}[log (p_g(x) / (p_data(x) + p_g(x)))]
  86. Theoretical Results. By the definition of expectation, E_{x~p(x)}[f(x)] = ∫_x p(x) f(x) dx, this equals ∫_x p_data(x) log (p_data(x) / (p_data(x) + p_g(x))) dx + ∫_x p_g(x) log (p_g(x) / (p_data(x) + p_g(x))) dx
  87. Theoretical Results. = -log 4 + log 4 + ∫_x p_data(x) log (p_data(x) / (p_data(x) + p_g(x))) dx + ∫_x p_g(x) log (p_g(x) / (p_data(x) + p_g(x))) dx (adding and subtracting log 4)
  88. Theoretical Results. = -log 4 + ∫_x p_data(x) log (2·p_data(x) / (p_data(x) + p_g(x))) dx + ∫_x p_g(x) log (2·p_g(x) / (p_data(x) + p_g(x))) dx. How?
  89. Theoretical Results. By a property of probability density functions, ∫_x p_data(x) dx = 1
  90. Theoretical Results. Trick: log 2 = ∫_x p_data(x) · log 2 dx
  91. Theoretical Results. Therefore log 2 + ∫_x p_data(x) log (p_data(x) / (p_data(x) + p_g(x))) dx = ∫_x p_data(x) log (2·p_data(x) / (p_data(x) + p_g(x))) dx
  92. Theoretical Results. Putting the steps together: C(G) = -log 4 + ∫_x p_data(x) log (2·p_data(x) / (p_data(x) + p_g(x))) dx + ∫_x p_g(x) log (2·p_g(x) / (p_data(x) + p_g(x))) dx
  93. Theoretical Results. = -log 4 + KL(p_data || (p_data + p_g)/2) + KL(p_g || (p_data + p_g)/2)
  94. 94. 94 Theoretical Results 𝐢 𝐺 = max 𝑉(𝐷, 𝐺) 𝐷 = 𝐸 π‘₯~𝑝 π‘‘π‘Žπ‘‘π‘Ž log π·βˆ— π‘₯ + 𝐸 π‘₯~𝑝 𝑔 log(1 βˆ’ π·βˆ—(π‘₯)) = 𝐸 π‘₯~𝑝 π‘‘π‘Žπ‘‘π‘Ž log 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) + 𝐸 π‘₯~𝑝 𝑔 log 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) = βˆ’π‘™π‘œπ‘”4 + ΰΆ± 𝑝 π‘‘π‘Žπ‘‘π‘Ž π‘₯ log 2 βˆ™ 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ + ΰΆ± 𝑝 𝑔 π‘₯ log 2 βˆ™ 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ = βˆ’π‘™π‘œπ‘”4 + 𝐾𝐿(𝑝 π‘‘π‘Žπ‘‘π‘Ž|| 𝑝 π‘‘π‘Žπ‘‘π‘Ž + 𝑝 𝑔 2 ) + 𝐾𝐿(𝑝 𝑔|| 𝑝 π‘‘π‘Žπ‘‘π‘Ž + 𝑝 𝑔 2 ) 𝐾𝐿(𝑃| 𝑄 = ΰΆ± 𝑃(π‘₯) log 𝑃(π‘₯) 𝑄(π‘₯) 𝑑π‘₯ Theory in GAN GANs π‘₯ π‘₯ π‘₯ Definition of KL-Divergence
  95. 95. 95 Theoretical Results 𝐢 𝐺 = max 𝑉(𝐷, 𝐺) 𝐷 = 𝐸 π‘₯~𝑝 π‘‘π‘Žπ‘‘π‘Ž log π·βˆ— π‘₯ + 𝐸 π‘₯~𝑝 𝑔 log(1 βˆ’ π·βˆ—(π‘₯)) = 𝐸 π‘₯~𝑝 π‘‘π‘Žπ‘‘π‘Ž log 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) + 𝐸 π‘₯~𝑝 𝑔 log 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) = βˆ’π‘™π‘œπ‘”4 + ΰΆ± 𝑝 π‘‘π‘Žπ‘‘π‘Ž π‘₯ log 2 βˆ™ 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ + ΰΆ± 𝑝 𝑔 π‘₯ log 2 βˆ™ 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ = βˆ’π‘™π‘œπ‘”4 + 𝐾𝐿(𝑝 π‘‘π‘Žπ‘‘π‘Ž|| 𝑝 π‘‘π‘Žπ‘‘π‘Ž + 𝑝 𝑔 2 ) + 𝐾𝐿(𝑝 𝑔|| 𝑝 π‘‘π‘Žπ‘‘π‘Ž + 𝑝 𝑔 2 ) 𝐾𝐿(𝑃| 𝑄 = ΰΆ± 𝑃(π‘₯) log 𝑃(π‘₯) 𝑄(π‘₯) 𝑑π‘₯ ΰΆ± 𝑝 π‘‘π‘Žπ‘‘π‘Ž π‘₯ log 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 2 𝑑π‘₯ = 𝐾𝐿(𝑝 π‘‘π‘Žπ‘‘π‘Ž|| 𝑝 π‘‘π‘Žπ‘‘π‘Ž + 𝑝 𝑔 2 ) Theory in GAN GANs π‘₯ π‘₯ π‘₯ π‘₯
  96. 96. 96 Theoretical Results 𝐢 𝐺 = max 𝑉(𝐷, 𝐺) 𝐷 = 𝐸 π‘₯~𝑝 π‘‘π‘Žπ‘‘π‘Ž log π·βˆ— π‘₯ + 𝐸 π‘₯~𝑝 𝑔 log(1 βˆ’ π·βˆ—(π‘₯)) = 𝐸 π‘₯~𝑝 π‘‘π‘Žπ‘‘π‘Ž log 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) + 𝐸 π‘₯~𝑝 𝑔 log 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) = ΰΆ± 𝑝 π‘‘π‘Žπ‘‘π‘Ž π‘₯ log 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ + ΰΆ± 𝑝 𝑔 π‘₯ log 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ = βˆ’π‘™π‘œπ‘”4 + π‘™π‘œπ‘”4 + ΰΆ± 𝑝 π‘‘π‘Žπ‘‘π‘Ž π‘₯ log 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ + ΰΆ± 𝑝 𝑔 π‘₯ log 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ = βˆ’π‘™π‘œπ‘”4 + ΰΆ± 𝑝 π‘‘π‘Žπ‘‘π‘Ž π‘₯ log 2 βˆ™ 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ + ΰΆ± 𝑝 𝑔 π‘₯ log 2 βˆ™ 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ = βˆ’π‘™π‘œπ‘”4 + 𝐾𝐿(𝑝 π‘‘π‘Žπ‘‘π‘Ž|| 𝑝 π‘‘π‘Žπ‘‘π‘Ž + 𝑝 𝑔 2 ) + 𝐾𝐿(𝑝 𝑔|| 𝑝 π‘‘π‘Žπ‘‘π‘Ž + 𝑝 𝑔 2 ) = βˆ’π‘™π‘œπ‘”4 + 2 βˆ™ 𝐽𝑆𝐷(𝑝 π‘‘π‘Žπ‘‘π‘Ž||𝑝 𝑔) Theory in GAN GANs
  97. 97. 97 Theoretical Results 𝐢 𝐺 = max 𝑉(𝐷, 𝐺) 𝐷 = 𝐸 π‘₯~𝑝 π‘‘π‘Žπ‘‘π‘Ž log π·βˆ— π‘₯ + 𝐸 π‘₯~𝑝 𝑔 log(1 βˆ’ π·βˆ—(π‘₯)) = 𝐸 π‘₯~𝑝 π‘‘π‘Žπ‘‘π‘Ž log 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) + 𝐸 π‘₯~𝑝 𝑔 log 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) = ΰΆ± 𝑝 π‘‘π‘Žπ‘‘π‘Ž π‘₯ log 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ + ΰΆ± 𝑝 𝑔 π‘₯ log 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ = βˆ’π‘™π‘œπ‘”4 + π‘™π‘œπ‘”4 + ΰΆ± 𝑝 π‘‘π‘Žπ‘‘π‘Ž π‘₯ log 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ + ΰΆ± 𝑝 𝑔 π‘₯ log 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ = βˆ’π‘™π‘œπ‘”4 + ΰΆ± 𝑝 π‘‘π‘Žπ‘‘π‘Ž π‘₯ log 2 βˆ™ 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ + ΰΆ± 𝑝 𝑔 π‘₯ log 2 βˆ™ 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ = βˆ’π‘™π‘œπ‘”4 + 𝐾𝐿(𝑝 π‘‘π‘Žπ‘‘π‘Ž|| 𝑝 π‘‘π‘Žπ‘‘π‘Ž + 𝑝 𝑔 2 ) + 𝐾𝐿(𝑝 𝑔|| 𝑝 π‘‘π‘Žπ‘‘π‘Ž + 𝑝 𝑔 2 ) = βˆ’π‘™π‘œπ‘”4 + 2 βˆ™ 𝐽𝑆𝐷(𝑝 π‘‘π‘Žπ‘‘π‘Ž||𝑝 𝑔) Theory in GAN GANs
  98. 98. 98 Theory in GAN GANs = 𝐸 π‘₯~𝑝 π‘‘π‘Žπ‘‘π‘Ž log π·βˆ— π‘₯ + 𝐸 π‘₯~𝑝 𝑔 log(1 βˆ’ π·βˆ— (π‘₯)) = ΰΆ± 𝑝 π‘‘π‘Žπ‘‘π‘Ž π‘₯ log 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ + ΰΆ± 𝑝 𝑔 π‘₯ log 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ = βˆ’π‘™π‘œπ‘”4 + π‘™π‘œπ‘”4 + ΰΆ± 𝑝 π‘‘π‘Žπ‘‘π‘Ž π‘₯ log 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ + ΰΆ± 𝑝 𝑔 π‘₯ log 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ = βˆ’π‘™π‘œπ‘”4 + ΰΆ± 𝑝 π‘‘π‘Žπ‘‘π‘Ž π‘₯ log 2 βˆ™ 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ + ΰΆ± 𝑝 𝑔 π‘₯ log 2 βˆ™ 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ = βˆ’π‘™π‘œπ‘”4 + 𝐾𝐿(𝑝 π‘‘π‘Žπ‘‘π‘Ž|| 𝑝 π‘‘π‘Žπ‘‘π‘Ž + 𝑝 𝑔 2 ) + 𝐾𝐿(𝑝 𝑔|| 𝑝 π‘‘π‘Žπ‘‘π‘Ž + 𝑝 𝑔 2 ) = βˆ’π‘™π‘œπ‘”4 + 2 βˆ™ 𝐽𝑆𝐷(𝑝 π‘‘π‘Žπ‘‘π‘Ž||𝑝 𝑔) π‘₯ π‘₯ π‘₯ π‘₯ π‘₯π‘₯ π‘šπ‘–π‘› π‘šπ‘Žπ‘₯ 𝑉 𝐷, 𝐺 = 𝐺 𝐷 π‘šπ‘–π‘› 𝑉 π·βˆ—, 𝐺 𝐺 𝑉 π·βˆ—, 𝐺
  99. 99. 99 Theory in GAN GANs = 𝐸 π‘₯~𝑝 π‘‘π‘Žπ‘‘π‘Ž log π·βˆ— π‘₯ + 𝐸 π‘₯~𝑝 𝑔 log(1 βˆ’ π·βˆ— (π‘₯)) = ΰΆ± 𝑝 π‘‘π‘Žπ‘‘π‘Ž π‘₯ log 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ + ΰΆ± 𝑝 𝑔 π‘₯ log 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ = βˆ’π‘™π‘œπ‘”4 + π‘™π‘œπ‘”4 + ΰΆ± 𝑝 π‘‘π‘Žπ‘‘π‘Ž π‘₯ log 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ + ΰΆ± 𝑝 𝑔 π‘₯ log 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ = βˆ’π‘™π‘œπ‘”4 + ΰΆ± 𝑝 π‘‘π‘Žπ‘‘π‘Ž π‘₯ log 2 βˆ™ 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ + ΰΆ± 𝑝 𝑔 π‘₯ log 2 βˆ™ 𝑝 𝑔(π‘₯) 𝑝 π‘‘π‘Žπ‘‘π‘Ž(π‘₯) + 𝑝 𝑔(π‘₯) 𝑑π‘₯ = βˆ’π‘™π‘œπ‘”4 + 𝐾𝐿(𝑝 π‘‘π‘Žπ‘‘π‘Ž|| 𝑝 π‘‘π‘Žπ‘‘π‘Ž + 𝑝 𝑔 2 ) + 𝐾𝐿(𝑝 𝑔|| 𝑝 π‘‘π‘Žπ‘‘π‘Ž + 𝑝 𝑔 2 ) = βˆ’π‘™π‘œπ‘”4 + 2 βˆ™ 𝐽𝑆𝐷(𝑝 π‘‘π‘Žπ‘‘π‘Ž||𝑝 𝑔) π‘₯ π‘₯ π‘₯ π‘₯ π‘₯π‘₯ π‘šπ‘–π‘› π‘šπ‘Žπ‘₯ 𝑉 𝐷, 𝐺 = 𝐺 𝐷 π‘šπ‘–π‘› 𝑉 π·βˆ—, 𝐺 𝐺 𝑉 π·βˆ—, 𝐺 𝐺 π‘ β„Žπ‘œπ‘’π‘™π‘‘ π‘šπ‘–π‘›π‘–π‘šπ‘–π‘§π‘’ 𝐺 π‘ β„Žπ‘œπ‘’π‘™π‘‘ π‘šπ‘–π‘›π‘–π‘šπ‘–π‘§π‘’ π‘‚π‘π‘‘π‘–π‘šπ‘–π‘§π‘–π‘›π‘” 𝑉 𝐷, 𝐺 𝑖𝑠 π‘ π‘Žπ‘šπ‘’ π‘Žπ‘  π‘šπ‘–π‘›π‘–π‘šπ‘–π‘§π‘–π‘›π‘” 𝐽𝑆𝐷(𝑝 π‘‘π‘Žπ‘‘π‘Ž||𝑝 𝑔) π‘‚π‘π‘‘π‘–π‘šπ‘Žπ‘™ 𝐷
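The identity above can be sanity-checked numerically. The following sketch (not from the slides; variable names are my own) uses two arbitrary discrete distributions in NumPy and verifies that the value function at the optimal discriminator equals $-\log 4 + 2 \cdot JSD(p_{data} \| p_g)$, and that it reaches its minimum $-\log 4$ when $p_g = p_{data}$:

```python
import numpy as np

def kl(p, q):
    """KL(P || Q) = sum_x P(x) * log(P(x) / Q(x)) for discrete distributions."""
    return np.sum(p * np.log(p / q))

# Two arbitrary discrete distributions standing in for p_data and p_g
rng = np.random.default_rng(0)
p_data = rng.random(5); p_data /= p_data.sum()
p_g = rng.random(5); p_g /= p_g.sum()

# Value function at the optimal discriminator D*(x) = p_data(x) / (p_data(x) + p_g(x))
d_star = p_data / (p_data + p_g)
v = np.sum(p_data * np.log(d_star)) + np.sum(p_g * np.log(1 - d_star))

# JSD(p_data || p_g) = (1/2) KL(p_data || m) + (1/2) KL(p_g || m), m = (p_data + p_g)/2
m = (p_data + p_g) / 2
jsd = 0.5 * kl(p_data, m) + 0.5 * kl(p_g, m)

# C(G) = -log 4 + 2 * JSD(p_data || p_g)
assert np.isclose(v, -np.log(4) + 2 * jsd)

# Global minimum: when p_g = p_data, D* = 1/2 everywhere and C(G) = -log 4
d_eq = np.full_like(p_data, 0.5)
v_eq = np.sum(p_data * np.log(d_eq)) + np.sum(p_data * np.log(1 - d_eq))
assert np.isclose(v_eq, -np.log(4))
```

Because $JSD \geq 0$ with equality only at $p_g = p_{data}$, the second check confirms that $-\log 4$ is the global minimum of $C(G)$.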
