
Generative Adversarial Networks

Tutorial on Generative Adversarial Networks
youtube: https://www.youtube.com/playlist?list=PLeeHDpwX2Kj5Ugx6c9EfDLDojuQxnmxmU


  1. 1. Generative Adversarial Networks Mark Chang
  2. 2. Original Paper • Title: – Generative Adversarial Nets • Authors: – Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio • Organization: – Université de Montréal • URL: – https://arxiv.org/abs/1406.2661
  3. 3. Outline • Mini-max Two-player Game • Generative Model vs. Discriminative Model • Generative Adversarial Networks • Convergence Proof • Experiment • Further Research
  4. 4. Mini-max Two-player Game
  5. 5. Mini-max Two-player Game • s: the current player • s': the opponent • a_s: the action taken by the current player • a_s': the action taken by the opponent • v_s: the value function of the current player, v_s = max_{a_s} min_{a_s'} v_s(a_s, a_s')
  6. 6. Mini-max Two-player Game • The current player picks the action whose worst-case outcome, after the opponent's best reply, is highest: in the game tree with leaf payoffs -1 and 1, one branch guarantees only -1 in the worst case while the other guarantees 1, so v_s = max_{a_s} min_{a_s'} v_s(a_s, a_s') = 1
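The max-min recursion on the slide can be sketched as a short recursive search; the game tree below is a hypothetical two-move game chosen to reproduce the slide's -1/1 payoffs, not one taken from the deck.

```python
def minimax(node, maximizing):
    """Value of a two-player zero-sum game tree.

    Leaves are payoffs for the maximizing player; internal nodes are
    lists of child nodes. Implements v_s = max_{a_s} min_{a_s'} v_s(a_s, a_s').
    """
    if not isinstance(node, list):  # leaf: terminal payoff
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Hypothetical tree: the left branch risks -1, the right branch guarantees 1,
# so the maximizing player's minimax value is 1, matching the slide.
tree = [[-1, 1], [1, 1]]
print(minimax(tree, True))  # → 1
```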
  7. 7. Generative Model vs. Discriminative Model • Discriminative Model: models the conditional p(y|x) • Generative Model: models the joint p(x, y)
  8. 8. Discriminative Model: p(y|x) • Splits the input space into regions where p(y = 0|x) > p(y = 1|x) or p(y = 1|x) > p(y = 0|x), with p(y = 0|x) + p(y = 1|x) = 1 • The decision boundary lies where p(y = 0|x) = p(y = 1|x) = 0.5 • An example of some other, unseen class is still forced onto one side of the boundary
  9. 9. Generative Model: p(x, y) • Models the joint densities p(x, y = 0) and p(x, y = 1) directly • Can generate new examples by sampling from p(x, y) • An example of some other class shows up as low density under both p(x, y = 0) and p(x, y = 1)
  10. 10. Generative Adversarial Networks • Generate new data with a neural network: p(x, z) = p(z)p(x|z) • Sample z ~ p(z) from the prior, then the generator network maps it through p(x|z) to produce generated data x
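The factorization p(x, z) = p(z)p(x|z) suggests ancestral sampling: draw z from the prior, then push it through the generator. A minimal sketch, with a stand-in affine generator instead of a trained neural network:

```python
import random

def sample_generator(g, n=5, seed=0):
    """Ancestral sampling from p(x, z) = p(z) p(x|z):
    draw z from the prior, then push it through the generator g."""
    rng = random.Random(seed)
    xs = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)  # z ~ p(z), a standard normal prior
        x = g(z)                 # x = G(z), a deterministic p(x|z)
        xs.append(x)
    return xs

# Hypothetical generator: an affine map standing in for a neural net.
samples = sample_generator(lambda z: 2.0 * z + 3.0)
```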
  11. 11. Generative Adversarial Networks • Generator Network G(z): maps z ~ p_z(z) (prior) to generated data • Discriminator Network D(x): sigmoid output, should be 1 on real data x ~ p_data(x) and 0 on generated data • Objective: min_G max_D V(D, G), where V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]
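V(D, G) is a sum of two expectations, so it can be estimated by Monte Carlo from samples of p_data and p_z. A sketch with toy distributions (the data distribution, prior, and networks below are assumptions, not the paper's setup); a blind discriminator D(x) = 0.5 gives V = log(1/2) + log(1/2) = -log 4 exactly:

```python
import math, random

def value_fn(D, G, sample_data, sample_prior, n=10000, seed=0):
    """Monte Carlo estimate of
    V(D, G) = E_{x~p_data}[log D(x)] + E_{z~p_z}[log(1 - D(G(z)))]."""
    rng = random.Random(seed)
    real = sum(math.log(D(sample_data(rng))) for _ in range(n)) / n
    fake = sum(math.log(1.0 - D(G(sample_prior(rng)))) for _ in range(n)) / n
    return real + fake

# Toy setup: constant D(x) = 0.5, identity generator, Gaussian data/prior.
v = value_fn(D=lambda x: 0.5,
             G=lambda z: z,
             sample_data=lambda rng: rng.gauss(4.0, 1.0),
             sample_prior=lambda rng: rng.gauss(0.0, 1.0))
# v == log(0.5) + log(0.5) = -log 4, since D is constant
```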
  12. 12. Training Discriminator Network max_D (E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]) • D(x) should be 1 on real data; D(G(z)) should be 0 on generated data • Before training: D(x) ≈ 0 ⇒ log D(x) << 0, and D(G(z)) ≈ 1 ⇒ log(1 - D(G(z))) << 0, so the objective is far from its maximum
  13. 13. Training Discriminator Network • After training: D(x) ≈ 1 ⇒ log D(x) ≈ 0, and D(G(z)) ≈ 0 ⇒ log(1 - D(G(z))) ≈ 0, so both terms approach their maximum of 0
  14. 14. Training Generator Network min_G (E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]) ⇒ min_G (E_{z~p_z(z)}[log(1 - D(G(z)))]), since there is no G in the first term ⇒ in practice, max_G (E_{z~p_z(z)}[log D(G(z))]) • D(G(z)) should not be 0 ⇒ D(G(z)) should be 1
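The switch from min_G log(1 - D(G(z))) to max_G log D(G(z)) is usually motivated by gradients: when the discriminator confidently rejects generated samples, the first form saturates while the second does not. Writing D = sigmoid(u) for a discriminator logit u, the two derivatives can be compared directly:

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def grad_saturating(u):
    """d/du of log(1 - sigmoid(u)): the generator's gradient signal under
    min_G log(1 - D(G(z))), taken w.r.t. the discriminator logit u."""
    return -sigmoid(u)

def grad_nonsaturating(u):
    """d/du of log(sigmoid(u)): the signal under max_G log D(G(z))."""
    return 1.0 - sigmoid(u)

# Early in training the discriminator rejects samples: D(G(z)) ~ 0, u << 0.
u = -6.0
print(abs(grad_saturating(u)))     # ≈ 0.0025: vanishing signal
print(abs(grad_nonsaturating(u)))  # ≈ 0.9975: strong signal
```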
  15. 15. Training Generator Network max_G (E_{z~p_z(z)}[log D(G(z))]) • D(G(z)) should be 1 • Before training: the generated data G(z), z ~ p_z(z), lies far from the real data, so D(G(z)) ≈ 0 ⇒ log D(G(z)) << 0
  16. 16. Training Generator Network • After training: the generated data G(z) resembles the real data, so D(G(z)) ≈ 1 ⇒ log D(G(z)) ≈ 0
  17. 17. Training Generative Adversarial Networks • Training alternates between the two objectives: run max_D V(D, G) to update the discriminator, then min_G V(D, G) to update the generator, and repeat, so that G(z) gradually moves toward the real data
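The alternating scheme can be sketched as a tiny 1-D GAN with hand-derived gradients. Everything here is an illustrative assumption, not the paper's setup: real data ~ N(4, 0.5), an affine generator G(z) = a*z + b, a logistic discriminator D(x) = sigmoid(w*x + c), and the non-saturating generator loss.

```python
import math, random

rng = random.Random(0)
a, b = 1.0, 0.0    # generator parameters: G(z) = a*z + b
w, c = 0.0, 0.0    # discriminator parameters: D(x) = sigmoid(w*x + c)
lr, batch = 0.05, 64

def sigmoid(u):
    u = max(-60.0, min(60.0, u))  # clamp to avoid overflow
    return 1.0 / (1.0 + math.exp(-u))

def mean(xs):
    return sum(xs) / len(xs)

for step in range(3000):
    xs = [rng.gauss(4.0, 0.5) for _ in range(batch)]  # x ~ p_data
    zs = [rng.gauss(0.0, 1.0) for _ in range(batch)]  # z ~ p_z
    gz = [a * z + b for z in zs]                      # G(z)

    # Discriminator: one ascent step on E[log D(x)] + E[log(1 - D(G(z)))]
    d_real = [sigmoid(w * x + c) for x in xs]
    d_fake = [sigmoid(w * g + c) for g in gz]
    w += lr * (mean([(1 - d) * x for d, x in zip(d_real, xs)])
               - mean([d * g for d, g in zip(d_fake, gz)]))
    c += lr * (mean([1 - d for d in d_real]) - mean(d_fake))

    # Generator: one ascent step on E[log D(G(z))] (non-saturating form)
    d_fake = [sigmoid(w * (a * z + b) + c) for z in zs]
    a += lr * mean([(1 - d) * w * z for d, z in zip(d_fake, zs)])
    b += lr * mean([(1 - d) * w for d in d_fake])

gen_mean = mean([a * rng.gauss(0, 1) + b for _ in range(10000)])
# after training, generated samples should sit near the real mean of 4
```

With a model this small the equilibrium is rough, but the generated mean reliably drifts from 0 toward the data mean, which is the behavior the alternating min-max loop is meant to produce.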
  18. 18. Global Optimum • At the global optimum the generated data G(z) matches the real data, and the discriminator cannot tell them apart: D(G(z)) = 0.5
  19. 19. Convergence Proof • Global Optimum Exists • Converge to Global Optimum
  20. 20. Global Optimum Exists • Before training: the generated data distribution p_g(x), obtained by pushing the prior p_z(z) through the generator G(z), differs from the real data distribution p_data(x) • After training, at the global optimum: p_data(x) = p_g(x), and the discriminator outputs D(x) = 0.5 everywhere
  21. 21. Global Optimum Exists • For fixed G, the optimal discriminator is: D*_G(x) = p_data(x) / (p_data(x) + p_g(x)) • Examples: p_data(x) = 0, p_g(x) = 0.5 ⇒ D*_G(x) = 0/(0 + 0.5) = 0; p_data(x) = 0.5, p_g(x) = 0.5 ⇒ D*_G(x) = 0.5/(0.5 + 0.5) = 0.5; p_data(x) = 0.5, p_g(x) = 0 ⇒ D*_G(x) = 0.5/(0.5 + 0) = 1
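The fixed-G optimum is a one-line formula, and the three cases on the slide can be checked directly:

```python
def optimal_discriminator(p_data, p_g):
    """D*_G(x) = p_data(x) / (p_data(x) + p_g(x)), the optimum for fixed G."""
    return p_data / (p_data + p_g)

# The three density pairs from the slide:
print(optimal_discriminator(0.0, 0.5))  # → 0.0  (only generated mass here)
print(optimal_discriminator(0.5, 0.5))  # → 0.5  (indistinguishable)
print(optimal_discriminator(0.5, 0.0))  # → 1.0  (only real mass here)
```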
  22. 22. Global Optimum Exists V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))] = ∫_x p_data(x) log(D(x)) dx + ∫_z p_z(z) log(1 - D(G(z))) dz. With the change of variables x = G(z) ⇒ z = G^{-1}(x) ⇒ dz = (G^{-1})'(x) dx, and p_g(x) = p_z(G^{-1}(x)) (G^{-1})'(x): = ∫_x p_data(x) log(D(x)) dx + ∫_x p_z(G^{-1}(x)) log(1 - D(x)) (G^{-1})'(x) dx = ∫_x p_data(x) log(D(x)) dx + ∫_x p_g(x) log(1 - D(x)) dx = ∫_x [p_data(x) log(D(x)) + p_g(x) log(1 - D(x))] dx
  23. 23. Global Optimum Exists max_D V(D, G) = max_D ∫_x [p_data(x) log(D(x)) + p_g(x) log(1 - D(x))] dx. Maximize the integrand pointwise: ∂/∂D(x) [p_data(x) log(D(x)) + p_g(x) log(1 - D(x))] = 0 ⇒ p_data(x)/D(x) - p_g(x)/(1 - D(x)) = 0 ⇒ D(x) = p_data(x) / (p_data(x) + p_g(x))
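The pointwise maximizer can also be confirmed numerically: scanning D(x) over (0, 1) for fixed densities (the values 0.7 and 0.3 are chosen arbitrarily here), the argmax lands at p_data/(p_data + p_g):

```python
import math

def integrand(d, p_data, p_g):
    """p_data*log(d) + p_g*log(1 - d): the pointwise term inside V(D, G)."""
    return p_data * math.log(d) + p_g * math.log(1.0 - d)

p_data, p_g = 0.7, 0.3
# Scan d over a fine grid in (0, 1); the maximizer should match
# p_data / (p_data + p_g) = 0.7.
grid = [i / 1000.0 for i in range(1, 1000)]
best = max(grid, key=lambda d: integrand(d, p_data, p_g))
```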
  24. 24. Global Optimum Exists • Suppose the discriminator is optimal, D*_G(x) = p_data(x) / (p_data(x) + p_g(x)); the optimal generator makes: D*_G(x) = 1/2 ⇔ p_data(x) = p_g(x) • Examples: p_data(x) = 0.3, p_g(x) = 0.3 ⇒ D*_G(x) = 0.3/(0.3 + 0.3) = 0.5; p_data(x) = 0.8, p_g(x) = 0.8 ⇒ D*_G(x) = 0.8/(0.8 + 0.8) = 0.5
  25. 25. Global Optimum Exists C(G) = max_D V(G, D) = ∫_x [p_data(x) log(D*_G(x)) + p_g(x) log(1 - D*_G(x))] dx = ∫_x [p_data(x) log(p_data(x)/(p_data(x) + p_g(x))) + p_g(x) log(p_g(x)/(p_data(x) + p_g(x)))] dx = -log(4) + ∫_x [p_data(x) log(p_data(x) / ((p_data(x) + p_g(x))/2)) + p_g(x) log(p_g(x) / ((p_data(x) + p_g(x))/2))] dx = -log(4) + KL[p_data(x) || (p_data(x) + p_g(x))/2] + KL[p_g(x) || (p_data(x) + p_g(x))/2]
  26. 26. Global Optimum Exists C(G) = KL[p_data(x) || (p_data(x) + p_g(x))/2] + KL[p_g(x) || (p_data(x) + p_g(x))/2] - log(4) • Both KL terms are ≥ 0, and KL[p_data(x) || (p_data(x) + p_g(x))/2] = 0 exactly when p_data(x) = (p_data(x) + p_g(x))/2, i.e. p_data(x) = p_g(x) • Hence min_G C(G) = 0 + 0 - log(4) = -log(4), attained at p_data = p_g
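The bound C(G) ≥ -log 4, with equality exactly when p_data = p_g, can be checked on small discrete distributions (the four-point distributions below are illustrative choices):

```python
import math

def kl(p, q):
    """KL divergence between two discrete distributions (lists summing to 1)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def c_of_g(p_data, p_g):
    """C(G) = KL(p_data || m) + KL(p_g || m) - log 4, m the equal mixture."""
    m = [(pd + pg) / 2 for pd, pg in zip(p_data, p_g)]
    return kl(p_data, m) + kl(p_g, m) - math.log(4)

same = [0.25, 0.25, 0.25, 0.25]
print(c_of_g(same, same))  # → -log 4 ≈ -1.386, the global minimum
# Disjoint supports give the maximum gap: C(G) = 2 log 2 - log 4 = 0 > -log 4.
print(c_of_g([1.0, 0, 0, 0], [0, 0, 0, 1.0]) > -math.log(4))  # → True
```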
  27. 27. Converge to Global Optimum • Plotting V(D, G) over the D and G axes: from the current V(D, G), running max_D V(G, D) with G fixed moves along the D axis toward the ridge of V
  28. 28. Converge to Global Optimum • Then running min_G V(G, D) with D fixed moves along the G axis, bringing the current V(D, G) toward the global optimum
  29. 29. Experiment • Quantitative Analyses • Qualitative Analyses
  30. 30. Experiment • Quantitative Analyses – log-likelihood estimation – Parzen window: p̂(x) = (1/N) Σ_{i=1}^{N} G_σ(x - x_i), with G_σ a Gaussian kernel
  31. 31. Experiment • Fit the Parzen window p̂(x) = (1/N) Σ_{i=1}^{N} G_σ(x - x_i) on generated data G(z), then report the log-likelihood of the testing data x_i under p̂(x)
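A 1-D sketch of this evaluation protocol (the kernel width sigma and the sample values are arbitrary): place Gaussian kernels on generated samples and score held-out test points; generated data close to the test data scores higher.

```python
import math

def parzen_log_likelihood(test_points, generated, sigma=0.2):
    """Average log-likelihood of test_points under a Parzen-window density
    p_hat(x) = (1/N) * sum_i G_sigma(x - x_i) fit on generated samples."""
    norm = 1.0 / (math.sqrt(2 * math.pi) * sigma)
    total = 0.0
    for x in test_points:
        p_hat = sum(norm * math.exp(-0.5 * ((x - xi) / sigma) ** 2)
                    for xi in generated) / len(generated)
        total += math.log(p_hat)
    return total / len(test_points)

# A generator whose samples match the test data beats one whose don't.
test = [0.0, 0.1, -0.1]
close = parzen_log_likelihood(test, [0.05, -0.05, 0.0])
far = parzen_log_likelihood(test, [5.0, 5.1, 4.9])
```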
  32. 32. Experiment • Qualitative Analyses: – Visualization of samples from the model, shown side by side with real data
  33. 33. Experiment • Qualitative Analyses: – Linear interpolation in z space: interpolate between two samples of the prior z ~ p_z(z) and decode each intermediate point with the generator
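Latent-space interpolation itself is a few lines: blend two latent vectors and decode each blend. The identity "generator" below is a placeholder for a trained G:

```python
def lerp_in_z(z0, z1, g, steps=5):
    """Linear interpolation in latent space: decode g((1-t)*z0 + t*z1)
    for t evenly spaced in [0, 1]. g stands in for a trained generator."""
    out = []
    for i in range(steps):
        t = i / (steps - 1)
        z = [(1 - t) * u + t * v for u, v in zip(z0, z1)]
        out.append(g(z))
    return out

# With an identity "generator", the endpoints reproduce z0 and z1 exactly.
path = lerp_in_z([0.0, 0.0], [1.0, 2.0], g=lambda z: z)
```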
  34. 34. Further Research Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks Alec Radford, Luke Metz, Soumith Chintala
  35. 35. Further Research Adversarial Autoencoders Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, Brendan Frey
  36. 36. Further Research InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel
  37. 37. Source Code • Original paper (theano): – https://github.com/goodfeli/adversarial • Tensorflow implementation: – https://github.com/ckmarkoh/GAN-tensorflow
