
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)

A basic introduction to Generative Adversarial Networks: what they are, how they work, and why they are worth studying. This presentation shows their contribution to the field of Machine Learning and why they have been considered one of its major breakthroughs.

  1. 1. A (very) gentle introduction to Generative Adversarial Networks (a.k.a. GANs) Thomas Paula #4 Porto Alegre ML Meetup
  2. 2. Who am I 2 Thomas Paula Software Engineer Machine Learning Engineer Twitter @tsp_thomas
  3. 3. Why study GANs? 3
  4. 4. “(…) (GANs) and the variations that are now being proposed is the most interesting idea in the last 10 years in ML (…)” Yann LeCun Director of Facebook AI Research Source: https://www.quora.com/What-are-some-recent-and-potentially-upcoming-breakthroughs-in-deep-learning/answer/Yann-LeCun
  5. 5. Background 5
  6. 6. Recalling supervised learning • Labelled data; • Algorithms try to predict an output value based on a given input; • Examples include • Classification algorithms such as SVM • Regression algorithms such as Linear Regression 6
  7. 7. Recalling unsupervised learning • Unlabelled data; • Algorithms try to discover hidden structures in the data; • Examples include • Clustering algorithms such as K-means • Generative models such as GANs 7
  8. 8. Discriminative models • Learn a function that maps the input 𝑥 to an output 𝑦; • Conditional probability 𝑃(𝑦|𝑥); • Classification algorithms such as SVM. 8
  9. 9. Generative Models 9 • Try to learn the joint probability of the input 𝑥 and the output 𝑦 at the same time; • Joint probability 𝑃(𝑥, 𝑦); • Generative statistical models such as Latent Dirichlet Allocation.
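The two views are linked by ordinary probability: a model that knows the joint distribution can recover the conditional that a discriminative model learns directly. A short derivation (an editorial addition, not from the slides):

```latex
P(y \mid x) = \frac{P(x, y)}{P(x)} = \frac{P(x, y)}{\sum_{y'} P(x, y')}
```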
  10. 10. “What I cannot create, I do not understand.” Richard Feynman Nobel Prize in Physics, 1965 Source: https://openai.com/blog/generative-models/
  11. 11. GANs 11
  12. 12. What are GANs? 12 First, an intuition Goal: produce counterfeit money that is as similar as possible to real money. Goal: distinguish between real and counterfeit money.
  13. 13. What are GANs? 13 First, an intuition Generator goal: produce counterfeit money that is as similar as possible to real money. Discriminator goal: distinguish between real and counterfeit money.
  14. 14. What are GANs? 14 [Diagram: latent vector 𝑧 → generator → generated instance 𝑥̂ → discriminator ← real instance 𝑥]
  15. 15. What are GANs? 15 [Diagram, annotated: 𝑧 is (usually) sampled from a Gaussian (Normal) distribution; the generator maps it to a generated instance 𝑥̂.]
  16. 16. What are GANs? 16 [Diagram, annotated: the discriminator receives both the generated instance 𝑥̂ and real instances 𝑥 from the training data.]
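The pipeline sketched in these diagrams can be made concrete. Below is a minimal PyTorch sketch; the slides ship no code, so the framework, layer sizes, and MLP architectures are all illustrative assumptions:

```python
import torch
import torch.nn as nn

LATENT_DIM, DATA_DIM = 100, 784  # illustrative sizes (e.g. flattened 28x28 images)

# Generator: maps latent noise z to a generated instance.
generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 256), nn.ReLU(),
    nn.Linear(256, DATA_DIM), nn.Tanh(),  # outputs scaled to [-1, 1]
)

# Discriminator: maps an instance to the probability that it is real.
discriminator = nn.Sequential(
    nn.Linear(DATA_DIM, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

z = torch.randn(16, LATENT_DIM)  # z sampled from a Normal distribution, as on the slide
x_hat = generator(z)             # generated instances
p_real = discriminator(x_hat)    # discriminator's belief that each instance is real
```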
  17. 17. Generative Adversarial Networks • Created by Ian Goodfellow (OpenAI); • Two neural networks compete (minimax game) • The discriminative network tries to distinguish between real and fake data • The generative network tries to generate samples that fool the discriminative one • Uses a latent code (𝑧). 17 Source: Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014.
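The minimax game on this slide is, in the notation of the cited Goodfellow et al. (2014) paper:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right] +
  \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```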
  18. 18. 18 Source: Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014.
  19. 19. Training process 19
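A minimal sketch of the alternating training step, reusing the `generator` and `discriminator` defined above; the binary cross-entropy loss and Adam hyperparameters are assumptions for illustration, not taken from the slides:

```python
bce = nn.BCELoss()
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)

def training_step(real_batch):
    n = real_batch.size(0)
    ones, zeros = torch.ones(n, 1), torch.zeros(n, 1)

    # 1) Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    fake = generator(torch.randn(n, LATENT_DIM)).detach()  # freeze G for this step
    loss_d = bce(discriminator(real_batch), ones) + bce(discriminator(fake), zeros)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) Generator step: push D(G(z)) toward 1, i.e. fool the discriminator.
    fake = generator(torch.randn(n, LATENT_DIM))
    loss_g = bce(discriminator(fake), ones)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```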
  20. 20. “It should happen until the generator exactly reproduces the true data distribution and the discriminator is guessing at random, unable to find a difference.” OpenAI Source: https://openai.com/blog/generative-models/
  21. 21. However, some problems arise… • Gradient descent does not always find the Nash equilibrium; • Mode collapse: the generator starts to produce several copies of the same image • Expected: first maximize over the discriminator, then minimize over the generator • Problem: first minimizing over the generator, then maximizing over the discriminator • Causes low-diversity output. 21 Sources: https://www.quora.com/What-are-the-pros-and-cons-of-using-generative-adversarial-networks-a-type-of-neural-network, and Generative Adversarial Networks (GANs) #AIWTB 2016 video.
  22. 22. Some Advances 22 Since GANs’ creation, in 2014 (not necessarily in chronological order)
  23. 23. LAPGAN • Produces high-quality samples of natural images (at 32x32 and 64x64); • Cascade of convolutional neural networks within a Laplacian pyramid framework. 23 Laplacian Pyramid GAN – Emily Denton et al. (2015) Source: https://en.wikipedia.org/wiki/Pyramid_(image_processing)
  24. 24. 24 Source: Denton, Emily L., Soumith Chintala, and Rob Fergus. "Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks." Advances in neural information processing systems. 2015.
  25. 25. DCGAN • CNNs combined with GANs; • Train GANs and later use learned feature extractors for supervised tasks; • Successful use of CNNs after LAPGAN; • Created guidelines for stable Deep Convolutional GANs. 25 Deep Convolutional GAN – Alec Radford et al. (2016) Source: Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015).
  26. 26. DCGAN Used three recent changes in CNNs: • All convolutional network – strided convolutions; • Elimination of fully-connected layers; • Batch normalization. 26 Deep Convolutional GAN – Alec Radford et al. (2016) Source: Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015).
  27. 27. DCGAN 27 Deep Convolutional GAN – Alec Radford et al. (2016) Source: Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015). DCGAN generator for LSUN dataset.
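A minimal sketch of a generator that follows the guidelines above (all-convolutional with strided transposed convolutions, no fully-connected hidden layers, batch normalization); the channel counts and the 64x64 output are assumptions, not the paper's exact LSUN configuration:

```python
import torch
import torch.nn as nn

# Each stage doubles the spatial resolution via a strided transposed convolution.
dcgan_generator = nn.Sequential(
    nn.ConvTranspose2d(100, 512, 4, stride=1, padding=0), nn.BatchNorm2d(512), nn.ReLU(),  # 1x1 -> 4x4
    nn.ConvTranspose2d(512, 256, 4, stride=2, padding=1), nn.BatchNorm2d(256), nn.ReLU(),  # 4x4 -> 8x8
    nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.BatchNorm2d(128), nn.ReLU(),  # 8x8 -> 16x16
    nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.BatchNorm2d(64), nn.ReLU(),    # 16x16 -> 32x32
    nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),                          # 32x32 -> 64x64
)

z = torch.randn(1, 100, 1, 1)   # latent vector reshaped as a 1x1 feature map
image = dcgan_generator(z)      # -> (1, 3, 64, 64) RGB image in [-1, 1]
```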
  28. 28. 28 Source: Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015).
  29. 29. 29 Source: Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015). DCGAN – Vector Arithmetic Deep Convolutional GAN – Alec Radford et al. (2016)
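The vector arithmetic shown here happens in latent space, not pixel space: in the paper's example, averaged 𝑧 vectors for "smiling woman", "neutral woman", and "neutral man" combine into a 𝑧 that decodes to a smiling man. A hedged sketch, where the per-concept vectors would be hand-collected (here they are random placeholders):

```python
# Placeholder averages of three hand-picked latent vectors per concept.
z_smiling_woman = torch.randn(3, 100, 1, 1).mean(dim=0, keepdim=True)
z_neutral_woman = torch.randn(3, 100, 1, 1).mean(dim=0, keepdim=True)
z_neutral_man   = torch.randn(3, 100, 1, 1).mean(dim=0, keepdim=True)

z_smiling_man = z_smiling_woman - z_neutral_woman + z_neutral_man
image = dcgan_generator(z_smiling_man)  # with real vectors, decodes to a smiling man
```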
  30. 30. InfoGAN • Modification of GANs that encourages them to learn interpretable and meaningful representations; • Learns disentangled representations; • For instance, facial expression, eye color, hairstyle, presence or absence of eyeglasses for faces • Decomposes the input into 𝑧 (noise) and 𝑐 (latent code). 30 InfoGAN – Xi Chen et al. (2016) Source: Chen, Xi, et al. "Infogan: Interpretable representation learning by information maximizing generative adversarial nets." Advances in Neural Information Processing Systems. 2016.
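A minimal sketch of the input decomposition this slide describes, with illustrative sizes (62 noise dimensions plus a 10-way categorical code, roughly the paper's MNIST setup; the paper also uses continuous codes, omitted here):

```python
import torch
import torch.nn.functional as F

n = 16
z = torch.randn(n, 62)                                 # incompressible noise
c = F.one_hot(torch.randint(0, 10, (n,)), 10).float()  # categorical latent code
generator_input = torch.cat([z, c], dim=1)             # what the InfoGAN generator consumes
```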
  31. 31. 31 Source: Chen, Xi, et al. "Infogan: Interpretable representation learning by information maximizing generative adversarial nets." Advances in Neural Information Processing Systems. 2016. InfoGAN – Xi Chen et al. (2016)
  32. 32. 32 Source: Chen, Xi, et al. "Infogan: Interpretable representation learning by information maximizing generative adversarial nets." Advances in Neural Information Processing Systems. 2016.
  33. 33. Let’s pause for a while… 33 Paper regarding improvements on training GANs
  34. 34. Improved Techniques for Training GANs • Techniques that encourage convergence of GANs – do you recall Nash equilibrium? • Why is it difficult? • Cost functions are non-convex; • Parameters are continuous; • Parameter space is very high-dimensional. 34 Salimans et al. (2016) Sources: Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems, 2016; and Goodfellow, Ian. "Generative Adversarial Networks," NIPS 2016 tutorial.
  35. 35. Improved Techniques for Training GANs Feature matching • Instead of maximizing the output of the discriminator, the generator is required to generate data that matches statistics of the real data; • The generator is trained to match the expected value of an intermediate layer of the discriminator. 35 Salimans et al. (2016) Sources: Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems, 2016; and Goodfellow, Ian. "Generative Adversarial Networks," NIPS 2016 tutorial.
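A minimal sketch of the feature-matching objective, assuming a hypothetical `d_features` function that returns the activations of some intermediate discriminator layer (which layer to use is a design choice):

```python
def feature_matching_loss(d_features, real_batch, fake_batch):
    # Match the mean activations of an intermediate discriminator layer
    # on real vs. generated data, instead of maximizing D's output.
    f_real = d_features(real_batch).mean(dim=0).detach()  # statistics of real data
    f_fake = d_features(fake_batch).mean(dim=0)
    return ((f_real - f_fake) ** 2).sum()                 # squared L2 distance
```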
  36. 36. Improved Techniques for Training GANs Minibatch discrimination • Allows the discriminator to look at multiple examples in combination; • Avoids mode collapse. 36 Salimans et al. (2016) Sources: Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems, 2016; and Goodfellow, Ian. "Generative Adversarial Networks," NIPS 2016 tutorial.
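A minimal sketch of a minibatch-discrimination layer in the spirit of Salimans et al.: project each sample's features through a learned tensor, turn cross-sample L1 distances into similarity scores, and append those scores to the features so the discriminator can spot low batch diversity (all sizes illustrative):

```python
import torch
import torch.nn as nn

class MinibatchDiscrimination(nn.Module):
    def __init__(self, in_features, out_features, kernel_dim):
        super().__init__()
        # Learned projection tensor T, stored flattened for a single matmul.
        self.T = nn.Parameter(torch.randn(in_features, out_features * kernel_dim) * 0.1)
        self.out_features, self.kernel_dim = out_features, kernel_dim

    def forward(self, f):  # f: (N, in_features)
        M = (f @ self.T).view(-1, self.out_features, self.kernel_dim)
        l1 = (M.unsqueeze(0) - M.unsqueeze(1)).abs().sum(dim=3)  # pairwise L1: (N, N, B)
        o = torch.exp(-l1).sum(dim=1) - 1  # similarity to the rest of the batch (minus self)
        return torch.cat([f, o], dim=1)    # (N, in_features + out_features)
```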
  37. 37. Improved Techniques for Training GANs Historical averaging • Adds a term to the cost functions that takes past parameter values into account; • Inspired by fictitious play (based on frequency of play). 37 Salimans et al. (2016) Sources: Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems, 2016; and Goodfellow, Ian. "Generative Adversarial Networks," NIPS 2016 tutorial.
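A minimal sketch of the historical-averaging penalty, assuming a running mean of past parameter vectors is kept in `hist_avg` (the penalty weight is illustrative):

```python
import torch

def historical_averaging_penalty(model, hist_avg, step, weight=1.0):
    # Penalize distance from the running average of the model's past parameters.
    theta = torch.cat([p.view(-1) for p in model.parameters()])
    hist_avg.mul_(step / (step + 1)).add_(theta.detach() / (step + 1))  # update running mean
    return weight * ((theta - hist_avg) ** 2).sum()
```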
  38. 38. Improved Techniques for Training GANs One-sided label smoothing • Reduces discriminator confidence by replacing the real-data target 1 with 0.9, while the fake target stays at 0; • Prevents the discriminator from giving a very large gradient signal to the generator. 38 Salimans et al. (2016) Sources: Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems, 2016; and Goodfellow, Ian. "Generative Adversarial Networks," NIPS 2016 tutorial.
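In code, the change is one line in the discriminator step of the earlier training sketch (reusing those names): only the real-data target is smoothed, the fake target stays at zero:

```python
ones_smoothed = torch.full((n, 1), 0.9)  # real target becomes 0.9 instead of 1.0
zeros = torch.zeros(n, 1)                # fake target stays at 0 (hence 'one-sided')
loss_d = bce(discriminator(real_batch), ones_smoothed) + bce(discriminator(fake), zeros)
```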
  39. 39. Improved Techniques for Training GANs Virtual Batch Normalization (VBN) • Regular batch normalization can cause strong intra-batch correlation; • VBN avoids the problem by using a reference batch and its statistics to combine with the given instances; • It is computationally expensive. 39 Salimans et al. (2016) Sources: Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems, 2016; and Goodfellow, Ian. "Generative Adversarial Networks," NIPS 2016 tutorial.
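A simplified sketch of the idea, where normalization statistics come from a fixed reference batch rather than the current minibatch (the paper additionally mixes each example into the statistics; that refinement is omitted here):

```python
class VirtualBatchNorm:
    def __init__(self, reference_batch, eps=1e-5):
        # Statistics are computed once, from a fixed reference batch.
        self.mean = reference_batch.mean(dim=0, keepdim=True)
        self.std = reference_batch.std(dim=0, keepdim=True)
        self.eps = eps

    def __call__(self, x):
        # Each example is normalized independently of its own minibatch,
        # which avoids intra-batch correlation.
        return (x - self.mean) / (self.std + self.eps)
```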
  40. 40. Coming Back to Advances 40 Since GANs’ creation, in 2014 (not necessarily in chronological order)
  41. 41. Advances... and some Applications 41 iGAN Source: Zhu, Jun-Yan, et al. "Generative visual manipulation on the natural image manifold." European Conference on Computer Vision. Springer International Publishing, 2016. Source: Gifs generated from original video (https://www.youtube.com/watch?v=9c4z6YsBGQ0).
  42. 42. Advances... and some Applications 42 cGAN – Conditional GAN Source: Reed, Scott, et al. "Generative adversarial text to image synthesis." Proceedings of The 33rd International Conference on Machine Learning. Vol. 3. 2016.
  43. 43. 43 Source: Ledig, Christian, et al. "Photo-realistic single image super-resolution using a generative adversarial network." arXiv preprint arXiv:1609.04802 (2016). Advances... and some Applications Super resolution
  44. 44. 44 Source: Brock, Andrew, et al. "Neural photo editing with introspective adversarial networks." arXiv preprint arXiv:1609.07093 (2016). Advances... and some Applications Neural Photo Editing with Introspective Adversarial Networks
  45. 45. Is that all, then? 45
  46. 46. Nope! • Around 72 papers cite the original GAN paper (all from this year!)*; • The latest is called WGAN (Wasserstein GAN) • It was submitted to arXiv on 26 Jan! • “We empirically show that WGANs cure the main training problems of GANs” 46 Sources: Wasserstein GAN - https://arxiv.org/abs/1701.07875. * Simple search in Google Scholar.
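For reference, the core change in the cited WGAN paper: the discriminator becomes a "critic" with an unbounded score (no final sigmoid), its weights are clipped, and it takes several steps per generator step. A minimal sketch of the critic update, assuming a hypothetical `critic` network and reusing earlier names:

```python
for p in critic.parameters():
    p.data.clamp_(-0.01, 0.01)  # weight clipping, as in the paper's algorithm

fake = generator(torch.randn(n, LATENT_DIM)).detach()
loss_c = critic(fake).mean() - critic(real_batch).mean()  # minimizing this widens the real/fake score gap
```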
  47. 47. Conclusion 47
  48. 48. Bottom line (in a nutshell) • GANs are composed of two networks that compete • One network generates samples (the generator) • The other network differentiates between real and generated data (the discriminator) • It is an unsupervised learning technique that is trained in a supervised manner; • Interesting and challenging open research questions. 48
  49. 49. Bottom line (in a nutshell) • Finding Nash equilibria in high-dimensional, continuous, non-convex games is an important open research problem; • There is no rule of thumb for evaluating generative models. 49
  50. 50. “(…) in the process of training generative models, we will endow the computer with an understanding of the world and what it is made up of.” OpenAI Source: https://openai.com/blog/generative-models/
  51. 51. Thank you! Thomas Paula Software Engineer Machine Learning Engineer Twitter @tsp_thomas
