
Multi-modal embeddings: from discriminative to generative models and creative ai

Guest lecture #2 at DD2427 Image Based Recognition and Classification, KTH, Stockholm, 2 May 2016


  1. 1. @graphific Roelof Pieters. Multi-Modal Embeddings: from Discriminative to Generative Models and Creative AI. 2 May 2016, KTH. www.csc.kth.se/~roelof/ roelof@kth.se
  2. 2. Modalities
  3. 3. Common Generative Architectures:
 • AE/VAE
 • DBN
 • Latent Vectors / Manifold Walking
 • RNN / LSTM / GRU
 • Modded RNNs (e.g. Biaxial RNN)
 • CNN + LSTM/GRU
 • X + Mixture Density Network (MDN)
 • DRAW
 • GRAN
 • DCGAN
 • DeepDream & other CNN visualisations
 • Splitting/Remixing NNs: Image Analogy, Style Transfer, Semantic Style Transfer, Texture Synthesis
 • Compositional Pattern-Producing Networks (CPPN)
 • NEAT
 • CPPN w/ GAN+VAE
  4. 4. What we'll cover: the same list of common generative architectures as on the previous slide.
  5. 5. Text Generation
  6. 6. Alex Graves (2014) Generating Sequences With Recurrent Neural Networks. Wanna Play? Text Prediction (1)
  9. 9. Alex Graves (2014) Generating Sequences With Recurrent Neural Networks. Wanna Play? Text Prediction (2)
  11. 11. Alex Graves (2014) Generating Sequences With Recurrent Neural Networks
  12. 12. Alex Graves (2014) Generating Sequences With Recurrent Neural Networks. Wanna Play? Handwriting Prediction
  13. 13. Alex Graves (2014) Generating Sequences With Recurrent Neural Networks. Wanna Play? Handwriting Prediction (we're skipping the mixture density network details for now)
  14. 14. Wanna Play? Text Generation. Karpathy (2015), The Unreasonable Effectiveness of Recurrent Neural Networks (blog)
  16. 16. Karpathy (2015), The Unreasonable Effectiveness of Recurrent Neural Networks (blog)
  17. 17. Andrej Karpathy, Justin Johnson, Li Fei-Fei (2015) Visualizing and Understanding Recurrent Networks
  18. 18. Karpathy (2015), The Unreasonable Effectiveness of Recurrent Neural Networks (blog)
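The char-rnn idea in these slides can be sketched in a few lines: a network predicts the next character, and generation just feeds each sampled character back in. A minimal numpy sketch with a vanilla RNN and random (untrained) weights, so the output is gibberish; the tiny vocabulary and dimensions are illustrative, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = list("helo wrd")           # toy character vocabulary
V, H = len(vocab), 16              # vocab size, hidden size

# random (untrained) vanilla-RNN weights
Wxh = rng.normal(0, 0.1, (H, V))
Whh = rng.normal(0, 0.1, (H, H))
Why = rng.normal(0, 0.1, (V, H))

def sample(seed_ix, n):
    """Sample n characters, feeding each prediction back as the next input."""
    h = np.zeros(H)
    x = np.zeros(V); x[seed_ix] = 1
    out = []
    for _ in range(n):
        h = np.tanh(Wxh @ x + Whh @ h)       # recurrent state update
        p = np.exp(Why @ h); p /= p.sum()    # softmax over next character
        ix = rng.choice(V, p=p)              # stochastic sampling
        x = np.zeros(V); x[ix] = 1
        out.append(vocab[ix])
    return "".join(out)

print(sample(0, 20))
```

Training (backprop through time on a text corpus) is what turns this gibberish into the Shakespeare/LaTeX/Linux-source samples Karpathy shows.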
  19. 19. http://www.creativeai.net/posts/aeh3orR8g6k65Cy9M/generating-magic-cards-using-deep-recurrent-convolutional
  20. 20. More at: http://gitxiv.com/category/natural-language-processing-nlp and http://www.creativeai.net/?cat%5B0%5D=read-write
  21. 21. Image Generation
  22. 22. Turn the ConvNet Around: "Deep Dream"
 Classification: Image -> NN -> what do you (think you) see? -> what's the (text) label?
 Deep Dream: Image -> NN -> what do you (think you) see? -> feed back the activations -> iteratively optimize the image to "fit" the ConvNet's "hallucination"
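That feedback loop is just gradient *ascent* on the input image. A toy numpy sketch of the mechanic (not Google's implementation): here the "layer activation" is the squared response of one fixed random filter, and we repeatedly nudge the image to increase it:

```python
import numpy as np

rng = np.random.default_rng(1)
img = rng.normal(0, 0.1, 64)        # flattened toy "image"
w = rng.normal(0, 1.0, 64)          # stand-in for one unit of a conv layer

def activation(x):
    return (w @ x) ** 2             # the quantity Deep Dream maximises

a0 = activation(img)
for _ in range(10):
    grad = 2 * (w @ img) * w        # analytic gradient d(activation)/d(img)
    img += 0.001 * grad             # gradient ASCENT on the image itself
```

In real Deep Dream the filter is a deep CNN layer and the gradient comes from backprop, but the loop is the same: the weights stay fixed and the *image* is the parameter being optimised.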
  23. 23. Google, Inceptionism: Going Deeper into Neural Networks Turn Convnet Around: “Deep Dream” see also: www.csc.kth.se/~roelof/deepdream/
  25. 25. code youtube Roelof Pieters 2015
  26. 26. https://www.flickr.com/photos/graphific/albums/72157657250972188 Single Units Roelof Pieters 2015
  27. 27. Multifaceted Feature Visualization Anh Nguyen, Jason Yosinski, Jeff Clune (2016) Multifaceted Feature Visualization: Uncovering the Different Types of Features Learned By Each Neuron in Deep Neural Networks
  28. 28. Multifaceted Feature Visualization
  30. 30. Preferred stimuli generation Anh Nguyen, Alexey Dosovitskiy, Jason Yosinski, Thomas Brox, and Jeff Clune (2016) AI Neuroscience: Understanding Deep Neural Networks by Synthetically Generating the Preferred Stimuli for Each of Their Neurons
  31. 31. Inter-modal: Style Transfer ("Style Net" 2015) Leon A. Gatys, Alexander S. Ecker, Matthias Bethge, 2015. A Neural Algorithm of Artistic Style (GitXiv)
  32. 32. Inter-modal: Image Analogies (2001) A. Hertzmann, C. Jacobs, N. Oliver, B. Curless, D. Salesin. (2001) Image Analogies, SIGGRAPH 2001 Conference Proceedings. A. Hertzmann (2001) Algorithms for Rendering in Artistic Styles Ph.D thesis. New York University. May, 2001.
  33. 33. Inter-modal: Image Analogies (2001)
  35. 35. style layers: 3_1,4_1,5_1
  36. 36. style layers: 3_2
  37. 37. style layers: 5_1
  38. 38. style layers: 5_2
  39. 39. style layers: 5_3
  40. 40. style layers: 5_4
  41. 41. style layers: 5_1 + 5_2 + 5_3 + 5_4
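The style layers compared above (conv3_1, 5_1, 5_2, ...) all feed the same loss in Gatys et al.: the Gram matrix of a layer's feature maps captures which features co-occur, i.e. "style", and the generated image is optimised to match it. A minimal numpy sketch with random stand-in feature maps (shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

def gram(F):
    """Gram matrix of feature maps F with shape (channels, positions)."""
    return F @ F.T / F.shape[1]

# stand-ins for one layer's activations on the style image and the
# generated image (8 channels, 100 spatial positions)
F_style = rng.normal(size=(8, 100))
F_gen   = rng.normal(size=(8, 100))

style_loss = np.sum((gram(F_gen) - gram(F_style)) ** 2)
```

In the full method this loss is summed over the chosen style layers (weighted), added to a content loss on deeper layers, and minimised by gradient descent on the image; picking different style layers is exactly what the slides above vary.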
  42. 42. Gene Kogan, 2015. Why is a Raven Like a Writing Desk? (vimeo)
  43. 43. Inter-modal: Style Transfer+MRF (2016) Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis, 2016, Chuan Li, Michael Wand
  45. 45. Inter-modal: Pretrained Style Transfer (2016) Texture Networks: Feed-forward Synthesis of Textures and Stylized Images, 2016, Dmitry Ulyanov, Vadim Lebedev, Andrea Vedaldi, Victor Lempitsky 500x speedup! (avg min loss from 10s to 20ms)
  46. 46. Inter-modal: Pretrained Style Transfer #2 (2016) Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks, Chuan Li, Michael Wand, 2016 similarly 500x speedup
  47. 47. Inter-modal: Perceptual Loss ST (2016) Perceptual Losses for Real-Time Style Transfer and Super-Resolution, 2016, Justin Johnson, Alexandre Alahi, Li Fei-Fei
  48. 48. Perceptual Losses for Real-Time Style Transfer and Super-Resolution, 2016, Justin Johnson, Alexandre Alahi, Li Fei-Fei
  49. 49. Inter-modal: Style Transfer (“Style Net” 2015) @DeepForger
  51. 51. Inter-modal: Semantic Style Transfer (content image + style image + semantic map = result)
  52. 52. https://github.com/alexjc/neural-doodle Semantic Style Transfer (“Neural Doodle”)
  53. 53. Synthesise textures
  54. 54. Synthesise textures (random weights): which activation function should I use? Pooling? Minimum number of units? Experiment: https://nucl.ai/blog/extreme-style-machines/ Random Weights (kinda like “Extreme Learning Machines”)
  55. 55. Synthesise textures (random weights): fully randomly initialised weights. https://nucl.ai/blog/extreme-style-machines/
  56. 56. Synthesise textures (random weights) https://nucl.ai/blog/extreme-style-machines/ Activation Functions
  59. 59. Synthesise textures (random weights) https://nucl.ai/blog/extreme-style-machines/ Down-sampling
  62. 62. Synthesise textures (random weights) https://nucl.ai/blog/extreme-style-machines/ Number of units
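The point of the "extreme style machines" experiments is that texture statistics don't require trained filters: fixed random projections plus a nonlinearity already give a usable Gram-style descriptor. A toy numpy sketch of that idea (all sizes and the two "textures" are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.normal(size=(16, 32))      # fixed random filters, never trained

def texture_descriptor(img):
    """Gram matrix of ReLU'd random-filter responses (no learning at all)."""
    F = np.maximum(W @ img, 0)     # (16, positions) random feature maps
    return F @ F.T / F.shape[1]

texA = rng.normal(size=(32, 200))          # toy "texture" A
texB = rng.normal(size=(32, 200)) + 1.0    # toy "texture" B, shifted statistics
dA, dB = texture_descriptor(texA), texture_descriptor(texB)
```

The nucl.ai experiments then vary exactly the knobs in the slides above, activation function, down-sampling, and number of units, and compare synthesis quality against trained VGG features.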
  65. 65. “Style Transfer” papers:
 • Image Analogies, 2001, A. Hertzmann, C. Jacobs, N. Oliver, B. Curless, D. Salesin
 • A Neural Algorithm of Artistic Style, 2015, Leon A. Gatys, Alexander S. Ecker, Matthias Bethge
 • Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis, 2016, Chuan Li, Michael Wand
 • Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks, 2016, Alex J. Champandard
 • Texture Networks: Feed-forward Synthesis of Textures and Stylized Images, 2016, Dmitry Ulyanov, Vadim Lebedev, Andrea Vedaldi, Victor Lempitsky
 • Perceptual Losses for Real-Time Style Transfer and Super-Resolution, 2016, Justin Johnson, Alexandre Alahi, Li Fei-Fei
 • Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks, 2016, Chuan Li, Michael Wand
 • @DeepForger
  66. 66. “A stop sign is flying in blue skies.” “A herd of elephants flying in the blue skies.” Elman Mansimov, Emilio Parisotto, Jimmy Lei Ba, Ruslan Salakhutdinov, 2015. Generating Images from Captions with Attention (arxiv) (examples) Caption -> Image generation
  67. 67. Image Colorisation
  68. 68. More at: http://gitxiv.com/category/computer-vision and http://www.creativeai.net/ (no category for images just yet)
  69. 69. Audio Generation
  70. 70. Early LSTM music composition (2002) Douglas Eck and Jurgen Schmidhuber (2002) Learning The Long-Term Structure of the Blues?
  71. 71. Markov constraints http://www.flow-machines.com/
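The Markov-constraints idea behind Flow Machines can be sketched simply (this is a toy illustration, not their actual system): sample from a Markov chain over notes while enforcing a hard constraint, e.g. "the melody must end on a given note". A backward feasibility pass reweights each transition so every sample satisfies the constraint, with no rejection:

```python
import numpy as np

rng = np.random.default_rng(4)
T = np.array([[.5, .5, 0.],        # toy note-transition matrix (3 "notes")
              [.3, .4, .3],
              [0., .5, .5]])
L, END = 6, 2                      # sequence length; constraint: last note == END

# backward pass: beta[t][s] = probability mass of continuations from note s
# at time t that satisfy the constraint
beta = np.zeros((L, 3))
beta[-1] = np.eye(3)[END]
for t in range(L - 2, -1, -1):
    beta[t] = T @ beta[t + 1]

def sample():
    s, seq = 1, [1]
    for t in range(1, L):
        p = T[s] * beta[t]         # reweight transitions by feasibility
        p /= p.sum()
        s = rng.choice(3, p=p)
        seq.append(s)
    return seq
```

Because infeasible transitions get probability zero, every sampled sequence ends on `END` by construction; this is the core trick that lets Flow Machines impose musical constraints on Markov-style generation.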
  73. 73. Audio Generation: MIDI https://soundcloud.com/graphific/pyotr-lstm-tchaikovsky A Recurrent Latent Variable Model for Sequential Data, 2016, J. Chung, K. Kastner, L. Dinh, K. Goel, A. Courville, Y. Bengio + a “modded” VRNN
  74. 74. Audio Generation: MIDI https://soundcloud.com/graphific/neural-remix-net A Recurrent Latent Variable Model for Sequential Data, 2016, J. Chung, K. Kastner, L. Dinh, K. Goel, A. Courville, Y. Bengio + a “modded” VRNN
  75. 75. Audio Generation: Raw Audio. Gated Recurrent Unit (GRU); Stanford CS224d project: Aran Nayebi, Matt Vitelli (2015) GRUV: Algorithmic Music Generation using Recurrent Neural Networks
  76. 76. LSTM improvements • Recurrent Batch Normalization http://gitxiv.com/posts/MwSDm6A4wPG7TcuPZ/recurrent-batch-normalization • normalises the hidden-to-hidden transition too (earlier work only normalised the input-to-hidden transformation of RNNs) • faster convergence and improved generalisation
  77. 77. LSTM improvements • Weight Normalisation http://gitxiv.com/posts/p9B6i9Kzbkc5rP3cp/weight-normalization-a-simple-reparameterization-to
  78. 78. LSTM improvements • Associative Long Short-Term Memory http://gitxiv.com/posts/jpfdiFPsu5c6LLsF4/associative-long-short-term-memory
  79. 79. LSTM improvements • Bayesian RNN Dropout http://gitxiv.com/posts/CsCDjy7WpfcBvZ88R/bayesianrnn
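Of the improvements listed, weight normalisation is the easiest to show concretely: each weight vector is reparameterised as w = g·v/‖v‖, decoupling its length (a scalar g) from its direction (v), which smooths optimisation. A one-function numpy sketch:

```python
import numpy as np

def weight_norm(v, g):
    """Weight-normalisation reparameterisation: w = g * v / ||v||."""
    return g * v / np.linalg.norm(v)

v = np.array([3.0, 4.0])       # direction parameter (||v|| = 5)
w = weight_norm(v, g=2.0)      # effective weight vector; ||w|| == g == 2.0
```

During training, gradients are taken with respect to g and v instead of w directly; the norm of the effective weights is then controlled by a single scalar per unit.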
  80. 80. Chor-RNN Continuous Generation Luka Crnkovic-Friis & Louise Crnkovic-Friis (2016) Generative Choreography using Deep Learning
  81. 81. • Mixture Density LSTM (generative), animation frames 1-6
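A mixture density head, the piece shared by Graves's handwriting model and Chor-RNN, makes the LSTM output the parameters of a Gaussian mixture instead of a point prediction; generation then samples a component, then a value. A minimal numpy sketch with hand-picked stand-in outputs (a real model would produce these per timestep):

```python
import numpy as np

rng = np.random.default_rng(5)

def mdn_sample(logits, mu, log_sigma):
    """Sample from a 1-D Gaussian mixture parameterised by a network head."""
    pi = np.exp(logits - logits.max())
    pi /= pi.sum()                           # softmax -> mixture weights
    k = rng.choice(len(pi), p=pi)            # pick a mixture component
    return rng.normal(mu[k], np.exp(log_sigma[k]))   # sample from it

# stand-in outputs of an LSTM's MDN head (3 components)
logits    = np.array([0.1, 2.0, -1.0])
mu        = np.array([-1.0, 0.0, 3.0])
log_sigma = np.array([-2.0, -2.0, -2.0])

samples = [mdn_sample(logits, mu, log_sigma) for _ in range(1000)]
```

Predicting a distribution rather than a single value is what lets these models generate varied but plausible continuations (pen strokes, dance poses) instead of averaging all possibilities into a blur.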
  86. 86. Wanna be Doing Deep Learning?
  87. 87. Deep Learning with Python: Python has a wide range of deep learning libraries. Low level: deeplearning.net/software/theano, caffe.berkeleyvision.org, tensorflow.org. High level: lasagne.readthedocs.org/en/latest and of course keras.io
  88. 88. Code & Papers? http://gitxiv.com/ #GitXiv
  89. 89. Creative AI projects? http://www.creativeai.net/ #CreativeAI
  90. 90. Questions? Love letters? Existential dilemmas? Academic questions? Gifts? Find me at: www.csc.kth.se/~roelof/ roelof@kth.se @graphific. Oh, and soon we're looking for Creative AI enthusiasts! Jobs, internships, and thesis work in AI (Deep Learning) & Creativity
  91. 91. https://medium.com/@ArtificialExperience/creativeai-9d4b2346faf3
  92. 92. Creative AI > a “brush” > rapid experimentation human-machine collaboration
  93. 93. Creative AI > a “brush” > rapid experimentation (YouTube, Paper)
  95. 95. Creative AI > a “brush” > rapid experimentation (Vimeo, Paper)
  96. 96. Generative Adversarial Nets. Emily Denton, Soumith Chintala, Arthur Szlam, Rob Fergus, 2015. Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks (GitXiv)
  97. 97. Generative Adversarial Nets. Alec Radford, Luke Metz, Soumith Chintala, 2015. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (GitXiv)
  99. 99. Generative Adversarial Nets. Alec Radford, Luke Metz, Soumith Chintala, 2015. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (GitXiv). “Turn” vector created from four averaged samples of faces looking left vs. looking right.
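The "turn" vector is plain arithmetic in the latent space: average the codes of left-looking samples, subtract the average of right-looking ones, and add the difference to any new code before decoding. A numpy sketch of just the vector algebra (the latent codes here are random stand-ins; in DCGAN they would be the z's behind actual rendered faces, and `z_turned` would be fed to the generator):

```python
import numpy as np

rng = np.random.default_rng(6)
D = 100                                   # latent dimensionality, as in DCGAN

# stand-ins for the latent codes of four left-looking and four
# right-looking samples
z_left  = rng.normal(size=(4, D))
z_right = rng.normal(size=(4, D))

turn = z_left.mean(axis=0) - z_right.mean(axis=0)   # the "turn" direction

z_new = rng.normal(size=D)                # a fresh sample's latent code
z_turned = z_new + turn                   # decode this to get a turned face
```

Averaging several samples per side is what makes the direction robust: idiosyncrasies of individual faces cancel, leaving mostly the pose attribute.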
  100. 100. Walking through the manifold. Generative Adversarial Nets
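"Walking through the manifold" means interpolating between two latent codes and decoding every intermediate point; smooth image transitions are evidence the model learned a meaningful latent space. A sketch of the simplest version, linear interpolation (each row of `path` would be decoded by the generator; spherical interpolation is a common refinement for Gaussian latents):

```python
import numpy as np

rng = np.random.default_rng(7)

def walk(z0, z1, steps=8):
    """Linearly interpolate between two latent codes, endpoints included."""
    ts = np.linspace(0.0, 1.0, steps)
    return np.stack([(1 - t) * z0 + t * z1 for t in ts])

z0, z1 = rng.normal(size=100), rng.normal(size=100)
path = walk(z0, z1)        # shape (8, 100): decode each row to render the walk
```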
  101. 101. Top: unmodified samples. Bottom: same samples with “window” filters dropped out. Generative Adversarial Nets
