
Deep Neural Networks 
that talk (Back)… with style

Talk at Nuclai 2016 in Vienna

Can neural networks sing, dance, remix and rhyme? And most importantly, can they talk back? This talk introduces deep neural nets with textual and auditory understanding, along with some of the recent breakthroughs made in these fields. It then shows some of the exciting possibilities these technologies hold for "creative" use and for explorations of human-machine interaction, where the guiding theme is "augmentation, not automation".

http://events.nucl.ai/track/cognitive/#deep-neural-networks-that-talk-back-with-style



  1. DEEP NEURAL NETWORKS THAT TALK (BACK)… WITH STYLE ▸ @graphific Roelof Pieters, Vienna 2016
  2. DEEP NEURAL NETWORKS
  3. COMMON GENERATIVE ARCHITECTURES
     ▸ AE/VAE
     ▸ DBN
     ▸ Latent vectors / manifold walking
     ▸ RNN / LSTM / GRU
     ▸ Modded RNNs (e.g. Biaxial RNN)
     ▸ CNN + LSTM/GRU
     ▸ X + mixture density network (MDN)
     ▸ Compositional Pattern-Producing Networks (CPPN)
     ▸ NEAT
     ▸ CPPN w/ GAN+VAE
     ▸ DRAW
     ▸ GRAN
     ▸ DCGAN
     ▸ DeepDream & other CNN visualisations
     ▸ Splitting/remixing NNs: image analogy, style transfer, semantic style transfer, texture synthesis
  4. DEEP NEURAL NETWORKS ▸ Here we mean the recurrent variants… (karpathy.github.io/2015/05/21/rnn-effectiveness/)
  5. RECURRENT NEURAL NETWORKS ▸ One to one ▸ just like a normal neural network ▸ e.g. image categorization
  6. RECURRENT NEURAL NETWORKS ▸ One to many ▸ example: image captioning (Vinyals et al., 2015)
  7. RECURRENT NEURAL NETWORKS ▸ Many to one ▸ example: sentiment analysis
  8. RECURRENT NEURAL NETWORKS ▸ Many to many ▸ output at the end (machine translation) or at each timestep (char-rnn!) (see the sketch below)
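To make these topologies concrete, here is a minimal Keras sketch of the two many-* variants. Layer sizes, input shapes, and class counts are illustrative assumptions, not anything shown in the talk:

# Hypothetical setup: sequences of 20 timesteps, 50-dim inputs, 10 classes.
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

# Many to one (e.g. sentiment analysis): read the whole sequence,
# predict once from the final hidden state.
many_to_one = Sequential([
    LSTM(128, input_shape=(20, 50)),        # return_sequences=False
    Dense(10, activation='softmax'),
])

# Many to many with an output at every timestep (char-rnn style):
# return_sequences=True keeps one hidden state per input step.
many_to_many = Sequential([
    LSTM(128, input_shape=(20, 50), return_sequences=True),
    TimeDistributed(Dense(10, activation='softmax')),
])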
  9. NEURAL NETWORKS CAN…
  10. SAMPLE / CONDUCT / PREDICT
  11. NEURAL NETWORKS CAN SAMPLE (x_t → x_t+1): RNN/LSTM SAMPLING
      ▸ (Naive) sampling
      ▸ Scheduled sampling (ML), Bengio et al. 2015
      ▸ Sequence-level training (RL), Ranzato et al. 2016
      ▸ Reward augmented maximum likelihood (ML+RL), Norouzi et al., forthcoming
  12. LSTM SAMPLING (GRAVES 2013) ▸ Approach used for most recent "creative" generations (char-rnn, torch-rnn, etc.; see the sampling sketch below)
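"Sampling" here usually means drawing the next token from the softmax output, often with a temperature knob. A minimal sketch in the style of the standard char-rnn examples (the function name and the 1e-8 guard are my own additions):

import numpy as np

def sample_next(preds, temperature=1.0):
    """Draw the index of the next character from softmax output `preds`.

    temperature < 1.0 -> conservative, repetitive text;
    temperature > 1.0 -> more surprising (and more broken) text.
    """
    preds = np.asarray(preds, dtype=np.float64)
    preds = np.log(preds + 1e-8) / temperature      # rescale log-probs
    preds = np.exp(preds) / np.sum(np.exp(preds))   # renormalize
    return int(np.argmax(np.random.multinomial(1, preds)))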
  13. SCHEDULED SAMPLING (BENGIO ET AL. 2015) ▸ Start training on ground-truth inputs and slowly shift towards feeding the model its own predictions as next-step inputs (see the sketch below)
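A sketch of the idea, using the inverse-sigmoid decay schedule from the paper; the constant k and the function names are illustrative choices:

import numpy as np

def p_ground_truth(step, k=1000.0):
    """Inverse-sigmoid decay (Bengio et al. 2015): probability of
    feeding the true previous token at a given training step.
    Starts near 1.0 and decays towards 0 as training progresses."""
    return k / (k + np.exp(step / k))

def choose_next_input(ground_truth_token, model_token, step):
    """Per-step coin flip between teacher forcing and self-feeding."""
    if np.random.random_sample() < p_ground_truth(step):
        return ground_truth_token   # early training: mostly ground truth
    return model_token              # late training: mostly own predictions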
  14. SEQUENCE LEVEL TRAINING (RANZATO ET AL. 2016) ▸ Feed the model its own predictions as next-step inputs, but train against a sequence-level reward/loss via reinforcement learning
  15. REWARD AUGMENTED MAXIMUM LIKELIHOOD (NOROUZI ET AL., FORTHCOMING) ▸ Generate training targets sampled around the correct solution ▸ "giving it mostly wrong examples to learn the right ones" (see the sketch below)
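A crude sketch of the idea: sample a training target near the ground truth, with probability decaying exponentially in the number of token errors. This is a stand-in for the exponentiated-payoff distribution in the paper (the real method also weighs how many sequences exist at each distance); names and the temperature value are mine:

import math
import random

def sample_raml_target(target, vocab, tau=0.8):
    """Sample a noisy training target around the ground-truth sequence.

    Targets with fewer substituted tokens are exponentially more likely
    (reward = negative Hamming distance, temperature tau).
    """
    n_options = list(range(len(target) + 1))
    weights = [math.exp(-n / tau) for n in n_options]
    n_edits = random.choices(n_options, weights=weights)[0]
    noisy = list(target)
    for i in random.sample(range(len(noisy)), n_edits):
        noisy[i] = random.choice(vocab)   # random substitution
    return noisy

# e.g. sample_raml_target(list("hello"), vocab=list("abcdefgh"))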
  16. NEURAL NETWORKS CAN…
  17. TRANSLATE
  18. TRANSLATE (X == Y) ▸ "Are you going to Nuclai?" ▸ "Of course I will be there!" ▸ "Damn right you are homey!"
  19. NEURAL NETWORKS CAN…
  20. REMIX / EVOLVE
  21. NEURAL NETWORKS CAN (RE)MIX / TRANSFER (X > Y) / (Z = X+Y)
  22. NEURAL NETWORKS CAN…
  23. HALLUCINATE
  24. NEURAL NETWORKS CAN HALLUCINATE {*X*}
  25. NEURAL NETWORKS CAN HALLUCINATE {*X*}
      ▸ https://www.youtube.com/watch?v=oyxSerkkP4o
      ▸ https://vimeo.com/134457823
      ▸ https://www.youtube.com/watch?v=tbTJH8aPl60
      ▸ https://www.youtube.com/watch?v=NYVg-V8-7q0
  26. RECURRENT NEURAL NETWORKS ▸ Possible to learn and generate: audio, images, text… basically anything ▸ Revolution?
  27. "IF I CAN'T DANCE IT'S NOT MY REVOLUTION." (Emma Goldman)
  28. chor-rnn: Generative Choreography using Deep Learning: http://www.creativeai.net/posts/qPpnatfMqwMKXPEw7/chor-rnn-generative-choreography-using-deep-learning
  29. (chor-rnn, continued)
  30. RECURRENT NEURAL NETWORKS ▸ Robin Sloan, "Writing with the Machine"
  31. EARLY LSTM MUSIC COMPOSITION (2002) ▸ Douglas Eck and Jürgen Schmidhuber (2002), Learning the Long-Term Structure of the Blues
  32. AUDIO GENERATION: MIDI ▸ "modded" VRNN: https://soundcloud.com/graphific/pyotr-lstm-tchaikovsky ▸ A Recurrent Latent Variable Model for Sequential Data (2016), J. Chung, K. Kastner, L. Dinh, K. Goel, A. Courville, Y. Bengio
  33. AUDIO GENERATION: MIDI ▸ "modded" VRNN: https://soundcloud.com/graphific/neural-remix-net ▸ A Recurrent Latent Variable Model for Sequential Data (2016), J. Chung, K. Kastner, L. Dinh, K. Goel, A. Courville, Y. Bengio
  34. AUDIO GENERATION: RAW (MP3) ▸ Gated Recurrent Unit (GRU), Stanford CS224d project ▸ Aran Nayebi, Matt Vitelli (2015), GRUV: Algorithmic Music Generation using Recurrent Neural Networks ▸ https://soundcloud.com/graphific/neural-remix-net
  35. BOTS? PLAY SAFE…
  36. PLAYING WITH NEURAL NETS #1
  37. PLAYING WITH NEURAL NETS #2
  38. PLAYING WITH NEURAL NETS #3
  39. PLAYING WITH NEURAL NETS #4
  40. Wanna be Doing Deep Learning?
  41. Deep Learning with Python: Python has a wide range of deep-learning libraries available
      ▸ Low level: deeplearning.net/software/theano, caffe.berkeleyvision.org, tensorflow.org
      ▸ High level: lasagne.readthedocs.org/en/latest, and of course: keras.io
      (a minimal Keras example follows below)
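Since keras.io just came up: a minimal char-rnn-style sketch in Keras, predicting the next character from a fixed window. Window length, vocabulary size, and hyperparameters are placeholder assumptions:

from keras.models import Sequential
from keras.layers import LSTM, Dense

maxlen, n_chars = 40, 60    # hypothetical window length / vocabulary size

model = Sequential([
    LSTM(128, input_shape=(maxlen, n_chars)),
    Dense(n_chars, activation='softmax'),   # distribution over next char
])
model.compile(loss='categorical_crossentropy', optimizer='adam')

# X: (n_samples, maxlen, n_chars) one-hot text windows,
# y: (n_samples, n_chars) one-hot next characters, e.g.:
# model.fit(X, y, batch_size=128, nb_epoch=10)   # nb_epoch in Keras 1.x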
  42. Code & Papers? http://gitxiv.com/ #GitXiv
  43. Creative AI projects? http://www.creativeai.net/ #CreativeAI
  44. Did you say Podcasts? http://ethicalmachines.com/
  45. https://medium.com/@ArtificialExperience/creativeai-9d4b2346faf3
  46. Questions? Love letters? Existential dilemmas? Academic questions? Gifts? Find me at: www.csc.kth.se/~roelof/ roelof@kth.se @graphific ▸ Consulting / projects / contracts / $$$ / more love letters? http://www.graph-technologies.com/ roelof@graph-technologies.com
  47. WHAT ABOUT CONVNETS? ▸ Awesome for interpreting features ▸ Recurrence can kind of be achieved with: long splicing filters, pooling layers, smart architectures
  48. NLP ▸ Yoon Kim (2014), Convolutional Neural Networks for Sentence Classification ▸ Xiang Zhang, Junbo Zhao, Yann LeCun (2015), Character-level Convolutional Networks for Text Classification (see the sketch below)
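A rough Keras sketch in the spirit of Kim (2014), using a single filter width for brevity (the paper concatenates several widths); layer names follow current Keras, and all sizes are placeholder assumptions:

from keras.models import Sequential
from keras.layers import Embedding, Conv1D, GlobalMaxPooling1D, Dropout, Dense

vocab_size, embed_dim, seq_len = 20000, 128, 100   # placeholder sizes

model = Sequential([
    Embedding(vocab_size, embed_dim, input_length=seq_len),
    Conv1D(100, 5, activation='relu'),   # 5-gram-like filters over word embeddings
    GlobalMaxPooling1D(),                # max-over-time pooling, as in the paper
    Dropout(0.5),
    Dense(2, activation='softmax'),      # e.g. positive vs. negative sentiment
])
model.compile(loss='categorical_crossentropy', optimizer='adam')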
  49. AUDIO ▸ Keunwoo Choi, Jeonghee Kim, George Fazekas, and Mark Sandler (2016), Auralisation of Deep Convolutional Neural Networks: Listening to Learned Features
  50. AUDIO ▸ Keunwoo Choi, George Fazekas, Mark Sandler (2016), Explaining Deep Convolutional Neural Networks on Music Classification ▸ audio at: https://keunwoochoi.wordpress.com/2016/03/23/what-cnns-see-when-cnns-see-spectrograms/
  51. AUDIO

