
Demystifying Differentiable Neural Computers and Their Brain Inspired Origin with Luis Leal

Machine learning and deep learning have proved to excel at pattern recognition and provide state-of-the-art models for computer vision, NLP, predictive applications, and many other artificial intelligence tasks. Neural networks are a mathematical model of how scientists think the brain performs certain computations, but for decades they lacked the memory and context-recall characteristics of the brain. Although recurrent neural networks (such as LSTMs) handle memory and context, they cannot scale their memory size, which prevents them from storing rich, meaningful information over time and across contexts.

To solve this, researchers at Google's DeepMind recently proposed a model that combines a neural network with an external memory store, designed to simulate the way neuroscientists think the brain manages memories (storage and retrieval). The system is fully differentiable, which means that, using calculus-based optimization, it can learn its own inner workings and memory I/O from scratch using training data.

The authors demonstrated use cases where they trained the model on graph data such as family trees and the London Underground transport network, and the system learned to use its memory to answer questions about navigation routes and family relations hidden in the data. The model shows great potential for new AI applications that learn for themselves how to work with complex data structures (such as graphs) without explicit programming. However, the scarce information available is neuroscience-oriented and -focused, which makes it difficult to grasp for software engineers, developers, and data scientists.

We'll demystify this fascinating architecture using analogies and examples familiar to software engineers, developers, and data scientists, and provide intuitions that make the model easier to understand and adopt, unlocking a whole new class of AI applications.



  1. Luis Leal, Xoom (a PayPal service): Demystifying Differentiable Neural Computers #DLSAIS14
  2. Self-introduction
  3. DNC • Differentiable Neural Computers (DeepMind, 2016)
  4. DNC basic idea • Memory-augmented neural network • Neural network with I/O access to external memory • I/O operations are learned instead of programmed
  5. DNC basic idea • Von Neumann computer architecture: • CPU: in the DNC, the CPU is a neural network • Memory: a separate external memory bank accessed by the CPU via read/write operations
  6. Neuroscience meets AI and CS • Basic architecture and memory allocation (release and assign) based on computer science • Memory access (read) and retrieval based on neuroscience (the hippocampus)
  7. High-level architecture • A neural network called the controller (the CPU) performs computation on input data • Read/write heads perform I/O from and to memory • The controller interacts with the read/write heads to use "memories" for computation
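The controller/heads/memory loop above can be sketched in a few lines of Python. This is a minimal, illustrative sketch: the random weight matrices, the sizes, the single content-based read head, and all names are my assumptions, not the paper's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
N, W, X = 4, 3, 2                    # memory slots, word size, input size

# Illustrative stand-ins for learned weights.
Wc = rng.normal(size=(W, X + W))     # controller: input + last read -> read key
Wo = rng.normal(size=(1, X + W))     # output layer: input + current read -> output

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def dnc_step(x, read_prev, memory):
    # The controller sees the input plus what was read on the last step.
    state = np.concatenate([x, read_prev])
    key = Wc @ state                       # controller emits a read key
    w = softmax(memory @ key)              # content-based read weights
    read = w @ memory                      # read head returns "memories"
    y = Wo @ np.concatenate([x, read])     # output uses input + memories
    return y, read

memory = rng.normal(size=(N, W))
read = np.zeros(W)
for x in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
    y, read = dnc_step(x, read, memory)    # read is carried to the next step
```

Note how the read vector is fed back into the next time step, which is exactly the "controller interacts with the heads" loop described on the slide.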
  8. DNC vs. neural network • Neural networks excel at pattern recognition, perception tasks, sensory recognition, and reactive decision making (mapping inputs X to outputs Y), but they can't be used for: • Planning and reasoning tasks • Using "memories" and facts from previous events • Storing useful information for future use • Generalizing knowledge to new tasks (AGI) • Working with complex data structures, especially associative ones (graphs or trees)
  9. DNC vs. neural network • The DNC tries to solve this by mixing the best of both worlds (a memory-based architecture and machine learning): • Perception and pattern-recognition capabilities from machine learning • Planning and reasoning based on previous memories and knowledge • Use of complex associative data structures • Like a computer, it can organize knowledge, data, and facts, as well as the links between them; but like a neural network, it needs no explicit programming, because it can learn to do so from examples (data)
  10. Knowledge retrieval • The DNC decides which "memories" to retrieve based on "attention mechanisms", which can be described from both computational and neuroscience perspectives, especially hippocampal synapses. • Foundations of Human Memory, by Michael Kahana, provides key human-memory concepts with which the DNC has analogies. • Computational: which external memory locations to read and write. Neuroscience: how does the brain retrieve and relate stored "memories"?
  11. Memory (attribute vectors) • The external memory is a real-number matrix (N×W). • Attribute theory: every human memory is represented by a list of attributes which describe the memory itself and its context. • Computational: RAM with N positions and word size W. Neuroscience: human memories are represented as a list of W attributes.
  12. Memory (attribute vectors)
  13. Content-based (similarity) access • The controller (CPU) can emit a key vector and read from (or write to) the memory locations that best match the key. • Neuroscience proposes a model where we remember events when exposed to a similar experience. • Computational: retrieve a weighted sum of memory values, weighted by similarity (e.g. cosine similarity) to a specific key. Neuroscience: we recall (or reinforce) past experiences when exposed to similar ones.
  14. Content-based (similarity) access
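Content-based access as described above can be sketched as cosine similarity followed by a softmax. This is a hedged sketch: the sharpening factor `beta` and the exact normalization are assumptions loosely following the Nature paper's general scheme, not its precise equations.

```python
import numpy as np

def content_read(memory, key, beta=10.0):
    # Cosine similarity between the key and every memory row.
    sims = memory @ key / (np.linalg.norm(memory, axis=1)
                           * np.linalg.norm(key) + 1e-8)
    # A softmax (sharpened by beta) turns similarities into read weights.
    w = np.exp(beta * sims)
    w /= w.sum()
    # The read vector is a weighted sum of memory rows.
    return w @ memory

memory = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.7, 0.7]])
r = content_read(memory, np.array([1.0, 0.1]))  # r lies closest to row 0
```

Because the result is a weighted sum rather than a hard lookup, the operation stays differentiable, which is what lets the DNC learn its own addressing.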
  15. Time-ordered access (temporal links) • The system records the order in which memory locations are written. • Temporal Context Model: it's easier for us to remember and recall events in the order they occurred (try saying the alphabet in random order vs. in order). • Computational: a linked list of the memory positions written, ordered by time. Neuroscience: recall/retrieve memories in the order they occurred.
  16. Time-ordered access (temporal links)
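The "linked list of writes" bookkeeping can be sketched as a link matrix plus a precedence vector. This condenses the paper's link-matrix and precedence-weighting updates into a simplified form; the hard one-hot write weightings are used here only for clarity (real write weightings are soft).

```python
import numpy as np

N = 4
link = np.zeros((N, N))     # link[i, j] ~ "slot i was written right after slot j"
precedence = np.zeros(N)    # how recently each slot was written

def update_links(link, precedence, w_write):
    # Simplified DNC-style temporal-link update (an assumption: this mirrors
    # the paper's equations without all edge cases).
    ww = w_write
    link = (1 - ww[:, None] - ww[None, :]) * link + np.outer(ww, precedence)
    np.fill_diagonal(link, 0.0)                  # a slot never links to itself
    precedence = (1 - ww.sum()) * precedence + ww
    return link, precedence

# Write to slot 0, then to slot 2: the link should point 2 -> 0.
for slot in (0, 2):
    w = np.zeros(N)
    w[slot] = 1.0
    link, precedence = update_links(link, precedence, w)
```

Reading `link` forwards or backwards is what lets the DNC replay a sequence of memories in (or in reverse of) the order they were stored.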
  17. Short-term and long-term memory • Although not mandatory, the controller can be an LSTM (long short-term memory) neural network, which provides short-term memory. • Search of Associative Memory (SAM): the SAM model proposes that our memory is a dual store: a short-term store and a long-term store. • Computational: short-term memory provided by an LSTM controller. Neuroscience: the SAM dual-store model of memory.
  18. Short-term and long-term memory
  19. Dynamic memory allocation • In addition to writing by content, the DNC can assign and release memory as a computer does, based on memory-usage percentages and read orderings. • The DNC can choose to write to new locations, update existing ones (reinforce memories), or not write at all. • Computational: dynamic memory administration. Neuroscience: add new memories or reinforce existing ones.
  20. Dynamic memory allocation
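The usage-based side of allocation can be sketched as follows, in a simplified rendering of the paper's allocation weighting: the least-used slots receive the most write weight. The usage values here are made-up inputs; in the DNC they are themselves maintained from past reads and writes.

```python
import numpy as np

def allocation_weighting(usage):
    # DNC-style allocation: sort slots by usage (freest first), then give
    # each slot weight (1 - usage) discounted by the usage of freer slots.
    order = np.argsort(usage)                 # least-used slots first
    a = np.zeros_like(usage)
    shifted = np.concatenate(([1.0], np.cumprod(usage[order])[:-1]))
    a[order] = (1.0 - usage[order]) * shifted
    return a

usage = np.array([0.9, 0.1, 0.5])             # hypothetical usage per slot
a = allocation_weighting(usage)               # most weight on slot 1
```

The sorting makes the rule behave like a free-list, yet every term is a product of smooth quantities, so gradients can still flow through it during training.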
  21. Complete architecture
  22. Complete architecture
  23. Complete architecture • At each time step (clock cycle) the DNC: • Gets an input (data) and calculates an output that is a weighted sum of its inputs and the "memories" retrieved from memory • Decides how to interact with the memory (where and what to read and write) via an "interface vector" • Sends the "memories" read to the next time step
  24. Complete architecture • Thus, the output of the DNC is a function of its input history and of what it decided to read from memory: Y = f(X, memory)
  25. How the DNC decides I/O • How does the DNC learn and decide how to interact with memory? • This is the differentiable part of the DNC • Every component of the system uses weights similar to those of a neural network • Thus it can be trained via gradient descent and multivariate calculus optimization • Using samples (data), the system learns how to behave optimally
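The differentiability claim can be made concrete with a toy soft read: because every operation is smooth, the loss has a gradient with respect to the read key, so a gradient-descent step on the key improves the read. This is a toy sketch using finite-difference gradients; a real DNC backpropagates analytically through all heads and the controller at once.

```python
import numpy as np

def soft_read(memory, key):
    # Differentiable read: softmax over similarities, then a weighted sum.
    w = np.exp(memory @ key)
    w /= w.sum()
    return w @ memory

def loss(key, memory, target):
    return np.sum((soft_read(memory, key) - target) ** 2)

memory = np.array([[1.0, 0.0],
                   [0.0, 1.0]])
key = np.array([0.2, -0.1])
target = np.array([1.0, 0.0])      # we want the read to return row 0

# Every step above is smooth, so d(loss)/d(key) exists; estimate it with
# central finite differences (training would use analytic backprop).
eps = 1e-6
grad = np.array([(loss(key + eps * e, memory, target)
                  - loss(key - eps * e, memory, target)) / (2 * eps)
                 for e in np.eye(2)])

before = loss(key, memory, target)
key = key - 0.5 * grad             # one gradient-descent step
after = loss(key, memory, target)  # lower than `before`
```

The same argument applies to the write weights, allocation, and temporal links: they are all built from smooth operations, which is why the whole system can learn its memory I/O from data.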
  26. Potential applications • Problems that require reasoning and knowledge usage • Data-structure-based problems (graphs)
  27. Potential applications • Reasoning in natural language processing instead of probabilistic models • Chatbots that analyze and reason? • Successful test on the bAbI dataset
  28. Potential applications • Graph reasoning problems • DeepMind trained the DNC on many random graphs: • It learned to use its memory to navigate through the graphs • Then two specific graphs were fed in: the London Underground graph and a family tree • Surprisingly, it was able to generalize without retraining (AGI?)
  29. Potential applications • Reinforcement learning • It was tested on a grid game where: • The player (agent) is given a set of goals and constraints per goal • It is then requested to satisfy a single goal • It has to plan and reason how to achieve the goal • It stored the goals and constraints in memory
  30. Thanks for your attention • My contact: - LinkedIn: - Email: - GitHub: • References and illustrations thanks to: • "Hybrid computing using a neural network with dynamic external memory", Nature 538, 471–476 (October 2016), doi:10.1038/nature20101 • "Implementation and Optimization of Differentiable Neural Computers", Carol Hsin, Stanford University • "Differentiable memory and the brain", Sam Greydanus