Machine learning and deep learning excel at pattern recognition and provide state-of-the-art models for computer vision, NLP, predictive applications and many other artificial intelligence tasks. Neural networks are a mathematical model of how scientists think the brain performs certain computations, but for decades they lacked the memory and the memory/context recall characteristic of the brain. Although there are recurrent neural networks (like LSTM) that handle memory and context, they can’t scale their memory size, which prevents them from storing rich, meaningful information over time and across contexts.
To solve this, some months ago researchers at Google’s DeepMind proposed a model that combines a neural network with an external memory store, designed to simulate the way neuroscientists think the brain manages memories (storage and retrieval). The system is fully differentiable, which means that, using the magic of calculus-based optimization, it can learn to handle its own inner workings and memory I/O from scratch using training data.
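To give a feel for what "fully differentiable" memory I/O can look like, here is a minimal sketch of a soft, content-based memory read. It is an illustration under our own assumptions, not the researchers' implementation: the function names (`cosine_similarity`, `softmax`, `soft_read`), the `sharpness` parameter and the toy 3-slot memory are all hypothetical. Instead of fetching one slot by index, the read blends every slot, weighted by how similar it is to a lookup key, so the whole operation is built from smooth functions that gradients can flow through.

```python
import math

def cosine_similarity(a, b):
    # Similarity between two vectors, in [-1, 1]; epsilon avoids division by zero.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-8)

def softmax(scores):
    # Turn arbitrary scores into positive weights that sum to 1.
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_read(memory, key, sharpness=10.0):
    # Hypothetical helper: score every memory row against the key,
    # then return the weights and the weighted blend of all rows.
    scores = [sharpness * cosine_similarity(row, key) for row in memory]
    weights = softmax(scores)
    width = len(memory[0])
    read_vector = [
        sum(w * row[i] for w, row in zip(weights, memory))
        for i in range(width)
    ]
    return weights, read_vector

# Toy memory with three slots of width 3 (illustrative values only).
memory = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# The key resembles the first slot most, so that slot gets most of the weight.
weights, read_vector = soft_read(memory, key=[0.9, 0.1, 0.0])
print(weights)
print(read_vector)
```

Think of it as a hash-map lookup where the key does not need to match exactly and the result is a weighted mix of all values. In a trained model the key would be produced by the neural network itself rather than written by hand, which is what lets training data shape how memory is used.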
The authors of the model demonstrated use cases where they trained it on graph data, such as family trees or the London Underground transport system, and the system learned to use its memory to answer questions about navigation routes and family relations hidden in the data. The model showed a lot of potential for creating new AI-based applications that learn for themselves how to work with complex data structures (like graphs) without explicit programming. However, the scarce information available is neuroscience-oriented, which makes it difficult to grasp for software engineers, developers and data scientists.
We’ll demystify this fascinating architecture using analogies and examples familiar to software engineers, developers and data scientists, and provide intuitions that make the model easier to understand and adopt, which can unlock a completely new type of AI application.