2. Agenda
- Introduction to Numenta
- What can we learn from Neuroscience?
- How can we incorporate these ideas into Algorithms?
- How can we incorporate these ideas into Applications?
3. Numenta Snapshot
- Creating a new computing technology, Hierarchical Temporal Memory (HTM), based on the structure and function of the neocortex
- 16 employees
- Founded in 2005 by Jeff Hawkins, Donna Dubinsky and Dileep George
- For-profit company with a very long-term roadmap and “patient capital”
- Focus on core technology
- Currently developing our third generation of algorithms
- Very selective corporate partnerships and application development
4. Numenta Timeline
- 2002: Redwood Neuroscience Institute founded by Jeff Hawkins
- 2004: On Intelligence (Hawkins and Blakeslee) describes the theory of Hierarchical Temporal Memory (HTM)
- 2005: Mathematical formalism (Dileep George)
- 2005: Numenta founded to build a new computing platform based on HTM
- 2007: Released NuPIC software platform
- 2008: First HTM Workshop (>200 attendees)
- 2009: Vision toolkit beta release
- 2010: Prediction toolkit release
5. Demo: An Easy Visual Task
- Goal: output the name of the object in the image
- Example categories: cow, sailboat, cell phone, rubber duck
6. Why Isn’t This Easy For Computers?
- Huge variations in images, even within a single category
- It is impossible to write down a set of rules or transformations that cover all possibilities
8. Agenda
- Introduction to Numenta
- What can we learn from Neuroscience?
- How can we incorporate these ideas into Algorithms?
- How can we incorporate these ideas into Applications?
9. No Universal Learning Machine
- No Free Lunch Theorem: “no learning algorithm has an inherent superiority over other learning algorithms for all problems.” (Wolpert, 1995)
- There is no universal learning machine; instead, build a specific learning machine whose assumptions match the structure of the world
15. Each region is exposed to constantly changing sensory patterns and is constantly predicting future patterns
- [Diagram: hierarchy of cortical regions, with sensory data entering from the retina and the skin; from Felleman and Van Essen]
16. Agenda
- Introduction to Numenta
- What can we learn from Neuroscience?
- How can we incorporate these ideas into Algorithms?
- How can we incorporate these ideas into Applications?
17. Hierarchical Temporal Memory (HTM)
- A network of learning nodes; all nodes do the same thing
- Each node learns common spatial patterns and common sequences (groups of patterns with a common cause)
- Together, the nodes create a hierarchical, spatio-temporal model of the data
- Probabilities over sequences are passed up the hierarchy; predicted spatial patterns are passed down
- Bayesian methods resolve ambiguity
- [Diagram: hierarchy with common spatial patterns and low-level causes at the bottom, common sequences and high-level causes at the top]
19. HTM Nodes Learn Temporal Sequences
- Each node memorizes the static patterns (“coincidences”) that occur in its input vectors
- It models the frequency of transitions between those patterns as a first-order Markov graph
- The transitions are then partitioned into variable-order Markov chains (“groups”)
20. HTM Nodes Output a Probability Over Sequences
- Given its current input, a node outputs a probability distribution over its learned groups: [P(g1), P(g2), …]
- [Diagram: input vectors feed the first-order Markov graph inside the node, which emits the group distribution]
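The node behavior described on these two slides can be sketched in a few lines. This is an illustrative toy (exact-match coincidences and first-order transitions only, no grouping into variable-order chains), not Numenta's actual implementation:

```python
from collections import defaultdict

class HTMNodeSketch:
    """Toy sketch of an HTM node: memorizes input patterns
    ("coincidences") and counts first-order transitions between them.
    Real HTM nodes also partition transitions into variable-order
    groups; this sketch stops at the Markov graph."""

    def __init__(self):
        self.coincidences = []  # memorized static patterns
        self.transitions = defaultdict(lambda: defaultdict(int))

    def _quantize(self, pattern):
        # Map an input to the index of a stored coincidence,
        # memorizing it if unseen (exact match, for simplicity).
        if pattern not in self.coincidences:
            self.coincidences.append(pattern)
        return self.coincidences.index(pattern)

    def learn(self, sequence):
        # Count transitions between consecutive coincidences.
        prev = None
        for pattern in sequence:
            cur = self._quantize(pattern)
            if prev is not None:
                self.transitions[prev][cur] += 1
            prev = cur

    def predict(self, pattern):
        # Output a probability distribution over next coincidences,
        # normalized from the observed transition counts.
        cur = self._quantize(pattern)
        counts = self.transitions[cur]
        total = sum(counts.values())
        if total == 0:
            return {}
        return {c: n / total for c, n in counts.items()}
```

For example, after learning the sequence A, B, A, B, A, C, the node predicts B after A with probability 2/3 and C with probability 1/3.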
23. Summary: Hierarchical Temporal Memory
- A network of learning nodes; all nodes do the same thing
- Each node learns common spatial patterns and common sequences (groups of patterns with a common cause)
- Together, the nodes create a hierarchical model of the data
- Sequence names are passed up the hierarchy; predicted spatial patterns are passed down
- Bayesian methods resolve ambiguity
24. Agenda
- Introduction to Numenta
- What can we learn from Neuroscience?
- How can we incorporate these ideas into Algorithms?
- How can we incorporate these ideas into Applications?
25. Web Analytics
- Analyze temporal patterns on a very high-traffic news website (Forbes.com)
- Question: Can HTMs model the temporal statistics and predict topics and pages of interest to users?
26. Which Topic Is The User Interested In Next?
- 177 total topics
- Random prediction gives 0.56% accuracy (1 in 177)
- [Diagram: a user’s sequence of topic views over time, with the next topics unknown]
27. Training Paradigm
- HTM trained on 100,000 user sequences
- The temporal pooler builds up a variable-order sequence model
28. Prediction Based On Page View Statistics
- Could predict using no temporal context, based just on the popularity of different topics (“0’th order” prediction)
- This is what most sites do today
- Leads to 23% accuracy
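A zeroth-order predictor of this kind ignores temporal context entirely and always returns the most popular topic. A minimal sketch (the topic names are made up for illustration, not from the Forbes.com data):

```python
from collections import Counter

def zeroth_order_predict(history):
    """Predict the next topic from overall popularity alone
    ("0'th order": no temporal context is used)."""
    return Counter(history).most_common(1)[0][0]

# Illustrative page-view stream (hypothetical topics):
views = ["markets", "tech", "markets", "sports", "markets", "tech"]
prediction = zeroth_order_predict(views)  # most popular topic overall
```

Every user gets the same prediction regardless of what they just viewed, which is why this baseline tops out at 23% on the data described above.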
29. First Order Prediction
- Can do better by using the transition probabilities out of each page
- Improves accuracy from 23% to 28%
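First-order prediction conditions on the current topic only. A sketch, with a popularity fallback for topics never seen in training (the fallback is an assumption; the slide does not say how unseen pages are handled):

```python
from collections import Counter, defaultdict

def first_order_predict(history, current):
    """Predict the next topic from transition counts out of the
    current topic (first-order Markov). Falls back to overall
    popularity when the current topic has no observed successors."""
    transitions = defaultdict(Counter)
    for prev, nxt in zip(history, history[1:]):
        transitions[prev][nxt] += 1
    if transitions[current]:
        return transitions[current].most_common(1)[0][0]
    return Counter(history).most_common(1)[0][0]
```

Unlike the zeroth-order baseline, two users on different pages now get different predictions, which accounts for the accuracy gain.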
30. Variable Order Prediction
- “Variable order prediction”: how much temporal context is used is determined per individual sequence
- Accuracy jumps to 45%
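The idea of using as much context as each individual sequence supports can be approximated with a back-off n-gram model: try the longest recently-seen context that was observed in training, and back off to shorter contexts (and finally to raw popularity) otherwise. This is only an illustration of the principle; Numenta's temporal pooler works differently:

```python
from collections import Counter, defaultdict

def variable_order_predict(history, context, max_order=3):
    """Predict the next topic using the longest matching suffix of
    the user's recent context, backing off to shorter contexts when
    a context was never observed in training."""
    # Count next-topic occurrences for every context length up to max_order.
    counts = defaultdict(Counter)
    for i in range(len(history)):
        for order in range(1, max_order + 1):
            if i - order >= 0:
                ctx = tuple(history[i - order:i])
                counts[ctx][history[i]] += 1
    # Back off from the longest available context to the shortest.
    for order in range(min(max_order, len(context)), 0, -1):
        ctx = tuple(context[-order:])
        if counts[ctx]:
            return counts[ctx].most_common(1)[0][0]
    # No context matched: fall back to overall popularity.
    return Counter(history).most_common(1)[0][0]
```

Sequences with distinctive long contexts get high-order predictions, while rare contexts automatically degrade to first- or zeroth-order behavior instead of failing.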
32. Summary: Predicting News Topics HTMs potentially represent a powerful mechanism for predicting and analyzing web traffic patterns
33. Potential Applications In Web Analytics
- Increase the length of site visits: predict pages that are directly relevant to each user
- Increase revenue: predict ad clicks based on the current user’s immediate history
- Display interesting traffic patterns through a website: what are the most common sequences?
- Display changes in traffic patterns: how are the sequence models changing from day to day?
39. Pattern Detection In Digital Pathology
- Task: detect patterns in biopsy slides that are indicative of cancer
- [Images: glands vs. structures that are not glands]
- Malformed glands could indicate prostate cancer
40. Early Results Were Promising
- We trained a network to discriminate glands from other structures
- Test-set accuracy was around 95%
- [Images: example glands and non-glands]
41. HTM For Biomedical Imaging
- HTM performs quite well on gland detection as well as some other tasks
- There could be applications in other areas of biomedical imaging: radiology, electron microscopy, …
- Key differentiator: a general-purpose pattern recognition algorithm; most existing work involves coding very specific algorithms for specific patterns
42. Application Areas
- Web analytics
- Biomedical imaging
- Video analysis
- Credit card fraud
- Automotive
- Gaming
- Drug discovery
- Business modeling
- Healthcare