Yuwei Cui from Numenta presented on real-time streaming data analysis using Hierarchical Temporal Memory (HTM). HTM is based on principles of the neocortex and allows for online learning of high-order sequences from streaming data. HTM can make multiple predictions simultaneously and is fault tolerant. It has been applied successfully to problems like anomaly detection in data center servers and geospatial tracking data. Numenta is working to further understand the neocortex and create more biologically accurate models to continue advancing machine intelligence.
SF AI Meetup - Real-time Streaming Data Analysis with HTM
1. San Francisco Artificial Intelligence Meetup
April, 2016
Yuwei Cui
ycui@numenta.com
Real-time streaming data analysis with HTM
2. History of Numenta
2005 – 2009: First generation algorithms; hierarchy and vision problems
2009 – 2012: Cortical Learning Algorithms; SDRs, sequence memory, continuous learning
2013 – 2015: NuPIC open source project; Grok for anomaly detection
2014 – ??: Sensorimotor; goal directed behavior
Timeline marks: 2002, 2004, 2005
3. Outline
• Numenta’s approach to machine intelligence
• A theory of sequence memory in the neocortex
• Learning high-order complex sequences online
• Application to real-world sequence learning with streaming data
• Numenta anomaly benchmark (NAB)
• A wide variety of applications with HTM
4. Numenta’s Approach
Neuroscience: experimental research
Research: HTM theory, HTM algorithms
NuPIC: open source community
Technology Validation and Development: streaming analytics, natural language, sensorimotor inference
*HTM = Hierarchical Temporal Memory
5. Numenta’s Goals
Mission: Be the leader in the coming era of machine intelligence
1) Discover the computational principles of the neocortex
- information and biological theory
- making good progress
2) Create Technology for Machine Intelligence
based on neocortical principles
- not whole-brain simulation, not human-like
- new senses, new embodiments, faster, larger
6. What Does the Neocortex Do?
Sensory arrays: retina (light), cochlea (sound), somatic (touch)
Sensory stream
Motor stream
The neocortex learns a model of the world from fast changing sensory data
The model is time-based and predictive.
The neocortex learns a sensory-motor model of the world
8. Cortical Theory
Hierarchy
Cellular layers
Mini-columns
Neurons: 5-10K synapses
10% proximal
90% distal
Active dendrites
Learning = new synapses
Remarkably uniform
- anatomically
- functionally
Sheet of ~20 billion cells, 2.5 mm thick
Layers: 2/3, 4, 5, 6
HTM
Hierarchical Temporal Memory
Hierarchy of identical regions
Each region learns sequences
10. The Neuron
ANN neuron
Few synapses
Sum input x weights (Σ)
Learn by modifying weights of synapses
Biological neuron
Thousands of synapses
Active dendrites: cell recognizes 100’s of unique patterns
Learn by growing new synapses
Feedforward: linear, generate spikes
Feedback and local: non-linear; 8-20 coactive synapses lead to dendritic NMDA spikes, which weakly depolarize the soma
HTM neuron
Thousands of synapses
Active dendrites: cell recognizes 100’s of unique patterns
Learn by modeling growth of new synapses
Hawkins & Ahmad, Front. Neural Circuits, 2016
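A minimal sketch of the HTM neuron idea from this slide, assuming each dendritic segment is just a set of presynaptic cell indices and that a segment "spikes" once enough of them are coactive. The class name, the threshold of 10 (picked from the 8-20 range on the slide), and the example indices are illustrative assumptions, not NuPIC code.

```python
class HTMNeuronSketch:
    """Toy neuron whose dendritic segments act as independent pattern detectors."""

    def __init__(self, segments, threshold=10):
        # Each segment is the set of presynaptic cells it has synapses to.
        self.segments = [frozenset(s) for s in segments]
        # Coactive synapses needed for a dendritic spike (assumed value).
        self.threshold = threshold

    def active_segments(self, active_cells):
        """Return indices of segments with enough coactive synapses."""
        active_cells = set(active_cells)
        return [i for i, seg in enumerate(self.segments)
                if len(seg & active_cells) >= self.threshold]

    def is_depolarized(self, active_cells):
        """One active segment is enough to put the cell in a predictive state."""
        return bool(self.active_segments(active_cells))


neuron = HTMNeuronSketch(segments=[range(0, 20), range(100, 120)], threshold=10)

# Ten coactive inputs on the first segment: the cell becomes depolarized.
print(neuron.is_depolarized(range(5, 15)))

# Ten active inputs spread over two segments (5 + 5): no segment reaches threshold.
print(neuron.is_depolarized(list(range(0, 5)) + list(range(100, 105))))
```

The second call shows why this differs from a weighted sum: the same number of active inputs has no effect when they are not coactive on one segment.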
11. Arranging Neurons In Minicolumns Leads To Powerful Sequence Memory & Prediction Algorithm
Feedforward input
Sparse activation of columns (intercolumn inhibition)
No prediction: all cells in column become active
With prediction: only predicted cells in column become active (due to intracolumn inhibition)
A subset of cells are depolarized via predictive contextual input
Two separate sparse representations
Time steps: t-1, t
Hawkins & Ahmad, Front. Neural Circuits, 2016
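The activation rule on this slide can be sketched as follows. This is a simplified illustration, assuming cells are identified by hypothetical (column, cell index) pairs and that the set of active columns has already been chosen by intercolumn inhibition; it is not the NuPIC implementation.

```python
def activate_columns(active_columns, predictive_cells, cells_per_column=4):
    """Pick active cells for the columns that won intercolumn inhibition.

    active_columns:   set of column indices driven by feedforward input
    predictive_cells: set of (column, cell) pairs depolarized at t-1
    """
    active_cells = set()
    for col in active_columns:
        predicted_here = {cell for cell in predictive_cells if cell[0] == col}
        if predicted_here:
            # With prediction: only the predicted cells become active.
            active_cells |= predicted_here
        else:
            # No prediction: all cells in the column become active.
            active_cells |= {(col, i) for i in range(cells_per_column)}
    return active_cells


# Column 0 was predicted (cell 1), column 2 was not.
result = activate_columns({0, 2}, {(0, 1)}, cells_per_column=4)
print(sorted(result))
```

The same feedforward input therefore yields a different sparse set of active cells depending on the prior context.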
12. High Order Sequences
Two sequences: A-B-C-D
X-B-C-Y
Before learning:
A B C D
X B C Y
After learning:
A B’ C’ D’
X B’’ C’’ Y’’
Same columns, but only one cell active per column after learning.
Legend: active cells, depolarized (predictive) cells, inactive cells. Horizontal axis: time.
Hawkins & Ahmad, Front. Neural Circuits, 2016
Columns with depolarized cells
represent predictions
13. Multiple simultaneous predictions
Two sequences: A-B-C-D
X-B-C-Y
After learning: A B’ C’ D’ and X B’’ C’’ Y’’ (same columns, but only one cell active per column after learning)
Start in the middle of learned sequences without context:
B input: C’ AND C’’ predicted
C input: D’ AND Y’’ predicted
Multiple predictions are carried forward until sufficient evidence disambiguates them
Hawkins & Ahmad, Front. Neural Circuits, 2016
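A toy sketch of the behavior described on this slide. It assumes each symbol is a "column" and each (symbol, context) pair is a "cell", with the context label handed in explicitly during learning; real HTM sequence memory chooses cells through its own learning rules, so `ToySequenceMemory` and its methods are purely illustrative names.

```python
from collections import defaultdict


class ToySequenceMemory:
    """Symbols play the role of columns; (symbol, context) pairs play cells."""

    def __init__(self):
        self.transitions = defaultdict(set)  # cell -> cells it predicts next
        self.cells = defaultdict(set)        # symbol -> all cells learned for it

    def learn(self, sequence, context):
        prev = None
        for symbol in sequence:
            cell = (symbol, context)
            self.cells[symbol].add(cell)
            if prev is not None:
                self.transitions[prev].add(cell)
            prev = cell

    def step(self, symbol, predicted):
        """Return (active cells, cells predicted for the next input)."""
        matching = {c for c in predicted if c[0] == symbol}
        # Unpredicted input: every cell of the column becomes active.
        active = matching or set(self.cells[symbol])
        next_predicted = set()
        for cell in active:
            next_predicted |= self.transitions[cell]
        return active, next_predicted


mem = ToySequenceMemory()
mem.learn("ABCD", context=1)   # A B' C' D'
mem.learn("XBCY", context=2)   # X B'' C'' Y''

# Start in the middle without context: both continuations stay alive.
_, after_b = mem.step("B", set())
_, after_c = mem.step("C", after_b)
print(sorted(after_b), sorted(after_c))

# Start from A: the context disambiguates, so only D' is predicted after C.
_, p = mem.step("A", set())
_, p = mem.step("B", p)
_, with_context = mem.step("C", p)
print(sorted(with_context))
```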
14. HTM Sequence Memory: Computational Properties
1) On-line learning
2) High-order representations
For example: sequences “ABCD” vs. “XBCY”
3) Multiple simultaneous predictions
For example: “BC” predicts both “D” and “Y”
4) Fully local and unsupervised learning rules
5) Extremely robust
Tolerant to >40% noise and faults
6) High capacity
Extensively tested, deployed in commercial applications
Full source code and documentation available: numenta.org & github.com/numenta
Papers available: (Hawkins & Ahmad, Front. Neural Circuits, 2016; Cui et al., 2015, 2016)
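One way to build intuition for the robustness claim is a small overlap experiment. The sketch below assumes a sparse pattern of 40 active bits out of 2048 and a match threshold of 15; these numbers and the helper `segment_matches` are illustrative assumptions, not figures from the talk.

```python
import random


def segment_matches(segment, active_bits, threshold):
    """A stored pattern is recognized when enough of its bits are active."""
    return len(segment & active_bits) >= threshold


rng = random.Random(0)
n, w, threshold = 2048, 40, 15          # assumed illustrative values

pattern = set(rng.sample(range(n), w))  # the learned sparse pattern
segment = set(pattern)                  # synapses formed to that pattern

# Simulate 40% faults by dropping 16 of the 40 active bits.
noisy = set(rng.sample(sorted(pattern), int(w * 0.6)))

# An unrelated random pattern of the same sparsity, for comparison.
unrelated = set(rng.sample(range(n), w))

print(len(noisy), segment_matches(segment, noisy, threshold))
print(len(segment & unrelated), segment_matches(segment, unrelated, threshold))
```

Because the threshold sits well below the pattern size, the degraded pattern still matches, while a random pattern at this sparsity rarely shares more than a few bits.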
16. Learning high-order sequences online
Test prediction accuracy at the end of the sequence
Cui et al, arXiv 2015
Shared
subsequence
Start End
High-order sequences
Sequence Noise Sequence Noise
Continuous learning/testing from streaming data
Sequence Noise …Sequence Noise
Switch to a new set of sequences
17. Learning high-order sequences online
[Figure: models compared: Online extreme learning machine, LSTM with short buffer, LSTM with long buffer, HTM]
19. Ability to Make Multiple Predictions
Cui et al, arXiv 2015
[Figure: prediction accuracy (0.0 to 1.0) vs. number of elements seen (0 to 12000); curves: HTM: 2 predictions, LSTM: 2 predictions, HTM: 4 predictions, LSTM: 4 predictions]
Multiple predictions are made in the form of sparse distributed representations (SDRs),
which also have very large coding capacity
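A short sketch of both points, assuming SDRs are sets of active bit indices with 40 active bits out of 2048 (illustrative sizes, not taken from the slides). Several predictions can be held at once as the union of their SDRs, and the number of distinct SDRs is a binomial coefficient.

```python
import math

n, w = 2048, 40                      # assumed SDR size and number of active bits

# Coding capacity: number of distinct SDRs with w of n bits active.
capacity = math.comb(n, w)
print(f"about 10^{len(str(capacity)) - 1} distinct SDRs")


def contains(union_sdr, sdr, threshold):
    """Check whether a candidate SDR is part of a union of predictions."""
    return len(union_sdr & sdr) >= threshold


# Three hand-made, non-overlapping SDRs standing in for predicted elements.
pred_d = set(range(0, 40))
pred_y = set(range(100, 140))
other = set(range(200, 240))

# Two simultaneous predictions are represented as one union.
both = pred_d | pred_y
print(contains(both, pred_d, w), contains(both, pred_y, w), contains(both, other, w))
```

The union stays sparse (80 of 2048 bits here), which is why several predictions can coexist without many false matches.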
22. Application to real-time streaming data analytics
Cui et al, arXiv 2015
Pipeline: Data -> Encoder -> SDR -> HTM High Order Sequence Memory -> SDR -> Classifier -> Predictions / Classification
[Figure: NYC taxi demand, passenger count in 30 min window (0 k to 30 k), Monday 2015-04-20 to Sunday 2015-04-26; panels A to D]
NYC Taxi demand
Source: http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml
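The encoder stage of this pipeline can be illustrated with a simple scalar encoder. This is a sketch under stated assumptions: the 0 to 30,000 range is read off the plot axis, while the function name and the sizes n=400 and w=21 are hypothetical choices, not NuPIC's encoder API.

```python
def encode_scalar(value, min_val=0, max_val=30000, n=400, w=21):
    """Encode a number as a set of w contiguous active bits out of n.

    Nearby values share most of their bits; distant values share none.
    """
    value = max(min_val, min(max_val, value))       # clip to the range
    buckets = n - w + 1                             # possible start positions
    fraction = (value - min_val) / (max_val - min_val)
    start = int(round(fraction * (buckets - 1)))
    return set(range(start, start + w))


a = encode_scalar(15000)   # passenger count in one 30 min window
b = encode_scalar(15100)   # slightly different demand
c = encode_scalar(25000)   # very different demand

print(len(a & b), len(a & c))
```

The overlap between encodings carries the similarity of the underlying values, which is what the sequence memory downstream relies on.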
24. Fast adaptation to changes in the data streams
Cui et al, arXiv 2015
New pattern introduced
20% increase of night taxi demand
20% decrease of morning taxi demand
26. Benchmarking Real-time Streaming Anomaly Detection
Traditional benchmarks don’t apply:
– Don’t incorporate time, e.g. don’t favor early
detection over later detections
– Usually batch format
– Very few with real world data
Numenta Anomaly Benchmark (NAB)
– Scoring methodology favors early
detection
– Incorporates continuous learning
(learning a new normal baseline)
– Labeled real world data streams
– Different “application profiles”
– Fully open source
Lavin & Ahmad, IEEE ICMLA 2015
The NAB competition (Part of the IEEE WCCI):
Win up to $5,000 if you can contribute more datasets and/or anomaly detection algorithms
http://numenta.org/nab/
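To illustrate what "favors early detection" can mean, here is a sketch of a sigmoid-shaped weight over the position of a detection relative to an anomaly window. It is modeled on the scaled sigmoid described in the NAB paper, but the function name, the constants, and the position convention are assumptions for illustration; this is not the benchmark's scoring code.

```python
import math


def detection_weight(relative_position):
    """Weight of a detection by where it falls relative to an anomaly window.

    relative_position: -1.0 at the start of the window, 0.0 at its end,
    positive values after the window has closed.
    """
    return 2.0 / (1.0 + math.exp(5.0 * relative_position)) - 1.0


early = detection_weight(-1.0)    # detected right at the window start
late = detection_weight(-0.2)     # detected near the window end
missed = detection_weight(0.5)    # flagged after the window: penalized

print(round(early, 3), round(late, 3), round(missed, 3))
```

Under this shape an early detection earns close to full credit, credit decays toward the end of the window, and detections after the window count against the detector.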
30. Anomaly Detection in Geospatial Tracking Data
Pipeline: GPS + Velocity -> Encoder -> SDRs -> HTM -> Prediction / Anomaly Detection / Classification
Trick: convert GPS coordinates into an SDR
After the input is encoded as an SDR, the learning algorithm is agnostic to the data type
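One possible way to turn a position into an SDR is sketched below. It assumes the GPS fix has already been projected and quantized to integer grid coordinates, and it uses a hash-based selection of nearby grid cells; the function names, the sizes, and the idea of letting `radius` depend on velocity are illustrative assumptions rather than a description of NuPIC's encoder.

```python
import hashlib


def _hash_int(label, cell):
    """Deterministic pseudo-random integer for a grid cell."""
    digest = hashlib.sha256(f"{label}:{cell[0]},{cell[1]}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def coordinate_sdr(x, y, radius=3, w=21, n=2048):
    """Encode an integer grid position as a set of at most w active bits.

    Nearby positions share neighbors, so they tend to share active bits.
    """
    neighbors = [(i, j)
                 for i in range(x - radius, x + radius + 1)
                 for j in range(y - radius, y + radius + 1)]
    # Pick w neighbors by a fixed hash order, then map each to a bit index.
    winners = sorted(neighbors, key=lambda c: _hash_int("order", c))[:w]
    return {_hash_int("bit", c) % n for c in winners}


here = coordinate_sdr(100, 100)
next_door = coordinate_sdr(101, 100)
far_away = coordinate_sdr(500, 500)

print(len(here & next_door), len(here & far_away))
```

A larger `radius` could be used at higher speeds so that consecutive fixes of a fast-moving object still overlap; that choice is an assumption of this sketch.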
31. HTM Studio: an easy way to run HTM with your data
Now looking for beta testers for HTM Studio!
32. Summary
- Experimental findings from Neuroscience can lead to improved learning
algorithms
- Used properties of active dendrites, Hebbian-style plasticity and minicolumns
- Creating biologically inspired algorithms that really work leads to deeper
understanding of cortical principles and numerous testable predictions
Research Roadmap
- Understand functional properties of laminar microcircuit and
thalamocortical inputs
- Model multiple regions and hierarchy
- More biophysically accurate neuron models (e.g. spiking models)
I don't know how many of you have heard about Numenta. Founded by Jeff Hawkins in 2005, we are an unusual research focused organization - we focus on understanding the computational principles of the neocortex. My background is in computer science and machine learning.
We study experimental research in neuroscience. We use these to improve our theory and learning algorithms. Why bother? Why not stick with the existing ML paradigm? Well if you look at the history of ML, insights from neuroscience have led to numerous fundamental advances (including by the way, the very first learning algorithm). But lately the field has ignored neuroscience. At Numenta we think that's a big mistake.
We validate that our algorithms actually work in real-world applications. We also release everything we do as open source and have cultivated a very fast growing open source community. NuPIC is one of the top machine learning projects on GitHub today. Two points here: 1) we think this approach will lead to qualitative leaps in learning algorithms. 2) <animate back arrow> I am hopeful that our theories will help inform experimental work as well. There is a large set of detailed testable predictions that come out of our theory.
We believe reverse engineering the cortex is a fruitful path to machine intelligence, and we have made good progress on this.
Using the computational principles we learned from the cortex, we create technology for machine intelligence. Note that by reverse engineering the cortex, I don’t mean whole-brain simulation, nor are we trying to create human-like robots. Instead, we focus on creating new senses and exploring new embodiments that can operate faster and on a larger scale.
So what does the cortex do? On a very high level, the neocortex is a memory organ that learns a model of the world.
Learns a model of the world from changing data
We have quite a few sensory organs, …. What’s interesting about these is once you get out of the sensory organs,
This model is a time-based predictive model.
We believe three principles underlie cortical computation
1) Memory-prediction. Jeff has described this extensively in his book On Intelligence. The idea is
How does the cortex do this?
Uniform sheet of cells
Hierarchy
Laminar structure Cellular layers minicolumns
HTM is a theory of the neocortex that incorporates neuroscience
The circuitry on the left is documented, the bursting is observed
What we can show is that a population of such neurons arranged in minicolumns leads to an extremely powerful sequence memory algorithm.
These are the stable performance levels after the models have been exposed to 6000 records.