San Francisco Artificial Intelligence Meetup
April, 2016
Yuwei Cui
ycui@numenta.com
Real-time streaming data analysis with HTM
History of Numenta
2005 – 2009
- First generation algorithms
- Hierarchy and vision problems
2009 – 2012
- Cortical Learning Algorithms
- SDRs, sequence memory, continuous learning
2013 – 2015
- NuPIC open source project
- Grok for anomaly detection
2014 – ??
- Sensorimotor
- Goal-directed behavior
Outline
• Numenta’s approach to machine intelligence
• A theory of sequence memory in the neocortex
• Learning high-order complex sequences online
• Application to real-world sequence learning with streaming data
• Numenta anomaly benchmark (NAB)
• A wide variety of applications with HTM
Numenta's Approach
Neuroscience: experimental research
Numenta research: HTM* theory, HTM algorithms
NuPIC: open source community
Technology validation and development: streaming analytics, natural language, sensorimotor inference
*HTM = Hierarchical Temporal Memory
1) Discover the computational principles of the neocortex
- information and biological theory
- making good progress
2) Create Technology for Machine Intelligence
based on neocortical principles
- not whole-brain simulation, not human-like
- new senses, new embodiments, faster, larger
Numenta’s Goals
Mission: Be the leader in the coming era of machine intelligence
What Does the Neocortex Do?
[Figure: sensory arrays (retina, cochlea, somatic) turn light, sound, and touch into sensory streams; a motor stream flows back out]
The neocortex learns a model of the world from fast-changing sensory data.
The model is time-based and predictive.
The neocortex learns a sensory-motor model of the world.
Cortical Architecture
Hierarchy
Cellular layers (2/3, 4, 5, 6)
Mini-columns
Neurons: 5-10K synapses
- 10% proximal
- 90% distal
Active dendrites
Learning = new synapses
Remarkably uniform
- anatomically
- functionally
Sheet of ~20 billion cells, ~2.5 mm thick
Cortical Theory
Hierarchy
Cellular layers (2/3, 4, 5, 6)
Mini-columns
Neurons: 5-10K synapses
- 10% proximal
- 90% distal
Active dendrites
Learning = new synapses
Remarkably uniform
- anatomically
- functionally
Sheet of ~20 billion cells, ~2.5 mm thick
HTM: Hierarchical Temporal Memory
Hierarchy of identical regions
Each region learns sequences
Outline
• Numenta’s approach to machine intelligence
• A theory of sequence memory in the neocortex
• Learning high-order complex sequences online
• Application to real-world sequence learning with streaming data
• Numenta anomaly benchmark (NAB)
• A wide variety of applications with HTM
The Neuron
ANN neuron (Σ):
- Few synapses
- Sums input × weights
- Learns by modifying synapse weights
HTM neuron:
- Thousands of synapses
- Active dendrites: cell recognizes 100's of unique patterns
- Learns by modeling the growth of new synapses
Biological neuron:
- Thousands of synapses
- Active dendrites: cell recognizes 100's of unique patterns
- Learns by growing new synapses
- Feedforward (proximal) input: linear, generates spikes
- Local and feedback (distal) input: non-linear; 8-20 coactive synapses lead to dendritic NMDA spikes that weakly depolarize the soma
Hawkins & Ahmad, Front. Neural Circuits, 2016
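To make the neuron model above concrete, here is a minimal sketch (my own illustration, not Numenta's implementation) of the key non-linearity: a cell with many distal dendritic segments becomes "predictive" when any one segment sees enough coactive synapses, standing in for the 8-20 synapse NMDA-spike threshold. The population size and segment counts are illustrative.

```python
import numpy as np

N_INPUT_CELLS = 2048        # presynaptic population the segments sample from
SEGMENT_THRESHOLD = 15      # coactive synapses needed for a dendritic "NMDA spike"

class HTMNeuronSketch:
    def __init__(self, n_segments=20, synapses_per_segment=30, seed=42):
        rng = np.random.default_rng(seed)
        # Each distal segment stores the indices of the presynaptic cells it connects to.
        self.segments = [rng.choice(N_INPUT_CELLS, synapses_per_segment, replace=False)
                         for _ in range(n_segments)]

    def is_predictive(self, active_cells):
        """True if any segment has >= SEGMENT_THRESHOLD synapses onto active cells.
        The cell is depolarized (predictive) but does not fire from distal input alone."""
        active = set(int(c) for c in active_cells)
        return any(sum(int(pre) in active for pre in seg) >= SEGMENT_THRESHOLD
                   for seg in self.segments)

neuron = HTMNeuronSketch()
overlap_pattern = neuron.segments[0][:16]          # 16 cells that segment 0 listens to
print(neuron.is_predictive(overlap_pattern))       # True: one segment crosses threshold
print(neuron.is_predictive(range(40)))             # almost certainly False: no segment matches
```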
Arranging Neurons in Minicolumns Leads to a Powerful Sequence Memory & Prediction Algorithm
Feedforward input drives sparse activation of columns (inter-column inhibition).
No prediction: all cells in an active column become active.
With prediction: a subset of cells is depolarized via predictive contextual input, and only the predicted cells become active (due to intra-column inhibition).
The same feedforward input at t-1 and t can therefore produce two separate sparse representations, depending on context.
Hawkins & Ahmad, Front. Neural Circuits, 2016
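The column-level rule just described can be written down directly. The sketch below is a toy illustration with assumed sizes, not the production code: an unexpected input "bursts" a column, while a predicted input activates only the previously depolarized cells.

```python
CELLS_PER_COLUMN = 32   # assumed; the real value is a model parameter

def activate(active_columns, predictive_cells):
    """Return the set of (column, cell) pairs that fire this time step.
    predictive_cells holds (column, cell) pairs depolarized by the previous step."""
    active_cells = set()
    for col in active_columns:
        predicted_here = {c for c in predictive_cells if c[0] == col}
        if predicted_here:
            active_cells |= predicted_here        # intra-column inhibition: only predicted cells fire
        else:
            # Burst: the input was unexpected, so every cell in the column fires.
            active_cells |= {(col, i) for i in range(CELLS_PER_COLUMN)}
    return active_cells

# Column 3 was predicted (cell 7) but column 9 was not:
print(len(activate({3, 9}, {(3, 7)})))   # 33 = 1 predicted cell + 32 bursting cells
```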
High Order Sequences
Two sequences: A-B-C-D and X-B-C-Y
[Figure: before learning, each input (A, X, B, C, D, Y) activates all cells in its columns; after learning, the same columns are active but only one cell per column fires, giving context-dependent representations A, B', C', D' and X, B'', C'', Y''. Active, depolarized (predictive), and inactive cells are shown over time.]
Same columns, but only one cell active per column after learning.
Columns with depolarized cells represent predictions.
Hawkins & Ahmad, Front. Neural Circuits, 2016
Multiple simultaneous predictions
Two sequences: A-B-C-D and X-B-C-Y
Start in the middle of the learned sequences, without context:
- B input: both C' and C'' are predicted
- C input: both D' and Y'' are predicted
Multiple predictions are carried forward until sufficient evidence disambiguates them.
Hawkins & Ahmad, Front. Neural Circuits, 2016
HTM Sequence Memory: Computational Properties
1) On-line learning
2) High-order representations
   For example: sequences "ABCD" vs. "XBCY"
3) Multiple simultaneous predictions
   For example: "BC" predicts both "D" and "Y"
4) Fully local and unsupervised learning rules
5) Extremely robust: tolerant to >40% noise and faults
6) High capacity
Extensively tested, deployed in commercial applications
Full source code and documentation available: numenta.org & github.com/numenta
Papers available: Hawkins & Ahmad, Front. Neural Circuits, 2016; Cui et al., 2015, 2016
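For readers who want to try the open-source implementation, a usage sketch follows. The module path, constructor arguments, and method names are written from memory for NuPIC circa 2016 and may differ between versions, and `stream_of_sdrs()` is a placeholder for your own encoder output, so treat this as a starting point and check it against the documentation at numenta.org.

```python
# Assumed import path and API; verify against your installed NuPIC version.
from nupic.algorithms.temporal_memory import TemporalMemory

tm = TemporalMemory(columnDimensions=(2048,), cellsPerColumn=32)

def stream_of_sdrs():
    """Placeholder: yield one SDR (an iterable of active column indices) per step."""
    yield {7, 93, 210, 511, 1024}   # a single toy pattern, for illustration only

for active_columns in stream_of_sdrs():
    tm.compute(sorted(active_columns), learn=True)   # online learning, one step at a time
    active = tm.getActiveCells()                     # representation of the current input in context
    predicted = tm.getPredictiveCells()              # cells depolarized for the next time step
```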
Outline
• Numenta’s approach to machine intelligence
• A theory of sequence memory in the neocortex
• Learning high-order complex sequences online
• Application to real-world sequence learning with streaming data
• Numenta anomaly benchmark (NAB)
• A wide variety of applications with HTM
Learning high-order sequences online (Cui et al., arXiv 2015)
High-order sequences share a subsequence between their start and end elements.
Prediction accuracy is tested at the end of each sequence.
Continuous learning/testing from streaming data: sequence, noise, sequence, noise, ...
Partway through the stream, switch to a new set of sequences.
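This protocol can be reproduced with a small generator. The code below is my reconstruction of the setup, not the exact dataset code from Cui et al. (2015): two sequences share the middle subsequence B-C, so predicting the final element correctly requires remembering the first element several steps back, and sequences are separated by random noise elements.

```python
import random

SEQUENCES = [list("ABCD"), list("XBCY")]     # shared subsequence: B-C
NOISE_SYMBOLS = list("nopqrstuvwxyz")

def make_stream(n_presentations, noise_len=5, seed=0):
    """Interleave randomly chosen sequences with random noise elements."""
    rng = random.Random(seed)
    stream = []
    for _ in range(n_presentations):
        stream += rng.choice(SEQUENCES)
        stream += [rng.choice(NOISE_SYMBOLS) for _ in range(noise_len)]
    return stream

stream = make_stream(1000)
# Accuracy is scored only at the end of each sequence (D after A-B-C, Y after X-B-C),
# which is exactly where high-order context matters; partway through training,
# SEQUENCES can be swapped for a new set to test how quickly each model adapts.
```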
Learning high-order sequences online
[Plots: prediction accuracy over time for online extreme learning machine, LSTM with a short buffer, LSTM with a long buffer, and HTM, before and after the switch to a new set of sequences]
Ability to Make Multiple Predictions
Cui et al, arXiv 2015
[Plot: prediction accuracy (0-1) vs. number of elements seen (0-12,000) for HTM and LSTM, each making 2 or 4 simultaneous predictions]
Multiple predictions are made in the form of sparse distributed representations (SDRs),
which also have very large coding capacity
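A short numerical sketch of this point: because each prediction is a sparse pattern over a large number of cells, the union of several predictions is still informative, and an incoming pattern can be checked against it by overlap. The sizes below are illustrative, not the model's actual parameters.

```python
import numpy as np

N, W = 2048, 40                    # SDR size and number of active bits (illustrative)
rng = np.random.default_rng(0)

def random_sdr():
    return set(rng.choice(N, W, replace=False).tolist())

predictions = [random_sdr() for _ in range(4)]       # four simultaneous predictions
union = set().union(*predictions)                    # what the depolarized cells represent

arriving = predictions[2]                            # one predicted pattern actually arrives
unrelated = random_sdr()                             # an unpredicted pattern
print(len(arriving & union))     # ~40: the arrival matches one of the stored predictions
print(len(unrelated & union))    # ~3:  chance overlap is tiny, so false matches are rare
```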
Fault Tolerance
Kill a fraction of cells
Outline
• Numenta’s approach to machine intelligence
• A theory of sequence memory in the neocortex
• Learning high-order complex sequences online
• Application to real-world sequence learning with streaming data
• Numenta anomaly benchmark (NAB)
• A wide variety of applications with HTM
Application to real-time streaming data analytics (Cui et al., arXiv 2015)
Pipeline: Data → Encoder → SDR → HTM high-order sequence memory → SDR → Classifier → Predictions / Classification
[Figure: passenger count in 30-minute windows, roughly 0-30k, for Monday 2015-04-20 through Sunday 2015-04-26; panels A-D]
NYC Taxi demand
Source: http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml
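To ground the pipeline above, here is a minimal, hand-rolled encoder sketch for the taxi stream. It is only an illustration of the idea (the real system uses NuPIC's scalar and date encoders, whose parameters and APIs differ): similar passenger counts and similar times of day map to overlapping SDRs, and the concatenated SDR is what the sequence memory would consume every 30 minutes.

```python
def encode_scalar(value, vmin=0.0, vmax=30000.0, n=400, w=21):
    """Passenger count -> w contiguous active bits positioned by value within [vmin, vmax]."""
    value = min(max(value, vmin), vmax)
    start = int((value - vmin) / (vmax - vmin) * (n - w))
    return set(range(start, start + w))

def encode_time_of_day(hour, n=288, w=21):
    """Hour of day (0-24) -> w active bits on a wrap-around scale, so 23:30 and 00:00 overlap."""
    start = int(hour / 24.0 * n)
    return {(start + i) % n for i in range(w)}

def encode_record(hour, passenger_count):
    """Concatenate the two fields; the second field's bits are offset past the first."""
    return encode_scalar(passenger_count) | {400 + b for b in encode_time_of_day(hour)}

sdr = encode_record(hour=8.5, passenger_count=17000)
print(len(sdr))   # 42 active bits out of 688
# Each 30-minute record becomes one such SDR; the sequence memory learns the stream of
# SDRs, and a classifier maps predicted cell activity back to a passenger-count estimate.
```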
Performance on Real-World Streaming Data Sources
Methods compared: ARIMA (statistical method), recurrent neural networks (ESN, LSTM), extreme learning machine (feedforward NN), and HTM.
Fast adaptation to changes in the data streams
Cui et al, arXiv 2015
New pattern introduced
20% increase of night taxi demand
20% decrease of morning taxi demand
Outline
• Numenta’s approach to machine intelligence
• A theory of sequence memory in the neocortex
• Learning high-order complex sequences online
• Application to real-world sequence learning with streaming data
• Numenta anomaly benchmark (NAB)
• A wide variety of applications with HTM
Benchmarking Real-time Streaming Anomaly Detection
Traditional benchmarks don’t apply:
– Don’t incorporate time, e.g. don’t reward early detection over later detection
– Usually batch format
– Very few with real-world data
Numenta Anomaly Benchmark (NAB)
– Scoring methodology favors early
detection
– Incorporates continuous learning
(learning a new normal baseline)
– Labeled real world data streams
– Different “application profiles”
– Fully open source
Lavin & Ahmad, IEEE ICMLA 2015
The NAB competition (Part of the IEEE WCCI):
Win up to $5,000 if you can contribute more datasets and/or anomaly detection algorithms
http://numenta.org/nab/
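The scoring idea can be sketched in a few lines. This is a deliberate simplification of Lavin & Ahmad (2015), not the official scorer at github.com/numenta/NAB, and the weights are made up for illustration: each labeled anomaly defines a window, the first detection inside a window earns a reward that decays the later it arrives, missed windows and false positives are penalized, and an "application profile" sets the relative weights.

```python
import math

def window_reward(t_detect, start, end):
    """~1.0 for a detection right at the window start, decaying toward 0 at the window end."""
    frac = (t_detect - start) / float(end - start)
    return 2.0 / (1.0 + math.exp(6.0 * frac))

def nab_like_score(detections, windows, fp_weight=0.11, fn_weight=1.0):
    score = 0.0
    for (start, end) in windows:
        hits = sorted(t for t in detections if start <= t <= end)
        if hits:
            score += window_reward(hits[0], start, end)   # only the earliest detection counts
        else:
            score -= fn_weight                            # missed anomaly
    false_positives = [t for t in detections
                       if not any(s <= t <= e for s, e in windows)]
    score -= fp_weight * len(false_positives)
    return score

# One anomaly window [100, 140]: an early detection scores much better than a late one.
print(nab_like_score([105], [(100, 140)]))   # ~0.64
print(nab_like_score([138], [(100, 140)]))   # ~0.01
```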
Real-time anomaly detection
Lavin & Ahmad, IEEE ICMLA 2015
[Plot: HTM detects the anomaly earlier than the other algorithms]
https://github.com/numenta/NAB
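The anomaly score driving plots like the one above is, in its simplest form, the fraction of the current input that the sequence memory failed to predict. The sketch below shows that raw score; production detectors typically post-process it further, for example into an anomaly likelihood based on the recent score distribution.

```python
def raw_anomaly_score(active_columns, previously_predicted_columns):
    """Fraction of currently active columns that were NOT predicted at the previous step.
    0.0 means the input was fully expected; 1.0 means it was completely novel."""
    active = set(active_columns)
    if not active:
        return 0.0
    unexpected = active - set(previously_predicted_columns)
    return len(unexpected) / float(len(active))

print(raw_anomaly_score({1, 2, 3, 4}, {2, 3, 4, 9}))   # 0.25: one surprise out of four columns
print(raw_anomaly_score({1, 2, 3, 4}, set()))          # 1.0:  nothing was predicted
```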
Outline
• Numenta’s approach to machine intelligence
• A theory of sequence memory in the neocortex
• Learning high-order complex sequences online
• Application to real-world sequence learning with streaming data
• Numenta anomaly benchmark (NAB)
• A wide variety of applications with HTM
Applications Using HTM High-Order Inference
- Datacenter server anomalies
- Rogue human behavior
- Geospatial tracking
- Stock anomalies
- Social media streams (Twitter)
Pipeline: Data → Encoder → SDR → HTM high-order sequence memory → Predictions / Classification / Anomalies
All using the core HTM algorithm, with the same parameters
Anomaly Detection in Geospatial Tracking Data
Pipeline: GPS + velocity → Encoder → SDRs → HTM → Prediction / Anomaly detection / Classification
Trick: convert GPS coordinates into an SDR
After the input is encoded as an SDR, the learning algorithm is agnostic to the data source.
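A hedged sketch of that trick follows, loosely modeled on the coordinate-encoding idea in NuPIC (the real encoder differs in its details, and the grid scaling and hash choices here are my own): snap the position to a speed-dependent grid, consider a neighborhood of grid cells, and deterministically hash cells to pick which output bits turn on. Nearby positions share grid cells and therefore share active bits, which is exactly the overlap the sequence memory needs.

```python
import hashlib

N_BITS, W = 1024, 21   # illustrative output size and sparsity

def _hash(cell, salt):
    digest = hashlib.md5(("%d,%d,%s" % (cell[0], cell[1], salt)).encode()).hexdigest()
    return int(digest, 16)

def encode_gps(lat, lon, speed_mps, radius=3):
    # Faster movement -> coarser grid, so consecutive samples still overlap.
    cell_size_deg = max(1e-4, speed_mps * 1e-5)
    cx, cy = int(lat / cell_size_deg), int(lon / cell_size_deg)
    neighborhood = [(cx + dx, cy + dy)
                    for dx in range(-radius, radius + 1)
                    for dy in range(-radius, radius + 1)]
    # Deterministically keep the W cells with the highest hash-based "order",
    # then map each kept cell to an output bit.
    top = sorted(neighborhood, key=lambda c: _hash(c, "order"), reverse=True)[:W]
    return {_hash(c, "bit") % N_BITS for c in top}

a = encode_gps(37.7749, -122.4194, speed_mps=10.0)
b = encode_gps(37.7751, -122.4196, speed_mps=10.0)   # a nearby point moments later
print(len(a), len(a & b))   # ~21 active bits, with substantial overlap between the two
```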
HTM Studio: an easy way to run HTM with your data
Now looking for beta testers for HTM Studio!
Summary
- Experimental findings from neuroscience can lead to improved learning algorithms
- Used properties of active dendrites, Hebbian-style plasticity, and minicolumns
- Creating biologically inspired algorithms that really work leads to a deeper understanding of cortical principles and numerous testable predictions
Research Roadmap
- Understand functional properties of the laminar microcircuit and thalamocortical inputs
- Model multiple regions and hierarchy
- More biophysically accurate neuron models (e.g., spiking models)
Collaborators
- Jeff Hawkins (PI)
- Subutai Ahmad
- Scott Purdy
- Alex Lavin
Contact info:
ycui@numenta.com

Editor's Notes

  • #2 I don't know how many of you have heard about Numenta. Founded by Jeff Hawkins in 2005, we are an unusual, research-focused organization: we focus on understanding the computational principles of the neocortex. My background is in computer science and machine learning.
  • #5 We study experimental research in neuroscience and use it to improve our theory and learning algorithms. Why bother? Why not stick with the existing ML paradigm? Well, if you look at the history of ML, insights from neuroscience have led to numerous fundamental advances (including, by the way, the very first learning algorithm). But lately the field has ignored neuroscience. At Numenta we think that's a big mistake. We validate our algorithms in real-world applications. We also release everything we do as open source and have cultivated a very fast-growing open source community. NuPIC is one of the top machine learning projects on GitHub today. Two points here: 1) we think this approach will lead to qualitative leaps in learning algorithms. 2) <animate back arrow> I am hopeful that our theories will help inform experimental work as well. There is a large set of detailed testable predictions that come out of our theory.
  • #6 We believe reverse-engineering the cortex is a fruitful path to machine intelligence, and we have made good progress on this. Using the computational principles we learn from the cortex, we create technology for machine intelligence. Note that by reverse-engineering the cortex, I don't mean whole-brain simulation, nor are we trying to create human-like robots. Instead, we focus on creating new senses and exploring new embodiments that can operate faster and on a larger scale.
  • #7 So what does the cortex do? On a very high level, the neocortex is a memory organ that learns a model of the world from changing sensory data. We have quite a few sensory organs; what's interesting is what happens once you get out of the sensory organs… This model is a time-based, predictive model. We believe three principles underlie cortical computation: 1) memory-prediction, which Jeff has described extensively in his book On Intelligence.
  • #8 How does the cortex do this? Through a remarkably uniform sheet of cells: hierarchy, laminar structure, cellular layers, minicolumns.
  • #9 HTM is a theory of the neocortex that incorporates neuroscience.
  • #12 The circuitry on the left is documented, and the bursting is experimentally observed.
  • #13 What we can show is that a population of such neurons arranged in minicolumns leads to an extremely powerful sequence memory algorithm.   
  • #24 This shows the stable performance after the models have been exposed to 6,000 records.
  • #27 7 MINUTES
  • #35 Matt