SF AI Meetup - Real-time Streaming Data Analysis with HTM

San Francisco Artificial Intelligence Meetup
April, 2016
Yuwei Cui
ycui@numenta.com
Real-time streaming data analysis with HTM

History of Numenta
2005 – 2009
- First generation algorithms
- Hierarchy and vision problems
2009 – 2012
- Cortical Learning Algorithms
- SDRs, sequence memory, continuous learning
2013 – 2015
- NuPIC open source project
- Grok for anomaly detection
2014 – ??
- Sensorimotor
- Goal directed behavior
Earlier timeline markers (labels not recoverable): 2002, 2004, 2005

Outline
• Numenta’s approach to machine intelligence
• A theory of sequence memory in the neocortex
• Learning high-order complex sequences online
• Application to real-world sequence learning with streaming data
• Numenta anomaly benchmark (NAB)
• A wide variety of applications with HTM

Numenta’s Approach
- Neuroscience: experimental research
- Numenta research: HTM theory, HTM algorithms
- NuPIC: open source community
- Technology validation and development: streaming analytics, natural language, sensorimotor inference
*HTM = Hierarchical Temporal Memory

Numenta’s Goals
1) Discover the computational principles of the neocortex
- information and biological theory
- making good progress
2) Create technology for machine intelligence based on neocortical principles
- not whole-brain simulation, not human-like
- new senses, new embodiments, faster, larger
Mission: Be the leader in the coming era of machine intelligence

What Does the Neocortex Do?
Sensory arrays: retina (light), cochlea (sound), somatic (touch)
Streams: sensory stream, motor stream
The neocortex learns a model of the world from fast changing sensory data.
The model is time-based and predictive.
The neocortex learns a sensory-motor model of the world.

Cortical Architecture
Hierarchy
Cellular layers
Mini-columns
Neurons: 5-10K synapses
10% proximal
90% distal
Active dendrites
Learning = new synapses
Remarkably uniform
- anatomically
- functionally
Sheet of ~20 billion cells
[Figure labels: 2.5 mm; layers 2/3, 4, 5, 6]

Cortical Theory
Same cortical architecture as on the previous slide: hierarchy, cellular layers, mini-columns, neurons with active dendrites
HTM (Hierarchical Temporal Memory)
- Hierarchy of identical regions
- Each region learns sequences

The Neuron
ANN neuron
- Few synapses
- Sum input x weights
- Learn by modifying weights of synapses
Biological neuron
- Thousands of synapses
- Active dendrites: cell recognizes 100’s of unique patterns
- Learn by growing new synapses
HTM neuron
- Thousands of synapses
- Active dendrites: cell recognizes 100’s of unique patterns
- Learn by modeling growth of new synapses
Input zones (figure labels): feedforward, local, feedback
- Feedforward: linear, generates spikes
- Local and feedback: non-linear; 8-20 coactive synapses lead to dendritic NMDA spikes, which weakly depolarize the soma
Hawkins & Ahmad, Front. Neural Circuits, 2016
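
The HTM neuron described above can be sketched in a few lines. This is an illustrative sketch, not NuPIC code: the class names, the threshold of 10 (picked from the 8-20 range on the slide), and the example patterns are all assumptions.

```python
from dataclasses import dataclass, field

# Assumed threshold, picked from the 8-20 coactive synapses cited on the slide.
ACTIVATION_THRESHOLD = 10


@dataclass
class Segment:
    """One distal dendritic segment: a set of presynaptic cell indices."""
    presynaptic_cells: frozenset

    def is_active(self, active_cells, threshold=ACTIVATION_THRESHOLD):
        # Count synapses whose presynaptic cell is currently active.
        return len(self.presynaptic_cells & active_cells) >= threshold


@dataclass
class HTMNeuron:
    """A cell with many independent segments, each a pattern detector."""
    segments: list = field(default_factory=list)

    def is_depolarized(self, active_cells):
        # Any single active segment puts the cell in the predictive state.
        return any(seg.is_active(active_cells) for seg in self.segments)


neuron = HTMNeuron(segments=[
    Segment(frozenset(range(0, 20))),     # hypothetical learned pattern 1
    Segment(frozenset(range(100, 120))),  # hypothetical learned pattern 2
])

pattern_1_noisy = set(range(0, 12)) | {500, 501}  # 12 of 20 synapses active
unrelated = set(range(300, 320))

depolarized_on_pattern = neuron.is_depolarized(pattern_1_noisy)
depolarized_on_unrelated = neuron.is_depolarized(unrelated)
```

Because each segment only needs a subset of its synapses to be active, the cell still recognizes a pattern when part of it is missing or noisy.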

Arranging Neurons In Minicolumns Leads To Powerful Sequence Memory & Prediction Algorithm
t-1: a subset of cells is depolarized via predictive contextual input
t: feedforward input causes sparse activation of columns (intercolumn inhibition)
- No prediction: all cells in column become active
- With prediction: only predicted cells in column become active (due to intracolumn inhibition)
The two cases yield two separate sparse representations
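
The activation rule on this slide can be written as a short function. This is a minimal sketch under assumed data structures (cells as `(column, cell)` pairs); the function name and arguments are hypothetical, not the NuPIC API.

```python
def activate_cells(active_columns, predictive_cells, cells_per_column):
    """Return the set of (column, cell) pairs active at time t.

    active_columns: columns selected by feedforward input at time t.
    predictive_cells: (column, cell) pairs depolarized at time t-1.
    """
    active = set()
    for col in active_columns:
        predicted = {(c, i) for (c, i) in predictive_cells if c == col}
        if predicted:
            # With prediction: only the predicted cells become active.
            active |= predicted
        else:
            # No prediction: all cells in the column become active.
            active |= {(col, i) for i in range(cells_per_column)}
    return active


cells = activate_cells(
    active_columns={0, 1},
    predictive_cells={(0, 2)},
    cells_per_column=4,
)
```

Column 0 had a predicted cell, so only that cell is active; column 1 had none, so all four of its cells are active.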

High Order Sequences
Two sequences: A-B-C-D and X-B-C-Y
[Figure: cell activity in the same columns over time. Legend: active cells, depolarized (predictive) cells, inactive cells]
Before learning: A B C D and X B C Y
After learning: A B’ C’ D’ and X B’’ C’’ Y’’
Same columns, but only one cell active per column after learning.
Columns with depolarized cells represent predictions.
Start in the middle of learned sequences without context:
- B input: C’ AND C’’ predicted
- C input: D’ AND Y’’ predicted
Multiple simultaneous predictions
Multiple predictions are carried forward until sufficient evidence disambiguates them
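
The behavior described here can be illustrated with a toy lookup model. It is not HTM and does no learning; it is a stand-in (with a hypothetical `predict_next` helper) that only shows how context narrows the set of predictions.

```python
def predict_next(learned_sequences, observed):
    """Return every element that can follow `observed` in a learned sequence.

    `observed` is matched as a contiguous run anywhere in each sequence, so
    a stream that starts mid-sequence yields all consistent continuations.
    """
    predictions = set()
    k = len(observed)
    for seq in learned_sequences:
        for start in range(len(seq) - k):
            if seq[start:start + k] == observed:
                predictions.add(seq[start + k])
    return predictions


learned = ["ABCD", "XBCY"]
no_context = predict_next(learned, "BC")     # started in the middle
with_context = predict_next(learned, "ABC")  # context disambiguates
```

Starting at "BC" leaves both D and Y predicted; once A is part of the observed context, only D remains.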

HTM Sequence Memory: Computational Properties
1) On-line learning
2) High-order representations
   For example: sequences “ABCD” vs. “XBCY”
3) Multiple simultaneous predictions
   For example: “BC” predicts both “D” and “Y”
4) Fully local and unsupervised learning rules
5) Extremely robust
   Tolerant to >40% noise and faults
6) High capacity
Extensively tested, deployed in commercial applications
Full source code and documentation available: numenta.org & github.com/numenta
Papers available: (Hawkins & Ahmad, Front. Neural Circuits, 2016; Cui et al., 2015, 2016)

Learning high-order sequences online
High-order sequences with a shared subsequence between start and end
Test prediction accuracy at the end of the sequence
Continuous learning/testing from streaming data: Sequence, Noise, Sequence, Noise, …
Switch to a new set of sequences
Algorithms compared: online extreme learning machine, LSTM with short buffer, LSTM with long buffer, HTM
Cui et al, arXiv 2015

Ability to Make Multiple Predictions
[Figure: prediction accuracy (0.0 to 1.0) vs. number of elements seen (0 to 12000). Curves: HTM with 2 predictions, LSTM with 2 predictions, HTM with 4 predictions, LSTM with 4 predictions]
Multiple predictions are made in the form of sparse distributed representations (SDRs),
which also have very large coding capacity
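
A rough calculation shows why SDRs have such a large coding capacity. The sizes below (2048 bits, 40 active) are assumptions chosen for illustration; the slide does not state them.

```python
from math import comb

# Assumed sizes, not given on the slide: n total bits, w active bits.
n, w = 2048, 40

# Number of distinct SDRs with exactly w of n bits active.
capacity = comb(n, w)


def false_match_probability(n, w, theta):
    """Chance that a random SDR overlaps a fixed SDR in at least theta bits."""
    matches = sum(comb(w, b) * comb(n - w, w - b) for b in range(theta, w + 1))
    return matches / comb(n, w)


# Probability that an unrelated SDR shares at least half of the active bits.
p_false = false_match_probability(n, w, theta=20)
```

With a low false-match probability, several predictions can be held at once as a union of SDRs and still be told apart from unrelated inputs.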

Fault Tolerance
Kill a fraction of cells

Application to real-time streaming data analytics
Data -> Encoder -> SDR -> HTM High Order Sequence Memory -> SDR -> Classifier -> Predictions, Classification
[Figure: passenger count in 30-minute windows (0 to 30k), Monday 2015-04-20 through Sunday 2015-04-26; panels A-D]
NYC Taxi demand
Source: http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml
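
The encoder stage of this pipeline turns each data value into an SDR. Below is a minimal sketch of a scalar encoder for the passenger counts; the function, the sizes (100 bits, 11 active), and the value range are assumptions, not the encoder used for the results in this talk.

```python
def encode_scalar(value, min_val, max_val, n=100, w=11):
    """Encode a scalar as n bits with a contiguous run of w active bits.

    Nearby values share active bits; distant values share none.
    """
    value = max(min_val, min(max_val, value))  # clip to the encoder range
    n_buckets = n - w + 1                      # possible start positions
    frac = (value - min_val) / (max_val - min_val)
    start = int(round(frac * (n_buckets - 1)))
    return [1 if start <= i < start + w else 0 for i in range(n)]


def overlap(a, b):
    """Number of bits active in both encodings."""
    return sum(x & y for x, y in zip(a, b))


# Passenger counts in the 0 to 30k range shown on the taxi plot.
sdr_10k = encode_scalar(10_000, 0, 30_000)
sdr_11k = encode_scalar(11_000, 0, 30_000)
sdr_25k = encode_scalar(25_000, 0, 30_000)
```

Here `overlap(sdr_10k, sdr_11k)` is positive while `overlap(sdr_10k, sdr_25k)` is zero, so similar counts look similar to the sequence memory.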

Performance On Real-World Streaming Data Sources
Methods compared:
- ARIMA (statistical method)
- Recurrent neural network (ESN, LSTM)
- Extreme Learning Machine (feedforward NN)
- HTM

Fast adaptation to changes in the data streams
New pattern introduced
20% increase of night taxi demand
20% decrease of morning taxi demand

Benchmarking Real-time Streaming Anomaly Detection
Traditional benchmarks don’t apply:
– Don’t incorporate time, e.g. don’t favor early detection over later detection
– Usually batch format
– Very few with real world data
Numenta Anomaly Benchmark (NAB)
– Scoring methodology favors early detection
– Incorporates continuous learning (learning a new normal baseline)
– Labeled real world data streams
– Different “application profiles”
– Fully open source
Lavin & Ahmad, IEEE ICMLA 2015
The NAB competition (Part of the IEEE WCCI):
Win up to $5,000 if you can contribute more datasets and/or anomaly detection algorithms
http://numenta.org/nab/
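
To show what "favors early detection" can mean in practice, here is a simplified scoring sketch. The sigmoid shape and its constants are assumptions for illustration; this is not NAB's exact scoring function, and it ignores the application-profile weights.

```python
import math


def detection_score(relative_position):
    """Score one detection by where it falls relative to an anomaly window.

    relative_position: -1.0 at the start of the window, 0.0 at its end,
    positive after the window has closed. Earlier detections score higher;
    detections well after the window approach -1.
    """
    return 2.0 / (1.0 + math.exp(5.0 * relative_position)) - 1.0


early = detection_score(-1.0)   # detected at the start of the window
late = detection_score(-0.1)    # detected just before the window ends
missed = detection_score(2.0)   # flagged long after the window
```

A detector is rewarded most for flagging an anomaly at the start of its window, less for flagging it late, and penalized for flagging it after the window has passed.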

Real-time anomaly detection
Lavin & Ahmad, IEEE ICMLA 2015
[Figure annotations: HTM detects anomaly earlier; other algorithms]
https://github.com/numenta/NAB
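
The slides do not spell out how predictions become an anomaly signal. One simple approach, sketched here as an assumption, is to measure how much of the current input was not predicted at the previous step.

```python
def raw_anomaly_score(active_columns, predicted_columns):
    """Fraction of currently active columns that were not predicted.

    0.0 means the input was fully predicted; 1.0 means fully unexpected.
    """
    if not active_columns:
        return 0.0
    unpredicted = active_columns - predicted_columns
    return len(unpredicted) / len(active_columns)


# Hypothetical column indices for three time steps.
expected = raw_anomaly_score({1, 2, 3, 4}, {1, 2, 3, 4})
partly_expected = raw_anomaly_score({1, 2, 3, 4}, {1, 2, 3, 9})
surprising = raw_anomaly_score({1, 2, 3, 4}, {7, 8, 9})
```

Because the score depends only on the previous step's predictions and the current input, it can be computed on each record as it arrives.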

Applications Using HTM High-Order Inference
- Datacenter server anomalies
- Rogue human behavior
- Geospatial tracking
- Stock anomalies
- Social media streams (Twitter)
Data -> Encoder -> SDR -> HTM High Order Sequence Memory -> Predictions, Classification, Anomalies
All using the core HTM algorithm, with same parameters

Anomaly Detection in Geospatial Tracking Data
GPS + Velocity -> Encoder -> SDRs -> HTM -> Prediction, Anomaly Detection, Classification
Trick: convert GPS coordinates into an SDR
After input is encoded as an SDR, learning algorithm is agnostic
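
One way to convert a GPS coordinate into an SDR is to hash the grid cells around the position. The sketch below is a simplified illustration with hypothetical names and parameters, not NuPIC's geospatial encoder, and it ignores velocity.

```python
import hashlib


def _hash(*parts):
    """Deterministic integer hash of the given values."""
    digest = hashlib.md5(repr(parts).encode("utf-8")).hexdigest()
    return int(digest, 16)


def encode_position(lat, lon, scale=0.001, radius=3, n=1024, w=15):
    """Return the set of active bit indices for a GPS position.

    scale: grid cell size in degrees; radius: neighborhood half-width in cells.
    """
    cx, cy = int(round(lat / scale)), int(round(lon / scale))
    neighbors = [
        (x, y)
        for x in range(cx - radius, cx + radius + 1)
        for y in range(cy - radius, cy + radius + 1)
    ]
    # Rank every nearby grid cell by a fixed pseudo-random order and keep
    # the top w. Nearby positions share most of their neighborhood, so they
    # tend to keep most of the same winners.
    winners = sorted(neighbors, key=lambda c: _hash("order", c))[:w]
    return {_hash("bit", c) % n for c in winners}


here = encode_position(37.7749, -122.4194)
one_cell_away = encode_position(37.7759, -122.4194)
far_away = encode_position(40.7128, -74.0060)
```

Positions one grid cell apart should share many active bits, while distant positions share few or none, which gives the learning algorithm a notion of spatial similarity without any GPS-specific logic.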

HTM Studio: an easy way to run HTM with your data
Now looking for beta testers for HTM Studio!

Summary
- Experimental findings from Neuroscience can lead to improved learning
algorithms
- Used properties of active dendrites, Hebbian-style plasticity and minicolumns
- Creating biologically inspired algorithms that really work leads to deeper
understanding of cortical principles and numerous testable predictions
Research Roadmap
- Understand functional properties of laminar microcircuit and
thalamocortical inputs
- Model multiple regions and hierarchy
- More biophysically accurate neuron models (e.g. spiking models)

Collaborators
- Jeff Hawkins (PI)
- Subutai Ahmad
- Scott Purdy
- Alex Lavin
Contact info:
ycui@numenta.com
