Predictive Analytics with Numenta Machine Intelligence

PREDICTIVE ANALYTICS WITH NUMENTA MACHINE INTELLIGENCE
SF Data Science Meetup
August 2, 2016
Alex Lavin
alavin@numenta.com
@theAlexLavin

OUTLINE
1. Online, streaming analytics
2. Intro to Hierarchical Temporal Memory (HTM)
3. Applying HTM
   1. Real-time anomaly detection
   2. Real-time prediction
   3. Open-source engine and streams
4. Wrap up
   1. Summary
   2. Q&A

THE STREAMING ANALYTICS PROBLEM
Given all past input and the current input, compute the state of the system right now.
The system must report its decision and perform any retraining, bookkeeping, etc. before the next input arrives.
• No look-ahead – online, not batch
• No training/test set split
• System must be automated, and customized to each stream
• Unsupervised, continuous learning
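A minimal sketch of the online loop these constraints imply. The `RunningMeanModel` class and its `step` method are hypothetical names chosen for illustration, and the running-mean predictor is only a stand-in for a real model; it is not HTM.

```python
class RunningMeanModel:
    """Stand-in online model (not HTM): predicts the running mean of past inputs."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def step(self, value):
        """Report a decision for `value`, then learn from it, in one pass."""
        # The decision uses only past inputs (no look-ahead).
        prediction = self.mean
        error = abs(value - prediction)
        # Learning happens immediately, before the next input arrives.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return prediction, error


def process_stream(stream, model):
    """Consume records one at a time; there is no separate training phase."""
    results = []
    for value in stream:
        results.append(model.step(value))
    return results
```

Every record is both a test point (the prediction is made before the value is used) and a training point (the model is updated right after), which is why no training/test split is needed.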

REAL-TIME ANALYTICS
• Enormous increase in the availability of streaming, time-series data
• Prediction is fundamental to real-time analytics, and valuable in all domains!
• Example domains: monitoring IT infrastructure, financial data, tracking vehicles, real-time health monitoring, energy consumption

RESEARCH @ NUMENTA
Neuroscience Theories
Computational Frameworks
Machine Intelligence
Neurobiology Data

HIERARCHICAL TEMPORAL MEMORY (HTM)
HTM is a powerful sequence memory derived from recent
findings in experimental neuroscience.
• High capacity memory-based system
• Models complex, high-order temporal sequences
• Inherently streaming
• Continuously learning and predicting
• No need to tune hyper-parameters
• Robust and fault-tolerant
• Runs in real time on a laptop
• Open source: github.com/numenta

HIERARCHICAL TEMPORAL MEMORY (HTM)
Want to dive into HTM?
• http://numenta.com/learn
• BaMI
• Research papers
• HTM School
• http://numenta.org for NuPIC
• https://discourse.numenta.org

HTM PREDICTS FUTURE INPUT
[Figure: input $x_t$ → HTM → $a(x_t)$ and $\pi(x_t)$; cell legend: active, inactive, depolarized (predicted)]
• Input to the system is a stream of data: $x_t$
• Encoded into a sparse, high-dimensional vector
• Learns temporal sequences in the input stream: $a(x_t)$
• Makes a prediction in the form of a sparse vector: $\pi(x_t)$
• $\pi(x_t)$ represents a prediction for the upcoming input: $a(x_{t+1})$
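To make "sparse, high-dimensional vector" concrete, here is a hedged sketch of a simple scalar encoder. The function names, the value range, and the sizes (100 bits, 11 active) are assumptions for illustration; this is not the NuPIC encoder API. Nearby values share active bits, distant values share none.

```python
def encode_scalar(value, min_val=0.0, max_val=100.0, n_bits=100, n_active=11):
    """Return the set of active bit indices for `value` (illustrative encoder)."""
    # Clip out-of-range values to the encoder's range.
    clipped = min(max(value, min_val), max_val)
    # A contiguous run of n_active bits can start at this many positions.
    n_buckets = n_bits - n_active + 1
    fraction = (clipped - min_val) / (max_val - min_val)
    start = int(round(fraction * (n_buckets - 1)))
    return set(range(start, start + n_active))


def overlap(sdr_a, sdr_b):
    """Number of active bits two sparse vectors have in common."""
    return len(sdr_a & sdr_b)
```

With these assumed parameters, `overlap(encode_scalar(50), encode_scalar(51))` is large while `overlap(encode_scalar(50), encode_scalar(90))` is zero, so similarity in the input is preserved as overlap in the encoding.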

ANOMALY DETECTION WITH HTM
[Figure: $x_t$ → HTM → $a(x_t)$, $\pi(x_t)$ → raw anomaly score $s_t$ → anomaly likelihood $L_t$]
• $s_t$ is an instantaneous measure of prediction error
  • 0 if the input was perfectly predicted
  • 1 if it was completely unpredicted
• Could threshold it directly to report anomalies, but in very noisy environments we can do better
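One way to compute a raw anomaly score with the endpoints described on this slide is the fraction of currently active bits that were not predicted at the previous step, $s_t = 1 - |\pi(x_{t-1}) \cap a(x_t)| / |a(x_t)|$. The sketch below assumes sparse vectors are represented as Python sets of active bit indices; the function name and the handling of an empty input are assumptions, not NuPIC's implementation.

```python
def raw_anomaly_score(predicted_bits, active_bits):
    """Fraction of active bits that the previous prediction missed.

    predicted_bits: set of indices predicted at the previous step, pi(x_{t-1})
    active_bits:    set of indices active now, a(x_t)
    """
    if not active_bits:
        # Assumption: an empty input carries no evidence of an anomaly.
        return 0.0
    matched = len(predicted_bits & active_bits)
    return 1.0 - matched / len(active_bits)
```

A score of 0.5, for example, means half of the active bits were predicted.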

ANOMALY LIKELIHOOD
Second order measure: did the predictability of the metric change?
1. Estimate historical distribution of raw anomaly scores
2. Check if recent scores are very different
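The two steps above can be sketched as follows. This is a hedged illustration, assuming the historical raw scores are modeled as a Gaussian over a long rolling window and compared against the mean of a short recent window; the class name, window sizes, warm-up rule, and variance floor are all assumptions rather than NuPIC's implementation.

```python
import math
from collections import deque


class AnomalyLikelihood:
    """Illustrative second-order anomaly measure over raw anomaly scores."""

    def __init__(self, long_window=100, short_window=10):
        self.history = deque(maxlen=long_window)
        self.short_window = short_window

    def update(self, raw_score):
        self.history.append(raw_score)
        scores = list(self.history)
        if len(scores) < 2 * self.short_window:
            return 0.5  # assumption: stay neutral until enough history exists
        # Step 1: estimate the historical distribution of raw scores.
        mean = sum(scores) / len(scores)
        variance = sum((s - mean) ** 2 for s in scores) / len(scores)
        std = max(math.sqrt(variance), 1e-6)  # floor avoids division by zero
        # Step 2: check whether recent scores are very different.
        recent = scores[-self.short_window:]
        recent_mean = sum(recent) / len(recent)
        z = (recent_mean - mean) / std
        # Gaussian tail probability Q(z) = 0.5 * erfc(z / sqrt(2)).
        tail = 0.5 * math.erfc(z / math.sqrt(2))
        return 1.0 - tail
```

A stream that is consistently noisy keeps a likelihood near 0.5, while a sustained jump in raw scores pushes it toward 1, so a threshold close to 1 reports only changes in predictability.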

ANOMALY BENCHMARK
Detector                       Standard Profile   Reward Low FP   Reward Low FN
Perfect                        100                100             100
Numenta HTM                    65.3               58.6            69.4
Multinomial Relative Entropy   54.6               47.6            58.8
Twitter ADVec v1.0.0           47.1               33.6            53.5
Etsy Skyline                   35.7               27.1            44.5
Sliding Threshold              30.7               12.1            38.3
Bayesian Online Changepoint    17.7               3.2             32.2
EXPoSE                         16.4               3.2             26.9
Random                         11                 1.2             19.5
Null                           0                  0               0
https://github.com/numenta/NAB

MULTIPLE STREAMS
Ahmad & Purdy, "Real-Time Anomaly Detection for Streaming Analytics": https://arxiv.org/abs/1607.02480

PREDICTION USES SOFTMAX CLASSIFIER
[Figure: $x_t$ → HTM → SDR $a(x_t)$ → Classifier → $P(x_{t+k} \mid x_t)$]
• Classifier maps $a(x_t)$ to a probability distribution over inputs using a linear classifier plus softmax
• Classifier trained to optimize negative log likelihood
• System can predict multiple time steps into the future
• Weights are updated continuously
• Can predict categories and scalar values
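A minimal sketch of a linear-plus-softmax classifier trained online on negative log likelihood, as the bullets on this slide describe. It assumes the sparse vector is given as a set of active bit indices and that targets are class indices (scalar values would first be bucketed into classes); the class name, method names, and learning rate are assumptions, not the NuPIC classifier API.

```python
import math


class SoftmaxSDRClassifier:
    """Illustrative online softmax classifier over sparse binary inputs."""

    def __init__(self, n_bits, n_classes, learning_rate=0.1):
        self.weights = [[0.0] * n_bits for _ in range(n_classes)]
        self.learning_rate = learning_rate

    def predict(self, active_bits):
        """Return a probability distribution over classes."""
        # Linear layer: each class sums the weights of the active bits.
        logits = [sum(row[i] for i in active_bits) for row in self.weights]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def learn(self, active_bits, target_class):
        """One gradient step on the negative log likelihood of the target."""
        probs = self.predict(active_bits)
        for c, row in enumerate(self.weights):
            # Gradient of the NLL with respect to logit c.
            grad = probs[c] - (1.0 if c == target_class else 0.0)
            for i in active_bits:
                row[i] -= self.learning_rate * grad
```

To predict several steps ahead under this sketch, pair the sparse vector seen at one time step with the class observed the chosen number of steps later, and call `learn` once that later value arrives.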

PERFORMANCE ON REAL-WORLD STREAMING DATA SOURCES
[Figure, panels A to E: NYC taxi demand, passenger count in 30-minute windows, Monday 2015-04-20 to Sunday 2015-04-26]
Source: http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml

PERFORMANCE ON REAL-WORLD STREAMING DATA SOURCES
Cui et al., "Continuous online sequence learning with an unsupervised neural network model": https://arxiv.org/abs/1512.05463

FAST ADAPTATION TO CHANGES IN THE DATA STREAMS
New pattern introduced:
• 20% increase of night taxi demand
• 20% decrease of morning taxi demand
Cui et al., "Continuous online sequence learning with an unsupervised neural network model": https://arxiv.org/abs/1512.05463

HTM ENGINE FOR STREAMING ANALYTICS
• Example applications: datacenter server anomalies, rogue human behavior, geospatial tracking, stock anomalies, social media streams (Twitter)
• Pipeline: Data → Encoder → SDR → HTM High Order Sequence Memory
• Outputs: prediction, anomaly detection, classification

HTM ENGINE + RIVER VIEW
HTM Engine code: https://github.com/numenta/numenta-apps
River View service: http://data.numenta.org/

TAKE HOME POINTS
Streaming data is the future
HTM is powerful for predictive analytics
Open source!

THANK YOU!
• Collaborators:
  • Jeff Hawkins
  • Subutai Ahmad
  • Yuwei Cui
  • Scott Purdy
• Contact:
  • alavin@numenta.com
  • @theAlexLavin

RESOURCES
• Open Source Repositories:
  • Algorithm code: github.com/numenta/nupic
  • Applications: github.com/numenta/numenta-apps
  • NAB code + paper: github.com/numenta/nab
  • Apache Flink: github.com/htm-community/flink-htm
• Learning center: numenta.com/learn
• HTM Studio: http://numenta.com/htm-studio
• Partners:
  • Grok (anomalies in IT infrastructure): grokstream.com
  • Cortical.io (NLP): cortical.io
• Contact:
  • Alex Lavin: alavin@numenta.com @theAlexLavin
  • Subutai Ahmad: sahmad@numenta.com @SubutaiAhmad
  • HTM Forum: discourse.numenta.org

TRY HTM ANOMALY DETECTION WITH HTM STUDIO!
HTM Studio
• Easy-to-use desktop application
• No data upload required, no coding required
• Download application at http://numenta.com/htm-studio

ANOMALIES IN IT INFRASTRUCTURE
Grok
• Commercial server-based product that detects anomalies in IT infrastructure
• Runs thousands of HTM anomaly detectors in real time
• 10 milliseconds per input per metric, including continuous learning
• No parameter tuning required
• grokstream.com

ANOMALIES IN FINANCIAL AND SOCIAL MEDIA DATA
HTM for Stocks
• Real-time free demo application
• Continuously monitors top 200 stocks
• Available on iOS App Store or Google Play Store
• Open source application: github.com/numenta/numenta-apps

NUMENTA ANOMALY BENCHMARK (NAB)
• NAB: a rigorous benchmark for anomaly detection in streaming applications
• Real-world benchmark data set
  • 58 labeled data streams (47 real-world, 11 artificial streams)
  • Total of 365,551 data points
• Scoring mechanism
  • Rewards early detection
  • Different "application profiles"
• Open resource
  • AGPL repository contains data, source code, and documentation
  • github.com/numenta/NAB
• Ongoing competition to expand NAB
