4. THE STREAMING ANALYTICS PROBLEM
Given all past input and the current input, compute the state of the system right now.
Must report a decision and perform any retraining, bookkeeping, etc. before the next input arrives.
• No look-ahead – online, not batch
• No training/test set split
• System must be automated, and customized to each stream
• Unsupervised, continuous learning
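The constraints above can be illustrated with a minimal online loop. This is only a sketch: `RunningGaussian` and `process_stream` are hypothetical names, the model is a simple running-statistics stand-in (not HTM), and the threshold of 3.0 is an assumed value. The point is the ordering: each decision uses past inputs only, and learning happens before the next input is read.

```python
import math
from typing import Iterable, Iterator, Tuple


class RunningGaussian:
    """Stand-in online model (NOT HTM): running mean/variance via Welford's method."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def score(self, x: float) -> float:
        """Absolute z-score of x under the inputs seen so far."""
        if self.n < 2:
            return 0.0
        std = math.sqrt(self.m2 / (self.n - 1))
        return abs(x - self.mean) / std if std > 0 else 0.0

    def learn(self, x: float) -> None:
        """Update the running statistics with one new input."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)


def process_stream(stream: Iterable[float],
                   threshold: float = 3.0) -> Iterator[Tuple[float, bool]]:
    """Yield (score, is_anomaly) per input, with no look-ahead."""
    model = RunningGaussian()
    for x in stream:
        s = model.score(x)        # decision is based on past inputs only
        yield s, s > threshold    # report before the next input arrives
        model.learn(x)            # continuous learning, no train/test split


if __name__ == "__main__":
    for score, flag in process_stream([1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 10.0]):
        print(round(score, 2), flag)
```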
5. REAL-TIME ANALYTICS
• Enormous increase in the availability of streaming, time-series data
• Prediction is fundamental to real-time analytics, and valuable in all domains!
• Example domains: monitoring IT infrastructure, financial data, tracking vehicles, real-time health monitoring, energy consumption
8. HIERARCHICAL TEMPORAL MEMORY (HTM)
HTM is a powerful sequence memory derived from recent
findings in experimental neuroscience.
• High capacity memory-based system
• Models complex, high-order temporal sequences
• Inherently streaming
• Continuously learning and predicting
• No need to tune hyper-parameters
• Robust and fault-tolerant
• Runs in real time on a laptop
• Open source: github.com/numenta
9. HIERARCHICAL TEMPORAL MEMORY (HTM)
Want to dive into HTM?
• http://numenta.com/learn
• BaMI (Biological and Machine Intelligence)
• Research papers
• HTM School
• http://numenta.org for NuPIC
• https://discourse.numenta.org
11. HTM PREDICTS FUTURE INPUT
[Diagram: input $x_t$ feeds the HTM, which outputs $a(x_t)$ and $\pi(x_t)$; cells are shown as active, inactive, or depolarized (predicted)]
• Input to the system is a stream of data: $x_t$
• Encoded into a sparse, high-dimensional vector: $a(x_t)$
• Learns temporal sequences in the input stream
• Makes a prediction in the form of a sparse vector: $\pi(x_t)$
• $\pi(x_t)$ represents a prediction for the upcoming input: $a(x_{t+1})$
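To make the "sparse, high-dimensional vector" step concrete, here is a minimal sketch of a scalar encoder. It is a simplified illustration, not NuPIC's encoder API: the function name `encode_scalar` and the parameters (`n = 400` bits, `w = 21` active bits, range 0 to 100) are assumptions chosen for the example. Nearby values share active bits; distant values do not.

```python
import numpy as np


def encode_scalar(x, min_val=0.0, max_val=100.0, n=400, w=21):
    """Encode a scalar as a binary vector of length n with w contiguous active bits.

    Hypothetical helper for illustration; all parameter defaults are assumptions.
    """
    x = min(max(x, min_val), max_val)          # clip to the encoder's range
    n_buckets = n - w + 1                      # number of possible start positions
    start = int(round((x - min_val) / (max_val - min_val) * (n_buckets - 1)))
    sdr = np.zeros(n, dtype=np.uint8)
    sdr[start:start + w] = 1
    return sdr


def overlap(a, b):
    """Number of bits active in both vectors."""
    return int(np.count_nonzero(a & b))


if __name__ == "__main__":
    a, b, c = encode_scalar(50), encode_scalar(51), encode_scalar(90)
    print(overlap(a, b), overlap(a, c))  # similar inputs overlap more than distant ones
```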
12. ANOMALY DETECTION WITH HTM
[Diagram: $x_t$ → HTM → $a(x_t)$, $\pi(x_t)$ → raw anomaly score $s_t$ → anomaly likelihood $L_t$]
• $s_t$ is an instantaneous measure of prediction error
• 0 if the input was perfectly predicted
• 1 if it was completely unpredicted
• Could threshold it directly to report anomalies, but in very noisy environments we can do better
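A minimal sketch of the raw anomaly score, assuming the overlap-based definition given in the Ahmad & Purdy paper cited on the "Multiple streams" slide: $s_t = 1 - \frac{\pi(x_{t-1}) \cdot a(x_t)}{|a(x_t)|}$, i.e. the fraction of currently active bits that were not predicted at the previous step. The function name and the handling of an empty input are assumptions for illustration.

```python
import numpy as np


def raw_anomaly_score(prev_prediction, active):
    """Fraction of active bits in a(x_t) that were NOT predicted by pi(x_{t-1}).

    Both arguments are binary vectors of the same length.
    Returns 0.0 for a perfectly predicted input and 1.0 for a fully unpredicted one.
    """
    prev_prediction = np.asarray(prev_prediction, dtype=bool)
    active = np.asarray(active, dtype=bool)
    n_active = int(active.sum())
    if n_active == 0:
        return 0.0  # assumption: an empty input is treated as not anomalous
    n_overlap = int(np.logical_and(prev_prediction, active).sum())
    return 1.0 - n_overlap / n_active


if __name__ == "__main__":
    active = [1, 1, 0, 0, 1, 1, 0, 0]
    print(raw_anomaly_score([1, 1, 0, 0, 1, 1, 0, 0], active))  # fully predicted
    print(raw_anomaly_score([0, 0, 1, 1, 0, 0, 1, 1], active))  # fully unpredicted
```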
13. ANOMALY LIKELIHOOD
Second-order measure: did the predictability of the metric change?
1. Estimate historical distribution of raw anomaly scores
2. Check if recent scores are very different
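The two steps can be sketched as follows, assuming the historical raw scores are modeled as a Gaussian and the recent scores are summarized by a short moving average. The likelihood is then $L_t = 1 - Q\left(\frac{\tilde{\mu}_t - \mu_t}{\sigma_t}\right)$, where $Q$ is the Gaussian tail probability, $\mu_t$ and $\sigma_t$ come from the long window, and $\tilde{\mu}_t$ from the short window. The class name and window sizes here are assumptions for illustration, not tuned values.

```python
import math
from collections import deque


class AnomalyLikelihood:
    """Sketch: compare a short-term mean of raw scores to their long-term distribution."""

    def __init__(self, long_window=1000, short_window=10):
        self.history = deque(maxlen=long_window)  # step 1: historical raw scores
        self.short_window = short_window          # step 2: "recent" scores

    def update(self, raw_score):
        """Add one raw anomaly score and return the current likelihood in [0, 1]."""
        self.history.append(raw_score)
        scores = list(self.history)
        if len(scores) < 2 * self.short_window:
            return 0.5  # assumption: stay neutral until enough history exists
        mean = sum(scores) / len(scores)
        var = sum((s - mean) ** 2 for s in scores) / len(scores)
        std = max(math.sqrt(var), 1e-6)           # guard against zero variance
        recent = scores[-self.short_window:]
        recent_mean = sum(recent) / len(recent)
        z = (recent_mean - mean) / std
        # 1 - Q(z) equals the standard normal CDF evaluated at z
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    al = AnomalyLikelihood()
    for i in range(200):
        al.update(0.1 if i % 2 == 0 else 0.2)     # noisy but stable baseline
    for _ in range(10):
        likelihood = al.update(0.9)               # predictability suddenly drops
    print(likelihood)
```

An anomaly would be reported when the likelihood exceeds a threshold close to 1; the exact threshold is a per-application choice.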
16. MULTIPLE STREAMS
Ahmad & Purdy, "Real-Time Anomaly Detection for Streaming Analytics": https://arxiv.org/abs/1607.02480
17. PREDICTION USES SOFTMAX CLASSIFIER
[Diagram: $x_t$ → HTM → SDR $a(x_t)$ → classifier → $P(x_{t+k} \mid x_t)$]
• Classifier maps $a(x_t)$ to a probability distribution over inputs using a linear classifier plus softmax
• Classifier is trained to minimize negative log likelihood
• System can predict multiple time steps into the future
• Weights are updated continuously
• Can predict categories and scalar values
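A minimal sketch of such a classifier: a linear layer plus softmax over the SDR, updated one sample at a time by gradient descent on the negative log likelihood. The class name, learning rate, and zero initialization are assumptions for illustration; this is not the NuPIC classifier API.

```python
import numpy as np


class OnlineSoftmaxClassifier:
    """Linear classifier + softmax, trained continuously with one SGD step per input."""

    def __init__(self, n_inputs, n_classes, lr=0.1):
        self.W = np.zeros((n_classes, n_inputs))
        self.lr = lr

    def predict_proba(self, sdr):
        """Return a probability distribution over classes for one SDR."""
        logits = self.W @ np.asarray(sdr, dtype=float)
        logits -= logits.max()              # subtract max for numerical stability
        p = np.exp(logits)
        return p / p.sum()

    def learn(self, sdr, target):
        """One gradient step on the negative log likelihood of the target class."""
        sdr = np.asarray(sdr, dtype=float)
        grad_logits = self.predict_proba(sdr)
        grad_logits[target] -= 1.0          # d(NLL)/d(logits) = p - one_hot(target)
        self.W -= self.lr * np.outer(grad_logits, sdr)


if __name__ == "__main__":
    sdr_a = np.zeros(20); sdr_a[:5] = 1
    sdr_b = np.zeros(20); sdr_b[10:15] = 1
    clf = OnlineSoftmaxClassifier(n_inputs=20, n_classes=2)
    for _ in range(50):
        clf.learn(sdr_a, 0)
        clf.learn(sdr_b, 1)
    print(clf.predict_proba(sdr_a), clf.predict_proba(sdr_b))
```

To predict several steps ahead under this scheme, one option is to pair the SDR from an earlier time step with the class (or scalar bucket) observed now; scalar values would first be discretized into buckets, which is an assumption of this sketch.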
19. PERFORMANCE ON REAL-WORLD STREAMING DATA SOURCES
Cui et al., "Continuous online sequence learning with an unsupervised neural network model": https://arxiv.org/abs/1512.05463
20. FAST ADAPTATION TO CHANGES IN THE DATA STREAMS
New pattern introduced:
• 20% increase of night taxi demand
• 20% decrease of morning taxi demand
Cui et al., "Continuous online sequence learning with an unsupervised neural network model": https://arxiv.org/abs/1512.05463
22. HTM ENGINE FOR STREAMING ANALYTICS
Example applications:
• Datacenter server anomalies
• Rogue human behavior
• Geospatial tracking
• Stock anomalies
• Social media streams (Twitter)
Pipeline: Data → Encoder → SDR → HTM High Order Sequence Memory → Prediction, Anomaly detection, Classification
23. HTM ENGINE + RIVER VIEW
HTM Engine code: https://github.com/numenta/numenta-apps
River View service: http://data.numenta.org/
29. TRY HTM ANOMALY DETECTION WITH HTM STUDIO!
• HTM Studio
• Easy-to-use desktop application
• No data upload required, no coding required
• Download the application at http://numenta.com/htm-studio
30. ANOMALIES IN IT INFRASTRUCTURE
Grok
• Commercial server-based product that detects anomalies in IT infrastructure
• Runs thousands of HTM anomaly detectors in real time
• 10 milliseconds per input per metric, including continuous learning
• No parameter tuning required
• grokstream.com
31. ANOMALIES IN FINANCIAL AND SOCIAL MEDIA DATA
HTM for Stocks
• Real-time free demo application
• Continuously monitors top 200 stocks
• Available on iOS App Store or Google Play Store
• Open source application: github.com/numenta/numenta-apps
32. NUMENTA ANOMALY BENCHMARK (NAB)
• NAB: a rigorous benchmark for anomaly detection in streaming applications
• Real-world benchmark data set
• 58 labeled data streams (47 real-world, 11 artificial streams)
• Total of 365,551 data points
• Scoring mechanism
• Rewards early detection
• Different “application profiles”
• Open resource
• AGPL repository contains data, source code, and documentation
• github.com/numenta/NAB
• Ongoing competition to expand NAB
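To illustrate how a scoring mechanism can reward early detection and vary by application profile, here is a hedged sketch of a sigmoid-shaped weight over the position of a detection relative to a labeled anomaly window. The position convention (-1.0 at the window start, 0.0 at the window end, positive after the window), the constants, and the profile weights are all assumptions for illustration; consult the NAB repository for the actual scoring code.

```python
import math


def scaled_sigmoid(relative_position):
    """Weight in (-1, 1): near +1 for early detections, 0 at the window end,
    negative for detections after the window (assumed convention)."""
    if relative_position > 3.0:
        return -1.0  # far past the window: treated as a plain false positive
    return 2.0 / (1.0 + math.exp(5.0 * relative_position)) - 1.0


def detection_score(relative_position, tp_weight=1.0, fp_weight=0.11):
    """Score one detection under a hypothetical application profile.

    tp_weight scales rewards for detections inside the window;
    fp_weight scales penalties for detections after it.
    """
    s = scaled_sigmoid(relative_position)
    weight = tp_weight if relative_position <= 0.0 else fp_weight
    return weight * s


if __name__ == "__main__":
    for pos in (-1.0, -0.5, 0.0, 0.5, 5.0):
        print(pos, round(detection_score(pos), 4))
```

Changing `tp_weight` and `fp_weight` models different profiles, for example one that penalizes false positives more heavily.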