How to find what you didn't know to look for, practical anomaly detection

Published in: Technology, Education

    1. Anomaly Detection, Time-Series Data Bases and More: How to Find What You Didn’t Know to Look For (© 2014 MapR Technologies; © MapR Technologies, confidential)
    2. Agenda
        • What is anomaly detection?
        • Some examples
        • Some generalization
        • Compression == Truth
        • Deep dive into deep learning
        • Why this matters for time series databases
        This goes beyond what was described in the abstract…
    3. Who I am
        • Ted Dunning, Chief Application Architect, MapR
          tdunning@mapr.com, tdunning@apache.org, Twitter @ted_dunning
        • Committer, mentor, champion, PMC member on several Apache projects
        • Mahout, Drill, Zookeeper, Spark and others
        • Hashtag for today’s talk #HadoopSummit
    4. Who we are
        • MapR makes the technology-leading distribution including Hadoop
        • MapR integrates real-time data semantics directly into a system that also runs Hadoop programs seamlessly
        • The biggest and best choose MapR
          – Google, Amazon
          – Largest credit card, retailer, health insurance, telco
          – Ping me for info
    5. A New Look at Anomaly Detection
        • O’Reilly series Practical Machine Learning, 2nd volume
        • Print copies available at Hadoop Summit at MapR booth
        • eBook available at http://bit.ly/1jQ9QuL
        • Book signing by both authors at 4pm Wed (today)
    6. What is Anomaly Detection?
        • What just happened that shouldn’t?
          – but I don’t know what failure looks like (yet)
        • Find the problem before other people see it
          – especially customers and CEOs
        • But don’t wake me up if it isn’t really broken
    7. (no text on this slide)
    8. Looks pretty anomalous to me
    9. Will the real anomaly please stand up?
    10. What Are We Really Doing
        • We want action when something breaks (dies/falls over/otherwise gets in trouble)
        • But action is expensive
        • So we don’t want false alarms
        • And we don’t want false negatives
        • We need to trade off costs
    11. A Second Look
    12. A Second Look [Plot annotation: 99.9%-ile]
    13. How Hard Can it Be? [Diagram: input x feeds an online summarizer that estimates the 99.9%-ile threshold t; if x > t, raise an alarm]
    14. On-line Percentile Estimates
        • Apache Mahout has on-line percentile estimator
          – very high accuracy for extreme tails
          – new in version 0.9 !!
        • What’s the big deal with anomaly detection?
        • This looks like a solved problem
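The summarizer-plus-threshold loop from slides 13 and 14 can be sketched in a few lines. This is a minimal sketch and not Mahout's estimator: it computes an exact quantile over a bounded window of recent values, which is an assumption chosen for simplicity, and the class name `QuantileAlarm` and its parameters are hypothetical.

```python
import bisect
from collections import deque


class QuantileAlarm:
    """Flags values above a running high quantile of recent history.

    Hypothetical stand-in for an on-line summarizer: it keeps an exact,
    sorted copy of a bounded window instead of a streaming sketch.
    """

    def __init__(self, quantile=0.999, window=10000, warmup=1000):
        self.quantile = quantile
        self.window = window
        self.warmup = warmup
        self.recent = deque()  # values in arrival order
        self.ordered = []      # the same values, kept sorted

    def threshold(self):
        """Current threshold t, or None until enough data has been seen."""
        n = len(self.ordered)
        if n < self.warmup:
            return None
        return self.ordered[min(n - 1, int(self.quantile * n))]

    def update(self, x):
        """Test x against the current threshold, then add it to the summary."""
        t = self.threshold()
        alarm = t is not None and x > t
        self.recent.append(x)
        bisect.insort(self.ordered, x)
        if len(self.recent) > self.window:
            # Forget the oldest value so the threshold can adapt over time.
            oldest = self.recent.popleft()
            del self.ordered[bisect.bisect_left(self.ordered, oldest)]
        return alarm
```

Each value is tested before it is added, so a single extreme reading cannot raise the threshold that judges it. A production system would likely replace the sorted window with a streaming quantile sketch to keep memory and update cost low.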
    15. Already Done? Etsy Skyline?
    16. What About This? [Plot: offset+noise+pulse1+pulse2, with points A and B marked]
    17. Spot the Anomaly [Plot annotation: Anomaly?]
    18. Maybe not!
    19. Where’s Waldo? [Plot annotation: This is the real anomaly]
    20. Normal Isn’t Just Normal
        • What we want is a model of what is normal
        • What doesn’t fit the model is the anomaly
        • For simple signals, the model can be simple: $x \sim N(0, \varepsilon)$
        • The real world is rarely so accommodating
    21-35. We Do Windows (same title across fifteen consecutive slides)
    36. Windows on the World
        • The set of windowed signals is a nice model of our original signal
        • Clustering can find the prototypes
          – Fancier techniques available using sparse coding
        • The result is a dictionary of shapes
        • New signals can be encoded by shifting, scaling and adding shapes from the dictionary
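Building the dictionary of shapes can be sketched as follows. This is a rough illustration under stated assumptions, not the code behind the talk: the function names, the sin² taper, the window width, and the choice of a spherical variant of k-means (unit-length centroids, nearest by dot product) are all picked here for brevity.

```python
import math
import random


def windows(signal, width):
    """Half-overlapping windows, each tapered by a sin^2 window."""
    taper = [math.sin(math.pi * i / width) ** 2 for i in range(width)]
    step = width // 2
    return [
        [signal[start + i] * taper[i] for i in range(width)]
        for start in range(0, len(signal) - width + 1, step)
    ]


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v] if norm > 0 else list(v)


def build_dictionary(signal, width=32, k=8, iterations=10, seed=0):
    """Cluster normalized windows; the unit-length centroids are the shapes."""
    data = [normalize(w) for w in windows(signal, width)]
    rng = random.Random(seed)
    centroids = [list(c) for c in rng.sample(data, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for w in data:
            # For unit vectors, the largest dot product is the nearest centroid.
            best = max(
                range(k),
                key=lambda j: sum(a * b for a, b in zip(w, centroids[j])),
            )
            groups[best].append(w)
        for j, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster goes empty
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[j] = normalize(mean)
    return centroids
```

Normalizing both the windows and the centroids is what lets a later encoding step pick a shape with a dot product and recover its amplitude with a second dot product.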
    37. Most Common Shapes (for EKG)
    38. Reconstructed signal [Plot: original signal, reconstructed signal, and reconstruction error; annotation: < 1 bit / sample]
    39. An Anomaly
        Original technique for finding 1-d anomaly works against reconstruction error
    40. Close-up of anomaly
        Not what you want your heart to do. And not what the model expects it to do.
    41. A Different Kind of Anomaly
    42. Model Delta Anomaly Detection [Diagram: the model's output is subtracted from the input to give δ; an online summarizer estimates the 99.9%-ile threshold t; if δ > t, raise an alarm]
    43. The Real Inside Scoop
        • The model-delta anomaly detector is really just a sum of random variables
          – the model we know about already
          – and a normally distributed error
        • The output (delta) is (roughly) the log probability of the sum distribution (really δ²)
        • Thinking about probability distributions is good
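The "really δ²" remark can be made concrete with a short derivation. It assumes the error term is Gaussian with a fixed variance σ²; the symbols m (model output) and σ are introduced here for illustration and do not appear on the slide.

```latex
\begin{align*}
x &= m + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2), \qquad \delta = x - m \\
p(x \mid m) &= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{\delta^2}{2\sigma^2}\right) \\
-\log p(x \mid m) &= \frac{\delta^2}{2\sigma^2} + \frac{1}{2}\log\left(2\pi\sigma^2\right)
\end{align*}
```

Under that assumption, a threshold on δ² (or on |δ|) is the same as a threshold on -log p, up to a scale factor and an additive constant.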
    44. Example: Event Stream (timing)
        • Events of various types arrive at irregular intervals
          – we can assume Poisson distribution
        • The key question is whether frequency has changed relative to expected values
        • Want alert as soon as possible
    45. Poisson Distribution
        • Time between events is exponentially distributed: $\Delta t \sim \lambda e^{-\lambda t}$
        • This means that long delays are exponentially rare: $P(\Delta t > T) = e^{-\lambda T}$, so $-\log P(\Delta t > T) = \lambda T$
        • If we know λ we can select a good threshold
          – or we can pick a threshold empirically
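The relation -log P(Δt > T) = λT turns elapsed silence directly into an anomaly score. The sketch below assumes a constant rate estimated from past event times; the function names are hypothetical and chosen only for this illustration.

```python
import math


def estimate_rate(event_times):
    """Maximum-likelihood rate: number of gaps divided by total elapsed time."""
    gaps = len(event_times) - 1
    return gaps / (event_times[-1] - event_times[0])


def silence_score(rate, last_event, now):
    """-log P(gap > now - last_event), which is rate * elapsed time."""
    return rate * (now - last_event)


def threshold_for(tail_probability):
    """Score above which a silence has probability below tail_probability."""
    return -math.log(tail_probability)


# Made-up event times: three gaps over four time units, so the rate is 0.75.
rate = estimate_rate([0.0, 1.0, 2.0, 4.0])
score = silence_score(rate, last_event=4.0, now=14.0)
print(score > threshold_for(0.001))  # ten quiet time units is beyond the 99.9% tail
```

Because the score grows linearly while nothing arrives, it can be re-evaluated on a timer, which is how an alert can fire before the next event shows up.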
    46. Converting Event Times to Anomaly [Plot annotations: 99.9%-ile, 99.99%-ile]
    47. Recap (out of order)
        • Anomaly detection is best done with a probability model
        • -log p is a good way to convert to anomaly measure
        • Adaptive quantile estimation works for auto-setting thresholds
    48. Recap
        • Different systems require different models
        • Continuous time-series
          – sparse coding to build signal model
        • Events in time
          – rate model based on variable rate Poisson
          – segregated rate model
        • Events with labels
          – language modeling
          – hidden Markov models
    49. But Wait! Compression is Truth
        • Maximizing log πk is minimizing compressed size
          – (each symbol takes -log πk bits on average)
        • Maximizing log πk happens where πk = pk
          – (maximum likelihood principle)
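A small numeric check of the compression claim: symbols drawn from p and coded with lengths -log2 πk cost the fewest bits on average when π equals p. The distributions below are made up for illustration, and `expected_bits` is a hypothetical helper name.

```python
import math


def expected_bits(p, model):
    """Average bits per symbol when symbols drawn from p are coded
    with lengths -log2 model[k]."""
    return -sum(pk * math.log2(qk) for pk, qk in zip(p, model) if pk > 0)


p = [0.5, 0.25, 0.125, 0.125]       # true symbol distribution (made up)
uniform = [0.25, 0.25, 0.25, 0.25]  # a mismatched model

print(expected_bits(p, p))        # 1.75
print(expected_bits(p, uniform))  # 2.0
```

The matched model needs 1.75 bits per symbol and the mismatched one needs 2.0, so the better probability model is also the better compressor.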
    50. But Auto-encoders Find Max Likelihood
        • Minimal error => maximum likelihood
        • Maximum likelihood => maximum compression
        • So good anomaly detectors give good compression
    51. In Case You Want the Details
        $$E_p[\log \pi_k] = \sum_k p_k \log \pi_k$$
        $$\log x \le x - 1$$
        $$\sum_k p_k \log \frac{\pi_k}{p_k} \le \sum_k p_k \left(\frac{\pi_k}{p_k} - 1\right) = \sum_k \pi_k - \sum_k p_k = 0$$
        $$\sum_k p_k \log \pi_k - \sum_k p_k \log p_k \le 0$$
        $$\sum_k p_k \log \pi_k \le \sum_k p_k \log p_k$$
        $$E_p[\log \pi_k] \approx \frac{1}{n} \sum_i \log \pi_{x_i}$$
    52. Pause To Reflect on Clustering
        • Use windowing to apportion signal
          – Hamming windows add up to 1
        • Find nearest cluster for each window
          – Can use dot product because all clusters normalized
        • Scale cluster to right size
          – Dot product again
        • Subtract from original signal
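The four steps above can be sketched for a dictionary that is already built. This is a minimal sketch under stated assumptions: it uses a sin² taper, chosen here because half-overlapping copies of it sum to exactly 1, it expects unit-length shapes, and the function names are hypothetical.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def reconstruct(signal, dictionary, width):
    """Return (reconstruction, error) using one scaled shape per window."""
    taper = [math.sin(math.pi * i / width) ** 2 for i in range(width)]
    step = width // 2
    rebuilt = [0.0] * len(signal)
    for start in range(0, len(signal) - width + 1, step):
        # 1. Apportion the signal with a tapered, half-overlapping window.
        window = [signal[start + i] * taper[i] for i in range(width)]
        # 2. Nearest unit-length shape is the one with the largest dot product.
        shape = max(dictionary, key=lambda s: dot(window, s))
        # 3. A second dot product gives the amplitude for that shape.
        scale = dot(window, shape)
        for i in range(width):
            rebuilt[start + i] += scale * shape[i]
    # 4. Subtract from the original signal to get the reconstruction error.
    error = [x - r for x, r in zip(signal, rebuilt)]
    return rebuilt, error
```

The first and last half-window of the signal are covered by only one window, so the error there reflects the taper rather than the model. The interior error is the series that a percentile threshold would then watch.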
    53. Auto-encoding - Information Bottleneck [Diagram: original signal → encoder → reconstructed signal]
    54. Clustering as Neural Network [Diagram: input x1 … xn → hidden layer (clusters) → reconstruction x'1 … x'n]
        • Hidden layer is 1 of k
        • Could be m of k
        • Sparsity allows k >> 100
    55. Overlapping Networks [Diagram: time series input → overlapping networks → reconstructed time series]
    56. Deep Learning [Diagram: layered network with nodes labelled A, B and A']
    57. What About the Database?
        • We don’t have to keep the reconstruction
        • We can keep the first level nodes
          – And the reconstruction error
        • To keep the first level nodes
          – We can keep the second level nodes
          – Plus the reconstruction error
    58. What Does it Matter?
        • Even one level of auto-encoding compresses
          – 30-50x in EKG example with k-means
        • Multiple levels compress more
          – Understanding => Truth => Compression
        • Higher levels give semantic search
    59. How Do I Build Such a System
        • The key is to combine real-time and long-time
          – real-time evaluates data stream against model
          – long-time is how we build the model
        • Extended Lambda architecture is my favorite
        • See my other talks on slideshare.net for info
        • Ping me directly
    60. Hadoop is Not Very Real-time [Timeline diagram along t up to now; labels: Fully processed, Latest full period, Unprocessed Data, “Hadoop job takes this long for this data”]
    61. Real-time and Long-time together [Timeline diagram along t up to now; labels: “Hadoop works great back here”, “Storm works here”, Blended view]
    62. Who I am
        • Ted Dunning, Chief Application Architect, MapR
          tdunning@mapr.com, tdunning@apache.org, @ted_dunning
        • Committer, mentor, champion, PMC member on several Apache projects
        • Mahout, Drill, Zookeeper and others
    63. © MapR Technologies, confidential