BERLIN BUZZWORDS 2014
SELECTED TALKS OVERVIEW
tech talk @ ferret
Andrii Gakhov
19/06/2014

Buzzwords 2014 / Overview / part2

Short overview of selected talks from Berlin Buzzwords 2014

  1. BERLIN BUZZWORDS 2014: SELECTED TALKS OVERVIEW. tech talk @ ferret, Andrii Gakhov, 19/06/2014
  2. DEEP LEARNING FOR HIGH PERFORMANCE TIME-SERIES DATABASES by Ted Dunning
  3. ABOUT THE AUTHOR • Chief Application Architect at MapR Technologies • Ph.D. in computing science from the University of Sheffield • Committer on Mahout, Drill, Zookeeper … • http://tdunning.blogspot.de/ • @ted_dunning
  4. ANOMALY DETECTION
  5. ANOMALY DETECTION [Diagram: the input signal x feeds an online summarizer (t-digest) that estimates the 99.9%-ile threshold t; an alert (!) is raised when x > t] • The t-digest algorithm was developed by Ted Dunning and is available in Apache Mahout • With the t-digest algorithm one can accurately estimate quantiles for very large data sets with limited memory use
  6. ISSUES WITH SIMPLE THRESHOLDS
  7. LOOKS LIKE ANOMALY?
  8. NOT SURE
  9. WHAT IS NORMAL? • We need to have a model of what is normal • Everything that doesn’t fit the model is an anomaly • For simple signals we can just assume a normal distribution
  10. WINDOWS
  11. WINDOWS
  12. WINDOWS • The set of windowed signals is a model of the original signal • Clustering can find the prototypes • The result is a dictionary of shapes • New signals can be encoded by shifting, scaling and adding shapes from the dictionary
  13. COMMON SHAPES (EKG)
  14. RECONSTRUCTION
  15. ANOMALY
  16. ANOMALY
  17. MODEL ANOMALY DETECTION [Diagram: the model reconstructs the input signal x as x’; the reconstruction error ∆ = x - x’ feeds an online summarizer that estimates the 99.9%-ile threshold t; an alert (!) is raised when ∆ > t]
  18. COMPRESSION Minimal error, maximum likelihood, maximum compression • Good anomaly detectors give good compression • So, we are constructing an auto-encoder!
  19. MODEL ANOMALY DETECTION [Diagram: the same pipeline as on slide 17, with the model replaced by an encoder that uses the shape dictionary]
  20. CLUSTERING • Use windowing • Find the nearest cluster for each window • Scale the cluster to the right size • Subtract from the original signal
  21. CLUSTERING AS NEURAL NETWORK
  22. OVERLAPPING NETWORKS [Diagram: time series input, reconstructed time series]
  23. READ MORE? • A New Look At Anomaly Detection • This is the second book in the series Practical Machine Learning by Ted Dunning & Ellen Friedman • FREE download from www.mapr.com
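
The simple threshold detector from slide 5 can be sketched roughly as follows. This is a minimal illustration, not the talk's implementation: it assumes NumPy is available, and it uses an exact percentile over all earlier samples as a stand-in for the t-digest estimate. The function name `quantile_threshold_alerts` and the `warmup` parameter are hypothetical.

```python
import numpy as np

def quantile_threshold_alerts(signal, q=99.9, warmup=100):
    """Flag samples that exceed the q-th percentile of all earlier samples.

    Assumption: np.percentile over the full history replaces the t-digest
    summarizer from the slides; a real system would update a compact
    summary online instead of re-reading the history.
    """
    signal = np.asarray(signal, dtype=float)
    alerts = np.zeros(signal.shape[0], dtype=bool)
    for i in range(warmup, signal.shape[0]):
        t = np.percentile(signal[:i], q)  # threshold t learned from the past
        alerts[i] = signal[i] > t         # the "x > t?" test from the diagram
    return alerts

# Usage: a sine wave with one injected spike at index 400
x = np.sin(2 * np.pi * np.arange(500) / 50)
x[400] = 10.0
alerts = quantile_threshold_alerts(x)
print("spike flagged:", bool(alerts[400]))
```

Recomputing the percentile for every sample is quadratic in the signal length, which is the cost an online summarizer such as t-digest is meant to avoid.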
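
The windowing and clustering steps from slides 12 and 20 might look like the sketch below. It is written under several assumptions that the slides do not state: windows are normalized to unit length before clustering, the clustering is a plain Lloyd's k-means written in NumPy, and the "nearest" shape is taken to be the one whose scaled copy leaves the smallest residual. All function names are hypothetical.

```python
import numpy as np

def extract_windows(signal, width, step):
    """Cut the signal into (possibly overlapping) windows of equal width."""
    signal = np.asarray(signal, dtype=float)
    starts = range(0, len(signal) - width + 1, step)
    return np.array([signal[s:s + width] for s in starts])

def build_shape_dictionary(windows, k, iters=20, seed=0):
    """Plain Lloyd's k-means on unit-norm windows; returns k prototype shapes."""
    rng = np.random.default_rng(seed)
    norms = np.linalg.norm(windows, axis=1, keepdims=True)
    unit = windows / np.maximum(norms, 1e-12)
    centroids = unit[rng.choice(len(unit), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every window to every centroid
        dists = ((unit[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = unit[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def encode_window(window, shapes):
    """Find the best shape, scale it to the window, subtract it.

    Returns (shape index, scale, residual).
    """
    window = np.asarray(window, dtype=float)
    # Least-squares scale for each shape: (w . c) / (c . c)
    scales = shapes @ window / np.maximum((shapes * shapes).sum(axis=1), 1e-12)
    residuals = window[None, :] - scales[:, None] * shapes
    best = int((residuals ** 2).sum(axis=1).argmin())
    return best, float(scales[best]), residuals[best]

# Usage: learn shapes from a clean sine wave, then encode two windows
clean = np.sin(2 * np.pi * np.arange(2000) / 50)
windows = extract_windows(clean, width=50, step=5)
shapes = build_shape_dictionary(windows, k=10)

normal = windows[3]
odd = normal.copy()
odd[25] += 20.0  # injected spike
for name, w in (("normal", normal), ("spiked", odd)):
    _, _, residual = encode_window(w, shapes)
    print(name, "residual norm:", float(np.linalg.norm(residual)))
```

The residual is what remains after the scaled shape is subtracted from the window, so a window that the dictionary cannot explain leaves a large residual.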
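
The model-based detector from slides 17 and 19 applies the same percentile test to the reconstruction error rather than to the raw signal. The sketch below assumes the reconstruction x’ has already been produced by some model (for example a shape dictionary), and that the threshold is learned from errors on reference data considered normal. The function name and its parameters are hypothetical, and the exact percentile is again a stand-in for an online summarizer.

```python
import numpy as np

def reconstruction_alerts(x, x_hat, ref_x, ref_x_hat, q=99.9):
    """Alert where the reconstruction error exceeds a learned threshold.

    x, x_hat         : signal under test and its reconstruction x'
    ref_x, ref_x_hat : reference (assumed normal) signal and its reconstruction
    Returns (alerts, delta, t).
    """
    ref_delta = np.abs(np.asarray(ref_x, dtype=float) - np.asarray(ref_x_hat, dtype=float))
    t = np.percentile(ref_delta, q)  # 99.9%-ile of the normal error
    delta = np.abs(np.asarray(x, dtype=float) - np.asarray(x_hat, dtype=float))
    return delta > t, delta, t       # the "delta > t?" test from the diagram

# Usage: the reference error is a constant 0.1, so only the large error is flagged
ref_x = np.zeros(1000)
ref_x_hat = np.full(1000, 0.1)
alerts, delta, t = reconstruction_alerts(
    np.array([0.0, 0.0, 5.0]), np.zeros(3), ref_x, ref_x_hat
)
print(alerts.tolist())
```

Taking the threshold from reference errors rather than from the data under test avoids flagging a fixed fraction of every batch regardless of its content.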
