Anomaly Detection in Seasonal Time Series

Humberto Cardoso Marchezi
Manchester, UK
25 March 2019

Applicability
● Financial Fraud
● Manufacturing Inspection
● Network Intrusion Detection
● Web Service Disaster Discovery (DevOps)
● etc.

DevOps Use Case
Machines are monitored through graphical analysis of CPU levels, memory consumption, etc.

DevOps Use Case
However, relying on human eyes to watch dashboards does not scale. Automation is needed!

Automated System
[Diagram: Machine Collectors send metric data points to the Incident Detection System; incidents go to the Alarm Generator, which notifies the DevOps Engineer]

Case 1
[Plot: reboots/year per machine]
Machines that reboot too much are anomalies of interest. How to isolate them?

Case 1
[Plot: reboots/year per machine, with lines at mean, mean + 2 std and mean - 2 std]
Machines that reboot too much are anomalies of interest. How to isolate them?

Case 1
[Plot: reboots/year per machine, with a threshold line]
A calculated linear threshold isolates the anomalous data points
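
A minimal sketch of such a threshold, assuming the mean + 2 std line from the previous slide is the one used. The function name `flag_high_values` and the reboot counts are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev

def flag_high_values(values, k=2.0):
    """Return (threshold, indices) for values above mean + k * std."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    threshold = mu + k * sigma
    flagged = [i for i, v in enumerate(values) if v > threshold]
    return threshold, flagged

# Hypothetical reboots/year for ten machines; the machine at index 7
# reboots far more often than the others.
reboots_per_year = [3, 4, 2, 5, 3, 4, 3, 40, 2, 4]
threshold, anomalous_machines = flag_high_values(reboots_per_year)
print(f"threshold = {threshold:.2f}, anomalous machines = {anomalous_machines}")
```

Only the upper bound is checked here because the slide is concerned with machines that reboot too much.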

Case 2
How to isolate global and local anomalies in a seasonal signal?

Case 2
A linear threshold can identify global anomalies, since they fall outside the signal's normal variation

Case 2
However, no linear threshold can isolate the local anomalies. How to proceed?

Seasonality and Frequency
1 data point every hour, daily seasonality: frequency = 24
Review of the signal characteristics: daily seasonality, one data point per hour, no visible trend
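
The frequency is the number of data points in one seasonal cycle. A small sketch of that arithmetic follows; the helper name `seasonal_frequency` is hypothetical.

```python
from datetime import timedelta

def seasonal_frequency(sampling_interval, season_length):
    """Number of data points in one seasonal cycle."""
    points = season_length / sampling_interval  # timedelta / timedelta gives a float
    if points != int(points):
        raise ValueError("season length is not a whole number of samples")
    return int(points)

# One data point per hour with daily seasonality, as on the slide.
frequency = seasonal_frequency(timedelta(hours=1), timedelta(days=1))
print(frequency)
```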

Additive vs Multiplicative Time Series
● International Airline Passengers per Month (multiplicative): seasonality magnitude increases with trend
● Austria Industrial Production per Quarter (additive): seasonality effect remains constant despite trend

Seasonal Trend Decomposition
● Multiplicative Model
observed = trend * seasonal * residual
● Additive Model
observed = trend + seasonal + residual
trend - long-term signal behavior
seasonal - identified repetitive behavior
residual - everything else that does not fit the trend or seasonal components
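
To make the additive model concrete, here is a sketch of a classical additive decomposition in plain Python. It is an assumption about how the components could be estimated (centered moving average for the trend, per-phase averages for the seasonal part), not necessarily the method used in the talk. The function `decompose_additive` and the synthetic signal are hypothetical.

```python
import math
from statistics import mean

def decompose_additive(observed, period):
    """Classical additive decomposition: observed = trend + seasonal + residual.

    Trend is a centered moving average over one period; positions too close
    to the edges have no trend estimate and are returned as None.
    """
    n = len(observed)
    half = period // 2
    trend = [None] * n
    for i in range(half, n - half):
        window = observed[i - half:i + half + 1]
        if period % 2 == 0:
            # Even period: half-weight the two end points of the window.
            total = sum(window) - 0.5 * (window[0] + window[-1])
        else:
            total = sum(window)
        trend[i] = total / period

    # Seasonal component: average detrended value for each phase of the
    # cycle, shifted so the seasonal values average to zero over one period.
    by_phase = [[] for _ in range(period)]
    for i in range(n):
        if trend[i] is not None:
            by_phase[i % period].append(observed[i] - trend[i])
    phase_means = [mean(vals) for vals in by_phase]
    offset = mean(phase_means)
    seasonal = [phase_means[i % period] - offset for i in range(n)]

    residual = [
        None if trend[i] is None else observed[i] - trend[i] - seasonal[i]
        for i in range(n)
    ]
    return trend, seasonal, residual

# Synthetic hourly signal: slow linear trend plus a daily sine wave, four days long.
observed = [10 + 0.05 * t + 5 * math.sin(2 * math.pi * t / 24) for t in range(96)]
trend, seasonal, residual = decompose_additive(observed, period=24)
```

The series needs to cover at least two full periods so that every phase of the cycle has a detrended value to average. For the multiplicative model the same idea applies with division in place of subtraction.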

Case 2
frequency = 24
model = additive
Recap of the problem: how to identify such local anomalies?

Seasonal-Trend Decomposition
[Plot panels: original, trend, seasonal, residual]

[Plot panels: original, trend, seasonal, residual]
Residual is the component of interest for anomaly detection

[Plot panels: original, residual]
Both global and local anomalies show up in the residual component

[Plot: residual]
Residual is free from trend and seasonal behavior

[Plot: residual, with lines at median, median + 6 mad and median - 6 mad]
Therefore, anomalies can now be found with linear thresholds
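
A sketch of the median ± 6 mad thresholds, assuming "mad" means the plain (unscaled) median absolute deviation. The function name `mad_thresholds` and the residual values are hypothetical.

```python
from statistics import median

def mad_thresholds(residual, factor=6.0):
    """Return (lower, upper) bounds at median -/+ factor * MAD."""
    center = median(residual)
    # Median absolute deviation: median distance from the median.
    mad = median([abs(r - center) for r in residual])
    return center - factor * mad, center + factor * mad

# Hypothetical residual values with two obvious outliers.
residual = [0.1, -0.2, 0.0, 0.3, -0.1, 5.0, 0.2, -0.3, -4.0, 0.1]
lower, upper = mad_thresholds(residual)
anomalies = [i for i, r in enumerate(residual) if r < lower or r > upper]
print(lower, upper, anomalies)
```

The median and MAD are used here instead of the mean and standard deviation because they are much less affected by the outliers being searched for.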

Residual Extraction
Mapping the anomalies found in the residual back to the original signal identifies all data points of interest
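
Since the residual shares its index with the original signal, the mapping step can be sketched as a simple lookup. The function name, timestamps, values and bounds below are all hypothetical.

```python
from datetime import datetime, timedelta

def anomalies_in_original(timestamps, observed, residual, lower, upper):
    """Pair each out-of-bounds residual with the original timestamp and value."""
    return [
        (timestamps[i], observed[i])
        for i, r in enumerate(residual)
        # None marks positions where no residual could be computed.
        if r is not None and (r < lower or r > upper)
    ]

start = datetime(2019, 3, 25)
timestamps = [start + timedelta(hours=h) for h in range(6)]
observed = [20.0, 22.0, 35.0, 24.0, 21.0, 5.0]
residual = [0.1, -0.2, 9.5, 0.3, None, -12.0]
found = anomalies_in_original(timestamps, observed, residual, lower=-1.5, upper=1.5)
for when, value in found:
    print(when.isoformat(), value)
```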

Residual Extraction
Pros:
● Works well with seasonal time series - global and local anomalies
● Few parameters to optimize (compared to other models)
● Algorithm implementation is simple, given that statistics libraries are available
Cons:
● Need to know how to adjust the period parameter for each time series
● Need to know how to adjust the anomaly factor so as to avoid noisy results
● Works only for seasonal time series where the residual follows a normal distribution

References / Q&A
Notebook Demo - https://github.com/hcmarchezi/jupyter_notebooks/blob/master/residual_extraction_demo_1.ipynb
Anomaly Detection: A Tutorial - http://icdm2011.cs.ualberta.ca/downloads/ICDM2011_anomaly_detection_tutorial.pdf
Twitter Anomaly Detection - https://github.com/twitter/AnomalyDetection
Automatic Anomaly Detection in the Cloud Via Statistical Learning - https://arxiv.org/pdf/1704.07706.pdf
Generalized ESD for Outliers - https://www.itl.nist.gov/div898/handbook/eda/section3/eda35h3.htm
Real Time Anomaly Detection System for Time Series at Scale - http://proceedings.mlr.press/v71/toledano18a/toledano18a.pdf
Time Series Dataset - https://datamarket.com/data/
