Anomaly Detection in Deep
Adam Gibson Skymind - Reactive Meetup 2016 @ Google Tokyo
What’s an “Anomaly?”
● Abnormal Patterns in Data
● Fraud Detection - “Bad credit card Transactions”
● ALSO Fraud detection - Detecting fake locations with call
● Network Intrusion - Abnormal Activity in a network
● Broken Computers in a data center
Brief Case Studies - e.g., why am I up here?
● Telco: http://blogs.wsj.com/cio/2016/03/14/orange-tests-deep-
● Network Infrastructure: https://insights.ubuntu.
Network Infra - Save time and money by auto-migrating
workloads before they break
Why Deep Learning?
● Learns well from lots of data
● Learns its own feature representation: robust to noise and able
to learn cross-domain patterns
● Already applied in ads: Google itself invests heavily in this same
kind of pattern recognition (targeting/relevance)
● Unsupervised - Use autoencoder reconstruction error, smoothed with moving
averages over a set time window; train with dropout
● Supervised - RNNs learn from yes/no labels in a time series. They can learn
from a series of time steps and predict when an anomaly is about to occur.
● Use streaming/minibatches (all neural nets can learn like this)
● Reconstruction Error: Autoencoders learn how to reconstruct data
through unsupervised pretraining.
Minimize KL divergence (the delta between two probability distributions)
● RNN/Time Series: See http://deeplearning4j.org/usingrnns
● Kafka/Spark Streaming/Flink/Apex
● Neural net works as a consumer of streaming updates
● Data? Mostly log ingestion, could be video
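
The unsupervised approach above (reconstruction error compared against a moving average over a set time window, fed by streaming minibatches) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the speaker's implementation: the class name `MovingAverageDetector`, the parameters `window`, `num_std`, and `min_history`, and the error values are all hypothetical, and the autoencoder that would produce the reconstruction errors is left out entirely.

```python
from collections import deque
from math import sqrt


class MovingAverageDetector:
    """Flags reconstruction errors that sit far above a moving average.

    Hypothetical helper: it only consumes error values; the autoencoder
    that produces them is assumed to exist elsewhere.
    """

    def __init__(self, window=20, num_std=3.0, min_history=5):
        self.history = deque(maxlen=window)  # the "set time window"
        self.num_std = num_std               # how many std devs count as abnormal
        self.min_history = min_history       # warm-up before flagging anything

    def is_anomaly(self, error):
        flagged = False
        if len(self.history) >= self.min_history:
            mean = sum(self.history) / len(self.history)
            var = sum((e - mean) ** 2 for e in self.history) / len(self.history)
            flagged = error > mean + self.num_std * sqrt(var)
        if not flagged:
            # Keep anomalies out of the baseline so they do not inflate it.
            self.history.append(error)
        return flagged

    def consume(self, minibatch):
        """Process one minibatch of reconstruction errors from a stream."""
        return [self.is_anomaly(e) for e in minibatch]


if __name__ == "__main__":
    detector = MovingAverageDetector(window=10, num_std=3.0, min_history=5)
    # Made-up error values standing in for autoencoder output.
    for batch in ([0.10, 0.12, 0.11, 0.09, 0.10, 0.11], [0.95, 0.10]):
        print(detector.consume(batch))
```

In a real pipeline the minibatches would arrive from a stream consumer (for example Kafka or Spark Streaming, as listed above), and the threshold and window size would need tuning against the data.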