Anomaly Detection in Deep Learning
Adam Gibson - Skymind
Deep Learning book
Dl4j
Skymind
We take deep learning models to production on premise
Using Scala (think Python for production)
Java Virtual Machine stack (e.g., first-class access to big data systems) connected to C++ for native compute
We make SKIL (Skymind Intelligence Layer): a production system for building deep learning applications
What’s an “Anomaly”?
Abnormal patterns in data
Fraud detection - “bad” credit card transactions
Also fraud detection - detecting fake locations with call detail records
Network intrusion - abnormal activity in a network
Broken computers in a data center
Brief Case Studies - eg: Why am I up here?
Telco: http://blogs.wsj.com/cio/2016/03/14/orange-tests-deep-learning-software-to-identify-fraud/
Network Infrastructure: https://insights.ubuntu.com/2016/04/25/making-deep-learning-accessible-on-openstack/
Network Infra - Save time and money by automatically migrating workloads before they break
Why Deep Learning?
Learns well from lots of data
Learns its own feature representation: robust to noise and able to pick up cross-domain patterns
Already applied in ads: Google itself invests heavily in this same kind of pattern recognition (targeting/relevance)
Techniques
Unsupervised - Use autoencoder reconstruction error and moving averages with dropout over a set time window
Supervised - RNNs learn from yes/no labels in a time series. Given a series of time steps, they can predict when an anomaly is about to occur.
Use streaming/minibatches (all neural nets can learn like this)
AutoEncoder Anomaly Detection
Moving average anomaly with KL Divergence
Autoencoder learns to reconstruct its input (i.e., the input doubles as the label)
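A minimal sketch of the scoring side of this slide, assuming an autoencoder has already been trained elsewhere and produces a reconstruction for each input. Only the reconstruction error and a moving-average baseline are shown. The function names, the `window` size, and the `factor` threshold are illustrative assumptions, not values from the talk.

```python
from collections import deque


def reconstruction_error(x, x_hat):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def flag_anomalies(errors, window=5, factor=3.0):
    """Flag each error that exceeds `factor` times the moving average
    of the previous `window` errors (a hypothetical thresholding rule)."""
    recent = deque(maxlen=window)
    flags = []
    for err in errors:
        if len(recent) == window:
            baseline = sum(recent) / window
            flags.append(err > factor * baseline)
        else:
            flags.append(False)  # not enough history yet
        recent.append(err)
    return flags
```

In use, each incoming record would be passed through the autoencoder, its error computed with `reconstruction_error`, and the resulting error stream fed to `flag_anomalies`. A spike in error relative to recent history suggests the record does not look like the data the autoencoder learned to reconstruct.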
Recurrent Net Anomalies
Learn a softmax over a time series:
Given a fixed window of the sequence, the goal is to predict the probability that an anomaly occurs
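The idea can be sketched as a forward pass only: a single recurrent unit reads a fixed window step by step, and a two-class softmax turns the final hidden state into a probability of "anomaly". This is an assumption-laden toy, not the DL4J implementation. The weights below are hand-picked and untrained, and all names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sliding_windows(series, size):
    """Split a series into overlapping fixed-size windows."""
    return [series[i:i + size] for i in range(len(series) - size + 1)]


def rnn_anomaly_probability(window, w_in, w_rec, w_out):
    """Run a one-unit recurrent cell over the window, then apply a
    softmax over [normal, anomaly] logits and return P(anomaly)."""
    h = 0.0
    for x in window:
        h = math.tanh(w_in * x + w_rec * h)  # hidden state carries history
    logits = [w_out[0] * h, w_out[1] * h]
    return softmax(logits)[1]


# Hypothetical, untrained weights chosen only to make the example run.
W_IN, W_REC, W_OUT = 1.0, 0.5, (-2.0, 2.0)

series = [0.0, 0.1, 0.0, 0.1, 3.0, 3.2]
probs = [rnn_anomaly_probability(w, W_IN, W_REC, W_OUT)
         for w in sliding_windows(series, size=3)]
```

In a trained network the weights would come from supervised learning on labeled windows; here they only illustrate how a window maps to one probability per position in the series.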
Sequences Time Series/Windows with RNNs
See: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
Some definitions
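Reconstruction Error: Autoencoders learn without labels (unsupervised pretraining) how to reconstruct their input. The reconstruction error measures how far the output is from that input.
KL Divergence: a measure of how one probability distribution differs from another. Training minimizes it.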
RNN/Time Series: See http://deeplearning4j.org/usingrnns
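For the KL divergence definition above, a small sketch for discrete distributions, using the standard formula KL(p || q) = sum over i of p[i] * log(p[i] / q[i]). The function name is an assumption for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same outcomes.

    Terms where p[i] == 0 contribute nothing; q[i] must be > 0
    wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

Note that the measure is not symmetric: `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ, so "delta" should be read informally rather than as a distance.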
Production
Kafka/Spark Streaming/Flink/Apex
Neural networks as consumers of streaming updates
Data? Mostly log ingestion, could be video
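A sketch of the "consumer of streaming updates" idea, with heavy assumptions: an in-memory `queue.Queue` stands in for a streaming topic, and `RunningMeanModel` is a hypothetical stand-in for a network that supports incremental fitting on minibatches. No real Kafka, Spark, Flink, or Apex API is used.

```python
import queue


def minibatches(events, batch_size):
    """Group an iterable of events into lists of up to `batch_size`."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


def drain(topic):
    """Yield events from an in-memory queue until it is empty.
    (Stand-in for polling a real streaming system.)"""
    while True:
        try:
            yield topic.get_nowait()
        except queue.Empty:
            return


class RunningMeanModel:
    """Hypothetical stand-in for a network with an incremental fit step."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def fit_batch(self, batch):
        for value in batch:
            self.count += 1
            self.mean += (value - self.mean) / self.count

    def score(self, value):
        return abs(value - self.mean)  # distance from what was learned


topic = queue.Queue()
for value in [1.0, 1.2, 0.8, 1.0, 1.0]:
    topic.put(value)

model = RunningMeanModel()
batch_sizes = []
for batch in minibatches(drain(topic), batch_size=2):
    model.fit_batch(batch)
    batch_sizes.append(len(batch))
```

The point of the design is that the model never needs the full dataset: it is updated one minibatch at a time as events arrive, which is how a neural network would be trained from a stream.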
Demo!
Kibana
Kafka
Elasticsearch
Logstash
NiFi
Cassandra
Lagom
Dl4j Ecosystem (DataVec, Nd4j, Dl4j, Arbiter)
Reference Architecture for Anomaly Detection
External World
Ingest from external with NiFi, send to Kafka
Make a prediction about the data
Index the prediction in Elasticsearch with Logstash
Render the data with Kibana
Store raw events in Cassandra
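The flow in this diagram can be sketched end to end with in-memory stand-ins. Everything here is an assumption for illustration: a `queue.Queue` plays the Kafka topic, plain lists play Cassandra and Elasticsearch, and `predict` is a hypothetical threshold rule in place of a trained model. None of the real systems' APIs are used.

```python
import queue

topic = queue.Queue()      # stands in for a Kafka topic
raw_store = []             # stands in for Cassandra
search_index = []          # stands in for Elasticsearch (fed via Logstash)


def ingest(external_events):
    """Stand-in for the NiFi step: push external events onto the topic."""
    for event in external_events:
        topic.put(event)


def predict(event, threshold=100.0):
    """Hypothetical model: flag events whose value exceeds a threshold."""
    return {"id": event["id"], "anomaly": event["value"] > threshold}


def run_pipeline():
    """Consume the topic: store each raw event, index each prediction."""
    while not topic.empty():
        event = topic.get()
        raw_store.append(event)
        search_index.append(predict(event))


ingest([{"id": 1, "value": 12.0}, {"id": 2, "value": 950.0}])
run_pipeline()
anomalies = [doc["id"] for doc in search_index if doc["anomaly"]]
```

The rendering step (Kibana in the diagram) would read from the index. Here, `anomalies` plays that role by listing the ids of flagged events.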
Summary
Real ML pipeline
Cassandra for storing raw data results
ELK (Elasticsearch, Logstash, Kibana) stack for alerting and visualization
Kafka for model ingestion
Lagom for serving model predictions
NiFi for designing data pipelines
Questions?
Email: adam@skymind.io
Twitter: agibsonccc
Github: agibsonccc
