Anomaly Detection in Deep Learning
Adam Gibson Skymind - Reactive Meetup 2016 @ Google Tokyo
What’s an “Anomaly?”
● Abnormal Patterns in Data
● Fraud Detection - “Bad credit card Transactions”
● ALSO Fraud detection - Detecting fake locations with call
detail records
● Network Intrusion - Abnormal Activity in a network
● Broken Computers in a data center
Brief Case Studies - eg: Why am I up here?
● Telco: http://blogs.wsj.com/cio/2016/03/14/orange-tests-deep-learning-software-to-identify-fraud/
● Network Infrastructure: https://insights.ubuntu.com/2016/04/25/making-deep-learning-accessible-on-openstack/
Network Infra - Save time and money by automatically migrating
workloads before they break
Why Deep Learning?
● Learns well from lots of data
● Own feature representation: Robust to noise and allows for
learning cross domain patterns
● Already applied in ads: Google itself invests lots in this same
kind of pattern recognition (targeting/relevance)
Techniques
● Unsupervised - Use autoencoder reconstruction error together with moving averages;
use dropout with a set time window
● Supervised - RNNs learn from a set of yes/no labels in a time series. They can learn
from a series of time steps and predict when an anomaly is about to occur.
● Use streaming/minibatches (all neural nets can learn like this)
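As a rough illustration of the unsupervised bullet, the sketch below scores each time step by reconstruction error and flags steps whose error jumps well above a moving average of the preceding window. It assumes reconstructions from an already-trained autoencoder are available; the toy `x_hat` array, the function names, and the `window` and `k` settings are all hypothetical, not values from the talk.

```python
import numpy as np

def reconstruction_error(x, x_hat):
    """Mean squared error per row between inputs and their reconstructions."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    return ((x - x_hat) ** 2).mean(axis=1)

def flag_anomalies(errors, window=5, k=3.0):
    """Flag step t when errors[t] > k * mean(errors[t-window:t]).

    The first `window` steps are never flagged because there is no
    baseline yet. `window` and `k` are hypothetical tuning knobs.
    """
    errors = np.asarray(errors, dtype=float)
    flags = np.zeros(len(errors), dtype=bool)
    for t in range(window, len(errors)):
        baseline = errors[t - window:t].mean()  # moving average of past errors
        flags[t] = errors[t] > k * baseline
    return flags

# Toy data: pretend the autoencoder reproduces normal rows almost exactly
# but fails badly on row 7. A real x_hat would come from a trained model.
x = np.ones((10, 4))
x_hat = x + 0.1
x_hat[7] = x[7] + 1.0

errors = reconstruction_error(x, x_hat)
flags = flag_anomalies(errors, window=5, k=3.0)
print(np.flatnonzero(flags))
```

Comparing against a trailing average rather than a fixed threshold lets the baseline drift with the data, at the cost of briefly raising the baseline right after a flagged spike.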
Some definitions
● Reconstruction Error: Autoencoders learn, through
unsupervised pretraining, how to reconstruct their input data.
Training minimizes KL divergence (a measure of how one probability
distribution differs from another)
● RNN/Time Series: See http://deeplearning4j.org/usingrnns
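For reference, KL divergence between two discrete distributions p and q is the sum over i of p_i * log(p_i / q_i). A minimal sketch in plain Python follows; the function name and the example distributions are illustrative assumptions, not taken from the talk.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i), in nats.

    p and q are sequences of probabilities over the same outcomes.
    Terms with p_i == 0 contribute nothing; if q_i == 0 where p_i > 0
    the divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

same = kl_divergence([0.5, 0.5], [0.5, 0.5])
forward = kl_divergence([0.5, 0.5], [0.9, 0.1])
reverse = kl_divergence([0.9, 0.1], [0.5, 0.5])
print(same, forward, reverse)
```

The divergence is zero only when the two distributions match, and it is not symmetric: swapping p and q generally gives a different value, so it is a "delta" only in a loose sense.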
Production
● Kafka/Spark Streaming/Flink/Apex
● Neural net works as consumer of streaming updates
● Data? Mostly log ingestion, could be video
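The consumer pattern above might look roughly like the following. This is only a sketch: a plain Python generator stands in for a Kafka/Spark Streaming/Flink/Apex source, and `fake_score` stands in for a trained network's anomaly score. Every name, the `latency_ms` field, and the threshold are hypothetical.

```python
from itertools import islice

def minibatches(stream, batch_size):
    """Group an iterable of records into lists of at most batch_size."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

def consume(stream, score, threshold, batch_size=3):
    """Score each minibatch and collect records whose score exceeds threshold."""
    alerts = []
    for batch in minibatches(stream, batch_size):
        for record, s in zip(batch, score(batch)):
            if s > threshold:
                alerts.append(record)
    return alerts

# Hypothetical stand-in for a log stream read from a message queue.
def fake_log_stream():
    for latency_ms in [12, 11, 13, 12, 950, 11, 12]:
        yield {"latency_ms": latency_ms}

# Hypothetical stand-in for a neural net: one score per record in the batch.
def fake_score(batch):
    return [record["latency_ms"] / 1000.0 for record in batch]

alerts = consume(fake_log_stream(), fake_score, threshold=0.5)
print(alerts)
```

Scoring per minibatch rather than per record matches how neural nets are usually fed, and the same loop shape applies whether the batches come from a generator or from a real streaming client.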
Questions?
Email: adam@skymind.io
Twitter: agibsonccc
Github: agibsonccc
Upcoming talks
Hadoop Summit: San Jose http://hadoopsummit.org/san-jose/ourspeakers/
