
Anomaly detection in deep learning (Updated) English

English version of my Japanese deep learning slides for anomaly detection at Wacul.

Published in: Data & Analytics

  1. Anomaly Detection in Deep Learning. Adam Gibson, Skymind
  2. Deep Learning book
  3. Dl4j
  4. Skymind: We take deep learning models to production on premise, using Scala (think Python for production) on a Java Virtual Machine stack (e.g., first-class access to big data systems) connected to C++ for native compute. We make SKIL (Skymind Intelligence Layer), a production system for building deep learning applications.
  5. What's an "Anomaly"? Abnormal patterns in data. Fraud detection: "bad" credit card transactions. Also fraud detection: detecting fake locations with call detail records. Network intrusion: abnormal activity in a network. Broken computers in a data center.
  6. Brief Case Studies (e.g., why am I up here?) Telco: http://blogs.wsj.com/cio/2016/03/14/orange-tests-deep-learning-software-to-identify-fraud/ Network infrastructure: https://insights.ubuntu.com/2016/04/25/making-deep-learning-accessible-on-openstack/
  7. Network Infra: Save time and money by automatically migrating workloads before they break.
  8. Why Deep Learning? Learns well from lots of data. Learns its own feature representation: robust to noise and able to learn cross-domain patterns. Already applied in ads: Google itself invests heavily in this same kind of pattern recognition (targeting/relevance).
  9. Techniques. Unsupervised: use autoencoder reconstruction error and moving averages with dropout over a set time window. Supervised: RNNs learn from a set of yes/no labels in a time series; they can learn from a series of time steps and predict when an anomaly is about to occur. Use streaming/minibatches (all neural nets can learn like this).
  10. AutoEncoder Anomaly Detection: Moving-average anomaly detection with KL divergence. The autoencoder learns to reconstruct its data (i.e., the input serves as its own label).
  11. Recurrent Net Anomalies: Learn a softmax over a time series. Given a fixed window, the goal is to predict the probability of an anomaly occurring given a sequence.
  12. Sequences: Time series/windows with RNNs. See: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
  13. Some definitions. Reconstruction error: autoencoders can learn from unsupervised pretraining how to reconstruct data. Minimize the KL divergence (a measure of how one probability distribution differs from another). RNN/time series: see http://deeplearning4j.org/usingrnns
  14. Production: Kafka/Spark Streaming/Flink/Apex. Neural networks as consumers of streaming updates. Data? Mostly log ingestion; could be video.
  15. Demo! Kibana, Kafka, Elasticsearch, Logstash, NiFi, Cassandra, Lagom, Dl4j ecosystem (DataVec, Nd4j, Dl4j, Arbiter)
  16. Reference Architecture for Anomaly Detection: External world. Ingest from external sources with NiFi. Send to Kafka. Make a prediction about the data. Index the prediction in Elasticsearch with Logstash. Render the data with Kibana. Store raw events in Cassandra.
  17. Summary: Real ML pipeline. Cassandra for storing raw data results. ELK (Elasticsearch, Logstash, Kibana) stack for alerting and visualization. Kafka for model ingestion. Lagom for serving model predictions. NiFi for designing data pipelines.
  18. Questions? Email: adam@skymind.io, Twitter: agibsonccc, GitHub: agibsonccc
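
The unsupervised technique from the Techniques and AutoEncoder slides (reconstruction error combined with a moving average over a time window) can be sketched roughly as follows. This is a minimal illustration, not the speaker's implementation: it assumes the per-time-step reconstruction errors have already been produced by some autoencoder, and the names `flag_anomalies`, `window`, and `k` are hypothetical. The rule of flagging a point when its error exceeds the trailing mean by `k` standard deviations is also an assumption; the slides do not specify a threshold.

```python
from collections import deque
from statistics import mean, pstdev


def flag_anomalies(errors, window=5, k=3.0):
    """Return indices whose error exceeds the trailing mean by k std devs.

    `errors` is assumed to be a sequence of autoencoder reconstruction
    errors, one per time step (hypothetical input, not from the slides).
    """
    history = deque(maxlen=window)  # trailing window of recent errors
    flagged = []
    for i, err in enumerate(errors):
        # Only score once the trailing window is full.
        if len(history) == window:
            avg = mean(history)
            spread = pstdev(history)
            # Small floor keeps the threshold usable when the window is flat.
            if err > avg + k * max(spread, 1e-9):
                flagged.append(i)
        history.append(err)
    return flagged


if __name__ == "__main__":
    errors = [0.10, 0.11, 0.09, 0.10, 0.12, 0.11, 0.95, 0.10, 0.11]
    print(flag_anomalies(errors))
```

Because the window only looks backward, the same function can be fed one value at a time from a stream. Note that a flagged value still enters the window and inflates the next few thresholds; excluding flagged points from the history is a possible variation.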

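The KL divergence mentioned under "Some definitions" can be made concrete for discrete distributions P and Q over the same outcomes: D_KL(P || Q) = sum_i p_i * log(p_i / q_i). The sketch below is illustrative only; the function name `kl_divergence` and the example distributions are assumptions, not taken from the slides. It shows that the value is zero for identical distributions and that it is not symmetric in its arguments.

```python
import math


def kl_divergence(p, q):
    """Discrete KL divergence D_KL(P || Q) in nats.

    Assumes p and q are same-length probability vectors and that
    q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute 0.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.9, 0.1]
    print(kl_divergence(p, p))  # identical distributions
    print(kl_divergence(p, q), kl_divergence(q, p))  # generally unequal
```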