One of the key requirements for a good anomaly detector is a smart data representation that easily exposes deviations from the normal distribution. Traditional supervised approaches require strong assumptions about what is normal and what is not, plus a non-negligible effort in labeling the training dataset. Deep auto-encoders work very well at learning high-level abstractions and non-linear relationships in the data without requiring labels. In this talk we will review a few popular techniques used in shallow machine learning and propose two semi-supervised approaches for novelty detection: one based on reconstruction error and another based on lower-dimensional feature compression.
2. What you will (briefly) learn
▶ What is an anomaly (and an outlier)
▶ Popular techniques used in shallow machine learning
▶ Why deep learning can make the difference
▶ Anomaly detection using deep auto-encoders
▶ H2O overview
▶ ECG pulse detection PoC example
3. 1. Machine Learning – An Introduction
2. Neural Networks
3. Deep Learning Fundamentals
4. Unsupervised Feature Learning
5. Image Recognition
6. Recurrent Neural Networks and Language Models
7. Deep Learning for Board Games
8. Deep Learning for Computer Games
9. Anomaly Detection
10. Building a Production-ready Intrusion Detection System
4. Why this use case?
▶ Anomaly detection is crucial to many business applications
▶ Smart feature representation => better anomaly detection
▶ Deep Learning works very well at learning relationships in the underlying raw data (we will see how…)
5. Outlier vs Anomaly
“An outlier is a legitimate data point that’s far away from the mean or median in a distribution. It may be unusual, like a 9.6-second 100-meter dash, but still within the realm of reality. An anomaly is an illegitimate data point that’s generated by a different process than whatever generated the rest of the data.”
Ravi Parikh
http://data.heapanalytics.com/garbage-in-garbage-out-how-anomalies-can-wreck-your-data
6. Data modeling
▶ Point anomaly (e.g. a black sheep)
▶ Contextual anomaly (e.g. selling ice creams in January)
▶ Collective anomaly (e.g. a sequence of suspicious credit card activities)
7. Detection modeling (and its limitations)
▶ Supervised (classification): data skewness, lack of counter-examples
▶ Unsupervised (clustering): curse of dimensionality
▶ Semi-supervised (novelty detection): requires a “normal” training dataset
8. Real-world applications
▶ Manufacturing => hardware faults
▶ Law-enforcement => reveal criminal activities
▶ Network system => detect intrusions or anomalous
behaviors
▶ Internet Security => malware detection
▶ Financial services => frauds
▶ Marketing / business strategy => spotting profitable
customers
▶ Healthcare => Medical diagnosis
9. What’s the challenge?
“Coming up with features is difficult, time-consuming, requires expert knowledge. When working applications of learning, we spend a lot of time tuning features.”
Andrew Ng, Machine Learning and AI via Brain simulations, Stanford University
12. Signal propagation
[Figure: schematic diagram of a back-propagation neural network with two hidden layers, from “Factor selection for delay analysis using Knowledge Discovery in Databases”]
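The forward and backward passes sketched in the diagram can be written out in a few lines of NumPy. This is a minimal illustrative sketch (the layer sizes, `forward`, and the gradient loop are all hypothetical names chosen here, not from the talk): the signal propagates layer by layer through two hidden layers, then the output error is pushed backwards through the same weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy network shaped like the diagram: input, two hidden layers, output.
sizes = [4, 3, 3, 4]
Ws = [rng.normal(0.0, 0.5, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x):
    """Signal propagation: push the input layer by layer (tanh units)."""
    activations = [x]
    for W in Ws:
        x = np.tanh(x @ W)
        activations.append(x)
    return activations

x = rng.normal(size=(1, 4))
acts = forward(x)

# Error back-propagation: the output error travels backwards through the
# same weights, yielding a gradient for every layer.
grad = (acts[-1] - x) * (1.0 - acts[-1] ** 2)  # tanh derivative at the output
grads_W = []
for W, a in zip(reversed(Ws), reversed(acts[:-1])):
    grads_W.append(a.T @ grad)            # gradient for this layer's weights
    grad = (grad @ W.T) * (1.0 - a ** 2)  # error propagated one layer back
grads_W.reverse()
```

Using the reconstruction error `acts[-1] - x` as the output error is exactly what an auto-encoder does, which is why the same machinery reappears in the next slides.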
13. Auto-encoders
• Signal propagation output: approximate the identity function
• Error back-propagation: mean squared error (MSE)(*) between the original datum and the reconstructed one
(*) in the case of numerical data
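As a rough sketch of the two bullets above, the snippet below trains a network to reproduce its own input and then measures the per-point MSE between each datum and its reconstruction. It uses scikit-learn's `MLPRegressor` as a stand-in auto-encoder (the talk itself uses H2O; `reconstruction_mse` and the synthetic data are hypothetical choices made here for illustration).

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(42)
# "Normal" data living near a 1-D line embedded in 10 dimensions.
t = rng.uniform(-1.0, 1.0, size=(300, 1))
X = t @ rng.normal(size=(1, 10)) + 0.01 * rng.normal(size=(300, 10))

# An MLP trained to reproduce its own input is a basic auto-encoder:
# the target of fit() is the input itself (identity function).
ae = MLPRegressor(hidden_layer_sizes=(5, 2, 5), activation="tanh",
                  max_iter=2000, random_state=0)
ae.fit(X, X)

def reconstruction_mse(model, points):
    """Per-point MSE between the original datum and its reconstruction."""
    return ((points - model.predict(points)) ** 2).mean(axis=1)

errors = reconstruction_mse(ae, X)
```

The narrow middle layer (2 units here) is what forces the network to learn structure rather than memorize a trivial identity mapping.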
14. Novelty detection using auto-encoders
1. Identify a training dataset of what is considered “normal”
2. Learn what “normal” means, i.e. learn the structure of normal behavior
3. Try to reconstruct never-seen points using the same structure; a high reconstruction error means the point deviates from the normal distribution
[Diagram: the trained auto-encoder reconstructs points drawn from the normal distribution with low error, and anomalous points with high error]
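The three steps above can be sketched end to end as follows. This is an assumed implementation, again using scikit-learn as a stand-in for the H2O auto-encoder from the talk; the `score` helper and the 99th-percentile threshold are illustrative choices, not prescriptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Step 1: a training set containing only "normal" behavior.
X_train = rng.normal(0.0, 1.0, size=(1000, 4))

# Step 2: learn the structure of normal data by training an
# auto-encoder (input == target) with a narrow bottleneck.
ae = MLPRegressor(hidden_layer_sizes=(8, 2, 8), max_iter=500, random_state=0)
ae.fit(X_train, X_train)

def score(points):
    """Reconstruction error per point: high error = deviates from normal."""
    return ((points - ae.predict(points)) ** 2).mean(axis=1)

# Step 3: flag never-seen points whose error exceeds a threshold picked
# from the training errors (the 99th percentile is an arbitrary choice).
threshold = np.percentile(score(X_train), 99)
X_new = np.vstack([rng.normal(0.0, 1.0, size=(5, 4)),    # like the training data
                   rng.normal(10.0, 1.0, size=(5, 4))])  # far from it
flags = score(X_new) > threshold
```

In practice the threshold is a tuning knob that trades false alarms against missed anomalies, which is why the evaluation KPIs mentioned later matter.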
15. Features compression
■ Use just the encoder to compress data into a reduced-dimensional space, then apply traditional unsupervised learning
Tom Mitchell’s example of an auto-encoder: any combination of the 8 binary inputs can be represented using only 3 decimal values
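One way to sketch this idea: train an auto-encoder, keep only the encoder half, and feed the compressed representation to a classic clustering algorithm. The `encode` helper below is hypothetical, it simply replays the first half of a trained scikit-learn `MLPRegressor` (stand-in for the H2O model in the talk) by hand using the fitted `coefs_` and `intercepts_`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
# Two "normal" behaviors in 6-D, to be compressed before clustering.
X = np.vstack([rng.normal(-2.0, 0.3, size=(200, 6)),
               rng.normal(2.0, 0.3, size=(200, 6))])

ae = MLPRegressor(hidden_layer_sizes=(4, 2, 4), activation="tanh",
                  max_iter=1000, random_state=0)
ae.fit(X, X)

def encode(model, points, n_encoder_layers=2):
    """Run only the first (encoder) half of the trained network."""
    a = points
    for W, b in list(zip(model.coefs_, model.intercepts_))[:n_encoder_layers]:
        a = np.tanh(a @ W + b)
    return a

Z = encode(ae, X)  # compressed 2-D representation of the 6-D data
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(Z)
```

Clustering in the 2-D bottleneck space instead of the raw 6-D space is exactly the curse-of-dimensionality mitigation mentioned on the earlier detection-modeling slide.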
18. Summary
▶ We listed a few real-world applications of anomaly detection
▶ We covered some of the most popular techniques in the literature, with their limitations
▶ We gave an overview of how deep neural networks work and why they are great at learning smart feature representations
▶ We proposed two semi-supervised approaches using deep auto-encoders:
▶ Novelty detection
▶ Feature compression
19. Going deeper
▶ Advanced modeling:
▶ Denoising auto-encoders
▶ Contractive auto-encoders
▶ Sparse auto-encoders
▶ Variational auto-encoders (for better novelty detection)
▶ Stacked auto-encoders (for better feature compression)
▶ Building a production-ready intrusion detection system:
▶ Validating and testing with labels and in the absence of ground truth
▶ Evaluation KPIs for anomaly detection
▶ A/B(C/D) testing
20. E-book discount
▶ Use the code KVGRSF30 and get a 30% discount on the e-book
▶ Only valid for 500 uses until 31st October, 2017
▶ https://www.packtpub.com/big-data-and-business-intelligence/python-deep-learning
21. "Data scientists realize that their best days coincide with discovery of truly odd features in the data."
Haystacks and Needles: Anomaly Detection, by Gerhard Pilcher & Kenny Darrell, Data Mining Analyst, Elder Research, Inc.