# Anomaly Detection by ADGM / LVAE

Naoto Mizuno, PFN Summer Internship 2016



1. Anomaly Detection by ADGM / LVAE
   Naoto Mizuno (Mentors: Tanaka-san, Okanohara-san)
2. Introduction
   • Anomaly detection
   • Data
     • NAB dataset (artificial)
     • (Other datasets cannot be shown in this presentation)
   • Models
     • Auxiliary VAE (ADGM)
     • Ladder VAE
     • VAE (previous work)
3. Variational Auto-Encoder (VAE)
   • We assume that the data x are generated from latent variables z.
   • Neural networks serve as the encoder q_φ(z|x) and the decoder p_θ(x|z).
   [Figure: the encoder maps data x to the latent variable z; the decoder maps z back to x]
4. VAE
   • We use a lower bound of log p_θ(x) as the loss function:
     log p_θ(x) ≥ E_{q_φ(z|x)}[log (p_θ(x, z) / q_φ(z|x))],   p_θ(x, z) = p_θ(x|z) p_θ(z)
   • p_θ(z): standard normal distribution
   • During training, z is sampled from q_φ(z|x).
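The slide's lower bound can be estimated with a single Monte Carlo sample via the reparameterization trick. The following is a minimal numpy sketch, not the presenter's code: it assumes a diagonal-Gaussian encoder, a standard normal prior, and a unit-variance Gaussian decoder, and all function names are hypothetical.

```python
import numpy as np

def gaussian_logpdf(x, mu, sigma):
    """log N(x; mu, sigma^2), summed over dimensions."""
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - (x - mu)**2 / (2 * sigma**2))

def elbo_single_sample(x, mu_z, sigma_z, decode, rng):
    """One-sample estimate of log p(x) >= E_q[log p(x, z) - log q(z|x)],
    with p(x, z) = p(x|z) p(z) and z drawn from q(z|x) by reparameterization."""
    eps = rng.standard_normal(mu_z.shape)
    z = mu_z + sigma_z * eps                       # z ~ q(z|x)
    log_pxz = gaussian_logpdf(x, decode(z), 1.0)   # log p(x|z), unit-variance decoder
    log_pz = gaussian_logpdf(z, 0.0, 1.0)          # log p(z), standard normal prior
    log_qzx = gaussian_logpdf(z, mu_z, sigma_z)    # log q(z|x)
    return log_pxz + log_pz - log_qzx
```

In a real model, `decode` would be the decoder network and (`mu_z`, `sigma_z`) the encoder outputs; averaging over several samples reduces the variance of the estimate.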
5. ADGM
   • Semi-supervised learning
   • Predicts the label y and reconstructs the data x.
   • The auxiliary variable a increases the flexibility of the model.
   [Figure: graphical models of ADGM and SDGM over data x, latent variable z, auxiliary variable a, and label y]
6. Objective function of ADGM
   • For labeled data: lower bound + classification loss
     L(x, y) = −E_{q_φ(a,z|x,y)}[log (p_θ(x, y, a, z) / q_φ(a, z|x, y))] − α E_{q_φ(a|x)}[log q_φ(y|a, x)]
   • For unlabeled data:
     U(x) = −E_{q_φ(a,y,z|x)}[log (p_θ(x, y, a, z) / q_φ(a, y, z|x))]
   • Total:
     J = Σ_{(x_l, y_l)} L(x_l, y_l) + Σ_{x_u} U(x_u)
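For a discrete label y, the unlabeled bound is usually computed by marginalizing over all labels with the classifier weights, plus an entropy term (the standard decomposition used in semi-supervised VAEs). A sketch under that assumption, with hypothetical names:

```python
import numpy as np

def unlabeled_bound(per_label_bounds, q_y):
    """-U(x) for discrete y, assuming the standard decomposition:
    sum_y q(y|a, x) * bound(x, y) + H(q(y|a, x)).

    per_label_bounds[k]: estimate of E[log p(x, y=k, a, z) / q(a, z|x, y=k)]
    q_y[k]: classifier probability q(y=k | a, x)
    """
    entropy = -np.sum(q_y * np.log(q_y + 1e-12))   # H(q(y|a, x))
    return np.sum(q_y * per_label_bounds) + entropy
```

The total objective J is then just the sum of L over labeled minibatch examples plus U over unlabeled ones, as on the slide.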
7. ADGM for MNIST
   • Semi-supervised learning: 100 labeled, 60,000 unlabeled
   • Test error: ADGM 0.96 %, SDGM 1.32 %
   • Image generation: sample z from the Gaussian prior and generate with each label y
   [Figure: generated digits for SDGM and for a model without the auxiliary variable]
8. Auxiliary VAE
   • Unsupervised learning
   • Several sampling layers (1 or 2)
   [Figure: graphical models with one and two sampling layers over x, z, and a]
9. Ladder VAE
   • Several sampling layers (up to ~5)
   • A VAE with several sampling layers is difficult to train.
   • The encoder and decoder share information.
   [Figure: ladder structure connecting the deterministic encoder path d with the stochastic decoder path z]
10. Ladder VAE
   • The encoder uses the decoder output as a prior:
     σ_q² = 1 / (σ̂_q⁻² + σ_p⁻²)
     μ_q = (μ̂_q σ̂_q⁻² + μ_p σ_p⁻²) / (σ̂_q⁻² + σ_p⁻²)
   [Figure: the prior (μ_p, σ_p) from the decoder and the likelihood (μ̂_q, σ̂_q) from the encoder are combined into the posterior (μ_q, σ_q), from which z is sampled]
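The combination on this slide is a precision-weighted product of two Gaussians. A small numpy sketch of just that step (function name hypothetical):

```python
import numpy as np

def precision_weighted(mu_hat, sigma_hat, mu_p, sigma_p):
    """Combine the bottom-up estimate (mu_hat, sigma_hat) with the
    top-down prior (mu_p, sigma_p) by precision weighting."""
    prec = sigma_hat**-2 + sigma_p**-2             # total precision
    sigma_q = prec**-0.5                           # sigma_q^2 = 1 / prec
    mu_q = (mu_hat * sigma_hat**-2 + mu_p * sigma_p**-2) / prec
    return mu_q, sigma_q
```

When both sources have equal variance, the posterior mean is the simple average and the variance is halved, which matches the intuition of fusing two independent estimates.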
11. Anomaly detection
   • The model is trained without anomaly data.
   • The model cannot reconstruct anomaly data.
   • Anomaly score = −E_{q_φ(z|x)}[log p_θ(x|z)]
   [Figure: MNIST digits with added noise and their anomaly scores]
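The score can be estimated by averaging the reconstruction log-likelihood over samples of z from the encoder. A numpy sketch under the same assumptions as before (unit-variance Gaussian decoder, hypothetical names):

```python
import numpy as np

def anomaly_score(x, mu_z, sigma_z, decode, n_samples=16, rng=None):
    """Negative average reconstruction log-likelihood over z ~ q(z|x).
    High score = poorly reconstructed = likely anomaly."""
    if rng is None:
        rng = np.random.default_rng(0)
    total = 0.0
    for _ in range(n_samples):
        z = mu_z + sigma_z * rng.standard_normal(mu_z.shape)  # z ~ q(z|x)
        x_hat = decode(z)
        total += np.sum(-0.5 * np.log(2 * np.pi) - 0.5 * (x - x_hat)**2)
    return -total / n_samples
```

Data the model reconstructs well gets a low score; data far from anything seen in training gets a high one, which is why the scores spike at the anomalies on the following NAB slides.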
12. NAB dataset (artificial)
   • We convert the raw data to a spectrogram.
   • Spectrogram: the amplitude at each frequency and time.
   • Input: the amplitudes at a single time step.
   [Figure: raw data with anomaly, its spectrogram (time × frequency), and a single input column]
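The slide does not give the exact parameters, but the preprocessing amounts to a short-time Fourier transform where each column of amplitudes becomes one model input. A sketch with assumed window and hop sizes:

```python
import numpy as np

def spectrogram(signal, window=64, hop=32):
    """Short-time Fourier magnitudes: one column of amplitudes per frame.
    Each column (the amplitudes at one time step) is one model input."""
    frames = []
    for start in range(0, len(signal) - window + 1, hop):
        frame = signal[start:start + window] * np.hanning(window)  # taper edges
        frames.append(np.abs(np.fft.rfft(frame)))                  # amplitude per frequency bin
    return np.array(frames).T   # shape: (window // 2 + 1, n_frames)
```

A sinusoid at k cycles per window then shows up as a peak in bin k of every column.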
13. NAB
   • The scores increase at the anomaly.
   [Figure: train/test anomaly scores for ADGM and LVAE]
14. NAB
   • In this case the models cannot detect the anomaly.
   • Small input values tend to result in small scores.
   [Figure: train/test anomaly scores for ADGM and LVAE]
15. Conclusion
   • Anomaly detection using ADGM / LVAE.
   • Anomalies are detected as low-probability data.
   • Performance is almost the same as the VAE.
   • More sampling layers may be better (?)