Yuma Koizumi
A Brief Introduction of Anomalous Sound Detection:
Recent Studies and Future Prospects
AI seminar @ AIRC, AIST
15:00-17:00, Feb. 26th, 2021
Proprietary + Confidential
Special thanks
❏ Former colleagues at NTT Laboratories
❏ Dr. Kunio Kashino, Dr. Noboru Harada, Dr. Hisashi Uematsu, Akira Nakagawa,
Shoichiro Saito, Dr. Yasunori Ohishi, Daisuke Niizumi, Yuta Kawachi, Masataka
Yamaguchi, Masahiro Yasuda, Daiki Takeuchi, Luc Forget, Luca Mazzon, and
more...
❏ DCASE Challenge task co-organizers
❏ Dr. Yohei Kawaguchi, Dr. Harsh Purohit, Toshiki Nakamura, Yuki Nikaido, Ryo
Tanabe, Kaori Suefusa, Takashi Endo (Hitachi, Ltd.), and Dr. Keisuke Imoto
(Doshisha University)
Self-introduction
❏ Name: Yuma Koizumi (小泉 悠馬)
❏ Nov. 2020 - present: Research Scientist at Google Research
❏ Apr. 2014 - Nov. 2020: Research Scientist at NTT Media Intelligence Laboratories
❏ Ph.D. degree, the University of Electro-Communications, Sept. 2017
❏ M.S. degree, Hosei University, Mar. 2014
❏ Research Topics
❏ Speech enhancement
❏ Anomalous sound detection (ASD)
❏ Audio captioning (1st place in the DCASE 2020 Challenge!)
ASD & me
DCASE 2020 Challenge
ASD & me
DCASE 2021 Challenge
Agenda
01 Overview of ASD
02 Unsupervised ASD
03 +Domain shift
04 +Anomalous samples
05 Future Prospects
01 Overview of ASD
Anomaly detection
...what is “anomaly”?
What is an anomaly?
❏ Anomaly
❏ Something that is noticeable because it is different from what is usual [1]
❏ Anomalies are patterns in data that do not conform to a well-defined
notion of normal behavior [2]
[1] Longman Dictionary of Contemporary English
[2] V. Chandola, et al., “Anomaly detection: A survey,” ACM Comput. Surv., 2009
anomaly = not normal
Anomalous sounds
Gunshot
Photo by Alejo Reinoso
on Unsplash
Anomalous sounds
Baby crying
Photo by Marcos Paulo Prado
on Unsplash
Anomalous sounds
Mechanical failure
Photo by Ant Rozetsky
on Unsplash
Normal
Anomaly
Purpose of ASD
Anomalous sounds may be caused by dangerous events.
Prompt detection of anomalous sounds helps prevent the worst case.
Hot research topic
❏ DCASE 2020 Challenge Task [Link]
❏ Upcoming task of DCASE Challenge 2021! [Link]
Implementation
❏ Input: acoustic features (e.g. mel-spectrogram)
❏ Anomaly score calculator (e.g. DNN) outputs an anomaly score
❏ Thresholding: high score → Anomaly, low score → Normal
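The pipeline on this slide (features → anomaly score → thresholding) can be sketched as follows. This is only a minimal illustration: `detect`, `toy_score`, and `REFERENCE` are hypothetical names, and the toy score (mean squared deviation from a fixed reference vector) merely stands in for the DNN-based score calculator.

```python
from typing import Callable, Sequence


def detect(features: Sequence[float],
           score_fn: Callable[[Sequence[float]], float],
           threshold: float) -> str:
    """Return 'Anomaly' when the anomaly score exceeds the threshold, else 'Normal'."""
    score = score_fn(features)
    return "Anomaly" if score > threshold else "Normal"


# Hypothetical stand-in for the score calculator (a DNN in practice):
# mean squared deviation from a reference "normal" feature vector.
REFERENCE = [0.0, 0.0, 0.0]


def toy_score(features: Sequence[float]) -> float:
    return sum((f - r) ** 2 for f, r in zip(features, REFERENCE)) / len(REFERENCE)
```

The threshold is a free design parameter: raising it reduces false alerts at the price of more overlooked anomalies.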
OK, I know deep learning!
I'll train a deep classifier for A(x)!
Calm down!
Let's figure out the problem first.
“Known” and “Unknown” anomalies
Environmental sound detection & classification, arranged by the number of training samples of target events (Massive → Few → Zero-resource):
❏ Massive: sound event detection
  ❏ Car, speech, trumpet, ...
❏ Few: rare sound event detection
  ❏ Baby crying, gunshot
  ❏ Difficult to collect target anomalies
  ❏ Detecting known anomalies using few anomalous samples
❏ Zero-resource: unsupervised anomalous sound detection (today's topic)
  ❏ Mechanical failure: gear failure, engine failure, pump failure, and more...
  ❏ Impossible to collect exhaustive patterns of anomalies
  ❏ Detecting unknown anomalies without anomalous samples
❏ The “few” and “zero-resource” cases are often both called anomalous sound detection
02 Unsupervised ASD
Application example
❏ Anomalous sound detection for machine condition monitoring
Impossible to deliberately produce exhaustive patterns of mechanical failure
DCASE 2020 Challenge Task 2
Typical task setup
❏ Only normal samples are provided as training data!!
❏ DCASE 2020 Challenge Task 2: ToyADMOS [Koizumi+, 2019] & MIMII [Purohit+, 2019]
[Koizumi+, 2019]: Y. Koizumi, et al., “ToyADMOS: A Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection,” Proc. of WASPAA, 2019.
[Purohit+, 2019]: H. Purohit, et al., “MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection,” Proc. of DCASE Workshop, 2019.
❏ 6 machine types, (4+3) machine IDs
❏ Training data: around 1000 samples of 10-second normal sounds
No anomalous samples?!
Supervised ASD
[Figure: a DNN outputs an estimate (Normal / Anomaly), which is compared with the label (Normal / Anomaly) using cross-entropy]
No anomalous samples?!
Unsupervised ASD
[Figure: a DNN outputs an estimate (Normal / Anomaly), but the labels contain only normal]
How can anomalies be detected without anomalous training data?
Before DCASE 2020
Outlier detection & auto-encoder
Outlier detection
Learn what “normal” is & detect “not normal”
Outlier detection
❏ Normal: a subset of various sounds (full set)
❏ Anomaly: complement of normal
[Figure: the full set of various sounds; the given normal sounds form a subset; the remaining unknown sounds = anomalous sounds]
How to model “normal”?
❏ Auto-encoder [Marchi+, 2015]
  ❏ Anomaly score = reconstruction error
  ❏ The auto-encoder is trained to reconstruct normal samples
[Figure: spectrogram (time × frequency) → Enc → Dec → anomaly score]
[Marchi+, 2015]: E. Marchi, et al., “A Novel Approach for Automatic Acoustic Novelty Detection using a Denoising Autoencoder with Bidirectional LSTM Neural Networks,” Proc. of ICASSP, 2015.
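As a rough illustration of "anomaly score = reconstruction error", the sketch below uses a deliberately degenerate stand-in for the auto-encoder: it "reconstructs" every input as the mean of the normal training frames. This is an assumption made to keep the example dependency-free; it is not the denoising BLSTM auto-encoder of [Marchi+, 2015]. The function names are hypothetical.

```python
from typing import Callable, List, Sequence


def fit_mean_autoencoder(normal_frames: Sequence[Sequence[float]]) -> Callable:
    """Fit a toy 'auto-encoder' on normal frames only.

    Stand-in model: the reconstruction of any input is the mean normal frame
    (a zero-dimensional bottleneck). A real system would train a DNN here.
    """
    dim = len(normal_frames[0])
    mean = [sum(f[d] for f in normal_frames) / len(normal_frames) for d in range(dim)]

    def reconstruct(frame: Sequence[float]) -> List[float]:
        return list(mean)

    return reconstruct


def anomaly_score(frame: Sequence[float], reconstruct: Callable) -> float:
    """Anomaly score = mean squared reconstruction error."""
    recon = reconstruct(frame)
    return sum((x - y) ** 2 for x, y in zip(frame, recon)) / len(frame)


normal_frames = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]]
autoencoder = fit_mean_autoencoder(normal_frames)
```

Frames close to the training data obtain a small score; frames far from it obtain a large one.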
Problem on auto-encoder
Can anomalies really not be reconstructed?
Problem on auto-encoder
❏ Minimizing the cost function does not guarantee that anomalies are poorly reconstructed
[Figure: toy example. Panels: normal training samples; after training; Boltzmann distribution]
False negative = overlooked anomaly
Solutions
How to increase A(x) of anomalies?
❏ Simulating anomalous sound
  ❏ Rejection sampling [Koizumi+, TASLP 2019]
  ❏ Batch uniformization + adding a small amount of another sound [Koizumi+, WASPAA 2019]
❏ Outlier exposure [Hendrycks+, 2019]-like approach
  ❏ Classification of target machine and other individuals [Many DCASE Challenge submissions]
[Koizumi+, TASLP 2019]: Y. Koizumi, et al., “Unsupervised Detection of Anomalous Sound based on Deep Learning and the Neyman-Pearson Lemma,” IEEE TASLP, 2019.
[Koizumi+, WASPAA 2019]: Y. Koizumi, et al., “Batch Uniformization for Minimizing Maximum Anomaly Score of DNN-based Anomaly Detection in Sounds,” Proc. of WASPAA, 2019.
[Hendrycks+, 2019]: D. Hendrycks, et al., “Deep Anomaly Detection with Outlier Exposure,” Proc. of ICLR, 2019.
Simulating anomalous sound
Cost =
1. Decreasing anomaly score for normal sounds &
2. Increasing anomaly score for simulated anomalous sounds
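One simple way to write such a two-term cost is the difference of average anomaly scores, sketched below. This form is an assumption for illustration; the cited papers define their own objectives, and `asd_cost` is a hypothetical name.

```python
from typing import Sequence


def asd_cost(normal_scores: Sequence[float],
             simulated_anomaly_scores: Sequence[float]) -> float:
    """Toy cost: mean A(x) over normal sounds minus mean A(x) over simulated anomalies.

    Minimizing it (1) decreases the anomaly score of normal sounds and
    (2) increases the anomaly score of simulated anomalous sounds.
    """
    mean_normal = sum(normal_scores) / len(normal_scores)
    mean_anomaly = sum(simulated_anomaly_scores) / len(simulated_anomaly_scores)
    return mean_normal - mean_anomaly
```

A model that separates the two groups well yields a lower cost than one that assigns the same score to everything.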
Simulating anomalous sound
[Figure panels: Auto-encoder; Anomaly simulation; Outlier-ex]
How to simulate anomalous sounds?
Rejection sampling of anomalous sound
❏ Remember that an “anomaly” is the complement of normal
❏ Generate a sample from the PDF of various sounds, p(x)
❏ Accept it as an “anomaly” when p(x | state=normal) is low
[Figure legend: various sounds; given normal sounds]
[Koizumi+, TASLP 2019]: Y. Koizumi, et al., “Unsupervised Detection of Anomalous Sound based on Deep Learning and the Neyman-Pearson Lemma,” IEEE TASLP, 2019.
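The three steps above can be sketched in one dimension. Everything concrete here is an assumption for illustration: both densities are plain Gaussians, and the acceptance threshold `0.01` is arbitrary; the cited paper works with learned models rather than closed-form PDFs.

```python
import math
import random
from typing import Callable, List


def gaussian_pdf(x: float, mean: float, std: float) -> float:
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def sample_simulated_anomalies(n: int,
                               normal_pdf: Callable[[float], float],
                               sample_various: Callable[[random.Random], float],
                               density_threshold: float,
                               rng: random.Random) -> List[float]:
    """Draw from p(x) and keep only samples whose p(x | normal) is low."""
    accepted: List[float] = []
    while len(accepted) < n:
        x = sample_various(rng)                 # candidate from "various sounds"
        if normal_pdf(x) < density_threshold:   # unlikely under normal -> accept as anomaly
            accepted.append(x)
    return accepted


rng = random.Random(0)
normal_pdf = lambda x: gaussian_pdf(x, 0.0, 1.0)   # assumed p(x | state=normal)
sample_various = lambda r: r.gauss(0.0, 5.0)       # assumed broad p(x)
anomalies = sample_simulated_anomalies(100, normal_pdf, sample_various, 0.01, rng)
```

The accepted samples all lie in the tails of the normal density, which matches the idea that anomalies are the complement of normal.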
Adding a small amount of another sound
❏ Remember that an “anomaly” is “not normal”
Normal sound + a collision sound = Anomalous sound
Normal sound + some rubbing sounds = Anomalous sound
Normal sound + clicking noise = Anomalous sound
Normal sound + something-else sound = Anomalous sound
[Koizumi+, WASPAA 2019]: Y. Koizumi, et al., “Batch Uniformization for Minimizing Maximum Anomaly Score of DNN-based Anomaly Detection in Sounds,” Proc. of WASPAA, 2019.
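A minimal sketch of this mixing idea on raw waveform samples. The function name and the `gain` parameter are hypothetical; how the extra sound is chosen and scaled in the cited paper is not described on the slide.

```python
from typing import List, Sequence


def mix_small_sound(normal: Sequence[float],
                    other: Sequence[float],
                    gain: float = 0.1) -> List[float]:
    """Simulate an anomalous sound: normal sound + a small amount of another sound.

    Both inputs are waveforms as lists of samples; the result is truncated
    to the shorter of the two.
    """
    n = min(len(normal), len(other))
    return [normal[i] + gain * other[i] for i in range(n)]


# Example: a constant "machine hum" with one short click added at low gain.
hum = [1.0, 1.0, 1.0]
click = [0.0, 10.0, 0.0]
simulated = mix_small_sound(hum, click, gain=0.1)
```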
+ Batch uniformization
❏ However, this often becomes “The Boy Who Cried Wolf”
  ❏ Rare normal sounds are identified as anomalies
❏ Weight A(x) of each normal sound by the reciprocal of its probability
  ❏ Decrease A(x) of normals, especially rare normals
  ❏ Increase A(x) of simulated anomalies
[Koizumi+, WASPAA 2019]: Y. Koizumi, et al., “Batch Uniformization for Minimizing Maximum Anomaly Score of DNN-based Anomaly Detection in Sounds,” Proc. of WASPAA, 2019.
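The reciprocal-probability weighting can be illustrated with a weighted average of anomaly scores. This is only a sketch under the assumption that a probability estimate for each normal sample is already available; how the cited paper obtains and applies these weights within a batch is not covered here, and `weighted_normal_score` is a hypothetical name.

```python
from typing import Sequence


def weighted_normal_score(scores: Sequence[float],
                          probabilities: Sequence[float]) -> float:
    """Average A(x) over normal samples, weighting each by 1 / p(x).

    Rare normals (small p) receive large weights, so minimizing this value
    pushes their anomaly scores down harder than a plain average would.
    """
    weights = [1.0 / p for p in probabilities]
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total


# Two frequent normals with low scores and one rare normal with a high score.
scores = [0.1, 0.1, 2.0]
probabilities = [0.45, 0.45, 0.10]
plain_mean = sum(scores) / len(scores)
weighted_mean = weighted_normal_score(scores, probabilities)
```

In this toy case the weighted average is dominated by the rare normal, so it becomes the main target of training.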
Toy example (cf. Problem on auto-encoder)
[Figure: toy example. Panels: normal training samples; after training; Boltzmann distribution]
❏ Able to distinguish rare normals from anomalies
But… what a dense & ad-hoc method...
How should the “something-else sound” be selected?
What are the criteria for selecting it?
Is there a more computationally efficient way?
After DCASE 2020
Outlier detection & auto-encoder → Classifier
Outlier exposure-like approach
❏ Outlier detection → Classification
❏ Classification of target machine and other individuals
Recap: DCASE 2020 Challenge dataset
❏ 6 machine types, (4+3) machine IDs
❏ Around 1000 samples of 10-second normal sounds
Basic idea
❏ DNN solves machine ID identification instead of outlier detection
❏ Training: the spectrogram (time × frequency) of a training sample (e.g. Type: Valve, ID: 01) is fed to the DNN
❏ Output classes: Pump ID01, Pump ID02, Pump ID03, ..., Slide rail ID07, Valve ID01, Valve ID02, ...
❏ Training loss: e.g. cross-entropy
Basic idea (cont’d)
❏ DNN solves machine ID identification instead of outlier detection
❏ Test: the spectrogram (time × frequency) of a test sample (e.g. Type: Valve, ID: 01) is fed to the trained DNN
❏ Output classes: Pump ID01, Pump ID02, Pump ID03, ..., Slide rail ID07, Valve ID01, Valve ID02, ...
❏ The anomaly score obtained from the DNN output is thresholded into Anomaly / Normal
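The slide does not say how the classifier output becomes an anomaly score. One common choice, assumed here purely for illustration, is the negative log-probability that the classifier assigns to the machine ID the test sample is known to come from. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    m = max(logits)                              # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classification_anomaly_score(logits: Sequence[float], true_class_index: int) -> float:
    """Assumed score: -log p(true machine ID | x).

    If the sound no longer 'sounds like' its own machine ID, the classifier
    becomes unsure or wrong, and the score rises.
    """
    probs = softmax(logits)
    return -math.log(probs[true_class_index])


CLASSES = ["Pump ID01", "Pump ID02", "Valve ID01"]
VALVE_01 = CLASSES.index("Valve ID01")
confident_logits = [0.0, 0.0, 5.0]   # classifier recognizes Valve ID01
confused_logits = [5.0, 0.0, 0.0]    # classifier thinks it is Pump ID01
```

The resulting score can then be thresholded exactly as in the generic pipeline.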
Basic idea (cont’d)
❏ DNN solves machine ID identification instead of outlier detection
[Figure panels: Auto-encoder; Anomaly simulation; Outlier-expose-like]
Target labels for classification-ASD
Which labels should be the classification targets?
❏ No answers yet, but many attempts have been made:
  ❏ Machine ID identification: [Giri+], [Primus+], [Zhou], [Lopez+]
  ❏ Machine type & other datasets identification: [Primus+]
  ❏ Data augmentation identification: [Giri+], [Inoue/Vinayavekhin+]
Top-performing teams developed their own methods independently
[Giri+]: R. Giri, et al., “Self-Supervised Classification for Detecting Anomalous Sounds,” Proc. of DCASE Workshop, 2020.
[Primus+]: P. Primus, et al., “Anomalous Sound Detection as a Simple Binary Classification Problem with Careful Selection of Proxy Outlier Examples,” Proc. of DCASE Workshop, 2020.
[Inoue/Vinayavekhin+]: T. Inoue, P. Vinayavekhin, et al., “Detection of Anomalous Sounds for Machine Condition Monitoring using Classification Confidence,” Proc. of DCASE Workshop, 2020.
[Zhou]: Q. Zhou, “ArcFace Based Sound MobileNets for DCASE 2020 Task 2,” Tech. Report, DCASE Challenge 2020.
[Lopez+]: J. A. Lopez, “A Speaker Recognition Approach to Anomaly Detection,” Tech. Report, DCASE Challenge 2020.
Problems on classification-ASD
❏ Training fails in extremely easy or extremely difficult classification cases
  ❏ Normal sounds of two individuals are exactly the same or completely different
  ❏ Impossible to determine the boundary between target normal and other sounds
Due to this problem, although some teams achieved high scores on several machine types, they dropped in rank owing to relatively low Toy-conveyor scores.
This problem can be a good starting point for answering the next research question: “Which labels should be the classification targets?”
Wanna try unsupervised ASD?
Try DCASE 2020 Challenge Task 2!!
Baseline system and dataset are available
http://dcase.community/challenge2020/task-unsupervised-detection-of-anomalous-sounds
03 +Domain shift
System is not perfect
❏ Two types of “mis-detection”
❏ False positive (Type I error): this section
  ❏ Normal → Anomaly
  ❏ Frequently occurs
  ❏ Often caused by changes in the normal condition
❏ False negative (Type II error): next section
  ❏ Anomaly → Normal
  ❏ Rarely occurs, but a critical problem
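The two error types can be counted from labeled test decisions as in the sketch below, where `True` means "anomaly". The helper name is hypothetical.

```python
from typing import Sequence, Tuple


def error_rates(labels: Sequence[bool], predictions: Sequence[bool]) -> Tuple[float, float]:
    """Return (false-positive rate, false-negative rate); True means 'anomaly'.

    False positive: a normal sample flagged as anomaly (Type I error).
    False negative: an anomalous sample passed as normal (Type II error).
    """
    false_pos = sum(1 for y, p in zip(labels, predictions) if not y and p)
    false_neg = sum(1 for y, p in zip(labels, predictions) if y and not p)
    n_normal = sum(1 for y in labels if not y)
    n_anomaly = sum(1 for y in labels if y)
    return false_pos / n_normal, false_neg / n_anomaly


labels = [False, False, False, False, True, True]
predictions = [False, True, False, False, True, False]
fpr, fnr = error_rates(labels, predictions)
```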
Domain shift problem
❏ In practice, “the normal state” is not always constant
  ❏ Changes in engine speed due to changes in the products being manufactured
  ❏ Seasonal variation (e.g. sound speed, noise, and more...)
  ❏ Accidentally changed microphone position
  ❏ and more…
❏ This results in “false alerts” (= normal is mistakenly identified as anomaly)
Few-shot model adaptation
❏ Need to update the ASD system immediately
❏ Old domain (source): massive training data + trained model
❏ New domain (target): few training samples
[Figure: normal samples and the DNN in the old and new domains]
Model adaptation for ASD
❏ AdaFlow [Yamaguchi+, 2019]
  ❏ Normalizing flow + adaptive batch normalization
  ❏ Assumes low computational resources (e.g. edge device)
  ❏ DNN update w/o backpropagation
[Figure: Training with normal sounds of ID: 01, ID: 02, and ID: 03; BN layers are shown for each ID]
[Yamaguchi+, 2019]: M. Yamaguchi, et al., “AdaFlow: Domain-Adaptive Density Estimator with Application to Anomaly Detection and Unpaired Cross-Domain Translation,” Proc. of ICASSP, 2019.
Model adaptation for ASD (cont’d)
❏ AdaFlow [Yamaguchi+, 2019]: adaptation with normal sounds from the new domain
  ❏ Freeze the trained weights
  ❏ Update only the mean & variance parameters of the BN layers
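The "update only mean & variance" step can be sketched for a single batch-normalization layer. This is a simplified illustration of adaptive batch normalization, not the AdaFlow implementation: the class and function names are hypothetical, and a real model would stack such layers inside a normalizing flow.

```python
import math
from typing import List, Sequence, Tuple


def batch_stats(samples: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-dimension mean and (biased) variance of a batch of feature vectors."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(s[d] for s in samples) / n for d in range(dim)]
    var = [sum((s[d] - mean[d]) ** 2 for s in samples) / n for d in range(dim)]
    return mean, var


class BatchNorm:
    def __init__(self, mean, var, gamma, beta, eps: float = 1e-5):
        self.mean, self.var = list(mean), list(var)
        self.gamma, self.beta = list(gamma), list(beta)   # learned, frozen at adaptation
        self.eps = eps

    def __call__(self, x: Sequence[float]) -> List[float]:
        return [g * (xi - m) / math.sqrt(v + self.eps) + b
                for xi, m, v, g, b in zip(x, self.mean, self.var, self.gamma, self.beta)]

    def adapt(self, target_samples: Sequence[Sequence[float]]) -> None:
        """Recompute mean/variance from a few target-domain normals; no backpropagation."""
        self.mean, self.var = batch_stats(target_samples)


source_normals = [[0.0], [2.0]]
bn = BatchNorm(*batch_stats(source_normals), gamma=[1.0], beta=[0.0])
before = bn([11.0])[0]                 # target-domain input looks extreme to the old statistics
bn.adapt([[10.0], [12.0]])             # a few normal samples from the new domain
after = bn([11.0])[0]
```

Because only statistics are recomputed, the update needs a forward pass over a handful of samples, which fits the low-resource setting mentioned on the slide.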
Important topic! But not much investigated...
DCASE 2021 Challenge Task 2
Stay tuned!!
04 +Anomalous samples
Sometimes we can get anomalies
❏ Overlooked anomalies
  ❏ Critical problem!!
  ❏ Need to update the system immediately
❏ Correctly detected anomalies
  ❏ Well done!!
  ❏ Is there room to improve the system using the obtained anomalies?
Still not two-class classification
❏ Alright! We got anomalous samples in addition to the normal training samples
❏ Train a two-class classifier!!
❏ ...but it will overlook other types of anomalies
Remember, we cannot collect exhaustive patterns of anomalies
Why is discriminative training bad?
❏ E.g. density ratio-based classification
[Figure labels: Normal; Given anomaly; region where Anomaly > Normal]
Remember, we cannot collect exhaustive patterns of anomalies
Training strategies
❏ Few-shot anomalies
  ❏ +Memory-based few-shot detector [Koizumi+, 2019], [Koizumi+, 2020]
❏ Enough anomalies
  ❏ Complementary set VAE: estimating the PDF of the “complement of normal” [Kawachi+, 2018], [Kawachi+, 2019]
[Koizumi+, 2019]: Y. Koizumi, et al., “SNIPER: Few-shot Learning for Anomaly Detection to Minimize False-Negative Rate with Ensured True-Positive Rate,” Proc. of ICASSP, 2019.
[Koizumi+, 2020]: Y. Koizumi, et al., “SPIDERnet: Attention Network for One-shot Anomaly Detection in Sounds,” Proc. of ICASSP, 2020.
[Kawachi+, 2018]: Y. Kawachi, et al., “Complementary Set Variational AutoEncoder for Supervised Anomaly Detection,” Proc. of ICASSP, 2018.
[Kawachi+, 2019]: Y. Kawachi, et al., “A Two-Class Hyper-Spherical Autoencoder for Supervised Anomaly Detection,” Proc. of ICASSP, 2019.
+Few-shot learning
❏ Increase A(x) when the input is similar to memorized anomalies
[Koizumi+, 2019]: Y. Koizumi, et al., “SNIPER: Few-shot Learning for Anomaly Detection to Minimize False-Negative Rate with Ensured True-Positive Rate,” Proc. of ICASSP, 2019.
[Koizumi+, 2020]: Y. Koizumi, et al., “SPIDERnet: Attention Network for One-shot Anomaly Detection in Sounds,” Proc. of ICASSP, 2020.
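A toy version of "increase A(x) when the input is similar to memorized anomalies" is sketched below. The Gaussian similarity kernel, the additive combination, and all names are assumptions for illustration only; SNIPER and SPIDERnet use their own architectures and training criteria.

```python
import math
from typing import Callable, Sequence


def similarity(x: Sequence[float], y: Sequence[float], bandwidth: float = 1.0) -> float:
    """Gaussian kernel similarity in (0, 1]; equals 1 when x == y."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def few_shot_score(x: Sequence[float],
                   base_score: Callable[[Sequence[float]], float],
                   memory: Sequence[Sequence[float]],
                   weight: float = 1.0) -> float:
    """Unsupervised score plus a bonus for resembling any memorized anomaly."""
    bonus = max((similarity(x, m) for m in memory), default=0.0)
    return base_score(x) + weight * bonus


memory = [[3.0, 3.0]]                    # one previously overlooked anomaly
base_score = lambda x: 0.0               # stand-in for an unsupervised detector that misses it
near = few_shot_score([3.0, 3.0], base_score, memory)
far = few_shot_score([0.0, 0.0], base_score, memory)
```

With an empty memory the score falls back to the unsupervised detector, so the few-shot part only adds sensitivity around anomalies that have actually been observed.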
Normal vs. Complement
❏ Complementary set PDF [Kawachi+, 2018], [Kawachi+, 2019]
[Figure labels: Normal; Complement; region where Anomaly > Normal]
[Kawachi+, 2018]: Y. Kawachi, et al., “Complementary Set Variational AutoEncoder for Supervised Anomaly Detection,” Proc. of ICASSP, 2018.
[Kawachi+, 2019]: Y. Kawachi, et al., “A Two-Class Hyper-Spherical Autoencoder for Supervised Anomaly Detection,” Proc. of ICASSP, 2019.
Complementary set VAE [Kawachi+, 2018]
❏ Switch the hidden-space prior in the VAE according to the label
Cost = Reconstruction error + (likelihood of normal or likelihood of complement, depending on the label)
[Figure labels: Normal; Complement; region where Anomaly > Normal]
[Kawachi+, 2018]: Y. Kawachi, et al., “Complementary Set Variational AutoEncoder for Supervised Anomaly Detection,” Proc. of ICASSP, 2018.
Toy example [Kawachi+, 2018]
❏ MNIST example
  ❏ Normal: 0-8
  ❏ Anomaly: 9
❏ Visualizing the hidden space
  ❏ Since the normal prior is Gaussian, 0-8 are placed around the center of the hidden space
[Figure: hidden space with Normal and Anomaly samples]
[Kawachi+, 2018]: Y. Kawachi, et al., “Complementary Set Variational AutoEncoder for Supervised Anomaly Detection,” Proc. of ICASSP, 2018.
05 Future Prospects
Where is the anomaly??
Anomaly detected!!
...but where…?
Photo by Magda Ehlers from Pexels
+sound localization
Localization is also tackled in the DCASE Challenge [Link]
How anomalous??
Anomaly detected!!
...but how…?
Photo by Andrea Piacquadio from Pexels
+audio captioning
Captioning is also tackled in the DCASE Challenge [Link]
High-frequency rubbing noise.
It might be an anomaly in the bearing.
Conclusion
❏ Interesting and “tasty” problems
❏ Outlier-detection? Classification?
❏ Even the definition of the problem is uncertain
❏ Blue ocean
❏ Many unsolved problems and DCASE Challenge
❏ Domain adaptation, few-shot learning...
❏ Frontier
❏ Practically important but undeveloped research field
❏ Combining other DCASE tasks
Why not enjoy ASD?
Thank You
Yuma Koizumi
Research Scientist @ Google Research Tokyo
Problem on auto-encoder (supplement)
❏ Reconstruction error and energy of Boltzmann distribution
❏ MMSE-based training ignores the normalizing constant
❏ KL divergence between the PDF of normal and the model = MMSE term + constraint for increasing the total anomaly score
  ❏ Constraint = increasing the anomaly score of unknown samples
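The relation on this slide can be written out as follows. The notation is an assumption (the slide's own equation did not survive extraction): $A_\theta(x)$ is the reconstruction error treated as an energy, $q_\theta$ is the corresponding Boltzmann distribution, and $Z_\theta$ is its normalizing constant.

```latex
\begin{align}
q_\theta(x) &= \frac{\exp\bigl(-A_\theta(x)\bigr)}{Z_\theta},
\qquad Z_\theta = \int \exp\bigl(-A_\theta(x)\bigr)\,dx \\
\mathrm{KL}\bigl(p_{\mathrm{normal}} \,\|\, q_\theta\bigr)
&= \underbrace{\mathbb{E}_{x \sim p_{\mathrm{normal}}}\bigl[A_\theta(x)\bigr]}_{\text{MMSE term}}
 + \underbrace{\log Z_\theta}_{\text{ignored by MMSE training}}
 + \mathbb{E}_{x \sim p_{\mathrm{normal}}}\bigl[\log p_{\mathrm{normal}}(x)\bigr]
\end{align}
```

The last term does not depend on $\theta$. Making $\log Z_\theta$ small requires $\exp(-A_\theta(x))$ to be small over the whole input space, that is, a high anomaly score everywhere except where the first term pulls it down on normal data. Dropping this term removes the pressure to raise the score of unknown samples.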

A Brief Introduction of Anomalous Sound Detection: Recent Studies and Future Prospects

  • 1. Yuma Koizumi 異常音検知の現状と展望 A Brief Introduction of Anomalous Sound Detection: Recent Studies and Future Prospects 人工知能セミナー@産業総合研究所 AI seminar @ AIRC, AIST 15:00-17:00, Feb. 26th, 2021
  • 2. Proprietary + Confidential Special thanks ❏ Former colleagues at NTT Laboratories ❏ Dr. Kunio Kashino, Dr. Noboru Harada, Dr. Hisashi Uematsu, Akira Nakagawa, Shoichiro Saito, Dr. Yasunori Ohishi, Daisuke Niizumi, Yuta Kawachi, Masataka Yamaguchi, Masahiro Yasuda, Daiki Takeuchi, Luc Forget, Luca Mazzon, and more... ❏ DCASE Challenge task co-organizers ❏ Dr. Yohei Kawaguchi, Dr. Harsh Purohit, Toshiki Nakamura, Yuki Nikaido, Ryo Tanabe Kaori Suefusa, Takashi Endo (Hitachi, Ltd.) and Dr. Keisuke Imoto (Doshisha University)
  • 3. Proprietary + Confidential Self-introduction ❏ Name: Yuma Koizumi (小泉 悠馬) ❏ Nov. 2020 - Current Research Scientist at Google Research ❏ Apr. 2014 - Nov. 2020 Research Scientist at NTT Media Intelligence Laboratories ❏ Ph.D degree, the University of Electro-Communications, Sept. 2017 ❏ M.S. degree, Hosei University, Mar. 2014 ❏ Research Topics ❏ Speech enhancement ❏ Anomalous sound detection (ASD) ❏ Audio captioning (1st place DCASE 2020 Challenge!)
  • 4. Proprietary + Confidential ASD & me DCASE 2020 Challenge
  • 5. Proprietary + Confidential ASD & me DCASE 2021 Challenge
  • 6. 01 Agenda 02 03 04 05 Overview of ASD Unsupervised ASD +Domain shift +Anomalous samples Future Prospects
  • 8. Proprietary + Confidential Anomaly detection ...what is “anomaly”?
  • 9. Proprietary + Confidential What is anomaly?? ❏ Anomaly ❏ Something that is noticeable because it is different from what is usual [1] ❏ Anomalies are patterns in data that do not conform to a well-defined notion of normal behavior [2] [1] Longman Dictionary of Contemporary English [2] V. Chandola, et al., “Anomaly detection: A survey,” ACM compt. Surv., 2009 anomaly = not normal
  • 10. Proprietary + Confidential Anomalous sounds Gun shot Photo by Alejo Reinoso on Unsplash
  • 11. Proprietary + Confidential Anomalous sounds Baby crying Photo by Marcos Paulo Prado on Unsplash
  • 12. Proprietary + Confidential Anomalous sounds Mechanical failure Photo by Ant Rozetsky on Unsplash Normal Anomaly
  • 13. Proprietary + Confidential Purpose of ASD Anomalous sounds may have been caused by dangerous events Prompt detection of anomalous sound for preventing the worst case
  • 14. Proprietary + Confidential ❏ DCASE 2020 Challenge Task [Link] ❏ Upcoming task of DCASE Challenge 2021! [Link] Research hot topic
  • 15. Proprietary + Confidential Implementation Anomalous score calculator e.g. DNN Thresholding Anomaly Normal high low Anomaly score e.g. mel-spectrogram
  • 16. Proprietary + Confidential OK, I know deep learning! I'll train deep classifier for A(x)! Calm down! Let's figure out the problem
  • 17. Proprietary + Confidential “Known” and “Unknown” anomalies Number of training samples of target events Environmental sound detection & classification
  • 18. Proprietary + Confidential “Known” and “Unknown” anomalies Number of training samples of target events Massive Baby crying Gunshot Often called anomalous sound detection Mechanical failure Sound event detection Car Speech Trumpet ...
  • 19. Proprietary + Confidential “Known” and “Unknown” anomalies Number of training samples of target events Massive Often called anomalous sound detection Mechanical failure Gear failure Engine failure Pump failure and more... Difficult to collect target anomalies Impossible to collect exhaustive patterns of anomalies Sound event detection Car Speech Trumpet ... Baby crying Gunshot
  • 20. Proprietary + Confidential “Known” and “Unknown” anomalies Number of training samples of target events Massive Few Zero-resource Rare sound event detection Unsupervised anomalous sound detection Often called anomalous sound detection Mechanical failure Gear failure Engine failure Pump failure and more... Difficult to collect target anomalies Impossible to collect exhaustive patterns of anomalies Detecting unknown anomalies without anomalous samples Detecting known anomalies using few anomalous samples Sound event detection Car Speech Trumpet ... Baby crying Gunshot
  • 21. Proprietary + Confidential “Known” and “Unknown” anomalies Number of training samples of target events Massive Few Zero-resource Rare sound event detection Unsupervised anomalous sound detection Often called anomalous sound detection Mechanical failure Gear failure Engine failure Pump failure and more... Difficult to collect target anomalies Impossible to collect exhaustive patterns of anomalies Detecting unknown anomalies without anomalous samples Detecting known anomalies using few anomalous samples Sound event detection Car Speech Trumpet ... Baby crying Gunshot Today’s topic
  • 23. Proprietary + Confidential ❏ Anomalous sound detection for machine condition monitoring Application example Impossible to deliberately make exhaustive patterns of mechanical failure
  • 24. Proprietary + Confidential DCASE 2020 Challenge Task 2
  • 25. Proprietary + Confidential Typical task setup ❏ Only normal samples are provided as training data!! ❏ DCASE 2020 Challenge Task 2: ToyADMOS [Koizumi+, 2019] & MIMII [Purohit+, 2019] [Koizumi+, 2019]: Y. Koizumi, et al., “ToyADMOS: A Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection,” Proc. of WASPAA, 2019. [Purohit+, 2019]: H. Purohit, et al., “MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection,” Proc. of DCASE Workshop, 2019. 6 machine types (4+3) machine ID Training data: around 1000 samples of 10 sec normal sounds
  • 26. Proprietary + Confidential No anomalous samples?! Normal Anomaly Label Estimate DNN Supervised ASD Cross-entropy
  • 27. Proprietary + Confidential No anomalous samples?! Normal Anomaly Label (only normal) Estimate DNN How can anomalies be detected without anomalous training data? Unsupervised ASD
  • 28. Proprietary + Confidential Before DCASE 2020 Outlier-detection & Autoencoder
  • 29. Proprietary + Confidential Outlier detection Learn what “normal” is & Detect “not normal”
  • 30. Proprietary + Confidential Outlier detection ❏ Normal: a subset of various sounds (full set) ❏ Anomaly: complement of normal : various sounds
  • 31. Proprietary + Confidential Outlier detection : various sounds : given normal sounds ❏ Normal: a subset of various sounds (full set) ❏ Anomaly: complement of normal
  • 32. Proprietary + Confidential Outlier detection : given normal sounds : unknown sounds = anomalous sounds ❏ Normal: a subset of various sounds (full set) ❏ Anomaly: complement of normal
  • 33. Proprietary + Confidential ❏ Auto-encoder [Marchi+, 2015] ❏ Anomaly score = reconstruction error ❏ Auto-encoder is trained to reconstruct normal samples How to model “normal”? Enc Dec Anomaly score [Marchi+, 2015]: E. Marchi, et al., “A Novel Approach for Automatic Acoustic Novelty Detection using a Denoising Autoencoder with Bidirectional LSTM Neural Networks,” Proc. of ICASSP, 2015. Time Frequency Spectrogram
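To make "anomaly score = reconstruction error" concrete, here is a small sketch. It swaps the DNN auto-encoder for a linear one (a PCA projection) so that it runs with NumPy alone; the synthetic data, the 2-D bottleneck, and all names are assumptions for illustration and are not taken from [Marchi+, 2015].

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "normal" feature frames: 500 vectors lying near a 2-D subspace
# of a 16-D feature space (a stand-in for spectrogram frames).
basis = rng.standard_normal((2, 16))
normal = rng.standard_normal((500, 2)) @ basis + 0.01 * rng.standard_normal((500, 16))

# "Training": fit a linear encoder/decoder on normal data only (PCA via SVD).
mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
components = vt[:2]  # shared encoder/decoder weights, shape (2, 16)

def anomaly_score(x):
    """Squared reconstruction error of each row of x."""
    z = (x - mean) @ components.T      # encode
    x_hat = z @ components + mean      # decode
    return np.sum((x - x_hat) ** 2, axis=-1)

normal_test = rng.standard_normal((5, 2)) @ basis   # same structure as training data
anomalous_test = 3 * rng.standard_normal((5, 16))   # does not follow that structure
```

Frames that follow the structure seen in training are reconstructed well and get low scores, while frames that do not are reconstructed poorly and get high scores.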
  • 34. Proprietary + Confidential Problem on auto-encoder Anomalies cannot be reconstructed?
  • 35. Proprietary + Confidential Problem on auto-encoder ❏ The cost function does not guarantee that anomalies are poorly reconstructed [Figure: toy example, normal training samples → Train → Boltzmann distribution] False negative = overlooking an anomaly
  • 36. Proprietary + Confidential Solutions ❏ Simulating anomalous sound ❏ Rejection sampling [Koizumi+, TASLP 2019] ❏ Batch uniformization + adding another small sound [Koizumi+, WASPAA 2019] ❏ Outlier exposure [Hendrycks+, 2019]-like approach ❏ Classification of target machine and other individuals [Many DCASE challenge submissions] [Koizumi+, TASLP 2019]: Y. Koizumi, et al., “Unsupervised Detection of Anomalous Sound based on Deep Learning and the Neyman-Pearson Lemma,” IEEE TASLP, 2019. [Koizumi+, WASPAA 2019]: Y. Koizumi, et al., “Batch Uniformization for Minimizing Maximum Anomaly Score of DNN-based Anomaly Detection in Sounds,” Proc. of WASPAA, 2019. [Hendrycks+, 2019]: D. Hendrycks, et al., “Deep Anomaly Detection with Outlier Exposure,” Proc. of ICLR, 2019. How to increase A(x) of anomalies?
  • 37. Proprietary + Confidential Simulating anomalous sound Cost = 1. Decreasing anomaly score for normal sounds & 2. Increasing anomaly score for simulated anomalous sounds
  • 38. Proprietary + Confidential Simulating anomalous sound Auto-encoder Anomaly simulation Outlier-ex
  • 39. Proprietary + Confidential Simulating anomalous sound Cost = 1. Decreasing anomaly score for normal sounds & 2. Increasing anomaly score for simulated anomalous sounds How to simulate anomalous sounds?
  • 40. Proprietary + Confidential Rejection sampling of anomalous sound ❏ Remember that “anomaly” is the complement of normal ❏ Generate a sample from the PDF of various sounds p(x) ❏ Accept it as “anomaly” when p(x | state=normal) is low : various sounds : given normal sounds [Koizumi+, TASLP 2019]: Y. Koizumi, et al., “Unsupervised Detection of Anomalous Sound based on Deep Learning and the Neyman-Pearson Lemma,” IEEE TASLP, 2019.
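The accept/reject rule on this slide can be illustrated with a one-dimensional toy. This is only a sketch of the idea, not the method of [Koizumi+, TASLP 2019]: the uniform proposal standing in for "various sounds", the Gaussian standing in for p(x | state=normal), and the acceptance threshold are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def normal_log_pdf(x, mu=0.0, sigma=1.0):
    """Log-density of the assumed model of normal data, p(x | state=normal)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

def simulate_anomalies(n, log_threshold):
    """Draw candidates from a broad p(x) and keep those that are unlikely
    under the normal model."""
    accepted = []
    while len(accepted) < n:
        candidates = rng.uniform(-10.0, 10.0, size=256)  # p(x): "various sounds"
        keep = candidates[normal_log_pdf(candidates) < log_threshold]
        accepted.extend(keep.tolist())
    return np.array(accepted[:n])

# Accept a candidate only if it is less likely than a point 3 sigma from the mean.
log_threshold = normal_log_pdf(3.0)
anomalies = simulate_anomalies(100, log_threshold)
```

Every accepted sample lies outside the high-density region of the normal model, which is what "complement of normal" means in this toy setting.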
  • 41. Proprietary + Confidential Add another small sound ❏ Remember that “anomaly” is “not normal” Normal sound + a collision sound = Anomalous sound Normal sound + some rubbing sounds = Anomalous sound Normal sound + clicking noise = Anomalous sound Normal sound + something-else sound = Anomalous sound [Koizumi+, WASPAA 2019]: Y. Koizumi, et al., “Batch Uniformization for Minimizing Maximum Anomaly Score of DNN-based Anomaly Detection in Sounds,” Proc. of WASPAA, 2019.
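A rough sketch of "normal sound + something-else sound = anomalous sound" is to mix a second sound into a normal clip at a low level. The helper name `mix_at_snr`, the synthetic signals, and the 20 dB level are assumptions for illustration; the slides do not specify how the sounds were mixed.

```python
import numpy as np

def mix_at_snr(normal, other, snr_db):
    """Add `other` to `normal`, scaled so that the normal-to-other power ratio
    equals snr_db (a higher value means a quieter added sound)."""
    normal_power = np.mean(normal ** 2)
    other_power = np.mean(other ** 2)
    gain = np.sqrt(normal_power / (other_power * 10 ** (snr_db / 10)))
    return normal + gain * other

rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0

# Hypothetical normal machine sound: a steady 100 Hz hum.
normal = np.sin(2 * np.pi * 100 * t)

# Hypothetical "something-else" sound: a short noise burst (a click).
click = np.zeros_like(t)
click[8000:8100] = rng.standard_normal(100)

pseudo_anomaly = mix_at_snr(normal, click, snr_db=20.0)
```

The resulting clip can then be used as a simulated anomaly when training the score function.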
  • 42. Proprietary + Confidential ❏ However…, it often becomes "The Boy Who Cried Wolf" ❏ Rare normal sounds are identified as anomalies ❏ Weighting A(x) of normal sound by reciprocal of its probability + Batch uniformization [Koizumi+, WASPAA 2019]: Y. Koizumi, et al., “Batch Uniformization for Minimizing Maximum Anomaly Score of DNN-based Anomaly Detection in Sounds,” Proc. of WASPAA, 2019. Decrease A(x) of normals, especially rare normals Increase A(x) of simulated anomalies
  • 43. Proprietary + Confidential Toy example (cf. Problem on auto-encoder) Normal training samples 2 2 2 2 Train 2 2 Boltzmann distribution ❏ Able to distinguish rare normals and anomalies
  • 44. Proprietary + Confidential But… what a dense & ad-hoc method... How to select the “something-else sound”? What are the criteria for selecting it? Is there a more computationally efficient way?
  • 45. Proprietary + Confidential After DCASE 2020 Outlier-detection & Autoencoder Classifier
  • 46. Proprietary + Confidential Outlier exposure-like approach ❏ Outlier detection → Classification ❏ Classification of target machine and other individuals 6 machine types (4+3) machine ID Around 1000 samples of 10 sec normal sounds Recap: DCASE 2020 Challenge dataset
  • 47. Proprietary + Confidential ❏ DNN solves machine ID identification instead of outlier-detection Basic idea Time Frequency Type: Valve ID: 01 Training sample Time Frequency Pump, ID01 Pump, ID02 Pump, ID03 Slide rail, ID07 Valve, ID01 Valve, ID02 ... ... DNN Training e.g. cross-entropy
  • 48. Proprietary + Confidential ❏ DNN solves machine ID identification instead of outlier-detection Basic idea (cont’d) Time Frequency Type: Valve ID: 01 Test sample Time Frequency Pump, ID01 Pump, ID02 Pump, ID03 Slide rail, ID07 Valve, ID01 Valve, ID02 ... ... DNN Test Thresholding Anomaly Normal Anomaly score
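One way to turn the machine-ID classifier into an anomaly detector is to score a test clip by how poorly the classifier recognizes the machine it is known to come from. The sketch below uses the negative log-softmax probability of the known ID; this particular score, the function names, and the toy logits are assumptions, and the DCASE 2020 submissions used a variety of scoring rules.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    return shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))

def anomaly_score(logits, machine_id_index):
    """Negative log-probability that the classifier assigns to the machine ID
    the test clip is known to come from."""
    return -log_softmax(np.asarray(logits, dtype=float))[..., machine_id_index]

# Toy classifier outputs over three machine IDs; the clip is known to be ID index 2.
confident = anomaly_score([0.1, 0.2, 5.0], machine_id_index=2)  # recognized as its own ID
confused = anomaly_score([2.0, 1.5, 0.3], machine_id_index=2)   # looks like another machine
```

A clip the classifier confidently assigns to its own machine ID gets a low score; a clip that no longer sounds like that machine gets a high score and is flagged after thresholding.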
  • 49. Proprietary + Confidential ❏ DNN solves machine ID identification instead of outlier-detection Basic idea (cont’d) Auto-encoder Anomaly simulation Outlier-exposure-like
  • 50. Proprietary + Confidential Target labels for classification-ASD Which labels should be the classification target? ❏ No answers yet, but many attempts have been made: ❏ Machine ID identification: [Giri+], [Primus+], [Zhou], [Lopez+] ❏ Machine Type & other datasets identification: [Primus+] ❏ Data augmentation identification: [Giri+], [Inoue/Vinayavekhin+] Top-performing teams developed their own methods independently [Giri+]: R. Giri, et al., “Self-Supervised Classification for Detecting Anomalous Sounds,” Proc. of DCASE Workshop, 2020. [Primus+]: P. Primus, et al., “Anomalous Sound Detection as a Simple Binary Classification Problem with Careful Selection of Proxy Outlier Examples,” Proc. of DCASE Workshop, 2020. [Inoue/Vinayavekhin+]: T. Inoue, P. Vinayavekhin, et al., “Detection of Anomalous Sounds for Machine Condition Monitoring using Classification Confidence,” Proc. of DCASE Workshop, 2020. [Zhou]: Q. Zhou, “ARCFACE BASED SOUND MOBILENETS FOR DCASE 2020 TASK 2,” Tech. Report, DCASE Challenge 2020. [Lopez+]: J. A. Lopez, “A SPEAKER RECOGNITION APPROACH TO ANOMALY DETECTION,” Tech. Report, DCASE Challenge 2020.
  • 51. Proprietary + Confidential ❏ Training fails in extremely easy/difficult classification cases ❏ Normal sounds of two individuals are exactly the same or completely different ❏ Impossible to determine the boundary between target normal and other sounds Problems with classification-ASD Due to this problem, although some teams achieved high scores on several machine types, they dropped in rank owing to relatively low Toy-conveyor scores This problem can be a good starting point for answering the next research question: "which labels should be the classification target?"
  • 52. Proprietary + Confidential Wanna try unsupervised ASD? Try DCASE 2020 Challenge Task 2!! Baseline system and dataset are available http://dcase.community/challenge2020/task-unsupervised-detection-of-anomalous-sounds
  • 54. Proprietary + Confidential System is not perfect ❏ Two types of “mis-detection” ❏ False-positive (Type I error): Normal → Anomaly ❏ Frequently occurs ❏ Often caused by changes in normal condition ❏ Covered in this section ❏ False-negative (Type II error): Anomaly → Normal ❏ Rarely occurs, but critical problem ❏ Covered in the next section
  • 55. Proprietary + Confidential ❏ In practice, “the normal state” is not always constant ❏ Changes in engine speed due to changes in the products being manufactured ❏ Seasonal variation (e.g. sound speed, noise, and more...) ❏ Accidentally changed microphone position ❏ and more… ❏ This results in “false alerts” = Normal is mistakenly identified as anomaly Domain shift problem
  • 56. Proprietary + Confidential ❏ Need to update ASD system immediately Few-shot model adaptation Normal DNN Normal Normal Normal Old domain (source) New domain (target) Massive training data + trained model Few training samples
  • 57. Proprietary + Confidential ❏ AdaFlow [Yamaguchi+, 2019] ❏ Normalizing flow + adaptive batch normalization ❏ Assuming low computational resource (e.g. edge device) ❏ DNN update w/o backpropagation Model adaptation for ASD [Yamaguchi+, 2019]: M. Yamaguchi, et al., “AdaFlow: Domain-Adaptive Density Estimator with Application to Anomaly Detection and Unpaired Cross-Domain Translation,” Proc. of ICASSP, 2019. Normal ID: 01 Normal ID: 02 Normal ID: 03 BN BN BN BN BN BN BN BN BN Training
  • 58. Proprietary + Confidential ❏ AdaFlow [Yamaguchi+, 2019] ❏ Normalizing flow + adaptive batch normalization ❏ Assuming low computational resource (e.g. edge device) ❏ DNN update w/o backpropagation Model adaptation for ASD [Yamaguchi+, 2019]: M. Yamaguchi, et al., “AdaFlow: Domain-Adaptive Density Estimator with Application to Anomaly Detection and Unpaired Cross-Domain Translation,” Proc. of ICASSP, 2019. BN BN BN Adaptation Normal Freeze Update mean & var. params
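The adaptation step on this slide (freeze the network, update only the batch-normalization mean and variance from a few target-domain samples, no backpropagation) can be sketched as below. This shows only the statistics update on a single layer, not AdaFlow's normalizing flow; the class `BatchNorm1d`, its attribute names, and the toy data are assumptions.

```python
import numpy as np

class BatchNorm1d:
    """Inference-time batch normalization with frozen affine parameters."""

    def __init__(self, gamma, beta, running_mean, running_var, eps=1e-5):
        self.gamma = np.asarray(gamma, dtype=float)   # frozen during adaptation
        self.beta = np.asarray(beta, dtype=float)     # frozen during adaptation
        self.running_mean = np.asarray(running_mean, dtype=float)
        self.running_var = np.asarray(running_var, dtype=float)
        self.eps = eps

    def __call__(self, x):
        normalized = (x - self.running_mean) / np.sqrt(self.running_var + self.eps)
        return self.gamma * normalized + self.beta

    def adapt(self, target_samples):
        """Replace the stored statistics with those of the new domain.
        No gradients and no backpropagation are involved."""
        target_samples = np.asarray(target_samples, dtype=float)
        self.running_mean = target_samples.mean(axis=0)
        self.running_var = target_samples.var(axis=0)

rng = np.random.default_rng(0)
bn = BatchNorm1d(gamma=np.ones(4), beta=np.zeros(4),
                 running_mean=np.zeros(4), running_var=np.ones(4))

# A few normal samples from a shifted domain (different offset and scale).
target = 5.0 + 2.0 * rng.standard_normal((32, 4))

before = bn(target)   # statistics of the old domain: outputs are far from standardized
bn.adapt(target)
after = bn(target)    # statistics of the new domain: outputs are standardized again
```

Because only means and variances are recomputed, the update is cheap enough for the low-resource setting mentioned on the slide.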
  • 59. Proprietary + Confidential Important topic! But not much investigated...
  • 60. Proprietary + Confidential DCASE 2021 Challenge Task 2 Stay tuned!!
  • 62. Proprietary + Confidential ❏ Overlooked anomalies ❏ Critical problem!! ❏ Need to update system immediately ❏ Correctly detected anomalies ❏ Well done!! ❏ Do we have room to improve system using obtained anomalies? Sometimes we can get anomalies
  • 63. Proprietary + Confidential Still not two class classification Normal training samples Alright! We got anomalous samples
  • 64. Proprietary + Confidential Still not two class classification Train two class classifier!!
  • 65. Proprietary + Confidential Still not two class classification Overlook other types of anomalies Remember, we cannot collect exhaustive patterns of anomalies
  • 66. Proprietary + Confidential ❏ E.g. density ratio-based classification Why is discriminative training bad? Normal Given anomaly Anomaly > Normal Remember, we cannot collect exhaustive patterns of anomalies
  • 67. Proprietary + Confidential ❏ Few-shot anomalies ❏ +Memory-based few-shot detector [Koizumi+, 2019], [Koizumi+, 2020] ❏ Sufficient number of anomalies ❏ Complementary set VAE: estimating PDF of “complement of normal” [Kawachi+, 2018], [Kawachi+, 2019] Training strategies [Koizumi+, 2019]: Y. Koizumi, et al., “SNIPER: Few-shot Learning for Anomaly Detection to Minimize False-Negative Rate with Ensured True-Positive Rate,” Proc. of ICASSP, 2019. [Koizumi+, 2020]: Y. Koizumi, et al., “SPIDERnet: Attention Network for One-shot Anomaly Detection in Sounds,” Proc. of ICASSP, 2020. [Kawachi+, 2018]: Y. Kawachi, et al., “Complementary Set Variational AutoEncoder for Supervised Anomaly Detection,” Proc. of ICASSP, 2018. [Kawachi+, 2019]: Y. Kawachi, et al., “A Two-Class Hyper-Spherical Autoencoder for Supervised Anomaly Detection,” Proc. of ICASSP, 2019.
  • 68. Proprietary + Confidential ❏ Increase A(x) when input is similar to memorized anomalies +Few-shot learning [Koizumi+, 2019]: Y. Koizumi, et al., “SNIPER: Few-shot Learning for Anomaly Detection to Minimize False-Negative Rate with Ensured True-Positive Rate,” Proc. of ICASSP, 2019. [Koizumi+, 2020]: Y. Koizumi, et al., “SPIDERnet: Attention Network for One-shot Anomaly Detection in Sounds,” Proc. of ICASSP, 2020.
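The idea "increase A(x) when the input is similar to memorized anomalies" can be illustrated with a toy score adjustment. This is not SNIPER or SPIDERnet; the Gaussian similarity kernel, the `bandwidth` and `weight` parameters, and the function name are all assumptions chosen to keep the example short.

```python
import numpy as np

def boosted_score(base_score, x, memory, bandwidth=1.0, weight=10.0):
    """Add a bonus to the unsupervised anomaly score when the feature vector x
    is close to any anomaly stored in memory."""
    if len(memory) == 0:
        return base_score
    squared_distances = np.sum((np.asarray(memory) - x) ** 2, axis=1)
    similarity = np.exp(-squared_distances.min() / (2 * bandwidth ** 2))
    return base_score + weight * similarity

# One anomaly that was overlooked earlier and has since been memorized.
memory = [[1.0, 1.0]]

near = boosted_score(0.2, np.array([1.0, 1.0]), memory)    # same pattern recurs
far = boosted_score(0.2, np.array([10.0, 10.0]), memory)   # unrelated input
```

Inputs far from every memorized anomaly keep their original score, so the unsupervised detector still handles unknown anomalies.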
  • 69. Proprietary + Confidential ❏ Complementary set PDF [Kawachi+, 2018], [Kawachi+, 2019] Normal vs. Complement [Kawachi+, 2018]: Y. Kawachi, et al., “Complementary Set Variational AutoEncoder for Supervised Anomaly Detection,” Proc. of ICASSP, 2018. [Kawachi+, 2019]: Y. Kawachi, et al., “A Two-Class Hyper-Spherical Autoencoder for Supervised Anomaly Detection,” Proc. of ICASSP, 2019. Normal Complement Anomaly > Normal
  • 70. Proprietary + Confidential Complementary set VAE [Kawachi+, 2018] [Kawachi+, 2018]: Y. Kawachi, et al., “Complementary Set Variational AutoEncoder for Supervised Anomaly Detection,” Proc. of ICASSP, 2018. Normal Complement Anomaly > Normal Cost = Reconstruction error + (Likelihood of normal or Likelihood of complement, depending on the label) ❏ Switch hidden space prior in VAE according to label
  • 71. Proprietary + Confidential Toy example [Kawachi+, 2018] [Kawachi+, 2018]: Y. Kawachi, et al., “Complementary Set Variational AutoEncoder for Supervised Anomaly Detection,” Proc. of ICASSP, 2018. ❏ MNIST example ❏ Normal: 0-8 ❏ Anomaly: 9 ❏ Visualizing hidden space ❏ Since normal prior is Gaussian, 0-8 have been placed around center of hidden space Anomaly Normal
  • 73. Proprietary + Confidential Anomaly detected!! ...but where…? Where is anomaly?? Photo by Magda Ehlers from Pexels
  • 74. Proprietary + Confidential Anomaly detected!! ...but where…? Where is anomaly?? Photo by Magda Ehlers from Pexels +sound localization Localization is also tackled in DCASE Challenge [Link]
  • 75. Proprietary + Confidential How anomalous?? Anomaly detected!! ...but how…? Photo by Andrea Piacquadio from Pexels
  • 76. Proprietary + Confidential How anomalous?? Anomaly detected!! ...but how…? Photo by Andrea Piacquadio from Pexels +audio captioning Captioning is also tackled in DCASE Challenge [Link] High frequency rubbing noise. It might be an anomaly in bearing.
  • 77. Proprietary + Confidential Conclusion ❏ Interesting and “tasty” problems ❏ Outlier-detection? Classification? ❏ Even the definition of the problem is uncertain ❏ Blue ocean ❏ Many unsolved problems and DCASE Challenge ❏ Domain adaptation, few-shot learning... ❏ Frontier ❏ Practically important but undeveloped research field ❏ Combining other DCASE tasks
  • 80. Thank You Yuma Koizumi Research Scientist @ Google Research Tokyo
  • 81. Proprietary + Confidential Problem on auto-encoder (supplement) ❏ Reconstruction error and energy of Boltzmann distribution ❏ MMSE-based training ignores normalizing constant KL-div. between PDF of normal MMSE Constraint for increasing total anomaly score = Increasing anomaly score of unknown samples
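The relation between reconstruction error and the Boltzmann distribution named on this slide can be written out as follows. This is a sketch under assumed notation, since the slide's equations did not survive extraction: A_θ(x) denotes the reconstruction error (anomaly score), Z(θ) the normalizing constant, and p_normal the true PDF of normal sounds.

```latex
% Treat the anomaly score as the energy of a Boltzmann distribution
p_\theta(x) = \frac{\exp\bigl(-A_\theta(x)\bigr)}{Z(\theta)},
\qquad
Z(\theta) = \int \exp\bigl(-A_\theta(x')\bigr)\,dx'

% KL divergence between the PDF of normal sounds and the model
\mathrm{KL}\bigl(p_{\text{normal}} \,\|\, p_\theta\bigr)
  = \underbrace{\mathbb{E}_{x \sim p_{\text{normal}}}\bigl[A_\theta(x)\bigr]}_{\text{MMSE training term}}
  + \underbrace{\log Z(\theta)}_{\text{ignored by MMSE training}}
  + \text{const.}
```

Minimizing the first term lowers the score of normal samples. Minimizing log Z(θ) requires exp(-A_θ(x)) to be small in total, which amounts to raising the anomaly score over the whole input space, including unknown samples. MMSE-based training drops this second term, so nothing prevents anomalies from being reconstructed well.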