Study Meeting Presentation:



Unsupervised Video Anomaly Detection: A brief overview

Author: Tiago Oliveira



Date: 2021/11/10 

Summary
1. Problem framing
2. Benchmark Datasets
3. How about constructing your own dataset?
4. Unsupervised Approaches
a. Convolutional LSTM Autoencoder
b. Memory-Augmented Autoencoder
c. Memory-augmented Conv2D Autoencoder (MemConv2DAE)
5. Experiment Results
6. Conclusions
1. Problem Framing
Identification of frames within a video containing anomalous events.
In surveillance videos:
Presence or absence of an object or movement of an object
In industrial process videos:
Irregularities in a process such as the shape of a flame
Zhu, S., Chen, C., & Sultani, W. (2020). Video Anomaly Detection for Smart Surveillance. http://arxiv.org/abs/2004.00222
This is a challenging task due to two major difficulties:
1. The data imbalance between positive (anomalous) and negative (normal) samples
2. The high variance within positive samples (although negative samples can also show high variance)
Usually addressed by:
● Training a model to represent normal events and considering the outliers as the anomalous events
● Identifying outliers by high scores in some form of reconstruction loss, or by low scores in metrics that are the inverse of the loss, such as the regularity score
Another aspect to consider is that a sample fed to an anomaly detection model usually has
four dimensions (excluding the batch size), namely:
T (temporal depth) x h (height) x w (width) x c (channels)
The unsupervised models follow an autoencoder configuration and the goal is to
reconstruct the input sequence.
Input sequence
Input sequence with skipping (because consecutive frames may contain redundant info)
[Figure labels: sequence size; skip; frames checked at prediction]
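The sequence construction above can be sketched in a few lines of NumPy. This is a minimal illustration and not the author's pipeline: the function name `build_clips` is hypothetical, and it assumes that "skip" means the step between sampled frames (so `skip=1` keeps consecutive frames) and that clips do not overlap.

```python
import numpy as np


def build_clips(frames, temporal_depth, skip):
    """Cut a video into clips of shape (T, h, w, c), sampling every `skip`-th frame.

    frames: array of shape (num_frames, h, w, c).
    Assumption: `skip` is the step between sampled frames (skip=1 keeps
    consecutive frames), and clips do not overlap.
    """
    frames = np.asarray(frames)
    span = (temporal_depth - 1) * skip + 1  # raw frames covered by one clip
    clips = [
        frames[start:start + span:skip]
        for start in range(0, len(frames) - span + 1, span)
    ]
    if not clips:
        return np.empty((0, temporal_depth) + frames.shape[1:], dtype=frames.dtype)
    return np.stack(clips)


# 100 dummy grayscale frames of 64x64 pixels, T=4, every 5th frame
video = np.zeros((100, 64, 64, 1), dtype=np.float32)
clips = build_clips(video, temporal_depth=4, skip=5)  # shape (6, 4, 64, 64, 1)
```

Each clip then has the T x h x w x c layout described earlier, with the batch dimension added in front.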
Abnormality score based on the losses e(t) of a set of sequences: s_a(t) = (e(t) - e(t)_min) / e(t)_max
Regularity score: s_r(t) = 1 - s_a(t)
Y. S. Chong and Y. H. Tay, “Abnormal Event Detection in Videos Using Spatiotemporal Autoencoder,” Advances in Neural Networks - ISNN 2017. pp. 189–196, 2017,
https://arxiv.org/abs/1701.01546
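As a rough sketch, the two scores can be computed from a list of per-sequence reconstruction errors as follows. The formulas follow the cited paper (Chong and Tay); the function names are hypothetical, and the code assumes the largest error is greater than zero.

```python
import numpy as np


def reconstruction_error(clip, reconstruction):
    """e(t): Euclidean distance between an input clip and its reconstruction."""
    diff = np.asarray(clip, dtype=np.float64) - np.asarray(reconstruction, dtype=np.float64)
    return float(np.sqrt(np.sum(diff ** 2)))


def regularity_scores(errors):
    """Map per-sequence errors e(t) to regularity scores s_r(t) = 1 - s_a(t)."""
    e = np.asarray(errors, dtype=np.float64)
    abnormality = (e - e.min()) / e.max()  # s_a(t); assumes e.max() > 0
    return 1.0 - abnormality


errors = [2.0, 2.5, 8.0, 4.0]
scores = regularity_scores(errors)  # the third sequence gets the lowest score
```

Because the normalisation uses the minimum and maximum error of the whole set, the scores are only comparable within one video or test run.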
2. Benchmark Datasets
Dataset | Total number of videos | Number of training videos | Number of test videos | Average number of frames per video | Number of anomalous frames | Abnormal events | Scenes | Anomaly examples
--- | --- | --- | --- | --- | --- | --- | --- | ---
UCSD Ped1 | 70 | 34 | 36 | 201 | 4,005 | 40 | Groups of people walking towards and away from the camera, and some amount of perspective distortion. | Bikers, small carts
UCSD Ped2 | 28 | 16 | 12 | 163 | 1,636 | 12 | Scenes with pedestrian movement parallel to the camera plane. | Bikers, small carts
Subway Entrance | 1 | -- | -- | 121,749 | 2,400 | 66 | People entering the subway | Wrong direction, no payment
Subway Exit | 1 | -- | -- | 64,901 | 720 | 19 | People exiting the subway | Wrong direction, no payment
CUHK Avenue | 37 | 16 | 21 | 30,652 | 3,820 | 47 | CUHK campus avenue videos | Run, throw, new object
Shanghai Tech | 437 | 330 | 107 | 317,398 | 17,090 | 130 | Scenes from the campus of ShanghaiTech | Bikers, cars
UCF Crime | 1,900 | 1,610 | 290 | 7,247 | -- | 13 | Videos covering 13 real-world anomaly events | Arson, accident, burglary, fighting
[Figure: Shanghai Tech]
[Figure: UCF Crime]
3. How about constructing your own dataset?
Motivation
● Lack of datasets that have scenes about industrial processes (which we care about at Ridge-i, given our projects)
● The need for an “easy” dataset with well-defined anomalies on which we can test different models
Method
● As a domain, we selected the operation of a domestic oven
○ It is an everyday object, so it is easily accessible
○ Allows for the regulation of flame intensity
○ It is possible to place contents inside and record their respective interaction with the flames
Oven3 Dataset
● Normal: flame at maximum size (73,964 frames)
● Anomaly: small flame (11,106 frames)
● Anomaly: smoke (14,529 frames)
● Anomaly: ash and flame deformation (5,780 frames)
The clips in the Oven3 dataset were recorded at 60 fps with a resolution of 1080x1920.
4. Unsupervised Approaches
Convolutional LSTM Autoencoder (ConvLSTMAE)
A spatiotemporal architecture with two main components: one
for spatial feature representation and one for learning the
temporal evolution of patterns.
Loss function
Y. S. Chong and Y. H. Tay, “Abnormal Event Detection in Videos Using Spatiotemporal Autoencoder,” Advances in Neural Networks - ISNN 2017. pp. 189–196, 2017,
https://arxiv.org/abs/1701.01546
Memory-augmented Autoencoder (MemAE)
Sometimes the ability of the autoencoder to generalize is
so powerful that it is capable of reconstructing
anomalous inputs very well.
The MemAE aims to address this issue.
D. Gong et al., “Memorizing Normality to Detect Anomaly: Memory-Augmented Deep Autoencoder for Unsupervised Anomaly Detection,” 2019 IEEE/CVF International
Conference on Computer Vision (ICCV). 2019, https://arxiv.org/abs/1904.02639
Latent representation
Entropy
Loss function
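The memory read at the heart of the MemAE can be sketched as below. This follows the description in the cited paper as I understand it: addressing weights come from a softmax over cosine similarities, small weights are removed by hard shrinkage, and the entropy of the weights is added to the reconstruction loss as a sparsity term. The function names and the default threshold are illustrative assumptions, and a single latent vector stands in for the full feature map.

```python
import numpy as np


def memae_read(z, memory, shrink_thres=0.0025, eps=1e-12):
    """Rebuild a latent vector z from memory items, MemAE-style.

    z: (C,) encoder output; memory: (N, C) memory items.
    Returns (z_hat, w_hat): the retrieved latent vector and the sparse
    addressing weights. `shrink_thres` is an illustrative default.
    """
    # Cosine similarity between the query and every memory item
    sim = memory @ z / (np.linalg.norm(memory, axis=1) * np.linalg.norm(z) + eps)
    # Softmax over memory items gives the addressing weights
    w = np.exp(sim - sim.max())
    w /= w.sum()
    # Hard shrinkage: weights at or below the threshold become zero
    w_hat = np.maximum(w - shrink_thres, 0.0) * w / (np.abs(w - shrink_thres) + eps)
    w_hat /= w_hat.sum() + eps  # re-normalise so the weights sum to 1
    return w_hat @ memory, w_hat


def addressing_entropy(w_hat, eps=1e-12):
    """Entropy of the addressing weights; low entropy means sparse addressing."""
    return float(-np.sum(w_hat * np.log(w_hat + eps)))


memory = np.eye(4)                       # four toy memory items
z = np.array([1.0, 0.0, 0.0, 0.0])       # query identical to the first item
z_hat, w_hat = memae_read(z, memory, shrink_thres=0.2)
```

Since the decoder only ever sees combinations of a few memorised normal patterns, an anomalous input tends to be reconstructed poorly, which is the effect the MemAE relies on.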
Robustness to the memory size (M): on the UCSD-Ped2 dataset, the AUC saturates at around M=1000.
Memory-augmented Conv2D Autoencoder (MemConv2DAE)
Unlike the MemAE, the MemConv2DAE uses the output of 2D convolutional layers as queries and adds feature compactness and separateness losses, which allows for a much smaller number of memory items (10 vs 2000 in the MemAE).
The model consists of three parts: an encoder, a memory module, and a decoder. The encoder extracts a query map q_t of size H x W x C from an input video frame I_t at time t. The memory module reads and updates memory items p_m of size 1 x 1 x C using the individual queries q_t^k of size 1 x 1 x C.
H. Park, J. Noh, and B. Ham, “Learning Memory-Guided Normality for Anomaly Detection,” 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition
(CVPR). 2020, https://arxiv.org/abs/2003.13228
Multi-loss function
Reconstruction loss
Feature compactness loss
Feature separateness loss
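A minimal NumPy sketch of the two memory-related loss terms is given below, based on my reading of the cited paper: the compactness term pulls each query towards its nearest memory item, and the separateness term is a triplet-style hinge that keeps the second-nearest item at least a margin further away. The function names, the dot-product matching, and the default margin and weights are assumptions for illustration, not values taken from the slides.

```python
import numpy as np


def memory_losses(queries, items, margin=1.0):
    """Feature compactness and separateness losses for one frame.

    queries: (K, C) array, the H*W query vectors of one frame.
    items:   (M, C) array of memory items, M >= 2.
    """
    # Matching scores between every query and every memory item
    scores = queries @ items.T               # (K, M)
    order = np.argsort(-scores, axis=1)      # best match first
    nearest = items[order[:, 0]]             # nearest item for each query
    second = items[order[:, 1]]              # second-nearest item for each query

    d_pos = np.linalg.norm(queries - nearest, axis=1)
    d_neg = np.linalg.norm(queries - second, axis=1)

    compactness = d_pos.sum()                # pull queries towards the nearest item
    separateness = np.maximum(d_pos - d_neg + margin, 0.0).sum()  # triplet hinge
    return float(compactness), float(separateness)


def total_loss(reconstruction, compactness, separateness, lambda_c=0.1, lambda_s=0.1):
    """Weighted sum of the three loss terms (weights are illustrative)."""
    return reconstruction + lambda_c * compactness + lambda_s * separateness


items = np.array([[1.0, 0.0], [0.0, 1.0]])
queries = np.array([[1.0, 0.0], [0.0, 2.0]])
compact, separate = memory_losses(queries, items)
```

Keeping the items well separated is what lets a handful of them cover the normal patterns, in line with the small memory size mentioned for this model.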
AUC scores (%) of the selected approaches in the benchmark datasets
Model | UCSD Ped1 | UCSD Ped2 | CUHK Avenue | Subway Entrance | Subway Exit | Shanghai Tech
--- | --- | --- | --- | --- | --- | ---
ConvLSTMAE | 89.9 | 87.4 | 80.3 | 84.7 | 94.0 | --
MemAE | -- | 94.1 | 83.3 | -- | -- | 71.2
MemConv2DAE | -- | 90.2 (Recon.), 97.0 (Pred.) | 82.8 (Recon.), 88.5 (Pred.) | -- | -- | 69.8 (Recon.), 70.5 (Pred.)
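For reference, a frame-level AUC like the ones reported here can be computed from per-frame scores and ground-truth labels. The sketch below uses the rank-based (Mann-Whitney) formulation in plain NumPy; the function name `roc_auc` and the toy data are hypothetical, and it assumes that the anomaly score is taken as one minus the regularity score.

```python
import numpy as np


def roc_auc(labels, anomaly_scores):
    """Area under the ROC curve via the rank (Mann-Whitney U) formulation.

    labels: 1 for anomalous frames, 0 for normal frames.
    anomaly_scores: higher means more anomalous, e.g. 1 - regularity score.
    """
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(anomaly_scores, dtype=np.float64)
    pos, neg = scores[labels], scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one anomalous and one normal frame")
    # Fraction of (anomalous, normal) pairs ranked correctly; ties count half
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))


regularity = np.array([0.95, 0.90, 0.30, 0.40, 0.85])
labels = np.array([0, 0, 1, 1, 0])
auc = roc_auc(labels, 1.0 - regularity)
```

The pairwise comparison needs memory proportional to the number of anomalous frames times the number of normal frames, so for long videos a sorting-based routine from a library is the more practical choice.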
5. Experiment Results
Baseline configuration for the Oven3 sequences
(established with the ConvLSTMAE)
● Temporal depth (T): 15 frames
● Skip: 15 frames
● Frame size: 64x64
○ Resizing frames to a smaller size improved anomaly detection; 64x64 was the smallest size that still brought an improvement
● Color space: grayscale
○ Grayscale usually produced better results than RGB, but RGB was always evaluated as well
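The frame preprocessing implied by this configuration could look roughly like the sketch below. The slides do not say which library or interpolation method was used, so the luminance weights and the nearest-neighbour resize are stand-ins chosen to keep the example dependency-free, and the function name `preprocess_frame` is hypothetical.

```python
import numpy as np


def preprocess_frame(frame, size=64):
    """Convert an RGB frame (h, w, 3), values 0-255, to (size, size, 1) in [0, 1]."""
    frame = np.asarray(frame, dtype=np.float32)
    # Luminance-weighted grayscale (ITU-R BT.601 weights)
    gray = frame @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    # Nearest-neighbour resize: pick `size` evenly spaced rows and columns
    rows = np.arange(size) * gray.shape[0] // size
    cols = np.arange(size) * gray.shape[1] // size
    small = gray[rows][:, cols]
    return (small / 255.0)[..., np.newaxis]


frame = np.full((1080, 1920, 3), 255, dtype=np.uint8)  # dummy white frame
x = preprocess_frame(frame)  # shape (64, 64, 1)
```

In practice an image library with proper interpolation would likely give smoother downsampled frames than this index-picking approach.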
Test sequence
● The lower the regularity score for anomalies the better
● The MemAE and the MemConv2DAE show lower regularity scores for the most subtle anomaly: small flame
● The MemConv2DAE shows overall lower scores for every anomaly and faster recoveries from anomaly to normal
No. | Model | Size | Color space | Temporal depth | Skip frames | AUC | Inference speed
--- | --- | --- | --- | --- | --- | --- | ---
1 | ConvLSTMAE | 64 | gray | 15 | 30 | 0.9350 | 13 fps
2 | ConvLSTMAE | 64 | RGB | 15 | 30 | 0.9456 | 13 fps
3 | MemAE | 64 | gray | 15 | 30 | 0.9442 | 165 fps
4 | MemAE | 64 | RGB | 15 | 30 | 0.9363 | 160 fps
5 | MemConv2DAE | 64 | gray | 15 | 30 | 0.9617 | 110 fps
6 | MemConv2DAE | 64 | RGB | 15 | 30 | 0.9639 | 104 fps
6. Conclusions
● The ConvLSTMAE is very robust to changes in the parameters of the training data and in the hyperparameters of the model: when faced with a new task, it is always worth trying this model!
● The MemAE and the MemConv2DAE (in RGB mode) are better than the ConvLSTMAE and are more sensitive to anomalies: they are good at detecting subtle anomalies!
● The MemAE was the fastest model overall.
● With the MemAE, it is necessary to pay attention to the learning rate (the lower the better) and the memory size (the larger the better, up to a certain point).
Acknowledgements
Thank you Abe-san and Motaz-san for the collaboration in the contents of this presentation.
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 

Recently uploaded (20)

Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 

Unsupervised Video Anomaly Detection: A brief overview

  • 1. Study Meeting Presentation:
 
 Unsupervised Video Anomaly Detection: A brief overview
 Author: Tiago Oliveira
 
 Date: 2021/11/10 

  • 2. Summary
 1. Problem framing
 2. Benchmark Datasets
 3. How about constructing your own dataset?
 4. Unsupervised Approaches
    a. Convolutional LSTM Autoencoder
    b. Memory-Augmented Autoencoder
    c. Memory-augmented Conv2D Autoencoder (MemConv2DAE)
 5. Experiment Results
 6. Conclusions
  • 3. 1. Problem Framing
 Identification of frames within a video containing anomalous events.
 In surveillance videos: presence or absence of an object, or movement of an object.
 In industrial process videos: irregularities in a process, such as the shape of a flame.
 Zhu, S., Chen, C., & Sultani, W. (2020). Video Anomaly Detection for Smart Surveillance. http://arxiv.org/abs/2004.00222
  • 4. 1. Problem Framing
 This is a challenging task due to two major difficulties:
 1. The class imbalance between positive (anomalous) and negative (normal) samples
 2. The high variance within positive samples (although negative samples can also show high variance)
 Usually addressed by:
 ● Training a model to represent normal events and treating the outliers as the anomalous events
 ● Identifying outliers by high values of some form of reconstruction loss, or by low values of metrics that are the inverse of the loss, such as the regularity score
 Zhu, S., Chen, C., & Sultani, W. (2020). Video Anomaly Detection for Smart Surveillance. http://arxiv.org/abs/2004.00222
  • 5. 1. Problem Framing
 Another aspect to consider is that a sample fed to an anomaly detection model usually has four dimensions (excluding the batch size), namely:
 T (temporal depth) x h (height) x w (width) x c (channels)
 The unsupervised models follow an autoencoder configuration and the goal is to reconstruct the input sequence.
 Zhu, S., Chen, C., & Sultani, W. (2020). Video Anomaly Detection for Smart Surveillance. http://arxiv.org/abs/2004.00222
  • 7. 1. Problem Framing
 Input sequence with skipping (because consecutive frames may contain redundant information)
 [Figure labels: sequence size; frames checked in the prediction; skip 1]
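The way input sequences are cut from a video can be sketched as follows. This is only an illustration, not the code used for the presentation: the function name `clip_indices` is hypothetical, and it assumes that "skip" is the step between sampled frames (so skip 1 means consecutive frames) and that sequences slide over the video one frame at a time.

```python
def clip_indices(num_frames, temporal_depth, skip):
    """Return one list of frame indices per input sequence.

    Assumption: `skip` is the step between sampled frames, so skip=1
    yields consecutive frames and skip=2 takes every second frame.
    """
    if temporal_depth < 1 or skip < 1:
        raise ValueError("temporal_depth and skip must be >= 1")
    # Number of raw frames spanned by one sequence.
    span = (temporal_depth - 1) * skip + 1
    # Slide the window over the video one frame at a time.
    return [
        list(range(start, start + span, skip))
        for start in range(0, num_frames - span + 1)
    ]


clips = clip_indices(num_frames=10, temporal_depth=3, skip=2)
print(clips[0])    # [0, 2, 4]
print(len(clips))  # 6
```

Each index list selects the T frames that are stacked into one T x h x w x c sample.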
  • 8. 1. Problem Framing
 Abnormality score based on the losses e(t) of a set of sequences: [Equation: abnormality score]
 Regularity score: [Equation: regularity score]
 Y. S. Chong and Y. H. Tay, “Abnormal Event Detection in Videos Using Spatiotemporal Autoencoder,” Advances in Neural Networks - ISNN 2017. pp. 189–196, 2017, https://arxiv.org/abs/1701.01546
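The two scores can be sketched in a few lines. The equations were not preserved in this transcript, so the sketch assumes the definitions I recall from the cited Chong and Tay paper: abnormality s_a(t) = (e(t) - e_min) / e_max and regularity s_r(t) = 1 - s_a(t), with the minimum and maximum taken over the test video. Please check them against the paper before relying on them; the function name is hypothetical.

```python
def regularity_scores(errors):
    """Map per-frame reconstruction errors e(t) to regularity scores.

    Assumed definitions (to be checked against the cited paper):
        s_a(t) = (e(t) - e_min) / e_max
        s_r(t) = 1 - s_a(t)
    """
    if not errors:
        return []
    e_min, e_max = min(errors), max(errors)
    if e_max == 0:
        # Perfect reconstruction everywhere: every frame is fully regular.
        return [1.0 for _ in errors]
    return [1.0 - (e - e_min) / e_max for e in errors]


print(regularity_scores([2.0, 4.0, 8.0]))  # [1.0, 0.75, 0.25]
```

Frames with a low regularity score are the candidates for anomalies.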
  • 9. 2. Benchmark Datasets
 Dataset | Total number of videos | Number of training videos | Number of test videos | Average number of frames per video | Number of anomalous frames | Abnormal events | Scenes | Anomaly examples
 UCSD Ped1 | 70 | 34 | 36 | 201 | 4,005 | 40 | Groups of people walking towards and away from the camera, and some amount of perspective distortion | Bikers, small carts
 UCSD Ped2 | 28 | 16 | 12 | 163 | 1,636 | 12 | Scenes with pedestrian movement parallel to the camera plane | Bikers, small carts
 Subway Entrance | 1 | -- | -- | 121,749 | 2,400 | 66 | People entering the subway | Wrong direction, no payment
 Subway Exit | 1 | -- | -- | 64,901 | 720 | 19 | People exiting the subway | Wrong direction, no payment
 CUHK Avenue | 37 | 16 | 21 | 30,652 | 3,820 | 47 | CUHK campus avenue videos | Run, throw, new object
 Shanghai Tech | 437 | 330 | 107 | 317,398 | 17,090 | 130 | Scenes from the campus of ShanghaiTech | Bikers, cars
 UCF Crime | 1,900 | 1,610 | 290 | 7,247 | -- | 13 | Videos covering 13 real-world anomaly events | Arson, accident, burglary, fighting
  • 12. 3. How about constructing your own dataset?
 Motivation
 ● Lack of datasets that have scenes about industrial processes (which we care about at Ridge-i, given our projects)
 ● The need for an “easy” dataset with well-defined anomalies on which we can test different models
 Method
 ● As a domain, we selected the operation of a domestic oven
   ○ It is an everyday object, so it is easily accessible
   ○ Allows for the regulation of flame intensity
   ○ It is possible to place contents inside and record their respective interaction with the flames
  • 13. 3. How about constructing your own dataset?
 Oven3 Dataset
 Normal: flame at maximum size (73,964 frames)
 Anomaly: small flame (11,106 frames)
 Anomaly: smoke (14,529 frames)
 Anomaly: ash and flame deformation (5,780 frames)
 The clips in the Oven3 dataset were recorded at 60 fps with a resolution of 1080x1920.
  • 14. 4. Unsupervised Approaches
 Convolutional LSTM Autoencoder (ConvLSTMAE)
 A spatiotemporal architecture with two main components: one for spatial feature representation and one for learning the temporal evolution of patterns.
 [Equation: loss function]
 Y. S. Chong and Y. H. Tay, “Abnormal Event Detection in Videos Using Spatiotemporal Autoencoder,” Advances in Neural Networks - ISNN 2017. pp. 189–196, 2017, https://arxiv.org/abs/1701.01546
  • 15. 4. Unsupervised Approaches
 Memory-augmented Autoencoder (MemAE)
 An autoencoder can generalize so well that it also reconstructs anomalous inputs accurately, which makes them hard to separate from normal inputs by reconstruction error. The MemAE aims to address this issue.
 D. Gong et al., “Memorizing Normality to Detect Anomaly: Memory-Augmented Deep Autoencoder for Unsupervised Anomaly Detection,” 2019 IEEE/CVF International Conference on Computer Vision (ICCV). 2019, https://arxiv.org/abs/1904.02639
  • 16. 4. Unsupervised Approaches
 Memory-augmented Autoencoder (MemAE)
 [Equations: latent representation; entropy; loss function]
 D. Gong et al., “Memorizing Normality to Detect Anomaly: Memory-Augmented Deep Autoencoder for Unsupervised Anomaly Detection,” 2019 IEEE/CVF International Conference on Computer Vision (ICCV). 2019, https://arxiv.org/abs/1904.02639
  • 17. 4. Unsupervised Approaches
 Memory-augmented Autoencoder (MemAE)
 Robustness to the memory size (M): in the UCSD-Ped2 dataset the AUC saturates at around M=1000.
 D. Gong et al., “Memorizing Normality to Detect Anomaly: Memory-Augmented Deep Autoencoder for Unsupervised Anomaly Detection,” 2019 IEEE/CVF International Conference on Computer Vision (ICCV). 2019, https://arxiv.org/abs/1904.02639
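The memory read of the MemAE can be sketched as below. This is a minimal illustration based on my reading of the cited Gong et al. paper, not the authors' implementation: the encoder output z is compared with every memory item by cosine similarity, a softmax turns the similarities into attention weights, an optional hard shrinkage zeroes out small weights, and the decoder input is the weighted sum of memory items. The entropy of the weights is the sparsity term added to the reconstruction loss. All function and parameter names (`address_memory`, `shrink`, `eps`) are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def address_memory(z, memory, shrink=0.0, eps=1e-12):
    """Return (attention weights, memory-based latent) for an encoding z."""
    sims = [cosine(z, m) for m in memory]
    top = max(sims)
    exps = [math.exp(s - top) for s in sims]  # numerically stable softmax
    total = sum(exps)
    w = [e / total for e in exps]
    if shrink > 0:
        # Hard shrinkage: weights at or below the threshold become zero.
        w = [max(wi - shrink, 0.0) * wi / (abs(wi - shrink) + eps) for wi in w]
        total = sum(w)
        if total > 0:
            w = [wi / total for wi in w]  # renormalise to sum to 1
    # Latent passed to the decoder: weighted sum of the memory items.
    z_hat = [sum(wi * m[d] for wi, m in zip(w, memory)) for d in range(len(z))]
    return w, z_hat


def entropy(w, eps=1e-12):
    """Entropy of the attention weights (the sparsity term of the loss)."""
    return -sum(wi * math.log(wi + eps) for wi in w)


memory = [[1.0, 0.0], [0.0, 1.0]]  # two toy memory items
weights, z_hat = address_memory([1.0, 0.0], memory, shrink=0.5)
print(weights, z_hat, entropy(weights))
```

Because the decoder only sees combinations of memorized normal patterns, an anomalous input tends to be reconstructed as something normal, which raises its reconstruction error.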
  • 18. 4. Unsupervised Approaches
 Memory-augmented Conv2D Autoencoder (MemConv2DAE)
 Unlike the MemAE, the MemConv2DAE uses the output of 2D convolutional layers as queries and adds feature compactness and separateness losses, which allows for a much smaller number of memory items (10 vs 2000 in the MemAE).
 The model consists of three parts: an encoder, a memory module, and a decoder. The encoder extracts a query map q_t of size H x W x C from an input video frame I_t at time t. The memory module reads and updates memory items p_m of size 1 x 1 x C using the individual queries of size 1 x 1 x C taken from each spatial position of q_t.
 H. Park, J. Noh, and B. Ham, “Learning Memory-Guided Normality for Anomaly Detection,” 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2020, https://arxiv.org/abs/2003.13228
  • 19. 4. Unsupervised Approaches
 Memory-augmented Conv2D Autoencoder (MemConv2DAE)
 Multi-loss function:
 ● Reconstruction loss
 ● Feature compactness loss
 ● Feature separateness loss
 H. Park, J. Noh, and B. Ham, “Learning Memory-Guided Normality for Anomaly Detection,” 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2020, https://arxiv.org/abs/2003.13228
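The two memory-related terms of the multi-loss function can be illustrated with a small sketch. It follows my understanding of the cited Park et al. paper, with one simplification that should be read as an assumption: the nearest and second-nearest memory items are picked by squared Euclidean distance, whereas the paper selects them through the matching probabilities. The compactness term pulls each query towards its nearest item, and the separateness term is a triplet-style margin loss that pushes the second-nearest item away. The function names, the margin and the loss weights are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memory_losses(queries, memory, margin=1.0):
    """Return (compactness, separateness) summed over all queries.

    Simplification: nearest items are chosen by squared Euclidean distance.
    """
    if len(memory) < 2:
        raise ValueError("at least two memory items are needed")
    compact, separate = 0.0, 0.0
    for q in queries:
        dists = sorted(sq_dist(q, p) for p in memory)
        nearest, second = dists[0], dists[1]
        compact += nearest                               # pull towards nearest item
        separate += max(nearest - second + margin, 0.0)  # push second-nearest away
    return compact, separate


def total_loss(reconstruction, compact, separate, w_compact, w_separate):
    """Weighted sum of the three loss terms (the weights are hyperparameters)."""
    return reconstruction + w_compact * compact + w_separate * separate


queries = [[0.0, 0.0], [1.0, 1.0]]
memory = [[0.0, 0.0], [2.0, 2.0]]
print(memory_losses(queries, memory))  # (2.0, 1.0)
```

In the toy example the second query lies exactly between the two items, so it contributes both to compactness (distance 2.0) and to separateness (the full margin).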
  • 20. 4. Unsupervised Approaches
 AUC scores (%) of the selected approaches in the benchmark datasets
 Model | UCSD Ped1 | UCSD Ped2 | CUHK Avenue | Subway Entrance | Subway Exit | Shanghai Tech
 ConvLSTMAE | 89.9 | 87.4 | 80.3 | 84.7 | 94.0 | --
 MemAE | -- | 94.1 | 83.3 | -- | -- | 71.2
 MemConv2DAE | -- | 90.2 (Recon.), 97.0 (Pred.) | 82.8 (Recon.), 88.5 (Pred.) | -- | -- | 69.8 (Recon.), 70.5 (Pred.)
  • 21. 5. Experiment Results
 Baseline configuration for the Oven3 sequences (established with the ConvLSTMAE)
 ● Temporal depth (T): 15 frames
 ● Skip: 15 frames
 ● Frame size: 64x64
   ○ Resizing frames to a smaller size improved the detection of anomalies; 64x64 was the smallest size that still brought an improvement
 ● Color space: grayscale
   ○ Grayscale usually produced better results than RGB, but RGB was always considered
 [Figure: test sequence]
  • 22. 5. Experiment Results
 ● The lower the regularity score for anomalies, the better
 ● The MemAE and the MemConv2DAE show lower regularity scores for the most subtle anomaly: small flame
 ● The MemConv2DAE shows overall lower scores for every anomaly and faster recoveries from anomaly to normal
  • 23. 5. Experiment Results
 No. | Model | Size | Color space | Temporal depth | Skip frames | AUC | Inference speed
 1 | ConvLSTMAE | 64 | gray | 15 | 30 | 0.9350 | 13 fps
 2 | ConvLSTMAE | 64 | RGB | 15 | 30 | 0.9456 | 13 fps
 3 | MemAE | 64 | gray | 15 | 30 | 0.9442 | 165 fps
 4 | MemAE | 64 | RGB | 15 | 30 | 0.9363 | 160 fps
 5 | MemConv2DAE | 64 | gray | 15 | 30 | 0.9617 | 110 fps
 6 | MemConv2DAE | 64 | RGB | 15 | 30 | 0.9639 | 104 fps
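For readers who want to reproduce this kind of comparison, the following sketch shows one way to compute an AUC from per-frame scores. The slides do not state how the AUC was calculated, so this assumes a frame-level ROC AUC with label 1 for anomalous frames and an abnormality score of 1 minus the regularity score. The function name is hypothetical, and the pairwise loop is quadratic, so it is meant for illustration on small inputs only.

```python
def roc_auc(labels, scores):
    """Frame-level ROC AUC via pairwise comparison.

    Equals the probability that a randomly chosen anomalous frame
    (label 1) has a higher score than a randomly chosen normal frame
    (label 0); ties count as one half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]                      # 1 = anomalous frame
regularity = [0.9, 0.3, 0.35, 0.1]         # toy regularity scores
abnormality = [1.0 - r for r in regularity]
print(roc_auc(labels, abnormality))        # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of anomalous and normal frames.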
  • 24. 6. Conclusions
 ● The ConvLSTMAE is very robust to changes in the parameters of the training data and in the hyperparameters of the model. When faced with a new task, it is always worth trying this model!
 ● The MemAE and the MemConv2DAE (in RGB mode) are better than the ConvLSTMAE and are more sensitive to anomalies. They are good at detecting subtle anomalies!
 ● The MemAE was the fastest model overall.
 ● In the MemAE it is necessary to pay attention to the learning rate (the lower the better) and the memory size (the larger the better, up to a certain point).
  • 25. Acknowledgements
 Thank you Abe-san and Motaz-san for the collaboration on the contents of this presentation.
  • 26. Study Meeting Presentation:
 
 Unsupervised Video Anomaly Detection: A brief overview
 Author: Tiago Oliveira
 
 Date: 2021/11/10