Study Meeting Presentation:



Unsupervised Video Anomaly Detection: A brief overview

Author: Tiago Oliveira



Date: 2021/11/10 

Summary
1. Problem framing
2. Benchmark Datasets
3. How about constructing your own dataset?
4. Unsupervised Approaches
a. Convolutional LSTM Autoencoder
b. Memory-Augmented Autoencoder
c. Memory-augmented Conv2D Autoencoder (MemConv2DAE)
5. Experiment Results
6. Conclusions
1. Problem Framing
Identification of frames within a video containing anomalous events.
In surveillance videos:
Presence or absence of an object or movement of an object
In industrial process videos:
Irregularities in a process such as the shape of a flame
Zhu, S., Chen, C., & Sultani, W. (2020). Video Anomaly Detection for Smart Surveillance. http://arxiv.org/abs/2004.00222
This is a challenging task due to two major difficulties:
1. The data imbalance between positive (anomalous) and negative (normal) samples
2. The high variance within positive samples (although negative samples can also show high variance)
Usually addressed by:
● Training a model to represent normal events and treating the outliers as anomalous events
● Outliers are identified by a high value of some form of reconstruction loss or, equivalently, by a low value of a
metric that decreases as the loss grows, such as the regularity score
Another aspect to consider is that a sample fed to an anomaly detection model usually has
four dimensions (excluding the batch size), namely:
T (temporal depth) x h (height) x w (width) x c (channels)
The unsupervised models follow an autoencoder configuration and the goal is to
reconstruct the input sequence.
[Figure: input sequence]
[Figure: input sequence with skipping, used because consecutive frames may contain redundant information. Labels: sequence size, skip, frames checked at prediction]
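The input sequences described above can be sketched as follows. This is a minimal illustration, assuming that "skip" means the stride between the frames sampled into one sequence; the function name `build_sequences` and the dummy video are hypothetical and not taken from the slides.

```python
import numpy as np

def build_sequences(frames, temporal_depth, skip):
    """Build clips of shape (T, h, w, c) by sampling every `skip`-th frame.

    frames: array of shape (N, h, w, c).
    Returns an array of shape (num_clips, temporal_depth, h, w, c).
    """
    frames = np.asarray(frames)
    # Number of raw frames covered by one clip (assumption: skip is a stride).
    span = (temporal_depth - 1) * skip + 1
    starts = range(0, len(frames) - span + 1)
    clips = [frames[s : s + span : skip] for s in starts]
    if not clips:
        # Video too short for even one clip: return an empty batch.
        return np.empty((0, temporal_depth) + frames.shape[1:], dtype=frames.dtype)
    return np.stack(clips)

# Dummy grayscale video: 100 frames of 64x64 with one channel.
video = np.zeros((100, 64, 64, 1), dtype=np.float32)
clips = build_sequences(video, temporal_depth=15, skip=2)
print(clips.shape)  # (num_clips, T, h, w, c)
```

With `skip=1` the clips consist of consecutive frames; larger values trade temporal resolution for less redundancy within each clip.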
Abnormality Score based on the losses of set of sequences e(t):
Regularity Score:
Y. S. Chong and Y. H. Tay, “Abnormal Event Detection in Videos Using Spatiotemporal Autoencoder,” Advances in Neural Networks - ISNN 2017. pp. 189–196, 2017,
https://arxiv.org/abs/1701.01546
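The formulas for the two scores did not survive in the text. Assuming the definitions of the cited Chong and Tay paper, e(t) is the L2 reconstruction error of the sequence at time t, the abnormality score is s_a(t) = (e(t) - min e) / max e, and the regularity score is s_r(t) = 1 - s_a(t). A minimal sketch under that assumption (the function name and the zero-error guard are illustrative):

```python
import numpy as np

def regularity_scores(inputs, reconstructions):
    """Return (e, s_a, s_r) for a batch of sequences.

    inputs, reconstructions: arrays of identical shape (num_sequences, ...).
    """
    inputs = np.asarray(inputs, dtype=np.float64)
    reconstructions = np.asarray(reconstructions, dtype=np.float64)
    # e(t): L2 norm of the reconstruction residual of each sequence.
    diff = (inputs - reconstructions).reshape(len(inputs), -1)
    e = np.linalg.norm(diff, axis=1)
    e_max = e.max()
    if e_max == 0:
        # Perfect reconstruction everywhere: nothing is abnormal.
        return e, np.zeros_like(e), np.ones_like(e)
    s_a = (e - e.min()) / e_max   # abnormality score
    s_r = 1.0 - s_a               # regularity score
    return e, s_a, s_r
```

Note that the normalization uses the minimum and maximum error over the whole set of sequences, so the scores are relative to the video being scored.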
2. Benchmark Datasets
Dataset | Total number of videos | Number of training videos | Number of test videos | Average number of frames per video | Number of anomalous frames | Abnormal events | Scenes | Anomaly examples
--- | --- | --- | --- | --- | --- | --- | --- | ---
UCSD Ped1 | 70 | 34 | 36 | 201 | 4,005 | 40 | Groups of people walking towards and away from the camera, and some amount of perspective distortion. | Bikers, small carts
UCSD Ped2 | 28 | 16 | 12 | 163 | 1,636 | 12 | Scenes with pedestrian movement parallel to the camera plane. | Bikers, small carts
Subway Entrance | 1 | -- | -- | 121,749 | 2,400 | 66 | People entering the subway | Wrong direction, no payment
Subway Exit | 1 | -- | -- | 64,901 | 720 | 19 | People exiting the subway | Wrong direction, no payment
CUHK Avenue | 37 | 16 | 21 | 30,652 | 3,820 | 47 | CUHK campus avenue videos | Run, throw, new object
Shanghai Tech | 437 | 330 | 107 | 317,398 | 17,090 | 130 | Scenes from the campus of ShanghaiTech | Bikers, cars
UCF Crime | 1,900 | 1,610 | 290 | 7,247 | -- | 13 | Videos covering 13 real-world anomaly events | Arson, accident, burglary, fighting
[Figure: Shanghai Tech]
[Figure: UCF Crime]
3. How about constructing your own dataset?
Motivation
● Lack of datasets that have scenes about industrial processes (which we care about at Ridge-i, given our projects)
● The need for an “easy” dataset with well-defined anomalies on which we can test different models
Method
● As a domain, we selected the operation of a domestic oven
○ It is an everyday object, so it is easily accessible
○ Allows for the regulation of flame intensity
○ It is possible to place contents inside and record their respective interaction with the flames
Oven3 Dataset
Class | Description | Number of frames
--- | --- | ---
Normal | Flame at maximum size | 73,964
Anomaly | Small flame | 11,106
Anomaly | Smoke | 14,529
Anomaly | Ash and flame deformation | 5,780
The clips in the Oven3 dataset were recorded at 60 fps with a resolution of 1080x1920.
4. Unsupervised Approaches
Convolutional LSTM Autoencoder (ConvLSTMAE)
A spatiotemporal architecture with two main components: one
for spatial feature representation and one for learning the
temporal evolution of patterns.
Loss function
Y. S. Chong and Y. H. Tay, “Abnormal Event Detection in Videos Using Spatiotemporal Autoencoder,” Advances in Neural Networks - ISNN 2017. pp. 189–196, 2017,
https://arxiv.org/abs/1701.01546
Memory-augmented Autoencoder (MemAE)
Sometimes the ability of the autoencoder to generalize is
so powerful that it is capable of reconstructing
anomalous inputs very well.
The MemAE aims to address this issue.
D. Gong et al., “Memorizing Normality to Detect Anomaly: Memory-Augmented Deep Autoencoder for Unsupervised Anomaly Detection,” 2019 IEEE/CVF International
Conference on Computer Vision (ICCV). 2019, https://arxiv.org/abs/1904.02639
Latent representation
Entropy
Loss function
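The equations behind these three labels did not survive in the text. The following is a minimal sketch of the memory read described in the cited Gong et al. paper, assuming cosine-similarity addressing followed by a softmax, hard shrinkage of small weights, and an entropy penalty added to the reconstruction loss. The function names, `eps`, and the example values of `shrink_thres` and `alpha` are illustrative assumptions, not values from the slides.

```python
import numpy as np

def memae_read(z, memory, shrink_thres, eps=1e-12):
    """Read from memory. z: (B, C) encodings; memory: (N, C) items.

    Returns the latent representation z_hat (B, C) and weights w_hat (B, N).
    """
    z_n = z / (np.linalg.norm(z, axis=1, keepdims=True) + eps)
    m_n = memory / (np.linalg.norm(memory, axis=1, keepdims=True) + eps)
    sim = z_n @ m_n.T                              # cosine similarity, (B, N)
    sim = sim - sim.max(axis=1, keepdims=True)     # numerically stable softmax
    w = np.exp(sim)
    w = w / w.sum(axis=1, keepdims=True)
    # Hard shrinkage: weights at or below the threshold become zero.
    shifted = w - shrink_thres
    w_hat = np.maximum(shifted, 0.0) * w / (np.abs(shifted) + eps)
    w_hat = w_hat / (w_hat.sum(axis=1, keepdims=True) + eps)  # L1 re-normalization
    z_hat = w_hat @ memory                         # sparse combination of memory items
    return z_hat, w_hat

def entropy_loss(w_hat, eps=1e-12):
    """Mean entropy of the addressing weights (lower means sparser addressing)."""
    return float(np.mean(np.sum(-w_hat * np.log(w_hat + eps), axis=1)))

def memae_loss(x, x_hat, w_hat, alpha):
    """Squared reconstruction error plus alpha times the entropy term; x is (B, D)."""
    rec = float(np.mean(np.sum((x - x_hat) ** 2, axis=1)))
    return rec + alpha * entropy_loss(w_hat)

# Illustrative usage with random data (N = 100 items, C = 8 channels).
rng = np.random.default_rng(0)
memory = rng.normal(size=(100, 8))
z = rng.normal(size=(4, 8))
z_hat, w_hat = memae_read(z, memory, shrink_thres=1.0 / 100)
print(z_hat.shape, w_hat.shape)
```

Because the decoder only sees combinations of a few memory items learned from normal data, an anomalous input tends to be reconstructed as something normal, which raises its reconstruction error.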
Robustness to the memory size (M): on the UCSD Ped2 dataset,
the AUC saturates at around M = 1000.
Memory-augmented Conv2D Autoencoder (MemConv2DAE)
Unlike the MemAE, the MemConv2DAE uses the output of 2D convolutional layers as queries and
features compactness and separateness losses, allowing for a much smaller number of memory
items (10 vs 2000 in the MemAE).
The model consists of three parts: an encoder, a memory module, and a decoder. The encoder
extracts a query map q_t of size H x W x C from an input video frame I_t at time t. The memory module
reads and updates M memory items p_m, each of size 1 x 1 x C, using the K = H x W individual queries of size 1 x 1 x C taken from q_t.
H. Park, J. Noh, and B. Ham, “Learning Memory-Guided Normality for Anomaly Detection,” 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition
(CVPR). 2020, https://arxiv.org/abs/2003.13228
Multi-loss function
Reconstruction loss
Feature compactness loss
Feature separateness loss
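The two feature losses can be sketched as follows, assuming the formulation of the cited Park et al. paper: each query is matched to its nearest memory item by dot-product similarity, the compactness loss pulls the query towards that item, and the separateness loss is a triplet-style margin loss against the second-nearest item. The function name, the `margin` default, and the omission of the softmax over similarities (which does not change the ranking) are simplifications made here.

```python
import numpy as np

def memory_feature_losses(queries, items, margin=1.0):
    """queries: (K, C) individual queries; items: (M, C) memory items, M >= 2.

    Returns (compactness_loss, separateness_loss) summed over the K queries.
    """
    queries = np.asarray(queries, dtype=np.float64)
    items = np.asarray(items, dtype=np.float64)
    sim = queries @ items.T                     # dot-product similarity, (K, M)
    order = np.argsort(-sim, axis=1)            # most similar item first
    nearest, second = order[:, 0], order[:, 1]
    # L2 distance from each query to its nearest and second-nearest items.
    d_pos = np.linalg.norm(queries - items[nearest], axis=1)
    d_neg = np.linalg.norm(queries - items[second], axis=1)
    compact = float(np.sum(d_pos ** 2))
    separate = float(np.sum(np.maximum(d_pos - d_neg + margin, 0.0)))
    return compact, separate
```

The multi-loss function is then a weighted sum of the reconstruction loss and these two terms; the weights are hyperparameters that the slides do not state.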
AUC scores (%) of the selected approaches on the benchmark datasets
Model | UCSD Ped1 | UCSD Ped2 | CUHK Avenue | Subway Entrance | Subway Exit | Shanghai Tech
--- | --- | --- | --- | --- | --- | ---
ConvLSTMAE | 89.9 | 87.4 | 80.3 | 84.7 | 94.0 | --
MemAE | -- | 94.1 | 83.3 | -- | -- | 71.2
MemConv2DAE | -- | 90.2 (Recon.), 97.0 (Pred.) | 82.8 (Recon.), 88.5 (Pred.) | -- | -- | 69.8 (Recon.), 70.5 (Pred.)
5. Experiment Results
Baseline configuration for the Oven3 sequences
(established with the ConvLSTMAE)
● Temporal depth (T): 15 frames
● Skip: 15 frames
● Frame size: 64x64
○ Resizing frames to a smaller size improved the detection of
anomalies; the smallest size that still brought an improvement was 64x64
● Color space: grayscale
○ Grayscale usually produced better results than RGB, but RGB
was always evaluated as well
Test sequence
● The lower the regularity score for anomalies the better
● The MemAE and the MemConv2DAE show lower regularity scores for the most subtle anomaly: small flame
● The MemConv2DAE shows overall lower scores for every anomaly and faster recoveries from anomaly to normal
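No. | Model | Size | Color space | Temporal depth | Skip frames | AUC | Inference speed
--- | --- | --- | --- | --- | --- | --- | ---
1 | ConvLSTMAE | 64 | gray | 15 | 30 | 0.9350 | 13 fps
2 | ConvLSTMAE | 64 | RGB | 15 | 30 | 0.9456 | 13 fps
3 | MemAE | 64 | gray | 15 | 30 | 0.9442 | 165 fps
4 | MemAE | 64 | RGB | 15 | 30 | 0.9363 | 160 fps
5 | MemConv2DAE | 64 | gray | 15 | 30 | 0.9617 | 110 fps
6 | MemConv2DAE | 64 | RGB | 15 | 30 | 0.9639 | 104 fps
(Size, color space, temporal depth, and skip frames together form the dataset configuration.)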
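The slides do not state how the AUC values were computed. One common choice is a frame-level (or sequence-level) ROC AUC over anomaly scores, for example 1 minus the regularity score. The sketch below assumes that setup and uses the rank-based (Mann-Whitney) formulation; the function name is hypothetical.

```python
import numpy as np

def anomaly_auc(anomaly_scores, labels):
    """ROC AUC from scores where higher means more anomalous.

    labels: 1 for anomalous samples, 0 for normal samples.
    """
    scores = np.asarray(anomaly_scores, dtype=np.float64)
    labels = np.asarray(labels, dtype=int)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("AUC needs at least one anomalous and one normal sample")
    # Fraction of (anomalous, normal) pairs ranked correctly; ties count half.
    greater = np.sum(pos[:, None] > neg[None, :])
    ties = np.sum(pos[:, None] == neg[None, :])
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))
```

The pairwise comparison needs memory proportional to the number of anomalous samples times the number of normal samples, so for long videos a sort-based routine such as `sklearn.metrics.roc_auc_score` is the more practical option.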
6. Conclusions
● The ConvLSTMAE is very robust to changes in the parameters of the training data and in the hyperparameters of the model. When faced
with a new task, it is always worth trying this model!
● The MemAE and the MemConv2DAE (in RGB mode) are better than the ConvLSTMAE and are more sensitive to anomalies, so they are
good at detecting subtle anomalies!
● The MemAE was the fastest model overall.
● With the MemAE, it is necessary to pay attention to the learning rate (the lower the better) and the memory size (the larger the better,
up to a certain point).
Acknowledgements
Thank you, Abe-san and Motaz-san, for collaborating on the contents of this presentation.
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 

Video Anomaly Detection Overview

  • 1. Study Meeting Presentation:
 
 Unsupervised Video Anomaly Detection: A brief overview
 Author: Tiago Oliveira
 
 Date: 2021/11/10 

  • 2. Summary
 1. Problem framing
 2. Benchmark Datasets
 3. How about constructing your own dataset?
 4. Unsupervised Approaches
    a. Convolutional LSTM Autoencoder
    b. Memory-Augmented Autoencoder
    c. Memory-augmented Conv2D Autoencoder (MemConv2DAE)
 5. Experiment Results
 6. Conclusions
  • 3. 1. Problem Framing
 Identification of frames within a video containing anomalous events.
 In surveillance videos: presence or absence of an object, or movement of an object.
 In industrial process videos: irregularities in a process, such as the shape of a flame.
 Zhu, S., Chen, C., & Sultani, W. (2020). Video Anomaly Detection for Smart Surveillance. http://arxiv.org/abs/2004.00222
  • 4. 1. Problem Framing
 This is a challenging task due to two major difficulties:
 1. The data imbalance between positive (anomalous) and negative (normal) samples
 2. The high variance within positive samples (although negative samples can also show high variance)
 Usually addressed by:
 ● Training a model to represent normal events and treating the outliers as the anomalous events
 ● Identifying outliers by high values of some form of reconstruction loss, or by low values of metrics that decrease as the loss grows, such as the regularity score
 Zhu, S., Chen, C., & Sultani, W. (2020). Video Anomaly Detection for Smart Surveillance. http://arxiv.org/abs/2004.00222
  • 5. 1. Problem Framing
 Another aspect to consider is that a sample fed to an anomaly detection model usually has four dimensions (excluding the batch size), namely:
 T (temporal depth) x h (height) x w (width) x c (channels)
 The unsupervised models follow an autoencoder configuration, and the goal is to reconstruct the input sequence.
 Zhu, S., Chen, C., & Sultani, W. (2020). Video Anomaly Detection for Smart Surveillance. http://arxiv.org/abs/2004.00222
  • 7. 1. Problem Framing
 Input sequence with skipping (because consecutive frames may contain redundant information)
 [Figure labels: sequence size; frames checked in the prediction; skip 1]
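The input construction described on this slide can be sketched as follows. This is a minimal illustration, not the presenter's code: the function name `make_sequences` is hypothetical, and a skip of s is assumed here to mean "take every s-th frame" (the slide does not define it precisely).

```python
import numpy as np

def make_sequences(frames, temporal_depth, skip):
    """Build clips of `temporal_depth` frames, sampling every `skip`-th frame.

    frames: array of shape (N, h, w, c).
    Returns an array of shape (num_clips, T, h, w, c).
    Assumption: `skip` is the step between sampled frames (skip=1 keeps all).
    """
    span = (temporal_depth - 1) * skip + 1  # raw frames covered by one clip
    if len(frames) < span:
        raise ValueError("video is too short for this temporal depth and skip")
    # Slide a window over the video one raw frame at a time.
    clips = [frames[s : s + span : skip] for s in range(len(frames) - span + 1)]
    return np.stack(clips)

# Example: 100 grayscale 64x64 frames, T = 15, every 2nd frame.
video = np.zeros((100, 64, 64, 1), dtype=np.float32)
clips = make_sequences(video, temporal_depth=15, skip=2)
print(clips.shape)  # (72, 15, 64, 64, 1)
```

A larger skip widens the time span each clip covers without increasing T, which is one way to reduce the redundancy between consecutive frames mentioned above.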
  • 8. 1. Problem Framing
 Abnormality score, based on the losses e(t) of a set of sequences: [equation image]
 Regularity score: [equation image]
 Y. S. Chong and Y. H. Tay, “Abnormal Event Detection in Videos Using Spatiotemporal Autoencoder,” Advances in Neural Networks - ISNN 2017. pp. 189–196, 2017, https://arxiv.org/abs/1701.01546
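The equations on this slide did not survive in the transcript. As a hedged sketch, the code below follows the min-max style normalization as I read it in the cited Chong and Tay paper: the abnormality score is (e(t) - min e) / max e and the regularity score is 1 minus that. The function names are hypothetical, and e(t) is assumed to be the L2 reconstruction error of each input.

```python
import numpy as np

def reconstruction_errors(inputs, reconstructions):
    """e(t): L2 norm of the difference between each input and its reconstruction."""
    diff = (inputs - reconstructions).reshape(len(inputs), -1)
    return np.linalg.norm(diff, axis=1)

def regularity_scores(errors):
    """Map reconstruction errors to regularity scores (high = normal).

    Assumes at least one error is non-zero, otherwise the division fails.
    """
    errors = np.asarray(errors, dtype=np.float64)
    abnormality = (errors - errors.min()) / errors.max()
    return 1.0 - abnormality

scores = regularity_scores([1.0, 2.0, 4.0])
print(scores)  # [1.   0.75 0.25]
```

Because the normalization uses the minimum and maximum over the whole test video, scores from different videos are not directly comparable unless the same normalization constants are shared.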
  • 9. 2. Benchmark Datasets
 Dataset | Total number of videos | Number of training videos | Number of test videos | Average number of frames per video | Number of anomalous frames | Abnormal events | Scenes | Anomaly examples
 UCSD Ped1 | 70 | 34 | 36 | 201 | 4,005 | 40 | Groups of people walking towards and away from the camera, and some amount of perspective distortion | Bikers, small carts
 UCSD Ped2 | 28 | 16 | 12 | 163 | 1,636 | 12 | Scenes with pedestrian movement parallel to the camera plane | Bikers, small carts
 Subway Entrance | 1 | -- | -- | 121,749 | 2,400 | 66 | People entering the subway | Wrong direction, no payment
 Subway Exit | 1 | -- | -- | 64,901 | 720 | 19 | People exiting the subway | Wrong direction, no payment
 CUHK Avenue | 37 | 16 | 21 | 30,652 | 3,820 | 47 | CUHK campus avenue videos | Run, throw, new object
 Shanghai Tech | 437 | 330 | 107 | 317,398 | 17,090 | 130 | Scenes from the campus of ShanghaiTech | Bikers, cars
 UCF Crime | 1,900 | 1,610 | 290 | 7,247 | -- | 13 | Videos covering 13 real-world anomaly events | Arson, accident, burglary, fighting
  • 12. 3. How about constructing your own dataset?
 Motivation
 ● Lack of datasets with scenes of industrial processes (which we care about at Ridge-i, given our projects)
 ● The need for an “easy” dataset with well-defined anomalies on which we can test different models
 Method
 ● As a domain, we selected the operation of a domestic oven
   ○ It is an everyday object, so it is easily accessible
   ○ It allows the flame intensity to be regulated
   ○ It is possible to place contents inside and record their interaction with the flames
  • 13. 3. How about constructing your own dataset?
 Oven3 Dataset
 Normal: flame at maximum size, 73,964 frames
 Anomaly: small flame, 11,106 frames
 Anomaly: smoke, 14,529 frames
 Anomaly: ash and flame deformation, 5,780 frames
 The clips in the Oven3 dataset were recorded at 60 fps with a resolution of 1080x1920.
  • 14. 4. Unsupervised Approaches
 Convolutional LSTM Autoencoder (ConvLSTMAE)
 A spatiotemporal architecture with two main components: one for spatial feature representation and one for learning the temporal evolution of patterns.
 Loss function: [equation image]
 Y. S. Chong and Y. H. Tay, “Abnormal Event Detection in Videos Using Spatiotemporal Autoencoder,” Advances in Neural Networks - ISNN 2017. pp. 189–196, 2017, https://arxiv.org/abs/1701.01546
  • 15. 4. Unsupervised Approaches
 Memory-augmented Autoencoder (MemAE)
 Sometimes an autoencoder generalizes so well that it also reconstructs anomalous inputs accurately, so their reconstruction error no longer stands out. The MemAE aims to address this issue.
 D. Gong et al., “Memorizing Normality to Detect Anomaly: Memory-Augmented Deep Autoencoder for Unsupervised Anomaly Detection,” 2019 IEEE/CVF International Conference on Computer Vision (ICCV). 2019, https://arxiv.org/abs/1904.02639
  • 16. 4. Unsupervised Approaches
 Memory-augmented Autoencoder (MemAE)
 [Figure and equations: latent representation; entropy; loss function]
 D. Gong et al., “Memorizing Normality to Detect Anomaly: Memory-Augmented Deep Autoencoder for Unsupervised Anomaly Detection,” 2019 IEEE/CVF International Conference on Computer Vision (ICCV). 2019, https://arxiv.org/abs/1904.02639
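The slide only names the latent representation, the entropy term, and the loss function. As a rough sketch based on my reading of the cited MemAE paper (not on the slides), the memory read could look like the code below: the encoder output z addresses the memory by cosine similarity, the attention weights are hard-shrunk to keep them sparse, and the decoder receives the weighted sum of memory items. The function name `memory_read` and the threshold value are illustrative assumptions.

```python
import numpy as np

def memory_read(z, memory, shrink_thres, eps=1e-12):
    """Rebuild a latent vector from memory items (assumed MemAE-style read).

    z: encoder output, shape (C,). memory: shape (N, C).
    Returns (z_hat, weights, entropy of the weights).
    """
    # Cosine similarity between the query and every memory item.
    sim = memory @ z / (np.linalg.norm(memory, axis=1) * np.linalg.norm(z) + eps)
    # Softmax over memory items gives the addressing weights.
    w = np.exp(sim - sim.max())
    w /= w.sum()
    # Hard shrinkage: weights below the threshold become zero.
    w_hat = np.maximum(w - shrink_thres, 0.0) * w / (np.abs(w - shrink_thres) + eps)
    w_hat /= w_hat.sum() + eps  # re-normalize to sum to (almost exactly) one
    z_hat = w_hat @ memory      # latent representation passed to the decoder
    entropy = -np.sum(w_hat * np.log(w_hat + eps))
    return z_hat, w_hat, entropy

memory = np.eye(4)                      # 4 toy memory items of dimension 4
z = np.array([1.0, 0.0, 0.0, 0.0])
z_hat, w_hat, entropy = memory_read(z, memory, shrink_thres=0.25)
print(z_hat, entropy)
```

Under this reading, the training loss would be the reconstruction error plus a small multiple of the entropy term, which pushes each input to be rebuilt from only a few memory items.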
  • 17. 4. Unsupervised Approaches
 Memory-augmented Autoencoder (MemAE)
 Robustness to the memory size (M): in the UCSD-Ped2 dataset the AUC saturates at around M=1000.
 D. Gong et al., “Memorizing Normality to Detect Anomaly: Memory-Augmented Deep Autoencoder for Unsupervised Anomaly Detection,” 2019 IEEE/CVF International Conference on Computer Vision (ICCV). 2019, https://arxiv.org/abs/1904.02639
  • 18. 4. Unsupervised Approaches
 Memory-augmented Conv2D Autoencoder (MemConv2DAE)
 Unlike the MemAE, the MemConv2DAE uses the output of 2D convolutional layers as queries and adds feature compactness and feature separateness losses, allowing for a much smaller number of memory items (10 vs 2000 in the MemAE).
 The model consists of three parts: an encoder, a memory module, and a decoder. The encoder extracts a query map q_t of size H x W x C from an input video frame I_t at time t. The memory module reads and updates the memory items p_m, each of size 1 x 1 x C, using the individual queries of size 1 x 1 x C taken from each spatial position of q_t.
 H. Park, J. Noh, and B. Ham, “Learning Memory-Guided Normality for Anomaly Detection,” 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2020, https://arxiv.org/abs/2003.13228
  • 19. 4. Unsupervised Approaches
 Memory-augmented Conv2D Autoencoder (MemConv2DAE)
 Multi-loss function: reconstruction loss, feature compactness loss, and feature separateness loss [equation images]
 H. Park, J. Noh, and B. Ham, “Learning Memory-Guided Normality for Anomaly Detection,” 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2020, https://arxiv.org/abs/2003.13228
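The two memory-related terms of the multi-loss can be illustrated with a small sketch. It is a simplification under stated assumptions: the nearest and second-nearest memory items are picked by squared Euclidean distance here, whereas the cited paper selects them through matching probabilities, and the function name `memory_losses` and the margin value are hypothetical.

```python
import numpy as np

def memory_losses(queries, memory, margin=1.0):
    """Feature compactness and separateness terms (simplified sketch).

    queries: shape (K, C), one row per spatial position of the query map.
    memory:  shape (M, C), with M >= 2 memory items.
    """
    # Squared distance between every query and every memory item: (K, M).
    d2 = ((queries[:, None, :] - memory[None, :, :]) ** 2).sum(axis=-1)
    order = np.argsort(d2, axis=1)
    rows = np.arange(len(queries))
    nearest = d2[rows, order[:, 0]]  # distance to the closest item
    second = d2[rows, order[:, 1]]   # distance to the runner-up item
    # Compactness: pull each query towards its closest memory item.
    compactness = nearest.sum()
    # Separateness: triplet-style margin between closest and runner-up items.
    separateness = np.maximum(nearest - second + margin, 0.0).sum()
    return compactness, separateness

queries = np.array([[0.0, 0.0], [2.0, 0.0]])
memory = np.array([[0.0, 0.0], [3.0, 0.0]])
print(memory_losses(queries, memory, margin=1.0))
```

In training, these two terms would be added to the reconstruction loss with weighting coefficients. The separateness term keeps memory items apart, which is one plausible reason why so few items can suffice.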
  • 20. 4. Unsupervised Approaches
 AUC scores (%) of the selected approaches in the benchmark datasets
 Model | UCSD Ped1 | UCSD Ped2 | CUHK Avenue | Subway Entrance | Subway Exit | Shanghai Tech
 ConvLSTMAE | 89.9 | 87.4 | 80.3 | 84.7 | 94.0 | --
 MemAE | -- | 94.1 | 83.3 | -- | -- | 71.2
 MemConv2DAE | -- | 90.2 (Recon.), 97.0 (Pred.) | 82.8 (Recon.), 88.5 (Pred.) | -- | -- | 69.8 (Recon.), 70.5 (Pred.)
  • 21. 5. Experiment Results
 Baseline configuration for the Oven3 sequences (established with the ConvLSTMAE)
 ● Temporal depth (T): 15 frames
 ● Skip: 15 frames
 ● Frame size: 64x64
   ○ Resizing frames to a smaller size improved the detection of anomalies; 64x64 was the smallest size that still brought an improvement
 ● Color space: grayscale
   ○ Grayscale usually produced better results than RGB, but RGB was always considered
 [Figure: test sequence]
  • 22. 5. Experiment Results
 ● The lower the regularity score for anomalies, the better
 ● The MemAE and the MemConv2DAE show lower regularity scores for the most subtle anomaly: small flame
 ● The MemConv2DAE shows overall lower scores for every anomaly and faster recoveries from anomaly to normal
  • 23. 5. Experiment Results
 No. | Model | Size | Color space | Temporal depth | Skip frames | AUC | Inference speed
 1 | ConvLSTMAE | 64 | gray | 15 | 30 | 0.9350 | 13 fps
 2 | ConvLSTMAE | 64 | RGB | 15 | 30 | 0.9456 | 13 fps
 3 | MemAE | 64 | gray | 15 | 30 | 0.9442 | 165 fps
 4 | MemAE | 64 | RGB | 15 | 30 | 0.9363 | 160 fps
 5 | MemConv2DAE | 64 | gray | 15 | 30 | 0.9617 | 110 fps
 6 | MemConv2DAE | 64 | RGB | 15 | 30 | 0.9639 | 104 fps
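The slides do not say how the AUC values were computed. A common choice, assumed here, is a frame-level ROC AUC in which each frame has a binary label (1 = anomalous) and an anomaly score such as 1 minus the regularity score. The sketch below uses the pairwise-ranking definition of the AUC; the function name `roc_auc` is hypothetical.

```python
import numpy as np

def roc_auc(labels, scores):
    """ROC AUC as the fraction of (anomalous, normal) pairs ranked correctly.

    labels: 1 for anomalous frames, 0 for normal frames.
    scores: higher means more anomalous. Ties count as half a correct pair.
    Compares all pairs at once, so memory grows with positives x negatives.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=np.float64)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

labels = [0, 0, 1, 1]
anomaly_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, anomaly_scores))  # 0.75
```

For long videos, a sort-based implementation from an established library would be the more practical choice; the pairwise version is kept here only because it makes the definition explicit.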
  • 24. 6. Conclusions
 ● The ConvLSTMAE is very robust to changes in the parameters of the training data and in the hyperparameters of the model. When faced with a new task, it is always worth trying this model!
 ● The MemAE and the MemConv2DAE (in RGB mode) are better than the ConvLSTMAE and are more sensitive to anomalies. They are good at detecting subtle anomalies!
 ● The MemAE was the fastest model overall.
 ● With the MemAE it is necessary to pay attention to the learning rate (the lower the better) and the memory size (the larger the better, up to a certain point).
  • 25. Acknowledgements
 Thank you, Abe-san and Motaz-san, for collaborating on the contents of this presentation.