MMAI 2014 Final
1. Naive Soul Guardian
Bloody Scenes Detection
with Deep Convolutional Neural Network
B99902080 李冠穎
R03944007 張人尹
2. Outline
● Motivation
● System Overview
o Convolutional Neural Network
o Fully-Convolutional Net
o Pixelation
● Experiment
● Future Work
● Reference
● Demo
3. Motivation
● Many videos contain bloody scenes; we want to
protect children from these inappropriate scenes
● Our system aims to detect and pixelate bloody
scenes automatically
6. Convolutional Neural Network
● Fine-tune CaffeNet pre-trained on ImageNet
o Training data: human-labeled frames, without bounding boxes
● Classify each decoded frame
o Background (0): frame is ignored
o Bloody frame (1): frame is passed to the fully-convolutional net
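The two-stage routing on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: `route_frames`, `classify`, and `localize` are hypothetical names, with `classify` standing in for the fine-tuned CaffeNet and `localize` for the fully-convolutional net.

```python
from typing import Any, Callable, Iterable, List, Optional, Tuple

BACKGROUND, BLOODY = 0, 1  # class labels as given on the slide


def route_frames(
    frames: Iterable[Any],
    classify: Callable[[Any], int],   # hypothetical: frame -> 0 or 1
    localize: Callable[[Any], Any],   # hypothetical: frame -> heat map
) -> List[Tuple[Any, Optional[Any]]]:
    """Run the frame classifier first; only bloody frames reach `localize`."""
    results = []
    for frame in frames:
        if classify(frame) == BLOODY:
            results.append((frame, localize(frame)))
        else:
            results.append((frame, None))  # background: frame is left untouched
    return results
```

Skipping background frames means the more expensive localization step runs only where it is needed.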
7. Fully-Convolutional Net
● Classify each 227 × 227 window, with stride
32, on a 451 × 451 image
● Generate an 8 × 8 classification map
o Interpolate the probabilities to obtain a heat map
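The 8 × 8 map size follows from the window arithmetic: (451 − 227) / 32 + 1 = 8 positions per axis. The sketch below checks that figure and shows one way to interpolate the map up to image size. The slide does not say which interpolation was used, so bilinear interpolation is an assumption here, and `map_size` and `upsample_bilinear` are illustrative names.

```python
import numpy as np


def map_size(image: int, window: int, stride: int) -> int:
    """Number of sliding-window positions along one axis."""
    return (image - window) // stride + 1


def upsample_bilinear(prob_map: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Separable linear interpolation of a 2-D map to shape (out_h, out_w)."""
    in_h, in_w = prob_map.shape
    # Output sample positions, expressed in input-grid coordinates.
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    # Interpolate along x for every input row: result is (in_h, out_w).
    rows = np.stack([np.interp(xs, np.arange(in_w), r) for r in prob_map])
    # Interpolate along y for every output column: result is (out_h, out_w).
    return np.stack([np.interp(ys, np.arange(in_h), c) for c in rows.T], axis=1)


print(map_size(451, 227, 32))  # 8
```

Note that this stretches the 8 × 8 grid over the whole image, which ignores the offset of the window centers from the image border; it is a simplification for illustration.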
9. Experiment (I)
● Run on cml21
● Decoding/Encoding done by FFmpeg
● Decoded frames as training/validation data
o Pos = segments from Saw 1, 2, 3, 7, Final Destination 4,
5… + images crawled from Google Images
o Neg = segments from The Big Bang Theory S8E11… +
part of ILSVRC 2013 val/test
o Randomly sampled Pos : Neg = 2500 : 2500
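The balanced 2500 : 2500 sampling could be done along these lines. This is a sketch under assumptions: the frame lists are plain file paths, and `balanced_sample` and its `seed` parameter are hypothetical. Labels follow the deck's convention of 1 for bloody and 0 for background.

```python
import random
from typing import List, Sequence, Tuple


def balanced_sample(
    pos: Sequence[str],
    neg: Sequence[str],
    n_per_class: int = 2500,
    seed: int = 0,  # assumed: fixed seed so the split is reproducible
) -> List[Tuple[str, int]]:
    """Draw n_per_class items from each class without replacement, then shuffle."""
    rng = random.Random(seed)
    chosen = [(path, 1) for path in rng.sample(list(pos), n_per_class)]
    chosen += [(path, 0) for path in rng.sample(list(neg), n_per_class)]
    rng.shuffle(chosen)
    return chosen
```

`random.sample` raises `ValueError` if a class has fewer than `n_per_class` items, which is a useful early check on the dataset size.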
11. Experiment (III)
● Time (sec) of processing a video clip

Clip (frames, resolution)     Decoding  Classification  Heat map  Pixelation  Encoding  Average time (sec/frame)
Saw6 (139 frames, 720x404)    0.34      41.18           22.99     72.43       0.02      0.99
CWL (109 frames, 1280x720)    0.79      36.95           0         0           1.24      0.36
FD5 (121 frames, 1024x576)    0.44      36.23           3.87      28.72       0.81      0.58
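The average-time column is consistent with summing the five stage times and dividing by the frame count. The slide does not state this formula, so the check below is an inference from the numbers; `sec_per_frame` is an illustrative name.

```python
# Stage times in seconds: decoding, classification, heat map, pixelation, encoding.
clips = {
    "Saw6": (139, [0.34, 41.18, 22.99, 72.43, 0.02]),
    "CWL": (109, [0.79, 36.95, 0.0, 0.0, 1.24]),
    "FD5": (121, [0.44, 36.23, 3.87, 28.72, 0.81]),
}


def sec_per_frame(n_frames: int, stage_seconds: list) -> float:
    """Total processing time divided by the number of frames."""
    return sum(stage_seconds) / n_frames


averages = {name: round(sec_per_frame(n, t), 2) for name, (n, t) in clips.items()}
print(averages)  # {'Saw6': 0.99, 'CWL': 0.36, 'FD5': 0.58}
```

For CWL the heat map and pixelation times are 0, which matches a clip in which no frame went past the frame classifier.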
12. Future Work
● Train our model with more diverse data to
increase accuracy and reduce false positives
● Accelerate blurring and smooth the boundaries
● Deploy on surveillance cameras for security
● Combine shot detection and motion vectors to
reduce computation
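As an illustration of the pixelation step that the second item proposes to accelerate, here is a minimal sketch that works on whole tiles with NumPy rather than on individual pixels. It is not the authors' implementation: `pixelate_masked`, the 0.5 threshold, and the 16-pixel block size are all assumptions.

```python
import numpy as np


def pixelate_masked(
    frame: np.ndarray,       # H x W x 3 (or H x W) image
    heat: np.ndarray,        # H x W heat map with values in [0, 1]
    threshold: float = 0.5,  # assumed cut-off for "bloody"
    block: int = 16,         # assumed tile size in pixels
) -> np.ndarray:
    """Replace each tile by its mean color wherever the heat map exceeds the threshold."""
    h, w = frame.shape[:2]
    out = frame.copy()
    for y in range(0, h, block):
        for x in range(0, w, block):
            if heat[y:y + block, x:x + block].max() > threshold:
                tile = frame[y:y + block, x:x + block]
                # The mean over the tile's spatial axes broadcasts back over the tile.
                out[y:y + block, x:x + block] = tile.mean(axis=(0, 1)).astype(frame.dtype)
    return out
```

Tiles below the threshold are copied unchanged, so frames with a mostly cold heat map cost little beyond the copy.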
13. Reference
● Caffe | Deep Learning Framework
○ http://caffe.berkeleyvision.org/
○ Classifying ImageNet: the instant Caffe way
○ Net Surgery for a Fully-Convolutional Model
● FFmpeg
○ https://www.ffmpeg.org/
● ImageNet
○ http://www.image-net.org/
● Tutorials by Hsinfu, Shiro, Jocelyn