Traffic Accident Detection
2. Traffic Accident Detection
Under supervision of
Dr. Amani Hassan
Eng. Dina Amr
Presented by
Mohanad Talat 20190561
Eslam Mohamed 20190098
Maher Esmat 20190410
Sayed Shaaban 20190254
Ahmed Mohamed 20190071
Khalid Hassan 20190187
5. INTRODUCTION
● According to the World Health Organization, Egypt has one of the highest rates of road accidents
worldwide. In 2016, the WHO estimated road fatalities in Egypt at 9,287. The latest WHO data, published
in 2020, show that road traffic accident deaths in Egypt reached 10,141.
● Fast response to an accident is crucial: a 7-to-12-minute delay can increase the odds of death
by 46 percent.
● Traffic accidents pose a significant threat to public safety and cause numerous injuries and fatalities
worldwide. Prompt detection of accidents can enable swift emergency responses, minimizing the potential
impact and saving lives. This problem definition focuses on developing a model to detect traffic accidents
using popular object detection models such as YOLO, SSD, and Faster R-CNN. The goal is to compare
these models and select the most effective one for deployment in CCTV cameras.
6. Problem Definition
The problem is to build a reliable and accurate traffic accident detection
system using state-of-the-art object detection models. The system should be
able to analyze live video feeds from CCTV cameras and determine whether
an accident has occurred, based on the presence of specific visual cues
such as collisions and vehicle damage. The system's performance will be
evaluated on accuracy, precision, recall, and computational efficiency.
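The evaluation metrics named above can be made concrete with a short sketch; the detection counts used are hypothetical placeholders, not project results.

```python
# Minimal sketch of the evaluation metrics named above (accuracy,
# precision, recall), computed from raw detection counts.

def accuracy(tp, tn, fp, fn):
    """Fraction of all decisions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """Of the frames flagged as accidents, how many really were."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Of the real accidents, how many were flagged."""
    return tp / (tp + fn)

# Hypothetical counts: 80 true positives, 10 false positives,
# 20 false negatives, 90 true negatives.
print(precision(80, 10))         # ~0.889
print(recall(80, 20))            # 0.8
print(accuracy(80, 90, 10, 20))  # 0.85
```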
7. Objective
● The aim of this project is to help
reduce the impact of traffic accidents
● by detecting and categorizing
accidents automatically,
● without depending on human
monitoring.
● But how could this possibly be
useful?
8. Applications
● Notify authorities as soon as an
accident occurs,
● using CCTV cameras already installed
at the accident location,
● so we avoid depending on human monitoring.
10. Object Detection
Object detection is the field of computer vision that deals with the
localization and classification of objects contained in an image or video.
Drawing bounding boxes around detected objects allows us to locate them
in a given scene.
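The bounding-box idea above is usually quantified with intersection over union (IoU), the standard overlap measure between a predicted box and a ground-truth box; a minimal sketch with made-up coordinates:

```python
# Self-contained sketch of IoU (intersection over union) between two
# bounding boxes given as (x1, y1, x2, y2) pixel corners. Coordinates
# below are invented for illustration.

def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle (may be empty).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```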
11. Semantic Segmentation
Image segmentation is the process of
dividing a digital image into multiple
segments (sets of pixels), also known
as image regions or image objects.
It is commonly used to find objects
and boundaries (lines, curves, and
so on) in images.
12. Pedestrian Detection
Pedestrian detection predicts the
position of each pedestrian in the
current frame.
It has an obvious extension
to automotive applications
due to its potential for
improving safety systems.
19. DATASET
- The dataset consists of 1,416 video segments collected from YouTube.
- The dataset contains frames from different CCTV traffic camera videos.
- The average video length is 366 frames; the longest video consists of 554 frames.
- The dataset contains both low- and high-resolution videos.
- The total duration of the videos is 5.2 hours.
21. Preprocessing: Normalize image size
- We do this step because CNN models require a fixed input size.
- Each image is resized to (500, 500, 3) (width, height, RGB channels).
(Figure: original frame vs. frame normalized to (500, 500, 3))
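The resize step could be sketched as follows. A real pipeline would call a library routine such as OpenCV's `cv2.resize`; this pure-Python nearest-neighbour version, run on a toy 2x2 image, only illustrates the idea of forcing every frame to one fixed size.

```python
# Toy nearest-neighbour resize, assuming an image is a row-major list
# of rows of RGB tuples. Illustrative only; real code would use
# cv2.resize or PIL's Image.resize.

def resize_nearest(image, out_w, out_h):
    """Resize a list-of-rows RGB image to out_h rows x out_w columns."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]

# A 2x2 toy "image" upscaled to 4x4: every source pixel is repeated.
tiny = [[(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (255, 255, 255)]]
big = resize_nearest(tiny, 4, 4)
print(len(big), len(big[0]))  # 4 4
print(big[0][0])              # (255, 0, 0)
```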
22. Preprocessing: Segmentation and morphology
- We are not going to use this step in our preprocessing, because it
removes the most important feature (the cars) from the images.
23. Data filtering
- Duplicate data has been deleted.
- Bad-quality data has been deleted.
- All frames that do not clearly show a traffic accident ("not
interested" frames) have been deleted.
27. Faster R-CNN (ResNet-50)
● The image is fed to a CNN, which generates a
convolutional feature map.
● First, the feature map is fed into an independent
fully convolutional network, the Region Proposal
Network (RPN).
● Second, the generated proposals are fed to a Fast
R-CNN network.
● Faster R-CNN is an improvement because region-of-interest
(ROI) generation is now integrated within the network.
● One drawback of Faster R-CNN is that the RPN
can be slow, and it is also unable to detect small
objects reliably.
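The RPN stage described above can be caricatured as score-based proposal filtering: candidate boxes carry an "objectness" score, and only the best survive to the Fast R-CNN stage. The boxes, scores, and thresholds below are invented for illustration; a real RPN predicts them from the feature map.

```python
# Toy illustration of the RPN's role: keep only high-scoring region
# proposals (best first, capped at top_k) for the second-stage network.

def filter_proposals(proposals, score_thresh=0.5, top_k=3):
    """Keep high-scoring (box, score) proposals, best first."""
    kept = [p for p in proposals if p[1] >= score_thresh]
    kept.sort(key=lambda p: p[1], reverse=True)
    return kept[:top_k]

# Invented candidate boxes (x1, y1, x2, y2) with objectness scores.
candidates = [((0, 0, 50, 50), 0.9),
              ((10, 10, 60, 60), 0.3),
              ((100, 100, 180, 160), 0.7),
              ((5, 5, 40, 40), 0.55),
              ((200, 0, 260, 40), 0.52)]
for box, score in filter_proposals(candidates):
    print(box, score)  # the three best-scoring proposals survive
```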
28. YOLO_v5 Overview
A. Evolution from previous versions
B. Key features and improvements
C. Comparison to other object detection
algorithms (Faster R-CNN, SSD, etc.)
D. Supported frameworks and languages
29. YOLO_v5 Architecture
A. High-level architecture
B. Backbone network (e.g., CSPDarknet53, EfficientNet)
C. Neck and head components
D. Loss functions and training process
30. YOLO_V5
• You Only Look Once
• Proposes an end-to-end neural network
that predicts bounding boxes and
class probabilities all at once. This differs from
the approach taken by previous object
detection algorithms, which repurposed
classifiers to perform detection.
• YOLO is a single-stage model, so it can process
images much faster than the R-CNN family and
requires much less training data. However, it is
not as accurate as the R-CNN family and struggles
to identify overlapping objects.
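The overlapping-objects limitation is related to non-maximum suppression (NMS), the post-processing step single-stage detectors use to collapse many overlapping predictions of the same object into one detection; a minimal greedy sketch with invented boxes and scores:

```python
# Greedy NMS sketch: keep the highest-scoring box, drop any remaining
# box that overlaps it too much, repeat. Boxes are (x1, y1, x2, y2).

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Return indices of the boxes that survive suppression."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2] – the near-duplicate box 1 is dropped
```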
32. SSD_mobilenet (v2_320x320)
- Balanced Accuracy and Efficiency: The model offers
a good balance between detection accuracy and
computational efficiency. It is suitable for real-time
applications on resource-constrained devices,
where both accuracy and efficiency are important.
- Multi-Scale Feature Extraction: The MobileNetV2
backbone enables the extraction of multi-scale
features, allowing the model to detect objects at
different scales and adapt to varying object sizes
present in the input images.
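The multi-scale idea above can be sketched with SSD-style default boxes: coarser feature maps are tiled with larger boxes, so one forward pass covers several object sizes. The grid sizes and scales below are illustrative, not the model's actual configuration.

```python
# Sketch of SSD-style multi-scale default boxes over a unit image:
# fine grids carry small boxes, coarse grids carry large ones.

def default_boxes(grid_size, scale):
    """Centre one square default box of side `scale` on each cell
    of a grid_size x grid_size feature map over a unit image."""
    step = 1.0 / grid_size
    boxes = []
    for row in range(grid_size):
        for col in range(grid_size):
            cx, cy = (col + 0.5) * step, (row + 0.5) * step
            boxes.append((cx - scale / 2, cy - scale / 2,
                          cx + scale / 2, cy + scale / 2))
    return boxes

# Three illustrative feature-map levels: fine + small ... coarse + large.
levels = [(10, 0.1), (5, 0.3), (3, 0.6)]
total = sum(len(default_boxes(g, s)) for g, s in levels)
print(total)  # 100 + 25 + 9 = 134 default boxes
```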
33. MobileNetV2 Architecture
Introduction to MobileNetV2
Key features and design principles
Lightweight and efficient architecture
Depthwise separable convolutions
Inverted residual blocks
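Why depthwise separable convolutions are lightweight can be shown with a back-of-the-envelope parameter count; the channel sizes below are arbitrary examples, not MobileNetV2's actual layer widths.

```python
# Parameter counts for a standard 3x3 convolution vs. its
# depthwise-separable factorisation (as used in MobileNets).

def standard_conv_params(c_in, c_out, k=3):
    """One k x k filter per output channel, spanning all input channels."""
    return k * k * c_in * c_out

def separable_conv_params(c_in, c_out, k=3):
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv to mix channels
    return depthwise + pointwise

c_in, c_out = 128, 256
print(standard_conv_params(c_in, c_out))   # 294912
print(separable_conv_params(c_in, c_out))  # 33920 – roughly 8.7x fewer
```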
34. SSD_mobilenet (v2_fpnlite_640x640)
- Improved Efficiency: the FPNLite (lightweight Feature
Pyramid Network) variant of SSD MobileNetV2 keeps
memory consumption and inference cost low. It is
optimized for efficient object detection on mobile and
embedded devices, making it ideal for applications
where computational resources are limited.
- Higher Input Resolution: the larger 640x640 input
allows the model to resolve small or distant objects
more effectively. This can be beneficial for CCTV
footage, where vehicles may occupy only a small
part of the frame.
35. SSD_mobilenet (v2_fpnlite_320x320)
- Enhanced Efficiency: the FPNLite variant further
improves computational efficiency compared to the
standard MobileNetV2 backbone alone. It reduces
memory usage and speeds up inference, making it
well suited for real-time applications.
- Accurate Object Detection: while sacrificing some
accuracy compared to higher-resolution models,
SSD MobileNetV2 FPNLite 320x320 still maintains
good object detection performance. It strikes a
balance between accuracy and efficiency, making it
suitable for applications where real-time
performance is critical.
40. Conclusion
- Finding data
- Applying all the required preprocessing
- Finding models to work on the data
- Applying different models to the data
- Comparing the models
- Using the best model on a real-time camera feed
- Deploying the model behind a website interface