2. GROUP MEMBERS
1. Subho Mrong
163 019 042
Department of Computer Science and Engineering
Primeasia University
2. Atia Ayesha Prema
163 015 042
Department of Computer Science and Engineering
Primeasia University
4. SPOTLIGHT
1. Overview
2. Objective
3. Understanding the problems
4. A Brief History of Object Detection
5. An Overview of the YOLO Architecture
6. Methodology
7. Implementation
8. Results
9. Conclusion
5. 1. OVERVIEW
We propose a vehicle-based computer vision approach that identifies and analyzes potholes,
vehicles, and speed-breakers using a car-mounted camera. The results are then logged
together with the GPS coordinates of each pothole or speed-breaker for use by technical
experts and road maintenance agencies, and to aid drivers. When a pothole or speed-breaker
is detected in real time, the driver immediately receives a warning notification so that
they can take action.
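The logging and warning step described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the `DetectionRecord` fields, the `ALERT_CLASSES` set, the 0.5 confidence threshold, and the sample coordinates are all assumptions made for the example.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields

# Classes that trigger a driver warning (assumption based on the overview text).
ALERT_CLASSES = {"pothole", "speed-breaker"}


@dataclass
class DetectionRecord:
    """One logged detection; the field names are hypothetical."""
    timestamp: str      # ISO 8601 time of the detection
    label: str          # detected class name
    confidence: float   # detector confidence in [0, 1]
    latitude: float     # GPS latitude at detection time
    longitude: float    # GPS longitude at detection time


def needs_alert(record, min_confidence=0.5):
    """Return True if the driver should be warned about this detection."""
    return record.label in ALERT_CLASSES and record.confidence >= min_confidence


def write_log(records, stream):
    """Write the detections as CSV rows with a header line."""
    names = [f.name for f in fields(DetectionRecord)]
    writer = csv.DictWriter(stream, fieldnames=names)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))


# Sample data, invented for illustration only.
records = [
    DetectionRecord("2021-01-01T10:00:00+00:00", "pothole", 0.91, 23.7937, 90.4066),
    DetectionRecord("2021-01-01T10:00:02+00:00", "vehicle", 0.88, 23.7938, 90.4067),
]

log = io.StringIO()
write_log(records, log)
alerts = [r.label for r in records if needs_alert(r)]
```

Here only the pothole record would raise a warning, while both records are written to the log for later use by a maintenance agency.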
6. 2. OBJECTIVE
Detecting potholes accurately and quickly is one of the important tasks in determining proper
road maintenance strategies for road maintenance agencies.
The goal of this project is to detect potholes, speed-breakers, and vehicles using simple
image processing methods, and then to alert the driver to take immediate action when any of
these objects is detected in real time.
7. 3. UNDERSTANDING THE PROBLEMS
Friction between a vehicle's tires and the road surface heats the road and causes it to
expand. Over time, this expansion produces cracks in the road surface. Water then seeps
through the cracks and causes potholes.
Potholes make a ride bumpy, which can be risky. They can cause flat tires, wheel damage,
impact damage to the underside of the vehicle, wheel alignment issues, vehicle collisions,
and major accidents.
8. 4. A Brief History of Object Detection (1/5)
Object recognition is a general term to describe a collection of related computer vision tasks
that involve identifying objects in digital photographs.
Figure: Object Recognition
9. 4. A Brief History of Object Detection (2/5)
A modern detector is usually composed of two parts:
1. Backbone
2. Head
10. 4. A Brief History of Object Detection (3/5)
The backbone is pre-trained on ImageNet, and the head is used to predict the classes
and bounding boxes of objects.
● For detectors running on a GPU platform, the backbone could be VGG, ResNet,
ResNeXt, or DenseNet.
● For detectors running on a CPU platform, the backbone could be SqueezeNet,
MobileNet, or ShuffleNet.
11. 4. A Brief History of Object Detection (4/5)
The head is usually categorized into one of two kinds:
● One-stage object detector
● Two-stage object detector
12. 4. A Brief History of Object Detection (5/5)
The most representative one-stage object detectors are SSD, RetinaNet, and the YOLO
family. The YOLO family consists of five members: YOLO V1, V2, V3, V4, and V5.
The most representative two-stage object detectors are the R-CNN series, including Fast
R-CNN, Faster R-CNN, R-FCN, and Libra R-CNN.
13. 5. An Overview of the YOLO Architecture (1/5)
The YOLO model was first described by Joseph Redmon et al. in the 2015 paper titled “You
Only Look Once: Unified, Real-Time Object Detection”. Ross Girshick, developer of R-CNN,
was also an author of and contributor to this work.
14. 5. An Overview of the YOLO Architecture (2/5)
The approach involves a single neural network, trained end to end, that takes a photograph as
input and directly predicts bounding boxes and a class label for each bounding box. The
technique offers lower predictive accuracy (for example, more localization errors), but it
operates at 45 frames per second, and at up to 155 frames per second for a speed-optimized
version of the model. The model works by first splitting the input image into a grid of
cells, where each cell is responsible for predicting a bounding box if the center of that
bounding box falls within the cell. Each grid cell predicts a bounding box described by its
x and y coordinates, its width and height, and a confidence score. A class prediction is
also made for each cell.
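The grid assignment described above can be illustrated with a short sketch. It assumes a 7 x 7 grid, the value used in the original YOLO paper, and a 448 x 448 input image; the function name `responsible_cell` is hypothetical and is not taken from any YOLO codebase.

```python
def responsible_cell(x_center, y_center, image_width, image_height, grid_size=7):
    """Return the (row, col) of the grid cell that contains the box center.

    grid_size=7 follows the original YOLO paper and is an assumption here.
    """
    col = int(x_center / image_width * grid_size)
    row = int(y_center / image_height * grid_size)
    # Clamp so a center lying exactly on the right or bottom edge
    # is assigned to the last cell instead of falling outside the grid.
    return min(row, grid_size - 1), min(col, grid_size - 1)


# A box centered at pixel (224, 100) in a 448 x 448 image.
cell = responsible_cell(224, 100, 448, 448)   # (1, 3)

# A center on the bottom-right corner is clamped into the last cell.
corner = responsible_cell(448, 448, 448, 448)  # (6, 6)
```

Only the cell returned here is responsible for predicting that object's bounding box; all other cells ignore it.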
15. 5. An Overview of the YOLO Architecture (3/5)
The YOLO network consists of three main pieces:
● Backbone - A convolutional neural network that aggregates and forms image features at
different granularities.
● Neck - A series of layers to mix and combine image features to pass them forward to
prediction.
● Head - Consumes features from the neck and performs the box and class prediction steps.
17. 5. An Overview of the YOLO Architecture (5/5)
Figure: YOLO V5 Benchmark Comparison
20. 7. IMPLEMENTATION
The YOLO v5 training process consists of six steps:
1. Preparing Dataset
2. Environment Setup
3. Files/Directories Configuration
4. Training
5. Inference
6. Result Visualization
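As an illustration of the dataset preparation step, YOLO v5 expects one text label line per object in the form `class x_center y_center width height`, with all four box values normalized to the range 0 to 1. The sketch below converts a pixel-coordinate box into that format. The function name `to_yolo_label` and the class id mapping (0 for pothole) are assumptions made for this example.

```python
def to_yolo_label(class_id, x_min, y_min, x_max, y_max, image_width, image_height):
    """Convert a pixel-coordinate box to a normalized YOLO label line."""
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical pothole (class id 0) spanning pixels (100, 200) to (300, 400)
# in a 640 x 480 image.
line = to_yolo_label(0, 100, 200, 300, 400, 640, 480)
# line == "0 0.312500 0.625000 0.312500 0.416667"
```

Each image then gets a `.txt` file with the same base name, containing one such line per annotated object.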
26. 8. CONCLUSION
We presented a preliminary method for a pothole, vehicle, and speed-breaker detection
system. The system consists of a newly developed optical device and a pothole detection
algorithm. The optical device, mounted on a vehicle, collects road images containing
potholes, vehicles, and speed-breakers, and sends the collected data to the pothole
detection algorithm. The pothole information obtained from the algorithm, such as the
location and severity of a pothole, is then sent to a road management agency. The optical
device was designed to be easily mounted in a vehicle, and it will have several functions,
such as collecting and storing pothole data, communicating by GSM or Wi-Fi, and gathering
location information by GPS.
27. REFERENCES
1. Detecting potholes using simple image processing techniques and real-world footage, 2015
[http://scholar.sun.ac.za/handle/10019.1/97191]
2. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
[https://arxiv.org/pdf/1506.01497.pdf]
3. You Only Look Once: Unified, Real-Time Object Detection, 2015
[https://arxiv.org/pdf/1506.02640.pdf]
4. YOLOv3: An Incremental Improvement, 2018
[https://arxiv.org/pdf/1804.02767]
5. YOLOv4: Optimal Speed and Accuracy of Object Detection, 2020
[https://arxiv.org/abs/2004.10934]
6. YOLO v5 PyTorch GitHub repository
[https://github.com/ultralytics/yolov5]
7. A Gentle Introduction to Object Recognition
[https://machinelearningmastery.com/object-recognition-with-deep-learning]
8. An image processing approach to detect lanes, pot holes and recognize road signs in Indian roads
[https://pdfs.semanticscholar.org/2dd8/c6112d45bcc22477eeede41f411bb005036b.pdf]