DETECTING HOUSES IN FLOODED
AREA WITH THE HELP OF DRONE
IMAGE USING DEEP LEARNING
Submitted by
Mohammad Izaz Ahamed
CSE 01706540
Under the Supervision of
Mr. Sowmitra Das
Assistant Professor
This thesis is submitted to the Department of Computer Science and Engineering of
Port City International University in partial fulfillment of the requirements for the degree
of Bachelor of Science (Engineering)
Department of Computer Science and Engineering
Port City International University
7-14, Nikunja Housing Society, South khulshi, Chattogram, Bangladesh
January 2023
DECLARATION
It is hereby declared that I have independently completed this thesis under the supervision of
Mr. Sowmitra Das, Assistant Professor of the Department of CSE at Port City International
University. To the best of my knowledge, no portion of this work has been previously
submitted for any other degree or qualification at this or any other educational institution. I
also confirm that I have only used the resources that were specifically authorized.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
(Signature of the candidate)
Mohammad Izaz Ahamed
CSE 01706540
CSE 17 (day)
Department of CSE
Port City International University
DETECTING HOUSES IN FLOODED AREA WITH THE HELP OF DRONE IMAGE USING DEEP LEARNING
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING, PCIU
APPROVAL
This thesis titled “DETECTING HOUSES IN FLOODED AREA WITH THE HELP OF
DRONE IMAGE USING DEEP LEARNING”, by Mohammad Izaz Ahamed has been
approved for submission to the Department of Computer Science and Engineering, Port City
International University, in partial fulfillment of the requirements for the degree of Bachelor of
Science (Engineering).
_ _ _ _ _ _ _ _ _ _ _ _ _ _
(Signature of Supervisor)
Mr. Sowmitra Das
Assistant Professor,
Department of Computer Science and Engineering
Port City International University
DEDICATION
This thesis is respectfully dedicated to my esteemed teachers, my loving parents, and all those
who have supported and encouraged me throughout my academic journey, and especially to my
respected supervisor, Mr. Sowmitra Das.
TABLE OF CONTENTS
LIST OF FIGURES Ⅰ
LIST OF TABLES Ⅱ
ACKNOWLEDGMENTS Ⅲ
ABSTRACT Ⅳ
CHAPTER 1 1
INTRODUCTION 1
1.1 Overview 1
1.2 Problem Statement 1
1.3 Motivation 2
1.4 Objective 2
1.5 Object Detection 2
1.5.1 Object Localization 3
1.5.2 Object Classification 3
1.5.3 Object Instance Segmentation 3
1.6 YOLOv7 3
1.7 Organization of the document 4
CHAPTER 2 6
LITERATURE REVIEW 6
2.1 Overview 6
2.2 Previous Work 6
2.3 Research Summary 13
2.4 Scope of this problem 13
2.5 Challenges 13
CHAPTER 3 14
METHODOLOGY 14
3.1 Working Procedure 14
3.2 Proposed System 14
3.3 Data Collection 15
3.4 Data Preprocessing 17
3.4.1 Removing Unnecessary data 17
3.4.2 Data Resizing 17
3.5 Image Annotation 17
3.7 Automated Image Annotation 20
CHAPTER 4 23
HARDWARE AND TOOLKIT 23
4.1 Tools 23
4.1.1 Python 23
4.1.2 NumPy 24
4.1.3 Pandas 24
4.1.4 OS Module 25
4.1.6 OpenCV 25
4.1.7 VS Code 26
4.2 Hardware 26
CHAPTER 5 27
RESULT & DISCUSSION 27
5.1 Performance Evaluation 27
5.1.1 IOU 27
5.1.3 Average Precision 28
5.1.4 Mean Average Precision 29
5.3 YOLOv7 On Manually Annotated Dataset 30
5.3.1 Object Detection Report 30
5.3.2 Confusion Matrix 31
5.3.3 Accuracy & Loss Curve 32
5.4.1 Object Detection Report 33
5.4.2 Confusion Matrix 34
5.4.3 Accuracy & Loss Curve 35
5.5 Comparison of Performance 36
CHAPTER 6 37
CONCLUSION & FUTURE WORK 37
6.1 Conclusion 37
6.2 Limitation 37
6.3 Future Work 37
CHAPTER 7 38
REFERENCES 38
7.1 References 38
LIST OF FIGURES
Figure 2.1: Volan2018 annotation samples 7
Figure 2.2: Roof type recognition RGB + VDVI + Sobel 8
Figure 2.3: Image Processing 9
Figure 2.4: Model architecture of Mask R-CNN 10
Figure 2.5: Visualization of color channel for RGB, HSI 11
Figure 3.1: Basic flowchart of the proposed system 14
Figure 3.2: Example dataset Images of Flooded houses 16
Figure 3.3: YOLOv7 Annotation Format 18
Figure 3.4: YOLOv7 Annotated .txt File 19
Figure 3.5: Converting the image to Grayscale 20
Figure 3.6: Apply Adaptive Canny Edge Detection 21
Figure 3.7: Apply Dilation & Erosion 21
Figure 3.8: Find Coordinates 22
Figure 5.1: IOU 27
Figure 5.2: Confusion Matrix For Manually Annotated Data 31
Figure 5.3: Accuracy & Loss Curve For Manually Annotated Data 32
Figure 5.4: Confusion Matrix For Automatic Annotated Data 34
Figure 5.5: Accuracy & Loss Curve For Automatic Annotated Data 35
LIST OF TABLES
Table 5.1: Object Detection Report For Manually Annotated Data 30
Table 5.2: Object Detection Report For Automatic Annotated Data 33
Table 5.3: Model Comparison 36
ACKNOWLEDGEMENT
I would like to begin by offering my sincerest gratitude to the Almighty Allah for providing
me with the strength, determination, and opportunity to complete this project on time. My
deepest appreciation goes to my supervisor, Mr. Sowmitra Das, for his invaluable guidance
and support throughout the duration of this project. I am also grateful to my other esteemed
teachers at my university for their advice and assistance, both directly and indirectly, in
helping me stay focused on my thesis. Lastly, I extend my heartfelt thanks to my friends for
their unwavering support.
ABSTRACT
Floods are a major natural disaster that can cause extensive damage to property and
infrastructure and result in significant economic losses annually. In order to effectively
respond to such disasters, there is a need to develop an approach that can quickly detect the
houses in flooded areas. Satellite remote sensing has been utilized in emergency responses,
but it has limitations such as long revisit periods and inability to operate during rainy or
cloudy weather conditions. To address these limitations, this study proposes the use of drones
to detect flooded buildings. Through the utilization of deep learning models, specifically
YOLOv7, this study aims to develop an automated detection system for flooded buildings
using drone images. The results of the study show that the inundation of buildings and
vegetation can be accurately detected from the images with 92% accuracy. The performance
of the developed system was evaluated using various metrics such as accuracy, precision,
recall, and confusion matrices. Additionally, this study also presents an automated annotation
process to speed up the process of image annotation.
CHAPTER 1
INTRODUCTION
1.1 Overview
Floods are one of the major natural disasters that cause huge damage to property,
infrastructure and economic losses every year. There is a need to develop an approach that
could instantly detect the houses in the flooded area. House detection using drones can be
helpful in a variety of ways, especially in emergency response situations such as floods. By
quickly and accurately identifying flooded buildings, emergency responders can prioritize
and target their efforts to provide aid and assistance to the most affected areas. Additionally,
the information obtained from house detection can also be used for post-disaster damage
assessments and to inform rebuilding efforts. Furthermore, this information can also be used
for urban planning and land management purposes, for example, to better understand the
distribution of housing in a region, and to identify areas that may be vulnerable to flooding.
In summary, house detection can provide valuable information to aid in emergency response
and recovery efforts, as well as for urban planning and land management. The objective of
our research is to develop a method for detecting houses in flooded areas. To achieve this, we
have used a dataset of over 3000 images (after augmentation) of flooded houses. To train our
model, we labeled the images in our dataset both manually and automatically by identifying
the presence of buildings and vegetation. Using the YOLOv7 model, we aim to develop an
automated detection system that can accurately identify flooded houses in images.
1.2 Problem Statement
The goal of this research is to create a system for identifying houses in flooded areas using
deep learning techniques. The proposed method utilizes the YOLOv7 model for automated
detection of flooded houses in images, with the aim of providing accurate and efficient
detection results.
1.3 Motivation
The motivation for this research stems from the need to address the limitations of current
methods for detecting flooded houses. The traditional approaches such as satellite remote
sensing have long revisit periods and are unable to operate during adverse weather
conditions. This research aims to provide a more efficient and accurate method for
identifying flooded houses using deep learning techniques and drone images. The proposed
method is expected to aid emergency response efforts, inform rebuilding efforts and urban
planning, as well as make a meaningful impact on society by contributing to the development
of a tool that can assist in disaster response and recovery. Furthermore, this research presents
an opportunity to explore and contribute to the field of computer vision and deep learning.
1.4 Objective
The objective of this research is to develop a method for identifying and detecting impacted
houses in flooded areas using deep learning-based object detection techniques. Specifically,
our aim is to utilize the YOLOv7 model to detect and identify houses in images of flooded
areas with high accuracy, and to create an automatic annotation system to efficiently label and
process the dataset used to train our object detection model.
1.5 Object Detection
Object recognition is a computational technique related to computer vision and image
processing that deals with recognizing instances of semantic objects of a particular class
(people, buildings, cars, fruits, etc.) in digital images or videos. One popular approach to
object detection is using the YOLO (You Only Look Once) algorithm, which is a real-time
object detection system that is able to effectively detect objects in images and videos using a
single pass of the convolutional neural network (CNN). YOLO divides the input image into a
grid of cells and for each cell, it predicts a set of bounding boxes and corresponding class
probabilities. This approach allows for fast and accurate object detection in real-time
applications. Object detection includes several subtasks, such as object localization, object
classification, and object instance segmentation.
1.5.1 Object Localization
This task is to determine the location of an object within an image or video. The most
common representation of the location of an object is a bounding box, which is a rectangular
box that surrounds an object. A bounding box is defined by the coordinates of the upper left
corner of the box, its width, and its height. This task is critical for object detection, as it
provides the information needed to identify the presence of an object within an image or
video.
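To make this representation concrete, here is a small, hedged sketch (the helper names `xywh_to_xyxy` and `box_area` are my own, not from any cited work) of working with a box defined by its upper-left corner, width, and height:

```python
def xywh_to_xyxy(box):
    """Convert (x_min, y_min, width, height) to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)

def box_area(box):
    """Area in pixels of an (x_min, y_min, width, height) box."""
    _, _, w, h = box
    return w * h

# A 100x50 box whose upper-left corner is at (20, 30):
print(xywh_to_xyxy((20, 30, 100, 50)))  # (20, 30, 120, 80)
print(box_area((20, 30, 100, 50)))      # 5000
```

The corner-pair form on the right is what many detection libraries expect when computing overlaps between predicted and ground-truth boxes.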
1.5.2 Object Classification
This task is to identify the class or category of an object within an image or video. It is a
supervised machine learning problem, where the model is trained on a dataset of labeled
images and learns to recognize the different object classes. Object classification is used to
distinguish between different object classes, and the output of this task is typically a
probability score for each class.
1.5.3 Object Instance Segmentation
This task is to identify and segment out specific instances of objects within an image or
video. It is an extension of object localization, where not only the location of an object is
identified, but also the object pixels are segmented out. This task is more complex than object
localization and classification, as it requires the model to not only identify the object class,
but also the specific instance of that class. Object instance segmentation is used to identify
multiple instances of the same object class within an image or video, and the output of this
task is typically a mask or a binary image that indicates the presence of an object.
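As a generic, hedged illustration (not code from this thesis), the binary mask output described here can be reduced back to the bounding-box form used for localization:

```python
import numpy as np

def mask_to_bbox(mask: np.ndarray):
    """Tightest (x_min, y_min, width, height) box around a binary instance mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None  # empty mask: no object present
    x_min, x_max = xs.min(), xs.max()
    y_min, y_max = ys.min(), ys.max()
    return (int(x_min), int(y_min), int(x_max - x_min + 1), int(y_max - y_min + 1))

mask = np.zeros((8, 8), dtype=bool)
mask[2:5, 3:7] = True      # a 3-row by 4-column object
print(mask_to_bbox(mask))  # (3, 2, 4, 3)
```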
1.6 YOLOv7
YOLOv7 (You Only Look Once, version 7) is a state-of-the-art real-time object detection
algorithm developed by Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao.
It is a single-stage detector, which
means that it performs both object localization and classification in a single forward pass of
the convolutional neural network (CNN). This makes YOLOv7 particularly well-suited for
real-time object detection tasks where fast and accurate results are required. YOLOv7 uses a
new architecture that is more efficient than its predecessors, allowing it to run faster and with
less computational resources.
It also incorporates several new features and improvements, including a new anchor scale
search algorithm, a new data augmentation method, new depthwise separable convolutions, and
a new architecture called SPP-Net (Spatial Pyramid Pooling Network) to improve the
detection of small objects. YOLOv7 has been shown to be highly accurate and efficient, with
a balance between speed and accuracy, and it is widely used in various real-world
applications, such as autonomous vehicles, surveillance, image retrieval, and object tracking.
1.7 Organization of the document
Chapter 1: Introduction
This chapter provides a general overview of the research work, including the purpose and
scope of the study. It will give a brief summary of the problem being addressed, the research
questions, and the main objectives of the study. Additionally, this chapter will give a general
idea of the content and structure of the upcoming chapters.
Chapter 2: Background and Literature Review
This chapter will provide an overview of the previous research in the field of object detection,
discussing the different methods that have been used and their applications. The chapter will
also provide a comprehensive review of the existing literature in the field, highlighting the
gaps in the current knowledge that the present study aims to address.
Chapter 3: Proposed System & Research Methodology
This chapter will provide a detailed description of the methods and techniques used to
conduct the research. It will explain the theoretical framework, research design, and data
collection and analysis procedures used in the study. The chapter will also present the results
of the research in a graphical and pictorial format, to provide a clear understanding of the
approach taken and the findings obtained.
Chapter 4: Overview of Software and Hardware Used
This chapter will describe the software and hardware used to implement the proposed system,
including the specific components and technologies used. It will provide an overview of the
tools and equipment used in the research and how they were employed to achieve the
research objectives.
Chapter 5: Results and Discussion
This chapter will present the results of the proposed model for house detection using object
detection. It will provide an in-depth analysis of the performance of the model, including
visual representations of the results, and a discussion of the findings. The chapter will also
evaluate the performance of the model using various metrics and provide insights into the
limitations of the proposed approach.
Chapter 6: Conclusion and Future Work
This chapter will summarize the main findings and conclusions of the research, highlighting
the key contributions of the proposed system for house detection using object detection. It
will also discuss the limitations and challenges encountered during the research and propose
potential avenues for future work to improve the system. This chapter will give a brief
overview of the work discussed in the previous chapters.
CHAPTER 2
LITERATURE REVIEW
2.1 Overview
Building detection in flooded areas is a challenging task in the field of object detection,
where the goal is to accurately identify buildings or houses using images. Floods are a
common natural disaster that can cause significant damage to property and infrastructure, and
as a result, obtaining a sufficient dataset for this task can be difficult. In this literature review
section, we will explore previous research in object detection that is closely related to this
topic and examine the current state of the art in building detection in flooded areas.
2.2 Previous Work
In recent years, various research studies have utilized deep learning models for object
detection in aerial imagery for disaster response and recovery. One such technique is YOLO,
which has been employed in several studies [1] to identify objects within aerial images.
The authors of [1] employed YOLO to identify objects within aerial images, focusing on
applications related to disaster response and recovery. They trained and evaluated their
models using an in-house dataset of 8 annotated aerial videos from various US hurricanes in
2017-2018 [8]. Furthermore, they also used Volan2018: object detection in aerial imagery for
disaster response and recovery [9].
Figure 2.1: Volan2018 annotation samples
Figure 2.1 shows examples of annotation samples for each class. The authors annotated video
frames using a tool called DarkLabel [10], drawing bounding boxes around objects of interest
(GOIs) with the goal of obtaining pixel coordinate values for training. They estimate that it
takes about 2 seconds for one annotator to annotate each frame, although this can vary
depending on the complexity of the scene and the number of objects to be annotated [1]. For
example, they estimate that annotating a 10-minute video of 18,000 frames would take about
10 hours of work (18,000 frames × 2 s = 36,000 s). This makes the annotation process slow to
complete, and it makes updating the dataset considerably harder and more time-consuming.
Furthermore, they achieved 80.69% mAP on high-altitude footage and 74.48% on low-altitude
footage. They also found that models trained on footage of a similar altitude perform better,
and that using a balanced dataset and pre-trained weights improves performance and reduces
training time [1]. A newer model such as YOLOv7 can further improve the performance and
accuracy [11].
Another paper [2] presents a method for identifying roof types of complex rural buildings
using high-resolution UAV images, which achieved an F1-score, Kappa coefficient (KC), and
Overall Accuracy (OA) averaging 0.777, 0.821, and 0.905, respectively. The authors use deep
learning networks to analyze different feature combinations and found that the model
incorporating Sobel edge detection [12] features had the highest accuracy.
Figure 2.2: Roof type recognition RGB + VDVI + Sobel
They used the improved Mask R-CNN model to compare the recognition results of different
feature combinations, such as RGB, VDVI, and Sobel [12]. The results, shown in Figure 2.2,
indicate that the RGBS feature combination is more sensitive to the boundaries of single
building roof categories and can accurately delineate the outlines of single buildings [2].
The feature combination of RGBS and RGBV was found to be effective in identifying flat
roofs, while RGBV was found to be more advantageous than RGBS for distinguishing
vegetation and buildings. The authors conclude that spectral and spatial information are
significant features for remote sensing image classification and recognition.
Another paper [3] presents a study on the use of deep learning models for automated
detection of flooded buildings using UAV aerial images. The method was examined in a case
study of the Kangshan levee of Poyang Lake, and the results showed that flooding of central
buildings and vegetation could be recognized from the images with 88% and 85% accuracy,
respectively.
Figure 2.3: Image Processing
Figure 2.3 illustrates the YOLOv3 algorithm, the object detection method used in the paper
[3]. It works by dividing the input image into a grid of S×S cells [13]. For each object and
each grid cell, the algorithm calculates the probability that the center of the object falls
within that cell. If the probability exceeds a certain threshold, it is determined that there is an
object in the grid cell. The algorithm then creates bounding boxes around the grid cells that
contain objects and simultaneously calculates a confidence level for each bounding box [16].
Each bounding box is represented by five parameters: the x and y coordinates of the center of
the bounding box relative to the grid cell, the width and height of the bounding box relative
to the entire image, and the confidence level.
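A minimal sketch of decoding one such prediction (my own illustration of the general YOLO convention, not code from the cited paper): given the grid cell indices, the cell-relative center offsets, and the image-relative width and height, the pixel-space box can be recovered as:

```python
def decode_yolo_box(cell_col, cell_row, tx, ty, tw, th, grid_size, img_w, img_h):
    """Decode one YOLO-style prediction into a pixel-space (x_min, y_min, x_max, y_max) box.

    tx, ty: box center offsets within the grid cell, each in [0, 1]
    tw, th: box width and height as fractions of the whole image
    """
    cell_w = img_w / grid_size
    cell_h = img_h / grid_size
    cx = (cell_col + tx) * cell_w   # box center x in pixels
    cy = (cell_row + ty) * cell_h   # box center y in pixels
    bw = tw * img_w                 # box width in pixels
    bh = th * img_h                 # box height in pixels
    return (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2)

# Center of cell (3, 2) on a 7x7 grid over a 448x448 image, box 0.25 x 0.5 of the image:
print(decode_yolo_box(3, 2, 0.5, 0.5, 0.25, 0.5, 7, 448, 448))  # (168.0, 48.0, 280.0, 272.0)
```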
The research [3] also shows that it is possible to estimate the buildings' inundation area
according to the UAV images and flight parameters. The study highlights the potential value
of UAV systems in providing accurate and timely visualization of the spatial distribution of
inundation for flood emergency response.
Another paper [4] presents research on using aerial drone-based image recognition for fast
and accurate assessment of flood damage. In that work, the authors propose a water level
detection system using an R-CNN learning model and a new labeling method for reference
objects such as houses and cars.
Figure 2.4: Model architecture of Mask R-CNN
Figure 2.4 describes a module in their system that is responsible for detecting and recognizing
houses and cars in top-view images captured by a drone. They use the Mask Region-Based
Convolutional Neural Network (Mask R-CNN) architecture, which is a state-of-the-art
method for object detection and segmentation [4]. The network uses a combination of Feature
Pyramid Network (FPN) and ResNet to extract features from the input images. The ResNet
component extracts simple features such as lines and corners, and the FPN component
extracts more complex and concrete features called feature maps [5]. The Region Proposal
Network (RPN) is then applied to identify regions of interest (RoIs) that are likely to contain
objects. The feature map on the highlighted RoIs is then used at the output layers to locate
and recognize (classify) objects. The output of this process is a list of detected and classified
objects, along with their bounding boxes.
This system uses data augmentation and transfer learning with Mask R-CNN object detection
models to address the challenge of limited real-world datasets of top-down flood images [3].
Additionally, the VGG16 network is employed for water level detection. The system was
evaluated on realistic images captured at the time of a disaster, and the results showed that it
can achieve a detection accuracy of 73.42% with an error of 21.43 cm in estimating the water
level [4].
Furthermore, another study [5] presents research on the use of UAV-based image analysis for
automated flood detection. The study aims to develop a system that can automatically detect
and analyze flood severity using images captured by a UAV [6]. The study utilizes RGB and
HSI color models to represent flood images and employs k-means clustering and region
growing for image segmentation.
Figure 2.5: Visualization of color channel for RGB, HSI
The authors present visualizations of different color channels in their study. They illustrate
the individual color channels of RGB in Figure 2.5 (a)-(c) and HSI in Figure 2.5 (d)-(f).
Additionally, they also show a gray-level image in the same figure.
To enhance the S and I channels of HSI, the authors propose using two combinations of color
channels, I+S and I-S, shown in Figure 2.5 (g) and (h) respectively. These visualizations
provide insight into the different color representations used in the study and how they are
manipulated to improve the performance of the system.
The segmented images were validated against manually segmented images, and the results
show that the region-growing method using gray images achieves a better segmentation
accuracy of 88% compared to the k-means clustering method [5]. The study also developed an
automatic flood monitoring system called Flood Detection Structure (FDS) based on the
domain extension method.
The authors of [2], [4] have developed a large-scale dataset called Aerial Image Dataset
(AID) for aerial scene classification. Aerial scene classification is a problem in remote
sensing that aims to automatically label aerial images with specific semantic categories. In
recent years, many algorithms have been proposed for this task, but the existing datasets for
aerial scene classification, such as the UC-Merced dataset and WHU-RS19, are relatively
small, and the results obtained on them are already saturated. This limits the development of
scene classification algorithms. The goal of AID is to create a dataset that can advance the
state of the art in scene classification of remote sensing images. To create AID, the authors
collected and annotated more than ten thousand aerial scene images [8]. They also provide a
comprehensive review of existing aerial scene classification techniques and recent widely
used deep learning methods, although the manual annotation involved makes building such a
dataset difficult. Finally, they provide a performance analysis of typical aerial scene
classification and deep learning approaches on AID, which can be used as a benchmark for
future research.
2.3 Research Summary
Various studies have been conducted using UAV and drone images for the detection of
flooded buildings, with YOLO and Mask R-CNN being the most commonly used algorithms.
These studies have achieved high accuracy, with some employing novel techniques such as
feature fusion and transfer learning. The results still leave room for improvement, however,
and the manual annotation of data remains a challenge in such research.
2.4 Scope of this problem
The scope of this problem includes developing an automated detection system that can
accurately identify flooded houses in images using deep learning techniques and the YOLOv7
model, with the aim of providing efficient detection results. Additionally, the scope includes
the automated annotation of images, which can make the system more dynamic and save
time.
2.5 Challenges
The challenges include the limited availability of labeled datasets, difficulty in identifying
flooded houses in images due to similar appearance with other non-flooded structures, and
the need for accurate and efficient detection results. Additionally, the variability in lighting
and weather conditions, as well as the complexity of the flooded landscapes, can also pose
challenges in detecting flooded houses. The limited resources and time available for
collecting and labeling data are also among the major challenges in this research.
CHAPTER 3
METHODOLOGY
3.1 Working Procedure
This chapter provides an overview of the technical approach used to develop the system for
detecting flooded houses using object detection techniques. It explains the system
architecture, including the specific algorithms and techniques used for automated image
annotation and detection of buildings/houses in flooded areas. Additionally, it highlights any
challenges that were encountered during the development process and how they were
addressed.
3.2 Proposed System
Figure 3.1: Basic flowchart of the proposed system
Figure 3.1 shows the basic principle of our proposed system. The first step is to collect image
data and label it appropriately. This labeled dataset then undergoes preprocessing before
being split into three sets: training, testing, and validation. The training dataset is then used to
train the YOLOv7 model.
Once the model is trained, the performance is evaluated using the test and validation sets. The
evaluation process is used to determine the accuracy and effectiveness of the model in
detecting objects in the images.
3.3 Data Collection
The collection and annotation of high-quality datasets is crucial for the development of any
real-world AI application, particularly in the field of object detection. However, obtaining
such datasets can be a challenging task due to the complexity and unstructured nature of
real-world data. This challenge is amplified in the case of detecting flooded houses, where the
availability of relevant and accurately labeled datasets is limited. In this research, we aimed
to address this challenge by collecting and annotating a dataset of 500 images of flooded
houses from AIDER [6]. We augmented the dataset (Mosaic augmentation) and split it into
70% training, 20% validation, and 10% test data. The distribution is:
● Train: (2048, 640, 640, 3)
● Validation: (593, 640, 640, 3)
● Test: (283, 640, 640, 3)
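A 70/20/10 split of this kind can be sketched as follows (a generic illustration with hypothetical file names and a fixed random seed, not the exact script used in this work, so the resulting counts differ slightly from those above):

```python
import random

def split_dataset(items, train=0.7, val=0.2, seed=42):
    """Shuffle and split items into train/validation/test subsets."""
    items = list(items)
    random.Random(seed).shuffle(items)  # deterministic shuffle for reproducibility
    n = len(items)
    n_train = int(n * train)
    n_val = int(n * val)
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]

# Hypothetical file names; 2048 + 593 + 283 = 2924 images in total.
images = [f"img_{i:04d}.jpg" for i in range(2924)]
train_set, val_set, test_set = split_dataset(images)
print(len(train_set), len(val_set), len(test_set))  # 2046 584 294
```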
Figure 3.2: Example dataset Images of Flooded houses
Figure 3.2 shows examples of image samples from our dataset. The images were taken
using a drone, providing a top-down angle. The images contain various examples of flooded
houses and buildings, as well as vegetation surrounding the buildings. These images are
being used to train the model in the proposed system.
3.4 Data Preprocessing
This section discusses the process of data pre-processing and cleaning, which is an essential
step in preparing the dataset for object detection. We began by removing any irrelevant or
noisy data from the dataset. This included removing images that did not belong to any of the
classes we were interested in. We also performed image resizing and brightness adjustments
to ensure a stable and consistent dataset for training and testing the object detection model.
3.4.1 Removing Unnecessary Data
We preprocessed the dataset to ensure that it was clean and ready for training. This involved
removing any images that were noisy or did not represent flooded buildings. We also made
sure to remove any duplicate images [14], ensuring that the final dataset was as diverse and
representative as possible. We also performed image adjustments, such as brightness and
contrast enhancements, to further improve the quality of the dataset.
This step was crucial in ensuring that our model was able to learn from the best possible data
and achieve high levels of accuracy during the training process.
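A minimal sketch of exact-duplicate removal by content hashing, one of the simpler approaches described in [14]; the image names and byte contents here are hypothetical, and in practice the bytes would be read from files on disk:

```python
import hashlib

def find_exact_duplicates(images):
    """Group byte-identical images by their MD5 digest.

    `images` maps an image name to its raw bytes; only groups with
    more than one member (i.e. duplicates) are returned.
    """
    seen = {}
    for name, data in images.items():
        digest = hashlib.md5(data).hexdigest()
        seen.setdefault(digest, []).append(name)
    return [names for names in seen.values() if len(names) > 1]

# Hypothetical images: a.jpg and b.jpg are byte-identical
imgs = {"a.jpg": b"\x01\x02", "b.jpg": b"\x01\x02", "c.jpg": b"\x03"}
print(find_exact_duplicates(imgs))  # [['a.jpg', 'b.jpg']]
```

Note that this only catches byte-identical copies; near-duplicates require perceptual hashing as discussed in [14].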
3.4.2 Data Resizing
We used OpenCV’s resize() method to resize our images. We resized all images to a standard
resolution of 640×640 pixels with 3 channels to ensure consistency across the dataset. This
allows for better processing and training of our models. Additionally, this ensures that all
images are of the same size and aspect ratio, making it easier to work with and analyze the
data.
3.5 Image Annotation
The dataset must be labeled in order for the model to understand the relationship between the
input data and the desired output. However, in many cases, obtaining a labeled dataset can be
a time-consuming and labor-intensive process. In this study, we used an online annotation
tool called Roboflow to manually label our dataset for object detection [15]. This involved
drawing bounding boxes around the objects of interest in the images and providing
appropriate labels for each object.
Additionally, we experimented with different methods of automatic annotation to label our
images. This allowed us to quickly and efficiently label our dataset, which is essential for
training an accurate deep learning model.
3.6 YOLOv7 Image Annotation
Image annotation is typically done by drawing bounding boxes or polygons around the
objects, and is used to train object detection models such as those based on deep learning
algorithms like YOLO. The process of annotation is usually done by human annotators, but
there are also software tools that can assist in this task. The annotated images provide the
object detection model with the necessary information to learn how to detect objects within
images. This process is crucial for the accuracy of the object detection models.
Figure 3.3: YOLOv7 Annotation Format
We used the YOLOv7 format for bounding box annotation, which is stored in .txt files. Each
row in the file represents one object and includes the class, x and y coordinates of the center,
and the width and height of the bounding box. These coordinates are normalized to the
dimensions of the image, with values between 0 and 1. Note that class numbers are
zero-indexed, starting from 0 [16].
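As a sketch, converting a pixel-space bounding box into one YOLOv7 label row could look like this; the coordinate values are illustrative:

```python
def to_yolo_line(cls, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel-space box (x1, y1, x2, y2) to a YOLO label row:
    class x_center y_center width height, all normalized to [0, 1]."""
    xc = (x1 + x2) / 2 / img_w
    yc = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{cls} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# Class 0 (building), box from (100, 200) to (300, 400) in a 640x640 image
print(to_yolo_line(0, 100, 200, 300, 400, 640, 640))
# 0 0.312500 0.468750 0.312500 0.312500
```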
Figure 3.4: YOLOv7 Annotated .txt File
In Figure 3.4, each object is represented by one row consisting of several columns: the
object class, the x and y coordinates of the center of the bounding box, and the width and
height of the bounding box (class, x_center, y_center, width, height) [16]. The bounding box
coordinates must be normalized to the dimensions of the image, so the values lie between 0
and 1, and the class numbers are zero-indexed, starting from 0. We also wrote a function to
convert annotations in VOC format into a representation where the bounding box information
is stored in a dictionary.
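A minimal sketch of such a VOC-to-dictionary converter, assuming standard Pascal VOC XML fields; the sample annotation below is hypothetical:

```python
import xml.etree.ElementTree as ET

def voc_to_dict(xml_string):
    """Parse a Pascal VOC annotation into a dictionary of boxes."""
    root = ET.fromstring(xml_string)
    size = root.find("size")
    info = {
        "width": int(size.find("width").text),
        "height": int(size.find("height").text),
        "boxes": [],
    }
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        info["boxes"].append({
            "class": obj.find("name").text,
            "xmin": int(box.find("xmin").text),
            "ymin": int(box.find("ymin").text),
            "xmax": int(box.find("xmax").text),
            "ymax": int(box.find("ymax").text),
        })
    return info

# Hypothetical minimal VOC annotation for one building
xml = """<annotation><size><width>640</width><height>640</height></size>
<object><name>building</name>
<bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>120</ymax></bndbox>
</object></annotation>"""
print(voc_to_dict(xml)["boxes"][0]["class"])  # building
```

The corner coordinates in this dictionary can then be normalized into YOLO center/width/height form.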
3.7 Automated Image Annotation
The process of image annotation is a crucial but time-consuming task in the field of computer
vision and object detection. It involves manually labeling images in a dataset with relevant
information, such as object class, location, and attributes. This is a labor-intensive task that
can take a significant amount of time and resources, especially when the dataset is large or
constantly changing. However, with the advancement of technology, it is now possible to
automate this process, making it more efficient and cost-effective. Automated image
annotation techniques can be trained to recognize objects with high accuracy, reducing the
need for human intervention. This not only saves time and resources but also ensures that the
dataset is up-to-date and accurate. By automating the image annotation process, it becomes
much more feasible to keep up with the rapid pace of data generation and improve the
performance of machine learning models. For our dataset, we used the OpenCV library to
achieve this goal. The process is shown below:
Figure 3.5: Converting the image to Grayscale
Automated image annotation involves a few essential steps. The first is to convert the
images to grayscale and apply a Gaussian blur to remove small noise [17], which ensures that
the following steps produce better results.
Figure 3.6: Apply Adaptive Canny Edge Detection
We used Adaptive Canny Edge Detection [7], an improved version of the traditional Canny
edge detection algorithm that uses local image intensity to adjust the threshold values in the
hysteresis thresholding step, making it more robust to variations in image intensity and
noise. By tuning the parameters, we can extract the edges of buildings only.
The threshold values are calculated using the following equations:
Lower threshold = mean(image intensity) - k1 * std(image intensity)
Upper threshold = mean(image intensity) + k2 * std(image intensity)
Where k1 and k2 are constants that are used to control the threshold values, and mean and std
are the mean and standard deviation of the image intensity in the region around each pixel.
Figure 3.7: Apply Dilation & Erosion
We also applied dilation to join and thicken the shapes, and erosion to remove small noise,
using OpenCV [18].
Figure 3.8: Find Coordinates
After that, we used the Contour Approximation Method [19] to find the coordinates and
bounding boxes from the binary image. We then converted the bounding box coordinates to the
YOLO label format and obtained the annotated .txt files.
CHAPTER 4
HARDWARE AND TOOLKIT
This chapter details the software and hardware components utilized in the development of our
system for detecting houses in flooded areas. The specific tools and technologies used to
implement the system will be discussed in depth.
4.1 Tools
The tools that will be utilized in the implementation of the proposed system include:
● Python
● NumPy
● Pandas
● OS
● Matplotlib
● OpenCV
● VS Code
● Colab Notebook
4.1.1 Python
Python is a widely used, high-level programming language employed in web development,
scientific computing, data analysis, artificial intelligence, and other fields. It
has a simple and easy-to-learn syntax, making it a popular choice for beginners and
experienced programmers alike. Python is also known for its large and active community,
which has developed a wide range of libraries and frameworks that make it easy to build
complex and powerful applications. These libraries provide a powerful and flexible set of
tools for building and training neural networks, which is the core technology behind our
system for detecting flooded houses. In this work, Python is used as a primary programming
language for implementing the deep learning model.
4.1.2 NumPy
NumPy is a library in Python that is commonly used for image processing tasks. It provides a
powerful array object, as well as a number of functions for manipulating arrays, including
mathematical and statistical operations. One of the main advantages of using NumPy for
image processing is its ability to perform element-wise operations on arrays, which allows for
efficient implementation of many image processing algorithms. Additionally, NumPy can be
easily integrated with other libraries, such as OpenCV, making it a versatile tool for image
processing tasks.
It has a variety of features, such as the following:
● A powerful N-dimensional array object
● Advanced (broadcasting) capabilities
● Tools for integrating Fortran and C/C++ code
● Useful Fourier transform, random number, and linear algebra abilities
● Can store common data in a multidimensional format
● Can quickly and cleanly connect to a variety of databases
● Allows the definition of arbitrary data types
4.1.3 Pandas
Pandas is a Python library that is used for data manipulation and analysis. It provides
powerful data structures like the DataFrame and Series, which allow for easy manipulation
and analysis of large datasets. Pandas also has built-in functions for handling missing data,
merging and joining data, and filtering and grouping data. Additionally, it has strong support
for reading and writing data in various file formats such as CSV, Excel, and SQL. Pandas is a
crucial tool for data scientists and data analysts and is widely used in data wrangling, data
exploration, and data visualization tasks.
4.1.4 OS Module
The Python OS module provides tools for interacting with the operating system. It allows for
the use of operating system-dependent features and contains a variety of file system interface
functions through the os and os.path modules. These functions can be used for tasks such as
navigating file directories and manipulating files and directories.
4.1.5 Matplotlib
Matplotlib is a powerful library for creating visualizations in Python. It can be used to create
static, animated, and interactive plots, making it a versatile tool for data exploration and
analysis. In our project, we utilized Matplotlib for visualizing the detected flooded houses by
plotting the images and displaying the results of our model. This made it easy to understand
the performance of our model and gain insights from the data. Overall, Matplotlib played an
important role in our project by providing a clear and intuitive way to present our findings.
4.1.6 OpenCV
In this project, we used OpenCV, which is an open-source computer vision library, to perform
various image processing tasks. This library contains a wide range of tools and functions that
can be used to process and analyze images and videos. In our project, we used OpenCV to
read, display, and manipulate images. We also used it to perform image cropping, resizing,
and thresholding to enhance the image quality. Additionally, OpenCV's feature detection and
extraction capabilities were used to identify objects in the flooded images, which is an
important step in object detection. Overall, OpenCV proved to be a valuable tool in this
project, as it allowed us to perform various image processing tasks efficiently and effectively.
4.1.7 VS Code
Visual Studio Code (VS Code) is a popular source-code editor that is widely used by
developers for its powerful features and ease of use. Developed by Microsoft, it is available
for Windows, Linux, and macOS and is designed for building and debugging modern web
and cloud applications. Some of the key features of VS Code include debugging, syntax
highlighting, intelligent code completion, and code refactoring, as well as support for a wide
range of programming languages and a large number of customizable extensions. It is
considered a lightweight, fast, and flexible code editor.
4.1.8 Colab Notebook
Colab Notebook is a web-based platform for machine learning development. It is a free,
open-source Jupyter notebook environment that requires no setup and runs entirely in the
cloud. With Colab Notebook, you can write, execute, and share code with others, as well as
import data, train models, and collaborate with others in real time. It also allows the use
of powerful hardware such as GPUs and TPUs for training models, and it provides seamless
integration with Google Drive, which makes it easy to store and access data and models.
Overall, Colab Notebook is a great tool for data scientists, machine learning engineers, and
researchers, who need a powerful and easy-to-use environment for their work.
4.2 Hardware
● Processor: Intel(R) Core(TM) i3-8130U CPU @ 2.20 GHz
● RAM: 8 GB DDR4, 2400 MHz
● OS: Windows 10
CHAPTER 5
RESULT & DISCUSSION
This chapter presents the results and performance analysis of our proposed system, YOLOv7,
on our dataset of flood-affected building images. The metrics used include accuracy,
precision, recall, and mAP. Additionally, the chapter includes visual representations of the
comparison between actual and predicted results for both the automatic and manual
annotation methods.
5.1 Performance Evaluation
The mAP (mean Average Precision) metric is commonly used to evaluate the performance of
object detection models, such as YOLO. It is calculated by taking into account various factors
including the Intersection over Union (IOU), precision, recall, and the precision-recall curve.
The mAP score provides a comprehensive measure of the model's overall accuracy.
5.1.1 IOU
IOU is a metric used to evaluate the performance of object detection models such as YOLO,
by measuring the overlap between predicted and ground truth bounding boxes. mAP is the
mean Average Precision, which takes into account the IOU threshold for successful detection.
Figure 5.1: IOU
mAP50 is the accuracy when IOU is set at 50%, meaning that if there is more than 50%
overlap between the predicted and ground truth bounding boxes, it is considered a correct
detection. The higher the IOU threshold, the stricter the evaluation, and thus the lower the
mAP value [20].
5.1.2 Precision and Recall
Precision is a metric that measures the accuracy of positive predictions. It is the ratio of true
positive detections to the total number of detections, including false positives. A precision of
1.0 means all positive predictions were correct, while a lower value indicates false positives.
Precision = TP / (TP + FP)
Precision measures the proportion of correctly identified positive instances, while recall
measures the proportion of actual positive instances that were correctly identified. A model
with high precision and high recall is considered to be accurate.
Recall = TP / (TP + FN)
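These definitions can be expressed directly in code; the TP/FP/FN counts below are illustrative, not taken from our experiments:

```python
def precision_recall(tp, fp, fn):
    """Precision = TP/(TP+FP); Recall = TP/(TP+FN), guarded against
    division by zero when there are no detections or no positives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Illustrative counts: 90 correct detections, 10 false alarms, 15 misses
print(precision_recall(tp=90, fp=10, fn=15))  # (0.9, 0.8571428571428571)
```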
5.1.3 Average Precision
Average Precision (AP) is a commonly used metric in object detection to evaluate the
performance of a model. It is calculated by measuring the area under the Precision-Recall
Curve. AP is considered a more comprehensive metric as it takes into account both precision
and recall, providing a more accurate assessment of the model's performance. A higher AP
value indicates that the model is able to identify more relevant objects and minimize false
positives, resulting in a better-performing model.
5.1.4 Mean Average Precision
The metric of mAP (mean Average Precision) is a useful tool for evaluating the performance
of object detection models. It takes into account both precision and recall by averaging the
AP (Average Precision) values for all classes in the model. A higher mAP value indicates a
more accurate and efficient model. It is commonly used to compare the performance of
different models and to identify areas for improvement.
mAP = (1/N) × Σᵢ APᵢ, where the sum runs over all N classes
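A one-line implementation of this formula; the per-class AP values below (0.94 for building and 0.91 for vegetation, from Table 5.1) average to roughly the reported overall mAP:

```python
def mean_average_precision(ap_per_class):
    """mAP = (1/N) * sum of per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)

# Per-class APs for (building, vegetation) as reported in Table 5.1
print(round(mean_average_precision([0.94, 0.91]), 3))  # 0.925
```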
5.2 Experimental Analysis
The experiments have been separated into two parts. In the first part, we evaluate the
performance of our proposed system, YOLOv7, applied to our manually annotated dataset.
The dataset consists of two classes, buildings and vegetation, which were labeled by human
experts. We trained the YOLOv7 model using this dataset and calculated the accuracy,
precision, recall, and mAP. In the second part of the experiments, we applied the YOLOv7
model to the automatically annotated dataset and compared the results with those of the
manually annotated dataset. Overall, our experiments aim to provide a comprehensive
evaluation of the performance of YOLOv7 on our flood-affected building detection dataset.
5.3 YOLOv7 On Manually Annotated Dataset
We trained YOLOv7 on the manually annotated dataset, and the results were quite
impressive. The model achieved a high level of accuracy, with an overall mAP of
around 0.92. The precision and recall were both very high, with precision being around 0.90
and recall being around 0.86. This indicates that the model was able to accurately detect and
classify the buildings and vegetation in the images with a high degree of accuracy. Overall,
the results of this experiment were very promising and demonstrate the potential of YOLOv7
for use in flood detection systems.
5.3.1 Object Detection Report
Table 5.1: Object Detection Report For Manually Annotated Data
We trained the model on our manually annotated images for 50 epochs, which took a total of 5
hours. The annotated images were divided into two classes: building and vegetation. The
validation results are presented in Table 5.1. For the 593 validation images, the
precision for the building class was 93% and for vegetation was 88%. The overall precision
for all classes was 90%. For recall, the results were 88% for building and 84% for vegetation,
with an overall recall of 86%. The model demonstrated a high level of accuracy, with an
overall mAP (mean average precision) of around 0.92. Additionally, the building class had a
mAP of 94% and the vegetation class had a mAP of 91%.
5.3.2 Confusion Matrix
Figure 5.2: Confusion Matrix For Manually Annotated Data
Figure 5.2 presents a confusion matrix as a representation of the model's performance. The
x-axis of the matrix corresponds to the true categories, and the y-axis corresponds to the
model's outputs. The matrix includes floating values that indicate the percentage of samples.
The color of the blocks encodes the percentage of a class (x) classified into a predicted class
(y). Additionally, the matrix takes into account false positives and false negatives for the
background class. False positive for background refers to objects that are not part of either of
the classes but are detected as such, and false negative for background refers to objects that
are not detected and considered as background. Overall, the model correctly predicted 92% of
building and 90% of vegetation from the total validation images.
5.3.3 Accuracy & Loss Curve
Figure 5.3: Accuracy & Loss Curve For Manually Annotated Data
Figure 5.3 illustrates the performance of our model using YOLOv7. The figure shows the loss
curves for the box, objectness, and classification components of the model, along with the
precision, recall, and mAP. The results indicate that all the components of the model
performed well. The box and objectness losses measure the accuracy of the bounding box
prediction, and the classification loss measures the error in object class prediction; all of
these losses are used to adjust the network's parameters and improve detection accuracy.
Overall, the results are quite good.
5.4 YOLOv7 On Automatic Annotated Dataset
We trained the YOLOv7 model on the automatically annotated dataset, and it performed
slightly worse than the previous model. The model achieved a reasonable level of accuracy,
with an overall mAP of around 0.73, a precision of around 0.79, and a recall of around 0.69.
The automated annotation system not only saves time but also gives promising results. The
results of this experiment indicate that YOLOv7 could be an effective tool for detecting
flooded buildings in images.
5.4.1 Object Detection Report
Table 5.2: Object Detection Report For Automatic Annotated Data
We trained the model on our automatically annotated images for 50 epochs, which took a total
of 5 hours. The automatically annotated images contain a single class: building. The
validation results are presented in Table 5.2. For the 593 validation images, the precision
for the building class was 79% and the recall was 69%, consistent with the comparison in
Table 5.3. The model demonstrated a decent level of accuracy, with an overall mAP (mean
average precision) of around 73%.
5.4.2 Confusion Matrix
Figure 5.4: Confusion Matrix For Automatic Annotated Data
Figure 5.4 shows a visualization of the model's performance in the form of a confusion
matrix. The rows of the matrix correspond to the true categories, and the columns correspond
to the model's predictions. The values within the matrix show the percentage of samples that
fall into a particular category. The color of the blocks represents the percentage of a class (x)
that was classified as a predicted class (y). The matrix also takes into account false positives
and false negatives for the background class, which refer to objects that were incorrectly
identified or missed by the detector and considered as background. According to the results,
the model correctly identified 68% of building instances from the total validation images.
5.4.3 Accuracy & Loss Curve
Figure 5.5: Accuracy & Loss Curve For Automatic Annotated Data
Figure 5.5 illustrates the performance of our model using YOLOv7. The figure shows the loss
curves for the box and objectness components of the model, along with the precision, recall,
and mAP. The results indicate that the performance of this model is lower than that of the
previous model. The box and objectness losses measure the accuracy of the bounding box
prediction, and all of these components are used to adjust the network's parameters and
improve detection accuracy.
5.5 Comparison of Performance
Metric       YOLOv7 (manual annotation)   YOLOv7 (automatic annotation)
Precision    0.90                         0.79
Recall       0.86                         0.69
mAP@0.5      0.92                         0.73
Table 5.3: Model Comparison
The results from the manually annotated dataset and the automatically annotated dataset
were compared and analyzed. The model trained on the manually annotated dataset showed a
higher mAP of around 0.92, compared to around 0.73 for the model trained on the
automatically annotated dataset. The manually annotated model also achieved higher precision
(0.90 vs. 0.79) and recall (0.86 vs. 0.69). However, the use of an automated annotation
system greatly reduced the time and resources needed for annotation, highlighting the
potential benefits of this approach. Overall, both models showed promising results in
detecting flooded buildings in images.
CHAPTER 6
CONCLUSION & FUTURE WORK
This chapter presents the conclusions and evaluations of our proposed system, based on the
results and observations from the experiments. It also highlights the limitations of the current
research and suggests potential future work to improve the system's performance.
6.1 Conclusion
We compared the results of training YOLOv7 on both manual and automated annotated
datasets and found that the model achieved higher accuracy with the manually annotated
dataset, with an overall mAP of around 92%. However, using an automated annotation
system saved a significant amount of time. Overall, the results of this comparison
demonstrate the potential for using YOLOv7 in flood detection systems and the benefits of
using an automated annotation system to speed up the process.
6.2 Limitation
The study used a relatively small number of images for training and testing the model, which
may not be representative of all possible flood scenarios. It focused only on detecting
flooded buildings and did not consider other types of flooding, such as road flooding or
flash flooding.
6.3 Future Work
We will continue analyzing and improving the model, and we aim to build practical
applications with it.
CHAPTER 7
REFERENCES
7.1 References
[1] Y. Pi, N. D. Nath, and A. H. Behzadan, “Convolutional neural networks for object
detection in aerial imagery for disaster response and recovery,” Advanced Engineering
Informatics, vol. 43, p. 101009, Jan. 2020, doi: 10.1016/j.aei.2019.101009.
[2] Y. Wang, S. Li, F. Teng, Y. Lin, M. Wang, and H. Cai, “Improved Mask R-CNN for Rural
Building Roof Type Recognition from UAV High-Resolution Images: A Case Study in
Hunan Province, China,” Remote Sensing, vol. 14, no. 2, p. 265, Jan. 2022, doi:
10.3390/rs14020265.
[3] K. Yang, S. Zhang, X. Yang, and N. Wu, “Flood Detection Based on Unmanned Aerial
Vehicle System and Deep Learning,” Complexity, vol. 2022, pp. 1–9, May 2022, doi:
10.1155/2022/6155300.
[4] H. Rizk, Y. Nishimur, H. Yamaguchi, and T. Higashino, “Drone-Based Water Level
Detection in Flood Disasters,” International Journal of Environmental Research and Public
Health, vol. 19, no. 1, p. 237, Dec. 2021, doi: 10.3390/ijerph19010237.
[5] N. S. Ibrahim, S. M. Sharun, M. K. Osman, S. B. Mohamed, and S. H. Y. S. Abdullah,
“The application of UAV images in flood detection using image segmentation techniques,”
Indonesian Journal of Electrical Engineering and Computer Science, vol. 23, no. 2, p. 1219,
Aug. 2021, doi: 10.11591/ijeecs.v23.i2.pp1219-1226.
[6] C. Kyrkou, “AIDER: Aerial Image Database for Emergency Response applications,”
GitHub, Feb. 18, 2022. https://github.com/ckyrkou/AIDER .
[7] G. Jie and L. Ning, “An Improved Adaptive Threshold Canny Edge Detection
Algorithm,” IEEE Xplore, Mar. 01, 2012.
https://ieeexplore.ieee.org/abstract/document/6187852 .
[8] G.-S. Xia, J. Hu, F. Hu, B. Shi, X. Bai, Y. Zhong, L. Zhang, and X. Lu, “AID: A
benchmark data set for performance evaluation of aerial scene classification,” IEEE
Transactions on Geoscience and Remote Sensing, vol. 55, no. 7, pp. 3965–3981, 2017.
[9] C. I. and B. E. R. (CIBER) Lab, “Volan v.2018: object detection in aerial imagery for
disaster response and recovery,” GitHub, Oct. 25, 2021.
https://github.com/ciber-lab/volan-yolo
[10] D. Programmer, “DarkLabel,” GitHub, Oct. 28, 2022.
https://github.com/darkpgmr/DarkLabel
[11] K.-Y. Wong, “Official YOLOv7,” GitHub, Sep. 12, 2022.
https://github.com/WongKinYiu/yolov7
[12] R. Fisher, S. Perkins, A. Walker, and E. Wolfart, “Feature Detectors - Sobel Edge
Detector,” homepages.inf.ed.ac.uk, 2003.
https://homepages.inf.ed.ac.uk/rbf/HIPR2/sobel.htm
[13] M. Chablani, “YOLO — You only look once, real time object detection explained,”
Medium, Aug. 31, 2017.
https://towardsdatascience.com/yolo-you-only-look-once-real-time-object-detection-explaine
d-492dc9230006
[14] A. Igareta, “Removing Duplicate or Similar Images in Python,” Medium, Jul. 14, 2021.
https://towardsdatascience.com/removing-duplicate-or-similar-images-in-python-93d447c1c3
eb#:~:text=To%20apply%20it%20in%20a (accessed Jan. 24, 2023).
[15] “Roboflow Annotate,” roboflow.com. https://roboflow.com/annotate
[16] “Step-by-step instructions for training YOLOv7 on a Custom Dataset,” Paperspace Blog,
Oct. 20, 2022. https://blog.paperspace.com/train-yolov7-custom-data/
[17] “OpenCV: Smoothing Images,” docs.opencv.org.
https://docs.opencv.org/4.x/d4/d13/tutorial_py_filtering.html
[18] “OpenCV: Eroding and Dilating,” Opencv.org, 2019.
https://docs.opencv.org/3.4/db/df6/tutorial_erosion_dilatation.html
[19] “OpenCV: Contours : Getting Started,” docs.opencv.org.
https://docs.opencv.org/3.4/d4/d73/tutorial_py_contours_begin.html#:~:text=Contours%20ca
n%20be%20explained%20simply (accessed Jan. 24, 2023).
[20] “Intersection over Union (IoU) in Object Detection and Segmentation,” LearnOpenCV, Jun. 28, 2022.
https://learnopencv.com/intersection-over-union-iou-in-object-detection-and-segmentation/