Preliminary Evaluation of TinyYOLO on a New Dataset for Search-And-Rescue with Drones
1. Preliminary Evaluation of TinyYOLO on a New
Dataset for Search-And-Rescue with Drones
Giovanna Castellano, Ciro Castiello, Corrado Mencar, Gennaro Vessio
CILAB, Department of Computer Science, University of Bari, Italy
gennaro.vessio@uniba.it
ISCMI 2020
2. Context
Drones can provide a cost-efficient aid to
emergency rescue operations:
● swarms of aerial vehicles can be rapidly
spread across a disaster area providing
mobile ad-hoc networks
● they can rapidly overfly and traverse
difficult-to-reach regions, such as
mountains, islands, etc.
● they can deliver rescue apparatus, such as
medications, much faster than rescue
teams
3. Motivations
However, in such a scenario, a manual search performed by a flight operator
(based on the aerial video captured by the drone) can prove extremely difficult:
● it requires prolonged concentration to fly the drone and carry out the
search at the same time
● the operator may work in poor conditions, because of the small size of the
monitor they are equipped with and the glare on the screen outdoors
The use of autonomous drones can reduce manual human intervention, thereby
increasing the detection rate while reducing rescue time
4. Goal
This opportunity motivates research efforts
towards the development of real-time intelligent
tools to be mounted directly on-board drones
Nowadays, drones embed quite powerful GPUs,
so even a simple UAV can be transformed into
an advanced computer vision flying machine
5. Proposed method
During a real-world SAR operation,
neither high-performance on-board
computing systems nor high-speed
network connections are likely to be
available
Therefore, a lightweight and fast neural
network model is required to efficiently
process each video frame
For this reason, we used the
well-known lightweight TinyYOLOv3
model to implement an object (people)
detection system
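The requirement to process each video frame efficiently can be expressed as a per-frame time budget. The following is a minimal sketch, not the authors' code: `frame_budget_ms` and `measure_fps` are hypothetical helper names, and `detect` is a stand-in for whatever detector is actually run on each frame.

```python
import time
from typing import Any, Callable, Iterable


def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def measure_fps(detect: Callable[[Any], Any], frames: Iterable[Any]) -> float:
    """Run `detect` on every frame and return the achieved frames per second.

    `detect` is a placeholder for the real detector (an assumption of this sketch).
    """
    frame_list = list(frames)
    if not frame_list:
        raise ValueError("at least one frame is required")
    start = time.perf_counter()
    for frame in frame_list:
        detect(frame)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from the clock on very fast runs.
    return len(frame_list) / elapsed if elapsed > 0 else float("inf")
```

For example, `frame_budget_ms(7)` gives roughly 143 ms per frame, which corresponds to the 7 fps reported for the Jetson TX2 in the Setting slide.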
6. VisDrone - Task 1 dataset (fine-tuning)
● Largest dataset of aerial images ever published
● Bounding boxes of different object categories
● Various weather and lighting conditions
● Sparse and crowded scenes
7. New SAR dataset (testing only)
Unfortunately, most of the existing
works make use of images captured by
drones depicting everyday-life
scenarios that are unrealistic for SAR
At present, we propose a new dataset
including two different SAR scenarios:
● mountains (200 frames, 100
with annotated people)
● beaches (110 frames, 55 with
annotated people)
Video frames were retrieved by querying
YouTube and manually annotated in
accordance with well-known standards
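The composition of the dataset can be tallied directly from the figures above. This is a small illustrative sketch; the `Scenario` class and `summarize` function are hypothetical names, and only the frame counts come from the slide.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Scenario:
    name: str
    frames: int              # total frames in the scenario
    frames_with_people: int  # frames containing at least one annotated person


# Frame counts as stated on the slide.
SCENARIOS: List[Scenario] = [
    Scenario("mountains", 200, 100),
    Scenario("beaches", 110, 55),
]


def summarize(scenarios: List[Scenario]) -> Dict[str, float]:
    """Aggregate frame counts over all scenarios."""
    total = sum(s.frames for s in scenarios)
    positive = sum(s.frames_with_people for s in scenarios)
    return {
        "frames": total,
        "with_people": positive,
        "without_people": total - positive,
        "positive_fraction": positive / total,
    }
```

With these numbers, `summarize(SCENARIOS)` reports 310 frames in total, 155 of them with annotated people, so half of the frames in each scenario contain no people.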
8. Setting
● Software:
○ Darknet library
● Hardware:
○ Google Colab NVIDIA Tesla K80 (training)
○ NVIDIA Jetson TX2 (testing) → 7 fps
● Hyper-parameter setting:
○ Mini-batch size: 64
○ Learning rate: 0.001
○ Early stopping on a validation set
○ Input size: 416×416 pixels
○ IoU threshold: 50%
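The IoU (intersection over union) criterion can be sketched as follows. This is a minimal illustration rather than the evaluation code used in this work: the `(x_min, y_min, x_max, y_max)` box format, the function names, and the choice of `>=` at the 50% threshold are all assumptions.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (if any).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """Treat a predicted box as a correct detection when IoU reaches the threshold."""
    return iou(pred, truth) >= threshold
```

For instance, two 10×10 boxes offset horizontally by 5 pixels overlap on half their area, giving an IoU of 1/3, which falls below the 50% threshold.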
10. Conclusion
This work represents a first step in our research whose long-term goal is to develop
a large, challenging dataset to promote advances in this research direction
To this end, the dataset has been made publicly available in its current version and
we invite contributions to expand it further:
https://doi.org/10.5281/zenodo.3924925