This presentation describes preliminary work on generating synthetic training data to facilitate the training of deep learning (DL) models, with a focus on aerial photos produced by unmanned aerial vehicles (UAVs). The general concept and methodology are described, and preliminary results are presented for two problems: a classification problem of identifying fire in forests, and a counting problem of estimating the number of houses in urban areas. The proposed technique opens a new possibility for the DL community, especially for UAV-based imagery analysis, with promising results and considerable unexplored ground for further research.
Training deep learning models to count using synthetic images
1. TRAINING DEEP LEARNING MODELS TO COUNT USING SYNTHETIC IMAGES
DR. ANDREAS KAMILARIS
6 Sept. 2019
@DL-UAV19, PROC. OF CAIP 2019
2. Problem
• Difficult to create ground truth data
• UCSD Pedestrian Database
• Video of pedestrians on UCSD walkways, taken
from a stationary camera
3. Problem
• Difficult to create ground truth data
• UCF CC 50 dataset
• Counts of persons range between 94 and 4543, with an average
of 1280 individuals per image
4. Motivation
• Generate data by simulation, which is then easy to label.
• Create synthetic ground truth data
Rahnemoonfar, M. and Sheppard, C., 2017. Deep count: fruit counting based on deep
simulated learning. Sensors, 17(4), p.905.
5. First try
• Generated simulated data in Python
• Goal: Detection of fires in forests
• 160 images, 80 of forest, 80 of fire
• 80% training, 20% testing
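The 80/20 split above can be sketched as follows. This is a minimal illustration, not the author's code: the file names are hypothetical placeholders, and splitting each class separately (a stratified split) is an assumption, since the slides only give the overall proportions.

```python
import random

def stratified_split(items_by_label, train_fraction=0.8, seed=0):
    """Split each class separately so both sets keep the class balance."""
    rng = random.Random(seed)
    train, test = [], []
    for label, items in items_by_label.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train += [(item, label) for item in shuffled[:cut]]
        test += [(item, label) for item in shuffled[cut:]]
    return train, test

# Hypothetical file names standing in for the 80 forest and 80 fire images.
dataset = {
    "forest": [f"forest_{i:03d}.png" for i in range(80)],
    "fire": [f"fire_{i:03d}.png" for i in range(80)],
}
train_set, test_set = stratified_split(dataset)
print(len(train_set), len(test_set))  # 128 32
```

With 160 images this leaves 128 for training and 32 for testing, 16 of each class in the test set.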
7. First try
• False negatives where there was smoke AND fire!
Data                          Custom   Inception-v3
Only generative               0.65     0.62
Only real                     0.8      0.69
Combined, augmented dataset   0.9      0.71
9. Second try
• Improvement of results!
• Very few false negatives
Data                          Custom   Inception-v3
Only generative               0.95     0.72
Only real                     0.8      0.69
Combined, augmented dataset   1.0      0.79
10. Generative data case
Training based on 2,000 synthetic images (labelled as fire or forest)
Testing based on 100 real-world aerial photos (classified as 50 images of forest and 50 images of fire)
11. Hypothesis 1
Generating synthetic data can help to train deep learning models,
without the need to create expensive (in terms of time and effort)
ground truth data!
12. Hypothesis 2
Generating synthetic data can help to train deep learning models
not only to classify, but also to count!
… not only simple problems, but also more advanced ones…
13. Application: Counting houses from aerial photos
60 photos taken from satellite images in urban areas of Tanzania
Manually counted the number of houses, to create our testing/validation dataset.
Each photo contains between 0 and 38 houses.
14. Application: Counting houses from aerial photos
Created synthetic training data, with automated house counting
First naïve try: Involved only squares!
MSE = 41
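A squares-only generator of the kind described above might look like the sketch below. It is an illustration under stated assumptions, not the generator used for the talk: the image size, the square sizes, the binary pixel values and the function name `make_square_image` are all hypothetical. The point it shows is that the label (the house count) is known exactly because the generator itself places the objects.

```python
import random

def make_square_image(size=64, min_houses=1, max_houses=38,
                      side_range=(4, 8), seed=None):
    """Return (image, count): a size x size grid of 0/1 pixels and the
    number of squares ("houses") actually drawn on it."""
    rng = random.Random(seed)
    image = [[0] * size for _ in range(size)]
    target = rng.randint(min_houses, max_houses)
    count = 0
    for _ in range(target * 20):  # bounded number of placement attempts
        if count == target:
            break
        side = rng.randint(*side_range)
        top = rng.randint(0, size - side)
        left = rng.randint(0, size - side)
        # Check a one-pixel margin so that squares never touch or overlap.
        rows = range(max(0, top - 1), min(size, top + side + 1))
        cols = range(max(0, left - 1), min(size, left + side + 1))
        if any(image[r][c] for r in rows for c in cols):
            continue
        for r in range(top, top + side):
            for c in range(left, left + side):
                image[r][c] = 1
        count += 1
    return image, count  # the count is the label, obtained without manual work

image, count = make_square_image(seed=1)
print(count)
```

The returned count is the number of squares actually placed, so the label stays exact even when a crowded image cannot fit the full target. Setting `min_houses=0` would also produce empty images.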
16. Application: Counting houses from aerial photos
Third try: Added grass, fences and different orientations of houses. Also added images without any houses.
MSE = 20
17. Application: Counting houses from aerial photos
Many decisions to take along the way…
• Dropout rate (35-50% works well)
• Stride (2 is small, 10 is too big)
• Convolutions (7x7 initially seems a good option)
• Pre-training (ImageNet is not helping a lot)
• Max-pooling better than average pooling
• Dense layers at the end of the network, ReLU function
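To see why the stride matters so much, it can help to compute how quickly the first convolution shrinks the image. The sketch below uses the standard output-size formula for a convolution; the 224-pixel input resolution is an assumption, since the slides do not state the image size.

```python
def conv_output_size(input_size, kernel_size, stride, padding=0):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# 224 is an assumed input resolution; the slides do not give one.
for stride in (2, 10):
    print(stride, conv_output_size(224, kernel_size=7, stride=stride))
# 2 109
# 10 22
```

Under this assumption, a 7x7 filter with stride 2 keeps a 109x109 feature map, while stride 10 reduces it to 22x22, which may discard detail needed to separate small, closely spaced houses.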
18. Application: Counting houses from aerial photos
• Adapted, custom topology
• Inception-ResNet
• 7x7 input filter with large stride
• Dense fully-connected layer with ReLU
19. Application: Counting houses from aerial photos
Training vs. Testing MSE
The model can predict the number of houses with an error of 4.47 houses.
For example, for a photo with 20 houses, the model would predict a count in the range [16, 24].
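The slides do not say how the 4.47 figure was derived. It coincides with the square root of the MSE of 20 reported for the third try, so one plausible reading is that it is a root-mean-square error. The sketch below works through that arithmetic under this assumption.

```python
import math

def mse(predicted, actual):
    """Mean squared error between predicted and true counts."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

def rmse_from_mse(mse_value):
    """Root-mean-square error, expressed in the same unit as the counts."""
    return math.sqrt(mse_value)

error = rmse_from_mse(20)            # MSE = 20 from the third try
print(round(error, 2))               # 4.47
low, high = 20 - error, 20 + error   # a photo with 20 houses
print(round(low), round(high))       # 16 24
```

Taking the square root puts the error back in units of houses, which makes it directly comparable to the counts themselves.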
Training based on 10,000 synthetic images (labelled with exact number of houses)
Testing based on 60 real-world aerial photos (labelled with exact number of houses)
21. Application: Counting houses from aerial photos
Next steps:
• Crop houses from training dataset and reuse based on
random combinations in semi-synthetic images
• More realistic generation of data
• (GAN for counting?)
• Accountability
• Other domains:
o Agriculture (counting animals on farms)
o Energy (renewable energy on roofs)
o Environment and Climate (counting trees, plants,
endangered species of animals etc.)
o Microbiology (blood test analysis etc.)
22. State of the art (published in 2019)
Kar, A., Prakash, A., Liu, M.Y., Cameracci, E., Yuan, J., Rusiniak, M., Acuna, D., Torralba, A. and Fidler, S.,
2019. Meta-Sim: Learning to Generate Synthetic Datasets. arXiv preprint arXiv:1904.11621.
23. Future Work and Research Direction
State of the art (published in 2019)
Combining Counting CNN model with the ResNeXt architecture
Tian, M. et al., Automated pig counting using deep learning, Computers and Electronics in Agriculture, vol. 163, pp. 1-10, 2019.
MAE = 4.47
MAE = 2.77
24. Conclusion
• Synthetic data can be used for training DL models
• Can be applied in UAV-related applications (classification vs.
counting problems)
• More advanced techniques are required for improving
performance (e.g. probabilistic scene graph generation, density
maps)
25. THANKS FOR YOUR ATTENTION!
DR. ANDREAS KAMILARIS
EMAIL: A.KAMILARIS@UTWENTE.NL