Scratch to Supercomputers: Bottoms-up Build of Large-scale Computational Lens... (inside-BigData.com)
In this deck from the 2018 Swiss HPC Conference, Gilles Fourestey from EPFL presents: Scratch to Supercomputers: Bottoms-up Build of Large-scale Computational Lensing Software.
"LENSTOOL is a gravitational lensing software that models the mass distribution of galaxies and clusters. It was developed by Prof. Kneib, head of the LASTRO lab at EPFL, and collaborators, starting in 1996. It is used to obtain sub-percent precision measurements of the total mass in galaxy clusters and to constrain the dark matter self-interaction cross-section, a crucial ingredient to understanding its nature.
However, LENSTOOL lacks efficient vectorization and uses only OpenMP, which limits its execution to a single node and can lead to run times that exceed several months. The LASTRO lab and the EPFL HPC group therefore decided to rewrite the code from scratch; to minimize risk and maximize performance, they used a bottom-up approach that focuses on exposing parallelism at the hardware and instruction levels. The result is a high-performance code, fully vectorized on Xeon, Xeon Phi and GPUs, that currently scales up to hundreds of nodes on CSCS’ Piz Daint, one of the fastest supercomputers in the world."
Watch the video: https://wp.me/p3RLHQ-ili
Learn more: https://infoscience.epfl.ch/record/234382/files/EPFL_TH8338.pdf?subformat=pdfa
and
http://www.hpcadvisorycouncil.com/events/2018/swiss-workshop/agenda.php
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2021/10/introduction-to-simultaneous-localization-and-mapping-slam-a-presentation-from-gareth-cross/
Independent game developer (and former technical lead of state estimation at Skydio) Gareth Cross presents the “Introduction to Simultaneous Localization and Mapping (SLAM)” tutorial at the May 2021 Embedded Vision Summit.
This talk provides an introduction to the fundamentals of simultaneous localization and mapping (SLAM). Cross aims to provide foundational knowledge, and viewers are not expected to have any prerequisite experience in the field.
The talk consists of an introduction to the concept of SLAM, as well as practical design considerations in formulating SLAM problems. Visual inertial odometry is introduced as a motivating example of SLAM, and Cross explains how this problem is structured and solved.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2021/10/efficient-deep-learning-for-3d-point-cloud-understanding-a-presentation-from-facebook/
Bichen Wu, Research Scientist at Facebook Reality Labs, presents the “Efficient Deep Learning for 3D Point Cloud Understanding” tutorial at the May 2021 Embedded Vision Summit.
Understanding the 3D environment is a crucial computer vision capability required by a growing set of applications such as autonomous driving, AR/VR and AIoT. 3D visual information, captured by LiDAR and other sensors, is typically represented by a point cloud consisting of thousands of unstructured points.
Developing computer vision solutions to understand 3D point clouds requires addressing several challenges, including how to efficiently represent and process 3D point clouds, how to design efficient on-device neural networks to process 3D point clouds, and how to easily obtain data to train 3D models and improve data efficiency. In this talk, Wu shows how his company addresses these challenges as part of its “SqueezeSeg” research and presents a highly efficient, accurate, and data-efficient solution for on-device 3D point-cloud understanding.
For the full video of this presentation, please visit:
https://www.embedded-vision.com/platinum-members/embedded-vision-alliance/embedded-vision-training/videos/pages/may-2018-embedded-vision-summit-benosman
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Ryad B. Benosman, Professor at the University of Pittsburgh Medical Center, Carnegie Mellon University and Sorbonne Université, presents the "What is Neuromorphic Event-based Computer Vision? Sensors, Theory and Applications" tutorial at the May 2018 Embedded Vision Summit.
In this presentation, Benosman introduces neuromorphic, event-based approaches for image sensing and processing. State-of-the-art image sensors suffer from severe limitations imposed by their very principle of operation. These sensors acquire the visual information as a series of “snapshots” recorded at discrete points in time, hence time-quantized at a predetermined frame rate, resulting in limited temporal resolution, low dynamic range and a high degree of redundancy in the acquired data. Nature suggests a different approach: Biological vision systems are driven and controlled by events happening within the scene in view, and not, like conventional image sensors, by artificially created timing and control signals that have no relation to the source of the visual information.
Translating the frameless paradigm of biological vision to artificial imaging systems implies that control over the acquisition of visual information is no longer imposed externally on an array of pixels but rather the decision making is transferred to each individual pixel, which handles its own information individually. Benosman introduces the fundamentals underlying such bio-inspired, event-based image sensing and processing approaches, and explores their strengths and weaknesses. He shows that bio-inspired vision systems have the potential to outperform conventional, frame-based vision acquisition and processing systems and to establish new benchmarks in terms of data compression, dynamic range, temporal resolution and power efficiency in applications such as 3D vision, object tracking, motor control and visual feedback loops, in real-time.
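The per-pixel event-generation principle described above can be sketched for a single pixel: an event fires whenever the log intensity drifts by more than a fixed contrast threshold since the last event. This is a minimal illustration, not the circuit of any particular sensor; the function name and threshold value are invented for the example.

```python
import math

def generate_events(intensities, threshold=0.2):
    """Emit (time_index, polarity) events for one pixel whenever its
    log-intensity changes by more than `threshold` since the last event."""
    events = []
    ref = math.log(intensities[0])
    for t, value in enumerate(intensities[1:], start=1):
        delta = math.log(value) - ref
        if abs(delta) >= threshold:
            events.append((t, 1 if delta > 0 else -1))
            ref = math.log(value)  # reset the reference level after firing
    return events
```

Note that the pixel is silent while the scene is static, which is the source of the data-compression and temporal-resolution advantages discussed above.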
Visual Environment by Semantic Segmentation Using Deep Learning: A Prototype ... (Tomohiro Fukuda)
This document describes a proposed method for estimating sky view factor (SVF) using semantic segmentation with deep learning networks. Specifically:
- It develops a system using SegNet and U-Net deep learning models to perform pixel-wise semantic segmentation of sky and non-sky areas from images to calculate SVF ratios.
- The system was trained on 300 manually segmented images and tested on 100 fisheye photographs, achieving 98% accuracy in estimating SVF under different sky conditions.
- Future work is needed to apply the system to live video streams rather than static images. The method provides an efficient, high-precision way to estimate important urban environmental metrics like SVF.
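Once the network has labelled each pixel as sky or non-sky, the SVF estimate described above reduces to a pixel ratio. A minimal sketch of that idea follows; note that some SVF formulations additionally weight fisheye pixels by solid angle, which this simple ratio omits, and the mask convention (1 = sky, 0 = non-sky, -1 = outside the image circle) is an assumption for the example.

```python
def sky_view_factor(mask):
    """Estimate SVF as the fraction of sky pixels (value 1) among all
    valid pixels (0 or 1); -1 marks pixels outside the fisheye circle."""
    sky = total = 0
    for row in mask:
        for v in row:
            if v < 0:
                continue  # ignore pixels outside the image circle
            total += 1
            sky += v
    return sky / total if total else 0.0
```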
This document summarizes research on using multi-sensor image fusion over networked decision support systems to improve target recognition in complex environments. The research involved simulating a wireless mesh network to disseminate imagery data and evaluate quality of service. Experiments analyzed fused visual and thermal images from different sensors and their classification performance using computer vision and machine learning techniques. Results found that fused multi-spectral images had the best classification accuracy compared to single sensor images. The conclusions recommend further evaluating the mesh network, expanding experiments to more sensors and fusion techniques, and testing capabilities in tactical field environments.
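The summary does not specify which fusion operator was used; a common baseline for fusing registered visual and thermal imagery is a pixel-wise weighted average, sketched below. The function name and the single-channel float representation are assumptions for illustration.

```python
def fuse(visual, thermal, alpha=0.5):
    """Pixel-wise weighted fusion of two registered single-channel images;
    alpha controls the contribution of the visual band."""
    return [
        [alpha * v + (1.0 - alpha) * t for v, t in zip(vrow, trow)]
        for vrow, trow in zip(visual, thermal)
    ]
```

More sophisticated schemes (e.g. multi-resolution or PCA-based fusion) replace the fixed weight with spatially varying ones, but the registered-then-combine structure is the same.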
Critical Infrastructure Monitoring Using UAV Imagery (aditess)
The use of two rapidly evolving approaches, Unmanned Aerial Vehicle (UAV) and Dense Image Matching (DIM) techniques, is an attractive solution for extracting high-quality photogrammetric products such as 3D point clouds and orthoimages.
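One standard way to make thousands of unstructured points tractable, as the efficiency challenge above requires, is voxel-grid downsampling: collapse all points falling in the same voxel to their centroid. This is a generic preprocessing sketch, not the specific representation used in the SqueezeSeg work.

```python
from collections import defaultdict

def voxel_downsample(points, voxel_size):
    """Collapse an unstructured point cloud to one centroid per voxel,
    a common first step when preparing LiDAR data for a network."""
    buckets = defaultdict(list)
    for p in points:
        key = tuple(int(c // voxel_size) for c in p)
        buckets[key].append(p)
    return [
        tuple(sum(c) / len(pts) for c in zip(*pts))
        for pts in buckets.values()
    ]
```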
This document discusses using deep learning for seismic tomography. It begins with an overview of seismic tomography and the forward and inverse problems. It then discusses using deep learning approaches like empirical risk minimization with neural networks to solve the inverse problem. Several deep learning architectures are evaluated including those using semblance cubes, spectrograms of raw seismic data, and raw seismic data directly as input. Recurrent neural networks with LSTM and GRU cells are also explored for image reconstruction. The document concludes that while performance is good on simple models, more data and increased network capacity is needed for complex geology. It also lists several related publications.
SIGGRAPH 2014 Course on Computational Cameras and Displays (part 2) (Matthew O'Toole)
Recent advances in both computational photography and displays have given rise to a new generation of computational devices. Computational cameras and displays provide a visual experience that goes beyond the capabilities of traditional systems by adding computational power to optics, lights, and sensors. These devices are breaking new ground in the consumer market, including lightfield cameras that redefine our understanding of pictures (Lytro), displays for visualizing 3D/4D content without special eyewear (Nintendo 3DS), motion-sensing devices that use light coded in space or time to detect motion and position (Kinect, Leap Motion), and a movement toward ubiquitous computing with wearable cameras and displays (Google Glass).
This short (1.5 hour) course serves as an introduction to the key ideas and an overview of the latest work in computational cameras, displays, and light transport.
The document describes a study that combined optical camera images and synthetic aperture radar (SAR) data to monitor glacier flow using remote and proximal sensing techniques. Fast correlation algorithms were used to calculate glacier displacement from both data sources. The results were then fused to derive 3D displacement vectors, highlighting both successes and challenges of the multi-sensor approach. Computation times were significantly reduced through algorithm optimization and parallelization.
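The "fast correlation" displacement measurement mentioned above boils down to finding the lag that best aligns two image profiles. A minimal 1-D sketch of that idea (the study itself works on 2-D image patches and uses optimized, parallelized implementations):

```python
def estimate_shift(reference, shifted, max_lag):
    """Return the integer lag that maximizes the correlation between two
    1-D profiles -- the core of correlation-based displacement tracking."""
    best_lag, best_score = 0, float("-inf")
    n = len(reference)
    for lag in range(-max_lag, max_lag + 1):
        score = sum(
            reference[i] * shifted[i + lag]
            for i in range(n)
            if 0 <= i + lag < n
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

In practice this search is done in the frequency domain (FFT-based cross-correlation) with sub-pixel refinement, which is where the reported speedups come from.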
A Three-Dimensional Representation method for Noisy Point Clouds based on Gro... (Sergio Orts-Escolano)
Slides used for the thesis defense of the PhD candidate Sergio Orts-Escolano.
The research described in this thesis was motivated by the need for a robust model capable of representing 3D data obtained with 3D sensors, which are inherently noisy. In addition, time constraints have to be considered, as these sensors can provide a 3D data stream in real time. This thesis proposed the use of Self-Organizing Maps (SOMs) as a 3D representation model; in particular, the Growing Neural Gas (GNG) network, which has been successfully used for clustering, pattern recognition and topology representation of multi-dimensional data. Until now, Self-Organizing Maps have been computed primarily offline, and their application to 3D data has mainly focused on noise-free models, without considering time constraints. A hardware implementation is proposed that leverages the computing power of modern GPUs, taking advantage of the paradigm of General-Purpose Computing on Graphics Processing Units (GPGPU). The proposed methods were applied to different problems and applications in computer vision, such as recognition and localization of objects, visual surveillance and 3D reconstruction.
This document provides an overview and summary of a presentation on Simultaneous Localization and Mapping (SLAM). It introduces the speaker, Dong-Won Shin, and his background and research in SLAM. The contents of the presentation are then outlined, including an introduction to SLAM, traditional SLAM approaches like Extended Kalman Filter SLAM and FastSLAM, efforts towards large-scale mapping like graph-based SLAM and loop closure detection, modern state-of-the-art systems like ORB SLAM, KinectFusion and Lidar SLAM, and applications of SLAM. Key algorithms in visual odometry, backend optimization, and loop closure detection are also summarized.
Towards Exascale Simulations for Regional-Scale Earthquake Hazard and Riskinside-BigData.com
The document discusses the goals and progress of the Department of Energy's Exascale Computing Project (ECP) to develop exascale simulations for regional-scale earthquake hazard and risk assessments. The ECP aims to (1) develop computational frameworks coupling geophysics and infrastructure modeling codes, (2) increase frequency resolution and reduce runtimes through advances in hardware, software, and algorithms, and (3) establish performance benchmarks to track progress towards exascale capabilities. Initial regional demonstrations in 2017 showed promising realism in simulated ground motions and infrastructure response. Further work includes waveform inversions, GPU optimizations, and assessing how far simulations can augment probabilistic hazard assessments.
This document analyzes KinectFusion, a real-time 3D reconstruction system using a moving depth camera. It introduces SLAMBench, a benchmarking framework for KinectFusion. The document describes the KinectFusion pipeline including preprocessing, tracking, integration and raycasting steps. It evaluates several RGB-D datasets and identifies the Washington RGB-D Scenes dataset as most suitable. It notes drawbacks in KinectFusion like noisy trajectories and inconsistent models. Future work proposed is reducing tracking noise using a Kalman filter.
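The integration step of the pipeline above fuses each new depth observation into the voxel grid as a weighted running average of truncated signed distances. A minimal per-voxel sketch, with the weight cap chosen arbitrarily for the example:

```python
def integrate(tsdf, weight, distance, new_weight=1.0, max_weight=64.0):
    """Fuse one new truncated signed-distance observation into a voxel
    via the weighted running average used in KinectFusion-style systems."""
    fused = (tsdf * weight + distance * new_weight) / (weight + new_weight)
    return fused, min(weight + new_weight, max_weight)
```

Capping the weight keeps the map responsive to change; averaging is also what smooths out per-frame depth noise in the reconstruction.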
Real-time large scale dense RGB-D SLAM with volumetric fusion extends KinectFusion to larger scales. It represents the volumetric reconstruction as a rolling buffer that translates as the camera moves. It estimates camera pose through combined geometric and photometric constraints. It closes loops by non-rigidly deforming the map with constraints from loop closures and jointly optimizes the camera poses and map. Evaluation shows it produces large, globally consistent, real-time dense reconstructions.
FastCampus 2018 SLAM Workshop
You can find the code diagrams via the link below.
https://www.dropbox.com/sh/u76i5hzdecd4ey7/AADgs9XzXt6k1j971vyBrFTea?dl=0
1. The document describes a mobile image recognition system using a CNN model called Network-in-Network. It was implemented as iOS and Android apps that can recognize food images without needing an online server.
2. The system achieves high accuracy of 78.8% for top-1 and 95.2% for top-5 recognition of food images from the UECFOOD100 dataset, with processing speeds of 55.7ms. It uses techniques like batch normalization and multi-threading to optimize performance on mobile.
3. The architecture was modified from the original Network-in-Network by adding batch normalization, reducing layers and kernels, and using multiple image sizes to balance recognition accuracy and speed. Global average pooling
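Global average pooling, mentioned above, replaces the fully connected layers at the end of a network by averaging each feature map down to a single scalar, which cuts parameters sharply on mobile. A minimal sketch (list-of-lists feature maps, one 2-D map per channel, are an assumption of the example):

```python
def global_average_pooling(feature_maps):
    """Reduce each H x W feature map to one scalar by averaging,
    yielding a vector with one entry per channel."""
    pooled = []
    for fmap in feature_maps:  # one 2-D map per channel
        values = [v for row in fmap for v in row]
        pooled.append(sum(values) / len(values))
    return pooled
```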
The document presents a vision-based traffic surveillance system that uses digital image processing techniques. The system improves image quality by enhancing contrast and removing noise and blur. It then uses edge detection and morphological processing to segment vehicles. The number of vehicles in each lane is counted and used to determine the time allotted to that lane, achieving 90% accuracy compared to existing systems.
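After segmentation, the vehicle count per lane reduces to counting connected foreground blobs in the binary image. A minimal flood-fill sketch (the summary does not state the labelling method actually used):

```python
def count_blobs(binary):
    """Count 4-connected foreground blobs in a binary image -- a stand-in
    for counting segmented vehicles after edge detection and morphology."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not seen[r][c]:
                count += 1
                stack = [(r, c)]  # flood-fill this blob
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < rows and 0 <= x < cols \
                            and binary[y][x] and not seen[y][x]:
                        seen[y][x] = True
                        stack.extend([(y + 1, x), (y - 1, x),
                                      (y, x + 1), (y, x - 1)])
    return count
```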
The document describes a process for analyzing drone images to generate geospatial data including extracting EXIF metadata from images, using overlapping images to generate a 3D point cloud, texturing the point cloud to create a mesh, deriving a digital elevation model from the mesh, and orthorectifying and georeferencing images.
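The EXIF metadata step above typically includes recovering the camera's GPS position for georeferencing; EXIF stores latitude and longitude as degree/minute/second rationals plus a hemisphere reference. A small conversion sketch (the function name is invented for the example):

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert EXIF-style GPS degrees/minutes/seconds to decimal degrees;
    south ("S") and west ("W") references negate the result."""
    decimal = degrees + minutes / 60.0 + seconds / 3600.0
    return -decimal if ref in ("S", "W") else decimal
```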
June 13, 2019, SSII2019 Organized Session: Multimodal 4D sensing. The current state of SLAM technology for end users. Speaker: Tomoyuki Mukasa (Research Scientist, Rakuten Institute of Technology)
https://confit.atlas.jp/guide/event/ssii2019/static/organized#OS2
This document summarizes research conducted using supercomputers to enable artificial intelligence applications for analyzing large earth science data. It discusses two major functions: designing efficient simulations and developing intelligent data mining methods. Specific projects are described, including simulating climate models on the Sunway TaihuLight, simulating earthquakes, and using remote sensing data and deep learning to map land cover more accurately, detect oil palm trees, and create more accurate urban land use maps. The research enables digital earth modeling to simulate, analyze, understand, predict, and mitigate earth science issues.
An Open Source solution for Three-Dimensional documentation: archaeological a... (Giulio Bigliardi)
The modern techniques of Structure from Motion (SfM) and Image-Based Modelling (IBM) open new perspectives in the field of archaeological documentation, providing a simple and accurate way to record three-dimensional data.
The software Python Photogrammetry Toolbox (PPT) is an Open Source solution that implements a pipeline to perform 3D reconstruction from a set of pictures. It takes pictures as input and automatically performs 3D reconstruction for the images for which 3D registration is possible. It is composed of Python scripts that automate the different steps of the workflow. The entire process is reduced to two commands: calibration and dense reconstruction. The user can run it from a graphical interface or from the terminal. Calibration is performed with Bundler, while dense reconstruction is done through CMVS/PMVS.
Despite the automation, the user can control the final result by choosing two initial parameters: the image size and the feature detector. Acting on the first parameter reduces the computation time and decreases the density of the point cloud. Acting on the feature detector influences the final result: PPT can work both with SIFT (patented by the University of British Columbia, freely usable only for research purposes) and with VLFEAT (released under the GPL v.2 license). The use of VLFEAT ensures a more accurate result, though it increases the calculation time.
Python Photogrammetry Toolbox, released under the GPL v.3 license, is a classical example of a FLOSS project in which instruments and knowledge are shared. The community works on the development of the software, sharing code modifications, feedback and bug-checking.
This document lists MATLAB project titles from 2009-2014 related to various IEEE transactions and conferences. It includes over 50 projects covering topics like image processing, signal processing, power electronics, renewable energy, and more. Contact information is provided for Triple Tech Soft to inquire about these MATLAB projects.
Introduction to computer vision with Convoluted Neural Networks (MarcinJedyk)
Introduction to computer vision with convolutional neural networks: going over the history of CNNs, describing basic concepts such as convolution, and discussing applications of computer vision and image recognition technologies.
This document provides an introduction to computer vision with convolutional neural networks. It discusses what computer vision aims to address and provides a brief overview of neural networks and their basic building blocks. It then covers the history and evolution of convolutional neural networks, how and why they work on digital images, their limitations, and applications such as object detection. Examples are given of early CNNs from the 1980s and 1990s and of advances through the 2010s that improved accuracy, including deeper networks, inception modules, residual connections, and efficiency-oriented designs such as MobileNets. Training deep CNNs requires large datasets and may take weeks, but pre-trained networks can be fine-tuned for new tasks.
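The convolution operation at the heart of these networks slides a small kernel over the image and sums elementwise products at each position. A minimal "valid"-mode sketch (as in most deep learning frameworks, the kernel is not flipped, so this is strictly cross-correlation):

```python
def convolve2d(image, kernel):
    """'Valid' 2-D convolution as used in CNN layers: slide the kernel
    over the image and sum elementwise products at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for u in range(kh):
                for v in range(kw):
                    acc += image[i + u][j + v] * kernel[u][v]
            row.append(acc)
        out.append(row)
    return out
```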
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNNZihao(Gerald) Zhang
The document describes a method to automatically detect window regions in 3D point cloud data of indoor environments collected using a backpack sensor system. The method is based on R-CNN and uses MCG to generate region proposals, extracts features from proposals using a CNN, and classifies proposals as windows or non-windows using a random forest. Experiments on a dataset of 400 images achieved an F1 score of 89.79% and mAP of 96.64% for window detection, outperforming an existing method. Adding a small amount of manually labeled data further improved results.
This document discusses using deep learning for seismic tomography. It begins with an overview of seismic tomography and the forward and inverse problems. It then discusses using deep learning approaches like empirical risk minimization with neural networks to solve the inverse problem. Several deep learning architectures are evaluated including those using semblance cubes, spectrograms of raw seismic data, and raw seismic data directly as input. Recurrent neural networks with LSTM and GRU cells are also explored for image reconstruction. The document concludes that while performance is good on simple models, more data and increased network capacity is needed for complex geology. It also lists several related publications.
SIGGRAPH 2014 Course on Computational Cameras and Displays (part 2)Matthew O'Toole
Recent advances in both computational photography and displays have given rise to a new generation of computational devices. Computational cameras and displays provide a visual experience that goes beyond the capabilities of traditional systems by adding computational power to optics, lights, and sensors. These devices are breaking new ground in the consumer market, including lightfield cameras that redefine our understanding of pictures (Lytro), displays for visualizing 3D/4D content without special eyewear (Nintendo 3DS), motion-sensing devices that use light coded in space or time to detect motion and position (Kinect, Leap Motion), and a movement toward ubiquitous computing with wearable cameras and displays (Google Glass).
This short (1.5 hour) course serves as an introduction to the key ideas and an overview of the latest work in computational cameras, displays, and light transport.
The document describes a study that combined optical camera images and synthetic aperture radar (SAR) data to monitor glacier flow using remote and proximal sensing techniques. Fast correlation algorithms were used to calculate glacier displacement from both data sources. The results were then fused to derive 3D displacement vectors, highlighting both successes and challenges of the multi-sensor approach. Computation times were significantly reduced through algorithm optimization and parallelization.
A Three-Dimensional Representation method for Noisy Point Clouds based on Gro...Sergio Orts-Escolano
Slides used for the thesis defense of the PhD candidate Sergio Orts-Escolano.
The research described in this thesis was motivated by the need of a robust model capable of representing 3D data obtained with 3D sensors, which are inherently noisy. In addition, time constraints have to be considered as these sensors are capable of providing a 3D data stream in real time.This thesis proposed the use of Self-Organizing Maps (SOMs) as a 3D representation model. In particular, we proposed the use of the Growing Neural Gas (GNG) network, which has been successfully used for clustering, pattern recognition and topology representation of multi-dimensional data. Until now, Self-Organizing Maps have been primarily computed offline and their application in 3D data has mainly focused on free noise models, without considering time constraints. It is proposed a hardware implementation leveraging the computing power of modern GPUs, which takes advantage of a new paradigm coined as General-Purpose Computing on Graphics Processing Units (GPGPU). The proposed methods were applied to different problems and applications in the area of computer vision such as the recognition and localization of objects, visual surveillance or 3D reconstruction.
This document provides an overview and summary of a presentation on Simultaneous Localization and Mapping (SLAM). It introduces the speaker, Dong-Won Shin, and his background and research in SLAM. The contents of the presentation are then outlined, including an introduction to SLAM, traditional SLAM approaches like Extended Kalman Filter SLAM and FastSLAM, efforts towards large-scale mapping like graph-based SLAM and loop closure detection, modern state-of-the-art systems like ORB SLAM, KinectFusion and Lidar SLAM, and applications of SLAM. Key algorithms in visual odometry, backend optimization, and loop closure detection are also summarized.
Towards Exascale Simulations for Regional-Scale Earthquake Hazard and Risk - inside-BigData.com
The document discusses the goals and progress of the Department of Energy's Exascale Computing Project (ECP) to develop exascale simulations for regional-scale earthquake hazard and risk assessments. The ECP aims to (1) develop computational frameworks coupling geophysics and infrastructure modeling codes, (2) increase frequency resolution and reduce runtimes through advances in hardware, software, and algorithms, and (3) establish performance benchmarks to track progress towards exascale capabilities. Initial regional demonstrations in 2017 showed promising realism in simulated ground motions and infrastructure response. Further work includes waveform inversions, GPU optimizations, and assessing how far simulations can augment probabilistic hazard assessments.
This document analyzes KinectFusion, a real-time 3D reconstruction system using a moving depth camera. It introduces SLAMBench, a benchmarking framework for KinectFusion. The document describes the KinectFusion pipeline including preprocessing, tracking, integration and raycasting steps. It evaluates several RGB-D datasets and identifies the Washington RGB-D Scenes dataset as most suitable. It notes drawbacks in KinectFusion like noisy trajectories and inconsistent models. Future work proposed is reducing tracking noise using a Kalman filter.
Real-time large scale dense RGB-D SLAM with volumetric fusion extends KinectFusion to larger scales. It represents the volumetric reconstruction as a rolling buffer that translates as the camera moves. It estimates camera pose through combined geometric and photometric constraints. It closes loops by non-rigidly deforming the map with constraints from loop closures and jointly optimizes the camera poses and map. Evaluation shows it produces large, globally consistent, real-time dense reconstructions.
FastCampus 2018 SLAM Workshop
You can find the code diagrams via the link below.
https://www.dropbox.com/sh/u76i5hzdecd4ey7/AADgs9XzXt6k1j971vyBrFTea?dl=0
1. The document describes a mobile image recognition system using a CNN model called Network-in-Network. It was implemented as iOS and Android apps that can recognize food images without needing an online server.
2. The system achieves high accuracy of 78.8% top-1 and 95.2% top-5 recognition of food images from the UECFOOD100 dataset, with a processing time of 55.7 ms per image. It uses techniques like batch normalization and multi-threading to optimize performance on mobile.
3. The architecture was modified from the original Network-in-Network by adding batch normalization, reducing layers and kernels, and using multiple image sizes to balance recognition accuracy and speed. Global average pooling replaces the fully connected layers.
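Global average pooling collapses each feature map to a single value, which is what lets Network-in-Network style models drop their fully connected layers. A minimal sketch (plain Python on nested lists, not the app's actual implementation):

```python
def global_average_pool(feature_maps):
    """Collapse each H x W channel of a C x H x W feature map to its mean,
    yielding one scalar per channel (used in place of fully connected layers)."""
    return [sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
            for channel in feature_maps]

# Two 2x2 channels -> one value per channel
maps = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.0, 0.0], [0.0, 8.0]]]
print(global_average_pool(maps))  # [2.5, 2.0]
```

Because the pooled vector length equals the channel count, the last convolutional layer can output one channel per class and feed the result directly into a softmax.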
The document presents a vision-based traffic surveillance system that uses digital image processing techniques. The system works to improve image quality by enhancing contrast and removing noise and blurring. It then uses edge detection and morphological processing to segment vehicles. The number of vehicles in each lane is counted and used to determine the time allotted for that lane, with accuracy of 90% compared to existing systems.
The document describes a process for analyzing drone images to generate geospatial data including extracting EXIF metadata from images, using overlapping images to generate a 3D point cloud, texturing the point cloud to create a mesh, deriving a digital elevation model from the mesh, and orthorectifying and georeferencing images.
June 13, 2019, SSII2019 Organized Session: Multimodal 4D sensing. The current state of SLAM technology for end users. Speaker: 武笠 知幸 (Research Scientist, Rakuten Institute of Technology)
https://confit.atlas.jp/guide/event/ssii2019/static/organized#OS2
This document summarizes research conducted using supercomputers to enable artificial intelligence applications for analyzing large earth science data. It discusses two major functions: designing efficient simulations and developing intelligent data mining methods. Specific projects are described, including simulating climate models on the Sunway TaihuLight, simulating earthquakes, and using remote sensing data and deep learning to map land cover more accurately, detect oil palm trees, and create more accurate urban land use maps. The research enables digital earth modeling to simulate, analyze, understand, predict, and mitigate earth science issues.
An Open Source solution for Three-Dimensional documentation: archaeological a... - Giulio Bigliardi
The modern techniques of Structure from Motion (SfM) and Image-Based Modelling
(IBM) open new perspectives in the field of archaeological documentation, providing
a simple and accurate way to record three dimensional data.
The software Python Photogrammetry Toolbox (PPT) is an Open Source solution that
implements a pipeline to perform 3D reconstruction from a set of pictures. It takes
pictures as input and automatically performs the 3D reconstruction for the images for
which 3D registration is possible.
It is composed of python scripts that automate the different steps of the workflow.
The entire process is reduced to two commands: calibration and dense reconstruction.
The user can run it from a graphical interface or from terminal command. Calibration
is performed with Bundler while dense reconstruction is done through CMVS/PMVS.
Despite the automation, the user can control the final result by choosing two initial
parameters: the image size and the feature detector. Reducing the image size
lowers the computation time but also decreases the density of the point
cloud. The choice of feature detector influences the final result: PPT can work both
with SIFT (patented by the University of British Columbia and freely usable only for
research purposes) and with VLFEAT (released under the GPL v.2 license). Using
VLFEAT yields a more accurate result, though it increases the calculation time.
Python Photogrammetry Toolbox, released under the GPL v.3 license, is a classic
example of a FLOSS project in which instruments and knowledge are shared: the community works on the development of the software, sharing code modifications,
feedback and bug reports.
This document lists MATLAB project titles from 2009-2014 related to various IEEE transactions and conferences. It includes over 50 projects covering topics like image processing, signal processing, power electronics, renewable energy, and more. Contact information is provided for Triple Tech Soft to inquire about these MATLAB projects.
Introduction to computer vision with Convoluted Neural NetworksMarcinJedyk
Introduction to computer vision with Convoluted Neural Networks - going over history of CNNs, describing basic concepts such as convolution and discussing applications of computer vision and image recognition technologies
This document provides an introduction to computer vision with convoluted neural networks. It discusses what computer vision aims to address, provides a brief overview of neural networks and their basic building blocks. It then covers the history and evolution of convolutional neural networks, how and why they work on digital images, their limitations, and applications like object detection. Examples are provided of early CNNs from the 1980s and 1990s and recent advancements through the 2010s that improved accuracy, including deeper networks, inception modules, residual connections, and efforts to increase performance like MobileNets. Training deep CNNs requires large datasets and may take weeks, but pre-trained networks can be fine-tuned for new tasks.
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNNZihao(Gerald) Zhang
The document describes a method to automatically detect window regions in 3D point cloud data of indoor environments collected using a backpack sensor system. The method is based on R-CNN and uses MCG to generate region proposals, extracts features from proposals using a CNN, and classifies proposals as windows or non-windows using a random forest. Experiments on a dataset of 400 images achieved an F1 score of 89.79% and mAP of 96.64% for window detection, outperforming an existing method. Adding a small amount of manually labeled data further improved results.
Developing and Deploying Deep Learning Based Computer Vision Systems - Alka N... - CodeOps Technologies LLP
Deep Learning is enabling a wide range of computer vision applications from advanced driver assistance systems to sophisticated medical diagnostic devices. However, designing and deploying these applications involve a lot of challenges like handling large datasets, developing optimized models, effectively performing GPU computing and efficiently deploying deep learning models to embedded boards like NVIDIA Jetson. This session illustrates how MATLAB supports all phases of this workflow starting with algorithm design to automatically generating portable and optimized CUDA code helping engineers and scientists address the commonly observed challenges in deep learning workflow
SDVIs and In-Situ Visualization on TACC's Stampede - Intel® Software
Speaker: Paul Navrátil, Texas Advanced Computing Center (TACC)
The design emphasis for supercomputing systems has moved from raw performance to performance-per-watt, and as a result, supercomputing architectures are converging on processors with wide vector units and many processing cores per chip. Such processors are capable of performant image rendering purely in software. This improved capability is fortuitous, since the prevailing homogeneous system designs lack dedicated, hardware-accelerated rendering subsystems for use in data visualization. Reliance on this “software-defined” rendering capability will grow in importance since, due to growing data sizes, visualizations must be performed on the same machine where the data is produced. Further, as data sizes outgrow disk I/O capacity, visualization will be increasingly incorporated into the simulation code itself (in situ visualization).
This talk presents recent work in high-fidelity visualization using the OSPRay ray tracing framework on TACC’s local and remote visualization systems. We present work using OSPRay within ParaView Catalyst in situ framework from Kitware, including capitalizing on opportunities to reduce data costs migrating through VTK filters for visualization. We highlight the performance opportunities and advantages of Intel® Advanced Vector Extensions 512, the memory system improvements possible with Intel® Xeon Phi™ processor multi-channel DRAM (MCDRAM) and the Intel® Omni-Path Architecture interconnect.
For the full video of this presentation, please visit:
http://www.embedded-vision.com/platinum-members/qualcomm/embedded-vision-training/videos/pages/may-2016-embedded-vision-summit-mangen
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Michael Mangen, Product Manager for Camera and Computer Vision at Qualcomm, presents the "High-resolution 3D Reconstruction on a Mobile Processor" tutorial at the May 2016 Embedded Vision Summit.
Computer vision has come a long way. Use cases that were previously not possible in mass-market devices are now more accessible thanks to advances in depth sensors and mobile processors. In this presentation, Mangen provides an overview of how we are able to implement high-resolution 3D reconstruction – a capability typically requiring cloud/server processing – on a mobile processor. This is an exciting example of how new sensor technology and advanced mobile processors are bringing computer vision capabilities to broader markets.
Real-time image processing applied to a traffic queue detection algorithm - ajayrampelli
This document describes a real-time image processing algorithm for detecting traffic queues. The algorithm uses two operations: motion detection and vehicle detection. Motion detection involves differencing consecutive image frames and comparing the difference to a threshold. Vehicle detection applies edge detection techniques to image profiles. The algorithm aims to measure queue parameters like length, occurrence period, and slope in real-time using low-cost systems. It processes image sub-profiles to reduce computation time. Experimental results found the algorithm could measure queue length with 95% accuracy.
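The motion-detection step described above, differencing consecutive frames and comparing against a threshold, can be sketched in a few lines. The threshold value and helper names below are illustrative assumptions, not taken from the paper:

```python
THRESHOLD = 30  # assumed intensity-difference threshold, tuned per camera

def motion_mask(prev_frame, curr_frame, threshold=THRESHOLD):
    """Flag pixels whose grayscale intensity (0-255) changed by more than
    `threshold` between two consecutive frames, given as flat lists."""
    return [abs(c - p) > threshold for p, c in zip(prev_frame, curr_frame)]

def motion_detected(prev_frame, curr_frame, min_changed=1):
    """Declare motion in a queue profile if enough pixels changed."""
    return sum(motion_mask(prev_frame, curr_frame)) >= min_changed

prev = [10, 10, 10, 200]
curr = [12, 10, 95, 200]   # one pixel changed by 85
print(motion_detected(prev, curr))  # True
```

Running this on small sub-profiles rather than whole frames is what keeps the computation cheap, as the summary notes.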
A leading water utility company in the USA was facing the challenge of improving its pipeline inspection process to reduce human errors and manual inspection time. Pipeline Anomaly Detection automates the identification of defects in pipeline videos: a camera records the observations, and the system generates a report.
1. The document discusses using deep learning techniques for surface defect detection, focusing on strategies for dealing with imbalanced training data.
2. It proposes using generative adversarial networks (GANs) to generate synthetic defect samples in order to address the class imbalance problem. Convolutional neural networks (CNNs) are then used for classification.
3. Autoencoding models like convolutional autoencoders (CAE) and variational autoencoders (VAE) can also be used for unsupervised defect detection based on image reconstruction.
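The autoencoder-based detection in point 3 reduces to thresholding a reconstruction error: a model trained only on defect-free samples reconstructs normal images well, so a large error suggests a defect. A minimal sketch with an assumed threshold (the real models are CAEs/VAEs; here the reconstruction is just given as input):

```python
def reconstruction_error(original, reconstructed):
    """Mean squared error between an input image and its autoencoder output,
    both given as flat lists of pixel values."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)

def is_defective(original, reconstructed, threshold=0.05):
    """Flag a sample as defective when the autoencoder fails to reconstruct it.
    `threshold` is a hypothetical value that would be tuned on validation data."""
    return reconstruction_error(original, reconstructed) > threshold

normal = [0.2, 0.4, 0.6]
print(is_defective(normal, [0.21, 0.39, 0.61]))  # False: near-perfect reconstruction
print(is_defective(normal, [0.9, 0.1, 0.0]))     # True: poor reconstruction
```

This is why the approach is unsupervised: no defect labels are needed at training time, only normal samples.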
The document discusses several projects and implementations done by Karishma Jain related to computer vision and deep learning. These include visual question answering using CNNs and RNNs, parallelizing an ADABOOST classifier on different platforms, designing a lane departure warning system using monocular camera, and implementing various CNN architectures for MNIST classification achieving up to 97.74% accuracy.
This document discusses high-speed, high-resolution inspection of flat panel displays. It introduces a distributed image sensor computing system (DISCS) that uses GPUs for parallel processing to enable fast, in-line inspection. The DISCS uses dark-field illumination and line scan cameras for 2D defect detection. Algorithms like binarization and edge detection are implemented on the GPUs. Experimental results on touch panels and glass show inspection of megapixel images in under 3 seconds. Stereoscopic line scanning and moire topography techniques are discussed for 3D surface profiling with nanometer resolution and micrometer depth detection. Phase shifting interferometry is used to extract height maps. The system is designed for industrial inspection and could integrate
Urban Object Detection in UAV Images Using ResNet - balajimankena
This document proposes developing a ResNet neural network for object detection in urban areas using UAV images. It discusses limitations in existing methods and the need for an effective deep learning model specifically designed for UAV data. The proposed method uses a ResNet architecture combined with YOLO for real-time object detection. The network is trained on COCO and PASCAL VOC datasets. Evaluation shows the ResNet model achieves 95% accuracy on a test dataset. Future work involves classifying satellite images to improve accuracy over traditional methods.
Transfer Learning and Fine-tuning Deep Neural NetworksPyData
This document outlines Anusua Trivedi's talk on transfer learning and fine-tuning deep neural networks. The talk covers traditional machine learning versus deep learning, using deep convolutional neural networks (DCNNs) for image analysis, transfer learning and fine-tuning DCNNs, recurrent neural networks (RNNs), and case studies applying these techniques to diabetic retinopathy prediction and fashion image caption generation.
A ROS Implementation of the Mono-SLAM Algorithm - csandit
Computer vision approaches are increasingly used in mobile robotic systems, since they
make it possible to obtain a very good representation of the environment using low-power, cheap sensors.
In particular it has been shown that they can compete with standard solutions based on laser
range scanners when dealing with the problem of simultaneous localization and mapping
(SLAM), where the robot has to explore an unknown environment while building a map of it and
localizing in the same map. We present a package for simultaneous localization and mapping in
ROS (Robot Operating System) using a monocular camera sensor only. Experimental results in
real scenarios as well as on standard datasets show that the algorithm is able to track the
trajectory of the robot and build a consistent map of small environments, while running in near
real-time on a standard PC.
Transformer Architectures in Vision
[2018 ICML] Image Transformer
[2019 CVPR] Video Action Transformer Network
[2020 ECCV] End-to-End Object Detection with Transformers
[2021 ICLR] An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Weapon Detection using Machine Learning and Deep Learning
Technologies used: SSD, Faster R-CNN and YOLO algorithms.
● Automatic weapon detection using Convolutional Neural Network (CNN) based SSD and Faster R-CNN algorithms.
● The primary goal of this project is to enhance security and public safety.
● Weapons are detected in real time or through the analysis of recorded data, such as video feeds or images.
This document describes a proposed method for real-time object detection using Single Shot Multi-Box Detection (SSD) with the MobileNet model. SSD is a single, unified network for object detection that eliminates feature resampling and combines predictions. MobileNet is used to create a lightweight network by employing depthwise separable convolutions, which significantly reduces model size compared to regular convolutions. The proposed SSD with MobileNet model achieved improved accuracy in identifying real-time household objects while maintaining the detection speed of SSD.
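The size reduction from depthwise separable convolutions mentioned above is easy to quantify: a regular convolution learns k x k x C_in weights per output channel, while the separable version splits this into a per-channel k x k depthwise filter plus a 1x1 pointwise convolution. A quick arithmetic check for an illustrative layer (3x3 kernel, 32 in / 64 out channels; biases omitted for simplicity):

```python
def regular_conv_params(k, c_in, c_out):
    """Weight count of a standard k x k convolution."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise (one k x k filter per input channel) plus 1x1 pointwise."""
    return k * k * c_in + c_in * c_out

reg = regular_conv_params(3, 32, 64)          # 18432 weights
sep = depthwise_separable_params(3, 32, 64)   # 288 + 2048 = 2336 weights
print(reg, sep, round(reg / sep, 1))          # roughly 7.9x fewer parameters
```

For a 3x3 kernel the saving approaches a factor of 9 as the output channel count grows, which is where MobileNet's small model size comes from.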
Pushing the limits of ePRTC: 100ns holdover for 100 days - Adtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
How to Get CNIC Information System with Paksim Ga - danishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Programming Foundation Models with DSPy - Meetup Slides - Zilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer's life and facilitate a rapid transition from concept to production-ready applications. He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
UiPath Test Automation using UiPath Test Suite series, part 6 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
The UiPath Test Automation with generative AI and OpenAI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI with OpenAI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Climate Impact of Software Testing at Nordic Testing Days - Kari Kakkonen
My slides at Nordic Testing Days 6.6.2024
The talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint: a positive impact on the climate. Quality characteristics can be extended with sustainability and then measured continuously. Test environments can be used less, at smaller scale and on demand. Test techniques can be used to optimize or minimize the number of tests, and test automation can be used to speed up testing.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Observability Concepts EVERY Developer Should Know - DeveloperWeek Europe - Paige Cruz
Monitoring and observability aren't traditionally found in software curriculums, and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is part of our current company's observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring and observability to ops, infra and SRE teams. This is a mistake: achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share foundational concepts to build on.
GraphRAG for Life Science to increase LLM accuracy - Tomaz Bratanic
GraphRAG for the life science domain: retrieving information from biomedical knowledge graphs with LLMs to increase the accuracy and performance of generated answers.
Building Production Ready Search Pipelines with Spark and Milvus - Zilliz
Spark is a widely used ETL tool for processing, indexing and ingesting data into serving stacks for search. Milvus is a production-ready open-source vector database. In this talk we show how to use Spark to process unstructured data, extract vector representations, and push the vectors to the Milvus vector database for search serving.
Driving Business Innovation: Latest Generative AI Advancements & Success Story - Safe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Essentials of Automations: The Art of Triggers and Actions in FME - Safe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Convolutional Neural Network for pixel-wise skyline detection
1. Convolutional Neural Network for pixel-wise skyline detection
Darian Frajberg
Piero Fraternali
Rocio Nahime Torres
Department of Electronics, Information and Bioengineering, Politecnico di Milano
September 15, 2017
26th International Conference
on Artificial Neural Networks
2. Deep learning is a hot topic and has achieved outstanding results, outperforming previous techniques
in a very wide variety of applications (e.g., computer vision, speech recognition, NLP)
Augmented Reality (AR) applications are an emerging class of software that is receiving massive attention
(e.g., Pokemon Go), and the AR market is projected to be huge
The integration of Artificial Intelligence and Augmented Reality can lead to very
successful results, capable of attracting people to voluntarily carry out diverse tasks
Goals to accomplish
• High accuracy
• Low power devices support
• High real-time performance
• Acceptable memory usage
• Acceptable battery consumption
2
Introduction and motivation
3. Use case
– Convolutional Neural Network (CNN) for mountain skyline
detection
– Integration of the CNN into an AR mobile app for mountain peak identification
Mountain skyline detection
– Simple scenarios
– Complex scenarios
3
Introduction and motivation
4. Mountain skyline detection for simple scenarios
– Comprises clear sky and continuous skylines
4
Introduction and motivation
(Input) (Output)
5. Mountain skyline detection for complex scenarios
– May comprise fuzzy or interrupted skylines with obstacles
(e.g., clouds, trees, houses, cables, people, etc.)
5
Introduction and motivation
(Input) (Output)
6. Heuristic methods for skyline detection
– Edge-based
– Dynamic programming
– Solves simple scenarios
– Does not solve complex scenarios
Image-level CNN methods for skyline detection
– Semantic segmentation (seen as foreground-background problem)
– Solves simple scenarios
– Solving complex scenarios would require ground truth that is extremely
difficult to generate
6
Related work
7. Successful pixel-level CNN methods for other purposes
– Detection of cancer in biomedical images
– Edge extraction
Our approach
– Use pixel-wise CNN for mountain skyline detection
7
Related work
12. Model architecture
12
Skyline extraction with CNN
Layer    Type          Input          Kernel  Stride  Pad  Output
Layer1   Conv          29 x 29 x 3    6       1       0    24 x 24 x 20
Layer2   Pool (max)    24 x 24 x 20   2       2       0    12 x 12 x 20
Layer3   Conv          12 x 12 x 20   5       1       0    8 x 8 x 50
Layer4   Pool (max)    8 x 8 x 50     2       2       0    4 x 4 x 50
Layer5   Conv          4 x 4 x 50     4       1       0    1 x 1 x 500
Layer6   ReLU          1 x 1 x 500    -       1       0    1 x 1 x 500
Layer7   Conv          1 x 1 x 500    1       1       0    1 x 1 x 2
Layer8   Softmax loss  1 x 1 x 2      -       1       0    1 x 1 x 2
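The output sizes in the table follow the standard convolution/pooling formula out = (in + 2·pad − kernel) / stride + 1. A quick sanity check of the shapes and the parameter count (a sketch, not the authors' code):

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Standard convolution/pooling output size: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * pad - kernel) // stride + 1

# Walk the table, starting from a 29 x 29 x 3 input patch
s = conv_out(29, 6)            # Layer1 Conv 6x6      -> 24
assert s == 24
s = conv_out(s, 2, stride=2)   # Layer2 Max-pool 2x2  -> 12
s = conv_out(s, 5)             # Layer3 Conv 5x5      -> 8
s = conv_out(s, 2, stride=2)   # Layer4 Max-pool 2x2  -> 4
s = conv_out(s, 4)             # Layer5 Conv 4x4      -> 1
assert s == 1

# Learned parameters: (k*k*in_channels + 1 bias) * out_channels per conv layer
params = (6*6*3 + 1)*20 + (5*5*20 + 1)*50 + (4*4*50 + 1)*500 + (1*1*500 + 1)*2
assert params == 428_732       # matches the count reported on the next slide
```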
13. Training
– Caffe framework
– Workstation with NVIDIA GeForce GTX 1080
– Training time: 61 minutes
– 428,732 learned parameters
14. Deployment of Fully Convolutional Network
– Input: Image
– Output: spatial map in which each pixel is assigned a probability
of belonging to the skyline, scaled to the 0..255 range
(Input) (Output)
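Converting per-pixel probabilities into the 0..255 output map, and binarizing it for downstream alignment, takes only a few lines. A minimal sketch; the function names and the 128 threshold are assumptions for illustration, not values from the talk:

```python
def prob_map_to_gray(probs):
    """Scale per-pixel skyline probabilities in [0, 1] to the 0..255 range."""
    return [[round(p * 255) for p in row] for row in probs]

def threshold_mask(gray, thr=128):
    """Binarize the 0..255 map into a skyline mask (hypothetical threshold)."""
    return [[1 if v >= thr else 0 for v in row] for row in gray]
```

For example, `prob_map_to_gray([[0.0, 1.0, 0.2]])` yields `[[0, 255, 51]]`, and thresholding that row at 128 keeps only the confident skyline pixel.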
17. Accuracy evaluated at image level on the test dataset
– Average Skyline Accuracy (ASA)
– Average No Skyline Accuracy (ANSA)
– Average Accuracy (AA)
Evaluation
19. Evaluation example
– Average Skyline Accuracy: 98%
– Average No Skyline Accuracy: 73%
– Average Accuracy: 94%
Ground truth annotation pixel
Correctly predicted skyline pixel
Incorrectly predicted skyline pixel
(Annotation) (Evaluation)
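The exact metric definitions are given in the paper; a hypothetical per-column formulation consistent with the threshold and pixels-per-column parameters used in the accuracy table (names and definitions are assumptions, not the authors' code):

```python
def accuracy_metrics(pred, gt, threshold=0):
    """Hypothetical per-column accuracy metrics (assumed definitions).

    pred, gt: one skyline row index per column, or None where the
    column has no visible skyline (e.g., fully occluded).
    ASA:  fraction of ground-truth skyline columns predicted within
          `threshold` rows of the annotation.
    ANSA: fraction of no-skyline columns where no skyline was predicted.
    AA:   overall fraction of correct columns.
    """
    sky_hits = sky_total = nosky_hits = nosky_total = 0
    for p, g in zip(pred, gt):
        if g is None:
            nosky_total += 1
            nosky_hits += p is None
        else:
            sky_total += 1
            sky_hits += p is not None and abs(p - g) <= threshold
    asa = sky_hits / sky_total if sky_total else None
    ansa = nosky_hits / nosky_total if nosky_total else None
    aa = (sky_hits + nosky_hits) / len(gt)
    return asa, ansa, aa
```

With this formulation a prediction can score high ASA while missing no-skyline columns, which is why ASA and ANSA are reported separately from the overall average.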
20. Accuracy on test dataset images
Images                                       Pixels per column  Threshold  Avg. Skyline Accuracy  Avg. No Skyline Accuracy  Avg. Accuracy
Continuous skyline images from test dataset  1                  0          94.45%                 -                         94.45%
Complete test dataset images                 1                  100        92.45%                 20.14%                    86.87%
21. Runtime performance
The dimensions of a frame image impact:
– Accuracy
– Memory consumption
– Execution time
Good balance
– 321 x 241 px
We built our own native-code library to deploy the CNN on mobile
devices
23. PeakLens is an outdoor AR mobile application that identifies
mountain peaks and overlays them in real time on the camera view.
It extracts the mountain skyline with the CNN and aligns it with
the terrain skyline of the user's current location.
Usage experience
100k installs on Android
25. Concept
– CNN model for mountain skyline extraction trained with a large set of
annotated images taken in uncontrolled conditions
– Definition of metrics to evaluate the quality of the resulting skyline
– Support for its deployment on low-end mobile devices
– Integration of the module on an AR mobile app
Future work
– Optimization of the CNN model to achieve a faster execution time
– Improvement of obstacle handling
– Improvement of pre-processing and post-processing steps
– Runtime performance comparison vs. Caffe2 and TensorFlow with
MobileNets (both released after ICANN’s submission deadline)
Conclusions
26. Thanks for your attention!
Convolutional Neural Network
for pixel-wise skyline detection
Darian Frajberg
Piero Fraternali
Rocio Nahime Torres
{darian.frajberg, piero.fraternali, rocionahime.torres}@polimi.it