Convolutional Neural Network for pixel-wise
skyline detection
Darian Frajberg
Piero Fraternali
Rocio Nahime Torres
Department of Electronics, Information and Bioengineering, Politecnico di Milano
September 15, 2017
26th International Conference
on Artificial Neural Networks
Deep learning has outperformed previous techniques in a wide variety of applications (e.g., computer vision, speech recognition, NLP).
Augmented Reality (AR) applications are an emerging class of software that is attracting massive attention (e.g., Pokémon GO), and their market is projected to be very large.
Integrating Artificial Intelligence into Augmented Reality applications can lead to very successful results, capable of attracting people to voluntarily carry out diverse tasks.
Goals to accomplish
• High accuracy
• Support for low-power devices
• High real-time performance
• Acceptable memory usage
• Acceptable battery consumption
Introduction and motivation
Use case
– Convolutional Neural Network (CNN) for mountain skyline detection
– Integration of the CNN into an AR mobile app for mountain peak identification
Mountain skyline detection
– Simple scenarios
– Complex scenarios
Mountain skyline detection for simple scenarios
– Comprises clear sky and continuous skylines
(Input) (Output)
Mountain skyline detection for complex scenarios
– May comprise fuzzy or interrupted skylines with obstacles (e.g., clouds, trees, houses, cables, people)
(Input) (Output)
Heuristic methods for skyline detection
– Edge-based
– Dynamic programming
– Solve simple scenarios
– Do not solve complex scenarios
Image-level CNN methods for skyline detection
– Semantic segmentation (seen as foreground-background problem)
– Solve simple scenarios
– Solving complex scenarios would require ground truth that is extremely difficult to generate
Related work
Successful pixel-level CNN methods for other purposes
– Detection of cancer in biomedical images
– Edge extraction
Our approach
– Use pixel-wise CNN for mountain skyline detection
Skyline extraction with CNN
Dataset annotation (8,940 images)
(Continuous skyline annotation) (Interrupted skyline annotation)
Pre-processing
– Dataset split at image level
• 65% training
• 25% validation
• 10% test
– Patch extraction for each image (29 x 29 px)
• 100 positive patches
• 200 negative patches
– 8,940 images x 300 patches = 2,682,000 patches
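The pre-processing step above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that a positive patch is one whose center pixel lies on the annotated skyline and a negative patch is any other, and the function names, the synthetic mask and the nominal split counts (derived from the percentages) are hypothetical.

```python
import random

PATCH = 29             # patch side in pixels (from the slide)
HALF = PATCH // 2      # 14 pixels on each side of the center pixel
N_POS, N_NEG = 100, 200

def sample_patch_centers(skyline_mask, rng):
    """Return (positives, negatives): lists of (row, col) patch centers.

    skyline_mask is a list of rows of 0/1 values, 1 marking an annotated
    skyline pixel. Centers stay HALF pixels away from the border so that
    every 29 x 29 patch fits inside the image.
    """
    height, width = len(skyline_mask), len(skyline_mask[0])
    pos, neg = [], []
    for r in range(HALF, height - HALF):
        for c in range(HALF, width - HALF):
            (pos if skyline_mask[r][c] else neg).append((r, c))
    # Sample without replacement, capped by what the image offers.
    positives = rng.sample(pos, min(N_POS, len(pos)))
    negatives = rng.sample(neg, min(N_NEG, len(neg)))
    return positives, negatives

def crop(image, center):
    """Cut the 29 x 29 patch centered on `center` out of `image` (list of rows)."""
    r, c = center
    return [row[c - HALF:c + HALF + 1] for row in image[r - HALF:r + HALF + 1]]

# Synthetic 60 x 200 mask with a flat skyline on row 30 (illustrative only).
mask = [[1 if r == 30 else 0 for _ in range(200)] for r in range(60)]
positives, negatives = sample_patch_centers(mask, random.Random(0))
patch = crop(mask, positives[0])

# Nominal image-level split and total patch count from the slide's figures.
n_images = 8940
n_train = n_images * 65 // 100
n_val = n_images * 25 // 100
n_test = n_images - n_train - n_val
total_patches = n_images * (N_POS + N_NEG)
```

Splitting at image level before extracting patches keeps patches from the same photograph out of different subsets, which is presumably why the slide states the split that way.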
Patch extraction
(Patches) (Annotation)
Positive patches
Negative patches
Model architecture
| Layer | Type | Input | Kernel | Stride | Pad | Output |
|---|---|---|---|---|---|---|
| Layer1 | Conv | 29 x 29 x 3 | 6 | 1 | 0 | 24 x 24 x 20 |
| Layer2 | Pool (max) | 24 x 24 x 20 | 2 | 2 | 0 | 12 x 12 x 20 |
| Layer3 | Conv | 12 x 12 x 20 | 5 | 1 | 0 | 8 x 8 x 50 |
| Layer4 | Pool (max) | 8 x 8 x 50 | 2 | 2 | 0 | 4 x 4 x 50 |
| Layer5 | Conv | 4 x 4 x 50 | 4 | 1 | 0 | 1 x 1 x 500 |
| Layer6 | ReLU | 1 x 1 x 500 | - | 1 | 0 | 1 x 1 x 500 |
| Layer7 | Conv | 1 x 1 x 500 | 1 | 1 | 0 | 1 x 1 x 2 |
| Layer8 | Softmax loss | 1 x 1 x 2 | - | 1 | 0 | 1 x 1 x 2 |
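The output sizes in the layer table follow from the usual formula for unpadded convolution and pooling, output = (input + 2 x pad - kernel) / stride + 1. The short check below traces a 29 x 29 x 3 patch through the layers; the `LAYERS` list and the function names are illustrative, and the ReLU and softmax layers are left out because they do not change the shape.

```python
def out_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1

# (type, kernel, stride, output channels); pooling keeps the channel count.
LAYERS = [
    ("conv", 6, 1, 20),
    ("pool", 2, 2, None),
    ("conv", 5, 1, 50),
    ("pool", 2, 2, None),
    ("conv", 4, 1, 500),
    ("conv", 1, 1, 2),
]

def trace(height, width, channels=3):
    """Return the list of (height, width, channels) shapes after each layer."""
    shapes = [(height, width, channels)]
    for kind, kernel, stride, out_channels in LAYERS:
        height = out_size(height, kernel, stride)
        width = out_size(width, kernel, stride)
        if kind == "conv":
            channels = out_channels
        shapes.append((height, width, channels))
    return shapes

shapes = trace(29, 29)
for shape in shapes:
    print(" x ".join(str(v) for v in shape))
```

Because every layer is a convolution or a pooling step, the same `trace` function also accepts inputs larger than 29 x 29, which is the property that fully convolutional deployment relies on.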
Training
– Caffe framework
– Workstation with NVIDIA GeForce GTX 1080
– Training time: 61 minutes
– 428,732 learned parameters
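The parameter count can be reproduced from the layer table, assuming each convolution has one bias per output channel and that the pooling, ReLU and softmax layers have no learned parameters. The names below are illustrative.

```python
# (kernel size, input channels, output channels) for the four Conv layers.
CONV_LAYERS = [
    (6, 3, 20),     # Layer1
    (5, 20, 50),    # Layer3
    (4, 50, 500),   # Layer5
    (1, 500, 2),    # Layer7
]

def conv_params(kernel, in_channels, out_channels):
    """Weights (kernel x kernel x in x out) plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels

per_layer = [conv_params(*layer) for layer in CONV_LAYERS]
total = sum(per_layer)
print(per_layer, total)
```

Under these assumptions the sum matches the figure on the slide, and it shows that Layer5 holds the large majority of the weights.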
Deployment as a Fully Convolutional Network
– Input: image
– Output: spatial map in which each pixel is assigned a probability of being positive, i.e. of lying on the skyline (0..255 range)
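One way to obtain such a 0..255 map from the two-class output is to take the softmax probability of the positive class at each location and scale it to a byte. The sketch below assumes the two channels are ordered (negative, positive) and that the scaling is a plain multiplication by 255; neither detail is stated on the slides, and the function names are hypothetical.

```python
import math

def positive_score(logit_neg, logit_pos):
    """Two-class softmax probability of the positive class, scaled to 0..255."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logit_neg, logit_pos)
    e_neg = math.exp(logit_neg - m)
    e_pos = math.exp(logit_pos - m)
    prob = e_pos / (e_neg + e_pos)
    return round(prob * 255)

def score_map(logits):
    """logits: rows of (logit_neg, logit_pos) pairs -> rows of 0..255 integers."""
    return [[positive_score(n, p) for n, p in row] for row in logits]

# Tiny 2 x 2 example with made-up logits.
example = score_map([[(50.0, -50.0), (-50.0, 50.0)],
                     [(0.0, 1.0), (1.0, 0.0)]])
```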
(Input) (Output)
Post-processing
– Threshold
– Soft erosion
– Selection of at most N pixels per column
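A minimal sketch of the thresholding and per-column selection steps is given below. It assumes that a pixel passes when its score is strictly greater than the threshold and that the N highest-scoring pixels of each column are kept; the slides do not define these details, and the soft erosion step is omitted because its definition is not given. The function name is hypothetical.

```python
def postprocess(score_map, threshold, n_per_column=1):
    """Return a binary mask keeping, in each column, at most n_per_column
    pixels whose score exceeds `threshold` (highest scores first)."""
    height, width = len(score_map), len(score_map[0])
    mask = [[0] * width for _ in range(height)]
    for c in range(width):
        candidates = [(score_map[r][c], r) for r in range(height)
                      if score_map[r][c] > threshold]
        # Highest score first; ties go to the pixel nearest the top.
        candidates.sort(key=lambda item: (-item[0], item[1]))
        for _, r in candidates[:n_per_column]:
            mask[r][c] = 1
    return mask

scores = [
    [10, 0, 90],
    [200, 50, 120],
    [180, 99, 0],
]
skyline = postprocess(scores, threshold=100, n_per_column=1)
```

In this example the middle column has no score above the threshold, so it is left empty, which is how an interrupted skyline would be represented under these assumptions.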
CNN Accuracy
– 95.05%
– Evaluated at patch level
– Not representative enough of skyline quality on whole images
Evaluation
Accuracy evaluated at image level with test dataset
– Average Skyline Accuracy (ASA)
– Average No Skyline Accuracy (ANSA)
– Average Accuracy (AA)
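The slides name the three metrics, but their defining formulas are not in the extracted text. The sketch below is one plausible column-wise reading and should be treated as an assumption: a column with annotated skyline counts as correct when at least one predicted pixel coincides with an annotated one, a column without annotated skyline counts as correct when nothing is predicted in it, and the overall accuracy is the fraction of correct columns. Averaging the per-image values over the test images would then give ASA, ANSA and AA. The function name is hypothetical.

```python
def column_metrics(truth, pred):
    """truth, pred: binary masks of equal size (lists of rows).

    Returns (skyline_acc, no_skyline_acc, acc) for one image. An entry is
    None when the image has no column of the corresponding kind.
    """
    height, width = len(truth), len(truth[0])
    sky_total = sky_ok = nosky_total = nosky_ok = 0
    for c in range(width):
        gt_rows = {r for r in range(height) if truth[r][c]}
        pr_rows = {r for r in range(height) if pred[r][c]}
        if gt_rows:
            sky_total += 1
            sky_ok += bool(gt_rows & pr_rows)   # at least one exact hit
        else:
            nosky_total += 1
            nosky_ok += not pr_rows             # nothing predicted here
    skyline_acc = sky_ok / sky_total if sky_total else None
    no_skyline_acc = nosky_ok / nosky_total if nosky_total else None
    acc = (sky_ok + nosky_ok) / width
    return skyline_acc, no_skyline_acc, acc

truth = [[0, 0, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 1, 0]]
pred = [[0, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 1, 0]]
skyline_acc, no_skyline_acc, acc = column_metrics(truth, pred)
```

Under this reading the overall accuracy is a column-weighted mix of the other two values, and the no-skyline value is undefined for an image whose skyline is continuous.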
Evaluation example
– Average Skyline Accuracy: 98%
– Average No Skyline Accuracy: 73%
– Average Accuracy: 94%
Ground truth annotation pixel
Correctly predicted skyline pixel
Incorrectly predicted skyline pixel
(Annotation) (Evaluation)
Accuracy on test dataset images
| Images | Pixels per column | Threshold | Average Skyline Accuracy | Average No Skyline Accuracy | Average Accuracy |
|---|---|---|---|---|---|
| Continuous skyline images from test dataset | 1 | 0 | 94.45% | - | 94.45% |
| Complete test dataset images | 1 | 100 | 92.45% | 20.14% | 86.87% |
Runtime performance
The size of the frame image affects:
– Accuracy
– Memory consumption
– Execution time
Good balance
– 321 x 241 px
We built our own native-code library to deploy the CNN on mobile devices
Execution time
| Device | CPU | RAM | Time (ms) |
|---|---|---|---|
| MacBook Pro | 2.9 GHz Intel Core i5 (2 cores) | 16 GB | 73 |
| Nexus 6 | 2.65 GHz Qualcomm Snapdragon 805 (4 cores) | 3 GB | 273 |
| Moto G4 Plus | 1.52 GHz Qualcomm Snapdragon 617 (8 cores) | 2 GB | 472 |
| Galaxy Nexus | 1.2 GHz TI OMAP 4460 (2 cores) | 1 GB | 1775 |
Memory consumption
– 9.36 MB
PeakLens is an outdoor AR mobile application that identifies
mountain peaks and overlays them in real time on the camera view.
It extracts the mountain skyline with a CNN and aligns it with
the terrain skyline at the user’s current location.
Usage experience
100k installs on Android
PeakLens [video]
Concept
– CNN model for mountain skyline extraction trained with a large set of
annotated images taken in uncontrolled conditions
– Definition of metrics to evaluate the quality of the resulting skyline
– Support for its deployment on low-end mobile devices
– Integration of the module into an AR mobile app
Future work
– Optimization of the CNN model to achieve a faster execution time
– Improved handling of obstacles
– Improvement of pre-processing and post-processing steps
– Runtime performance comparison vs. Caffe2 and TensorFlow with
MobileNets (both released after ICANN’s submission deadline)
Conclusions
Thanks For Your Attention!
{darian.frajberg | piero.fraternali | rocionahime.torres}@polimi.it

Convolutional Neural Network for pixel-wise skyline detection

  • 1. Convolutional Neural Network for pixel-wise skyline detection Darian Frajberg Piero Fraternali Rocio Nahime Torres Department of Electronics, Information and Bioengineering, Politecnico di Milano September 15, 2017 26th International Conference on Artificial Neural Networks
  • 2. Deep learning is a hot topic and it has achieved outstanding results outperforming previous techniques in a very wide variety of applications (e.g., computer vision, speech recognition, NLP, etc) Augmented Reality (AR) applications is an emerging class of software that is getting massive attention (e.g., PokemonGo) and its market is projected to be huge The integration of Artificial Intelligence and Augmented Reality applications can definitely lead to very successful results, capable of attracting people to voluntarily execute diverse tasks Goals to accomplish • High accuracy • Low power devices support • High real-time performance • Acceptable memory usage • Acceptable battery consumption 2 Introduction and motivation
  • 3. Use case – Convolutional Neural Network (CNN) for mountain skyline detection – Integration of CNN for the development of an AR mobile app for mountain peaks identification Mountain skyline detection – Simple scenarios – Complex scenarios 3 Introduction and motivation
  • 4. Mountain skyline detection for simple scenarios – Comprises clear sky and continuous skylines 4 Introduction and motivation (Input) ( Output)
  • 5. Mountain skyline detection for complex scenarios – May comprise fuzzy or interrupted skylines with obstacles (e.g., clouds, trees, houses, cables, people, etc) 5 Introduction and motivation (Input) ( Output)
  • 6. Related work
      Heuristic methods for skyline detection
      – Edge-based
      – Dynamic programming
      – Solve simple scenarios
      – Do not solve complex scenarios
      Image-level CNN methods for skyline detection
      – Semantic segmentation (seen as a foreground-background problem)
      – Solve simple scenarios
      – Solving complex scenarios would require ground truth that is extremely difficult to generate
  • 7. Related work
      Successful pixel-level CNN methods for other purposes
      – Detection of cancer in biomedical images
      – Edge extraction
      Our approach
      – Use a pixel-wise CNN for mountain skyline detection
  • 8. Skyline extraction with CNN
      Dataset annotation (8.940 images)
      (Continuous skyline annotation) (Interrupted skyline annotation)
  • 9. Skyline extraction with CNN
      Pre-processing
      – Dataset split at image level
        • 65% training
        • 25% validation
        • 10% test
      – Patch extraction for each image (29 x 29 px)
        • 100 positive patches
        • 200 negative patches
      – 8.940 images x 300 patches = 2.682.000 patches
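The patch extraction step can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes the annotation is available as a boolean mask, that a patch is positive when its centre pixel lies on the annotated skyline, and that centres are drawn uniformly at random. The function name `sample_patches` and its parameters are hypothetical.

```python
import numpy as np

PATCH = 29          # patch side in pixels (from the slide)
HALF = PATCH // 2   # 14 pixels on each side of the centre


def sample_patches(image, skyline_mask, n_pos=100, n_neg=200, seed=0):
    """Return (patches, labels) for one image.

    Assumption: a patch is labelled 1 when its centre pixel is an annotated
    skyline pixel and 0 otherwise; the slides do not spell this rule out.
    """
    rng = np.random.default_rng(seed)
    mask = np.asarray(skyline_mask, dtype=bool)
    h, w = mask.shape

    # Only centres whose 29 x 29 window fits entirely inside the image.
    inner = np.zeros((h, w), dtype=bool)
    inner[HALF:h - HALF, HALF:w - HALF] = True
    positives = np.argwhere(mask & inner)
    negatives = np.argwhere(~mask & inner)

    def pick(candidates, n):
        idx = rng.choice(len(candidates), size=min(n, len(candidates)), replace=False)
        return candidates[idx]

    patches, labels = [], []
    for label, centres in ((1, pick(positives, n_pos)), (0, pick(negatives, n_neg))):
        for r, c in centres:
            patches.append(image[r - HALF:r + HALF + 1, c - HALF:c + HALF + 1])
            labels.append(label)
    return np.stack(patches), np.array(labels)


# Usage with synthetic data: a flat "skyline" on row 40 of a 100 x 200 image.
image = np.random.default_rng(1).integers(0, 256, size=(100, 200, 3), dtype=np.uint8)
mask = np.zeros((100, 200), dtype=bool)
mask[40, :] = True
patches, labels = sample_patches(image, mask)
```

With 100 positive and 200 negative patches per image, the 8940 annotated images yield the 2682000 patches quoted on the slide.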
  • 10. Skyline extraction with CNN
      Patch extraction
      (Patches) (Annotation)
  • 11. Skyline extraction with CNN
      Positive patches
      Negative patches
  • 12. Skyline extraction with CNN
      Model architecture
      Layer   Type         Input         Kernel  Stride  Pad  Output
      Layer1  Conv         29 x 29 x 3   6       1       0    24 x 24 x 20
      Layer2  Pool (max)   24 x 24 x 20  2       2       0    12 x 12 x 20
      Layer3  Conv         12 x 12 x 20  5       1       0    8 x 8 x 50
      Layer4  Pool (max)   8 x 8 x 50    2       2       0    4 x 4 x 50
      Layer5  Conv         4 x 4 x 50    4       1       0    1 x 1 x 500
      Layer6  Relu         1 x 1 x 500   -       1       0    1 x 1 x 500
      Layer7  Conv         1 x 1 x 500   1       1       0    1 x 1 x 2
      Layer8  Softmaxloss  1 x 1 x 2     -       1       0    1 x 1 x 2
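The spatial sizes in the architecture table follow the usual output-size rule for convolution and pooling, out = (in + 2·pad − kernel) / stride + 1. The short check below copies the kernel, stride and pad values from the table and confirms each output side; the helper name `conv_out` is only illustrative.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output side length of a convolution or pooling layer (floor division)."""
    return (size + 2 * pad - kernel) // stride + 1


# (layer, input side, kernel, stride, pad, output side) copied from the table.
ROWS = [
    ("Layer1", 29, 6, 1, 0, 24),
    ("Layer2", 24, 2, 2, 0, 12),
    ("Layer3", 12, 5, 1, 0, 8),
    ("Layer4", 8, 2, 2, 0, 4),
    ("Layer5", 4, 4, 1, 0, 1),
    ("Layer7", 1, 1, 1, 0, 1),
]

# Collect any row whose stated output does not match the formula.
mismatches = [name for name, n, k, s, p, out in ROWS if conv_out(n, k, s, p) != out]
print(mismatches)  # → []
```

A 29 x 29 patch therefore collapses to a single 1 x 1 x 2 prediction, one score per class.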
  • 13. Skyline extraction with CNN
      Training
      – Caffe framework
      – Workstation with NVIDIA GeForce GTX 1080
      – 61 minutes
      – 428.732 learned parameters
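The parameter count can be reproduced from the architecture table on slide 12. The sketch below assumes each convolutional layer has one bias per output channel, which the slides do not state explicitly; pooling, ReLU and softmax layers have no learned parameters.

```python
# (kernel side, input channels, output channels) for the four convolutional layers.
CONVS = {
    "Layer1": (6, 3, 20),
    "Layer3": (5, 20, 50),
    "Layer5": (4, 50, 500),
    "Layer7": (1, 500, 2),
}


def conv_params(kernel, c_in, c_out):
    """Weights plus one bias per output channel (the bias term is an assumption)."""
    return kernel * kernel * c_in * c_out + c_out


per_layer = {name: conv_params(*spec) for name, spec in CONVS.items()}
total = sum(per_layer.values())
print(per_layer)
print(total)  # → 428732
```

Under that assumption the total matches the 428.732 figure on the slide, with Layer5 (400500 parameters) accounting for most of the model.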
  • 14. Skyline extraction with CNN
      Deployment as a Fully Convolutional Network
      – Input: image
      – Output: spatial map in which each pixel is assigned a probability of being positive (scaled to the 0..255 range)
      (Input) (Output)
  • 15. Skyline extraction with CNN
      Post-processing
      – Threshold
      – Soft erosion
      – Selection of at most N pixels per column
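Two of the post-processing steps, thresholding and per-column selection, can be sketched with NumPy as below. This is an illustration under assumptions: scores strictly above the threshold are kept, and the N highest-scoring surviving pixels are selected in each column. The soft erosion step is omitted because the slides do not describe it, and the function name `postprocess` is hypothetical.

```python
import numpy as np


def postprocess(score_map, threshold=100, n_per_column=1):
    """Return a boolean mask with at most n_per_column pixels per column.

    score_map holds per-pixel scores in the 0..255 range. Pixels at or below
    the threshold are discarded (the comparison direction is an assumption).
    """
    scores = np.asarray(score_map).astype(np.int32)   # avoid uint8 wrap-around
    kept = np.where(scores > threshold, scores, -1)   # -1 marks discarded pixels

    # Row indices of the highest scores in every column, best first.
    order = np.argsort(-kept, axis=0, kind="stable")[:n_per_column]

    mask = np.zeros(scores.shape, dtype=bool)
    cols = np.arange(scores.shape[1])
    for rows in order:
        valid = kept[rows, cols] >= 0                 # skip discarded pixels
        mask[rows[valid], cols[valid]] = True
    return mask


# Usage on a tiny 3 x 3 score map.
score_map = np.array([[10, 200, 0],
                      [150, 120, 0],
                      [90, 250, 0]], dtype=np.uint8)
mask_1 = postprocess(score_map, threshold=100, n_per_column=1)
mask_2 = postprocess(score_map, threshold=100, n_per_column=2)
```

A column in which no score passes the threshold stays empty, which is how an interrupted skyline would be represented in the output.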
  • 16. Evaluation
      CNN accuracy
      – 95,05%
      – Evaluated at patch level
      – Not representative enough
  • 17. Evaluation
      Accuracy evaluated at image level on the test dataset
      – Average Skyline Accuracy (ASA)
      – Average No Skyline Accuracy (ANSA)
      – Average Accuracy (AA)
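The slide that defines these metrics is not part of this transcript, so the sketch below rests on an assumed, column-wise reading: a column with annotated skyline counts as correct when at least one predicted pixel coincides with an annotated pixel, and a column without annotated skyline counts as correct when nothing is predicted in it. The function name `column_metrics` is hypothetical; the averages over the test set would then be means of these per-image values.

```python
import numpy as np


def column_metrics(pred_mask, gt_mask):
    """Per-image (skyline accuracy, no-skyline accuracy, accuracy).

    Assumed definitions, evaluated column by column. A metric is None when
    the image has no column of the corresponding kind.
    """
    pred = np.asarray(pred_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)

    gt_cols = gt.any(axis=0)                 # columns containing annotated skyline
    hit = (pred & gt).any(axis=0)            # prediction matches an annotated pixel
    empty_pred = ~pred.any(axis=0)           # nothing predicted in the column

    correct = np.where(gt_cols, hit, empty_pred)
    skyline_acc = float(hit[gt_cols].mean()) if gt_cols.any() else None
    no_skyline_acc = float(empty_pred[~gt_cols].mean()) if (~gt_cols).any() else None
    return skyline_acc, no_skyline_acc, float(correct.mean())


# Usage: 3 columns with skyline on row 1, 1 column without skyline.
gt = np.zeros((3, 4), dtype=bool)
gt[1, :3] = True
pred = np.zeros((3, 4), dtype=bool)
pred[1, 0] = True   # hit
pred[0, 1] = True   # miss (wrong row)
pred[1, 2] = True   # hit
pred[2, 3] = True   # false positive in a no-skyline column
sa, nsa, acc = column_metrics(pred, gt)
```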
  • 19. Evaluation
      Evaluation example
      – Average Skyline Accuracy: 98%
      – Average No Skyline Accuracy: 73%
      – Average Accuracy: 94%
      Legend: ground truth annotation pixel; correctly predicted skyline pixel; incorrectly predicted skyline pixel
      (Annotation) (Evaluation)
  • 20. Evaluation
      Accuracy on test dataset images
      Images                                       Pixels per column  Threshold  Average Skyline Accuracy  Average No Skyline Accuracy  Average Accuracy
      Continuous skyline images from test dataset  1                  0          94,45%                    -                            94,45%
      Complete test dataset images                 1                  100        92,45%                    20,14%                       86,87%
  • 21. Evaluation
      Runtime performance
      The dimensions of a frame affect:
      – Accuracy
      – Memory consumption
      – Execution time
      Good balance
      – 321 x 241 px
      We built our own library in native code to deploy the CNN on mobile devices
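To see why the frame size matters, the size of the score map produced by the fully convolutional network can be estimated from the layer list on slide 12. The sketch below assumes no padding and floor rounding in the pooling layers (for these particular sizes, ceiling rounding would give the same result); the names `LAYERS` and `fcn_map_size` are illustrative.

```python
# (kernel, stride) of the layers that change the spatial size, in order:
# conv 6, pool 2/2, conv 5, pool 2/2, conv 4, conv 1.
LAYERS = [(6, 1), (2, 2), (5, 1), (2, 2), (4, 1), (1, 1)]


def fcn_map_size(width, height):
    """Width and height of the output map for a frame of the given size."""
    for kernel, stride in LAYERS:
        width = (width - kernel) // stride + 1
        height = (height - kernel) // stride + 1
    return width, height


map_w, map_h = fcn_map_size(321, 241)
print(map_w, map_h)      # → 74 54
print(map_w * map_h)     # → 3996
```

Under these assumptions, a 321 x 241 frame yields a 74 x 54 map, that is 3996 predictions, each covering a 29 x 29 window, with windows spaced 4 pixels apart because of the two stride-2 pooling layers.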
  • 22. Evaluation
      Execution time
      Device                                                             Time (ms)
      MacBook Pro – 2,9 GHz Intel Core i5 (2 cores) – 16 GB              73
      Nexus 6 – 2,65 GHz Qualcomm Snapdragon 805 (4 cores) – 3 GB        273
      Moto 4G PLUS – 1,52 GHz Qualcomm Snapdragon 617 (8 cores) – 2 GB   472
      Galaxy Nexus – 1,2 GHz TI OMAP 4460 (2 cores) – 1 GB               1775
      Memory consumption
      – 9,36 MB
  • 23. Usage experience
      PeakLens is an outdoor AR mobile application that identifies mountain peaks and overlays them on the camera view in real time. It extracts the mountain skyline with the CNN and aligns it with the terrain skyline of the user’s current location.
      100k installs on Android
  • 25. Conclusions
      Concept
      – CNN model for mountain skyline extraction trained with a large set of annotated images taken in uncontrolled conditions
      – Definition of metrics to evaluate the quality of the resulting skyline
      – Support for deployment on low-end mobile devices
      – Integration of the module into an AR mobile app
      Future work
      – Optimization of the CNN model to achieve a faster execution time
      – Improvement of obstacle management
      – Improvement of the pre-processing and post-processing steps
      – Runtime performance comparison vs. Caffe2 and TensorFlow with MobileNets (both released after ICANN’s submission deadline)
  • 26. Thanks for your attention! Convolutional Neural Network for pixel-wise skyline detection. Darian Frajberg, Piero Fraternali, Rocio Nahime Torres. darian.frajberg | piero.fraternali | rocionahime.torres @polimi.it