SlideShare a Scribd company logo
On the Effect of Geometries Simplification
on Geo-spatial Link Discovery
Abdullah Fathi Ahmed1, Mohamed Ahmed Sherif1,2,
and Axel-Cyrille Ngonga Ngomo1,2
1Department of Computer Science, University Paderborn, 33098 Paderborn, Germany
afaahmed@mail.uni-paderborn.de
{sherif|ngonga}@upb.de
2 Department of Computer Science, University of Leipzig, 04109 Leipzig, Germany
{sherif|ngonga}@informatik.uni-leipzig.de
12 September, 2018
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 1 / 25
Outline
1 Motivation
2 Approach
3 Evaluation
4 Conclusion & Future Work
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 2 / 25
Motivation
Link Discovery among RDF geospatial data
Linked Data Cloud
http://stats.lod2.eu
150+ billion triples
46+ million links
Mostly owl:sameAs
Large geospatial datasets
LinkedGeoData contains 20+ billion triples
CLC consists of 2+ million resources
NUTS contains up to 1500 points/resources
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 3 / 25
Motivation
Why is linking geospatial resources difficult?
Link Discovery
Given two knowledge bases S and T,
find links of type R between S and T
Formally find M = {(s, t) ∈ S × T : R(s, t)}
Na¨ıve computation of M requires quadratic
time complexity
Geo-spatial resources available on the LOD
Described using polygons
Large in number
Demands the computation of
1 Topological relations
2 point-set distance
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 4 / 25
Motivation
Why is linking of simplified geospatial data important?
Real-time applications
Structured machine learning
Cross-ontology QA
Reasoning
Federated Queries
...
The trade-off between
Runtime and
Accuracy http://www.thepinsta.com
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 5 / 25
Approach
I. Line Simplification
Input: Polygonized curve with n vertices
Goal: Find an approximating polygonized curve with m vertices, where m < n
Idea: Approximate a line with a defined error tolerance > 0
Many algorithms exist
Douglas-Peucker
Visvalingam-Whyatt
...
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 6 / 25
Approach
I. Line Simplification: Douglas-Peucker Algorithm
Constract a line segment from the first
point to the last point
Find the point with farthest distance
dmax from the line segment
If tolerance < dmax , the approximation
is accepted
otherwise, keep recursion
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 7 / 25
Approach
II. Topological Relations: The Dimensionally Extended nine-Intersection Model (DE-9IM)
Standard to describe the topological relations in 2D space.
DE-9IM is based on the intersection matrix:
DE9IM(a, b)
dim(I(g1) ∩ I(g2)) dim(I(g1) ∩ B(g2)) dim(I(g1) ∩ E(g2))
dim(B(g1) ∩ I(g2)) dim(B(g1) ∩ B(g2)) dim(B(g1) ∩ E(g2))
dim(E(g1) ∩ I(g2)) dim(E(g1) ∩ B(g2)) dim(E(g1) ∩ E(g2))
At least one shared point for a relation to be hold
For the disjoint relation ⇒ inverse of the intersects relation
Accelerates the computation of any topological relation
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 8 / 25
Approach
III. Distance Measures for Point Sets
Input Two resources with input geometries gs and gt
Compute the orthodromic distance δ(si , tj ) between pairwise point of gs and gt
δ(si , tj ) = R cos−1
sin(ϕsi ) sin(ϕtj ) + cos(ϕsi ) cos(ϕtj ) cos(λsi − λtj ),
pi is a point on the surface (ϕi , λi ), latitude ϕi and longitude λi ,
Many methods exist to compute the (gs, gt) point-set distance
Hausdorff
DHausdorff (gs, gt) = max
si ∈gs
min
tj ∈gt
δ(si , tj )
Mean
Dmean(gs, gt) = δ

1
n si ∈gs
si ,
1
m tj ∈gt
tj


Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 9 / 25
Evaluation
Overview
Line Simplification is independent from Link Discovery framework
Stat of the art:
Radon for topological relation extraction
Orchid for point set distance
Topological relations:
within, touches, overlaps, intersects, equals, crosses and covers
Point-sets distance:
Hausdorff, Mean, Link, Min, and Sumofmin
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 10 / 25
Evaluation
Overview
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 11 / 25
Evaluation
Setup
Hardware
Oculus a cluster machine running OpenJDK 64-Bit 1.8.0161 on Ubuntu 16.04.3 LTS
Assigned 16 CPU (2.6 GHz Intel Xeon ”Sandy Bridge”) and 200 GB of RAM with timeout
limit of 4 hours for each job
Datasets
NUTS
CORINE Land Cover (CLC)
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 12 / 25
Evaluation
F-Measure Analysis
Q1
How much performance (i.e., F-measure) each of the geospatial LD approaches loses, when to
deal with the simplified geometries vs. when to deal with the original ones?
The first set of experiments setup
Radon for discovering topological relations
Douglas-Peucker with simplification factors of 0.05, 0.09, 0.10 and 0.2
NUTS dataset as a source and CLC dataset as a target
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 13 / 25
Evaluation
F-Measure Analysis
Relation/Factor 0.05 0.09 0.10 0.20 Average
Equals 1.00 1.00 1.00 1.00 1.00 ± 0.00
Intersects 0.99 0.97 0.97 0.94 0.97 ± 0.02
Contains 0.99 0.97 0.97 0.93 0.97 ± 0.03
Within 0.99 0.97 0.97 0.93 0.97 ± 0.03
Covers 0.99 0.97 0.97 0.93 0.97 ± 0.03
Coveredby 0.99 0.97 0.97 0.93 0.97 ± 0.03
Crosses 1.00 1.00 1.00 1.00 1.00 ± 0.00
Touches 1.00 1.00 1.00 1.00 1.00 ± 0.00
Overlaps 0.80 0.52 0.47 0.28 0.52 ± 0.21
Average 0.97 ± 0.07 0.94 ± 0.16 0.93 ± 0.17 0.90 ± 0.23 0.94 ± 0.03
F-measures results of applying Radon against geometries generated using the
Douglas-Peucker line simplification algorithm.
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 14 / 25
Evaluation
F-Measure Analysis
Q1
How much performance (i.e., F-measure) each of the geospatial LD approaches loses, when to
deal with the simplified geometries vs. when to deal with the original ones?
The second set of experiments setup:
Orchid for measuring the distance between point-sets
Douglas-Peucker with simplification factors of 0.05, 0.09, 0.10 and 0.2
NUTS dataset dedublicated to compute F-measure
NUTS dataset as a source and as a target (deduplication)
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 15 / 25
Evaluation
F-Measure Analysis
Measure/Factor 0.05 0.9 0.1 0.2 0.3 Average Foriginal
Hausdorff 0.90 0.91 0.91 0.91 0.91 0.91 ± 0.00 0.88
Mean 0.94 0.94 0.94 0.94 0.94 0.94 ± 0.00 0.94
Min 0.14 0.16 0.16 0.21 0.25 0.18 ± 0.04 0.13
Link 0.95 0.95 0.94 0.94 0.94 0.94 ± 0.00 0.94
SumOfMin 0.95 0.95 0.94 0.94 0.94 0.94 ± 0.00 0.94
avarege 0.77 ± 0.36 0.78 ± 0.35 0.78 ± 0.35 0.79 ± 0.32 0.80 ± 0.31 0.77 ± 0.36
F-measures results of the point-set distance measures implementations in Orchid using the
Douglas-Peucker line simplification algorithm.
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 16 / 25
Evaluation
Runtime Analysis
Q2
How well each of the geospatial LD approaches scale (i.e., runtime speedup), and when to deal
with the simplified geometries?
The third sets of experiments setup:
Same setting as in the first sets of experiments
Radon for discovering topological relations
Douglas-Peucker with simplification factors of 0.05, 0.09, 0.10 and 0.2
NUTS dataset as a source and CLC dataset as a target
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 17 / 25
Evaluation
Runtime Analysis
0
50
100
150
200
250
300
350
400
450
Equals Intersects Contains Within Covers CoveredBy Overlaps Touches Crosses
Runtime(sec.)
Topological relations
Original dataset 0.05 0.09 0.1 0.2
Runtimes of Radon’s implementation of topological relations LD using the Douglas-Peucker
algorithm
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 18 / 25
Evaluation
Runtime Analysis
Q2
How well each of the geospatial LD approaches scale (i.e., runtime speedup), and when to deal
with the simplified geometries?
The fourth set of experiments:
Same setting in the second sets of experiments
Orchid for measuring the distance between point-sets
Douglas-Peucker with simplification factors of 0.05, 0.09, 0.10 and 0.2
NUTS dataset dedublicated to compute F-measure
NUTS dataset as a source and as a target (deduplication)
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 19 / 25
Evaluation
Runtime Analysis
0
200
400
600
800
1000
1200
1400
1600
1800
Hausdorff Mean Min Link SumOfMin
Runtime(sec.)
point-set measures
Original data 0.05 0.09 0.1 0.2 0.3
Runtimes of Orchid’s implementation of set-points distance measure LD using the
Douglas-Peucker algorithm
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 20 / 25
Evaluation
LD Relations Analysis
Q3
Which relation is the most/least affected by the simplification process?
Same setting in the first and second sets of experiments
The relation overlaps has the most affected F-measure when using Douglas-Peucker
algorithm, (see Table 14)
The equals,crosses and touches are not affected at all by any simplification (see
Tables 14
In the case of point-set measures, the F-measure of min measure is the most affected,
(see Table 16 )
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 21 / 25
Evaluation
Simplification Runtime Analysis
Q4
What is the run time cost of simplification?
Same setting in the third and fourth sets of experiments
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 22 / 25
Evaluation
Simplification Runtime Analysis
0
500
1000
1500
2000
2500
3000
3500
Simplification RADONon Simpl. RADON+ Simpl. RADONon original
Totalruntime(sec.)
Total Runtime for all topological relations.
0
50
100
150
200
250
300
350
400
450
500
Simplification RADONon Simpl. RADON+ Simpl. RADONon original
Averageruntimeforsingletask(sec.)
Average Runtime of a single topological relation.
Runtime of Radon on the original data vs. simplified data using the Douglas-Peucker
algorithm.
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 23 / 25
Conclusion & Future Work
Conclusion
Studied the usage of line simplification as a preprocessing step of LD approaches over
geospatial RDF knowledge bases
Studied the behaviour of two categories of geospatial linking approaches (i.e., the topological
relations and point-set distances)
On average, F-measure of 0.94 using the Douglas-Peucker algorithm and 0.69 using the
Visvalingam-Whyatt algorithm has been achieved.
Gain up to 19.8× speedup using Douglas-Peucker algorithm and up to 67.3× using the
Visvalingam-Whyatt
Future Work
Guarantee the minimum F-measure loss
Determine fittest line simplification algorithm and its best parameter to achieve a better
trad-off between F-measure and Runtime
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 24 / 25
Thank you for your Attention!
Abdullah Fathi Ahmed
afaahmed@mail.uni-paderborn.de
https://github.com/dice-group/Orchid
Acknowledgment
This work has been supported by LEDS project, Eurostars Project SAGE (GA no. E!10882), the H2020 projects SLIPO
(GA no. 731581) and HOBBIT (GA no. 688227) as well as the DFG project LinkingLOD (project no. NG 105/3-2) and
the BMWI Project GEISER (project no. 01MD16014).
Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 25 / 25

More Related Content

What's hot

TRB 2014 - Automatic Spatial-temporal Identification of Points of Interest in...
TRB 2014 - Automatic Spatial-temporal Identification of Points of Interest in...TRB 2014 - Automatic Spatial-temporal Identification of Points of Interest in...
TRB 2014 - Automatic Spatial-temporal Identification of Points of Interest in...
Sean Barbeau
 
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
The Statistical and Applied Mathematical Sciences Institute
 
The Turbidity (TB) Varies with Time And Space in The Reservoir Using GWR And ...
The Turbidity (TB) Varies with Time And Space in The Reservoir Using GWR And ...The Turbidity (TB) Varies with Time And Space in The Reservoir Using GWR And ...
The Turbidity (TB) Varies with Time And Space in The Reservoir Using GWR And ...
National Cheng Kung University
 
Ikuro Sato's slide presented at ICONIP2017
Ikuro Sato's slide presented at ICONIP2017Ikuro Sato's slide presented at ICONIP2017
Ikuro Sato's slide presented at ICONIP2017
Ikuro Sato
 
22 mitchell lee_am_and_pwat_spectral_correction
22 mitchell lee_am_and_pwat_spectral_correction22 mitchell lee_am_and_pwat_spectral_correction
22 mitchell lee_am_and_pwat_spectral_correction
Sandia National Laboratories: Energy & Climate: Renewables
 
Kumar, Pramod: Local-scale atmospheric inversion for the estimation of the lo...
Kumar, Pramod: Local-scale atmospheric inversion for the estimation of the lo...Kumar, Pramod: Local-scale atmospheric inversion for the estimation of the lo...
Kumar, Pramod: Local-scale atmospheric inversion for the estimation of the lo...
Integrated Carbon Observation System (ICOS)
 
Model predictive-fuzzy-control-of-air-ratio-for-automotive-engines
Model predictive-fuzzy-control-of-air-ratio-for-automotive-enginesModel predictive-fuzzy-control-of-air-ratio-for-automotive-engines
Model predictive-fuzzy-control-of-air-ratio-for-automotive-engines
pace130557
 
Aerospace Engineering projects in aerodynamics, finite element analysis, math...
Aerospace Engineering projects in aerodynamics, finite element analysis, math...Aerospace Engineering projects in aerodynamics, finite element analysis, math...
Aerospace Engineering projects in aerodynamics, finite element analysis, math...
Quoc-Bao Nguyen
 
4 hydrology geostatistics-part_2
4 hydrology geostatistics-part_2 4 hydrology geostatistics-part_2
4 hydrology geostatistics-part_2
Riccardo Rigon
 
Robust and efficient nonlinear structural analysis using the central differen...
Robust and efficient nonlinear structural analysis using the central differen...Robust and efficient nonlinear structural analysis using the central differen...
Robust and efficient nonlinear structural analysis using the central differen...
openseesdays
 
"Faster Geometric Algorithms via Dynamic Determinant Computation."
"Faster Geometric Algorithms via Dynamic Determinant Computation." "Faster Geometric Algorithms via Dynamic Determinant Computation."
"Faster Geometric Algorithms via Dynamic Determinant Computation."
Vissarion Fisikopoulos
 
FR3.L09 - MULTIBASELINE GRADIENT AMBIGUITY RESOLUTION TO SUPPORT MINIMUM COST...
FR3.L09 - MULTIBASELINE GRADIENT AMBIGUITY RESOLUTION TO SUPPORT MINIMUM COST...FR3.L09 - MULTIBASELINE GRADIENT AMBIGUITY RESOLUTION TO SUPPORT MINIMUM COST...
FR3.L09 - MULTIBASELINE GRADIENT AMBIGUITY RESOLUTION TO SUPPORT MINIMUM COST...grssieee
 
An Integrated Computational Environment for Simulating Structures in Real Fires
An Integrated Computational Environment for Simulating Structures in Real FiresAn Integrated Computational Environment for Simulating Structures in Real Fires
An Integrated Computational Environment for Simulating Structures in Real Fires
openseesdays
 
Decline curve
Decline curveDecline curve
Decline curve
Dr. Elnori Elhaddad
 
How to Determine which Algorithms Really Matter
How to Determine which Algorithms Really MatterHow to Determine which Algorithms Really Matter
How to Determine which Algorithms Really MatterDataWorks Summit
 
Precise Attitude Determination Using a Hexagonal GPS Platform
Precise Attitude Determination Using a Hexagonal GPS PlatformPrecise Attitude Determination Using a Hexagonal GPS Platform
Precise Attitude Determination Using a Hexagonal GPS Platform
CSCJournals
 
IRJET- Parallelization of Definite Integration
IRJET- Parallelization of Definite IntegrationIRJET- Parallelization of Definite Integration
IRJET- Parallelization of Definite Integration
IRJET Journal
 
Estimation of global solar radiation by using machine learning methods
Estimation of global solar radiation by using machine learning methodsEstimation of global solar radiation by using machine learning methods
Estimation of global solar radiation by using machine learning methods
mehmet şahin
 

What's hot (20)

Aerosols and Nephelometers
Aerosols and NephelometersAerosols and Nephelometers
Aerosols and Nephelometers
 
TRB 2014 - Automatic Spatial-temporal Identification of Points of Interest in...
TRB 2014 - Automatic Spatial-temporal Identification of Points of Interest in...TRB 2014 - Automatic Spatial-temporal Identification of Points of Interest in...
TRB 2014 - Automatic Spatial-temporal Identification of Points of Interest in...
 
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
 
The Turbidity (TB) Varies with Time And Space in The Reservoir Using GWR And ...
The Turbidity (TB) Varies with Time And Space in The Reservoir Using GWR And ...The Turbidity (TB) Varies with Time And Space in The Reservoir Using GWR And ...
The Turbidity (TB) Varies with Time And Space in The Reservoir Using GWR And ...
 
Ikuro Sato's slide presented at ICONIP2017
Ikuro Sato's slide presented at ICONIP2017Ikuro Sato's slide presented at ICONIP2017
Ikuro Sato's slide presented at ICONIP2017
 
22 mitchell lee_am_and_pwat_spectral_correction
22 mitchell lee_am_and_pwat_spectral_correction22 mitchell lee_am_and_pwat_spectral_correction
22 mitchell lee_am_and_pwat_spectral_correction
 
2012 hsc-exam-physics
2012 hsc-exam-physics2012 hsc-exam-physics
2012 hsc-exam-physics
 
Kumar, Pramod: Local-scale atmospheric inversion for the estimation of the lo...
Kumar, Pramod: Local-scale atmospheric inversion for the estimation of the lo...Kumar, Pramod: Local-scale atmospheric inversion for the estimation of the lo...
Kumar, Pramod: Local-scale atmospheric inversion for the estimation of the lo...
 
Model predictive-fuzzy-control-of-air-ratio-for-automotive-engines
Model predictive-fuzzy-control-of-air-ratio-for-automotive-enginesModel predictive-fuzzy-control-of-air-ratio-for-automotive-engines
Model predictive-fuzzy-control-of-air-ratio-for-automotive-engines
 
Aerospace Engineering projects in aerodynamics, finite element analysis, math...
Aerospace Engineering projects in aerodynamics, finite element analysis, math...Aerospace Engineering projects in aerodynamics, finite element analysis, math...
Aerospace Engineering projects in aerodynamics, finite element analysis, math...
 
4 hydrology geostatistics-part_2
4 hydrology geostatistics-part_2 4 hydrology geostatistics-part_2
4 hydrology geostatistics-part_2
 
Robust and efficient nonlinear structural analysis using the central differen...
Robust and efficient nonlinear structural analysis using the central differen...Robust and efficient nonlinear structural analysis using the central differen...
Robust and efficient nonlinear structural analysis using the central differen...
 
"Faster Geometric Algorithms via Dynamic Determinant Computation."
"Faster Geometric Algorithms via Dynamic Determinant Computation." "Faster Geometric Algorithms via Dynamic Determinant Computation."
"Faster Geometric Algorithms via Dynamic Determinant Computation."
 
FR3.L09 - MULTIBASELINE GRADIENT AMBIGUITY RESOLUTION TO SUPPORT MINIMUM COST...
FR3.L09 - MULTIBASELINE GRADIENT AMBIGUITY RESOLUTION TO SUPPORT MINIMUM COST...FR3.L09 - MULTIBASELINE GRADIENT AMBIGUITY RESOLUTION TO SUPPORT MINIMUM COST...
FR3.L09 - MULTIBASELINE GRADIENT AMBIGUITY RESOLUTION TO SUPPORT MINIMUM COST...
 
An Integrated Computational Environment for Simulating Structures in Real Fires
An Integrated Computational Environment for Simulating Structures in Real FiresAn Integrated Computational Environment for Simulating Structures in Real Fires
An Integrated Computational Environment for Simulating Structures in Real Fires
 
Decline curve
Decline curveDecline curve
Decline curve
 
How to Determine which Algorithms Really Matter
How to Determine which Algorithms Really MatterHow to Determine which Algorithms Really Matter
How to Determine which Algorithms Really Matter
 
Precise Attitude Determination Using a Hexagonal GPS Platform
Precise Attitude Determination Using a Hexagonal GPS PlatformPrecise Attitude Determination Using a Hexagonal GPS Platform
Precise Attitude Determination Using a Hexagonal GPS Platform
 
IRJET- Parallelization of Definite Integration
IRJET- Parallelization of Definite IntegrationIRJET- Parallelization of Definite Integration
IRJET- Parallelization of Definite Integration
 
Estimation of global solar radiation by using machine learning methods
Estimation of global solar radiation by using machine learning methodsEstimation of global solar radiation by using machine learning methods
Estimation of global solar radiation by using machine learning methods
 

Similar to On the Effect of Geometries Simplification on Geo-spatial Link Discovery

EUCAP 2021_presentation (7)
EUCAP 2021_presentation (7)EUCAP 2021_presentation (7)
EUCAP 2021_presentation (7)
Hamdi Bilel
 
Design and Implementation of Variable Radius Sphere Decoding Algorithm
Design and Implementation of Variable Radius Sphere Decoding AlgorithmDesign and Implementation of Variable Radius Sphere Decoding Algorithm
Design and Implementation of Variable Radius Sphere Decoding Algorithm
csandit
 
Lidar Point Cloud Classification Using Expectation Maximization Algorithm
Lidar Point Cloud Classification Using Expectation Maximization AlgorithmLidar Point Cloud Classification Using Expectation Maximization Algorithm
Lidar Point Cloud Classification Using Expectation Maximization Algorithm
AIRCC Publishing Corporation
 
LIDAR POINT CLOUD CLASSIFICATION USING EXPECTATION MAXIMIZATION ALGORITHM
LIDAR POINT CLOUD CLASSIFICATION USING EXPECTATION MAXIMIZATION ALGORITHMLIDAR POINT CLOUD CLASSIFICATION USING EXPECTATION MAXIMIZATION ALGORITHM
LIDAR POINT CLOUD CLASSIFICATION USING EXPECTATION MAXIMIZATION ALGORITHM
ijnlc
 
GRADIENT OMISSIVE DESCENT IS A MINIMIZATION ALGORITHM
GRADIENT OMISSIVE DESCENT IS A MINIMIZATION ALGORITHMGRADIENT OMISSIVE DESCENT IS A MINIMIZATION ALGORITHM
GRADIENT OMISSIVE DESCENT IS A MINIMIZATION ALGORITHM
ijscai
 
LinkedGeoData and GeoKnow
LinkedGeoData and GeoKnowLinkedGeoData and GeoKnow
LinkedGeoData and GeoKnow
geoknow
 
Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...
Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...
Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...
EuroIoTa
 
USING LEARNING AUTOMATA AND GENETIC ALGORITHMS TO IMPROVE THE QUALITY OF SERV...
USING LEARNING AUTOMATA AND GENETIC ALGORITHMS TO IMPROVE THE QUALITY OF SERV...USING LEARNING AUTOMATA AND GENETIC ALGORITHMS TO IMPROVE THE QUALITY OF SERV...
USING LEARNING AUTOMATA AND GENETIC ALGORITHMS TO IMPROVE THE QUALITY OF SERV...
IJCSEA Journal
 
Sigmod11 outsource shortest path
Sigmod11 outsource shortest pathSigmod11 outsource shortest path
Sigmod11 outsource shortest pathredhatdb
 
Constraints on the gluon PDF from top quark differential distributions at NNLO
Constraints on the gluon PDF from top quark differential distributions at NNLOConstraints on the gluon PDF from top quark differential distributions at NNLO
Constraints on the gluon PDF from top quark differential distributions at NNLOjuanrojochacon
 
Kim_WE3_T05_2.pptx
Kim_WE3_T05_2.pptxKim_WE3_T05_2.pptx
Kim_WE3_T05_2.pptxgrssieee
 
Development of Methodology for Determining Earth Work Volume Using Combined S...
Development of Methodology for Determining Earth Work Volume Using Combined S...Development of Methodology for Determining Earth Work Volume Using Combined S...
Development of Methodology for Determining Earth Work Volume Using Combined S...
IJMER
 
Machine learning for high-speed corner detection
Machine learning for high-speed corner detectionMachine learning for high-speed corner detection
Machine learning for high-speed corner detectionbutest
 
Energy and latency aware application
Energy and latency aware applicationEnergy and latency aware application
Energy and latency aware application
csandit
 
ENERGY AND LATENCY AWARE APPLICATION MAPPING ALGORITHM & OPTIMIZATION FOR HOM...
ENERGY AND LATENCY AWARE APPLICATION MAPPING ALGORITHM & OPTIMIZATION FOR HOM...ENERGY AND LATENCY AWARE APPLICATION MAPPING ALGORITHM & OPTIMIZATION FOR HOM...
ENERGY AND LATENCY AWARE APPLICATION MAPPING ALGORITHM & OPTIMIZATION FOR HOM...
cscpconf
 
1308.3898 1
1308.3898 11308.3898 1
1308.3898 1
vin_bha1984
 
1308.3898
1308.38981308.3898
1308.3898
vin_bha1984
 
Stereo matching based on absolute differences for multiple objects detection
Stereo matching based on absolute differences for multiple objects detectionStereo matching based on absolute differences for multiple objects detection
Stereo matching based on absolute differences for multiple objects detection
TELKOMNIKA JOURNAL
 
sedimaging for soil classification
sedimaging for soil classificationsedimaging for soil classification
sedimaging for soil classification
Sudarshan Barole
 

Similar to On the Effect of Geometries Simplification on Geo-spatial Link Discovery (20)

EUCAP 2021_presentation (7)
EUCAP 2021_presentation (7)EUCAP 2021_presentation (7)
EUCAP 2021_presentation (7)
 
Design and Implementation of Variable Radius Sphere Decoding Algorithm
Design and Implementation of Variable Radius Sphere Decoding AlgorithmDesign and Implementation of Variable Radius Sphere Decoding Algorithm
Design and Implementation of Variable Radius Sphere Decoding Algorithm
 
Lidar Point Cloud Classification Using Expectation Maximization Algorithm
Lidar Point Cloud Classification Using Expectation Maximization AlgorithmLidar Point Cloud Classification Using Expectation Maximization Algorithm
Lidar Point Cloud Classification Using Expectation Maximization Algorithm
 
LIDAR POINT CLOUD CLASSIFICATION USING EXPECTATION MAXIMIZATION ALGORITHM
LIDAR POINT CLOUD CLASSIFICATION USING EXPECTATION MAXIMIZATION ALGORITHMLIDAR POINT CLOUD CLASSIFICATION USING EXPECTATION MAXIMIZATION ALGORITHM
LIDAR POINT CLOUD CLASSIFICATION USING EXPECTATION MAXIMIZATION ALGORITHM
 
GRADIENT OMISSIVE DESCENT IS A MINIMIZATION ALGORITHM
GRADIENT OMISSIVE DESCENT IS A MINIMIZATION ALGORITHMGRADIENT OMISSIVE DESCENT IS A MINIMIZATION ALGORITHM
GRADIENT OMISSIVE DESCENT IS A MINIMIZATION ALGORITHM
 
LinkedGeoData and GeoKnow
LinkedGeoData and GeoKnowLinkedGeoData and GeoKnow
LinkedGeoData and GeoKnow
 
Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...
Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...
Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...
 
USING LEARNING AUTOMATA AND GENETIC ALGORITHMS TO IMPROVE THE QUALITY OF SERV...
USING LEARNING AUTOMATA AND GENETIC ALGORITHMS TO IMPROVE THE QUALITY OF SERV...USING LEARNING AUTOMATA AND GENETIC ALGORITHMS TO IMPROVE THE QUALITY OF SERV...
USING LEARNING AUTOMATA AND GENETIC ALGORITHMS TO IMPROVE THE QUALITY OF SERV...
 
Sigmod11 outsource shortest path
Sigmod11 outsource shortest pathSigmod11 outsource shortest path
Sigmod11 outsource shortest path
 
HalifaxNGGs
HalifaxNGGsHalifaxNGGs
HalifaxNGGs
 
Constraints on the gluon PDF from top quark differential distributions at NNLO
Constraints on the gluon PDF from top quark differential distributions at NNLOConstraints on the gluon PDF from top quark differential distributions at NNLO
Constraints on the gluon PDF from top quark differential distributions at NNLO
 
Kim_WE3_T05_2.pptx
Kim_WE3_T05_2.pptxKim_WE3_T05_2.pptx
Kim_WE3_T05_2.pptx
 
Development of Methodology for Determining Earth Work Volume Using Combined S...
Development of Methodology for Determining Earth Work Volume Using Combined S...Development of Methodology for Determining Earth Work Volume Using Combined S...
Development of Methodology for Determining Earth Work Volume Using Combined S...
 
Machine learning for high-speed corner detection
Machine learning for high-speed corner detectionMachine learning for high-speed corner detection
Machine learning for high-speed corner detection
 
Energy and latency aware application
Energy and latency aware applicationEnergy and latency aware application
Energy and latency aware application
 
ENERGY AND LATENCY AWARE APPLICATION MAPPING ALGORITHM & OPTIMIZATION FOR HOM...
ENERGY AND LATENCY AWARE APPLICATION MAPPING ALGORITHM & OPTIMIZATION FOR HOM...ENERGY AND LATENCY AWARE APPLICATION MAPPING ALGORITHM & OPTIMIZATION FOR HOM...
ENERGY AND LATENCY AWARE APPLICATION MAPPING ALGORITHM & OPTIMIZATION FOR HOM...
 
1308.3898 1
1308.3898 11308.3898 1
1308.3898 1
 
1308.3898
1308.38981308.3898
1308.3898
 
Stereo matching based on absolute differences for multiple objects detection
Stereo matching based on absolute differences for multiple objects detectionStereo matching based on absolute differences for multiple objects detection
Stereo matching based on absolute differences for multiple objects detection
 
sedimaging for soil classification
sedimaging for soil classificationsedimaging for soil classification
sedimaging for soil classification
 

Recently uploaded

role of pramana in research.pptx in science
role of pramana in research.pptx in sciencerole of pramana in research.pptx in science
role of pramana in research.pptx in science
sonaliswain16
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
AADYARAJPANDEY1
 
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
Sérgio Sacani
 
NuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionNuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final version
pablovgd
 
ESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptxESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptx
muralinath2
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
SAMIR PANDA
 
In silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptxIn silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptx
AlaminAfendy1
 
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Erdal Coalmaker
 
GBSN - Microbiology (Lab 4) Culture Media
GBSN - Microbiology (Lab 4) Culture MediaGBSN - Microbiology (Lab 4) Culture Media
GBSN - Microbiology (Lab 4) Culture Media
Areesha Ahmad
 
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
Scintica Instrumentation
 
Structural Classification Of Protein (SCOP)
Structural Classification Of Protein  (SCOP)Structural Classification Of Protein  (SCOP)
Structural Classification Of Protein (SCOP)
aishnasrivastava
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
Health Advances
 
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptxBody fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
muralinath2
 
Richard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlandsRichard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlands
Richard Gill
 
Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...
Sérgio Sacani
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
silvermistyshot
 
Comparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebratesComparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebrates
sachin783648
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Ana Luísa Pinho
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderland
Richard Gill
 
Orion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWSOrion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWS
Columbia Weather Systems
 

Recently uploaded (20)

role of pramana in research.pptx in science
role of pramana in research.pptx in sciencerole of pramana in research.pptx in science
role of pramana in research.pptx in science
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
 
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
 
NuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionNuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final version
 
ESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptxESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptx
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
 
In silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptxIn silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptx
 
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
 
GBSN - Microbiology (Lab 4) Culture Media
GBSN - Microbiology (Lab 4) Culture MediaGBSN - Microbiology (Lab 4) Culture Media
GBSN - Microbiology (Lab 4) Culture Media
 
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
 
Structural Classification Of Protein (SCOP)
Structural Classification Of Protein  (SCOP)Structural Classification Of Protein  (SCOP)
Structural Classification Of Protein (SCOP)
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
 
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptxBody fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
 
Richard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlandsRichard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlands
 
Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
 
Comparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebratesComparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebrates
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderland
 
Orion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWSOrion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWS
 

On the Effect of Geometries Simplification on Geo-spatial Link Discovery

  • 1. On the Effect of Geometries Simplification on Geo-spatial Link Discovery Abdullah Fathi Ahmed1, Mohamed Ahmed Sherif1,2, and Axel-Cyrille Ngonga Ngomo1,2 1Department of Computer Science, University Paderborn, 33098 Paderborn, Germany afaahmed@mail.uni-paderborn.de {sherif|ngonga}@upb.de 2 Department of Computer Science, University of Leipzig, 04109 Leipzig, Germany {sherif|ngonga}@informatik.uni-leipzig.de 12 September, 2018 Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 1 / 25
  • 2. Outline 1 Motivation 2 Approach 3 Evaluation 4 Conclusion & Future Work Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 2 / 25
  • 3. Motivation Link Discovery among RDF geospatial data Linked Data Cloud http://stats.lod2.eu 150+ billion triples 46+ million links Mostly owl:sameAs Large geospatial datasets LinkedGeoData contains 20+ billion triples CLC consists of 2+ million resources NUTS contains up to 1500 points/resources Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 3 / 25
  • 4. Motivation Why is linking geospatial resources difficult? Link Discovery Given two knowledge bases S and T, find links of type R between S and T Formally find M = {(s, t) ∈ S × T : R(s, t)} Na¨ıve computation of M requires quadratic time complexity Geo-spatial resources available on the LOD Described using polygons Large in number Demands the computation of 1 Topological relations 2 point-set distance Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 4 / 25
  • 5. Motivation Why is linking of simplified geospatial data important? Real-time applications Structured machine learning Cross-ontology QA Reasoning Federated Queries ... The trade-off between Runtime and Accuracy http://www.thepinsta.com Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 5 / 25
  • 6. Approach I. Line Simplification Input: Polygonized curve with n vertices Goal: Find an approximating polygonized curve with m vertices, where m < n Idea: Approximate a line with a defined error tolerance > 0 Many algorithms exist Douglas-Peucker Visvalingam-Whyatt ... Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 6 / 25
  • 7. Approach I. Line Simplification: Douglas-Peucker Algorithm Constract a line segment from the first point to the last point Find the point with farthest distance dmax from the line segment If tolerance < dmax , the approximation is accepted otherwise, keep recursion Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 7 / 25
  • 8. Approach II. Topological Relations: The Dimensionally Extended nine-Intersection Model (DE-9IM) Standard to describe the topological relations in 2D space. DE-9IM is based on the intersection matrix: DE9IM(a, b) dim(I(g1) ∩ I(g2)) dim(I(g1) ∩ B(g2)) dim(I(g1) ∩ E(g2)) dim(B(g1) ∩ I(g2)) dim(B(g1) ∩ B(g2)) dim(B(g1) ∩ E(g2)) dim(E(g1) ∩ I(g2)) dim(E(g1) ∩ B(g2)) dim(E(g1) ∩ E(g2)) At least one shared point for a relation to be hold For the disjoint relation ⇒ inverse of the intersects relation Accelerates the computation of any topological relation Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 8 / 25
  • 9. Approach III. Distance Measures for Point Sets Input Two resources with input geometries gs and gt Compute the orthodromic distance δ(si , tj ) between pairwise point of gs and gt δ(si , tj ) = R cos−1 sin(ϕsi ) sin(ϕtj ) + cos(ϕsi ) cos(ϕtj ) cos(λsi − λtj ), pi is a point on the surface (ϕi , λi ), latitude ϕi and longitude λi , Many methods exist to compute the (gs, gt) point-set distance Hausdorff DHausdorff (gs, gt) = max si ∈gs min tj ∈gt δ(si , tj ) Mean Dmean(gs, gt) = δ  1 n si ∈gs si , 1 m tj ∈gt tj   Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 9 / 25
  • 10. Evaluation Overview Line Simplification is independent from Link Discovery framework Stat of the art: Radon for topological relation extraction Orchid for point set distance Topological relations: within, touches, overlaps, intersects, equals, crosses and covers Point-sets distance: Hausdorff, Mean, Link, Min, and Sumofmin Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 10 / 25
  • 11. Evaluation Overview Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 11 / 25
  • 12. Evaluation Setup Hardware Oculus a cluster machine running OpenJDK 64-Bit 1.8.0161 on Ubuntu 16.04.3 LTS Assigned 16 CPU (2.6 GHz Intel Xeon ”Sandy Bridge”) and 200 GB of RAM with timeout limit of 4 hours for each job Datasets NUTS CORINE Land Cover (CLC) Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 12 / 25
  • 13. Evaluation F-Measure Analysis Q1 How much performance (i.e., F-measure) each of the geospatial LD approaches loses, when to deal with the simplified geometries vs. when to deal with the original ones? The first set of experiments setup Radon for discovering topological relations Douglas-Peucker with simplification factors of 0.05, 0.09, 0.10 and 0.2 NUTS dataset as a source and CLC dataset as a target Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 13 / 25
  • 14. Evaluation F-Measure Analysis Relation/Factor 0.05 0.09 0.10 0.20 Average Equals 1.00 1.00 1.00 1.00 1.00 ± 0.00 Intersects 0.99 0.97 0.97 0.94 0.97 ± 0.02 Contains 0.99 0.97 0.97 0.93 0.97 ± 0.03 Within 0.99 0.97 0.97 0.93 0.97 ± 0.03 Covers 0.99 0.97 0.97 0.93 0.97 ± 0.03 Coveredby 0.99 0.97 0.97 0.93 0.97 ± 0.03 Crosses 1.00 1.00 1.00 1.00 1.00 ± 0.00 Touches 1.00 1.00 1.00 1.00 1.00 ± 0.00 Overlaps 0.80 0.52 0.47 0.28 0.52 ± 0.21 Average 0.97 ± 0.07 0.94 ± 0.16 0.93 ± 0.17 0.90 ± 0.23 0.94 ± 0.03 F-measures results of applying Radon against geometries generated using the Douglas-Peucker line simplification algorithm. Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 14 / 25
  • 15. Evaluation F-Measure Analysis Q1 How much performance (i.e., F-measure) each of the geospatial LD approaches loses, when to deal with the simplified geometries vs. when to deal with the original ones? The second set of experiments setup: Orchid for measuring the distance between point-sets Douglas-Peucker with simplification factors of 0.05, 0.09, 0.10 and 0.2 NUTS dataset dedublicated to compute F-measure NUTS dataset as a source and as a target (deduplication) Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 15 / 25
  • 16. Evaluation F-Measure Analysis Measure/Factor 0.05 0.9 0.1 0.2 0.3 Average Foriginal Hausdorff 0.90 0.91 0.91 0.91 0.91 0.91 ± 0.00 0.88 Mean 0.94 0.94 0.94 0.94 0.94 0.94 ± 0.00 0.94 Min 0.14 0.16 0.16 0.21 0.25 0.18 ± 0.04 0.13 Link 0.95 0.95 0.94 0.94 0.94 0.94 ± 0.00 0.94 SumOfMin 0.95 0.95 0.94 0.94 0.94 0.94 ± 0.00 0.94 avarege 0.77 ± 0.36 0.78 ± 0.35 0.78 ± 0.35 0.79 ± 0.32 0.80 ± 0.31 0.77 ± 0.36 F-measures results of the point-set distance measures implementations in Orchid using the Douglas-Peucker line simplification algorithm. Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 16 / 25
  • 17. Evaluation Runtime Analysis Q2 How well each of the geospatial LD approaches scale (i.e., runtime speedup), and when to deal with the simplified geometries? The third sets of experiments setup: Same setting as in the first sets of experiments Radon for discovering topological relations Douglas-Peucker with simplification factors of 0.05, 0.09, 0.10 and 0.2 NUTS dataset as a source and CLC dataset as a target Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 17 / 25
  • 18. Evaluation Runtime Analysis 0 50 100 150 200 250 300 350 400 450 Equals Intersects Contains Within Covers CoveredBy Overlaps Touches Crosses Runtime(sec.) Topological relations Original dataset 0.05 0.09 0.1 0.2 Runtimes of Radon’s implementation of topological relations LD using the Douglas-Peucker algorithm Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 18 / 25
  • 19. Evaluation Runtime Analysis Q2 How well each of the geospatial LD approaches scale (i.e., runtime speedup), and when to deal with the simplified geometries? The fourth set of experiments: Same setting in the second sets of experiments Orchid for measuring the distance between point-sets Douglas-Peucker with simplification factors of 0.05, 0.09, 0.10 and 0.2 NUTS dataset dedublicated to compute F-measure NUTS dataset as a source and as a target (deduplication) Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 19 / 25
  • 20. Evaluation Runtime Analysis 0 200 400 600 800 1000 1200 1400 1600 1800 Hausdorff Mean Min Link SumOfMin Runtime(sec.) point-set measures Original data 0.05 0.09 0.1 0.2 0.3 Runtimes of Orchid’s implementation of set-points distance measure LD using the Douglas-Peucker algorithm Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 20 / 25
  • 21. Evaluation LD Relations Analysis Q3 Which relation is the most/least affected by the simplification process? Same setting in the first and second sets of experiments The relation overlaps has the most affected F-measure when using Douglas-Peucker algorithm, (see Table 14) The equals,crosses and touches are not affected at all by any simplification (see Tables 14 In the case of point-set measures, the F-measure of min measure is the most affected, (see Table 16 ) Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 21 / 25
  • 22. Evaluation Simplification Runtime Analysis Q4 What is the run time cost of simplification? Same setting in the third and fourth sets of experiments Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 22 / 25
  • 23. Evaluation Simplification Runtime Analysis 0 500 1000 1500 2000 2500 3000 3500 Simplification RADONon Simpl. RADON+ Simpl. RADONon original Totalruntime(sec.) Total Runtime for all topological relations. 0 50 100 150 200 250 300 350 400 450 500 Simplification RADONon Simpl. RADON+ Simpl. RADONon original Averageruntimeforsingletask(sec.) Average Runtime of a single topological relation. Runtime of Radon on the original data vs. simplified data using the Douglas-Peucker algorithm. Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 23 / 25
  • 24. Conclusion & Future Work Conclusion Studied the usage of line simplification as a preprocessing step of LD approaches over geospatial RDF knowledge bases Studied the behaviour of two categories of geospatial linking approaches (i.e., the topological relations and point-set distances) On average, F-measure of 0.94 using the Douglas-Peucker algorithm and 0.69 using the Visvalingam-Whyatt algorithm has been achieved. Gain up to 19.8× speedup using Douglas-Peucker algorithm and up to 67.3× using the Visvalingam-Whyatt Future Work Guarantee the minimum F-measure loss Determine fittest line simplification algorithm and its best parameter to achieve a better trad-off between F-measure and Runtime Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 24 / 25
  • 25. Thank you for your Attention! Abdullah Fathi Ahmed afaahmed@mail.uni-paderborn.de https://github.com/dice-group/Orchid Acknowledgment This work has been supported by LEDS project, Eurostars Project SAGE (GA no. E!10882), the H2020 projects SLIPO (GA no. 731581) and HOBBIT (GA no. 688227) as well as the DFG project LinkingLOD (project no. NG 105/3-2) and the BMWI Project GEISER (project no. 01MD16014). Ahmed, Sherif and Ngonga GeoSimp. 12 September, 2018 25 / 25