SlideShare a Scribd company logo
1 of 30
Download to read offline
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Revisiting the Representation of and Need for Raw
Geometries on the Linked Data Web
Blake Regalia1, Krzysztof Janowicz1, and Grant McKenzie2
2017/04/03
1STKO Lab, University of California, Santa Barbara, USA
2Department of Geographical Sciences, University of Maryland, USA
blake.regalia@gmail.com
Revisiting the Representation of and Need for Raw
Geometries on the Linked Data Web
Blake Regalia1, Krzysztof Janowicz1, and Grant McKenzie2
2017/04/03
1STKO Lab, University of California, Santa Barbara, USA
2Department of Geographical Sciences, University of Maryland, USA
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
1. This column of the slideshow contains presenter notes that explains the slides for
viewers who may be viewing a copy of the PDF post-presentation.
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Position
1. Raw geometries for complex features do not need to be, and perhaps should
not be, encoded in RDF. Geometry should be represented as a
dereferencable resource and offer content-negotiation for its different
serializations.
2. Linked Data users are most often interested in topological relations rather
than raw geometries, so what are the trade-offs of precomputing topological
relations?
blake.regalia@gmail.com
Position
1. Raw geometries for complex features do not need to be, and perhaps should
not be, encoded in RDF. Geometry should be represented as a
dereferencable resource and offer content-negotiation for its different
serializations.
2. Linked Data users are most often interested in topological relations rather
than raw geometries, so what are the trade-offs of precomputing topological
relations?
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
Position
A Brief History of Geometry on the Semantic Web
A Brief History of Geometry on the Semantic Web
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
A Brief History of Geometry on the Semantic Web
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
The W3C Basic Geo Vocabulary, circa 2003, set out to:
“[explore] the possibilities of representing mapping/location data in RDF, and does not
attempt to address many of the issues covered in the professional GIS world”
Established a simple, minimalistic vocabulary for describing points with a latitude
and longitude value using the WGS84 reference datum.
Considered good enough for the basic needs of many!
blake.regalia@gmail.com
The W3C Basic Geo Vocabulary, circa 2003, set out to:
“[explore] the possibilities of representing mapping/location data in RDF, and does not
attempt to address many of the issues covered in the professional GIS world”
Established a simple, minimalistic vocabulary for describing points with a latitude
and longitude value using the WGS84 reference datum.
Considered good enough for the basic needs of many!
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
A Brief History of Geometry on the Semantic Web
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Resource about London Heathrow Airport using W3C Basic Geo in 2003-01-10.
blake.regalia@gmail.com
Resource about London Heathrow Airport using W3C Basic Geo in 2003-01-10.
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
A Brief History of Geometry on the Semantic Web
1. W3C Basic Geo Vocabulary worked pretty well, for example among web developers,
who just wanted to annotate web documents and XML resources manually.
2. However, one of its major shortcomings was that this vocabulary treated places as
though they were locations. Meaning that conceptually, there was no distinction
between the place on Earth and the geometric thing that can be used to represent its
location.
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Needs of geospatial data not met by W3C Basic Geo:
• Support for Coordinate Reference Systems (CRS)
• Geometries beyond Point such as polylines, polygons, etc.
• Separation of entity and geometry, which have cardinality of 1-to-many
blake.regalia@gmail.com
Needs of geospatial data not met by W3C Basic Geo:
• Support for Coordinate Reference Systems (CRS)
• Geometries beyond Point such as polylines, polygons, etc.
• Separation of entity and geometry, which have cardinality of 1-to-many
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
A Brief History of Geometry on the Semantic Web
1. It became clear to the community that a more comprehensive vocabulary was needed.
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
‘NeoGeo’ drafted a vocabulary that extended W3C Basic Geo to support more
geometry types. It encoded a geometry’s entire structure in RDF:
blake.regalia@gmail.com
‘NeoGeo’ drafted a vocabulary that extended W3C Basic Geo to support more
geometry types. It encoded a geometry’s entire structure in RDF:
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
A Brief History of Geometry on the Semantic Web
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
@prefix ngeo: <http://geovocab.org/geometry#> .
_:polygon rdf:type ngeo:Polygon ;
ngeo:exterior [
rdf:type ngeo:LinearRing ;
ngeo:posList (
[ geo:lat -29; geo:long 16 ]
[ geo:lat -28; geo:long 33 ]
[ geo:lat -34; geo:long 27 ]
[ geo:lat -29; geo:long 16 ]
)
] ;
ngeo:interior [
rdf:type ngeo:LinearRing ;
ngeo:posList (
[ geo:lat -29.5; geo:long 27 ]
[ geo:lat -28.5; geo:long 28.5 ]
[ geo:lat -31; geo:long 28 ]
[ geo:lat -29.5; geo:long 27 ]
)
] .
blake.regalia@gmail.com
@prefix ngeo: <http://geovocab.org/geometry#> .
_:polygon rdf:type ngeo:Polygon ;
ngeo:exterior [
rdf:type ngeo:LinearRing ;
ngeo:posList (
[ geo:lat -29; geo:long 16 ]
[ geo:lat -28; geo:long 33 ]
[ geo:lat -34; geo:long 27 ]
[ geo:lat -29; geo:long 16 ]
)
] ;
ngeo:interior [
rdf:type ngeo:LinearRing ;
ngeo:posList (
[ geo:lat -29.5; geo:long 27 ]
[ geo:lat -28.5; geo:long 28.5 ]
[ geo:lat -31; geo:long 28 ]
[ geo:lat -29.5; geo:long 27 ]
)
] .
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
A Brief History of Geometry on the Semantic Web
1. Their approach was to encode a geometry’s entire structure as RDF
2. However, NeoGeo still lacked support for coordinate reference systems and it was
criticized for its excessive creation of blank nodes (high overhead with little use case).
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
dbr:Perth rdfs:label "Perth"@en ;
geosparql:hasGeometry [
geosparql:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326>
POINT(115.85888671875 31.952222824097)"^^geosparql:wktLiteral→
] ;
GeoSPARQL addressed all the issues because it:
• separated places from their geometric representations via a blank node
• joined CRS, latitude and longitude values (in fact the entire geometry) into a
single RDF literal.
blake.regalia@gmail.com
dbr:Perth rdfs:label "Perth"@en ;
geosparql:hasGeometry [
geosparql:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326>
POINT(115.85888671875 31.952222824097)"^^geosparql:wktLiteral→
] ;
GeoSPARQL addressed all the issues because it:
• separated places from their geometric representations via a blank node
• joined CRS, latitude and longitude values (in fact the entire geometry) into a
single RDF literal.
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
A Brief History of Geometry on the Semantic Web
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
“What is the most populated city in Western Australia?”
select ?place where {
?place a dbo:City ;
dbo:populationTotal ?population ;
geosparql:hasGeometry [ geosparql:asWKT ?placeWKT ] .
?westernAustralia rdfs:label "Western Australia"@en ;
geosparql:hasGeometry [ geosparql:asWKT ?westernAustraliaWKT ] .
filter(geof:sfWithin(?placeWKT, ?westernAustraliaWKT))
} order by desc(?population) limit 1
Geospatial processing within SPARQL queries!
Perhaps most useful are the topological functions: sfTouches, sfWithin, etc.
blake.regalia@gmail.com
“What is the most populated city in Western Australia?”
select ?place where {
?place a dbo:City ;
dbo:populationTotal ?population ;
geosparql:hasGeometry [ geosparql:asWKT ?placeWKT ] .
?westernAustralia rdfs:label "Western Australia"@en ;
geosparql:hasGeometry [ geosparql:asWKT ?westernAustraliaWKT ] .
filter(geof:sfWithin(?placeWKT, ?westernAustraliaWKT))
} order by desc(?population) limit 1
Geospatial processing within SPARQL queries!
Perhaps most useful are the topological functions: sfTouches, sfWithin, etc.
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
A Brief History of Geometry on the Semantic Web
The Current Shortcomings of GeoSPARQL
The Current Shortcomings of GeoSPARQL
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
The Current Shortcomings of GeoSPARQL
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
WKT Literal for Western Australia is 61 KB of ‘human-readable’ markup text:
blake.regalia@gmail.com
WKT Literal for Western Australia is 61 KB of ‘human-readable’ markup text:
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
The Current Shortcomings of GeoSPARQL
1. This is an RDF literal, it is a string – so it should be stored in its entirety and indexed as
a string in the triplestore’s underlying relational database. Surely any GeoSPARQL
implementation will choose to create a spatially indexed copy of the geometry in a
binary format, but it will be a copy.
2. So it brings up the question, why do we need RDF literals for geometry? Are people
really doing fulltext searches on these strings? If anything, they might be applying a
filter to the Well-Known Text to select only certain types of geometries (such as point,
polyline or polygon).
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
filter(geof:sfTouches(?polygonA, ?polygonB))
No results.
blake.regalia@gmail.com
filter(geof:sfTouches(?polygonA, ?polygonB))
No results.
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
The Current Shortcomings of GeoSPARQL
1. These two geometries taken from Open Street Map are two adjacent counties in the
United States. But notice how they don’t line up perfectly.
2. Well this sort issue is so common with Geographic Information Systems that we have
names for it, like ‘sliver polygons’. Whether the coordinate data comes from GPS or
point-and-click, the edges are never exactly perfect. More generally, we call this a
digitization error.
3. Now normally, in a GIS, these problems are fixed so that topologically, they reflect their
real-world interpretation.
4. If you were to ask GeoSPARQL about this relation, it would return ‘overlapping’ – which
is not what you should expect since in fact, those two counties are actually ‘touching’
5. But the ‘touches’ function fails us here because GeoSPARQL does not handle dirty
polygons.
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
blake.regalia@gmail.com
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
The Current Shortcomings of GeoSPARQL
1. Here we have two different features extracted from Open Street Map. One of them in
blue is the geometry for a city, and the other one in red is that of a county.
2. In real-life, the city of Lynchburg and Moore County actually have an identical spatial
extent. However, they were digitized independently on Open Street Map and one of
them was digitized to a finer resolution.
3. Again, if you asked GeoSPARQL about the topology here, it returns ‘overlapping’, when
in reality, it should be ‘equal’.
4. The problem is that GeoSPARQL offers no solutions when dealing with geometries like
this – perhaps because its assumed that the data has already been cleaned by the
data providers.
5. But on the Linked Data Web, geometries can come from a variety of different sources,
and their geometries will certainly almost never be in perfect alignmeent (such as with
Linked GeoData, who sources their geometries from Open Street Map)
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
blake.regalia@gmail.com
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
The Current Shortcomings of GeoSPARQL
1. Finally, computing topological relations on-demand can be very expensive.
2. This polygon is made up of 4.5k rings and 487.k points – and is a nightmare for finding
features ‘within’ and ‘touching’
3. Geometries like this are common in geographic datasets containing natural features.
On-demand topology does not scale in these cases – performance suffers to the point
that it becomes unusable.
Geometry is a Resource; Represent them w/ URIs
Geometry is a Resource; Represent them w/ URIs
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
Geometry is a Resource; Represent them w/ URIs
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Beyond simple points and bounding boxes, storing raw geometry in RDF literals
should be reconsidered.
osm:relation_2316598
rdfs:label "Western Australia"@en ;
geosparql:hasGeometry [
geosparql:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326>
POLYGON((128.9999986 -14.4290140, 128.9999714 -14.8798443,
...))"^^geosparql:wktLiteral
→
→
] ;
# instead...
Instead, represent geometry as a resource via URIs that can be dereferenced to
fetch the geometry data.
blake.regalia@gmail.com
Beyond simple points and bounding boxes, storing raw geometry in RDF literals
should be reconsidered.
osm:relation_2316598
rdfs:label "Western Australia"@en ;
geosparql:hasGeometry [
geosparql:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326>
POLYGON((128.9999986 -14.4290140, 128.9999714 -14.8798443,
...))"^^geosparql:wktLiteral
→
→
] ;
# instead...
Instead, represent geometry as a resource via URIs that can be dereferenced to
fetch the geometry data.
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
Geometry is a Resource; Represent them w/ URIs
1. Storing the raw geometry in RDF literals should be reconsidered for complex
geometries.
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Beyond simple points and bounding boxes, storing raw geometry in RDF literals
should be reconsidered.
osm:relation_2316598
rdfs:label "Western Australia"@en ;
geosparql:hasGeometry [
geosparql:asWKT ‘<http://www.opengis.net/def/crs/EPSG/0/4326>
POLYGON((128.9999986 -14.4290140, 128.9999714 -14.8798443, ...))
geosparql:wktLiteral
→
→
] ;
# instead... get rid of blank node and use URI
ago:hasGeometry ex:WesternAustraliaPolygon ;
Instead, represent geometry as a resource via URIs that can be dereferenced to
fetch the geometry data.
blake.regalia@gmail.com
Beyond simple points and bounding boxes, storing raw geometry in RDF literals
should be reconsidered.
osm:relation_2316598
rdfs:label "Western Australia"@en ;
geosparql:hasGeometry [
geosparql:asWKT ‘<http://www.opengis.net/def/crs/EPSG/0/4326>
POLYGON((128.9999986 -14.4290140, 128.9999714 -14.8798443, ...))
geosparql:wktLiteral
→
→
] ;
# instead... get rid of blank node and use URI
ago:hasGeometry ex:WesternAustraliaPolygon ;
Instead, represent geometry as a resource via URIs that can be dereferenced to
fetch the geometry data.
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
Geometry is a Resource; Represent them w/ URIs
1. Instead, geometry should be represented as a resource that can be dereferenced to
fetch the geometry data.
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Dereferencable URIs: clients are free to fetch geometry in a variety of formats.
curl "http://ex.co/geometry/polygon?id=42" -H "Accept: $MIME_TYPE"
MIME Type Description Returns
text/html Web interface <!DOCTYPE html><html lang="en">...
text/plain Well-Known Text POLYGON((113.1016 -38.062 ...))
application/gml+xml GML <gml:Polygon><gml:Exterior>...
application/vnd.geo+json GeoJSON {"type":"Polygon","coordinates":...}
application/octet-stream Well-Known Binary 01 06 00 00 20 E6 10 00 00 01...
blake.regalia@gmail.com
Dereferencable URIs: clients are free to fetch geometry in a variety of formats.
curl "http://ex.co/geometry/polygon?id=42" -H "Accept: $MIME_TYPE"
MIME Type Description Returns
text/html Web interface <!DOCTYPE html><html lang="en">...
text/plain Well-Known Text POLYGON((113.1016 -38.062 ...))
application/gml+xml GML <gml:Polygon><gml:Exterior>...
application/vnd.geo+json GeoJSON {"type":"Polygon","coordinates":...}
application/octet-stream Well-Known Binary 01 06 00 00 20 E6 10 00 00 01...
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
Geometry is a Resource; Represent them w/ URIs
1. Having the ability to choose a geometry serialization format is not only convenient for
clients, but it can also have a tremendous impact on the performance of web
applications who right now have to parse Well-Known Text – usually to convert it into
GeoJSON before doing anything interesting with the data.
2. Same goes for the transmission of large amounts of geodata such as Well-Known
Binary which takes up about 30% less storage than Well-Known Text.
3. With content-negotation, clients can dereference a URI with certain media types in the
‘Accept’ header to settle on a format that is suitable for their needs.
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
base URI geometry id
http://ex.co/geometry/point?id=1337#128.988_-14.429
geometry type coordinate value
http://ex.co/geometry/polygon?id=7331#128.113_-14.221,126.188_-19.212
bounding box
blake.regalia@gmail.com
base URI geometry id
http://ex.co/geometry/point?id=1337#128.988_-14.429
geometry type coordinate value
http://ex.co/geometry/polygon?id=7331#128.113_-14.221,126.188_-19.212
bounding box
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
Geometry is a Resource; Represent them w/ URIs
1. For our dataset, we mint the URIs such that they contain metadata about the geometry
they represent.
2. Just by looking at an identifier, someone can determine that it represents a geometry,
the type of geometry it represents, and the WGS84 bounding box of the feature.
3. To give a use case example, a web application could build a spatial index from just the
URIs alone by using their bounding box metadata, and then selectively download
geometries within areas of interest – such as the current map viewport.
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Comparison of strategies towards storing and using geometry:
Trait GeoSPARQL NeoGeo AGO
Efficient geometry storage 
Geometry can persist externally 1

Content-negotiation for geometry format  
Uniform RDF structure  
Composite geometries  
Determine geometry type 2   
Access bounding box 2 
Access raw geometry 2  
1 = Geometry can persist in a local geodatabase or even on a remote system and without copies.
2 = From the triples’ RDF data alone (e.g., without using SPARQL).
blake.regalia@gmail.com
Comparison of strategies towards storing and using geometry:
Trait GeoSPARQL NeoGeo AGO
Efficient geometry storage 
Geometry can persist externally 1

Content-negotiation for geometry format  
Uniform RDF structure  
Composite geometries  
Determine geometry type 2   
Access bounding box 2 
Access raw geometry 2  
1 = Geometry can persist in a local geodatabase or even on a remote system and without copies.
2 = From the triples’ RDF data alone (e.g., without using SPARQL).
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
Geometry is a Resource; Represent them w/ URIs
1. Through a combination of techniques, we’re able to save almost all the benefits of both
GeoSPARQL and NeoGeo – with the one drawback of accessing the geometry data
inline with the RDF, which has its own set of problems as discussed earlier.
2. An added benefit to using URIs is that the geometry data can persist external to the
triplestore – it can reside in its original form on a local geodatabase, or it may even exist
somewhere else on a remote system
Precompute Your Topological Relations!
Precompute Your Topological Relations!
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
Precompute Your Topological Relations!
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
blake.regalia@gmail.com
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
Precompute Your Topological Relations!
1. The other problem to GeoSPARQL that I brought up earlier was with topology.
Computing relations on-demand can take way too long AND GeoSPARQL does not
handle digitization errors.
2. But if we precompute topological relations, correcting digitization errors is back in the
hands of domain-experts who can use the right methods to come up with those
relations.
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Strict Topological Relations
avg. area/length of...
# instances relation code left geometry right geometry
19,134 polygon touches polygon EC 960km2 2, 049km2
1,272 polygon overlaps polygon PO 321km2 2, 974km2
1,287 polygon tpp polygon TPP 57km2 2, 653km2
2,577 polygon ntpp polygon NTPP 16km2 3, 052km2
3 polygon equals polygon EQ 830m2 836m2
19,543 polyline touches polygon TCH 652km 701km2
7,871 polyline crosses polygon PTH 290km 2, 733km2
5,733 polyline within polygon INC 6.5km 6, 863km2
11,227 polyline crosses polyline 395km 688km
blake.regalia@gmail.com
Strict Topological Relations
avg. area/length of...
# instances relation code left geometry right geometry
19,134 polygon touches polygon EC 960km2 2, 049km2
1,272 polygon overlaps polygon PO 321km2 2, 974km2
1,287 polygon tpp polygon TPP 57km2 2, 653km2
2,577 polygon ntpp polygon NTPP 16km2 3, 052km2
3 polygon equals polygon EQ 830m2 836m2
19,543 polyline touches polygon TCH 652km 701km2
7,871 polyline crosses polygon PTH 290km 2, 733km2
5,733 polyline within polygon INC 6.5km 6, 863km2
11,227 polyline crosses polyline 395km 688km
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
Precompute Your Topological Relations!
1. We tested this on a dataset we created by aligning Open Street Map polygons with their
corresponding DBpedia resources for about 18.5k polygons and 8k polylines in the
United States – which gave us a total of about 68k distinct topological relations
2. The reason we can get away with materializing so many relations is because they are
all based on intersection which is limited by locality – so in other words, we don’t
materialize relations for geometries that are disjoint, otherwise we would be adding (n
choose 2) triples .
3. In fact, we can start by only materializing the lowest-level relations in one direction and
then use reasoning to expand the graph – so for example, tangential proper part and
non-tangential proper part are both subclasses of the ‘within’ relation.
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
# a) GeoSPARQL
select ?place where {
dbr:Perth geosparql:hasGeometry [geosparql:asWKT ?wktA].
?anyPlace geosparql:hasGeometry [geosparql:asWKT ?wktB].
filter(geof:sfWithin(?wktA, ?wktB))
}
# b) Our ‘AGO’ approach
prefix agt: http://awesemantic-geo.link/topology/ .
select ?place where {
dbr:Perth agt:within ?place.
}
blake.regalia@gmail.com
# a) GeoSPARQL
select ?place where {
dbr:Perth geosparql:hasGeometry [geosparql:asWKT ?wktA].
?anyPlace geosparql:hasGeometry [geosparql:asWKT ?wktB].
filter(geof:sfWithin(?wktA, ?wktB))
}
# b) Our ‘AGO’ approach
prefix agt: http://awesemantic-geo.link/topology/ .
select ?place where {
dbr:Perth agt:within ?place.
}
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
Precompute Your Topological Relations!
1. So to see what that looks like, instead of applying a filter at query time, we simply
match the basic graph pattern for the specific precomputed topological relation we’re
interested in.
2. Not only is this faster, but we’ve also effectively cleaned the data so the results we get
will be more accurate.
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
blake.regalia@gmail.com
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
Precompute Your Topological Relations!
1. And with that consolidated city-county example from earlier, our preprocessing step
caught these dirty polygons and materialized the city of Lynchburg and Moore County
as having an EQUAL topological relation.
2. So now we’ve dealt with digitization errors and have sort of enriched the dataset with
strict topological relations.
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
blake.regalia@gmail.com
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
Precompute Your Topological Relations!
1. But there’s a geographic problem – strict topological relations alone do not take into
account uncertainty and vagueness principles.
2. So for example, the spatial extent of a town can be conceptually vauge – it depends on
who you ask, when you ask them, and so forth.
3. With rivers and water bodies, the spatial extent is always changing depending on how
much rainfall there has been and so on.
4. How can you say anything about the topology of these things? Well, you have to use
broad boundaries – that is to say that the edges of those polygons are actually regions
that can grow or shrink depending on what you are comparing them to.
5. So for a town, we can allow the boundary of the polygon to grow a little and the river’s
boundary might grow or shrink as well – until they meet; then we can say that they are
‘broadly touching’; using a different predicate than the strict topological relations.
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
blake.regalia@gmail.com
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
Precompute Your Topological Relations!
1. And for cases that may exhibit cognitive relations, we can also create a set of
topological predicates for those as well – such as to say that Brazil is ‘mostly within’ the
southern hemisphere.
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
• Stop using blank nodes for geometry; adopt URIs!
• RDF literals are cumbersone. Geometry is a resource, let it persist as such
• What are those raw geometries needed for? On-demand computations do
not scale for large geodatasets. Also, digitization errors, vagueness and
uncertainty challenge topological queries – precomputing topology takes care
of these issues
blake.regalia@gmail.com
• Stop using blank nodes for geometry; adopt URIs!
• RDF literals are cumbersone. Geometry is a resource, let it persist as such
• What are those raw geometries needed for? On-demand computations do
not scale for large geodatasets. Also, digitization errors, vagueness and
uncertainty challenge topological queries – precomputing topology takes care
of these issues
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
1. Stop using blank nodes for geometry! At the very least, you should always be using a
dereferenceable URI instead.
2. While you’re at it, (if you can), remove RDF literals and let your geometries persist in a
production geodatabase.
3. And finally, what are these raw geometries really needed for in RDF? We have this
problem that on-demand computations are not scaling for large geodatasets
4. On top of that, we have problems with digitization errors, vagueness and uncertainty
making topological queries difficult to answer accurately - so that they reflect their
real-world equivalent.
5. Instead, knowledge graphs will see more benefit from precomputing those topological
relations and taking into account the full range of interpretations that topology can have
for geographic places.

More Related Content

Similar to Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web

Geographica: A Benchmark for Geospatial RDF Stores - ISWC 2013
Geographica: A Benchmark for Geospatial RDF Stores - ISWC 2013Geographica: A Benchmark for Geospatial RDF Stores - ISWC 2013
Geographica: A Benchmark for Geospatial RDF Stores - ISWC 2013
Kostis Kyzirakos
 
Database Conditioning Presentation ESRI PUG 2015
Database Conditioning Presentation ESRI PUG 2015Database Conditioning Presentation ESRI PUG 2015
Database Conditioning Presentation ESRI PUG 2015
Bernie South
 
Representing and Querying Geospatial Information in the Semantic Web
Representing and Querying Geospatial Information in the Semantic WebRepresenting and Querying Geospatial Information in the Semantic Web
Representing and Querying Geospatial Information in the Semantic Web
Kostis Kyzirakos
 
The Matrix: connecting and re-using digital records of archaeological investi...
The Matrix: connecting and re-using digital records of archaeological investi...The Matrix: connecting and re-using digital records of archaeological investi...
The Matrix: connecting and re-using digital records of archaeological investi...
Keith.May
 
Constructing Semantic Gazetteers: Managing GeoSpatial Vocabularies Using Open...
Constructing Semantic Gazetteers: Managing GeoSpatial Vocabularies Using Open...Constructing Semantic Gazetteers: Managing GeoSpatial Vocabularies Using Open...
Constructing Semantic Gazetteers: Managing GeoSpatial Vocabularies Using Open...
Stephane Fellah
 
Accumulo Summit 2016: GeoMesa: Using Accumulo for Optimized Spatio-Temporal P...
Accumulo Summit 2016: GeoMesa: Using Accumulo for Optimized Spatio-Temporal P...Accumulo Summit 2016: GeoMesa: Using Accumulo for Optimized Spatio-Temporal P...
Accumulo Summit 2016: GeoMesa: Using Accumulo for Optimized Spatio-Temporal P...
Accumulo Summit
 

Similar to Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web (20)

GNIS-LD: Serving and Visualizing the Geographic Names Information System Gaze...
GNIS-LD: Serving and Visualizing the Geographic Names Information System Gaze...GNIS-LD: Serving and Visualizing the Geographic Names Information System Gaze...
GNIS-LD: Serving and Visualizing the Geographic Names Information System Gaze...
 
Geographica: A Benchmark for Geospatial RDF Stores
Geographica: A Benchmark for Geospatial RDF StoresGeographica: A Benchmark for Geospatial RDF Stores
Geographica: A Benchmark for Geospatial RDF Stores
 
Geographica: A Benchmark for Geospatial RDF Stores - ISWC 2013
Geographica: A Benchmark for Geospatial RDF Stores - ISWC 2013Geographica: A Benchmark for Geospatial RDF Stores - ISWC 2013
Geographica: A Benchmark for Geospatial RDF Stores - ISWC 2013
 
Mapping Lo Dto Proton Revised [Compatibility Mode]
Mapping Lo Dto Proton Revised [Compatibility Mode]Mapping Lo Dto Proton Revised [Compatibility Mode]
Mapping Lo Dto Proton Revised [Compatibility Mode]
 
Strabon: A Semantic Geospatial Database System
Strabon: A Semantic Geospatial Database SystemStrabon: A Semantic Geospatial Database System
Strabon: A Semantic Geospatial Database System
 
Brief State of the Art - Semantic Web technologies for geospatial data - Mode...
Brief State of the Art - Semantic Web technologies for geospatial data - Mode...Brief State of the Art - Semantic Web technologies for geospatial data - Mode...
Brief State of the Art - Semantic Web technologies for geospatial data - Mode...
 
Querying Incomplete Geospatial Information in RDF
Querying Incomplete Geospatial Information in RDFQuerying Incomplete Geospatial Information in RDF
Querying Incomplete Geospatial Information in RDF
 
From Simple Features to Moving Features and Beyond? at OGC Member Meeting, Se...
From Simple Features to Moving Features and Beyond? at OGC Member Meeting, Se...From Simple Features to Moving Features and Beyond? at OGC Member Meeting, Se...
From Simple Features to Moving Features and Beyond? at OGC Member Meeting, Se...
 
Giving MongoDB a Way to Play with the GIS Community
Giving MongoDB a Way to Play with the GIS CommunityGiving MongoDB a Way to Play with the GIS Community
Giving MongoDB a Way to Play with the GIS Community
 
Database Conditioning Presentation ESRI PUG 2015
Database Conditioning Presentation ESRI PUG 2015Database Conditioning Presentation ESRI PUG 2015
Database Conditioning Presentation ESRI PUG 2015
 
Scalable Keyword Cover Search using Keyword NNE and Inverted Indexing
Scalable Keyword Cover Search using Keyword NNE and Inverted IndexingScalable Keyword Cover Search using Keyword NNE and Inverted Indexing
Scalable Keyword Cover Search using Keyword NNE and Inverted Indexing
 
Spatial Data, KML, and the University Web
Spatial Data, KML, and the University WebSpatial Data, KML, and the University Web
Spatial Data, KML, and the University Web
 
Representing and Querying Geospatial Information in the Semantic Web
Representing and Querying Geospatial Information in the Semantic WebRepresenting and Querying Geospatial Information in the Semantic Web
Representing and Querying Geospatial Information in the Semantic Web
 
Going for GOLD - Adventures in Open Linked Geospatial Metadata
Going for GOLD - Adventures in Open Linked Geospatial MetadataGoing for GOLD - Adventures in Open Linked Geospatial Metadata
Going for GOLD - Adventures in Open Linked Geospatial Metadata
 
The Matrix: connecting and re-using digital records of archaeological investi...
The Matrix: connecting and re-using digital records of archaeological investi...The Matrix: connecting and re-using digital records of archaeological investi...
The Matrix: connecting and re-using digital records of archaeological investi...
 
Beyond the Record : OCLC & the Future of MARC
Beyond the Record : OCLC & the Future of MARCBeyond the Record : OCLC & the Future of MARC
Beyond the Record : OCLC & the Future of MARC
 
GeoCO GeoJSON & GeoLocate
GeoCO GeoJSON & GeoLocateGeoCO GeoJSON & GeoLocate
GeoCO GeoJSON & GeoLocate
 
Gis and Ruby 101 at Ruby Conf Kenya 2017 by Kamal Ogudah
Gis and Ruby 101 at Ruby Conf Kenya 2017 by Kamal OgudahGis and Ruby 101 at Ruby Conf Kenya 2017 by Kamal Ogudah
Gis and Ruby 101 at Ruby Conf Kenya 2017 by Kamal Ogudah
 
Constructing Semantic Gazetteers: Managing GeoSpatial Vocabularies Using Open...
Constructing Semantic Gazetteers: Managing GeoSpatial Vocabularies Using Open...Constructing Semantic Gazetteers: Managing GeoSpatial Vocabularies Using Open...
Constructing Semantic Gazetteers: Managing GeoSpatial Vocabularies Using Open...
 
Accumulo Summit 2016: GeoMesa: Using Accumulo for Optimized Spatio-Temporal P...
Accumulo Summit 2016: GeoMesa: Using Accumulo for Optimized Spatio-Temporal P...Accumulo Summit 2016: GeoMesa: Using Accumulo for Optimized Spatio-Temporal P...
Accumulo Summit 2016: GeoMesa: Using Accumulo for Optimized Spatio-Temporal P...
 

Recently uploaded

Detectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a TechnosignatureDetectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a Technosignature
Sérgio Sacani
 
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdfPests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
PirithiRaju
 
The solar dynamo begins near the surface
The solar dynamo begins near the surfaceThe solar dynamo begins near the surface
The solar dynamo begins near the surface
Sérgio Sacani
 
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
PirithiRaju
 
Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!
University of Hertfordshire
 
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Sérgio Sacani
 

Recently uploaded (20)

National Biodiversity protection initiatives and Convention on Biological Di...
National Biodiversity protection initiatives and  Convention on Biological Di...National Biodiversity protection initiatives and  Convention on Biological Di...
National Biodiversity protection initiatives and Convention on Biological Di...
 
Detectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a TechnosignatureDetectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a Technosignature
 
ERTHROPOIESIS: Dr. E. Muralinath & R. Gnana Lahari
ERTHROPOIESIS: Dr. E. Muralinath & R. Gnana LahariERTHROPOIESIS: Dr. E. Muralinath & R. Gnana Lahari
ERTHROPOIESIS: Dr. E. Muralinath & R. Gnana Lahari
 
A Giant Impact Origin for the First Subduction on Earth
A Giant Impact Origin for the First Subduction on EarthA Giant Impact Origin for the First Subduction on Earth
A Giant Impact Origin for the First Subduction on Earth
 
family therapy psychotherapy types .pdf
family therapy psychotherapy types  .pdffamily therapy psychotherapy types  .pdf
family therapy psychotherapy types .pdf
 
GBSN - Microbiology (Unit 7) Microbiology in Everyday Life
GBSN - Microbiology (Unit 7) Microbiology in Everyday LifeGBSN - Microbiology (Unit 7) Microbiology in Everyday Life
GBSN - Microbiology (Unit 7) Microbiology in Everyday Life
 
MODERN PHYSICS_REPORTING_QUANTA_.....pdf
MODERN PHYSICS_REPORTING_QUANTA_.....pdfMODERN PHYSICS_REPORTING_QUANTA_.....pdf
MODERN PHYSICS_REPORTING_QUANTA_.....pdf
 
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdfPests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
 
The solar dynamo begins near the surface
The solar dynamo begins near the surfaceThe solar dynamo begins near the surface
The solar dynamo begins near the surface
 
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
 
Hemoglobin metabolism: C Kalyan & E. Muralinath
Hemoglobin metabolism: C Kalyan & E. MuralinathHemoglobin metabolism: C Kalyan & E. Muralinath
Hemoglobin metabolism: C Kalyan & E. Muralinath
 
B lymphocytes, Receptors, Maturation and Activation
B lymphocytes, Receptors, Maturation and ActivationB lymphocytes, Receptors, Maturation and Activation
B lymphocytes, Receptors, Maturation and Activation
 
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
 
Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!
 
Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...
Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...
Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...
 
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
 
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
 
The Scientific names of some important families of Industrial plants .pdf
The Scientific names of some important families of Industrial plants .pdfThe Scientific names of some important families of Industrial plants .pdf
The Scientific names of some important families of Industrial plants .pdf
 
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
 
GBSN - Microbiology Lab 1 (Microbiology Lab Safety Procedures)
GBSN -  Microbiology Lab  1 (Microbiology Lab Safety Procedures)GBSN -  Microbiology Lab  1 (Microbiology Lab Safety Procedures)
GBSN - Microbiology Lab 1 (Microbiology Lab Safety Procedures)
 

Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web

  • 1. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Blake Regalia1, Krzysztof Janowicz1, and Grant McKenzie2 2017/04/03 1STKO Lab, University of California, Santa Barbara, USA 2Department of Geographical Sciences, University of Maryland, USA blake.regalia@gmail.com Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Blake Regalia1, Krzysztof Janowicz1, and Grant McKenzie2 2017/04/03 1STKO Lab, University of California, Santa Barbara, USA 2Department of Geographical Sciences, University of Maryland, USA 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web 1. This column of the slideshow contains presenter notes that explains the slides for viewers who may be viewing a copy of the PDF post-presentation.
  • 2. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Position 1. Raw geometries for complex features do not need to be, and perhaps should not be, encoded in RDF. Geometry should be represented as a dereferencable resource and offer content-negotiation for its different serializations. 2. Linked Data users are most often interested in topological relations rather than raw geometries, so what are the trade-offs of precomputing topological relations? blake.regalia@gmail.com Position 1. Raw geometries for complex features do not need to be, and perhaps should not be, encoded in RDF. Geometry should be represented as a dereferencable resource and offer content-negotiation for its different serializations. 2. Linked Data users are most often interested in topological relations rather than raw geometries, so what are the trade-offs of precomputing topological relations? 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Position
  • 3. A Brief History of Geometry on the Semantic Web A Brief History of Geometry on the Semantic Web 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web
  • 4. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments The W3C Basic Geo Vocabulary, circa 2003, set out to: “[explore] the possibilities of representing mapping/location data in RDF, and does not attempt to address many of the issues covered in the professional GIS world” Established a simple, minimalistic vocabulary for describing points with a latitude and longitude value using the WGS84 reference datum. Considered good enough for the basic needs of many! blake.regalia@gmail.com The W3C Basic Geo Vocabulary, circa 2003, set out to: “[explore] the possibilities of representing mapping/location data in RDF, and does not attempt to address many of the issues covered in the professional GIS world” Established a simple, minimalistic vocabulary for describing points with a latitude and longitude value using the WGS84 reference datum. Considered good enough for the basic needs of many! 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web
  • 5. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Resource about London Heathrow Airport using W3C Basic Geo in 2003-01-10. blake.regalia@gmail.com Resource about London Heathrow Airport using W3C Basic Geo in 2003-01-10. 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web 1. W3C Basic Geo Vocabulary worked pretty well, for example among web developers, who just wanted to annotate web documents and XML resources manually. 2. However, one of its major shortcomings was that this vocabulary treated places as though they were locations. Meaning that conceptually, there was no distinction between the place on Earth and the geometric thing that can be used to represent its location.
  • 6. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Needs of geospatial data not met by W3C Basic Geo: • Support for Coordinate Reference Systems (CRS) • Geometries beyond Point such as polylines, polygons, etc. • Separation of entity and geometry, which have cardinality of 1-to-many blake.regalia@gmail.com Needs of geospatial data not met by W3C Basic Geo: • Support for Coordinate Reference Systems (CRS) • Geometries beyond Point such as polylines, polygons, etc. • Separation of entity and geometry, which have cardinality of 1-to-many 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web 1. It became clear to the community that a more comprehensive vocabulary was needed.
  • 7. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments ‘NeoGeo’ drafted a vocabulary that extended W3C Basic Geo to support more geometry types. It encoded a geometry’s entire structure in RDF: blake.regalia@gmail.com ‘NeoGeo’ drafted a vocabulary that extended W3C Basic Geo to support more geometry types. It encoded a geometry’s entire structure in RDF: 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web
  • 8. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments @prefix ngeo: <http://geovocab.org/geometry#> . _:polygon rdf:type ngeo:Polygon ; ngeo:exterior [ rdf:type ngeo:LinearRing ; ngeo:posList ( [ geo:lat -29; geo:long 16 ] [ geo:lat -28; geo:long 33 ] [ geo:lat -34; geo:long 27 ] [ geo:lat -29; geo:long 16 ] ) ] ; ngeo:interior [ rdf:type ngeo:LinearRing ; ngeo:posList ( [ geo:lat -29.5; geo:long 27 ] [ geo:lat -28.5; geo:long 28.5 ] [ geo:lat -31; geo:long 28 ] [ geo:lat -29.5; geo:long 27 ] ) ] . blake.regalia@gmail.com @prefix ngeo: <http://geovocab.org/geometry#> . _:polygon rdf:type ngeo:Polygon ; ngeo:exterior [ rdf:type ngeo:LinearRing ; ngeo:posList ( [ geo:lat -29; geo:long 16 ] [ geo:lat -28; geo:long 33 ] [ geo:lat -34; geo:long 27 ] [ geo:lat -29; geo:long 16 ] ) ] ; ngeo:interior [ rdf:type ngeo:LinearRing ; ngeo:posList ( [ geo:lat -29.5; geo:long 27 ] [ geo:lat -28.5; geo:long 28.5 ] [ geo:lat -31; geo:long 28 ] [ geo:lat -29.5; geo:long 27 ] ) ] . 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web 1. Their approach was to encode a geometry’s entire structure as RDF 2. However, NeoGeo still lacked support for coordinate reference systems and it was criticized for its excessive creation of blank nodes (high overhead with little use case).
  • 9. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments dbr:Perth rdfs:label "Perth"@en ; geosparql:hasGeometry [ geosparql:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326> POINT(115.85888671875 31.952222824097)"^^geosparql:wktLiteral→ ] ; GeoSPARQL addressed all the issues because it: • separated places from their geometric representations via a blank node • joined CRS, latitude and longitude values (in fact the entire geometry) into a single RDF literal. blake.regalia@gmail.com dbr:Perth rdfs:label "Perth"@en ; geosparql:hasGeometry [ geosparql:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326> POINT(115.85888671875 31.952222824097)"^^geosparql:wktLiteral→ ] ; GeoSPARQL addressed all the issues because it: • separated places from their geometric representations via a blank node • joined CRS, latitude and longitude values (in fact the entire geometry) into a single RDF literal. 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web
  • 10. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments “What is the most populated city in Western Australia?” select ?place where { ?place a dbo:City ; dbo:populationTotal ?population ; geosparql:hasGeometry [ geosparql:asWKT ?placeWKT ] . ?westernAustralia rdfs:label "Western Australia"@en ; geosparql:hasGeometry [ geosparql:asWKT ?westernAustraliaWKT ] . filter(geof:sfWithin(?placeWKT, ?westernAustraliaWKT)) } order by desc(?population) limit 1 Geospatial processing within SPARQL queries! Perhaps most useful are the topological functions: sfTouches, sfWithin, etc. blake.regalia@gmail.com “What is the most populated city in Western Australia?” select ?place where { ?place a dbo:City ; dbo:populationTotal ?population ; geosparql:hasGeometry [ geosparql:asWKT ?placeWKT ] . ?westernAustralia rdfs:label "Western Australia"@en ; geosparql:hasGeometry [ geosparql:asWKT ?westernAustraliaWKT ] . filter(geof:sfWithin(?placeWKT, ?westernAustraliaWKT)) } order by desc(?population) limit 1 Geospatial processing within SPARQL queries! Perhaps most useful are the topological functions: sfTouches, sfWithin, etc. 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web
  • 11. The Current Shortcomings of GeoSPARQL The Current Shortcomings of GeoSPARQL 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web The Current Shortcomings of GeoSPARQL
  • 12. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments WKT Literal for Western Australia is 61 KB of ‘human-readable’ markup text: blake.regalia@gmail.com WKT Literal for Western Australia is 61 KB of ‘human-readable’ markup text: 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web The Current Shortcomings of GeoSPARQL 1. This is an RDF literal, it is a string – so it should be stored in its entirety and indexed as a string in the triplestore’s underlying relational database. Surely any GeoSPARQL implementation will choose to create a spatially indexed copy of the geometry in a binary format, but it will be a copy. 2. So it brings up the question, why do we need RDF literals for geometry? Are people really doing fulltext searches on these strings? If anything, they might be applying a filter to the Well-Known Text to select only certain types of geometries (such as point, polyline or polygon).
  • 13. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments filter(geof:sfTouches(?polygonA, ?polygonB)) No results. blake.regalia@gmail.com filter(geof:sfTouches(?polygonA, ?polygonB)) No results. 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web The Current Shortcomings of GeoSPARQL 1. These two geometries taken from Open Street Map are two adjacent counties in the United States. But notice how they don’t line up perfectly. 2. Well this sort issue is so common with Geographic Information Systems that we have names for it, like ‘sliver polygons’. Whether the coordinate data comes from GPS or point-and-click, the edges are never exactly perfect. More generally, we call this a digitization error. 3. Now normally, in a GIS, these problems are fixed so that topologically, they reflect their real-world interpretation. 4. If you were to ask GeoSPARQL about this relation, it would return ‘overlapping’ – which is not what you should expect since in fact, those two counties are actually ‘touching’ 5. But the ‘touches’ function fails us here because GeoSPARQL does not handle dirty polygons.
  • 14. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments blake.regalia@gmail.com 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web The Current Shortcomings of GeoSPARQL 1. Here we have two different features extracted from Open Street Map. One of them in blue is the geometry for a city, and the other one in red is that of a county. 2. In real-life, the city of Lynchburg and Moore County actually have an identical spatial extent. However, they were digitized independently on Open Street Map and one of them was digitized to a finer resolution. 3. Again, if you asked GeoSPARQL about the topology here, it returns ‘overlapping’, when in reality, it should be ‘equal’. 4. The problem is that GeoSPARQL offers no solutions when dealing with geometries like this – perhaps because its assumed that the data has already been cleaned by the data providers. 5. But on the Linked Data Web, geometries can come from a variety of different sources, and their geometries will certainly almost never be in perfect alignmeent (such as with Linked GeoData, who sources their geometries from Open Street Map)
  • 15. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments blake.regalia@gmail.com 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web The Current Shortcomings of GeoSPARQL 1. Finally, computing topological relations on-demand can be very expensive. 2. This polygon is made up of 4.5k rings and 487.k points – and is a nightmare for finding features ‘within’ and ‘touching’ 3. Geometries like this are common in geographic datasets containing natural features. On-demand topology does not scale in these cases – performance suffers to the point that it becomes unusable.
  • 16. Geometry is a Resource; Represent them w/ URIs Geometry is a Resource; Represent them w/ URIs 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Geometry is a Resource; Represent them w/ URIs
  • 17. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Beyond simple points and bounding boxes, storing raw geometry in RDF literals should be reconsidered. osm:relation_2316598 rdfs:label "Western Australia"@en ; geosparql:hasGeometry [ geosparql:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326> POLYGON((128.9999986 -14.4290140, 128.9999714 -14.8798443, ...))"^^geosparql:wktLiteral → → ] ; # instead... Instead, represent geometry as a resource via URIs that can be dereferenced to fetch the geometry data. blake.regalia@gmail.com Beyond simple points and bounding boxes, storing raw geometry in RDF literals should be reconsidered. osm:relation_2316598 rdfs:label "Western Australia"@en ; geosparql:hasGeometry [ geosparql:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326> POLYGON((128.9999986 -14.4290140, 128.9999714 -14.8798443, ...))"^^geosparql:wktLiteral → → ] ; # instead... Instead, represent geometry as a resource via URIs that can be dereferenced to fetch the geometry data. 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Geometry is a Resource; Represent them w/ URIs 1. Storing the raw geometry in RDF literals should be reconsidered for complex geometries.
  • 18. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Beyond simple points and bounding boxes, storing raw geometry in RDF literals should be reconsidered. osm:relation_2316598 rdfs:label "Western Australia"@en ; geosparql:hasGeometry [ geosparql:asWKT ‘<http://www.opengis.net/def/crs/EPSG/0/4326> POLYGON((128.9999986 -14.4290140, 128.9999714 -14.8798443, ...)) geosparql:wktLiteral → → ] ; # instead... get rid of blank node and use URI ago:hasGeometry ex:WesternAustraliaPolygon ; Instead, represent geometry as a resource via URIs that can be dereferenced to fetch the geometry data. blake.regalia@gmail.com Beyond simple points and bounding boxes, storing raw geometry in RDF literals should be reconsidered. osm:relation_2316598 rdfs:label "Western Australia"@en ; geosparql:hasGeometry [ geosparql:asWKT ‘<http://www.opengis.net/def/crs/EPSG/0/4326> POLYGON((128.9999986 -14.4290140, 128.9999714 -14.8798443, ...)) geosparql:wktLiteral → → ] ; # instead... get rid of blank node and use URI ago:hasGeometry ex:WesternAustraliaPolygon ; Instead, represent geometry as a resource via URIs that can be dereferenced to fetch the geometry data. 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Geometry is a Resource; Represent them w/ URIs 1. Instead, geometry should be represented as a resource that can be dereferenced to fetch the geometry data.
  • 19. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Dereferencable URIs: clients are free to fetch geometry in a variety of formats. curl "http://ex.co/geometry/polygon?id=42" -H "Accept: $MIME_TYPE" MIME Type Description Returns text/html Web interface <!DOCTYPE html><html lang="en">... text/plain Well-Known Text POLYGON((113.1016 -38.062 ...)) application/gml+xml GML <gml:Polygon><gml:Exterior>... application/vnd.geo+json GeoJSON {"type":"Polygon","coordinates":...} application/octet-stream Well-Known Binary 01 06 00 00 20 E6 10 00 00 01... blake.regalia@gmail.com Dereferencable URIs: clients are free to fetch geometry in a variety of formats. curl "http://ex.co/geometry/polygon?id=42" -H "Accept: $MIME_TYPE" MIME Type Description Returns text/html Web interface <!DOCTYPE html><html lang="en">... text/plain Well-Known Text POLYGON((113.1016 -38.062 ...)) application/gml+xml GML <gml:Polygon><gml:Exterior>... application/vnd.geo+json GeoJSON {"type":"Polygon","coordinates":...} application/octet-stream Well-Known Binary 01 06 00 00 20 E6 10 00 00 01... 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Geometry is a Resource; Represent them w/ URIs 1. Having the ability to choose a geometry serialization format is not only convenient for clients, but it can also have a tremendous impact on the performance of web applications who right now have to parse Well-Known Text – usually to convert it into GeoJSON before doing anything interesting with the data. 2. Same goes for the transmission of large amounts of geodata such as Well-Known Binary which takes up about 30% less storage than Well-Known Text. 3. With content-negotation, clients can dereference a URI with certain media types in the ‘Accept’ header to settle on a format that is suitable for their needs.
  • 20. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments base URI geometry id http://ex.co/geometry/point?id=1337#128.988_-14.429 geometry type coordinate value http://ex.co/geometry/polygon?id=7331#128.113_-14.221,126.188_-19.212 bounding box blake.regalia@gmail.com base URI geometry id http://ex.co/geometry/point?id=1337#128.988_-14.429 geometry type coordinate value http://ex.co/geometry/polygon?id=7331#128.113_-14.221,126.188_-19.212 bounding box 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Geometry is a Resource; Represent them w/ URIs 1. For our dataset, we mint the URIs such that they contain metadata about the geometry they represent. 2. Just by looking at an identifier, someone can determine that it represents a geometry, the type of geometry it represents, and the WGS84 bounding box of the feature. 3. To give a use case example, a web application could build a spatial index from just the URIs alone by using their bounding box metadata, and then selectively download geometries within areas of interest – such as the current map viewport.
  • 21. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Comparison of strategies towards storing and using geometry: Trait GeoSPARQL NeoGeo AGO Efficient geometry storage Geometry can persist externally 1 Content-negotiation for geometry format Uniform RDF structure Composite geometries Determine geometry type 2 Access bounding box 2 Access raw geometry 2 1 = Geometry can persist in a local geodatabase or even on a remote system and without copies. 2 = From the triples’ RDF data alone (e.g., without using SPARQL). blake.regalia@gmail.com Comparison of strategies towards storing and using geometry: Trait GeoSPARQL NeoGeo AGO Efficient geometry storage Geometry can persist externally 1 Content-negotiation for geometry format Uniform RDF structure Composite geometries Determine geometry type 2 Access bounding box 2 Access raw geometry 2 1 = Geometry can persist in a local geodatabase or even on a remote system and without copies. 2 = From the triples’ RDF data alone (e.g., without using SPARQL). 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Geometry is a Resource; Represent them w/ URIs 1. Through a combination of techniques, we’re able to save almost all the benefits of both GeoSPARQL and NeoGeo – with the one drawback of accessing the geometry data inline with the RDF, which has its own set of problems as discussed earlier. 2. An added benefit to using URIs is that the geometry data can persist external to the triplestore – it can reside in its original form on a local geodatabase, or it may even exist somewhere else on a remote system
  • 22. Precompute Your Topological Relations! Precompute Your Topological Relations! 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Precompute Your Topological Relations!
  • 23. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments blake.regalia@gmail.com 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Precompute Your Topological Relations! 1. The other problem to GeoSPARQL that I brought up earlier was with topology. Computing relations on-demand can take way too long AND GeoSPARQL does not handle digitization errors. 2. But if we precompute topological relations, correcting digitization errors is back in the hands of domain-experts who can use the right methods to come up with those relations.
  • 24. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Strict Topological Relations avg. area/length of... # instances relation code left geometry right geometry 19,134 polygon touches polygon EC 960km2 2, 049km2 1,272 polygon overlaps polygon PO 321km2 2, 974km2 1,287 polygon tpp polygon TPP 57km2 2, 653km2 2,577 polygon ntpp polygon NTPP 16km2 3, 052km2 3 polygon equals polygon EQ 830m2 836m2 19,543 polyline touches polygon TCH 652km 701km2 7,871 polyline crosses polygon PTH 290km 2, 733km2 5,733 polyline within polygon INC 6.5km 6, 863km2 11,227 polyline crosses polyline 395km 688km blake.regalia@gmail.com Strict Topological Relations avg. area/length of... # instances relation code left geometry right geometry 19,134 polygon touches polygon EC 960km2 2, 049km2 1,272 polygon overlaps polygon PO 321km2 2, 974km2 1,287 polygon tpp polygon TPP 57km2 2, 653km2 2,577 polygon ntpp polygon NTPP 16km2 3, 052km2 3 polygon equals polygon EQ 830m2 836m2 19,543 polyline touches polygon TCH 652km 701km2 7,871 polyline crosses polygon PTH 290km 2, 733km2 5,733 polyline within polygon INC 6.5km 6, 863km2 11,227 polyline crosses polyline 395km 688km 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Precompute Your Topological Relations! 1. We tested this on a dataset we created by aligning Open Street Map polygons with their corresponding DBpedia resources for about 18.5k polygons and 8k polylines in the United States – which gave us a total of about 68k distinct topological relations 2. The reason we can get away with materializing so many relations is because they are all based on intersection which is limited by locality – so in other words, we don’t materialize relations for geometries that are disjoint, otherwise we would be adding (n choose 2) triples . 3. In fact, we can start by only materializing the lowest-level relations in one direction and then use reasoning to expand the graph – so for example, tangential proper part and non-tangential proper part are both subclasses of the ‘within’ relation.
  • 25. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments # a) GeoSPARQL select ?place where { dbr:Perth geosparql:hasGeometry [geosparql:asWKT ?wktA]. ?anyPlace geosparql:hasGeometry [geosparql:asWKT ?wktB]. filter(geof:sfWithin(?wktA, ?wktB)) } # b) Our ‘AGO’ approach prefix agt: http://awesemantic-geo.link/topology/ . select ?place where { dbr:Perth agt:within ?place. } blake.regalia@gmail.com # a) GeoSPARQL select ?place where { dbr:Perth geosparql:hasGeometry [geosparql:asWKT ?wktA]. ?anyPlace geosparql:hasGeometry [geosparql:asWKT ?wktB]. filter(geof:sfWithin(?wktA, ?wktB)) } # b) Our ‘AGO’ approach prefix agt: http://awesemantic-geo.link/topology/ . select ?place where { dbr:Perth agt:within ?place. } 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Precompute Your Topological Relations! 1. So to see what that looks like, instead of applying a filter at query time, we simply match the basic graph pattern for the specific precomputed topological relation we’re interested in. 2. Not only is this faster, but we’ve also effectively cleaned the data so the results we get will be more accurate.
  • 26. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments blake.regalia@gmail.com 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Precompute Your Topological Relations! 1. And with that consolidated city-county example from earlier, our preprocessing step caught these dirty polygons and materialized the city of Lynchburg and Moore County as having an EQUAL topological relation. 2. So now we’ve dealt with digitization errors and have sort of enriched the dataset with strict topological relations.
  • 27. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments blake.regalia@gmail.com 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Precompute Your Topological Relations! 1. But there’s a geographic problem – strict topological relations alone do not take into account uncertainty and vagueness principles. 2. So for example, the spatial extent of a town can be conceptually vauge – it depends on who you ask, when you ask them, and so forth. 3. With rivers and water bodies, the spatial extent is always changing depending on how much rainfall there has been and so on. 4. How can you say anything about the topology of these things? Well, you have to use broad boundaries – that is to say that the edges of those polygons are actually regions that can grow or shrink depending on what you are comparing them to. 5. So for a town, we can allow the boundary of the polygon to grow a little and the river’s boundary might grow or shrink as well – until they meet; then we can say that they are ‘broadly touching’; using a different predicate than the strict topological relations.
  • 28. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments blake.regalia@gmail.com 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Precompute Your Topological Relations! 1. And for cases that may exhibit cognitive relations, we can also create a set of topological predicates for those as well – such as to say that Brazil is ‘mostly within’ the southern hemisphere.
  • 29. 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web
  • 30. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments • Stop using blank nodes for geometry; adopt URIs! • RDF literals are cumbersone. Geometry is a resource, let it persist as such • What are those raw geometries needed for? On-demand computations do not scale for large geodatasets. Also, digitization errors, vagueness and uncertainty challenge topological queries – precomputing topology takes care of these issues blake.regalia@gmail.com • Stop using blank nodes for geometry; adopt URIs! • RDF literals are cumbersone. Geometry is a resource, let it persist as such • What are those raw geometries needed for? On-demand computations do not scale for large geodatasets. Also, digitization errors, vagueness and uncertainty challenge topological queries – precomputing topology takes care of these issues 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web 1. Stop using blank nodes for geometry! At the very least, you should always be using a dereferenceable URI instead. 2. While you’re at it, (if you can), remove RDF literals and let your geometries persist in a production geodatabase. 3. And finally, what are these raw geometries really needed for in RDF? We have this problem that on-demand computations are not scaling for large geodatasets 4. On top of that, we have problems with digitization errors, vagueness and uncertainty making topological queries difficult to answer accurately - so that they reflect their real-world equivalent. 5. Instead, knowledge graphs will see more benefit from precomputing those topological relations and taking into account the full range of interpretations that topology can have for geographic places.