Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Revisiting the Representation...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Position
1. Raw geometries fo...
A Brief History of Geometry on the Semantic Web
A Brief History of Geometry on the Semantic Web
2017-04-03
Revisiting the ...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
The W3C Basic Geo Vocabulary,...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Resource about London Heathro...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Needs of geospatial data not ...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
‘NeoGeo’ drafted a vocabulary...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
@prefix ngeo: <http://geovoca...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
dbr:Perth rdfs:label "Perth"@...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
“What is the most populated c...
The Current Shortcomings of GeoSPARQL
The Current Shortcomings of GeoSPARQL
2017-04-03
Revisiting the Representation of an...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
WKT Literal for Western Austr...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
filter(geof:sfTouches(?polygo...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
blake.regalia@gmail.com
2017-...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
blake.regalia@gmail.com
2017-...
Geometry is a Resource; Represent them w/ URIs
Geometry is a Resource; Represent them w/ URIs
2017-04-03
Revisiting the Re...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Beyond simple points and boun...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Beyond simple points and boun...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Dereferencable URIs: clients ...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
base URI geometry id
http://e...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Comparison of strategies towa...
Precompute Your Topological Relations!
Precompute Your Topological Relations!
2017-04-03
Revisiting the Representation of ...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
blake.regalia@gmail.com
2017-...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
Strict Topological Relations
...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
# a) GeoSPARQL
select ?place ...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
blake.regalia@gmail.com
2017-...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
blake.regalia@gmail.com
2017-...
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
blake.regalia@gmail.com
2017-...
2017-04-03
Revisiting the Representation of and Need for Raw Geometries on
the Linked Data Web
History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments
• Stop using blank nodes for ...
Upcoming SlideShare
Loading in …5
×

Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web

207 views

Published on

Geospatial data on the Semantic Web historically stems from using point geometries to represent the geographic locations of places. As the practice evolved in the Semantic Web community, a demand for more complex geometries and geospatial query capabilities came about as a consequence of integrating traditional GIS and geo-data into the Linked Data cloud. However, recent projects have revealed that, in practice, these established techniques have major shortcomings that limit their storage, transmission, and query potential. In this position paper, we examine these shortcomings, propose to treat geometries similar to how other binary data are stored and referenced on the Semantic Web, namely by representing them as resources via URIs instead of RDF literals, and demonstrate the utility of precomputing topological relations rather than computing them on-demand by arguing that end users are most often interested in topology and not raw geometries.

Published in: Science
  • Be the first to comment

  • Be the first to like this

Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web

  1. 1. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Blake Regalia1, Krzysztof Janowicz1, and Grant McKenzie2 2017/04/03 1STKO Lab, University of California, Santa Barbara, USA 2Department of Geographical Sciences, University of Maryland, USA blake.regalia@gmail.com Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Blake Regalia1, Krzysztof Janowicz1, and Grant McKenzie2 2017/04/03 1STKO Lab, University of California, Santa Barbara, USA 2Department of Geographical Sciences, University of Maryland, USA 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web 1. This column of the slideshow contains presenter notes that explains the slides for viewers who may be viewing a copy of the PDF post-presentation.
  2. 2. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Position 1. Raw geometries for complex features do not need to be, and perhaps should not be, encoded in RDF. Geometry should be represented as a dereferencable resource and offer content-negotiation for its different serializations. 2. Linked Data users are most often interested in topological relations rather than raw geometries, so what are the trade-offs of precomputing topological relations? blake.regalia@gmail.com Position 1. Raw geometries for complex features do not need to be, and perhaps should not be, encoded in RDF. Geometry should be represented as a dereferencable resource and offer content-negotiation for its different serializations. 2. Linked Data users are most often interested in topological relations rather than raw geometries, so what are the trade-offs of precomputing topological relations? 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Position
  3. 3. A Brief History of Geometry on the Semantic Web A Brief History of Geometry on the Semantic Web 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web
  4. 4. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments The W3C Basic Geo Vocabulary, circa 2003, set out to: “[explore] the possibilities of representing mapping/location data in RDF, and does not attempt to address many of the issues covered in the professional GIS world” Established a simple, minimalistic vocabulary for describing points with a latitude and longitude value using the WGS84 reference datum. Considered good enough for the basic needs of many! blake.regalia@gmail.com The W3C Basic Geo Vocabulary, circa 2003, set out to: “[explore] the possibilities of representing mapping/location data in RDF, and does not attempt to address many of the issues covered in the professional GIS world” Established a simple, minimalistic vocabulary for describing points with a latitude and longitude value using the WGS84 reference datum. Considered good enough for the basic needs of many! 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web
  5. 5. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Resource about London Heathrow Airport using W3C Basic Geo in 2003-01-10. blake.regalia@gmail.com Resource about London Heathrow Airport using W3C Basic Geo in 2003-01-10. 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web 1. W3C Basic Geo Vocabulary worked pretty well, for example among web developers, who just wanted to annotate web documents and XML resources manually. 2. However, one of its major shortcomings was that this vocabulary treated places as though they were locations. Meaning that conceptually, there was no distinction between the place on Earth and the geometric thing that can be used to represent its location.
  6. 6. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Needs of geospatial data not met by W3C Basic Geo: • Support for Coordinate Reference Systems (CRS) • Geometries beyond Point such as polylines, polygons, etc. • Separation of entity and geometry, which have cardinality of 1-to-many blake.regalia@gmail.com Needs of geospatial data not met by W3C Basic Geo: • Support for Coordinate Reference Systems (CRS) • Geometries beyond Point such as polylines, polygons, etc. • Separation of entity and geometry, which have cardinality of 1-to-many 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web 1. It became clear to the community that a more comprehensive vocabulary was needed.
  7. 7. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments ‘NeoGeo’ drafted a vocabulary that extended W3C Basic Geo to support more geometry types. It encoded a geometry’s entire structure in RDF: blake.regalia@gmail.com ‘NeoGeo’ drafted a vocabulary that extended W3C Basic Geo to support more geometry types. It encoded a geometry’s entire structure in RDF: 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web
  8. 8. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments @prefix ngeo: <http://geovocab.org/geometry#> . _:polygon rdf:type ngeo:Polygon ; ngeo:exterior [ rdf:type ngeo:LinearRing ; ngeo:posList ( [ geo:lat -29; geo:long 16 ] [ geo:lat -28; geo:long 33 ] [ geo:lat -34; geo:long 27 ] [ geo:lat -29; geo:long 16 ] ) ] ; ngeo:interior [ rdf:type ngeo:LinearRing ; ngeo:posList ( [ geo:lat -29.5; geo:long 27 ] [ geo:lat -28.5; geo:long 28.5 ] [ geo:lat -31; geo:long 28 ] [ geo:lat -29.5; geo:long 27 ] ) ] . blake.regalia@gmail.com @prefix ngeo: <http://geovocab.org/geometry#> . _:polygon rdf:type ngeo:Polygon ; ngeo:exterior [ rdf:type ngeo:LinearRing ; ngeo:posList ( [ geo:lat -29; geo:long 16 ] [ geo:lat -28; geo:long 33 ] [ geo:lat -34; geo:long 27 ] [ geo:lat -29; geo:long 16 ] ) ] ; ngeo:interior [ rdf:type ngeo:LinearRing ; ngeo:posList ( [ geo:lat -29.5; geo:long 27 ] [ geo:lat -28.5; geo:long 28.5 ] [ geo:lat -31; geo:long 28 ] [ geo:lat -29.5; geo:long 27 ] ) ] . 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web 1. Their approach was to encode a geometry’s entire structure as RDF 2. However, NeoGeo still lacked support for coordinate reference systems and it was criticized for its excessive creation of blank nodes (high overhead with little use case).
  9. 9. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments dbr:Perth rdfs:label "Perth"@en ; geosparql:hasGeometry [ geosparql:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326> POINT(115.85888671875 31.952222824097)"^^geosparql:wktLiteral→ ] ; GeoSPARQL addressed all the issues because it: • separated places from their geometric representations via a blank node • joined CRS, latitude and longitude values (in fact the entire geometry) into a single RDF literal. blake.regalia@gmail.com dbr:Perth rdfs:label "Perth"@en ; geosparql:hasGeometry [ geosparql:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326> POINT(115.85888671875 31.952222824097)"^^geosparql:wktLiteral→ ] ; GeoSPARQL addressed all the issues because it: • separated places from their geometric representations via a blank node • joined CRS, latitude and longitude values (in fact the entire geometry) into a single RDF literal. 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web
  10. 10. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments “What is the most populated city in Western Australia?” select ?place where { ?place a dbo:City ; dbo:populationTotal ?population ; geosparql:hasGeometry [ geosparql:asWKT ?placeWKT ] . ?westernAustralia rdfs:label "Western Australia"@en ; geosparql:hasGeometry [ geosparql:asWKT ?westernAustraliaWKT ] . filter(geof:sfWithin(?placeWKT, ?westernAustraliaWKT)) } order by desc(?population) limit 1 Geospatial processing within SPARQL queries! Perhaps most useful are the topological functions: sfTouches, sfWithin, etc. blake.regalia@gmail.com “What is the most populated city in Western Australia?” select ?place where { ?place a dbo:City ; dbo:populationTotal ?population ; geosparql:hasGeometry [ geosparql:asWKT ?placeWKT ] . ?westernAustralia rdfs:label "Western Australia"@en ; geosparql:hasGeometry [ geosparql:asWKT ?westernAustraliaWKT ] . filter(geof:sfWithin(?placeWKT, ?westernAustraliaWKT)) } order by desc(?population) limit 1 Geospatial processing within SPARQL queries! Perhaps most useful are the topological functions: sfTouches, sfWithin, etc. 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web A Brief History of Geometry on the Semantic Web
  11. 11. The Current Shortcomings of GeoSPARQL The Current Shortcomings of GeoSPARQL 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web The Current Shortcomings of GeoSPARQL
  12. 12. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments WKT Literal for Western Australia is 61 KB of ‘human-readable’ markup text: blake.regalia@gmail.com WKT Literal for Western Australia is 61 KB of ‘human-readable’ markup text: 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web The Current Shortcomings of GeoSPARQL 1. This is an RDF literal, it is a string – so it should be stored in its entirety and indexed as a string in the triplestore’s underlying relational database. Surely any GeoSPARQL implementation will choose to create a spatially indexed copy of the geometry in a binary format, but it will be a copy. 2. So it brings up the question, why do we need RDF literals for geometry? Are people really doing fulltext searches on these strings? If anything, they might be applying a filter to the Well-Known Text to select only certain types of geometries (such as point, polyline or polygon).
  13. 13. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments filter(geof:sfTouches(?polygonA, ?polygonB)) No results. blake.regalia@gmail.com filter(geof:sfTouches(?polygonA, ?polygonB)) No results. 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web The Current Shortcomings of GeoSPARQL 1. These two geometries taken from Open Street Map are two adjacent counties in the United States. But notice how they don’t line up perfectly. 2. Well this sort issue is so common with Geographic Information Systems that we have names for it, like ‘sliver polygons’. Whether the coordinate data comes from GPS or point-and-click, the edges are never exactly perfect. More generally, we call this a digitization error. 3. Now normally, in a GIS, these problems are fixed so that topologically, they reflect their real-world interpretation. 4. If you were to ask GeoSPARQL about this relation, it would return ‘overlapping’ – which is not what you should expect since in fact, those two counties are actually ‘touching’ 5. But the ‘touches’ function fails us here because GeoSPARQL does not handle dirty polygons.
  14. 14. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments blake.regalia@gmail.com 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web The Current Shortcomings of GeoSPARQL 1. Here we have two different features extracted from Open Street Map. One of them in blue is the geometry for a city, and the other one in red is that of a county. 2. In real-life, the city of Lynchburg and Moore County actually have an identical spatial extent. However, they were digitized independently on Open Street Map and one of them was digitized to a finer resolution. 3. Again, if you asked GeoSPARQL about the topology here, it returns ‘overlapping’, when in reality, it should be ‘equal’. 4. The problem is that GeoSPARQL offers no solutions when dealing with geometries like this – perhaps because its assumed that the data has already been cleaned by the data providers. 5. But on the Linked Data Web, geometries can come from a variety of different sources, and their geometries will certainly almost never be in perfect alignmeent (such as with Linked GeoData, who sources their geometries from Open Street Map)
  15. 15. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments blake.regalia@gmail.com 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web The Current Shortcomings of GeoSPARQL 1. Finally, computing topological relations on-demand can be very expensive. 2. This polygon is made up of 4.5k rings and 487.k points – and is a nightmare for finding features ‘within’ and ‘touching’ 3. Geometries like this are common in geographic datasets containing natural features. On-demand topology does not scale in these cases – performance suffers to the point that it becomes unusable.
  16. 16. Geometry is a Resource; Represent them w/ URIs Geometry is a Resource; Represent them w/ URIs 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Geometry is a Resource; Represent them w/ URIs
  17. 17. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Beyond simple points and bounding boxes, storing raw geometry in RDF literals should be reconsidered. osm:relation_2316598 rdfs:label "Western Australia"@en ; geosparql:hasGeometry [ geosparql:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326> POLYGON((128.9999986 -14.4290140, 128.9999714 -14.8798443, ...))"^^geosparql:wktLiteral → → ] ; # instead... Instead, represent geometry as a resource via URIs that can be dereferenced to fetch the geometry data. blake.regalia@gmail.com Beyond simple points and bounding boxes, storing raw geometry in RDF literals should be reconsidered. osm:relation_2316598 rdfs:label "Western Australia"@en ; geosparql:hasGeometry [ geosparql:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326> POLYGON((128.9999986 -14.4290140, 128.9999714 -14.8798443, ...))"^^geosparql:wktLiteral → → ] ; # instead... Instead, represent geometry as a resource via URIs that can be dereferenced to fetch the geometry data. 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Geometry is a Resource; Represent them w/ URIs 1. Storing the raw geometry in RDF literals should be reconsidered for complex geometries.
  18. 18. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Beyond simple points and bounding boxes, storing raw geometry in RDF literals should be reconsidered. osm:relation_2316598 rdfs:label "Western Australia"@en ; geosparql:hasGeometry [ geosparql:asWKT ‘<http://www.opengis.net/def/crs/EPSG/0/4326> POLYGON((128.9999986 -14.4290140, 128.9999714 -14.8798443, ...)) geosparql:wktLiteral → → ] ; # instead... get rid of blank node and use URI ago:hasGeometry ex:WesternAustraliaPolygon ; Instead, represent geometry as a resource via URIs that can be dereferenced to fetch the geometry data. blake.regalia@gmail.com Beyond simple points and bounding boxes, storing raw geometry in RDF literals should be reconsidered. osm:relation_2316598 rdfs:label "Western Australia"@en ; geosparql:hasGeometry [ geosparql:asWKT ‘<http://www.opengis.net/def/crs/EPSG/0/4326> POLYGON((128.9999986 -14.4290140, 128.9999714 -14.8798443, ...)) geosparql:wktLiteral → → ] ; # instead... get rid of blank node and use URI ago:hasGeometry ex:WesternAustraliaPolygon ; Instead, represent geometry as a resource via URIs that can be dereferenced to fetch the geometry data. 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Geometry is a Resource; Represent them w/ URIs 1. Instead, geometry should be represented as a resource that can be dereferenced to fetch the geometry data.
  19. 19. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Dereferencable URIs: clients are free to fetch geometry in a variety of formats. curl "http://ex.co/geometry/polygon?id=42" -H "Accept: $MIME_TYPE" MIME Type Description Returns text/html Web interface <!DOCTYPE html><html lang="en">... text/plain Well-Known Text POLYGON((113.1016 -38.062 ...)) application/gml+xml GML <gml:Polygon><gml:Exterior>... application/vnd.geo+json GeoJSON {"type":"Polygon","coordinates":...} application/octet-stream Well-Known Binary 01 06 00 00 20 E6 10 00 00 01... blake.regalia@gmail.com Dereferencable URIs: clients are free to fetch geometry in a variety of formats. curl "http://ex.co/geometry/polygon?id=42" -H "Accept: $MIME_TYPE" MIME Type Description Returns text/html Web interface <!DOCTYPE html><html lang="en">... text/plain Well-Known Text POLYGON((113.1016 -38.062 ...)) application/gml+xml GML <gml:Polygon><gml:Exterior>... application/vnd.geo+json GeoJSON {"type":"Polygon","coordinates":...} application/octet-stream Well-Known Binary 01 06 00 00 20 E6 10 00 00 01... 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Geometry is a Resource; Represent them w/ URIs 1. Having the ability to choose a geometry serialization format is not only convenient for clients, but it can also have a tremendous impact on the performance of web applications who right now have to parse Well-Known Text – usually to convert it into GeoJSON before doing anything interesting with the data. 2. Same goes for the transmission of large amounts of geodata such as Well-Known Binary which takes up about 30% less storage than Well-Known Text. 3. With content-negotation, clients can dereference a URI with certain media types in the ‘Accept’ header to settle on a format that is suitable for their needs.
  20. 20. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments base URI geometry id http://ex.co/geometry/point?id=1337#128.988_-14.429 geometry type coordinate value http://ex.co/geometry/polygon?id=7331#128.113_-14.221,126.188_-19.212 bounding box blake.regalia@gmail.com base URI geometry id http://ex.co/geometry/point?id=1337#128.988_-14.429 geometry type coordinate value http://ex.co/geometry/polygon?id=7331#128.113_-14.221,126.188_-19.212 bounding box 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Geometry is a Resource; Represent them w/ URIs 1. For our dataset, we mint the URIs such that they contain metadata about the geometry they represent. 2. Just by looking at an identifier, someone can determine that it represents a geometry, the type of geometry it represents, and the WGS84 bounding box of the feature. 3. To give a use case example, a web application could build a spatial index from just the URIs alone by using their bounding box metadata, and then selectively download geometries within areas of interest – such as the current map viewport.
  21. 21. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Comparison of strategies towards storing and using geometry: Trait GeoSPARQL NeoGeo AGO Efficient geometry storage Geometry can persist externally 1 Content-negotiation for geometry format Uniform RDF structure Composite geometries Determine geometry type 2 Access bounding box 2 Access raw geometry 2 1 = Geometry can persist in a local geodatabase or even on a remote system and without copies. 2 = From the triples’ RDF data alone (e.g., without using SPARQL). blake.regalia@gmail.com Comparison of strategies towards storing and using geometry: Trait GeoSPARQL NeoGeo AGO Efficient geometry storage Geometry can persist externally 1 Content-negotiation for geometry format Uniform RDF structure Composite geometries Determine geometry type 2 Access bounding box 2 Access raw geometry 2 1 = Geometry can persist in a local geodatabase or even on a remote system and without copies. 2 = From the triples’ RDF data alone (e.g., without using SPARQL). 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Geometry is a Resource; Represent them w/ URIs 1. Through a combination of techniques, we’re able to save almost all the benefits of both GeoSPARQL and NeoGeo – with the one drawback of accessing the geometry data inline with the RDF, which has its own set of problems as discussed earlier. 2. An added benefit to using URIs is that the geometry data can persist external to the triplestore – it can reside in its original form on a local geodatabase, or it may even exist somewhere else on a remote system
  22. 22. Precompute Your Topological Relations! Precompute Your Topological Relations! 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Precompute Your Topological Relations!
  23. 23. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments blake.regalia@gmail.com 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Precompute Your Topological Relations! 1. The other problem to GeoSPARQL that I brought up earlier was with topology. Computing relations on-demand can take way too long AND GeoSPARQL does not handle digitization errors. 2. But if we precompute topological relations, correcting digitization errors is back in the hands of domain-experts who can use the right methods to come up with those relations.
  24. 24. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments Strict Topological Relations avg. area/length of... # instances relation code left geometry right geometry 19,134 polygon touches polygon EC 960km2 2, 049km2 1,272 polygon overlaps polygon PO 321km2 2, 974km2 1,287 polygon tpp polygon TPP 57km2 2, 653km2 2,577 polygon ntpp polygon NTPP 16km2 3, 052km2 3 polygon equals polygon EQ 830m2 836m2 19,543 polyline touches polygon TCH 652km 701km2 7,871 polyline crosses polygon PTH 290km 2, 733km2 5,733 polyline within polygon INC 6.5km 6, 863km2 11,227 polyline crosses polyline 395km 688km blake.regalia@gmail.com Strict Topological Relations avg. area/length of... # instances relation code left geometry right geometry 19,134 polygon touches polygon EC 960km2 2, 049km2 1,272 polygon overlaps polygon PO 321km2 2, 974km2 1,287 polygon tpp polygon TPP 57km2 2, 653km2 2,577 polygon ntpp polygon NTPP 16km2 3, 052km2 3 polygon equals polygon EQ 830m2 836m2 19,543 polyline touches polygon TCH 652km 701km2 7,871 polyline crosses polygon PTH 290km 2, 733km2 5,733 polyline within polygon INC 6.5km 6, 863km2 11,227 polyline crosses polyline 395km 688km 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Precompute Your Topological Relations! 1. We tested this on a dataset we created by aligning Open Street Map polygons with their corresponding DBpedia resources for about 18.5k polygons and 8k polylines in the United States – which gave us a total of about 68k distinct topological relations 2. The reason we can get away with materializing so many relations is because they are all based on intersection which is limited by locality – so in other words, we don’t materialize relations for geometries that are disjoint, otherwise we would be adding (n choose 2) triples . 3. In fact, we can start by only materializing the lowest-level relations in one direction and then use reasoning to expand the graph – so for example, tangential proper part and non-tangential proper part are both subclasses of the ‘within’ relation.
  25. 25. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments # a) GeoSPARQL select ?place where { dbr:Perth geosparql:hasGeometry [geosparql:asWKT ?wktA]. ?anyPlace geosparql:hasGeometry [geosparql:asWKT ?wktB]. filter(geof:sfWithin(?wktA, ?wktB)) } # b) Our ‘AGO’ approach prefix agt: http://awesemantic-geo.link/topology/ . select ?place where { dbr:Perth agt:within ?place. } blake.regalia@gmail.com # a) GeoSPARQL select ?place where { dbr:Perth geosparql:hasGeometry [geosparql:asWKT ?wktA]. ?anyPlace geosparql:hasGeometry [geosparql:asWKT ?wktB]. filter(geof:sfWithin(?wktA, ?wktB)) } # b) Our ‘AGO’ approach prefix agt: http://awesemantic-geo.link/topology/ . select ?place where { dbr:Perth agt:within ?place. } 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Precompute Your Topological Relations! 1. So to see what that looks like, instead of applying a filter at query time, we simply match the basic graph pattern for the specific precomputed topological relation we’re interested in. 2. Not only is this faster, but we’ve also effectively cleaned the data so the results we get will be more accurate.
  26. 26. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments blake.regalia@gmail.com 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Precompute Your Topological Relations! 1. And with that consolidated city-county example from earlier, our preprocessing step caught these dirty polygons and materialized the city of Lynchburg and Moore County as having an EQUAL topological relation. 2. So now we’ve dealt with digitization errors and have sort of enriched the dataset with strict topological relations.
  27. 27. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments blake.regalia@gmail.com 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Precompute Your Topological Relations! 1. But there’s a geographic problem – strict topological relations alone do not take into account uncertainty and vagueness principles. 2. So for example, the spatial extent of a town can be conceptually vauge – it depends on who you ask, when you ask them, and so forth. 3. With rivers and water bodies, the spatial extent is always changing depending on how much rainfall there has been and so on. 4. How can you say anything about the topology of these things? Well, you have to use broad boundaries – that is to say that the edges of those polygons are actually regions that can grow or shrink depending on what you are comparing them to. 5. So for a town, we can allow the boundary of the polygon to grow a little and the river’s boundary might grow or shrink as well – until they meet; then we can say that they are ‘broadly touching’; using a different predicate than the strict topological relations.
  28. 28. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments blake.regalia@gmail.com 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web Precompute Your Topological Relations! 1. And for cases that may exhibit cognitive relations, we can also create a set of topological predicates for those as well – such as to say that Brazil is ‘mostly within’ the southern hemisphere.
  29. 29. 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web
  30. 30. History of Geometry Shortcomings of GeoSPARQL Use URIs Precomputed Topology Backup Arguments • Stop using blank nodes for geometry; adopt URIs! • RDF literals are cumbersone. Geometry is a resource, let it persist as such • What are those raw geometries needed for? On-demand computations do not scale for large geodatasets. Also, digitization errors, vagueness and uncertainty challenge topological queries – precomputing topology takes care of these issues blake.regalia@gmail.com • Stop using blank nodes for geometry; adopt URIs! • RDF literals are cumbersone. Geometry is a resource, let it persist as such • What are those raw geometries needed for? On-demand computations do not scale for large geodatasets. Also, digitization errors, vagueness and uncertainty challenge topological queries – precomputing topology takes care of these issues 2017-04-03 Revisiting the Representation of and Need for Raw Geometries on the Linked Data Web 1. Stop using blank nodes for geometry! At the very least, you should always be using a dereferenceable URI instead. 2. While you’re at it, (if you can), remove RDF literals and let your geometries persist in a production geodatabase. 3. And finally, what are these raw geometries really needed for in RDF? We have this problem that on-demand computations are not scaling for large geodatasets 4. On top of that, we have problems with digitization errors, vagueness and uncertainty making topological queries difficult to answer accurately - so that they reflect their real-world equivalent. 5. Instead, knowledge graphs will see more benefit from precomputing those topological relations and taking into account the full range of interpretations that topology can have for geographic places.

×