Making spatial data discoverable and accessible on the Web
1. Spatial data on the Web
Linda van den Brink, Geonovum
November 22, 2018
@brinkwoman
#DataToBuildOn
2. The geospatial niche
Our “Spatial Data Infrastructure” is working well within our own domain, but is not successful in distributing our data more widely
3. I can’t find the data using my fav search engine
I can’t use the data because of lengthy, complex standards unknown to me and not supported by my tooling
I have great data! And it’s open!
4. “The Web is the World’s most successful vendor neutral distributed information system […] The ‘Web of data’ ranges from small amounts of data to vast datasets, and either which are open to all or restricted to a few. Data can be consumed by Web pages, downloaded for local processing, or accessed via network APIs that support remote processing [e.g. Web-services].”
W3C study of practices and tooling for Web data standardisation
https://www.w3.org/2017/12/odi-study/
5. Use of the Web platform’s standard tools:
• search engines
• browsers
• HTTP (and HTTPS)
• hypermedia / Web links
• delegation to applications via media types
• OpenAPI metadata (Swagger)
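As a rough illustration of "delegation to applications via media types", the sketch below picks a representation of one resource from the client's HTTP Accept header. The resource contents, the function names and the fallback to HTML are assumptions made for this example; the parsing is deliberately simplified (it ignores `type/*` wildcards and specificity rules).

```python
# Hypothetical representations of a single resource, keyed by media type.
REPRESENTATIONS = {
    "text/html": "<html><body><h1>Orange Man</h1></body></html>",
    "application/geo+json": (
        '{"type": "Feature", "geometry": null, '
        '"properties": {"name": "Orange Man"}}'
    ),
}


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, best first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(accept, available=REPRESENTATIONS, default="text/html"):
    """Return the media type to serve for the given Accept header."""
    for media_type, q in parse_accept(accept):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return default
    # Assumption: fall back to HTML instead of answering 406 Not Acceptable.
    return default
```

A browser that asks for `text/html` gets the readable page, while a mapping tool that asks for `application/geo+json` gets data it can process, both from the same URL.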
6. • Based on the general Data on the Web Best Practices
• Introducing a couple of essential concepts for spatial data on the Web →
7. Linked data is an approach to publishing data that puts linking at the core of data representation and uses Web linking to “weave data into a global graph”.
By identifying spatial things and other resources with URLs, we can link data describing those spatial things in the same way that Web pages are linked using hyperlinks.
We (both humans and software) can follow those links to find out more information and build an increasingly complete picture of the world around us.
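To make "following links" concrete, here is a minimal sketch that walks a tiny in-memory graph of linked descriptions. Everything in it is hypothetical: the `example.org` URLs, the `WEB` dictionary that stands in for real HTTP requests, and the `follow_links` function name.

```python
from collections import deque

# Hypothetical mini "Web of data": each URL identifies a resource and maps
# to a description whose "links" point at other resources.
WEB = {
    "https://example.org/id/orange-man": {
        "label": "Orange Man",
        "links": ["https://example.org/id/lyon"],
    },
    "https://example.org/id/lyon": {
        "label": "City of Lyon",
        "links": ["https://example.org/id/france"],
    },
    "https://example.org/id/france": {
        "label": "France",
        "links": ["https://example.org/id/lyon"],
    },
}


def follow_links(start, fetch=WEB.get, max_hops=2):
    """Breadth-first walk over linked resources, returning their labels."""
    seen = {start}
    queue = deque([(start, 0)])
    labels = []
    while queue:
        url, hops = queue.popleft()
        description = fetch(url)  # stands in for an HTTP GET
        if description is None:
            continue  # dangling link: nothing published at this URL
        labels.append(description["label"])
        if hops == max_hops:
            continue
        for target in description["links"]:
            if target not in seen:  # avoid loops in the graph
                seen.add(target)
                queue.append((target, hops + 1))
    return labels
```

Starting at the statue and following two hops collects the statue, the city and the country, which is the "increasingly complete picture" in miniature.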
8. This Linked Data approach is well described by the WEB-DATA 5-star scheme:
★ Linkable: use stable and discoverable global identifiers
★★ Parseable: use standardized data metamodels such as CSV, XML, RDF, or JSON
★★★ Understandable: use well-known or at least well-documented vocabularies/schemas
★★★★ Linked: link to other resources whenever possible
★★★★★ Usable: label your document with a license
9. Spatial Thing: “Anything with spatial extent, i.e. size, shape, or position, e.g. people, places, bowling balls, as well as abstract areas like cubes” [W3C_BASIC_GEO]
… or even a 5 metre tall orange statue of a man talking on the telephone (Orange Man at Cité Centre de Congrès de Lyon)
Feature: similar, but it is the digital representation rather than the actual entity
For example
12. https://www.wikidata.org/wiki/Q57783921
Best Practice 3: Link resources together to create the Web of data
https://www.wikidata.org/wiki/Q456
Resource about the City of Lyon
Link to related resource
HTML page with data embedded, e.g. schema.org
Link: next page
Link: prev page
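One way to express the "next page" and "prev page" links is an HTTP `Link` response header that uses the registered `next` and `prev` relation types. The sketch below builds such a header value; the `?page=` query parameter, the base URL and the function name are assumptions made for illustration.

```python
def link_header(base_url, page, last_page):
    """Build a Link header value pointing at the neighbouring pages."""
    links = []
    if page > 1:
        links.append(f'<{base_url}?page={page - 1}>; rel="prev"')
    if page < last_page:
        links.append(f'<{base_url}?page={page + 1}>; rel="next"')
    return ", ".join(links)


# Example: page 2 of 3 of a hypothetical feature collection.
header_value = link_header("https://example.org/features", 2, 3)
```

A client or crawler can then move through a large dataset page by page without knowing how the server constructs its URLs.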
15. Data.Next!
Make your data part of the ecosystem of the Web
• Discoverable through search engines
• Friendly for developers
16. Thank you
Linda van den Brink
@brinkwoman
#DataToBuildOn
l.vandenbrink@geonovum.nl
Editor's Notes
I’m from the geospatial domain, where we have lots of – nowadays open – data, always related to a location on Earth. About 20 years ago we moved from exchanging data on floppy disks to using what we call the Spatial Data Infrastructure. It uses the WWW as a transport mechanism and is supported by international geospatial standards (OGC), which are largely based on SOAP and XML. The paradigm is to publish dataset metadata on a portal, while the actual data can be viewed online or downloaded using web services. This works fine within our own community, but…
Meanwhile, web standards, architecture and practice have evolved separately. In current web architecture, the web is used as an interface, not a transport mechanism.
The “rest of the world” (meaning the web community and developers) expects different things on the Web nowadays. They seem to have trouble finding our data, and when they do find it, they have serious difficulty understanding and using it.
However, using the Web to distribute data seems like a good idea…
We can apply the methods and patterns familiar across the Web.
Last year, the joint OGC / W3C Spatial Data on the Web Working Group published a set of best practices to help data publishers do just that.
The editors included Jeremy Tandy of the UK Met Office and me.
So, to help you understand the Webby approach better, I want to highlight a few of the essential concepts…
At the core of the best practices is the concept of Linked Data.
Example – a spatial thing
So here we see a URL for the actual Orange Man in Lyon. I can talk about the Orange Man in my documents using the URL as a reference.
Obviously I can’t actually download the Orange Man through my browser (that would be magic!), so instead, the Web server will send me some useful information that describes the Orange Man.
To make the Orange Man findable through search engines, we publish some information at the URL: an HTML document that a web browser renders in readable form for humans, with embedded structured data for crawlers and other user agents.
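A minimal sketch of such a page follows: readable HTML for humans, plus a schema.org description embedded as JSON-LD for crawlers. The choice of the `Place` type, the `sameAs` property and the `place_page` function name are assumptions made for this example, not something the best practices prescribe.

```python
import json
from html import escape


def place_page(name, same_as, description):
    """Return an HTML page with an embedded schema.org JSON-LD description."""
    data = {
        "@context": "https://schema.org",
        "@type": "Place",  # assumed type for this illustration
        "name": name,
        "sameAs": same_as,
        "description": description,
    }
    # Escape "</" so the JSON can never close the script element early.
    json_ld = json.dumps(data, indent=2).replace("</", "<\\/")
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{escape(name)}</title>\n"
        f'<script type="application/ld+json">\n{json_ld}\n</script>\n'
        f"</head><body><h1>{escape(name)}</h1>"
        f"<p>{escape(description)}</p></body></html>"
    )


page = place_page(
    "Orange Man",
    "https://www.wikidata.org/wiki/Q57783921",
    "A 5 metre tall orange statue of a man talking on the telephone",
)
```

The browser shows the heading and paragraph, while a crawler can read the same facts from the `application/ld+json` script element without having to scrape the visible text.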
Then we link the information to other interesting resources.
Those were just illustrations of the first 3 best practices described in our document. There are 14 in total.
The _Spatial_ Data on the Web Best Practices are based on the general Data on the Web Best Practices, a W3C Recommendation that sets out the principles for any good web data publication.