Interconnecting Belgian national and regional address data using EC ISA "Location" Core Vocabulary
My presentation to the AGI Addressing SIG meeting at Clore Management Centre, Birkbeck College, 18 April 2013 - "The Future of Address Management in the United Kingdom"

  • Whether machine or human, there’s a stepwise process of signal characterisation and analysis prior to reasoning.
  • The ‘shared representation’ is a form of contract. Sharing this representation is the basis of semantic interoperability.
  • [EIF] European Interoperability Framework http://ec.europa.eu/isa/documents/isa_annex_ii_eif_en.pdf
  • https://joinup.ec.europa.eu/node/43160
  • The Core Location Vocabulary is a simplified, reusable and extensible data model that captures the fundamental characteristics of a location, represented as an address, a geographic name, or a geometry [Location]. It is specified as a UML Static Model, an RDF Schema, and an XML Schema.
    The vocabulary was developed in the period December 2011 – May 2012 by a multidisciplinary Working Group of 69 people from 22 countries (18 EU and 4 non-EU: USA, South Africa, Norway and Croatia) and several EU Institutions. The Working Group was co-chaired by Paul Smits, Andrea Perego, and Michael Lutz of the European Commission INSPIRE team.
    On 23 May 2012, the Coordination Group of the ISA Programme endorsed version 1.00 of the combined specification of the Core Business, Core Location and Core Person Vocabulary [Business, Location, Person]. Although endorsement does not make the specifications legally binding, it is an important milestone, as the EU Member States acknowledge the work and commit to further exploit and disseminate it at national level.
    The W3C Location and Addresses Community Group [locadd] is to review existing efforts such as the Core Location Vocabulary and assess whether any use cases would be served by harmonization and/or new standardization work. It may produce specifications or use cases and requirements documents, which may be proposed for adoption by the W3C Government Linked Data (GLD) Working Group.
    2012-05-30, "ISA Member State representatives endorse key specifications for e-Government interoperability" http://joinup.ec.europa.eu/news/isa-member-state-representatives-endorse-key-specifications-e-government-interoperability
  • There are lightweight relationships between the classes
  • The INSPIRE model has a much tighter set of contractual relationships between the classes
  • To date, the public sector has not yet tapped into the full potential of its base address registers. In Belgium, for example, the use of address data is impeded by the following obstacles:
    • Address data fragmentation. The address data at Belgian federal level and at the three regions is housed in isolated registries maintained by the National civil register, AGIV, CIRB, and SPW.
    • Heterogeneous address data formats. Address data is provided using different specifications.
    • Lack of common identifiers. Addresses, administrative units, roads, buildings, and cadastral parcels are not identified by well-formed identifiers, thus making it hard to reconcile data about the same entity coming from different sources.
    This situation is depicted in Figure 1. Due to these obstacles, consumers of address data such as national, regional, and local public administrations, businesses, and citizens make limited use of the aforementioned registers. Address data is duplicated and therefore fragmented in many different registers.
  • In the period November 2012 – February 2013, we carried out a pilot to demonstrate that the Core Location Vocabulary and related INSPIRE data specifications on addresses can be applied to aggregate address data from various sources and contribute to overcoming the aforementioned obstacles. In particular, the pilot entails the following steps:
    • Develop (provisional) URI sets enabling Belgian addresses to be uniquely identified and looked up on the Web by well-formed HTTP URIs;
    • Represent existing address data from the federal and regional road and address registers using the Core Location vocabulary and experimental INSPIRE RDF vocabularies;
    • Put in place a linked data infrastructure that allows querying harmonised Belgian addresses from a SPARQL endpoint (see Figure 3);
    • Demonstrate the value of the linked data infrastructure to disambiguate, look up, and link address data using simple Web-based standards such as HTTP, XML, and RDF.
  • The infrastructure is based on OpenLink Virtuoso [Virtuoso]. This is an open-source middleware and database management system that provides access to relational, RDF, XML, and text-based data. A particularly salient feature of OpenLink Virtuoso is its “Linked Data Views”: these define relational-to-RDF mappings that let Virtuoso’s SPARQL processor access the relational database tables at run time, without physically regenerating RDF data sets from SQL data. Linked Data Views therefore make it possible to run a Linked Data infrastructure on top of an existing relational database infrastructure. In addition, RDF data can be stored and manipulated in Virtuoso’s native RDF Quad Store.
  • The design patterns for the URI sets come from the UK Government “Designing URI Sets for Location” http://data.gov.uk/sites/default/files/Designing_URI_Sets_for_Location-V1.0.pdf
  • The design of URIs uses codes from the INSPIRE world
  • In preparing the thin semantic overlay (using Virtuoso), key items in the relational databases are used either to create URIs or as string literals, and are related to each other in appropriate ways using the Location Core Vocabulary terms or terms from other RDF vocabularies.

Presentation Transcript

  • Interconnecting Belgian National and Regional Address Data
    Interoperability Solutions for European Public Administrations (ISA) Programme
    Stijn Goedertier: PwC Belgium
    Peter Winstanley: Scottish Government
  • ...whether it's machines or people
  • Triangle of Meaning
  • Schema as Contract
    • Schemas determine what data is stored
    • Schemas determine how data is stored
    • A database or XML Schema represents a contract for information interchange
    • In a traditional DB/XML world, without a Schema/contract there is no scope for communication
    • The Schema tells you what is possible
  • Address Stores ..
    • Tend to be Relational Models
    • Serialisations – XML
    • INSPIRE – founded in UML and XML
    • Exchanging addresses between machines is problematic in great part because of Schemas and the ETL process.
  • Political context: European Interoperability Framework
  • European Interoperability Framework
    • Recommendation 12. Public administrations, when working to establish European public services, should develop interfaces to authentic sources and align them at semantic and technical level.
  • Core vocabularies
    Simplified, reusable, and extensible data models that capture the fundamental characteristics of a data entity in a context-neutral fashion.
    [Figure labels: CORE VOCABULARY, PUBLIC SERVICE]
    https://joinup.ec.europa.eu/node/43160
  • Core Location Vocabulary
    • A simplified, reusable and extensible data model that captures the fundamental characteristics of a location, represented as an address, a geographic name, or a geometry.
    • Developed in the period December 2011 – May 2012 by a multidisciplinary Working Group
  • Core Location Vocabulary
    • co-chairs: Michael Lutz, Paul Smits, Andrea Perego (DG JRC)
    • editor: Phil Archer (W3C)
    • task force: Segun Alayande, Adam Arndt, Joseph Azzopardi, Chirsina Bapst, Serena Coetzee, Andreas Gehlert, Giorgios Georgiannakis, Anja Hopfstock, Andreas Illert, Michaela Elisa Jackson, Morten Lind, Matthias Lüttgert, Andras Micsik, Piotr Piotrowski, Greg Potterton, Peter Schmitz, Raj Singh, Athina Trakas, Rob Walker, Stuart Williams, Peter Winstanley, ...
  • Core Location Vocabulary available in various formats
    • RDF Schema: re-uses existing Linked Data vocabularies, notably Dublin Core and FOAF
    • XML Schema: builds on Universal Business Language; re-uses information elements provided by the Core Components Technical Specification (CCTS) of UN/CEFACT
    All specifications are released under the “ISA Open Metadata Licence v1.1” https://joinup.ec.europa.eu/category/licence/isa-open-metadata-licence-v11
  • INSPIRE data specifications
    • Core Location can be seen as a subset of the INSPIRE address specification, as it is based on the INSPIRE AddressRepresentation class.
    • INSPIRE XML vs Location RDF representation.
    • Location CV and INSPIRE are complementary
    • A linked data service can be implemented on top of an INSPIRE representation.
  • Core Location Vocabulary data model
  • INSPIRE Address Specification
  • Today address data is fragmented across various registers
    Registers: UrBIS (Brussels Capital Region), CRAB (Flanders), PICC (Wallonia), Civil register, NGI (National Geographic Institute)
    Obstacles: data fragmentation, heterogeneous data formats, lack of common identifiers
    What reaches the data consumer: unlinked, low quality, non-interoperable
  • The pilot demonstrates feasibility of Linked Data
    [Figure: LOGD infrastructure. Sample address data in native format from UrBIS (Brussels Capital Region), CRAB (Flanders), PICC (Wallonia), the Civil register and NGI (National Geographic Institute) is mapped to common data models (INSPIRE) as linked address data, held in an RDF repository and exposed through a SPARQL endpoint. Data-consumer-oriented use cases: look up an address identifier, disambiguate an address notation, link.]
  • Three use cases for data consumers
    • Look up (de-reference) an address identifier
      Example: http://data.gov.be/so/ad/Address/00BR/9346-237 (fictitious)
    • Disambiguate (reconcile) an address notation
      Example: Maria-Theresiastraat 1, 1000 Brussel
    • Link datasets by means of address identifiers
  • Prevent fragmentation of address data
  • Technical architecture
    [Figure: OpenLink Virtuoso, comprising a SQL Processor, a SPARQL Processor, a Web Application Server and an RDF Quad Store, sits over a relational database and an external database; Web Browsers and RDF Clients connect over HTTP.]
  • Cookbook
    • Develop (provisional) URI sets: HTTP URIs that dereference and uniquely identify Belgian addresses/locations;
    • Represent existing address data using the Core Location RDF vocabulary and experimental INSPIRE RDF vocabularies;
    • Put in place a linked data infrastructure that will allow querying from a SPARQL endpoint;
    • Demonstrate the value of the linked data infrastructure to disambiguate, look up, and link address data.
  • Cool URI Patterns
    Spatial things and corresponding information resources
      Spatial thing:        http://{domain.name}/id/{type}/{namespace}/{localId}
      Information resource: http://{domain.name}/doc/{type}/{namespace}/{localId}
    Spatial objects and corresponding information resources
      Spatial object:       http://{domain.name}/so/{theme}/{class}/{namespace}/{localId}
      Information resource: http://{domain.name}/doc/{theme}/{class}/{namespace}/{localId}
    Geometries and corresponding information resources
      Geometry:             http://{domain.name}/id/geometry/{namespace}/{localId}
      Information resource: http://{domain.name}/doc/geometry/{namespace}/{localId}
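The URI patterns on this slide can be sketched as small string-building helpers. This is only an illustration: the function names are hypothetical, the base domain is taken from the sample URIs elsewhere in this deck, and percent-encoding each path segment is an assumption rather than something the slides specify.

```python
from urllib.parse import quote

# Domain taken from the sample URIs in this deck; treat it as a placeholder.
BASE = "http://location.testproject.eu"


def _segments(*parts):
    """Percent-encode each path segment (an assumption, not stated in the slides)."""
    return "/".join(quote(str(part), safe="") for part in parts)


def spatial_thing_uri(type_, namespace, local_id, base=BASE):
    """http://{domain.name}/id/{type}/{namespace}/{localId}"""
    return f"{base}/id/{_segments(type_, namespace, local_id)}"


def spatial_object_uri(theme, cls, namespace, local_id, base=BASE):
    """http://{domain.name}/so/{theme}/{class}/{namespace}/{localId}"""
    return f"{base}/so/{_segments(theme, cls, namespace, local_id)}"


def information_resource_uri(identifier_uri, base=BASE):
    """Map an /id/ or /so/ identifier to its /doc/ information resource."""
    for prefix in ("/id/", "/so/"):
        head = base + prefix
        if identifier_uri.startswith(head):
            return base + "/doc/" + identifier_uri[len(head):]
    raise ValueError(f"not an /id/ or /so/ URI under {base}: {identifier_uri}")
```

For example, `spatial_object_uri("ad", "Address", "AGIV", "2000017467")` follows the same pattern as the AGIV address URI shown in the "Example URIs" slide.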
  • Spatial Object: Themes
    Code   Description
    ad     INSPIRE Addresses Vocabulary
    au     INSPIRE Administrative Units Vocabulary
    cp     INSPIRE Cadastral Parcels
    ic     INSPIRE Common Model Vocabulary
    net    INSPIRE Networks Vocabulary
    rs     INSPIRE Coordinate Reference System
    tn     INSPIRE Transport Networks Vocabulary
    tnro   INSPIRE Road Transport Networks Vocabulary
    Table 5 – Code list for {theme} [INSPIRE]
  • Spatial Object: Classes
    INSPIRE Class: Description
    • Address: An identification of the fixed location of property by means of a structured composition of geographic names and identifiers.
    • AddressRepresentation: Representation of an address spatial object for use in external application schemas that need to include the basic address information in a readable way.
    • AdministrativeUnit: Unit of administration where a Member State has and/or exercises jurisdictional rights, for local, regional and national governance.
    • Road: A collection of road link sequences and/or individual road links that are characterized by one or more thematic identifiers and/or properties. Examples are roads characterized by a specific identification code, used by road management authorities, or tourist routes, identified by a specific name.
  • Organisation -> “namespace”
    Namespace code   Description
    AGIV             experimental INSPIRE namespace for AGIV.
    BPOST            experimental INSPIRE namespace for BPOST.
    CIRB             experimental INSPIRE namespace for CIRB.
    NGI              experimental INSPIRE namespace for NGI.
    RN               experimental INSPIRE namespace for the Civil Register.
    SPW              experimental INSPIRE namespace for SPW.
    STATBEL          experimental INSPIRE namespace for DGSEI.
  • Example URIs (concept, followed by sample URIs)
    Address (spatial thing)
      http://location.testproject.eu/id/address/AGIV/2000017467
      http://location.testproject.eu/id/address/CIRB/1232998
      http://location.testproject.eu/id/address/SPW/451463
    Address (spatial object)
      http://location.testproject.eu/so/ad/Address/AGIV/2000017467
      http://location.testproject.eu/so/ad/Address/CIRB/1232998
      http://location.testproject.eu/so/ad/Address/SPW/451463
    PostalDescriptor (spatial object)
      http://location.testproject.eu/so/ad/PostalDescriptor/BPOST/1560
    AddressLocator (spatial object)
      http://location.testproject.eu/so/AddressLocator/AGIV/2000017467
    AddressRepresentation (spatial object)
      http://location.testproject.eu/so/ad/AddressRepresentation/AGIV/2000017467
      http://location.testproject.eu/so/ad/AddressRepresentation/CIRB/1232998
      http://location.testproject.eu/so/ad/AddressRepresentation/SPW/451463
    AdministrativeUnit (spatial thing)
      http://location.testproject.eu/id/administrative-unit/STATBEL/24000
    AdministrativeUnit (spatial object)
      http://location.testproject.eu/so/au/AdministrativeUnit/STATBEL/24000
    Road (spatial thing)
      http://location.testproject.eu/id/road/RN/15601625
    Road (spatial object)
      http://location.testproject.eu/so/tn/Road/RN/10005081
  • Relating URIs using Location Terms
    Subject                                            Predicate             Object
    NGI_Road.NATIONALREGISTRATION-NUMBER - URI         rdf:type              tnro:Road
                                                       locn:geographicName   STREETNAMEGERMAN@de
                                                       locn:geometry         TGID
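To make the mapping table concrete, here is a minimal sketch that turns one row of a road table into N-Triples lines in plain Python. It is not the pilot's Virtuoso mapping. Several things are assumptions for illustration only: the `tnro` namespace URI (the experimental INSPIRE vocabulary's real namespace is not given in the slides), the exact spelling of the key column, the `RN` namespace code in the subject URI (borrowed from the sample road URI), and the choice to turn `TGID` into a geometry URI rather than a literal.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
LOCN = "http://www.w3.org/ns/locn#"       # Core Location Vocabulary namespace
TNRO = "http://example.com/tnro#"          # hypothetical placeholder namespace
BASE = "http://location.testproject.eu"    # domain from the sample URIs

# Assumed column name; the slide prints it with a hyphen, possibly a line wrap.
ID_COLUMN = "NATIONALREGISTRATIONNUMBER"


def escape_literal(text):
    """Escape a string for use inside an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def road_row_to_ntriples(row, namespace="RN"):
    """Return the three triples from the mapping table for one road row."""
    subject = f"<{BASE}/id/road/{namespace}/{row[ID_COLUMN]}>"
    geometry = f"<{BASE}/id/geometry/{namespace}/{row['TGID']}>"  # assumed URI form
    name = escape_literal(row["STREETNAMEGERMAN"])
    return [
        f"{subject} <{RDF_TYPE}> <{TNRO}Road> .",
        f'{subject} <{LOCN}geographicName> "{name}"@de .',
        f"{subject} <{LOCN}geometry> {geometry} .",
    ]


if __name__ == "__main__":
    # Fictitious row values, for illustration only.
    sample = {ID_COLUMN: "15601625", "STREETNAMEGERMAN": "Hauptstrasse", "TGID": "42"}
    print("\n".join(road_row_to_ntriples(sample)))
```

A view-based mapping such as the one used in the pilot produces the same kind of triples at query time instead of materialising them.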
  • Mapping relational stores to RDF
    • Belgian Pilot uses OpenLink Software “Virtuoso” RDF Views
    • Alternatives include:
      – D2RQ / R2D2
      – SquirrelRDF
  • UC 1: Disambiguate address notations with SPARQL
    All streets containing "Watermael-Boitsfort":
      PREFIX locn: <http://www.w3.org/ns/locn#>
      PREFIX ex: <http://example.com/>
      SELECT DISTINCT ?type ?subject ?label
      FROM <http://location.testproject.eu/BEL>
      WHERE {
        FILTER(?type = locn:Address || ?type = ex:Road || ?type = ex:AdministrativeUnit) .
        FILTER(?predicate = locn:fullAddress || ?predicate = ex:roadName || ?predicate = locn:geographicName) .
        ?subject a ?type .
        ?subject ?predicate ?label .
        FILTER(regex(?label, "Watermael-Boitsfort", "i")) .
      }
      LIMIT 100
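As a rough sketch of how a client might parameterise the UC 1 query, the snippet below builds the query text for an arbitrary search term and wraps it in a SPARQL Protocol GET URL. The endpoint path `/sparql` is an assumption (the slides only give the graph URI), the function names are hypothetical, and the search term is passed through as a regular expression, so regex metacharacters in it keep their meaning.

```python
from urllib.parse import urlencode

# Assumed endpoint location; the slides do not state it.
ENDPOINT = "http://location.testproject.eu/sparql"

QUERY_TEMPLATE = """PREFIX locn: <http://www.w3.org/ns/locn#>
PREFIX ex: <http://example.com/>
SELECT DISTINCT ?type ?subject ?label
FROM <http://location.testproject.eu/BEL>
WHERE {
  ?subject a ?type .
  ?subject ?predicate ?label .
  FILTER(?type = locn:Address || ?type = ex:Road || ?type = ex:AdministrativeUnit)
  FILTER(?predicate = locn:fullAddress || ?predicate = ex:roadName || ?predicate = locn:geographicName)
  FILTER(regex(?label, "@TERM@", "i"))
}
LIMIT @LIMIT@"""


def escape_sparql_string(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_query(term, limit=100):
    """Fill the UC 1 query with a search term and a result limit."""
    query = QUERY_TEMPLATE.replace("@LIMIT@", str(int(limit)))
    return query.replace("@TERM@", escape_sparql_string(term))


def request_url(term, limit=100):
    """SPARQL Protocol GET request URL carrying the query."""
    return ENDPOINT + "?" + urlencode({"query": build_query(term, limit)})


if __name__ == "__main__":
    print(build_query("Watermael-Boitsfort"))
```

Escaping the term before substitution keeps a stray quote in user input from terminating the string literal early.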
  • UC2: Look up address IDs through URL rewrite rules
    • Using HTTP 303 (See Other) redirects and content negotiation to deliver HTML, RDF, etc.
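The lookup behaviour can be sketched as a pure function that maps a requested identifier path and an Accept header to a 303 redirect target. This is an illustration, not the pilot's actual rewrite rules: the file suffixes, the supported media types and the fallback to HTML are all assumptions.

```python
# Hypothetical mapping from media type to document suffix.
SUFFIX_BY_TYPE = {
    "text/html": ".html",
    "application/rdf+xml": ".rdf",
}
DEFAULT_SUFFIX = ".html"  # assumed fallback when nothing supported is requested


def preferred_suffix(accept_header):
    """Pick the suffix for the supported media type with the highest q-value."""
    ranked = []
    for position, part in enumerate(accept_header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        q = 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        if q > 0:
            # Sort by descending q, then by order of appearance.
            ranked.append((-q, position, media_type))
    for _, _, media_type in sorted(ranked):
        if media_type in SUFFIX_BY_TYPE:
            return SUFFIX_BY_TYPE[media_type]
    return DEFAULT_SUFFIX


def resolve(path, accept_header=""):
    """Return (status, location) for an /id/ or /so/ path, else None."""
    for prefix in ("/id/", "/so/"):
        if path.startswith(prefix):
            target = "/doc/" + path[len(prefix):] + preferred_suffix(accept_header)
            return 303, target
    return None
```

A request for `/id/address/AGIV/2000017467` with `Accept: application/rdf+xml` would then be redirected to the RDF document under `/doc/`, while a browser asking for `text/html` would be sent to the HTML page.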
  • UC3: Link datasets via common address identifiers
    • Using Google “Refine” to link address labels with locn:fullAddress
    • A semi-manual process
    • The end result of reconciliation (disambiguation) is a URI for each address
    • This URI can then be used to retrieve additional linked data from the triplestore
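The pilot used Google Refine for this step. As a stand-in illustration of the idea only, the sketch below reconciles a free-text address label against a small table of `locn:fullAddress` values and returns the matching URI. The normalisation rules, the similarity cutoff and the function names are assumptions, and the sample URI is the fictitious one from the use-case slide.

```python
import difflib
import unicodedata


def normalise(label):
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", label)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(cleaned.split())


def reconcile(label, full_address_to_uri, cutoff=0.85):
    """Return the URI whose fullAddress best matches the label, or None."""
    index = {normalise(full): uri for full, uri in full_address_to_uri.items()}
    key = normalise(label)
    if key in index:
        return index[key]
    # Fall back to fuzzy matching for small spelling differences.
    close = difflib.get_close_matches(key, list(index), n=1, cutoff=cutoff)
    return index[close[0]] if close else None


if __name__ == "__main__":
    known = {
        # Fictitious identifier, as on the use-case slide.
        "Maria-Theresiastraat 1, 1000 Brussel":
            "http://data.gov.be/so/ad/Address/00BR/9346-237",
    }
    print(reconcile("maria theresiastraat 1 1000 BRUSSEL", known))
```

Matches found by the fuzzy fallback would still need human review, which is why the slide describes reconciliation as semi-manual.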
  • The pilot demonstrates that
    • The Core Location RDF Vocabulary can be used as a foundational RDF Vocabulary to homogenise address data that originates from disparate organisations and systems;
    • The Core Location RDF vocabulary can be flexibly extended with experimental INSPIRE RDF vocabularies (i.e. transport networks and administrative units);
    • HTTP URI sets can be derived from INSPIRE Unique Object Identifiers for address data, making it possible to create harmonised Web identifiers for spatial things and spatial objects such as addresses;
    • A linked data infrastructure can provide access to homogenised, linked, and enriched location data using standard Web-based interfaces (such as HTTP and SPARQL) and Web-based languages (such as XHTML, RDF+XML), thereby simplifying the use of location data for humans and machines.
    • Extract, Transform and Load (ETL) can be replaced with Extract, Enrich and Repair (ERR)
    http://location.testproject.eu/BEL/
  • Get involved
    • Join the SEMIC community on Joinup
    • Visit our initiatives: Semantic interoperability
    • Programme Manager and team contacts: Vassilios.PERISTERAS@ec.europa.eu, Joao.Frade@pwc.be, Stijn.Goedertier@pwc.be, Nikos.Loutas@pwc.be
    http://joinup.ec.europa.eu/