Practical experience of INSPIRE Annex I Testing

Snowflake Software's experience in testing Annex I data: Transforming data into INSPIRE data specifications and making data accessible via Download Services

Published in: Technology
  • Today I’m going to give you a quick summary of our experiences from testing the INSPIRE Annex I data specifications – including a demonstration of the work that we did with HMLR
  • Objectives of INSPIRE testing were to:
      • Understand the feasibility of transforming and publishing data into the proposed INSPIRE Annex I data specifications
      • Demonstrate provision of data access via INSPIRE Download Services
      • Evaluate the costs and benefits of publishing data into the INSPIRE data specifications via Download Services
    89 testing reports were received from over 70 participating organisations in 16 Member States (LMOs, SDICs, software vendors, research institutes and geographical institutes and associations). As you can see, the UK was one of the most active Member States involved in the testing.
  • There was a pretty even split of test reports received by the Commission across all Annex themes. However, it should be noted that for most themes, most of the test reports came from organisations involved in Commission projects related to INSPIRE: EURADIN (Addresses); ESDIN (Geographic Names, Administrative Areas, Cadastral Parcels, Transport Networks, Hydrography); NATURE-GIS (Protected Areas); GIS4EU; and Humboldt. Fewer than 20% of all reports received were undertaken or commissioned directly by organisations (LMOs). Due to the short period given for testing, not all organisations were able to perform a full test that actually transformed and published data and metadata to the INSPIRE data specifications and download services. Only 60 out of 89 reports described the results of an actual transformation and publication test, while the rest were paper-based exercises mapping source data to the output schema using MS Excel.
  • Snowflake was involved both directly (through the ESDIN project working with HMLR, EDINA, OS and Registers of Scotland) and indirectly, through organisations downloading evaluations or using academic licences of our software. Consequently we were able to demonstrate that our software is capable of integrating, modelling, transforming and publishing data into all INSPIRE Annex themes.
  • There are two key approaches for transforming and publishing data into the INSPIRE Annex themes:
      • Offline transformation: Data is transformed and stored in the INSPIRE data specification (i.e. flat files or a separate database). This may be the most suitable option for datasets that are not updated regularly or are only going to be made accessible via simple download services.
      • On-the-fly transformation: Data is stored once and transformed into the INSPIRE data specification on request by the download service. This is the most suitable option for datasets that are updated regularly and where organisations need to support access to data by a wide range of end users in different data specifications defined for different use cases.
    On-the-fly transformation will probably be the main approach for many organisations, because many have to support communities outside of INSPIRE (i.e. outside the Environment domain) which may require data to be published in other data specifications (e.g. the Aviation domain has developed WXXM, which may differ from the INSPIRE Annex III meteorological geographic features data specification).
    It was also noted at the INSPIRE Conference that the INSPIRE data specifications would only define core feature types and properties that support a broad range of use cases across the environmental acquis. It is therefore anticipated that the INSPIRE data specifications will provide the base specifications, which are then extended to develop more specialised data specifications under the remit of SEIS (Shared Environmental Information System). These would meet the more specific use cases for information and information systems required to deliver the obligations within individual environmental Directives, i.e.:
      • Information needed to satisfy reporting requirements between Member States and the Commission
      • Information that needs to be shared between Public Authorities to successfully deliver policy objectives
      • Developing public information systems (e.g. near-real-time air quality monitoring applications and alert services, or systems to enable public engagement in environmental policy making, as required by the Aarhus Convention)
    GO Publisher was used to test both types of transformation by organisations involved in testing.
  • Working with HMLR, we aimed to demonstrate the feasibility of developing transformational download services and direct access services (for use within client applications). The INSPIRE Implementing Rules define two types of Download Service:
      • Basic Download Service: Files can be downloaded for local use via HTTP/FTP
      • Advanced Download Service: Users can define the extent (geographic, temporal, attribute) of the data they need to download through either Data Ordering Services or Web Feature Services (WFS)
    It should be noted that Direct Access services (WFS for use within applications) were deemed beyond the scope of the Implementing Rules for Download Services. It is expected that the requirements for these will be defined by Member States or by thematic data working groups established for individual environmental Directives (e.g. CAFE Directive, Marine Strategy Directive, Water Framework Directive).
  • The aim of our involvement in the INSPIRE testing with HMLR was to test the feasibility of using COTS software for delivering the requirements of INSPIRE. Our objectives were to:
      • Develop the translation without the need for software customisation or development of bespoke scripts
      • Demonstrate that transforming data into INSPIRE data specifications and publishing it via a range of different download and direct access services could be achieved quickly
      • Demonstrate that data can be transformed on-the-fly, enabling organisations to manage once and publish many times, and to maintain existing data maintenance infrastructures, minimising any future costs (although some changes to data capture (business) processes may be required)
      • Demonstrate that we have an industrial-strength, scalable solution that enables organisations to start small (i.e. simple download services) and extend their services to full enterprise level or SOA/SDI level when demand increases
  • Configuration or authoring of the transformations required to integrate, model and transform source data into a pre-defined output schema, such as the INSPIRE Cadastral Parcels specification, is performed within GO Publisher Desktop. GO Publisher Desktop is a powerful, intuitive graphical user interface that enables users to map database tables and columns to the respective elements within the GML application schema, which is parsed and validated prior to use. GO Publisher provides users with a wide range of functions for manipulating and transforming the source data:
      • Inserting new values into the output where data doesn’t currently exist in the source data (useful for inserting codespace/namespace values)
      • Geometric operations
      • Logical, comparison and arithmetic operations
      • Coordinate reference system transformations
    It also provides a preview panel to evaluate the output during the mapping process, and validates this against the schema to ensure that all mandatory elements are mapped. Once the mapping is complete, GO Publisher Desktop validates the output against the output schema to test for logical consistency, which is the only data quality requirement/conformance test specified in all data specifications. If the output passes this validation test, the user can state within the metadata that the data passes the data specification conformance quality measure.
  • Visual analysis of the GML can also be performed by viewing a sample GML file using the GML Viewer
  • Our approach was to demonstrate how to transform HMLR data into INSPIRE Cadastral Parcel GML via a simple download service, advanced download services and direct access services (i.e. WFS). HMLR provided us with a subset of their land parcel data, which was loaded into an Oracle database. We then used GO Publisher Desktop to author the transformation and mapping project, in conjunction with domain experts from HMLR. After several iterations improving the transformation and mapping project, it was used to generate individual files, which can be combined into a zip file and made accessible via an HTTP/FTP server. Alternatively, once the transformation and mapping project is configured, GO Publisher Desktop can be used to configure the WFS (i.e. create service metadata). Once configured, GO Publisher Desktop deploys the WFS as a WAR file into an application server, where it is accessible to WFS clients (HTTP GET or POST).
  • The INSPIRE model for cadastral parcels is more extensive than the HMLR model; however, the HMLR model did contain all the mandatory feature types and properties (i.e. cadastral boundaries or index sets). In the UK we don’t have a cadastral mapping agency. Instead, the Ordnance Survey provides a topographic mapping database which includes objects that form properties (or cadastral parcels), while the Land Registry manages and maintains the land titles for properties in England and Wales. Consequently, the data model of the Land Registry differs from the INSPIRE Cadastral Parcel data specification. The primary feature type within the Land Registry dataset is the Land Title, which is not modelled in the INSPIRE model, and one or more objects (buildings, gardens, car parking etc.) are associated with a title. In the INSPIRE Cadastral Parcel model, by contrast, the primary feature type is the cadastral parcel. Despite these different viewpoints, it is possible to map each component object within a land title to a cadastral parcel. However, this reveals a data management problem for operational transformation of HMLR data to the INSPIRE specification: how does HMLR manage the identity and lifecycle of each of these component objects? This is further complicated because the INSPIRE GML application schema requires 3 different identifiers:
      • INSPIRE Identifier: unique, persistent identifier for international use
      • National Identifier: unique, persistent identifier for national use
      • gmlID: non-persistent identifier needed to uniquely identify features and objects within the GML file
  • All organisations that used GO Publisher were able to successfully transform their data to meet the mandatory requirements of the INSPIRE Annex data specifications for their respective themes. However, several issues were reported:
      • Insufficient time to adequately undertake testing: this resulted in few application tests. Some organisations are still performing tests and submitting test reports to the Commission.
      • Complexity of the INSPIRE data specifications: many organisations reported that the data specifications were too complex, which was causing problems with transformation and publication (e.g. FME struggles with nesting below 2 levels).
      • Lack of experience: many organisations reported that although they could perform the transformations required, they would need to provide further training for their staff to gain a better understanding of how to transform their data into INSPIRE specifications operationally and of the types of download services they need to provide.
      • Lack of harmonisation between specifications: this comment has been taken on board, and additional time is being provided, once the next versions of the data specifications have been finalised, to perform cross-harmonisation of the data specifications.
  • Many organisations identified areas where further work would be needed to better understand how to operationally publish data into the INSPIRE specifications, particularly where this is achieved on-the-fly:
      • Identifier management
      • Managing feature lifecycles, particularly for features generated on-the-fly
      • Translating between different codelist values: not all codelist values can be mapped 1:1, so it may not be possible to transform them on-the-fly/automatically, because domain expertise may be required to assign individual features to the appropriate INSPIRE codelist value. Encoding of alternate code values may therefore have to be incorporated into the data maintenance workflows.
      • Measuring and quantifying the quality of transformed output (geometric and attribute) to ensure that it is consistent with source data quality levels and other data quality levels: this should be performed as part of an extensive pilot to demonstrate and understand how the transformation to the INSPIRE data specifications affects quality, which can then be expressed in the metadata.
      • Metadata: more work needs to be done to understand how metadata creation and publication can be integrated into the transformation and publication workflow, to semi-automate the creation of metadata that meets the requirements for publication within discovery services, is accessible via download services and can be disseminated within files downloaded for local use.
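
The identifier problem raised in the notes above can be illustrated with a minimal Python sketch. It is not how GO Publisher or HMLR handle identifiers; it only shows one possible way to derive the three identifiers for a component polygon of a land title. The namespace value, the `title_number`/`component_index` inputs and the separator conventions are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical namespace; a real one would be assigned to the data provider.
NAMESPACE = "UK.EXAMPLE.CP"


@dataclass(frozen=True)
class ParcelIdentifiers:
    inspire_namespace: str   # namespace part of the INSPIRE identifier
    inspire_local_id: str    # local part of the INSPIRE identifier (persistent)
    national_reference: str  # identifier for national use (persistent)
    gml_id: str              # only needs to be unique within one GML file


def derive_identifiers(title_number: str, component_index: int) -> ParcelIdentifiers:
    """Derive all three identifiers for one component polygon of a land title.

    Assumes (hypothetically) that the title number and the numbering of its
    component polygons are stable over time.
    """
    if component_index < 1:
        raise ValueError("component_index must be 1 or greater")
    local_id = f"{title_number}-{component_index}"
    national_reference = f"{title_number}/{component_index}"
    # A gml:id is an XML ID: it must not start with a digit or contain
    # spaces or colons, so unsafe characters are replaced and a prefix added.
    safe = re.sub(r"[^A-Za-z0-9_.-]", "_", f"{NAMESPACE}.{local_id}")
    return ParcelIdentifiers(NAMESPACE, local_id, national_reference, f"cp_{safe}")


ids = derive_identifiers("AB12345", 2)
print(ids.inspire_local_id, ids.national_reference, ids.gml_id)
```

The sketch only works if the component numbering never changes, which is exactly the lifecycle question the notes raise: if polygons within a title are split, merged or renumbered, identifiers derived on-the-fly stop being persistent.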

    1. Practical Experience of INSPIRE Annex I Testing: Transforming Data into INSPIRE Data Specifications and Making Data Accessible via Download Services
       Debbie Wilson
       debbie.wilson@snowflakesoftware.com
    2. Overview of INSPIRE Testing Call
       Objectives of INSPIRE testing were to:
       • Understand the feasibility of transforming and publishing data into proposed INSPIRE Annex I data specification
       • Demonstrate ability to access data via INSPIRE Download Services
       • Evaluate costs and benefits of publishing data into the INSPIRE data specification via Download Services
       89 testing reports were received from 16 Member States from >70 organisations
    3. Overview of INSPIRE Testing Call
       • Only 60 of 89 tests involved a full transformation test; the rest were paper exercises
       • Of those organisations using COTS software, ~40% used Snowflake’s GO Publisher Desktop & WFS
    4. Snowflake Software’s Experiences
       Snowflake Software was directly and indirectly involved with 10 organisations across Europe
    5. Transforming Data into INSPIRE Themes
       Two key approaches are advocated for transforming and publishing data into INSPIRE themes:
       • Offline Transformation: Data is transformed and stored into the INSPIRE data specification (i.e. flat files or separate database)
       • On-the-fly Transformation: Data is stored once and transformed into the INSPIRE data specification on request by the download service
       Offline transformation may be the most suitable option for datasets that are not updated regularly
       On-the-fly transformation is most suitable for datasets that are updated regularly and where organisations need to support data access to a wide range of end users in different data specifications defined for different use cases
       Both approaches were evaluated during the testing phase
    6. Making INSPIRE Data Accessible via INSPIRE Download Services
       GO Publisher Desktop, Agent and WFS were also used to demonstrate how organisations can develop download and direct access services
       INSPIRE Implementing Rules define two types of Download Service:
       • Basic Download Service: Files can be downloaded for local use via HTTP/FTP
       • Advanced Download Service: User can define the extent (geographic, temporal, attribute) of the data they need to be downloaded through either: Data Ordering Services or Web Feature Services (WFS)
    7. Demonstration: Transforming HMLR data into INSPIRE Cadastral Parcels
       • Test the feasibility and benefits of using Commercial Off-The-Shelf (COTS) software for INSPIRE
       • Develop the translation without software customisation or development of bespoke scripts
       • Work quickly and productively to reduce costs
       • Refine the translation over several iterations within a limited time period
       • Implement Simple and Advanced Download Services to explore the practical issues of implementing real business requirements
       • On-the-fly translation to avoid replicating database infrastructure
       • Source data from the existing HMLR data model to avoid disruption to existing business processes
       • “manage once, publish many times”
       • Implement an “industrial strength” solution
    8. Defining the Translation: GO Publisher Desktop
       • Database tables and columns
       • XML schema elements
       • Pull down lists populated from the XML schema
       • Preview panel
    9. Visualising output: GML Viewer
    10. The Solution Architecture
    11. Differences between HMLR and INSPIRE data specifications
       HMLR:
       • Land registry dataset
       • No boundaries or index sets
       • Title is the unit of management
       • Only titles have unique identifiers
       • No reference point values
       INSPIRE Cadastral Parcel:
       • Cadastral index model
       • Boundaries & index sets included
       • Parcel is the unit of management
       • Polygons have unique identifiers
       • Reference point geometry allowed
       Despite these different viewpoints, it is possible to map each component object within a land title to a cadastral parcel.
    12. Issues: Managing identity and feature lifecycles as INSPIRE GML application schema requires 3 different identifiers:
       • INSPIRE Identifier
       • National Identifier
       • gmlID
    13. Benefits of GO Publisher
       • High productivity was achieved
       • Configuration alone was sufficient
       • No programming or scripting skills were needed
       • Several iterations of the translation were achieved in a limited time-frame
       • Supported progression from simple to advanced services
       • Initially deployed as simple file creation
       • Translation re-deployed as a WFS to support user querying
       • Mature “Industrial Strength” solution
       • Although INSPIRE Quality of Service Requirements were not evaluated in this test, scalability and performance have already been proven in previous deployments
       • The number of independent evaluations and our existing operational deployment base mean that the technology is well tested and reliable
    14. Benefits of On-the-Fly Translation using GO Publisher
       • Re-use existing database infrastructure
       • Minor disruption to existing business processes
       • Extra translations added at low cost
       • Low initial investment: costs scale with increasing levels of data traffic
       • Example Architecture of an SDI
    15. Summary of Key Outcomes and Experiences
       All organisations using GO Publisher were able to successfully demonstrate that they could transform their data into INSPIRE Annex data specifications
       However, several issues relating to data transformation and publication were identified by many organisations:
       • Insufficient time to adequately undertake testing (few application tests)
       • Complexities involved in undertaking conceptual mapping of their data model to INSPIRE GML application schema (xsd) or UML model
       • Inexperience of staff in transforming data from their source format into GML
       • Lack of harmonisation in the way common concepts were modelled and to be implemented in v2.0 (e.g. Identifiers, lifecycle information, naming conventions)
    16. Summary of Key Outcomes and Experiences
       Many organisations identified areas where further work would be needed to better understand how to operationally publish data into the INSPIRE specifications, particularly where this is achieved on-the-fly:
       • Identifier Management
       • Managing feature lifecycles, particularly those features generated on-the-fly
       • Translating between different codelist values
       • Measuring and quantifying quality of transformed output (geometric and attribute) to ensure that it is consistent with source data quality levels and other data quality levels
       • Metadata: how can organisations integrate the creation and publication of metadata into operational data management and publication workflows
    17. Conclusions
       • A number of technical and business issues exist that need to be addressed at various levels (organisational, inter-agency and Member State)
       • Organisations need to perform more extensive testing to better understand how to incorporate requirements of INSPIRE data specifications into their business processes, datasets, products and infrastructure
       • Responsibility for creation and maintenance of some themes falls across multiple agencies or has been devolved to multiple organisations responsible for a specific geographic region (e.g. transport: road, rail, aviation, water, devolved administrations, local/regional authorities):
           • Need to better understand impact of cross-border/edge-matching issues
           • How can data be seamlessly integrated when combining data from multiple organisations for a single INSPIRE data specification
       • Organisations need facilities to support each other so they can share experiences and ensure everyone can better understand what they need to do to publish their data
    18. Questions?
       debbie.wilson@snowflakesoftware.com
       For demo of HMLR testing go to:
       http://www.youtube.com/user/snowflakesoftware#p/u/14/V4Ut8kKL5YI
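
As a rough illustration of the Advanced Download Service described in slide 6, where the user defines the geographic extent of the data they need, the Python sketch below builds a WFS 1.1.0 GetFeature request URL with a bounding-box filter. The endpoint URL, the feature type name `cp:CadastralParcel` and the default coordinate reference system (British National Grid, EPSG:27700) are assumptions for illustration; they are not taken from the HMLR test service.

```python
from urllib.parse import urlencode


def build_getfeature_url(endpoint, type_name, bbox, srs="EPSG:27700", max_features=None):
    """Build a WFS 1.1.0 key-value-pair GetFeature URL restricted to a bounding box.

    endpoint  -- base URL of the WFS, without a query string (hypothetical here)
    type_name -- qualified feature type name published by the service
    bbox      -- (min_x, min_y, max_x, max_y) in the units of `srs`
    """
    min_x, min_y, max_x, max_y = bbox
    if min_x >= max_x or min_y >= max_y:
        raise ValueError("bbox must be (min_x, min_y, max_x, max_y) with min < max")
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,
        # WFS 1.1.0 allows the CRS to be appended as a fifth bbox value.
        "bbox": f"{min_x},{min_y},{max_x},{max_y},{srs}",
    }
    if max_features is not None:
        params["maxFeatures"] = str(max_features)
    return f"{endpoint}?{urlencode(params)}"


# Example: request at most 50 parcels inside a 1 km square.
print(build_getfeature_url(
    "https://example.org/wfs",            # hypothetical endpoint
    "cp:CadastralParcel",                 # assumed feature type name
    (530000, 180000, 531000, 181000),
    max_features=50,
))
```

A Basic Download Service needs none of this: the client simply fetches a pre-generated file over HTTP/FTP. The query parameters are what let the same on-the-fly translation serve many different extents from one stored dataset.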
