Up to €67.4 million is foreseen from the 2020 CEF Telecom Work Programme for grants managed by INEA in the area of Generic Services. The grants under CEF Telecom helped European public administrations and businesses to hook up to the core platforms of the digital services that are the object of the calls.
In particular, €5 million was made available in 2019 and €3 million in 2020 for projects oriented towards 'Open Data' management.
GreenMov, ODALA and INTERSTAT have developed services and products that can be easily adopted by public administrations and beyond, thanks to funding from the CEF programme targeted at Open Data.
The purpose of this event is not only to present results and demos or to provide technical guidelines for developers. It is also a moment of reflection on the lessons learned and best practices that emerged from years of project activity, to analyse what the impact will be for Public Administrations and, finally, to test the value of GreenMov, INTERSTAT and ODALA in solving future problems.
1. Vienna, Austria
12-13 June, 2023
#FIWARESummit
From Data to Value
OPEN SOURCE | OPEN STANDARDS | OPEN COMMUNITY
Enabling interoperability for Linked Open Statistical Data
Martino Maggio (Engineering), Giuseppina Ruocco (ISTAT), Franck Cotton (INSEE)
2. The contents of this publication are the sole responsibility of INTERSTAT consortium
and do not necessarily reflect the opinion of the European Union
Statistical data and Interoperability
● Statistical data plays an increasingly relevant role in the definition and
implementation of policies in the social, economic and administrative domains.
● Because of its relevance, statistical data is one of the thematic categories of High-Value
Datasets (HVDs) defined in the Open Data Directive.
● The interoperability of statistical data between different European countries is
essential in order to make open, reusable data available to scientific
communities, citizens and administrations for the creation of new cross-border
services.
3. INTERSTAT Project
«The overall objective of INTERSTAT is to develop a framework that will allow the
interoperability, by using technical assets and common ontologies, between national
statistical portals and the European Data portal and the deployment of cross-border
services that reuse European statistical open datasets from those portals.»
Project consortium
CEF Telecom call 2019 - Public Open Data
Start: September 2020
Duration: 36 months
4. INTERSTAT Objectives
● Enable interoperability among different national statistical portals and the European Data Portal
through the adoption of standards and tools to automate metadata harmonisation and data
publication
● Provide standards, methodologies and tools to achieve data harmonisation in the field of (linked)
open data statistics among different national statistical institutions
● Provide uniform technical interfaces for a standard and simple re-use of statistical information
through the adoption of CEF Context Broker Building Block and the implementation of ETSI NGSI-LD
API specification
● Provide tools to simplify statistical data visualisation and analysis for non-technical end-users
● Validate the technical solutions provided in the Action through the deployment and piloting of
cross-border end-user services in the domain of the Population and Households Census, reusing
harmonised statistical data in combination with open datasets (e.g. city-related data) from the European
Data Portal and other national open data portals.
5. INTERSTAT technical framework
The INTERSTAT framework provides:
• Harmonisation of Linked Open Statistical Data (LOSD), provided in the
project pilots by Istat and Insee, through the adoption of common data
models and the provision of specific tools for data mapping, querying
and visualisation, in compliance with the SDMX standard.
• The Idra Open Data federation platform, for the federation and
harmonisation of open datasets coming from heterogeneous sources
and their provision through standard interfaces and metadata models
(i.e. DCAT-AP).
• The CEF Context Broker Building Block, which gives access to the
LOSD through the NGSI-LD models and API.
• A set of open APIs based on different standards, allowing access to
and sharing of the LOSD through different Open Data and statistical
institutional portals in Europe (e.g. European Data Portal, Eurostat
systems), as well as third-party systems.
Finally, the INTERSTAT framework makes it easy to create cross-border
applications based on LOSD on top of its API, such as the three
(related to environment policies, schools and geolocalised facilities)
piloted within the project.
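To illustrate how a third-party application might read LOSD through the NGSI-LD API, here is a minimal sketch that only builds the URL of a standard NGSI-LD "query entities" request. The broker address and the entity type "Observation" are assumptions made for illustration; the slides do not state the endpoint or the entity types used in the pilots.

```python
from urllib.parse import urlencode

# Hypothetical broker address: the slides do not give a real endpoint.
BROKER_URL = "http://localhost:1026"


def build_entities_query(entity_type, limit=10, broker_url=BROKER_URL):
    """Return the URL of an NGSI-LD 'query entities' request for one entity type."""
    params = urlencode({"type": entity_type, "limit": limit})
    return f"{broker_url}/ngsi-ld/v1/entities?{params}"


# "Observation" is an illustrative entity type, not one confirmed by the project.
url = build_entities_query("Observation", limit=5)
print(url)
```

A client would send a GET request to this URL and receive a JSON array of matching entities.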
6. INTERSTAT Framework: Tools
Framework available at: https://framework.cef-interstat.eu/
7. Idra - Open Data Federation Platform
• Idra is an open source platform that provides a
single access point to open data of public
administrations or private entities from sources and
portals based on heterogeneous technologies.
• The open data remains in the source portals; the
platform imports and harmonises the metadata and
updates them periodically.
• The platform is able to access open data portals
based on different technologies, even those that do
not expose specific APIs.
• Idra is one of the key tools of the INTERSTAT project,
together with the CEF Context Broker, to enable
interoperability between heterogeneous open data
technologies and third-party applications.
GitHub: https://github.com/OPSILab/Idra
8. Integration between Idra and the CEF Context Broker
Purpose:
• Use the Smart Data Model to provide compatibility between the NGSI-LD and
DCAT-AP standards.
• Allow catalogues federated in Idra to be automatically loaded into an Orion
Context Broker.
• Allow an Orion catalogue containing DCAT-AP entities to be federated in Idra.
• Take advantage of the CEF Context Broker's subscription and notification
mechanism to automatically apply to the Idra catalogues the changes made
to the corresponding entities in the Context Broker.
DCAT-AP metadata federated in Idra is published in the Context Broker.
DCAT-AP metadata stored in the Context Broker can be federated in Idra.
Activating a subscription allows Idra to receive notifications when the
metadata changes.
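The subscription mechanism described above can be sketched as an NGSI-LD subscription payload. This is an illustration under stated assumptions, not the configuration used by Idra: the subscription identifier, the entity type "Dataset" and the notification URI are all hypothetical, and only the payload shape follows the NGSI-LD specification.

```python
import json

# Core @context published by ETSI for NGSI-LD.
NGSI_LD_CORE_CONTEXT = "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"


def build_subscription(sub_id, entity_type, notify_uri):
    """Build an NGSI-LD subscription that notifies notify_uri when entities
    of the given type change."""
    return {
        "id": f"urn:ngsi-ld:Subscription:{sub_id}",
        "type": "Subscription",
        "entities": [{"type": entity_type}],
        "notification": {
            "endpoint": {"uri": notify_uri, "accept": "application/json"}
        },
        "@context": [NGSI_LD_CORE_CONTEXT],
    }


# Hypothetical values: the entity type and the callback URL are assumptions.
subscription = build_subscription(
    "idra-dcat-ap", "Dataset", "http://idra.example.org/notify"
)
print(json.dumps(subscription, indent=2))
```

Such a payload would be sent with a POST request to the broker's `/ngsi-ld/v1/subscriptions` resource using the `application/ld+json` content type; the broker then calls the notification endpoint whenever a matching entity changes.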
9. Key Standards
● SDMX: Statistical Data and Metadata eXchange is
a standard for the exchange of statistical data and
metadata among international organisations.
● DCAT-AP: the DCAT Application Profile for data
portals in Europe is a specification based on
W3C's Data Catalog Vocabulary (DCAT) for
describing public sector datasets in Europe.
● ETSI NGSI-LD: The Context Information
Management API NGSI-LD allows users to
provide, consume and subscribe to context
information in multiple scenarios and involving
multiple stakeholders.
10. Standardisation activities: NGSI-LD and Statistical models
● Specific work on interoperability between NGSI-LD and standard statistical models
has been conducted in collaboration with experts from the DDI Alliance and CODATA.
● The idea is to specify procedures for the transformation into NGSI-LD of statistical data
structured according to standard models, including the SDMX Information Model for data
and metadata, the VTL data model and the DDI Cross-Domain Integration (DDI-CDI).
● Work has progressed well regarding the SDMX data model, which covers so-called
dimensional data or cubes, and it will now focus on DDI-CDI, which can represent more
diverse data structures like sensor data, key-value ("big") data, trees, graphs, etc.
11. SDMX/NGSI-LD parser
[Diagram labels: Turtle RDF file; LALR(1) RDF Grammar; RDF Parser; DCAT-AP / statDCAT-AP models; JSON-LD files; ETSI NGSI-LD; Orion-LD Context Broker; Third-Party software]
• Supports automatic translation from Turtle (Terse RDF Triple Language) to a
JSON-LD format compatible with the ETSI NGSI-LD common information model
• Integrated with the FIWARE Context Broker (Orion-LD)
• Compatible with ETSI NGSI-LD v1.4.1
• Supports (almost) strict DCAT-AP v2.0.1 and statDCAT-AP v1.0.1
for metadata representation
• Secure management of HTTP headers
• Implemented with FastAPI and Uvicorn for a fast (high-performance)
web application
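As a rough illustration of the target format, the sketch below shows the general shape of an NGSI-LD entity built from already-parsed metadata values. It is not the project's parser: it performs no Turtle parsing, and the identifiers, the entity type and the attribute names are hypothetical. Only the Property and Relationship structure follows the NGSI-LD information model.

```python
def to_ngsi_ld_entity(entity_id, entity_type, properties, relationships=None):
    """Wrap plain values as NGSI-LD Properties and links as Relationships."""
    entity = {"id": entity_id, "type": entity_type}
    for name, value in properties.items():
        # Literal values become NGSI-LD Property attributes.
        entity[name] = {"type": "Property", "value": value}
    for name, target in (relationships or {}).items():
        # References to other entities become NGSI-LD Relationship attributes.
        entity[name] = {"type": "Relationship", "object": target}
    return entity


# Hypothetical dataset description used only to show the output shape.
dataset = to_ngsi_ld_entity(
    "urn:ngsi-ld:Dataset:census-sample",
    "Dataset",
    {"title": "Sample census dataset", "language": "en"},
    {"distribution": "urn:ngsi-ld:Distribution:census-sample-csv"},
)
print(dataset)
```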
12. Cross-border pilot services
S4Y – The school for you
The “S4Y – The School for You” service allows families who have to choose the most suitable school for
their children to access integrated information combining data from the web portals of different schools
in a chosen city area with data from the Census of Population and Dwellings (e.g. resident
population, age, sex, level of education, marital status, and the construction period and state of
preservation of dwellings) for the same geographical area.
Geolocalized Facilities
This application provides information about geolocalised events and infrastructures to support
users' decisions: it could be used by someone visiting an unfamiliar place who wonders
where the nearest facilities of various kinds are, or what events are planned in nearby stadiums,
theatres or cultural venues. It can also be used by someone considering an investment
at local level.
Support for Environment Policies
The objective of this use case is to support local policy makers who have to take decisions about
the environmental policies to be applied in a city. In particular, local policy makers can benefit from
integrated datasets deriving from (i) sensor data concerning air pollution and (ii) statistical data
on the demographic characteristics of the city’s areas. On the basis of such data, they can
carry out analyses to help plan and govern their interventions.
13. Data Pipelines (1)
The data pipelines implemented for the pilot applications are based on two different
approaches, which converge at the end of the data workflow in the dissemination step.
The first is a generalised ETL approach, generating RDF triples from a
CSV dataset through Python/Prefect or VTL pipelines.
Main advantages:
• Openness: the code, developed using open tools, is
available in the INTERSTAT GitHub repository.
• Maximal automation, to avoid manual processing, save
time and improve traceability.
• Reproducibility, resulting from automation and code
documentation.
• Efficiency, increased by the execution of the pipeline in a
distributed environment.
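The core transformation step of such an ETL pipeline, from CSV rows to RDF triples, can be sketched with the Python standard library alone. This is a simplified illustration, not the code from the INTERSTAT repository: it leaves out Prefect orchestration, and the base namespace, column names and property IRIs are hypothetical. Only the RDF Data Cube `qb:Observation` class and the XSD integer datatype are real vocabulary terms.

```python
import csv
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
QB_OBSERVATION = "http://purl.org/linked-data/cube#Observation"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
# Hypothetical namespace, used only for this example.
BASE = "http://example.org/census/"


def csv_to_ntriples(csv_text):
    """Turn each CSV row (columns: area, population) into N-Triples lines."""
    lines = []
    for index, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        obs = f"<{BASE}obs/{index}>"
        population = int(row["population"])  # fails early on non-numeric input
        lines.append(f"{obs} <{RDF_TYPE}> <{QB_OBSERVATION}> .")
        lines.append(f"{obs} <{BASE}dim/area> <{BASE}area/{row['area']}> .")
        lines.append(
            f'{obs} <{BASE}measure/population> "{population}"^^<{XSD_INTEGER}> .'
        )
    return lines


sample = "area,population\nA1,120\nA2,340\n"
for triple in csv_to_ntriples(sample):
    print(triple)
```

Each row yields one observation with a type triple, an area dimension and a population measure, so the two-row sample produces six triples.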
14. Data Pipelines (2)
A Domain Knowledge approach, based on the description of the domain of interest through an
ontology and the definition of a logical Common Data Model to link heterogeneous data
sources with ontology concepts.
Main advantages of describing the domain of interest
through ontologies are:
• Formal and clear definition of target concepts and related
metadata
• Automatic reasoning
• Cross domain interoperability by design
• Decoupling between data structure and data semantics
• Incremental approach and cheaper data and metadata
management
• Easier linkage of new external data sources
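The decoupling between data structure and data semantics can be sketched as a mapping layer: each source keeps its own column names, and a per-source mapping links them to shared concepts. Everything in this example is hypothetical (the concept IRIs, source names and column names); it only illustrates the idea and is not the project's Common Data Model.

```python
# Shared concepts, identified by hypothetical ontology IRIs.
CONCEPTS = {
    "population": "http://example.org/ontology#residentPopulation",
    "area": "http://example.org/ontology#territorialUnit",
}

# Per-source mappings from local column names to shared concept keys.
SOURCE_MAPPINGS = {
    "source_a": {"POP_RES": "population", "COD_AREA": "area"},
    "source_b": {"pop_totale": "population", "code_zone": "area"},
}


def harmonise(source, record):
    """Re-key a source record by concept IRI, dropping unmapped columns."""
    mapping = SOURCE_MAPPINGS[source]
    return {
        CONCEPTS[mapping[column]]: value
        for column, value in record.items()
        if column in mapping
    }


a = harmonise("source_a", {"POP_RES": 120, "COD_AREA": "A1", "NOTE": "x"})
b = harmonise("source_b", {"pop_totale": 340, "code_zone": "Z9"})
print(a)
print(b)
```

Linking a new external source then only requires adding one entry to `SOURCE_MAPPINGS`, without changing the concepts or the consumers of the harmonised records.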
15. Evaluation of the outcome (1)
Measuring the impact of the action in terms of efficiency, efficacy,
relevance, reproducibility and scalability
What to measure
● Assessment dimensions
● KPIs: Qualitative/Quantitative
Stakeholders to involve
● INTERSTAT partners
● External stakeholders: IT staff,
domain experts, representatives of
statistical data publishers
How to proceed
● Performance measurement: self-assessment and external feedback
through an Assessment Survey
16. Evaluation of the outcome (2)
Provide feedback on the following dimensions:
● INTERSTAT framework: to assess the relevance of the tools provided by the framework
● Data pipelines: to evaluate the efficiency of the implemented data workflows
● Client applications/Front-end: to assess the pilots developed with the INTERSTAT tools
Contribute by filling in the survey questionnaire:
https://ec.europa.eu/eusurvey/runner/InterstatAssessmentSurvey
17. Vienna, 12-13 June, 2023 | #FIWARESummit www.fiware.org