(Weather) data standards assessed against the FAIR principles: some findings


The presentation was delivered at the pre-meeting of the RDA Interest Group on Agricultural Data (IGAD) in Berlin on 19/3/2018.

The presentation illustrates the findings of a gap analysis study on weather data standards, viewed through the lens of the FAIR principles.
Data standards and vocabularies can be assessed against the FAIR principles: after all, the FAIR principles recommend that “(meta)data use vocabularies that follow FAIR principles”.
The GODAN Action project created a map of data standards relevant for food and agriculture and did a first gap analysis focusing on weather data. GODAN Action is a three-year project to enable data users, producers and intermediaries to engage effectively with open data and maximize its potential for impact in the agriculture and nutrition sectors.
The criteria used for the gap analysis were organized in four categories: (a) fitness for purpose, (b) adoption, (c) usability and (d) openness. The criteria in the latter two categories can be used to assess the standards against the FAIR principles, as they all try to measure to what extent a standard is findable (is it available on the web? Is it annotated?), accessible (is it maintained? Is it referenceable?), interoperable (is it available in more formats? Is it machine-readable? Is it semantic, referenceable, linked?) and reusable (does it have a clear license?).
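As a rough illustration of how the usability and openness criteria could be rolled up into the four FAIR facets, here is a minimal Python sketch. The criterion names, the `fair_coverage` function and the example answers are assumptions made for illustration; only the grouping of criteria under findable, accessible, interoperable and reusable follows the paragraph above, and the scoring (share of criteria met per facet) is not the method used in the study.

```python
from typing import Dict, Set

# Criterion -> FAIR facets, following the grouping in the paragraph above.
# The snake_case names are hypothetical labels, not taken from the study.
CRITERIA_TO_FAIR: Dict[str, Set[str]] = {
    "available_on_web": {"F"},
    "annotated": {"F"},
    "maintained": {"A"},
    "referenceable": {"A", "I"},
    "multiple_formats": {"I"},
    "machine_readable": {"I"},
    "semantic": {"I"},
    "linked": {"I"},
    "clearly_licensed": {"R"},
}


def fair_coverage(answers: Dict[str, bool]) -> Dict[str, float]:
    """Return, per FAIR facet, the share of its criteria answered 'yes'."""
    coverage: Dict[str, float] = {}
    for facet in "FAIR":  # iterates over "F", "A", "I", "R"
        relevant = [c for c, facets in CRITERIA_TO_FAIR.items() if facet in facets]
        met = sum(1 for c in relevant if answers.get(c, False))
        coverage[facet] = met / len(relevant)
    return coverage


# Made-up answers for an imaginary standard (not a result from the gap analysis).
example_answers = {
    "available_on_web": True,
    "annotated": False,
    "maintained": True,
    "referenceable": True,
    "multiple_formats": False,
    "machine_readable": True,
    "semantic": False,
    "linked": False,
    "clearly_licensed": True,
}

print(fair_coverage(example_answers))
```

A per-facet share keeps the four letters separate instead of collapsing them into a single score, which makes it easier to see where a given standard falls short.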
However, even more important is the extent to which a data standard or vocabulary contributes to making data that adopts it more FAIR. In some contexts and for some users (data providers, intermediaries) the FAIRness of the standards themselves is very relevant, while for other users (service providers, end users) the FAIRness of the produced data is what matters.
While it’s implied that the use of FAIR vocabularies and data standards contributes to the “I” in FAIR, certain vocabularies also help data with the “F” (e.g. vocabularies that have properties for identifiers or for dataset metadata), “A” (e.g. vocabularies that have properties for protocols) and “R” (e.g. vocabularies that describe licensing and provenance).
The presentation gives some examples of these assessments applied to different types of weather data standards.
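The idea that a vocabulary's features determine which FAIR facets it can support in the data using it can be sketched as a simple lookup. This is only an illustrative sketch: the feature names and the `facets_supported` function are hypothetical, and the mapping merely restates the examples given above (identifier and dataset-metadata properties for “F”, protocol properties for “A”, licensing and provenance for “R”, with “I” implied by using a FAIR vocabulary at all).

```python
from typing import Iterable

# Vocabulary feature -> FAIR facet it helps the data with.
# Feature names are hypothetical labels chosen for this example.
FEATURE_TO_FACET = {
    "identifier_properties": "F",
    "dataset_metadata": "F",
    "protocol_properties": "A",
    "license_properties": "R",
    "provenance_properties": "R",
}


def facets_supported(features: Iterable[str], vocabulary_is_fair: bool = True) -> str:
    """Return the FAIR letters (in F, A, I, R order) a vocabulary helps data with."""
    facets = set()
    for feature in features:
        if feature not in FEATURE_TO_FACET:
            raise ValueError(f"unknown feature: {feature}")
        facets.add(FEATURE_TO_FACET[feature])
    if vocabulary_is_fair:
        # Using a FAIR vocabulary is taken to contribute to interoperability.
        facets.add("I")
    return "".join(letter for letter in "FAIR" if letter in facets)


# An imaginary vocabulary with identifier and license properties only.
print(facets_supported({"identifier_properties", "license_properties"}))
```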

Published in: Data & Analytics

  1. (Weather) data standards assessed against the FAIR principles: some findings
     Valeria Pesce
  2. Data standards and FAIR principles
     Data standards and vocabularies can be assessed against the FAIR principles. Indeed, the FAIR principles recommend that:
     • I2 “(meta)data use vocabularies that follow FAIR principles” [I]
  3. FAIR – quick recap
     TO BE FINDABLE:
     • F1. (meta)data are assigned a globally unique and eternally persistent identifier.
     • F2. data are described with rich metadata.
     • F3. (meta)data are registered or indexed in a searchable resource.
     • F4. metadata specify the data identifier.
     TO BE ACCESSIBLE:
     • A1. (meta)data are retrievable by their identifier using a standardized communications protocol.
     • A1.1. the protocol is open, free, and universally implementable.
     • A1.2. the protocol allows for an authentication and authorization procedure, where necessary.
     • A2. metadata are accessible, even when the data are no longer available.
     TO BE INTEROPERABLE:
     • I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
     • I2. (meta)data use vocabularies that follow FAIR principles.
     • I3. (meta)data include qualified references to other (meta)data.
     TO BE RE-USABLE:
     • R1. (meta)data have a plurality of accurate and relevant attributes.
     • R1.1. (meta)data are released with a clear and accessible data usage license.
     • R1.2. (meta)data are associated with their provenance.
     • R1.3. (meta)data meet domain-relevant community standards.
  4. GODAN Action gap analyses on data standards
     The GODAN Action project created a map of data standards relevant for food and agriculture and did a first gap analysis on the use and usability of these standards, focusing initially on one type of data: weather data.
     Map of agri-food data standards:
     Weather data standards:
  5. Criteria for gap analysis
     Criteria developed based on:
     • The assessment process used by the UK Government’s Open Standards Board
     • The ODI Open Data Certificates criteria
     Categories of assessment:
     • Fitness for purpose
     • Adoption
     • Usability
     • Openness
     Usability and Openness are close to the TBL 5 stars criteria and the FAIR criteria.
  6. Fitness and adoption criteria: NOT in 5star and FAIR
     (columns: UK gov, ODI, 5stars, FAIR)
     FIT
     • Complete: X
     • Authoritative
     • Largely compatible: X
     ADOPTED
     • Known
     • Used in software: X
     • Used in datasets: X
     • Endorsed: X
     • Regulatory: X
     • Long-term, sustainable: X
     • Participatory, collaborative: X
     • Maintained
     But FAIR R1.3. (meta)data meet domain-relevant community standards
  7. Usability and openness: 5star and FAIR
     (columns: UK gov, ODI, 5star, FAIR)
     USABLE & OPEN
     • Discoverable: X; FAIR: F
     • Available on the web: X; 5star: 1; FAIR: F, A
     • Versatile: X
     • Served by APIs
     • Manageable
     • Documented: X
     • Supported: X
     • Testable: X
     • Machine-readable: X; 5star: 2-3; FAIR: I
     • Meaningful: X; FAIR: I
     • Referenceable: X; 5star: 4; FAIR: F, I
     • Linked: X; 5star: 5; FAIR: I
     • Annotated: X
     • Clearly licensed: X, X; FAIR: R
     • Openly licensed: 5star: 1
  8. FAIR standards / Standards FAIRifying data
     FAIR standards:
     • Are identified by a unique ID, e.g. URI [F]
     • Are registered in a catalog (or self-discoverable?) [F]
     • Have metadata, have the ID in the metadata [F]
     • Are accessible at a URL resolved based on the ID [A]
     • Are accessible through an open protocol [A]
     • Are represented in a formal language (schema and format) [I]
     • Use properties to link to / extend other properties [I]
     • Use properties to describe their license [R]
     Standards FAIRifying data:
     • Provide an ID metadata element / property, better if universally unique [F]
     • Provide dataset metadata elements [F]
     • Prescribe a formal language (schema, format) [I]
     • Provide elements for linking to other entities [I]
     • Provide elements to describe and refer to usage licenses [R]
     • Provide elements to represent provenance [R]
     FAIR standards don’t help much with the [A] of data, except when:
     • (e.g. RDF) there is a framework behind the standard (Linked Data) that prescribes the resolution of URIs
     • (e.g. OGC) there is a related standard for the data service that prescribes resolving the identifier to the dataset
  9. Major weather data standards examined
  10. Assessment of openness and FAIRness (1)
  11. Assessment of openness and FAIRness (2)
  12. FAIRness of standards and actual use!
     • Sometimes the most used standards are the least FAIR! (NetCDF, BUFR, GRIB, METAR compared to CSML, OGC and W3C ones)
       FITNESS AND ADOPTION ARE MORE IMPORTANT THAN FAIRNESS, especially for long-standing non-FAIR standards
       Covered by R1.3. (meta)data meet domain-relevant community standards?
     • Sometimes data encoded with less FAIR standards are preferred by developers! (OpenWeatherMap response on one side, NetCDF on the other side; compared to OGC XML and OGC/W3C ontologies)
       SIMPLER IS SOMETIMES PREFERRED TO FAIR
     • The level of FAIRness of data standards doesn’t necessarily correspond to the level of FAIRness of data using them
  13. Thank you
     Valeria Pesce