The FAIR principles were introduced as a guideline for good scientific data stewardship. They have gained momentum at management level and are now, for example, part of the project template for EU Horizon 2020 projects. This raises the question of what research groups and projects can do to implement them. Hugo Besemer will introduce the ideas behind the FAIR principles.
3. A FAIRLY short timeline
• January 2014 Workshop in Leiden (the Netherlands)
• 2014 Results on Force11 site
• 15 March 2016 Article in ‘Scientific data’
• 26 July 2016 H2020 Programme Guidelines
• December 2016 Webinar FAIR / repositories
Guiding Principles for Findable, Accessible, Interoperable and Re-usable Data Publishing version b1.0
Discussion about indicators of ‘FAIRness’
5. What ‘FAIR’ does NOT want to be and what it wants to achieve
• It is NOT a specification
• It is NOT a syntax (it aims to be syntax agnostic)
• It is meant to precede technology and other implementation choices
• In my own words: these guidelines aim to create a research data environment that is FAIR to machines and humans
6. F: to be Findable
• F1. (meta)data are assigned a globally unique and persistent identifier
• F2. data are described with rich metadata (defined by R1 below)
• F3. metadata clearly and explicitly include the identifier of the data it describes
• F4. (meta)data are registered or indexed in a searchable resource
7. Proposed indicators F(indable)
• 1. No PID and no metadata/documentation
• 2. PID without or with insufficient* metadata
• 3. Sufficient* metadata without PID
• 4. PID with sufficient* metadata
  – Information on data provenance
• 5. PID, rich metadata and additional documentation
  – Additional explanation of how data can be used
* Sufficient = enough metadata to understand what the data is about
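The five proposed Findable levels can be read as a small decision rule. The sketch below is only an illustration of that reading, not part of the proposal itself: the function name `findable_level` and the metadata labels "none", "insufficient", "sufficient" and "rich" are hypothetical, and it assumes that insufficient metadata without a PID also counts as level 1, a case the slide does not cover.

```python
def findable_level(has_pid: bool, metadata: str, has_documentation: bool = False) -> int:
    """Map a dataset description to the proposed Findable indicator (1-5).

    `metadata` is one of "none", "insufficient", "sufficient", "rich".
    These labels are assumptions made for this sketch.
    """
    allowed = ("none", "insufficient", "sufficient", "rich")
    if metadata not in allowed:
        raise ValueError(f"metadata must be one of {allowed}")

    # "Sufficient" means enough metadata to understand what the data is about.
    enough = metadata in ("sufficient", "rich")

    if has_pid and metadata == "rich" and has_documentation:
        return 5  # PID, rich metadata and additional documentation
    if has_pid and enough:
        return 4  # PID with sufficient metadata
    if enough:
        return 3  # sufficient metadata without PID
    if has_pid:
        return 2  # PID without or with insufficient metadata
    # Assumption: no PID and no (or insufficient) metadata is level 1.
    return 1


print(findable_level(has_pid=True, metadata="sufficient"))
```

A dataset with a PID and rich metadata but no additional documentation stays at level 4 under this reading.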
8. F(indable) @ Wageningen
• Presently departments decide what data is published
• At best, this is the data underlying publications (pressure from journals helps a lot)
• There are ongoing (series of) datasets that are only known to insiders
9. A: to be Accessible
• A1. (meta)data are retrievable by their identifier using a standardized communications protocol
  • A1.1 the protocol is open, free, and universally implementable
  • A1.2 the protocol allows for an authentication and authorization procedure, where necessary
• A2. metadata are accessible, even when the data are no longer available
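One common way to meet A1 is to use a DOI as the identifier and HTTPS as the protocol. The sketch below builds (but does not send) such a request with Python's standard library. The DOI `10.1234/example-dataset` and the helper name `build_metadata_request` are hypothetical, and the sketch assumes the DOI's registration agency supports content negotiation for the CSL JSON media type.

```python
from urllib.request import Request


def build_metadata_request(
    doi: str,
    media_type: str = "application/vnd.citationstyles.csl+json",
) -> Request:
    """Build an HTTPS request that asks the DOI resolver for metadata.

    HTTPS is open, free and universally implementable (A1.1); the Accept
    header asks for machine-readable metadata instead of a landing page.
    """
    url = f"https://doi.org/{doi}"
    return Request(url, headers={"Accept": media_type}, method="GET")


# Hypothetical DOI, used only to show the shape of the request.
req = build_metadata_request("10.1234/example-dataset")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual retrieval. Where access is restricted (A1.2), an authorization header could be added in the same way.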
10. Proposed indicators A(ccessible)
1. No user license / unclear conditions of reuse / neither metadata nor data are accessible
2. Metadata are accessible (even when the data are not or no longer available)
3. User restrictions apply (of any kind, including privacy, commercial interests, embargo period, etc.)
4. Public Access (after registration)
5. Open Access (unrestricted, CC0; perhaps also CC BY?)
11. Accessible @ Wageningen
• Probably the most important problem: who decides who can get access (and who will grant the permission technically)?
• We have been awaiting guidelines on ownership / usage rights for three years.
12. I: to be Interoperable
• I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
• I2. (meta)data use vocabularies that follow FAIR principles
• I3. (meta)data include qualified references to other (meta)data
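To make I1 to I3 concrete, here is a minimal sketch of a metadata record expressed as JSON-LD with schema.org terms. This is one possible choice for illustration, not something the FAIR principles prescribe, and whether a given vocabulary itself follows the FAIR principles (I2) has to be judged separately. The helper name `dataset_record`, the example DOIs, and the dataset name and description are hypothetical.

```python
import json


def dataset_record(identifier: str, name: str, description: str,
                   license_url: str, based_on: str) -> dict:
    """Return a JSON-LD description of a dataset using schema.org terms."""
    return {
        "@context": "https://schema.org",   # shared vocabulary (I1, I2)
        "@type": "Dataset",
        "@id": identifier,
        "identifier": identifier,           # metadata carries the data's identifier
        "name": name,
        "description": description,
        "license": license_url,
        # Qualified reference (I3): the property name states *how* this
        # dataset relates to the other resource.
        "isBasedOn": {"@id": based_on},
    }


record = dataset_record(
    identifier="https://doi.org/10.1234/example-dataset",   # hypothetical
    name="Example field trial measurements",                # hypothetical
    description="Illustrative record, not a real dataset.",
    license_url="https://creativecommons.org/publicdomain/zero/1.0/",
    based_on="https://doi.org/10.1234/example-source",      # hypothetical
)
print(json.dumps(record, indent=2))
```

The point of the `isBasedOn` entry is that the link is typed: a bare URL would tell a machine that two records are related, but not in what way.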
13. Proposed indicators I(nteroperable)
1. Proprietary, non-open format data
2. Proprietary format, accepted by DSA Certified Trusted Data Repository
3. Non-proprietary, open format (= “preferred” or “archival” format)
4. Data is additionally harmonized / standardized, using standard vocabularies
5. Data is additionally linked to other data to provide context
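The Interoperable levels are cumulative ("additionally"), which can be sketched as follows. The function name `interoperable_level` and its parameters are hypothetical. The sketch assumes that levels 4 and 5 build on an open format and that level 5 requires level 4; the slide implies this but does not state it.

```python
def interoperable_level(open_format: bool,
                        accepted_by_repository: bool = False,
                        standard_vocabularies: bool = False,
                        linked_to_other_data: bool = False) -> int:
    """Map a dataset description to the proposed Interoperable indicator (1-5)."""
    if not open_format:
        # Proprietary formats: level 2 only if a certified repository accepts them.
        return 2 if accepted_by_repository else 1
    if standard_vocabularies and linked_to_other_data:
        return 5  # harmonized and linked to other data for context
    if standard_vocabularies:
        return 4  # harmonized / standardized with standard vocabularies
    # Assumption: links without standard vocabularies do not raise the level.
    return 3      # non-proprietary, open format


print(interoperable_level(open_format=True, standard_vocabularies=True))
```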
14. I(nteroperable) @ Wageningen
• In response to a blog post about this, the people working with ontologies met for the first time
• Their main concerns:
  • How to find the relevant ontologies
  • Can we rely on them to justify investments (consistency, process of maintenance)?
• H2020 coordinators have no clue what all this is about
15. R: to be Reusable
• R1. (meta)data are richly described with a plurality of accurate and relevant attributes
  • R1.1. (meta)data are released with a clear and accessible data usage license
  • R1.2. (meta)data are associated with detailed provenance
  • R1.3. (meta)data meet domain-relevant community standards
(Slide callouts on overlap with other principles: also in F4; also in F2, I1; also in I1)
16. Proposed indicators R(e-usable)
“First we attempted to operationalise R – Re-usable as well ... but we changed our mind.
Reusable – is it a separate dimension? Partly subjective: it depends on what you want to use the data for!”
17. References
Guiding principles for findable, accessible, interoperable and re-usable data publishing version B1.0
https://www.force11.org/fairprinciples
The FAIR Guiding Principles for scientific data management and stewardship
https://www.nature.com/articles/sdata201618
Guidelines on FAIR Data Management in Horizon 2020
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
FAIR Data in Trustworthy Data Repositories Webinar
https://eudat.eu/events/webinar/fair-data-in-trustworthy-data-repositories-webinar
Two blogs about FAIR @ Wageningen
• https://weblog.wur.eu/openscience/can-wageningen-fair/
• https://weblog.wur.eu/openscience/vocabularies-and-the-i-in-fair-data-principles/