Application of recently developed FAIR metrics to the ELIXIR Core Data Resources

www.elixir-europe.org
Ricardo de Miranda Azevedo & Michel Dumontier
Institute of Data Science (IDS)
Maastricht University, the Netherlands

An international, bottom-up paradigm for
the discovery and reuse of digital content
for the machines that people use

http://www.nature.com/articles/sdata201618

FAIR: History
• The DATA FAIRPORT workshop aimed to define a minimal (yet comprehensive) framework for data discoverability, access, annotation and authoring
• The FAIR acronym was created and guiding principles were drafted for comment on the FORCE11 website
• The principles were revised during the 2015 BioHackathon in Japan
http://www.nature.com/articles/sdata201618

FAIR in a nutshell
FAIR aims to create social and economic impact by facilitating the discovery and reuse
of digital resources through a set of requirements:
• unique identifiers to retrieve all forms of digital content and knowledge
• high quality (meta)data to enhance discovery of digital resources
• use of common vocabularies to share terms and facilitate query
• use of community standards for more facile knowledge utilisation
• detailed provenance to provide context and reproducibility
• simpler terms of use to clarify expectations and intensify innovation
• deposition in appropriate repositories with high quality metadata for future content seekers
• social and technological commitments to realize reliable access

• 14 universal metrics covering each of the FAIR sub-principles. The metrics don’t dictate
any particular standards. They simply demand evidence (using protocols of the Web)
that you have met community expectations.
• Digital resource providers must provide at least one web-accessible document with
machine-readable metadata (FM-F2, FM-F3), resource management plan (FM-A2),
and any additional authorization procedures (FM-A1.2).
• They must use publicly registered identifier schemes (FM-F1A), (secure) access
protocols (FM-A1.1), knowledge representation languages (FM-I1), licenses (FM-R1.1),
provenance specifications (FM-R1.2), and community standards (FM-R1.3)
• They must evidence that their resource can be located in search results (FM-F4), that
it provides links to other (FAIR) resources (FM-I3; FM-I2), and it validates against
community standards (FM-R1.3)
http://fairmetrics.org
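
The groupings on this slide can be written down as a small lookup table. The following is a minimal sketch in Python that covers only the metric identifiers named above. The group labels and the helper name `metrics_for` are illustrative assumptions and are not part of the FAIR metrics specification.

```python
# Metric identifiers as named on the slide, grouped by the kind of evidence
# requested. Group labels are illustrative, not official terminology.
EVIDENCE_GROUPS = {
    "web-accessible document": {
        "FM-F2": "machine-readable metadata",
        "FM-F3": "machine-readable metadata",
        "FM-A2": "resource management plan",
        "FM-A1.2": "additional authorization procedures",
    },
    "publicly registered": {
        "FM-F1A": "identifier scheme",
        "FM-A1.1": "(secure) access protocol",
        "FM-I1": "knowledge representation language",
        "FM-R1.1": "license",
        "FM-R1.2": "provenance specification",
        "FM-R1.3": "community standard",
    },
    "demonstrated behaviour": {
        "FM-F4": "resource can be located in search results",
        "FM-I2": "links to other (FAIR) resources",
        "FM-I3": "links to other (FAIR) resources",
        "FM-R1.3": "validates against community standards",
    },
}


def metrics_for(principle_letter):
    """Return the sorted, de-duplicated metric IDs for one FAIR letter (F, A, I or R)."""
    prefix = f"FM-{principle_letter}"
    ids = {
        metric_id
        for group in EVIDENCE_GROUPS.values()
        for metric_id in group
        if metric_id.startswith(prefix)
    }
    return sorted(ids)


if __name__ == "__main__":
    for letter in "FAIR":
        print(letter, metrics_for(letter))
```

FM-R1.3 appears in two groups because the slide lists it both as a registered community standard and as a validation requirement; the helper de-duplicates it.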

ELIXIR Core Data Resources
• ELIXIR Core Data Resources (CDRs) are a set of European data
resources of fundamental importance to the wider life-science
community and the long-term preservation of biological data.
• CDRs are assessed across several categories:
– Scientific focus and quality of science
– Community served by the resource
– Quality of service
– Legal and funding infrastructure, and governance
– Impact and translational stories
• Details in the F1000Research ELIXIR track article 'Identifying ELIXIR Core Data Resources'.
• ELIXIR webinar: https://www.elixir-europe.org/events/elixir-webinar-elixir-core-data-resources-selection-process-and-outcomes

ELIXIR Implementation Study:
FAIRness of the current ELIXIR Core resources
Objectives
1. Develop a shared understanding of the FAIR principles
2. Apply newly available FAIR metrics/FAIR evaluation software
3. Get feedback on the evaluation procedure
4. Identify actions that would increase the FAIRness of CDRs

Key Deliverables
1. Workshops and materials including FAIR implementation guide
2. Report on the analysis of the FAIRness of each participating CDR
3. Update records in FAIRsharing.org and TeSS with results of the study
4. Develop a vocabulary to represent and publish FAIR assessments

FAIRness of the current ELIXIR Core resources:
1st Workshop: European Bioinformatics Institute (Hinxton-UK) – 01/10/2018
• Introduction of FAIR maturity indicators (aka FAIRmetrics)
• Instructions on conducting manual FAIRness assessments using FAIRshake
• Representatives of 8 ELIXIR CDRs submitted an assessment
• The assessments were reviewed by experts from the FAIRmetrics group
• Feedback was provided for each of the 8 participating ELIXIR CDRs
Materials: https://github.com/micheldumontier/fairness-assessment-workshop

Item                                               Score
Protocol to access restricted content              0.5
Persistence of resource and metadata               0.5
Provenance scheme                                  0.5
Persistent identifier                              0.38
Metadata format                                    0.38
Certificate of compliance to community standard    0.25
Linked                                             0

Figure: Distribution of the sum score of the participating CDRs (N = 8, median = 12)

Workshop Outcomes
• Substantive discussions about FAIR in the context of repositories!
– What is being evaluated: repositories or the records within?
• Domain entity descriptions are of high quality owing to depth of curation
• Repository metadata could be improved
– structured repository metadata altogether missing (bioschemas)
– Unable to locate documentation regarding the persistence of identifiers, and the
maintenance of resources in the long term
– Licenses for repository metadata, as well as for their records
• Concerns about how FAIRness assessments will be interpreted by outside parties
– FAIRshake did not have the capability to keep assessments private until completed
– Anybody could perform manual assessments that could be incomplete or wrong, and
show a lower compliance than was actually there
– Summary scores are not particularly informative for producers or consumers

2nd Workshop: European Bioinformatics Institute (Hinxton-UK) – 13/05/2019
• Preliminary results for the first round of assessments
• Presentation on the role of FAIRsharing.org
• Representatives of 5 CDRs (that did not take part in the first workshop)
• Breakout groups to promote discussion on FAIR data stewardship topics
– Minimal and ideal metadata for repositories and data records (bioschemas)
– Licensing and data stewardship plans
– Data standards, vocabularies, and participating in their evolution
• Substantial input generated from the breakout groups!

Automated FAIRness Assessments
• Powered by smartAPI and semantic web technologies
• Harvests a diverse set of metadata through HTTP operations and links in documents
• Open source and extensible!
http://w3id.org/AmIFAIR
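
To illustrate what harvesting metadata from "links in documents" can look like, here is a minimal offline sketch using only the Python standard library. It is not the evaluator's implementation. The class name `MetadataLinkParser`, the function `harvest`, and the sample page (with its example.org URL) are assumptions made for illustration. The HTTP step is left out so the example runs without network access.

```python
import json
from html.parser import HTMLParser


class MetadataLinkParser(HTMLParser):
    """Collects <link> targets and embedded JSON-LD blocks from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []    # (rel, href) pairs taken from <link> elements
        self.jsonld = []   # parsed JSON-LD documents found in <script> blocks
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("href"):
            self.links.append((attrs.get("rel") or "", attrs["href"]))
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.jsonld.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed JSON-LD rather than abort the harvest


# Hypothetical page content; a real run would first retrieve this over HTTP.
SAMPLE_PAGE = """<html><head>
<link rel="alternate" type="application/rdf+xml" href="https://example.org/record.rdf">
<script type="application/ld+json">{"@type": "Dataset", "name": "Example"}</script>
</head><body></body></html>"""


def harvest(html_text):
    """Return (links, jsonld_documents) discovered in the given HTML text."""
    parser = MetadataLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.links, parser.jsonld


if __name__ == "__main__":
    links, documents = harvest(SAMPLE_PAGE)
    print(links)
    print(documents)
```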

{
  "@context": "https://w3id.org/FAIR_Evaluator/schema#",
  "@id": "https://w3id.org/FAIR_Evaluator/evaluations/801",
  "@type": ["http://purl.org/dc/dcmitype/Dataset", "https://purl.org/fair-ontology/FAIR-Evaluation-Output"],
  "collection": "https://w3id.org/FAIR_Evaluator/collections/5",
  "primaryTopic": "https://www.ebi.ac.uk/chembl/",
  "title": "FAIRness evaluation of CHEMBL resource",
  "creator": "https://orcid.org/0000-0003-4727-9435",
  "http://purl.org/pav/version": "2019-10-17T08:22:05.000Z",
  "http://rdfs.org/ns/void#description": "FAIR Metrics Evaluation: FAIRness evaluation of CHEMBL resource; Tested identifier: https://www.ebi.ac.uk/chembl/; generated by https://orcid.org/0000-0003-4727-9435",
  "http://www.w3.org/ns/dcat#contactPoint": "https://orcid.org/0000-0003-4727-9435",
  "http://www.w3.org/ns/dcat#identifier": "https://w3id.org/FAIR_Evaluator/evaluations/801",
  "http://www.w3.org/ns/dcat#publisher": "http://fairmetrics.org",
  "evaluationInput": "…",
  "evaluationResult": "…"
}

Evaluator Schema
https://w3id.org/FAIR_Evaluator/schema

Evaluation Input
"evaluationInput": "{\"resource\": \"https://www.ebi.ac.uk/chembl/\", \"executor\": \"0000-0003-4727-9435\", \"title\": \"FAIRness evaluation of CHEMBL resource\"}",
{
  "resource": "10.25504/FAIRsharing.m3jtpg",
  "executor": "0000-0003-4727-9435",
  "title": "Evaluation of CHEMBL using FAIRsharing DOI (no http)"
}
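
The evaluation input above is a small JSON object with three fields, which the evaluation record stores as a serialized string. A minimal sketch of building and embedding such an input in Python follows. The function name `build_evaluation_input` is an assumption for illustration. How the input is submitted to the evaluator is not shown on the slide and is left out here.

```python
import json


def build_evaluation_input(resource, executor, title):
    """Serialize the three input fields shown on the slide into a JSON string."""
    fields = {"resource": resource, "executor": executor, "title": title}
    for name, value in fields.items():
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"'{name}' must be a non-empty string")
    return json.dumps(fields)


if __name__ == "__main__":
    payload = build_evaluation_input(
        resource="10.25504/FAIRsharing.m3jtpg",
        executor="0000-0003-4727-9435",
        title="Evaluation of CHEMBL using FAIRsharing DOI (no http)",
    )
    # Embedding the serialized input in a record escapes the inner quotes,
    # which is why the stored "evaluationInput" value is a quoted string.
    record = {"evaluationInput": payload}
    print(json.dumps(record, indent=2))
```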

Evaluation Output
{
  "https://w3id.org/FAIR_Evaluator/metrics/1": [{
    "@id": "http://linkeddata.systems//cgi-bin/FAIR_Tests/gen2_unique_identifier#10.25504/FAIRsharing.m3jtpg/result-2019-10-17T10:20:59+00:00",
    "http://purl.obolibrary.org/obo/date": [{
      "@value": "2019-10-17T10:20:59+00:00",
      "@type": "http://www.w3.org/2001/XMLSchema#date"
    }],
    "http://schema.org/comment": [{
      "@value": "SUCCESS: Found an identifier of type 'doi'",
      "@language": "en"
    }],
    "http://semanticscience.org/resource/SIO_000300": [{
      "@value": "1",
      "@type": "http://www.w3.org/2001/XMLSchema#int"
    }],
    "http://semanticscience.org/resource/SIO_000332": [{
      "@value": "10.25504/FAIRsharing.m3jtpg",
      "@type": "http://www.w3.org/2001/XMLSchema#float"
    }],
    "@type": ["http://fairmetrics.org/resources/metric_evaluation_result"]
  }],
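
The output excerpt above is keyed by metric URI, and each result carries a value and a comment. A minimal sketch of reducing such output to one (score, comment) pair per metric is given below. `SAMPLE_OUTPUT` is a trimmed copy of the excerpt with the closing brace added so that it parses. Reading the `SIO_000300` value as the test score is an assumption based on the excerpt, and the names `first_value` and `summarise` are illustrative.

```python
SCORE_KEY = "http://semanticscience.org/resource/SIO_000300"    # assumed to hold the score
COMMENT_KEY = "http://schema.org/comment"

# Trimmed copy of the slide excerpt, closed so that it forms a complete object.
SAMPLE_OUTPUT = {
    "https://w3id.org/FAIR_Evaluator/metrics/1": [{
        COMMENT_KEY: [{
            "@value": "SUCCESS: Found an identifier of type 'doi'",
            "@language": "en",
        }],
        SCORE_KEY: [{
            "@value": "1",
            "@type": "http://www.w3.org/2001/XMLSchema#int",
        }],
        "@type": ["http://fairmetrics.org/resources/metric_evaluation_result"],
    }],
}


def first_value(node, key):
    """Return the '@value' of the first entry under key, or None if absent."""
    values = node.get(key) or [{}]
    return values[0].get("@value")


def summarise(evaluation_output):
    """Map each metric URI to a (score, comment) pair from its last listed result."""
    summary = {}
    for metric_uri, results in evaluation_output.items():
        for result in results:
            raw_score = first_value(result, SCORE_KEY)
            score = int(raw_score) if raw_score is not None else None
            summary[metric_uri] = (score, first_value(result, COMMENT_KEY))
    return summary


if __name__ == "__main__":
    for metric, (score, comment) in summarise(SAMPLE_OUTPUT).items():
        print(metric, score, comment)
```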

Lessons learned:
• The implementation study facilitated valuable interaction between ELIXIR curators
and FAIR data experts; FAIRness assessments offer an opportunity to improve
• The ELIXIR CDRs exhibited substantial FAIRness in the records they maintain,
but metadata about the CDRs themselves needs more attention
• The ELIXIR CDRs identified areas for improvement in the FAIRness assessment
• Questionnaires are time consuming, prone to error, and need proper management
• Coupling guidance with the results of the assessment could fuel improvements
• The FAIRCDR implementation study has directly contributed to the “FAIR Evaluator Service”, an automated
state-of-the-art tool for FAIRness assessment

Next steps:
• A manuscript on the implementation study is in preparation. All representatives
involved in the implementation study will be invited to be co-authors (target
journal: F1000)
• Manuscript will provide a user-friendly guide for the implementation of the FAIR
principles for the ELIXIR CDR community

www.elixir-europe.org
@ELIXIREurope /company/elixir-europe
Thank you!
Acknowledgements: Rob Hooft, Rachel Drysdale, Mark Wilkinson,
Susanna Sansone, Peter McQuilton, Avi Ma’ayan, Daniel Clarke, and
all representatives of the ELIXIR CDRs that promptly collaborated
Michel.dumontier@maastrichtuniversity.nl
r.demirandaazevedo@maastrichtuniversity.nl
