NFDI Physical Sciences Colloquium - FAIR

FAIR data: no longer optional, but it takes a village!
Susanna-Assunta Sansone, PhD
Academic Lead for Research Practice,
Professor of Data Readiness, Engineering Science
Associate Director, Oxford e-Research Centre
ELIXIR
Interoperability Platform Co-Lead
elixir-europe.org/platforms/interoperability
Founding
Academic Editor
nature.com/sdata
NFDI Physical Sciences Joint Colloquium, January 9, 2022
Slides: https://www.slideshare.net/SusannaSansone
datareadiness.eng.ox.ac.uk
0000-0001-5306-5690
@SusannaASansone
susanna-assunta.sansone@oerc.ox.ac.uk

Outline
Brief history of the FAIR Principles and FAIR awareness
Challenges and next steps
Highlights from the life sciences and ELIXIR

Acknowledgements
In particular slides from:
Carole Goble, Philippe Rocca-Serra, and Allyson Lister
My team* and our collaborators in many projects, working groups, advisory boards, incl.:
* https://datareadiness.eng.ox.ac.uk/#people

A set of principles to enhance the
value of all digital resources and their
reuse by humans and machines
Data that is discoverable and usable at scale

Discoveries are made using shared data and this requires data that are:
• Cited and stored to be discoverable
• Retrievable and structured in standard format(s)
• Richly described to be understandable
Rationale behind the FAIR Principles
https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/#276a35e6f637
Data preparation accounts for about 80% of the work of data scientists

The FAIR Principles in a nutshell

A set of principles … not a standard
To enhance the value of all digital resources and their
reuse by humans and machines
A continuum of increasing reusability, via many different
implementations
Relaunch a dialogue with researchers and policy makers.
The FAIR Principles: just guiding principles

doi.org/10.2777/1524
www.gov.uk/government/publications/open-research-data-task-force-final-report
www.turing.ac.uk/research/impact-stories/changing-culture-data-science
FAIR has de facto become a global norm
www.fair-access.net.au
doi.org/10.1787/25186167

The scholarly publishing
ecosystem is changing
Data-related mandates by funders and
institutions are growing
Researchers need recognition
and credit for data, software
and all research outputs
Human-machine and AI
collaboration is the future
Reproducibility of published studies
should be business as usual
The data driven revolution

• Publishers a “leverage point”
• Data is an integral part of scholarly communications
• FAIR as a business opportunity, e.g. data support services, data publication tools
Data journals and data articles
• Incentive, credit for sharing
- Big and small data
- Negative results
- Long tail of data
- Curated aggregation
• Peer review of data
• Discoverability and reusability
- Complementing community databases
FAIR-enabling data journals and publishers’ services

https://doi.org/10.1371/journal.pcbi.1007854
Software
Training material
Extending the reach of FAIR to other digital objects

FAIR data stakeholders: it takes a village
Personal, project, organizational, and public responsibilities
Researchers and
company scientists who
generate and use data
Service providers who
manage data
and infrastructure
from local to global
from public to
commercial
Authorities who set
community policy,
practice, resources,
compliance and global
sustainability
Funders, policy makers,
publishers, professional
societies, standards
organisations, institutions
Data Stewards and
Research Software
Engineers who support
data and data analytics
Programme and
institute directors who
set local policy,
methodology, practice,
resources and local
sustainability, drive
change management

Personal
Organisation -
School
Project - Lab,
Consortia
Public
The FAIR village: players and responsibilities
Example
Example

Research Culture Programme
Research
Practice
Enabling researchers to
do reliable, reproducible,
and transparent
research
Valuing
Contributions
Recognising a diversity
of talents, skills, &
outputs, and evaluating
them fairly
A partnership of academics and professional services,
supported by the Pro-Vice Chancellor of Research
Careers
Supporting researcher
careers by focusing
on career destinations
Priorities for advancing R&I culture at Oxford
https://staff.admin.ox.ac.uk/article/research-culture-at-oxford-improving-research-practices-and-supporting-research-careers

What’s good
for research
What’s good for
research careers
• Collaboration
• Diverse skills
• Openness & transparency
• Rigour
• Speed
• Novelty
• Ground-breaking results
• Ownership
• Self-interest
Research Culture Programme:
support what we reward and value

Royal Society
(Oct 2018)
Nuffield
(Dec 2014)
Wellcome Trust
(Jan 2020)
BEIS
(July 2021)
• Research Integrity
• Open Research Data
• Career Development of Researchers
• Openness in Animal Research
• Engaging the Public with Research
• Advancement of Knowledge Exchange in
Higher Education
• Technician Commitment
• SF Declaration on Research Assessment (DORA)
• Leiden Manifesto on Research Metrics
• Guidance for Safeguarding in International Development
Research
• Race Equality Charter
• Athena Swan Charter
Sector concordats Agreements Community principles
UK Gov
(July 2022)
Research Culture Programme:
integrating and simplifying sector requirements

Research Practice pillar
Instruments: core training, pilot projects and policies
First focus: plan, execute, report research:

Research Practice pillar:
awareness and incentives
FAIR as a love note to yourself!

23
Nodes
220+
Orgs
Towards a federated digital infrastructure for
Life Science data, coordinating national
capabilities
Data & software FAIR and as open as possible
transnational access and analysis
Gateway Communities of Practice,
European and Global initiatives,
Standards Bodies
Hub
elixir-europe.org
European research infrastructure for Life Science

The ELIXIR Interoperability Platform (EIP)
Food & Nutrition
+Toxicology
elixir-europe.org/platforms/interoperability
Deals with the challenges of
delivering FAIR data,
working with FAIR data, and
enabling its actual reuse

Resources
Node-provided resources and
nascent ones, annotation tools,
registries, catalogs, and
services
Standards
Generic and community-specific,
technical protocols, PIDs
schemas, reporting guidelines,
terminologies, models, formats
Methods
Good research data management,
and FAIRification design and
execution - retrospectively and
prospectively
WHY
Have practical stories to showcase, demonstrating
impact, and benefits
WHY
Systematic approach to collate knowledge, and
disseminate it to ELIXIR users and external researchers
EIP Knowledge Hub
HOW
Via a dissemination portal where users find
interoperability know-how, and use case examples
Interoperability stories and data journeys
HOW
Putting services, standards and methods in action,
showing how they can be applied to cases and data types
The EIP: the FAIR service framework

Some examples:
Projects and Communities, incl.: Global
initiatives, e.g.
NEW: RDA Life Science Infrastructure IG with Australia BioCommons,
the US NIH Office of Data Science Strategy, and H3ABioNet in Africa.
IMI2 project guidelines for
open access to publications
and research data
Funders’
guidelines
The EIP: the FAIR service framework

Interoperability stories: e.g. metadata authoring
ISA-implementing systems, internal and external to ELIXIR
• EMBL-EBI MetaboLights (Claire O’Donovan)
• FAIRDOM-SEEK (Stuart Aitken, Rafael Buono, Flore d’Anna)
• Jackson Lab (Jake Emerson / Abigail Miller)
• NASA GeneLab (Dan Berrios)
• xOMics project (Anna Neuheus)
• EMBL-EBI BioSamples (https://doi.org/10.1093/nar/gkab1046)
• Earlham Institute COPO (Rob Davey)
• InterMine (Gos Micklem)
github.com/ISA-tools/isa-api/discussions
github.com/ISA-tools/isa-api/issues
mailto: isatools@googlegroups.com
'Investigation' (the project context), 'Study' (a
unit of research) and 'Assay' (analytical
measurement) data model and serializations
(tabular, JSON and RDF)
● Experimental metadata authoring
● Compliance with metadata standards
● Formatting for submission to EBI
repositories
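The Investigation / Study / Assay hierarchy described above can be sketched as a small nested data structure with a JSON serialization. This is a minimal illustration only: the class fields, the `to_json` helper and the example values are hypothetical, and the output is not the official ISA-JSON schema nor the API of the ISA tools.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Assay:
    """An analytical measurement (hypothetical minimal fields)."""
    measurement_type: str
    technology_type: str


@dataclass
class Study:
    """A unit of research, grouping one or more assays."""
    identifier: str
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    """The project context, grouping one or more studies."""
    identifier: str
    title: str
    studies: List[Study] = field(default_factory=list)


def to_json(investigation: Investigation) -> str:
    """Serialize the nested hierarchy to plain JSON (not ISA-JSON)."""
    return json.dumps(asdict(investigation), indent=2)


# Illustrative values only; they do not describe a real dataset.
investigation = Investigation(
    identifier="INV-1",
    title="Example investigation",
    studies=[
        Study(
            identifier="STU-1",
            title="Example study",
            assays=[Assay("metabolite profiling", "mass spectrometry")],
        )
    ],
)

print(to_json(investigation))
```

Keeping the three levels as separate types mirrors the idea that the same project context can hold several studies, each with several assays, whichever serialization (tabular, JSON or RDF) is used downstream.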

rdmkit.elixir-europe.org/nels_assembly
Omics data management
Data collected from sequencer facility (Norseq) and
deposited into a shared datastore (NeLS)
Selected samples and secondary data organised into ISA
structured catalogue with metadata (FAIRDOM-SEEK)
Data processing pipelines (Galaxy) registered in
WorkflowHub
Selected data enter deposition pipelines into public archives
(ELIXIR Deposition Databases)
Secure access (Feide)
Data management planning (DSW)
Ethical, Social, and Legal Implications checklist (Trygge)
FAIR data journeys: e.g. from ELIXIR-Norway

EIP Knowledge Hub
elixir-europe.org/what-we-offer

Share
Reuse
Preserve
Analyse
Process
Plan
Collect
Detailed recipes for
making FAIR data
FAIR Data Stewardship
Guidance, writing Data
Management Plans
Guidance and context for
RDM services
Registry of standards and
registries/repositories
EIP Knowledge Hub: the FAIR RDM know-how
Training elixiruknode.org/activities/elixir-dash-fellowship

faircookbook.elixir-europe.org
faircookbook-ed@elixir-europe.org
Connect Discover
Describe
fairsharing.org
contact@fairsharing.org

Authored by almost 100 data
professionals from industry and
academia, led by ELIXIR Nodes,
with participation of USA NIH
Internationally
sustained and
adopted!
Pre-print: doi.org/10.5281/zenodo.7156792
A collection of recipes that cover the operational steps of FAIR data management

● Over 70 recipes released and
more content available
● Covering over 20 data types,
incl:
○ omics
○ pre-clinical
○ clinical areas
But not limited to these!
A live resource, open to contributions
Learn how to improve the FAIRness with exemplar datasets
Understand the levels and indicators of FAIRness
Discover open source technologies, tools and services
Find out the required skills
Acknowledge the challenges
Coordinated by an Editorial Board

Navigate recipes: define your FAIR data journey
Search wizard: faircookbook.elixir-europe.org/content/search-wizard.html

fairplus.github.io/Data-Maturity
Maturity level: how much is FAIR enough?
Provide insights into FAIR
Maturity reached by
applying a specific recipe
to improve a dataset

The FAIRification framework in a recipe
w3id.org/faircookbook/FCB079

Credit and citability of the recipes:
because all contributions matter!
CRediT
attribution ontology
w3id.org/faircookbook/FCB006

Anatomy of a recipe: components
Ingredients
An idea of tools/skills needed
Step by step process
Guidelines, process, description
Practical
elements, code
snippets
# Python3
# zooma-annotator-script.py file
def get_annotations(propertyType, propertyValues, filters = ""): "
Examples
Conclusions
What should I read next?

Links to complementary resources
Current links with and references to:
ds-wizard.org

FAIRsharing: standards, databases and policies
Guides consumers to discover, select and use these resources with confidence
Helps producers to make their resources more visible, more widely adopted and cited

COMMUNITY STANDARDS
POLICIES
by funders, journals
and other organizations
DATABASES
including repositories
and knowledgebases
Identifiers
Terminologies Guidelines
Formats
Informative and educational resource, and a service
FAIRsharing provides curated descriptions and relationship graphs of
standards, databases and policies in all disciplines

Users, adopters and collaborators include:
https://fairsharing.org/communities
An endorsed output of the
FAIRsharing WG (since 2015):
A WG (since 2015) in:
A recommended resource in EOSC reports
Users from all stakeholder groups
Researchers Developers and curators Journal publishers
Societies and Alliances
Librarians and Trainers Funders
FAIRsharing: working with and for all stakeholders

License
Maintainer(s)
Standard(s)
Database(s)
Policy(s)
API
Life cycle
status
10.25504/FAIRsharing.m3jtpg
Detailed descriptions
of the resource

Relations between
databases, standards,
and policies

Visualization of the
relationships

Translational Medicine
Clinical Developments
fairsharing.org/3519
(work in progress!)
FAIR organization profiles: building, comparing
The standards, repositories and policies each
organisation uses or endorses
fairsharing.org/organisations

Collection URL: fairsharing.org/graph/3515;
each record has a DOI
Collection URL: fairsharing.org/graph/3513;
each record has a DOI
FAIR organization profiles: across disciplines
The standards,
repositories and
policies each EOSC
Cluster uses or endorses

NEW: FAIRsharing Community Curator Programme
Curate – Influence – Gain Attribution – Engage – Learn
Funded by the:
Ambassadorship Programme
Domain experts, from EOSC clusters and worldwide, who
● Help curate content, standards, repositories and policies
relevant to their EOSC cluster, RDA group, research
domain, or area of focus
● Contribute to educational material for the users
Enquire and apply: fairsharing.org/community_curation

First cohort of 16 curators!
They gain attribution of their
work in their profile
Curate – Influence – Gain Attribution – Engage – Learn
NEW: FAIRsharing Community Curator Programme

[Diagram legend: references; gets data from; new, in progress]
EIP Knowledge Hub: building links across resources

Example: identifiers are key to FAIR, which one
should I use and how?
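As a small worked example of handling identifiers, the sketch below checks the check digit of an ORCID iD and turns a DOI into a resolvable link. It assumes the ISO 7064 MOD 11-2 checksum that ORCID documents for its iDs and the standard https://doi.org/ resolver; the function names are hypothetical, and the two identifiers are the ones that appear elsewhere in these slides.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if a hyphenated ORCID iD has a correct check character."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15].upper()


def doi_to_url(doi: str) -> str:
    """Build a resolvable link for a bare DOI via the doi.org resolver."""
    return "https://doi.org/" + doi


print(is_valid_orcid("0000-0001-5306-5690"))      # True
print(doi_to_url("10.25504/FAIRsharing.m3jtpg"))  # https://doi.org/10.25504/FAIRsharing.m3jtpg
```

A check like this only confirms that an identifier is well formed; whether it resolves to the intended person or record still has to be verified against the registry itself.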

European Research Landscape Study 2022
• Objectives:
• To collect data on data production and use by scientific disciplines and relevant sub-disciplines
• To collect and analyse information on data deposition practices, data typology and volume
• To collect data on the level of maturity with respect to FAIR data implementation
• To assess responsiveness and readiness of research data repositories in terms of implementation of
FAIR principles
• Scope:
• All fields of science
• Survey of researchers: 15066 responses
• Survey of research data repositories: 316 responses
• Desk research; case studies; FAIRness assessment
Publications Office of the European Union, 2022, https://data.europa.eu/doi/10.2777/3648 Also
https://indico.lip.pt/event/1249/contributions/4555/

History of the problem
From the 2016 FAIR Principles paper:
These high-level FAIR Guiding Principles precede implementation choices, and do not
suggest any specific technology, standard, or implementation-solution; moreover, the
Principles are not, themselves, a standard or a specification. They act as a guide to data
publishers and stewards to assist them in evaluating whether their particular
implementation choices are rendering their digital research artefacts Findable, Accessible,
Interoperable, and Reusable.

FAIR is not a standard
It is a set of guiding principles that provide for a continuum of
increasing reusability, via many different implementations

Turning FAIR into reality requires we:
• deliver a number of research infrastructures and tools
• harmonize the standards for data and metadata
• address policies, education and training
• overcome technical, social and cultural challenges
• identify motivators, credit and rewards mechanisms
The road to FAIR data

The “cottage industry” of FAIR evaluation
https://fairassist.org
● Suffers from abundance and diversity!
○ 19 independent FAIR evaluation platforms (Oct 2022)**
○ Most are questionnaire-based, a small few are automated
○ Some are guidance, others are more judgmental
○ Some have invented their own FAIR tests and indicators
○ Even when using the same method, the results are
different!
● Six NEW evaluators appeared since Feb 2022!
** Demonstrates that certain stakeholder communities are clamoring for a solution!

From assess to assist: not to judge but to help
And not everything that can be measured matters!
Strive for the FAIR enough!
Follow your data journey
and your needs!
More importantly, in the current tools the tests
used and the results given are not comparable!

Developing guidance at European level
Collective views to shape guidance and influence policies:
outputs of the FAIR Metrics and Data Quality Task Force
doi.org/10.5281/zenodo.7390482
doi.org/10.5281/zenodo.7463421

Modified from the Strategy for Culture Change:
https://www.cos.io/blog/continuing-acceleration-new-strategic-plan
and https://zenodo.org/record/6881009#.Y2BIeuTP2F5
[Figure: culture-change strategy, with elements labelled Communities, Incentives, Infrastructure and Skills, Usability, and Policy]
D4.4 Report and recommendations on FAIR incentives and
expected impacts in the Nordics, Baltics and EOSC
https://zenodo.org/record/6881009#.Y2BIeuTP2F5
