1. Pharmas and academia join forces
to make data FAIR
FDA/NCTR-MAQC 2022 Conference, 26-27 Sept, 2022
Slides: https://www.slideshare.net/SusannaSansone
Professor of Data Readiness
Associate Director, Oxford e-Research Centre
Interoperability Platform
ExCo
elixir-europe.org
Founding
Academic Editor
nature.com/sdata
datareadiness.eng.ox.uk
Susanna-Assunta Sansone
ORCiD: 0000-0001-5306-5690
Twitter: @SusannaASansone
3. The FAIR Principles
Globally unique and
persistent identifiers
Community defined
descriptive metadata
Community defined
terminologies
Detailed
provenance
Terms of access
Terms of
use
5. FAIR as a driver of the digital transformation
in pharmas
● To improve biopharma R&D productivity
● To enable powerful new AI analytics to
access data for ML and prediction
● Requirements
o financial, technical, training
● Challenges
o change the culture, show business
value, achieve the ‘FAIR enough’ on
an enterprise scale
7. The FAIR Cookbook:
motivations and ambitions
beyond the hype
Motivations
Large body of generic FAIR guidance
Non-specific guidance for the life sciences
Lack of practical examples of ‘how-to’ with different data types and scenarios
Ambitions
Target specific situations to deliver a guide with applied examples
Join academia and industry forces to make the case for FAIR data management
Build capacity for high quality data management in the private and public sectors
10. What is it?
A collection of recipes that cover the
operation steps of FAIR data management
11. Who is it for?
Data Managers, Data Stewards, Data Curators
Software Developers, Terminology Managers
Policymakers, Funders, Trainers
Researchers, Data Scientists, Principal Investigators
12. Who is it for?
Data Managers, Data Stewards, Data Curators
Software Developers, Terminology Managers
• A venue to document and share existing and new approaches or services to support FAIRification
• A way to promote a participatory culture that enables sharing of expertise by getting exposure and credit
Policymakers, Funders, Trainers
• Practical examples to recommend in policies
• To use in educational material to incentivize and guide FAIR in practice
Researchers, Data Scientists, Principal Investigators
• Introductory material
• Hands-on, technical step-by-step examples
13. Who developed it?
Almost 100 life sciences professionals, researchers and data managers
FAIRplus
partners
Industry
+
Academia
ELIXIR
Nodes
represented
14. Current operations and Editorial Board
Content prioritisation
Identification of topics
Review of drafts
Call for contributions
Monthly book-dash
events
Pre-defined focus areas
Breakout on topics
Housekeeping
Technical platform
Website Martin Cook
Dominique Batista
Office of Data
Science Strategy
15. Coverage and learning objectives
● Over 70 recipes released and more
content available
● Covering technical processes with
FAIRification examples in the life
sciences:
○ Omics
○ pre-clinical
○ clinical areas
But not limited to these!
Learn how to improve the FAIRness with exemplar datasets
Understand the levels and indicators of FAIRness
Discover open source technologies, tools and services
Find out the required skills
Acknowledge the challenges
16. Anatomy of a recipe
components
Ingredients
An idea of tools/skills needed
Step by step process
Guidelines, process, description
Practical elements, code snippets
# Python3
# zooma-annotator-script.py file
def get_annotations(propertyType, propertyValues, filters=""):
    ...
Examples
Conclusions
What should I read next?
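The code-snippet component of a recipe can be illustrated with a small sketch in the spirit of the ZOOMA annotator excerpt on the slide. This is not the recipe's actual script: the helper names `build_annotation_url` and `build_annotation_urls` are hypothetical, and the endpoint and query parameter names are assumptions based on the public ZOOMA service at EMBL-EBI, so they should be checked against the ZOOMA documentation before use. The sketch only builds the query URLs and makes no network call.

```python
from urllib.parse import urlencode

# Assumption: public ZOOMA annotation endpoint; verify against the ZOOMA docs.
ZOOMA_ANNOTATE_URL = "https://www.ebi.ac.uk/spot/zooma/v2/api/services/annotate"


def build_annotation_url(property_type, property_value, filters=""):
    """Build the query URL for one free-text value (hypothetical helper)."""
    params = {"propertyValue": property_value}
    if property_type:
        # Optional hint about what the value describes, e.g. "disease".
        params["propertyType"] = property_type
    if filters:
        # Optional filter string, passed through unchanged.
        params["filter"] = filters
    return ZOOMA_ANNOTATE_URL + "?" + urlencode(params)


def build_annotation_urls(property_type, property_values, filters=""):
    """Build one query URL per value, keeping the input order."""
    return [
        build_annotation_url(property_type, value, filters)
        for value in property_values
    ]


if __name__ == "__main__":
    for url in build_annotation_urls("disease", ["lung cancer", "asthma"]):
        print(url)
```

Each URL could then be fetched with any HTTP client and the JSON response inspected for suggested ontology terms; that step is left out here so the sketch runs offline.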
19. Tagging recipes with
‘Dataset Maturity Indicators’
Maturity level and indicators
new feature!
https://fairplus.github.io/Data-Maturity
Provide insights into FAIR Maturity reached by
applying a specific recipe to improve a
dataset
21. The Cookbook platform
open source community practices
Built using Jupyter Book, following the practice of The Turing Way,
the Alan Turing Institute’s open source, community-driven guide
to reproducible, ethical, inclusive and collaborative data science
Technology stack:
● GitHub for version control and hosting
● Markdown for the write-up
● HackMD Markdown editor, integrated with GitHub
● Jupyter Notebook for executable code
● Binder for web execution of the Jupyter notebooks
distributed with a recipe
● Mermaid JavaScript library for flowcharts, Gantt charts
and pie charts
https://the-turing-way.netlify.app/welcome
22. The FAIRness of the Cookbook
Accessibility
HTTPS protocol
Interoperability
JSON-LD markup
Cross-links to objects in other
registries
From the ELIXIR ecosystem and beyond!
CRediT attribution ontology
Reusability
CC BY 4.0 license for all
content
Findability
Sitemap.xml, JSON-LD
Markup with Schema.org,
Bioschemas
w3id.org unique persistent
identifiers for each recipe
ORCID for authors
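As a rough illustration of how several of these elements fit together, the sketch below assembles a Schema.org JSON-LD record for a single recipe: a w3id.org identifier, an ORCID-based author reference and the CC BY 4.0 license. It is not the Cookbook's actual markup. The function `recipe_jsonld` is hypothetical, the choice of the `LearningResource` type is an assumption, and the title, author name and ORCID in the usage example are placeholders.

```python
import json

CC_BY_4 = "https://creativecommons.org/licenses/by/4.0/"


def recipe_jsonld(identifier, name, author_name, author_orcid):
    """Return a Schema.org JSON-LD dict for one recipe (illustrative only)."""
    return {
        "@context": "https://schema.org",
        # Assumption: the real markup may use a different type or profile.
        "@type": "LearningResource",
        "@id": identifier,
        "identifier": identifier,
        "name": name,
        "license": CC_BY_4,
        "author": {
            "@type": "Person",
            "@id": "https://orcid.org/" + author_orcid,
            "name": author_name,
        },
    }


if __name__ == "__main__":
    doc = recipe_jsonld(
        "https://w3id.org/faircookbook/FCB010",
        "Example recipe title",   # placeholder title
        "Jane Doe",               # placeholder author
        "0000-0000-0000-0000",    # placeholder ORCID
    )
    # JSON-LD is typically embedded in the page head inside a script tag.
    print('<script type="application/ld+json">')
    print(json.dumps(doc, indent=2))
    print("</script>")
```

Embedding such a block in each recipe page is one common way to expose metadata to search engines and registries that harvest Schema.org or Bioschemas markup.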
25. Content type overview
Recipes focused on
each technical aspect,
and applicable to any
data type
Recipes specific to a
topic or data type
26. Define what your needs are
Goal: improving visibility of content, e.g.:
Goal: semantic integration of datasets from multiple sources, e.g.:
Goal: compliance with security requirements and with regulators, e.g.:
https://w3id.org/faircookbook/FCB010
https://w3id.org/faircookbook/FCB007
https://w3id.org/faircookbook/FCB006
https://w3id.org/faircookbook/FCB020 https://w3id.org/faircookbook/FCB004
https://w3id.org/faircookbook/FCB014 https://w3id.org/faircookbook/FCB035
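The recipe links above share one shape: the w3id.org base path followed by a recipe identifier. A minimal sketch of helpers for moving between identifiers and links is given below. The function names are hypothetical, and the pattern "FCB plus three digits" is an assumption inferred only from the links shown on this slide.

```python
import re

W3ID_BASE = "https://w3id.org/faircookbook/"
# Assumption: identifiers look like FCB010, as in the links on the slide.
RECIPE_ID = re.compile(r"FCB\d{3}")


def is_recipe_id(text):
    """True if the text matches the assumed identifier pattern."""
    return RECIPE_ID.fullmatch(text) is not None


def recipe_url(recipe_id):
    """Build the persistent link for a recipe identifier."""
    if not is_recipe_id(recipe_id):
        raise ValueError("not a recipe identifier: " + repr(recipe_id))
    return W3ID_BASE + recipe_id


def recipe_id_from_url(url):
    """Extract the recipe identifier from a persistent link."""
    if url.startswith(W3ID_BASE):
        candidate = url[len(W3ID_BASE):]
        if is_recipe_id(candidate):
            return candidate
    raise ValueError("not a recipe link: " + repr(url))


if __name__ == "__main__":
    print(recipe_url("FCB010"))
    print(recipe_id_from_url("https://w3id.org/faircookbook/FCB035"))
```

Citing recipes through these persistent links rather than through the website's page URLs keeps references stable if the site layout changes.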
30. What has motivated so many people
to contribute?
● To stay engaged in a growing community and up to date with the latest
developments
● To prove their FAIR competence (as individuals and as organisations)
● To expand their network of potential collaborators, clients and users
● To address common challenges and find common solutions at a
pre-competitive level
31. Inclusion criteria
Proof of competence, e.g.:
● Involved in data-centric projects as data managers
● Authored publications, datasets, reports, technical documentation...
● (For industry) provided services as data curators, data vendors...
32. Become part of a
community of FAIR experts!
1 Identify a chapter and a topic
Findability Accessibility Interoperability Reusability
Infrastructure Applied examples Assessment
2 Choose a way of contributing and see our guidelines
Google Docs
HackMD
Git
Markdown cheat sheet
Get recipe template
Tips and tricks
3 Submit an outline
You can discuss it with the Editorial Board
33. What to contribute?
• Contribute test datasets
• Contribute examples
• Contribute code
Found a coverage gap?
Want to share your expertise?
Topics of interest:
• ETL to RDF solutions
• KG generation from unstructured data
• GDPR-compliant document
• Input validation & RDF shape constraint
validation
• Specific FAIRification process
34. What a good recipe should
and should not be
Should
• Specific: should target a specific task/action towards the improvement of FAIR indicators (e.g. mapping data to a certain ontology, converting files from one format to another, anonymising a dataset for better reusability)
• Complete: should be an end-to-end recipe that a user can follow to finish a task
• FAIR: the tools used should be open, or, if proprietary, a “free” or “community” version should be available (maybe with limited functions/support) and be FAIR
Should Not
• Too broad: should not be a repeat of a full user manual
• Too high level: should not be a features list of a tool; should not be an advertisement
• Incomplete: should not be just a teaser that only shows a few steps at the beginning
• Closed: the user can only test it after purchasing paid software
35. What a good recipe should
and should not be
Should
• Specific: should target a specific task/action towards the improvement of FAIR indicators (e.g. mapping data to a certain ontology, converting files from one format to another, anonymising a dataset for better reusability)
• Complete: should be an end-to-end recipe that a user can follow to finish a task
• FAIR: the tools used can be proprietary, but a “free” or “community” version should be available (maybe with limited functions just to finish the task, or without support) and be FAIR
Should Not
• Too broad: should not be a repeat of a full user manual
• Too high level: should not be a features list of a tool; should not be an advertisement
• Incomplete: should not be just a teaser that only shows a few steps at the beginning
• Closed: the user can only test it after purchasing paid software
37. Utility and value based on three uses:
what we have learned
● As educational material on FAIR in a training context (as part of the
FAIRplus Fellowship Programme):
o showed the validity of the recipes’ content towards the intended
(learning) objectives
o confirmed that some recipes require more technical
background knowledge and have a steeper learning curve
38. Utility and value based on three uses:
what we have learned
● As practical guidance to improve day-to-day tasks for FAIRer data
● As a contributor towards changing the culture in research data
management (behind the pharmas’ firewall)
o outcomes were expressed in terms of satisfaction with the value of the
recipes against specific tasks or challenges addressed
o they reported a positive contribution towards their discussions on return
on investment to operationalize FAIR
40. Sustainability: working on four fronts
Content Infrastructure Embedding Endorsements
Cultivating the
collective knowledge
Light weight
41. Thanks to
Editorial Board
Section Editors
FAIRplus partners
All book-dash participants
All authors
fairplus-cookbook@elixir-europe.org
faircookbook.elixir-europe.org
fairplus-project.eu
This project has received funding from the Innovative Medicines Initiative Joint Undertaking under grant agreement No 802750. This Joint Undertaking receives support from the European Union’s Horizon 2020 research and innovation programme and EFPIA Companies. This communication reflects the views of the authors and neither IMI nor the European Union, EFPIA or any Associated Partners are liable for any use that may be made of the information contained herein.