FAIR Sample and Data Access
David van Enckevort
david.van.enckevort@umcg.nl
NIH Scientific Data SIG
Introduction
• Project Manager
• Genomics Coordination
Centre at Dept. Genetics
• Member of the group led by
Prof. M.A. Swertz
• My focus: sharing of biobank
data and material
Outline
• FAIR Principles
• Making your data and samples FAIR
• Examples of FAIR samples and data
FAIR Principles
• Findability
• Accessibility
• Interoperability
• Reusability
Make resources sustainable for reuse
doi: 10.1038/sdata.2016.18
Findable
• Resources are assigned a globally unique and
persistent identifier
• Resources are described with rich metadata
• Metadata clearly and explicitly include the identifier
of the resource they describe
• Resources can be found through other systems
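A minimal Python sketch of what these points could look like in practice: each record carries its own identifier and enough descriptive metadata to be found by a search. The class name, the field names and the second record are hypothetical, chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResourceMetadata:
    identifier: str  # globally unique, persistent identifier (e.g. a DOI)
    title: str
    keywords: List[str] = field(default_factory=list)


def find(index: List[ResourceMetadata], term: str) -> List[str]:
    """Return the identifiers of all records whose title or keywords mention the term."""
    needle = term.lower()
    return [
        record.identifier
        for record in index
        if needle in record.title.lower()
        or any(needle in keyword.lower() for keyword in record.keywords)
    ]


index = [
    ResourceMetadata(
        identifier="doi:10.1038/sdata.2016.18",
        title="The FAIR Guiding Principles for scientific data management and stewardship",
        keywords=["FAIR", "data stewardship"],
    ),
    # Hypothetical record, invented for illustration only.
    ResourceMetadata(
        identifier="example:collection-001",
        title="Rare disease sample collection",
        keywords=["biobank", "rare disease"],
    ),
]

print(find(index, "rare disease"))  # ['example:collection-001']
```

Because the identifier is part of the metadata record itself, any system that indexes the record can point back to the resource.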
Accessible
• Retrievable by their identifier using a standardized
communications protocol
• The protocol is open, free, and universally
implementable
• The protocol allows for an authentication and
authorization procedure, where necessary
• Metadata are accessible, even when the data are no
longer available
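The last two points can be sketched in Python as follows. This is an illustration under stated assumptions, not a real service: an in-memory dictionary stands in for the communications protocol, and the identifiers, field names and the `authorised` flag are all hypothetical.

```python
from typing import Dict

# Hypothetical registry: identifier -> metadata plus (possibly withdrawn) data.
REGISTRY: Dict[str, dict] = {
    "example:dataset-001": {
        "metadata": {"title": "Example dataset", "access": "on request"},
        "data": [1, 2, 3],
    },
    "example:dataset-002": {
        "metadata": {"title": "Withdrawn dataset", "access": "closed"},
        "data": None,  # the data are gone, the metadata remain
    },
}


def retrieve(identifier: str, authorised: bool = False) -> dict:
    """Look up a resource by identifier.

    Metadata are always returned; data are returned only to authorised
    callers and only while the data still exist.
    """
    entry = REGISTRY[identifier]
    result = {"identifier": identifier, "metadata": entry["metadata"], "data": None}
    if authorised and entry["data"] is not None:
        result["data"] = entry["data"]
    return result


print(retrieve("example:dataset-002")["metadata"]["title"])  # Withdrawn dataset
```

Accessible does not mean open: the lookup always answers with metadata, while access to the data themselves can sit behind authentication and authorization.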
Interoperable
• Use a formal, accessible, shared, and broadly
applicable language for knowledge representation.
• Use vocabularies that follow FAIR principles
• Include qualified references to other (meta)data
Reusable
• Richly described with a plurality of accurate and
relevant attributes
• Released with a clear and accessible usage license
• Associated with detailed provenance
• Meet domain-relevant community standards
Make your samples & data FAIR
• Make knowledge explicit (do not assume people will
know things that are obvious to you)
• Include sufficient metadata (units, SOPs, access
conditions, consent)
• Provide the raw data (e.g. when you need BMI also
provide height and weight)
Step 1
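The BMI example can be sketched in Python: store the raw measurements with explicit units and derive BMI from them, so that anyone reusing the data can recompute it or derive something else. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    height_m: float   # height in metres (unit stated in the field name)
    weight_kg: float  # weight in kilograms

    @property
    def bmi(self) -> float:
        """Body mass index in kg/m^2, derived from the raw values."""
        return self.weight_kg / self.height_m ** 2


m = Measurement(height_m=1.75, weight_kg=70.0)
print(round(m.bmi, 1))  # 22.9
```

Height and weight cannot be recovered from a stored BMI alone, which is why the raw values are the ones worth sharing.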
Make your samples & data FAIR
• Use standard information models
• Encode data using ontologies
Step 2
Standard information models
• Define the minimal information you should capture
to make your data (re)usable
• Provide structure to the information
Common information models: MIABIS, MIAPE, MIAME
https://biosharing.org/
Ontologies
• Provide a well-defined and unambiguous meaning to
a term
• Provide relations between terms, e.g. ‘Breast Cancer’
is a ‘Cancer’ is a ‘Disease’
Common ontologies include: OBIB, HPO, OMIM
http://bioportal.bioontology.org/
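The 'is a' relation from the example above can be sketched in Python as a simple parent lookup. This is a toy structure, not an excerpt from OBIB, HPO or OMIM; real ontologies allow several parents per term and use stable term identifiers rather than labels.

```python
from typing import Dict, List

# Toy hierarchy taken from the slide example: child term -> parent term.
IS_A: Dict[str, str] = {
    "Breast Cancer": "Cancer",
    "Cancer": "Disease",
}


def ancestors(term: str) -> List[str]:
    """Follow the 'is a' links upwards and return every broader term."""
    result = []
    while term in IS_A:
        term = IS_A[term]
        result.append(term)
    return result


def is_a(term: str, candidate: str) -> bool:
    """True if `term` equals `candidate` or is a narrower term of it."""
    return term == candidate or candidate in ancestors(term)


print(ancestors("Breast Cancer"))        # ['Cancer', 'Disease']
print(is_a("Breast Cancer", "Disease"))  # True
```

With these relations, a search for 'Cancer' can also return samples annotated with the narrower term 'Breast Cancer'.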
• Information model defines what information to
include
• Ontologies define acceptable values
Improve interoperability and reusability
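A minimal Python sketch of this division of labour: the information model lists the fields a record must contain, and the ontology supplies the values those fields may take. The field names and value sets are hypothetical and are not taken from MIABIS or any other real model.

```python
from typing import Dict, List, Set

# Hypothetical information model: the fields every sample record must have.
REQUIRED_FIELDS: Set[str] = {"sample_id", "material_type", "diagnosis"}

# Hypothetical term lists standing in for ontology lookups.
ALLOWED_VALUES: Dict[str, Set[str]] = {
    "material_type": {"DNA", "Plasma", "Tissue"},
    "diagnosis": {"Breast Cancer", "Cancer", "Disease"},
}


def validate(record: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    for name in sorted(REQUIRED_FIELDS - set(record)):
        problems.append(f"missing field: {name}")
    for name, allowed in ALLOWED_VALUES.items():
        if name in record and record[name] not in allowed:
            problems.append(f"invalid value for {name}: {record[name]!r}")
    return problems


print(validate({"sample_id": "S1", "material_type": "DNA", "diagnosis": "Breast Cancer"}))
print(validate({"sample_id": "S2", "material_type": "blood"}))
```

The first call prints an empty list. The second reports the missing diagnosis and the free-text material type that is not a term from the agreed list.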
Make your samples & data FAIR
• Publish metadata about your samples and data
collections
• Make it available to others
Step 3
Sample Catalogue
ID-Cards
RD-Connect Sample Catalogue
• Biobanks from EuroBiobank and RD-Connect
• Integrated with ID-Card and the RD-Connect
platform
• Will list over 50 000 rare disease samples
https://samples.rd-connect.eu/
How FAIR helps
• Common structure and protocols allow the
catalogue to aggregate data from the individual
biobanks
• Common terms allow researchers to find the right
data and samples
• Linked data allows integration with the Genomics
platform and ID-Cards system
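The first two points can be sketched in Python. This is an illustration only and not how the RD-Connect Sample Catalogue is implemented; the biobank names, field names and records are hypothetical.

```python
from typing import Dict, List


def aggregate(biobanks: Dict[str, List[dict]]) -> List[dict]:
    """Merge the sample records of several biobanks into one catalogue.

    This only works without per-biobank conversion code because every
    biobank delivers records with the same structure.
    """
    catalogue = []
    for biobank, samples in biobanks.items():
        for sample in samples:
            catalogue.append({"biobank": biobank, **sample})
    return catalogue


def find_samples(catalogue: List[dict], diagnosis: str) -> List[dict]:
    """Find samples by diagnosis; relies on all biobanks using the same terms."""
    return [entry for entry in catalogue if entry["diagnosis"] == diagnosis]


# Hypothetical input from two biobanks sharing one structure and one vocabulary.
biobanks = {
    "Biobank A": [{"sample_id": "A-1", "diagnosis": "Breast Cancer"}],
    "Biobank B": [
        {"sample_id": "B-1", "diagnosis": "Breast Cancer"},
        {"sample_id": "B-2", "diagnosis": "Cancer"},
    ],
}

catalogue = aggregate(biobanks)
print([entry["sample_id"] for entry in find_samples(catalogue, "Breast Cancer")])
# ['A-1', 'B-1']
```

If one biobank had written 'breast ca.' instead of the shared term, its samples would silently drop out of the search result.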
www.umcg.nl
Thank you for your attention!

NIH Data Science Special Interest Group
