Making Data FAIR*
Tom Plasterer, PhD
Director, Bioinformatics, Research Bioinformatics
20 Mar 2019
* Findable, Accessible, Interoperable and Reusable
What FAIR: Principles at-a-Glance
Findable:
• F1 (meta)data are assigned a globally
unique and persistent identifier
• F2 data are described with rich metadata
• F3 metadata clearly and explicitly include
the identifier of the data it describes
• F4 (meta)data are registered or indexed in a
searchable resource
The FAIR Guiding Principles for scientific data management and stewardship
Sci. Data 3:160018 doi: 10.1038/sdata.2016.18 (2016)
Accessible:
• A1 (meta)data are retrievable by their identifier
using a standardized communications protocol
• A1.1 the protocol is open, free, and universally
implementable
• A1.2 the protocol allows for an authentication and
authorization procedure, where necessary
• A2 metadata are accessible, even when the data
are no longer available
Interoperable:
• I1 (meta)data use a formal, accessible,
shared, and broadly applicable language for
knowledge representation
• I2 (meta)data use vocabularies that follow
FAIR principles
• I3 (meta)data include qualified references to
other (meta)data
Reusable:
• R1 (meta)data are richly described with a plurality
of accurate and relevant attributes
• R1.1 (meta)data are released with a clear and
accessible data usage license
• R1.2 (meta)data are associated with detailed
provenance
• R1.3 (meta)data meet domain-relevant
community standards
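As a rough illustration of how a few of these principles could be checked mechanically, the sketch below tests a metadata record for an identifier (F1), descriptive metadata (F2), a license (R1.1) and provenance (R1.2). The field names, the checks and the example record are all hypothetical stand-ins; published FAIR metrics are considerably richer than these crude proxies.

```python
# Hypothetical field names; each check is only a crude proxy for one principle.
CHECKS = {
    "F1": lambda r: str(r.get("identifier", "")).startswith(("https://", "http://", "doi:")),
    "F2": lambda r: bool(r.get("title")) and bool(r.get("description")),
    "R1.1": lambda r: bool(r.get("license")),
    "R1.2": lambda r: bool(r.get("provenance")),
}


def fair_gaps(record):
    """Return the principle labels whose proxy check fails for this record."""
    return [label for label, check in CHECKS.items() if not check(record)]


record = {
    "identifier": "https://example.org/dataset/123",  # made-up identifier
    "title": "Example study dataset",
    "description": "Illustrative record only",
    "license": "",  # no license given, so R1.1 is reported as a gap
    "provenance": {"created_by": "example-lab"},
}
print(fair_gaps(record))
```

A report like this is only a starting point for a conversation with data owners; it says nothing about whether the metadata are accurate.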
Why FAIR: Biopharma Value Proposition
Collaborative & Competitive Intelligence:
• Who do we want to partner with? Are there complementary assets to our portfolio?
• What space is too crowded and not our area of expertise?
• Greenfield situations?
Mergers, Acquisitions, Partnerships:
• How do we efficiently and deeply absorb data generated elsewhere into our systems? How
do we efficiently share?
• Does this make a smaller biotech/start-up a more viable partner?
Improved Patient Care:
• Can we share data and outcomes more efficiently in complicated trial settings (basket trials,
adaptive trials) to better engage opinion leaders and foster dialog?
• Along with Differential Privacy approaches, can we have the broader research community
help mine our data?
• How do we best reuse Real World Evidence (RWE) data in the clinic and in trial design?
Data (Ir)reproducibility:
• Can we make preclinical data (more) reproducible?
• Can we utilize data credentialization? (thanks to Dan Crowther @ Exscientia)
Why FAIR: €26bn Reasons…
When FAIR: A Brief History
Moving away from Narrative
• Nanopublications
Incubating Standards in Open PHACTS
• VoID, PROV-O
Lorentz Center Workshop
• FORCE 11 FAIR Guiding Principles
• Participants: IMI members, US researchers,
Content providers, ELIXIR; European Open
Science Cloud, Big Data to Knowledge (BD2K)
Current Status:
• FAIR Data Workshops (EU-ELIXIR nodes)
• Inclusion in Horizon 2020, NIH Advocacy
• IMI2 Data FAIR-ification Call
• Vendors getting up to speed
When FAIR: Community Awareness
Linked Data Community of Practice
How familiar are you with the FAIR principles and metrics?
When FAIR: Getting Started
Linked Data Community of Practice
What is the maturity level of your organization with respect to implementation of FAIR?
How FAIR: Pistoia FAIR Implementation Group
• Business challenge:
- Effective application and analysis of data
assets in the life science industry demands that
they be made Findable, Accessible,
Interoperable and Reusable
• Update and plans:
- Workshop at The Hyve, Utrecht NL in June
2018 resulted in a published feature
article
- Workshop at EPAM, Boston US in Dec
2018 contributed to the business case
thinking
- Phase 1 plans for 2019:
• Develop the business case to define
distinctive role for the project
• Develop the FAIR Toolkit concept
• Select a use case: e.g. clinical science
to engage with CROs at a workshop
- Seeking more funding – join us!
PM: Ian Harrow
Collaborators
FAIR Toolkit Implementation for LS Industry
1. Metric Tools & Best Practice
2. Training resources
3. Culture change process
4. Use case examples
5. Cost benefit examples
• Adapt for Life Science industry
• Leverage existing FAIR resources
How FAIR: Pistoia Ontologies Mapping Project
• Business challenge:
– Use of different ontologies within the
same data domain hampers
interoperability and application.
Solve by mapping between them.
• Update and plans:
– Phase 3 completed by end of 2018
• Predicted mappings delivered as a
prototype Ontology Mapping Service
for phenotype and disease domain
• Mappings will be available through
public wiki and OxO mapping repository
at EMBL-EBI
• The mapping algorithm, Paxo, is openly
available on GitHub
– Phase 4 plans for 2019:
• To extend mapping of biological and
chemical ontologies for support of
laboratory analytics
• FAIR implementation is planned
– Seeking more funding – join us!
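To make the idea of mapping between ontologies concrete, here is a minimal sketch of how predicted mappings with confidence scores might be applied to harmonize annotations. It is not the Paxo algorithm: the term identifiers, scores and the 0.9 threshold are invented for illustration, and the sketch only consumes a mapping table rather than predicting one.

```python
# Hypothetical mapping table: (source term, target term, confidence score).
MAPPINGS = [
    ("ONTA:0001", "ONTB:9001", 0.98),
    ("ONTA:0002", "ONTB:9002", 0.71),
    ("ONTA:0002", "ONTB:9003", 0.93),
]


def best_mapping(term, min_score=0.9):
    """Return the highest-scoring target term at or above min_score, else None."""
    candidates = [
        (score, target)
        for source, target, score in MAPPINGS
        if source == term and score >= min_score
    ]
    return max(candidates)[1] if candidates else None


def harmonize(annotations):
    """Translate source-ontology terms; collect unmapped terms for manual review."""
    mapped, unmapped = {}, []
    for term in annotations:
        target = best_mapping(term)
        if target is None:
            unmapped.append(term)
        else:
            mapped[term] = target
    return mapped, unmapped


mapped, unmapped = harmonize(["ONTA:0001", "ONTA:0002", "ONTA:0003"])
print(mapped)
print(unmapped)
```

Keeping unmapped terms in a separate list, rather than dropping them, leaves room for curator review of low-confidence predictions.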
Partners
PM: Ian Harrow
How FAIR:
How FAIR: Implementation Networks
How FAIR:
Overview:
• ELIXIR - Project Coordinator & Janssen - Project Leader
• 22 participants with 12 academic, 7 EFPIA, 3 SME
• €8.23M budget with €4M H2020 EC funding + €4.23M EFPIA in-kind
• 42 months
Goals:
• Establish a value-based process for prioritization and selection of IMI project databases
• Develop FAIRification toolkit e.g. develop guidelines, tools and metrics - FAIR Cookbook
• Apply this toolkit to FAIRify datasets from selected IMI projects and EFPIA companies
• Deliver training for data handlers (academia, SMEs and pharmaceuticals) to change and
sustain the data management culture
• Foster an innovation ecosystem on FAIR open data to power future reuse, knowledge
generation and societal benefit e.g. FAIR innovation and SME events
Members:
PM: Serena Scollen
How FAIR: Concept
How FAIR: FAIR Metrics &
Start FAIR: Find me Datasets about:
• Projects
• Study
• Indication/Disease
• Technology
• Targets
• Cohort
• Dates
• Agent
• Therapeutic Area
• Drugs
Start FAIR: Findability Starts with Catalogs
A Dataset Catalog is a collection of Dataset Records
• Catalogs are needed to support FAIR (Findable) data
• Catalogs can and should support Enterprise MDM strategies
• Consumers can be internal or external
Dataset Catalogs are needed so data consumers can find Datasets
• Dataset records need sufficient metadata to support discoverability
• Dataset terms are NOT the data instance
Dataset Catalogs surface dataset provenance and enable data access
Dataset Catalogs can provide datasets for multiple consumption patterns
• Analytics readiness and fit
• ‘Walking’ across information models
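A minimal sketch of the catalog idea: each record holds metadata about a dataset (never the data itself), and consumers find datasets by matching facets such as indication or technology. The `DatasetRecord` class, the facet names and the example records are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Metadata describing a dataset; the data instance lives elsewhere."""
    identifier: str
    title: str
    facets: dict = field(default_factory=dict)  # e.g. {"indication": {"asthma"}}


def find_datasets(catalog, **criteria):
    """Return the records whose facets contain every requested value."""
    return [
        rec
        for rec in catalog
        if all(value in rec.facets.get(facet, set()) for facet, value in criteria.items())
    ]


# Made-up catalog entries for illustration only.
catalog = [
    DatasetRecord("ds-001", "Example cohort A", {"indication": {"asthma"}, "technology": {"RNA-seq"}}),
    DatasetRecord("ds-002", "Example cohort B", {"indication": {"asthma"}, "technology": {"proteomics"}}),
    DatasetRecord("ds-003", "Example cohort C", {"indication": {"COPD"}, "technology": {"RNA-seq"}}),
]

hits = find_datasets(catalog, indication="asthma", technology="RNA-seq")
print([rec.identifier for rec in hits])
```

Because the search runs over records only, the same catalog can serve internal and external consumers without exposing the underlying data.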
Start FAIR: A DCAT conformant Data Catalog
https://www.w3.org/TR/hcls-dataset/
https://www.w3.org/TR/vocab-dcat/#vocabulary-overview
Semantic tagging of datasets with concepts from taxonomies:
• provides context
• multi-dimensional & flexible
• effective for discoverability
• light-weight semantics
Diagram (classes and properties shown):
• Classes: dcat:Catalog, dctypes:Dataset (summary), dctypes:Dataset (version), dcat:Distribution (dctypes:Dataset), void:Dataset, skos:ConceptScheme, skos:Concept, foaf:Agent
• Links between classes: dcat:dataset, dcat:distribution, dcat:theme, dcat:themeTaxonomy, dct:isVersionOf, dct:hasPart, pav:previousVersion, pav:hasCurrentVersion
• Descriptive properties: dct:title, dct:description, dct:publisher <foaf:Agent>, dct:creator <foaf:Agent>, dct:created, dct:source, dct:license, dct:format, dct:accrualPeriodicity, dct:conformsTo, dcat:keyword, foaf:page, pav:version, pav:retrievedFrom, pav:createdWith, dcat:accessURL, dcat:downloadURL, void:sparqlEndpoint, void:vocabulary, void:exampleResource, …other void properties
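For a feel of what a DCAT-style catalog entry can look like in practice, the sketch below builds a small JSON-LD document with the standard library. It uses plain DCAT terms (`dcat:Catalog`, `dcat:Dataset`, `dcat:Distribution`) and deliberately simplifies away the summary/version/distribution levels of the HCLS profile; the `dataset_record` helper and all URIs are made up for illustration.

```python
import json

# Namespace prefixes for the DCAT and Dublin Core Terms vocabularies.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def dataset_record(uri, title, keywords, download_url, license_uri):
    """Build one dataset description with a single distribution (hypothetical helper)."""
    return {
        "@id": uri,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dcat:keyword": list(keywords),
        "dcat:distribution": {
            "@type": "dcat:Distribution",
            "dct:license": {"@id": license_uri},
            "dcat:downloadURL": {"@id": download_url},
        },
    }


# All URIs below are placeholders, not real resources.
catalog = {
    "@context": CONTEXT,
    "@id": "https://example.org/catalog",
    "@type": "dcat:Catalog",
    "dcat:dataset": [
        dataset_record(
            "https://example.org/dataset/ds-001",
            "Example cohort A",
            ["asthma", "RNA-seq"],
            "https://example.org/files/ds-001.csv",
            "https://example.org/licenses/internal-use",
        )
    ],
}

print(json.dumps(catalog, indent=2))
```

Serializing to JSON-LD keeps the record readable by ordinary JSON tooling while remaining interpretable as linked data.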
Start FAIR: Dataset to Knowledge Graph to Analytics
• Phase 1: Data Catalog Filter
• Phase 2: Experiment Metadata Filter
• Phase 3: Ad hoc Analyses Filtering
• Outbound to Data Analytics, Data Science Tools
• Statistical Filtering, e.g., clinical trial with > 50 participants
• Dataset Catalog Descriptions
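The three filtering phases can be sketched as a simple pipeline: narrow by catalog description, then by experiment metadata, then by an ad hoc row-level predicate before handing data to analytics tools. The field names and example datasets are hypothetical, and placing the participant threshold in phase 2 is an assumption about how the slide's example maps onto the phases.

```python
# Made-up datasets: catalog-level fields, experiment metadata, and a few data rows.
datasets = [
    {"id": "ds-001", "theme": "oncology", "study_type": "clinical trial",
     "participants": 120, "rows": [{"response": 0.8}, {"response": 0.4}]},
    {"id": "ds-002", "theme": "oncology", "study_type": "clinical trial",
     "participants": 30, "rows": [{"response": 0.9}]},
    {"id": "ds-003", "theme": "respiratory", "study_type": "clinical trial",
     "participants": 200, "rows": [{"response": 0.7}]},
]


def catalog_filter(records, theme):
    """Phase 1: select datasets by their catalog-level description."""
    return [r for r in records if r["theme"] == theme]


def experiment_filter(records, study_type, min_participants):
    """Phase 2: select by experiment metadata, e.g. trials with > 50 participants."""
    return [
        r for r in records
        if r["study_type"] == study_type and r["participants"] > min_participants
    ]


def adhoc_filter(records, predicate):
    """Phase 3: keep only the data rows a given analysis needs."""
    return {r["id"]: [row for row in r["rows"] if predicate(row)] for r in records}


selected = adhoc_filter(
    experiment_filter(catalog_filter(datasets, "oncology"), "clinical trial", 50),
    lambda row: row["response"] >= 0.5,
)
print(selected)
```

Each phase touches progressively more detailed information, so the cheap metadata filters run first and the actual data are read only for datasets that survive them.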
R&D | RDI
FAIR-for-Biopharma: Take-aways
Why FAIR?
• Cost avoidance, Business Advantage, Data Stewardship
When FAIR?
• Now! Peers, especially in Europe, are doing it
How FAIR?
• FAIRplus, GO-FAIR, Pistoia FAIR Implementation Group
Start FAIR
• Findability first, adopt a FAIR-compliant Data Catalog
Thanks
Key Influencers
David Wood
Tim Berners-Lee
Lee Harland
Jane Lomax
James Malone
Dean Allemang
Barend Mons
Carole Goble
Bernadette Hyland
Bob Stanley
Eric Little
Michel Dumontier
John Wilbanks
Hans Constandt
Filip Pattyn
Tim Hoctor
Kees Van Boche
Serena Scollen
AstraZeneca/Pistoia FAIR
Data Community
Mathew Woodwark
Rajan Desai
Nic Sinibaldi
Chia-Chien Chiang
Kerstin Forsberg
Ola Engkvist
Ian Dix
Colin Wood
Ted Slater
Martin Romacker
Eric Neumann
John Wise
Carmen Nitsche
Ian Harrow
Jeff Saltzman
Kathy Reinold

Pictures of Superficial & Deep Fascia.ppt.pdfPictures of Superficial & Deep Fascia.ppt.pdf
Pictures of Superficial & Deep Fascia.ppt.pdf
 
Temporomandibular Joint By RABIA INAM GANDAPORE.pptx
Temporomandibular Joint By RABIA INAM GANDAPORE.pptxTemporomandibular Joint By RABIA INAM GANDAPORE.pptx
Temporomandibular Joint By RABIA INAM GANDAPORE.pptx
 
Superficial & Deep Fascia of the NECK.pptx
Superficial & Deep Fascia of the NECK.pptxSuperficial & Deep Fascia of the NECK.pptx
Superficial & Deep Fascia of the NECK.pptx
 
micro teaching on communication m.sc nursing.pdf
micro teaching on communication m.sc nursing.pdfmicro teaching on communication m.sc nursing.pdf
micro teaching on communication m.sc nursing.pdf
 
How STIs Influence the Development of Pelvic Inflammatory Disease.pptx
How STIs Influence the Development of Pelvic Inflammatory Disease.pptxHow STIs Influence the Development of Pelvic Inflammatory Disease.pptx
How STIs Influence the Development of Pelvic Inflammatory Disease.pptx
 
Journal Article Review on Rasamanikya
Journal Article Review on RasamanikyaJournal Article Review on Rasamanikya
Journal Article Review on Rasamanikya
 
Top-Vitamin-Supplement-Brands-in-India List
Top-Vitamin-Supplement-Brands-in-India ListTop-Vitamin-Supplement-Brands-in-India List
Top-Vitamin-Supplement-Brands-in-India List
 
KDIGO 2024 guidelines for diabetologists
KDIGO 2024 guidelines for diabetologistsKDIGO 2024 guidelines for diabetologists
KDIGO 2024 guidelines for diabetologists
 

Making Data FAIR (Findable, Accessible, Interoperable, Reusable)

  • 1. Making Data FAIR* Tom Plasterer, PhD Director, Bioinformatics, Research Bioinformatics 20 Mar 2019 * Findable, Accessible, Interoperable and Reusable
  • 2. What FAIR: Principles at-a-Glance
      Findable:
      • F1 (meta)data are assigned a globally unique and persistent identifier
      • F2 data are described with rich metadata
      • F3 metadata clearly and explicitly include the identifier of the data it describes
      • F4 (meta)data are registered or indexed in a searchable resource
      Accessible:
      • A1 (meta)data are retrievable by their identifier using a standardized communications protocol
      • A1.1 the protocol is open, free, and universally implementable
      • A1.2 the protocol allows for an authentication and authorization procedure, where necessary
      • A2 metadata are accessible, even when the data are no longer available
      Interoperable:
      • I1 (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
      • I2 (meta)data use vocabularies that follow FAIR principles
      • I3 (meta)data include qualified references to other (meta)data
      Reusable:
      • R1 (meta)data are richly described with a plurality of accurate and relevant attributes
      • R1.1 (meta)data are released with a clear and accessible data usage license
      • R1.2 (meta)data are associated with detailed provenance
      • R1.3 (meta)data meet domain-relevant community standards
      Source: The FAIR Guiding Principles for scientific data management and stewardship, Sci. Data 3:160018, doi: 10.1038/sdata.2016.18 (2016)
  • 3. Why FAIR: Biopharma Value Proposition
      Collaborative & Competitive Intelligence:
      • Who do we want to partner with? Are there complementary assets to our portfolio?
      • What space is too crowded and not our area of expertise?
      • Greenfield situations?
      Mergers, Acquisitions, Partnerships:
      • How do we efficiently and deeply absorb data generated elsewhere into our systems? How do we efficiently share?
      • Does this make a smaller biotech/start-up a more viable partner?
      Improved Patient Care:
      • Can we share data and outcomes more efficiently in complicated trial settings (basket trials, adaptive trials) to better engage opinion leaders and foster dialog?
      • Along with Differential Privacy approaches, can we have the broader research community help mine our data?
      • How do we best reuse Real World Evidence (RWE) data in the clinic and in trial design?
      Data (Ir)reproducibility:
      • Can we make preclinical data (more) reproducible?
      • Can we utilize data credentialization? (thanks to Dan Crowther @ Exscientia)
  • 4. Why FAIR: €26bn Reasons…
  • 5. When FAIR: A Brief History
      Moving away from Narrative
      • Nanopublications
      Incubating Standards in Open PHACTS
      • VoID, PROV-O
      Lorentz Center Workshop
      • FORCE11 FAIR Guiding Principles
      • Participants: IMI members, US researchers, Content providers, ELIXIR; European Open Science Cloud, Big Data to Knowledge (BD2K)
      Current Status:
      • FAIR Data Workshops (EU-ELIXIR nodes)
      • Inclusion in Horizon 2020, NIH Advocacy
      • IMI2 Data FAIR-ification Call
      • Vendors getting up to speed
  • 6. When FAIR: Community Awareness. Linked Data Community of Practice: How familiar are you with the FAIR principles and metrics?
  • 7. When FAIR: Getting Started. Linked Data Community of Practice: What is the maturity level of your organization with respect to implementation of FAIR?
  • 8. How FAIR: Pistoia FAIR Implementation Group
      Business challenge:
      • Effective application and analysis of data assets in the life science industry demands that they are made Findable, Accessible, Interoperable and Reusable
      Update and plans:
      • Workshop at The Hyve, Utrecht NL in June 2018 resulted in a published feature article
      • Workshop at EPAM, Boston US in Dec 2018 contributed to the business case thinking
      • Phase 1 plans for 2019:
          • Develop the business case to define a distinctive role for the project
          • Develop the FAIR Toolkit concept
          • Select a use case, e.g. clinical science, to engage with CROs at a workshop
      • Seeking more funding: join us!
      PM: Ian Harrow
      FAIR Toolkit Implementation for LS Industry:
      • Adapt for Life Science industry
      • Leverage existing FAIR resources
      1. Metric Tools & Best Practice
      2. Training resources
      3. Culture change process
      4. Use case examples
      5. Cost benefit examples
  • 9. How FAIR: Pistoia Ontologies Mapping Project
      Business challenge:
      • Use of different ontologies within the same data domain hampers interoperability and application. Solve by mapping between them.
      Update and plans:
      • Phase 3 completed by end of 2018:
          • Predicted mappings delivered as a prototype Ontology Mapping Service for the phenotype and disease domain
          • Mappings will be available through a public wiki and the OxO mapping repository at EMBL-EBI
          • The mapping algorithm, Paxo, is openly available on GitHub
      • Phase 4 plans for 2019:
          • Extend mapping of biological and chemical ontologies to support laboratory analytics
          • FAIR implementation is planned
      • Seeking more funding: join us!
      PM: Ian Harrow
  • 12. How FAIR:
      Overview:
      • ELIXIR: Project Coordinator; Janssen: Project Leader
      • 22 participants: 12 academic, 7 EFPIA, 3 SME
      • €8.23M budget: €4M H2020 EC funding + €4.23M EFPIA in-kind
      • 42 months
      Goals:
      • Establish a value-based process for prioritization and selection of IMI project databases
      • Develop a FAIRification toolkit (guidelines, tools and metrics): the FAIR Cookbook
      • Apply this toolkit to FAIRify datasets from selected IMI projects and EFPIA companies
      • Deliver training for data handlers (academia, SMEs and pharmaceuticals) to change and sustain the data management culture
      • Foster an innovation ecosystem on FAIR open data to power future reuse, knowledge generation and societal benefit, e.g. FAIR innovation and SME events
      PM: Serena Scollen
  • 14. How FAIR: FAIR Metrics &
  • 15.
  • 16. Start FAIR: Find me Datasets about: Projects, Study, Indication/Disease, Technology, Targets, Cohort, Dates, Agent, Therapeutic Area, Drugs
  • 17. Start FAIR: Findability Starts with Catalogs
      Dataset Catalog is a collection of Dataset Records
      • Catalogs are needed to support FAIR (Findable) data
      • Catalogs can and should support Enterprise MDM strategies
      • Consumers can be internal or external
      Dataset Catalogs are needed so data consumers can find Datasets
      • Dataset records need sufficient metadata to support discoverability
      • Dataset terms are NOT the data instance
      Dataset Catalogs surface dataset provenance and enable data access
      Dataset Catalogs can provide datasets for multiple consumption patterns
      • Analytics readiness and fit
      • ‘Walking’ across information models
  • 18. Start FAIR: A DCAT conformant Data Catalog
      https://www.w3.org/TR/hcls-dataset/
      https://www.w3.org/TR/vocab-dcat/#vocabulary-overview
      Semantic tagging of datasets with concepts from taxonomies:
      • provides context
      • multi-dimensional & flexible
      • effective for discoverability
      • light-weight semantics
      Diagram classes: dcat:Catalog, skos:ConceptScheme, skos:Concept, dctypes:Dataset (summary), dctypes:Dataset (version), dcat:Distribution (dctypes:Dataset), void:Dataset, foaf:Agent
      Diagram properties: dct:title, dct:description, dct:publisher, dct:creator, dct:created, dct:source, dct:license, dct:format, dct:conformsTo, dct:accrualPeriodicity, dct:isVersionOf, dct:hasPart, foaf:page, dcat:keyword, dcat:theme, dcat:themeTaxonomy, dcat:dataset, dcat:distribution, dcat:accessURL, dcat:downloadURL, pav:version, pav:previousVersion, pav:hasCurrentVersion, pav:retrievedFrom, pav:createdWith, void:sparqlEndpoint, void:vocabulary, void:exampleResource, and other void properties
  • 19. Start FAIR: Dataset to Knowledge Graph to Analytics
      • Dataset Catalog Descriptions
      • Phase 1: Data Catalog Filter
      • Phase 2: Experiment Metadata Filter
      • Phase 3: Ad hoc Analyses Filtering (Statistical Filtering, e.g., clinical trial with > 50 participants)
      • Outbound to Data Analytics: Data Science Tools
  • 20. FAIR-for-Biopharma: Take-aways (R&D | RDI)
      Why FAIR?
      • Cost avoidance, Business Advantage, Data Stewardship
      When FAIR?
      • Now! Peers, especially in Europe, are doing it
      How FAIR?
      • FAIRplus, GO-FAIR, Pistoia FAIR Implementation Group
      Start FAIR
      • Findability first, adopt a FAIR-compliant Data Catalog
  • 21. Thanks
      Key Influencers: David Wood, Tim Berners-Lee, Lee Harland, Jane Lomax, James Malone, Dean Allemang, Barend Mons, Carole Goble, Bernadette Hyland, Bob Stanley, Eric Little, Michel Dumontier, John Wilbanks, Hans Constandt, Filip Pattyn, Tim Hoctor, Kees Van Boche, Serena Scollen
      AstraZeneca/Pistoia FAIR Data Community: Mathew Woodwark, Rajan Desai, Nic Sinibaldi, Chia-Chien Chiang, Kerstin Forsberg, Ola Engkvist, Ian Dix, Colin Wood, Ted Slater, Martin Romacker, Eric Neumann, John Wise, Carmen Nitsche, Ian Harrow, Jeff Saltzman, Kathy Reinold
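
The DCAT-conformant catalog on the "Start FAIR: A DCAT conformant Data Catalog" slide can be illustrated with a minimal sketch of a single dataset record serialized as JSON-LD. The vocabulary prefixes (dcat, dct, dctypes, foaf) are the published ones named on the slide; every identifier, title, URL and helper function name below is a hypothetical placeholder, and the record carries only a small subset of the properties shown in the diagram.

```python
import json

# Published namespace URIs for the vocabularies named on the slide.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "dctypes": "http://purl.org/dc/dcmitype/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def make_dataset_record(dataset_id, title, publisher, keywords,
                        theme, license_url, download_url):
    """Build one dataset record: metadata only, not the data instance."""
    return {
        "@id": dataset_id,
        "@type": "dctypes:Dataset",
        "dct:title": title,
        "dct:publisher": {"@type": "foaf:Agent", "foaf:name": publisher},
        "dcat:keyword": list(keywords),
        # Semantic tag: a concept taken from a taxonomy.
        "dcat:theme": {"@id": theme},
        "dcat:distribution": {
            "@type": "dcat:Distribution",
            "dct:license": {"@id": license_url},
            "dcat:downloadURL": {"@id": download_url},
        },
    }


def make_catalog(catalog_id, title, records):
    """Wrap dataset records in a dcat:Catalog."""
    return {
        "@context": CONTEXT,
        "@id": catalog_id,
        "@type": "dcat:Catalog",
        "dct:title": title,
        "dcat:dataset": list(records),
    }


# All values below are hypothetical placeholders.
record = make_dataset_record(
    dataset_id="https://example.org/dataset/study-001",
    title="Example asthma transcriptomics study",
    publisher="Example Biopharma R&D",
    keywords=["asthma", "RNA-seq"],
    theme="https://example.org/taxonomy/respiratory",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    download_url="https://example.org/files/study-001.csv",
)
catalog = make_catalog(
    "https://example.org/catalog", "Example Dataset Catalog", [record]
)
catalog_json = json.dumps(catalog, indent=2)
print(catalog_json)
```

The record has a globally unique identifier (`@id`), descriptive metadata, and an explicit license, which loosely corresponds to principles F1, F2 and R1.1; a production catalog would more likely use an RDF library and the full HCLS dataset description profile.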

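The three filter phases on the "Dataset to Knowledge Graph to Analytics" slide can be sketched as successive filters over catalog records. This is only an illustration under stated assumptions: the record fields (`indication`, `technology`, `participants`), the function names and the sample catalog are all hypothetical, and the only detail taken from the slide is the statistical example of a clinical trial with more than 50 participants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical catalog entry; the field names are assumptions."""
    title: str
    indication: str
    technology: str
    participants: int


def catalog_filter(records, indication):
    """Phase 1: narrow the catalog using dataset-level descriptions."""
    return [r for r in records if r.indication == indication]


def experiment_filter(records, technology):
    """Phase 2: narrow further using experiment metadata."""
    return [r for r in records if r.technology == technology]


def statistical_filter(records, min_participants=50):
    """Phase 3: keep studies with strictly more than min_participants."""
    return [r for r in records if r.participants > min_participants]


def select_for_analytics(records, indication, technology, min_participants=50):
    """Run the three phases in order and return what goes to analytics."""
    phase1 = catalog_filter(records, indication)
    phase2 = experiment_filter(phase1, technology)
    return statistical_filter(phase2, min_participants)


# Hypothetical sample catalog.
CATALOG = [
    DatasetRecord("Trial A", "asthma", "RNA-seq", 120),
    DatasetRecord("Trial B", "asthma", "RNA-seq", 35),
    DatasetRecord("Trial C", "asthma", "proteomics", 80),
    DatasetRecord("Trial D", "COPD", "RNA-seq", 200),
]

selected = select_for_analytics(CATALOG, "asthma", "RNA-seq")
print([r.title for r in selected])  # prints ['Trial A']
```

Keeping each phase as its own function mirrors the slide's progression from coarse catalog descriptions to finer experiment metadata before any data is handed to analytics tools.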
Editor's Notes

  1. Erik Schultes’ talk: Ready, Set, GO-FAIR: https://vimeo.com/282650465
  2. 50% (or more) of preclinical research could not be reproduced, at a cost of about $28B/year: http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002165 Pistoia paper: Implementation and relevance of FAIR data principles in biopharmaceutical R&D: https://www.ncbi.nlm.nih.gov/pubmed/30690198
  3. https://dx.doi.org/10.2777/02999 https://publications.europa.eu/en/publication-detail/-/publication/d375368c-1a0a-11e9-8d04-01aa75ed71a1/language-en
  4. Horizon 2020 is the biggest EU Research and Innovation programme ever, with nearly €80 billion of funding available over 7 years (2014 to 2020)
  5. http://fairmetrics.org/ https://fairshake.cloud/?q=TCGA
  6. Images: http://senior-project-led-cube.wikispaces.com/ (https://creativecommons.org/licenses/by-sa/3.0/) http://opensource.org/node/688 (https://creativecommons.org/licenses/by/4.0/)