SlideShare a Scribd company logo
1 of 53
FAIR DATA OVERVIEW
Luiz Olavo Bonino - luiz.bonino@dtls.nl
SUMMARY
 What is FAIR data?
 The FAIR ecosystem
 Plans and how to realise
(FAIR) DATA STEWARDSHIP
DATA STEWARDSHIP
 Combination of all expertise to treat data well in a project:
■ Experiment design and data-design;
■ Re-use of existing data where possible;
■ Planning of the storage, networking and computing
infrastructure;
■ Data acquisition and processing;
■ Data publishing in a format that allows functional
interlinking of data(sets) as well as in a format suitable for
long-term preservation.
FAIR DATA STEWARDSHIP
 Combination of all expertise to treat data well in a project:
■ Experiment design and data-design;
■ Re-use of existing data where possible;
■ Planning of the storage, networking and computing infrastructure;
■ Data acquisition and processing;
■ Data publishing in a format that allows functional interlinking of
data(sets) as well as in a format suitable for long-term preservation.
DATA STEWARDSHIP – PROCESS VIEW
DATA STEWARDSHIP – SUSTAINABILITY VIEW
Produces Consumes
Produces Consumes
storage
sustainability
maintenance
license
privacy security
stewardship
access
?
Produces Consumes
RDF
MIAPE
DBMS Excel
API
SQL
SPARQL
Metadata
DICOM
MIRIAM
Semantics
Produces Consumes
access
find
query
format
license
integrate
WHAT IS FAIR DATA?
FAIR Data aims to support existing communities in their
attempts to enable valuable scientific data and knowledge to
be published and utilised in a ‘FAIR’ manner.
Findable - (meta)data is uniquely and persistently identifiable.
Should have basic machine readable descriptive metadata.
Accessible - data is reachable and accessible by humans and
machines using standard formats and protocols.
Interoperable - (meta)data is machine readable and annotated
with resolvable vocabularies/ontologies.
Reusable - (meta)data is sufficiently well-described to allow
(semi)automated integration with other compatible data sources.
THE FAIR ECOSYSTEM
FAIR Data Principles
FAIR Data Protocol
FAIR Data Resources
FAIR Data Core Technologies
FAIR Data Systems/Tools
Normative
Artefact
Software
FAIR ECOSYSTEM - NORMATIVE LEVEL
 FAIR Data Principles - general principles guiding FAIR data
solutions;
 FAIR Data Protocol - complying with the FAIR Data
Principles, provide guidelines for implementing FAIR data
solutions, e.g., standards, APIs, technologies, …;
FAIR DATA PROTOCOL
Findable - standards for describing the dataset with the relevant
metadata;
Accessible - standards for represent and access the data
according to the defined usage license;
Interoperable - standards for machine readable descriptions of
the (meta)data and (semantic)annotation;
Reusable - standards for semantic annotation of the (meta)data
supporting machine reasoning, and standards for defining data
provenance and support citation;
The standards include technologies (e.g., RDF, nano pub, JSON,
OWL, etc.) as well as protocols and APIs.
FAIR ECOSYSTEM - ARTEFACT LEVEL
 FAIR Data Resource - datasets expressed using one of
the prescribed standards of the FAIR Data Protocol and
with metadata complying with the protocol.
 Annotation Ontology - reference conceptual model used
to provide semantics to elements of FAIR Data Resources
through annotation.
 Controlled vocabularies, dictionaries, etc.
FAIR DATA RESOURCE
Datasets expressed using one of the prescribed standards of the
FAIR Data Protocol, with metadata complying with the protocol
and license. The original dataset is transformed into a FAIR format
and proper metadata and license are added to produce a FAIR
Data Resource. The original and the FAIR version can co-exist,
each one fulfilling its own purpose.
FAIR Conversion
FAIR Data Resource
FAIR DATA RESOURCE
Data Creation
FAIR Data Resource
FAIR Data Creation
FAIR Data Resource
SHARING DATA
I would like to exploit common genotype-
phenotype relations between Alzheimer’s
Disease and Huntington’s Disease…
I need to combine AD and HD data…
I can help
with that!
I can help
with that!
Source: Marcos Roos
SHARING DATA
Source: Marcos Roos
???
Here’s my
data, have
fun!
Here’s my
data, have
fun!
SHARING LINKABLE DATA
Source: Marcos Roos
I can go straight to answering my questions
with data from multiple data owners!
Patients will be so pleased with this speed-up!
Here’s my
Linked Data,
have fun!
Here’s my
Linked Data,
have fun!
Raw data
(many formats)
Raw data
(many formats)
Processed data
(primary storage format)
Initial transformation
Raw data
(many formats)
Processed data
(primary storage format)
ProvenanceInitial transformation
Raw data
(many formats)
Processed data
(primary storage format)
FAIR transformation
FAIR (meta)data
(RDF,XML etc.)
ProvenanceInitial transformation
Raw data
(many formats)
Processed data
(primary storage format)
FAIR transformation
FAIR (meta)data
(RDF,XML etc.)
ProvenanceInitial transformation
Raw data
(many formats)
FAIR download
(in local format)
Processed data
(primary storage format)
FAIR transformation
FAIR (meta)data
(RDF,XML etc.)
ProvenanceInitial transformation
Raw data
(many formats)
FAIR download
(in local format)
Processed data
(primary storage format)
FAIR transformation
FAIR (meta)data
(RDF,XML etc.)
High-Performance
Analysis
ProvenanceInitial transformation
Analysis transformation
FAIR DATA APPLICATION
ECOSYSTEM (NL
APPROACH)
FAIR DATA RESOURCE
FAIR transformation
FAIR Data Resource
BRING YOUR OWN DATA - BYOD
 Goals:
■ Learn how to make data linkable “hands-on” with experts
■ Create a “telling story” to demonstrate its use
 Composition:
■ Data owners – specialists on given datasets
■ Data interoperability experts
■ Domain experts
Source: Marcos Roos
BYOD
FAIRIFIER
FAIRIFIER
FAIR DATA MODEL REGISTRY
FAIRIFIER AND FAIR DATA MODEL REGISTRY
A particular class of FAIR Data System that provides access to
published datasets. The datasets can be external or internal to the
FAIR Data Point. Also, the source data can be a regular (non-FAIR)
dataset or a FAIR Data Resource. If the source data is non-FAIR,
the FAIR Data Point needs to made the necessary FAIR
transformations on the fly.
FAIR DATA POINT
 A particular class of FAIR Data System to provide
support for data interoperability;
 Supports publication and access to FAIR data.
 Fosters an ecosystems of applications and services;
 Federated architecture: different FAIRports (and other
FAIR Data Systems) are interconnectable;
 Supports citations of datasets and data items;
 Provides metrics for data usage and citation;
FAIR DATA PUBLICATION
FAIR DATA ACCESS
DISTRIBUTED ARCHITECTURE
F A I
R
FAIRPORT ECOSYSTEM
FAIRPORT
WORK ORGANISATION (NL
APPROACH)
HOW TO REALISE
DTL
National FAIR
data engineering team
Elixir
Other
Project
FAIR Data IG
(P.I.s with
data)
Data SAC
G.M., B.M., M.G.
Data Core Team
FAIR Data Executive Team
L.B. (CTO), R.H., P.B., M.S., J.B., J.W.
engineers in the
FAIR Data virtual team
Core
FAIR
Technol
ogyFAIR Data V.T.
FAIR Data V.T.
FAIR Data V.T.
FAIR Data V.T.
FAIR Data V.T.
FAIR Data V.T.
Local
“DTLs”/
projects
project
s
QUESTIONS?

More Related Content

What's hot

Inside open metadata—the deep dive
Inside open metadata—the deep diveInside open metadata—the deep dive
Inside open metadata—the deep diveDataWorks Summit
 
Data Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceData Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceDenodo
 
Introduction to data management
Introduction to data managementIntroduction to data management
Introduction to data managementCunera Buys
 
Data Governance for the Executive
Data Governance for the ExecutiveData Governance for the Executive
Data Governance for the ExecutiveDATAVERSITY
 
Data Catalogues - Architecting for Collaboration & Self-Service
Data Catalogues - Architecting for Collaboration & Self-ServiceData Catalogues - Architecting for Collaboration & Self-Service
Data Catalogues - Architecting for Collaboration & Self-ServiceDATAVERSITY
 
Developing a Data Strategy
Developing a Data StrategyDeveloping a Data Strategy
Developing a Data StrategyMartha Horler
 
The Business Value of Metadata for Data Governance
The Business Value of Metadata for Data GovernanceThe Business Value of Metadata for Data Governance
The Business Value of Metadata for Data GovernanceRoland Bullivant
 
Introduction to metadata management
Introduction to metadata managementIntroduction to metadata management
Introduction to metadata managementOpen Data Support
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsKhalid Salama
 
FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceCarole Goble
 
You Need a Data Catalog. Do You Know Why?
You Need a Data Catalog. Do You Know Why?You Need a Data Catalog. Do You Know Why?
You Need a Data Catalog. Do You Know Why?Precisely
 
Modern Metadata Strategies
Modern Metadata StrategiesModern Metadata Strategies
Modern Metadata StrategiesDATAVERSITY
 
Azure DataBricks for Data Engineering by Eugene Polonichko
Azure DataBricks for Data Engineering by Eugene PolonichkoAzure DataBricks for Data Engineering by Eugene Polonichko
Azure DataBricks for Data Engineering by Eugene PolonichkoDimko Zhluktenko
 
Neo4j : Graphes de Connaissance, IA et LLMs
Neo4j : Graphes de Connaissance, IA et LLMsNeo4j : Graphes de Connaissance, IA et LLMs
Neo4j : Graphes de Connaissance, IA et LLMsNeo4j
 
Introduction to Knowledge Graphs: Data Summit 2020
Introduction to Knowledge Graphs: Data Summit 2020Introduction to Knowledge Graphs: Data Summit 2020
Introduction to Knowledge Graphs: Data Summit 2020Enterprise Knowledge
 

What's hot (20)

Inside open metadata—the deep dive
Inside open metadata—the deep diveInside open metadata—the deep dive
Inside open metadata—the deep dive
 
Data Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceData Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and Governance
 
Introduction to data management
Introduction to data managementIntroduction to data management
Introduction to data management
 
Data Governance for the Executive
Data Governance for the ExecutiveData Governance for the Executive
Data Governance for the Executive
 
Data Catalogues - Architecting for Collaboration & Self-Service
Data Catalogues - Architecting for Collaboration & Self-ServiceData Catalogues - Architecting for Collaboration & Self-Service
Data Catalogues - Architecting for Collaboration & Self-Service
 
Developing a Data Strategy
Developing a Data StrategyDeveloping a Data Strategy
Developing a Data Strategy
 
The Business Value of Metadata for Data Governance
The Business Value of Metadata for Data GovernanceThe Business Value of Metadata for Data Governance
The Business Value of Metadata for Data Governance
 
Introduction to metadata management
Introduction to metadata managementIntroduction to metadata management
Introduction to metadata management
 
Data Quality Presentation.ppt
Data Quality Presentation.pptData Quality Presentation.ppt
Data Quality Presentation.ppt
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake Analytics
 
FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practice
 
Data Governance
Data GovernanceData Governance
Data Governance
 
[XConf Brasil 2020] Data mesh
[XConf Brasil 2020] Data mesh[XConf Brasil 2020] Data mesh
[XConf Brasil 2020] Data mesh
 
You Need a Data Catalog. Do You Know Why?
You Need a Data Catalog. Do You Know Why?You Need a Data Catalog. Do You Know Why?
You Need a Data Catalog. Do You Know Why?
 
Introducation to metadata
Introducation to metadataIntroducation to metadata
Introducation to metadata
 
Modern Metadata Strategies
Modern Metadata StrategiesModern Metadata Strategies
Modern Metadata Strategies
 
Azure DataBricks for Data Engineering by Eugene Polonichko
Azure DataBricks for Data Engineering by Eugene PolonichkoAzure DataBricks for Data Engineering by Eugene Polonichko
Azure DataBricks for Data Engineering by Eugene Polonichko
 
How to build a successful Data Lake
How to build a successful Data LakeHow to build a successful Data Lake
How to build a successful Data Lake
 
Neo4j : Graphes de Connaissance, IA et LLMs
Neo4j : Graphes de Connaissance, IA et LLMsNeo4j : Graphes de Connaissance, IA et LLMs
Neo4j : Graphes de Connaissance, IA et LLMs
 
Introduction to Knowledge Graphs: Data Summit 2020
Introduction to Knowledge Graphs: Data Summit 2020Introduction to Knowledge Graphs: Data Summit 2020
Introduction to Knowledge Graphs: Data Summit 2020
 

Viewers also liked

Rda in a_nutshell_february_2017_updated
Rda in a_nutshell_february_2017_updatedRda in a_nutshell_february_2017_updated
Rda in a_nutshell_february_2017_updatedResearch Data Alliance
 
LIBER Webinar: Are the FAIR Data Principles really fair?
LIBER Webinar: Are the FAIR Data Principles really fair?LIBER Webinar: Are the FAIR Data Principles really fair?
LIBER Webinar: Are the FAIR Data Principles really fair?LIBER Europe
 
Importance and Challenges of Reproducible Research
Importance and Challenges of Reproducible ResearchImportance and Challenges of Reproducible Research
Importance and Challenges of Reproducible ResearchVladimir Kanchev
 
Reproducible research: theory
Reproducible research: theoryReproducible research: theory
Reproducible research: theoryC. Tobin Magle
 
D4Science Data infrastructure: a facilitator for a FAIR data management
D4Science Data infrastructure: a facilitator for a FAIR data managementD4Science Data infrastructure: a facilitator for a FAIR data management
D4Science Data infrastructure: a facilitator for a FAIR data managementResearch Data Alliance
 
GARNet workshop on Integrating Large Data into Plant Science
GARNet workshop on Integrating Large Data into Plant ScienceGARNet workshop on Integrating Large Data into Plant Science
GARNet workshop on Integrating Large Data into Plant ScienceDavid Johnson
 
EUDAT Webinar "Organise, retrieve and aggregate data using annotations with B...
EUDAT Webinar "Organise, retrieve and aggregate data using annotations with B...EUDAT Webinar "Organise, retrieve and aggregate data using annotations with B...
EUDAT Webinar "Organise, retrieve and aggregate data using annotations with B...EUDAT
 
FAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data SharingFAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data SharingMerce Crosas
 
Governing Big Data : Principles and practices
Governing Big Data : Principles and practicesGoverning Big Data : Principles and practices
Governing Big Data : Principles and practicesPiyush Malik
 
Intro to network Science
Intro to network ScienceIntro to network Science
Intro to network SciencePyData
 
A Pragmatic Approach to Identity and Access Management
A Pragmatic Approach to Identity and Access ManagementA Pragmatic Approach to Identity and Access Management
A Pragmatic Approach to Identity and Access Managementhankgruenberg
 
Key Principles Of Data Mining
Key Principles Of Data MiningKey Principles Of Data Mining
Key Principles Of Data Miningtobiemuir
 
Analytics et Big Data, une histoire de cubes...
Analytics et Big Data, une histoire de cubes...Analytics et Big Data, une histoire de cubes...
Analytics et Big Data, une histoire de cubes...Mathias Kluba
 
50 data principles for loosely coupled identity management v1 0
50 data principles for loosely coupled identity management v1 050 data principles for loosely coupled identity management v1 0
50 data principles for loosely coupled identity management v1 0Ganesh Prasad
 

Viewers also liked (20)

"Cool" metadata for FAIR data
"Cool" metadata for FAIR data"Cool" metadata for FAIR data
"Cool" metadata for FAIR data
 
Mendeley Data FAIR hackathon
Mendeley Data FAIR hackathonMendeley Data FAIR hackathon
Mendeley Data FAIR hackathon
 
DTL Partners Event - FAIR Data Tech overview - Day 1
DTL Partners Event - FAIR Data Tech overview - Day 1DTL Partners Event - FAIR Data Tech overview - Day 1
DTL Partners Event - FAIR Data Tech overview - Day 1
 
Rda in a_nutshell_february_2017_updated
Rda in a_nutshell_february_2017_updatedRda in a_nutshell_february_2017_updated
Rda in a_nutshell_february_2017_updated
 
LIBER Webinar: Are the FAIR Data Principles really fair?
LIBER Webinar: Are the FAIR Data Principles really fair?LIBER Webinar: Are the FAIR Data Principles really fair?
LIBER Webinar: Are the FAIR Data Principles really fair?
 
Importance and Challenges of Reproducible Research
Importance and Challenges of Reproducible ResearchImportance and Challenges of Reproducible Research
Importance and Challenges of Reproducible Research
 
Reproducible research: theory
Reproducible research: theoryReproducible research: theory
Reproducible research: theory
 
D4Science Data infrastructure: a facilitator for a FAIR data management
D4Science Data infrastructure: a facilitator for a FAIR data managementD4Science Data infrastructure: a facilitator for a FAIR data management
D4Science Data infrastructure: a facilitator for a FAIR data management
 
GARNet workshop on Integrating Large Data into Plant Science
GARNet workshop on Integrating Large Data into Plant ScienceGARNet workshop on Integrating Large Data into Plant Science
GARNet workshop on Integrating Large Data into Plant Science
 
EUDAT Webinar "Organise, retrieve and aggregate data using annotations with B...
EUDAT Webinar "Organise, retrieve and aggregate data using annotations with B...EUDAT Webinar "Organise, retrieve and aggregate data using annotations with B...
EUDAT Webinar "Organise, retrieve and aggregate data using annotations with B...
 
FAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data SharingFAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data Sharing
 
Governing Big Data : Principles and practices
Governing Big Data : Principles and practicesGoverning Big Data : Principles and practices
Governing Big Data : Principles and practices
 
Intro to network Science
Intro to network ScienceIntro to network Science
Intro to network Science
 
A Pragmatic Approach to Identity and Access Management
A Pragmatic Approach to Identity and Access ManagementA Pragmatic Approach to Identity and Access Management
A Pragmatic Approach to Identity and Access Management
 
Key Principles Of Data Mining
Key Principles Of Data MiningKey Principles Of Data Mining
Key Principles Of Data Mining
 
Analytics et Big Data, une histoire de cubes...
Analytics et Big Data, une histoire de cubes...Analytics et Big Data, une histoire de cubes...
Analytics et Big Data, une histoire de cubes...
 
Network Science: Theory, Modeling and Applications
Network Science: Theory, Modeling and ApplicationsNetwork Science: Theory, Modeling and Applications
Network Science: Theory, Modeling and Applications
 
Big Data
Big DataBig Data
Big Data
 
50 data principles for loosely coupled identity management v1 0
50 data principles for loosely coupled identity management v1 050 data principles for loosely coupled identity management v1 0
50 data principles for loosely coupled identity management v1 0
 
Big data-analytics-ebook
Big data-analytics-ebookBig data-analytics-ebook
Big data-analytics-ebook
 

Similar to Overview of FAIR Data Principles and Stewardship

ERA CoBioTech Data Management Webinar
ERA CoBioTech Data Management WebinarERA CoBioTech Data Management Webinar
ERA CoBioTech Data Management WebinarFAIRDOM
 
FAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech ProposalsFAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech ProposalsFAIRDOM
 
Towards cross-domain interoperation in the internet of FAIR data and services
Towards cross-domain interoperation in the internet of FAIR data and servicesTowards cross-domain interoperation in the internet of FAIR data and services
Towards cross-domain interoperation in the internet of FAIR data and servicesLuiz Olavo Bonino da Silva Santos
 
FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...
FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...
FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...EUDAT
 
Big Data and BI Tools - BI Reporting for Bay Area Startups User Group
Big Data and BI Tools - BI Reporting for Bay Area Startups User GroupBig Data and BI Tools - BI Reporting for Bay Area Startups User Group
Big Data and BI Tools - BI Reporting for Bay Area Startups User GroupScott Mitchell
 
FAIRness through a novel combination of Web technologies
FAIRness through a novel combination of Web technologiesFAIRness through a novel combination of Web technologies
FAIRness through a novel combination of Web technologiesResearch Data Alliance
 
FAIR Ddata in trustworthy repositories: the basics
FAIR Ddata in trustworthy repositories: the basicsFAIR Ddata in trustworthy repositories: the basics
FAIR Ddata in trustworthy repositories: the basicsOpenAIRE
 
VODAN Africa IN.pptx
VODAN Africa IN.pptxVODAN Africa IN.pptx
VODAN Africa IN.pptxGetu Tadele
 
Dataset description: DCAT and other vocabularies
Dataset description: DCAT and other vocabulariesDataset description: DCAT and other vocabularies
Dataset description: DCAT and other vocabulariesValeria Pesce
 
Managing your data paget
Managing your data pagetManaging your data paget
Managing your data pagetTERN Australia
 
CARARE: Can I use this data? FAIR into practice
CARARE: Can I use this data? FAIR into practiceCARARE: Can I use this data? FAIR into practice
CARARE: Can I use this data? FAIR into practiceCARARE
 
An ecosystem to support FAIR data
An ecosystem to support FAIR dataAn ecosystem to support FAIR data
An ecosystem to support FAIR dataBlue BRIDGE
 

Similar to Overview of FAIR Data Principles and Stewardship (20)

ERA CoBioTech Data Management Webinar
ERA CoBioTech Data Management WebinarERA CoBioTech Data Management Webinar
ERA CoBioTech Data Management Webinar
 
Weaving a Web of Linked Data - September 26th, 2019
Weaving a Web of Linked Data - September 26th, 2019Weaving a Web of Linked Data - September 26th, 2019
Weaving a Web of Linked Data - September 26th, 2019
 
FAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech ProposalsFAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech Proposals
 
Towards cross-domain interoperation in the internet of FAIR data and services
Towards cross-domain interoperation in the internet of FAIR data and servicesTowards cross-domain interoperation in the internet of FAIR data and services
Towards cross-domain interoperation in the internet of FAIR data and services
 
FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...
FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...
FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...
 
FAIR-Principles-and-ELN.pdf
FAIR-Principles-and-ELN.pdfFAIR-Principles-and-ELN.pdf
FAIR-Principles-and-ELN.pdf
 
Big Data
Big DataBig Data
Big Data
 
FAIR Data ecosystem
FAIR Data ecosystemFAIR Data ecosystem
FAIR Data ecosystem
 
Big Data and BI Tools - BI Reporting for Bay Area Startups User Group
Big Data and BI Tools - BI Reporting for Bay Area Startups User GroupBig Data and BI Tools - BI Reporting for Bay Area Startups User Group
Big Data and BI Tools - BI Reporting for Bay Area Startups User Group
 
FAIRness through a novel combination of Web technologies
FAIRness through a novel combination of Web technologiesFAIRness through a novel combination of Web technologies
FAIRness through a novel combination of Web technologies
 
FAIR Ddata in trustworthy repositories: the basics
FAIR Ddata in trustworthy repositories: the basicsFAIR Ddata in trustworthy repositories: the basics
FAIR Ddata in trustworthy repositories: the basics
 
VODAN Africa IN.pptx
VODAN Africa IN.pptxVODAN Africa IN.pptx
VODAN Africa IN.pptx
 
Dataset description: DCAT and other vocabularies
Dataset description: DCAT and other vocabulariesDataset description: DCAT and other vocabularies
Dataset description: DCAT and other vocabularies
 
Bigdata overview
Bigdata overviewBigdata overview
Bigdata overview
 
A Finnish perspective on FAIRsFAIR outputs
A Finnish perspective on FAIRsFAIR outputsA Finnish perspective on FAIRsFAIR outputs
A Finnish perspective on FAIRsFAIR outputs
 
Fair data vs 5 star open data final
Fair data vs 5 star open data finalFair data vs 5 star open data final
Fair data vs 5 star open data final
 
Managing your data paget
Managing your data pagetManaging your data paget
Managing your data paget
 
CARARE: Can I use this data? FAIR into practice
CARARE: Can I use this data? FAIR into practiceCARARE: Can I use this data? FAIR into practice
CARARE: Can I use this data? FAIR into practice
 
ODSC and iRODS
ODSC and iRODSODSC and iRODS
ODSC and iRODS
 
An ecosystem to support FAIR data
An ecosystem to support FAIR dataAn ecosystem to support FAIR data
An ecosystem to support FAIR data
 

More from Luiz Olavo Bonino da Silva Santos (11)

FDOF and DDI-CDI
FDOF and DDI-CDIFDOF and DDI-CDI
FDOF and DDI-CDI
 
My repository is FAIR!!! What does it mean?
My repository is FAIR!!! What does it mean?My repository is FAIR!!! What does it mean?
My repository is FAIR!!! What does it mean?
 
Making FAIR easy
Making FAIR easyMaking FAIR easy
Making FAIR easy
 
FAIR Explained
FAIR ExplainedFAIR Explained
FAIR Explained
 
Achieving FAIR from a repository perspective
Achieving FAIR from a repository perspectiveAchieving FAIR from a repository perspective
Achieving FAIR from a repository perspective
 
Estruturas de apoio ao acesso aberto
Estruturas de apoio ao acesso abertoEstruturas de apoio ao acesso aberto
Estruturas de apoio ao acesso aberto
 
Ciência aberto, diretrizes FAIR, etapas de viabilização e horizontes
Ciência aberto, diretrizes FAIR, etapas de viabilização e horizontesCiência aberto, diretrizes FAIR, etapas de viabilização e horizontes
Ciência aberto, diretrizes FAIR, etapas de viabilização e horizontes
 
Panorama global de gestão de dados de pesquisa e a iniciativa GO FAIR
Panorama global de gestão de dados de pesquisa e a iniciativa GO FAIRPanorama global de gestão de dados de pesquisa e a iniciativa GO FAIR
Panorama global de gestão de dados de pesquisa e a iniciativa GO FAIR
 
Ciência aberta e dados FAIR
Ciência aberta e dados FAIRCiência aberta e dados FAIR
Ciência aberta e dados FAIR
 
FAIR Ecosystem - Health RI 2017
FAIR Ecosystem - Health RI 2017FAIR Ecosystem - Health RI 2017
FAIR Ecosystem - Health RI 2017
 
DTL Integrator's meeting
DTL Integrator's meetingDTL Integrator's meeting
DTL Integrator's meeting
 

Recently uploaded

办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degreeyuu sss
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 

Recently uploaded (20)

办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 

Overview of FAIR Data Principles and Stewardship

  • 1. FAIR DATA OVERVIEW Luiz Olavo Bonino - luiz.bonino@dtls.nl
  • 2. SUMMARY  What is FAIR data?  The FAIR ecosystem  Plans and how to realise
  • 4. DATA STEWARDSHIP  Combination of all expertise to treat data well in a project: ■ Experiment design and data-design; ■ Re-use of existing data where possible; ■ Planning of the storage, networking and computing infrastructure; ■ Data acquisition and processing; ■ Data publishing in a format that allows functional interlinking of data(sets) as well as in a format suitable for long-term preservation.
  • 5. FAIR DATA STEWARDSHIP  Combination of all expertise to treat data well in a project: ■ Experiment design and data-design; ■ Re-use of existing data where possible; ■ Planning of the storage, networking and computing infrastructure; ■ Data acquisition and processing; ■ Data publishing in a format that allows functional interlinking of data(sets) as well as in a format suitable for long-term preservation.
  • 6.
  • 7. DATA STEWARDSHIP – PROCESS VIEW
  • 8. DATA STEWARDSHIP – SUSTAINABILITY VIEW
  • 9.
  • 14. WHAT IS FAIR DATA? FAIR Data aims to support existing communities in their attempts to enable valuable scientific data and knowledge to be published and utilised in a ‘FAIR’ manner. Findable - (meta)data is uniquely and persistently identifiable. Should have basic machine readable descriptive metadata. Accessible - data is reachable and accessible by humans and machines using standard formats and protocols. Interoperable - (meta)data is machine readable and annotated with resolvable vocabularies/ontologies. Reusable - (meta)data is sufficiently well-described to allow (semi)automated integration with other compatible data sources.
  • 15. THE FAIR ECOSYSTEM FAIR Data Principles FAIR Data Protocol FAIR Data Resources FAIR Data Core Technologies FAIR Data Systems/Tools Normative Artefact Software
  • 16. FAIR ECOSYSTEM - NORMATIVE LEVEL  FAIR Data Principles - general principles guiding FAIR data solutions;  FAIR Data Protocol - complying with the FAIR Data Principles, provide guidelines for implementing FAIR data solutions, e.g., standards, APIs, technologies, …;
  • 17. FAIR DATA PROTOCOL Findable - standards for describing the dataset with the relevant metadata; Accessible - standards for represent and access the data according to the defined usage license; Interoperable - standards for machine readable descriptions of the (meta)data and (semantic)annotation; Reusable - standards for semantic annotation of the (meta)data supporting machine reasoning, and standards for defining data provenance and support citation; The standards include technologies (e.g., RDF, nano pub, JSON, OWL, etc.) as well as protocols and APIs.
  • 18. FAIR ECOSYSTEM - ARTEFACT LEVEL  FAIR Data Resource - datasets expressed using one of the prescribed standards of the FAIR Data Protocol and with metadata complying with the protocol.  Annotation Ontology - reference conceptual model used to provide semantics to elements of FAIR Data Resources through annotation.  Controlled vocabularies, dictionaries, etc.
  • 19. FAIR DATA RESOURCE Datasets expressed using one of the prescribed standards of the FAIR Data Protocol, with metadata complying with the protocol and license. The original dataset is transformed into a FAIR format and proper metadata and license are added to produce a FAIR Data Resource. The original and the FAIR version can co-exist, each one fulfilling its own purpose. FAIR Conversion FAIR Data Resource
  • 20. FAIR DATA RESOURCE Data Creation FAIR Data Resource FAIR Data Creation FAIR Data Resource
  • 21.
  • 22.
  • 23. SHARING DATA I would like to exploit common genotype- phenotype relations between Alzheimer’s Disease and Huntington’s Disease… I need to combine AD and HD data… I can help with that! I can help with that! Source: Marcos Roos
  • 24. SHARING DATA Source: Marcos Roos ??? Here’s my data, have fun! Here’s my data, have fun!
  • 25. SHARING LINKABLE DATA Source: Marcos Roos I can go straight to answering my questions with data from multiple data owners! Patients will be so pleased with this speed-up! Here’s my Linked Data, have fun! Here’s my Linked Data, have fun!
  • 27. Raw data (many formats) Processed data (primary storage format) Initial transformation
  • 28. Raw data (many formats) Processed data (primary storage format) ProvenanceInitial transformation
  • 29. Raw data (many formats) Processed data (primary storage format) FAIR transformation FAIR (meta)data (RDF,XML etc.) ProvenanceInitial transformation
  • 30. Raw data (many formats) Processed data (primary storage format) FAIR transformation FAIR (meta)data (RDF,XML etc.) ProvenanceInitial transformation
  • 31. Raw data (many formats) FAIR download (in local format) Processed data (primary storage format) FAIR transformation FAIR (meta)data (RDF,XML etc.) ProvenanceInitial transformation
  • 32. Raw data (many formats) FAIR download (in local format) Processed data (primary storage format) FAIR transformation FAIR (meta)data (RDF,XML etc.) High-Performance Analysis ProvenanceInitial transformation Analysis transformation
  • 34. FAIR DATA RESOURCE FAIR transformation FAIR Data Resource
  • 35. BRING YOUR OWN DATA - BYOD  Goals: ■ Learn how to make data linkable “hands-on” with experts ■ Create a “telling story” to demonstrate its use  Composition: ■ Data owners – specialists on given datasets ■ Data interoperability experts ■ Domain experts Source: Marcos Roos
  • 36. BYOD
  • 39. FAIR DATA MODEL REGISTRY
  • 40. FAIRIFIER AND FAIR DATA MODEL REGISTRY
  • 41. A particular class of FAIR Data System that provides access to published datasets. The datasets can be external or internal to the FAIR Data Point. Also, the source data can be a regular (non-FAIR) dataset or a FAIR Data Resource. If the source data is non-FAIR, the FAIR Data Point needs to made the necessary FAIR transformations on the fly.
  • 43.
  • 44.  A particular class of FAIR Data System to provide support for data interoperability;  Supports publication and access to FAIR data.  Fosters an ecosystems of applications and services;  Federated architecture: different FAIRports (and other FAIR Data Systems) are interconnectable;  Supports citations of datasets and data items;  Provides metrics for data usage and citation;
  • 52. HOW TO REALISE DTL National FAIR data engineering team Elixir Other Project FAIR Data IG (P.I.s with data) Data SAC G.M., B.M., M.G. Data Core Team FAIR Data Executive Team L.B. (CTO), R.H., P.B., M.S., J.B., J.W. engineers in the FAIR Data virtual team Core FAIR Technol ogyFAIR Data V.T. FAIR Data V.T. FAIR Data V.T. FAIR Data V.T. FAIR Data V.T. FAIR Data V.T. Local “DTLs”/ projects project s

Editor's Notes

  1. Data stewardship of FAIR data
  2. From a process point-of-view, in our vision, data stewardship encompasses a planning phase, where you prepare the data environment followed by a management phase where the actual data creation and processing takes place and then, after your project finishes, you have to take care of the long-term preservation of the data.
  3. From a sustainability point-of-view, the planning phase, depicted here as the data management plans, will be increasingly required by funding programs such as Horizon 2020. However, there is still a challenge to fund data preservation after the project since most of the time, projects funds cannot be used. And underlying these phases we should have the interoperability backbones, standards and procedures, such as the ones being developed/deployed in Excelerate,
  4. The central point in the ecosystem is what we call a FAIR Data Resource, which is composed of a dataset in a FAIR format, its related metadata and license.
  5. As I mentioned, BYODs are currently used to create FAIR Data (Resource). However, a significant part of a BYOD is spent on the FAIR Data Transformation.
  6. Because of this we are working on a tool to automate this process as much as possible, the FAIRifier.
  7. One of the tasks for both BYODs and the FAIRifier is to create a data model for what the original dataset should be transformed to, including proper semantic annotations of the types, etc. For common data structure types we are working on a FAIR Data Model Registry which stores the non-FAIR dataset model and the associated transformation model.
  8. In this way, the FAIRifier can be integrated with the FAIR Data Model Registry to try to match an input dataset model with the ones stored in the registry. Once a match is found, the information about how to transform the dataset into a FAIR format is retrieved by the FAIRifier improving the process automation.