4. WHAT IS FAIR DATA?
Findable:
F1. (meta)data are assigned a globally
unique and persistent identifier;
F2. data are described with rich metadata;
F3. metadata clearly and explicitly include
the identifier of the data it describes;
F4. (meta)data are registered or indexed in
a searchable resource;
http://www.nature.com/articles/sdata201618
6. WHAT IS FAIR DATA?
Accessible:
A1. (meta)data are retrievable by their
identifier using a standardized
communications protocol;
A1.1 the protocol is open, free, and
universally implementable;
A1.2. the protocol allows for an
authentication and authorization
procedure, where necessary;
A2. metadata are accessible, even when
the data are no longer available;
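A minimal sketch of principles A1 and A2, assuming a toy in-memory resolver in place of a real protocol such as HTTP. The `Record` and `Resolver` names and the `available` flag are hypothetical and only illustrate that metadata stay retrievable by identifier after the data are withdrawn.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Record:
    """Hypothetical record pairing metadata with optional data."""
    metadata: dict
    data: Optional[bytes] = None


@dataclass
class Resolver:
    """Toy in-memory stand-in for a service that resolves identifiers (A1)."""
    records: Dict[str, Record] = field(default_factory=dict)

    def register(self, pid: str, metadata: dict, data: bytes) -> None:
        self.records[pid] = Record(metadata=dict(metadata), data=data)

    def withdraw_data(self, pid: str) -> None:
        # Drop the data but keep the metadata, as A2 asks.
        record = self.records[pid]
        record.data = None
        record.metadata["available"] = False

    def get_metadata(self, pid: str) -> dict:
        return self.records[pid].metadata

    def get_data(self, pid: str) -> Optional[bytes]:
        return self.records[pid].data


PID = "https://example.org/id/ds1"  # hypothetical identifier
resolver = Resolver()
resolver.register(PID, {"title": "Demo dataset"}, b"a,b\n1,2\n")
resolver.withdraw_data(PID)
```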
8. WHAT IS FAIR DATA?
Interoperable:
I1. (meta)data use a formal, accessible,
shared, and broadly applicable
language for knowledge
representation.
I2. (meta)data use vocabularies that
follow FAIR principles;
I3. (meta)data include qualified
references to other (meta)data;
10. WHAT IS FAIR DATA?
Reusable:
R1. (meta)data are richly described
with a plurality of accurate and
relevant attributes;
R1.1. (meta)data are released with a
clear and accessible data usage
license;
R1.2. (meta)data are associated with
detailed provenance;
R1.3. (meta)data meet domain-relevant
community standards;
15. FAIR DATA POINT
A particular class of FAIR Data System that provides access to
datasets in a FAIR manner. The datasets can be external or
internal to the FAIR Data Point. Also, the source data can be a
non-FAIR dataset or a FAIR Data Resource. If the source data is
non-FAIR, the FAIR Data Point needs to make the necessary FAIR
transformations on the fly.
FAIR Data Resource
non-FAIR Data Resource
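The on-the-fly behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the FAIR Data Point implementation: the `BASE` namespace, `rows_to_triples` and `serve` are hypothetical names, and the "FAIR transformation" is reduced to turning table rows into subject/predicate/object triples.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

BASE = "https://example.org/"  # hypothetical namespace


def rows_to_triples(rows: List[Dict[str, str]], id_column: str) -> List[Triple]:
    """Hypothetical FAIR transformation: one triple per non-identifier cell."""
    triples = []
    for row in rows:
        subject = BASE + "record/" + row[id_column]
        for column, value in row.items():
            if column != id_column:
                triples.append((subject, BASE + "property/" + column, value))
    return triples


def serve(source, is_fair: bool, id_column: str = "id") -> List[Triple]:
    """Return triples, transforming only when the source is non-FAIR."""
    if is_fair:
        return list(source)
    return rows_to_triples(source, id_column)


non_fair_source = [{"id": "p1", "age": "42"}]
fair_source = [(BASE + "record/p2", BASE + "property/age", "37")]
```

Callers see the same shape of result in both cases; only the non-FAIR source pays the transformation cost at request time.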
28. FAIR DATA POINT - ARCHITECTURE
[Architecture diagram. Labels: FAIR API / GUI; Metadata Provider; FAIR Accessor; Metrics Gatherer; Security Enforcer; FAIR Metadata; FAIR Data]
30. FAIR Data Point metadata
Title
Responsible institution(s)
Contact
FAIR API version
License
…
31. FDP METADATA
@prefix dct: <http://purl.org/dc/terms/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://dev-vm.fair-dtls.surf-hosted.nl:8082/fdp> dct:alternative "DTL FDP"@en ;
    dct:description "The DTL FAIR Data Point hosts the FAIR Data versions of datasets that have been made FAIR during BYODs as well as other relevant life sciences datasets"@en ;
    dct:subject "FAIR Data" , "Life Sciences" ;
    dct:title "DTL FAIR Data Point"@en ;
    <http://www.re3data.org/schema/3-0#api> <http://dtls.nl/fdp#api=1> ;
    <http://www.re3data.org/schema/3-0#catalog> <http://dev-vm.fair-dtls.surf-hosted.nl:8082/fdp/biobank> ,
        <http://dev-vm.fair-dtls.surf-hosted.nl:8082/fdp/comparativeGenomics> ,
        <http://dev-vm.fair-dtls.surf-hosted.nl:8082/fdp/patient-registry> ,
        <http://dev-vm.fair-dtls.surf-hosted.nl:8082/fdp/textmining> ,
        <http://dev-vm.fair-dtls.surf-hosted.nl:8082/fdp/transcriptomics> ;
    <http://www.re3data.org/schema/3-0#institution> <http://dtls.nl> ;
    <http://www.re3data.org/schema/3-0#institutionCountry> <http://lexvo.org/id/iso3166/NL> ;
    <http://www.re3data.org/schema/3-0#lastUpdate> "2016-10-27"^^xsd:date ;
    <http://www.re3data.org/schema/3-0#software> "FAIR Data Point" ;
    <http://www.re3data.org/schema/3-0#startDate> "2016-10-27"^^xsd:date ;
    a <http://www.re3data.org/schema/3-0#Repository> ;
    rdfs:label "DTL FAIR Data Point"@en ;
    <http://xmlns.com/foaf/0.1/landingpage> <http://dev-vm.fair-dtls.surf-hosted.nl:8082/fdp/swagger-ui.html> .
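As an illustration of how repository-level metadata like the record above might be produced, here is a minimal sketch that builds a small Turtle document with plain string formatting. The function names and the `r3d:` prefix label are assumptions made for this example (the namespace IRI itself is the one used in the record above); a real deployment would more likely use an RDF library.

```python
from typing import List


def turtle_literal(value: str, lang: str = "en") -> str:
    """Quote a string as a language-tagged Turtle literal."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"@{lang}'


def fdp_metadata_turtle(
    fdp_iri: str, title: str, description: str, catalogs: List[str]
) -> str:
    """Serialize a few FDP-level fields; 'r3d' is a hypothetical prefix label."""
    statements = [
        "a r3d:Repository",
        f"dct:title {turtle_literal(title)}",
        f"dct:description {turtle_literal(description)}",
    ]
    if catalogs:  # omit the predicate entirely when there are no catalogs
        links = " , ".join(f"<{iri}>" for iri in catalogs)
        statements.append(f"r3d:catalog {links}")
    body = " ;\n    ".join(statements)
    return (
        "@prefix dct: <http://purl.org/dc/terms/> .\n"
        "@prefix r3d: <http://www.re3data.org/schema/3-0#> .\n\n"
        f"<{fdp_iri}> {body} .\n"
    )


example = fdp_metadata_turtle(
    "http://example.org/fdp",  # hypothetical FDP address
    "DTL FAIR Data Point",
    'A "demo" description',
    ["http://example.org/fdp/biobank", "http://example.org/fdp/textmining"],
)
```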
32. FAIR Data Point metadata
Catalog metadata
Title
Theme taxonomy
Issued date
…
40. FAIR Data Point metadata
Catalog 1 metadata
  Dataset 1 metadata
    Distribution 1.a metadata
      Data record metadata
    Distribution 1.b metadata
  Dataset 2 metadata
    Distribution 2.a metadata
      Data record metadata
    Distribution 2.b metadata
  Dataset 3 metadata
    Distribution 3.a metadata
      Data record metadata
Catalog 2 metadata
42. METADATA LAYERS
Layer | Description | Example | Standard
FDP (Data repository) | Information about the FDP as a data repository | PID, title, description, license, owner, API version, etc. | RE3Data
Catalog | Information about the catalog of datasets offered | PID, title, description, publisher, etc. | W3C DCAT #Catalog
Dataset | Information about each of the offered datasets | Publisher, issue date, theme, etc. | W3C DCAT #Dataset
Distribution | Information about how the dataset is distributed | accessURL, downloadURL, format, mediaType, etc. | W3C DCAT #Distribution
Data record | Information about the actual data, types, identifiers, etc. | Data item types, identifiers, domain, range, etc. | RML, OAI-PMH
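The layering above can be pictured as nested records. The sketch below is an assumption-laden illustration: the class and field names are hypothetical, only a few of the listed attributes are modelled, and the data record layer is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Distribution:
    title: str
    access_url: str
    media_type: str


@dataclass
class Dataset:
    title: str
    distributions: List[Distribution] = field(default_factory=list)


@dataclass
class Catalog:
    title: str
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class FairDataPoint:
    title: str
    catalogs: List[Catalog] = field(default_factory=list)

    def walk(self) -> Iterator[Tuple[str, str]]:
        """Yield (layer, title) pairs from the repository down to distributions."""
        yield ("fdp", self.title)
        for catalog in self.catalogs:
            yield ("catalog", catalog.title)
            for dataset in catalog.datasets:
                yield ("dataset", dataset.title)
                for distribution in dataset.distributions:
                    yield ("distribution", distribution.title)


fdp = FairDataPoint(
    "Demo FDP",
    [Catalog("Biobank", [Dataset("Samples", [
        Distribution("CSV", "https://example.org/samples.csv", "text/csv"),
    ])])],
)
```

A client that only needs to discover what a FAIR Data Point offers can stop at any layer of `walk()` without touching the data itself.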
43. DEMO FAIR DATA POINT
API: http://dev-vm.fair-dtls.surf-hosted.nl:8082/fdp/swagger-ui.html
GUI: http://dev-vm.fair-dtls.surf-hosted.nl:8082/fdp/
52. FAIRIFICATION PROCESS
■ Retrieve original data
■ Dataset identification and analysis
■ Definition of the semantic model
■ Data transformation
■ License assignment
■ Metadata definition
■ FAIR Data resource (data, metadata, license)
deployment
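The steps above can be sketched as a small pipeline. This is a simplified illustration, not the FAIRifier: the semantic model is assumed to be a plain column-to-predicate mapping, and `FairDataResource`, `transform`, `fairify` and all example IRIs under example.org are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class FairDataResource:
    """Bundle of the three deliverables: data, metadata and license."""
    data: List[Triple]
    metadata: Dict[str, str]
    license: str


def transform(rows, id_column: str, model: Dict[str, str], base: str) -> List[Triple]:
    """Apply a semantic model (column name -> predicate IRI) to tabular rows."""
    triples = []
    for row in rows:
        subject = base + row[id_column]
        for column, predicate in model.items():
            if row.get(column):  # skip missing or empty cells
                triples.append((subject, predicate, row[column]))
    return triples


def fairify(rows, id_column, model, base, license_iri, title) -> FairDataResource:
    """Run transformation, license assignment and metadata definition in order."""
    data = transform(rows, id_column, model, base)
    metadata = {"title": title, "records": str(len(rows))}
    return FairDataResource(data=data, metadata=metadata, license=license_iri)


resource = fairify(
    rows=[{"id": "s1", "tissue": "liver", "note": ""}],
    id_column="id",
    model={
        "tissue": "https://example.org/vocab/tissue",
        "note": "https://example.org/vocab/note",
    },
    base="https://example.org/sample/",
    license_iri="https://creativecommons.org/publicdomain/zero/1.0/",
    title="Demo samples",
)
```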
55. FAIRIFIER
■ Transform non-FAIR datasets into FAIR Data Resources
(dataset in FAIR format, license and metadata)
■ Data munging
■ Semantic modeling
■ License definition
■ Metadata definition and extraction
■ Data publication
59. FAIRIFICATION - NEW DATASET TYPE
[Diagram. Labels: Original dataset; FAIR Data Resource (FAIR Format, Metadata, License); FAIR Data Model Registry; non-FAIR to FAIR mapping. Actions: submit, generate, store]
60. FAIRIFICATION - RECURRING DATASET TYPE
[Diagram. Labels: Original dataset; FAIR Data Resource (FAIR Format, Metadata, License); FAIR Data Model Registry; non-FAIR to FAIR mapping. Actions: submit, generate, query, retrieve]
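The difference between the two cases (store a mapping for a new dataset type, query and retrieve it for a recurring one) can be sketched as below. The `ModelRegistry` class and `mapping_for` helper are hypothetical and stand in for the FAIR Data Model Registry only conceptually.

```python
from typing import Callable, Dict, Optional, Tuple

Mapping = Dict[str, str]  # column name -> predicate IRI


class ModelRegistry:
    """Toy in-memory registry of non-FAIR to FAIR mappings per dataset type."""

    def __init__(self) -> None:
        self._mappings: Dict[str, Mapping] = {}

    def store(self, dataset_type: str, mapping: Mapping) -> None:
        self._mappings[dataset_type] = dict(mapping)

    def query(self, dataset_type: str) -> Optional[Mapping]:
        return self._mappings.get(dataset_type)


def mapping_for(
    registry: ModelRegistry,
    dataset_type: str,
    define_mapping: Callable[[], Mapping],
) -> Tuple[Mapping, bool]:
    """Return (mapping, reused); define and store one only for a new type."""
    existing = registry.query(dataset_type)
    if existing is not None:
        return existing, True  # recurring dataset type: retrieve
    mapping = define_mapping()  # new dataset type: model it once
    registry.store(dataset_type, mapping)
    return mapping, False


registry = ModelRegistry()
first = mapping_for(registry, "biobank-csv", lambda: {"age": "https://example.org/vocab/age"})
second = mapping_for(registry, "biobank-csv", lambda: {})
```

The second call never invokes its `define_mapping` callback, which is the point of the recurring case: the modelling effort is spent once per dataset type.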
61. DATAFAIRPORT
■ A particular class of FAIR Data System that provides
support for data interoperability;
■ Supports publication of and access to FAIR data;
■ Fosters an ecosystem of applications and services;
■ Federated architecture: different FAIRports (and other
FAIR Data Systems) are interconnectable;
■ Supports citations of datasets and data items;
■ Provides metrics for data usage and citation;
62. [DataFAIRport diagram. Labels: FAIR Data Search Engine; FAIRifier + (Meta)Data Publication; Metadata storage; Data storage (optional); Transformation Services Registry (optional); FAIR Data Point (shown several times); DTL; F, A, I, R]
63. FAIRPORT
DataFAIRport: Find, Access, Interoperate & Re-use Data
[Architecture diagram. Labels: Stewardship API; FAIR Data API; (Meta)Data Storage component (Metadata storage, Data storage); DataVerse; EUDAT; Data Repository; Semantic resolver; Ontology storage; Data storage API / FAIR Data API; Data usage policy; Management component; GUI (data publishing, search, mgmt); Data Mgmt App; FAIR Data System; Metrics storage; Data Consumer; Data Producer; Data Consumer Apps (e.g. APInatomy, BRAIN, etc.); Data Stewardship Apps]
64. ORKA (OPEN RDF KNOWLEDGE ANNOTATOR)
■ Allow third-party annotation on existing knowledge
bases
■ Capture the provenance of the annotator and the
original statement
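To make the provenance idea concrete, here is a minimal sketch of an annotation record that keeps the original statement alongside who annotated it and when. The `Annotation` fields and the `annotate` helper are assumptions for illustration and do not reflect ORKA's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation: the original statement is kept, never overwritten."""
    original: Triple
    comment: str
    annotator: str
    created: str  # ISO 8601 timestamp


def annotate(
    original: Triple,
    comment: str,
    annotator: str,
    now: Optional[datetime] = None,
) -> Annotation:
    """Record a third-party remark together with its provenance."""
    timestamp = now if now is not None else datetime.now(timezone.utc)
    return Annotation(original, comment, annotator, timestamp.isoformat())


statement = (
    "https://example.org/gene/G1",
    "https://example.org/vocab/associatedWith",
    "https://example.org/disease/D1",
)
note = annotate(
    statement,
    "Association not supported by the cited study.",
    "https://example.org/people/annotator-1",
    now=datetime(2016, 12, 1, tzinfo=timezone.utc),
)
```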
68. TOOLS ROADMAP
Timeline: Dec 16, Jan 17, Feb 17, Mar 17
FAIR Data Point
■ Version 1: metadata editor, release metadata, POST, FAIR accessor
■ Version 1.1: reintroduce OAI-PMH compliance
■ Version 1.2: update notification
FAIR Data Search Engine
■ Beta 1: crawler, metadata index and search GUI
■ Beta 2: improved search GUI, search API
FAIRifier
■ Beta 1: OpenRefine + RDF plugin, publication to FAIR Data Point
■ Beta 2: metadata definition and extraction (RML), license picker
69. TOOLS ROADMAP
Timeline: Dec 16, Jan 17, Feb 17, Mar 17
FAIR Data Model Registry
■ Alpha 1: start of the integration work
ORKA
■ Beta 1: definition of 2-3 use cases
■ Beta 2: extended with features required by the use cases
Data FAIRport
■ Alpha 1: start of the integration work
71. EXTENDING EXISTING DATA REPOSITORIES
[Diagram. Labels: Metadata Provider; FAIR Accessor; Metrics Gatherer; Security Enforcer; "+"; Metadata Provider; FAIR Data Accessor; Metrics Gatherer; Access Controller; EUDAT Current Components; Current Solution Components]
72. FAIR HACKATHON - GOALS
■ Align solutions with FAIR Data Point specifications.
■ Metadata content
■ API
■ Data
73. FAIR HACKATHON OUTCOME
■ FAIR data model for solutions content;
■ Architecture of the required adjustments/extensions;
■ Technical specification of the adjustments/extensions;
■ Proof-of-concept of the adjusted solution;
78. DTL’S FAIR HACKATHONS ROADMAP
■ EUDAT (pilot project ongoing)
■ EGA (July 6-8 2016)
■ Molgenis (Oct 19-20 2016)
■ Patient registry solution providers (Oct 25-27 2016)
■ Mendeley (Nov 18 2016)
■ Quaero Systems (Nov 24 2016)
■ tranSMART (TBD)
■ phenotypeDB (TBD)
■ Euretos Knowledge Platform (TBD)
■ NIH, Australian National Data Service, Brazilian open government
data, …
79. BRING YOUR OWN DATA - BYOD
■ Goals:
■ Learn how to make data linkable “hands-on” with experts
■ Create a “telling story” to demonstrate its use
■ Make FAIR Data at the source
■ Composition:
■ Data owners – specialists on given datasets
■ Data interoperability experts
■ Domain experts
Source: Marcos Roos
84. NETHERLANDS
BYOD Planning
Execution
Day One
Introduction
SW, LD, Ontology intro
Use case intro
Workgroups division
Working sessions
WWW/TTTALA
Day Two
Progress report
Working sessions
Groups reports
WWW/TTTALA
Day Three
Data integration
Answer driving question
Explore data
Demo improvement
Final report
WWW/TTTALA