Supported by the NIH grant 1U24 AI117966-01 to UCSD
PI , Co-Investigators at:
The DATS model annotated with schema.org
Susanna-Assunta Sansone, Alejandra Gonzalez-Beltran, Philippe Rocca-Serra
Oxford e-Research Centre, University of Oxford, UK
What is DATS?
Just as JATS (Journal Article Tag Suite) is used by PubMed to index literature,
DATS (DatA Tag Suite) is needed as a scalable way to index data sources in
the DataMed prototype
Where do I find the documentation?
A community-driven effort
What is DATS meant to cover and do?
- Support the intended capability of the DataMed prototype to harvest key metadata descriptors (experimental and data), such as:
  - information on, and relations between, authors, datasets, publications and funding sources, the nature of the biological signal, the nature of the perturbation, etc.
- Rely on the use cases and competency questions used throughout the development process:
  - to define the appropriate boundaries and level of granularity: which queries will be answered in full, which only partially, and which are out of scope
The development process in a nutshell
Metadata elements identified by combining two complementary approaches: top-down and bottom-up
Model serialized as JSON schemas and mapped to schema.org (v1.0, v1.1, v2.0, v2.1)
Extracting requirements from use cases (top-down approach)
- Selected competency questions
  - representative set collected from the use cases workshop, the white paper, community submissions, and NIH and Phil Bourne’s ADDS office
  - key metadata elements processed: abstracted, color-coded and binned as Material, Process, Information or Properties; relations identified
Mapping existing metadata schemas (bottom-up approach)
- schema.org
- DataCite
- RIF-CS
- W3C HCLS dataset descriptions (mapping of many models including DCAT, PROV, VOID, Dublin Core)
- Project Open Metadata (used by HealthData.gov; being added in this new iteration)
- ISA
- BioProject
- BioSample
- MINiML
- PRIDE-ml
- MAGE-TAB
- GA4GH metadata schema
- SRA xml
- CDISC SDM / elements of the BRIDG model
We know that one size does not fit all
- Metadata is either too much or too little
  - many databases won’t have all these metadata elements
  - conversely, domain-specific databases (e.g. focusing on a type of study, organism or technology) have more detailed metadata
- Our goal is NOT to develop the perfect model
  - we have tested the model over several iterations
  - we have aimed for maximum coverage of use cases with a minimal number of data elements
  - we do foresee that not all questions can be answered in full
Key features of DATS
- The descriptors for each metadata element (Entity) include:
  - Property (describing the Entity), Definition (of each Entity and Property), Value(s) (allowed for each Property)
- We have defined a set of core and extended entities
  - Core elements are generic and applicable to any type of dataset, just as JATS can describe any type of publication
  - Extended elements include additional elements, some of which are specific to the life, environmental and biomedical science domains
  - this set can be further extended as needed
- Entities are not mandatory, in either the core or the extended set
  - an entity is used only when applicable to the dataset to be described
  - in that case, only a few of its properties are defined as mandatory
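The rule that entities are optional, but carry a few mandatory properties once used, can be sketched as a small check. This is a minimal illustration only: the entity names come from the slides, while the mandatory property names (title, fullName, name) and the function missing_properties are assumptions made up for the example, not taken from the DATS JSON schemas.

```python
# Hypothetical mandatory properties per entity. The real lists live in the
# DATS JSON schemas and are not reproduced here.
MANDATORY = {
    "Dataset": ["title"],
    "Person": ["fullName"],
    "Grant": ["name"],
}


def missing_properties(record):
    """Return {entity: [missing mandatory properties]} for entities present.

    Entities absent from the record are never reported, because no entity
    is mandatory in itself.
    """
    problems = {}
    for entity, values in record.items():
        required = MANDATORY.get(entity, [])
        missing = [prop for prop in required if prop not in values]
        if missing:
            problems[entity] = missing
    return problems


# Grant is absent entirely, which is fine: entities are optional.
# Person is present but lacks its (assumed) mandatory property.
record = {
    "Dataset": {"title": "Example expression profiles"},
    "Person": {"affiliation": "Example University"},
}
print(missing_properties(record))
```

Keeping the mandatory lists short is what lets a generic repository and a domain-specific one share the same model.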
General design of DATS
- The model is designed around the Dataset, an entity intended to cater for any unit of information stored by repositories:
  - archived experimental datasets, which do not change after deposition to the repository (examples available for dbGaP, GEO, ClinicalTrials.gov)
  - datasets in reference knowledge bases, describing dynamic concepts, such as “genes”, whose definition morphs over time (examples available for UniProt)
- The Dataset entity is also linked to other digital research objects that are part of the NIH Commons, such as Software and Data Standard, which are the focus of other discovery indexes and are therefore not described in detail in this model
DATS core and extended elements
18 core elements and few mandatory properties
Core elements provide the basic info
- What is the dataset about?
  - Material
- How was the dataset produced? Which information does it hold?
  - Dataset / Data Type with its Information, Method, Platform, Instrument
- Where can a dataset be found?
  - Dataset, Distribution, Access objects (links to License)
- When was the dataset produced, released, etc.?
  - Dates to specify the nature of an event {create, modify, start, end...} and its timestamp
- Who did the work, funded the research, hosts the resources, etc.?
  - Person, Organization and their roles, Grant
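As a rough illustration of how the five questions map onto core entities, the sketch below groups one made-up record by question. The entity names (Material, DataType, Distribution, Access, License, Dates, Person, Organization, Grant) follow the slide; every property name, value and URL is a placeholder assumption and does not reflect the actual DATS schemas.

```python
import json

# Illustrative record only: entity names come from the slide, while every
# property name and value below is a made-up placeholder.
record = {
    "what": {"Material": {"name": "liver tissue"}},
    "how": {
        "DataType": {
            "Information": "gene expression",
            "Method": "RNA sequencing",
            "Platform": "example platform",
            "Instrument": "example instrument",
        }
    },
    "where": {
        "Distribution": {
            "Access": {"landingPage": "https://repository.example.org/ds/1"},
            "License": "CC0",
        }
    },
    "when": {"Dates": [{"type": "create", "date": "2016-01-15"}]},
    "who": {
        "Person": {"fullName": "A. Researcher", "roles": ["creator"]},
        "Organization": {"name": "Example University", "roles": ["host"]},
        "Grant": {"name": "EXAMPLE-0001"},
    },
}


def entities_by_question(rec):
    """List, for each question, the entities that answer it (sorted)."""
    return {question: sorted(entities) for question, entities in rec.items()}


print(json.dumps(entities_by_question(record), indent=2))
```

The grouping by question is only a reading aid for this example; a real description would be organised around the Dataset entity.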
Standards
Software
Dataset distribution and access
DATS also follows the W3C Data on the Web Best Practices (https://www.w3.org/TR/dwbp),
which also recommend DatasetDistribution
Serializations
- DATS model in JSON schema, serialized as:
  - JSON (JavaScript Object Notation) format, and
  - JSON-LD (JavaScript Object Notation for Linked Data) with vocabulary from schema.org
- Serializations in other formats can also be done, as and if needed
- A context (mapping) file is also available, meaning that other vocabularies can be used
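The role of the context (mapping) file can be sketched as follows: the same plain JSON record becomes JSON-LD once a context maps its keys to vocabulary terms, and swapping the context swaps the vocabulary. This is a minimal sketch, not the DATS context files themselves. The record keys, the helper to_json_ld and the choice of Dublin Core and DCAT as the alternative vocabulary are assumptions for illustration.

```python
import json

# Plain JSON record. The keys are illustrative, not the DATS schema's own.
record = {"title": "Example dataset", "description": "Placeholder text."}

# Two alternative contexts mapping the same keys to different vocabularies.
SCHEMA_ORG_CONTEXT = {
    "title": "https://schema.org/name",
    "description": "https://schema.org/description",
}
DUBLIN_CORE_CONTEXT = {
    "title": "http://purl.org/dc/terms/title",
    "description": "http://purl.org/dc/terms/description",
}


def to_json_ld(rec, context, type_iri):
    """Wrap a plain JSON record as JSON-LD by attaching a context and a type."""
    doc = {"@context": context, "@type": type_iri}
    doc.update(rec)
    return doc


as_schema_org = to_json_ld(record, SCHEMA_ORG_CONTEXT, "https://schema.org/Dataset")
as_dcat = to_json_ld(record, DUBLIN_CORE_CONTEXT, "http://www.w3.org/ns/dcat#Dataset")

print(json.dumps(as_schema_org, indent=2))
```

The record itself is untouched in both cases, which is why the plain JSON and JSON-LD serializations can share one schema.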
Discussion also ongoing with:
Schema.org value-added to DATS
- Why use schema.org to annotate DATS?
  - Developed and used by a search engine consortium (Google, Yandex, Yahoo, etc.)
  - The NIH Commons is discussing its use
- What benefits does DataMed get by implementing a schema.org-based DATS? Especially:
  - increased visibility (to both popular search engines and DataMed), accessibility (via common query interfaces) and possibly improved ranking
- Discussion and collaboration with schema.org is ongoing
  - missing elements (needed by DATS) have been submitted to the tracker; roughly 80% of DATS entities and properties can be mapped (although the alignment is imperfect or less precise), and the remaining 20% constitute major gaps
  - schema.org and its related Health and Life Science extension are evolving (the latter focuses on clinical studies)
  - coordination also via the ELIXIR-supported bioschemas.org initiative
  - discussion also ongoing under the NIH Commons WGs
Mapping to schema.org
DATS elements in CEDAR template
- Datasets not yet in a formal repository
  - the CEDAR metadata authoring tool can be used to provide DATS-compliant metadata to be later indexed by DataMed
DATS and omicsDI models - Mapping
- Overlap with DATS core elements
  - in red: some DATS extended elements
DATS and DCIP supplement - Mapping
Citation metadata for repositories’ landing page
https://github.com/datacite/spinone/issues/3
DATS exported by DataCite
- An API endpoint that returns DataCite metadata in DATS format is a work in progress: http://api.datacite.org/dats
- The DataCite Metadata Schema allows for a RelatedIdentifier with the HasMetadata relation type
  - this allows linking to the DATS metadata from a DataCite metadata record
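The HasMetadata link can be sketched as a relatedIdentifier element in a DataCite record. The element and attribute names (relatedIdentifiers, relatedIdentifier, relatedIdentifierType, relationType) are those of the DataCite Metadata Schema. The DATS URL is a hypothetical placeholder, the helper has_metadata_link is made up for the example, and the XML namespace and the rest of the record are omitted for brevity.

```python
import xml.etree.ElementTree as ET


def has_metadata_link(dats_url):
    """Build a relatedIdentifiers fragment pointing at a DATS description."""
    related = ET.Element("relatedIdentifiers")
    link = ET.SubElement(
        related,
        "relatedIdentifier",
        {
            "relatedIdentifierType": "URL",
            "relationType": "HasMetadata",
        },
    )
    link.text = dats_url
    return related


# Hypothetical location of the DATS metadata for a dataset.
element = has_metadata_link("https://repository.example.org/ds/1/dats.json")
print(ET.tostring(element, encoding="unicode"))
```

A harvester reading the DataCite record could then follow this link to retrieve the richer DATS description.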

More Related Content

What's hot

eROSA Stakeholder WS1: Data discovery through federated dataset catalogues
eROSA Stakeholder WS1: Data discovery through federated dataset catalogueseROSA Stakeholder WS1: Data discovery through federated dataset catalogues
eROSA Stakeholder WS1: Data discovery through federated dataset catalogues
e-ROSA
 
DataONE Education Module 08: Data Citation
DataONE Education Module 08: Data CitationDataONE Education Module 08: Data Citation
DataONE Education Module 08: Data Citation
DataONE
 
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
dkNET
 
Linked Data for Biopharma
Linked Data for BiopharmaLinked Data for Biopharma
Linked Data for Biopharma
Tom Plasterer
 
dkNET ESP Meeting - February 2016
dkNET ESP Meeting - February 2016dkNET ESP Meeting - February 2016
dkNET ESP Meeting - February 2016
dkNET
 
Software Sustainability: Better Software Better Science
Software Sustainability: Better Software Better ScienceSoftware Sustainability: Better Software Better Science
Software Sustainability: Better Software Better Science
Carole Goble
 
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Tom Plasterer
 
Introduction to Data Management
Introduction to Data ManagementIntroduction to Data Management
Introduction to Data Management
Amanda Whitmire
 
Basics of Research Data Management
Basics of Research Data ManagementBasics of Research Data Management
Basics of Research Data Management
OpenAIRE
 
Dataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* DataDataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* Data
Tom Plasterer
 
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
National Information Standards Organization (NISO)
 
DataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data SharingDataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data Sharing
DataONE
 
BioPharma and FAIR Data, a Collaborative Advantage
BioPharma and FAIR Data, a Collaborative AdvantageBioPharma and FAIR Data, a Collaborative Advantage
BioPharma and FAIR Data, a Collaborative Advantage
Tom Plasterer
 
McGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and ScalingMcGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and Scaling
National Information Standards Organization (NISO)
 
DataONE Education Module 07: Metadata
DataONE Education Module 07: MetadataDataONE Education Module 07: Metadata
DataONE Education Module 07: Metadata
DataONE
 
Citations in ISO Metadata
Citations in ISO MetadataCitations in ISO Metadata
Citations in ISO Metadata
Ted Habermann
 
Preparing Data for Sharing: The FAIR Principles
Preparing Data for Sharing: The FAIR PrinciplesPreparing Data for Sharing: The FAIR Principles
Preparing Data for Sharing: The FAIR Principles
London School of Hygiene and Tropical Medicine
 
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Tom Plasterer
 
DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?
DataONE
 
Mendeley Data FAIR hackathon
Mendeley Data FAIR hackathonMendeley Data FAIR hackathon
Mendeley Data FAIR hackathon
Luiz Olavo Bonino da Silva Santos
 

What's hot (20)

eROSA Stakeholder WS1: Data discovery through federated dataset catalogues
eROSA Stakeholder WS1: Data discovery through federated dataset catalogueseROSA Stakeholder WS1: Data discovery through federated dataset catalogues
eROSA Stakeholder WS1: Data discovery through federated dataset catalogues
 
DataONE Education Module 08: Data Citation
DataONE Education Module 08: Data CitationDataONE Education Module 08: Data Citation
DataONE Education Module 08: Data Citation
 
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
 
Linked Data for Biopharma
Linked Data for BiopharmaLinked Data for Biopharma
Linked Data for Biopharma
 
dkNET ESP Meeting - February 2016
dkNET ESP Meeting - February 2016dkNET ESP Meeting - February 2016
dkNET ESP Meeting - February 2016
 
Software Sustainability: Better Software Better Science
Software Sustainability: Better Software Better ScienceSoftware Sustainability: Better Software Better Science
Software Sustainability: Better Software Better Science
 
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
 
Introduction to Data Management
Introduction to Data ManagementIntroduction to Data Management
Introduction to Data Management
 
Basics of Research Data Management
Basics of Research Data ManagementBasics of Research Data Management
Basics of Research Data Management
 
Dataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* DataDataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* Data
 
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
 
DataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data SharingDataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data Sharing
 
BioPharma and FAIR Data, a Collaborative Advantage
BioPharma and FAIR Data, a Collaborative AdvantageBioPharma and FAIR Data, a Collaborative Advantage
BioPharma and FAIR Data, a Collaborative Advantage
 
McGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and ScalingMcGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and Scaling
 
DataONE Education Module 07: Metadata
DataONE Education Module 07: MetadataDataONE Education Module 07: Metadata
DataONE Education Module 07: Metadata
 
Citations in ISO Metadata
Citations in ISO MetadataCitations in ISO Metadata
Citations in ISO Metadata
 
Preparing Data for Sharing: The FAIR Principles
Preparing Data for Sharing: The FAIR PrinciplesPreparing Data for Sharing: The FAIR Principles
Preparing Data for Sharing: The FAIR Principles
 
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
 
DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?
 
Mendeley Data FAIR hackathon
Mendeley Data FAIR hackathonMendeley Data FAIR hackathon
Mendeley Data FAIR hackathon
 

Viewers also liked

OpenDataForge - SledgeHammer EDDI 2013 presentation
OpenDataForge - SledgeHammer EDDI 2013 presentationOpenDataForge - SledgeHammer EDDI 2013 presentation
OpenDataForge - SledgeHammer EDDI 2013 presentation
Pascal Heus
 
What's New in RDF 1.1?
What's New in RDF 1.1?What's New in RDF 1.1?
What's New in RDF 1.1?
Richard Cyganiak
 
VoID: Metadata for RDF Datasets
VoID: Metadata for RDF DatasetsVoID: Metadata for RDF Datasets
VoID: Metadata for RDF Datasets
Richard Cyganiak
 
A brief overview of metadata for datasets
A brief overview of metadata for datasetsA brief overview of metadata for datasets
A brief overview of metadata for datasets
sesrdm
 
Beyond regulatory submission - standards metadata management
Beyond regulatory submission  - standards metadata managementBeyond regulatory submission  - standards metadata management
Beyond regulatory submission - standards metadata management
Kevin Lee
 
BioCADDIE: Descriptive Metadata for Datasets WG3 - ELIXIR All Hands
BioCADDIE: Descriptive Metadata for Datasets WG3 - ELIXIR All HandsBioCADDIE: Descriptive Metadata for Datasets WG3 - ELIXIR All Hands
BioCADDIE: Descriptive Metadata for Datasets WG3 - ELIXIR All Hands
Susanna-Assunta Sansone
 
NIH BD2K bioCADDIE DataMed: Data Discovery Index
NIH BD2K bioCADDIE DataMed: Data Discovery IndexNIH BD2K bioCADDIE DataMed: Data Discovery Index
NIH BD2K bioCADDIE DataMed: Data Discovery Index
Susanna-Assunta Sansone
 

Viewers also liked (7)

OpenDataForge - SledgeHammer EDDI 2013 presentation
OpenDataForge - SledgeHammer EDDI 2013 presentationOpenDataForge - SledgeHammer EDDI 2013 presentation
OpenDataForge - SledgeHammer EDDI 2013 presentation
 
What's New in RDF 1.1?
What's New in RDF 1.1?What's New in RDF 1.1?
What's New in RDF 1.1?
 
VoID: Metadata for RDF Datasets
VoID: Metadata for RDF DatasetsVoID: Metadata for RDF Datasets
VoID: Metadata for RDF Datasets
 
A brief overview of metadata for datasets
A brief overview of metadata for datasetsA brief overview of metadata for datasets
A brief overview of metadata for datasets
 
Beyond regulatory submission - standards metadata management
Beyond regulatory submission  - standards metadata managementBeyond regulatory submission  - standards metadata management
Beyond regulatory submission - standards metadata management
 
BioCADDIE: Descriptive Metadata for Datasets WG3 - ELIXIR All Hands
BioCADDIE: Descriptive Metadata for Datasets WG3 - ELIXIR All HandsBioCADDIE: Descriptive Metadata for Datasets WG3 - ELIXIR All Hands
BioCADDIE: Descriptive Metadata for Datasets WG3 - ELIXIR All Hands
 
NIH BD2K bioCADDIE DataMed: Data Discovery Index
NIH BD2K bioCADDIE DataMed: Data Discovery IndexNIH BD2K bioCADDIE DataMed: Data Discovery Index
NIH BD2K bioCADDIE DataMed: Data Discovery Index
 

Similar to NIH BD2K DataMed data index - DATS model

NIH BD2K DataMed model, DATS
NIH BD2K DataMed model, DATSNIH BD2K DataMed model, DATS
NIH BD2K DataMed model, DATS
Susanna-Assunta Sansone
 
Introduction to DATS v2.2 - NIH May 2017
Introduction to DATS v2.2 - NIH May 2017Introduction to DATS v2.2 - NIH May 2017
Introduction to DATS v2.2 - NIH May 2017
Susanna-Assunta Sansone
 
Dataset description: DCAT and other vocabularies
Dataset description: DCAT and other vocabulariesDataset description: DCAT and other vocabularies
Dataset description: DCAT and other vocabularies
Valeria Pesce
 
Data discovery through federated dataset catalogs
Data discovery through federated dataset catalogsData discovery through federated dataset catalogs
Data discovery through federated dataset catalogs
Valeria Pesce
 
Datasets with bioschemas
Datasets with bioschemasDatasets with bioschemas
Datasets with bioschemas
Alejandra Gonzalez-Beltran
 
Converged IT and Data Commons
Converged IT and Data CommonsConverged IT and Data Commons
Converged IT and Data Commons
Simon Twigger
 
Metadata as Standard: improving Interoperability through the Research Data Al...
Metadata as Standard: improving Interoperability through the Research Data Al...Metadata as Standard: improving Interoperability through the Research Data Al...
Metadata as Standard: improving Interoperability through the Research Data Al...
AIMS (Agricultural Information Management Standards)
 
Full Erdmann Ruttenberg Community Approaches to Open Data at Scale
Full Erdmann Ruttenberg Community Approaches to Open Data at ScaleFull Erdmann Ruttenberg Community Approaches to Open Data at Scale
Full Erdmann Ruttenberg Community Approaches to Open Data at Scale
National Information Standards Organization (NISO)
 
New Directions in Metadata
New Directions in MetadataNew Directions in Metadata
New Directions in Metadata
suyu22
 
The DATS model: datasets descriptions for data discovery in DataMed
The DATS model: datasets descriptions for data discovery in DataMedThe DATS model: datasets descriptions for data discovery in DataMed
The DATS model: datasets descriptions for data discovery in DataMed
Alejandra Gonzalez-Beltran
 
The blessing and the curse: handshaking between general and specialist data r...
The blessing and the curse: handshaking between general and specialist data r...The blessing and the curse: handshaking between general and specialist data r...
The blessing and the curse: handshaking between general and specialist data r...
Hilmar Lapp
 
Data Citation Implementation at Dataverse
Data Citation Implementation at DataverseData Citation Implementation at Dataverse
Data Citation Implementation at Dataverse
Merce Crosas
 
NIH BD2K DataMed metadata model - Force11, 2016
NIH BD2K DataMed metadata model - Force11, 2016NIH BD2K DataMed metadata model - Force11, 2016
NIH BD2K DataMed metadata model - Force11, 2016
Susanna-Assunta Sansone
 
Next-Generation Search Engines for Information Retrieval
Next-Generation Search Engines for Information RetrievalNext-Generation Search Engines for Information Retrieval
Next-Generation Search Engines for Information Retrieval
Waqas Tariq
 
Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.
Paul Groth
 
Wheeler & Benedict -- Enabling the Preservation Relay
Wheeler & Benedict -- Enabling the Preservation RelayWheeler & Benedict -- Enabling the Preservation Relay
Wheeler & Benedict -- Enabling the Preservation Relay
National Information Standards Organization (NISO)
 
Fsci 2018 friday3_august_am6
Fsci 2018 friday3_august_am6Fsci 2018 friday3_august_am6
Fsci 2018 friday3_august_am6
ARDC
 
FAIR Data Knowledge Graphs
FAIR Data Knowledge GraphsFAIR Data Knowledge Graphs
FAIR Data Knowledge Graphs
Tom Plasterer
 
data mining
data miningdata mining
data mining
manasa polu
 
FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practice
Carole Goble
 

Similar to NIH BD2K DataMed data index - DATS model (20)

NIH BD2K DataMed model, DATS
NIH BD2K DataMed model, DATSNIH BD2K DataMed model, DATS
NIH BD2K DataMed model, DATS
 
Introduction to DATS v2.2 - NIH May 2017
Introduction to DATS v2.2 - NIH May 2017Introduction to DATS v2.2 - NIH May 2017
Introduction to DATS v2.2 - NIH May 2017
 
Dataset description: DCAT and other vocabularies
Dataset description: DCAT and other vocabulariesDataset description: DCAT and other vocabularies
Dataset description: DCAT and other vocabularies
 
Data discovery through federated dataset catalogs
Data discovery through federated dataset catalogsData discovery through federated dataset catalogs
Data discovery through federated dataset catalogs
 
Datasets with bioschemas
Datasets with bioschemasDatasets with bioschemas
Datasets with bioschemas
 
Converged IT and Data Commons
Converged IT and Data CommonsConverged IT and Data Commons
Converged IT and Data Commons
 
Metadata as Standard: improving Interoperability through the Research Data Al...
Metadata as Standard: improving Interoperability through the Research Data Al...Metadata as Standard: improving Interoperability through the Research Data Al...
Metadata as Standard: improving Interoperability through the Research Data Al...
 
Full Erdmann Ruttenberg Community Approaches to Open Data at Scale
Full Erdmann Ruttenberg Community Approaches to Open Data at ScaleFull Erdmann Ruttenberg Community Approaches to Open Data at Scale
Full Erdmann Ruttenberg Community Approaches to Open Data at Scale
 
New Directions in Metadata
New Directions in MetadataNew Directions in Metadata
New Directions in Metadata
 
The DATS model: datasets descriptions for data discovery in DataMed
The DATS model: datasets descriptions for data discovery in DataMedThe DATS model: datasets descriptions for data discovery in DataMed
The DATS model: datasets descriptions for data discovery in DataMed
 
The blessing and the curse: handshaking between general and specialist data r...
The blessing and the curse: handshaking between general and specialist data r...The blessing and the curse: handshaking between general and specialist data r...
The blessing and the curse: handshaking between general and specialist data r...
 
Data Citation Implementation at Dataverse
Data Citation Implementation at DataverseData Citation Implementation at Dataverse
Data Citation Implementation at Dataverse
 
NIH BD2K DataMed metadata model - Force11, 2016
NIH BD2K DataMed metadata model - Force11, 2016NIH BD2K DataMed metadata model - Force11, 2016
NIH BD2K DataMed metadata model - Force11, 2016
 
Next-Generation Search Engines for Information Retrieval
Next-Generation Search Engines for Information RetrievalNext-Generation Search Engines for Information Retrieval
Next-Generation Search Engines for Information Retrieval
 
Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.
 
Wheeler & Benedict -- Enabling the Preservation Relay
Wheeler & Benedict -- Enabling the Preservation RelayWheeler & Benedict -- Enabling the Preservation Relay
Wheeler & Benedict -- Enabling the Preservation Relay
 
Fsci 2018 friday3_august_am6
Fsci 2018 friday3_august_am6Fsci 2018 friday3_august_am6
Fsci 2018 friday3_august_am6
 
FAIR Data Knowledge Graphs
FAIR Data Knowledge GraphsFAIR Data Knowledge Graphs
FAIR Data Knowledge Graphs
 
data mining
data miningdata mining
data mining
 
FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practice
 

More from Susanna-Assunta Sansone

FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
Susanna-Assunta Sansone
 
FAIRsharing-Standards-4-GSC-Aug23.pdf
FAIRsharing-Standards-4-GSC-Aug23.pdfFAIRsharing-Standards-4-GSC-Aug23.pdf
FAIRsharing-Standards-4-GSC-Aug23.pdf
Susanna-Assunta Sansone
 
FAIR-4-GSC-Sansone-Aug23.pdf
FAIR-4-GSC-Sansone-Aug23.pdfFAIR-4-GSC-Sansone-Aug23.pdf
FAIR-4-GSC-Sansone-Aug23.pdf
Susanna-Assunta Sansone
 
FAIRsharing & FAIRcookbook at RDA 2023
FAIRsharing & FAIRcookbook at RDA 2023FAIRsharing & FAIRcookbook at RDA 2023
FAIRsharing & FAIRcookbook at RDA 2023
Susanna-Assunta Sansone
 
NFDI Physical Sciences Colloquium - FAIR
NFDI Physical Sciences Colloquium - FAIRNFDI Physical Sciences Colloquium - FAIR
NFDI Physical Sciences Colloquium - FAIR
Susanna-Assunta Sansone
 
Metadata Standards
Metadata StandardsMetadata Standards
Metadata Standards
NIH BD2K DataMed data index - DATS model

  • 1. The DATS model annotated with schema.org. Susanna-Assunta Sansone, Alejandra Gonzalez-Beltran, Philippe Rocca-Serra, Oxford e-Research Centre, University of Oxford, UK. Supported by the NIH grant 1U24 AI117966-01 to UCSD.
  • 3. What is DATS? Just as the JATS (Journal Article Tag Suite) is used by PubMed to index literature, a DATS (DatA Tag Suite) is needed as a scalable way to index data sources in the DataMed prototype.
  • 4. Where do I find the documentation?
  • 8. What is DATS meant to cover and do?
      ◦ Support the intended capability of the DataMed prototype to harvest key metadata descriptors (experimental and data), such as:
          - information on, and relations between, authors, datasets, publications and funding sources, the nature of the biological signal, the nature of the perturbation, etc.
      ◦ Use cases and competency questions were used throughout the development process:
          - to define the appropriate boundaries and level of granularity: which queries will be answered in full, which only partially, and which are out of scope
  • 9. The development process in a nutshell
      ◦ Metadata elements were identified by combining two complementary approaches: top-down and bottom-up
      ◦ The model is serialized as JSON schemas and mapped to schema.org (v1.0, v1.1, v2.0, v2.1)
  • 10. Extracting requirements from use cases (top-down approach)
      ◦ Selected competency questions:
          - a representative set collected from the use cases workshop, the white paper, community submissions, and NIH and Phil Bourne’s ADDS office
          - key metadata elements were processed: abstracted, color-coded, and the terms binned as Material, Process, Information or Properties; relations were identified
  • 11. Mapping existing metadata schemas (bottom-up approach)
      ◦ schema.org
      ◦ DataCite
      ◦ RIF-CS
      ◦ W3C HCLS dataset descriptions (a mapping of many models, including DCAT, PROV, VOID and Dublin Core)
      ◦ Project Open Metadata (used by HealthData.gov; being added in this new iteration)
      ◦ ISA
      ◦ BioProject
      ◦ BioSample
      ◦ MINiML
      ◦ PRIDE-ml
      ◦ MAGE-TAB
      ◦ GA4GH metadata schema
      ◦ SRA XML
      ◦ CDISC SDM / elements of the BRIDG model
  • 13. We know that one size does not fit all
      ◦ Metadata is either too much or too little:
          - many databases won’t have all these metadata elements
          - conversely, domain-specific databases (e.g. focusing on a type of study, organism or technology) have more detailed metadata
      ◦ Our goal is NOT to develop the perfect model:
          - we have had several iterations testing the model
          - we have aimed for maximum coverage of use cases with a minimal number of data elements
          - we do foresee that not all questions can be answered in full
  • 16. Key features of DATS
      ◦ The descriptors for each metadata element (Entity) include:
          - Property (describing the Entity), Definition (of each Entity and Property), and Value(s) (allowed for each Property)
      ◦ We have defined a set of core and extended entities:
          - Core elements are generic and applicable to any type of dataset, just as the JATS can describe any type of publication
          - Extended elements include additional elements, some of which are specific to the life, environmental and biomedical science domains
          - this set can be further extended as needed
      ◦ Entities are not mandatory, in either the core or the extended set:
          - an entity is used only when applicable to the dataset to be described
          - in that case, only a few of its properties are defined as mandatory
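The rule on slide 16 (no entity is mandatory, but an entity that is used must carry its few mandatory properties) can be sketched as a small check. This is only an illustration: the entity names, the property names and the `MANDATORY` table are assumptions made for the example, not taken from the DATS specification.

```python
# Hypothetical table of mandatory properties per entity. The names are
# placeholders chosen for illustration, not the real DATS definitions.
MANDATORY = {
    "Dataset": {"title", "types"},
    "Person": {"fullName"},
}


def missing_mandatory(record):
    """Return {entity: sorted list of missing mandatory properties}.

    `record` maps entity names to dicts of property values. Entities that
    are absent from the record are never reported, because no entity is
    mandatory; only entities that are actually used get checked.
    """
    problems = {}
    for entity, props in record.items():
        required = MANDATORY.get(entity, set())
        missing = sorted(required - set(props))
        if missing:
            problems[entity] = missing
    return problems


# A record that uses Dataset but omits one of its (assumed) mandatory
# properties, and does not use Person at all.
example = {"Dataset": {"title": "Example dataset"}}
report = missing_mandatory(example)
```

Here `report` flags only the `Dataset` entity; the unused `Person` entity is not reported, which mirrors the "used only when applicable" rule.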
  • 17. General design of the DATS
      ◦ The model is designed around the Dataset, an entity intended to cater for any unit of information stored by repositories:
          - archived experimental datasets, which do not change after deposition to the repository (examples available for dbGaP, GEO, ClinicalTrials.gov)
          - datasets in reference knowledge bases, describing dynamic concepts, such as “genes”, whose definition morphs over time (examples available for UniProt)
      ◦ The Dataset entity is also linked to other digital research objects that are part of the NIH Commons, such as Software and Data Standard, which are the focus of other discovery indexes and are therefore not described in detail in this model
  • 19. core and extended elements
  • 21. 18 core elements and few mandatory properties
  • 22. Core elements provide the basic info
      ◦ What is the dataset about?
          - Material
      ◦ How was the dataset produced? Which information does it hold?
          - Dataset / Data Type with its Information, Method, Platform, Instrument
      ◦ Where can a dataset be found?
          - Dataset, Distribution, Access objects (links to License)
      ◦ When was the dataset produced, released, etc.?
          - Dates to specify the nature of an event {create, modify, start, end...} and its timestamp
      ◦ Who did the work, funded the research, hosts the resources, etc.?
          - Person, Organization and their roles, Grant
  • 29. DATS also follows the W3C Data on the Web Best Practices (https://www.w3.org/TR/dwbp), which also recommend DatasetDistribution
  • 30. Serializations
      ◦ The DATS model is defined in JSON schema and serialized as:
          - JSON (JavaScript Object Notation) format, and
          - JSON-LD (JavaScript Object Notation for Linked Data) with vocabulary from schema.org
      ◦ Serializations in other formats can also be produced, as and if needed
      ◦ A context (mapping) file is also available, meaning that other vocabularies can be used
      ◦ Discussion also ongoing with:
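To make the difference between the two serializations on slide 30 concrete, the sketch below emits the same record as plain JSON and as JSON-LD with a schema.org context. The record, its property names and the context mapping are assumptions made up for this example; they are not copied from the DATS JSON schemas or from the actual DATS context file.

```python
import json

# Hypothetical, simplified dataset description (illustrative keys only).
dataset = {
    "title": "Example expression dataset",
    "description": "A made-up record used only for illustration.",
}

# Hypothetical context (mapping) content: each key is mapped to a
# schema.org term. Pointing the keys at another vocabulary would only
# require editing this mapping, not the data itself.
context = {
    "Dataset": "https://schema.org/Dataset",
    "title": "https://schema.org/name",
    "description": "https://schema.org/description",
}


def to_json(record):
    """Serialize the record as plain JSON."""
    return json.dumps(record, indent=2, sort_keys=True)


def to_json_ld(record, ctx):
    """Serialize the record as JSON-LD by attaching a context and a type."""
    doc = {"@context": ctx, "@type": "Dataset"}
    doc.update(record)
    return json.dumps(doc, indent=2, sort_keys=True)


plain = to_json(dataset)
linked = to_json_ld(dataset, context)
```

The data keys are identical in both outputs; only the `@context` and `@type` entries are added in the JSON-LD form, which is why swapping the mapping is enough to target a different vocabulary.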
  • 32. Schema.org value-added to DATS
      ◦ Why use schema.org to annotate the DATS?
          - It is developed and used by a search engine consortium (Google, Yandex, Yahoo, etc.)
          - The NIH Commons is discussing its use
      ◦ What benefits does DataMed get by implementing a schema.org-based DATS? Especially:
          - increased visibility (in both popular search engines and DataMed), accessibility (via common query interfaces) and possibly improved ranking
      ◦ Discussion and collaboration with schema.org is ongoing:
          - missing elements (needed by DATS) have been submitted to the tracker; roughly 80% of DATS entities and properties can be mapped (although the alignment is not perfect or is less precise), and the remaining 20% constitute major gaps
          - schema.org and its related Health and Life Science extension are evolving (the latter focuses on clinical studies)
          - coordination also via the ELIXIR-supported bioschemas.org initiative
          - discussion also ongoing under the NIH Commons WGs
  • 34. DATS elements in CEDAR template
      ◦ Datasets not yet in a formal repository:
          - the CEDAR metadata authoring tool can be used to provide DATS-compliant metadata to be later indexed by DataMed
  • 35. DATS and omicsDI models - Mapping
      ◦ Overlap with DATS core elements:
          - some DATS extended elements are shown in red
  • 36. DATS and DCIP supplement - Mapping: citation metadata for repositories’ landing pages
  • 37. DATS exported by DataCite (https://github.com/datacite/spinone/issues/3)
      ◦ An API endpoint that returns DataCite metadata in DATS format is a work in progress: http://api.datacite.org/dats
      ◦ The DataCite Metadata Schema allows for a RelatedIdentifier with the HasMetadata relation type:
          - this allows linking to the DATS metadata from a DataCite metadata record
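As a rough illustration of the HasMetadata link described on slide 37, the sketch below attaches a related identifier to a DataCite-style record held as a Python dict. The dict layout only loosely mirrors a JSON rendering of the DataCite schema, and the DOI, the URL and the helper names are hypothetical placeholders, so this is a sketch under those assumptions rather than the DataCite API.

```python
def dats_related_identifier(dats_url):
    """Build a RelatedIdentifier entry that points at a DATS record."""
    return {
        "relatedIdentifier": dats_url,
        "relatedIdentifierType": "URL",
        "relationType": "HasMetadata",
    }


def add_dats_link(datacite_record, dats_url):
    """Return a copy of the record with a HasMetadata link appended.

    The input record is left unmodified; the list of related identifiers
    is copied before the new entry is added.
    """
    updated = dict(datacite_record)
    related = list(updated.get("relatedIdentifiers", []))
    related.append(dats_related_identifier(dats_url))
    updated["relatedIdentifiers"] = related
    return updated


# Hypothetical record and hypothetical DATS location.
record = {"doi": "10.1234/example", "relatedIdentifiers": []}
linked_record = add_dats_link(record, "https://example.org/dats/10.1234/example")
```

A consumer that understands the HasMetadata relation type could then follow the link from the DataCite record to the DATS description of the same dataset.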