This document discusses content standards for datasets including formats, terminologies, and guidelines. It notes that content standards provide essential descriptors for interpreting, verifying, reproducing, and reusing datasets. Content standards fall into three main categories: minimum reporting requirements and checklists, controlled vocabularies and ontologies, and conceptual data models and exchange formats. There are many community-driven and organization-driven initiatives that have established various content standards, with over 1000 working groups, 220 terminologies, and 115 guidelines. Tracking how these standards develop and relate to each other and to databases, tools, and data policies is important.
BioSharing - EUDAT semantic workshop
1. Connecting standards, databases and data policies
Susanna-Assunta Sansone
Associate Director
Oxford e-Research Centre, University of Oxford
2. • Domain-level descriptors that are essential for interpretation, verification,
reproducibility and reusability of datasets
• The depth and breadth of descriptors vary according to the domain broadly
covering the what, who, when, how and why
Content standards
4. • Guidelines: minimum information reporting requirements, checklists
o Report the same core, essential information
o e.g. MIAME guidelines
• Terminologies: controlled vocabularies, taxonomies, thesauri, ontologies etc.
o Unambiguous identification and definition of concepts
o e.g. Gene Ontology
• Formats: conceptual model, schema, exchange formats etc.
o Define the structure and interrelation of information, and the transmission format
o e.g. FASTA
Content standards: three categories
6. 883 -> ~1000
220+
115+
548
Content standards in numbers
Guidelines: MIAME, MIRIAM, MIQAS, MIX, MIGEN, ARRIVE, MIAPE, MIASE, MIQE, MISFISHIE…, REMARK, CONSORT, MIAPPE
Formats: SRAxml, SOFT, FASTA, DICOM, MzML, SBRML, SEDML…, GELML, ISA, CML, MITAB, Sample-Tab
Terminologies: AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO…, TEDDY, PRO, XAO, DO, VO
7. • Content standards (formats, terminologies, guidelines)
• Data policies by funders, journals and other organizations
• Databases, tools and services
Mapping this evolving landscape
8. A resource of the ELIXIR Interoperability Platform
• A web-based, curated and searchable portal that monitors the development and evolution of content standards, databases and data policies, to inform and educate
12. Not just quantity but quality:
rich, curated and community
vetted descriptions
13. Indicators to describe the status of standards and databases
Ready for use, implementation, or recommendation
In development
Status uncertain
Deprecated as subsumed or superseded
Manually curated and verified
by the community behind each resource
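A minimal sketch of how the four status indicators above might be modelled when filtering records. The names `Status`, `Record` and `recommendable` are hypothetical and are not taken from the portal itself; the example records are placeholders, not real standards.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """The four indicators listed on the slide (member names are assumptions)."""
    READY = "Ready for use, implementation, or recommendation"
    IN_DEVELOPMENT = "In development"
    UNCERTAIN = "Status uncertain"
    DEPRECATED = "Deprecated as subsumed or superseded"


@dataclass
class Record:
    """Hypothetical record describing a standard or a database."""
    name: str
    status: Status


def recommendable(records):
    """Return the names of records whose status is READY."""
    return [r.name for r in records if r.status is Status.READY]


# Placeholder records for illustration only.
records = [
    Record("Example format A", Status.READY),
    Record("Example guideline B", Status.IN_DEVELOPMENT),
    Record("Example terminology C", Status.DEPRECATED),
]
print(recommendable(records))
```

Keeping the indicator as an enumeration rather than free text means a record can only carry one of the four agreed values.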
16. …to inform and educate on
existing and new resources
Data Policy
17. Working with/for the community and our ‘adopters’, e.g.:
• Standard developing groups
• Journals, publishers
• Cross-links, data exchange
• Societies and organisations
• Institutional RDM services
• Projects, programmes
533 responders
18. Progressively cross-linking with other ELIXIR resources
19. • Increase discoverability (e.g. by search engines), aggregation (e.g. by indices) and analysis of content in different websites and services
• Use schema.org structured semantic markup (for web page content), as used by Google, Bing, Yahoo and Yandex
• Coordinate its extension, where needed, in the life science area
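As an illustration of the markup mentioned above, here is a minimal sketch that builds a schema.org description for a web page and wraps it for embedding. It assumes JSON-LD, one of the syntaxes schema.org markup can be written in, and the `Dataset` type with its `name`, `description` and `url` properties; the function name `jsonld_script` and all values are hypothetical placeholders.

```python
import json


def jsonld_script(name, description, url):
    """Build a schema.org JSON-LD <script> tag for a dataset landing page."""
    markup = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }
    body = json.dumps(markup, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder values for illustration only.
tag = jsonld_script(
    name="Example dataset",
    description="Placeholder description for illustration only.",
    url="https://example.org/datasets/1",
)
print(tag)
```

The resulting tag would be placed in the HTML of the page it describes, where a crawler can read it alongside the human-readable content.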
Gaining traction and
support by: