1. The document discusses different approaches to modeling biodiversity data as linked data, including representing taxa as classes or class instances. Representing them the same way across datasets maximizes interlinking but may limit reasoning abilities.
2. It also contrasts modeling from a thesaurus perspective with modeling from a biological perspective. A thesaurus describes concepts as individuals, while an ontology can define classes through necessary and/or sufficient membership conditions.
3. Pragmatism in modeling linked data, that is, adopting the majority approach, can increase interoperability now, but it may reduce opportunities for reasoning and inference later, because classes and class instances cannot always be aligned. Choosing a clear modeling perspective is therefore important.
Modelling Biodiversity Linked Data: Pragmatism May Narrow Future Opportunities
1. 1Franck MICHEL - Université Côte d’Azur, CNRS, Inria, I3S, France
F. Michel (1), C. Faron-Zucker (1), S. Tercerie (2), O. Gargominy (2)
(1) Université Côte d’Azur, CNRS, Inria, I3S, France. (2) Service du Patrimoine Naturel, MNHN, CNRS, France.
Modelling Biodiversity Linked Data:
Pragmatism May Narrow Future Opportunities
SPNHC+TDWG 2018
Dunedin, New Zealand
2.
Source: Sangya Pundir. https://fr.wikipedia.org/wiki/Fichier:FAIR_data_principles.jpg
3.
LOD Cloud: 1184 datasets, 150B Statements
Linking Open Data cloud diagram, 2018. J.P. McCrae, A. Abele,
P. Buitelaar, A. Jentzsch, V. Andryushechkin and R. Cyganiak.
http://lod-cloud.net/
On the Web, under open licenses
Machine-readable (RDF)
URIs to name things
Common vocabularies
Linked with each other
Queryable
4.
TAXREF-LD
NCBI Taxon
TaxonConcept
GeoSpecies
Plant Ontology
ENVO
5.
The Semantic Web is more than the Web of Data
“The Semantic Web provides an environment where
applications can publish and link data, define vocabularies,
query data at web scale, and draw inferences.” (adapted from W3C website)
Linked Data, Querying, Vocabularies, Inference
6.
Model a “thing” as a class or a class instance?
Thesaurus perspective: one class for all taxa
• AGROVOC thesaurus (skos:Concept); OpenBiodiv-O, EOL trait bank (dwc:Taxon); Wikidata (wd:Q16521)
• [Diagram: Delphinus and Delphinus delphis, both instances of Taxon, linked by parent/broader]
Taxonomic Rank perspective: one class per taxonomic rank
• GeoSpecies, TaxonConcept, DBpedia, BBC WO, BioFid.de
• [Diagram: Delphinus delphis (Species) linked to Delphinus (Genus) by hasGenus/broader]
Biological perspective: one class per taxon
• NCBI Org. Classification, VTO, TAXREF-LD
• [Diagram: Delphinus delphis subClassOf Delphinus; individual Flipper]
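The three perspectives above can be written down as small sets of triples. The following is only a sketch: the "ex:" prefix, the variable names and the helper modelled_as_class are hypothetical and do not come from any of the datasets named on the slide. It shows that the same taxon ends up as a class in one perspective and as a class instance in the other two.

```python
# Illustrative only: "ex:" is a hypothetical namespace, not a real dataset.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Thesaurus perspective: one class for all taxa, taxa are instances.
thesaurus_view = {
    ("ex:Delphinus", RDF_TYPE, "skos:Concept"),
    ("ex:Delphinus_delphis", RDF_TYPE, "skos:Concept"),
    ("ex:Delphinus_delphis", "skos:broader", "ex:Delphinus"),
}

# Taxonomic rank perspective: one class per rank, taxa are instances.
rank_view = {
    ("ex:Delphinus", RDF_TYPE, "ex:Genus"),
    ("ex:Delphinus_delphis", RDF_TYPE, "ex:Species"),
    ("ex:Delphinus_delphis", "ex:hasGenus", "ex:Delphinus"),
}

# Biological perspective: one class per taxon, organisms are instances.
biological_view = {
    ("ex:Delphinus", RDF_TYPE, "owl:Class"),
    ("ex:Delphinus_delphis", RDF_TYPE, "owl:Class"),
    ("ex:Delphinus_delphis", SUBCLASS_OF, "ex:Delphinus"),
    ("ex:Flipper", RDF_TYPE, "ex:Delphinus_delphis"),
}


def modelled_as_class(triples, taxon):
    """True if the taxon is typed owl:Class or takes part in a subClassOf triple."""
    for s, p, o in triples:
        if s == taxon and p == RDF_TYPE and o == "owl:Class":
            return True
        if p == SUBCLASS_OF and taxon in (s, o):
            return True
    return False


for name, view in [("thesaurus", thesaurus_view),
                   ("rank", rank_view),
                   ("biological", biological_view)]:
    print(name, modelled_as_class(view, "ex:Delphinus_delphis"))
```

Only the biological view can carry an individual organism such as Flipper as an instance of the taxon, which is what later makes class-based reasoning possible.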
7.
Thesaurus vs. Biological perspectives
Thesaurus
Representation of domain knowledge in a machine-processable format
Define and organize terms
• Hierarchy of concepts using relations between concepts (broader, narrower, match…)
• Concepts are individuals = class instances
• Individuals are described
Biological
Formal conceptualization of domain knowledge in a machine-processable format
Define and organize terms
• Hierarchy of concepts using “subclass of” (subsumption), “is part of” (composition), or other relations
• Concepts are classes = sets of individuals
• Classes can be described and/or defined
8.
Description vs. Definition
An individual is described by stating its properties.
• [Diagram: Delphinus delphis as an individual: habitat = marine; parental care = none; avg. body length = 2.44 m; rank = species]
A class can be defined by a set of necessary and/or sufficient membership conditions.
• [Diagram: Delphinus delphis as a class: subClassOf Mammals, restriction (habitat, marine) … restriction (parental care, none); avg. body length = 2.44 m; rank = species]
• [Diagram: Flipper “is a” member of the class, with habitat = marine and parental care = none]
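The difference between describing and defining can be sketched in a few lines. This is a simplified illustration, not OWL semantics: Individual, DefinedClass and has_member are hypothetical names, the two conditions taken from the slide are assumed to be jointly necessary and sufficient, and a missing property counts as a failed condition (a closed-world shortcut that a real reasoner would not take). The individual "Snowy" is invented for contrast.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    """Something that is only described: a name plus stated properties."""
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class DefinedClass:
    """A class defined by membership conditions (property -> required value).

    Assumption for this sketch: the conditions are jointly necessary
    and sufficient.
    """
    name: str
    conditions: dict

    def has_member(self, individual):
        # An individual belongs to the class if it meets every condition.
        return all(individual.properties.get(prop) == value
                   for prop, value in self.conditions.items())


# Conditions copied from the slide's restrictions.
delphinus_delphis = DefinedClass(
    "Delphinus delphis",
    {"habitat": "marine", "parental care": "none"},
)

flipper = Individual("Flipper", {"habitat": "marine", "parental care": "none"})
snowy = Individual("Snowy", {"habitat": "freshwater", "parental care": "none"})

print(delphinus_delphis.has_member(flipper))
print(delphinus_delphis.has_member(snowy))
```

A description alone only records what Flipper is like. The definition is what lets a program decide that Flipper belongs to the class without that membership being asserted by hand.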
9.
The Semantic Web is more than the Web of Data
Linked Data, Querying, Vocabularies, Inference
Infer subsumption relationships between classes
Classify individuals:
compute instance relationships between individuals and classes
Improve query answering:
query expansion, infer new triples to improve performance
Align similar classifications
…
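Two of the inference tasks listed above, inferring subsumption and classifying individuals, can be sketched with a plain transitive closure. The function names are hypothetical, and the family Delphinidae is added here only to give the hierarchy a third level. A real reasoner does considerably more than this.

```python
def superclasses(cls, sub_class_of):
    """Return all direct and inherited superclasses of cls."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def instances_of(cls, asserted_type, sub_class_of):
    """Individuals whose asserted class is cls or one of its subclasses."""
    return {
        individual
        for individual, asserted in asserted_type.items()
        if asserted == cls or cls in superclasses(asserted, sub_class_of)
    }


# Asserted knowledge: direct subClassOf links and one typed individual.
sub_class_of = {
    "Delphinus delphis": ["Delphinus"],
    "Delphinus": ["Delphinidae"],  # added for illustration
}
asserted_type = {"Flipper": "Delphinus delphis"}

print(superclasses("Delphinus delphis", sub_class_of))
print(instances_of("Delphinidae", asserted_type, sub_class_of))
```

The second call is a small case of query expansion: a query for members of Delphinidae also returns Flipper, although only its species was asserted.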
11.
Modelling LD requires tackling several questions
What is my modelling perspective? For what use?
Thesaurus? Ontology? Other?
Will I need some type of automatic reasoning eventually?
How to maximize interlinking with related datasets?
• Theoretical issue: DL best practices discourage aligning classes and class instances
⟹ Linking thesauruses and ontologies not always possible, e.g.:
A taxon in NCBI (class) ≠ A taxon in AGROVOC (instance of the SKOS concept class)
• Pragmatism ⟹ adopt the majority trend to maximize interlinking
Taxa = class instances in AGROVOC, EOL, Wikidata, OpenBiodiv-O,
DBpedia, GeoSpecies, TaxonConcept
Taxa = classes in VTO, NCBI, TAXREF-LD
Trade-off between interlinking and reasoning?
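The interlinking issue can be made concrete with a small lookup. This is a sketch under assumptions: the table restates the slide's grouping of datasets, linking_property is a hypothetical helper, and the choice of owl:sameAs for instance-to-instance links is one option among several (skos:exactMatch is a common alternative between SKOS concepts).

```python
# How each dataset models taxa, as grouped on the slide.
TAXON_MODELLING = {
    "AGROVOC": "instance",
    "EOL": "instance",
    "Wikidata": "instance",
    "OpenBiodiv-O": "instance",
    "DBpedia": "instance",
    "GeoSpecies": "instance",
    "TaxonConcept": "instance",
    "VTO": "class",
    "NCBI": "class",
    "TAXREF-LD": "class",
}


def linking_property(source, target):
    """Suggest a property to align a taxon across two datasets.

    Returns None when one side is a class and the other an instance,
    the case that DL best practices discourage.
    """
    kinds = (TAXON_MODELLING[source], TAXON_MODELLING[target])
    if kinds == ("class", "class"):
        return "owl:equivalentClass"
    if kinds == ("instance", "instance"):
        return "owl:sameAs"  # assumption: skos:exactMatch is also common
    return None


print(linking_property("NCBI", "TAXREF-LD"))
print(linking_property("AGROVOC", "Wikidata"))
print(linking_property("NCBI", "AGROVOC"))
```

The mixed case returning None is the trade-off in miniature: a dataset that models taxa as classes keeps its reasoning options but has no straightforward link to the majority of datasets.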
12.
Take-home messages
The Semantic Web is not just Linked Data.
Think of what inference may solve in my context.
Choose a modelling perspective for my LD:
controlled vocabulary, thesaurus, ontology, …
Pragmatism can be beneficial in the short term,
but may come with a price.
13.
Citation:
Michel F., Faron-Zucker C., Tercerie S. and Gargominy O. (2018).
Modelling Biodiversity Linked Data: Pragmatism May Narrow
Future Opportunities. Biodiversity Information Science and
Standards 2: e26235. https://doi.org/10.3897/biss.2.26235
Thank you