
Presenting and Preserving the Change in Taxonomic Knowledge for Linked Data



Presented at Journal Paper Track, The Web Conference, Lyon, France, April 15, 2018
Abstract: Linked Open Data (LOD) technology enables a web of data and knowledge graphs that can be exchanged over the Internet. However, knowledge changes everywhere and all the time, which makes precise data linking challenging: terms and concepts may be interpreted differently in different periods and by different communities. To address this issue, we introduce an approach to preserving knowledge graphs. We select the biodiversity domain for our case studies because knowledge in this domain changes frequently and the changes are clearly documented. Our work produces an ontology, transformation rules, and an application that demonstrate that it is feasible to present and preserve knowledge graphs while providing open and accurate access to linked data. The approach covers changes in names and their relationships across different times and communities, as seen in the cases of taxonomic knowledge.


  1. Presenting and Preserving the Change in Taxonomic Knowledge for Linked Data
     Rathachai Chawuthai; Hideaki Takeda (Professor); Vilas Wuwongse (Professor); Utsugi Jinbo (Entomologist, Taxonomist)
     Journal Paper Track, The Web Conference, Lyon, France, April 15, 2018
     Semantic Web, vol. 7, no. 6, pp. 589-616, 2016, DOI: 10.3233/SW-150192
  2. Agenda
     • Change in Taxonomy
     • LTK: A Logical Model for Linking Taxonomic Knowledge
     • Result
  3. Change in Taxonomy
  4. Biodiversity Knowledge
     • Knowledge on the Biodiversity Domain
       • Taxonomy: description, identification, nomenclature, and classification of organisms.
         • Taxa (taxon)
         • Scientific names
       • Information on taxa: taxonomic concept description, interspecies interaction, ecological information, food web, etc.
     • Databases: GBIF, uBio, TDWG, ZooBank, MycoBank, etc.
       • Most of them are based on scientific names.
     • Problem: taxonomic knowledge is dynamic
       • Biologists continue to discover new knowledge.
       • Changes in taxonomic knowledge are common because of new discoveries and new viewpoints among biologists.
  5. Example
     • Icterus galbula (Linnaeus, 1758): "Baltimore Oriole"
     • Icterus bullockii (Swainson, 1827): "Bullock's Oriole"
  6. Timeline
     • 1758: I. galbula
     • 1827: I. bullockii
  7. Timeline (continued)
     • 1758: I. galbula
     • 1827: I. bullockii
     • 1964: I. galbula and I. bullockii merged into I. galbula
  8. Timeline (continued)
     • 1758: I. galbula
     • 1827: I. bullockii
     • 1964: I. galbula and I. bullockii merged into I. galbula
     • 1995: I. galbula split into I. bullockii and I. galbula
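The timeline on the preceding slides can be sketched as a small list of change records. This is only an illustration, not part of the presented work: the `Change` record, its field names, and the `accepted_names` helper are assumptions introduced here, and only the years and names come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One change in taxonomy (hypothetical record layout)."""
    year: int
    kind: str        # "create", "merge", or "split"
    sources: tuple   # names accepted before the change
    targets: tuple   # names accepted after the change


# Years and names are taken from the oriole example on the slides.
HISTORY = [
    Change(1758, "create", (), ("I. galbula",)),
    Change(1827, "create", (), ("I. bullockii",)),
    Change(1964, "merge", ("I. galbula", "I. bullockii"), ("I. galbula",)),
    Change(1995, "split", ("I. galbula",), ("I. galbula", "I. bullockii")),
]


def accepted_names(history, year):
    """Replay all changes up to and including `year`; return accepted names."""
    accepted = set()
    for change in sorted(history, key=lambda c: c.year):
        if change.year > year:
            break
        accepted -= set(change.sources)
        accepted |= set(change.targets)
    return accepted
```

Replaying the history shows why a bare name is ambiguous: "I. galbula" is accepted in 1800, 1970, and 2000, yet it covers a different set of organisms in each period, which is the gap the following slides address.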
  9. Challenge
     • How can changes in taxonomy be represented and preserved?
       • Current knowledge is not the only valuable knowledge; past knowledge should also be preserved correctly.
     • How can these changes be published as Linked Data with
       • machine- and human-readable entities (taxon concept with name and context), and
       • light-weight expressions (compatible with the current use of taxa in other databases)?
  10. LTK: A Logical Model for Linking Taxonomic Knowledge
  11. LTK: Linked Taxonomic Knowledge
      Linked Taxonomic Knowledge (LTK) preserves and presents the change in taxonomic knowledge for linked data.
      • The model can manage changes in taxonomic knowledge.
      • The model preserves each change as an event, together with its time and provenance.
      • The model supports changes in either taxa or associations between taxa.
      • The model allows tracing the background knowledge of changes by linking cause and effect between them.
      • The model can be used to publish a dataset in a format suitable for linked open data.
      • The linked data model uses simple identifiers for Semantic Web resources so that the linked data is easily recognized by both humans and machines.
      • The model provides the sequence of changes in taxa.
      • The model presents temporal data as of a given time point.
  12. Definition
      • Entities for LTK
        • Nominal Entity, Simple Nominal Entity, and Contextual Nominal Entity
      • Operations of Change
        • Change in Conceptions: merge, split, and replace
        • Change in Relations: change higher taxon, subdivide, combine, synonym link, etc.
      • Data Models
        • Event-Centric Model, Transition Model, and Snapshot Model
      • Symbols in the following diagrams
        • (nom) is an instance of a nominal entity,
        • (sim) is an instance of a simple nominal entity,
        • (con) is an instance of a contextual nominal entity,
        • (OPR) is a class of a change entity (operation),
        • (opr) is an instance of an operation, and
        • (event) is an instance of an event entity.
  13. Entity Issue: Taxon and Name
      A taxon can be a species, a genus, a family, etc. However, a taxon may become a synonym over time, and vice versa.
      Merging the 2 genera Bubo and Nyctea into Bubo makes Nyctea scandiaca a synonym of Bubo scandiacus.
      [Diagram: Nyctea scandiaca in 1999 and now; labels: Name, Taxon]
  14. Taxon ID for Linked Data
      Introduce terms that satisfy the use cases of biologists. Taxon concepts and names are each identified by a URI.
      • Nominal Entity (nom): a concept and an Internet resource used for taxonomic knowledge; it can be a taxon concept or a name (e.g., a synonym).
      • Simple Nominal Entity (sim): a subset of the Nominal Entity that corresponds to a single scientific name.
        • genus:Bubo (accepted) is a taxon concept.
        • genus:Nyctea (obsoleted) is a name.
      • Contextual Nominal Entity (con): a version of a nominal entity specified by the period in which it is accepted.
        • genus:Bubo_1999 dct:isVersionOf genus:Bubo.
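The relationship between a simple and a contextual nominal entity can be sketched with plain strings. The slide gives only the example genus:Bubo_1999 dct:isVersionOf genus:Bubo; the `<name>_<year>` suffix pattern generalized from it and the helper names below are assumptions made for illustration.

```python
def contextual_id(simple_id: str, year: int) -> str:
    """Build a contextual identifier by appending a year (assumed pattern)."""
    return f"{simple_id}_{year}"


def version_triple(simple_id: str, year: int) -> tuple:
    """Link the contextual entity back to its simple nominal entity."""
    return (contextual_id(simple_id, year), "dct:isVersionOf", simple_id)


def split_contextual_id(con_id: str) -> tuple:
    """Recover the simple identifier and the year from a contextual identifier."""
    simple_id, _, year = con_id.rpartition("_")
    return simple_id, int(year)
```

Because the name and the version are both readable from the identifier itself, a consumer can obtain the name of a taxon without following an extra triple.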
  15. Ontology for Knowledge Change
      • Change in taxonomic knowledge is modeled as operations.
      • The operations are organized as an ontology.
  16. Event-Centric Model
      An RDF format for presenting the operations of change together with their time and references. It also provides links between operations to show the reasons behind a change. Because it is an n-ary relation, it is complicated by design, but it is flexible enough for use by other applications.
      [Diagram labels: operation classes (OPR) ltk:TaxonMerger and ltk:ChangeHigherTaxon; operation instances (opr) ex:merge1 and ex:reclass1, linked by rdf:type; event (event) ex:event1 with cka:interval, tl:beginsAtDateTime "t1", and tl:endsAtDateTime "t2"; cka:effect; contextual entities (con) ex:A_1, ex:B_1, ex:A_2, and ex:X_1]
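The links between operations can be followed to trace what a change led to. The sketch below is a minimal illustration using tuples as triples; the identifiers come from the diagram labels, but the direction of cka:effect (from a cause to its effect) and the `downstream_effects` helper are assumptions, not something the slide states.

```python
from collections import deque

# Triples as (subject, predicate, object). Assumption: cka:effect points
# from a causing operation to the operation it triggers.
TRIPLES = [
    ("ex:merge1", "rdf:type", "ltk:TaxonMerger"),
    ("ex:reclass1", "rdf:type", "ltk:ChangeHigherTaxon"),
    ("ex:merge1", "cka:effect", "ex:reclass1"),
]


def downstream_effects(triples, operation):
    """Return operations reachable from `operation` via cka:effect (BFS order)."""
    effects = {}
    for subject, predicate, obj in triples:
        if predicate == "cka:effect":
            effects.setdefault(subject, []).append(obj)

    seen = {operation}
    order = []
    queue = deque([operation])
    while queue:
        current = queue.popleft()
        for nxt in effects.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order
```

Under these assumptions, asking for the effects of ex:merge1 yields ex:reclass1, for example a genus merger that forces a species to be reclassified under a new higher taxon.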
  17. Transition Model
      The transition model is derived from the event-centric model by Semantic Web rules. It consists of flat, straightforward, and easily linkable triples that represent the chronological changes of taxon concepts or their names.
      [Diagram labels, Event-Centric Model: ltk:TaxonMerger (OPR) rdfs:subClassOf cka:ConceptEvolution; ex:merge1 (opr) rdf:type ltk:TaxonMerger; ex:A_1, ex:B_1, and ex:A_2 (con) with ltk:mergedInto; ex:event1 (event) with cka:interval, tl:beginsAtDateTime "t1", tl:endsAtDateTime "t2", and cka:assures]
      [Diagram labels, Transition Model (after applying the rules): ex:A_1, ex:B_1, and ex:A_2 (con) with ltk:majorMergedInto; ex:inv1 with ltk:majorLink; ltk:entered and ltk:expired with "t1" and "t2"]
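The idea of flattening an operation into directly linkable triples can be sketched in a few lines. The slides describe Semantic Web rules for this step; the plain function below only imitates one such rule, and the dictionary layout (`type`, `inputs`, `output`) is a hypothetical stand-in for the event-centric graph.

```python
# Hypothetical, simplified record of the merge operation from the diagram.
MERGE1 = {
    "id": "ex:merge1",
    "type": "ltk:TaxonMerger",
    "inputs": ["ex:A_1", "ex:B_1"],   # contextual entities before the merge
    "output": "ex:A_2",               # contextual entity after the merge
}


def to_transition_triples(operation):
    """Flatten a merge operation into one ltk:mergedInto triple per input."""
    if operation["type"] != "ltk:TaxonMerger":
        return []  # other operation types would need their own rules
    return [
        (source, "ltk:mergedInto", operation["output"])
        for source in operation["inputs"]
    ]
```

Each input taxon is then one triple away from the taxon it was merged into, which is the kind of single-triple access the transition model aims for.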
  18. Snapshot Model
      The snapshot model is a set of simple, regular triples derived from the event-centric model for a given time point using Semantic Web rules, so the triples present the knowledge as it stood at that time point.
      [Diagram labels, Event-Centric Model: ltk:ChangeHigherTaxon (OPR) rdfs:subClassOf cka:RelationshipEvolution; ex:reclass1 (opr) rdf:type ltk:ChangeHigherTaxon; cka:relation ltk:higherTaxon; ex:A_2, ex:X_1, and ex:B_1 (con); ex:event1 (event) with cka:interval, tl:beginsAtDateTime "t1", tl:endsAtDateTime "t2", and cka:assures; ex:inv1]
      [Diagram labels, Snapshot Model (after applying the rules): named graph ex:inv1 (the name of the graph) with tl:beginsAtDateTime "t1" and tl:endsAtDateTime "t2", containing ltk:higherTaxon between ex:A_2 and ex:X_1 (con)]
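Selecting the triples that hold at a time point can be sketched as an interval filter. Everything concrete below is an assumption for illustration: the dates, the extra graph ex:inv0 and entity ex:W_1, the direction of ltk:higherTaxon (lower taxon to higher taxon), and the half-open treatment of intervals. Only the idea of a named graph with a begin and end time comes from the slide.

```python
from datetime import date

# Each entry stands for a named graph with a validity interval.
# Dates, ex:inv0, and ex:W_1 are hypothetical placeholders.
VERSIONED_RELATIONS = [
    {
        "graph": "ex:inv0",
        "begins": date(1990, 1, 1),
        "ends": date(2000, 1, 1),
        "triple": ("ex:A_1", "ltk:higherTaxon", "ex:W_1"),
    },
    {
        "graph": "ex:inv1",
        "begins": date(2000, 1, 1),
        "ends": None,  # still valid
        "triple": ("ex:A_2", "ltk:higherTaxon", "ex:X_1"),
    },
]


def snapshot(relations, at):
    """Return triples whose interval [begins, ends) contains the date `at`."""
    return [
        entry["triple"]
        for entry in relations
        if entry["begins"] <= at and (entry["ends"] is None or at < entry["ends"])
    ]
```

Treating the interval as half-open means a relation that ends on a date and its successor that begins on the same date never appear in the same snapshot.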
  19. LTK with LOD Cloud
      Role of LTK (right) in the LOD Cloud (left), which contains example datasets. Ovals labeled with a single letter or an ID number are general concepts, ovals with a version are versions of general concepts, and dashed lines indicate identical URIs. :sameAs is owl:sameAs, :isVer is dct:isVersionOf, :re is ltk:replacedInto, and :mg is ltk:mergedInto.
      [Diagram labels: LOD Cloud with Example Dataset 1 (GBIF) and Example Dataset 2 (LODAC); Linked Taxonomic Knowledge with the Event-Centric Model (for presenting change), the Transition Model / Snapshot Model (for linked data), and External Links (for managing linked data with external datasets)]
  20. Result
  21. Outcome
      • Evaluation against Use Cases
        • Changes in moth species of the family Saturniidae across 3 checklists: Inoue (1982), Jinbo (2008), and Kishida (2011)
        • The LTK model covers all cases, including creating a concept, obsoleting a concept, replacing a taxon, merging taxa, splitting a taxon, linking a synonym, changing a higher taxon, subdividing a taxon, and combining taxa.
      • Implementation
  22. Comparison & Discussion

      | Criteria | TaxMeOn (& its enhancement) | LTK |
      |---|---|---|
      | Change in Knowledge | | |
      | Capturing changes in taxonomy | Yes | Yes (Event-Centric Model) |
      | Presenting context in a graph | No | Yes (Event-Centric Model) |
      | Linking background between changes | No (limited by design because a single binary relation presents changes) | Yes (Event-Centric Model) |
      | Human-Readable Identifiers | | |
      | Including a human-readable name in a URI | Rare (only in the schema, not in taxon concepts) | Yes (SIM & CON) |
      | Light-Weight Triples | | |
      | Accessing a name of a taxon | Use 1 triple (taxon and name are split) | Get directly from the URI (SIM & CON) |
      | Accessing taxa before and after merging or splitting | Use 2 triples | Use 1 triple (Transition Model) |
      | Presenting a relation between two names | Use 3 triples | Use 1 triple (CON & Transition/Snapshot Model) |
      | Accessing temporal information | By full-text linking to a taxon | Yes (Snapshot Model) |
  23. Extensibility
      • The LTK framework allows a system to be extended to other domains with other vocabularies.
      • Developers can create other operations under either the class of change in conception (cka:ConceptEvolution) or the class of change in triple (cka:RelationshipEvolution), reusing or adapting the Semantic Web rules.
      Example: Geographic Area Representations in Statistical Linked Open Data of Japan, D. Yamamoto, et al. Joint Proceedings of the International Workshops on Hybrid Statistical Semantic Understanding and Emerging Semantics, and Semantic Statistics, co-located with the 16th International Semantic Web Conference (ISWC 2017)
  24. Thank you very much