Christophe Debruyne, Jonathan Riggio, Emma Gustafson, Declan O'Sullivan, Mathieu Vinken, Tamara Vanhaecke, Olga De Troyer.
Presented at the 2020 IEEE 14th International Conference on Semantic Computing, San Diego, California, 3-5 February 2020
Toxicology aims to understand the adverse effects of
chemical compounds or physical agents on living organisms. For
chemicals, much information regarding safety testing of cosmetic
ingredients is now scattered in a plethora of safety evaluation
reports. Toxicologists in our university intend to collect this
information into a single repository. Their current approach uses
spreadsheets, does not scale well, and makes data curation and
querying cumbersome. Semantic technologies (e.g., RDF, OWL,
and Linked Data principles) would be more appropriate for
this purpose. However, these technologies are not very accessible to
toxicologists without extensive training. In this paper, we report
on a tool that supports subject matter experts in the construction
of an RDF-based knowledge base for the toxicology domain. The
tool uses the jigsaw metaphor to guide the subject matter
experts. We demonstrate that the jigsaw metaphor is a viable
option for generating RDF. Future work includes investigating
appropriate methods and tools for knowledge evolution and data
analysis.
Facilitating Data Curation: a Solution Developed in the Toxicology Domain
1. Facilitating Data Curation:
a Solution Developed in the Toxicology Domain
Christophe Debruyne (1,3), Jonathan Riggio (1), Emma Gustafson (2), Declan
O'Sullivan (3), Mathieu Vinken (2), Tamara Vanhaecke (2), and Olga De Troyer (1)
(1) WISE – Vrije Universiteit Brussel, Belgium
(2) IVTD – Vrije Universiteit Brussel, Belgium
(3) ADAPT – Trinity College Dublin, Ireland
2020-02-05 @ ICSC
2. Context
• The goal of toxicology is to understand the adverse effects of
chemical compounds or physical agents on living organisms.
• The IVTD Group at the VUB aims to collect safety testing data on
cosmetic ingredients from publicly available safety evaluation
reports, with the goal of creating a knowledge base.
2/5/20 ICSC 2020 2
3. Problem
• Subject matter experts currently rely on spreadsheets, which the
experts consider a “computational database”
– a term that experts in this domain use for any technology and representation
allowing for some data manipulation and analysis, ranging from spreadsheets
to more complex (domain-specific) database technologies.
• Spreadsheets, however, cause problems for data curation, data
analysis, and data exploration.
4. Problem
• Semantic technologies can overcome the challenges that come with
spreadsheets, and furthermore allow data to be linked with
external datasets.
• But these technologies are not easily accessible to these subject
matter experts [1]
• How can we facilitate subject matter experts in the domain of
toxicology in the creation of a knowledge base for the available
safety evaluation reports?
7. Proposed Solution: Using the Jigsaw Metaphor to Guide
Experts in Creating Resources
• An interface metaphor [2] draws upon knowledge of
familiar concepts to facilitate learning and using a system.
• The Jigsaw Metaphor draws upon one’s familiarity with jigsaw
puzzle pieces, and has been successful for other tasks:
programming, querying, data integration, …
• In our proposed solution, the Jigsaw puzzle pieces guide subject
matter experts in creating valid data entries.
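One way to picture how pieces can guide experts towards valid entries is a typed-connection check: a slot only accepts pieces whose declared type it lists. The following is a minimal sketch in plain JavaScript, not the prototype's code. Blockly provides its own connection type checks, and the slot and piece names used here are hypothetical.

```javascript
// Simplified model of typed connections. A slot lists the piece types it
// accepts; a piece declares the type it outputs. All names are hypothetical.
const studySlot = { name: "endpoints", check: ["Study-Endpoints"] };
const endpointsPiece = { type: "Study-Endpoints", output: "Study-Endpoints" };
const materialPiece = { type: "Study-TestMaterial", output: "Study-TestMaterial" };

// A piece fits a slot when the slot is unconstrained or when the slot
// explicitly accepts the piece's output type.
function fits(slot, piece) {
  if (!slot.check || slot.check.length === 0) return true;
  return slot.check.includes(piece.output);
}

console.log(fits(studySlot, endpointsPiece)); // true
console.log(fits(studySlot, materialPiece));  // false
```

Under this model, an expert cannot attach a piece in a place where the resulting entry would not match the expected structure.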
8. The Prototype
The prototype is built on top of Google Blockly for the metaphor and
Apache Jena for the knowledge base.
10. Knowledge Organization
Ontology: stored as an OWL file on the Web with the namespace as the location (URL).
ns:Test
    a owl:Class .
ns:SkinAbsorptionInVitroNonOECD
    a owl:Class .
ns:year
    a owl:DatatypeProperty ;
    rdfs:range xsd:string .
...
11. Knowledge Organization
Representation: attributes and attribute groups. Each graph is stored as a named graph in triplestore 1.
Graph ns:SkinAbsorptionInVitroNonOECD
...
ns:attribute [
    ns:predicate ns:year ;
    ns:order "C"
] ;
...
12. Knowledge Organization
Instances of reports and studies. Each graph is stored as a named graph in triplestore 2.
Graph <http://.../dossier/7>
<http://.../dossier/7>
    a ns:Report ;
    ns:contains [
        a ns:SkinAbsorptionInVitroNonOECD ;
        ns:year "ND"^^xsd:string ;
        ...
    ] ;
    ...
.
14. Knowledge Organization
• Ontology: stored as an OWL file on the Web with the namespace as the location (URL).
• Representation (attributes and attribute groups): each graph is stored as a named graph in triplestore 1 and is used for generating jigsaw pieces.
• Instances of reports and studies: experts "assemble" reports, and the composed reports are used for generating RDF. Each graph is stored as a named graph in triplestore 2.
15. Representing the Structure of Puzzle Pieces
• ns:attributeGroup relates instances
of ns:Study or ns:AttributeGroup to
instances of ns:AttributeGroup;
• ns:attribute relates instances of
ns:Study or ns:AttributeGroup to
instances of ns:Attribute;
• ns:predicate relates an ns:Attribute to
a predicate from the ontology;
• ns:order and ns:color.
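To illustrate how such a description could drive the generation of a puzzle piece, the sketch below turns a plain JavaScript object into a Blockly-style JSON block definition. It is a minimal sketch under stated assumptions rather than the prototype's generator: the object shape, the example.org IRIs, and the idea of sorting attributes lexicographically by their order key are all assumptions made for illustration.

```javascript
// Hypothetical, simplified stand-in for one attribute group as it might be
// read from a representation graph (ns:attribute, ns:predicate, ns:order).
const endpointsGroup = {
  iri: "http://example.org/blocks#Study-Endpoints", // assumed IRI
  label: "Endpoints",
  colour: 202,
  attributes: [
    { predicate: "http://example.org/onto#year", order: "B" },
    { predicate: "http://example.org/onto#methodOfAnalysis", order: "A" },
  ],
};

// Local name of an IRI: the part after the last '#' or '/'.
function localName(iri) {
  return iri.slice(Math.max(iri.lastIndexOf("#"), iri.lastIndexOf("/")) + 1);
}

// Build a Blockly-style JSON block definition with one text field per
// attribute. Attributes are sorted by their order key (an assumption), and
// the predicate IRI is used as the field name.
function toBlockDefinition(group) {
  const sorted = [...group.attributes].sort((a, b) =>
    a.order < b.order ? -1 : a.order > b.order ? 1 : 0);
  const def = { type: group.iri, message0: group.label };
  sorted.forEach((attr, i) => {
    def[`message${i + 1}`] = `${localName(attr.predicate)} %1`;
    def[`args${i + 1}`] = [{ type: "field_input", name: attr.predicate }];
  });
  def.output = group.iri;
  def.colour = group.colour;
  return def;
}

const definition = toBlockDefinition(endpointsGroup);
console.log(JSON.stringify(definition, null, 2));
```

Using the predicate IRI as the field name keeps the link between a filled-in field and the ontology property it stands for.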
16. Generating RDF
Blockly.Blocks['...#SkinAbsorptionInVitroNonOECD-Endpoints'] = {
  init: function() {
    this.jsonInit({
      "type": "...#SkinAbsorptionInVitroNonOECD-Endpoints",
      "message0": 'Endpoints',
      "message1": 'methodOfAnalysis %1',
      "args1": [{
        "type": "field_input",
        "name": "http://ontologies.vub.be/oecd#methodOfAnalysis" }],
      // OMITTED FOR BREVITY
      "output": "...#SkinAbsorptionInVitroNonOECD-Endpoints",
      "colour": 202 });
  }
};
IRIs of the ontology are encoded in the block definitions. Blockly
XML is then transformed into RDF.
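The transformation step can be sketched as a walk over the assembled pieces that emits one triple per field. The example below is a minimal sketch and not the prototype's transformation: it starts from a hypothetical JavaScript object standing in for already-parsed Blockly XML, uses an assumed example.org namespace, and writes N-Triples lines with nested pieces as blank nodes linked through ns:contains.

```javascript
// Hypothetical stand-in for an assembled report after the Blockly XML has
// been parsed: each piece has a class IRI, literal fields keyed by predicate
// IRI, and child pieces.
const NS = "http://example.org/onto#"; // assumed namespace
const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

const report = {
  iri: "http://example.org/dossier/7", // assumed IRI
  type: NS + "Report",
  children: [
    {
      type: NS + "SkinAbsorptionInVitroNonOECD",
      fields: { [NS + "year"]: "ND" },
      children: [],
    },
  ],
};

// Escape a string for use inside a double-quoted N-Triples literal.
function literal(value) {
  const escaped = value
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r");
  return `"${escaped}"`;
}

// Walk the piece tree and emit one N-Triples line per statement. Nested
// pieces become blank nodes linked to their parent through ns:contains.
function toNTriples(root) {
  const lines = [];
  let counter = 0;
  function visit(piece, subject) {
    lines.push(`${subject} <${RDF_TYPE}> <${piece.type}> .`);
    for (const [predicate, value] of Object.entries(piece.fields || {})) {
      lines.push(`${subject} <${predicate}> ${literal(value)} .`);
    }
    for (const child of piece.children || []) {
      const node = `_:b${counter++}`;
      lines.push(`${subject} <${NS}contains> ${node} .`);
      visit(child, node);
    }
  }
  visit(root, `<${root.iri}>`);
  return lines;
}

const triples = toNTriples(report);
console.log(triples.join("\n"));
```

Because the field names are predicate IRIs, no separate mapping table is needed at this step.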
17. Conclusions and Future Work
• We proposed the adoption of a jigsaw metaphor to facilitate data
curation by subject matter experts.
• Future work is twofold:
– Use of the tool by subject matter experts, foreseen for 2020
– Adoption of the jigsaw metaphor for defining and maintaining the various
blocks
18. References
1. L. McKenna, C. Debruyne, and D. O'Sullivan, "Understanding the position of
information professionals with regards to linked data: A survey of libraries,
archives and museums," in Proceedings of the 18th ACM/IEEE on Joint
Conference on Digital Libraries, JCDL 2018, Fort Worth, TX, USA, June 03-07,
2018, J. Chen et al., Eds. ACM, 2018, pp. 7-16.
2. T. D. Erickson, "Working with interface metaphors," in Readings in
Human-Computer Interaction. Elsevier, 1995, pp. 147-151.