This paper describes a proposed open semantic representation of chemical structure using JSON-LD (JSON for linked data) and an example of the semantic inferencing of a chemical structure concept (chirality).
1. Open Structures:
Current Issues & Future Plans
Open semantic chemical structures:
Ideas on the use of JSON-LD for
representation of chemical entities
Stuart J. Chalk, Department of Chemistry
University of North Florida
schalk@unf.edu
CINF Paper 6 – August 20, 2017
2. Outline
Inspiration
Chemical File Formats
YACFF
JavaScript Object Notation for Linked Data (JSON-LD)
Building on JSON-LD
Reasoning Chemical Concepts
Conclusion
3. International Chemical Identifier (InChI)
http://www.inchi-trust.org/
Chemical JSON from Open Chemistry
https://github.com/OpenChemistry/
Common Standard for eXchange (CSX)
https://doi.org/10.1021/acs.jpca.6b10489
Hastings J, Magka D, Batchelor C, et al. Structure-based classification and ontology in chemistry.
Journal of Cheminformatics. 2012;4:8.
https://doi.org/10.1186/1758-2946-4-8
Inspiration
4. Over 100 different file formats
http://openbabel.org/docs/2.3.0/FileFormats/Overview.html
Represent many different types of information
Atoms, bonds (special bond types), positions, valence,
charge, isotope, radical, lone pairs, aromaticity,
stereochemistry, groups, mixtures, reactions, Markush,
line notations, computational properties
Binary, plain text, XML – some open, most not
Chemical File Formats
6. No!
A chemical file framework
Built using JavaScript Object Notation for Linked Data (JSON-LD)
Open
Extensible
Designed for semantic chemical applications
JSON schema can be applied
Yet Another Chemical File Format
7. W3C Recommendation, January 2014
https://www.w3.org/TR/json-ld/
JSON representation of Resource Description
Framework (RDF) https://www.w3.org/RDF/
Representation of ‘things’ using ‘triples’
<subject> <predicate> <object>
JavaScript Object Notation
for Linked Data (JSON-LD)
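As an illustration of the subject, predicate, object pattern, here is a minimal Python sketch that holds one triple and writes it as an N-Triples line. The `Triple` class, the `to_ntriples` helper, and the `example.org` IRI are assumptions made for this example, not part of any proposed format.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # IRI of the 'thing' being described
    predicate: str  # IRI of the property
    obj: str        # literal value (RDF also allows an IRI here)


def to_ntriples(t: Triple) -> str:
    """Serialize one triple with a plain string literal as an N-Triples line."""
    escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{t.subject}> <{t.predicate}> "{escaped}" .'


# Hypothetical IRI for a benzene record; the predicate comes from schema.org.
benzene = Triple(
    "https://example.org/compound/benzene",
    "http://schema.org/name",
    "benzene",
)
line = to_ntriples(benzene)
print(line)
```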
8. JavaScript Object
Notation for Linked Data
(JSON-LD)
@context – aliases
@id – unique IDs
@type – data type
@base – define the type
of ‘thing’ that this file is
about
9. JavaScript Object Notation
for Linked Data (JSON-LD)
<http://schema.org/Person/stuchalk>
<http://purl.org/dc/terms/identifier>
"0000-0002-0703-7776"^^<http://www.w3.org/2001/XMLSchema#string> .
<http://schema.org/Person/stuchalk>
<http://schema.org/name>
"Stuart Chalk"^^<http://www.w3.org/2001/XMLSchema#string> .
<http://schema.org/Person/stuchalk>
<http://schema.org/worksFor>
"University of North Florida"^^<http://www.w3.org/2001/XMLSchema#string> .
<http://schema.org/Person/stuchalk>
<http://xmlns.com/foaf/0.1/mbox>
"schalk@unf.edu"^^<http://www.w3.org/2001/XMLSchema#string> .
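Triples like those above could be produced from a small JSON-LD document in which `@context` maps short keys to full IRIs and `@id` names the subject. The sketch below is a hand-rolled illustration in Python that only handles a flat document with string values; it is not the full JSON-LD expansion algorithm, and the `flat_triples` helper is an assumption made for this example.

```python
# A flat JSON-LD document: @context supplies the aliases, @id the subject.
doc = {
    "@context": {
        "identifier": "http://purl.org/dc/terms/identifier",
        "name": "http://schema.org/name",
        "worksFor": "http://schema.org/worksFor",
        "mbox": "http://xmlns.com/foaf/0.1/mbox",
    },
    "@id": "http://schema.org/Person/stuchalk",
    "identifier": "0000-0002-0703-7776",
    "name": "Stuart Chalk",
    "worksFor": "University of North Florida",
    "mbox": "schalk@unf.edu",
}

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def flat_triples(document):
    """Emit one N-Triples line per property of a flat, string-valued document."""
    context = document["@context"]
    subject = document["@id"]
    lines = []
    for key, value in document.items():
        if key.startswith("@"):
            continue  # keywords such as @context and @id are not properties
        predicate = context[key]  # resolve the alias to its full IRI
        lines.append(f'<{subject}> <{predicate}> "{value}"^^<{XSD_STRING}> .')
    return lines


triples = flat_triples(doc)
print("\n".join(triples))
```

In practice a JSON-LD processor would perform this expansion; the point here is only to show how aliases in `@context` become full predicate IRIs.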
10. Use layers for data
Create framework using @context
Identify the type of ‘chemical thing’ described
Include ‘table of contents’ for discovery
Define ‘required’ layers using JSON schema
JSON-LD for Chemical Structure
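The layered approach can be sketched in Python as follows. The layer names (`atoms`, `bonds`), the `toc` key, the `example.org` vocabulary, and both helper functions are hypothetical, chosen only for illustration. The check mimics the JSON Schema `required` keyword by hand, so that the example needs no third-party validator.

```python
# Hypothetical layered structure file (methanol, heavy atoms only).
structure = {
    "@context": {"@vocab": "https://example.org/chem#"},
    "@id": "https://example.org/structure/methanol",
    "@type": "Molecule",          # the type of 'chemical thing' described
    "toc": ["atoms", "bonds"],    # table of contents for discovery
    "atoms": [
        {"@id": "#a1", "element": "C"},
        {"@id": "#a2", "element": "O"},
    ],
    "bonds": [
        {"@id": "#b1", "from": {"@id": "#a1"}, "to": {"@id": "#a2"}, "order": 1},
    ],
}

# JSON Schema fragment; only the "required" keyword is exercised here.
schema = {
    "type": "object",
    "required": ["@context", "@id", "@type", "toc", "atoms"],
}


def missing_layers(document, schema):
    """Return the required keys that are absent from the document."""
    return [key for key in schema.get("required", []) if key not in document]


def toc_is_consistent(document):
    """Check that every layer named in the table of contents is present."""
    return all(layer in document for layer in document.get("toc", []))


print(missing_layers(structure, schema), toc_is_consistent(structure))
```

A full JSON Schema validator would also check value types and nested layers; this sketch covers only the presence of required layers.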
22. The JSON-LD file is RDF
Convert the files to RDF triples…
…store in a graph database
Use Semantic Web languages to do more with
data, e.g. use the Web Ontology Language
https://www.w3.org/OWL/
Building on the JSON-LD
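To make the store-and-query idea concrete, here is a toy in-memory triple store in Python with wildcard pattern matching. It is a stand-in for a real graph database, not a recommendation of one. The `TripleStore` class, the `example.org` vocabulary, and the atom identifiers are assumptions made for this sketch.

```python
class TripleStore:
    """A toy in-memory stand-in for a graph database."""

    def __init__(self):
        self._triples = set()

    def add(self, s, p, o):
        self._triples.add((s, p, o))

    def match(self, s=None, p=None, o=None):
        """Return matching triples in sorted order; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


EX = "https://example.org/chem#"  # hypothetical vocabulary
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

store = TripleStore()
store.add("#a1", RDF_TYPE, EX + "Atom")
store.add("#a1", EX + "element", "C")
store.add("#a2", RDF_TYPE, EX + "Atom")
store.add("#a2", EX + "element", "O")

# Which subjects are carbon atoms?
carbons = [s for s, _, _ in store.match(p=EX + "element", o="C")]
print(carbons)
```

A production system would use a graph database with a query language such as SPARQL; the pattern-matching call above plays the role of a single triple pattern in such a query.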
28. A semantic framework for storing chemical entity data
has great potential to improve interoperability
Can be extended as far as chemists can imagine
Can be used to infer the presence of chemical concepts
May require the use of more sophisticated semantic
languages – e.g. Shapes Constraint Language (SHACL)
https://www.w3.org/TR/shacl/
Conclusions
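As a rough illustration of inferring a chemical concept from triples, the Python sketch below flags a carbon atom as a stereocenter candidate when it has four neighbours with four different element symbols. This is a deliberately crude rule written for this example: real chirality perception compares whole substituents rather than neighbouring elements, and this is not presented as the author's reasoning method. The vocabulary, the `bondedTo` predicate, and the helper names are all hypothetical.

```python
EX = "https://example.org/chem#"  # hypothetical vocabulary
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def infer_stereocenter_candidates(triples):
    """Return new rdf:type triples for carbons with four distinct neighbour elements."""
    element = {s: o for s, p, o in triples if p == EX + "element"}
    neighbours = {}
    for s, p, o in triples:
        if p == EX + "bondedTo":  # treat bonds as undirected
            neighbours.setdefault(s, set()).add(o)
            neighbours.setdefault(o, set()).add(s)
    inferred = set()
    for atom, bonded in neighbours.items():
        symbols = {element.get(n) for n in bonded}
        if element.get(atom) == "C" and len(bonded) == 4 and len(symbols) == 4:
            inferred.add((atom, RDF_TYPE, EX + "StereocenterCandidate"))
    return inferred


def molecule(prefix, substituents):
    """Build triples for one central carbon bonded to the given atoms."""
    centre = f"{prefix}c"
    triples = {(centre, EX + "element", "C")}
    for i, symbol in enumerate(substituents):
        atom = f"{prefix}x{i}"
        triples.add((atom, EX + "element", symbol))
        triples.add((centre, EX + "bondedTo", atom))
    return triples


chiral = molecule("#m1-", ["H", "F", "Cl", "Br"])    # bromochlorofluoromethane
achiral = molecule("#m2-", ["H", "H", "Cl", "Cl"])   # dichloromethane
inferred = infer_stereocenter_candidates(chiral | achiral)
print(inferred)
```

In a semantic stack, a rule of this kind would more likely be expressed in OWL or SHACL and evaluated by a reasoner or validator rather than in application code.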