The past few years have seen a tremendous leap forward in public compound databases. Both PubChem and ChemSpider have sent a clear message: the chemical sciences can only move forward if we can search existing chemistry. However, the exact Open nature of a “public” database is not always crystal clear. PubChem is mostly public domain but also contains proprietary content, while ChemSpider is mostly proprietary but has Open Data content. Neither is clear about how the Open Data parts of its database can be used, modified, and redistributed, the three cornerstones of Open Science.
We will demonstrate, based on previous work on http://rdf.openmolecules.net/, an architecture in which semantic web technologies, the InChI, and Open Source cheminformatics tools are used to create a Panton Principles-compliant compound database to support the next generation of public databases. Standards proposed in the Open PHACTS community will be used to specify links between this new resource and other databases, and to provide compound properties. All of this data will be available with provenance describing its origin, as separate downloadable files, and annotated with ontologies to provide explicit meaning. Using ontologies like ChEBI and CHEMINF, applications in the areas of metabolomics and toxicology will be presented.
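
To make the architecture more concrete, the following is a minimal sketch of how a compound identified by its InChI could be exposed as a few RDF statements, each written as one N-Triples line. It is an illustration under stated assumptions, not the implementation behind the demo: the base URI, the link target, and the source URI are hypothetical placeholders, and the choice of `skos:exactMatch` for database links and `dcterms:source` for provenance is assumed here rather than taken from the Open PHACTS specifications. Attaching the source directly to the compound is also a simplification; provenance per statement would need named graphs or a similar mechanism.

```python
from urllib.parse import quote

# Hypothetical base URI for compound resources (an assumption for this sketch).
BASE = "http://example.org/compound/"

# Predicates from existing vocabularies; using them here is an assumption.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"
DCTERMS_SOURCE = "http://purl.org/dc/terms/source"


def compound_uri(inchi: str) -> str:
    """Derive a resource URI from an InChI by percent-encoding it."""
    return BASE + quote(inchi, safe="")


def triple(subject: str, predicate: str, obj: str, literal: bool = False) -> str:
    """Serialize one statement as an N-Triples line."""
    if literal:
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        rendered = f'"{escaped}"'
    else:
        rendered = f"<{obj}>"
    return f"<{subject}> <{predicate}> {rendered} ."


def describe(inchi: str, label: str, link: str, source: str) -> list[str]:
    """Return a label, a cross-database link, and a provenance statement."""
    subject = compound_uri(inchi)
    return [
        triple(subject, RDFS_LABEL, label, literal=True),
        triple(subject, SKOS_EXACT_MATCH, link),
        triple(subject, DCTERMS_SOURCE, source),
    ]


if __name__ == "__main__":
    lines = describe(
        "InChI=1S/CH4/h1H4",                      # InChI of methane
        "methane",
        "http://example.org/other-db/methane",     # placeholder link target
        "http://example.org/sources/input-file",   # placeholder data source
    )
    print("\n".join(lines))
```

Because the URI is derived from the InChI alone, independent data files that describe the same structure end up using the same subject, which is what allows them to be distributed as separate downloads and merged later.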