How can the International Chemical Identifier (InChI) be extended to non-trivial chemicals?
V. Tkachenko, A.J. Williams, Y. Borodina, F. Switzer, T. Peryea, L. Callahan
ACS Philly, August 2012
InChIKey
The condensed, 27-character standard InChIKey is a hashed version of the full standard InChI (using the SHA-256 algorithm).
Designed to allow easy web searches for chemical compounds.
An InChIKey consists of:
14 characters resulting from a hash of the connectivity information of the InChI,
a hyphen, then 8 characters resulting from a hash of the remaining layers of the InChI,
a single character flagging a standard (S) or non-standard (N) InChI,
a single character indicating the version of InChI used (A for version 1),
a hyphen, then a single character indicating protonation (N for neutral).
Example:
InChI=1S/C17H19NO3/c1-18-7-6-17-10-3-5-13(20)16(17)21-15-12(19)4-2-9(14(15)17)8-11(10)18/h2-5,10-11,13,16,19-20H,6-8H2,1H3/t10-,11+,13-,16-,17-/m0/s1
BQJCRHHNABKAKU-KBQPJGBKSA-N
Unlike InChI, an InChIKey can be converted back to a connection table (CT) only by lookup.
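As a rough illustration of the key layout and of the "only by lookup" point, the sketch below splits an InChIKey into its blocks and resolves it against a small table. It assumes the layout of the standard InChIKey as published by the InChI Trust (14 letters, hyphen, 8 letters plus a flag letter and a version letter, hyphen, 1 letter). The names `split_inchikey` and `KNOWN_STRUCTURES` are hypothetical and not part of any InChI software.

```python
import re

# Assumed layout: 14 letters, hyphen, 8 letters + flag + version, hyphen, 1 letter.
_INCHIKEY_RE = re.compile(r"^([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])$")


def split_inchikey(key):
    """Split an InChIKey into its blocks; raise ValueError if malformed."""
    match = _INCHIKEY_RE.match(key.strip())
    if match is None:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    connectivity, other_layers, flag, version, protonation = match.groups()
    return {
        "connectivity_hash": connectivity,   # hash of the connectivity layers
        "other_layers_hash": other_layers,   # hash of the remaining layers
        "is_standard": flag == "S",          # S = standard, N = non-standard
        "version_char": version,
        "protonation_char": protonation,
    }


# Hypothetical lookup table: a hash cannot be reversed, so getting back to a
# structure means finding the key in a database that stores the full InChI.
KNOWN_STRUCTURES = {
    "BQJCRHHNABKAKU-KBQPJGBKSA-N": (
        "InChI=1S/C17H19NO3/c1-18-7-6-17-10-3-5-13(20)16(17)21-15-12(19)"
        "4-2-9(14(15)17)8-11(10)18/h2-5,10-11,13,16,19-20H,6-8H2,1H3/"
        "t10-,11+,13-,16-,17-/m0/s1"
    ),
}

parts = split_inchikey("BQJCRHHNABKAKU-KBQPJGBKSA-N")
full_inchi = KNOWN_STRUCTURES.get("BQJCRHHNABKAKU-KBQPJGBKSA-N")
```

Two keys that share the first 14-character block agree on connectivity but may differ in stereochemistry or other layers, which is why the first block alone is sometimes used for looser matching.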
Work in progress
InChI Extensions: Under the guidance of IUPAC, several sub-teams are now working on expanding InChI to new areas of chemical representation:
Reaction InChI (RInChI): The reaction working group has completed its recommendations, and work is ready to begin.
Polymers/Mixtures: The polymers/mixtures working group has also submitted its recommendations, and work to incorporate the new representations should begin once version 1.04 is released.
Markush: This project is the most complex undertaken to date. The initial recommendations have been submitted, but financing of the work still needs to be sorted out.
But what do we do NOW???
Deposition Process
[Diagram. Labels shown: Data, Validation, Standardization, Filtering, Componentization, Deduplication, Mapping, Non-redundant data]
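The validation and deduplication steps named on this slide could be pictured as follows. This is only a sketch: the slide does not say how deduplication is keyed, so grouping by InChIKey is an assumption here, and `deduplicate` and the `src-*` record IDs are hypothetical.

```python
import re
from collections import defaultdict

# Shape check only: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def deduplicate(records):
    """Group deposited (record_id, inchikey) pairs by InChIKey.

    Returns (unique, rejected): unique maps each well-formed InChIKey to the
    source record IDs that share it; rejected lists IDs with malformed keys.
    """
    unique = defaultdict(list)
    rejected = []
    for record_id, inchikey in records:
        key = inchikey.strip().upper()      # trivial normalization
        if INCHIKEY_RE.match(key):          # validation
            unique[key].append(record_id)   # deduplication + mapping
        else:
            rejected.append(record_id)
    return dict(unique), rejected


# Hypothetical deposited records.
records = [
    ("src-1", "BQJCRHHNABKAKU-KBQPJGBKSA-N"),
    ("src-2", " bqjcrhhnabkaku-kbqpjgbksa-n "),
    ("src-3", "not-a-key"),
]
unique, rejected = deduplicate(records)
```

Here the first two records collapse into one non-redundant entry that keeps a mapping back to both source IDs, and the third is set aside for review.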
Open PHACTS
Open PHACTS is an Innovative Medicines Initiative (IMI) project running for 3 years.
To reduce the barriers to drug discovery in industry, academia and small businesses.
To build an open platform, integrating chemistry and biology data from public domain resources.
Semantic web platform.
Open Standards, Open Data and Open Source.