How can the international chemical identifier (InChI) be extended to non …
Published in: Technology, Business
Transcript

  • 1. How can the International Chemical Identifier (InChI) be extended to non-trivial chemicals? V. Tkachenko, A.J. Williams, Y. Borodina, F. Switzer, T. Peryea, L. Callahan. ACS Philly, August 2012
  • 2. What is InChI
  • 3. InChI Examples
      • ethanol (CH3CH2OH): InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3
      • L-ascorbic acid: InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2,5,7-8,10-11H,1H2/t2-,5+/m0/s1
  • 4. InChI Structure
  • 5. InChIKey The condensed, 27 character standard InChIKey is a hashed version of the full standard InChI (using the SHA-256 algorithm) Designed to allow for easy web searches of chemical compounds InChIKeys consist of  14 characters resulting from a hash of the connectivity information of the InChI  followed by 9 characters resulting from a hash of the remaining layers of the InChI  followed by a single character indication the version of InChI used  followed by single checksum character InChI=1S/C17H19NO3/c1-18-7-6-17-10-3-5-13(20)16(17)21-15-12(19)4-2-9(14(15)17)8-11(10)18/h2-5,10- 11,13,16,19-20H,6-8H2,1H3/t10-,11+,13-,16-,17-/m0/s1 BQJCRHHNABKAKU-KBQPJGBKSA-N Unlike InChI, InChIKey  CT only by lookup
  • 6. Proliferation of InChI
  • 7. Search by InChI
  • 8. ChemSpider Google Search: http://www.chemspider.com/google/
  • 9. What’s the catch? InChI has limitations. InChI is ideal for simple, static, well-defined graphs. Real chemical substances can only be approximated by such graphs.
  • 10. Limitations: non-trivial stereo (e.g. axial, planar); non-trivial tautomers (e.g. ring-chain); mixtures (full stereo is rarely known); polymers; Markush structures; organometallics; inorganics; materials; reactions; etc.
  • 11. Chemical data complexity
  • 12. Work in progress. InChI Extensions: under the guidance of IUPAC, several sub-teams are now working on expanding InChI to new areas of chemical representation:
      • Reaction InChI (RInChI): the reaction working group has completed its recommendations, and work is ready to begin.
      • Polymers/Mixtures: the polymers/mixtures working group has also submitted its recommendations, and work to incorporate the new representations should begin once version 1.04 is released.
      • Markush: this project is the most complex undertaken to date. The initial recommendations have been submitted, but financing of the work still needs to be sorted out.
    But what do we do NOW???
  • 13. Deposition Process (diagram labels): data, validation, standardization, filtering, componentization, deduplication, mapping, non-redundant data
  • 14. ChemSpider Data Model
  • 15. Organometallics
  • 16. Mixtures or unknown stereo
  • 17. Accelrys Enhanced Stereo
  • 18. MOL V3000
  • 19. Enhanced stereo and InChI… Unfortunately not supported. Is it important? Now real-world examples…
  • 20. FDA Substance Registration System
  • 21. Stoichiometric and non-stoichiometric mixtures (figure labels: Substance; Moiety 1; Moiety 2)
  • 22. (Figure labels: Substance; Moiety 1; Moiety 2; Moiety 3; Moiety 4)
  • 23. (Figure labels: Substance; Moiety 1; Moiety 2 (undefined))
  • 24. (Figure labels: Moiety 1; Substance; (A); Moiety 2; (B))
  • 25. D-glucose
  • 26. SRS standardization approach (diagram labels): substance description; standardization module; moieties generator; normalization; InChI[Key] generator; hash function f(InChIKeys, moieties); unique ID; standard description
  • 27. SRS TBD: Markush, polymers, proteins, inorganics, materials
  • 28. OpenPHACTS: Open PHACTS is a 3-year Innovative Medicines Initiative (IMI) project to reduce the barriers to drug discovery in industry, academia and small businesses, and to build an open platform integrating chemistry and biology data from public domain resources. Semantic web platform. Open Standards, Open Data and Open Source.
  • 29. OpenPHACTS specifics: active/inactive ingredient; parent/child; sample/substance; misreferences (!!!)
  • 30. ChemSpider Reactions
  • 31. ChemSpider Reaction Challenges: deduplication, identification, deposition
  • 32. Conclusions: InChI is The Identifier. InChI has its limitations. InChI is a work in progress. InChI deficiencies can be hot-fixed.
  • 33. Acknowledgements: RSC Cheminformatics group; FDA SRS group; OpenPHACTS consortium. Software: InChI, GGA Software
  • 34. Thank you. Email: tkachenkov@rsc.org; Blog: www.chemspider.com/blog; Slides: http://www.slideshare.net/valerytkachenko16
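
The InChI strings on slides 3 and 5 are layered: a version, a formula, and then slash-separated layers that each start with a one-letter prefix. The sketch below only splits such strings (and an InChIKey into its hyphen-separated blocks) so the layers are easier to see. It is an illustration, not the InChI software: the function names and the `LAYER_NAMES` table are assumptions made for this example, the table covers only the prefixes that appear in the slides, and the code assumes a formula layer is present.

```python
# Illustrative sketch only: plain string splitting, no chemistry validation
# and no call into the official InChI library.

# One-letter layer prefixes that occur in the slide examples
# (a subset of what a real InChI can carry).
LAYER_NAMES = {
    "c": "connectivity",
    "h": "hydrogens",
    "t": "tetrahedral_stereo",
    "m": "stereo_inversion_flag",
    "s": "stereo_type",
}


def split_inchi(inchi: str) -> dict:
    """Split an InChI string into version, formula, and prefixed layers."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    # Assumption: the second field is the formula layer (true for the
    # slide examples, not for every possible InChI).
    version, formula, *layers = inchi[len(prefix):].split("/")
    result = {"version": version, "formula": formula}
    for layer in layers:
        # Unknown prefixes are kept under their raw letter.
        name = LAYER_NAMES.get(layer[0], layer[0])
        result[name] = layer[1:]
    return result


def split_inchikey(key: str) -> list:
    """Split a 27-character InChIKey into its three hyphen-separated blocks."""
    blocks = key.split("-")
    if len(key) != 27 or [len(b) for b in blocks] != [14, 10, 1]:
        raise ValueError("unexpected InChIKey shape")
    return blocks


if __name__ == "__main__":
    print(split_inchi("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"))
    print(split_inchikey("BQJCRHHNABKAKU-KBQPJGBKSA-N"))
```

Splitting a key this way recovers only its blocks; as slide 5 notes, getting back to a structure from an InChIKey still requires a lookup.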

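Slide 26 names a hash function f(InChIKeys, moieties) that produces a unique ID, but the transcript does not say how it is built. One way such a function could look is sketched below. Everything here is an assumption for illustration and not the FDA SRS implementation: the function name `substance_id`, the representation of a moiety as an (InChIKey, amount) pair, the `|` and newline separators, and the choice of SHA-256 for the combined hash.

```python
import hashlib


def substance_id(moieties) -> str:
    """Derive an order-independent ID from (inchikey, amount) pairs.

    Hypothetical scheme: each moiety becomes "KEY|AMOUNT", the records are
    sorted so input order does not matter, and the joined text is hashed.
    """
    records = sorted(f"{key}|{amount}" for key, amount in moieties)
    payload = "\n".join(records).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    # The first key is the example from slide 5; the second is a made-up
    # placeholder, not a real compound.
    mixture = [
        ("BQJCRHHNABKAKU-KBQPJGBKSA-N", "1"),
        ("AAAAAAAAAAAAAA-BBBBBBBBBB-N", "unknown"),
    ]
    print(substance_id(mixture))
```

Sorting before hashing means the same set of moieties gives the same ID regardless of the order they were entered in, while a change in any amount (for example a non-stoichiometric "unknown") gives a different ID.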