ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation

This is a short remote presentation given to chemistry students at Drexel University, for the class of Jean-Claude Bradley.

Transcript

  • 1. ChemSpider and Traveling the Internet via Chemical Structures. Antony Williams, Drexel University, November 2012
  • 2. Compounds and Identifiers
  • 3. Chemistry on the Internet: Where do you source chemistry information? What can you trust online? How can you recognize potential issues? Cross-referencing and curating data.
  • 4. Molfiles (http://en.wikipedia.org/wiki/Chemical_table_file)
  • 5. Molfiles 10 9 0 0 1 0 0 0 0 0 1 V2000 31.2937 -9.0366 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 26.6526 -9.0366 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 31.2937 -7.7066 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 30.1161 -9.6877 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 25.5096 -9.6877 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 28.9731 -9.0366 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 27.8163 -9.7016 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 26.6664 -7.7066 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 32.4367 -9.6877 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 30.1161 -11.0177 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 3 1 2 0 0 0 0 4 1 1 0 0 0 0 9 1 1 0 0 0 0 7 2 1 0 0 0 0 5 2 2 0 0 0 0 8 2 1 0 0 0 0 6 4 1 0 0 0 0 4 10 1 6 0 0 0 7 6 1 0 0 0 0 M END
  • 6. Molfiles Molfiles are the primary exchange format between structure drawing packages Can be different between different drawing packages Most commonly carry X,Y coordinates for layout Can support polymers, organometallics, etc. Can carry 3D coordinates
  • 7. SMILES (http://en.wikipedia.org/wiki/SMILES) SMILES is a common format Can support polymers, organometallics, etc. Does NOT carry X,Y or Z coordinates for layout so requires layout algorithms – can be problematic! Generally different between drawing packages
  • 8. Stereo
  • 9. Tautomers
  • 10. SMILES ACD/Labs CC(C)CCC[C@@H](C)CCC[C@@H] (C)CCCC(C)=CCC2=C(C)C(=O)c1ccccc1C2=O OpenEye CC1=C(C(=O)c2ccccc2C1=O)C/C=C(C)/CCC[C @H](C)CCC[C@H](C)CCCC(C)C ChEMBL CC(C)CCC[C@@H](C)CCC[C@@H] (C)CCCC(=CCC1=C(C)C(=O)c2ccccc2C1=O)C
  • 11. The InChI Identifier
  • 12. InChI SINGLE code base managed by IUPAC – integrated into drawing packages. No variability as with SMILES InChI Strings can be reversed to structures – same problem as with SMILES – no layout Well adopted by the community (databases, publishers, blogs, Wikipedia) – good for searching the internet
  • 13. The InChI Standard
  • 14. Tautomers – “Mobile H Perception”
  • 15. Double Bond Orientation
  • 16. Stereo
  • 17. Checking for Stereochemistry
  • 18. Checking for Stereochemistry: Use your drawing package!
  • 19. Checking for Stereochemistry
  • 20. Checking for Stereochemistry
  • 21. Checking for Stereochemistry
  • 22. InChIKeys: Search the Web by Structure
  • 23. InChIs
  • 24. Databases and Standardization
  • 25. Databases and Standardization
  • 26. InChI No support for polymers, organometallics Many option settings can lead to variability and make integration across databases difficult – FixedH option especially problematic “Slight” chance of collisions of InChIKeys VERY USEFUL FOR INTEGRATING THE WEB
  • 27. Vancomycin
  • 28. Vancomycin Search: Molecular Search, Full Molecule, SKELETON
  • 29. Full Skeleton Search: 104 Hits
  • 30. Full Molecule Search: 4 Hits
  • 31. Where is chemistry online? Encyclopedic articles (Wikipedia); chemical vendor databases; metabolic pathway databases; property databases; patents with chemical structures; drug discovery data; scientific publications; compound aggregators; blogs/wikis and Open Notebook Science.
  • 32. www.chemspider.com
  • 33. How do we build it? We deal in Molfiles or SDF files, with coordinates. Valence checking, charge imbalance. We have our own “business logic” to standardize structures and use InChI to “aggregate tautomers” to one record. We link out to external sites using their IDs.
  • 34. Searches: The INTERNET. All ChemSpider and Internet searches are “simply algorithms”, but synonym searching is based on an assertion.
  • 35. Validated Names for Searching…
  • 36. Validating structures: Check for “full stereo”, and use stereo descriptors especially for checking! Check the quality of associated data sources. Check against reference literature when available, but it can be wrong. Question EVERYTHING!
  • 37. Contributing to The Quality of Data: What is the Structure of Vitamin K?
  • 38. Contributing to The Quality of Data: What is the Structure of Vitamin K? A lipid cofactor that is required for normal blood clotting. Several forms of vitamin K have been identified: VITAMIN K1 (phytomenadione) derived from plants, VITAMIN K2 (menaquinone) from bacteria & synthetic naphthoquinone provitamins, VITAMIN K3 (menadione).
  • 39. What is the Structure of Vitamin K1?
  • 40. CAS’s Common Chemistry
  • 41. Wikipedia
  • 42. Wolfram Alpha
  • 43. DailyMed
  • 44. ALL Different, ALL “Domoic Acids”
  • 45. Thank you. Email: williamsa@rsc.org; Twitter: ChemConnector; Blog: www.chemspider.com/blog; Personal Blog: www.chemconnector.com; SLIDES: www.slideshare.net/AntonyWilliams
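
Worked sketch for slide 5 (Molfiles). The connection table shown on slide 5 can be read with a few lines of Python. This is only a minimal sketch, not a full Molfile reader. It assumes whitespace-separated fields as they appear in the transcript, whereas real V2000 files are fixed-width and begin with three header lines that the slide omits. The names SLIDE_MOLFILE_BODY and parse_v2000_body are hypothetical and chosen for illustration.

```python
from collections import Counter

# Body of the Molfile from slide 5 (counts line, atom block, bond block).
# Assumption: fields are split on whitespace; real files are fixed-width.
SLIDE_MOLFILE_BODY = """\
10 9 0 0 1 0 0 0 0 0 1 V2000
31.2937 -9.0366 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
26.6526 -9.0366 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
31.2937 -7.7066 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
30.1161 -9.6877 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
25.5096 -9.6877 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
28.9731 -9.0366 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
27.8163 -9.7016 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
26.6664 -7.7066 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
32.4367 -9.6877 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
30.1161 -11.0177 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
3 1 2 0 0 0 0
4 1 1 0 0 0 0
9 1 1 0 0 0 0
7 2 1 0 0 0 0
5 2 2 0 0 0 0
8 2 1 0 0 0 0
6 4 1 0 0 0 0
4 10 1 6 0 0 0
7 6 1 0 0 0 0
M END
"""


def parse_v2000_body(text):
    """Return (atoms, bonds) from a counts line plus atom and bond blocks."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    counts = lines[0].split()
    n_atoms, n_bonds = int(counts[0]), int(counts[1])

    atoms = []
    for ln in lines[1:1 + n_atoms]:
        x, y, z, symbol = ln.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))

    bonds = []
    for ln in lines[1 + n_atoms:1 + n_atoms + n_bonds]:
        # first atom index, second atom index, bond order, stereo flag
        a, b, order, stereo = (int(v) for v in ln.split()[:4])
        bonds.append((a, b, order, stereo))
    return atoms, bonds


atoms, bonds = parse_v2000_body(SLIDE_MOLFILE_BODY)
element_counts = Counter(symbol for symbol, _, _, _ in atoms)
double_bonds = [b for b in bonds if b[2] == 2]
stereo_bonds = [b for b in bonds if b[3] != 0]
print(dict(element_counts), len(bonds), double_bonds, stereo_bonds)
```

For the slide's data this gives 10 atoms (5 C, 3 O, 2 N) and 9 bonds, two of them double bonds, plus one bond (atom 4 to atom 10) carrying stereo value 6, which marks a "down" wedge in the V2000 convention. Hydrogens are implicit, which is one reason valence checking matters when such files are loaded into a database.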
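
Worked sketch for slide 10 (SMILES). The three strings on slide 10 look different, so plain text comparison fails. A rough sanity check is to count heavy atoms per element in each string. The sketch below does this with a small regular expression. It is not canonicalization, and equal counts are necessary but not sufficient for two SMILES to describe the same structure; a cheminformatics toolkit would be needed for a real comparison. The function name heavy_atom_counts is hypothetical, and the strings are taken from the slide with the line-wrap spaces removed.

```python
import re
from collections import Counter

# Matches a bracket atom such as [C@@H], a two-letter organic-subset atom,
# a one-letter organic-subset atom, or an aromatic lowercase atom.
_ATOM = re.compile(
    r"\[\d*([A-Z][a-z]?|[a-z]{1,2})[^\]]*\]|(Cl|Br|[BCNOPSFI])|([bcnops])"
)


def heavy_atom_counts(smiles):
    """Count non-hydrogen atoms per element in a SMILES string."""
    counts = Counter()
    for match in _ATOM.finditer(smiles):
        symbol = (match.group(1) or match.group(2) or match.group(3)).capitalize()
        if symbol != "H":
            counts[symbol] += 1
    return counts


SLIDE_SMILES = {
    "ACD/Labs": "CC(C)CCC[C@@H](C)CCC[C@@H](C)CCCC(C)=CCC2=C(C)C(=O)c1ccccc1C2=O",
    "OpenEye": "CC1=C(C(=O)c2ccccc2C1=O)C/C=C(C)/CCC[C@H](C)CCC[C@H](C)CCCC(C)C",
    "ChEMBL": "CC(C)CCC[C@@H](C)CCC[C@@H](C)CCCC(=CCC1=C(C)C(=O)c2ccccc2C1=O)C",
}

results = {source: heavy_atom_counts(s) for source, s in SLIDE_SMILES.items()}
all_text_differs = len(set(SLIDE_SMILES.values())) == 3
for source, counts in results.items():
    print(source, dict(counts))
```

All three strings come out as 31 carbons and 2 oxygens even though no two strings are textually equal. Note that only the OpenEye string carries double-bond geometry markers (the "/" characters), so the three sources do not specify exactly the same amount of stereochemistry.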
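
Worked sketch for slides 28 to 30 (skeleton versus full molecule search). An InChIKey is 27 characters long: a 14-letter first block hashed from the connectivity, then a second block covering the remaining layers such as stereochemistry, followed by flag characters. Assuming that a "skeleton" search corresponds to matching the first block only and a "full molecule" search to matching the whole key (an interpretation of the slides, not something the transcript states), the difference can be sketched as below. The function names are hypothetical, and the two alanine keys are the values commonly listed in public databases, used here purely as illustrations.

```python
import re

# 14 letters, hyphen, 8 letters + standard flag (S or N) + version letter,
# hyphen, one protonation letter.
_INCHIKEY = re.compile(r"^([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])$")


def split_inchikey(key):
    """Split an InChIKey into its blocks; raise ValueError if malformed."""
    match = _INCHIKEY.match(key.strip())
    if match is None:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    skeleton, detail, standard, version, protonation = match.groups()
    return {
        "skeleton": skeleton,        # connectivity hash
        "detail": detail,            # stereo, isotopes and other layers
        "standard": standard == "S",
        "version": version,
        "protonation": protonation,
    }


def same_skeleton(key_a, key_b):
    """Skeleton-style match: only the first block has to agree."""
    return split_inchikey(key_a)["skeleton"] == split_inchikey(key_b)["skeleton"]


def same_molecule(key_a, key_b):
    """Full-molecule-style match: the whole key has to agree."""
    split_inchikey(key_a), split_inchikey(key_b)  # validate both keys
    return key_a.strip() == key_b.strip()


# Illustrative keys (assumed values for L- and D-alanine).
L_ALANINE = "QNAYBMKLOCPYGJ-REOHZSAFSA-N"
D_ALANINE = "QNAYBMKLOCPYGJ-UWTATZPHSA-N"

print(same_skeleton(L_ALANINE, D_ALANINE), same_molecule(L_ALANINE, D_ALANINE))
```

A skeleton-style match groups stereoisomers together and so returns more hits than a full match, which is consistent with the 104 versus 4 hits reported on slides 29 and 30.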
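
Worked sketch for slide 36 (checking for stereo). One quick way to see whether a record carries any stereochemistry is to look at the layers of its InChI string: /b holds double-bond geometry and /t, /m and /s hold tetrahedral stereo. The sketch below splits a standard InChI into layers and reports which stereo layers are present. It is an assumption-laden simplification: it only handles standard InChI without a fixed-H layer (where layer prefixes can repeat), and the presence of a /t layer does not prove that every stereocentre is defined. The function names are hypothetical, and the two alanine strings are used as illustrations.

```python
def inchi_layers(inchi):
    """Split a standard InChI string into a dict keyed by layer prefix."""
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    parts = inchi.split("/")
    layers = {"version": parts[0][len("InChI="):]}
    if len(parts) > 1:
        layers["formula"] = parts[1]  # the formula layer has no prefix letter
    for part in parts[2:]:
        if part:
            layers[part[0]] = part[1:]
    return layers


def stereo_summary(inchi):
    """Report which stereo layers are present in the InChI string."""
    layers = inchi_layers(inchi)
    return {
        "double_bond": "b" in layers,   # /b layer
        "tetrahedral": "t" in layers,   # /t layer
    }


# Illustrative strings: L-alanine, and alanine with no stereo specified.
L_ALANINE_INCHI = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
ALANINE_NO_STEREO = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)"

print(stereo_summary(L_ALANINE_INCHI))
print(stereo_summary(ALANINE_NO_STEREO))
```

A record whose InChI has no stereo layers, for a compound that is known to be chiral, is a candidate for the kind of manual checking against a drawing package and reference literature that the slides recommend.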
