Representation of chemical data in QSAR and Crystallography

My presentation at the Annual meeting NWO-CW section Analytical Chemistry, Lunteren, The Netherlands, 2005

  1. Egon Willighagen, Lunteren 2005: Representation of chemical data in QSAR and crystallography
  2. Computer representation of molecular structures
     • Spectra (NMR, IR, ...)
     • Connection Table
     • Schrödinger Equation
     • Dietz Representation
     • Molecular Invariants
     • Physical Properties
  3. Computer representations of molecular structures (same list as slide 2)
  4. Representing relations between descriptors
     • Descriptor ontology: explicit definition of descriptor types and descriptor properties
     • C. Steinbeck, C. Hoppe, S. Kuhn, M. Floris, R. Guha, E. L. Willighagen, Recent Developments of the Chemistry Development Kit (CDK) - An Open-Source Java Library for Chemo- and Bioinformatics, Current Pharmaceutical Design, accepted
  5. Representation does make a difference
     • Use of NMR spectra in Quantitative Structure-Activity Relationship (QSAR) modeling
     • Three representations: simulated 1H NMR and 13C NMR spectra, and theoretical descriptors
     • Three data sets:
       – water solubility of 431 compounds (WS)
       – boiling points of 277 compounds (BP)
       – LogP values of 154 compounds (LogP)
     • E. L. Willighagen, H. M. G. W. Denissen, R. Wehrens, L. M. C. Buydens, On the use of 1H and 13C NMR spectra as QSAR descriptors, submitted
  6. How the experiment is performed...
     • Partial Least Squares
       – 220 NMR bins
       – 220 randomly selected theoretical descriptors (Dragon)
       – five random divisions into training and test sets
       – number of latent variables chosen with leave-one-out cross-validation
  7. Number of latent variables
  8. 1H and 13C NMR versus Dragon descriptors
  9. Prediction errors
  10. Model interpretation?
  11. What can we conclude?
     • Representation has a large effect
     • 1H NMR models have no predictive power
     • 13C NMR models have some predictive power ... but no advantages
  12. Finding a representation for crystal structures
  13. Electronic Radial Distribution Function (ReDF)
     • Describes patterns in atom interactions in and around the unit cell
     • E. L. Willighagen, R. Wehrens, P. Verwer, R. de Gelder, L. M. C. Buydens, A Method for the Computational Comparison of Crystal Structures, Acta Cryst., 2005, B61, 29-36
  14. Quantifying Similarities
  15. Quantifying Similarities: ReDF 1 + ReDF 2 → Weighted Cross Correlation → Similarity [0,1]
  16. Cephalosporin crystal structures
  17. Polymorph Prediction
     • Polymorphs: different crystal structures for the same molecular compound
     • Polymorph prediction: computational method to predict the polymorphs given a molecular structure
  18. Polymorphic estrone crystal structures
     • Better trend in similarity going from identical to different structures (figure panels: ReDF + WCC; Cerius2)
  19. Conclusions
     • It is important to pick a proper representation
     • 1H and 13C NMR spectra are not good representations for QSAR models
     • A new crystal structure descriptor gives chemically more interpretable similarities
  20. Acknowledgments
     • Ron Wehrens, Lutgarde Buydens (supervisors)
     • René de Gelder, Paul Verwer (crystal structures)
     • Harm Denissen (QSAR)
     • Peter Murray-Rust (Cambridge University, UK)
     • Christoph Steinbeck (Cologne University, DE)
     • NWO (for financial support)
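
Slide 6 describes the validation protocol: five random divisions into training and test sets, with the model complexity (the number of latent variables) chosen by leave-one-out cross-validation on the training part. The sketch below illustrates only that protocol, in plain Python. It is not the code used for the talk. The function names, the 25% test fraction, and the random seed are assumptions, and a k-nearest-neighbour mean stands in for Partial Least Squares so the example needs no external libraries; `k` plays the role of the number of latent variables.

```python
import random


def random_splits(n_samples, n_splits=5, test_fraction=0.25, seed=0):
    """Yield (train_idx, test_idx) for repeated random divisions.

    test_fraction and seed are assumptions; the slides do not state them.
    """
    rng = random.Random(seed)
    n_test = max(1, int(round(n_samples * test_fraction)))
    for _ in range(n_splits):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield sorted(idx[n_test:]), sorted(idx[:n_test])


def loo_rmse(X, y, fit_predict, complexity):
    """Leave-one-out RMSE of a model at the given complexity."""
    squared = 0.0
    for i in range(len(X)):
        X_rest = X[:i] + X[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        prediction = fit_predict(X_rest, y_rest, X[i], complexity)
        squared += (prediction - y[i]) ** 2
    return (squared / len(X)) ** 0.5


def choose_complexity(X, y, fit_predict, candidates):
    """Pick the candidate complexity with the lowest leave-one-out RMSE."""
    return min(candidates, key=lambda c: loo_rmse(X, y, fit_predict, c))


def knn_mean(X_train, y_train, x, k):
    """Stand-in model (NOT PLS): mean response of the k nearest training rows."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row, x))
    order = sorted(range(len(X_train)), key=lambda i: sq_dist(X_train[i]))
    nearest = order[:k]
    return sum(y_train[i] for i in nearest) / len(nearest)


def evaluate(X, y, fit_predict, candidates, n_splits=5, seed=0):
    """Return one (chosen_complexity, test_rmse) pair per random division."""
    results = []
    for train, test in random_splits(len(X), n_splits, seed=seed):
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        best = choose_complexity(X_tr, y_tr, fit_predict, candidates)
        squared = sum((fit_predict(X_tr, y_tr, X[i], best) - y[i]) ** 2
                      for i in test)
        results.append((best, (squared / len(test)) ** 0.5))
    return results


# Toy data: one descriptor per compound, response proportional to it.
X_demo = [[float(i)] for i in range(20)]
y_demo = [2.0 * i for i in range(20)]
demo_results = evaluate(X_demo, y_demo, knn_mean, candidates=[1, 2, 3])
```

To follow the slides more closely, `knn_mean` would be swapped for a PLS fit-and-predict function with the same four arguments, and `candidates` would be the range of latent variables to try.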
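
Slide 15 turns two ReDF curves into a single similarity value through a weighted cross correlation. The following is a minimal sketch of one common form of that idea, assuming both curves are sampled on the same grid, a triangular weight over neighbouring shifts, and normalisation by the two weighted autocorrelations. The names `weighted_cross_term` and `wcc_similarity` and the default `width` are hypothetical; the exact weighting used in the talk is not given on the slides.

```python
import math


def weighted_cross_term(f, g, width):
    """Sum over shifts |s| <= width of w(s) * sum_i f[i] * g[i + s].

    w(s) = 1 - |s| / (width + 1) is a triangular weight (an assumption).
    """
    n = len(f)
    total = 0.0
    for shift in range(-width, width + 1):
        weight = 1.0 - abs(shift) / (width + 1)
        for i in range(n):
            j = i + shift
            if 0 <= j < n:
                total += weight * f[i] * g[j]
    return total


def wcc_similarity(f, g, width=3):
    """Weighted cross correlation of two equally sampled curves.

    Normalised by the weighted autocorrelations, so identical curves
    score 1; for non-negative curves the value lies in [0, 1].
    """
    if len(f) != len(g):
        raise ValueError("curves must be sampled on the same grid")
    numerator = weighted_cross_term(f, g, width)
    denominator = math.sqrt(weighted_cross_term(f, f, width)
                            * weighted_cross_term(g, g, width))
    return numerator / denominator if denominator > 0 else 0.0


# Three toy curves: a peak, the same peak shifted by one bin,
# and the same peak shifted far outside the weighting window.
peak = [0, 0, 1, 0, 0, 0, 0, 0]
near = [0, 0, 0, 1, 0, 0, 0, 0]
far = [0, 0, 0, 0, 0, 0, 0, 1]

same_score = wcc_similarity(peak, peak)   # 1.0
near_score = wcc_similarity(peak, near)   # 0.75
far_score = wcc_similarity(peak, far)     # 0.0
```

A plain point-by-point correlation would score the slightly shifted peak as 0; the weighting over neighbouring shifts is what makes the similarity fall off gradually.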
