Docking Pose Assessment: The importance of keeping your GARD up

Presentation given at a regional Schrödinger UGM in 2009, describing some of our work on docking pose assessment.

Published in: Education, Technology

  1. Docking Pose Assessment: The importance of keeping your GARD up
     David C. Thompson, J. Christian Baber [a], Jason B. Cross [b, c]
     [a] Wyeth Research, Chemical Sciences, Cambridge, MA
     [b] Wyeth Research, Chemical Sciences, Collegeville, PA
     [c] Cubist Pharmaceuticals, Inc., Lexington, MA
  2. The Why
     • Large-scale docking evaluation study [1]
       - Glide, DOCK6, PhDOCK, SurFlex, FlexX, and ICM
       - Cognate ligand docking
       - Virtual screening
     • Project aims:
       - Assess our computational needs: do we have the right tools for the job?
       - Assess and revise best practices
     [1] J. B. Cross et al., J. Chem. Inf. Model. (in press)
  3. Measures of Accuracy: RMSD
     How do we assess a docking program’s ability to regenerate a known binding mode?

     Pose #   Score    RMSD
     1        -72.0    1.9    (top-scoring pose)
     2        -56.0    2.3
     3        -24.0    1.8    (best RMSD)
     4        -9.00    2.7
     …        …        …

     • We dock the native ligand back into the protein
     • We look at the RMSD of the top-scoring pose
     • We look at the best RMSD of all the poses
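
The two measures on this slide can be sketched in a few lines of Python. This is only an illustration using the four rows visible in the pose table (the table is truncated on the slide); it assumes that a more negative score is the better score, which is how the slide labels pose 1 as top scoring.

```python
# Rows visible in the slide's pose table: (pose number, docking score, RMSD).
# The slide's table continues past row 4; only the visible rows are used here.
poses = [
    (1, -72.0, 1.9),
    (2, -56.0, 2.3),
    (3, -24.0, 1.8),
    (4, -9.00, 2.7),
]

# Assumption: lower (more negative) score means a better-ranked pose.
top_scoring = min(poses, key=lambda p: p[1])

# The best-RMSD pose is the one geometrically closest to the crystal ligand.
best_rmsd = min(poses, key=lambda p: p[2])

print(f"Top-scoring pose: #{top_scoring[0]} with RMSD {top_scoring[2]}")
print(f"Best-RMSD pose:   #{best_rmsd[0]} with RMSD {best_rmsd[2]}")
```

Here the top-scoring pose (pose 1) and the best-RMSD pose (pose 3) differ, which is why both numbers are reported.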
  4. Comparing docking programs is difficult … [2]
     • RMSD, and statistics derived from RMSD, are used heavily in comparing docking programs
     • This is fine, as RMSD works a lot of the time; however, there are some issues
       - Not bounded (how big is too big?)
       - Large RMSDs can dominate aggregate statistics
       - RMSD is chemically ambivalent (every atom counts equally, whatever its type)
     • We may be losing useful information
     [2] J. C. Cole et al., Proteins, 60, 325 (2005)
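
To make the point about aggregate statistics concrete, here is a small sketch. The RMSD values are hypothetical and chosen purely for illustration; they are not taken from the docking evaluation study.

```python
from statistics import mean, median

# Hypothetical per-complex RMSDs in Å (illustrative only, not study data):
# four good poses and one pose docked far from the binding site.
rmsds = [0.8, 1.1, 0.9, 1.3, 25.0]

mean_all = mean(rmsds)             # pulled upward by the single 25 Å pose
mean_without = mean(rmsds[:-1])    # the same set with that pose left out
median_all = median(rmsds)         # barely affected by the outlier

print(f"mean = {mean_all:.2f} Å, mean without outlier = {mean_without:.3f} Å, "
      f"median = {median_all:.1f} Å")
```

One unbounded value moves the mean from about 1 Å to almost 6 Å, even though four of the five poses are good. A measure bounded on a fixed interval cannot be skewed in this way.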
  5. What has come before
     • These observations on RMSD are not new
     • Relative Displacement Error (RDE) [3]
       - Statistics compiled using the RDE measure are less dominated by very bad docking poses
       - Would still miss poses that contain correct binding modes
     • Interaction-Based Accuracy Classification (IBAC) [4]
       - Would not miss poses that have a correct binding mode
       - Highly subjective, not easily automated
     • Real-space R-factor (RSR) [5]
       - Inclusion of experimental information
       - Unbounded (how big is too big?)
     • All of these methods address some of the issues associated with RMSD, but not in one single measure
     • RMSTanimoto [6]
     [3] R. A. Abagyan et al., J. Mol. Biol., 268, 678 (1997)
     [4] R. T. Kroemer et al., J. Chem. Inf. Comput. Sci., 44, 871 (2004)
     [5] D. Yusuf et al., J. Chem. Inf. Model., 48, 1411 (2008)
     [6] OpenEye Scientific Software, Santa Fe, NM
  6. The Why: A Recap
     RMSD works a lot of the time, so we need a function that preserves this feature, but that also accounts for those difficult cases where useful information may be lost.
     We would also like:
     • To avoid the skewing problem associated with large RMSDs
     • To have an objective measure
     • An element of chemical awareness
  7. The How
     • A Generally Applicable Replacement for RMSD: GARD [7]
     • GARD is a metric for analyzing docking poses
     • It is bounded on [0, 1] to remove arbitrary cutoffs which distort average measures
     • It is based on an analysis performed by P. R. Andrews et al. [8]*
       - Regression analysis of the binding constants and structural components of 200 drugs and enzyme inhibitors
     • Automated, and no more expensive than RMSD
     [7] Submitted, J. Chem. Inf. Model.
     [8] P. R. Andrews et al., J. Med. Chem., 27, 1648 (1984)
     * Yes, we know that this is an old study . . .
  8. GARD: The Algorithm
     • For each atom, compute an atomic RMSD (d_i)
     • Use the Andrews weight corresponding to the atom type (w_i)
     • Define a ‘good’ and a ‘bad’ RMSD: d_min and d_max
       - d_min = 1 Å
       - d_max = 2.5 Å

     $$\mathrm{GARD} = \frac{\sum_i \delta_i w_i}{\sum_i w_i}$$

     $$\delta_i = \begin{cases} 1 & d_i \le d_{\min} \\ 1 - \dfrac{d_i - d_{\min}}{d_{\max} - d_{\min}} & d_{\min} \le d_i \le d_{\max} \\ 0 & d_i \ge d_{\max} \end{cases}$$

     Figure: reference structure (cyan) and docking pose (tan); RMSD = 1.38 Å, GARD = 0.90. One atom is labeled with an atomic RMSD of 3.68 Å.
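
The algorithm on this slide can be sketched as a short Python function. This is a minimal sketch, not the authors' implementation. The function names `delta` and `gard` are made up for illustration, and the middle branch is assumed to be a linear ramp from 1 at d_min down to 0 at d_max, which is the reading that agrees with the two end cases given on the slide.

```python
from typing import Sequence

D_MIN = 1.0  # Å, 'good' per-atom deviation (from the slide)
D_MAX = 2.5  # Å, 'bad' per-atom deviation (from the slide)


def delta(d: float, d_min: float = D_MIN, d_max: float = D_MAX) -> float:
    """Per-atom score: 1 for a well-placed atom, 0 for a badly placed one.

    Assumption: linear ramp between d_min and d_max, so that the function
    is continuous with the slide's end cases (1 below d_min, 0 above d_max).
    """
    if d <= d_min:
        return 1.0
    if d >= d_max:
        return 0.0
    return 1.0 - (d - d_min) / (d_max - d_min)


def gard(deviations: Sequence[float], weights: Sequence[float],
         d_min: float = D_MIN, d_max: float = D_MAX) -> float:
    """Weighted average of per-atom scores; always lies in [0, 1]."""
    if len(deviations) != len(weights):
        raise ValueError("need one weight per atom")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    scored = sum(delta(d, d_min, d_max) * w
                 for d, w in zip(deviations, weights))
    return scored / total


# Two equally weighted atoms, one well placed and one badly placed.
print(gard([0.5, 3.0], [1.0, 1.0]))
```

Because every per-atom score is clamped to [0, 1] before averaging, a single badly placed atom can cost at most its own weight, however far away it is.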
  9. GARD: Worked Example

     d_i (Å)   Atom type   w_i    δ_i w_i
     0.28      C (sp3)     0.8    0.8
     0.48      C (sp3)     0.8    0.8
     0.69      N           1.2    1.2
     0.60      C (sp3)     0.8    0.8
     0.36      C (sp3)     0.8    0.8
     0.96      C (sp2)     0.7    0.7
     0.96      N           1.2    1.2
     3.68      C (sp3)     0.8    0
     0.60      C (sp3)     0.8    0.8
     Sum                   7.9    7.1

     GARD = 7.1/7.9 = 0.90
     Figure: reference structure (cyan) and docking pose (tan); RMSD = 1.38 Å, GARD = 0.90.
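
The worked example can be checked with a short script. The deviations and weights below are copied from the table on this slide. The RMSD is computed here as the root mean square of the listed per-atom deviations, which is an assumption about how the slide's value was obtained. The linear ramp in `delta` is likewise an assumed reading, although no atom in this example falls between d_min and d_max.

```python
import math

# (d_i in Å, atom type, Andrews weight w_i), copied from the slide's table.
atoms = [
    (0.28, "C (sp3)", 0.8),
    (0.48, "C (sp3)", 0.8),
    (0.69, "N", 1.2),
    (0.60, "C (sp3)", 0.8),
    (0.36, "C (sp3)", 0.8),
    (0.96, "C (sp2)", 0.7),
    (0.96, "N", 1.2),
    (3.68, "C (sp3)", 0.8),
    (0.60, "C (sp3)", 0.8),
]

D_MIN, D_MAX = 1.0, 2.5  # Å, from the algorithm slide


def delta(d: float) -> float:
    """Per-atom score; the linear ramp between D_MIN and D_MAX is assumed."""
    if d <= D_MIN:
        return 1.0
    if d >= D_MAX:
        return 0.0
    return 1.0 - (d - D_MIN) / (D_MAX - D_MIN)


weighted_sum = sum(delta(d) * w for d, _, w in atoms)  # column sum of δ_i w_i
weight_sum = sum(w for _, _, w in atoms)               # column sum of w_i
gard_score = weighted_sum / weight_sum

# Root mean square of the per-atom deviations.
rmsd = math.sqrt(sum(d * d for d, _, _ in atoms) / len(atoms))

print(f"GARD = {gard_score:.2f}, RMSD = {rmsd:.2f} Å")
```

The single atom at 3.68 Å contributes most of the squared deviation behind the RMSD, but it costs GARD only its own weight of 0.8 out of 7.9.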
  10. Comparing docking programs is difficult … but we do it anyway
      “Cognate ligand docking to 68 diverse, high-resolution X-ray complexes revealed that ICM, GLIDE, and Surflex generated ligand poses close to the X-ray conformation more often than the other docking programs. GLIDE and Surflex also outperformed the other docking programs when used for virtual screening, based on mean ROC AUC and ROC enrichment . . .” [1]
      Protocol:
      1. Initial ligand coordinates used as input for the docking were generated using CORINA [9]
      2. The 10 top-scoring poses (or fewer, depending on the specific output for a particular X-ray complex/docking program combination) were retained for analysis
      3. These poses were then evaluated using both the GARD and RMSD measures
      [1] J. B. Cross et al., J. Chem. Inf. Model. (in press)
      [9] CORINA v1.82, Molecular Networks GmbH: Erlangen, Germany, 1997
      The What
  11. The What
      Figure: scatter plot of RMSD (y-axis, 0 to 30) against GARD (x-axis, 0 to 1), with linear fit y = -7.3x + 7.2, R² = 0.59.
      Correlation between GARD scores and RMSD across the top 10 poses of compounds from 68 different targets and 6 docking methods (4725 points).
  12. The What: Some Specific Examples
      Figure: scatter plot of RMSD (y-axis, 0 to 5) against GARD (x-axis, 0.75 to 1), R² = 0.53. Two poses are labeled:
      • 1GLQ: RMSD = 4.44 Å, GARD = 0.77
      • 1A4Q: RMSD = 4.90 Å, GARD = 0.78
      Correlation between GARD scores and RMSD for those poses with a GARD score of at least 0.75 across the top 10 poses of compounds from 68 different targets and 6 docking methods (1469 points).
  13. 1A4Q: Neuraminidase with dihydropyran-phenethyl-propyl-carboxamide inhibitor (1.90 Å)
      Figure: SurFlex Ringflex docking pose (green wire) and X-tal (grey tube); RMSD = 4.90 Å, GARD = 0.78.
  14. 1GLQ: Glutathione-S-transferase with p-nitrobenzyl glutathione (1.80 Å)
      Figure: ICM docking pose (green wire) and X-tal (grey tube); RMSD = 4.44 Å, GARD = 0.77.
  15. 1HPX: HIV Protease with KNI-272 inhibitor (2.00 Å)*
      Figure: three panels, with labels 1 to 4 marked on the structures.
      • Crystal structure
      • Best RMSD: GARD = 0.63 / RMSD = 1.89, GLIDE SP 4.5 (10/30)
      • Top scoring: GARD = 0.75 / RMSD = 2.35, GLIDE SP 4.5 (1/30)
      * Additional example, not in the original docking evaluation data set
  16. GPCR Model Validation: GLIDE SP 5.0
      Evaluate GPCR model’s ability to reproduce known crystallographic binding mode
      Figure: scatter plot of RMSD (y-axis, 3 to 7) against GARD (x-axis, 0 to 0.6) for 25 poses, post-minimization.
      β2 adrenergic receptor (2RH1), pose #24: X-tal (green) and GLIDE pose (yellow); RMSD = 3.69 Å, GARD = 0.48.
  17. GPCR Model Validation: IFD [10]
      β2 adrenergic receptor (2RH1), IFD, default parameters, pose #1
      Figure: X-tal ligand (cyan) and model protein (cyan); IFD pose (tan) and IFD protein (tan); RMSD = 1.85 Å, GARD = 0.65.
      [10] Schrödinger Suite 2008, Induced Fit Docking protocol; Glide version 5.0, Schrödinger, LLC, New York, NY, 2008; Prime version 2.0, Schrödinger, LLC, New York, NY, 2008
  18. Concluding remarks
      • RMSD is a good measure most of the time, although it has known drawbacks which can result in the discarding of useful information
      • A Generally Applicable Replacement for RMSD (GARD) has been proposed which overcomes most of the drawbacks of RMSD, whilst preserving its strengths. This measure is:
        - Normalized
        - ‘Chemically aware’
        - Automated / objective
      • GARD’s utility was illustrated with specific examples from a large-scale docking evaluation exercise, and with examples from the Protein Data Bank
      • Future application: use with RMSD to triage docking results for protein model evaluation
        - Of particular utility when considering multiple models, and tens/hundreds of docking poses
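
One way the proposed triage might look is sketched below. It uses the RMSD and GARD values quoted on the example slides. The three categories, the function name `triage`, and the 2.0 Å RMSD cutoff are assumptions made for illustration. The 0.75 GARD cutoff is borrowed from the filter used on the specific-examples slide, and nothing here is a recommendation from the talk.

```python
from typing import NamedTuple


class Pose(NamedTuple):
    label: str
    rmsd: float  # Å
    gard: float  # bounded on [0, 1]


# RMSD/GARD pairs quoted on the example slides.
poses = [
    Pose("1GLQ ICM", 4.44, 0.77),
    Pose("1A4Q SurFlex Ringflex", 4.90, 0.78),
    Pose("1HPX GLIDE top scoring", 2.35, 0.75),
    Pose("1HPX GLIDE best RMSD", 1.89, 0.63),
    Pose("2RH1 GLIDE pose 24", 3.69, 0.48),
    Pose("2RH1 IFD pose 1", 1.85, 0.65),
]

RMSD_CUTOFF = 2.0   # Å, hypothetical threshold chosen for this sketch
GARD_CUTOFF = 0.75  # borrowed from the filter on the specific-examples slide


def triage(pose: Pose) -> str:
    """Hypothetical three-way triage combining RMSD and GARD."""
    if pose.rmsd <= RMSD_CUTOFF:
        return "accept"   # close to the crystal pose by RMSD alone
    if pose.gard >= GARD_CUTOFF:
        return "inspect"  # high RMSD, but most weighted atoms well placed
    return "reject"


results = {p.label: triage(p) for p in poses}
for label, verdict in results.items():
    print(f"{label:26s} {verdict}")
```

Under these assumed cutoffs, the 1GLQ and 1A4Q poses would be discarded on RMSD alone but are flagged for visual inspection once GARD is considered.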
  19. Cultural highlight
      • Ethnographic examination of ‘simulators’
        - Crystallographers
        - Architects
        - Oceanographers
      • “All models are wrong, but some models are useful” – G. E. P. Box
      • “If exactitude is elusive, it is better to be approximately right than certifiably wrong” – B. B. Mandelbrot
      Simulation and Its Discontents, Sherry Turkle, Cambridge, MA: MIT Press (2009)
  20. Acknowledgments
      • Boehringer Ingelheim
        - Dr. Ingo Mügge
        - Dr. Sandy Farmer
      • Wyeth Research
        - The Docking Evaluation Team (Dr. YongBo Hu, Dr. Kristi Yi Fan and Dr. Brajesh K. Rai*)
        - Dr. Jack A. Bikker
        - Dr. Christine Humblet
      * Pfizer Global Research and Development, Groton, CT
