CAESAR II: The Combination of Direct Geometry Method and CAESAR Algorithm for Super Fast Conformational Search

A new method, called Direct Geometry, is proposed for 3D structure generation of molecules with various types of geometric constraints such as ring closures, chirality and cis-trans isomerism. This method is combined with the original CAESAR algorithm for super-fast conformation searches. The new method is based on a very simple iterative procedure that directly modifies atom coordinates according to the geometric constraints, such as bond lengths, bond angles, torsions, and various stereochemical constraints. Compared to the traditional Distance Geometry method, the new method is much simpler and more efficient for highly constrained molecules. Techniques for stabilizing and accelerating convergence are presented. The efficiency and robustness of the Direct Geometry method are demonstrated by the successful 3D structure generation of C60 and other highly constrained structures from completely random coordinates. To further improve overall performance, a new ring library technology is designed for better reuse and fast retrieval of ring conformations. Our test with nearly 6 million compounds shows that the new integrated method, called CAESAR II (the second generation of the CAESAR algorithm), is significantly faster than the original one. The high performance suggests that the new algorithm can be used for on-the-fly conformation generation in many applications that involve conformational models. Validation studies, such as conformation diversity measurements, pharmacophore space coverage, and the ability to reproduce the bioactive conformations of ligands extracted from the Protein Data Bank (PDB), will be reported.

  1. CAESAR II: The Combination of Direct Geometry Method and CAESAR Algorithm for Super Fast Conformational Search
     Jiabo Li, Ph.D.
     ACS Meeting, March 21-25, 2010, San Francisco
  2. Product Roadmap Disclaimer
     • This presentation and/or any related documents contain statements regarding our plans or expectations for future features, enhancements or functionalities of current or future products (collectively "Enhancements"). Our plans or expectations are subject to change at any time at our discretion. Accordingly, Accelrys is making no representation, undertaking no commitment or legal obligation to create, develop or license any product or Enhancements. The presentation, documents or any related statements are not intended to, nor shall, create any legal obligation upon Accelrys, and shall not be relied upon in purchasing any product. Any such obligation shall only result from a written agreement executed by both parties. In addition, information disclosed in this presentation and related documents, whether oral or written, is confidential or proprietary information of Accelrys. It shall be used only for the purpose of furthering our business relationship, and shall not be disclosed to third parties.
     © 2009 Accelrys, Inc.
  3. Outline
     • Original CAESAR: very efficient torsion search
       – Recursive partition of a molecule
       – Recursive build-up of conformations
       – Fast energy screening for bad clashes
       – New strategy for eliminating TopSym duplicates
     • CAESAR II: deals with various geometric constraints
       – New algorithm for 3D structure generation for ring systems
       – Enforces stereochemistry (chirality, cis-trans, stereogenic)
       – Ring conformation library and access to ring conformations
     • Validation of CAESAR II: speed, quality, diversity
       – Quality (binding conformations)
       – Diversity: 3D pharmacophore space, distribution of radius of gyration
       – Test on macrocycles
  4. Conformation Sampling is important: The Conformation Search Problem
     • 3D conformation search is important in many applications
       – How a new compound binds to a protein is usually unknown
       – 3D pharmacophore modeling
       – Docking
     • Efficient search is challenging!
       – Most drug molecules are conformationally flexible
       – The low-energy conformation space is high-dimensional and highly irregular in shape
       – Stereochemistry and ring-closure constraints
     • CAESAR is an efficient search algorithm
  5. CAESAR: Conformation Algorithm Based on Energy Screening And Recursive Buildup
     1. Recursively partition
     2. Recursively assemble: $E_{AB} = E_A + E_B + E_{A-B}$
     3. Quickly filter out bad clashes
     4. New method for eliminating duplicates
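The recursive assembly with energy screening can be illustrated with a toy sketch. It reads the slide's energy expression as additive: the energy of the joined fragment is the sum of the two fragment energies plus an A-B interaction term. The tree representation, the `interaction_energy` callable, and the `e_max` cutoff are all hypothetical choices made for illustration; the slides do not describe the actual data structures or screening threshold.

```python
from itertools import product

def build_conformers(node, interaction_energy, e_max):
    """Recursively assemble conformers for a partition tree.

    A leaf is a list of (energy, geometry) pairs for one fragment; an
    internal node is a (left, right) tuple. Geometries are tuples, so
    joining two fragments is plain tuple concatenation (a stand-in for
    real 3D assembly).
    """
    if isinstance(node, list):  # leaf: fragment conformers are given
        return list(node)
    left, right = node
    conformers_a = build_conformers(left, interaction_energy, e_max)
    conformers_b = build_conformers(right, interaction_energy, e_max)
    kept = []
    for (e_a, geom_a), (e_b, geom_b) in product(conformers_a, conformers_b):
        # E_AB = E_A + E_B + E_(A-B); combinations with bad clashes get a
        # high interaction term and are screened out by the cutoff.
        e_ab = e_a + e_b + interaction_energy(geom_a, geom_b)
        if e_ab <= e_max:
            kept.append((e_ab, geom_a + geom_b))
    return sorted(kept)

def toy_interaction(geom_a, geom_b):
    # Hypothetical rule: conformer "a1" clashes with conformer "b1".
    return 5.0 if (geom_a[-1], geom_b[0]) == ("a1", "b1") else 0.0

fragment_a = [(0.0, ("a1",)), (1.0, ("a2",))]
fragment_b = [(0.0, ("b1",)), (2.0, ("b2",))]
models = build_conformers((fragment_a, fragment_b), toy_interaction, e_max=2.5)
```

Because screening happens at every level of the tree, clashing combinations are discarded before they can multiply in later assembly steps.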
  6. 3D Search results
     Database: built with both Catalyst/FAST and CAESAR for a test set of 50,000 molecules

     Table 1. Number of catSearch hits with three 3D pharmacophore queries

     Query              Catalyst/FAST   CAESAR   Common
     ang-IIHypo                    90      100       73
     Hypo2                        247      236      215
     ang-IIHypoShape               22       21       13
  7. Reference of CAESAR I
     J. Li, T. Ehlers, J. Sutter, S. Varma-O’Brien, and J. Kirchmair, CAESAR: A New Conformer Generation Algorithm Based on Recursive Buildup and Local Rotational Symmetry Consideration, J. Chem. Inf. Model. 2007, 47, 1923-1932
  8. New Developments of CAESAR
     • CAESAR II
       – Direct Geometry Method for conformation generation of constrained structures
       – New method for stereochemistry control (chirality, cis-trans, etc.)
       – New strategy of ring conformation library
  9. Traditional method for ring structure (Distance Geometry)
     – Bound smoothing (can be very time consuming)
     – Embedding from a distance matrix
     – Optimization of the generated structures
  10. Direct Geometry Method
     • Direct 3D coordinate modification according to geometric constraints
     • Types of geometric constraints
       – Bond length
       – Bond angle (co-linear, 180 degrees)
       – Torsion (co-planar, 180 degrees)
       – Stereochemistry (chirality, cis-trans, stereogenic)
       – VDW clash
     • All types of constraints can be converted into distance constraints
  11. Bond Length
     • Bond length correction
       The bond between C16 and N15 is too long.
       Correction: move the two atoms toward each other.
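A minimal sketch of such a bond length correction, assuming the move is split equally between the two atoms and applied along the line joining them. The coordinates and the 1.47 Å target are illustrative values, and the `step` parameter is a hypothetical knob for applying only part of the correction; none of these come from the slides.

```python
import math

def correct_bond_length(p, q, target, step=1.0):
    """Move atoms p and q symmetrically along their connecting line so the
    distance moves toward `target`. step=1.0 applies the full correction."""
    d = math.dist(p, q)
    if d == 0.0:
        return p, q  # direction undefined for coincident atoms
    shift = 0.5 * step * (d - target) / d
    new_p = tuple(a + shift * (b - a) for a, b in zip(p, q))
    new_q = tuple(b - shift * (b - a) for a, b in zip(p, q))
    return new_p, new_q

# Illustrative coordinates: the C16-N15 distance starts at 2.0 (too long).
c16 = (0.0, 0.0, 0.0)
n15 = (2.0, 0.0, 0.0)
c16, n15 = correct_bond_length(c16, n15, target=1.47)
```

The same function also stretches a bond that is too short, since the sign of `d - target` flips the direction of the move.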
  12. Bond Angle
     • Bond angle correction
       Bond angle C12-C13-C14 is too small.
       Correction: increase the distance between C12 and C14.
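The reason a bond angle can be handled as a distance is the law of cosines: with both bond lengths fixed, the angle at the central atom determines the distance between the two outer atoms. The sketch below computes that target 1-3 distance; the bond lengths and angle in the example are illustrative values, not taken from the slides.

```python
import math

def one_three_distance(r12, r23, angle_deg):
    """Distance between atoms 1 and 3 given bond lengths r12 and r23 and
    the 1-2-3 bond angle in degrees (law of cosines)."""
    theta = math.radians(angle_deg)
    return math.sqrt(r12 * r12 + r23 * r23 - 2.0 * r12 * r23 * math.cos(theta))

# Illustrative: two 1.5 Å bonds at a tetrahedral-like angle.
d13_target = one_three_distance(1.5, 1.5, 109.5)
# A smaller angle gives a shorter 1-3 distance, so an angle that is too
# small is corrected by moving the outer atoms apart.
d13_small = one_three_distance(1.5, 1.5, 90.0)
```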
  13. Linear bonds
     • Linear bond correction
       Carbon C2 is off the line.
       Correction: move C2 to its correct position.
  14. VDW Clash
     • Remove VDW clash
       Two atoms, H3 and H6, are too close.
       Correction: move H3 and H6 away from each other.
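Unlike a bond length, a clash is a one-sided constraint: the atoms only need to be at least some minimum distance apart. A small sketch under that assumption follows; the `min_dist` value and the coordinates are hypothetical.

```python
import math

def remove_clash(p, q, min_dist):
    """Push p and q apart along their connecting line, but only if they
    are closer than min_dist; otherwise return them unchanged."""
    d = math.dist(p, q)
    if d >= min_dist or d == 0.0:
        return p, q
    shift = 0.5 * (min_dist - d) / d
    new_p = tuple(a - shift * (b - a) for a, b in zip(p, q))
    new_q = tuple(b + shift * (b - a) for a, b in zip(p, q))
    return new_p, new_q

# Illustrative: H3 and H6 start 1.0 apart with an assumed 2.0 minimum.
h3, h6 = remove_clash((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), min_dist=2.0)
```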
  15. Other types of geometric constraints can also be converted into distance constraints
     • For instance, simple distance constraints do help for chiral centers. We can use stereo templates with the correct chirality to guide each atom's move toward the correct geometry. SOS by Zhu and Agrafiotis used a similar idea.
     • If the chirality is unknown, no additional constraints are needed
  16. Put the simple ideas into practice
     • No single correction can satisfy all the constraints
     • Corrections need to be applied iteratively
     • Control of convergence is important
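A minimal sketch of what iterating the corrections might look like, assuming every constraint has already been expressed as a lower and upper bound on an atom-atom distance. The damping factor, tolerance, and sweep limit are hypothetical stand-ins for convergence control; the slides do not say which stabilization technique CAESAR II actually uses. The example builds a regular tetrahedron (all six pairwise distances equal to 1) from random starting coordinates.

```python
import math
import random

def solve(coords, constraints, damping=0.5, tol=1e-6, max_sweeps=2000):
    """Sweep over (i, j, lower, upper) distance constraints, nudging each
    violating pair toward its nearest bound, until the largest violation
    seen in a sweep is below tol or max_sweeps is reached.
    Returns (coords, sweeps_used, converged)."""
    coords = [list(c) for c in coords]
    for sweep in range(1, max_sweeps + 1):
        worst = 0.0
        for i, j, lower, upper in constraints:
            d = math.dist(coords[i], coords[j])
            target = min(max(d, lower), upper)  # nearest allowed distance
            error = d - target
            worst = max(worst, abs(error))
            if error == 0.0 or d == 0.0:
                continue
            # Damped, symmetric move of both atoms along their connecting line.
            shift = 0.5 * damping * error / d
            for k in range(3):
                delta = shift * (coords[j][k] - coords[i][k])
                coords[i][k] += delta
                coords[j][k] -= delta
        if worst < tol:
            return coords, sweep, True
    return coords, max_sweeps, False

random.seed(0)
start = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(4)]
pairs = [(i, j, 1.0, 1.0) for i in range(4) for j in range(i + 1, 4)]
coords, sweeps, converged = solve(start, pairs)
```

Each correction disturbs the pairs that share an atom with it, which is why a single pass is not enough and why a damping factor below 1 can help keep the sweeps from overshooting.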
  17. Test 1: Diamond structures from random starting coordinates
  18. Iterative correction towards correct structure
  19. Test 2: C-60
  20. Timing: Direct Geometry Method vs. Distance Geometry Method

     Table 2. CPU time (seconds) for generating 3D structures

     Molecule   Direct Method   DG Method   Ratio (DG/Direct)
     C60                 0.12           1                   8
     Diamond             0.04          43                1000
  21. Test 3: All-trans cyclic peptide bonds
  22. Test 4: Conformation sampling of macrocycles (Pascal Bonnet data set). Diversity by fingerprints

     Table 3. Number of pharmacophore fingerprints (FPs) of conformation models of macrocycles

                  CAESAR II                             OMEGA
     Mol.    Confs.   3-point FPs   4-point FPs    Confs.   3-point FPs   4-point FPs
     P1          45         10567        251585         0             0             0
     P2          71          8375        236449        76          8948        448917
     P3           8           714          4468       157          3550         37078
     P4         206          6104        165997       200          6440        109051
     P5          11          3735         60097        22          4142         83547
     P6          19         10481        221494        49          7326        322580
     CD6         25          1207         44184         6          1062         29948
     G6           6           250           498         9           131           342
     G8           6           648          2476        13           359          1828
     G10          6          1374          7503        11           760          5922
     G12          6          2354         20640         5           969          8202
     G14          6          3735         41996         2           817          6458
     G16          6          4784        101998         2          1578         14654
     G18          6          7345        165467         3          1154         23631
     G20          6          8626        245656         2          1305         26909
     SUM-1      433         70299       1570508       557         38541       1119067
     SUM-2      388         59732       1318923       557         38541       1119067

     Notes:
     (1) Bin size 1.5 Å. All other settings are the defaults in DS 2.1.
     (2) D8-D14 molecules failed in fingerprint generation and were therefore excluded.
     (3) SUM-1: summation over all 15 molecules. SUM-2: P1 excluded.
  23. Test 4: Conformation sampling of macrocycles (Pascal Bonnet data set). Radius of gyration
     Figure 1. Distribution of the radius of gyration of conformations generated by OMEGA and CAESAR II
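For reference, the radius of gyration of a conformation is the root-mean-square distance of its atoms from their centroid. The sketch below weights all atoms equally, which is an assumption; the slides do not say whether mass weighting was used. The two point sets are made-up examples showing that an extended arrangement has a larger value than a compact one.

```python
import math

def radius_of_gyration(coords):
    """Root-mean-square distance of the atoms from their centroid,
    with all atoms weighted equally."""
    n = len(coords)
    centroid = tuple(sum(c[k] for c in coords) / n for k in range(3))
    mean_sq = sum(math.dist(c, centroid) ** 2 for c in coords) / n
    return math.sqrt(mean_sq)

compact = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
extended = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
rg_compact = radius_of_gyration(compact)
rg_extended = radius_of_gyration(extended)
```

A histogram of this value over all conformers of a molecule gives the kind of distribution the figure compares.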
  24. Distribution of sum of atom-atom distances
     Figure 2. Distribution of atom-atom distance summation of conformations generated by OMEGA and CAESAR II
  25. Test 5: Find bioactive conformation with CAESAR I test dataset

     Table 4. RMSD of the best-fitting conformation to the bioactive conformations

     PDB ligands (CAESAR I test data)   CAESAR II (maxconf=400)   OMEGA (maxconf=400)
     Average RMSD (Å)                   0.96                      0.93
     CPU time (s)                       226                       4385
     RMSD < 0.5 Å                       26.6%                     22.9%
     RMSD < 1.0 Å                       61.0%                     61.9%
     RMSD < 1.5 Å                       82.1%                     87.6%
     RMSD < 2.0 Å                       92.4%                     94.4%

     * Machine: Intel(R) Xeon(R) CPU E7420 @ 2.13GHz
  26. Push efficiency to the new limit: reuse ring conformations by creating a library
     • Scan 6M compounds => 100,000 ring/rigid structures
     • Generate conformations for all rings in the library using the BEST method
     • Build an index for the library
     • ~100 MB file size
  27. Retrieve ring conformations from library efficiently
     • Load the index file
     • If a ring conformation is already cached in memory, use it; otherwise read it from the library file
     • If the ring is not in the library, generate ring conformations on the fly using the direct method, and keep them in memory for reuse
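The lookup order described above (memory cache, then library file via the index, then on-the-fly generation) can be sketched as a small class. Everything here is hypothetical: the class name, the string ring keys, the `read_from_file` and `generate` callables, and the placeholder conformation labels. The actual file format and keying scheme of the CAESAR II library are not described in the slides.

```python
class RingLibrary:
    """Toy model of the retrieval order: memory cache first, then the
    library file via its index, then on-the-fly generation."""

    def __init__(self, index, read_from_file, generate):
        self.index = index                    # ring key -> location in library file
        self.read_from_file = read_from_file  # callable(location) -> conformations
        self.generate = generate              # callable(ring key) -> conformations
        self.cache = {}                       # ring key -> conformations in memory

    def get(self, ring_key):
        if ring_key in self.cache:
            return self.cache[ring_key]
        if ring_key in self.index:
            conformations = self.read_from_file(self.index[ring_key])
        else:
            conformations = self.generate(ring_key)
        self.cache[ring_key] = conformations  # keep in memory for reuse
        return conformations

file_reads = []  # records every simulated disk access

def fake_read(location):
    file_reads.append(location)
    return ["chair", "boat"]  # placeholder labels, not real coordinates

def fake_generate(ring_key):
    return [ring_key + "-generated"]

library = RingLibrary({"cyclohexane": 0}, fake_read, fake_generate)
first = library.get("cyclohexane")   # read from the "file"
second = library.get("cyclohexane")  # served from memory
unknown = library.get("novel-ring")  # not in the index: generated
```

With this ordering, each ring costs at most one file read or one generation per run, no matter how many compounds contain it.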
  28. Speed test of on-the-fly conformation generation using CAESAR II
     • Test conditions
       – Data set: CAP2008 database, ~6 million compounds
       – Max conformations/compound = 100
       – Ring conformation library is pre-generated using the Catalyst/BEST method
       – Quad-core CPU, 2.2 GHz, parallel computing
       – Conformations are not saved to an SD file (to avoid the I/O bottleneck)
     • Speed
       – Generating conformations for all 6 million compounds takes 1.5 hours, or 250 compounds/second/processor
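As a rough consistency check of the quoted throughput, the arithmetic can be written out. This assumes "processor" means one core of the quad-core CPU and that all four cores were busy for the full 1.5 hours; both are assumptions, not statements from the slides.

```python
compounds = 6_000_000    # "~6 million" from the slide, taken at face value
seconds = 1.5 * 3600     # 1.5 hours
cores = 4                # quad-core CPU, assumed fully used

total_rate = compounds / seconds   # compounds per second, all cores
per_core = total_rate / cores      # compounds per second per core

# Working backwards from the quoted 250 compounds/second/processor:
implied_compounds = 250 * cores * seconds
```

The per-core figure comes out somewhat above 250, and the quoted rate corresponds to about 5.4 million compounds, which is consistent with the data set size being given only approximately.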
  29. Summary
     • Two new technologies in CAESAR II make the new algorithm much more robust and efficient than the original CAESAR
       – Direct method for 3D structure generation of ring and rigid structures
       – New ring/rigid structure library and retrieval method
     • Ring conformation generation using the Direct Geometry Method is highly efficient and robust
     • The conformer models of ring molecules have good coverage of 3D pharmacophore space
  30. Acknowledgment
     • Jon Sutter
     • Honglin Li
     • Fang Bai
     • David Zhang
     • Paul Flook
     • Frank Brown
