Tim Cheeseright, Assessing the Similarities of Compound collections using molecular fields: Does it add value?

This presentation, originally given at the 2012 ACS National Meeting in San Diego, investigates alternative methods of defining chemical space using 3D Field based methodologies - the advantages and disadvantages of which are described.

Published in: Technology, Education
  • Notes: The 2D drawing of a molecule gives limited information about its nature. In real life, molecules take on a 3D geometry that can’t be truly represented by a flat cartoon. Consider the electrostatic potential surrounding a molecule and map that potential out to a surface, as shown in the second figure. Field Points are placed at the extrema of the MEP, with the point size governed by the size of the electrostatic contribution. Spatial points are also included at the van der Waals radii extrema.
  • (1) Commercial databases (9 million compounds) filtered for heavy atom count > 11 and < 30, corresponding roughly to MWt > 140 and < 500 (4,655,051 cpds).
    (2) Further filtered for rotatable bond count < 5; reactive group filters applied (removes nasties like aldehydes, ketones, hydrazones, alkyl halides, isocyanates, nitrosyl etc… see below for full list); charge filters: < 3 formal charges, negative or positive (1,282,042 mols passed these filters).
    (3) For this list of compounds we intend to calculate logP, HBA, HBD, PMI and shadow indices and select 20K on shape diversity. I believe this is going to be a reasonable approximation of field similarity since fields are also heavily dependent on 3D conformation.
    (4) From this data we also intend to pick 100 probe molecules and use these to calculate similarity vs. the 20K set. This gives a 20K set, each with a 100 bit field fingerprint. This is the equivalent of completing a 2M virtual screen.
    (5) This fingerprint can be subjected to a PCA analysis to reduce the data effectively to a 3 dimensional ‘field space’ from which a diverse 12K set can be chosen. From a practical point of view it will be difficult to expand this process to a bigger data set, although if 3D shape sim correlates well with Field sim then the PMI selection may be enough – we simply don’t know until we do the experiment.
    (6) We will provide the 12K SD file set for you to purchase, with 2000 cpd redundancy for those which are not available or too expensive etc.
    (a) Filtered on properties and nasty functionality to obtain a 1.2 million compound data set.
    (b) On this set we ran a PMI shape descriptor calculation on a single ‘lowest energy’ conformation for each molecule in the set.
    (c) From this we picked a 20K shape diverse set using the PMI defined shape space.
    (d) From the 20K set I picked a diverse 200 cpd set in the same way.
    (e) We applied to this 200 an all by all 2D similarity matrix (‘200 by 200’); we could then ensure 2D dissimilarity in the choice of a set of 100 probe molecules.
    (f) These 100 probe molecules were used as templates to measure Field similarity against each of the 20K cpds and thus produce a 100 bit number for each of the 20K cpds.
    (g) From the Field similarity matrix we collapsed the ‘~20000 X 100’ matrix to ~20000 X 3 dimensions using PCA to define the 3D fieldspace.
    (h) 12K Field diverse compounds were selected from this 3D fieldspace.
  • Theoretically, field based metrics should be a good way to assess the similarity/diversity of fragment collections?? Diversity of fragment databases?? In Fieldstere
  • Should probably have done a 200 X 200 field similarity at this stage to ensure picking field diverse probes? But 2D dissimilarity also ensured we were avoiding picking too similar chemotypes for the probes – probably doesn’t matter. Theoretically, field based metrics should be a good way to assess the similarity/diversity of fragment collections?? Diversity of fragment databases?? In Fieldstere Never tried using a smaller number of probes – could increase/decrease discrimination?
  • Picked a cluster set from the space 3D PCA – selected an arbitrary conformer then flexibly aligned (Falign) the rest – plot surface. Bottom 8 Fsim less than 6
  • Picked a cluster set from the space 3D PCA – selected an arbitrary conformer then flexibly aligned (Falign) the rest – plot surface. Bottom 8 Fsim less than 6
  • Again selected an arbitrary conformer (different one this time) then flexibly aligned (Falign) the rest – plot surface. Bottom 5 Fsim less than 6
  • Picked a second cluster and repeated with another arbitrary template – Fsims all > 6, discarded 4 which were below 6 – cluster still OK. Conclude: even in this space, clusters of close field similarity are still fairly diverse!!
  • Separation of chemically intuitive groupings – DHP-like esters/lactones; compact sulphonamides – clusters on periphery are truly Field dissimilar.
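Steps (f) to (h) of the workflow notes above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the field similarity calculation is replaced by random numbers, the matrix is scaled down from 20,000 x 100 to 2,000 x 100, and the greedy MaxMin picker is an assumed selection method, since the notes do not say how the 12K set was chosen from the 3D field space. The names `pca_project` and `maxmin_pick` are hypothetical.

```python
import numpy as np


def pca_project(fingerprints, n_components=3):
    """Project each row onto the top principal components (PCA via SVD)."""
    centered = fingerprints - fingerprints.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def maxmin_pick(coords, n_pick, start=0):
    """Greedy MaxMin: repeatedly add the point farthest from the picked set.

    ASSUMPTION: the source does not name its diversity picker; MaxMin is
    used here only as a common, simple stand-in.
    """
    picked = [start]
    # Distance from every point to its nearest already-picked point.
    nearest = np.linalg.norm(coords - coords[start], axis=1)
    while len(picked) < n_pick:
        nxt = int(np.argmax(nearest))
        picked.append(nxt)
        dist_to_new = np.linalg.norm(coords - coords[nxt], axis=1)
        nearest = np.minimum(nearest, dist_to_new)
    return picked


rng = np.random.default_rng(0)

# Stand-in for the compounds x probes field similarity matrix (step f).
# Real values would come from aligning each compound to each probe.
fingerprints = rng.random((2000, 100))

# Step (g): collapse the 100-column fingerprint to a 3D "field space".
field_space = pca_project(fingerprints, n_components=3)

# Step (h): pick a diverse subset in that space (1,200 of 2,000 here,
# mirroring the 12K-of-20K ratio in the notes).
selected = maxmin_pick(field_space, n_pick=1200)
```

At full scale the fingerprint matrix holds 20,000 x 100 = 2,000,000 similarity values, which is where the "2M virtual screen" comparison in step (4) comes from.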
  • Transcript of "Tim Cheeseright, Assessing the Similarities of Compound collections using molecular fields: Does it add value?"

    1. Tim Cheeseright, Mark Mackey, Rob Scoffin, Martin Slater. Assessing the similarity of compound collections using molecular fields: Does it add value?
    2. Conclusions> It works brilliantly> All synthetic steps gave yields of 100%> All enrichments were perfect> All new molecules were sub nM> All QSARs were totally predictive, q2 = 1.0> We expect the call from Sweden any day now
    3. Conclusions> Work in progress> 3D similarity can add value to compound selection> Full matrix of similarities possibly unnecessary> Using probes looks like a possible solution> Not a panacea
    4. Agenda & Background> Fields & similarity> Generating screening compounds using Fields> Selecting a 10K “diverse” library for screening from commercial compounds > Initial thoughts > Problems > More Initial thoughts > A solution but not a complete one> Conclusions
    5. Field Points: Condensed representation of electrostatic, hydrophobic and shape properties (“protein’s view”) > Molecular Field Extrema (“Field Points”) [Figure panels: 2D; 3D Molecular Electrostatic Potential (MEP); Field Points. Legend: Positive, Negative, Shape, Hydrophobic]
    6. Improved MM Electrostatics> Field patterns from XED force field reproduce experimental results [Figure panels: Experimental; Using XEDs; Not using XEDs. Captions: Interaction of Acetone and Any-OH from small molecule crystal structures; XED adds ‘p-orbitals’ to get better representation of atoms]
    7. Non-Classical Comparisons
    8. Molecular Alignment [Figure: aligned molecules, values shown 0.82, 0.66, 0.98] Cheeseright et al., J. Chem. Inf. Model., 2006, 665
    9. Using Fields> Bioisosteric groups> Virtual Screening> Pharmacophore hypothesis> Qualitative SAR interpretation> 3D QSAR> Library Design
    10. Field based library design success
    11. Libraries from Fields> Small, custom synthesised libraries (~100s - 1000s compds)> Low scaffold diversity> Highly targeted> Lots of manual design
    12. An Opportunity & a Challenge> Provide a small diverse screening library 10K for a small biotech company > Diversity in potential biological targets to be hit > Minimum redundancy in the set > Maximum chance of success in finding a lead within available budget and screening resources
    13. Initial thoughts> Customised design not an option - commercial compounds only> Using Fields to successfully select compounds for screening performed many times > Virtual screening > Always in a specific biological context> What about using Fields to choose a ‘diverse’ set?> Possible problem with numbers > 10,000 cmpd library small > 9,000,000 commercially available molecules v. large for 3D diversity
    14. Initial thoughts> Compare 3D and 2D similarities for compound collections - are we wasting our time?> Take a small compound collection> Full NxN calculation> 3D method = Fields & Shape> 2D method = atom pairs> Compare and Contrast
    15. Conformations> 3D Method requires conformations - which one(s) to use?> What is the similarity of 2 compounds in 3D? > Context is important! > Highest across all conformations? > Average? > Lowest?> For 3D, similarity calculation is Nconfs x Nconfs
    16. Compound Collection> BIONET Rule of Three (Ro3) Fragment Library: “7,907 Ro3-compliant fragments”> Conformation hunt on every fragment > Maximum of 5 conformations (!)> Full N x N similarity matrix, 3D & 2D (60 Million data points)> ~30 compounds failed conformation hunting
    17. Problems> 400Mb of data> Tedious to use and examine> Pilot study just using the first 500 compounds > Some chemical families in this area > Still a large dataset to deal with (250,000 data points)> 2D similarities and fragments > Small changes cause disproportionately high changes > Atom pairs particularly bad > Switch to KNIME fingerprints > All 2D values lower than ‘normal’
    18. Comparing 2D and 3D metrics: Agreement
    19. Example - Similar Scores: 2D sim = 0.9; 3D field sim = 0.87 (structures labelled 101 and 104)
    20. Example - Higher 3D Sim: 2D sim = 0.1 (other methods = 0.3); 3D field sim = 0.82
    21. Example - Higher 3D Sim: 2D sim = 0.2; 3D sim = 0.7 (structures labelled 141 and 454)
    22. Example - Higher 3D Sim: 2D sim = 0.3 (other methods 0.55); 3D field sim = 0.8 (structures labelled 437 and 440)
    23. So…> Pilot study suggests some added value> Full matrix painful even if we could calculate it> What about a reduced matrix? > Use ‘Probe’ compounds to tease out molecules that are different in Field space > How many probes? Across how many molecules?> We were running out of time…
    24. Compound selection by Field Diversity> Proposed workflow for generation of a field diverse library: 9M commercial compounds → Property Filters → 1.2M sub-set → Calc. Shape Diversity by PMI → Pick 20K sub-set → Pick 200 sub-set → Calc. 200 X 200 2D similarity matrix → Pick 100 Diverse probes → Calc. 20K X 100 Field similarity matrix → 3D PCA on Field matrix → Pick 12K Field Diverse set
    25. Field Diverse library: Outcome. 12K ‘Field Diverse’ library mapped by 3D PCA on the 100 x 20,000 ‘Field Similarity Fingerprint’ [Plot labels: Ammoniums; Piperidines; Benzoic and aliphatic acids] Distinct separation of charged species within this space ….so what!!
    26. Field Diverse library: Outcome. 12K ‘Field Diverse’ library mapped by 3D PCA [Plot annotation: Decreasing Size] Distinct separation of molecules by size within this space ….so what!!
    27. Deeper - Moderate ‘Field Similarity’: Alignment to ‘template1’
    28. Deeper - Moderate ‘Field Similarity’: Random selection of mols; Alignment to ‘template1’
    29. Deeper - Moderate ‘Field Similarity’: Alignment to ‘template’
    30. Is the chemical space sensible? Two example clusters: Small sulphonamides; Large esters
    31. Conclusions> Work in progress> Full similarity matrix shows potential of 3D sim to add value> Full matrix difficult to handle and possibly unnecessary> Using probes looks like a possible solution> Not a panacea - still need to play the numbers game
    32. Acknowledgements> Cresset > Martin Slater > Rob Scoffin > Mark Mackey > James Melville> Mission Therapeutics > Keith Menear
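
Slide 15 asks how to reduce the Nconfs x Nconfs conformer-pair scores to one similarity per compound pair, and slides 16 and 17 quote the resulting data volumes. The sketch below shows the three aggregation options from slide 15 and the matrix-size arithmetic. The function names and the conformer scores are made up for illustration; this is not the software used in the study, and the slides do not say which aggregation was finally used.

```python
import numpy as np


def aggregate_similarity(conf_sims, mode="max"):
    """Collapse a conformer-by-conformer score matrix to a single value.

    conf_sims[i, j] is the 3D similarity between conformer i of compound A
    and conformer j of compound B. Modes follow slide 15:
    highest ("max"), average ("mean") or lowest ("min").
    """
    reducers = {"max": np.max, "mean": np.mean, "min": np.min}
    if mode not in reducers:
        raise ValueError(f"unknown mode: {mode!r}")
    return float(reducers[mode](conf_sims))


def full_matrix_entries(n_compounds):
    """Number of entries in a full N x N similarity matrix."""
    return n_compounds * n_compounds


def max_conformer_comparisons(n_compounds, max_confs):
    """Upper bound on conformer-pair comparisons behind a full N x N matrix."""
    return (n_compounds * max_confs) ** 2


# Hypothetical scores: compound A has 2 conformers, compound B has 3.
conf_sims = np.array([
    [0.82, 0.41, 0.55],
    [0.37, 0.66, 0.48],
])

highest = aggregate_similarity(conf_sims, "max")
average = aggregate_similarity(conf_sims, "mean")
lowest = aggregate_similarity(conf_sims, "min")

fragment_matrix = full_matrix_entries(7907)   # full BIONET Ro3 library
pilot_matrix = full_matrix_entries(500)       # pilot subset from slide 17
worst_case = max_conformer_comparisons(7907, 5)
```

For 7,907 fragments the full matrix has 62,520,649 entries, in line with the "60 Million data points" on slide 16, and the 500-compound pilot gives the 250,000 quoted on slide 17. With up to 5 conformations per fragment, the number of underlying conformer-pair comparisons can be up to 25 times larger; the true figure is somewhat lower because some fragments have fewer conformers and about 30 failed conformation hunting.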