Using Matched Molecular Pairs To Cluster Compounds


Short presentation showing how matched molecular pairs can be used to cluster SAR data

  1. Using Matched Molecular Pairs to cluster compounds. Willem van Hoorn, Senior Solutions Consultant, Professional Services, Accelrys, UK
  2. See previous post for an introduction to matched molecular pairs (MMP), etc.:
  3. MMP output: IDs / activities of compounds in MMP, SMILES of R-groups, SMILES of core
  4. Chemical series identification
     • Series identification ≈ clustering compounds
     • There is no universal best clustering method
       – Personal taste
       – May want few loose clusters or many tight clusters
       – Etc.
     • Aim: identify series with interpretable SAR
  5. Test set: EGFR from ChEMBL
     - ChEMBL version 11
     - 4609 IC50 values
     - 3581 compounds
     - 2869 unique compounds with IC50
     Ed Griffen et al., J. Med. Chem. 2011, 54, 7739-50
  6. Cluster by common core
     • 2869 compounds yield 2595 unique cores
       – Too many clusters
       – However: many cores are substructures of others [figure: one core "is subset of" another]
  7. Identify unique common cores
     • Convert all cores to substructure queries
     • Perform an all vs. all substructure search
     • 430 cores are not a substructure of any other core
  8. Map compounds to unique cores: 430 series, 51 with ≥10 compounds [plot: number of series of a given size vs. series size]
  9. All cluster members form MMPs (sorted by pIC50)
  10. Same cluster in SAR table
  11. A compound can be a member of >1 cluster [figure labels: Core1, R1, R2, Core2]
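
The series-identification steps on slides 6 to 11 can be sketched roughly as below. This is only an illustration under stated assumptions, not the workflow used in the presentation: cores and compounds are toy strings, and the hypothetical helper `toy_is_substructure` uses plain substring matching as a stand-in for a real substructure search in a cheminformatics toolkit. All function and variable names here are invented for the example.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Iterable, List

# Predicate type: does `query` occur as a substructure of `target`?
SubstructureTest = Callable[[str, str], bool]


def toy_is_substructure(query: str, target: str) -> bool:
    """Hypothetical stand-in: substring matching, NOT real chemistry."""
    return query in target


def unique_cores(cores: Iterable[str], is_sub: SubstructureTest) -> List[str]:
    """All vs. all search: keep cores that are not a substructure of another core."""
    distinct = list(dict.fromkeys(cores))  # drop duplicates, keep input order
    return [
        core
        for core in distinct
        if not any(other != core and is_sub(core, other) for other in distinct)
    ]


def map_to_series(
    compounds: Dict[str, str], cores: List[str], is_sub: SubstructureTest
) -> Dict[str, List[str]]:
    """Assign each compound to every core it contains (one series per core)."""
    series = defaultdict(list)
    for compound_id, structure in compounds.items():
        for core in cores:
            if is_sub(core, structure):
                series[core].append(compound_id)
    return dict(series)


def series_size_histogram(series: Dict[str, List[str]]) -> Counter:
    """Count how many series have each size (the data behind a size plot)."""
    return Counter(len(members) for members in series.values())


def multi_series_compounds(series: Dict[str, List[str]]) -> List[str]:
    """Compounds that belong to more than one series."""
    counts = Counter(cid for members in series.values() for cid in members)
    return sorted(cid for cid, n in counts.items() if n > 1)


# Toy data (assumed for illustration only).
toy_cores = ["AB", "ABC", "XY", "ABC"]
toy_compounds = {"cpd1": "ABCp", "cpd2": "ABCq", "cpd3": "rXY", "cpd4": "ABCXY"}

cores = unique_cores(toy_cores, toy_is_substructure)
series = map_to_series(toy_compounds, cores, toy_is_substructure)
histogram = series_size_histogram(series)
shared = multi_series_compounds(series)

print(cores)
print(series)
print(dict(histogram))
print(shared)
```

Passing the substructure test in as a function keeps the clustering logic independent of any particular toolkit, so the toy predicate could be swapped for a real substructure matcher without changing the rest of the sketch. Because a compound is assigned to every core it contains, membership in more than one series falls out naturally, as on slide 11.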