Molecular Similarity Characterization of ADME Landscapes

ACS Annual Meeting San Francisco 2010


  1. Molecular Similarity Characterization of ADME Landscapes
     ACS Annual Meeting, San Francisco, 2010
     Bin Chen‡, Rishi Gupta* and Eric Gifford†
     ‡ School of Informatics and Computing, Indiana University, Bloomington, IN 47408
     * Anti Bacterial Research Unit, Pfizer Global R&D, Groton, CT 06340
     † Computational Sciences CoE, Pfizer Global R&D, Groton, CT 06340
     Pfizer Confidential
  2. Outline
     • Introduction
     • Methods
     • Results & discussions
     • Use cases
     • Conclusions
  3. What has been done so far?
     • A lot of excellent work has been done in the activity space using a variety of similarity methods and descriptors.
     • The current work focuses primarily on ADME endpoints and molecular properties while examining various descriptor types and similarity methods.
  4. Do similar compounds have similar ADME properties?
     [Figure: example structures compared at similarity 0.92 and 0.85, shown against a similarity scale of 0.9, 0.8, 0.7.]
     • Similarity varies based on the descriptors used.
     • Do the similar compounds have similar ADME properties?
  5. Do different ADME endpoints have different landscapes?
     $$\mathrm{Ratio}_{\text{similarity}} = \frac{\#\,\text{neighbors with same class}}{\#\,\text{total neighbors}}$$
     [Figure: a probe compound surrounded by high risk and low risk compounds inside similarity shells at 0.9, 0.8 and 0.7, drawn once for HLM and once for RRCK.]
     • HLM: Ratio_0.9 = 4/5 = 0.8; Ratio_0.8 = 8/12 = 0.67
     • RRCK: Ratio_0.9 = 4/5 = 0.8; Ratio_0.8 = 9/12 = 0.75
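The ratio on this slide can be illustrated with a small Python sketch. The function name `neighbor_ratio` and the neighbor lists are hypothetical; the lists are made up only so that the counts match the slide (4 of 5 at cutoff 0.9, then 8 of 12 for HLM and 9 of 12 for RRCK at cutoff 0.8). Returning 0.0 for an empty neighborhood is also an assumption of this sketch.

```python
def neighbor_ratio(probe_class, neighbors, cutoff):
    """Fraction of neighbors at or above `cutoff` that share the probe's class.

    `neighbors` is a list of (similarity_to_probe, risk_class) pairs.
    Returns 0.0 when no neighbor passes the cutoff (assumption of this sketch).
    """
    selected = [cls for sim, cls in neighbors if sim >= cutoff]
    if not selected:
        return 0.0
    same = sum(1 for cls in selected if cls == probe_class)
    return same / len(selected)


# Hypothetical neighborhoods of a high risk probe compound.
hlm_neighbors = (
    [(0.92, "high")] * 4 + [(0.91, "low")] * 1    # 5 neighbors at >= 0.9
    + [(0.85, "high")] * 4 + [(0.82, "low")] * 3  # 7 more at >= 0.8
)
rrck_neighbors = (
    [(0.92, "high")] * 4 + [(0.91, "low")] * 1    # 5 neighbors at >= 0.9
    + [(0.85, "high")] * 5 + [(0.82, "low")] * 2  # 7 more at >= 0.8
)

for name, neighbors in [("HLM", hlm_neighbors), ("RRCK", rrck_neighbors)]:
    for cutoff in (0.9, 0.8):
        ratio = neighbor_ratio("high", neighbors, cutoff)
        print(f"{name} ratio at {cutoff}: {ratio:.2f}")
```

Lowering the cutoff widens the neighborhood, so the two endpoints can agree at 0.9 and diverge at 0.8, which is the effect the slide points out.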
  6. Hypothesis: Visualizing Chemical Landscape
     [Figure: ratio (y-axis, 0.5 to 1.0) plotted against similarity cutoff (x-axis, 0.2, 0.5, 0.8, 1), with one curve each for Endpoint1, Endpoint2, Endpoint3 and Endpoint4.]
     • Identical compounds: ratio ~1.0
     • Ratio = f(endpoint, similarity)
     • Ratio ~ high (low) risk compounds / total compounds
  7. Datasets, Assays and Bins
     Endpoint     Description                                                Result unit                          Low Risk  High Risk
     RRCK         Passive permeability in RRCK cell line                     10^-6 cm/sec                         >10       <=10
     HLM          Metabolic stability using human liver microsomes           µL/min/mg                            <20       >=20
     MDR          Pgp-influenced permeability and efflux in MDCK-MDR1 cells  10^-6 cm/sec                         >10       <=10
     CYP1A2       CYP1A2 inhibition in a substrate cocktail assay            % Inhibition                         <10       >=10
     CYP3A4       CYP3A4 inhibition in a substrate cocktail assay            % Inhibition                         <10       >=10
     CYP2D6       CYP2D6 inhibition in a substrate cocktail assay            % Inhibition                         <10       >=10
     CYP2C9       CYP2C9 inhibition in a substrate cocktail assay            % Inhibition                         <10       >=10
     *Solubility  ADMET aqueous solubility properties                        Solubility level                     >2        <=2
     *cLogP       Logarithm of the partition coefficient                     Octanol-water partition coefficient  <3        >=3
     • The full matrix consists of 17787 compounds and 9 endpoints.
     • *Solubility and cLogP are predicted endpoints using in-house computational models on datasets with more than 10K compounds; the rest are experimental results.
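As a sketch of how the bins in this table could be applied, the following Python snippet encodes the low risk rule of each endpoint and assigns a class to a measured or predicted value. The thresholds are transcribed from the table; the names `LOW_RISK_RULES` and `risk_class` are hypothetical and not part of the original work.

```python
# Low risk rule per endpoint: (comparison, threshold), transcribed from the table.
# Anything that does not satisfy the low risk rule is binned as high risk.
LOW_RISK_RULES = {
    "RRCK": (">", 10.0),       # 10^-6 cm/sec
    "HLM": ("<", 20.0),        # uL/min/mg
    "MDR": (">", 10.0),        # 10^-6 cm/sec
    "CYP1A2": ("<", 10.0),     # % inhibition
    "CYP3A4": ("<", 10.0),     # % inhibition
    "CYP2D6": ("<", 10.0),     # % inhibition
    "CYP2C9": ("<", 10.0),     # % inhibition
    "Solubility": (">", 2.0),  # predicted solubility level
    "cLogP": ("<", 3.0),       # predicted
}


def risk_class(endpoint, value):
    """Return 'low' or 'high' for one endpoint value using the table's bins."""
    comparison, threshold = LOW_RISK_RULES[endpoint]
    is_low = value > threshold if comparison == ">" else value < threshold
    return "low" if is_low else "high"


print(risk_class("RRCK", 12.5))  # permeability above 10 is binned as low risk
print(risk_class("HLM", 20.0))   # the boundary value falls in the high risk bin
```

Boundary values fall into the high risk bin, matching the `<=` and `>=` entries in the High Risk column.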
  8. Characterize Chemical Landscape: Proposed Workflow*
     Workflow for plotting the landscape of an endpoint using FCFP6 and Tanimoto as the similarity measurement:
     • Start from the full matrix (cmpd*endpoint) and the FCFP6/Tanimoto similarity matrix.
     • Select all high/low risk compounds in an endpoint.
     • Select one similarity cutoff (iterate over all cutoffs, 14 in total).
     • Select one compound (iterate over all high risk compounds).
     • Calculate the ratio of each compound:
       $$\mathrm{Ratio}_{\text{similarity}} = \frac{\#\,\text{neighbors with same class}}{\text{total}\ \#\,\text{neighbors}}$$
     • Average the ratio of all the compounds.
     • Plot: similarity cutoff & ratio.
     Dimensions explored:
     • Structure similarity
       • Fingerprint (4): MDL public keys, atom pairs, FCFP6, ECFC4
       • Coefficient (2): Tanimoto, cosine
     • Risk categorization (2): high risk, low risk
     • Endpoints (9)
     • Complexity: 4*2*2*9 = 144
     *Molecular Similarity Characterization of ADME Landscapes; Chen et al., JCIM, Submitted, 2010
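The inner loops of this workflow can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: fingerprints are represented as plain sets of on-bits instead of real FCFP6 fingerprints, the compound IDs, bit sets, class labels and cutoffs are invented, and `tanimoto` and `ratio_table` are hypothetical names. The sketch stops at the per-compound ratio table and leaves the averaging and plotting steps out.

```python
from itertools import product


def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def ratio_table(fingerprints, classes, target_class, cutoffs):
    """Per-compound same-class neighbor ratio at each similarity cutoff.

    Returns {compound_id: [ratio at cutoffs[0], ratio at cutoffs[1], ...]}
    for every compound whose class equals `target_class`.
    A compound with no neighbor at a cutoff gets 0.0 (assumption).
    """
    table = {}
    for probe, probe_fp in fingerprints.items():
        if classes[probe] != target_class:
            continue
        # Similarity of the probe to every other compound, with that compound's class.
        sims = [
            (tanimoto(probe_fp, fp), classes[other])
            for other, fp in fingerprints.items()
            if other != probe
        ]
        row = []
        for cutoff in cutoffs:
            selected = [cls for sim, cls in sims if sim >= cutoff]
            same = sum(1 for cls in selected if cls == target_class)
            row.append(same / len(selected) if selected else 0.0)
        table[probe] = row
    return table


# Toy data (hypothetical): four compounds, one endpoint.
fingerprints = {
    "PF_1": {1, 2, 3, 4},
    "PF_2": {1, 2, 3, 5},
    "PF_3": {1, 2, 6, 7},
    "PF_4": {8, 9},
}
classes = {"PF_1": "high", "PF_2": "high", "PF_3": "low", "PF_4": "high"}
table = ratio_table(fingerprints, classes, "high", cutoffs=[0.6, 0.3])

# The slide's complexity count: every fingerprint/coefficient/class/endpoint combination.
FINGERPRINTS = ["MDL public keys", "Atom pairs", "FCFP6", "ECFC4"]
COEFFICIENTS = ["Tanimoto", "Cosine"]
RISK_CLASSES = ["high", "low"]
ENDPOINTS = ["RRCK", "HLM", "MDR", "CYP1A2", "CYP3A4", "CYP2D6", "CYP2C9",
             "Solubility", "cLogP"]
configurations = list(product(FINGERPRINTS, COEFFICIENTS, RISK_CLASSES, ENDPOINTS))
print(len(configurations))  # 4 * 2 * 2 * 9 landscapes to compute
```

Computing the similarities once per probe and reusing them across cutoffs keeps the cutoff loop cheap; with 17787 compounds a real implementation would more likely work from a precomputed similarity matrix, as the slide suggests.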
  9. What are we evaluating?
     Compound ID   Similarity 0.9  Similarity 0.8  Similarity 0.7  …
     PF_1          0.9             0.9             0.7             …
     PF_2          1               0.5             0.7             …
     PF_3          0.95            0.8             0.7             …
     …             …               …               …               …
     PF_N          0.91            0.85            0.68            …
     average       0.95            0.85            0.7             …
     • Calculate the ratio of each compound individually.
     • Average the ratios of all the compounds at each similarity threshold, ignoring ratios of 0 (either no same-class neighbor or no neighbor at all).
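The averaging rule can be sketched as a column-wise mean that skips zero ratios. The first three rows below are the ones shown in the table; `PF_X` is a hypothetical compound added only to show a zero ratio being ignored, and `average_nonzero` is an illustrative name.

```python
def average_nonzero(ratios):
    """Mean of the ratios, ignoring zeros (no neighbor or no same-class neighbor)."""
    kept = [r for r in ratios if r > 0]
    return sum(kept) / len(kept) if kept else 0.0


# Rows: compound -> ratios at similarity 0.9, 0.8, 0.7.
rows = {
    "PF_1": [0.9, 0.9, 0.7],
    "PF_2": [1.0, 0.5, 0.7],
    "PF_3": [0.95, 0.8, 0.7],
    "PF_X": [0.0, 0.0, 0.7],  # hypothetical: no usable neighbors at 0.9 and 0.8
}

# zip(*rows.values()) turns the rows into one tuple of ratios per cutoff.
averages = [average_nonzero(column) for column in zip(*rows.values())]
for cutoff, avg in zip((0.9, 0.8, 0.7), averages):
    print(f"similarity {cutoff}: average ratio {avg:.3f}")
```

Skipping zeros means the average at a strict cutoff is taken only over compounds that actually have same-class neighbors there, so sparse neighborhoods do not drag the curve toward zero.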
  10. Results: Compare Different Endpoints
     [Figure panels: (a) ECFC4, Tanimoto, low risk; (b) ECFC4, Tanimoto, high risk.]
     • The rate of "fall" of a given curve indicates how easy or difficult it would be to modify a compound and change its property, i.e. to transform a compound from high risk to low risk or vice versa.
     • Compounds in MDR are relatively difficult to move out of the high risk class compared to HLM at any given similarity cutoff.
     • The ratio stays constant after a certain similarity threshold (e.g. 0.4 in the case of CYP2C9).
  11. Results: Compare Different Fingerprints*
     [Figure panels: (a) RRCK high risk; (b) RRCK low risk.]
     • The ratio differs among fingerprints; the order is always FCFP6 > atom pairs > ECFC4 > MDL.
     *Molecular Similarity Characterization of ADME Landscapes; Chen et al., JCIM, Submitted, 2010
  12. Results: Compare different similarity coefficients
     [Figure panels: (a) RRCK low risk; (b) RRCK high risk.]
     • The ratio differs among similarity coefficients; the order is always Tanimoto > cosine.
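For reference, the two coefficients can be written out for binary fingerprints as below. This is a sketch that assumes fingerprints are sets of on-bits (count fingerprints such as ECFC4 would need the vector forms instead), and the bit sets are invented. For binary fingerprints the cosine value is never smaller than the Tanimoto value for the same pair, so a fixed cutoff admits at least as many neighbors under cosine; that may contribute to the lower cosine ratios, although the slide itself does not give a reason.

```python
import math


def tanimoto(a, b):
    """Tanimoto: shared on-bits divided by on-bits present in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def cosine(a, b):
    """Cosine for binary fingerprints: shared on-bits over sqrt(|a| * |b|)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


# Hypothetical fingerprints as sets of on-bit indices.
fp_a = {1, 2, 3, 4}
fp_b = {1, 2, 3, 5}

print(f"Tanimoto: {tanimoto(fp_a, fp_b):.2f}")  # 3 shared bits / 5 bits in the union
print(f"Cosine:   {cosine(fp_a, fp_b):.2f}")    # 3 shared bits / sqrt(4 * 4)
```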
  13. Use Case: Which one is better to optimize?
     [Figure: two starting compounds (structures not recoverable from the transcript). One is labeled MDR: LOW, RRCK: HIGH; the other is labeled MDR: HIGH, RRCK: LOW. Each points to a modified structure labeled MDR: LOW?, RRCK: LOW?]
     • Probability of success?
  14. Use Case: Data Driven Compound Prioritization?
     $$\text{ADMET score} = \frac{\sum_{i=1}^{l} E_i(\mathrm{ratio}) + \sum_{j=1}^{h} \bigl(1 - E_j(\mathrm{ratio})\bigr)}{l + h}$$
     where $l$ and $h$ are the numbers of low risk and high risk endpoints of the compound.
     Compds     CYP2D6   CYP2C9   CYP1A2   CYP3A4   Aq. Sol. cLogP    RRCK     MDR      HLM      # High Risk  SCORE
     Compound1  -        -        +        -        -        -        -        -        -        1            0.688
     Compound2  +        -        -        -        -        -        -        -        -        1            0.694
     Compound3  -        -        -        -        -        -        -        +        +        2            0.623
     Compound4  -        -        -        +        +        -        +        -        -        3            0.627
     + and - represent high risk and low risk endpoint, respectively
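One reading of the scoring function on this slide is an average over endpoints in which a low risk endpoint contributes its averaged ratio and a high risk endpoint contributes one minus its averaged ratio. The Python sketch below implements that reading. The interpretation of E(ratio) as the averaged landscape ratio of an endpoint at a chosen similarity cutoff is an assumption, the ratio values are hypothetical, and `admet_score` is an illustrative name.

```python
def admet_score(profile, low_ratio, high_ratio):
    """Average per-endpoint contribution for one compound.

    profile:    {endpoint: '+' (high risk) or '-' (low risk)}
    low_ratio:  {endpoint: averaged ratio of the low risk landscape}
    high_ratio: {endpoint: averaged ratio of the high risk landscape}
    """
    total = 0.0
    for endpoint, flag in profile.items():
        if flag == "-":
            # Low risk: a high ratio means similar compounds tend to stay low risk.
            total += low_ratio[endpoint]
        else:
            # High risk: a high ratio means the class is hard to leave, so penalize it.
            total += 1.0 - high_ratio[endpoint]
    return total / len(profile)


# Hypothetical averaged ratios at one similarity cutoff for two endpoints.
low_ratio = {"RRCK": 0.8, "MDR": 0.7}
high_ratio = {"RRCK": 0.6, "MDR": 0.9}

score = admet_score({"RRCK": "-", "MDR": "+"}, low_ratio, high_ratio)
print(f"score: {score:.2f}")  # (0.8 + (1 - 0.9)) / 2
```

Under this reading a higher score marks a compound whose low risk endpoints are stable and whose high risk endpoints are comparatively easy to move out of.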
  15. Potential Combinations
     • 4 descriptor types are used
     • 2 similarity metrics are used
     • 9 endpoints, each either high or low risk, give 2^9 = 512 combinations.
     • Overlap means that some compounds with more high risk endpoints should be prioritized ahead of compounds with fewer, e.g. with MDL + Tanimoto coefficient.
  16. Results: Ranking matrix
     CYP2C9   CYP2D6   CYP1A2   CYP3A4   Aq. Sol. cLogP    RRCK     MDR      HLM      # high endpoints  Score at similarity 0.5  Score at similarity 0.6
     +        +        +        +        +        +        +        +        +        9                 0.326676                 0.275558
     -        +        +        +        +        +        +        +        +        8                 0.372088                 0.333456
     +        -        +        +        +        +        +        +        +        8                 0.372646                 0.332717
     -        -        +        +        +        +        +        +        +        7                 0.418058                 0.390616
     +        +        -        +        +        +        +        +        +        8                 0.374459                 0.336353
     ...      ...      ...      ...      ...      ...      ...      ...      ...      ...               ...                      ...
     -        -        +        -        -        -        -        -        -        1                 0.679591                 0.714969
     +        +        -        -        -        -        -        -        -        2                 0.635992                 0.660706
     -        +        -        -        -        -        -        -        -        1                 0.681403                 0.718605
     +        -        -        -        -        -        -        -        -        1                 0.681962                 0.717866
     -        -        -        -        -        -        -        -        -        0                 0.727373                 0.775765
     • + and - represent high risk and low risk endpoint, respectively
     • In total, 9 endpoints and 512 combinations
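A ranking matrix of this shape can be generated by enumerating every high/low pattern over the nine endpoints and scoring each one. The sketch below assumes the score is the endpoint average of the low risk ratio (for -) or one minus the high risk ratio (for +); the per-endpoint ratios are uniform placeholder values, not the values behind the table, and `ranking_matrix` is a hypothetical name.

```python
from itertools import product

ENDPOINTS = ["CYP2C9", "CYP2D6", "CYP1A2", "CYP3A4", "Aq. Sol.",
             "cLogP", "RRCK", "MDR", "HLM"]


def ranking_matrix(low_ratio, high_ratio):
    """Score all 2^9 risk patterns and sort them from best to worst score.

    Each row is (flags, number_of_high_risk_endpoints, score), where flags
    is a tuple of '+' (high risk) or '-' (low risk), one per endpoint.
    """
    rows = []
    for flags in product("+-", repeat=len(ENDPOINTS)):
        total = 0.0
        for endpoint, flag in zip(ENDPOINTS, flags):
            if flag == "+":
                total += 1.0 - high_ratio[endpoint]
            else:
                total += low_ratio[endpoint]
        rows.append((flags, flags.count("+"), total / len(ENDPOINTS)))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Placeholder ratios (hypothetical): the same value for every endpoint.
low_ratio = {endpoint: 0.7 for endpoint in ENDPOINTS}
high_ratio = {endpoint: 0.6 for endpoint in ENDPOINTS}

matrix = ranking_matrix(low_ratio, high_ratio)
print(len(matrix))                                 # number of risk patterns
print(" ".join(matrix[0][0]), matrix[0][1])        # best pattern and its high risk count
print(" ".join(matrix[-1][0]), matrix[-1][1])      # worst pattern and its high risk count
```

With endpoint-specific ratios, as in the table, patterns with the same number of high risk endpoints receive different scores, and a pattern with more high risk endpoints can in principle outrank one with fewer.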
  17. Conclusion
     • Small structural changes can result in a change of class (high/low risk) within a given endpoint.
     • Different endpoints behave differently from each other, e.g. MDR may be more difficult to modify than CYP2C9.
     • Curves are relatively parallel to each other, independent of descriptor and similarity metric.
     • A scoring function was derived from the plots to prioritize compounds (for screening or series selection).
     • Ratios could be used to differentiate between "difficult" and "easy" endpoints.
     [Figure: ratio (y-axis, 0.5 to 1.0) against similarity cutoff (x-axis, 0.2, 0.5, 0.8, 1), with one curve labeled "Difficult" and one labeled "Easy".]
  18. Reference
     • Martin YC, et al. Do Structurally Similar Molecules Have Similar Biological Activity? J. Med. Chem. 2002, 45, 4350-4358
     • Medina-Franco JL, et al. Characterization of Activity Landscapes Using 2D and 3D Similarity Methods: Consensus Activity Cliffs. J. Chem. Inf. Model. 2009, 49, 477-491
     • Segall MD, et al. Focus on Success: Using a Probabilistic Approach to Achieve an Optimal Balance of Compound Properties in Drug Discovery. Expert Opin. Drug Metab. Toxicol. 2006, 2, 325-37
  19. Acknowledgement
     • David Wild (School of Informatics and Computing, Indiana University)
     • Veerabahu Shanmugasundaram (AB RU)
     • Robyn Ayscue
     • Hua Gao
  20. Thanks. Questions and Comments
  21. Results
     [Figure panels: RRCK, ECFC4, Tanimoto, high risk; RRCK, ECFC4, Tanimoto, low risk.]
     Heatmap for ratios of all compounds at 14 similarity cutoffs
  22. Discussion & further work
     • Normal distribution
     • Outliers analysis
     • Ranking function validation
     • Implementation: by virtue of the full matrix and the ADME predictive models, any given compound can be assigned a score for prioritization.
  23. Backup: Normal distribution
     [Figure: histogram of compound counts (y-axis, 0 to 700) per binned ratio (x-axis, 0 to 1 in steps of 0.05) for two series: RRCK, ECFC4, high, similarity 0.85 and RRCK, ECFC4, high, similarity 0.65.]