Numerical Characterization of Structure-Activity Relationships from a Medicinal Chemists Point of View
 

    Presentation Transcript

    • Numerical Characterization of Structure-Activity Relationships from a Medicinal Chemists Point of View (running title: Defining & Using Structure-Activity Landscapes). Rajarshi Guha, School of Informatics, Indiana University. National Medicinal Chemistry Symposium, Pittsburgh, PA, 15th June, 2008. Outline: Background; Visualization; Utilization; Predictive Models; 3D models; Chemical spaces; Summary.
    • Structure Activity Relationships. Assumptions: Similar molecules will have similar activities. Small changes in structure will lead to small changes in activity. One implication is that SAR’s are additive. This is the basis for QSAR modeling. Martin, Y.C. et al., J. Med. Chem., 2002, 45, 4350–4358
    • Structure Activity Landscapes. Melanocortin-4 receptor inhibitors. [Figure: four closely related structures with Ki = 39.0 nM, Ki = 1.8 nM, Ki = 10.0 nM and Ki = 1.0 nM] Tran, J.A. et al., Bioorg. Med. Chem. Lett., 2007, 15, 5166–5176
    • Structure Activity Landscapes. Rugged gorges or rolling hills? Small structural changes associated with large activity changes represent steep slopes in the landscape: activity cliffs. But traditionally, QSAR assumes gentle slopes. Machine learning is not very good for special cases. Maggiora, G.M., J. Chem. Inf. Model., 2006, 46, 1535–1535
    • Characterizing the Landscape. Converting activity cliffs to numbers: A cliff can be numerically characterized by the Structure-Activity Landscape Index (SALI), $\mathrm{SALI}_{i,j} = \frac{|A_i - A_j|}{1 - \mathrm{sim}(i,j)}$, where $A_i$ and $A_j$ are the activities of molecules $i$ and $j$ and $\mathrm{sim}(i,j)$ is their similarity. Cliffs are characterized by elements of the matrix with very large values. Guha, R.; Van Drie, J.H., J. Chem. Inf. Model., 2008, 48, 646–658
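The SALI definition above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the fingerprints are assumed to be sets of on-bit indices, the similarity is assumed to be Tanimoto, the handling of identical fingerprints (infinite SALI when activities differ) is a choice made here, and the three toy compounds are invented.

```python
import numpy as np

def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 1.0

def sali_matrix(fingerprints, activities):
    """Symmetric matrix of SALI_ij = |A_i - A_j| / (1 - sim(i, j))."""
    n = len(activities)
    sali = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            diff = abs(activities[i] - activities[j])
            denom = 1.0 - tanimoto(fingerprints[i], fingerprints[j])
            if denom > 0:
                value = diff / denom
            else:
                # Identical fingerprints: assumed convention, not from the slides.
                value = np.inf if diff > 0 else 0.0
            sali[i, j] = sali[j, i] = value
    return sali

# Toy data (hypothetical): three compounds with made-up activities.
fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8, 9}]
acts = [7.4, 9.0, 7.5]
S = sali_matrix(fps, acts)
```

In this toy set the first two compounds are similar (Tanimoto 0.6) but differ strongly in activity, so their SALI value is the largest entry of the matrix.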
    • Visualizing the SALI Matrix. [Figure]
    • Visualizing SALI Values. Alternatives? A heatmap is an easy to understand visualization. Coupled with brushing, it can be a handy tool. A more flexible approach is to consider a network view of the matrix: the SALI graph. Compounds are nodes. Nodes i, j are connected if $\mathrm{SALI}_{i,j} > X$. Only display connected nodes.
    • Visualizing the SALI Graph. [Figure: SALI graph with numbered compound nodes] Nodes are ordered such that the tail node in an edge has lower activity than the head node.
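A minimal sketch of the graph construction: connect compounds whose SALI value exceeds a cutoff and orient each edge from the less active to the more active compound. It assumes activities are on a scale where a larger number means more active (for example a negative log of Ki or IC50); the function name and the toy matrix are hypothetical.

```python
def sali_graph(sali, activities, cutoff):
    """Directed edges (tail, head) for pairs with SALI > cutoff.

    The tail node has lower activity than the head node. Compounds that
    take part in no edge simply do not appear in the result.
    """
    edges = []
    n = len(activities)
    for i in range(n):
        for j in range(i + 1, n):
            if sali[i][j] > cutoff and activities[i] != activities[j]:
                if activities[i] < activities[j]:
                    edges.append((i, j))
                else:
                    edges.append((j, i))
    return edges

# Toy SALI matrix and activities (hypothetical values).
toy_sali = [[0.0, 4.0, 0.1],
            [4.0, 0.0, 1.5],
            [0.1, 1.5, 0.0]]
toy_acts = [7.4, 9.0, 7.5]

loose = sali_graph(toy_sali, toy_acts, cutoff=1.0)   # [(0, 1), (2, 1)]
strict = sali_graph(toy_sali, toy_acts, cutoff=2.0)  # [(0, 1)]
```

Raising the cutoff prunes the graph down to the most significant cliffs, which is the behaviour exploited when examining graphs at different cutoffs.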
    • Better Visualization. SALIViewer: Java application for generating and visualizing SALI graphs. Create SALI graphs from SMILES and activity data, using the CDK fingerprints. Easily examine SALI graphs at different cutoffs. Provides 2D depictions for nodes and edges. Generate SALI curves. http://cheminfo.informatics.indiana.edu/~rguha/code/java/salivis
    • Better Visualization - SALIViewer. [Figure]
    • Varying Fingerprint Methods. [Figure: density of Tanimoto similarity values in three panels: BCI 1052 bit, MACCS 166 bit, CDK 1024 bit] Shorter fingerprints will lead to more “similar” pairs. Requires a higher cutoff to focus on significant cliffs.
    • Varying the Similarity Metric. [Figure] The similarity metric does not affect the SALI values.
    • SALI Graphs & Predictive Models. The graph view allows us to view SAR’s and identify trends easily. The aim of a QSAR model is to encode SAR’s. Traditionally, we consider the quality of a model in terms of RMSE or R². But in general, we’re not as interested in RMSE’s as we are in whether the model predicted something as more active than something else. What we want to have is the correct ordering. We assume the model is statistically significant.
    • SALI Graphs & Predictive Models. Measuring model quality: A QSAR model should easily encode the “rolling hills”. A good model captures the most significant cliffs. Can be formalized as: How many of the edge orderings of a SALI graph does the model predict correctly? Define S(X), representing the number of edges correctly predicted for a SALI network at a threshold X. Repeat for varying X and obtain the SALI curve.
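One way to turn the edge-ordering question into a number is sketched below. The normalization is an assumption: because the SALI curves in this deck are plotted between −1 and 1, the sketch scores each edge +1 if the model orders it correctly and −1 if it reverses it, then divides by the number of edges. The function name and toy predictions are hypothetical.

```python
def sali_curve_point(edges, predicted):
    """S(X) for one SALI graph: (correct - incorrect) / number of edges.

    `edges` holds (tail, head) pairs where the head is the more active
    compound; `predicted` maps compound index to predicted activity.
    Returns None for an empty graph, where the score is undefined.
    """
    if not edges:
        return None
    score = 0
    for tail, head in edges:
        if predicted[tail] < predicted[head]:
            score += 1   # ordering reproduced by the model
        elif predicted[tail] > predicted[head]:
            score -= 1   # ordering reversed by the model
    return score / len(edges)

# Toy graph (hypothetical): edges 0 -> 1 and 2 -> 1.
toy_edges = [(0, 1), (2, 1)]
good_model = sali_curve_point(toy_edges, [7.0, 9.1, 8.0])  # both edges correct
poor_model = sali_curve_point(toy_edges, [7.0, 8.5, 8.8])  # one right, one wrong
```

Evaluating this function on the graphs obtained at a series of cutoffs X gives the points of the SALI curve.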
    • SALI Curves - An Example. [Figure: S(X) versus X for three models: 3-descriptor, 5-descriptor and scrambled 3-descriptor; S(X) runs from −1.0 to 1.0 and X from 0.0 to 1.0]
    • SALI Curves & Model Comparison. Considered four datasets. Developed linear regression models, using exhaustive search for feature selection. Identify three models for each dataset: minimum RMSE (“best”), median RMSE, maximum RMSE (“worst”). Generate SALI curves for each model and summarize by dataset. Guha, R.; Van Drie, J.H., J. Chem. Inf. Model., submitted
    • SALI Curves & Model Comparison. [Figure: SALI curves, S(X) versus X, for the Min-RMSE, Median-RMSE and Max-RMSE models in four panels: PDGFR (3), Artemisinin (3), Melanocortin (6), Benzodiazepine (5)]
    • SALI Curves & Model Comparison. SALI Curve Integral: The initial and final portions of the curve are of interest. It’s also useful to summarize the whole curve. We evaluate the area between the curve and the X-axis (SCI), with −1 ≤ SCI ≤ 1. [Figure: example SALI curve, S(X) versus X, with SCI = 0.12]
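The SCI can be approximated numerically from sampled points of the curve. The sketch below assumes a signed area (portions of the curve below the X-axis count negatively, which is consistent with the stated range of −1 to 1), a cutoff axis scaled to [0, 1], and the trapezoidal rule; none of these details are confirmed by the slides.

```python
def sali_curve_integral(xs, s_values):
    """Signed trapezoidal area between the SALI curve and the X-axis.

    `xs` are cutoffs in increasing order, assumed to span [0, 1];
    `s_values` are the matching S(X) values in [-1, 1].
    """
    area = 0.0
    for k in range(1, len(xs)):
        width = xs[k] - xs[k - 1]
        area += 0.5 * (s_values[k] + s_values[k - 1]) * width
    return area

# Toy curve (hypothetical): S(X) rises linearly from 0 to 1.
sci_rising = sali_curve_integral([0.0, 0.5, 1.0], [0.0, 0.5, 1.0])
# A model that reverses every edge at every cutoff.
sci_worst = sali_curve_integral([0.0, 1.0], [-1.0, -1.0])
```

A single number per model makes it straightforward to compare many models or datasets side by side.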
    • SALI Curves & Model Comparison. [Figure: SCI values (axis from −0.1 to 0.6) for the Min-RMSE, Median-RMSE and Max-RMSE models on the PDGFR, Artemisinin, Melanocortin and Benzodiazepine datasets]
    • Examining Any Type of Model . . . Previous examples make use of predicted values from QSAR models. We can consider any “prediction” that is supposed to track observed activity: ranks, energies. Allows us to apply this approach to any type of computational model that predicts something: docking, CoMFA, pharmacophore.
    • Docking & CoMFA Models. [Figure: SALI curves, S(X) versus X, in two panels: Docking (SCI = 0.95) and CoMFA (SCI = 0.99)] Not surprising that 3D models capture more cliffs. The CoMFA model is nearly perfect! Holloway, K. et al., J. Med. Chem., 1995, 38, 305–317. Cavalli, A. et al., J. Med. Chem., 2002, 45, 3844–3853
    • Comparing Landscapes. The SALI curve is a function of the dataset and the descriptor space. We can quantify a descriptor space’s ability to encode the structure-activity landscape using SALI graphs: What is the size of the graph as a function of SALI cutoff? The SALI approach allows us to investigate molecular representations that may not be directly accessible. Work in progress.
    • Comparing Landscapes. [Figure: normalized number of cliffs (0.0 to 1.0) versus SALI cutoff (0.0 to 1.0) for three representations: Fingerprint, Best 3-Descriptor, Worst 3-Descriptor] A Type 2 SALI curve for the PDGFR dataset, comparing 3 different molecular representations.
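A curve of graph size against cutoff can be sketched as follows. The normalizations are assumptions made for illustration: the cutoff is taken as a fraction of the largest SALI value, and the count is divided by the total number of compound pairs. The slides do not specify either choice, and the function name and toy values are hypothetical.

```python
def cliff_count_curve(sali_values, cutoffs):
    """Fraction of compound pairs whose SALI exceeds each cutoff.

    `sali_values` is a flat list of finite pairwise SALI values and
    `cutoffs` are fractions of the maximum SALI value.
    """
    top = max(sali_values)
    total = len(sali_values)
    return [sum(v > c * top for v in sali_values) / total for c in cutoffs]

# Toy pairwise SALI values (hypothetical).
pair_values = [4.0, 0.1, 1.5]
curve = cliff_count_curve(pair_values, [0.0, 0.3, 0.5, 1.0])
```

Computing this curve with SALI values derived from different molecular representations of the same dataset gives one curve per representation, which is the kind of comparison the figure describes.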
    • What’s Next? SALI graphs and curves represent a framework for exploring structure-∗ landscapes. Open questions: Weighted SALI graphs (ADMET, synthetic feasibility). Is it correct to identify cliffs using fingerprints, and then predict cliffs using different descriptors? Can we use SALI curves to compare 3D and 2D descriptor spaces? Can we use SCI for feature selection?
    • Conclusions. The SALI is an effective way to numerically encode activity cliffs. The network view of these values allows us to explore SAR’s in an intuitive way. Using the SALI curve allows us to compare predictive models in a manner that is intuitive for a medicinal chemist.
    • Acknowledgments. John Van Drie
    • Making Use of the SALI Graph. A little difficult with a non-interactive graph. We can investigate a series of transformations that increase (or decrease) activity. Identify two types of SAR’s: broad and detailed. Depends on what cutoff we choose. These correspond somewhat to the continuous and discontinuous SAR’s described by Peltason et al. Peltason, L. et al., J. Med. Chem., 2007, 50, 5571–5578
    • Glucocorticoid Inhibitors. 62 dihydroquinoline derivatives. IC50’s reported, some values were censored. [Figure: 50% SALI graph generated using 1052 bit BCI fingerprints, with nodes labeled by compound ID (06-16 through 07-42)] Takahashi, H. et al., Bioorg. Med. Chem. Lett., 2007, 17, 5091–5095
    • Glucocorticoid Inhibitors. [Figure: structures of 07-20 (2000 nM), 07-23 (2000 nM) and 07-17 (355 nM)] Moving from allyl or phenylethyl to ethyl causes a 6-fold increase in activity. Reducing bulk at this position seems to improve activity. Pretty broad conclusion. But ethyl is not much smaller than allyl. We need more detail.
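As a quick check of the fold change quoted on this slide, using the values shown for 07-20 and 07-17 and assuming, as is usual for IC50 data, that a lower value means higher activity:

```latex
\[
\frac{\mathrm{IC}_{50}(\text{07-20})}{\mathrm{IC}_{50}(\text{07-17})}
  = \frac{2000\ \mathrm{nM}}{355\ \mathrm{nM}}
  \approx 5.6
\]
```

This rounds to the roughly 6-fold improvement stated in the text. The same ratio applies to 07-23, which is also listed at 2000 nM.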
    • Glucocorticoid Inhibitors. [Figure: SALI graph with nodes labeled by compound ID (06-16 through 07-42)] Generated using a 30% cutoff.
    • Glucocorticoid Inhibitors. [Figure: portion of the SALI graph containing compounds 07-19, 07-20, 07-23, 06-37, 07-12, 06-44, 07-26, 07-27, 07-25, 07-21, 07-29, 07-18, 07-22, 06-36, 07-13, 07-14, 07-37, 07-17 and 07-30]
    • Glucocorticoid Inhibitors. [Figure: structures of 07-15 (2000 nM), 07-20 (2000 nM), 07-18 (710 nM) and 07-17 (355 nM)] Suggests that electron density is also important. Lower π density possibly correlates to increased activity. Confirmed by 07-23 → 07-18. 07-15 → 07-17 is interesting since the change increases the bulk.
    • Glucocorticoid Inhibitors. These observations match those made by Takahashi et al. More detailed graphs exhibit longer paths that focus on the bulk of side chains at the C4-α position. A number of paths consider changes to the epoxide substitution, usually of length 1. Highlights the fact that bulk at the C4-α position has greater impact on activity than epoxide substitutions. The SALI graph stresses the non-linearity of the SAR.
    • SALI Curves - Control Experiments. Scrambling: Scramble the Y-variable and rebuild the model. Evaluate the SALI curve. Repeat 50 times and take the mean of the counts for a given cutoff. Noise: Add uniform noise to each descriptor, rebuild the model. We expect little variation in the plateau. [Figures: S(X) versus X for the scrambled models, and for noise levels labeled No noise, 10, 50, 100 and 200]
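The scrambling control can be sketched for a single cutoff as below. Everything specific here is an assumption for illustration: the model is an ordinary least-squares fit via NumPy, the data are synthetic, every ordered pair of compounds stands in for the edges of a SALI graph, and S is normalized as (correct − incorrect) / edges.

```python
import numpy as np

def fit_predict(X, y):
    """Ordinary least squares with an intercept; returns the fitted values."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef

def s_value(edges, predicted):
    """(correct - incorrect) / edges, for edges given as (tail, head)."""
    score = sum(float(np.sign(predicted[h] - predicted[t])) for t, h in edges)
    return score / len(edges)

def scrambled_s(X, y, edges, n_repeats=50, seed=0):
    """Mean S over models rebuilt on randomly permuted activities."""
    rng = np.random.default_rng(seed)
    values = [s_value(edges, fit_predict(X, rng.permutation(y)))
              for _ in range(n_repeats)]
    return float(np.mean(values))

# Synthetic data (hypothetical): 20 compounds, 3 descriptors.
rng = np.random.default_rng(42)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.01, size=20)

# Stand-in for a SALI graph: every pair, oriented from lower to higher activity.
edges = [(i, j) for i in range(20) for j in range(20) if y[i] < y[j]]

s_real = s_value(edges, fit_predict(X, y))
s_scrambled = scrambled_s(X, y, edges)
```

A model built on the real activities should score well above the scrambled average; if the two are close, the apparent ordering ability is not meaningful.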