1. SAR Contour Landscape (SARCL) Maps.
Useful for Compound Organization
and Activity Prediction?
James T. Metz, Ph.D.
Abbott Laboratories
R4DG - Dept. of Discovery Technologies
(Cheminformatics)
Accelrys 2012 US User Group Meeting
May 9, 2012
Boston Marriott Longwharf Hotel
Boston, MA
2. SAR Contour Landscape Maps – Motivation?
[Structures of cpds # 2191, # 762, and # 41 shown on slide]
Cpd # 2191 (see note A): pKd (DYRK1A) = 4.50
Cpd # 762: pKd (DYRK1A) = 7.20
Cpd # 41: pKd (DYRK1A) = 8.10
+ 427 other cpds with experimentally measured DYRK1A kinase activities
If I can generate an SAR contour landscape map, and if the compound organization and contours are meaningful and predictive, I can begin asking some useful questions ...
A) All structures and experimentally measured kinase activities are from Metz, J.T. et al. “Navigating the Kinome,” Nature Chemical Biology, 7 (2011) 200-202, supplementary results
3. SAR Contour Landscape Map – Possible Questions, Possible Utility
1. What does the SAR generally look like as a 2D contour map? Where are the “hot” regions (active cpds)? What are the chemotypes in those regions? Where are the “cold” regions (inactive cpds)? Is the landscape smoothly varying or rough and jagged?
2. What regions are well-explored (densely populated)? What regions are under-explored?
3. Can the map be used to predict the activity of virtual compounds? If so, what compound should I synthesize next? What general direction (set of compounds) should I consider to get to where I need to go?
4. What else can I do with SARCL maps?
4. How to generate a SARCL map? – Pipeline Pilot
1. Begin with a set of 2D structures and activities for a target
2. Compute the complete Property Distance (1 – Tanimoto Similarity) matrix for all compound pairs using ECFP_6 fingerprints
Property Distance matrix for Cpd # 2191, Cpd # 41, and Cpd # 762:
          # 2191   # 41   # 762
# 2191    0.00     0.89   0.90
# 41      0.89     0.00   0.94
# 762     0.90     0.94   0.00
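The distance step can be sketched outside Pipeline Pilot as follows. This is a minimal illustration, assuming each fingerprint is available as a set of "on" bit identifiers. The bit sets below are hypothetical toy values, not the real ECFP_6 fingerprints of any compound, and the function names are made up for this sketch.

```python
from itertools import combinations


def tanimoto_distance(bits_a, bits_b):
    """Return 1 - Tanimoto similarity for two sets of 'on' fingerprint bits."""
    union = len(bits_a | bits_b)
    if union == 0:
        return 0.0  # assumption: two empty fingerprints count as identical
    return 1.0 - len(bits_a & bits_b) / union


def distance_matrix(fingerprints):
    """Build the full symmetric distance matrix for a list of bit sets."""
    n = len(fingerprints)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = tanimoto_distance(fingerprints[i], fingerprints[j])
        matrix[i][j] = matrix[j][i] = d  # distance is symmetric
    return matrix


# Hypothetical bit sets for illustration only.
toy_fps = [{1, 2, 3, 4}, {3, 4, 5, 6}, {7, 8}]
toy_matrix = distance_matrix(toy_fps)
for row in toy_matrix:
    print(" ".join(f"{d:.2f}" for d in row))
```

The diagonal is zero by construction, and compounds that share no bits sit at the maximum distance of 1.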
5. How to generate a SARCL map? – R statistics (igraph)
Distance Matrix:
          # 2191   # 41   # 762
# 2191    0.00     0.89   0.90
# 41      0.89     0.00   0.94
# 762     0.90     0.94   0.00
The Distance Matrix is passed to the R Statistics igraph package, which returns Network Coordinates (Network X, Network Y) for each compound.
[Figure: network plot, Network X vs. Network Y; labeled nodes include cpds 41, 2191, 762, and 347]
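To make the "distance matrix in, 2D coordinates out" idea concrete, here is a rough sketch. The slide does not say which igraph layout routine was used, so this swaps in a simple stress-minimizing layout (gradient descent on the squared difference between plotted and target distances). It is not the igraph algorithm, and its coordinates will not match the Network X / Network Y values in the talk. The function name, step size, and iteration count are assumptions.

```python
import math


def stress_layout(dist, n_iter=2000, step=0.05):
    """Place n points in 2D so plotted distances approximate dist[i][j]."""
    n = len(dist)
    # Deterministic start: points evenly spaced on a unit circle.
    pos = [[math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
           for i in range(n)]
    for _ in range(n_iter):
        for i in range(n):
            gx = gy = 0.0
            for j in range(n):
                if i == j:
                    continue
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                d = math.hypot(dx, dy) or 1e-9  # avoid division by zero
                # Gradient of (d - target)^2 with respect to point i.
                g = 2.0 * (d - dist[i][j]) / d
                gx += g * dx
                gy += g * dy
            pos[i][0] -= step * gx
            pos[i][1] -= step * gy
    return pos


# Distance matrix from the slide, in the order cpd 2191, 41, 762.
labels = ["2191", "41", "762"]
dist = [[0.00, 0.89, 0.90],
        [0.89, 0.00, 0.94],
        [0.90, 0.94, 0.00]]
coords = stress_layout(dist)
for name, (x, y) in zip(labels, coords):
    print(f"cpd {name}: X = {x:.3f}, Y = {y:.3f}")
```

With only three compounds the target distances can be reproduced exactly in the plane. For larger sets a 2D layout is generally a compromise, which is one reason compound placement appears among the open challenges.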
6. How to generate a SARCL map? – R statistics (AKIMA)
Input table (Structure column shown as images on the slide):
Cpd ID   “X” Network X   “Y” Network Y   “Z” pActivity (DYRK1A)
2191     122.2           94.8            4.50
41       -365.8          54.0            8.10
762      138.3           278.2           7.20
The table is passed to the R Statistics AKIMA package to produce the contour map.
[Figure: contour map with cpds # 41, # 2191, and # 762 labeled]
NOTE! – AKIMA can also be used to predict Z values (pActivity) given an (X,Y) point, if contours exist around that point, e.g., from the training set.
Map legend: Black filled circle – Training Set (TS); Grey filled diamond – Prediction Set (PS)
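The idea of predicting Z at an (X,Y) point only where the training set surrounds it can be illustrated with the three rows of the table. This sketch swaps in plain linear (barycentric) interpolation over a single triangle; it is not the AKIMA spline fit, and the function name is hypothetical. Points outside the triangle return None, mirroring the "if contours exist around that point" caveat.

```python
def barycentric_predict(triangle, x, y):
    """Linearly interpolate Z at (x, y) inside a triangle of (X, Y, Z) points.

    Returns None when (x, y) lies outside the triangle.
    """
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("training points are collinear")
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < -1e-12:
        return None  # no training data surrounds this point
    return w1 * z1 + w2 * z2 + w3 * z3


# (Network X, Network Y, pActivity) for cpds 2191, 41, 762 from the table.
training_set = [(122.2, 94.8, 4.50), (-365.8, 54.0, 8.10), (138.3, 278.2, 7.20)]

inside = barycentric_predict(training_set, -35.1, 142.3)     # near the centroid
outside = barycentric_predict(training_set, 1000.0, 1000.0)  # far from all data
print(inside, outside)
```

A real map would triangulate all training compounds and fit smooth patches, but the refusal to extrapolate beyond the data works the same way.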
11. PS MUE = f (Tanimoto Distance, Count) ?
[Figure: a PS molecule and TS molecules, with Tan. Dist. = 0.10, 0.20, and 0.30 marked]
Hmmm... In Progress!
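The two quantities in this question can be sketched as below, assuming MUE stands for mean unsigned error and "Count" means the number of TS molecules within a given Tanimoto distance of a PS molecule. Both readings are assumptions, since the slide does not define them. All numbers are made up for illustration and are not results from the talk.

```python
def mean_unsigned_error(predicted, observed):
    """Mean of |predicted - observed| over paired values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


def count_neighbors(distances_to_ts, radius):
    """Count TS molecules within `radius` (Tanimoto distance) of one PS molecule."""
    return sum(1 for d in distances_to_ts if d <= radius)


# Illustrative values only.
predicted_pkd = [7.0, 6.5, 5.0]
observed_pkd = [7.5, 6.0, 5.5]
mue = mean_unsigned_error(predicted_pkd, observed_pkd)

ps_to_ts_distances = [0.08, 0.15, 0.25, 0.60]
counts = {r: count_neighbors(ps_to_ts_distances, r) for r in (0.10, 0.20, 0.30)}
print(mue, counts)
```

Binning PS molecules by neighbor count at each radius and computing the MUE per bin would be one way to probe the relationship the slide asks about.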
12. What else can I do with SARCL maps? – Fun with pKd math!
1. Selectivity between 2 targets
Selectivity = pKd(DYRK1A) – pKd(CDK5)
2. Activity and Selectivity between 2 targets
Activity and Selectivity = pKd(DYRK1A) + [pKd(DYRK1A) – pKd(CDK5)]
3. Activity and Multiple Selectivity for > 2 targets
Activity and Multiple Selectivity = pKd(DYRK1A) + [pKd(DYRK1A) – pKd(CDK5)] + [pKd(DYRK1A) – pKd(ROCK1)]
In each sum, the first term is the activity and each bracketed difference is a selectivity.
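The three quantities reduce to a few lines of arithmetic. In this small sketch the function names are hypothetical, and the pKd values are invented for illustration (the deck excerpt gives no CDK5 or ROCK1 values).

```python
def selectivity(pkd_target, pkd_off_target):
    """Selectivity = pKd(target) - pKd(off-target)."""
    return pkd_target - pkd_off_target


def activity_and_selectivity(pkd_target, pkd_off_targets):
    """Activity plus one selectivity term per off-target."""
    return pkd_target + sum(selectivity(pkd_target, off) for off in pkd_off_targets)


# Hypothetical pKd profile for one compound.
pkd = {"DYRK1A": 8.0, "CDK5": 6.0, "ROCK1": 5.0}

sel = selectivity(pkd["DYRK1A"], pkd["CDK5"])
act_sel = activity_and_selectivity(pkd["DYRK1A"], [pkd["CDK5"]])
act_multi_sel = activity_and_selectivity(pkd["DYRK1A"], [pkd["CDK5"], pkd["ROCK1"]])
print(sel, act_sel, act_multi_sel)
```

Any of these derived values can replace pActivity as the "Z" column when building a map.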
18. Hmmm... A Parting Thought
1. Why is an activity contouring method potentially useful for predicting activity?
Patches on the surface represent a set of spline curves that are fit locally to the activity data.
Local fitting potentially avoids the “one equation” or “one model” problem. Data sets may contain multiple and varied chemotypes!
19. Conclusions and Future Directions
1. SAR Contour Landscape (SARCL) maps of small molecule SAR data sets can be constructed using Pipeline Pilot and R statistics igraph and AKIMA codes. Training Set error for 3 kinases is ~ +/- 0.04 pKd units. Prediction Set error is ~ +/- 0.5 pKd units. Related approaches have appeared in the literature (Peltason, L. et al. J. Chem. Inf. Model. 50 (2010) 1021; Reutlinger, M. et al. Angew. Chem. Int. Ed. 50 (2011) 1).
2. SARCL maps of activity and (multiple) selectivity are also possible via simple mathematical manipulations of the activity data.
3. More work to do! Current challenges include handling larger data sets, in-depth assessment of prediction error versus fingerprint types and possible usage of other properties, compound placement and spline fitting, automatic scaling of points and contours, deployment in Tibco Spotfire, etc. Ideas, collaborators, and help are greatly welcomed!
Stay tuned!
20. Many thanks to ...
Abbott Discovery Technologies (Cheminformatics): Rishi Gupta, Isabella Haight, Phil Hajduk, Yvonne Martin, Steve Muchmore
Abbott Discovery IT: Robert Gregg, James Kofron, Ravi Mamidipaka
Abbott HTS: Scott Galasinki, Lemma Kifle, Phil Merta
Accelrys Support: Keith Burdick, Mike Cherry, Dana Honeycutt, Ian Kerman, Avinash Kewalramani
And many other helpful folks!