Structure-Activity Relationships and Networks: A Generalized Approach to Exploring Structure-Activity Landscapes

  1. Structure-Activity Relationships and Networks: A Generalized Approach to Exploring Structure-Activity Landscapes
     Rajarshi Guha
     NIH Chemical Genomics Center / NIH Center for Translational Therapeutics
     March 29, 2011
  2. NIH Chemical Genomics Center
     • Founded 2004 as part of NIH Roadmap Molecular Libraries Initiative
         – NCGC staffed with 90+ scientists: biologists, chemists, informaticians, engineers
         – Post-doc program
     • Mission
         – MLPCN (screening & chemical synthesis; compound repository; PubChem database; funding for assay, library and technology development)
             • Complements individual investigator-initiated research programs
             • Enables "pharma-level" HTS and early chemical optimization
         – Develop new chemical probes for basic research and leads for therapeutic development, particularly for rare/neglected diseases
         – New paradigms & applications of HTS for chemical biology / chemical genomics
     • All NCGC projects are collaborations with a target or disease expert; currently >200 collaborations with investigators worldwide
         – 75% NIH extramural, 10% NIH intramural, 15% Foundations/Research Consortia/Pharma/Biotech
  3. NCGC Project Diversity
     [Figure panels: (A) Disease areas, (B) Target types, (C) Detection methods]
  4. qHTS: High Throughput Dose Response
     • Assay concentration ranges over 4 logs (high: ~100 μM)
     • 1536-well plates, inter-plate dilution series
     • Assay volumes 2–5 μL
     • Automated concentration-response data collection, ~1 CRC/sec
     • Informatics pipeline: automated curve fitting and classification; 300K samples
     [Figure panels A, B, C]
  5. Background
     • Cheminformatics methods
         – QSAR, diversity analysis, virtual screening, fragments, polypharmacology, networks
     • More recently
         – RNAi screening, high content imaging
     • Extensive use of machine learning
     • All tied together with software development
         – User-facing GUI tools
         – Low level programmatic libraries
     • Believer & practitioner of Open Source
  6. Outline
     • Structure-activity relationships
     • Characterizing activity cliffs
     • Working with the structure-activity landscape
  7. Structure Activity Relationships
     • Similar molecules will have similar activities
     • Small changes in structure will lead to small changes in activity
     • One implication is that SARs are additive
     • This is the basis for QSAR modeling
     Martin, Y.C. et al., J. Med. Chem., 2002, 45, 4350–4358
  8. Exceptions Are Easy to Find
     [Figure: four chemical structures with Ki = 39.0 nM, Ki = 1.8 nM, Ki = 10.0 nM and Ki = 1.0 nM]
     Tran, J.A. et al., Bioorg. Med. Chem. Lett., 2007, 15, 5166–5176
  9. Structure Activity Landscapes
     • Rugged gorges or rolling hills?
         – Small structural changes associated with large activity changes represent steep slopes in the landscape
         – But traditionally, QSAR assumes gentle slopes
         – Machine learning is not very good for special cases
     Maggiora, G.M., J. Chem. Inf. Model., 2006, 46, 1535–1535
  10. Structure Activity Landscapes
  11. Characterizing the Landscape
     • A cliff can be numerically characterized
     • Structure Activity Landscape Index (SALI)

       $$\mathrm{SALI}_{i,j} = \frac{|A_i - A_j|}{1 - \mathrm{sim}(i,j)}$$

     • Cliffs are characterized by elements of the matrix with very large values
     Guha, R.; Van Drie, J.H., J. Chem. Inf. Model., 2008, 48, 646–658
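As a minimal sketch of the index above (not the authors' code), the pairwise SALI matrix can be computed from a list of activities and a precomputed similarity matrix. The function names, the `eps` guard for identical fingerprints (sim = 1) and the example values are illustrative assumptions.

```python
from itertools import combinations


def sali(activity_i, activity_j, similarity, eps=1e-6):
    """SALI for one pair: |A_i - A_j| / (1 - sim(i, j)).

    eps is an assumption of this sketch: it avoids division by zero
    when two molecules have identical fingerprints (sim == 1).
    """
    return abs(activity_i - activity_j) / max(1.0 - similarity, eps)


def sali_matrix(activities, sim):
    """Symmetric matrix of pairwise SALI values; the diagonal is left at 0."""
    n = len(activities)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = sali(activities[i], activities[j], sim[i][j])
    return matrix


# Made-up activities (e.g. pIC50) and similarities, for illustration only.
activities = [7.4, 8.7, 8.0]
sim = [
    [1.0, 0.9, 0.5],
    [0.9, 1.0, 0.6],
    [0.5, 0.6, 1.0],
]
matrix = sali_matrix(activities, sim)
```

The pair (0, 1) is very similar but differs by more than a log unit in activity, so it gets by far the largest value in the matrix, which is what marks it as a cliff.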
  12. Visualizing the SALI Matrix
  13. Fingerprints
     [Figure: example bit string 1 0 1 1 0 0 0 1 0]
     • Lots of types of fingerprints
     • Indicates the presence or absence of a structural feature
     • Length can vary from 166 to 4096 bits or more
     • Fingerprints usually compared using the Tanimoto metric
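For reference, the Tanimoto metric on two binary fingerprints is the number of bits set in both divided by the number of bits set in either. The sketch below uses plain Python lists of 0/1; the function name, the second bit string and the choice to return 0.0 for two empty fingerprints are assumptions of this example.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two equal-length binary fingerprints."""
    if len(bits_a) != len(bits_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(bits_a, bits_b) if a and b)
    either = sum(1 for a, b in zip(bits_a, bits_b) if a or b)
    # Convention assumed here: two all-zero fingerprints score 0.0.
    return both / either if either else 0.0


fp_a = [1, 0, 1, 1, 0, 0, 0, 1, 0]  # bit string shown on the slide
fp_b = [1, 0, 0, 1, 0, 1, 0, 1, 0]  # hypothetical second fingerprint
similarity = tanimoto(fp_a, fp_b)   # 3 shared bits / 5 bits set in either
```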
  14. Varying Fingerprint Methods
     [Figure: density of Tanimoto similarity for three fingerprints: BCI 1052 bit, MACCS 166 bit, CDK 1024 bit]
     • Shorter fingerprints will lead to more "similar" pairs
     • Requires a higher cutoff to focus on significant cliffs
  15. Varying the Similarity Metric
  16. Different Activity Representations
     • Using the Hill parameters from a dose-response curve represents richer data than a single IC50

       $$\mathrm{SALI}_{i,j} = \frac{d(P_i, P_j)}{1 - \mathrm{sim}(i,j)}, \qquad P = \{S_0,\ S_{\mathrm{inf}},\ AC_{50},\ H\}$$

     [Figure: dose-response curve, activity vs. concentration, marking S0, Sinf and the 50% point at AC50]
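The slide does not say which distance d(P_i, P_j) is used. The sketch below assumes a plain Euclidean distance between the four Hill parameters and uses a common four-parameter form of the Hill equation; both are assumptions of this example, as are the function names. In practice the parameters sit on different scales, so some scaling would probably be needed before taking a distance.

```python
import math


def hill_response(conc, s0, s_inf, ac50, h):
    """Four-parameter Hill curve: response at concentration conc."""
    return s0 + (s_inf - s0) / (1.0 + (ac50 / conc) ** h)


def curve_sali(params_i, params_j, similarity, eps=1e-6):
    """SALI variant with d(P_i, P_j) in the numerator.

    params_* are (S0, Sinf, log AC50, H) tuples. Euclidean distance and
    the eps guard are assumptions of this sketch.
    """
    return math.dist(params_i, params_j) / max(1.0 - similarity, eps)


# At conc == AC50 the response is halfway between S0 and Sinf.
midpoint = hill_response(1e-6, s0=0.0, s_inf=-100.0, ac50=1e-6, h=1.0)

# Two hypothetical curves that differ only by one log unit in AC50.
value = curve_sali((0.0, -100.0, -6.0, 1.0), (0.0, -100.0, -5.0, 1.0), 0.5)
```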
  17. Visualizing SALI Values
     • Alternatives?
         – A heatmap is an easy to understand visualization
         – Coupled with brushing, can be a handy tool
         – A more flexible approach is to consider a network view of the matrix
     • The SALI graph
         – Compounds are nodes
         – Nodes i,j are connected if SALI(i,j) > X
         – Only display connected nodes
  18. Visualizing SALI Values
     • The SALI graph
         – Compounds are nodes
         – Nodes i,j are connected if SALI(i,j) > X
         – Only display connected nodes
     [Figure: example SALI graph with numbered compound nodes]
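The graph construction described above can be sketched in a few lines. The edge direction used here (from the less active to the more active compound) is an assumption of this example, chosen so that each edge carries an ordering; the function name and the data are hypothetical.

```python
def sali_graph(sali, activities, cutoff):
    """Return (nodes, edges) for all pairs with SALI(i, j) > cutoff.

    Each edge is a (less_active, more_active) index pair. Compounds with
    no edge above the cutoff are dropped, as on the slide.
    """
    edges = []
    n = len(activities)
    for i in range(n):
        for j in range(i + 1, n):
            if sali[i][j] > cutoff:
                lo, hi = (i, j) if activities[i] <= activities[j] else (j, i)
                edges.append((lo, hi))
    nodes = sorted({v for edge in edges for v in edge})
    return nodes, edges


# Made-up SALI matrix and activities for four compounds.
sali = [
    [0.0, 20.0, 0.4, 3.0],
    [20.0, 0.0, 9.0, 0.2],
    [0.4, 9.0, 0.0, 2.5],
    [3.0, 0.2, 2.5, 0.0],
]
activities = [5.0, 7.0, 5.2, 6.9]
nodes, edges = sali_graph(sali, activities, cutoff=5.0)
```

With this cutoff only the two steepest pairs survive, and compound 3 disappears from the graph because none of its SALI values exceed the threshold.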
  19. Varying the Cutoff
     • The cutoff controls the complexity of the graph
     • Higher cutoffs will highlight the most significant activity cliffs
     [Figure: three SALI graphs of the same dataset at Cutoff = 90%, Cutoff = 50% and Cutoff = 20%]
  20. Better Visualization - SALIViewer
     http://sali.rguha.net
  21. What Can We Do With SALIs?
     • SALI characterizes cliffs & non-cliffs
     • For a given molecular representation, SALIs give us an idea of the smoothness of the SAR landscape
     • Models try to encode this landscape
     • Use the landscape to guide descriptor or model selection
  22. Descriptor Space Smoothness
     [Figure: number of edges in SALI graph vs. SALI cutoff (0.0 to 1.0), with inset SALI graphs whose nodes are labelled with drug names such as astemizole, cisapride, sildenafil and gatifloxacin]
     • Edge count of the SALI graph for varying cutoffs
     • Measures smoothness of the descriptor space
     • Can reduce this to a single number (AUC)
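A rough sketch of the edge-count curve and its reduction to a single number follows. It assumes the cutoff on the 0 to 1 axis is a fraction of the largest SALI value in the dataset, and that the AUC is taken with the trapezoid rule; neither detail is stated on the slide, and the function names are hypothetical.

```python
def edge_count_curve(sali_values, n_steps=11):
    """Number of pairs with SALI above each cutoff.

    Cutoffs run from 0 to 1 and are interpreted as a fraction of the
    maximum SALI value (an assumption of this sketch).
    """
    top = max(sali_values)
    cutoffs = [k / (n_steps - 1) for k in range(n_steps)]
    counts = [sum(1 for v in sali_values if v > c * top) for c in cutoffs]
    return cutoffs, counts


def trapezoid_auc(xs, ys):
    """Area under a piecewise-linear curve."""
    points = list(zip(xs, ys))
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up pairwise SALI values (upper triangle of the matrix).
cutoffs, counts = edge_count_curve([1.0, 2.0, 4.0, 10.0], n_steps=3)
auc = trapezoid_auc(cutoffs, counts)
```

A curve that collapses almost immediately (small AUC) means a few extreme cliffs dominate; a curve that stays high means many pairs have comparably large SALI values.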
  23. Other Examples
     • Instead of fingerprints, we use molecular descriptors
     • SALI denominator now uses Euclidean distance
     • 2D & 3D random descriptor sets
         – None are really good
         – Too rough, or
         – Too flat
     [Figure: two panels (2D, 3D) of number of edges in SALI graph vs. SALI cutoff (0.0 to 1.0)]
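One way to read "SALI denominator now uses Euclidean distance" is that 1 − sim(i, j) is replaced by the distance between the two descriptor vectors. The sketch below takes that reading; whether the descriptors or the distances were scaled first is not stated, so the unscaled form, the `eps` guard and the function name are assumptions.

```python
import math


def sali_descriptor(activity_i, activity_j, desc_i, desc_j, eps=1e-6):
    """SALI with Euclidean descriptor distance as the denominator."""
    distance = math.dist(desc_i, desc_j)
    # eps guards against two molecules with identical descriptor vectors.
    return abs(activity_i - activity_j) / max(distance, eps)


# Hypothetical two-descriptor molecules 5 units apart, 2.5 log units in activity.
value = sali_descriptor(6.0, 8.5, (0.0, 0.0), (3.0, 4.0))
```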
  24. Feature Selection Using SALI
     • Surprisingly, exhaustive search of 66,000 4-descriptor combinations did not yield semi-smoothly decreasing curves
     • Not entirely clear what type of curve is desirable
  25. SALI Graphs & Predictive Models
     • The graph view allows us to view SARs and identify trends easily
     • The aim of a QSAR model is to encode SARs
     • Traditionally, we consider the quality of a model in terms of RMSE or R²
     • But in general, we're not as interested in RMSEs as we are in whether the model predicted something as more active than something else
         – What we want to have is the correct ordering
         – We assume the model is statistically significant
  26. Measuring Model Quality
     • A QSAR model should easily encode the "rolling hills"
     • A good model captures the most significant cliffs
     • Can be formalized as: How many of the edge orderings of a SALI graph does the model predict correctly?
     • Define S(X), representing the number of edges correctly predicted for a SALI network at a threshold X
     • Repeat for varying X and obtain the SALI curve
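A sketch of S(X) for one threshold is given below. The slide describes S(X) as a count of correctly predicted edges; this example assumes the normalized form (correct minus incorrect, divided by the number of edges), which would give values between −1 and 1. The function name, the tie handling and the data are hypothetical.

```python
def s_of_x(edges, predicted):
    """Fraction of correctly minus incorrectly ordered edges.

    edges: (less_active, more_active) index pairs from the SALI graph at
    one threshold X. predicted: model predictions indexed by compound.
    Ties in prediction count as neither correct nor incorrect, and an
    empty graph scores 0.0 (both assumptions of this sketch).
    """
    if not edges:
        return 0.0
    score = 0
    for lo, hi in edges:
        if predicted[hi] > predicted[lo]:
            score += 1
        elif predicted[hi] < predicted[lo]:
            score -= 1
    return score / len(edges)


edges = [(0, 1), (2, 1), (0, 3)]   # observed orderings at some cutoff X
predicted = [5.5, 6.5, 6.8, 6.0]   # made-up model output
value = s_of_x(edges, predicted)   # two edges right, one wrong
```

Repeating this for the edge set at each cutoff X gives the SALI curve.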
  27. SALI Curves
     [Figure: two panels of S(X) (−1.0 to 1.0) vs. X (0.0 to 1.0); legend entries: 3-descriptor, 5-descriptor, Scrambled 3-descriptor; annotation: SCI = 0.12]
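The slide reports an SCI value without defining it. Assuming SCI is the integral of the SALI curve S(X) over the cutoff X, a trapezoid-rule sketch looks like this; the function name and the curve values are illustrative only.

```python
def sali_curve_integral(cutoffs, s_values):
    """Trapezoid-rule area under S(X); assumed here to be what SCI denotes."""
    points = list(zip(cutoffs, s_values))
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up SALI curve sampled at three cutoffs.
sci = sali_curve_integral([0.0, 0.5, 1.0], [0.2, 0.6, 1.0])
```

Under this reading, a model that orders every edge correctly at every cutoff would score 1, and a model that is no better than chance would score close to 0.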
  28. Model Search Using the SCI
     • We've used the SALI to retrospectively analyze models
     • Can we use SALI to develop models?
         – Identify a model that captures the cliffs
     • Tricky
         – Cliffs are fundamentally outliers
         – Optimizing for good SALI values implies overfitting
         – Need to trade off between SALI & generalizability
  29. The Objective Function
     • S0 is a measure of the model's ability to summarize the dataset (analogous to RMSE)
     • S100 measures the model's ability to capture cliffs
     • Ideally, the curve starts high and stays high
     • Three candidate objective functions:

       $$F = \frac{1}{S_{100}}, \qquad F = \frac{1}{S_0} + \frac{S_{100} - S_0}{2}, \qquad F = \frac{1}{\mathrm{SCI}}$$

     [Figure: S(X) vs. SALI cutoff (0.0 to 1.0), with S0 and S100 marked at the two ends of the curve]
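The three objective functions can be written down directly. The sketch below only transcribes the formulas; the function names are hypothetical, and the check for non-positive inputs is an addition of this example, since the reciprocals are undefined or change sign there and S(X) can be zero or negative for a poor model.

```python
def _reciprocal(value, name):
    # Guard added for this sketch: 1/x is only meaningful here for x > 0.
    if value <= 0:
        raise ValueError(f"{name} must be positive, got {value}")
    return 1.0 / value


def objective_s100(s100):
    """F = 1 / S100."""
    return _reciprocal(s100, "S100")


def objective_s0_plus_half_d(s0, s100):
    """F = 1 / S0 + (S100 - S0) / 2."""
    return _reciprocal(s0, "S0") + (s100 - s0) / 2.0


def objective_sci(sci):
    """F = 1 / SCI."""
    return _reciprocal(sci, "SCI")


# Made-up curve end points: S0 = 0.8, S100 = 1.0.
f_combined = objective_s0_plus_half_d(0.8, 1.0)
```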
  30. SALI Based Model Selection
     • Considered the BZR dataset from Sutherland et al
     • Identified "best" models using a GA to select from a pool of 2D descriptors
     • While SALI based optimization can lead to a "better" curve, it doesn't give the best model
     [Figure: two panels of S(X) vs. SALI cutoff (0.0 to 1.0 and 0.00 to 0.08) for models selected by RMSE, SCI and S(100)]
     Sutherland, J. et al, J. Chem. Inf. Comput. Sci., 2003, 43, 1906-1915
  31. SALI Based Model Selection
     • 107 aryl azoles as ER-β agonists
     • Used a GA and 2D descriptors to identify models
     • In this case, a SALI based objective function was able to identify the best model
     • Interestingly, SCI does not seem to perform very well
     [Figure: two panels of S(X) vs. SALI cutoff (0.0 to 1.0 and 0.00 to 0.08) for models selected by RMSE, SCI and S(0) + D/2]
     Malamas, M.S. et al, J Med Chem, 2004, 47, 5021-5040
  32. SALI Based Model Selection
     • The size of the solution space explored depends on the SALI objective function
     [Figure: RMSE by objective function for the BZR and ER-β datasets; x-axis categories are RMSE, S(100), SCI in one panel and 1/S(0) + D/2, RMSE, SCI in the other]
  33. Predicting the Landscape
     • Rather than predicting activity directly, we can try to predict the SAR landscape
     • Implies that we attempt to directly predict cliffs
         – Observations are now pairs of molecules
     • A more complex problem
         – Choice of features is trickier
         – Still face the problem of cliffs as outliers
         – Somewhat similar to predicting activity differences
     Scheiber et al, Statistical Analysis and Data Mining, 2009, 2, 115-122
  34. Predicting Cliffs
     • Dependent variables are pairwise SALI values, calculated using fingerprints
     • Independent variables are molecular descriptors
         – but considered pairwise
         – Absolute difference of descriptor pairs, or
         – Geometric mean of descriptor pairs
         – …
     • Develop a model to correlate pairwise descriptors to pairwise SALI values
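The two aggregation functions named above can be sketched as follows. The function name and the data are hypothetical, and the geometric mean is only defined here for non-negative descriptor values, which is an assumption the slide does not address.

```python
import math
from itertools import combinations


def pairwise_features(descriptors, how="absdiff"):
    """Map each molecule pair (i, j) to one aggregated descriptor vector."""
    rows = {}
    for i, j in combinations(range(len(descriptors)), 2):
        a, b = descriptors[i], descriptors[j]
        if how == "absdiff":
            rows[(i, j)] = [abs(x - y) for x, y in zip(a, b)]
        elif how == "geomean":
            # Assumes non-negative descriptors; sqrt fails on negative products.
            rows[(i, j)] = [math.sqrt(x * y) for x, y in zip(a, b)]
        else:
            raise ValueError(f"unknown aggregation: {how}")
    return rows


# Three made-up molecules with two descriptors each.
descriptors = [[1.0, 4.0], [4.0, 9.0], [9.0, 1.0]]
abs_diff = pairwise_features(descriptors, "absdiff")
geo_mean = pairwise_features(descriptors, "geomean")

# n molecules give n * (n - 1) / 2 observations, e.g. 30 molecules give 435.
n_pairs_for_30 = len(list(combinations(range(30), 2)))
```

Each row of pairwise descriptors is then matched to the SALI value of the same pair to form the training set.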
  35. A Test Case
     • We first consider the Cavalli CoMFA dataset of 30 molecules with pIC50s
     • Evaluate topological and physicochemical descriptors
     • Developed random forest models
         – On the original observed values (30 observations)
         – On the SALI values (435 observations)
     Cavalli, A. et al, J Med Chem, 2002, 45, 3844-3853
  36. Double Counting Structures?
     • The dependent and independent variables both encode structure.
     • But pretty low correlations between individual pairwise descriptors and the SALI values
     [Figure: histograms (percent of total) of R² (0.00 to 0.15) for the GeoMean and AbsDiff pairwise descriptors]
  37. Model Summaries
     [Figure: three predicted vs. observed scatter plots]
         – Original pIC50: RMSE = 0.97
         – SALI, AbsDiff: RMSE = 1.10
         – SALI, GeoMean: RMSE = 1.04
     • All models explain similar % of variance of their respective datasets
     • Using geometric mean as the descriptor aggregation function seems to perform best
     • SALI models are more robust due to larger size of the dataset
  38. Test Case 2
     • Considered the Holloway docking dataset, 32 molecules with pIC50s and E_inter
     • Similar strategy as before
     • Need to transform SALI values
     • Descriptors show minimal correlation
     [Figure: histograms (percent of total) of SALI (0 to 120) and log10(SALI) (−1 to 2)]
     Holloway, M.K. et al, J Med Chem, 1995, 38, 305-317