Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final
ACS talk wed 28 2012

  1. Combining Cheminformatics Methods and Pathway Analysis to Identify Molecules with Whole Cell Activity Against Mycobacterium tuberculosis
     Malabika Sarker1, Carolyn Talcott1, Peter Madrid1, Sidharth Chopra1, Barry A. Bunin2, Gyanu Lamichhane3, Joel S. Freundlich4 and Sean Ekins2,5
     1 SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025, USA.
     2 Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94010, USA.
     3 Johns Hopkins School of Medicine, Department of Medicine, 1550 Orleans St, Room 103, Baltimore, MD 21287, USA.
     4 Departments of Pharmacology & Physiology and Medicine, Center for Emerging and Reemerging Pathogens, UMDNJ - New Jersey Medical School, 185 South Orange Avenue, Newark, NJ 07103, USA.
     5 Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC 27526, USA.
  2. Applying CDD to build a disease community for TB
     • Tuberculosis kills 1.6-1.7 million people per year (roughly 1 every 19 seconds)
     • One third of the world's population is infected
     • Multi-drug resistance in 4.3% of cases
     • Extensively drug-resistant TB is increasing in incidence
     • No new drugs in over 40 years
     • Drug-drug interactions and co-morbidity with HIV
     • Collaboration between groups is rare; these groups may work on existing or new targets
     • Use of computational methods with TB is rare
     • Literature TB data (SAR) is not well collated
     Funded by the Bill and Melinda Gates Foundation
  3. Molecules with activity against TB
     • ~20 public datasets for TB, including Novartis data on TB hits
     • >300,000 cpds
     • Patents, papers
     • Annotated by CDD
     • Open to browse by anyone: http://www.collaborativedrug.com/register
  4. Simple descriptor analysis on > 300,000 compounds tested vs TB

     | Dataset | MWT | logP | HBD | HBA | RO5 | Atom count | PSA | RBN |
     |---|---|---|---|---|---|---|---|---|
     | MLSMR, active, ≥ 90% inhibition at 10 uM (N = 4096) | 357.10 (84.70) | 3.58 (1.39) | 1.16 (0.93) | 4.89 (1.94) | 0.20 (0.48) | 42.99 (12.70) | 83.46 (34.31) | 4.85 (2.43) |
     | MLSMR, inactive, < 90% inhibition at 10 uM (N = 216367) | 350.15 (77.98)** | 2.82 (1.44)** | 1.14 (0.88) | 4.86 (1.77) | 0.09 (0.31)** | 43.38 (10.73) | 85.06 (32.08)* | 4.91 (2.35) |
     | TAACF-NIAID CB2, active, ≥ 90% inhibition at 10 uM (N = 1702) | 349.58 (63.82) | 4.04 (1.02) | 0.98 (0.84) | 4.18 (1.66) | 0.19 (0.40) | 41.88 (9.44) | 70.28 (29.55) | 4.76 (1.99) |
     | TAACF-NIAID CB2, inactive, < 90% inhibition at 10 uM (N = 100,931) | 352.59 (70.87) | 3.38 (1.36)** | 1.11 (0.82)** | 4.24 (1.58) | 0.12 (0.34)** | 42.43 (8.94)* | 77.75 (30.17)** | 4.72 (1.99) |
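The slide does not say how the starred differences in the descriptor table were tested. As a rough illustration only, the sketch below computes Welch's t statistic from summary statistics for the MLSMR logP values. It assumes the parenthesized values are standard deviations, which the slide does not state, and `welch_t` is a hypothetical helper name.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic computed from two groups' summary statistics."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# MLSMR logP, actives vs inactives, numbers taken from the slide.
# Assumption: the values in parentheses are standard deviations.
t_logp = welch_t(3.58, 1.39, 4096, 2.82, 1.44, 216367)
print(f"Welch t for MLSMR logP: {t_logp:.1f}")
```

With groups this large, even small differences in means give large t values, so a statistic like this says more about sample size than about practical relevance.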
  5. Fitting into the drug discovery process
     Ekins et al., Trends in Microbiology 19: 65-74, 2011
  6. BMGF
     • 3 academia/government lab - industry screening partnerships
     • CDD used for data sharing and collaboration, along with cheminformatics expertise
     • Previously supported larger groups of labs; many continued as customers
  7. More Medicines for Tuberculosis
     • CDD is a partner on a 5-year project supporting >20 labs and providing cheminformatics support
     • Already found hits for a TB target using docking
     www.mm4tb.org
  8. Bayesian Classification TB Models
     We can use the public data for machine learning model building.
     Using Discovery Studio Bayesian model; leave out 50% x 100.

     | Dataset (number of molecules) | Internal ROC Score | External ROC Score | Concordance | Specificity | Sensitivity |
     |---|---|---|---|---|---|
     | MLSMR all single point screen (N = 220463) | 0.86 ± 0 | 0.86 ± 0 | 78.56 ± 1.86 | 78.59 ± 1.94 | 77.13 ± 2.26 |
     | MLSMR dose response set (N = 2273) | 0.73 ± 0.01 | 0.75 ± 0.01 | 66.85 ± 4.06 | 67.21 ± 7.05 | 65.47 ± 7.96 |

     Ekins et al., Mol BioSyst, 6: 840-851, 2010
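The concordance, specificity and sensitivity columns are standard confusion-matrix statistics. A minimal sketch of how such percentages can be computed for one held-out split follows. It is not the Discovery Studio implementation, the function name is hypothetical, and the labels are toy data.

```python
def classification_metrics(actual, predicted):
    """Concordance, specificity and sensitivity (in %) for binary labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # actives predicted active
    tn = sum(1 for a, p in pairs if not a and not p)  # inactives predicted inactive
    fp = sum(1 for a, p in pairs if not a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    return {
        "concordance": 100.0 * (tp + tn) / len(pairs),
        "specificity": 100.0 * tn / (tn + fp),
        "sensitivity": 100.0 * tp / (tp + fn),
    }


# Toy held-out set: 1 = active, 0 = inactive (illustrative values only).
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = classification_metrics(actual, predicted)
print(metrics)
```

In a leave out 50% x 100 scheme, statistics like these would be computed on each of the 100 random held-out halves and then reported as mean ± spread.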
  9. Additional test sets
     • 100K library: 1702 hits in >100K cpds
     • Novartis data: 34 hits in 248 cpds
     • FDA drugs: 21 hits in 2108 cpds
     Suggests models can predict data from the same and independent labs.
     Initial enrichment enables screening few compounds to find actives.
     Ekins et al., Mol BioSyst, 6: 840-851, 2010; Ekins and Freundlich, Pharm Res, 28: 1859-1869, 2011.
  10. Searching for TB molecular mimics; collaboration
     Azaserine exhibited a good fit for this pharmacophore, as judged by its quantitative FitValue (= 2.1) and visual inspection.
     • Modeling: CDD
     • Biology: Johns Hopkins
     • Chemistry: Texas A&M
     Lamichhane G, et al., mBio, 2: e00301-10, 2011
  11. CDD and SRI STTR collaboration
     • CDD: literature data on molecules and their targets. Similarity search with a mimic enables target fishing.
     • SRI: pathway data (targets); species differences in pathways; where to intervene.
     • Aim 1: Develop API to link CDD and SRI databases.
     • Aim 2: Combine the knowledge. Target and compound data added to pathway model.
     • Aim 3: Identify new targets for drugs. Select new targets; take mimic strategy.
  12. Mimic strategy
     1. The enzymes around these metabolites are "in vivo essential".
     2. These enzymes have no human homolog.
     3. These enzyme targets are not yet explored, though some enzymes from the same pathways are drug targets (experimental or predicted).
  13. Workflow steps and responsible partner
     1. Identification of essential in vivo enzymes of Mtb (SRI). Leverages work of Lamichhane et al., Sassetti et al.
     2. Analysis of metabolic pathway and reaction information for the essential enzymes (SRI)
     3. Comparison of non-human-homologue enzymes with Mtb in vivo essential gene set (SRI)
     4. Selection of targets: in vivo essential, not homologous to human and not known as TB drug targets (SRI)
     5. In silico design of small molecule inhibitors or pharmacophores for selected enzyme targets (CDD)
     6. In vitro testing of selected pharmacophores (SRI)
     Approach taken is similar to that of the Lamichhane et al. mBio paper 2011, but instead mimics the substrate. Uses data from SRI and CDD databases to select targets that have not been exploited with small molecules.
  14. The cellular overview diagram for M. tuberculosis H37Rv, from the TBCyc database (http://tbcyc.tbdb.org/index.shtml)
     TBCyc gave a total of 53 non-redundant pathways for the set of 314 essential in vivo genes.
     Sarker et al., Pharm Res 2012, in press
  15. Venn diagram shows the degree of association between the in vivo mutants of Mtb in different animal models
     Sarker et al., Pharm Res 2012, in press
  16. Non-human homologs among essential in vivo proteins
     • Anishetty et al.: 185 proteins from Mtb absent in human. S. Anishetty et al., Comput Biol Chem. 29:368-378 (2005).
     • Sassetti et al.: 49 proteins unique to Mtb. C.M. Sassetti et al., Molecular Microbiology. 48:77-84 (2003).
     Among 314 essential in vivo proteins of Mtb, 66 proteins were non-human homologs.
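The selection criteria described in the talk (in vivo essential, no human homolog, not already a known TB drug target) amount to set intersection and difference. The sketch below shows that logic with placeholder gene identifiers. The function name and the example sets are hypothetical and are not the study's data.

```python
def select_targets(essential_in_vivo, non_human_homologs, known_drug_targets):
    """Genes that are essential in vivo, lack a human homolog,
    and are not already known TB drug targets."""
    return sorted((essential_in_vivo & non_human_homologs) - known_drug_targets)


# Placeholder identifiers for illustration only.
essential_in_vivo = {"geneA", "geneB", "geneC", "geneD"}
non_human_homologs = {"geneB", "geneC", "geneE"}
known_drug_targets = {"geneC"}

candidates = select_targets(essential_in_vivo, non_human_homologs, known_drug_targets)
print(candidates)
```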
  17. TB target database for in vivo essential genes.
     https://www.collaborativedrug.com/buzz/2011/05/02/new-tb-targets-and-molecules-data-available-for-public-access-use/
  18. TB molecules with activity in vitro and target information (from CDD), now with added external links to pathways, literature etc.
     14 known gene targets and 31 predicted gene targets for 35 already approved TB drugs.
  19. TB molecules and target information database connects molecule, gene, pathway and literature
  20. Targets, metabolites and pathways pursued in this study

     | Essential Gene | Pathway | Essential Substrate/s |
     |---|---|---|
     | bioB (Rv1589) | Biotin biosynthesis | dethiobiotin |
     | thiE (Rv0414c) | Thiamine biosynthesis | 2-(4-methylthiazol-5-yl)ethyl phosphate and [(4-amino-2-methyl-pyrimidin-5-yl)methoxy-oxido-phosphoryl] phosphate |
     | cysE (Rv2335) | Cysteine biosynthesis | L-serine and acetyl-CoA |
     | cobC (Rv2231c) | No pathway assigned | L-threonine O-3-phosphate |
     | glpX (Rv1099c) | Glycolysis and gluconeogenesis | D-fructose 1,6-bisphosphate |
     | ppgK (Rv2702) | Amino sugar and nucleotide sugar metabolism; gluconeogenesis | β-D-glucose |
     | arcA (Rv1001) | Arginine degradation V (arginine deiminase pathway) | L-arginine |
     | panD (Rv3601c) | β-alanine biosynthesis IV | L-aspartate |
     | otsA (Rv3490) | Trehalose biosynthesis I | UDP-D-glucose and α-D-glucose 6-phosphate |

     Sarker et al., Pharm Res 2012, in press
  21. Pharmacophore-based search procedure
     1. Pharmacophore developed (using Accelrys Discovery Studio) from 3D conformations of the substrate
     2. van der Waals surface for the metabolite mapped onto it
     3. Pharmacophore plus shape searched in 3D compound databases from vendors
     4. In silico hits collated
     5. Filtered for TB whole cell activity and reactivity
     Compounds filtered based on Bayesian score, using models derived from NIAID / Southern Research Institute data, to retrieve ideal molecular properties for in vitro TB activity.
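The final filtering step can be pictured as a simple pass over the collated in silico hits. The sketch below is an assumption-laden illustration: the `Hit` record, its fields, the score threshold and the example values are all hypothetical, and real Bayesian scores would come from a trained model rather than being typed in.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str              # hypothetical compound identifier
    bayesian_score: float  # score from a whole cell activity model (assumed precomputed)
    reactive: bool         # flagged by a reactivity filter (assumed precomputed)


def filter_hits(hits, min_score):
    """Keep non-reactive hits at or above the score threshold, best first."""
    kept = [h for h in hits if not h.reactive and h.bayesian_score >= min_score]
    return sorted(kept, key=lambda h: h.bayesian_score, reverse=True)


# Illustrative values only.
hits = [
    Hit("cpd-1", 4.2, False),
    Hit("cpd-2", 7.9, True),
    Hit("cpd-3", 6.1, False),
    Hit("cpd-4", -1.3, False),
]
shortlist = [h.name for h in filter_hits(hits, min_score=0.0)]
print(shortlist)
```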
  22. Example of mimic strategy for bioB (Rv1589), biotin biosynthesis, substrate dethiobiotin
     1. Take substrate, generate 3D conformers and build a pharmacophore.
     2. Use the pharmacophore to search vendor libraries in 3D. Searching Maybridge (57K) gives 72 molecules, many of them hydrophobic, so they stand a chance of in vitro activity.
     3. Buy and test compounds.
  23. Substrate Pharmacophores Developed for Mtb Enzymes (figure, panels a-l)
     Green = hydrogen bond acceptor, purple = hydrogen bond donor, cyan = hydrophobe, grey = van der Waals surface.
     Sarker et al., Pharm Res 2012, in press
  24. Two Proposed Mimics of D-fructose 1,6-bisphosphate
     Computationally searched >80,000 molecules, narrowed to 842 hits, and tested 23 compounds in vitro (3 picked as inactives), leading to 2 proposed as mimics of D-fructose 1,6-bisphosphate:
     a. DFP000133SC, MIC 40 μg/ml
     b. DFP000134SC, MIC 20 μg/ml
     Sarker et al., Pharm Res 2012, in press
  25. Proposed generalized workflow for molecule discovery
     Step 1. Find candidate genes coding potential targets.
       1. Choose pathogen
       2. Search for genes: choose source (experimental in vitro/ex vivo data, in silico single/double knockout with a chosen nutrient set and survival conditions); choose filter (no human ortholog, ..., user edit)
       Output: target candidate list (gene names associated with reference identifier).
     Step 2. Prioritize target candidate list.
       1. Annotate (choose properties: pathways, reactions, EC#, GO characterization)
       2. Filtering (choose thresholds)
       3. Sort (choose criteria: number of pathways, number of reactions, ...)
       4. Annotate reaction substrates with structure information.
       Output: prioritized target list annotated with prioritizing properties and associated reactions, with their substrates annotated with structure (these are the candidate molecules to mimic).
     Step 3. For each candidate molecule, develop preliminary pharmacophore model that suggests mimics.
       1. Develop pharmacophore models from metabolites
       2. Search known drug databases for compounds mapping to pharmacophore
       3. Filter based on ADME/Tox properties
       4. Filter based on other models for target bioactivity
       5. Sorting or Pareto optimization of results
       Output: pharmacophores and candidate mimics for substrates of target enzymes. Metabolites (and metadata, required as sdf file for software): molecule id, source.
     Step 4. Submit top mimics for experimental validation and lead optimization.
       1. Select molecules from step 3
       2. Order from vendor
       3. Test in vitro / ex vivo
       4. Add results to CDD database
       5. Prioritize compounds for lead optimization / in vivo studies
       6. Partnering with 3rd party for preclinical/clinical studies
       Output: experimental results to be fed into the CDD database.
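The workflow mentions "Sorting or Pareto optimization of results" without further detail. One common reading is to keep only candidates that no other candidate beats on every objective. The sketch below illustrates that idea under the assumption that each candidate carries a tuple of scores where higher is better. The names and numbers are made up.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the (name, scores) pairs that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other[1], c[1]) for other in candidates)
    ]


# Hypothetical mimics scored by (pharmacophore fit, model score); higher is better.
candidates = [
    ("mimic-1", (2.1, 5.0)),
    ("mimic-2", (1.5, 7.0)),
    ("mimic-3", (1.0, 4.0)),
    ("mimic-4", (2.1, 4.0)),
]
front = [name for name, _ in pareto_front(candidates)]
print(front)
```

Unlike sorting on a single weighted score, this keeps trade-off candidates such as a compound with a weaker fit but a stronger model score.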
  26. Summary
     • POC took < 6 months; phase II STTR submitted
     • Still need to test vs target (verify that compounds hit the suggested target) and optimize cpds
     • Need to link SRI and CDD databases via API (new product)
     • Computational models based on whole cell TB data could improve efficiency of screening
     • Collaborations get us to interesting compounds quickly
     • Additional prospective validation ongoing with IDRI, Southern Research Institute and UMDNJ using machine learning models, testing small numbers of compounds
     • UMDNJ: mined GSK malaria public data, scored with Bayesian models, ordered from vendors
  27. Bayesian Machine Learning Models: Improve Hit Rates
     Example 1. Kinase library. Compare with HTS screening below.

     | Library size | Number of hits | Hit rate (%) | Notes | Reference |
     |---|---|---|---|---|
     | 100997 | 1782 | 1.76 | Diverse library | Ananthan |
     | 215110 | 3817 | 1.77 | Diverse library | Maddry |
     | 25671 | 1329 | 5.18 | Human kinase focussed library | Reynolds |

     Example 2. Asinex library. Ranked Asinex 25K library with dose response model; 99 screened; 16 cpds were identified with IC50 < 100 uM.
     Example 3. IDRI: 3 models; 48 compounds tested, 11 with activity at MIC ≤ 10 uM (22.9% hit rate).
     Example 4. UMDNJ: 1 model; 4 tested, 3 active (1 with MIC < 0.125 ug/ml).
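The hit rates quoted on this slide can be checked with simple arithmetic, and dividing a model-guided hit rate by an HTS hit rate gives a rough fold enrichment. The sketch below uses only counts from the slide. Comparing the Asinex result with the Ananthan library is an illustrative choice, not a comparison made in the talk, and the activity cutoffs differ between assays.

```python
def hit_rate(hits, tested):
    """Hit rate as a percentage of compounds tested."""
    return 100.0 * hits / tested


hts_baseline = hit_rate(1782, 100997)  # diverse library (Ananthan)
asinex_rate = hit_rate(16, 99)         # Example 2: ranked Asinex library
idri_rate = hit_rate(11, 48)           # Example 3: IDRI

# Rough fold enrichment of the model-ranked set over the HTS baseline.
fold_enrichment = asinex_rate / hts_baseline

print(f"HTS {hts_baseline:.2f}%, Asinex {asinex_rate:.1f}%, IDRI {idri_rate:.1f}%")
print(f"Asinex vs HTS: about {fold_enrichment:.1f}-fold")
```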
  28. What next: apps for collaboration
     • ODDT (Open Drug Discovery Teams): Flipboard-like app for aggregating social media for diseases etc.
     • Alex Clark, Molecular Materials Informatics, Inc
     Williams et al., DDT 16:928-939, 2011; Clark et al., submitted 2012; Ekins et al., submitted 2012
  29. Acknowledgments
     • Collaborators (Allen Casey, Robert Reynolds etc.)
     • Alex Clark (Molecular Materials Informatics, Inc)
     • Accelrys
     • CDD
     • Funding: BMGF; Award Number R41AI088893 from the National Institute of Allergy and Infectious Diseases
     Email: ekinssean@yahoo.com
     Slideshare: http://www.slideshare.net/ekinssean
     Twitter: collabchem
     Blog: http://www.collabchem.com/
     Website: http://www.collaborations.com/CHEMISTRY.HTM
