New target prediction and visualization tools incorporating open source molecular fingerprints for TB Mobile version 2


A talk given in the CINF division at the ACS San Francisco meeting on TB Mobile and its use for target identification and prediction

Transcript

  • 1. New Target Prediction and Visualization Tools Incorporating Open Source Molecular Fingerprints for TB Mobile Version 2. Sean Ekins (1, 2), Alex M. Clark (3) and Malabika Sarker (4). 1 Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94010, USA. 2 Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC 27526, USA. 3 Molecular Materials Informatics, 1900 St. Jacques #302, Montreal, Quebec, Canada H3J 2S1. 4 SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025, USA.
  • 2. TB key points. Tuberculosis kills 1.6-1.7 million people per year (~1 every 8 seconds). One third of the world's population is infected. Drugs and year introduced: streptomycin (1943), para-aminosalicylic acid (1949), isoniazid (1952), pyrazinamide (1954), cycloserine (1955), ethambutol (1962), rifampicin (1967). Multidrug resistance in 4.3% of cases. Extensively drug resistant TB is increasing in incidence. One new drug (bedaquiline) in 40 years.
  • 3. Big Data: Screening for New Tuberculosis Treatments. [Figure labels: Tested >350,000 molecules; Tested ~2M; 2M; >300,000; >1500 active and non toxic; Published 177; 100s; 800.] Others have likely screened another 500,000. How many will become a new drug? How do we learn from this big data? What are the targets for these molecules?
  • 4. Predicting the target/s for small molecules: pathway analysis; binding site similarity to Mtb proteins; docking; Bayesian models - ligand similarity.
  • 5. Dataset Curation: TB molecules and target information database connects molecule, gene, pathway and literature. Multi-step process:
    1. Identification of essential in vivo enzymes of Mtb involved intensive literature mining and manual curation, to extract all the genes essential for Mtb growth in vivo across species.
    2. Homolog information was collated from other studies.
    3. Collection of metabolic pathway information involved using TBDB.
    4. Identifying molecules and drugs with known or predicted targets involved searching the CDD databases for manually curated data. The structures and data were exported for combination with the other data.
    5. All data were combined with URL links to literature and TBDB and deposited in the CDD database.
    Initially over 700 molecules in dataset. Sarker et al., Pharm Res 2012, 29, 2115-2127.
  • 6. TB molecules and target information database connects molecule, gene, pathway and literature
  • 7. TB Mobile 1 layout on iPhone and Android. [Screenshot panels: iPhone, Android.]
  • 8. 14 first-line drugs active against Mtb were evaluated in the TB Mobile app, with the top 3 molecules shown. This confirms that all are in TB Mobile and are retrieved.
  • 9. Predicted targets of GSK TB hits months earlier using TB Mobile. GSK report hits: Dec 2012. TB Mobile predictions: 24th Jan 2013, http://goo.gl/9LKrPZ. GSK predict targets: Oct 2013.
  • 10. Predicted targets using TB Mobile. No verification yet. Ekins et al., Tuberculosis 94: 162-169 (2014)
  • 11. Chemical property space of TB Mobile compounds. PCA of 745 compounds with Mtb targets (blue) and 1200 Mtb active and non cytotoxic hit compounds (yellow). Ekins et al., Tuberculosis 94: 162-169 (2014)
  • 12. Chemical property space of TB Mobile and GSK lead compounds. PCA of 745 compounds with Mtb targets (blue) and 177 GSK Mtb leads (yellow). Ekins et al., J Chem Inf Model 53: 3054 (2013)
  • 13. TB Mobile 2 layout on iPhone. [Screenshot labels: About CDD; Molecule search; Filters; Action Menu; Molecule prediction; Clustering; About TB Mobile; Control block; Compound list; Text search.]
  • 14. TB Mobile 2 iPhone vs TB Mobile 1 Android: Molecule Detail and Links. [Screenshot panels: iPhone, Android. Labels: Bookmark; copy; open-in; cluster; close.]
  • 15. TB Mobile 2 iPhone vs TB Mobile 1 Android: Similarity Searching in the app. [Screenshot panels: iPhone, Android.]
  • 16. TB Mobile 2 iPhone vs TB Mobile 1 Android: Filtering and Sharing Functions. Each molecule can be copied to the clipboard then opened with other apps (e.g. MMDS, MolPrime, MolSync, ChemSpider, and from these exported via Twitter or email) or shared via Dropbox.
  • 17. TB Mobile 2: Filtering and Sharing Functions. Data can also be filtered by target name, pathway name, essentiality and human ortholog.
  • 18. Chemical property space of screening hits and molecules evaluated in TB Mobile 2. PCA of 745 compounds with Mtb targets (blue) and 60 new compounds (yellow). PCA of 745 compounds with Mtb targets (blue) and 20 new test compounds (yellow).
  • 19. Target distribution in TB Mobile 2 (gene: count of molecules).
    Rv0283: 1; Rv0678: 1; Rv1211: 1; Rv1685c: 1; Rv1885c: 4; Rv3160c: 1; Rv3161c: 1; TB27.3 (Rv0577): 2; ald (Rv2780): 2; alr (Rv3423c): 8; aroD (Rv2537c): 14; aspS (Rv2572c): 1; atpE (Rv1305): 2; blaC (Rv2068c): 1; clpB (Rv0384c): 1; clpC (Rv3596c): 1; cyp121 (Rv2276): 2; cyp130 (Rv1256c): 2; cyp51 (Rv0764c): 2; cysH (Rv2392): 10; cysS (Rv2130c): 1; dacB2 (Rv2911): 1; dapA (Rv2753c): 12; deaD (Rv1253): 1;
    def (Rv0429c): 14; dfrA (Rv2763c): 3; dinG (Rv1329c): 1; dlaT (Rv2215): 2; dnaA (Rv0001): 1; dnaB (Rv0058): 1; dnaE2 (Rv3370c): 1; dprE1 (Rv3790): 8; dprE2: 1; drpE2 (Rv3791): 2; dxr (Rv2870c): 1; dxs1 (Rv2682C): 29; embA (Rv3794): 2; embB (Rv3795): 1; embC (Rv3793): 1; engA (Rv1713): 1; era (Rv2364c): 1; ethA (Rv3854c): 1; fabG (Rv0242c): 2; fabH (Rv0533): 48; fadD32 (Rv3801c): 5; fbpC (Rv0129C): 21; folP1 (Rv3608C): 1; folP2 (Rv1207): 1;
    frdA (Rv1552): 1; ftsZ (Rv2150c): 3; fusA1 (Rv0684): 3; fusA2 (Rv0120c): 3; glcB (Rv837c): 1; glf (Rv3809c): 40; glmU (Rv1018c): 1; guab2 (Rv3411): 1; gyrA (Rv0006): 24; gyrB (Rv0005): 9; ilvG (Rv1820): 1; infB (Rv2839c): 1; inhA (Rv1484): 157; kasA (Rv2245): 9; kasB (Rv2246): 5; ldtMt1 (Rv0116c): 4; ldtMt2 (Rv2518c): 1; lpd (Rv0462): 5; lppS (Rv2515c): 1; mbtA (Rv2384): 95; mca (Rv1082): 27; mfd (Rv1020): 1; mmpL3 (Rv0206c): 15; moeW (Rv2338c): 1;
    mshB (Rv1170): 4; murB (Rv0482): 1; murD (Rv2155c): 2; nadB (Rv1595): 1; ndhA (Rv0392c): 1; nrdR (Rv2718c): 2; pH Homeostasis: 5; panC (Rv3602c): 20; pks13 (Rv3800c): 3; proteasome: 2; ptpA (Rv2234): 38; ptpB (Rv0153c): 3; purU (Rv2964): 2; qcrB (Rv2196): 5; quinol oxidase: 1; recG (Rv2973c): 1; rplC (Rv0701): 2; rplJ (Rv0651): 3; rpoB (Rv0667): 4; sahH (Rv3248c): 2; thiL (Rv2977c): 106; tlyA (Rv1694): 2; tuf (Rv0685): 3; uvrA (Rv1638): 1.
  • 20. Open Extended Connectivity Fingerprints (ECFP_6, FCFP_6)
    • Collected, deduplicated, hashed
    • Sparse integers
    • Invented for Pipeline Pilot: public method, proprietary details
    • Often used with Bayesian models: many published papers
    • Built a new implementation: open source, Java, CDK
      - stable: fingerprints don't change with each new toolkit release
      - well defined: easy to document precise steps
      - easy to port: already migrated to iOS (Objective-C) for TB Mobile app
    • Provides core basis feature for CDD open source model service
    Clark et al., J Cheminformatics 6: 38 (2014)
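The "collected, deduplicated, hashed" idea on slide 20 can be illustrated with a toy circular fingerprint. This is only a sketch under stated assumptions, not the published ECFP algorithm or the CDK implementation: the atom invariants (element symbol and heavy-atom degree), the use of CRC32 as the hash, and the names `stable_hash` and `circular_fingerprint` are all chosen here for illustration. A diameter of 6 corresponds to three rounds of neighbourhood growth, which is what the "_6" suffix refers to.

```python
import zlib


def stable_hash(text):
    """Hash text to an unsigned 32-bit integer that is stable across runs."""
    return zlib.crc32(text.encode("ascii"))


def circular_fingerprint(symbols, bonds, diameter=6):
    """Toy circular fingerprint: a sorted list of unique 32-bit integers.

    symbols: element symbol per atom, e.g. ["C", "C", "O"]
    bonds:   (atom_index, atom_index, bond_order) tuples
    """
    neighbours = {i: [] for i in range(len(symbols))}
    for a, b, order in bonds:
        neighbours[a].append((order, b))
        neighbours[b].append((order, a))

    # Round 0: one identifier per atom from simple local properties (assumed).
    ids = [stable_hash(f"{s}:{len(neighbours[i])}") for i, s in enumerate(symbols)]
    collected = set(ids)  # the set performs the deduplication

    # Each round folds the neighbours' identifiers into a new identifier.
    for radius in range(1, diameter // 2 + 1):
        new_ids = []
        for i in range(len(symbols)):
            # Sorting makes the result independent of atom numbering.
            env = sorted((order, ids[j]) for order, j in neighbours[i])
            new_ids.append(stable_hash(f"{radius}|{ids[i]}|{env}"))
        ids = new_ids
        collected.update(ids)

    return sorted(collected)  # sparse integers, no fixed-length bit vector


# Ethanol as a heavy-atom graph: C-C-O with single bonds.
ethanol = circular_fingerprint(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
```

Because the hash does not depend on a toolkit's internal state, the same input graph always yields the same integers, which is the kind of stability the slide asks of a fingerprint.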
  • 21. Testing the fingerprints: comparing to published data
    Dataset | Leave one out ROC (published) | Reference | Leave one out ROC (this study)
    Combined model (5304 molecules), ECFP_6 fingerprints | N/A | N/A | 0.77
    Combined model (5304 molecules), FCFP_6 fingerprints | 0.71 | J Chem Inf Model 53:3054-3063 | 0.77
    MLSMR dual event model (2273 molecules), ECFP_6 fingerprints | N/A | N/A | 0.84
    MLSMR dual event model (2273 molecules), FCFP_6 fingerprints | 0.86 | PLOSONE 8:e63240 | 0.83
    Published models also include 8 additional descriptors as well as fingerprints. Clark et al., J Cheminformatics 6: 38 (2014)
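The leave-one-out ROC values on slide 21 can be read as the area under the ROC curve computed from scores that each molecule receives from a model trained without it. The sketch below shows one generic way to compute such a number. The function names and the `train`/`predict` callables are hypothetical placeholders, and nothing here reproduces the published models or their values.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve: the chance that a randomly chosen active
    scores higher than a randomly chosen inactive (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leave_one_out_scores(samples, labels, train, predict):
    """Score every sample with a model trained on all the other samples."""
    scores = []
    for i in range(len(samples)):
        model = train(samples[:i] + samples[i + 1:], labels[:i] + labels[i + 1:])
        scores.append(predict(model, samples[i]))
    return scores
```

The pairwise AUC is quadratic in the number of molecules, which is acceptable for datasets of a few thousand compounds; a rank-based formula would be the usual choice for larger sets.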
  • 22. Building Bayesian models for each target. Predictions for the InhA target: (a) the ROC curve with ECFP_6 and FCFP_6 fingerprints; (b) modified Bayesian estimators for active and inactive compounds; (c) structures of selected binders. For each listed target with at least two binders, it is first assumed that all of the molecules in the collection that do not indicate this as one of their targets are inactive. In the app we used ECFP_6 fingerprints.
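The per-target model building described on slide 22 can be sketched as follows. This assumes the Laplacian-corrected naive Bayesian estimator that is commonly paired with these fingerprints; the slide only says "modified Bayesian estimators", so the exact formula is an assumption, and the names `target_labels`, `train_bayesian` and `score` are hypothetical.

```python
import math
from collections import Counter


def target_labels(molecule_targets, target):
    """Label a molecule active only if it lists the target; every other
    molecule in the collection is assumed inactive, as on the slide."""
    return [target in targets for targets in molecule_targets]


def train_bayesian(fingerprints, is_active):
    """Return a per-feature log contribution for one target (assumed formula)."""
    total = len(fingerprints)
    base_rate = sum(1 for a in is_active if a) / total
    seen, seen_active = Counter(), Counter()
    for fp, active in zip(fingerprints, is_active):
        for bit in set(fp):
            seen[bit] += 1
            if active:
                seen_active[bit] += 1
    # Laplacian correction: rarely seen features are pulled toward zero.
    return {
        bit: math.log((seen_active[bit] + 1) / (seen[bit] * base_rate + 1))
        for bit in seen
    }


def score(model, fingerprint):
    """Sum the contributions of the features present; unseen ones add zero."""
    return sum(model.get(bit, 0.0) for bit in set(fingerprint))


# Made-up sparse fingerprints and target annotations, for illustration only.
fps = [{1, 2}, {1, 3}, {4, 5}, {4, 6}]
targets = [{"inhA"}, {"inhA"}, {"gyrA"}, set()]
inha_model = train_bayesian(fps, target_labels(targets, "inhA"))
```

Training one such model per target and ranking the targets by score for a query molecule is one plausible way to produce a target prediction list; a higher score means the molecule shares more features with known binders of that target.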
  • 23. Bayesian predictions, data export and clustering. [Screenshot labels: Predict targets; Cluster molecules; Open in MMDS.] Clark et al., J Cheminformatics 6: 38 (2014)
  • 24. Process used to evaluate TB Mobile
    • Draw structures either in app, paste or open from other apps e.g. MMDS
    • TB Mobile ranks content
    • TB Mobile can use built in target Bayesian models to predict target
    • Take a screenshot of results
    • Output Bayesian model predictions to MMDS
    • Compare to published data
    • Annotate results, tabulate
  • 25. Molecules active against Mtb evaluated in TB Mobile app. We have curated an additional set of 20 molecules that have activity against Mtb and were identified by HTS or other methods. Several targets were not in the database. Clark et al., J Cheminformatics 6: 38 (2014)
  • 26. What next?
    • Continue to update with more data
    • Outreach to increase awareness of app and data
    • Add machine learning algorithms to predict activity (in vitro and in vivo whole cell activity)
    • Could we appify similar target data for other neglected diseases/targets e.g. malaria
  • 27. Data sources and tools we could integrate: in vitro data; in vivo data; target data; ADME/Tox data & models; drug-like scaffold creation; TB prediction tools; TB publications.
  • 28. Benefits of creating TB Mobile
    • Exposure of CDD content from collaboration with SRI
    • More visibility for brand in new places
    • Experiment in small database with focus on content delivery
    • A functional app to reach scientists that may not have cheminformatics or bioinformatics training
    • Pushing the boundaries of what an app can do
  • 29. TB Mobile 2 is on iTunes and TB Mobile 1 is on Google Play; both are free. http://goo.gl/vPOKS http://goo.gl/iDJFR
  • 30. http://goo.gl/Goa4e TB mobile – find out more at www.scimobileapps.com
  • 31. Papers published on TB Mobile or using dataset
    Ekins et al., Tuberculosis 94: 162-169 (2014)
    Ekins et al., J Chem Inf Model 53: 3054 (2013)
    Clark et al., J Cheminformatics 6: 38 (2014)
    Ekins et al., J Cheminformatics 5: 13 (2013)
  • 32. You can find me @...
    PAPER ID: 22104 “Collaborative sharing of molecules and data in the mobile age” (final paper number: 43). DIVISION: COMP. DAY & TIME OF PRESENTATION: August 10, 2014 from 4:45 pm to 5:15 pm. LOCATION: Moscone Center, West Bldg., Room: 2005
    PAPER ID: 22094 “Expanding the metabolite mimic approach to identify hits for Mycobacterium tuberculosis” (final paper number: 78). DIVISION: COMP. DAY & TIME OF PRESENTATION: August 11, 2014 from 9:00 am to 9:30 am. LOCATION: Moscone Center, West Bldg., Room: 2005
    PAPER ID: 22120 “Why there needs to be open data for ultrarare and rare disease drug discovery” (final paper number: 48). DIVISION: CINF. DAY & TIME OF PRESENTATION: August 11, 2014 from 10:50 am to 11:20 am. LOCATION: Palace Hotel, Room: Marina
    PAPER ID: 22183 “Progress in computational toxicology” (final paper number: 125). DIVISION: TOXI. DAY & TIME OF PRESENTATION: August 12, 2014 from 6:30 pm to 10:30 pm. LOCATION: Moscone Center, North Bldg., Room: 134
    PAPER ID: 22091 “Examples of how to inspire the next generation to pursue computational chemistry/cheminformatics” (final paper number: 100). DIVISION: CINF (Division of Chemical Information). DAY & TIME OF PRESENTATION: August 13, 2014 from 8:25 am to 8:50 am. LOCATION: Palace Hotel, Room: Presidio
    PAPER ID: 22176 “Applying computational models for transporters to predict toxicity” (final paper number: 132). DIVISION: TOXI. DAY & TIME OF PRESENTATION: August 13, 2014 from 9:45 am to 10:05 am. LOCATION: InterContinental San Francisco, Room: Grand Ballroom A
    PAPER ID: 22186 “New target prediction and visualization tools incorporating open source molecular fingerprints for TB mobile version 2” (final paper number: 123). DIVISION: CINF. DAY & TIME OF PRESENTATION: August 13, 2014 from 1:35 pm to 2:05 pm. LOCATION: Palace Hotel, Room: California Parlor
  • 33. All at CDD, SRI, IDRI and many others … Funding: 2R42AI088893-02 NIAID, NIH; 9R44TR000942-02 NCATS, NIH; CDD TB has been developed thanks to funding from the Bill and Melinda Gates Foundation (Grant#49852)
  • 34. Contact
    • Email: ekinssean@yahoo.com
    • Slideshare: http://www.slideshare.net/ekinssean
    • Twitter: collabchem
    • Blog: http://www.collabchem.com/
    • Website: http://www.collaborations.com/CHEMISTRY.HTM