Collaboration in Pharmaceutical Research: From Neglected Diseases to ADME/Tox Sean Ekins Collaborations in Chemistry, Fuqu...
In the long history of human kind (and animal kind, too) those who have learned to collaborate and improvise most effectiv...
Outline <ul><li>Introduction </li></ul><ul><li>Collaborative Drug Discovery </li></ul><ul><li>TB Collaborations and Drug D...
Open Innovation   Open innovation is a paradigm that assumes that firms can and should use external ideas as well as inter...
How to do it better?  What can we do with software to facilitate it ? The future is more collaborative We have tools but n...
Major collaborative grants in EU: Framework, IMI …NIH moving in same direction? Cross continent collaboration CROs in Chin...
Hardware is getting smaller 1930’s 1980s 1990s Room size Desktop size Not to scale and not equivalent computing power – il...
Models and software becoming more accessible- free, precompetitive efforts - collaboration Free tools are proliferating
Typical Lab:  The Data Explosion Problem & Collaborations DDT  Feb 2009
Collaborative Drug Discovery Platform <ul><ul><li>CDD Vault –  Secure web-based place for private data – private by defaul...
 
CDD:  Single Click to Key Functionality
CDD:  Mining across projects and datasets
<ul><li>Tuberculosis Kills 1.6-1.7m/yr (~1 every 8 seconds) </li></ul><ul><li>1/3 rd  of worlds population infected!!!! </...
~ 20 public datasets  for TB Including Novartis data on TB hits >300,000 cpds Patents, Papers Annotated by CDD Open to bro...
CDD is a partner on a 5 year project supporting >20 labs and providing cheminformatics support  www.mm4tb.org More Medicin...
Ekins et al, Trends in Microbiology  19: 65-74, 2011   Fitting into the drug discovery process
Searching for TB molecular mimics; collaboration Lamichhane G, et al Mbio, 2: e00301-10, 2011  Modeling – CDD Biology – Jo...
Simple descriptor analysis on > 300,000 compounds tested vs TB  Dataset  MWT logP HBD HBA RO 5 Atom count PSA RBN MLSMR  A...
Novartis aerobic and anaerobic TB hits Anaerobic compounds showed statistically different and higher mean descriptor prope...
Bayesian machine learning Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010 Bayesian classification is a simpl...
Bayesian Classification Models for TB Good Bad active compounds with MIC < 5uM Laplacian-corrected Bayesian classifier mod...
Bayesian Classification Dose response Good Bad Ekins et al., Mol BioSyst, 6: 840-851, 2010
Bayesian Classification TB Models Leave out 50% x 100 Ekins et al., Mol BioSyst, 6: 840-851, 2010  Dateset  (number of mol...
100K library    Novartis Data   FDA drugs  External Test sets Suggests models can predict data from the same and independe...
<ul><li>Bayesian Models  </li></ul><ul><li>  </li></ul><ul><li>Generated with kinase data [1] - - (blind testing of previo...
  Models with SRI kinase library data Model 1 ROC XV AUC  (N 23797)  = 0.89 Model 2   (N 1248)   = 0.72 Model 3  (N 1248) ...
Bayesian Classification TB Models Ekins et al., Mol BioSyst, 6: 840-851, 2010  Single pt  ROC XV AUC  = 0.88 Dose resp   =...
<ul><li>Combining cheminformatics methods and pathway analysis </li></ul><ul><li>Identified essential TB targets that had ...
Malaria data in CDD > 22,000 compounds Including datasets from Dr.  Guy’s group Ekins, Hohman and Bunin in: Collaborative ...
http://www.slideshare.net/ekinssean Ekins S and Williams AJ, MedChemComm,  1: 325-330, 2010. Analysis of malaria and TB da...
Multiple antimalarial datasets Ekins and Williams Drug Disc Today 15; 812-815, 2010  Ekins and Williams, MedChemComm, 1: 3...
Antimalarial Compound libraries and filter failures Ekins and Williams Drug Disc Today 15; 812-815, 2010  Filtering using ...
TB Compound libraries and filter failures Filtering using SMARTs filters to remove thiol reactives, false positives etc  a...
Correlation between the number of SMARTS filter failures and the number of Lipinski violations for different types of rule...
Summary Computational models based on Whole cell TB data could improve efficiency of screening Collaborations get us to in...
Could all pharmas share their data as models with each other? Increasing Data & Model Access Ekins and Williams, Lab On A ...
The big idea <ul><li>Challenge..There is limited access to ADME/Tox data and models needed for R&D </li></ul><ul><li>How c...
<ul><li>What can be developed with very large training and test sets? </li></ul><ul><li>HLM training 50,000 testing 25,000...
Massive Human liver microsomal stability model PCA of training (red) and test (blue) compounds Overlap in Chemistry space ...
RRCK Permeability and MDR Open descriptors results almost identical to commercial descriptors Across many datasets and qua...
Merck KGaA  Combining models may give greater coverage of ADME/ Tox chemistry space and improve predictions? Model coverag...
Next steps <ul><li>ADME/Tox Data crosses diseases </li></ul><ul><li>Potential to share models selectively with collaborato...
<ul><li>Open source software for molecular descriptors and algorithms </li></ul><ul><li>Spend only a fraction of the money...
Bunin & Ekins DDT 16: 643-645, 2011  A complex ecosystem of collaborations: A new business model Inside Company Collaborat...
Finding Promiscuous Old Drugs for New Uses <ul><li>Research published in the last six years - 34 studies - Screened librar...
Finding Promiscuous Old Drugs for New Uses <ul><li>109 molecules were identified by screening in vitro  </li></ul><ul><li>...
2D Similarity search with “hit” from screening  Export database and use for 3D searching with a pharmacophore or other mod...
Crowdsourcing Project “Off the Shelf R&D” All pharmas have assets on shelf that reached clinic “ Off the Shelf R&D”  Get t...
Tools for Open Science <ul><li>Blogs </li></ul><ul><li>Wikis </li></ul><ul><li>Databases </li></ul><ul><li>Journals </li><...
2020: A Drug Discovery Odyssey Could our Pharma R&D look like this Massive collaboration networks – software enabled. We a...
Example of Social Collaboration in Science: Tweets, Blog Lead to The Green Solvents App I attend seminar on solvent select...
<ul><li>Make science more accessible = >communication </li></ul><ul><li>Mobile – take a phone into field /lab and do scien...
www.scimobileapps.com How do you find scientific mobile Apps ? Development of Wiki’s to track developments in tools..
Acknowledgments <ul><li>Rishi Gupta, Eric Gifford, Ted Liston, Chris Waller (Pfizer) </li></ul><ul><li>Antony J. Williams ...

Slides for st judes


Talk given at St Jude Children's Research Hospital 18 November

  • CDD Experienced Team Innovates and Executes
    Barry Bunin, PhD (Pres. & Cofounder as first Eli Lilly EIR): Libraria (CEO, Pres.-CSO), Arris Pharmaceuticals (Sr. Scientist), Genentech, UC Berkeley (Ellman), Columbia University, author.
    Moses Hohman, PhD (Director Software Engineering): Northwestern Assoc. Director of Bioinformatics, Thoughtworks, Inc., U of Chicago (PhD), Harvard (magna cum laude, Physics)
    Sylvia Ernst, PhD (Director Community Growth & Sales): Left 800-lb Gorillas: Accelrys-Scitegic, MDL-Elsevier-Beilstein
    Peter Cohan (BOD & Overall Sales Strategy): Symyx (VP Bus Dev & President-Discovery Tools), MDL (VP Customer Marketing), www.secondderivative.com, author.
    Omidyar Network, Founders Fund, & Lilly (BOD observers)
    WSGR (Corporate Counsel), Rina Accountancy (GAAP compliance)
    Partners: Hub Consortium Members, ChemAxon, DNDi, MMV, Sandler Center…
    CDD SAB: Christopher Lipinski PhD, James McKerrow, MD PhD, David Roos PhD, Adam Renslo PhD, Wes Van Voorhis, MD PhD
  • Added Massive collaboration networks – software enabled. We are in “Generation App”. Crowdsourcing will have a role in R&D. Drug discovery possible by anyone with “app access”

    1. 1. Collaboration in Pharmaceutical Research: From Neglected Diseases to ADME/Tox Sean Ekins Collaborations in Chemistry, Fuquay Varina, NC. Collaborative Drug Discovery, Burlingame, CA. Department of Pharmacology, University of Medicine & Dentistry of New Jersey-Robert Wood Johnson Medical School, Piscataway, NJ. School of Pharmacy, Department of Pharmaceutical Sciences, University of Maryland, Baltimore, MD.
    2. 2. In the long history of human kind (and animal kind, too) those who have learned to collaborate and improvise most effectively have prevailed. Charles Darwin
    3. 3. Outline
        • Introduction
        • Collaborative Drug Discovery
        • TB Collaborations and Drug Discovery Research
        • Open ADME Models
        • Repurposing FDA approved drugs
        • The Future – Mobile Apps for Drug Discovery
    4. 4. Some Definitions
        • Open Innovation: Open innovation is a paradigm that assumes that firms can and should use external ideas as well as internal ideas, and internal and external paths to market, as the firms look to advance their technology. Chesbrough, H.W. (2003). Open Innovation: The new imperative for creating and profiting from technology. Boston: Harvard Business School Press, p. xxiv
        • Collaborative Innovation: A strategy in which groups partner to create a product and drive the efficient allocation of R&D resources. By collaborating with outsiders, including customers, vendors and even competitors, a company is able to import lower-cost, higher-quality ideas from the best sources in the world. e.g. Innocentive, crowdsourcing
        • Open Source: Companies can donate their patents to an independent organization, put them in a common pool or grant unlimited license use to anybody. e.g. GSK malaria data, Novartis TB data
    5. 5. How to do it better? What can we do with software to facilitate it? The future is more collaborative. We have tools but need integration.
        • Groups involved span the spectrum from pharma and academia to not-for-profit and government
        • More free, open technologies to enable biomedical research
        • Precompetitive organizations, consortia..
        A starting point for collaboration
        A core root of the current inefficiencies in drug discovery is the barriers that keep organizations and individuals from collaborating effectively. Bunin & Ekins DDT 16: 643-645, 2011
    6. 6. Collaboration is everywhere
        • Major collaborative grants in EU: Framework, IMI … NIH moving in same direction?
        • Cross continent collaboration: CROs in China, India etc – Pharmas in US / Europe
        • More industry – academia collaboration; ‘not invented here’ a thing of the past
        • More effort to go after rare and neglected diseases
        • Globalization and connectivity of scientists will be key
        • Current pace of change in pharma may not be enough. Need to rethink how we use all technologies & resources…
    7. 7. Hardware is getting smaller
        Figure timeline labels: 1930s, 1980s, 1990s, 2000s
        Figure device labels: room size, desktop size, laptop, netbook, phone, watch
        Not to scale and not equivalent computing power – illustrates mobility
    8. 8. Models and software becoming more accessible- free, precompetitive efforts - collaboration Free tools are proliferating
    9. 9. Typical Lab: The Data Explosion Problem & Collaborations DDT Feb 2009
    10. 10. Collaborative Drug Discovery Platform
        • CDD Vault – Secure web-based place for private data – private by default
        • CDD Collaborate – Selectively share subsets of data
        • CDD Public – public data sets – Over 3 million compounds, with molecular properties, similarity and substructure searching, data plotting etc
            • will host datasets from companies, foundations etc
            • vendor libraries (Asinex, TimTec, ChemBridge)
        • Unique to CDD – simultaneously query your private data, collaborators’ data, & public data; easy GUI
        www.collaborativedrug.com
    11. 12. CDD: Single Click to Key Functionality
    12. 13. CDD: Mining across projects and datasets
    13. 14. Applying CDD to Build a disease community for TB
        • Tuberculosis kills 1.6-1.7 million people per year (~1 every 20 seconds)
        • One third of the world's population is infected
        • Multi-drug resistance in 4.3% of cases
        • Extensively drug-resistant TB is increasing in incidence
        • No new drugs in over 40 years
        • Drug-drug interactions and co-morbidity with HIV
        • Collaboration between groups is rare
        • These groups may work on existing or new targets
        • Use of computational methods with TB is rare
        • Literature TB data is not well collated (SAR)
        • Funded by Bill and Melinda Gates Foundation
    14. 15. ~ 20 public datasets for TB
        • Including Novartis data on TB hits
        • >300,000 cpds
        • Patents, Papers
        • Annotated by CDD
        • Open to browse by anyone: http://www.collaborativedrug.com/register
        Molecules with activity against
    15. 16. CDD is a partner on a 5 year project supporting >20 labs and providing cheminformatics support www.mm4tb.org More Medicines for Tuberculosis
    16. 17. Ekins et al, Trends in Microbiology 19: 65-74, 2011 Fitting into the drug discovery process
    17. 18. Searching for TB molecular mimics; collaboration
        Modeling – CDD; Biology – Johns Hopkins; Chemistry – Texas A&M
        Lamichhane G, et al Mbio, 2: e00301-10, 2011
    18. 19. Simple descriptor analysis on > 300,000 compounds tested vs TB
        | Dataset | MWT | logP | HBD | HBA | RO5 | Atom count | PSA | RBN |
        |---|---|---|---|---|---|---|---|---|
        | MLSMR, active: ≥ 90% inhibition at 10 uM (N = 4096) | 357.10 (84.70) | 3.58 (1.39) | 1.16 (0.93) | 4.89 (1.94) | 0.20 (0.48) | 42.99 (12.70) | 83.46 (34.31) | 4.85 (2.43) |
        | MLSMR, inactive: < 90% inhibition at 10 uM (N = 216367) | 350.15 (77.98)** | 2.82 (1.44)** | 1.14 (0.88) | 4.86 (1.77) | 0.09 (0.31)** | 43.38 (10.73) | 85.06 (32.08)* | 4.91 (2.35) |
        | TAACF-NIAID CB2, active: ≥ 90% inhibition at 10 uM (N = 1702) | 349.58 (63.82) | 4.04 (1.02) | 0.98 (0.84) | 4.18 (1.66) | 0.19 (0.40) | 41.88 (9.44) | 70.28 (29.55) | 4.76 (1.99) |
        | TAACF-NIAID CB2, inactive: < 90% inhibition at 10 uM (N = 100,931) | 352.59 (70.87) | 3.38 (1.36)** | 1.11 (0.82)** | 4.24 (1.58) | 0.12 (0.34)** | 42.43 (8.94)* | 77.75 (30.17)** | 4.72 (1.99) |
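The asterisks in the table above flag descriptors that differ between actives and inactives. The following is a minimal sketch of how such a difference could be checked from the summary values alone. It assumes the figures in parentheses are standard deviations and uses Welch's t-statistic, which may not be the test used for the slide; `welch_t` is a hypothetical helper name.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t-statistic from summary statistics.

    Assumption: sd1 and sd2 are standard deviations (the slide does not say).
    """
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# logP for MLSMR actives vs inactives, values copied from the table.
t_logp = welch_t(3.58, 1.39, 4096, 2.82, 1.44, 216367)
print(f"logP t-statistic: {t_logp:.1f}")
```

With groups this large, even small differences in means give large t-statistics, so effect size matters as much as significance when reading the table.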
    19. 20. Novartis aerobic and anaerobic TB hits Anaerobic compounds showed statistically different and higher mean descriptor property values compared with the aerobic hits (e.g. molecular weight, logP, hydrogen bond donor, hydrogen bond acceptor, polar surface area and rotatable bond number) The mean molecular properties for the Novartis compounds are in a similar range to the MLSMR and TAACF-NIAID CB2 hits Ekins and Freundlich, Pharm Res, 28, 1859-1869, 2011.
    20. 21. Bayesian machine learning
        Bayesian classification is a simple probabilistic classification model. It is based on Bayes’ theorem:
        $$p(h \mid d) = \frac{p(d \mid h)\, p(h)}{p(d)}$$
        • h is the hypothesis or model
        • d is the observed data
        • p(h) is the prior belief (probability of hypothesis h before observing any data)
        • p(d) is the data evidence (marginal probability of the data)
        • p(d|h) is the likelihood (probability of data d if hypothesis h is true)
        • p(h|d) is the posterior probability (probability of hypothesis h being true given the observed data d)
        A weight is calculated for each feature using a Laplacian-adjusted probability estimate to account for the different sampling frequencies of different features. The weights are summed to provide a probability estimate.
        Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
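The per-feature weighting described on this slide can be sketched as below. This is one common formulation of the Laplacian correction, assumed here rather than taken from the slides: for a feature seen in `n_f` compounds, `a_f` of them active, with overall active rate `p`, the weight is `log((a_f + 1) / (n_f * p + 1))`. The function names and the toy feature sets are hypothetical, and the commercial implementation used in the cited work may differ in detail.

```python
import math
from collections import Counter


def train_laplacian_bayes(samples):
    """Return one weight per feature.

    samples: list of (features, is_active), where features is an iterable of
    hashable ids (for example fingerprint bits). Must not be empty.
    """
    base_rate = sum(1 for _, active in samples if active) / len(samples)
    seen = Counter()
    seen_active = Counter()
    for features, active in samples:
        for feature in set(features):
            seen[feature] += 1
            if active:
                seen_active[feature] += 1
    # Assumed Laplacian-corrected estimate: rarely seen features are pulled
    # towards a weight of zero, i.e. towards the base rate.
    return {
        feature: math.log((seen_active[feature] + 1) / (seen[feature] * base_rate + 1))
        for feature in seen
    }


def score(weights, features):
    """Sum the weights of the features present; unseen features add nothing."""
    return sum(weights.get(feature, 0.0) for feature in set(features))


# Hypothetical toy data: letters stand in for fingerprint features.
toy = [
    ({"a", "b"}, True),
    ({"a", "c"}, True),
    ({"b", "c"}, False),
    ({"c", "d"}, False),
    ({"d"}, False),
    ({"c"}, False),
]
weights = train_laplacian_bayes(toy)
print(score(weights, {"a", "b"}), score(weights, {"c", "d"}))
```

The summed score is a relative ranking value rather than a calibrated probability, so a cut-off has to be chosen from validation data.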
    21. 22. Bayesian Classification Models for TB
        Figure panels: Good, Bad
        Active compounds have MIC < 5 uM. Laplacian-corrected Bayesian classifier models were generated using FCFP-6 and simple descriptors. Two models: 220,000 and >2000 compounds.
        Ekins et al., Mol BioSyst, 6: 840-851, 2010
    22. 23. Bayesian Classification Dose response Good Bad Ekins et al., Mol BioSyst, 6: 840-851, 2010
    23. 24. Bayesian Classification TB Models
        Leave out 50% x 100
        | Dataset (number of molecules) | External ROC Score | Internal ROC Score | Concordance | Specificity | Sensitivity |
        |---|---|---|---|---|---|
        | MLSMR All single point screen (N = 220463) | 0.86 ± 0 | 0.86 ± 0 | 78.56 ± 1.86 | 78.59 ± 1.94 | 77.13 ± 2.26 |
        | MLSMR dose response set (N = 2273) | 0.73 ± 0.01 | 0.75 ± 0.01 | 66.85 ± 4.06 | 67.21 ± 7.05 | 65.47 ± 7.96 |
        Ekins et al., Mol BioSyst, 6: 840-851, 2010
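"Leave out 50% x 100" is read here as: hold out a random half of the data, train on the other half, score the held-out half, and repeat 100 times. The sketch below follows that reading, which is an assumption about the protocol. `roc_auc` and `leave_out_half` are hypothetical helper names, and `fit` and `predict` stand in for any model.

```python
import random
import statistics


def roc_auc(labels, scores):
    """ROC AUC as the chance that a random active outscores a random inactive."""
    actives = [s for y, s in zip(labels, scores) if y]
    inactives = [s for y, s in zip(labels, scores) if not y]
    if not actives or not inactives:
        raise ValueError("both classes are needed to compute an AUC")
    wins = sum((a > i) + 0.5 * (a == i) for a in actives for i in inactives)
    return wins / (len(actives) * len(inactives))


def leave_out_half(data, fit, predict, repeats=100, seed=0):
    """Repeated random 50% hold-out; returns mean and SD of the test AUC.

    data: list of (x, is_active). Splits whose test half holds only one
    class are skipped.
    """
    rng = random.Random(seed)
    aucs = []
    for _ in range(repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train, test = shuffled[:half], shuffled[half:]
        labels = [y for _, y in test]
        if len(set(labels)) < 2:
            continue
        model = fit(train)
        scores = [predict(model, x) for x, _ in test]
        aucs.append(roc_auc(labels, scores))
    return statistics.mean(aucs), statistics.stdev(aucs)


# Hypothetical toy problem: the input value itself is a perfect score.
toy = [(i, i >= 20) for i in range(40)]
mean_auc, sd_auc = leave_out_half(toy, fit=lambda train: None, predict=lambda model, x: x)
print(mean_auc, sd_auc)
```

The pairwise AUC is quadratic in the number of compounds, so a rank-based version would be needed for sets as large as those in the table.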
    24. 25. External Test sets
        Figure panels: 100K library; Novartis Data; FDA drugs
        • Suggests models can predict data from the same and independent labs
        • Initial enrichment – enables screening few compounds to find actives
        Hit counts: 21 hits in 2108 cpds; 34 hits in 248 cpds; 1702 hits in >100K cpds
        Ekins and Freundlich, Pharm Res, 28, 1859-1869, 2011. Ekins et al., Mol BioSyst, 6: 840-851, 2010
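Initial enrichment can be expressed as an enrichment factor: the hit rate among the top-ranked compounds divided by the hit rate of the whole set. The sketch below uses that standard definition, which is assumed rather than confirmed to be the measure behind the slide. The ranked list is hypothetical toy data; only the two hit counts are taken from the slide.

```python
def hit_rate(hits, compounds):
    """Fraction of compounds that are hits."""
    return hits / compounds


def enrichment_factor(ranked_labels, top_fraction):
    """Hit rate in the top fraction relative to the overall hit rate.

    ranked_labels: activity flags sorted by model score, best first.
    The list must contain at least one active.
    """
    n_top = max(1, int(len(ranked_labels) * top_fraction))
    top_rate = sum(ranked_labels[:n_top]) / n_top
    overall_rate = sum(ranked_labels) / len(ranked_labels)
    return top_rate / overall_rate


# Baseline hit rates for two of the test sets quoted on the slide.
print(f"{hit_rate(21, 2108):.2%}", f"{hit_rate(34, 248):.2%}")

# Hypothetical ranking: 100 compounds, 10 actives, 5 of them in the top 10.
ranked = [True] * 5 + [False] * 5 + [True] * 5 + [False] * 85
ef_top10 = enrichment_factor(ranked, 0.10)
print(ef_top10)
```

An enrichment factor of 1 means the ranking is no better than random picking.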
    25. 26. Models with SRI kinase library data
        Bayesian Models
        • Generated with kinase data [1] (blind testing of previous models showed 3-4 fold enrichment)
        • Models were built as described previously [2]
        1. Data for single point screening (cut off for activity: % inhibition at 10 uM ≥ 90%)
        2. IC50 data: cut off for active ≤ 5 uM
        3. IC90 data: cut off for active ≤ 10 uM and Vero cell selectivity index ≥ 10
        [1] Reynolds RC, et al. Tuberculosis (Edinburgh, Scotland) 2011 In Press.
        [2] Ekins S, et al., Mol BioSystems 2010;6:840-51.
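The three activity cut-offs on this slide can be written as simple labelling rules. This is a sketch only: the function names are hypothetical, and the direction of the IC50 and IC90 comparisons (active at or below the cut-off) is an assumption, since the comparison symbols in the transcript are garbled.

```python
def is_active_single_point(percent_inhibition_at_10um):
    """Model 1 label: at least 90% inhibition at 10 uM."""
    return percent_inhibition_at_10um >= 90.0


def is_active_ic50(ic50_um):
    """Model 2 label. Assumption: active means IC50 at or below 5 uM."""
    return ic50_um <= 5.0


def is_active_ic90(ic90_um, selectivity_index):
    """Model 3 label. Assumption: IC90 at or below 10 uM, with a Vero cell
    selectivity index of at least 10."""
    return ic90_um <= 10.0 and selectivity_index >= 10.0


# Hypothetical compound: potent, but not selective enough for the third rule.
print(is_active_single_point(95.0), is_active_ic50(2.5), is_active_ic90(8.0, 4.0))
```

Folding the selectivity index into the label is what lets the third model take cytotoxicity into account without a separate toxicity model.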
    26. 27. Models with SRI kinase library data
        ROC XV AUC: Model 1 (N 23797) = 0.89; Model 2 (N 1248) = 0.72; Model 3 (N 1248) = 0.77
        Adding cytotoxicity data improves models
        Leave out 50% x 100
        | Dataset (number of molecules) | External ROC Score | Internal ROC Score | Concordance | Specificity | Sensitivity |
        |---|---|---|---|---|---|
        | Model 1 (N = 23797) | 0.87 ± 0 | 0.88 ± 0 | 76.77 ± 2.14 | 76.49 ± 2.41 | 81.7 ± 2.96 |
        | Model 2 (N = 1248) | 0.65 ± 0.01 | 0.70 ± 0.01 | 61.58 ± 1.56 | 61.85 ± 8.45 | 61.30 ± 8.24 |
        | Model 3 (N = 1248) | 0.74 ± 0.02 | 0.75 ± 0.02 | 68.67 ± 6.88 | 69.28 ± 9.84 | 64.84 ± 12.11 |
    27. 28. Bayesian Classification TB Models
        ROC XV AUC: Single pt = 0.88; Dose resp = 0.78; Dose resp + cyto = 0.86
        Leave out 50% x 100
        | Dataset (number of molecules) | External ROC Score | Internal ROC Score | Concordance | Specificity | Sensitivity |
        |---|---|---|---|---|---|
        | MLSMR All single point screen (N = 220463) | 0.86 ± 0 | 0.86 ± 0 | 78.56 ± 1.86 | 78.59 ± 1.94 | 77.13 ± 2.26 |
        | MLSMR dose response set (N = 2273) | 0.73 ± 0.01 | 0.75 ± 0.01 | 66.85 ± 4.06 | 67.21 ± 7.05 | 65.47 ± 7.96 |
        | NEW Dose resp and cytotoxicity (N = 2273) | 0.82 ± 0.02 | 0.84 ± 0.02 | 82.61 ± 4.68 | 83.91 ± 5.48 | 65.99 ± 7.47 |
        Ekins et al., Mol BioSyst, 6: 840-851, 2010
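The concordance, specificity and sensitivity columns can all be derived from a confusion matrix. The sketch below assumes concordance means overall accuracy and that all three are reported as percentages, which matches the scale of the table but is not stated on the slide. `classification_metrics` is a hypothetical helper, and the labels are toy data.

```python
def classification_metrics(labels, predictions):
    """Concordance, sensitivity and specificity, each as a percentage.

    Both classes must be present in labels.
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)
    tn = sum(1 for y, p in pairs if not y and not p)
    fp = sum(1 for y, p in pairs if not y and p)
    fn = sum(1 for y, p in pairs if y and not p)
    return {
        "concordance": 100.0 * (tp + tn) / len(pairs),  # assumed: overall accuracy
        "sensitivity": 100.0 * tp / (tp + fn),  # actives correctly recovered
        "specificity": 100.0 * tn / (tn + fp),  # inactives correctly rejected
    }


# Hypothetical toy predictions: 3 actives and 5 inactives.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_predictions = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(toy_labels, toy_predictions)
print(metrics)
```

When actives are rare, concordance is dominated by specificity, which is why the table reports sensitivity separately.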
    28. 29. Phase I STTR - NIAID funded collaboration with Stanford Research International
        • Combining cheminformatics methods and pathway analysis
        • Identified essential TB targets that had not been exploited
        • Used resources available to both to identify targets and molecules that mimic substrates
        • Computationally searched >80,000 molecules and tested 23 compounds in vitro (3 picked as inactives), which led to 2 proposed as mimics of D-fructose 1,6 bisphosphate (MIC of 20 and 40 ug/ml)
        • POC took < 6 months; submitted phase II STTR, submitted manuscript
        • Still need to test vs target to verify it hits the suggested target
        Ekins et al, Trends in Microbiology Feb 2011. Sarker et al, submitted 2011
    29. 30. Other datasets
        Malaria data in CDD: > 22,000 compounds, including datasets from Dr. Guy’s group
        Ekins, Hohman and Bunin in: Collaborative Computational Technologies for Biomedical Research, edited by Sean Ekins, Maggie A. Z. Hupcey, Antony J. Williams. Published 2011 by John Wiley & Sons, Inc
    30. 31. http://www.slideshare.net/ekinssean Ekins S and Williams AJ, MedChemComm, 1: 325-330, 2010. Analysis of malaria and TB datasets
    31. 32. Multiple antimalarial datasets
        Screening hits in total are not ‘lead-like’ (MW < 350, LogP < 3) and are closest to ‘natural product lead-like’. Although GSK suggests that the compounds are “drug-like”, the evidence for this is weak.
        | Dataset | MW | logP | HBD | HBA | Lipinski rule of 5 alerts | PSA (Å²) | RBN |
        |---|---|---|---|---|---|---|---|
        | GSK data (N = 13,471) | 478.2 ± 114.3 | 4.5 ± 1.6 | 1.8 ± 1.0 | 5.6 ± 2.0 | 0.8 ± 0.8 | 76.8 ± 30.0 | 7.2 ± 3.4 |
        | St Jude (N = 1524) | 385.3 ± 71.2 | 3.8 ± 1.6 | 1.1 ± 0.8 | 4.9 ± 1.8 | 0.2 ± 0.4 | 72.2 ± 29.3 | 5.2 ± 2.3 |
        | Novartis (N = 5695) | 398.2 ± 105.3 | 3.7 ± 2.0 | 1.2 ± 1.1 | 4.7 ± 2.1 | 0.4 ± 0.7 | 74.7 ± 37.9 | 5.6 ± 3.0 |
        | Johns Hopkins All FDA drugs (N = 2615) | 349.1 ± 355.8 | 1.2 ± 3.4 | 2.4 ± 4.6 | 5.1 ± 5.5 | 0.3 ± 0.8 | 96.0 ± 139.8 | 5.4 ± 9.6 |
        | Johns Hopkins Subset > 50% malaria inhibition at 96h (N = 165) | 458.0 ± 298.6 | 2.2 ± 2.7 | 2.1 ± 3.4 | 5.4 ± 4.7 | 0.6 ± 0.9 | 90.6 ± 104.4 | 7.1 ± 7.7 |
        | Antimalarial drugs (N = 14) | 341.6 ± 67.0 | 3.8 ± 1.6 | 1.8 ± 1.0 | 5.3 ± 1.5 | 0.2 ± 0.6 | 53.4 ± 21.2 | 5.8 ± 3.0 |
        Ekins and Williams, Drug Disc Today 15: 812-815, 2010. Ekins and Williams, MedChemComm, 1: 325-330, 2010.
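The ‘lead-like’ criterion quoted on this slide (MW < 350, LogP < 3) and the Lipinski rule of 5 are both simple threshold checks. The sketch below applies the lead-like check to the mean MW and logP of three screening sets copied from the table. Applying a per-compound rule to dataset means is only illustrative, and the function names are hypothetical.

```python
def is_lead_like(mw, logp):
    """Lead-like criterion as quoted on the slide: MW < 350 and logP < 3."""
    return mw < 350 and logp < 3


def ro5_violations(mw, logp, hbd, hba):
    """Count Lipinski rule-of-5 violations (MW > 500, logP > 5, HBD > 5, HBA > 10)."""
    return sum([mw > 500, logp > 5, hbd > 5, hba > 10])


# Mean (MW, logP) of three screening hit sets, copied from the table.
screening_hit_means = {
    "GSK data": (478.2, 4.5),
    "St Jude": (385.3, 3.8),
    "Novartis": (398.2, 3.7),
}
lead_like_by_set = {name: is_lead_like(mw, logp) for name, (mw, logp) in screening_hit_means.items()}
print(lead_like_by_set)
```

A per-compound analysis would apply the same two functions to every structure and report the fraction passing, which is closer to the rule-of-5 alerts column.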
    32. 33. Antimalarial Compound libraries and filter failures
        Filtering using SMARTS filters to remove thiol reactives, false positives etc at University of New Mexico (http://pasilla.health.unm.edu/tomcat/biocomp/smartsfilter)
        Figure axis: % Failure
        Ekins and Williams, Drug Disc Today 15: 812-815, 2010
    33. 34. TB Compound libraries and filter failures
        Filtering using SMARTS filters to remove thiol reactives, false positives etc at University of New Mexico (http://pasilla.health.unm.edu/tomcat/biocomp/smartsfilter)
        Ekins et al., Mol Biosyst, 6: 2316-2324, 2010
    34. 35. Correlations
        Correlation between the number of SMARTS filter failures and the number of Lipinski violations for different types of rule sets with the FDA drug set from CDD (N = 2804)
        Suggests the number of Lipinski violations may also be an indicator of undesirable chemical features that result in reactivity
        Ekins and Freundlich, Pharm Res, 28, 1859-1869, 2011.
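A correlation of this kind could be computed per compound as sketched below. The slide does not say which correlation coefficient was used, so Pearson's r is an assumption here. `pearson_r` is a hypothetical helper, and the counts are made-up toy values, not data from the FDA drug set.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical per-compound counts, for illustration only.
lipinski_violations = [0, 0, 1, 1, 2, 3]
smarts_filter_failures = [0, 1, 1, 2, 2, 4]
r_toy = pearson_r(lipinski_violations, smarts_filter_failures)
print(f"r = {r_toy:.2f}")
```

Because both variables are small integer counts with many ties, a rank correlation may be the more robust choice on real data.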
    35. 36. Summary. Computational models based on whole-cell TB data could improve the efficiency of screening. Collaborations get us to interesting compounds quickly. Availability of datasets enables analysis that could suggest simple rules. Active compounds vs Mtb and P. falciparum have higher mean molecular weights and logP values. A high proportion of compounds fail the Abbott filters for reactivity when compared to drugs and antimalarials. Understanding the chemical properties and characteristics of compounds = better compounds for lead optimization. St Jude and Novartis datasets should be screened vs Mtb as their property space is close to TB actives. Rare and neglected disease researchers lack ADME/Tox insights.
    36. 37. Could all pharmas share their data as models with each other? Increasing Data & Model Access Ekins and Williams, Lab On A Chip, 10: 13-22, 2010.
    37. 38. The big idea <ul><li>Challenge: there is limited access to ADME/Tox data and models needed for R&D </li></ul><ul><li>How could a company share data but keep the structures proprietary? </li></ul><ul><li>Sharing models means both parties use costly software </li></ul><ul><li>What about open source tools? </li></ul><ul><li>Pfizer had never considered this, so we proposed a study and Rishi Gupta generated models </li></ul>
    38. 39. <ul><li>What can be developed with very large training and test sets? </li></ul><ul><li>HLM training 50,000 testing 25,000 molecules </li></ul><ul><li>training 194,000 and testing 39,000 </li></ul><ul><li>MDCK training 25,000 testing 25,000 </li></ul><ul><li>MDR training 25,000 testing 18,400 </li></ul><ul><li>Open molecular descriptors / models vs commercial descriptors </li></ul>Gupta RR, et al., Drug Metab Dispos, 38: 2083-2090, 2010 Open source tools for modeling
    39. 40. Massive human liver microsomal (HLM) stability model. PCA of training (red) and test (blue) compounds: overlap in chemistry space. Gupta RR, et al., Drug Metab Dispos, 38: 2083-2090, 2010. HLM Model with CDK and SMARTS Keys: HLM Model with MOE2D and SMARTS Keys <ul><li># Descriptors: 578 Descriptors </li></ul><ul><li># Training Set compounds: 193,650 </li></ul><ul><li>Cross Validation Results: 38,730 compounds </li></ul><ul><li>Training R²: 0.79 </li></ul><ul><li>20% Test Set R²: 0.69 </li></ul><ul><li>Blind Data Set (2310 compounds): </li></ul><ul><li>R² = 0.53 </li></ul><ul><li>RMSE = 0.367 </li></ul><ul><li>Continuous → Categorical: </li></ul><ul><li>κ = 0.40 </li></ul><ul><li>Sensitivity = 0.16 </li></ul><ul><li>Specificity = 0.99 </li></ul><ul><li>PPV = 0.80 </li></ul><ul><li>Time (sec/compound): 0.252 </li></ul><ul><li># Descriptors: 818 Descriptors </li></ul><ul><li># Training Set compounds: 193,930 </li></ul><ul><li>Cross Validation Results: 38,786 compounds </li></ul><ul><li>Training R²: 0.77 </li></ul><ul><li>20% Test Set R²: 0.69 </li></ul><ul><li>Blind Data Set (2310 compounds): </li></ul><ul><li>R² = 0.53 </li></ul><ul><li>RMSE = 0.367 </li></ul><ul><li>Continuous → Categorical: </li></ul><ul><li>κ = 0.42 </li></ul><ul><li>Sensitivity = 0.24 </li></ul><ul><li>Specificity = 0.987 </li></ul><ul><li>PPV = 0.823 </li></ul><ul><li>Time (sec/compound): 0.303 </li></ul>
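
For readers less familiar with the regression statistics on this slide, here is a minimal sketch of how R² and RMSE can be computed from observed and predicted values. It assumes R² means the coefficient of determination (1 minus residual over total sum of squares); the slide does not say which definition the study used, and some QSAR reports use the squared correlation coefficient instead. The numbers in the example are toy values, not HLM data.

```python
from math import sqrt


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Toy, hypothetical values for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(obs, pred), 3), round(rmse(obs, pred), 3))
```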
    40. 41. RRCK Permeability and MDR. Open descriptors results almost identical to commercial descriptors, across many datasets and quantitative and qualitative data. Smaller solubility datasets give similar results. Provides confidence that open models could be viable. MDCK training 25,000 testing 25,000; MDR training 25,000 testing 18,400. Gupta RR, et al., Drug Metab Dispos, 38: 2083-2090, 2010.

        | Descriptors | C5.0 RRCK Permeability | C5.0 MDR |
        | --- | --- | --- |
        | CDK descriptors | Kappa = 0.47, Sensitivity = 0.59, Specificity = 0.93, PPV = 0.67 | Kappa = 0.62, Sensitivity = 0.85, Specificity = 0.77, PPV = 0.83 |
        | MOE2D and SMARTS Keys | Kappa = 0.53, Sensitivity = 0.64, Specificity = 0.94, PPV = 0.72 (Baseline) | Kappa = 0.67, Sensitivity = 0.86, Specificity = 0.80, PPV = 0.85 (Baseline) |
        | CDK and SMARTS Keys | Kappa = 0.50, Sensitivity = 0.62, Specificity = 0.94, PPV = 0.68 | Kappa = 0.65, Sensitivity = 0.86, Specificity = 0.78, PPV = 0.84 |
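
The kappa, sensitivity, specificity and PPV values reported for these classification models all derive from a 2×2 confusion matrix. The sketch below shows the standard definitions, with kappa taken to be Cohen's kappa (an assumption; the slide only says "Kappa"). The function name and the counts in the example are hypothetical and do not reproduce any figure from the study.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification statistics from confusion-matrix counts.

    Assumes every denominator below is non-zero.
    """
    n = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    observed = (tp + tn) / n       # observed agreement
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return {
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=40, fn=10)
print({name: round(value, 2) for name, value in metrics.items()})
```

A model with high specificity but low sensitivity, like the HLM models on the previous slide, rarely mislabels negatives but misses many positives, which kappa penalizes.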
    41. 42. Model coverage of chemistry space. Combining models may give greater coverage of ADME/Tox chemistry space and improve predictions? Companies shown in the figure: Lundbeck, Pfizer, Merck, GSK, Novartis, Lilly, BMS, Allergan, Bayer, AZ, Roche, BI, Merck KGaA.
    42. 43. Next steps <ul><li>ADME/Tox data crosses diseases </li></ul><ul><li>Potential to share models selectively with collaborators, e.g. academics, neglected disease researchers </li></ul><ul><li>We used the proof of concept to submit an SBIR, “Biocomputation across distributed private datasets to enhance drug discovery” </li></ul><ul><li>Develop a prototype for sharing models securely; collaborate to show how combining data for TB etc. could improve models </li></ul><ul><li>Phase II: develop a commercial product that leverages CDD </li></ul><ul><li>Engage Pistoia Alliance to expand concept to many companies – in progress </li></ul>
    43. 44. <ul><li>Open source software for molecular descriptors and algorithms </li></ul><ul><li>Spend only a fraction of the money on QSAR </li></ul><ul><li>Selectively share your models with collaborators and control access </li></ul><ul><li>Have someone else host the models / predictions </li></ul>The next opportunities for crowdsourcing… Models Inside company Collaborators Commercial Descriptors Algorithms ADME/Tox data Current investments >$1M/yr >$10-100’s M/yr
    44. 45. A complex ecosystem of collaborations: a new business model. Bunin & Ekins, DDT, 16: 643-645, 2011. Diagram labels: Inside Company, Inside Academia, Inside Foundation and Inside Government, each with Collaborators, Molecules, Models, Data and IP; Shared IP; Collaborative platform/s.
    45. 46. Finding Promiscuous Old Drugs for New Uses <ul><li>Research published in the last six years - 34 studies - Screened libraries of FDA approved drugs against various whole cell or target assays. </li></ul><ul><li>1 or more compounds with a suggested new bioactivity </li></ul><ul><li>13 drugs were active against more than one additional disease in vitro </li></ul>
    46. 47. Finding Promiscuous Old Drugs for New Uses <ul><li>109 molecules were identified by screening in vitro </li></ul><ul><li>Statistically more hydrophobic (log P) and higher MWT than orphan-designated products with at least one marketing approval for a common disease indication or one marketing approval for a rare disease from the FDA’s rare disease research database. </li></ul><ul><li>Created structure searchable databases in CDD </li></ul><ul><li>Data in publications is increasing but who is tracking it? </li></ul>Ekins and Williams, Pharm Res, 28, 1785-1791, 2011.
    47. 48. 2D similarity search with “hit” from screening. Export database and use for 3D searching with a pharmacophore or other model. Suggest approved drugs for testing; may also indicate other uses if a drug is present in more than one database. Suggest in silico hits for in vitro screening. Key databases of structures and bioactivity data. FDA drugs database. Repurpose FDA drugs in silico. Ekins S, Williams AJ, Krasowski MD and Freundlich JS, Drug Disc Today, 16: 298-310, 2011.
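
The 2D similarity search step can be sketched with the Tanimoto coefficient over fingerprint bits. This is an assumption made for illustration: the slide does not name the similarity metric or fingerprint, and the tools used in the cited work are not reproduced here. Fingerprints are represented as plain Python sets of "on" bit positions, and the drug names and bit values are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: shared bits divided by total distinct bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_by_similarity(query: set, library: dict, threshold: float = 0.0) -> list:
    """Return (name, score) pairs at or above threshold, most similar first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints: sets of "on" bit positions.
screening_hit = {1, 2, 3, 4}
fda_drugs = {
    "drug_A": {1, 2, 3, 5},
    "drug_B": {7, 8},
    "drug_C": {1, 2, 3, 4},
}
print(rank_by_similarity(screening_hit, fda_drugs, threshold=0.5))
```

In practice the ranked list would be a starting point for suggesting approved drugs to test, with the threshold chosen per project.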
    48. 49. Crowdsourcing Project “Off the Shelf R&D”. All pharmas have assets on the shelf that reached the clinic. “Off the Shelf R&D”: get the crowd to help in repurposing / repositioning these assets. How can software help? - Create communities to test - Provide informatics tools that are accessible to the crowd - Enlarge user base - Data storage on cloud – integration with public data - Crowd becomes virtual pharma-CROs and the “customer” for enabling services
    49. 50. Tools for Open Science <ul><li>Blogs </li></ul><ul><li>Wikis </li></ul><ul><li>Databases </li></ul><ul><li>Journals </li></ul><ul><li>What about Twitter, Facebook, could these be used for social collaboration, science? </li></ul>
    50. 51. 2020: A Drug Discovery Odyssey. Could our pharma R&D look like this? Massive collaboration networks, software enabled. We are in “Generation App”. Crowdsourcing will have a role in R&D. Drug discovery possible by anyone with “app access”. Ekins & Williams, Pharm Res, 27: 393-395, 2010.
    51. 52. Example of Social Collaboration in Science: Tweets, Blog Lead to The Green Solvents App. I attend a seminar on a solvent selection guide. I tweet during the talk. Mobile app developer Alex Clark responds on Twitter, along with Antony Williams, and an email discussion about Green Chemistry apps starts. I blog that evening. 3 days later an app is created by Alex.
    52. 53. Mobile Apps for Drug Discovery. Williams et al., DDT, 16: 928-939, 2011. <ul><li>Make science more accessible => communication </li></ul><ul><li>Mobile – take a phone into field / lab and do science more readily than on a laptop </li></ul><ul><li>GREEN – energy efficient computing </li></ul><ul><li>MolSync (+ DropBox) + MMDS = Share molecules as SDF files on the cloud = collaborate </li></ul>
    53. 54. www.scimobileapps.com How do you find scientific mobile apps? Development of wikis to track developments in tools.
    54. 55. Acknowledgments <ul><li>Rishi Gupta, Eric Gifford, Ted Liston, Chris Waller (Pfizer) </li></ul><ul><li>Antony J. Williams (RSC) </li></ul><ul><li>Joel Freundlich (Texas A&M), Gyanu Lamichhane (Johns Hopkins) </li></ul><ul><li>Carolyn Talcott, Malabika Sarker , Peter Madrid, Sidharth Chopra (SRI International) </li></ul><ul><li>MM4TB colleagues </li></ul><ul><li>Chris Lipinski </li></ul><ul><li>Takushi Kaneko (TB Alliance) </li></ul><ul><li>Nicko Goncharoff (SureChem) </li></ul><ul><li>Matthew D. Krasowski (University of Iowa) </li></ul><ul><li>Alex Clark (Molecular Materials Informatics, Inc) </li></ul><ul><li>Accelrys </li></ul><ul><li>CDD – Barry Bunin </li></ul><ul><li>Funding BMGF, NIAID. </li></ul><ul><li>Everyone that has shared data in CDD.. </li></ul><ul><li>Email: ekinssean@yahoo.com </li></ul><ul><li>Slideshare: http://www.slideshare.net/ekinssean </li></ul><ul><li>Twitter: collabchem </li></ul><ul><li>Blog: http://www.collabchem.com/ </li></ul><ul><li>Website: http://www.collaborations.com/CHEMISTRY.HTM </li></ul>