
Vanderwall cheminformatics Drexel Part 1

Dana Vanderwall, Associate Director of Cheminformatics at Bristol-Myers Squibb, presented at Drexel University for Jean-Claude Bradley's Chemical Information Retrieval class on December 2, 2010. This first part covers "Cheminformatics & The evolving relationship between data in the public domain & pharma" and includes a general discussion of modern drug discovery and the details of a malaria dataset recently released from the pharmaceutical industry to the public.


  1. Cheminformatics & the evolving relationship between data in the public domain & pharma
       Dana Vanderwall, Cheminformatics, Bristol-Myers Squibb
  2. How do we start to find a new chemical that might be the next drug?
       • Typically, we need a specific protein to target that we think we can use to fix the problem that causes the disease
           • Caveat: emerging trends (and what's old is new again)
       • We need to design experiments that test for chemicals that can fit that protein (lock & key)
       • Thousands to >2 million chemicals are tested with that protein to look for a starting point
  3. This is where drug discovery gets really modern
       • Highly automated robots and informatics can do in 1 week the work that used to take years
  4. Compound optimization
       • Compounds are optimized for many parameters, including:
           • Potency, selectivity, oral bioavailability, safety
       • Figure labels: 1-3 years; >2 million compounds tested in primary assay; make another 2-10,000
  5. Getting ready for the clinic
       • All compounds are tested for safety in animals
       • We need to prove we can give enough to get the positive benefit without side effects
       • We have to be able to make it on a scale and in a form suitable for dosing in the clinic
           • The early stages need milligrams or grams (tablespoons)
           • To start testing in humans requires many kilograms of very, very pure material
       • Figure labels: 1-3 years; 2-3 years
  6. Profiling Assays and Lead Op Progression
       • Phases: Target to Hit; Hit to Lead; Lead to Candidate
       • HTS: 2M compounds, 1 assay
       • Hit Triage: 2K compounds, 2-5 assays
       • Early Lead Op: 200 compounds, 10-50 assays
       • Late Lead Op: 1-10 compounds, 300-600 assays
  7. Chemical Structures are the Intellectual Property
       • The targets exist in nature; chemical structures are the unique component that pharma & biotech can bring to the table
           • (Biologicals are increasing in importance)
       • As such, the structures and their biological activity are extremely sensitive
           • Captured in the patents filed
           • Never disclosed until protected
           • Even similarity/sub-structure searches on public sites are treated cautiously
  8. GlaxoSmithKline moves to stimulate public-private partnerships for R&D in neglected tropical diseases
       • http://www.gsk.com/responsibility/access/rnd-neglected-tropical-diseases.htm
       • GSK launched the open lab at Tres Cantos as one way to share our expertise and stimulate open innovation in drug discovery for diseases of the developing world (DDW)
           • 60 slots for scientists
           • Access to the screening facility & Tres Cantos staff scientists to support collaborations
           • GBP 5M facilities expansion
       • Committed to sharing data & IP on GSK research in DDW
           • Starting with recently generated novel anti-malarial hits
  9. Malaria
       • Mosquito-borne infectious disease, caused by the Plasmodium parasite
       • 250 M cases/annum, 1-3 M deaths
       • A variety of drugs is available, but resistance is a constant problem
       • Source: http://www.mcwhealthcare.com/malaria_drugs_medicines/life_cycle_of_plasmodium.htm
  10. The Complexity of Cell Biology
       • The assumption is: one target → one consequence
       • In reality, this target is one component of a complicated biochemical network:
           • A selective probe may influence many pathways.
           • Probes can interact with multiple targets.
           • Network interactions can be redundant.
           • Biological effects are often a consequence of interaction with multiple targets.
  11. Emerging paradigm: look for the cellular activity first
       • Advances in cell biology & the HTS platforms are enabling HTS for a cellular phenotype
       • Start with something that works in a cellular model of the disease phenotype (a.k.a. black box), then figure out how it works
           • Target deconvolution
  12. Supporting black-box HTS for anti-malarials
       • 2M-compound GSK HTS collection screened at 2 µM vs. P. falciparum (3D7)-infected human erythrocytes
           • 12 months of screening in a biohazard lab
           • Avg. Z' = 0.7
       • 19,451 primary hits (parasite growth inhibition >80%); 13,533 confirmed via retests
           • 1,982 showed cytotoxicity in HepG2 cells at 10 µM
           • None active in the cell background control
       • 8,000 also >50% active against DD2 (multi-drug-resistant strain)
       • F-J Gamo et al. Nature 465, 305-310 (2010) doi:10.1038/nature09107
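The "Avg. Z' = 0.7" figure above is assumed here to refer to the standard Z'-factor assay-quality statistic, Z' = 1 - 3(σ_pos + σ_neg) / |μ_pos - μ_neg|. The following is a minimal sketch of that calculation; the function name and the control readouts are made up for illustration and are not data from the screen.

```python
from statistics import mean, stdev


def z_prime(positive_controls, negative_controls):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    separation = abs(mean(positive_controls) - mean(negative_controls))
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    spread = stdev(positive_controls) + stdev(negative_controls)
    return 1.0 - 3.0 * spread / separation


# Hypothetical plate controls (percent signal), invented for this example.
pos = [98.0, 100.0, 102.0]  # e.g. uninhibited growth wells
neg = [8.0, 10.0, 12.0]     # e.g. fully inhibited wells

print(f"Z' = {z_prime(pos, neg):.2f}")
```

Values closer to 1 indicate a wide separation between controls relative to their noise, which is why an average of 0.7 over a 12-month campaign would usually be read as a robust assay.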
  13. Characterizing the hits
       • Clustering was used to help characterize chemical space
           • 416 “molecular frameworks” (Bemis & Murcko, J. Med. Chem. 39, 2887 (1996))
           • 857 clusters / 1,978 singletons by Daylight fingerprints with a Tanimoto cutoff of 0.85
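To make the Tanimoto cutoff concrete, here is a small sketch that computes Tanimoto similarity on fingerprints represented as sets of "on" bit positions and groups compounds with a greedy leader algorithm. This is an assumption-laden illustration: the slide does not say which clustering algorithm was used, Daylight fingerprints are not generated here, and the compound names and bit sets are invented.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def leader_cluster(fingerprints, threshold=0.85):
    """Greedy leader clustering (illustrative stand-in, not the slide's method).

    Each compound joins the first cluster whose leader it matches at or
    above the threshold; otherwise it starts a new cluster.
    """
    clusters = []  # list of (leader_fingerprint, [member names])
    for name, fp in fingerprints.items():
        for leader_fp, members in clusters:
            if tanimoto(fp, leader_fp) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((fp, [name]))
    return [members for _, members in clusters]


# Hypothetical fingerprints: A and B share 19 of 20 bits, C is unrelated.
toy = {
    "A": set(range(20)),
    "B": set(range(19)),
    "C": {100, 101, 102},
}

clusters = leader_cluster(toy)
singletons = [c for c in clusters if len(c) == 1]
print(clusters, singletons)
```

A cutoff as high as 0.85 keeps only close analogues together, which is consistent with the large number of singletons reported on the slide.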
  14. Three-dimensional plot of some of the novel chemical diversity present in TCAMS
       • F-J Gamo et al. Nature 465, 305-310 (2010) doi:10.1038/nature09107
  15. Characterizing the hits
       • Compounds with an abnormally high frequency of activity across HTS campaigns were filtered out
           • Exclusion threshold ranged from IFI = 5% (where tested in >100 HTS) to 20% (where tested in >25 HTS); ~1,800 compounds
       • ~70 compounds clustered with known anti-malarials
       • How are the rest of these compounds working?
  16. Can we leverage the historical target data on compounds?
       • Target assays
           • Clear relationship between interactions and measurements, but what does it mean biologically?
       • Phenotypic assays
           • Clear biological result associated with the readout, but from which interaction(s)?
       • Can we use the data to figure out which targets lead to which biological response?
       • Diagram labels: stimulant → ? → readout; kinase_1, kinase_2, kinase_3, kinase_4, 7TM_1, 7TM_2, NR_1, NR_2, NR_3
  17. Can we leverage the historical target data on compounds?
       • Find all target assay data for compounds tested in the anti-malarial screen
           • Aggregate at the target-result type level (max pIC50/pEC50)
       • Of the 2M tested, 130K had some associated target assay data
           • Including 3,435 of the 13,500 ‘actives’
           • “Hits”* at 413 targets
               • *pIC50 >7.0 for antagonist/inhibitor/blocker
               • *pEC50 >6.5 for agonist/activation/opener
           • Given that some targets are screened in 2-3 modes, >650 target-result type combinations
       • Surely not all 400 targets are significant
       • Data very sparse: avg. ~2 pXC50s per compound that had data
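The aggregation and hit-calling steps above can be sketched as follows. Only the "max pXC50 per target-result type" rule and the two thresholds (pIC50 > 7.0, pEC50 > 6.5) come from the slide; the record layout, mode names, function names, and the measurements themselves are hypothetical.

```python
INHIBITOR_MODES = {"antagonist", "inhibitor", "blocker"}
ACTIVATOR_MODES = {"agonist", "activator", "opener"}


def aggregate_max(measurements):
    """Keep the maximum pXC50 per (compound, target, mode) combination.

    `measurements` is an iterable of (compound, target, mode, pxc50) tuples.
    """
    best = {}
    for compound, target, mode, pxc50 in measurements:
        key = (compound, target, mode)
        if key not in best or pxc50 > best[key]:
            best[key] = pxc50
    return best


def is_hit(mode, pxc50):
    """Apply the slide's thresholds: pIC50 > 7.0 or pEC50 > 6.5."""
    if mode in INHIBITOR_MODES:
        return pxc50 > 7.0
    if mode in ACTIVATOR_MODES:
        return pxc50 > 6.5
    raise ValueError(f"unknown result type: {mode!r}")


# Invented measurements; target names echo the placeholders on slide 16.
raw = [
    ("c1", "kinase_1", "inhibitor", 6.8),
    ("c1", "kinase_1", "inhibitor", 7.4),  # repeat test; the max is kept
    ("c2", "7TM_1", "agonist", 6.5),       # exactly at threshold: not a hit
    ("c3", "NR_1", "agonist", 6.9),
]

best = aggregate_max(raw)
hits = sorted(key for key, value in best.items() if is_hit(key[2], value))
print(hits)
```

Taking the maximum rather than the mean is a deliberately permissive choice for sparse data: with only about two pXC50 values per compound, a single strong result is enough to flag a possible target.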
  18. Finding targets ‘enriched’ among the anti-malarials
       • An ‘enrichment’ was calculated for each possible target-result type combination
           • Are compounds active at target X more prevalent among the compounds that inhibited P. falciparum, or equally distributed across all screened compounds?
       • For each target-result type, calculate:
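The formula itself did not survive in this transcript. One plausible reading of the question posed above, offered purely as an assumption and not as the calculation actually used, is a fold enrichment: the fraction of anti-malarial actives that are hits at target X divided by the fraction of all screened compounds that are hits at target X. The function name and the per-target hit counts below are hypothetical; the totals of 3,435 and 130,000 are the "compounds with target data" figures from the previous slide.

```python
def fold_enrichment(hits_in_actives, n_actives, hits_in_all, n_all):
    """Ratio of the hit rate among actives to the hit rate among all compounds.

    Assumed definition for illustration; the slide's formula is not available.
    """
    if n_actives <= 0 or n_all <= 0:
        raise ValueError("population sizes must be positive")
    if hits_in_all == 0:
        raise ValueError("no hits in the background set; ratio is undefined")
    rate_actives = hits_in_actives / n_actives
    rate_all = hits_in_all / n_all
    return rate_actives / rate_all


# Totals from the slides: 3,435 actives and 130,000 compounds with target data.
# The hit counts for "target X" (40 and 500) are made up.
example = fold_enrichment(40, 3435, 500, 130_000)
print(f"fold enrichment for target X: {example:.2f}")
```

Under this reading, a value near 1 means the target's ligands are no more common among the anti-malarials than in the collection as a whole, while the ≥2-fold cutoff on the next slide would flag targets whose ligands are at least twice as common.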
  19. Narrowing down the possible candidates
       • 400 targets
       • ~140 targets at ≥2-fold enrichment
       • ~50 with homologues in P. falciparum
       • F-J Gamo et al. Nature 465, 305-310 (2010) doi:10.1038/nature09107
  20. Targets with homologues in P. falciparum genome
       • Aspartic protease
       • Methionyl-tRNA synthetase
       • b-Ketoacid reductase
       • Phenylalanyl-tRNA synthetase
       • Calcium/calmodulin-dependent kinase
       • Phosphatidylinositol 3-kinase
       • Cysteine protease
       • Plasmodium electron transport chain
       • Dihydrofolate reductase
       • Ribosome
       • Dihydroorotate dehydrogenase
       • Ser/Thr protein kinase
       • DNA gyrase
       • Tyrosyl-tRNA synthetase
       • Isoleucyl-tRNA synthetase
  21. Targets with NO homologues in P. falciparum genome
       • GPCR: Adrenergic antagonist
       • Nuclear receptor agonist/antagonist
       • GPCR: Cannabinoid antagonist
       • Ion channel inhibitor
       • GPCR: Chemokine antagonist
       • Phospholipase inhibitor
       • GPCR: Cholinergic agonist
       • Lipid amide hydrolase inhibitor
       • GPCR: Free fatty acid agonist
       • Serine protease inhibitor
       • GPCR: Serotonin agonist/antagonist
       • Toll-like receptor agonist
       • GPCR: Opioid agonist/antagonist
       • GPCR: Peptide hormone receptor agonist/antagonist
  22. Data publicly available
       • All chemical structures and experimental data for the compounds are available at http://www.ebi.ac.uk/chemblntd
           • EXT_CMPD_NUMBER
           • SMILES
           • Percentage_inhibition_3D7
           • Percentage_inhibition_DD2
           • Percentage_inhibition_3D7_PFLDH
           • XC50_MOD_3D7
           • XC50_3D7 (µM)
           • Percentage_inhibition_HEPG2
           • Chemical cluster Nr
           • IFI
           • Graph_Frame_Cluster
           • Target_Hypothesis
           • P. falciparum locus
           • Commercial Supplier_Reference
       • For additional information or interest in further collaborations, contact: [email_address]
  23. And the raw target data used to develop hypotheses?
       • That was trickier
       • Releasing the list of 400 targets & all the inactive compounds would reveal:
           • Our whole compound collection
           • All the targets in the current (and past) portfolio
       • Needed some level of validation for the analysis in order to publish
  24. Surrogates for internal data
       • Chemical structures associated with a particular target hypothesis were used as ‘bait’ to find published structures & data that validate the proposed MOA for each chemotype
           • Similarity & substructure searches (SSS) in Aureus DBs & SciFinder
           • Exemplars and their similarity to the original hits were published in the Supplementary Material with references
           • We often found our own compounds and data in J Med Chem and the patent literature.
  25. Acknowledgements: Anti-malarial HTS
       • Tres Cantos Medicines Development Campus, Tres Cantos, Spain
       • Medicines Research Centre, Stevenage, UK
       • Darren VS Green
       • Collegeville & King of Prussia, PA, USA
       • Vinod Kumar
       • Samiul Hasan
       • James Brown
       • Catherine Peishoff
       • Lon Cardon
       • Francisco-Javier Gamo
       • Laura Sanz
       • Jaume Vidal
       • Cristina de Cozar
       • Emilio Alvarez
       • Jose-Luis Lavandera
       • Jose Garcia-Bustos
