Using Machine Learning Models Based on
Phenotypic Data to Discover New
Molecules for Neglected Diseases
Sean Ekins
Collaborative Drug Discovery, Inc., Burlingame, CA.
Collaborations Pharmaceuticals, Inc. Fuquay Varina, NC.
Collaborations in Chemistry, Inc. Fuquay Varina, NC.
Wikipedia
Machine Learning Examples
• Data is BIG for neglected diseases
• To discover new leads
• Tuberculosis – from public data to open models to create IP
• Chagas Disease - from public data to create new IP
• Ebola virus – from little data to create open data and IP
• Other diseases, emerging diseases?
Neglected Disease Drug Discovery
An urgent need for new therapeutics
http://www.mm4tb.org/
Tuberculosis kills 1.6-1.7m/yr (~1 every 8 seconds)
One third of the world's population infected!
streptomycin (1943)
para-aminosalicylic acid (1949)
isoniazid (1952)
pyrazinamide (1954)
cycloserine (1955)
ethambutol (1962)
rifampicin (1967)
Multidrug resistance in 4.3% of cases
Extensively drug-resistant TB increasing in incidence
One new drug (bedaquiline) in 40 yrs
Tuberculosis
Tested >350,000 molecules Tested ~2M 2M >300,000
>1500 active and non toxic Published 177 100s 800
Bigger Open Data: Screening for New Tuberculosis Treatments
~350,000 accessible
TBDA screened over 1 million, 1 million
more to go
TB Alliance + Japanese pharma screens
R43 LM011152-01
Over 8 years analyzed in vitro data and built models
Workflow (iterative cycle):
1. Mtb screening molecule database/s from high-throughput phenotypic Mtb screening
2. Descriptors + Bioactivity (+ Cytotoxicity)
3. Bayesian Machine Learning classification Mtb Model
4. Molecule Database (e.g. GSK malaria actives) virtually scored using Bayesian Models
5. Top scoring molecules assayed for Mtb growth inhibition
6. New bioactivity data may enhance models
Identify in vitro hits and test models
3 x published prospective tests: ~750 molecules were tested in vitro
198 actives were identified
>20% hit rate
Multiple retrospective tests: 3-10 fold enrichment
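The hit rate and enrichment figures above are simple ratios. A minimal sketch in Python, assuming the usual definitions (hit rate = actives / tested; enrichment = hit rate in the top-ranked subset divided by the hit rate of the whole set). The function names and the retrospective example numbers are hypothetical and only illustrate the arithmetic.

```python
def hit_rate(actives: int, tested: int) -> float:
    """Fraction of tested molecules that were confirmed active."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return actives / tested


def enrichment_factor(hits_in_top: int, n_top: int,
                      total_hits: int, n_total: int) -> float:
    """Hit rate in the top-ranked subset relative to the whole set."""
    return hit_rate(hits_in_top, n_top) / hit_rate(total_hits, n_total)


# Figures quoted on the slide: ~750 molecules tested, 198 actives.
prospective = hit_rate(198, 750)

# Hypothetical retrospective example (numbers invented for illustration):
# 30 actives in the top 100 ranked molecules, 500 actives in 10,000 overall.
example_ef = enrichment_factor(30, 100, 500, 10_000)

print(f"prospective hit rate: {prospective:.1%}")
print(f"example enrichment: {example_ef:.1f}-fold")
```

With the slide's numbers the prospective hit rate works out to about 26%, which is consistent with the ">20%" quoted.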
Ekins et al., Pharm Res 31: 414-435, 2014
Ekins, et al., Tuberculosis 94; 162-169, 2014
Ekins, et al., PLOSONE 8; e63240, 2013
Ekins, et al., Chem Biol 20: 370-378, 2013
Ekins, et al., JCIM, 53: 3054−3063, 2013
Ekins and Freundlich, Pharm Res, 28, 1859-1869, 2011
Ekins et al., Mol BioSyst, 6: 840-851, 2010
Ekins, et al., Mol. Biosyst. 6, 2316-2324, 2010,
R43 LM011152-01
5 active compounds vs Mtb in a few months
7 tested, 5 active (~71% hit rate)
Ekins et al., Chem Biol 20, 370–378, 2013
1. Virtually screen 13,533-member GSK antimalarial hit library
2. Bayesian Model = SRI TAACF-CB2 dose response + cytotoxicity model
3. Top 46 commercially available compounds visually inspected
4. 7 compounds chosen for Mtb testing based on
- drug-likeness
- chemotype diversity
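The ranking part of the steps above (score the library, keep purchasable compounds, take the top N for inspection) can be sketched as follows. This is an illustration only: the `Scored` record, the `purchasable` flag and the two "hypothetical" entries are assumptions, and the final choice by drug-likeness and chemotype diversity was made by visual inspection, which the code does not attempt to reproduce.

```python
from typing import List, NamedTuple


class Scored(NamedTuple):
    name: str
    bayesian_score: float
    purchasable: bool  # hypothetical flag for commercial availability


def top_candidates(library: List[Scored], k: int) -> List[Scored]:
    """Keep purchasable molecules and return the k highest-scoring ones."""
    available = [m for m in library if m.purchasable]
    return sorted(available, key=lambda m: m.bayesian_score, reverse=True)[:k]


library = [
    Scored("TCMDC-123868", 5.73, True),
    Scored("TCMDC-125802", 5.63, True),
    Scored("TCMDC-124192", 5.27, True),
    Scored("hypothetical-A", 6.10, False),  # invented: high score, not purchasable
    Scored("hypothetical-B", 1.20, True),   # invented: purchasable, low score
]

shortlist = top_candidates(library, k=3)
for molecule in shortlist:
    print(molecule.name, molecule.bayesian_score)
```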
GSK # | Bayesian Score | Mtb H37Rv MIC (µg/mL) | GSK Reported % Inhibition HepG2 @ 10 µM cmpd
TCMDC-123868 | 5.73 | >32 | 40
TCMDC-125802 | 5.63 | 0.0625 | 5
TCMDC-124192 | 5.27 | 2.0 | 4
TCMDC-124334 | 5.20 | 2.0 | 4
TCMDC-123856 | 5.09 | 1.0 | 83
TCMDC-123640 | 4.66 | >32 | 10
TCMDC-124922 | 4.55 | 1.0 | 9
(Chemical structures shown as images on the slide.)
R43 LM011152-01
• BAS00521003/TCMDC-125802 reported to be a P. falciparum lactate dehydrogenase inhibitor
• Only one report of antitubercular activity, from 1969
- solid agar MIC = 1 mg/mL (“wild strain”)
- “no activity” in mouse model up to 400 mg/kg
- however, activity was solely judged by extension of survival!
Bruhin, H. et al., J. Pharm. Pharmac. 1969, 21, 423-433.
MIC of 0.0625 µg/mL
• 64X MIC affords 6 logs of kill
• Resistance and/or drug instability beyond 14 d
Vero cells: CC50 = 4.0 µg/mL
Selectivity Index SI = CC50/MIC(Mtb) = 16 – 64
In mouse, no toxicity but also no efficacy in GKO model; probably metabolized.
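The selectivity index quoted above is a ratio of two concentrations. A small sketch, assuming CC50 and MIC are expressed in the same units; the helper name is hypothetical. The upper MIC bound is not stated on the slide and is only inferred here from the lower end of the SI range.

```python
def selectivity_index(cc50: float, mic: float) -> float:
    """SI = CC50 / MIC, with both values in the same concentration units."""
    if mic <= 0:
        raise ValueError("mic must be positive")
    return cc50 / mic


cc50_vero = 4.0    # Vero cell CC50 from the slide
mic_best = 0.0625  # lowest Mtb MIC from the slide

si_upper = selectivity_index(cc50_vero, mic_best)

# Inferred, not stated on the slide: the MIC that would give SI = 16.
mic_for_si_16 = cc50_vero / 16

print(f"SI at MIC {mic_best}: {si_upper:g}")
print(f"MIC implied by SI = 16: {mic_for_si_16:g}")
```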
Ekins et al.,Chem Biol 20, 370–378, 2013
Taking a compound in vivo identifies issues
R43 LM011152-01
Modeling and mapping Mouse in vivo data
Mouse TB model data over 70 yrs
784 training and 60 test set
Extended earlier study
J Chem Inf Model. 2014 Apr 28;54(4):1070-82
Optimizing the triazine series as part of this project, improve solubility and show in
vivo efficacy
1U19AI109713-01
Chagas Disease
• About 7 million to 8 million people
estimated to be infected worldwide
• Vector-borne transmission occurs in the
Americas.
• A triatomine bug carries the
parasite Trypanosoma cruzi which causes
the disease.
• The disease is curable if treatment is
initiated soon after infection.
• No FDA-approved drug; pipeline sparse
Hotez et al., PLoS Negl Trop Dis. 2013 Oct 31;7(10):e2300
R41-AI108003-01
• Modeled data with over 300,000 cpds but focused on smaller set
• Dataset from PubChem AID 2044 – Broad Institute data
• Dose response data (1853 actives and 2203 inactives)
• Dose response and cytotoxicity (1698 actives and 2363 inactives)
• EC50 values less than 1 µM were selected as actives.
• For the dose response and cytotoxicity model, actives also required a greater than 10-fold difference between cytotoxicity and EC50
• Models generated using : molecular function class fingerprints of maximum
diameter 6 (FCFP_6), AlogP, molecular weight, number of rotatable bonds,
number of rings, number of aromatic rings, number of hydrogen bond
acceptors, number of hydrogen bond donors, and molecular fractional polar
surface area.
• 5-fold cross validation or leave out 50% x 100 fold cross validation was used
to calculate the ROC for the models generated
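The modeling steps above can be sketched in plain Python. This is a minimal illustration, not the commercial implementation used for the slides: it assumes the commonly described Laplacian-corrected naive Bayes scheme, where each feature gets the weight log((A_f + 1) / (N_f * P + 1)) with A_f actives containing the feature, N_f molecules containing it, and P the overall active fraction. Molecules are represented as hypothetical sets of feature names standing in for FCFP_6 bits, and the toy data are invented.

```python
import math
import random
from collections import Counter


def train_laplacian_bayes(feature_sets, labels):
    """Return one weight per feature: log((A_f + 1) / (N_f * P + 1))."""
    base_rate = sum(labels) / len(labels)
    seen, seen_active = Counter(), Counter()
    for feats, is_active in zip(feature_sets, labels):
        for f in feats:
            seen[f] += 1
            if is_active:
                seen_active[f] += 1
    return {f: math.log((seen_active[f] + 1.0) / (seen[f] * base_rate + 1.0))
            for f in seen}


def score(weights, feats):
    """Sum the weights of the features present; unseen features add 0."""
    return sum(weights.get(f, 0.0) for f in feats)


def roc_auc(scores, labels):
    """Chance that a random active outscores a random inactive (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validated_auc(feature_sets, labels, k=5, seed=0):
    """Score every molecule with a model trained on the other k-1 folds."""
    idx = list(range(len(labels)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    held_out = [0.0] * len(labels)
    for fold in folds:
        in_fold = set(fold)
        train = [i for i in idx if i not in in_fold]
        weights = train_laplacian_bayes([feature_sets[i] for i in train],
                                        [labels[i] for i in train])
        for i in fold:
            held_out[i] = score(weights, feature_sets[i])
    return roc_auc(held_out, labels)


# Invented toy data: actives share "feat_good", inactives share "feat_bad".
toy_features = ([{"feat_good", f"scaffold_{i}"} for i in range(10)]
                + [{"feat_bad", f"scaffold_{i}"} for i in range(10, 20)])
toy_labels = [True] * 10 + [False] * 10

auc = cross_validated_auc(toy_features, toy_labels, k=5)
print(f"5-fold cross-validated ROC AUC on toy data: {auc:.2f}")
```

The toy set is perfectly separable, so it only demonstrates the mechanics; real screening data would give ROC values well below 1.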
T. cruzi Machine Learning models
R41-AI108003-01
Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878
5-fold cross validation
Model | Best cutoff | Leave-one-out ROC | 5-fold cross validation ROC | 5-fold cross validation sensitivity (%) | 5-fold cross validation specificity (%) | 5-fold cross validation concordance (%)
Dose response (1853 actives, 2203 inactives) | -0.676 | 0.81 | 0.78 | 77 | 89 | 84
Dose response and cytotoxicity (1698 actives, 2363 inactives) | -0.337 | 0.82 | 0.80 | 80 | 88 | 84
Dual event 50% x 100 fold cross validation
External ROC | Internal ROC | Concordance (%) | Specificity (%) | Sensitivity (%)
0.79 ± 0.01 | 0.80 ± 0.01 | 73.48 ± 1.05 | 79.08 ± 3.73 | 65.68 ± 3.89
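Sensitivity, specificity and concordance in these tables are linked through the class sizes. A short sketch, assuming the standard definitions (sensitivity = TP / actives, specificity = TN / inactives, concordance = (TP + TN) / total); the function name is hypothetical. Because the slide reports rounded percentages, the recomputed concordance should only be expected to agree approximately.

```python
def concordance(sensitivity: float, specificity: float,
                n_actives: int, n_inactives: int) -> float:
    """Overall fraction classified correctly, from per-class rates and sizes."""
    true_positives = sensitivity * n_actives
    true_negatives = specificity * n_inactives
    return (true_positives + true_negatives) / (n_actives + n_inactives)


# Dose response model row: 77% sensitivity, 89% specificity,
# 1853 actives and 2203 inactives.
dose_response = concordance(0.77, 0.89, 1853, 2203)
print(f"recomputed concordance: {dose_response:.1%}")
```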
R41-AI108003-01
Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878
Good Bad
Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878
T. cruzi Dose Response and cytotoxicity Machine Learning model features
Tertiary amines, piperidines and aromatic
fragments with basic Nitrogen
Cyclic hydrazines and electron poor
chlorinated aromatics
R41-AI108003-01
Bayesian Machine Learning Models
- Selleck Chemicals natural product lib. (139 molecules);
- GSK kinase library (367 molecules);
- Malaria box (400 molecules);
- Microsource Spectrum (2320 molecules);
- CDD FDA drugs (2690 molecules);
- Prestwick Chemical library (1280 molecules);
- Traditional Chinese Medicine components (373 molecules)
7569 molecules
99 molecules
R41-AI108003-01 Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878
Synonyms | Infection Ratio | EC50 (µM) | EC90 (µM) | Hill slope | Cytotoxicity CC50 (µM) | Chagas mouse model (4 days treatment, luciferase): In vivo efficacy at 50 mg/kg bid (IP) (%)
(±)-Verapamil hydrochloride, 715730, SC-0011762 | 0.02, 0.02 | 0.0383 | 0.143 | 1.67 | >10.0 | 55.1
29781612, Pyronaridine | 0.00, 0.00 | 0.225 | 0.665 | 2.03 | 3.0 | 85.2
511176, Furazolidone | 0.00, 0.00 | 0.257 | 0.563 | 2.81 | >10.0 | 100.5
501337, SC-0011777, Tetrandrine | 0.00, 0.00 | 0.508 | 1.57 | 1.95 | 1.3 | 43.6
SC-0011754, Nitrofural | 0.01, 0.01 | 0.775 | 6.98 | 1.00 | >10.0 | 78.5*
* Used hydroxymethylnitrofurazone for in vivo study (nitrofural pro-drug)
Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878
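The EC50, EC90 and Hill slope columns in the table above are related through the Hill equation: for a response of x%, EC_x = EC50 * (x / (100 - x))^(1 / h). The sketch below recomputes EC90 from the tabulated EC50 and Hill slope. It assumes this standard relationship; the slide does not state how EC90 was obtained, although the tabulated values agree with it to within rounding. The function name is hypothetical.

```python
def effective_concentration(ec50: float, hill_slope: float, percent: float) -> float:
    """Concentration giving `percent` effect under the Hill equation."""
    if not 0 < percent < 100:
        raise ValueError("percent must be between 0 and 100 (exclusive)")
    return ec50 * (percent / (100.0 - percent)) ** (1.0 / hill_slope)


# (EC50 in µM, Hill slope, EC90 in µM as tabulated on the slide)
table_rows = {
    "Verapamil": (0.0383, 1.67, 0.143),
    "Pyronaridine": (0.225, 2.03, 0.665),
    "Furazolidone": (0.257, 2.81, 0.563),
    "Tetrandrine": (0.508, 1.95, 1.57),
    "Nitrofural": (0.775, 1.00, 6.98),
}

computed = {}
for name, (ec50, hill, tabulated_ec90) in table_rows.items():
    computed[name] = effective_concentration(ec50, hill, 90)
    print(f"{name}: computed EC90 {computed[name]:.3g} µM, "
          f"tabulated {tabulated_ec90} µM")
```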
In vitro and in vivo data for compounds selected
R41-AI108003-01
7,569 cpds => 99 cpds => 17 hits (5 in nM range)
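The screening funnel above can be checked with a few lines of arithmetic. The library sizes are the ones listed on the earlier slide; the dictionary keys are shortened labels chosen here for illustration.

```python
# Library sizes as listed in the deck (number of molecules).
libraries = {
    "Selleck natural products": 139,
    "GSK kinase library": 367,
    "Malaria box": 400,
    "Microsource Spectrum": 2320,
    "CDD FDA drugs": 2690,
    "Prestwick Chemical library": 1280,
    "Traditional Chinese Medicine components": 373,
}

total_scored = sum(libraries.values())
tested_in_vitro = 99
hits = 17

fraction_tested = tested_in_vitro / total_scored
in_vitro_hit_rate = hits / tested_in_vitro

print(f"scored: {total_scored}")
print(f"tested: {tested_in_vitro} ({fraction_tested:.1%} of scored)")
print(f"hits: {hits} ({in_vitro_hit_rate:.1%} of tested)")
```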
In vivo efficacy of the 5 tested compounds
[Figure: timeline in days 0-7 (infection, treatment, reading); groups: Vehicle, Benznidazole, Pyronaridine, Furazolidone, Verapamil, Nitrofural, Tetrandrine]
Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878
R41-AI108003-01
Pyronaridine: New anti-Chagas and known anti-Malarial
EMA approved in combination with artesunate
IC50 of 2 nM against the growth of KT1 and KT3 P. falciparum
Known P-gp inhibitor
Active against Babesia and Theileria (tick-transmitted parasites)
R41-AI108003-01
Work provided starting point for a phase II and phase I grant (submitted)
Broad group
missed this cpd
2014-2015 Ebola outbreak
March 2014, the
World Health
Organization (WHO)
reported a major
Ebola outbreak in
Guinea, a western
African nation
8 August 2014, the
WHO declared the
epidemic to be an
international public
health emergency
“I urge everyone involved in all aspects of this epidemic to openly and rapidly report their experiences and findings. Information will be one of our key weapons in defeating the Ebola epidemic.” Peter Piot
Wikipedia
Wikipedia
Madrid PB, et al. (2013) A Systematic Screen of FDA-Approved Drugs for Inhibitors of Biological
Threat Agents. PLoS ONE 8(4): e60579. doi:10.1371/journal.pone.0060579
Chloroquine in mouse
Machine Learning for EBOV
• 868 molecules from the viral pseudotype entry assay and the EBOV replication assay
• Salts were stripped and duplicates removed using Discovery Studio 4.1 (Biovia, San
Diego, CA)
• IC50 values less than 50 µM were selected as actives.
• Models generated using : molecular function class fingerprints of maximum diameter 6
(FCFP_6), AlogP, molecular weight, number of rotatable bonds, number of rings,
number of aromatic rings, number of hydrogen bond acceptors, number of hydrogen
bond donors, and molecular fractional polar surface area.
• Models were validated using five-fold cross validation (leave out 20% of the database).
• Bayesian, Support Vector Machine and Recursive Partitioning Forest and single tree
models built.
• RP Forest and RP Single Tree models used the standard protocol in Discovery Studio.
• 5-fold cross validation or leave out 50% x 100 fold cross validation was used to
calculate the ROC for the models generated
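The "leave out 50% x 100" validation mentioned above repeats a random half/half split many times and reports the spread of the metric. A minimal sketch of the resampling step, assuming simple unstratified random splits (the slides do not say whether the splits were stratified); the function name is hypothetical. The demo uses the pseudotype set sizes from the deck (868 molecules, 41 actives) only to show how few actives land in each held-out half.

```python
import random
import statistics


def repeated_half_splits(n_items: int, repeats: int = 100, seed: int = 0):
    """Yield (train_indices, test_indices) for repeated random 50/50 splits."""
    rng = random.Random(seed)
    indices = list(range(n_items))
    half = n_items // 2
    for _ in range(repeats):
        rng.shuffle(indices)
        yield sorted(indices[:half]), sorted(indices[half:])


# 868 molecules with 41 actives; the ordering of labels here is arbitrary.
labels = [True] * 41 + [False] * 827

held_out_actives = [
    sum(labels[i] for i in test_idx)
    for _, test_idx in repeated_half_splits(len(labels), repeats=100)
]

mean_actives = statistics.mean(held_out_actives)
sd_actives = statistics.stdev(held_out_actives)
print(f"actives per held-out half: {mean_actives:.1f} ± {sd_actives:.1f}")
```

In a full run, a model would be trained on each training half and its ROC computed on the matching held-out half, then averaged over the 100 repeats.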
Ebola HTS Machine learning model cross validation Receiver Operator Curve Statistics.
Models (training set 868 compounds) | RP Forest (out of bag ROC) | RP Single Tree (with 5 fold cross validation ROC) | SVM (with 5 fold cross validation ROC) | Bayesian (with 5 fold cross validation ROC) | Bayesian (leave out 50% x 100 ROC) | Open Bayesian (with 5 fold cross validation ROC)
Ebola replication (actives = 20) | 0.70 | 0.78 | 0.73 | 0.86 | 0.86 | 0.82
Ebola Pseudotype (actives = 41) | 0.85 | 0.81 | 0.76 | 0.85 | 0.82 | 0.82
Ekins et al., F1000Res 4:1091, (2015)
Discovery Studio pseudotype Bayesian model
Discovery Studio EBOV replication model
Good Bad
Good Bad
Ekins et al., F1000Res 4:1091, (2015)
Effect of drug treatment on infection with Ebola-GFP
3 Molecules selected from MicroSource Spectrum virtual screen and tested in vitro
All of them nM activity
[Figure: dose-response curves; x-axis Log Conc. (M), y-axis % Ebola infection; series: Chloroquine, Pyronaridine, Quinacrine, Tilorone, untreated control]
Compound | EC50 (µM) [95% CI] | Cytotoxicity CC50 (µM)
Chloroquine | 4.0 [1.0 – 15] | 250
Pyronaridine | 0.42 [0.31 – 0.56] | 3.1
Quinacrine | 0.35 [0.28 – 0.44] | 6.2
Tilorone | 0.23 [0.09 – 0.62] | 6.2
Duplicate experiments
Ekins et al., F1000Res 4:1091, (2015)
Making Ebola models available
• From data published by others …to proposing target
• Collaborated with lab to open up their screening data, build models,
identified more active inhibitors
• To date the most potent drugs and drug-like molecules
• Still a need for a drug that could be used ASAP
• Models in MMDS http://molsync.com/ebola/
More data continues to be published
• We collated 55 molecules from the literature
• A second review lists 60 hits
– Picazo, E. and F. Giordanetto, Drug Discovery Today. 2015 Feb;20(2):277-86
• Additional screens have identified 53 hits and 80 hits respectively
– Kouznetsova, J., et al., Emerg Microbes Infect, 2014. 3(12): p. e84.
– Johansen, L.M., et al., Sci Transl Med, 2015. 7(290): p. 290ra89.
– Litterman N, Lipinski C and Ekins S, F1000Research 2015, 4:38
1000’s of models from ChEMBL
• Skipped targets with > 100,000 assays and sets
with < 100 measurements
• Converted data to –log
• Dealt with duplicates
• 2152 datasets
• Cutoff determination
• Balance active/ inactive ratio
• Favor structural diversity and activity distribution
Clark and Ekins, J Chem Inf Model. 2015 Jun 22;55(6):1246-60
http://molsync.com/bayesian2
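Two of the data preparation steps listed above, converting activities to a negative log scale and handling duplicates, can be sketched as follows. This is an illustration under stated assumptions: activities are taken to be molar concentrations, duplicates are merged by averaging their -log values (the slide does not say how duplicates were handled), and the fixed cutoff of 6.0 is a hypothetical stand-in for the automated cutoff determination. All names and data are invented.

```python
import math
from collections import defaultdict
from statistics import mean


def to_minus_log(molar_concentration: float) -> float:
    """Convert a molar activity value (e.g. IC50 in M) to -log10."""
    if molar_concentration <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(molar_concentration)


def merge_duplicates(records):
    """Average the -log10 activity of repeated measurements per compound."""
    grouped = defaultdict(list)
    for compound_id, molar in records:
        grouped[compound_id].append(to_minus_log(molar))
    return {cid: mean(values) for cid, values in grouped.items()}


def label_actives(minus_log_values, cutoff):
    """Mark a compound active when its -log10 activity reaches the cutoff."""
    return {cid: value >= cutoff for cid, value in minus_log_values.items()}


# Invented measurements: (compound id, activity in mol/L).
records = [("cpd_A", 1e-6), ("cpd_A", 1e-8), ("cpd_B", 1e-5)]

merged = merge_duplicates(records)
labels = label_actives(merged, cutoff=6.0)  # hypothetical cutoff
print(merged)
print(labels)
```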
What do 2000 ChEMBL models look like?
[Figure: average ROC vs. folding bit size]
http://molsync.com/bayesian2 Clark and Ekins, J Chem Inf Model. 2015 Jun 22;55(6):1246-60
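"Folding" in this context means compressing a large space of hashed fingerprint features into a fixed number of bits, at the cost of collisions when the bit size is small. A minimal sketch, assuming folding by modulo on integer feature hashes; the exact scheme used for the models is not given on the slide, and the hash values below are invented.

```python
def fold_fingerprint(feature_hashes, n_bits: int):
    """Map integer feature hashes onto n_bits positions (modulo folding)."""
    if n_bits <= 0:
        raise ValueError("n_bits must be positive")
    return {h % n_bits for h in feature_hashes}


# Invented feature hashes for one molecule.
hashes = {1, 7, 1025, 2049}

small = fold_fingerprint(hashes, 1024)  # 1, 1025 and 2049 collide
large = fold_fingerprint(hashes, 4096)  # no collisions at this size

print(f"1024 bits: {len(small)} distinct bits set")
print(f"4096 bits: {len(large)} distinct bits set")
```

Smaller bit sizes merge unrelated features into one bit, which is one plausible reason for model performance to vary with folding size.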
PolyPharma a new free app for drug discovery
#ZikaOpen
Image by John Liebler
Proposed workflow for rapid drug discovery against Zika virus
Ekins S, Mietchen D, Coffee M et al. F1000Research 2016, 5:150
(doi: 10.12688/f1000research.8013.1)
HOMOLOGY MODELS FOR ZIKA
Models developed with SWISS-MODEL
Will dock millions of compounds
vs these models
Ekins et al., F1000Research 5:275 (2016)
Ekins S, Mietchen D, Coffee M et al. F1000Research 2016, 5:150 (doi: 10.12688/f1000research.8013.1)
Compounds and chemical libraries suggested for testing against Zika virus
• Data is out there to produce models for neglected diseases
• Also modeled Marburg, Lassa, Dengue...
• Computational and experimental collaborations with open data have led to:
– New hits and leads
– New IP
– New grants for collaborators
• Even Ebola had enough data to build models and suggest compounds to test
in 2014
• Zika = starting from scratch – no data – need to use other approaches
• Make findings open and published immediately
• Need for easier facilities to test compounds
• Challenges still – sharing and accessing information / knowledge
• How do we prepare for the next BIG ONE?
Conclusions
Alex Clark
Jair Lage de Siqueira-Neto
Joel Freundlich
Peter Madrid
Robert Davey
Megan Coffee
Robert Reynolds
Nadia Litterman
Christopher Lipinski
Christopher Southan
Antony Williams
Carolyn Talcott
Malabika Sarker
Steven Wright
Mike Pollastri
Ni Ai
Barry Bunin and all colleagues at CDD
Acknowledgments and contact info
ekinssean@yahoo.com
collabchem
ZIKAOPEN ACKNOWLEDGMENTS
Tom Stratton
Priscilla L. Yang
Software on github
Models can be accessed at
• http://molsync.com/bayesian1
• http://molsync.com/bayesian2
• http://molsync.com/transporters
• http://molsync.com/ebola/

CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICEPATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
 

Using Machine Learning Models Based on Phenotypic Data to Discover New Molecules for Neglected Diseases

  • 2. Using Machine Learning Models Based on Phenotypic Data to Discover New Molecules for Neglected Diseases Sean Ekins Collaborative Drug Discovery, Inc., Burlingame, CA. Collaborations Pharmaceuticals, Inc. Fuquay Varina, NC. Collaborations in Chemistry, Inc. Fuquay Varina, NC. Wikipedia
  • 3. Machine Learning Examples • Data is BIG for neglected diseases • To discover new leads • Tuberculosis – from public data to open models to create IP • Chagas Disease - from public data to create new IP • Ebola virus – from little data to create open data and IP • Other diseases, emerging diseases?
  • 4. Neglected Disease Drug Discovery An urgent need for new therapeutics http://www.mm4tb.org/
  • 5. Tuberculosis: kills 1.6-1.7 million/yr (~1 every 19 seconds). 1/3rd of the world's population infected. Drugs: streptomycin (1943), para-aminosalicylic acid (1949), isoniazid (1952), pyrazinamide (1954), cycloserine (1955), ethambutol (1962), rifampicin (1967). Multi-drug resistance in 4.3% of cases. Extensively drug-resistant TB increasing in incidence. One new drug (bedaquiline) in 40 yrs.
  • 6. Bigger Open Data: Screening for New Tuberculosis Treatments. Tested >350,000 molecules; >1500 active and non toxic. Tested ~2M, 2M, >300,000; published 177, 100s, 800. ~350,000 accessible. TBDA screened over 1 million, 1 million more to go. TB Alliance + Japanese pharma screens. R43 LM011152-01
  • 7. Over 8 years analyzed in vitro data and built models. Workflow: High-throughput phenotypic Mtb screening → Mtb screening molecule database/s → Descriptors + Bioactivity (+Cytotoxicity) → Bayesian Machine Learning classification Mtb Model → Molecule Database (e.g. GSK malaria actives) virtually scored using Bayesian Models → Top scoring molecules assayed for Mtb growth inhibition → Identify in vitro hits and test models. New bioactivity data may enhance models. 3 x published prospective tests: ~750 molecules were tested in vitro, 198 actives were identified, >20% hit rate. Multiple retrospective tests: 3-10 fold enrichment. Ekins et al., Pharm Res 31: 414-435, 2014; Ekins et al., Tuberculosis 94: 162-169, 2014; Ekins et al., PLOSONE 8: e63240, 2013; Ekins et al., Chem Biol 20: 370-378, 2013; Ekins et al., JCIM 53: 3054−3063, 2013; Ekins and Freundlich, Pharm Res 28: 1859-1869, 2011; Ekins et al., Mol BioSyst 6: 840-851, 2010; Ekins et al., Mol BioSyst 6: 2316-2324, 2010. R43 LM011152-01
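The hit rate and enrichment figures quoted on this slide can be checked with a little arithmetic. The sketch below uses the slide's prospective totals (198 actives out of ~750 tested); the baseline hit rate passed to `enrichment` is a hypothetical placeholder, since the slide does not state which baseline the 3-10 fold figure was measured against.

```python
def hit_rate(actives: int, tested: int) -> float:
    """Fraction of tested molecules that were confirmed active."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return actives / tested


def enrichment(model_hit_rate: float, baseline_hit_rate: float) -> float:
    """Fold enrichment of a model-guided selection over a baseline hit rate."""
    if baseline_hit_rate <= 0:
        raise ValueError("baseline_hit_rate must be positive")
    return model_hit_rate / baseline_hit_rate


# Totals quoted on the slide: ~750 molecules tested, 198 actives.
prospective = hit_rate(198, 750)
print(f"prospective hit rate: {prospective:.1%}")

# HYPOTHETICAL baseline of 5%, used only to illustrate the calculation.
print(f"fold enrichment: {enrichment(prospective, 0.05):.2f}")
```

With the slide's totals the prospective hit rate comes out at about 26%, consistent with the ">20%" stated.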
  • 8. 5 active compounds vs Mtb in a few months: 7 tested, 5 active (~70% hit rate). Ekins et al., Chem Biol 20, 370–378, 2013. R43 LM011152-01
      1. Virtually screen 13,533-member GSK antimalarial hit library
      2. Bayesian Model = SRI TAACF-CB2 dose response + cytotoxicity model
      3. Top 46 commercially available compounds visually inspected
      4. 7 compounds chosen for Mtb testing based on drug-likeness and chemotype diversity
      GSK # | Bayesian Score | Mtb H37Rv MIC (µg/mL) | GSK Reported % Inhibition HepG2 @ 10 µM cmpd
      TCMDC-123868 | 5.73 | >32 | 40
      TCMDC-125802 | 5.63 | 0.0625 | 5
      TCMDC-124192 | 5.27 | 2.0 | 4
      TCMDC-124334 | 5.20 | 2.0 | 4
      TCMDC-123856 | 5.09 | 1.0 | 83
      TCMDC-123640 | 4.66 | >32 | 10
      TCMDC-124922 | 4.55 | 1.0 | 9
  • 9. Taking a compound in vivo identifies issues. • BAS00521003/TCMDC-125802 reported to be a P. falciparum lactate dehydrogenase inhibitor • Only one report of antitubercular activity, from 1969: solid agar MIC = 1 µg/mL (“wild strain”); “no activity” in mouse model up to 400 mg/kg; however, activity was judged solely by extension of survival (Bruhin, H. et al., J. Pharm. Pharmac. 1969, 21, 423-433) • MIC of 0.0625 µg/mL • 64X MIC affords 6 logs of kill • Resistance and/or drug instability beyond 14 d • Vero cells: CC50 = 4.0 µg/mL • Selectivity Index SI = CC50/MIC(Mtb) = 16 – 64 • In mouse, no toxicity but also no efficacy in GKO model; probably metabolized. Ekins et al., Chem Biol 20, 370–378, 2013. R43 LM011152-01
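The selectivity index on this slide is a simple ratio of cytotoxicity to potency. A minimal sketch, assuming the CC50 and the MIC are expressed in the same concentration units; the upper MIC of 0.25 is an inference from the quoted SI range of 16 to 64, not a value given on the slide.

```python
def selectivity_index(cc50: float, mic: float) -> float:
    """SI = CC50 / MIC, with both values in the same concentration units."""
    if mic <= 0:
        raise ValueError("mic must be positive")
    return cc50 / mic


CC50_VERO = 4.0      # Vero cell CC50 quoted on the slide
MIC_LOW = 0.0625     # MIC quoted on the slide
MIC_HIGH = 0.25      # ASSUMED upper MIC, back-calculated from SI = 16

print(selectivity_index(CC50_VERO, MIC_LOW))   # upper end of the SI range
print(selectivity_index(CC50_VERO, MIC_HIGH))  # lower end of the SI range
```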
  • 10. Modeling and mapping mouse in vivo data. Mouse TB model data over 70 yrs; 784 training and 60 test set; extended earlier study. J Chem Inf Model. 2014 Apr 28;54(4):1070-82
  • 11. Optimizing the triazine series as part of this project, improve solubility and show in vivo efficacy 1U19AI109713-01
  • 12. Chagas Disease • About 7 million to 8 million people estimated to be infected worldwide • Vector-borne transmission occurs in the Americas • A triatomine bug carries the parasite Trypanosoma cruzi, which causes the disease • The disease is curable if treatment is initiated soon after infection • No FDA approved drug, pipeline sparse. Hotez et al., PLoS Negl Trop Dis. 2013 Oct 31;7(10):e2300. R41-AI108003-01
  • 13. T. cruzi Machine Learning models • Modeled data with over 300,000 cpds but focused on smaller set • Dataset from PubChem AID 2044 – Broad Institute data • Dose response data (1853 actives and 2203 inactives) • Dose response and cytotoxicity (1698 actives and 2363 inactives) • EC50 values less than 1 µM were selected as actives • For cytotoxicity, a greater than 10 fold difference compared with EC50 was required • Models generated using: molecular function class fingerprints of maximum diameter 6 (FCFP_6), AlogP, molecular weight, number of rotatable bonds, number of rings, number of aromatic rings, number of hydrogen bond acceptors, number of hydrogen bond donors, and molecular fractional polar surface area • 5-fold cross validation or leave out 50% x 100 fold cross validation was used to calculate the ROC for the models generated. R41-AI108003-01. Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878
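The two activity definitions on this slide (dose response alone, and dose response plus cytotoxicity) can be sketched as a labeling rule. This is an illustration only: it assumes EC50 and CC50 are in the same units (taken here as micromolar), and the compound names and values are hypothetical, not taken from PubChem AID 2044.

```python
EC50_ACTIVE_CUTOFF = 1.0  # actives have EC50 below this (assumed micromolar)
MIN_FOLD_WINDOW = 10.0    # required CC50 / EC50 ratio for the dual-event label


def dose_response_active(ec50: float) -> bool:
    """Dose-response label: active when EC50 is below the cutoff."""
    return ec50 < EC50_ACTIVE_CUTOFF


def dual_event_active(ec50: float, cc50: float) -> bool:
    """Dual-event label: potent AND more than 10-fold less cytotoxic."""
    if ec50 <= 0:
        raise ValueError("ec50 must be positive")
    return dose_response_active(ec50) and (cc50 / ec50) > MIN_FOLD_WINDOW


# HYPOTHETICAL compounds: name -> (EC50, CC50)
compounds = {
    "cpd_A": (0.2, 9.0),    # potent, 45-fold window
    "cpd_B": (0.5, 2.0),    # potent, only 4-fold window
    "cpd_C": (3.0, 100.0),  # not potent enough
}
labels = {name: dual_event_active(ec50, cc50)
          for name, (ec50, cc50) in compounds.items()}
print(labels)
```

Requiring both conditions is why the dual-event training set on the slide has fewer actives (1698) than the dose-response set (1853).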
  • 14. R41-AI108003-01. Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878
      5 fold cross validation:
      Model | Best cutoff | Leave-one out ROC | 5-fold cross validation ROC | 5-fold cross validation sensitivity (%) | 5-fold cross validation specificity (%) | 5-fold cross validation concordance (%)
      Dose response (1853 actives, 2203 inactives) | -0.676 | 0.81 | 0.78 | 77 | 89 | 84
      Dose response and cytotoxicity (1698 actives, 2363 inactives) | -0.337 | 0.82 | 0.80 | 80 | 88 | 84
      Dual event, 50% x 100 fold cross validation:
      External ROC | Internal ROC | Concordance (%) | Specificity (%) | Sensitivity (%)
      0.79 ± 0.01 | 0.80 ± 0.01 | 73.48 ± 1.05 | 79.08 ± 3.73 | 65.68 ± 3.89
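Sensitivity, specificity and concordance all follow from a confusion matrix. The sketch below shows the arithmetic; the four counts are not from the source but are back-calculated so that they match the class sizes (1853 actives, 2203 inactives) and the rounded 77% and 89% quoted for the dose response model.

```python
def classification_stats(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, concordance) as fractions."""
    sensitivity = tp / (tp + fn)                  # actives correctly called
    specificity = tn / (tn + fp)                  # inactives correctly called
    concordance = (tp + tn) / (tp + fn + tn + fp) # overall agreement
    return sensitivity, specificity, concordance


# ILLUSTRATIVE counts, back-calculated from the rounded percentages:
# 1427 + 426 = 1853 actives, 1961 + 242 = 2203 inactives.
sens, spec, conc = classification_stats(tp=1427, fn=426, tn=1961, fp=242)
print(f"sensitivity {sens:.0%}, specificity {spec:.0%}, concordance {conc:.0%}")
```

Under these assumed counts the concordance works out to 84%, in line with the table.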
  • 15. T. cruzi Dose Response and cytotoxicity Machine Learning model features. Good: tertiary amines, piperidines and aromatic fragments with basic nitrogen. Bad: cyclic hydrazines and electron poor chlorinated aromatics. Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878. R41-AI108003-01
  • 16. Bayesian Machine Learning Models applied to: Selleck Chemicals natural product lib. (139 molecules); GSK kinase library (367 molecules); Malaria box (400 molecules); Microsource Spectrum (2320 molecules); CDD FDA drugs (2690 molecules); Prestwick Chemical library (1280 molecules); Traditional Chinese Medicine components (373 molecules). 7569 molecules => 99 molecules. R41-AI108003-01. Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878
  • 17. In vitro and in vivo data for compounds selected. Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878. R41-AI108003-01
      Synonyms | Infection Ratio | EC50 (µM) | EC90 (µM) | Hill slope | Cytotoxicity CC50 (µM) | Chagas mouse model (4 days treatment, luciferase): in vivo efficacy at 50 mg/kg bid (IP) (%)
      (±)-Verapamil hydrochloride, 715730, SC-0011762 | 0.02, 0.02 | 0.0383 | 0.143 | 1.67 | >10.0 | 55.1
      29781612, Pyronaridine | 0.00, 0.00 | 0.225 | 0.665 | 2.03 | 3.0 | 85.2
      511176, Furazolidone | 0.00, 0.00 | 0.257 | 0.563 | 2.81 | >10.0 | 100.5
      501337, SC-0011777, Tetrandrine | 0.00, 0.00 | 0.508 | 1.57 | 1.95 | 1.3 | 43.6
      SC-0011754, Nitrofural | 0.01, 0.01 | 0.775 | 6.98 | 1.00 | >10.0 | 78.5*
      * Used hydroxymethylnitrofurazone for in vivo study (nitrofural pro-drug)
  • 18. 7,569 cpds => 99 cpds => 17 hits (5 in nM range). In vivo efficacy of the 5 tested compounds (figure: infection, treatment and reading timeline, 0-7; groups: Vehicle, Pyronaridine, Furazolidone, Verapamil, Nitrofural, Tetrandrine, Benznidazole). Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878. R41-AI108003-01
  • 19. Pyronaridine: new anti-Chagas and known anti-malarial. EMA approved in combination with artesunate. IC50 of 2 nM against the growth of KT1 and KT3 P. falciparum. Known P-gp inhibitor. Active against Babesia and Theileria, tick-transmitted parasites. Broad group missed this cpd. Work provided starting point for a phase II and phase I grant (submitted). R41-AI108003-01
  • 20. 2014-2015 Ebola outbreak. March 2014: the World Health Organization (WHO) reported a major Ebola outbreak in Guinea, a western African nation. 8 August 2014: the WHO declared the epidemic to be an international public health emergency. “I urge everyone involved in all aspects of this epidemic to openly and rapidly report their experiences and findings. Information will be one of our key weapons in defeating the Ebola epidemic.” Peter Piot. Images: Wikipedia
  • 21. Madrid PB, et al. (2013) A Systematic Screen of FDA-Approved Drugs for Inhibitors of Biological Threat Agents. PLoS ONE 8(4): e60579. doi:10.1371/journal.pone.0060579 Chloroquine in mouse
  • 22. Machine Learning for EBOV • 868 molecules from the viral pseudotype entry assay and the EBOV replication assay • Salts were stripped and duplicates removed using Discovery Studio 4.1 (Biovia, San Diego, CA) • IC50 values less than 50 µM were selected as actives • Models generated using: molecular function class fingerprints of maximum diameter 6 (FCFP_6), AlogP, molecular weight, number of rotatable bonds, number of rings, number of aromatic rings, number of hydrogen bond acceptors, number of hydrogen bond donors, and molecular fractional polar surface area • Models were validated using five-fold cross validation (leave out 20% of the database) • Bayesian, Support Vector Machine, Recursive Partitioning Forest and single tree models were built • RP Forest and RP Single Tree models used the standard protocol in Discovery Studio • 5-fold cross validation or leave out 50% x 100 fold cross validation was used to calculate the ROC for the models generated
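The validation described on this slide rests on two ideas: holding out 20% of the data in each of five folds, and summarizing ranking quality with the area under the ROC curve. A minimal pure-Python sketch of both follows. It is not the Discovery Studio protocol; the deterministic fold assignment and the example scores are assumptions for illustration, and a real split would normally be shuffled.

```python
def roc_auc(scores, labels):
    """ROC AUC as the probability that a random active outscores a random
    inactive (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_items: int, k: int = 5):
    """Assign item indices to k held-out folds (every k-th item per fold)."""
    return [list(range(start, n_items, k)) for start in range(k)]


# HYPOTHETICAL model scores and activity labels (1 = active).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
print(f"ROC AUC: {roc_auc(scores, labels):.3f}")

# Five folds over ten items: each fold holds out 20% of the data.
folds = k_fold_indices(10, k=5)
print(folds)
```

In a full cross validation, a model would be trained on the four remaining folds each time and `roc_auc` computed on the held-out scores.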
  • 23. Ebola HTS Machine learning model cross validation Receiver Operator Curve Statistics. Ekins et al., F1000Res 4:1091, (2015)
      Models (training set 868 compounds) | RP Forest (Out of bag ROC) | RP Single Tree (with 5 fold cross validation ROC) | SVM (with 5 fold cross validation ROC) | Bayesian (with 5 fold cross validation ROC) | Bayesian (leave out 50% x 100 ROC) | Open Bayesian (with 5 fold cross validation ROC)
      Ebola replication (actives = 20) | 0.70 | 0.78 | 0.73 | 0.86 | 0.86 | 0.82
      Ebola Pseudotype (actives = 41) | 0.85 | 0.81 | 0.76 | 0.85 | 0.82 | 0.82
  • 24. Good and bad features for the Discovery Studio pseudotype Bayesian model and the Discovery Studio EBOV replication model. Ekins et al., F1000Res 4:1091, (2015)
  • 25. Effect of drug treatment on infection with Ebola-GFP. 3 molecules selected from MicroSource Spectrum virtual screen and tested in vitro; all of them nM activity. Figure: % Ebola infection vs. Log Conc. (M) for chloroquine, pyronaridine, quinacrine, tilorone and untreated control; duplicate experiments. Ekins et al., F1000Res 4:1091, (2015)
      Compound | EC50 (µM) [95% CI] | Cytotoxicity CC50 (µM)
      Chloroquine (control) | 4.0 [1.0 – 15] | 250
      Pyronaridine | 0.42 [0.31 – 0.56] | 3.1
      Quinacrine | 0.35 [0.28 – 0.44] | 6.2
      Tilorone | 0.23 [0.09 – 0.62] | 6.2
  • 26. Making Ebola models available • From data published by others … to proposing target • Collaborated with lab to open up their screening data, build models, identified more active inhibitors • To date the most potent drugs and drug-like molecules • Still a need for a drug that could be used ASAP • Models in MMDS http://molsync.com/ebola/ More data continues to be published • We collated 55 molecules from the literature (Litterman N, Lipinski C and Ekins S, F1000Research 2015, 4:38) • A second review lists 60 hits – Picazo, E. and F. Giordanetto, Drug Discovery Today. 2015 Feb;20(2):277-86 • Additional screens have identified 53 hits and 80 hits respectively – Kouznetsova, J., et al., Emerg Microbes Infect, 2014. 3(12): p. e84. – Johansen, L.M., et al., Sci Transl Med, 2015. 7(290): p. 290ra89.
  • 27. 1000’s of models from ChEMBL • Skipped targets with > 100,000 assays and sets with < 100 measurements • Converted data to –log • Dealt with duplicates • 2152 datasets • Cutoff determination • Balance active/inactive ratio • Favor structural diversity and activity distribution. Clark and Ekins, J Chem Inf Model. 2015 Jun 22;55(6):1246-60. http://molsync.com/bayesian2
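The data preparation steps listed on this slide (convert to –log, deal with duplicates, apply a cutoff) can be sketched as follows. The slide does not say how each step was done, so several choices here are assumptions: concentrations are taken to be in nM, duplicates are merged by averaging their –log values, and the cutoff of 6.0 is a hypothetical example rather than the per-dataset cutoff the authors determined.

```python
import math
from collections import defaultdict


def to_neg_log(conc_nm: float) -> float:
    """Convert a concentration in nM to -log10(molar), e.g. 1000 nM -> 6."""
    if conc_nm <= 0:
        raise ValueError("concentration must be positive")
    return 9.0 - math.log10(conc_nm)


def merge_duplicates(records):
    """Average the -log values of repeated measurements per compound
    (ASSUMED strategy; the slide only says duplicates were dealt with)."""
    grouped = defaultdict(list)
    for compound_id, conc_nm in records:
        grouped[compound_id].append(to_neg_log(conc_nm))
    return {cid: sum(vals) / len(vals) for cid, vals in grouped.items()}


def binarize(neg_log_values, cutoff: float):
    """Label a compound active when its -log value meets the cutoff."""
    return {cid: value >= cutoff for cid, value in neg_log_values.items()}


# HYPOTHETICAL measurements: (compound id, concentration in nM)
records = [("mol1", 100.0), ("mol1", 1000.0), ("mol2", 10000.0)]
merged = merge_duplicates(records)
activity = binarize(merged, cutoff=6.0)  # hypothetical cutoff
print(merged, activity)
```

Working on the –log scale keeps potency differences roughly linear, which makes a single numeric cutoff easier to reason about across datasets.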
  • 28. What do 2000 ChEMBL models look like? Figure: average ROC vs. folding bit size. http://molsync.com/bayesian2 Clark and Ekins, J Chem Inf Model. 2015 Jun 22;55(6):1246-60
  • 29. PolyPharma a new free app for drug discovery
  • 31. Proposed workflow for rapid drug discovery against Zika virus Ekins S, Mietchen D, Coffee M et al. F1000Research 2016, 5:150 (doi: 10.12688/f1000research.8013.1)
  • 32. HOMOLOGY MODELS FOR ZIKA Models developed with SWISS-MODEL Will dock millions of compounds vs these models Ekins et al., F1000Research 5:275 (2016)
  • 33. Compounds and chemical libraries suggested for testing against Zika virus. Ekins S, Mietchen D, Coffee M et al. F1000Research 2016, 5:150 (doi: 10.12688/f1000research.8013.1)
  • 34. Conclusions • Data is out there to produce models for neglected diseases • Also modeled Marburg, Lassa, Dengue... • Computational and experimental collaborations with open data have led to: – New hits and leads – New IP – New grants for collaborators • Even Ebola had enough data to build models and suggest compounds to test in 2014 • Zika = starting from scratch – no data – need to use other approaches • Make findings open and published immediately • Need for easier facilities to test compounds • Challenges still – sharing and accessing information / knowledge • How do we prepare for the next BIG ONE?
  • 35. Acknowledgments and contact info: Alex Clark, Jair Lage de Siqueira-Neto, Joel Freundlich, Peter Madrid, Robert Davey, Megan Coffee, Robert Reynolds, Nadia Litterman, Christopher Lipinski, Christopher Southan, Antony Williams, Carolyn Talcott, Malabika Sarker, Steven Wright, Mike Pollastri, Ni Ai, Barry Bunin and all colleagues at CDD. ekinssean@yahoo.com collabchem
  • 37. Software on github Models can be accessed at • http://molsync.com/bayesian1 • http://molsync.com/bayesian2 • http://molsync.com/transporters • http://molsync.com/ebola/