A high-level overview of using artificial intelligence and chemical structure information to predict toxicity in various species. Discusses molecular docking, deep learning, quantitative structure-activity relationships, Bayesian networks, and cats (lots of cat pictures). Part of my research portfolios on artificial intelligence for national security, artificial intelligence for warfighter readiness, and alternative methods for toxicity prediction.
In Silico Approaches for Predicting Hazards from Chemical Structure and Existing Data
1. In Silico Approaches for Predicting Hazards from Chemical Structure and Existing Data
Lyle D. Burgoon, Ph.D.
Leader, Bioinformatics and Computational Toxicology
US Army Engineer Research and Development Center
Opinions expressed are those of the author and do not necessarily reflect US Army policy.
5. Why????!!!!
• About 80,000 data-poor chemicals in the environment
• Threatened and Endangered Species
  • Ethics of testing
  • Permits, practicality
• Human health
  • Ethics of testing
  • Species extrapolation issues
• Ecological species and population impacts
  • Species extrapolation issues
  • Ethics of testing
  • Cost
  • Which species get tested
9. What Should I Choose?
Match your time constraints with what information you have
Emergency Response?
Military Intelligence?
Site cleanup?
Prioritization?
13. Capturing:
- Affinity
- Model protein crystal
- Any modifications to the crystal
- Chemical structure
- Version of DAMSL model
DAMSL: Digital Automated Molecular Screening Library
Downside: Accuracy tends not to be as great as a structure-based model
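The "Capturing" items for a docking run could be stored together as one provenance record. The sketch below is only an illustration: the class name `DockingRecord`, the field names, the affinity units, the use of a SMILES string for the chemical structure, and every example value are assumptions, not part of DAMSL itself.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class DockingRecord:
    """One docking prediction plus the provenance items listed on the slide.

    Hypothetical structure for illustration; not DAMSL's actual format.
    """
    affinity: float                    # predicted affinity (units assumed: kcal/mol)
    protein_crystal: str               # identifier of the model protein crystal
    chemical_structure: str            # chemical structure (assumed: SMILES string)
    damsl_version: str                 # version of the DAMSL model used
    crystal_modifications: tuple = ()  # any modifications made to the crystal


# All values below are made up for the example.
record = DockingRecord(
    affinity=-7.2,
    protein_crystal="EXAMPLE-CRYSTAL-ID",
    chemical_structure="CCO",
    damsl_version="0.0.0-example",
    crystal_modifications=("removed waters",),
)

# Serialize so the prediction and its provenance travel together.
record_json = json.dumps(asdict(record), sort_keys=True)
print(record_json)
```

Keeping the model version and crystal modifications next to the affinity means a prediction can be traced back to exactly what produced it.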
25. Briefly, what is deep learning?
• Artificial intelligence approach
• Misconceptions
  • Always requires a lot of data
    • Not necessarily – it’s relative to a lot of things, including what you’re trying to do
  • Always overfits when you don’t give it a lot of data
    • Not necessarily – depends on a lot of things; simpler methods can also overfit
    • The architecture of your neural network is important
• What is true…
  • There’s a lot of art to designing the optimal network
  • Like any technique or approach, it’s best to get training before you jump in
  • Lots of free training on the web, lots of tutorials
28. If you want to start learning deep learning…
• Kaggle is a great place to learn – several tutorials
• Lots of blogs with tutorials
• Online and traditional courses are popping up a lot
29. Deep Learning Approach to Predict PPAR-gamma Ligands
• Ground Truth Dataset: 796 chemicals
  • Ligands: 33 chemicals
  • Not Ligands: 763 chemicals
  • This is pretty typical – very few chemicals will be ligands
• Accuracy (10-fold cross-validation): 94.5%
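With so few ligands in the dataset, it helps to compare any accuracy figure against a trivial reference point. The sketch below uses only the counts on the slide and computes the accuracy of a hypothetical baseline that labels every chemical "not ligand". The variable names are illustrative, and nothing here says anything about the model's per-class performance, which the slide does not report.

```python
# Counts taken from the slide.
n_total = 796
n_ligands = 33
n_not_ligands = 763
assert n_ligands + n_not_ligands == n_total

# Hypothetical baseline: always predict "not ligand".
# It is correct for every non-ligand and wrong for every ligand.
baseline_accuracy = n_not_ligands / n_total

# Accuracy reported on the slide (10-fold cross-validation).
reported_accuracy = 0.945

print(f"Majority-class baseline accuracy: {baseline_accuracy:.1%}")
print(f"Reported model accuracy:          {reported_accuracy:.1%}")
```

The baseline comes out at about 95.9%, which is why sensitivity and specificity are useful companions to accuracy when classes are this imbalanced.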
35. APECS
Autoencoder Predicting Estrogenic Chemical Substances
Capture:
• APECS version
• Estrogenicity prediction
• Chemical information
• ToxCast data version and assays used for training APECS
• Sensitivity and Specificity data
Burgoon, L.D. Computational Toxicology 2: 45-49. https://doi.org/10.1016/j.comtox.2017.03.002
36. APECS
Autoencoder Predicting Estrogenic Chemical Substances
In Vivo Model
Sensitivity: 97%
Specificity: 80%
Accuracy: 91%
In Vitro Model
Sensitivity: 100%
Specificity: 75%
Accuracy: 93%
Burgoon, L.D. Computational Toxicology 2: 45-49. https://doi.org/10.1016/j.comtox.2017.03.002
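For readers less familiar with these metrics, the following minimal sketch shows how sensitivity, specificity, and accuracy are derived from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not the APECS results or data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: estrogenic chemicals predicted estrogenic (true positives)
    fn: estrogenic chemicals predicted non-estrogenic (false negatives)
    tn: non-estrogenic chemicals predicted non-estrogenic (true negatives)
    fp: non-estrogenic chemicals predicted estrogenic (false positives)
    """
    sensitivity = tp / (tp + fn)                # fraction of true actives caught
    specificity = tn / (tn + fp)                # fraction of true inactives cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all calls that are right
    return sensitivity, specificity, accuracy


# Hypothetical counts, NOT taken from the APECS paper.
sens, spec, acc = classification_metrics(tp=18, fn=2, tn=24, fp=6)
print(f"Sensitivity: {sens:.0%}, Specificity: {spec:.0%}, Accuracy: {acc:.0%}")
```

Accuracy is a prevalence-weighted blend of the other two numbers, so it always falls between sensitivity and specificity.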
43. We fed these data into our AOPBN (adverse outcome pathway Bayesian network)
Angrish, M.M., et al. (2017). Mechanistic Toxicity Tests Based on an Adverse Outcome Pathway Network for Hepatic Steatosis. Toxicol. Sci. 159, 159–169.
45. Why I like AOPBNs
• Causal networks
• Use maths to identify the Minimally Sufficient Set of Key Events (MinSSKEs)
  • Minimal set of key events sufficient to infer an adverse outcome
• Devise scenarios to measure the value of information associated with each key event and sets of key events
• Devise test batteries that maximize value of information while minimizing resource costs
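To make the idea of a minimally sufficient set of key events concrete, here is a toy sketch of a three-node causal chain (KE1 → KE2 → adverse outcome) solved by brute-force enumeration. The network shape, node names, and all probabilities are invented for illustration; this is not the published steatosis AOPBN or the method used to build it.

```python
from itertools import product

# Hypothetical conditional probabilities for a chain KE1 -> KE2 -> AO.
P_KE1 = 0.3                                # P(KE1 active)
P_KE2_GIVEN_KE1 = {True: 0.9, False: 0.1}  # P(KE2 active | KE1)
P_AO_GIVEN_KE2 = {True: 0.8, False: 0.05}  # P(adverse outcome | KE2)


def joint(ke1, ke2, ao):
    """Joint probability of one full assignment of the three nodes."""
    p = P_KE1 if ke1 else 1 - P_KE1
    p_ke2 = P_KE2_GIVEN_KE1[ke1]
    p *= p_ke2 if ke2 else 1 - p_ke2
    p_ao = P_AO_GIVEN_KE2[ke2]
    p *= p_ao if ao else 1 - p_ao
    return p


def prob_ao(evidence):
    """P(AO = True | evidence), summing the joint over all consistent states."""
    numerator = denominator = 0.0
    for ke1, ke2, ao in product([True, False], repeat=3):
        state = {"ke1": ke1, "ke2": ke2, "ao": ao}
        if any(state[name] != value for name, value in evidence.items()):
            continue  # state contradicts what was observed
        p = joint(ke1, ke2, ao)
        denominator += p
        if ao:
            numerator += p
    return numerator / denominator


print(prob_ao({}))                           # no measurements
print(prob_ao({"ke1": True}))                # upstream key event measured
print(prob_ao({"ke2": True}))                # downstream key event measured
print(prob_ao({"ke1": True, "ke2": True}))   # both measured
```

In this toy chain, measuring KE2 gives the same answer whether or not KE1 is also measured, so KE2 alone is sufficient and testing KE1 as well adds cost without adding information.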
48. What Should I Choose?
Match your time constraints with what information you have
Emergency Response?
Military Intelligence?
Site cleanup?
Prioritization?
49. Tools
• I’m developing freely available, open source, ‘government off the shelf’ software for everything presented here
• If you are interested in learning how to do this stuff on your own, chat me up
50. Acknowledgements
• Shannon Bell (ILS)
• Ed Perkins (Army ERDC)
• Natalia Vinas (Army ERDC)
• Agnes Karmaus (ILS)
• Michelle Angrish (EPA)
• Ingrid Druwe (formerly ORISE, currently EPA)
• Erin Yost (formerly ORISE, currently EPA)
• Kyle Painter (formerly ORISE)
• Supported by the US Army Environmental Quality and Installations Program
51. Contact me for more!
• Email: lyle.d.burgoon@usace.army.mil
• Twitter: @DataSciBurgoon
• ORCID: https://orcid.org/0000-0003-4977-5352