In Silico Approaches for Predicting Hazards from Chemical Structure and Existing Data
Lyle D. Burgoon, Ph.D.
Leader, Bioinformatics and Computational Toxicology
US Army Engineer Research and Development Center
Opinions expressed are those of the author and do not necessarily reflect US Army policy.
DECIDE FASTER!
WITH SAME OR FEWER RESOURCES
WITH LESS DATA
Why????!!!!
• About 80,000 data-poor chemicals in the environment
• Threatened and Endangered Species
  • Ethics of testing
  • Permits, practicality
• Human health
  • Ethics of testing
  • Species extrapolation issues
• Ecological species and population impacts
  • Species extrapolation issues
  • Ethics of testing
  • Cost
  • Which species get tested
There has got to be a better way…
What Should I Choose?
Match your time constraints with what information you have
Emergency Response?
Military Intelligence?
Site cleanup?
Prioritization?
docking.
Capturing:
- Affinity
- Model protein crystal
- Any modifications to the crystal
- Chemical structure
- Version of DAMSL model
DAMSL:
Digital Automated Molecular Screening Library
Downside: accuracy tends not to be as great as that of a structure-based model
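The items captured for each docking run amount to a provenance record. A minimal sketch of such a record in Python follows. The class name, field names, units, and example values are all assumptions made for illustration; none of them come from DAMSL itself.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class DockingRecord:
    """One docking result plus the provenance needed to reproduce it.

    All field names are hypothetical; they mirror the captured items.
    """
    affinity_kcal_per_mol: float   # predicted binding affinity (units assumed)
    protein_crystal_id: str        # identifier of the model protein crystal
    crystal_modifications: tuple   # any modifications made to the crystal
    chemical_structure: str        # e.g. a SMILES string
    model_version: str             # version of the docking model used


def to_json(record: DockingRecord) -> str:
    """Serialize a record so it can be stored alongside the prediction."""
    return json.dumps(asdict(record), sort_keys=True)


# Placeholder values for illustration only, not real results.
example = DockingRecord(
    affinity_kcal_per_mol=-7.5,
    protein_crystal_id="EXAMPLE-CRYSTAL-ID",
    crystal_modifications=("removed waters",),
    chemical_structure="CCO",
    model_version="0.0.0-example",
)
print(to_json(example))
```

Storing the model version and crystal modifications with every affinity value is what lets a later reader tell which predictions are comparable.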
qsar.
in a nutshell (spoiler alert: there’s a little math).
f(x) = hazard (yes/no)
f(x) = LD50
where f is some mathematical function applied to x, and x is the chemical structure
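The f(x) idea can be sketched in a few lines of Python. This sketch assumes x has already been turned into a short vector of numeric descriptors and that f is a logistic function. The weights, bias, descriptor values, and function names are invented for illustration and are not from any fitted QSAR model.

```python
import math

# Hypothetical weights for three made-up structural descriptors.
# A real QSAR model would learn these from training data.
WEIGHTS = [0.8, -0.5, 1.2]
BIAS = -1.0


def hazard_probability(x):
    """f(x): map a descriptor vector x to a probability of hazard."""
    if len(x) != len(WEIGHTS):
        raise ValueError("descriptor vector has the wrong length")
    score = BIAS + sum(w * xi for w, xi in zip(WEIGHTS, x))
    return 1.0 / (1.0 + math.exp(-score))


def is_hazard(x, threshold=0.5):
    """f(x) = hazard (yes/no): threshold the probability."""
    return hazard_probability(x) >= threshold


descriptors = [1.0, 2.0, 0.5]  # placeholder descriptor values
print(hazard_probability(descriptors), is_hazard(descriptors))
```

For a continuous endpoint such as LD50, the same shape applies with the logistic step replaced by a regression output.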
deep learning.
it's not just for cat pictures anymore
Briefly, what is deep learning?
• Artificial intelligence approach
• Misconceptions
  • Always requires a lot of data
    • Not necessarily – relative to a lot of things, and what you’re trying to do
  • Always overfits when you don’t give it a lot of data
    • Not necessarily – depends on a lot of things; simpler methods can also overfit
    • The architecture of your neural network is important
• What is true…
  • There’s a lot of art to designing the optimal network
  • Like any technique or approach, it’s best to get training before you jump in
  • Lots of free training on the web, lots of tutorials
Figure: a neural network with 1 or more hidden layers of neurons; the output is the probability of hazard
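The picture of hidden layers feeding a single probability can be written out as a forward pass. The sketch below is an untrained toy network in plain Python. The layer sizes, the ReLU and sigmoid activations, the random weights, and the input features are all assumptions chosen for illustration, not the architecture used in the work presented here.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    return max(0.0, z)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(weights * inputs + bias) per neuron."""
    return [
        activation(b + sum(w * x for w, x in zip(row, inputs)))
        for row, b in zip(weights, biases)
    ]


def predict_hazard(features, hidden_layers, output_layer):
    """Forward pass: features -> hidden layer(s) -> probability of hazard."""
    activations = features
    for weights, biases in hidden_layers:
        activations = dense(activations, weights, biases, relu)
    out_weights, out_bias = output_layer
    return dense(activations, [out_weights], [out_bias], sigmoid)[0]


# Randomly initialised, untrained network: 4 inputs, one hidden layer of 3 neurons.
rng = random.Random(0)
hidden = [([[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)], [0.0] * 3)]
output = ([rng.uniform(-1, 1) for _ in range(3)], 0.0)

p = predict_hazard([0.2, 0.0, 1.0, 0.5], hidden, output)
print(f"probability of hazard: {p:.3f}")
```

Training would adjust the weights and biases so that the output matches known hazard labels; adding entries to `hidden` adds more hidden layers.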
If you want to start learning deep learning…
• Kaggle is a great place to learn – several tutorials
• Lots of blogs with tutorials
• Online and traditional courses are popping up a lot
Deep Learning Approach to Predict PPAR-gamma Ligands
• Ground Truth Dataset: 796 chemicals
  • Ligands: 33 chemicals
  • Not Ligands: 763
  • This is pretty typical – very few chemicals will be ligands
• Accuracy (10-fold cross-validation): 94.5%
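With so few ligands in the dataset, it helps to look at the class balance next to the accuracy figure. The arithmetic below uses only the counts given above. The "always not ligand" rule is a hypothetical reference point added for illustration and is not part of the reported analysis.

```python
n_total = 796
n_ligands = 33
n_not_ligands = 763

# The two classes should account for every chemical in the dataset.
assert n_ligands + n_not_ligands == n_total

prevalence = n_ligands / n_total
# Accuracy of a trivial rule that labels every chemical "not ligand".
majority_baseline = n_not_ligands / n_total

print(f"ligand prevalence: {prevalence:.1%}")                     # 4.1%
print(f"always-'not ligand' accuracy: {majority_baseline:.1%}")   # 95.9%
```

On a dataset this imbalanced, sensitivity and specificity are useful companions to accuracy, because accuracy alone is dominated by the larger class.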
assay data integration.
bayesian network approach.
deep learning approach.
APECS
Autoencoder Predicting Estrogenic Chemical Substances
Capture:
• APECS version
• Estrogenicity prediction
• Chemical information
• ToxCast data version and assays used for training APECS
• Sensitivity and specificity data
Burgoon, L.D. Computational Toxicology 2: 45-49. https://doi.org/10.1016/j.comtox.2017.03.002
In Vivo Model
Sensitivity: 97%
Specificity: 80%
Accuracy: 91%
In Vitro Model
Sensitivity: 100%
Specificity: 75%
Accuracy: 93%
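Sensitivity, specificity, and accuracy are tied together by the share of positives in the evaluation set: accuracy = sensitivity × p + specificity × (1 − p). The sketch below applies that identity to the rounded figures above to back out the implied share p. The function names are invented here, and because the inputs are rounded percentages, the results are rough estimates rather than reported values.

```python
def accuracy_from(sensitivity, specificity, positive_fraction):
    """Accuracy as a prevalence-weighted mix of sensitivity and specificity."""
    return sensitivity * positive_fraction + specificity * (1.0 - positive_fraction)


def implied_positive_fraction(sensitivity, specificity, accuracy):
    """Solve the identity above for the fraction of positives."""
    if sensitivity == specificity:
        raise ValueError("not identifiable when sensitivity equals specificity")
    return (accuracy - specificity) / (sensitivity - specificity)


# (sensitivity, specificity, accuracy) as rounded on the slide.
reported = {
    "in vivo": (0.97, 0.80, 0.91),
    "in vitro": (1.00, 0.75, 0.93),
}
for name, (sens, spec, acc) in reported.items():
    p = implied_positive_fraction(sens, spec, acc)
    print(f"{name}: implied positive fraction ~ {p:.2f}")
```

The same identity is a quick consistency check when any three of the four quantities are known.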
adverse outcome pathway bayesian networks.
predict probability of adverse outcomes.
Example Workflow (steroidogenesis)
AOPXplorer
Visualize results using AOPXplorer – a Cytoscape App that facilitates AOP-based data visualization
https://github.com/DataSciBurgoon/bisct/releases/tag/1.1.2
Screenshot of BISCT following analysis of the ToxCast H295R prochloraz screening dataset (Karmaus et al. (2016). Toxicol. Sci. 150(2): 323-332).
Steatosis
We fed this data into our AOPBN
Angrish, M.M., et al. (2017). Mechanistic Toxicity Tests Based on an Adverse Outcome Pathway Network for Hepatic Steatosis. Toxicol. Sci. 159, 159–169.
We got these results
Why I like AOPBNs
• Causal networks
• Use maths to identify the Minimally Sufficient Set of Key Events (MinSSKEs)
  • Minimal set of key events sufficient to infer an adverse outcome
• Devise scenarios to measure the value of information associated with each key event and sets of key events
• Devise test batteries that maximize value of information while minimizing resource costs
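The ideas in this list can be illustrated on a toy causal chain, KE1 → KE2 → AO, using exact inference by enumeration. This is a sketch under stated assumptions: the three-node structure, the node names, and every probability are made up for illustration. It is not the steroidogenesis or steatosis network, and it is not the published MinSSKE procedure.

```python
from itertools import product

# Hypothetical three-node causal chain: KE1 -> KE2 -> AO.
P_KE1 = 0.3                                   # P(KE1 active)
P_KE2_GIVEN_KE1 = {True: 0.9, False: 0.1}     # P(KE2 active | KE1)
P_AO_GIVEN_KE2 = {True: 0.8, False: 0.05}     # P(AO | KE2)


def joint(ke1, ke2, ao):
    """Joint probability of one full assignment, from the chain factorisation."""
    p = P_KE1 if ke1 else 1 - P_KE1
    p *= P_KE2_GIVEN_KE1[ke1] if ke2 else 1 - P_KE2_GIVEN_KE1[ke1]
    p *= P_AO_GIVEN_KE2[ke2] if ao else 1 - P_AO_GIVEN_KE2[ke2]
    return p


def prob_adverse_outcome(evidence=None):
    """P(AO = True | evidence) by summing the joint over all consistent states."""
    evidence = evidence or {}
    num = den = 0.0
    for ke1, ke2, ao in product([True, False], repeat=3):
        state = {"KE1": ke1, "KE2": ke2, "AO": ao}
        if any(state[k] != v for k, v in evidence.items()):
            continue
        p = joint(ke1, ke2, ao)
        den += p
        if ao:
            num += p
    return num / den


print(prob_adverse_outcome())                              # no measurements
print(prob_adverse_outcome({"KE1": True}))                 # upstream event measured
print(prob_adverse_outcome({"KE2": True}))                 # downstream event measured
print(prob_adverse_outcome({"KE1": True, "KE2": True}))    # both measured
```

In this toy chain, once KE2 is measured, also measuring KE1 does not change the probability of the adverse outcome, so KE2 alone is sufficient. That is one intuition for a minimally sufficient set of key events and for asking how much information each additional measurement is worth.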
Value of key event analysis
practical advice.
What Should I Choose?
Match your time constraints with what information you have
Emergency Response?
Military Intelligence?
Site cleanup?
Prioritization?
Tools
• I’m developing freely available, open source, ‘government off the shelf’ software for everything presented here
• If you are interested in learning how to do this stuff on your own, chat me up
Acknowledgements
• Shannon Bell (ILS)
• Ed Perkins (Army ERDC)
• Natalia Vinas (Army ERDC)
• Agnes Karmaus (ILS)
• Michelle Angrish (EPA)
• Ingrid Druwe (formerly ORISE, currently EPA)
• Erin Yost (formerly ORISE, currently EPA)
• Kyle Painter (formerly ORISE)
• Supported by the US Army Environmental Quality and Installations Program
Contact me for more!
• Email: lyle.d.burgoon@usace.army.mil
• Twitter: @DataSciBurgoon
• ORCID: https://orcid.org/0000-0003-4977-5352

 
Introduction to Mean Field Theory(MFT).pptx
Introduction to Mean Field Theory(MFT).pptxIntroduction to Mean Field Theory(MFT).pptx
Introduction to Mean Field Theory(MFT).pptx
zeex60
 
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Erdal Coalmaker
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
silvermistyshot
 
Chapter 12 - climate change and the energy crisis
Chapter 12 - climate change and the energy crisisChapter 12 - climate change and the energy crisis
Chapter 12 - climate change and the energy crisis
tonzsalvador2222
 
DMARDs Pharmacolgy Pharm D 5th Semester.pdf
DMARDs Pharmacolgy Pharm D 5th Semester.pdfDMARDs Pharmacolgy Pharm D 5th Semester.pdf
DMARDs Pharmacolgy Pharm D 5th Semester.pdf
fafyfskhan251kmf
 

Recently uploaded (20)

Hemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptxHemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptx
 
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
 
Leaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdfLeaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdf
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
 
GBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram StainingGBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram Staining
 
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
 
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...
 
Phenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvementPhenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvement
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 
Mudde & Rovira Kaltwasser. - Populism - a very short introduction [2017].pdf
Mudde & Rovira Kaltwasser. - Populism - a very short introduction [2017].pdfMudde & Rovira Kaltwasser. - Populism - a very short introduction [2017].pdf
Mudde & Rovira Kaltwasser. - Populism - a very short introduction [2017].pdf
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
 
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
 
Introduction to Mean Field Theory(MFT).pptx
Introduction to Mean Field Theory(MFT).pptxIntroduction to Mean Field Theory(MFT).pptx
Introduction to Mean Field Theory(MFT).pptx
 
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
 
Chapter 12 - climate change and the energy crisis
Chapter 12 - climate change and the energy crisisChapter 12 - climate change and the energy crisis
Chapter 12 - climate change and the energy crisis
 
DMARDs Pharmacolgy Pharm D 5th Semester.pdf
DMARDs Pharmacolgy Pharm D 5th Semester.pdfDMARDs Pharmacolgy Pharm D 5th Semester.pdf
DMARDs Pharmacolgy Pharm D 5th Semester.pdf
 

In Silico Approaches for Predicting Hazards from Chemical Structure and Existing Data

  • 1. In Silico Approaches for Predicting Hazards from Chemical Structure and Existing Data Lyle D. Burgoon, Ph.D. Leader, Bioinformatics and Computational Toxicology US Army Engineer Research and Development Center Opinions expressed are those of the author and do not necessarily reflect US Army policy.
  • 3. DECIDE FASTER! WITH SAME OR FEWER RESOURCES
  • 4. DECIDE FASTER! WITH SAME OR FEWER RESOURCES WITH LESS DATA
  • 5. Why????!!!! • About 80,000 data-poor chemicals in the environment • Threatened and Endangered Species • Ethics of testing • Permits, practicality • Human health • Ethics of testing • Species extrapolation issues • Ecological species and population impacts • Species extrapolation issues • Ethics of testing • Cost • Which species get tested
  • 6. There has got to be a better way…
  • 8. What Should I Choose?
  • 9. What Should I Choose? Match your time constraints with what information you have
  • 10. What Should I Choose? Match your time constraints with what information you have Emergency Response? Military Intelligence? Site cleanup? Prioritization?
  • 13. Capturing: - Affinity - Model protein crystal - Any modifications to the crystal - Chemical structure - Version of DAMSL model DAMSL: Digital Automated Molecular Screening Library
  • 14. Capturing: - Affinity - Model protein crystal - Any modifications to the crystal - Chemical structure - Version of DAMSL model DAMSL: Digital Automated Molecular Screening Library Downside: Accuracy tends not to be as great as a structure-based model
  • 15. qsar.
  • 17. qsar. in a nutshell (spoiler alert: there’s a little math).
  • 18. f(x) = hazard (yes/no) f(x) = LD50 some mathematical function applied on x some mathematical function applied on x
  • 20. f( ) = hazard
  • 21. f( ) = LD50
  • 24. deep learning. it’s not just for cat pictures anymore
  • 25. Briefly, what is deep learning? • Artificial intelligence approach • Misconceptions • Always requires a lot of data • Not necessarily – relative to a lot of things, and what you’re trying to do • Always overfits when you don’t give it a lot of data • Not necessarily – depends on a lot of things; simpler methods can also overfit • The architecture of your neural network is important • What is true… • There’s a lot of art to designing the optimal network • Like any technique or approach, it’s best to get training before you jump in • Lots of free training on the web, lots of tutorials
  • 26. 1 or more hidden layers of neurons Probability of Hazard
  • 27. 1 or more hidden layers of neurons Probability of Hazard
  • 28. If you want to start learning deep learning… • Kaggle is a great place to learn – several tutorials • Lots of blogs with tutorials • Online and traditional courses are popping up a lot
  • 29. Deep Learning Approach to Predict PPAR-gamma Ligands • Ground Truth Dataset: 796 chemicals • Ligands: 33 chemicals • Not Ligands: 763 • This is pretty typical – very few chemicals will be ligands • Accuracy (10-fold cross-validation): 94.5%
  • 31. assay data integration. bayesian network approach.
  • 34. assay data integration. deep learning approach.
  • 35. APECS Autoencoder Predicting Estrogenic Chemical Substances Capture: • APECS version • Estrogenicity prediction • Chemical information • ToxCast data version and assays used for training APECS • Sensitivity and Specificity data Burgoon, L.D. Computational Toxicology 2: 45-49. https://doi.org/10.1016/j.comtox.2017.03.002
  • 36. APECS Autoencoder Predicting Estrogenic Chemical Substances In Vivo Model Sensitivity: 97% Specificity: 80% Accuracy: 91% In Vitro Model Sensitivity: 100% Specificity: 75% Accuracy: 93% Burgoon, L.D. Computational Toxicology 2: 45-49. https://doi.org/10.1016/j.comtox.2017.03.002
  • 37. adverse outcome pathway bayesian networks. predict probability of adverse outcomes.
  • 38. Example Workflow (steroidogenesis) AOPXplorer Visualize results using AOPXplorer – a Cytoscape App that facilitates AOP-based data visualization
  • 40. Screenshot of BISCT following analysis of the ToxCast H295R prochloraz screening dataset (Karmaus, et al. (2016) ToxSci 150(2): 323-332).
  • 43. We fed these data into our AOPBN. Angrish, M.M., et al. (2017). Mechanistic Toxicity Tests Based on an Adverse Outcome Pathway Network for Hepatic Steatosis. Toxicol. Sci. 159, 159–169.
  • 44. We got these results
  • 45. Why I like AOPBNs • Causal networks • Use maths to identify the Minimally Sufficient Set of Key Events (MinSSKEs) • Minimal set of key events sufficient to infer an adverse outcome • Devise scenarios to measure the value of information associated with each key event and sets of key events • Devise test batteries that maximize value of information while minimizing resource costs
  • 46. Value of key event analysis
  • 48. What Should I Choose? Match your time constraints with what information you have Emergency Response? Military Intelligence? Site cleanup? Prioritization?
  • 49. Tools • I’m developing freely available, open source, ‘government off the shelf’ software for everything presented here • If you are interested in learning how to do this stuff on your own, chat me up
  • 50. Acknowledgements • Shannon Bell (ILS) • Ed Perkins (Army ERDC) • Natalia Vinas (Army ERDC) • Agnes Karmaus (ILS) • Michelle Angrish (EPA) • Ingrid Druwe (formerly ORISE, currently EPA) • Erin Yost (formerly ORISE, currently EPA) • Kyle Painter (formerly ORISE) • Supported by the US Army Environmental Quality and Installations Program
  • 51. Contact me for more! • Email: lyle.d.burgoon@usace.army.mil • Twitter: @DataSciBurgoon • ORCID: https://orcid.org/0000-0003-4977-5352
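
Slides 18, 26 and 27 describe a QSAR as "some mathematical function applied on x" whose output is a hazard call or a probability of hazard. As a minimal sketch of that idea (not the author's model), the function below uses a logistic function over a vector of chemical descriptors. The descriptor values, weights, bias and the 0.5 threshold are all hypothetical placeholders chosen for illustration; a real model would learn them from data.

```python
import math

def qsar_probability(descriptors, weights, bias):
    """f(x): map a descriptor vector x to a probability of hazard.

    A plain logistic function is assumed here purely for illustration.
    """
    z = bias + sum(w * x for w, x in zip(weights, descriptors))
    return 1.0 / (1.0 + math.exp(-z))

def qsar_hazard(descriptors, weights, bias, threshold=0.5):
    """f(x) = hazard (yes/no): threshold the predicted probability."""
    return qsar_probability(descriptors, weights, bias) >= threshold

# Hypothetical descriptors (e.g. two scaled physicochemical properties)
# and hypothetical weights; none of these values come from the slides.
x = [1.2, -0.4]
w = [0.8, 1.5]
b = -0.1

print(f"probability of hazard: {qsar_probability(x, w, b):.3f}")
print(f"hazard call: {'yes' if qsar_hazard(x, w, b) else 'no'}")
```

The same shape carries over to the continuous case on slide 18 (f(x) = LD50): drop the logistic squashing and the threshold, and the function returns a numeric estimate instead of a yes/no call.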
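
Slide 29 reports accuracy on a dataset with 33 ligands and 763 non-ligands, while slide 36 reports sensitivity and specificity alongside accuracy. The sketch below shows how the three numbers relate and why the class balance matters: on the slide 29 dataset, a classifier that labels every chemical "not ligand" would already score 763/796, about 95.9% accuracy, so accuracy by itself is hard to interpret when ligands are this rare. The class totals come from slide 29; the confusion-matrix counts are hypothetical and are not results from the presentation.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of true ligands that are caught
    specificity = tn / (tn + fp)   # fraction of non-ligands correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Class totals from slide 29.
ligands, non_ligands = 33, 763

# Accuracy of a trivial "always predict not-ligand" classifier.
baseline_accuracy = non_ligands / (ligands + non_ligands)
print(f"majority-class baseline accuracy: {baseline_accuracy:.1%}")

# Hypothetical counts (illustration only, not from the slides):
# 20 of 33 ligands found, 23 of 763 non-ligands flagged in error.
sens, spec, acc = confusion_metrics(tp=20, fn=13, tn=740, fp=23)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

In this hypothetical case accuracy looks high even though more than a third of the ligands are missed, which is why reporting sensitivity and specificity, as slide 36 does for APECS, gives a clearer picture.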