Web-based access to data for >600
disinfection by-products via the EPA
CompTox Chemicals Dashboard
Antony Williams1, Chris Grulke1 and Susan Richardson2
1Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, RTP, NC
2University of South Carolina, Department of Chemistry and Biochemistry, Columbia, SC 29208
August 2021
ACS Fall Meeting, Atlanta
http://www.orcid.org/0000-0002-2668-4821
The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
EPA’s CompTox Chemicals Dashboard
A publicly accessible website delivering:
- ~883,000 chemicals with related property data
- Experimental and predicted physicochemical property data
- Integration with “biological assay data” for thousands of chemicals
- Information on consumer products containing chemicals
- Links to other agency websites and public data resources
- “Literature” searches for chemicals using public resources
- “Batch searching” for thousands of chemicals
- Downloadable Open Data for reuse and repurposing
A single app integrating…
SEARCH
TOX DATA
BIOACTIVITY
SIMILARITY
READ-ACROSS
PUBMED
BATCH SEARCH
CompTox Chemicals Dashboard
https://comptox.epa.gov/dashboard
883k Chemical Substances
>906k in the next release
EPA Drinking Water Requirements
Chemical Lists on the Dashboard
>300 lists and growing
EPAHFR: Hydraulic Fracturing
The advantage of lists
• Pulls a relevant dataset of chemicals into a single list
• Download of the file provides relevant data to the user: structure, CASRN, names, InChI
• “Send to batch” provides access to all other data of interest: hazard, in vitro bioactivity, exposure, properties, relationship mappings to salts and much more
Disinfectants
• Where do you find the best list? You ask an expert who reviews the science!
• Extract, register and map the data between parents and by-products
Download File – SDF
Download File – Excel
Masses and Formulae support Mass Spectrometry
Mass & Formula Searching
Advanced Searches
Mass Search
Advanced Searches
Mass Search
MS-Ready Structures for Formula Search
“MS-Ready Structures”
https://doi.org/10.1186/s13321-018-0299-2
MS-Ready Mappings
MS-Ready Mappings Set
MS-Ready Mappings
• EXACT Formula: C10H16N2O8: 3 Hits
MS-Ready Mappings
• Same Input Formula: C10H16N2O8
• MS Ready Formula Search: 125 Chemicals
MS-Ready Mappings
• 125 chemicals returned in total
– 8 of the 125 are single component chemicals
– 3 of the 8 are isotope-labeled
– 3 are neutral compounds and 2 are charged
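The difference between the exact formula search and the MS-Ready formula search above can be sketched with a toy lookup table. This is only an illustration, not the Dashboard's implementation: the record list, the hand-assigned MS-Ready formulas and the `build_index` helper are all assumptions made for the example.

```python
from collections import defaultdict

# Toy records: (substance name, formula as registered, formula of its MS-Ready form).
# The MS-Ready formulas are assigned by hand for illustration; in practice they
# come from a structure-processing workflow (desalting, neutralizing, etc.).
RECORDS = [
    ("EDTA", "C10H16N2O8", "C10H16N2O8"),
    ("Edetate disodium", "C10H14N2Na2O8", "C10H16N2O8"),
    ("Edetate tetrasodium", "C10H12N2Na4O8", "C10H16N2O8"),
    ("Caffeine", "C8H10N4O2", "C8H10N4O2"),
]


def build_index(records, key_position):
    """Group substance names by the formula stored at key_position."""
    index = defaultdict(list)
    for record in records:
        index[record[key_position]].append(record[0])
    return index


exact_index = build_index(RECORDS, 1)      # keyed on the registered formula
ms_ready_index = build_index(RECORDS, 2)   # keyed on the MS-Ready formula

query = "C10H16N2O8"
exact_hits = exact_index.get(query, [])
ms_ready_hits = ms_ready_index.get(query, [])

print("Exact formula hits:", exact_hits)
print("MS-Ready formula hits:", ms_ready_hits)
```

In this toy table the exact search returns only the neutral parent, while the MS-Ready search also returns the salts mapped to it, which is why the MS-Ready hit count is so much larger than the exact count.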
Candidate ranking
Data Source Ranking of “known unknowns”
• The mass and/or formula belongs to a chemical that is unknown to the analyst but is contained within a reference database
• The most likely candidate chemicals have the most associated data sources, the most associated literature articles, or both
[Figure: the formula C14H22N2O3 and monoisotopic mass 266.16304 are searched against a chemical reference database, returning sorted candidate structures]
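To show how the formula and mass in this example relate, here is a minimal sketch that computes a monoisotopic mass from a molecular formula. The helper names `parse_formula` and `monoisotopic_mass` are hypothetical, the element table covers only C, H, N and O, and the parser assumes a simple formula without brackets or charges.

```python
import re

# Masses (u) of the most abundant isotope of each element; only the
# elements needed for this example are included.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503223,
    "N": 14.00307400443,
    "O": 15.99491461957,
}


def parse_formula(formula):
    """Return {element: count} for a simple formula such as 'C14H22N2O3'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC[s] * n for s, n in parse_formula(formula).items())


mass = monoisotopic_mass("C14H22N2O3")
print(round(mass, 5))
```

For C14H22N2O3 the computed value agrees with the 266.16304 shown on the slide to five decimal places.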
Is a bigger database better?
• ChemSpider contained 26 million chemicals at the time
• It is much BIGGER today
• Is bigger better?
Using Metadata for Ranking
• Use available metadata to rank candidates
– Associated data sources
• Associated lists in the underlying database
• Associated data sources in PubChem
• Specific types (e.g. water, surfactants, pesticides etc.)
– Number of associated literature articles (PubMed)
– Chemicals in the environment: the number of products/categories containing the chemical is a very important source of data
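The metadata-based ranking described above can be sketched as a simple sort. This is an assumption-laden illustration rather than the Dashboard's actual scoring: the `Candidate` class, the sort key (data sources first, PubMed article count as tie-breaker) and all the counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    data_sources: int      # number of associated data sources (hypothetical values)
    pubmed_articles: int   # number of associated literature articles (hypothetical values)


def rank_candidates(candidates):
    """Order candidates from most to least supported by metadata."""
    return sorted(
        candidates,
        key=lambda c: (c.data_sources, c.pubmed_articles),
        reverse=True,
    )


# Three made-up candidates sharing one formula or mass.
candidates = [
    Candidate("candidate A", data_sources=3, pubmed_articles=0),
    Candidate("candidate B", data_sources=41, pubmed_articles=120),
    Candidate("candidate C", data_sources=41, pubmed_articles=15),
]

ranked = rank_candidates(candidates)
for position, candidate in enumerate(ranked, start=1):
    print(position, candidate.name, candidate.data_sources, candidate.pubmed_articles)
```

Other metadata named on the slide, such as product or category counts, could be added as further fields in the sort key.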
Comparing Search Performance
• Dashboard content was 720k chemicals
• Only about 3% of the size of ChemSpider
• What was the comparison in performance?
SAME dataset for comparison
How did performance compare?
For the same 162 chemicals, the Dashboard outperforms ChemSpider
How did performance compare?
Batch Searching
• Singleton searches are useful but we work with thousands of masses and formulae!
• Typical questions
– What is the list of chemicals for the formula CxHyOz?
– What is the list of chemicals for a mass +/- error?
– Can I get chemical lists in Excel files? In SDF files?
– Can I include properties in the download file?
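The "mass +/- error" question above can be sketched as a batch lookup with a parts-per-million tolerance. The reference table, the compound labels, the 5 ppm default and the function names are assumptions for illustration and do not describe the Dashboard's batch search interface.

```python
def ppm_window(mass, ppm):
    """Return the (low, high) mass range for a tolerance given in ppm."""
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta


# Hypothetical reference table of (label, monoisotopic mass) pairs.
REFERENCE = [
    ("EDTA", 292.09067),
    ("toy compound X", 266.16304),
    ("toy compound Y", 266.16510),
]


def batch_mass_search(query_masses, reference, ppm=5.0):
    """Map each query mass to the reference entries inside its ppm window."""
    results = {}
    for query in query_masses:
        low, high = ppm_window(query, ppm)
        results[query] = [name for name, mass in reference if low <= mass <= high]
    return results


hits = batch_mass_search([266.1630, 292.0907, 150.0], REFERENCE, ppm=5.0)
for query, names in hits.items():
    print(query, names)
```

A relative (ppm) window is used here because mass accuracy is usually quoted that way; an absolute window in Da would work the same way with a fixed `delta`.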
Batch Searching Formula/Mass
Batch searching by MS-Ready formula (or mass)
In Progress
Work in Progress
• Predicted Spectra for candidate ranking
– Viewing and Downloading pre-predicted spectra
– Search spectra against the database
Predicted Mass Spectra
http://cfmid.wishartlab.com/
• MS/MS spectra prediction for ESI+, ESI-, and EI
• Predictions generated and stored for >800,000 structures, to be accessible via the Dashboard
Search Experimental vs. Predicted Spectra
• Predictions generated and stored for >700,000 structures
• Python code to score experimental vs predicted spectra
• Cosine dot product match score calculation
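A cosine (normalized dot product) match score between an experimental and a predicted spectrum might look like the sketch below. It is not the authors' code: the fixed-width m/z binning, the `bin_width` value, the absence of intensity weighting and the peak lists are all assumptions chosen to keep the example small.

```python
import math


def bin_spectrum(peaks, bin_width=0.01):
    """Sum peak intensities into fixed-width m/z bins keyed by bin number."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_score(experimental, predicted, bin_width=0.01):
    """Cosine similarity of two binned spectra; 0.0 if either is empty."""
    a = bin_spectrum(experimental, bin_width)
    b = bin_spectrum(predicted, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up (m/z, intensity) peak lists for illustration only.
experimental = [(72.08, 40.0), (116.11, 100.0), (145.06, 25.0)]
predicted = [(72.08, 35.0), (116.11, 100.0), (190.09, 10.0)]

score = cosine_score(experimental, predicted)
identical = cosine_score(experimental, experimental)
print(round(score, 3), round(identical, 3))
```

Fixed bins can split two nearly equal m/z values across a bin boundary, so a tolerance-based peak matcher may be preferable for real data.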
August 26, 2019
Nontargeted screening of wastewater for water reuse using mass spectrometry Current Advances in Water
Analysis
CFM-ID Predicted Library Available
Prototype Development
Structure/substructure search
Conclusion
• Dashboard access to data for ~883,000 chemicals
• MS-Ready data facilitates structure identification
• Related metadata facilitates candidate ranking
• Relationship mappings and chemical lists are of great utility
• The Dashboard and its contents are one part of the solution
• New API and Web Services are in development
Acknowledgements
• CCTE IT development team
• All scientists within CCTE who provide data and feedback on the Dashboard
• Ann Richard and the CCTE curation team
Contact
Antony Williams
US EPA Office of Research and Development
National Center for Computational Toxicology
EMAIL: Williams.Antony@epa.gov
ORCID: https://orcid.org/0000-0002-2668-4821

More Related Content

What's hot

Chemical identification of unknowns in high resolution mass spectrometry usin...
Chemical identification of unknowns in high resolution mass spectrometry usin...Chemical identification of unknowns in high resolution mass spectrometry usin...
Chemical identification of unknowns in high resolution mass spectrometry usin...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Accessing information for chemicals in hydraulic fracturing fluids using the ...
Accessing information for chemicals in hydraulic fracturing fluids using the ...Accessing information for chemicals in hydraulic fracturing fluids using the ...
Accessing information for chemicals in hydraulic fracturing fluids using the ...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
US EPA CompTox Chemistry Dashboard as a source of data to fill data gaps for ...
US EPA CompTox Chemistry Dashboard as a source of data to fill data gaps for ...US EPA CompTox Chemistry Dashboard as a source of data to fill data gaps for ...
US EPA CompTox Chemistry Dashboard as a source of data to fill data gaps for ...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...
Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...
Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
New developments in delivering public access to data from the National Center...
New developments in delivering public access to data from the National Center...New developments in delivering public access to data from the National Center...
New developments in delivering public access to data from the National Center...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted AnalysisThe US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Environmental Chemistry Compound Identification Using High Resolution Mass Sp...
Environmental Chemistry Compound Identification Using High Resolution Mass Sp...Environmental Chemistry Compound Identification Using High Resolution Mass Sp...
Environmental Chemistry Compound Identification Using High Resolution Mass Sp...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Development of a Tool for Systematic Integration of Traditional and New Appro...
Development of a Tool for Systematic Integration of Traditional and New Appro...Development of a Tool for Systematic Integration of Traditional and New Appro...
Development of a Tool for Systematic Integration of Traditional and New Appro...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Delivering The Benefits of Chemical-Biological Integration in Computational T...
Delivering The Benefits of Chemical-Biological Integration in Computational T...Delivering The Benefits of Chemical-Biological Integration in Computational T...
Delivering The Benefits of Chemical-Biological Integration in Computational T...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Open PHACTS Chemistry Platform Update and Learnings
Open PHACTS Chemistry Platform Update and Learnings Open PHACTS Chemistry Platform Update and Learnings
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
US-EPA CompTox Chemicals Dashboard – integrating chemistry and biology data t...
US-EPA CompTox Chemicals Dashboard – integrating chemistry and biology data t...US-EPA CompTox Chemicals Dashboard – integrating chemistry and biology data t...
US-EPA CompTox Chemicals Dashboard – integrating chemistry and biology data t...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Accessing information for Per- & Polyfluoroalkyl Substances using the US EPA ...
Accessing information for Per- & Polyfluoroalkyl Substances using the US EPA ...Accessing information for Per- & Polyfluoroalkyl Substances using the US EPA ...
Accessing information for Per- & Polyfluoroalkyl Substances using the US EPA ...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Adding complex expert knowledge into chemical database and transforming surfa...
Adding complex expert knowledge into chemical database and transforming surfa...Adding complex expert knowledge into chemical database and transforming surfa...
Adding complex expert knowledge into chemical database and transforming surfa...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 

What's hot (20)

Chemical identification of unknowns in high resolution mass spectrometry usin...
Chemical identification of unknowns in high resolution mass spectrometry usin...Chemical identification of unknowns in high resolution mass spectrometry usin...
Chemical identification of unknowns in high resolution mass spectrometry usin...
 
Accessing information for chemicals in hydraulic fracturing fluids using the ...
Accessing information for chemicals in hydraulic fracturing fluids using the ...Accessing information for chemicals in hydraulic fracturing fluids using the ...
Accessing information for chemicals in hydraulic fracturing fluids using the ...
 
US EPA CompTox Chemistry Dashboard as a source of data to fill data gaps for ...
US EPA CompTox Chemistry Dashboard as a source of data to fill data gaps for ...US EPA CompTox Chemistry Dashboard as a source of data to fill data gaps for ...
US EPA CompTox Chemistry Dashboard as a source of data to fill data gaps for ...
 
Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...
Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...
Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...
 
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
 
New developments in delivering public access to data from the National Center...
New developments in delivering public access to data from the National Center...New developments in delivering public access to data from the National Center...
New developments in delivering public access to data from the National Center...
 
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted AnalysisThe US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
 
Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...
 
Environmental Chemistry Compound Identification Using High Resolution Mass Sp...
Environmental Chemistry Compound Identification Using High Resolution Mass Sp...Environmental Chemistry Compound Identification Using High Resolution Mass Sp...
Environmental Chemistry Compound Identification Using High Resolution Mass Sp...
 
Development of a Tool for Systematic Integration of Traditional and New Appro...
Development of a Tool for Systematic Integration of Traditional and New Appro...Development of a Tool for Systematic Integration of Traditional and New Appro...
Development of a Tool for Systematic Integration of Traditional and New Appro...
 
Delivering The Benefits of Chemical-Biological Integration in Computational T...
Delivering The Benefits of Chemical-Biological Integration in Computational T...Delivering The Benefits of Chemical-Biological Integration in Computational T...
Delivering The Benefits of Chemical-Biological Integration in Computational T...
 
Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...
 
Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...
 
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
 
Open PHACTS Chemistry Platform Update and Learnings
Open PHACTS Chemistry Platform Update and Learnings Open PHACTS Chemistry Platform Update and Learnings
Open PHACTS Chemistry Platform Update and Learnings
 
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
 
Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...
 
US-EPA CompTox Chemicals Dashboard – integrating chemistry and biology data t...
US-EPA CompTox Chemicals Dashboard – integrating chemistry and biology data t...US-EPA CompTox Chemicals Dashboard – integrating chemistry and biology data t...
US-EPA CompTox Chemicals Dashboard – integrating chemistry and biology data t...
 
Accessing information for Per- & Polyfluoroalkyl Substances using the US EPA ...
Accessing information for Per- & Polyfluoroalkyl Substances using the US EPA ...Accessing information for Per- & Polyfluoroalkyl Substances using the US EPA ...
Accessing information for Per- & Polyfluoroalkyl Substances using the US EPA ...
 
Adding complex expert knowledge into chemical database and transforming surfa...
Adding complex expert knowledge into chemical database and transforming surfa...Adding complex expert knowledge into chemical database and transforming surfa...
Adding complex expert knowledge into chemical database and transforming surfa...
 

Similar to Web-based access to data for >600 disinfection by-products via the EPA CompTox Chemicals Dashboard

Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
US EPA CompTox Chemicals Dashboard Data Integration Hub to Support Environmen...
US EPA CompTox Chemicals Dashboard Data Integration Hub to Support Environmen...US EPA CompTox Chemicals Dashboard Data Integration Hub to Support Environmen...
US EPA CompTox Chemicals Dashboard Data Integration Hub to Support Environmen...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
The EPA Comptox Chemicals Dashboard as a Data Integration Hub for Environment...
The EPA Comptox Chemicals Dashboard as a Data Integration Hub for Environment...The EPA Comptox Chemicals Dashboard as a Data Integration Hub for Environment...
The EPA Comptox Chemicals Dashboard as a Data Integration Hub for Environment...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
US-EPA Chemicals Dashboard and Applications to Digital Design of Molecules
US-EPA Chemicals Dashboard and Applications to Digital Design  of MoleculesUS-EPA Chemicals Dashboard and Applications to Digital Design  of Molecules
US-EPA Chemicals Dashboard and Applications to Digital Design of Molecules
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Delivering web-based access to data and algorithms to support computational t...
Delivering web-based access to data and algorithms to support computational t...Delivering web-based access to data and algorithms to support computational t...
Delivering web-based access to data and algorithms to support computational t...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
EPA CompTox Chemicals Dashboard as a Data Integration Hub for Environmental C...
EPA CompTox Chemicals Dashboard as a Data Integration Hub for Environmental C...EPA CompTox Chemicals Dashboard as a Data Integration Hub for Environmental C...
EPA CompTox Chemicals Dashboard as a Data Integration Hub for Environmental C...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental scienceUS-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Accessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data Dashboards Accessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data Dashboards
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Cheminformatics approaches to support chemical identification delivered via t...
Cheminformatics approaches to support chemical identification delivered via t...Cheminformatics approaches to support chemical identification delivered via t...
Cheminformatics approaches to support chemical identification delivered via t...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Integrating Mass Spectrometry Non-Targeted Analysis and Computational Toxico...
Integrating Mass Spectrometry  Non-Targeted Analysis and Computational Toxico...Integrating Mass Spectrometry  Non-Targeted Analysis and Computational Toxico...
Integrating Mass Spectrometry Non-Targeted Analysis and Computational Toxico...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Accessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data DashboardsAccessing Environmental Chemistry Data via Data Dashboards
Applications of the US EPA’s CompTox chemicals dashboard to support structure...
Applications of the US EPA’s CompTox chemicals dashboard to support structure...Applications of the US EPA’s CompTox chemicals dashboard to support structure...
Applications of the US EPA’s CompTox chemicals dashboard to support structure...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
CompTox Chemicals Dashboard: Data and tools to support chemical and environme...
CompTox Chemicals Dashboard: Data and tools to support chemical and environme...CompTox Chemicals Dashboard: Data and tools to support chemical and environme...
CompTox Chemicals Dashboard: Data and tools to support chemical and environme...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
EPA’s CompTox Chemicals Dashboard, a tool with information on ~900,000 chemicals
EPA’s CompTox Chemicals Dashboard, a tool with information on ~900,000 chemicalsEPA’s CompTox Chemicals Dashboard, a tool with information on ~900,000 chemicals
EPA’s CompTox Chemicals Dashboard, a tool with information on ~900,000 chemicals
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
US-EPA CompTox Chemicals Dashboard providing access to experimental and predi...
US-EPA CompTox Chemicals Dashboard providing access to experimental and predi...US-EPA CompTox Chemicals Dashboard providing access to experimental and predi...
US-EPA CompTox Chemicals Dashboard providing access to experimental and predi...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Delivering chemical-associated data via EPA web applications
Delivering chemical-associated data via EPA web applicationsDelivering chemical-associated data via EPA web applications
Delivering chemical-associated data via EPA web applications
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Cheminformatics Support for MS Supporting Exposomics
Cheminformatics Support for MS Supporting ExposomicsCheminformatics Support for MS Supporting Exposomics
EPA CompTox chemicals dashboard: An online resource for environmental chemists
EPA CompTox chemicals dashboard: An online resource for environmental chemists EPA CompTox chemicals dashboard: An online resource for environmental chemists
EPA CompTox chemicals dashboard: An online resource for environmental chemists
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...
The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...
The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...
Andrew McEachran
 

Similar to Web-based access to data for >600 disinfection by-products via the EPA CompTox Chemicals Dashboard (20)

Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
 
US EPA CompTox Chemicals Dashboard Data Integration Hub to Support Environmen...
US EPA CompTox Chemicals Dashboard Data Integration Hub to Support Environmen...US EPA CompTox Chemicals Dashboard Data Integration Hub to Support Environmen...
US EPA CompTox Chemicals Dashboard Data Integration Hub to Support Environmen...
 
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
 
The EPA Comptox Chemicals Dashboard as a Data Integration Hub for Environment...
The EPA Comptox Chemicals Dashboard as a Data Integration Hub for Environment...The EPA Comptox Chemicals Dashboard as a Data Integration Hub for Environment...
The EPA Comptox Chemicals Dashboard as a Data Integration Hub for Environment...
 
US-EPA Chemicals Dashboard and Applications to Digital Design of Molecules
US-EPA Chemicals Dashboard and Applications to Digital Design  of MoleculesUS-EPA Chemicals Dashboard and Applications to Digital Design  of Molecules
US-EPA Chemicals Dashboard and Applications to Digital Design of Molecules
 
Delivering web-based access to data and algorithms to support computational t...
Delivering web-based access to data and algorithms to support computational t...Delivering web-based access to data and algorithms to support computational t...
Delivering web-based access to data and algorithms to support computational t...
 
EPA CompTox Chemicals Dashboard as a Data Integration Hub for Environmental C...
EPA CompTox Chemicals Dashboard as a Data Integration Hub for Environmental C...EPA CompTox Chemicals Dashboard as a Data Integration Hub for Environmental C...
EPA CompTox Chemicals Dashboard as a Data Integration Hub for Environmental C...
 


Web-based access to data for >600 disinfection by-products via the EPA CompTox Chemicals Dashboard

  • 1. Web-based access to data for >600 disinfection by-products via the EPA CompTox Chemicals Dashboard. Antony Williams (1), Chris Grulke (1) and Susan Richardson (2). (1) Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, RTP, NC; (2) University of South Carolina, Department of Chemistry and Biochemistry, Columbia, SC 29208. August 2021, ACS Fall Meeting, Atlanta. ORCID: http://www.orcid.org/0000-0002-2668-4821 The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
  • 2. EPA’s CompTox Chemicals Dashboard. A publicly accessible website delivering: - ~883,000 chemicals with related property data - Experimental and predicted physicochemical property data - Integration with “biological assay data” for thousands of chemicals - Information regarding consumer products containing chemicals - Links to other agency websites and public data resources - “Literature” searches for chemicals using public resources - “Batch searching” for thousands of chemicals - Downloadable Open Data for reuse and repurposing
  • 3. A single app integrating… SEARCH, TOX DATA, BIOACTIVITY, SIMILARITY, READ-ACROSS, PUBMED, BATCH SEARCH
  • 4. CompTox Chemicals Dashboard https://comptox.epa.gov/dashboard 883k Chemical Substances (>906k in the next release)
  • 5. EPA Drinking Water Requirements
  • 6. Chemical Lists on the Dashboard: >300 lists and growing
  • 8. The advantage of lists • Pulls a relevant dataset of chemicals into a single list • Download of the file provides relevant data to user: structure, CASRN, Names, InChI • “Send to batch” provides access to all other data of interest – hazard, in vitro bioactivity, exposure, properties, relationship mappings to salts and so much more
  • 9. Disinfectants • Where do you find the best list? You ask an expert who reviews the science! • Extract, register and map the data between parents and by-products
  • 11. Download File – Excel • Masses and Formulae support Mass Spectrometry
  • 20. MS-Ready Mappings • EXACT Formula: C10H16N2O8: 3 Hits
  • 21. MS-Ready Mappings • Same Input Formula: C10H16N2O8 • MS Ready Formula Search: 125 Chemicals
  • 22. MS-Ready Mappings • 125 chemicals returned in total – 8 of the 125 are single component chemicals – 3 of the 8 are isotope-labeled – 3 are neutral compounds and 2 are charged
  • 24. Data Source Ranking of “known unknowns” • Mass and/or formula is for an unknown chemical but contained within a reference database • Most likely candidate chemicals have the most associated data sources, most associated literature articles or both (Figure labels: C14H22N2O3; 266.16304; Chemical Reference Database; Sorted candidate structures)
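
The mass quoted on the slide above can be checked against its formula. The following is a minimal sketch, assuming a small hand-entered table of monoisotopic masses for C, H, N and O only; the function names `parse_formula` and `monoisotopic_mass` are illustrative and are not part of the Dashboard.

```python
import re

# Monoisotopic masses (in u) of the most abundant isotopes.
# Illustrative table covering only the elements needed here.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
}


def parse_formula(formula):
    """Split a simple formula such as 'C14H22N2O3' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


mass = monoisotopic_mass("C14H22N2O3")
print(f"{mass:.5f}")  # 266.16304
```

The parser handles only flat formulae without brackets, charges or isotope labels, which is enough to reproduce the value on the slide.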
  • 25. Is a bigger database better? • ChemSpider was 26 million chemicals then • Much BIGGER today • Is bigger better??
  • 26. Using Metadata for Ranking • Use available metadata to rank candidates – Associated data sources • Associated lists in the underlying database • Associated data sources in PubChem • Specific types (e.g. water, surfactants, pesticides etc.) – Number of associated literature articles (PubMed) – Chemicals in the environment – the number of products/categories containing the chemical is a very important source of data
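
The ranking idea on the two slides above (more associated data sources and more literature articles suggest a more likely candidate) can be sketched as a simple sort. This is an illustration under stated assumptions, not the Dashboard's actual ranking: the `Candidate` fields, the tie-breaking order and all counts below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    data_sources: int      # hypothetical count of associated data sources
    pubmed_articles: int   # hypothetical count of associated PubMed articles


def rank_candidates(candidates):
    """Order candidates by data-source count, then by literature count."""
    return sorted(
        candidates,
        key=lambda c: (c.data_sources, c.pubmed_articles),
        reverse=True,
    )


# Made-up candidates that share one formula or mass.
candidates = [
    Candidate("candidate A", data_sources=3, pubmed_articles=0),
    Candidate("candidate B", data_sources=41, pubmed_articles=120),
    Candidate("candidate C", data_sources=41, pubmed_articles=15),
]

ranked = rank_candidates(candidates)
print([c.name for c in ranked])
```

Other metadata named on the slide, such as product or category counts, could be added as further elements of the sort key.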
  • 27. Comparing Search Performance • Dashboard content was 720k chemicals • Only 3% of ChemSpider size • What was the comparison in performance?
  • 28. SAME dataset for comparison
  • 29. How did performance compare? For the same 162 chemicals, Dashboard outperforms ChemSpider
  • 30. How did performance compare?
  • 31. Batch Searching • Singleton searches are useful but we work with thousands of masses and formulae! • Typical questions – What is the list of chemicals for the formula CxHyOz? – What is the list of chemicals for a mass +/- error? – Can I get chemical lists in Excel files? In SDF files? – Can I include properties in the download file?
  • 33. Searching batches using MS-Ready Formula (or mass) searching
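
The "mass +/- error" question from the batch-searching slides can be illustrated with a small local lookup. This is a sketch only: it assumes the error is given in ppm, it searches an in-memory list rather than calling the Dashboard, and the compound names and masses in `reference` are made up.

```python
from bisect import bisect_left, bisect_right


def mass_window(mass, ppm):
    """Return the (low, high) mass bounds for a given ppm tolerance."""
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta


def batch_mass_search(query_masses, reference, ppm=5.0):
    """Find reference entries within the ppm window of each query mass.

    reference is a list of (name, monoisotopic_mass) tuples.
    """
    ordered = sorted(reference, key=lambda item: item[1])
    masses = [m for _, m in ordered]
    hits = {}
    for query in query_masses:
        low, high = mass_window(query, ppm)
        start = bisect_left(masses, low)
        end = bisect_right(masses, high)
        hits[query] = [name for name, _ in ordered[start:end]]
    return hits


# Hypothetical reference table.
reference = [
    ("compound X", 266.16304),
    ("compound Y", 266.17000),
    ("compound Z", 300.10000),
]

results = batch_mass_search([266.1631, 300.2], reference, ppm=5.0)
print(results)
```

Sorting the reference masses once and bisecting per query keeps each lookup logarithmic, which matters when thousands of masses are searched in one batch.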
  • 35. Work in Progress • Predicted Spectra for candidate ranking – Viewing and Downloading pre-predicted spectra – Search spectra against the database
  • 36. Predicted Mass Spectra http://cfmid.wishartlab.com/ • MS/MS spectra prediction for ESI+, ESI-, and EI • Predictions generated and stored for >800,000 structures, to be accessible via Dashboard
  • 37. Search Expt. vs. Predicted Spectra
  • 38. CFM-ID Predicted Library Available • Predictions generated and stored for >700,000 structures • Python code to score experimental vs predicted spectra • Cosine dot product match score calculation (Slide footer: August 26, 2019, Nontargeted screening of wastewater for water reuse using mass spectrometry, Current Advances in Water Analysis)
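
The slide above mentions a cosine dot product match score between experimental and predicted spectra but does not show the code. The following is a generic sketch of such a score, not the authors' implementation: it assumes spectra are lists of (m/z, intensity) pairs and that peaks are matched by binning m/z at a fixed width; `bin_width` and the example peaks are hypothetical.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=0.01):
    """Sum peak intensities into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return binned


def cosine_score(experimental, predicted, bin_width=0.01):
    """Cosine similarity (normalized dot product) of two binned spectra."""
    a = bin_spectrum(experimental, bin_width)
    b = bin_spectrum(predicted, bin_width)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up peak lists: one shared peak and one unmatched peak in each spectrum.
experimental = [(91.05, 100.0), (119.08, 40.0)]
predicted = [(91.05, 80.0), (150.00, 20.0)]

score = cosine_score(experimental, predicted)
print(f"{score:.3f}")
```

A score of 1 means the binned spectra are proportional and 0 means no bins are shared. Practical scoring schemes often also weight intensities or m/z values, which this sketch omits.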
  • 40. Conclusion • Dashboard access to data for ~875,000 chemicals • MS-Ready data facilitates structure identification • Related metadata facilitates candidate ranking • Relationship mappings and chemical lists of great utility • Dashboard and contents are one part of the solution • New API and Web Services are in development
  • 41. Acknowledgements • CCTE IT development team • All scientists within CCTE that provide data and feedback on the Dashboard • Ann Richard and the CCTE curation team
  • 42. Contact: Antony Williams, US EPA Office of Research and Development, National Center for Computational Toxicology. EMAIL: Williams.Antony@epa.gov ORCID: https://orcid.org/0000-0002-2668-4821

Editor's Notes

  1. Clarify it’s a mockup in progress