US-EPA Cheminformatics Support for
Delivering Data Related to Chemicals
of Emerging Concern
Antony Williams
Center for Computational Toxicology and Exposure, US-EPA, RTP, NC
The views expressed in this presentation are those of the author
and do not necessarily reflect the views or policies of the U.S. EPA
The role of cheminformatics at EPA
• I am from the EPA Center for Computational Toxicology and Exposure
• We develop lots of prediction models and web-based applications
• Today’s presentation: how our efforts support data dissemination regarding chemicals of emerging concern and mass spectrometry non-targeted analysis (MS-NTA)
[Figure: Chemical Monitoring Needs (Hazard Identification, Dose-Response Assessment, Exposure Assessment, Risk Characterization)]
2
Free-Access Cheminformatics Tools
• The Center for Computational Toxicology and Exposure provides many tools
• CompTox Chemicals Dashboard
• Proof-of-Concept cheminformatics modules
• Chemicals Hazard Profiling
• Chemical Transformations database (ChET)
• Analytical Methods and Open Spectra database (AMOS)
• All chemicals are stored/curated in DSSTox
3
DSSTox Database
4
Accessing DSSTox chemistry:
CompTox Chemicals Dashboard
• A publicly accessible website delivering:
• 1.2M chemicals with related property data
• Related substances: transformation products, monomers/polymers
• Experimental/predicted physicochemical property data
• Experimental Human and Ecological hazard data
• Integration with “biological assay data”
• Information regarding chemicals in consumer products
• Links to other agency websites and public data resources
• “Batch searching” for tens to thousands of chemicals
5
CompTox Chemicals Dashboard
https://comptox.epa.gov/dashboard
6
1 of ~1.2M Chemical Pages
7
Physicochemical Properties
8
Experimental Data
9
• Experimental data are harvested from public domain databases and journal articles
• Data link back to provenance
• Data are used to build QSAR models for real-time predictions
• Data are available for download and reuse
What is PFOS Called?
Synonyms, CASRNs and more
10
Substance Relationship Mappings
• Similar compounds: based on structure “fingerprints”
11
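A minimal sketch of how fingerprint-based similarity ranking can work. The slide does not say which fingerprint type or similarity measure the Dashboard uses; the Tanimoto coefficient below is simply a common choice and is an assumption here, as are the toy bit sets and the names `tanimoto` and `rank_similar`.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprints,
    each given as the set of indices of its 'on' bits."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def rank_similar(query: set, library: dict, threshold: float = 0.5) -> list:
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Toy fingerprints (illustrative only, not real chemical data).
query = {1, 4, 7, 9}
library = {
    "analog_a": {1, 4, 7, 8},
    "analog_b": {2, 3},
    "analog_c": {1, 4, 7, 9, 12},
}
print(rank_similar(query, library))
```

In this toy case analog_c ranks first and analog_b falls below the threshold; a real search would run the same comparison over fingerprints computed from curated structures.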
Relationships in the data
12
• Structure mappings: between parent and salts, multicomponent chemicals, isotopomers
• Related substances: monomer to polymer, parent to transformation products
Batch Searching is a big enabler
https://pubs.acs.org/doi/10.1021/acs.jcim.0c01273
13
Batch Searching
• Singleton searches are useful but people work with groups of chemicals
• Typical questions
• Find me all data based on the input of 1000 CASRNs, or 1000 names
• What are the physicochemical properties for a set of identifiers?
• What is the list of chemicals for the formula CxHyOz?
• What is the list of chemicals for a mass +/- error?
• Can I get chemical lists in Excel files? In SDF files?
• Can I include properties in the download file?
14
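The "mass +/- error" question can be sketched as a tolerance window around a neutral monoisotopic mass. This is an illustration of the arithmetic only, not the Dashboard's implementation; the function names, the small element table and the two-entry candidate list are assumptions made for the example.

```python
import re

# Monoisotopic masses (most abundant isotope, in Da) for a few elements.
MONOISOTOPIC = {
    "C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462,
    "F": 18.99840316, "S": 31.97207117, "Cl": 34.96885268,
}


def monoisotopic_mass(formula: str) -> float:
    """Sum element masses for a simple formula such as 'C8HF17O3S'
    (no brackets, charges or isotope labels)."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def mass_window(target: float, ppm: float) -> tuple:
    """Lower and upper mass bounds for a tolerance given in ppm."""
    delta = target * ppm / 1e6
    return target - delta, target + delta


def match_by_mass(target: float, ppm: float, candidates: dict) -> list:
    """Names of candidates whose monoisotopic mass falls inside the window."""
    low, high = mass_window(target, ppm)
    return [name for name, formula in candidates.items()
            if low <= monoisotopic_mass(formula) <= high]


# Two well-known PFAS formulas stand in for a full chemical list.
candidates = {"PFOS": "C8HF17O3S", "PFOA": "C8HF15O2"}
print(match_by_mass(499.9375, 5, candidates))
```

Note that the query here is a neutral mass; an observed m/z from an instrument would first need an adduct correction, which this sketch leaves out.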
Batch Search
• All data can be downloaded into Excel files, CSV files or SDF files and reused
• All data are Open
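As one way to reuse a downloaded file, the sketch below reads CSV text and separates inputs that were matched from those that were not. The column names (`INPUT`, `FOUND_BY`, `DTXSID`, `PREFERRED_NAME`) and every value in `EXPORT` are placeholders assumed for illustration; the headers of a real download should be checked before adapting this.

```python
import csv
import io

# Stand-in for the text of a downloaded CSV file.
# Column names and values are illustrative placeholders, not real data.
EXPORT = """INPUT,FOUND_BY,DTXSID,PREFERRED_NAME
name-1,Synonym,ID-0001,Chemical A
name-2,,,
name-3,CASRN,ID-0003,Chemical C
"""


def split_hits(csv_text: str, id_column: str = "DTXSID"):
    """Return (rows with an identifier, inputs that had no match)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    found = [row for row in rows if row[id_column]]
    missing = [row["INPUT"] for row in rows if not row[id_column]]
    return found, missing


found, missing = split_hits(EXPORT)
print(len(found), "matched;", "unmatched inputs:", missing)
```

Keeping the unmatched inputs in a separate list makes it easy to review them by hand or retry them with a different identifier type.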
Chemical Lists
https://comptox.epa.gov/dashboard/chemical_lists
• Chemical lists are focused on regulations, research efforts and categories
• 425 lists and growing
• TSCA Inventory
• Clean Water Act Hazardous Substances
• Consumer Products database
• Chemicals of Emerging Concern
• PFAS lists
• Extractables and Leachables
• Lists are versioned and updated, and new lists are added regularly
18
Extractables
19
Tire Crumb Rubber
20
Hydraulic Fracturing
21
Disinfection By-Products
22
PFAS Lists of Chemicals (51/426)
23
Consumer Products Database
24
Applications at the EPA
• We have ongoing efforts applying NTA to multiple challenges, including:
• PFAS identification
• Pesticides in various matrices
• CECs in water
• Biosolids
• Examples include…
25
Example 1: Consumer Product Analysis
26
Example 1: Consumer Product Analysis
27
• Many chemicals observed in consumer product extracts
• Many observed chemicals known to be in consumer products
• More observed chemicals not known to be in consumer products
• Why might the ‘other’ chemicals be in the products?
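The known-versus-other split described on this slide amounts to a set comparison between the chemicals observed in an extract and a reference list of chemicals already associated with products. The identifiers below are made-up placeholders and `partition_observed` is a hypothetical helper, so this is only a sketch of the bookkeeping.

```python
# Placeholder identifiers; a real comparison would use curated substance IDs.
observed = {"chem-01", "chem-02", "chem-03", "chem-04", "chem-05"}
known_in_products = {"chem-02", "chem-04", "chem-09"}


def partition_observed(observed: set, reference: set) -> dict:
    """Split observed chemicals into those on the reference list
    and the 'other' chemicals that are not."""
    return {
        "known": sorted(observed & reference),
        "other": sorted(observed - reference),
    }


result = partition_observed(observed, known_in_products)
print(len(result["known"]), "known;", len(result["other"]), "other")
```

The "other" group is the one that prompts the follow-up question of why those chemicals are present.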
28
Example 2: Recycled Product Analysis
Example 2: Recycled Product Analysis
29
• Significant differences between chemicals in recycled vs. virgin products for certain product & use categories
• Most differences observed in paper products and construction materials
• Some uses (e.g., fragrances) highly represented across all product/use categories
Example 3: Placental Tissue Analysis
30
Lots of “proof-of-concept” tools in development
• PoCs are research software builds used to prove approaches before moving them into production software environments
• Assemble data, develop data model(s), test user interface approaches, and work with a test user base to gather feedback
• Since PoCs are internal-access only, data refreshes and application updates can be more frequent
31
32
Cheminformatics PoC Modules
https://www.epa.gov/chemical-research/cheminformatics
Easy Export of all data to Excel
33
AMOS: Analytical Methods and Spectra Database
• Three types of data in the database:
• Methods (regulatory, lab manuals and SOPs, publications, tech notes)
• Spectra (from public domain and our own laboratories)
• Fact Sheets (harvested from SWGDRUG and other sites)
• Currently contains >210,000 spectra, >700,000 external links, 4,000 “Fact Sheets” and ~4,000 methods
• All data types are currently growing in number weekly
34
Embedded Method PDFs
35
Literature articles, SOPs, Protocols
36
Linking to actual spectra
37
Linking to actual spectra
38
• We are doing a lot of chemical curation as we build the database
Why not just Regulatory Methods?
39
Why not just Regulatory Methods?
Because we need methods faster
40
Full presentation
https://t.ly/4MxFe
41
Our Data via services
https://api-ccte.epa.gov/docs/
42
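A hedged sketch of preparing a request against the services host shown above. Only the host name comes from the slide; the resource path is a hypothetical placeholder, and the `x-api-key` header name and the need for a key are assumptions to confirm against the linked documentation. The request is built but not sent.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api-ccte.epa.gov"  # host taken from the docs URL on the slide


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a service path.
    The header name 'x-api-key' is an assumption, not confirmed by the slide."""
    url = urllib.parse.urljoin(BASE_URL + "/", path.lstrip("/"))
    headers = {"x-api-key": api_key, "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


# '/hypothetical/resource/' is a placeholder path, not a documented endpoint.
req = build_request(
    "/hypothetical/resource/" + urllib.parse.quote("PFOS"),
    api_key="YOUR-KEY",
)
print(req.full_url)
# To send it: urllib.request.urlopen(req, timeout=30)
```

Separating request construction from sending keeps the example runnable offline and makes it easy to swap in a real path once it has been checked in the documentation.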
Conclusions
• Our data resources underpin our research efforts – data quality is key
• Our web-based applications deliver our data to the community
• Our support for identifying chemicals of emerging concern is multi-fold
• Curated chemistry data streams
• Non-targeted analysis tool development and cheminformatics support
• NTA WebApp in development uses all data streams to support analysis
43
Acknowledgements and Contact Information
• The work presented here represents an enormous team of contributors
• Chemical curators
• Software developers and contractors
• Postdocs, SMEs and PIs
• Contact info: williams.antony@epa.gov
• Slides will be available at: https://www.slideshare.net/AntonyWilliams/
44
