Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

US-EPA CompTox Chemicals Dashboard providing access to experimental and predicted environmental fate and transport data

92 views

Published on

Access to both experimental and predicted environmental fate and transport data is facilitated by the US-EPA CompTox Chemicals Dashboard. Providing access to various types of data associated with ~900,000 chemical substances, the dashboard is a web-based application supporting computational toxicology research in environmental chemistry. When experimental physicochemical and fate and transport data are not available, QSAR models developed using curated datasets are used for the prediction of properties. These include: bioaccumulation factors, bioconcentration factors, and biodegradation and fish biotransformation half-lives. For chemicals of interest that are not already registered in the dashboard real-time predictions based on structural inputs are available. This presentation will provide an overview of the dashboard with a focus on the availability of environmental fate and transport data, access to real time predictions, and our ongoing efforts to harvest and curate available experimental data from the literature and online databases. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

Published in: Science
  • Be the first to comment

  • Be the first to like this

US-EPA CompTox Chemicals Dashboard providing access to experimental and predicted environmental fate and transport data

  1. 1. CompTox Chemicals Dashboard providing access to experimental and predicted environmental fate and transport data Antony Williams1, Chris Grulke1, Kamel Mansouri2 and Todd Martin3 1) NCCT, U.S. Environmental Protection Agency, RTP, NC 2) Integrated Laboratory Systems, Research Triangle Park, NC. 3) NRMRL, U.S. Environmental Protection Agency, Cincinnati, OH Fall 2019 ACS Fall Meeting, San Diego http://www.orcid.org/0000-0002-2668-4821 The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
  2. 2. Overview • The CompTox Chemicals Dashboard - web- based database of 875k substances • Associated data including: – In vivo hazard data – In vitro bioactivity screening data – Link farm to tens of public resources • Includes experimental and predicted physchem and experimental fate and transport data • Access to real-time predictions • A quick overview of capabilities… 1
  3. 3. CompTox Chemicals Dashboard https://comptox.epa.gov/dashboard 2
  4. 4. BASIC Search 3
  5. 5. Detailed Chemical Pages 4
  6. 6. Experimental and Predicted Data 5
  7. 7. Prediction models and transparency 6
  8. 8. Curated data 7
  9. 9. Public PHYSPROP Dataset 8
  10. 10. LogP dataset: 15,809 structures • CAS Checksum: 12163 valid, 3646 invalid (>23%) • Invalid names: 555 • Invalid SMILES 133 • Valence errors: 322 Molfile, 3782 SMILES (>24%) • Duplicates check: –31 DUPLICATE MOLFILES –626 DUPLICATE SMILES –531 DUPLICATE NAMES • SMILES vs. Molfiles (structure check) –1279 differ in stereochemistry (~8%) –362 “Covalent Halogens” –191 differ as tautomers –436 are different compounds (~3%) 9
  11. 11. OPERA Predicted Properties 10 OPERA Models: https://github.com/kmansouri/OPERA
  12. 12. OPERA Standalone Application 11
  13. 13. Open Source https://github.com/kmansouri/OPERA 12
  14. 14. Environmental Fate and Transport 13
  15. 15. Environmental Fate and Transport 14
  16. 16. ECOTOX data 15
  17. 17. Other prediction modules TEST Desktop Software 16
  18. 18. Real-Time Predictions Based on TEST 17
  19. 19. Real-Time Predictions 18
  20. 20. TEST Predictions Detailed calculation reports 19
  21. 21. TEST Predictions Detailed calculation reports 20
  22. 22. Built on TEST Web Services https://www.epa.gov/sites/production/files/2018-08/documents/ webtest_users_guide.pdf 21
  23. 23. Batch Searching • Given a set of chemicals how can data be harvested? • OPERA and TEST predictions have been generated for structures in the database • How can the dashboard be used to harvest data for hundreds to thousands of chemicals… 22
  24. 24. Example list of chemicals - Opioids 23
  25. 25. Batch Search Names 24 Excel Download
  26. 26. Include Other Data of Interest 25
  27. 27. Batch search results - SDF 26
  28. 28. Work in Progress 27
  29. 29. Prototype Development Structure/substructure search 28
  30. 30. OPERA pKa Prediction Model • pKa prediction models based on Open Data Set of 8000 chemicals – acidic, basic and amphoteric chemicals • Accepted for publication to Journal of Cheminformatics29
  31. 31. “Chemical Transformation Simulator” • Chemical Transformation Simulator has public web services already available – (1) Abiotic Hydrolysis – (2) Abiotic Reduction – (3) Phase 1 Metabolism 30 Mock-up
  32. 32. Ongoing Extraction of Data • Data are extracted from literature based on agency priorities • Specific data sets of interest at present include those available for PFAS chemicals • Data releases are every 6 months at present 31
  33. 33. Conclusion • Dashboard access to data for ~875,000 chemicals • Ongoing aggregation of physicochemical property and environmental fate and transport data 32 • Retraining and rebuilding of models will occur as new data are assembled • Web services already available for TEST predictions with services for OPERA next • Future developments include integration of chemical transformation simulator
  34. 34. Acknowledgements EPA-RTP • An enormous team of contributors from NCCT, especially the IT software development team and data curation team • US-EPA ECOTOX for sharing data from their database • Valery Tkachenko for development of TEST web services
  35. 35. Contact Antony Williams NCCT, US EPA Office of Research and Development, Williams.Antony@epa.gov ORCID: https://orcid.org/0000-0002-2668-4821 34 https://doi.org/10.1186/s13321-017-0247-6

×