Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

EPA CompTox Chemicals Dashboard as a Data Integration Hub for Environmental Chemistry Data

23 views

Published on

The US EPA CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard) is a web-based application providing access to ~875,000 chemical substances and diverse data types including physicochemical property, toxicity and EPA’s ToxCast data. This presentation provided a live demo of the application and its benefits to members of ACS AGRO.

Published in: Science
  • Be the first to comment

  • Be the first to like this

EPA CompTox Chemicals Dashboard as a Data Integration Hub for Environmental Chemistry Data

  1. 1. EPA CompTox Chemicals Dashboard as a Data Integration Hub for Environmental Chemistry Data Antony Williams Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, RTP, NC November 13th 2019 ACS AGRO Webinar http://www.orcid.org/0000-0002-2668-4821 The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
  2. 2. CompTox Chemicals Dashboard https://comptox.epa.gov/dashboard • Freely accessible website and integration hub – Chemical substances – the majority with structures – Searchable by chemical, product use and gene – Experimental and predicted physicochemical property data – Experimental and predicted fate and transport data – Bioactivity data for the ToxCast/Tox21 project – Literature searches for chemicals using public resources – Links to other agency websites and public data resources – Batch searching for thousands of chemicals – Chemical lists of interest – pesticides, leachables, PFAS – DOWNLOADABLE Open Data for reuse and repurposing 1
  3. 3. A single application integrating… 2 SEARCH TOX DATA BIOACTIVITY SIMILARITY READ-ACROSS PUBMED BATCH SEARCH
  4. 4. A data integration hub LOTS of data! • >875,000 chemicals curated over 20 years • >700,000 toxicity data points from >30 sources • Millions of synonyms and identifiers • Tens of thousands of experimental data points • Millions of QSAR prediction reports • Millions of bioactivity data points for >4000 chemicals and hundreds of assay end points • Searching of Pubmed’s 29 million abstracts 3
  5. 5. CompTox Chemicals Dashboard https://comptox.epa.gov/dashboard 4 875k Chemical Substances
  6. 6. Type-ahead Search 5
  7. 7. Substring Search: Enniatin (10/29) 6
  8. 8. Chemical Details Page 7
  9. 9. Experimental & Predicted Properties 8
  10. 10. Open Source Prediction Models 9
  11. 11. OPERA Predicted Properties 10 OPERA Models: https://github.com/kmansouri/OPERA
  12. 12. Plus Real-Time Predictions 11
  13. 13. Toxicity Estimation Software Tool 12
  14. 14. Access to Chemical Hazard Data 13
  15. 15. Hazard Data from “ToxVal_DB” • ToxVal Database contains following data: –~800,000 toxicity values –~30 sources of data –~22,000 sub-sources –~5000 journals cited –~70,000 literature citations 14
  16. 16. In Vitro Bioassay Screening ToxCast and Tox21 15
  17. 17. In Vitro Bioassay Screening ToxCast and Tox21 16
  18. 18. Identifiers to Support Searches 17
  19. 19. Built in “Modules” 18
  20. 20. Literature Searching 19
  21. 21. Literature Searching 20
  22. 22. Literature Searching 21
  23. 23. Sifting retrieved articles 22
  24. 24. Direct Link to PubMed 23
  25. 25. Abstract Sifter for Excel 24
  26. 26. Mapped Relationships 25
  27. 27. Relationships in the Data All chemicals: Same Formula 26
  28. 28. Relationships in the Data All chemicals: Same Formula 27
  29. 29. Relationships in the Data Structure search the web 28
  30. 30. Structure search the web 29
  31. 31. Similar Compounds 30
  32. 32. Related Substances – Metabolites and Transformation Products 31
  33. 33. “External Links” to >70 sites 32
  34. 34. External Links: CTD 33
  35. 35. External Links: MassBank of North America 34
  36. 36. Chemical Lists 35
  37. 37. >200 Lists of Chemicals 36
  38. 38. Filtered Search on Toxins 37
  39. 39. Mycotoxins with MS Data 38
  40. 40. EPA Algal Toxins https://www.epa.gov/cyanohabs 39
  41. 41. Algal Toxins 40
  42. 42. Hazard Data for 25/54 Algal Toxins 41
  43. 43. And who wants to draw these? 42
  44. 44. When you can download them… 43
  45. 45. Batch Searching 44
  46. 46. Batch Searching • Singleton searches are useful but people generally want data on LOTS of chemicals! • Typical questions – What is the list of chemicals for the formula CxHyOz – What is the list of chemicals for a mass +/- error – Can I get chemical lists in Excel files? In SDF files? – Can I include properties in the download file? 45
  47. 47. Batch searching 46
  48. 48. Add Other Data of Interest 47
  49. 49. Related Substance Relationships 48
  50. 50. MASS AND FORMULA SEARCHING 49
  51. 51. Advanced Searches Mass and Formula Based Search 50
  52. 52. Advanced Searches Mass and Formula Based Search 51
  53. 53. Batch Searching Formula/Mass 52
  54. 54. Crowdsourced Curation 53
  55. 55. WORK IN PROGRESS 54
  56. 56. Predicted Mass Spectra http://cfmid.wishartlab.com/ • MS/MS spectra prediction for ESI+, ESI-, and EI • Predictions generated and stored for >800,000 structures, to be accessible via Dashboard 55
  57. 57. Search Expt. vs. Predicted Spectra
  58. 58. Search Expt. vs. Predicted Spectra
  59. 59. Spectral Viewer Comparison 58
  60. 60. Prototype Development 59
  61. 61. Prototype Development 60
  62. 62. Conclusion • Building an integrated hub for environmental chemistry • Transparent access to data and models • Data QUALITY is a key focus - ongoing curation 61
  63. 63. Acknowledgements EPA-RTP • An enormous team of contributors from CCTE, especially the IT software development team • Our curation team for their care and focus on data quality • Multiple centers and laboratories across the EPA • Many public domain databases and open data contributors
  64. 64. Contact Antony Williams CCTE, US EPA Office of Research and Development, Williams.Antony@epa.gov ORCID: https://orcid.org/0000-0002-2668-4821 63 https://doi.org/10.1186/s13321-017-0247-6

×