The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program integrates advances in biology, chemistry, exposure science and computer science to help prioritize chemicals for further research based on potential human health risks. This work involves computational and data-driven approaches that integrate chemistry, exposure and biological data. As an outcome of these efforts, the National Center for Computational Toxicology (NCCT) has measured, assembled and delivered an enormous quantity and diversity of data for the environmental sciences, including high-throughput in vitro screening data, legacy in vivo animal data, consumer use and production information, exposure models and chemical structure databases with associated properties. A series of software applications and databases has been produced over the past decade to deliver these data, but recent work has focused on a new software architecture that assembles the resources into a single platform. Our web application, the CompTox Chemistry Dashboard, provides access to data associated with ~750,000 chemical substances. These data include experimental and predicted physicochemical property data, bioassay screening data associated with the ToxCast program, product and functional use information and a myriad of related data of value to environmental scientists.
The dashboard supports chemical searching by name, synonym and CAS Registry Number. Flexible search capabilities also allow for chemical identification in non-targeted analysis studies using mass spectrometry. Both mass-based and formula-based searches rank-order their results using functional use statistics, which helps prioritize chemicals for further review when they are detected in environmental media.
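The mass-based search with rank-ordering described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the dashboard's implementation: the `Candidate` record, the `usage_score` field (a stand-in for whatever functional use statistic is applied) and the ppm tolerance are all hypothetical, and the candidate values are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical candidate record; field names are assumptions."""
    name: str
    monoisotopic_mass: float  # in Da
    usage_score: int          # stand-in for a functional use statistic


def search_by_mass(candidates, query_mass, ppm=5.0):
    """Keep candidates within a ppm window of query_mass, then rank them
    so that the highest usage_score comes first."""
    tolerance = query_mass * ppm / 1e6
    hits = [c for c in candidates
            if abs(c.monoisotopic_mass - query_mass) <= tolerance]
    return sorted(hits, key=lambda c: c.usage_score, reverse=True)


# Made-up candidates, for illustration only.
EXAMPLE_CANDIDATES = [
    Candidate("candidate A", 228.1150, 3),
    Candidate("candidate B", 228.1155, 40),
    Candidate("candidate C", 228.1300, 100),  # outside a 5 ppm window
]

if __name__ == "__main__":
    for hit in search_by_mass(EXAMPLE_CANDIDATES, 228.1150):
        print(hit.name, hit.usage_score)
```

Filtering before ranking means a candidate with a high score but a poor mass match (candidate C here) never reaches the ranked list.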
This presentation will provide an overview of the dashboard, its capabilities for delivering data to the environmental chemistry community and how the architecture provides a foundation for the development of additional applications to support chemical risk assessment. This abstract does not reflect U.S. EPA policy.
The EPA CompTox Chemistry Dashboard: A Web-Based Data Integration Hub for Environmental Chemistry and Toxicology Data
1. CompTox Chemistry Dashboard: Web-based
data integration hub for environmental
chemistry and toxicology data
Antony Williams1, Chris Grulke1, Andrew McEachran2, Ann Richard1,
Rebecca Jolley1, Jeremy Dunne1, Elizabeth Edmiston1 & Jeff Edwards1
1. National Center for Computational Toxicology, U.S. Environmental Protection Agency, RTP, NC
2. Oak Ridge Institute of Science and Education (ORISE) Research Participant, Research Triangle Park, NC
August 2017
ACS Fall Meeting, Washington, DC
http://www.orcid.org/0000-0002-2668-4821
The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
2. Who is NCCT?
• National Center for Computational Toxicology – part of EPA’s
Office of Research and Development
• Research driven by EPA’s Chemical Safety for Sustainability
Research Program
– Develop new approaches to evaluate potential chemical toxicity
– Integrate advances in biology, biotechnology, chemistry, exposure
science and computer science
3. The CompTox Chemistry Dashboard
PRIMARY GOALS
• Deliver a web-based application serving up the chemistry related
data used by our team
• Provide public access to the results of over a decade of curation
work reviewing environmental chemistry data
• Provide access to the results of our QSAR modeling work
• Deliver a central hub to link together websites of interest
• All data to be available as Open Data for download/reuse
SECONDARY GOAL
• To develop a cheminformatics architecture to serve as a high
quality chemical foundation for all NCCT tools and data
5. The CompTox Chemistry Dashboard:
An Overview
• A publicly accessible website delivering access to:
– ~760,000 chemicals and related property data
– Links to other agency websites and public data resources
– “Literature” searches for chemicals using public resources
– Integration to “biological assay data” for 1000s of chemicals
– Information regarding consumer products containing chemicals
– “Batch searching” for thousands of chemicals
• Day-to-day curation efforts for data quality
20. In vitro Bioassay Data
• In vitro bioassays are used to determine the biological
activity of a substance – ToxCast and Tox21 projects
• A decade of measurements, and millions of dollars of data
integrated into the dashboard
37. Data Available for Download
https://comptox.epa.gov/dashboard/downloads
38. Connecting into the Dashboard
• Linking into the Dashboard is simple: use the
associated identifiers
• For integration we can supply files of structures
and identifiers mapped to DTXSIDs. Contact us…
• PubChem, EBI’s UniChem, ChemSpider, etc.
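As a rough sketch of how an external resource might use such a mapping file to link records to DTXSIDs, the example below reads a small CSV and builds a CAS Registry Number to DTXSID lookup. The column names (`DTXSID`, `CASRN`, `PREFERRED_NAME`) and the single sample row are assumptions for illustration; the actual file layout may differ. The check-digit validation follows the standard CAS Registry Number rule and is included only to screen out mistyped identifiers before linking.

```python
import csv
import io
import re


def is_valid_casrn(casrn: str) -> bool:
    """Validate a CAS RN using its check digit: weight the digits (excluding
    the check digit) 1, 2, 3, ... from right to left, sum, and take mod 10."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", casrn)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    total = sum(int(d) * weight
                for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def load_casrn_to_dtxsid(csv_text: str) -> dict:
    """Build a CASRN -> DTXSID lookup from CSV text, skipping invalid CAS RNs.
    Column names are hypothetical."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["CASRN"]: row["DTXSID"]
            for row in reader
            if is_valid_casrn(row["CASRN"])}


# Illustrative sample only; the real mapping file format is an assumption.
SAMPLE_MAPPING = (
    "DTXSID,CASRN,PREFERRED_NAME\n"
    "DTXSID7020182,80-05-7,Bisphenol A\n"
)

if __name__ == "__main__":
    lookup = load_casrn_to_dtxsid(SAMPLE_MAPPING)
    print(lookup.get("80-05-7"))
```

Keying the lookup on a validated identifier keeps the linking step independent of how either side stores names or structures.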
40. Future Work
• Continue expansion and curation of data and data types.
• Provide “programmatic access” to all data –
connect to other Agency resources and allow other
scientists to integrate their scientific applications.
• Continue to assemble and enhance chemical lists
and data for specific projects. Make available to
Agency researchers and for public use.
• Make new modules public – “Generalized Read
Across”, “EcoTox data”
41. Confidential Business Information
CBI is broadly defined as proprietary information, considered
confidential to the submitter, the release of which would cause
substantial business injury to the owner.
42. The Dashboard for CBI
• The dashboard code and data will be
deployed in the Office of Pollution Prevention
and Toxics (OPPT) supporting CBI data
– Integrate OPPT CBI data in the database
– Isolate all internet-based modules for the CBI
environment – no external links, no literature searching,
no PubChem data etc.
– Rebuild OPERA models using CBI data (if the models
improve can we release without training data?)
43. Services to Support Real-Time
Property Prediction
[Screenshot: property prediction interface with the controls “Search”, “Load” and “Select Properties to Predict”]
• Properties offered: LogP (octanol-water), water solubility, density, flash point, melting point, boiling point, surface tension, thermal conductivity, vapor pressure, LogKoa (octanol-air), Henry’s Law, index of refraction, molar refractivity, molar volume, polarizability
• Prediction sources shown: T.E.S.T., OPERA, EPI Suite
44. Conclusion
• The CompTox Chemistry Dashboard provides
access to data for ~760,000 chemicals
• An integration hub for data – toxicity,
environmental, property, bioassay, and expanding
• Data downloads allow for reuse in other systems
and integration of resources to support research
45. Contact
Antony Williams
US EPA Office of Research and Development
National Center for Computational Toxicology (NCCT)
Williams.Antony@epa.gov
ORCID: https://orcid.org/0000-0002-2668-4821