Structure Identification Using High Resolution
Mass Spectrometry Data and the EPA
CompTox Dashboard
Antony J. Williams, Andrew McEachran, Chris Grulke,
Elin Ulrich, Jennifer Smith, Jeff Edwards and Jon Sobus
November 2-3, 2016
SWEMSA 2016
http://www.orcid.org/0000-0002-2668-4821
The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
Who is NCCT?
• National Center for Computational Toxicology – part of EPA’s
Office of Research and Development
• Research driven by EPA’s Chemical Safety for Sustainability
Research Program
– Develop new approaches to evaluate the safety of chemicals
– Integrate advances in biology, biotechnology, chemistry, exposure
science and computer science
• Goal - To identify chemical exposures that may disrupt
biological processes and cause adverse outcomes.
1
Our Dashboard Applications
• Some of our Web-based Applications
2
Introducing Our Latest Dashboard
https://comptox.epa.gov
3
• >720,000 chemicals
• >14 years assembling data
Bisphenol A
4
Physicochemical Properties
5
ToxCast Bioassay Screening Data
Useful Meta Data
6
Functional Use and Composition
VERY Useful Meta Data
7
Dashboard: External Links to
Analytical Methods and Data
8
National Environmental Methods Index
9
RSC Analytical Abstracts
10
FOR-IDENT and MoNA
11
Previous Work with Suspect-Screening
ONE ASPECT of the dashboard is to
support Non-targeted Analysis
Rank-Ordering of “Known-Unknowns”
using ChemSpider
13
Some history…
• 2007 A Hobby Project
• 2009 ChemSpider Acquired
• May 2015 Joined EPA – what
we are showing is very new
16
Advanced MS Searches
17
Monoisotopic Mass Search
18
Found 344 results for '215.096 ± 0.005 amu'
Download to Excel
19
Download as SDF file
20
Formula Search
21
Found 8 results for 'C8H14ClN5'
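The mass search and the formula search are two views of the same quantity: a molecular formula fixes a monoisotopic mass, and a mass search accepts any candidate whose monoisotopic mass falls inside the tolerance window. The sketch below is only an illustration, not the Dashboard's implementation. It assumes a small hand-entered table of isotope masses and two hypothetical helpers, `monoisotopic_mass` and `within_window`, and checks that the formula from the formula search, C8H14ClN5, lies inside the 215.096 ± 0.005 amu window from the mass search.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
# Only a few elements are listed; this table is an assumption of the sketch.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "Cl": 34.96885268,
}

def monoisotopic_mass(formula: str) -> float:
    """Sum isotope masses for a simple formula such as 'C8H14ClN5'."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[symbol] * (int(count) if count else 1)
    return total

def within_window(mass: float, target: float, tol: float = 0.005) -> bool:
    """True if mass lies within target +/- tol (all in u)."""
    return abs(mass - target) <= tol

mass = monoisotopic_mass("C8H14ClN5")
print(f"{mass:.4f}", within_window(mass, 215.096))  # 215.0938 True
```

The parser handles only flat formulas without parentheses, charges or isotope labels, which is enough for this check.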
Does the Dashboard Add Value?
22
721k structures
Does the Dashboard Add Value?
• Remember:
– Focus on high quality data and curation
– Data sources include EPA data sources and a focus on
environmental chemistry
• No “dilution” by chemical vendors
23
Dilution Example…
Morphine Skeleton
24
Bisphenol A as an example
ChemSpider: 1564 Structures
25
Bisphenol A as an example
Dashboard: 215 Structures
26
ChemSpider 6926 Results!!!
27
Tacedinaline
Methyl Red
C.I. Disperse Yellow 3
Using Meta-Data to Sort Candidates
28
Anti-cancer Drug
Microbiological Indicator Dye
Textile/Product Dye
Same top hits – different ranking
90 hits only versus 6926 hits
29
[Figure: candidate structures Tacedinaline, Methyl Red and C.I. Disperse Yellow 3, with labels 18, 17 and 4]
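The rank-ordering idea can be sketched in a few lines: candidates returned by a mass or formula search are sorted by how many data sources (Dashboard) or references (ChemSpider) mention them, and the position of the true compound in that sorted list is recorded. The candidate names and counts below are hypothetical placeholders, and `rank_by_data_sources` and `position_of` are illustrative names rather than Dashboard functions.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    name: str
    data_sources: int  # hypothetical counts, for illustration only

def rank_by_data_sources(candidates):
    """Return candidates sorted so the most widely sourced come first."""
    return sorted(candidates, key=lambda c: c.data_sources, reverse=True)

def position_of(name, ranked):
    """1-based position of a named candidate in the ranked list."""
    for i, c in enumerate(ranked, start=1):
        if c.name == name:
            return i
    raise ValueError(f"{name} not among candidates")

hits = [
    Candidate("candidate A", 3),
    Candidate("candidate B", 12),
    Candidate("candidate C", 7),
]
ranked = rank_by_data_sources(hits)
print([c.name for c in ranked], position_of("candidate C", ranked))
```

A shorter, curated hit list helps because fewer unrelated entries can outrank the true compound.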
Chemical Identification: Dashboard vs ChemSpider
Monoisotopic Mass (+/- 0.005 amu) Search
Sorted by number of references (ChemSpider) or data sources (Dashboard)
Columns #1 to #5+: position of compound in the sorted results

Source of List              # of Compounds  Search Tool  Mean Position  Median Position  #1  #2  #3  #4  #5+
McEachran et al Wastewater  34              ChemSpider   1.8            1                28  5   0   0   1
McEachran et al Wastewater  34              Dashboard    1.3            1                31  2   0   0   1
Misc. NTA Compounds         13              ChemSpider   2              1                7   5   0   0   1
Misc. NTA Compounds         13              Dashboard    1.7            1                10  2   0   0   1
Bade et al (2016)           19              ChemSpider   2.1            1                11  2   5   0   1
Bade et al (2016)           19              Dashboard    1.6            1                12  3   3   1   0
Rager et al (2016)          24              ChemSpider   2.25           1                15  2   1   2   4
Rager et al (2016)          24              Dashboard    1.08           1                22  2   0   0   0
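The per-position counts in the table are enough to recompute two of the summary figures: the share of compounds ranked #1 and the median position. The mean cannot be recovered this way because the 5+ bin hides exact positions. The sketch below uses the Rager et al (2016) counts from the table; the function names are assumptions made for this illustration.

```python
def share_in_first(counts):
    """Fraction of compounds ranked #1; counts = [#1, #2, #3, #4, #5+]."""
    return counts[0] / sum(counts)

def median_position(counts):
    """Median rank from binned counts; returns '5+' if it falls in the last bin.

    For an even number of compounds this returns the lower of the two
    middle positions, which is enough for a quick check.
    """
    middle = (sum(counts) + 1) // 2
    running = 0
    for position, n in enumerate(counts, start=1):
        running += n
        if running >= middle:
            return "5+" if position == 5 else position

# Counts for the Rager et al (2016) list (24 compounds), from the table.
rager = {
    "ChemSpider": [15, 2, 1, 2, 4],
    "Dashboard": [22, 2, 0, 0, 0],
}
for tool, counts in rager.items():
    print(tool, f"{share_in_first(counts):.1%}", median_position(counts))
```

For this list, both tools give a median position of 1, but the Dashboard places 22 of 24 compounds first against 15 of 24 for ChemSpider.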
Dashboard vs ChemSpider Ranking Summary

                             Mass-based Searching      Formula-based Searching
                             Dashboard   ChemSpider    Dashboard   ChemSpider
Cumulative Average Position  1.3         2.2           1.2         1.4
% in #1 Position             85%         70%           88%         80%

• Selected peer-reviewed publications
• 162 total individual chemicals in search
The Confusion of Chemicals…
CAS-Structure “Sphere of Confusion”
[Diagram: identifiers CAS1 to CAS5 and NOCAS, each marked "Valid CAS-substance?", carry Data1 to Data9 and map many:1 onto one parent structure (no stereo, desalted) with its formula and monoisotopic mass]
many:1 mappings:
• Deleted CAS
• Invalid CAS
• Salt forms
• Complex forms
• Hydrate forms
• Approx mappings to mixtures
• Approx mappings to ill-defined substances
• Stereoisomers
• Unresolved tautomers
DSSTox_v2 Database & Cheminformatics Layer:
• Resolve CAS-structure mappings for accurate data mapping
• Collapse sphere to collect all data at parent structure-formula level
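Collapsing the sphere amounts to mapping every identifier onto its parent structure and pooling the data attached to each one. The sketch below reuses the placeholder labels from the diagram (CAS1, Data1 and so on). Which data item belongs to which identifier is invented here, and `PARENT`, `CAS9` and `collapse` are hypothetical names, so this shows the idea rather than the DSSTox implementation.

```python
from collections import defaultdict

# Hypothetical many:1 mapping of identifiers to one parent structure.
cas_to_parent = {
    "CAS1": "PARENT",  # e.g. a salt form
    "CAS2": "PARENT",  # e.g. a hydrate form
    "CAS3": "PARENT",  # e.g. a stereoisomer
}

# Hypothetical data records, each keyed by the identifier it was reported under.
records = [
    ("CAS1", "Data1"),
    ("CAS2", "Data2"),
    ("CAS2", "Data3"),
    ("CAS3", "Data4"),
    ("CAS9", "Data5"),  # identifier with no curated mapping
]

def collapse(records, mapping):
    """Pool data under parent structures; set aside unmapped identifiers."""
    pooled, unresolved = defaultdict(list), []
    for cas, datum in records:
        parent = mapping.get(cas)
        if parent is None:
            unresolved.append((cas, datum))
        else:
            pooled[parent].append(datum)
    return dict(pooled), unresolved

pooled, unresolved = collapse(records, cas_to_parent)
print(pooled, unresolved)
```

Keeping the unresolved records separate, rather than dropping them, leaves a work list for manual curation.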
DSSTox List Curation Tool
Conflicts binned to facilitate curation
Helping to Curate Data
• We are helping to Curate Data (prior to
linking from our dashboard)
• Our discussions with Thomas and team –
“We all agree it is hard!”
• Approx. 80% of STOFF-IDENT is done…
36
Curating on CASRNs is DIFFICULT
• 36861-47-9 : SI00004220
• 38102-62-4 : SI00008957
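One check that can be automated is the CASRN check digit: the final digit equals the sum of the preceding digits, each weighted by its position counted from the right, modulo 10. The sketch below applies that public rule to the two CASRNs on this slide; `casrn_checksum_ok` is an illustrative name. A passing check only shows that a number is well formed. It cannot tell whether the CASRN is active, deleted, or mapped to the right structure, which is where the real curation effort lies.

```python
import re

def casrn_checksum_ok(casrn: str) -> bool:
    """Check the format and check digit of a CAS Registry Number."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", casrn)
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    # Weight digits 1, 2, 3, ... counting from the right of the body.
    total = sum(int(d) * w for w, d in enumerate(reversed(body), start=1))
    return total % 10 == int(match.group(3))

# The last entry is a deliberately corrupted copy of the first.
for casrn in ("36861-47-9", "38102-62-4", "36861-47-8"):
    print(casrn, casrn_checksum_ok(casrn))
```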
37
Collisions in CAS Numbers
38
Active and Deleted CASRN
39
But there are MANY CASRNs!
• http://web.stanford.edu/group/swain/cinf/casreg/snumber.html
40
How Bad Can It Get??
41
How Bad Can It Get?
This one has 316 deleted CASRNs
42
Helping to Curate Data
• We are missing 159 CAS Numbers listed in STOFF-IDENT
• Work on curating and mapping MassBank is underway. It is much bigger and will take far more work
Our OPEN Data is available…
• Various types of data at FTP download site:
ftp://newftp.epa.gov/COMPTOX/Sustainable_Chemistry_Data/Chemistry_Dashboard
44
Data Availability
• The data are now available in METLIN
• Available in MetFrag (alpha) and in testing.
45
Coming December 2016
Batch Searching Names/CASRNs
• What are these chemicals?
46
Coming December 2016
Batch Searching…
47
Coming December 2016
Download to Excel
48
In-testing
49
Metadata included for Ranking
50
Need for “MS-Ready Structures”
51
“QSAR-Ready Structures”
• For the purpose of building QSAR Models
we already “standardize” structures
– Desalt/Neutralize
– Desolvate
– Remove stereochemistry
• Some minor tweaks get us “MS-ready Structures”, which are ALREADY in our database.
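Two of the standardization steps listed above, desalting/desolvating and stereochemistry removal, can be roughly approximated at the string level on SMILES. This is a deliberately crude sketch under stated assumptions: it keeps the dot-separated fragment with the most atoms and strips stereo markers, it does not neutralize charges, and it is not the workflow used to build the database. In practice a cheminformatics toolkit would do this on parsed structures. All function names are hypothetical.

```python
import re

# Matches one atom in a SMILES string (organic-subset symbol or bracket atom).
ATOM = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]|\[[^\]]+\]")

def largest_fragment(smiles: str) -> str:
    """Crude desalt/desolvate: keep the dot-separated part with most atoms."""
    return max(smiles.split("."), key=lambda part: len(ATOM.findall(part)))

def strip_stereo(smiles: str) -> str:
    """Drop tetrahedral (@, @@) and double-bond (/, \\) stereo markers."""
    return re.sub(r"[@/\\]", "", smiles)

def ms_ready_sketch(smiles: str) -> str:
    """Apply both string-level steps; charges are left untouched."""
    return strip_stereo(largest_fragment(smiles))

# A chiral amine written with a separate HCl component.
print(ms_ready_sketch("C[C@H](N)C(=O)O.Cl"))  # C[CH](N)C(=O)O
```

Because salts, solvates and stereoisomers collapse to one string, a single mass or formula hit can be linked back to every original substance that shares it.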
52
“QSAR-Ready Structures”
• Mass and Formula-based searches will be
based on MS-ready structures but
connected to the original chemical (with
name, CAS, rank ordering)
• MS-ready structures and substance
mappings will be available as Open Data
53
Rank-Ordering – incl. PubChem
54
Future Work
• Continue to research rank-ordering approaches
• Working on “retention time prediction”
• Search for adducts (+Na, +K, +NH4) and handle
decarboxylation, loss of water, etc.
• Additional links to methods – CDC NIOSH
• Expand link outs to Mass Spec databases –
Thermo’s mzCloud, Massbank, etc.
• Predicting metabolites and degradants
• Optimize web services for the community
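The adduct item above comes down to simple mass arithmetic: an observed m/z is converted to a candidate neutral monoisotopic mass by subtracting the shift for each adduct hypothesis, and each candidate mass can then be searched as usual. The sketch below covers only singly charged positive ions and water loss. The shift values are standard monoisotopic figures rounded to six decimals, the function names are assumptions, and decarboxylation is not handled.

```python
# Mass shifts (u) from neutral monoisotopic mass M to singly charged
# positive-ion m/z; the electron mass is already subtracted.
ADDUCT_SHIFT = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033826,
    "[M+Na]+": 22.989221,
    "[M+K]+": 38.963158,
}
WATER = 18.010565  # monoisotopic mass of H2O

def adduct_mz(neutral_mass: float) -> dict:
    """Expected m/z of each adduct for a neutral monoisotopic mass."""
    return {name: neutral_mass + shift for name, shift in ADDUCT_SHIFT.items()}

def candidate_neutral_masses(observed_mz: float) -> dict:
    """Neutral masses implied by an observed m/z under each adduct hypothesis."""
    return {name: observed_mz - shift for name, shift in ADDUCT_SHIFT.items()}

def water_loss_mz(neutral_mass: float) -> float:
    """m/z of the [M+H-H2O]+ ion."""
    return neutral_mass + ADDUCT_SHIFT["[M+H]+"] - WATER

# 215.0938 is roughly the neutral monoisotopic mass of C8H14ClN5.
for name, mz in adduct_mz(215.0938).items():
    print(f"{name:10s} {mz:.4f}")
```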
55
Conclusions
• Only 1 aspect of the dashboard is focused on
MS – to support the EPA NTA Trial underway
• We should work on data curation TOGETHER!
• We are “part” of the solution. Our Open Data
and Open Services should be of value.
56
Acknowledgements
EPA NCCT
Chris Grulke
Jeff Edwards
Ann Richard
Jennifer Smith
Andrew McEachran*
EPA NERL
Jon Sobus
Seth Newton
Elin Ulrich
* = ORISE Participant

Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
 
Solution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutionsSolution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutions
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistan
 
TOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsTOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physics
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Twin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptxTwin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptx
 
insect anatomy and insect body wall and their physiology
insect anatomy and insect body wall and their  physiologyinsect anatomy and insect body wall and their  physiology
insect anatomy and insect body wall and their physiology
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 

Structure Identification Using High Resolution Mass Spectrometry Data and the EPA CompTox Dashboard

  • 1. Structure Identification Using High Resolution Mass Spectrometry Data and the EPA CompTox Dashboard Antony J. Williams, Andrew McEachran, Chris Grulke, Elin Ulrich, Jennifer Smith, Jeff Edwards and Jon Sobus, November 2-3, 2016 SWEMSA 2016 http://www.orcid.org/0000-0002-2668-4821 The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
  • 2. Who is NCCT? • National Center for Computational Toxicology – part of EPA’s Office of Research and Development • Research driven by EPA’s Chemical Safety for Sustainability Research Program – Develop new approaches to evaluate the safety of chemicals – Integrate advances in biology, biotechnology, chemistry, exposure science and computer science • Goal - To identify chemical exposures that may disrupt biological processes and cause adverse outcomes.
  • 3. Our Dashboard Applications • Some of our Web-based Applications
  • 4. Introducing Our Latest Dashboard https://comptox.epa.gov • >720,000 chemicals • >14 years assembling data
  • 7. ToxCast Bioassay Screening Data: Useful Meta Data
  • 8. Functional Use and Composition: VERY Useful Meta Data
  • 9. Dashboard: External Links to Analytical Methods and Data
  • 13. Previous Work with Suspect-Screening ONE ASPECT of the dashboard is to support Non-targeted Analysis
  • 15. Some history… • 2007 A Hobby Project
  • 16. Some history… • 2007 A Hobby Project • 2009 ChemSpider Acquired
  • 17. Some history… • 2007 A Hobby Project • 2009 ChemSpider Acquired • May 2015 Joined EPA – what we are showing is very new
  • 19. Monoisotopic Mass Search: Found 344 results for '215.096 ± 0.005 amu'
  • 21. Download as SDF file
  • 22. Formula Search: Found 8 results for 'C8H14ClN5'
  • 23. Does the Dashboard Add Value? 721k structures
  • 24. Does the Dashboard Add Value? • Remember: – Focus on high quality data and curation – Data sources include EPA data sources and a focus on environmental chemistry • No “dilution” by chemical vendors
  • 26. Bisphenol A as an example. ChemSpider: 1564 Structures
  • 27. Bisphenol A as an example. Dashboard: 215 Structures
  • 29. Using Meta-Data to Sort Candidates: Anti-cancer Drug; Microbiological Indicator Dye; Textile/Product Dye
  • 30. Same top hits – different ranking: 90 hits only versus 6926 hits. Candidates: Tacedinaline; Methyl Red; C.I. Disperse Yellow 3. Counts shown with the candidates: 18, 17, 4
  • 31. Chemical Identification: Dashboard vs ChemSpider. Monoisotopic Mass (+/- 0.005 amu) Search, sorted by number of references (ChemSpider) or data sources (Dashboard). Columns #1 to #5+ give the position of the compound in the sorted results.

      Source of List              # of Compounds  Search Tool  Mean Position  Median Position  #1  #2  #3  #4  #5+
      McEachran et al Wastewater  34              ChemSpider   1.8            1                28  5   0   0   1
      McEachran et al Wastewater  34              Dashboard    1.3            1                31  2   0   0   1
      Misc. NTA Compounds         13              ChemSpider   2              1                7   5   0   0   1
      Misc. NTA Compounds         13              Dashboard    1.7            1                10  2   0   0   1
      Bade et al (2016)           19              ChemSpider   2.1            1                11  2   5   0   1
      Bade et al (2016)           19              Dashboard    1.6            1                12  3   3   1   0
      Rager et al (2016)          24              ChemSpider   2.25           1                15  2   1   2   4
      Rager et al (2016)          24              Dashboard    1.08           1                22  2   0   0   0

  • 32. Dashboard vs ChemSpider Ranking Summary

                                   Mass-based Searching     Formula Based Searching
                                   Dashboard   ChemSpider   Dashboard   ChemSpider
      Cumulative Average Position  1.3         2.2          1.2         1.4
      % in #1 Position             85%         70%          88%         80%

    • Selected peer-reviewed publications • 162 total individual chemicals in search
  • 33. The Confusion of Chemicals… CAS-Structure “Sphere of Confusion” (diagram): entries CAS1 to CAS5 and NOCAS, each marked “Valid CAS-substance?”, alongside Data1 to Data9, map many:1 to a parent structure (no stereo, desalted) with its monoisotopic mass and formula. Sources of confusion: • Deleted CAS • Invalid CAS • Salt forms • Complex forms • Hydrate forms • Approx mappings to mixtures • Approx mappings to ill-defined substances • Stereoisomers • Unresolved tautomers. DSSTox_v2 Database & Cheminformatics Layer: • Resolve CAS-structure mappings for accurate data mapping • Collapse sphere to collect all data at parent structure-formula level
  • 34. DSSTox List Curation Tool Conflicts binned to facilitate curation
  • 35. Helping to Curate Data • We are helping to Curate Data (prior to linking from our dashboard) • Our discussions with Thomas and team – “We all agree it is hard!”
  • 36. Helping to Curate Data • We are helping to Curate Data (prior to linking from our dashboard) • Our discussions with Thomas and team – “We all agree it is hard!” • Approx. 80% of STOFF-IDENT is done…
  • 37. Helping to Curate Data • We are helping to Curate Data (prior to linking from our dashboard) • Our discussions with Thomas and team – “We all agree it is hard!” • Approx. 80% of STOFF-IDENT is done…
  • 38. Curating on CASRNs is DIFFICULT • 36861-47-9 : SI00004220 • 38102-62-4 : SI00008957
  • 39. Collisions in CAS Numbers
  • 40. Active and Deleted CASRN
  • 41. But there are MANY CASRNs! • http://web.stanford.edu/group/swain/cinf/casreg/snumber.html
  • 42. How Bad Can It Get??
  • 43. How Bad Can It Get? This one is 316 Deleted CASRN
  • 44. Helping to Curate Data • We are missing 159 CAS Numbers listed in STOFF-IDENT • Work on curating and mapping MASSBANK is underway. Much bigger!! Way more work
  • 45. Our OPEN Data is available… • Various types of data at FTP download site: ftp://newftp.epa.gov/COMPTOX/Sustainable_Chemistry_Data/Chemistry_Dashboard
  • 46. Data Availability • The data are now available in METLIN • Available in MetFrag (alpha) and in testing.
  • 47. Coming December 2016: Batch Searching Names/CASRNs • What are these chemicals?
  • 48. Coming December 2016: Batch Searching…
  • 51. Metadata included for Ranking
  • 52. Need for “MS-Ready Structures”
  • 53. “QSAR-Ready Structures” • For the purpose of building QSAR Models we already “standardize” structures – Desalt/Neutralize – Desolvate – Remove stereochemistry • Some minor tweaks get us “MS-ready Structures”. ALREADY in our database.
  • 54. “QSAR-Ready Structures” • Mass and Formula-based searches will be based on MS-ready structures but connected to the original chemical (with name, CAS, rank ordering) • MS-ready structures and substance mappings will be available as Open Data
  • 56. Future Work • Continue to research rank-ordering approaches • Working on “retention time prediction” • Search for adducts (+Na, +K, +NH4) and handle decarboxylation, loss of water etc • Additional links to methods – CDC NIOSH • Expand link outs to Mass Spec databases – Thermo’s mzCloud, Massbank, etc. • Predicting metabolites and degradants • Optimize web services for the community
  • 57. Conclusions • Only 1 aspect of the dashboard is focused on MS – to support the EPA NTA Trial underway • We should work on data curation TOGETHER! • We are “part” of the solution. Our Open Data and Open Services should be of value.
  • 58. Acknowledgements EPA NCCT Chris Grulke Jeff Edwards Ann Richard Jennifer Smith Andrew McEachran* EPA NERL Jon Sobus Seth Newton Elin Ulrich * = ORISE Participant
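
The monoisotopic mass search (slide 19), the formula search (slide 22), the ranking by number of data sources (slide 31) and the adduct search listed under Future Work (slide 56) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not the Dashboard's implementation: the function names, the short element table and the `(name, formula, n_data_sources)` candidate layout are hypothetical and chosen only for this example. The query values '215.096 ± 0.005 amu' and 'C8H14ClN5' are taken from the slides.

```python
import re

# Monoisotopic masses (u) of the lightest abundant isotope of a few elements.
# Assumption: this short table is enough for the example; extend it as needed.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "Cl": 34.96885268,
    "Na": 22.98976928,
    "K": 38.96370649,
}
ELECTRON = 0.00054858  # electron mass (u), removed when forming a cation


def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C8H14ClN5'.

    Handles element symbols followed by optional counts; no brackets,
    charges or isotope labels.
    """
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[symbol] * (int(count) if count else 1)
    return total


def within_window(mass, target, tol=0.005):
    """True if mass lies inside target +/- tol (the slide uses 0.005 amu)."""
    return abs(mass - target) <= tol


def rank_candidates(candidates, target, tol=0.005):
    """Keep candidates inside the mass window, most data sources first.

    candidates: iterable of (name, formula, n_data_sources) tuples
    (hypothetical layout for this sketch).
    """
    hits = [c for c in candidates
            if within_window(monoisotopic_mass(c[1]), target, tol)]
    return sorted(hits, key=lambda c: c[2], reverse=True)


def adduct_mz(neutral_mass):
    """m/z of common singly charged positive-mode adducts of a neutral mass."""
    h = MONOISOTOPIC["H"]
    return {
        "[M+H]+": neutral_mass + h - ELECTRON,
        "[M+Na]+": neutral_mass + MONOISOTOPIC["Na"] - ELECTRON,
        "[M+K]+": neutral_mass + MONOISOTOPIC["K"] - ELECTRON,
        "[M+NH4]+": neutral_mass + MONOISOTOPIC["N"] + 4 * h - ELECTRON,
    }


if __name__ == "__main__":
    mass = monoisotopic_mass("C8H14ClN5")
    print(round(mass, 4))                # 215.0938
    print(within_window(mass, 215.096))  # True
    for name, mz in adduct_mz(mass).items():
        print(name, round(mz, 4))
```

The formula from slide 22 falls inside the mass window from slide 19, so it would be among the mass-search hits. Searching for adducts amounts to running the same window test against each adduct m/z rather than against the neutral mass.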

Editor's Notes

  1. For example, Rager was actually 33 confirmed; Bade was 25.