1
Environmental Cheminformatics to
Identify Unknown Chemicals and their Effects
Assoc. Prof. Dr. Emma L. Schymanski
FNR ATTRACT Fellow and PI in Environmental Cheminformatics
Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg
Email: emma.schymanski@uni.lu
…and many colleagues who contributed to my science over the years!
Image © www.seanoakley.com/
https://tinyurl.com/ucdavis-echidna
Metabolomics Group Seminar, UC Davis, California, November 28, 2018. Host: Oliver Fiehn
2
Outline for Today
o Background about LCSB
• LCSB & University of Luxembourg
• Biomedicine and Parkinson’s Disease
• Environmental Cheminformatics @LCSB
o European(+) Community Efforts for Unknown ID
• Mass Spectral Libraries (www.massbank.eu)
• NORMAN Suspect Exchange and CompTox Chemicals Dashboard
• Metadata, MS-ready and MetFrag
• Bigger Picture Examples (Rhine, NormaNEWS, DSFP)
o Work in Progress and Future Challenges
• Complex Mixtures – Cheminformatics to Screen Undefined Structures
• Preview: Disease-specific & MetFrag-compatible Metadata
• Bonus slides on HDX (an entire presentation) if anyone wants
3
Luxembourg
Source: https://en.wikipedia.org/wiki/File:Luxembourg-CIA_WFB_Map.png and https://en.wikipedia.org/wiki/File:EU-Luxembourg.svg
4
University of Luxembourg & LCSB
o Uni Lu was founded in 2003
• We just turned 15 (teenage years!)
o LCSB was founded in 2009
• …and is still a pre-teenager
• Young and very dynamic working environment!
5
Environmental Cheminformatics … the Group
Sources: S. Gene; https://en.wikipedia.org/wiki/File:Zwei_zigaretten.jpg; R. Singh; DOI: 10.1186/s13321-017-0223-1; DOI: 10.1016/j.aca.2017.12.034
6
Our challenge? We still have many unknowns …
o …in both environmental and metabolomics analysis
(l) Data from Schymanski et al 2014, ES&T DOI: 10.1021/es4044374. (r) E. coli data provided by N. Zamboni, IMSB, ETH Zürich.
Wastewater
Cells
7
(European) Environmental Community (subset!)
Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7
Croatian Water; RWS
Specialist Knowledge
Highly Disjointed
8
Our (Community) Challenge: Identifying Chemicals
Data: Schymanski et al 2014, Environ. Sci. Technol. DOI: 10.1021/es4044374; Hollender et al 2017 DOI: 10.1021/acs.est.7b02184
[Figure: number of chemicals on a scale from 1 to 1 billion; sample analysed by high resolution mass spectrometry]
9
Our (Community) Challenge: Identifying Chemicals
Data: Schymanski et al 2014, DOI: 10.1021/es4044374; https://www.slideshare.net/EmmaSchymanski/small-molecules-in-big-data-analytica-munich
[Figure: number of chemicals on a scale from 1 to 1 billion; sample analysed by high resolution mass spectrometry; chemicals AND connecting chemical knowledge]
10
Mass Spectral Libraries
http://massbank.eu/MassBank
https://github.com/MassBank/MassBank-data/
>46,000 spectra
32 contributors
11
MassBank EU
https://github.com/MassBank/MassBank-data; https://github.com/MassBank/MassBank-web/; Rösch et al DOI 10.1021/acs.est.5b05186
http://massbank.eu/MassBank
o MassBank.EU was founded late 2012, hosted at UFZ, Leipzig, Germany
o >16,000 MS/MS spectra; 1,200 substances from NORMAN members
o MassBank now has >46,000 spectra from 32 contributing institutes!
o Thorough GitHub-based modernization in progress for traceability
o Tentative/unknown/literature spectra (Level Scheme) as SI for publications
Schymanski et al DOI: 10.1021/es5002105
12
Mass Spectral Libraries
https://github.com/cdk/depict/
13
Confidence Levels for Tentative Structures
Schymanski, Jeon, Gulde, Fenner, Ruff, Singer & Hollender (2014) ES&T, 48 (4), 2097-2098. DOI: 10.1021/es5002105
o Annotation is the key to communicating information
Identification confidence (minimum data requirements in brackets):
Level 1: Confirmed structure by reference standard (MS, MS2, RT, Reference Std.)
Level 2: Probable structure
a) by library spectrum match (MS, MS2, Library MS2)
b) by diagnostic evidence (MS, MS2, Exp. data)
Level 3: Tentative candidate(s): structure, substituent, class (MS, MS2, Exp. data)
Level 4: Unequivocal molecular formula, example C6H5N3O4 (MS isotope/adduct)
Level 5: Exact mass of interest, example 192.0757 (MS)
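The confidence levels above can be read as a cascade from strongest to weakest evidence. The sketch below is a simplified illustration only: the function name and the boolean flags are hypothetical, and in practice assigning a level needs expert judgement about the quality of each piece of evidence, which simple flags cannot capture.

```python
def confidence_level(reference_standard=False, library_match=False,
                     diagnostic_evidence=False, candidate_structures=False,
                     unequivocal_formula=False):
    """Return the highest confidence level supported by the evidence flags.

    Hypothetical helper: the flags are a simplification of the minimum
    data requirements listed for each level.
    """
    if reference_standard:
        return "1"   # confirmed structure by reference standard
    if library_match:
        return "2a"  # probable structure by library spectrum match
    if diagnostic_evidence:
        return "2b"  # probable structure by diagnostic evidence
    if candidate_structures:
        return "3"   # tentative candidate(s)
    if unequivocal_formula:
        return "4"   # unequivocal molecular formula
    return "5"       # only an exact mass of interest


if __name__ == "__main__":
    print(confidence_level(library_match=True))        # 2a
    print(confidence_level(unequivocal_formula=True))  # 4
    print(confidence_level())                          # 5
```

Checking the strongest evidence first means a feature with both a library match and a reference standard is reported at Level 1, not Level 2a.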
14
Creating High-Quality Mass Spectra
Stravs, Schymanski, Singer and Hollender, 2013, Journal of Mass Spectrometry, 48, 89–99. DOI: 10.1002/jms.3131
Automatic MS and MS/MS
Recalibration and Clean-up
Remove interfering peaks
Spectral Annotation with
- Experimental Details
- Compound Information
https://github.com/MassBank/RMassBank/
http://bioconductor.org/packages/RMassBank/
15
Communicating Mass Spectra for Mixtures
Stravs et al. (2013), J. Mass Spectrom, 48(1):89-99. DOI: 10.1002/jms.3131
[Structure: SPA-9C, m+n=6]
Formulas: http://sourceforge.net/projects/genform/
Meringer et al, 2011, MATCH 65, 259-290
Data: Schymanski et al. 2014, ES&T, 48: 1811-1818. DOI: 10.1021/es4044374
Chromatography and MS/MS Annotation
Literature: LIT00034,35
Sample: ETS00002
Standard: ETS00016,17,19,20
https://github.com/MassBank/RMassBank/
17
European (World-)Wide Exchange of Suspects
Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7
NORMAN Suspect List Exchange:
http://www.norman-network.com/?q=node/236
18
NORMAN Suspect List Exchange
o http://www.norman-network.com/?q=node/236
Schymanski, Aalizadeh et al. in prep; https://www.researchgate.net/project/Supporting-Mass-Spectrometry-Through-Cheminformatics
References / Full Lists
19
NORMAN Suspect Exchange Lists
o Now 21 lists available online … from small to large!
• Specialist collections (e.g. NormaNEWS) to large market lists
• Integrated into the CompTox Chemistry Dashboard
20
NORMAN Lists => CompTox Dashboard
https://comptox.epa.gov/dashboard/chemical_lists/normanews
http://www.norman-network.com/?q=node/236
21
Lists on CompTox Chemicals Dashboard
https://comptox.epa.gov/dashboard/chemical_lists/
More lists become available with every release
22
CompTox Chemicals Dashboard
https://comptox.epa.gov/dashboard/
23
CompTox Chemistry Dashboard – Presence in Lists
https://comptox.epa.gov/dashboard/chemical_lists
24
CompTox Chemistry Dashboard – Additional Data
https://comptox.epa.gov/dashboard/
25
Metadata & Different Chemical Forms
Schymanski & Williams, 2017, ES&T, 51 (10), pp 5357–5359. DOI: 10.1021/acs.est.7b01908
26
Metadata & Different Chemical Forms
MS-ready: McEachran et al. 2018, J Cheminform. DOI: 10.1186/s13321-018-0299-2
27
Metadata & Different Chemical Forms
https://comptox.epa.gov/dashboard/dsstoxdb/mixture_search?cid=930
28
Connecting Resources in MetFrag
https://msbi.ipb-halle.de/MetFragBeta/ AND https://comptox.epa.gov/dashboard/dsstoxdb/batch_search (MetFrag Export)
29
MetFrag2.3: Non-target Identification
Ruttkies, Schymanski, Wolf, Hollender, Neumann (2016) J. Cheminf., DOI: 10.1186/s13321-016-0115-9
Status: 2010 => 2016 (MetFrag 2.3)
o Precursor: m/z [M-H]- 213.9637
o Candidates: ChemSpider or PubChem, ± 5 ppm
o Mass tolerances shown: 5 ppm, 0.001 Da
o MS/MS peaks (m/z, intensity): 134.0054 339689; 150.0001 77271; 213.9607 632466
o RT: 4.54 min; 355 InChI/RTs
o Metadata terms: References, External Refs, Data Sources, RSC Count, PubMed Count
o Suspect Lists
o Elements: C,N,S
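The ± 5 ppm candidate search around m/z 213.9637 ([M-H]-) comes down to simple arithmetic. The sketch below shows it with hypothetical helper names. It is not MetFrag code, and the proton mass constant is an assumption added here, not a value from the slides.

```python
PROTON_MASS = 1.007276466  # Da; assumed constant, not taken from the slides


def ppm_window(mz, ppm):
    """Return the (low, high) m/z bounds for a relative tolerance given in ppm."""
    delta = mz * ppm * 1e-6
    return mz - delta, mz + delta


def neutral_mass_from_deprotonated(mz):
    """Neutral monoisotopic mass M of a singly charged [M-H]- ion."""
    return mz + PROTON_MASS


if __name__ == "__main__":
    low, high = ppm_window(213.9637, 5.0)
    print(f"search window: {low:.4f} to {high:.4f}")
    print(f"neutral mass:  {neutral_mass_from_deprotonated(213.9637):.4f}")
```

At this m/z, 5 ppm is roughly ± 0.0011 Da, so the relative tolerance is about the same size as the 0.001 Da absolute tolerance shown on the slide.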
30
MetFrag2.3: Non-target Identification
Ruttkies, Schymanski, Wolf, Hollender, Neumann (2016) J. Cheminf., DOI: 10.1186/s13321-016-0115-9
Try with the Web Interface: http://msbi.ipb-halle.de/MetFragBeta/
32
Connecting Resources in MetFrag
https://msbi.ipb-halle.de/MetFragBeta/ ; https://comptox.epa.gov/dashboard/ ; https://massbank.eu ; http://normandata.eu/
Combined evidence clearly highlights potential neurotoxicant among chemical candidates
33
MS-ready: McEachran et al. 2018, J Cheminform. DOI: 10.1186/s13321-018-0299-2
Connecting Resources in MetFrag
34
Connecting and Enhancing Open Resources
https://www.slideshare.net/EmmaSchymanski/small-molecules-in-big-data-analytica-munich
o Sharing knowledge is a win-win situation
2014 2015: found in waters across Europe
2016: 1 datapoint cross-annotates 3072 in GNPS
Hits in GNPS MassIVE datasets:
Surfactants: http://goo.gl/7sY9Pf
2017: Early-Warning System is born
2018: Highlighted in Science
35
NORMAN Digital Sample Freezing Platform
“Live” retrospective screening of known and unknown
chemicals in European samples (various matrices)
http://norman-data.eu/ AND Alygizakis et al, in prep.
36
Interactive heatmap available at http://norman-data.eu/NORMAN-REACH
NORMAN Digital Sample Freezing Platform
Retrospective screening of REACH chemicals in
Black Sea samples (various matrices)
37
NORMAN Digital Sample Freezing Platform
“Live” retrospective screening of known and unknown
chemicals in European samples (various matrices)
Future work: use results of unknowns to drive prioritization efforts
http://norman-data.eu/ AND Alygizakis et al, in prep.
38
Real-time Monitoring of the Rhine River
Hollender, Schymanski, Singer & Ferguson, 2017, ES&T Feature, 51:20, 11505-11512. DOI: 10.1021/acs.est.7b02184
Previously unknown chemicals detected due to “stand-out” patterns
40
We still have many unknowns …
(l) Data from Schymanski et al 2014, ES&T DOI: 10.1021/es4044374. (r) E. coli data provided by N. Zamboni, IMSB, ETH Zürich.
Environment
Cells
41
NORMAN Suspects don’t all have structures!
42
Accessing Metadata Behind Complex Mixtures
Highest Priority PFAS are also highly complex UVCBs!
43
Homologous Series Detection
M. Loos & H Singer, 2017. J. Cheminf. DOI: 10.1186/s13321-017-0197-z & Schymanski et al. 2014, ES&T DOI: 10.1021/es4044374
http://www.envihomolog.eawag.ch/
Search for discrete mass differences
[Structure drawings: homologous series with repeating units m and n; side chain C9H19]
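Searching for discrete mass differences can be sketched as chaining peaks whose masses differ by one repeating unit. This is a minimal illustration with hypothetical names and is not the algorithm behind the tool linked above. It ignores retention time and intensity, and the CH2 and CF2 increments are standard monoisotopic values added here as assumptions.

```python
CH2 = 14.015650  # Da, monoisotopic mass of a CH2 unit (assumed constant)
CF2 = 49.996806  # Da, monoisotopic mass of a CF2 unit (assumed constant)


def find_homologous_series(masses, delta, tol=0.001, min_members=3):
    """Group masses into chains whose neighbours differ by delta within tol (Da)."""
    if delta <= tol:
        raise ValueError("delta must be larger than tol")
    ordered = sorted(masses)
    assigned = set()
    series = []
    for start in ordered:
        if start in assigned:
            continue  # already part of a reported series
        chain = [start]
        while True:
            target = chain[-1] + delta
            match = next((m for m in ordered if abs(m - target) <= tol), None)
            if match is None:
                break
            chain.append(match)
        if len(chain) >= min_members:
            series.append(chain)
            assigned.update(chain)
    return series


if __name__ == "__main__":
    # Hypothetical peak list: three CH2-spaced peaks plus two unrelated ones.
    peaks = [300.0, 300.0 + CH2, 300.0 + 2 * CH2, 410.5, 123.4]
    for chain in find_homologous_series(peaks, CH2):
        print([round(m, 4) for m in chain])
```

Requiring at least three members cuts down chance matches between unrelated peaks. Swapping `CH2` for `CF2` looks for the +CF2 spacing instead.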
44
Towards high throughput MS screening of UVCBs
o https://github.com/schymane/RChemMass/
45
Cross-Linking Homologues in the Dashboard
Schymanski, Grulke, Williams et al, in prep. & Williams et al. 2017 J. Cheminformatics 9:61 DOI: 10.1186/s13321-017-0247-6
https://comptox.epa.gov/dashboard/chemical_lists/eawagsurf
46
Homologous Series in Biological Matrices
Lipid extract of Mycobacterium smegmatis
C23F48O7
+CF2
Schymanski & Zamboni … random data exploration …
47
Exchanging Knowledge … Open Science Helps!
We need to be able to find and annotate the unexpected!
C23F48O7
+CF2
Schymanski & Zamboni … random data exploration …
49
Exchanging data reveals things we never expected!
Schymanski & Zamboni … random data exploration …
o Lipid extract of Mycobacterium smegmatis
C23F48O7
+CF2
DTXSID70880513
50
Community Challenges … and Solutions
Data: Schymanski et al 2014, DOI: 10.1021/es4044374; https://www.slideshare.net/EmmaSchymanski/small-molecules-in-big-data-analytica-munich
High resolution
mass spectrometry
AND connecting
chemical knowledge
51
Sampling extraction (SPE) → HPLC separation → HR-MS/MS
Conversion (Proteowizard) and Peak Picking (enviPick, xcms, MZmine, …)
Detection of blank/blind/noise/internal standards; time trend analysis (enviMass)
TARGET ANALYSIS: Target List
SUSPECT SCREENING: Suspect List (e.g. NORMAN, LMC, Eawag-PPS, ReSOLUTION)
NON-TARGET SCREENING: Componentization (nontarget)
(enviMass, vendor software)
Gather evidence (nontarget, ReSOLUTION, RMassBank)
Prioritization (enviMass)
Masses of interest
Molecular formula determination (enviPat, GenForm)
MS/MS Extraction (RMassBank)
Non-target identification (MetFrag2.3, ReSOLUTION)
Interpretation, confirmation, peak inventory, confidence and reporting
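The suspect screening branch of this workflow essentially matches measured masses against a suspect list within a mass tolerance. The sketch below shows that matching step only. The function name, the 5 ppm default and the suspect entries are hypothetical placeholders, not data from NORMAN or any of the tools named above.

```python
from bisect import bisect_left, bisect_right


def screen_suspects(measured_mz, suspects, ppm=5.0):
    """Return (measured m/z, suspect name) pairs that agree within ppm.

    suspects is a list of (name, expected m/z) tuples; the values used in
    the example below are made up for illustration.
    """
    ordered = sorted(suspects, key=lambda entry: entry[1])
    expected = [mz for _, mz in ordered]
    hits = []
    for mz in measured_mz:
        tol = mz * ppm * 1e-6
        lo = bisect_left(expected, mz - tol)
        hi = bisect_right(expected, mz + tol)
        hits.extend((mz, ordered[i][0]) for i in range(lo, hi))
    return hits


if __name__ == "__main__":
    suspect_list = [("suspect_A", 200.0000), ("suspect_B", 300.0000)]
    measured = [200.0005, 250.0000, 300.0030]
    print(screen_suspects(measured, suspect_list))
```

A mass match alone is only a starting point. Supporting evidence such as MS/MS and retention time is still needed before a tentative hit can be given a higher confidence level.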
52
Coming Soon … (WiP and Already Online!)
Schymanski, Baker, Williams, Singh et al. in preparation. Excel macro: https://figshare.com/s/824f6606644f474c7288
https://comptox.epa.gov/dashboard/chemical_lists/litminedneuro
53
Conclusions / Outlook / Perspectives
Monzel et al 2017 Stem Cell Reports, DOI: 10.1016/j.stemcr.2017.03.010 (Organoids)
o Over 60 % of HR-MS peaks are potentially relevant but unknown
o Non-target screening requires data and evidence from many different sources
o Many excellent workflows now available to collate this information
o Incorporation of all available metadata (expert knowledge) is critical to success!
o Complex mixtures (UVCBs) are a huge and very challenging part of the puzzle
o New cheminformatics approaches needed - great progress so far
o Information in the public domain helps everyone!
o Additional experimental methods can provide more information
o H-D exchange-based labelling [EXTRA SLIDES]
o Integration of computational toxicity knowledge essential
o LCSB has some amazing facilities and expertise
(I am just beginning to appreciate how much …)
54
Acknowledgements
emma.schymanski@uni.lu
Further Information:
https://massbank.eu/MassBank/
http://c-ruttkies.github.io/MetFrag/
https://comptox.epa.gov/dashboard/
http://www.norman-network.com/?q=node/236
https://wwwen.uni.lu/lcsb/research/environmental_cheminformatics
EU Grant 603437
55

Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
 
RNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCINGRNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCING
 
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
 
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
 
ESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptxESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptx
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
 
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...
 
Aerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynypptAerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynyppt
 

Environmental Cheminformatics for Unknown ID UC Davis Nov 2018

  • 1. 1 Environmental Cheminformatics to Identify Unknown Chemicals and their Effects Assoc. Prof. Dr. Emma L. Schymanski FNR ATTRACT Fellow and PI in Environmental Cheminformatics Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg Email: emma.schymanski@uni.lu …and many colleagues who contributed to my science over the years! Image©www.seanoakley.com/ https://tinyurl.com/ucdavis-echidna Metabolomics Group Seminar, UC Davis, California, November 28, 2018. Host: Oliver Fiehn
  • 2. 2 Outline for Today o Background about LCSB • LCSB & University of Luxembourg • Biomedicine and Parkinson’s Disease • Environmental Cheminformatics @LCSB o European(+) Community Efforts for Unknown ID • Mass Spectral Libraries (www.massbank.eu) • NORMAN Suspect Exchange and CompTox Chemicals Dashboard • Metadata, MS-ready and MetFrag • Bigger Picture Examples (Rhine, NormanNEWS, DSFP) o Work in Progress and Future Challenges • Complex Mixtures – Cheminformatics to Screen Undefined Structures • Preview: Disease-specific & MetFrag-compatible Metadata • Bonus slides on HDX (an entire presentation) if anyone wants
  • 4. 4 University of Luxembourg & LCSB o Uni Lu was founded in 2003 • We just turned 15 (teenage years!) o LCSB was founded in 2009 • …and is still a pre-teenager • Young and very dynamic working environment!
  • 5. 5 Environmental Cheminformatics … the Group Sources: S. Gene; https://en.wikipedia.org/wiki/File:Zwei_zigaretten.jpg; R. Singh; DOI:10.1186/s13321-017-0223-1; DOI: 10.1016/j.aca.2017.12.034
  • 6. 6 Our challenge? We still have many unknowns … o …in both environmental and metabolomics analysis (l) Data from Schymanski et al 2014, ES&T DOI: 10.1021/es4044374. (r) E. coli data provided by N. Zamboni, IMSB, ETH Zürich. Wastewater Cells
  • 7. 7 (European) Environmental Community (subset!) Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7 Croatian Water RWS Specialist Knowledge Highly Disjointed
  • 8. 8 Our (Community) Challenge: Identifying Chemicals [Figure labels: Sample; High resolution mass spectrometry; scale from 1 to 1 billion chemicals] Data: Schymanski et al 2014, Environ. Sci. Technol. DOI: 10.1021/es4044374; Hollender et al 2017 DOI: 10.1021/acs.est.7b02184
  • 9. 9 Our (Community) Challenge: Identifying Chemicals AND connecting chemical knowledge [Figure labels: Sample; High resolution mass spectrometry; scale from 1 to 1 billion chemicals] Data: Schymanski et al 2014, DOI: 10.1021/es4044374; https://www.slideshare.net/EmmaSchymanski/small-molecules-in-big-data-analytica-munich
  • 11. 11 MassBank EU http://massbank.eu/MassBank o MassBank.EU was founded in late 2012, hosted at UFZ, Leipzig, Germany o >16,000 MS/MS spectra; 1,200 substances from NORMAN members o MassBank now has >46,000 spectra from 32 contributing institutes! o Thorough GitHub-based modernization in progress for traceability: https://github.com/MassBank/MassBank-data; https://github.com/MassBank/MassBank-web/ o Tentative/unknown/literature spectra (Level Scheme) as SI for publications References: Rösch et al DOI 10.1021/acs.est.5b05186; Schymanski et al DOI: 10.1021/es5002105
  • 13. 13 Confidence Levels for Tentative Structures o Annotation is the key to communicating information o Identification confidence and minimum data requirements: Level 1: Confirmed structure by reference standard (MS, MS2, RT, Reference Std.); Level 2: Probable structure a) by library spectrum match (MS, MS2, Library MS2) b) by diagnostic evidence (MS, MS2, Exp. data); Level 3: Tentative candidate(s): structure, substituent, class (MS, MS2, Exp. data); Level 4: Unequivocal molecular formula, example C6H5N3O4 (MS isotope/adduct); Level 5: Exact mass of interest, example 192.0757 (MS) Schymanski, Jeon, Gulde, Fenner, Ruff, Singer & Hollender (2014) ES&T, 48 (4), 2097-2098. DOI: 10.1021/es5002105
  • 14. 14 Creating High-Quality Mass Spectra Stravs, Schymanski, Singer and Hollender, 2013, Journal of Mass Spectrometry, 48, 89–99. DOI: 10.1002/jms.3131 Automatic MS and MS/MS Recalibration and Clean-up Remove interfering peaks Spectral Annotation with - Experimental Details - Compound Information https://github.com/MassBank/RMassBank/ http://bioconductor.org/packages/RMassBank/
  • 15. 15 Communicating Mass Spectra for Mixtures o Chromatography and MS/MS Annotation; example structure: SPA-9C (m+n=6) o Spectra: Literature: LIT00034,35; Sample: ETS00002; Standard: ETS00016,17,19,20 o Formulas: http://sourceforge.net/projects/genform/ Meringer et al, 2011, MATCH 65, 259-290 o Data: Schymanski et al. 2014, ES&T, 48: 1811-1818. DOI: 10.1021/es4044374 o https://github.com/MassBank/RMassBank/ Stravs et al. (2013), J. Mass Spectrom, 48(1):89-99. DOI: 10.1002/jms.3131
  • 16. 16 Our (Community) Challenge: Identifying Chemicals AND connecting chemical knowledge [Figure labels: Sample; High resolution mass spectrometry; scale from 1 to 1 billion chemicals] Data: Schymanski et al 2014, DOI: 10.1021/es4044374; https://www.slideshare.net/EmmaSchymanski/small-molecules-in-big-data-analytica-munich
  • 17. 17 European (World-)Wide Exchange of Suspects Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7 NORMAN Suspect List Exchange: http://www.norman-network.com/?q=node/236
  • 18. 18 NORMAN Suspect List Exchange o http://www.norman-network.com/?q=node/236 [Figure labels: References; Full Lists] Schymanski, Aalizadeh et al. in prep; https://www.researchgate.net/project/Supporting-Mass-Spectrometry-Through-Cheminformatics
  • 19. 19 NORMAN Suspect Exchange Lists o Now 21 lists available online … from small to large! • Specialist collections (e.g. NormaNEWS) to large market lists • Integrated into the CompTox Chemistry Dashboard
  • 20. 20 NORMAN Lists => CompTox Dashboard http://www.norman-network.com/?q=node/236 https://comptox.epa.gov/dashboard/chemical_lists/normanews
  • 21. 21 Lists on CompTox Chemicals Dashboard https://comptox.epa.gov/dashboard/chemical_lists/ More lists become available with every release
  • 23. 23 CompTox Chemistry Dashboard – Presence in Lists https://comptox.epa.gov/dashboard/chemical_lists
  • 24. 24 CompTox Chemistry Dashboard – Additional Data https://comptox.epa.gov/dashboard/
  • 25. 25 Metadata & Different Chemical Forms Schymanski & Williams, 2017, ES&T, 51 (10), pp 5357–5359. DOI: 10.1021/acs.est.7b01908
  • 26. 26 Metadata & Different Chemical Forms MS-ready: McEachran et al. 2018, J Cheminform. DOI: 10.1186/s13321-018-0299-2
  • 27. 27 Metadata & Different Chemical Forms https://comptox.epa.gov/dashboard/dsstoxdb/mixture_search?cid=930
  • 28. 28 Connecting Resources in MetFrag https://msbi.ipb-halle.de/MetFragBeta/ AND https://comptox.epa.gov/dashboard/dsstoxdb/batch_search (MetFrag Export)
  • 29. 29 MetFrag2.3: Non-target Identification (Status: 2010 => 2016) o m/z [M-H]- 213.9637, ± 5 ppm; candidates from ChemSpider or PubChem o MS/MS peaks (m/z, intensity): 134.0054, 339689; 150.0001, 77271; 213.9607, 632466; tolerances: 5 ppm, 0.001 Da o RT: 4.54 min; 355 InChI/RTs o Metadata: References, External Refs, Data Sources, RSC Count, PubMed Count, Suspect Lists o Elements: C,N,S Ruttkies, Schymanski, Wolf, Hollender, Neumann (2016) J. Cheminf., DOI: 10.1186/s13321-016-0115-9
  • 30. 30 MetFrag2.3: Non-target Identification Ruttkies, Schymanski, Wolf, Hollender, Neumann (2016) J. Cheminf., 2016, DOI: 10.1186/s13321-016-0115-9 Try with the Web Interface: http://msbi.ipb-halle.de/MetFragBeta/
  • 31. 31 MetFrag2.3: Non-target Identification Ruttkies, Schymanski, Wolf, Hollender, Neumann (2016) J. Cheminf., 2016, DOI: 10.1186/s13321-016-0115-9 Try with the Web Interface: http://msbi.ipb-halle.de/MetFragBeta/
  • 32. 32 Connecting Resources in MetFrag o Combined evidence clearly highlights potential neurotoxicant among chemical candidates https://msbi.ipb-halle.de/MetFragBeta/ ; https://comptox.epa.gov/dashboard/ ; https://massbank.eu ; http://normandata.eu/
  • 33. 33 MS-ready: McEachran et al. 2018, J Cheminform. DOI: 10.1186/s13321-018-0299-2 Connecting Resources in MetFrag
  • 34. 34 Connecting and Enhancing Open Resources https://www.slideshare.net/EmmaSchymanski/small-molecules-in-big-data-analytica-munich o Sharing knowledge is a win-win situation 2014 2015: found in waters across Europe 2016: 1 datapoint cross-annotates 3072 in GNPS Hits in GNPS MassIVE datasets: Surfactants: http://goo.gl/7sY9Pf 2017: Early-Warning System is born 2018: Highlighted in Science
  • 35. 35 NORMAN Digital Sample Freezing Platform “Live” retrospective screening of known and unknown chemicals in European samples (various matrices) http://norman-data.eu/ AND Alygizakis et al, in prep.
  • 36. 36 Interactive heatmap available at http://norman-data.eu/NORMAN-REACH NORMAN Digital Sample Freezing Platform Retrospective screening of REACH chemicals in Black Sea samples (various matrices)
  • 37. 37 NORMAN Digital Sample Freezing Platform “Live” retrospective screening of known and unknown chemicals in European samples (various matrices) Future work: use results of unknowns to drive prioritization efforts http://norman-data.eu/ AND Alygizakis et al, in prep.
  • 38. 38 Real-time Monitoring of the Rhine River Hollender, Schymanski, Singer & Ferguson, 2017, ES&T Feature, 51:20, 11505-11512. DOI: 10.1021/acs.est.7b02184 Previously unknown chemicals detected due to “stand-out” patterns
  • 39. 39 Real-time Monitoring of the Rhine River Hollender, Schymanski, Singer & Ferguson, 2017, ES&T Feature, 51:20, 11505-11512. DOI: 10.1021/acs.est.7b02184 Previously unknown chemicals detected due to “stand-out” patterns
  • 40. 40 We still have many unknowns … (l) Data from Schymanski et al 2014, ES&T DOI: 10.1021/es4044374. (r) E. coli data provided by N. Zamboni, IMSB, ETH Zürich. Environment Cells
  • 41. 41 NORMAN Suspects don’t all have structures!
  • 42. 42 Accessing Metadata Behind Complex Mixtures Highest Priority PFAS are also highly complex UVCBs!
  • 43. 43 Homologous Series Detection o Search for discrete mass differences o http://www.envihomolog.eawag.ch/ [Figure: example surfactant structures with repeating units m, n] M. Loos & H Singer, 2017. J. Cheminf. DOI: 10.1186/s13321-017-0197-z & Schymanski et al. 2014, ES&T DOI: 10.1021/es4044374
  • 44. 44 Towards high throughput MS screening of UVCBs o https://github.com/schymane/RChemMass/
  • 45. 45 Cross-Linking Homologues in the Dashboard Schymanski, Grulke, Williams et al, in prep. & Williams et al. 2017 J. Cheminformatics 9:61 DOI: 10.1186/s13321-017-0247-6 https://comptox.epa.gov/dashboard/chemical_lists/eawagsurf
  • 46. 46 Homologous Series in Biological Matrices Lipid extract of Mycobacterium smegmatis C23F48O7 +CF2 Schymanski & Zamboni … random data exploration …
  • 47. 47 Exchanging Knowledge … Open Science Helps! We need to be able to find and annotate the unexpected! C23F48O7 +CF2 Schymanski & Zamboni … random data exploration …
  • 48. 48 Exchanging Knowledge … Open Science Helps! We need to be able to find and annotate the unexpected!
  • 49. 49 Exchanging data reveals things we never expected! Schymanski & Zamboni … random data exploration … o Lipid extract of Mycobacterium smegmatis C23F48O7 +CF2 DTXSID70880513
  • 50. 50 Community Challenges … and Solutions Data: Schymanski et al 2014, DOI: 10.1021/es4044374; https://www.slideshare.net/EmmaSchymanski/small-molecules-in-big-data-analytica-munich High resolution mass spectrometry AND connecting chemical knowledge
  • 51. 51 Target List Suspect List (e.g. NORMAN, LMC, Eawag-PPS, ReSOLUTION) Componentization (nontarget) TARGET ANALYSIS SUSPECT SCREENING NON-TARGET SCREENING (enviMass, vendor software) Gather evidence (nontarget, ReSOLUTION, RMassBank) Masses of interest Molecular formula determination (enviPat, GenForm) Non-target identification (MetFrag2.3, ReSOLUTION) Sampling extraction (SPE) HPLC separation HR-MS/MS Detection of blank/blind/noise/internal standards; time trend analysis (enviMass) Conversion (Proteowizard) and Peak Picking (enviPick, xcms, MZmine, …) Prioritization (enviMass) MS/MS Extraction (RMassBank) Interpretation, confirmation, peak inventory, confidence and reporting
  • 52. 52 Coming Soon … (WiP and Already Online!) Schymanski, Baker, Williams, Singh et al. in preparation. Excel macro: https://figshare.com/s/824f6606644f474c7288 https://comptox.epa.gov/dashboard/chemical_lists/litminedneuro
  • 53. 53 Conclusions / Outlook / Perspectives Monzel et al 2017 Stem Cell Reports, DOI: 10.1016/j.stemcr.2017.03.010 (Organoids) o Over 60 % of HR-MS peaks are potentially relevant but unknown o Non-target screening requires data and evidence from many different sources o Many excellent workflows now available to collate this information o Incorporation of all available metadata (expert knowledge) is critical to success! o Complex mixtures (UVCBs) are a huge and very challenging part of the puzzle o New cheminformatics approaches needed - great progress so far o Information in the public domain helps everyone! o Additional experimental methods can provide more information o H-D exchange-based labelling [EXTRA SLIDES] o Integration of computational toxicity knowledge essential o LCSB has some amazing facilities and expertise (I am just beginning to appreciate how much …)
  • 55. 55
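
Slide 29 retrieves candidates within ± 5 ppm of the measured [M-H]- m/z 213.9637. The arithmetic behind such a window can be sketched as follows. This is a minimal illustration only, not MetFrag code; the function names are hypothetical, and the proton mass constant is an assumed rounded value.

```python
from typing import Tuple

# Assumed constant: mass of a proton in Da (hydrogen atom minus one electron).
PROTON_MASS = 1.007276466


def ppm_window(mz: float, ppm: float) -> Tuple[float, float]:
    """Return the (low, high) m/z bounds for a relative tolerance in ppm."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


def neutral_mass_from_deprotonated(mz: float) -> float:
    """Neutral monoisotopic mass for a singly charged [M-H]- ion.

    The ion has lost one proton, so the proton mass is added back.
    """
    return mz + PROTON_MASS


# Values from slide 29: m/z 213.9637 ([M-H]-) with a 5 ppm tolerance.
low, high = ppm_window(213.9637, 5.0)
neutral = neutral_mass_from_deprotonated(213.9637)
print(f"search window: {low:.4f} to {high:.4f}; neutral mass: {neutral:.4f}")
```

At this mass, 5 ppm corresponds to roughly 0.001 Da, which is consistent with the two tolerances shown together on the slide.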

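Slides 43 and 46 describe homologous series detection as a search for discrete mass differences (for example +CF2). The idea can be sketched as below. This is a simplified illustration under stated assumptions, not the algorithm of Loos & Singer or of enviHomolog: the function name, the tolerance, the minimum chain length and the example masses are all hypothetical, and retention time trends are ignored.

```python
# Monoisotopic masses of two common repeating units, in Da.
CH2_MASS = 12.0 + 2 * 1.00782503223   # about 14.01565
CF2_MASS = 12.0 + 2 * 18.998403163    # about 49.99681


def find_homologue_chains(masses, unit_mass, tol=0.001, min_length=3):
    """Group masses into chains whose consecutive members differ by
    unit_mass within an absolute tolerance tol (Da).

    Chains shorter than min_length are discarded. Each mass is assigned
    to at most one reported chain.
    """
    ordered = sorted(masses)
    used = set()
    chains = []
    for i, start in enumerate(ordered):
        if i in used:
            continue
        chain, indices, current = [start], [i], start
        for j in range(i + 1, len(ordered)):
            if j in used:
                continue
            diff = ordered[j] - current
            if abs(diff - unit_mass) <= tol:
                chain.append(ordered[j])
                indices.append(j)
                current = ordered[j]
            elif diff > unit_mass + tol:
                break  # sorted input: no later mass can extend this chain
        if len(chain) >= min_length:
            chains.append(chain)
            used.update(indices)
    return chains


# Hypothetical peak list: four masses spaced by CF2 plus two unrelated peaks.
peaks = [300.0, 349.9968, 399.9936, 449.9904, 310.5, 512.1]
chains = find_homologue_chains(peaks, CF2_MASS)
print(chains)
```

Passing `CH2_MASS` instead of `CF2_MASS` would look for alkyl chain homologues in the same peak list.
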
Editor's Notes

  1. Cross-annotation & database linking!