Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Curating suspect lists for international non target screening efforts

143 views

Published on

The NORMAN Network (www.norman-network.com) is a unique network of reference laboratories, research centres and related organisations for monitoring of emerging environmental substances, through European and across the world. Key activities of the network include prioritization of emerging substances and non-target screening. A recent collaborative trial revealed that suspect screening (using specific lists of chemicals to find “known unknowns”) was a very common and efficient way to expedite non-target screening (Schymanski et al. 2015, DOI: 10.1007/s00216-015-8681-7). As a result, the NORMAN Suspect Exchange was founded (http://www.norman-network.com/?q=node/236) and members were encouraged to submit their suspect lists. To date 20 lists of highly varying substance numbers (between 52 and 30,418), quality and information content have been uploaded, including valuable information previously unavailable to the public. All preparation and curation was done within the network using open access cheminformatics toolkits. Additionally, members expressed a desire for one merged list (“SusDat”). However, as a small network with very limited resources (member contributions only), the burden of curating and merging these lists into a high quality, curated dataset went beyond the capacity and expertise of the network. In 2017 the NORMAN Suspect Exchange and US EPA CompTox Chemistry Dashboard (https://comptox.epa.gov/) pooled resources in curating and uploading these lists to the Dashboard (https://comptox.epa.gov/dashboard/chemical_lists). This talk will cover the curation and annotation of the lists with unique identifiers (known as DTXSIDs), plus the advantages and drawbacks of these for NORMAN (e.g. creating a registration/resource inter-dependence). It will cover the use of “MS-ready structure forms” with chemical substances provided in the form observed by the mass spectrometer (e.g. desalted, as separate components of mixtures) and how these efforts will support other NORMAN activities. Finally, limitations of existing cheminformatics approaches and future ideas for extending this work will be covered. Note: This abstract does not reflect US EPA policy.

Published in: Science
  • Be the first to comment

  • Be the first to like this

Curating suspect lists for international non target screening efforts

  1. 1. 1 Curating “Suspect Lists” forCurating “Suspect Lists” for International Non-target Screening EffortsInternational Non-target Screening Efforts Emma Schymanski Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg. Email: emma.schymanski@uni.lu Reza Aalizadeh, Nikolaos S. Thomaidis (University of Athens, Greece) Juliane Hollender (Eawag, Dübendorf, Switzerland), Nikiforos Aligizakis, Jaroslav Slobodnik (Environmental Institute, Kos, Slovak Republic) Antony J. Williams (NCCT, US EPA, Research Triangle Park, NC, USA) Image © www.seanoakley.com/ The views expressed in this presentation are those of the authors and do not necessarily reflect the views or policies of the U.S. Environmental Protection Agency.
  2. 2. 2 2015: European Non-target Screening Trial Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7 Croatian Water RWS
  3. 3. 3 2015: European Non-target Screening Trial Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7 o Background: o Need to compare and harmonize non-target screening methods in Europe o Sampling in conjunction with Joint Danube Survey 3 (Aug/Sept 2013)
  4. 4. 4 2015: European Non-target Screening Trial Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7 o Background: o Need to compare and harmonize non-target screening methods in Europe o Sampling in conjunction with Joint Danube Survey 3 (Aug/Sept 2013) o Objectives: o Analyse sample using established MS techniques and reporting how many: o Substances are present in the sample and; o Were provisionally identified by suspect and non-target screening Peak picking Non-target HR-MS(/MS) Acquisition Target Screening Suspect Screening Non-target Screening Target list Suspect list Peak picking or XICs
  5. 5. 5 2015: European Non-target Screening Trial Schymanski et al, 2014, ES&T. DOI: 10.1021/es5002105 & Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7 Peak picking Non-target HR-MS(/MS) Acquisition Target Screening Suspect Screening Non-target Screening Start Level 1 Confirmed Structure by reference standard Level 2 Probable Structure by library/diagnostic evidence Start Level 3 Tentative Candidate(s) suspect, substructure, class Level 4 Unequivocal Molecular Formula insufficient structural evidence Start Level 5 Mass of Interest multiple detection, trends, … “downgrading” with contradictory evidence Increasing identification confidence Target list Suspect list Peak picking or XICs
  6. 6. 6 Overall Results: Non-target Screening Trial Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7 TARGET ANALYSIS SUSPECT SCREENING NON-TARGET SCREENING Quantified (n=11) Semi/Non- Quantified (n=12) Suspects (n=14) Isomer mixes (n=5) Identified Non-targets (n=6) Unidentified Non-targets (n=13) Formula (n=7) Level 1 Levels 2-3 Level 4 Level 5 #Substances
  7. 7. 7 Data Sources (for all 19 participants) Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7 Database/Library Name Total Compounds Compounds with Spectra ChemSpider [35] 32 million DAIOS [49,50] 1,404 >1,000 a PubChem [48] 63,105,228 STOFF-IDENT [38] 8,000 b MassBank [51,52] 5,000 mzCloud [53] 1,956 NIST MS 2011 [11,54] 212,961 c NIST MS/MS 2011 [11,54] 4,628 Wiley Registry of Mass Spectral Data 7 th Edition [12] 289,000 ABSciex Meta Library 2,381 Agilent Broecker, Herre & Pragst toxic/forensics 7,509 c ~2,500 Agilent Pesticide Library 1,664 ~700 c Agilent Synthetic Substance Library 23,053 n/a Agilent METLIN database 64,092 8,040 Bruker Pesticide Screener 700 d Thermo Environmental Food Safety (EFS) with RT 454 dp ; 447 p ; 90 dn ; 278 n Thermo toxicology 618 p ; 36 n Waters database with RT 730 de In-house Libraries without spectra (two participants) 2,000; 1,600 In-house Libraries with spectra (two participants) 526 d ; 63 d In-house Libraries with spectra for some substances 2,200 d 835 ad 7,815 1500 ap ; 500 an 3,000 350 d Surfactant List [3] 394 Feedback Too many different data resources Cannot use them all; cannot compare easily => cross-institute reproducibility? At the time of the trial
  8. 8. 8 Reported Identification Depended on Source List Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7 Terbutylazine Detects: 12; # Refs: 220 Sebutylazine Detects: 3; # Refs: 51 Propazine Detects: 3; # Refs: 201 C9H16ClN5 m/z229.1094Da N NN Cl NH CH3 NH CH3 CH3 CH3 (no related compound at this mass)N NN Cl NH CH3NH CH3 CH3 N NN Cl NH CH3NH CH3 CH3 CH3 Simazine Detects: 4; # Refs: 518 Terbutylazine-desethyl Detects: 9; # Refs: 92 Sebutylazine-desethyl Detects: 1; # Refs: 14 C7H12ClN5 m/z201.0781Da N NN Cl NH2NH CH3 CH3 CH3 N NN Cl NH2NH CH3 CH3 N NN Cl NHNH CH3CH3 (no related compound at this mass) Terbutylazine-desethyl- 2-hydroxy Detects: 2; # Refs: 57 Sebutylazine-desethyl- 2-hydroxy Detects: 0; # Refs: 3 Simazine-2-hydroxy Detects: 2; # Refs: 66 C7H13N5O m/z183.1120Da N NN NH2NH OH CH3 CH3 CH3 N NN OH NH2NH CH3 CH3 N NN OH NHNH CH3CH3 (no related compound at this mass)
  9. 9. 9 Suspect Screening Allows Efficient Data Exploration
  10. 10. 10 Enter: NORMAN Suspect Exchange Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7 o …part of the NORMAN Databases Collection
  11. 11. 11 Enter: NORMAN Suspect Exchange o http://www.norman-network.com/?q=node/236 Schymanski, Aalizadeh et al. in prep. 2.3 2.3 ReferencesFull Lists InChIKeys
  12. 12. 12 o Now 21 lists available online … from small to large! Specialised Lists through to Market Lists
  13. 13. 13 …but not all are what they seem…
  14. 14. 14 Chemicals “in disguise” Schymanski & Williams, 2017, ES&T, 51 (10), pp 5357–5359. DOI: 10.1021/acs.est.7b01908
  15. 15. 15 Curating and Merging Chemical Lists … Schymanski, Aalizadeh et al. in prep., Fourches et al 2010, 2016 Twenty lists => one big “NORMAN SusDat” 1. Fill missing information for all entries 2. Standardize, generate 2D structure 3. Remove salts, solvent etc. 4. Fix valences, add Hs 5. Optimize 3D structure 6. Store original data 7. Create StdInChIs 8. Remove duplicates 9. Create MS- ready SMILES 10. Validate MS- ready list 70,829 -28,873 40,053 Manually fix problem entries Nostructure Add to sheet “Missing info” 1,903 entries -1,903 Add to sheets “Removed” And “Duplicate_pairs” 28,873 entries each
  16. 16. 16 Validation Levels for SusDat Schymanski, Aalizadeh et al. in prep., o Using open public sources available to us
  17. 17. 17 NORMAN-SusDat – the “merged” data table Schymanski, Aalizadeh et al. in prep., Name, Identifiers, Validation level, Source, MSMS, RTI, Toxicity, logKow
  18. 18. 18 NORMAN-SusDat – the “merged” data table Schymanski, Aalizadeh et al. in prep., SCREEN SMART – OR BIG – OR BOTH? All suspect lists available in one table: o http://www.norman-network.com/datatable/ o Quick search options on every field, e.g. name, mass, … MERGING SEVERAL LISTS IS NOT TRIVIAL! WORK IN PROGRESS!!!
  19. 19. 19 NORMAN Lists => CompTox Dashboard https://comptox.epa.gov/dashboard/chemical_lists/sfishfluoro ~2,600 PFAS
  20. 20. 20 https://comptox.epa.gov/dashboard/chemical_lists/sfishfluoro ~2,600 PFAS NORMAN Lists => CompTox Dashboard
  21. 21. 21 NORMAN Lists => CompTox Dashboard https://comptox.epa.gov/dashboard/chemical_lists/ More lists become available with every release
  22. 22. 22 Curation never stops … (many) more registrations… Cleaning up lists to remove errors Undefined mixtures (UVCBs)
  23. 23. 23 How to Store and Access UVCBs? Schymanski, Grulke, Williams et al, in prep. & Williams et al. 2017 J. Cheminformatics 9:61 DOI: 10.1186/s13321-017-0247-6 https://comptox.epa.gov/dashboard/dsstoxdb/results?search=DTXSID3020041 https://comptox.epa.gov/dashboard/chemical_lists/eawagsurf
  24. 24. 24 European (World-)Wide Exchange of Suspects Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7 NORMAN Suspect List Exchange: http://www.norman-network.com/?q=node/236
  25. 25. 25 NORMAN Digital Sample Freezing Platform “Live” retrospective screening of known and unknown chemicals in European samples (various matrices) Aligizakis et al, in prep.
  26. 26. 26 NORMAN Resources: http://www.norman-network.com/datatable/ http://www.norman-network.com/?q=node/236 CompTox Chemistry Dashboard: https://comptox.epa.gov/ Contact: emma.schymanski@uni.lu Acknowledgements .eu Jaroslav Slobodnik, Natalia Glowacka, Lubos Cirka, Ildiko Ipolyi, Nikiforos Alygizakis & more at EI Questions? EU Grant 603437 “Suspect Exchange” task partners: Stellan Fischer, Reza Aalizadeh Nikos Thomaidis
  27. 27. 27
  28. 28. 28 Target, Suspect and Non-Target Screening KNOWNS SUSPECTS No Prior Knowledge HPLC separation and HR-MS/MS TARGET ANALYSIS SUSPECT SCREENING NON-TARGET SCREENING Targets found Suspects found Masses of interest (Molecular formula) DATABASE SEARCH STRUCTURE GENERATION Confirmation and quantification of compounds present Candidate selection (retention time, MS/MS, calculated properties) Sampling extraction (SPE) HPLC separation HR-MS/MS Time, Effort & Number of Compounds…. SUSPECTS SPECTRUM SEARCH Spectral match

×