***********************************************************************************Note: The final approved version of thi...
CAS Registry [1] by elemental compositions or           m/z value directly for a charged species andmolecular weights then...
MassLynx V4.1 software which included i-FIT             web-based version of SciFinder. The CASsoftware for numerically ra...
composition is the preferred approach, but             reasonable approach when a unique elementalsearching by a monoisoto...
Example from Our Laboratory for UV Stabilizer           found which were sorted in descending orderIdentification in a Com...
8,000,000       No. of Compounds   7,000,000                          6,000,000                          5,000,000        ...
Table 3: Comparison of Results Searching Compounds MW>600 Da by Elemental Composition andMonoisotopic Mass                ...
Scheme 2. In-source CID fragmentation for Goodrite 3114At a later date, the LC retention time, UV-Vis          Some change...
fragments expected by the user from the                    elemental composition and monoisotopic masscandidates’ structur...
manufacturers’ isotope abundance programs to             2. Little, J.L.; Williams, A. J.; Tkachenko, V.;that of the unkno...
with UHR-TOF. Bruker Application Note # ET-23           systematic bond disconnection approach. Rapid(2011).              ...
Electronic Supplementary Material for                              pp        y                    DOI: 10.1007/s13361-011-...
Initial Window Selected at www.chemspider.com on Internet                  Either Select “More Searches”                  ...
Second Window Selected in ChemSpider Interface on Internet                          Select “Advanced”                     ...
UV Stabilizer Example: Three Steps for Searching by Elemental Composition                                          1) Sele...
UV Stabilizer Example: Sorting “Initial” Elemental Composition Results by                    Selecting Desired Column to S...
UV Stabilizer Example: Five Steps for Searching by Monoisotopic Mass                                         1) Enter obse...
UV Stabilizer Example: Sorting “Initial” Monoisotopic Mass Results                            by Desired ColumnSort descen...
Antioxidant Example: Identification by Monoisotopic MW                        1) Enter observed m/z value                 ...
Antioxidant Example: Sorting “Initial” Monoisotopic Mass Results                    by “# of References Column”Sort descen...
CAS and ChemSpider Numbers for Compounds Shown in Table 3 in Text                                Elemental        Monoisot...
Increase in Absolute Intensity of Sodium Adduct at Higher                  In Source                  In-Source CID Voltag...
Upcoming SlideShare
Loading in …5
×

Identification of “Known Unknowns” Utilizing Accurate Mass Data and ChemSpider

1,045 views

Published on

In many cases, an unknown to an investigator is actually known in the chemical literature, a reference database, or an internet resource. We refer to these types of compounds as “known unknowns.” ChemSpider is a very valuable internet database of known compounds useful in the identification of these types of compounds in commercial, environmental, forensic, and natural product samples. The database contains over 26 million entries from hundreds of data sources and is provided as a free resource to the community. Accurate mass mass spectrometry data is used to query the database by either elemental composition or a monoisotopic mass. Searching by elemental composition is the preferred approach. However, it is often difficult to determine a unique elemental composition for compounds with molecular weights greater than 600 Da. In these cases, searching by the monoisotopic mass is advantageous. In either case, the search results are refined by sorting the number of associated references associated with each compound in descending order. This raises the most useful candidates to the top of the list for further evaluation. These approaches were shown to be successful in identifying “known unknowns” noted in our laboratory and for compounds of interest to others.

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
1,045
On SlideShare
0
From Embeds
0
Number of Embeds
3
Actions
Shares
0
Downloads
11
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Identification of “Known Unknowns” Utilizing Accurate Mass Data and ChemSpider

  1. 1. ***********************************************************************************Note: The final approved version of this paper can be found at www.springerlink.com,http://dx.doi.org/10.1007/s13361-011-0265-y. This is a prepress version of the article.***********************************************************************************Identification of “Known Unknowns” Utilizing Accurate Mass Data and ChemSpiderJames L. Little,1 Antony J. Williams,2 Alexey Pshenichnov,2 Valery Tkachenko21Eastman Chemical Company, Kingsport, TN 37662, USA2ChemSpider, Royal Society of Chemistry, Cambridge, UKCorrespondence to: J. L. Little; e-mail: jameslittle@eastman.com, Antony J. Williams; e-mail:williamsa@rsc.org Abstract In many cases, an unknown to an investigator is actually known in the chemical literature, a reference database, or an internet resource. We refer to these types of compounds as “known unknowns.” ChemSpider is a very valuable internet database of known compounds useful in the identification of these types of compounds in commercial, environmental, forensic, and natural product samples. The database contains over 26 million entries from hundreds of data sources and is provided as a free resource to the community. Accurate mass mass spectrometry data is used to query the database by either elemental composition or a monoisotopic mass. Searching by elemental composition is the preferred approach. However, it is often difficult to determine a unique elemental composition for compounds with molecular weights greater than 600 Da. In these cases, searching by the monoisotopic mass is advantageous. In either case, the search results are refined by sorting the number of references associated with each compound in descending order. This raises the most useful candidates to the top of the list for further evaluation. These approaches were shown to be successful in identifying “known unknowns” noted in our laboratory and for compounds of interest to others.W e have previously demonstrated [1] that searching the Chemical Abstracts Service (CAS) Registryemploying accurate mass mass spectrometrydata is a very useful approach for the There is a particular need for additional approaches for the identification of “known unknowns” found in liquid chromatography/mass spectrometry (LC/MS) analyses because the availability of computeridentification of “known unknowns.” We define searchable collision-induced dissociation (CID)a “known unknown” as a compound which is mass spectral databases is limited for LC/MS [1]unknown to the investigator but is known in the as compared to that of electron ionization (EI)chemical literature, a reference database, or an mass spectral databases. We have found thatinternet resource. searching “spectraless” databases such as the 1
  2. 2. CAS Registry [1] by elemental compositions or m/z value directly for a charged species andmolecular weights then sorting the hit list in then select the type of ionic species from a pull-descending order by the number of associated down (drop-down) menu. Examples of typicalreferences to be very effective in the types of ionic species in the menu include [M +identification of “known unknowns.” The most Na]+, [M + NH4]+, [M - H]-, etc. The m/z valuelikely candidates are normally brought to the entered is then automatically adjusted by thetop of the list where they can be further program before searching the monoisotopicscrutinized by employing additional data. mass of the neutral species as it would appear in the ChemSpider database.ChemSpider is another very large “spectraless”database that can be searched by elemental The second change was the ability for thecomposition, molecular weight, or entered m/z value of the charged species to bemonoisotopic mass and it is provided as a free corrected for the mass of an electron. Someresource to the community. ChemSpider manufacturers’ data systems do not properlycontains >26 million entries as compared to the calculate the m/z values of ions for the mass ofCAS Registry that contains >62 million an electron [3, 4]. The errors normally cancelsubstances. In our current work, the within the manufacturers’ elementalChemSpider interface was modified [2] such composition programs because the referencethat the initial search results can be sorted by calibration tables are also not corrected.the number of references associated with an However, all data exported into otherentry. The approach was then evaluated with a applications for further data processing shouldwide variety of compounds and compared to be corrected.previous results obtained with the CAS Registry[1]. Acquisition of Accurate Mass DataExperimental The experimental detail including calibrations, typical chromatographic separations, andChemSpider Software Modifications sample preparations were previously described in detail [1]. Briefly, the accurate massSeveral changes were made in the ChemSpider electrospray LC/MS/UV-Vis (ultraviolet-visible)software interface to facilitate our studies [2]. data were obtained on a LCT time-of-flight massThe most important change was the ability to spectrometer (Waters Corporation, Milford,sort the initial search results in descending MA) equipped with a LockSpray secondary ESIorder by the number of data sources or probe. An 1100 Series liquid chromatographassociated references. The references in with autosampler, degasser, and UV-Vis diodeChemSpider originate from the SureChem array spectrophotometer (Agilent Technologies,patent database (>20 million), PubMed (>20 Santa Clara, CA) was employed for themillion articles), and the content of the Royal separations in the reversed phase mode. TheSociety of Chemistry publishing database. In UV-Vis spectral data is very useful in locatingour work, sorting the “# of References” column particular classes of compounds (UV absorbers,was found to be the most useful. Many screen dyes, etc.) and in confirming the identity ofdisplays of the software annotated with compounds by comparison to reference spectraexamples from our laboratory are shown in the from standards, literature references, or evenElectronic Supplementary Material. internet sources.Two other changes allowed more convenient Elemental compositions were generated fromdata entry for searching by the user. The first monoisotopic masses utilizing the Waterschange was the ability for the user to enter the Elemental Composition Program within 2
  3. 3. MassLynx V4.1 software which included i-FIT web-based version of SciFinder. The CASsoftware for numerically ranking the observed Registry can only be searched by elementalisotopic pattern to the theoretical ones. Our composition, molecular weight, or nominalLCT accurate mass measurements typically molecular weight, but not monoisotopic mass.yielded a standard deviation of 5 ppm. Thus, The ability to search by the monoisotopic masswindows of +/-18 ppm (slightly greater than 3 is much more useful because the standardstandard deviations) were employed in deviation for its determination is much lowerexamples from our laboratory to insure than that for the molecular weight. In addition,inclusions of all reasonable candidates for the monoisotopic mass is calculated by alldetermining elemental compositions or for manufacturers’ data systems whereas thesearching ChemSpider by monoisotopic masses. molecular weight must be manually calculatedIn literature studies, windows of approximately by the user.+/-5 ppm were employed because manycurrently available accurate mass time-of-flight One significant advantage of searching the CASinstruments yield standard deviations Registry versus the ChemSpider database is theapproaching 1 ppm for mass measurements. ability of the former to search its > 34 million document records associated with an elementalResults and Discussion composition by key words [1]. Only very minimal sample history is required to quicklyDescriptions of Two Approaches obtain useful candidates for tentative identifications by this approach. This capabilityThe ability to search by either elemental can be especially useful in identifying morecomposition or monoisotopic mass is very obscure “known unknowns” with fewervaluable for the identification of “known associated references. This capability is notunknowns” using accurate mass mass currently available with ChemSpider.spectrometry data. In the work describedwithin this article, the ChemSpider user Evaluation of the Two Approaches withinterface was modified to permit the sorting of Literature Exampleseither the elemental composition ormonoisotopic mass search results by the A group of 90 compounds was assembled fromnumber of associated references. The results, literature sources [5-8], internet sites, andsorted in descending order, normally bring the American Society of Mass Spectrometrymost useful entries to the top of the list. These Conference presentations to evaluate the twocandidate structures are then scrutinized [1] by approaches in ChemSpider. The results areother available data such as in-source CID summarized in Tables 1 and 2. Searching thespectra, UV-Vis spectra, GC/MS data, number of ChemSpider database by elementalexchangeable protons, NMR data, etc. to compositions then sorting in descending orderultimately obtain the identification of the by the number of overall references yieldedcompound of interest. For critical more target compounds highly ranked asidentifications, a standard of the material is compared to the same approach usingnormally obtained and its LC retention time, monoisotopic masses. This is to be expectedUV-Vis spectrum, and in-source CID spectra are because the number of overall candidates wascompared to those of the unknown. less when searching by elemental composition (mean = 513, median = 310) compared toA very similar approach [1] was demonstrated monoisotopic mass (mean = 800, median =to be very useful utilizing the CAS Registry of 740). However, the overall number with>62 million substances, a fee-based service, rankings less than or equal to 5 was acceptablewhich is searched by either STN Express or the in both cases. Thus, searching by elemental 3
  4. 4. composition is the preferred approach, but reasonable approach when a unique elementalsearching by a monoisotopic mass is also a composition cannot be readily determined.Table 1. Searching ChemSpider by elemental composition then sorting by number of associatedreferences. Number Position of Compound Sorted in Descending Class of Compounds Compounds Order by Number of References in Class #1 #2 #3 #4 #5 >#5Drugs 45 43 1 1Pesticides 8 7 1Toxins 2 2Polymer antioxidants 15 15Polymer UV Stabilizers 10 8 1 1Polymer Clarifying agent (Irgaclear DM) 1 1(14)Polyurethane additives 4 2 1 1Natural products 3 2 1Herbicide (clofibric acid) 1 1Artificial sweetener (Sucralose) 1 1 Total Compounds ChemSpider 90 81 4 3 1 1 Total Compounds CAS Registry [1] 90 84 4 1 1Table 2. Searching ChemSpider by monoisotopic mass with +/- 5 ppm window. Number Position of Compound Sorted in Descending Class of Compounds Compounds Order by Number of References in Class #1 #2 #3 #4 #5 >#5Drugs 45 43 1 1Pesticides 8 7 1Toxins 2 2Polymer antioxidants 15 13 1 1Polymer UV Stabilizers 10 6 1 1 1 1(8)Polymer Clarifying agent (Irgaclear DM) 1 1Polyurethane additives 4 2 1 1Natural products 3 2 1Herbicide (clofibric acid) 1 1Artificial sweetener (Sucralose) 1 1 Total Compounds ChemSpider 90 77 4 4 2 3The same 90 compounds were previously comparison can be made in Table 2 for theevaluated with the CAS Registry searching by results obtained with ChemSpider.elemental compositions [1] using the web-based version of SciFinder. Similar results (seebottom of Table 1) were obtained using eitherChemSpider or the CAS registry as databases forthese limited number of test compounds. TheCAS Registry currently cannot be searched bymonoisotopic mass [1], therefore, no direct 4
  5. 5. Example from Our Laboratory for UV Stabilizer found which were sorted in descending orderIdentification in a Commercial Polymer by the number of overall associated references. The top candidate was Tinuvin 328. TheAn unknown additive, noted in a commercial proposed structure was consistent with thepolymer sample, was characterized by accurate accurate mass in-source CID spectrum (Schememass LC/MS. The accurate mass data was used 1) which showed two very significant losses ofin conjunction with isotopic abundance C5H10 from the protonated molecule, [M + H]+,information to obtain an elemental composition and the presence of a C5H11+ ion at a m/z valueof C22H29N3O. ChemSpider was searched by the of 71.085.elemental composition and 1135 hits wereScheme 1. In-source CID fragmentation for Tinuvin 328The confidence afforded by the initial dataallowed the identity to be reported to the Advantage of Searching Higher MW Compoundscustomer. At a later date, the identification of by Monoisotopic Mass Datathe additive was confirmed by comparison of itsLC retention time, UV-Vis spectrum, and in- It is often difficult to determine a uniquesource CID mass spectra to those of a elemental composition for “known unknowns”purchased reference sample. with molecular weights greater than 600 Da [9, 10]. Either there are two or more possibleChemSpider was also searched for protonated elemental compositions for the unknown ormolecules with m/z 352.239 using a m/z one is inadvertently excluded if the somewhatwindow of +/-18 ppm. There were 1459 hits subjective user settings are set too narrow forand Tinuvin 328 was still the top candidate. The elements present, range of elements, doublemass precision of our older instrumentation is bond equivalents, etc. In theory, the number ofrelatively large, but newer time-of-flight elemental compositions increases dramaticallyinstrumentation afford much better mass as the molecular weight increases. However, inprecision, which would return a smaller number practice, the number of elemental compositionsof candidates from the search. Screen displays at molecular weights greater than 600 Dafrom the ChemSpider interface for the decreases dramatically in both ChemSpider (seeidentification of Tinuvin 328 by both Figure 1) and CAS Registry databases [1].approaches are shown in the ElectronicSupplementary Material. 5
  6. 6. 8,000,000 No. of Compounds 7,000,000 6,000,000 5,000,000 4,000,000 3,000,000 2,000,000 1,000,000 0 Molecular Weight RangesFigure 1. Number of ChemSpider entries versus molecular weight rangesTherefore, it is much more reasonable to search of examples. Of course, ex post facto, it is stillfor “known unknowns” in these databases using extremely important to compare the isotopicmonoisotopic mass instead of struggling to abundances of the candidate structures fromdetermine a unique elemental composition for ChemSpider to those of the unknown. Thesearching. The rankings noted by number of ability to calculate, rank, and comparereferences for several higher molecular weight theoretical isotopic abundances for elementalcompounds noted in the literature [9,10] are compositions to those of observed ones iscompared to elemental composition searches normally a standard option in most massversus monoisotopic mass searches in Table 3. spectrometry manufacturers’ software. TheThere is essentially no significant penalty noted CAS and ChemSpider identification numbers forfor searching by monoisotopic mass instead of compounds in Table 3 are included in theelemental composition for this limited number Electronic Supplementary Material. 6
  7. 7. Table 3: Comparison of Results Searching Compounds MW>600 Da by Elemental Composition andMonoisotopic Mass Rank Rank Monoisotopic Elemental Monoisotopic Compound Elemental Mass using +/-5 ppm Composition Mass Composition window Moxidectin C37H53NO8 639.3771 1 of 5 1 of 39 Erythromycin C37H67NO13 733.4612 1 of 42 1 of 53 Digoxin C41H64O14 780.4296 1 of 47 1 of 65 Rifampicin C43H58N4O12 822.4051 1 of 29 1 of 96 Rapamycin C51H79NO13 913.5551 1 of 43 1 of 51 Amphotericin B C47H73NO17 923.4878 1 of 33 1 of 42 Gramicidin S C60H92N12O10 1140.7059 1 of 5 1 of 13 Cereulide C57H96N6O18 1152.6781 1 of 3 2 of 8 Cyclosporin A C62H111N11O12 1201.8414 1 of 36 1 of 38 Vancomycin C66H75Cl2N9O24 1447.4302 1 of 24 1 of 26 Perfluorotriazine C30H18N3O6P3F48 1520.9642 1 of 1 1 of 1 Thiostrepton C72H85N19O18S5 1663.4924 1 of 5 1 of 5Example from Our Laboratory for the laboratory for in-source CID [1] and a typicalIdentification of a Higher Molecular Weight example is shown in the ElectronicAntioxidant in a Commercial Polymer Supplementary Material.An unknown additive, noted in a commercial ChemSpider was searched by the monoisotopicpolymer sample, was characterized by accurate mass for the ammonium adduct with a massmass LC/MS. The ammonium adduct, [M + window of 18 ppm (see ElectronicNH4]+, of the component was observed at a m/z Supplementary Material) and 23 candidatesvalue of 801.558. The observed ion was were obtained. The top candidate wasconfirmed to be an ammonium adduct because Goodrite 3114. Only two of the 23 candidatesat higher in-source CID energies the ammonium had isotopic abundances consistent with that ofadduct intensity was reduced and the intensity the unknown. Of these two, only the in-sourceof the sodium adduct was noted to increase. CID mass spectrum (Scheme 2) of GoodriteThis increase in absolute intensity of the sodium 3114 was consistent with that for the unknown.adduct and corresponding decrease in that ofthe ammonium adduct is routinely noted in our 7
  8. 8. Scheme 2. In-source CID fragmentation for Goodrite 3114At a later date, the LC retention time, UV-Vis Some changes in both the current API and thespectrum, and in-source CID mass spectra of a vendors’ programs would be required to utilizecommercial sample were shown to be identical associated references and comparison ofto those of the unknown. isotopic abundances to facilitate increases in data processing speeds. When searching byFuture Enhancements to Improve Productivity monoisotopic mass, the isotopic abundances of the candidates should then be calculated in theChemSpider currently supports a web vendor’s data system and compared to theapplication program interface (API) via web observed isotopic abundance of the unknownservices and sorted in descending order by either the(http://www.chemspider.com/MassSpecAPI.as number of references or the fit of the isotopicmx) for batch queries by external vendors’ abundances to those of the unknown. Thisprograms. Several companies, including Bruker would be particularly useful for “knownDaltonics, Thermo Scientific, Waters, and unknowns” with molecular weights greater thanAgilent Technologies, integrate their data 600 Da, but would be also beneficial for lowerprocessing programs with ChemSpider. Users molecular weight compounds. Alternatively,who integrate using the web services, including the ChemSpider software could be enhanced tothese vendors, use a token supplied by perform the calculations of the isotopicChemSpider to authenticate their identity to the abundances of the candidates for comparisonweb service. Currently there are no similar to the unknowns.capabilities for such queries of the CAS Registryusing either SciFinder or STN Express. Even with the above changes, excessive time would still be needed to manually compare the observed CID spectrum of an unknown to 8
  9. 9. fragments expected by the user from the elemental composition and monoisotopic masscandidates’ structures. If in silico CID spectra, are listed as C12H19NO2 and 209.1216,i.e. theoretical fragmentation with predicted respectively, in ChemSpider. It would be moreabundances, could be calculated [11,12] and beneficial to parse this type of organic cation asranked for the candidate structures from either C10H16N.C2H3O2 then list the elementalChemSpider or SciFinder SDF (Structure Data composition and monoisotopic mass forFormat) files [13], further data processing searching, respectively, as C10H16N andspeeds would be realized. The results of the 150.1277.numeric structure ranking, isotopic abundance,and number of references could then be more Amphoteric (inner salt, zwitterionic) species arequickly examined by the user to yield tentative listed in the database with no modifications.identifications. For example, (CH3)3N+CH2CO2-, trimethylglycine, is listed as C5H11NO2 with monoisotopic mass ofBoth ChemSpider and the web-based SciFinder 117.079. This type of compound would yield [Mare able to export structures for the candidate + H]+ and [M + acetate]- ions, respectively, incompounds in SDF. The web-based version positive and negative ion electrospray analysesSciFinder permits the export of up to 500 using acetate in the LC eluent. Thus thestructures in an SDF file. There is no limit to the elemental composition for the species wouldnumber of structures that can be exported in need to be corrected before searching.the ChemSpider API and a limit of 10,000 in theweb interface. ConclusionsCharged Species in ChemSpider Modifications in the ChemSpider interface to sort elemental composition and monoisotopicMore development work needs to be mass search results by the number ofperformed to facilitate the searching of charged associated references in descending orderspecies by either elemental composition or offered significantly improved capabilities formonoisotopic mass. Sodium benzoate the identification of “known unknowns” usingillustrates the current state of affairs for organic accurate mass mass spectrometry data. Otheranions. Its elemental composition and changes were made to allow easier input ofmonoisotopic mass are listed in ChemSpider as data by users to specify the type of ion adductC7H5NaO2 and 144.018724, respectively. It and to correct the monoisotopic mass for thewould be more useful to parse the elemental mass of an electron. Further enhancements arecomposition as C7H6O2.Na (neutral benzoic acid still needed to improve the overall productivityto left of period, associated cation to the right) of the process and to enable the searching ofthen list the elemental composition and charged species.monoisotopic mass for searching as C7H6O2 and122.037, respectively. This is the format The elemental composition search is theemployed in the CAS Registry [1]. In reverse preferred one in the lower molecular weightphase chromatography electrospray mass range (200-600 Da), but even the monoisotopicspectrometry, the organic anions elute as their mass search with reasonable error windows isfree acid form and are normally detected as [M viable in this range. Monoisotopic mass- H]- ions in the negative ion mode. Thus, their searching for compounds with a molecularidentity is independent of their associated weight >600 Da is preferred when it is difficultcation. to determine a unique elemental composition. The resulting candidates from the monoisotopicA simple example of an organic cation is N,N,N- mass search are then ranked by comparing theirtrimethyl-N-benzylammonium acetate whose calculated isotopic abundances by the 9
  10. 10. manufacturers’ isotope abundance programs to 2. Little, J.L.; Williams, A. J.; Tkachenko, V.;that of the unknown. Pshenichnov, A: Accurate mass measurements: identifying “known unknowns” usingThe ability to search monoisotopic mass in ChemSpider. Proceedings of the ASMS, Denver,ChemSpider is an important function which is CO, 2010.absent in search engines for the CAS Registry. 3. Mamer, O. A.; Lesimple, A.: CommonHowever, CAS Registry searches can be further shortcomings in computer programs calculatingrefined by key words which can be particularly molecular weights. J.Am.Soc.Mass Spectrom.useful for more obscure “known unknowns” 14, 626 (2004).with few associated references. This option is 4. Ferrer, I; Thurman, E. M.: Importance of thenot currently available in ChemSpider. The electron mass in the calculations of exact massability to search by both databases is very by time-of-flight mass spectrometry. Rapiddesirable depending on the problem at hand Commun. Mass Spectrom. 21, 2538-2539when costs are not a limitation. ChemSpider is (2007).provided at no cost to the community, but 5. Hogenboom, A. C.; van Leerdam, J. A.; dethere is a fee associated with the utilization of Voogt, P.: Accurate mass screening andthe CAS Registry. identification of emerging contaminants in environmental samples by liquidIn all cases, other data such as sample history, chromatography-hybrid linear ion trap orbitrapUV-Vis data, types of ions observed, mass spectrometry. J. Chromatogr. A, 1216,exchangeable protons, CID spectra, etc. are 510-519 (2009).needed to scrutinize the candidate structures 6. Liao, W.; Draper, W. M.; Perera, S. K.:from the elemental composition and Identification of unknowns in atmosphericmonoisotopic mass search results. For ultimate pressure ionization mass spectrometry using aconfirmation of structure, analysis of a standard mass to structure search engine. Anal. Chem.material under identical conditions is always 80, 7765-7777 (2008).desirable. 7. Garcia-Reyes, J. F.; Ferrer, I.; Thurman, E. M.; Molina-Diaz, A.; Fernandez-Alba, A. R.:Acknowledgements Searching for non-target chlorinated pesticides in food by liquid chromatography/time-of-flightThe authors gratefully recognize Curt Cleven mass spectrometry. Rapid Commun. Massfrom Eastman Chemical Company, Mike Scott Spectrom. 19, 2780-2788 (2005).from Agilent Technologies, Inc., and Jim 8. Ojanperä, S.; Pelander, A.; Peizing, M.; Krebs,Lekander from Waters Corporation for their I.; Vuori, E.; Ojanperä, I.: Isotopic pattern andhelp and advice. We also thank Bill Tindall and accurate mass determination in urine drugKent Morrill (retirees from Eastman Chemical screening by liquid chromatography/time-of-Company) for their initial work on “spectraless” flight Mass Spectrometry. Rapid Commun. Massdatabases created from the TSCA (Toxic Spectrom. 20, 1161-1167 (2006).Substances Control Act) and the Eastman 9. Erve, J.C.; Gu, M.; Wang, Y.; DeMaio, W.;Corporate Plant Material databases. Talaat, R.E.: Spectral accuracy of molecular ions in an LTQ/Orbitrap mass spectrometer andReferences implications for elemental composition determination. J. Am. Soc. Mass Spectrom. 20,1. Little, J. L.; Cleven, C. D.; Brown, S. D.: 2058-2069 (2009).Identification of “known unknowns” utilizing 10. Lohmann, W.; Decker, P; Barsch, A:accurate mass data and Chemical Abstracts Accurate elemental composition determinationService databases. J. Am. Soc. Mass Spectrom. and identification of molecules with > 1100 m/z22, 348-359 (2011). 10
  11. 11. with UHR-TOF. Bruker Application Note # ET-23 systematic bond disconnection approach. Rapid(2011). Comm. in Mass Spectrom., 19, 3111-311811. Sweeney, D. L.: Small molecules as (2005).mathematical partitions. Anal. Chem. 75, 5362- 13. Dalby, A.; Nourse, J. G.; Hounshel, W. D.;5373 (2003). Gushurst, A.K.I.; Grier, D. L.; Leland, B. A.;12. Hill, A. W.; Mortishire-Smith, J.: Automated Laufer, J.: Description of several chemicalassignment of high-resolution collisionally structure file formats used by computeractivated dissociation mass spectra using a programs developed at Molecular Design Limited. J. Chem. Inf. Comput. Sci., 32, 244-255 (1992). 11
  12. 12. Electronic Supplementary Material for pp y DOI: 10.1007/s13361-011-0265-y Identification of “Known Unknowns” Utilizing Accurate Mass Data and ChemSpider Journal of the American Society for Mass Spectrometry, online version James L. Little,a Antony J. Williams,b Alexey Pshenichnov,b Valery TkachenkobaEastman Chemical Company, Kingsport Tennessee, USA p y gpbChemSpider, Royal Society of Chemistry, Cambridge, UK Correspondence to jameslittle@eastman.com or williamsa@rsc.org 1
  13. 13. Initial Window Selected at www.chemspider.com on Internet Either Select “More Searches” or Select “Advanced Search” 2
  14. 14. Second Window Selected in ChemSpider Interface on Internet Select “Advanced” 3
  15. 15. UV Stabilizer Example: Three Steps for Searching by Elemental Composition 1) Select “Search by Properties” 2) Enter elemental composition 3) Select “Search” 4
  16. 16. UV Stabilizer Example: Sorting “Initial” Elemental Composition Results by Selecting Desired Column to Sort Sort descending by selecting “# of References” column, down arrow will appear after sorted 5
  17. 17. UV Stabilizer Example: Five Steps for Searching by Monoisotopic Mass 1) Enter observed m/z value 2) Enter appropriate precision window for your instrument 3) S l t t Select type of i f ion 4) Correct for mass of electron if necessary for your instrument 5) Select “Search” 6
  18. 18. UV Stabilizer Example: Sorting “Initial” Monoisotopic Mass Results by Desired ColumnSort descending by selecting“# of References” column, down arrow will appear after sorted 7
  19. 19. Antioxidant Example: Identification by Monoisotopic MW 1) Enter observed m/z value 2) Enter appropriate precision window for your instrument 3) Select type of ion 4) Correct for mass of electron if necessary for your instrument 5) Select “Search” 8
  20. 20. Antioxidant Example: Sorting “Initial” Monoisotopic Mass Results by “# of References Column”Sort descending by selecting“# of References” column, down arrow will appear after sorted 9
  21. 21. CAS and ChemSpider Numbers for Compounds Shown in Table 3 in Text Elemental Monoisotopic Compound CAS No. ChemSpider No. Composition Mass M id ti c Moxidectin C37H53NO8 639.3771 639 3771 113507-06-5 113507 06 5 16736424, 7845502 16736424 7845502, 7878580 Erythromycinc C37H67NO13 733.4612 114-07-8 12041 Digoxinc C41H64O14 780.4296 20830-75-5 2006532, 23089581 Rifampicinc C43H58N4O12 822.4051 13292-46-1 10468813, 21112299, Rapamycinc R i C51H79NO13 913.5551 913 5551 53123-88-9 53123 88 9 10482078 Amphotericin Bc C47H73NO17 923.4878 1397-89-3 10237579, 21111691, 23123815 Gramicidin Sc C60H92N12O10 1140.7059 113-73-5 66085 Cereulided C57H96N6O18 1152.6781 157232-64-9 8030708, 8117283, 8232646 Cyclosporin Ac C62H111N11O12 1201.8414 59865-13-3 4444325, 4447449 Vancomycinc C66H75Cl2N9O24 1447.4302 1404-90-6 14253 Perfluorotriazinec C30H18N3O6P3F48 1520.9642 16059-16-8 2055443 Thiostreptonc C72H85N19O18S5 1663.4924 1393 48 2 1393-48-2 10469505, 24603973cErve,J.C.; Gu, M.; Wang, Y.; DeMaio, W.; Talaat, R.E.: Spectral accuracy of molecular ions in an LTQ/Orbitrap massspectrometer and implications for elemental composition determination. J. Am. Soc. Mass Spectrom. 20, 2058-2069 (2009).dLohmann, W.; Decker, P; Barsch, A: Accurate molecular formula determination and identification of molecules with > 1100m/z with UHR-TOF. Bruker Application Note # ET-23 (2011). / pp ( ) 10
  22. 22. Increase in Absolute Intensity of Sodium Adduct at Higher In Source In-Source CID Voltages [M+H]+ [ ] [M+NH4]+ [M+Na]+ (volts) m/z 329 Chemical Formula: C30H58O4S MW 514 4 514.4 m/z 143 m/z 89 11

×