In very many cases, an unknown to an investigator is actually known in the chemical literature. We refer to these types of compounds as “known unknowns.” ChemSpider is a particular good collection of …
In very many cases, an unknown to an investigator is actually known in the chemical literature. We refer to these types of compounds as “known unknowns.” ChemSpider is a particular good collection of “known unknowns” for the identification of compounds in commercial products, environmental matrices, etc. However, several modifications were necessary to refine the initial search results sorting with orthogonal filters such as the number of associated patents and references. Previously we described a similar approach using the CAS registry with either SciFinder or STN Express, but ChemSpider is a viable alternative and it is freely accessible to the public.
Accurate mass GC-MS and LC-MS measurements were performed on mixtures using, respectively, either Waters GCT or LCT (LockSpray) instrumentation. MassLynx (Waters) elemental software was used to determine molecular formulae which were further refined by i-FIT for ranking to theoretical isotope distributions. Candidate structures were obtained by searching either molecular formulae or monoisotopic molecular weights with ChemSpider. Further data such as EI or MS/MS fragmentation, number of exchangeable protons, or sample history were used to identify the “known unknown.”
The ChemSpider database of >25 million chemicals was searched by either molecular formulae or monoisotopic molecular weights to identify “known unknowns.” The latter is an attractive approach since no subjective restrictions on the elements, the range of elements, and the double bond equivalents are required prior the ChemSpider search to limit candidate compounds. Changes were made in the ChemSpider to refine the initial candidate list by number of associated references or patents. This tended to bring more promising candidates to the top of the list. The success of these approaches was evaluated with a group of 90 compounds from literature sources, internet sites, and American Society for Mass Spectrometry Conference presentations. Furthermore, the results were compared to similar methods employed searching the Chemical Abstracts Services databases.