
ChemSpider: Building an Online Database of Open Spectra

ChemSpider is an online database of over 20 million chemical compounds drawn from over 300 sources, including government laboratories, chemical vendors, public resources and publications. Developed with the intention of building a community for chemists, ChemSpider allows its users to deposit data including structures, properties, links to external resources and various forms of spectral data. Over the past three years ChemSpider has aggregated almost 3000 high-quality NMR spectra, and the collection continues to expand as the community deposits additional data. The majority of the spectral data are licensed as Open Data, allowing them to be downloaded and reused in presentations, lesson plans and for teaching purposes.

Published in: Technology

ChemSpider – Building an Online Database of Open Spectra
Antony J. Williams and Valery Tkachenko
ChemSpider, Royal Society of Chemistry, info@chemspider.com

Introduction
ChemSpider is an online database of almost 25 million unique chemical compounds drawn from almost 400 sources, including government laboratories, chemical vendors, public resources and publications. Developed with the intention of building a community for chemists, ChemSpider allows its users to deposit data including structures, properties, links to external resources and various forms of spectral data. Over the past three years ChemSpider has aggregated almost 3000 high-quality NMR spectra, and the collection continues to expand as the community deposits additional data.

Free Resources of NMR Data
There have been a number of efforts in recent years to aggregate NMR data and make them available to the community. Several commercial databases of structures with associated NMR assignments exist, including those from ACD/Labs, Wiley/Chemical Concepts, Bio-Rad and others. The NMRShiftDB database [1] from EBI contains over 35,000 structures with multinuclear assignments extracted from over 40,000 spectra. However, databases of NMR spectral curves are less common and are generally limited to those made available by academics, especially groups generating metabonomics data (for example, the BMRB [2] and DrugBank [3]). One component of the ChemSpider project is to gather, host and make available a structure-searchable database of spectral data: 1D/2D NMR, IR, Raman and mass spectrometry.

Sources of Spectral Data
While efforts have been made to harvest NMR data from various websites, with permission, the majority of the data are deposited by users of ChemSpider. Spectra can be submitted as JCAMP-DX files (for 1D spectra) or as images/PDF (for 1D or 2D spectra).
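For readers unfamiliar with JCAMP-DX, the sketch below illustrates the general shape of such a file: a plain-text header of `##LABEL= value` records followed by a block of data points. It is a minimal illustration only, not the code used by ChemSpider or JSpecView. The sample file contents are made up, the names `SAMPLE`, `normalize_label` and `parse_jcamp` are hypothetical, and the parser assumes the simplest uncompressed `(X++(Y..Y))` layout with evenly spaced X values; real files often use compressed encodings that this sketch does not handle.

```python
import re

# Made-up example file; the values are illustrative, not real spectral data.
SAMPLE = """\
##TITLE= Example spectrum (made-up values)
##JCAMP-DX= 4.24
##DATA TYPE= NMR SPECTRUM
##XUNITS= HZ
##YUNITS= ARBITRARY UNITS
##XFACTOR= 1.0
##YFACTOR= 0.5
##FIRSTX= 100.0
##LASTX= 95.0
##NPOINTS= 6
##XYDATA= (X++(Y..Y))
100.0 2 4 8
97.0 6 4 2
##END=
"""


def normalize_label(label):
    """Labels are compared ignoring case, spaces, '-', '/' and '_'."""
    return re.sub(r"[\s\-/_]", "", label).upper()


def parse_jcamp(text):
    """Return (header, xs, ys) for an uncompressed (X++(Y..Y)) file."""
    header = {}
    y_raw = []
    in_xydata = False
    for line in text.splitlines():
        # '$$' starts a comment that runs to the end of the line.
        line = line.split("$$", 1)[0].strip()
        if not line:
            continue
        if line.startswith("##"):
            label, _, value = line[2:].partition("=")
            key = normalize_label(label)
            if key == "END":
                break
            in_xydata = key == "XYDATA"
            if not in_xydata:
                header[key] = value.strip()
        elif in_xydata:
            # First field is an X checkpoint; the rest are Y values.
            fields = line.split()
            y_raw.extend(float(v) for v in fields[1:])

    n = int(header["NPOINTS"])
    if len(y_raw) != n:
        raise ValueError(f"expected {n} points, found {len(y_raw)}")

    first_x = float(header["FIRSTX"])
    last_x = float(header["LASTX"])
    step = (last_x - first_x) / (n - 1) if n > 1 else 0.0
    xs = [first_x + i * step for i in range(n)]

    # Stored Y values are scaled by YFACTOR to give the real intensities.
    y_factor = float(header.get("YFACTOR", "1"))
    ys = [y * y_factor for y in y_raw]
    return header, xs, ys
```

Calling `parse_jcamp(SAMPLE)` returns the header records as a dictionary together with the X and Y value lists, which is enough to plot the spectrum or to check a file before uploading it.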
To deposit a spectrum, a user simply searches ChemSpider for the associated structure and uploads the spectrum as a JCAMP-DX file or an image. The spectrum is available to the community immediately. Community-based curators then validate and annotate the data as appropriate to ensure that only the highest-quality data are available in the database.

Figure 1: The spectrum upload page

Spectral Visualization
1D NMR data are viewed inside the JSpecView spectral display applet [4]. Zooming, scrolling and integration of the data are possible. 2D NMR spectra are viewed as images only at present.

Figure 2: The JSpecView Applet

Data Downloads
The majority of the spectral data are licensed as Open Data, allowing them to be downloaded and reused in presentations, lesson plans and for teaching purposes. A web service is available to allow the data to be integrated into other systems.

The Spectral Game
Using a ChemSpider web service as a basis, Bradley et al. [5] have set up an online game for learning how to interpret NMR spectra. The game is at www.spectralgame.com and has been played by thousands of students around the world.

Figure 3: The Spectral Game

A Call to Action
The ChemSpider database grows only as a result of participation by the community. We encourage you to deposit and share your non-proprietary spectral data with the community to benefit chemists and spectroscopists everywhere. Feel free to contact us at info@chemspider.com for assistance in depositing data. Whether it is an individual spectrum or many hundreds, we are interested in hosting your data.

Future Directions
We intend to continue to grow the spectral database by encouraging further depositions from the community. A 2D NMR Spectral Game is available as a proof of concept at http://spectralgame.com/2d/. We have recently introduced a beta version of NMR prediction integrated with the NMRShiftDB prediction algorithms.
Conclusions
ChemSpider is an online structure database that allows the community to participate in the deposition of additional data. A growing collection of NMR spectral curve data is available to download and use, and users are encouraged to contribute back to the database by sharing their own data with the community. In this way a major reference source of Open NMR data can be provided.

References
1) NMRShiftDB: http://www.ebi.ac.uk/nmrshiftdb/
2) Biological Magnetic Resonance Bank: http://www.bmrb.wisc.edu/
3) DrugBank: http://www.drugbank.ca/
4) The JSpecView Project: an Open Source Java viewer and converter for JCAMP-DX, and XML spectral data files, R.J. Lancashire, http://www.journal.chemistrycentral.com/content/1/1/31
5) The Spectral Game: leveraging Open Data and crowdsourcing for education, J.C. Bradley et al., http://www.jcheminf.com/content/1/1/9

www.chemspider.com
