Building a Community Resource of Open Spectral Data


A presentation given at #FACCS2010 in Raleigh, North Carolina

ChemSpider is an online database of almost 25 million chemical compounds drawn from over 300 sources, including government laboratories, chemical vendors, public resources and publications. Developed with the intention of building a community for chemists, ChemSpider allows its users to deposit data including structures, properties, links to external resources and various forms of spectral data. Over the past three years ChemSpider has aggregated almost 3000 spectra, including infrared and Raman data, and continues to expand as the community deposits additional data. The majority of the spectral data is licensed as Open Data, allowing it to be downloaded and reused in presentations, lesson plans and for teaching purposes. This presentation will provide an overview of our efforts to build a structure-indexed online database of spectral data, issue a call to action for the community to participate in improving this resource for the community at large, and discuss how such a resource could be used as the basis of a spectral game to teach students spectral interpretation.

  1. Building a Community Resource of Open Spectral Data
  2. If there were an online DB of spectra…
      • Reference data
      • Could dramatically reduce rework
      • Excellent teaching resource
      • Opportunity for crowdsourced review and annotation of data (“Wikipedia for spectra”)
      • Other…
  3. Where is chemistry online?
      • Encyclopedic articles (Wikipedia)
      • Chemical vendor databases
      • Metabolic pathway databases
      • Property databases
      • Patents with chemical structures
      • Drug Discovery data
      • Scientific publications
      • Compound aggregators
      • Blogs/Wikis and Open Notebook Science
  4. And where are spectra online?
  5. And where are spectra online?
  6. And where are spectra online?
  7. And where are spectra online?
  8. And where are spectra online?
  9. What’s in the way of “Open Spectra”
      • Spectral databases are revenue generators
      • Intellectual property
      • Scientist versus organizational ownership
      • Legal risks
      • It’s generally “work” to release data
      • Confusion about “Free Data” versus “Open Data”: some people will provide “Free Data”, some will provide “Open Data”. They are different
  10. A Pragmatic Vision
      • “Build a Structure Centric Community to Serve Chemists”
      • Integrate chemical structure data on the web
      • Create a “structure-based hub” to information and data
      • Provide access to structure-based “algorithms”
      • Let chemists contribute their own data
      • Allow the community to curate/correct data
  11. We Answer Questions for Chemists
      • Questions a chemist might ask…
        ◦ What is the melting point of n-heptanol?
        ◦ What is the chemical structure of Xanax?
        ◦ Chemically, what is phenolphthalein?
        ◦ What are the stereocenters of cholesterol?
        ◦ Where can I find publications about xylene?
        ◦ What are the different trade names for Aspirin?
        ◦ What is the IR spectrum of Benzoic Acid?
  12.
  13. Search for a Chemical…by name
  14. Available Information…
      • Linked to vendors, safety data, toxicity, metabolism
  15. Available Information…
  16. ChemSpider Today
      • 24.8 million structures
      • 400 data sources
      • Grows daily
      • Community annotation and curation
      • We curate, edit, change, enhance data daily
  17. ChemSpider: Spectra Linked
  18. ChemSpider: Spectra Linked
  19. Spectra Linked
  20. Spectra Linked
  21. Spectra on ChemSpider
  22. Sources of Spectra
      • Sourced from online sources with permission
      • Private collections
      • The MAJORITY deposited by ChemSpider users
  23. Spectral Uploading
      • Locate the structure of interest and deposit spectrum
  24. Spectral Uploading
      • Various types of NMR spectra supported
  25. Spectral formats supported
      • Spectra may be uploaded as JCAMP-DX format
      • Graphical formats such as JPEG, PNG, GIF
      • PDF files
  26. Multiple Spectra for One Structure
  27. ChemSpider ID 24528095 H1 NMR
  28. ChemSpider ID 24528095 C13 NMR
  29. ChemSpider ID 24528095 HHCOSY
  30. ChemSpider ID 24528095 HSQC
  31. ChemSpider ID 24528095 HMBC
  32. Deposit spectra against new structure
      • If a NEW compound has spectral data then deposit the structure onto ChemSpider first
  33. Available Spectra
  34. Embedding Data
  35. Web Services
  36.
  37. Spectral Game
  38. Increasing Complexity
  39. Spectral Game
  40. Data Curation
  41. Reversed Spectrum
  42. Download, reprocess, redeposit
  43. True Curation of Data
  44. Building Other Spectral Games!
      • We would like to build other forms of the spectral game. The database is presently very rich in NMR data
      • There are presently
        ◦ 101 Infrared Spectra
        ◦ 46 Raman Spectra
        ◦ We would like more!!!
  45. Invitations
      • Spectral data are welcomed from associated syntheses, lab experiments etc
      • Upload structures, spectra, analyses etc to ChemSpider to share with the community
      • Use and encourage your students
  46. Thank you Email: Twitter: ChemConnector Blog: Personal Blog: SLIDES:
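
The slides name JCAMP-DX as a supported upload format and describe a "download, reprocess, redeposit" curation loop. As a rough illustration of what reading such a file involves, here is a minimal Python sketch. It is not ChemSpider code. It assumes a single-block file whose `##XYDATA=(X++(Y..Y))` table is stored as plain, uncompressed numbers; compressed encodings and multi-block files are not handled. The function name `parse_jcamp` and the sample data are hypothetical.

```python
import re

# Matches plain decimal numbers, optionally signed and with an exponent.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

def parse_jcamp(text):
    """Parse a single-block JCAMP-DX string with uncompressed XYDATA.

    Returns (header dict, list of x values, list of scaled y values).
    """
    header = {}
    raw_ys = []
    in_xydata = False
    for raw_line in text.splitlines():
        line = raw_line.split("$$", 1)[0].strip()  # "$$" starts a comment
        if not line:
            continue
        if line.startswith("##"):
            label, _, value = line[2:].partition("=")
            # Labels are compared ignoring case, spaces, '-', '/' and '_'.
            label = re.sub(r"[\s\-/_]", "", label).upper()
            if label == "END":
                break
            in_xydata = label == "XYDATA"
            if not in_xydata:
                header[label] = value.strip()
            continue
        if in_xydata:
            numbers = [float(tok) for tok in NUMBER.findall(line)]
            raw_ys.extend(numbers[1:])  # first number on a line is its X value

    yfactor = float(header.get("YFACTOR", 1.0))
    first_x = float(header["FIRSTX"])
    last_x = float(header["LASTX"])
    count = len(raw_ys)
    # Points are evenly spaced between FIRSTX and LASTX.
    step = (last_x - first_x) / (count - 1) if count > 1 else 0.0
    xs = [first_x + i * step for i in range(count)]
    ys = [y * yfactor for y in raw_ys]
    return header, xs, ys

# Hypothetical six-point infrared spectrum, made up for this example.
SAMPLE = """##TITLE=hypothetical example
##JCAMP-DX=4.24
##DATA TYPE=INFRARED SPECTRUM
##XUNITS=1/CM
##YUNITS=TRANSMITTANCE
##XFACTOR=1.0
##YFACTOR=0.01
##FIRSTX=400
##LASTX=405
##NPOINTS=6
##XYDATA=(X++(Y..Y))
400 95 94 90
403 80 85 93
##END=
"""

if __name__ == "__main__":
    header, xs, ys = parse_jcamp(SAMPLE)
    print(header["TITLE"], len(xs), "points")
```

A file whose `FIRSTX` is larger than its `LASTX` simply yields a descending x axis here, which is one thing worth checking when a deposited spectrum looks reversed.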