Building a Community Resource of Open Spectral Data

A presentation given at #FACCS2010 in Raleigh, North Carolina

ChemSpider is an online database of almost 25 million chemical compounds drawn from over 300 sources, including government laboratories, chemical vendors, public resources and publications. Developed with the intention of building community for chemists, ChemSpider allows its users to deposit data including structures, properties, links to external resources and various forms of spectral data. Over the past three years ChemSpider has aggregated almost 3000 spectra, including infrared and Raman data, and continues to expand as the community deposits additional data. The majority of the spectral data is licensed as Open Data, allowing it to be downloaded and reused in presentations, lesson plans and for teaching purposes. This presentation gives an overview of our efforts to build a structure-indexed online database of spectral data, issues a call to action for the community to help improve this resource for the community at large, and discusses how such a resource could serve as the basis of a spectral game to teach students spectral interpretation.


  • 1. Building a Community Resource of Open Spectral Data
  • 2. If there were an online DB of spectra…
    • Reference data
    • Could dramatically reduce rework
    • Excellent teaching resource
    • Opportunity for crowdsourced review and annotation of data (“Wikipedia for spectra”)
    • Other…
  • 3. Where is chemistry online?
    • Encyclopedic articles (Wikipedia)
    • Chemical vendor databases
    • Metabolic pathway databases
    • Property databases
    • Patents with chemical structures
    • Drug Discovery data
    • Scientific publications
    • Compound aggregators
    • Blogs/Wikis and Open Notebook Science
  • 4. And where are spectra online?
  • 5. And where are spectra online?
  • 6. And where are spectra online?
  • 7. And where are spectra online?
  • 8. And where are spectra online?
  • 9. What’s in the way of “Open Spectra”
    • Spectral databases are revenue generators
    • Intellectual property
    • Scientist versus organizational ownership
    • Legal risks
    • It’s generally “work” to release data
    • Confusion about “Free Data” versus “Open Data”: some people will provide “Free Data”, others will provide “Open Data”, and the two are different
  • 10. A Pragmatic Vision
      • “Build a Structure Centric Community to Serve Chemists”
      • Integrate chemical structure data on the web
      • Create a “structure-based hub” to information and data
      • Provide access to structure-based “algorithms”
      • Let chemists contribute their own data
      • Allow the community to curate/correct data
  • 11. We Answer Questions for Chemists
    • Questions a chemist might ask…
      • What is the melting point of n-heptanol?
      • What is the chemical structure of Xanax?
      • Chemically, what is phenolphthalein?
      • What are the stereocenters of cholesterol?
      • Where can I find publications about xylene?
      • What are the different trade names for Aspirin?
      • What is the IR spectrum of Benzoic Acid?
  • 12.
  • 13. Search for a Chemical…by name
  • 14. Available Information…
    • Linked to vendors, safety data, toxicity, metabolism
  • 15. Available Information…
  • 16. ChemSpider Today
    • 24.8 million structures
    • 400 data sources
    • Grows daily
    • Community annotation and curation
    • We curate, edit, change, enhance data daily
  • 17. ChemSpider: Spectra Linked
  • 18. ChemSpider: Spectra Linked
  • 19. Spectra Linked
  • 20. Spectra Linked
  • 21. Spectra on ChemSpider
  • 22. Sources of Spectra
    • Sourced from online sources with permission
    • Private collections
    • The MAJORITY deposited by ChemSpider users
  • 23. Spectral Uploading
    • Locate the structure of interest and deposit spectrum
  • 24. Spectral Uploading
    • Various types of NMR spectra supported
  • 25. Spectral formats supported
    • Spectra may be uploaded as JCAMP-DX format
    • Graphical formats such as JPEG, PNG, GIF
    • PDF files
  • 26. Multiple Spectra for One Structure
  • 27. ChemSpider ID 24528095 H1 NMR
  • 28. ChemSpider ID 24528095 C13 NMR
  • 29. ChemSpider ID 24528095 HHCOSY
  • 30. ChemSpider ID 24528095 HSQC
  • 31. ChemSpider ID 24528095 HMBC
  • 32. Deposit spectra against new structure
    • If a NEW compound has spectral data then deposit the structure onto ChemSpider first
  • 33. Available Spectra
  • 34. Embedding Data
  • 35. Web Services
  • 36.
  • 37. Spectral Game
  • 38. Increasing Complexity
  • 39. Spectral Game
  • 40. Data Curation
  • 41. Reversed Spectrum
  • 42. Download, reprocess, redeposit
  • 43. True Curation of Data
  • 44. Building Other Spectral Games!
    • We would like to build other forms of the spectral game. The database is presently very rich in NMR data
    • There are presently
      • 101 Infrared Spectra
      • 46 Raman Spectra
      • We would like more!!!
  • 45. Invitations
    • Spectral data are welcomed from associated syntheses, lab experiments etc
    • Upload structures, spectra, analyses etc to ChemSpider to share with the community
    • Use and encourage your students
  • 46. Thank you Email: Twitter: ChemConnector Blog: Personal Blog: SLIDES:
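
Slide 25 names JCAMP-DX as an accepted upload format, and slide 42 suggests downloading and reprocessing spectra. As a rough illustration of what reading such a file involves, here is a minimal Python sketch. It is not ChemSpider's code, and it rests on several assumptions: the file holds a single uncompressed `##XYDATA=(X++(Y..Y))` table with whitespace-separated numbers, the X values are evenly spaced between `##FIRSTX` and `##LASTX`, and the sample record (`SAMPLE`) is entirely made up. Real JCAMP-DX files often use compressed number forms that this sketch does not handle.

```python
def parse_jcamp(text):
    """Parse a simple, uncompressed JCAMP-DX record into (header, xs, ys).

    Assumption: one XYDATA table in (X++(Y..Y)) form with plain numbers.
    """
    header = {}
    raw_y = []
    in_data = False
    for line in text.splitlines():
        line = line.split("$$")[0].strip()  # "$$" starts an inline comment
        if not line:
            continue
        if line.startswith("##"):
            label, _, value = line[2:].partition("=")
            label = label.strip().upper()
            if label == "END":
                break
            in_data = label == "XYDATA"
            if not in_data:
                header[label] = value.strip()
            continue
        if in_data:
            numbers = [float(tok) for tok in line.split()]
            # The first number on each data line is an X checkpoint;
            # the remaining numbers are consecutive Y values.
            raw_y.extend(numbers[1:])

    n = len(raw_y)
    first_x = float(header["FIRSTX"])
    last_x = float(header["LASTX"])
    y_factor = float(header.get("YFACTOR", 1.0))
    step = (last_x - first_x) / (n - 1) if n > 1 else 0.0
    xs = [first_x + i * step for i in range(n)]  # rebuild evenly spaced X axis
    ys = [y * y_factor for y in raw_y]           # scale stored Y values
    return header, xs, ys


# Hypothetical record, invented for illustration only.
SAMPLE = """##TITLE=hypothetical example
##JCAMP-DX=4.24
##DATA TYPE=INFRARED SPECTRUM
##XUNITS=1/CM
##YUNITS=TRANSMITTANCE
##FIRSTX=400
##LASTX=410
##XFACTOR=1
##YFACTOR=0.5
##NPOINTS=6
##XYDATA=(X++(Y..Y))
400 10 20 30
406 40 50 60
##END=
"""

header, xs, ys = parse_jcamp(SAMPLE)
print(header["DATA TYPE"], xs, ys)
```

Because the header labels are kept in a plain dictionary, fields such as `DATA TYPE` or `XUNITS` could be checked before a spectrum is reprocessed and redeposited.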