ChemSpider Presentation At University Of Toronto


This ChemSpider presentation was given to a group of science librarians, specifically chemistry librarians. It was meant to provide an overview of the platform and to answer the question posed: what is the difference between ChemSpider, CAS SciFinder and Reaxys?

Published in: Technology, Education

ChemSpider Presentation At University Of Toronto

  1. ChemSpider: Building the Premier Online Resource for Chemists, University of Toronto, June 8th 2010
  2. Overview
     • The status of chemistry online today
     • The pragmatic vision of ChemSpider
     • The quality of online chemistry
     • Linking together the internet using InChIs
     • Citizen scientists for deposition and curation
     • ChemSpider as a multimedia container
     • Comparing ChemSpider, Reaxys and SciFinder
  3. Where is chemistry online?
     • Encyclopedic articles (Wikipedia)
     • Chemical vendor databases
     • Metabolic pathway databases
     • Property databases
     • Patents with chemical structures
     • Drug discovery data
     • Scientific publications
     • Compound aggregators
     • Blogs/Wikis and Open Notebook Science
  4. Chemistry on the Internet TODAY
     • Chemistry searches are generally limited to text-based searches across the internet
     • Data are dirty: sorting the wheat from the chaff. Who can you trust?
     • Too many searches required to resource data
  5. What do humans want?
     • As few interfaces as possible
  6. A Pragmatic Vision
     • “Build a Structure Centric Community”
     • December 2006 – A hobby project initiated to connect chemistry on the web
         ◦ Integrate chemical structure data on the web
         ◦ Create a “structure-based hub” to information and data
         ◦ Provide access to structure-based “algorithms”
         ◦ Let chemists contribute their own data
         ◦ Allow the community to curate/correct data
  7. ChemSpider Searches
  8. Search Cholesterol
  9. Search Cholesterol
  10. Search Cholesterol
  11. Search Cholesterol
  12. Search Cholesterol
  13. Linked across the internet
  14. Kyoto Encyclopedia of Genes and Genomes
  15. Links to Patents based on structure
  16. Articles Linked
  17. ChemSpider Complex Searches
  18. Link off a structure in ChemSpider
     • Chemical suppliers
     • Other publications
     • Analytical Data
     • Related Reactions
     • Wikipedia
     • Patents
     • “Everything”
  19. Answering Questions for Chemists
     • Questions a chemist might ask…
         ◦ What is the melting point of n-butanol?
         ◦ What is the chemical structure of Xanax?
         ◦ Chemically, what is phenolphthalein?
         ◦ What are the stereocenters of cholesterol?
         ◦ Where can I find publications about xylene?
         ◦ What are the different trade names for Ketoconazole?
         ◦ What is the NMR spectrum of Aspirin?
         ◦ What are the safety handling issues for Thymol Blue?
  20. What is a compound?
  21. ChemSpider is a structure-centric hub
     • ChemSpider aggregates and links out across the internet
     • Data aggregate based on “structures and links”
     • What defines a chemical compound?
  22. Linked Data on the Web (taken from: Rafael Sidis’ Blog)
  23. Where Would You Look? What Do You Trust?
  24. Chemistry on The Internet Is Messy
  25. It’s Methane…
  26. What’s Methane?
  27. What’s Methane?
  28. What ELSE is Methane???
  29. PubChem
  30. Chemistry is REALLY Messy
  31. Vancomycin
     • Who will curate?
     • How would you clean such a large dataset?
     • Assertions!!!
  32. Vancomycin on ChemSpider: 1 compound – 3 days
  33. The EXPERTS must get it right?!
  34. Wikipedia, C&E News, PubChem
     • C&E News (from ACS)
  35. The InChI Identifier
  36. Multiple Layers
  37. InChIStrings Hash to InChIKeys
  38. Vancomycin – Search the Internet
  39. Full Molecule Search: 4 Hits
  40. Full Skeleton Search: 104 Hits
  41. Citizen Scientists
  42. Crowd-sourcing Chemistry Curation
     • Crowd-sourced curation: identify/tag errors, edit names, synonyms, identify records to deprecate
  43. Citizens as Data Sources
  44. Semantic Markup: Project Prospect
  45. Entity-Extraction, Mark-up, Annotate
  46. Success Depends on Dictionaries
  47. Semantic Linking of Structures
     • What would you want to link off a structure?
         ◦ Chemical suppliers
         ◦ Other publications
         ◦ Analytical Data
         ◦ Related Reactions
         ◦ Wikipedia
         ◦ Patents
         ◦ “Everything”
  48. Unpublished Chemistry
     • Only a fraction of chemistry is published
     • Only a tiny fraction of chemistry is patented
     • What of the “Lost Chemistry” - never published and cannot be abstracted
         ◦ Reactions performed
         ◦ Structures made and studied
         ◦ Spectra acquired and then disposed of
         ◦ Available chemicals never found
  49. Org Prep Daily (Blog)
  50. ChemSpider SyntheticPages
  51. Submission Process
     • Submissions reviewed by editorial board
     • Published as is or comments sent to author
     • Online Peer Review process
     • Data supported include web movies, images, live spectra etc.
  52. Micro- and Nano-publications
     • Blogs, wiki entries and even Amazon book reviews are micro/nano-publications
     • ChemSpider SyntheticPages will be DOI’ed – students can add these “micro-publications” to their resume
     • Structures and spectra are nano-publications – these can be tracked and referenced also (depositions, curations etc.). Students participate in building one of the premier sources of chemistry data.
  53. ChemSpider Everywhere: What do computers want?
     • Web services
  54. ChemSpider Everywhere: ChemMobi
  55. Mobile ChemSpider
  56. Multimedia Content Holder
  57. Periodic Table Images
  58. CAS SciFinder
  59. Reaxys
  60. Differences between ChemSpider, Reaxys and SciFinder
     • Everything on Reaxys and SciFinder is curated
     • The data resources can be over 100 years old
     • The platforms are commercial and “read-only”
     • ChemSpider is free, to everyone
     • Data are in a state of ongoing curation & annotation
     • Data resources are from the “electronic era”
     • Data are expanded daily and enhanced on an ongoing basis
     • The platform delivers integrated algorithm access
  61. Community Contribution
     • We make a bigger contribution to the community if the community shares via ChemSpider
     • ChemSpider wins “Community contribution” best practice award
  62. How Can You Help ChemSpider?
     • Encourage students to deposit their data and share with the community
         ◦ Structures – one or many
         ◦ Spectra
         ◦ Links
         ◦ Syntheses into ChemSpider SyntheticPages
     • Spread the word – ChemSpider is an untapped resource
  63. Chemistry on the Internet FUTURE
     • The semantic web for chemistry is in place
     • Crowdsourced contributions are commonplace
     • Chemists will search by structure/substructure
     • Chemistry articles indexed and searchable
     • Reduced number of searches to find data
     • Data are integrated – compounds, vendors, syntheses, data, publications and patents
     • A world of Open Access and Open Data
  64. Thank you [email_address] Twitter: ChemSpiderman SLIDES:
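
The slides on the InChI identifier describe it as a layered string that hashes to an InChIKey, and contrast a full molecule search with a full skeleton search. The Python sketch below illustrates that idea and is not ChemSpider's own code. It assumes that a skeleton search corresponds to matching only the first 14-character block of a standard InChIKey and that a full molecule search matches the whole key. The function names (split_inchi, split_inchikey, same_skeleton, same_molecule) and the example compounds are chosen for illustration, and the parser only handles simple InChI strings that start with a formula layer.

```python
# Illustrative sketch only: helper names are hypothetical, not a ChemSpider API.

# Prefix letters of common InChI layers (simple cases only).
LAYER_NAMES = {
    "c": "connectivity",
    "h": "hydrogens",
    "q": "charge",
    "p": "protons",
    "b": "double-bond stereo",
    "t": "tetrahedral stereo",
    "m": "stereo parity",
    "s": "stereo type",
    "i": "isotopes",
}


def split_inchi(inchi):
    """Split a simple InChI string into named layers.

    Assumes the first layer after the version is the chemical formula
    and that no layer prefix repeats.
    """
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    version, formula, *rest = inchi[len("InChI="):].split("/")
    layers = {"version": version, "formula": formula}
    for part in rest:
        layers[LAYER_NAMES.get(part[0], part[0])] = part[1:]
    return layers


def split_inchikey(key):
    """Return the three hyphen-separated blocks of a 27-character InChIKey."""
    blocks = key.split("-")
    if [len(b) for b in blocks] != [14, 10, 1]:
        raise ValueError("not a 27-character InChIKey")
    return tuple(blocks)


def same_skeleton(key_a, key_b):
    """True if both keys share the first (connectivity) block."""
    return split_inchikey(key_a)[0] == split_inchikey(key_b)[0]


def same_molecule(key_a, key_b):
    """True only if the complete keys are identical."""
    return key_a == key_b


# Example inputs; check them against an authoritative source before reuse.
ETHANOL_INCHI = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
METHANE_INCHI = "InChI=1S/CH4/h1H4"
L_ALANINE_KEY = "QNAYBMKLOCPYGJ-REOHZTLVSA-N"  # stereo specified
ALANINE_KEY = "QNAYBMKLOCPYGJ-UHFFFAOYSA-N"    # no stereo specified
METHANE_KEY = "VNWKTOKETHGBQD-UHFFFAOYSA-N"

if __name__ == "__main__":
    print(split_inchi(ETHANOL_INCHI))
    print(same_skeleton(L_ALANINE_KEY, ALANINE_KEY))
    print(same_molecule(L_ALANINE_KEY, ALANINE_KEY))
```

Comparing only the first block groups records that share connectivity but differ in stereochemistry or isotopes, which is one plausible reason a skeleton search returns more hits than a full molecule search.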