RSC ChemSpider is the online chemistry database where community contributions count


The ChemSpider database is a resource hosted by the Royal Society of Chemistry. With over 28 million unique chemicals linked out to over 400 data sources, the platform provides access to experimental and predicted data (properties, spectra, etc.), links to publications and patents, and a myriad of other resources. The ChemSpider database has been used as the foundation of a number of other resources for chemists, including ChemSpider SyntheticPages, the Learn Chemistry Wiki and the Spectral Game. This presentation will provide an overview of ChemSpider and discuss how chemists can both derive value from and contribute to the content available from the database and its related resources. We will also discuss our view of a future platform for managing personal, institutional and public chemistry in a shared environment.

Published in: Technology

  1. 1. RSC|ChemSpider – The Online Chemistry Database Where Community Contributions Count
  2. 2. ChemSpider: The RSC’s Online Chemical Database
     A central hub for chemists to source information:
       • >28 million unique chemical records
       • Aggregated from >400 data sources
       • Chemicals, spectra, CIF files, movies, images, podcasts, links to patents, publications, predictions
     A central hub for chemists to deposit and curate data
  3. 3. Answer Questions with ChemSpider
     Questions a chemist might ask…
       • What is the melting point of n-heptanol?
       • What is the chemical structure of Xanax?
       • Chemically, what is phenolphthalein?
       • What are the stereocenters of cholesterol?
       • Where can I find publications about xylene?
       • What are the different trade names for Ketoconazole?
       • What is the NMR spectrum of Aspirin?
       • What are the safety handling issues for Thymol Blue?
  4. 4. I want to know about “Vincristine”
  5. 5. I want to know about “Vincristine”
     If all algorithms work, then everything on the page is correct by default, except the name!
  6. 6. Vincristine: Identifiers and Properties
  7. 7. Vincristine: Identifiers and Properties
  8. 8. Vincristine: Vendors and Sources
  9. 9. Vincristine: Patents
  10. 10. Vincristine: Articles
  11. 11. ChemSpider : Spectra Linked
  12. 12. Spectra Linked
  13. 13. Multiple Spectra for One Structure
  14. 14. ChemSpider ID 24528095 H1 NMR
  15. 15. ChemSpider ID 24528095 C13 NMR
  16. 16. ChemSpider ID 24528095 HHCOSY
  17. 17. ChemSpider ID 24528095 HSQC
  18. 18. ChemSpider ID 24528095 HMBC
  19. 19. About Structures
  20. 20. The InChI Standard
  21. 21. InChIKeys: Search the Web by Structure
  22. 22. InChIs
  23. 23. Searches: The INTERNET
     All ChemSpider and Internet searches are “simply algorithms”, but synonym searching is based on an assertion
  24. 24. Validated Names for Searching…
  25. 25. Scientists are measured by…
       • Impact
       • Citations
       • Papers
       • Patents
       • Funding
     and increasingly by “Alt-Metrics” – what you say, what you contribute, your data depositions, your code in repositories, your voice in the network, your activities on Facebook (be careful!)
  26. 26. If it was not just about me…
  27. 27. If it was not just about me…
       • We might have a community-built encyclopedia
       • I might know where the best restaurants are
       • I might get good advice on books to read
       • I might know which movies to watch
       • I might know which plumber to call
       • Data might just be Open
  28. 28. If it was not just about me…
       • We might have a community-built encyclopedia
       • I might know where the best restaurants are
       • I might get good advice on books to read
       • I might know which movies to watch
       • I might know which plumber to call
       • Data might just be Open
  29. 29. The Social Network
     Career-wise, NOT having a personal presence online will be a detriment:
       • Self-marketing
       • Establishing a profile
       • Getting on the record
       • Collaborative Science
       • Demonstrating a skill set
       • Measured using alternative metrics
       • Contributing to the public peer review process
  30. 30. Social Networking Tools
     A growing number of social networking tools:
       • Facebook
       • Twitter
       • Linked-In
       • Flickr
       • YouTube
       • Blogs
       • Communities
       • Collaborative environments
  31. 31. Chemistry Social Networking
     Methods of sharing MY chemistry online include:
       • Wikis or blogs
       • Slideshare for presentations
       • YouTube for videos
       • Flickr, Wikimedia etc. for images
       • PubChem for assay data
       • NMRShiftDB for NMR assignments
       • GoogleDocs for data
  32. 32. Your profile online…
  33. 33. Establish a Mendeley Account
  34. 34. ResearchGate
  35. 35. Microsoft Academic Search
  36. 36. The Alt-Metrics Manifesto
  37. 37. What is my ImpactStory?
  38. 38. ImpactStory
  39. 39. Enabled by ORCID…
  40. 40. The Linked Network
  41. 41. There is much to be linked
  42. 42. The World of Contribution
     Times have changed:
       • Immediacy of social networks
       • Commenting on articles/data is here
       • The “participating scientist” has high profile
       • And who can be a scientist now???
  43. 43. A Ten Year Old Scientist
  44. 44. Share Science!!! Not Just Yourself
     If you have time, and the inclination, become a community contributor
     Share your expertise in the new world of openness:
       • Share your Open Source code
       • Share your data and your model
       • Share your Figures
       • Contribute to Wikis – Wikipedia and others
       • Become an Open Notebook Scientist
  45. 45. Expose Data and Figures on FigShare
  46. 46. Expose Data and Figures on FigShare
  47. 47. ChemSpider SyntheticPages
       • Many syntheses are not published but are of value
       • A database of synthesis procedures built for the community, by the community
       • Peer-reviewed by the community
       • Each contribution DOI’ed
       • Develop online scientific reputation at a time of “micro-publications”
       • Integrates semantic mark-up and visualization tools
  48. 48. ChemSpider SyntheticPages
  49. 49. ChemSpider SyntheticPages
  50. 50. Submission Process
       • Register as a user
       • Use the Submit button and fill in the fields…
  51. 51. Submission Process
       • Submissions reviewed by editorial board
       • Published as is or comments sent to author
       • Online Peer Review process – engage chemists in ongoing discussions and feedback loop
       • Data supported include web movies, images, live spectra etc.
  52. 52. Recent Submissions
  53. 53. Interactive Data
  54. 54. Most Accessed
  55. 55. Is it working?
     Show of hands…
       • How many of you know ChemSpider?
       • How many of you know CSSP?
       • Have any of you submitted to CSSP?
     Low submissions but some dedicated authors
  56. 56. Popular Authors
  57. 57. Is it working?
     Show of hands…
       • How many of you know CSSP?
       • Have any of you submitted to CSSP?
     Low submissions but some dedicated authors
     What reasons are there you would not publish?
       • Time
       • Approval from supervisor
       • Need to keep the science quiet
       • Publishing on CSSP prevents future publishing?
  58. 58. Contributing to The Quality of Data
     What is the Structure of Vitamin K?
  59. 59. Contributing to The Quality of Data
     What is the Structure of Vitamin K?
     A lipid cofactor that is required for normal blood clotting. Several forms of vitamin K have been identified: VITAMIN K1 (phytomenadione) derived from plants, VITAMIN K2 (menaquinone) from bacteria & synthetic naphthoquinone provitamins, VITAMIN K3 (menadione).
  60. 60. What is the Structure of Vitamin K1?
  61. 61. CAS’s Common Chemistry
  62. 62. Wikipedia
  63. 63. Wolfram Alpha
  64. 64. DailyMed
  65. 65. People Use Trusted Resources…
  66. 66. Quality police…
  67. 67. How will it improve? Participation and contribution
  68. 68. ALL Different, ALL “Domoic Acids”
  69. 69. The EXPERTS must get it right?!
  70. 70. Question Everything Online
  71. 71. Deposition, Annotation and Validation
       • ANYBODY can annotate a record on ChemSpider
       • Registered users can deposit new data
       • Registered users can validate existing data
  72. 72. CURATION Search “Vitamin H”
  73. 73. “Curate” Identifiers
  74. 74. “Curate” Identifiers
  75. 75. Spectra Linked
  76. 76. Spectral Uploading
     Locate the structure of interest and deposit spectrum
  77. 77. Spectral Uploading
     Various types of NMR spectra supported
  78. 78. Regular Updates
  79. 79. Web Services
  80. 80. www.SpectralGame.com
  81. 81. Spectral Game
  82. 82. Increasing Complexity
  83. 83. SpectralGame in the hand
  84. 84. Work in Progress – 300k Reactions
  85. 85. Data Enabling the RSC Archive
     An archive going back to 1841. Project underway to “data enable” the archive:
       • Extract chemistry – chemicals, reactions, experimental data points, complex data
       • Semantic enriching of the articles for interactive viewing and crowdsourced annotation/curation
       • Dramatically enables the type of queries possible across the archive
  86. 86. A model for data segregation
       • Integrate to Institutional repositories
       • Access to Theses and Dissertations
  87. 87. Model Building with Community Data
     Community data can be the basis of model building:
       • Consume data from available databases, RSC archive, new publications and build predictive algorithms for the community
       • Accept research data from the community and include into predictions
  88. 88. An Open Data-Centric Chemistry Hub
     (diagram labels)
       • Internet Data, Commercial Software, Pre-competitive Data, Open Science, Open Data, Publishers, Educators, Open Databases, Chemical Vendors
       • Small organic molecules, Undefined materials, Organometallics, Nanomaterials, Polymers, Minerals, Particle bound, Links to Biologicals
  89. 89. Wikipedia
  90. 90. An Interesting Read
  91. 91. ScientistsDB
  92. 92. ScientistsDB Write your OWN article about yourself on ScientistsDB It is a community-policed site so any comments you write might be challenged/edited. It is “your” page but edited by all An article, once approved by the community, can, in theory, be moved to Wikipedia All content is licensed under standard CC-BY-SA 3.0 licensing provided by Wikipedia
  93. 93. Acknowledgments RSC|ChemSpider team CSSP Editorial Team All data source providers Curators and annotators Service providers:  ACD/Labs  OpenEye  GGA Software Services  Many others….
  94. 94. Communicating Science As scientists one of our primary roles is contribution The internet enables contribution in different ways, benefitting the scientist and the community Share your data and experience – it can enhance your public profile as a scientist, make you more discoverable and contribute data to the community AltMetrics will be a measure of scientists…
  95. 95. Thank you
     Email: williamsa@rsc.org
     Twitter: ChemConnector
     Personal Blog: www.chemconnector.com
     SLIDES:
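
Slides 20 to 22 introduce the InChI standard and the idea of searching the web by structure using an InChIKey. The slides do not show how this is done, so the following is only a minimal sketch of the idea, assuming the standard 27-character InChIKey layout (14 letters, a hyphen, 10 letters, a hyphen, 1 letter). The function names `is_wellformed_inchikey` and `web_search_query` are hypothetical and are not part of ChemSpider or any InChI software. The check covers the shape of the key only, not whether it corresponds to a real compound.

```python
import re
from urllib.parse import quote_plus

# Assumed layout of a standard InChIKey: 14 uppercase letters, a hyphen,
# 10 uppercase letters, a hyphen, and 1 uppercase letter (27 characters).
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def is_wellformed_inchikey(key: str) -> bool:
    """Return True if `key` has the InChIKey layout (shape check only)."""
    return INCHIKEY_PATTERN.fullmatch(key) is not None


def web_search_query(key: str) -> str:
    """Build a quoted, URL-encoded query term for an exact-match web search.

    Hypothetical helper: the quotes ask a search engine for the exact string.
    """
    if not is_wellformed_inchikey(key):
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    return quote_plus(f'"{key}"')


if __name__ == "__main__":
    aspirin_key = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"  # InChIKey for aspirin
    print(is_wellformed_inchikey(aspirin_key))
    print(web_search_query(aspirin_key))
```

Because the key is a fixed-length string derived from the structure, it can be pasted into any text search box, which is what makes searching by structure possible with ordinary search engines.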