Contributions to the World of eScience from the Royal Society of Chemistry


Our access to scientific information has changed in ways that were hardly imagined even by the early pioneers of the internet. The immense quantities of data and the array of tools available to search and analyze online content continue to expand, and the pace of change does not appear to be slowing. ChemSpider is one of the chemistry community’s primary online public compound databases. Containing tens of millions of chemical compounds and their associated data, ChemSpider serves data to tens of thousands of chemists every day, and it serves as the foundation for many important international projects to integrate chemistry and biology data, facilitate drug discovery efforts and help to identify new chemicals from under the ocean. This presentation will provide an overview of the expanding reach of the ChemSpider platform and the nature of the solutions that it helps to enable. We will also discuss the possibilities it offers in the domain of crowdsourcing and open data sharing. The future of scientific information and communication will be underpinned by these efforts, influenced by increasing participation from the scientific community, and facilitated by collaboration, ultimately accelerating scientific progress.

Published in: Technology, Education

  1. 1. Contributions to the World of eScience from the Royal Society of Chemistry Antony Williams University of North Florida November 14th 2013
  2. 2. We Have …Too Much Data!!!
  3. 3. The World of Online Chemistry • Property databases • Compound aggregators • Screening assay results • Scientific publications • Encyclopedic articles (Wikipedia) • Metabolic pathway databases • ADME/Tox data – eTOX for example • Blogs/Wikis and Open Notebook Science
  4. 4. e-Science and Primary Data • How much data generated in a lab, that COULD go public, is lost forever? • Public Domain reference databases of value? • Syntheses • Properties • Spectra • CIFs • Images • Much of chemistry is chemical structure-based – where and how could we host these data?
  5. 5. RSC’s ChemSpider
  6. 6. ChemSpider • >29 million unique chemicals from >500 data sources • Focus on improving data quality, enhancing functionality, integrating and enabling
  7. 7. Crowdsourced “Annotations” • Users can add • Descriptions/Syntheses/Commentaries • Links to PubMed articles • Links to articles via DOIs • Add spectral data • Add Crystallographic Information Files • Add photos • Add MP3 files • Add Videos
  8. 8. Spectra
  9. 9. Chemistry Data online are messy • We have inherited errors • All public compound databases have errors • “Incorrect” structures – assertions, timelines etc • “Incorrect” names associated with structures • Properties • Links • Publications • ENORMOUS CHALLENGE
  10. 10. Crowdsourced Curation • Crowd-sourced curation: identify/tag errors, edit names, synonyms, identify records to deprecate
  11. 11. Search “Vitamin H”
  12. 12. “Curate” Identifiers
  13. 13. “Curate” Identifiers
  14. 14. “Curate” Identifiers
  15. 15. Validated Name-Structure Dictionaries • Chemical name dictionaries are used for: • Text-mining (publications, patents) • Used to index PubMed and link to Google Patents • Linking to other databases – think Biology! • When structures are not available drug names link • Searching the web • Names link to structures link to InChIs
  16. 16. I want to know about “Vincristine”
  17. 17. Vincristine: Identifiers and Properties
  18. 18. Vincristine: Vendors and Sources Linked by Structure
  19. 19. Vincristine: Patents Linked by Name
  20. 20. Vincristine: Articles Linked by Name
  21. 21. Semantic Mark-up of Articles
  22. 22. Linking Names to Structures
  23. 23. The InChI Identifier
  24. 24. InChIStrings Hash to InChIKeys
  25. 25. Vancomycin – Search the Internet
  26. 26. Vancomycin • Search Molecular SKELETON • Search Full Molecule
  27. 27. Full Skeleton Search: 104 Hits
  28. 28. Full Molecule Search: 4 Hits
  29. 29. Publications - a summary of work • Scientific publications are a summary of work • Is all work reported? • How much science is lost to pruning? • What of value sits in notebooks and is lost? • How much data is lost? • How many compounds never reported? • How many syntheses fail or succeed? • How many characterization measurements?
  30. 30. What if we could capture it all? Digitally Enhancing the RSC Archive
  31. 31. Start with data in publications
  32. 32. Turn “Figures” Into Data
  33. 33. ChemSpider Reactions • Starting with data from CSSP, MOS and CCR • Will cover reactions extracted from: • Patents • RSC journal articles and ESI
  34. 34. About Me…as a Chemist • I’ve performed a few dozen chemical syntheses • I’ve run thousands of analytical spectra • I’ve generated thousands of NMR assignments • I’ve probably published <5% of all work • Most of it has been lost • But things can be different today…. • But it still needs to be associated with me…
  35. 35. Micropublishing Syntheses
  36. 36. ChemSpider SyntheticPages
  37. 37. Visibility Means Discoverability • Does a Social Profile matter? • You are visible when you share your skills, experience and research activities by: • Establishing a public profile • Getting on the record • Collaborative Science • Demonstrating a skill set • Measured using “alternative metrics” • Contributing to the public peer review process
  38. 38. Scientists are “Quantified” • Scientists are quantified • Stats are gathered and analyzed • Employers can find them, tenure will depend on them, and this already happens without your participation • Scientists’ Impact Factors, H-index and many other variants.
  39. 39. How you can be Quantified…
  40. 40. ResearchGate
  41. 41. The Alt-Metrics Manifesto
  42. 42. AltMetrics via Plum Analytics
  43. 43. Usage, Citations, Social Media, Etc
  44. 44. Detailed Usage Statistics
  45. 45. Your Profile as a Scientist • If you are an active scientist – i.e. already published, active researcher, generator of data, early-, mid- or late-career – there is lots to do! • If you are a junior scientist, investing time now will provide a strong foundation for your future! • So what do I do??
  46. 46. Branding: I am ChemConnector
  47. 47. Enabled by • Persistent unique digital identifier • Integrates to workflows such as manuscript and grant submission • Supports automated linkages with your professional activities
  48. 48. An Online Profile • Methods of sharing science online include: • Wikis or blogs • Slideshare for presentations • YouTube for videos • Flickr, Wikimedia etc. for images • ChemSpider for chemistry • GoogleDocs for data • Google Scholar Citations for citations • Microsoft Academic Search for papers
  49. 49. LinkedIn
  50. 50. My Career Captured…
  51. 51. And “Endorsements”
  52. 52. Are you sharing your slides online? • Slideshare to host, expose and share your presentations, publications, posters and videos (subject to copyright you might have transferred!) • Register for an account and retain your branding! Keep your online brand consistent
  53. 53. Upload and Add Details • Edit title, add tags, add “abstract”, choose category • Select checkbox for allow/disallow file download
  54. 54. SlideShare
  55. 55. Social Media Tools Feed Each Other • Plugins and connectors integrate your activities across the social media platforms • Expose your Tweeting and your Slideshare presentations directly on LinkedIn. • Plug-ins allow your tweets and presentations to be automagically displayed on LinkedIn
  56. 56. From Slideshare Into the Network
  57. 57. Add Applications to LinkedIn
  58. 58. Places to Share Videos • There are other sites for you to share your videos online as a scientist • YouTube • SciVee • Vimeo • Slideshare
  59. 59. Share/Manage Your Publications • Where do you “manage your publications”? • Share your “activities” with the community • My publications/slides/videos are my CV on • My Blog • On LinkedIn • On SlideShare • On Researchgate • On
  60. 60.
  61. 61.
  62. 62. And Mendeley
  63. 63. My Google Scholar Profile
  64. 64. My Co-author Graph on MAS..
  65. 65. Share Science!!! Not Just Yourself • Become a community contributor to science • Share your expertise in the new world of openness • Share your code • Share your data and your model • Share your Figures • Contribute to Wikis – Wikipedia and others • Become an Open Notebook Scientist
  66. 66. The Power of Blogs & Social Media
  67. 67. The Power of Blogs & Social Media
  68. 68. The Power of Blogs & Social Media
  69. 69. And into the AltMetrics World
  70. 70. And into the AltMetrics World
  71. 71. Social Networking for Scientists • The representation of YOU on the web is going to become increasingly important… • Engagement and participation are a choice… • Consider the value of contributing, both to you and to your community • Open Data, Curations, Annotations etc.
  72. 72. Conclusions • Online chemistry has exploded… • Each of you has the opportunity to contribute • Contributions will ultimately be credited to you and your scientific career • Imagine starting to build your online presence early and how it can benefit you • It is never too early to start actively building your profile and reputation
  73. 73. Thank you Email: Twitter: @ChemConnector Personal Blog: SLIDES:
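
Slides 23 to 28 use InChIKeys as internet search terms. As a minimal sketch, and assuming that the skeleton search on slides 26 to 28 corresponds to matching only the first block of a standard InChIKey (the 14-letter hash of the connectivity layers) while the full molecule search matches the whole key, the two query strings could be derived as below. The helper names (`split_inchikey`, `skeleton_query`, `full_query`, `same_skeleton`) and the placeholder key are hypothetical; they are not taken from the presentation or from any ChemSpider API.

```python
import re
from typing import NamedTuple

# Standard InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_PATTERN = re.compile(r"^([A-Z]{14})-([A-Z]{10})-([A-Z])$")


class InChIKeyParts(NamedTuple):
    skeleton: str     # first block: hash of the connectivity layers
    detail: str       # second block: hash of remaining layers plus flag and version letters
    protonation: str  # final letter: protonation indicator


def split_inchikey(key: str) -> InChIKeyParts:
    """Split a well-formed InChIKey into its three blocks."""
    match = INCHIKEY_PATTERN.match(key.strip().upper())
    if match is None:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    return InChIKeyParts(*match.groups())


def skeleton_query(key: str) -> str:
    """Search term that matches every record sharing the same skeleton."""
    return split_inchikey(key).skeleton


def full_query(key: str) -> str:
    """Search term that matches only the exact molecule."""
    return "-".join(split_inchikey(key))


def same_skeleton(key_a: str, key_b: str) -> bool:
    """True when two keys share the first (connectivity) block."""
    return skeleton_query(key_a) == skeleton_query(key_b)


if __name__ == "__main__":
    # Hypothetical placeholder key, NOT a real compound's InChIKey.
    placeholder = "A" * 14 + "-" + "B" * 8 + "SA" + "-N"
    print("skeleton search term:", skeleton_query(placeholder))
    print("full molecule search term:", full_query(placeholder))
```

Searching on the shorter first block is the broader query, which would be consistent with the skeleton search returning more hits (104) than the full molecule search (4) on slides 27 and 28.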