Feeding and consuming data to support open notebook science via the chem spider platform

We are all benefiting from a shift towards openness fed by Open Source, Open Standards, Open Data and Open Access. Open Notebook Science is likely the scientific revolution of the near term. As more scientists become comfortable with the concepts of openly sharing their experiments and data, often in near real time, we are seeing a shift to significant increases in the availability of new data that does not have to be extracted from publications but is available as data feeds that can be delivered to the community. This presentation will provide an overview of how the ChemSpider database from the RSC supports Open Notebook Science using programmatic access to both data and services and how ChemSpider ingests data feeds to mesh together with our existing database of over 27 million chemical compounds.

Published in Technology

Transcript

  • 1. Feeding and consuming data to support Open Notebook Science via the ChemSpider Platform. Antony Williams, Jean-Claude Bradley, Andrew Lang and Valery Tkachenko. ACS Philadelphia, August 2012
  • 2. Setting the Stage: Chemists want access to tools and data
      • The more capabilities the better
      • The more data the better
      • And give us an API with that…
      • And it should be free…
      • And constantly updated…
      • And all data should be Open…
      • And make it fully Open Source…
      • And it needs to be on my mobile…
  • 3. Setting the Stage: Chemists have access to tools and data
      • The more capabilities the better – we’ll see
      • The more data the better – changing daily
      • And give us an API with that… – not just one
      • And it should be free… – sure
      • And constantly updated… – indeed… please help!
      • And all data should be Open… – licensing
      • And make it fully Open Source… – kinda, sorta
      • And it needs to be on my mobile… – sure
  • 4. Welcome to ChemSpider
      • 5 years, 28 million chemicals, linking 400 data sources and growing daily
      • Hosted by the Royal Society of Chemistry
      • An important part of our long term strategic vision
      • Free to access
      • With lots/most/all (?) of the functionality necessary to support chemists and Open Notebook Science…
  • 5. Why Use ChemSpider?
  • 6. Why Use ChemSpider?
  • 7. Why Use ChemSpider?
  • 8. Why Use ChemSpider?
  • 9. Why Use ChemSpider? LINKING OUT
  • 10. Why Use ChemSpider?
  • 11. Why Use ChemSpider?
  • 12. Why Use ChemSpider?
  • 13. Why Use ChemSpider?
  • 14. Why Use ChemSpider?
  • 15. What about Syntheses?
  • 16. ChemSpider SyntheticPages
  • 17. Work in Progress – 300k Reactions
  • 18. Storing ONS Reactions
      • Working with JC Bradley to host ONS reactions
      • Linking directly back to ONS reactions
      • What if the links decay?
      • Host all related ONS data – benefits of Openness!
      • Future applications for RInChIs
  • 19. What we have been asked for
      • “Allow us to grab data”
      • “Let us link”
      • “Give us web services to integrate”
      • “Can we store our data with you?”
      • “Can you give us predictions to validate data?”
  • 20. What we have been asked for
      • “Allow us to grab data”
      • “Let us link”
      • “Give us web services to integrate”
      • “Can we store our data with you?”
      • “Can you give us predictions to validate data?”
      • “Can you build us an ELN?”
  • 21. Simple Linking to ChemSpider
      • Link using ChemSpiderID: http://www.chemspider.com/1234567
  • 22. ChemSpider IDs Proliferating Now
  • 23. Simple Querying Example
      • http://www.chemspider.com/Search.aspx?q=InChIKey=XXO
  • 24. Or InChI, or SMILES
      • http://www.chemspider.com/Search.aspx?q=InChI=1S m1/s1
      • http://www.chemspider.com/Search.aspx?q=Clc1ccc(cc1)C(O)=C3C(=O)C(=O)N([C@@H]3c2cccc(F)c2)CCc5c4ccccc4nc5
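
The two URL patterns on slides 21 to 24 (a compound page addressed by ChemSpider ID, and a search addressed by the `q` parameter of `Search.aspx`) can be assembled programmatically. The following is a minimal sketch, not ChemSpider's own client code. The function names `compound_url` and `search_url` are hypothetical, the check that IDs are positive integers is an assumption, and percent-encoding the query is also an assumption: the slides show the query unencoded, but characters such as `=`, `(` and `@` in InChI and SMILES strings are safer encoded when a URL is built in code.

```python
from urllib.parse import quote

# Host as shown on the slides.
BASE_URL = "http://www.chemspider.com"


def compound_url(chemspider_id: int) -> str:
    """Build a link to a compound page from its ChemSpider ID (slide 21 pattern)."""
    # Assumption: ChemSpider IDs are positive integers.
    if chemspider_id <= 0:
        raise ValueError("expected a positive ChemSpider ID")
    return f"{BASE_URL}/{chemspider_id}"


def search_url(query: str) -> str:
    """Build a search link for an InChIKey, InChI, or SMILES string (slides 23-24 pattern).

    Assumption: the query is percent-encoded so that structure strings
    survive being embedded in a URL.
    """
    return f"{BASE_URL}/Search.aspx?q={quote(query, safe='')}"


if __name__ == "__main__":
    print(compound_url(1234567))
    print(search_url("Clc1ccc(cc1)C(O)=C3C(=O)C(=O)N([C@@H]3c2cccc(F)c2)CCc5c4ccccc4nc5"))
```

Keeping the two builders separate mirrors the slides: an ID link is a stable reference to a known record, while a search link depends on how the query string is resolved at lookup time.
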
  • 25. Better to provide APIs….
  • 26. Various Flavors of API
  • 27. Various Flavors of API
  • 28. MANY Web Services for integration
  • 29. Feeding ONS Data into ChemSpider
      • ONS data can be deposited into ChemSpider and linked out to the ONS pages
      • Simply deposit structure(s) and links
  • 30. Feeding ONS Data into ChemSpider: ONS Solubility Challenge
  • 31. Feeding ONS Data into ChemSpider
  • 32. So isn’t ONS all about ELNs?
      • Open Notebook Science is about
          • Making records of research publicly available online as it is recorded
      • ONS is enabled by software tools and platforms
          • Keep the notebook of the researcher online with all raw and processed data as it is generated (close to or near real time)
          • Notebooks as Wikis, Commercial or Free ELNs published to the web (choose public/private – what data to expose)
  • 33. Feeding ELN Data into ChemSpider
      • Integrate e-Notebooks into ChemSpider
          • IDBS e-Workbook plug-in allows direct deposition of chemical structures
          • Can be extended to more ELN content
              • Spectra
              • Reactions
              • Properties etc.
          • Integration Video: http://tinyurl.com/9xnprqr
  • 34. Feeding ELN Data into ChemSpider
  • 35. How much data is lost?
      • How many reactions in a thesis never get published?
      • How many spectra of common materials could be shared?
      • How many properties are measured and lost?
      • What stands in the way of sharing?
          • Is it technology?
          • Permissions? “The Boss”, Licensing?
      • And yes – there are data quality issues, but there is algorithmic checking and data curation to help
  • 36. What could the future look like?
      • “Publicly funded” research data flows onto the web
      • Licensing is clear and NOT a challenge
      • Machines are picking up data and depositing
      • EXAMPLE project – Any interest?
          • Put your spectra/structure in folders (Dropbox)
          • ChemSpider robot scoops, processes and deposits – opportunity with JC Bradley
          • While processing also predicts spectra and compares for validation
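
The example project on slide 36 (spectra and structures dropped into a folder, then scooped up by a robot) can be sketched as a small scanning step. This is an illustration under stated assumptions, not the ChemSpider robot itself: the file extensions, the convention that a structure and its spectrum share a base name, and the names `find_pairs` and `process_folder` are all hypothetical. The `deposit` callback stands in for whatever processing, prediction and deposition step a real system would perform.

```python
from pathlib import Path

# Assumed file types; a real deployment would define its own.
STRUCTURE_EXTS = {".mol"}
SPECTRUM_EXTS = {".jdx", ".dx"}


def find_pairs(folder):
    """Pair structure and spectrum files in `folder` that share a base name."""
    structures = {}
    spectra = {}
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        ext = path.suffix.lower()
        if ext in STRUCTURE_EXTS:
            structures[path.stem] = path
        elif ext in SPECTRUM_EXTS:
            spectra[path.stem] = path
    # Keep only base names that have both a structure and a spectrum.
    return [(structures[name], spectra[name])
            for name in sorted(structures) if name in spectra]


def process_folder(folder, deposit):
    """Call deposit(structure_path, spectrum_path) for each pair; return the pair count."""
    pairs = find_pairs(folder)
    for structure, spectrum in pairs:
        deposit(structure, spectrum)
    return len(pairs)
```

A caller would pass its own deposition function, for example `process_folder("shared_folder", my_deposit)`, where both the folder name and `my_deposit` are placeholders. Files without a matching partner are skipped rather than deposited, which leaves room for the validation step the slide mentions.
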
  • 37. Leaving the Stage: Chemists have access to tools and data
      • The more capabilities the better – what’s missing?
      • The more data the better – anyone want to share?
      • And give us an API with that… – ask us for help
      • And it should be free… – it is
      • And constantly updated… – help annotate/curate
      • And all data should be Open… – licensing
      • And make it fully Open Source… – book chapter
      • And it needs to be on my mobile… – it is
  • 38. ChemSpider Mobile
  • 39. New URLs to try out
      • ChemSpider Reactions: www.chemspider.com/reactions
      • ChemSpider Validation and Standardization Platform: www.chemspider.com/cvsp
      • ChemSpider Google: www.chemspider.com/google
  • 40. ChemSpider Google
  • 41. ChemSpider Google
  • 42. Acknowledgments
      • RSC Cheminformatics team
      • JC Bradley’s lab
      • Daniel Lowe – reactions
      • Commercial Software – GGA Software, ACD/Labs, OpenEye
      • Open Source Components
  • 43. Thank you
      • Email: williamsa@rsc.org
      • Blog: www.chemconnector.com
      • SLIDES: www.slideshare.net/AntonyWilliams