The Great Promise of Online Data for Chemistry and the Life Sciences

This is the presentation I gave at the Silverchair Colloquium at Keswick Hall in Charlottesville.

Published in: Technology, Education

  1. The Great Promise of Online Data for Chemistry and the Life Sciences. Antony J Williams, Silverchair Colloquium 2012
  2. READ FAST – IT’S HAPPENING NOW. 20 minutes, >40 slides. Disruption Can be Cheap, Fast and Unexpectedly Successful
  3. Online Chemistry Databases in 2007
  4. A search gave LOTS of “info”… What is Yohimbine?
  5. For chemists…try filtering!
  6. Why not Index the web of chemistry?
     • Build a search engine for chemistry
     • Index all public domain chemicals and link
     • Build a structure searchable web
     • Crowdsource new chemistry from the community
     • Crowdsource curation and annotation
  7. Create a structure-centric hub
  8. Answering Real Questions. Questions a chemist might ask…
     • What is the melting point of n-heptanol?
     • What is the chemical structure of Xanax?
     • Chemically, what is phenolphthalein?
     • What are the stereocenters of cholesterol?
     • Where can I find publications about xylene?
     • What are the different trade names for Ketoconazole?
     • What is the NMR spectrum of Aspirin?
     • What are the safety handling issues for Thymol Blue?
  9. The World of Online Chemistry
     • Safety data
     • Toxicity data
     • Blogs and Wikis
     • Property databases
     • Experimental results
     • Scientific publications
     • Compound aggregators
     • Open Notebook Science
     • Metabolic pathway databases
     • Encyclopedic articles (Wikipedia)
  10. Linked Data for Life Sciences growing…
  11. Solve Real World Problems
      • Provide programmable interface against content
      • Provide a chemistry database tuned to integrators
  12. RSC and ChemSpider – May 2009
  13. Why RSC acquired ChemSpider
      • Commitment to serve the community
      • Bring cheminformatics expertise in-house
      • Add additional data to publications
      • Potential freemium model – web services, data
      • Because data is critical to science
  14. Making sense of data is overwhelming
  15. Publications are Hosts to Data
  16. Data has value, is Free, is Open
      • Data cannot be copyrighted. A particular expression of data, such as a chart or table in a publication, can be.
      • Data licensing is being dealt with and openness encouraged
      • Research data mandates are starting…
      • Who will manage the integration and curation and keep the access FREE!
  17. Tell me about Yohimbine…
  18. Of course it is out there…
  19. SOME Chemistry Databases in 2012
  20. Tell me more…but…
      • Where can I find the electronic structure?
      • Papers/Patents about Yohimbine?
      • What are the side effects of Yohimbine?
      • Where can I order Yohimbine?
      • What are the physicochemical properties?
      • What are the associated metabolic pathways?
      • Different synonyms of Yohimbine?
      • Are there side effects with Yohimbine?
      ChemSpider links all of this information and more
  21. Yohimbine on ChemSpider
  22. RSC Databases are Integrated
  23. RSC Journals are Integrated
  24. Patents are Linked
  25. Google Books are Integrated
  26. And so are…
      • Chemical vendors
      • Safety and Toxicity information
      • Experimental and Predicted properties
      • Analytical data
      • Images and Movies
      And all for free…
  27. And all “mobile”
  28. Not only compounds but syntheses
  29. And analytical data…
  30. The world can take and contribute
      • Scientists can deposit their data
      • They can annotate and curate
      • They can download data
      • They can embed data in the social network
      • They can integrate and connect
  31. Integrate to electronic lab notebooks
  32. Integrate to electronic lab notebooks
  33. Integrate to instruments and software
      • Primary analytical instrumentation vendors integrate: Agilent, Bruker, Thermo, Waters
      • Cheminformatics vendors link to ChemSpider: Accelrys, ACD/Labs, ChemAxon, iChemLabs
  34. Publications are a summary of work
      • Scientific publications are a summary of work
        • Is all work reported?
        • How much science is lost to pruning?
        • What of value sits in notebooks and is lost?
      • How much data is lost?
        • How many compounds never reported?
        • How many syntheses fail or succeed?
        • How many characterization measurements?
  35. What if we could capture it all?
  36. Start with data in publications
  37. But in the time of Big Data…it’s linked!
  38. ONE example – data for life sciences
      • What’s the structure?
      • Are they in our file?
      • What’s similar?
      • What’s the target?
      • Pharmacology data?
      • Known Pathways?
      • Working On Now?
      • Connections to disease?
      • Expressed in right cell type?
      • Competitors?
      • IP?
  39. Crowdsourcing across drug discovery
      • Open PHACTS: partnership between European Community and European Pharma Companies
      • 22 partners, 8 pharmaceutical companies, 3 biotechs working together for 3 years
      • Freely accessible for knowledge discovery and verification.
        • Data on chemistry and biology
        • Pharmacological profiles
        • Proprietary and public data sources.
  40. All that glisters is not gold…
  41. Crowdsourced Assertions
      • The future of publishing will include generation and consumption of “nanopublications”
      • http://www.nanopub.org/
  42. Nanopublications??
  43. So what’s the business model?
      • Decisions are based on data
      • Publications encapsulate, reference and link data
      • More data is free and open.
      • More services and APIs allow access – free or for fee. Ask Google
      • The large-scale licensed content business model is at risk without interfaces to integrate and mine
  44. Acknowledgments
      • The RSC ChemSpider team
      • Our users, our depositors, our curators
      • GGA Software Services, OpenEye, ACD/Labs and a lot of Open Source code!
      • And Al Gore for supporting the internet: http://en.wikipedia.org/wiki/Al_Gore_and_information_techn
  45. Thank you
      • Email: williamsa@rsc.org
      • Twitter: ChemConnector
      • Personal Blog: www.chemconnector.com
      • SLIDES: www.slideshare.net/AntonyWilliams