Why not index the web of chemistry? Build a search engine for chemistry Index all public domain chemicals and link Build a structure searchable web Crowdsource new chemistry from the community Crowdsource curation and annotation
Answering Real Questions Questions a chemist might ask… What is the melting point of n-heptanol? What is the chemical structure of Xanax? Chemically, what is phenolphthalein? What are the stereocenters of cholesterol? Where can I find publications about xylene? What are the different trade names for Ketoconazole? What is the NMR spectrum of Aspirin? What are the safety handling issues for Thymol Blue?
The World of Online Chemistry Safety data Toxicity data Blogs and Wikis Property databases Experimental results Scientific publications Compound aggregators Open Notebook Science Metabolic pathway databases Encyclopedic articles (Wikipedia)
Why RSC acquired ChemSpider Commitment to serve the community Bring cheminformatics expertise in-house Add additional data to publications Potential freemium model – web services, data Because data is critical to science
Data has value, is Free, is Open Data cannot be copyrighted. A particular expression of data, such as a chart or table in a publication, can be. Data licensing is being dealt with and openness encouraged Research data mandates are starting… Who will manage the integration and curation and keep the access FREE?
Tell me more…but… Where can I find the electronic structure? Papers/Patents about Yohimbine? What are the side effects of Yohimbine? Where can I order Yohimbine? What are the physicochemical properties? What are the associated metabolic pathways? Different synonyms of Yohimbine? Are there side effects with Yohimbine? ChemSpider links all of this information and more
The world can take and contribute Scientists can deposit their data They can annotate and curate They can download data They can embed data in the social network They can integrate and connect
Integrate with instruments and software Primary analytical instrumentation vendors integrate: Agilent, Bruker, Thermo, Waters. Cheminformatics vendors link to ChemSpider: Accelrys, ACD/Labs, ChemAxon, iChemLabs.
Publications are a summary of work Scientific publications are a summary of work Is all work reported? How much science is lost to pruning? What of value sits in notebooks and is lost? How much data is lost? How many compounds never reported? How many syntheses fail or succeed? How many characterization measurements?
ONE example – data for life sciences What’s the structure? Are they in our file? What’s similar? What’s the target? Pharmacology data? Known pathways? Working on now? Connections to disease? Expressed in right cell type? Competitors? IP?
Crowdsourcing across drug discovery Open PHACTS: partnership between European Community and European Pharma Companies 22 partners, 8 pharmaceutical companies, 3 biotechs working together for 3 years Freely accessible for knowledge discovery and verification. Data on chemistry and biology Pharmacological profiles Proprietary and public data sources.
So what’s the business model? Decisions are based on data Publications encapsulate, reference and link data More data is free and open. More services and APIs allow access – free or for fee. Ask Google The large-scale licensed content business model is at risk without interfaces to integrate and mine
Acknowledgments The RSC ChemSpider team Our users, our depositors, our curators GGA Software Services, OpenEye, ACD/Labs and a lot of Open Source code! And Al Gore for supporting the internet http://en.wikipedia.org/wiki/Al_Gore_and_information_techn