Building a Foundation for the Semantic Web by Hosting a Crowdsourced Databasing Platform for Chemistry

Free and open access resources for chemists are increasingly available on the internet. Coupled with the growing availability of Open Source software tools, this puts us in the middle of a revolution in data availability and in the tools to manipulate these data. ChemSpider is a free access website built to provide a structure-centric community for chemists. It was developed to aggregate and index available sources of chemical structures and their associated information into a single searchable repository and to make it available to everybody, at no charge.
There are tens if not hundreds of chemical structure databases, covering literature data, chemical vendor catalogs, molecular properties, environmental data, toxicity data, analytical data and more, and no single way to search across them. Although a large number of databases containing chemical compounds and data were available online, their quality, accuracy and completeness were lacking in many regards. The intention with ChemSpider was to provide a platform through which the chemistry community could help clean up the data, improve the quality of data online and expand the information available to include reaction syntheses, analytical data, experimental properties and links to other valuable resources. It has grown into a resource containing over 21 million unique chemical structures from over 200 data sources.
ChemSpider has enabled real-time curation of the data, association of analytical data with chemical structures, real-time deposition of single or batch chemical structures (including with activity data) and transaction-based predictions of physicochemical data. The social community aspects of the system demonstrate the potential of this approach. Curation continues daily, and thousands of edits and depositions by members of the community have dramatically improved the quality of the data relative to other public resources for chemistry.
This presentation will provide an overview of the history of ChemSpider, the present capabilities of the platform and how it can become one of the primary foundations of the semantic web for chemistry. It will also discuss some of the present projects underway since the acquisition of ChemSpider by the Royal Society of Chemistry.

Published in: Technology, Education

  1. 1. Building a Foundation for the Semantic Web by Hosting a Crowdsourced Databasing Platform for Chemistry
  2. 2. What’s the Status of Chemistry online?
      • Encyclopedic articles (Wikipedia)
      • Chemical vendor databases
      • Metabolic pathway databases
      • Virtual Screening databases
      • Property databases
      • Screening assay results
      • Patents with chemical structures (IBM & SureChem)
      • ADME/Tox data
      • Scientific publications
      • Compound aggregators
      • Blogs/Wikis and Open Notebook Science
  3. 3. For Synthesis…TotallySynthetic.com
  4. 4. Org Prep Daily (Blog)
  5. 5. Molbank (Open Access Journal)
  6. 6. Synthetic Pages (Website)
  7. 7. Encyclopedic Articles (Wikipedia)
  8. 8. Lots of “Public Compound” Databases
      • PubChem
      • DrugBank
      • ChEBI/ChEMBL
      • KEGG
      • LIPID MAPS
      • ChemIDplus
      • eMolecules
      • ZINC
      • Lots of chemical vendors
      • ChemSpider
  9. 9. Where Would You Look? What Do You Trust?
  10. 10. Linked Data on the Web Taken from: Rafael Sidis’ Blog
  11. 11. Compounds and Identifiers
  12. 12. Connecting Chemistry on the Web
      • Connections are made
          • actively by purposeful linking
          • via semantic web links (i.e. RDF and triple stores)
          • by integrating via identifiers (InChI, CAS #, names and web service lookups)
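Slide 12 names two routes to connection: shared identifiers and semantic web triples. The sketch below shows how the first can feed the second. It is a minimal illustration under stated assumptions, not how ChemSpider is implemented: the record URLs, the placeholder identifier strings and the `ex:sameCompoundAs` predicate are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: the URLs and identifier strings are placeholders,
# not entries from any real database.
RECORDS = [
    {"url": "http://example.org/a/1", "inchikey": "AAAA-PLACEHOLDER"},
    {"url": "http://example.org/b/7", "inchikey": "AAAA-PLACEHOLDER"},
    {"url": "http://example.org/c/3", "inchikey": "BBBB-PLACEHOLDER"},
]


def group_by_identifier(records, key="inchikey"):
    """Collect the record URLs that share the same identifier value."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["url"])
    return dict(groups)


def same_as_triples(groups, predicate="ex:sameCompoundAs"):
    """Emit (subject, predicate, object) triples for records sharing an identifier.

    The predicate name is an assumption made for this sketch.
    """
    triples = []
    for urls in groups.values():
        for left, right in combinations(sorted(urls), 2):
            triples.append((left, predicate, right))
    return triples


if __name__ == "__main__":
    for triple in same_as_triples(group_by_identifier(RECORDS)):
        print(triple)
```

Records that share an identifier produce a link; a record with a unique identifier produces none. The links are therefore only as reliable as the identifiers each source publishes.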
  13. 13. Connections Can Lead Anywhere
  14. 14. The FUTURE of Chemistry on the Web
      • The internet is searchable by chemical structure and substructure (e.g. Wikipedia, Google Scholar)
      • Chemistry articles are indexed and searchable by a free online service
      • Publicly funded research data can be shared and discussed in the open, maybe as ONS?
      • There is structure-based access to property data (ADME/Tox), spectra, clean name-structure dictionaries
  15. 15. What is ChemSpider?
      • ChemSpider is:
          • Building a Structure Centric Community for Chemists
          • >20 million compounds, >200 data sources
          • A deposition and curation platform
          • A publishing platform for the community
          • Grows daily – more depositions, more links, more data sources
  16. 16. Search Cholesterol
  17. 17. Search Cholesterol
  18. 18. Search Cholesterol
  19. 19. Search Cholesterol
  20. 20. Search Cholesterol
  21. 21. Search Cholesterol
  22. 22. Linked across the internet
  23. 23. Kyoto Encyclopedia of Genes and Genomes
  24. 24. Link off a structure in ChemSpider
      • Chemical suppliers
      • Other publications
      • Analytical Data
      • Related Reactions
      • Wikipedia
      • Patents
      • “Everything”
  25. 25. Links to Patents based on structure
  26. 26. Clickthrough to Patent (SureChem)
  27. 27. Pubmed Articles Linked
  28. 28. Answering Questions for Chemists
      • Questions a chemist might ask…
          • What is the melting point of n-butanol?
          • What is the chemical structure of Xanax?
          • Chemically, what is phenolphthalein?
          • What are the stereocenters of cholesterol?
          • Where can I find publications about xylene?
          • What are the different trade names for Ketoconazole?
          • What is the NMR spectrum of Aspirin?
          • What are the safety handling issues for Thymol Blue?
  29. 29. Complex Data and Information
  30. 30. ChemSpider Searches
  31. 31. ChemSpider Searches
  32. 32. ChemSpider Complex Searches
  33. 33. Question Everything online: www.dhmo.org
  34. 34. Searching Dihydrogen Monoxide…
  35. 35. Politicians, Chemistry and DHMO
      • March 2004: Aliso Viejo, California considered banning foam containers at city-sponsored events because dihydrogen monoxide is part of their production.
      • 2007: Jacqui Dean, New Zealand National Party MP, wrote a letter to Associate Minister of Health Jim Anderton asking “Does the Expert Advisory Committee on Drugs have a view on the banning of this drug?”
  36. 36. Caution! Question Everything!
  37. 37. PubChem
  38. 38. Vancomycin
      • Who will curate?
      • PubChem is not resourced to clean these errors
      • How would you clean such a large dataset?
  39. 39. Vancomycin on ChemSpider 1 compound – discussions over 3 days
  40. 40. The EXPERTS must get it right?!
  41. 41. Wikipedia, C&E News, PubChem
      • C&E News (from ACS)
  42. 42. Feedback from Steve Ritter
      • “As for where we source our structures, our primary source is the researcher and peer-reviewed papers, because many compounds are novel.
      • ..we always double check them against one or more primary sources, typically Merck Index and SciFinder.
      • Although CAS and C&EN are both part of the ACS Publications Division, we at C&EN still have to pay for our SciFinder access, strangely enough.”
  43. 43. Feedback from Steve Ritter
      • “As a rule, we at C&EN don’t use Wikipedia as a primary source for structures or chemical information, and I recommend that policy to anyone.”
      • “It would be nice to have an authoritative web-based source of standard, well-drawn structures for chemists to go to so they can freely cut and paste structures into their papers, PowerPoint presentations, and anything else they might need. Maybe Wikipedia will be that source one day.”
  44. 44. What About Digitonin?
  45. 45. Comments on the Blog
      • Kirill Degtyarenko says:
      • September 15th, 2009 at 1:57 pm: It looks like both ChEBI and Wikipedia structures are wrong as far as aglycon is concerned. According to http://www3.interscience.wiley.com/journal/20330/abstract
      • “… for the first time to confirm beyond all doubt the structure suggested by Tschesche and Wulff for digitonin by means of modern NMR techniques, and to assign all proton and carbon resonances.” Structure 1 shows methyl group at C-20 going UP, i.e. 20β (while by default spirostan is 20α).
  46. 46. CAS as an authority
  47. 47. The Blogging Community Participate
  48. 48. Will it ever end?
      • The community says the structure of digitonin has “up” 20-Methyl.
      • If so, then multiple substances related to digitonin have OPPOSITE stereo at 20-Methyl
      • The spirostane skeleton has a “down” Methyl group so all spirostane-related structures would be wrong
  49. 49. The FDA’s DailyMed
  50. 50. Structures on DailyMed
  51. 51. Lack of Stereochemistry
  52. 52. Incorrect Structures
  53. 53. Wow!
  54. 54. Collaborative Knowledge Management
  55. 55. Drugbank
  56. 56. Taxol on PubChem
  57. 57. The InChI Identifier
  58. 58. Multiple Layers
  59. 59. InChIStrings Hash to InChIKeys
  60. 60. InChIs for Taxol
  61. 61. Back to Taxol
      • DrugBank: RCINICONZNJXQF-CLDWUXIMDD
      • ChEBI: RCINICONZNJXQF-GXKQXQCDDN
      • Wikipedia: RCINICONZNJXQF-MZXODVADBJ
      • Which one is correct???
  62. 62. InChIKeys for Taxol
      • DrugBank: RCINICONZNJXQF-CLDWUXIMDD
      • ChEBI: RCINICONZNJXQF-GXKQXQCDDN
      • Wikipedia: RCINICONZNJXQF-MZXODVADBJ
      • ChEBI and Wikipedia are the SAME structure
      • Drugbank is a DIFFERENT structure – ONE stereocenter
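The keys on slides 61 and 62 can be compared block by block. In an InChIKey, the block before the first hyphen is a hash of the molecular skeleton (connectivity); the characters after it cover the remaining layers, such as stereochemistry. The sketch below is plain string handling. The helper names are made up for this illustration, and no InChI software is involved.

```python
# Keys as listed on the slides.
TAXOL_KEYS = {
    "DrugBank": "RCINICONZNJXQF-CLDWUXIMDD",
    "ChEBI": "RCINICONZNJXQF-GXKQXQCDDN",
    "Wikipedia": "RCINICONZNJXQF-MZXODVADBJ",
}


def split_inchikey(key):
    """Split a key into its connectivity block and everything after the first hyphen."""
    connectivity, _, remainder = key.partition("-")
    return connectivity, remainder


def compare_keys(keys):
    """Return (same_skeleton, remainder_by_source) for a mapping of source -> key."""
    connectivity_blocks = {split_inchikey(key)[0] for key in keys.values()}
    remainders = {source: split_inchikey(key)[1] for source, key in keys.items()}
    return len(connectivity_blocks) == 1, remainders


if __name__ == "__main__":
    same_skeleton, remainders = compare_keys(TAXOL_KEYS)
    print("Same connectivity block:", same_skeleton)
    for source, remainder in remainders.items():
        print(source, remainder)
```

All three keys share the first block, so any disagreement between the sources lies in the later layers rather than in the skeleton. A string comparison can show that two records differ. Deciding which depiction is right still requires checking the structures themselves.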
  63. 63. Does one stereocenter matter?
  64. 64. Does one stereocenter matter?
      • Distaval, Talimol, Nibrol, Sedimide, Quietoplex, Contergan, Neurosedyn, and Softenon
  65. 65. Does one stereocenter matter?
      • Distaval, Talimol, Nibrol, Sedimide, Quietoplex, Contergan, Neurosedyn, and Softenon
  66. 66. Building a Structure Centric Community for Chemists
  67. 67. Content is King and Quality Costs
      • Chemistry “content” is big money
          • Patent searching
          • Structures and properties
          • Drug databases
          • Literature databases
      • Chemical Abstracts Service (CAS), the “Gold Standard” in chemistry-related information
          • 101 years of content
          • $260 million revenue (2006)
          • >50 million substances
          • >60 million sequences
  68. 68. ChemSpider Searches
  69. 69. ChemSpider Searches
  70. 70. InChIKey Searches Work
  71. 71. The InChI “Resolver”
  72. 72. Assertion and Chemical Entities
      • Who says what Taxol is?
      • What is the “timeline” for a molecule?
      • How do we clean up the Public data?
  73. 73. Crowdsourcing
  74. 74. Depositions From Crowds
      • CAS indexes published literature, patents and chemical vendors (indexes ChemSpider too)
      • “Lost Chemistry”
          • syntheses in theses
          • lab notebooks
          • compounds in private collections
      • ChemSpider accepts public depositions: structures, text, spectra, images.
  75. 75. Crowd-sourcing Chemistry Curation
      • Crowd-sourced curation: identify/tag errors, edit names, synonyms, identify records to deprecate
  76. 76. “ Lathosterol”
  77. 77. “ Lathosterol”
  78. 78. “ Lathosterol”
  79. 79. “ Lathosterol” Removed
  80. 81. “ Lathosterol” on PubChem
  81. 82. “ Lathosterol” on PubChem
  82. 83. “ Lathosterol” on PubChem
  83. 84. Validate Names for PubMed “Caldarchaeol”
  84. 85. Publishers are experimenting
      • … invited researchers to prototype tools dealing with the ever-increasing amount of online life sciences information
      • The winners built:
      • Reflect: Automated Annotation of Scientific Terms: http://reflect.ws
  85. 86. Reflect
  86. 87. Entity-Extraction, Mark-up, Annotate
  87. 88. Success Depends on Dictionaries
  88. 89. Structure Searching Articles…
      • Searching articles based on chemical structure and substructure is very expensive.. but is changing
      • The web IS ready - when will publishers deliver?
          • Structures can be shown
          • Spectra can be interactive
          • Graphics don’t need to be static
          • Publishers can enhance their articles
  89. 90. Semantic Mark-up for Chemistry
      • Semantic mark-up for chemistry is here
          • RSC Project Prospect (structure linking, IUPAC Gold Book ontology and other ontologies)
          • Nature Publishing Group compound linking
          • ChemSpider Journal of Chemistry
  90. 91. Nature Chemistry Compound Pages
  91. 92. ChemSpider and Publishing
      • Curation led to a set of validated dictionaries
      • Integrated best-in-class entity extraction with validated name dictionaries
      • Additional dictionaries gave reactions, groups, families, hardware and software vendors etc
  92. 93. ChemMantis and CJOC
  93. 94. Name-Structure Pairs
  94. 95. Converting Detected Names…
      • Names are searched against a validated dictionary (this expands as ChemSpider is curated)
      • If not found then they are passed through a Name to Structure algorithm
      • If they cannot convert then ChemSpider is searched for non-validated names
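The three-step fallback on slide 95 can be sketched as a single lookup function. This is a minimal illustration under stated assumptions, not ChemSpider's implementation: the dictionaries are tiny stand-ins, `toy_name_to_structure` is a hypothetical stub in place of a real name-to-structure algorithm, and structures are written as SMILES strings purely for convenience.

```python
def resolve_name(name, validated, name_to_structure, unvalidated):
    """Resolve a detected chemical name, trying the most trusted source first.

    Returns (structure, origin); structure is None if every step fails.
    """
    key = name.strip().lower()

    # Step 1: the validated (curated) dictionary.
    if key in validated:
        return validated[key], "validated dictionary"

    # Step 2: algorithmic conversion of a systematic name.
    structure = name_to_structure(name)
    if structure is not None:
        return structure, "name-to-structure"

    # Step 3: names that exist in the database but have not been validated.
    if key in unvalidated:
        return unvalidated[key], "non-validated name"

    return None, "unresolved"


def toy_name_to_structure(name):
    """Hypothetical stand-in for a real converter; knows only two names."""
    rules = {"methane": "C", "ethanol": "CCO"}
    return rules.get(name.strip().lower())


# Illustrative dictionaries, not real ChemSpider content.
VALIDATED = {"water": "O"}
UNVALIDATED = {"dihydrogen monoxide": "O"}

if __name__ == "__main__":
    for detected in ["Water", "ethanol", "dihydrogen monoxide", "unobtainium"]:
        print(detected, resolve_name(detected, VALIDATED, toy_name_to_structure, UNVALIDATED))
```

Returning the origin alongside the structure keeps track of how much trust each hit deserves. That matters because the validated dictionary grows as curation proceeds.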
  95. 96. Deposit Structures
  96. 97. Custom Dictionaries
      • Entity Extraction built around modified algorithms from SureChem
      • Optimized for “publications”
      • Dictionaries for chemical entities, groups, reactions, elements, families, species…
      • Dictionaries can be expanded
  97. 98. Species – linked to Wikipedia
  98. 99. Semantic Linking of Structures
      • What would you want to link off a structure?
          • Chemical suppliers
          • Other publications
          • Analytical Data
          • Related Reactions
          • Wikipedia
          • Patents
          • “Everything”
  99. 100. RSC’s Project Prospect
  100. 101. In Development: ChemSpider Synthesis
      • ChemSpider Synthesis will be a home for all things “synthetic”
      • An online resource for synthetic procedures from blogs, other online resources, RSC supplementary info, other publishers etc.
      • Public peer-review and feedback for synthetic procedures
  101. 102. RSC Supplementary Info
  102. 103. RSC Supplementary Info
  103. 104. Online Journals and Live Data
  104. 105. ChemSpider Everywhere: Embed
  105. 106. ChemSpider Everywhere: Spectral Game
  106. 107. ChemSpider Everywhere: Crowdsourced Curation of Spectra
  107. 108. ChemSpider Everywhere: ChemMobi
  108. 109. ChemSpider Everywhere
      • Linked from Wikipedia
      • Linked from Open Notebook Science sites
      • Linked from Blogs using Structure/Spectra EMBED
      • Integrated into structure drawing packages such as ACD/ChemSketch, Symyx Draw, Open Source applets
      • Integrated into software offerings from Thermo, Waters, Agilent, Bruker
  109. 111. It’s a long road ahead…
  110. 112. More representative…
  111. 113. Where is ChemSpider Lacking?
      • ChemSpider is limited to “defined chemicals”. No support for:
          • Polymers
          • Minerals
          • Markush structures
      • ChemSpider is very dependent on InChIs
          • Stereochemistry around non-carbon centers
          • Organometallics are not correctly represented
      • There are millions of errors on ChemSpider
  112. 114. What’s next?
      • Keep cleaning and depositing data
      • Deprecate/redeposit at PubChem
      • Layer on RDF for the semantic web
      • Integrate software: Symyx Jdraw, NMRShiftDB
      • Integrate RSC content – a massive archive!
      • Integrate RSC publishing workflows and databases
  113. 115. You are invited..
      • Curate ChemSpider data and link to us
      • Deposit your data with us
          • Structures
          • Spectra
          • Synthesis procedures
      • ChemSpider Synthesis is under development
      • Help assert what is Digitonin?
  114. 116. How Was ChemSpider Built?
      • ChemSpider was a “hobby project”
      • Housed in a basement and running off three servers – one bought, two built
      • Sensitive to weather and power stability
      • Went live at ACS Spring 2007 in Chicago
  115. 117. Not in a basement now...
  116. 118. And the Downside
  117. 119. Thank you
      • [email_address]
      • Twitter: ChemSpiderman
      • www.chemspider.com/blog
      • SLIDES: www.slideshare.net/AntonyWilliams