Why Chemistry and the Web Will Benefit from a ChemSpider
ChemSpider is a free-access website built with the vision of providing a structure-centric community for chemists. Vision is great…execution is better. ChemSpider is now one of the internet’s primary portals for chemistry, offering access to over 23 million unique chemical structures from over 200 data sources and expanding daily. There are tens, if not hundreds, of chemical structure databases covering literature data, chemical vendor catalogs, molecular properties, environmental data, toxicity data, analytical data and more, yet there has been no single way to search across them. And although many databases of chemical compounds and data are available online, their quality, accuracy and completeness remain lacking in many regards. With ChemSpider we have provided a platform through which the chemistry community can help clean up the data, improve the quality of data online and expand the information available to include reaction syntheses, analytical data, experimental properties and links to other valuable resources.
This presentation will provide an overview of ChemSpider, its value to chemists as a search tool and as a public repository of information, and how it can become one of the primary foundations of internet-based chemistry. I will also discuss the vision for ChemSpider and some of the exciting goals we are setting for the system moving forward.

Published in: Technology, Education
  • This is a list of some of the things an MS scientist might want to do and some of the queries we have already experienced
  • Why Chemistry and the Web Will Benefit from a ChemSpider

    1. 1. Why Chemistry and the Web Will Benefit from a ChemSpider
    2. 2. The FUTURE of Chemistry on the Web
        • The internet is searchable by chemical structure and substructure
        • Chemistry articles are indexed and searchable by a free online service
        • Research data is shared and discussed in the Open
        • “Open Notebook Science” is mainstream
        • There are new “Authorities” in Chemistry: Wikipedia, Google Scholar, ChemSpider (?)
    3. 3. Citizen Scientists Enable the Web
        • Who is writing about chemical compounds on Wikipedia?
        • Who is writing critical reviews of Chemistry online?
        • Who is blogging about chemistry on the web?
    4. 4. For Synthesis…TotallySynthetic.com
    5. 5. Org Prep Daily (Blog)
    6. 6. Molbank (Open Access Journal)
    7. 7. Synthetic Pages (Website)
    8. 8. Encyclopedic Articles (Wikipedia)
    9. 10. Chemistry online – An Overview
        • Encyclopedic articles (Wikipedia)
        • Chemical vendor databases
        • Metabolic pathway databases
        • Property databases
        • Chemical Synthesis procedures
        • Scientific publications
        • Chemical vendors
        • Blogs
        • Wikis
        • Open Notebook Science
    10. 11. What and who do you trust?
    11. 12. Compounds and Identifiers
    12. 13. Connecting Chemistry on the Web
        • Connections are made
            • actively by purposeful linking
            • via semantic web links (i.e. RDF and triple stores)
            • by integrating via identifiers (InChI, CAS #, names and web service look ups)
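The idea of connecting resources through a shared identifier can be sketched in a few lines of Python. This is only an illustration, not how ChemSpider or any triple store is implemented: the record lists, the predicate names `soldBy` and `mentionedIn`, and the `example.org` URLs are all hypothetical. Only the two InChI strings are real identifiers.

```python
from collections import defaultdict

# Real standard InChI strings, used as the shared identifier.
WATER = "InChI=1S/H2O/h1H2"
ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"

# Hypothetical records from two unrelated sources (URLs are placeholders).
vendor_records = [
    {"inchi": ETHANOL, "url": "https://vendor.example.org/catalog/ethanol"},
]
article_records = [
    {"inchi": ETHANOL, "url": "https://journal.example.org/articles/1"},
    {"inchi": WATER, "url": "https://journal.example.org/articles/2"},
]


def to_triples(records, predicate):
    """Turn source records into (subject, predicate, object) triples."""
    return [(r["inchi"], predicate, r["url"]) for r in records]


def links_for(triples, subject):
    """Collect every object linked to one subject, grouped by predicate."""
    grouped = defaultdict(list)
    for s, p, o in triples:
        if s == subject:
            grouped[p].append(o)
    return dict(grouped)


# Merging sources is just concatenation, because both use the same identifier.
triples = to_triples(vendor_records, "soldBy") + to_triples(article_records, "mentionedIn")
ethanol_links = links_for(triples, ETHANOL)
```

Because both sources key their records on the same InChI, neither needs to know about the other for the links to line up. That is the point of integrating via identifiers.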
    13. 14. Linked Data on the Web (taken from Rafael Sidis’ Blog)
    14. 15. Connections Can Lead Anywhere
    15. 16. What is ChemSpider?
        • ChemSpider is:
            • Building a Structure Centric Community for Chemists
            • >23 million compounds, ca. 250 data sources
            • A deposition and curation platform
            • A publishing platform for the community
            • Grows daily: more depositions, more links, more data sources
    16. 17. Search Cholesterol
    17. 18. Search Cholesterol
    18. 19. Search Cholesterol
    19. 20. Search Cholesterol
    20. 21. Search Cholesterol
    21. 22. Linked across the internet
    22. 23. Link off a structure in ChemSpider
        • Chemical suppliers
        • Other publications
        • Analytical Data
        • Related Reactions
        • Wikipedia
        • Patents
        • “Everything”
    23. 24. Linked to Millions of Articles
    24. 25. Answering Questions for Chemists
        • Questions a chemist might ask…
            • What is the melting point of n-butanol?
            • What is the chemical structure of Xanax?
            • Chemically, what is phenolphthalein?
            • What are the stereocenters of cholesterol?
            • Where can I find publications about xylene?
            • What are the different trade names for Ketoconazole?
            • What is the NMR spectrum of Aspirin?
            • What are the safety handling issues for Thymol Blue?
    25. 26. Complex Data and Information
    26. 27. Online Analytical Data
    27. 28. Various Searches
        • Structure searching
        • Substructure searching
        • Subset searching (choose from 200 data sources)
        • Property searching
        • Searches are used in various ways by different types of chemists…
    28. 29. ChemSpider for Mass Spectrometrists
        • What would a mass spectrometrist want to do?
            • Search the database based on mass (various forms)
            • Search selected subsets of the database based on mass
            • Search for structure based on name(s) or database IDs
            • Download the structure/structures in standard format
            • Identify related data sources: chemical vendors, pathway databases, etc.
    29. 30. Search Database Based on Mass
    30. 31. Mass Based Searches?
        • What compounds have a mass of 300+/-0.001?
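A mass-window query of this kind can be illustrated with a short Python sketch. It does not describe ChemSpider's actual search engine. It assumes a tiny in-memory index of three well-known compounds, computes monoisotopic masses from their molecular formulas, and uses binary search on the sorted masses to find everything inside target ± tolerance. The names `COMPOUNDS`, `monoisotopic_mass` and `search_by_mass` are illustrative.

```python
from bisect import bisect_left, bisect_right

# Monoisotopic masses (u) of the most abundant isotopes, truncated for brevity.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Molecular formulas of three familiar compounds, as element counts.
COMPOUNDS = {
    "aspirin": {"C": 9, "H": 8, "O": 4},
    "caffeine": {"C": 8, "H": 10, "N": 4, "O": 2},
    "cholesterol": {"C": 27, "H": 46, "O": 1},
}


def monoisotopic_mass(formula):
    """Sum isotope masses weighted by the element counts in the formula."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


# Build a mass-sorted index once, so each query is a pair of binary searches.
INDEX = sorted((monoisotopic_mass(f), name) for name, f in COMPOUNDS.items())
MASSES = [mass for mass, _ in INDEX]


def search_by_mass(target, tolerance):
    """Return the names of compounds with mass in [target - tol, target + tol]."""
    lo = bisect_left(MASSES, target - tolerance)
    hi = bisect_right(MASSES, target + tolerance)
    return [name for _, name in INDEX[lo:hi]]


hits = search_by_mass(194.0804, 0.001)
```

Keeping the masses sorted means the cost of a query grows with the logarithm of the collection size plus the number of hits, which is one plausible way a narrow window over millions of records can return quickly.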
    31. 32. 59 hits in 1.3 seconds from 23 million compounds
    32. 33. Substructure and Property
    33. 35. Elemental Constraints
    34. 36. Search based on Data Sources
    35. 37. ChemSpider Searches
    36. 38. ChemSpider Searches
    37. 39. Caution! Question Everything!
    38. 40. PubChem
    39. 41. Vancomycin
        • Who will curate?
        • PubChem is not resourced to clean these errors
        • How would you clean such a large dataset?
    40. 42. Vancomycin on ChemSpider: 1 compound, discussions over 3 days
    41. 43. The EXPERTS must get it right?!
    42. 44. Wikipedia, C&E News, PubChem
        • C&E News (from ACS)
    43. 45. Question Everything online: www.dhmo.org
    44. 46. Searching Dihydrogen Monoxide…
    45. 47. “Lathosterol”
    46. 48. “Lathosterol”
    47. 49. “Lathosterol”
    48. 50. “Lathosterol” Removed
    49. 52. “Lathosterol” on PubChem
    50. 53. “Lathosterol” on PubChem
    51. 54. “Lathosterol” on PubChem
    52. 55. Crowd-sourcing Chemistry Curation
        • Crowd-sourced curation: identify/tag errors, edit names, synonyms, identify records to deprecate
    53. 56. Validate Names for PubMed “Caldarchaeol”
    54. 57. Citizen Scientists
    55. 58. Become a Data Source
    56. 60. Synthesis Procedures
    57. 61. Links to Data or Deposit Data
    58. 62. Upload Spectral Data, OPEN Data?
    59. 63. Semantic Mark-up for Chemistry
        • Semantic mark-up for chemistry is here
            • RSC Project Prospect (structure linking, IUPAC Gold Book ontology and other ontologies)
            • ChemSpider Journal of Chemistry
            • Nature Publishing Group compound linking
    60. 64. ChemSpider and Publishing
        • Curation led to a set of validated dictionaries
        • Integrated best-in-class entity extraction with validated name dictionaries
        • Additional dictionaries gave reactions, groups, families, hardware and software vendors etc.
    61. 65. ChemMantis and CJOC
    62. 66. Name-Structure Pairs
    63. 67. Deposit Structures
    64. 68. Species – linked to Wikipedia
    65. 69. Semantic Linking of Structures
        • What would you want to link off a structure?
            • Chemical suppliers
            • Other publications
            • Analytical Data
            • Related Reactions
            • Wikipedia
            • Patents
            • “Everything”
    66. 70. RSC’s Project Prospect
    67. 71. In Development: ChemSpider Synthesis
        • ChemSpider Synthesis will be a home for all things “synthetic”
        • An online resource for synthetic procedures from blogs, other online resources, RSC supplementary info, other publishers etc.
        • Public peer-review and feedback for synthetic procedures
    68. 72. RSC Supplementary Info
    69. 73. Online Journals and Live Data
    70. 74. ChemSpider Everywhere: Embed
    71. 75. ChemSpider Everywhere: Spectral Game
    72. 76. ChemSpider Everywhere: Crowdsourced Curation of Spectra
    73. 77. ChemSpider Everywhere: ChemMobi
    74. 78. ChemSpider Web Services
    75. 79. ChemSpider Everywhere
        • Linked from Wikipedia
        • Linked from Open Notebook Science sites
        • Linked from Blogs using Structure/Spectra
        • Integrated into structure drawing packages such as ACD/ChemSketch, Symyx Draw, Open Source applets
    76. 81. Where is ChemSpider Lacking?
        • ChemSpider is limited to “defined chemicals”. No support for:
            • Polymers
            • Minerals
            • Markush structures
        • ChemSpider is very dependent on InChIs
            • Stereochemistry around non-carbon centers
            • Organometallics are not correctly represented
        • There are millions of errors on ChemSpider
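To make the dependence on InChIs more concrete, the sketch below splits an InChI string into its layers, showing where stereochemistry is recorded. It is a simplified illustration, not ChemSpider's code: the helper name `inchi_layers` is hypothetical, and it assumes each layer prefix occurs only once, which is not true of every InChI (isotopic or fixed-hydrogen sections can repeat prefixes). The two InChI strings, for ethanol and L-alanine, are real.

```python
ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
L_ALANINE = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"


def inchi_layers(inchi):
    """Split an InChI into version, formula and single-letter-prefixed layers.

    Simplification: a repeated prefix would overwrite the earlier layer.
    """
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    version, *parts = inchi[len("InChI="):].split("/")
    layers = {"version": version}
    for part in parts:
        if not part:
            continue  # skip empty segments
        if part[0].isupper():
            layers["formula"] = part  # the formula starts with an element symbol
        else:
            layers[part[0]] = part[1:]  # e.g. c = connectivity, h = hydrogens
    return layers


alanine = inchi_layers(L_ALANINE)
ethanol = inchi_layers(ETHANOL)
# Tetrahedral stereo lives in the t, m and s layers; ethanol has none.
has_stereo = {name: "t" in layers for name, layers in
              [("L-alanine", alanine), ("ethanol", ethanol)]}
```

Anything a structure needs that these layers cannot express is lost when the InChI is used as the sole key, which is the kind of limitation the slide points to.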
    77. 82. What’s next?
        • Keep cleaning and depositing data
        • Enable discovery via the semantic web (RDF)
        • Integrate software: Symyx Jdraw, NMRShiftDB
        • Integrate RSC content – a massive archive!
        • Integrate RSC publishing workflows and databases
    78. 83. Project Focus
        • Continue Building Community for Chemistry
        • Building a Public ADME/Tox database
        • Delivering ChemSpider Synthetic Pages
        • Delivering ChemSpider Analytical Data
        • Delivering ChemSpider Education
    79. 84. People Make Change Happen
        • ChemSpider was a “hobby project”
        • Housed in a basement and running off three servers – one bought, two built
        • Sensitive to weather and power stability
        • Went live at ACS Spring 2007 in Chicago
        • ca. 6000 visitors a day, >50,000 transactions daily
    80. 85. You are invited...
        • Curate ChemSpider data and link to us
        • Deposit your data with us
            • Structures
            • Spectra
            • Synthesis procedures
        • ChemSpider Synthesis is under development
    81. 86. Organizations Scale Innovation
    82. 87. There is a Downside…
    83. 88. There is a Downside…
    84. 89. Thank you [email_address] Twitter: ChemSpiderman www.chemspider.com/blog SLIDES: www.slideshare.net/AntonyWilliams