Crawling Across the Web of Chemistry Using ChemSpider
 


ChemSpider is a free-access website built to provide a structure-centric community for chemists. It was developed to index available sources of chemical structures and their associated data into a single searchable repository and to make that repository available to everybody at no charge. Although a large number of databases containing chemical compounds and data are available online, their quality, accuracy and completeness are severely lacking. ChemSpider provides a platform through which the chemistry community can help improve the quality of online data and expand it to include reaction syntheses, analytical data, experimental properties and links to other valuable resources. It has grown into a resource containing over 21 million unique chemical structures from over 200 data sources.
This presentation will provide an overview of ChemSpider, its value to chemists as a search tool and as a public repository of information, and how it can become one of the primary foundations of internet-based chemistry. I will also discuss the vision for ChemSpider and some of the lofty goals we are setting for the system moving forward.

Usage Rights

© All Rights Reserved


    Crawling Across the Web of Chemistry Using ChemSpider: Presentation Transcript

    • Crawling Across the Web of Chemistry Using ChemSpider
    • Citizen Scientists Enable the Web
      • Who is writing about chemical compounds on Wikipedia?
      • Who is writing critical reviews of Chemistry online?
      • Who is blogging about chemistry on the web?
    • For Synthesis…TotallySynthetic.com
    • Org Prep Daily (Blog)
    • Molbank (Open Access Journal)
    • Synthetic Pages (Website)
    • Encyclopedic Articles (Wikipedia)
    • Chemistry online – An Overview
      • Encyclopedic articles (Wikipedia)
      • Chemical vendor databases
      • Metabolic pathway databases
      • Property databases
      • Chemical Synthesis procedures
      • Scientific publications
      • Chemical vendors
      • Blogs
      • Wikis
      • Open Notebook Science
    • What and who do you trust?
    • Compounds and Identifiers
    • What is ChemSpider?
      • ChemSpider is:
        • Building a Structure Centric Community for Chemists
        • >23 million compounds, ca. 250 data sources
        • A deposition and curation platform
        • A publishing platform for the community
        • Grows daily – more depositions, more links, more data sources
    • Search Cholesterol
    • Linked across the internet
    • Link off a structure in ChemSpider
        • Chemical suppliers
        • Other publications
        • Analytical Data
        • Related Reactions
        • Wikipedia
        • Patents
        • “Everything”
    • Linked to Millions of Articles
    • Answering Questions for Chemists
      • Questions a chemist might ask…
        • What is the melting point of n-butanol?
        • What is the chemical structure of Xanax?
        • Chemically, what is phenolphthalein?
        • What are the stereocenters of cholesterol?
        • Where can I find publications about xylene?
        • What are the different trade names for Ketoconazole?
        • What is the NMR spectrum of Aspirin?
        • What are the safety handling issues for Thymol Blue?
    • What is the structure of Flibanserin?
    • Complex Data and Information
    • Various Searches
      • Structure searching
      • Substructure searching
      • Subset searching – choose from 200 data sources
      • Property searching
      • Searches are used in various ways by different types of chemists…
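To make two of the search types listed above concrete, here is a minimal sketch of property searching and subset searching over a toy in-memory dataset. The `Record` class, the field names and the data source names are assumptions made for illustration; they are not ChemSpider's schema or API. Structure and substructure searching need a cheminformatics toolkit and are not covered here.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical, simplified compound record (not ChemSpider's schema)."""
    name: str
    formula: str
    mol_weight: float  # g/mol
    sources: set = field(default_factory=set)


# Toy data; the source names are placeholders for illustration only.
RECORDS = [
    Record("ethanol", "C2H6O", 46.07, {"vendor_a", "wiki"}),
    Record("benzene", "C6H6", 78.11, {"wiki"}),
    Record("cholesterol", "C27H46O", 386.65, {"vendor_a", "pathway_db"}),
]


def property_search(records, min_mw, max_mw):
    """Return records whose molecular weight lies in [min_mw, max_mw]."""
    return [r for r in records if min_mw <= r.mol_weight <= max_mw]


def subset_search(records, allowed_sources):
    """Return records that appear in at least one of the chosen data sources."""
    allowed = set(allowed_sources)
    return [r for r in records if r.sources & allowed]


if __name__ == "__main__":
    light = property_search(RECORDS, 40.0, 100.0)
    print([r.name for r in light])
    print([r.name for r in subset_search(light, {"vendor_a"})])
```

The two filters compose, so a property search can be restricted to a chosen subset of sources by feeding one result into the other.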
    • ChemSpider Searches
    • Caution! Question Everything!
    • Vancomycin
      • Who will curate?
      • PubChem is not resourced to clean these errors
      • How would you clean such a large dataset?
    • Vancomycin on ChemSpider: 1 compound – discussions over 3 days
    • The EXPERTS must get it right?!
    • Wikipedia, C&E News, PubChem
      • C&E News (from ACS)
    • “Lathosterol”
    • “Lathosterol” Removed
    • “Lathosterol” on PubChem
    • Crowd-sourcing Chemistry Curation
      • Crowd-sourced curation: identify/tag errors, edit names, synonyms, identify records to deprecate
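One way to picture the curation actions named above (tagging errors, editing synonyms, deprecating records) is as operations on a record that keeps an audit trail. The sketch below is an assumption about how such a model could look; the class, the method names and the history format are hypothetical and do not describe ChemSpider's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class CuratedRecord:
    """Hypothetical compound record with a simple curation audit trail."""
    record_id: str
    synonyms: set = field(default_factory=set)
    error_tags: list = field(default_factory=list)
    deprecated: bool = False
    history: list = field(default_factory=list)

    def _log(self, curator, action, detail):
        # Every change is recorded so that other curators can review it.
        self.history.append((curator, action, detail))

    def tag_error(self, curator, note):
        """Flag a problem on the record without changing its data."""
        self.error_tags.append(note)
        self._log(curator, "tag_error", note)

    def remove_synonym(self, curator, name):
        """Drop a synonym that does not belong to this structure."""
        if name in self.synonyms:
            self.synonyms.remove(name)
            self._log(curator, "remove_synonym", name)

    def deprecate(self, curator, reason):
        """Mark the whole record as deprecated, keeping it for reference."""
        self.deprecated = True
        self._log(curator, "deprecate", reason)
```

Keeping the history alongside the data is a design choice that lets a multi-day discussion about one compound be traced back to who changed what.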
    • Citizen Scientists
    • Become a Data Source
    • Synthesis Procedures
    • Links to Data or Deposit Data
    • Your Blog Posted Online?
    • Upload Spectral Data, OPEN Data?
    • Data as DOIs
      • Primary Data for Chemistry Available for the First Time
      • … Thieme is the first publisher to make primary chemistry data accessible worldwide
      • Analytical data, from various experiments, is the foundation of research work and scientific papers
      • From now on, primary data will be registered and made available online using digital object recognition in the form of Digital Object Identifiers (DOI)
    • Linking Data By DOI
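As a rough illustration of linking data by DOI, the sketch below turns a DOI into a resolvable link through the public resolver at https://doi.org/. The function name is an assumption, and the DOI in the usage example is a made-up placeholder, not a real dataset.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi):
    """Turn a bare DOI (or one with a 'doi:' prefix) into a resolver URL."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Minimal sanity check: DOIs start with '10.' and contain a '/'.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Keep '/' readable; percent-encode characters that are unsafe in URLs.
    return DOI_RESOLVER + quote(doi, safe="/")


if __name__ == "__main__":
    # Placeholder DOI for illustration only.
    print(doi_to_url("doi:10.1234/example.data-001"))
```

A compound record could then store such URLs next to its spectra or synthesis procedures, so the primary data stays one click away from the structure.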
    • Semantic Mark-up for Chemistry
      • Semantic mark-up for chemistry is here
        • RSC Project Prospect (structure linking, IUPAC Gold Book ontology and other ontologies), based on the OSCAR system
        • ChemSpider Journal of Chemistry
        • Nature Publishing Group compound linking
    • ChemSpider and Publishing
      • Curation led to a set of validated dictionaries
      • Integrated entity extraction with validated name dictionaries
      • Additional dictionaries gave reactions, groups, families, hardware and software vendors etc
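Entity extraction against a validated name dictionary can be sketched with a plain regular expression. This is a deliberately simple stand-in, not the system described on the slide: the dictionary contents and the record identifiers below are invented for illustration.

```python
import re

# Hypothetical validated dictionary: name -> record identifier (IDs are made up).
NAME_DICTIONARY = {
    "aspirin": "record-1",
    "acetylsalicylic acid": "record-1",
    "cholesterol": "record-2",
}


def build_pattern(names):
    """Compile one regex that matches any dictionary name as a whole word.

    Longer names come first so multi-word names win over shorter ones.
    """
    ordered = sorted(names, key=len, reverse=True)
    alternatives = "|".join(re.escape(n) for n in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def extract_entities(text, dictionary):
    """Return (matched text, start offset, record id) for each dictionary hit."""
    pattern = build_pattern(dictionary)
    hits = []
    for m in pattern.finditer(text):
        hits.append((m.group(0), m.start(), dictionary[m.group(0).lower()]))
    return hits


if __name__ == "__main__":
    sample = "Aspirin (acetylsalicylic acid) and cholesterol"
    for hit in extract_entities(sample, NAME_DICTIONARY):
        print(hit)
```

Because both synonyms map to the same identifier, every hit in an article can be linked to a single structure record, which is why the quality of the curated dictionary matters.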
    • ChemMantis and CJOC
    • Name-Structure Pairs
    • Deposit Structures
    • Species – linked to Wikipedia
    • Semantic Linking of Structures
      • What would you want to link off a structure?
        • Chemical suppliers
        • Other publications
        • Analytical Data
        • Related Reactions
        • Wikipedia
        • Patents
        • “Everything”
    • RSC’s Project Prospect
    • In Development: ChemSpider Synthesis
      • ChemSpider Synthesis will be a home for all things “synthetic”
      • An online resource for synthetic procedures from blogs, other online resources, RSC supplementary info, other publishers etc.
      • Public peer-review and feedback for synthetic procedures
    • RSC Supplementary Info
    • Online Journals and Live Data
    • ChemSpider Everywhere: Embed
    • ChemSpider Everywhere: Spectral Game
    • ChemSpider Everywhere: Crowdsourced Curation of Spectra
    • ChemSpider Everywhere: ChemMobi – Building a Structure Centric Community for Chemists
    • ChemSpider Web Services
    • ChemSpider Everywhere
      • Linked from Wikipedia
      • Linked from Open Notebook Science sites
      • Linked from Blogs using Structure/Spectra
      • Integrated into structure drawing packages such as ACD/ChemSketch, Symyx Draw, Open Source applets
    • Where is ChemSpider Lacking?
      • ChemSpider is limited to “defined chemicals”. No support for:
        • Polymers
        • Minerals
        • Markush structures
      • ChemSpider is very dependent on InChIs
        • Stereochemistry around non-carbon centers
        • Organometallics are not correctly represented
      • There are millions of errors on ChemSpider
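To see why the dependence on InChIs matters, it helps to look at how an InChI string is built from layers: a version, a formula, and then prefixed layers such as connectivity (`c`), hydrogens (`h`) and stereochemistry (`t`, `m`, `s`). The sketch below splits a string into those layers. It is a simplification under stated assumptions: it treats the second segment as the formula and keeps only one layer per prefix, which does not hold for every InChI.

```python
def split_inchi(inchi):
    """Split an InChI string into its version, formula and prefixed layers.

    Simplified: assumes a formula layer is present and that each
    one-letter layer prefix occurs at most once.
    """
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        layers["formula"] = parts[1]
    for part in parts[2:]:
        # Remaining layers start with a one-letter prefix,
        # e.g. 'c' (connections), 'h' (hydrogens), 't'/'m'/'s' (stereo).
        layers[part[0]] = part[1:]
    return layers


if __name__ == "__main__":
    # Standard InChI for ethanol.
    print(split_inchi("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"))
```

Anything the layers cannot express is lost when a record is keyed on the identifier, which is the practical consequence of the limitations listed above.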
    • What’s next?
      • Keep cleaning and depositing data
      • Enable discovery via the semantic web (RDF)
      • Integrate software: Symyx JDraw, NMRShiftDB
      • Integrate RSC content – a massive archive!
      • Integrate RSC publishing workflows and databases
      • Continue Building Community for Chemistry
      • Building a Public ADME/Tox database
      • Delivering ChemSpider Synthetic Pages
      • Delivering ChemSpider Analytical Data
      • Delivering ChemSpider Education
      Project Focus
    • People Make Change Happen: You are invited…
      • Curate ChemSpider data and link to us
      • Deposit your data with us
        • Structures
        • Spectra
        • Synthesis procedures
      • ChemSpider Synthesis is under development
    • People Make Change Happen
      • ChemSpider was a “hobby project”
      • Housed in a basement and running off three servers – one bought, two built
      • Sensitive to weather and power stability
      • Went live at ACS Spring 2007 in Chicago
      • ca. 6000 visitors a day, >50,000 transactions daily
    • Organizations Scale Innovation
    • There is a Downside…
    • Thank you
      • Email: [email_address]
      • Twitter: ChemSpiderman
      • Blog: www.chemspider.com/blog
      • Slides: www.slideshare.net/AntonyWilliams