Navigating the Complex Web of Chemistry Using ChemSpider

Free and open-access resources for scientists are increasingly available on the internet. Coupled with the growing availability of Open Source software tools, this puts us in the middle of a revolution in data availability and in the tools to manipulate these data. ChemSpider is a free-access website built with the intention of providing a structure-centric community for chemists. As an aggregator of chemistry-related information, at present over 21.5 million unique chemical entities from over 200 separate data sources, ChemSpider has taken on the task of curating publicly available data sources both robotically and manually. This presentation provides an overview of the ChemSpider platform and how it is fast becoming the centralized hub for sourcing information about chemical entities.




    Presentation Transcript

    • Navigating the Complex Web of Chemistry Using ChemSpider
    • Antony Williams vs Identifiers
      • Old Passport ID
      • Dad, Tony, others
      • SSN
      • Green Card
      • License
      • 5 email addresses
      • ChemSpiderman (blog, Twitter account, Facebook, Friendfeed)
      • OpenID
      • …
    • Aspirin vs Chemical Identifiers
    • Aspirin names and synonyms
      • Text searches depend on correct association
      • 335 suggested identifiers for Aspirin just on PubChem!
      • Disambiguation dictionaries are necessary
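
A disambiguation dictionary of the kind mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not ChemSpider's implementation: the `SYNONYMS` table, `normalize`, and `resolve` are hypothetical names, and only three of aspirin's many synonyms are listed. Every name maps to the same structure key (aspirin's standard InChIKey), so a text search succeeds only if the name has been correctly associated with the structure.

```python
from typing import Optional

# Standard InChIKey for aspirin, used here as the structure-level key.
ASPIRIN_KEY = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"

# Hypothetical synonym table: many names, one structure.
SYNONYMS = {
    "aspirin": ASPIRIN_KEY,
    "acetylsalicylic acid": ASPIRIN_KEY,
    "2-acetoxybenzoic acid": ASPIRIN_KEY,
}


def normalize(name: str) -> str:
    """Lower-case and collapse whitespace so trivial variants match."""
    return " ".join(name.lower().split())


def resolve(name: str) -> Optional[str]:
    """Return the structure key for a name, or None if it is unknown."""
    return SYNONYMS.get(normalize(name))


if __name__ == "__main__":
    print(resolve("  Acetylsalicylic   ACID "))
    print(resolve("an unlisted trade name"))
```

A name missing from the table resolves to nothing, which is the failure mode the slide warns about: the quality of a name search is bounded by the quality of the dictionary behind it.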
    • Linked Data Cloud
      • … the premium database producers are using some automatic tools to prepare a ‘first draft’ of a database record, to be refined by eye.
      • Coupled with the public internet as a distribution method of choice, it is becoming possible for the first time to create and distribute new structure based databases at much lower costs, or even free of charge.
    • The Final Search Strategy
    • All Those Names, One Structure
    • Content is King and Quality Costs
      • Chemistry “content” is big business. Not everyone can afford it.
        • Patent searching
        • Structures and properties
        • Drug databases
        • Literature databases
      • Chemical Abstracts Service (CAS), the “Gold Standard” in Chemistry related information
        • 101 years of content
        • $260 million revenue (2006)
        • >50 million substances
        • Proprietary platform
    • Searching Chemistry on the Internet
      • How complete a result set will we get if we search for “chemicals” by name?
      • Is there a better way to link chemistry databases? Linking by “names” is dangerous
      • Chemists want structure and SUBstructure searching
    • The InChI Identifier
    • Multiple Layers
    • InChI Strings Hash to InChIKeys
    • Oleoylethanolamine
    • InChIKey Searches Work
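
The idea behind the last few slides, a layered InChI string hashed into a short fixed-length key that search engines can index as a single token, can be sketched as follows. This is a toy illustration and explicitly NOT the official InChIKey algorithm: `split_layers` and `toy_key` are hypothetical helpers, and the hex encoding and block lengths are assumptions chosen only to mimic the shape of a real key. It hashes the formula, connection, and hydrogen layers into a first block and the remaining layers (stereochemistry and so on) into a second block.

```python
import hashlib


def split_layers(inchi: str):
    """Split an InChI into (skeleton layers, remaining layers)."""
    body = inchi.split("/", 1)[1]  # drop the "InChI=1S" version prefix
    layers = body.split("/")
    # The formula comes first; "c" (connections) and "h" (hydrogens) follow.
    skeleton = [layers[0]] + [l for l in layers[1:] if l[:1] in ("c", "h")]
    rest = [l for l in layers[1:] if l[:1] not in ("c", "h")]
    return "/".join(skeleton), "/".join(rest)


def toy_key(inchi: str) -> str:
    """Toy two-block key; NOT the official InChIKey algorithm."""
    skeleton, rest = split_layers(inchi)
    first = hashlib.sha256(skeleton.encode()).hexdigest()[:14].upper()
    second = hashlib.sha256(rest.encode()).hexdigest()[:8].upper()
    return f"{first}-{second}"


# Standard InChI strings for the two enantiomers of alanine.
L_ALANINE = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
D_ALANINE = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"

if __name__ == "__main__":
    print(toy_key(L_ALANINE))
    print(toy_key(D_ALANINE))
```

The two enantiomers share the first block and differ in the second, which is the property that makes a hashed key useful both for exact lookups and for connectivity-level matching. Real keys should be generated with the official InChI software or a toolkit that wraps it.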
    • Search Engine Dependencies
    • Search Engine Dependencies
    • InChIs have traction…
    • RDF Linking of Structures
    • PubChem
    • The Simplest Organic Molecule
    • Question Everything online: www.dhmo.org
    • The Structure-Based Data Cloud
    • Vancomycin
    • Vancomycin
      • Who will curate?
      • How would you clean such a large dataset?
    • Vancomycin on ChemSpider
    • Vancomycin
    • Vancomycin
      • Search Molecular SKELETON
      • Search Full Molecule
    • Full Skeleton Search: 104 Hits
    • Full Molecule Search: 4 Hits
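
The difference between the two hit counts above can be illustrated with a small sketch. It assumes, and the slides do not confirm, that a skeleton search compares only the first (connectivity) block of an InChIKey while a full-molecule search compares the whole key. The record names and keys below are hypothetical placeholders, not real compounds.

```python
# Hypothetical records: placeholder names and placeholder keys.
RECORDS = {
    "record A": "AAAAAAAAAAAAAA-BBBBBBBBSA-N",  # same skeleton, stereo form 1
    "record B": "AAAAAAAAAAAAAA-CCCCCCCCSA-N",  # same skeleton, stereo form 2
    "record C": "AAAAAAAAAAAAAA-BBBBBBBBSA-N",  # duplicate of record A
    "record D": "DDDDDDDDDDDDDD-BBBBBBBBSA-N",  # different skeleton
}


def full_molecule_search(query_key: str, records: dict) -> list:
    """Names whose complete key equals the query key."""
    return sorted(n for n, k in records.items() if k == query_key)


def skeleton_search(query_key: str, records: dict) -> list:
    """Names whose first key block equals the query's first block."""
    block = query_key.split("-")[0]
    return sorted(n for n, k in records.items() if k.split("-")[0] == block)


if __name__ == "__main__":
    query = "AAAAAAAAAAAAAA-BBBBBBBBSA-N"
    print(full_molecule_search(query, RECORDS))  # exact matches only
    print(skeleton_search(query, RECORDS))       # every stereo variant too
```

Under this assumption a skeleton search always returns at least as many hits as a full-molecule search, and the extra hits are exactly the records that need a curator's attention.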
    • What is ChemSpider?
      • ChemSpider is:
        • Building a Structure Centric Community for Chemists
        • 22.2 million compounds, >200 data sources
        • A deposition and curation platform
        • A publishing platform for the community
        • Grows daily – more depositions, more links, more data sources
    • For Chemical Compounds
      • Vendor sites – Aldrich, Alfa Aesar, TCI and 100s of others
      • Government databases – PubChem, DSSTox, FDA databases, ChemIDPlus,…
      • Biological Databases – Protein Database, Stitch, KEGG, ChEBI,…
      • Analytical databases – NMRShiftDB,…
    • How Was ChemSpider Built?
      • ChemSpider was a “hobby project”
      • Housed in a basement and running off three servers – one bought, two built
      • May 2009
      • 3 servers – 2 homebuilt
      • .NET architecture
      • SQL server
      • Homebuilt structure/substructure
      • Commercial components
      • Open Source Components
        • OpenBabel, Jmol, JSpecView, NCBI Toolkit, InChI Libraries
    • Search Cholesterol
    • Search Cholesterol
    • Search Cholesterol
    • Search Cholesterol
    • Linked across the internet
    • Kyoto Encyclopedia of Genes and Genomes
    • Links to Patents based on structure
    • Answering Questions for Chemists
      • Questions a chemist might ask…
        • What is the melting point of n-butanol?
        • What is the chemical structure of Xanax?
        • Chemically, what is phenolphthalein?
        • What are the stereocenters of cholesterol?
        • Where can I find publications about xylene?
        • What are the different trade names for Ketoconazole?
        • What is the NMR spectrum of Aspirin?
        • What are the safety handling issues for Thymol Blue?
    • Complex Data and Information
    • Remember – QUALITY ISSUES
    • The FDA’s DailyMed
    • Incorrect Structures
    • Does one stereocenter matter?
      • Distaval, Talimol, Nibrol, Sedimide, Quietoplex, Contergan, Neurosedyn, and Softenon
    • Crowd-sourcing Chemistry Curation
    • We Need Recognition and Rewards
    • Master Curators, Curators, Depositors
    • Collaborating with Wikipedia
      • Long term project to curate chemical compounds
      • Robotically linking ChemSpider to Wikipedia at present
      • Will layer on InChI Strings and InChIKeys shortly and make Wikipedia structure searchable
    • Blogs need InChIs too!
    • Blogs need InChIs too!
    • Use Intelligent Structures : ChemSpider Embed Web Service
    • ChemSpider Web Services
    • Semantic Mark-up for Chemistry
      • Semantic mark-up for chemistry is here
        • RSC Project Prospect
        • Nature Publishing Group compound linking
        • ChemMantis
    • Nature Chemistry Compound Pages
    • Project Prospect
    • ChemMantis
    • Deposit Structures
    • Species – linked to Wikipedia
    • Semantic Linking of Structures
      • What would you want to link off a structure?
        • Chemical suppliers
        • Other publications
        • Analytical Data
        • Related Reactions
        • Wikipedia
        • Patents
        • “Everything”
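
One way to picture linking resources off a structure is an index keyed by InChIKey, with one bucket per link category from the list above. This is a minimal sketch under stated assumptions: the `StructureLinks` class is hypothetical, and the `example.org` URLs are placeholders rather than real endpoints.

```python
from collections import defaultdict


class StructureLinks:
    """Hypothetical index: InChIKey -> link category -> set of URLs."""

    def __init__(self):
        self._links = defaultdict(lambda: defaultdict(set))

    def add(self, inchikey: str, category: str, url: str) -> None:
        """Attach a URL to a structure under the given category."""
        self._links[inchikey][category].add(url)

    def lookup(self, inchikey: str) -> dict:
        """Return {category: sorted URLs}; empty dict for unknown keys."""
        by_category = self._links.get(inchikey, {})
        return {cat: sorted(urls) for cat, urls in by_category.items()}


if __name__ == "__main__":
    index = StructureLinks()
    aspirin = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"  # standard InChIKey for aspirin
    index.add(aspirin, "suppliers", "https://example.org/supplier/1")  # placeholder
    index.add(aspirin, "patents", "https://example.org/patent/1")      # placeholder
    print(index.lookup(aspirin))
```

Because the key is derived from the structure rather than from a name, any source that publishes the same key can contribute links without a shared naming convention.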
    • The InChI “Resolver”
    • InChI Resolver to DOIs: Structure Search the Web
    • Conclusions
      • Internet resources provide a collaborative community for chemistry
      • Crowdsourcing to expand, curate and integrate to the benefit of chemists
      • Searching the web for chemistry is arriving
      • InChIs are enabling chemistry on the internet
      • Question Quality!
    • Contact
      • Email: [email_address]
      • Twitter: ChemSpiderman
      • Blog: www.chemspider.com/blog