How an Online Resource for Chemistry Can Change Our World

This is a presentation given at the Triangle Chromatography Discussion Group, with a focus on mass spectrometry, the associated web services, and what is possible for chromatographers.

Published in: Technology, Education

  • This is a list of some of the things an MS scientist might want to do and some of the queries we have already experienced
  • Transcript of "How an Online Resource for Chemistry Can Change Our World"

    1. How an Online Chemistry Resource Could Change Our World
         Antony Williams
         Triangle Chromatography Discussion Group, Raleigh, NC, May 2009
    2. Imagine a time when…
         • The internet is searchable by chemical structure and substructure (e.g. Wikipedia, Google Scholar)
         • There is an online database of NMR, IR and MS spectra and chromatography methods, built by and available to the community
         • Chemistry articles are indexed and searchable by “chemistry”
         • The web is linked together through the “language of chemistry”
         • Publicly funded research data can be shared and discussed in the open, maybe as Open Notebook Science
         • Cheminformatics has as much of a public face and success as bioinformatics (Protein Data Bank, GenBank, etc.)
         Building a Structure Centric Community for Chemists
    3. The Language of Chemistry
         • My language…
    4. And its dialects…
    5. As a chemist…
         • I look for information about chemicals/chemistry
             • What is a particular structure?
             • What alternative names/identifiers?
             • Reaction synthesis?
             • Physical properties?
             • Analytical data?
             • Purchase?
             • Tell me more?
             • Similar stuff – what other compounds are “like” mine?
    6. Linked Data Cloud
    7. Chemistry on the Internet
         • Much of the information online is User Beware!
         • The quality of information is “diverse”
         • Technologies can “link and connect” information, but validation and curation are key to providing quality
         • The LinkedData web is of less value when the data linked are “wrong”
    8. “Good Stuff”: TotallySynthetic.com
    9. PubChem
    10. Questions a chemist might ask…
         • What is the melting point of n-butanol?
         • What is the chemical structure of Xanax?
         • Chemically, what is phenolphthalein?
         • What are the stereocenters of cholesterol?
         • Where can I find publications about xylene?
         • What are the different trade names for Ketoconazole?
         • What is the NMR spectrum of Aspirin?
         • What are the safety handling issues for Thymol Blue?
    11. Search Cholesterol
    12. Search Cholesterol
    13. Search Cholesterol
    14. Search Cholesterol
    15. Search Cholesterol
    16. Link outs
    17. Complex Data and Information
    18. Online Analytical Data
    19. Various Searches
         • Structure searching
         • Substructure searching
         • Subset searching – choose from 200 data sources
         • Property searching
         • Value for Mass Spectrometrists and Chromatographers?
    20. ChemSpider for Mass Spectrometrists
         • What would a mass spectrometrist want to do?
             • Search the database based on mass (various forms)
             • Search selected subsets of the database based on mass
             • Search based on mass and substructure(s)
             • Search for structure based on name(s) or database IDs
             • Search for structures based on elements/not elements
             • Download the structure/structures in standard format
             • Search literature for information
             • Identify related data sources – chemical vendors, pathway databases, etc.
    21. Search Database Based on Mass
    22. Mass Based Searches?
         • What compounds have a mass of 300 ± 0.001?
    23. 59 hits in 1.3 seconds from 21.5 MILLION
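The mass-window query on slides 22 and 23 can be pictured as a range lookup over a sorted list of masses. The sketch below is only an illustration under stated assumptions: the four records, the `RECORDS` list, and the `search_by_mass` function are hypothetical stand-ins, not ChemSpider's actual index or API. The masses are monoisotopic values computed from standard atomic masses and rounded to four decimals.

```python
from bisect import bisect_left, bisect_right

# Illustrative records: (monoisotopic mass in Da, name), rounded to 4 decimals.
# A real index would hold millions of entries; these four are stand-ins.
RECORDS = sorted([
    (180.0423, "aspirin"),
    (180.0634, "glucose"),
    (194.0804, "caffeine"),
    (386.3549, "cholesterol"),
])
MASSES = [mass for mass, _ in RECORDS]


def search_by_mass(target, tolerance):
    """Return all records with target - tolerance <= mass <= target + tolerance."""
    lo = bisect_left(MASSES, target - tolerance)   # first mass >= lower bound
    hi = bisect_right(MASSES, target + tolerance)  # one past last mass <= upper bound
    return RECORDS[lo:hi]


if __name__ == "__main__":
    # A wide window around 180.05 Da and a narrow one around glucose.
    print(search_by_mass(180.05, 0.02))
    print(search_by_mass(180.0634, 0.001))
```

Because the list is sorted, each query costs two binary searches regardless of how many records fall outside the window, which is one plausible reason a query like this can return quickly even over a very large collection.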
    24. Substructure and Property
    25. [no slide text]
    26. Elemental Constraints
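Searching on "elements/not elements" amounts to checking which element symbols appear in a molecular formula. The following is a minimal sketch, assuming simple formulas without parentheses, charges, or isotope labels; `parse_formula` and `matches_elements` are hypothetical helper names chosen for illustration and do not describe how ChemSpider implements the search.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C9H8O4' (no parentheses)."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def matches_elements(formula, required=(), excluded=()):
    """True if every required element is present and no excluded element is."""
    elements = set(parse_formula(formula))
    return set(required) <= elements and not (set(excluded) & elements)


if __name__ == "__main__":
    # Keep formulas that contain C and O but no N.
    for formula in ["C9H8O4", "C8H10N4O2", "C27H46O"]:
        print(formula, matches_elements(formula, required={"C", "O"}, excluded={"N"}))
```

In practice such a filter would be combined with the mass window, so that the element test only runs on the handful of candidates that already match by mass.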
    27. Search based on Data Sources
    28. Outlinks – to vendors and other databases
         • Example databases of interest to Mass Spectrometrists:
             • HMDB – Human Metabolome Database
             • KEGG – Kyoto Encyclopedia of Genes and Genomes
             • BioCyc – a collection of Pathway/Genome Databases
             • University of Minnesota Biodegradation Database – information on microbial biocatalytic reactions and biodegradation pathways for primarily xenobiotic chemical compounds
             • WikiPathways – a new initiative to build crowdsourced pathway data management
    29. Links out to KEGG
         Kyoto Encyclopedia of Genes and Genomes
    30. WikiPathways Link
    31. Download Structure(s)
         • Download individual record – molfile
         • Download SDF file (group of structures)
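An SDF file is a concatenation of molfile records, each optionally followed by data fields and terminated by a line containing `$$$$`. The sketch below shows one way to split a downloaded SDF into individual records and read each record's title line. The `split_sdf` function and the two empty placeholder molecules are assumptions made for illustration; a real workflow would normally hand the records to a cheminformatics toolkit.

```python
def split_sdf(text):
    """Split SDF text into a list of records, using '$$$$' lines as delimiters."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Keep a trailing record that lacks a final delimiter, if it has content.
    if any(part.strip() for part in current):
        records.append("\n".join(current))
    return records


# Two placeholder records with zero atoms, just to exercise the splitter.
SAMPLE = (
    "mol A\n  example\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n$$$$\n"
    "mol B\n  example\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n$$$$\n"
)

if __name__ == "__main__":
    for record in split_sdf(SAMPLE):
        # The first line of a molfile header block is the molecule title.
        print(record.splitlines()[0])
```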
    32. Web Service Integration
         • ChemSpider is presently integrated with Bruker, Waters and Thermo – more vendors coming…
         • Direct integration to vendor data processing tools
    33. MassSpec API Web Services
         • http://www.chemspider.com/MassSpecAPI.asmx
    34. Web Services
    35. Test Web Services for MassSpec
         • http://www.chemspider.com/WebServices/WSMassSpecAPIDem
    36. Test results
    37. Waters Integration
    38. Waters Integration
    39. Outlinks from Table
    40. For Chromatographers?
         • “Structure-based methods” being linked
         • Structure-centric searching of methods
         • We can host chromatograms for display
         • LogP and LogD (pH 5.5 and 7.4) values calculated for >21 million compounds using ACD/Labs software
         • We’d love to host collections from the column vendors!
         • tony@chemspider.com
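Slide 40 lists LogD at two pH values because, unlike LogP, the distribution coefficient of an ionizable compound depends on pH. The sketch below uses the textbook relationship for a compound with a single ionizable group, assuming only the neutral form partitions into the organic phase. It is not a description of how the ACD/Labs software computes its predictions, and the LogP and pKa inputs are hypothetical values chosen for illustration.

```python
import math


def log_d_acid(log_p, pka, ph):
    """LogD of a monoprotic acid: logP - log10(1 + 10**(pH - pKa))."""
    return log_p - math.log10(1 + 10 ** (ph - pka))


def log_d_base(log_p, pka, ph):
    """LogD of a monoprotic base: logP - log10(1 + 10**(pKa - pH))."""
    return log_p - math.log10(1 + 10 ** (pka - ph))


if __name__ == "__main__":
    # Hypothetical acid with logP = 2.0 and pKa = 4.0, at the two pH values on the slide.
    for ph in (5.5, 7.4):
        print(ph, round(log_d_acid(2.0, 4.0, ph), 2))
```

For an acid, LogD falls as pH rises above the pKa because more of the compound is ionized; for a base the trend is reversed. This is why a single LogP value is not enough to anticipate retention behaviour across mobile-phase pH.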
    41. From 21.5 MILLION molecules…
         • Data are gathered/deposited from >200 data sources
             • Government databases
             • Chemical vendors
             • Wikipedia
         • There are “imperfections” in all online data sources
         • How bad can it get?
    42. What is “wrong”?
    43. Quality is a Major Issue: Search Butanol
         (Old example, now fixed)
    44. Vancomycin
         • Who will curate?
         • PubChem is not resourced to clean these errors
         • How would you clean such a large dataset?
    45. Wikipedia, C&E News, PubChem
         C&E News (from ACS)
    46. [no slide text]
    47. Does one stereocenter matter?
         Thalidomide
    48. Question Everything
         www.dhmo.org
    49. DailyMed
         “DailyMed provides high quality information about marketed drugs. This information includes FDA approved labels (package inserts).”
    50. The FDA’s DailyMed
    51. Structures on DailyMed
         Poor Representations
    52. Incorrect Structures
         Scanning (?) Issues
    53. Incorrect Structures
    54. Wikis for Science
         • Who in the room hasn’t used Wikipedia?
         • Is it trustworthy?
         • What are the advantages and disadvantages of the Wiki environment?
         • How suitable is it for Chemistry?
    55. Collaborative Knowledge Management for Chemists
    56. Wikipedia Curation
         • Looking for self-consistency across a Wikipedia page
         • Primary key is the article TITLE
         • The chemical shown needs to match the title
         • Cyclic self-consistency – and decisions must get made
    57. Taxol on PubChem
    58. When are things “wrong”?
         • Structures have a timeline…
    59. [no slide text]
    60. [no slide text]
    61. [no slide text]
    62. Creating a trusted source…
         • Small databases can be curated by the hosts – EPA’s DSSTox, Wikipedia, etc.
         • Who will curate an enormous database?
    63. Crowdsourcing
    64. Curating ChemSpider
         • Anyone can “Post Comments” associated with a structure. To curate data we require login to track
    65. Multi-level Curation and Approval
    66. ChemMantis
         • Chemical Markup And Nomenclature Transformation Integrated System
    67. On the fly conversion
    68. Nature Publications
    69. Integrations Out to Other Sources
    70. Reactions
    71. ChemSpider Everywhere
         RSC Compounds
    72. ChemSpider Everywhere
         Nature Chemistry
         Nature Chemistry articles are annotated to identify all of the chemical compounds mentioned throughout the text. Those compounds are linked out to other information resources including PubChem and ChemSpider.
    73. ChemSpider Everywhere
         ChemMobi
    74. [no slide text]
    75. It Happened in a Basement!!
         • Homebuilt servers
         • Cable internet
         • Software donations
         • Lots of hard work
         • >8000 users per day
         • >80,000 transactions per day
    76. And now…
         • The Royal Society of Chemistry announced on May 11th that it has acquired ChemSpider, heralding a breakthrough investment for the organisation and for the Chemistry Community. This acquisition reflects RSC's commitment to providing access to rich resources of chemistry data and information.