Building A Community Resource For The Life Sciences

This presentation was given in Track 4, Open Access and Cheminformatics, at the Bio-IT Meeting in Boston on April 21, 2010. It is a general overview of ChemSpider activities to link together the internet for chemists and to validate and curate data. We also won the Bio-IT Best Practices Community Service Award that evening.

Transcript

  1. Building A Community Platform to Support Chemistry and the Life Sciences
  2. Where Would You Look? What Do You Trust?
  3. Chemistry on the Internet TODAY
     • Chemistry searches are generally limited to text-based searches across the internet
     • Data are dirty: sorting the wheat from the chaff. Who can you trust?
     • Too many searches required to resource data
  4. Chemistry on the Internet TODAY
     • Chemistry searches are generally limited to text-based searches across the internet
     • Data are dirty: sorting the wheat from the chaff. Who can you trust?
     • Too many searches required to resource data
  5.  
  6.  
  7. The Final Search Strategy
  8. All Those Names, One Structure A problem to solve…
  9. Chemistry on the Internet TODAY
     • Chemistry searches are generally limited to text-based searches across the internet
     • Data are dirty: sorting the wheat from the chaff. Who can you trust?
     • Too many searches required to resource data
  10. Trustworthy Chemistry?
      • Encyclopedic articles (Wikipedia)
      • Chemical vendor databases
      • Metabolic pathway databases
      • Property databases
      • Patents with chemical structures
      • Drug Discovery data
      • Scientific publications
      • Compound aggregators
      • Blogs/Wikis and Open Notebook Science
  11. Where Would You Look? What Do You Trust?
  12. Structural Data for Life Sciences: DailyMed
  13. Lack of Stereochemistry
  14. Incorrect Structures
  15. Ugh…
  16. Drugs are REALLY Messy
  17. Vancomycin
      • Who will curate?
      • How would you clean such a large dataset?
      • Assertions!!!
  18. The EXPERTS must get it right?!
  19. Wikipedia, C&E News, PubChem
      • C&E News (from ACS)
  20. Chemistry on the Internet TODAY
      • Chemistry searches are generally limited to text-based searches across the internet
      • Data are dirty: sorting the wheat from the chaff. Who can you trust?
      • Too many searches required to resource data
  21. Just “Public Compound” Databases
      • PubChem
      • Drugbank
      • ChEBI/ChEMBL
      • KEGG
      • LipidMAPs
      • ChemIDPlus
      • eMolecules
      • ZINC
      • Lots of chemical vendors
      • ChemSpider
  22. What do humans want?
      • As few interfaces as possible
      • Image source: media.obsessable.com
  23. A Pragmatic Vision
      • “Build a Structure Centric Community to Serve Chemists”
      • Integrate chemical structure data on the web
      • Create a “structure-based hub” to information and data
      • Provide access to structure-based “algorithms”
      • Let chemists contribute their own data
      • Allow the community to curate/correct data
  24. Answer Questions
      • Questions a chemist might ask…
          • What is the melting point of n-heptanol?
          • What is the chemical structure of Xanax?
          • Chemically, what is phenolphthalein?
          • What are the stereocenters of cholesterol?
          • Where can I find publications about xylene?
          • What are the different trade names for Ketoconazole?
          • What is the NMR spectrum of Aspirin?
          • What are the safety handling issues for Thymol Blue?
  25. ChemSpider Searches
  26.  
  27. Search “OEA”
  28. Search OEA
  29. Link Farm Connections
  30. Link Farm Connections
  31. Search OEA
  32. Search OEA
  33. Google Books
  34. Google Scholar
  35. Linked Patents for OEA
  36.  
  37. Google Patents
  38. Microsoft Academic Search
  39. RSC Journals
  40. RSC Databases
  41. Statistics for Today
      • Almost 25 million compounds from >350 data sources
      • About 7000 unique users per day and up to ½ million transactions per day
      • A crowdsourced deposition and curation platform
      • Grows daily – more depositions, more links, more data
  42. Searching Chemistry on the Internet
      • How complete a result set will we get if we search for “chemicals” by name?
      • Is there a better way to link chemistry databases? Linking by “names” is dangerous
      • Chemists want structure and SUBstructure searching
  43. The InChI Identifier
  44. Multiple Layers
  45. InChI Strings Hash to InChIKeys
  46. Link the Internet with InChIKeys! Taken from: Rafael Sidis’ Blog
  47. Vancomycin – Search the Internet
  48. Vancomycin: Search Molecular SKELETON / Search Full Molecule
  49. Full Molecule Search: 4 Hits
  50. Full Skeleton Search: 104 Hits
  51.  
  52.  
  53.  
  54. Vancomycin
  55. Vancomycin on ChemSpider: 1 compound – 3 days
  56. InChIKeys
      • Make the internet searchable by adding InChIKeys
      • Publishers add InChIKeys to papers now…
  57. InChIKeys
      • Make the internet searchable by adding InChIKeys
      • Publishers add InChIKeys to papers now… is what???
  58. The InChI “Resolver”
  59. InChI Resolver to DOIs: Structure Search the Web
  60. Most Chemistry is NOT Published
      • Only a fraction of chemistry is published
      • Only a tiny fraction of chemistry is patented
      • What of the “Lost Chemistry”: never published and cannot be abstracted
          • Reactions performed
          • Structures made and studied
          • Spectra acquired and then disposed of
          • Available chemicals never found
  61. Crowd-sourcing Curation and Deposition
      • Crowd-sourced curation: identify/tag errors, edit names, synonyms, identify records to deprecate
  62. Multi-level Curation and Approval Building a Structure Centric Community for Chemists
  63. Semantic Markup: Project Prospect
  64. Name-Structure Pairs
  65. Semantic Linking of Structures
      • What would you want to link off a structure?
          • Chemical suppliers
          • Other publications
          • Analytical Data
          • Related Reactions
          • Wikipedia
          • Patents
          • “Everything”
  66. Org Prep Daily (Blog)
  67. ChemSpider SyntheticPages
  68. Chemistry on the Internet FUTURE
      • The semantic web for chemistry is in place
      • Crowdsourced contributions are commonplace
      • Chemists will search by structure/substructure
      • Chemistry articles indexed and searchable
      • Reduced number of searches to find data
      • Data are integrated – compounds, vendors, syntheses, data, publications and patents
      • A world of Open Access and Open Data
  69. ChemSpider Web Services
  70.  
  71. Thank you
      • [email_address]
      • Twitter: ChemSpiderman
      • www.chemspider.com/blog
      • SLIDES: www.slideshare.net/AntonyWilliams
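
Slides 43 and 44 introduce the InChI identifier and its multiple layers, but the transcript does not show what those layers look like. The sketch below is one possible illustration, not taken from the slides: it splits an InChI string on its slash separators and labels each layer by its one-letter prefix. The function name `split_inchi_layers` and the example string (the standard InChI for L-alanine) are assumptions supplied here for illustration.

```python
def split_inchi_layers(inchi):
    """Split an InChI string into (layer_name, value) pairs.

    Illustrative only: layers are returned in order as a list, because
    some prefixes can legitimately appear more than once in one InChI.
    """
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError(f"not an InChI string: {inchi!r}")

    parts = inchi[len(prefix):].split("/")
    layers = [("version", parts[0])]  # e.g. "1S" for a standard InChI

    for part in parts[1:]:
        first = part[:1]
        if first.isupper() or first.isdigit():
            # The formula layer carries no prefix letter.
            layers.append(("formula", part))
        else:
            # Other layers start with a lowercase letter,
            # e.g. c (connections), h (hydrogens), t/m/s (stereo).
            layers.append((first, part[1:]))
    return layers


# Example input chosen for this sketch (L-alanine), not from the slides.
L_ALANINE = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"

for name, value in split_inchi_layers(L_ALANINE):
    print(f"{name:8} {value}")
```

Keeping the layers in order as a list, instead of a dictionary, avoids silently dropping a layer whose prefix repeats.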
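
Slides 45 to 50 contrast a full-molecule search with a molecular-skeleton search on InChIKeys. A minimal sketch of that idea follows, assuming the standard 27-character InChIKey layout, in which the first 14-letter block hashes the connectivity and the second block hashes the remaining layers such as stereochemistry. The code does not compute the hash itself; it only splits and compares existing keys. The function names, the `RECORDS` list and its source labels are hypothetical, and the example keys (alanine variants and ethanol) are supplied here for illustration, not taken from the slides.

```python
import re

# Assumed layout: 14 letters, hyphen, 8 letters + standard flag + version
# letter, hyphen, 1 protonation letter.
INCHIKEY_RE = re.compile(r"([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])")


def parse_inchikey(key):
    """Split an InChIKey into its blocks; raise ValueError if malformed."""
    match = INCHIKEY_RE.fullmatch(key)
    if match is None:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    skeleton, detail, flag, version, protonation = match.groups()
    return {
        "skeleton": skeleton,        # hash of the connectivity
        "detail": detail,            # hash of stereo, isotopes, etc.
        "standard": flag == "S",     # "S" standard, "N" non-standard
        "version": version,
        "protonation": protonation,
    }


def full_molecule_hits(records, query_key):
    """Records whose complete InChIKey equals the query key."""
    return [rec for rec in records if rec[1] == query_key]


def skeleton_hits(records, query_key):
    """Records that share only the first (connectivity) block."""
    skeleton = parse_inchikey(query_key)["skeleton"]
    return [rec for rec in records
            if parse_inchikey(rec[1])["skeleton"] == skeleton]


# Hypothetical (source, InChIKey) records for illustration.
RECORDS = [
    ("source-a", "QNAYBMKLOCPYGJ-REOHJBHXSA-N"),  # L-alanine
    ("source-b", "QNAYBMKLOCPYGJ-UWTATZPHSA-N"),  # D-alanine
    ("source-c", "QNAYBMKLOCPYGJ-UHFFFAOYSA-N"),  # alanine, no stereo given
    ("source-d", "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"),  # ethanol
]

query = "QNAYBMKLOCPYGJ-REOHJBHXSA-N"
print(len(full_molecule_hits(RECORDS, query)), "full-molecule hit(s)")
print(len(skeleton_hits(RECORDS, query)), "skeleton hit(s)")
```

With these sample records the skeleton search returns more hits than the full-molecule search, which mirrors the 4 versus 104 hits reported for vancomycin on slides 49 and 50: records that omit or disagree on stereochemistry still share the first block.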
