How Internet Resources Are Providing a Collaborative Community for Chemistry

Online chemistry resources have expanded dramatically in the past few years, with PubChem, ChEBI, Wikipedia, ChemSpider and many others offering rich content to scientists seeking data and information. ChemSpider has become one of the primary chemistry portals, delivering a heterogeneous mix of Open and Closed data. It offers a structure-centric community for collaboration, enabling crowd-sourced deposition and validation of online chemistry data. ChemSpider has also been integrated into the ChemMantis system (CHEMistry Markup And Nomenclature Transformation Integrated System). This platform extracts science-related terms from documents using both heuristics and highly curated dictionaries. The resulting documents are marked up so that chemical structures can be viewed and linked out to over 200 different data sources via the ChemSpider database.
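
The dictionary-driven part of that markup can be illustrated with a minimal Python sketch. It is not ChemMantis code: the three-entry dictionary, the `mark_up` function, and the `[name|InChIKey]` output format are assumptions made for illustration. The only chemistry fact it relies on is the standard InChIKey of aspirin.

```python
import re

# Hypothetical mini-dictionary: lower-cased synonym -> InChIKey.
# A curated dictionary would hold many synonyms per structure.
NAME_TO_INCHIKEY = {
    "aspirin": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
    "acetylsalicylic acid": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
    "2-acetoxybenzoic acid": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
}


def build_pattern(names):
    """Compile one regex that matches any dictionary name as a whole term."""
    # Longest names first, so multi-word names win over shorter ones.
    ordered = sorted(names, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in ordered)
    # The lookarounds reject matches embedded in longer words or
    # hyphenated chemical names.
    return re.compile(r"(?<![\w-])(" + alternation + r")(?![\w-])", re.IGNORECASE)


PATTERN = build_pattern(NAME_TO_INCHIKEY)


def mark_up(text):
    """Wrap each recognised name as [name|InChIKey] (illustrative format)."""
    def tag(match):
        key = NAME_TO_INCHIKEY[match.group(1).lower()]
        return f"[{match.group(1)}|{key}]"
    return PATTERN.sub(tag, text)
```

Because every synonym resolves to the same key, a viewer could link any tagged name to a single structure record. The heuristic side of the extraction (names that are not in any dictionary) is not covered by this sketch.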

Transcript

  • 1. How Internet Resources Are Providing a Collaborative Community for Chemistry: 60 slides in 20 minutes
  • 2. Imagine a time when ….
    • The internet is searchable by chemical structure and substructure (e.g. Wikipedia, Google Scholar)
    • Chemistry articles are indexed and searchable by a free online service
    • The web is linked together through the “language of chemistry”
  • 3. It’s Coming…Linked Data Cloud
  • 4. Thanks to the Organizers…
  • 5. Antony Williams vs Identifiers
    • Passport ID
    • Dad, Tony, others
    • SSN
    • Green Card
    • License
    • 5 email addresses
    • ChemSpiderman (blog, Twitter account, Facebook, Friendfeed)
    • OpenID
    • …
  • 6. Aspirin vs Chemical Identifiers
  • 7. Aspirin names and synonyms
    • Text searches depend on correct association
    • 335 suggested identifiers for Aspirin just on PubChem!
    • Disambiguation dictionaries are necessary
  • 8.  
  • 9.  
  • 10.  
  • 11. The Final Search Strategy
  • 12. All Those Names, One Structure
  • 13. Searching Chemistry on the Internet
    • How complete a result set will we get if we search for “chemicals” by name?
    • Is there a better way to link chemistry databases? Linking by “names” is dangerous
    • Chemists want structure and SUBstructure searching
  • 14. The InChI Identifier
  • 15. Multiple Layers
  • 16. InChI Strings Hash to InChIKeys
  • 17. Oleoylethanolamine
  • 18. Search Engine Dependencies
  • 19. Search Engine Dependencies
  • 20. InChIs have traction…
  • 21. RDF Linking of Structures
  • 22. PubChem
  • 23. The Simplest Organic Molecule
  • 24. Vancomycin
  • 25.  
  • 26. Vancomycin
    • Who will curate?
    • How would you clean such a large dataset?
  • 27. Vancomycin on ChemSpider
  • 28. Vancomycin
  • 29. Vancomycin
    • Search Molecular SKELETON
    • Search Full Molecule
  • 30. Full Skeleton Search: 104 Hits
  • 31. Full Molecule Search: 4 Hits
  • 32. The InChI “Resolver”
  • 33. Content is King and Quality Costs
    • Curated Chemistry “content” is expensive to create
      • Patent searching
      • Structures and properties
      • Drug databases
      • Literature databases
    • Chemical Abstracts Service (CAS), the “Gold Standard” in chemistry-related information
      • 102 years of content
      • >50 million substances
      • Proprietary platform
  • 34. The EXPERTS must get it right?!
  • 35. Wikipedia, C&E News, PubChem
    • C&E News (from ACS)
  • 36. Feedback from Steve Ritter
    • “Although CAS and C&EN are both part of the ACS Publications Division, we at C&EN still have to pay for our SciFinder access, strangely enough.”
    • “It would be nice to have an authoritative web-based source of standard, well-drawn structures for chemists to go to so they can freely cut and paste structures into their papers, PowerPoint presentations, and anything else they might need. Maybe Wikipedia will be that source one day.”
  • 37. Maybe it will be ChemSpider?
    • What is ChemSpider?
      • A database of almost 23 million compounds, >200 data sources
      • A deposition and curation platform
      • A publishing platform for the community
      • Grows daily – more depositions, more links, more data sources
  • 38. Search OEA
  • 39. Search OEA
  • 40. Search OEA
  • 41. Search OEA
  • 42. Linked Patents for OEA
  • 43.  
  • 44. Linked resources
    • Vendor sites – Aldrich, Alfa Aesar, TCI and 100s of others
    • Government databases – PubChem, DSSTox, FDA databases, ChemIDPlus, …
    • Biological databases – Protein Database, Stitch, KEGG, ChEBI, …
    • Analytical databases – NMRShiftDB, …
  • 45. Linked across the internet
  • 46. Kyoto Encyclopedia of Genes and Genomes
  • 47. Complex Data and Information
  • 48. Remember – QUALITY ISSUES
  • 49. The FDA’s DailyMed
  • 50. Incorrect Structures
  • 51. Crowd-sourcing Chemistry Curation
  • 52. The Currency of Recognition
    • We need to build a platform for recognition ….
  • 53. Chemistry – A Deposition Platform
    • CAS indexes published literature, patents and chemical vendors
    • CAS indexes ChemSpider – >303,000 records
    • “Lost Chemistry” – syntheses in theses, lab notebooks? Compounds in private collections?
    • ChemSpider accepts public depositions, linking to websites, hosting of details etc. Accepts structures, text, spectra, images.
  • 54. Blogs should be searchable too…
  • 55. Use Intelligent Structures: ChemSpider Embed Web Service
  • 56. ChemSpider Web Services
  • 57. Semantic Linking of Structures
    • What would you want to link off a structure?
      • Chemical suppliers
      • Other publications
      • Analytical Data
      • Related Reactions
      • Wikipedia
      • Patents
      • “Everything”
      • See Richard Kidd’s Talk
  • 58. Conclusions
    • Internet resources provide a collaborative community for chemistry
    • Crowdsourcing to expand, curate and integrate to the benefit of chemists
    • Searching the web for chemistry is arriving
    • InChIs are enabling chemistry on the internet
    • Question Quality!
  • 59.  
  • 60. Acknowledgments
    • Valery Tkachenko and Sergey Golotvin
    • RSC infrastructure team
    • The ChemSpider advisory group
    • The Wikipedia Chemistry team
  • 61. [email_address] Twitter: ChemSpiderman www.chemspider.com/blog
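
Slides 14 to 16 introduce the InChI identifier, its layers, and the InChIKey hash, and slides 29 to 31 contrast a skeleton search with a full-molecule search. The transcript does not say how ChemSpider implements either search, so what follows is only a minimal Python sketch under stated assumptions: the function names are hypothetical, the aspirin InChI and InChIKey are the standard published values, and the "skeleton" comparison assumes that matching the first InChIKey block (the hash of the connectivity) is an acceptable stand-in for a skeleton match.

```python
ASPIRIN_INCHI = "InChI=1S/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12)"
ASPIRIN_KEY = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"


def split_inchi(inchi):
    """Return the layers of an InChI string as (label, body) pairs."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    if len(parts) < 2:
        raise ValueError("InChI string has no formula layer")
    # The version and formula carry no prefix letter; later layers do.
    layers = [("version", parts[0]), ("formula", parts[1])]
    for part in parts[2:]:
        if part:
            layers.append((part[0], part[1:]))  # e.g. "c" or "h"
    return layers


def split_inchikey(key):
    """Split a 27-character InChIKey into its three hyphen-separated blocks."""
    blocks = key.split("-")
    if [len(block) for block in blocks] != [14, 10, 1]:
        raise ValueError("not a well-formed InChIKey")
    return tuple(blocks)


def same_skeleton(key_a, key_b):
    """Assumed skeleton match: the first (connectivity) blocks agree."""
    return split_inchikey(key_a)[0] == split_inchikey(key_b)[0]


def same_molecule(key_a, key_b):
    """Full match: the whole InChIKey agrees."""
    split_inchikey(key_a)  # validate both inputs before comparing
    split_inchikey(key_b)
    return key_a == key_b
```

Applied to the InChIKeys of two stereoisomers, `same_skeleton` returns `True` while `same_molecule` returns `False`, which is one way a skeleton search could return more hits than a full-molecule search.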