Building a semantic chemistry platform with the Royal Society of Chemistry

We live in an exponentially expanding world of “big data”. Social networks, global portals and other distributed systems have been attempting to deal with the problem for a few years now, while scientific applications commonly lag behind mainstream trends because of the complexity of the scientific domain. The Royal Society of Chemistry is building the Global Chemistry Network, which connects a variety of in-house and external resources, bridging gaps and advancing the chemical sciences. One of the main issues in the world of big data is ease of navigation and comprehensive search, and this is where the semantic web approach meets big data. We will present our approaches to building a global federated chemistry platform that connects multiple domains of chemistry using semantic web technologies.


Building a semantic chemistry platform with the Royal Society of Chemistry: Presentation Transcript

  • 1. Building a semantic chemistry platform with the Royal Society of Chemistry. Valery Tkachenko, Colin Batchelor, Peter Corbett, Ken Karapetyan, Alexey Pshenichnov, Antony Williams. ACS 247th National Meeting, Dallas, TX
  • 2. Big Data World and Chemistry; ChemSpider; RSC Archive; RSC Chemistry Platform; Data quality; Global Chemistry Network
  • 3. Science map
  • 4. Chemical space - 10^60
  • 5. Visualization and navigation
  • 6. Automated learning
  • 7. Managing Big Data
  • 8. Big Data World and Chemistry; ChemSpider; RSC Archive; RSC Chemistry Platform; Data quality; Global Chemistry Network
  • 9. ChemSpider:
      • ~30 million chemicals and growing
      • Data sourced from >500 different sources
      • Crowdsourced curation and annotation
      • Ongoing deposition of data from our journals and our collaborators
      • A structure-centric hub for web searching
  • 10. ChemSpider
  • 11. ChemSpider - properties
  • 12. ChemSpider - references
  • 13. ChemSpider - classification
  • 14. Share in a “proper way”
  • 15. Big Data World and Chemistry; ChemSpider; RSC Archive; RSC Chemistry Platform; Data quality; Global Chemistry Network
  • 16. RSC Archive – since 1841
  • 17. It is so difficult to navigate… What’s the structure? Are they in our file? What’s similar? What’s the target? Pharmacology data? Known pathways? Working on now? Connections to disease? Expressed in right cell type? Competitors? IP?
  • 18. Digitally Enabling RSC Archive
  • 19. CSSP Article Example Compounds Reaction Analytical Data Text and References
  • 20. Big Data World and Chemistry; ChemSpider; RSC Archive; RSC Chemistry Platform; Data quality; Global Chemistry Network
  • 21. RSC Chemistry Platform ChemSpider Compounds ChemSpider Reactions ChemSpider Spectra ChemSpider Crystals ChemSpider Materials ChemSpider Assays ChemSpider Algorithms
  • 22. Data Pipeline. Deposition Gateway inputs: web UI for unified depositions; DropBox, Google Drive, SkyDrive, etc.; ELNs, templated data input; documents; API, FTP, etc.; experiments; research. Staging databases: raw data, validated data. Modules: Compounds Module, Spectra Module, Reactions Module, Materials Module, Textmining Module, … Module. Databases: Compounds, Reactions, Spectra, Crystals, Materials, etc. All databases are sliced by data sources/data collections and have a simple security model where each data slice/source is private, public or embargoed.
  • 23. Compounds Database
  • 24. Reactions Database:
      • ChemSpider Synthetic Pages
      • Methods in Organic Synthesis
      • Catalysts and Catalyzed Reactions
      • USPTO
  • 25. Reactions Database
  • 26. Analytical Data Database
  • 27. Data Pipeline. Data tier: Compounds, Reactions, Spectra, Crystals, Documents. Data access tier: Compounds API, Reactions API, Spectra API, Crystals API, Documents API. User interface components tier: Compounds Widgets, Reactions Widgets, Spectra Widgets, Crystals Widgets, Documents Widgets. User interface tier (examples): Analytical Laboratory application; Electronic Laboratory Notebook; Chemical Inventory application; paid 3rd party integrations (various platforms: SharePoint, Google, etc.)
  • 28. Big Data World and Chemistry; ChemSpider; RSC Archive; RSC Chemistry Platform; Data quality; Global Chemistry Network
  • 29. Data quality:
      • Robochemistry
      • Proliferation of errors in public and private databases
      • Automated quality control system
      • Crowdsourcing
  • 30. Typical public database errors. J. Brecher, IUPAC Graphical Representation of stereochem. configurations, Section ST-1.1.10. DB06287
  • 31. Chemistry Validation and Standardization Platform
  • 32. Crowdsourcing and AltMetrics
  • 33. RSC/Rewards and Recognition Congratulations! Your 1st CSSP article has been published. Philosopher Lao Tzu said “A journey of a thousand miles begins with a single step”. In the same way we hope that this will be the first of many submissions that you make to CSSP. The First Step badge is awarded when a user submits (& has published) their 1st CSSP article.
  • 34. Big Data World and Chemistry; ChemSpider; RSC Archive; RSC Chemistry Platform; Data quality; Global Chemistry Network
  • 35. We are a part of a much larger world
  • 36. Research data network University 1 Data Hub Workstations University 2 Data Hub Workstations Company 3 Data Hub Workstations Data Repository indexed storage Data Repository provided data storage Chemically intelligent services Indexes Data External clients Publishers Scientists Funding bodies
  • 37. ChemSpider APIs
  • 38. National Chemistry Database
  • 39. http://www.openphacts.org Open PHACTS is an Innovative Medicines Initiative (IMI) project aiming to reduce the barriers to drug discovery in industry, academia and small businesses. The semantic web is one of its cornerstones.
  • 40. OSDD
  • 41. Thank you Email: tkachenkov@rsc.org Slides: http://www.slideshare.net/valerytkachenko16
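
Slide 22 describes a simple security model in which every data slice/source is private, public or embargoed. The slides do not say how this is implemented, so the following is only a minimal Python sketch under stated assumptions: the names `Visibility`, `DataSlice`, `can_read` and `readable_slices` are hypothetical, and so are the rules that an owner can always read their own slice and that an embargoed slice becomes readable once its embargo date has passed.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import List, Optional


class Visibility(Enum):
    """The three states named on slide 22."""
    PRIVATE = "private"
    PUBLIC = "public"
    EMBARGOED = "embargoed"


@dataclass(frozen=True)
class DataSlice:
    """One data source/collection within a database (hypothetical shape)."""
    name: str
    owner: str
    visibility: Visibility
    embargo_until: Optional[date] = None  # only meaningful for EMBARGOED slices


def can_read(slice_: DataSlice, user: str, today: date) -> bool:
    """Return True if `user` may read `slice_` on `today`."""
    # Assumption: the owner always has access to their own slice.
    if user == slice_.owner:
        return True
    if slice_.visibility is Visibility.PUBLIC:
        return True
    if slice_.visibility is Visibility.EMBARGOED:
        # Assumption: an embargoed slice opens once its embargo date has passed.
        return slice_.embargo_until is not None and today >= slice_.embargo_until
    return False  # PRIVATE slices are visible to the owner only


def readable_slices(slices: List[DataSlice], user: str, today: date) -> List[str]:
    """Names of the slices that `user` may read on `today`, in input order."""
    return [s.name for s in slices if can_read(s, user, today)]


# Illustrative data only; none of these slices come from the presentation.
slices = [
    DataSlice("journal-depositions", "rsc", Visibility.PUBLIC),
    DataSlice("lab-notebook", "alice", Visibility.PRIVATE),
    DataSlice("pre-publication", "bob", Visibility.EMBARGOED, date(2014, 6, 1)),
]

print(readable_slices(slices, "alice", date(2014, 3, 16)))
print(readable_slices(slices, "carol", date(2014, 7, 1)))
```

Keeping the check in a single function such as `can_read` means every database sliced this way could apply the same rule, which matches the "simple" model the slide describes, though the real platform may enforce it quite differently.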