Building a semantic chemistry platform
with the Royal Society of Chemistry
Valery Tkachenko, Colin Batchelor, Peter Corbet...
We live in an exponentially expanding world of “big data”. Social networks, global portals and other distributed systems have been dealing with the problem for a few years now. Scientific applications often lag behind mainstream trends because of the complexity of the scientific domain. The Royal Society of Chemistry is building the Global Chemistry Network, connecting a variety of in-house and external resources, bridging gaps and advancing the chemical sciences. One of the main issues in the world of big data is ease of navigation and the comprehensiveness of search. This is where the semantic web approach meets the world of big data. We will present our approaches to building a global federated chemistry platform that connects multiple domains of chemistry using semantic web technologies.

Published in: Technology

  1. Building a semantic chemistry platform with the Royal Society of Chemistry. Valery Tkachenko, Colin Batchelor, Peter Corbett, Ken Karapetyan, Alexey Pshenichnov, Antony Williams. ACS 247th National Meeting, Dallas, TX
  2. Big Data World and Chemistry; ChemSpider; RSC Archive; RSC Chemistry Platform; Data quality; Global Chemistry Network
  3. Science map
  4. Chemical space - 10^60
  5. Visualization and navigation
  6. Automated learning
  7. Managing Big Data
  8. Big Data World and Chemistry; ChemSpider; RSC Archive; RSC Chemistry Platform; Data quality; Global Chemistry Network
  9. • ~30 million chemicals and growing
     • Data sourced from >500 different sources
     • Crowdsourced curation and annotation
     • Ongoing deposition of data from our journals and our collaborators
     • A structure-centric hub for web searching
  10. ChemSpider
  11. ChemSpider - properties
  12. ChemSpider - references
  13. ChemSpider - classification
  14. Share in a “proper way”
  15. Big Data World and Chemistry; ChemSpider; RSC Archive; RSC Chemistry Platform; Data quality; Global Chemistry Network
  16. RSC Archive – since 1841
  17. It is so difficult to navigate… What’s the structure? Are they in our file? What’s similar? What’s the target? Pharmacology data? Known pathways? Working on now? Connections to disease? Expressed in right cell type? Competitors? IP?
  18. Digitally Enabling RSC Archive
  19. CSSP Article Example: Compounds; Reaction; Analytical Data; Text and References
  20. Big Data World and Chemistry; ChemSpider; RSC Archive; RSC Chemistry Platform; Data quality; Global Chemistry Network
  21. RSC Chemistry Platform: ChemSpider Compounds; ChemSpider Reactions; ChemSpider Spectra; ChemSpider Crystals; ChemSpider Materials; ChemSpider Assays; ChemSpider Algorithms
  22. Data Pipeline (diagram labels)
     • Deposition Gateway
     • Staging databases
     • Compounds; Reactions; Spectra; Crystals; Materials
     • Compounds Module; Spectra Module; Reactions Module; Materials Module; Textmining Module; … Module
     • Web UI for unified depositions; DropBox, Google Drive, SkyDrive, etc.; ELNs, templated data input; Documents; API, FTP, etc.
     • Raw data; Validated data; Staging databases
     • All databases are sliced by data sources/data collections and have a simple security model where each data slice/source is private, public or embargoed
     • Etc.; Experiments; Research
  23. Compounds Database
  24. Reactions Database
     • ChemSpider Synthetic Pages
     • Methods in Organic Synthesis
     • Catalysts and Catalyzed Reactions
     • USPTO
  25. Reactions Database
  26. Analytical Data Database
  27. Data Pipeline (diagram labels)
     • Data tier: Compounds; Reactions; Spectra; Crystals; Documents
     • Data access tier: Compounds API; Reactions API; Spectra API; Crystals API; Documents API
     • User interface components tier: Compounds Widgets; Reactions Widgets; Spectra Widgets; Crystals Widgets; Documents Widgets
     • User interface tier (examples): Analytical Laboratory application; Electronic Laboratory Notebook; Paid 3rd party integrations (various platforms – SharePoint, Google, etc.); Chemical Inventory application
  28. Big Data World and Chemistry; ChemSpider; RSC Archive; RSC Chemistry Platform; Data quality; Global Chemistry Network
  29. Data quality
     – Robochemistry
     – Proliferation of errors in public and private databases
     – Automated quality control system
     – Crowdsourcing
  30. Typical errors in public databases. J. Brecher, IUPAC Graphical Representation of stereochem. configurations, Section: ST-1.1.10. DB06287
  31. Chemistry Validation and Standardization Platform
  32. Crowdsourcing and AltMetrics
  33. RSC/Rewards and Recognition. Congratulations! Your 1st CSSP article has been published. Philosopher Lao Tzu said “A journey of a thousand miles begins with a single step”. In the same way we hope that this will be the first of many submissions that you make to CSSP. The First Step badge is awarded when a user submits (and has published) their 1st CSSP article.
  34. Big Data World and Chemistry; ChemSpider; RSC Archive; RSC Chemistry Platform; Data quality; Global Chemistry Network
  35. We are a part of a much larger world
  36. Research data network (diagram labels)
     • University 1: Data Hub, Workstations
     • University 2: Data Hub, Workstations
     • Company 3: Data Hub, Workstations
     • Data Repository: indexed storage; Data Repository: provided data storage
     • Chemically intelligent services; Indexes; Data
     • External clients: Publishers; Scientists; Funding bodies
  37. ChemSpider APIs
  38. National Chemistry Database
  39. http://www.openphacts.org Open PHACTS is an Innovative Medicines Initiative (IMI) project, aiming to reduce the barriers to drug discovery in industry, academia and for small businesses. The semantic web is one of its cornerstones.
  40. OSDD
  41. Thank you. Email: tkachenkov@rsc.org Slides: http://www.slideshare.net/valerytkachenko16
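
Slide 22 states that all databases are sliced by data source or collection, and that each slice is private, public or embargoed. The sketch below is one possible reading of that rule, not the platform's actual implementation. The names `DataSlice`, `Visibility` and `can_read`, the single-owner model, and the idea that an embargo lifts on a fixed date are all assumptions made for illustration; the slides specify none of them.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Visibility(Enum):
    """The three states named on slide 22."""
    PRIVATE = "private"
    PUBLIC = "public"
    EMBARGOED = "embargoed"


@dataclass(frozen=True)
class DataSlice:
    """One data source/collection inside a database (hypothetical model)."""
    name: str
    owner: str
    visibility: Visibility
    # Assumption: an embargo carries a release date; the slides do not say so.
    embargo_until: Optional[date] = None


def can_read(data_slice: DataSlice, user: str, today: date) -> bool:
    """Return True if `user` may read records from `data_slice` on `today`."""
    if user == data_slice.owner:
        return True  # assumed: owners always see their own slice
    if data_slice.visibility is Visibility.PUBLIC:
        return True
    if data_slice.visibility is Visibility.EMBARGOED:
        # Assumed: an embargoed slice opens to everyone once the date is reached.
        return (data_slice.embargo_until is not None
                and today >= data_slice.embargo_until)
    return False  # private slices stay closed to everyone but the owner
```

Under these assumptions, a slice created as `DataSlice("journal-deposit", "rsc", Visibility.EMBARGOED, date(2014, 6, 1))` would be readable only by its owner before 1 June 2014 and by any user from that date on. Keeping the check in one function per slice means every database in the data tier could apply the same rule.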