Building Global Chemistry Network at the Royal Society of Chemistry
Valery Tkachenko
ICSTI Workshop
Data and Non-Data Inte...
The World we live in
Internet World
20+ years into the Internet Revolution
Web 2.0 -> Web 3.0
Connected World
Social Netwo...
Big Data challenge
RSC/ChemSpider platforms
Crowdsourcing and AltMetrics
New interfaces
Building Global Chemistry Network
Science map
Chemistry on the Internet
Why disproportion?
Scientific complexity
Conservative nature
Big Data challenge
RSC/ChemSpider platforms
Crowdsourcing and AltMetrics
New interfaces
Building Global Chemistry Network
Royal Society of Chemistry (RSC)
Largest European organisation for advancing the chemical sciences
Founded 1841
Not-for pr...
About the RSC
• Headquarters in London
• Offices in Cambridge, Beijing, Shanghai, Philadelphia, Tokyo, Bangalore, Sao Paulo
STM publisher

Knowledge

Delivery Magic

Our User Interfaces
(Desktop, Web, Mobile, etc)

Customers

3rd party integratio...
ChemSpider Suite
UIs
ChemSpider Reactions mobile web app
ChemSpider website
ChemSpider desktop app
Depositions client

...
• 29 million chemicals and growing
• Data sourced from >500 different sources
• Crowdsourced curation and annotation
• On...
ChemSpider
ChemSpider Reactions
RSC Archive – since 1841
DERA Digitally Enabling RSC Archive
Semantic Mark-up of Articles
It is so difficult to navigate…
IP?
What’s the structure?
Are they in our file?
...
DERA Architecture
Structures

Reactions

DERA
(Text Mining)
Text, PDF, XML

Chemistry Validation and
Standardization Platf...
Data quality issue and CVSP
Robochemistry
Proliferation of errors in public and
private databases
Automated quality contro...
DrugBank dataset (6516 records)
~60 records that can’t be dearomatized unambiguously

DB04283

DB04462
~30 records with bonds that do not make sense

DB04283
DB04009
7 records with 2 stereo bonds at chiral atoms

J. Brecher, IUPAC Graphical Representation of stereochem. configurations
S...
“Direction of bond makes no sense” – 63%
“Stereo types of non-opposite bonds match” – 2%
ChemSpider Suite
UIs
ChemSpider Reactions mobile web app
ChemSpider website
ChemSpider desktop app
Depositions client

...
Big Data challenge
RSC/ChemSpider platforms
Crowdsourcing and AltMetrics
New interfaces
Building Global Chemistry Network
AltMetrics
Plum Analytics
RSC/Rewards and Recognition
The First Step badge is awarded when a user submits (& has published) their 1st CSSP article.
...
Big Data challenge
RSC/ChemSpider platforms
Crowdsourcing and AltMetrics
New interfaces
Building Global Chemistry Network
Visualization
Navigation
ChemSpider APIs
Big Data challenge
RSC/ChemSpider platforms
Crowdsourcing and AltMetrics
New interfaces
Building Global Chemistry Network
We are a part of a larger world
National Chemistry Database
National Data Repository
Scientists

Funding bodies

External clients

Publishers
Indexes
Data Repository
indexed storage
...
http://www.openphacts.org
Open PHACTS is an Innovative
Medicines Initiative (IMI) project,
aiming to reduce the barriers t...
We know about Natural Products
Marinlit
OSDD
The Future
Internet Data

Small organic molecules
Undefined materials
Organometallics
Nanomaterials
Polymers
Minerals
Part...
Thank you
Email: tkachenkov@rsc.org
Slides: http://www.slideshare.net/valerytkachenko16

The Royal Society of Chemistry is building a Global Chemistry Network that will connect chemical resources and chemists across the globe in a single scientific information network, dynamically updated in real time. We have been working on several of the foundation technologies for a number of years, including a structure database containing almost 30 million chemicals, a micropublishing environment, a platform for the validation and standardization of chemical structure representations, and a text-mining and semantic markup platform for data-enabling our published articles. Our goal is to provide seamless tools for researchers, librarians, publishers, information technology specialists and government agencies, facilitating scientific research through a free flow of information. This talk will review our work to date on a chemistry data platform for the community and will highlight some of the challenges we face as we expand the architecture for our Global Chemistry Network platform.

Published in: Technology, Education

  1. 1. Building Global Chemistry Network at the Royal Society of Chemistry Valery Tkachenko ICSTI Workshop Data and Non-Data Integration – A Journey Across Disciplines Ottawa, October 16th 2013
  2. 2. The World we live in Internet World 20+ years into the Internet Revolution Web 2.0 -> Web 3.0 Connected World Social Networks Real-time Communications Big Data World Semantic content New Interfaces
  3. 3. Big Data challenge RSC/ChemSpider platforms Crowdsourcing and AltMetrics New interfaces Building Global Chemistry Network
  4. 4. Science map
  5. 5. Chemistry on the Internet
  6. 6. Why disproportion? Scientific complexity Conservative nature
  7. 7. Big Data challenge RSC/ChemSpider platforms Crowdsourcing and AltMetrics New interfaces Building Global Chemistry Network
  8. 8. Royal Society of Chemistry (RSC): Largest European organisation for advancing the chemical sciences; Founded 1841; Not-for-profit; “To be the leading voice and trusted partner for science and humanity”; Professional body with a worldwide network of 48,000 members; International publisher; ~400 employees; Education facilitator, Science leader, E-Science leaders
  9. 9. About the RSC • Headquarters in London • Offices in Cambridge, Beijing, Shanghai, Philadelphia, Tokyo, Bangalore, Sao Paulo
  10. 10. STM publisher Knowledge Delivery Magic Our User Interfaces (Desktop, Web, Mobile, etc) Customers 3rd party integrations (our web services)
  11. 11. ChemSpider Suite. UIs: ChemSpider Reactions mobile web app, ChemSpider website, ChemSpider desktop app, Depositions client. Components Layer: Java Beans, JS Components, Python widgets, ASP.NET Components, PHP snippets, Google Apps Components, SharePoint Components. APIs Layer: Search API, CSC API, Export API, CSR API, DS API, Processing API, CSS API, CSM API, CSA API, CSAs API. Business Objects Layer: CSC BO, CSR BO, CSS BO, CSM BO, CSA BO, CSAs BO. Data Layer: ChemSpider Compounds, ChemSpider Reactions, ChemSpider Spectra, ChemSpider Materials, ChemSpider Algorithms, ChemSpider Assays
  12. 12. • 29 million chemicals and growing • Data sourced from >500 different sources • Crowdsourced curation and annotation • Ongoing deposition of data from our journals and our collaborators • A structure-centric hub for web-searching
  13. 13. ChemSpider
  14. 14. ChemSpider
  15. 15. ChemSpider
  16. 16. ChemSpider
  17. 17. ChemSpider
  18. 18. ChemSpider
  19. 19. ChemSpider
  20. 20. ChemSpider
  21. 21. ChemSpider
  22. 22. ChemSpider Reactions
  23. 23. ChemSpider Reactions
  24. 24. ChemSpider Reactions
  25. 25. ChemSpider Reactions
  26. 26. RSC Archive – since 1841
  27. 27. DERA Digitally Enabling RSC Archive
  28. 28. Semantic Mark-up of Articles
  29. 29. It is so difficult to navigate… IP? What’s the structure? Are they in our file? What’s similar? Pharmacology data? What’s the target? Known Pathways? Competitors? Connections to disease? Working On Now? Expressed in right cell type?
  30. 30. DERA Architecture Structures Reactions DERA (Text Mining) Text, PDF, XML Chemistry Validation and Standardization Platform (CVSP) Spectra Materials Biological Activities
  31. 31. Data quality issue and CVSP Robochemistry Proliferation of errors in public and private databases Automated quality control system
  32. 32. DrugBank dataset (6516 records) ~60 records that can’t be dearomatized unambiguously DB04283 DB04462
  33. 33. ~30 records with bonds that do not make sense: DB04283, DB04009
  34. 34. 7 records with 2 stereo bonds at chiral atoms (J. Brecher, IUPAC Graphical Representation of stereochem. configurations, Section: ST-1.1.10): DB08128, DB06287
  35. 35. “Direction of bond makes no sense” – 63%
  36. 36. “Stereo types of non-opposite bonds match” – 2%
  37. 37. ChemSpider Suite. UIs: ChemSpider Reactions mobile web app, ChemSpider website, ChemSpider desktop app, Depositions client. Components Layer: Java Beans, JS Components, Python widgets, ASP.NET Components, PHP snippets, Google Apps Components, SharePoint Components. APIs Layer: Search API, CSC API, Export API, CSR API, DS API, Processing API, CSS API, CSM API, CSA API, CSAs API. Business Objects Layer: CSC BO, CSR BO, CSS BO, CSM BO, CSA BO, CSAs BO. Data Layer: ChemSpider Compounds, ChemSpider Reactions, ChemSpider Spectra, ChemSpider Materials, ChemSpider Algorithms, ChemSpider Assays
  38. 38. Big Data challenge RSC/ChemSpider platforms Crowdsourcing and AltMetrics New interfaces Building Global Chemistry Network
  39. 39. AltMetrics
  40. 40. Plum Analytics
  41. 41. RSC/Rewards and Recognition The First Step badge is awarded when a user submits (& has published) their 1st CSSP article. Congratulations! Your 1st CSSP article has been published. Philosopher Lao Tzu said “A journey of a thousand miles begins with a single step”. In the same way we hope that this will be the first of many submissions that you make to CSSP.
  42. 42. Big Data challenge RSC/ChemSpider platforms Crowdsourcing and AltMetrics New interfaces Building Global Chemistry Network
  43. 43. Visualization
  44. 44. Navigation
  45. 45. ChemSpider APIs
  46. 46. Big Data challenge RSC/ChemSpider platforms Crowdsourcing and AltMetrics New interfaces Building Global Chemistry Network
  47. 47. We are a part of a larger world
  48. 48. National Chemistry Database
  49. 49. National Data Repository Scientists Funding bodies External clients Publishers Indexes Data Repository indexed storage Chemically intelligent services Data Data Repository provided data storage University 1 University 2 Data Hub Workstations Company 3 Data Hub Workstations Data Hub Workstations
  50. 50. http://www.openphacts.org Open PHACTS is an Innovative Medicines Initiative (IMI) project, aiming to reduce the barriers to drug discovery in industry, academia and for small businesses. Semantic web is one of the cornerstones
  51. 51. We know about Natural Products
  52. 52. Marinlit
  53. 53. OSDD
  54. 54. The Future Internet Data Small organic molecules Undefined materials Organometallics Nanomaterials Polymers Minerals Particle bound Links to Biologicals Commercial Software Pre-competitive Data Open Science Open Data Publishers Educators Open Databases Chemical Vendors
  55. 55. Thank you Email: tkachenkov@rsc.org Slides: http://www.slideshare.net/valerytkachenko16
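
The data-quality slides (31 to 36) describe an automated quality control system that runs named checks over a dataset and reports how many records each check flags. The deck does not show how CVSP implements this, so the following is only a minimal sketch of the general idea in Python. The `Bond` and `Record` classes, the stereo encoding, both rule functions and the sample record IDs are hypothetical and are not taken from CVSP or DrugBank.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Bond:
    begin: int             # atom index at the narrow end of a wedge/hash bond
    end: int               # atom index at the wide end
    stereo: str = "none"   # "none", "wedge" or "hash" (hypothetical encoding)


@dataclass
class Record:
    record_id: str
    chiral_atoms: set = field(default_factory=set)
    bonds: list = field(default_factory=list)


def rule_multiple_stereo_bonds(record):
    """Toy rule: a chiral atom is the narrow end of two or more stereo bonds."""
    starts = Counter(b.begin for b in record.bonds if b.stereo != "none")
    return any(starts[a] >= 2 for a in record.chiral_atoms)


def rule_stereo_bond_off_centre(record):
    """Toy rule: a stereo bond whose narrow end is not a chiral atom."""
    return any(
        b.stereo != "none" and b.begin not in record.chiral_atoms
        for b in record.bonds
    )


# Illustrative rule names only; they are not CVSP's real rule set.
RULES = {
    "multiple stereo bonds at chiral atom": rule_multiple_stereo_bonds,
    "stereo bond does not start at chiral atom": rule_stereo_bond_off_centre,
}


def validate(records):
    """Return (issues per record, share of records flagged by each rule)."""
    per_record = {
        r.record_id: [name for name, rule in RULES.items() if rule(r)]
        for r in records
    }
    counts = Counter(name for issues in per_record.values() for name in issues)
    total = len(records) or 1
    shares = {name: counts[name] / total for name in RULES}
    return per_record, shares


# Made-up records, not real database entries.
records = [
    Record("REC-1", {0}, [Bond(0, 1, "wedge"), Bond(0, 2, "hash")]),
    Record("REC-2", {0}, [Bond(1, 0, "wedge")]),
    Record("REC-3", {0}, [Bond(0, 1, "wedge"), Bond(0, 2)]),
]
per_record, shares = validate(records)
for record_id, issues in per_record.items():
    print(record_id, issues or "no issues")
```

Keeping each check as a small named function makes it straightforward to report per-rule percentages of the kind quoted on slides 35 and 36, and to add or retire rules without touching the reporting code.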