ChemSpider compound database as one of the pillars of a semantic web for chemistry

Valery Tkachenko, Antony J Williams, Ken Karapetyan, Colin Batchelor, Jon Steele, Aileen Day and David Sharpe

ACS Philly August 2012
Outline
   The world we live in
   Pillars of the world
   ChemSpider as a semantic web system
   Example of federated semantic web system
The World we live in
 Internet World
    ~20 years of WWW and inflationary expansion
    Web 2.0

 Connected World
   Social Networks
   Mobile Communications
   Internet TV

 Big Data World
   Semantic content
   New Interfaces
Pillars of the World
 Data is King
    New data model approach: SQL → NoSQL
    Inflow of data
    Structured data
 Search and Navigation
    Search by all domain specific information
    Navigate inside and link out
 Cloud
    Data and code are distributed and self-sustained
    Federated systems take precedence over standalone solutions
 Interfaces
    Sophisticated HCI (human-computer interface)
    Pervasive M2M (machine-to-machine)
Chemistry on the Internet
What’s wrong?!?!
 Is science (and chemistry in particular) so miserable in the
  world we live in?
 Or too obscure and complex to be easily presented?
 Or are scientists rather conservative beasts?
Scientific data complexity
Chemical data complexity
ChemCloud
ChemSpider
   Database of small organic molecules
      Properties
      Names and synonyms
      Spectra
   Contribute in an easy way
      New data depositions
      Existing data curations
   Search engine for chemistry
      Search a chemical by a, b, c
      Cluster and navigate relationships
   Extensive infrastructure
      Computer farm
      Components
   Standard interfaces
      SOAP
      REST
      JSON
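
As a rough illustration of the "Standard interfaces" bullet, the sketch below serializes a compound record holding the three data categories listed above (properties, names and synonyms, spectra) to JSON, as a REST client might receive it. The field names and the record layout are assumptions made for this example, not the actual ChemSpider schema.

```python
import json

# Hypothetical record layout: the field names are assumptions for
# illustration only and do not reflect the actual ChemSpider schema.
def make_record(record_id, names, properties, spectra):
    return {
        "id": record_id,
        "names": list(names),
        "properties": dict(properties),
        "spectra": list(spectra),
    }

def to_json(record):
    # sort_keys gives a stable serialization that is easy to diff and cache.
    return json.dumps(record, sort_keys=True)

def from_json(payload):
    return json.loads(payload)

record = make_record(
    record_id=1,  # placeholder identifier, not a real database ID
    names=["ethanol", "ethyl alcohol"],
    properties={"molecular_formula": "C2H6O"},
    spectra=[],  # no spectra attached in this toy record
)
payload = to_json(record)
```

Because the payload is plain JSON, the same record can be consumed by a JavaScript front end or by another service without any chemistry-specific tooling.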
[Diagram: ChemSpider components: UI, Filters, APIs, Data, BPF (distributed computing)]
Deposition System
Validation and Standardization
User Interface
 Web
 Mobile
 GUI components
JS Components
Google Search
APIs
 SOAP
   Traditional web services

 REST/JSON
   Used in JS applications

 RDF
   Exchange format for semantic web

 SPARQL
   Query language
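
To make the RDF and SPARQL bullets concrete, here is a minimal sketch of the underlying idea: data expressed as subject-predicate-object triples, queried by patterns containing variables. It is a toy in-memory matcher written for this example, not an RDF library and not ChemSpider's data model; every `ex:` identifier is a made-up placeholder.

```python
# Toy in-memory triple store. All "ex:" identifiers are hypothetical
# placeholders, not terms from any real vocabulary.
TRIPLES = [
    ("ex:compound1", "ex:name", "ethanol"),
    ("ex:compound1", "ex:formula", "C2H6O"),
    ("ex:compound2", "ex:name", "methanol"),
    ("ex:compound2", "ex:formula", "CH4O"),
]

def match(pattern, triples=TRIPLES):
    """Return one binding dict per triple that matches the pattern.

    Terms starting with '?' are variables, as in SPARQL; all other
    terms must match the triple exactly.
    """
    results = []
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value.
                if bindings.get(term, value) != value:
                    break
                bindings[term] = value
            elif term != value:
                break
        else:
            results.append(bindings)
    return results

# Roughly equivalent to: SELECT ?c WHERE { ?c ex:formula "C2H6O" }
hits = match(("?c", "ex:formula", "C2H6O"))
```

A real deployment would use an RDF store and a SPARQL endpoint instead; the point of the sketch is only that the same pattern syntax works across any source that exposes its data as triples, which is what makes federation possible.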
Open PHACTS
 Open PHACTS is a three-year Innovative Medicines Initiative (IMI) project

 To reduce the barriers to drug discovery in
  industry, academia and for small businesses

 To build an open platform, integrating chemistry
  and biology data from public domain resources

 Open Standards, Open Data and Open Source
Acknowledgements
 RSC Cheminformatics group

 Open PHACTS consortium

 Software: GGA Software, ACD/Labs, Scilligence,
  OpenEye, Accelrys, ChemDoodle, ChemAxon,
  Dotmatics, OpenBabel, Jmol, JSpecView,
Thank you

Email: tkachenkov@rsc.org
Blog: www.chemspider.com/blog
SLIDES:
http://www.slideshare.net/valerytkachenko16
