ChemSpider compound database as one of the pillars
of a semantic web for chemistry

Valery Tkachenko, Antony J Williams, Ken Karapetyan,
Colin Batchelor, Jon Steele, Aileen Day and David Sharpe

ACS Philly August 2012
Outline
   The world we live in
   Pillars of the world
   ChemSpider as a semantic web system
   Example of federated semantic web system
The World we live in
 Internet World
    ~20 years of WWW and inflationary expansion
    Web 2.0

 Connected World
   Social Networks
   Mobile Communications
   Internet TV

 Big Data World
   Semantic content
   New Interfaces
Pillars of the World
 Data is King
    New data model approach: SQL → NoSQL
    Inflow of data
    Structured data
 Search and Navigation
    Search by all domain specific information
    Navigate inside and link out
 Cloud
    Data and code are distributed and self-sustained
    Federated systems take precedence over standalone solutions
 Interfaces
    Sophisticated HCI (human computer interface)
    Pervasive M2M (machine to machine)
Chemistry on the Internet
What’s wrong?!?!
 Is science (and chemistry in particular) so miserable in the
  world we live in?
 Or is it too obscure and complex to be easily presented?
 Or are scientists rather conservative beasts?
Scientific data complexity
Chemical data complexity
ChemCloud
ChemSpider
   Database of small organic molecules
      Properties
      Names and synonyms
      Spectra
   Contribute in an easy way
      New data depositions
      Existing data curations
   Search engine for chemistry
      Search for a chemical by a, b, c
      Cluster and navigate relationships
   Extensive infrastructure
      Computer farm
      Components
   Standard interfaces
      SOAP
      REST
      JSON
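
The "search a chemical by a, b, c" idea and the JSON interface can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Compound` record shape, the `search` and `to_json` helpers, and the three sample records are hypothetical and are not the ChemSpider schema or API.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Compound:
    """Hypothetical record shape, not the ChemSpider schema."""
    name: str
    formula: str
    synonyms: List[str] = field(default_factory=list)


# Tiny in-memory stand-in for a compound database.
RECORDS = [
    Compound("aspirin", "C9H8O4", ["acetylsalicylic acid"]),
    Compound("caffeine", "C8H10N4O2", ["1,3,7-trimethylxanthine"]),
    Compound("ethanol", "C2H6O", ["ethyl alcohol"]),
]


def search(query: str) -> List[Compound]:
    """Match a query against name, formula, or any synonym (case-insensitive)."""
    q = query.strip().lower()
    return [
        c for c in RECORDS
        if q == c.name.lower()
        or q == c.formula.lower()
        or q in (s.lower() for s in c.synonyms)
    ]


def to_json(compounds: List[Compound]) -> str:
    """Serialize results the way a REST/JSON endpoint might return them."""
    return json.dumps([asdict(c) for c in compounds], sort_keys=True)


if __name__ == "__main__":
    print(to_json(search("acetylsalicylic acid")))
```

The same lookup function serves all three kinds of query, so a client does not need to say in advance whether it is sending a name, a synonym, or a formula.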
ChemSpider
[Architecture diagram: UI, Filters, APIs, Data, BPF (distributed computing)]
Deposition System
Validation and Standardization
User Interface
 Web
 Mobile
 GUI components
JS Components
Google Search
APIs
 SOAP
   Traditional web services

 REST/JSON
   Used in JS applications

 RDF
   Exchange format for semantic web

 SPARQL
   Query language
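
To make the RDF and SPARQL bullets concrete, here is a minimal sketch of the underlying idea: data as subject-predicate-object triples, queried by a pattern with variables. It is a toy matcher in plain Python, not a SPARQL engine, and the `ex:` identifiers, the `TRIPLES` data, and the `match` helper are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical triples; the "ex:" terms are made up for this sketch.
TRIPLES: List[Triple] = [
    ("ex:aspirin", "ex:hasFormula", "C9H8O4"),
    ("ex:aspirin", "ex:hasSynonym", "acetylsalicylic acid"),
    ("ex:caffeine", "ex:hasFormula", "C8H10N4O2"),
]


def match(pattern: Triple, triples: List[Triple] = TRIPLES) -> List[Dict[str, str]]:
    """Return variable bindings for one triple pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    results = []
    for triple in triples:
        binding: Dict[str, str] = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            results.append(binding)
    return results


if __name__ == "__main__":
    # Roughly comparable to the SPARQL query:
    #   SELECT ?compound ?formula WHERE { ?compound ex:hasFormula ?formula }
    for row in match(("?compound", "ex:hasFormula", "?formula")):
        print(row)
```

A real deployment would use an RDF store and a SPARQL endpoint; the sketch only shows why triples plus pattern matching make data from different sources queryable in one uniform way.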
OpenPHACTS
 Open PHACTS is a 3-year Innovative Medicines
  Initiative (IMI) project

 To reduce the barriers to drug discovery in
  industry, academia and for small businesses

 To build an open platform, integrating chemistry
  and biology data from public domain resources

 Open Standards, Open Data and Open Source
Acknowledgements
 RSC Cheminformatics group

 Open PHACTS consortium

 Software: GGA Software, ACD/Labs, Scilligence,
  OpenEye, Accelrys, ChemDoodle, ChemAxon,
  Dotmatics, OpenBabel, Jmol, JSpecView,
Thank you

Email: tkachenkov@rsc.org
Blog: www.chemspider.com/blog
SLIDES:
http://www.slideshare.net/valerytkachenko16
