Opentox Virtual Seminar: Bioclipse - Life Science Application and Ontology Development in Cheminformatics and Bioinformatics

OpenTox Virtual Seminar presentation of 2010-04-01.

  1. The CDK, Bioclipse, and RDF. Egon Willighagen <http://chem-bla-ics.blogspot.com/>, Bioclipse & Proteochemometric Group (Prof. Wikberg), Department of Pharmaceutical Biosciences, Uppsala University, 2010-04-01.
  2. Who am I? http://www.citeulike.org/user/egonw/tag/papers; http://chem-bla-ics.blogspot.com; http://egonw.github.com; waveto: egon.willighagen@googlewave.com. 2010-04-01, Bioclipse & Proteochemometric Group, Egon Willighagen | chem-bla-ics.blogspot.com
  3. The Problem... We model our world, but ... Life is not uni- or bivariate. Knowledge is not either. But we think of it as such. Information Loss! Solanum lycopersicum...
  4. Names... benzene; 3-[4-[3-(1-methyl-7-oxo-3-propyl-4H-pyrazolo[4,3-d]pyrimidin-5-yl)-4-propoxyphenyl]sulfonylpiperazin-1-yl]propanoic acid; InChI=1S/C25H34N6O6S/c1-4-6-19-22-23(29(3)28-19)25(34)27-24(26-22)18-16-17(7-8-20(18)37-15-5-2)38(35,36)31-13-11-30(12-14-31)10-9-21(32)33/h7-8,16H,4-6,9-15H2,1-3H3,(H,32,33)(H,26,27,34)
  5. ... Molecular reality... 1 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000
  6. ... and Numbers
  7. Knowledge Representation: Information Loss
  8. Data Analysis
  9. Proteochemometrics
  10. Main Theme. How do we navigate dimensionality space? How do we include prior knowledge? While minimizing information loss? With optimal knowledge extraction? And maximizing interpretability? Without ending up in random correlation?
  11. The Setting... 1998: Organic chemistry... beautiful science! But ... why, how, what, ... (PJJA Buijnsters et al., Eur. J. Org. Chem., 2002, 1397–1406)
  12. Knowledge Representation... What are the organic normal conditions?
  13. The Problem: Reproducibility... Where reproducibility is severely hampered: recalculate basic atom and bond properties; access to QSAR/QSPR data; well-defined algorithms; publications destroy information.
  14. Solutions... Openness: license that allows modification and redistribution; hiding behind public domain is not helpful. Semantic Web: be explicit in what you mean, both in facts and in algorithms.
  15. Reproducibility needs ODOSOS. Open Data: No Intellectual Monopoly. Open Source: algorithms are complex; implementations even more; strong interaction with representation. Open Standards: Semantic Web formats; unique identifiers. http://en.wikipedia.org/wiki/Glyn_Moody
  16. Jmol. Started in 1997 by Dan Gezelter (Notre Dame). Leaders: Bradly Smith, me, Miguel Howard, Bob Hanson. E.L. Willighagen, M. Howard, Nature Precedings, 2005. http://www.jmol.org/
  17. The Chemistry Development Kit. A Family of Projects: CDK-Taverna (chemoinformatics workflows); JChemPaint (semantic 2D editor); ChemoJava (GPL-ed extension). Goals: library of cheminformatics algorithms; educational. Usage: CDK 2003: 75+ times cited in literature; Bioclipse, KNIME, Jumbo (CML), AMBIT, ... C. Steinbeck et al., J. Chem. Inf. Comput. Sci., 2003; C. Steinbeck et al., Curr. Pharm. Design, 2006.
  18. CDK: an Open Project. Features: open mailing list and bug tracker; open source repository; release soon, release often. Offer Review: senior developers review patches.
  19. Bioclipse. O. Spjuth et al., BMC Bioinformatics 2007, 8:59.
  20. Integration. Services: databases: PubChem; web services; Google Spreadsheets; MyExperiment.org: Bioclipse Scripting Language; Twitter, ...; journals, ... Techniques: SOAP, REST, XMPP, ...; Resource Description Framework; dedicated APIs.
  21. Resource Description Framework. Facts as Triples: subject, predicate (relation), object. Examples: wp:Benzene chem:hasSMILES "c1ccccc1"; wp:Benzene owl:sameAs chemspider:123
  22. OpenMolecules RDF: dereferenceable URI. http://rdf.openmolecules.net/
  23. OpenMolecules RDF: linked data. http://rdf.openmolecules.net/
  24. Bioclipse-RDF. Local RDF storage; read/write RDF/XML, N3; run SPARQL queries (local and remote); extract RDF from XHTML/RDFa. Thanks to Jena and Pellet.
  25. Names 2 Graphs 2 Numbers...
  26. ChEMBL / QSAR
  27. RDF graph visualization
  28. OWL for Descriptors. Used for model and data.
  29. MyExperiment: Bioclipse Scripting Language
  30. What does this bring us? Platform to integrate the RDF with the computation world; Bioclipse as single point of access; scripting, sharing of scripts with MyExperiment.org; bridge the nominal with the numerical world.
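
Slide 4 contrasts names with a machine-readable identifier, and the talk's theme is bridging names and numbers. As a rough illustration only, the sketch below splits the InChI from that slide into its layers and counts the elements in its formula layer. The helper names `split_inchi` and `element_counts` are hypothetical and not part of the CDK or Bioclipse; the parsing assumes a simple single-component formula like the one on the slide.

```python
import re

# InChI string copied from slide 4 of the talk.
INCHI = (
    "InChI=1S/C25H34N6O6S/"
    "c1-4-6-19-22-23(29(3)28-19)25(34)27-24(26-22)18-16-17(7-8-20(18)37-15-5-2)"
    "38(35,36)31-13-11-30(12-14-31)10-9-21(32)33/"
    "h7-8,16H,4-6,9-15H2,1-3H3,(H,32,33)(H,26,27,34)"
)


def split_inchi(inchi):
    """Split an InChI into version, formula, and single-letter-prefixed layers."""
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    version, formula, *rest = inchi[len("InChI="):].split("/")
    # Each remaining layer starts with a one-letter prefix such as 'c' or 'h'.
    layers = {part[0]: part[1:] for part in rest}
    return version, formula, layers


def element_counts(formula):
    """Count atoms per element in a simple formula such as 'C25H34N6O6S'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


version, formula, layers = split_inchi(INCHI)
counts = element_counts(formula)
# Heavy atoms are all atoms other than hydrogen.
heavy_atoms = sum(n for element, n in counts.items() if element != "H")

print(version, formula, sorted(layers))
print(counts, heavy_atoms)
```

A real workflow would hand the identifier to a cheminformatics toolkit rather than parse it by hand; the point here is only that a name-like string already carries countable, numerical information.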

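
Slide 21 describes RDF facts as subject, predicate, object triples, and slide 24 mentions running SPARQL queries over them. The following minimal sketch stores the two example triples from slide 21 as plain tuples and answers simple pattern queries. It is not the Bioclipse-RDF or Jena API; the `match` function is a hypothetical stand-in for a triple-pattern lookup, and the prefixed names are kept as opaque strings because the slides do not give the namespace URIs.

```python
# The two example facts from slide 21, kept as (subject, predicate, object).
# Prefixed names such as 'wp:Benzene' are treated as opaque strings here.
triples = {
    ("wp:Benzene", "chem:hasSMILES", '"c1ccccc1"'),
    ("wp:Benzene", "owl:sameAs", "chemspider:123"),
}


def match(store, subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    pattern = (subject, predicate, obj)
    return sorted(
        triple
        for triple in store
        if all(p is None or p == value for p, value in zip(pattern, triple))
    )


# Roughly what a query like
#   SELECT ?o WHERE { wp:Benzene chem:hasSMILES ?o }
# asks for: every object attached to benzene via chem:hasSMILES.
smiles = [o for _, _, o in match(triples, "wp:Benzene", "chem:hasSMILES")]

# All owl:sameAs links, regardless of subject or object.
same_as = match(triples, predicate="owl:sameAs")

print(smiles)
print(same_as)
```

Leaving a position unset plays the role of a SPARQL variable. A real triple store adds namespaces, typed literals, indexing, and reasoning, which this sketch deliberately omits.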