The Chemistry Development Kit
An Open Source Java library for structural chemo- and bioinformatics
Egon Willighagen, Radboud University Nijmegen
Christoph Steinbeck, Cologne University BioInformatics Center
18th CIC-Workshop, 14-16 November 2004, Boppard

The Chemistry Development Kit?
Library of Standard Algorithms
● Reduce need to rewrite code
● ChemoInformatics education
Toolkit for prototyping
● 2D/3D rendering
● file IO
Java
● Object oriented
● Portability
● Applet (and Internet technologies in general)
● ... speed ?

Standard Algorithms
Molinformatics
● IO (CML, MDL, PDB, INChI, ...)
● SMILES parsing and canonical generation
● Isomorphism checking
● Substructure search (and SMARTS)
● Maximal Common Subgraph Searches
● Gasteiger charges
● Ring searching (SSSR)
● Structure Diagram Generation
● 2D Rendering (and 3D via Jmol)
● Fingerprinting
● HOSE codes
● Atom typing

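As a small illustration of the ring-searching entry above: the number of rings in any Smallest Set of Smallest Rings (SSSR) is fixed by the molecular graph and equals bonds - atoms + connected components. The sketch below computes only that count on a plain edge list. It is not CDK code and does not use the CDK API; the class `RingCountSketch` and its methods are hypothetical names chosen for this example, and each bond is assumed to be listed once regardless of bond order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical helper for illustration; not part of the CDK. */
final class RingCountSketch {

    /** Counts connected components of the molecular graph with an iterative DFS. */
    static int countComponents(int atomCount, int[][] bonds) {
        List<List<Integer>> neighbours = new ArrayList<>();
        for (int i = 0; i < atomCount; i++) {
            neighbours.add(new ArrayList<>());
        }
        for (int[] bond : bonds) {
            neighbours.get(bond[0]).add(bond[1]);
            neighbours.get(bond[1]).add(bond[0]);
        }
        boolean[] seen = new boolean[atomCount];
        int components = 0;
        for (int start = 0; start < atomCount; start++) {
            if (seen[start]) {
                continue;
            }
            components++;
            Deque<Integer> stack = new ArrayDeque<>();
            stack.push(start);
            seen[start] = true;
            while (!stack.isEmpty()) {
                int atom = stack.pop();
                for (int next : neighbours.get(atom)) {
                    if (!seen[next]) {
                        seen[next] = true;
                        stack.push(next);
                    }
                }
            }
        }
        return components;
    }

    /** Size of any SSSR: bonds - atoms + components (each bond listed once). */
    static int sssrSize(int atomCount, int[][] bonds) {
        return bonds.length - atomCount + countComponents(atomCount, bonds);
    }

    public static void main(String[] args) {
        // Naphthalene carbon skeleton: 10 atoms, 11 bonds, two fused rings.
        int[][] naphthalene = {
            {0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 0},
            {4, 6}, {6, 7}, {7, 8}, {8, 9}, {9, 5}
        };
        System.out.println(sssrSize(10, naphthalene)); // prints 2
    }
}
```

A full SSSR search must also pick which rings belong to the set; the count above only tells such a search how many rings to expect.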
3D Rendering: Jmol
Rendering Features
● wireframe/ball-sticks/etc
● protein
● cartoon
● backbone
Rasmol Scripting
Applet and Application

History of the Project
September 2000: the CDK emerged from the CompChem libraries used by Jmol, JChemPaint and Seneca.
February 2001: the CDK project registered at SourceForge.net
March 2003: Steinbeck, C. and Han, Y. and Kuhn, S. and Horlacher, O. and Luttmann, E. and Willighagen, E. J. Chem. Inf. Comput. Sci. 2003, 43:493-500
July 2004: first release of CDK News

CDK Community
Active development
● 10 active and 30 part-time
● highly internal
Users
● 50+ users on user list
● many projects using the library
Communication
● Email: user list, developers list
● Internet Relay Chat
● Informal meetings
● CDK News

CDK News
Newsletter (ISSN 1614-7553)
● With articles on the use of CDK
● ChangeLog / Literature / FAQ
● Free, print copies available
Vol. 1 Issue 2
● Customizing file IO
● First steps in the implementation of a force field
● Spok - The Spectrum Organisation Kit
● Predictor
● Konqueror web shortcuts to the CDK API

A few applications...
Chemistry
● NMRShiftDB
● 2D diagram editor (JChemPaint)
● Seneca (structure elucidation)
● CML Rich Site Summary
● Nomen (IUPAC name parser)
Bioinformatics
● Brenda (enzyme database)
● Pathway analysis
● Enzyme reaction mechanisms
Many more...
● A few commercial software programs
● Some projects in development

NMRShiftDB
C. Steinbeck, S. Kuhn et al. J. Chem. Inf. Comput. Sci., 2003, 43:1733-1739

Chemistry enhanced Rich Site Summary (CMLRSS)
P. Murray-Rust, H.S. Rzepa, M.J. Williamson, and E.L. Willighagen, J. Chem. Inf. Comput. Sci., 2004, 44:462-469

Summary
➔ Large library with key algorithms
➔ Active developer and user community
➔ Has been used in several projects
New areas of interest
● Descriptor calculation (QSAR)
● Structure optimization (force field)

Acknowledgments
Code contributions from: Ulrich Bauer, Fabian Dortu, Dan Gezelter, Rajarshi Guha, Yongquan Han, Kai Hartmann, Christian Hoppe, Oliver Horlacher, Miguel Howard, Geert Josten, Anatoli Krassavine, Stefan Kuhn, Daniel Leidert, Edgar Luttmann, Nathanaël Mazuir, Stephan Michels, Peter Murray-Rust, Chris Pudney, Jonathan Rienstra-Kiracofe, David Robinson, Bhupinder Sandhu, Jean-Sebastien Senecal, Sulev Sild, Bradley Smith, Christoph Steinbeck, Stephan Tomkinson, Joerg Wegner, Stephane Werner, Egon Willighagen, Yong Zhang.