Emphasize chemoprevention-specific predictive models.
Expert user with knowledge of biology, chemistry, data mining and knowledge extraction algorithms.
Search the LBS and get the ZINC database. Calculate molecular descriptors. Get compounds that might be active on estrogen receptors. Prepare the files required for docking. Dock to ER-alpha using AutoDock Vina. Gather results.
Towards a modular Web-based Workflow environment for enabling large scale Virtual Screening in Cancer Chemoprevention Research
19 June 2012
COST Conference Personalised Medicine: Better Healthcare for the Future
Christos Kannas, Computer Science Dept., University of Cyprus
Outline
• About the Project
• Overview of the Project
• Objectives
• State of the Art Review
• Implementation
  • Virtual Screening Process
  • Predictive Model Preparation
  • In Silico Tools and Methods
• Early In Silico Experiments
• Concluding Remarks
About the Project
• The vision of the GRANATUM project is to:
  • bridge the information, knowledge and collaboration gap among biomedical researchers in Europe (at least),
  • ensure that the biomedical scientific community has homogenized, integrated access to the globally available information and data resources needed to perform complex cancer chemoprevention experiments and conduct studies on large-scale datasets.
• The GRANATUM project is partially funded by the European Commission under the Seventh Framework Programme in the area of Virtual Physiological Human (ICT-2009.5.3).
• http://www.granatum.org/
Objectives
• Design a scientific algorithmic workflow for the development of in silico chemoprevention models.
• Implement workflow(s) for the selection of promising chemopreventive agents.
• Connect the custom in silico models for compound selection to other datasets and evidence included in the Linked Biomedical Data Space.
• Test the performance of the custom in silico models.
State of the Art
• Significant overlap between chemoprevention and the traditional drug discovery process (DDP).
  • Chemoprevention is a special case with additional constraints, e.g. no toxicity.
• In silico models and tools: heavily borrowing from DDP.
SOA Review
• Online resources: databases (e.g. ChEMBL), journals, reports, …
• Infrastructure tools: chemoinformatics toolkits (e.g. RDKit and CDK) for compound representation, property and descriptor calculation, substructure mining, …
• Advanced comp. chem.: biological property predictive models, compound 3D conformations, docking tools, …
• Machine learning: classification and regression methods, available open source libraries
• Scientific workflow systems: KNIME, Taverna, Galaxy, …
Predictive Model Preparation
[Diagram: chemical data and biological data are combined with a template (algorithm and algorithm parameters) to produce a predictive model.]
Chemopreventive Property Models
• Anti-oxidant
  • Direct effect
  • Indirect effect
  • Direct/indirect effect
• Apoptotic
  • Anti-apoptotic members of Bcl-2 family down-regulation
  • IAP family down-regulation
  • Caspase up-regulation/activation
• Anti-inflammatory
  • COX-2 but not COX-1 inhibitor
  • Reduction of TNF-a
  • Reduction of LOX
  • Induction of AP-1
  • Reduction of Interleukins
• Anti-proliferating
  • Cyclin D1 down-regulation
  • Her-2 down-regulation
  • Cyclin E down-regulation
  • EGFR down-regulation
• Anti-metastatic / Anti-angiogenic
  • COX-2 down-regulation
  • VEGF down-regulation
  • PDGF down-regulation
• Estrogenic Activity
  • ER-alpha binding affinity
  • ER-beta binding affinity
  • ER-alpha/beta binding affinity
  • No affinity
  • Estrogen Antagonists
  • Selective Estrogen Receptor Modulators (SERMs)
  • Estrogen Receptor Modulators
In Silico Tools and Methods
• Generic Chemoinformatics Tools:
  • E-Health Lab and collaborators' resources
  • RDKit
• Docking Experiment Tools:
  • AutoDock Vina
  • Chil2 GlamDock
• Data Mining & Statistics Tools:
  • In-house tools
  • R
• Scientific Workflow System:
  • Galaxy
Early In Silico Experiments
• In silico tool and model validation
• Steps:
  • Prepare compound dataset
    • Mix of natural products and known inhibitors (4% actives)
  • Implementation/application of predictive models
    • Rule of Five
    • Toxicity model
  • Implementation/application of docking model
    • ER-alpha
  • Compound prioritization
  • Top selections visualization/evaluation
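The Rule of Five step can be illustrated with a short Python sketch. It assumes the descriptors have already been computed elsewhere (for example with a chemoinformatics toolkit such as RDKit) and that a compound must satisfy all four conditions to pass. The `Descriptors` container, the compound names and the descriptor values are hypothetical and are not taken from the experiments.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    name: str
    hba: int           # hydrogen-bond acceptors
    hbd: int           # hydrogen-bond donors
    mol_weight: float  # molecular weight in Da
    logp: float        # octanol-water partition coefficient


def passes_rule_of_five(d: Descriptors) -> bool:
    """Return True when all four Rule of Five conditions hold (assumed policy)."""
    return d.hba <= 10 and d.hbd <= 5 and d.mol_weight <= 500 and d.logp <= 5


def split_by_rule_of_five(
    compounds: Iterable[Descriptors],
) -> Tuple[List[Descriptors], List[Descriptors]]:
    """Partition compounds into (passed, failed) while preserving input order."""
    passed, failed = [], []
    for c in compounds:
        (passed if passes_rule_of_five(c) else failed).append(c)
    return passed, failed


# Made-up descriptor values, for illustration only.
compounds = [
    Descriptors("compound_a", hba=4, hbd=2, mol_weight=270.2, logp=2.6),
    Descriptors("compound_b", hba=12, hbd=7, mol_weight=610.5, logp=-1.3),
]
passed, failed = split_by_rule_of_five(compounds)
```

Keeping the failed compounds in a separate list, rather than discarding them, lets the filter outcome be used later as one of several ranking criteria.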
Virtual Screening Process Example
[Flow diagram: Natural products collection + known ER-alpha inhibitors → Calculate physicochemical molecular descriptors → Rule of Five filter → Toxicity model → Docking to ER-alpha → Compound prioritization; report on top selections]
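One way to picture the modular character of such a screening process is as a list of interchangeable stages, each adding one annotation to every compound record. The Python sketch below is only an illustration under that assumption; it is not the Galaxy-based implementation. The stage names, the record fields and the lookup tables standing in for the toxicity model and the docking tool are all hypothetical.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Stage = Callable[[List[Record]], List[Record]]


def annotate(key: str, fn: Callable[[Record], Any]) -> Stage:
    """Build a stage that stores fn(record) under `key` in a copy of each record."""
    def stage(records: List[Record]) -> List[Record]:
        return [{**r, key: fn(r)} for r in records]
    return stage


def run_pipeline(records: List[Record], stages: List[Stage]) -> List[Record]:
    """Apply the stages in order; each stage receives the previous stage's output."""
    for stage in stages:
        records = stage(records)
    return records


def rule_of_five(r: Record) -> bool:
    return r["hba"] <= 10 and r["hbd"] <= 5 and r["mw"] <= 500 and r["logp"] <= 5


# Placeholders for the real toxicity model and docking run (made-up values).
TOXICITY_LOOKUP = {"mol_1": False, "mol_2": True}
DOCKING_LOOKUP = {"mol_1": -8.4, "mol_2": -6.9}

stages = [
    annotate("ro5_pass", rule_of_five),
    annotate("predicted_toxic", lambda r: TOXICITY_LOOKUP[r["name"]]),
    annotate("docking_score", lambda r: DOCKING_LOOKUP[r["name"]]),
]

records = [
    {"name": "mol_1", "hba": 3, "hbd": 1, "mw": 254.2, "logp": 3.1},
    {"name": "mol_2", "hba": 11, "hbd": 6, "mw": 540.0, "logp": 5.8},
]
annotated = run_pipeline(records, stages)
```

Because every stage has the same signature, a stage can be swapped (for instance a different docking tool) or reordered without touching the others.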
Cytotoxicity Predictive Model
• Cytotoxicity Dataset
  • Source: The Scripps Research Institute Molecular Screening Center
  • PubChem Bio-Assay: AID 464
  • Tested: 706
  • Active: 331
  • Inactive: 375
• Clean Molecules
  • Remove salts
• Oral Drug-like Filtering
  • HBA <= 10
  • HBD <= 5
  • Molecular Weight <= 500
  • logP <= 5
• Morgan Fingerprints
  • Bit vector, 2048 bits
• Cytotoxicity Predictive Model
  • Cytotoxicity bio-chemical data
  • SVM:
    • Kernel: Linear
  • Stratified K-Fold:
    • 5 folds
    • 10 folds
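The stratified K-fold step can be sketched in plain Python using the class counts reported for AID 464 (331 active, 375 inactive). The sketch only shows how stratification keeps the active/inactive ratio roughly equal in every fold. The helper `stratified_folds` is hypothetical and is not the code used for the model; in practice a library routine such as scikit-learn's `StratifiedKFold` would typically be used.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def stratified_folds(labels: Sequence[int], k: int, seed: int = 0) -> List[List[int]]:
    """Split sample indices into k folds, dealing each class out round-robin."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)  # randomize membership within each class
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Class counts from the slide: 331 active (1) and 375 inactive (0), 706 tested.
labels = [1] * 331 + [0] * 375
folds = stratified_folds(labels, k=5)

# (fold size, number of actives) for each fold
fold_summary = [(len(f), sum(labels[i] for i in f)) for f in folds]
```

With five folds, each fold holds 141 or 142 molecules, of which 66 or 67 are active, so every validation fold has close to the overall 47% active rate.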
Virtual Screening Process Example: Demo
• Dataset
  • Known ER-alpha inhibitors (42)
  • Indofine dataset (2494)
• Clean Molecules
  • Remove salts
  • Result: 2451 OK, 85 removed (valence errors, empty molecule block)
  • 2451 molecules (42 known, 2409 Indofine)
• Oral Drug-like Filtering
  • HBA <= 10
  • HBD <= 5
  • Molecular Weight <= 500
  • logP <= 5
  • Result: 2035 pass, 416 do not pass
• Calculate Morgan Fingerprints
  • Bit vector, 2048 bits
• Cytotoxicity (Predictive Model)
  • SVM classifier trained with the Bio-Assay 464 dataset
  • Predict: 2451 molecules
• ER-Alpha Docking (GlamDock)
  • ER-alpha protein
  • 2451 molecules for docking experiments
• Ranked order of cytotoxicity prediction, docking and oral drug-likeness filtering results
  • Top known: row-20-top-known, row-36-top-known, row-42-top-known
  • Top unknown: row-1988-top-unknown, row-729-top-unknown, row-1652-top-unknown
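The slide does not state how the three results are combined into a ranked order. The Python sketch below shows one possible scheme, under stated assumptions: compounds predicted non-cytotoxic come first, then compounds that pass the drug-likeness filter, then compounds with the better docking score, where a lower score is assumed to be better. The `ScreeningResult` type, the compound names and all values are hypothetical.

```python
from typing import List, NamedTuple


class ScreeningResult(NamedTuple):
    """Per-compound screening outcome (hypothetical record layout)."""
    name: str
    predicted_cytotoxic: bool
    ro5_pass: bool
    docking_score: float  # assumption: lower (more negative) is better


def prioritize(results: List[ScreeningResult]) -> List[ScreeningResult]:
    """Sort by non-toxic first, then filter pass, then ascending docking score."""
    return sorted(
        results,
        key=lambda r: (r.predicted_cytotoxic, not r.ro5_pass, r.docking_score),
    )


# Made-up values, for illustration only.
results = [
    ScreeningResult("mol_1", predicted_cytotoxic=False, ro5_pass=True, docking_score=-7.2),
    ScreeningResult("mol_2", predicted_cytotoxic=False, ro5_pass=True, docking_score=-9.1),
    ScreeningResult("mol_3", predicted_cytotoxic=True, ro5_pass=True, docking_score=-10.4),
    ScreeningResult("mol_4", predicted_cytotoxic=False, ro5_pass=False, docking_score=-9.8),
]
ranked_names = [r.name for r in prioritize(results)]
```

A lexicographic sort key like this treats predicted toxicity as a hard priority; a weighted score would be an alternative if the criteria should trade off against each other.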
Docking results: known ER inhibitors (row-20-top-known)
Docking results: known ER inhibitors (row-36-top-known)
Docking results: known ER inhibitors (row-42-top-known)
Docking results: Indofine compounds (row-729-top-unknown)
Docking results: Indofine compounds (row-1652-top-unknown)
Docking results: Indofine compounds (row-1988-top-unknown)
Concluding Remarks
• Support for chemoprevention-specific predictive models.
  • Promising initial results on ER-alpha (based on the Indofine dataset).
• Modular architecture and workflow management.
• Integrated with additional tools within the GRANATUM project.
  • Linked Biomedical Data Space.
  • Social Collaborative Workspace.
• Product Release:
  • Advanced Prototype Version: October 2012
  • Final Version: April 2013