Shape Signatures is a novel molecular shape-based method for virtual screening in drug discovery and computational toxicology. It employs a ray-tracing algorithm to explore the volume enclosed by a molecule's surface, constructing histograms (signatures) that encode molecular shape and polarity. These signatures can be used to rapidly screen large libraries, classify compounds, and build predictive models of properties such as drug-target binding, toxicity, and blood-brain barrier permeation.
Shape Signatures Light
1. Shape Signatures: Exploring novel molecular shape based methods for in silico drug discovery and computational toxicology. Dmitriy Chekmarev, Department of Pharmacology & Environmental Bioinformatics and Computational Toxicology Center, UMDNJ - RWJMS, 675 Hoes Lane, Piscataway, NJ 08854, [email_address]
3. Shape Signatures Method (pictures from Meek PJ, Liu Z, Tian L, Wang CY, Welsh WJ, Zauhar RJ. Drug Discovery Today. 2006 Oct;11(19-20):895-904). Example molecule: Indinavir (IDV), an HIV protease inhibitor. Workflow: 1D/2D-to-3D conversion (e.g. with CORINA); generation of the Solvent Excluded Surface (SES); triangulation of the SES using the SMART algorithm; ray tracing with 100,000 reflections, in which rays propagate by optical reflection from the triangular surface elements. 1D Shape Signatures (shape only) are histograms of ray segment lengths (a probability distribution). 2D Shape Signatures (shape + MEP) compute the molecular electrostatic potential (Coulomb) at each reflection point of the SES, then generate a 2D histogram of pairs of ray segments and their associated MEP values (a joint probability distribution).
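The ray-tracing and histogram steps on this slide can be illustrated with a toy sketch. This is not the published implementation: it assumes an ellipsoid as a stand-in for the triangulated SES, a single ray, and hypothetical function names (`trace_segment_lengths`, `shape_signature_1d`). It only shows how reflected ray segments turn into a normalized 1D histogram.

```python
import math
import random


def trace_segment_lengths(axes, n_reflections, seed=0):
    """Reflect one ray inside an ellipsoid (assumed stand-in for the SES).

    axes: the three semi-axis lengths. Returns the list of segment lengths.
    """
    rng = random.Random(seed)
    m = [1.0 / (a * a) for a in axes]  # surface: sum(m_i * x_i^2) = 1
    p = [axes[0], 0.0, 0.0]            # start on the surface
    d = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in d))
    d = [c / norm for c in d]
    if d[0] > 0.0:                     # make the start direction point inward
        d = [-c for c in d]
    lengths = []
    for _ in range(n_reflections):
        # Next intersection of p + t*d with the surface (p is on the surface).
        a_coef = sum(mi * di * di for mi, di in zip(m, d))
        b_coef = 2.0 * sum(mi * pi * di for mi, pi, di in zip(m, p, d))
        t = -b_coef / a_coef
        lengths.append(t)              # d is a unit vector, so t is a length
        q = [pi + t * di for pi, di in zip(p, d)]
        # Re-project onto the surface to limit floating-point drift.
        scale = math.sqrt(sum(mi * qi * qi for mi, qi in zip(m, q)))
        q = [qi / scale for qi in q]
        # Outward unit normal at q, then optical (specular) reflection.
        n = [mi * qi for mi, qi in zip(m, q)]
        n_norm = math.sqrt(sum(c * c for c in n))
        n = [c / n_norm for c in n]
        d_dot_n = sum(di * ni for di, ni in zip(d, n))
        d = [di - 2.0 * d_dot_n * ni for di, ni in zip(d, n)]
        p = q
    return lengths


def shape_signature_1d(lengths, n_bins, max_length):
    """Normalized histogram of segment lengths (a discrete probability distribution)."""
    counts = [0] * n_bins
    for length in lengths:
        idx = min(int(length / max_length * n_bins), n_bins - 1)
        counts[idx] += 1
    return [c / len(lengths) for c in counts]


if __name__ == "__main__":
    segs = trace_segment_lengths((3.0, 2.0, 1.0), n_reflections=2000, seed=1)
    print(shape_signature_1d(segs, n_bins=20, max_length=6.0))
```

A 2D signature would follow the same pattern with a second histogram axis for the MEP value at each reflection point; that requires atomic charges and is left out of this sketch.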
4. Shape Signatures employs a customized ray-tracing algorithm to explore the volume enclosed by the surface of a molecule, then uses the output to construct compact histograms (signatures) that encode molecular shape and polarity. The method lends itself to rapid screening of large chemical libraries, and Shape Signatures databases can be created for an almost limitless number and variety of chemical structures. Reference: Zauhar RJ, Moyna G, Tian L, Li Z, Welsh WJ. Shape Signatures: a new approach to computer-aided ligand- and receptor-based drug design. J Med Chem. 2003;46(26):5674-90.
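Because each signature is a compact histogram, screening a library reduces to comparing fixed-length vectors. The sketch below assumes an L1 (sum of absolute differences) distance between normalized histograms; the slide does not name the metric, and the function names and the toy library are hypothetical.

```python
def l1_distance(sig_a, sig_b):
    """Sum of absolute bin differences between two equal-length signatures."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same number of bins")
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b))


def screen_library(query_sig, library, top_k):
    """Return the top_k (name, distance) pairs closest to the query signature.

    library: dict mapping a compound name to its precomputed signature.
    """
    scored = sorted(
        (l1_distance(query_sig, sig), name) for name, sig in library.items()
    )
    return [(name, dist) for dist, name in scored[:top_k]]


if __name__ == "__main__":
    # Hypothetical three-bin signatures, for illustration only.
    toy_library = {
        "mol_a": [0.5, 0.5, 0.0],
        "mol_b": [0.0, 0.5, 0.5],
        "mol_c": [0.4, 0.6, 0.0],
    }
    print(screen_library([0.5, 0.5, 0.0], toy_library, top_k=2))
```

Signatures are computed once per compound and stored, so a query costs one pass over the database with no 3D alignment step.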
12. Shape Signatures: Ligand-based virtual screening for selected therapeutic targets. The performance of each method is assessed using arithmetic weighted ROC enrichment (awROCE) at a set of X% false positive rates (shown here at a false positive rate of 5%) and the arithmetic weighted area under the ROC curve (awAUC), both of which account for differences in chemotype among the retrieved actives. Reference: Jahn A, Hinselmann G, Fechner N, Zell A. J. Cheminformatics 2009, 1:14, 1-23.
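The slide does not give the awAUC formula. One formulation of arithmetic weighting, assumed here, gives every chemotype cluster the same total weight, so each active is weighted by 1 / (number of clusters × size of its cluster). The sketch below uses that assumption; `aw_auc` and its arguments are hypothetical names, and higher scores are taken to mean "more likely active".

```python
from collections import Counter


def aw_auc(active_scores, active_clusters, decoy_scores):
    """Arithmetic weighted ROC AUC under the equal-weight-per-cluster assumption.

    active_clusters[i] is the chemotype cluster label of active i.
    """
    sizes = Counter(active_clusters)
    n_clusters = len(sizes)
    total = 0.0
    for score, cluster in zip(active_scores, active_clusters):
        # Fraction of decoys ranked below this active (ties count as half).
        below = sum(1.0 for s in decoy_scores if s < score)
        ties = sum(0.5 for s in decoy_scores if s == score)
        frac = (below + ties) / len(decoy_scores)
        total += frac / (n_clusters * sizes[cluster])
    return total


if __name__ == "__main__":
    # Two actives share chemotype 0; one active is the only member of chemotype 1.
    print(aw_auc([0.9, 0.8, 0.2], [0, 0, 1], [0.5, 0.1]))
```

In this toy case the unweighted AUC would be higher than the weighted one, because two of the three well-ranked actives belong to the same chemotype and are down-weighted.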
13. Shape Signatures: Predictive modeling. Classification by Support Vector Machines (SVM). [Figure: active and non-active compounds separated by a complex boundary in input space are mapped to a feature space where a separating hyperplane divides the two classes.] Reference: Chang CC, Lin CJ. LIBSVM: A library for support vector machines, 2001. Sensitivity: SE = TP/(TP+FN), expresses the prediction accuracy for actives. Specificity: SP = TN/(TN+FP), reflects the prediction accuracy for non-actives. Overall prediction accuracy: Q = (TP+TN)/(TP+FP+TN+FN). Matthews correlation coefficient (C): C = (TP*TN - FP*FN) / sqrt[(TP+FN)(TP+FP)(TN+FP)(TN+FN)]. For a perfect classifier with FP=FN=0, one would have C = 1.0. For a random prediction, C = 0, and for a complete inversion (TP=TN=0), C = -1.0.
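The four statistics defined on this slide can be computed directly from confusion-matrix counts. The function name and the choice to return 0.0 when the Matthews denominator is zero are assumptions made for this sketch.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Return (SE, SP, Q, C) from confusion-matrix counts.

    SE = TP/(TP+FN), SP = TN/(TN+FP), Q = (TP+TN)/(TP+FP+TN+FN),
    C = (TP*TN - FP*FN) / sqrt((TP+FN)(TP+FP)(TN+FP)(TN+FN)).
    """
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    q = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fn) * (tp + fp) * (tn + fp) * (tn + fn))
    # Assumption: report C = 0.0 when the denominator vanishes.
    c = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return se, sp, q, c


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(classification_metrics(tp=8, tn=7, fp=2, fn=1))
```

The limiting cases quoted on the slide follow from the formula: with FP=FN=0 the numerator equals the denominator, and with TP=TN=0 the numerator is its negative.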
14. Shape Signatures: cardiotoxicity via blocking hERG potassium channels. The human ether-a-go-go-related gene, hERG, is believed to encode the K+ channel that regulates the repolarizing IKr current in the cardiac action potential (CAP). Blockage of the hERG channel by some chemicals can cause potentially fatal cardiac arrhythmias by prolonging the QT interval of the CAP. Drugs taken off the market include terfenadine, sertindole, and cisapride. Reference: Chekmarev D, Kholodovych V, Balakin KV, Ivanenkov Y, Ekins S, Welsh WJ. Chem. Res. Toxicol. 2008, 21, 1304-1314. Dataset: 39 strong blockers (IC50 < 1 µM) and 44 weak blockers (IC50 > 10 µM). Descriptors: 1D Shape Sigs (shape only) and 2D Shape Sigs (shape + polarity).
Classification method | 10-fold cross validation (%) | Leave-20%-out testing: SE (%) | SP (%) | Q (%) | C
SVM | 78 | 73 | 74 | 74 | 0.488
SVM | 77 | 70 | 68 | 69 | 0.390
15. Shape Signatures: cardiotoxicity via binding 5-HT2B serotonin receptors. Serotonin plays a major regulatory function in cardiovascular morphogenesis. 5-HT2B (a GPCR) is expressed in cardiovascular tissues and is implicated in the valvular heart diseases (VHD) caused by the now-banned 'Fen-Phen' anti-obesity medication. Norfenfluramine, a primary metabolite of fenfluramine, is a potent agonist of 5-HT2B receptors. Reference: Chekmarev D, Kholodovych V, Balakin KV, Ivanenkov Y, Ekins S, Welsh WJ. Chem. Res. Toxicol. 2008, 21, 1304-1314. Dataset: 116 strong binders (Ki threshold 100 nM) and 66 weak binders (Ki threshold 1 µM), from the PDSP (NIMH Psychoactive Drug Screening Program) Ki DB, http://pdsp.med.unc.edu/. Descriptors: 1D Shape Sigs (shape only), 2D Shape Sigs (shape + polarity), and MOE.
Classification method | 10-fold cross validation (%) | Leave-20%-out testing: SE (%) | SP (%) | Q (%) | C
SVM | 87 | 91 | 69 | 83 | 0.638
SVM | 80 | 81 | 59 | 73 | 0.424
SVM | 87 | 91 | 70 | 84 | 0.640
16. Shape Signatures: classification models with Blood-Brain Barrier permeation data. SVM models using 2D Shape Signatures (shape + polarity) and MOE molecular descriptors. Datasets: Combined (186 BBB+ and 165 BBB-) and Li et al (250 BBB+ and 126 BBB-). Reference: Kortagere S, Chekmarev D, Welsh WJ, Ekins S. Pharm. Res. 2008, 25, 1836-1845.
Dataset | 10-fold cross validation (%) | Leave-20%-out testing: SE (%) | SP (%) | Q (%) | C
Combined | 83 | 84 | 79 | 82 | 0.635
Combined | 80 | 80 | 79 | 80 | 0.595
Li et al | 80 | 89 | 62 | 80 | 0.533
Li et al | 80 | 89 | 51 | 76 | 0.435
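Slides 14 to 16 report two validation figures per model: 10-fold cross validation and leave-20%-out testing. The sketch below shows one plausible way to produce such splits with the standard library. The slides do not say how the splits were drawn, so the shuffling, the seed, the lack of class stratification, and the function names are all assumptions.

```python
import random


def leave_fraction_out(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and hold out a fraction as an external test set."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = round(n_samples * test_fraction)
    return idx[n_test:], idx[:n_test]  # (training indices, test indices)


def k_fold_indices(indices, k=10):
    """Yield (train, validation) index lists; each index is validated exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


if __name__ == "__main__":
    # 83 compounds, matching the size of the hERG set (39 strong + 44 weak blockers).
    train_idx, test_idx = leave_fraction_out(83, test_fraction=0.2, seed=0)
    for fold_train, fold_val in k_fold_indices(train_idx, k=10):
        print(len(fold_train), len(fold_val))
```

With class sizes as unbalanced as 250 BBB+ versus 126 BBB-, a stratified split that preserves the class ratio in every fold would be a natural refinement.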