This document summarizes and reviews methods for inferring gene co-expression networks from gene expression data, as presented in related articles including Chiquet et al. It describes various statistical approaches implemented in packages like GeneNet and glasso, including graphical Gaussian models using shrinkage and sparse linear regression. It compares the resulting network densities produced by different methods.
Graph Neural Network for Phenotype Prediction (tuxette)
This document describes a study on using graph neural networks (GNNs) for phenotype prediction from gene expression data. The objectives are to determine whether including network information can improve predictions, which network types work best, and whether GNNs can learn the network inference themselves. It provides background on GNNs and how they generalize convolutional layers to graph data. The authors implemented a GNN model from previous work as a starting point and tested it on different network types to see which network information is most useful for predictions. Their methodology involves comparing GNN performance to other methods like random forests using 10-fold cross-validation.
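As a minimal sketch of the graph-convolution idea the summary alludes to (not the authors' model), a single layer of the commonly used GCN form H' = ReLU(D^{-1/2}(A+I)D^{-1/2} H W) can be written in plain NumPy; the toy graph, features, and sizes below are invented for illustration:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: relu(D^-1/2 (A+I) D^-1/2 @ H @ W)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # symmetric normalization
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt
    return np.maximum(A_norm @ H @ W, 0.0)    # ReLU

# toy graph: 3 nodes (e.g. genes), 2 input features, 4 output channels
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
H = np.random.default_rng(0).normal(size=(3, 2))
W = np.random.default_rng(1).normal(size=(2, 4))
out = gcn_layer(A, H, W)
```

Stacking several such layers, with the adjacency matrix taken from an inferred gene network, gives the kind of model the study compares against random forests.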
Convolutional networks and graph networks through kernels (tuxette)
This presentation discusses how convolutional kernel networks (CKNs) can be used to model sequential and graph-structured data through kernels defined over sequences and graphs. CKNs define feature maps from substructures like n-mers in sequences and paths in graphs into high-dimensional spaces, which are then approximated to obtain low-dimensional representations that can be used for prediction tasks like classification. This approach is analogous to convolutional neural networks and can be extended to multiple layers. The presentation provides examples showing CKNs achieve good performance on problems involving protein sequences and social networks.
Multimodal Residual Networks for Visual QA (Jin-Hwa Kim)
Deep neural networks continue to advance the state of the art in image recognition tasks with various methods. However, applications of these methods to multimodality remain limited. We present Multimodal Residual Networks (MRN) for the multimodal residual learning of visual question-answering, which extends the idea of deep residual learning. Unlike deep residual learning, MRN effectively learns the joint representation from vision and language information. The main idea is to use element-wise multiplication for the joint residual mappings, exploiting the residual learning of the attentional models in recent studies. Various alternative models introduced by multimodality are explored based on our study. We achieve state-of-the-art results on the Visual QA dataset for both Open-Ended and Multiple-Choice tasks. Moreover, we introduce a novel method to visualize the attention effect of the joint representations for each learning block using the back-propagation algorithm, even though the visual features are collapsed without spatial information.
Medical pathology images are visually evaluated by experts for disease diagnosis, but the connection between image features and the state of the cells in an image is typically unknown. To understand this relationship, we describe a multimodal modeling and inference framework that estimates shared latent structure of joint gene expression levels and medical image features. The method is built around probabilistic canonical correlation analysis (PCCA), which is jointly fit to image embeddings that are learned using convolutional neural networks and linear embeddings of paired gene expression data. We finally discuss a set of theoretical and empirical challenges in domain adaptation settings arising from genomics data. (Based on work in collaboration with Gregory Gundersen and Barbara E. Engelhardt.)
Since the advent of the horseshoe priors for regularization, global-local shrinkage methods have proved to be a fertile ground for the development of Bayesian theory and methodology in machine learning. They have achieved remarkable success in computation, and enjoy strong theoretical support. Much of the existing literature has focused on the linear Gaussian case. The purpose of the current talk is to demonstrate that the horseshoe priors are useful more broadly, by reviewing both methodological and computational developments in complex models that are more relevant to machine learning applications. Specifically, we focus on methodological challenges in horseshoe regularization in nonlinear and non-Gaussian models; multivariate models; and deep neural networks. We also outline the recent computational developments in horseshoe shrinkage for complex models along with a list of available software implementations that allows one to venture out beyond the comfort zone of the canonical linear regression problems.
A short and naive introduction to epistasis in association studies (tuxette)
This document provides a short introduction to detecting epistasis, or gene-gene interactions, in genome-wide association studies. It discusses how standard GWAS have limitations and epistasis may help explain missing heritability. Various approaches for detecting epistasis are summarized, including regression-based methods, correlation-based methods, information theory methods, and methods that combine or summarize results across multiple SNPs or genomic regions. Challenges in detecting epistasis like multiple testing and computational complexity are also noted. The goal is to give an overview of epistasis detection rather than precise directions.
The document describes how to compute backpropagation for neural networks. It involves:
1) Calculating the gradients of the objective function with respect to the weights in order to update them.
2) Computing the gradients layer by layer, starting from the output layer and moving backwards.
3) Using, for each weight, the gradients of the layers above together with the activation values of the layers below.
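The steps above can be sketched for a two-layer network in NumPy; the architecture, squared-error loss, and learning rate are illustrative choices, not taken from the document:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))                  # inputs
y = rng.normal(size=(8, 1))                  # targets
W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(4, 1))

# forward pass: store the activations of every layer
h = np.tanh(X @ W1)                          # hidden activations
y_hat = h @ W2                               # output layer
loss = 0.5 * np.mean((y_hat - y) ** 2)

# backward pass: gradients flow from the output layer downwards
g_out = (y_hat - y) / len(X)                 # dLoss/dy_hat
gW2 = h.T @ g_out                            # activations below (h) x gradient above
g_h = (g_out @ W2.T) * (1 - h ** 2)          # gradient above, times tanh'
gW1 = X.T @ g_h                              # activations below (X) x gradient above

# gradient step updates the weights
lr = 0.1
W1 -= lr * gW1
W2 -= lr * gW2
```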
Estimating Functional Connectomes: Sparsity’s Strength and Limitations (Gael Varoquaux)
Talk given at the OHBM 2017 education course.
I present the challenges and techniques involved in estimating meaningful brain functional connectomes from fMRI: why sparsity in the inverse covariance leads to models that can be interpreted as interactions between regions.
Then I discuss the limitations of sparse estimators and introduce shrinkage as an alternative. Finally, I discuss how to compare multiple functional connectomes.
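As a rough illustration of the two estimators contrasted in the talk, sparse inverse covariance and shrinkage, scikit-learn provides generic implementations; the synthetic data below merely stands in for region-level fMRI signals:

```python
import numpy as np
from sklearn.covariance import GraphicalLassoCV, LedoitWolf

# synthetic "time series": 200 samples of 10 signals with one dependency
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X[:, 1] += 0.5 * X[:, 0]

sparse_est = GraphicalLassoCV().fit(X)   # sparse inverse covariance (graphical lasso)
precision = sparse_est.precision_        # zero entries = conditional independence

shrunk_est = LedoitWolf().fit(X)         # shrinkage estimator, the alternative discussed
```

Off-diagonal zeros in `precision` are the absent "interactions between regions"; the Ledoit-Wolf estimate is dense but well-conditioned, which motivates the shrinkage alternative.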
Enhancing Partition Crossover with Articulation Points Analysis (jfrchicanog)
This is the presentation of the paper "Enhancing Partition Crossover with Articulation Points Analysis" at the ECOM track of GECCO 2018 (Kyoto), where it received a Best Paper Award.
Connectomics: Parcellations and Network Analysis Methods (Gael Varoquaux)
Simple tutorial on methods for functional connectome analysis: learning regions, extracting functional signal, inferring the network structure, and comparing it across subjects.
This document provides an overview of advanced network modeling and connectivity measures for functional magnetic resonance imaging (fMRI) data. It discusses extracting network structures from fMRI time series data, comparing connectivity across subjects or groups, and interpreting the resulting network structures. Specific topics covered include functional connectivity measures like correlation and partial correlation, estimating inverse covariance matrices, comparing connections between groups, and summarizing networks using graph theoretical measures.
Representation Learning & Generative Modeling with Variational Autoencoder (VA...) (changedaeoh)
This document summarizes the key ideas of auto-encoding variational Bayes. It discusses representation learning using latent variables to model high-dimensional sparse data on low-dimensional manifolds. It then explains generative modeling and the challenge of directly estimating complex data generating distributions. Finally, it introduces variational autoencoders as a way to approximate intractable posterior distributions over latent variables using variational inference and maximize a tractable evidence lower bound objective using the reparameterization trick, allowing end-to-end training of the encoder and decoder networks.
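A minimal sketch of the reparameterization trick and the closed-form KL term of the evidence lower bound, in NumPy; the means and log-variances below are arbitrary example values, not from the document:

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Sample z ~ N(mu, sigma^2) as mu + sigma * eps with eps ~ N(0, I),
    so gradients can flow through mu and log_var during training."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(q(z|x) || N(0, I)), the regularization term of the ELBO."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))

mu = np.array([0.5, -1.0])
log_var = np.array([0.0, 0.2])
z = reparameterize(mu, log_var)
kl = kl_to_standard_normal(mu, log_var)
```

In a full VAE the encoder outputs `mu` and `log_var`, the decoder reconstructs the input from `z`, and the ELBO is the reconstruction log-likelihood minus this KL term.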
Reading "Bayesian measures of model complexity and fit" (Christian Robert)
This document summarizes a paper on Bayesian measures of model complexity and fit. It discusses using the posterior expected residual information, denoted pD, as a Bayesian measure of model complexity that accounts for the number of effective parameters. pD can be used to compare complex hierarchical models by balancing measures of fit and complexity. It is defined as the deviation of the estimated residual information from the true residual information. The paper also notes some observations about pD, such as that it is not invariant to transformations and depends on choices like the prior and estimator. pD can be easily calculated using MCMC output.
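The remark that pD is easily calculated from MCMC output can be illustrated with the usual deviance-based recipe, pD = posterior mean deviance minus deviance at the posterior mean; the toy normal-mean model below is a made-up example, not from the paper:

```python
import numpy as np
from scipy.stats import norm

# hypothetical setting: normal data with known sd = 1, unknown mean theta;
# theta_draws stands in for MCMC output from the posterior
rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, size=50)
theta_draws = rng.normal(loc=y.mean(), scale=1.0 / np.sqrt(len(y)), size=5000)

def deviance(theta):
    """D(theta) = -2 log-likelihood of the data at theta."""
    return -2.0 * norm.logpdf(y, loc=theta, scale=1.0).sum()

D_bar = np.mean([deviance(t) for t in theta_draws])  # posterior mean deviance
D_at_mean = deviance(theta_draws.mean())             # deviance at posterior mean
pD = D_bar - D_at_mean                               # effective number of parameters
DIC = D_bar + pD
```

For this one-parameter model pD comes out close to 1, matching its reading as an effective number of parameters; the dependence on the chosen estimator (here the posterior mean) is exactly the non-invariance the paper discusses.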
The document discusses machine learning techniques for clustering and segmentation. It introduces Dirichlet process mixtures and the Chinese restaurant process as nonparametric Bayesian models that allow for an infinite number of clusters. It describes how these models can be used for problems like image segmentation, object recognition, population clustering from genetic data, and evolutionary document clustering over time. Approximate inference methods like Markov chain Monte Carlo sampling are used to analyze these models.
Inter-site autism biomarkers from resting state fMRI (Gael Varoquaux)
This document summarizes research predicting autism from resting-state fMRI data using a connectome classification pipeline. The pipeline involves defining regions of interest, extracting time series, computing functional connectivity matrices, and using supervised learning. The authors explore different choices for each step and find that learning regions with MSDL, using tangent-space embedding for connectivity, and standard SVM learning work best across datasets with different heterogeneity levels. The findings suggest connectome structure is less important for prediction than choice of regions and preprocessing.
Reviews on Deep Generative Models in the early days / GANs & VAEs paper review (changedaeoh)
The document summarizes recent developments in deep generative models including GAN, VAE, CGAN, CVAE, DCGAN, and InfoGAN. It explains the objectives and training procedures of these models. GANs use a generator and discriminator in an adversarial training procedure, while VAEs have an encoder-decoder structure to learn an explicit density function. Conditional variants like CGAN and CVAE generate outputs conditioned on input data. DCGAN proposed architectures that improve GAN stability. InfoGAN extends GANs to learn disentangled and interpretable representations by maximizing mutual information between latent variables and observations.
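For reference, the adversarial objective these models build on is the standard minimax game, with G the generator and D the discriminator:

```latex
\min_G \max_D \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

InfoGAN augments this with a mutual-information term between a latent code c and the generated output, maximizing I(c; G(z, c)) alongside the adversarial game.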
This document summarizes and compares two popular Python libraries for graph neural networks - Spektral and PyTorch Geometric. It begins by providing an overview of the basic functionality and architecture of each library. It then discusses how each library handles data loading and mini-batching of graph data. The document reviews several common message passing layer types implemented in both libraries. It provides an example comparison of using each library for a node classification task on the Cora dataset. Finally, it discusses a graph classification comparison in PyTorch Geometric using different message passing and pooling layers on the IMDB-binary dataset.
Kernel methods and variable selection for exploratory analysis and multi-omic... (tuxette)
Nathalie Vialaneix
4th course on Computational Systems Biology of Cancer: Multi-omics and Machine Learning Approaches
International course, Curie training
https://training.institut-curie.org/courses/sysbiocancer2021
(remote)
September 29th, 2021
The document summarizes a presentation on applying GANs in medical imaging. It discusses several papers on this topic:
1. A paper that used GANs to reduce noise in low-dose CT scans by training on paired routine-dose and low-dose CT images. This approach generated reconstructed low-dose CT images with improved quality.
2. A paper that used GANs for cross-modality synthesis, specifically generating skin lesion images from other modalities.
3. Additional papers discussed other medical imaging applications of GANs such as vessel-fundus image synthesis and organ segmentation.
Brain reading, compressive sensing, fMRI and statistical learning in Python (Gael Varoquaux)
This document discusses techniques for predictive modeling of brain imaging data using statistical learning methods. It presents an approach that combines sparse recovery, randomized clustering, and total variation regularization to predict stimuli from fMRI data with over 50,000 voxels and around 100 samples. The key steps are clustering spatially correlated voxels, running sparse models on the reduced feature set, and accumulating selected features over multiple runs. Simulations show this approach outperforms other methods at recovering brain patches. The document also discusses disseminating research through open source Python libraries like scikit-learn, which has helped popularize machine learning techniques.
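A minimal sketch of the cluster-then-sparse-model idea (omitting the randomization and total-variation parts of the approach), using generic scikit-learn components and synthetic data in place of fMRI:

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# stand-in for fMRI: few samples, many "voxels"
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2000))
y = (X[:, :10].sum(axis=1) > 0).astype(int)     # stimulus depends on a small patch

model = make_pipeline(
    FeatureAgglomeration(n_clusters=200),        # cluster correlated voxels
    LogisticRegression(penalty="l1",             # sparse model on reduced features
                       solver="liblinear", C=1.0),
)
model.fit(X, y)
preds = model.predict(X)
```

Clustering first shrinks 2000 voxels to 200 parcels, making the sparse model tractable with only ~100 samples; the full method additionally randomizes the clustering and accumulates selected features over runs.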
Spatially Coherent Latent Topic Model For Concurrent Object Segmentation and ... (Shao-Chuan Wang)
The document summarizes a research paper on spatially coherent latent topic modeling for concurrent object segmentation and classification from images. The proposed model represents images as a collection of regions, each associated with a latent topic. It incorporates spatial relationships between regions by encouraging neighboring regions to take on similar topics. The model is trained using variational message passing to maximize the log likelihood of image data. Experimental results show the model can segment objects even under occlusion and achieve good performance on supervised classification tasks using natural scene images.
Joint Word and Entity Embeddings for Entity Retrieval from Knowledge Graph (FedorNikolaev)
The document proposes a method called KEWER that learns distributed representations of words, entities, and categories from a knowledge graph in the same embedding space. KEWER first generates random walks from entities, replaces some elements with surface forms, and then learns embeddings by maximizing the likelihood of contexts. These embeddings improve entity retrieval over term-based and existing joint embedding models, especially when combined with entity linking.
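A toy sketch of the walk-generation step, with a hypothetical three-entity graph and invented surface forms; the actual KEWER procedure differs in its details:

```python
import random

# toy knowledge graph: entity -> neighboring entities
graph = {
    "Q1": ["Q2", "Q3"],
    "Q2": ["Q1"],
    "Q3": ["Q1", "Q2"],
}
# hypothetical surface forms, used to mix word tokens into the walks
surface_forms = {"Q1": "alan turing", "Q2": "computer science", "Q3": "enigma"}

def random_walk(start, length, p_replace=0.5, rng=random.Random(0)):
    """Generate one walk, replacing some entities with their surface form
    so that word and entity tokens end up sharing contexts."""
    walk, node = [], start
    for _ in range(length):
        if rng.random() < p_replace:
            walk.extend(surface_forms[node].split())
        else:
            walk.append(node)
        node = rng.choice(graph[node])
    return walk

walks = [random_walk(e, 5) for e in graph for _ in range(10)]
# `walks` would then be fed to a skip-gram model to learn the joint embeddings
```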
Goodness-of-fit tests for regression models: the functional data case (NeuroMat)
In this talk the topic of goodness-of-fit for regression models with functional covariates is considered. Although several papers on checking regression models have been published in the last two decades, the case where the covariates are functional is quite recent and has become of interest in the last few years. We will review the very recent advances in this area and propose a new goodness-of-fit test for the null hypothesis of a functional linear model with scalar response. Our test is based on a generalization to the functional framework of a previous one, designed for the goodness-of-fit of regression models with multivariate covariates using random projections. The test statistic is easy to compute using geometrical and matrix arguments, and its distribution is simple to calibrate by a wild bootstrap on the residuals. Some theoretical aspects are derived and the finite-sample properties of the test are illustrated by a simulation study. Finally, the test is applied to real data for checking the assumption of the functional linear model, and a graphical tool is introduced. Lecturer: Wenceslao González-Manteiga, Univ. de Santiago de Compostela, Spain.
Consensual gene co-expression network inference with multiple samples (tuxette)
This document discusses methods for inferring gene co-expression networks from multiple gene expression samples, such as from different breeds or conditions. It describes using graphical Gaussian models and sparse regression approaches like the graphical lasso to learn networks from individual samples. For multiple samples, independent or joint network estimation methods are discussed, including the GroupLasso and CoopLasso approaches implemented in the R package simone, which aim to find consensus networks that are consistent or sign-coherent across conditions. An example dataset with gene expression from two pig breeds is analyzed to compare the methods.
Asynchronous Stochastic Optimization, New Analysis and Algorithms (Fabian Pedregosa)
This document provides an overview of asynchronous stochastic optimization methods and algorithms. It discusses asynchronous parallel stochastic gradient descent (SGD) and how it can minimize idle time. It also introduces asynchronous variance-reduced optimization methods like asynchronous SAGA that provide faster convergence than SGD. The document analyzes the convergence properties of asynchronous optimization methods and presents empirical results demonstrating the speedups achieved by asynchronous proximal SAGA (ProxASAGA) on large datasets.
Estimating Functional Connectomes: Sparsity’s Strength and LimitationsGael Varoquaux
Talk given at the OHBM 2017 education course.
I present the challenges and techniques to estimating meaningful brain functional connectomes from fMRI: why sparsity in inverse covariance leads to models that can interpreted as interactions between regions.
Then I discuss the limitations of sparse estimators and introduce shrinkage as an alternative. Finally, I discuss how to compare multiple functional connectomes.
Enhancing Partition Crossover with Articulation Points Analysisjfrchicanog
This is the presentation of the paper entitled "Enhancing Partition Crossover with Articulation Points Analysis" at the ECOM track in gECCO 2018 (Kyoto). This paper was awarded with a "Best Paper Award"
Connectomics: Parcellations and Network Analysis MethodsGael Varoquaux
Simple tutorial on methods for functional connectome analysis: learning regions, extracting functional signal, inferring the network structure, and comparing it across subjects.
This document provides an overview of advanced network modeling and connectivity measures for functional magnetic resonance imaging (fMRI) data. It discusses extracting network structures from fMRI time series data, comparing connectivity across subjects or groups, and interpreting the resulting network structures. Specific topics covered include functional connectivity measures like correlation and partial correlation, estimating inverse covariance matrices, comparing connections between groups, and summarizing networks using graph theoretical measures.
Representation Learning & Generative Modeling with Variational Autoencoder(VA...changedaeoh
This document summarizes the key ideas of auto-encoding variational Bayes. It discusses representation learning using latent variables to model high-dimensional sparse data on low-dimensional manifolds. It then explains generative modeling and the challenge of directly estimating complex data generating distributions. Finally, it introduces variational autoencoders as a way to approximate intractable posterior distributions over latent variables using variational inference and maximize a tractable evidence lower bound objective using the reparameterization trick, allowing end-to-end training of the encoder and decoder networks.
Reading "Bayesian measures of model complexity and fit"Christian Robert
This document summarizes a paper on Bayesian measures of model complexity and fit. It discusses using the posterior expected residual information, denoted pD, as a Bayesian measure of model complexity that accounts for the number of effective parameters. pD can be used to compare complex hierarchical models by balancing measures of fit and complexity. It is defined as the deviation of the estimated residual information from the true residual information. The paper also notes some observations about pD, such as that it is not invariant to transformations and depends on choices like the prior and estimator. pD can be easily calculated using MCMC output.
The document discusses machine learning techniques for clustering and segmentation. It introduces Dirichlet process mixtures and the Chinese restaurant process as nonparametric Bayesian models that allow for an infinite number of clusters. It describes how these models can be used for problems like image segmentation, object recognition, population clustering from genetic data, and evolutionary document clustering over time. Approximate inference methods like Markov chain Monte Carlo sampling are used to analyze these models.
Inter-site autism biomarkers from resting state fMRIGael Varoquaux
This document summarizes research predicting autism from resting-state fMRI data using a connectome classification pipeline. The pipeline involves defining regions of interest, extracting time series, computing functional connectivity matrices, and using supervised learning. The authors explore different choices for each step and find that learning regions with MSDL, using tangent-space embedding for connectivity, and standard SVM learning work best across datasets with different heterogeneity levels. The findings suggest connectome structure is less important for prediction than choice of regions and preprocessing.
Reviews on Deep Generative Models in the early days / GANs & VAEs paper reviewchangedaeoh
The document summarizes recent developments in deep generative models including GAN, VAE, CGAN, CVAE, DCGAN, and InfoGAN. It explains the objectives and training procedures of these models. GANs use a generator and discriminator in an adversarial training procedure, while VAEs have an encoder-decoder structure to learn an explicit density function. Conditional variants like CGAN and CVAE generate outputs conditioned on input data. DCGAN proposed architectures that improve GAN stability. InfoGAN extends GANs to learn disentangled and interpretable representations by maximizing mutual information between latent variables and observations.
This document summarizes and compares two popular Python libraries for graph neural networks - Spektral and PyTorch Geometric. It begins by providing an overview of the basic functionality and architecture of each library. It then discusses how each library handles data loading and mini-batching of graph data. The document reviews several common message passing layer types implemented in both libraries. It provides an example comparison of using each library for a node classification task on the Cora dataset. Finally, it discusses a graph classification comparison in PyTorch Geometric using different message passing and pooling layers on the IMDB-binary dataset.
Kernel methods and variable selection for exploratory analysis and multi-omic...tuxette
Nathalie Vialaneix
4th course on Computational Systems Biology of Cancer: Multi-omics and Machine Learning Approaches
International course, Curie training
https://training.institut-curie.org/courses/sysbiocancer2021
(remote)
September 29th, 2021
The document summarizes a presentation on applying GANs in medical imaging. It discusses several papers on this topic:
1. A paper that used GANs to reduce noise in low-dose CT scans by training on paired routine-dose and low-dose CT images. This approach generated reconstructed low-dose CT images with improved quality.
2. A paper that used GANs for cross-modality synthesis, specifically generating skin lesion images from other modalities.
3. Additional papers discussed other medical imaging applications of GANs such as vessel-fundus image synthesis and organ segmentation.
Brain reading, compressive sensing, fMRI and statistical learning in PythonGael Varoquaux
This document discusses techniques for predictive modeling of brain imaging data using statistical learning methods. It presents an approach that combines sparse recovery, randomized clustering, and total variation regularization to predict stimuli from fMRI data with over 50,000 voxels and around 100 samples. The key steps are clustering spatially correlated voxels, running sparse models on the reduced feature set, and accumulating selected features over multiple runs. Simulations show this approach outperforms other methods at recovering brain patches. The document also discusses disseminating research through open source Python libraries like scikit-learn, which has helped popularize machine learning techniques.
Spatially Coherent Latent Topic Model For Concurrent Object Segmentation and ...Shao-Chuan Wang
The document summarizes a research paper on spatially coherent latent topic modeling for concurrent object segmentation and classification from images. The proposed model represents images as a collection of regions, each associated with a latent topic. It incorporates spatial relationships between regions by encouraging neighboring regions to take on similar topics. The model is trained using variational message passing to maximize the log likelihood of image data. Experimental results show the model can segment objects even under occlusion and achieve good performance on supervised classification tasks using natural scene images.
Joint Word and Entity Embeddings for Entity Retrieval from Knowledge GraphFedorNikolaev
The document proposes a method called KEWER that learns distributed representations of words, entities, and categories from a knowledge graph in the same embedding space. KEWER first generates random walks from entities, replaces some elements with surface forms, and then learns embeddings by maximizing the likelihood of contexts. These embeddings improve entity retrieval over term-based and existing joint embedding models, especially when combined with entity linking.
Goodness–of–fit tests for regression models: the functional data caseNeuroMat
In this talk the topic of the goodness–of–fit for regression models with functional covariates is considered. Although several papers have been published in the last two decades for the checking of regression models, the case where the covariates are functional is quite recent and has became of interest in the last years. We will review the very recent advances in this area and we will propose a new goodness–of–fit test for the null hypothesis of a functional linear model with scalar response. Our test is based on a generalization to the functional framework of a previous one, designed for the goodness–of–fit of regression models with multivariate covariates using random projections. The test statistic is easy to compute using geometrical and matrix arguments, and simple to calibrate in its distribution by a wild bootstrap on the residuals. Some theoretical aspects are derived and the finite sample properties of the test are illustrated by a simulation study. Finally, the test is applied to real data for checking the assumption of the functional linear model and a graphical tool is introduced. Lecturer: Wenceslao González-Manteiga, Univ. de Santiago de Compostela, Spain.
Consensual gene co-expression network inference with multiple samplestuxette
This document discusses methods for inferring gene co-expression networks from multiple gene expression samples, such as from different breeds or conditions. It describes using graphical Gaussian models and sparse regression approaches like the graphical lasso to learn networks from individual samples. For multiple samples, independent or joint network estimation methods are discussed, including the GroupLasso and CoopLasso approaches implemented in the R package simone, which aim to find consensus networks that are consistent or sign-coherent across conditions. An example dataset with gene expression from two pig breeds is analyzed to compare the methods.
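The single-network building block mentioned here, the graphical lasso, can be sketched with scikit-learn's `GraphicalLasso` (this is not the simone package described in the document; the simulated data, penalty `alpha`, and edge threshold are illustrative assumptions):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Simulate expression data for one condition: n samples, p genes.
rng = np.random.default_rng(0)
n, p = 60, 10
X = rng.standard_normal((n, p))
X[:, 1] = X[:, 0] + 0.3 * rng.standard_normal(n)  # make genes 0 and 1 co-expressed

model = GraphicalLasso(alpha=0.2)
model.fit(X)

# Nonzero off-diagonal entries of the estimated precision matrix
# are the inferred edges of the conditional-independence graph.
precision = model.precision_
edges = [(i, j) for i in range(p) for j in range(i + 1, p)
         if abs(precision[i, j]) > 1e-8]
```

With this setup the strongly co-expressed pair (genes 0 and 1) is recovered as an edge, while the L1 penalty shrinks most spurious partial correlations to zero.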
Asynchronous Stochastic Optimization, New Analysis and AlgorithmsFabian Pedregosa
This document provides an overview of asynchronous stochastic optimization methods and algorithms. It discusses asynchronous parallel stochastic gradient descent (SGD) and how it can minimize idle time. It also introduces asynchronous variance-reduced optimization methods like asynchronous SAGA that provide faster convergence than SGD. The document analyzes the convergence properties of asynchronous optimization methods and presents empirical results demonstrating the speedups achieved by asynchronous proximal SAGA (ProxASAGA) on large datasets.
Joint gene network inference with multiple samples: a bootstrapped consensual...tuxette
The document describes a method for jointly inferring gene networks from multiple samples using a consensus LASSO approach. It begins with an overview of network inference with Gaussian graphical models and the use of partial correlations. It then discusses challenges with independently estimating networks from multiple samples. The consensus LASSO approach is proposed to infer multiple networks by forcing them towards a consensus network, using an L2 penalty to constrain differences between networks. Simulation results demonstrate that the consensus LASSO approach can accurately recover the shared structure between networks generated from a common original network.
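The partial correlations underlying Gaussian graphical models come directly from the precision matrix Omega via pi_ij = -Omega_ij / sqrt(Omega_ii * Omega_jj). A minimal sketch (the toy covariance matrix is invented for illustration):

```python
import numpy as np

def partial_correlations(precision):
    """Map a precision (inverse covariance) matrix Omega to partial
    correlations: pi_ij = -Omega_ij / sqrt(Omega_ii * Omega_jj)."""
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Toy covariance over three genes; its inverse plays the role of the
# estimated precision matrix from a Gaussian graphical model.
Sigma = np.array([[1.0, 0.8, 0.5],
                  [0.8, 1.0, 0.4],
                  [0.5, 0.4, 1.0]])
pcor = partial_correlations(np.linalg.inv(Sigma))
# pi_ij = 0 exactly when genes i and j are conditionally independent
# given all the others, i.e. when edge (i, j) is absent from the network.
```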
A short and naive introduction to using network in prediction modelstuxette
The document provides an introduction to using network information in prediction models. It discusses representing a network as a graph with a Laplacian matrix. The Laplacian captures properties like random walks on the graph and heat diffusion. Eigenvectors of the Laplacian related to small eigenvalues are strongly tied to graph structure. The document discusses using the Laplacian in prediction models by working in the feature space defined by the Laplacian eigenvectors or directly regularizing a linear model with the Laplacian. This introduces network information and encourages similar contributions from connected nodes. The approaches are applied to problems like predicting phenotypes from gene expression using a known gene network.
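A minimal sketch of the second approach described above, regularizing a linear model with the graph Laplacian (the toy network, simulated data, and value of lambda are assumptions for illustration):

```python
import numpy as np

# Toy gene network on p = 4 genes: adjacency A, Laplacian L = D - A.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 0, 0, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

# Simulated expression data and phenotype (genes 0-2 active, gene 3 not).
rng = np.random.default_rng(1)
n = 50
X = rng.standard_normal((n, 4))
y = X @ np.array([1.0, 1.0, 1.0, 0.0]) + 0.1 * rng.standard_normal(n)

# Network-regularized least squares:
#   beta_hat = argmin ||y - X beta||^2 + lam * beta' L beta
# Since beta' L beta = sum over edges (beta_i - beta_j)^2, the penalty
# encourages connected genes to have similar contributions.
lam = 1.0
beta_hat = np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```

Here the penalty is cheap because the true coefficients are already equal on the connected triangle, which is exactly the kind of structure this regularizer favours.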
Topic of presentation: Variational autoencoders for speech processing
The main points of the presentation: Variational autoencoders (VAEs) have become one of the most popular unsupervised learning techniques for modelling complex data distributions, such as images and audio. In this talk I'll begin with a general introduction to VAEs and then review a recent technique called VQ-VAE, which is capable of learning a rudimentary phoneme-level language model from raw audio without any supervision.
http://dataconf.com.ua/speaker-page/dmytro-bielievtsov.php
https://www.youtube.com/watch?v=euYSAL-aKMI&list=PL5_LBM8-5sLjbRFUtXaUpg84gtJtyc4Pu&t=0s&index=9
Quantitative Propagation of Chaos for SGD in Wide Neural NetworksValentin De Bortoli
The document discusses quantitative analysis of stochastic gradient descent (SGD) for training wide neural networks. It presents two different regimes - a deterministic regime where the limiting dynamics is described by an ordinary differential equation, and a stochastic regime where the limiting dynamics is a stochastic differential equation. Experiments on MNIST classification show that the stochastic regime with larger step sizes exhibits better regularization properties. The analysis provides insights into the behavior of neural network training as the number of neurons becomes large.
Deep dive into the mathematics and algorithms of neural nets. Covers the sigmoid activation function, cross-entropy loss function, gradient descent and the derivatives used in back propagation.
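The pieces listed above fit together in a minimal logistic-regression sketch (toy data; the simplified gradient (p - y) * x follows from combining the sigmoid activation with the cross-entropy loss):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(y, p):
    eps = 1e-12  # avoid log(0)
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# Tiny dataset: learn y = 1 when x > 0.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 1))
y = (X[:, 0] > 0).astype(float)

w, b, lr = 0.0, 0.0, 0.5
losses = []
for _ in range(200):
    p = sigmoid(X[:, 0] * w + b)          # forward pass
    losses.append(cross_entropy(y, p))
    # Backward pass: d(loss)/dz = p - y, so the chain rule gives
    grad_w = np.mean((p - y) * X[:, 0])
    grad_b = np.mean(p - y)
    w -= lr * grad_w                       # gradient descent step
    b -= lr * grad_b
```

The loss falls steadily from its initial value of ln 2, illustrating why the sigmoid/cross-entropy pairing is popular: the error term in the gradient stays simple and well-scaled.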
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...tuxette
The document summarizes preliminary results from evaluating methods for inferring gene regulatory networks from expression data in Bacillus subtilis. It finds that recall of the known network is generally poor (<20% for random forest), but inferred clusters still retain biological information about common regulators. It plans to confirm results, test restricting edges to sigma factors, and explore other inference methods like Bayesian networks and ARACNE.
Special Plenary Lecture at the International Conference on VIBRATION ENGINEERING AND TECHNOLOGY OF MACHINERY (VETOMAC), Lisbon, Portugal, September 10 - 13, 2018
http://www.conf.pt/index.php/v-speakers
Propagation of uncertainties in complex engineering dynamical systems is receiving increasing attention. When uncertainties are taken into account, the equations of motion of discretised dynamical systems can be expressed by coupled ordinary differential equations with stochastic coefficients. The computational cost for the solution of such a system mainly depends on the number of degrees of freedom and number of random variables. Among various numerical methods developed for such systems, the polynomial chaos based Galerkin projection approach shows significant promise because it is more accurate compared to the classical perturbation based methods and computationally more efficient compared to the Monte Carlo simulation based methods. However, the computational cost increases significantly with the number of random variables and the results tend to become less accurate for a longer length of time. In this talk novel approaches will be discussed to address these issues. Reduced-order Galerkin projection schemes in the frequency domain will be discussed to address the problem of a large number of random variables. Practical examples will be given to illustrate the application of the proposed Galerkin projection techniques.
Hybrid Meta-Heuristic Algorithms For Solving Network Design ProblemAlana Cartwright
This document discusses hybrid meta-heuristic algorithms for solving network design problems. It proposes hybridizing the ant system meta-heuristic with genetic algorithms, simulated annealing, and tabu search. Seven hybrid algorithms are developed and tested on the Sioux Falls network, where the hybrids prove more effective than the base ant system alone. One hybrid combining all four concepts is also applied to the real network of a city with over 2 million people, again proving more effective than the base ant system.
The document discusses building robust machine learning systems that can handle concept drift. It introduces the challenges of concept drift when the underlying data distribution changes over time. It proposes using Gaussian process classifiers with an adaptive training window approach. The approach monitors for concept drift and retrains the model if detected. It tests the approach on artificial data streams with different drift scenarios and finds the adaptive approach performs better than a static model at handling concept drift. Future work could explore other drift detection methods and ensembles of adaptive Gaussian process classifiers.
Inferring networks from multiple samples with consensus LASSOtuxette
This document provides a short overview of network inference using graphical Gaussian models (GGMs). It discusses inferring networks from multiple samples, with the motivation being to identify genes that are linked independently or depending on different conditions. A naive approach of performing independent estimations on each sample is described. Joint network inference using the consensus LASSO method is then introduced to better identify common and condition-specific network structures across multiple related samples.
ERGMs (Exponential Random Graph Models) are statistical models for social networks that specify the probability of a graph as a function of network statistics. Three key points:
1. ERGMs express the probability of a graph as proportional to an exponential family form involving network statistics. This allows modeling dependencies between ties.
2. The conditional probability of a tie is derived from the ERGM and gives insight into how the model parameters influence individual tie formation.
3. Examples of classic network models like Bernoulli graphs and p1 models are shown to be special cases within the ERGM framework, connecting logistic regression approaches to the more general ERGM.
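Point 2 can be made concrete with a small sketch: under an ERGM with edge and triangle statistics, the conditional log-odds of a tie is the inner product of the parameters with the change statistics, and setting the triangle parameter to zero recovers the Bernoulli/logistic special case of point 3 (the graph and parameter values below are invented for illustration):

```python
import math
import numpy as np

def tie_probability(adj, i, j, theta_edges, theta_triangles):
    """Conditional probability of tie (i, j) in an ERGM with edge and
    triangle statistics: logit P = theta_edges * d_edges + theta_tri * d_tri,
    where d_* are the change statistics when the tie is toggled on."""
    d_edges = 1
    # Adding (i, j) closes one triangle per common neighbour of i and j.
    d_tri = int(np.sum(adj[i] * adj[j]))
    logit = theta_edges * d_edges + theta_triangles * d_tri
    return 1.0 / (1.0 + math.exp(-logit))

# 4-node undirected graph; nodes 0 and 1 share neighbour 2.
adj = np.array([[0, 0, 1, 0],
                [0, 0, 1, 0],
                [1, 1, 0, 0],
                [0, 0, 0, 0]])
p = tie_probability(adj, 0, 1, theta_edges=-1.0, theta_triangles=0.5)
# With theta_triangles = 0, p reduces to sigmoid(theta_edges): a
# Bernoulli graph, i.e. plain logistic regression on the edge term.
```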
We approach the screening problem - i.e. detecting which inputs of a computer model significantly impact the output - from a formal Bayesian model selection point of view. That is, we place a Gaussian process prior on the computer model and consider the $2^p$ models that result from assuming that each of the subsets of the $p$ inputs affect the response. The goal is to obtain the posterior probabilities of each of these models. In this talk, we focus on the specification of objective priors on the model-specific parameters and on convenient ways to compute the associated marginal likelihoods. These two problems that normally are seen as unrelated, have challenging connections since the priors proposed in the literature are specifically designed to have posterior modes in the boundary of the parameter space, hence precluding the application of approximate integration techniques based on e.g. Laplace approximations. We explore several ways of circumventing this difficulty, comparing different methodologies with synthetic examples taken from the literature.
Authors: Gonzalo Garcia-Donato (Universidad de Castilla-La Mancha) and Rui Paulo (Universidade de Lisboa)
Statistical inference of network structureTiago Peixoto
The document discusses Bayesian statistical inference for characterizing the structure of large networks. It introduces stochastic blockmodels as a generative model for network structure and describes performing Bayesian inference on these models. This involves defining a likelihood function based on a stochastic blockmodel, placing prior distributions on the model parameters, and computing the posterior distribution over partitions using Bayes' rule. Statistical inference provides a principled means of inferring community structure without overfitting and enables model selection among different partitions. Examples analyzing real networks demonstrate its ability to uncover meaningful structure.
This document discusses network representation and analysis. It defines networks as consisting of nodes (vertices) and edges, and describes different ways to represent networks mathematically using adjacency matrices, incidence matrices, and Laplacian matrices. It also discusses visualizing networks using multidimensional scaling and plotting them in R. Special types of networks like complete graphs and random graphs are briefly introduced.
Learning to discover monte carlo algorithm on spin ice manifoldKai-Wen Zhao
A global-update Monte Carlo sampler can be discovered naturally by a machine trained with a policy gradient method in a topologically constrained environment.
There is now a huge literature on Bayesian methods for variable selection that use spike-and-slab priors. Such methods, in particular, have been quite successful for applications in a variety of different fields. High-throughput genomics and neuroimaging are two of such examples. There, novel methodological questions are being generated, requiring the integration of different concepts, methods, tools and data types. These have in particular motivated the development of variable selection priors that go beyond the independence assumptions of a simple Bernoulli prior on the variable inclusion indicators. In this talk I will describe various prior constructions that incorporate information about structural dependencies among the variables. I will also address extensions of the models to the analysis of count data. I will motivate the development of the models using specific applications from neuroimaging and from studies that use microbiome data.
Similar to Reading revue of "Inferring Multiple Graphical Structures"
Racines en haut et feuilles en bas : les arbres en mathstuxette
1. The document discusses methods for clustering and differential analysis of Hi-C matrices, which represent the 3D organization of DNA.
2. It proposes extending Ward's hierarchical clustering to directly use Hi-C similarity matrices while enforcing adjacency constraints. A fast algorithm was also developed.
3. A new method called "treediff" was created to perform differential analysis of Hi-C matrices based on the Wasserstein distance between hierarchical clusterings. Software implementations of these methods were also developed.
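A rough stand-in for point 2, Ward clustering under adjacency constraints, is scikit-learn's `AgglomerativeClustering` with a connectivity matrix (this is not the authors' Hi-C implementation; the 1D toy signal and the band connectivity are illustrative assumptions):

```python
import numpy as np
from scipy.sparse import diags
from sklearn.cluster import AgglomerativeClustering

# Toy 1D signal standing in for genomic bins along a chromosome:
# two clearly separated segments.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 0.1, 20),
                    rng.normal(3, 0.1, 20)]).reshape(-1, 1)

# Adjacency constraint: each bin may only merge with its chromosomal
# neighbours, so every cluster is a contiguous segment.
n = len(x)
connectivity = diags([1, 1], [-1, 1], shape=(n, n))

model = AgglomerativeClustering(n_clusters=2, linkage="ward",
                                connectivity=connectivity)
labels = model.fit_predict(x)
```

The constraint is what makes the result interpretable genomically: without it, Ward's method could group distant bins into the same cluster.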
Méthodes à noyaux pour l’intégration de données hétérogènestuxette
The document discusses a presentation about multi-omics data integration methods using kernel methods. The presentation introduces kernel methods, how they can be used to integrate heterogeneous omics data, and examples of applications. Specifically, it discusses using kernel methods to perform unsupervised transformation-based integration of multi-omics data. It also presents an application of constrained kernel hierarchical clustering to analyze Hi-C data by directly using Hi-C matrices as kernels.
Méthodologies d'intégration de données omiquestuxette
This document summarizes a presentation on multi-omics data integration methods given by Nathalie Vialaneix on December 13, 2023. The presentation discusses different types of omics data that can be integrated, both vertically across different levels of omics data on the same samples and horizontally across similar types of omics data on different samples. It also discusses different analysis approaches that can be taken, including supervised and unsupervised methods. The rest of the presentation focuses on unsupervised transformation-based integration methods using kernels.
The document discusses current and future work on analyzing Hi-C data and differential analysis of Hi-C matrices. It describes a clustering method developed to partition chromosomes based on Hi-C matrix similarity. It also introduces a new method called treediff for differential analysis of Hi-C data that calculates the distance between hierarchical clusterings. Current work includes reviewing differential analysis methods, investigating differential subtrees with multiple testing control, and inferring chromatin interaction networks.
Can deep learning learn chromatin structure from sequence?tuxette
This document discusses a deep learning model called ORCA that can predict chromatin structure from DNA sequence. The model uses a neural network with an encoder to extract features from sequence and a decoder to predict Hi-C matrices. It was trained on Hi-C data from multiple cell types and can predict interactions between regions at various resolutions. The model accurately captures features like CTCF-mediated loops and can predict effects of structural variants on chromatin structure. It allows for in silico mutagenesis to study how mutations may alter 3D genome organization.
Multi-omics data integration methods: kernel and other machine learning appro...tuxette
The document discusses multi-omics data integration methods, particularly kernel methods. It describes how kernel methods transform data into similarity matrices between samples rather than relying on variable space. Multiple kernel integration approaches are presented that combine multiple similarity matrices into a consensus kernel in an unsupervised manner, such as through a STATIS-like framework that maximizes the similarity between kernels. Examples of applications to datasets from the TARA Oceans expedition are given.
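A minimal sketch of a STATIS-like consensus kernel along the lines described above (the Frobenius normalization and eigenvector weighting follow the general idea of maximizing similarity between kernels; the toy Gram matrices are invented):

```python
import numpy as np

def consensus_kernel(kernels):
    """STATIS-like unsupervised combination: weight each (Frobenius-
    normalised) kernel by the leading eigenvector of their pairwise
    cosine-similarity matrix, then take the weighted sum."""
    K = [k / np.linalg.norm(k) for k in kernels]
    m = len(K)
    # Pairwise similarity between kernels (cosine of the RV type).
    C = np.array([[np.sum(K[a] * K[b]) for b in range(m)] for a in range(m)])
    vals, vecs = np.linalg.eigh(C)
    w = np.abs(vecs[:, -1])          # leading eigenvector, made nonnegative
    w = w / w.sum()
    return sum(wi * Ki for wi, Ki in zip(w, K))

# Two toy Gram matrices computed from two "omics views" of 3 samples.
X1 = np.array([[0.0], [1.0], [2.0]])
X2 = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 1.0]])
K_star = consensus_kernel([X1 @ X1.T, X2 @ X2.T])
```

Because the weights are nonnegative and each input is a valid kernel, the consensus remains symmetric positive semi-definite and can be fed to any downstream kernel method.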
This document provides an overview of the MetaboWean and Idefics projects. MetaboWean aims to study the co-evolution of gut microbiota and epithelium during suckling-to-weaning transition in rabbits, using metabolomics, metagenomics, and single-cell RNA sequencing data. Idefics integrates multiple omics datasets from human skin samples to understand relationships between microorganisms and molecules and how they are structured in patient groups. The datasets include metagenomics, metabolomics, and proteomics from host and microbiota.
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...tuxette
ASTERICS is an interactive and integrative data analysis tool for omics data. It uses Rserve and PyRserve with Flask and Vue.js in a Docker container to integrate omics data. The backend uses Rserve and PyRserve with Flask on the server side, while the frontend uses Vue.js. This architecture was chosen for its open source and light design. Data communication between Rserve and PyRserve is limited, requiring an object database. ASTERICS is deployed using three Docker containers for R, Python, and
Apprentissage pour la biologie moléculaire et l’analyse de données omiquestuxette
This document summarizes a scientific presentation about molecular biology and omics data analysis. The presentation covers topics related to analyzing large omics datasets using methods like kernel methods, graphical models, and neural networks to learn gene regulation networks and predict phenotypes. Key challenges addressed are handling big data, missing values, non-Gaussian data types like counts and compositional data. The goal is to better understand complex biological systems from multi-omics data.
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...tuxette
The document discusses methods for integrating multi-scale omics data using kernel and machine learning approaches. It describes how omics data is large, heterogeneous, and multi-scaled, creating bottlenecks for analysis. Methods discussed for data integration include multiple kernel learning to combine different relational datasets in an unsupervised way. The methods are applied to integrate different datasets from the TARA Oceans expedition to identify patterns in ocean microbial communities. Improving interpretability of the methods and making them more accessible to biological users is discussed.
Journal club: Validation of cluster analysis results on validation datatuxette
This document presents a framework for validating cluster analysis results on validation data. It describes situations where clustering is inferential versus descriptive and recommends using validation data separate from the data used for clustering. A typology of validation methods is provided, including validation based on the clustering method or results, and evaluation using internal validation, external validation, visual properties, or stability measures.
The document discusses the differences between overfitting and overparametrization in machine learning models. It explores how random forests may exhibit a phenomenon known as "double descent" where test error initially decreases then increases with more parameters before decreasing again. While double descent has been observed in other models, the document questions whether it is directly due to model complexity in random forests since very large trees may be unable to fully interpolate extremely large datasets.
Selective inference and single-cell differential analysistuxette
This document discusses selective inference and single-cell differential analysis. It introduces the problem of "double dipping" in the standard single-cell analysis pipeline where the same dataset is used for clustering and differential analysis. Two approaches for addressing this are presented: 1) A method that perturbs clusters before testing for differences, and 2) A test based on a truncated distribution that assumes clusters and genes are given separately. Experiments applying these methods to real single-cell datasets are described. The document outlines challenges in extending these approaches to more complex analyses.
SOMbrero : un package R pour les cartes auto-organisatricestuxette
SOMbrero is an R package that implements self-organizing map (SOM) algorithms. It can handle numeric, non-numeric, and relational data. The package contains functions for training SOMs, diagnosing results, and plotting maps. It also includes tools like a shiny app and vignettes to aid users without programming experience. SOMbrero supports missing data imputation and extends SOM to relational datasets through non-Euclidean distance measures.
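SOMbrero itself is an R package; purely for intuition about what SOM training does, a minimal online SOM update loop might be sketched as follows (grid size, learning-rate and neighbourhood schedules are arbitrary choices, not SOMbrero's):

```python
import numpy as np

def train_som(data, grid=(3, 3), epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Minimal online SOM: for each input, find the best matching unit
    (BMU), then pull it and its grid neighbours towards the input."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    codebook = rng.standard_normal((rows * cols, data.shape[1]))
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
    for t in range(epochs):
        lr = lr0 * (1 - t / epochs)                  # decaying learning rate
        sigma = sigma0 * (1 - t / epochs) + 0.1      # shrinking neighbourhood
        for x in rng.permutation(data):
            bmu = np.argmin(((codebook - x) ** 2).sum(axis=1))
            dist2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-dist2 / (2 * sigma ** 2))    # neighbourhood kernel
            codebook += lr * h[:, None] * (x - codebook)
    return codebook

data = np.array([[0, 0], [0, 1], [5, 5], [5, 6.0]])
codebook = train_som(data)
```

After training, prototype vectors near each other on the grid represent similar inputs, which is what makes the map useful for visualization and diagnostics.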
This document summarizes different approaches for structure learning in graph neural networks. It discusses three main classes of methods: 1) metric-based learning which learns a similarity matrix between nodes, 2) probabilistic models which learn the parameters of a distribution over graphs, and 3) direct optimization which directly optimizes the graph adjacency matrix. The document provides examples of methods within each class and notes challenges such as the simplicity of probabilistic models and computational difficulties of direct optimization.
La statistique et le machine learning pour l'intégration de données de la bio...tuxette
This document summarizes a presentation on using statistics and machine learning for integrating high-throughput biological data. It discusses how biological data is large in volume, multi-scaled and heterogeneous in type, creating bottlenecks for analysis. It presents different methods for integrating multiple data tables, including multiple kernel learning to combine similarity matrices. An example application to TARA Oceans data is described, identifying Rhizaria abundance as structuring ocean differences. Interpretability of results is discussed along with prospects for deep learning and predicting phenotypes while understanding relationships.
Authoring a personal GPT for your research and practice: How we created the Q...Leonel Morgado
Thematic analysis in qualitative research is a time-consuming and systematic task, typically done using teams. Team members must ground their activities on common understandings of the major concepts underlying the thematic analysis, and define criteria for its development. However, conceptual misunderstandings, equivocations, and lack of adherence to criteria are challenges to the quality and speed of this process. Given the distributed and uncertain nature of this process, we wondered if the tasks in thematic analysis could be supported by readily available artificial intelligence chatbots. Our early efforts point to potential benefits: not just saving time in the coding process but better adherence to criteria and grounding, by increasing triangulation between humans and artificial intelligence. This tutorial will provide a description and demonstration of the process we followed, as two academic researchers, to develop a custom ChatGPT to assist with qualitative coding in the thematic data analysis process of immersive learning accounts in a survey of the academic literature: QUAL-E Immersive Learning Thematic Analysis Helper. In the hands-on time, participants will try out QUAL-E and develop their ideas for their own qualitative coding ChatGPT. Participants that have the paid ChatGPT Plus subscription can create a draft of their assistants. The organizers will provide course materials and slide deck that participants will be able to utilize to continue development of their custom GPT. The paid subscription to ChatGPT Plus is not required to participate in this workshop, just for trying out personal GPTs during it.
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdfSelcen Ozturkcan
Ozturkcan, S., Berndt, A., & Angelakis, A. (2024). Mending clothing to support sustainable fashion. Presented at the 31st Annual Conference by the Consortium for International Marketing Research (CIMaR), 10-13 Jun 2024, University of Gävle, Sweden.
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Leonel Morgado
Current descriptions of immersive learning cases are often difficult or impossible to compare. This is due to a myriad of different options on what details to include, which aspects are relevant, and on the descriptive approaches employed. Also, these aspects often combine very specific details with more general guidelines or indicate intents and rationales without clarifying their implementation. In this paper we provide a method to describe immersive learning cases that is structured to enable comparisons, yet flexible enough to allow researchers and practitioners to decide which aspects to include. This method leverages a taxonomy that classifies educational aspects at three levels (uses, practices, and strategies) and then utilizes two frameworks, the Immersive Learning Brain and the Immersion Cube, to enable a structured description and interpretation of immersive learning cases. The method is then demonstrated on a published immersive learning case on training for wind turbine maintenance using virtual reality. Applying the method results in a structured artifact, the Immersive Learning Case Sheet, that tags the case with its proximal uses, practices, and strategies, and refines the free text case description to ensure that matching details are included. This contribution is thus a case description method in support of future comparative research of immersive learning cases. We then discuss how the resulting description and interpretation can be leveraged to change immersion learning cases, by enriching them (considering low-effort changes or additions) or innovating (exploring more challenging avenues of transformation). The method holds significant promise to support better-grounded research in immersive learning.
Immersive Learning That Works: Research Grounding and Paths ForwardLeonel Morgado
We will metaverse into the essence of immersive learning, into its three dimensions and conceptual models. This approach encompasses elements from teaching methodologies to social involvement, through organizational concerns and technologies. Challenging the perception of learning as knowledge transfer, we introduce a 'Uses, Practices & Strategies' model operationalized by the 'Immersive Learning Brain' and ‘Immersion Cube’ frameworks. This approach offers a comprehensive guide through the intricacies of immersive educational experiences and spotlighting research frontiers, along the immersion dimensions of system, narrative, and agency. Our discourse extends to stakeholders beyond the academic sphere, addressing the interests of technologists, instructional designers, and policymakers. We span various contexts, from formal education to organizational transformation to the new horizon of an AI-pervasive society. This keynote aims to unite the iLRN community in a collaborative journey towards a future where immersive learning research and practice coalesce, paving the way for innovative educational research and practice landscapes.
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...Advanced-Concepts-Team
Presentation in the Science Coffee of the Advanced Concepts Team of the European Space Agency on the 07.06.2024.
Speaker: Diego Blas (IFAE/ICREA)
Title: Gravitational wave detection with orbital motion of Moon and artificial satellites
Abstract:
In this talk I will describe some recent ideas to find gravitational waves from supermassive black holes or of primordial origin by studying their secular effect on the orbital motion of the Moon or satellites that are laser ranged.
The binding of cosmological structures by massless topological defectsSérgio Sacani
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field
equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational
field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin
spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling
concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect
light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or modified gravity theory is
mitigated, at least in part.
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...Sérgio Sacani
Context. With a mass exceeding several 10⁴ M⊙ and a rich and dense population of massive stars, supermassive young star clusters
represent the most massive star-forming environment that is dominated by the feedback from massive stars and gravitational interactions
among stars.
Aims. In this paper we present the Extended Westerlund 1 and 2 Open Clusters Survey (EWOCS) project, which aims to investigate
the influence of the starburst environment on the formation of stars and planets, and on the evolution of both low and high mass stars.
The primary targets of this project are Westerlund 1 and 2, the closest supermassive star clusters to the Sun.
Methods. The project is based primarily on recent observations conducted with the Chandra and JWST observatories. Specifically,
the Chandra survey of Westerlund 1 consists of 36 new ACIS-I observations, nearly co-pointed, for a total exposure time of 1 Msec.
Additionally, we included 8 archival Chandra/ACIS-S observations. This paper presents the resulting catalog of X-ray sources within
and around Westerlund 1. Sources were detected by combining various existing methods, and photon extraction and source validation
were carried out using the ACIS-Extract software.
Results. The EWOCS X-ray catalog comprises 5963 validated sources out of the 9420 initially provided to ACIS-Extract, reaching a
photon flux threshold of approximately 2 × 10⁻⁸ photons cm⁻² s⁻¹. The X-ray sources exhibit a highly concentrated spatial distribution,
with 1075 sources located within the central 1 arcmin. We have successfully detected X-ray emissions from 126 out of the 166 known
massive stars of the cluster, and we have collected over 71 000 photons from the magnetar CXO J164710.20-455217.
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...Scintica Instrumentation
Targeting Hsp90 and its pathogen Orthologs with Tethered Inhibitors as a Diagnostic and Therapeutic Strategy for cancer and infectious diseases with Dr. Timothy Haystead.
The debris of the ‘last major merger’ is dynamically youngSérgio Sacani
The Milky Way’s (MW) inner stellar halo contains an [Fe/H]-rich component with highly eccentric orbits, often referred to as the
‘last major merger.’ Hypotheses for the origin of this component include Gaia-Sausage/Enceladus (GSE), where the progenitor
collided with the MW proto-disc 8–11 Gyr ago, and the Virgo Radial Merger (VRM), where the progenitor collided with the
MW disc within the last 3 Gyr. These two scenarios make different predictions about observable structure in local phase space,
because the morphology of debris depends on how long it has had to phase mix. The recently identified phase-space folds in Gaia
DR3 have positive caustic velocities, making them fundamentally different than the phase-mixed chevrons found in simulations
at late times. Roughly 20 per cent of the stars in the prograde local stellar halo are associated with the observed caustics. Based
on a simple phase-mixing model, the observed number of caustics is consistent with a merger that occurred 1–2 Gyr ago.
We also compare the observed phase-space distribution to FIRE-2 Latte simulations of GSE-like mergers, using a quantitative
measurement of phase mixing (2D causticality). The observed local phase-space distribution best matches the simulated data
1–2 Gyr after collision, and certainly not later than 3 Gyr. This is further evidence that the progenitor of the ‘last major merger’
did not collide with the MW proto-disc at early times, as is thought for the GSE, but instead collided with the MW disc within
the last few Gyr, consistent with the body of work surrounding the VRM.
Farming systems analysis: what have we learnt?.pptx
Reading review of "Inferring Multiple Graphical Structures"
1. Reading review of "Inferring Multiple Graphical Structures"
from J. Chiquet et al. (and related articles)
Nathalie Villa-Vialaneix - nathalie.villa@univ-paris1.fr
http://www.nathalievilla.org
Groupe de travail samm-graph - 17/02/2012
Reading review (Chiquet et al., 2011) samm-graph, 17/02/2012 Nathalie Villa-Vialaneix 1 / 18
4. Network inference
Framework
Data: large scale gene expression data, stored as an n × p matrix X with entries X_i^j:
• rows: individuals, n ≈ 30 to 50;
• columns: variables (gene expressions), p ≈ 10^3 to 10^4.
What we want to obtain: a network with
• nodes: genes;
• edges: significant and direct co-expression between two genes (to track
transcription regulations).
7. Network inference
Advantages of inferring a network from large scale transcription data
1 over raw data: focuses on direct links (discards strong but indirect correlations);
2 over raw data (again): focuses on significant links (more robust);
3 over a bibliographic network: can handle interactions with yet
unknown (not annotated) genes.
9. Network inference
Various approaches (and packages) to infer gene co-expression networks
• Graphical Gaussian Model: (X_i)_{i=1,...,n} are i.i.d. Gaussian random
variables N(0, Σ) (gene expressions); then
j ↔ j' (genes j and j' are linked) ⇔ Cor(X^j, X^{j'} | (X^k)_{k≠j,j'}) ≠ 0
and Cor(X^j, X^{j'} | (X^k)_{k≠j,j'}) ∝ Σ⁻¹_{jj'} ⇒ find the partial correlations
by means of (Σ̂_n)⁻¹.
Problem: Σ is a p × p matrix (with p large) and n is small compared to p
⇒ (Σ̂_n)⁻¹ is a poor estimate of Σ⁻¹!
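The link between edges and the inverse covariance can be checked numerically. A minimal NumPy sketch (the simulated data, the injected link, and the n > p setting are illustrative, not from the paper): the partial correlation between genes j and j' given the rest is −Ω_{jj'} / √(Ω_{jj} Ω_{j'j'}), with Ω the precision matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting with n > p, so the empirical covariance is invertible.
n, p = 200, 5
X = rng.standard_normal((n, p))
X[:, 1] += 0.8 * X[:, 0]          # make genes 0 and 1 directly co-expressed

S = np.cov(X, rowvar=False)        # empirical covariance (Sigma_n hat)
Omega = np.linalg.inv(S)           # precision matrix (Sigma_n hat)^{-1}

# Partial correlations: rho_{jj'} = -Omega_{jj'} / sqrt(Omega_jj * Omega_j'j')
d = np.sqrt(np.diag(Omega))
partial_cor = -Omega / np.outer(d, d)
np.fill_diagonal(partial_cor, 1.0)

print(np.round(partial_cor, 2))
```

With n ≪ p, as in the gene expression setting, `np.linalg.inv(S)` fails or is wildly unstable, which is exactly the problem the shrinkage and sparse approaches below address.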
11. Network inference
Various approaches (and packages) to infer gene co-expression networks
• Graphical Gaussian Model
• seminal work:
[Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b]
(with bootstrapping or shrinkage and a proposal for a Bayesian test for
significance); package GeneNet;
• sparse approaches [Friedman et al., 2008]: packages GGMselect
[Giraud et al., 2009] or SIMoNe [Chiquet et al., 2009,
Ambroise et al., 2009, Chiquet et al., 2011] (with unsupervised
clustering, or able to handle data from multiple populations)
13. Network inference
Various approaches (and packages) to infer gene co-expression networks
• Graphical Gaussian Model
• Bayesian network learning [Pearl, 1998, Pearl and Russel, 2002]:
a DAG (Directed Acyclic Graph) and (conditional) probability tables.
Learning: find the conditional probability tables and the DAG.
Standard issues:
• search for unobserved (latent) variable dependencies;
• estimate probabilities by ML optimization (EM algorithm);
• search for the DAG (skeleton, directionality): several DAGs are often
plausible.
Package bnlearn [Scutari, 2010].
14. Network inference
Various approaches (and packages) to infer gene co-expression networks
• Graphical Gaussian Model
• Bayesian network learning [Pearl, 1998, Pearl and Russel, 2002]
• Networks based on mutual information (MI): the MI, I(X^j, X^{j'}),
measures the information gain (related to the KL divergence):
I(X^j, X^{j'}) = H(X^j) + H(X^{j'}) − H(X^j, X^{j'}) = H(X^j) − H(X^j | X^{j'})
where H is the entropy, H(X^j) = −∑_{x∈X_j} p(x) log p(x)
(I is the uncertainty reduction in one variable after removing the uncertainty
in the other variable).
Standard issues:
• estimate I;
• find out which pairs of variables have significant MI.
Package minet [Meyer et al., 2008].
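The MI identity above is easy to verify with a plug-in estimator on discretized data. This is a minimal sketch (the simulated variables and the plug-in entropy estimator are illustrative; minet itself offers several more refined estimators):

```python
import numpy as np
from collections import Counter

def entropy(labels):
    """Plug-in entropy H = -sum p(x) log p(x) over observed symbol frequencies."""
    n = len(labels)
    return -sum((c / n) * np.log(c / n) for c in Counter(labels).values())

rng = np.random.default_rng(1)
x = rng.integers(0, 3, size=5000)                # discretized expression of gene j
# gene j' copies gene j 70% of the time, so the two share information
y = np.where(rng.random(5000) < 0.7, x, rng.integers(0, 3, size=5000))

joint = list(zip(x.tolist(), y.tolist()))
mi = entropy(x.tolist()) + entropy(y.tolist()) - entropy(joint)

# Same value via the second identity: I = H(X) - H(X|Y), with H(X|Y) = H(X,Y) - H(Y)
cond = entropy(joint) - entropy(y.tolist())
assert abs(mi - (entropy(x.tolist()) - cond)) < 1e-12

print(round(mi, 3))
```

Estimating I is the easy part; deciding which pairs have significant MI (the second standard issue above) requires permutation tests or the heuristics implemented in minet.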
15. Network inference / Package GeneNet
GGM: shrinkage approach
The package GeneNet estimates partial correlations in the Gaussian
Graphical Model framework [Schäfer and Strimmer, 2005b]:
• X = (X^1, ..., X^p) (the p gene expressions): random Gaussian vector
with variance Σ;
• j ↔ j' ⇔ Cor(X^j, X^{j'} | (X^k)_{k≠j,j'}) ≠ 0 ⇔ Σ⁻¹_{jj'} ≠ 0.
Shrinkage: use (1 − λ)Σ̂ + λΩ instead of Σ̂ (where Ω is, e.g., the identity
matrix and λ is estimated from the data) to stabilize the estimation of Σ⁻¹
(bagging is also usable [Schäfer and Strimmer, 2005a]).
Significant partial correlations are then selected using a Bayesian test
based on a mixture distribution: the partial correlations fit the mixture model
η₀ f₀(·, κ) + η_A f_A
with η₀ the prior for the null hypothesis, η_A = 1 − η₀, η₀ ≫ η_A (η₀ and κ estimated by EM).
FDR correction: at level α (5% here), keep edges for which p_(i) ≤ iα/(e η₀),
where e is the number of edges and p_(1), p_(2), ..., p_(e) are the ordered p-values.
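The shrinkage step can be sketched in a few lines of NumPy. This is an illustration only: the function name is hypothetical, the shrinkage intensity is fixed here, whereas GeneNet estimates λ analytically from the data, and the Bayesian edge test is omitted.

```python
import numpy as np

def shrink_precision(X, lam):
    """Shrinkage estimate of the precision matrix: invert
    (1 - lam) * Sigma_hat + lam * Omega, with Omega = identity target."""
    S = np.cov(X, rowvar=False)
    target = np.eye(S.shape[0])
    return np.linalg.inv((1 - lam) * S + lam * target)

rng = np.random.default_rng(2)
n, p = 30, 50                      # n << p: the raw covariance is singular
X = rng.standard_normal((n, p))

Omega = shrink_precision(X, lam=0.2)
print(Omega.shape)                 # well-defined inverse despite n < p
```

The point of the shrinkage target: for any λ > 0 the blended matrix is positive definite, so the inverse (and hence the partial correlations) always exists, even when n < p.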
21. Network inference / Package glasso
Sparse linear regression
Linear regression for each node:
∀ j = 1, ..., p:   X^j = S_j X^{−j} + ε_j
with X^{−j} the gene expressions without gene j.
Relation with the network:
j ↔ j' ⇔ S_{jj'} ≠ 0.
Estimation: [Meinshausen and Bühlmann, 2006] LS estimate with
L¹-penalization:
∀ j = 1, ..., p:   argmin_{S_j} ∑_{i=1}^n (X_i^j − S_j X_i^{−j})² + λ ∑_{j'≠j} |S_{jj'}|
Sparse penalization ⇒ only a few j' are such that S_{jj'} ≠ 0 (variable
selection).
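The per-node penalized regressions can be sketched with scikit-learn's Lasso. This is an assumption-laden illustration, not the Meinshausen and Bühlmann implementation: the function name, simulated data, and penalty value are all chosen for the example, and it also previews the AND/OR symmetrization discussed on the next slide.

```python
import numpy as np
from sklearn.linear_model import Lasso

def neighborhood_selection(X, lam, policy="and"):
    """One L1-penalized regression per gene, then symmetrize the support
    of the (asymmetric) coefficient matrix with an AND or OR policy."""
    n, p = X.shape
    S = np.zeros((p, p))
    for j in range(p):
        others = np.delete(np.arange(p), j)          # regress gene j on all others
        model = Lasso(alpha=lam).fit(X[:, others], X[:, j])
        S[j, others] = model.coef_
    support = S != 0
    return support & support.T if policy == "and" else support | support.T

rng = np.random.default_rng(3)
n, p = 100, 10
X = rng.standard_normal((n, p))
X[:, 1] += X[:, 0]                                   # direct link between genes 0 and 1

A = neighborhood_selection(X, lam=0.2)
print(bool(A[0, 1]))                                 # edge 0-1 recovered
```

The AND policy keeps an edge only if both regressions select it (sparser), the OR policy if either does, matching the two densities reported in the summary table below.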
22. Network inference / Package glasso
Sparse linear regression by pseudo-likelihood maximization
Estimation: [Friedman et al., 2008] the Gaussian framework allows us to
use pseudo-ML optimization with a sparse penalization:
L(S | X) − λ‖S‖₁ = ∑_{i=1}^n ∑_{j=1}^p log P(X_i^j | X_i^{−j}, S_j) − λ‖S‖₁
Remark: for [Meinshausen and Bühlmann, 2006], the estimates are
not symmetric ⇒ symmetrization is done by OR or AND policies.
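The graphical lasso of [Friedman et al., 2008] is available outside R as well; a minimal sketch with scikit-learn's GraphicalLasso (the simulated data, penalty value, and nonzero threshold are illustrative):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(4)
n, p = 200, 8
X = rng.standard_normal((n, p))
X[:, 1] += 0.9 * X[:, 0]               # one direct co-expression link

# Sparse L1-penalized estimate of the precision matrix (symmetric by construction)
model = GraphicalLasso(alpha=0.2).fit(X)
Omega = model.precision_

# Edges = nonzero off-diagonal entries of the estimated precision matrix
edges = (np.abs(Omega) > 1e-8) & ~np.eye(p, dtype=bool)
print(bool(edges[0, 1]))
```

Unlike the per-node regressions, the precision matrix is estimated jointly and is symmetric, so no AND/OR post-processing is needed.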
24. Network inference / Package glasso
Summary
Density comparison
Schäfer and Strimmer (shrinkage)        2.24%
Schäfer and Strimmer (bootstrap)        6.36%
Friedman et al.                         3.78%
Meinshausen and Bühlmann (OR policy)    3.24%
Meinshausen and Bühlmann (AND policy)   1.68%
Edges comparison (number of common edges; edge count per method in parentheses)
                              Schäfer Strimmer (883)   Schäfer Strimmer (2345)   Friedman et al. (1425)
Schäfer Strimmer (2345)       883
Friedman et al. (1425)        883                      1425
Meinshausen Bühlmann (1195)   883                      1195                      1195
28. Multiple Graphical Structures
Framework
T samples measuring the expression of the same genes:
X^{1,t}, ..., X^{p,t}
for t = 1, ..., T, where each X^{j,t} is an n_t-dimensional vector (n_t observations
in sample t).
Naive approach: independent inferences
L(S^t | X^t) = ∑_{i=1}^{n_t} ∑_{j=1}^p log P(X_i^{j,t} | X_i^{−j,t}, S_j^t)
and
argmax_{S¹,...,S^T} ∑_t ( L(S^t | X^t) − λ‖S^t‖₁ )
Problem: this doesn't use the fact that the samples are actually related... and
it produces T networks!
29. Multiple Graphical Structures
3 solutions to address this issue
First note that, in the Gaussian framework:
L(S | X) = (n/2) log det(D) − (n/2) Tr(D^{−1/2} S Σ̂ S D^{−1/2}) − (np/2) log(2π)
where D = Diag(S₁₁, ..., S_pp) and Σ̂ is the empirical covariance matrix
⇒ L(S | X) ≡ L(S | Σ̂).
30. Multiple Graphical Structures
3 solutions to address this issue
• Intertwined estimation: use Σ̃^t = α Σ̂^t + (1 − α) Σ̄ instead of Σ̂^t, where
Σ̄ = (1/n) ∑_t n_t Σ̂^t (with n = ∑_t n_t):
argmax_{S¹,...,S^T} ∑_t ( L(S^t | Σ̃^t) − λ‖S^t‖₁ )
Similar to the assumption that each sample is generated from a mixture of
Gaussians(?). In the experiments, α = 1/2.
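The intertwined blending of per-sample and pooled covariances can be sketched directly from the formula above (the function name and the two simulated samples are illustrative; the lasso optimization that would follow is omitted):

```python
import numpy as np

def intertwined_covariances(samples, alpha=0.5):
    """Blend each sample's empirical covariance Sigma_hat^t with the pooled,
    sample-size-weighted covariance Sigma_bar = (1/n) sum_t n_t Sigma_hat^t."""
    covs = [np.cov(X, rowvar=False) for X in samples]
    sizes = [X.shape[0] for X in samples]
    n = sum(sizes)
    pooled = sum(nt * C for nt, C in zip(sizes, covs)) / n
    return [alpha * C + (1 - alpha) * pooled for C in covs]

rng = np.random.default_rng(5)
# Two samples measuring the same 6 genes, with different numbers of observations
samples = [rng.standard_normal((30, 6)), rng.standard_normal((50, 6))]
blended = intertwined_covariances(samples, alpha=0.5)
print(len(blended), blended[0].shape)
```

Each sparse estimation then runs on its blended matrix Σ̃^t, so the T inferred networks share information without being forced to be identical.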
31. Multiple Graphical Structures
3 solutions to address this issue
• Group-LASSO: mixed norm
argmax_{S¹,...,S^T} ∑_t L(S^t | Σ̂^t) − λ ∑_{j≠j'} ( ∑_t (S^t_{jj'})² )^{1/2}
The group norm ‖S_{jj'}‖ ≡ ( ∑_t (S^t_{jj'})² )^{1/2} tends to encourage S^t_{jj'} = 0
for all t simultaneously, hence should lead to very consensual inferred networks.
32. Multiple Graphical Structures
3 solutions to address this issue
• Cooperative-LASSO:
argmax_{S¹,...,S^T} ∑_t L(S^t | Σ̂^t) − λ ∑_{j≠j'} [ ( ∑_t ((S^t_{jj'})₊)² )^{1/2} + ( ∑_t ((−S^t_{jj'})₊)² )^{1/2} ]
where (·)₊ denotes the positive part, so the two terms group (S₊)_{jj'} and (S₋)_{jj'}
separately. Takes into account that sign swaps are unlikely across samples
(down- and up-regulations).
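The difference between the two penalties is concrete enough to compute. A small sketch (function names are mine; this evaluates only the penalty terms, not the full penalized estimation): for sign-consistent coefficients the two norms coincide, while a sign swap across samples costs more under the cooperative norm.

```python
import numpy as np

def group_penalty(S_stack):
    """Group-LASSO mixed norm over a (T, p, p) stack: sum over pairs j != j'
    of the l2 norm of (S^1_{jj'}, ..., S^T_{jj'})."""
    p = S_stack.shape[1]
    off = ~np.eye(p, dtype=bool)
    return np.sqrt((S_stack ** 2).sum(axis=0))[off].sum()

def coop_penalty(S_stack):
    """Cooperative-LASSO norm: positive and negative parts grouped separately,
    so sign-consistent supports across samples are cheaper."""
    p = S_stack.shape[1]
    off = ~np.eye(p, dtype=bool)
    pos = np.sqrt((np.maximum(S_stack, 0) ** 2).sum(axis=0))
    neg = np.sqrt((np.maximum(-S_stack, 0) ** 2).sum(axis=0))
    return (pos + neg)[off].sum()

# Two samples (T = 2), three genes: sign-consistent vs. sign-swapped coefficients
same = np.stack([np.full((3, 3), 0.5), np.full((3, 3), 0.5)])
swap = np.stack([np.full((3, 3), 0.5), np.full((3, 3), -0.5)])

print(group_penalty(same) == coop_penalty(same))   # identical on consistent signs
print(coop_penalty(swap) > group_penalty(swap))    # swaps penalized more
```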
35. Multiple Graphical Structures
Real life experiment
[Figure: inferred networks, with panels for the independent estimations, the true network, and the sum of the intertwined estimations]
36. Multiple Graphical Structures
Open questions
• is the group-lasso type penalty the correct approach to the biological
problem?
• how to combine the networks: to analyze the differences
between networks (distances between graphs?), or to build a unique
consensual network from all samples (mean network, AND network,
OR network...)?
• could it be relevant to penalize the sparse regression problem by an
additional regularization (e.g., the distance between each network and a
consensual network)?
37. Multiple Graphical Structures
References
Ambroise, C., Chiquet, J., and Matias, C. (2009).
Inferring sparse Gaussian graphical models with latent structure.
Electronic Journal of Statistics, 3:205-238.
Chiquet, J., Grandvalet, Y., and Ambroise, C. (2011).
Inferring multiple graphical structures.
Statistics and Computing, 21(4):537-553.
Chiquet, J., Smith, A., Grasseau, G., Matias, C., and Ambroise, C. (2009).
SIMoNe: Statistical Inference for MOdular NEtworks.
Bioinformatics, 25(3):417-418.
Friedman, J., Hastie, T., and Tibshirani, R. (2008).
Sparse inverse covariance estimation with the graphical lasso.
Biostatistics, 9(3):432-441.
Giraud, C., Huet, S., and Verzelen, N. (2009).
Graph selection with GGMselect.
Technical report, preprint arXiv. http://fr.arxiv.org/abs/0907.0619.
Meinshausen, N. and Bühlmann, P. (2006).
High dimensional graphs and variable selection with the lasso.
Annals of Statistics, 34(3):1436-1462.
Meyer, P., Lafitte, F., and Bontempi, G. (2008).
minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information.
BMC Bioinformatics, 9:461.
Pearl, J. (1998).
Probabilistic reasoning in intelligent systems: networks of plausible inference.
In Kaufmann, M., editor, Representation and Reasoning Series (2nd printing ed.). San Francisco, California, USA.
Pearl, J. and Russel, S. (2002).
Bayesian networks.
In Arbib, M., editor, Handbook of Brain Theory and Neural Networks. Bradford Books (MIT Press), Cambridge, Massachusetts, USA.
Schäfer, J. and Strimmer, K. (2005a).
An empirical Bayes approach to inferring large-scale gene association networks.
Bioinformatics, 21(6):754-764.
Schäfer, J. and Strimmer, K. (2005b).
A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics.
Statistical Applications in Genetics and Molecular Biology, 4:1-32.
Scutari, M. (2010).
Learning Bayesian networks with the bnlearn R package.
Journal of Statistical Software, 35(3):1-22.