SlideShare a Scribd company logo
1 of 69
Download to read offline
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Joint network inference with the consensual 
LASSO 
Nathalie Villa-Vialaneix 
Joint work with Matthieu Vignes, Nathalie Viguerie 
and Magali San Cristobal 
Séminaire de Statistique Parisien 
Paris, 15 septembre 2014 
http://www.nathalievilla.org 
nathalie.villa@toulouse.inra.fr 
Nathalie Villa-Vialaneix | Consensus Lasso 1/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Outline 
1 Short overview on biological background 
2 Network inference and GGM 
3 Inference with multiple samples 
4 Simulations 
Nathalie Villa-Vialaneix | Consensus Lasso 2/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
DNA 
DNA (DeoxyriboNucleic Acid (DNA) 
molecule that encodes the genetic instructions used in the development 
and functioning of all known living organisms and many viruses 
double helix made with only four nucleotides (Adenine, Cytosine, 
Thymine, Guanine): A binds with T and C with G. 
Nathalie Villa-Vialaneix | Consensus Lasso 3/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Transcription 
Transcription of DNA 
transcription is the process by which a particular segment of DNA (called 
gene) is copied into a single-strand RNA (which is called message RNA) 
A is transcripted into U, T into A, C 
into G and G into C 
Nathalie Villa-Vialaneix | Consensus Lasso 4/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
What is mRNA used for? 
mRNA then moves out of the cell nucleus and is translated into proteins 
which are made of 20 different amino-acids (using an alphabet: 3 letters 
of mRNA ! 1 amino-acid) 
Proteins are used by the cell to function. 
Nathalie Villa-Vialaneix | Consensus Lasso 5/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Gene expression 
In a given cell, at a given time, all the genes are not translated or not 
translated at the same level. 
Gene expression 
is a process by which a given gene is transcripted or/and traducted 
It depends on the type of cell, the environment, ... 
Nathalie Villa-Vialaneix | Consensus Lasso 6/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Gene expression 
In a given cell, at a given time, all the genes are not translated or not 
translated at the same level. 
Gene expression 
is a process by which a given gene is transcripted or/and traducted 
It depends on the type of cell, the environment, ... 
Gene expression can be measured either by: 
“counting” the number of copies (mRNA) of a given gene in the cell: 
transcriptomic data; 
“counting” the quantity of a given protein in the cell: proteomic data. 
Even though a given mRNA is translated into a unique protein, there is 
no simple relationship between transcriptomic and proteomic data. 
Nathalie Villa-Vialaneix | Consensus Lasso 6/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Transcriptomic data 
How are transcriptomic data obtained? 
DNA spots associated to target 
genes (probes) are attached to a 
solid surface (array) 
Nathalie Villa-Vialaneix | Consensus Lasso 7/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Transcriptomic data 
How are transcriptomic data obtained? 
RNA material extracting for cells of 
interest (blood, lipid tissue, muscle, 
urine...) are labeled fluorescently 
and applied on the array 
When the binding between mRNA 
and the probes is good, the spot 
becomes fluorescent. 
Expression of a given gene is quantified by the intensity of the 
fluorescent signal (read by a scanner). 
Nathalie Villa-Vialaneix | Consensus Lasso 7/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Typical microarray data 
Data: large scale gene expression data 
individuals 
n ' 30=50 
8>>><>>>: 
X = 
0BBBBBBBB@ 
: : : : : : 
: : Xj 
i : : : 
: : : : : : 
1CCCCCCCCA 
|                                {z                                } 
variables (genes expression); p'103=4 
Nathalie Villa-Vialaneix | Consensus Lasso 8/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Typical microarray data 
Data: large scale gene expression data 
individuals 
n ' 30=50 
8>>><>>>: 
X = 
0BBBBBBBB@ 
: : : : : : 
: : Xj 
i : : : 
: : : : : : 
1CCCCCCCCA 
|                                {z                                } 
variables (genes expression); p'103=4 
Typical design: two (or more) conditions (treated/control for instance) 
More and more complicated designs: crossed conditions (treated/control 
and different breeds), longitudinal data... 
Nathalie Villa-Vialaneix | Consensus Lasso 8/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Typical microarray data 
Data: large scale gene expression data 
individuals 
n ' 30=50 
8>>><>>>: 
X = 
0BBBBBBBB@ 
: : : : : : 
: : Xj 
i : : : 
: : : : : : 
1CCCCCCCCA 
|                                {z                                } 
variables (genes expression); p'103=4 
Typical design: two (or more) conditions (treated/control for instance) 
More and more complicated designs: crossed conditions (treated/control 
and different breeds), longitudinal data... 
Typical issues: find differentially expressed genes (i.e., genes whose 
expression is significantly different between the conditions) 
Nathalie Villa-Vialaneix | Consensus Lasso 8/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Outline 
1 Short overview on biological background 
2 Network inference and GGM 
3 Inference with multiple samples 
4 Simulations 
Nathalie Villa-Vialaneix | Consensus Lasso 9/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Systems biology 
Instead of being used to produce proteins, some genes’ expressions 
activate or repress other genes’ expressions ) understanding the whole 
cascade helps to comprehend the global functioning of living organisms1 
1Picture taken from: Abdollahi A et al., PNAS 2007, 104:12890-12895. c
 2007 by 
National Academy of Sciences 
Nathalie Villa-Vialaneix | Consensus Lasso 10/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Model framework 
Data: large scale gene expression data 
individuals 
n ' 30=50 
8>>><>>>:X = 
0BBBBBBBB@ 
: : : : : : 
: : Xj 
i : : : 
: : : : : : 
1CCCCCCCCA 
|                                {z                                } 
variables (genes expression); p'103=4 
What we want to obtain: a graph/network with 
nodes: (selected) genes; 
edges: strong links between gene expressions. 
Nathalie Villa-Vialaneix | Consensus Lasso 11/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Advantages of network inference 
1 over raw data: focuses on the strongest direct relationships: 
irrelevant or indirect relations are removed (more robust) and the 
data are easier to visualize and understand (track transcription 
relations). 
Nathalie Villa-Vialaneix | Consensus Lasso 12/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Advantages of network inference 
1 over raw data: focuses on the strongest direct relationships: 
irrelevant or indirect relations are removed (more robust) and the 
data are easier to visualize and understand (track transcription 
relations). 
Expression data are analyzed all together and not by pairs (systems 
model). 
Nathalie Villa-Vialaneix | Consensus Lasso 12/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Advantages of network inference 
1 over raw data: focuses on the strongest direct relationships: 
irrelevant or indirect relations are removed (more robust) and the 
data are easier to visualize and understand (track transcription 
relations). 
Expression data are analyzed all together and not by pairs (systems 
model). 
2 over bibliographic network: can handle interactions with yet 
unknown (not annotated) genes and deal with data collected in a 
particular condition. 
Nathalie Villa-Vialaneix | Consensus Lasso 12/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Using correlations 
Relevance network [Butte and Kohane, 1999] 
First (naive) approach: calculate correlations between expressions for all 
pairs of genes, threshold the smallest ones and build the network. 
Correlations Thresholding Graph 
Nathalie Villa-Vialaneix | Consensus Lasso 13/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Using partial correlations 
x 
y z 
strong indirect correlation 
Nathalie Villa-Vialaneix | Consensus Lasso 14/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Using partial correlations 
x 
y z 
strong indirect correlation 
set.seed(2807); x <- rnorm(100) 
y <- 2*x+1+rnorm(100,0,0.1); cor(x,y) [1] 0.998826 
z <- 2*x+1+rnorm(100,0,0.1); cor(x,z) [1] 0.998751 
cor(y,z) [1] 0.9971105 
Nathalie Villa-Vialaneix | Consensus Lasso 14/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Using partial correlations 
x 
y z 
strong indirect correlation 
set.seed(2807); x <- rnorm(100) 
y <- 2*x+1+rnorm(100,0,0.1); cor(x,y) [1] 0.998826 
z <- 2*x+1+rnorm(100,0,0.1); cor(x,z) [1] 0.998751 
cor(y,z) [1] 0.9971105 
] Partial correlation 
cor(lm(xz)$residuals,lm(yz)$residuals) [1] 0.7801174 
cor(lm(xy)$residuals,lm(zy)$residuals) [1] 0.7639094 
cor(lm(yx)$residuals,lm(zx)$residuals) [1] -0.1933699 
Nathalie Villa-Vialaneix | Consensus Lasso 14/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Partial correlation and GGM 
(Xi)i=1;:::;n are i.i.d. Gaussian random variables N(0; ) (gene 
expression); then 
j  ! j0(genes j and j0 are linked) , Cor 
 
Xj ; Xj0 
j(Xk )k,j;j0 
 
 0 
Nathalie Villa-Vialaneix | Consensus Lasso 15/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Partial correlation and GGM 
(Xi)i=1;:::;n are i.i.d. Gaussian random variables N(0; ) (gene 
expression); then 
j  ! j0(genes j and j0 are linked) , Cor 
 
Xj ; Xj0 
j(Xk )k,j;j0 
 
 0 
If (concentration matrix) S = 1, 
Cor 
 
Xj ; Xj0 
j(Xk )k,j;j0 
 
=  
Sjj0 p 
SjjSj0 j0 
) Estimate 1 to unravel the graph structure 
Nathalie Villa-Vialaneix | Consensus Lasso 15/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Partial correlation and GGM 
(Xi)i=1;:::;n are i.i.d. Gaussian random variables N(0; ) (gene 
expression); then 
j  ! j0(genes j and j0 are linked) , Cor 
 
Xj ; Xj0 
j(Xk )k,j;j0 
 
 0 
If (concentration matrix) S = 1, 
Cor 
 
Xj ; Xj0 
j(Xk )k,j;j0 
 
=  
Sjj0 p 
SjjSj0 j0 
) Estimate 1 to unravel the graph structure 
Problem: : p-dimensional matrix and n  p ) (b 
n)1 is a poor 
estimate of S)! 
Nathalie Villa-Vialaneix | Consensus Lasso 15/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Various approaches for inferring 
networks with GGM 
Graphical Gaussian Model 
seminal work: 
[Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] 
(with shrinkage and a proposal for a Bayesian test of significance) 
estimate 1 by (b 
n + I)1 
use a Bayesian test to test which coefficients are significantly non zero. 
Nathalie Villa-Vialaneix | Consensus Lasso 16/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Various approaches for inferring 
networks with GGM 
Graphical Gaussian Model 
seminal work: 
[Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] 
(with shrinkage and a proposal for a Bayesian test of significance) 
estimate 1 by (b 
n + I)1 
use a Bayesian test to test which coefficients are significantly non zero. 
relation with LM 
8 j, estimate the linear model: 
Xj =
Tj 
Xj +  ; arg max 
(
jj0 )j0 
(logMLj) 
because
jj0 =  
Sjj0 
Sjj 
. 
Nathalie Villa-Vialaneix | Consensus Lasso 16/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Various approaches for inferring 
networks with GGM 
Graphical Gaussian Model 
seminal work: 
[Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] 
(with shrinkage and a proposal for a Bayesian test of significance) 
estimate 1 by (b 
n + I)1 
use a Bayesian test to test which coefficients are significantly non zero. 
relation with LM 
8 j, estimate the linear model: 
Xj =
Tj 
Xj +  ; arg min 
(
jj0 )j0 
Xn 
i=1 
 
Xij
Tj 
Xj 
i 
2 
because
jj0 =  
Sjj0 
Sjj 
. 
Nathalie Villa-Vialaneix | Consensus Lasso 16/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Various approaches for inferring 
networks with GGM 
Graphical Gaussian Model 
seminal work: 
[Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] 
(with shrinkage and a proposal for a Bayesian test of significance) 
estimate 1 by (b 
n + I)1 
use a Bayesian test to test which coefficients are significantly non zero. 
sparse approaches: 
[Meinshausen and Bühlmann, 2006, Friedman et al., 2008]: 
8 j, estimate the linear model: 
Xj =
Tj 
Xj +  ; arg min 
(
jj0 )j0 
Xn 
i=1 
 
Xij
Tj 
Xj 
i 
2 
+k
jkL1 
with k
jkL1 = 
P 
j0 j
jj0 j 
L1 penalty yields to
jj0 = 0 for most j0 (variable selection) 
Nathalie Villa-Vialaneix | Consensus Lasso 16/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Outline 
1 Short overview on biological background 
2 Network inference and GGM 
3 Inference with multiple samples 
4 Simulations 
Nathalie Villa-Vialaneix | Consensus Lasso 17/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Motivation for multiple networks 
inference 
Pan-European project Diogenes2 (with Nathalie Viguerie, INSERM): 
gene expressions (lipid tissues) from 204 obese women before and after 
a low-calorie diet (LCD). 
Assumption: A common 
functioning exists 
regardless the 
condition; 
Which genes are linked 
independently 
from/depending on the 
condition? 
2http://www.diogenes-eu.org/; see also [Viguerie et al., 2012] 
Nathalie Villa-Vialaneix | Consensus Lasso 18/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Naive approach: independent 
estimations 
Notations: p genes measured in k samples, each corresponding to a 
specific condition: (Xc 
j )j=1;:::;p  N(0;c), for c = 1; : : : ; k. 
For c = 1; : : : ; k, nc independent observations (Xc 
P ij )i=1;:::;nc and 
c nc = n. 
Independent inference 
Estimation 8 c = 1; : : : ; k and 8 j = 1; : : : ; p, 
Xc 
j = Xc 
nj
cj 
+ c 
j 
are estimated (independently) by maximizing pseudo-likelihood: 
L(SjX) = 
Xk 
c=1 
Xp 
j=1 
Xnc 
i=1 
log P 
 
Xc 
ij jXci 
;nj ; Sc 
j 
 
Nathalie Villa-Vialaneix | Consensus Lasso 19/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Related papers 
Problem: previous estimation does not use the fact that the different 
networks should be somehow alike! 
Previous proposals 
[Chiquet et al., 2011] replace c bye 
c = 12 
c + 12 
 and add a 
sparse penalty; 
Nathalie Villa-Vialaneix | Consensus Lasso 20/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Related papers 
Problem: previous estimation does not use the fact that the different 
networks should be somehow alike! 
Previous proposals 
[Chiquet et al., 2011] replace c bye 
c = 12 
c + 12 
 and add a 
sparse penalty; 
[Chiquet et al., 2011] LASSO and Group-LASSO type penalties to 
force identical or sign-coherent edges between conditions: 
X 
jj0 
sX 
c 
(Sc 
jj0 )2 or 
X 
jj0 
266666664 
sX 
c 
(Sc 
jj0 )2 
+ + 
sX 
c 
(Sc 
jj0 )2 
 
377777775 
) Sc 
jj0 = 0 8 c for most entries OR Sc 
jj0 can only be of a given sign 
(positive or negative) whatever c 
Nathalie Villa-Vialaneix | Consensus Lasso 20/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Related papers 
Problem: previous estimation does not use the fact that the different 
networks should be somehow alike! 
Previous proposals 
[Chiquet et al., 2011] replace c bye 
c = 12 
c + 12 
 and add a 
sparse penalty; 
[Chiquet et al., 2011] LASSO and Group-LASSO type penalties to 
force identical or sign-coherent edges between conditions 
[Danaher et al., 2013] add the penalty 
P 
c,c0 kSc  Sc0 
kL1 ) (sparse 
penalty over the concentration matrix entries forces to strong 
consistency between conditions); 
Nathalie Villa-Vialaneix | Consensus Lasso 20/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Related papers 
Problem: previous estimation does not use the fact that the different 
networks should be somehow alike! 
Previous proposals 
[Chiquet et al., 2011] replace c bye 
c = 12 
c + 12 
 and add a 
sparse penalty; 
[Chiquet et al., 2011] LASSO and Group-LASSO type penalties to 
force identical or sign-coherent edges between conditions 
[Danaher et al., 2013] add the penalty 
P 
c,c0 kSc  Sc0 
kL1 ) (sparse 
penalty over the concentration matrix entries forces to strong 
consistency between conditions); 
[PMohan P 
et al., 2012] add a group-LASSO like penalty 
c,c0 
j kSc 
j  Sc0 
j kL2 that focuses on differences due to a few 
number of nodes only. 
Nathalie Villa-Vialaneix | Consensus Lasso 20/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Consensus LASSO 
Proposal 
Infer multiple networks by forcing them toward a consensual network: i.e., 
explicitly keeping the differences between conditions under control but 
with a L2 penalty (allow for more differences than Group-LASSO type 
penalties). 
Original optimization: 
max 
(
c 
jk )k,j;c=1;:::;C 
X 
c 
0BBBBBB@ 
logMLcj 
  
X 
k,j 
1CCCCCCA 
j
c 
jk j 
: 
Nathalie Villa-Vialaneix | Consensus Lasso 21/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Consensus LASSO 
Proposal 
Infer multiple networks by forcing them toward a consensual network: i.e., 
explicitly keeping the differences between conditions under control but 
with a L2 penalty (allow for more differences than Group-LASSO type 
penalties). 
Original optimization: 
max 
(
c 
jk )k,j;c=1;:::;C 
X 
c 
0BBBBBB@ 
logMLcj 
  
X 
k,j 
1CCCCCCA 
j
c 
jk j 
: 
[Ambroise et al., 2009, Chiquet et al., 2011]: is equivalent to minimize 
p problems having dimension k(p  1): 
1 
2
Tj 
b 
njnj
j +
Tj 
b 
jnj + k
jkL1 ;
j = (
1j 
; : : : ;
kj 
) 
withb 
njnj : block diagonal matrix Diag 
 
b 
1 
njnj ; : : : ;b 
k 
njnj 
 
and similarly forb 
jnj . 
Nathalie Villa-Vialaneix | Consensus Lasso 21/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Consensus LASSO 
Proposal 
Infer multiple networks by forcing them toward a consensual network: i.e., 
explicitly keeping the differences between conditions under control but 
with a L2 penalty (allow for more differences than Group-LASSO type 
penalties). 
Add a constraint to force inference toward a “consensus”
cons 
1 
2
Tj 
b 
njnj
j +
Tj b 
jnj + k

More Related Content

What's hot

Boosting probabilistic graphical model inference by incorporating prior knowl...
Boosting probabilistic graphical model inference by incorporating prior knowl...Boosting probabilistic graphical model inference by incorporating prior knowl...
Boosting probabilistic graphical model inference by incorporating prior knowl...Hakky St
 
Artificial neural networks and its application
Artificial neural networks and its applicationArtificial neural networks and its application
Artificial neural networks and its applicationHưng Đặng
 
Inference in HMM and Bayesian Models
Inference in HMM and Bayesian ModelsInference in HMM and Bayesian Models
Inference in HMM and Bayesian ModelsMinakshi Atre
 
CEHS 2016 Poster
CEHS 2016 PosterCEHS 2016 Poster
CEHS 2016 PosterEric Ma
 
20131019 生物物理若手 Journal Club
20131019 生物物理若手 Journal Club20131019 生物物理若手 Journal Club
20131019 生物物理若手 Journal ClubMed_KU
 
Community Finding with Applications on Phylogenetic Networks [Extended Abstract]
Community Finding with Applications on Phylogenetic Networks [Extended Abstract]Community Finding with Applications on Phylogenetic Networks [Extended Abstract]
Community Finding with Applications on Phylogenetic Networks [Extended Abstract]Luís Rita
 
JPN1422 Defending Against Collaborative Attacks by Malicious Nodes in MANETs...
JPN1422  Defending Against Collaborative Attacks by Malicious Nodes in MANETs...JPN1422  Defending Against Collaborative Attacks by Malicious Nodes in MANETs...
JPN1422 Defending Against Collaborative Attacks by Malicious Nodes in MANETs...chennaijp
 
Defending against collaborative attacks by
Defending against collaborative attacks byDefending against collaborative attacks by
Defending against collaborative attacks byranjith kumar
 

What's hot (8)

Boosting probabilistic graphical model inference by incorporating prior knowl...
Boosting probabilistic graphical model inference by incorporating prior knowl...Boosting probabilistic graphical model inference by incorporating prior knowl...
Boosting probabilistic graphical model inference by incorporating prior knowl...
 
Artificial neural networks and its application
Artificial neural networks and its applicationArtificial neural networks and its application
Artificial neural networks and its application
 
Inference in HMM and Bayesian Models
Inference in HMM and Bayesian ModelsInference in HMM and Bayesian Models
Inference in HMM and Bayesian Models
 
CEHS 2016 Poster
CEHS 2016 PosterCEHS 2016 Poster
CEHS 2016 Poster
 
20131019 生物物理若手 Journal Club
20131019 生物物理若手 Journal Club20131019 生物物理若手 Journal Club
20131019 生物物理若手 Journal Club
 
Community Finding with Applications on Phylogenetic Networks [Extended Abstract]
Community Finding with Applications on Phylogenetic Networks [Extended Abstract]Community Finding with Applications on Phylogenetic Networks [Extended Abstract]
Community Finding with Applications on Phylogenetic Networks [Extended Abstract]
 
JPN1422 Defending Against Collaborative Attacks by Malicious Nodes in MANETs...
JPN1422  Defending Against Collaborative Attacks by Malicious Nodes in MANETs...JPN1422  Defending Against Collaborative Attacks by Malicious Nodes in MANETs...
JPN1422 Defending Against Collaborative Attacks by Malicious Nodes in MANETs...
 
Defending against collaborative attacks by
Defending against collaborative attacks byDefending against collaborative attacks by
Defending against collaborative attacks by
 

Viewers also liked

A short introduction to statistical learning
A short introduction to statistical learningA short introduction to statistical learning
A short introduction to statistical learningtuxette
 
Slides Lycée Jules Fil 2014
Slides Lycée Jules Fil 2014Slides Lycée Jules Fil 2014
Slides Lycée Jules Fil 2014tuxette
 
Interpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional dataInterpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional datatuxette
 
Visualiser et fouiller des réseaux - Méthodes et exemples dans R
Visualiser et fouiller des réseaux - Méthodes et exemples dans RVisualiser et fouiller des réseaux - Méthodes et exemples dans R
Visualiser et fouiller des réseaux - Méthodes et exemples dans Rtuxette
 
Stanford - Statistical Learning
Stanford - Statistical LearningStanford - Statistical Learning
Stanford - Statistical LearningRavi Sankar Varma
 
Stanford Statistical Learning
Stanford Statistical LearningStanford Statistical Learning
Stanford Statistical LearningKurt Holst
 
Graph mining 2: Statistical approaches for graph mining
Graph mining 2: Statistical approaches for graph miningGraph mining 2: Statistical approaches for graph mining
Graph mining 2: Statistical approaches for graph miningtuxette
 
Introduction To Applied Machine Learning
Introduction To Applied Machine LearningIntroduction To Applied Machine Learning
Introduction To Applied Machine Learningananth
 
Random Forest for Big Data
Random Forest for Big DataRandom Forest for Big Data
Random Forest for Big Datatuxette
 
Introduction to Statistical Machine Learning
Introduction to Statistical Machine LearningIntroduction to Statistical Machine Learning
Introduction to Statistical Machine Learningmahutte
 
Text classification in scikit-learn
Text classification in scikit-learnText classification in scikit-learn
Text classification in scikit-learnJimmy Lai
 
Statistical learning
Statistical learningStatistical learning
Statistical learningSlideshare
 
Statistical learning
Statistical learningStatistical learning
Statistical learningSlideshare
 
Overview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language ProcessingOverview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language Processingananth
 
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...Jimmy Lai
 
Recurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text AnalysisRecurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text Analysisodsc
 
MaxEnt (Loglinear) Models - Overview
MaxEnt (Loglinear) Models - OverviewMaxEnt (Loglinear) Models - Overview
MaxEnt (Loglinear) Models - Overviewananth
 
Word representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2VecWord representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2Vecananth
 
Multiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasetsMultiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasetstuxette
 

Viewers also liked (20)

A short introduction to statistical learning
A short introduction to statistical learningA short introduction to statistical learning
A short introduction to statistical learning
 
Slides Lycée Jules Fil 2014
Slides Lycée Jules Fil 2014Slides Lycée Jules Fil 2014
Slides Lycée Jules Fil 2014
 
Interpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional dataInterpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional data
 
Visualiser et fouiller des réseaux - Méthodes et exemples dans R
Visualiser et fouiller des réseaux - Méthodes et exemples dans RVisualiser et fouiller des réseaux - Méthodes et exemples dans R
Visualiser et fouiller des réseaux - Méthodes et exemples dans R
 
Stanford - Statistical Learning
Stanford - Statistical LearningStanford - Statistical Learning
Stanford - Statistical Learning
 
Stanford Statistical Learning
Stanford Statistical LearningStanford Statistical Learning
Stanford Statistical Learning
 
Graph mining 2: Statistical approaches for graph mining
Graph mining 2: Statistical approaches for graph miningGraph mining 2: Statistical approaches for graph mining
Graph mining 2: Statistical approaches for graph mining
 
Introduction To Applied Machine Learning
Introduction To Applied Machine LearningIntroduction To Applied Machine Learning
Introduction To Applied Machine Learning
 
Random Forest for Big Data
Random Forest for Big DataRandom Forest for Big Data
Random Forest for Big Data
 
Introduction to Statistical Machine Learning
Introduction to Statistical Machine LearningIntroduction to Statistical Machine Learning
Introduction to Statistical Machine Learning
 
Text classification in scikit-learn
Text classification in scikit-learnText classification in scikit-learn
Text classification in scikit-learn
 
Statistical learning
Statistical learningStatistical learning
Statistical learning
 
Statistical learning intro
Statistical learning introStatistical learning intro
Statistical learning intro
 
Statistical learning
Statistical learningStatistical learning
Statistical learning
 
Overview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language ProcessingOverview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language Processing
 
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
 
Recurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text AnalysisRecurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text Analysis
 
MaxEnt (Loglinear) Models - Overview
MaxEnt (Loglinear) Models - OverviewMaxEnt (Loglinear) Models - Overview
MaxEnt (Loglinear) Models - Overview
 
Word representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2VecWord representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2Vec
 
Multiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasetsMultiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasets
 

Similar to Inferring networks from multiple samples with consensus LASSO

Consensual gene co-expression network inference with multiple samples
Consensual gene co-expression network inference with multiple samplesConsensual gene co-expression network inference with multiple samples
Consensual gene co-expression network inference with multiple samplestuxette
 
Next generation sequencing by Muhammad Abbas
Next generation sequencing by Muhammad AbbasNext generation sequencing by Muhammad Abbas
Next generation sequencing by Muhammad AbbasMuhammadAbbaskhan9
 
American Statistical Association October 23 2009 Presentation Part 1
American Statistical Association October 23 2009 Presentation Part 1American Statistical Association October 23 2009 Presentation Part 1
American Statistical Association October 23 2009 Presentation Part 1Double Check ĆŐNSULTING
 
NetBioSIG2014-Talk by Salvatore Loguercio
NetBioSIG2014-Talk by Salvatore LoguercioNetBioSIG2014-Talk by Salvatore Loguercio
NetBioSIG2014-Talk by Salvatore LoguercioAlexander Pico
 
Integrating phylogenetic inference and metadata visualization for NGS data
Integrating phylogenetic inference and metadata visualization for NGS dataIntegrating phylogenetic inference and metadata visualization for NGS data
Integrating phylogenetic inference and metadata visualization for NGS dataJoão André Carriço
 
Deciphering cancer stem cells regulatory circuits through an interactome–regu...
Deciphering cancer stem cells regulatory circuits through an interactome–regu...Deciphering cancer stem cells regulatory circuits through an interactome–regu...
Deciphering cancer stem cells regulatory circuits through an interactome–regu...Claire Rioualen
 
Neuroinformatics conference 2012
Neuroinformatics conference 2012Neuroinformatics conference 2012
Neuroinformatics conference 2012Nicolas Le Novère
 
20080110 Genome exploration in A-T G-C space: an introduction to DNA walking
20080110 Genome exploration in A-T G-C space: an introduction to DNA walking20080110 Genome exploration in A-T G-C space: an introduction to DNA walking
20080110 Genome exploration in A-T G-C space: an introduction to DNA walkingJonathan Blakes
 
Bioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptxBioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptxxRowlet
 
NetBioSIG2012 chrisevelo
NetBioSIG2012 chriseveloNetBioSIG2012 chrisevelo
NetBioSIG2012 chriseveloAlexander Pico
 
Complementing Computation with Visualization in Genomics
Complementing Computation with Visualization in GenomicsComplementing Computation with Visualization in Genomics
Complementing Computation with Visualization in GenomicsFrancis Rowland
 
STRING - Modeling of pathways through cross-species integration of large-scal...
STRING - Modeling of pathways through cross-species integration of large-scal...STRING - Modeling of pathways through cross-species integration of large-scal...
STRING - Modeling of pathways through cross-species integration of large-scal...Lars Juhl Jensen
 
Bioinformatics2015.pdf
Bioinformatics2015.pdfBioinformatics2015.pdf
Bioinformatics2015.pdfAbdetaImi
 
Bioinformatics2015.pdf
Bioinformatics2015.pdfBioinformatics2015.pdf
Bioinformatics2015.pdfAbdetaImi
 
Perturbing The Interactome: Multi-Omics And Personalized Methods For Network ...
Perturbing The Interactome: Multi-Omics And Personalized Methods For Network ...Perturbing The Interactome: Multi-Omics And Personalized Methods For Network ...
Perturbing The Interactome: Multi-Omics And Personalized Methods For Network ...Marc Santolini
 

Similar to Inferring networks from multiple samples with consensus LASSO (20)

Consensual gene co-expression network inference with multiple samples
Consensual gene co-expression network inference with multiple samplesConsensual gene co-expression network inference with multiple samples
Consensual gene co-expression network inference with multiple samples
 
Next generation sequencing by Muhammad Abbas
Next generation sequencing by Muhammad AbbasNext generation sequencing by Muhammad Abbas
Next generation sequencing by Muhammad Abbas
 
American Statistical Association October 23 2009 Presentation Part 1
American Statistical Association October 23 2009 Presentation Part 1American Statistical Association October 23 2009 Presentation Part 1
American Statistical Association October 23 2009 Presentation Part 1
 
NetBioSIG2014-Talk by Salvatore Loguercio
NetBioSIG2014-Talk by Salvatore LoguercioNetBioSIG2014-Talk by Salvatore Loguercio
NetBioSIG2014-Talk by Salvatore Loguercio
 
Integrating phylogenetic inference and metadata visualization for NGS data
Integrating phylogenetic inference and metadata visualization for NGS dataIntegrating phylogenetic inference and metadata visualization for NGS data
Integrating phylogenetic inference and metadata visualization for NGS data
 
presentation
presentationpresentation
presentation
 
Deciphering cancer stem cells regulatory circuits through an interactome–regu...
Deciphering cancer stem cells regulatory circuits through an interactome–regu...Deciphering cancer stem cells regulatory circuits through an interactome–regu...
Deciphering cancer stem cells regulatory circuits through an interactome–regu...
 
Basen Network
Basen NetworkBasen Network
Basen Network
 
Neuroinformatics conference 2012
Neuroinformatics conference 2012Neuroinformatics conference 2012
Neuroinformatics conference 2012
 
New generation Sequencing
New generation Sequencing New generation Sequencing
New generation Sequencing
 
RapportHicham
RapportHichamRapportHicham
RapportHicham
 
20080110 Genome exploration in A-T G-C space: an introduction to DNA walking
20080110 Genome exploration in A-T G-C space: an introduction to DNA walking20080110 Genome exploration in A-T G-C space: an introduction to DNA walking
20080110 Genome exploration in A-T G-C space: an introduction to DNA walking
 
Bioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptxBioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptx
 
NetBioSIG2012 chrisevelo
NetBioSIG2012 chriseveloNetBioSIG2012 chrisevelo
NetBioSIG2012 chrisevelo
 
Complementing Computation with Visualization in Genomics
Complementing Computation with Visualization in GenomicsComplementing Computation with Visualization in Genomics
Complementing Computation with Visualization in Genomics
 
STRING - Modeling of pathways through cross-species integration of large-scal...
STRING - Modeling of pathways through cross-species integration of large-scal...STRING - Modeling of pathways through cross-species integration of large-scal...
STRING - Modeling of pathways through cross-species integration of large-scal...
 
Bioinformatics2015.pdf
Bioinformatics2015.pdfBioinformatics2015.pdf
Bioinformatics2015.pdf
 
Bioinformatics2015.pdf
Bioinformatics2015.pdfBioinformatics2015.pdf
Bioinformatics2015.pdf
 
PhD midterm report
PhD midterm reportPhD midterm report
PhD midterm report
 
Perturbing The Interactome: Multi-Omics And Personalized Methods For Network ...
Perturbing The Interactome: Multi-Omics And Personalized Methods For Network ...Perturbing The Interactome: Multi-Omics And Personalized Methods For Network ...
Perturbing The Interactome: Multi-Omics And Personalized Methods For Network ...
 

More from tuxette

Racines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathsRacines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathstuxette
 
Méthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènesMéthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènestuxette
 
Méthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquesMéthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquestuxette
 
Projets autour de l'Hi-C
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-Ctuxette
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?tuxette
 
Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...tuxette
 
ASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquesASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquestuxette
 
Autour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeanAutour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeantuxette
 
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...tuxette
 
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquesApprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquestuxette
 
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...tuxette
 
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...tuxette
 
Journal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation dataJournal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation datatuxette
 
Overfitting or overparametrization?
Overfitting or overparametrization?Overfitting or overparametrization?
Overfitting or overparametrization?tuxette
 
Selective inference and single-cell differential analysis
Selective inference and single-cell differential analysisSelective inference and single-cell differential analysis
Selective inference and single-cell differential analysistuxette
 
SOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricesSOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricestuxette
 
Graph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype PredictionGraph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype Predictiontuxette
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelstuxette
 
Explanable models for time series with random forest
Explanable models for time series with random forestExplanable models for time series with random forest
Explanable models for time series with random foresttuxette
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICStuxette
 

More from tuxette (20)

Racines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathsRacines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en maths
 
Méthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènesMéthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènes
 
Méthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquesMéthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiques
 
Projets autour de l'Hi-C
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-C
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?
 
Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...
 
ASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquesASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiques
 
Autour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeanAutour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWean
 
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
 
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquesApprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
 
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
 
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
 
Journal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation dataJournal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation data
 
Overfitting or overparametrization?
Overfitting or overparametrization?Overfitting or overparametrization?
Overfitting or overparametrization?
 
Selective inference and single-cell differential analysis
Selective inference and single-cell differential analysisSelective inference and single-cell differential analysis
Selective inference and single-cell differential analysis
 
SOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricesSOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatrices
 
Graph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype PredictionGraph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype Prediction
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction models
 
Explanable models for time series with random forest
Explanable models for time series with random forestExplanable models for time series with random forest
Explanable models for time series with random forest
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICS
 

Recently uploaded

Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trssuser06f238
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRlizamodels9
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxpriyankatabhane
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
preservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptxpreservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptxnoordubaliya2003
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationColumbia Weather Systems
 
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingNetHelix
 
User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)Columbia Weather Systems
 
OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024innovationoecd
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuinethapagita
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentationtahreemzahra82
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxyaramohamed343013
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxMurugaveni B
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 

Recently uploaded (20)

Volatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -IVolatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -I
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
preservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptxpreservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptx
 
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort ServiceHot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather Station
 
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
 
User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)
 
OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdf
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptx
 

Inferring networks from multiple samples with consensus LASSO

  • 1. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Joint network inference with the consensual LASSO Nathalie Villa-Vialaneix Joint work with Matthieu Vignes, Nathalie Viguerie and Magali San Cristobal Séminaire de Statistique Parisien Paris, 15 septembre 2014 http://www.nathalievilla.org nathalie.villa@toulouse.inra.fr Nathalie Villa-Vialaneix | Consensus Lasso 1/36
  • 2. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Outline 1 Short overview on biological background 2 Network inference and GGM 3 Inference with multiple samples 4 Simulations Nathalie Villa-Vialaneix | Consensus Lasso 2/36
  • 3. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations DNA DNA (DeoxyriboNucleic Acid (DNA) molecule that encodes the genetic instructions used in the development and functioning of all known living organisms and many viruses double helix made with only four nucleotides (Adenine, Cytosine, Thymine, Guanine): A binds with T and C with G. Nathalie Villa-Vialaneix | Consensus Lasso 3/36
  • 4. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Transcription Transcription of DNA transcription is the process by which a particular segment of DNA (called gene) is copied into a single-strand RNA (which is called message RNA) A is transcripted into U, T into A, C into G and G into C Nathalie Villa-Vialaneix | Consensus Lasso 4/36
  • 5. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations What is mRNA used for? mRNA then moves out of the cell nucleus and is translated into proteins which are made of 20 different amino-acids (using an alphabet: 3 letters of mRNA ! 1 amino-acid) Proteins are used by the cell to function. Nathalie Villa-Vialaneix | Consensus Lasso 5/36
  • 6. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Gene expression In a given cell, at a given time, all the genes are not translated or not translated at the same level. Gene expression is a process by which a given gene is transcripted or/and traducted It depends on the type of cell, the environment, ... Nathalie Villa-Vialaneix | Consensus Lasso 6/36
  • 7. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Gene expression In a given cell, at a given time, all the genes are not translated or not translated at the same level. Gene expression is a process by which a given gene is transcripted or/and traducted It depends on the type of cell, the environment, ... Gene expression can be measured either by: “counting” the number of copies (mRNA) of a given gene in the cell: transcriptomic data; “counting” the quantity of a given protein in the cell: proteomic data. Even though a given mRNA is translated into a unique protein, there is no simple relationship between transcriptomic and proteomic data. Nathalie Villa-Vialaneix | Consensus Lasso 6/36
  • 8. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Transcriptomic data How are transcriptomic data obtained? DNA spots associated to target genes (probes) are attached to a solid surface (array) Nathalie Villa-Vialaneix | Consensus Lasso 7/36
  • 9. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Transcriptomic data How are transcriptomic data obtained? RNA material extracting for cells of interest (blood, lipid tissue, muscle, urine...) are labeled fluorescently and applied on the array When the binding between mRNA and the probes is good, the spot becomes fluorescent. Expression of a given gene is quantified by the intensity of the fluorescent signal (read by a scanner). Nathalie Villa-Vialaneix | Consensus Lasso 7/36
  • 10. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Typical microarray data Data: large scale gene expression data individuals n ' 30=50 8>>><>>>: X = 0BBBBBBBB@ : : : : : : : : Xj i : : : : : : : : : 1CCCCCCCCA | {z } variables (genes expression); p'103=4 Nathalie Villa-Vialaneix | Consensus Lasso 8/36
  • 11. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Typical microarray data Data: large scale gene expression data individuals n ' 30=50 8>>><>>>: X = 0BBBBBBBB@ : : : : : : : : Xj i : : : : : : : : : 1CCCCCCCCA | {z } variables (genes expression); p'103=4 Typical design: two (or more) conditions (treated/control for instance) More and more complicated designs: crossed conditions (treated/control and different breeds), longitudinal data... Nathalie Villa-Vialaneix | Consensus Lasso 8/36
  • 12. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Typical microarray data Data: large scale gene expression data individuals n ' 30=50 8>>><>>>: X = 0BBBBBBBB@ : : : : : : : : Xj i : : : : : : : : : 1CCCCCCCCA | {z } variables (genes expression); p'103=4 Typical design: two (or more) conditions (treated/control for instance) More and more complicated designs: crossed conditions (treated/control and different breeds), longitudinal data... Typical issues: find differentially expressed genes (i.e., genes whose expression is significantly different between the conditions) Nathalie Villa-Vialaneix | Consensus Lasso 8/36
  • 13. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Outline 1 Short overview on biological background 2 Network inference and GGM 3 Inference with multiple samples 4 Simulations Nathalie Villa-Vialaneix | Consensus Lasso 9/36
  • 14. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Systems biology Instead of being used to produce proteins, some genes’ expressions activate or repress other genes’ expressions ) understanding the whole cascade helps to comprehend the global functioning of living organisms1 1Picture taken from: Abdollahi A et al., PNAS 2007, 104:12890-12895. c 2007 by National Academy of Sciences Nathalie Villa-Vialaneix | Consensus Lasso 10/36
  • 15. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Model framework Data: large scale gene expression data individuals n ' 30=50 8>>><>>>:X = 0BBBBBBBB@ : : : : : : : : Xj i : : : : : : : : : 1CCCCCCCCA | {z } variables (genes expression); p'103=4 What we want to obtain: a graph/network with nodes: (selected) genes; edges: strong links between gene expressions. Nathalie Villa-Vialaneix | Consensus Lasso 11/36
  • 16. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Advantages of network inference 1 over raw data: focuses on the strongest direct relationships: irrelevant or indirect relations are removed (more robust) and the data are easier to visualize and understand (track transcription relations). Nathalie Villa-Vialaneix | Consensus Lasso 12/36
  • 17. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Advantages of network inference 1 over raw data: focuses on the strongest direct relationships: irrelevant or indirect relations are removed (more robust) and the data are easier to visualize and understand (track transcription relations). Expression data are analyzed all together and not by pairs (systems model). Nathalie Villa-Vialaneix | Consensus Lasso 12/36
  • 18. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Advantages of network inference 1 over raw data: focuses on the strongest direct relationships: irrelevant or indirect relations are removed (more robust) and the data are easier to visualize and understand (track transcription relations). Expression data are analyzed all together and not by pairs (systems model). 2 over bibliographic network: can handle interactions with yet unknown (not annotated) genes and deal with data collected in a particular condition. Nathalie Villa-Vialaneix | Consensus Lasso 12/36
  • 19. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Using correlations Relevance network [Butte and Kohane, 1999] First (naive) approach: calculate correlations between expressions for all pairs of genes, threshold the smallest ones and build the network. Correlations Thresholding Graph Nathalie Villa-Vialaneix | Consensus Lasso 13/36
  • 20. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Using partial correlations x y z strong indirect correlation Nathalie Villa-Vialaneix | Consensus Lasso 14/36
  • 21. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Using partial correlations x y z strong indirect correlation set.seed(2807); x <- rnorm(100) y <- 2*x+1+rnorm(100,0,0.1); cor(x,y) [1] 0.998826 z <- 2*x+1+rnorm(100,0,0.1); cor(x,z) [1] 0.998751 cor(y,z) [1] 0.9971105 Nathalie Villa-Vialaneix | Consensus Lasso 14/36
  • 22. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Using partial correlations x y z strong indirect correlation set.seed(2807); x <- rnorm(100) y <- 2*x+1+rnorm(100,0,0.1); cor(x,y) [1] 0.998826 z <- 2*x+1+rnorm(100,0,0.1); cor(x,z) [1] 0.998751 cor(y,z) [1] 0.9971105 ] Partial correlation cor(lm(xz)$residuals,lm(yz)$residuals) [1] 0.7801174 cor(lm(xy)$residuals,lm(zy)$residuals) [1] 0.7639094 cor(lm(yx)$residuals,lm(zx)$residuals) [1] -0.1933699 Nathalie Villa-Vialaneix | Consensus Lasso 14/36
  • 23. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Partial correlation and GGM (Xi)i=1;:::;n are i.i.d. Gaussian random variables N(0; ) (gene expression); then j ! j0(genes j and j0 are linked) , Cor Xj ; Xj0 j(Xk )k,j;j0 0 Nathalie Villa-Vialaneix | Consensus Lasso 15/36
  • 24. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Partial correlation and GGM (Xi)i=1;:::;n are i.i.d. Gaussian random variables N(0; ) (gene expression); then j ! j0(genes j and j0 are linked) , Cor Xj ; Xj0 j(Xk )k,j;j0 0 If (concentration matrix) S = 1, Cor Xj ; Xj0 j(Xk )k,j;j0 = Sjj0 p SjjSj0 j0 ) Estimate 1 to unravel the graph structure Nathalie Villa-Vialaneix | Consensus Lasso 15/36
  • 25. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Partial correlation and GGM (Xi)i=1;:::;n are i.i.d. Gaussian random variables N(0; ) (gene expression); then j ! j0(genes j and j0 are linked) , Cor Xj ; Xj0 j(Xk )k,j;j0 0 If (concentration matrix) S = 1, Cor Xj ; Xj0 j(Xk )k,j;j0 = Sjj0 p SjjSj0 j0 ) Estimate 1 to unravel the graph structure Problem: : p-dimensional matrix and n p ) (b n)1 is a poor estimate of S)! Nathalie Villa-Vialaneix | Consensus Lasso 15/36
  • 26. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Various approaches for inferring networks with GGM Graphical Gaussian Model seminal work: [Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] (with shrinkage and a proposal for a Bayesian test of significance) estimate 1 by (b n + I)1 use a Bayesian test to test which coefficients are significantly non zero. Nathalie Villa-Vialaneix | Consensus Lasso 16/36
  • 27. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Various approaches for inferring networks with GGM Graphical Gaussian Model seminal work: [Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] (with shrinkage and a proposal for a Bayesian test of significance) estimate 1 by (b n + I)1 use a Bayesian test to test which coefficients are significantly non zero. relation with LM 8 j, estimate the linear model: Xj =
  • 28. Tj Xj + ; arg max (
  • 29. jj0 )j0 (logMLj) because
  • 30. jj0 = Sjj0 Sjj . Nathalie Villa-Vialaneix | Consensus Lasso 16/36
  • 31. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Various approaches for inferring networks with GGM Graphical Gaussian Model seminal work: [Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] (with shrinkage and a proposal for a Bayesian test of significance) estimate 1 by (b n + I)1 use a Bayesian test to test which coefficients are significantly non zero. relation with LM 8 j, estimate the linear model: Xj =
  • 32. Tj Xj + ; arg min (
  • 33. jj0 )j0 Xn i=1 Xij
  • 34. Tj Xj i 2 because
  • 35. jj0 = Sjj0 Sjj . Nathalie Villa-Vialaneix | Consensus Lasso 16/36
  • 36. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Various approaches for inferring networks with GGM Graphical Gaussian Model seminal work: [Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] (with shrinkage and a proposal for a Bayesian test of significance) estimate 1 by (b n + I)1 use a Bayesian test to test which coefficients are significantly non zero. sparse approaches: [Meinshausen and Bühlmann, 2006, Friedman et al., 2008]: 8 j, estimate the linear model: Xj =
  • 37. Tj Xj + ; arg min (
  • 38. jj0 )j0 Xn i=1 Xij
  • 39. Tj Xj i 2 +k
  • 41. jkL1 = P j0 j
  • 42. jj0 j L1 penalty yields to
  • 43. jj0 = 0 for most j0 (variable selection) Nathalie Villa-Vialaneix | Consensus Lasso 16/36
  • 44. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Outline 1 Short overview on biological background 2 Network inference and GGM 3 Inference with multiple samples 4 Simulations Nathalie Villa-Vialaneix | Consensus Lasso 17/36
  • 45. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Motivation for multiple networks inference Pan-European project Diogenes2 (with Nathalie Viguerie, INSERM): gene expressions (lipid tissues) from 204 obese women before and after a low-calorie diet (LCD). Assumption: A common functioning exists regardless the condition; Which genes are linked independently from/depending on the condition? 2http://www.diogenes-eu.org/; see also [Viguerie et al., 2012] Nathalie Villa-Vialaneix | Consensus Lasso 18/36
  • 46. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Naive approach: independent estimations Notations: p genes measured in k samples, each corresponding to a specific condition: (Xc j )j=1;:::;p N(0;c), for c = 1; : : : ; k. For c = 1; : : : ; k, nc independent observations (Xc P ij )i=1;:::;nc and c nc = n. Independent inference Estimation 8 c = 1; : : : ; k and 8 j = 1; : : : ; p, Xc j = Xc nj
  • 47. cj + c j are estimated (independently) by maximizing pseudo-likelihood: L(SjX) = Xk c=1 Xp j=1 Xnc i=1 log P Xc ij jXci ;nj ; Sc j Nathalie Villa-Vialaneix | Consensus Lasso 19/36
  • 48. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Related papers Problem: previous estimation does not use the fact that the different networks should be somehow alike! Previous proposals [Chiquet et al., 2011] replace c bye c = 12 c + 12 and add a sparse penalty; Nathalie Villa-Vialaneix | Consensus Lasso 20/36
  • 49. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Related papers Problem: previous estimation does not use the fact that the different networks should be somehow alike! Previous proposals [Chiquet et al., 2011] replace c bye c = 12 c + 12 and add a sparse penalty; [Chiquet et al., 2011] LASSO and Group-LASSO type penalties to force identical or sign-coherent edges between conditions: X jj0 sX c (Sc jj0 )2 or X jj0 266666664 sX c (Sc jj0 )2 + + sX c (Sc jj0 )2 377777775 ) Sc jj0 = 0 8 c for most entries OR Sc jj0 can only be of a given sign (positive or negative) whatever c Nathalie Villa-Vialaneix | Consensus Lasso 20/36
  • 50. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Related papers Problem: previous estimation does not use the fact that the different networks should be somehow alike! Previous proposals [Chiquet et al., 2011] replace c bye c = 12 c + 12 and add a sparse penalty; [Chiquet et al., 2011] LASSO and Group-LASSO type penalties to force identical or sign-coherent edges between conditions [Danaher et al., 2013] add the penalty P c,c0 kSc Sc0 kL1 ) (sparse penalty over the concentration matrix entries forces to strong consistency between conditions); Nathalie Villa-Vialaneix | Consensus Lasso 20/36
  • 51. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Related papers Problem: previous estimation does not use the fact that the different networks should be somehow alike! Previous proposals [Chiquet et al., 2011] replace c bye c = 12 c + 12 and add a sparse penalty; [Chiquet et al., 2011] LASSO and Group-LASSO type penalties to force identical or sign-coherent edges between conditions [Danaher et al., 2013] add the penalty P c,c0 kSc Sc0 kL1 ) (sparse penalty over the concentration matrix entries forces to strong consistency between conditions); [PMohan P et al., 2012] add a group-LASSO like penalty c,c0 j kSc j Sc0 j kL2 that focuses on differences due to a few number of nodes only. Nathalie Villa-Vialaneix | Consensus Lasso 20/36
  • 52. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Consensus LASSO Proposal Infer multiple networks by forcing them toward a consensual network: i.e., explicitly keeping the differences between conditions under control but with a L2 penalty (allow for more differences than Group-LASSO type penalties). Original optimization: max (
  • 53. c jk )k,j;c=1;:::;C X c 0BBBBBB@ logMLcj X k,j 1CCCCCCA j
  • 54. c jk j : Nathalie Villa-Vialaneix | Consensus Lasso 21/36
  • 55. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Consensus LASSO Proposal Infer multiple networks by forcing them toward a consensual network: i.e., explicitly keeping the differences between conditions under control but with a L2 penalty (allow for more differences than Group-LASSO type penalties). Original optimization: max (
  • 56. c jk )k,j;c=1;:::;C X c 0BBBBBB@ logMLcj X k,j 1CCCCCCA j
  • 57. c jk j : [Ambroise et al., 2009, Chiquet et al., 2011]: is equivalent to minimize p problems having dimension k(p 1): 1 2
  • 59. j +
  • 60. Tj b jnj + k
  • 62. j = (
  • 63. 1j ; : : : ;
  • 64. kj ) withb njnj : block diagonal matrix Diag b 1 njnj ; : : : ;b k njnj and similarly forb jnj . Nathalie Villa-Vialaneix | Consensus Lasso 21/36
  • 65. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Consensus LASSO Proposal Infer multiple networks by forcing them toward a consensual network: i.e., explicitly keeping the differences between conditions under control but with a L2 penalty (allow for more differences than Group-LASSO type penalties). Add a constraint to force inference toward a “consensus”
  • 68. j +
  • 69. Tj b jnj + k
  • 70. jkL1 + X c wck
  • 71. cj
  • 72. cons j k2 L2 with: wc : real number used to weight the conditions (wc = 1 or wc = 1 p nc ); regularization parameter;
  • 73. cons j whatever you want...? Nathalie Villa-Vialaneix | Consensus Lasso 21/36
  • 74. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Choice of a consensus: set one... Typical case: a prior network is known (e.g., from bibliography); with no prior information, use a fixed prior corresponding to (e.g.) global inference ) given (and fixed)
  • 75. cons Nathalie Villa-Vialaneix | Consensus Lasso 22/36
  • 76. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Choice of a consensus: set one... Typical case: a prior network is known (e.g., from bibliography); with no prior information, use a fixed prior corresponding to (e.g.) global inference ) given (and fixed)
  • 78. cons j , the optimization problem is equivalent to minimizing the p following standard quadratic problem in Rk(p1) with L1-penalty: 1 2
  • 80. j +
  • 82. jkL1 ; where B1() =b njnj + 2Ik(p1), with Ik(p1) the k(p 1)-identity matrix B2() =b jnj 2Ik(p1)
  • 84. cons = (
  • 85. cons j )T ; : : : ; (
  • 86. cons j )T T Nathalie Villa-Vialaneix | Consensus Lasso 22/36
  • 87. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Choice of a consensus: adapt one during training... Derive the consensus from the condition-specific estimates:
  • 88. cons j = X c nc n
  • 89. cj Nathalie Villa-Vialaneix | Consensus Lasso 23/36
  • 90. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Choice of a consensus: adapt one during training... Derive the consensus from the condition-specific estimates:
  • 91. cons j = X c nc n
  • 93. cons j = Pkc =1 nc n
  • 94. cj , the optimization problem is equivalent to minimizing the following standard quadratic problem with L1-penalty: 1 2
  • 96. j +
  • 97. Tj b jnj + k
  • 98. jkL1 where Sj() =b njnj + 2AT ()A() where A() is a [k(p 1) k(p 1)]-matrix that does not depend on j. Nathalie Villa-Vialaneix | Consensus Lasso 23/36
  • 99. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Computational aspects: optimization 2j Tj 1j Tj 12 Common framework Objective function can be decomposed into: convex part C(
  • 100. j) =
  • 101. Q() +
  • 103. j) = k
  • 104. jkL1 Nathalie Villa-Vialaneix | Consensus Lasso 24/36
  • 105. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Computational aspects: optimization 2j Tj 1j Tj 12 Common framework Objective function can be decomposed into: convex part C(
  • 106. j) =
  • 107. Q() +
  • 109. j) = k
  • 110. jkL1 optimization by “active set” [Osborne et al., 2000, Chiquet et al., 2011] 1: repeat( given) 2: Given A and
  • 112. jj0 , 0, 8 j0 2 A, solve (over h) the smooth minimization problem restricted to A C(
  • 113. j + h) + P(
  • 114. j + h) )
  • 115. j
  • 116. j + h 4: until Nathalie Villa-Vialaneix | Consensus Lasso 24/36
  • 117. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Computational aspects: optimization 2j Tj 1j Tj 12 Common framework Objective function can be decomposed into: convex part C(
  • 118. j) =
  • 119. Q() +
  • 121. j) = k
  • 122. jkL1 optimization by “active set” [Osborne et al., 2000, Chiquet et al., 2011] 1: repeat( given) 2: Given A and
  • 124. jj0 , 0, 8 j0 2 A, solve (over h) the smooth minimization problem restricted to A C(
  • 125. j + h) + P(
  • 126. j + h) )
  • 127. j
  • 128. j + h 3: Update A by adding most violating variables, i.e., variables st: abs
  • 129.
  • 130.
  • 131.
  • 132.
  • 133.
  • 134. @C(
  • 136. j) 0 with [@P(
  • 137. j )]j0 2 [1; 1] if j0 A 4: until Nathalie Villa-Vialaneix | Consensus Lasso 24/36
  • 138. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Computational aspects: optimization 2j Tj 1j Tj 12 Common framework Objective function can be decomposed into: convex part C(
  • 139. j) =
  • 140. Q() +
  • 142. j) = k
  • 143. jkL1 optimization by “active set” [Osborne et al., 2000, Chiquet et al., 2011] 1: repeat( given) 2: Given A and
  • 145. jj0 , 0, 8 j0 2 A, solve (over h) the smooth minimization problem restricted to A C(
  • 146. j + h) + P(
  • 147. j + h) )
  • 148. j
  • 149. j + h 3: Update A by adding most violating variables, i.e., variables st: abs
  • 150.
  • 151.
  • 152.
  • 153.
  • 154.
  • 155. @C(
  • 157. j) 0 with [@P(
  • 158. j )]j0 2 [1; 1] if j0 A 4: until all conditions satisfied i.e. abs
  • 159.
  • 160.
  • 161.
  • 162.
  • 163.
  • 164. @C(
  • 166. j) 0 Nathalie Villa-Vialaneix | Consensus Lasso 24/36
  • 167. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Computational aspects: optimization 2j Tj 1j Tj 12 Common framework Objective function can be decomposed into: convex part C(
  • 168. j) =
  • 169. Q() +
  • 171. j) = k
  • 172. jkL1 optimization by “active set” [Osborne et al., 2000, Chiquet et al., 2011] 1: repeat( given) 2: Given A and
  • 174. jj0 , 0, 8 j0 2 A, solve (over h) the smooth minimization problem restricted to A C(
  • 175. j + h) + P(
  • 176. j + h) )
  • 177. j
  • 178. j + h 3: Update A by adding most violating variables, i.e., variables st: abs
  • 179.
  • 180.
  • 181.
  • 182.
  • 183.
  • 184. @C(
  • 186. j) 0 with [@P(
  • 187. j )]j0 2 [1; 1] if j0 A 4: until all conditions satisfied i.e. abs
  • 188.
  • 189.
  • 190.
  • 191.
  • 192.
  • 193. @C(
  • 195. j) 0 Repeat: large # small using previous
  • 196. as prior Nathalie Villa-Vialaneix | Consensus Lasso 24/36
  • 197. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Bootstrap estimation ' BOLASSO subsample n observations with replacement x x x x x x x x Nathalie Villa-Vialaneix | Consensus Lasso 25/36
  • 198. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Bootstrap estimation ' BOLASSO subsample n observations with replacement x x x x x x x x cLasso estimation ! for varying , (
  • 199. ;b jj0 )jj0 Nathalie Villa-Vialaneix | Consensus Lasso 25/36
  • 200. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Bootstrap estimation ' BOLASSO subsample n observations with replacement x x x x x x x x cLasso estimation ! for varying , (
  • 201. ;b jj0 )jj0 # threshold keep the first T1 largest coefficients Nathalie Villa-Vialaneix | Consensus Lasso 25/36
  • 202. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Bootstrap estimation ' BOLASSO subsample n observations with replacement x x x x x x x x cLasso estimation ! for varying , (
  • 203. ;b jj0 )jj0 # threshold keep the first T1 largest coefficients B Frequency table (1; 2) (1; 3) ... (j; j0) ... 130 25 ... 120 ... Nathalie Villa-Vialaneix | Consensus Lasso 25/36
  • 204. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Bootstrap estimation ' BOLASSO subsample n observations with replacement x x x x x x x x cLasso estimation ! for varying , (
  • 205. ;b jj0 )jj0 # threshold keep the first T1 largest coefficients B Frequency table (1; 2) (1; 3) ... (j; j0) ... 130 25 ... 120 ... ! Keep the T2 most frequent pairs Nathalie Villa-Vialaneix | Consensus Lasso 25/36
  • 206. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Outline 1 Short overview on biological background 2 Network inference and GGM 3 Inference with multiple samples 4 Simulations Nathalie Villa-Vialaneix | Consensus Lasso 26/36
  • 207. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Simulated data Expression data with known co-expression network original network (scale free) taken from http://www.comp-sys-bio.org/AGN/data.html (100 nodes, 200 edges, loops removed); rewire a ratio r of the edges to generate k “children” networks (sharing approximately 100(1 2r)% of their edges); Nathalie Villa-Vialaneix | Consensus Lasso 27/36
  • 208. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Simulated data Expression data with known co-expression network original network (scale free) taken from http://www.comp-sys-bio.org/AGN/data.html (100 nodes, 200 edges, loops removed); rewire a ratio r of the edges to generate k “children” networks (sharing approximately 100(1 2r)% of their edges); generate “expression data” with a random Gaussian process from each chid: use the Laplacian of the graph to generate a putative concentration matrix; use edge colors in the original network to set the edge sign; correct the obtained matrix to make it positive; invert to obtain a covariance matrix...; ... which is used in a random Gaussian process to generate expression data (with noise). Nathalie Villa-Vialaneix | Consensus Lasso 27/36
  • 209. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations An example with k = 2, r = 5% mother network3 first child second child 3actually the parent network. My co-author wisely noted that the mistake was unforgivable for a feminist... Nathalie Villa-Vialaneix | Consensus Lasso 28/36
  • 210. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Choice for T2 Data: r = 0:05, k = 2 and n1 = n2 = 20 100 bootstrap samples, = 1, T1 = 250 or 500 l l 30 selections at least 30 selections at least 1.00 0.75 0.50 0.25 0.00 0.00 0.25 0.50 0.75 precision recall 43 selections at least l l 40 selections at least 1.00 0.75 0.50 0.25 0.00 0.00 0.25 0.50 0.75 1.00 precision recall Dots correspond to best F = 2 precisionrecall precision+recall ) Best F corresponds to selecting a number of edges approximately equal to the number of edges in the original network. Nathalie Villa-Vialaneix | Consensus Lasso 29/36
  • 211. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Choice for T1 and T1 % of improvement 0.1/1 f250; 300; 500g of bootstrapping network sizes rewired edges: 5% 20-20 1 500 30.69 20-30 0.1 500 11.87 30-30 1 300 20.15 50-50 1 300 14.36 20-20-20-20-20 1 500 86.04 30-30-30-30 0.1 500 42.67 network sizes rewired edges: 20% 20-20 0.1 300 -17.86 20-30 0.1 300 -18.35 30-30 1 500 -7.97 50-50 0.1 300 -7.83 20-20-20-20-20 0.1 500 10.27 30-30-30-30 1 500 13.48 Nathalie Villa-Vialaneix | Consensus Lasso 30/36
  • 212. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Comparison with other approaches Method compared (direct and bootstrap approaches) independant Graphical LASSO estimation gLasso methods implementated in the R package simone and described in [Chiquet et al., 2011]: intertwinned LASSO iLasso, cooperative LASSO coopLasso and group LASSO groupLasso fused graphical LASSO as described in [Danaher et al., 2013] as implemented in the R package fgLasso Nathalie Villa-Vialaneix | Consensus Lasso 31/36
  • 213. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Comparison with other approaches Method compared (direct and bootstrap approaches) independant Graphical LASSO estimation gLasso methods implementated in the R package simone and described in [Chiquet et al., 2011]: intertwinned LASSO iLasso, cooperative LASSO coopLasso and group LASSO groupLasso fused graphical LASSO as described in [Danaher et al., 2013] as implemented in the R package fgLasso consensus Lasso with fixed prior (the mother network) cLasso(p) fixed prior (average over the conditions of independant estimations) cLasso(2) adaptative estimation of the prior cLasso(m) Nathalie Villa-Vialaneix | Consensus Lasso 31/36
  • 214. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Comparison with other approaches Method compared (direct and bootstrap approaches) independant Graphical LASSO estimation gLasso methods implementated in the R package simone and described in [Chiquet et al., 2011]: intertwinned LASSO iLasso, cooperative LASSO coopLasso and group LASSO groupLasso fused graphical LASSO as described in [Danaher et al., 2013] as implemented in the R package fgLasso consensus Lasso with fixed prior (the mother network) cLasso(p) fixed prior (average over the conditions of independant estimations) cLasso(2) adaptative estimation of the prior cLasso(m) Parameters set to: T1 = 500, B = 100, = 1 Nathalie Villa-Vialaneix | Consensus Lasso 31/36
  • 215. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Selected results (best F) rewired edges: 5% - conditions: 2 - sample size: 2 30 direct version Method gLasso iLasso groupLasso coopLasso 0.28 0.35 0.32 0.35 Method fgLasso cLasso(m) cLasso(p) cLasso(2) 0.32 0.31 0.86 0.30 bootstrap version Method gLasso iLasso groupLasso coopLasso 0.31 0.34 0.36 0.34 Method fgLasso cLasso(m) cLasso(p) cLasso(2) 0.36 0.37 0.86 0.35 Conclusions bootstraping improves performance (except for iLasso and for large r) joint inference improves performance using a good prior is (as expected) very efficient adaptive approach for cLasso is better than naive approach Nathalie Villa-Vialaneix | Consensus Lasso 32/36
  • 216. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Real data 204 obese women ; expression of 221 genes before and after a LCD = 1 ; T1 = 1000 (target density: 4%) Distribution of the number of times an edge is selected over 100 bootstrap samples 1500 1000 500 0 0 25 50 75 100 Counts Frequency (70% of the pairs of nodes are never selected) ) T2 = 80 Nathalie Villa-Vialaneix | Consensus Lasso 33/36
  • 217. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Networks Before diet l l l l l l l FADS2 l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l AZGP1 CIDEA FADS1 GPD1L PCK2 After diet l l l l l l l FADS2 l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l AZGP1 CIDEA FADS1 GPD1L PCK2 densities about 1:3% - some interactions (both shared and specific) make sense to the biologist Nathalie Villa-Vialaneix | Consensus Lasso 34/36
  • 218. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Thank you for your attention... Programs available in the R package therese (on R-Forge)4. Joint work with Magali SanCristobal Matthieu Vignes (GenPhySe, INRA Toulouse) (Massey University, NZ) Nathalie Viguerie (I2MC, INSERM Toulouse) 4https://r-forge.r-project.org/projects/therese-pkg Nathalie Villa-Vialaneix | Consensus Lasso 35/36
  • 219. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Questions? Nathalie Villa-Vialaneix | Consensus Lasso 36/36
  • 220. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations References Ambroise, C., Chiquet, J., and Matias, C. (2009). Inferring sparse Gaussian graphical models with latent structure. Electronic Journal of Statistics, 3:205–238. Butte, A. and Kohane, I. (1999). Unsupervised knowledge discovery in medical databases using relevance networks. In Proceedings of the AMIA Symposium, pages 711–715. Chiquet, J., Grandvalet, Y., and Ambroise, C. (2011). Inferring multiple graphical structures. Statistics and Computing, 21(4):537–553. Danaher, P., Wang, P., and Witten, D. (2013). The joint graphical lasso for inverse covariance estimation accross multiple classes. Journal of the Royal Statistical Society Series B. Forthcoming. Friedman, J., Hastie, T., and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432–441. Meinshausen, N. and Bühlmann, P. (2006). High dimensional graphs and variable selection with the lasso. Annals of Statistic, 34(3):1436–1462. Mohan, K., Chung, J., Han, S., Witten, D., Lee, S., and Fazel, M. (2012). Structured learning of Gaussian graphical models. In Proceedings of NIPS (Neural Information Processing Systems) 2012, Lake Tahoe, Nevada, USA. Osborne, M., Presnell, B., and Turlach, B. (2000). Nathalie Villa-Vialaneix | Consensus Lasso 36/36
  • 221. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations On the LASSO and its dual. Journal of Computational and Graphical Statistics, 9(2):319–337. Schäfer, J. and Strimmer, K. (2005a). An empirical bayes approach to inferring large-scale gene association networks. Bioinformatics, 21(6):754–764. Schäfer, J. and Strimmer, K. (2005b). A shrinkage approach to large-scale covariance matrix estimation and implication for functional genomics. Statistical Applications in Genetics and Molecular Biology, 4:1–32. Viguerie, N., Montastier, E., Maoret, J., Roussel, B., Combes, M., Valle, C., Villa-Vialaneix, N., Iacovoni, J., Martinez, J., Holst, C., Astrup, A., Vidal, H., Clément, K., Hager, J., Saris, W., and Langin, D. (2012). Determinants of human adipose tissue gene expression: impact of diet, sex, metabolic status and cis genetic regulation. PLoS Genetics, 8(9):e1002959. Nathalie Villa-Vialaneix | Consensus Lasso 36/36