An introduction to Biological network inference via
                   Gaussian Graphical Models

                        Christophe Ambroise, Julien Chiquet

                    Statistique et Génome, CNRS & Université d'Évry Val d'Essonne


                       São Paulo School on Advanced Science – October 2012




                            http://stat.genopole.cnrs.fr/~cambroise



Network inference                                                                   1
Outline
  Introduction
      Motivations
      Background on omics
      Modeling issue
  Modeling tools
      Statistical dependence
      Graphical models
      Covariance selection and Gaussian vector
  Gaussian Graphical Models for genomic data
      Steady-state data
      Time-course data
  Statistical inference
      Penalized likelihood approach
      Inducing sparsity and regularization
      The Lasso
  Application in Post-genomics
      Modeling time-course data
      Illustrations
      Multitask learning
Network inference                                2
Real networks

     I   Many scientific fields:
             I      World Wide Web
             I      Biology, sociology, physics
     I   Nature of the data under study:
             I      Interactions between N objects
             I      O(N²) possible interactions
     I   Network topology:
             I      Describes the way nodes interact; the structure/function relationship

  Figure: Sample of 250 blogs (nodes) with their links (edges) from the French political blogosphere.

Network inference                                                                                  5
What the reconstructed networks are expected to be (1)¹
 Regulatory networks

  E. coli regulatory network
      I    relationships between genes and their products
      I    inhibition/activation
      I    impossible to recover at large scale
      I    always incomplete

  ¹ and are presumably wrongly assumed to be
Network inference                                           6
What the reconstructed networks are expected to be (2)
 Regulatory networks




     Figure: Regulatory network identified in mammalian cells: highly structured
Network inference                                                                 7
What the reconstructed networks are expected to be (3)
 Protein-Protein interaction networks

  Figure: Yeast PPI network: do not be misled by the representation, trust the statistics!

Network inference                                                                     8
What are we looking at?


  Central dogma of molecular biology
                          transcription             translation
                    DNA                   mRNA                    Proteins
   replication



  Proteins
     I   are the building blocks of any cellular function,
     I   are encoded by the genes,
     I   interact (at both the protein and the gene level – regulation).




Network inference                                                            10
What questions in functional genomics? (1)

  Various levels/scales of study
     I   genome: sequence analysis,
     I   transcriptome: gene expression levels,
     I   proteome: protein functions and interactions.


  Questions
    1. Biological understanding
             I      Mechanisms of diseases,
             I      gene/protein functions and interactions.
    2. Medical/clinical care
             I      Diagnosis (type of disease),
             I      prognosis (survival analysis),
             I      treatment (prediction of response).


Network inference                                              11
What questions in functional genomics? (2)


  Central dogma of molecular biology
                          transcription           translation
                    DNA                   mRNA                  Proteins
   replication



  Basic biostatistical issues
         Selecting some genes of interest (biomarkers),
         Looking for interactions between them (pathway analysis).




Network inference                                                          12
How is this measured? (1)
 Microarray technology: parallel measurement of many biological features

                                   signal processing

     Matrix of features, n ≪ p:
     the expression levels of p probes are simultaneously monitored for n
     individuals and, after pretreatment, collected in

$$X = \begin{pmatrix} x_1^1 & x_1^2 & x_1^3 & \cdots & x_1^p \\ \vdots & & & & \vdots \\ x_n^1 & x_n^2 & x_n^3 & \cdots & x_n^p \end{pmatrix}$$

Network inference                                                                   13
How is this measured? (2)
 Next Generation Sequencing: parallel measurement of even more biological features

                                         assembling

     Matrix of features, n ≪ p:
     expression counts are extracted from small repeated sequences and,
     after pretreatment, monitored for n individuals in

$$X = \begin{pmatrix} k_1^1 & k_1^2 & k_1^3 & \cdots & k_1^p \\ \vdots & & & & \vdots \\ k_n^1 & k_n^2 & k_n^3 & \cdots & k_n^p \end{pmatrix}$$

Network inference                                                                       14
What questions are we dealing with? (1)
 Supervised canonical example at the gene level: differential analysis

  Leukemia (Golub data, thanks to P. Neuvial)
     I   AML – Acute Myeloblastic Leukemia, n1 = 11,
     I   ALL – Acute Lymphoblastic Leukemia, n2 = 27,
     I   an (n1 + n2)-vector of outcomes giving each patient's tumor type.

Supervised classification
Find genes with significantly
different expression levels
between groups – biomarkers
     prediction purpose

Network inference                                                       15
What questions are we dealing with? (2)
 Unsupervised canonical example at the gene level: hierarchical clustering

  Same kind of data, but no outcome is considered

(Unsupervised) clustering
Find groups of genes which show
statistical
dependencies/commonalities –
hoping for biological interactions

        exploratory purpose
        functional understanding

       Can we do better than that? And how do genes interact anyway?

Network inference                                                            16
The problem at hand




                                               Inference




   ≈ 10s/100s microarray/sequencing experiments
                    ≈ 1000s probes (“genes”)


  Modeling questions prior to inference
    1. What do the nodes represent? (the easiest one)
    2. What is/should be the meaning of an edge? (the toughest one)
             I      Biologically?
             I      Statistically?

Network inference                                                     18
More questions/issues

  Modelling
     I   Is the network dynamic or static?
     I   How have the data been generated? (time-course/steady-state)
     I   Are the edges oriented or not? (causality)
     I   What do the edges represent for my particular problem?


  Statistical challenges
     I   (Ultra) high dimensionality,
     I   Noisy data, lack of reproducibility,
     I   Heterogeneity of the data (many techniques, various signals).


Network inference                                                        19
Canonical model settings
 Biological microarrays in comparable conditions

  Notations
    1. a set P = {1, . . . , p} of p variables:
       these are typically the genes (could be proteins);
    2. a sample N = {1, . . . , n} of individuals associated with the variables:
       these are typically the microarrays (could be sequence counts).

  Basic statistical model
  This can be viewed as
     I   a random vector X in Rᵖ, whose j-th entry is the j-th variable,
     I   an n-size sample (X1 , . . . , Xn ), such that Xi is the i-th microarray,
             I      could be independent identically distributed copies (steady-state data)
             I      could be dependent in a certain way (time-course data)
     I   assume a parametric probability distribution for X (Gaussian).

Network inference                                                                        21
Canonical model settings
 Biological microarrays in comparable conditions

  The data
  Stacking (X1 , . . . , Xn ), we get the usual individual/variable table

$$X = \begin{pmatrix} x_1^1 & x_1^2 & x_1^3 & \cdots & x_1^p \\ \vdots & & & & \vdots \\ x_n^1 & x_n^2 & x_n^3 & \cdots & x_n^p \end{pmatrix}$$

  on which the inference is based.

Network inference                                                                        21
Modeling relationship between variables (1)
 Independence

  Definition (Independence of events)
  Two events A and B are independent if and only if

                                   P(A, B) = P(A)P(B),

  which is usually denoted by A ⊥⊥ B. Equivalently,
     I   A ⊥⊥ B ⇔ P(A|B) = P(A),
     I   A ⊥⊥ B ⇔ P(A|B) = P(A|Bᶜ).

  Example (class vs party)

                                party                                  party
                 class      Labour Tory                  class     Labour Tory
              working         0.42 0.28               working        0.60 0.40
           bourgeoisie        0.06 0.24            bourgeoisie       0.20 0.80
               Table: Joint probability (left) vs. conditional probability (right)
Network inference                                                                    23
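
A minimal Python check of the table above (a sketch; numpy is assumed, and the
numbers are read off the left-hand table): the joint probabilities do not
factorize into the product of the marginals, so class and party are dependent.

    import numpy as np

    # Joint probability P(class, party): rows = class, columns = party
    joint = np.array([[0.42, 0.28],    # working:     Labour, Tory
                      [0.06, 0.24]])   # bourgeoisie: Labour, Tory

    p_class = joint.sum(axis=1)        # marginal P(class) = (0.70, 0.30)
    p_party = joint.sum(axis=0)        # marginal P(party) = (0.48, 0.52)

    # Under independence the joint would be the outer product of the marginals
    print(np.outer(p_class, p_party))  # [[0.336 0.364] [0.144 0.156]] != joint
    print(joint / p_class[:, None])    # P(party|class): [[0.6 0.4] [0.2 0.8]]

The conditional probabilities computed on the last line reproduce the
right-hand table exactly.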
Modeling relationships between variables (2)
 Conditional independence

  Generalizing to more than two events requires strong assumptions
  (mutual independence). It is better handled with
  Definition (Conditional independence of events)
  Two events A and B are conditionally independent given C if and only if

                            P(A, B|C) = P(A|C)P(B|C),

  which is usually denoted by A ⊥⊥ B | C.

  Example (Does IQ depend on weight?)
  Consider the events A = “having low IQ”, B = “having low weight”.
  Estimating² P(A, B), P(A) and P(B) in a sample would lead to

                                P(A, B) ≠ P(A)P(B),

  but in fact, introducing C = “having a given age”,

                            P(A, B|C) = P(A|C)P(B|C).

      ²
          stupidly
Network inference                                                     24
Independence of random vectors (1)
 Independence and Conditional independence: natural generalization

  Definition
  Consider three random vectors X, Y, Z with distributions fX, fY, fZ and joint
  distributions fXY, fXYZ. Then,
     I   X and Y are independent iff fXY(x, y) = fX(x)fY(y);
     I   X and Y are conditionally independent given Z iff, for all z such that fZ(z) > 0,
         fXY|Z(x, y; z) = fX|Z(x; z)fY|Z(y; z).

  Proposition (Factorization criterion)
  X and Y are independent (resp. conditionally independent given Z) iff
  there exist functions g and h such that, for all x and y,
    1. fXY(x, y) = g(x)h(y),
    2. fXYZ(x, y, z) = g(x, z)h(y, z), for all z such that fZ(z) > 0.

Network inference                                                               25
Independence of random vectors (2)
 Independence vs Conditional independence

  [Diagram: nested families of distributions f, from full dependence (f : fXYZ
  arbitrary), through the conditional independences f : X ⊥⊥ Y | Z,
  f : X ⊥⊥ Z | Y and f : Y ⊥⊥ Z | X, down to mutual independence f : fX fY fZ.]

        Figure: Mutual independence, Conditional dependence, full dependence.
Network inference                                                               26
Definition

  Definition
  A graphical model gives a graphical (intuitive) representation of the
  dependence structure of a probability distribution.

               Graphical structure ↔ Random variables/Random vector

  It links
    1. a random vector (or a set of random variables) X = {X1 , . . . , Xp }
       with distribution P,
    2. a graph G = (P, E) where
             I      P = {1, . . . , p} is the set of nodes, one per variable,
             I      E is a set of edges describing the dependence relationships of X ∼ P.

Network inference                                                                          28
Conditional Independence Graphs
 Definition

  Definition
  The conditional independence graph of a random vector X is the
  undirected graph G = (P, E) with node set P = {1, . . . , p} and
  where
                      (i, j) ∉ E ⇔ Xi ⊥⊥ Xj | XP∖{i,j} .

  Property
  It satisfies the Markov property: any two subsets of variables separated by
  a third are conditionally independent given the variables in the third set.

Network inference                                                            29
Conditional Independence Graphs
 An example

  Let X1 , X2 , X3 , X4 be four random variables with joint probability density
  function fX(x) = exp(u + x1 + x1 x2 + x2 x3 x4 ), with u a given constant.
  Apply the factorization criterion:

                    fX(x) = exp(u + x1 + x1 x2 + x2 x3 x4 )
                          = exp(u) · exp(x1 + x1 x2 ) · exp(x2 x3 x4 )

  Graphical representation

                                                      1         2         4
G = (P, E) such that P = {1, 2, 3, 4}
and
         E = {(1, 2), (2, 3), (2, 4), (3, 4)}
                                                                      3

Network inference                                                                 30
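
As a sketch of how the factorization criterion turns into edges (plain Python;
the factor list below is just the two exponential factors above):

    import itertools

    # Each multiplicative factor contributes a clique over its variables:
    # exp(x1 + x1*x2) involves {1, 2}; exp(x2*x3*x4) involves {2, 3, 4}.
    factors = [{1, 2}, {2, 3, 4}]

    edges = set()
    for clique in factors:
        # every pair of variables sharing a factor is joined by an edge
        edges |= set(itertools.combinations(sorted(clique), 2))

    print(sorted(edges))  # [(1, 2), (2, 3), (2, 4), (3, 4)]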
Directed Acyclic conditional independence Graph (DAG)
 Motivation



  Limitation of undirected graphs
 Sometimes an ordering on the variables is known, which allows us to break
 the symmetry in the graphical representation and introduce, in some sense,
 “causality” into the modeling.

  Consequences
     I   Each element of E has to be directed.
     I   There are no directed cycles in the graph.


       We thus deal with a directed acyclic graph (or DAG).


Network inference                                                        31
Directed Acyclic conditional independence Graph (DAG)
 Definition

  Definition (Ordering)
  An ordering ≺ between variables {1, . . . , p} is a relation such that: i) for
  every pair (i, j), either i ≺ j or j ≺ i; ii) ≺ is transitive; iii) ≺ is not
  reflexive.
     I   A natural ordering is obtained when variables are observed across
         time,
     I   A natural conditioning set for a pair of variables (i, j) is the past
         of j, denoted P(j) = {1, . . . , j − 1}.

  Definition (DAG)
  The directed conditional dependence graph of X is the directed graph
  G = (P, E) where, for (i, j) such that i ≺ j,

                    (i, j) ∉ E ⇔ Xj ⊥⊥ Xi | XP(j)∖{i,j} .

Network inference                                                                   32
Directed Acyclic conditional independence Graph (DAG)
 Factorization and Markov property

  Another view uses parent/descendant relationships to deal with the
  ordering of the nodes:
  The factorization property

$$f_X(x) = \prod_{k=1}^{p} f_{X_k | \mathrm{pa}_k}(x_k \,|\, \mathrm{pa}_k),$$

  where pa_k are the parents of node k.

Network inference                                                      33
Directed Acyclic conditional independence Graph (DAG)
 An example

                                                       x1

                                          x2                       x3

                                                 x4          x5

                                          x6                       x7

                    fX(x) = fX1 fX2 fX3 fX4|X1,X2,X3 fX5|X1,X3 fX6|X4 fX7|X4,X5 .

Network inference                                                                              34
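
A small Python sketch of the same factorization, encoding the DAG as a
parent map (node numbering as in the figure):

    # node k -> pa_k, read off the DAG above
    parents = {1: [], 2: [], 3: [], 4: [1, 2, 3], 5: [1, 3], 6: [4], 7: [4, 5]}

    # Assemble f_X(x) = prod_k f(x_k | pa_k) as a string, one factor per node
    factors = []
    for k, pa in sorted(parents.items()):
        cond = "|x" + ",x".join(map(str, pa)) if pa else ""
        factors.append(f"f(x{k}{cond})")
    print(" * ".join(factors))
    # f(x1) * f(x2) * f(x3) * f(x4|x1,x2,x3) * f(x5|x1,x3) * f(x6|x4) * f(x7|x4,x5)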
Directed Acyclic conditional independence Graph (DAG)
 Markov property

  Local Markov property
  For any Y ∉ de_k, where de_k are the descendants of k,

                               Xk ⊥⊥ Y | pa_k ,

  that is, Xk is conditionally independent of its non-descendants given its
  parents.

Network inference                                                             35
Local Markov property: example

                    x1

   x2                             x3                   Check that x4 ⊥⊥ x5 | {x2 , x3 },
                                                       using the factorization property.

           x4               x5

$$P(x_4 \,|\, x_5, x_2, x_3) = \frac{P(x_2, x_3, x_4, x_5)}{P(x_2, x_3, x_5)}
                             = \frac{P(x_2)P(x_3)P(x_4|x_2,x_3)P(x_5|x_3)}{P(x_2)P(x_3)P(x_5|x_3)}
                             = P(x_4 \,|\, x_2, x_3).$$

Network inference                                                                            36
Modeling the genomic data
 Gaussian assumption

  The data

$$X = \begin{pmatrix} x_1^1 & x_1^2 & x_1^3 & \cdots & x_1^p \\ \vdots & & & & \vdots \\ x_n^1 & x_n^2 & x_n^3 & \cdots & x_n^p \end{pmatrix}$$

  Assuming fX(X) is multivariate Gaussian
  greatly simplifies the inference:
         it naturally links independence and conditional independence to
         covariance and partial covariance,
         it gives a straightforward interpretation to the graphical modeling
         previously considered.

Network inference                                                             38
Start gently with the univariate Gaussian distribution

The Gaussian distribution is the
natural model for the level of
expression of a gene (noisy data).

  We write X ∼ N(µ, σ²), so that EX = µ, VarX = σ², and

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{1}{2\sigma^2}(x - \mu)^2 \right\},$$

  and

$$\log f_X(x) = -\log\left(\sqrt{2\pi}\,\sigma\right) - \frac{1}{2\sigma^2}(x - \mu)^2.$$

     Useless on its own for modeling the joint distribution of expression
  levels for a whole bunch of genes.

Network inference                                                            39
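
A quick numerical check of the two formulas (a sketch; scipy is assumed, and
the values of µ, σ and x are arbitrary):

    import numpy as np
    from scipy.stats import norm

    mu, sigma = 2.0, 0.5
    x = np.array([1.5, 2.0, 2.5])

    # Density of N(mu, sigma^2), computed from the formula above...
    by_hand = np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    print(by_hand)
    print(norm.pdf(x, loc=mu, scale=sigma))     # ...matches scipy's pdf
    print(norm.logpdf(x, loc=mu, scale=sigma))  # the log-density formula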
One step forward: bivariate Gaussian distribution
 Need concepts of covariance and correlation

  Let X, Y be two real random variables.
  Definitions

     cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y).

     ρXY = cor(X, Y) = cov(X, Y) / √(Var(X) · Var(Y)).

  Proposition
     I   cov(X, X) = Var(X) = E[(X − EX)²],
     I   cov(X + Y, Z) = cov(X, Z) + cov(Y, Z),
     I   Var(X + Y) = Var(X) + Var(Y) + 2 cov(X, Y),
     I   X ⊥⊥ Y ⇒ cov(X, Y) = 0,
     I   X ⊥⊥ Y ⇔ cov(X, Y) = 0 when X, Y are jointly Gaussian.

Network inference                                                                   40
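
An empirical illustration of these definitions (a sketch with simulated data;
the coefficients 0.8 and 0.6 are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=10_000)
    y = 0.8 * x + 0.6 * rng.normal(size=10_000)  # Var(Y) = 0.64 + 0.36 = 1

    # Empirical covariance and correlation, matching the definitions above
    cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
    rho_xy = cov_xy / np.sqrt(x.var() * y.var())
    print(cov_xy, rho_xy)           # both close to 0.8 by construction
    print(np.corrcoef(x, y)[0, 1])  # same correlation via numpy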
The bivariate Gaussian distribution

$$f_{XY}(x, y) = \frac{1}{2\pi\sqrt{\det\Sigma}} \exp\left\{ -\frac{1}{2} \begin{pmatrix} x-\mu_1 & y-\mu_2 \end{pmatrix} \Sigma^{-1} \begin{pmatrix} x-\mu_1 \\ y-\mu_2 \end{pmatrix} \right\}$$

  where Σ is the variance/covariance matrix, which is symmetric and
  positive definite:

$$\Sigma = \begin{pmatrix} \mathrm{Var}(X) & \mathrm{cov}(X, Y) \\ \mathrm{cov}(Y, X) & \mathrm{Var}(Y) \end{pmatrix}.$$

  If standardized,

$$\Sigma = \begin{pmatrix} 1 & \rho_{XY} \\ \rho_{XY} & 1 \end{pmatrix}$$

  and

$$f_{X,Y}(x, y) = \frac{1}{2\pi\sqrt{1-\rho_{XY}^2}} \exp\left\{ -\frac{1}{2(1-\rho_{XY}^2)} \left(x^2 + y^2 - 2\rho_{XY}\, xy\right) \right\},$$

  where ρXY is the correlation between X and Y and describes the interaction
  between them.
Network inference                                                                        41
The bivariate Gaussian distribution

The Covariance Matrix
Let X ∼ N(0, Σ), with unit variances. For ρXY = 0,

$$\Sigma = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$

while for ρXY = 0.9,

$$\Sigma = \begin{pmatrix} 1 & 0.9 \\ 0.9 & 1 \end{pmatrix}.$$

The shape of the 2-D
distribution evolves
accordingly.

Network inference                      42
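
The effect of ρXY on the point cloud can be reproduced in a few lines (a
sketch; numpy assumed):

    import numpy as np

    rng = np.random.default_rng(42)
    for rho in (0.0, 0.9):
        Sigma = np.array([[1.0, rho],
                          [rho, 1.0]])
        sample = rng.multivariate_normal([0.0, 0.0], Sigma, size=5_000)
        # With rho = 0.9 the cloud stretches along the diagonal; the
        # empirical correlation recovers rho in both cases.
        print(rho, np.corrcoef(sample.T)[0, 1])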
Full generalization: multivariate Gaussian vector
 Now need partial covariance and partial correlation

  Let X, Y, Z be real random variables.
  Definitions

     cov(X, Y | Z) = cov(X, Y) − cov(X, Z)cov(Y, Z)/Var(Z).

$$\rho_{XY|Z} = \frac{\rho_{XY} - \rho_{XZ}\,\rho_{YZ}}{\sqrt{1 - \rho_{XZ}^2}\sqrt{1 - \rho_{YZ}^2}}.$$

       These give the interaction between X and Y once the effect of Z has
  been removed.

  Proposition
  When X, Y, Z are jointly Gaussian, then

                    cov(X, Y | Z) = 0 ⇔ cor(X, Y | Z) = 0 ⇔ X ⊥⊥ Y | Z.

Network inference                                                               43
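
A numerical sketch (simulated data, arbitrary coefficients): X and Y are both
driven by Z, so they are correlated but conditionally independent given Z. The
partial correlation is computed twice, from the formula above and from the
inverse correlation matrix, anticipating the concentration matrix used later.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 20_000
    z = rng.normal(size=n)
    x = z + 0.5 * rng.normal(size=n)  # X depends on Z only
    y = z + 0.5 * rng.normal(size=n)  # Y depends on Z only

    R = np.corrcoef(np.vstack([x, y, z]))
    r_xy, r_xz, r_yz = R[0, 1], R[0, 2], R[1, 2]

    # Partial correlation from the formula above
    rho_xy_z = (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

    # Same quantity from the inverse correlation matrix Theta
    Theta = np.linalg.inv(R)
    rho_from_theta = -Theta[0, 1] / np.sqrt(Theta[0, 0] * Theta[1, 1])

    print(rho_xy_z, rho_from_theta)  # both close to 0: X ⊥⊥ Y | Z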
The multivariate Gaussian distribution

  Allows us to give a model for the expression levels of a whole set of genes
  P:
  Gaussian vector
  Let X ∼ N(µ, Σ), and consider any block decomposition with {a, b} a
  partition of P:

$$\Sigma = \begin{pmatrix} \Sigma_{aa} & \Sigma_{ab} \\ \Sigma_{ba} & \Sigma_{bb} \end{pmatrix}.$$

  Then
    1. Xa is Gaussian with distribution N(µa , Σaa );
    2. Xa | Xb = x is Gaussian with distribution N(µa|b , Σa|b ), where
       µa|b = µa + Σab Σbb⁻¹ (x − µb) and Σa|b = Σaa − Σab Σbb⁻¹ Σba.

Network inference                                                             44
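
A minimal numpy sketch of the conditioning formulas, with a made-up 3-gene
covariance (block a = gene 0, block b = genes 1 and 2):

    import numpy as np

    Sigma = np.array([[1.0, 0.6, 0.3],
                      [0.6, 1.0, 0.5],
                      [0.3, 0.5, 1.0]])
    mu = np.zeros(3)
    a, b = [0], [1, 2]

    S_ab = Sigma[np.ix_(a, b)]
    S_bb = Sigma[np.ix_(b, b)]
    x_b = np.array([1.0, -0.5])  # observed values for block b

    # mu_{a|b} and Sigma_{a|b} exactly as given above
    mu_cond = mu[a] + S_ab @ np.linalg.solve(S_bb, x_b - mu[b])
    Sigma_cond = Sigma[np.ix_(a, a)] - S_ab @ np.linalg.solve(S_bb, S_ab.T)
    print(mu_cond, Sigma_cond)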
Steady-state data: scheme

                                 Inference

   ≈ 10s of microarrays
                                             Which interactions?
     ≈ 1000s probes (“genes”)

Network inference                                                  47
Modeling the underlying distribution (1)

  Model for data generation
     I   A microarray can be represented as a multivariate vector
         X = (X1 , . . . , Xp ) ∈ Rᵖ,
     I   Consider n biological replicates in the same condition, which form a
         usual n-size sample (X1 , . . . , Xn ).

  Consequence: a Gaussian Graphical Model
     I   X ∼ N(µ, Σ) with X1 , . . . , Xn i.i.d. copies of X,
     I   Θ = (θij )i,j∈P = Σ⁻¹ is called the concentration matrix.

Network inference                                                               48
Modeling the underlying distribution (2)
 Interpretation as a GGM

  Multivariate Gaussian vector and covariance selection

$$-\frac{\theta_{ij}}{\sqrt{\theta_{ii}\,\theta_{jj}}} = \mathrm{cor}\left(X_i, X_j \,|\, X_{P\setminus\{i,j\}}\right) = \rho_{ij|P\setminus\{i,j\}}.$$

  Graphical Interpretation
        The matrix Θ = (θij )i,j∈P encodes the network G we are looking for:
        an edge i — j is present if and only if there is a conditional
        dependency between Xi and Xj , i.e. a non-null partial correlation
        between Xi and Xj , i.e. θij ≠ 0.

Network inference                                                                       49
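
To make the covariance-selection idea concrete, here is a sketch that
simulates from a known sparse Θ and re-estimates it with the graphical lasso
(scikit-learn's GraphicalLassoCV is an assumption here, not part of the deck;
it implements the ℓ1-penalized likelihood approach discussed in the
statistical inference section):

    import numpy as np
    from sklearn.covariance import GraphicalLassoCV

    rng = np.random.default_rng(0)

    # Sparse ground-truth concentration matrix over p = 4 "genes":
    # only the pairs (0,1) and (1,2) are conditionally dependent.
    Theta = np.array([[2.0, 0.6, 0.0, 0.0],
                      [0.6, 2.0, 0.6, 0.0],
                      [0.0, 0.6, 2.0, 0.0],
                      [0.0, 0.0, 0.0, 2.0]])
    X = rng.multivariate_normal(np.zeros(4), np.linalg.inv(Theta), size=500)

    Theta_hat = GraphicalLassoCV().fit(X).precision_

    # Edges of G = non-zero off-diagonal entries of the estimated Theta
    edges = [(i, j) for i in range(4) for j in range(i + 1, 4)
             if abs(Theta_hat[i, j]) > 1e-8]
    print(edges)  # typically recovers [(0, 1), (1, 2)]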
Time-course data: scheme




   t0                                      Inference
        t1
              tn
             ≈ 10s of microarrays over time
                                                       Which interactions?
              ≈ 1000s probes (“genes”)




Network inference                                                            51
Modeling time-course data with DAG
  Collecting gene expression
    1. Follow-up of one single experiment/individual;
    2. Close enough time-points to ensure
             I      dependency between consecutive measurements;
             I      homogeneity of the Markov process.

               [Diagram: the static network G over (X1 , . . . , X5 ) stands for
               the two-slice graph linking (X1ᵗ , . . . , X5ᵗ ) to
               (X1ᵗ⁺¹ , . . . , X5ᵗ⁺¹ ).]

Network inference                                                         52
Modeling time-course data with DAG
  Collecting gene expression
    1. Follow-up of one single experiment/individual;
    2. Close enough time-points to ensure
             I      dependency between consecutive measurements;
             I      homogeneity of the Markov process.

   [Figure: unrolling time — each gene i yields a chain X_i^1 → X_i^2 → · · · → X_i^n ,
    with the same network G linking every pair of consecutive time points.]
Network inference                                                       52
DAG: remark


   [Figure: the graph G over X1 , . . . , X5 (left, which contains a cycle)
    versus its time-unrolled version linking X t to X t+1 (right).]

        Argh, there is a cycle :’(          the unrolled graph is indeed a DAG

      Unrolling time overcomes the rather restrictive acyclicity requirement.

Network inference                                                         53
Modeling the underlying distribution (1)

  Model for data generation
  A microarray can be represented as a multivariate vector
  X = (X1 , . . . , Xp ) ∈ Rp , generated through a first-order vector
  autoregressive process VAR(1):

                        X t = ΘX t−1 + b + εt ,    t ∈ [1, n]

  where εt is a white noise ensuring the Markov property, and
  X 0 ∼ N (0, Σ0 ).

  Consequence: a Gaussian Graphical Model
     I   Each X t | X t−1 ∼ N (ΘX t−1 , Σ),
     I   or, equivalently, Xjt | X t−1 ∼ N (Θj X t−1 , Σ),

  where Σ is known and Θj is the j-th row of Θ.

Network inference                                                        54
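
  A minimal sketch of this data-generating model in R (Θ, b and the noise
  level below are made-up illustrative values): simulate a short time course
  from the VAR(1) recursion.

  set.seed(1)
  p <- 3; n <- 50
  Theta <- matrix(c(0.5,  0.4, 0,     # made-up sparse Theta = the network
                    0.3,  0,   0,
                    0,   -0.4, 0), p, p, byrow = TRUE)
  b <- rep(0, p)
  X <- matrix(0, n + 1, p)
  X[1, ] <- rnorm(p)                  # X^0 ~ N(0, Sigma_0), here Sigma_0 = I
  for (t in 1:n)                      # X^t = Theta X^{t-1} + b + eps^t
    X[t + 1, ] <- drop(Theta %*% X[t, ]) + b + rnorm(p, sd = 0.1)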
Modeling the underlying distribution (2)
     I   Componentwise, the VAR(1) model reads

         \begin{pmatrix} X_t^1 \\ \vdots \\ X_t^i \\ \vdots \\ X_t^p \end{pmatrix}
         =
         \begin{pmatrix}
           \theta_{11} & \cdots & \theta_{1j} & \cdots & \theta_{1p} \\
           \vdots      &        & \vdots      &        & \vdots      \\
           \theta_{i1} & \cdots & \theta_{ij} & \cdots & \theta_{ip} \\
           \vdots      &        & \vdots      &        & \vdots      \\
           \theta_{p1} & \cdots & \theta_{pj} & \cdots & \theta_{pp}
         \end{pmatrix}
         \begin{pmatrix} X_{t-1}^1 \\ \vdots \\ X_{t-1}^j \\ \vdots \\ X_{t-1}^p \end{pmatrix}
         +
         \begin{pmatrix} b_1 \\ \vdots \\ b_i \\ \vdots \\ b_p \end{pmatrix}
         +
         \begin{pmatrix} \varepsilon_t^1 \\ \vdots \\ \varepsilon_t^i \\ \vdots \\ \varepsilon_t^p \end{pmatrix}

     I   Example:

         \Theta = \begin{pmatrix} \theta_{11} & \theta_{12} & 0 \\ \theta_{21} & 0 & 0 \\ 0 & \theta_{32} & 0 \end{pmatrix}

Network inference                                                               55
Modeling the underlying distribution (3)
 Interpretation as a GGM

  The VAR(1) as a covariance selection model

        \theta_{ij} = \mathrm{cov}\big(X_i^t , X_j^{t-1} \mid X_{P\setminus j}^{t-1}\big) \Big/ \mathrm{var}\big(X_j^{t-1} \mid X_{P\setminus j}^{t-1}\big),

  Graphical Interpretation
        The matrix Θ = (θij )i,j∈P encodes the network G we are looking for.

                                     conditional dependency between X_j^{t−1} and X_i^t
        i ←— j    if and only if                       or
                                     non-null partial correlation between X_j^{t−1} and X_i^t
                                                       ⇔
                                                    θij ≠ 0

Network inference                                                                           56
Outline
  Introduction
      Motivations
      Background on omics
      Modeling issue
  Modeling tools
      Statistical dependence
      Graphical models
      Covariance selection and Gaussian vector
  Gaussian Graphical Models for genomic data
      Steady-state data
      Time-course data
  Statistical inference
      Penalized likelihood approach
      Inducing sparsity and regularization
      The Lasso
  Application in Post-genomics
      Modeling time-course data
      Illustrations
      Multitask learning
Network inference                                57
The graphical models: reminder*
 * for goldfish-like memories

  Assumption
  A microarray can be represented as a multivariate Gaussian vector X .

  Collecting gene expression
    1. Steady-state data leads to an i.i.d. sample.
    2. Time-course data gives a time series.

  Graphical interpretation

                                     conditional dependency between X (i) and X (j )
        i —— j    if and only if                       or
                                     non-null partial correlation between X (i) and X (j )

        Encoded in an unknown matrix of parameters Θ.

Network inference                                                                            59
The graphical models: reminder*
 * for goldfish-like memories

  Assumption
  A microarray can be represented as a multivariate Gaussian vector X .

  Collecting gene expression
    1. Steady-state data leads to an i.i.d. sample.
    2. Time-course data gives a time series.

  Graphical interpretation

                                     conditional dependency between X t (i) and X t−1 (j )
        i ←— j    if and only if                       or
                                     non-null partial correlation between X t (i) and X t−1 (j )

        Encoded in an unknown matrix of parameters Θ.

Network inference                                                                                59
The Maximum likelihood estimator
 The natural approach for parametric statistics

  Let X be a random vector with distribution defined by fX (x ; Θ), where
  Θ are the model parameters.

  Maximum likelihood estimator

                        Θ̂ = arg max L(Θ; X)
                                  Θ

  where L is the log-likelihood, a function of the parameters:

                        L(Θ; X) = log ∏_{k=1}^{n} fX (xk ; Θ),

  where xk is the k-th row of X.

  Remarks
     I   This is a convex optimization problem,
     I   We just need to detect the nonzero coefficients in Θ.
Network inference                                                          60
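
  For the i.i.d. Gaussian case, this log-likelihood depends on the data only
  through the empirical covariance S; a minimal sketch in R (constants dropped,
  centered data assumed):

  ## L(Theta; X) = n/2 * ( log det(Theta) - Trace(S Theta) ) + constant
  loglik <- function(Theta, S, n)
    n / 2 * (as.numeric(determinant(Theta, logarithm = TRUE)$modulus)
             - sum(diag(S %*% Theta)))
  ## e.g. at the unpenalized maximum: loglik(solve(cov(X)), cov(X), nrow(X))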
The penalized likelihood approach

  Let Θ be the parameters to infer (the edges).

  A penalized likelihood approach

                 Θ̂ = arg max L(Θ; X) − λ · penℓ1 (Θ),
                           Θ

     I   L is the model log-likelihood,
     I   penℓ1 is a penalty function tuned by λ > 0.

         It performs
            1. regularization (needed when n ≪ p),
            2. selection (sparsity induced by the ℓ1 -norm).

Network inference                                                    61
Outline
  Introduction
      Motivations
      Background on omics
      Modeling issue
  Modeling tools
      Statistical dependence
      Graphical models
      Covariance selection and Gaussian vector
  Gaussian Graphical Models for genomic data
      Steady-state data
      Time-course data
  Statistical inference
      Penalized likelihood approach
      Inducing sparsity and regularization
      The Lasso
  Application in Post-genomics
      Modeling time-course data
      Illustrations
      Multitask learning
Network inference                                62
A Geometric View of Sparsity
 Constrained Optimization

   [Figure: level sets of f (β1 , β2 ; X) in the (β1 , β2 ) plane.]

   We basically want to solve a problem of the form

                        maximize f (β1 , β2 ; X)
                          β1 ,β2

   where f is typically a concave likelihood function. This is strictly
   equivalent to solving

                        minimize g(β1 , β2 ; X)
                          β1 ,β2

   where g = −f is convex! For instance, the squared loss in the OLS.

Network inference                                                          63
A Geometric View of Sparsity
 Constrained Optimization

   Adding a constraint, we solve

                        maximize f (β1 , β2 ; X)
                          β1 ,β2
                        s.t. Ω(β1 , β2 ) ≤ c,

   where Ω defines a domain that constrains β. This is equivalent to

                        maximize f (β1 , β2 ; X) − λ Ω(β1 , β2 ).
                          β1 ,β2

              How shall we define Ω to induce sparsity?

Network inference                                                          63
A Geometric View of Sparsity
 Supporting Hyperplane

  A hyperplane supports a set iff
     I   the set is contained in one half-space,
     I   the set has at least one point on the hyperplane.

   [Figure: supporting hyperplanes at various points of convex and
    non-convex sets in the (β1 , β2 ) plane.]

        There are supporting hyperplanes at all points of a convex set:
                        they generalize tangents.

Network inference                                                          64
A Geometric View of Sparsity
 Dual Cone

        Generalizes normals

   [Figure: dual cones at smooth points, edges and corners of three
    constraint sets in the (β1 , β2 ) plane.]

        Shape of the dual cones ⇒ sparsity pattern

Network inference                                                          65
Outline
  Introduction
      Motivations
      Background on omics
      Modeling issue
  Modeling tools
      Statistical dependence
      Graphical models
      Covariance selection and Gaussian vector
  Gaussian Graphical Models for genomic data
      Steady-state data
      Time-course data
  Statistical inference
      Penalized likelihood approach
      Inducing sparsity and regularization
      The Lasso
  Application in Post-genomics
      Modeling time-course data
      Illustrations
      Multitask learning
Network inference                                66
The LASSO

        R. Tibshirani, 1996.
        The Lasso: Least Absolute Shrinkage and Selection Operator.
        S. Chen, D. Donoho, M. Saunders, 1995.
        Basis Pursuit.
        Weisberg, 1980.
        Forward Stagewise regression.

                minimize ‖y − Xβ‖²₂ ,    s.t. ‖β‖₁ = |β1 | + |β2 | ≤ c
                  β∈R²

                                        ⇔

                minimize ‖y − Xβ‖²₂ + λ‖β‖₁ .
                  β∈R²

   [Figure: comparison of the solutions of ℓ1 - and ℓ2 -regularized problems;
    the corners of the ℓ1 ball yield sparse solutions.]
 Network inference                                                                                          67
Orthogonal case and link to the OLS
  OLS shrinkage
  The Lasso has no analytical solution except in the orthogonal case: when
  X⊺X = I (never true for real data),

                β̂_j^lasso = sign(β̂_j^ols ) max(0, |β̂_j^ols | − λ).

   [Figure: soft thresholding — the Lasso estimate as a function of the OLS
    estimate, flat on [−λ, λ].]

Network inference                                                         68
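
  A minimal sketch of this soft-thresholding operator in R (an illustration,
  with made-up input values):

  soft <- function(beta.ols, lambda)
    sign(beta.ols) * pmax(0, abs(beta.ols) - lambda)
  soft(c(-3, -0.5, 0.2, 2), lambda = 1)   # gives -2  0  0  1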
LARs: Least angle regression

        B. Efron, T. Hastie, I. Johnstone, R. Tibshirani, 2004.
        Least Angle Regression.

  Efficient algorithm to compute the Lasso solutions
  The LARS solution consists of a curve giving the solution β̂λ for each
  value of λ.
     I   It constructs a piecewise-linear path of solutions, starting from the
         null vector and moving towards the OLS estimate,
     I   at (almost) the same cost as the OLS,
     I   well adapted to cross-validation (which helps us choose λ).

Network inference                                                               69
Example: prostate cancer I




  Lasso solution path with Lars

  >   library(lars)
  >   load("prostate.rda")
  >   x   <- scale(as.matrix(x))   # center and scale the predictors
  >   out <- lars(x, y)
  >   plot(out)




Network inference                  70
Example: prostate cancer II
   [Figure: Lasso solution path from plot(out) — standardized coefficients of
    the 8 predictors against the shrinkage factor |β|/max|β| ∈ [0, 1]; the top
    axis gives the number of active variables at each step.]
Network inference                                                                          71
Choice of the tuning parameter λ (I)

  Model selection criteria

          BIC(λ) = ‖y − Xβ̂λ ‖²₂ + df(β̂λ ) (log n)/2

          AIC(λ) = ‖y − Xβ̂λ ‖²₂ + df(β̂λ )

  where df(β̂λ ) is the number of nonzero entries in β̂λ .

  Cross-validation
    1. split the data into K folds,
    2. use each of the K folds in turn as the testing set,
    3. compute the test error on these K folds,
    4. average to obtain the CV estimation of the test error.

          λ is chosen to minimize the CV test error.

Network inference                                                   72
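
  A minimal sketch in R evaluating the BIC along the lars path of the previous
  slides (bic.path is a hypothetical helper, not part of lars; x is assumed
  scaled as above):

  bic.path <- function(out, x, y) {
    beta <- coef(out)                        # one row of coefficients per step
    yc   <- y - mean(y)                      # lars centers y internally
    rss  <- colSums((yc - x %*% t(beta))^2)  # ||y - X beta_lambda||_2^2
    df   <- rowSums(beta != 0)               # number of nonzero entries
    rss + df * log(length(y)) / 2            # BIC(lambda), to be minimized
  }
  best <- which.min(bic.path(out, x, y))     # step selected by BIC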
Choice of the tuning parameter λ (II)

  CV choice for λ

  > cv.lars(x, y, K=10)




Network inference                      73
Choice of the tuning parameter λ (III)

   [Figure: output of cv.lars — cross-validated MSE with error bars against
    the shrinkage factor; the curve drops sharply, then flattens around its
    minimum, which gives the CV choice of λ.]
Network inference                                                                                          74
Many variations

  Group-Lasso
  Activate the variables by group (given by the user).


  Adaptive/Weighted-Lasso
  Adjust the penalty level of each variable, according to prior knowledge or
  with data-driven weights.


  BoLasso
  Bootstrapped version that removes false positives/stabilizes the estimate.

  etc.

  + many theoretical results.

Network inference                                                              75
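
  A minimal sketch of the adaptive Lasso in R (assuming the glmnet package is
  available; OLS-based weights are one common data-driven choice):

  library(glmnet)
  w   <- 1 / abs(coef(lm(y ~ x))[-1])       # data-driven weights from OLS
  fit <- glmnet(x, y, penalty.factor = w)   # one penalty level per variable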
Outline
  Introduction
      Motivations
      Background on omics
      Modeling issue
  Modeling tools
      Statistical dependence
      Graphical models
      Covariance selection and Gaussian vector
  Gaussian Graphical Models for genomic data
      Steady-state data
      Time-course data
  Statistical inference
      Penalized likelihood approach
      Inducing sparsity and regularization
      The Lasso
  Application in Post-genomics
      Modeling time-course data
      Illustrations
      Multitask learning
Network inference                                76
Problem




   t0 , t1 , . . . , tn                           Inference

        ≈ 10s of microarrays over time
        ≈ 1000s of probes (“genes”)               Which interactions?




                    The main statistical issue is the high-dimensional setting.



Network inference                                                                    78
Handling the scarcity of the data
 By introducing some prior

  Priors should be biologically grounded
    1. few genes effectively interact (sparsity),
    2. networks are organized (latent clustering).

   [Figure: a gene network over G1, . . . , G13, organized in a few densely
    connected groups.]

Network inference                                                    79
Handling the scarcity of the data
 By introducing some prior

  Priors should be biologically grounded
    1. few genes effectively interact (sparsity),
    2. networks are organized (latent clustering).

   [Figure: the same network with nodes relabeled by cluster (A, B, C),
    revealing the latent organization.]

Network inference                                                79
Penalized log-likelihood

  Banerjee et al., JMLR 2008

                Θ̂ = arg max Liid (Θ; S) − λ‖Θ‖ℓ1 ,
                          Θ

  efficiently solved by the graphical Lasso of Friedman et al., 2008.

  Ambroise, Chiquet, Matias, EJS 2009
  Use adaptive penalty parameters for the different coefficients:

                L̃iid (Θ; S) − λ‖PZ ⋆ Θ‖ℓ1 ,

  where PZ is a matrix of weights depending on the underlying clustering Z.

       Works with the pseudo log-likelihood (computationally efficient).

Network inference                                                          80
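
  A minimal sketch of the Banerjee et al. criterion solved with the glasso
  package in R (package assumed available; rho plays the role of λ):

  library(glasso)
  S   <- cov(X)                   # empirical covariance of the i.i.d. sample
  fit <- glasso(S, rho = 0.1)     # l1-penalized Gaussian log-likelihood
  Theta.hat <- fit$wi             # estimated sparse concentration matrix
  G.hat <- Theta.hat != 0         # adjacency of the inferred network
  diag(G.hat) <- FALSE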
Neighborhood selection (1)

  Let
     I   Xi be the i-th column of X,
     I   X\i be X deprived of Xi .

                Xi = X\i β + ε,    where βj = −θij /θii .

  Meinshausen and Bühlmann, 2006
  Since sign(corij |P\{i,j } ) = sign(βj ), select the neighbors of i with

                arg min (1/n) ‖Xi − X\i β‖²₂ + λ‖β‖₁ .
                    β

       The sign pattern of Θ is inferred after a symmetrization step.

Network inference                                                            81
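
  A minimal sketch of this node-wise strategy with lars in R (λ fixed for
  simplicity; neighbors is a hypothetical helper, not the authors' code):

  library(lars)
  neighbors <- function(X, i, lambda) {
    fit  <- lars(X[, -i], X[, i])                    # lasso of X_i on X_{\i}
    beta <- coef(fit, s = lambda, mode = "lambda")   # coefficients at lambda
    which(beta != 0)                                 # indices relative to X[, -i]
  }
  ## apply to every node i, then symmetrize (AND or OR rule) to get the graph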
Neighborhood selection (2)

  The pseudo log-likelihood of the i.i.d. Gaussian sample is

      \tilde{L}_{iid}(\Theta; S) = \sum_{i=1}^{p} \sum_{k=1}^{n} \log P\big(X_k(i) \mid X_k(P \setminus i); \Theta_i\big)
                                 = \frac{n}{2}\log\det(D) - \frac{n}{2}\,\mathrm{Trace}\big(D^{-1/2}\Theta S \Theta D^{-1/2}\big) - \frac{n}{2}\log(2\pi),

  where D = diag(Θ).

  Proposition

      Θ̂pseudo = arg max L̃iid (Θ; S) − λ‖Θ‖ℓ1
                    Θ

  (the maximization running over the off-diagonal entries θij , i ≠ j)
  has the same null entries as inferred by neighborhood selection.

Network inference                                                                  82
Structured regularization
 Introduce prior knowledge

  Building the weights
    1. Build w from prior biological information:
             I      transcription factors vs. regulatees,
             I      number of potential binding sites,
             I      KEGG pathways, Gene Ontology, . . .
    2. Build the weight matrix from a clustering algorithm (see the sketch
       below):
             I      infer the network G 0 with w = 1 for each node,
             I      apply a clustering algorithm on G 0 ,
             I      re-infer G with w built according to the clustering Z.

Network inference                                                            83
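
  A minimal sketch of this two-pass scheme in R (assumptions: glasso for the
  inference step, hierarchical clustering as a crude stand-in for the
  clustering algorithm, made-up penalty levels):

  library(glasso)
  S   <- cov(X)
  G0  <- glasso(S, rho = 0.1)$wi != 0           # first pass: w = 1 everywhere
  Z   <- cutree(hclust(dist(G0)), k = 3)        # cluster the nodes of G^0
  P   <- ifelse(outer(Z, Z, "=="), 0.05, 0.2)   # lighter penalty within clusters
  fit <- glasso(S, rho = P)                     # second pass: weighted penalty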

 
Sugarcane Agroecological Zoning for Ethanol and Sugar production in Brazil
Sugarcane Agroecological Zoning for Ethanol and Sugar production in BrazilSugarcane Agroecological Zoning for Ethanol and Sugar production in Brazil
Sugarcane Agroecological Zoning for Ethanol and Sugar production in Brazil
 
Remote Sensing Satellite Images for Sugarcane Crop Monitoring
Remote Sensing Satellite Images for Sugarcane Crop Monitoring Remote Sensing Satellite Images for Sugarcane Crop Monitoring
Remote Sensing Satellite Images for Sugarcane Crop Monitoring
 
Greenhouse Gas Emissions in the Production Cycle of Bioethanol from Sugarcane...
Greenhouse Gas Emissions in the Production Cycle of Bioethanol from Sugarcane...Greenhouse Gas Emissions in the Production Cycle of Bioethanol from Sugarcane...
Greenhouse Gas Emissions in the Production Cycle of Bioethanol from Sugarcane...
 
Greenhouse Gas (GHG) Emissions Balances of Biofuels
Greenhouse Gas (GHG) Emissions Balances of Biofuels Greenhouse Gas (GHG) Emissions Balances of Biofuels
Greenhouse Gas (GHG) Emissions Balances of Biofuels
 
Energy and GHG emission balances: CTBE´s proposal
Energy and GHG emission balances: CTBE´s proposalEnergy and GHG emission balances: CTBE´s proposal
Energy and GHG emission balances: CTBE´s proposal
 
Land Use Change in Computable General Equilibrium Models
Land Use Change in Computable General Equilibrium ModelsLand Use Change in Computable General Equilibrium Models
Land Use Change in Computable General Equilibrium Models
 
Sugarcane Integration to Other Agricultural Activities
Sugarcane Integration to Other Agricultural ActivitiesSugarcane Integration to Other Agricultural Activities
Sugarcane Integration to Other Agricultural Activities
 
Greenhouse Gas Emissions and Soil Carbon Stocks in the Agricultural Phase of ...
Greenhouse Gas Emissions and Soil Carbon Stocks in the Agricultural Phase of ...Greenhouse Gas Emissions and Soil Carbon Stocks in the Agricultural Phase of ...
Greenhouse Gas Emissions and Soil Carbon Stocks in the Agricultural Phase of ...
 
The Sustainability of the Present Ethanol Production and Future Expansion in ...
The Sustainability of the Present Ethanol Production and Future Expansion in ...The Sustainability of the Present Ethanol Production and Future Expansion in ...
The Sustainability of the Present Ethanol Production and Future Expansion in ...
 
Research Agenda: Assessing Impacts of Sugarcane Ethanol Production and New Te...
Research Agenda: Assessing Impacts of Sugarcane Ethanol Production and New Te...Research Agenda: Assessing Impacts of Sugarcane Ethanol Production and New Te...
Research Agenda: Assessing Impacts of Sugarcane Ethanol Production and New Te...
 
Socio-economics Aspects Biofuels Production: What are the concerns in Europe?
Socio-economics Aspects Biofuels Production: What are the concerns in Europe?Socio-economics Aspects Biofuels Production: What are the concerns in Europe?
Socio-economics Aspects Biofuels Production: What are the concerns in Europe?
 
Experience on System Integration and Simulation
Experience on System Integration and SimulationExperience on System Integration and Simulation
Experience on System Integration and Simulation
 
Intellectual Property (IP) and Technology Transfer: models for interaction in...
Intellectual Property (IP) and Technology Transfer: models for interaction in...Intellectual Property (IP) and Technology Transfer: models for interaction in...
Intellectual Property (IP) and Technology Transfer: models for interaction in...
 
New Technologies Evaluation: uma visão empresarial
New Technologies Evaluation: uma visão empresarial New Technologies Evaluation: uma visão empresarial
New Technologies Evaluation: uma visão empresarial
 
Embrapa´s Success Measurement Methodology
Embrapa´s Success Measurement Methodology Embrapa´s Success Measurement Methodology
Embrapa´s Success Measurement Methodology
 

Recently uploaded

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 

Recently uploaded (20)

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 

Biological Network Inference via Gaussian Graphical Models

  • 7. What the reconstructed networks are expected to be (2) Regulatory networks Figure: Regulatory network identified in mammalian cells: highly structured Network inference 7
  • 8. What the reconstructed networks are expected to be (3) Protein-protein interaction networks Figure: Yeast PPI network: do not be misled by the representation, trust the statistics! Network inference 8
  • 11. Outline Introduction Motivations Background on omics Modeling issue Modeling tools Statistical dependence Graphical models Covariance selection and Gaussian vector Gaussian Graphical Models for genomic data Steady-state data Time-course data Statistical inference Penalized likelihood approach Inducing sparsity and regularization The Lasso Application in Post-genomics Modeling time-course data Illustrations Multitask learning Network inference 9
  • 12. What are we looking at? Central dogma of molecular biology: DNA is transcribed into mRNA, which is translated into proteins; DNA is replicated. Proteins are the building blocks of any cellular functionality, are encoded by the genes, and do interact (at the protein and gene level – regulations). Network inference 10
  • 13. What questions in functional genomics? (1) Various levels/scales of study: genome (sequence analysis), transcriptome (gene expression levels), proteome (protein functions and interactions). Questions: 1. Biological understanding (mechanisms of diseases, gene/protein functions and interactions); 2. Medical/clinical care (diagnostic: type of disease; prognostic: survival analysis; treatment: prediction of response). Network inference 11
  • 15. What questions in functional genomics? (2) Central dogma of molecular biology: DNA → mRNA (transcription) → proteins (translation); DNA replication. Basic biostatistical issues: selecting some genes of interest (biomarkers); looking for interactions between them (pathway analysis). Network inference 12
  • 16. How is this measured? (1) Microarray technology: parallel measurement of many biological features (signal processing, pretreatment). The expression levels of $p$ probes are simultaneously monitored for $n$ individuals, with $n \ll p$, giving the matrix of features
$$\mathbf{X} = \begin{pmatrix} x_1^1 & x_1^2 & \cdots & x_1^p \\ \vdots & & & \vdots \\ x_n^1 & x_n^2 & \cdots & x_n^p \end{pmatrix}.$$
Network inference 13
  • 17. How is this measured? (2) Next Generation Sequencing: parallel measurement of even more biological features (assembling, pretreatment). Expression counts are extracted from small repeated sequences and monitored for $n$ individuals, again with $n \ll p$:
$$\mathbf{X} = \begin{pmatrix} k_1^1 & k_1^2 & \cdots & k_1^p \\ \vdots & & & \vdots \\ k_n^1 & k_n^2 & \cdots & k_n^p \end{pmatrix}.$$
Network inference 14
  • 18. What questions are we dealing with? (1) Supervised canonical example at the gene level: differential analysis. Leukemia (Golub data, thanks to P. Neuvial): AML – Acute Myeloblastic Leukemia, $n_1 = 11$; ALL – Acute Lymphoblastic Leukemia, $n_2 = 27$; an $(n_1 + n_2)$-vector of outcomes with each patient's tumor type. Supervised classification: find genes with significantly different expression levels between groups – biomarkers (prediction purpose). Network inference 15
  • 19. What questions are we dealing with? (2) Unsupervised canonical example at the gene level: hierarchical clustering. Same kind of data, but no outcome is considered. (Unsupervised) clustering: find groups of genes which show statistical dependencies/commonalities – hoping for biological interactions (exploratory purpose, functional understanding). Can we do better than that? And how do genes interact anyway? Network inference 16
  • 21. Outline Introduction Motivations Background on omics Modeling issue Modeling tools Statistical dependence Graphical models Covariance selection and Gaussian vector Gaussian Graphical Models for genomic data Steady-state data Time-course data Statistical inference Penalized likelihood approach Inducing sparsity and regularization The Lasso Application in Post-genomics Modeling time-course data Illustrations Multitask learning Network inference 17
  • 22. The problem at hand Inference: from ≈ 10s/100s microarray/sequencing experiments on ≈ 1000s probes (“genes”). Modeling questions prior to inference: 1. What do the nodes represent? (the easiest one) 2. What is/should be the meaning of an edge? (the toughest one) Biologically? Statistically? Network inference 18
  • 26. More questions/issues Modelling: Is the network dynamic or static? How has the data been generated? (time-course/steady state) Are the edges oriented or not? (causality) What do the edges represent for my particular problem? Statistical challenges: (ultra) high dimensionality; noisy data, lack of reproducibility; heterogeneity of the data (many techniques, various signals). Network inference 19
  • 28. Outline Introduction Motivations Background on omics Modeling issue Modeling tools Statistical dependence Graphical models Covariance selection and Gaussian vector Gaussian Graphical Models for genomic data Steady-state data Time-course data Statistical inference Penalized likelihood approach Inducing sparsity and regularization The Lasso Application in Post-genomics Modeling time-course data Illustrations Multitask learning Network inference 20
  • 29. Canonical model settings Biological microarrays in comparable conditions. Notations: 1. a set $P = \{1, \dots, p\}$ of $p$ variables: these are typically the genes (could be proteins); 2. a sample $N = \{1, \dots, n\}$ of individuals associated to the variables: these are typically the microarrays (could be sequence counts). Basic statistical model: this can be viewed as a random vector $X$ in $\mathbb{R}^p$, whose $j$th entry is the $j$th variable, and an $n$-size sample $(X_1, \dots, X_n)$ such that $X_i$ is the $i$th microarray; the $X_i$ could be independent identically distributed copies (steady-state data) or dependent in a certain way (time-course data); assume a parametric probability distribution for $X$ (Gaussian). Stacking $(X_1, \dots, X_n)$, we meet the usual individual/variable table $\mathbf{X}$. Network inference 21
  • 32. Outline Introduction Motivations Background on omics Modeling issue Modeling tools Statistical dependence Graphical models Covariance selection and Gaussian vector Gaussian Graphical Models for genomic data Steady-state data Time-course data Statistical inference Penalized likelihood approach Inducing sparsity and regularization The Lasso Application in Post-genomics Modeling time-course data Illustrations Multitask learning Network inference 22
  • 33. Modeling relationship between variables (1) Independence Definition (Independence of events): two events $A$ and $B$ are independent if and only if $P(A, B) = P(A)P(B)$, usually denoted $A \perp\!\!\!\perp B$. Equivalently, $A \perp\!\!\!\perp B \Leftrightarrow P(A|B) = P(A)$, and $A \perp\!\!\!\perp B \Leftrightarrow P(A|B) = P(A|B^c)$. Example (class vs party):

  class        | Labour | Tory        class        | Labour | Tory
  working      |  0.42  | 0.28        working      |  0.60  | 0.40
  bourgeoisie  |  0.06  | 0.24        bourgeoisie  |  0.20  | 0.80

Table: Joint probability (left) vs. conditional probability of party given class (right). Network inference 23
  • 35. Modeling relationships between variables (2) Conditional independence Generalizing to more than two events requires strong assumptions (mutual independence); this is better handled with conditional independence. Definition (Conditional independence of events): two events $A$ and $B$ are conditionally independent given $C$ if and only if $P(A, B|C) = P(A|C)P(B|C)$, usually denoted $A \perp\!\!\!\perp B \,|\, C$. Example (Does IQ depend on weight?): consider the events A = “having low IQ” and B = “having low weight”. (Naively) estimating $P(A, B)$, $P(A)$ and $P(B)$ in a sample would lead to $P(A, B) \neq P(A)P(B)$. But in fact, introducing C = “having a given age”, $P(A, B|C) = P(A|C)P(B|C)$. Network inference 24
  • 39. Independence of random vectors (1) Independence and conditional independence: natural generalization Definition: consider 3 random vectors $X, Y, Z$ with densities $f_X, f_Y, f_Z$, and joint densities $f_{XY}, f_{XYZ}$. Then $X$ and $Y$ are independent iff $f_{XY}(x, y) = f_X(x)f_Y(y)$; $X$ and $Y$ are conditionally independent given $Z$ (for all $z$ with $f_Z(z) > 0$) iff $f_{XY|Z}(x, y; z) = f_{X|Z}(x; z)f_{Y|Z}(y; z)$. Proposition (Factorization criterion): $X$ and $Y$ are independent (resp. conditionally independent given $Z$) iff there exist functions $g$ and $h$ such that, for all $x$ and $y$, 1. $f_{XY}(x, y) = g(x)h(y)$, resp. 2. $f_{XYZ}(x, y, z) = g(x, z)h(y, z)$ for all $z$ with $f_Z(z) > 0$. Network inference 25
  • 41. Independence of random vectors (2) Independence vs conditional independence [Figure: graphical models for $f: X \perp\!\!\!\perp Y|Z$, $f: X \perp\!\!\!\perp Z|Y$, $f: Y \perp\!\!\!\perp Z|X$, the full joint $f_{XYZ}$, and the mutually independent case $f = f_X f_Y f_Z$. Caption: mutual independence, conditional dependence, full dependence.] Network inference 26
  • 42. Outline Introduction Motivations Background on omics Modeling issue Modeling tools Statistical dependence Graphical models Covariance selection and Gaussian vector Gaussian Graphical Models for genomic data Steady-state data Time-course data Statistical inference Penalized likelihood approach Inducing sparsity and regularization The Lasso Application in Post-genomics Modeling time-course data Illustrations Multitask learning Network inference 27
  • 43. Definition A graphical model gives a graphical (intuitive) representation of the dependence structure of a probability distribution: graphical structure ↔ random variables/random vector. It links 1. a random vector (or a set of random variables) $X = \{X_1, \dots, X_p\}$ with distribution $P$, and 2. a graph $\mathcal{G} = (P, E)$ where $P = \{1, \dots, p\}$ is the set of nodes associated to each variable and $E$ is a set of edges describing the dependence relationships of $X \sim P$. Network inference 28
  • 45. Conditional Independence Graphs Definition: the conditional independence graph of a random vector $X$ is the undirected graph $\mathcal{G} = (P, E)$ with node set $P = \{1, \dots, p\}$, where $(i, j) \notin E \Leftrightarrow X_i \perp\!\!\!\perp X_j \,|\, X_{P \setminus \{i, j\}}$. Property: it owns the Markov property: any two subsets of variables separated by a third are independent conditionally on the variables in the third set. Network inference 29
  • 47. Conditional Independence Graphs An example Let $X_1, X_2, X_3, X_4$ be four random variables with joint probability density function $f_X(x) = \exp(u + x_1 + x_1 x_2 + x_2 x_3 x_4)$, with $u$ a given constant. Apply the factorization property: $f_X(x) = \exp(u) \cdot \exp(x_1 + x_1 x_2) \cdot \exp(x_2 x_3 x_4)$. Graphical representation: $\mathcal{G} = (P, E)$ such that $P = \{1, 2, 3, 4\}$ and $E = \{(1, 2), (2, 3), (3, 4), (2, 4)\}$. Network inference 30
  • 51. Directed Acyclic conditional independence Graph (DAG) Motivation Limitation of undirected graphs: sometimes an ordering on the variables is known, which allows one to break the symmetry in the graphical representation and introduce, in some sense, “causality” in the modeling. Consequences: each element of $E$ has to be directed, and there is no directed cycle in the graph; we thus deal with a directed acyclic graph (or DAG). Network inference 31
  • 52. Directed Acyclic conditional independence Graph (DAG) Definition Definition (Ordering): an ordering $\prec$ between variables $\{1, \dots, p\}$ is a relation such that: i) for all couples $(i, j)$, either $i \prec j$ or $j \prec i$; ii) $\prec$ is transitive; iii) $\prec$ is not reflexive. A natural ordering is obtained when variables are observed across time; a natural conditioning set for a pair of variables $(i, j)$ with $i \prec j$ is the past of $j$, denoted $P(j) = \{1, \dots, j - 1\}$. Definition (DAG): the directed conditional dependence graph of $X$ is the directed graph $\mathcal{G} = (P, E)$ where, for $(i, j)$ such that $i \prec j$, $(i, j) \notin E \Leftrightarrow X_j \perp\!\!\!\perp X_i \,|\, X_{P(j) \setminus \{i, j\}}$. Network inference 32
  • 54. Directed Acyclic conditional independence Graph (DAG) Factorization and Markov property Another view uses parent/descendant relationships to deal with the ordering of the nodes. The factorization property: $f_X(x) = \prod_{k=1}^{p} f_{X_k | \mathrm{pa}_k}(x_k \,|\, \mathrm{pa}_k)$, where $\mathrm{pa}_k$ are the parents of node $k$. Network inference 33
  • 55. Directed Acyclic conditional independence Graph (DAG) An example With nodes $x_1, \dots, x_7$, the factorization reads $f_X(x) = f_{X_1} f_{X_2} f_{X_3} f_{X_4|X_1,X_2,X_3} f_{X_5|X_1,X_3} f_{X_6|X_4} f_{X_7|X_4,X_5}$, each factor corresponding to one node given its parents. Network inference 34
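As an illustration of this factorization, here is a minimal numpy sketch of ancestral sampling from the example DAG. The linear-Gaussian form of each conditional $f_{X_k|\mathrm{pa}_k}$ is purely our assumption for the sake of a runnable example; the slide does not specify the conditionals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Parents of each node in the example DAG (numbered as on the slide)
parents = {1: [], 2: [], 3: [], 4: [1, 2, 3], 5: [1, 3], 6: [4], 7: [4, 5]}

def sample_dag(n, noise=0.5):
    """Ancestral sampling: draw each X_k given its parents, in topological order.
    Each conditional is taken linear-Gaussian here, purely for illustration."""
    X = {}
    for k in sorted(parents):  # 1..7 is already a topological order
        mean = sum(X[j] for j in parents[k]) if parents[k] else 0.0
        X[k] = mean + noise * rng.standard_normal(n)
    return np.column_stack([X[k] for k in sorted(X)])

samples = sample_dag(10_000)
print(samples.shape)  # (10000, 7)
```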
  • 63. Directed Acyclic conditional independence Graph (DAG) Markov property Local Markov property: for any $Y \notin \mathrm{de}_k$, where $\mathrm{de}_k$ are the descendants of $k$, $X_k \perp\!\!\!\perp Y \,|\, \mathrm{pa}_k$; that is, $X_k$ is conditionally independent of its non-descendants given its parents. Network inference 35
  • 64. Local Markov property: example In the DAG where $x_4$ has parents $\{x_2, x_3\}$ and $x_5$ has parent $x_3$, check that $x_4 \perp\!\!\!\perp x_5 \,|\, \{x_2, x_3\}$ by using the factorization property:
$$P(x_4 | x_5, x_2, x_3) = \frac{P(x_2, x_3, x_4, x_5)}{P(x_2, x_3, x_5)} = \frac{P(x_2)P(x_3)P(x_4|x_2,x_3)P(x_5|x_3)}{P(x_2)P(x_3)P(x_5|x_3)} = P(x_4|x_2, x_3).$$
Network inference 36
  • 66. Outline Introduction Motivations Background on omics Modeling issue Modeling tools Statistical dependence Graphical models Covariance selection and Gaussian vector Gaussian Graphical Models for genomic data Steady-state data Time-course data Statistical inference Penalized likelihood approach Inducing sparsity and regularization The Lasso Application in Post-genomics Modeling time-course data Illustrations Multitask learning Network inference 37
  • 67. Modeling the genomic data Gaussian assumption The data: the $n \times p$ matrix $\mathbf{X}$ with rows $x_i = (x_i^1, \dots, x_i^p)$. Assuming $f_X$ multivariate Gaussian greatly simplifies the inference: it naturally links independence and conditional independence to the covariance and partial covariance, and gives a straightforward interpretation to the graphical modeling previously considered. Network inference 38
  • 69. Start gently with the univariate Gaussian distribution The Gaussian distribution is the natural model for the expression level of a gene (noisy data). We note $X \sim \mathcal{N}(\mu, \sigma^2)$, so that $\mathbb{E}X = \mu$, $\mathrm{Var}\,X = \sigma^2$, and
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}(x - \mu)^2\right\}, \qquad \log f_X(x) = -\log\sqrt{2\pi\sigma^2} - \frac{1}{2\sigma^2}(x - \mu)^2.$$
Useless for modeling the distribution of expression levels for a whole bunch of genes. Network inference 39
  • 71. One step forward: bivariate Gaussian distribution We need the concepts of covariance and correlation. Let $X, Y$ be two real random variables. Definitions: $\mathrm{cov}(X, Y) = \mathbb{E}\big[(X - \mathbb{E}X)(Y - \mathbb{E}Y)\big] = \mathbb{E}(XY) - \mathbb{E}(X)\mathbb{E}(Y)$; $\rho_{XY} = \mathrm{cor}(X, Y) = \mathrm{cov}(X, Y)/\sqrt{\mathrm{Var}(X)\cdot\mathrm{Var}(Y)}$. Proposition: $\mathrm{cov}(X, X) = \mathrm{Var}(X) = \mathbb{E}[(X - \mathbb{E}X)^2]$; $\mathrm{cov}(X + Y, Z) = \mathrm{cov}(X, Z) + \mathrm{cov}(Y, Z)$; $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{cov}(X, Y)$; $X \perp\!\!\!\perp Y \Rightarrow \mathrm{cov}(X, Y) = 0$; $X \perp\!\!\!\perp Y \Leftrightarrow \mathrm{cov}(X, Y) = 0$ when $X, Y$ are (jointly) Gaussian. Network inference 40
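As a quick sanity check of these definitions, a minimal numpy sketch (the data and variable names are our own toy example) estimating covariance and correlation from a sample:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(5_000)
y = 0.8 * x + 0.6 * rng.standard_normal(5_000)  # corr(x, y) should be ~0.8

# Empirical covariance: E[(X - EX)(Y - EY)]
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
# Empirical correlation: cov / sqrt(Var(X) Var(Y))
rho_xy = cov_xy / np.sqrt(x.var() * y.var())

print(cov_xy, rho_xy)            # both ~0.8 for this construction
print(np.corrcoef(x, y)[0, 1])   # numpy's built-in estimate agrees
```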
  • 73. The bivariate Gaussian distribution
$$f_{XY}(x, y) = \frac{1}{2\pi\sqrt{\det\Sigma}} \exp\left\{-\frac{1}{2}\begin{pmatrix}x - \mu_1 & y - \mu_2\end{pmatrix}\Sigma^{-1}\begin{pmatrix}x - \mu_1 \\ y - \mu_2\end{pmatrix}\right\},$$
where $\Sigma$ is the variance/covariance matrix, symmetric and positive definite: $\Sigma = \begin{pmatrix}\mathrm{Var}(X) & \mathrm{cov}(X, Y) \\ \mathrm{cov}(Y, X) & \mathrm{Var}(Y)\end{pmatrix}$. If standardized, $\Sigma = \begin{pmatrix}1 & \rho_{XY} \\ \rho_{XY} & 1\end{pmatrix}$ and
$$f_{X,Y}(x, y) = \frac{1}{2\pi\sqrt{1 - \rho_{XY}^2}} \exp\left\{-\frac{1}{2(1 - \rho_{XY}^2)}\big(x^2 + y^2 - 2\rho_{XY}xy\big)\right\},$$
where $\rho_{XY}$ is the correlation between $X$ and $Y$ and describes the interaction between them. Network inference 41
  • 75. The bivariate Gaussian distribution The Covariance Matrix Let $X \sim \mathcal{N}(0, \Sigma)$ with unit variances; as $\rho_{XY}$ goes from $0$ (so $\Sigma = I_2$) to $0.9$ (so $\Sigma = \begin{pmatrix}1 & 0.9 \\ 0.9 & 1\end{pmatrix}$), the shape of the 2-D distribution evolves accordingly. Network inference 42
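A small numpy sketch of this effect (our own illustration, not from the slides): sample both covariance settings and compare the empirical correlations.

```python
import numpy as np

rng = np.random.default_rng(2)
for rho in (0.0, 0.9):
    Sigma = np.array([[1.0, rho], [rho, 1.0]])
    XY = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma, size=10_000)
    print(rho, np.corrcoef(XY.T)[0, 1])  # empirical correlation ~ rho
```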
  • 77. Full generalization: multivariate Gaussian vector We now need partial covariance and partial correlation. Let $X, Y, Z$ be real random variables. Definitions: $\mathrm{cov}(X, Y|Z) = \mathrm{cov}(X, Y) - \mathrm{cov}(X, Z)\mathrm{cov}(Y, Z)/\mathrm{Var}(Z)$;
$$\rho_{XY|Z} = \frac{\rho_{XY} - \rho_{XZ}\rho_{YZ}}{\sqrt{1 - \rho_{XZ}^2}\sqrt{1 - \rho_{YZ}^2}}.$$
These give the interaction between $X$ and $Y$ once the effect of $Z$ has been removed. Proposition: when $X, Y, Z$ are jointly Gaussian, $\mathrm{cov}(X, Y|Z) = 0 \Leftrightarrow \mathrm{cor}(X, Y|Z) = 0 \Leftrightarrow X \perp\!\!\!\perp Y \,|\, Z$. Network inference 43
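A minimal numpy sketch of the partial-correlation formula on a toy example of ours, where $X$ and $Y$ are correlated only through $Z$:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
z = rng.standard_normal(n)
x = z + 0.5 * rng.standard_normal(n)   # x and y depend on each other only via z
y = z + 0.5 * rng.standard_normal(n)

C = np.corrcoef(np.vstack([x, y, z]))
r_xy, r_xz, r_yz = C[0, 1], C[0, 2], C[1, 2]

# Partial correlation of (x, y) given z, via the slide's formula
rho_xy_z = (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))
print(r_xy, rho_xy_z)  # marginal corr ~0.8, partial corr ~0 (x indep. of y given z)
```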
  • 79. The multivariate Gaussian distribution This allows a model for the expression levels of a whole set of genes $P$. Gaussian vector: let $X \sim \mathcal{N}(\mu, \Sigma)$, and consider any block decomposition with $\{a, b\}$ a partition of $P$: $\Sigma = \begin{pmatrix}\Sigma_{aa} & \Sigma_{ab} \\ \Sigma_{ba} & \Sigma_{bb}\end{pmatrix}$. Then 1. $X_a$ is Gaussian with distribution $\mathcal{N}(\mu_a, \Sigma_{aa})$; 2. $X_a | X_b = x$ is Gaussian with distribution $\mathcal{N}(\mu_{a|b}, \Sigma_{a|b})$, both parameters being known in closed form. Network inference 44
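The slide leaves $\mu_{a|b}$ and $\Sigma_{a|b}$ unstated; for completeness, the standard Gaussian conditioning formulas (a well-known identity, not spelled out on the slide) are
$$\mu_{a|b} = \mu_a + \Sigma_{ab}\Sigma_{bb}^{-1}(x - \mu_b), \qquad \Sigma_{a|b} = \Sigma_{aa} - \Sigma_{ab}\Sigma_{bb}^{-1}\Sigma_{ba}.$$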
  • 80. Outline Introduction Motivations Background on omics Modeling issue Modeling tools Statistical dependence Graphical models Covariance selection and Gaussian vector Gaussian Graphical Models for genomic data Steady-state data Time-course data Statistical inference Penalized likelihood approach Inducing sparsity and regularization The Lasso Application in Post-genomics Modeling time-course data Illustrations Multitask learning Network inference 45
  • 82. Steady-state data: scheme Inference: from ≈ 10s of microarrays on ≈ 1000s of probes (“genes”), which interactions? Network inference 47
  • 83. Modeling the underlying distribution (1) Model for data generation: a microarray can be represented as a multivariate vector $X = (X_1, \dots, X_p) \in \mathbb{R}^p$; consider $n$ biological replicates in the same condition, which form a usual $n$-size sample $(X_1, \dots, X_n)$. Consequence: a Gaussian Graphical Model: $X \sim \mathcal{N}(\mu, \Sigma)$ with $X_1, \dots, X_n$ i.i.d. copies of $X$; $\Theta = (\theta_{ij})_{i,j \in P} = \Sigma^{-1}$ is called the concentration matrix. Network inference 48
  • 85. Modeling the underlying distribution (2) Interpretation as a GGM Multivariate Gaussian vector and covariance selection:
$$-\frac{\theta_{ij}}{\sqrt{\theta_{ii}\theta_{jj}}} = \mathrm{cor}\big(X_i, X_j \,|\, X_{P \setminus \{i,j\}}\big) = \rho_{ij|P\setminus\{i,j\}}.$$
Graphical interpretation: the matrix $\Theta = (\theta_{ij})_{i,j \in P}$ encodes the network $\mathcal{G}$ we are looking for: conditional dependency between $X_i$ and $X_j$ (an edge $i - j$), i.e. non-null partial correlation between $X_i$ and $X_j$, if and only if $\theta_{ij} \neq 0$. Network inference 49
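A minimal numpy sketch of this identity on a toy example of ours: build a sparse precision matrix, sample from the corresponding Gaussian, and recover the partial correlations.

```python
import numpy as np

# Toy concentration (precision) matrix: zeros encode absent edges
Theta = np.array([[2.0, 0.6, 0.0],
                  [0.6, 2.0, 0.6],
                  [0.0, 0.6, 2.0]])
Sigma = np.linalg.inv(Theta)

rng = np.random.default_rng(4)
X = rng.multivariate_normal(np.zeros(3), Sigma, size=100_000)

# Estimate Theta from data and convert to partial correlations:
# rho_ij|rest = -theta_ij / sqrt(theta_ii * theta_jj)
Theta_hat = np.linalg.inv(np.cov(X.T))
d = np.sqrt(np.diag(Theta_hat))
partial_corr = -Theta_hat / np.outer(d, d)
# Off-diagonal entries ~0 exactly where theta_ij = 0, e.g. the pair (1, 3)
print(np.round(partial_corr, 2))
```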
  • 87. Outline Introduction Motivations Background on omics Modeling issue Modeling tools Statistical dependence Graphical models Covariance selection and Gaussian vector Gaussian Graphical Models for genomic data Steady-state data Time-course data Statistical inference Penalized likelihood approach Inducing sparsity and regularization The Lasso Application in Post-genomics Modeling time-course data Illustrations Multitask learning Network inference 50
  • 88. Time-course data: scheme Inference: from ≈ 10s of microarrays collected over time ($t_0, t_1, \dots, t_n$) on ≈ 1000s of probes (“genes”), which interactions? Network inference 51
  • 89. Modeling time-course data with DAG Collecting gene expression: 1. follow-up of one single experiment/individual; 2. close enough time-points to ensure dependency between consecutive measurements and homogeneity of the Markov process. [Figure: the network $\mathcal{G}$ over genes $X_1, \dots, X_5$ stands for the directed graph from $X^t$ to $X^{t+1}$, unrolled along the whole series $X^1, X^2, \dots, X^n$.] Network inference 52
  • 92. DAG: remark [Figure: the time-unrolled graph from $X^t$ to $X^{t+1}$ versus the gene network $\mathcal{G}$.] The gene network may contain a cycle (argh, there is a cycle :'( ), yet the time-unrolled graph is indeed a DAG: this overcomes the rather restrictive acyclic requirement. Network inference 53
  • 93. Modeling the underlying distribution (1) Model for data generation: a microarray can be represented as a multivariate vector $X = (X_1, \dots, X_p) \in \mathbb{R}^p$, generated through a first-order vector autoregressive process VAR(1): $X^t = \Theta X^{t-1} + b + \varepsilon^t$, $t \in [1, n]$, where $\varepsilon^t$ is a white noise ensuring the Markov property and $X^0 \sim \mathcal{N}(0, \Sigma_0)$. Consequence: a Gaussian Graphical Model: each $X^t | X^{t-1} \sim \mathcal{N}(\Theta X^{t-1}, \Sigma)$ or, equivalently, coordinatewise, $X_j^t | X^{t-1} \sim \mathcal{N}(\Theta_j X^{t-1}, \sigma_j^2)$, where $\Sigma$ is known and $\Theta_j$ is the $j$th row of $\Theta$. Network inference 54
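A minimal numpy sketch simulating this VAR(1) data-generating process, with toy parameters of our own choosing that mirror the sparse $\Theta$ example on the next slide:

```python
import numpy as np

rng = np.random.default_rng(5)
p, n = 3, 200

# Toy parameters: the sparse Theta encodes the directed edges j -> i
Theta = np.array([[0.5, 0.3, 0.0],
                  [0.4, 0.0, 0.0],
                  [0.0, 0.3, 0.0]])
b = np.zeros(p)

X = np.zeros((n + 1, p))
X[0] = rng.standard_normal(p)            # X^0 ~ N(0, Sigma_0), here Sigma_0 = I
for t in range(1, n + 1):
    eps = 0.1 * rng.standard_normal(p)   # white noise
    X[t] = Theta @ X[t - 1] + b + eps    # X^t = Theta X^{t-1} + b + eps^t

print(X.shape)  # (201, 3): one trajectory with n time points after X^0
```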
  • 95. Modeling the underlying distribution (2) Componentwise, the VAR(1) model reads
$$X_t^i = \sum_{j=1}^{p} \theta_{ij}\, X_{t-1}^j + b_i + \varepsilon_t^i, \quad i = 1, \dots, p.$$
Example: $\Theta = \begin{pmatrix}\theta_{11} & \theta_{12} & 0 \\ \theta_{21} & 0 & 0 \\ 0 & \theta_{32} & 0\end{pmatrix}$. Network inference 55
  • 96. Modeling the underlying distribution (3) Interpretation as a GGM The VAR(1) as a covariance selection model:
$$\theta_{ij} = \frac{\mathrm{cov}\big(X_i^t, X_j^{t-1} \,|\, X_{P\setminus j}^{t-1}\big)}{\mathrm{var}\big(X_j^{t-1} \,|\, X_{P\setminus j}^{t-1}\big)}.$$
Graphical interpretation: the matrix $\Theta = (\theta_{ij})_{i,j \in P}$ encodes the network $\mathcal{G}$ we are looking for: conditional dependency between $X_j^{t-1}$ and $X_i^t$ (an edge $j \to i$), i.e. non-null partial correlation between $X_j^{t-1}$ and $X_i^t$, if and only if $\theta_{ij} \neq 0$. Network inference 56
  • 98. Outline Introduction Motivations Background on omics Modeling issue Modeling tools Statistical dependence Graphical models Covariance selection and Gaussian vector Gaussian Graphical Models for genomic data Steady-state data Time-course data Statistical inference Penalized likelihood approach Inducing sparsity and regularization The Lasso Application in Post-genomics Modeling time-course data Illustrations Multitask learning Network inference 57
  • 100. The graphical models: remindera a for goldfish-like memories Assumption A microarray can be represented as a multivariate Gaussian vector X . Collecting gene expression 1. Steady-state data leads to an i.i.d. sample. 2. Time-course data gives a time series. Graphical interpretation i conditional dependency between X (i) and X (j ) if and only if or j non null partial correlation between X (i) and X (j ) Encoded in an unknown matrix of parameters ⇥. Network inference 59
• 102. The graphical models: reminder (continued)
For time-course data the interpretation is directed: conditional dependency between $X_t(i)$ and $X_{t-1}(j)$ if and only if non-null partial correlation between $X_t(i)$ and $X_{t-1}(j)$, again encoded in the unknown matrix of parameters Θ.
• 103. The Maximum likelihood estimator
The natural approach for parametric statistics: let X be a random vector with distribution defined by $f_X(x; \Theta)$, where Θ are the model parameters.
Maximum likelihood estimator:
$$\hat\Theta = \arg\max_\Theta L(\Theta; \mathbf{X}),$$
where L is the log-likelihood, a function of the parameters:
$$L(\Theta; \mathbf{X}) = \log \prod_{k=1}^n f_X(x_k; \Theta),$$
where $x_k$ is the k-th row of $\mathbf{X}$.
Remarks:
- this is a convex optimization problem,
- we just need to detect the non-zero coefficients in Θ.
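In the Gaussian graphical model of the previous slides, this log-likelihood has a closed form in the empirical covariance S; below is a minimal R sketch of it, up to additive constants (the function and argument names are ours, not from the slides).

    # L(Theta; S) = (n/2) * ( log det(Theta) - trace(S Theta) ), Theta positive definite
    loglik <- function(Theta, S, n) {
      logdet <- as.numeric(determinant(Theta, logarithm = TRUE)$modulus)
      (n / 2) * (logdet - sum(S * Theta))  # sum(S * Theta) = trace(S Theta) for symmetric S, Theta
    }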
• 104. The penalized likelihood approach
Let Θ be the parameters to infer (the edges). A penalized likelihood approach:
$$\hat\Theta = \arg\max_\Theta \; L(\Theta; \mathbf{X}) - \mathrm{pen}_{\ell_1}(\Theta),$$
- L is the model log-likelihood,
- $\mathrm{pen}_{\ell_1}$ is a penalty function tuned by $\lambda > 0$. It performs
1. regularization (needed when $n \ll p$),
2. selection (sparsity induced by the $\ell_1$-norm).
• 106. Outline
  Introduction
      Motivations
      Background on omics
      Modeling issue
  Modeling tools
      Statistical dependence
      Graphical models
      Covariance selection and Gaussian vector
  Gaussian Graphical Models for genomic data
      Steady-state data
      Time-course data
  Statistical inference
      Penalized likelihood approach
      Inducing sparsity and regularization
      The Lasso
  Application in Post-genomics
      Modeling time-course data
      Illustrations
      Multitask learning
• 107. A Geometric View of Sparsity: Constrained Optimization
We basically want to solve a problem of the form
$$\max_{\beta_1, \beta_2} f(\beta_1, \beta_2; \mathbf{X}),$$
where f is typically a concave likelihood function. This is strictly equivalent to solving
$$\min_{\beta_1, \beta_2} g(\beta_1, \beta_2; \mathbf{X}),$$
where $g = -f$ is convex, for instance the squared loss in the OLS.
[Figure: level sets of the objective in the (β1, β2) plane.]
• 108. A Geometric View of Sparsity: Constrained Optimization
Now add a constraint:
$$\max_{\beta_1, \beta_2} f(\beta_1, \beta_2; \mathbf{X}) \quad \text{s.t.} \quad \Omega(\beta_1, \beta_2) \le c,$$
where Ω defines a domain that constrains β. This is equivalent to the penalized form
$$\max_{\beta_1, \beta_2} f(\beta_1, \beta_2; \mathbf{X}) - \lambda\, \Omega(\beta_1, \beta_2).$$
How shall we define Ω to induce sparsity?
[Figure: the constraint region Ω(β1, β2) ≤ c together with the level sets of f in the (β1, β2) plane.]
• 111. A Geometric View of Sparsity: Supporting Hyperplane
A hyperplane supports a set iff
- the set is contained in one of the two half-spaces it defines,
- the set has at least one point on the hyperplane.
There are supporting hyperplanes at all points of a convex set: they generalize tangents.
[Figures: supporting hyperplanes at several points of a convex constraint region in the (β1, β2) plane.]
• 117. A Geometric View of Sparsity: Dual Cone
The dual cone generalizes normals, and the shape of the dual cones determines the sparsity pattern.
[Figures: dual cones at several points of a convex constraint region in the (β1, β2) plane.]
• 121. Outline
  Introduction
      Motivations
      Background on omics
      Modeling issue
  Modeling tools
      Statistical dependence
      Graphical models
      Covariance selection and Gaussian vector
  Gaussian Graphical Models for genomic data
      Steady-state data
      Time-course data
  Statistical inference
      Penalized likelihood approach
      Inducing sparsity and regularization
      The Lasso
  Application in Post-genomics
      Modeling time-course data
      Illustrations
      Multitask learning
• 122. The LASSO
- R. Tibshirani, 1996. The Lasso: Least Absolute Shrinkage and Selection Operator.
- S. Chen, D. Donoho, M. Saunders, 1995. Basis Pursuit.
- Weisberg, 1980. Forward stagewise regression.
$$\min_{\beta \in \mathbb{R}^p} \|y - X\beta\|_2^2 \quad \text{s.t.} \quad \|\beta\|_1 = |\beta_1| + \dots + |\beta_p| \le c,$$
which is equivalent to
$$\min_{\beta \in \mathbb{R}^p} \frac{1}{2}\|y - X\beta\|_2^2 + \lambda \|\beta\|_1.$$
[Figure: comparison of the solutions of problems regularized by an $\ell_1$ norm and by an $\ell_2$ norm.]
• 123. Orthogonal case and link to the OLS
OLS shrinkage: the Lasso has no analytical solution except in the orthogonal case, when $X^\top X = I$ (never met with real data):
$$\hat\beta_j^{\text{lasso}} = \mathrm{sign}\left(\hat\beta_j^{\text{ols}}\right)\, \max\left(0,\; \left|\hat\beta_j^{\text{ols}}\right| - \lambda\right).$$
[Figure: the Lasso estimate as a function of the OLS estimate; the identity line is shrunk towards zero and thresholded at ±λ.]
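A minimal R sketch of this soft-thresholding rule on a toy orthonormal design (built here with a QR decomposition; the data are simulated, so everything below is illustrative):

    soft_threshold <- function(beta_ols, lambda)
      sign(beta_ols) * pmax(0, abs(beta_ols) - lambda)

    set.seed(1)
    X <- qr.Q(qr(matrix(rnorm(100 * 4), 100, 4)))  # orthonormal columns: X'X = I
    beta <- c(3, -2, 0.5, 0)
    y <- drop(X %*% beta + rnorm(100, sd = 0.5))
    beta_ols <- drop(crossprod(X, y))              # OLS reduces to X'y when X'X = I
    soft_threshold(beta_ols, lambda = 1)           # small coefficients are set exactly to 0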
• 124. LARs: Least angle regression
B. Efron, T. Hastie, I. Johnstone, R. Tibshirani, 2004. Least Angle Regression.
An efficient algorithm to compute the Lasso solutions: the LARS solution consists of a curve giving the solution for each value of λ. It
- constructs a piecewise-linear path of solutions, starting from the null vector and moving towards the OLS estimate,
- costs (almost) the same as a single OLS fit,
- is well adapted to cross-validation (which helps us choose λ).
• 125. Example: prostate cancer I
Lasso solution path with lars:
> library(lars)
> load("prostate.rda")       # loads the predictors x and the response y
> x <- scale(as.matrix(x))   # center and standardize the predictors
> out <- lars(x, y)          # full Lasso path via the LARS algorithm
> plot(out)                  # coefficient paths against |beta|/max|beta|
• 126. Example: prostate cancer II
[Figure: Lasso coefficient paths for the prostate data; standardized coefficients of the 8 predictors plotted against the shrinkage fraction |beta|/max|beta|, with breakpoints at steps 0, 1, 3, 5, 6, 7, 8.]
• 127. Choice of the tuning parameter I
Model selection criteria:
$$\mathrm{BIC}(\lambda) = \|y - X\hat\beta_\lambda\|_2^2 + \log(n)\, \mathrm{df}(\hat\beta_\lambda),$$
$$\mathrm{AIC}(\lambda) = \|y - X\hat\beta_\lambda\|_2^2 + 2\, \mathrm{df}(\hat\beta_\lambda),$$
where $\mathrm{df}(\hat\beta_\lambda)$ is the number of nonzero entries in $\hat\beta_\lambda$ (a sketch evaluating BIC along the lars path follows).
Cross-validation:
1. split the data into K folds,
2. use each fold in turn as the testing set,
3. compute the test error on each fold,
4. average to obtain the CV estimate of the test error.
λ is chosen to minimize the CV test error.
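A minimal sketch reusing x, y and out from the prostate example above; note that the RSS-based BIC below ignores the noise variance, a simplifying assumption.

    fits <- predict(out, x)$fit        # fitted values at every step of the path
    rss  <- colSums((y - fits)^2)      # residual sum of squares per step
    df   <- rowSums(coef(out) != 0)    # number of nonzero coefficients per step
    bic  <- rss + log(nrow(x)) * df
    coef(out)[which.min(bic), ]        # coefficients at the BIC-optimal step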
• 128. Choice of the tuning parameter II
CV choice for λ:
> cv.lars(x, y, K=10)   # 10-fold cross-validation along the Lasso path
• 129. Choice of the tuning parameter III
[Figure: 10-fold cross-validated MSE, with error bars, as a function of the shrinkage fraction |beta|/max|beta|; the error decreases from about 1.6 and flattens out around 0.6, which guides the choice of λ.]
• 130. Many variations
- Group-Lasso: activates the variables by groups (given by the user).
- Adaptive/Weighted-Lasso: adjusts the penalty level of each variable, according to prior knowledge or with data-driven weights (see the sketch below).
- BoLasso: a bootstrapped version that removes false positives and stabilizes the estimate.
- etc., plus many theoretical results.
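For instance, the weighted variant can be tried with glmnet's penalty.factor argument; a minimal sketch of the adaptive Lasso with the common data-driven weights 1/|beta_ols|, reusing the prostate x and y and assuming n > p so that the initial OLS fit exists:

    library(glmnet)
    beta_ols <- coef(lm(y ~ x))[-1]             # initial OLS estimate (intercept dropped)
    w   <- 1 / abs(beta_ols)                    # small OLS coefficient => heavier penalty
    fit <- cv.glmnet(x, y, penalty.factor = w)  # per-variable penalty levels, lambda by CV
    coef(fit, s = "lambda.min")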
• 131. Outline
  Introduction
      Motivations
      Background on omics
      Modeling issue
  Modeling tools
      Statistical dependence
      Graphical models
      Covariance selection and Gaussian vector
  Gaussian Graphical Models for genomic data
      Steady-state data
      Time-course data
  Statistical inference
      Penalized likelihood approach
      Inducing sparsity and regularization
      The Lasso
  Application in Post-genomics
      Modeling time-course data
      Illustrations
      Multitask learning
• 133. Problem
From expression data collected at times $t_0, t_1, \dots, t_n$, infer which interactions hold between the genes, with
- on the order of tens of microarrays over time,
- on the order of thousands of probes ("genes").
The main statistical issue is the high-dimensional setting.
• 134. Handling the scarcity of the data
By introducing some prior. Priors should be biologically grounded:
1. few genes effectively interact (sparsity),
2. networks are organized (latent clustering).
[Figure: a sparse network over genes G1-G13.]
• 136. Handling the scarcity of the data
By introducing some prior. Priors should be biologically grounded:
1. few genes effectively interact (sparsity),
2. networks are organized (latent clustering).
[Figure: the same network redrawn with its latent organization, the nodes being grouped into clusters A, B and C.]
• 138. Penalized log-likelihood
Banerjee et al., JMLR 2008:
$$\hat\Theta = \arg\max_\Theta \; L_{iid}(\Theta; S) - \lambda \|\Theta\|_{\ell_1},$$
efficiently solved by the graphical Lasso of Friedman et al., 2008.
Ambroise, Chiquet, Matias, EJS 2009: use adaptive penalty parameters for the different coefficients,
$$\tilde L_{iid}(\Theta; S) - \lambda \|P_Z \star \Theta\|_{\ell_1},$$
where $P_Z$ is a matrix of weights depending on the underlying clustering Z and $\star$ denotes the entrywise product. Works with the pseudo-log-likelihood (computationally efficient).
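The first criterion can be tried directly in R with the glasso package; a minimal sketch, assuming a hypothetical n x p steady-state expression matrix expr and an illustrative penalty value:

    library(glasso)
    S   <- var(scale(expr))      # empirical covariance of the standardized profiles (expr is assumed)
    fit <- glasso(S, rho = 0.1)  # graphical Lasso; rho plays the role of lambda
    adj <- abs(fit$wi) > 1e-8    # fit$wi estimates the precision matrix Theta
    diag(adj) <- FALSE           # nonzero off-diagonal entries = inferred edges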
• 139. Neighborhood selection (1)
Let $X_i$ be the i-th column of $\mathbf{X}$, and $X_{\setminus i}$ be $\mathbf{X}$ deprived of $X_i$. Then
$$X_i = X_{\setminus i}\, \beta + \varepsilon, \quad \text{where } \beta_j = -\frac{\theta_{ij}}{\theta_{ii}}.$$
Meinshausen and Bühlmann, 2006: since $\mathrm{sign}(\mathrm{cor}_{ij \mid P\setminus\{i,j\}}) = \mathrm{sign}(\beta_j)$, select the neighbors of i with
$$\arg\min_\beta \; \frac{1}{n}\left\| X_i - X_{\setminus i}\, \beta \right\|_2^2 + \lambda \|\beta\|_{\ell_1}.$$
The sign pattern of Θ is inferred after a symmetrization step (see the sketch below).
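A hedged R sketch of neighborhood selection, with one penalized regression per gene and the symmetrization step as an OR rule; glmnet stands in for the original Lasso solver and expr is again an assumed n x p expression matrix:

    library(glmnet)
    p   <- ncol(expr)
    adj <- matrix(FALSE, p, p)
    for (i in 1:p) {
      fit  <- cv.glmnet(expr[, -i], expr[, i])             # Lasso of X_i on the other columns
      beta <- as.numeric(coef(fit, s = "lambda.min"))[-1]  # drop the intercept
      adj[i, -i] <- beta != 0                              # estimated neighbors of gene i
    }
    adj <- adj | t(adj)                                    # symmetrization (OR rule)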
• 140. Neighborhood selection (2)
The pseudo-log-likelihood of the i.i.d. Gaussian sample is
$$\tilde L_{iid}(\Theta; S) = \sum_{i=1}^p \log \left( \prod_{k=1}^n \mathbb{P}\left(X_k(i) \mid X_k(P \setminus i); \Theta_i\right) \right)$$
$$= \frac{n}{2} \log\det(D) - \frac{n}{2}\,\mathrm{Trace}\left( D^{-1/2}\, \Theta S \Theta\, D^{-1/2} \right) - \frac{n}{2} \log(2\pi),$$
where $D = \mathrm{diag}(\Theta)$.
Proposition:
$$\hat\Theta^{\text{pseudo}} = \arg\max_{\Theta} \; \tilde L_{iid}(\Theta; S) - \lambda \sum_{j \neq i} |\theta_{ij}|$$
has the same null entries as those inferred by neighborhood selection.
• 141. Structured regularization
Introduce prior knowledge by building the weights:
1. Build w from prior biological information:
   - transcription factors vs. regulatees,
   - number of potential binding sites,
   - KEGG pathways, Gene Ontology, ...
2. Build the weight matrix from a clustering algorithm:
   - infer the network G0 with w = 1 for each node,
   - apply a clustering algorithm on G0,
   - re-infer G with w built according to the clustering Z.
A sketch of this second step follows.
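A sketch of the two-pass scheme under stated assumptions: the packages (glasso, igraph), the clustering routine and the 0.5/1 weight values are illustrative choices, not the exact procedure of the paper; expr is a hypothetical n x p expression matrix.

    library(glasso)
    library(igraph)
    S  <- var(scale(expr))
    G0 <- abs(glasso(S, rho = 0.1)$wi) > 1e-8                # first pass with uniform weights
    diag(G0) <- FALSE; G0 <- G0 | t(G0)                      # clean adjacency of G0
    cl <- cluster_fast_greedy(
            graph_from_adjacency_matrix(G0 * 1, mode = "undirected"))
    Z  <- membership(cl)                                     # latent clustering of G0
    PZ <- outer(Z, Z, function(a, b) ifelse(a == b, 0.5, 1)) # lighter penalty within clusters
    G  <- abs(glasso(S, rho = 0.1 * PZ)$wi) > 1e-8           # re-infer with weighted penalties
    diag(G) <- FALSE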