Dependent Dirichlet processes
    and application to ecological data

                     Julyan Arbel
Joint work with Kerrie Mengersen & Judith Rousseau

          CREST-INSEE, Université Paris-Dauphine


                  2 December 2012
                    ERCIM 2012
          5th International Conference on
               Computing & Statistics
Biology question
                    Nonparametric model


Outline



  1   Biology question
         Introduction
         Data


  2   Nonparametric model
        Dirichlet process
        Dependent Dirichlet process




                            Julyan Arbel   DDP and ecological data
Biology introduction



    Series of measurements at different places around Casey Station,
    a permanent base in Antarctica.
    At each site: pollution level, and abundance of microbes called
    OTUs (operational taxonomic units).
    Goal: assess the impact of a pollutant on the soil composition /
    biodiversity.



Data

         Data consist of measurements of microbe abundance:

  Site     TPH   06251   00576   00429   06360   08793   06259   05164   00772
     1      80       3     724      88       1       0       0       0     467
     2      80       9    2364     252       0       0       2       0     616
     3      80      12     443    1655      11       0       0       0     168
   ...     ...     ...     ...     ...     ...     ...     ...     ...     ...
    13    2600    2262     339     229    1100     537     352       0       0
    20   10000    1883      23      18     879     224     325       9       1
    24   22000    1446       2      27     920    1808    1456       0       0

         Sample of the abundance of 8 microbes (columns) at 6 sites
         (rows).
         The main covariate is a pollution level called TPH, denoted x.

Notations


     Microbe species are indexed by j = 1, 2, ... in order of
     decreasing total abundance.
     At each site x, there are N(x) microbes, denoted $Y_i(x)$,
     $i = 1, \dots, N(x)$.
     Data are a frequency matrix, where $\#(Y_n(x) = j)$ is the number
     of microbes of species j observed at site x:

         Site   TPH      06251                      00576            ...
                         j = 1                      j                ...
         1      x = 80   #(Y_n(x = 80) = 1) = 3     ...              ...
         ...    ...      ...                        ...              ...
         k      x        ...                        #(Y_n(x) = j)    ...




Notations
  A standard example of diversity is Shannon diversity, taken as
  the exponential of Shannon entropy:

  \[ D(x) = \exp\Big( \sum_j -p_j(x) \log p_j(x) \Big), \qquad \text{with } p_j(x) = \frac{\#(Y_n(x) = j)}{N(x)}. \]

  [Figure: two panels plotted against TPH (x-axis "tph"). Left:
  Shannon entropy in the raw data. Right: Shannon diversity in the
  raw data.]
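
  As a worked illustration of the formula for D(x), the sketch below computes the Shannon diversity of each row of the sample table. It assumes only the 8 OTUs and 6 sites shown in that excerpt, so the values are not expected to match the figure, which presumably uses the full data set. The names `counts` and `shannon_diversity` are illustrative, not taken from the slides.

```python
import math

# Counts copied from the sample table: 8 OTUs (columns) at 6 sites (rows).
# Keys are (site, TPH); this is only the excerpt shown on the slide.
counts = {
    (1, 80): [3, 724, 88, 1, 0, 0, 0, 467],
    (2, 80): [9, 2364, 252, 0, 0, 2, 0, 616],
    (3, 80): [12, 443, 1655, 11, 0, 0, 0, 168],
    (13, 2600): [2262, 339, 229, 1100, 537, 352, 0, 0],
    (20, 10000): [1883, 23, 18, 879, 224, 325, 9, 1],
    (24, 22000): [1446, 2, 27, 920, 1808, 1456, 0, 0],
}


def shannon_diversity(site_counts):
    """Return exp(Shannon entropy) of the empirical species frequencies."""
    total = sum(site_counts)
    # Species with zero count contribute nothing (p log p -> 0 as p -> 0).
    entropy = -sum(
        (c / total) * math.log(c / total) for c in site_counts if c > 0
    )
    return math.exp(entropy)


diversity = {key: shannon_diversity(c) for key, c in counts.items()}
for (site, tph), d in diversity.items():
    print(f"site {site:2d}  TPH {tph:6d}  D = {d:.2f}")
```

  D(x) can be read as an effective number of species: it equals the number of species when all are equally abundant, and 1 when a single species is present.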

First model
  Pavlovian conditioning associated with the word "species" leads
  to the Dirichlet process and/or related processes.

  First, we run an independent model at each site with TPH x:

  \[ Y_i(x) \mid G \sim G, \qquad G(\cdot) = \sum_{j=1}^{\infty} p_j \delta_j(\cdot), \qquad (p_j)_j \sim \mathrm{GEM}(M). \]

  The GEM(M) distribution is defined in [Pitman, 2002] (GEM
  stands for Griffiths, Engen and McCloskey) and represents the
  distribution of the weights in a Dirichlet process:

  \[ p_j = V_j \prod_{l<j} (1 - V_l), \qquad V_j \sim \mathrm{Beta}(1, M). \]
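
  The stick-breaking definition of the GEM(M) weights can be sketched as follows. This is a minimal illustration, assuming independent breaks and a finite truncation level (the infinite sequence cannot be stored); `sample_gem_weights`, `truncation` and the value M = 5 are hypothetical choices, not taken from the slides.

```python
import random


def sample_gem_weights(M, truncation, rng):
    """Draw the first `truncation` weights of a GEM(M) sequence.

    Returns the weights and the leftover stick length, i.e. the total
    mass of all weights beyond the truncation level.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for _ in range(truncation):
        v = rng.betavariate(1.0, M)  # V_j ~ Beta(1, M)
        weights.append(v * remaining)  # p_j = V_j * prod_{l<j} (1 - V_l)
        remaining *= 1.0 - v
    return weights, remaining


rng = random.Random(0)
weights, leftover = sample_gem_weights(M=5.0, truncation=50, rng=rng)
print(f"largest weight: {max(weights):.3f}, mass beyond truncation: {leftover:.2e}")
```

  The leftover mass has expectation (M / (1 + M)) raised to the truncation level, so a larger M calls for a larger truncation.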

Posterior sampling



     We use a blocked Gibbs sampler (truncated version of the
     infinite sum).
     The prior on p is induced by the Beta prior on V,
     $\pi_{\perp}(V_j) = \mathrm{Be}(1, M)$.
     This prior is conjugate, with a Beta posterior:

     \[ \pi(V_j \mid Y) = \mathrm{Be}\big(V_j \mid 1 + \#(Y_n = j),\; M + \#(Y_n > j)\big). \]
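
  The conjugate update can be sketched for a single site as below. This is an illustration under stated assumptions: the truncation level is taken to be the number of species columns in the table excerpt, M = 1 is an arbitrary choice, and the function names are hypothetical. The counts are those of site 1 in the sample table.

```python
import random


def beta_posterior_params(site_counts, M):
    """Return (a_j, b_j) such that V_j | Y ~ Beta(a_j, b_j) for each j.

    a_j = 1 + #(Y_n = j) and b_j = M + #(Y_n > j), with species ordered
    as in `site_counts`.
    """
    params = []
    above = sum(site_counts)
    for n_j in site_counts:
        above -= n_j  # number of individuals with label strictly above j
        params.append((1.0 + n_j, M + above))
    return params


def sample_posterior_weights(site_counts, M, rng):
    """Draw one set of truncated weights p_j from the posterior."""
    weights, remaining = [], 1.0
    for a, b in beta_posterior_params(site_counts, M):
        v = rng.betavariate(a, b)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights


site_1 = [3, 724, 88, 1, 0, 0, 0, 467]  # site 1 (TPH = 80) in the table
params = beta_posterior_params(site_1, M=1.0)
posterior_weights = sample_posterior_weights(site_1, M=1.0, rng=random.Random(1))
print(params[0], params[-1])
```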




Second model

  However, we want to run a single model across all TPH levels x,
  which calls for a predictor-dependent model.
      Early references to predictor-dependent DP models include
      Cifarelli and Regazzini [1978] and Muliere and Petrone
      [1993].
      Interest has been increasing since MacEachern [1999, 2000, 2001].
      Extensions with varying weights include, among others, the
      order-based DDP [Griffin and Steel, 2006], the local DP [Chung
      and Dunson, 2009], weighted mixtures of DPs [Dunson and
      Park, 2008], and kernel stick-breaking processes [Dunson
      et al., 2007].



Second model

  We are only interested in dependence in the weights. We worked
  out a dependent process prior with a simple dependence structure
  on the weights:

  \[ Y_i(x) \mid G(x) \sim G(x), \qquad G(x)(\cdot) = \sum_{j=1}^{\infty} p_j(x) \delta_j(\cdot), \qquad (p_j(x))_j \sim \mathrm{DGEM}(M), \]

  \[ p_j(x) = V_j(x) \prod_{l<j} (1 - V_l(x)), \qquad V_j(x) \sim \mathrm{Beta}(1, M), \]

  where DGEM(M) stands for the Dependent GEM distribution.
  For each j, we want a process $(V_j(x))_x$ that is marginally
  Beta(1, M).

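Process on the beta breaks, $V_j(x)$
  Construction from Trippa, Müller and Johnson [2011].

  [Figure: overlapping regions around the covariate values x1, x2,
  x3, with pieces labelled α1, α2, α3, α12, α23, α123.]

  \[ V(x_1) = \frac{\Gamma(x_1)}{\Gamma(x_1) + \Gamma^M(x_1)}, \]

  \[ \Gamma(x_1) = \Gamma_1 + \Gamma_{12} + \Gamma_{123}, \qquad \Gamma^M(x_1) = \Gamma^M_1 + \Gamma^M_{12} + \Gamma^M_{123}, \]

  \[ \Gamma_1 \sim \mathrm{Ga}(\alpha_1), \dots, \Gamma_{123} \sim \mathrm{Ga}(\alpha_{123}), \qquad \Gamma^M_1 \sim \mathrm{Ga}(\alpha_1 M), \dots, \Gamma^M_{123} \sim \mathrm{Ga}(\alpha_{123} M). \]

  In the end:

  \[ p_j(x) = V_j(x) \prod_{l<j} (1 - V_l(x)) \sim \mathrm{DGEM}(M). \]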
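
  The Gamma construction can be sketched for three covariate values as follows. This is a minimal illustration, not the authors' code. It assumes independent Gamma variables with unit scale and hypothetical region masses `ALPHA`, chosen so that the masses covering any one site sum to 1; under that assumption Γ(x) ~ Ga(1) and Γ^M(x) ~ Ga(M), so each V(x) is marginally Beta(1, M), while sites that share regions share Gamma components and are therefore dependent.

```python
import random

# Hypothetical region masses (an assumption): each key lists the sites
# that a region covers, e.g. "12" is the piece shared by x1 and x2 only.
ALPHA = {"1": 0.5, "2": 0.2, "3": 0.5, "12": 0.3, "23": 0.3, "123": 0.2}


def sample_dependent_breaks(alpha, M, rng):
    """Draw (V(x1), V(x2), V(x3)) from the shared-Gamma construction."""
    # One pair of independent Gamma variables per region, unit scale.
    gam = {r: rng.gammavariate(a, 1.0) for r, a in alpha.items()}
    gam_m = {r: rng.gammavariate(a * M, 1.0) for r, a in alpha.items()}
    breaks = []
    for site in "123":
        # Sum the components of every region that covers this site.
        num = sum(g for r, g in gam.items() if site in r)
        den = num + sum(g for r, g in gam_m.items() if site in r)
        breaks.append(num / den)
    return breaks


rng = random.Random(2)
M = 4.0
draws = [sample_dependent_breaks(ALPHA, M, rng) for _ in range(5000)]
mean_v1 = sum(d[0] for d in draws) / len(draws)
print(f"empirical mean of V(x1): {mean_v1:.3f}, Beta(1, M) mean: {1 / (1 + M):.3f}")
```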

Interesting features

      This idea can be extended to high-dimensional covariate
      spaces.

      [Figure: overlapping regions around the points x1, x2, x3,
      with pieces labelled α1, α2, α3, α12, α23, α123.]

      Easy to simulate from: one only needs to simulate Gamma
      random variables.


Posterior sampling


     There is independence across j, so it suffices to be able to
     simulate from each posterior:

     \[ \pi(V_j \mid Y) \propto \pi(V_j)\, L(Y \mid V_j) \propto \pi(V_j) \prod_x V_j(x)^{\#(Y_n(x) = j)} (1 - V_j(x))^{\#(Y_n(x) > j)}. \]

     This is a rather uncommon situation: we can sample from the
     prior $\pi(V_j)$, but we cannot evaluate it. It is the reverse
     of Approximate Bayesian computation (ABC), where the
     likelihood is intractable but can be sampled from.
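
     The likelihood factor is the only part of the posterior that has to be evaluated, and it is cheap to compute on the log scale. The sketch below is an illustration for j = 1 (OTU 06251) at sites 13, 20 and 24 of the sample table; it assumes the 8-OTU excerpt is the whole community, and the function name is hypothetical.

```python
import math


def log_likelihood(v_j, n_equal, n_greater):
    """log L(Y | V_j) = sum over sites x of
    #(Y_n(x) = j) * log V_j(x) + #(Y_n(x) > j) * log(1 - V_j(x))."""
    total = 0.0
    for v, n_eq, n_gt in zip(v_j, n_equal, n_greater):
        if n_eq > 0:
            total += n_eq * math.log(v)
        if n_gt > 0:
            total += n_gt * math.log(1.0 - v)
    return total


# Sites 13, 20, 24 (TPH 2600, 10000, 22000), species j = 1 (OTU 06251).
n_equal = [2262, 1883, 1446]    # counts of species j
n_greater = [2557, 1479, 4213]  # counts of all species with label above j
mle = [e / (e + g) for e, g in zip(n_equal, n_greater)]
print(log_likelihood(mle, n_equal, n_greater))
print(log_likelihood([0.5, 0.5, 0.5], n_equal, n_greater))
```

     With counts in the thousands the likelihood is sharply peaked, which is why working with log values is preferable to multiplying raw probabilities.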



A first solution is to use a Metropolis-Hastings algorithm:
Metropolis-Hastings Algorithm
 1   Given a current value $V_j$, sample a new one $V_j^*$
     independently from the prior $\pi(V_j)$.
 2   The acceptance probability is

     \[ \rho = \min\left\{ 1, \frac{L(Y \mid V_j^*)}{L(Y \mid V_j)} \right\}. \]

     But proposing from the prior is not a good idea: the
     acceptance rate is low (around 1%).
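
     The two steps can be sketched as an independence sampler whose proposal is the prior, so that the prior terms cancel and only the likelihood ratio remains. This is a minimal illustration, not the authors' implementation: the region masses `ALPHA`, the value M = 2 and the small counts are hypothetical, chosen so that the example runs quickly.

```python
import math
import random

# Hypothetical region masses (an assumption); the masses covering any
# one of the three sites sum to 1, giving Beta(1, M) marginals.
ALPHA = {"1": 0.5, "2": 0.2, "3": 0.5, "12": 0.3, "23": 0.3, "123": 0.2}


def sample_prior(M, rng):
    """Draw (V_j(x1), V_j(x2), V_j(x3)) from the shared-Gamma prior."""
    gam = {r: rng.gammavariate(a, 1.0) for r, a in ALPHA.items()}
    gam_m = {r: rng.gammavariate(a * M, 1.0) for r, a in ALPHA.items()}
    out = []
    for site in "123":
        num = sum(g for r, g in gam.items() if site in r)
        den = num + sum(g for r, g in gam_m.items() if site in r)
        out.append(num / den)
    return out


def log_likelihood(v, n_equal, n_greater):
    total = 0.0
    for vx, n_eq, n_gt in zip(v, n_equal, n_greater):
        if vx <= 0.0 or vx >= 1.0:
            return float("-inf")  # guard against numerical edge cases
        total += n_eq * math.log(vx) + n_gt * math.log(1.0 - vx)
    return total


def metropolis_hastings(n_iter, M, n_equal, n_greater, rng):
    current = sample_prior(M, rng)
    ll_current = log_likelihood(current, n_equal, n_greater)
    chain, accepted = [], 0
    for _ in range(n_iter):
        proposal = sample_prior(M, rng)  # step 1: propose from the prior
        ll_proposal = log_likelihood(proposal, n_equal, n_greater)
        # Step 2: rho = min(1, L(Y | V*) / L(Y | V)), on the log scale.
        if rng.random() < math.exp(min(0.0, ll_proposal - ll_current)):
            current, ll_current = proposal, ll_proposal
            accepted += 1
        chain.append(current)
    return chain, accepted / n_iter


# Hypothetical small counts at three sites (not from the data set).
chain, rate = metropolis_hastings(
    2000, M=2.0, n_equal=[3, 5, 8], n_greater=[17, 15, 12], rng=random.Random(3)
)
print(f"acceptance rate: {rate:.3f}")
```

     With larger counts the likelihood concentrates on a small region that prior draws rarely hit, which is consistent with the low acceptance rate reported above.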



A better solution is to use Importance Sampling:
Importance Sampling
 1   Sample iid values $V_j$ from the prior $\pi(V_j)$.
 2   Weight the sample by the importance weights defined
     by the likelihood, $w(V_j) = L(Y \mid V_j)$.

     This gives an iid sample instead of a Markov chain.
     It gives better precision by a Rao-Blackwellisation argument
     (weights instead of accept-reject).
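
     The importance sampling scheme can be sketched in the same setting: draw from the prior, weight each draw by its likelihood, and form weighted averages. As before this is an illustration under assumptions: `ALPHA`, M = 2 and the small counts are hypothetical, and the effective sample size is added here only as a common diagnostic, not something reported in the slides.

```python
import math
import random

# Hypothetical region masses (an assumption); the masses covering any
# one of the three sites sum to 1, giving Beta(1, M) marginals.
ALPHA = {"1": 0.5, "2": 0.2, "3": 0.5, "12": 0.3, "23": 0.3, "123": 0.2}


def sample_prior(M, rng):
    """Draw (V_j(x1), V_j(x2), V_j(x3)) from the shared-Gamma prior."""
    gam = {r: rng.gammavariate(a, 1.0) for r, a in ALPHA.items()}
    gam_m = {r: rng.gammavariate(a * M, 1.0) for r, a in ALPHA.items()}
    out = []
    for site in "123":
        num = sum(g for r, g in gam.items() if site in r)
        den = num + sum(g for r, g in gam_m.items() if site in r)
        out.append(num / den)
    return out


def log_likelihood(v, n_equal, n_greater):
    total = 0.0
    for vx, n_eq, n_gt in zip(v, n_equal, n_greater):
        if vx <= 0.0 or vx >= 1.0:
            return float("-inf")  # guard against numerical edge cases
        total += n_eq * math.log(vx) + n_gt * math.log(1.0 - vx)
    return total


def importance_sample(n_draws, M, n_equal, n_greater, rng):
    """Return prior draws and their normalised importance weights."""
    draws = [sample_prior(M, rng) for _ in range(n_draws)]  # step 1
    log_w = [log_likelihood(v, n_equal, n_greater) for v in draws]  # step 2
    top = max(log_w)  # subtract the maximum before exponentiating
    w = [math.exp(lw - top) for lw in log_w]
    total = sum(w)
    return draws, [wi / total for wi in w]


# Hypothetical small counts at three sites (not from the data set).
draws, weights = importance_sample(
    5000, M=2.0, n_equal=[3, 5, 8], n_greater=[17, 15, 12], rng=random.Random(4)
)
posterior_mean = [
    sum(w * v[k] for w, v in zip(weights, draws)) for k in range(3)
]
ess = 1.0 / sum(w * w for w in weights)  # effective sample size
print([round(m, 3) for m in posterior_mean], round(ess, 1))
```

     Every prior draw contributes to the estimate in proportion to its weight, whereas the accept-reject step of Metropolis-Hastings discards most proposals.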




Results
  [Figure: two panels plotted against TPH (x-axis "tph"). Left:
  dependent DP prior, posterior mean of the Shannon diversity by
  TPH, with 95% centred credible intervals (y-axis "Posterior
  diversity"). Right: Shannon diversity in the raw data (y-axis
  "Diversity in data").]

Conclusion




     Such a model gives probabilistic answers to questions
     about diversity, since we obtain a posterior sample.
     Using Gaussian processes transformed to Beta processes
     through the inverse CDF might speed up the posterior
     computations.
     A possible extension is to handle other covariates.




                          Julyan Arbel   DDP and ecological data

More Related Content

Similar to Arbel oviedo

Novel image fusion techniques using global and local kekre wavelet transforms
Novel image fusion techniques using global and local kekre wavelet transformsNovel image fusion techniques using global and local kekre wavelet transforms
Novel image fusion techniques using global and local kekre wavelet transformsIAEME Publication
 
OpenCL applications in genomics
OpenCL applications in genomicsOpenCL applications in genomics
OpenCL applications in genomicsUSC
 
image processing to detect worms
image processing to detect wormsimage processing to detect worms
image processing to detect wormsSynergy Vision
 
Analysis update for GENEVA meeting 2011
Analysis update for GENEVA meeting 2011Analysis update for GENEVA meeting 2011
Analysis update for GENEVA meeting 2011USC
 
Elizabeth Iorns - How Science Exchange promotes Open Science
Elizabeth Iorns - How Science Exchange promotes Open ScienceElizabeth Iorns - How Science Exchange promotes Open Science
Elizabeth Iorns - How Science Exchange promotes Open ScienceScience Exchange
 
Self Organinising neural networks
Self Organinising  neural networksSelf Organinising  neural networks
Self Organinising neural networksESCOM
 
Terminological cluster trees for Disjointness Axiom Discovery
Terminological cluster trees for Disjointness Axiom DiscoveryTerminological cluster trees for Disjointness Axiom Discovery
Terminological cluster trees for Disjointness Axiom DiscoveryGiuseppe Rizzo
 
Multimodal Image Processing in Cytology
Multimodal Image Processing in CytologyMultimodal Image Processing in Cytology
Multimodal Image Processing in CytologyUniversity of Zurich
 
Analysis Methods in Flow Cytometry
Analysis Methods in Flow CytometryAnalysis Methods in Flow Cytometry
Analysis Methods in Flow CytometryNikolas Pontikos
 
Classification of squamous cell cervical cytology
Classification of squamous cell cervical cytologyClassification of squamous cell cervical cytology
Classification of squamous cell cervical cytologykarthigailakshmi
 
Artificial Neural Networks_Bioinsspired_Algorithms_Nov 20.ppt
Artificial Neural Networks_Bioinsspired_Algorithms_Nov 20.pptArtificial Neural Networks_Bioinsspired_Algorithms_Nov 20.ppt
Artificial Neural Networks_Bioinsspired_Algorithms_Nov 20.pptAnonymous9etQKwW
 
Faster, More Effective Flowgraph-based Malware Classification
Faster, More Effective Flowgraph-based Malware ClassificationFaster, More Effective Flowgraph-based Malware Classification
Faster, More Effective Flowgraph-based Malware ClassificationSilvio Cesare
 

Similar to Arbel oviedo (15)

Novel image fusion techniques using global and local kekre wavelet transforms
Arbel oviedo

  • 1. Dependent Dirichlet processes and application to ecological data. Julyan Arbel. Joint work with Kerrie Mengersen & Judith Rousseau. CREST-INSEE, Université Paris-Dauphine. 2 December 2012, ERCIM 2012, 5th International Conference on Computing & Statistics.
  • 2. Outline. 1 Biology question: Introduction; Data. 2 Nonparametric model: Dirichlet process; Dependent Dirichlet process. Julyan Arbel, DDP and ecological data
  • 4. Biology introduction. Series of measurements at different places around Casey Station, a permanent base in Antarctica. At each site: pollution level, and abundance of microbes called OTUs. Assess the impact of a pollutant on the soil composition / biodiversity.
  • 8. Data. Data consist of measurements of microbe abundance:

        Site    TPH  06251  00576  00429  06360  08793  06259  05164  00772
           1     80      3    724     88      1      0      0      0    467
           2     80      9   2364    252      0      0      2      0    616
           3     80     12    443   1655     11      0      0      0    168
         ...    ...    ...    ...    ...    ...    ...    ...    ...    ...
          13   2600   2262    339    229   1100    537    352      0      0
          20  10000   1883     23     18    879    224    325      9      1
          24  22000   1446      2     27    920   1808   1456      0      0

    Sample of abundance of 8 microbes (columns) at 6 sites (rows). The main covariate is a pollution level called TPH, denoted x.
  • 11. Notations. Microbe species are denoted by $j = 1, 2, \ldots$, ordered by decreasing total abundance. At each site $x$, there are $N(x)$ microbes, denoted $Y_i(x)$, $i = 1, \ldots, N(x)$. Data are a frequency matrix with one row per site (sites $1, \ldots, k$, each with its TPH value $x$) and one column per species $j$ (for instance 06251 is $j = 1$). The entry for site $x$ and species $j$ is the count $\#(Y_n(x) = j)$; for example, at site 1, $\#(Y_n(x = 80) = 1) = 3$.
  • 13. Notations. A standard example of diversity is Shannon diversity, taken as the exponential of Shannon entropy: $D(x) = \exp\big(-\sum_j p_j(x) \log p_j(x)\big)$ with $p_j(x) = \#(Y_n(x) = j) / N(x)$. Figure: Left: Shannon entropy in raw data. Right: Shannon diversity in raw data. Both panels are plotted against TPH.
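The Shannon diversity formula on the slide above can be sketched in a few lines of Python. This is only an illustration: the function name `shannon_diversity` is hypothetical, and the example row uses just the 8 microbes shown in the data table rather than the full data set, so the resulting value is not the one plotted in the talk.

```python
import numpy as np

def shannon_diversity(counts):
    """Return exp(Shannon entropy) for a vector of species counts at one site."""
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()          # empirical proportions p_j(x)
    p = p[p > 0]                       # usual convention: 0 * log(0) = 0
    entropy = -np.sum(p * np.log(p))   # Shannon entropy
    return float(np.exp(entropy))      # Shannon diversity D(x)

# Counts of the 8 displayed microbes at site 1 (TPH = 80), taken from the table.
site1_counts = [3, 724, 88, 1, 0, 0, 0, 467]
d_site1 = shannon_diversity(site1_counts)

# Sanity check: a uniform community over 4 species has diversity 4.
d_uniform = shannon_diversity([5, 5, 5, 5])
```

With this definition, the diversity of a community lies between 1 (a single species) and the number of species present (all equally abundant).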
  • 17. First model. Pavlovian conditioning associated with the word species leads to the Dirichlet process and/or related processes. First, we run an independent model at each site with TPH $x$: $Y_i(x) \mid G \sim G$, $G(\cdot) = \sum_{j=1}^{\infty} p_j \delta_j(\cdot)$, $(p_j)_j \sim \mathrm{GEM}(M)$. The GEM(M) distribution is defined in [Pitman, 2002] (GEM stands for Griffiths, Engen and McCloskey) and represents the distribution of the weights in a Dirichlet process: $p_j = V_j \prod_{l<j} (1 - V_l)$, $V_j \sim \mathrm{Beta}(1, M)$.
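The stick-breaking definition of the GEM(M) weights above can be simulated directly. The sketch below assumes a finite truncation level (the last break is set to 1 so that the weights sum to one); the function name `sample_gem` and the values of `M` and `truncation` are illustrative choices, not taken from the talk.

```python
import numpy as np

def sample_gem(M, truncation, rng):
    """Draw truncated GEM(M) weights p_j = V_j * prod_{l<j} (1 - V_l)."""
    v = rng.beta(1.0, M, size=truncation)   # V_j ~ Beta(1, M)
    v[-1] = 1.0                             # truncation: the last stick takes what is left
    # remaining[j] = prod_{l<j} (1 - V_l), with an empty product equal to 1
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

rng = np.random.default_rng(0)
gem_weights = sample_gem(M=2.0, truncation=50, rng=rng)
```

Larger values of `M` spread the mass over more species; smaller values concentrate it on the first few.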
  • 20. Posterior sampling. We use a blocked Gibbs sampler (truncated version of the infinite sum). The prior on $p$ is induced by the Beta prior on $V$, $\pi_{\perp}(V_j) = \mathrm{Be}(1, M)$. This prior is conjugate, with a Beta posterior: $\pi(V_j \mid Y) = \mathrm{Be}(V_j \mid 1 + \#(Y_n = j),\, M + \#(Y_n > j))$.
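The conjugate update above translates into a single vectorised draw per site. The following is a minimal sketch, assuming the truncation level equals the number of observed species columns and using the 8 displayed counts at site 1 purely as an example; `sample_posterior_weights` and `M = 2.0` are hypothetical choices.

```python
import numpy as np

def sample_posterior_weights(counts, M, rng):
    """One posterior draw of the truncated weights given species counts at a site."""
    counts = np.asarray(counts, dtype=float)
    # tail[j] = #(Y_n > j): observations with a species label strictly above j
    tail = counts[::-1].cumsum()[::-1] - counts
    # V_j | Y ~ Beta(1 + #(Y_n = j), M + #(Y_n > j))
    v = rng.beta(1.0 + counts, M + tail)
    v[-1] = 1.0   # truncation of the infinite sum
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

rng = np.random.default_rng(1)
site1_counts = [3, 724, 88, 1, 0, 0, 0, 467]
posterior_draw = sample_posterior_weights(site1_counts, M=2.0, rng=rng)
```

Repeating the call gives a posterior sample of weight vectors, from which a posterior sample of any diversity index can be computed.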
  • 24. Second model. But we want to run a single model across TPH $x$, which means a predictor-dependent model. Early references to predictor-dependent DP models include Cifarelli and Regazzini [1978] and Muliere and Petrone [1993]. Interest has been increasing since MacEachern [1999, 2000, 2001]. Extensions with varying weights include, among others, order-based DDP [Griffin and Steel, 2006], local DP [Chung and Dunson, 2009], weighted mixtures of DP [Dunson and Park, 2008], and kernel stick-breaking processes [Dunson et al., 2007].
  • 28. Second model. We are only interested in a dependence in the weights. We worked out a dependent process prior with a simple structure of dependence on the weights: $Y_i(x) \mid G(x) \sim G(x)$, $G(x)(\cdot) = \sum_{j=1}^{\infty} p_j(x) \delta_j(\cdot)$, $(p_j(x))_j \sim \mathrm{DGEM}(M)$, with $p_j(x) = V_j(x) \prod_{l<j} (1 - V_l(x))$ and $V_j(x) \sim \mathrm{Beta}(1, M)$, where DGEM(M) stands for Dependent GEM distribution. We want a process for each $j$, $(V_j(x))_x$, which is marginally Beta(1, M).
  • 34. Process on the beta breaks, $V_j(x)$. Construction from Trippa, Müller and Johnson [2011]. [Diagram: regions labelled $\alpha_1, \alpha_2, \alpha_3, \alpha_{12}, \alpha_{23}, \alpha_{123}$ around the points $x_1, x_2, x_3$.] $V(x_1) = \Gamma(x_1) / (\Gamma(x_1) + \Gamma^M(x_1))$, with $\Gamma(x_1) = \Gamma_1 + \Gamma_{12} + \Gamma_{123}$ and $\Gamma^M(x_1) = \Gamma^M_1 + \Gamma^M_{12} + \Gamma^M_{123}$, where $\Gamma_1 \sim \mathrm{Ga}(\alpha_1), \ldots, \Gamma_{123} \sim \mathrm{Ga}(\alpha_{123})$ and $\Gamma^M_1 \sim \mathrm{Ga}(\alpha_1 M), \ldots, \Gamma^M_{123} \sim \mathrm{Ga}(\alpha_{123} M)$. In the end: $p_j(x) = V_j(x) \prod_{l<j} (1 - V_l(x)) \sim \mathrm{DGEM}(M)$.
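The gamma construction above can be sketched for three sites as follows. Several things here are assumptions rather than details from the talk: Ga(a) is read as a Gamma distribution with shape a and unit scale, the region masses in `ALPHA` are made-up values chosen so that the regions covering each site sum to 1 (which is what gives a Beta(1, M) marginal at each site), and the names `ALPHA`, `SITES` and `sample_breaks` are hypothetical.

```python
import numpy as np

# Hypothetical region masses; the regions covering each site sum to 1.
ALPHA = {"1": 0.4, "2": 0.2, "3": 0.4, "12": 0.2, "23": 0.2, "123": 0.4}
# A region covers site k when k appears in its label, e.g. site 1: "1", "12", "123".
SITES = {k: [r for r in ALPHA if k in r] for k in ("1", "2", "3")}

def sample_breaks(M, size, rng):
    """Draw dependent breaks V(x_1), V(x_2), V(x_3) from shared gamma variables."""
    g = {r: rng.gamma(a, size=size) for r, a in ALPHA.items()}        # Gamma_r ~ Ga(alpha_r)
    g_m = {r: rng.gamma(a * M, size=size) for r, a in ALPHA.items()}  # Gamma^M_r ~ Ga(alpha_r * M)
    breaks = {}
    for site, regions in SITES.items():
        num = sum(g[r] for r in regions)          # Gamma(x)
        den = num + sum(g_m[r] for r in regions)  # Gamma(x) + Gamma^M(x)
        breaks[site] = num / den                  # V(x)
    return breaks

rng = np.random.default_rng(3)
breaks = sample_breaks(M=3.0, size=20000, rng=rng)
mean_v1 = breaks["1"].mean()                                # Beta(1, M) has mean 1 / (1 + M)
corr_12 = np.corrcoef(breaks["1"], breaks["2"])[0, 1]       # dependence from shared regions
```

Sites that share more mass in common regions get more strongly correlated breaks, while each marginal stays Beta(1, M) under the stated assumption on the masses.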
  • 35. Interesting features. This idea can be extended to high-dimensional covariate spaces. [Diagram: regions labelled $\alpha_1, \alpha_2, \alpha_3, \alpha_{12}, \alpha_{23}, \alpha_{123}$ around the points $x_1, x_2, x_3$.] Easy to simulate: it only requires simulating Gamma random variables.
  • 36. Posterior sampling. There is independence across $j$, so it suffices to be able to simulate from each posterior: $\pi(\mathbf{V}_j \mid Y) \propto \pi(\mathbf{V}_j) L(Y \mid \mathbf{V}_j) \propto \pi(\mathbf{V}_j) \prod_x V_j(x)^{\#(Y_n(x) = j)} (1 - V_j(x))^{\#(Y_n(x) > j)}$. This is a quite uncommon situation: we can sample from the prior $\pi(\mathbf{V}_j)$, but we cannot evaluate it. It is the reverse of Approximate Bayesian computation (ABC), where the likelihood is intractable but can be sampled from.
  • 38. A first solution is to use a Metropolis-Hastings algorithm. Metropolis-Hastings Algorithm: (1) Given a current value $\mathbf{V}_j$, sample a new one $\mathbf{V}_j^*$ independently from the prior $\pi(\mathbf{V}_j)$. (2) The acceptance probability is $\rho = \min\{1, L(Y \mid \mathbf{V}_j^*) / L(Y \mid \mathbf{V}_j)\}$. But it is not a good idea to propose from the prior: the acceptance rate is low (around 1%).
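The two steps above can be sketched as an independence Metropolis-Hastings sampler. To keep the example short, the prior sampler `stand_in_prior` draws independent Beta(1, M) breaks at three sites; this is a stand-in, not the dependent prior of the talk, which would be plugged in the same way. The counts are those of species $j = 1$ (column 06251) at sites 1 to 3, computed only over the 8 displayed columns, and all function names are hypothetical.

```python
import numpy as np

def log_lik(v, n_eq, n_gt):
    """log L(Y | V_j) = sum_x [#(Y=j) log V_j(x) + #(Y>j) log(1 - V_j(x))]."""
    return float(np.sum(n_eq * np.log(v) + n_gt * np.log1p(-v)))

def stand_in_prior(rng):
    # Assumption: independent Beta(1, 2) breaks replace the dependent prior here.
    return rng.beta(1.0, 2.0, size=3)

def mh_prior_proposals(sample_prior, n_eq, n_gt, n_iter, rng):
    """Independence sampler: propose from the prior, accept with the likelihood ratio."""
    current = sample_prior(rng)
    cur_ll = log_lik(current, n_eq, n_gt)
    chain, accepted = [], 0
    for _ in range(n_iter):
        proposal = sample_prior(rng)
        prop_ll = log_lik(proposal, n_eq, n_gt)
        # Prior proposal: the prior cancels, so only the likelihood ratio remains.
        if np.log(rng.uniform()) < prop_ll - cur_ll:
            current, cur_ll = proposal, prop_ll
            accepted += 1
        chain.append(current)
    return np.array(chain), accepted / n_iter

n_eq = np.array([3.0, 9.0, 12.0])           # #(Y_n(x) = 1) at sites 1, 2, 3
n_gt = np.array([1280.0, 3234.0, 2277.0])   # #(Y_n(x) > 1) over the displayed columns
rng = np.random.default_rng(4)
chain, acc_rate = mh_prior_proposals(stand_in_prior, n_eq, n_gt, n_iter=2000, rng=rng)
```

Because the proposal ignores the data, most proposals land far from the region where the likelihood is large, which is the reason for the low acceptance rate mentioned on the slide.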
  • 40. A better solution is to use Importance Sampling. Importance Sampling: (1) Sample iid values $\mathbf{V}_j$ from the prior $\pi(\mathbf{V}_j)$. (2) Weight the sample by the importance weights defined by the likelihood, $w(\mathbf{V}_j) = L(Y \mid \mathbf{V}_j)$. Advantages: an iid sample instead of a Markov chain; better precision by a Rao-Blackwellisation argument (weights instead of accept-reject).
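The importance sampling scheme above can be sketched with self-normalised weights computed on the log scale for numerical stability. As before, `stand_in_prior` (independent Beta(1, 2) breaks at three sites) is only a placeholder for the dependent prior, the counts come from the 8 displayed columns for species $j = 1$, and the function names and the effective sample size diagnostic are illustrative additions.

```python
import numpy as np

N_EQ = np.array([3.0, 9.0, 12.0])           # #(Y_n(x) = 1) at sites 1, 2, 3
N_GT = np.array([1280.0, 3234.0, 2277.0])   # #(Y_n(x) > 1) over the displayed columns

def log_lik(v):
    """log of the importance weight w(V_j) = L(Y | V_j)."""
    return float(np.sum(N_EQ * np.log(v) + N_GT * np.log1p(-v)))

def stand_in_prior(rng):
    # Assumption: independent Beta(1, 2) breaks replace the dependent prior here.
    return rng.beta(1.0, 2.0, size=3)

def importance_sample(sample_prior, log_weight, n_draws, rng):
    """Return iid prior draws and their normalised likelihood weights."""
    draws = np.array([sample_prior(rng) for _ in range(n_draws)])
    log_w = np.array([log_weight(v) for v in draws])
    w = np.exp(log_w - log_w.max())   # subtract the max to avoid underflow
    return draws, w / w.sum()

rng = np.random.default_rng(2)
draws, weights = importance_sample(stand_in_prior, log_lik, 5000, rng)
post_mean = weights @ draws             # weighted posterior mean of V_1(x) at each site
ess = 1.0 / np.sum(weights ** 2)        # effective sample size of the weighted sample
```

Every prior draw contributes through its weight instead of being accepted or rejected, which is the sense in which the weighted estimate uses the sample more efficiently than the Metropolis-Hastings chain.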
  • 41. Results. Figure: Left: dependent DP prior: posterior mean of the Shannon diversity by TPH, with 95% centred credible intervals. Right: Shannon diversity in raw data. Both panels are plotted against TPH.
  • 42. Conclusion. Such a model allows us to give probabilistic answers to questions about diversity, as we get a posterior sample. The use of Gaussian processes transformed to Beta processes by the inverse CDF might speed up the posterior computations. Extension to handle other covariates.