Multivariate analyses & decoding

Kay Henning Brodersen
Translational Neuromodeling Unit (TNU)
Institute for Biomedical Engineering, University of Zurich & ETH Zurich
Machine Learning and Pattern Recognition Group
Department of Computer Science, ETH Zurich
http://people.inf.ethz.ch/bkay
Why multivariate?

Univariate approaches are excellent for localizing activations in individual voxels.



[Figure: bar plots of activity in voxels v1 and v2 under reward and no-reward conditions; one comparison is significant (*), the other is not (n.s.).]
                                                                                       2
Why multivariate?

Multivariate approaches can be used to examine responses that are jointly encoded
in multiple voxels.


[Figure: responses in voxels v1 and v2 to orange juice vs. apple juice; each voxel's response is n.s. on its own, but the joint pattern across v1 and v2 separates the two conditions.]
                                                                                    3
Why multivariate?

Multivariate approaches can utilize 'hidden' quantities such as coupling strengths.


[Figure: schematic of a dynamic causal model. Observed BOLD signals x1(t), x2(t), x3(t) are generated from hidden underlying neural activity z1(t), z2(t), z3(t), which is shaped by coupling strengths, a driving input u1(t), and a modulatory input u2(t).]
Friston, Harrison & Penny (2003) NeuroImage; Stephan & Friston (2007) Handbook of Brain Connectivity; Stephan et al. (2008) NeuroImage

                                                                                                                                         4
Overview


1 Introduction

2 Classification

3 Multivariate Bayes

4 Model-based analyses




                         5
Overview


1 Introduction

2 Classification

3 Multivariate Bayes

4 Model-based analyses




                         6
Encoding vs. decoding




context (cause or consequence) $X_t \in \mathbb{R}^d$ – e.g., condition, stimulus, response, prediction error
BOLD signal $Y_t \in \mathbb{R}^v$

encoding model: $g: X_t \to Y_t$
decoding model: $h: Y_t \to X_t$




                                                                   7
Regression vs. classification


Regression model: independent variables (regressors) are mapped by $f$ onto a continuous dependent variable.

Classification model: independent variables (features) are mapped by $f$ onto a categorical dependent variable (label).




                                                       8
Univariate vs. multivariate models

A univariate model considers a        A multivariate model considers
single voxel at a time.               many voxels at once.




univariate: context $X_t \in \mathbb{R}^d$ → BOLD signal $Y_t \in \mathbb{R}$ (a single voxel)
multivariate: context $X_t \in \mathbb{R}^d$ → BOLD signal $Y_t \in \mathbb{R}^v$, $v \gg 1$


Spatial dependencies between voxels   Multivariate models enable
are only introduced afterwards,       inferences on distributed responses
through random field theory.          without requiring focal activations.




                                                                              9
Prediction vs. inference

    The goal of prediction is to find                     The goal of inference is to decide
    a highly accurate encoding or                         between competing hypotheses.
    decoding function.




Examples of prediction: predicting a cognitive state using a brain-machine interface; predicting a subject-specific diagnostic status.
Examples of inference: comparing a model that links distributed neuronal activity to a cognitive state with a model that does not; weighing the evidence for sparse vs. distributed coding.

predictive density: $p(X_{new} \mid Y_{new}, X, Y) = \int p(X_{new} \mid Y_{new}, \theta)\, p(\theta \mid X, Y)\, d\theta$
marginal likelihood (model evidence): $p(X \mid Y) = \int p(X \mid Y, \theta)\, p(\theta)\, d\theta$


                                                                                                      10
Goodness of fit vs. complexity

Goodness of fit is the degree to which a model explains observed data.
Complexity is the flexibility of a model (including, but not limited to, its number of
parameters).


[Figure: data generated from a true function, fitted with models of 1 parameter (underfitting), 4 parameters (optimal), and 9 parameters (overfitting).]


We wish to find the model that optimally trades off goodness of fit and complexity.

Bishop (2007) PRML


                                                                                           11
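To make the trade-off concrete, here is a minimal MATLAB sketch (not part of the original slides) that fits polynomials with 1, 4, and 9 parameters to the same data; the vectors x and y are assumed to hold hypothetical observed data.

```matlab
% Minimal sketch of the fit/complexity trade-off, assuming x and y are
% column vectors of observed data (hypothetical variables).
degrees = [0 3 8];                    % 1, 4, and 9 parameters
xi = linspace(min(x), max(x), 200);
plot(x, y, 'ko'); hold on;            % the data
for d = degrees
    p = polyfit(x, y, d);             % least-squares polynomial of degree d
    plot(xi, polyval(p, xi));         % underfits, fits, or overfits the data
end
```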
Summary of modelling terminology
General Linear Model (GLM)
• mass-univariate encoding model
• to regress context onto brain activity and find clusters of similar effects

Dynamic Causal Modelling (DCM)
• multivariate encoding model
• to evaluate connectivity hypotheses

Classification
• multivariate decoding model
• to predict a categorical context label from brain activity

Multivariate Bayes (MVB)
• multivariate decoding model
• to evaluate anatomical and coding hypotheses



                                                                                                                   12
Overview


1 Introduction

2 Classification

3 Multivariate Bayes

4 Model-based analyses




                         13
Constructing a classifier

A principled way of designing a classifier would be to adopt a probabilistic approach:


              ๐‘Œ๐‘ก          ๐‘“        that ๐‘˜ which maximizes ๐‘ ๐‘‹ ๐‘ก = ๐‘˜ ๐‘Œ๐‘ก , ๐‘‹, ๐‘Œ



In practice, classifiers differ in terms of how strictly they implement this principle.

Generative classifiers use Bayes' rule to estimate $p(X_t \mid Y_t) \propto p(Y_t \mid X_t)\, p(X_t)$; examples: Gaussian Naïve Bayes, Linear Discriminant Analysis.
Discriminative classifiers estimate $p(X_t \mid Y_t)$ directly, without Bayes' theorem; examples: logistic regression, Relevance Vector Machine.
Discriminant classifiers estimate $f(Y_t)$ directly; examples: Fisher's Linear Discriminant, Support Vector Machine.



                                                                                              14
Support vector machine (SVM)

[Figure: a linear SVM finds a maximum-margin hyperplane separating two classes in voxel space (v1, v2); a nonlinear SVM uses a kernel to obtain a curved decision boundary.]
Vapnik (1999) Springer; Schölkopf et al. (2002) MIT Press

                                                                                      15
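As a rough illustration (not from the slides; assuming MATLAB's Statistics and Machine Learning Toolbox and hypothetical variable names), a linear and a nonlinear SVM could be trained on trial-wise patterns as follows:

```matlab
% Minimal sketch: linear vs. nonlinear SVM on fMRI patterns, assuming
% `features` is an [nTrials x nVoxels] matrix and `labels` an [nTrials x 1]
% vector of class labels.
linSvm = fitcsvm(features, labels);                           % linear kernel (default)
rbfSvm = fitcsvm(features, labels, 'KernelFunction', 'rbf');  % nonlinear (RBF) kernel
yhat   = predict(linSvm, features(1, :));                     % label predicted for one pattern
```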
Stages in a classification analysis




Stages: feature extraction → classification using cross-validation → performance evaluation.

Performance evaluation can use Bayesian mixed-effects inference: $p = 1 - P(\pi > \pi_0 \mid k, n)$ (mixed effects).
                                                                     16
Feature extraction for trial-by-trial classification

We can obtain trial-wise estimates of neural activity by filtering the data with a GLM.


$Y = X\beta + e$: the data $Y$ are modelled by a design matrix $X$ containing one boxcar regressor per trial and coefficients $\beta_1, \dots, \beta_p$; the estimate of the coefficient for the trial-2 regressor reflects activity on trial 2.




                                                                                            17
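A minimal sketch of this idea (not the SPM implementation; the variable names and the unconvolved boxcar regressors are simplifying assumptions, since in practice the regressors would be convolved with a haemodynamic response function):

```matlab
% Trial-wise GLM feature extraction, assuming Y is a [scans x voxels] data
% matrix and `onsets`/`duration` give each trial's timing in scans.
nScans  = size(Y, 1);
nTrials = numel(onsets);
X = zeros(nScans, nTrials);
for t = 1:nTrials
    X(onsets(t):onsets(t) + duration - 1, t) = 1;   % boxcar regressor for trial t
end
X = [X, ones(nScans, 1)];                           % constant term
beta = X \ Y;                                       % least-squares coefficient estimates
features = beta(1:nTrials, :);                      % one activity pattern per trial
```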
Cross-validation

The generalization ability of a classifier can be estimated using a resampling procedure
known as cross-validation. One example is 2-fold cross-validation:


[Figure: 100 examples split into 2 folds; in each fold, one half of the examples is used for training and the labels of the other half (marked ?) are predicted.]




                                                                                           18
Cross-validation

A more commonly used variant is leave-one-out cross-validation.




[Figure: with 100 examples, leave-one-out cross-validation uses 100 folds; in each fold a single example is held out for testing (?) and the remaining 99 are used for training, followed by performance evaluation across folds.]


                                                                                        19
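A minimal sketch of leave-one-out cross-validation (assuming the trial-wise `features` and `labels` from the previous sketch, and fitcsvm from the Statistics and Machine Learning Toolbox as an example classifier):

```matlab
% Leave-one-out cross-validation: train on n-1 trials, test on the held-out trial.
nTrials = size(features, 1);
correct = false(nTrials, 1);
for i = 1:nTrials
    train = true(nTrials, 1);
    train(i) = false;                                          % leave trial i out
    mdl = fitcsvm(features(train, :), labels(train));          % train on the remaining trials
    correct(i) = predict(mdl, features(i, :)) == labels(i);    % test on the held-out trial
end
k = sum(correct);        % number of correctly classified trials
n = nTrials;             % total number of trials
accuracy = k / n;
```

The counts k and n obtained here feed directly into the performance evaluation described on the next slides.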
Performance evaluation

     Single-subject study with ๐’ trials
  The most common approach is to assess how likely the obtained number of correctly
  classified trials could have occurred by chance.


Binomial test: $p = P(X \ge k \mid H_0) = 1 - B(k - 1 \mid n, \pi_0)$
In MATLAB: p = 1 - binocdf(k-1, n, pi_0)

$k$: number of correctly classified trials
$n$: total number of trials
$\pi_0$: chance level (typically 0.5)
$B$: binomial cumulative distribution function

                                                                                       20
Performance evaluation


[Figure: in a group study, each of subjects 1…m contributes trial-wise classification outcomes (correct/incorrect) across n trials; these subject-level results are combined at the population level.]
                                                                                          21
Performance evaluation

       Group study with ๐’Ž subjects, ๐’ trials each
In a group setting, we must account for both within-subjects (fixed-effects) and between-
subjects (random-effects) variance components.
Binomial test on concatenated data (fixed effects): $p = 1 - B(\textstyle\sum k \mid \sum n, \pi_0)$
Binomial test on averaged data (fixed effects): $p = 1 - B\!\left(\tfrac{1}{m}\textstyle\sum k \mid \tfrac{1}{m}\sum n, \pi_0\right)$
t-test on summary statistics (random effects): $t = \sqrt{m}\,\dfrac{\bar\pi - \pi_0}{\sigma_{m-1}}$, $\;p = 1 - t_{m-1}(t)$
Bayesian mixed-effects inference (mixed effects): $p = 1 - P(\pi > \pi_0 \mid k, n)$ (implementation available for download soon)

$\bar\pi$: sample mean of sample accuracies; $\sigma_{m-1}$: sample standard deviation; $\pi_0$: chance level (typically 0.5); $t_{m-1}$: cumulative Student's $t$-distribution


Brodersen, Mathys, Chumbley, Daunizeau, Ong, Buhmann, Stephan (under review)

                                                                                                              22
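For the random-effects (summary-statistics) approach, a minimal MATLAB sketch might look as follows (assuming `acc` is a hypothetical vector of the m subject-wise accuracies and that the Statistics Toolbox function tcdf is available):

```matlab
% One-sided t-test of subject-wise accuracies against chance (random effects).
pi0 = 0.5;                                      % chance level
m   = numel(acc);
t   = sqrt(m) * (mean(acc) - pi0) / std(acc);   % summary-statistic t-value
p   = 1 - tcdf(t, m - 1);                       % p-value under Student's t with m-1 df
```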
Spatial deployment of informative regions

Which brain regions are jointly informative of a cognitive state of interest?

Searchlight approach: a sphere is passed across the brain. At each location, the classifier is evaluated using only the voxels in the current sphere → map of t-scores.
(Nandy & Cordes (2003) MRM; Kriegeskorte et al. (2006) PNAS)

Whole-brain approach: a constrained classifier is trained on whole-brain data. Its voxel weights are related to their empirical null distributions using a permutation test → map of t-scores.
(Mourao-Miranda et al. (2005) NeuroImage; Lomakina et al. (in preparation))

                                                                                                    23
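A searchlight loop can be sketched in a few lines (a simplification, not the implementations cited above; `xyz` is assumed to hold voxel coordinates in mm, and `looAccuracy` is a hypothetical helper wrapping the leave-one-out procedure sketched earlier):

```matlab
% Searchlight sketch: evaluate the classifier on the voxels within a sphere
% centred on each voxel in turn (uses implicit expansion, R2016b or later).
radius  = 8;                                   % searchlight radius in mm
nVoxels = size(xyz, 1);
acc     = zeros(nVoxels, 1);
for v = 1:nVoxels
    d = sqrt(sum((xyz - xyz(v, :)).^2, 2));    % distance of every voxel to the centre
    sphere = d <= radius;                      % voxels inside the current sphere
    acc(v) = looAccuracy(features(:, sphere), labels);   % hypothetical helper
end
```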
Summary: research questions for classification

Overall classification accuracy: how well can a cognitive state be decoded, e.g. left or right button? truth or lie? healthy or ill? (accuracy per classification task)

Spatial deployment of discriminative regions: where in the brain is the discriminative information (e.g., accuracies ranging from 55% to 80% across regions)?

Temporal evolution of discriminability: when does accuracy rise above chance within a trial, e.g. relative to the time the participant indicates a decision?

Model-based classification: can model-derived features separate groups, e.g. {group 1, group 2}?
Pereira et al. (2009) NeuroImage, Brodersen et al. (2009) The New Collection

                                                                                                                       24
Overview


1 Introduction

2 Classification

3 Multivariate Bayes

4 Model-based analyses




                         25
Multivariate Bayes

SPM brings multivariate analyses into the conventional inference framework of
Bayesian hierarchical models and their inversion.




                                                              Mike West




                                                                                26
Multivariate Bayes

Multivariate analyses in SPM rest on the central tenet that inferences about how
the brain represents things reduce to model comparison.




[Figure: a decoding model relates some cause or consequence to brain data; model comparison then weighs competing hypotheses, e.g. sparse coding in orbitofrontal cortex vs. distributed coding in prefrontal cortex.]




To make the ill-posed regression problem tractable, MVB uses a prior on voxel
weights. Different priors reflect different coding hypotheses.

                                                                                                 27
From encoding to decoding


Encoding model (GLM): $A = X\beta$, $\;Y = TA + G\gamma + \varepsilon$. In summary: $Y = TX\beta + G\gamma + \varepsilon$.

Decoding model (MVB): $X = A\alpha$, $\;Y = TA + G\gamma + \varepsilon$. In summary: $TX = Y\alpha - G\gamma\alpha - \varepsilon\alpha$.
                                                                 28
Specifying the prior for MVB

1st level – spatial coding hypothesis $U$ ($n$ voxels × $u$ patterns), with voxel weights $\alpha = U\eta$. Each pattern expresses an assumption such as: voxel 2 is allowed to play a role; or voxel 3 is allowed to play a role, but only if its neighbours play similar roles.

2nd level – pattern covariance structure $\Sigma$: $p(\eta) = \mathcal{N}(\eta; 0, \Sigma)$ with $\Sigma = \sum_i \lambda_i C_i$.

Thus: $p(\alpha \mid \lambda) = \mathcal{N}_n(\alpha; 0, U\Sigma U^T)$ and $p(\lambda) = \mathcal{N}(\lambda; \pi, \Pi^{-1})$.
                                                                                              29
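The resulting prior covariance over the voxel weights can be written down in a few lines (a conceptual sketch, not the SPM implementation; U, C, and lambda are assumed to be given):

```matlab
% Prior covariance of the voxel weights alpha under a spatial coding
% hypothesis U ([n x u] patterns) and pattern covariance components C{i}
% ([u x u]) with hyperparameters lambda(i).
Sigma = zeros(size(U, 2));
for i = 1:numel(C)
    Sigma = Sigma + lambda(i) * C{i};   % Sigma = sum_i lambda_i C_i
end
priorCov = U * Sigma * U';              % cov(alpha) = U * Sigma * U'
```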
Inverting the model

Model inversion involves finding the posterior distribution over voxel weights $\alpha$. In MVB, this includes a greedy search for the optimal covariance structure that governs the prior over $\alpha$:

Partition #1: one subset $s^1$; $\Sigma = \lambda_1 C_1$
Partition #2: two subsets $s^1, s^2$; $\Sigma = \lambda_1 C_1 + \lambda_2 C_2$
Partition #3 (optimal): three subsets $s^1, s^2, s^3$; $\Sigma = \lambda_1 C_1 + \lambda_2 C_2 + \lambda_3 C_3$
                                                                                                      30
Example: decoding motion from visual cortex
MVB can be illustrated using SPM's attention-to-motion example dataset.

This dataset is based on a simple block design. There are three experimental factors:
• photic – display shows random dots
• motion – dots are moving
• attention – subjects asked to pay attention

[Figure: design matrix over scans, with regressors photic, motion, attention, and a constant.]

Buechel & Friston 1999 Cerebral Cortex
Friston et al. 2008 NeuroImage


                                                                                                 31
Multivariate Bayes in SPM




Step 1
After having specified and estimated
a model, use the Results button.

                                       Step 2
                                       Select the contrast to be decoded.

                                                                            32
Multivariate Bayes in SPM
                            Step 3
                            Pick a region of interest.




                                                         33
Multivariate Bayes in SPM

Step 4
Multivariate Bayes can be invoked from within the Multivariate section.

Step 5
Here, the region of interest is specified as a sphere around the cursor. The spatial prior implements a sparse coding hypothesis.
                                          34
Multivariate Bayes in SPM




Step 6
Results can be displayed using the BMS
button.



                                         35
Observations vs. predictions


                                 ๐‘น๐‘ฟ๐‘




                      (motion)




                                       36
Model evidence and voxel weights




[Figure: model evidence of the decoding model (log evidence = 3) and the corresponding posterior voxel weights.]
                                        37
Using MVB for point classification


                                     MVB may outperform
                                     conventional point
                                     classifiers when using a
                                     more appropriate coding
                                     hypothesis.




[Figure: comparison of MVB with a Support Vector Machine.]




                                                                38
Summary: research questions for MVB

 Where does the brain represent things?       How does the brain represent things?
 Evaluating competing anatomical hypotheses   Evaluating competing coding hypotheses




                                                                                       39
Overview


1 Introduction

2 Classification

3 Multivariate Bayes

4 Model-based analyses




                         40
Classification approaches by data representation

Structure-based classification: which anatomical structures allow us to separate patients and healthy controls?

Activation-based classification: which functional differences allow us to separate groups?

Model-based classification: how do patterns of hidden quantities (e.g., connectivity among brain regions) differ between groups?
                                                                         41
Generative embedding for model-based classification

step 1 – modelling: measurements from an individual subject are used to fit a subject-specific generative model (e.g., a DCM with regions A, B, C)
step 2 – embedding: the fitted model provides a subject representation in a model-based feature space (e.g., connection strengths A→B, A→C, B→B, B→C)
step 3 – classification: a classification model is trained on these features
step 4 – evaluation: the discriminability of the groups is assessed (classification accuracy)
step 5 – interpretation: which connection strengths are jointly discriminative?
 Brodersen, Haiss, Ong, Jung, Tittgemeyer, Buhmann, Weber, Stephan (2011) NeuroImage
 Brodersen, Schofield, Leff, Ong, Lomakina, Buhmann, Stephan (2011) PLoS Comput Biol


                                                                                                                                      42
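A minimal sketch of the embedding step (step 2), assuming each subject's DCM has been estimated with SPM and that the posterior means of the endogenous connections are stored in a field Ep.A, following SPM's DCM convention; the variable names are hypothetical:

```matlab
% Generative embedding: one row of connection strengths per subject.
m = numel(dcms);                  % dcms: cell array of estimated subject-specific DCMs
features = [];
for s = 1:m
    A = dcms{s}.Ep.A;             % posterior means of endogenous (A-matrix) connections
    features(s, :) = A(:)';       % subject s in the model-based feature space
end
% These features can then enter the cross-validated classification sketched earlier.
```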
Example: diagnosing stroke patients




[Figure: anatomical regions of interest, shown on a coronal slice at y = –26 mm.]
                                                            43
Example: diagnosing stroke patients




[Figure: dynamic causal model of the auditory system: stimulus input drives the medial geniculate body (MGB) bilaterally, which projects to Heschl's gyrus (HG/A1) and on to planum temporale (PT) in the left (L) and right (R) hemispheres.]
                                                          44
Multivariate analysis: connectional fingerprints




[Figure: connectional fingerprints (estimated connection strengths) for patients and controls.]
                                                        45
Dissecting diseases into physiologically distinct subgroups

[Figure: patients and controls plotted in a voxel-based contrast space (axes: contrast estimates at voxels (64,−24,4) mm, (−42,−26,10) mm, and (−56,−20,10) mm) and, after generative embedding, in a model-based parameter space (axes: L.HG → L.HG, R.HG → L.HG, and L.MGB → L.MGB connection strengths); the groups are better separated in the model-based space.]

classification accuracy using all voxels in the regions of interest: 75%
classification accuracy using all 23 model parameters: 98%
                                                                                                                                                             46
Discriminative features in model space




[Figure: the same auditory network model (MGB, HG/A1, PT in both hemispheres, with stimulus input), used to examine which features are discriminative.]
                                                          47
Discriminative features in model space




[Figure: the same network, with each connection marked as highly discriminative, somewhat discriminative, or not discriminative.]
                                                                     48
Generative embedding and DCM

Question 1 โ€“ What do the data tell us about hidden processes in the brain?
⇒ compute the posterior $p(\theta \mid y, m) = \dfrac{p(y \mid \theta, m)\, p(\theta \mid m)}{p(y \mid m)}$




Question 2 โ€“ Which model is best w.r.t. the observed fMRI data?
⇒ compute the model evidence $p(m \mid y) \propto p(y \mid m)\, p(m)$, where $p(y \mid m) = \int p(y \mid \theta, m)\, p(\theta \mid m)\, d\theta$

Question 3 โ€“ Which model is best w.r.t. an external criterion?
⇒ compute the classification accuracy (e.g., for discriminating patients from controls):
$p(h(y) = x \mid y) = \sum_{x_{\text{train}}} \iint p\big(h(y) = x \mid y, y_{\text{train}}, x_{\text{train}}\big)\, p(y)\, p(y_{\text{train}})\, p(x_{\text{train}})\, dy\, dy_{\text{train}}$




                                                                                                     49
Summary

Classification
• to assess whether a cognitive state is linked to patterns of activity
• to assess the spatial deployment of discriminative activity

Multivariate Bayes
• to evaluate competing anatomical hypotheses
• to evaluate competing coding hypotheses

Model-based analyses
• to assess whether groups differ in terms of patterns of connectivity
• to generate new grouping hypotheses


                                                          50

More Related Content

What's hot

Lecture8 - From CBR to IBk
Lecture8 - From CBR to IBkLecture8 - From CBR to IBk
Lecture8 - From CBR to IBkAlbert Orriols-Puig
ย 
HIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCS
HIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCSHIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCS
HIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCSAlbert Orriols-Puig
ย 
Chapter6.doc
Chapter6.docChapter6.doc
Chapter6.docbutest
ย 
Making the Most of Predictive Models
Making the Most of Predictive ModelsMaking the Most of Predictive Models
Making the Most of Predictive ModelsRajarshi Guha
ย 
Lecture15 - Advances topics on association rules PART II
Lecture15 - Advances topics on association rules PART IILecture15 - Advances topics on association rules PART II
Lecture15 - Advances topics on association rules PART IIAlbert Orriols-Puig
ย 
Predicting performance in Recommender Systems - Poster
Predicting performance in Recommender Systems - PosterPredicting performance in Recommender Systems - Poster
Predicting performance in Recommender Systems - PosterAlejandro Bellogin
ย 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learningbutest
ย 
Lecture3 - Machine Learning
Lecture3 - Machine LearningLecture3 - Machine Learning
Lecture3 - Machine LearningAlbert Orriols-Puig
ย 
Genetic algorithms
Genetic algorithms Genetic algorithms
Genetic algorithms Pradeep Kumar
ย 
HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...
HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...
HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...Albert Orriols-Puig
ย 
97cogsci.doc
97cogsci.doc97cogsci.doc
97cogsci.docbutest
ย 

What's hot (20)

Lecture8 - From CBR to IBk
Lecture8 - From CBR to IBkLecture8 - From CBR to IBk
Lecture8 - From CBR to IBk
ย 
Lecture24
Lecture24Lecture24
Lecture24
ย 
HIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCS
HIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCSHIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCS
HIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCS
ย 
Lecture23
Lecture23Lecture23
Lecture23
ย 
Chapter6.doc
Chapter6.docChapter6.doc
Chapter6.doc
ย 
Lecture7 - IBk
Lecture7 - IBkLecture7 - IBk
Lecture7 - IBk
ย 
Making the Most of Predictive Models
Making the Most of Predictive ModelsMaking the Most of Predictive Models
Making the Most of Predictive Models
ย 
Lecture15 - Advances topics on association rules PART II
Lecture15 - Advances topics on association rules PART IILecture15 - Advances topics on association rules PART II
Lecture15 - Advances topics on association rules PART II
ย 
24 poster
24 poster24 poster
24 poster
ย 
Lecture20
Lecture20Lecture20
Lecture20
ย 
Predicting performance in Recommender Systems - Poster
Predicting performance in Recommender Systems - PosterPredicting performance in Recommender Systems - Poster
Predicting performance in Recommender Systems - Poster
ย 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
ย 
Lecture3 - Machine Learning
Lecture3 - Machine LearningLecture3 - Machine Learning
Lecture3 - Machine Learning
ย 
Lecture18
Lecture18Lecture18
Lecture18
ย 
Jmora.di.oeg.3x1e
Jmora.di.oeg.3x1eJmora.di.oeg.3x1e
Jmora.di.oeg.3x1e
ย 
Lecture19
Lecture19Lecture19
Lecture19
ย 
Genetic algorithms
Genetic algorithms Genetic algorithms
Genetic algorithms
ย 
HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...
HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...
HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...
ย 
Lecture17
Lecture17Lecture17
Lecture17
ย 
97cogsci.doc
97cogsci.doc97cogsci.doc
97cogsci.doc
ย 

Viewers also liked

Reading the mindโ€™s eye: Decoding object information during mental imagery fr...
Reading the mindโ€™s eye:  Decoding object information during mental imagery fr...Reading the mindโ€™s eye:  Decoding object information during mental imagery fr...
Reading the mindโ€™s eye: Decoding object information during mental imagery fr...Thomas Serre
ย 
CCSP Response Letter 6.1.06
CCSP Response Letter 6.1.06CCSP Response Letter 6.1.06
CCSP Response Letter 6.1.06Obama White House
ย 
People: Known And Unknown
People: Known And UnknownPeople: Known And Unknown
People: Known And Unknownguimera
ย 
ๅฃใ‚ณใƒŸใƒžใƒผใ‚ฑใƒ†ใ‚ฃใƒณใ‚ฐใฎใŸใ‚ใฎๅŠฃใƒขใ‚ธใƒฅใƒฉ้–ขๆ•ฐใฎ่ฉฑ
ๅฃใ‚ณใƒŸใƒžใƒผใ‚ฑใƒ†ใ‚ฃใƒณใ‚ฐใฎใŸใ‚ใฎๅŠฃใƒขใ‚ธใƒฅใƒฉ้–ขๆ•ฐใฎ่ฉฑๅฃใ‚ณใƒŸใƒžใƒผใ‚ฑใƒ†ใ‚ฃใƒณใ‚ฐใฎใŸใ‚ใฎๅŠฃใƒขใ‚ธใƒฅใƒฉ้–ขๆ•ฐใฎ่ฉฑ
ๅฃใ‚ณใƒŸใƒžใƒผใ‚ฑใƒ†ใ‚ฃใƒณใ‚ฐใฎใŸใ‚ใฎๅŠฃใƒขใ‚ธใƒฅใƒฉ้–ขๆ•ฐใฎ่ฉฑHigashiyama Masahiko
ย 
"Playing and Teaching with Interactive Fiction" by Sherry Jones (April 12, 2015)
"Playing and Teaching with Interactive Fiction" by Sherry Jones (April 12, 2015)"Playing and Teaching with Interactive Fiction" by Sherry Jones (April 12, 2015)
"Playing and Teaching with Interactive Fiction" by Sherry Jones (April 12, 2015)Sherry Jones
ย 
INBOUND Bold Talks: David Meerman Scott
INBOUND Bold Talks: David Meerman ScottINBOUND Bold Talks: David Meerman Scott
INBOUND Bold Talks: David Meerman ScottHubSpot
ย 
เธชเธกเธฑเธ„เธฃเธ‡เธฒเธ™
เธชเธกเธฑเธ„เธฃเธ‡เธฒเธ™เธชเธกเธฑเธ„เธฃเธ‡เธฒเธ™
เธชเธกเธฑเธ„เธฃเธ‡เธฒเธ™findgooodjob
ย 
The State of the Union: Highlights
The State of the Union: HighlightsThe State of the Union: Highlights
The State of the Union: HighlightsLinkedIn Editors' Picks
ย 
PLAY to win the product development race. SERIOUSLY (Donna Denio and Dieter R...
PLAY to win the product development race. SERIOUSLY (Donna Denio and Dieter R...PLAY to win the product development race. SERIOUSLY (Donna Denio and Dieter R...
PLAY to win the product development race. SERIOUSLY (Donna Denio and Dieter R...ProductCamp Boston
ย 
Estatuas Pelo Mundo
 Estatuas Pelo Mundo Estatuas Pelo Mundo
Estatuas Pelo MundoMenotti Orlandi
ย 
Strategies for IND Filing Success -CMC
Strategies for IND Filing Success -CMCStrategies for IND Filing Success -CMC
Strategies for IND Filing Success -CMCSharon W. Ayd
ย 
Hope for Today Marketing
Hope for Today MarketingHope for Today Marketing
Hope for Today MarketingAllyson Watson
ย 
Bloom's Taxonomy Revised - Posters by Schrock
Bloom's Taxonomy Revised - Posters by SchrockBloom's Taxonomy Revised - Posters by Schrock
Bloom's Taxonomy Revised - Posters by SchrockAnubanchonburi School
ย 

Viewers also liked (18)

Reading the mindโ€™s eye: Decoding object information during mental imagery fr...
Reading the mindโ€™s eye:  Decoding object information during mental imagery fr...Reading the mindโ€™s eye:  Decoding object information during mental imagery fr...
Reading the mindโ€™s eye: Decoding object information during mental imagery fr...
ย 
CCSP Response Letter 6.1.06
CCSP Response Letter 6.1.06CCSP Response Letter 6.1.06
CCSP Response Letter 6.1.06
ย 
People: Known And Unknown
People: Known And UnknownPeople: Known And Unknown
People: Known And Unknown
ย 
ๅฃใ‚ณใƒŸใƒžใƒผใ‚ฑใƒ†ใ‚ฃใƒณใ‚ฐใฎใŸใ‚ใฎๅŠฃใƒขใ‚ธใƒฅใƒฉ้–ขๆ•ฐใฎ่ฉฑ
ๅฃใ‚ณใƒŸใƒžใƒผใ‚ฑใƒ†ใ‚ฃใƒณใ‚ฐใฎใŸใ‚ใฎๅŠฃใƒขใ‚ธใƒฅใƒฉ้–ขๆ•ฐใฎ่ฉฑๅฃใ‚ณใƒŸใƒžใƒผใ‚ฑใƒ†ใ‚ฃใƒณใ‚ฐใฎใŸใ‚ใฎๅŠฃใƒขใ‚ธใƒฅใƒฉ้–ขๆ•ฐใฎ่ฉฑ
ๅฃใ‚ณใƒŸใƒžใƒผใ‚ฑใƒ†ใ‚ฃใƒณใ‚ฐใฎใŸใ‚ใฎๅŠฃใƒขใ‚ธใƒฅใƒฉ้–ขๆ•ฐใฎ่ฉฑ
ย 
Erasmus ip june_2013
Erasmus ip june_2013Erasmus ip june_2013
Erasmus ip june_2013
ย 
"Playing and Teaching with Interactive Fiction" by Sherry Jones (April 12, 2015)
"Playing and Teaching with Interactive Fiction" by Sherry Jones (April 12, 2015)"Playing and Teaching with Interactive Fiction" by Sherry Jones (April 12, 2015)
"Playing and Teaching with Interactive Fiction" by Sherry Jones (April 12, 2015)
ย 
INBOUND Bold Talks: David Meerman Scott
INBOUND Bold Talks: David Meerman ScottINBOUND Bold Talks: David Meerman Scott
INBOUND Bold Talks: David Meerman Scott
ย 
Zaragoza turismo-48
Zaragoza turismo-48Zaragoza turismo-48
Zaragoza turismo-48
ย 
เธชเธกเธฑเธ„เธฃเธ‡เธฒเธ™
เธชเธกเธฑเธ„เธฃเธ‡เธฒเธ™เธชเธกเธฑเธ„เธฃเธ‡เธฒเธ™
เธชเธกเธฑเธ„เธฃเธ‡เธฒเธ™
ย 
The State of the Union: Highlights
The State of the Union: HighlightsThe State of the Union: Highlights
The State of the Union: Highlights
ย 
Zaragoza turismo 191
Zaragoza turismo 191Zaragoza turismo 191
Zaragoza turismo 191
ย 
PLAY to win the product development race. SERIOUSLY (Donna Denio and Dieter R...
PLAY to win the product development race. SERIOUSLY (Donna Denio and Dieter R...PLAY to win the product development race. SERIOUSLY (Donna Denio and Dieter R...
PLAY to win the product development race. SERIOUSLY (Donna Denio and Dieter R...
ย 
Science2.0 bcg10
Science2.0 bcg10Science2.0 bcg10
Science2.0 bcg10
ย 
Estatuas Pelo Mundo
 Estatuas Pelo Mundo Estatuas Pelo Mundo
Estatuas Pelo Mundo
ย 
Strategies for IND Filing Success -CMC
Strategies for IND Filing Success -CMCStrategies for IND Filing Success -CMC
Strategies for IND Filing Success -CMC
ย 
Hope for Today Marketing
Hope for Today MarketingHope for Today Marketing
Hope for Today Marketing
ย 
Bloom's Taxonomy Revised - Posters by Schrock
Bloom's Taxonomy Revised - Posters by SchrockBloom's Taxonomy Revised - Posters by Schrock
Bloom's Taxonomy Revised - Posters by Schrock
ย 
Tell Your Story Capabilities & Cases 2015
Tell Your Story Capabilities & Cases 2015Tell Your Story Capabilities & Cases 2015
Tell Your Story Capabilities & Cases 2015
ย 

Similar to Multivariate analyses & decoding

Presentation on Machine Learning and Data Mining
Presentation on Machine Learning and Data MiningPresentation on Machine Learning and Data Mining
Presentation on Machine Learning and Data Miningbutest
ย 
Sparse inverse covariance estimation using skggm
Sparse inverse covariance estimation using skggmSparse inverse covariance estimation using skggm
Sparse inverse covariance estimation using skggmManjari Narayan
ย 
Vensoft IEEE 2014 2015 Matlab Projects tiltle Image Processing Wireless Signa...
Vensoft IEEE 2014 2015 Matlab Projects tiltle Image Processing Wireless Signa...Vensoft IEEE 2014 2015 Matlab Projects tiltle Image Processing Wireless Signa...
Vensoft IEEE 2014 2015 Matlab Projects tiltle Image Processing Wireless Signa...Vensoft Technologies
ย 
Deep Generative Models
Deep Generative Models Deep Generative Models
Deep Generative Models Chia-Wen Cheng
ย 
Deep learning from a novice perspective
Deep learning from a novice perspectiveDeep learning from a novice perspective
Deep learning from a novice perspectiveAnirban Santara
ย 
A Dual congress Psychiatry and the Neurosciences
A Dual congress Psychiatry and the NeurosciencesA Dual congress Psychiatry and the Neurosciences
A Dual congress Psychiatry and the NeurosciencesMedicineAndHealthNeurolog
ย 
Alz forum webinar_4-10-12_raj
Alz forum webinar_4-10-12_rajAlz forum webinar_4-10-12_raj
Alz forum webinar_4-10-12_rajAlzforum
ย 
[PR12] understanding deep learning requires rethinking generalization
[PR12] understanding deep learning requires rethinking generalization[PR12] understanding deep learning requires rethinking generalization
[PR12] understanding deep learning requires rethinking generalizationJaeJun Yoo
ย 
Building Neural Network Through Neuroevolution
Building Neural Network Through NeuroevolutionBuilding Neural Network Through Neuroevolution
Building Neural Network Through Neuroevolutionbergel
ย 
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning Wanjin Yu
ย 
Unit I & II in Principles of Soft computing
Unit I & II in Principles of Soft computing Unit I & II in Principles of Soft computing
Unit I & II in Principles of Soft computing Sivagowry Shathesh
ย 
SBML and related resources โ€จand standardization efforts
SBML and related resources โ€จand standardization effortsSBML and related resources โ€จand standardization efforts
SBML and related resources โ€จand standardization effortsMike Hucka
ย 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Ian Morgan
ย 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Bayes Nets meetup London
ย 
Intelligent Tutoring Systems: The DynaLearn Approach
Intelligent Tutoring Systems: The DynaLearn ApproachIntelligent Tutoring Systems: The DynaLearn Approach
Intelligent Tutoring Systems: The DynaLearn ApproachWouter Beek
ย 
Topic model, LDA and all that
Topic model, LDA and all thatTopic model, LDA and all that
Topic model, LDA and all thatZhibo Xiao
ย 
SBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resourcesSBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resourcesMike Hucka
ย 
Main single agent machine learning algorithms
Main single agent machine learning algorithmsMain single agent machine learning algorithms
Main single agent machine learning algorithmsbutest
ย 

Similar to Multivariate analyses & decoding (20)

Presentation on Machine Learning and Data Mining
Presentation on Machine Learning and Data MiningPresentation on Machine Learning and Data Mining
Presentation on Machine Learning and Data Mining
ย 
Sparse inverse covariance estimation using skggm
Sparse inverse covariance estimation using skggmSparse inverse covariance estimation using skggm
Sparse inverse covariance estimation using skggm
ย 
2224d_final
2224d_final2224d_final
2224d_final
ย 
Vensoft IEEE 2014 2015 Matlab Projects tiltle Image Processing Wireless Signa...
Vensoft IEEE 2014 2015 Matlab Projects tiltle Image Processing Wireless Signa...Vensoft IEEE 2014 2015 Matlab Projects tiltle Image Processing Wireless Signa...
Vensoft IEEE 2014 2015 Matlab Projects tiltle Image Processing Wireless Signa...
ย 
Deep Generative Models
Deep Generative Models Deep Generative Models
Deep Generative Models
ย 
Deep learning from a novice perspective
Deep learning from a novice perspectiveDeep learning from a novice perspective
Deep learning from a novice perspective
ย 
A Dual congress Psychiatry and the Neurosciences
A Dual congress Psychiatry and the NeurosciencesA Dual congress Psychiatry and the Neurosciences
A Dual congress Psychiatry and the Neurosciences
ย 
Alz forum webinar_4-10-12_raj
Alz forum webinar_4-10-12_rajAlz forum webinar_4-10-12_raj
Alz forum webinar_4-10-12_raj
ย 
[PR12] understanding deep learning requires rethinking generalization
[PR12] understanding deep learning requires rethinking generalization[PR12] understanding deep learning requires rethinking generalization
[PR12] understanding deep learning requires rethinking generalization
ย 
Building Neural Network Through Neuroevolution
Building Neural Network Through NeuroevolutionBuilding Neural Network Through Neuroevolution
Building Neural Network Through Neuroevolution
ย 
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
ย 
Unit I & II in Principles of Soft computing
Unit I & II in Principles of Soft computing Unit I & II in Principles of Soft computing
Unit I & II in Principles of Soft computing
ย 
SBML and related resources โ€จand standardization efforts
SBML and related resources โ€จand standardization effortsSBML and related resources โ€จand standardization efforts
SBML and related resources โ€จand standardization efforts
ย 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
ย 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
ย 
Intelligent Tutoring Systems: The DynaLearn Approach
Intelligent Tutoring Systems: The DynaLearn ApproachIntelligent Tutoring Systems: The DynaLearn Approach
Intelligent Tutoring Systems: The DynaLearn Approach
ย 
Topic model, LDA and all that
Topic model, LDA and all thatTopic model, LDA and all that
Topic model, LDA and all that
ย 
SBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resourcesSBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resources
ย 
Biological Network Inference via Gaussian Graphical Models
Biological Network Inference via Gaussian Graphical ModelsBiological Network Inference via Gaussian Graphical Models
Biological Network Inference via Gaussian Graphical Models
ย 
Main single agent machine learning algorithms
Main single agent machine learning algorithmsMain single agent machine learning algorithms
Main single agent machine learning algorithms
ย 

Multivariate analyses & decoding

  • 1. Multivariate analyses & decoding Kay Henning Brodersen Translational Neuromodeling Unit (TNU) Institute for Biomedical Engineering, University of Zurich & ETH Zurich Machine Learning and Pattern Recognition Group Department of Computer Science, ETH Zurich http://people.inf.ethz.ch/bkay
  • 2. Why multivariate? Univariate approaches are excellent for localizing activations in individual voxels. * n.s. v1 v2 v1 v2 reward no reward 2
  • 3. Why multivariate? Multivariate approaches can be used to examine responses that are jointly encoded in multiple voxels. n.s. n.s. v1 v2 v1 v2 v2 orange juice apple juice v1 3
  • 4. Why multivariate? Multivariate approaches can utilize โ€˜hiddenโ€™ quantities such as coupling strengths. signal ๐‘ฅ2 (๐‘ก) signal signal ๐‘ฅ3 (๐‘ก) observed BOLD signal ๐‘ฅ1 (๐‘ก) activity ๐‘ง2 (๐‘ก) hidden underlying activity activity ๐‘ง1 (๐‘ก) ๐‘ง3 (๐‘ก) neural activity and coupling strengths driving input ๐‘ข1(๐‘ก) modulatory input ๐‘ข2(๐‘ก) t t Friston, Harrison & Penny (2003) NeuroImage; Stephan & Friston (2007) Handbook of Brain Connectivity; Stephan et al. (2008) NeuroImage 4
  • 5. Overview 1 Introduction 2 Classification 3 Multivariate Bayes 4 Model-based analyses 5
  • 6. Overview 1 Introduction 2 Classification 3 Multivariate Bayes 4 Model-based analyses 6
  • 7. Encoding vs. decoding condition encoding model stimulus ๐‘”: ๐‘‹ ๐‘ก โ†’ ๐‘Œ๐‘ก response prediction error decoding model โ„Ž: ๐‘Œ๐‘ก โ†’ ๐‘‹ ๐‘ก context (cause or consequence) BOLD signal ๐‘‹๐‘ก โˆˆ โ„ ๐‘‘ ๐‘Œ๐‘ก โˆˆ โ„ ๐‘ฃ 7
  • 8. Regression vs. classification Regression model independent continuous variables ๐‘“ dependent variable (regressors) Classification model independent categorical variables ๐‘“ dependent variable vs. (features) (label) 8
  • 9. Univariate vs. multivariate models A univariate model considers a A multivariate model considers single voxel at a time. many voxels at once. context BOLD signal context BOLD signal ๐‘‹๐‘ก โˆˆ โ„ ๐‘‘ ๐‘Œ๐‘ก โˆˆ โ„ ๐‘‹๐‘ก โˆˆ โ„ ๐‘‘ ๐‘Œ๐‘ก โˆˆ โ„ ๐‘ฃ , v โ‰ซ 1 Spatial dependencies between voxels Multivariate models enable are only introduced afterwards, inferences on distributed responses through random field theory. without requiring focal activations. 9
  • 10. Prediction vs. inference The goal of prediction is to find The goal of inference is to decide a highly accurate encoding or between competing hypotheses. decoding function. predicting a cognitive predicting a comparing a model that weighing the state using a subject-specific links distributed neuronal evidence for brain-machine diagnostic status activity to a cognitive sparse vs. interface state with a model that distributed coding does not predictive density marginal likelihood (model evidence) ๐‘ ๐‘‹ ๐‘›๐‘’๐‘ค ๐‘Œ ๐‘›๐‘’๐‘ค , ๐‘‹, ๐‘Œ = โˆซ ๐‘ ๐‘‹ ๐‘›๐‘’๐‘ค ๐‘Œ ๐‘›๐‘’๐‘ค , ๐œƒ ๐‘ ๐œƒ ๐‘‹, ๐‘Œ ๐‘‘๐œƒ ๐‘ ๐‘‹ ๐‘Œ = โˆซ ๐‘ ๐‘‹ ๐‘Œ, ๐œƒ ๐‘ ๐œƒ ๐‘‘๐œƒ 10
  • 11. Goodness of fit vs. complexity Goodness of fit is the degree to which a model explains observed data. Complexity is the flexibility of a model (including, but not limited to, its number of parameters). ๐‘Œ 1 parameter 4 parameters 9 parameters truth data model ๐‘‹ underfitting optimal overfitting We wish to find the model that optimally trades off goodness of fit and complexity. Bishop (2007) PRML 11
  • 12. Summary of modelling terminology General Linear Model (GLM) โ€ข mass-univariate encoding model โ€ข to regress context onto brain activity and find clusters of similar effects Dynamic Causal Modelling (DCM) โ€ข multivariate encoding model โ€ข to evaluate connectivity hypotheses Classification โ€ข multivariate decoding model โ€ข to predict a categorical context label from brain activity Multivariate Bayes (MVB) โ€ข multivariate decoding model โ€ข to evaluate anatomical and coding hypotheses 12
  • 13. Overview 1 Introduction 2 Classification 3 Multivariate Bayes 4 Model-based analyses 13
  • 14. Constructing a classifier A principled way of designing a classifier would be to adopt a probabilistic approach: ๐‘Œ๐‘ก ๐‘“ that ๐‘˜ which maximizes ๐‘ ๐‘‹ ๐‘ก = ๐‘˜ ๐‘Œ๐‘ก , ๐‘‹, ๐‘Œ In practice, classifiers differ in terms of how strictly they implement this principle. Generative classifiers Discriminative classifiers Discriminant classifiers use Bayesโ€™ rule to estimate estimate ๐‘ ๐‘‹ ๐‘ก ๐‘Œ๐‘ก directly estimate ๐‘“ ๐‘Œ๐‘ก directly ๐‘ ๐‘‹ ๐‘ก ๐‘Œ๐‘ก โˆ ๐‘ ๐‘Œ๐‘ก ๐‘‹ ๐‘ก ๐‘ ๐‘‹ ๐‘ก without Bayesโ€™ theorem โ€ข Gaussian Naรฏve Bayes โ€ข Logistic regression โ€ข Fisherโ€™s Linear โ€ข Linear Discriminant โ€ข Relevance Vector Discriminant Analysis Machine โ€ข Support Vector Machine 14
• 15. Support vector machine (SVM). A linear SVM separates the classes with a hyperplane in voxel space (v1, v2); a nonlinear SVM uses a kernel to allow curved decision boundaries. Vapnik (1999) Springer; Schölkopf et al. (2002) MIT Press 15
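In MATLAB (Statistics and Machine Learning Toolbox), a minimal sketch of the two variants (illustration only; Y, X, and Ytest are assumed feature matrices and labels):

mdl_lin = fitcsvm(Y, X, 'KernelFunction', 'linear');   % linear decision boundary
mdl_rbf = fitcsvm(Y, X, 'KernelFunction', 'rbf');      % nonlinear boundary via a Gaussian kernel
Xpred   = predict(mdl_rbf, Ytest);                     % predicted labels for held-out trials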
• 16. Stages in a classification analysis: feature extraction, classification using cross-validation, and performance evaluation, e.g. by Bayesian mixed-effects inference with $p = 1 - P(\pi > \pi_0 \mid k, n)$. 16
• 17. Feature extraction for trial-by-trial classification. We can obtain trial-wise estimates of neural activity by filtering the data with a GLM: data $Y$ = design matrix $X$ × coefficients $(\beta_1, \beta_2, \ldots, \beta_p)^T$ + error $e$, with one boxcar regressor per trial; the estimate of $\beta_2$, for instance, reflects activity on trial 2. 17
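In MATLAB, a minimal sketch of this filtering step (illustration only; Y is an assumed scans-by-voxels data matrix and X an assumed scans-by-trials design matrix with one regressor per trial):

beta   = pinv(X) * Y;       % ordinary least-squares estimates, one row per trial regressor
trial2 = beta(2, :);        % activity estimate for trial 2, one value per voxel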
• 18. Cross-validation. The generalization ability of a classifier can be estimated using a resampling procedure known as cross-validation. One example is 2-fold cross-validation: the examples (e.g. 1 to 100) are split into two folds; in each fold one half serves as training examples and the other half as test examples, and the roles are then swapped. 18
• 19. Cross-validation. A more commonly used variant is leave-one-out cross-validation: with 100 examples there are 100 folds, each using a single example for testing and the remaining 99 for training; performance evaluation then pools the test results across folds. 19
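In MATLAB, a minimal sketch of leave-one-out cross-validation (illustration only; Y is an assumed trials-by-features matrix, X an assumed numeric binary label vector, and the linear SVM stands in for any classifier):

n       = size(Y, 1);
correct = 0;
for i = 1:n
    train   = setdiff(1:n, i);                            % all trials except the i-th
    mdl     = fitcsvm(Y(train, :), X(train), 'KernelFunction', 'linear');
    correct = correct + (predict(mdl, Y(i, :)) == X(i));  % test on the left-out trial
end
k = correct;    % number of correctly classified trials, used for inference on the next slide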
• 20. Performance evaluation: single-subject study with n trials. The most common approach is to assess how likely the obtained number of correctly classified trials could have occurred by chance. Binomial test: $p = P(X \ge k \mid H_0) = 1 - B(k \mid n, \pi_0)$, where $k$ is the number of correctly classified trials, $n$ the total number of trials, $\pi_0$ the chance level (typically 0.5), and $B$ the binomial cumulative distribution function. In MATLAB: p = 1 - binocdf(k, n, pi_0). 20
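For example (assumed illustrative values; binocdf requires the Statistics and Machine Learning Toolbox):

n    = 100;                        % total number of trials
k    = 64;                         % correctly classified trials from cross-validation
pi_0 = 0.5;                        % chance level for binary classification
p    = 1 - binocdf(k, n, pi_0);    % upper-tail binomial p-value, as on the slide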
• 21. Performance evaluation. At the group level, each of the m subjects contributes a vector of trial-wise classification outcomes (1 = correct, 0 = incorrect), from which a conclusion about the population is to be drawn. 21
• 22. Performance evaluation: group study with m subjects, n trials each. In a group setting, we must account for both within-subjects (fixed-effects) and between-subjects (random-effects) variance components. Four options: a binomial test on concatenated data, $p = 1 - B(\sum k \mid \sum n, \pi_0)$ (fixed effects); a binomial test on averaged data, $p = 1 - B\!\left(\tfrac{1}{m}\sum k \,\middle|\, \tfrac{1}{m}\sum n, \pi_0\right)$ (fixed effects); a t-test on summary statistics, $t = \frac{\bar\pi - \pi_0}{\hat\sigma_{m-1}/\sqrt{m}}$ with $p = 1 - t_{m-1}(t)$ (random effects); and Bayesian mixed-effects inference, $p = 1 - P(\pi > \pi_0 \mid k, n)$ (mixed effects; available for download soon). Here $\bar\pi$ is the sample mean of the sample accuracies, $\pi_0$ the chance level (typically 0.5), $\hat\sigma_{m-1}$ the sample standard deviation, and $t_{m-1}$ the cumulative Student's $t$-distribution. Brodersen, Mathys, Chumbley, Daunizeau, Ong, Buhmann, Stephan (under review) 22
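In MATLAB, a minimal sketch of the fixed-effects and random-effects options (illustration only; k and n are assumed m-by-1 vectors of correct-trial counts and trial counts per subject):

acc   = k ./ n;                                   % subject-wise sample accuracies
m     = numel(k);
p_ffx = 1 - binocdf(sum(k), sum(n), 0.5);         % fixed effects: binomial test on concatenated data
t     = (mean(acc) - 0.5) / (std(acc) / sqrt(m)); % random effects: one-sample t-test
p_rfx = 1 - tcdf(t, m - 1);                       % against the chance level pi_0 = 0.5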
• 23. Spatial deployment of informative regions. Which brain regions are jointly informative of a cognitive state of interest? Searchlight approach: a sphere is passed across the brain; at each location, the classifier is evaluated using only the voxels in the current sphere, yielding a map of t-scores. Whole-brain approach: a constrained classifier is trained on whole-brain data; its voxel weights are related to their empirical null distributions using a permutation test, again yielding a map of t-scores. Nandy & Cordes (2003) MRM; Kriegeskorte et al. (2006) PNAS; Mourao-Miranda et al. (2005) NeuroImage; Lomakina et al. (in preparation) 23
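In MATLAB, a minimal sketch of the searchlight idea (illustration only; Y is an assumed trials-by-voxels matrix, X an assumed label vector, coords an assumed voxels-by-3 matrix of voxel coordinates in mm, and the 8 mm radius is an arbitrary choice):

radius = 8;
nvox   = size(Y, 2);
accmap = zeros(nvox, 1);
for v = 1:nvox
    d      = sqrt(sum((coords - repmat(coords(v, :), nvox, 1)).^2, 2));
    sphere = find(d <= radius);                  % voxels inside the sphere centred on voxel v
    cvmdl  = fitcsvm(Y(:, sphere), X, 'KernelFunction', 'linear', 'Leaveout', 'on');
    accmap(v) = 1 - kfoldLoss(cvmdl);            % cross-validated accuracy at this location
end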
• 24. Summary: research questions for classification. Overall classification accuracy: is accuracy above chance for a given classification task, e.g. left or right button? truth or lie? healthy or ill? Spatial deployment of discriminative regions: where in the brain is the discriminative information? Temporal evolution of discriminability: when during a trial does accuracy rise above chance, e.g. relative to the moment the participant indicates a decision? Model-based classification: can groups (group 1, group 2) be separated? Pereira et al. (2009) NeuroImage; Brodersen et al. (2009) The New Collection 24
  • 25. Overview 1 Introduction 2 Classification 3 Multivariate Bayes 4 Model-based analyses 25
  • 26. Multivariate Bayes SPM brings multivariate analyses into the conventional inference framework of Bayesian hierarchical models and their inversion. Mike West 26
• 27. Multivariate Bayes. Multivariate analyses in SPM rest on the central tenet that inferences about how the brain represents things reduce to model comparison, e.g. comparing a decoding model with sparse coding in orbitofrontal cortex against one with distributed coding in prefrontal cortex for some cause or consequence. To make the ill-posed regression problem tractable, MVB uses a prior on voxel weights. Different priors reflect different coding hypotheses. 27
• 28. From encoding to decoding. Encoding model (GLM): $A = X\beta$ and $Y = TA + G\gamma + \varepsilon$; in summary, $Y = TX\beta + G\gamma + \varepsilon$. Decoding model (MVB): $X = A\alpha$ and $Y = TA + G\gamma + \varepsilon$; in summary, $TX = Y\alpha - G\gamma\alpha - \varepsilon\alpha$. 28
• 29. Specifying the prior for MVB. 1st level – spatial coding hypothesis: the voxel weights are expressed through a pattern matrix $U$ ($n$ voxels by $u$ patterns) and pattern weights $\eta$; a pattern may state, for example, that voxel 2 is allowed to play a role, or that voxel 3 is allowed to play a role but only if its neighbours play similar roles. 2nd level – pattern covariance structure: $p(\eta) = \mathcal{N}(\eta; 0, \Sigma)$ with $\Sigma = \sum_i \lambda_i s_i$. Thus $p(\alpha \mid \lambda) = \mathcal{N}_n(\alpha; 0, U\Sigma U^T)$ and $p(\lambda) = \mathcal{N}(\lambda; \pi, \Pi^{-1})$. 29
• 30. Inverting the model. Model inversion involves finding the posterior distribution over the voxel weights $\alpha$. In MVB, this includes a greedy search for the optimal covariance structure that governs the prior over $\alpha$: partition #1 uses a single subset, $\Sigma = \lambda_1 s_1$; partition #2 uses two subsets, $\Sigma = \lambda_1 s_1 + \lambda_2 s_2$; partition #3 (optimal in this example) uses three, $\Sigma = \lambda_1 s_1 + \lambda_2 s_2 + \lambda_3 s_3$. 30
• 31. Example: decoding motion from visual cortex. MVB can be illustrated using SPM's attention-to-motion example dataset. This dataset is based on a simple block design with three experimental factors: photic (the display shows random dots), motion (the dots are moving), and attention (subjects are asked to pay attention). The design matrix contains regressors for photic, motion, attention, and a constant over scans. Buechel & Friston (1999) Cerebral Cortex; Friston et al. (2008) NeuroImage 31
  • 32. Multivariate Bayes in SPM Step 1 After having specified and estimated a model, use the Results button. Step 2 Select the contrast to be decoded. 32
  • 33. Multivariate Bayes in SPM Step 3 Pick a region of interest. 33
• 34. Multivariate Bayes in SPM. Step 4: Multivariate Bayes can be invoked from within the Multivariate section. Step 5: here, the region of interest is specified as a sphere around the cursor; the spatial prior implements a sparse coding hypothesis. 34
  • 35. Multivariate Bayes in SPM Step 6 Results can be displayed using the BMS button. 35
• 36. Observations vs. predictions of the decoded target variable (motion contrast). 36
  • 37. Model evidence and voxel weights log evidence = 3 37
• 38. Using MVB for point classification. MVB may outperform conventional point classifiers, such as a Support Vector Machine, when using a more appropriate coding hypothesis. 38
• 39. Summary: research questions for MVB. Where does the brain represent things? Evaluating competing anatomical hypotheses. How does the brain represent things? Evaluating competing coding hypotheses. 39
  • 40. Overview 1 Introduction 2 Classification 3 Multivariate Bayes 4 Model-based analyses 40
• 41. Classification approaches by data representation. Model-based classification: how do patterns of hidden quantities (e.g., connectivity among brain regions) differ between groups? Activation-based classification: which functional differences allow us to separate groups? Structure-based classification: which anatomical structures allow us to separate patients and healthy controls? 41
• 42. Generative embedding for model-based classification. Step 1 – modelling: fit a subject-specific generative model (e.g. a DCM with regions A, B, C and connections A→B, A→C, B→B, B→C) to the measurements from an individual subject. Step 2 – embedding: represent each subject by its model parameter estimates in a model-based feature space. Step 3 – classification: train a classification model on these features. Step 4 – evaluation: assess the discriminability of the groups (classification accuracy). Step 5 – interpretation: which connection strengths are jointly discriminative? Brodersen, Haiss, Ong, Jung, Tittgemeyer, Buhmann, Weber, Stephan (2011) NeuroImage; Brodersen, Schofield, Leff, Ong, Lomakina, Buhmann, Stephan (2011) PLoS Comput Biol 42
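In MATLAB, a minimal sketch of steps 2 to 4 (illustration only; it assumes the DCMs have already been fitted, theta is an assumed subjects-by-parameters matrix of posterior connection-strength estimates, and group an assumed binary label vector):

% step 2 - embedding: each row of theta is one subject's point in the model-based feature space
% steps 3/4 - leave-one-subject-out cross-validation of a linear SVM on these features
cvmdl    = fitcsvm(theta, group, 'KernelFunction', 'linear', 'Leaveout', 'on');
accuracy = 1 - kfoldLoss(cvmdl);   % cross-validated accuracy; group-level inference as on slide 22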
• 43. Example: diagnosing stroke patients. Anatomical regions of interest shown at y = –26 mm. 43
• 44. Example: diagnosing stroke patients. A dynamic causal model comprising MGB, HG (A1), and PT in the left (L) and right (R) hemispheres, driven by stimulus input. 44
• 45. Multivariate analysis: connectional fingerprints of patients vs. controls. 45
• 46. Dissecting diseases into physiologically distinct subgroups. Voxel-based contrast space (axes: voxels at (64, −24, 4) mm, (−56, −20, 10) mm, and (−42, −26, 10) mm): classification accuracy using all voxels in the regions of interest is 75%. Model-based parameter space obtained by generative embedding (axes: L.HG → L.HG, R.HG → L.HG, L.MGB → L.MGB): classification accuracy using all 23 model parameters is 98%. 46
• 47. Discriminative features in model space. The same model: MGB, HG (A1), and PT in the left and right hemispheres, driven by stimulus input. 47
• 48. Discriminative features in model space. Each connection in the model is marked as highly discriminative, somewhat discriminative, or not discriminative. 48
• 49. Generative embedding and DCM. Question 1 – What do the data tell us about hidden processes in the brain? Compute the posterior $p(\theta \mid y, m) = \frac{p(y \mid \theta, m)\, p(\theta \mid m)}{p(y \mid m)}$. Question 2 – Which model is best w.r.t. the observed fMRI data? Compute the model evidence $p(m \mid y) \propto p(y \mid m)\, p(m)$, with $p(y \mid m) = \int p(y \mid \theta, m)\, p(\theta \mid m)\, d\theta$. Question 3 – Which model is best w.r.t. an external criterion, e.g. the label $x \in \{\text{patient}, \text{control}\}$? Compute the classification accuracy $p(h(y) = x) = \iiint p(h(y) = x \mid y, y_{\mathrm{train}}, x_{\mathrm{train}})\, p(y)\, p(y_{\mathrm{train}})\, p(x_{\mathrm{train}})\, dy\, dy_{\mathrm{train}}\, dx_{\mathrm{train}}$. 49
• 50. Summary. Classification: to assess whether a cognitive state is linked to patterns of activity, and to assess the spatial deployment of discriminative activity. Multivariate Bayes: to evaluate competing anatomical hypotheses, and to evaluate competing coding hypotheses. Model-based analyses: to assess whether groups differ in terms of patterns of connectivity, and to generate new grouping hypotheses. 50