Machine learning and cognitive neuroimaging:
new tools can answer new questions
Gaël Varoquaux
How machine learning is shaping cognitive neuroimaging
[Varoquaux and Thirion 2014]
Cognitive neuroscience: linking psychology and
neuroscience (neural implementations)
Vision: A computational investigation into the human representation
and processing of visual information [Marr 1982]
G Varoquaux 2
Machine learning:
computational statistics
for prediction
(out-of-sample properties)
Paradigm shift
the dimensionality of
data grows,
enables richer models
Open-ended questions
⇒ large # features
From parameter
inference to prediction
x
y
Understanding, not predicting
Danger of solving the
wrong problem
Lost in formalization
G Varoquaux 3
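A minimal sketch of why out-of-sample properties matter once the number of features grows. The data are synthetic and every name and size below is an assumption made for illustration, not something taken from the talk. With more features than samples, a least-squares fit reproduces the training data almost exactly, yet predicts new data poorly.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_features = 40, 200, 200   # more features than training samples

# Synthetic data: only the first 5 features carry signal (assumption of this sketch)
w_true = np.zeros(n_features)
w_true[:5] = 1.0
X_train = rng.standard_normal((n_train, n_features))
X_test = rng.standard_normal((n_test, n_features))
y_train = X_train @ w_true + rng.standard_normal(n_train)
y_test = X_test @ w_true + rng.standard_normal(n_test)

# Minimum-norm least-squares fit: with n_features > n_train it interpolates the training data
w_hat, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)


def mse(X, y, w):
    """Mean squared prediction error of the linear model w on (X, y)."""
    return float(np.mean((y - X @ w) ** 2))


in_sample_error = mse(X_train, y_train, w_hat)
out_of_sample_error = mse(X_test, y_test, w_hat)
print(f"in-sample MSE:     {in_sample_error:.3g}")
print(f"out-of-sample MSE: {out_of_sample_error:.3g}")
```

The in-sample error says nothing about the model here; only the error on data not used for fitting does.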
Statistics | Machine learning
Statistical machine learning
Hypothesis testing | Prediction
T-test | Tests on prediction | Cross-validation
In sample | Out of sample
Parametric | Non-parametric
Non-parametric tests | Probabilistic modeling
Few parameters | Many parameters
Univariate | Multivariate
GLM = correlations | Naive Bayes
Univariate selection
Differences mostly cultural: it’s a continuum
G Varoquaux 4
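To illustrate the middle of that continuum, here is a hedged sketch that runs both kinds of test on the same synthetic two-condition data: an in-sample t statistic on one feature, and a "test on prediction" that compares a cross-validated accuracy to its distribution under shuffled labels. The data, the nearest-class-mean classifier, and all function names are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_class, n_features = 30, 20
# Synthetic two-condition data with a modest mean shift on every feature (assumption)
X = rng.standard_normal((2 * n_per_class, n_features))
y = np.repeat([0, 1], n_per_class)
X[y == 1] += 0.7


def t_statistic(a, b):
    """Two-sample t statistic (equal variances) for one feature: an in-sample test."""
    n_a, n_b = len(a), len(b)
    pooled = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return (b.mean() - a.mean()) / np.sqrt(pooled * (1 / n_a + 1 / n_b))


def cv_accuracy(X, y, n_folds=5):
    """Cross-validated accuracy of a nearest-class-mean classifier: an out-of-sample score."""
    folds = np.arange(len(y)) % n_folds
    correct = 0
    for k in range(n_folds):
        train, test = folds != k, folds == k
        means = np.stack([X[train & (y == c)].mean(axis=0) for c in (0, 1)])
        dists = ((X[test][:, None, :] - means[None]) ** 2).sum(axis=2)
        correct += int((dists.argmin(axis=1) == y[test]).sum())
    return correct / len(y)


t_first_feature = t_statistic(X[y == 0, 0], X[y == 1, 0])
accuracy = cv_accuracy(X, y)

# Test on prediction: compare the accuracy to its distribution under shuffled labels
null = np.array([cv_accuracy(X, rng.permutation(y)) for _ in range(200)])
p_value = (1 + np.sum(null >= accuracy)) / (1 + len(null))
print(f"t (feature 0) = {t_first_feature:.2f}, CV accuracy = {accuracy:.2f}, p = {p_value:.3f}")
```

The t-test looks at one feature at a time; the permutation test asks a single multivariate question about whether the labels can be predicted at all.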
Cognitive neuroimaging and machine learning
Predicting the task: decoding
Predicting neural response: encoding
Unsupervised learning on brain activity
Unsupervised learning on behavior
G Varoquaux 5
Rest of this talk
1 Encoding
2 Decoding
G Varoquaux 6
1 Encoding
Towards richer models of brain activity
G Varoquaux 7
1 Uncovering neural coding: richer models
Insights on breaking down cognitive functions into
atomic steps
[Hubel and Wiesel 1962]
Neurons receptive to
Gabors (edges)
[Logothetis... 1995]
Shapes in inferior
temporal cortex
Machine learning:
computer-vision models mapped to brain activity
[Yamins... 2014]
G Varoquaux 8
1 Uncovering neural coding: in fMRI
Model-based fMRI [O’Doherty... 2007]
[Harvey... 2013]
High-level descriptions [Mitchell... 2008]
Natural stimuli [Kay... 2008]
G Varoquaux 9
Machine learning for encoding models
Richer models of encoding
capture fine descriptions of behavior / stimuli
They require forgoing the contrast methodology
Is this a good or a bad thing?
G Varoquaux 10
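A minimal sketch of an encoding model in this spirit: a ridge regression maps a description of each stimulus to the response of every voxel, and is scored on held-out stimuli rather than through a contrast. The stimulus features, the simulated responses, and the helper names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stimuli, n_features, n_voxels = 120, 10, 50

# Hypothetical stimulus descriptors (e.g. outputs of a vision model) and simulated responses
features = rng.standard_normal((n_stimuli, n_features))
true_weights = rng.standard_normal((n_features, n_voxels))
true_weights[:, n_voxels // 2:] = 0.0          # second half of the voxels ignore the stimuli
responses = features @ true_weights + rng.standard_normal((n_stimuli, n_voxels))

train, test = np.arange(0, 90), np.arange(90, n_stimuli)


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression, one weight vector per voxel (column of Y)."""
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ Y)


def columnwise_correlation(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    return (A * B).sum(axis=0) / np.sqrt((A ** 2).sum(axis=0) * (B ** 2).sum(axis=0))


weights = fit_ridge(features[train], responses[train])
predicted = features[test] @ weights
scores = columnwise_correlation(predicted, responses[test])

print("mean held-out correlation, stimulus-driven voxels:", scores[: n_voxels // 2].mean().round(2))
print("mean held-out correlation, other voxels:          ", scores[n_voxels // 2:].mean().round(2))
```

The per-voxel held-out score plays the role that a contrast map plays in a standard analysis: it shows where the stimulus description explains the signal.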
1 Models of the visual system
Image → V1 cortex → V2 cortex → Inferior temporal cortex → Fusiform face area → Jack?
Is there a “face” region? A “foot” region? A “left big toe” region?
G Varoquaux 11
1 Uncovering neural coding: cognitive oppositions
Is there a “face” region? A “foot” region? A “left big toe” region?
Mapping relies on cognitive subtraction
Bound to mental process decomposition
G Varoquaux 12
1 Decomposing visual stimuli
Low-level visual cortex is tuned
to natural image statistics
[Olshausen et al. 1996]
What drives high-level representations?
Convolutional Net
G Varoquaux 13
Data-driven encoding models
Image → V1 cortex → V2 cortex → Inferior temporal cortex → Fusiform face area → Jack?
[Khaligh-Razavi and Kriegeskorte 2014, Güçlü and van Gerven 2015]
fMRI beyond a handful of contrasts
⇒ Sets us free from the paradigm
G Varoquaux 14
2 Decoding
From brain activity to behavior
G Varoquaux 15
2 Increased sensitivity
An omnibus test
“Given the goal of detecting the presence of a particular
mental representation in the brain, the primary advantage
of MVPA methods over individual-voxel-based methods is
increased sensitivity.” — [Norman... 2006]
Is there “information” about a
stimulus in a given region?
“However, these maps are not guaranteed to include all
the voxels that are involved in representing the categories
of interest.” — [Norman... 2006]
G Varoquaux 16
2 Increased sensitivity
An omnibus test
[Diagram labels: stimuli, non-linear cognitive model, representations, linear predictive models]
Decoding used to test / compare encoding models
[Naselaris... 2011]
G Varoquaux 17
2 Generalization as a test: cross-validation
High-dimensional models
⇒ Important to test on independent data,
to control for model complexity
[Figure: cross-validation error bars for different splitting strategies, axis from -40% to +40%, intra-subject and inter-subject settings]
Leave one sample out: -22% +19%, +3% +43%
Leave one subject/session: -10% +10%, -21% +17%
20% left-out, 3 splits: -11% +11%, -24% +16%
20% left-out, 10 splits: -9% +9%, -24% +14%
20% left-out, 50 splits: -9% +8%, -23% +13%
No silver bullet Poster 3829, Oral Th 12:45
G Varoquaux 18
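The two leave-one-out strategies compared on this slide can be sketched as plain index generators. The session layout is a hypothetical design, and the function names are this example's own; libraries such as scikit-learn ship equivalent splitters. The point of the sketch is that holding out a whole session keeps train and test sets separate at the session level, while holding out a single scan does not.

```python
import numpy as np


def leave_one_group_out(groups):
    """Yield (train_indices, test_indices), holding out one whole group at a time."""
    groups = np.asarray(groups)
    for g in np.unique(groups):
        yield np.flatnonzero(groups != g), np.flatnonzero(groups == g)


def leave_one_sample_out(n_samples):
    """Yield (train_indices, test_indices), holding out a single sample at a time."""
    indices = np.arange(n_samples)
    for i in indices:
        yield indices[indices != i], indices[[i]]


# Hypothetical design: 4 sessions of 3 scans each
sessions = np.repeat([0, 1, 2, 3], 3)

for train, test in leave_one_group_out(sessions):
    # No session contributes to both sides of the split
    assert not set(sessions[train]) & set(sessions[test])

# With leave-one-sample-out, the held-out scan's session is always also in the training set
shared = [
    sessions[test][0] in sessions[train]
    for train, test in leave_one_sample_out(len(sessions))
]
print(all(shared))
```

If scans from the same session are correlated, the second scheme tests on data that are not independent of the training set, which is one plausible reason for its estimates to differ from those of the other strategies.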
2 Behavioral predictions as a test
Increase “cognitive resolution”
One voxel’s information is not enough to distinguish
many cognitive states
⇒ analysis combining info across voxels
Interpreting overlapping activations
Psychology is not interested in where a task
creates activation,
but in whether two tasks create activation in the same areas
G Varoquaux 19
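A toy illustration of "cognitive resolution", under the loud assumption of a noiseless binary code invented for this example: four cognitive states are written across two voxels. Each voxel on its own separates the states into only two groups; the pair separates all four.

```python
import numpy as np

# Hypothetical noiseless code: one row per cognitive state, one column per voxel
patterns = np.array([[0, 0],
                     [0, 1],
                     [1, 0],
                     [1, 1]])


def n_distinguishable(responses):
    """Number of distinct response patterns, i.e. of states that can be told apart."""
    return len(np.unique(responses, axis=0))


for v in range(patterns.shape[1]):
    print(f"voxel {v} alone:", n_distinguishable(patterns[:, [v]]), "states")
print("both voxels:  ", n_distinguishable(patterns), "states")
```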
2 Inference in cognitive neuroimaging
What is the neural support of a function?
What is the function of a given brain module?
Reverse inference
Brain mapping = task-evoked activity
+ crafting “contrasts” to isolate effects
[Poldrack 2006, Henson 2006]
Is there a face area?
[Kanwisher... 1997, Gauthier... 2000, Hanson and Halchenko 2008]
Decoding: Find regions that
predict observed cognition
[Poldrack... 2009, Schwartz... 2013]
G Varoquaux 20
2 Decoding for reverse inference
[Poldrack... 2009, Schwartz... 2013]
Prediction = proxy for implication
Need large cognitive coverage
Interpretation of the “grandmother neuron”
“more than a neuron responds to one concept and
[...] neurons do not necessarily respond to only one
concept are given by the data itself”
[Quian Quiroga and Kreiman 2010]
G Varoquaux 21
2 Brain decoding with linear models
Design matrix × Coefficients = Target
Coefficients are brain maps
G Varoquaux 22
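A small sketch of that relation on a toy volume. The volume shape, the simulated data, and the plain least-squares fit are assumptions of this example. Each column of the design matrix is a voxel, so the fitted coefficient vector has one entry per voxel and can be reshaped back into an image.

```python
import numpy as np

rng = np.random.default_rng(0)
volume_shape = (4, 5, 3)                 # toy "brain" of 60 voxels (assumption)
n_scans = 200
n_voxels = int(np.prod(volume_shape))

# A ground-truth map with a small block of informative voxels
true_map = np.zeros(volume_shape)
true_map[1:3, 1:3, 1] = 1.0

# Design matrix: one row per scan, one column per voxel
X = rng.standard_normal((n_scans, n_voxels))
y = X @ true_map.ravel() + 0.1 * rng.standard_normal(n_scans)   # target

# Least-squares estimate of the coefficients (well-posed here: n_scans > n_voxels)
coefficients, *_ = np.linalg.lstsq(X, y, rcond=None)

# One coefficient per voxel: reshaping them gives an image in brain space
coefficient_map = coefficients.reshape(volume_shape)
print(coefficient_map.shape)
print(np.round(coefficient_map[:, :, 1], 1))
```

This toy case has more scans than voxels, which real fMRI decoding rarely does.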
2 Brain decoding to recover predictive regions?
Face vs house visual recognition [Haxby... 2001]
SVM error: 26%
Sparse model error: 19%
Ridge error: 15%
Best predictor outlines the worst regions
Best maps predict worst
G Varoquaux 23
2 Decoders as estimators [Gramfort... 2013]
Inverse problem
Minimize the error term:
$\hat{w} = \operatorname{argmin}_{w} \, l(y - X w)$
Ill-posed:
Many different w will give
the same prediction error
Choice driven by (implicit) priors of the decoder
SVM, sparse, ridge, TV-L1
Inferences rely, explicitly or implicitly,
on the regions estimated by the decoder
G Varoquaux 24
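The ill-posedness can be checked numerically. This is a sketch on random data, with a squared-error loss assumed for l and sizes chosen only for illustration. When there are fewer scans than voxels, any direction in the null space of X can be added to a weight map without changing a single prediction.

```python
import numpy as np

rng = np.random.default_rng(0)
n_scans, n_voxels = 20, 100              # far fewer scans than voxels
X = rng.standard_normal((n_scans, n_voxels))
y = rng.standard_normal(n_scans)

# One solution: the minimum-norm least-squares weights (an implicit "small weights" prior)
w_min_norm = np.linalg.pinv(X) @ y

# Rows of Vt beyond the rank of X span its null space: directions X cannot see
_, singular_values, Vt = np.linalg.svd(X, full_matrices=True)
rank = int(np.sum(singular_values > 1e-10))
null_direction = Vt[rank]                # any row of Vt[rank:] would do

# A second, very different weight map with the same predictions
w_other = w_min_norm + 10.0 * null_direction

same_predictions = np.allclose(X @ w_min_norm, X @ w_other)
map_distance = float(np.linalg.norm(w_min_norm - w_other))
print(same_predictions)    # identical predictions on the data
print(map_distance)        # yet the two maps are far apart
```

Which of these maps a decoder returns depends on its penalty or prior, not on the prediction error.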
Wrapping up
G Varoquaux 25
@GaelVaroquaux
Machine learning for cognitive neuroimaging
The description of cognition is hard ⇒ Encoding
Rich models depend less on paradigms
Decoding as an omnibus test
For rich encoding models
To interpret overlapping activation
Cross-validation error bars
Decoding for reverse inference
Requires large cognitive coverage
Estimation of predictive regions is difficult
Infinite number of maps predict as well
Software: nilearn
In Python
http://nilearn.github.io
[Varoquaux and Thirion 2014]
How machine learning is
shaping cognitive neuroimaging
References I
I. Gauthier, M. J. Tarr, J. Moylan, P. Skudlarski, J. C. Gore, and
A. W. Anderson. The fusiform “face area” is part of a network
that processes faces at the individual level. J cognitive
neuroscience, 12:495, 2000.
A. Gramfort, B. Thirion, and G. Varoquaux. Identifying predictive
regions from fMRI with TV-L1 prior. In PRNI, page 17, 2013.
U. Güçlü and M. A. van Gerven. Deep neural networks reveal a
gradient in the complexity of neural representations across the
ventral stream. The Journal of Neuroscience, 35(27):
10005–10014, 2015.
S. J. Hanson and Y. O. Halchenko. Brain reading using full brain
support vector machines for object recognition: there is no
“face” identification area. Neural Computation, 20:486, 2008.
B. Harvey, B. Klein, N. Petridou, and S. Dumoulin. Topographic
representation of numerosity in the human parietal cortex.
Science, 341(6150):1123–1126, 2013.
References II
J. V. Haxby, I. M. Gobbini, M. L. Furey, ... Distributed and
overlapping representations of faces and objects in ventral
temporal cortex. Science, 293:2425, 2001.
R. Henson. Forward inference using functional neuroimaging:
Dissociations versus associations. Trends in cognitive sciences,
10:64, 2006.
D. H. Hubel and T. N. Wiesel. Receptive fields, binocular
interaction and functional architecture in the cat’s visual cortex.
The Journal of physiology, 160:106, 1962.
N. Kanwisher, J. McDermott, and M. M. Chun. The fusiform face
area: a module in human extrastriate cortex specialized for face
perception. J Neuroscience, 17:4302, 1997.
K. N. Kay, T. Naselaris, R. J. Prenger, and J. L. Gallant.
Identifying natural images from human brain activity. Nature,
452:352, 2008.
References III
S.-M. Khaligh-Razavi and N. Kriegeskorte. Deep supervised, but
not unsupervised, models may explain IT cortical representation.
PLoS Comput Biol, 10(11):e1003915, 2014.
N. K. Logothetis, J. Pauls, and T. Poggio. Shape representation in
the inferior temporal cortex of monkeys. Current Biology, 5:552,
1995.
D. Marr. Vision: A computational investigation into the human
representation and processing of visual information. The MIT
press, Cambridge, 1982.
T. M. Mitchell, S. V. Shinkareva, A. Carlson, K.-M. Chang, V. L.
Malave, R. A. Mason, and M. A. Just. Predicting human brain
activity associated with the meanings of nouns. Science, 320:
1191, 2008.
T. Naselaris, K. N. Kay, S. Nishimoto, and J. L. Gallant. Encoding
and decoding in fMRI. Neuroimage, 56:400, 2011.
References IV
K. A. Norman, S. M. Polyn, G. J. Detre, and J. V. Haxby. Beyond
mind-reading: multi-voxel pattern analysis of fMRI data. Trends
in cognitive sciences, 10:424, 2006.
J. P. O’Doherty, A. Hampton, and H. Kim. Model-based fMRI and
its application to reward learning and decision making. Annals of
the New York Academy of Sciences, 1104:35, 2007.
B. Olshausen ... Emergence of simple-cell receptive field
properties by learning a sparse code for natural images. Nature,
381:607, 1996.
R. Poldrack. Can cognitive processes be inferred from
neuroimaging data? Trends in cognitive sciences, 10:59, 2006.
R. A. Poldrack, Y. O. Halchenko, and S. J. Hanson. Decoding the
large-scale structure of brain function by classifying mental
states across individuals. Psychological Science, 20:1364, 2009.
References V
R. Quian Quiroga and G. Kreiman. Postscript: About grandmother
cells and Jennifer Aniston neurons. Psychological Review, 117:
297, 2010.
Y. Schwartz, B. Thirion, and G. Varoquaux. Mapping cognitive
ontologies to and from the brain. In NIPS, 2013.
G. Varoquaux and B. Thirion. How machine learning is shaping
cognitive neuroimaging. GigaScience, 3:28, 2014.
D. L. Yamins, H. Hong, C. F. Cadieu, E. A. Solomon, D. Seibert,
and J. J. DiCarlo. Performance-optimized hierarchical models
predict neural responses in higher visual cortex. Proc Natl Acad
Sci, page 201403112, 2014.
