Extracting Functional-Connectome Biomarkers
with Machine Learning
Gaël Varoquaux
How Do Current Predictive Connectivity Models
Meet Clinician’s Needs?
This house believes that predictive biomarkers are, today,
the most useful endeavor for clinical application of functional connectivity
They are just not reliable enough
1 Prediction matters
2 Extracting biomarkers from rest fMRI
G Varoquaux 2
1 Prediction matters
Machine learning is useful, and must be done right
Prediction? Nah...
We want neurobiological insight
What if I told you
Brain imaging predicts the risk that a 2-year-old develops on the autism spectrum
Brain imaging predicts long-term cognitive deficit after stroke
1 Heterogeneity is a roadblock?
[Abraham... 2017]
Autism:
ill-defined diagnostic criteria
sensitive to parents’ social-economic status
ABIDE:
post-hoc aggregation of data
across many cities and countries
Can autism biomarkers carry over to new sites?
Training set Testing set
[Figure: accuracy as a function of the fraction of subjects used]
Prediction to new sites works as well
Yes we can
extract biomarkers
despite heterogeneity
Multi-variate predictive models, unlike classical statistics, can learn to reject
confounds, given examples of confounding heterogeneity
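A minimal sketch of what "prediction to new sites" means in practice: each cross-validation fold holds out every subject from one acquisition site. The data below are synthetic, and the site offsets, effect size, and choice of logistic regression are illustrative assumptions, not the setup of [Abraham... 2017].

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
n_sites, n_per_site, n_features = 5, 40, 20  # hypothetical sizes

X_parts, y_parts, site_parts = [], [], []
for site in range(n_sites):
    y_site = rng.integers(0, 2, size=n_per_site)
    X_site = rng.standard_normal((n_per_site, n_features))
    X_site[:, 0] += 1.5 * y_site                       # diagnosis signal shared by all sites
    X_site += rng.normal(scale=0.5, size=n_features)   # site-specific offset (confound)
    X_parts.append(X_site)
    y_parts.append(y_site)
    site_parts.append(np.full(n_per_site, site))

X = np.vstack(X_parts)
y = np.concatenate(y_parts)
sites = np.concatenate(site_parts)

# Each fold trains on four sites and tests on the held-out fifth one.
scores = cross_val_score(
    LogisticRegression(max_iter=1000), X, y,
    groups=sites, cv=LeaveOneGroupOut(),
)
print("accuracy per held-out site:", np.round(scores, 2))
```

Comparing these scores with those of a within-site split gives a rough idea of how much a model relies on site-specific structure.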
1 Proxy clinical outcomes
[Liem... 2016]
Predicting brain aging = chronological age
Predicts age with a mean absolute error of 4.3 years
Discrepancy with chronological age correlates with cognitive impairment
[Figure: brain aging discrepancy (years) by objective cognitive impairment group: Normal -0.38, Mild 0.74, Major 1.72]
Biomarker surrogate, but useful
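The brain-age idea above can be sketched as follows: predict age out of sample, report the mean absolute error, and keep the per-subject discrepancy as a candidate marker. The synthetic features and the ridge regression are assumptions made for illustration; they are not the multimodal pipeline of [Liem... 2016].

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import KFold, cross_val_predict

rng = np.random.default_rng(0)
n_subjects, n_features = 300, 30  # hypothetical sizes

# Synthetic "imaging" features that vary linearly with age, plus noise.
age = rng.uniform(20, 80, size=n_subjects)
weights = rng.standard_normal(n_features)
X = np.outer(age, weights) / 10 + rng.standard_normal((n_subjects, n_features))

# Out-of-sample predictions: each subject is predicted by a model
# that never saw that subject during fitting.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
predicted_age = cross_val_predict(Ridge(alpha=1.0), X, age, cv=cv)

mae = mean_absolute_error(age, predicted_age)
brain_age_delta = predicted_age - age  # discrepancy with chronological age

print(f"mean absolute error: {mae:.2f} years")
```

The `brain_age_delta` values are what would then be compared across impairment groups.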
1 Better descriptions of subjects?
[Rahim... 2017]
An individual should not be reduced to
a single diagnostic or behavioral quantity
Multi-output prediction
Predict jointly multiple individual phenotypes:
behavioral scores, diagnostic status
They improve each other's prediction
Adding MMSE as a target improves AD prediction
[Figure: cross-validation accuracy (50% to 90%) for AD vs. MCI classification, single-output vs. multi-output. Mono-modal: functional connectivity (fMRI), protein biomarkers (CSF), hippocampus volumetry (MRI). Multi-modal: stacked predictions of fMRI, MRI, CSF]
1 Trustworthy biomarkers
[Woo... 2017]
Good biomarkers generalize to new subjects
to new sites
Bad biomarkers overly adapt
to a few subjects
to site observation noise
Predictive modeling: machine learning
Prediction rather than association
out-of-sample statistics
One does not simply
claim prediction
1 Prediction requires more than association
[R. Poldrack, G. Huckins, G. Varoquaux, submitted]
[Figure: polynomial fits of order 1, 2, and 15 to the same data, with the mean squared error of each fit]
Quality of fit on data used to fit is not meaningful
Only new (test) data can measure prediction
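The point can be reproduced with a small sketch: fit polynomials of order 1, 2, and 15 and compare the error on the data used for fitting with the error on fresh data. The quadratic generating function, the noise level, and the sample sizes are assumptions chosen for illustration; they are not taken from the slide's figure.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def sample(n):
    """Draw n points from an assumed noisy quadratic relationship."""
    x = rng.uniform(-2, 2, size=n)
    y = 10 + x + 2 * x**2 + rng.normal(scale=2.0, size=n)
    return x, y

x_train, y_train = sample(20)    # data used to fit
x_test, y_test = sample(200)     # new data, never used to fit

results = {}
for order in (1, 2, 15):
    model = Polynomial.fit(x_train, y_train, deg=order)
    train_mse = np.mean((model(x_train) - y_train) ** 2)
    test_mse = np.mean((model(x_test) - y_test) ** 2)
    results[order] = (train_mse, test_mse)
    print(f"order = {order:2d}  train MSE = {train_mse:8.2f}  test MSE = {test_mse:10.2f}")
```

The training error can only go down as the order grows, because each model contains the previous one; the test error carries no such guarantee.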
1 Evidence for prediction
[Varoquaux... 2017]
Established by cross-validation
[Figure: the full data split into a train set and a test set]
[R. Poldrack, G. Huckins, G. Varoquaux, submitted]
One does not simply
claim prediction
[Figure: validation scheme used in the 100 last publications on “fMRI prediction”, as counts (0 to 40): None, K-fold, Leave-one-out, Leave-X-out, Other]
1 Cross-validation is solid evidence?
[Varoquaux 2017]
In the literature, effect sizes decrease with sample sizes
[Figure: reported accuracy (50% to 100%) against study sample size (30 to 1000), with the p=.05 threshold, one panel per review: Wolfer2015, psychiatric diagnostic; Arbabshirani2017, Alzheimer's; Woo2017, Alzheimer's; Woo2017, depression; Brown2017, connectome learning; Arbabshirani2017, schizophrenia; Woo2017, psychosis; Woo2017, autism]
1 Cross-validation is solid evidence?
[Varoquaux 2017]
Trivial analytic variations on permuted data:
smoothing, SVM vs log-reg, feature selection
[Figure: cross-validation scores (axis from 30% to 70%) for different decoders, by sessions used: 4 first (n~72), 4 last (n~72), 6 first (n~108), 6 last (n~108), all 12 (n~216). Score ranges shown, in the same order: 25% to 39%, 40% to 71%, 38% to 57%, 47% to 57%, 44% to 52%]
With small n, by chance, some analytic
choices give seemingly good predictions
1 Cross-validation is solid evidence?
[Varoquaux 2017]
[Figure: sampling distribution of the test error, for leave-one-out (LOO) and for 50 splits with 20% test, by number of available samples. Intervals shown: n = 30: -19% to +15% and -20% to +18%; n = 100: -10% to +8% and -10% to +10%; n = 200: -7% to +5% and -7% to +7%; n = 300: -5% to +4% and -6% to +6%; n = 1000: -3% to +2% and -3% to +3%]
1 Cross-validation is solid evidence?
[Varoquaux 2017]
[Figure: difference between public and private scores, on an axis from -45% to +30%; interval shown: -15% to +14%]
Kaggle competition on r-fMRI for Schizophrenia
2 different test sets: size 30 and 28
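A back-of-the-envelope sketch of why test sets of about 30 subjects give such unstable scores: if each test prediction is independently correct with some fixed probability, the observed accuracy follows a binomial distribution. The true accuracy of 75% and the 90% coverage below are assumptions for illustration, and this simplified model only covers test-set sampling noise; it does not reproduce the simulations of [Varoquaux 2017].

```python
from math import comb

def binomial_accuracy_interval(n_test, true_accuracy, coverage=0.90):
    """Central interval of the observed accuracy on n_test independent predictions."""
    tail = (1 - coverage) / 2
    cumulative = 0.0
    lower = upper = None
    for k in range(n_test + 1):
        # Probability of exactly k correct predictions out of n_test.
        cumulative += comb(n_test, k) * true_accuracy**k * (1 - true_accuracy) ** (n_test - k)
        if lower is None and cumulative >= tail:
            lower = k
        if cumulative >= 1 - tail:
            upper = k
            break
    return lower / n_test, upper / n_test

intervals = {n: binomial_accuracy_interval(n, 0.75) for n in (30, 100, 300, 1000)}
for n, (low, high) in intervals.items():
    print(f"n_test = {n:4d}: observed accuracy between {low:.1%} and {high:.1%}")
```

The interval narrows roughly as one over the square root of the test-set size, which is why several hundred subjects are needed before a few points of accuracy become meaningful.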
One does not simply
claim prediction
We need
Clean cross-validation
strong-generalization
= testing on data never seen
Several hundred subjects
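One common way to break "clean cross-validation" is to select features on the full data before splitting. The sketch below contrasts that leaky procedure with a pipeline that redoes the selection inside each training fold, on pure-noise data where no real prediction is possible. The sizes, the univariate selection, and the logistic regression are illustrative assumptions.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_subjects, n_features = 60, 2000  # hypothetical small-n, many-features setting
X = rng.standard_normal((n_subjects, n_features))  # pure noise
y = np.repeat([0, 1], n_subjects // 2)             # labels unrelated to X

cv = StratifiedShuffleSplit(n_splits=50, test_size=0.2, random_state=0)

# Leaky: the feature selection has already seen the test subjects' labels.
selector = SelectKBest(f_classif, k=20).fit(X, y)
leaky_scores = cross_val_score(
    LogisticRegression(max_iter=1000), selector.transform(X), y, cv=cv)

# Clean: selection is refit on the training part of every split.
clean_model = make_pipeline(SelectKBest(f_classif, k=20),
                            LogisticRegression(max_iter=1000))
clean_scores = cross_val_score(clean_model, X, y, cv=cv)

print(f"leaky: {leaky_scores.mean():.2f}   clean: {clean_scores.mean():.2f}")
```

Since the labels carry no information, any score well above chance in the leaky variant is an artifact of the validation procedure.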
Yes we can
Reliable prediction of clinical endpoints would be game changing
But we need larger sizes, reduced analytical variability, and clean validation
2 Extracting biomarkers from
rest fMRI
Addressing the perils of
analytical variability
Systematic study:
6 different cohorts
More than 2000 individuals
[Dadi... 2019]
From rest-fMRI to biomarkers
No salient features in rest fMRI
Define functional regions
Learn interactions
Detect differences
[Diagram: RS-fMRI -> region definition -> time series extraction -> functional connectivity matrix -> supervised learning]
2 Defining regions
Anatomical atlases
Clustering
k-means
ward
[Thirion... 2014]
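A minimal sketch of region definition by clustering: voxels whose time series look alike are grouped into the same region. The synthetic voxels below share one of four region signals; the sizes and noise level are assumptions, and real parcellations usually add a spatial-connectivity constraint that is left out here.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
n_regions, voxels_per_region, n_timepoints = 4, 50, 100  # hypothetical sizes

# Every voxel follows the signal of its region, plus a little noise.
region_signals = rng.standard_normal((n_regions, n_timepoints))
true_labels = np.repeat(np.arange(n_regions), voxels_per_region)
voxel_series = region_signals[true_labels] + 0.1 * rng.standard_normal(
    (n_regions * voxels_per_region, n_timepoints))

# Rows are voxels, columns are time points: clustering groups voxels.
kmeans_labels = KMeans(n_clusters=n_regions, n_init=10,
                       random_state=0).fit_predict(voxel_series)
ward_labels = AgglomerativeClustering(n_clusters=n_regions,
                                      linkage="ward").fit_predict(voxel_series)

print("k-means agreement with ground truth:",
      adjusted_rand_score(true_labels, kmeans_labels))
print("ward agreement with ground truth:",
      adjusted_rand_score(true_labels, ward_labels))
```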
Decomposition models
Y = E · S + N (Y: time × voxels; 25 components)
ICA:
seek independence of maps
Sparse dictionary learning:
seek sparse maps
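The two decomposition models can be sketched with scikit-learn on synthetic data generated as time courses times spatial maps plus noise. The data sizes, the block-shaped ground-truth maps, and the penalty values are assumptions for illustration; voxels are treated as samples so that the estimated components are spatial maps.

```python
import numpy as np
from sklearn.decomposition import FastICA, MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
n_time, n_voxels, n_components = 60, 400, 3  # hypothetical sizes

# Ground-truth maps: each component covers its own block of voxels.
true_maps = np.zeros((n_components, n_voxels))
for k in range(n_components):
    true_maps[k, k * 100:k * 100 + 80] = 1.0
time_courses = rng.standard_normal((n_time, n_components))
noise = 0.1 * rng.standard_normal((n_time, n_voxels))
Y = time_courses @ true_maps + noise  # shape (time, voxels)

# ICA: look for maps that are statistically independent across voxels.
ica = FastICA(n_components=n_components, random_state=0, max_iter=1000)
ica_maps = ica.fit_transform(Y.T).T  # shape (components, voxels)

# Sparse dictionary learning: the voxel-wise codes play the role of maps,
# and the l1 penalty (transform_alpha) pushes many of their entries to zero.
dico = MiniBatchDictionaryLearning(
    n_components=n_components, alpha=1.0,
    transform_algorithm="lasso_lars", transform_alpha=0.5, random_state=0)
sparse_maps = dico.fit_transform(Y.T).T  # shape (components, voxels)

print("fraction of exactly-zero map entries:", np.mean(sparse_maps == 0))
```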
2 In connectome prediction settings
[Diagram: RS-fMRI -> (1) ROIs -> (2) time series -> (3) functional connectivity -> (4) diagnosis]
Choice of regions for best prediction?
[Dadi... 2019]
2 Connectome: building a connectivity matrix
How to capture and represent interactions?
2 Connectome: differences across subjects
[Figure: correlation matrices and partial correlation matrices for four subjects]
3 controls, 1 severe stroke patient
Which is which?
[Figure: the same correlation and partial correlation matrices, labeled: three controls and one patient with a large lesion]
Spread-out variability in correlation matrices
Noise in partial-correlations
Strong dependence between coefficients
[Varoquaux... 2010]
[Figure: correlation matrices, partial correlation matrices, and tangent-space embeddings for the same three controls and one patient with a large lesion]
Tangent-space embedding
[Varoquaux... 2010]
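The three connectivity parameterizations can be sketched in plain NumPy. Partial correlations come from the rescaled inverse covariance; the tangent-space embedding whitens each subject's covariance by a group reference and takes the matrix logarithm. The arithmetic mean used as reference below is a simplifying assumption (a geometric mean is another common choice), and the random time series are synthetic.

```python
import numpy as np

def correlation(time_series):
    """Pearson correlation between regions; time_series is (time, regions)."""
    return np.corrcoef(time_series, rowvar=False)

def partial_correlation(time_series):
    """Correlation between two regions, conditioning on all the others."""
    precision = np.linalg.inv(np.cov(time_series, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial, 1.0)
    return partial

def _apply_to_eigenvalues(matrix, function):
    """Apply a scalar function to a symmetric matrix through its eigenvalues."""
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)
    return (eigenvectors * function(eigenvalues)) @ eigenvectors.T

def tangent_embedding(covariances, reference):
    """Matrix log of each covariance after whitening by the reference."""
    whitening = _apply_to_eigenvalues(reference, lambda v: 1.0 / np.sqrt(v))
    return np.array([
        _apply_to_eigenvalues(whitening @ cov @ whitening, np.log)
        for cov in covariances
    ])

rng = np.random.default_rng(0)
time_series = [rng.standard_normal((200, 5)) for _ in range(4)]  # 4 subjects, 5 regions
covariances = [np.cov(ts, rowvar=False) for ts in time_series]
reference = np.mean(covariances, axis=0)  # assumed group reference
tangents = tangent_embedding(covariances, reference)
print(tangents.shape)
```

A subject whose covariance equals the reference maps to the zero matrix, so each embedding describes a deviation from the group.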
2 Connectivity matrix for predictive models
[Diagram: RS-fMRI -> (1) ROIs -> (2) time series -> (3) functional connectivity -> (4) diagnosis]
[Dadi... 2019]
2 Machine learning for connectome prediction
[Diagram: RS-fMRI -> (1) ROIs -> (2) time series -> (3) functional connectivity -> (4) diagnosis]
Supervised learning step
Linear models
Random forests
Sparse or non sparse?
[Dadi... 2019]
@GaelVaroquaux
Functional-connectome biomarkers
Biomarkers
Early assessment
Prognosis
Proxy clinical endpoints
Reliable biomarkers
Larger sample sizes
Clean evidence of generalization
Higher standards
[Woo... 2017]
Biomarkers game-changing if trustworthy
Rest-fMRI biomarkers extraction
Functional regions (extracted by dictionary learning)
Tangent space to compare connectomes
Linear model for supervised learning
[Diagram: RS-fMRI -> (1) defining brain ROIs -> (2) connectivity parameterization -> (3) supervised learning -> diagnosis]
Software: nilearn
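The last two steps of the pipeline can be sketched end to end on synthetic data: build one connectivity matrix per subject, vectorize it, and feed it to a linear model evaluated on held-out subjects. Plain correlation is used here to keep the example short, and the simulated group difference, sizes, and logistic regression are assumptions; with nilearn, `nilearn.connectome.ConnectivityMeasure` offers the correlation, partial correlation, and tangent parameterizations.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score

rng = np.random.default_rng(0)
n_subjects, n_timepoints, n_regions = 80, 150, 10  # hypothetical sizes

def simulate_subject(is_patient):
    """Region time series in which regions 0 and 1 are less coupled in patients."""
    series = rng.standard_normal((n_timepoints, n_regions))
    shared = rng.standard_normal(n_timepoints)
    coupling = 0.3 if is_patient else 0.8
    series[:, 0] += coupling * shared
    series[:, 1] += coupling * shared
    return series

y = np.repeat([0, 1], n_subjects // 2)  # 0 = control, 1 = patient

# One feature vector per subject: the lower triangle of the correlation matrix.
lower = np.tril_indices(n_regions, k=-1)
X = np.array([
    np.corrcoef(simulate_subject(label), rowvar=False)[lower] for label in y
])

# Linear model scored on subjects never seen during fitting.
cv = StratifiedShuffleSplit(n_splits=50, test_size=0.2, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
print(f"accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```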
References I
A. Abraham, M. Milham, A. Di Martino, R. C. Craddock,
D. Samaras, B. Thirion, and G. Varoquaux. Deriving
reproducible biomarkers from multi-site resting-state data:
An autism-based example. NeuroImage, 2017.
K. Dadi, M. Rahim, A. Abraham, D. Chyzhyk, M. Milham,
B. Thirion, G. Varoquaux, and A. D. N. Initiative.
Benchmarking functional connectome-based predictive
models for resting-state fMRI. NeuroImage, 2019.
F. Liem, G. Varoquaux, J. Kynast, F. Beyer, S. K. Masouleh,
J. M. Huntenburg, L. Lampe, M. Rahim, A. Abraham, R. C.
Craddock, ... Predicting brain-age from multimodal imaging
data captures cognitive impairment. NeuroImage, 2016.
M. Rahim, B. Thirion, D. Bzdok, I. Buvat, and G. Varoquaux.
Joint prediction of multiple scores captures better individual
traits from brain images. NeuroImage, in rev, 2017.
References II
B. Thirion, G. Varoquaux, E. Dohmatob, and J. Poline. Which
fMRI clustering gives good brain parcellations?
Frontiers in Neuroscience, 8:167, 2014.
G. Varoquaux. Cross-validation failure: small sample sizes lead
to large error bars. NeuroImage, 2017.
G. Varoquaux, F. Baronnet, A. Kleinschmidt, P. Fillard, and
B. Thirion. Detection of brain functional-connectivity
difference in post-stroke patients using group-level
covariance modeling. In MICCAI. 2010.
G. Varoquaux, P. R. Raamana, D. A. Engemann,
A. Hoyos-Idrobo, Y. Schwartz, and B. Thirion. Assessing
and tuning brain decoders: cross-validation, caveats, and
guidelines. NeuroImage, 145:166–179, 2017.
C.-W. Woo, L. J. Chang, M. A. Lindquist, and T. D. Wager.
Building better biomarkers: brain models in translational
neuroimaging. Nature Neuroscience, 20(3):365, 2017.

More Related Content

Similar to Functional-connectome biomarkers to meet clinical needs?

Measuring mental health with machine learning and brain imaging
Measuring mental health with machine learning and brain imagingMeasuring mental health with machine learning and brain imaging
Measuring mental health with machine learning and brain imaging
Gael Varoquaux
 
Towards Replicable and Genereralizable Genomic Prediction Models
Towards Replicable and Genereralizable Genomic Prediction ModelsTowards Replicable and Genereralizable Genomic Prediction Models
Towards Replicable and Genereralizable Genomic Prediction Models
Levi Waldron
 
20190820 deepest
20190820 deepest 20190820 deepest
20190820 deepest
Ryoungwoo Jang
 
Machine learning and cognitive neuroimaging: new tools can answer new questions
Machine learning and cognitive neuroimaging: new tools can answer new questionsMachine learning and cognitive neuroimaging: new tools can answer new questions
Machine learning and cognitive neuroimaging: new tools can answer new questions
Gael Varoquaux
 
Objective vs subjective outcomes in the latest trifcoal iol Physiol (Finevision)
Objective vs subjective outcomes in the latest trifcoal iol Physiol (Finevision)Objective vs subjective outcomes in the latest trifcoal iol Physiol (Finevision)
Objective vs subjective outcomes in the latest trifcoal iol Physiol (Finevision)
Dr. Ivo Ferreira Rios M.D.
 
Langs - Machine Learning in Medical Imaging: Learning from Large-scale popula...
Langs - Machine Learning in Medical Imaging: Learning from Large-scale popula...Langs - Machine Learning in Medical Imaging: Learning from Large-scale popula...
Langs - Machine Learning in Medical Imaging: Learning from Large-scale popula...
Vienna Data Science Group
 
The Lachman Test
The Lachman TestThe Lachman Test
The Lachman Test
Laura Torres
 
Fundamentals of statistical analysis
Fundamentals of statistical analysisFundamentals of statistical analysis
Fundamentals of statistical analysis
Paul Gardner
 
Print skripsie (1)
Print skripsie (1)Print skripsie (1)
Print skripsie (1)Petri Nel
 
Probability Forecasting - a Machine Learning Perspective
Probability Forecasting - a Machine Learning PerspectiveProbability Forecasting - a Machine Learning Perspective
Probability Forecasting - a Machine Learning Perspectivebutest
 
Critical Appraisal - Quantitative SS.pptx
Critical Appraisal - Quantitative SS.pptxCritical Appraisal - Quantitative SS.pptx
Critical Appraisal - Quantitative SS.pptx
Mrs S Sen
 
PMED Opening Workshop - Credible Ecological Inference for Medical Decisions w...
PMED Opening Workshop - Credible Ecological Inference for Medical Decisions w...PMED Opening Workshop - Credible Ecological Inference for Medical Decisions w...
PMED Opening Workshop - Credible Ecological Inference for Medical Decisions w...
The Statistical and Applied Mathematical Sciences Institute
 
Research methodology and biostatistics
Research methodology and biostatisticsResearch methodology and biostatistics
Research methodology and biostatistics
Medical Ultrasound
 
Diagnosing Learning Related Vision Problems
Diagnosing Learning Related Vision ProblemsDiagnosing Learning Related Vision Problems
Diagnosing Learning Related Vision Problems
Dominick Maino
 
Bias in covid 19 models
Bias in covid 19 modelsBias in covid 19 models
Bias in covid 19 models
Laure Wynants
 
Sample Size Determination.23.11.2021.pdf
Sample Size Determination.23.11.2021.pdfSample Size Determination.23.11.2021.pdf
Sample Size Determination.23.11.2021.pdf
statsanjal
 
Running head PROJECT PHASE 4-INFECTIOUS DISEASES1PROJECT PHASE.docx
Running head PROJECT PHASE 4-INFECTIOUS DISEASES1PROJECT PHASE.docxRunning head PROJECT PHASE 4-INFECTIOUS DISEASES1PROJECT PHASE.docx
Running head PROJECT PHASE 4-INFECTIOUS DISEASES1PROJECT PHASE.docx
toltonkendal
 
Bayesain Hypothesis of Selective Attention - Raw 2011 poster
Bayesain Hypothesis of Selective Attention - Raw 2011 posterBayesain Hypothesis of Selective Attention - Raw 2011 poster
Bayesain Hypothesis of Selective Attention - Raw 2011 poster
Giacomo Veneri
 
From Experimental to Applied Predictive Analytics on Big Data - Milan Vukicevic
 From Experimental to Applied Predictive Analytics on Big Data - Milan Vukicevic From Experimental to Applied Predictive Analytics on Big Data - Milan Vukicevic
From Experimental to Applied Predictive Analytics on Big Data - Milan Vukicevic
Institute of Contemporary Sciences
 

Similar to Functional-connectome biomarkers to meet clinical needs? (20)

Measuring mental health with machine learning and brain imaging
Measuring mental health with machine learning and brain imagingMeasuring mental health with machine learning and brain imaging
Measuring mental health with machine learning and brain imaging
 
Towards Replicable and Genereralizable Genomic Prediction Models
Towards Replicable and Genereralizable Genomic Prediction ModelsTowards Replicable and Genereralizable Genomic Prediction Models
Towards Replicable and Genereralizable Genomic Prediction Models
 
20190820 deepest
20190820 deepest 20190820 deepest
20190820 deepest
 
Machine learning and cognitive neuroimaging: new tools can answer new questions
Machine learning and cognitive neuroimaging: new tools can answer new questionsMachine learning and cognitive neuroimaging: new tools can answer new questions
Machine learning and cognitive neuroimaging: new tools can answer new questions
 
Objective vs subjective outcomes in the latest trifcoal iol Physiol (Finevision)
Objective vs subjective outcomes in the latest trifcoal iol Physiol (Finevision)Objective vs subjective outcomes in the latest trifcoal iol Physiol (Finevision)
Objective vs subjective outcomes in the latest trifcoal iol Physiol (Finevision)
 
Langs - Machine Learning in Medical Imaging: Learning from Large-scale popula...
Langs - Machine Learning in Medical Imaging: Learning from Large-scale popula...Langs - Machine Learning in Medical Imaging: Learning from Large-scale popula...
Langs - Machine Learning in Medical Imaging: Learning from Large-scale popula...
 
The Lachman Test
The Lachman TestThe Lachman Test
The Lachman Test
 
Fundamentals of statistical analysis
Fundamentals of statistical analysisFundamentals of statistical analysis
Fundamentals of statistical analysis
 
Print skripsie (1)
Print skripsie (1)Print skripsie (1)
Print skripsie (1)
 
Probability Forecasting - a Machine Learning Perspective
Probability Forecasting - a Machine Learning PerspectiveProbability Forecasting - a Machine Learning Perspective
Probability Forecasting - a Machine Learning Perspective
 
Critical Appraisal - Quantitative SS.pptx
Critical Appraisal - Quantitative SS.pptxCritical Appraisal - Quantitative SS.pptx
Critical Appraisal - Quantitative SS.pptx
 
PMED Opening Workshop - Credible Ecological Inference for Medical Decisions w...
PMED Opening Workshop - Credible Ecological Inference for Medical Decisions w...PMED Opening Workshop - Credible Ecological Inference for Medical Decisions w...
PMED Opening Workshop - Credible Ecological Inference for Medical Decisions w...
 
Research methodology and biostatistics
Research methodology and biostatisticsResearch methodology and biostatistics
Research methodology and biostatistics
 
Diagnosing Learning Related Vision Problems
Diagnosing Learning Related Vision ProblemsDiagnosing Learning Related Vision Problems
Diagnosing Learning Related Vision Problems
 
Bias in covid 19 models
Bias in covid 19 modelsBias in covid 19 models
Bias in covid 19 models
 
Sample Size Determination.23.11.2021.pdf
Sample Size Determination.23.11.2021.pdfSample Size Determination.23.11.2021.pdf
Sample Size Determination.23.11.2021.pdf
 
Oac guidelines
Oac guidelinesOac guidelines
Oac guidelines
 
Running head PROJECT PHASE 4-INFECTIOUS DISEASES1PROJECT PHASE.docx
Running head PROJECT PHASE 4-INFECTIOUS DISEASES1PROJECT PHASE.docxRunning head PROJECT PHASE 4-INFECTIOUS DISEASES1PROJECT PHASE.docx
Running head PROJECT PHASE 4-INFECTIOUS DISEASES1PROJECT PHASE.docx
 
Bayesain Hypothesis of Selective Attention - Raw 2011 poster
Bayesain Hypothesis of Selective Attention - Raw 2011 posterBayesain Hypothesis of Selective Attention - Raw 2011 poster
Bayesain Hypothesis of Selective Attention - Raw 2011 poster
 
From Experimental to Applied Predictive Analytics on Big Data - Milan Vukicevic
 From Experimental to Applied Predictive Analytics on Big Data - Milan Vukicevic From Experimental to Applied Predictive Analytics on Big Data - Milan Vukicevic
From Experimental to Applied Predictive Analytics on Big Data - Milan Vukicevic
 

More from Gael Varoquaux

Evaluating machine learning models and their diagnostic value
Evaluating machine learning models and their diagnostic valueEvaluating machine learning models and their diagnostic value
Evaluating machine learning models and their diagnostic value
Gael Varoquaux
 
Machine learning with missing values
Machine learning with missing valuesMachine learning with missing values
Machine learning with missing values
Gael Varoquaux
 
Dirty data science machine learning on non-curated data
Dirty data science machine learning on non-curated dataDirty data science machine learning on non-curated data
Dirty data science machine learning on non-curated data
Gael Varoquaux
 
Representation learning in limited-data settings
Representation learning in limited-data settingsRepresentation learning in limited-data settings
Representation learning in limited-data settings
Gael Varoquaux
 
Better neuroimaging data processing: driven by evidence, open communities, an...
Better neuroimaging data processing: driven by evidence, open communities, an...Better neuroimaging data processing: driven by evidence, open communities, an...
Better neuroimaging data processing: driven by evidence, open communities, an...
Gael Varoquaux
 
Atlases of cognition with large-scale human brain mapping
Atlases of cognition with large-scale human brain mappingAtlases of cognition with large-scale human brain mapping
Atlases of cognition with large-scale human brain mapping
Gael Varoquaux
 
Similarity encoding for learning on dirty categorical variables
Similarity encoding for learning on dirty categorical variablesSimilarity encoding for learning on dirty categorical variables
Similarity encoding for learning on dirty categorical variables
Gael Varoquaux
 
Simple representations for learning: factorizations and similarities
Simple representations for learning: factorizations and similarities Simple representations for learning: factorizations and similarities
Simple representations for learning: factorizations and similarities
Gael Varoquaux
 
A tutorial on Machine Learning, with illustrations for MR imaging
A tutorial on Machine Learning, with illustrations for MR imagingA tutorial on Machine Learning, with illustrations for MR imaging
A tutorial on Machine Learning, with illustrations for MR imaging
Gael Varoquaux
 
Scikit-learn and nilearn: Democratisation of machine learning for brain imaging
Scikit-learn and nilearn: Democratisation of machine learning for brain imagingScikit-learn and nilearn: Democratisation of machine learning for brain imaging
Scikit-learn and nilearn: Democratisation of machine learning for brain imaging
Gael Varoquaux
 
Computational practices for reproducible science
Computational practices for reproducible scienceComputational practices for reproducible science
Computational practices for reproducible science
Gael Varoquaux
 
Coding for science and innovation
Coding for science and innovationCoding for science and innovation
Coding for science and innovation
Gael Varoquaux
 
On the code of data science
On the code of data scienceOn the code of data science
On the code of data science
Gael Varoquaux
 
Scientist meets web dev: how Python became the language of data
Scientist meets web dev: how Python became the language of dataScientist meets web dev: how Python became the language of data
Scientist meets web dev: how Python became the language of data
Gael Varoquaux
 
Social-sparsity brain decoders: faster spatial sparsity
Social-sparsity brain decoders: faster spatial sparsitySocial-sparsity brain decoders: faster spatial sparsity
Social-sparsity brain decoders: faster spatial sparsity
Gael Varoquaux
 
Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016
Gael Varoquaux
 
Inter-site autism biomarkers from resting state fMRI
Inter-site autism biomarkers from resting state fMRIInter-site autism biomarkers from resting state fMRI
Inter-site autism biomarkers from resting state fMRI
Gael Varoquaux
 
Brain maps from machine learning? Spatial regularizations
Brain maps from machine learning? Spatial regularizationsBrain maps from machine learning? Spatial regularizations
Brain maps from machine learning? Spatial regularizations
Gael Varoquaux
 
Scikit-learn for easy machine learning: the vision, the tool, and the project
Scikit-learn for easy machine learning: the vision, the tool, and the projectScikit-learn for easy machine learning: the vision, the tool, and the project
Scikit-learn for easy machine learning: the vision, the tool, and the project
Gael Varoquaux
 
Simple big data, in Python
Simple big data, in PythonSimple big data, in Python
Simple big data, in Python
Gael Varoquaux
 

More from Gael Varoquaux (20)

Evaluating machine learning models and their diagnostic value
Evaluating machine learning models and their diagnostic valueEvaluating machine learning models and their diagnostic value
Evaluating machine learning models and their diagnostic value
 
Machine learning with missing values
Machine learning with missing valuesMachine learning with missing values
Machine learning with missing values
 
Dirty data science machine learning on non-curated data
Dirty data science machine learning on non-curated dataDirty data science machine learning on non-curated data
Dirty data science machine learning on non-curated data
 
Representation learning in limited-data settings
Representation learning in limited-data settingsRepresentation learning in limited-data settings
Representation learning in limited-data settings
 
Better neuroimaging data processing: driven by evidence, open communities, an...
Better neuroimaging data processing: driven by evidence, open communities, an...Better neuroimaging data processing: driven by evidence, open communities, an...
Better neuroimaging data processing: driven by evidence, open communities, an...
 
Atlases of cognition with large-scale human brain mapping
Atlases of cognition with large-scale human brain mappingAtlases of cognition with large-scale human brain mapping
Atlases of cognition with large-scale human brain mapping
 
Similarity encoding for learning on dirty categorical variables
Similarity encoding for learning on dirty categorical variablesSimilarity encoding for learning on dirty categorical variables
Similarity encoding for learning on dirty categorical variables
 
Simple representations for learning: factorizations and similarities
Simple representations for learning: factorizations and similarities Simple representations for learning: factorizations and similarities
Simple representations for learning: factorizations and similarities
 
A tutorial on Machine Learning, with illustrations for MR imaging
A tutorial on Machine Learning, with illustrations for MR imagingA tutorial on Machine Learning, with illustrations for MR imaging
A tutorial on Machine Learning, with illustrations for MR imaging
 
Scikit-learn and nilearn: Democratisation of machine learning for brain imaging
Scikit-learn and nilearn: Democratisation of machine learning for brain imagingScikit-learn and nilearn: Democratisation of machine learning for brain imaging
Scikit-learn and nilearn: Democratisation of machine learning for brain imaging
 
Computational practices for reproducible science
Computational practices for reproducible scienceComputational practices for reproducible science
Computational practices for reproducible science
 
Coding for science and innovation
Coding for science and innovationCoding for science and innovation
Coding for science and innovation
 
On the code of data science
On the code of data scienceOn the code of data science
On the code of data science
 
Scientist meets web dev: how Python became the language of data
Scientist meets web dev: how Python became the language of dataScientist meets web dev: how Python became the language of data
Scientist meets web dev: how Python became the language of data
 
Social-sparsity brain decoders: faster spatial sparsity
Social-sparsity brain decoders: faster spatial sparsitySocial-sparsity brain decoders: faster spatial sparsity
Social-sparsity brain decoders: faster spatial sparsity
 
Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016
 
Inter-site autism biomarkers from resting state fMRI
Inter-site autism biomarkers from resting state fMRIInter-site autism biomarkers from resting state fMRI
Inter-site autism biomarkers from resting state fMRI
 
Brain maps from machine learning? Spatial regularizations
Brain maps from machine learning? Spatial regularizationsBrain maps from machine learning? Spatial regularizations
Brain maps from machine learning? Spatial regularizations
 
Scikit-learn for easy machine learning: the vision, the tool, and the project
Scikit-learn for easy machine learning: the vision, the tool, and the projectScikit-learn for easy machine learning: the vision, the tool, and the project
Scikit-learn for easy machine learning: the vision, the tool, and the project
 
Simple big data, in Python
Simple big data, in PythonSimple big data, in Python
Simple big data, in Python
 

Functional-connectome biomarkers to meet clinical needs?

  • 1. Extracting Functional-Connectome Biomarkers with Machine Learning. Gaël Varoquaux
  • 2. Extracting Functional-Connectome Biomarkers with Machine Learning. Gaël Varoquaux. How Do Current Predictive Connectivity Models Meet Clinician’s Needs? This house believes that predictive biomarkers are, today, the most useful endeavor for clinical application of functional connectivity
  • 3. Extracting Functional-Connectome Biomarkers with Machine Learning. Gaël Varoquaux. How Do Current Predictive Connectivity Models Meet Clinician’s Needs? This house believes that predictive biomarkers are, today, the most useful endeavor for clinical application of functional connectivity. They are just not reliable enough
  • 4. 1 Prediction matters. 2 Extracting biomarkers from rest fMRI
  • 5. 1 Prediction matters. Machine learning is useful, and must be done right
  • 6. Prediction? Nah... We want neurobiological insight
  • 7. What if I told you: Brain imaging predicts the risk that a 2-year-old develops on the autism spectrum. Brain imaging predicts long-term cognitive deficit after stroke
  • 8. 1 Heterogeneity is a roadblock? [Abraham... 2017] Autism: ill-defined diagnostic criteria, sensitive to parents’ socio-economic status. ABIDE: post-hoc aggregation of data across many cities and countries. Can autism biomarkers carry over to new sites? [Figure: training set and testing set]
  • 9. 1 Heterogeneity is a roadblock? [Abraham... 2017] Autism: ill-defined diagnostic criteria, sensitive to parents’ socio-economic status. ABIDE: post-hoc aggregation of data across many cities and countries. Can autism biomarkers carry over to new sites? [Figure: training set and testing set; accuracy as a function of the fraction of subjects used] Prediction to new sites works as well
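The train/test question on these two slides (does a model trained on some sites predict on a site it has never seen?) can be sketched as a leave-one-site-out split. This is a minimal illustration, not the procedure of [Abraham... 2017]: the function name `leave_one_site_out` and the site labels are hypothetical placeholders.

```python
from collections import defaultdict


def leave_one_site_out(sites):
    """Yield (held_out_site, train_indices, test_indices), one split per site."""
    by_site = defaultdict(list)
    for index, site in enumerate(sites):
        by_site[site].append(index)
    for held_out in sorted(by_site):
        test = by_site[held_out]
        # Training subjects come only from the other sites.
        train = sorted(i for s, idx in by_site.items() if s != held_out for i in idx)
        yield held_out, train, test


# Hypothetical acquisition site of each subject (placeholder labels).
sites = ["A", "A", "B", "C", "B", "C", "A", "C"]
splits = list(leave_one_site_out(sites))
for held_out, train, test in splits:
    print("held-out site:", held_out, "train:", train, "test:", test)
```

Because no subject of the held-out site is used for training, the test score measures transfer to a new site rather than to new subjects of known sites.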
  • 10. Yes we can extract biomarkers despite heterogeneity. Multi-variate predictive models, unlike classical statistics, can learn to reject confounds, given examples of confounding heterogeneity
  • 11. 1 Proxy clinical outcomes [Liem... 2016] Predicting brain aging = chronological age. Predicts age with a mean absolute error of 4.3 years
  • 12. 1 Proxy clinical outcomes [Liem... 2016] Predicting brain aging = chronological age. Predicts age with a mean absolute error of 4.3 years. Discrepancy with chronological age correlates with cognitive impairment. [Figure: brain aging discrepancy (years) by objective cognitive impairment group: Normal -0.38, Mild 0.74, Major 1.72] Biomarker surrogate, but useful
  • 13. 1 Better descriptions of subjects? [Rahim... 2017] An individual should not be reduced to a single diagnostic or behavioral quantity
  • 14. 1 Better descriptions of subjects? [Rahim... 2017] Multi-output prediction: predict jointly multiple individual phenotypes (behavioral scores, diagnostic status). They improve each other’s prediction. Adding MMSE as a target improves AD prediction. [Figure: cross-validation accuracy (50% to 90%) for the classification AD vs. MCI, single-output vs. multi-output; mono-modal: functional connectivity (fMRI), protein biomarkers (CSF), hippocampus volumetry (MRI); multi-modal: stacked predictions of fMRI, MRI, CSF]
  • 15. 1 Trustworthy biomarkers [Woo... 2017] Good biomarkers generalize: to new subjects, to new sites
  • 16. 1 Trustworthy biomarkers [Woo... 2017] Good biomarkers generalize: to new subjects, to new sites. Bad biomarkers overly adapt: to a few subjects, to site observation noise. Predictive modeling: machine learning. Prediction rather than association: out-of-sample statistics
  • 17. One does not simply claim prediction
  • 18. 1 Prediction requires more than association [R. Poldrack, G. Huckins, G. Varoquaux, submitted] [Figure: fit of order = 1 to the data, with its mean squared error]
  • 19. 1 Prediction requires more than association [R. Poldrack, G. Huckins, G. Varoquaux, submitted] [Figure: fits of order = 1 and order = 2 to the data, with their mean squared error]
  • 20. 1 Prediction requires more than association [R. Poldrack, G. Huckins, G. Varoquaux, submitted] [Figure: fits of order = 1, order = 2, and order = 15 to the data, with their mean squared error] Quality of fit on data used to fit is not meaningful. Only new (test) data can measure prediction
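The point of this slide can be reproduced with a small simulation. The sketch below assumes that "order" refers to the degree of a polynomial fit; the quadratic ground truth, the noise level, and the sample sizes are arbitrary choices made for illustration, not values taken from the slides.

```python
import warnings

import numpy as np

rng = np.random.default_rng(0)


def make_data(n):
    """Noisy samples of a quadratic ground truth (an arbitrary choice)."""
    x = rng.uniform(-2, 2, size=n)
    y = 10 + 2 * x ** 2 + rng.normal(scale=2.0, size=n)
    return x, y


x_train, y_train = make_data(20)    # data used to fit
x_test, y_test = make_data(1000)    # new data, never used to fit

train_mse, test_mse = {}, {}
for order in (1, 2, 15):
    with warnings.catch_warnings():
        # High orders can trigger a conditioning warning; ignored in this sketch.
        warnings.simplefilter("ignore")
        coefs = np.polyfit(x_train, y_train, deg=order)
    train_mse[order] = float(np.mean((np.polyval(coefs, x_train) - y_train) ** 2))
    test_mse[order] = float(np.mean((np.polyval(coefs, x_test) - y_test) ** 2))
    print(f"order={order:2d}  train MSE={train_mse[order]:.2f}  "
          f"test MSE={test_mse[order]:.2f}")
```

The error on the fitting data can only go down as the order grows, since each model contains the previous one. Only the error on the held-out data tells whether the extra flexibility helps prediction.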
  • 21. 1 Evidence for prediction [Varoquaux... 2017] Established by cross-validation. [Figure: full data split into train set and test set]
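As a rough sketch of what cross-validation does, the code below splits synthetic data into 5 disjoint test folds, fits on the rest each time, and scores only on the held-out fold. The nearest-centroid classifier and the synthetic data are stand-ins chosen to keep the example short; they are not the models or data discussed in the talk.

```python
import numpy as np


def k_fold_indices(n_samples, n_folds, rng):
    """Shuffle sample indices and split them into n_folds disjoint test sets."""
    return np.array_split(rng.permutation(n_samples), n_folds)


def nearest_centroid_predict(X_train, y_train, X_test):
    """Toy classifier: assign each test sample to the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    distances = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[distances.argmin(axis=1)]


rng = np.random.default_rng(0)
# Synthetic data standing in for real features: 2 classes, 100 samples.
y = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 5)) + y[:, None]

folds = k_fold_indices(len(y), n_folds=5, rng=rng)
scores = []
for test_idx in folds:
    # Everything outside the test fold is used for fitting.
    train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
    y_pred = nearest_centroid_predict(X[train_idx], y[train_idx], X[test_idx])
    scores.append(float(np.mean(y_pred == y[test_idx])))
print("accuracy per fold:", scores)
```

Each sample is tested exactly once, and never by a model that saw it during fitting.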
  • 22. One does not simply claim prediction [R. Poldrack, G. Huckins, G. Varoquaux, submitted] [Figure: cross-validation used in the 100 last publications on “fMRI prediction”: None, K-fold, Leave-one-out, Leave-X-out, Other]
  • 23. 1 Cross-validation is solid evidence? [Varoquaux 2017] In the literature, effect sizes decrease with sample sizes. [Figure: reported accuracy (50% to 100%) as a function of study sample size (30 to 1000), with the p=.05 level, in panels: Wolfer2015: Psychiatric diagnostic; Arbabshirani2017: Alzheimer's; Woo2017: Alzheimer's; Woo2017: Depression; Brown2017: Connectome learning; Arbabshirani2017: Schizophrenia; Woo2017: Psychosis; Woo2017: Autism]
  • 24. 1 Cross-validation is solid evidence? [Varoquaux 2017] Trivial analytic variations on permuted data: smoothing, SVM vs log-reg, feature selection. [Figure: cross-validation scores for different decoders, by sessions used: 4 first (n~72): 25% to 39%; 4 last (n~72): 40% to 71%; 6 first (n~108): 38% to 57%; 6 last (n~108): 47% to 57%; all 12 (n~216): 44% to 52%]
  • 25. 1 Cross-validation is solid evidence? [Varoquaux 2017] Trivial analytic variations on permuted data: smoothing, SVM vs log-reg, feature selection. [Figure: cross-validation scores for different decoders, by sessions used: 4 first (n~72): 25% to 39%; 4 last (n~72): 40% to 71%; 6 first (n~108): 38% to 57%; 6 last (n~108): 47% to 57%; all 12 (n~216): 44% to 52%] With small n, by chance, some analytic choices give seemingly good predictions
  • 26. 1 Cross-validation is solid evidence? [Varoquaux 2017] [Figure: sampling distribution of test error, for LOO and for 50 splits with 20% test, by number of available samples: 30: -19% +15% and -20% +18%; 100: -10% +8% and -10% +10%; 200: -7% +5% and -7% +7%; 300: -5% +4% and -6% +6%] Sampling distribution of test error for n = 30
  • 27. 1 Cross-validation is solid evidence? [Varoquaux 2017] [Figure: sampling distribution of test error, for LOO and for 50 splits with 20% test, by number of available samples: 30: -19% +15% and -20% +18%; 100: -10% +8% and -10% +10%; 200: -7% +5% and -7% +7%; 300: -5% +4% and -6% +6%; 1000: -3% +2% and -3% +3%]
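One way to build intuition for why the error bars shrink with sample size is a plain binomial model: if a classifier has a fixed true accuracy and is scored on n independent test samples, the observed accuracy fluctuates around it. The sketch below computes a central 95% interval under that assumption. It is a simplified model, not the simulation behind the figure, and the true accuracy of 0.75 is a hypothetical value.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of k correct predictions out of n, computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)


def accuracy_interval(n, p, coverage=0.95):
    """Central interval of the observed accuracy on n test samples."""
    tail = (1 - coverage) / 2
    cumulative, low, high = 0.0, None, None
    for k in range(n + 1):
        cumulative += binomial_pmf(k, n, p)
        if low is None and cumulative >= tail:
            low = k / n          # lower quantile of the observed accuracy
        if cumulative >= 1 - tail:
            high = k / n         # upper quantile of the observed accuracy
            break
    return low, high


true_accuracy = 0.75  # hypothetical value, for illustration only
intervals = {n: accuracy_interval(n, true_accuracy) for n in (30, 100, 1000)}
for n, (low, high) in intervals.items():
    print(f"n={n:4d}: observed accuracy between {low:.2f} and {high:.2f}")
```

The interval narrows roughly as one over the square root of n, which is why several hundred test subjects are needed before a few percent of accuracy can be distinguished from sampling noise.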
  • 28. 1 Cross-validation is solid evidence? [Varoquaux 2017] [Figure: difference between public and private scores, on an axis from -45% to +30%, with a spread of -15% to +14%] Kaggle competition on r-fMRI for Schizophrenia. 2 different test sets: size 30 and 28
  • 29. One does not simply claim prediction. We need: clean cross-validation (strong generalization = testing on data never seen); several 100s subjects
  • 30. Yes we can. Reliable prediction of clinical endpoints would be game changing. But we need larger sizes, reduced analytical variability, and clean validation
  • 31. 2 Extracting biomarkers from rest fMRI. Addressing the perils of analytical variability. Systematic study: 6 different cohorts, more than 2000 individuals [Dadi... 2019]
  • 32. From rest-fMRI to biomarkers. No salient features in rest fMRI
  • 33. From rest-fMRI to biomarkers. Define functional regions
  • 34. From rest-fMRI to biomarkers. Define functional regions. Learn interactions
  • 35. From rest-fMRI to biomarkers. Define functional regions. Learn interactions. Detect differences
  • 36. From rest-fMRI to biomarkers. [Figure: pipeline RS-fMRI → region definition → time series extraction → functional connectivity matrix → supervised learning]
  • 37. 2 Defining regions. Anatomical atlases. Clustering: k-means, ward [Thirion... 2014]
  • 38. 2 Defining regions. Anatomical atlases. Clustering: k-means, ward [Thirion... 2014]. Decomposition models. [Figure: decomposition of the time × voxels data, Y = E · S + N, with 25 components]
  • 39. 2 Defining regions. Anatomical atlases. Clustering: k-means, ward [Thirion... 2014]. Decomposition models: ICA: seek independence of maps; sparse dictionary learning: seek sparse maps
  • 40. 2 In connectome prediction settings. [Figure: pipeline from RS-fMRI: 1 ROIs, 2 time series, 3 functional connectivity, 4 diagnosis] Choice of regions for best prediction?
  • 41. 2 In connectome prediction settings. [Figure: pipeline from RS-fMRI: 1 ROIs, 2 time series, 3 functional connectivity, 4 diagnosis] Choice of regions for best prediction? [Dadi... 2019]
  • 42. 2 Connectome: building a connectivity matrix. How to capture and represent interactions?
  • 43. 2 Connectome: differences across subjects. [Figure: correlation matrices and partial correlation matrices of 4 subjects] 3 controls, 1 severe stroke patient. Which is which?
  • 44. 2 Connectome: differences across subjects. [Figure: correlation matrices and partial correlation matrices of 4 subjects, labeled Control, Control, Control, Large lesion] Spread-out variability in correlation matrices. Noise in partial-correlations. Strong dependence between coefficients [Varoquaux... 2010]
  • 45. 2 Connectome: differences across subjects. [Figure: correlation matrices, partial correlation matrices, and tangent-space embedding of 4 subjects, labeled Control, Control, Control, Large lesion] Tangent-space embedding [Varoquaux... 2010]
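The three connectivity parameterizations compared on these slides can be sketched with NumPy on synthetic time series. This is an illustration under stated assumptions, not the implementation of [Varoquaux... 2010] or of nilearn: the function names are hypothetical, and the reference point of the tangent space is taken as the arithmetic mean of the subjects' correlation matrices, which is a simplifying choice.

```python
import numpy as np


def _apply_to_eigenvalues(matrix, func):
    """Apply func to the eigenvalues of a symmetric matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)
    return (eigenvectors * func(eigenvalues)) @ eigenvectors.T


def correlation(time_series):
    """Correlation matrix of a (n_timepoints, n_regions) array."""
    return np.corrcoef(time_series, rowvar=False)


def partial_correlation(corr):
    """Partial correlations, from the normalized inverse of the correlation."""
    precision = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial, 1.0)
    return partial


def tangent_embedding(corr, reference):
    """Matrix logarithm of corr after whitening by the reference matrix."""
    whitening = _apply_to_eigenvalues(reference, lambda v: 1.0 / np.sqrt(v))
    return _apply_to_eigenvalues(whitening @ corr @ whitening, np.log)


rng = np.random.default_rng(0)
# Synthetic stand-in for 4 subjects, 200 timepoints, 5 regions.
mixing = np.eye(5) + 0.3 * rng.normal(size=(5, 5))
subjects = [rng.normal(size=(200, 5)) @ mixing for _ in range(4)]

correlations = [correlation(ts) for ts in subjects]
partials = [partial_correlation(c) for c in correlations]
reference = np.mean(correlations, axis=0)  # assumption: arithmetic mean
tangents = [tangent_embedding(c, reference) for c in correlations]
print(np.round(tangents[0], 2))
```

In the tangent space, each subject is described by its deviation from the reference: a subject whose matrix equals the reference maps to the zero matrix.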
  • 46. 2 Connectivity matrix for predictive models. [Figure: pipeline from RS-fMRI: 1 ROIs, 2 time series, 3 functional connectivity, 4 diagnosis] [Dadi... 2019]
  • 47. 2 Machine learning for connectome prediction. [Figure: pipeline from RS-fMRI: 1 ROIs, 2 time series, 3 functional connectivity, 4 diagnosis] Supervised learning step: linear models, random forests. Sparse or non sparse?
  • 48. 2 Machine learning for connectome prediction. [Figure: pipeline from RS-fMRI: 1 ROIs, 2 time series, 3 functional connectivity, 4 diagnosis] Supervised learning step: linear models, random forests. Sparse or non sparse? [Dadi... 2019]
  • 49. @GaelVaroquaux Functional-connectome biomarkers. Biomarkers: early assessment, prognosis, proxy clinical endpoints. Reliable biomarkers: larger sample sizes, clean evidence of generalization, higher standards [Woo... 2017]
  • 50. @GaelVaroquaux Functional-connectome biomarkers. Biomarkers: game-changing if trustworthy. Rest-fMRI biomarkers extraction: functional regions (extracted by dictionary learning); tangent space to compare connectomes; linear model for supervised learning. [Figure: pipeline from RS-fMRI to diagnosis: 1 defining brain ROIs, 2 connectivity parameterization, 3 supervised learning] Software: nilearn
  • 51. References I. A. Abraham, M. Milham, A. Di Martino, R. C. Craddock, D. Samaras, B. Thirion, and G. Varoquaux. Deriving reproducible biomarkers from multi-site resting-state data: An autism-based example. NeuroImage, 2017. K. Dadi, M. Rahim, A. Abraham, D. Chyzhyk, M. Milham, B. Thirion, G. Varoquaux, and the Alzheimer's Disease Neuroimaging Initiative. Benchmarking functional connectome-based predictive models for resting-state fMRI. NeuroImage, 2019. F. Liem, G. Varoquaux, J. Kynast, F. Beyer, S. K. Masouleh, J. M. Huntenburg, L. Lampe, M. Rahim, A. Abraham, R. C. Craddock, ... Predicting brain-age from multimodal imaging data captures cognitive impairment. NeuroImage, 2016. M. Rahim, B. Thirion, D. Bzdok, I. Buvat, and G. Varoquaux. Joint prediction of multiple scores captures better individual traits from brain images. NeuroImage, in rev, 2017.
  • 52. References II. B. Thirion, G. Varoquaux, E. Dohmatob, and J. Poline. Which fMRI clustering gives good brain parcellations? Frontiers in Neuroscience, 8:167, 2014. G. Varoquaux. Cross-validation failure: small sample sizes lead to large error bars. NeuroImage, 2017. G. Varoquaux, F. Baronnet, A. Kleinschmidt, P. Fillard, and B. Thirion. Detection of brain functional-connectivity difference in post-stroke patients using group-level covariance modeling. In MICCAI, 2010. G. Varoquaux, P. R. Raamana, D. A. Engemann, A. Hoyos-Idrobo, Y. Schwartz, and B. Thirion. Assessing and tuning brain decoders: cross-validation, caveats, and guidelines. NeuroImage, 145:166–179, 2017. C.-W. Woo, L. J. Chang, M. A. Lindquist, and T. D. Wager. Building better biomarkers: brain models in translational neuroimaging. Nature Neuroscience, 20(3):365, 2017.