This document summarizes the current state of using machine learning to predict traits and behaviors from brain images. It discusses typical machine learning workflows and a favorite predictive model called the Brain Basis Set. It reviews what traits have been successfully predicted from brain images so far. It also discusses characteristics of successful predictive models, the role of large datasets, and ways prediction could be improved, such as through better data preprocessing and addressing bias. Throughout, it emphasizes the importance of transparency, reproducibility, and collaboration.
Current State of Brain Prediction & Improving Models
1. The current state of prediction in neuroimaging
Saige Rutherford
@being_saige
www.beingsaige.com
2. Road Map
• Quick review of typical ML workflow + my favorite predictive model
• Which traits and behaviors can we predict from brain images?
• What do various successful predictive models have in common?
• What does a “successful” predictive model look like?
• How does big data fit in, is there hope for smaller datasets?
• Where is there room to improve brain-behavior predictive models?
6. Favorite predictive model: Brain Basis Set
Basis Set = chosen number of top components from a PCA decomposition of the subjects x features matrix
AKA principal component regression
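A minimal sketch of principal component regression, to make the slide concrete. It uses NumPy only and synthetic data; the function names (`fit_pcr`, `predict_pcr`), the number of components, and the simulated features are all assumptions for illustration, not the published BBS pipeline.

```python
import numpy as np

def fit_pcr(X, y, n_components):
    """Principal component regression: PCA on X, then least squares on the scores."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal axes of the centered subjects x features matrix.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]          # the "basis set"
    scores = Xc @ components.T              # per-subject expression scores
    beta, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    return {"x_mean": x_mean, "y_mean": y_mean,
            "components": components, "beta": beta}

def predict_pcr(model, X):
    """Project new subjects onto the stored basis and apply the regression weights."""
    scores = (X - model["x_mean"]) @ model["components"].T
    return scores @ model["beta"] + model["y_mean"]

# Synthetic stand-in for connectivity features (not real data).
rng = np.random.default_rng(0)
n_subjects, n_features, n_latent = 200, 500, 5
latent = rng.normal(size=(n_subjects, n_latent))
loadings = rng.normal(size=(n_latent, n_features))
X = latent @ loadings + 0.5 * rng.normal(size=(n_subjects, n_features))
y = latent[:, 0] + 0.5 * rng.normal(size=n_subjects)

# Fit on 150 subjects, evaluate on 50 held-out subjects.
train, test = np.arange(150), np.arange(150, 200)
model = fit_pcr(X[train], y[train], n_components=10)
r = np.corrcoef(predict_pcr(model, X[test]), y[test])[0, 1]
print(f"out-of-sample correlation: {r:.2f}")
```

The basis and the regression weights are estimated on training subjects only; held-out subjects are just projected onto the stored basis.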
7. Phenotype                  BBS    CPM
General Executive             0.44   0.42
Processing Speed              0.39   0.23
Penn Progressive Matrices     0.30   0.32
ASR Externalizing             0.24   0.03
ASR Internalizing             0.20   0.04
ASR Attention                 0.21   0.00
NEO-Openness                  0.18   0.11
NEO-Conscientiousness         0.19   0.15
NEO-Extroversion              0.13   0.04
NEO-Agreeableness             0.19   0.10
NEO-Neuroticism               0.00   0.05
[Figure: mean correlation between predicted & observed phenotype vs. number of components used to predict. Panels: 100 held-out unrelated subjects; 10-fold cross validation]
Sripada et al. Scientific Reports (2019)
9. Successful Predictive Modeling
Test your prediction model in “the wild”
Sripada et al. Molecular Psychiatry (2019)
E.g., controlling for confounds (motion, demographics, medication) and trying different cross-validation splits.
This gives more believable and realistic results!
Rozycki et al. Schizophrenia Bulletin (2017)
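One way to control for a confound without leaking information is to estimate the confound model on the training split only and apply it to both splits. The sketch below is an assumption-laden illustration on synthetic data: `residualize` and the simulated `motion` variable are hypothetical, and it is not the procedure used in the cited papers.

```python
import numpy as np

def residualize(train_data, train_conf, test_data, test_conf):
    """Remove linear confound effects using coefficients fit on training data only."""
    A_train = np.column_stack([np.ones(len(train_conf)), train_conf])
    A_test = np.column_stack([np.ones(len(test_conf)), test_conf])
    coef, *_ = np.linalg.lstsq(A_train, train_data, rcond=None)
    return train_data - A_train @ coef, test_data - A_test @ coef

rng = np.random.default_rng(1)
n = 300
motion = rng.normal(size=(n, 1))               # hypothetical confound
X = rng.normal(size=(n, 20)) + 2.0 * motion    # features contaminated by motion

idx = rng.permutation(n)
train, test = idx[:200], idx[200:]
X_train_clean, X_test_clean = residualize(
    X[train], motion[train], X[test], motion[test]
)

# Compare how strongly one feature tracks motion in the test split.
before = abs(np.corrcoef(X[test][:, 0], motion[test][:, 0])[0, 1])
after = abs(np.corrcoef(X_test_clean[:, 0], motion[test][:, 0])[0, 1])
print(f"|corr| with motion in test split: before={before:.2f}, after={after:.2f}")
```

Inside cross validation, the same function would be called once per fold so that test subjects never influence the confound coefficients.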
10. Successful Predictive Modeling
Impact of region-definition method on
prediction accuracy
Impact of connectivity
parameterization on prediction
accuracy
Impact of classifier choice on
prediction accuracy
Dadi et al. Neuroimage (2019)
https://www.sciencedirect.com/science/article/pii/S1053811919301594
12. What not to do
don’t be this guy
1. Be a research troll
Research Troll: Someone who is overly protective of their
data, unwilling to share data and well-documented code.
2. No out of sample test set or cross validation
13. What not to do
Make bold claims about one model/method being the best…
You know what they say when you assume…
You’re probably wrong, and someone will publicly prove this to
you in a Twitter thread
14. Big Datasets are taking over…
Where does my “small” data fit in?
Big data can act as a “discovery” dataset.
Use HCP, ABCD, or UKBiobank to find a
brain basis set then get expression
scores of these components in your
dataset.
Use pretrained models from big data,
treat your dataset as a true out of sample
test set.
[Figure: Multi-Task Learning, Transfer Learning. Labels: Externalizing, Internalizing, Attention, Model, Externalizing*]
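A rough sketch of the "discovery dataset" idea: learn a basis set on a large dataset, then compute expression scores for a small dataset by projecting it onto that basis. Everything here is simulated; `X_big` and `X_small` are stand-ins, and the shared `loadings` encode the assumption that both datasets have the same underlying structure and feature definition.

```python
import numpy as np

rng = np.random.default_rng(2)
n_features, n_latent = 300, 4
loadings = rng.normal(size=(n_latent, n_features))   # shared structure (assumed)

def simulate(n_subjects):
    """Simulate a subjects x features matrix driven by the shared loadings."""
    latent = rng.normal(size=(n_subjects, n_latent))
    return latent @ loadings + 0.5 * rng.normal(size=(n_subjects, n_features))

X_big = simulate(2000)    # stand-in for a large discovery dataset
X_small = simulate(40)    # stand-in for your own study

# Learn the basis set on the big dataset only.
big_mean = X_big.mean(axis=0)
_, _, Vt = np.linalg.svd(X_big - big_mean, full_matrices=False)
basis = Vt[:n_latent]

# Expression scores of those components in the small dataset.
scores_small = (X_small - big_mean) @ basis.T
print(scores_small.shape)
```

Centering the small dataset with the big dataset's mean is one possible choice; whether that is appropriate depends on how comparable the acquisitions and preprocessing are.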
16. Contributing to Big Data
Federated Learning: lets you train models on distributed datasets that you cannot directly access.
https://blog.openmined.org/federated-learning-differential-privacy-and-encrypted-computation-for-medical-imaging/
https://arxiv.org/pdf/1610.05492.pdf
https://ai.googleblog.com/2017/04/federated-learning-collaborative.html
Federated Learning tutorial using brain age prediction model coming soon
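To give a feel for the idea, here is a toy sketch in which each site updates a linear model on its own private data and only the weights are shared and averaged. It is a simplified weighted-averaging scheme written for illustration on synthetic data; it is not the brain age tutorial mentioned above and not the API of any federated learning library.

```python
import numpy as np

def local_update(w, X, y, lr=0.1, steps=20):
    """Run a few gradient descent steps for linear regression on one site's data."""
    w = w.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

rng = np.random.default_rng(3)
true_w = np.array([1.5, -2.0, 0.5])

# Three hypothetical sites; the raw data never leaves the site.
sites = []
for n in (50, 120, 80):
    X = rng.normal(size=(n, 3))
    y = X @ true_w + 0.1 * rng.normal(size=n)
    sites.append((X, y))

w_global = np.zeros(3)
for _ in range(10):                      # communication rounds
    local_ws = [local_update(w_global, X, y) for X, y in sites]
    sizes = np.array([len(y) for _, y in sites])
    # The server only sees weights, averaged in proportion to sample size.
    w_global = np.average(local_ws, axis=0, weights=sizes)

print(np.round(w_global, 2))
```

Real deployments add secure aggregation, differential privacy, and handling of sites with very different data distributions, none of which is shown here.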
17. How can we improve prediction?
Put in the (hard) work to prepare your data properly…
Tangential point about preprocessing fMRI data
Haak, Marquand, Beckmann, Neuroimage 2017
18. Lots of papers pointing to this same idea…
Don’t use a fixed atlas!
https://cdn.elifesciences.org/articles/44890/elife-44890-v2.pdf
https://cdn.elifesciences.org/articles/32992/elife-32992-v1.pdf
https://www.ncbi.nlm.nih.gov/pubmed/25598050
https://www.sciencedirect.com/science/article/pii/S1053811917305463
https://www.biorxiv.org/content/10.1101/431833v2
https://www.ncbi.nlm.nih.gov/pubmed/29878084
19. How can we improve prediction?
Most of machine learning is about good data hygiene.
UNDERSTAND YOUR DATA!
https://twitter.com/justmarkham/status/1155840938356432896
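pip install pandas_profiling  # shell command
import pandas_profiling  # adds profile_report() to pandas DataFrames
df.profile_report()  # df is a pandas DataFrame you have already loaded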
20. Patient or healthy control?
Think deeply before you turn a continuous
trait into a categorical trait.
Dimensional neuroimaging: our ability to
place a brain scan into a succinct, yet highly
comprehensive and informative reference
system, dimensions of which will reflect
patterns associated with normal or pathologic
brain structure or function.
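A small simulation of what is lost when a continuous trait is dichotomized. The variables (`brain_score`, `symptom`, `is_patient`) and the median split are hypothetical choices for illustration; the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000

# A brain-derived score and a continuous symptom measure with true correlation 0.5.
brain_score = rng.normal(size=n)
symptom = 0.5 * brain_score + np.sqrt(1 - 0.5**2) * rng.normal(size=n)

# Turn the continuous trait into a "patient vs. control" label with a median split.
is_patient = (symptom > np.median(symptom)).astype(float)

r_continuous = np.corrcoef(brain_score, symptom)[0, 1]
r_binary = np.corrcoef(brain_score, is_patient)[0, 1]
print(f"continuous trait: r={r_continuous:.2f}, median split: r={r_binary:.2f}")
```

The brain score is identical in both cases; only the coding of the trait changes, and the association with the dichotomized label comes out weaker.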
21. How can we improve prediction?
Bias in neuroimaging data…we need to do better at acknowledging it.
Big Data != Population Data
Does ML reveal the true nature of relationships, unconstrained by any bias or human influence?
The answer is an unequivocal No.
23. Take Home Messages
There is not one perfect prediction framework to rule them all.
Machine Learning No Free Lunch theorem: no machine learning
method is better than the others, on average, over a broad family of
problems.
Embrace and collaborate with big data.
Big data: multi-task learning, share your models
Small data: transfer learning, use pre-trained models
Focus on transparency and reproducibility
24. Learning Resources
This is a research process, not a final offering.
OHBM 2019 talks on ML:
https://www.pathlms.com/ohbm/courses/12238/sections/15843/video_presentations/138295
https://www.pathlms.com/ohbm/courses/12238/sections/15843/video_presentations/138032
https://www.pathlms.com/ohbm/courses/12238/sections/15843/video_presentations/138231
https://www.pathlms.com/ohbm/courses/12238/sections/15843/video_presentations/138291
https://www.pathlms.com/ohbm/courses/12238/sections/15843/video_presentations/138219
Gael Varoquaux talks:
https://www.slideshare.net/GaelVaroquaux/functionalconnectome-biomarkers-to-meet-clinical-needs
https://www.slideshare.net/GaelVaroquaux/machine-learning-on-non-curated-data-154905090
Machine learning in neuroimaging: Progress and challenges. Neuroimage. 2019 August 15.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6499712/pdf/nihms-1025732.pdf
Learn a new Pandas trick everyday: https://www.dataschool.io/python-pandas-tips-and-tricks/
26. Thank you!
All who have supported/inspired me on my learning journey.
Mike Angstadt, Chandra Sripada, Jenna Wiens, Daniel Kessler, Aman
Taxali, Bennet Fauber, Marlena Duda, GirlsWhoCode Organization, Ivy Tso,
Soo-Eun Chang, Steve Taylor, the entire University of Michigan community!
Editor's Notes
How should nodes be chosen? How many nodes are needed for brain-imaging based diagnosis?
How should weights of brain functional connectomes be represented?
What classifiers should be used? Should linear or non-linear models be preferred? Should sparse or non-sparse models be used? With or without feature selection?
We study the prediction score of each pipeline relative to the mean across pipelines on each fold. This relative measure discards the variance in scores due to folds or datasets.
• Regions defined functionally (with dictionary learning or ICA) give the best prediction.
• Prefer tangent-space parametrization of connectomes to full or partial correlation.
• Non-sparse linear classifiers are best for supervised learning.
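The relative measure described in these notes can be sketched in a few lines: subtract each fold's mean across pipelines, then average per pipeline. The accuracy values below are made up purely to show the arithmetic.

```python
import numpy as np

# Hypothetical accuracies: rows are cross-validation folds, columns are pipelines.
scores = np.array([
    [0.70, 0.74, 0.66],
    [0.60, 0.65, 0.58],
    [0.80, 0.83, 0.77],
])

# Score of each pipeline relative to the mean across pipelines on the same fold.
relative = scores - scores.mean(axis=1, keepdims=True)

# Averaging over folds now compares pipelines without fold-to-fold difficulty.
print(relative.mean(axis=0))
```

Raw accuracies in this toy example range from about 0.6 to 0.8 across folds, but the relative scores show the same pipeline ahead on every fold.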
Machine learning 101: a model that fits the data well doesn't necessarily generalize well.
In the era of big data, generalization should be tested in separate samples, or else using split-sample approaches in which one split is kept completely hidden until the very final application of a model.
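A sketch of the split-sample approach: lock away a test split first, do all model selection with cross validation on the remaining data, and touch the test split exactly once. Ridge regression, the candidate `alphas`, the split sizes, and the synthetic data are assumptions chosen to keep the example short.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression (no intercept; the toy data are zero-mean)."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

def cv_error(X, y, alpha, k=5):
    """Mean squared error averaged over k cross-validation folds."""
    indices = np.arange(len(y))
    errors = []
    for fold in np.array_split(indices, k):
        train = np.setdiff1d(indices, fold)
        w = ridge_fit(X[train], y[train], alpha)
        errors.append(np.mean((X[fold] @ w - y[fold]) ** 2))
    return float(np.mean(errors))

rng = np.random.default_rng(5)
n, p = 200, 50
X = rng.normal(size=(n, p))
w_true = np.zeros(p)
w_true[:5] = 1.0
y = X @ w_true + rng.normal(size=n)

# Hide the final test split before any model selection happens.
perm = rng.permutation(n)
dev, test = perm[:150], perm[150:]

# Choose the penalty using cross validation on the development split only.
alphas = [0.1, 1.0, 10.0, 100.0]
best_alpha = min(alphas, key=lambda a: cv_error(X[dev], y[dev], a))

# Refit on the full development split and evaluate once on the hidden split.
w = ridge_fit(X[dev], y[dev], best_alpha)
dev_mse = np.mean((X[dev] @ w - y[dev]) ** 2)
test_mse = np.mean((X[test] @ w - y[test]) ** 2)
print(f"alpha={best_alpha}, fit MSE={dev_mse:.2f}, held-out MSE={test_mse:.2f}")
```

The fit error on the development split is the "fits the data well" number; the held-out error is the one that speaks to generalization.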
Transfer learning example: knowledge gained while learning to recognize cars could apply when trying to recognize trucks.
Big data people: think about multi-task learning, which creates more generalizable models. Also make sure you always share your saved models for people who might not have access to the big data.
Small data people: think transfer learning. If you can get access to the saved models that big data people share, you can test them on your data, even if the model wasn’t explicitly trained to predict the exact phenotype you are using.
Collaborative machine learning without centralized datasets.
Consider a theoretical region with one mode of organization (it could encode task activation, connectivity, or stimulus response) running in one particular direction, and a second mode of organization in the same area running in a perpendicular direction. Taking measurements directly means measuring the superposition of the two modes, and we would wrongly infer that things are organized along the diagonal. When you parcellate this data, you get a completely wrong ROI atlas definition that does not respect the underlying organization at all. We know this is true in motor cortex and primary visual cortex, and a lot of work suggests it is present in other regions of the brain.
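A toy illustration of the superposition argument: two perpendicular gradients on a grid, measured together, look like a single gradient along the diagonal. The grid, the linear gradients, and equal weighting of the two modes are simplifying assumptions.

```python
import numpy as np

# Toy 2D patch of cortex on a regular grid (arbitrary units).
coords = np.linspace(0.0, 1.0, 50)
xx, yy = np.meshgrid(coords, coords, indexing="xy")

mode_a = xx                   # first mode of organization: varies left to right
mode_b = yy                   # second mode: varies in the perpendicular direction
measured = mode_a + mode_b    # a direct measurement sees the superposition

# np.gradient returns derivatives along axis 0 (rows, y), then axis 1 (columns, x).
d_dy, d_dx = np.gradient(measured, coords, coords)
angle = np.degrees(np.arctan2(d_dy.mean(), d_dx.mean()))
print(f"apparent direction of organization: {angle:.0f} degrees")
```

Neither underlying mode runs at 45 degrees, yet that is the direction the measured map suggests, which is why a parcellation drawn from it would not respect either mode.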
The majority of machine learning in clinical neuroscience has focused on classifying patients versus healthy controls. Although this is a good starting point, its practical value is very limited: those patients are presumably already “correctly” classified via simpler clinical examinations, which is why their labels are used as ground truth.
In the computer-science-based machine learning community, bias in predictive models is widely acknowledged and discussed. At MLHC this past summer, the closing panel spent 2 hours discussing biases and ways of overcoming them.