Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - Frederick Eberhardt, December 11, 2019

Causal Discovery in Neuroimaging Data
Frederick Eberhardt
SAMSI Causal Inference Workshop, Dec 9-11, 2019

Overview
resting state brain → causal discovery → causal graph of brain regions

resting state fMRI data in the HCP
• The subjects are asked to lie with eyes open, with “relaxed” fixation
on a white cross (on a dark background), think of nothing in
particular, and not to fall asleep.
• Four ~15-minute rfMRI runs are acquired in two separate fMRI
sessions
• Acquisition parameters: 1200 volumes, TR = 720 ms, voxel size =
2mm isotropic, 72 slices
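
As a quick consistency check, the quoted volume count and TR should reproduce the "~15-minute" run length. The sketch below only does that arithmetic; the variable names are illustrative and not taken from any HCP tooling.

```python
# Acquisition parameters quoted on the slide.
VOLUMES_PER_RUN = 1200
TR_MS = 720   # repetition time in milliseconds
RUNS = 4

run_seconds = VOLUMES_PER_RUN * TR_MS / 1000   # duration of one run in seconds
run_minutes = run_seconds / 60
total_volumes = VOLUMES_PER_RUN * RUNS         # time points per subject over all runs

print(f"one run: {run_seconds:.0f} s = {run_minutes:.1f} min")   # one run: 864 s = 14.4 min
print(f"all runs: {total_volumes} volumes")                      # all runs: 4800 volumes
```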

Anatomical Parcellation
[Figure: voxel-level resting state fMRI and the anatomical parcellation]

Parcel Activation over Time
[Figure: parcel activation; axes: time points, parcels]

What are we measuring?
[Figure: ~100 billion neurons (10-100 ms interactions); ~90,000 voxels, BOLD signal (720 ms measurement interval, 5-6 s hemodynamic delay); parcel 1, …, parcel 405]
• indirect measurement of neural activity by BOLD signal
• aggregation in space from voxel to parcel
• aggregation in time by an order of magnitude
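
The two aggregation steps can be illustrated with a minimal sketch. It assumes that a parcel's signal is the plain mean of its voxels and that temporal aggregation is a block average; neither choice is stated in the slides, and the data and parcel labels are toy values.

```python
from collections import defaultdict

def parcel_time_series(voxel_ts, voxel_to_parcel):
    """Average voxel time series within each parcel (aggregation in space).

    voxel_ts: list of per-voxel time series (equal-length lists of floats).
    voxel_to_parcel: parcel label of each voxel, in the same order.
    Returns {parcel_label: mean time series}.
    """
    groups = defaultdict(list)
    for series, parcel in zip(voxel_ts, voxel_to_parcel):
        groups[parcel].append(series)
    return {
        parcel: [sum(vals) / len(vals) for vals in zip(*members)]
        for parcel, members in groups.items()
    }

def downsample(series, factor):
    """Average consecutive blocks of `factor` samples (aggregation in time)."""
    usable = len(series) - len(series) % factor
    return [sum(series[i:i + factor]) / factor for i in range(0, usable, factor)]

# Toy data: 4 voxels, 6 time points, 2 hypothetical parcels.
voxels = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    [0.0, 0.0, 1.0, 1.0, 0.0, 0.0],
    [2.0, 2.0, 1.0, 1.0, 2.0, 2.0],
]
labels = ["parcel_1", "parcel_1", "parcel_2", "parcel_2"]

parcels = parcel_time_series(voxels, labels)
coarse = downsample(parcels["parcel_1"], 3)
print(parcels["parcel_2"])  # [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(coarse)               # [3.0, 6.0]
```

Note how the two voxels in parcel_2 fluctuate in opposite directions and average to a flat line: aggregation can hide structure that exists at the finer scale.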

Fast Greedy Equivalence Search (FGES)
Chickering (2002) JMLR; Ramsey (2017) IJDSA
• greedy local search for a DAG
• uses Bayesian Information Criterion (BIC) to determine which edge to
add/delete at any stage in the algorithm
• one free “sparsity” parameter that regulates fit vs. model complexity
• is consistent: given its assumptions it will identify the Markov
equivalence class of the true DAG in the large sample limit
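
To make the scoring step concrete, here is a minimal sketch of a BIC comparison for a single candidate edge under a linear Gaussian model. It is not the Tetrad or fges-py implementation: it scores one node given a parent set and omits the search over equivalence classes. The penalty form `lam * (number of parents) * log(n)`, the helper names, and the toy data are all assumptions made for illustration.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def centered(v):
    mean = sum(v) / len(v)
    return [a - mean for a in v]

def residual_variance(child, parents):
    """Variance of `child` left after linear regression on `parents`."""
    residual = centered(child)
    basis = []  # mutually orthogonal versions of the parent columns
    for p in parents:
        q = centered(p)
        for b in basis:
            coef = dot(q, b) / dot(b, b)
            q = [qi - coef * bi for qi, bi in zip(q, b)]
        if dot(q, q) > 1e-12:  # skip parents that add no new direction
            basis.append(q)
    for b in basis:
        coef = dot(residual, b) / dot(b, b)
        residual = [ri - coef * bi for ri, bi in zip(residual, b)]
    return dot(residual, residual) / len(child)

def local_bic(child, parents, lam=1.0):
    """Gaussian BIC of one node given its parents, up to additive constants.

    Lower is better; `lam` plays the role of the sparsity parameter.
    """
    n = len(child)
    fit = n * math.log(residual_variance(child, parents))
    return fit + lam * len(parents) * math.log(n)

# Toy data: x causes y, z is unrelated (all names hypothetical).
random.seed(0)
n = 500
x = [random.gauss(0, 1) for _ in range(n)]
z = [random.gauss(0, 1) for _ in range(n)]
y = [2 * xi + random.gauss(0, 1) for xi in x]

candidates = {"x": x, "z": z}
deltas = {name: local_bic(y, [col]) - local_bic(y, []) for name, col in candidates.items()}
best = min(deltas, key=deltas.get)  # the insertion a greedy step would try first
print(best, deltas)
```

A greedy forward step would add the edge with the most negative score change and stop when no change is negative; raising `lam` makes every insertion more expensive and the learned graph sparser.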

• Assumptions
- acyclic causal structure
- no confounding
- linear Gaussian structural equation model
- iid data
• scales to >100,000 variables
- implementation in Java in the Tetrad code package
- our Python implementation here: https://github.com/eberharf/fges-py/

resting state fMRI
[Figure: correlations; 880 subjects vs. single subject]
[Figure: causal connections; 880 subjects vs. single subject; left hemisphere, right hemisphere (example structure for illustration)]
[Figure: causal connections between hemispheres; 880 subjects vs. single subject (example structure for illustration)]

Cross-species Analysis
[Figure: human causal connections and mouse causal connections]

Causal Orientation
Fast Greedy Equivalence Search
• acyclic causal structure
• no unmeasured confounding
• very scalable
Answer Set Programming for Causality
• cyclic or acyclic causal structure
• unmeasured confounding permitted
• does not scale
[Figure: graph over nodes 56, 59, 81, 82, 108, 109, 110]

Observation to Intervention
[Figure: true causal structure over x, y, z]
[Figure: equivalence class of causal structures that we learned, over x, y, z]
[Figure: intervened causal structure over x, y, z]
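
The step from an equivalence class to an intervened structure can be sketched for three variables. The graphs used on the slides are not recoverable from the text, so the chain x → y → z below is a hypothetical stand-in. The sketch also assumes a hard intervention, one that cuts every edge into the intervened variable.

```python
from itertools import product

def skeleton(edges):
    return {frozenset(e) for e in edges}

def v_structures(edges):
    """Colliders a -> c <- b whose endpoints a and b are not adjacent."""
    skel = skeleton(edges)
    found = set()
    for a, c in edges:
        for b, c2 in edges:
            if c == c2 and a != b and frozenset((a, b)) not in skel:
                found.add((frozenset((a, b)), c))
    return found

def is_acyclic(nodes, edges):
    """Kahn's algorithm: a DAG can be peeled off layer by layer from its sources."""
    remaining, edges = set(nodes), set(edges)
    while remaining:
        sources = {v for v in remaining if not any(t == v for _, t in edges)}
        if not sources:
            return False
        remaining -= sources
        edges = {(s, t) for s, t in edges if s not in sources}
    return True

def markov_equivalence_class(nodes, edges):
    """All DAGs with the same skeleton and v-structures as `edges`."""
    pairs = [tuple(sorted(p)) for p in skeleton(edges)]
    target = v_structures(edges)
    members = []
    for flips in product([False, True], repeat=len(pairs)):
        dag = {(b, a) if flip else (a, b) for (a, b), flip in zip(pairs, flips)}
        if is_acyclic(nodes, dag) and v_structures(dag) == target:
            members.append(dag)
    return members

def intervene(edges, target):
    """Hard intervention: remove every edge pointing into `target`."""
    return {(s, t) for s, t in edges if t != target}

nodes = ["x", "y", "z"]
chain = {("x", "y"), ("y", "z")}      # hypothetical true structure x -> y -> z
collider = {("x", "y"), ("z", "y")}   # x -> y <- z

print(len(markov_equivalence_class(nodes, chain)))     # 3
print(len(markov_equivalence_class(nodes, collider)))  # 1
print(intervene(chain, "y"))                           # {('y', 'z')}
```

Observational data alone cannot separate the three members of the chain's class. After an intervention on y, each member predicts a different set of downstream variables, which is what makes the stimulation experiments informative.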

electrical stimulation fMRI
[Figure: electrical stimulation setup and timing. Labels: waveform generation computer; stimulus isolator; EPI scan; stimulation period (30 s); 3.0 s; 100 ms; a, b = 0.25 ms; c = 0.75 ms; (-2.4, -8.2, Subj. 292); L / R]

Effect of stimulating the right amygdala
[Figure: simple contrast (GLM) and causal discovery]

[Figure: four panels; run 1, mean of 10 stimulus blocks stacked; x-axis: timesteps before and after onset of stimulus (-10 to 5); y-axis: activation; annotations: stimulation, peak activation, depletion]
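
A stacked mean of stimulus blocks, as in the panel captions, can be sketched as an onset-locked average. The window of 10 timesteps before and 5 after onset is read off the axis ticks; the function name, the toy signal, and the onset indices are hypothetical.

```python
def stacked_mean(signal, onsets, before=10, after=5):
    """Average the signal in a window around each stimulus onset.

    Returns one value per relative timestep from -before to +after.
    Onsets whose window would run off either end of the signal are skipped.
    """
    epochs = [
        signal[t - before:t + after + 1]
        for t in onsets
        if t - before >= 0 and t + after < len(signal)
    ]
    if not epochs:
        return []
    return [sum(column) / len(epochs) for column in zip(*epochs)]

# Toy signal: flat at 0, with a response of 4 one timestep after each usable onset.
signal = [0.0] * 60
onsets = [15, 35, 58]   # hypothetical onset indices; 58 is too close to the end
for t in onsets[:2]:
    signal[t + 1] = 4.0

profile = stacked_mean(signal, onsets)
print(len(profile))   # 16
print(profile[11])    # 4.0, the value at relative timestep +1
```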

After adjusting for hemodynamic delay and depletion…
parcel | intervention On/Off | on first 5 | observational
(one subject | population)
L_AngularGyrus x
R_InsularCortex x x
R_TemporalPole x x x
R_SuperiorTemporalGyrusPosterior x
R_MiddleTemporalGyrusAnterior x
R_MiddleTemporalGyrusTemporoOccipital x
R_InferiorTemporalGyrusAnterior x
R_InferiorTemporalGyrusTemporoOccipital x
R_FrontalOrbitalCortex x
R_ParahippocampalGyrusAnterior x
R_ParahippocampalGyrusPosterior x
R_TemporalFusiformCortexPosterior x
R_PlanumPolare x
LeftCaudate x
LeftAmygdala x
RightHippocampus x x x

Population Prior to Improve Learning for Single Subject
Bayesian Information Criterion:
$\mathrm{BIC}(H) = -2\log(\mathcal{L}) + \lambda \log(n)$

Prior probability on the presence of an edge: $\phi_{ij}$

Effect of the prior probability when adding an edge i-j:
$\mathrm{BIC}(H_{ij}) - \mathrm{BIC}(H) - \alpha \log\left(\frac{\phi_{ij}}{1 - \phi_{ij}}\right)$
where $\lambda$ is the sparsity and $\alpha$ is the prior weight.
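
The effect of the prior on a single edge decision can be sketched as follows. The function name and the numbers are hypothetical. The only things taken from the slide are the form of the expression and the sign convention of the BIC, under which a negative value favours adding the edge.

```python
import math

def prior_adjusted_delta(bic_with_edge, bic_without_edge, phi, alpha):
    """Score change for adding edge i-j, shifted by the prior log-odds.

    phi: prior probability that the edge is present (0 < phi < 1).
    alpha: prior weight; alpha = 0 ignores the prior.
    With BIC = -2 log(L) + penalty, a negative result favours adding the edge.
    """
    if not 0.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between 0 and 1")
    return bic_with_edge - bic_without_edge - alpha * math.log(phi / (1.0 - phi))

# Hypothetical numbers: the data alone slightly disfavour the edge (+1.5).
raw = prior_adjusted_delta(101.5, 100.0, phi=0.9, alpha=0.0)
supported = prior_adjusted_delta(101.5, 100.0, phi=0.9, alpha=1.0)
discouraged = prior_adjusted_delta(101.5, 100.0, phi=0.1, alpha=1.0)
neutral = prior_adjusted_delta(101.5, 100.0, phi=0.5, alpha=1.0)

print(raw)            # 1.5
print(supported < 0)  # True: a strong population prior tips the decision
```

A prior of 0.5 has zero log-odds and leaves the decision to the data, so only edges on which the population evidence is informative are pushed in either direction.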

Evaluation without ground truth
• single-subject connectivity = high-confidence edges inferred on MyConnectome data (84 scanning sessions of one individual)
- distinct from HCP data; different scanner
• human-generic edges = high-confidence edges learned from Human Connectome population data
• subject-specific edges = high-confidence edges from single subject that are not human-generic edges
‣ single-subject connectivity = 60% human-generic, 40% subject-specific
• test data: two sessions (~1,000 timepoints) of single-subject MyConnectome data
➡ How much does the prior help to learn the single-subject connectivity from the test data?

Recall for human-generic and subject-specific Connectivity
[Figure: recall for human-generic edges and for subject-specific edges, for different sparsities and prior weights]
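
The recall measure can be sketched with plain edge sets. The edges below are hypothetical parcel pairs, and treating edges as undirected is an assumption; the talk does not say whether orientation was scored.

```python
def recall(found_edges, reference_edges):
    """Fraction of reference edges that appear among the found edges."""
    if not reference_edges:
        return float("nan")
    return len(found_edges & reference_edges) / len(reference_edges)

def undirected(pairs):
    """Treat each edge as an unordered pair of parcel indices."""
    return {frozenset(p) for p in pairs}

# Hypothetical high-confidence edge sets over parcel indices.
single_subject = undirected([(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)])
population = undirected([(1, 2), (2, 3), (3, 4), (7, 8)])

human_generic = single_subject & population      # subject edges also found in the population
subject_specific = single_subject - population   # subject edges that are not human-generic

# Edges learned from the short test data, e.g. at one sparsity and prior weight.
learned = undirected([(1, 2), (2, 3), (4, 5), (6, 7)])

generic_recall = recall(learned, human_generic)
specific_recall = recall(learned, subject_specific)
print(generic_recall, specific_recall)
```

Repeating this over a grid of sparsities and prior weights gives one recall value per setting and per edge type. The two are scored separately because a population prior might help human-generic edges while suppressing subject-specific ones.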

video from Fetcho lab: http://pages.nbb.cornell.edu/neurobio/Fetcho/what-we-do-overview/; image from https://grants.nih.gov/sites/default/files/150312_zebrafish_slides.pdf
Zebrafish Larvae

GIF from video Ahrens M.B. et al. Nat. Methods 10, 413–420 (2013)
https://www.sfari.org/2018/09/11/sfari-workshop-discusses-zebrafish-as-experimental-systems-to-study-autism/#ref
Lightsheet Microscopy

Causal Discovery
Connections in the Brain of a Zebrafish Larva
~100,000 neurons
~250,000 direct connections

Known Connections from Anatomical Studies
[Figure: inferior olive and cerebellum]

Recovered reliably across different larvae
[Figure: inferior olive and cerebellum; fish 1 and fish 2]

image credit: Hildebrand et al, 2017; Janelia
The Aim: From Functional to Anatomical Connections
Anatomical Studies
Hypotheses
Causal Discovery

Open Questions
• What are appropriate coarse-grained parcellations of a brain?
- How can we discover them?
• How do we handle time delays?
- (sampling rate)
- slower vs. faster causal effects
• How to model the interventions?
- (soft vs. hard)
- time course of the intervention
- electric stimulation vs. task fMRI vs. magnetic stimulation
• What is the relation between the observational network and the stimulated network?
• [scalability of methods with weaker assumptions: fast non-parametric independence tests; non-Gaussian / non-linear methods etc.]

Collaborators
Caltech:
• Ralph Adolphs
• Julien Dubois
• Lynn Paul
• Mike Tyszka
• Krzysztof Chalupka
• Pietro Perona
• Yusuke Tomina
• Daniel Wagenaar
Undergraduates:
• RJ Antonello
• Lin Lin Lee
• Samuel Liebana
• John Moss
• Ethan Pronovost
• Amy Xiong
• Mark Xu
• Dan Xu
Janelia:
• Misha Ahrens
• Yu Mu
• Mikail Rubinov
• Greg Fleischman
UCLA / LA Biomed:
• Paul Mathews
• Neil Harris
• Katrina Choe
• Joshua Schoenfield
Funding:
• Chen Institute Director Award
• Caltech Carver Mead seed grant
• AWS credit award
• NSF 1564330
• NSF 1845958
• NIH U01 NS103780-01
Thank you!
