Transcript of "Cosmo-not: a brief look at methods of analysis in functional MRI and in diffusion tensor imaging (DTI)"
1.
Cosmo-not: a brief look at methods of analysis in functional MRI andin diffusion tensor imaging (DTI) Paul Taylor AIMS, UMDNJ Cosmology seminar, Nov. 2012
2.
Outline• FMRI and DTI described (briefly)• Granger Causality• PCA• ICA – Individual, group, covariance networks• Jackknifing/bootstrapping
3.
The brain in brief (large scales)has many parts- complex blood vessels neurons aqueous tissue (GM, WM, CSF)activity (examples): hydrodynamics electrical impulses chemical
4.
The brain in brief (large scales)has many parts- complex blood vessels neurons aqueous tissue (GM, WM, CSF)activity (examples): hydrodynamics electrical impulses chemicalhow do different parts/areas work together?A) observe various parts acting together in unison during some activities (functional relation -> fMRI)B) follow structural connections, esp. due to WM tracts, which affect random motion in fluid/aqueous tissue (-> DTI, DSI, et al.)
5.
Functional (GM)Example:Resting statenetworks Biswal et al. (2010, PNAS) GM ROIs in networks: spatially distinct regions working in concert
6.
Basic fMRI• General topic of functional MRI: – Segment the brain into ‘functional networks’ for various tasks – Motor, auditory, vision, memory, executive control, etc. – Quantify, track changes, compare populations (HC vs disorder)
7.
Basic fMRI• General topic of functional MRI: – Segment the brain into ‘functional networks’ for various tasks – Motor, auditory, vision, memory, executive control, etc. – Quantify, track changes, compare populations (HC vs disorder)• Try to study which regions have ‘active’ neurons – Modalities for measuring metabolism directly include PET scan
8.
Basic fMRI• General topic of functional MRI: – Segment the brain into ‘functional networks’ for various tasks – Motor, auditory, vision, memory, executive control, etc. – Quantify, track changes, compare populations (HC vs disorder)• Try to study which regions have ‘active’ neurons – Modalities for measuring metabolism directly include PET scan• With fMRI, use an indirect measure of blood oxygenation
9.
MRI vs. fMRI MRI fMRI one image …fMRI Blood Oxygenation Level Dependent (BOLD) signal indirect measure of neural activity ↑ neural activity ↑ blood oxygen ↑ fMRI signal
10.
BOLD signal Blood Oxygen Level Dependent signalneural activity ↑ blood flow ↑ oxyhemoglobin ↑ T2* ↑ MR signal Mxy Signal Mo sinθ T2* task T2* control Stask Scontrol ΔS TEoptimum time Source: fMRIB Brief Introduction to fMRI Source: Jorge Jovicich
11.
Basic fMRI Step 1: Person is told to perform a task, maybe tapping fingers, in a time-varying pattern:ONOFF 30s 30s 30s 30s 30s 30s
12.
Basic fMRIStep 2: we measurea signal from eachbrain voxel over timesignal: basically,local increase inoxygenation: idea thatneurons which areactive are hungrier, anddemand an increase infood (oxygen) (example slice of time series)
14.
First Functional ImagesSource: Kwong et al., 1992
15.
Basic fMRIStep 4: map out regions of significant correlation (yellow/red)and anti-correlation (blue), which we take to be someinvolved in specific task given (to some degree); these areasare then taken to be ‘functionally’ related networks
16.
Basic fMRI• Have several types of tasks: – Again: motor, auditory, vision, memory, executive control, etc. – Could investigate network by network…
17.
Basic fMRI• Have several types of tasks: – Again: motor, auditory, vision, memory, executive control, etc. – Could investigate network by network…• Or, has been noticed that correlations among network ROIs exist even during rest – Subset of functional MRI called resting state fMRI (rs-fMRI) – First noticed by Biswal et al. (1995) – Main rs-fMRI signals exist in 0.01-0.1 Hz range – Offer way to study several networks at once
18.
Basic rs-fMRIe.g., Functional Connectome Project resting state networks (Biswal et al., 2010):
19.
Granger Causality• Issue to address: want to find relations between time series- does one affect another directly? Using time-lagged relations, can try to infer ‘causality’ (Granger 1969) (NB: careful in what one means by causal here…).
20.
Granger Causality• Issue to address: want to find relations between time series- does one affect another directly? Using time-lagged relations, can try to infer ‘causality’ (Granger 1969) (NB: careful in what one means by causal here…).• Modelling a measured time series x(t) as potentially autoregressive (first sum) and with time-lagged contributions of other time series y(i) – u(t) are errors/‘noise’ features, and c1 is baseline
21.
Granger Causality• Calculation: – Residual from: – Is compared with that of: – And put into an F-test: – (T= number of time points, p the lag) – Model order determined with Akaike Info. Criterion or Bayeian Info. Criterion (AIC and BIC, respectively)
22.
Granger Causality• Results, for example, in directed graphs: (Rypma et al. 2006)
23.
PCA• Principal Component Analysis (PCA): can treat FMRI dataset (3spatial+1time dimensions) as a 2D matrix (voxels x time). – Then, want to decompose it into spatial maps (~functional networks) with associated time series – goal of finding components which explain max/most of variance of dataset – Essentially, ‘eigen’-problem, use SVD to find eigenmodes, with associated vectors determining relative variance explained
24.
PCA• To calculate from (centred) dataset M with N columns: – Make correlation matrix: • C = M MT /(N-1) – Calculate eigenvectors Ei and -values λi from C, and the principal component is: • PCi = Ei [λI]1/2
25.
PCA• To calculate from (centred) dataset M with N columns: – Make correlation matrix: • C = M MT /(N-1) – Calculate eigenvectors Ei and -values λi from C, and the principal component is: • PCi = Ei [λI]1/2• For FMRI, this can yield spatial/temporal decomposition of dataset, with eigenvectors showing principal spatial maps (and associated time series), and the relative contribution of each component to total variance
26.
PCA• Graphic example: finding directions of maximum variance for 2 sources (example from web)
27.
PCA• (go to PCA reconstruction example in action from http://www.fil.ion.ucl.ac.uk/~wpenny/mbi/)
28.
ICA• Independent component analysis (ICA) (McKeown et al. 1998; Calhoun et al. 2002) is a method for decomposing a ‘mixed’ MRI signal into separate (statistically) independent components.(NB: ICA~ known ‘blindsource separation’ or‘cocktail party’ problems) (McKeown et al. 1998)
29.
ICA• ICA in brief (excellent discussion, see Hyvarinen & Oja 2000): – ICA basically is undoing Central Limit Theorem • CLT: sum of independent variables with randomness -> Gaussianity • Therefore, to decompose the mixture, find components with maximal non-Gaussianity – Several methods exist, essentially based on which function is powering the decomposition (i.e., by what quantity is non-Gaussianity measured): kurtosis, negentropy, pseudo-negentropy, mutual information, max. likelihood/infomax (latter used by McKeown et al. 1998 in fMRI)
30.
ICA• ICA in brief (excellent discussion, see Hyvarinen & Oja 2000): – ICA basically is undoing Central Limit Theorem • CLT: sum of independent variables with randomness -> Gaussianity • Therefore, to decompose the mixture, find components with maximal non-Gaussianity – Several methods exist, essentially based on which function is powering the decomposition (i.e., by what quantity is non-Gaussianity measured): kurtosis, negentropy, pseudo-negentropy, mutual information, max. likelihood/infomax (latter used by McKeown et al. 1998 in fMRI)• NB: can’t determine ‘energy’/variances or order of ICs, due to ambiguity of matrix decomp (too much freedom to rescale columns or permute matrix). – i.e.: relative importance/magnitude of components is not known.
31.
ICA• Simple/standard representation of matrix decomposition for ICA of individual dataset: voxels -> # ICs voxels -> # ICs time -> time -> = x Spatial map (IC) of ith component Time series of ith component Have to choose number of ICs--often based on ‘knowledge’ of system, or preliminary PCA-variance explained
32.
ICA• Can do group ICA, with assumptions of some similarity across a group to yield ‘group level’ spatial map – Very similar to individual spatial ICA, based on concatenating sets along time
33.
ICA• Can do group ICA, with assumptions of some similarity across a group to yield ‘group level’ spatial map – Very similar to individual spatial ICA, based on concatenating sets along time voxels -> # ICs voxels -> # ICs Subjects and time ->Subjects and time -> Subject 1 = x Group spatial map Subject 2 Time series of ith component, S1 (IC) of ith component Subject 3 Time series of ith component, S2 Time series of ith component, S3
34.
ICA• Group ICA example (visual paradigm) (Calhoun et al. 2009)
35.
ICA• GLM decomp (~correlation to modelled/known time course) vs ICA decomp (unknown components-- ‘data driven’, assumptions of indep. sources) (images:Calhoun et al. 2009)
36.
ICA• GLM decomp (~correlation to • PCA decomp (ortho. modelled/known time course) directions of max vs variance; 2nd order) ICA decomp (unknown vs components-- ‘data driven’, ICA decomp (directions assumptions of indep. sources) of max independence; higher order) (images:Calhoun et al. 2009)
37.
Dual Regression• ICA is useful for finding an individual’s (independent) spatial/temporal maps; also for the ICs which are represented across a group. – Dual regression (Beckmann et al. 2009) is a method for taking that group IC and finding its associated, subject-specific IC.
38.
Dual Regression • ICA is useful for finding an individual’s (independent) spatial/temporal maps; also for the ICs which are represented across a group. – Dual regression (Beckmann et al. 2009) is a method for taking that group IC and finding its associated, subject-specific IC. Steps:• 1) ICA decomposition: voxels # ICs voxels time # ICs time – >‘group’ time courses and ‘group’ spatial x time maps, independent time components (ICs) (graphics from ~Beckmann et al. 2009)
39.
Dual Regression• 2) Use group ICs as time # ICs time regressors per # ICs voxels individual in GLM x – > Time series associated with that spatial map (graphics from ~Beckmann et al. 2009)
40.
Dual Regression• 2) Use group ICs as time # ICs time regressors per # ICs voxels individual in GLM x – > Time series associated with that spatial map• 3) GLM regression with voxels # ICs voxels time courses per # ICs time time individual = x – > find each subject’s spatial map of that IC (graphics from ~Beckmann et al. 2009)
41.
Covariance networks (in brief)• Group level analysis tool• Take a single property across whole brain – That property has different values across brain (per subject) and across subjects (per voxel)• Find voxels/regions (->network) in which that property changes similarly (-> covariance) as one goes from subject to subject (-> subject series)
42.
ICA for BOLD series and FCNsStandard BOLD Subject series analysis analysis
43.
Covariance networks (in brief)• Group level analysis tool• Take a single property across whole brain – That property has different values across brain (per subject) and across subjects (per voxel)• Find voxels/regions (->network) in which that property changes similarly (-> covariance) as one goes from subject to subject (-> subject series)• Networks reflect shared information or single influence at basic/organizational level (discussed further, below).
44.
Covariance networks (in brief)• Can use with many different parameters, e.g.: – Mechelli et al. (2005): GMV – He et al. (2007): cortical thickness – Xu et al. (2009): GMV – Zielinski et al. (2010): GMV – Bergfield et al. (2010): GMV – Zhang et al. (2011): ALFF – Taylor et al. (2012): ALFF, fALFF, H, rs-fMRI mean and std, GMV – Di et al. (2012): FDG-PET
45.
Analysis: making subject series• A) Start with group of M subjects (for example, fMRI dataset)A 21 3 + 4 5
46.
Analysis: making subject series• A) Start with group of M subjects (for example, fMRI dataset)• B) Calculate a voxelwise parameter, P, producing 3D dataset per subjectA B 21 3 Pi + 4 5
47.
Analysis: making subject series• A) Start with group of M subjects (for example, fMRI dataset)• B) Calculate a voxelwise parameter, P, producing 3D dataset per subject• C) Concatenate the 3D datasets of whole group (in MNI) to form a 4D ‘subject series’ – Analogous to standard ‘time series’, but now each voxel has M values of P – Instead of i-th ‘time point’, now have i-th subjectA B C 21 3 n=1 Pi 2 + 3 4 4 5 5
48.
Analysis: making subject series• A) Start with group of M subjects (for example, fMRI dataset)• B) Calculate a voxelwise parameter, P, producing 3D dataset per subject• C) Concatenate the 3D datasets of whole group (in MNI) to form a 4D ‘subject series’ – Analogous to standard ‘time series’, but now each voxel has M values of P – Instead of i-th ‘time point’, now have i-th subject• NB: for all analyses, order of subjects is arbitrary and has no effectA B C 21 3 n=1 Pi 2 + 3 4 4 5 5
49.
Analysis: making subject series• A) Start with group of M subjects (for example, fMRI dataset)• B) Calculate a voxelwise parameter, P, producing 3D dataset per subject• C) Concatenate the 3D datasets of whole group (in MNI) to form a 4D ‘subject series’ – Analogous to standard ‘time series’, but now each voxel has M values of P – Instead of i-th ‘time point’, now have i-th subject• NB: for all analyses, order of subjects is arbitrary and has no effect• Can perform usual ‘time series’ analyses (correlation, ICA, etc.) on subject seriesA B C 21 3 n=1 Pi 2 + 3 4 4 5 5
50.
Interpreting subject series covarianceEx.: Consider 3 ROIs (X, Y and Z) in subjects with GMV dataSay, values of ROIs X and Y correlate strongly, but neither with Z. X1 Y1 X2 Y2 X3 Y3 X4 Y4 X5 Y5 Z1 Z2 Z3 Z4 Z5
51.
Interpreting subject series covarianceEx.: Consider 3 ROIs (X, Y and Z) in subjects with GMV dataSay, values of ROIs X and Y correlate strongly, but neither with Z. X1 Y1 X2 Y2 X3 Y3 X4 Y4 X5 Y5 Z1 Z2 Z3 Z4 Z5 --> X and Y form ‘GMV covariance network’
52.
Interpreting subject series covarianceEx.: Consider 3 ROIs (X, Y and Z) in subjects with GMV dataSay, values of ROIs X and Y correlate strongly, but neither with Z. X1 Y1 X2 Y2 X3 Y3 X4 Y4 X5 Y5 Z1 Z2 Z3 Z4 Z5Then, knowing the X-values and one Y-value (since X and Y canhave different bases/scales) can lead us to informed guesses aboutthe remaining Y-values, but nothing can be said about Z-values.
53.
Interpreting subject series covarianceEx.: Consider 3 ROIs (X, Y and Z) in subjects with GMV dataSay, values of ROIs X and Y correlate strongly, but neither with Z. X1 Y1 X2 Y2 X3 Y3 X4 Y4 X5 Y5 Z1 Z2 Z3 Z4 Z5 Then, knowing the X-values and one Y-value (since X and Y can have different bases/scales) can lead us to informed guesses about the remaining Y-values, but nothing can be said about Z-values.-> ROIs X and Y have information about each other even acrossdifferent subjects, while having little/none about Z.-> X and Y must have some mutual/common influence, which Z maynot.
54.
Interpreting covariance networks• Analyzing: similarity of brain structure across subjects.
55.
Interpreting covariance networks• Analyzing: similarity of brain structure across subjects.• Null hypothesis: local brain structure due to local control, (mainly) independent of other regions. – -> would observe little/no correlation of ‘subject series’ non-locally
56.
Interpreting covariance networks• Analyzing: similarity of brain structure across subjects.• Null hypothesis: local brain structure due to local control, (mainly) independent of other regions. – -> would observe little/no correlation of ‘subject series’ non-locally• Alt. Hypothesis: can have (1 or many) extended/multi-region influences controlling localities as general feature – -> can observe consistent patterns of properties as correlation of subject series ‘non-locally’
57.
Interpreting covariance networks• Analyzing: similarity of brain structure across subjects.• Null hypothesis: local brain structure due to local control, (mainly) independent of other regions. – -> would observe little/no correlation of ‘subject series’ non-locally• Alt. Hypothesis: can have (1 or many) extended/multi-region influences controlling localities as general feature – -> can observe consistent patterns of properties as correlation of subject series ‘non-locally’ – -> observed network and property are closely related
58.
Interpreting covariance networks• Analyzing: similarity of brain structure across subjects.• Null hypothesis: local brain structure due to local control, (mainly) independent of other regions. – -> would observe little/no correlation of ‘subject series’ non-locally• Alt. Hypothesis: can have (1 or many) extended/multi-region influences controlling localities as general feature – -> can observe consistent patterns of properties as correlation of subject series ‘non-locally’ – -> observed network and property are closely related – -> one network would have one organizing influence across itself – [-> perhaps independent networks with separate influences might have low/no correlation; related networks perhaps have some correlation].
59.
Switching gears…• Statistical resampling: methods for estimating confidence intervals for estimates• Several kinds, two common ones in fMRI are jackknifing and bootstrapping (see, e.g. Efron et al. 1982).• Can use with fMRI, and also with DTI (~for noisy ellipsoid estimates-- confidence in fit parameters)
60.
Jackknifing• Basically, take M acquisitions e.g., M=12
61.
Jackknifing• Basically, take M acquisitions e.g., M=12• Randomly select MJ < M to use MJ=9 to calculate quantity of interest – standard nonlinear fits(ellipsoid is definedby 6 parameters of [D11 D22 D33 D12 D13 D23] = ....quadratic surface)
62.
Jackknifing• Basically, take M acquisitions e.g., M=12• Randomly select MJ < M to use MJ=9 to calculate quantity of interest – standard nonlinear fits• Repeatedly subsample large number (~103-104 times) [D11 D22 D33 D12 D13 D23] = .... [D11 D22 D33 D12 D13 D23] = .... [D11 D22 D33 D12 D13 D23] = .... ....
63.
Jackknifing• Basically, take M acquisitions e.g., M=12• Randomly select MJ < M to use MJ=9 to calculate quantity of interest – standard nonlinear fits• Repeatedly subsample large number (~103-104 times)• Analyze distribution of values for estimator (mean) and [D11 D22 D33 D12 D13 D23] = .... confidence interval [D11 D22 D33 D12 D13 D23] = .... – sort/%iles • (not so efficient) [D11 D22 D33 D12 D13 D23] = .... – if Gaussian, e.g. µ±2σ .... • simple
65.
Jackknifing - not too bad with smaller M, even - but could use min/max from distributions for %iles (don’t need to sort)M=12 gradients
66.
Bootstrapping • Similar principal to jackknifing,but need multiple copies of dataset.A B e.g., M=12 e.g., M=12C D e.g., M=12 e.g., M=12
67.
Bootstrapping • Make an estimate from 12 measures, but randomly selected from each set:A B e.g., M=12 e.g., M=12C D e.g., M=12 e.g., M=12
68.
Bootstrapping • Then select another random (complete) set, build a distribution, etc.A B e.g., M=12 e.g., M=12C D e.g., M=12 e.g., M=12
69.
Summary• There are a wide array of methods applicable to MRI analysis – Many of them involve statistics and are therefore always believable at face value. – The applicability of the assumptions of the underlying mathematics to the real situation is always key. – Often, in MRI, we are concerned with a ‘network’ view of regions working together to do certain tasks. • Therefore, we are interested in grouping regions together per task (as with PCA/ICA) – New approaches start now to look at temporal variance of networks (using, e.g., sliding window or wavelet decompositions). – Methods of preprocessing (noise filtering, motion correction, MRI-field imperfections) should also be considered as part of the methodology.
A particular slide catching your eye?
Clipping is a handy way to collect important slides you want to go back to later.
Be the first to comment