Introduction to Trentool

Speaker notes
  • Here's a short introduction to a tool I encountered during a workshop in Frankfurt. Trentool measures the direction of information transfer between two or more signal streams, and it has been used successfully in areas from meteorology to neuroscience to laser physics.
  • We start with the basic models and work our way up: we have traditional coherence, and partial coherence for multivariate data. Both of these models can be extended to measure causality, arriving at Granger causality and partial directed coherence.
  • Those approaches all rely on an explicit signal model. That's why I'm presenting a new method today, one based on a completely different mathematical foundation: entropy.
  • Let's say we have two signal streams, which possibly exchange some information with each other. The information content of each data stream can be represented by its entropy. One important aspect: the combined entropy of the two sources can be divided into two parts, conditional entropy and mutual information.
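The decomposition on slide 5 (after Schreiber 2000), written out; this is the deck's own equation, reproduced here as a reference:

```latex
% Combined information of two streams X and Y, split as on slide 5:
% two self-prediction (conditional entropy) terms plus the shared part.
H(X) + H(Y) = H(X_{t+1} \mid X_t) + H(Y_{t+1} \mid Y_t) + I(X;Y)
```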
  • In a given data stream, it's entirely possible that certain patterns repeat over time. We can test for repeating structure by taking the whole history of the data stream and using it to predict the next data point. The probability of new information appearing in this step is captured by the transition probability.
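These are the standard definitions behind slide 6:

```latex
% Transition probability: predicting the next sample from the stream's history.
p(x_{t+1} \mid x_t)
% Conditional entropy: the average surprise that remains once the history is known.
H(X_{t+1} \mid X_t) = -\sum_{x_t,\, x_{t+1}} p(x_t, x_{t+1}) \log p(x_{t+1} \mid x_t)
```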
  • The next concept is mutual information. First, when we combine the entropies of both data streams, we get the receivers' total diversity, H(X) + H(Y). Then we can calculate the conditional entropy, which you can picture as the equivocation of the receiver about its source. Subtracting the two gives the amount of information that the data streams share with each other without increasing each other's entropy. In general, mutual information can be thought of as a measure of agreement between information receivers, and it is the basis of what is popularly known as transfer entropy.
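A note on the algebra: slide 7 writes the last term as H(X|Y); with that conditional entropy the standard identity reads:

```latex
% Mutual information: the reduction of uncertainty about X from observing Y,
% equivalently the total entropy minus the joint entropy.
I(X;Y) = H(X) - H(X \mid Y) = H(X) + H(Y) - H(X,Y)
```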
  • When we combine conditional entropy and mutual information, we get conditional mutual information (or the apparent transfer entropy). You can picture it like this: how much are the transition probabilities changed by knowing the information history of the other data stream? If both sources 'do their own thing', the transition probabilities don't change at all, and transfer entropy is zero. But if there is some information transfer, this comparison develops a peak and shows exactly how much entropy has been transferred (formalized below). We can derive two additional measures from that, conditional transfer entropy and predictive information. I'll explain the practical importance of conditional transfer entropy on the next slide.
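The quantity described here is Schreiber's (2000) transfer entropy; a compact statement for the direction Y to X:

```latex
% Transfer entropy from Y to X: how much the history of Y changes the
% transition probabilities of X. It is zero when Y adds no information.
TE_{Y \to X} = \sum_{x_{t+1},\, x_t,\, y_t} p(x_{t+1}, x_t, y_t)\,
    \log \frac{p(x_{t+1} \mid x_t,\, y_t)}{p(x_{t+1} \mid x_t)}
```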
  • First, we have apparent transfer entropy. It has severe issues with logical connections: it can't discover causality in multivariate couplings, it overlooks intermediate sources, and it has a problem with common causes. Conditional transfer entropy, on the other hand, can deal with the latter two issues. However, multivariate interactions (green) can only be found when we attempt to interpret as many signals as possible; this is called the complete approach. For the purposes of cognitive science, conditional transfer entropy is usually sufficient. So let's take a look at the properties of transfer entropy when we apply it to our field.
  • Contrary to most other methods, transfer entropy doesn't assume anything about its data. If the source data is noise-free and linear, the underlying kernel function is low-dimensional and very fast to estimate. The method does underestimate the information flow in short data sequences, but this bias can be calculated and corrected. In practice, even data sources of as little as 100 samples can be examined for causality.
  • Experience with transfer entropy in neuroscience (mostly ECoG) shows that most causal relationships occur on a single-digit millisecond timescale. Good news for us: neurons are fairly weakly coupled, which means that larger amounts of entropy are transferred with each step. Complex networks like neuronal networks raise additional issues, but these are independent of transfer entropy, and I'll cover them in a bit.
  • We also have two problems, both of which are very important for applied research. First, a bias from input amplitude, which makes it hard to test for significance. Second, a very strong spurious causality result when two data streams contain some of the same information at the same time; this, of course, causes problems with volume conduction in EEG.
  • Coming back to our issue with complex systems, we encounter another, more basic problem: generalized synchronization. Neurons are fairly slow and quite complicated structures, so let's break it down to something more basic. A physicist, Ingo Fischer, built a setup of semiconductor lasers. Each laser is set up to modulate another laser, and they're all arranged in a ring. Semiconductor lasers react very fast (on the order of nanoseconds) and are a bit noisy, so they can form a chaotic feedforward coupling, just like neurons. When you insert a probe into this system, you can perform a causality analysis on the output signals. Here's the result for different numbers of lasers in the ring: two lasers are fairly closely correlated, and transfer entropy also provides a nice contrast. With eight lasers, both transfer entropy and correlation drop to 0.1, and in the asymptotic case of infinitely many lasers, both measures drop to 0. But even in an infinitely large ring, two neighboring elements are coupled just as strongly as in the small setup! It's just that neither correlation- nor entropy-based models can detect this coupling anymore.
  • Ok, enough with the theory, let's take a look at something practical: Trentool. Trentool is an implementation of transfer entropy, and most importantly, it addresses the two big problems: volume conduction and significance testing. It uses a normalized version of conditional transfer entropy, which removes the input-amplitude bias and enables significance testing. Volume conduction is handled by analyzing causality for each possible delay and removing the sub-millisecond results from the rest. That's why it has become the tool of choice for my own causal measurements. Now we'll take a look under the hood and see what it does with its data.
  • A nice surprise was that Trentool works with FieldTrip-style data: it accepts FieldTrip raw data, and its configuration is set through a single cfg variable.
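A minimal sketch of such an input. The struct fields (fsample, label, trial, time) are FieldTrip's standard raw data format; the channel names and the random data are placeholders, with the trial count and size taken from the example set on slide 22:

```matlab
% FieldTrip-style raw data struct, filled with placeholder noise.
data         = [];
data.fsample = 1000;                         % sampling rate in Hz (assumed)
data.label   = {'sourceX'; 'sourceY'};       % hypothetical channel names
for k = 1:35                                 % 35 trials, as in the example set
    data.trial{k} = randn(2, 3500);          % nChannels x nSamples per trial
    data.time{k}  = (0:3499) / data.fsample; % time axis in seconds
end
```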
  • The parameter optimization estimates whether the data is stochastic or deterministic, and uses one of two criteria to estimate the kernel embedding dimension and the probable search range for delays: the Cao criterion for deterministic data, and the Ragwitz criterion for stochastic data.
  • The data is shifted sample by sample and run through the transfer entropy calculation. The permutation test returns a few values: the p-value, the (uncorrected) significance, and the statistic value (mean or t-value).
  • The shift test detects volume conduction in multi-electrode settings; a sketch of the whole pipeline follows below.
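A hedged sketch of the corresponding calls. TEprepare and TEsurrogatestats are the TRENTOOL entry points as I recall them from the version 2 manual; the exact cfg field names and defaults are assumptions and should be checked against the toolbox documentation:

```matlab
% Single-dataset TRENTOOL run (field names assumed; verify before use).
cfg = [];
cfg.sgncmb         = {'sourceX' 'sourceY'};  % signal pair to test (X -> Y)
cfg.toi            = [0 3.5];                % time window of interest, in s
cfg.predicttime_u  = 50;                     % assumed prediction delay, in ms
cfg.optimizemethod = 'ragwitz';              % 'ragwitz' (stochastic) or 'cao' (deterministic)
dataprep = TEprepare(cfg, data);             % sanitation, validation, embedding optimization

cfg = [];
cfg.surrogatetype = 'trialshuffling';        % surrogate data for the permutation test
cfg.shifttest     = 'yes';                   % shift test to flag volume conduction
cfg.alpha         = 0.05;                    % significance level
cfg.fileidout     = 'te_result';             % prefix for the saved result files
TEsurrogatestats(cfg, dataprep);             % p-value, significance, statistic value
```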
  • A second permutation test compares the results between conditions.
  • Additional final results: the significance for each signal delay (corrected for multiple comparisons) and an estimate of volume conduction.
  • The use case for multiple datasets is only slightly different, and the permutation test adapts the corresponding significance values automatically.
  • This is how it looks in practice! This is noisy data, so the estimation of the kernel dimension always goes to the maximum. With that dimensionality and a search range of 50 ms, the causality tests took about 30 minutes on a quad-core workstation.
  • Results for information transfer from X to Y: significant information transfer up to 16 ms, with a possible peak at 14 ms.
  • Same datasets, opposite direction: no significant effect.
  • Ok, that's it for today. Thank you!
Transcript

    1. Introduction to Trentool: the transfer entropy toolbox. Dominic Portain, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany. 16.08.2012
    2. The Basics: functional vs. effective connectivity. Coherency (Bendat and Piersol, 1986) and Granger-Geweke causality (Geweke, 1982; Bressler et al., 2007); partial coherence (Bendat and Piersol, 1986; Dalhaus, 2000) and partial directed coherence (Baccalá and Sameshima, 2001). [Slide background: coherence/DTF simulation figures from IEEE Transactions on Biomedical Engineering, vol. 51, no. 9, September 2004.]
    3. The Basics, continued: the same methods arranged along a bivariate/multivariate axis. Bivariate: coherency, Granger-Geweke causality; multivariate: partial coherence, partial directed coherence. [Same background figures as slide 2.]
    4. Causality methods (causal modeling):
                      linear data                                               nonlinear data
       bivariate      Granger causality                                         extended Granger causality mapping, bilinear DCM
       multivariate   partial directed coherence, directed transfer function    transfer entropy
    5. Entropy. Two signal streams X and Y, with entropies H(X) and H(Y). H(X) + H(Y) = H(Xt+1|Xt) + H(Yt+1|Yt) + I(X,Y): two conditional entropy terms plus the mutual information (Schreiber 2000).
    6. Conditional entropy: H(Xt+1|Xt), using Xt to predict Xt+1. Transition probability: p(Xt+1|Xt).
    7. Mutual information. Entropies: H(X), H(Y); conditional entropy: H(X|Y). I(X,Y) = H(X) - H(X|Y) = H(X) + H(Y) - H(X,Y) ("transfer entropy").
    8. Conditional transfer entropy. Conditional entropy H(Xt+1|Xt): the transition probability. Mutual information I(X,Y): "apparent transfer entropy". Conditional mutual information I(X,Yt+1|Yt): "conditional transfer entropy". Predictive information H(Xt+1) - H(Xt+1|Xt): total uncertainty about the future minus the uncertainty about the future given the past.
    9. Types of transfer entropy. Apparent transfer entropy misses multivariate effects: it doesn't capture multivariate interactions (e.g., XOR) and doesn't distinguish redundant information from common causes. Conditional TE conditions on other possible information sources: it eliminates redundancy and respects causal pathways. Complete transfer entropy involves all source information: it captures collective interactions.
    10. Properties of transfer entropy. Advantages: model-free, robust to noise; inherently non-linear, but fast with linear data; weaker coupling -> better results; copes well with multivariate effects. [Figures: detrended Sunspot-Melanoma series 1936-1972, standardized Melanoma and Sunspot series, and their normalized cross-correlation function.]
    11. Properties of transfer entropy, application in neuroscience: causal interactions occur at a fine temporal scale (<10 ms); weaker coupling -> better causal results; issues with complex networks; noise influence: good detection rate for SNR above 15 dB, breaking down to 50% at 10 dB.
    12. Properties of transfer entropy, problems: (predictable) estimation bias for finite data sequences; difficult to test for significance; vulnerable to volume conduction.
    13. Generalized synchronization. Paradigm: delayed feedback (stable and predictable sync). Model: delay-coupled lasers: increasingly complex behavior; identical synchronization is always unstable; the response is shifted by the coupling time (a few nanoseconds); cross-correlation shows strong peaks at the coupling time.
    14. Trentool. Properties: robust transfer entropy estimation, detection of volume conduction, open source. Requirements: built on Matlab, FieldTrip, TSTOOL.
    15. Workflow, preparation (single dataset): input in FieldTrip raw format.
    16. Workflow (continued): input in FieldTrip raw format; input sanitation and validation; parameter optimization (Cao, Ragwitz).
    17. Workflow (continued): adds the transfer entropy calculation with sample shift and the permutation test.
    18. Workflow (continued): adds the shift test.
    19. Workflow (continued): adds the permutation test between conditions, yielding the results.
    20. Workflow (continued): the whole pipeline loops over conditions.
    21. Workflow, multiple datasets: the same pipeline with a permutation test between datasets, looping over datasets.
    22. Data analysis with Trentool. Example set: 35 trials of 3500x2 samples; quadrilinear relationship, delay of 15 ms.
    23. Data analysis with Trentool: significance and delay estimation. [Results figure, X -> Y.]
    24. Data analysis with Trentool: significance and delay estimation. [Results figure, opposite direction.]
    25. Data analysis with Trentool. [Results figure.]
    26. Thanks. Questions?
