Besides accurate prediction or description of the data, one goal in multivariate machine learning is to gain insight into the problem domain by analyzing which aspects of the data are most relevant for the model to achieve its performance. Such analyses, however, can invite misinterpretation: data features that are relevant for, say, a classification task are not necessarily informative about any of the classes themselves.
Haufe’s talk points out this problem and its implications in the context of linear “decoding” applications in neuroimaging. It demonstrates that the effect is caused by correlated additive noise, and proposes a simple transformation of linear methods into a representation from which the class-specific features of actual interest can be read off directly.
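The following is a minimal sketch of the idea (not code from the talk): a linear decoding filter `w` may put a large weight on a pure-noise feature, because subtracting that feature cancels noise shared with the informative one. Multiplying the filter by the data covariance, `a = cov(X) @ w`, yields a forward-model pattern in which the noise-only feature correctly receives a near-zero weight. All variable names and the simulated data here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: feature 1 = signal + noise, feature 2 = the same noise alone.
# Only feature 1 carries class-relevant signal.
n = 1000
signal = rng.standard_normal(n)
noise = rng.standard_normal(n)
X = np.column_stack([signal + noise, noise])
y = signal                                   # target driven by the signal only

# Linear "decoding" (backward) model: least-squares filter w with y ~ X w.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# The filter puts a large negative weight on the noise-only feature
# (it cancels the correlated noise) -- easy to misread as "informative".
# Transforming into a forward-model pattern resolves the ambiguity:
a = np.cov(X, rowvar=False) @ w

print("filter  w:", w)   # roughly [ 1, -1]: big weight on the noise feature
print("pattern a:", a)   # roughly [ 1,  0]: noise feature correctly near zero
```

The transformed pattern `a`, unlike the filter `w`, can be interpreted feature by feature: entries near zero indicate features that carry no class information, even if the decoder relies on them for noise cancellation.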
Another domain in which correlated noise can lead to misinterpretation is the analysis of interactions between time series. Again using an example from neuroimaging, the presentation shows how linear mixing of noise or signal components in the data can lead to spurious detection of interactions by some of the most established interaction measures. To address this problem, a number of “robust” interaction measures will be introduced and their advantages demonstrated on simulated data.
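One well-known example of such a robust measure (assumed here for illustration; the talk may cover others) is the imaginary part of coherency. Instantaneous linear mixing, such as volume conduction, produces only zero-phase-lag correlations, which inflate ordinary coherence but leave the imaginary part of coherency near zero. A hypothetical simulation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two independent white-noise sources; the sensors record an instantaneous
# linear mixture of them (a toy model of volume conduction).
n_seg, seg_len = 64, 256
s = rng.standard_normal((2, n_seg * seg_len))
A = np.array([[1.0, 0.5],
              [0.4, 1.0]])                     # mixing matrix
x = A @ s                                       # sensor signals

# Welch-style averaged cross-spectra over non-overlapping segments.
segs = x.reshape(2, n_seg, seg_len)
F = np.fft.rfft(segs, axis=2)
Sxy = np.mean(F[0] * np.conj(F[1]), axis=0)
Sxx = np.mean(np.abs(F[0]) ** 2, axis=0)
Syy = np.mean(np.abs(F[1]) ** 2, axis=0)

coh = Sxy / np.sqrt(Sxx * Syy)                  # complex coherency

# Mixing alone produces high ordinary coherence (a spurious "interaction"),
# but only at zero phase lag, so Im(coherency) stays near zero.
print("mean |coherency|:    ", np.mean(np.abs(coh)))
print("mean |Im(coherency)|:", np.mean(np.abs(coh.imag)))
```

Despite the sources being independent, the magnitude of coherency is far from zero purely because of the mixing, while its imaginary part correctly stays small, which is exactly the kind of robustness against mixing artifacts the presentation is concerned with.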