Final Report - Google Summer of Code 2019 with MNE-Python
During the last couple of months, I've been working on the GSoC project for enhancing statistical inference using linear regression in MNE-Python.
- Please refer to the project's GitHub repository, which contains the code developed during the GSoC period (see also here and here for completeness).
- A detailed list of contributions can also be found here.
- In addition, I've put up a website that contains the major achievements of the GSoC project.
Enhance statistical inference using linear regression
As this year's GSoC period comes to an end, I would like to describe some of the major achievements of this project in more detail. Hopefully this will trigger some discussion concerning remaining issues, considerations, and possible strategies for future work.
The following is a 2-3 min read.
The primary goal of the GSoC project was to broaden MNE's capabilities for fitting linear regression models, with a particular focus on statistical inference measures and support for more complex statistical models that might be of common interest to the MNE community.
Summary of major achievements:
We thought the best way to address these issues would be to set up a "gallery of examples", which allows users to browse through common research questions, providing auxiliary code for setting up and fitting linear models, as well as inspecting and visualizing results with tools currently available in NumPy, SciPy, and MNE.
For this purpose we have put up a sandbox repository, which contains code to replicate and extend some of the main analyses and tools integrated in LIMO MEEG, a MATLAB toolbox originally designed to interface with EEGLAB. The corresponding website contains examples for typical single-subject and group-level analysis pipelines.
In the following I provide a quick overview of such an analysis pipeline and the corresponding features developed during GSoC.
During the project, we've adopted a multi-level (or hierarchical) modeling approach, allowing the combination of predictors at different levels of the experimental design (trials, subjects, etc.) and testing effects in a mass-univariate analysis fashion, i.e., not only focusing on average data for a few sensors, but rather taking the full data space into account (all electrodes/sensors and at all time points of an analysis time window; see here).
Of particular importance, the analysis pipelines allow users to deal with within-subjects variance (i.e., 1st-level analysis), as well as between-subjects variance (i.e., 2nd-level analysis), by modeling the co-variation of subject-level parameter estimates and inter-subject variability in some possible moderator variable (see here).
This hierarchical approach consists of estimating linear model parameters for each subject in a data set (this is done independently at each time point and sensor). At the second level, the beta coefficients obtained from each subject are combined across subjects to test for statistical significance.
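To make the two-level idea concrete, here is a minimal sketch on simulated data (all shapes, predictors, and values are made up for illustration): a first-level regression is fit per subject at every sensor/time point at once, and the resulting betas are then tested across subjects.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

# Simulated data: 10 subjects, 50 trials each, 32 sensors, 100 time points.
rng = np.random.default_rng(42)
n_subjects, n_trials, n_sensors, n_times = 10, 50, 32, 100

betas = np.zeros((n_subjects, n_sensors, n_times))
for subj in range(n_subjects):
    X = rng.normal(size=(n_trials, 1))                    # one trial-level predictor
    Y = rng.normal(size=(n_trials, n_sensors * n_times))  # flattened sensor/time data
    # First level: one fit covers every sensor/time point (mass-univariate).
    model = LinearRegression().fit(X, Y)
    betas[subj] = model.coef_[:, 0].reshape(n_sensors, n_times)

# Second level: test the betas against zero across subjects.
t_vals, p_vals = stats.ttest_1samp(betas, popmean=0, axis=0)
print(t_vals.shape)  # (32, 100)
```

The second-level test here is a simple one-sample t-test; in the actual pipelines this step is where the bootstrap-based inference comes in.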
The implemented methods correspond to tests performed using bootstrap under H1 to derive confidence intervals (i.e., providing a measure of the consistency of the observed effects at the group level), and the "studentized bootstrap" (or bootstrap-t) to approximate H0 and control for multiple testing (e.g., via spatiotemporal clustering techniques).
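As an illustration of the bootstrap-t idea, here is a minimal sketch on made-up group-level betas: the data are centered so that H0 (mean of zero) holds, resampled with replacement, and the resulting bootstrap t-distribution yields a threshold for the observed t-values. This is a deliberate simplification of the actual pipeline; no clustering or max-statistic correction is shown.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical group-level betas: 12 subjects x 20 sensor/time points.
betas = rng.normal(loc=0.3, size=(12, 20))

t_obs = stats.ttest_1samp(betas, 0, axis=0).statistic

# Bootstrap-t: center the data so the null hypothesis holds, then resample.
centered = betas - betas.mean(axis=0, keepdims=True)
n_boot = 2000
t_boot = np.empty((n_boot, betas.shape[1]))
for b in range(n_boot):
    idx = rng.integers(0, len(betas), size=len(betas))
    t_boot[b] = stats.ttest_1samp(centered[idx], 0, axis=0).statistic

# H0 threshold from the bootstrap distribution (two-sided, alpha = .05).
thresh = np.quantile(np.abs(t_boot), 0.95, axis=0)
print((np.abs(t_obs) > thresh).sum())
```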
One of the main issues concerns the integration of the fitting tools into MNE's API.
- So far, we've been using scikit-learn's linear regression module to fit the models.
- The advantage here consists in having a linear regression "object" as output, increasing the flexibility for manipulation of the linear model results (t-values, p-values, measures of model fit, etc.), while leaving MNE's linear regression function untouched (for now).
- However, we believe that using a machine learning package for linear regression might irritate users in the long run.
- What are your thoughts on this? One strategy could be to modify, or simplify, MNE's linear regression function to obtain similar output. Here, we would still be doing the linear algebra ourselves and avoid (unnecessary?) excursions into scikit-learn.
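To illustrate the trade-off, here is a small sketch on simulated data showing that the scikit-learn estimator and a plain least-squares solve recover the same coefficients, so "doing the linear algebra ourselves" would not change the results:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -0.5, 2.0]) + rng.normal(size=100)

# scikit-learn route: a fitted estimator object holds the results.
est = LinearRegression().fit(X, y)

# Plain linear algebra route: least squares on a design matrix
# with an explicit intercept column.
design = np.column_stack([np.ones(len(X)), X])
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)

print(np.allclose(est.coef_, coefs[1:]))      # True
print(np.allclose(est.intercept_, coefs[0]))  # True
```

The estimator object is convenient for passing results around; the plain solve avoids the scikit-learn dependency.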
The second major issue concerns the inference part.
- At the moment we are using parts of MNE's `cluster_level` code to run spatiotemporal clustering, which in principle mimics the behavior of `mne.stats.cluster_level._permutation_cluster_test`, but uses bootstrap to threshold the results.
- Thus perhaps the easiest approach would be to integrate bootstrap into `mne.stats.cluster_level._permutation_cluster_test`, or to extract the cluster stats from `mne.stats.cluster_level._permutation_cluster_test` without permutation and submit these to bootstrap in a second function.
Another good question for future work concerns tools needed for dealing with outliers.
- There are several ways to control for (too) influential observations in linear regression models. One solution could be to extend the linear regression module to allow for the fitting of weighted least squares regression.
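As a rough sketch of the weighted least squares idea (simulated data; the single reweighting step shown here is a simplification of proper robust estimators, which iterate it):

```python
import numpy as np

rng = np.random.default_rng(3)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=50)
y[:3] += 15.0  # a few gross outliers

# Ordinary least squares, distorted by the outliers.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Weighted least squares: downweight observations with large OLS residuals.
resid = y - X @ beta_ols
w = 1.0 / np.maximum(np.abs(resid), 1e-6)
sw = np.sqrt(w)
beta_wls, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)

print(beta_ols, beta_wls)
```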
There are a couple of other smaller issues, which will continue to be discussed in the issues section of the project's GSoC repository.
I really enjoyed working on this project during the summer and would be glad to continue working on these tools after GSoC.
Thanks for reading and stay tuned!