This document discusses techniques for predictive modeling of brain imaging data with statistical learning methods. It presents an approach that combines sparse recovery, randomized clustering, and total variation regularization to predict stimuli from fMRI data with more than 50,000 voxels but only around 100 samples. The key steps are clustering spatially correlated voxels, fitting sparse models on the reduced feature set, and accumulating the selected features over multiple runs. Simulations show that this approach outperforms other methods at recovering brain patches. The document also discusses disseminating research through open-source Python libraries such as scikit-learn, which has helped popularize machine learning techniques.
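
The three key steps (cluster, fit a sparse model, accumulate selections) can be illustrated with a minimal scikit-learn sketch. It is not the document's implementation: Ward clustering on random subsamples stands in for the randomized clustering, a plain Lasso stands in for the sparse model, total variation regularization is left out, and the synthetic 12 x 12 "brain", the function name `selection_frequency`, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration
from sklearn.feature_extraction.image import grid_to_graph
from sklearn.linear_model import Lasso


def selection_frequency(X, y, grid_shape, n_clusters=40, n_runs=20,
                        alpha=0.1, subsample=0.75, seed=0):
    """Fraction of runs in which each voxel lies in a cluster kept by the Lasso.

    Hypothetical helper: Ward clustering and Lasso are illustrative choices.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_voxels = X.shape
    # Only spatially adjacent voxels may be merged into the same cluster.
    connectivity = grid_to_graph(*grid_shape)
    counts = np.zeros(n_voxels)
    for _ in range(n_runs):
        # Randomize the clustering by fitting it on a random subset of samples.
        rows = rng.choice(n_samples, size=int(subsample * n_samples),
                          replace=False)
        ward = FeatureAgglomeration(n_clusters=n_clusters,
                                    connectivity=connectivity)
        X_reduced = ward.fit_transform(X[rows])  # one mean signal per cluster
        # Sparse model on the reduced feature set.
        lasso = Lasso(alpha=alpha).fit(X_reduced, y[rows])
        kept = np.flatnonzero(lasso.coef_)
        # Map the selected clusters back to voxels and accumulate.
        counts += np.isin(ward.labels_, kept)
    return counts / n_runs


# Synthetic data (assumption): 100 samples, a 12 x 12 voxel grid, and one
# 3 x 3 patch of voxels that share the latent signal driving the target.
rng = np.random.default_rng(0)
grid_shape = (12, 12)
mask = np.zeros(grid_shape, dtype=bool)
mask[4:7, 4:7] = True
patch = mask.ravel()

latent = rng.standard_normal(100)
X = rng.standard_normal((100, patch.size))
X[:, patch] += 2.0 * latent[:, None]
y = latent + 0.1 * rng.standard_normal(100)

freq = selection_frequency(X, y, grid_shape)
print("mean selection frequency inside patch: ", freq[patch].mean())
print("mean selection frequency outside patch:", freq[~patch].mean())
```

Voxels with a high selection frequency are the ones that were repeatedly placed in a cluster retained by the sparse model. Thresholding `freq` would give one possible estimate of the predictive patch; the threshold, like `alpha` and `n_clusters`, would need tuning on real data.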