minx ||y - Ax||2 + λ ||x||1
Irina Rish
IBM T.J. Watson Research Center
Learning About the Brain
and
Brain-Inspired Learning
Collaborators (an incomplete list)
IBM T.J. Watson Research:
Guillermo Cecchi Steve Heisig Aurelie Lozano
Google: Mt Sinai: Northwestern U.
Melissa Carroll Rita Goldstein A. Vania Apkarian
INRIA: Neurospin/UC Berkeley
Bertrand Thirion
MIT:
Pouya Bashivan
Purdue:
Jean Honorio
Lehigh U.
Katya Scheinberg
SUNY Stony Brook
Dimitris Samaras
St. Johns U.
Genady GrabarnikJB Poline
USC:
Sahil Garg
AI Brain2
Brain 2 AI: Brain-inspired AI Algorithms
AI 2 Brain: Mental-State Prediction
and Statistical Biomarker Discovery
Mental State Recognition to
Improve Mental Function
Detecting emotional & cognitive changes to predict response to
different types of input, e.g. music, video, news, ads, emails
(both for mental health and for neuromarketing)
Safety: detecting changes in driver’s alertness level
(drowsiness, microsleeps) to prevent accidents
Computational psychiatry:
data-analytic approach to diagnosis based on objective measurements
(new Research Domain Criteria (RDoC) initiative by NIMH)
Our current focus: schizophrenia, addiction, Huntington’s, Alzheimer’s, Parkinson’s
“Psychiatric research is in crisis”
[Wiecki et al. 2015]
AI 2 Brain:
Health & Productivity: mental-state-sensitive software
monitoring cognitive load, focus/attention; monitoring
stress/anxiety
Overview: Machine Learning in Neuroimaging
“Statistical biomarkers”:
[Cecchi et al, NIPS 2009]
[Rish et al, PLOS One, 2013]
[Carroll et al, Neuroimage 2009]
[Scheinberg&Rish, ECML 2010]
Schizophrenia classification: 86% to 93% accuracy
[Rish et al, Brain Informatics 2010]
[Rish et al, SPIE Med.Imaging 2012]
[Cecchi et al, PLOS Comp Bio 2012]
Cognitive state prediction in videogames: 70-95%
Pain perception: 70-80%, distributed activation patterns
[Honorio et al, AISTATS 2012]
[Rish et al, SPIE Med.Imaging 2016]
Cocaine addiction: evaluating potential treatments
[Bashivan et al, ICLR 2016]
EEG-cognitive load prediction: 91% w/ recurrent ConvNets
+
++
+
- -
---
Predictive Model
mental
disorder
healthy
Example 1: Cocaine Addiction fMRI Study
Cocaine: Mechanism of Action
• Cocaine affects the reward pathway in the brain (blocks the dopamine transporter)
• May lead to addiction: cocaine use disorder (CUD)
MPH: A Stimulant for a Stimulant?
A potential therapeutic agent for CUD?
(e.g., similarly to nicotine patch and using methadone for heroin addiction)
Methylphenidate Hydrochloride (MPH)
• Common ADHD treatment (Ritalin)
• Similarity to cocaine:
• chemical structure
• mechanism of action (blocks
dopamine transporter)
• Difference: slower rate of clearance
(90 vs 20 min), and thus a lower
dependence and abuse potential
• MPH has shown positive behavioral effects on CUD subjects [Levin 2007]
• MPH tends to normalize both task-related [Goldstein 2010] and resting-state functional activity in
certain areas [Konova 2013]
Resting-state Functional MRI
Image courtesy of fMRI Research
Center at Columbia University
Resting-state fMRI experiment: MPH vs. placebo [Konova et al 2013]
Features: functional network degrees
• Network link (i,j)  correlation between BOLD signals of voxels i and j
exceeds a given threshold (e.g., > 0.6)
• Feature selection: univariate ranking based on p-values; multiple subsets of
top K features, with increasing K, are used to train classifiers
Classification Results: MPH Normalizes CUD’s Networks
[Rish, Bashivan, Cecchi, Goldstein, SPIE 2016]
MPH ‘normalizes’ CUD networks:
CUD’s are harder to discriminate from
controls (10-20% increase in classification
error) under MPH vs under placebo
MPH has stronger effect on CUDs:
MPH (M2) vs Placebo (P2) condition
is much easier to discriminate for CUDs
rather than for controls
Leave-one-out CV with Nearest Neighbor (NN), Linear SVM, Decision Tree (DT), Random
Forest (RF), Logistic Regression (LR), Naïve Bayes (NB), Linear Discriminant Analysis (LDA)
Example 2: Working Memory Load Classification from EEG)
EEG Experiment:
 64-electrode EEG
 Working Memory task, 4 levels of difficulty:
2,4,6, or 8 symbols to remember
 13 subjects, 240 trials each (=3120 trial)
[Bashivan, Rish, Yeasin, Codella, ICLR 2016]
Classification Problem:
given time-series recorded during each trial of WM task, predict WM load level
Data Samples: 2670 correctly answered trials (a subset of total 3120)
Feature Extraction: FFT to find spectral power within each electrode at three
frequency bands - theta (4-8Hz), alpha (8-13Hz), and beta (13-30Hz).
Evaluation: leave-one-subject out (i.e., 13-fold) CV
Brain ‘Movie’ Classification with Recurrent ConvNets
• Idea: combine spatial, temporal and frequency – make EEG ‘movies’
• EEG images: project 3D electrode locations (64) into a 2D via distance-preserving Azimuthal
Equidistant Projection, then interpolate the activity
• RGB colores = Theta, Alpha, Beta frequences
• Each trial = 7 frames (RGB images)  short “movies” as samples
• FFT over the complete trial
 single image for each trial
• VGG style ConvNets
[Simonyan & Zisserman, 2015]
• Conv layers with 3 x 3 receptive
fields
• 4 architectures, increasing depth;
deeper is (slightly) better
Baseline: Non-temporal Approach with ConvNets
ConvNet Configurations
A B C D
input (32 x 32 3-channel image)
Conv3-32
Conv3-32
Conv3-32
Conv3-32
Conv3-32
Conv3-32
Conv3-32
Conv3-32
Conv3-32
Conv3-32
maxpool
-
Conv3-64
Conv3-64
Conv3-64
Conv3-64
Conv3-64
Conv3-64
- maxpool
- -
Conv3-
128
Conv3-
128
- - maxpool
Architecture
Number of
parameters
Test Error
A ~10k 13.05
B ~65.5k 13.17
C ~139.4k 13.91
D ~158k 12.39
Adding Time is Better: Recurrent ConvNets
Best result: 8.9% error discriminating
among 4 levels of cognitive load
achieved by recurrent Conv Nets with
LSTM + time convolution
• EEG times series for each trial split into 7
windows (0.5 sec). FFT on each time window
to get an image as before
• Best ConvNet (7-layer) used as C component
• All 7 ConvNets shared parameters
• video classification architectures from
[Ng et al, CVPR 2015]
• Temporal Maxpool: Max pool over time frames
• Temporal Convolution: 1D convolution over time
frames
• LSTM - sequence mapping over times frames
• Mixed LSTM/1D-Conv: Combination of both LSTM
and 1D-Conv architectures
Architecture
Test
Error (%)
Validation
Error (%)
Number of
parameters
RBF SVM 15.34 - -
L1-logistic
regression
15.32 -
-
Random Forest 12.59 - -
DBN 14.96 8.37 1.02 mil
ConvNet+Maxpoo
l
14.80 8.48
1.21 mil
ConvNet+1D-
Conv
11.32 9.28
441 k
ConvNet+LSTM 10.54 6.10 1.34 mil
ConvNet+LSTM/1
D-Conv
8.89 8.39 1.62 mil
[Bashivan, Rish, Yeasin, Codella, ICLR 2016]
Interpretability via Deconvolution
Code: https://github.com/pbashivan/EEGLearn
Using deconvnet of [Zeiler et al] to map features
back to the brain images
Back Projections: maps obtained by deconvnet
on the feature map displaying structures in the
input image that excite that particular feature map.
Some of these features correspond to well-known
electrophysiological markers of cognitive load.
First-layer features (1st stack, kernel 7) captured
wide-spread theta (1st stack output-kernel7) and
another (1st stack, kernel 23) frontal beta
activity
Second- and third-layer features – frontal theta/beta
(2nd stack,kernel7) and 3rd stack kernel60, 112) as well
as parietal alpha (2nd stack kernel29) .
Frontal theta and beta activity as well as parietal
alpha are most prominent markers of
cognitive/memory load in neuroscience
literature [Bashivan et al., 2015; Jensen et al., 2002;
Onton et al., 2005; Tallon-Baudry et al., 1999]
Input EEG images: top 9 images with highest
feature activations across the training set
Layer4Layer6Layer7
• Current theories: the hippocampus functions as an autoenconder to evoke
memories; similar encoding function is suggested in the olfactory bulb
• Our computational model: sparse linear autoencoder (online dictionary learning of
Mairal et al) + dynamic addition (birth) abnd deletion (death) of hidden nodes
Adult Neurogenesis:
Inspiration for Adaptive Representation Learning
• Predominant in the dentate gyrus of the hippocampus
and in the olfactory bulb
Olfactory bulb Dentate gyrus
[Garg, Rish, Cecchi, Lozano, ICLR 2017]
nsamples
p variables
~~
mbasisvectors
(dictionary)sparse
representation
input x
output x’ 
reconstructed x
hidden nodes c 
encoded x
link weights 
‘dictionary’ D
c c
Brain 2 AI:
Better Adaptation in Non-Stationary Environment
Learned dictionary size ‘Old’ domain reconstruction ‘New’ domain reconstruction
non-stationary visual input
Outperforms fixed-size autoencoder on non-stationary input:
improved accuracy + more compact representation
Adapts to a new domain without forgetting the old one
(via ‘memory’ matrices, part of original Mairal’s method)
Some Lessons
 In brain imaging applications
 Datasets are relatively small (e.g., few 1000 samples)
 Model interpretability is important
 Brain-inspired algorithms:
 Neurogenesis, attention, memory and many other brain phenomena can serve
as inspiration for better AI algorithms
 Challenge: deeper understanding and better modeling of such phenomena
 Deep learning faces specific challenges in neuroimaging
 Need for stronger regularization
 Need for interpretability (e.g., deconvolution, sparsity)
References
[Garg, Rish, Cecchi, Lozano 2016; submitted] S. Garg, I. Rish, G. Cecchi, A. Lozano. Neurogenesis-inspired Dictionary Learning: Online Model Adaptation in
a changing world, submitted to ICLR-2017
[Bashivan et al, ICLR 2016] P. Bashivan, I. Rish, M. Yeasin, N. Codella. Learning Representations from EEG with Deep Recurrent-Convolutional Neural
Networks. ICLR 2016 : International Conference on Learning Representations.
[Bashivan et al, 2015] Mental State Recognition via Wearable EEG, in Proc. of MLINI-2015 workshop at NIPS-2015.
[Heisig et al, 2014] S. Heisig, G. Cecchi, R. Rao and I. Rish. Augmented Human: Human OS for Improved Mental Function. AAAI 2014 Workshop on Cognitive
Computing and Augmented Human Intelligence.
[Neuropsychopharmacology, 2014] A Window into the Intoxicated Mind? Speech as an Index of Psychoactive Drug Effects. Bedi G, Cecchi G A, Fernandez
Slezak D, Carrillo F, Sigman M, de Wit H. Neuropsychopharmacology, 2014
[NPJ 2015] G. Bedi, F. Carrillo, G. A Cecchi, D. F. Slezak, M. Sigman, N. B Mota, S. Ribeiro, D C Javitt, M. Copelli, C M Corcoran. Automated analysis of free
speech predicts psychosis onset in high-risk youths. NPJ Schizophrenia 2015.
[PLoS ONE, 2013] Schizophrenia as a Network Disease: Disruption of Emergent Brain Function in Patients with Auditory Hallucinations, I Rish, G Cecchi, B
Thyreau, B Thirion, M Plaze, M-L Paillere-Martinot, C Martelli, J-L Martinot, J-B Poline. PloS ONE 8(1), e50625, Public Library of Science, 2013.
[PLoS One, 2012] Speech Graphs Provide a Quantitative Measure of Thought Disorder in Psychosis. N.B. Mota, N.A.P. Vasconcelos, N. Lemos, A.C. Pieretti,
O. Kinouchi, G.A. Cecchi, M. Copelli, S. Ribeiro. PLoS One, 2012
[Rish et al, SPIE 2016] I.Rish, P. Bashivan, G. A. Cecchi, R.Z. Goldstein, Evaluating Effects of Methylphenidate on Brain Activity in Cocaine Addiction: A
Machine-Learning Approach. SPIE Medical Imaging, 2016
[SPIE Med.Imaging 2012] Sparse regression analysis of task-relevant information distribution in the brain.
Irina Rish, Guillermo A Cecchi, Kyle Heuton, Marwan N Baliki, A Vania Apkarian, SPIE Medical Imaging, 2012.
[AISTATS 2012] J. Honorio, D. Samaras, I. Rish, G.A. Cecchi. Variable Selection for Gaussian Graphical Models. AISTATS, 2012.
[PLoS Comp Bio 2012] Predictive Dynamics of Human Pain Perception, GA Cecchi, L Huang, J Ali Hashmi, M Baliki, MV Centeno, I Rish, AV Apkarian,
PLoS Comp Bio 8(10), e1002719, Public Library of Science, 2012.
[Brain Informatics 2010] I. Rish, G. Cecchi, M.N. Baliki and A.V. Apkarian. Sparse Regression Models of Pain Perception, in Proc. of Brain Informatics (BI-
2010), Toronto, Canada, August 2010.
[NeuroImage, 2009] Prediction and interpretation of distributed neural activity with sparse models. Melissa K Carroll, Guillermo A Cecchi, Irina Rish, Rahul
Garg, A Ravishankar Rao. NeuroImage 44(1), 112--122, Elsevier, 2009.
[NIPS, 2009] Discriminative network models of schizophrenia, GA Cecchi, I Rish, B Thyreau, B Thirion, M Plaze, M-L Paillere-Martinot, C Martelli, J-L Martinot,
J-B Poline. Advances in Neural Information Processing Systems (NIPS 2009) , pp. 252--260, 2009.

Irina Rish, Researcher, IBM Watson, at MLconf NYC 2017

  • 1.
    minx ||y -Ax||2 + λ ||x||1 Irina Rish IBM T.J. Watson Research Center Learning About the Brain and Brain-Inspired Learning
  • 2.
    Collaborators (an incompletelist) IBM T.J. Watson Research: Guillermo Cecchi Steve Heisig Aurelie Lozano Google: Mt Sinai: Northwestern U. Melissa Carroll Rita Goldstein A. Vania Apkarian INRIA: Neurospin/UC Berkeley Bertrand Thirion MIT: Pouya Bashivan Purdue: Jean Honorio Lehigh U. Katya Scheinberg SUNY Stony Brook Dimitris Samaras St. Johns U. Genady GrabarnikJB Poline USC: Sahil Garg
  • 3.
    AI Brain2 Brain 2AI: Brain-inspired AI Algorithms AI 2 Brain: Mental-State Prediction and Statistical Biomarker Discovery
  • 4.
    Mental State Recognitionto Improve Mental Function Detecting emotional & cognitive changes to predict response to different types of input, e.g. music, video, news, ads, emails (both for mental health and for neuromarketing) Safety: detecting changes in driver’s alertness level (drowsiness, microsleeps) to prevent accidents Computational psychiatry: data-analytic approach to diagnosis based on objective measurements (new Research Domain Criteria (RDoC) initiative by NIMH) Our current focus: schizophrenia, addiction, Huntington’s, Alzheimer’s, Parkinson’s “Psychiatric research is in crisis” [Wiecki et al. 2015] AI 2 Brain: Health & Productivity: mental-state-sensitive software monitoring cognitive load, focus/attention; monitoring stress/anxiety
  • 5.
    Overview: Machine Learningin Neuroimaging “Statistical biomarkers”: [Cecchi et al, NIPS 2009] [Rish et al, PLOS One, 2013] [Carroll et al, Neuroimage 2009] [Scheinberg&Rish, ECML 2010] Schizophrenia classification: 86% to 93% accuracy [Rish et al, Brain Informatics 2010] [Rish et al, SPIE Med.Imaging 2012] [Cecchi et al, PLOS Comp Bio 2012] Cognitive state prediction in videogames: 70-95% Pain perception: 70-80%, distributed activation patterns [Honorio et al, AISTATS 2012] [Rish et al, SPIE Med.Imaging 2016] Cocaine addiction: evaluating potential treatments [Bashivan et al, ICLR 2016] EEG-cognitive load prediction: 91% w/ recurrent ConvNets + ++ + - - --- Predictive Model mental disorder healthy
  • 6.
    Example 1: CocaineAddiction fMRI Study Cocaine: Mechanism of Action • Cocaine affects the reward pathway in the brain (blocks the dopamine transporter) • May lead to addiction: cocaine use disorder (CUD)
  • 7.
    MPH: A Stimulantfor a Stimulant? A potential therapeutic agent for CUD? (e.g., similarly to nicotine patch and using methadone for heroin addiction) Methylphenidate Hydrochloride (MPH) • Common ADHD treatment (Ritalin) • Similarity to cocaine: • chemical structure • mechanism of action (blocks dopamine transporter) • Difference: slower rate of clearance (90 vs 20 min), and thus a lower dependence and abuse potential • MPH has shown positive behavioral effects on CUD subjects [Levin 2007] • MPH tends to normalize both task-related [Goldstein 2010] and resting-state functional activity in certain areas [Konova 2013]
  • 8.
    Resting-state Functional MRI Imagecourtesy of fMRI Research Center at Columbia University Resting-state fMRI experiment: MPH vs. placebo [Konova et al 2013] Features: functional network degrees • Network link (i,j)  correlation between BOLD signals of voxels i and j exceeds a given threshold (e.g., > 0.6) • Feature selection: univariate ranking based on p-values; multiple subsets of top K features, with increasing K, are used to train classifiers
  • 9.
    Classification Results: MPHNormalizes CUD’s Networks [Rish, Bashivan, Cecchi, Goldstein, SPIE 2016] MPH ‘normalizes’ CUD networks: CUD’s are harder to discriminate from controls (10-20% increase in classification error) under MPH vs under placebo MPH has stronger effect on CUDs: MPH (M2) vs Placebo (P2) condition is much easier to discriminate for CUDs rather than for controls Leave-one-out CV with Nearest Neighbor (NN), Linear SVM, Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), Naïve Bayes (NB), Linear Discriminant Analysis (LDA)
  • 10.
    Example 2: WorkingMemory Load Classification from EEG) EEG Experiment:  64-electrode EEG  Working Memory task, 4 levels of difficulty: 2,4,6, or 8 symbols to remember  13 subjects, 240 trials each (=3120 trial) [Bashivan, Rish, Yeasin, Codella, ICLR 2016] Classification Problem: given time-series recorded during each trial of WM task, predict WM load level Data Samples: 2670 correctly answered trials (a subset of total 3120) Feature Extraction: FFT to find spectral power within each electrode at three frequency bands - theta (4-8Hz), alpha (8-13Hz), and beta (13-30Hz). Evaluation: leave-one-subject out (i.e., 13-fold) CV
  • 11.
    Brain ‘Movie’ Classificationwith Recurrent ConvNets • Idea: combine spatial, temporal and frequency – make EEG ‘movies’ • EEG images: project 3D electrode locations (64) into a 2D via distance-preserving Azimuthal Equidistant Projection, then interpolate the activity • RGB colores = Theta, Alpha, Beta frequences • Each trial = 7 frames (RGB images)  short “movies” as samples
  • 12.
    • FFT overthe complete trial  single image for each trial • VGG style ConvNets [Simonyan & Zisserman, 2015] • Conv layers with 3 x 3 receptive fields • 4 architectures, increasing depth; deeper is (slightly) better Baseline: Non-temporal Approach with ConvNets ConvNet Configurations A B C D input (32 x 32 3-channel image) Conv3-32 Conv3-32 Conv3-32 Conv3-32 Conv3-32 Conv3-32 Conv3-32 Conv3-32 Conv3-32 Conv3-32 maxpool - Conv3-64 Conv3-64 Conv3-64 Conv3-64 Conv3-64 Conv3-64 - maxpool - - Conv3- 128 Conv3- 128 - - maxpool Architecture Number of parameters Test Error A ~10k 13.05 B ~65.5k 13.17 C ~139.4k 13.91 D ~158k 12.39
  • 13.
    Adding Time isBetter: Recurrent ConvNets Best result: 8.9% error discriminating among 4 levels of cognitive load achieved by recurrent Conv Nets with LSTM + time convolution • EEG times series for each trial split into 7 windows (0.5 sec). FFT on each time window to get an image as before • Best ConvNet (7-layer) used as C component • All 7 ConvNets shared parameters • video classification architectures from [Ng et al, CVPR 2015] • Temporal Maxpool: Max pool over time frames • Temporal Convolution: 1D convolution over time frames • LSTM - sequence mapping over times frames • Mixed LSTM/1D-Conv: Combination of both LSTM and 1D-Conv architectures Architecture Test Error (%) Validation Error (%) Number of parameters RBF SVM 15.34 - - L1-logistic regression 15.32 - - Random Forest 12.59 - - DBN 14.96 8.37 1.02 mil ConvNet+Maxpoo l 14.80 8.48 1.21 mil ConvNet+1D- Conv 11.32 9.28 441 k ConvNet+LSTM 10.54 6.10 1.34 mil ConvNet+LSTM/1 D-Conv 8.89 8.39 1.62 mil [Bashivan, Rish, Yeasin, Codella, ICLR 2016]
  • 14.
    Interpretability via Deconvolution Code:https://github.com/pbashivan/EEGLearn Using deconvnet of [Zeiler et al] to map features back to the brain images Back Projections: maps obtained by deconvnet on the feature map displaying structures in the input image that excite that particular feature map. Some of these features correspond to well-known electrophysiological markers of cognitive load. First-layer features (1st stack, kernel 7) captured wide-spread theta (1st stack output-kernel7) and another (1st stack, kernel 23) frontal beta activity Second- and third-layer features – frontal theta/beta (2nd stack,kernel7) and 3rd stack kernel60, 112) as well as parietal alpha (2nd stack kernel29) . Frontal theta and beta activity as well as parietal alpha are most prominent markers of cognitive/memory load in neuroscience literature [Bashivan et al., 2015; Jensen et al., 2002; Onton et al., 2005; Tallon-Baudry et al., 1999] Input EEG images: top 9 images with highest feature activations across the training set Layer4Layer6Layer7
  • 15.
    • Current theories:the hippocampus functions as an autoenconder to evoke memories; similar encoding function is suggested in the olfactory bulb • Our computational model: sparse linear autoencoder (online dictionary learning of Mairal et al) + dynamic addition (birth) abnd deletion (death) of hidden nodes Adult Neurogenesis: Inspiration for Adaptive Representation Learning • Predominant in the dentate gyrus of the hippocampus and in the olfactory bulb Olfactory bulb Dentate gyrus [Garg, Rish, Cecchi, Lozano, ICLR 2017] nsamples p variables ~~ mbasisvectors (dictionary)sparse representation input x output x’  reconstructed x hidden nodes c  encoded x link weights  ‘dictionary’ D c c Brain 2 AI:
  • 16.
    Better Adaptation inNon-Stationary Environment Learned dictionary size ‘Old’ domain reconstruction ‘New’ domain reconstruction non-stationary visual input Outperforms fixed-size autoencoder on non-stationary input: improved accuracy + more compact representation Adapts to a new domain without forgetting the old one (via ‘memory’ matrices, part of original Mairal’s method)
  • 17.
    Some Lessons  Inbrain imaging applications  Datasets are relatively small (e.g., few 1000 samples)  Model interpretability is important  Brain-inspired algorithms:  Neurogenesis, attention, memory and many other brain phenomena can serve as inspiration for better AI algorithms  Challenge: deeper understanding and better modeling of such phenomena  Deep learning faces specific challenges in neuroimaging  Need for stronger regularization  Need for interpretability (e.g., deconvolution, sparsity)
  • 18.
    References [Garg, Rish, Cecchi,Lozano 2016; submitted] S. Garg, I. Rish, G. Cecchi, A. Lozano. Neurogenesis-inspired Dictionary Learning: Online Model Adaptation in a changing world, submitted to ICLR-2017 [Bashivan et al, ICLR 2016] P. Bashivan, I. Rish, M. Yeasin, N. Codella. Learning Representations from EEG with Deep Recurrent-Convolutional Neural Networks. ICLR 2016 : International Conference on Learning Representations. [Bashivan et al, 2015] Mental State Recognition via Wearable EEG, in Proc. of MLINI-2015 workshop at NIPS-2015. [Heisig et al, 2014] S. Heisig, G. Cecchi, R. Rao and I. Rish. Augmented Human: Human OS for Improved Mental Function. AAAI 2014 Workshop on Cognitive Computing and Augmented Human Intelligence. [Neuropsychopharmacology, 2014] A Window into the Intoxicated Mind? Speech as an Index of Psychoactive Drug Effects. Bedi G, Cecchi G A, Fernandez Slezak D, Carrillo F, Sigman M, de Wit H. Neuropsychopharmacology, 2014 [NPJ 2015] G. Bedi, F. Carrillo, G. A Cecchi, D. F. Slezak, M. Sigman, N. B Mota, S. Ribeiro, D C Javitt, M. Copelli, C M Corcoran. Automated analysis of free speech predicts psychosis onset in high-risk youths. NPJ Schizophrenia 2015. [PLoS ONE, 2013] Schizophrenia as a Network Disease: Disruption of Emergent Brain Function in Patients with Auditory Hallucinations, I Rish, G Cecchi, B Thyreau, B Thirion, M Plaze, M-L Paillere-Martinot, C Martelli, J-L Martinot, J-B Poline. PloS ONE 8(1), e50625, Public Library of Science, 2013. [PLoS One, 2012] Speech Graphs Provide a Quantitative Measure of Thought Disorder in Psychosis. N.B. Mota, N.A.P. Vasconcelos, N. Lemos, A.C. Pieretti, O. Kinouchi, G.A. Cecchi, M. Copelli, S. Ribeiro. PLoS One, 2012 [Rish et al, SPIE 2016] I.Rish, P. Bashivan, G. A. Cecchi, R.Z. Goldstein, Evaluating Effects of Methylphenidate on Brain Activity in Cocaine Addiction: A Machine-Learning Approach. SPIE Medical Imaging, 2016 [SPIE Med.Imaging 2012] Sparse regression analysis of task-relevant information distribution in the brain. Irina Rish, Guillermo A Cecchi, Kyle Heuton, Marwan N Baliki, A Vania Apkarian, SPIE Medical Imaging, 2012. [AISTATS 2012] J. Honorio, D. Samaras, I. Rish, G.A. Cecchi. Variable Selection for Gaussian Graphical Models. AISTATS, 2012. [PLoS Comp Bio 2012] Predictive Dynamics of Human Pain Perception, GA Cecchi, L Huang, J Ali Hashmi, M Baliki, MV Centeno, I Rish, AV Apkarian, PLoS Comp Bio 8(10), e1002719, Public Library of Science, 2012. [Brain Informatics 2010] I. Rish, G. Cecchi, M.N. Baliki and A.V. Apkarian. Sparse Regression Models of Pain Perception, in Proc. of Brain Informatics (BI- 2010), Toronto, Canada, August 2010. [NeuroImage, 2009] Prediction and interpretation of distributed neural activity with sparse models. Melissa K Carroll, Guillermo A Cecchi, Irina Rish, Rahul Garg, A Ravishankar Rao. NeuroImage 44(1), 112--122, Elsevier, 2009. [NIPS, 2009] Discriminative network models of schizophrenia, GA Cecchi, I Rish, B Thyreau, B Thirion, M Plaze, M-L Paillere-Martinot, C Martelli, J-L Martinot, J-B Poline. Advances in Neural Information Processing Systems (NIPS 2009) , pp. 252--260, 2009.