Two Algorithms Weakly Supervised Denoising EEG Data

Two Algorithms for Weakly Supervised
Denoising of EEG Data
Tim Oates
Professor
University of Maryland Baltimore County

The Real Brains
3/2/2019 Artifact Removal From Electroencephalography (EEG) Data
Sunil Gandhi
Ph.D. student
Will graduate soon!

Agenda
• Introduction To Artifact Removal From EEG data
• ICA And Multi-instance Learning Solution
• Asymmetric Generative Adversarial Network Solution
• Future Work
• Questions

Electroencephalography (EEG)
• Electroencephalography (EEG) is used to record electrical
activity in the brain
Image Source: https://en.wikipedia.org/wiki/Electroencephalography

Electroencephalography (EEG) Data
– Seizure detection
Page, Adam, Chris Sagedy, Emily Smith, Nasrin Attaran, Tim Oates, and Tinoosh Mohsenin. "A flexible multichannel EEG
feature extractor and classifier for seizure detection." IEEE Transactions on Circuits and Systems II: Express Briefs 62, no. 2
(2015): 109-113.

– Determining cognitive load
Bashivan, Pouya, Irina Rish, Mohammed Yeasin, and Noel Codella. "Learning representations from EEG with deep recurrent-
convolutional neural networks." International Conference on Learning Representations 2016.

– Brain Computer interfaces (BCI)
Source: https://www.youtube.com/watch?v=rYwgnFeFqmc

– Driver drowsiness estimation
Lance, Brent J., W. David Hairston, Greg Apker, Keith W. Whitaker, Geoff Slipher, Randy Mrozek, Scott E. Kerick, Jason Metcalfe,
Christopher Manteuffel, and Matthew Jaswa. 2012 year-end report on neurotechnologies for in-vehicle applications. No. ARL-SR-267.
ARMY RESEARCH LAB ABERDEEN PROVING GROUND MD, 2013.

Challenges in using EEG data
• Compact comfortable headsets
• Noisy Data
• Online, fully automated algorithms
• Hardware implementations

EEG Artifacts
• Artifacts: Unwanted electrical activity arising from sources
other than the brain
• Types of Artifacts
– Biological Artifacts
• Ocular artifacts
• Muscular artifacts
• Cardiac artifacts
– External artifacts
• Electrode motion
• External device artifacts

EEG Artifacts
[Figure: Eye Blink Artifact]

EEG Artifacts
• Why is it important to remove them?
– Artifacts increase the chance of false alarms in seizure detection (Seneviratne et al.
2013).
– They can also alter the shape of neurological events, causing unintentional
control of BCI systems (Vaughan et al. 2003).

EEG Artifacts
• Can the occurrence of artifacts be avoided?
– The occurrence of external artifacts can be reduced by proper placement
of electrodes, but it is impossible to avoid artifacts of biological origin.

Contributions
• Propose a system for artifact removal from EEG data using
weak supervisory information.
• Improve the model by proposing an online, fully automated,
end-to-end system for artifact removal trained using unpaired
training corpora.
• Creating a framework for evaluation of artifact removal
algorithms
• Improving artifact removal using additional annotations
• Generalizing AsymmetricGAN for denoising speech data

Related Work
• Artifact rejection (Nolan, Whelan, and Reilly 2010)
– Train an artifact detector
– Remove the segments of data where artifacts are present
– Causes excessive loss of information
• Regression using a reference channel (Croft and Barry 2000)
– Uses a reference channel such as Electrocardiography (ECG), Electrooculography
(EOG) or Electromyography (EMG).
– Fails if a reference signal is not available.
• Adaptive filtering (Sweeney, Ward, and McLoone 2012)
– Generates an artifact signal that is uncorrelated to the EEG signal but correlated
to the reference signal.
– Fails if a reference signal is not available.

Related Work
• Wavelet transforms (Islam et al. 2017) (Daly et al. 2015)
– Decompose each channel using DWT
– Threshold the low frequency component and reconstruct the signal
– Process each channel separately and could miss important clues for
removing the artifact.

Related Work
• Principal component analysis (PCA) (Berg and Scherg 1991)
– One of the initial approaches that used principal component analysis
(PCA) for decomposing the signal and removing artifacts from the
decomposed signal.
– Uses orthogonal transformation to convert a multi channel EEG signal
into a set of linearly uncorrelated variables.
– It has been demonstrated that PCA is unable to separate some
artifactual components from brain signals, especially when they have
similar amplitudes (Fitzgibbon et al. 2007b).

Independent component analysis (ICA)
• Most popular technique for artifact removal from EEG data
• ICA is used to recover independent source signals called
components and then the components corresponding to
artifacts are identified.

ICA Based Artifact Removal
• Shackman, Alexander J., et al. "Identifying robust and sensitive frequency bands for
interrogating neural oscillations." Neuroimage 51.4 (2010): 1319-1333.
• Fitzgibbon, S. P., et al. "Removal of EEG noise and artifact using blind source separation."
Journal of Clinical Neurophysiology 24.3 (2007): 232-243.
• Joyce, Carrie A., Irina F. Gorodnitsky, and Marta Kutas. "Automatic removal of eye
movement and blink artifacts from EEG data using blind component separation."
Psychophysiology 41.2 (2004): 313-325.
• Jung, Tzyy-Ping, et al. "Removal of eye activity artifacts from visual event-related potentials
in normal and clinical subjects." Clinical Neurophysiology 111.10 (2000): 1745-1758.
• Jung, Tzyy-Ping, et al. "Removing electroencephalographic artifacts by blind source
separation." Psychophysiology 37.02 (2000): 163-178.
• Keren, Alon S., Shlomit Yuval-Greenberg, and Leon Y. Deouell. "Saccadic spike potentials in
gamma-band EEG: characterization, detection and suppression." Neuroimage 49.3 (2010):
2248-2263.

Artifact components need to be manually
identified, or a supervisory signal is needed for
training

Key Idea: Identify components corresponding to
artifacts using a weak supervisory signal

System Architecture
[Diagram: Multi-channel EEG signal (Channel 1, Channel 2, ..., Channel 64) → ICA → bag of components → SAX feature extraction → one binary vector per component (e.g. 0 1 0 1 0 0 over the SAX strings "aaa", "abc", "aab", "aba", "acb", "acc") → Multi-Instance Learning (MIL) → probability of noise for each component → artifact / not artifact?]
• Main Processing Blocks
⎻ Independent Component Analysis
⎻ SAX feature extraction
⎻ Multi Instance Learning

ICA
[Diagram: unknown origin signals 1-3 pass through an unknown mixing system to give observed signals 1-3; ICA recovers estimated signals 1-3.]

Source: http://scikit-learn.sourceforge.net/0.6/auto_examples/plot_ica_blind_source_separation.html

ICA
[Diagram: a clean EEG signal and an artifact pass through an unknown mixing system to give observed signals 1-3; ICA separates the observations into denoised EEG and the artifact.]
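The removal step in the diagram can be sketched as follows. This is a minimal illustration, not the system from the talk: it assumes the mixing matrix is already known (in practice ICA has to estimate it), it uses only two channels, and the names `remove_components`, `mixing`, `brain` and `blink` are hypothetical. The sketch unmixes each sample, zeroes the component marked as an artifact, and mixes the rest back into channel space.

```python
import math

def invert_2x2(m):
    """Invert a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("mixing matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Multiply matrix m by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def remove_components(observed, mixing, artifact_indices):
    """Unmix each sample, zero the artifact components, and remix."""
    unmixing = invert_2x2(mixing)
    cleaned = []
    for sample in observed:
        components = matvec(unmixing, sample)
        for i in artifact_indices:
            components[i] = 0.0
        cleaned.append(matvec(mixing, components))
    return cleaned

# Toy data: component 0 stands in for brain activity, component 1 for a blink.
brain = [math.sin(0.3 * t) for t in range(50)]
blink = [5.0 if 20 <= t < 25 else 0.0 for t in range(50)]
mixing = [[1.0, 0.8], [0.5, 0.1]]  # hypothetical mixing matrix, assumed known
observed = [matvec(mixing, [b, k]) for b, k in zip(brain, blink)]
cleaned = remove_components(observed, mixing, artifact_indices=[1])
```

After the call, each cleaned channel contains only the contribution of the retained component. Deciding which components are artifacts is the part the rest of the pipeline addresses.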

How do we obtain SAX features?
[Figure: a time series is discretized into the symbols a, b and c, giving the SAX word "baabccbc"]
Slide by Eamonn Keogh (http://www.cs.ucr.edu/~eamonn/SAX.ppt)
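A rough sketch of how a SAX word can be computed, assuming the usual three steps: z-normalize the series, average it over equal-width segments (piecewise aggregate approximation), and map each segment mean to a symbol using breakpoints that split a standard normal distribution into equal-probability regions. The function name `sax_word` and the parameter choices are illustrative, not taken from the talk.

```python
from statistics import NormalDist, mean, pstdev

def sax_word(series, n_segments, alphabet="abc"):
    """Convert a numeric series into a SAX word of length n_segments."""
    # Step 1: z-normalize (a constant series maps to all zeros).
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]

    # Step 2: piecewise aggregate approximation, one mean per segment.
    n = len(z)
    paa = [mean(z[i * n // n_segments:(i + 1) * n // n_segments])
           for i in range(n_segments)]

    # Step 3: breakpoints cut the standard normal into equal-probability bins.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # Each segment's symbol index is the number of breakpoints below its mean.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)

word = sax_word([0, 0, 0, 0, 5, 5, 5, 5, 10, 10, 10, 10], n_segments=3)
```

For this step-shaped input, the low, middle and high segments fall into the first, second and third bins, so `word` is "abc".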

Multi-Instance Learning
• A bag is labelled positive if it has at least one positive instance
• Goal: Predict bag label and instance labels
[Figure: a bag of instances. Bag-level question: positive or negative bag? Instance-level question: positive or negative instance?]
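The bag-labelling rule can be written down directly. In this sketch a bag is assumed to be the set of ICA components from one EEG segment and an instance is a single component; the helper names `bag_label` and `bag_score` are hypothetical. Taking the maximum over instance probabilities is one common way to turn instance-level predictions into a bag-level score, not necessarily the one used in the talk.

```python
def bag_label(instance_labels):
    """A bag is positive (1) if at least one instance is positive."""
    return int(any(instance_labels))

def bag_score(instance_probabilities):
    """Bag-level score as the maximum instance probability (an assumption)."""
    return max(instance_probabilities)

# Only bag labels are observed during training; instance labels are hidden.
segment_with_blink = [0, 0, 1, 0]   # one component carries the artifact
resting_segment = [0, 0, 0, 0]      # no artifact component

labels = (bag_label(segment_with_blink), bag_label(resting_segment))
```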

EEG Dataset Generated by the US Army Research
Laboratory
• 17 subjects
• Artifacts
o Sync pulse
o Clench jaw
o Move jaw vertically
o Blink (no squinting!)
o Move eyes left, then back
to center
o Move eyes up, then back
to center
o Raise and lower eyebrows
o Rotate head side-to-side
o Shrug shoulders
o Rotate torso (hips)
[Figure: EEG when participant is not performing any activity]
[Figure: EEG when participant is raising and lowering eyebrows]

Results
Algorithm              Kernel     Accuracy
Normalized Set Kernel  Linear     95.2%
MISVM                  Linear     73.2%
miSVM                  Linear     70.0%
MISVM                  Quadratic  81.6%
miSVM                  Quadratic  66.0%
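For readers unfamiliar with the best-performing row, a set kernel compares two bags by summing an instance-level kernel over all pairs of instances. The sketch below uses a linear instance kernel and one common normalization (dividing by the geometric mean of the self-similarities); the talk does not state which normalization was used, so treat that choice and the function names as assumptions.

```python
import math

def linear_kernel(x, y):
    """Dot product between two instance vectors."""
    return sum(a * b for a, b in zip(x, y))

def set_kernel(bag_x, bag_y, k=linear_kernel):
    """Sum the instance kernel over every pair of instances."""
    return sum(k(x, y) for x in bag_x for y in bag_y)

def normalized_set_kernel(bag_x, bag_y, k=linear_kernel):
    """Set kernel scaled so that a bag compared with itself gives 1."""
    denom = math.sqrt(set_kernel(bag_x, bag_x, k) * set_kernel(bag_y, bag_y, k))
    return set_kernel(bag_x, bag_y, k) / denom if denom > 0 else 0.0

# Bags of binary SAX feature vectors, one vector per component.
bag_a = [[0, 1, 0, 1, 0, 0], [0, 0, 1, 1, 0, 0]]
bag_b = [[0, 0, 1, 1, 1, 1]]
similarity = normalized_set_kernel(bag_a, bag_b)
```

The resulting bag-level kernel can be passed to a standard kernel classifier, so no instance labels are needed at training time.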

Results
Patient ID  LOSO Accuracy (%)  Personalized Accuracy (%)
1           72.55              98.04 (3.39)
2           76.47              96.08 (3.39)
3           76.47              92.16 (6.79)
4           82.35              98.04 (3.39)
5           84.31              98.04 (3.39)
6           31.37              94.12 (0)
7           94.12              100 (0)
8           100                100 (0)
9           98.04              100 (0)
10          66.67              90.2 (6.8)
11          98.04              100 (0)
12          80.39              94.12 (10.19)
13          88.24              100 (0)
14          49.02              100 (0)
15          66.67              90.2 (12.25)
16          54.90              100 (0)
17          90.20              100 (0)
Average     77.05              97.12
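The LOSO column is read here as leave-one-subject-out evaluation: the model is trained on all subjects except one and tested on the held-out subject. That reading is an assumption, as is the helper name `loso_splits`. A minimal split generator under that assumption:

```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# Toy example: six bags recorded from three subjects.
subject_ids = [1, 1, 2, 2, 3, 3]
splits = list(loso_splits(subject_ids))
```

Each sample appears in exactly one test fold, so the per-subject accuracies can be averaged as in the table.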

Accuracy
Patient ID  Accuracy
1           100 (0)
2           89 (7.3)
3           100 (0)
4           99 (2.8)
5           100 (0)
6           97 (5.5)
7           97 (5.5)
Average     98 (5.5)
Pooled      97 (1.9)
Bootstrap   14 (17.7)
LOSO        75
Lawhern, Vernon, W. David Hairston, Kaleb McDowell,
Marissa Westerfield, and Kay Robbins. "Detection and
classification of subject-generated artifacts in EEG
signals using autoregressive models." Journal of
neuroscience methods 208, no. 2 (2012): 181-189.

Summary
• We presented an EEG artifact removal system, using ICA, SAX
feature extraction, and Multi-Instance Learning algorithms. The
proposed system uses a weak supervisory signal to indicate
that some noise is occurring, but not what the source of the
noise is or how it is manifested in the EEG signal. We optimize
the hyper-parameters of the system to reduce the execution
time of the system while maintaining accuracy.
Jafari, Ali, Sunil Gandhi, Sri Harsha Konuru, W. David Hairston, Tim Oates, and Tinoosh Mohsenin. "An EEG artifact
identification embedded system using ICA and multi-instance learning." In Circuits and Systems (ISCAS), 2017 IEEE
International Symposium on, pp. 1-4. IEEE, 2017.

Limitations
• Independent component analysis needs a large number of
samples to converge. ICA has high computational complexity
and large memory requirements, making it unsuitable for
real-time applications.
• Each subsystem has its own hyperparameters, and tuning them
jointly is a challenging task.

Can we learn an end-to-end system to generate
clean EEG from noisy EEG using deep neural
networks?

Background
• Generative Adversarial network (GAN)
• CycleGAN: Unpaired Image-to-Image Translation using Cycle-
Consistent Adversarial Networks

Generative Adversarial Networks
• Generator network: try to fool the discriminator by generating real-looking images
• Discriminator network: try to distinguish between real and fake images
[Diagram: the generator G maps a latent vector z to a sample G(z); the discriminator D receives real samples x and generated samples G(z) and predicts real or fake.]
[Goodfellow et al. 2014]

CycleGAN: Unpaired Image-to-image translation
using cycle-consistent adversarial networks
[Zhu*, Park*, Isola, Efros on arxiv]
[Diagram: Cycle GAN with domains X and Y, mappings G and F between them, and discriminators Dx and Dy]

Cycle-consistency Loss
Backward cycle loss: $\| G(F(y)) - y \|_1$
Forward cycle loss: $\| F(G(x)) - x \|_1$
[Diagram: x → G(x) → F(G(x)) ≈ x and y → F(y) → G(F(y)) ≈ y]
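The two terms can be computed as below. This is a toy sketch: `G` and `F` are simple hand-written functions standing in for the generator networks, and `F_biased` is a hypothetical imperfect inverse used only to show that the loss grows when a round trip does not return to its starting point.

```python
def l1_distance(u, v):
    """Sum of absolute differences between two equal-length sequences."""
    return sum(abs(a - b) for a, b in zip(u, v))

def cycle_consistency_loss(x, y, G, F):
    """Forward term |F(G(x)) - x|_1 plus backward term |G(F(y)) - y|_1."""
    forward = l1_distance(F(G(x)), x)
    backward = l1_distance(G(F(y)), y)
    return forward + backward

def G(x):
    """Toy mapping from domain X to domain Y."""
    return [2.0 * v for v in x]

def F_exact(y):
    """Exact inverse of G, so both cycles are perfectly consistent."""
    return [0.5 * v for v in y]

def F_biased(y):
    """Imperfect inverse of G that adds a constant offset."""
    return [0.5 * v + 0.1 for v in y]

x = [1.0, -2.0, 3.0]
y = [4.0, 0.0, -6.0]
loss_exact = cycle_consistency_loss(x, y, G, F_exact)
loss_biased = cycle_consistency_loss(x, y, G, F_biased)
```

With the exact inverse the loss is zero; with the biased one it is positive, which is what pushes the two generators toward being mutual inverses during training.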

CycleGAN: Unpaired Image-to-image translation
using cycle-consistent adversarial networks
[Figure: unpaired translation between EEG data and EEG data without artifacts]

CycleGAN Model
[Diagram: domains A and B, generators G_B and G_A, discriminators Da and Db]
• Problems
– Network G_A does not have
access to predicted noise
signal to reconstruct A
– The network is penalized for
not adding noise that is the
same as predicted noise

CycleGAN Model
[Diagram: the cycle A → G_B → G_A → A’, with discriminators Da and Db]

AsymmetricGAN Model
We preserve the noise in the signal and use it for reconstruction of the original signal.
[Diagram: G_B and G_N are applied to the signal; their outputs B and N are added (+) to reconstruct the original signal; discriminators Da and Db]
G_N is a function that extracts noise if the input is a noisy signal A. It generates noise if the input is a clean signal B.
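One way to read the diagram is that a noisy signal A is reconstructed as G_B(A) + G_N(A), the cleaned signal plus the extracted noise. The sketch below illustrates only that additive reconstruction. The functions `g_b` and `g_n` are hand-written threshold rules standing in for the trained networks, the threshold value is arbitrary, and the toy `g_n` does not model the noise-generation behaviour on clean inputs.

```python
def g_n(signal, threshold=2.0):
    """Toy stand-in for G_N: treat large-amplitude samples as noise."""
    return [v if abs(v) > threshold else 0.0 for v in signal]

def g_b(signal, threshold=2.0):
    """Toy stand-in for G_B: keep only the small-amplitude part."""
    return [0.0 if abs(v) > threshold else v for v in signal]

def reconstruct(signal):
    """Additive reconstruction: cleaned signal plus extracted noise."""
    return [c + n for c, n in zip(g_b(signal), g_n(signal))]

def reconstruction_loss(signal):
    """L1 distance between a signal and its additive reconstruction."""
    return sum(abs(a - r) for a, r in zip(signal, reconstruct(signal)))

noisy = [0.2, -0.1, 0.3, 6.0, 0.1]   # sample 3 plays the role of an artifact
denoised = g_b(noisy)
loss = reconstruction_loss(noisy)
```

Because the extracted noise is kept, the original can be rebuilt without requiring the reverse generator to guess the noise, which is the problem noted for the CycleGAN setup.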

Generator and discriminator network architecture
• Network is independent of the
relative ordering of the
channels
• Channel indices do not
correspond to the spatial
locations of the electrodes.
• Filter Size:
• 1D convolution with filter
size k
• 2D convolution with filter
size N × k
Qi, Charles R., Hao Su, Kaichun Mo, and Leonidas J. Guibas.
"Pointnet: Deep learning on point sets for 3d classification and
segmentation." Proc. Computer Vision and Pattern
Recognition (CVPR), IEEE 1, no. 2 (2017): 4.

Evaluation
• Evaluation of artifact removal from EEG data is difficult because a
ground truth signal does not exist.
• We only have a set of clean EEG signals and a set of noisy EEG signals. We
do not have a clean version of the noisy EEG.
• Visualizing and understanding EEG data is a time-consuming
task.
• To validate the model, we create a simpler synthetic dataset.

Synthetic Dataset
• The clean signal is a linear combination of a sine wave and a square wave
• The noisy signal is a linear combination of sine, square and sawtooth
waves
• The periods of the sine and square waves are randomly selected
between 2 and 5. The sawtooth wave has a fixed period of 6
• Each time series has 1000 samples
• Training set: 4000 signals
• Validation set: 1000 signals
• Test set: 100 signals
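A sketch of how such signals could be generated. Several details are not given on the slide and are assumptions here: the mixing weights are drawn uniformly from [0.5, 1.5], the periods are real-valued, and the 1000 samples span a time axis of length 20. The function name `make_pair` is hypothetical.

```python
import math
import random

def sine(t, period):
    return math.sin(2.0 * math.pi * t / period)

def square(t, period):
    """Square wave: +1 for the first half of each period, -1 for the rest."""
    return 1.0 if (t % period) < period / 2.0 else -1.0

def sawtooth(t, period):
    """Sawtooth wave rising linearly from -1 to 1 over each period."""
    return 2.0 * ((t % period) / period) - 1.0

def make_pair(rng, n_samples=1000, duration=20.0):
    """Return (clean, noisy); noisy is clean plus a weighted sawtooth."""
    p_sine, p_square = rng.uniform(2.0, 5.0), rng.uniform(2.0, 5.0)
    w_sine, w_square, w_saw = (rng.uniform(0.5, 1.5) for _ in range(3))
    ts = [duration * i / n_samples for i in range(n_samples)]
    clean = [w_sine * sine(t, p_sine) + w_square * square(t, p_square)
             for t in ts]
    noisy = [c + w_saw * sawtooth(t, 6.0) for c, t in zip(clean, ts)]
    return clean, noisy

rng = random.Random(0)                 # fixed seed for repeatability
clean, noisy = make_pair(rng)
```

Unlike real EEG, this construction gives a known clean signal for every noisy one, so a denoiser's output can be scored directly against ground truth.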

Synthetic Dataset
[Figure panels: (a) Signal with Artifact (b) Clean Signal
(c) Signal cleaned by Asymmetric GAN
(d) Noise Signal predicted by Asymmetric GAN
(e) True Sources]
The MSE between the ground truth clean signal and the
denoised signal is 0.0406. Over the entire test set, the mean MSE
is 0.0387 and the standard deviation is 0.0043.

EEG Dataset Generated by the US Army Research
Laboratory
• During collection of this dataset, participants were told not to move and to look
straight at the computer screen for the collection of clean data.
• Despite these instructions, we noticed that there were artifacts even
in “clean” data. We also noticed that some “noisy” EEG did not contain the
corresponding artifact.
• Manually annotating all artifact types in all channels is a time-consuming
task, so in this work we focus on ocular artifacts in the Fp1 electrode
of the frontal region.
• We remove all patients that have more than two ocular artifacts in the clean data
and do not have artifacts in the region of eyebrow raising.
• In the resulting dataset, we have 4 patients with clean data and 10 patients with
noisy data.
• Each patient’s clean data has 4836 samples and noisy data has 420354 samples.
We use this manually annotated data in all experiments below.

Evaluation
• Qualitative Results
• Using a detector

Qualitative Results
[Figure: panels (a) and (b)]

Evaluation by detection
• We use artifact detection as a way of measuring the
performance of artifact removal.
• We first train an artifact detector and an artifact removal model, then use
the artifact detector to classify every window of the EEG data
denoised by the artifact removal algorithm.
• The error is the percentage of windows in which an artifact
is detected in the denoised EEG data.
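The error measure can be sketched as follows. The detector here is a hypothetical peak-to-peak amplitude threshold used only to make the example runnable; the talk uses a trained detector. The window size, the threshold and the function names are also assumptions.

```python
def windows(signal, size):
    """Split a signal into consecutive non-overlapping windows."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

def toy_detector(window, threshold=3.0):
    """Hypothetical detector: flag windows with large peak-to-peak amplitude."""
    return (max(window) - min(window)) > threshold

def detection_error(denoised, size, detector):
    """Percentage of windows in which the detector still finds an artifact."""
    wins = windows(denoised, size)
    if not wins:
        raise ValueError("signal is shorter than one window")
    flagged = sum(1 for w in wins if detector(w))
    return 100.0 * flagged / len(wins)

# Denoised signal with one residual spike in the third of four windows.
denoised = [0.1] * 40
denoised[25] = 5.0
error = detection_error(denoised, size=10, detector=toy_detector)
```

Here one of four windows is flagged, so the error is 25 percent. The number is only as trustworthy as the detector itself.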

Dataset Division
[Figure: bar chart showing how the clean and noisy data are divided into train and test sets for the artifact detector and for artifact removal]

Evaluation by detection
• Artifact detector accuracy: 97.39%
• The denoised signal was classified as clean by the artifact detector
72.37% of the time.

Summary
• We presented an online, fully automated, end-to-end system
for denoising EEG data. The system is trained
using unpaired training corpora. It does not need any
information about the source of the noise or how it is
manifested in the EEG signal. We created a synthetic dataset
and used it to validate our network. We also used our system
to remove artifacts from an existing EEG dataset.
Sunil Gandhi, Tim Oates, Tinoosh Mohsenin, Dave Hairston. "Denoising Time Series Data Using Asymmetric Generative
Adversarial Networks." In Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2018.

Future Work
• Evaluation of artifact removal algorithms
• Removing multiple artifacts using AsymmetricGAN
• Improving artifact removal using additional annotations
• Generalizing AsymmetricGAN for speech data

Evaluation of artifact removal algorithms
• Evaluating artifact removal algorithms is important before they
can be used in clinical contexts.
• But, it is challenging because of a lack of availability of ground
truth clean EEG signals.

The performance evaluation of artifact removal methods found in the literature is
always problematic. It can be done either by visually by expert which is subjective
or by synthetic/semi-synthetic data (but uncertainty of reconstructed data whether
perfectly realistic or not). Since there is neither any ground truth data available nor
any universal or standard quantitative metric(s) used in the literature that can
capture both amount of artifact removal and distortion. … Therefore, it is not fair
to tell which performs best based on the study.
Islam, Md Kafiul, Amir Rastegarnia, and Zhi Yang. "Methods for artifact detection and removal from scalp EEG: A
review." Neurophysiologie Clinique/Clinical Neurophysiology 46, no. 4-5 (2016): 287-305.

• Qualitative Evaluation
• Evaluation using simulated data
• Evaluation using detection
• Correlating with the reference signal

• Qualitative Evaluation
– Subjective
– Does not give a quantitative metric

• Evaluation using simulated data
– Allows usage of metrics like signal-to-noise ratio (SNR) and
normalized mean squared error to compare deviation between the
denoised signal and clean signal.
– Training and evaluation can be performed on large datasets
– But it is very difficult to replicate all the characteristics of EEG data,
such as synchronization among channels, time-locking to ERPs and
contamination by different types of artifacts, in a realistic manner.
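Both metrics need a known clean signal, which is why they apply to simulated data. The definitions below are common ones and are an assumption here, since the slide does not define them: NMSE is the squared error divided by the energy of the clean signal, and SNR in decibels is ten times the base-10 logarithm of the inverse of that ratio.

```python
import math

def nmse(clean, denoised):
    """Squared error normalized by the energy of the clean signal."""
    error = sum((c - d) ** 2 for c, d in zip(clean, denoised))
    energy = sum(c ** 2 for c in clean)
    if energy == 0:
        raise ValueError("clean signal has zero energy")
    return error / energy

def snr_db(clean, denoised):
    """Signal-to-noise ratio in decibels; infinite for a perfect match."""
    ratio = nmse(clean, denoised)
    return math.inf if ratio == 0 else -10.0 * math.log10(ratio)

clean_ref = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -1.1, 1.1, -0.9]      # every sample is off by 0.1
example_nmse = nmse(clean_ref, estimate)
example_snr = snr_db(clean_ref, estimate)
```

In this example the error energy is 1 percent of the signal energy, giving an NMSE of about 0.01 and an SNR of about 20 dB.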

• Evaluation using detection
– Uses an artifact detector to check if artifacts are present in the denoised signal.
– Can be used to evaluate denoising of real EEG data
– Automated, not subjective and gives a quantitative measure to compare several
artifact removal algorithms.
– Usefulness of this metric is dependent on the performance of the artifact
detector.
– Does not give a measure of the sensitivity (whether the method removes the
artifacts) and specificity (whether it preserves EEG signals) of the artifact
removal algorithm.

• Correlating with the reference signal
– Cannot be performed if a reference signal for the corresponding
artifact is not available.
– Does not give a measure of the sensitivity (whether the method
removes the artifacts) and specificity (whether it preserves EEG
signals) of the artifact removal algorithm.

• Evaluation Methods
– Qualitative Evaluation
– Evaluation using simulated data
– Evaluation using detection
– Correlating with the reference signal
• Multiple evaluation methods, with several variants of each, exist in the
literature.
• Each of these methods has its own strengths and limitations. This makes performance
comparison of these methods hard.
• In the future we plan to
– Standardize the evaluation mechanism
– Create a tool for evaluating artifact removal algorithms using all of the above approaches
– Compare our approach with existing methods using the tool

Removing multiple artifacts using AsymmetricGAN
• Having separate algorithms for removal of different types of
artifacts makes the preprocessing step inefficient.
• In future, we plan to adapt the AsymmetricGAN architecture for
removing multiple types of artifacts.
[Diagram: AsymmetricGAN (G_B, G_N, +, Da, Db) with signals labelled MA, OA, A and B]

Improving artifact removal using additional
annotations
• Modify AsymmetricGAN to utilize
– artifact type
– location of artifact
– reference channel.
• We will study differences in performance because of each type
of additional data.
• This will give insight on which annotations are most helpful for
artifact removal algorithms, thus helping creation of future
datasets.

Generalizing AsymmetricGAN for speech data
• Speech denoising produces noise-free speech signals from noisy recordings
• Applications
– Recognizing speech in environments with background noise, such as a moving car.
– Reducing discomfort and improving speech understanding for people wearing
hearing aids
• Easier to evaluate
• Evaluate generalizability of AsymmetricGAN across domains
• Datasets:
– Voice Bank corpus: Speech data for 400 sentences from 28 speakers each.
– Demand dataset: Real world background noise of 18 diverse environments

Questions?
