SlideShare a Scribd company logo
Machine	Learning	Specializa0on	
Latent Dirichlet
Allocation: Mixed
Membership Modeling
Emily Fox & Carlos Guestrin
Machine Learning Specialization
University of Washington
1 ©2016	Emily	Fox	&	Carlos	Guestrin
Machine	Learning	Specializa0on	
Mixed membership models
for documents
2 ©2016	Emily	Fox	&	Carlos	Guestrin
Machine	Learning	Specializa0on	3	
So far, clustered articles into groups
©2016	Emily	Fox	&	Carlos	Guestrin	
Doc labeled
with a topic
assignment
Clustering goal: discover groups of related docs
Machine	Learning	Specializa0on	4	
Are documents about just one thing?
©2016	Emily	Fox	&	Carlos	Guestrin	
Is this article
just about
science?
Machine	Learning	Specializa0on	5	
Soft assignments capture uncertainty
©2016	Emily	Fox	&	Carlos	Guestrin	
Soft assignment rik
tells us this doc
could be about world
news or science
But, clustering
model still specifies
each doc belongs to
1 topic
Machine	Learning	Specializa0on	6	 ©2016	Emily	Fox	&	Carlos	Guestrin	
0	
1	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Based on science
related words, maybe
doc in cluster 4
Encoding of cluster
membership zi = 4
Machine	Learning	Specializa0on	7	
0	
1	
©2016	Emily	Fox	&	Carlos	Guestrin	
Or maybe cluster 2
(technology) is a better fit
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Encoding of cluster
membership zi = 2
Soft assignments
capture uncertainty
in zi = 2 or 4
Machine	Learning	Specializa0on	8	 ©2016	Emily	Fox	&	Carlos	Guestrin	
0	
0.2	
0.4	
0.6	
Really, it’s about science
and technology
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
“zi” is both 2 and 4
Machine	Learning	Specializa0on	9	
Mixed membership
models
Want to discover a
set of memberships
(In contrast, cluster models
aim at discovering a single
membership)
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Machine	Learning	Specializa0on	
Building up to document
mixed membership models
©2016	Emily	Fox	&	Carlos	Guestrin	10
Machine	Learning	Specializa0on	11	
An alternative document clustering model
©2016	Emily	Fox	&	Carlos	Guestrin	
(Back to
clustering,
not mixed
membership
modeling)
Machine	Learning	Specializa0on	12	
So far, we have considered…
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
tf-idf vector
xi =
Machine	Learning	Specializa0on	13	
Bag-of-words representation
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
epilepsy
modeling
clinical
complex
Bayesian
Machine	Learning	Specializa0on	14	
Bag-of-words representation
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
multiset
= unordered set of words with
duplication of unique elements
mattering
xi = {modeling, complex, epilepsy,
modeling, Bayesian, clinical,
epilepsy, EEG, data, dynamic…}
Machine	Learning	Specializa0on	15	
A model for bag-of-words
representation
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
As before, the “prior”
probability that doc i is
from topic k is:
p(zi = k) = ⇡k
π = [π1 π2 … πK]
represents corpus-wide
topic prevalence
Machine	Learning	Specializa0on	16	
A model for bag-of-words
representation
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Assuming doc i is from
topic k, words occur
with probabilities:
SCIENCE
patients 0.05
clinical 0.01
epilepsy 0.002
seizures 0.0015
EEG 0.001
… …
wordsinvocab
Machine	Learning	Specializa0on	17	
Topic-specific word probabilities
Distribution on words in vocab for each topic
©2016	Emily	Fox	&	Carlos	Guestrin	
SCIENCE
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TECH
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
SPORTS
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
(table now organized by decreasing probabilities
showing top words in each category)
Machine	Learning	Specializa0on	18	
Comparing and contrasting
Previously
©2016	Emily	Fox	&	Carlos	Guestrin	
SCIENCE
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TECH
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
SPORTS
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
Now
Prior topic
probabilities
p(zi = k) = ⇡k p(zi = k) = ⇡k
Likelihood
under
each topic
…
tf-idf vector
compute likelihood of tf-idf
vector under each Gaussian
{modeling, complex, epilepsy,
modeling, Bayesian, clinical,
epilepsy, EEG, data, dynamic…}
compute likelihood of the
collection of words in doc
under each topic distribution
Machine	Learning	Specializa0on	
Latent Dirichlet allocation (LDA)
©2016	Emily	Fox	&	Carlos	Guestrin	19
Machine	Learning	Specializa0on	20	
LDA is a mixed
membership model
Want to discover a
set of topics
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
0	
0.2	
0.4	
0.6
Machine	Learning	Specializa0on	21	 ©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
arXiv:1402.6951v2[stat.ML]14Jul2014
SCIENCE
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TECH
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
SPORTS
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
Topic vocab
distributions:
…
Clustering:
One topic indicator
zi per document i
All words come from
(get scored under)
same topic zi
Distribution on
prevalence of
topics in corpus
π = [π1 π2 … πK]
Machine	Learning	Specializa0on	22	 ©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
arXiv:1402.6951v2[stat.ML]14Jul2014
SCIENCE
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TECH
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
SPORTS
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
Same topic
distributions:
…
In LDA:
One topic indicator
ziw per word in doc i
Each word gets
scored under its
topic ziw
Distribution on
prevalence of
topics in document
πi = [πi1 πi2 … πiK]
Machine	Learning	Specializa0on	23	 ©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
arXiv:1402.6951v2[stat.ML]14Jul2014
SCIENCE
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TECH
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
SPORTS
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
Topic vocab
distributions:
…
Document topic
proportions:
πi = [πi1 πi2 … πiK]
0	
0.2	
0.4	
0.6
Machine	Learning	Specializa0on	
Inference in LDA models
©2016	Emily	Fox	&	Carlos	Guestrin	24
Machine	Learning	Specializa0on	25	 ©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
arXiv:1402.6951v2[stat.ML]14Jul2014
SCIENCE
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TECH
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
SPORTS
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
Topic vocab
distributions:
…
Document topic
proportions:
πi = [πi1 πi2 … πiK]
0	
0.2	
0.4	
0.6
Machine	Learning	Specializa0on	26	 ©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
arXiv:1402.6951v2[stat.ML]14Jul2014
TOPIC 1
Word 1 ?
Word 2 ?
Word 3 ?
Word 4 ?
Word 5 ?
… …
TOPIC 2
Word 1 ?
Word 2 ?
Word 3 ?
Word 4 ?
Word 5 ?
… …
TOPIC 3
Word 1 ?
Word 2 ?
Word 3 ?
Word 4 ?
Word 5 ?
… …
Topic vocab
distributions:
…
Document topic
proportions:
πi = [πi1 πi2 … πiK]
0	
0.2	
0.4	
0.6	
? ? ? ?
Machine	Learning	Specializa0on	27	 ©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
arXiv:1402.6951v2[stat.ML]14Jul2014
TOPIC 1
Word 1 ?
Word 2 ?
Word 3 ?
Word 4 ?
Word 5 ?
… …
TOPIC 2
Word 1 ?
Word 2 ?
Word 3 ?
Word 4 ?
Word 5 ?
… …
TOPIC 3
Word 1 ?
Word 2 ?
Word 3 ?
Word 4 ?
Word 5 ?
… …
Topic vocab
distributions:
…
Document topic
proportions:
πi = [πi1 πi2 … πiK]
0	
0.2	
0.4	
0.6	
? ? ? ?
LDA inputs:
- Set of words per doc for each
doc in corpus
LDA outputs:
- Corpus-wide topic vocab
distributions
- Topic assignments per word
- Topic proportions per doc
Machine	Learning	Specializa0on	28	
Interpreting LDA outputs
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
TOPIC 1
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TOPIC 2
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
0	
0.2	
0.4	
0.6
Machine	Learning	Specializa0on	29	
Interpreting LDA outputs
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
TOPIC 1
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TOPIC 2
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
0	
0.2	
0.4	
0.6	
Examine coherence of
learned topics
-  What are top words per topic?
-  Do they form meaningful
groups?
-  Use to post-facto label topics
(e.g., science, tech, sports,…)
Machine	Learning	Specializa0on	30	
Interpreting LDA outputs
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
TOPIC 1
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TOPIC 2
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
0	
0.2	
0.4	
0.6	
Doc-specific topic
proportions can be used to:
-  Relate documents
-  Study user topic preferences
-  Assign docs to multiple
categories
Machine	Learning	Specializa0on	31	
Interpreting LDA outputs
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
TOPIC 1
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TOPIC 2
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
0	
0.2	
0.4	
0.6	
Typically not interested
in word assignments
Machine	Learning	Specializa0on	
An inference algorithm for LDA:
Gibbs sampling
32 ©2016	Emily	Fox	&	Carlos	Guestrin
Machine	Learning	Specializa0on	33	
Clustering algorithms so far
©2016	Emily	Fox	&	Carlos	Guestrin	
Assign observations to
closest cluster center
Revise cluster centers
zi arg min
j
||µj xi||2
2
µj arg min
µ
X
i:zi=j
||µ xi||2
2
k-means
E-step: estimate cluster
responsibilities
M-step: maximize likelihood
over parameters
EM for MoG
Iterative hard assignment
to max objective
Iterative soft assignment
to max objective
Machine	Learning	Specializa0on	34	
What can we do for our bag-of-words models?
Part 1: Clustering model
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
SCIENCE
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TECH
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
SPORTS
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
One topic indicator
zi per document i
All words come from
(get scored under)
same topic zi
Distribution on
prevalence of
topics in corpus
π = [π1 π2 … πK]
Machine	Learning	Specializa0on	35	
What can we do for our bag-of-words models?
Part 1: Clustering model
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
SCIENCE
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TECH
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
SPORTS
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
Can derive
EM algorithm:
-  Gaussian likelihood of
tf-idf vector
-  multinomial likelihood
of word counts
(mw successes of word w)
-  Result: mixture of
multinomial model
Machine	Learning	Specializa0on	36	
What can we do for our bag-of-words models?
Part 2: LDA model
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
TOPIC 1
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TOPIC 2
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
0	
0.2	
0.4	
0.6	
Can derive
EM algorithm,
but not common
(performs poorly)
Machine	Learning	Specializa0on	37	
Typical LDA implementations
Normally LDA is specified as a Bayesian model
(otherwise, “probabilistic latent semantic analysis/indexing”)
- Account for uncertainty in parameters
when making predictions
- Naturally regularizes parameter estimates
in contrast to MLE
EM-like algorithms (e.g., “variational EM”), or…
©2016	Emily	Fox	&	Carlos	Guestrin
Machine	Learning	Specializa0on	
Gibbs sampling for Bayesian inference
©2016	Emily	Fox	&	Carlos	Guestrin
Machine	Learning	Specializa0on	39	
Gibbs sampling
Benefits:
•  Typically intuitive updates
•  Very straightforward to implement
©2016	Emily	Fox	&	Carlos	Guestrin	
Iterative random hard assignment!
Machine	Learning	Specializa0on	40	
Random sample #10000
©2016	Emily	Fox	&	Carlos	Guestrin	
TOPIC 1
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TOPIC 2
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
0	
0.2	
0.4	
0.6	
Current set of
assignments
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
Machine	Learning	Specializa0on	41	
Random sample #10001
©2016	Emily	Fox	&	Carlos	Guestrin	
TOPIC 1
experiment 0.12
test 0.06
hypothesize 0.042
discover 0.04
climate 0.011
… …
TOPIC 2
develop 0.16
computer 0.11
user 0.03
processor 0.029
internet 0.023
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
offense 0.02
defense 0.018
… …
…
0	
0.2	
0.4	
0.6	
Current set of
assignments
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
Machine	Learning	Specializa0on	42	
Random sample #10002
©2016	Emily	Fox	&	Carlos	Guestrin	
TOPIC 1
experiment 0.10
discover 0.055
hypothesize 0.043
test 0.042
examine 0.015
… …
TOPIC 2
computer 0.12
develop 0.115
user 0.031
device 0.022
cloud 0.018
… …
TOPIC 3
player 0.17
score 0.09
game 0.062
team 0.043
win 0.03
… …
…
0	
0.2	
0.4	
0.6	
Current set of
assignments
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
Machine	Learning	Specializa0on	43	
What do we know about this process?
©2016	Emily	Fox	&	Carlos	Guestrin	
iterations
Jointmodel
probability
probability of observations given variables/parameters
and probability of variables/parameters themselves
Not an optimization algorithm
Eventually
provides
“correct”
Bayesian
estimates…
Machine	Learning	Specializa0on	44	
What to do with sampling output?
Predictions:
1.  Make prediction for each snapshot of randomly
assigned variables/parameters (full iteration)
2.  Average predictions for final result
Parameter or assignment estimate:
-  Look at snapshot of randomly assigned
variables/parameters that maximizes
“joint model probability”
©2016	Emily	Fox	&	Carlos	Guestrin	
iterations
Jointmodel
probability
Machine	Learning	Specializa0on	
Standard Gibbs sampling steps
©2016	Emily	Fox	&	Carlos	Guestrin
Machine	Learning	Specializa0on	46	
Gibbs sampling algorithm outline
Assignment variables and model parameters
treated similarly
Iteratively draw variable/parameter from
conditional distribution having fixed:
-  all other variables/parameters
•  values randomly selected in previous rounds
•  changes from iter to iter
-  observations
•  always the same values
©2016	Emily	Fox	&	Carlos	Guestrin	
Iterative random hard assignment!
Machine	Learning	Specializa0on	47	
Gibbs sampling for LDA
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
TOPIC 1
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TOPIC 2
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
0	
0.2	
0.4	
0.6	
Current set of
assignments
Machine	Learning	Specializa0on	48	
Gibbs sampling for LDA
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
TOPIC 1
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TOPIC 2
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
0	
0.2	
0.4	
0.6	
Step 1: Randomly
reassign all ziw based on
-  doc topic proportions
-  topic vocab distributions
Draw randomly from
responsibility vector
[riw1 riw2 … riwK]
Machine	Learning	Specializa0on	49	
Gibbs sampling for LDA
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
TOPIC 1
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TOPIC 2
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
Step 2: Randomly
reassign doc topic
proportions based on
assignments ziw in
current doc
0	
0.2	
0.4	
0.6	
? ? ? ?
Machine	Learning	Specializa0on	50	
Gibbs sampling for LDA
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
TOPIC 1
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TOPIC 2
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
Step 3: Repeat for all docs
0	
0.2	
0.4	
0.6	
? ? ? ?
Machine	Learning	Specializa0on	51	
Gibbs sampling for LDA
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
…
Step 4: Randomly
reassign topic vocab
distributions based on
assignments ziw in
entire corpus
0	
0.2	
0.4	
0.6	
TOPIC 1
Word 1 ?
Word 2 ?
Word 3 ?
Word 4 ?
Word 5 ?
… …
TOPIC 2
Word 1 ?
Word 2 ?
Word 3 ?
Word 4 ?
Word 5 ?
… …
TOPIC 3
Word 1 ?
Word 2 ?
Word 3 ?
Word 4 ?
Word 5 ?
… …
Machine	Learning	Specializa0on	52	
Gibbs sampling for LDA
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
TOPIC 1
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TOPIC 2
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
0	
0.2	
0.4	
0.6	
Repeat Steps 1-4 until
max iter reached
Machine	Learning	Specializa0on	
Collapsed Gibbs sampling in LDA
©2016	Emily	Fox	&	Carlos	Guestrin
Machine	Learning	Specializa0on	54	
“Collapsed” Gibbs sampling for LDA
Based on special structure of LDA model, can
sample just indicator variables ziw
- No need to sample other parameters
•  corpus-wide topic vocab distributions
•  per-doc topic proportions
Often leads to much better performance
because examining uncertainty in smaller space
©2016	Emily	Fox	&	Carlos	Guestrin
Machine	Learning	Specializa0on	55	
Collapsed Gibbs sampling for LDA
©2016	Emily	Fox	&	Carlos	Guestrin	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
TOPIC 1
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TOPIC 2
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
0	
0.2	
0.4	
0.6	
Randomly reassign ziw
based on current
assignments zjv of all
other words in document
and corpus
Never draw topic vocab
distributions or doc topic
proportions
Machine	Learning	Specializa0on	56	
Select a document
©2016	Emily	Fox	&	Carlos	Guestrin	
epilepsy dynamic Bayesian EEG model
Machine	Learning	Specializa0on	
Randomly assign topics
©2016	Emily	Fox	&	Carlos	Guestrin	
3 2 1 3 1
epilepsy dynamic Bayesian EEG model
57
Machine	Learning	Specializa0on	
Randomly assign topics
58 ©2016	Emily	Fox	&	Carlos	Guestrin	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
2 3 2 1 1
test temple ship trade market
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
2 3 2 1 1
model temple ship trade market
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
3	 2	 1	 3	 1	
Etruscan	 trade	 price	 temple	 market	
2 3 2 1 1
cross test validate likelihood data
3 2 1 3 1
epilepsy dynamic Bayesian EEG model
Repeat for
each doc in
the corpus
Machine	Learning	Specializa0on	
Maintain local statistics
59 ©2016	Emily	Fox	&	Carlos	Guestrin	
3 2 1 3 1
epilepsy dynamic Bayesian EEG model
Topic 1 Topic 2 Topic 3
Doc i
Machine	Learning	Specializa0on	
Maintain global statistics
60 ©2016	Emily	Fox	&	Carlos	Guestrin	
Topic 1 Topic 2 Topic 3
epilepsy 1 0 35
Bayesian 50 0 1
model 42 1 0
EEG 0 0 20
dynamic 10 8 1
...
Total
counts
from all
docs
3 2 1 3 1
epilepsy dynamic Bayesian EEG model
Topic 1 Topic 2 Topic 3
Doc i 2 1 2
Machine	Learning	Specializa0on	
Randomly reassign topics
61 ©2016	Emily	Fox	&	Carlos	Guestrin	
3 2 1 3 1
epilepsy dynamic Bayesian EEG model
Topic 1 Topic 2 Topic 3
epilepsy 1 0 35
Bayesian 50 0 1
model 42 1 0
EEG 0 0 20
dynamic 10 8 1
...
Topic 1 Topic 2 Topic 3
Doc i 2 1 2
Machine	Learning	Specializa0on	
Probability of new assignment
62 ©2016	Emily	Fox	&	Carlos	Guestrin	
3 ? 1 3 1
epilepsy dynamic Bayesian EEG model
Machine	Learning	Specializa0on	
Probability of new assignment
63 ©2016	Emily	Fox	&	Carlos	Guestrin	
3 ? 1 3 1
epilepsy dynamic Bayesian EEG model
Topic 1 Topic 2 Topic 3
Topic 1 Topic 2 Topic 3
Doc i 2 0 2
How much doc “likes”
each topic based on other
assignments in doc
# current assignments
to topic k in doc i
# words in doc i
smoothing param
Machine	Learning	Specializa0on	
Probability of new assignment
64 ©2016	Emily	Fox	&	Carlos	Guestrin	
3 ? 1 3 1
epilepsy dynamic Bayesian EEG model
Topic 1 Topic 2 Topic 3
dynamic 10 7 1
How much each topic
likes the word “dynamic”
based on assignments in
other docs in corpus
smoothing param
# assignments
corpus-wide of
word “dynamic”
to topic k
Topic 1 Topic 2 Topic 3
Machine	Learning	Specializa0on	
Probability of new assignment
65 ©2016	Emily	Fox	&	Carlos	Guestrin	
3 ? 1 3 1
epilepsy dynamic Bayesian EEG model
Topic 1 Topic 2 Topic 3
dynamic 10 7 1
Topic 2 also really likes “dynamic”,
but in a different context…
e.g., a topic on fluid dynamics
Topic 1 Topic 2 Topic 3
Machine	Learning	Specializa0on	
Probability of new assignment
66 ©2016	Emily	Fox	&	Carlos	Guestrin	
3 ? 1 3 1
epilepsy dynamic Bayesian EEG model
Topic fits word
and document
Topic fits word,
but not doc
Topic fits doc,
but not word
Topic 1 Topic 2 Topic 3
Machine	Learning	Specializa0on	
Probability of new assignment
67 ©2016	Emily	Fox	&	Carlos	Guestrin	
3 ? 1 3 1
epilepsy dynamic Bayesian EEG model
Topic 1 Topic 2 Topic 3
How much
topic likes word	
How much
doc likes topic
Machine	Learning	Specializa0on	
Randomly draw a new topic indicator
68 ©2016	Emily	Fox	&	Carlos	Guestrin	
3 ? 1 3 1
epilepsy dynamic Bayesian EEG model
Topic 1 Topic 2 Topic 3
How much
topic likes word	
How much
doc likes topic
To draw new topic assignment (equivalently):
-  roll K-sided die with these probabilities
-  throw dart at these regions Normalize this product of
terms over K possible topics!
Machine	Learning	Specializa0on	
Update counts
69 ©2016	Emily	Fox	&	Carlos	Guestrin	
3 1 1 3 1
epilepsy dynamic Bayesian EEG model
Topic 1 Topic 2 Topic 3
epilepsy 1 0 35
Bayesian 50 0 1
model 42 1 0
EEG 0 0 20
dynamic 10 7 1
...
Topic 1 Topic 2 Topic 3
Doc i 2 0 2
Machine	Learning	Specializa0on	
Geometrically…
70 ©2016	Emily	Fox	&	Carlos	Guestrin	
3 1 1 3 1
epilepsy dynamic Bayesian EEG model
Topic 1 Topic 2 Topic 3
Increase popularity of
“dynamic” in topic 1
(corpus-wide)
Increase popularity of
topic 1 in doc i
Machine	Learning	Specializa0on	
Iterate through all words/docs
71 ©2016	Emily	Fox	&	Carlos	Guestrin	
3 1 1 3 1
epilepsy dynamic Bayesian EEG model
Topic 1 Topic 2 Topic 3
epilepsy 1 0 35
Bayesian 50 0 1
model 42 1 0
EEG 0 0 20
dynamic 10 7 1
...
Topic 1 Topic 2 Topic 3
Doc i 2 0 2
Machine	Learning	Specializa0on	
Using samples from collapsed Gibbs
©2016	Emily	Fox	&	Carlos	Guestrin
Machine	Learning	Specializa0on	73	
What to do with the collapsed samples?
©2016	Emily	Fox	&	Carlos	Guestrin	
TOPIC 1
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TOPIC 2
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
0	
0.2	
0.4	
0.6	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
From “best” sample of {ziw},
can infer:
1.  Topics from conditional
distribution…
need corpus-wide info
Machine	Learning	Specializa0on	74	
What to do with the collapsed samples?
©2016	Emily	Fox	&	Carlos	Guestrin	
TOPIC 1
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TOPIC 2
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
0	
0.2	
0.4	
0.6	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
From “best” sample of {ziw},
can infer:
1.  Topics from conditional
distribution…
need corpus-wide info
Machine	Learning	Specializa0on	75	
What to do with the collapsed samples?
©2016	Emily	Fox	&	Carlos	Guestrin	
TOPIC 1
experiment 0.1
test 0.08
discover 0.05
hypothesize 0.03
climate 0.01
… …
TOPIC 2
develop 0.18
computer 0.09
processor 0.032
user 0.027
internet 0.02
… …
TOPIC 3
player 0.15
score 0.07
team 0.06
goal 0.03
injury 0.01
… …
…
0	
0.2	
0.4	
0.6	
Modeling the Complex Dynamics and Changing
Correlations of Epileptic Events
Drausin F. Wulsina
, Emily B. Foxc
, Brian Litta,b
a
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
b
Department of Neurology, University of Pennsylvania, Philadelphia, PA
c
Department of Statistics, University of Washington, Seattle, WA
Abstract
Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in
addition to full-blown clinical seizures. We believe the relationship between
these two classes of events—something not previously studied quantitatively—
could yield important insights into the nature and intrinsic dynamics of
seizures. A goal of our work is to parse these complex epileptic events
into distinct dynamic regimes. A challenge posed by the intracranial EEG
(iEEG) data we study is the fact that the number and placement of electrodes
can vary between patients. We develop a Bayesian nonparametric Markov
switching process that allows for (i) shared dynamic regimes between a vari-
able number of channels, (ii) asynchronous regime-switching, and (iii) an
unknown dictionary of dynamic regimes. We encode a sparse and changing
set of dependencies between the channels using a Markov-switching Gaussian
graphical model for the innovations process driving the channel dynamics and
demonstrate the importance of this model in parsing and out-of-sample pre-
dictions of iEEG data. We show that our model produces intuitive state
assignments that can help automate clinical analysis of seizures and enable
the comparison of sub-clinical bursts and full clinical seizures.
Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model,
graphical model, time series
1. Introduction
Despite over three decades of research, we still have very little idea of
what defines a seizure. This ignorance stems both from the complexity of
epilepsy as a disease and a paucity of quantitative tools that are flexible
Preprint submitted to Artificial Intelligence Journal July 29, 2014
arXiv:1402.6951v2[stat.ML]14Jul2014
From “best” sample of {ziw},
can infer:
1.  Topics from conditional
distribution…
need corpus-wide info
2.  Document “embedding”…
need doc info only
C4.5
C4.5
C4.5
C4.5

More Related Content

Viewers also liked

Blei ngjordan2003
Blei ngjordan2003Blei ngjordan2003
Blei ngjordan2003
Ajay Ohri
 
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation System
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation SystemLatent Dirichlet Allocation as a Twitter Hashtag Recommendation System
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation System
Shailly Saxena
 
LDA Beginner's Tutorial
LDA Beginner's TutorialLDA Beginner's Tutorial
LDA Beginner's Tutorial
Wayne Lee
 
Atu media eval_sed2014
Atu media eval_sed2014Atu media eval_sed2014
Atu media eval_sed2014
multimediaeval
 
Lime
LimeLime
C4.4
C4.4C4.4
C3.1.logistic intro
C3.1.logistic introC3.1.logistic intro
C3.1.logistic intro
Daniel LIAO
 
Search Engines
Search EnginesSearch Engines
Search Engines
butest
 
C3.3.1
C3.3.1C3.3.1
C3.3.1
Daniel LIAO
 
Amharic document clustering
Amharic document clusteringAmharic document clustering
Amharic document clustering
Guy De Pauw
 
WSDM2014
WSDM2014WSDM2014
WSDM2014
Jun Yu
 
Interactive Latent Dirichlet Allocation
Interactive Latent Dirichlet AllocationInteractive Latent Dirichlet Allocation
Interactive Latent Dirichlet Allocation
Quentin Pleplé
 
Goal-based Recommendation utilizing Latent Dirichlet Allocation
Goal-based Recommendation utilizing Latent Dirichlet AllocationGoal-based Recommendation utilizing Latent Dirichlet Allocation
Goal-based Recommendation utilizing Latent Dirichlet Allocation
Sebastien Louvigne
 
A Note on Expectation-Propagation for Latent Dirichlet Allocation
A Note on Expectation-Propagation for Latent Dirichlet AllocationA Note on Expectation-Propagation for Latent Dirichlet Allocation
A Note on Expectation-Propagation for Latent Dirichlet Allocation
Tomonari Masada
 
What's new in scikit-learn 0.17
What's new in scikit-learn 0.17What's new in scikit-learn 0.17
What's new in scikit-learn 0.17
Andreas Mueller
 
Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...
Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...
Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...
Tomonari Masada
 
Latent dirichletallocation presentation
Latent dirichletallocation presentationLatent dirichletallocation presentation
Latent dirichletallocation presentation
Soojung Hong
 
Visualzing Topic Models
Visualzing Topic ModelsVisualzing Topic Models
Visualzing Topic Models
Turi, Inc.
 
Topic Modeling
Topic ModelingTopic Modeling
Topic Modeling
Kyunghoon Kim
 
A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
A Simple Stochastic Gradient Variational Bayes for the Correlated Topic ModelA Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
Tomonari Masada
 

Viewers also liked (20)

Blei ngjordan2003
Blei ngjordan2003Blei ngjordan2003
Blei ngjordan2003
 
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation System
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation SystemLatent Dirichlet Allocation as a Twitter Hashtag Recommendation System
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation System
 
LDA Beginner's Tutorial
LDA Beginner's TutorialLDA Beginner's Tutorial
LDA Beginner's Tutorial
 
Atu media eval_sed2014
Atu media eval_sed2014Atu media eval_sed2014
Atu media eval_sed2014
 
Lime
LimeLime
Lime
 
C4.4
C4.4C4.4
C4.4
 
C3.1.logistic intro
C3.1.logistic introC3.1.logistic intro
C3.1.logistic intro
 
Search Engines
Search EnginesSearch Engines
Search Engines
 
C3.3.1
C3.3.1C3.3.1
C3.3.1
 
Amharic document clustering
Amharic document clusteringAmharic document clustering
Amharic document clustering
 
WSDM2014
WSDM2014WSDM2014
WSDM2014
 
Interactive Latent Dirichlet Allocation
Interactive Latent Dirichlet AllocationInteractive Latent Dirichlet Allocation
Interactive Latent Dirichlet Allocation
 
Goal-based Recommendation utilizing Latent Dirichlet Allocation
Goal-based Recommendation utilizing Latent Dirichlet AllocationGoal-based Recommendation utilizing Latent Dirichlet Allocation
Goal-based Recommendation utilizing Latent Dirichlet Allocation
 
A Note on Expectation-Propagation for Latent Dirichlet Allocation
A Note on Expectation-Propagation for Latent Dirichlet AllocationA Note on Expectation-Propagation for Latent Dirichlet Allocation
A Note on Expectation-Propagation for Latent Dirichlet Allocation
 
What's new in scikit-learn 0.17
What's new in scikit-learn 0.17What's new in scikit-learn 0.17
What's new in scikit-learn 0.17
 
Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...
Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...
Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...
 
Latent dirichletallocation presentation
Latent dirichletallocation presentationLatent dirichletallocation presentation
Latent dirichletallocation presentation
 
Visualzing Topic Models
Visualzing Topic ModelsVisualzing Topic Models
Visualzing Topic Models
 
Topic Modeling
Topic ModelingTopic Modeling
Topic Modeling
 
A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
A Simple Stochastic Gradient Variational Bayes for the Correlated Topic ModelA Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
 

Similar to C4.5

8421ijbes01
8421ijbes018421ijbes01
8421ijbes01
ijbesjournal
 
A Novel Approach For Detection of Neurological Disorders through Electrical P...
A Novel Approach For Detection of Neurological Disorders through Electrical P...A Novel Approach For Detection of Neurological Disorders through Electrical P...
A Novel Approach For Detection of Neurological Disorders through Electrical P...
IJECEIAES
 
IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016 IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016
tsysglobalsolutions
 
Introduction to systems biology
Introduction to systems biologyIntroduction to systems biology
Introduction to systems biology
lemberger
 
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
Servio Fernando Lima Reina
 
Lagging_Inference_Networks_and_Posterior_Collapse_.pdf
Lagging_Inference_Networks_and_Posterior_Collapse_.pdfLagging_Inference_Networks_and_Posterior_Collapse_.pdf
Lagging_Inference_Networks_and_Posterior_Collapse_.pdf
AnkitBiswas31
 
woot2
woot2woot2
Application of stochastic modelling in bioinformatics
Application of stochastic modelling in bioinformaticsApplication of stochastic modelling in bioinformatics
Application of stochastic modelling in bioinformatics
Spyros Ktenas
 
GLUER integrative analysis of single-cell omics and imaging data by deep neur...
GLUER integrative analysis of single-cell omics and imaging data by deep neur...GLUER integrative analysis of single-cell omics and imaging data by deep neur...
GLUER integrative analysis of single-cell omics and imaging data by deep neur...
mallannasuman
 
Pattern recognition system based on support vector machines
Pattern recognition system based on support vector machinesPattern recognition system based on support vector machines
Pattern recognition system based on support vector machines
Alexander Decker
 
Summary Of Thesis
Summary Of ThesisSummary Of Thesis
Summary Of Thesis
guestb452d6
 
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
Arinze Akutekwe
 
Context-Aware and Time-Aware Attention-Based Model for Disease Risk Predictio...
Context-Aware and Time-Aware Attention-Based Model for Disease Risk Predictio...Context-Aware and Time-Aware Attention-Based Model for Disease Risk Predictio...
Context-Aware and Time-Aware Attention-Based Model for Disease Risk Predictio...
OKOKPROJECTS
 
Effective electroencephalogram based epileptic seizure detection using suppo...
Effective electroencephalogram based epileptic seizure detection  using suppo...Effective electroencephalogram based epileptic seizure detection  using suppo...
Effective electroencephalogram based epileptic seizure detection using suppo...
IJECEIAES
 
Beyond Broken Stick Modeling: R Tutorial for interpretable multivariate analysis
Beyond Broken Stick Modeling: R Tutorial for interpretable multivariate analysisBeyond Broken Stick Modeling: R Tutorial for interpretable multivariate analysis
Beyond Broken Stick Modeling: R Tutorial for interpretable multivariate analysis
PetteriTeikariPhD
 
An information-theoretic, all-scales approach to comparing networks
An information-theoretic, all-scales approach to comparing networksAn information-theoretic, all-scales approach to comparing networks
An information-theoretic, all-scales approach to comparing networks
Jim Bagrow
 
Convolutional Networks
Convolutional NetworksConvolutional Networks
Convolutional Networks
Nicole Savoie
 
STRING - Modeling of pathways through cross-species integration of large-scal...
STRING - Modeling of pathways through cross-species integration of large-scal...STRING - Modeling of pathways through cross-species integration of large-scal...
STRING - Modeling of pathways through cross-species integration of large-scal...
Lars Juhl Jensen
 
Deep Learning for Leukemia Detection: A MobileNetV2-Based Approach for Accura...
Deep Learning for Leukemia Detection: A MobileNetV2-Based Approach for Accura...Deep Learning for Leukemia Detection: A MobileNetV2-Based Approach for Accura...
Deep Learning for Leukemia Detection: A MobileNetV2-Based Approach for Accura...
IRJET Journal
 
Fuzzy Logic Based Parameter Adaptation of Interior Search Algorithm
Fuzzy Logic Based Parameter Adaptation of Interior Search AlgorithmFuzzy Logic Based Parameter Adaptation of Interior Search Algorithm
Fuzzy Logic Based Parameter Adaptation of Interior Search Algorithm
ijtsrd
 

Similar to C4.5 (20)

8421ijbes01
8421ijbes018421ijbes01
8421ijbes01
 
A Novel Approach For Detection of Neurological Disorders through Electrical P...
A Novel Approach For Detection of Neurological Disorders through Electrical P...A Novel Approach For Detection of Neurological Disorders through Electrical P...
A Novel Approach For Detection of Neurological Disorders through Electrical P...
 
IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016 IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016
 
Introduction to systems biology
Introduction to systems biologyIntroduction to systems biology
Introduction to systems biology
 
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
 
Lagging_Inference_Networks_and_Posterior_Collapse_.pdf
Lagging_Inference_Networks_and_Posterior_Collapse_.pdfLagging_Inference_Networks_and_Posterior_Collapse_.pdf
Lagging_Inference_Networks_and_Posterior_Collapse_.pdf
 
woot2
woot2woot2
woot2
 
Application of stochastic modelling in bioinformatics
Application of stochastic modelling in bioinformaticsApplication of stochastic modelling in bioinformatics
Application of stochastic modelling in bioinformatics
 
GLUER integrative analysis of single-cell omics and imaging data by deep neur...
GLUER integrative analysis of single-cell omics and imaging data by deep neur...GLUER integrative analysis of single-cell omics and imaging data by deep neur...
GLUER integrative analysis of single-cell omics and imaging data by deep neur...
 
Pattern recognition system based on support vector machines
Pattern recognition system based on support vector machinesPattern recognition system based on support vector machines
Pattern recognition system based on support vector machines
 
Summary Of Thesis
Summary Of ThesisSummary Of Thesis
Summary Of Thesis
 
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
 
Context-Aware and Time-Aware Attention-Based Model for Disease Risk Predictio...
Context-Aware and Time-Aware Attention-Based Model for Disease Risk Predictio...Context-Aware and Time-Aware Attention-Based Model for Disease Risk Predictio...
Context-Aware and Time-Aware Attention-Based Model for Disease Risk Predictio...
 
Effective electroencephalogram based epileptic seizure detection using suppo...
Effective electroencephalogram based epileptic seizure detection  using suppo...Effective electroencephalogram based epileptic seizure detection  using suppo...
Effective electroencephalogram based epileptic seizure detection using suppo...
 
Beyond Broken Stick Modeling: R Tutorial for interpretable multivariate analysis
Beyond Broken Stick Modeling: R Tutorial for interpretable multivariate analysisBeyond Broken Stick Modeling: R Tutorial for interpretable multivariate analysis
Beyond Broken Stick Modeling: R Tutorial for interpretable multivariate analysis
 
An information-theoretic, all-scales approach to comparing networks
An information-theoretic, all-scales approach to comparing networksAn information-theoretic, all-scales approach to comparing networks
An information-theoretic, all-scales approach to comparing networks
 
Convolutional Networks
Convolutional NetworksConvolutional Networks
Convolutional Networks
 
STRING - Modeling of pathways through cross-species integration of large-scal...
STRING - Modeling of pathways through cross-species integration of large-scal...STRING - Modeling of pathways through cross-species integration of large-scal...
STRING - Modeling of pathways through cross-species integration of large-scal...
 
Deep Learning for Leukemia Detection: A MobileNetV2-Based Approach for Accura...
Deep Learning for Leukemia Detection: A MobileNetV2-Based Approach for Accura...Deep Learning for Leukemia Detection: A MobileNetV2-Based Approach for Accura...
Deep Learning for Leukemia Detection: A MobileNetV2-Based Approach for Accura...
 
Fuzzy Logic Based Parameter Adaptation of Interior Search Algorithm
Fuzzy Logic Based Parameter Adaptation of Interior Search AlgorithmFuzzy Logic Based Parameter Adaptation of Interior Search Algorithm
Fuzzy Logic Based Parameter Adaptation of Interior Search Algorithm
 

More from Daniel LIAO

C4.3.1
C4.3.1C4.3.1
C4.3.1
Daniel LIAO
 
C4.1.2
C4.1.2C4.1.2
C4.1.2
Daniel LIAO
 
C3.7.1
C3.7.1C3.7.1
C3.7.1
Daniel LIAO
 
C3.6.1
C3.6.1C3.6.1
C3.6.1
Daniel LIAO
 
C3.5.1
C3.5.1C3.5.1
C3.5.1
Daniel LIAO
 
C3.4.2
C3.4.2C3.4.2
C3.4.2
Daniel LIAO
 
C3.4.1
C3.4.1C3.4.1
C3.4.1
Daniel LIAO
 
C2.3
C2.3C2.3
C2.6
C2.6C2.6
C2.2
C2.2C2.2
C2.5
C2.5C2.5
C2.4
C2.4C2.4
C3.1.2
C3.1.2C3.1.2
C3.1.2
Daniel LIAO
 
C3.2.2
C3.2.2C3.2.2
C3.2.2
Daniel LIAO
 
C3.2.1
C3.2.1C3.2.1
C3.2.1
Daniel LIAO
 
C3.1 intro
C3.1 introC3.1 intro
C3.1 intro
Daniel LIAO
 
C2.1 intro
C2.1 introC2.1 intro
C2.1 intro
Daniel LIAO
 

More from Daniel LIAO (17)

C4.3.1
C4.3.1C4.3.1
C4.3.1
 
C4.1.2
C4.1.2C4.1.2
C4.1.2
 
C3.7.1
C3.7.1C3.7.1
C3.7.1
 
C3.6.1
C3.6.1C3.6.1
C3.6.1
 
C3.5.1
C3.5.1C3.5.1
C3.5.1
 
C3.4.2
C3.4.2C3.4.2
C3.4.2
 
C3.4.1
C3.4.1C3.4.1
C3.4.1
 
C2.3
C2.3C2.3
C2.3
 
C2.6
C2.6C2.6
C2.6
 
C2.2
C2.2C2.2
C2.2
 
C2.5
C2.5C2.5
C2.5
 
C2.4
C2.4C2.4
C2.4
 
C3.1.2
C3.1.2C3.1.2
C3.1.2
 
C3.2.2
C3.2.2C3.2.2
C3.2.2
 
C3.2.1
C3.2.1C3.2.1
C3.2.1
 
C3.1 intro
C3.1 introC3.1 intro
C3.1 intro
 
C2.1 intro
C2.1 introC2.1 intro
C2.1 intro
 

Recently uploaded

A Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two HeartsA Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two Hearts
Steve Thomason
 
Ch-4 Forest Society and colonialism 2.pdf
Ch-4 Forest Society and colonialism 2.pdfCh-4 Forest Society and colonialism 2.pdf
Ch-4 Forest Society and colonialism 2.pdf
lakshayrojroj
 
Bonku-Babus-Friend by Sathyajith Ray (9)
Bonku-Babus-Friend by Sathyajith Ray  (9)Bonku-Babus-Friend by Sathyajith Ray  (9)
Bonku-Babus-Friend by Sathyajith Ray (9)
nitinpv4ai
 
Contiguity Of Various Message Forms - Rupam Chandra.pptx
Contiguity Of Various Message Forms - Rupam Chandra.pptxContiguity Of Various Message Forms - Rupam Chandra.pptx
Contiguity Of Various Message Forms - Rupam Chandra.pptx
Kalna College
 
How to Manage Reception Report in Odoo 17
How to Manage Reception Report in Odoo 17How to Manage Reception Report in Odoo 17
How to Manage Reception Report in Odoo 17
Celine George
 
220711130097 Tulip Samanta Concept of Information and Communication Technology
220711130097 Tulip Samanta Concept of Information and Communication Technology220711130097 Tulip Samanta Concept of Information and Communication Technology
220711130097 Tulip Samanta Concept of Information and Communication Technology
Kalna College
 
SWOT analysis in the project Keeping the Memory @live.pptx
SWOT analysis in the project Keeping the Memory @live.pptxSWOT analysis in the project Keeping the Memory @live.pptx
SWOT analysis in the project Keeping the Memory @live.pptx
zuzanka
 
CIS 4200-02 Group 1 Final Project Report (1).pdf
CIS 4200-02 Group 1 Final Project Report (1).pdfCIS 4200-02 Group 1 Final Project Report (1).pdf
CIS 4200-02 Group 1 Final Project Report (1).pdf
blueshagoo1
 
How to Download & Install Module From the Odoo App Store in Odoo 17
How to Download & Install Module From the Odoo App Store in Odoo 17How to Download & Install Module From the Odoo App Store in Odoo 17
How to Download & Install Module From the Odoo App Store in Odoo 17
Celine George
 
220711130082 Srabanti Bag Internet Resources For Natural Science
220711130082 Srabanti Bag Internet Resources For Natural Science220711130082 Srabanti Bag Internet Resources For Natural Science
220711130082 Srabanti Bag Internet Resources For Natural Science
Kalna College
 
78 Microsoft-Publisher - Sirin Sultana Bora.pptx
78 Microsoft-Publisher - Sirin Sultana Bora.pptx78 Microsoft-Publisher - Sirin Sultana Bora.pptx
78 Microsoft-Publisher - Sirin Sultana Bora.pptx
Kalna College
 
KHUSWANT SINGH.pptx ALL YOU NEED TO KNOW ABOUT KHUSHWANT SINGH
KHUSWANT SINGH.pptx ALL YOU NEED TO KNOW ABOUT KHUSHWANT SINGHKHUSWANT SINGH.pptx ALL YOU NEED TO KNOW ABOUT KHUSHWANT SINGH
KHUSWANT SINGH.pptx ALL YOU NEED TO KNOW ABOUT KHUSHWANT SINGH
shreyassri1208
 
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
ImMuslim
 
INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION
INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION
INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION
ShwetaGawande8
 
How to Fix [Errno 98] address already in use
How to Fix [Errno 98] address already in useHow to Fix [Errno 98] address already in use
How to Fix [Errno 98] address already in use
Celine George
 
CapTechTalks Webinar Slides June 2024 Donovan Wright.pptx
CapTechTalks Webinar Slides June 2024 Donovan Wright.pptxCapTechTalks Webinar Slides June 2024 Donovan Wright.pptx
CapTechTalks Webinar Slides June 2024 Donovan Wright.pptx
CapitolTechU
 
How to Setup Default Value for a Field in Odoo 17
How to Setup Default Value for a Field in Odoo 17How to Setup Default Value for a Field in Odoo 17
How to Setup Default Value for a Field in Odoo 17
Celine George
 
Educational Technology in the Health Sciences
Educational Technology in the Health SciencesEducational Technology in the Health Sciences
Educational Technology in the Health Sciences
Iris Thiele Isip-Tan
 
Temple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation resultsTemple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation results
Krassimira Luka
 
NIPER 2024 MEMORY BASED QUESTIONS.ANSWERS TO NIPER 2024 QUESTIONS.NIPER JEE 2...
NIPER 2024 MEMORY BASED QUESTIONS.ANSWERS TO NIPER 2024 QUESTIONS.NIPER JEE 2...NIPER 2024 MEMORY BASED QUESTIONS.ANSWERS TO NIPER 2024 QUESTIONS.NIPER JEE 2...
NIPER 2024 MEMORY BASED QUESTIONS.ANSWERS TO NIPER 2024 QUESTIONS.NIPER JEE 2...
Payaamvohra1
 

Recently uploaded (20)

A Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two HeartsA Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two Hearts
 
Ch-4 Forest Society and colonialism 2.pdf
Ch-4 Forest Society and colonialism 2.pdfCh-4 Forest Society and colonialism 2.pdf
Ch-4 Forest Society and colonialism 2.pdf
 
Bonku-Babus-Friend by Sathyajith Ray (9)
Bonku-Babus-Friend by Sathyajith Ray  (9)Bonku-Babus-Friend by Sathyajith Ray  (9)
Bonku-Babus-Friend by Sathyajith Ray (9)
 
Contiguity Of Various Message Forms - Rupam Chandra.pptx
Contiguity Of Various Message Forms - Rupam Chandra.pptxContiguity Of Various Message Forms - Rupam Chandra.pptx
Contiguity Of Various Message Forms - Rupam Chandra.pptx
 
How to Manage Reception Report in Odoo 17
How to Manage Reception Report in Odoo 17How to Manage Reception Report in Odoo 17
How to Manage Reception Report in Odoo 17
 
220711130097 Tulip Samanta Concept of Information and Communication Technology
220711130097 Tulip Samanta Concept of Information and Communication Technology220711130097 Tulip Samanta Concept of Information and Communication Technology
220711130097 Tulip Samanta Concept of Information and Communication Technology
 
SWOT analysis in the project Keeping the Memory @live.pptx
SWOT analysis in the project Keeping the Memory @live.pptxSWOT analysis in the project Keeping the Memory @live.pptx
SWOT analysis in the project Keeping the Memory @live.pptx
 
CIS 4200-02 Group 1 Final Project Report (1).pdf
CIS 4200-02 Group 1 Final Project Report (1).pdfCIS 4200-02 Group 1 Final Project Report (1).pdf
CIS 4200-02 Group 1 Final Project Report (1).pdf
 
How to Download & Install Module From the Odoo App Store in Odoo 17
How to Download & Install Module From the Odoo App Store in Odoo 17How to Download & Install Module From the Odoo App Store in Odoo 17
How to Download & Install Module From the Odoo App Store in Odoo 17
 
220711130082 Srabanti Bag Internet Resources For Natural Science
220711130082 Srabanti Bag Internet Resources For Natural Science220711130082 Srabanti Bag Internet Resources For Natural Science
220711130082 Srabanti Bag Internet Resources For Natural Science
 
78 Microsoft-Publisher - Sirin Sultana Bora.pptx
78 Microsoft-Publisher - Sirin Sultana Bora.pptx78 Microsoft-Publisher - Sirin Sultana Bora.pptx
78 Microsoft-Publisher - Sirin Sultana Bora.pptx
 
KHUSWANT SINGH.pptx ALL YOU NEED TO KNOW ABOUT KHUSHWANT SINGH
KHUSWANT SINGH.pptx ALL YOU NEED TO KNOW ABOUT KHUSHWANT SINGHKHUSWANT SINGH.pptx ALL YOU NEED TO KNOW ABOUT KHUSHWANT SINGH
KHUSWANT SINGH.pptx ALL YOU NEED TO KNOW ABOUT KHUSHWANT SINGH
 
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
 
INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION
INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION
INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION
 
How to Fix [Errno 98] address already in use
How to Fix [Errno 98] address already in useHow to Fix [Errno 98] address already in use
How to Fix [Errno 98] address already in use
 
CapTechTalks Webinar Slides June 2024 Donovan Wright.pptx
CapTechTalks Webinar Slides June 2024 Donovan Wright.pptxCapTechTalks Webinar Slides June 2024 Donovan Wright.pptx
CapTechTalks Webinar Slides June 2024 Donovan Wright.pptx
 
How to Setup Default Value for a Field in Odoo 17
How to Setup Default Value for a Field in Odoo 17How to Setup Default Value for a Field in Odoo 17
How to Setup Default Value for a Field in Odoo 17
 
Educational Technology in the Health Sciences
Educational Technology in the Health SciencesEducational Technology in the Health Sciences
Educational Technology in the Health Sciences
 
Temple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation resultsTemple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation results
 
NIPER 2024 MEMORY BASED QUESTIONS.ANSWERS TO NIPER 2024 QUESTIONS.NIPER JEE 2...
NIPER 2024 MEMORY BASED QUESTIONS.ANSWERS TO NIPER 2024 QUESTIONS.NIPER JEE 2...NIPER 2024 MEMORY BASED QUESTIONS.ANSWERS TO NIPER 2024 QUESTIONS.NIPER JEE 2...
NIPER 2024 MEMORY BASED QUESTIONS.ANSWERS TO NIPER 2024 QUESTIONS.NIPER JEE 2...
 

C4.5

  • 1. Machine Learning Specializa0on Latent Dirichlet Allocation: Mixed Membership Modeling Emily Fox & Carlos Guestrin Machine Learning Specialization University of Washington 1 ©2016 Emily Fox & Carlos Guestrin
  • 2. Machine Learning Specializa0on Mixed membership models for documents 2 ©2016 Emily Fox & Carlos Guestrin
  • 3. Machine Learning Specializa0on 3 So far, clustered articles into groups ©2016 Emily Fox & Carlos Guestrin Doc labeled with a topic assignment Clustering goal: discover groups of related docs
  • 4. Machine Learning Specializa0on 4 Are documents about just one thing? ©2016 Emily Fox & Carlos Guestrin Is this article just about science?
  • 5. Machine Learning Specializa0on 5 Soft assignments capture uncertainty ©2016 Emily Fox & Carlos Guestrin Soft assignment rik tells us this doc could be about world news or science But, clustering model still specifies each doc belongs to 1 topic
  • 6. Machine Learning Specializa0on 6 ©2016 Emily Fox & Carlos Guestrin 0 1 Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Based on science related words, maybe doc in cluster 4 Encoding of cluster membership zi = 4
  • 7. Machine Learning Specializa0on 7 0 1 ©2016 Emily Fox & Carlos Guestrin Or maybe cluster 2 (technology) is a better fit Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Encoding of cluster membership zi = 2 Soft assignments capture uncertainty in zi = 2 or 4
  • 8. Machine Learning Specializa0on 8 ©2016 Emily Fox & Carlos Guestrin 0 0.2 0.4 0.6 Really, it’s about science and technology Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible “zi” is both 2 and 4
  • 9. Machine Learning Specializa0on 9 Mixed membership models Want to discover a set of memberships (In contrast, cluster models aim at discovering a single membership) ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible
  • 10. Machine Learning Specializa0on Building up to document mixed membership models ©2016 Emily Fox & Carlos Guestrin 10
  • 11. Machine Learning Specializa0on 11 An alternative document clustering model ©2016 Emily Fox & Carlos Guestrin (Back to clustering, not mixed membership modeling)
  • 12. Machine Learning Specializa0on 12 So far, we have considered… ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 tf-idf vector xi =
  • 13. Machine Learning Specializa0on 13 Bag-of-words representation ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 epilepsy modeling clinical complex Bayesian
  • 14. Machine Learning Specializa0on 14 Bag-of-words representation ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 multiset = unordered set of words with duplication of unique elements mattering xi = {modeling, complex, epilepsy, modeling, Bayesian, clinical, epilepsy, EEG, data, dynamic…}
  • 15. Machine Learning Specializa0on 15 A model for bag-of-words representation ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible As before, the “prior” probability that doc i is from topic k is: p(zi = k) = ⇡k π = [π1 π2 … πK] represents corpus-wide topic prevalence
  • 16. Machine Learning Specializa0on 16 A model for bag-of-words representation ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Assuming doc i is from topic k, words occur with probabilities: SCIENCE patients 0.05 clinical 0.01 epilepsy 0.002 seizures 0.0015 EEG 0.001 … … wordsinvocab
  • 17. Machine Learning Specializa0on 17 Topic-specific word probabilities Distribution on words in vocab for each topic ©2016 Emily Fox & Carlos Guestrin SCIENCE experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TECH develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … SPORTS player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … (table now organized by decreasing probabilities showing top words in each category)
  • 18. Machine Learning Specializa0on 18 Comparing and contrasting Previously ©2016 Emily Fox & Carlos Guestrin SCIENCE experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TECH develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … SPORTS player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … Now Prior topic probabilities p(zi = k) = ⇡k p(zi = k) = ⇡k Likelihood under each topic … tf-idf vector compute likelihood of tf-idf vector under each Gaussian {modeling, complex, epilepsy, modeling, Bayesian, clinical, epilepsy, EEG, data, dynamic…} compute likelihood of the collection of words in doc under each topic distribution
  • 19. Machine Learning Specializa0on Latent Dirichlet allocation (LDA) ©2016 Emily Fox & Carlos Guestrin 19
  • 20. Machine Learning Specializa0on 20 LDA is a mixed membership model Want to discover a set of topics ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible 0 0.2 0.4 0.6
  • 21. Machine Learning Specializa0on 21 ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible arXiv:1402.6951v2[stat.ML]14Jul2014 SCIENCE experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TECH develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … SPORTS player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … Topic vocab distributions: … Clustering: One topic indicator zi per document i All words come from (get scored under) same topic zi Distribution on prevalence of topics in corpus π = [π1 π2 … πK]
  • 22. Machine Learning Specializa0on 22 ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible arXiv:1402.6951v2[stat.ML]14Jul2014 SCIENCE experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TECH develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … SPORTS player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … Same topic distributions: … In LDA: One topic indicator ziw per word in doc i Each word gets scored under its topic ziw Distribution on prevalence of topics in document πi = [πi1 πi2 … πiK]
  • 23. Machine Learning Specializa0on 23 ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible arXiv:1402.6951v2[stat.ML]14Jul2014 SCIENCE experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TECH develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … SPORTS player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … Topic vocab distributions: … Document topic proportions: πi = [πi1 πi2 … πiK] 0 0.2 0.4 0.6
  • 24. Machine Learning Specializa0on Inference in LDA models ©2016 Emily Fox & Carlos Guestrin 24
  • 25. Machine Learning Specializa0on 25 ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible arXiv:1402.6951v2[stat.ML]14Jul2014 SCIENCE experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TECH develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … SPORTS player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … Topic vocab distributions: … Document topic proportions: πi = [πi1 πi2 … πiK] 0 0.2 0.4 0.6
  • 26. Machine Learning Specializa0on 26 ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible arXiv:1402.6951v2[stat.ML]14Jul2014 TOPIC 1 Word 1 ? Word 2 ? Word 3 ? Word 4 ? Word 5 ? … … TOPIC 2 Word 1 ? Word 2 ? Word 3 ? Word 4 ? Word 5 ? … … TOPIC 3 Word 1 ? Word 2 ? Word 3 ? Word 4 ? Word 5 ? … … Topic vocab distributions: … Document topic proportions: πi = [πi1 πi2 … πiK] 0 0.2 0.4 0.6 ? ? ? ?
  • 27. Machine Learning Specializa0on 27 ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible arXiv:1402.6951v2[stat.ML]14Jul2014 TOPIC 1 Word 1 ? Word 2 ? Word 3 ? Word 4 ? Word 5 ? … … TOPIC 2 Word 1 ? Word 2 ? Word 3 ? Word 4 ? Word 5 ? … … TOPIC 3 Word 1 ? Word 2 ? Word 3 ? Word 4 ? Word 5 ? … … Topic vocab distributions: … Document topic proportions: πi = [πi1 πi2 … πiK] 0 0.2 0.4 0.6 ? ? ? ? LDA inputs: - Set of words per doc for each doc in corpus LDA outputs: - Corpus-wide topic vocab distributions - Topic assignments per word - Topic proportions per doc
  • 28. Machine Learning Specializa0on 28 Interpreting LDA outputs ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 TOPIC 1 experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TOPIC 2 develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … TOPIC 3 player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … 0 0.2 0.4 0.6
  • 29. Machine Learning Specializa0on 29 Interpreting LDA outputs ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 TOPIC 1 experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TOPIC 2 develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … TOPIC 3 player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … 0 0.2 0.4 0.6 Examine coherence of learned topics -  What are top words per topic? -  Do they form meaningful groups? -  Use to post-facto label topics (e.g., science, tech, sports,…)
  • 30. Machine Learning Specializa0on 30 Interpreting LDA outputs ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 TOPIC 1 experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TOPIC 2 develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … TOPIC 3 player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … 0 0.2 0.4 0.6 Doc-specific topic proportions can be used to: -  Relate documents -  Study user topic preferences -  Assign docs to multiple categories
  • 31. Machine Learning Specializa0on 31 Interpreting LDA outputs ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 TOPIC 1 experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TOPIC 2 develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … TOPIC 3 player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … 0 0.2 0.4 0.6 Typically not interested in word assignments
  • 32. Machine Learning Specializa0on An inference algorithm for LDA: Gibbs sampling 32 ©2016 Emily Fox & Carlos Guestrin
  • 33. Machine Learning Specializa0on 33 Clustering algorithms so far ©2016 Emily Fox & Carlos Guestrin Assign observations to closest cluster center Revise cluster centers zi arg min j ||µj xi||2 2 µj arg min µ X i:zi=j ||µ xi||2 2 k-means E-step: estimate cluster responsibilities M-step: maximize likelihood over parameters EM for MoG Iterative hard assignment to max objective Iterative soft assignment to max objective
  • 34. Machine Learning Specializa0on 34 What can we do for our bag-of-words models? Part 1: Clustering model ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 SCIENCE experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TECH develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … SPORTS player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … One topic indicator zi per document i All words come from (get scored under) same topic zi Distribution on prevalence of topics in corpus π = [π1 π2 … πK]
  • 35. Machine Learning Specializa0on 35 What can we do for our bag-of-words models? Part 1: Clustering model ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 SCIENCE experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TECH develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … SPORTS player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … Can derive EM algorithm: -  Gaussian likelihood of tf-idf vector -  multinomial likelihood of word counts (mw successes of word w) -  Result: mixture of multinomial model
  • 36. Machine Learning Specializa0on 36 What can we do for our bag-of-words models? Part 2: LDA model ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 TOPIC 1 experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TOPIC 2 develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … TOPIC 3 player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … 0 0.2 0.4 0.6 Can derive EM algorithm, but not common (performs poorly)
  • 37. Machine Learning Specializa0on 37 Typical LDA implementations Normally LDA is specified as a Bayesian model (otherwise, “probabilistic latent semantic analysis/indexing”) - Account for uncertainty in parameters when making predictions - Naturally regularizes parameter estimates in contrast to MLE EM-like algorithms (e.g., “variational EM”), or… ©2016 Emily Fox & Carlos Guestrin
  • 38. Machine Learning Specializa0on Gibbs sampling for Bayesian inference ©2016 Emily Fox & Carlos Guestrin
  • 39. Machine Learning Specializa0on 39 Gibbs sampling Benefits: •  Typically intuitive updates •  Very straightforward to implement ©2016 Emily Fox & Carlos Guestrin Iterative random hard assignment!
  • 40. Machine Learning Specializa0on 40 Random sample #10000 ©2016 Emily Fox & Carlos Guestrin TOPIC 1 experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TOPIC 2 develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … TOPIC 3 player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … 0 0.2 0.4 0.6 Current set of assignments Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014
  • 41. Machine Learning Specializa0on 41 Random sample #10001 ©2016 Emily Fox & Carlos Guestrin TOPIC 1 experiment 0.12 test 0.06 hypothesize 0.042 discover 0.04 climate 0.011 … … TOPIC 2 develop 0.16 computer 0.11 user 0.03 processor 0.029 internet 0.023 … … TOPIC 3 player 0.15 score 0.07 team 0.06 offense 0.02 defense 0.018 … … … 0 0.2 0.4 0.6 Current set of assignments Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014
  • 42. Machine Learning Specializa0on 42 Random sample #10002 ©2016 Emily Fox & Carlos Guestrin TOPIC 1 experiment 0.10 discover 0.055 hypothesize 0.043 test 0.042 examine 0.015 … … TOPIC 2 computer 0.12 develop 0.115 user 0.031 device 0.022 cloud 0.018 … … TOPIC 3 player 0.17 score 0.09 game 0.062 team 0.043 win 0.03 … … … 0 0.2 0.4 0.6 Current set of assignments Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014
  • 43. Machine Learning Specializa0on 43 What do we know about this process? ©2016 Emily Fox & Carlos Guestrin iterations Jointmodel probability probability of observations given variables/parameters and probability of variables/parameters themselves Not an optimization algorithm Eventually provides “correct” Bayesian estimates…
  • 44. Machine Learning Specializa0on 44 What to do with sampling output? Predictions: 1.  Make prediction for each snapshot of randomly assigned variables/parameters (full iteration) 2.  Average predictions for final result Parameter or assignment estimate: -  Look at snapshot of randomly assigned variables/parameters that maximizes “joint model probability” ©2016 Emily Fox & Carlos Guestrin iterations Jointmodel probability
  • 45. Machine Learning Specializa0on Standard Gibbs sampling steps ©2016 Emily Fox & Carlos Guestrin
  • 46. Machine Learning Specializa0on 46 Gibbs sampling algorithm outline Assignment variables and model parameters treated similarly Iteratively draw variable/parameter from conditional distribution having fixed: -  all other variables/parameters •  values randomly selected in previous rounds •  changes from iter to iter -  observations •  always the same values ©2016 Emily Fox & Carlos Guestrin Iterative random hard assignment!
  • 47. Machine Learning Specializa0on 47 Gibbs sampling for LDA ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 TOPIC 1 experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TOPIC 2 develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … TOPIC 3 player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … 0 0.2 0.4 0.6 Current set of assignments
  • 48. Machine Learning Specializa0on 48 Gibbs sampling for LDA ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 TOPIC 1 experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TOPIC 2 develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … TOPIC 3 player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … 0 0.2 0.4 0.6 Step 1: Randomly reassign all ziw based on -  doc topic proportions -  topic vocab distributions Draw randomly from responsibility vector [riw1 riw2 … riwK]
  • 49. Machine Learning Specializa0on 49 Gibbs sampling for LDA ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 TOPIC 1 experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TOPIC 2 develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … TOPIC 3 player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … Step 2: Randomly reassign doc topic proportions based on assignments ziw in current doc 0 0.2 0.4 0.6 ? ? ? ?
  • 50. Machine Learning Specializa0on 50 Gibbs sampling for LDA ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 TOPIC 1 experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TOPIC 2 develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … TOPIC 3 player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … Step 3: Repeat for all docs 0 0.2 0.4 0.6 ? ? ? ?
  • 51. Machine Learning Specializa0on 51 Gibbs sampling for LDA ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 … Step 4: Randomly reassign topic vocab distributions based on assignments ziw in entire corpus 0 0.2 0.4 0.6 TOPIC 1 Word 1 ? Word 2 ? Word 3 ? Word 4 ? Word 5 ? … … TOPIC 2 Word 1 ? Word 2 ? Word 3 ? Word 4 ? Word 5 ? … … TOPIC 3 Word 1 ? Word 2 ? Word 3 ? Word 4 ? Word 5 ? … …
  • 52. Machine Learning Specializa0on 52 Gibbs sampling for LDA ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 TOPIC 1 experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TOPIC 2 develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … TOPIC 3 player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … 0 0.2 0.4 0.6 Repeat Steps 1-4 until max iter reached
  • 53. Machine Learning Specializa0on Collapsed Gibbs sampling in LDA ©2016 Emily Fox & Carlos Guestrin
  • 54. Machine Learning Specializa0on 54 “Collapsed” Gibbs sampling for LDA Based on special structure of LDA model, can sample just indicator variables ziw - No need to sample other parameters •  corpus-wide topic vocab distributions •  per-doc topic proportions Often leads to much better performance because examining uncertainty in smaller space ©2016 Emily Fox & Carlos Guestrin
  • 55. Machine Learning Specializa0on 55 Collapsed Gibbs sampling for LDA ©2016 Emily Fox & Carlos Guestrin Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 TOPIC 1 experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TOPIC 2 develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … TOPIC 3 player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … 0 0.2 0.4 0.6 Randomly reassign ziw based on current assignments zjv of all other words in document and corpus Never draw topic vocab distributions or doc topic proportions
  • 58. Machine Learning Specializa0on Randomly assign topics 58 ©2016 Emily Fox & Carlos Guestrin 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 2 3 2 1 1 test temple ship trade market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 2 3 2 1 1 model temple ship trade market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 3 2 1 3 1 Etruscan trade price temple market 2 3 2 1 1 cross test validate likelihood data 3 2 1 3 1 epilepsy dynamic Bayesian EEG model Repeat for each doc in the corpus
  • 59. Machine Learning Specializa0on Maintain local statistics 59 ©2016 Emily Fox & Carlos Guestrin 3 2 1 3 1 epilepsy dynamic Bayesian EEG model Topic 1 Topic 2 Topic 3 Doc i
  • 60. Machine Learning Specializa0on Maintain global statistics 60 ©2016 Emily Fox & Carlos Guestrin Topic 1 Topic 2 Topic 3 epilepsy 1 0 35 Bayesian 50 0 1 model 42 1 0 EEG 0 0 20 dynamic 10 8 1 ... Total counts from all docs 3 2 1 3 1 epilepsy dynamic Bayesian EEG model Topic 1 Topic 2 Topic 3 Doc i 2 1 2
  • 61. Machine Learning Specializa0on Randomly reassign topics 61 ©2016 Emily Fox & Carlos Guestrin 3 2 1 3 1 epilepsy dynamic Bayesian EEG model Topic 1 Topic 2 Topic 3 epilepsy 1 0 35 Bayesian 50 0 1 model 42 1 0 EEG 0 0 20 dynamic 10 8 1 ... Topic 1 Topic 2 Topic 3 Doc i 2 1 2
  • 62. Machine Learning Specializa0on Probability of new assignment 62 ©2016 Emily Fox & Carlos Guestrin 3 ? 1 3 1 epilepsy dynamic Bayesian EEG model
  • 63. Machine Learning Specializa0on Probability of new assignment 63 ©2016 Emily Fox & Carlos Guestrin 3 ? 1 3 1 epilepsy dynamic Bayesian EEG model Topic 1 Topic 2 Topic 3 Topic 1 Topic 2 Topic 3 Doc i 2 0 2 How much doc “likes” each topic based on other assignments in doc # current assignments to topic k in doc i # words in doc i smoothing param
  • 64. Machine Learning Specializa0on Probability of new assignment 64 ©2016 Emily Fox & Carlos Guestrin 3 ? 1 3 1 epilepsy dynamic Bayesian EEG model Topic 1 Topic 2 Topic 3 dynamic 10 7 1 How much each topic likes the word “dynamic” based on assignments in other docs in corpus smoothing param # assignments corpus-wide of word “dynamic” to topic k Topic 1 Topic 2 Topic 3
  • 65. Machine Learning Specializa0on Probability of new assignment 65 ©2016 Emily Fox & Carlos Guestrin 3 ? 1 3 1 epilepsy dynamic Bayesian EEG model Topic 1 Topic 2 Topic 3 dynamic 10 7 1 Topic 2 also really likes “dynamic”, but in a different context… e.g., a topic on fluid dynamics Topic 1 Topic 2 Topic 3
  • 66. Machine Learning Specializa0on Probability of new assignment 66 ©2016 Emily Fox & Carlos Guestrin 3 ? 1 3 1 epilepsy dynamic Bayesian EEG model Topic fits word and document Topic fits word, but not doc Topic fits doc, but not word Topic 1 Topic 2 Topic 3
  • 67. Machine Learning Specializa0on Probability of new assignment 67 ©2016 Emily Fox & Carlos Guestrin 3 ? 1 3 1 epilepsy dynamic Bayesian EEG model Topic 1 Topic 2 Topic 3 How much topic likes word How much doc likes topic
  • 68. Machine Learning Specializa0on Randomly draw a new topic indicator 68 ©2016 Emily Fox & Carlos Guestrin 3 ? 1 3 1 epilepsy dynamic Bayesian EEG model Topic 1 Topic 2 Topic 3 How much topic likes word How much doc likes topic To draw new topic assignment (equivalently): -  roll K-sided die with these probabilities -  throw dart at these regions Normalize this product of terms over K possible topics!
  • 69. Machine Learning Specializa0on Update counts 69 ©2016 Emily Fox & Carlos Guestrin 3 1 1 3 1 epilepsy dynamic Bayesian EEG model Topic 1 Topic 2 Topic 3 epilepsy 1 0 35 Bayesian 50 0 1 model 42 1 0 EEG 0 0 20 dynamic 10 7 1 ... Topic 1 Topic 2 Topic 3 Doc i 2 0 2
  • 70. Machine Learning Specializa0on Geometrically… 70 ©2016 Emily Fox & Carlos Guestrin 3 1 1 3 1 epilepsy dynamic Bayesian EEG model Topic 1 Topic 2 Topic 3 Increase popularity of “dynamic” in topic 1 (corpus-wide) Increase popularity of topic 1 in doc i
  • 71. Machine Learning Specializa0on Iterate through all words/docs 71 ©2016 Emily Fox & Carlos Guestrin 3 1 1 3 1 epilepsy dynamic Bayesian EEG model Topic 1 Topic 2 Topic 3 epilepsy 1 0 35 Bayesian 50 0 1 model 42 1 0 EEG 0 0 20 dynamic 10 7 1 ... Topic 1 Topic 2 Topic 3 Doc i 2 0 2
  • 72. Machine Learning Specializa0on Using samples from collapsed Gibbs ©2016 Emily Fox & Carlos Guestrin
  • 73. Machine Learning Specializa0on 73 What to do with the collapsed samples? ©2016 Emily Fox & Carlos Guestrin TOPIC 1 experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TOPIC 2 develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … TOPIC 3 player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … 0 0.2 0.4 0.6 Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 From “best” sample of {ziw}, can infer: 1.  Topics from conditional distribution… need corpus-wide info
  • 74. Machine Learning Specializa0on 74 What to do with the collapsed samples? ©2016 Emily Fox & Carlos Guestrin TOPIC 1 experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TOPIC 2 develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … TOPIC 3 player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … 0 0.2 0.4 0.6 Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 From “best” sample of {ziw}, can infer: 1.  Topics from conditional distribution… need corpus-wide info
  • 75. Machine Learning Specializa0on 75 What to do with the collapsed samples? ©2016 Emily Fox & Carlos Guestrin TOPIC 1 experiment 0.1 test 0.08 discover 0.05 hypothesize 0.03 climate 0.01 … … TOPIC 2 develop 0.18 computer 0.09 processor 0.032 user 0.027 internet 0.02 … … TOPIC 3 player 0.15 score 0.07 team 0.06 goal 0.03 injury 0.01 … … … 0 0.2 0.4 0.6 Modeling the Complex Dynamics and Changing Correlations of Epileptic Events Drausin F. Wulsina , Emily B. Foxc , Brian Litta,b a Department of Bioengineering, University of Pennsylvania, Philadelphia, PA b Department of Neurology, University of Pennsylvania, Philadelphia, PA c Department of Statistics, University of Washington, Seattle, WA Abstract Patients with epilepsy can manifest short, sub-clinical epileptic “bursts” in addition to full-blown clinical seizures. We believe the relationship between these two classes of events—something not previously studied quantitatively— could yield important insights into the nature and intrinsic dynamics of seizures. A goal of our work is to parse these complex epileptic events into distinct dynamic regimes. A challenge posed by the intracranial EEG (iEEG) data we study is the fact that the number and placement of electrodes can vary between patients. We develop a Bayesian nonparametric Markov switching process that allows for (i) shared dynamic regimes between a vari- able number of channels, (ii) asynchronous regime-switching, and (iii) an unknown dictionary of dynamic regimes. We encode a sparse and changing set of dependencies between the channels using a Markov-switching Gaussian graphical model for the innovations process driving the channel dynamics and demonstrate the importance of this model in parsing and out-of-sample pre- dictions of iEEG data. We show that our model produces intuitive state assignments that can help automate clinical analysis of seizures and enable the comparison of sub-clinical bursts and full clinical seizures. Keywords: Bayesian nonparametric, EEG, factorial hidden Markov model, graphical model, time series 1. Introduction Despite over three decades of research, we still have very little idea of what defines a seizure. This ignorance stems both from the complexity of epilepsy as a disease and a paucity of quantitative tools that are flexible Preprint submitted to Artificial Intelligence Journal July 29, 2014 arXiv:1402.6951v2[stat.ML]14Jul2014 From “best” sample of {ziw}, can infer: 1.  Topics from conditional distribution… need corpus-wide info 2.  Document “embedding”… need doc info only