Composing graphical models with neural networks for structured representations and fast inference
1. CS592 Presentation #18
Composing graphical models with neural networks for structured representations and fast inference
20173586 Jeongmin Cha
20184666 Yajie Zhou
20174463 Jaesung Choe
2. Content
1. Motivation
2. Modeling idea
3. Structured Variational Autoencoder (SVAE)
4. Background
5. Main algorithm
6. Experiment
7. Group Discussion Point
3. 1. Motivation
● How can we build interpretable models of high-dimensional data?
● modeling video of a mouse
● a mouse usually repeats a certain behavior
(figure: example behaviors: dart, groom, rear)
4. 1. Motivation
● We want a model that
○ can explain which behavior the mouse is performing at each frame
5. 1. Motivation
● What we want to do is ...
● segment and categorize mouse behavior from the video
● Q: generative vs discriminative model for this task?
6. 1. Motivation
● What we want to do is ...
● segment and categorize mouse behavior from the video
● Q: generative vs discriminative model for this task?
○ We can use both
○ the discriminative scheme needs a large amount of labeled data
○ a discriminative model relaxes the conditional independence assumption,
so it may achieve better predictive results
7. 1. Motivation
● What we want to do is ...
● segment and categorize mouse behavior from the video
● Q: generative vs discriminative model for this task?
○ We can use both
○ a large number of unlabeled data from a small number of labeled data
8. 1. Motivation
● What we want to do is ...
● segment and categorize mouse behavior from the video
● Q: generative vs discriminative model for this task?
○ We can use both
○ a generative scheme can exploit a large amount of unlabeled data alongside a small amount of labeled data
○ This paper wants to build a generative model for video of a mouse
9. 1. Motivation
● a generative model for video of a mouse
● a mouse repeats certain behaviors
● a Gaussian mixture model (GMM) is one candidate solution
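Since the deck only names the idea, here is a minimal sketch of fitting such a mixture: a two-component 1-D GMM trained with EM on synthetic data. This is numpy-only, and the data, component count, and initialization are illustrative assumptions, not the paper's mouse-video setup.

```python
import numpy as np

def fit_gmm_1d(x, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture with EM."""
    # Crude initialization: start the two means at the data quartiles.
    mu = np.array([np.percentile(x, 25), np.percentile(x, 75)])
    var = np.array([x.var(), x.var()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities r[n, k] proportional to pi_k * N(x_n | mu_k, var_k)
        log_r = (np.log(pi)
                 - 0.5 * np.log(2 * np.pi * var)
                 - 0.5 * (x[:, None] - mu) ** 2 / var)
        r = np.exp(log_r - log_r.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted means, variances, mixing weights.
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
    return mu, var, pi

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-3, 1, 500), rng.normal(3, 1, 500)])
mu, var, pi = fit_gmm_1d(x)
print(np.sort(mu))  # component means near -3 and 3
```

With well-separated clusters like these, EM recovers the two modes; the slides' point is that on real video features the number of Gaussian clusters needed explodes, which is what motivates the SVAE.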
11. 1. Motivation
● a mixture of gaussians fits the data poorly
● reports too many clusters (not natural clustering result)
(figure: GMM fit)
12. 1. Motivation
● neural network fits data well
● but, difficult to interpret in high dimensions (lack interpretability)
(figure: GMM fit vs. density network (VAE) fit)
13. 1. Motivation
● neural network fits data well
● but, difficult to interpret in high dimensions (lack interpretability)
● does not explicitly represent discrete mixture components
(figure: GMM fit vs. density network (VAE) fit)
An appropriate model might switch between discrete states
14. 1. Motivation
● How about combining both? (Graphical model + Deep Learning)
● Structured Variational AutoEncoder (SVAE)
(figure: GMM, density network (VAE), and SVAE fits)
16. 1. Motivation
● Q: Graphical model vs Deep Learning, pros and cons?
● Graphical models specify explicit relationships between variables before learning
○ a graphical model is configured from a higher level of abstraction (deduction)
○ a deep learning model is configured from a lower level (induction)
17. 1. Motivation
● Graphical model
○ + interpretable, structured
representations
○ + data and computational efficiency
○ - strong assumptions may not fit
○ - feature engineering
○ - top-down inference
● Deep learning
○ - not directly interpretable structure
○ - can require lots of data
○ + flexible representations, learn
automatically
○ + feature learning
○ + recognition networks (bottom-up)
18. 2. Modeling idea
● graphical models on latent variables
○ structured probability distributions
○ fast exact inference subroutines
● neural network models (VAE) for observations
○ produce a flexible non-linear feature manifold
■ map nonlinear high-dimensional data to low-dimensional, dense representations
○ recognition network
■ instead of learning variational distribution parameters directly,
■ maps observations to conjugate graphical model potentials
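A hedged sketch of the last point, assuming a 1-D Gaussian latent variable: a tiny, randomly initialized (untrained, hypothetical) recognition network maps an observation y to the natural parameters of a Gaussian potential, which the graphical-model side can combine with a conjugate N(0, 1) prior simply by adding natural parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-hidden-layer recognition network: y -> (eta1, eta2),
# the natural parameters of a 1-D Gaussian potential on the latent x.
W1 = rng.normal(size=(8, 2)) * 0.1
b1 = np.zeros(8)
W2 = rng.normal(size=(2, 8)) * 0.1
b2 = np.zeros(2)

def recognition_potential(y):
    h = np.tanh(W1 @ y + b1)
    eta1, raw = W2 @ h + b2
    eta2 = -np.exp(raw)  # second natural parameter of a Gaussian must be negative
    return np.array([eta1, eta2])

# Conjugate combination: natural parameters of Gaussians simply add, so the
# N(0, 1) prior (natural params [0, -1/2]) times the observation potential
# is again a Gaussian over the latent x -- no extra inference machinery needed.
prior_eta = np.array([0.0, -0.5])
post_eta = prior_eta + recognition_potential(np.array([1.0, -2.0]))
post_var = -1.0 / (2.0 * post_eta[1])   # recover mean/variance from natural params
post_mean = post_eta[0] * post_var
print(post_mean, post_var)
```

The design point the slides make is visible here: because the network outputs conjugate potentials rather than a finished variational distribution, the graphical model's fast exact-inference subroutines still apply.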
19. 3. Structured Variational AutoEncoder (SVAE)
Under the exponential family conjugacy property we can define the SVAE as below,
p(θ) = exp{⟨η_θ, t_θ(θ)⟩ − log Z_θ(η_θ)}, p(x|θ) = exp{⟨η_x(θ), t_x(x)⟩ − log Z_x(η_x(θ))},
where p(θ) is the prior distribution and p(x|θ) is the likelihood.
The statistic function is defined as t_θ(θ) = (η_x(θ), −log Z_x(η_x(θ))), and the partition function is log Z_θ(η_θ).
Finally, we would like to infer the marginal likelihood p(x).
20. 3. Structured Variational AutoEncoder (SVAE)
Under the exponential family conjugacy property we can define the SVAE as below,
p(θ) = exp{⟨η_θ, t_θ(θ)⟩ − log Z_θ(η_θ)}, p(x|θ) = exp{⟨η_x(θ), t_x(x)⟩ − log Z_x(η_x(θ))},
where p(θ) is the prior distribution and p(x|θ) is the likelihood.
The statistic function is defined as t_θ(θ) = (η_x(θ), −log Z_x(η_x(θ))), and the partition function is log Z_θ(η_θ).
Finally, we would like to infer the marginal likelihood p(x).
Discussion Point:
Can you tell the fundamental difference between the VAE and the SVAE?
21. 3. Structured Variational AutoEncoder (SVAE)
Under the exponential family conjugacy property we can define the SVAE as below,
p(θ) = exp{⟨η_θ, t_θ(θ)⟩ − log Z_θ(η_θ)}, p(x|θ) = exp{⟨η_x(θ), t_x(x)⟩ − log Z_x(η_x(θ))},
where p(θ) is the prior distribution and p(x|θ) is the likelihood.
The statistic function is defined as t_θ(θ) = (η_x(θ), −log Z_x(η_x(θ))), and the partition function is log Z_θ(η_θ).
Finally, we would like to infer the marginal likelihood p(x).
Discussion Point:
Can you tell the fundamental difference between the VAE and the SVAE? A: the conjugacy property
22. 4. Background : conjugate distributions (VAE vs SVAE)
What is a conjugate distribution?
If the posterior distribution p(θ|x) is in the same probability distribution family as the prior probability
distribution p(θ), the prior and posterior are called conjugate distributions.
Example
If the likelihood function is a Poisson distribution, choosing a Gamma prior over the rate parameter λ will ensure that
the posterior distribution is also a Gamma distribution.
(figure: posterior and prior plots: Poisson with λ = 4; Gamma prior with k = 4 and θ = 1)
23. 4. Background : conjugate distributions
What is a conjugate distribution?
If the posterior distribution p(θ|x) is in the same probability distribution family as the prior probability
distribution p(θ), the prior and posterior are called conjugate distributions.
Example
If the likelihood function is a Poisson distribution, choosing a Gamma prior over the rate parameter λ will ensure that
the posterior distribution is also a Gamma distribution.
Likelihood: Poisson over observations x_i (assume i = 1, …, 6)
Prior: Gamma with k = 10 and θ = 0.5, which updates to the Gamma posterior
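Assuming the slide's Gamma(shape k, scale θ) prior on the Poisson rate λ, the conjugate update has a simple closed form: posterior shape k' = k + Σᵢ xᵢ and posterior scale θ' = θ / (nθ + 1). A minimal sketch, with six illustrative counts (not from the slides):

```python
import numpy as np

def gamma_poisson_update(k, theta, counts):
    """Conjugate update of a Gamma(shape=k, scale=theta) prior on a
    Poisson rate, given observed counts."""
    counts = np.asarray(counts)
    n = len(counts)
    k_post = k + counts.sum()            # shape absorbs the total count
    theta_post = theta / (n * theta + 1.0)  # scale shrinks with sample size
    return k_post, theta_post

# Prior Gamma(k=10, theta=0.5) as on the slide; counts are illustrative.
k_post, theta_post = gamma_poisson_update(10, 0.5, [3, 5, 4, 2, 6, 4])
post_mean = k_post * theta_post  # posterior mean of the rate
print(k_post, theta_post, post_mean)  # 34 0.125 4.25
```

No integration is required: conjugacy turns posterior inference into two parameter updates, which is exactly the property the SVAE exploits on its latent graphical model.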
24. 4. Background : conjugate distributions
What is a conjugate distribution?
If the posterior distribution p(θ|x) is in the same probability distribution family as the prior probability
distribution p(θ), the prior and posterior are called conjugate distributions.
We estimate the posterior by updating the parameters of our prior,
reflecting a new mean and confidence level
25. 4. Background : conjugate distributions
What is a conjugate distribution?
If the posterior distribution p(θ|x) is in the same probability distribution family as the prior probability
distribution p(θ), the prior and posterior are called conjugate distributions.
We estimate the posterior by updating the parameters of our prior,
reflecting a new mean and confidence level
Discussion Point:
Why is this property important in the SVAE?
26. 4. Background : conjugate distributions
What is a conjugate distribution?
If the posterior distribution p(θ|x) is in the same probability distribution family as the prior probability
distribution p(θ), the prior and posterior are called conjugate distributions.
We estimate the posterior by updating the parameters of our prior,
reflecting a new mean and confidence level
Discussion Point:
Why is this property important in the SVAE?
A: The conjugacy property is useful in Bayesian inference!
27. 4. Background : conjugate distributions
Intractability in the VAE:
the integral of the marginal likelihood p(x) = ∫ p(θ) p(x|θ) dθ is intractable.
Conjugacy in the SVAE (Proposition B.4):
p(θ|x) ∝ exp{⟨η_θ + (t_x(x), 1), t_θ(θ)⟩},
where the posterior p(θ|x) is in the same exponential family as p(θ) with the natural parameter
η_θ + (t_x(x), 1), and t_x, t_θ are the statistic functions.
28. 4. Background : conjugate distributions
Intractability in the VAE:
the integral of the marginal likelihood p(x) = ∫ p(θ) p(x|θ) dθ is intractable.
Conjugacy in the SVAE (Proposition B.4):
p(θ|x) ∝ exp{⟨η_θ + (t_x(x), 1), t_θ(θ)⟩},
where the posterior p(θ|x) is in the same exponential family as p(θ) with the natural parameter
η_θ + (t_x(x), 1), and t_x, t_θ are the statistic functions.
The VAE handles general non-conjugate observation models by introducing a recognition network.
31. 4. Background : conjugate distributions
Intractability in the VAE:
the integral of the marginal likelihood p(x) = ∫ p(θ) p(x|θ) dθ is intractable.
Conjugacy in the SVAE (Proposition B.4):
p(θ|x) ∝ exp{⟨η_θ + (t_x(x), 1), t_θ(θ)⟩}
This relationship is useful in Bayesian inference under the conjugacy property.
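One way to see why conjugacy helps Bayesian inference is to check the closed-form posterior against brute-force numerical integration of prior × likelihood; the two should agree. A minimal sketch using the Gamma-Poisson pair, where the prior parameters, data, and grid bounds are all illustrative assumptions:

```python
import numpy as np

# Gamma(shape k, rate b) prior over a Poisson rate lam, with observed counts.
k, b = 2.0, 1.0
counts = np.array([4, 3, 5])

# Closed-form conjugate posterior: Gamma(k + sum(x), b + n).
k_post, b_post = k + counts.sum(), b + len(counts)
closed_form_mean = k_post / b_post

# Brute force: unnormalized log posterior (prior x likelihood) on a fine grid.
lam = np.linspace(1e-6, 30.0, 200001)
log_post = ((k - 1.0) * np.log(lam) - b * lam                  # Gamma prior, up to a constant
            + counts.sum() * np.log(lam) - len(counts) * lam)  # Poisson likelihood, up to a constant
post = np.exp(log_post - log_post.max())
d = lam[1] - lam[0]
post /= post.sum() * d                  # normalize numerically
grid_mean = (lam * post).sum() * d      # numerical posterior mean

print(closed_form_mean, grid_mean)  # the two means should agree closely
```

The conjugate route is a constant-time parameter update, while the grid route needs an integral per query; in the SVAE this is the difference between reusing fast exact graphical-model subroutines and resorting to generic approximate inference.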
43. https://www.youtube.com/watch?v=9WSb-89UsEo&t=60s
(This video can only be watched on YouTube)
7. Group discussion
Group Discussion Point:
VAE vs SVAE: which model can have better performance? (Is the pitch a strike or a ball?)
(video frames labeled: Strike, None, Strike, Ball)
※ Supplementary material
- For those who are not familiar with the baseball rules.
44. https://www.youtube.com/watch?v=9WSb-89UsEo&t=60s
(This video can only be watched on YouTube)
7. Group discussion
Group Discussion Point:
VAE vs SVAE: which model can have better performance? (Is the pitch a strike or a ball?)
(video frames labeled: Strike, None, Strike, Ball)
※ Supplementary material
- For those who are not familiar with the baseball rules.
Hint or not
45. If the SVAE follows a linear-chain structure,
we expect the SVAE to achieve better accuracy in video classification,
and the VAE to be better for single-image classification.
7. Group discussion
46. If the SVAE follows a linear-chain structure,
we expect the SVAE to achieve better accuracy in video classification,
and the VAE to be better for single-image classification.
7. Group discussion
NO
47. By the way, what is the result?
7. Group discussion
(figure: strike-ball count)
48. By the way, what is the result?
7. Group discussion
(figure: strike-ball count; scoreboard)
49. By the way, what is the result? Strike!!
7. Group discussion
(figure: strike-ball count; scoreboard)
50. By the way, what is the result? Strike!! How did you check the results?
7. Group discussion
(figure: strike-ball count; scoreboard)
51. By the way, what is the result? Strike!! How did you check the results?
I think just a single frame is enough!!
7. Group discussion
(figure: strike-ball count; scoreboard)
52. Just as we check the scoreboard,
the AI also looks at the scoreboard for inference.
In other words, we do not need sequential frames.
(Our expectation) The VAE would be better.
If we mask the scoreboard,
(our expectation) the SVAE would be better.
7. Group discussion
(figure: where the AI is looking, i.e. high attention; mask over the non-observable area)