Approximate Inference
Henrik I. Christensen
Robotics & Intelligent Machines @ GT
Georgia Institute of Technology,
Atlanta, GA 30332-0280
hic@cc.gatech.edu
Outline
1 Introduction
2 Variational Inference
3 Variational Mixture of Gaussians
4 Exponential Family
5 Expectation Propagation
6 Summary
Introduction
We are often required to estimate a (conditional) posterior of the form
p(Z|X)
The solution might be intractable
1 There might not be a closed-form solution
2 The integration over X or a parameter space θ might be computationally challenging
3 The set of possible outcomes might be exponentially large
Two strategies
1 Deterministic Approximation Methods
2 Stochastic Sampling (Monte Carlo Techniques)
Today we will talk about deterministic techniques
Variational Inference
In general we have a Bayesian model as seen earlier, i.e.
ln p(X) = ln p(X, Z) − ln p(Z|X)
We can rewrite this to
ln p(X) = L(q) + KL(q||p)
where
L(q) = \int q(Z) \ln\left\{ \frac{p(X, Z)}{q(Z)} \right\} dZ
KL(q||p) = −\int q(Z) \ln\left\{ \frac{p(Z|X)}{q(Z)} \right\} dZ
So L(q) is a lower bound on ln p(X) (built from the joint distribution), and KL(q||p) is the Kullback-Leibler divergence between q(Z) and the posterior p(Z|X).
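To make the decomposition concrete, here is a minimal numerical sketch (Python; the joint table and q below are purely illustrative values, not from the slides) checking that ln p(X) = L(q) + KL(q||p) for a discrete latent variable:

```python
import numpy as np

# Toy joint p(X = x0, Z = z) over four latent states; values are illustrative only.
p_xz = np.array([0.10, 0.05, 0.20, 0.15])
p_x = p_xz.sum()                      # evidence p(X = x0)
p_z_given_x = p_xz / p_x              # exact posterior p(Z | X = x0)

q = np.array([0.4, 0.2, 0.3, 0.1])    # an arbitrary variational distribution q(Z)

L_q = np.sum(q * np.log(p_xz / q))            # L(q)     = sum_z q ln{ p(X, z) / q(z) }
kl  = -np.sum(q * np.log(p_z_given_x / q))    # KL(q||p) = -sum_z q ln{ p(z|X) / q(z) }

print(np.log(p_x), L_q + kl)          # both print the same number
```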
Factorized Distributions
Assume for now that we can factorize Z into disjoint groups so that
q(Z) = \prod_{i=1}^{M} q_i(Z_i)
In physics a similar model has been adopted, termed mean field theory
We can then optimize L(q) through a component-wise optimization
L(q) = \int \prod_i q_i \left\{ \ln p(X, Z) − \sum_j \ln q_j \right\} dZ
     = \int q_j \ln \tilde{p}(X, Z_j) \, dZ_j − \int q_j \ln q_j \, dZ_j + const
where
\ln \tilde{p}(X, Z_j) = E_{i \neq j}[\ln p(X, Z)] + const, with E_{i \neq j}[\ln p(X, Z)] = \int \ln p(X, Z) \prod_{i \neq j} q_i \, dZ_i
Factorized distributions
The optimal solution is now
\ln q_j^{*}(Z_j) = E_{i \neq j}[\ln p(X, Z)] + const
I.e. each factor q_j is obtained by optimizing L(q) with the remaining factors held fixed, so the bound cannot be improved by changing any single factor alone
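As a concrete illustration, here is a minimal sketch (Python, with arbitrary illustrative target values) of this coordinate-wise update for a factorized approximation q(z) = q_1(z_1) q_2(z_2) to a correlated 2-D Gaussian; each update has exactly the form ln q_j* = E_{i≠j}[ln p] + const, which for a Gaussian target gives a Gaussian factor whose mean depends on the other factor's current mean:

```python
import numpy as np

# Sketch: coordinate-wise mean-field updates for q(z) = q1(z1) q2(z2)
# approximating a correlated 2-D Gaussian p(z) = N(z | mu, inv(Lam)).
# mu, Lam and the initial means are illustrative values, not from the slides.
mu  = np.array([1.0, -1.0])
Lam = np.array([[1.0, 0.8],
                [0.8, 2.0]])          # precision matrix of the target

m = np.array([5.0, 5.0])              # current means of the factors q1, q2

for _ in range(25):
    # ln q1*(z1) = E_{z2}[ln p(z1, z2)] + const  ->  q1 = N(m1, 1/Lam[0,0]) with
    m[0] = mu[0] - (Lam[0, 1] / Lam[0, 0]) * (m[1] - mu[1])
    # symmetric update for q2:
    m[1] = mu[1] - (Lam[1, 0] / Lam[1, 1]) * (m[0] - mu[0])

print(m)   # the factor means converge to the true mean mu;
           # the factor variances are 1/Lam[0,0] and 1/Lam[1,1]
```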
Variational Mixture of Gaussians
We encounter mixtures of Gaussians all the time
Examples are multi-wall modelling, ambiguous localization, ...
We have:
a set of observed data X,
a set of latent variables Z that describe the mixture assignments
Mixture of Gaussians - Modelling
We can model the mixture assignments as
p(Z|π) = \prod_{n=1}^{N} \prod_{k=1}^{K} π_k^{z_{nk}}
We can also derive the observed conditional
p(X|Z, µ, Λ) = \prod_{n=1}^{N} \prod_{k=1}^{K} N(x_n | µ_k, Λ_k^{−1})^{z_{nk}}
We will for now assume that the mixing coefficients have a Dirichlet prior
p(π) = Dir(π|α_0) = C(α_0) \prod_{k=1}^{K} π_k^{α_0 − 1}
Mixture of Gaussians - Modelling
The component processes can be modelled as a Gaussian-Wishart
p(µ, Λ) = p(µ|Λ) p(Λ) = \prod_{k=1}^{K} N(µ_k | m_0, (β_0 Λ_k)^{−1}) \, W(Λ_k | W_0, ν_0)
I.e. a total model with graphical structure π → z_n → x_n ← (µ_k, Λ_k), with x_n and z_n inside a plate over n = 1, . . . , N
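As a sanity check on the model specification, the following sketch (Python with scipy; all hyperparameter and size values are illustrative assumptions, not taken from the slides) draws one data set from this generative model, sampling π from the Dirichlet, each (µ_k, Λ_k) from the Gaussian-Wishart, and then the assignments and observations:

```python
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(0)
K, N, D = 3, 500, 2                               # illustrative sizes
alpha0, beta0, nu0 = 1.0, 1.0, float(D)           # illustrative hyperparameters
m0, W0 = np.zeros(D), np.eye(D)

pi = rng.dirichlet(alpha0 * np.ones(K))           # pi ~ Dir(alpha0)
Lam = wishart(df=nu0, scale=W0).rvs(size=K, random_state=0)   # Lambda_k ~ W(W0, nu0)
mu = np.array([rng.multivariate_normal(m0, np.linalg.inv(beta0 * Lam[k]))
               for k in range(K)])                # mu_k ~ N(m0, (beta0 Lambda_k)^-1)

z = rng.choice(K, size=N, p=pi)                   # z_n ~ Categorical(pi)
X = np.array([rng.multivariate_normal(mu[k], np.linalg.inv(Lam[k])) for k in z])
print(X.shape)                                    # (500, 2)
```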
Mixtures of Gaussians - Variational
The full joint model can be written as
p(X, Z, π, µ, Λ) = p(X|Z, µ, Λ) p(Z|π) p(π) p(µ|Λ) p(Λ)
Only X is observed
We can now consider the selection of a distribution
q(Z, π, µ, Λ) = q(Z) q(π, µ, Λ)
this is clearly an assumption of independence.
We can use the general result of component-wise optimization
\ln q^{*}(Z) = E_{π,µ,Λ}[\ln p(X, Z, π, µ, Λ)] + const
Decomposition gives us
\ln q^{*}(Z) = E_{π}[\ln p(Z|π)] + E_{µ,Λ}[\ln p(X|Z, µ, Λ)] + const
\ln q^{*}(Z) = \sum_{n=1}^{N} \sum_{k=1}^{K} z_{nk} \ln ρ_{nk} + const
Mixtures of Gaussians - Variational
Expanding the expectations we obtain
\ln ρ_{nk} = E[\ln π_k] + \frac{1}{2} E[\ln |Λ_k|] − \frac{D}{2} \ln 2π − \frac{1}{2} E_{µ_k,Λ_k}[(x_n − µ_k)^T Λ_k (x_n − µ_k)] + const
Taking the exponential we have
q^{*}(Z) ∝ \prod_{k=1}^{K} \prod_{n=1}^{N} ρ_{nk}^{z_{nk}}
Using normalization we arrive at
q^{*}(Z) = \prod_{k=1}^{K} \prod_{n=1}^{N} r_{nk}^{z_{nk}}
where
r_{nk} = \frac{ρ_{nk}}{\sum_j ρ_{nj}}
Mixtures of Gaussians - Variational
Just as we saw for EM we can define
N_k = \sum_{n=1}^{N} r_{nk}
\bar{x}_k = \frac{1}{N_k} \sum_{n=1}^{N} r_{nk} x_n
S_k = \frac{1}{N_k} \sum_{n=1}^{N} r_{nk} (x_n − \bar{x}_k)(x_n − \bar{x}_k)^T
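These statistics are straightforward to compute from the responsibility matrix; a minimal sketch (Python, assuming r is an N×K array of responsibilities, already normalized over k as defined above):

```python
import numpy as np

def mixture_statistics(X, r):
    """X: (N, D) data, r: (N, K) responsibilities with rows summing to one."""
    Nk = r.sum(axis=0)                              # N_k = sum_n r_nk
    xbar = (r.T @ X) / Nk[:, None]                  # xbar_k = (1/N_k) sum_n r_nk x_n
    diff = X[None, :, :] - xbar[:, None, :]         # (K, N, D) deviations x_n - xbar_k
    # S_k = (1/N_k) sum_n r_nk (x_n - xbar_k)(x_n - xbar_k)^T
    Sk = np.einsum('kn,knd,kne->kde', r.T, diff, diff) / Nk[:, None, None]
    return Nk, xbar, Sk
```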
Mixtures of Gaussians - Parameters/Mixture
Let's now consider q(π, µ, Λ) to arrive at
\ln q^{*}(π, µ, Λ) = \ln p(π) + \sum_{k=1}^{K} \ln p(µ_k, Λ_k) + E_Z[\ln p(Z|π)] + \sum_{k=1}^{K} \sum_{n=1}^{N} E[z_{nk}] \ln N(x_n | µ_k, Λ_k^{−1}) + const
We can partition the problem into
q(π, µ, Λ) = q(π) \prod_{k=1}^{K} q(µ_k, Λ_k)
We can derive
\ln q^{*}(π) = (α_0 − 1) \sum_{k=1}^{K} \ln π_k + \sum_{k=1}^{K} \sum_{n=1}^{N} r_{nk} \ln π_k + const
We can now derive
q^{*}(π) = Dir(π|α)
where
α_k = α_0 + N_k
Mixtures of Gaussians - Parameters/Mixture
We can then derive
q^{*}(µ_k, Λ_k) = N(µ_k | m_k, (β_k Λ_k)^{−1}) \, W(Λ_k | W_k, ν_k)
where
β_k = β_0 + N_k
m_k = \frac{1}{β_k} (β_0 m_0 + N_k \bar{x}_k)
W_k^{−1} = W_0^{−1} + N_k S_k + \frac{β_0 N_k}{β_0 + N_k} (\bar{x}_k − m_0)(\bar{x}_k − m_0)^T
ν_k = ν_0 + N_k + 1
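Translating these update equations directly into code gives the variational M step for the Gaussian-Wishart factors; a minimal sketch in Python (the hyperparameters beta0, m0, W0, nu0 and the statistics Nk, xbar, Sk are assumed defined as above, with shapes (K,), (K, D), (K, D, D); the +1 in nu_k follows the slide):

```python
import numpy as np

def update_gaussian_wishart(Nk, xbar, Sk, beta0, m0, W0, nu0):
    """Variational updates for q(mu_k, Lambda_k), vectorized over the K components."""
    beta_k = beta0 + Nk                                            # (K,)
    m_k = (beta0 * m0 + Nk[:, None] * xbar) / beta_k[:, None]      # (K, D)
    diff = xbar - m0                                               # (K, D)
    W_k_inv = (np.linalg.inv(W0)
               + Nk[:, None, None] * Sk
               + (beta0 * Nk / (beta0 + Nk))[:, None, None]
               * diff[:, :, None] * diff[:, None, :])              # (K, D, D)
    nu_k = nu0 + Nk + 1                                            # as written on the slide
    return beta_k, m_k, np.linalg.inv(W_k_inv), nu_k
```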
Mixtures of Gaussians - Parameters
We can now evaluate the required expectations
E_{µ_k,Λ_k}[(x_n − µ_k)^T Λ_k (x_n − µ_k)] = D β_k^{−1} + ν_k (x_n − m_k)^T W_k (x_n − m_k)
\ln \tilde{Λ}_k ≡ E[\ln |Λ_k|] = \sum_{i=1}^{D} ψ\left(\frac{ν_k + 1 − i}{2}\right) + D \ln 2 + \ln |W_k|
\ln \tilde{π}_k ≡ E[\ln π_k] = ψ(α_k) − ψ(\hat{α})
Here ψ(·) is the digamma function, defined as d/da \ln Γ(a), and \hat{α} = \sum_k α_k. The last two results follow from the properties of the Wishart and Dirichlet distributions.
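These expectations are easy to evaluate with scipy's digamma; a minimal sketch (alpha, nu_k and W_k are assumed to come from the updates above, with shapes (K,), (K,), (K, D, D)):

```python
import numpy as np
from scipy.special import digamma

def expected_log_parameters(alpha, nu_k, W_k):
    """E[ln pi_k] and E[ln |Lambda_k|] for the Dirichlet and Wishart factors."""
    ln_pi_tilde = digamma(alpha) - digamma(alpha.sum())
    D = W_k.shape[-1]
    i = np.arange(1, D + 1)
    ln_Lam_tilde = (digamma((nu_k[:, None] + 1 - i) / 2).sum(axis=1)
                    + D * np.log(2)
                    + np.log(np.linalg.det(W_k)))
    return ln_pi_tilde, ln_Lam_tilde
```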
Mixtures of Gaussians - Parameters
We can finally find the responsibilities
r_{nk} ∝ \tilde{π}_k \tilde{Λ}_k^{1/2} \exp\left\{ −\frac{1}{2} E_{µ_k,Λ_k}[(x_n − µ_k)^T Λ_k (x_n − µ_k)] \right\}
The optimization is stepwise
1 Given the current q(π) and q(µ, Λ), evaluate the responsibilities r_{nk} (variational E step)
2 Given the responsibilities, update q(π) and q(µ_k, Λ_k) (variational M step)
3 Check for convergence - return to 1 if not converged
Mixture of Gaussians - Example
[Figure: variational mixture fits to the example data at iterations 0, 15, 60 and 120]
MoG - Variational Lower Bound
We can estimate the best fit / lower bound
L = E[\ln p(X|Z, µ, Λ)] + E[\ln p(Z|π)] + E[\ln p(π)] + E[\ln p(µ, Λ)] − E[\ln q(Z)] − E[\ln q(π)] − E[\ln q(µ, Λ)]
E[\ln p(X|Z, µ, Λ)] = \frac{1}{2} \sum_k N_k \left\{ \ln \tilde{Λ}_k − D β_k^{−1} − ν_k \mathrm{Tr}(S_k W_k) − ν_k (\bar{x}_k − m_k)^T W_k (\bar{x}_k − m_k) − D \ln 2π \right\}
E[\ln p(Z|π)] = \sum_n \sum_k r_{nk} \ln \tilde{π}_k
E[\ln p(π)] = \ln C(α_0) + (α_0 − 1) \sum_k \ln \tilde{π}_k
⋮ (remaining terms: see book)
Exponential Family Distribution
Recall from 3rd lecture:
Exponential family
p(x|η) = h(x) g(η) \exp\{ η^T u(x) \}
where η represents the “natural parameters”
g(η) is the normalization “factor”
u(x) is some general function of the data
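As a small worked example (a sketch using the Bernoulli case, which is standard but not on the slide): p(x|µ) = µ^x (1−µ)^{1−x} fits this form with h(x) = 1, u(x) = x, natural parameter η = ln(µ/(1−µ)) and g(η) = 1/(1 + e^η):

```python
import numpy as np

mu = 0.3                                  # illustrative Bernoulli parameter
eta = np.log(mu / (1 - mu))               # natural parameter
g = 1.0 / (1.0 + np.exp(eta))             # normalization factor g(eta) = 1 - mu

for x in (0, 1):
    direct = mu**x * (1 - mu)**(1 - x)    # standard parameterization
    expfam = g * np.exp(eta * x)          # h(x) g(eta) exp{eta u(x)} with h(x) = 1, u(x) = x
    print(x, direct, expfam)              # the two columns agree
```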
Exponential Family Distribution
The joint distribution for observed and latent variables is then
p(X, Z|η) = \prod_{n=1}^{N} h(x_n, z_n) \, g(η) \exp\{ η^T u(x_n, z_n) \}
The conjugate prior for η is then
p(η|ν_0, χ_0) = f(ν_0, χ_0) \, g(η)^{ν_0} \exp\{ ν_0 η^T χ_0 \}
where ν_0 is the effective prior number of observations and χ_0 represents the prior sufficient
statistics (moments)
Exponential Family Distribution - Variational
As before we can compute
\ln q^{*}(Z) = E_η[\ln p(X, Z|η)] + const
            = \sum_n \left\{ \ln h(x_n, z_n) + E[η^T] u(x_n, z_n) \right\} + const
i.e. a sum of independent terms
Taking the exponential on both sides we have
q^{*}(z_n) = h(x_n, z_n) \, g(E[η]) \exp\{ E[η^T] u(x_n, z_n) \}
Exponential Family Distribution - Variational
Similarly the natural parameters can be optimized by
\ln q^{*}(η) = \ln p(η|ν_0, χ_0) + E_Z[\ln p(X, Z|η)] + const
Which expands to
\ln q^{*}(η) = ν_0 \ln g(η) + η^T χ_0 + \sum_n \left\{ \ln g(η) + η^T E_{z_n}[u(x_n, z_n)] \right\} + const
Using the trick of exponentials on both sides we have
q^{*}(η) = f(ν_N, χ_N) \, g(η)^{ν_N} \exp\{ η^T χ_N \}
where
ν_N = ν_0 + N,   χ_N = χ_0 + \sum_n E_{z_n}[u(x_n, z_n)]
Exponential Family Distribution - Variational
As expected the solution is iterative
q^{*}(z_n) and q^{*}(η) are coupled.
In the E-step-like update, compute the expected sufficient statistics E[u(x_n, z_n)] under q(z_n) and use them to update q(η)
In the M-step-like update, use E[η] from q(η) to re-estimate q(z_n)
Expectation Propagation
Fundamentally we are trying to match distributions to the data and
match up the natural parameters. I.e. find the “best” family of
distributions and at the same time fit the parameters.
In the end we are trying to minimize the Kullback-Leibler (KL) divergence with
respect to q(z)
Consider for a minute KL(p||q) where p(z) is fixed and q(z) is a
member of the exponential family
q(z) = h(z) g(η) \exp\{ η^T u(z) \}
Expectation Propagation - Optimization
The Kullback-Leibler divergence is then
KL(p||q) = − \ln g(η) − η^T E_{p(z)}[u(z)] + const
The extremum is then given by
−∇ \ln g(η) = E_{p(z)}[u(z)]
i.e. the best estimate is obtained by matching the expected sufficient statistics of q(z) to
those of p(z) (moment matching).
E.g. for q(z) = N(z|µ, Σ) this means matching the mean and covariance of p(z)
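A minimal sketch of this moment-matching result (Python; the target samples are illustrative): for a Gaussian q, minimizing KL(p||q) simply sets the mean and variance of q to those of p, here estimated from samples of a non-Gaussian p:

```python
import numpy as np

rng = np.random.default_rng(1)

# Samples from a fixed, non-Gaussian target p(z): an illustrative two-component mixture.
z = np.concatenate([rng.normal(-2.0, 0.5, size=5000),
                    rng.normal( 3.0, 1.0, size=5000)])

# The KL(p||q)-optimal Gaussian q(z) = N(mu, sigma^2) matches the moments of p:
mu_q = z.mean()
var_q = z.var()
print(mu_q, var_q)   # mean and variance of the whole mixture, not of either component
```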
Expectation Propagation - Modelling
Consider a model with factorized probabilities
p(D, θ) = \prod_i f_i(θ)
where f_i(θ) = p(x_i|θ) and one factor might be a prior f_0(θ) = p(θ).
The posterior is then
p(θ|D) = \frac{1}{p(D)} \prod_i f_i(θ)
The model evidence is given by
p(D) = \int \prod_i f_i(θ) \, dθ
Expectation Propagation - Computing
The estimate is then
q(θ) = \frac{1}{Z} \prod_i \tilde{f}_i(θ)
q(θ) is factorized so that each term can be optimized in turn
By optimizing factor-by-factor it is possible to refine the
estimate - take one factor out, optimize it, and put it back
Expectation Propagation - Algorithm
Initialize the factor approximations \tilde{f}_i(θ)
Initialize the posterior estimate q(θ) ∝ \prod_i \tilde{f}_i(θ)
Iterate (a minimal numeric sketch follows below):
1 Choose a factor \tilde{f}_j(θ) to refine
2 Remove \tilde{f}_j(θ) from the posterior to form the cavity distribution q^{\setminus j}(θ) ∝ q(θ) / \tilde{f}_j(θ)
3 Evaluate the new posterior / sufficient statistics by moment matching against q^{\setminus j}(θ) f_j(θ)
4 Update the factor \tilde{f}_j(θ)
5 Evaluate the approximation
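Below is a minimal runnable sketch of this loop (Python; not from the slides). To keep it verifiable, the toy model uses Gaussian likelihood factors for a scalar θ, so the moment matching in step 3 is exact and EP recovers the exact Gaussian posterior; the sites and the global approximation are stored as natural parameters (precision, precision × mean), so removing and re-inserting a factor is just subtraction and addition:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(1.5, 1.0, size=20)          # illustrative observations, likelihood N(x_n | theta, 1)

tau0, nu0 = 1.0, 0.0                        # Gaussian prior N(theta | 0, 1) in natural parameters
site_tau = np.zeros(len(x))                 # site approximations f~_n, initialized to unity (flat)
site_nu = np.zeros(len(x))
tau_q, nu_q = tau0 + site_tau.sum(), nu0 + site_nu.sum()   # q(theta) = prior * product of sites

for _ in range(3):                          # a few EP sweeps
    for n in range(len(x)):
        # step 2: remove site n to form the cavity q^{\n}
        tau_cav, nu_cav = tau_q - site_tau[n], nu_q - site_nu[n]
        # step 3: moment-match q to cavity * f_n; exact here because f_n is Gaussian in theta
        tau_new, nu_new = tau_cav + 1.0, nu_cav + x[n]
        # step 4: the updated site is whatever multiplies the cavity into the new q
        site_tau[n], site_nu[n] = tau_new - tau_cav, nu_new - nu_cav
        tau_q, nu_q = tau_new, nu_new

print(nu_q / tau_q, 1.0 / tau_q)                     # EP posterior mean / variance
print(x.sum() / (1 + len(x)), 1.0 / (1 + len(x)))    # exact posterior for comparison
```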
Expectation Propagation - Example
[Figure: the clutter problem - a graphical model θ → x and example data on an axis from −5 to 10]
p(x|θ) = (1 − w) N(x|θ, I) + w N(x|0, aI)
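For intuition, here is a minimal sketch that draws data from a one-dimensional version of this clutter model (Python; the values of θ, w, a and N are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
theta, w, a, N = 2.0, 0.25, 10.0, 200        # illustrative values

clutter = rng.random(N) < w                  # with probability w the point is background clutter
x = np.where(clutter,
             rng.normal(0.0, np.sqrt(a), size=N),   # clutter component N(0, a)
             rng.normal(theta, 1.0, size=N))        # signal component N(theta, 1)
print(x[:5])
```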
Expectation Propagation - Example
[Figure: two panels over θ (axis −5 to 10) illustrating the EP approximation for the clutter-problem posterior]
Summary
Often computation of the complete model is a challenge
Two ways to approximate computations
Deterministic Approximations
Sampling Based Methods
Many tricks for approximation
Factorization is typically a first strategy
Iterative optimization of factors
Next time we will talk about sampling based methods