This document discusses approximate inference in Bayesian networks using sampling methods. It introduces random number generation, which is important for sampling algorithms. Random number generators in programming languages typically generate uniform random numbers, but different distributions are needed for sampling Bayesian networks. The document covers generating random numbers from univariate and multivariate distributions to estimate probabilities for approximate inference in Bayesian networks.
07 approximate inference in bn
1. Bayesian Networks
Unit 7 Approximate Inference
in Bayesian Networks
Wang, Yuan-Kai, 王元凱
ykwang@mails.fju.edu.tw
http://www.ykwang.tw
Department of Electrical Engineering, Fu Jen Univ.
輔仁大學電機工程系
2006~2011
Reference this document as:
Wang, Yuan-Kai, "Approximate Inference in Bayesian Networks,"
Lecture Notes of Wang, Yuan-Kai, Fu Jen University, Taiwan, 2011.
Fu Jen University Department of Electrical Engineering Wang, Yuan-Kai Copyright
2. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 2
Goal of This Unit
• P(X|e) inference for Bayesian networks
• Why approximate inference
– Exact inference is too slow because of
exponential complexity
• Using approximate approaches
– Sampling methods
• Likelihood weighting sampling
• Markov Chain Monte Carlo sampling
– Loopy belief propagation
– Variational method
3. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 3
Related Units
• Background
– Probabilistic graphical model
– Exact inference in BN
• Next units
– Probabilistic inference over time
4. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 4
Self-Study References
• Chapter 14, Artificial Intelligence: A Modern
Approach, 2nd ed., by S. Russell & P. Norvig, Prentice
Hall, 2003.
• Inference in Bayesian networks, B. D’Ambrosio, AI
Magazine, 1999.
• Probabilistic Inference in graphical models, M. I.
Jordan & Y. Weiss.
• An introduction to MCMC for machine learning.
Andrieu, C., de Freitas, N., Doucet, A., & Jordan,
M. I., Machine Learning, vol. 50, pp. 5-43, 2003.
• Computational Statistics Handbook with Matlab,
W. L. Martinez and A. R. Martinez, Chapman &
Hall/CRC, 2002
– Chapter 3 Sampling Concepts
– Chapter 4 Generating Random Variables
5. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 5
Structure of Related Lecture Notes
[Figure: overview diagram of the lecture units, drawn around an example network with nodes B, E, A, J, M and distributions P(B), P(E), P(A|B,E), P(J|A), P(M|A)]
• Problem: PGM representation
– Unit 5 : BN
– Unit 9 : Hybrid BN
– Units 10~15: Naïve Bayes, MRF, HMM, DBN, Kalman filter
• Data: Learning (structure learning, parameter learning)
– Units 16~ : MLE, EM
• Query: Inference
– Unit 6: Exact inference
– Unit 7: Approximate inference
– Unit 8: Temporal inference
6. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 6
Contents
1. Sampling ............................ 11
2. Random Number Generator ............. 20
3. Stochastic Simulation ............... 70
4. Markov Chain Monte Carlo ............ 113
5. Loopy Belief Propagation ............ 145
6. Variational Methods ................. 146
7. Implementation ...................... 147
8. Summary ............................. 148
9. References .......................... 151
7. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 7
4 Steps of Inference
• Step 1: Bayes' theorem
$P(X \mid E=e) = \frac{P(X, E=e)}{P(E=e)} \propto P(X, E=e)$
• Step 2: Marginalization
$= \sum_{h \in H} P(X, E=e, H=h)$
• Step 3: Conditional independence
$= \sum_{h \in H} \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
• Step 4: Product sum computation (Enumeration)
– Exact inference
– Approximate inference
8. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 8
Five Types of Queries in Inference
• For a probabilistic graphical model G
• Given a set of evidence E=e
• Query the PGM with
– P(e) : Likelihood query
– arg max P(e) :
Maximum likelihood query
– P(X|e) : Posterior belief query
– $\arg\max_x P(X=x \mid e)$ : (Single query variable)
Maximum a posteriori (MAP) query
– $\arg\max_{x_1, \ldots, x_k} P(X_1=x_1, \ldots, X_k=x_k \mid e)$ :
Most probable explanation (MPE) query
9. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 9
Approximate Inference
vs. Exact Inference
• Exact inference: P(X|E) = 0.71828
– Get the exact probability value
– Using the inference steps derived from
probability formulas
– Needs exponential time complexity
• Approximate inference: P(X|E) ≈ 0.71
– Get an approximate probability value
– Using sampling methods
– Needs only polynomial time complexity,
fast computation
10. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 10
Why Approximate Inference
• Large treewidth
– Large, highly connected graphical models
– Treewidth may be large (>40) even in sparse
networks
• In many applications, approximations are
sufficient
– Example: P(X = x|e) = 0.3183098861
– Maybe P(X = x|e) ≈ 0.3 is a good enough
approximation
– e.g., we take action only if P(X=x|e) > 0.5
11. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 11
1. Sampling
• 1.1 What Is Sampling
• 1.2 Sampling for Inference
12. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 12
Basic Idea of Sampling
• Why sampling
– Estimate some values by random number
generation
1. Sampling
– Random number generating
– Draw N samples from a known distribution P
– Generate N random numbers from a known
distribution S
2. Estimation
– Compute an approximate probability $\hat{P}$, which
approximates the real posterior probability
P(X|E)
13. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 13
1.1 What Is Sampling
• A very simple example with a random
variable : coin toss
– Tossing the coin, get head or tail
– It is a Boolean R.V.
• coin = head or tail
– If the coin is unbiased, head and tail have
equal probability
• A prior probability distribution
P(Coin) = <0.5, 0.5>
• Uniform distribution
– Assume we have a coin but we do not
know whether it is unbiased
14. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 14
Sampling of Coin Toss
• Sampling in this example
= flipping the coin N times
– e.g., N=1000 times
– Each flip yields one sample
– Ideally, 500 heads, 500 tails
• P(head) = 500/1000=0.5
P(tail) = 500/1000=0.5
– In practice, e.g., 501 heads, 499 tails
• P(head) = 501/1000=0.501
P(tail) = 499/1000=0.499
• After the sampling,
– We can estimate probability distribution
– Check if it is biased
15. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 15
Sampling & Estimation (Math)
• For a Boolean random variable X
– P(X) is the prior distribution
= <P(x), P(¬x)>
– Use a sampling algorithm to generate N
samples
– Say N(x) is the number of samples in which X is
true, and N(¬x) the number in which X is false
$\hat{P}(x) = \frac{N(x)}{N}, \quad \hat{P}(\neg x) = \frac{N(\neg x)}{N}$
$\lim_{N \to \infty} \frac{N(x)}{N} = P(x), \quad \lim_{N \to \infty} \frac{N(\neg x)}{N} = P(\neg x)$
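The estimate N(x)/N can be tried out on a biased coin. The following is a minimal sketch, not part of the original notes: the function names sample_bernoulli and estimate_probability are hypothetical, and the probability p is supplied by the caller.

```c
#include <stdlib.h>

/* Draw one Boolean sample that is true (1) with probability p. */
int sample_bernoulli(double p)
{
    /* u is uniform in [0, 1) because the divisor is RAND_MAX + 1 */
    double u = rand() / (RAND_MAX + 1.0);
    return u < p ? 1 : 0;
}

/* Estimate P(x) as N(x)/N from n samples of a Boolean variable
   whose true probability is p. */
double estimate_probability(double p, int n)
{
    int i, count_true = 0;
    for (i = 0; i < n; i++)
        count_true += sample_bernoulli(p);
    return (double) count_true / n;
}
```

Calling estimate_probability(0.3, n) with growing n should give values that settle near 0.3, which is the limit stated above. The estimate of P(¬x) is one minus the returned value.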
16. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 16
1.2 Sampling for Inference
• Given a Bayesian network G including
(X1, …, Xn)
– We get a joint probability distribution
$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
• For a query P(X|E=e)
– $P(X \mid e) \propto \sum_{h \in H} \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
– It is hard to compute
• Need exponential time in number of Xi
– We will try to use sampling to compute it
17. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 17
Compute P(X|e) by Sampling
• Sampling (explained in Sections 2, 3, 4)
– Generate N samples of
$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
• Estimation
– Use the N samples to estimate
P(X,e) ≈ N(X,e)/N
– Use the N samples to estimate P(e) ≈ N(e)/N
– Estimate P(X|e) by P(X,e) / P(e)
18. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 18
What Is Sampling Algorithm
• The algorithm to
– Generate samples from a known
probability distribution P
– Estimate the approximate probability $\hat{P}$
19. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 19
Various Sampling Algorithms
• Stochastic simulation (Section 3)
– Direct Sampling
– Rejection sampling
• Reject samples disagreeing with evidence
– Likelihood weighting
• Use evidence to weight samples
• Markov chain Monte Carlo (MCMC) (Section 4)
– Sample from a stochastic process whose
stationary distribution is the true posterior
20. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 20
2. Random Number Generator
• Very important for sampling algorithm
• Introduce basic concepts related to
sampling of Bayesian networks
• Subsections
– 2.1 Univariate
– 2.2 Multivariate
21. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 21
RNG In Programming Languages
• Random number generator (RNG)
– C/C++: rand()
– Java: Math.random()
– Matlab: rand()
• Why should we discuss it?
– They generate random numbers with
uniform distribution
– How to generate
• Gaussian, …
• Multivariate, dependent random
variables
• Non-closed-form distribution?
22. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 22
Generate a Random Number (1/2)
• Examples in C
– int i = rand();
– Returns an integer in 0 ~ RAND_MAX
(RAND_MAX is implementation-defined, at least 32767)
– It generates integers
• Generate a random number
between 1 and n (n ≤ RAND_MAX)
– int i = 1 + ( rand() % n );
– (rand() % n) returns a number between 0
and n - 1
– Add 1 to make the random number between 1
and n
– It generates integers, not real numbers
23. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 23
Generate a Random Number (2/2)
• Ex: integer between 1 and 6
– 1 + ( rand() % 6 )
• Ex: real number between 0 and 1
– double r = (double) rand() / RAND_MAX;
(the cast is needed: rand() / RAND_MAX is an integer division)
• Exercise
– Real number between 10 and 20
24. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 24
Generate
Many Random Numbers Repeatedly
• Using loop for repeated generation
– for (int i=0; i<1000; i++)
{ rand(); }
– int i, j[1000];
for (i=0; i<1000; i++)
{ j[i] = 1 + rand() % 6; }
rand() generates a number uniformly
Uniform distribution
25. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 25
Why Generate Random Numbers
• Simulate random behavior
• Make random decision
• Estimate some values
26. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 26
Random Behavior/Decision (1/2)
• Flip a coin for decision (Boolean)
– Fair: each face has equal probability
– int coin_face;
if (rand() > RAND_MAX/2)
coin_face = 1;
else coin_face = 0;
– int coin_face;
coin_face = rand() % 2;
27. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 27
Random Behavior/Decision (2/2)
• Random decision of multiple choices
– Discrete random variable
• Ex: roll a die (uniform distribution)
– Fair: each face has equal probability
• int die_face; //Random variable, faces numbered 0~5
die_face = rand() % 6;
28. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 28
Estimation
• If we can simulate a random behavior
• We can estimate some values
– First, we repeat the random behavior
– Then we estimate the value
29. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 29
Example: The Coin Toss
• Flip the coin 1000 times to estimate the
fairness of the coin
– int i, coin_face; //Random variable
int frequency[2] = {0, 0}; //Counts of face 0 and face 1
for (i=0; i<1000; i++)
{ coin_face = rand() % 2;
frequency[coin_face]++;
}
[Figure: histogram of frequency versus coin face (0, 1), uniform distribution]
30. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 30
Example : Area of Circle (Estimation)
• double x, y; //Two random variables
int i, N=1000, NCircle=0;
double Area;
for (i=0; i<N; i++)
{ x = (double) rand() / RAND_MAX; //x and y are independent
y = (double) rand() / RAND_MAX;
if ( (x*x + y*y) <= 1 ) //(x,y) lies inside the quarter circle
NCircle = NCircle + 1;
}
Area = 4.0 * NCircle / N; //Estimated area of the unit circle
• Is (x,y) a random number? We call (x,y) a sample
31. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 31
Multiple Dependent Random Variables
• Markov Chain: n random variables
X1 → ... → Xk → ... → Xn
• Bayesian Networks: 5 random variables
[Figure: network with nodes Burglary, Earthquake, Alarm, John Calls, Mary Calls]
– Variables are dependent
– What is a sample?
32. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 32
Sampling
• It is to randomly generate a sample
– For a random variable X (univariate), or
a set of random variables X1, …, Xn (multivariate)
• Boolean, Discrete, Continuous
• Multivariate
– Independent, dependent
– According to a probability distribution P(X)
• Discrete X: Histogram
• Continuous X:
– Uniform, Gaussian, or
– Any distribution: Gaussian mixture models
33. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 33
Sub-Sections for
Generating a Sample
• 2.1 Univariate
– Uniform, Gaussian, Gaussian mixture
• 2.2 Multivariate
– Uniform
– Gaussian
• Independent, dependent
– Any distribution
• Gaussian mixture
– Independent, dependent
• Bayesian network
34. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 34
2.1 Univariate
• For a random variable X
– Boolean, discrete, continuous, hybrid
• We know P(X) is
– Uniform, Gaussian, Gaussian mixture
• Generate a sample X according to P(X)
35. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 35
Uniform Generator
• Every programming language provides
a rand()/random() function to generate
a uniformly distributed number
– Integer number within [0, MAX]
• Sampling a Boolean uniform number
– rand() % 2
• Sampling a discrete uniform number
within [0, d)
– rand() % d
• Sampling a continuous uniform number
– Within [0, 1): rand() / (MAX + 1.0)
– Within [a, b): a + (rand() / (MAX + 1.0)) * (b-a)
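The uniform generators listed above can be wrapped into small C helpers. This is a sketch under assumptions: the names uniform01, uniform_real and uniform_int are hypothetical, MAX is taken to be the C constant RAND_MAX, and the integer helper scales a real number rather than using the % operator.

```c
#include <stdlib.h>

/* Uniform real number in [0, 1): dividing by RAND_MAX + 1.0
   keeps the value 1.0 out of the range. */
double uniform01(void)
{
    return rand() / (RAND_MAX + 1.0);
}

/* Uniform real number in [a, b), assuming a < b. */
double uniform_real(double a, double b)
{
    return a + uniform01() * (b - a);
}

/* Uniform integer in [0, d), assuming 0 < d <= RAND_MAX. */
int uniform_int(int d)
{
    return (int) (uniform01() * d);
}
```

For example, uniform_real(10.0, 20.0) yields a real number between 10 and 20, and uniform_int(6) yields a die face numbered 0 to 5. Scaling relies on the high-order bits of rand(), which some older generators randomize better than the low-order bits that rand() % d depends on; either form is adequate for the examples in these notes.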
36. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 36
Example : Uniform Generator
• x=rand(1,10000);
• h=hist(x,20);
• bar(h);
[Figure: bar chart of the 20-bin histogram of x]
37. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 37
Gaussian Generator (1/2)
• Gaussian samples can be obtained from
uniformly distributed samples
• There are functions in C/Java/Matlab to
randomly generate a univariate
Gaussian real number with $(\mu, \sigma)=(0,1)$
– C : Numerical Recipes in C
– Java: Random.nextGaussian()
– Matlab: randn()
• Suppose it is called Gaussian()
38. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 38
Gaussian Generator (2/2)
• Sampling a continuous Gaussian
number with $(\mu, \sigma)$
– (Gaussian() * $\sigma$) + $\mu$
• Sampling a discrete Gaussian number
with $(\mu, \sigma)$ ?
39. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 39
Example : Gaussian Generator (1/2)
• Pseudo code
– Assume Gaussian() is a pseudo function to
generate Gaussian numbers with $(\mu, \sigma)=(0,1)$
– double x[10000];
for (i=0; i<10000; i++)
x[i] = Gaussian();
– for (i=0; i<10000; i++)
x[i] = mu + Gaussian() * sigma;
40. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 40
Example : Gaussian Generator (2/2)
• Matlab
– x=randn(1,10000);
– h=hist(x,20);
– bar(h);
• Java
– Random r=new Random();
double[] x=new double[10000];
for (int i=0;i<10000;i++)
x[i]=r.nextGaussian();
[Figure: bar chart of the 20-bin histogram of x]
41. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 41
Gaussian Mixture Generator (1/2)
• Random variable X with Gaussian
– $P(X) = N(X; \mu, \sigma)$
• Random variable Y with Gaussian
mixture
– $P(Y) = \sum_m \pi_m N(Y; \mu_m, \sigma_m)$
42. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 42
Gaussian Mixture Generator (2/2)
• Generate N samples of X
– for (i=0; i<N; i++)
x[i] = (Gaussian() * sigma) + mu;
• Generate N samples of Y with mixture
of M Gaussians
– Each Gaussian m has $\pi_m, \mu_m, \sigma_m$
– for (m=0; m<M; m++)
for (i=0; i<N*pi[m]; i++)
y[m][i] = (Gaussian() * sigma[m]) + mu[m];
43. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 43
Example : Gaussian Mixture
Generator
• N=10000; pi1=0.8; pi2=0.2;
• mu1=0; mu2=15; sigma1=3; sigma2=5;
• x1 = mu1 + randn(1,N*pi1) * sigma1;
• x2 = mu2 + randn(1,N*pi2) * sigma2;
• x = [x1, x2];
• h=hist(x,50);
• bar(h);
[Figure: bar chart of the 50-bin histogram of x]
44. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 44
2.2 Multivariate
• For random variables X1,… ,Xn
– Boolean, discrete, continuous, hybrid
• We know P(X1,… ,Xn) is
– Uniform, Gaussian, Gaussian mixture, any
distribution
• Generate a sample (X1,… ,Xn) according
to P(X1,… ,Xn)
– Independent
– Dependent
45. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 45
Multivariate Boolean Uniform Generator
• Boolean random variables X1,… ,Xn
• int X[n]; // A sample
for (i=0; i<n; i++)
X[i] = rand() % 2;
46. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 46
Multivariate Discrete Uniform Generator
• Discrete random variables X1,…, Xn
– Each with d discrete values: [0, d-1]
– Each Xi is uniform distributed
– X1,…, Xn must be independent
• int X[n]; // A sample
for (i=0; i<n; i++)
X[i] = rand() % d;
47. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 47
Multivariate Gaussian Generator
- Independent (1/2)
• Pseudo code
• For n random variables X=(X1,…,Xn)
– Gaussian : $N(X; \mu, \Sigma)$
• Mean vector: $\mu$
• Covariance matrix: $\Sigma = [\sigma_{ij}]$
• X1,…,Xn are independent
– $\sigma_{ij} = 0$ for $i \neq j$
• Generate a sample of X
– Generate each Xi independently
48. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 48
Multivariate Gaussian Generator
- Independent (2/2)
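• Generate a sample of X =(X1,…,Xn) with
$\mu_i = 0$, $\sigma_{ii} = 1$, $\sigma_{ij} = 0$ for $i \neq j$
– double X[n]; // a sample
for (i=0; i<n; i++)
X[i] = Gaussian();
• Generate a sample of X =(X1,…,Xn) with
$\mu_i \neq 0$, $\sigma_{ii} \neq 1$, $\sigma_{ij} = 0$ for $i \neq j$
– double X[n]; // a sample
for (i=0; i<n; i++)
X[i] = mu[i] + Gaussian() * sqrt(sigma[i][i]); // sigma[i][i] is the variance of X[i]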
49. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 49
Example – Matlab (1/2)
$\mu_X = (0,0)^T$, $\Sigma_X = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$
mx=[0 0]';
Cx=[1 0; 0 1];
x1=-3:0.1:3;
x2=-3:0.1:3;
for i=1:length(x1),
for j=1:length(x2),
f(i,j)=(1/(2*pi*det(Cx)^(1/2)))*exp((-1/2)*([x1(i) x2(j)]-mx')*inv(Cx)*([x1(i);x2(j)]-mx));
end
end
mesh(x1,x2,f)
pause;
contour(x1,x2,f)
pause
50. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 50
Example – Matlab (2/2)
• Randomly generate 1000 samples for
$\mu_X = (0,0)^T$, $\Sigma_X = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$
y1=randn(1,1000);
y2=randn(1,1000);
plot(y1,y2,'.');
51. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 51
Multivariate Gaussian Generator
- Dependent (1/4)
• For n random variables X=(X1,…,Xn)
– Gaussian : $N(X; \mu, \Sigma)$
• Mean vector: $\mu$
• Covariance matrix: $\Sigma = [\sigma_{ij}]$
– $\Sigma$ is a positive definite matrix
• Symmetric and all eigenvalues (pivots) > 0
– For a general matrix A : $A = LDU$
• L: lower triangular, U: upper triangular,
D: diagonal matrix of pivots
– For a symmetric matrix S: $S = LDL^T$
– For a positive definite matrix $\Sigma$:
$\Sigma = LDL^T = (L\sqrt{D})(L\sqrt{D})^T = PP^T$, with $P = L\sqrt{D}$
– This is called Cholesky decomposition
• X1,…,Xn are dependent
– $\sigma_{ij} \neq 0$
52. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 52
Multivariate Gaussian Generator
- Dependent (2/4)
• Generate a sample of X with $\mu$, $\Sigma$
– Perform Cholesky decomposition of $\Sigma$
• Cholesky decomposition is pivot decomposition
for a positive definite matrix
• $\Sigma = PP^T$, with P lower triangular
– Generate independent Gaussian Y=(Y1,…,Yn )
with $\mu_i = 0$, $\sigma_i = 1$
– $X = PY + \mu$
53. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 53
Multivariate Gaussian Generator
- Dependent (3/4)
• Pseudo code to generate a sample of X
with $\mu$, $\Sigma$
– Matrix Sigma;
Vector mu;
Vector X(n), Y(n); // a sample
Matrix P=chol(Sigma); //Cholesky decomp., P lower triangular
for (i=0; i<n; i++) Y(i) = Gaussian();
X=P*Y+mu;
54. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 54
Multivariate Gaussian Generator
- Dependent (4/4)
• Proof
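– For n random variables X=(X1,…,Xn) with $\mu$, $\Sigma$
– Generate n independent, zero-mean, unit-variance
normal random variables Y=(Y1,…,Yn)
$Y = (Y_1, \ldots, Y_n)^T, \quad \mu_Y = (0, \ldots, 0)^T, \quad \Sigma_Y = \begin{pmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{pmatrix}$
– Take $X = PY + \mu$, where $\Sigma = PP^T$
Covariance matrix of X: $E\{(X-\mu)(X-\mu)^T\}$
$= E\{(PY)(PY)^T\} = E\{PYY^TP^T\} = P\,E\{YY^T\}\,P^T = PP^T = \Sigma$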
55. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 55
Example – Matlab (1/4)
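Assume
$\mu_X = (0,0)^T$
$\Sigma_X = \begin{pmatrix} 1 & 1/2 \\ 1/2 & 1 \end{pmatrix}, \quad P = \begin{pmatrix} 1 & 0 \\ 1/2 & \sqrt{3}/2 \end{pmatrix}$
Matlab:
mx=[0 0]';
Cx=[1 1/2; 1/2 1];
P=chol(Cx)'; % chol returns the upper triangular factor, so transpose it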
56. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 56
Example – Matlab (2/4)
• Randomly generate 1000 samples for
$\mu_X = (0,0)^T$, $\Sigma_X = \begin{pmatrix} 1 & 1/2 \\ 1/2 & 1 \end{pmatrix}$
• mx=zeros(2,1000);
y1=randn(1,1000);
y2=randn(1,1000);
y=[y1;y2];
P=[1, 0; 1/2, sqrt(3)/2];
x=P*y+mx;
x1=x(1,:);
x2=x(2,:);
plot(x1,x2,'.');
r=corrcoef(x1',x2');
57. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 57
Example – Matlab (3/4)
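Assume
$\mu_X = (5,5)^T$
$\Sigma_X = \begin{pmatrix} 1 & 0.9 \\ 0.9 & 1 \end{pmatrix}, \quad P = \begin{pmatrix} 1 & 0 \\ 9/10 & \sqrt{19}/10 \end{pmatrix}$
Matlab:
• mx=[5 5]';
• Cx=[1 9/10; 9/10 1];
• P=chol(Cx)'; % chol returns the upper triangular factor, so transpose it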
58. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 58
Example – Matlab (4/4)
• Randomly generate 1000 samples for
$\mu_X = (5,5)^T$, $\Sigma_X = \begin{pmatrix} 1 & 0.9 \\ 0.9 & 1 \end{pmatrix}$
• mx=5*ones(2,1000);
y1=randn(1,1000);
y2=randn(1,1000);
y=[y1;y2];
P=[1, 0; 9/10, sqrt(19)/10];
x=P*y+mx;
x1=x(1,:);
x2=x(2,:);
plot(x1,x2,'.');
r=corrcoef(x1',x2');
59. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 59
Multivariate Gaussian Mixture
Generator
• Generate N samples of X with mixture of M
Gaussians (Matlab-like pseudo code)
– for (m=0; m<M; m++)
{ Matrix P=chol(Sigma_m) //Cholesky decomposition, P lower triangular
for (i=0; i<N*pi_m; i++)
{ //Generate n independent normally distributed
// R.V. (mu=0, sigma=1) as a column vector
y = randn(n, 1)
// Transform y into x
x=P*y+mu_m
}
}
60. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 60
Example – Matlab (1/4)
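• Combine the previous two Gaussians:
$\pi_1 = 0.5$, $\pi_2 = 0.5$
$\mu_1 = (0,0)^T$, $\Sigma_1 = \begin{pmatrix} 1 & 1/2 \\ 1/2 & 1 \end{pmatrix}$
$\mu_2 = (5,5)^T$, $\Sigma_2 = \begin{pmatrix} 1 & 0.9 \\ 0.9 & 1 \end{pmatrix}$
[Figure: scatter plot of the generated samples]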
61. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 61
Example – Matlab (2/4)
• pi1= 0.5; pi2=0.5; N=2000;
mx1=zeros(2,pi1*N); Cx1=[1 1/2; 1/2 1];
P1=chol(Cx1)'; %P1=[1, 0; 1/2, sqrt(3)/2];
y1_1=randn(1,pi1*N); y1_2=randn(1,pi1*N);
y1=[y1_1;y1_2];
x1=P1*y1+mx1; x1_1=x1(1,:); x1_2=x1(2,:);
mx2=5*ones(2,pi2*N); Cx2=[1 9/10; 9/10 1];
P2=chol(Cx2)'; %P2=[1, 0; 9/10, sqrt(19)/10];
y2_1=randn(1,pi2*N); y2_2=randn(1,pi2*N);
y2=[y2_1;y2_2];
x2=P2*y2+mx2; x2_1=x2(1,:); x2_2=x2(2,:);
z1=[x1_1,x2_1]; z2=[x1_2,x2_2];
plot(z1,z2,'.');
62. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 62
Example – Matlab (3/4)
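• Combine the previous two Gaussians
$\pi_1 = 0.2$, $\pi_2 = 0.8$
$\mu_1 = (0,0)^T$, $\Sigma_1 = \begin{pmatrix} 1 & 1/2 \\ 1/2 & 1 \end{pmatrix}$
$\mu_2 = (5,5)^T$, $\Sigma_2 = \begin{pmatrix} 1 & 0.9 \\ 0.9 & 1 \end{pmatrix}$
[Figure: scatter plot of the generated samples]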
63. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 63
Example – Matlab (4/4)
• pi1= 0.2; pi2=0.8; N=2000;
mx1=zeros(2,pi1*N); Cx1=[1 1/2; 1/2 1];
P1=chol(Cx1)'; %P1=[1, 0; 1/2, sqrt(3)/2];
y1_1=randn(1,pi1*N); y1_2=randn(1,pi1*N);
y1=[y1_1;y1_2];
x1=P1*y1+mx1; x1_1=x1(1,:); x1_2=x1(2,:);
mx2=5*ones(2,pi2*N); Cx2=[1 9/10; 9/10 1];
P2=chol(Cx2)'; %P2=[1, 0; 9/10, sqrt(19)/10];
y2_1=randn(1,pi2*N); y2_2=randn(1,pi2*N);
y2=[y2_1;y2_2];
x2=P2*y2+mx2; x2_1=x2(1,:); x2_2=x2(2,:);
z1=[x1_1,x2_1]; z2=[x1_2,x2_2];
plot(z1,z2,'.');
64. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 64
Exercise
• Write a program to randomly generate
1000 samples of 3-dimensional Gaussian
with $\mu$=(5,10,-3), $\Sigma$=(2,1,3;4,2,2;3,1,2)
65. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 65
Any Distribution
• For random variables X1,… ,Xn
– Boolean, discrete, continuous, hybrid
• We know P(X1,… ,Xn) has no closed-form
formula
– Independent: $P(X_1, \ldots, X_n) = P(X_1) \cdots P(X_n)$
– Dependent:
$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid Parent(X_i))$
• Generate a sample (X1,… ,Xn) according to
P(X1,… ,Xn)
– Independent: generate each Xi by P(Xi)
– Dependent: generate each Xi by P(Xi| Parent(Xi))
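For discrete variables given only as probability tables, one value can be drawn by walking the cumulative distribution, and a dependent variable can be drawn from the table row selected by its parent's value. The sketch below illustrates this for two variables X1 → X2. The names sample_discrete_from, sample_discrete and sample_pair, and the table size D, are hypothetical.

```c
#include <stdlib.h>

#define D 3   /* number of values per variable (illustrative) */

/* Map u in [0, 1) to a value index k with probability prob[k]
   by walking the cumulative distribution. */
int sample_discrete_from(const double *prob, int d, double u)
{
    double cumulative = 0.0;
    int k;
    for (k = 0; k < d - 1; k++) {
        cumulative += prob[k];
        if (u < cumulative)
            return k;
    }
    return d - 1;   /* remaining mass; also guards against rounding */
}

/* Draw one value index from the table prob[0..d-1]. */
int sample_discrete(const double *prob, int d)
{
    return sample_discrete_from(prob, d, rand() / (RAND_MAX + 1.0));
}

/* One sample (x1, x2) from P(X1) P(X2 | X1): the parent is drawn
   first, then its value selects the row of the conditional table. */
void sample_pair(double prior[D], double cpt[D][D], int *x1, int *x2)
{
    *x1 = sample_discrete(prior, D);
    *x2 = sample_discrete(cpt[*x1], D);
}
```

Here cpt[a][b] stands for P(X2=b | X1=a). The same pattern extends to larger networks as long as every variable is sampled after all of its parents.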
66. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 66
Two Boolean R.V.s - Independent
• X1, X2 have distributions :
– P(X1)=<0.67, 0.33>, P(X2)=<0.75,0.25>
• int i, X1, X2;
for (i=0; i<1000; i++)
{ if (rand() > RAND_MAX/3) //X1 = 1 with probability 0.67
X1 = 1;
else X1 = 0;
if (rand() > RAND_MAX/4) //X2 = 1 with probability 0.75
X2 = 1;
else X2 = 0;
}
[Figure: bar charts of P(X1) and P(X2) over the values 0 and 1]
67. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 67
Two Boolean R.V.s - Dependent
• X1, X2 have distributions :
– P(X1)=<0.67, 0.33>
– P(X2|X1=T)=<0.75,0.25>, P(X2|X1=F)=<0.8,0.2>
• Generate a sample (x1, x2)
if (rand() > RAND_MAX/3) x1 = 1;
else x1 = 0;
if (x1==1)
if (rand() > RAND_MAX/4) x2 = 1;
else x2 = 0;
else // x1==0
if (rand() > RAND_MAX/5) x2 = 1;
else x2 = 0;
68. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 68
Markov Chain
• Markov Chain: n random variables
X1 ... Xk ... Xn
69. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 69
Bayesian Network
• Example: 5 random variables
Burglary Earthquake
Alarm
John Calls Mary Calls
70. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 70
3. Stochastic Simulation
• Also called
– Monte Carlo Methods
– Sampling Methods
• Sub-sections
– 3.1 Direct sampling
– 3.2 Rejection sampling
– 3.3 Likelihood weighting
71. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 71
3.1 Direct Sampling
• Generate N samples randomly
• For the inference P(X|E)
– P(X|E) = P(X^E) / P(E)
– Get N(E) & N(X^E) from the N
samples
• N(E) : No. of samples in which E holds
• N(X^E) : No. of samples in which both X and E hold
– P(E) ≈ N(E) / N,
P(X^E) ≈ N(X^E) / N
– P(X|E) ≈ N(X^E) / N(E)
72. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 72
Example (1/4)
• For the sprinkler network
– Estimate P(w|r)
by direct sampling
– 4 random variables
– A sample =
(c,s,r,w)
73. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 73
Example (2/4)
• Generate 1000 samples
Cloudy Sprinkler Rain WetGrass
T T T F
F T T F
F F T T
T T T F
T T T F
... ... ... ...
F T T F
74. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 74
Example (3/4)
• P(r|¬w) = P(r, ¬w)/P(¬w) ≈ N(r^¬w) / N(¬w)
N(¬w): No. of samples with WetGrass=False
N(r^¬w): No. of samples with (Rain=True & WetGrass=False)
Cloudy Sprinkler Rain WetGrass
T T T F
F T T F
F F T T
T T F F
... ... ... ...
F T T F
75. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 75
Example (4/4)
• P(R|¬w)
– = P(R, ¬w)/P(¬w)
– = < P(r ^ ¬w)/P(¬w), P(¬r ^ ¬w)/P(¬w) >
Cloudy Sprinkler Rain WetGrass
T T T F
F T T F
F F T T
T T F F
... ... ... ...
F T T F
76. Bayesian Networks Unit - Approximate Inference in Bayesian Networks p. 76
How to Generate a Sample
for the Bayesian Network? (1/3)
• The sprinkler Bayesian network
A sample is an atomic event :
(cloudy, sprinkler, rain, wetgrass)
=(T, F, T, T)
• Assume a sampling order:
[ Cloudy, Sprinkler,
Rain, WetGrass ]
How to Generate a Sample
for the Bayesian Network? (2/3)
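• Incorrect implementation (each variable is sampled independently with
probability 0.5, ignoring the CPTs):
    int i, C, S, R, W;              /* T and F stand for true and false */
    for (i = 0; i < 1000; i++) {
        if (rand() > RAND_MAX/2) C = T; else C = F;
        if (rand() > RAND_MAX/2) S = T; else S = F;
        if (rand() > RAND_MAX/2) R = T; else R = F;
        if (rand() > RAND_MAX/2) W = T; else W = F;
    }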
How to Generate a Sample
for the Bayesian Network? (3/3)
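• Each variable is sampled from its CPT, given the sampled values of its parents:
    int i, C, S, R, W;              /* T and F stand for true and false */
    for (i = 0; i < 1000; i++) {
        /* P(Cloudy=T) = 0.5 */
        if (rand() > RAND_MAX/2) C = T; else C = F;
        if (C == T) {
            /* P(Sprinkler=T | Cloudy=T) = 0.1 */
            if (rand() > RAND_MAX*0.9) S = T; else S = F;
        } else {
            /* P(Sprinkler=T | Cloudy=F) = 0.5 */
            if (rand() > RAND_MAX/2) S = T; else S = F;
        }
        ...
    }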
An Example
Generating One Sample (1/8)
• The sampling algorithm
1. Sample from P(Cloudy)=<0.5, 0.5>
– Suppose it returns true
2. Sample from P(Sprinkler|Cloudy=true)=<0.1, 0.9>
– Suppose it returns false
3. Sample from P(Rain|Cloudy=true)=<0.8, 0.2>
– Suppose it returns true
4. Sample from P(WetGrass|Sprinkler=false, Rain=true)=<0.9, 0.1>
– Suppose it returns true
An Example
Generating One Sample (2/8)-(8/8)
• The sample (C, S, R, W) is filled in one variable at a time
1. Random sampling of Cloudy
– Return: Cloudy=true
2. Random sampling of Sprinkler, given Cloudy=true
– Return: Sprinkler=false
3. Random sampling of Rain, given Cloudy=true
– Return: Rain=true
4. Random sampling of WetGrass, given Rain=true, Sprinkler=false
– Return: WetGrass=true
• Resulting sample: (C, S, R, W) = (true, false, true, true)
The Algorithm (1/2)
• To generate one sample
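A minimal sketch of generating one sample in the order Cloudy, Sprinkler, Rain, WetGrass is given below. It is an illustration under assumptions, not the lecture's own listing: the names `Event`, `uniform01`, `bernoulli`, and `prior_sample` are hypothetical, and the CPT entries that this unit does not state (P(Rain|Cloudy=false)=0.2, P(WetGrass|Sprinkler=true, Rain=false)=0.9, P(WetGrass|Sprinkler=false, Rain=false)=0) are assumed from the standard AIMA sprinkler network.

```c
#include <stdlib.h>

/* One atomic event of the sprinkler network (1 = true, 0 = false). */
typedef struct { int c, s, r, w; } Event;

/* Uniform random number in [0, 1) built on the C library rand(). */
static double uniform01(void)
{
    return rand() / ((double)RAND_MAX + 1.0);
}

/* Return 1 with probability p, else 0. */
static int bernoulli(double p)
{
    return uniform01() < p;
}

/* CPTs; entries not stated in this unit are assumed from the
   standard AIMA sprinkler network. */
static double p_sprinkler(int c) { return c ? 0.10 : 0.50; }
static double p_rain(int c)      { return c ? 0.80 : 0.20; }
static double p_wet(int s, int r)
{
    if (s && r) return 0.99;
    if (s || r) return 0.90;
    return 0.00;
}

/* Draw one sample: each variable is sampled given its sampled parents. */
Event prior_sample(void)
{
    Event ev;
    ev.c = bernoulli(0.5);
    ev.s = bernoulli(p_sprinkler(ev.c));
    ev.r = bernoulli(p_rain(ev.c));
    ev.w = bernoulli(p_wet(ev.s, ev.r));
    return ev;
}
```

Calling `prior_sample()` N times, after seeding once with `srand`, yields N samples from which query probabilities can be estimated by counting.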
The Algorithm (2/2)
• In the previous example
– We obtained a sample [true, false, true, true] of the Bayesian network using the Prior-Sample algorithm
• The sampling of a Bayesian network
– Repeat the sampling N times
– We get N samples
• We can use the N samples to estimate any query probability in the Bayesian network
Why It Works (1/2)
• Why can any probability be answered from the samples?
– The N samples are effectively an approximate full joint distribution table (FJD)
Samples:
C S R W
T T T F
F T T F
F F T T
T T F F
... ... ... ...
F T T F
FJD:
C S R W P
T T T F 0.02
F T T F 0.13
F F T T 0.04
T T F F 0.15
... ... ... ... ...
Why It Works (2/2)
• A sample is an atomic event (x1, ..., xn)
• P(x1, ..., xn) ≈ N(x1, ..., xn) / N
• Therefore, an FJD is generated from the N samples
• Note: N < 2^n (the number of samples is smaller than the number of atomic events of n Boolean variables)
Exercise: Direct Sampling
• Query: What is the probability that a student studied, given that they pass the exam?
• Network: smart → prepared ← study; smart, prepared, fair → pass
• Priors: p(smart)=.8, p(study)=.6, p(fair)=.9
• p(prep | smart, study)
         smart   ¬smart
study     .9      .7
¬study    .5      .1
• p(pass | smart, prep, fair)
         smart          ¬smart
         prep   ¬prep   prep   ¬prep
fair      .9     .7      .7     .2
¬fair     .1     .1      .1     .1
Problems of Direct Sampling
• It needs to generate very many
samples in order to obtain the
approximate FJD
• For a query of conditional
probability P(X|e)
– Can we just approximate the
conditional probability?
– Yes, the following two algorithms will
do this
3.2 Rejection Sampling
• The estimate P̂(X | e) is computed from the samples agreeing with e
An Example
• Estimate P(Rain|Sprinkler=true)
using 100 samples
– 27 samples have Sprinkler = true
– Of these, 8 have Rain=true and
19 have Rain=false
– P(Rain|Sprinkler=true) =
Normalize(<8,19>) = <0.296, 0.704>
• Similar to a basic real-world
empirical estimation procedure
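The rejection step can be sketched in C for the query P(Rain|Sprinkler=true). This is an illustrative sketch under assumptions: the function names are hypothetical, P(Rain|Cloudy=false)=0.2 is assumed from the standard AIMA sprinkler network, and WetGrass is left unsampled because it does not affect this query.

```c
#include <stdlib.h>

/* Uniform random number in [0, 1) built on the C library rand(). */
static double uniform01(void)
{
    return rand() / ((double)RAND_MAX + 1.0);
}

/* Return 1 with probability p, else 0. */
static int bernoulli(double p)
{
    return uniform01() < p;
}

/* Estimate P(Rain=true | Sprinkler=true) by rejection sampling.
   Returns -1.0 if every sample was rejected. */
double rejection_rain_given_sprinkler(int n_samples)
{
    int n_e = 0;   /* samples agreeing with the evidence */
    int n_re = 0;  /* of those, samples with Rain=true   */
    for (int i = 0; i < n_samples; i++) {
        int c = bernoulli(0.5);
        int s = bernoulli(c ? 0.10 : 0.50);
        int r = bernoulli(c ? 0.80 : 0.20);  /* 0.20 is an assumed CPT entry */
        if (!s)
            continue;  /* reject: sample disagrees with Sprinkler=true */
        n_e++;
        n_re += r;
    }
    return n_e ? (double)n_re / n_e : -1.0;
}
```

Only about 30% of the generated samples have Sprinkler=true under these CPTs, so roughly 70% of the work is discarded, which is the inefficiency that motivates likelihood weighting.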
Analysis of Rejection Sampling
$$ \hat{P}(X \mid e) = \frac{N(X, e)}{N(e)} \approx \frac{P(X, e)}{P(e)} = P(X \mid e) $$
• Hence rejection sampling returns
consistent posterior estimates
• Problem: expensive if P(e) is small
– P(e) drops off exponentially with
number of evidence variables!
3.3 Likelihood Weighting
• Avoids the inefficiency of rejection
sampling
– By generating only events consistent
with the evidence variables e
• Idea
– Fix the evidence variables
– Sample only the hidden variables, randomly generating a sample event
– Weight each sample event by the likelihood it accords the evidence
• Events have different weights
An Example (1/9)
• Query P(Rain|sprinkler, wetgrass)
An Example (2/9)
1. Set the weight w = 1.0
2. Sample from P(Cloudy)=<0.5,0.5>
• Suppose it returns true
3. The evidence is Sprinkler=true, so we set
w = w × P(sprinkler|cloudy) = 1.0 × 0.1 = 0.1
4. Sample from P(Rain|cloudy)=<0.8,0.2>
• Suppose it returns true
5. The evidence is WetGrass=true, so we set
w = w × P(wetgrass|sprinkler,rain) = 0.1 × 0.99 = 0.099
• Result: a sample event (true, true, true, true) with weight 0.099
An Example (3/9)-(9/9)
[Figure sequence: the weight is updated step by step as the network is traversed: w = 1.0, then w = 1.0 × 0.1, then w = 1.0 × 0.1 × 0.99 = 0.099]
The Algorithm (1/2)
• The example generates a sample
event (true, true, true, true) for the
query P(Rain|sprinkler, wetgrass)
• Repeat the sampling N times
– We get N sample events
– Each event has a likelihood weight
– w1 = total weight of events with rain=true, w2 = total weight of events with rain=false
• P(Rain|sprinkler, wetgrass)
= < w1/(w1+w2), w2/(w1+w2) >
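The weighted estimate for P(Rain|sprinkler, wetgrass) can be sketched in C as follows. This is a minimal illustration, not the lecture's WEIGHTED-SAMPLE listing: the function names are hypothetical, and the CPT entries P(Rain|Cloudy=false)=0.2 and P(WetGrass|Sprinkler=true, Rain=false)=0.9 are assumed from the standard AIMA sprinkler network.

```c
#include <stdlib.h>

/* Uniform random number in [0, 1) built on the C library rand(). */
static double uniform01(void)
{
    return rand() / ((double)RAND_MAX + 1.0);
}

/* Return 1 with probability p, else 0. */
static int bernoulli(double p)
{
    return uniform01() < p;
}

/* Estimate P(Rain=true | Sprinkler=true, WetGrass=true) by likelihood
   weighting. Returns -1.0 if the total weight is zero. */
double lw_rain_given_sprinkler_wet(int n_samples)
{
    double w_rain = 0.0;    /* total weight of events with Rain=true  */
    double w_norain = 0.0;  /* total weight of events with Rain=false */
    for (int i = 0; i < n_samples; i++) {
        double w = 1.0;
        int c = bernoulli(0.5);               /* hidden: sampled */
        w *= c ? 0.10 : 0.50;                 /* evidence Sprinkler=true */
        int r = bernoulli(c ? 0.80 : 0.20);   /* hidden: sampled (0.20 assumed) */
        w *= r ? 0.99 : 0.90;                 /* evidence WetGrass=true (0.90 assumed) */
        if (r)
            w_rain += w;
        else
            w_norain += w;
    }
    double total = w_rain + w_norain;
    return total > 0.0 ? w_rain / total : -1.0;
}
```

Every generated event contributes to the estimate, in contrast to rejection sampling, but events with small weights contribute very little.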
The Algorithm (2/2)
Exercise: Likelihood Weighting
• Query: What is the probability that a student studied, given that they pass the exam?
• Network: smart → prepared ← study; smart, prepared, fair → pass
• Priors: p(smart)=.8, p(study)=.6, p(fair)=.9
• p(prep | smart, study)
         smart   ¬smart
study     .9      .7
¬study    .5      .1
• p(pass | smart, prep, fair)
         smart          ¬smart
         prep   ¬prep   prep   ¬prep
fair      .9     .7      .7     .2
¬fair     .1     .1      .1     .1
Analysis (1/3)
• Why does the algorithm work for P(X|E=e)?
• Let the sampling probability for WEIGHTED-SAMPLE be S_WS
– The evidence variables E are fixed with e
– All the other variables: Z = {X} ∪ Y
– The algorithm samples each variable in Z given its parent values
$$ S_{WS}(z, e) = \prod_{i=1}^{l} P(z_i \mid parents(Z_i)) $$
Analysis (2/3)
• The likelihood weight w for a given sample (z, e) = (x, y, e) is
$$ w(z, e) = \prod_{i=1}^{m} P(e_i \mid parents(E_i)) $$
• The weighted probability of a sample (z, e) = (x, y, e) is
$$ S_{WS}(z, e)\, w(z, e) = \prod_{i=1}^{l} P(z_i \mid parents(Z_i)) \prod_{i=1}^{m} P(e_i \mid parents(E_i)) = P(x, y, e) $$
– because $P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid parents(X_i))$
Analysis (3/3)
$$ \hat{P}(x \mid e) = \alpha \sum_{y} N_{WS}(x, y, e)\, w(x, y, e) $$
$$ \approx \alpha' \sum_{y} S_{WS}(x, y, e)\, w(x, y, e) $$
$$ = \alpha' \sum_{y} P(x, y, e) $$
$$ = \alpha' P(x, e) = P(x \mid e) $$
So the algorithm works
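As a numerical check of the identity that the sampling probability times the weight equals the joint probability, the likelihood-weighting example for P(Rain|sprinkler, wetgrass) can be worked through. The sampled variables are z = (cloudy, rain), the evidence is e = (sprinkler, wetgrass), and all CPT values are the ones used in that example.

```latex
\begin{align*}
S_{WS}(z, e) &= P(cloudy)\, P(rain \mid cloudy) = 0.5 \times 0.8 = 0.4 \\
w(z, e) &= P(sprinkler \mid cloudy)\, P(wetgrass \mid sprinkler, rain) = 0.1 \times 0.99 = 0.099 \\
S_{WS}(z, e)\, w(z, e) &= 0.4 \times 0.099 = 0.0396 \\
P(cloudy, sprinkler, rain, wetgrass) &= 0.5 \times 0.1 \times 0.8 \times 0.99 = 0.0396
\end{align*}
```

The last two lines agree, as the analysis predicts for this event.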
Discussions
• Likelihood weighting is efficient
because it uses all the samples
generated
• However, its performance degrades as the number of evidence variables increases, because
– Most samples will have very low weights,
– The weighted estimate will be dominated by the tiny fraction of samples that accord more than an infinitesimal likelihood to the evidence
4. Inference by MCMC
• Key idea
– Sampling process as a Markov Chain
• Next sample depends on the previous one
– Approximate any posterior distribution
• "State" of network
= current assignment to all variables
• Generate the next state
– by sampling one variable given its Markov blanket
• Sample each variable in turn, keeping
evidence fixed
The Markov Chain
• With Sprinkler=true, WetGrass=true, there are four states, one for each assignment of (Cloudy, Rain)
Markov Blanket Sampling
• Markov blanket of Cloudy is
– Sprinkler and Rain
• Markov blanket of Rain is
– Cloudy, Sprinkler, and WetGrass
• The probability given the Markov blanket is calculated as follows
$$ P(x'_i \mid MB(X_i)) = \alpha\, P(x'_i \mid Parents(X_i)) \times \prod_{Z_j \in Children(X_i)} P(z_j \mid Parents(Z_j)) $$
– α is a normalizing constant
An Example (1/2)
• Estimate P(Rain|sprinkler,wetgrass)
• Loop for N times
– Sample Cloudy or Rain given its
Markov blanket
• Count number of times Rain=true
and Rain=false in the samples
An Example (2/2)
• E.g., visit 100 states
– 31 have Rain=true,
– 69 have Rain=false
• P(Rain|sprinkler,wetgrass)
= Normalize(<31, 69>)
= <0.31, 0.69>
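A minimal C sketch of this MCMC estimate for P(Rain|sprinkler, wetgrass) is shown below. It is an illustration under assumptions rather than the lecture's algorithm listing: the function names are hypothetical, Cloudy and Rain are resampled in turn (a systematic scan), and the CPT entries P(Rain|Cloudy=false)=0.2 and P(WetGrass|Sprinkler=true, Rain=false)=0.9 are assumed from the standard AIMA sprinkler network.

```c
#include <stdlib.h>

/* Uniform random number in [0, 1) built on the C library rand(). */
static double uniform01(void)
{
    return rand() / ((double)RAND_MAX + 1.0);
}

/* Return 1 with probability p, else 0. */
static int bernoulli(double p)
{
    return uniform01() < p;
}

/* P(Cloudy=true | Sprinkler=true, Rain=r): normalize P(c) P(s|c) P(r|c). */
static double p_cloudy_given_mb(int r)
{
    double t = 0.5 * 0.10 * (r ? 0.80 : 0.20);  /* Cloudy = true  */
    double f = 0.5 * 0.50 * (r ? 0.20 : 0.80);  /* Cloudy = false */
    return t / (t + f);
}

/* P(Rain=true | Cloudy=c, Sprinkler=true, WetGrass=true):
   normalize P(r|c) P(w|s,r). */
static double p_rain_given_mb(int c)
{
    double pr = c ? 0.80 : 0.20;
    double t = pr * 0.99;          /* Rain = true  */
    double f = (1.0 - pr) * 0.90;  /* Rain = false */
    return t / (t + f);
}

/* Visit n_steps states and return the fraction with Rain=true.
   Returns -1.0 if n_steps is not positive. */
double gibbs_rain_given_sprinkler_wet(int n_steps)
{
    if (n_steps <= 0)
        return -1.0;
    int c = 1, r = 1;  /* arbitrary initial state; evidence stays fixed */
    int n_rain = 0;
    for (int i = 0; i < n_steps; i++) {
        c = bernoulli(p_cloudy_given_mb(r));
        r = bernoulli(p_rain_given_mb(c));
        n_rain += r;
    }
    return (double)n_rain / n_steps;
}
```

With these CPTs the exact posterior is about <0.32, 0.68>, so a long run should land near the <0.31, 0.69> of the 100-state example.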
The Algorithm
Why it works
• Skipped
– Details in pp. 517-518 in the AIMA 2e
textbook
Sub-Sections
• 4.1 Markov chain theory
• 4.2 Two MCMC sampling algorithms
4.1 Markov Chain Theory
• Suppose X1, X2, … take some set of values
– W.l.o.g., these values are 1, 2, ...
• A Markov chain is a process that corresponds to the network:
X1 → X2 → X3 → ... → Xn → ...
• To quantify the chain, we need to specify
– Initial probability: P(X1)
– Transition probability: P(X_{t+1} | X_t)
• A Markov chain has stationary transition probability: P(X_{t+1} | X_t) is the same for all times t
Irreducible Chains
• A state j is accessible from state i if there is an n such that P(Xn = j | X1 = i) > 0
– There is a positive probability of reaching j from i after some number of steps
• A chain is irreducible if every state is accessible from every state
Ergodic Chains
• A state i is positive recurrent if there is a finite expected time to get back to state i after being in state i
– If X has a finite number of states, it suffices that i is accessible from itself
• A chain is ergodic if it is irreducible and every state is positive recurrent
(A)periodic Chains
• A state i is periodic if there is an integer d > 1 such that, whenever n is not divisible by d,
P(X_{n+1} = i | X_1 = i) = 0
• Intuition: state i can recur only every d steps
• A chain is aperiodic if it contains no periodic state
Stationary Probabilities
Thm:
• If a chain is ergodic and aperiodic, then the limit
$$ \lim_{n \to \infty} P(X_n \mid X_1 = i) $$
exists, and does not depend on i
• Moreover, let
$$ P^*(X = j) = \lim_{n \to \infty} P(X_n = j \mid X_1 = i) $$
then P*(X) is the unique probability satisfying
$$ P^*(X = j) = \sum_{i} P(X_{t+1} = j \mid X_t = i)\, P^*(X = i) $$
Stationary Probabilities
• The probability P*(X) is the stationary
probability of the process
• Regardless of the starting point, the
process will converge to this probability
• The rate of convergence depends on
properties of the transition probability
Sampling from the
Stationary Probability
• This theory suggests how to sample from
the stationary probability:
– Set X1 = i, for some random/arbitrary i
– For t = 1, 2, …, n-1
• Sample a value x_{t+1} for X_{t+1} from P(X_{t+1} | X_t = x_t)
– Return x_n
• If n is large enough, then this is a sample
from P*(X)
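The procedure can be sketched in C on a toy chain. Everything specific here is an assumption for illustration: the two-state transition matrix `TRANS` is hypothetical (its stationary probability is <5/6, 1/6>), as are the names `uniform01`, `step`, and `sample_after_n_steps`.

```c
#include <stdlib.h>

#define N_STATES 2

/* Hypothetical transition matrix:
   TRANS[i][j] = P(X_{t+1} = j | X_t = i). */
static const double TRANS[N_STATES][N_STATES] = {
    {0.9, 0.1},
    {0.5, 0.5}
};

/* Uniform random number in [0, 1) built on the C library rand(). */
static double uniform01(void)
{
    return rand() / ((double)RAND_MAX + 1.0);
}

/* Sample the next state from row `state` of TRANS by inverting the CDF. */
static int step(int state)
{
    double u = uniform01();
    double cum = 0.0;
    for (int j = 0; j < N_STATES; j++) {
        cum += TRANS[state][j];
        if (u < cum)
            return j;
    }
    return N_STATES - 1;  /* guard against rounding */
}

/* Set X_1 = x1, simulate X_2, ..., X_n, and return x_n. */
int sample_after_n_steps(int x1, int n)
{
    int x = x1;
    for (int t = 1; t < n; t++)
        x = step(x);
    return x;
}
```

Repeating `sample_after_n_steps(1, 50)` many times and counting how often it returns state 0 should give a fraction near 5/6, whatever the starting state.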
Designing Markov Chains
• How do we construct the right chain to
sample from?
– Ensuring aperiodicity and irreducibility is
usually easy
• Problem is ensuring the desired
stationary probability
Designing Markov Chains
Key tool:
• If the transition probability satisfies
$$ \frac{P(X_{t+1} = j \mid X_t = i)}{P(X_{t+1} = i \mid X_t = j)} = \frac{Q(X = j)}{Q(X = i)} $$
whenever $P(X_{t+1} = j \mid X_t = i) > 0$, then P*(X) = Q(X)
• This gives a local criterion for checking that the chain will have the right stationary distribution
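Cross-multiplying the ratio criterion gives Q(X=i) P(X_{t+1}=j | X_t=i) = Q(X=j) P(X_{t+1}=i | X_t=j); summing both sides over i yields the stationarity equation for Q. A minimal C sketch of checking the cross-multiplied form on a small chain follows. The function name `satisfies_ratio_criterion` and the row-major matrix layout are assumptions made for illustration.

```c
#include <stddef.h>

/* Return 1 if q[i] * t[i][j] equals q[j] * t[j][i] (within eps) for all
   pairs of states, else 0. The n-by-n transition matrix is stored
   row-major: t[i * n + j] = P(X_{t+1} = j | X_t = i). */
int satisfies_ratio_criterion(const double *q, const double *t,
                              size_t n, double eps)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            double d = q[i] * t[i * n + j] - q[j] * t[j * n + i];
            if (d < 0.0)
                d = -d;
            if (d > eps)
                return 0;
        }
    }
    return 1;
}
```

For the hypothetical two-state chain with rows (0.9, 0.1) and (0.5, 0.5), Q = <5/6, 1/6> passes the check, while Q = <0.5, 0.5> does not.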
MCMC Methods
• We can use these results to sample from
P(X1,…,Xn|e)
Idea:
• Construct an ergodic & aperiodic
Markov Chain such that
P*(X1,…,Xn) = P(X1,…,Xn|e)
• Simulate the chain n steps to get a
sample
MCMC Methods
Notes:
• The Markov chain variable Y takes as values the assignments to all variables that are consistent with the evidence
$$ V(Y) = \{ \langle x_1, \ldots, x_n \rangle \in V(X_1) \times \cdots \times V(X_n) \mid x_1, \ldots, x_n \text{ satisfy } e \} $$
• For simplicity, we will denote such a state using the vector of variables
4.2 Two MCMC Sampling
Algorithms
• Gibbs Sampler
• Metropolis-Hastings Sampler
Gibbs Sampler
• One of the simplest MCMC methods
• Each transition changes the state of one Xi
• The transition probability is defined by P itself, as a stochastic procedure:
– Input: a state x1,…,xn
– Choose i at random (uniform probability)
– Sample x'i from P(Xi | x1, …, xi-1, xi+1, …, xn, e)
– Let x'j = xj for all j ≠ i
– Return x'1,…,x'n
Correctness of Gibbs Sampler
• How do we show correctness?
Correctness of Gibbs Sampler
• By the chain rule
P(x1,…,xi-1, xi, xi+1,…,xn | e) = P(x1,…,xi-1, xi+1,…,xn | e) P(xi | x1,…,xi-1, xi+1,…,xn, e)
• Thus, we get
$$ \frac{P(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_n \mid e)}{P(x_1, \ldots, x_{i-1}, x'_i, x_{i+1}, \ldots, x_n \mid e)} = \frac{P(x_i \mid x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n, e)}{P(x'_i \mid x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n, e)} $$
where the right-hand side is the ratio of the transition probabilities
• Since we choose i from the same distribution at each stage, this procedure satisfies the ratio criterion
Gibbs Sampling for
Bayesian Network
• Why is the Gibbs sampler “easy” in BNs?
• Recall that the Markov blanket of a
variable separates it from the other
variables in the network
– P(Xi | X1,…,Xi-1, Xi+1,…,Xn) = P(Xi | Mbi)
• This property allows us to use local
computations to perform sampling in
each transition
Gibbs Sampling in
Bayesian Networks
• How do we evaluate P(Xi | x1,…,xi-1, xi+1,…,xn)?
• Let Y1, …, Yk be the children of Xi
– By definition of Mbi, the parents of Yj are in Mbi ∪ {Xi}
• It is easy to show that
$$ P(x_i \mid Mb_i) = \frac{P(x_i \mid Pa_i) \prod_{j} P(y_j \mid pa_{y_j})}{\sum_{x'_i} P(x'_i \mid Pa_i) \prod_{j} P(y_j \mid pa_{y_j})} $$
Metropolis-Hastings
• More general than Gibbs (Gibbs is a
special case of M-H)
• The proposal distribution q(x'|x) is arbitrary, provided it is ergodic and aperiodic (e.g., uniform)
• Transition to x' happens with probability
$$ \alpha(x' \mid x) = \min\left(1, \frac{P(x')\, q(x \mid x')}{P(x)\, q(x' \mid x)}\right) $$
• Useful when computing P(x) is infeasible: only the ratio P(x')/P(x) is needed
• q(x'|x) = 0 implies P(x') = 0 or q(x|x') = 0
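A minimal C sketch of a Metropolis-Hastings sampler with a uniform proposal is given below. The four-state target with unnormalized weights 1, 2, 3, 4 is a hypothetical example, and the names `mh_step` and `mh_run` are assumptions. Because the uniform proposal is symmetric, the q terms cancel and the acceptance probability reduces to min(1, P(x')/P(x)), which needs only the unnormalized weights.

```c
#include <stdlib.h>

#define N_TARGET_STATES 4

/* Hypothetical unnormalized target: P(x) is proportional to weight[x]. */
static const double weight[N_TARGET_STATES] = {1.0, 2.0, 3.0, 4.0};

/* Uniform random number in [0, 1) built on the C library rand(). */
static double uniform01(void)
{
    return rand() / ((double)RAND_MAX + 1.0);
}

/* One M-H transition with the uniform proposal q(x'|x) = 1/N_TARGET_STATES. */
int mh_step(int x)
{
    int x_new = (int)(uniform01() * N_TARGET_STATES);
    /* q is symmetric, so q(x|x') / q(x'|x) = 1 and drops out. */
    double ratio = weight[x_new] / weight[x];
    double alpha = ratio < 1.0 ? ratio : 1.0;
    return uniform01() < alpha ? x_new : x;
}

/* Run the chain from state 0, discard burn_in steps, then record the
   visit frequency of each state over n_steps further steps. */
void mh_run(int n_steps, int burn_in, double freq[N_TARGET_STATES])
{
    int count[N_TARGET_STATES] = {0};
    int x = 0;
    for (int t = 0; t < burn_in; t++)
        x = mh_step(x);
    for (int t = 0; t < n_steps; t++) {
        x = mh_step(x);
        count[x]++;
    }
    for (int k = 0; k < N_TARGET_STATES; k++)
        freq[k] = (double)count[k] / n_steps;
}
```

For a long run the visit frequencies should approach the normalized target <0.1, 0.2, 0.3, 0.4>.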
Sampling Strategy
• How do we collect the samples?
Strategy I:
• Run the chain M times, each for N steps
– each run starts from a different starting point
• Return the last state in each run (M samples from M chains)
Sampling Strategy
Strategy II:
• Run one chain for a long time
• After some “burn in” period, sample points every fixed number of steps (M samples from one chain)
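Strategy II can be sketched in C on a toy chain. The two-state transition matrix `TRANS` is hypothetical (stationary probability <5/6, 1/6>), and the names `step` and `collect_thinned`, as well as the parameters `burn_in` and `thin`, are assumptions made for illustration.

```c
#include <stdlib.h>

#define N_STATES 2

/* Hypothetical transition matrix:
   TRANS[i][j] = P(X_{t+1} = j | X_t = i). */
static const double TRANS[N_STATES][N_STATES] = {
    {0.9, 0.1},
    {0.5, 0.5}
};

/* Uniform random number in [0, 1) built on the C library rand(). */
static double uniform01(void)
{
    return rand() / ((double)RAND_MAX + 1.0);
}

/* Sample the next state from row `state` of TRANS by inverting the CDF. */
static int step(int state)
{
    double u = uniform01();
    double cum = 0.0;
    for (int j = 0; j < N_STATES; j++) {
        cum += TRANS[state][j];
        if (u < cum)
            return j;
    }
    return N_STATES - 1;  /* guard against rounding */
}

/* Strategy II: run one chain from x1, discard the first burn_in steps,
   then keep every thin-th state until m states are stored in out[]. */
void collect_thinned(int x1, int burn_in, int thin, int m, int *out)
{
    int x = x1;
    for (int t = 0; t < burn_in; t++)
        x = step(x);
    for (int k = 0; k < m; k++) {
        for (int t = 0; t < thin; t++)
            x = step(x);
        out[k] = x;
    }
}
```

A larger `thin` reduces the correlation between consecutive kept samples at the cost of more simulation steps per sample.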