06 Exact Inference in BN: Presentation Transcript

    • Slide 1: Bayesian Networks, Unit 6: Exact Inference in Bayesian Networks. Wang, Yuan-Kai (王元凱), ykwang@mails.fju.edu.tw, http://www.ykwang.tw, Department of Electrical Engineering, Fu Jen University (輔仁大學電機工程系), 2006~2011. Reference this document as: Wang, Yuan-Kai, "Exact Inference in Bayesian Networks," Lecture Notes of Wang, Yuan-Kai, Fu Jen University, Taiwan, 2011.
    • Slide 2: Goal of This Unit
      – Learn to efficiently compute the sum-product of the inference formula
        $P(X \mid E=e) = \alpha \sum_{h \in H} \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
      – Remember: enumeration and multiplication of all $P(X_i \mid Pa(X_i))$ is not efficient
      – We will learn three other methods for exact inference
    • Slide 3: Related Units
      – Background: probabilistic graphical model
      – Next units: approximate inference algorithms; probabilistic inference over time
    • Slide 4: Self-Study References
      – Chapter 14, Artificial Intelligence: A Modern Approach, 2nd ed., S. Russell & P. Norvig, Prentice Hall, 2003.
      – S. M. Aji and R. J. McEliece, "The generalized distributive law," IEEE Trans. on Information Theory, vol. 46, no. 2, 2000.
      – B. D'Ambrosio, "Inference in Bayesian networks," AI Magazine, 1999.
      – M. I. Jordan & Y. Weiss, "Probabilistic inference in graphical models."
    • Slide 5: Structure of Related Lecture Notes
      [Figure: the burglary network (B, E, A, J, M) with its CPTs, annotated with the related units]
      – Representation learning: Unit 5 (BN), Unit 9 (Hybrid BN), Units 10~15 (Naïve Bayes, MRF, HMM, DBN, Kalman filter)
      – Parameter learning: Units 16~ (MLE, EM)
      – Query / inference: Unit 6 (exact inference), Unit 7 (approximate inference), Unit 8 (temporal inference)
    • Slide 6: Contents
      1. Basics of Graph (p. 11)
      2. Sum-Product and Generalized Distributive Law (p. 20)
      3. Variable Elimination (p. 29)
      4. Belief Propagation (p. 96)
      5. Junction Tree (p. 157)
      6. Summary (p. 212)
      7. Implementation (p. 214)
      8. Reference (p. 215)
    • Slide 7: Four Steps of Inference P(X|e)
      – Step 1: Bayes' theorem
        $P(X \mid E=e) = \frac{P(X, E=e)}{P(E=e)} = \alpha P(X, E=e)$
      – Step 2: Marginalization
        $= \alpha \sum_{h \in H} P(X, E=e, H=h)$
      – Step 3: Conditional independence
        $= \alpha \sum_{h \in H} \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
      – Step 4: Sum-product computation: exact inference or approximate inference
    • Slide 8: Five Types of Queries in Inference
      – For a probabilistic graphical model G, given a set of evidence E=e, query the PGM with:
      – P(e): likelihood query
      – arg max P(e): maximum likelihood query
      – P(X|e): posterior belief query
      – $\arg\max_x P(X=x \mid e)$ (single query variable): maximum a posteriori (MAP) query
      – $\arg\max_{x_1, \dots, x_k} P(X_1=x_1, \dots, X_k=x_k \mid e)$: most probable explanation (MPE) query
    • Slide 9: Brute Force Enumeration
      – We can compute the query in $O(K^N)$ time, where $K = |X_i|$ (example: the burglary network B, E, A, J, M)
      – By using a BN, we can represent the joint distribution in O(N) space
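As a concrete illustration of the $O(K^N)$ enumeration, here is a minimal Python sketch for the burglary network. The CPT values follow the standard Russell & Norvig network that the slides reference; where the deck's tables are garbled, treat these numbers as assumptions.

```python
import itertools

# CPTs for the burglary-alarm network (standard Russell & Norvig values).
P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(A=true | B, E)
P_J = {True: 0.90, False: 0.05}                      # P(J=true | A)
P_M = {True: 0.70, False: 0.01}                      # P(M=true | A)

def joint(b, e, a, j, m):
    """P(b,e,a,j,m) as the product of the five CPT entries."""
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = P_J[a] if j else 1 - P_J[a]
    pm = P_M[a] if m else 1 - P_M[a]
    return P_B[b] * P_E[e] * pa * pj * pm

# P(B=true | J=true, M=true) by brute-force enumeration over E and A.
num = sum(joint(True, e, a, True, True)
          for e, a in itertools.product([True, False], repeat=2))
den = sum(joint(b, e, a, True, True)
          for b, e, a in itertools.product([True, False], repeat=3))
print(num / den)   # about 0.284
```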
    • Slide 10: Expression Tree of Enumeration: Repeated Computations
      – $P(b \mid j,m) = \alpha \sum_E \sum_A P(b) P(E) P(A \mid b,E) P(j \mid A) P(m \mid A)$
      [Figure: the evaluation tree branches on E and A; subexpressions such as $P(j \mid a) P(m \mid a)$ are computed repeatedly in different branches]
    • Slide 11: 1. Basics of Graph
      – Polytree
      – Multiply connected networks
      – Clique
      – Markov network
      – Chordal graph
      – Induced width
    • Slide 12: Two Kinds of PGMs
      – There are two kinds of probabilistic graphical models (PGMs):
      – Singly connected network (polytree)
      – Multiply connected network
    • Slide 13: Singly Connected Networks (Polytree)
      – Any two nodes are connected by at most one undirected path (examples: the Burglary/Earthquake/Alarm/JohnCalls/MaryCalls network; a tree over nodes A..H)
      – Theorem: inference in a polytree is linear in the size of the network
      – This assumes a tabular CPT representation
    • Slide 14: Multiply Connected Networks
      – At least two nodes are connected by more than one undirected path (example: Cloudy, Sprinkler, Rain, WetGrass)
    • Slide 15: Clique (1/2)
      – A clique is a subgraph of an undirected graph that is complete and maximal
      – Complete: fully connected; every node connects to every other node
      – Maximal: no larger complete subgraph contains it
    • Slide 16: Clique (2/2)
      – Identify cliques
      [Figure: a graph over A..H whose cliques are ABD, ADE, ACE, DEF, CEG, EGH]
    • Slide 17: Markov Network (1/2)
      – An undirected graph with hyper-nodes (multi-vertex nodes) and hyper-edges (multi-vertex edges)
      [Figure: the clique graph with hyper-nodes ABD, ADE, ACE, DEF, CEG, EGH]
    • Slide 18: Markov Network (2/2)
      – Every hyper-edge $e = (x_1 \dots x_k)$ has a potential function $f_e(x_1 \dots x_k)$
      – The probability distribution is
        $P(X_1, \dots, X_n) = Z \prod_{e \in E} f_e(x_{e_1}, \dots, x_{e_k})$, where
        $Z = 1 \big/ \sum_{x_1} \cdots \sum_{x_n} \prod_{e \in E} f_e(x_{e_1}, \dots, x_{e_k})$
      – Example: $P(EGH, CEG) = Z \prod_{e \in E} f_e(E, G, H, C)$
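A small sketch of this definition in Python; the potential tables below are made-up illustrative numbers, not values from the slides.

```python
import itertools

# A toy Markov network over binary variables A, B, C with two edges
# (A,B) and (B,C); potentials are arbitrary positive numbers.
phi_ab = {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 2.0}
phi_bc = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}

def unnormalized(a, b, c):
    """Product of the edge potentials for one full assignment."""
    return phi_ab[(a, b)] * phi_bc[(b, c)]

# The normalizer makes the products of potentials sum to 1.
Z = sum(unnormalized(a, b, c)
        for a, b, c in itertools.product([0, 1], repeat=3))

def prob(a, b, c):
    return unnormalized(a, b, c) / Z

print(prob(1, 1, 1))   # probability of the all-ones assignment
```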
    • Slide 19: Chordal Graphs
      – An elimination ordering induces an undirected chordal graph
      [Figure: the Asia graph (V, S, T, L, A, B, X, D) and its chordal version]
      – Maximal cliques are factors in elimination
      – Factors in elimination are cliques in the graph
      – Complexity is exponential in the size of the largest clique in the graph
    • Slide 20: 2. Sum-Product and Generalized Distributive Law
      – $P(X \mid E=e) = \alpha \sum_{h \in H} \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
      – We obtain this formula because of two rules of probability theory:
      – Sum rule: $P(x) = \sum_y P(x, y)$
      – Product rule: $P(x, y) = P(x \mid y) P(y)$
    • Slide 21: The Sum-Product with Generalized Distributive Law
      $P(X \mid E=e) = \alpha \sum_{h \in H} \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
      $= \alpha \sum_{X_k} \cdots \sum_{X_1} \prod_{i=1}^{k} P(X_i \mid Pa(X_i))$
      $= \alpha \sum_{X_k} \cdots \sum_{X_1} P(X_1 \mid Pa(X_1)) \cdots P(X_k \mid Pa(X_k))$
      $= \alpha \sum_{X_k} P(X_k \mid Pa(X_k)) \prod P(X_t \mid X_k, \dots) \; \cdots \; \sum_{X_1} P(X_1 \mid Pa(X_1)) \prod P(X_u \mid X_1, \dots)$
    • Slide 22: Distributive Law for Sum-Product (1/3)
      – $a x_1 + a x_2 = a (x_1 + x_2)$, i.e. $\sum_i a x_i = a \sum_i x_i$
      – $\sum_i \sum_j x_i x_j = \left(\sum_i x_i\right)\left(\sum_j x_j\right)$
      – $\sum_i P(x_i \mid x_h) = \sum_i \frac{P(x_i, x_h)}{P(x_h)} = \frac{\sum_i P(x_i, x_h)}{P(x_h)}$ (variable $i$ is eliminated)
      – $\sum_i \sum_j P(x_i) P(x_j) = \left(\sum_i P(x_i)\right)\left(\sum_j P(x_j)\right)$
      – $\sum_i \sum_j P(x_i \mid x_h) P(x_j \mid x_k) = \left(\sum_i P(x_i \mid x_h)\right)\left(\sum_j P(x_j \mid x_k)\right) = f_1(x_h) f_2(x_k)$
    • Slide 23: Distributive Law for Sum-Product (2/3)
      – $\sum_i \sum_j P(x_i \mid x_h) P(x_j \mid x_k) = \left(\sum_i P(x_i \mid x_h)\right)\left(\sum_j P(x_j \mid x_k)\right) = f_1(x_h) f_2(x_k)$
      – $\sum_i \sum_j P(x_i \mid x_k) P(x_j \mid x_i) = \sum_i P(x_i \mid x_k) \left(\sum_j P(x_j \mid x_i)\right) = \sum_i P(x_i \mid x_k) f(x_i) = f(x_k)$
      – $P(b \mid j,m) = \alpha \sum_e \sum_a P(b) P(e) P(a \mid b,e) P(j \mid a) P(m \mid a) = \alpha P(b) \sum_e P(e) \sum_a P(a \mid b,e) P(j \mid a) P(m \mid a)$
    • Slide 24: Distributive Law for Sum-Product (3/3)
      – ab + ac = a(b + c)
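A tiny numeric check of the distributive law: the factored form computes the same value as the double sum with far fewer multiplications.

```python
# ab + ac = a(b + c) at scale: the double sum below uses 9 multiplications,
# the factored form uses just 1, yet both give the same value.
xs = [0.2, 0.5, 0.3]
ys = [0.1, 0.7, 0.2]

lhs = sum(x * y for x in xs for y in ys)   # sum_i sum_j x_i * y_j
rhs = sum(xs) * sum(ys)                    # (sum_i x_i) * (sum_j y_j)
assert abs(lhs - rhs) < 1e-12
print(lhs, rhs)
```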
    • Slide 25: Distributive Law for Max-Product
      – $\max(a x_1, a x_2) = a \max(x_1, x_2)$, i.e. $\max_i a x_i = a \max_i x_i$
      – $\max_i \max_j x_i x_j = \left(\max_i x_i\right)\left(\max_j x_j\right)$
      – $\max_i \max_j P(x_i) P(x_j) = \left(\max_i P(x_i)\right)\left(\max_j P(x_j)\right)$
      – $\max_i \max_j P(x_i \mid x_k) P(x_j \mid x_k) = \left(\max_i P(x_i \mid x_k)\right)\left(\max_j P(x_j \mid x_k)\right)$
      – $\arg\max_i P(x_i)$
    • Slide 26: Generalized Distributive Law (1/2)
      – Aji and McEliece, 2000
    • Slide 27: Generalized Distributive Law (2/2) (Aji and McEliece, 2000)
      – Sum-product: a + 0 = 0 + a = a; a * 1 = 1 * a = a; a*b + a*c = a*(b + c)
      – Max-product: max(a, 0) = max(0, a) = a; a * 1 = 1 * a = a; max(a*b, a*c) = a * max(b, c)
    • Slide 28: Marginal to MAP: Max-Product
      [Figure: a chain x1-x2-x3-x4-x5; likelihood and posterior queries use sum-product, while maximum likelihood and MAP queries use max-product]
    • Slide 29: 3. Variable Elimination
      – Variable elimination improves the enumeration algorithm by eliminating repeated calculations:
      – Carry out summations right-to-left (bottom-up in the evaluation tree)
      – Store intermediate results (factors) to avoid re-computation
      – Drop irrelevant variables
    • Slide 30: Basic Idea
      – Write the query in the form
        $P(X_n, e) = \sum_{x_k} \cdots \sum_{x_3} \sum_{x_2} \prod_i P(x_i \mid pa_i)$
      – Iteratively:
        – Move all irrelevant terms (constants) outside the innermost summation, e.g. $\sum\left(\sum_i a_i b c\right) = \sum\left(b c \left(\sum_i a_i\right)\right)$
        – Perform the innermost sum, getting a new term: a factor
        – Insert the new term into the product
    • Slide 31: An Example without Evidence (1/2)
      – The sprinkler network: Cloudy is the parent of Sprinkler and Rain; Sprinkler and Rain are the parents of WetGrass. CPTs: P(C) = 0.5; P(R|C): c 0.8, ¬c 0.2; P(S|C): c 0.1, ¬c 0.5; P(W|S,R): (s,r) 0.99, (s,¬r) 0.90, (¬s,r) 0.90, (¬s,¬r) 0.00
      – $P(w) = \sum_{r,s,c} P(w \mid r,s) P(r \mid c) P(s \mid c) P(c)$
        $= \sum_{r,s} P(w \mid r,s) \sum_c P(r \mid c) P(s \mid c) P(c)$
        $= \sum_{r,s} P(w \mid r,s) f_1(r,s)$, where $f_1(r,s)$ is a factor
    • Slide 32: An Example without Evidence (2/2)
      – $f_1(R,S) = \sum_c P(R \mid C) P(S \mid C) P(C)$: the eight-row table over (R, S, C) collapses into a four-row factor over (R, S)
      – A factor may be a function or a value
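The same elimination in Python, using the CPTs from Slide 31 (a sketch; variables are booleans and factors are plain dictionaries):

```python
import itertools

# Sprinkler network CPTs from Slide 31.
P_C = {True: 0.5, False: 0.5}
P_R = {True: 0.8, False: 0.2}    # P(R=true | C)
P_S = {True: 0.1, False: 0.5}    # P(S=true | C)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.00}   # P(W=true | S, R)

# Eliminate C first: f1(r, s) = sum_c P(r|c) P(s|c) P(c)
f1 = {}
for r, s in itertools.product([True, False], repeat=2):
    f1[(r, s)] = sum(
        (P_R[c] if r else 1 - P_R[c]) *
        (P_S[c] if s else 1 - P_S[c]) * P_C[c]
        for c in [True, False])

# Then sum the remaining product over r and s.
P_w = sum(P_W[(s, r)] * f1[(r, s)]
          for r, s in itertools.product([True, False], repeat=2))
print(P_w)   # about 0.6471
```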
    • Slide 33: An Example with Evidence (1/2)
      [Figure: the burglary network with evidence set, annotated with the factors used by elimination]
    • Slide 34: An Example with Evidence (2/2)
      – Burglary network CPTs: P(B)=0.001, P(E)=0.002; P(A|B,E): (b,e) 0.95, (b,¬e) 0.94, (¬b,e) 0.29, (¬b,¬e) 0.001; P(J|A): a 0.90, ¬a 0.05; P(M|A): a 0.70, ¬a 0.01
      – With evidence j, m: $f_J(a) = \langle 0.90, 0.05 \rangle$, $f_M(a) = \langle 0.70, 0.01 \rangle$
      – $f_{JM}(a,b,e) = f_A(a,b,e)\, f_J(a)\, f_M(a)$ is built by pointwise product (e.g. the (a,b,e) row is $0.70 \times 0.90 \times 0.95$); summing out a gives $f_{\bar{A}JM}(b,e)$
    • Slide 35: Basic Operations
      – Summing out a variable from a product of factors:
      – Move any irrelevant terms (constants) outside the innermost summation
      – Add up submatrices in the pointwise product of the remaining factors
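These two operations are easy to state in code. A minimal sketch with a factor represented as a (vars, table) pair; the helper names `product` and `sum_out` are ours, not from the slides.

```python
import itertools

# A factor is (vars, table): vars is a tuple of variable names; table maps
# full assignments (value tuples aligned with vars) to numbers.

def product(f, g, domains):
    """Pointwise product of two factors over the union of their variables."""
    fvars, ftab = f
    gvars, gtab = g
    out_vars = tuple(dict.fromkeys(fvars + gvars))   # union, order-preserving
    out_tab = {}
    for vals in itertools.product(*(domains[v] for v in out_vars)):
        asg = dict(zip(out_vars, vals))
        out_tab[vals] = (ftab[tuple(asg[v] for v in fvars)] *
                         gtab[tuple(asg[v] for v in gvars)])
    return out_vars, out_tab

def sum_out(var, f):
    """Marginalize variable `var` out of factor f."""
    fvars, ftab = f
    keep = tuple(v for v in fvars if v != var)
    out = {}
    for vals, p in ftab.items():
        key = tuple(v for n, v in zip(fvars, vals) if n != var)
        out[key] = out.get(key, 0.0) + p
    return keep, out

# Usage: P(B) = sum_a P(a) P(b|a), on made-up numbers.
domains = {'A': [0, 1], 'B': [0, 1]}
f = (('A',), {(0,): 0.4, (1,): 0.6})
g = (('B', 'A'), {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.3, (1, 1): 0.7})
print(sum_out('A', product(f, g, domains)))   # the marginal over B
```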
    • Slide 36: Variable Elimination Algorithm
      [Figure: pseudocode of the variable elimination algorithm; a textual version appears on Slide 88]
    • Slide 37: Irrelevant Variables (1/2)
      – Consider the query P(JohnCalls | Burglary = true):
        $P(J \mid b) = \alpha P(b) \sum_e P(e) \sum_a P(a \mid b,e) P(J \mid a) \sum_m P(m \mid a)$
      – The sum over m is identically 1: $\sum_m P(m \mid a) = 1$
      – M is irrelevant to the query
    • Slide 38: Irrelevant Variables (2/2)
      – Theorem 1: for a query P(X|E), Y is irrelevant if $Y \notin Ancestors(\{X\} \cup E)$
      – In the example P(J|b): X = JohnCalls, E = {Burglary}; $Ancestors(\{X\} \cup E) = \{Alarm, Earthquake\}$, so MaryCalls is irrelevant
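A sketch of this pruning rule in Python: collect $\{X\} \cup E$ together with all their ancestors by walking parent links, and drop every other variable before elimination (the function name and representation are ours).

```python
def relevant(query, evidence, parents):
    """Return {query} ∪ evidence ∪ their ancestors, per Theorem 1:
    every variable outside this set is irrelevant to P(query | evidence)."""
    frontier = {query, *evidence}
    seen = set()
    while frontier:
        v = frontier.pop()
        if v in seen:
            continue
        seen.add(v)
        frontier |= set(parents.get(v, ()))   # walk up the parent links
    return seen

# Burglary network: parents of each node.
parents = {'A': ('B', 'E'), 'J': ('A',), 'M': ('A',), 'B': (), 'E': ()}
print(relevant('J', {'B'}, parents))   # {'J','B','A','E'}: M is irrelevant
```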
    • Slide 39: Complexity
      – Time and space cost of variable elimination are $O(d^k n)$
        – n: number of random variables; d: number of discrete values; k: number of parent nodes (k is critical)
      – Polytrees: k is small, linear complexity; if k = 1, O(dn)
      – Multiply connected networks: $O(d^k n)$ with large k
        – Can reduce 3SAT to variable elimination: NP-hard
        – Equivalent to counting 3SAT models: #P-complete, i.e. strictly harder than NP-complete problems
    • Slide 40: Pros and Cons
      – Variable elimination is simple and efficient for a single query P(X_i | e)
      – But it is less efficient if all the variables are queried: P(X_1 | e), ..., P(X_k | e)
        – In a polytree network, one would need to issue O(n) queries costing O(n) each: O(n^2)
      – The junction tree algorithm extends variable elimination to compute posterior probabilities for all nodes simultaneously
    • Slide 41: 3.1 An Example
      – The Asia network: Visit to Asia is the parent of Tuberculosis; Smoking is the parent of Lung Cancer and Bronchitis; Tuberculosis and Lung Cancer are the parents of Abnormality in Chest, which is the parent of X-Ray; Abnormality in Chest and Bronchitis are the parents of Dyspnea
    • Slide 42: Brute Force on the Asia Network (V, S, T, L, A, B, X, D)
      – We want to infer P(d); we need to eliminate v, s, x, t, l, a, b
      – Initial factors:
        $P(v,s,t,l,a,b,x,d) = P(v) P(s) P(t \mid v) P(l \mid s) P(b \mid s) P(a \mid t,l) P(x \mid a) P(d \mid a,b)$
      – "Brute force approach": $P(d) = \sum_x \sum_b \sum_a \sum_l \sum_t \sum_s \sum_v P(v,s,t,l,a,b,x,d)$
      – Complexity is exponential, $O(K^N)$: N is the size of the graph (number of variables), K the number of states per variable
    • Slide 43: Eliminate v
      – Initial factors: $P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)$
      – Eliminate v; compute $f_v(t) = \sum_v P(v) P(t \mid v)$, giving
        $f_v(t)\, P(s) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)$
      – Note: $f_v(t) = P(t)$ here (table on the slide: t 0.70, ¬t 0.01), but in general the result of elimination is not necessarily a probability term
    • Slide 44: Eliminate s
      – Remaining to eliminate: s, x, t, l, a, b
      – Eliminate s; compute $f_s(b,l) = \sum_s P(s) P(b \mid s) P(l \mid s)$, giving
        $f_v(t)\, f_s(b,l)\, P(a|t,l) P(x|a) P(d|a,b)$
      – Summing on s results in $f_s(b,l)$, a factor with two arguments: the result of elimination may be a function of several variables
    • Slide 45: Eliminate x
      – Remaining: x, t, l, a, b
      – Eliminate x; compute $f_x(a) = \sum_x P(x \mid a)$, giving
        $f_v(t)\, f_s(b,l)\, f_x(a)\, P(a|t,l) P(d|a,b)$
      – Note: $f_x(a) = 1$ for all values of a!
    • Slide 46: Eliminate t
      – Remaining: t, l, a, b
      – Eliminate t; compute $f_t(a,l) = \sum_t f_v(t) P(a \mid t,l)$, giving
        $f_s(b,l)\, f_x(a)\, f_t(a,l)\, P(d|a,b)$
    • Slide 47: Eliminate l
      – Remaining: l, a, b
      – Eliminate l; compute $f_l(a,b) = \sum_l f_s(b,l) f_t(a,l)$, giving
        $f_l(a,b)\, f_x(a)\, P(d|a,b)$
    • Slide 48: Eliminate a and b
      – Remaining: a, b
      – Compute $f_a(b,d) = \sum_a f_l(a,b) f_x(a) P(d \mid a,b)$, then $f_b(d) = \sum_b f_a(b,d)$
      – The result $f_b(d)$ is the answer P(d)
    • Slide 49: A Different Elimination Ordering
      – Eliminate in the order a, b, x, t, v, s, l on the same initial factors
      – Intermediate factors: $g_a(l,t,d,b,x,s,v)$, $g_b(l,t,d,x,s,v)$, $g_x(l,t,d,s,v)$, $g_t(l,d,s,v)$, $g_v(l,d,s)$, $g_s(l,d)$, $g_l(d)$
      – Compare with the previous order: $f_v(t)$, $f_s(b,l)$, $f_x(a)$, $f_t(a,l)$, $f_l(a,b)$, $f_a(b,d)$, $f_b(d)$
      – Both orderings need n = 7 steps, but each step has a different computation size
    • Slide 50: Short Summary
      – Variable elimination is a sequence of rewriting operations
      – Computation depends on:
        – The number of variables n: each elimination step removes one variable, so we need n elimination steps
        – The size of the factors: affected by the order of elimination (discussed in sub-section 3.2)
    • Slide 51: Dealing with Evidence (1/7)
      – How do we deal with evidence?
      – Suppose we get evidence V = t, S = f, D = t
      – We want to compute P(L, V = t, S = f, D = t)
    • Slide 52: Dealing with Evidence (2/7)
      – We start by writing the factors:
        $P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)$
      – Since we know that V = t, we don't need to eliminate V
      – Instead, we can replace the factors P(V) and P(T|V) with
        $f_{P(V)} = P(V = t)$ and $f_{P(T|V)}(T) = P(T \mid V = t)$
      – These "select" the appropriate parts of the original factors given the evidence
      – Note that $f_{P(V)}$ is a constant, and thus does not appear in the elimination of other variables
    • Slide 53: Dealing with Evidence (3/7)
      – Given evidence V = t, S = f, D = t, compute P(L, V=t, S=f, D=t)
      – Initial factors, after setting evidence:
        $f_{P(v)}\, f_{P(s)}\, f_{P(t|v)}(t)\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, P(a|t,l)\, P(x|a)\, f_{P(d|a,b)}(a,b)$
    • Slide 54: Dealing with Evidence (4/7)
      – Eliminating x, we get
        $f_{P(v)}\, f_{P(s)}\, f_{P(t|v)}(t)\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, P(a|t,l)\, f_x(a)\, f_{P(d|a,b)}(a,b)$
    • Slide 55: Dealing with Evidence (5/7)
      – Eliminating t, we get
        $f_{P(v)}\, f_{P(s)}\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, f_t(a,l)\, f_x(a)\, f_{P(d|a,b)}(a,b)$
    • Slide 56: Dealing with Evidence (6/7)
      – Eliminating a, we get
        $f_{P(v)}\, f_{P(s)}\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, f_a(b,l)$
    • Slide 57: Dealing with Evidence (7/7)
      – Eliminating b, we get
        $f_{P(v)}\, f_{P(s)}\, f_{P(l|s)}(l)\, f_b(l)$
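Setting evidence amounts to slicing factors before elimination. A sketch, using the (vars, table) factor representation from earlier; the `restrict` helper and the example numbers are assumptions, following the shapes of the Asia CPTs.

```python
def restrict(f, var, value):
    """Slice a factor on observed evidence var = value, dropping that axis."""
    fvars, ftab = f
    if var not in fvars:
        return f                               # evidence variable not mentioned
    i = fvars.index(var)
    keep = fvars[:i] + fvars[i + 1:]
    table = {vals[:i] + vals[i + 1:]: p
             for vals, p in ftab.items() if vals[i] == value}
    return keep, table

# Example: restrict P(T|V) to the evidence V = True (illustrative numbers).
f_tv = (('T', 'V'), {(True, True): 0.05, (False, True): 0.95,
                     (True, False): 0.01, (False, False): 0.99})
print(restrict(f_tv, 'V', True))               # a factor over T only
```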
    • Slide 58: Complexity (1/2)
      – Suppose in one elimination step we compute
        $f'_x(y_1, \dots, y_k) = \sum_x f_x(x, y_1, \dots, y_k)$, where
        $f_x(x, y_1, \dots, y_k) = \prod_{i=1}^{m} f_i(x, y_{i,1}, \dots, y_{i,l_i})$
      – This requires (with |X| the number of discrete values of X):
        – $m \cdot |X| \cdot \prod_i |Y_i|$ multiplications: for each value of $x, y_1, \dots, y_k$, we do m multiplications
        – $|X| \cdot \prod_i |Y_i|$ additions: for each value of $y_1, \dots, y_k$, we do |X| additions
    • Slide 59: Complexity (2/2)
      – One elimination step requires $m \cdot |X| \cdot \prod_i |Y_i|$ multiplications and $|X| \cdot \prod_i |Y_i|$ additions,
        i.e. $O(|X| \cdot \prod_i |Y_i|)$ (m is a constant, neglected), or $O(d^k)$ if $|X| = |Y_i| = d$ and k is the number of parent nodes
      – Total time and space cost are $O(d^k n)$: complexity is exponential in the number of parents k
    • Slide 60: 3.2 Order of Elimination
      – How to select "good" elimination orderings in order to reduce complexity:
      1. Start by understanding variable elimination via the graph we are working with
      2. Then reduce the problem of finding a good ordering to a well-understood graph-theoretic operation
    • Slide 61: Undirected Graph Conversion (1/2)
      – At each stage of variable elimination, we have an algebraic term that we need to evaluate
      – This term is of the form
        $P(x_1, \dots, x_k) = \sum_{y_1} \cdots \sum_{y_n} \prod_i f_i(Z_i)$,
        where the $Z_i$ are sets of variables
    • Slide 62: Undirected Graph Conversion (2/2)
      – Plot a graph with an undirected edge X--Y whenever X and Y are arguments of some factor, that is, whenever X and Y appear together in some $Z_i$
    • Slide 63: Example
      – Consider the "Asia" example; the initial factors are
        $P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)$
      [Figure: the directed Asia graph and its undirected conversion]
      – In the first step this graph is just the moralized graph
    • Slide 64: Variable Elimination as a Change of Graph
      – From $P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)$, eliminating t gives
        $P(v) P(s) P(l|s) P(b|s) P(x|a) P(d|a,b)\, f_t(v,a,l)$
      – The corresponding change in the graph: nodes V, L, A become a clique
    • Slide 65: Example (1/6)
      – Want to compute P(L, V=t, S=f, D=t)
      – Moralizing
      [Figure: the moralized Asia graph]
    • Slide 66: Example (2/6)
      – Setting evidence (V, S, D become observed)
    • Slide 67: Example (3/6)
      – Eliminating x: new factor $f_x(A)$
    • Slide 68: Example (4/6)
      – Eliminating a: new factor $f_a(b,t,l)$; this creates a clique in the reduced undirected graph
    • Slide 69: Example (5/6)
      – Eliminating b: new factor $f_b(t,l)$; again a clique in the reduced undirected graph
    • Slide 70: Example (6/6)
      – Eliminating t: new factor $f_t(l)$
    • Slide 71: Elimination and Clique (1/2)
      – We can eliminate a variable X by:
        1. For all Y, Z such that Y--X and Z--X, add an edge Y--Z
        2. Remove X and all edges adjacent to it
      – This procedure creates a clique that contains all the neighbors of X
      – After step 1 we have a clique that corresponds to the intermediate factor (before marginalization)
      – The cost of the step is exponential in the size of this clique: $d^k$ in $O(n d^k)$
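This two-step graph operation is easy to simulate; the sketch below also reports the induced width (the largest clique created, minus one) for a given ordering. The adjacency is our rendering of the moralized Asia graph from Slide 63.

```python
def eliminate_order_width(adj, order):
    """Simulate node elimination on an undirected graph.

    For each eliminated node: connect all its current neighbors (step 1),
    then remove it (step 2). Returns the induced width, i.e. the size of
    the largest clique created minus one."""
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    width = 0
    for v in order:
        nbrs = adj.pop(v)
        width = max(width, len(nbrs))             # clique = v plus its neighbors
        for a in nbrs:
            adj[a].discard(v)
            adj[a] |= nbrs - {a}                  # fill-in edges
    return width

# Moralized Asia graph (moral edges T-L and A-B included).
asia = {'V': {'T'}, 'S': {'L', 'B'}, 'T': {'V', 'L', 'A'},
        'L': {'S', 'T', 'A'}, 'A': {'T', 'L', 'X', 'D', 'B'},
        'B': {'S', 'A', 'D'}, 'X': {'A'}, 'D': {'A', 'B'}}
print(eliminate_order_width(asia, ['V', 'X', 'S', 'T', 'L', 'A', 'B', 'D']))  # 2
```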
    • Slide 72: Elimination and Clique (2/2)
      – The process of eliminating nodes from an undirected graph gives us a clue to the complexity of inference
      – To see this, we examine the graph that contains all of the edges we added during the elimination
      – The resulting graph is always chordal
    • Slide 73: Example (1/7)
      – Want to compute P(D) by eliminating v, x, s, t, l, a, b
      – Moralizing
    • Slide 74: Example (2/7)
      – Eliminating v: multiply to get $f'_v(v,t)$, result $f_v(t)$
    • Slide 75: Example (3/7)
      – Eliminating x: multiply to get $f'_x(a,x)$, result $f_x(a)$
    • Slide 76: Example (4/7)
      – Eliminating s: multiply to get $f'_s(l,b,s)$, result $f_s(l,b)$
    • Slide 77: Example (5/7)
      – Eliminating t: multiply to get $f'_t(a,l,t)$, result $f_t(a,l)$
    • Slide 78: Example (6/7)
      – Eliminating l: multiply to get $f'_l(a,b,l)$, result $f_l(a,b)$
    • Slide 79: Example (7/7)
      – Eliminating a, b: multiply to get $f'_a(a,b,d)$, result $f(d)$
    • Slide 80: Induced Graphs
      – The resulting graph is the induced graph (for this particular ordering)
      – Main properties:
        – Every maximal clique in the induced graph corresponds to an intermediate factor in the computation
        – Every factor stored during the process is a subset of some maximal clique in the graph
      – These facts are true for any variable elimination ordering on any network
    • Slide 81: Induced Width (Treewidth)
      – The size k of the largest clique in the induced graph is an indicator of the complexity of variable elimination
      – w = k - 1 is called the induced width (treewidth) of the graph, according to the specified ordering
      – Finding a good ordering for a graph is equivalent to finding the minimal induced width of the graph
    • Slide 82: Treewidth
      – Low treewidth: chains (w = 1), trees with no loops (w = 1)
      – High treewidth: an n×n grid has $w = O(n) = O(\sqrt{N})$; loopy graphs such as the ALARM medical network have w on the order of the number of parents
      – Finding the treewidth is NP-hard (Arnborg, 1985)
    • Slide 83: Complexity
      – Time and space cost of variable elimination are $O(d^k n)$, where k = treewidth + 1 (w + 1)
        – n: number of random variables; d: number of discrete values
      – Polytrees: k is small, linear; if k = 1, O(dn)
      – Multiply connected networks: $O(d^k n)$ with large k
        – Can reduce 3SAT to variable elimination: NP-hard
        – Equivalent to counting 3SAT models: #P-complete, i.e. strictly harder than NP-complete problems
    • Slide 84: Elimination on Trees (1/3)
      – Suppose we have a tree: a network where each variable has at most one parent
      – Then all the factors involve at most two variables: treewidth = 1
      – The moralized graph is also a tree
    • Slide 85: Elimination on Trees (2/3)
      – We can maintain the tree structure by eliminating extreme (leaf) variables of the tree
    • Slide 86: Elimination on Trees (3/3)
      – Formally, for any tree there is an elimination ordering with treewidth = 1
      – Theorem: inference on trees is linear in the number of variables: O(dn)
    • Slide 87: Exercise: Variable Elimination
      – Query: what is the probability that a student studied, given that they pass the exam?
      – Network: smart and study are parents of prepared; smart, prepared, and fair are parents of pass; p(smart)=.8, p(study)=.6, p(fair)=.9
      – p(prep | smart, study)=.9; p(prep | smart, ¬study)=.7; p(prep | ¬smart, study)=.5; p(prep | ¬smart, ¬study)=.1
      – p(pass | ..., fair): .9 (smart, prep), .7 (smart, ¬prep), .7 (¬smart, prep), .2 (¬smart, ¬prep); p(pass | ..., ¬fair) = .1 in all cases
    • Slide 88: Variable Elimination Algorithm
      – Let X_1, ..., X_m be an ordering on the non-query variables; compute
        $\sum_{X_1} \sum_{X_2} \cdots \sum_{X_m} \prod_j P(X_j \mid Parents(X_j))$
      – For i = m, ..., 1:
        – Leave in the summation for X_i only factors mentioning X_i
        – Multiply those factors, getting a factor that contains a number for each value of the variables mentioned, including X_i
        – Sum out X_i, getting a factor f that contains a number for each value of the variables mentioned, not including X_i
        – Replace the multiplied factors in the summation with f
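The loop above translates almost line-for-line into Python. A compact sketch using the dictionary-based factors from earlier (the function and its representation are ours, not the deck's):

```python
import itertools

def ve_query(factors, domains, order):
    """Sum-product variable elimination. `factors` are (vars, table) pairs;
    `order` lists the non-query variables to eliminate, in order."""
    def multiply(fs):
        vs = tuple(dict.fromkeys(v for f in fs for v in f[0]))
        tab = {}
        for vals in itertools.product(*(domains[v] for v in vs)):
            asg = dict(zip(vs, vals))
            p = 1.0
            for fvars, ftab in fs:
                p *= ftab[tuple(asg[v] for v in fvars)]
            tab[vals] = p
        return vs, tab

    for x in order:
        mention = [f for f in factors if x in f[0]]
        if not mention:
            continue
        rest = [f for f in factors if x not in f[0]]
        vs, tab = multiply(mention)              # multiply factors mentioning x
        i = vs.index(x)
        summed = {}
        for vals, p in tab.items():              # sum x out
            key = vals[:i] + vals[i + 1:]
            summed[key] = summed.get(key, 0.0) + p
        factors = rest + [(vs[:i] + vs[i + 1:], summed)]

    vs, tab = multiply(factors)                  # combine what remains
    z = sum(tab.values())
    return vs, {vals: p / z for vals, p in tab.items()}

# Usage on a two-node chain A -> B with made-up CPTs.
domains = {'A': [True, False], 'B': [True, False]}
fa = (('A',), {(True,): 0.3, (False,): 0.7})
fb = (('B', 'A'), {(True, True): 0.9, (False, True): 0.1,
                   (True, False): 0.2, (False, False): 0.8})
print(ve_query([fa, fb], domains, order=['A']))  # P(B) = 0.41 / 0.59
```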
    • Slide 89: 3.3 General Graphs
      – If the graph is not a polytree: more general networks, usually loopy networks
      – Can we do inference in loopy networks by variable elimination?
        – If the network has a cycle, the treewidth for any ordering is greater than 1
        – Its complexity is high, so VE becomes an impractical algorithm
    • Slide 90: Example (1/2)
      – Eliminating A, B, C, D, E, ...
      – The resulting graph is chordal with treewidth 2
      [Figure: successive elimination steps on the loopy graph over A..H]
    • Slide 91: Example (2/2)
      – Eliminating H, G, E, C, F, D, B, A
      – The resulting graph is chordal with treewidth 3
      [Figure: successive elimination steps on the same graph under this ordering]
    • Slide 92: Finding a Good Elimination Order in General Graphs
      – Theorem: finding an ordering that minimizes the treewidth is NP-hard
      – However:
        – There are reasonable heuristics for finding "relatively" good orderings
        – There are provable approximations to the best treewidth
        – If the graph has a small treewidth, there are algorithms that find it in polynomial time
    • Slide 93: Heuristics for Finding an Elimination Order
      – Since the elimination order is NP-hard to optimize, it is common to apply greedy search techniques (Kjaerulff, 1990)
      – At each iteration, eliminate the node that would result in the smallest:
        – number of fill-in edges [min-fill], or
        – resulting clique weight [min-weight] (weight of a clique = product of the number of states per node in the clique)
      – There are also approximation algorithms (Amir, 2001)
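A sketch of the min-fill heuristic, reusing the adjacency representation from the elimination sketch above (greedy, as described; ties are broken arbitrarily):

```python
def min_fill_order(adj):
    """Greedy elimination ordering: repeatedly pick the node whose removal
    would add the fewest fill-in edges (the min-fill heuristic)."""
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    order = []
    while adj:
        def fill_cost(v):
            ns = list(adj[v])
            # count neighbor pairs that are not already connected
            return sum(1 for i in range(len(ns))
                       for j in range(i + 1, len(ns))
                       if ns[j] not in adj[ns[i]])
        v = min(adj, key=fill_cost)
        ns = adj.pop(v)
        for a in ns:                              # eliminate v: add fill-ins
            adj[a].discard(v)
            adj[a] |= ns - {a}
        order.append(v)
    return order
```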
    • Slide 94: Factorization in Loopy Networks
      – Probabilistic models with no loop are tractable (factorizable):
        $\sum_a \sum_b \sum_c \sum_d P(a,x) P(b,x) P(c,x) P(d,x) = \left(\sum_a P(a,x)\right)\left(\sum_b P(b,x)\right)\left(\sum_c P(c,x)\right)\left(\sum_d P(d,x)\right)$
      – Probabilistic models with a loop are not tractable (not factorizable):
        $\sum_a \sum_b \sum_c \sum_d P(a,b,c,d,x)$ does not factor
    • Slide 95: Short Summary
      – Variable elimination:
        – The actual computation is done in the elimination steps
        – Computation depends on the order of elimination
        – Very sensitive to topology
        – Space = time
      – Complexity: polytrees, linear time; general graphs, NP-hard
    • Slide 96: 4. Belief Propagation
      – Also called message passing, or Pearl's algorithm
      – Subsections:
        – 4.1 Message passing in simple chains
        – 4.2 Message passing in trees
        – 4.3 BP algorithm
        – 4.4 Message passing in general graphs
    • Slide 97: What's Wrong with VarElim
      – Often we want to query all hidden nodes
      – Variable elimination takes $O(N^2 d^k)$ time to compute P(X_i | e) for all (hidden) nodes X_i
      – Message passing algorithms can do this in $O(N d^k)$ time
    • Slide 98: Repeated Variable Elimination Leads to Redundant Calculations
      – Chain $X_1 \to X_2 \to X_3$ with observations $Y_1, Y_2, Y_3$:
        $P(x_1 \mid y_{1:3}) \propto P(x_1) P(y_1 \mid x_1) \sum_{x_2} P(x_2 \mid x_1) P(y_2 \mid x_2) \sum_{x_3} P(x_3 \mid x_2) P(y_3 \mid x_3)$
        $P(x_2 \mid y_{1:3}) \propto P(y_2 \mid x_2) \sum_{x_1} P(x_2 \mid x_1) P(x_1) P(y_1 \mid x_1) \sum_{x_3} P(x_3 \mid x_2) P(y_3 \mid x_3)$
        $P(x_3 \mid y_{1:3}) \propto P(y_3 \mid x_3) \sum_{x_2} P(x_3 \mid x_2) P(y_2 \mid x_2) \sum_{x_1} P(x_1) P(y_1 \mid x_1) P(x_2 \mid x_1)$
      – $O(N^2 K^2)$ time to compute all N marginals
    • Slide 99: Belief Propagation
      – Belief propagation (BP) operates by sending beliefs/messages between nearby variables in the graphical model
      – It works like variable elimination
    • Slide 100: 4.1 Message Passing in Simple Chains
      – Chain: $X_1 \to \cdots \to X_k \to \cdots \to X_n$
      – Likelihood query (query without evidence): P(X_1), P(X_n), P(X_k); P(X_j, X_k)
      – Posterior query (query with evidence): P(X_1|X_n), P(X_n|X_1), P(X_k|X_1), P(X_k|X_n), P(X_1|X_k), P(X_n|X_k), P(X_k|X_j)
      – Maximum a posteriori (MAP) query: arg max P(X_k|X_j)
    • Slide 101: Sum-Product of the Simple Chain (1/2)
      $P(X_k) = \sum_{X_1, \dots, X_{k-1}, X_{k+1}, \dots, X_n} P(X_1, \dots, X_k, \dots, X_n)$
      $= \sum_{X_1} \cdots \sum_{X_{k-1}} \sum_{X_{k+1}} \cdots \sum_{X_n} \prod_{X_i} P(X_i \mid Pa(X_i))$
      $= \sum_{X_1} \cdots \sum_{X_{k-1}} \sum_{X_{k+1}} \cdots \sum_{X_n} P(X_n \mid X_{n-1}) \cdots P(X_k \mid X_{k-1}) \cdots P(X_2 \mid X_1) P(X_1)$
      $= \sum_{X_1} P(X_1) \sum_{X_2} P(X_2 \mid X_1) \cdots \sum_{X_{k-1}} P(X_{k-1} \mid X_{k-2}) P(X_k \mid X_{k-1}) \; \sum_{X_{k+1}} P(X_{k+1} \mid X_k) \cdots \sum_{X_n} P(X_n \mid X_{n-1})$
    • Slide 102: Sum-Product of the Simple Chain (2/2)
      $P(X_k \mid X_j) \propto \sum_{\{X_i \mid 1 \le i \le n,\; i \ne j, k\}} P(X_1, \dots, X_n)$
      $= \sum_{\{X_i \mid i \ne j, k\}} \prod_{X_i} P(X_i \mid Pa(X_i))$
      $= \sum_{\{X_i \mid i \ne j, k\}} P(X_n \mid X_{n-1}) \cdots P(X_k \mid X_{k-1}) \cdots P(X_2 \mid X_1) P(X_1)$
    • Slide 103: 4.1.1 Likelihood Query
      – P(X_n) or P(x_n): forward passing along $X_1 \to X_2 \to \cdots \to X_n$
      – P(X_1) or P(x_1): backward passing
      – P(X_k) or P(x_k): forward-backward passing
    • Slide 104: Forward Passing (1/6)
      – Chain $A \to B \to C \to D \to E$; compute P(e):
        $P(e) = \sum_d \sum_c \sum_b \sum_a P(a) P(b \mid a) P(c \mid b) P(d \mid c) P(e \mid d)$
        $= \sum_d P(e \mid d) \sum_c P(d \mid c) \sum_b P(c \mid b) \sum_a P(a) P(b \mid a)$
    • Slide 105: Forward Passing (2/6)
      – Now we can perform the innermost summation:
        $P(e) = \sum_d P(e \mid d) \sum_c P(d \mid c) \sum_b P(c \mid b)\, p(b)$
      – This summation is exactly a variable elimination step; we call it sending a CPT P(b) to the next innermost summation
      – The sent CPT P(b) is called a belief, or message:
        $m_{AB}(b) = P(b) = \sum_a P(a) P(b \mid a) = \sum_a f(a, b)$
    • Slide 106: Forward Passing (3/6)
      – Rearranging and then summing again:
        $P(e) = \sum_d P(e \mid d) \sum_c P(d \mid c) \sum_b P(c \mid b)\, m_{AB}(b) = \sum_d P(e \mid d) \sum_c P(d \mid c)\, m_{BC}(c)$
      – where $m_{BC}(c) = \sum_b P(c \mid b)\, m_{AB}(b) = P(c)$
    • Slide 107: Forward Passing (4/6)
      – How do we compute P(X_n)?
        $P(x_n) = \sum_{x_{n-1}} P(x_n \mid x_{n-1}) \cdots \sum_{x_2} P(x_3 \mid x_2) \sum_{x_1} P(x_2 \mid x_1) P(x_1) = m_{n-1,n}(x_n)$
      – Actually, we recursively compute $m_{k-1,k}(x_k)$:
        $m_{k-1,k}(x_k) = P(x_k) = \sum_{x_{k-1}} P(x_k \mid x_{k-1}) P(x_{k-1}) = \sum_{x_{k-1}} P(x_k \mid x_{k-1})\, m_{k-2,k-1}(x_{k-1})$
      – $m_{k-1,k}(x_k)$ is called a belief, or message
    • Slide 108: Forward Passing (5/6)
      – $m_{k-1,k}(x_k) = \sum_{x_{k-1}} P(x_k \mid x_{k-1})\, m_{k-2,k-1}(x_{k-1})$: a pointwise product of the CPT table $P(x_k \mid x_{k-1})$ with the incoming message, summed over $x_{k-1}$
      – Advantage: after P(X_n), all P(X_k) are also obtained; computing the beliefs of all variables at once costs $O(N d^k)$
    • Slide 109: Forward Passing (6/6)
      – Base case: $m_{0,1}(x_1) = P(x_1)$, i.e. $m_{k-1,k}(x_k) = P(x_k)$ if $x_k$ has no parents
      – This is because the chain starts with a prior: in
        $P(e) = \sum_d P(e|d) \sum_c P(d|c) \sum_b P(c|b) \sum_a P(a) P(b|a)$, the innermost sum already uses P(a) directly, so $m_{AB}(b) = \sum_a P(a) P(b \mid a)$
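The forward recursion in code. A sketch on a binary chain with a shared transition table; the prior and transition numbers are illustrative assumptions.

```python
def forward_messages(prior, trans):
    """Forward pass on a chain X1 -> X2 -> ... -> Xn.

    prior: P(x1) as {value: prob}.
    trans: list of CPTs, each {(next_value, prev_value): prob}.
    Returns [m_{0,1}, m_{1,2}, ...]; msgs[k] equals P(X_{k+1})."""
    msgs = [prior]                                 # m_{0,1}(x1) = P(x1)
    for cpt in trans:
        prev = msgs[-1]
        msgs.append({xk: sum(cpt[(xk, xp)] * prev[xp] for xp in prev)
                     for xk in (True, False)})
    return msgs

prior = {True: 0.6, False: 0.4}                    # illustrative numbers
cpt = {(True, True): 0.7, (True, False): 0.3,
       (False, True): 0.3, (False, False): 0.7}
for m in forward_messages(prior, [cpt, cpt, cpt]):
    print(m)                                       # marginals of X1..X4
```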
    • Slide 110: Backward Passing (1/4)
      – Compute P(a) on the chain $A \to B \to C \to D \to E$:
        $P(a) = \sum_b \sum_c \sum_d \sum_e P(a) P(b \mid a) P(c \mid b) P(d \mid c) P(e \mid d)$
        $= P(a) \sum_b P(b \mid a) \sum_c P(c \mid b) \sum_d P(d \mid c) \sum_e P(e \mid d)$
        $= P(a) \sum_b P(b \mid a) \sum_c P(c \mid b) \sum_d P(d \mid c)\, f(d)$
      – Eliminating variable e, we get f(d), a belief/message sent from E to D:
        $m_{ED}(d) = f(d) = \sum_e P(e \mid d) \cdot 1 = 1$
    • Slide 111: Backward Passing (2/4)
      – $P(a) = P(a) \sum_b P(b \mid a) \sum_c P(c \mid b) \sum_d P(d \mid c)\, m_{ED}(d) = P(a) \sum_b P(b \mid a) \sum_c P(c \mid b)\, f(c)$
      – Eliminating d, we get f(c), a belief/message sent from D to C:
        $m_{DC}(c) = f(c) = \sum_d P(d \mid c)\, m_{ED}(d)$ ( = 1 )
    • Slide 112: Backward Passing (3/4)
      – $P(a) = P(a) \sum_b P(b \mid a) \sum_c P(c \mid b)\, m_{DC}(c)$
      – Eliminating c: $P(a) = P(a) \sum_b P(b \mid a)\, f(b) = P(a) \sum_b P(b \mid a)\, m_{CB}(b)$
      – Eliminating b: $P(a) = P(a)\, f(a) = P(a)\, m_{BA}(a)$, with $m_{BA}(a) = 1$
    • Slide 113: Backward Passing (4/4)
      – In general: $m_{k+1,k}(x_k) = \sum_{x_{k+1}} P(x_{k+1} \mid x_k)\, m_{k+2,k+1}(x_{k+1})$
      – Base case: $m_{n+1,n}(x_n) = 1$, i.e. $m_{k+1,k}(x_k) = 1$ if $x_k$ has no child
      – This is because the last variable only contributes $m_{ED}(d) = \sum_e P(e \mid d) = 1$
    • Slide 114: Comparison
      – Forward: $m_{k-1,k}(x_k) = \sum_{x_{k-1}} P(x_k \mid x_{k-1})\, m_{k-2,k-1}(x_{k-1})$, with $m_{k-1,k}(x_k) = P(x_k)$ if $x_k$ has no parents
      – Backward: $m_{k+1,k}(x_k) = \sum_{x_{k+1}} P(x_{k+1} \mid x_k)\, m_{k+2,k+1}(x_{k+1})$, with $m_{k+1,k}(x_k) = 1$ if $x_k$ has no child
    • Slide 115: Forward-Backward Passing (1/3)
      – P(X_k):
        $P(x_k) = \left(\sum_{x_{k+1}} P(x_{k+1} \mid x_k) \cdots \sum_{x_n} P(x_n \mid x_{n-1})\right)\left(\sum_{x_{k-1}} P(x_k \mid x_{k-1}) \cdots \sum_{x_1} P(x_2 \mid x_1) P(x_1)\right)$
        $= m_{k+1,k}(x_k)\, m_{k-1,k}(x_k)$
    • Slide 116: Forward-Backward Passing (2/3)
      – $P(x_k) = m_{k-1,k}(x_k)\, m_{k+1,k}(x_k)$, with
        $m_{k-1,k}(x_k) = \sum_{x_{k-1}} P(x_k \mid x_{k-1})\, m_{k-2,k-1}(x_{k-1})$ and
        $m_{k+1,k}(x_k) = \sum_{x_{k+1}} P(x_{k+1} \mid x_k)\, m_{k+2,k+1}(x_{k+1})$
    • Slide 117: Forward-Backward Passing (3/3)
      – P(X_n) as forward-backward passing: $m_{n+1,n}(x_n) = 1$, so $P(x_n) = m_{n-1,n}(x_n)\, m_{n+1,n}(x_n) = m_{n-1,n}(x_n)$
      – P(X_1) as forward-backward passing: $m_{0,1}(x_1) = P(x_1)$, so $P(x_1) = m_{0,1}(x_1)\, m_{2,1}(x_1)$
    • Slide 118: Exercise
      – Compute P(X_1, X_n) and P(X_j, X_k) on the chain $X_1 \to \cdots \to X_j \to \cdots \to X_k \to \cdots \to X_n$
    • Slide 119: 4.1.2 Posterior Query
      – P(X_n | x_1): forward passing (x_1 observed)
      – P(X_1 | x_n): backward passing (x_n observed)
      – P(X_k | e): forward-backward passing
    • Slide 120: Backward Passing with Evidence (1/6)
      – Chain $A \to B \to C \to D \to e$ (E observed); a query $P(A \mid e) \propto P(A, e)$: variable elimination in chains with evidence
        $P(A, e) = \sum_b \sum_c \sum_d P(A, b, c, d, e) = \sum_b \sum_c \sum_d P(A) P(b \mid A) P(c \mid b) P(d \mid c) P(e \mid d)$
        $= P(A) \sum_b P(b \mid A) \sum_c P(c \mid b) \sum_d P(d \mid c) P(e \mid d)$
    • Slide 121: Backward Passing with Evidence (2/6)
      – $P(A, e) = P(A) \sum_b P(b \mid A) \sum_c P(c \mid b) \sum_d P(d \mid c) P(e \mid d) = P(A) \sum_b P(b \mid A) \sum_c P(c \mid b)\, f(c, e)$
      – Eliminating d, we get a belief/message sent from D to C:
        $m_{DC}(c) = f(c, e) = \sum_d P(d \mid c) P(e \mid d) = \sum_d P(d \mid c)\, m_{ED}(d)$,
        with $m_{ED}(d) = P(e \mid d) \cdot 1 = P(e \mid d)\, m_{FE}$ (the boundary message $m_{FE} = 1$)
    • Slide 122: Backward Passing with Evidence (3/6)
      – Eliminating c:
        $P(A, e) = P(A) \sum_b P(b \mid A) \sum_c P(c \mid b)\, m_{DC}(c) = P(A) \sum_b P(b \mid A)\, f(b, e)$
      – $m_{CB}(b) = f(b, e) = \sum_c P(c \mid b)\, f(c, e) = \sum_c P(c \mid b)\, m_{DC}(c)$
    • Slide 123: Backward Passing with Evidence (4/6)
      – Finally, we eliminate b:
        $P(A, e) = P(A) \sum_b P(b \mid A)\, m_{CB}(b) = P(A)\, f(A, e)$
      – $m_{BA}(A) = f(A, e) = \sum_b P(b \mid A)\, f(b, e) = \sum_b P(b \mid A)\, m_{CB}(b)$
    • Slide 124: Backward Passing with Evidence (5/6)
      – Given X_n = x_n, how do we compute P(X_1, x_n) for P(X_1 | x_n)?
        $P(X_1 \mid x_n) \propto \left(\sum_{x_{n-1}} \cdots \sum_{x_2} \prod_{i = n, \dots, 2} P(x_i \mid x_{i-1})\right) P(x_1) = m_{0,1}(x_1)\, m_{2,1}(x_1)$
      – with $m_{k+1,k}(x_k) = \sum_{x_{k+1}} P(x_{k+1} \mid x_k)\, m_{k+2,k+1}(x_{k+1})$ and $m_{0,1}(x_1) = P(x_1)$
    • Slide 125: Backward Passing with Evidence (6/6)
      – $P(a, e) = P(a) \sum_b P(b \mid a) \sum_c P(c \mid b) \sum_d P(d \mid c) P(e \mid d) = P(a) \sum_b P(b \mid a) \sum_c P(c \mid b)\, m_{DC}(c)$
      – $m_{DC}(c) = \sum_d P(d \mid c) P(e \mid d) = \sum_d P(d \mid c)\, m_{ED}(d)$
    • Slide 126: 4.1.3 Short Summary
      – Messages can be recursively computed:
        $m_{k-1,k}(x_k) = \sum_{x_{k-1}} P(x_k \mid x_{k-1})\, m_{k-2,k-1}(x_{k-1})$ (forward)
        $m_{k+1,k}(x_k) = \sum_{x_{k+1}} P(x_{k+1} \mid x_k)\, m_{k+2,k+1}(x_{k+1})$ (backward)
    • Slide 127: Belief of Any Node
      – Without evidence: $P(x_k) = m_{k-1,k}(x_k)\, m_{k+1,k}(x_k)$
      – With evidence: $P(x_k \mid e) \propto m_{k-1,k}(x_k)\, m_{k+1,k}(x_k)$
      – with the forward and backward recursions as above
    • Slide 128: Three Special Cases of Message
      – Node $x_k$ with evidence: $m_{k-1,k}(x_k) = 1$ and $m_{k+1,k}(x_k) = 1$; the outgoing messages are $m_{k,k+1}(x_{k+1}) = P(x_{k+1} \mid x_k)$ and $m_{k,k-1}(x_{k-1}) = P(x_k \mid x_{k-1})$
      – Node $x_k$ without parents: $m_{k-1,k}(x_k) = P(x_k)$
      – Node $x_k$ without a child: $m_{k+1,k}(x_k) = 1$
    • Slide 129: 4.2 Message Passing in Trees
      – Markov chain: $P(x_k) = m_{k-1,k}(x_k)\, m_{k+1,k}(x_k)$
      – Markov tree: a node $X_k$ may have several neighbors, e.g.
        $P(x_k) = m_{k-1,k}(x_k)\, m_{k+1,k}(x_k)\, m_{m,k}(x_k) = \prod_{j \in N(x_k)} m_{j,k}(x_k)$
    • Bayesian Networks Unit - Exact Inference in BN p. 130 Two Examples Simple tree General tree x1 x2 x3 x4 x5 Fu Jen University Department of Electrical Engineering Wang, Yuan-Kai Copyright
    • Bayesian Networks Unit - Exact Inference in BN p. 131 Message Passing in Simple Tree (1/3) x1 m12  x 2  x2 x3 x4 x5 Fu Jen University Department of Electrical Engineering Wang, Yuan-Kai Copyright
    • Bayesian Networks Unit - Exact Inference in BN p. 132 Message Passing in Simple Tree (2/3) x1 m12  x2  x2 m23 x3  x3 m43 x3  m35  x5  x4 x5 m3,5 ( x5 )   P ( x5 | x3 )m2,3 ( x3 )m4,3 ( x3 ) x3 Fu Jen University Department of Electrical Engineering Wang, Yuan-Kai Copyright
    • Bayesian Networks Unit - Exact Inference in BN p. 133 Message Passing in Simple Tree (3/3) x1 m12  x2  P( x3 )  m23 ( x3 )m43 ( x3 )m53 ( x3 )   mi, j ( x j ) x2  P( x j )  m23  x3  { xi | xi Neighbor ( x j )} x3 m43 x3  m35  x5  m53  x5  m3,5 ( x5 )   P ( x5 | x3 )m2,3 ( x3 )m4,3 ( x3 ) x4 x5 x3    m j , k ( xk )   P ( xk | x j )  mi , j ( x j ) xj { xi | xi Neighbor ( x j ), xi  xk } Fu Jen University Department of Electrical Engineering Wang, Yuan-Kai Copyright
Message Passing in HMM (1/3) (p. 134)
• Filtering (forward algorithm): compute P(X3 | y1:3) on the HMM X1 → X2 → X3 with observations Y1, Y2, Y3

Message Passing in HMM (2/3) (p. 135)
• Smoothing: P(X1 | y1:3) and P(X2 | y1:3) require the backward algorithm as well
Message Passing without Evidence (p. 136)
• Belief(x_j) = P(x_j) = Π_{x_i ∈ N(x_j)} m_{i,j}(x_j)
• m_{j,k}(x_k) = Σ_{x_j} P(x_k | x_j) Π_{x_i ∈ N(x_j), x_i ≠ x_k} m_{i,j}(x_j)

Message Passing with Evidence (1/2) (p. 137)
• Given a set of evidence e = e⁺ ∪ e⁻
• A node x splits the polytree into two disjoint parts, so by conditional independence
  P(e⁺, e⁻ | x) = P(e⁺ | x) P(e⁻ | x)

Message Passing with Evidence (2/2) (p. 138)
• The belief Belief(x) of a node x is
  Belief(x) = P(x | e) = P(x | e⁺, e⁻) ∝ P(e⁺, e⁻ | x) P(x)
            = P(e⁻ | x) P(e⁺ | x) P(x) ∝ P(e⁻ | x) P(x | e⁺) = λ(x) π(x)

Factorization (p. 139)
• Probabilistic models with no loop are factorizable; for a star over x with leaves a, b, c, d
  (a numeric check follows below):
  Σ_a Σ_b Σ_c Σ_d P(a,x) P(b,x) P(c,x) P(d,x)
  = (Σ_a P(a,x)) (Σ_b P(b,x)) (Σ_c P(c,x)) (Σ_d P(d,x))
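A quick numeric check of this distributive-law identity; the random 2x3 tables `fa` through `fd` are illustrative stand-ins for the pairwise factors.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical pairwise factors P(a,x), P(b,x), P(c,x), P(d,x) as 2x3 tables.
fa, fb, fc, fd = (rng.random((2, 3)) for _ in range(4))

x = 1  # fix the hub variable x
lhs = sum(fa[a, x] * fb[b, x] * fc[c, x] * fd[d, x]
          for a in range(2) for b in range(2)
          for c in range(2) for d in range(2))
rhs = fa[:, x].sum() * fb[:, x].sum() * fc[:, x].sum() * fd[:, x].sum()
assert np.isclose(lhs, rhs)  # 16 product terms collapse into 4 small sums
```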
Marginal to MAP: MAX Product (p. 140)
• The same simple tree, now with messages passed in both directions
  (m_{1,2}, m_{2,3}, m_{4,3}, m_{3,5} forward; m_{5,3}, m_{3,4}, m_{3,2}, m_{2,1} backward);
  replacing sum with max turns marginalization into MAP

Sum-Product vs. Max-Product (p. 141)
• Sum-product computes marginals using this rule:
  m_{j,k}(x_k) = Σ_{x_j} P(x_k | x_j) Π_{x_i ∈ N(x_j), x_i ≠ x_k} m_{i,j}(x_j)
• Max-product computes max-marginals using this rule:
  m_{j,k}(x_k) = max_{x_j} P(x_k | x_j) Π_{x_i ∈ N(x_j), x_i ≠ x_k} m_{i,j}(x_j)
• Same algorithm on different semirings: (+, ×, 0, 1) and (max, ×, 0, 1); see the sketch below
  [Shafer90, Bistarelli97, Goodman99, Aji00]
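The semiring view means the two rules can share code, with only the reduction operator swapped. A sketch with a hypothetical CPT and incoming-message product:

```python
import numpy as np

cpt = np.array([[0.7, 0.3],      # hypothetical P(x_k | x_j); rows index x_j
                [0.2, 0.8]])
m_in = np.array([0.6, 0.4])      # product of messages already arrived at x_j

weighted = cpt * m_in[:, None]   # P(x_k | x_j) times the incoming product
m_sum = weighted.sum(axis=0)     # sum-product: marginal message
m_max = weighted.max(axis=0)     # max-product: max-marginal message
print(m_sum, m_max)
```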
4.3 Pearl's BP Algorithm (p. 142)  [Pearl88, Shafer90, Yedidia01, etc.]
• The forward-backward algorithm generalizes to any tree-like graph (one with no loops)
• For now, we assume pairwise potentials

Basic Idea (p. 143)
• Two passes: Collect Evidence (toward the root) and Distribute Evidence (from the root)
  [Figure from P. Green]

Collect Evidence: Absorb Messages (p. 144)
• Node x_j absorbs the messages m_{i_1,j}(x_j), ..., m_{i_m,j}(x_j) from all its neighbors:
  Belief(x_j) = P(x_j) = Π_{x_i ∈ N(x_j)} m_{i,j}(x_j)

Distribute Evidence: Send Messages (p. 145)
• Node x_j sends a message to each neighbor x_k:
  m_{j,k}(x_k) = Σ_{x_j} P(x_k | x_j) Π_{x_i ∈ N(x_j), x_i ≠ x_k} m_{i,j}(x_j)

Initialization (p. 146)
• For nodes with evidence e (a sketch follows below)
  – λ(x_i) = 1 wherever x_i = e_i; 0 otherwise
  – π(x_i) = 1 wherever x_i = e_i; 0 otherwise
• For nodes without parents
  – π(x_i) = p(x_i), the prior probabilities
• For nodes without children
  – λ(x_i) = 1 uniformly (normalize at the end)
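A minimal sketch of this initialization for a single discrete node; `init_node` is a hypothetical helper, not Pearl's notation.

```python
import numpy as np

def init_node(n_states, prior=None, observed=None):
    """Initialize Pearl's lambda/pi vectors for one node (sketch)."""
    lam = np.ones(n_states)               # default: childless-style lambda = 1
    pi = np.ones(n_states)
    if observed is not None:              # evidence node: indicator vectors
        lam = np.zeros(n_states)
        lam[observed] = 1.0
        pi = lam.copy()
    elif prior is not None:               # parentless node: pi is the prior
        pi = np.asarray(prior, dtype=float)
    return lam, pi

print(init_node(4, observed=2))           # lambda = pi = (0, 0, 1, 0)
```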
Centralized Protocol (p. 147)
• Collect to the root (post-order), then distribute from the root (pre-order)
• Computes all N marginals in 2 passes over the graph

Distributed Protocol (p. 148)
• Collect and distribute without a designated root
• Computes all N marginals in O(N) parallel updates

Propagation Example in a Tree (p. 149)
• Collect, then distribute; the example requires five time periods to reach equilibrium (Pearl, 1988, p. 174)

Properties of BP (p. 150)
• Exact inference for polytrees
  – Each node separates the polytree into 2 disjoint components
• On a polytree, BP converges in time linear in the number of nodes
  – Work done in a node is proportional to the size of its CPT
  – Hence BP is linear in the number of network parameters
• For general graphs
  – Exact inference is NP-hard
  – Approximate inference is NP-hard
4.4 Message Passing in General Graphs (p. 151)
• Belief propagation is only guaranteed to be correct for polytrees (trees)
• Most probabilistic graphs are not polytrees and have many loops
• In a loopy graph the sums no longer distribute over the products, so the joint P(X1, ..., Xn) cannot be decomposed into a loop-free sum-product of the P(Xi | Pa(Xi))

Loopy Belief Propagation (p. 152)
• Applying BP to graphs with loops (cycles) can give the wrong answer, because it overcounts evidence (e.g., the Cloudy, Sprinkler, Rain, WetGrass network)
• In practice it often works well (e.g., error-correcting codes)

Factorization in Loopy Networks (p. 153)
• Models with no loop are tractable because they factorize:
  Σ_a Σ_b Σ_c Σ_d P(a,x) P(b,x) P(c,x) P(d,x)
  = (Σ_a P(a,x)) (Σ_b P(b,x)) (Σ_c P(c,x)) (Σ_d P(d,x))
• Models with a loop are not:
  Σ_a Σ_b Σ_c Σ_d P(a,b,c,d,x) does not factorize

Two Methods (p. 154)
• Loopy belief propagation: approximate inference
• Clustering (join tree, junction tree): combine multiple nodes into a hypernode, transform the loopy graph into a polytree, then perform belief propagation

Loopy Belief Propagation (p. 155)
• If BP is used on graphs with loops, messages may circulate indefinitely
• Empirically, a good approximation is still achievable (see the sketch below):
  – Stop after a fixed number of iterations
  – Stop when there is no significant change in beliefs
• If the solution converges rather than oscillating, it is usually a good approximation
• Example: turbo codes
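The stopping rules above amount to a small driver loop. A sketch, where `update_all` stands in for whatever message-update routine the graph uses; both names are hypothetical.

```python
import numpy as np

def run_loopy_bp(update_all, beliefs, max_iters=50, tol=1e-6):
    """Iterate message updates until beliefs stop changing (sketch).

    update_all: hypothetical routine that recomputes all messages and
    returns new beliefs as {node: np.ndarray}."""
    for _ in range(max_iters):                     # stop after a fixed count
        new_beliefs = update_all(beliefs)
        delta = max(np.abs(new_beliefs[v] - beliefs[v]).max()
                    for v in beliefs)
        beliefs = new_beliefs
        if delta < tol:                            # no significant change
            break
    return beliefs
```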
Clustering (p. 156)
• The general graph is converted to a junction tree by clustering nodes
• Message passing in the general graph = message passing in the junction tree

5. Junction Tree (p. 157)
• Also known as the clustering algorithm or the join tree algorithm
• Sub-sections
  – 5.1 Junction tree algorithm
  – 5.2 Example: create a junction tree

Basic Idea (p. 158)
• Join individual nodes to form cluster nodes
• The resulting network becomes a polytree: singly connected and undirected
• Inference is performed in the polytree
• Reduces the cost of inferring all variables to O(n), where n is the size of the modified network (the polytree)

An Example (1/2) (p. 159)
• A multiply connected network

An Example (2/2) (p. 160)
• A polytree obtained by combining Sprinkler and Rain

Why Junction Tree (p. 161)
• More efficient than variable elimination for inferring all variables in some PGMs (multiply connected networks)
• Avoids cycles by turning highly interconnected subsets of nodes into "hypernodes" (clusters)

5.1 Junction Tree Algorithm (p. 162)
Step 1: Graph transformation (Steps 1 and 2 are performed only once)
  (a) Moralize
  (b) Triangulate
  (c) Identify cliques
  (d) Build the junction tree
Step 2: Initialization (of values)
  (a) Set up potentials
  (b) Propagate potentials
Step 3: Update beliefs
  (a) Insert evidence into the junction tree
  (b) Propagate potentials
Step 1: Graph Transformation (p. 163)
• DAG → 1(a) moralize → moral graph → 1(b) triangulate → triangulated graph → 1(c) identify cliques → hypernodes of cliques → 1(d) build junction tree → junction tree

Step 1(a): Moralize (1/2) (p. 164)
• Add undirected edges between all co-parents which are not currently joined ("marrying" the parents)

Step 1(a): Moralize (2/2) (p. 165)
• Drop the directions of the arcs: directed graph → undirected moral graph

Step 1(b): Triangulate (p. 166)
• An undirected graph is triangulated iff every cycle of length > 3 contains a chord, i.e. an edge connecting two nonadjacent nodes of the cycle (a sketch of both steps follows below)
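Both transformations are short to write down. A sketch over a DAG given as a parents dict; the function names and representation are made up for illustration, and triangulation here uses a supplied elimination order rather than searching for an optimal one.

```python
def moralize(parents):
    """Moral graph of a DAG given as {node: set_of_parents};
    every node must appear as a key (sketch)."""
    adj = {v: set() for v in parents}
    for v, ps in parents.items():
        for p in ps:
            adj[v].add(p)
            adj[p].add(v)                 # keep each arc, drop its direction
        for p in ps:                      # marry co-parents pairwise
            adj[p] |= ps - {p}
    return adj

def triangulate(adj, order):
    """Add fill-in edges along an elimination order (sketch)."""
    work = {v: set(ns) for v, ns in adj.items()}   # working copy, shrinks
    out = {v: set(ns) for v, ns in adj.items()}    # accumulates fill-ins
    for v in order:
        nbrs = list(work[v])
        for i, u in enumerate(nbrs):      # connect v's remaining neighbors
            for w in nbrs[i + 1:]:
                work[u].add(w); work[w].add(u)
                out[u].add(w); out[w].add(u)
        del work[v]                       # eliminate v
        for u in work:
            work[u].discard(v)
    return out
```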
Step 1(c): Identify Cliques (p. 167)
• A clique is a subgraph of an undirected graph that is complete and maximal
• In the example the cliques are EGH, CEG, DEF, ACE, ABD, ADE

Properties of Junction Tree (p. 168)
• Each node is a cluster (nonempty set) of variables
• Running intersection property: given two clusters X and Y, all clusters on the path between X and Y contain X ∩ Y
• Separator sets (sepsets): the intersection of adjacent clusters
  e.g. clusters ABD and ADE have sepset AD; clusters ADE and DEF have sepset DE

Step 1(d): Build Junction Tree (1/2) (p. 169)
• A junction tree is a clique graph that
  – is an undirected tree
  – contains all the cliques
  – satisfies the running intersection property
• Example tree: ABD -AD- ADE -AE- ACE -CE- CEG -EG- EGH, with ADE -DE- DEF

Step 1(d): Build Junction Tree (2/2) (p. 170)
• In the junction tree, cliques become vertices and sepsets label the edges
  e.g. ceg ∩ egh = eg
• Example: abd -(ad)- ade -(ae)- ace -(ce)- ceg -(eg)- egh, and ade -(de)- def
Junction Tree Algorithm (p. 171)
• (Recap) Step 1: graph transformation, (a) moralize, (b) triangulate, (c) identify cliques, (d) build junction tree; next comes Step 2: initialization, (a) set up potentials, (b) propagate potentials; then Step 3: update beliefs, (a) insert evidence, (b) propagate potentials

Step 2: Initialization (p. 172)
• Junction tree (from Step 1) → 2(a) set up potentials → inconsistent junction tree → 2(b) propagate potentials → consistent junction tree → marginalization yields P(V = v | E = e)

Potentials (p. 173)
• DEFINITION: a potential φ_A over a set of variables X_A is a function that maps each instantiation x_A to a non-negative real number, denoted φ_A(x_A)
• Ex: a potential φ_abc over the vertices {a, b, c}, where X_a has four states and X_b, X_c have three states each
• A joint probability is a special case of a potential, one where Σ_{x_A} φ_A(x_A) = 1

Decomposable Distribution (p. 174)
• DEFINITION: a probability distribution is decomposable with respect to a graph G = (V, E) if G is triangulated and, for any clusters A and B with separator C, X_A ⊥ X_B | X_C
• Steps 1 and 2 of the junction tree algorithm guarantee this property

Factorization of Potentials (p. 175)
• THEOREM: a decomposable probability distribution P(X_V) on the graph G = (V, E) can be written as the product of the clique potentials divided by the product of the sepset potentials:
  P(X_V) = Π_{cliques C} φ_C / Π_{sepsets S} φ_S
Step 2(a): Set Up Potentials (p. 176)
1. For each cluster C and sepset S: φ_C ← 1, φ_S ← 1
2. For each vertex u in the BN, select a parent cluster C such that C ⊇ {u} ∪ pa(u), and include the conditional probability: φ_C ← φ_C · P(X_u | X_pa(u))
"PROOF": Π_{cliques C} φ_C / Π_{sepsets S} φ_S = Π_{vertices u} P(X_u | X_pa(u)) / 1 = P(X_V)

Inconsistent Potentials (p. 177)
• At this point the potentials in the junction tree are not consistent with each other: using marginalization to get the distribution of a variable X_u gives different results depending on which clique we use, e.g.
  P(X_a) = Σ_{c,e} φ_ace = (0.12, 0.33, 0.11, 0.03)
  P(X_a) = Σ_{d,e} φ_ade = (0.02, 0.43, 0.31, 0.12)
• The potentials might not even sum to one, i.e. they are not joint probability distributions

Step 2(b): Propagate Potentials (p. 178)
• Message passing from clique A to clique B (a sketch follows below)
  1. Projection: project the potential of A onto the sepset S_AB
  2. Absorption: absorb the potential of S_AB into B
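In code the two operations are a marginalization and a rescaling. A sketch for cliques A = {x, y} and B = {y, z} with separator S = {y}; the shapes and random values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
phi_A = rng.random((2, 3))        # clique potential phi_A[x, y]
phi_B = rng.random((3, 4))        # clique potential phi_B[y, z]
phi_S = np.ones(3)                # separator potential over y

# 1. Projection: marginalize A's potential onto the separator.
phi_S_new = phi_A.sum(axis=0)     # sum out x

# 2. Absorption: scale B by the ratio of new to old separator potential.
phi_B *= (phi_S_new / phi_S)[:, None]
phi_S = phi_S_new
```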
Global Propagation (p. 179)
1. COLLECT-EVIDENCE: messages 1-5, toward the root clique (start at abd)
2. DISTRIBUTE-EVIDENCE: messages 6-10, away from the root
(both procedures are recursive)

A Priori Distribution (p. 180)
• Global propagation → the potentials are consistent → marginalization gives the probability distribution of each variable

Short Summary (p. 181)
• (Recap of Steps 1 and 2 on the example network a-h)
1. For each cluster C and sepset S: φ_C ← 1, φ_S ← 1
2. For each vertex u in the BN, select a parent cluster C such that C ⊇ fa(u), the family {u} ∪ pa(u), and include the conditional probability: φ_C ← φ_C · P(X_u | X_pa(u))
3. Collect (messages 1-5) and distribute (messages 6-10) over the junction tree
Junction Tree Algorithm (p. 182)
• (Recap) Steps 1 and 2 are done; what remains is Step 3: update beliefs, (a) insert evidence into the junction tree, (b) propagate potentials

Step 3(a): Insert Evidence into the JT (p. 183)
• Evidence is new information about a random variable that changes our belief about its distribution
  Ex: before receiving evidence, P(X_u) = (0.14, 0.43, 0.31, 0.12)
• Hard evidence: the random variable is instantiated (observed)
  X_u = x_u → P(X_u) := (0, 0, 1, 0)
• Soft evidence: everything else
  X_u < x_1 → P(X_u) := (0.5, 0.5, 0, 0)

Hard Evidence as a Likelihood (p. 184)
• If we observe the variable X_u to be x_u, the likelihood function becomes
  λ_Xu(x_u) = 1 when x_u is the observed value; 0 otherwise
  Ex: X_u ∈ {0, 1, 2, 3} and we observe X_u = 2 → λ_Xu = (0, 0, 1, 0)
• For all unobserved variables X_v we make the likelihood function constant:
  λ_Xv(x_v) = 1/n for all x_v, where n is the number of states of X_v
  Ex: unobserved X_v ∈ {0, 1, 2} → λ_Xv = (0.33, 0.33, 0.33)
• Modify initialization step 2(a) to include this

Entering Observations (p. 185)
1. For each observation X_u = x_u (see the sketch below):
  (a) Encode the observation as a likelihood λ_Xu
  (b) Identify one clique C that contains u and update φ_C ← φ_C · λ_Xu
Step 3(b): Propagate potentials. To make the potentials in the junction tree consistent again, perform a global propagation
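Entering one observation is a pointwise multiplication of a clique table by an indicator. A sketch; the helper name and toy table are made up.

```python
import numpy as np

def enter_hard_evidence(phi_C, axis, observed_state):
    """Multiply clique potential phi_C by the likelihood of X_u = observed_state,
    where X_u lives on the given axis of phi_C (sketch)."""
    n = phi_C.shape[axis]
    lik = np.zeros(n)
    lik[observed_state] = 1.0                  # indicator likelihood
    shape = [1] * phi_C.ndim
    shape[axis] = n
    return phi_C * lik.reshape(shape)          # zero out inconsistent entries

phi = np.full((2, 3), 1 / 6)                   # toy clique over (X_a, X_u)
print(enter_hard_evidence(phi, axis=1, observed_state=2))
```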
Short Summary (p. 186)
• [Figure: four panels on the example network a-h, showing the observations X_d = x_d and X_g = x_g being encoded as likelihoods, entered into cliques, and propagated]
5.2 Example: Create Junction Tree (p. 187)
• HMM with 2 time steps: X1 → X2, with observations Y1 and Y2
• Junction tree: (X1,Y1) -X1- (X1,X2) -X2- (X2,Y2)

Initialization (p. 188)
• Each variable is assigned to a cluster, whose potential accumulates its CPT:
  Variable | Associated cluster | Potential function
  X1 | X1,Y1 | φ_{X1,Y1} = P(X1)
  Y1 | X1,Y1 | φ_{X1,Y1} = P(X1) P(Y1|X1)
  X2 | X1,X2 | φ_{X1,X2} = P(X2|X1)
  Y2 | X2,Y2 | φ_{X2,Y2} = P(Y2|X2)

Collect Evidence (1/2) (p. 189)
• Choose an arbitrary clique, e.g. (X1,X2), where all potential functions will be collected
• Call neighboring cliques recursively for messages
• 1. Call (X1,Y1):
  1. Projection: φ_{X1} = Σ_{Y1} φ_{X1,Y1} = Σ_{Y1} P(X1,Y1) = P(X1)
  2. Absorption: φ_{X1,X2} ← φ_{X1,X2}^old · φ_{X1} = P(X2|X1) P(X1) = P(X1,X2)

Collect Evidence (2/2) (p. 190)
• 2. Call (X2,Y2):
  1. Projection: φ_{X2} = Σ_{Y2} φ_{X2,Y2} = Σ_{Y2} P(Y2|X2) = 1
  2. Absorption: φ_{X1,X2} ← φ_{X1,X2}^old · φ_{X2} = P(X1,X2)

Distribute Evidence (1/2) (p. 191)
• Pass messages recursively to neighboring nodes
• Pass a message from (X1,X2) to (X1,Y1):
  1. Projection: φ_{X1} = Σ_{X2} φ_{X1,X2} = Σ_{X2} P(X1,X2) = P(X1)
  2. Absorption: φ_{X1,Y1} ← φ_{X1,Y1}^old · φ_{X1}^new / φ_{X1}^old = P(X1,Y1) · P(X1)/P(X1) = P(X1,Y1)

Distribute Evidence (2/2) (p. 192)
• Pass a message from (X1,X2) to (X2,Y2) (a runnable sketch of the whole round follows below):
  1. Projection: φ_{X2} = Σ_{X1} φ_{X1,X2} = Σ_{X1} P(X1,X2) = P(X2)
  2. Absorption: φ_{X2,Y2} ← φ_{X2,Y2}^old · φ_{X2}^new / φ_{X2}^old = P(Y2|X2) · P(X2)/1 = P(Y2,X2)
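The whole collect/distribute round on this two-slice junction tree fits in a few lines of NumPy. A sketch with made-up CPTs; it reproduces the quantities derived above.

```python
import numpy as np

# Hypothetical binary HMM with two slices.
pX1 = np.array([0.5, 0.5])                 # P(X1)
A = np.array([[0.9, 0.1], [0.3, 0.7]])     # P(X2 | X1); rows index X1
B = np.array([[0.8, 0.2], [0.1, 0.9]])     # P(Y | X); rows index X

# Step 2(a): initial clique potentials.
phi_x1y1 = pX1[:, None] * B                # P(X1) P(Y1 | X1)
phi_x1x2 = A.copy()                        # P(X2 | X1)
phi_x2y2 = B.copy()                        # P(Y2 | X2)

# Collect to the root (X1,X2): project each leaf onto its separator, absorb.
phi_x1x2 *= phi_x1y1.sum(axis=1)[:, None]  # separator X1 carries P(X1)
phi_x1x2 *= phi_x2y2.sum(axis=1)[None, :]  # separator X2 carries 1
assert np.isclose(phi_x1x2.sum(), 1.0)     # root now holds P(X1, X2)

# Distribute: e.g. send P(X2) back to (X2,Y2).
phi_x2y2 *= phi_x1x2.sum(axis=0)[:, None]  # now P(X2, Y2)
print("P(X2) =", phi_x1x2.sum(axis=0))
```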
Inference with Evidence (1/2) (p. 193)
• Assume we want to compute P(X2 | Y1=0, Y2=1) (state estimation)
• Assign likelihoods to the potential functions during initialization:
  φ_{X1,Y1} = 0 if Y1 = 1; P(X1, Y1=0) if Y1 = 0
  φ_{X2,Y2} = 0 if Y2 = 0; P(Y2=1 | X2) if Y2 = 1

Inference with Evidence (2/2) (p. 194)
• Repeating the same steps as in the previous case, we obtain:
  φ_{X1,Y1} = 0 if Y1 = 1; P(X1, Y1=0, Y2=1) if Y1 = 0
  φ_{X1} = P(X1, Y1=0, Y2=1)
  φ_{X1,X2} = P(X1, Y1=0, X2, Y2=1)
  φ_{X2} = P(Y1=0, X2, Y2=1)
  φ_{X2,Y2} = 0 if Y2 = 0; P(Y1=0, X2, Y2=1) if Y2 = 1
An Example (p. 195)
• TBU
An Example (p. 196)
• To perform exact inference in an arbitrary graph, convert it to a junction tree, then perform belief propagation
• A jtree is a tree whose nodes are sets and which has the jtree property: all sets containing any given variable form a connected subgraph (a variable cannot appear in 2 disjoint places)
• Water-sprinkler network over C, S, R, W: moralize (marry S and R), then make the jtree CSR -SR- SRW
  Maximal cliques = { {C,S,R}, {S,R,W} }
  Separators = { {C,S,R} ∩ {S,R,W} = {S,R} }

Making a Junction Tree (p. 197)  [Jensen94]
• G → moralize → G^M → triangulate (order f, d, e, c, b, a) → G^T
• Find the maximal cliques: {a,b,c}, {b,c,e}, {b,e,f}, {b,d}
• Build the jgraph with edge weights W_ij = |C_i ∩ C_j|, then take a max spanning tree to obtain the jtree (see the sketch below)
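One way to realize the weighted-jgraph-plus-spanning-tree step, sketched with Kruskal's algorithm and a simple union-find; the function name and clique list are illustrative.

```python
from itertools import combinations

def build_jtree(cliques):
    """Max-weight spanning tree over cliques, w(i,j) = |Ci & Cj| (sketch)."""
    edges = sorted(
        ((len(ci & cj), i, j)
         for (i, ci), (j, cj) in combinations(enumerate(cliques), 2)),
        reverse=True)                              # heaviest sepsets first
    parent = list(range(len(cliques)))
    def find(v):                                   # union-find with halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    tree = []
    for w, i, j in edges:                          # Kruskal's algorithm
        if w > 0 and find(i) != find(j):
            parent[find(i)] = find(j)
            tree.append((i, j, cliques[i] & cliques[j]))  # edge plus sepset
    return tree

cliques = [{'a','b','c'}, {'b','c','e'}, {'b','e','f'}, {'b','d'}]
print(build_jtree(cliques))
```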
Clique Potentials (p. 198)
• Each model clique potential gets assigned to one jtree clique potential
• Each observed variable assigns a delta function to one jtree clique potential:
  if we observe W = w*, set E(w) = δ(w, w*); else E(w) = 1
• Square nodes in the figure are factors

Separator Potentials (p. 199)
• Separator potentials enforce consistency between neighboring cliques on their common variables
• Square nodes in the figure are factors
BP on a Jtree (p. 200)
• A jtree is an MRF with pairwise potentials
• Each (clique) node potential contains CPDs and local evidence
• Each edge potential acts like a projection function
• We do a forwards (collect) pass, then a backwards (distribute) pass
• The result is the Hugin / Shafer-Shenoy algorithm

BP on a Jtree (collect) (p. 201)
• Initial clique potentials contain CPDs and evidence

BP on a Jtree (collect) (p. 202)
• The message from a clique to a separator marginalizes the belief (projects onto the intersection) [remove c]

BP on a Jtree (collect) (p. 203)
• Separator potentials get the marginal belief from their parent clique

BP on a Jtree (collect) (p. 204)
• The message from a separator to a clique expands the marginal [add w]

BP on a Jtree (collect) (p. 205)
• The root clique has now seen all the evidence
BP on a Jtree (distribute) (p. 206)
• The distribute pass runs back down the tree CSR -SR- SRW

BP on a Jtree (distribute) (p. 207)
• Marginalize out w and exclude the old evidence (e_c, e_r)

BP on a Jtree (distribute) (p. 208)
• Combine upstream and downstream evidence

BP on a Jtree (distribute) (p. 209)
• Add c and exclude the old evidence (e_c, e_r)

BP on a Jtree (distribute) (p. 210)
• Combine upstream and downstream evidence
Partial Beliefs (p. 211)
• The "beliefs"/messages at intermediate stages (before both passes finish) may not be meaningful, because a given clique may not yet have "seen" all the model potentials/evidence (and hence may not be normalizable)
• This can cause problems when messages may fail (e.g., sensor networks)
• One must reparameterize using the decomposable model to ensure meaningful partial beliefs [Paskin04]
• (Figure: evidence on R is now added at one of the cliques)
6. Summary (p. 212)
• Variable elimination
  – Good for understanding the sum-product computation
  – Not good for computing the beliefs of many nodes
• Belief propagation
  – Good for computing the beliefs of many nodes on a polytree
  – Not good for general graphs
• Junction tree
  – Good for computing the beliefs of many nodes on general graphs

Three Methods Are Closely Related (p. 213)
• Variable elimination provides the basic ideas behind BP and the junction tree:
  – Belief/message ↔ factor
  – Propagation/passing ↔ elimination
  – Clustering ↔ elimination into factors
• The junction tree is the converging algorithm
• Message passing provides the unified formula
7. Implementation (p. 214)
• Algorithms available in two toolkits:

  Algorithm            | PNL                 | GeNIe
  Enumeration          | v (Naïve)           |
  Variable Elimination |                     |
  Belief Propagation   | v (Pearl)           | v (Polytree)
  Junction Tree        | v                   | v (Clustering)
  Direct Sampling      |                     | v (Logic)
  Likelihood Sampling  | v (LWSampling)      | v (Likelihood sampling)
  MCMC Sampling        | v (GibbsWithAnneal) | (other 5 samplings)
8. References (p. 215)
• S. M. Aji and R. J. McEliece, "The generalized distributive law," IEEE Trans. on Information Theory, vol. 46, no. 2, 2000.
• F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, "Factor graphs and the sum-product algorithm," IEEE Trans. on Information Theory, vol. 47, no. 2, 2001.

Recent Books (p. 216)
• R. E. Neapolitan, Learning Bayesian Networks, Prentice Hall, 2004.
• C. Borgelt and R. Kruse, Graphical Models: Methods for Data Analysis and Mining, Wiley, 2002.
• D. Edwards, Introduction to Graphical Modelling, 2nd ed., Springer, 2000.
• S. L. Lauritzen, Graphical Models, Oxford, 1996.
• M. I. Jordan (ed.), Learning in Graphical Models, MIT Press, 2001.

Probabilistic Inference Using Bayesian Networks (p. 217)
• Introductory article: M. Henrion, "An introduction to algorithms for inference in belief nets," in M. Henrion, R. Shachter, L. Kanal, and J. Lemmer (eds.), Uncertainty in Artificial Intelligence 5, Amsterdam: North Holland, 1990.
• Textbook with the HUGIN system: F. Jensen, An Introduction to Bayesian Networks, New York: Springer-Verlag, 1996.
• R. Neal, "Connectionist learning of belief networks," Artificial Intelligence, 56:71-113, 1991.

General Probabilistic Inference (p. 218)
• J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, 1988.
• E. Castillo, J. M. Gutierrez, and A. S. Hadi, Expert Systems and Probabilistic Network Models, Springer, 1997.
• R. Neapolitan, Probabilistic Reasoning in Expert Systems: Theory and Algorithms, New York: John Wiley & Sons, 1990.
• A special issue on "Uncertainty in AI" of the Communications of the ACM, vol. 38, no. 3, March 1995.
• G. Shafer and J. Pearl (eds.), Readings in Uncertain Reasoning, San Francisco: Morgan Kaufmann, 1990.