Bayesian network structure estimation based on the Bayesian/MDL criteria when both discrete and continuous variables are present

J. Suzuki. "Bayesian network structure estimation based on the Bayesian/MDL criteria when both discrete and continuous variables are present." IEEE Data Compression Conference, pp. 307-316, Snowbird, Utah, April 2012.

Published in: Science

  1. Bayesian Network Structure Estimation Based on the Bayesian/MDL Criteria When Both Discrete and Continuous Variables are Present
     Joe Suzuki (Osaka University), April 11, 2012
  2. Road Map
     1. Problem
     2. Density Estimation
     3. Density Estimation in a General Sense
     4. Structure Estimation in a General Sense
     5. Summary
  3. Problem: Bayesian Network Structure
     $X, Y, Z$: random variables ordered as $X < Y < Z$.
     [Figure: candidate network structures over the nodes $X$, $Y$, $Z$]
  6. Problem: Structure Estimation
     $X, Y, Z$: random variables over sets $A, B, C$.
     $\{(x_i, y_i, z_i)\}_{i=1}^n \in (A \times B \times C)^n$: $n$ examples independently emitted by $P(X, Y, Z)$.
     Structure estimation: choose one among the eight structures based on $\{(x_i, y_i, z_i)\}_{i=1}^n$.
     (The three-variable case $X, Y, Z$ can be extended to the $d$-variable case $\{X_j\}_{j=1}^d$ in a straightforward manner.)
  7. Problem: Previous Works
     Previous approaches assume that either all of the $X_j$ are finite or all of the $X_j$ are Gaussian.
     In reality, in any database, some fields are discrete and other fields are continuous.
  8. Problem: If $A, B, C$ are finite
     Given $x^n \in A^n$, $y^n \in B^n$, $z^n \in C^n$, we compute
     $$Q^n(x^n),\ Q^n(y^n),\ Q^n(z^n),\ Q^n(x^n, y^n),\ Q^n(x^n, z^n),\ Q^n(y^n, z^n),\ Q^n(x^n, y^n, z^n).$$
     For some prior probabilities $p_0, p_1, p_{00}, p_{01}, p_{10}, p_{11}$, what $Y$ depends on is decided by which is larger between
     $$p_0 Q^n(y^n), \quad p_1 \frac{Q^n(x^n, y^n)}{Q^n(x^n)},$$
     and what $Z$ depends on is decided by which is the largest among
     $$p_{00} Q^n(z^n), \quad p_{01} \frac{Q^n(y^n, z^n)}{Q^n(y^n)}, \quad p_{10} \frac{Q^n(x^n, z^n)}{Q^n(x^n)}, \quad p_{11} \frac{Q^n(x^n, y^n, z^n)}{Q^n(x^n, y^n)}.$$
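
The two-way comparison for $Y$ on the finite-alphabet slide can be sketched as follows. This is only an illustration under stated assumptions: the slides do not fix a particular $Q^n$, so the Krichevsky-Trofimov (KT) measure is used here as one possible choice, the priors default to $p_0 = p_1 = 1/2$, and the names `log_kt` and `y_depends_on_x` are hypothetical.

```python
from collections import Counter
from math import lgamma, log


def log_kt(seq, m):
    """Natural log of the KT measure Q^n(seq) for an alphabet of size m (assumed choice of Q^n)."""
    counts = Counter(seq)
    n = len(seq)
    # Q^n = Gamma(m/2) / Gamma(n + m/2) * prod_a Gamma(c_a + 1/2) / Gamma(1/2);
    # symbols that never occur contribute a factor of 1 and can be skipped.
    out = lgamma(m / 2) - lgamma(n + m / 2)
    for c in counts.values():
        out += lgamma(c + 0.5) - lgamma(0.5)
    return out


def y_depends_on_x(xs, ys, m_x, m_y, p0=0.5, p1=0.5):
    """Return True when p1 * Q(x^n, y^n) / Q(x^n) exceeds p0 * Q(y^n)."""
    pairs = list(zip(xs, ys))  # the pair (x_i, y_i) is one symbol of an alphabet of size m_x * m_y
    log_independent = log(p0) + log_kt(ys, m_y)
    log_dependent = log(p1) + log_kt(pairs, m_x * m_y) - log_kt(xs, m_x)
    return log_dependent > log_independent
```

Working in the log domain avoids underflow, since $Q^n$ decays exponentially in $n$.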
  9. Problem: Universal Coding
     $A := \{0, 1, \dots, m-1\}$ with $m \ge 2$.
     $x^n = (x_1, \dots, x_n) \in A^n$: independently emitted by an unknown $P^n(x^n) := \prod_{i=1}^n P(x_i)$.
     $\varphi$: uniquely decodable coding $A^n \to \{0, 1\}^*$, with code length $L_\varphi(x^n) := l$ when $\varphi(x^n) \in \{0, 1\}^l$.
     $\varphi$ is universal if, for any $P$,
     $$\frac{L_\varphi(x^n)}{n} \to H := \sum_{x \in A} -P(x) \log P(x).$$
     Examples: LZ, CTW.
  10. Problem: Why can $P^n$ be replaced by $Q^n$?
      $Q^n$ is a universal coding measure w.r.t. $A$ if
      $$-\frac{1}{n} \log Q^n(x^n) \to H \ \text{for any } P, \qquad \sum_{x^n \in A^n} Q^n(x^n) \le 1,$$
      such as $Q^n(x^n) := 2^{-L_\varphi(x^n)}$ if $\varphi$ is universal.
      Shannon-McMillan-Breiman: for any $P$,
      $$-\frac{1}{n} \log P^n(x^n) = \frac{1}{n} \sum_{i=1}^n \{-\log P(x_i)\} \to E[-\log P(X)] = H.$$
      Universality:
      $$\frac{1}{n} \log \frac{P^n(x^n)}{Q^n(x^n)} \to 0.$$
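
As a small numerical illustration of the two defining properties of a universal coding measure, the sketch below uses the Krichevsky-Trofimov (KT) measure, which is an assumed example and not one named in the slides. It checks the sum condition over all binary sequences of length 8 and computes the per-symbol code length on a balanced binary sequence, for which $H = 1$ bit. The function name `kt_prob` is hypothetical.

```python
from itertools import product
from math import log2


def kt_prob(seq, m=2):
    """KT probability of seq over the alphabet {0, ..., m-1}, built one symbol at a time."""
    counts = [0] * m
    prob = 1.0
    for t, a in enumerate(seq):
        # Sequential KT rule: P(next = a | past) = (count_a + 1/2) / (t + m/2).
        prob *= (counts[a] + 0.5) / (t + m / 2)
        counts[a] += 1
    return prob


# Sum of Q^n over every sequence of length 8 (the condition sum Q^n <= 1).
total = sum(kt_prob(s) for s in product(range(2), repeat=8))

# Per-symbol code length in bits on a balanced sequence of length 400.
seq = [0, 1] * 200
rate = -log2(kt_prob(seq)) / len(seq)
```

The gap between `rate` and 1 bit shrinks as the sequence grows; for much longer sequences the product should be accumulated in the log domain to avoid floating-point underflow.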
  11. Problem: Today's Problem. What if $A, B, C$ are not finite?
      $X \in A = [0, 1)$: continuous
      $Y \in B = \{1, 2, \dots\}$: discrete and infinite
      $Z \in C = [0, 1) \cup \{1, 2, \dots\}$: neither continuous nor discrete
      Without assuming that $A, B, C$ are either discrete or continuous:
      What is the counterpart of the universality condition $\frac{1}{n} \log \frac{P^n(x^n)}{Q^n(x^n)} \to 0$?
      What is the counterpart of a universal measure $Q^n$?
  12. Density Estimation: If a density function $f$ exists for $X$
      $A_0 := \{A\}$, and $A_{k+1}$ is a refinement of $A_k$.
      Example 1: $A = [0, 1)$
      $A_0 = \{[0, 1)\}$
      $A_1 = \{[0, 1/2), [1/2, 1)\}$
      $A_2 = \{[0, 1/4), [1/4, 1/2), [1/2, 3/4), [3/4, 1)\}$
      ...
      $A_k = \{[0, 2^{-k}), [2^{-k}, 2 \cdot 2^{-k}), \dots, [(2^k - 1) 2^{-k}, 1)\}$
      ...
      $s_k : A \to A_k$ (quantizer over $A$), $s_k^n : A^n \to A_k^n$ (quantizer over $A^n$)
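
A minimal sketch of the quantizer $s_k$ from Example 1, assuming (as the listed partitions $A_0$, $A_1$, $A_2$ indicate) that level $k$ splits $[0, 1)$ into $2^k$ cells of equal width. The function name `quantize` and the choice to return a cell as a `(low, high)` pair are illustrative, not from the slides.

```python
from math import floor


def quantize(x, k):
    """Return the cell of A_k containing x, as the half-open interval (low, high)."""
    if not 0.0 <= x < 1.0:
        raise ValueError("x must lie in [0, 1)")
    cells = 2 ** k                 # number of cells at level k
    index = floor(x * cells)       # which cell x falls into
    return index / cells, (index + 1) / cells


# The same point at successively finer levels: each cell is contained in the previous one.
levels = [quantize(0.3, k) for k in range(3)]
```

Each level halves the cell width, so the Lebesgue measure of the cell returned at level $k$ is $2^{-k}$.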
  13. Density Estimation
      $Q_k^n$: a universal coding measure w.r.t. $A_k$.
      $\lambda^n$: Lebesgue measure (width of an interval), $\lambda^n(s_k^n(x^n)) = \prod_{i=1}^n \lambda(s_k(x_i))$.
      $$g_k^n(x^n) := \frac{Q_k^n(s_k^n(x^n))}{\lambda^n(s_k^n(x^n))}$$
      $\{\omega_k\}_{k=1}^\infty$: $\sum_k \omega_k = 1$, $\omega_k > 0$; $g^n(x^n) := \sum_k \omega_k g_k^n(x^n)$.
      $$f_k^n(x^n) := \frac{P_k^n(s_k^n(x^n))}{\lambda^n(s_k^n(x^n))} = \prod_{i=1}^n \frac{P_k(s_k(x_i))}{\lambda(s_k(x_i))}$$
      If $\{A_k\}$ is such that $h(f_k) \to h(f)$ as $k \to \infty$, then for any $f^n$,
      $$\frac{1}{n} \log \frac{f^n(x^n)}{g^n(x^n)} \to 0 \quad \text{(B. Ryabko, 2009)}.$$
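
The mixture $g^n(x^n) = \sum_k \omega_k g_k^n(x^n)$ can be sketched numerically as below. Several choices are assumptions made only for illustration: dyadic partitions with $2^k$ cells at level $k$, the Krichevsky-Trofimov measure as $Q_k^n$, the weights $\omega_k = 1/(k(k+1))$, and a truncation of the infinite sum at `max_level` (so the weights used sum to slightly less than 1). The names `log_kt` and `log_g` are hypothetical.

```python
from collections import Counter
from math import exp, floor, lgamma, log


def log_kt(symbols, m):
    """Natural log of the KT measure of a symbol sequence over an alphabet of size m."""
    n = len(symbols)
    out = lgamma(m / 2) - lgamma(n + m / 2)
    for c in Counter(symbols).values():
        out += lgamma(c + 0.5) - lgamma(0.5)
    return out


def log_g(xs, max_level=8):
    """Log of the truncated mixture sum_k w_k g_k^n(x^n) for samples x_i in [0, 1)."""
    n = len(xs)
    terms = []
    for k in range(1, max_level + 1):
        cells = 2 ** k
        cell_ids = [floor(x * cells) for x in xs]      # s_k applied to each sample
        # g_k^n = Q_k^n(cells) / lambda^n(cells); every cell has width 2^-k.
        log_gk = log_kt(cell_ids, cells) + n * k * log(2)
        weight = 1.0 / (k * (k + 1))                   # assumed weights w_k
        terms.append(log(weight) + log_gk)
    peak = max(terms)
    return peak + log(sum(exp(t - peak) for t in terms))  # log-sum-exp for stability
```

On samples packed into a narrow interval, `log_g` comes out much larger than on samples spread evenly over $[0, 1)$, which is the behaviour expected of a density estimate.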
  14. Density Estimation in a General Sense: Exactly when does a density function $f$ exist for $X$?
      $\mathcal{B}$: the Borel sets of $\mathbb{R}$
      $\mu(D)$: the probability of $D \in \mathcal{B}$
      $\lambda(D)$: the Lebesgue measure of $D \in \mathcal{B}$
      $\mu$ is absolutely continuous w.r.t. $\lambda$. Equivalent conditions (Radon-Nikodym):
      $\mu \ll \lambda$: for each $D \in \mathcal{B}$, $\lambda(D) = 0 \Longrightarrow \mu(D) = 0$.
      There exists $\frac{d\mu}{d\lambda} := f$ such that $\mu(D) = \int_{t \in D} f(t)\, d\lambda(t)$.
  15. Density Estimation in a General Sense (Suzuki 2011)
      $\mu$ is absolutely continuous w.r.t. $\eta$. Equivalent conditions (Radon-Nikodym):
      $\mu \ll \eta$: for each $D \in \mathcal{B}$, $\eta(D) = 0 \Longrightarrow \mu(D) = 0$.
      There exists $\frac{d\mu}{d\eta} := f$ such that $\mu(D) = \int_{t \in D} f(t)\, d\eta(t)$.
      Example 2: $\mu(\{j\}) > 0$, $\eta(\{j\}) := \frac{1}{j(j+1)}$, $j \in B = \{1, 2, \dots\}$
      $\Longrightarrow \mu \ll \eta \iff$ there exists $f$ such that $\mu(D) = \sum_{j \in D} f(j)\, \eta(\{j\})$, $D \subseteq B$.
      In fact, $f(j) = \frac{\mu(\{j\})}{\eta(\{j\})}$ satisfies the condition.
      (The Lebesgue integral does not distinguish the discrete $\sum$ from the continuous $\int$.)
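
Example 2 can be checked with exact rational arithmetic. In the sketch below, $\eta(\{j\}) = 1/(j(j+1))$ comes from the slide, while the target distribution $\mu(\{j\}) = 2^{-j}$ and the set $D$ are hypothetical choices made only to have something concrete to verify.

```python
from fractions import Fraction


def eta(j):
    """Reference measure from Example 2: eta({j}) = 1 / (j (j + 1))."""
    return Fraction(1, j * (j + 1))


def mu(j):
    """Hypothetical target distribution mu({j}) = 2^-j, used only for illustration."""
    return Fraction(1, 2 ** j)


def density(j):
    """Generalized density f = d mu / d eta evaluated at j."""
    return mu(j) / eta(j)


# Verify mu(D) = sum_{j in D} f(j) * eta({j}) on an arbitrary finite set D.
D = {1, 3, 4}
lhs = sum(mu(j) for j in D)
rhs = sum(density(j) * eta(j) for j in D)
```

Because $\eta(\{j\}) > 0$ for every $j$, the ratio is always defined, which is exactly why $\mu \ll \eta$ holds here.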
  16. Density Estimation in a General Sense
      $B_0 := \{B\}$ with $B = \{1, 2, \dots\}$
      $B_1 := \{\{1\}, \{2, 3, \dots\}\}$
      $B_2 := \{\{1\}, \{2\}, \{3, 4, \dots\}\}$
      ...
      $B_k := \{\{1\}, \{2\}, \dots, \{k\}, \{k+1, k+2, \dots\}\}$
      ...
      $s_k : B \to B_k$, $s_k^n : B^n \to B_k^n$
      $$g_k^n(y^n) := \frac{Q_k^n(s_k^n(y^n))}{\eta^n(s_k^n(y^n))}, \qquad g^n(y^n) := \sum_{k=1}^\infty \omega_k g_k^n(y^n)$$
      If $\{B_k\}$ is such that $h(f_k) \to h(f)$ as $k \to \infty$, then for any $f^n$,
      $$\frac{1}{n} \log \frac{f^n(y^n)}{g^n(y^n)} \to 0.$$
      ($g^n(y^n) \prod_{i=1}^n \eta(\{y_i\})$ estimates $P(y^n) = f^n(y^n) \prod_{i=1}^n \eta(\{y_i\})$.)
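
A sketch of the quantizer $s_k : B \to B_k$ and of the $\eta$-measure of each cell, assuming the reference measure $\eta(\{j\}) = 1/(j(j+1))$ of Example 2. Representing the tail cell $\{k+1, k+2, \dots\}$ by the integer $k+1$ is an encoding chosen here for convenience, and the function names are hypothetical.

```python
from fractions import Fraction


def quantize(y, k):
    """Map y in {1, 2, ...} to its cell in B_k: y itself if y <= k, else the tail cell k + 1."""
    if y < 1:
        raise ValueError("y must be a positive integer")
    return y if y <= k else k + 1


def eta_cell(cell, k):
    """eta-measure of a cell of B_k under eta({j}) = 1 / (j (j + 1))."""
    if cell <= k:
        return Fraction(1, cell * (cell + 1))
    # Tail {k+1, k+2, ...}: the sum of 1/(j(j+1)) over j > k telescopes to 1/(k+1).
    return Fraction(1, k + 1)


# The cell measures of B_3 add up to eta(B) = 1.
total = sum(eta_cell(c, 3) for c in range(1, 5))
```

The closed form for the tail is what keeps $\eta^n(s_k^n(y^n))$ computable even though the tail cell contains infinitely many values.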
  17. Density Estimation in a General Sense: Estimation of Joint Density Functions
      Example 3: $A \times B$ (based on Examples 1 and 2 for $A$ and $B$), with $\mu \ll \lambda$ and $\mu \ll \eta$.
      $A_0 \times B_0 = \{A\} \times \{B\} = [0, 1) \times \{1, 2, \dots\}$
      $A_1 \times B_1$
      $A_2 \times B_2$
      ...
      $A_k \times B_k$
      ...
      $s_k : A \times B \to A_k \times B_k$
      If $\{A_k \times B_k\}$ is such that $h(f_k) \to h(f)$ as $k \to \infty$, then for any $f^n$, $g^n$ can be constructed so that
      $$\frac{1}{n} \log \frac{f^n(x^n, y^n)}{g^n(x^n, y^n)} \to 0 \qquad (1)$$
  18. Structure Estimation in a General Sense
      Estimate the generalized density functions
      $$f_X^n(x^n),\ f_Y^n(y^n),\ f_Z^n(z^n),\ f_{XY}^n(x^n, y^n),\ f_{XZ}^n(x^n, z^n),\ f_{YZ}^n(y^n, z^n),\ f_{XYZ}^n(x^n, y^n, z^n)$$
      by
      $$g_X^n(x^n),\ g_Y^n(y^n),\ g_Z^n(z^n),\ g_{XY}^n(x^n, y^n),\ g_{XZ}^n(x^n, z^n),\ g_{YZ}^n(y^n, z^n),\ g_{XYZ}^n(x^n, y^n, z^n)$$
      so that we can compare
      $$p_0\, g_Y^n(y^n), \quad p_1 \frac{g_{XY}^n(x^n, y^n)}{g_X^n(x^n)}$$
      and
      $$p_{00}\, g_Z^n(z^n), \quad p_{01} \frac{g_{YZ}^n(y^n, z^n)}{g_Y^n(y^n)}, \quad p_{10} \frac{g_{XZ}^n(x^n, z^n)}{g_X^n(x^n)}, \quad p_{11} \frac{g_{XYZ}^n(x^n, y^n, z^n)}{g_{XY}^n(x^n, y^n)}.$$
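
The four-way comparison for $Z$ can be sketched in the log domain as follows. The function assumes the seven values $\log g^n$ have already been computed by some density estimator; the dictionary layout, the function name `parents_of_z`, and the numbers in the usage lines are all hypothetical and made up for illustration. The $X$-only candidate divides by $g_X^n(x^n)$, mirroring the finite-alphabet comparison.

```python
from math import log


def parents_of_z(log_g, priors):
    """Pick the parent set of Z with the largest prior-weighted score.

    log_g maps a variable-name string such as "XZ" to log g^n of that joint sample.
    priors maps each candidate parent set ("", "Y", "X", "XY") to p00, p01, p10, p11.
    """
    scores = {
        "": log(priors[""]) + log_g["Z"],
        "Y": log(priors["Y"]) + log_g["YZ"] - log_g["Y"],
        "X": log(priors["X"]) + log_g["XZ"] - log_g["X"],
        "XY": log(priors["XY"]) + log_g["XYZ"] - log_g["XY"],
    }
    return max(scores, key=scores.get)


# Made-up log-density values, used only to exercise the function.
example_log_g = {"X": -10.0, "Y": -12.0, "Z": -15.0,
                 "XY": -20.0, "XZ": -18.0, "YZ": -28.0, "XYZ": -29.0}
uniform_priors = {"": 0.25, "Y": 0.25, "X": 0.25, "XY": 0.25}
choice = parents_of_z(example_log_g, uniform_priors)
```

Taking logarithms turns each ratio into a difference, which keeps the comparison stable when the densities are extremely small or large.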
  19. Summary
      Universal measure without assuming either discrete or continuous variables:
      $$\frac{1}{n} \log \frac{f^n(x^n)}{g^n(x^n)} \to 0,$$
      where $f^n(x^n) = \frac{d\mu^n}{d\eta^n}(x^n)$ and $g^n(x^n) = \frac{d\nu^n}{d\eta^n}(x^n)$ are extended density functions.
      Many applications based on the same approach:
      Estimation of Markov orders (discrete times and continuous values)
      Estimation of mutual information and its application to Chow-Liu
      Future works:
      Realistic settings of $\{A_k\}$, $\{\omega_k\}$ based on prior information
      Development of structure estimation modules
