Bayesian network structure estimation based on the Bayesian/MDL criteria when both discrete and continuous variables are present


J. Suzuki. "Bayesian network structure estimation based on the Bayesian/MDL criteria when both discrete and continuous variables are present". IEEE Data Compression Conference, pp. 307-316, Snowbird, Utah, April 2012.


  1. Bayesian Network Structure Estimation Based on the Bayesian/MDL Criteria When Both Discrete and Continuous Variables are Present. Joe Suzuki, Osaka University. April 11, 2012.
  2. Road Map: (1) Problem; (2) Density Estimation; (3) Density Estimation in a General Sense; (4) Structure Estimation in a General Sense; (5) Summary.
  3. Problem: Bayesian Network Structure
     $X, Y, Z$: random variables ordered as $X < Y < Z$.
     [Figure: the eight candidate network structures over $X, Y, Z$; the diagrams are not recoverable from the extraction.]
  6. Problem: Structure Estimation
     $X, Y, Z$: random variables over sets $A, B, C$.
     $\{(x_i, y_i, z_i)\}_{i=1}^n \in (A \times B \times C)^n$: $n$ examples independently emitted by $P(X, Y, Z)$.
     Structure Estimation: choose one among the eight structures based on $\{(x_i, y_i, z_i)\}_{i=1}^n$.
     (The three-variable case $X, Y, Z$ extends to the $d$-variable case $\{X_j\}_{j=1}^d$ in a straightforward manner.)
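The count of eight follows from the fixed order $X < Y < Z$: each variable may take any subset of its predecessors as parents, giving $1 \times 2 \times 4 = 8$ structures. The sketch below enumerates them; the function names and the dictionary representation are illustrative assumptions, not taken from the slides.

```python
from itertools import chain, combinations


def subsets(items):
    """All subsets of `items` as tuples, smallest first."""
    return list(chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)))


def candidate_structures(order):
    """Enumerate parent-set assignments consistent with a fixed variable order.

    Each structure is a dict mapping a variable to the tuple of its parents,
    where parents are drawn only from variables earlier in `order`.
    """
    structures = [{}]
    for idx, var in enumerate(order):
        predecessors = order[:idx]
        structures = [dict(s, **{var: parents})
                      for s in structures
                      for parents in subsets(predecessors)]
    return structures


structures = candidate_structures(("X", "Y", "Z"))
print(len(structures))  # number of candidate structures for three variables
```

For $d$ ordered variables the same enumeration yields $2^{d(d-1)/2}$ structures, which is why the slides compare parent sets variable by variable instead of scoring whole graphs.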
  7. Problem: Previous Works
     Previous approaches assume either that all of the $X_j$ are finite, or that all of the $X_j$ are Gaussian.
     In reality, in any database, some fields are discrete and other fields are continuous.
  8. Problem: If $A, B, C$ are finite
     Given $x^n \in A^n$, $y^n \in B^n$, $z^n \in C^n$, we compute
     $Q^n(x^n)$, $Q^n(y^n)$, $Q^n(z^n)$, $Q^n(x^n, y^n)$, $Q^n(x^n, z^n)$, $Q^n(y^n, z^n)$, $Q^n(x^n, y^n, z^n)$.
     For some prior probabilities $p_0, p_1, p_{00}, p_{01}, p_{10}, p_{11}$, what $Y$ depends on is decided by which is larger between
     $$p_0 Q^n(y^n), \quad p_1 \frac{Q^n(x^n, y^n)}{Q^n(x^n)},$$
     and what $Z$ depends on is decided by which is the largest among
     $$p_{00} Q^n(z^n), \quad p_{01} \frac{Q^n(y^n, z^n)}{Q^n(y^n)}, \quad p_{10} \frac{Q^n(x^n, z^n)}{Q^n(x^n)}, \quad p_{11} \frac{Q^n(x^n, y^n, z^n)}{Q^n(x^n, y^n)}.$$
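A minimal sketch of the finite-alphabet comparison for $Y$, working in log scale. The slides do not fix a particular $Q^n$; this sketch assumes the Krichevsky-Trofimov (KT) measure, which is one known universal measure for i.i.d. sources over a finite alphabet, and treats a pair $(x_i, y_i)$ as one symbol over $A \times B$. The function names and the equal priors are hypothetical.

```python
import math
from collections import Counter


def log2_kt(seq, alphabet_size):
    """log2 of the Krichevsky-Trofimov probability of `seq` (assumed choice of Q^n)."""
    counts = Counter()
    total = 0.0
    for i, symbol in enumerate(seq):
        # Sequential KT rule: (count so far + 1/2) / (symbols seen so far + m/2).
        total += math.log2((counts[symbol] + 0.5) / (i + alphabet_size / 2))
        counts[symbol] += 1
    return total


def choose_parent_of_y(xs, ys, size_x, size_y, p0=0.5, p1=0.5):
    """Return "X" if p1 * Q(x,y)/Q(x) beats p0 * Q(y), else None (no parent)."""
    score_no_parent = math.log2(p0) + log2_kt(ys, size_y)
    score_parent_x = (math.log2(p1)
                      + log2_kt(list(zip(xs, ys)), size_x * size_y)
                      - log2_kt(xs, size_x))
    return "X" if score_parent_x > score_no_parent else None


xs = [i % 2 for i in range(200)]
copied_ys = list(xs)                              # Y is a copy of X
unrelated_ys = [(i // 2) % 2 for i in range(200)]  # all four (x, y) pairs equally often

print(choose_parent_of_y(xs, copied_ys, 2, 2))
print(choose_parent_of_y(xs, unrelated_ys, 2, 2))
```

When $Y$ copies $X$, the conditional term costs almost nothing and the edge $X \to Y$ is selected; when the pairs are balanced, the extra parameters of the joint model are penalized and the empty parent set is selected.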
  9. Problem: Universal Coding
     $A := \{0, 1, \cdots, m-1\}$ with $m \ge 2$.
     $x^n = (x_1, \cdots, x_n) \in A^n$: independently emitted by an unknown $P^n(x^n) := \prod_{i=1}^n P(x_i)$.
     $\varphi$: uniquely decodable coding $A^n \to \{0, 1\}^*$, with $\varphi(x^n) \in \{0, 1\}^{\ell} \Longrightarrow L_\varphi(x^n) := \ell$.
     $\varphi$ is universal if, for any $P$,
     $$\frac{L_\varphi(x^n)}{n} \to H := \sum_{x \in A} -P(x) \log P(x).$$
     Examples: LZ, CTW.
  10. Problem: Why can $P^n$ be replaced by $Q^n$?
      $Q^n$ is a universal coding measure w.r.t. $A$ if
      $$-\frac{1}{n} \log Q^n(x^n) \to H \ \text{ for any } P, \qquad \sum_{x^n \in A^n} Q^n(x^n) \le 1,$$
      such as $Q^n(x^n) := 2^{-L_\varphi(x^n)}$ if $\varphi$ is universal.
      Shannon-McMillan-Breiman: for any $P$,
      $$-\frac{1}{n} \log P^n(x^n) = \frac{1}{n} \sum_{i=1}^n \{-\log P(x_i)\} \to E[-\log P(X)] = H.$$
      Universality:
      $$\frac{1}{n} \log \frac{P^n(x^n)}{Q^n(x^n)} \to 0.$$
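The universality statement can be checked numerically. The sketch below assumes the Krichevsky-Trofimov measure as $Q^n$ (the slides name LZ and CTW as universal codes but do not fix $Q^n$) and a hypothetical binary source with $P = (0.8, 0.2)$; the seed and sample size are arbitrary choices.

```python
import math
import random
from collections import Counter


def log2_kt(seq, alphabet_size):
    """log2 of the Krichevsky-Trofimov probability of `seq` (assumed choice of Q^n)."""
    counts = Counter()
    total = 0.0
    for i, symbol in enumerate(seq):
        total += math.log2((counts[symbol] + 0.5) / (i + alphabet_size / 2))
        counts[symbol] += 1
    return total


def entropy_bits(probs):
    """Shannon entropy H in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


probs = [0.8, 0.2]          # hypothetical source distribution P
n = 20000
rng = random.Random(0)      # fixed seed so the run is repeatable
xs = rng.choices(range(len(probs)), weights=probs, k=n)

log2_q = log2_kt(xs, len(probs))
log2_p = sum(math.log2(probs[x]) for x in xs)

code_rate = -log2_q / n     # -(1/n) log Q^n(x^n), should approach H
gap = (log2_p - log2_q) / n  # (1/n) log P^n(x^n)/Q^n(x^n), should approach 0

print(code_rate, entropy_bits(probs), gap)
```

The per-symbol gap shrinks roughly like $\frac{m-1}{2n} \log n$ for the KT measure, which is why $Q^n$ can stand in for the unknown $P^n$ in the comparisons.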
  11. Problem: Today's Problem: What if $A, B, C$ are not finite?
      $X \in A = [0, 1)$: continuous.
      $Y \in B = \{1, 2, \cdots\}$: discrete and infinite.
      $Z \in C = [0, 1) \cup \{1, 2, \cdots\}$: neither continuous nor discrete.
      Without assuming that $A, B, C$ are either discrete or continuous: what does universality, $\frac{1}{n} \log \frac{P^n(x^n)}{Q^n(x^n)} \to 0$, look like? What does a universal measure like $Q^n$ look like?
  12. Density Estimation: If a density function $f$ exists for $X$
      $A_0 := \{A\}$; $A_{k+1}$ is a refinement of $A_k$.
      Example 1: $A = [0, 1)$
      $A_0 = \{[0, 1)\}$
      $A_1 = \{[0, 1/2), [1/2, 1)\}$
      $A_2 = \{[0, 1/4), [1/4, 1/2), [1/2, 3/4), [3/4, 1)\}$
      $\vdots$
      $A_k = \{[0, 2^{-k}), [2^{-k}, 2 \cdot 2^{-k}), \cdots, [(2^k - 1) 2^{-k}, 1)\}$
      $\vdots$
      $s_k : A \to A_k$ (quantizer over $A$), $s_k^n : A^n \to A_k^n$ (quantizer over $A^n$).
  13. Density Estimation
      $Q_k^n$: a universal coding measure w.r.t. $A_k$.
      $\lambda^n$: Lebesgue measure (width of an interval), $\lambda^n(s_k^n(x^n)) = \prod_{i=1}^n \lambda(s_k(x_i))$.
      $$g_k^n(x^n) := \frac{Q_k^n(s_k^n(x^n))}{\lambda^n(s_k^n(x^n))}$$
      $\{\omega_k\}_{k=1}^\infty$: $\sum_k \omega_k = 1$, $\omega_k > 0$; $g^n(x^n) := \sum_k \omega_k g_k^n(x^n)$.
      $$f_k^n(x^n) := \frac{P_k^n(s_k^n(x^n))}{\lambda^n(s_k^n(x^n))} = \prod_{i=1}^n \frac{P_k(s_k(x_i))}{\lambda(s_k(x_i))}$$
      If $\{A_k\}$ is s.t. $h(f_k) \to h(f)$ ($k \to \infty$), then for any $f^n$,
      $$\frac{1}{n} \log \frac{f^n(x^n)}{g^n(x^n)} \to 0 \quad \text{(B. Ryabko, 2009)}.$$
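A minimal sketch of the mixture estimate $g^n$ on $A = [0, 1)$, under several assumptions that are not in the slides: $A_k$ splits $[0, 1)$ into $2^k$ equal cells (as in the listed $A_0, A_1, A_2$), $Q_k^n$ is the Krichevsky-Trofimov measure, the infinite sum over $k$ is truncated at a hypothetical `max_level`, and the weights $\omega_k \propto 1/(k(k+1))$ are renormalized over the retained levels. Everything is computed in natural-log scale to avoid underflow.

```python
import math
from collections import Counter


def log_kt(seq, alphabet_size):
    """Natural log of the Krichevsky-Trofimov probability of `seq` (assumed Q_k^n)."""
    counts = Counter()
    total = 0.0
    for i, symbol in enumerate(seq):
        total += math.log((counts[symbol] + 0.5) / (i + alphabet_size / 2))
        counts[symbol] += 1
    return total


def log_g_k(xs, k):
    """log g_k^n(x^n) = log Q_k^n(s_k^n(x^n)) - log lambda^n(s_k^n(x^n))."""
    cells = 2 ** k
    # s_k: index of the width-2^-k cell containing x (guarded against rounding).
    quantized = [min(int(x * cells), cells - 1) for x in xs]
    # Every cell has width 2^-k, so -log lambda^n = n * k * log 2.
    return log_kt(quantized, cells) + len(xs) * k * math.log(2)


def log_g(xs, max_level=8):
    """log of the truncated mixture sum_k omega_k g_k^n(x^n), via log-sum-exp."""
    raw = [1.0 / (k * (k + 1)) for k in range(1, max_level + 1)]
    norm = sum(raw)
    terms = [math.log(w / norm) + log_g_k(xs, k)
             for k, w in enumerate(raw, start=1)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Deterministic, evenly spread points: true density f = 1 on [0, 1).
uniform_xs = [(i * 0.6180339887) % 1.0 for i in range(2000)]
# Same points squeezed into [0, 1/4): true density f = 4 there.
peaked_xs = [0.25 * x for x in uniform_xs]

print(log_g(uniform_xs) / len(uniform_xs))
print(log_g(peaked_xs) / len(peaked_xs))
```

For the evenly spread sample the per-point log density stays near $\log 1 = 0$, and for the squeezed sample it approaches $\log 4$, matching $\frac{1}{n} \log \frac{f^n(x^n)}{g^n(x^n)} \to 0$ in both cases.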
  14. Density Estimation in a General Sense: Exactly when does a density function $f$ exist given $X$?
      $\mathcal{B}$: the Borel sets of $\mathbb{R}$; $\mu(D)$: the probability of $D \in \mathcal{B}$; $\lambda(D)$: the Lebesgue measure of $D \in \mathcal{B}$.
      $\mu$ is absolutely continuous w.r.t. $\lambda$. Equivalent conditions (Radon-Nikodym):
      $\mu \ll \lambda$: for each $D \in \mathcal{B}$, $\lambda(D) = 0 \Longrightarrow \mu(D) = 0$.
      There exists $\frac{d\mu}{d\lambda} := f$ s.t. $\mu(D) = \int_{t \in D} f(t) \, d\lambda(t)$.
  15. Density Estimation in a General Sense (Suzuki 2011)
      $\mu$ is absolutely continuous w.r.t. $\eta$. Equivalent conditions (Radon-Nikodym):
      $\mu \ll \eta$: for each $D \in \mathcal{B}$, $\eta(D) = 0 \Longrightarrow \mu(D) = 0$.
      There exists $\frac{d\mu}{d\eta} := f$ s.t. $\mu(D) = \int_{t \in D} f(t) \, d\eta(t)$.
      Example 2: $\mu(\{j\}) > 0$, $\eta(\{j\}) := \frac{1}{j(j+1)}$, $j \in B = \{1, 2, \cdots\}$
      $\Longrightarrow \mu \ll \eta \iff$ there exists $f$ s.t. $\mu(D) = \sum_{j \in D} f(j) \eta(\{j\})$, $D \subseteq B$.
      In fact, $f(j) = \frac{\mu(\{j\})}{\eta(\{j\})}$ satisfies the condition.
      (The Lebesgue integral does not distinguish the discrete $\sum$ from the continuous $\int$.)
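Example 2 can be verified with exact rational arithmetic. In the sketch below, $\eta(\{j\}) = 1/(j(j+1))$ is from the slide, while the geometric $\mu(\{j\}) = 2^{-j}$ is a hypothetical choice of $\mu$ made only for illustration; the function names are likewise assumptions.

```python
from fractions import Fraction


def eta(j):
    """Reference measure from Example 2: eta({j}) = 1 / (j (j + 1))."""
    return Fraction(1, j * (j + 1))


def eta_tail(k):
    """eta({k+1, k+2, ...}); the sum telescopes to 1 / (k + 1)."""
    return Fraction(1, k + 1)


def mu(j):
    """Hypothetical probability on {1, 2, ...}: mu({j}) = 2^-j."""
    return Fraction(1, 2 ** j)


def density(j):
    """Generalized density f = d(mu)/d(eta) evaluated at j."""
    return mu(j) / eta(j)


D = [1, 3, 4, 10]
lhs = sum(mu(j) for j in D)                 # mu(D)
rhs = sum(density(j) * eta(j) for j in D)   # sum over D of f(j) * eta({j})
print(lhs == rhs)
```

Because $\eta$ puts positive mass on every $j$, any $\mu$ on $B$ is absolutely continuous with respect to it, and the density is just the pointwise ratio.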
  16. Density Estimation in a General Sense
      $B_0 := \{B\}$ with $B = \{1, 2, \cdots\}$
      $B_1 := \{\{1\}, \{2, 3, \cdots\}\}$
      $B_2 := \{\{1\}, \{2\}, \{3, 4, \cdots\}\}$
      $\vdots$
      $B_k := \{\{1\}, \{2\}, \cdots, \{k\}, \{k+1, k+2, \cdots\}\}$
      $\vdots$
      $s_k : B \to B_k$, $s_k^n : B^n \to B_k^n$
      $$g_k^n(y^n) := \frac{Q_k^n(s_k^n(y^n))}{\eta^n(s_k^n(y^n))}, \qquad g^n(y^n) := \sum_{k=1}^\infty \omega_k g_k^n(y^n)$$
      If $\{B_k\}$ is s.t. $h(f_k) \to h(f)$ ($k \to \infty$), then for any $f^n$,
      $$\frac{1}{n} \log \frac{f^n(y^n)}{g^n(y^n)} \to 0.$$
      ($g^n(y^n) \prod_{i=1}^n \eta(\{y_i\})$ estimates $P(y^n) = f^n(y^n) \prod_{i=1}^n \eta(\{y_i\})$.)
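A sketch of a single level $k$ of the construction above, assuming $\eta(\{j\}) = 1/(j(j+1))$ as in Example 2 (so the tail cell $\{k+1, k+2, \cdots\}$ has mass $1/(k+1)$) and the Krichevsky-Trofimov measure as $Q_k^n$; neither the choice of $Q_k^n$ nor the function names come from the slides. The mixture over $k$ is omitted here to keep the example short.

```python
import math
from collections import Counter


def quantize(y, k):
    """s_k: cells 1..k are the singletons {1}..{k}; cell k+1 is the tail."""
    return y if y <= k else k + 1


def eta_cell(cell, k):
    """eta of a cell of B_k under eta({j}) = 1/(j(j+1)); tail mass is 1/(k+1)."""
    return 1.0 / (cell * (cell + 1)) if cell <= k else 1.0 / (k + 1)


def log_kt(seq, alphabet_size):
    """Natural log of the Krichevsky-Trofimov probability of `seq` (assumed Q_k^n)."""
    counts = Counter()
    total = 0.0
    for i, symbol in enumerate(seq):
        total += math.log((counts[symbol] + 0.5) / (i + alphabet_size / 2))
        counts[symbol] += 1
    return total


def log_g_k(ys, k):
    """log g_k^n(y^n) = log Q_k^n(s_k^n(y^n)) - sum_i log eta(s_k(y_i))."""
    cells = [quantize(y, k) for y in ys]
    return log_kt(cells, k + 1) - sum(math.log(eta_cell(c, k)) for c in cells)


def log_prob_estimate_k(ys, k):
    """log of g_k^n(y^n) * prod_i eta({y_i}), the level-k estimate of P(y^n)."""
    return log_g_k(ys, k) + sum(math.log(1.0 / (y * (y + 1))) for y in ys)


ys = [1, 2, 1, 3, 2, 1]
print(log_prob_estimate_k(ys, 3), log_kt(ys, 4))
```

When every observation falls in a singleton cell ($y_i \le k$), the $\eta$ factors cancel and the estimate reduces to $Q_k^n$ itself; observations in the tail are spread over $\{k+1, k+2, \cdots\}$ in proportion to $\eta$.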
  17. Density Estimation in a General Sense: Estimation of Simultaneous Density Functions
      Example 3: $A \times B$ (based on Examples 1, 2 for $A$, $B$), with $\mu \ll \lambda$ and $\mu \ll \eta$.
      $A_0 \times B_0 = \{A\} \times \{B\} = [0, 1) \times \{1, 2, \cdots\}$
      $A_1 \times B_1$
      $A_2 \times B_2$
      $\vdots$
      $A_k \times B_k$
      $\vdots$
      $s_k : A \times B \to A_k \times B_k$
      If $\{A_k \times B_k\}$ is s.t. $h(f_k) \to h(f)$ ($k \to \infty$), then for any $f^n$, $g^n$ can be constructed so that
      $$\frac{1}{n} \log \frac{f^n(x^n, y^n)}{g^n(x^n, y^n)} \to 0 \qquad (1)$$
  18. Structure Estimation in a General Sense
      Estimate the generalized density functions
      $f_X^n(x^n)$, $f_Y^n(y^n)$, $f_Z^n(z^n)$, $f_{XY}^n(x^n, y^n)$, $f_{XZ}^n(x^n, z^n)$, $f_{YZ}^n(y^n, z^n)$, $f_{XYZ}^n(x^n, y^n, z^n)$
      by
      $g_X^n(x^n)$, $g_Y^n(y^n)$, $g_Z^n(z^n)$, $g_{XY}^n(x^n, y^n)$, $g_{XZ}^n(x^n, z^n)$, $g_{YZ}^n(y^n, z^n)$, $g_{XYZ}^n(x^n, y^n, z^n)$
      so that we can compare
      $$p_0 g_Y^n(y^n), \quad p_1 \frac{g_{XY}^n(x^n, y^n)}{g_X^n(x^n)}$$
      and
      $$p_{00} g_Z^n(z^n), \quad p_{01} \frac{g_{YZ}^n(y^n, z^n)}{g_Y^n(y^n)}, \quad p_{10} \frac{g_{XZ}^n(x^n, z^n)}{g_X^n(x^n)}, \quad p_{11} \frac{g_{XYZ}^n(x^n, y^n, z^n)}{g_{XY}^n(x^n, y^n)}.$$
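Once the $g^n$ values are available, the comparison itself is a small argmax over parent sets. The sketch below takes the log values as given; the numbers in `log_g` are made up purely to exercise the rule, and the function names, the dictionary layout, and the uniform priors are assumptions.

```python
import math
from itertools import combinations


def choose_parents(log_g, log_prior, target, predecessors):
    """Pick the parent set S maximizing log p_S + log g(S + target) - log g(S).

    `log_g` maps a frozenset of variable names to the log of the estimated
    generalized density of those variables; the empty set maps to 0.0.
    """
    best_set, best_score = None, -math.inf
    for r in range(len(predecessors) + 1):
        for parents in combinations(predecessors, r):
            s = frozenset(parents)
            score = log_prior[s] + log_g[s | {target}] - log_g[s]
            if score > best_score:
                best_set, best_score = s, score
    return best_set


def uniform_log_prior(predecessors):
    """Equal prior over all subsets of `predecessors` (an assumed choice)."""
    count = 2 ** len(predecessors)
    return {frozenset(c): -math.log(count)
            for r in range(len(predecessors) + 1)
            for c in combinations(predecessors, r)}


# Hypothetical log g^n values, for illustration only.
log_g = {
    frozenset(): 0.0,
    frozenset("X"): -100.0, frozenset("Y"): -120.0, frozenset("Z"): -130.0,
    frozenset("XY"): -200.0, frozenset("XZ"): -225.0, frozenset("YZ"): -215.0,
    frozenset("XYZ"): -290.0,
}

parents_of_y = choose_parents(log_g, uniform_log_prior(["X"]), "Y", ["X"])
parents_of_z = choose_parents(log_g, uniform_log_prior(["X", "Y"]), "Z", ["X", "Y"])
print(sorted(parents_of_y), sorted(parents_of_z))
```

With these illustrative values the rule selects $\{X\}$ as the parent set of $Y$ and $\{X, Y\}$ as the parent set of $Z$.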
  19. Summary
      Universal measure without assuming either discrete or continuous:
      $$\frac{1}{n} \log \frac{f^n(x^n)}{g^n(x^n)} \to 0,$$
      where $f^n(x^n) = \frac{d\mu^n}{d\eta^n}(x^n)$ and $g^n(x^n) = \frac{d\nu^n}{d\eta^n}(x^n)$ are extended density functions.
      Many applications based on the same approach: estimation of Markov orders (discrete time and continuous values); estimation of mutual information and its application to Chow-Liu.
      Future work: realistic settings of $\{A_k\}$, $\{\omega_k\}$ based on a priori information; development of structure estimation modules.
