Learning BNs with Discrete and Continuous Variables
Joe Suzuki
Osaka University
PGM 2014 @Utrecht
Joe Suzuki (Osaka University) Learning BNs with Discrete and Continuous Variables PGM 2014 @Utrecht 1 / 27
Road Map
1. Learning BNs
2. When a Density exists
3. The General Case
4. Practical BN Learning with Discrete and Continuous Variables
5. Conclusion
Learning BNs
Factoring P(X, Y, Z): the eleven factorizations

(1)  P(X) P(Y) P(Z)
(2)  P(X) P(Y, Z)
(3)  P(Y) P(Z, X)
(4)  P(Z) P(X, Y)
(5)  P(X, Y) P(X, Z) / P(X)
(6)  P(X, Y) P(Y, Z) / P(Y)
(7)  P(X, Z) P(Y, Z) / P(Z)
(8)  P(Y) P(Z) P(X, Y, Z) / P(Y, Z)
(9)  P(Z) P(X) P(X, Y, Z) / P(Z, X)
(10) P(X) P(Y) P(X, Y, Z) / P(X, Y)
(11) P(X, Y, Z)
Learning BNs
BNs for X, Y, Z

[Figure: the eleven DAG structures (1)-(11) over X, Y, and Z, one per factorization on the previous slide.]

Markov equivalence: distinct DAGs that share a factorization (e.g., the chains and the fork over the same skeleton) cannot be distinguished from data, so (1)-(11) index equivalence classes rather than individual DAGs. [Arrow diagrams omitted.]
Learning BNs
The Problem

Identify the BN structure among (1)-(11) from n examples

x^n = (x_1, ..., x_n), y^n = (y_1, ..., y_n), z^n = (z_1, ..., z_n):

(X = x_1, Y = y_1, Z = z_1)
(X = x_2, Y = y_2, Z = z_2)
...
(X = x_n, Y = y_n, Z = z_n)   i.i.d.

(N ≠ 3 variables will be considered as well.)
Some fields are discrete and others continuous

Discrete only: {Male, Female}, {Married, Unmarried}, Age
Continuous only: Height, Weight, Foot size
Discrete/Continuous: Height, Weight, Age

BN structure learning with both discrete and continuous variables.

Why solve only the easiest but unrealistic problems?
Learning BNs
Previous Works

Independence testing: PC algorithm (Spirtes, 2000), etc.
Bayesian: the problem can be classified into
- factor scores
- structure scores, computed from the factor scores to find the best structure

Almost all previous works assume discrete only or Gaussian only; when discrete and continuous variables are mixed, no performance is guaranteed:
1. Friedman and Goldszmidt (UAI-97): discretizing continuous variables
2. the R package by Bottcher (2003): assuming Gaussian
3. Monti and Cooper (NIPS-96): approximating with neural networks
Shenoy (PGM-12): mixtures of polynomials, but only for density estimation
Learning BNs
N = 2 (Bayesian Independence Test)

w(θ): the prior over θ
p: the prior probability of X ⊥⊥ Y

Q^n(x^n) := ∫ P^n(x^n | θ) w(θ) dθ,  Q^n(y^n) := ∫ P^n(y^n | θ) w(θ) dθ,
Q^n(x^n, y^n) := ∫ P^n(x^n, y^n | θ) w(θ) dθ

The posterior probability of X ⊥⊥ Y given x^n, y^n:

P(X ⊥⊥ Y | x^n, y^n) = p Q^n(x^n) Q^n(y^n) / { p Q^n(x^n) Q^n(y^n) + (1 − p) Q^n(x^n, y^n) }

The decision rule:

X ⊥⊥ Y  ⟺  p Q^n(x^n) Q^n(y^n) ≥ (1 − p) Q^n(x^n, y^n)
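For binary X and Y, the test above can be sketched with the symmetric Beta/Dirichlet(1/2) prior (the Krichevsky-Trofimov choice of w(θ), one concrete assumption); `kt_log_marginal` and `independent` are hypothetical helper names:

```python
from math import log

def kt_log_marginal(seq, alphabet_size):
    """Log of the Bayesian marginal Q^n(seq) under a symmetric Dirichlet(1/2) prior:
    sequential factors (c(a) + 1/2) / (i - 1 + |A|/2) for i = 1, ..., n."""
    counts = {}
    logq = 0.0
    for i, a in enumerate(seq):
        c = counts.get(a, 0)
        logq += log((c + 0.5) / (i + alphabet_size / 2))
        counts[a] = c + 1
    return logq

def independent(xn, yn, p=0.5):
    """Decision rule: X indep. of Y  <=>  p Q^n(x^n) Q^n(y^n) >= (1 - p) Q^n(x^n, y^n)."""
    lx = kt_log_marginal(xn, 2)                  # Q^n(x^n), binary X
    ly = kt_log_marginal(yn, 2)                  # Q^n(y^n), binary Y
    lxy = kt_log_marginal(list(zip(xn, yn)), 4)  # joint Q^n(x^n, y^n) over 4 pair symbols
    return log(p) + lx + ly >= log(1 - p) + lxy
```

With p = 1/2 the rule reduces to comparing the marginal and joint evidences directly; dependent data (e.g., y_i = x_i) drive the joint score up and the test answers "not independent".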
Learning BNs
Factor Score Q^n(x^n) := ∫ P^n(x^n | θ) w(θ) dθ

c: # of ones in x^n (binary), P(x^n | θ) = θ^c (1 − θ)^(n−c)

w(θ) ∝ θ^(a−1) (1 − θ)^(b−1)  ⟹  Q(x_{n+1} | x^n) = Q^{n+1}(x^{n+1}) / Q^n(x^n) = (c + a) / (n + a + b)  (for x_{n+1} = 1)

Kraft's inequality: Σ_{x^n} Q^n(x^n) ≤ 1

Q^n is universal: for any θ in P(X | θ), as n → ∞, with probability 1,

(1/n) log { P^n(x^n | θ) / Q^n(x^n) } → 0

The decision rule does not depend on the priors {p_i} and w(θ) as n → ∞.
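Kraft's inequality in fact holds with equality for such a Q^n, because the sequential factors are conditional probabilities that sum to one at each step. A quick numerical check, assuming a = b = 1/2 (the Krichevsky-Trofimov prior):

```python
from itertools import product
from math import exp, log

def kt_log_marginal(seq, alphabet_size=2):
    """Sequential Bayesian marginal with a = b = 1/2:
    each factor is (c + 1/2) / (i - 1 + |A|/2)."""
    counts = {}
    logq = 0.0
    for i, a in enumerate(seq):
        c = counts.get(a, 0)
        logq += log((c + 0.5) / (i + alphabet_size / 2))
        counts[a] = c + 1
    return logq

# Kraft: sum Q^n(x^n) over all 2^8 binary strings of length 8.
total = sum(exp(kt_log_marginal(xs)) for xs in product([0, 1], repeat=8))
```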
When a Density exists
When a density f exists w.r.t. X (Ryabko, 2009)

A_0 := {A}; A_{j+1} is a refinement of A_j.
For each level j, quantize x^n = (x_1, ..., x_n) into (a_1^(j), ..., a_n^(j)).

[Figure: the refining partitions A_1, A_2, ..., A_j of the range of X.]

g_1^n(x^n) = Q_1^n(a_1^(1), ..., a_n^(1)) / { λ(a_1^(1)) ··· λ(a_n^(1)) }
g_2^n(x^n) = Q_2^n(a_1^(2), ..., a_n^(2)) / { λ(a_1^(2)) ··· λ(a_n^(2)) }
...
g_j^n(x^n) = Q_j^n(a_1^(j), ..., a_n^(j)) / { λ(a_1^(j)) ··· λ(a_n^(j)) }

λ: Lebesgue measure (interval length)
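Assuming dyadic partitions A_j of [0, 1) (2^j equal cells, one concrete refinement sequence) and a Dirichlet(1/2, ..., 1/2) score Q_j^n, the level-j score can be sketched as follows; `log_g_level` is a hypothetical name:

```python
from math import log

def kt_log_marginal(seq, alphabet_size):
    """Log Q_j^n of the quantized sequence under a Dirichlet(1/2,...,1/2) prior."""
    counts = {}
    logq = 0.0
    for i, a in enumerate(seq):
        c = counts.get(a, 0)
        logq += log((c + 0.5) / (i + alphabet_size / 2))
        counts[a] = c + 1
    return logq

def log_g_level(xn, j):
    """log g_j^n(x^n) for data in [0,1): quantize at level j (2^j equal cells),
    score with Q_j^n, and divide by the Lebesgue measure of each visited cell."""
    m = 2 ** j                                    # |A_j| cells, each of width 1/m
    cells = [min(int(x * m), m - 1) for x in xn]  # a_i^(j): cell index of x_i
    return kt_log_marginal(cells, m) - len(xn) * log(1.0 / m)
```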
When a Density exists
Universality of g^n

f: the (true) density; f_j: the (approximated) density at level j; f^n(x^n) := f(x_1) ··· f(x_n)

Ryabko 2009: for any f s.t. D(f || f_j) → 0 (j → ∞), with probability 1, as n → ∞,

(1/n) log { f^n(x^n) / g^n(x^n) } → 0
The General Case
When no density exists w.r.t. X (Suzuki, 2011)

B_1 := {{1}, {2, 3, ...}}
B_2 := {{1}, {2}, {3, 4, ...}}
...
B_k := {{1}, {2}, ..., {k}, {k+1, k+2, ...}}
...

For each level k, quantize x^n = (x_1, ..., x_n) into (b_1^(k), ..., b_n^(k)).

η({k}) = 1/k − 1/(k+1)

g_k^n(x^n) := Q_k^n(b_1^(k), ..., b_n^(k)) / { η(b_1^(k)) ··· η(b_n^(k)) }

Σ_k ω_k = 1, ω_k ≥ 0, g^n(x^n) := Σ_{k=1}^∞ ω_k g_k^n(x^n)
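A small sketch of the level-k quantizers B_k and the base measure η, using exact rationals; `cell_measures` and `quantize` are hypothetical helper names:

```python
from fractions import Fraction

def cell_measures(k):
    """Measures under eta of the cells of B_k = {{1}, ..., {k}, {k+1, k+2, ...}}:
    eta({j}) = 1/j - 1/(j+1), and the tail cell telescopes to 1/(k+1)."""
    singles = [Fraction(1, j) - Fraction(1, j + 1) for j in range(1, k + 1)]
    return singles + [Fraction(1, k + 1)]

def quantize(x, k):
    """b^(k): map a positive integer x to its 0-based cell index in B_k."""
    return x - 1 if x <= k else k
```

At every level the cell measures sum to 1, so η plays the role that the interval length λ plays in the density case.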
The General Case
D(f || f_j) ↛ 0 as j → ∞ (example 1)

[Figure: a density f on (0, 1] and bins C_0, C_1, C_2, C_3, ... for which the quantized approximations f_j never capture f, so D(f || f_j) does not vanish.]
The General Case
D(f || f_j) ↛ 0 as j → ∞ (example 2)

[Figure: a second density f on (0, 1] with bins C_0, C_1, C_2, C_3, ... for which D(f || f_j) does not vanish.]
The General Case
D(f || f_j) → 0 as j → ∞

Universal histogram sequence {C_k}_{k=0}^∞
[Figure: the bins C_0, C_1, C_2, C_3, ... refining along the x axis.]

Suzuki 2013: for any (generalized) density f, as n → ∞, with probability 1,

(1/n) log { f^n(x^n) / g^n(x^n) } → 0
The General Case
For (X, Y) rather than X

g_{jk}^n(x^n, y^n) := Q_{jk}^n(a_1^(j), ..., a_n^(j), b_1^(k), ..., b_n^(k)) / { λ(a_1^(j)) ··· λ(a_n^(j)) η(b_1^(k)) ··· η(b_n^(k)) }

Σ_{j,k} ω_{jk} = 1, ω_{jk} ≥ 0, g^n(x^n, y^n) := Σ_{j,k} ω_{jk} g_{jk}^n(x^n, y^n)

Similarly, obtain the 2^N − 1 = 7 factor scores

g^n(x^n), g^n(y^n), g^n(z^n), g^n(x^n, y^n), g^n(x^n, z^n), g^n(y^n, z^n), g^n(x^n, y^n, z^n)

and the M(N) = 11 structure scores

p_1 g^n(x^n) g^n(y^n) g^n(z^n), p_2 g^n(x^n) g^n(y^n, z^n), ...,
p_10 g^n(x^n) g^n(y^n) g^n(x^n, y^n, z^n) / g^n(x^n, y^n), p_11 g^n(x^n, y^n, z^n)
Practical BN Learning with Discrete and Continuous Variables
Factor and Structure Scores

STEP 1: compute the 2^N − 1 factor scores. For structure (10),

−(1/n) { log p_10 + log g^n(x^n) + log g^n(y^n) + log g^n(x^n, y^n, z^n) − log g^n(x^n, y^n) }

                   X      Y      (X,Y)  Z      (X,Z)  (Y,Z)  (X,Y,Z)
−(1/n) log g^n(·)  1.617  1.533  3.249  1.647  3.318  3.290  4.943

STEP 2: compute the M(N) structure scores and find the best (smallest). For N = 3:

(1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)    (9)    (10)   (11)
4.799  4.908  4.852  4.897  4.950  5.006  4.962  4.833  4.890  4.845  4.943
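As a sanity check, the per-factor values above reproduce the eleven structure scores up to the small −(1/n) log p_i prior terms. A minimal sketch (the dictionary keys and the factorization table are my own encoding of the eleven factorizations):

```python
# Per-factor scores -(1/n) log g^n(.) from the table above (prior terms omitted).
s = {"X": 1.617, "Y": 1.533, "XY": 3.249, "Z": 1.647,
     "XZ": 3.318, "YZ": 3.290, "XYZ": 4.943}

# Each factorization of P(X, Y, Z) as (numerator factors, denominator factors).
structures = {
    1:  (["X", "Y", "Z"], []),
    2:  (["X", "YZ"], []),
    3:  (["Y", "XZ"], []),
    4:  (["Z", "XY"], []),
    5:  (["XY", "XZ"], ["X"]),
    6:  (["XY", "YZ"], ["Y"]),
    7:  (["XZ", "YZ"], ["Z"]),
    8:  (["Y", "Z", "XYZ"], ["YZ"]),
    9:  (["Z", "X", "XYZ"], ["XZ"]),
    10: (["X", "Y", "XYZ"], ["XY"]),
    11: (["XYZ"], []),
}

# Log scores add for numerator factors and subtract for denominator factors.
scores = {i: sum(s[f] for f in num) - sum(s[f] for f in den)
          for i, (num, den) in structures.items()}
best = min(scores, key=scores.get)  # smallest score = best structure
```

Without the prior terms, structure (1) scores 4.797 (vs. 4.799 in the table) and remains the winner.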
Practical BN Learning with Discrete and Continuous Variables
Computing Factor Scores

Input x^n ∈ A^n; output g^n(x^n) (computed in the log domain).
1. For each k = 1, ..., K: g_k^n(x^n) := 0
2. For each k = 1, ..., K and each a ∈ A_k: c_k(a) := 0
3. For each i = 1, ..., n and each k = 1, ..., K:
   1. find a_i ∈ A_k from x_i ∈ A
   2. g_k^n(x^n) := g_k^n(x^n) + log { (c_k(a_i) + 1/2) / (i − 1 + |A_k|/2) } − log λ(a_i)
   3. c_k(a_i) := c_k(a_i) + 1
4. g^n(x^n) := (1/K) Σ_{k=1}^K g_k^n(x^n)

Q_k^n(x^n) = ∏_{i=1}^n (c(a_i^(k)) + 1/2) / (i − 1 + |A_k|/2)
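A direct transcription of steps 1-4, assuming data in [0, 1) and dyadic levels A_k with 2^k equal cells (one concrete choice of partitions); `factor_score` is a hypothetical name:

```python
from math import log

def factor_score(xn, K):
    """Per-level log g_k^n and their average g^n, following STEP 1-4 above.
    Levels A_k = 2^k equal-width cells of [0,1); counts scored with the
    (c + 1/2) / (i - 1 + |A_k|/2) sequential rule."""
    logg = [0.0] * (K + 1)                   # logg[k] accumulates log g_k^n, k = 1..K
    counts = [dict() for _ in range(K + 1)]  # counts[k][a] = c_k(a)
    for i, x in enumerate(xn, start=1):      # i = 1, ..., n
        for k in range(1, K + 1):
            m = 2 ** k                       # |A_k|
            a = min(int(x * m), m - 1)       # find a_i in A_k from x_i
            c = counts[k].get(a, 0)
            # + log of the sequential count ratio, - log lambda(a_i) = - log(1/m)
            logg[k] += log((c + 0.5) / (i - 1 + m / 2)) - log(1.0 / m)
            counts[k][a] = c + 1
    return logg[1:], sum(logg[1:]) / K
```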
Practical BN Learning with Discrete and Continuous Variables
Computation: max{ O(n 2^N K), M(N) } = O(M(N))

Factor scores: O(n 2^N K)
- proportional to n and 2^N
- a_i^(1) ↦ a_i^(2) ↦ ··· ↦ a_i^(K): binary search
- proportional to K
- g^n(x^n, y^n) can be obtained as Σ_{k=1}^K ω_k g_{k,k}^n(x^n, y^n) rather than Σ_{j=1}^J Σ_{k=1}^K ω_{jk} g_{jk}^n(x^n, y^n)

Structure scores: O(M(N))
- compute the M(N) structure scores and find the best
Practical BN Learning with Discrete and Continuous Variables
Experiment 1

1. X, Y ∈ {0, 1} (X ⊥⊥ Y), equiprobable; U ~ N(x + y, 1), V ~ N(x − y, 1)
2. X, Y ~ N(0, 1) (X ⊥⊥ Y); U, V ∈ {0, 1} s.t.
   P(U = 1 | X + Y = z) = P(V = 1 | X − Y = z) = { 0 if z ≤ −1; (z + 1)/2 if −1 ≤ z ≤ 1; 1 if z ≥ 1 }
Practical BN Learning with Discrete and Continuous Variables
Experiment 2

(1) X, Y, Z ~ N(0, 1).
(2)(3)(4) X, U ~ N(0, 1), Y = ρX + √(1 − ρ²) U, Z ~ N(0, 1).
(5)(6)(7) X, U, V ~ N(0, 1), Y = aX + √(1 − a²) U, Z = bY + √(1 − b²) V.
(8)(9)(10) X, U, V ~ N(0, 1), Y = aX + √(1 − a²) U, Z = bX + √(1 − b²) V.
(11) X, U, V ~ N(0, 1), Y = aX + √(1 − a²) U, Z = bX + cY + √(1 − b² − c²) V.
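A sketch of a generator for case (11); the parameter values a, b, c below are arbitrary illustrations, not values from the talk:

```python
import random

def sample_case_11(n, a=0.5, b=0.4, c=0.3, seed=0):
    """Draw n i.i.d. triples from case (11):
    X, U, V ~ N(0,1), Y = aX + sqrt(1-a^2) U, Z = bX + cY + sqrt(1-b^2-c^2) V.
    Requires a^2 <= 1 and b^2 + c^2 <= 1 so the residual variances are valid."""
    rng = random.Random(seed)  # seeded for reproducibility
    data = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        u = rng.gauss(0, 1)
        v = rng.gauss(0, 1)
        y = a * x + (1 - a * a) ** 0.5 * u
        z = b * x + c * y + (1 - b * b - c * c) ** 0.5 * v
        data.append((x, y, z))
    return data
```

The other cases follow the same pattern with fewer terms; the marginal of Y stays N(0, 1) by construction.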
Practical BN Learning with Discrete and Continuous Variables
Experiment 3

For each R data set, evaluate the execution time:

data.frame       N   data.type                 n     time (sec)   time (sec)/2^N
faithful         2   c,d                       272   6.08         3.04
quakes           5   c,c,d,d,c                 1000  60.77        1.90
attitude         7   d,d,d,d,d,d,d             30    27.66        0.216
longley          7   c,c,c,c,c,c,d             16    44.63        0.349
USJudgeRatings   12  c,c,c,c,c,c,c,c,c,c,c,c   43    1946.63      1.90

As N grows, so does the computation; the growth is the same for discrete and continuous variables.
Conclusion

Established BN structure learning without assuming that the variables are either all discrete or all continuous.

- Theoretical analysis w.r.t. n, N, K (K: quantization depth)
- Realistic computation using R

Insight:
- the computation is proportional to K
- O(M(N)) dominates O(nK 2^N) when n and K are constant

Future works:
- the optimal K w.r.t. n, N
- the memory requirement, exponential in K
- R package publication