
# 2013 IEEE International Symposium on Information Theory

July 7-12, 2013, Istanbul, Turkey.


1. Universal Bayesian Measures. Joe Suzuki, Osaka University. IEEE International Symposium on Information Theory, Istanbul, Turkey, July 8, 2013. Outline: Problem, Density Functions, Generalized Density Functions, The Bayesian Solution, Summary.
2. Problem. Given $n$ examples $(x_1, y_1), \cdots, (x_n, y_n) \sim (X, Y) \in \{0,1\} \times \{0,1\}$, identify whether $X, Y$ are independent or not. $p$: a prior probability that $X, Y$ are independent.
   The Bayesian answer: consider a weight $W$ over $\theta$ to compute
   $$Q^n(x^n) := \int P(x^n|\theta)\,dW(\theta), \quad Q^n(y^n) := \int P(y^n|\theta)\,dW(\theta), \quad Q^n(x^n, y^n) := \int P(x^n, y^n|\theta)\,dW(\theta)$$
   $$p\,Q^n(x^n)\,Q^n(y^n) \geq (1-p)\,Q^n(x^n, y^n) \iff X, Y \text{ are independent}$$
3. Problem: what if $X, Y$ are arbitrary random variables? $(\Omega, \mathcal{F}, P)$: probability space. $\mathcal{B}$: the Borel sets of $\mathbb{R}$.
   $X$ is a random variable: $X : \Omega \to \mathbb{R}$ is $\mathcal{F}$-measurable ($D \in \mathcal{B} \implies \{\omega \in \Omega \mid X(\omega) \in D\} \in \mathcal{F}$).
   $X, Y$ may each be discrete, continuous, or neither.
4. What $Q^n$ is qualified to be an alternative to $P^n$? The true $\theta = \theta^*$ is not available, so the following are unknown:
   $$P^n(x^n) = P(x^n|\theta^*), \quad P^n(y^n) = P(y^n|\theta^*), \quad P^n(x^n, y^n) = P(x^n, y^n|\theta^*)$$
   The candidates are
   $$Q^n(x^n) := \int P(x^n|\theta)\,dW(\theta), \quad Q^n(y^n) := \int P(y^n|\theta)\,dW(\theta), \quad Q^n(x^n, y^n) := \int P(x^n, y^n|\theta)\,dW(\theta)$$
5. Example: Bayes codes. $c$: the number of ones in $x^n$. $\theta$: the probability of a one. $P(x^n|\theta) = \theta^c (1-\theta)^{n-c}$. For $a, b > 0$, take the weight $w(\theta) \propto \theta^{a-1}(1-\theta)^{b-1}$, which for $a = b = 1/2$ is $1/\sqrt{\theta(1-\theta)}$.
   For each $x^n = (x_1, \cdots, x_n) \in \{0,1\}^n$,
   $$Q^n(x^n) := \int w(\theta)\,P(x^n|\theta)\,d\theta = \frac{\prod_{j=0}^{c-1}(j+a) \cdot \prod_{k=0}^{n-c-1}(k+b)}{\prod_{i=0}^{n-1}(i+a+b)}$$
6. Universal coding/measures. If we choose $a = b = 1/2$ (Krichevsky-Trofimov) and $x^n$ is emitted i.i.d. by
   $$P^n(x^n|\theta) = \prod_{i=1}^{n} P(x_i), \quad P(x_i) = \theta \text{ if } x_i = 1, \; 1-\theta \text{ if } x_i = 0,$$
   then, for any $P$, almost surely,
   $$-\frac{1}{n}\log Q^n(x^n) \to H := \sum_{x \in A} -P(x)\log P(x)$$
   From the Shannon-McMillan-Breiman theorem, for any $P$,
   $$-\frac{1}{n}\log P^n(x^n|\theta) = \frac{1}{n}\sum_{i=1}^{n} -\log P(x_i) \to E[-\log P(x_i)] = H$$
7. Why can $P^n$ be replaced by $Q^n$ if $n$ is large? For any $P$, almost surely,
   $$\frac{1}{n}\log\frac{P^n(x^n)}{Q^n(x^n)} \to 0 \qquad (1)$$
   $Q^n$: a universal Bayesian measure for $A$. What are $Q^n$ and (1) in the general settings?
8. Suppose a density function exists for $X$. $A$: the range of $X$. $A_0 := \{A\}$, and $A_{j+1}$ is a refinement of $A_j$.
   Example 1: if $A = [0, 1)$, the sequence can be
   $$A_0 = \{[0, 1)\}, \quad A_1 = \{[0, 1/2), [1/2, 1)\}, \quad A_2 = \{[0, 1/4), [1/4, 1/2), [1/2, 3/4), [3/4, 1)\}, \; \ldots$$
   $$A_j = \{[0, 2^{-j}), [2^{-j}, 2 \cdot 2^{-j}), \cdots, [(2^{j}-1)2^{-j}, 1)\}, \; \ldots$$
   $s_j : A \to A_j$ (quantization, $x \in a \in A_j \implies s_j(x) = a$).
   $\lambda : \mathcal{B} \to \mathbb{R}$ (Lebesgue measure, $a = [b, c) \implies \lambda(a) = c - b$).
   $Q_j^n$: a universal Bayesian measure for $A_j$.
9. If $(s_j(x_1), \cdots, s_j(x_n)) = (a_1, \cdots, a_n)$,
   $$g_j^n(x^n) := \frac{Q_j^n(a_1, \cdots, a_n)}{\lambda(a_1) \cdots \lambda(a_n)}, \qquad f_j^n(x^n) := f_j(x_1) \cdots f_j(x_n) = \frac{P_j(a_1) \cdots P_j(a_n)}{\lambda(a_1) \cdots \lambda(a_n)}$$
   For $\{\omega_j\}_{j=1}^{\infty}$ with $\sum_j \omega_j = 1$, $\omega_j > 0$,
   $$g^n(x^n) := \sum_{j=1}^{\infty} \omega_j\, g_j^n(x^n)$$
   For any $f$ and $\{A_j\}$ s.t. $h(f_j) \to h(f)$ as $j \to \infty$, almost surely
   $$\frac{1}{n}\log\frac{f^n(x^n)}{g^n(x^n)} \to 0 \qquad (2)$$
   B. Ryabko. IEEE Trans. on Inform. Theory, 55, 9, 2009.
10. Our goal: what are they generalized into?
    1. If the random variable takes finitely many values: $\frac{1}{n}\log\frac{P^n(x^n)}{Q^n(x^n)} \to 0$ (1) for any $P^n$.
    2. If a density function exists: $\frac{1}{n}\log\frac{f^n(x^n)}{g^n(x^n)} \to 0$ (2) for any $f^n$ and $\{A_j\}$ satisfying $h(f_j) \to h(f)$ as $j \to \infty$.
11. Exactly when does a density function exist? $\mathcal{B}$: the Borel sets of $\mathbb{R}$. $\mu(D)$: the probability of $D \in \mathcal{B}$.
    When a density function exists, the following are equivalent ($\mu \ll \lambda$):
    for each $D \in \mathcal{B}$, $\lambda(D) = 0 \implies \mu(D) = 0$;
    $\exists$ $\mathcal{B}$-measurable $\frac{d\mu}{d\lambda} := f$ s.t. $\mu(D) = \int_D f(t)\,d\lambda(t)$.
    $f$ is the density function (w.r.t. $\lambda$).
12. Density functions in a general sense. Radon-Nikodym theorem: the following are equivalent ($\mu \ll \eta$):
    for each $D \in \mathcal{B}$, $\eta(D) = 0 \implies \mu(D) = 0$;
    $\exists$ $\mathcal{B}$-measurable $\frac{d\mu}{d\eta} := f_\eta$ s.t. $\mu(D) = \int_D f_\eta(t)\,d\eta(t)$.
    $f_\eta$ is the density function w.r.t. $\eta$.
    Example 2: $\mu(\{h\}) > 0$, $\eta(\{h\}) := \frac{1}{h(h+1)}$, $h \in B := \{1, 2, \cdots\}$. Then $\mu \ll \eta$ and
    $$\mu(D) = \sum_{h \in D \cap B} f_\eta(h)\,\eta(\{h\}), \qquad \frac{d\mu}{d\eta}(h) = f_\eta(h) = \frac{\mu(\{h\})}{\eta(\{h\})} = h(h+1)\,\mu(\{h\})$$
13. $B_1 := \{\{1\}, \{2, 3, \cdots\}\}$, $B_2 := \{\{1\}, \{2\}, \{3, 4, \cdots\}\}$, $\ldots$, $B_k := \{\{1\}, \{2\}, \cdots, \{k\}, \{k+1, k+2, \cdots\}\}$, $\ldots$
    $t_k : B \to B_k$ (quantization, $y \in b \in B_k \implies t_k(y) = b$).
    If $(t_k(y_1), \cdots, t_k(y_n)) = (b_1, \cdots, b_n)$,
    $$g_{\eta,k}^n(y^n) := \frac{Q_k^n(b_1, \cdots, b_n)}{\eta(b_1) \cdots \eta(b_n)}, \qquad g_\eta^n(y^n) := \sum_{k=1}^{\infty} \omega_k\, g_{\eta,k}^n(y^n)$$
    For any $f_\eta$ and $\{B_k\}$ s.t. $h(f_{\eta,k}) \to h(f_\eta)$, almost surely
    $$\frac{1}{n}\log\frac{f_\eta^n(y^n)}{g_\eta^n(y^n)} \to 0 \qquad (3)$$
    $g_\eta^n(y^n)\prod_{i=1}^{n}\eta(\{y_i\})$ estimates $P(y^n) = f_\eta^n(y^n)\prod_{i=1}^{n}\eta(\{y_i\})$.
14. In the general case,
    $$\mu^n(D^n) := \int_{D^n} f_\eta^n(y^n)\,d\eta^n(y^n), \qquad \nu^n(D^n) := \int_{D^n} g_\eta^n(y^n)\,d\eta^n(y^n)$$
    $$\frac{f_\eta^n(y^n)}{g_\eta^n(y^n)} = \frac{d\mu^n}{d\eta^n}(y^n) \Big/ \frac{d\nu^n}{d\eta^n}(y^n) = \frac{d\mu^n}{d\nu^n}(y^n)$$
    $$D(\mu\|\nu) := \int d\mu \log\frac{d\mu}{d\nu}$$
    $$h(f_\eta) := \int -f_\eta(y)\log f_\eta(y)\,d\eta(y) = -\int \frac{d\mu}{d\eta}(y)\log\frac{d\mu}{d\eta}(y)\,d\eta(y) = -D(\mu\|\eta)$$
15. Main theorem. With probability one as $n \to \infty$,
    $$\frac{1}{n}\log\frac{d\mu^n}{d\nu^n}(y^n) \to 0$$
    for any stationary ergodic $\mu^n$ and $\{B_k\}$ such that $D(\mu_k\|\eta) \to D(\mu\|\eta)$ as $k \to \infty$.
16. Joint density functions. Example 3: $A \times B$ (based on Examples 1, 2), $\mu \ll \lambda\eta$.
    $A_0 \times B_0 = \{A\} \times \{B\} = \{[0, 1)\} \times \{\{1, 2, \cdots\}\}$, $A_1 \times B_1$, $A_2 \times B_2$, $\ldots$, $A_j \times B_k$, $\ldots$
    $(s_j, t_k) : A \times B \to A_j \times B_k$.
    If $\{A_j \times B_k\}$ satisfies $f_{\lambda\eta,jk} \to f_{\lambda\eta}$, we can construct $g_{\lambda\eta}^n$ s.t., for any $f_{\lambda\eta}$, almost surely,
    $$\frac{1}{n}\log\frac{f_{\lambda\eta}^n(x^n, y^n)}{g_{\lambda\eta}^n(x^n, y^n)} \to 0 \qquad (4)$$
17. The answer to the problem. Estimate $f_X^n(x^n)$, $f_Y^n(y^n)$, $f_{XY}^n(x^n, y^n)$ by $g_X^n(x^n)$, $g_Y^n(y^n)$, $g_{XY}^n(x^n, y^n)$.
    The Bayesian answer (with the same direction of inequality as in the finite case on slide 2):
    $$p\,g_X^n(x^n)\,g_Y^n(y^n) \geq (1-p)\,g_{XY}^n(x^n, y^n) \iff X, Y \text{ are independent}$$
18. The general Bayesian solution. Given $n$ examples $z^n$ and a prior $\{p_m\}$ over models $m = 1, 2, \cdots$:
    compute $g^n(z^n|m)$ for each $m = 1, 2, \cdots$;
    find the model $m$ maximizing $p_m\, g^n(z^n|m)$.
19. Summary and discussion.
    Bayesian measure: generalization without assuming discrete or continuous; universality of Bayes/MDL in the generalized sense.
    Many applications: Bayesian network structure estimation (DCC 2012); the Bayesian Chow-Liu algorithm (PGM 2012); Markov order estimation even when $\{X_i\}$ is continuous.
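
The Bayes code of slides 5 and 6 can be checked numerically. The following is a minimal sketch, not the author's code: it evaluates $Q^n(x^n)$ both sequentially and through the closed-form product, and compares $-\frac{1}{n}\log Q^n(x^n)$ with the entropy $H$. The function names, the seed, and the value $\theta = 0.3$ are illustrative assumptions; the prior is taken to be Beta($a$, $b$), which is the prior that produces the product formula.

```python
import math
import random


def log_bayes_code(xs, a=0.5, b=0.5):
    """Sequentially accumulate log Q^n(x^n) under a Beta(a, b) prior."""
    ones = zeros = 0
    logq = 0.0
    for x in xs:
        seen = ones + zeros
        if x == 1:
            logq += math.log((ones + a) / (seen + a + b))
            ones += 1
        else:
            logq += math.log((zeros + b) / (seen + a + b))
            zeros += 1
    return logq


def log_bayes_code_closed(xs, a=0.5, b=0.5):
    """Closed form: products over j < c, k < n - c and i < n."""
    n, c = len(xs), sum(xs)
    num = sum(math.log(j + a) for j in range(c))
    num += sum(math.log(k + b) for k in range(n - c))
    den = sum(math.log(i + a + b) for i in range(n))
    return num - den


def binary_entropy(theta):
    """Entropy in nats of a Bernoulli(theta) source, 0 < theta < 1."""
    return -theta * math.log(theta) - (1 - theta) * math.log(1 - theta)


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary fixed seed (assumption)
    theta = 0.3             # illustrative value, not from the slides
    for n in (100, 10_000, 100_000):
        xs = [1 if rng.random() < theta else 0 for _ in range(n)]
        # The per-symbol code length should approach H as n grows.
        print(n, -log_bayes_code(xs) / n, binary_entropy(theta))
```

The sequential form is the one usually used in practice, since it never forms the (tiny) probability itself and only adds logarithms.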
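
The independence test of slide 2 can be sketched as follows. Slide 2 leaves the weight $W$ unspecified; this sketch assumes a Dirichlet(1/2, ..., 1/2) weight (the Krichevsky-Trofimov choice of slide 6, extended to the four-symbol joint alphabet), and the names `log_kt` and `decide_independent` are hypothetical.

```python
import math
from collections import Counter


def log_kt(symbols, alphabet_size):
    """log Q^n for a Dirichlet(1/2,...,1/2) mixture over an alphabet of the given size."""
    counts = Counter()
    logq = 0.0
    for seen, s in enumerate(symbols):
        logq += math.log((counts[s] + 0.5) / (seen + alphabet_size / 2))
        counts[s] += 1
    return logq


def decide_independent(xs, ys, p=0.5):
    """True iff p Q(x^n) Q(y^n) >= (1 - p) Q(x^n, y^n), compared in the log domain."""
    lhs = math.log(p) + log_kt(xs, 2) + log_kt(ys, 2)
    rhs = math.log(1 - p) + log_kt(list(zip(xs, ys)), 4)
    return lhs >= rhs


if __name__ == "__main__":
    # Deterministic toy data (assumption): all four pairs equally frequent.
    xs = [0, 0, 1, 1] * 500
    ys = [0, 1, 0, 1] * 500
    print(decide_independent(xs, ys))  # balanced pairs
    print(decide_independent(xs, xs))  # y is a copy of x
```

Working with logarithms avoids underflow, since $Q^n$ decays exponentially in $n$.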
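
The density estimator $g^n$ of slides 8 and 9 can be illustrated on $A = [0, 1)$. This is a sketch under stated assumptions rather than the construction used in the talk: the infinite mixture is truncated to a finite number of levels, the weights $\omega_j$ are uniform over those levels, $A_j$ has $2^j$ equal cells, and each $Q_j^n$ is a Krichevsky-Trofimov mixture. The names `log_kt` and `log_g` are hypothetical.

```python
import math
from collections import Counter


def log_kt(symbols, alphabet_size):
    """log Q^n for a Dirichlet(1/2,...,1/2) mixture over an alphabet of the given size."""
    counts = Counter()
    logq = 0.0
    for seen, s in enumerate(symbols):
        logq += math.log((counts[s] + 0.5) / (seen + alphabet_size / 2))
        counts[s] += 1
    return logq


def log_g(xs, levels=8):
    """log g^n(x^n) for x_i in [0, 1), with the mixture truncated to `levels` levels."""
    n = len(xs)
    terms = []
    for j in range(1, levels + 1):
        cells = 2 ** j
        # s_j: map x to the index of its cell of width 2^{-j}.
        quantized = [min(int(x * cells), cells - 1) for x in xs]
        # Dividing by lambda(a_1)...lambda(a_n) = 2^{-jn} adds n*j*log 2.
        log_gj = log_kt(quantized, cells) + n * j * math.log(2)
        terms.append(math.log(1 / levels) + log_gj)
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


if __name__ == "__main__":
    n = 1024
    uniform = [(i + 0.5) / n for i in range(n)]     # true density f = 1
    half = [(i + 0.5) / (2 * n) for i in range(n)]  # f = 2 on [0, 1/2)
    print(log_g(uniform) / n)  # should be near log 1 = 0
    print(log_g(half) / n)     # should be near log 2
```

For these two inputs $\frac{1}{n}\log f^n(x^n)$ is $0$ and $\log 2$ respectively, so the printed values indicate how close $\frac{1}{n}\log\frac{f^n}{g^n}$ in (2) already is to zero at $n = 1024$.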
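
Example 2 on slide 12 can be verified with exact rational arithmetic. In this sketch the reference measure $\eta(\{h\}) = 1/(h(h+1))$ is from the slide, while the choice $\mu(\{h\}) = 2^{-h}$ is a hypothetical one made only for illustration; any strictly positive probability mass function on $\{1, 2, \cdots\}$ would do.

```python
from fractions import Fraction


def eta(h):
    """Reference measure from Example 2: eta({h}) = 1 / (h (h + 1))."""
    return Fraction(1, h * (h + 1))


def mu(h):
    """Hypothetical target measure mu({h}) = 2^{-h} (an assumption)."""
    return Fraction(1, 2 ** h)


def f_eta(h):
    """Density of mu w.r.t. eta: mu({h}) / eta({h}) = h (h + 1) mu({h})."""
    return mu(h) / eta(h)


def mu_of(D):
    """Recover mu(D) as the sum of f_eta(h) * eta({h}) over h in D."""
    return sum(f_eta(h) * eta(h) for h in D)


if __name__ == "__main__":
    # eta telescopes: sum_{h=1}^{N} 1/(h(h+1)) = N/(N+1), so eta is a probability measure.
    print(sum(eta(h) for h in range(1, 100)))
    print(f_eta(3), mu_of({1, 2, 3}))
```

Since $\eta(\{h\}) > 0$ for every $h$, no $\eta$-null set can carry $\mu$-mass, which is exactly the condition $\mu \ll \eta$.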