2013 IEEE International Symposium on Information Theory


July 7-12, 2013, Istanbul, Turkey.



Transcript

  • 1. Universal Bayesian Measures. Joe Suzuki, Osaka University. IEEE International Symposium on Information Theory, Istanbul, Turkey, July 8, 2013.
    Outline: Problem; Density Functions; Generalized Density Functions; The Bayesian Solution; Summary.
  • 2. Problem: given n examples, identify whether X and Y are independent.
    $(x_1, y_1), \dots, (x_n, y_n) \sim (X, Y) \in \{0,1\} \times \{0,1\}$
    p: the prior probability that X and Y are independent.
    The Bayesian answer: take a weight W over θ and compute
    $Q^n(x^n) := \int P(x^n \mid \theta)\,dW(\theta)$, $Q^n(y^n) := \int P(y^n \mid \theta)\,dW(\theta)$, $Q^n(x^n, y^n) := \int P(x^n, y^n \mid \theta)\,dW(\theta)$.
    Decide that X and Y are independent if and only if $p\,Q^n(x^n)\,Q^n(y^n) \ge (1-p)\,Q^n(x^n, y^n)$.
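The decision rule on this slide can be sketched in Python. The slide does not fix the weight W; this sketch assumes a symmetric Dirichlet(1/2) weight (the Krichevsky-Trofimov choice for binary data), and the function names `log_dirichlet_mixture` and `independent` are illustrative, not from the talk.

```python
import math
from collections import Counter

def log_dirichlet_mixture(symbols, alphabet_size, alpha=0.5):
    """log Q^n of a sequence under a symmetric Dirichlet(alpha) weight W.

    Closed form: Gamma(m*alpha) / Gamma(n + m*alpha)
                 * prod_s Gamma(c_s + alpha) / Gamma(alpha),
    where m is the alphabet size and c_s is the count of symbol s.
    """
    counts = Counter(symbols)
    n = len(symbols)
    m = alphabet_size
    log_q = math.lgamma(m * alpha) - math.lgamma(n + m * alpha)
    for c in counts.values():  # unseen symbols contribute a factor of 1
        log_q += math.lgamma(c + alpha) - math.lgamma(alpha)
    return log_q

def independent(xs, ys, p=0.5):
    """True when p Q^n(x^n) Q^n(y^n) >= (1 - p) Q^n(x^n, y^n)."""
    log_qx = log_dirichlet_mixture(xs, 2)
    log_qy = log_dirichlet_mixture(ys, 2)
    log_qxy = log_dirichlet_mixture(list(zip(xs, ys)), 4)  # pairs as 4 symbols
    return math.log(p) + log_qx + log_qy >= math.log(1 - p) + log_qxy
```

Working with logarithms avoids underflow, since each $Q^n$ shrinks exponentially in n.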
  • 3. Problem: what if X and Y are arbitrary random variables?
    $(\Omega, \mathcal{F}, P)$: probability space. $\mathcal{B}$: the Borel sets of $\mathbb{R}$.
    X is a random variable if $X : \Omega \to \mathbb{R}$ is $\mathcal{F}$-measurable, that is, $D \in \mathcal{B} \implies \{\omega \in \Omega \mid X(\omega) \in D\} \in \mathcal{F}$.
    X and Y may each be discrete, continuous, or neither.
  • 4. Which $Q^n$ qualifies as an alternative to $P^n$?
    The true parameter $\theta = \theta^*$ is not available, so the following cannot be computed:
    $P^n(x^n) = P(x^n \mid \theta^*)$, $P^n(y^n) = P(y^n \mid \theta^*)$, $P^n(x^n, y^n) = P(x^n, y^n \mid \theta^*)$.
    The candidates are the mixtures
    $Q^n(x^n) := \int P(x^n \mid \theta)\,dW(\theta)$, $Q^n(y^n) := \int P(y^n \mid \theta)\,dW(\theta)$, $Q^n(x^n, y^n) := \int P(x^n, y^n \mid \theta)\,dW(\theta)$.
  • 5. Example: Bayes codes.
    c: the number of ones in $x^n$. θ: the probability of a one.
    $P(x^n \mid \theta) = \theta^c (1-\theta)^{n-c}$
    For $a, b > 0$, take the weight density $w(\theta) \propto \dfrac{1}{\theta^{1-a}(1-\theta)^{1-b}}$ (a Beta(a, b) density, which is what the product formula below requires).
    For each $x^n = (x_1, \dots, x_n) \in \{0,1\}^n$,
    $$Q^n(x^n) := \int w(\theta)\,P(x^n \mid \theta)\,d\theta = \frac{\prod_{j=0}^{c-1}(j+a) \cdot \prod_{k=0}^{n-c-1}(k+b)}{\prod_{i=0}^{n-1}(i+a+b)}$$
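As a quick check of the product formula, the following Python sketch evaluates $Q^n(x^n)$ directly and confirms that it sums to one over all binary sequences of a given length. The helper names `bayes_measure` and `total_mass` are illustrative.

```python
from itertools import product

def bayes_measure(xn, a=0.5, b=0.5):
    """Q^n(x^n) for a binary sequence via the product formula on the slide."""
    n = len(xn)
    c = sum(xn)  # number of ones
    numerator = 1.0
    for j in range(c):
        numerator *= j + a
    for k in range(n - c):
        numerator *= k + b
    denominator = 1.0
    for i in range(n):
        denominator *= i + a + b
    return numerator / denominator

def total_mass(n, a=0.5, b=0.5):
    """Sum of Q^n over all 2^n binary sequences of length n."""
    return sum(bayes_measure(seq, a, b) for seq in product((0, 1), repeat=n))
```

For example, with a = b = 1/2 the sequence (1, 0, 1) gets (0.5 · 1.5 · 0.5) / (1 · 2 · 3) = 0.0625. For long sequences the products underflow, so a log-domain version would be preferable in practice.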
  • 6. Universal coding/measures.
    Choose a = b = 1/2 (Krichevsky-Trofimov) and let $x^n$ be emitted i.i.d. by
    $P^n(x^n \mid \theta) = \prod_{i=1}^{n} P(x_i)$, where $P(x_i) = \theta$ if $x_i = 1$ and $P(x_i) = 1 - \theta$ if $x_i = 0$.
    Then, for any P, almost surely,
    $$-\frac{1}{n} \log Q^n(x^n) \to H := \sum_{x \in A} -P(x) \log P(x)$$
    From the Shannon-McMillan-Breiman theorem, for any P,
    $$-\frac{1}{n} \log P^n(x^n \mid \theta) = \frac{1}{n} \sum_{i=1}^{n} -\log P(x_i) \to E[-\log P(x_i)] = H$$
  • 7. Why can $P^n$ be replaced by $Q^n$ when n is large?
    For any P, almost surely,
    $$\frac{1}{n} \log \frac{P^n(x^n)}{Q^n(x^n)} \to 0 \qquad (1)$$
    $Q^n$: a universal Bayesian measure for A.
    What do $Q^n$ and (1) become in the general setting?
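The two convergence statements can be observed numerically. The sketch below simulates an i.i.d. binary source and returns both $-\frac{1}{n}\log Q^n(x^n)$ and $\frac{1}{n}\log\frac{P^n(x^n)}{Q^n(x^n)}$ for the Krichevsky-Trofimov measure. The source parameter, sample size, seed, and function names are illustrative assumptions; logarithms are natural, so H is in nats.

```python
import math
import random

def log_kt(xn):
    """log Q^n(x^n) for the KT measure (a = b = 1/2), computed sequentially."""
    counts = [0, 0]
    log_q = 0.0
    for t, x in enumerate(xn):
        # Predictive probability of the next symbol given the t past symbols.
        log_q += math.log((counts[x] + 0.5) / (t + 1.0))
        counts[x] += 1
    return log_q

def log_true(xn, theta):
    """log P^n(x^n | theta) for an i.i.d. source with P(1) = theta."""
    c = sum(xn)
    return c * math.log(theta) + (len(xn) - c) * math.log(1.0 - theta)

def entropy(theta):
    """Entropy H of the source, in nats."""
    return -theta * math.log(theta) - (1.0 - theta) * math.log(1.0 - theta)

def simulate(n, theta, seed=0):
    """Return (-1/n log Q^n, 1/n log(P^n / Q^n)) for one simulated sequence."""
    rng = random.Random(seed)
    xn = [1 if rng.random() < theta else 0 for _ in range(n)]
    lq = log_kt(xn)
    return -lq / n, (log_true(xn, theta) - lq) / n
```

Calling `simulate(20000, 0.3)` and comparing the first value with `entropy(0.3)` should show a small gap, while the second value should be close to zero, in line with (1).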
  • 8. Suppose a density function exists for X.
    A: the range of X. $A_0 := \{A\}$, and $A_{j+1}$ is a refinement of $A_j$.
    Example 1: if A = [0, 1), the sequence can be
    $A_0 = \{[0, 1)\}$
    $A_1 = \{[0, 1/2), [1/2, 1)\}$
    $A_2 = \{[0, 1/4), [1/4, 1/2), [1/2, 3/4), [3/4, 1)\}$
    ...
    $A_j = \{[0, 2^{-j}), [2^{-j}, 2 \cdot 2^{-j}), \dots, [(2^{j} - 1)2^{-j}, 1)\}$
    ...
    $s_j : A \to A_j$ (quantization: $x \in a \in A_j \implies s_j(x) = a$).
    $\lambda : \mathcal{B} \to \mathbb{R}$ (Lebesgue measure: $a = [b, c) \implies \lambda(a) = c - b$).
    $Q^n_j$: a universal Bayesian measure for $A_j$.
  • 9. If $(s_j(x_1), \dots, s_j(x_n)) = (a_1, \dots, a_n)$, define
    $$g^n_j(x^n) := \frac{Q^n_j(a_1, \dots, a_n)}{\lambda(a_1) \cdots \lambda(a_n)}$$
    $$f^n_j(x^n) := f_j(x_1) \cdots f_j(x_n) = \frac{P_j(a_1) \cdots P_j(a_n)}{\lambda(a_1) \cdots \lambda(a_n)}$$
    For weights $\{\omega_j\}_{j=1}^{\infty}$ with $\sum_j \omega_j = 1$ and $\omega_j > 0$, define
    $$g^n(x^n) := \sum_{j=1}^{\infty} \omega_j\, g^n_j(x^n)$$
    For any f and $\{A_j\}$ such that $h(f_j) \to h(f)$ as $j \to \infty$, almost surely
    $$\frac{1}{n} \log \frac{f^n(x^n)}{g^n(x^n)} \to 0 \qquad (2)$$
    B. Ryabko. IEEE Trans. on Inform. Theory, 55(9), 2009.
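A minimal Python sketch of the estimator $g^n$ for A = [0, 1) with the dyadic partitions of Example 1 follows. Several choices here are assumptions that the slides leave open: $Q^n_j$ is taken to be the Dirichlet(1/2) mixture over the $2^j$ cells, the weights are $\omega_j \propto 1/(j(j+1))$, and the infinite sum is truncated at `max_level` with the weights renormalized.

```python
import math

def log_q_level(xn, j):
    """log Q_j^n(a_1, ..., a_n): Dirichlet(1/2) mixture over 2^j dyadic cells."""
    m = 2 ** j
    counts = {}
    log_q = 0.0
    for t, x in enumerate(xn):
        cell = int(x * m)  # s_j(x): index of the dyadic cell containing x
        c = counts.get(cell, 0)
        log_q += math.log((c + 0.5) / (t + 0.5 * m))
        counts[cell] = c + 1
    return log_q

def log_g(xn, max_level=8):
    """log g^n(x^n), with the sum over j truncated at max_level (assumption)."""
    n = len(xn)
    raw = [1.0 / (j * (j + 1)) for j in range(1, max_level + 1)]
    total = sum(raw)
    terms = []
    for j, w in enumerate(raw, start=1):
        # Each cell has Lebesgue measure 2^-j, so dividing by
        # lambda(a_1) ... lambda(a_n) adds n * j * log 2 in the log domain.
        terms.append(math.log(w / total) + log_q_level(xn, j)
                     + n * j * math.log(2.0))
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))
```

For samples from the uniform density on [0, 1), $f^n(x^n) = 1$, so `-log_g(xn) / len(xn)` is the quantity in (2) and should be close to zero for large n.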
  • 10. Our goal: what do (1) and (2) generalize to?
    1. If the random variable takes finitely many values:
    $$\frac{1}{n} \log \frac{P^n(x^n)}{Q^n(x^n)} \to 0 \qquad (1)$$
    for any $P^n$.
    2. If a density function exists:
    $$\frac{1}{n} \log \frac{f^n(x^n)}{g^n(x^n)} \to 0 \qquad (2)$$
    for any $f^n$ and $\{A_j\}$ satisfying $h(f_j) \to h(f)$ as $j \to \infty$.
  • 11. Exactly when does a density function exist?
    $\mathcal{B}$: the Borel sets of $\mathbb{R}$. $\mu(D)$: the probability of $D \in \mathcal{B}$.
    The following are equivalent (written $\mu \ll \lambda$):
    for each $D \in \mathcal{B}$, $\lambda(D) = 0 \implies \mu(D) = 0$;
    there exists a $\mathcal{B}$-measurable $f := \dfrac{d\mu}{d\lambda}$ such that $\mu(D) = \int_D f(t)\,d\lambda(t)$.
    f is the density function (w.r.t. λ).
  • 12. Density functions in a general sense.
    Radon-Nikodym theorem. The following are equivalent (written $\mu \ll \eta$):
    for each $D \in \mathcal{B}$, $\eta(D) = 0 \implies \mu(D) = 0$;
    there exists a $\mathcal{B}$-measurable $f_\eta := \dfrac{d\mu}{d\eta}$ such that $\mu(D) = \int_D f_\eta(t)\,d\eta(t)$.
    $f_\eta$ is the density function w.r.t. η.
    Example 2: $\mu(\{h\}) > 0$ and $\eta(\{h\}) := \dfrac{1}{h(h+1)}$ for $h \in B := \{1, 2, \dots\}$. Then $\mu \ll \eta$,
    $$\mu(D) = \sum_{h \in D \cap B} f_\eta(h)\,\eta(\{h\}), \qquad \frac{d\mu}{d\eta}(h) = f_\eta(h) = \frac{\mu(\{h\})}{\eta(\{h\})} = h(h+1)\,\mu(\{h\})$$
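Example 2 can be checked with exact rational arithmetic. In the sketch below, the choice $\mu(\{h\}) = 2^{-h}$ is purely illustrative and does not come from the slides; η is the measure defined in the example.

```python
from fractions import Fraction

def eta(h):
    """eta({h}) = 1 / (h (h + 1)) for h = 1, 2, ..."""
    return Fraction(1, h * (h + 1))

def mu(h):
    """Illustrative choice (not from the slides): mu({h}) = 2^-h."""
    return Fraction(1, 2 ** h)

def density(h):
    """f_eta(h) = mu({h}) / eta({h}) = h (h + 1) mu({h})."""
    return mu(h) / eta(h)

def mu_from_density(D):
    """Recover mu(D) as the sum over h in D of f_eta(h) eta({h})."""
    return sum(density(h) * eta(h) for h in D)
```

Since $1/(h(h+1)) = 1/h - 1/(h+1)$, the partial sums of η telescope to $N/(N+1)$, so η has total mass 1 on B. With the illustrative μ, `density(3)` is 12 · 1/8 = 3/2, which shows that a density with respect to η may exceed 1.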
  • 13. Partitions of B = {1, 2, ...}:
    $B_1 := \{\{1\}, \{2, 3, \dots\}\}$
    $B_2 := \{\{1\}, \{2\}, \{3, 4, \dots\}\}$
    ...
    $B_k := \{\{1\}, \{2\}, \dots, \{k\}, \{k+1, k+2, \dots\}\}$
    ...
    $t_k : B \to B_k$ (quantization: $y \in b \in B_k \implies t_k(y) = b$).
    If $(t_k(y_1), \dots, t_k(y_n)) = (b_1, \dots, b_n)$, define
    $$g^n_{\eta,k}(y^n) := \frac{Q^n_k(b_1, \dots, b_n)}{\eta(b_1) \cdots \eta(b_n)}, \qquad g^n_\eta(y^n) := \sum_{k=1}^{\infty} \omega_k\, g^n_{\eta,k}(y^n)$$
    For any $f_\eta$ and $\{B_k\}$ such that $h(f_{\eta,k}) \to h(f_\eta)$, almost surely
    $$\frac{1}{n} \log \frac{f^n_\eta(y^n)}{g^n_\eta(y^n)} \to 0 \qquad (3)$$
    $g^n_\eta(y^n) \prod_{i=1}^{n} \eta(\{y_i\})$ estimates $P(y^n) = f^n_\eta(y^n) \prod_{i=1}^{n} \eta(\{y_i\})$.
  • 14. In the general case:
    $$\mu^n(D^n) := \int_{D^n} f^n_\eta(y^n)\,d\eta^n(y^n), \qquad \nu^n(D^n) := \int_{D^n} g^n_\eta(y^n)\,d\eta^n(y^n)$$
    $$\frac{f^n_\eta(y^n)}{g^n_\eta(y^n)} = \frac{d\mu^n}{d\eta^n}(y^n) \Big/ \frac{d\nu^n}{d\eta^n}(y^n) = \frac{d\mu^n}{d\nu^n}(y^n)$$
    $$D(\mu \| \nu) := \int d\mu \log \frac{d\mu}{d\nu}$$
    $$h(f_\eta) := \int -f_\eta(y) \log f_\eta(y)\,d\eta(y) = -\int \frac{d\mu}{d\eta}(y) \log \frac{d\mu}{d\eta}(y)\,d\eta(y) = -D(\mu \| \eta)$$
  • 15. Main Theorem.
    Theorem. With probability one, as $n \to \infty$,
    $$\frac{1}{n} \log \frac{d\mu^n}{d\nu^n}(y^n) \to 0$$
    for any stationary ergodic $\mu^n$ and $\{B_k\}$ such that $D(\mu_k \| \eta) \to D(\mu \| \eta)$ as $k \to \infty$.
  • 16. Joint density functions.
    Example 3: A × B (based on Examples 1 and 2), with $\mu \ll \lambda\eta$.
    $A_0 \times B_0 = \{A\} \times \{B\} = \{[0, 1)\} \times \{\{1, 2, \dots\}\}$
    $A_1 \times B_1$
    $A_2 \times B_2$
    ...
    $A_j \times B_k$
    ...
    $(s_j, t_k) : A \times B \to A_j \times B_k$
    If $\{A_j \times B_k\}$ satisfies $f_{\lambda\eta,jk} \to f_{\lambda\eta}$, then for any $f_{\lambda\eta}$ we can construct $g^n_{\lambda\eta}$ such that, almost surely,
    $$\frac{1}{n} \log \frac{f^n_{\lambda\eta}(x^n, y^n)}{g^n_{\lambda\eta}(x^n, y^n)} \to 0 \qquad (4)$$
  • 17. The answer to the problem.
    Estimate $f^n_X(x^n)$, $f^n_Y(y^n)$, $f^n_{XY}(x^n, y^n)$ by $g^n_X(x^n)$, $g^n_Y(y^n)$, $g^n_{XY}(x^n, y^n)$.
    The Bayesian answer: decide that X and Y are independent if and only if $p\,g^n_X(x^n)\,g^n_Y(y^n) \ge (1-p)\,g^n_{XY}(x^n, y^n)$, the same rule as in the binary case with $Q^n$ replaced by $g^n$.
  • 18. The general Bayesian solution.
    Given n examples $z^n$ and a prior $\{p_m\}$ over models m = 1, 2, ...:
    compute $g^n(z^n \mid m)$ for each m = 1, 2, ...;
    find the model m maximizing $p_m\, g^n(z^n \mid m)$.
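As one concrete instance of this recipe, the sketch below selects the order of a binary Markov chain. It is an illustration under stated assumptions, not the construction from the talk: model m is "Markov of order m", the first m symbols are each assigned probability 1/2, every context uses its own KT estimator, and the names `log_evidence` and `best_model` are hypothetical.

```python
import math

def log_evidence(zn, order):
    """log g^n(z^n | m) for a binary Markov model of the given order.

    Assumptions of this sketch: the first `order` symbols get probability
    1/2 each, and each context has its own KT (a = b = 1/2) estimator.
    """
    counts = {}
    log_g = min(order, len(zn)) * math.log(0.5)
    for t in range(order, len(zn)):
        context = tuple(zn[t - order:t])
        c = counts.setdefault(context, [0, 0])
        log_g += math.log((c[zn[t]] + 0.5) / (c[0] + c[1] + 1.0))
        c[zn[t]] += 1
    return log_g

def best_model(zn, log_priors):
    """Return the m maximizing p_m g^n(z^n | m); log_priors maps m to log p_m."""
    return max(log_priors, key=lambda m: log_priors[m] + log_evidence(zn, m))
```

With a uniform prior over orders 0, 1, 2, an alternating sequence such as `[0, 1] * 100` is best explained by order 1, while a constant sequence is best explained by order 0.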
  • 19. Summary and discussion.
    Bayesian measure: a generalization that does not assume the data are discrete or continuous; universality of Bayes/MDL in the generalized sense.
    Many applications: Bayesian network structure estimation (DCC 2012); the Bayesian Chow-Liu algorithm (PGM 2012); Markov order estimation even when $\{X_i\}$ is continuous.