The Universal Measure for General Sources and its
Application to MDL/Bayesian Criteria
Joe Suzuki
Osaka University
March 30
Joe Suzuki (Osaka University) The Universal Measure for General Sources and its Application to MDL/Bayesian CriteriaMarch 30 1 / 18
Road Map
1. Universal Coding with Finite Alphabet
2. Universal Coding when the Density Function Exists
3. The Radon-Nikodym Theorem
4. A Generalized Universal Coding
5. A Generalized MDL Principle
6. Summary
Universal Coding with Finite Alphabet
$\{X_i\}_{i=1}^n \sim P^n$: Stationary Ergodic
$A := X_i(\Omega)$, $|A| < \infty$, $i = 1, \dots, n$

Universal Coding
There exists $Q^n$ s.t. for all $P^n$, with probability one,
$$\sum_{x^n \in A^n} Q^n(x^n) \le 1 \quad \text{(Kraft's inequality)}$$
$$-\frac{1}{n} \log Q^n(x^n) \to H(P) := \lim_{n \to \infty} H(X_n \mid X_1 \cdots X_{n-1})$$
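As a hedged illustration of the two conditions above, the sketch below takes the Krichevsky-Trofimov (KT) estimator as $Q^n$. This is an assumption made for the example only: KT is universal for i.i.d. sources, a special case of the stationary ergodic class on the slide, and the function names are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import product


def kt_log_prob(xs, alphabet_size):
    """Return log Q^n(x^n) (natural log) under the Krichevsky-Trofimov estimator.

    Q^n(x^n) = prod_i (c_i(x_i) + 1/2) / (i + m/2), where c_i(x_i) counts the
    earlier occurrences of x_i among the first i symbols and m is |A|.
    """
    counts = Counter()
    log_q = 0.0
    for i, x in enumerate(xs):
        log_q += math.log((counts[x] + 0.5) / (i + alphabet_size / 2))
        counts[x] += 1
    return log_q


def entropy(p):
    """Shannon entropy (nats) of a probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0)


if __name__ == "__main__":
    # Kraft check: sum Q^n(x^n) over all binary sequences of length 6.
    total = sum(math.exp(kt_log_prob(seq, 2)) for seq in product(range(2), repeat=6))
    print("sum of Q^n over A^n:", total)

    # Per-symbol code length versus H(P) for an assumed i.i.d. source P.
    p = [0.7, 0.2, 0.1]
    rng = random.Random(0)
    xs = rng.choices(range(3), weights=p, k=20000)
    print("-(1/n) log Q^n:", -kt_log_prob(xs, 3) / len(xs), " H(P):", entropy(p))
```

The KT conditional probabilities sum to one at every step, so Kraft's inequality holds with equality here; for a Markov or general ergodic source a richer $Q^n$ would be needed.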
Universal Coding with Finite Alphabet (cont’d)
Shannon-McMillan-Breiman: with probability one,
$$-\frac{1}{n} \log P^n(x^n) \to H(P)$$

We wish to generalize the following:
there exists $Q^n$ s.t. for all $P^n$, with probability one,
$$\frac{1}{n} \log \frac{P^n(x^n)}{Q^n(x^n)} \to 0$$
Universal Coding when the Density Function exists
$\{X_i\}_{i=1}^n \sim f^n$: Stationary Ergodic
$\{A_k\}_{k=1}^\infty$:
$A_k$ is a partition of $X_i(\Omega)$
$A_{k+1}$ is a refinement of $A_k$, with $A_0 := \{X_i(\Omega)\}$
Example: $X_i(\Omega) = [0, 1)$
$A_1 = \{[0, 1/2), [1/2, 1)\}$
$A_2 = \{[0, 1/4), [1/4, 1/2), [1/2, 3/4), [3/4, 1)\}$
...
$A_k = \{[0, 2^{-k}), [2^{-k}, 2 \cdot 2^{-k}), \dots, [(2^k - 1) 2^{-k}, 1)\}$
...
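The dyadic example can be sketched in a few lines. The code assumes cells of width $2^{-k}$ at level $k$, which is what the listed $A_1$ and $A_2$ imply; the function names are hypothetical.

```python
import math


def dyadic_cell(x, k):
    """Index j of the cell [j * 2**-k, (j + 1) * 2**-k) of A_k that contains x in [0, 1)."""
    if not 0.0 <= x < 1.0:
        raise ValueError("x must lie in [0, 1)")
    return math.floor(x * 2**k)


def project(xs, k):
    """Map each sample of a sequence to the index of its cell in A_k."""
    return [dyadic_cell(x, k) for x in xs]


def is_refinement(k):
    """Check that cell j of A_{k+1} lies inside cell j // 2 of A_k."""
    return all(
        dyadic_cell(j * 2.0 ** -(k + 1), k) == j // 2
        for j in range(2 ** (k + 1))
    )


if __name__ == "__main__":
    print(project([0.1, 0.6, 0.99], 2))
    print(is_refinement(3))
```

Each cell of $A_{k+1}$ is one half of a cell of $A_k$, which is the refinement property the slide requires.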
Universal Coding when the Density Function exists (cont’d)
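$s_k : \mathbb{R}^n \to A_k^n$ (Projection)
$P_k$: the probability of $s_k(X^n)$
$\lambda^n$: Lebesgue measure

For each $k$, there exists a universal $Q_k$:
$$f_k(x^n) := \frac{P_k(s_k(x^n))}{\lambda^n(s_k(x^n))}, \qquad g_k(x^n) := \frac{Q_k(s_k(x^n))}{\lambda^n(s_k(x^n))}$$
$$\frac{1}{n} \log \frac{P_k(s_k(x^n))}{Q_k(s_k(x^n))} \to 0$$
$\{\omega_k\}_{k=1}^\infty$: $\sum_k \omega_k = 1$, $\omega_k > 0$
$$g(x^n) := \sum_{k=1}^\infty \omega_k g_k(x^n)$$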
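The mixture $g$ can be sketched numerically under several assumptions that are not on the slide: the source is i.i.d. on $[0, 1)$, the partitions are dyadic with cells of width $2^{-k}$, $Q_k$ is the KT estimator over the $2^k$ cells, the weights are $\omega_k = 1/(k(k+1))$, and the sum is truncated at a hypothetical `k_max` (truncation only lowers $g$, so $\int g \le 1$ still holds).

```python
import math
import random


def kt_log_prob(cells, m):
    """log Q_k of a sequence of cell indices under the KT estimator with m symbols."""
    counts = {}
    log_q = 0.0
    for i, c in enumerate(cells):
        log_q += math.log((counts.get(c, 0) + 0.5) / (i + m / 2))
        counts[c] = counts.get(c, 0) + 1
    return log_q


def log_g_k(xs, k):
    """log g_k(x^n) = log Q_k(s_k(x^n)) - log lambda^n(s_k(x^n)) on [0, 1)^n."""
    m = 2**k
    cells = [math.floor(x * m) for x in xs]
    log_volume = -len(xs) * k * math.log(2)  # each cell has length 2**-k
    return kt_log_prob(cells, m) - log_volume


def log_g(xs, k_max=12):
    """log of sum_k w_k g_k(x^n), with w_k = 1/(k(k+1)) truncated at k_max."""
    terms = [-math.log(k * (k + 1)) + log_g_k(xs, k) for k in range(1, k_max + 1)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))  # log-sum-exp


if __name__ == "__main__":
    # Assumed test density f(x) = 2x on [0, 1); sampled as the square root of a uniform.
    rng = random.Random(0)
    xs = [math.sqrt(rng.random()) for _ in range(2000)]
    log_f = sum(math.log(2 * x) for x in xs)
    print("(1/n) log f^n/g^n:", (log_f - log_g(xs)) / len(xs))
```

Working in the log domain avoids underflow, since $g_k(x^n)$ is a product of $n$ factors.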
Universal Coding when the Density Function exists (cont’d)
$$h(f) := \lim_{n \to \infty} \int -f(x^n) \log f(x_n \mid x_1, \dots, x_{n-1}) \, dx^n$$

We wish to generalize
If we choose $\{A_k\}_{k=1}^\infty$ s.t. $h(f_k) \to h(f)$ $(k \to \infty)$, there exists $g^n$ ($\int_{-\infty}^{\infty} g^n(x^n) \, dx^n \le 1$) s.t. for all $f^n$, with probability one,
$$\frac{1}{n} \log \frac{f^n(x^n)}{g^n(x^n)} \to 0$$
B. Ryabko. IEEE Trans. on Information Theory, VOL. 55, NO. 9, 2009.
What if there exists no Density Function
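Example: $\int_0^\infty h(x) \, dx = 1$ and
$$F_X(x) = \begin{cases} 0, & x < -1 \\ \frac{1}{2}, & -1 \le x < 0 \\ \frac{1}{2} + \int_0^x \frac{1}{2} h(t) \, dt, & 0 \le x \end{cases}$$
$\Longrightarrow$ there exists no $f_X$ s.t. $F_X(x) = \int_{-\infty}^x f_X(t) \, dt$

How are the ratios $\frac{P(x^n)}{Q(x^n)}$ and $\frac{f(x^n)}{g(x^n)}$ expressed in the general setting of $\{X_i\}_{i=1}^n$?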
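A quick numerical check shows why no density exists: any $F_X$ of the form $\int_{-\infty}^x f_X(t)\,dt$ is continuous, whereas this $F_X$ jumps by $1/2$ at $x = -1$. The sketch assumes $h(t) = e^{-t}$ and reads the third branch as $\frac{1}{2} + \frac{1}{2}\int_0^x h(t)\,dt$, so that $F_X$ is nondecreasing and tends to one; both choices and the function names are assumptions.

```python
import math


def cdf(x):
    """F_X for the example, with the assumed choice h(t) = exp(-t) for t >= 0."""
    if x < -1:
        return 0.0
    if x < 0:
        return 0.5
    return 0.5 + 0.5 * (1.0 - math.exp(-x))


def jump_at(x, eps=1e-9):
    """Approximate size of the jump F_X(x) - F_X(x-) of the CDF at x."""
    return cdf(x) - cdf(x - eps)


if __name__ == "__main__":
    print("jump at -1:", jump_at(-1.0))
    print("jump at 0.5:", jump_at(0.5))
```

The point mass at $-1$ has Lebesgue measure zero but probability $1/2$, so the distribution is not absolutely continuous with respect to Lebesgue measure.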
Random Variables
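$(\Omega, \mathcal{F}, \mu)$: Probability Space
$\mathcal{B}$: the Borel sets of $\mathbb{R}$

$X$ is a Random Variable
$\mathcal{F}$-measurable $X : \Omega \to \mathbb{R}$, i.e.
$$D \in \mathcal{B} \Longrightarrow \{\omega \in \Omega \mid X(\omega) \in D\} \in \mathcal{F}$$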
Finite Sources
Continuous Sources with Density Functions
Continuous Sources without Density Functions
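The Radon-Nikodym Theorem
$\mu$ is Absolutely Continuous w.r.t. $\nu$ ($\mu \ll \nu$)
for each $A \in \mathcal{F}$,
$$\nu(A) = 0 \Longrightarrow \mu(A) = 0$$

Radon-Nikodym derivative $\frac{d\mu}{d\nu}$
$\mu \ll \nu \iff$ there exists an $\mathcal{F}$-measurable $g : \Omega \to \mathbb{R}$ s.t. for each $A \in \mathcal{F}$,
$$\mu(A) = \int_A g(\omega) \, d\nu(\omega)$$
$\lambda$: Lebesgue measure on $\mathbb{R}$

Density function $f_X$ exists
$\iff \mu \ll \lambda$ for $F_X(x) := \mu(\{\omega \in \Omega \mid X(\omega) \le x\})$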
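On a finite or countable space both definitions reduce to elementary sums, which makes them easy to check. The sketch below represents a measure as a dict from points to masses; this representation and the function names are assumptions for illustration.

```python
def is_absolutely_continuous(mu, nu):
    """mu << nu on a countable space: every point with nu-mass zero has mu-mass zero."""
    return all(nu.get(a, 0.0) > 0.0 for a, m in mu.items() if m > 0.0)


def radon_nikodym(mu, nu):
    """dmu/dnu as a dict on the support of nu (requires mu << nu)."""
    if not is_absolutely_continuous(mu, nu):
        raise ValueError("mu is not absolutely continuous w.r.t. nu")
    return {a: mu.get(a, 0.0) / n for a, n in nu.items() if n > 0.0}


def integrate(g, nu, event):
    """Integral of g over the event A w.r.t. nu, i.e. sum over a in A of g(a) * nu(a)."""
    return sum(g[a] * nu[a] for a in event if a in g)


if __name__ == "__main__":
    mu = {"a": 0.5, "b": 0.5}
    nu = {"a": 0.25, "b": 0.25, "c": 0.5}
    g = radon_nikodym(mu, nu)
    print(g, integrate(g, nu, {"a", "c"}))
```

In the usage example, integrating $g$ over $\{a, c\}$ with respect to $\nu$ recovers $\mu(\{a, c\})$, while $\nu \ll \mu$ fails because $\nu$ charges the point `c` and $\mu$ does not.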
Kullback-Leibler Information
When $\mu \ll \nu$,
$$D(\mu \| \nu) := \int d\mu \log \frac{d\mu}{d\nu}$$
Finite Source: $P, Q \Longrightarrow$
$$\frac{d\mu}{d\nu}(x^n) = \frac{P(x^n)}{Q(x^n)}, \qquad D(\mu^n \| \nu^n) = \sum_{x^n \in A^n} P^n(x^n) \log \frac{P^n(x^n)}{Q^n(x^n)}$$
Continuous Source with Density Function: $f, g \Longrightarrow$
$$\frac{d\mu}{d\nu}(x^n) = \frac{f(x^n)}{g(x^n)}, \qquad D(\mu^n \| \nu^n) = \int f^n(x^n) \log \frac{f^n(x^n)}{g^n(x^n)} \, dx^n$$
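For the finite-source case the divergence is a plain sum. The sketch below is a minimal version with hypothetical names; it adopts the common convention of returning infinity when $\mu \ll \nu$ fails, and checks the additivity $D(\mu^n \| \nu^n) = n\,D(P \| Q)$ under the added assumption of i.i.d. sources.

```python
import math
from itertools import product


def kl_divergence(p, q):
    """D(P||Q) = sum_x P(x) log(P(x)/Q(x)) in nats; returns inf when P << Q fails."""
    total = 0.0
    for x, px in p.items():
        if px == 0.0:
            continue  # 0 log 0 = 0 by convention
        qx = q.get(x, 0.0)
        if qx == 0.0:
            return math.inf
        total += px * math.log(px / qx)
    return total


def product_measure(p, n):
    """P^n on A^n for an i.i.d. source with marginal p."""
    return {xs: math.prod(p[x] for x in xs) for xs in product(p, repeat=n)}


if __name__ == "__main__":
    p = {"a": 0.5, "b": 0.5}
    q = {"a": 0.25, "b": 0.75}
    print(kl_divergence(p, q))
    print(kl_divergence(product_measure(p, 3), product_measure(q, 3)))
```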
Construction of Measure νn
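$Q_k^n(a^n)$, $a^n \in A_k^n$
$\eta^n$: $\mu^n \ll \eta^n$ ($\eta^n = \lambda^n \Longrightarrow$ Ryabko)

For each $(D_1, \dots, D_n) \in \mathcal{B}^n$,
$$\nu_k^n(D_1, \dots, D_n) := \sum_{a_1, \dots, a_n \in A_k} \frac{\eta^n(a_1 \cap D_1, \dots, a_n \cap D_n)}{\eta^n(a_1, \dots, a_n)} Q_k^n(a_1, \dots, a_n)$$
$$\left( \iff \frac{d\nu_k^n}{d\eta^n} := \frac{Q_k^n(a_1, \dots, a_n)}{\eta^n(a_1, \dots, a_n)} \right)$$

$\{\omega_k\}_{k=0}^\infty$: $\sum_{k=0}^\infty \omega_k = 1$, $\omega_k > 0$
$$\nu^n(D_1, \dots, D_n) := \sum_{k=0}^\infty \omega_k \nu_k^n(D_1, \dots, D_n)$$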
A Generalized Universal Coding
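$$\mu_k^n(D_1, \dots, D_n) := \sum_{a_1, \dots, a_n \in A_k} \frac{\eta^n(a_1 \cap D_1, \dots, a_n \cap D_n)}{\eta^n(a_1, \dots, a_n)} P_k^n(a_1, \dots, a_n)$$
$$D(\mu \| \nu) := \lim_{n \to \infty} \int d\mu(x^n) \log \frac{d\mu}{d\nu}(x_n \mid x_1, \dots, x_{n-1})$$

Theorem
If we choose $\{A_k\}_{k=1}^\infty$ s.t. $D(\mu_k \| \eta) \to D(\mu \| \eta)$ $(k \to \infty)$, there exists $\nu^n$ ($\int_{x^n \in X^n(\Omega)} d\nu^n(x^n) \le 1$) s.t. for all $\mu^n$, with probability one,
$$\frac{1}{n} \log \frac{d\mu^n}{d\nu^n}(x_1, \dots, x_n) \to 0$$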
An Example not realized by the existing Universal Coding
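$X(\Omega) := \mathbb{N} = \{1, 2, \dots\}$, $\eta(j) = \frac{1}{j} - \frac{1}{j+1}$, $j \in \mathbb{N}$
$A_1 := \{\{1\}, \mathbb{N} \setminus \{1\}\}$
$A_2 := \{\{1\}, \{2\}, \mathbb{N} \setminus \{1, 2\}\}$
...
$A_k := \{\{1\}, \{2\}, \dots, \{k\}, \mathbb{N} \setminus \{1, \dots, k\}\}$
...
$Q_k^n(s_k(x^n))$:
$$\frac{1}{n} \log \frac{P_k^n(s_k(x^n))}{Q_k^n(s_k(x^n))} \to 0, \quad n \to \infty$$
The probability of $j \in \mathbb{N} \setminus \{1, \dots, k\}$ is to be proportional to
$$\eta(j) = \frac{1}{j} - \frac{1}{j+1}$$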
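For $n = 1$ the construction can be computed exactly with rational arithmetic. In the sketch below the cell probabilities `q` stand in for $Q_k$ and are arbitrary illustrative values, and the function names are hypothetical; the tail cell $\mathbb{N} \setminus \{1, \dots, k\}$ has $\eta$-mass $\sum_{j > k} \eta(j) = 1/(k+1)$ by telescoping.

```python
from fractions import Fraction


def eta(j):
    """Reference measure eta(j) = 1/j - 1/(j+1) on N = {1, 2, ...}."""
    return Fraction(1, j) - Fraction(1, j + 1)


def cell(j, k):
    """Index of the cell of A_k containing j; index k + 1 denotes the tail N - {1..k}."""
    return j if j <= k else k + 1


def eta_cell(c, k):
    """eta-mass of a cell of A_k; the tail has mass sum_{j>k} eta(j) = 1/(k+1)."""
    return eta(c) if c <= k else Fraction(1, k + 1)


def nu_k(j, k, q):
    """nu_k({j}) = eta(j) / eta(cell) * Q_k(cell), with q a dict over cells 1..k+1."""
    c = cell(j, k)
    return eta(j) / eta_cell(c, k) * q[c]


def density(j, k, q):
    """d nu_k / d eta at j, which is constant on each cell of A_k."""
    c = cell(j, k)
    return q[c] / eta_cell(c, k)


if __name__ == "__main__":
    q = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}  # assumed Q_2
    print([nu_k(j, 2, q) for j in range(1, 6)])
    print(sum(nu_k(j, 2, q) for j in range(1, 1001)))
```

Inside the tail cell the mass $Q_k(\text{tail})$ is spread proportionally to $\eta(j)$, so $\nu_k$ assigns positive probability to every integer even though $Q_k$ only sees $k + 1$ cells.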
Case Study 1: Markov Order Estimation

The Markov Order
For each $n = 1, 2, \dots$, the minimum $s$ s.t.
$$\{X_j\}_{j=n}^{\infty} \perp\!\!\!\perp \{X_j\}_{j=1}^{n-s-1} \mid \{X_j\}_{j=n-s}^{n-1}$$
$\{X_i\}_{i=1}^n \sim P^n[s]$: Markov with order $s$
$\pi[s]$: the prior probability of order $s$

If $X_i(\Omega) = A$, $|A| < \infty$,
1. for each $s = 0, 1, \dots$, we estimate $Q^n[s]$:
▶ $\sum_{x^n \in A^n} Q^n[s](x^n) \le 1$
▶ $\frac{1}{n} \log \frac{P^n[s](x^n)}{Q^n[s](x^n)} \to 0$
2. Given a sequence $x^n$, we choose $s$ maximizing $\pi[s] Q^n[s](x^n)$
($\iff$ minimizing $-\log \pi[s] - \log Q^n[s](x^n)$)
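The two-step rule for the finite case can be sketched as follows. Several choices are assumptions for the example and are not stated on the slide: $Q^n[s]$ is a KT estimator applied separately in each length-$s$ context, the first $s$ symbols are coded uniformly, the prior is $\pi[s] = 2^{-(s+1)}$, and the search stops at a hypothetical `max_order`.

```python
import math
import random
from collections import defaultdict


def log_q_markov(xs, s, m):
    """log Q^n[s](x^n): KT estimator applied separately in each length-s context.

    The first s symbols are coded uniformly (1/m each); this assumed convention
    keeps Q^n[s] summing to one over A^n.
    """
    counts = defaultdict(lambda: [0] * m)
    log_q = -min(s, len(xs)) * math.log(m)
    for i in range(s, len(xs)):
        c = counts[tuple(xs[i - s:i])]
        log_q += math.log((c[xs[i]] + 0.5) / (sum(c) + m / 2))
        c[xs[i]] += 1
    return log_q


def estimate_order(xs, m, max_order=4):
    """Return the s minimizing -log pi[s] - log Q^n[s](x^n), with pi[s] = 2**-(s+1)."""
    def description_length(s):
        return (s + 1) * math.log(2) - log_q_markov(xs, s, m)
    return min(range(max_order + 1), key=description_length)


if __name__ == "__main__":
    # Assumed order-1 binary chain: repeat the previous symbol with probability 0.9.
    rng = random.Random(0)
    xs = [0]
    for _ in range(1999):
        stay = rng.random() < 0.9
        xs.append(xs[-1] if stay else 1 - xs[-1])
    print("estimated order:", estimate_order(xs, m=2))
```

Minimizing the description length is the same as maximizing $\pi[s] Q^n[s](x^n)$, so the function implements step 2 directly once step 1 supplies $Q^n[s]$.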
Case Study 1: Markov Order Estimation (cont’d)
In general, for a neighborhood $\Delta x^n$ of $x^n$, choose $s$
maximizing $\pi[s] \nu^n[s](\Delta x^n)$ ($\iff$ minimizing $-\log \pi[s] - \log \nu^n[s](\Delta x^n)$)

Decision Rule
1. Construct $\nu^n[s]$ for each $s = 0, 1, \dots$,
▶ $\sum_{x^n \in A^n} \nu^n[s](x^n) \le 1$
▶ $\frac{1}{n} \log \frac{d\mu^n[s]}{d\nu^n[s]}(x^n) \to 0$
2. Given a sequence $x^n$,
$$\frac{\pi[s]}{\pi[s']} \cdot \frac{d\nu^n[s]}{d\nu^n[s']}(x^n) > 1 \iff s \text{ is better than } s'$$
The ratios of probabilities and of density functions are Radon-Nikodym derivatives in the general setting.
Case Study 2: Discrete and Continuous Features are mixed in Pattern Recognition
$S^*$: Finite Set
$\{X_k\}_{k \in S^*}$, $Y$: Random Variables

$x^n := \{(x_{i,k})_{k \in S^*}\}_{i=1}^n$, $y^n := \{y_i\}_{i=1}^n$: Examples
Finite Case: choose $S \subseteq S^*$ maximizing
$$R(x^n, y^n, S) := \pi(S) \frac{Q^n[x^n, y^n \mid S]}{Q^n[x^n \mid S]}$$
General Case: choose $S \subseteq S^*$ maximizing
$$R(x^n, \Delta y^n, S) := \pi(S) \frac{d\nu^n[\Delta x^n, \Delta y^n \mid S]}{d\nu^n[\Delta x^n \mid S]}(x^n)$$
$$\frac{dR(x^n, \Delta y^n, S)}{dR(x^n, \Delta y^n, S')}(y^n) > 1 \iff S \text{ is better than } S'$$

Conditional Probability of $Y$ given $X$
$$\mu(Y \in D \mid X = x) := f(Y \in D \mid x) = \frac{d\mu(X \in \Delta x, Y \in D)}{d\mu(X \in \Delta x)}$$
Contribution
New Theory
Universal Coding without assuming either discrete or continuous sources
The MDL Principle without assuming either discrete or continuous sources

Applications
Previously, the discrete and continuous cases were treated separately:
Markov Order Estimation (continuous data sequences)
Feature Selection (discrete and continuous features are mixed)
BN Structure Estimation (discrete and continuous random variables are mixed)

Future Work
Computation
Applications
Joe Suzuki (Osaka University) The Universal Measure for General Sources and its Application to MDL/Bayesian CriteriaMarch 30 18 / 18

More Related Content

What's hot

Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking componentsChristian Robert
 
A Szemeredi-type theorem for subsets of the unit cube
A Szemeredi-type theorem for subsets of the unit cubeA Szemeredi-type theorem for subsets of the unit cube
A Szemeredi-type theorem for subsets of the unit cubeVjekoslavKovac1
 
accurate ABC Oliver Ratmann
accurate ABC Oliver Ratmannaccurate ABC Oliver Ratmann
accurate ABC Oliver Ratmannolli0601
 
Density theorems for anisotropic point configurations
Density theorems for anisotropic point configurationsDensity theorems for anisotropic point configurations
Density theorems for anisotropic point configurationsVjekoslavKovac1
 
Multilinear Twisted Paraproducts
Multilinear Twisted ParaproductsMultilinear Twisted Paraproducts
Multilinear Twisted ParaproductsVjekoslavKovac1
 
Scattering theory analogues of several classical estimates in Fourier analysis
Scattering theory analogues of several classical estimates in Fourier analysisScattering theory analogues of several classical estimates in Fourier analysis
Scattering theory analogues of several classical estimates in Fourier analysisVjekoslavKovac1
 
A T(1)-type theorem for entangled multilinear Calderon-Zygmund operators
A T(1)-type theorem for entangled multilinear Calderon-Zygmund operatorsA T(1)-type theorem for entangled multilinear Calderon-Zygmund operators
A T(1)-type theorem for entangled multilinear Calderon-Zygmund operatorsVjekoslavKovac1
 
A Szemerédi-type theorem for subsets of the unit cube
A Szemerédi-type theorem for subsets of the unit cubeA Szemerédi-type theorem for subsets of the unit cube
A Szemerédi-type theorem for subsets of the unit cubeVjekoslavKovac1
 
On Twisted Paraproducts and some other Multilinear Singular Integrals
On Twisted Paraproducts and some other Multilinear Singular IntegralsOn Twisted Paraproducts and some other Multilinear Singular Integrals
On Twisted Paraproducts and some other Multilinear Singular IntegralsVjekoslavKovac1
 
S. Duplij, A q-deformed generalization of the Hosszu-Gluskin theorem
S. Duplij, A q-deformed generalization of the Hosszu-Gluskin theoremS. Duplij, A q-deformed generalization of the Hosszu-Gluskin theorem
S. Duplij, A q-deformed generalization of the Hosszu-Gluskin theoremSteven Duplij (Stepan Douplii)
 

What's hot (20)

QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking components
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
A Szemeredi-type theorem for subsets of the unit cube
A Szemeredi-type theorem for subsets of the unit cubeA Szemeredi-type theorem for subsets of the unit cube
A Szemeredi-type theorem for subsets of the unit cube
 
accurate ABC Oliver Ratmann
accurate ABC Oliver Ratmannaccurate ABC Oliver Ratmann
accurate ABC Oliver Ratmann
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Density theorems for anisotropic point configurations
Density theorems for anisotropic point configurationsDensity theorems for anisotropic point configurations
Density theorems for anisotropic point configurations
 
Multilinear Twisted Paraproducts
Multilinear Twisted ParaproductsMultilinear Twisted Paraproducts
Multilinear Twisted Paraproducts
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Scattering theory analogues of several classical estimates in Fourier analysis
Scattering theory analogues of several classical estimates in Fourier analysisScattering theory analogues of several classical estimates in Fourier analysis
Scattering theory analogues of several classical estimates in Fourier analysis
 
A T(1)-type theorem for entangled multilinear Calderon-Zygmund operators
A T(1)-type theorem for entangled multilinear Calderon-Zygmund operatorsA T(1)-type theorem for entangled multilinear Calderon-Zygmund operators
A T(1)-type theorem for entangled multilinear Calderon-Zygmund operators
 
A Szemerédi-type theorem for subsets of the unit cube
A Szemerédi-type theorem for subsets of the unit cubeA Szemerédi-type theorem for subsets of the unit cube
A Szemerédi-type theorem for subsets of the unit cube
 
On Twisted Paraproducts and some other Multilinear Singular Integrals
On Twisted Paraproducts and some other Multilinear Singular IntegralsOn Twisted Paraproducts and some other Multilinear Singular Integrals
On Twisted Paraproducts and some other Multilinear Singular Integrals
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
S. Duplij, A q-deformed generalization of the Hosszu-Gluskin theorem
S. Duplij, A q-deformed generalization of the Hosszu-Gluskin theoremS. Duplij, A q-deformed generalization of the Hosszu-Gluskin theorem
S. Duplij, A q-deformed generalization of the Hosszu-Gluskin theorem
 

Similar to The Universal Measure for General Sources and its Application to MDL/Bayesian Criteria

Universal Prediction without assuming either Discrete or Continuous
Universal Prediction without assuming either Discrete or ContinuousUniversal Prediction without assuming either Discrete or Continuous
Universal Prediction without assuming either Discrete or ContinuousJoe Suzuki
 
Optimal interval clustering: Application to Bregman clustering and statistica...
Optimal interval clustering: Application to Bregman clustering and statistica...Optimal interval clustering: Application to Bregman clustering and statistica...
Optimal interval clustering: Application to Bregman clustering and statistica...Frank Nielsen
 
Bayes Independence Test
Bayes Independence TestBayes Independence Test
Bayes Independence TestJoe Suzuki
 
Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...
Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...
Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...Frank Nielsen
 
Gaussian process in machine learning
Gaussian process in machine learningGaussian process in machine learning
Gaussian process in machine learningVARUN KUMAR
 
Normal density and discreminant analysis
Normal density and discreminant analysisNormal density and discreminant analysis
Normal density and discreminant analysisVARUN KUMAR
 
Computational Information Geometry on Matrix Manifolds (ICTP 2013)
Computational Information Geometry on Matrix Manifolds (ICTP 2013)Computational Information Geometry on Matrix Manifolds (ICTP 2013)
Computational Information Geometry on Matrix Manifolds (ICTP 2013)Frank Nielsen
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Valentin De Bortoli
 
2013 IEEE International Symposium on Information Theory
2013 IEEE International Symposium on Information Theory2013 IEEE International Symposium on Information Theory
2013 IEEE International Symposium on Information TheoryJoe Suzuki
 
On learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihoodOn learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihoodFrank Nielsen
 
Accelerated reconstruction of a compressively sampled data stream
Accelerated reconstruction of a compressively sampled data streamAccelerated reconstruction of a compressively sampled data stream
Accelerated reconstruction of a compressively sampled data streamPantelis Sopasakis
 
Density theorems for Euclidean point configurations
Density theorems for Euclidean point configurationsDensity theorems for Euclidean point configurations
Density theorems for Euclidean point configurationsVjekoslavKovac1
 
Information-theoretic clustering with applications
Information-theoretic clustering  with applicationsInformation-theoretic clustering  with applications
Information-theoretic clustering with applicationsFrank Nielsen
 
Bayesian network structure estimation based on the Bayesian/MDL criteria when...
Bayesian network structure estimation based on the Bayesian/MDL criteria when...Bayesian network structure estimation based on the Bayesian/MDL criteria when...
Bayesian network structure estimation based on the Bayesian/MDL criteria when...Joe Suzuki
 
Problem Understanding through Landscape Theory
Problem Understanding through Landscape TheoryProblem Understanding through Landscape Theory
Problem Understanding through Landscape Theoryjfrchicanog
 
The Multivariate Gaussian Probability Distribution
The Multivariate Gaussian Probability DistributionThe Multivariate Gaussian Probability Distribution
The Multivariate Gaussian Probability DistributionPedro222284
 

Similar to The Universal Measure for General Sources and its Application to MDL/Bayesian Criteria (20)

Universal Prediction without assuming either Discrete or Continuous
Universal Prediction without assuming either Discrete or ContinuousUniversal Prediction without assuming either Discrete or Continuous
Universal Prediction without assuming either Discrete or Continuous
 
Lecture9 xing
Lecture9 xingLecture9 xing
Lecture9 xing
 
Optimal interval clustering: Application to Bregman clustering and statistica...
Optimal interval clustering: Application to Bregman clustering and statistica...Optimal interval clustering: Application to Bregman clustering and statistica...
Optimal interval clustering: Application to Bregman clustering and statistica...
 
Bayes Independence Test
Bayes Independence TestBayes Independence Test
Bayes Independence Test
 
Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...
Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...
Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...
 
SASA 2016
SASA 2016SASA 2016
SASA 2016
 
Gaussian process in machine learning
Gaussian process in machine learningGaussian process in machine learning
Gaussian process in machine learning
 
Normal density and discreminant analysis
Normal density and discreminant analysisNormal density and discreminant analysis
Normal density and discreminant analysis
 
Computational Information Geometry on Matrix Manifolds (ICTP 2013)
Computational Information Geometry on Matrix Manifolds (ICTP 2013)Computational Information Geometry on Matrix Manifolds (ICTP 2013)
Computational Information Geometry on Matrix Manifolds (ICTP 2013)
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...
 
2013 IEEE International Symposium on Information Theory
2013 IEEE International Symposium on Information Theory2013 IEEE International Symposium on Information Theory
2013 IEEE International Symposium on Information Theory
 
On learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihoodOn learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihood
 
Accelerated reconstruction of a compressively sampled data stream
Accelerated reconstruction of a compressively sampled data streamAccelerated reconstruction of a compressively sampled data stream
Accelerated reconstruction of a compressively sampled data stream
 
Density theorems for Euclidean point configurations
Density theorems for Euclidean point configurationsDensity theorems for Euclidean point configurations
Density theorems for Euclidean point configurations
 
Information-theoretic clustering with applications
Information-theoretic clustering  with applicationsInformation-theoretic clustering  with applications
Information-theoretic clustering with applications
 
Bayesian network structure estimation based on the Bayesian/MDL criteria when...
Bayesian network structure estimation based on the Bayesian/MDL criteria when...Bayesian network structure estimation based on the Bayesian/MDL criteria when...
Bayesian network structure estimation based on the Bayesian/MDL criteria when...
 
Problem Understanding through Landscape Theory
Problem Understanding through Landscape TheoryProblem Understanding through Landscape Theory
Problem Understanding through Landscape Theory
 
The Multivariate Gaussian Probability Distribution
The Multivariate Gaussian Probability DistributionThe Multivariate Gaussian Probability Distribution
The Multivariate Gaussian Probability Distribution
 
Bayes gauss
Bayes gaussBayes gauss
Bayes gauss
 
Deep Learning Opening Workshop - ProxSARAH Algorithms for Stochastic Composit...
Deep Learning Opening Workshop - ProxSARAH Algorithms for Stochastic Composit...Deep Learning Opening Workshop - ProxSARAH Algorithms for Stochastic Composit...
Deep Learning Opening Workshop - ProxSARAH Algorithms for Stochastic Composit...
 

More from Joe Suzuki

RとPythonを比較する
RとPythonを比較するRとPythonを比較する
RとPythonを比較するJoe Suzuki
 
R集会@統数研
R集会@統数研R集会@統数研
R集会@統数研Joe Suzuki
 
E-learning Development of Statistics and in Duex: Practical Approaches and Th...
E-learning Development of Statistics and in Duex: Practical Approaches and Th...E-learning Development of Statistics and in Duex: Practical Approaches and Th...
E-learning Development of Statistics and in Duex: Practical Approaches and Th...Joe Suzuki
 
分枝限定法でモデル選択の計算量を低減する
分枝限定法でモデル選択の計算量を低減する分枝限定法でモデル選択の計算量を低減する
分枝限定法でモデル選択の計算量を低減するJoe Suzuki
 
連続変量を含む条件付相互情報量の推定
連続変量を含む条件付相互情報量の推定連続変量を含む条件付相互情報量の推定
連続変量を含む条件付相互情報量の推定Joe Suzuki
 
E-learning Design and Development for Data Science in Osaka University
E-learning Design and Development for Data Science in Osaka UniversityE-learning Design and Development for Data Science in Osaka University
E-learning Design and Development for Data Science in Osaka UniversityJoe Suzuki
 
AMBN2017 サテライトワークショップ
AMBN2017 サテライトワークショップAMBN2017 サテライトワークショップ
AMBN2017 サテライトワークショップJoe Suzuki
 
CRAN Rパッケージ BNSLの概要
CRAN Rパッケージ BNSLの概要CRAN Rパッケージ BNSLの概要
CRAN Rパッケージ BNSLの概要Joe Suzuki
 
Forest Learning from Data
Forest Learning from DataForest Learning from Data
Forest Learning from DataJoe Suzuki
 
A Bayesian Approach to Data Compression
A Bayesian Approach to Data CompressionA Bayesian Approach to Data Compression
A Bayesian Approach to Data CompressionJoe Suzuki
 
研究紹介(学生向け)
研究紹介(学生向け)研究紹介(学生向け)
研究紹介(学生向け)Joe Suzuki
 
Efficietly Learning Bayesian Network Structures based on the B&B Strategy: A ...
Efficietly Learning Bayesian Network Structuresbased on the B&B Strategy: A ...Efficietly Learning Bayesian Network Structuresbased on the B&B Strategy: A ...
Efficietly Learning Bayesian Network Structures based on the B&B Strategy: A ...Joe Suzuki
 
Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...
Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...
Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...Joe Suzuki
 
Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...
Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...
Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...Joe Suzuki
 
連続変量を含む相互情報量の推定
連続変量を含む相互情報量の推定連続変量を含む相互情報量の推定
連続変量を含む相互情報量の推定Joe Suzuki
 
Jeffreys' and BDeu Priors for Model Selection
Jeffreys' and BDeu Priors for Model SelectionJeffreys' and BDeu Priors for Model Selection
Jeffreys' and BDeu Priors for Model SelectionJoe Suzuki
 
離散と連続の入り混じった相互情報量を推定して、 SNP と遺伝子発現量の因果関係をさぐる
離散と連続の入り混じった相互情報量を推定して、SNP と遺伝子発現量の因果関係をさぐる離散と連続の入り混じった相互情報量を推定して、SNP と遺伝子発現量の因果関係をさぐる
離散と連続の入り混じった相互情報量を推定して、 SNP と遺伝子発現量の因果関係をさぐるJoe Suzuki
 
MaCaulay2 Miuraパッケージの開発と今後
MaCaulay2 Miuraパッケージの開発と今後MaCaulay2 Miuraパッケージの開発と今後
MaCaulay2 Miuraパッケージの開発と今後Joe Suzuki
 

More from Joe Suzuki (20)

RとPythonを比較する
RとPythonを比較するRとPythonを比較する
RとPythonを比較する
 
R集会@統数研
R集会@統数研R集会@統数研
R集会@統数研
 
E-learning Development of Statistics and in Duex: Practical Approaches and Th...
E-learning Development of Statistics and in Duex: Practical Approaches and Th...E-learning Development of Statistics and in Duex: Practical Approaches and Th...
E-learning Development of Statistics and in Duex: Practical Approaches and Th...
 
分枝限定法でモデル選択の計算量を低減する
分枝限定法でモデル選択の計算量を低減する分枝限定法でモデル選択の計算量を低減する
分枝限定法でモデル選択の計算量を低減する
 
連続変量を含む条件付相互情報量の推定
連続変量を含む条件付相互情報量の推定連続変量を含む条件付相互情報量の推定
連続変量を含む条件付相互情報量の推定
 
E-learning Design and Development for Data Science in Osaka University
E-learning Design and Development for Data Science in Osaka UniversityE-learning Design and Development for Data Science in Osaka University
E-learning Design and Development for Data Science in Osaka University
 
UAI 2017
UAI 2017UAI 2017
UAI 2017
 
AMBN2017 サテライトワークショップ
AMBN2017 サテライトワークショップAMBN2017 サテライトワークショップ
AMBN2017 サテライトワークショップ
 
CRAN Rパッケージ BNSLの概要
CRAN Rパッケージ BNSLの概要CRAN Rパッケージ BNSLの概要
CRAN Rパッケージ BNSLの概要
 
Forest Learning from Data
Forest Learning from DataForest Learning from Data
Forest Learning from Data
 
A Bayesian Approach to Data Compression
A Bayesian Approach to Data CompressionA Bayesian Approach to Data Compression
A Bayesian Approach to Data Compression
 
研究紹介(学生向け)
研究紹介(学生向け)研究紹介(学生向け)
研究紹介(学生向け)
 
Efficietly Learning Bayesian Network Structures based on the B&B Strategy: A ...
Efficietly Learning Bayesian Network Structuresbased on the B&B Strategy: A ...Efficietly Learning Bayesian Network Structuresbased on the B&B Strategy: A ...
Efficietly Learning Bayesian Network Structures based on the B&B Strategy: A ...
 
Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...
Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...
Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...
 
2016 7-13
2016 7-132016 7-13
2016 7-13
 
Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...
Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...
Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...
 
連続変量を含む相互情報量の推定
連続変量を含む相互情報量の推定連続変量を含む相互情報量の推定
連続変量を含む相互情報量の推定
 
Jeffreys' and BDeu Priors for Model Selection
Jeffreys' and BDeu Priors for Model SelectionJeffreys' and BDeu Priors for Model Selection
Jeffreys' and BDeu Priors for Model Selection
 
離散と連続の入り混じった相互情報量を推定して、 SNP と遺伝子発現量の因果関係をさぐる
離散と連続の入り混じった相互情報量を推定して、SNP と遺伝子発現量の因果関係をさぐる離散と連続の入り混じった相互情報量を推定して、SNP と遺伝子発現量の因果関係をさぐる
離散と連続の入り混じった相互情報量を推定して、 SNP と遺伝子発現量の因果関係をさぐる
 
MaCaulay2 Miuraパッケージの開発と今後
MaCaulay2 Miuraパッケージの開発と今後MaCaulay2 Miuraパッケージの開発と今後
MaCaulay2 Miuraパッケージの開発と今後
 


The Universal Measure for General Sources and its Application to MDL/Bayesian Criteria

  • 1. The Universal Measure for General Sources and its Application to MDL/Bayesian Criteria
    Joe Suzuki, Osaka University, March 30
  • 2. Road Map
    1. Universal Coding with Finite Alphabet
    2. Universal Coding when the Density Function Exists
    3. Radon-Nikodym's Theorem
    4. A Generalized Universal Coding
    5. A Generalized MDL Principle
    6. Summary
  • 3. Universal Coding with Finite Alphabet
    $\{X_i\}_{i=1}^n \sim P^n$: stationary ergodic, with $A := X_i(\Omega)$, $|A| < \infty$, $i = 1, \dots, n$.
    Universal Coding: there exists $Q^n$ such that
    $\sum_{x^n \in A^n} Q^n(x^n) \le 1$ (Kraft's inequality)
    and, for all $P^n$, with probability one,
    $-\frac{1}{n} \log Q^n(x^n) \to H(P) := \lim_{n \to \infty} H(X_n \mid X_1 \cdots X_{n-1})$.
  • 4. Universal Coding with Finite Alphabet (cont'd)
    Shannon-McMillan-Breiman: with probability one,
    $-\frac{1}{n} \log P^n(x^n) \to H(P)$.
    We wish to generalize the statement that there exists $Q^n$ such that, for all $P^n$, with probability one,
    $\frac{1}{n} \log \frac{P^n(x^n)}{Q^n(x^n)} \to 0$.
  • 5. Universal Coding when the Density Function exists
    $\{X_i\}_{i=1}^n \sim f^n$: stationary ergodic.
    $\{A_k\}_{k=1}^\infty$: $A_k$ is a partition of $X_i(\Omega)$, and $A_{k+1}$ is a refinement of $A_k$, with $A_0 := \{X_i(\Omega)\}$.
    Example: $X_i(\Omega) = [0, 1)$,
    $A_1 = \{[0, 1/2), [1/2, 1)\}$
    $A_2 = \{[0, 1/4), [1/4, 1/2), [1/2, 3/4), [3/4, 1)\}$
    $\dots$
    $A_k = \{[0, 2^{-k}), [2^{-k}, 2 \cdot 2^{-k}), \dots, [(2^k - 1) 2^{-k}, 1)\}$
    $\dots$
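The nested partitions in this example can be sketched in a few lines of Python. This is only an illustration: the function names `dyadic_partition`, `cell_index` and `is_refinement` are hypothetical, and the cell width $2^{-k}$ is chosen so that $k = 1$ and $k = 2$ reproduce the $A_1$ and $A_2$ listed on the slide.

```python
from fractions import Fraction


def dyadic_partition(k):
    """Cells of A_k on [0, 1): 2**k half-open intervals [lo, hi) of width 2**-k."""
    width = Fraction(1, 2 ** k)
    return [(i * width, (i + 1) * width) for i in range(2 ** k)]


def cell_index(x, k):
    """Index of the cell of A_k containing x (the per-coordinate projection)."""
    if not 0 <= x < 1:
        raise ValueError("x must lie in [0, 1)")
    return int(x * 2 ** k)


def is_refinement(fine, coarse):
    """True if every cell of `fine` is contained in some cell of `coarse`."""
    return all(
        any(lo <= a and b <= hi for lo, hi in coarse)
        for a, b in fine
    )
```

Applying `cell_index` to each coordinate of a sequence gives a quantized sequence of cell labels, which is the kind of object a finite-alphabet universal code can be applied to.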
  • 6. Universal Coding when the Density Function exists (cont'd)
    $s_k : \mathbb{R}^n \to A_k^n$ (projection); $P_k$: the probability of $s_k(X^n)$; $\lambda^n$: Lebesgue measure.
    For each $k$, there exists a universal $Q_k$:
    $f_k(x^n) := \frac{P_k(s_k(x^n))}{\lambda^n(s_k(x^n))}$, $g_k(x^n) := \frac{Q_k(s_k(x^n))}{\lambda^n(s_k(x^n))}$,
    $\frac{1}{n} \log \frac{P_k(s_k(x^n))}{Q_k(s_k(x^n))} \to 0$.
    $\{\omega_k\}_{k=1}^\infty$: $\sum_k \omega_k = 1$, $\omega_k > 0$,
    $g(x^n) := \sum_{k=1}^\infty \omega_k g_k(x^n)$.
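A minimal numerical sketch of the mixture $g = \sum_k \omega_k g_k$ on $[0, 1)$ follows. Several choices here are assumptions made for illustration and are not taken from the slides: $Q_k$ is a memoryless Krichevsky-Trofimov estimator over the $2^k$ dyadic cells (universal for i.i.d. sources only, not for general stationary ergodic ones), the weights are $\omega_k = 2^{-k}$, and the sum is truncated at `max_level`, which leaves a sub-density whose integral is at most one. The names `log_q_k` and `log_g` are hypothetical.

```python
import math
from collections import Counter


def log_q_k(cells, m):
    """log Q_k of a cell sequence: Krichevsky-Trofimov estimator over m cells."""
    counts = Counter()
    total = 0.0
    for t, c in enumerate(cells):
        # Sequential KT probability of the next cell given the first t cells.
        total += math.log((counts[c] + 0.5) / (t + m / 2))
        counts[c] += 1
    return total


def log_g(xs, max_level=10):
    """log of sum_k omega_k g_k(x^n) for xs in [0, 1), with omega_k = 2**-k."""
    n = len(xs)
    terms = []
    for k in range(1, max_level + 1):
        m = 2 ** k
        cells = [int(x * m) for x in xs]
        # g_k = Q_k / lambda^n(cell product), and lambda^n = 2**(-k * n) here.
        log_gk = log_q_k(cells, m) + n * k * math.log(2)
        terms.append(-k * math.log(2) + log_gk)
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))
```

Working in the log domain matters because $g_k(x^n)$ grows or shrinks exponentially in $n$. Data concentrated in a small interval receives a much larger value of `log_g` than data spread evenly over $[0, 1)$, since the fine levels dominate the mixture in that case.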
  • 7. Universal Coding when the Density Function exists (cont'd)
    $h(f) := \lim_{n \to \infty} \int -f(x^n) \log f(x_n \mid x_1, \dots, x_{n-1}) \, dx^n$
    We wish to generalize: if we choose $\{A_k\}_{k=1}^\infty$ such that $h(f_k) \to h(f)$ ($k \to \infty$), there exists $g^n$ (with $\int_{-\infty}^{\infty} g^n(x^n) \, dx^n \le 1$) such that, for all $f^n$, with probability one,
    $\frac{1}{n} \log \frac{f^n(x^n)}{g^n(x^n)} \to 0$.
    B. Ryabko. IEEE Trans. on Information Theory, Vol. 55, No. 9, 2009.
  • 8. What if there exists no Density Function
    Example: $\int_0^\infty h(x) \, dx = 1$ and
    $F_X(x) = 0$ for $x < -1$; $F_X(x) = \frac{1}{2}$ for $-1 \le x < 0$; $F_X(x) = \frac{1}{2} + \int_0^x \frac{1}{2} h(t) \, dt$ for $0 \le x$
    $\implies$ there exists no $f_X$ such that $F_X(x) = \int_{-\infty}^x f_X(t) \, dt$.
    By what are $\frac{P(x^n)}{Q(x^n)}$ and $\frac{f(x^n)}{g(x^n)}$ expressed in the general setting of $\{X_i\}_{i=1}^n$?
  • 9. Random Variables
    $(\Omega, \mathcal{F}, \mu)$: probability space; $\mathcal{B}$: the Borel sets of $\mathbb{R}$.
    $X$ is a random variable: $X : \Omega \to \mathbb{R}$ is $\mathcal{F}$-measurable, i.e. $D \in \mathcal{B} \implies \{\omega \in \Omega \mid X(\omega) \in D\} \in \mathcal{F}$.
    Finite sources; continuous sources with density functions; continuous sources without density functions.
  • 10. Radon-Nikodym's Theorem
    $\mu$ is absolutely continuous w.r.t. $\nu$ ($\mu \ll \nu$): for each $A \in \mathcal{F}$, $\nu(A) = 0 \implies \mu(A) = 0$.
    Radon-Nikodym derivative $\frac{d\mu}{d\nu}$: $\mu \ll \nu \iff$ there exists an $\mathcal{F}$-measurable $g : \Omega \to \mathbb{R}$ such that, for each $A \in \mathcal{F}$,
    $\mu(A) = \int_A g(\omega) \, d\nu(\omega)$.
    $\lambda$: Lebesgue measure on $\mathbb{R}$.
    The density function $f_X$ exists $\iff \mu \ll \lambda$ for $F_X(x) := \mu(\{\omega \in \Omega \mid X(\omega) \le x\})$.
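As a worked illustration of the derivative, consider a distribution with an atom of mass $1/2$ at $-1$ and the remaining mass spread over $[0, \infty)$ with density $\frac{1}{2} h$, as in the earlier example without a density function. The dominating measure $\eta := \delta_{-1} + \lambda$ below is an assumption chosen for this sketch; the slides do not specify one.

```latex
% Assumption: eta = delta_{-1} + lambda (point mass at -1 plus Lebesgue measure).
% mu has no density w.r.t. lambda alone, because lambda({-1}) = 0 but mu({-1}) = 1/2.
\[
  \mu(A) \;=\; \tfrac{1}{2}\,\mathbf{1}[-1 \in A]
        \;+\; \int_{A \cap [0,\infty)} \tfrac{1}{2}\,h(t)\,d\lambda(t),
  \qquad
  \eta(A) \;=\; \mathbf{1}[-1 \in A] + \lambda(A).
\]
% eta(A) = 0 forces -1 \notin A and lambda(A) = 0, hence mu(A) = 0, so mu << eta and
\[
  \frac{d\mu}{d\eta}(x) \;=\;
  \begin{cases}
    \tfrac{1}{2},        & x = -1,\\[2pt]
    \tfrac{1}{2}\,h(x),  & x \ge 0,\\[2pt]
    0,                   & \text{otherwise,}
  \end{cases}
  \qquad
  \mu(A) \;=\; \int_A \frac{d\mu}{d\eta}(x)\,d\eta(x).
\]
```

The same function plays the role that a probability mass function plays for finite sources and that a density plays for continuous sources.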
  • 11. Kullback-Leibler Information
    When $\mu \ll \nu$,
    $D(\mu \| \nu) := \int d\mu \log \frac{d\mu}{d\nu}$.
    Finite source $P, Q$: $\frac{d\mu}{d\nu}(x^n) = \frac{P(x^n)}{Q(x^n)}$ and
    $D(\mu^n \| \nu^n) = \sum_{x^n \in A^n} P^n(x^n) \log \frac{P^n(x^n)}{Q^n(x^n)}$.
    Continuous source with density functions $f, g$: $\frac{d\mu}{d\nu}(x^n) = \frac{f(x^n)}{g(x^n)}$ and
    $D(\mu^n \| \nu^n) = \int f^n(x^n) \log \frac{f^n(x^n)}{g^n(x^n)} \, dx^n$.
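For the finite-source case the sum can be evaluated directly. The sketch below assumes distributions given as Python dictionaries from symbols to probabilities; the function name `kl_divergence` is hypothetical, and returning infinity when absolute continuity fails is a common convention rather than something stated on the slide, which defines the quantity only when $\mu \ll \nu$.

```python
import math


def kl_divergence(p, q):
    """D(P || Q) = sum_x P(x) log(P(x) / Q(x)) in nats, for finite sources."""
    total = 0.0
    for x, px in p.items():
        if px == 0:
            continue  # terms with P(x) = 0 contribute nothing
        qx = q.get(x, 0.0)
        if qx == 0:
            # P is not absolutely continuous w.r.t. Q (assumed convention).
            return math.inf
        total += px * math.log(px / qx)
    return total
```

For example, with $P = (1/2, 1/2)$ and $Q = (1/4, 3/4)$ the value is $\frac{1}{2} \log \frac{4}{3}$ nats.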
  • 12. Construction of Measure $\nu^n$
    $Q_k^n(a^n)$, $a^n \in A_k^n$; $\eta^n$: $\mu^n \ll \eta^n$ ($\eta^n = \lambda^n \implies$ Ryabko).
    For each $(D_1, \dots, D_n) \in \mathcal{B}^n$,
    $\nu_k^n(D_1, \dots, D_n) := \sum_{a_1, \dots, a_n \in A_k} \frac{\eta^n(a_1 \cap D_1, \dots, a_n \cap D_n)}{\eta^n(a_1, \dots, a_n)} Q_k^n(a_1, \dots, a_n)$
    $\left( \iff \frac{d\nu_k^n}{d\eta^n} := \frac{Q_k^n(a_1, \dots, a_n)}{\eta^n(a_1, \dots, a_n)} \right)$.
    $\{\omega_k\}_{k=0}^\infty$: $\sum_{k=0}^\infty \omega_k = 1$, $\omega_k > 0$,
    $\nu^n(D_1, \dots, D_n) := \sum_{k=0}^\infty \omega_k \nu_k^n(D_1, \dots, D_n)$.
  • 13. A Generalized Universal Coding
    $\mu_k^n(D_1, \dots, D_n) := \sum_{a_1, \dots, a_n \in A_k} \frac{\eta^n(a_1 \cap D_1, \dots, a_n \cap D_n)}{\eta^n(a_1, \dots, a_n)} P_k^n(a_1, \dots, a_n)$
    $D(\mu \| \nu) := \lim_{n \to \infty} \int d\mu(x^n) \log \frac{d\mu}{d\nu}(x_n \mid x_1, \dots, x_{n-1})$
    Theorem: if we choose $\{A_k\}_{k=1}^\infty$ such that $D(\mu_k \| \eta) \to D(\mu \| \eta)$ ($k \to \infty$), there exists $\nu^n$ (with $\int_{x^n \in X^n(\Omega)} d\nu^n(x^n) \le 1$) such that, for all $\mu^n$, with probability one,
    $\frac{1}{n} \log \frac{d\mu^n}{d\nu^n}(x_1, \dots, x_n) \to 0$.
  • 14. An Example not realized by the existing Universal Coding
    $X(\Omega) := \mathbb{N} = \{1, 2, \dots\}$, $\eta(j) = \frac{1}{j} - \frac{1}{j+1}$, $j \in \mathbb{N}$.
    $A_1 := \{\{1\}, \mathbb{N} \setminus \{1\}\}$
    $A_2 := \{\{1\}, \{2\}, \mathbb{N} \setminus \{1, 2\}\}$
    $\dots$
    $A_k := \{\{1\}, \{2\}, \dots, \{k\}, \mathbb{N} \setminus \{1, \dots, k\}\}$
    $\dots$
    $Q_k^n(s_k(x^n))$: $\frac{1}{n} \log \frac{P_k^n(s_k(x^n))}{Q_k^n(s_k(x^n))} \to 0$ as $n \to \infty$.
    The probability of $j \in \mathbb{N} \setminus \{1, \dots, k\}$ is to be proportional to $\eta(j) = \frac{1}{j} - \frac{1}{j+1}$.
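The reference measure in this example telescopes, so the mass of the merged cell and the share given to each integer inside it can be checked exactly. The sketch below uses exact rational arithmetic; the function names `eta`, `tail_mass` and `tail_share` are hypothetical, and reading "proportional to $\eta(j)$" as $\eta(j) / \eta(\mathbb{N} \setminus \{1, \dots, k\})$ is an interpretation.

```python
from fractions import Fraction


def eta(j):
    """Reference measure eta(j) = 1/j - 1/(j+1) on the positive integers."""
    return Fraction(1, j) - Fraction(1, j + 1)


def tail_mass(k):
    """eta of the merged cell N minus {1, ..., k}; telescoping gives 1/(k+1)."""
    return 1 - sum(eta(j) for j in range(1, k + 1))


def tail_share(j, k):
    """Share of the merged cell's probability assigned to j > k."""
    if j <= k:
        raise ValueError("j must lie in the merged cell, i.e. j > k")
    return eta(j) / tail_mass(k)
```

With this reading, a probability $Q_k$ assigned to the merged cell is split so that each $j > k$ receives $Q_k \cdot \frac{k+1}{j(j+1)}$, and these shares sum to $Q_k$.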
  • 15. Case Study 1: Markov Order Estimation
    The Markov order: for each $n = 1, 2, \dots$, the minimum $s$ such that
    $\{X_j\}_{j=n}^\infty \perp\!\!\!\perp \{X_j\}_{j=1}^{n-s-1} \mid \{X_j\}_{j=n-s}^{n-1}$.
    $\{X_i\}_{i=1}^n \sim P^n[s]$: Markov with order $s$; $\pi[s]$: the prior probability of order $s$.
    If $X_i(\Omega) = A$ with $|A| < \infty$:
    1. For each $s = 0, 1, \dots$, we estimate $Q^n[s]$ such that
       ▶ $\sum_{x^n \in A^n} Q^n[s](x^n) \le 1$
       ▶ $\frac{1}{n} \log \frac{P^n[s](x^n)}{Q^n[s](x^n)} \to 0$
    2. Given a sequence $x^n$, we choose $s$ maximizing $\pi[s] Q^n[s](x^n)$ ($\iff$ minimizing $-\log \pi[s] - \log Q^n[s](x^n)$).
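The two-step procedure for a finite alphabet can be sketched as follows. The concrete choices are assumptions for illustration, not the author's construction: $Q^n[s]$ is a Krichevsky-Trofimov estimator run separately for each context of length $s$ (with the first $s$ symbols coded uniformly), the prior is $\pi[s] \propto 2^{-(s+1)}$ truncated at `max_order`, and symbols are integers in `range(m)`. The names `log_q_markov` and `estimate_order` are hypothetical.

```python
import math
from collections import defaultdict


def log_q_markov(xs, s, m):
    """log Q^n[s](x^n): per-context KT estimator over an alphabet of size m."""
    counts = defaultdict(lambda: [0] * m)
    total = 0.0
    for t, x in enumerate(xs):
        if t < s:
            total -= math.log(m)  # no full context yet: code uniformly
            continue
        c = counts[tuple(xs[t - s:t])]
        total += math.log((c[x] + 0.5) / (sum(c) + m / 2))
        c[x] += 1
    return total


def estimate_order(xs, m, max_order=3):
    """Choose s maximizing log pi[s] + log Q^n[s](x^n)."""
    def score(s):
        log_prior = -(s + 1) * math.log(2)  # assumed prior, unnormalized
        return log_prior + log_q_markov(xs, s, m)
    return max(range(max_order + 1), key=score)
```

Because each factor is a conditional probability, $Q^n[s]$ sums to one over $A^n$ and so satisfies the Kraft-type constraint. Truncating the prior changes only its normalizing constant, which does not affect the maximizer. On a strictly alternating binary sequence the rule selects order 1, and on a constant sequence it selects order 0.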
  • 16. Case Study 1: Markov Order Estimation (cont'd)
    In general, in the neighborhood $\Delta x^n$ of $x^n$, we choose $s$ maximizing $\pi[s] \nu^n[s](\Delta x^n)$ ($\iff$ minimizing $-\log \pi[s] - \log \nu^n[s](\Delta x^n)$).
    Decision Rule:
    1. Construct $\nu^n[s]$ for each $s = 0, 1, \dots$ such that
       ▶ $\sum_{x^n \in A^n} \nu^n[s](x^n) \le 1$
       ▶ $\frac{1}{n} \log \frac{d\mu^n[s]}{d\nu^n[s]}(x^n) \to 0$
    2. Given a sequence $x^n$,
       $\frac{\pi[s]}{\pi[s']} \cdot \frac{d\nu^n[s]}{d\nu^n[s']}(x^n) > 1 \iff s$ is better than $s'$.
    The ratios of probabilities and of density functions are Radon-Nikodym derivatives in the general setting.
  • 17. Case Study 2: Discrete and Continuous Features are mixed in Pattern Recognition
    $S^*$: finite set; $\{X_k\}_{k \in S^*}$, $Y$: random variables.
    $x^n := \{(x_{i,k})_{k \in S^*}\}_{i=1}^n$, $y^n := \{y_i\}_{i=1}^n$: examples.
    Finite case: choose $S \subseteq S^*$ maximizing
    $R(x^n, y^n, S) := \pi(S) \frac{Q^n[x^n, y^n \mid S]}{Q^n[x^n \mid S]}$.
    General case: choose $S \subseteq S^*$ maximizing
    $R(x^n, \Delta y^n, S) := \pi(S) \frac{d\nu^n[\Delta x^n, \Delta y^n \mid S]}{d\nu^n[\Delta x^n \mid S]}(x^n)$,
    $\frac{dR(x^n, \Delta y^n, S)}{dR(x^n, \Delta y^n, S')}(y^n) > 1 \iff S$ is better than $S'$.
    Conditional probability of $Y$ given $X$:
    $\mu(Y \in D \mid X = x) := f(Y \in D \mid x) = \frac{d\mu(X \in \Delta x, Y \in D)}{d\mu(X \in \Delta x)}$.
  • 18. Contribution
    New Theory:
    Universal coding without assuming that the source is discrete or continuous.
    The MDL principle without assuming that the source is discrete or continuous.
    Applications (previously, the discrete and continuous cases were treated separately):
    Markov order estimation (continuous data sequences).
    Feature selection (discrete and continuous features are mixed).
    BN structure estimation (discrete and continuous random variables are mixed).
    Future Work:
    Computation.
    Applications.