Alessandro Ortis
Università degli Studi di Catania
Dipartimento di Matematica e Informatica
Image Processing Lab - iplab.dmi.unict.it
Sufficient statistics
Parameter estimation: given a sample X = (x1, x2, …, xn)
from a population with pdf P(X|θ), we try to infer θ
from the information contained in X.
A. Ortis – Sufficient statistics
Could it be useful to find a reduced representation of X
by means of a function T(X)?
Ex:
X            T(X) = mean(X)
[4, 5, 6]    5
[5, 5, 5]    5
[3, 5, 7]    5
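As a minimal sketch of the table above (the variable names are illustrative and not part of the slides), the three samples all collapse to the same value of the sample mean:

```python
from statistics import mean

# The three example samples from the table above.
samples = [(4, 5, 6), (5, 5, 5), (3, 5, 7)]

# T(X) = mean(X): each 3-dimensional sample is reduced to a single number.
reduced = [mean(x) for x in samples]
print(reduced)
```

Three different samples map to one value of T(X), which is what raises the question of whether information about θ has been lost.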
T(X) = 5 could come from any of
[4, 5, 6]
[5, 5, 5]
[3, 5, 7]
...
Is there any loss of information? Have we lost useful
data, or is the representation given by T(X) enough to
infer the same information about θ contained in X?
Is it sufficient to consider only the reduced data T(X)?
Def.
A statistic T(X) is sufficient for θ if the conditional
distribution P(X|T(X)) is not a function of θ.
Example: Let (x1, x2, …, xn) be a random sample of n Bernoulli(p)
trials
$$ x_i = \begin{cases} 1 & \text{with prob. } p \\ 0 & \text{with prob. } 1 - p \end{cases} $$
Can we find a sufficient statistic for p?
Considering the definition of sufficiency, can we find a function
T(X) such that P(X|T(X)) does not depend on p?
(solution in the next slide...)
If $T(X) = \sum_i X_i = t$, then for any sample X with t ones we have
$$ P(X \mid T(X) = t) = \frac{1}{\binom{n}{t}} $$
This conditional distribution does not depend on p!
Once the value of T(X) is known, no other function of X
will provide any additional information about p.
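The result can be checked numerically by brute force. The sketch below is an illustration, not part of the original slides: the helper name `conditional_probs` and the chosen values of p are assumptions. It enumerates all binary samples of length n and computes P(X | T(X)) directly from the joint and marginal probabilities.

```python
from itertools import product
from math import comb

def conditional_probs(n, p):
    """Return {x: P(X = x | T(X) = sum(x))} for n Bernoulli(p) trials, 0 < p < 1."""
    # Joint probability of each binary sequence x.
    joint = {x: p ** sum(x) * (1 - p) ** (n - sum(x))
             for x in product((0, 1), repeat=n)}
    # Marginal probability of each value t of T(X) = sum of the x_i.
    marginal = {}
    for x, prob in joint.items():
        marginal[sum(x)] = marginal.get(sum(x), 0.0) + prob
    return {x: prob / marginal[sum(x)] for x, prob in joint.items()}

n = 3
for p in (0.2, 0.5, 0.9):
    cond = conditional_probs(n, p)
    # Every sequence with t ones has conditional probability 1 / C(n, t).
    assert all(abs(c - 1 / comb(n, sum(x))) < 1e-12 for x, c in cond.items())
    print(p, cond[(0, 1, 1)])
```

The printed conditional probability of the sample (0, 1, 1) is the same for every p, up to floating-point rounding.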
A sufficient statistic T(X) reduces X in two senses:
1) It reduces the dimensionality of the data
2) T(X) takes fewer possible values than X
A statistic T(X) induces a partition on the sample space.
Given a value t, we can define the subset
$$ A_t = \{ X : T(X) = t \} $$
Bernoulli population with n=3, the sample space of X is
{ (0,0,0), (0,0,1), (0,1,0), (0,1,1), (1,0,0),
  (1,0,1), (1,1,0), (1,1,1) }
t    Induced subset
0    { (0,0,0) }
1    { (0,0,1), (0,1,0), (1,0,0) }
2    { (0,1,1), (1,1,0), (1,0,1) }
3    { (1,1,1) }
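The induced partition can be reproduced with a few lines of code. This is a sketch under the assumption that T(X) is the sum of the observations, as in the Bernoulli example; the function name `induced_partition` is hypothetical.

```python
from itertools import product
from collections import defaultdict

def induced_partition(n):
    """Group all binary samples of length n by the value of T(X) = sum(X)."""
    partition = defaultdict(list)
    for x in product((0, 1), repeat=n):
        partition[sum(x)].append(x)
    return dict(partition)

partition = induced_partition(3)
for t in sorted(partition):
    print(t, partition[t])
```

For n = 3 the 2^3 = 8 points of the sample space are grouped into n + 1 = 4 subsets, one for each possible value of t.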
Theorem:
T(X) is a sufficient statistic for θ if and only if the likelihood
factorizes into the following form
L(x1, x2, … xn | θ ) = g( θ, T(x1, x2, … xn))·h(x1, x2, … xn)
θ and X interact only via T(X)
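As a worked illustration (not taken from the slides), the factorization can be written out for the Bernoulli sample considered earlier, with $t = \sum_i x_i$:

```latex
L(x_1, x_2, \dots, x_n \mid p)
  = \prod_{i=1}^{n} p^{x_i} (1 - p)^{1 - x_i}
  = \underbrace{p^{t} (1 - p)^{n - t}}_{g(p,\, T(x_1, \dots, x_n))}
    \cdot \underbrace{1}_{h(x_1, \dots, x_n)},
\qquad t = T(x_1, \dots, x_n) = \sum_{i=1}^{n} x_i
```

Here p enters the likelihood only through t, so the theorem gives the sufficiency of the sum without computing the conditional distribution.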
Def.
T is a minimal sufficient statistic if the following statements
are true:
1. T is sufficient
2. If S is any other sufficient statistic then T = g(S) for some
function g
In other words, T generates the coarsest sufficient partition.
A minimal sufficient statistic is the smallest sufficient
statistic and therefore it represents the ultimate data
reduction with respect to estimating θ. In general, it may or
may not exist.
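Theorem:
T(X) is a minimal sufficient statistic if, for every pair of samples x and y,
$$ \frac{P(x_1, x_2, \dots, x_n \mid \theta)}{P(y_1, y_2, \dots, y_n \mid \theta)} \text{ is not a function of } \theta \iff T(x_1, x_2, \dots, x_n) = T(y_1, y_2, \dots, y_n) $$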
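The criterion can be explored numerically for the Bernoulli case. The sketch below is an illustration only: the function names and the grid of p values are assumptions, and checking a finite grid suggests, but does not prove, that the ratio is constant.

```python
from math import isclose

def bernoulli_likelihood(x, p):
    """P(x | p) for a sequence x of independent Bernoulli(p) trials."""
    t = sum(x)
    return p ** t * (1 - p) ** (len(x) - t)

def ratio_is_constant(x, y, ps=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Check on a grid of p values whether P(x | p) / P(y | p) stays the same."""
    ratios = [bernoulli_likelihood(x, p) / bernoulli_likelihood(y, p) for p in ps]
    return all(isclose(r, ratios[0]) for r in ratios)

# Two samples with the same sum, then two samples with different sums.
print(ratio_is_constant((0, 1, 1), (1, 1, 0)))
print(ratio_is_constant((0, 1, 1), (1, 0, 0)))
```

When the sums agree the ratio is 1 for every p; when they differ the ratio is a power of p / (1 - p), which changes with p. This is consistent with the sum being minimal sufficient for p.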
• A minimal sufficient statistic T(X) may not exist
• If it exists, it is not unique
• Any 1-1 function of a sufficient statistic which does
not depend on θ is also a sufficient statistic
• All we considered so far on sufficiency can easily be
extended to accommodate two (or more)
parameters.
Example: let (x1, x2, …, xn) be N(μ, σ²) observations.
Let θ1 = μ and θ2 = σ²; we have that
$$ T(X) = \left( \sum_i X_i, \; \sum_i X_i^2 \right) $$
$$ T(X) = ( \bar{X}, S^2 ) $$
are both minimal sufficient statistics for N(μ, σ²).
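The two pairs carry the same information because each can be computed from the other when n is known. The sketch below illustrates this; it assumes that S² denotes the sample variance with the n - 1 denominator, which the slides do not state, and the function names are hypothetical.

```python
def sums_to_mean_var(n, s1, s2):
    """Map (sum of x_i, sum of x_i^2) to (sample mean, sample variance S^2)."""
    mean = s1 / n
    var = (s2 - n * mean ** 2) / (n - 1)  # assumes the n - 1 denominator and n > 1
    return mean, var

def mean_var_to_sums(n, mean, var):
    """Inverse map: recover (sum of x_i, sum of x_i^2) from (mean, S^2)."""
    s1 = n * mean
    s2 = (n - 1) * var + n * mean ** 2
    return s1, s2

x = [3.0, 5.0, 7.0]
n = len(x)
s1, s2 = sum(x), sum(v * v for v in x)

mean, var = sums_to_mean_var(n, s1, s2)
print(mean, var)
print(mean_var_to_sums(n, mean, var))
```

Since the map between the two pairs is one-to-one for fixed n, either pair can serve as the statistic.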