Sufficient Statistics
Guy Lebanon
May 2, 2006
A sufficient statistic with respect to θ is a statistic T(X1, . . . , Xn) that contains all the information that
is useful for the estimation of θ. It is a useful data reduction tool, and studying its properties leads to other
useful results.
Definition 1. A statistic T is sufficient for θ if p(x1, . . . , xn|T(x1, . . . , xn)) is not a function of θ.
A useful way to visualize it is as a Markov chain θ → T(X1, . . . , Xn) → {X1, . . . , Xn} (although in
classical statistics θ is not a random variable but a specific value). Conditioned on the middle part of the
chain, the front and back are independent.
As mentioned above, the intuition behind the sufficient statistic concept is that it contains all the information
necessary for estimating θ. Therefore if one is interested in estimating θ, it is perfectly fine to ‘get rid’
of the original data while keeping only the value of the sufficient statistic. The motivation connects to the
formal definition by considering the concept of sampling a ghost sample: Consider a statistician who erased
the original data, but kept the sufficient statistic. Since p(x1, . . . , xn|T(x1, . . . , xn)) is not a function of θ
(which is unknown), we assume that it is a known distribution. The statistician can then sample x1, . . . , xn
from that conditional distribution, and that ghost sample can be used in lieu of the original data that was
thrown away.
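The ghost-sample idea can be sketched in Python under one added assumption: the data are iid Bernoulli and the retained statistic is the sum of the observations, which is a standard sufficient statistic for that model. Given the sum, every arrangement of the ones and zeros is equally likely, so the conditional distribution can be sampled without knowing θ. The function name `ghost_sample` and the simulation values are hypothetical, chosen only for illustration.

```python
import random

def ghost_sample(n, total, rng):
    """Sample x_1..x_n from p(x | sum(x) = total) for iid Bernoulli data.

    Given the sum, every arrangement of `total` ones among n positions is
    equally likely, so theta is not needed to draw the sample.
    """
    sample = [1] * total + [0] * (n - total)
    rng.shuffle(sample)  # uniformly random arrangement of the ones and zeros
    return sample

rng = random.Random(0)
theta = 0.3  # used only to simulate the original data (assumed value)
original = [1 if rng.random() < theta else 0 for _ in range(20)]
t = sum(original)  # the statistician keeps only this value
ghost = ghost_sample(len(original), t, rng)
```

The ghost sample generally differs from the original data position by position, but it has the same value of the statistic and the same distribution as the original sample.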
The definition of sufficient statistic is very hard to verify. A much easier way to find sufficient statistics
is through the factorization theorem.
Definition 2. Let $X_1, \dots, X_n$ be iid RVs whose distribution is the pdf $f_{X_i}$ or the pmf $p_{X_i}$. The likelihood
function is the product of the pdfs or pmfs
\[
L(x_1, \dots, x_n|\theta) =
\begin{cases}
\prod_{i=1}^n f_{X_i}(x_i) & X_i \text{ is a continuous RV} \\
\prod_{i=1}^n p_{X_i}(x_i) & X_i \text{ is a discrete RV}
\end{cases}.
\]
The likelihood function is sometimes viewed as a function of x1, . . . , xn (fixing θ) and sometimes as a
function of θ (fixing x1, . . . , xn). In the latter case, the likelihood is sometimes denoted L(θ).
Theorem 1 (Factorization Theorem). T is a sufficient statistic for θ if and only if the likelihood factorizes into the
following form
L(x1, . . . , xn|θ) = g(θ, T(x1, . . . , xn)) · h(x1, . . . , xn)
for some functions g, h.
Proof. We prove the theorem only for the discrete case (the continuous case requires different techniques).
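First assume the likelihood factorizes as above. Then
\[
p(x_1, \dots, x_n|T(x_1, \dots, x_n))
= \frac{p(x_1, \dots, x_n, T(x_1, \dots, x_n))}{p(T(x_1, \dots, x_n))}
= \frac{p(x_1, \dots, x_n)}{\sum_{y:T(y)=T(x)} p(y_1, \dots, y_n)}
= \frac{h(x_1, \dots, x_n)}{\sum_{y:T(y)=T(x)} h(y_1, \dots, y_n)}
\]
which is not a function of θ. The last equality holds because every y in the sum satisfies T(y) = T(x), so the common factor g(θ, T(x)) cancels between numerator and denominator. Conversely, assume that T is a sufficient statistic for θ. Then
\[
L(x_1, \dots, x_n|\theta) = p(x_1, \dots, x_n|T(x_1, \dots, x_n), \theta)\, p(T(x_1, \dots, x_n)|\theta) = h(x_1, \dots, x_n)\, g(T(x_1, \dots, x_n), \theta),
\]
where the first factor does not depend on θ by sufficiency and therefore serves as h.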
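Example: A sufficient statistic for Ber(θ) is $\sum_i X_i$ since
\[
L(x_1, \dots, x_n|\theta) = \prod_i \theta^{x_i} (1 - \theta)^{1 - x_i} = \theta^{\sum_i x_i} (1 - \theta)^{n - \sum_i x_i} = g\Big(\theta, \sum_i x_i\Big) \cdot 1.
\]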
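As a quick numerical check of the Bernoulli factorization, the following Python sketch compares the likelihood of two different samples that share the same sum. The helper names `bernoulli_likelihood` and `g` and the sample values are illustrative assumptions, not part of the original notes.

```python
import math

def bernoulli_likelihood(xs, theta):
    """Product of theta^x * (1 - theta)^(1 - x) over the sample."""
    return math.prod(theta**x * (1 - theta) ** (1 - x) for x in xs)

def g(theta, t, n):
    """The factor g(theta, T) with T = sum of the sample; here h = 1."""
    return theta**t * (1 - theta) ** (n - t)

a = [1, 0, 0, 1, 1]  # sum is 3
b = [0, 1, 1, 1, 0]  # different sample, same sum
thetas = [0.1, 0.5, 0.9]

# The two likelihoods agree for every theta because they share the same T.
same = all(
    math.isclose(bernoulli_likelihood(a, th), bernoulli_likelihood(b, th))
    for th in thetas
)
```

Here `same` evaluates to `True`: the likelihood sees the data only through the sum.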
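Example: A sufficient statistic for the uniform distribution U([0, θ]) is $\max(X_1, \dots, X_n)$ since
\[
L(x_1, \dots, x_n|\theta) = \prod_i \frac{1}{\theta} \cdot 1_{\{0 \le x_i \le \theta\}} = \theta^{-n} \cdot 1_{\{\max(x_1,\dots,x_n) \le \theta\}} \cdot 1_{\{\min(x_1,\dots,x_n) \ge 0\}} = g(\theta, \max(x_1, \dots, x_n))\, h(x_1, \dots, x_n).
\]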
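The uniform factorization can be checked numerically with a small Python sketch. The function names and the sample are hypothetical; `g` carries the dependence on θ through the maximum, and `h` is the indicator that the minimum is non-negative.

```python
def uniform_likelihood(xs, theta):
    """Likelihood of an iid U([0, theta]) sample, computed term by term."""
    if all(0 <= x <= theta for x in xs):
        return theta ** (-len(xs))
    return 0.0

def g(theta, m, n):
    """Depends on the data only through m = max(xs)."""
    return theta ** (-n) if m <= theta else 0.0

def h(xs):
    """Does not involve theta."""
    return 1.0 if min(xs) >= 0 else 0.0

xs = [0.2, 1.7, 0.9]
thetas = [1.0, 1.7, 2.5]
pairs = [(uniform_likelihood(xs, th), g(th, max(xs), len(xs)) * h(xs)) for th in thetas]
```

Each pair in `pairs` holds two equal numbers; for θ = 1.0 both are zero because the maximum 1.7 exceeds θ.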
In the case that θ is a vector rather than a scalar, the sufficient statistic may be a vector as well. In this
case we say that the sufficient statistic vector is jointly sufficient for the parameter vector θ. The definitions
and factorization theorem carry over with little change.
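Example: $T = (\sum_i X_i, \sum_i X_i^2)$ are jointly sufficient statistics for $\theta = (\mu, \sigma^2)$ for normally distributed data
$X_1, \dots, X_n \sim N(\mu, \sigma^2)$:
\[
\begin{aligned}
L(x_1, \dots, x_n|\theta) &= \prod_i \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x_i-\mu)^2/(2\sigma^2)} = (2\pi\sigma^2)^{-n/2}\, e^{-\sum_i (x_i-\mu)^2/(2\sigma^2)} \\
&= (2\pi\sigma^2)^{-n/2}\, e^{-\sum_i x_i^2/(2\sigma^2) + 2\mu \sum_i x_i/(2\sigma^2) - n\mu^2/(2\sigma^2)} \\
&= g(\theta, T) \cdot 1 = g(\theta, T) \cdot h(x_1, \dots, x_n).
\end{aligned}
\]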
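To illustrate that the normal likelihood needs only the two sums (together with the known sample size n), here is a minimal Python sketch that evaluates the log-likelihood once from the raw data and once from T alone. The function names and data are assumptions for illustration; the constant term of the expanded exponent is written as nµ²/(2σ²), since µ² is summed over all n observations.

```python
import math

def loglik_direct(xs, mu, sigma2):
    """Log-likelihood of an iid N(mu, sigma2) sample from the raw data."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma2) - (x - mu) ** 2 / (2 * sigma2)
        for x in xs
    )

def loglik_from_T(n, s1, s2, mu, sigma2):
    """Same quantity using only n, s1 = sum(x) and s2 = sum(x^2)."""
    return (
        -0.5 * n * math.log(2 * math.pi * sigma2)
        - s2 / (2 * sigma2)
        + mu * s1 / sigma2
        - n * mu**2 / (2 * sigma2)
    )

xs = [1.2, -0.4, 2.5, 0.7]
s1 = sum(xs)
s2 = sum(x * x for x in xs)
direct = loglik_direct(xs, 0.5, 2.0)
from_T = loglik_from_T(len(xs), s1, s2, 0.5, 2.0)
```

The two values agree up to floating-point rounding, so the raw data can be discarded once `s1` and `s2` are stored.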
Clearly, sufficient statistics are not unique. From the factorization theorem it is easy to see that (i)
the identity function T(x1, . . . , xn) = (x1, . . . , xn) is a sufficient statistic vector and (ii) if T is a sufficient
statistic for θ then so is any 1-1 function of T. A function that is not 1-1 of a sufficient statistic may or may
not be a sufficient statistic. This leads to the notion of a minimal sufficient statistic.
Definition 3. A statistic that is a sufficient statistic and that is a function of all other sufficient statistics
is called a minimal sufficient statistic.
In a sense, a minimal sufficient statistic is the smallest sufficient statistic and therefore it represents the
ultimate data reduction with respect to estimating θ. In general, it may or may not exist.
Example: Since $T = (\sum_i X_i, \sum_i X_i^2)$ are jointly sufficient statistics for $\theta = (\mu, \sigma^2)$ for normally distributed
data $X_1, \dots, X_n \sim N(\mu, \sigma^2)$, then so are $(\bar{X}, S^2)$, which are a 1-1 function of $(\sum_i X_i, \sum_i X_i^2)$.
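The 1-1 correspondence between the two sums and the sample mean and variance can be made explicit in Python. This sketch assumes that S² denotes the unbiased sample variance with denominator n − 1, which the notes do not state; the function names are hypothetical.

```python
import statistics

def to_mean_var(n, s1, s2):
    """Map (sum x, sum x^2) to (sample mean, sample variance with n - 1)."""
    mean = s1 / n
    var = (s2 - n * mean**2) / (n - 1)
    return mean, var

def to_sums(n, mean, var):
    """Inverse map: recover (sum x, sum x^2) from (mean, variance)."""
    s1 = n * mean
    s2 = (n - 1) * var + n * mean**2
    return s1, s2

xs = [2.0, 3.5, 1.0, 4.5, 3.0]
n = len(xs)
s1, s2 = sum(xs), sum(x * x for x in xs)
mean, var = to_mean_var(n, s1, s2)
back = to_sums(n, mean, var)
```

Because each map undoes the other, neither pair carries more information than the other.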
The following theorem provides a way of verifying that a sufficient statistic is minimal.
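Theorem 2. T is a minimal sufficient statistic if, for all samples $x_1, \dots, x_n$ and $y_1, \dots, y_n$,
\[
\frac{L(x_1, \dots, x_n|\theta)}{L(y_1, \dots, y_n|\theta)} \text{ is not a function of } \theta \;\Leftrightarrow\; T(x_1, \dots, x_n) = T(y_1, \dots, y_n).
\]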
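Proof. First we show that T is a sufficient statistic. For each element t in the range of T, fix a sample
$y^t_1, \dots, y^t_n$ with $T(y^t_1, \dots, y^t_n) = t$. For arbitrary $x_1, \dots, x_n$ denote $T(x_1, \dots, x_n) = t$ and
\[
L(x_1, \dots, x_n|\theta) = \frac{L(x_1, \dots, x_n|\theta)}{L(y^t_1, \dots, y^t_n|\theta)}\, L(y^t_1, \dots, y^t_n|\theta) = h(x_1, \dots, x_n)\, g(T(x_1, \dots, x_n), \theta).
\]
The ratio is not a function of θ because x and $y^t$ share the same value of T, so it serves as h; the second factor depends on x only through t = T(x), so it serves as g.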
We show that T is a function of some other arbitrary sufficient statistic $T'$. Let x, y be such that $T'(x_1, \dots, x_n) = T'(y_1, \dots, y_n)$. Since
\[
\frac{L(x_1, \dots, x_n|\theta)}{L(y_1, \dots, y_n|\theta)}
= \frac{g'(T'(x_1, \dots, x_n), \theta)\, h'(x_1, \dots, x_n)}{g'(T'(y_1, \dots, y_n), \theta)\, h'(y_1, \dots, y_n)}
= \frac{h'(x_1, \dots, x_n)}{h'(y_1, \dots, y_n)}
\]
is independent of θ, $T(x_1, \dots, x_n) = T(y_1, \dots, y_n)$. Hence $T'(x) = T'(y)$ implies $T(x) = T(y)$, and T is a function of $T'$.
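Example: $T = (\sum_i X_i, \sum_i X_i^2)$ is a minimal sufficient statistic for the Normal distribution since the
likelihood ratio is not a function of θ iff T(x) = T(y):
\[
\frac{L(x_1, \dots, x_n|\theta)}{L(y_1, \dots, y_n|\theta)}
= e^{-\frac{1}{2\sigma^2} \sum_i \left[(x_i-\mu)^2 - (y_i-\mu)^2\right]}
= e^{-\frac{1}{2\sigma^2}\left(\sum_i x_i^2 - \sum_i y_i^2\right) + \frac{\mu}{\sigma^2}\left(\sum_i x_i - \sum_i y_i\right)}.
\]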
Since $(\bar{X}, S^2)$ is a 1-1 function of T, it is a minimal sufficient statistic as well.
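The likelihood-ratio criterion for the normal model can be probed numerically. In this Python sketch (all names and sample values are illustrative assumptions), `x` and `y` are different samples with the same sum and the same sum of squares, while `z` shares only the sum with `x`.

```python
import math

def loglik(xs, mu, sigma2):
    """Log-likelihood of an iid N(mu, sigma2) sample."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma2) - (x - mu) ** 2 / (2 * sigma2)
        for x in xs
    )

def log_ratio(xs, ys, mu, sigma2):
    """Logarithm of L(x | theta) / L(y | theta)."""
    return loglik(xs, mu, sigma2) - loglik(ys, mu, sigma2)

x = [0.0, 3.0, 3.0]  # sum 6, sum of squares 18
y = [1.0, 1.0, 4.0]  # sum 6, sum of squares 18: same T as x
z = [1.0, 2.0, 3.0]  # sum 6, sum of squares 14: different T

params = [(0.0, 1.0), (2.0, 0.5), (-1.0, 4.0)]  # (mu, sigma2) pairs
same_T = [log_ratio(x, y, mu, s2) for mu, s2 in params]
diff_T = [log_ratio(x, z, mu, s2) for mu, s2 in params]
```

The entries of `same_T` are all zero up to rounding, so the ratio is constant in θ, whereas the entries of `diff_T` change with σ², as the closed-form ratio predicts.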
MCQ Soil mechanics questions (Soil shear strength).pdf
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
 
Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
 

Sufficiency

  • 1. Sufficient Statistics
Guy Lebanon
May 2, 2006

A sufficient statistic with respect to θ is a statistic $T(X_1, \ldots, X_n)$ that contains all the information that is useful for the estimation of θ. It is useful as a data reduction tool, and studying its properties leads to other useful results.

Definition 1. A statistic $T$ is sufficient for θ if $p(x_1, \ldots, x_n \mid T(x_1, \ldots, x_n))$ is not a function of θ.

A useful way to visualize it is as a Markov chain $\theta \to T(X_1, \ldots, X_n) \to \{X_1, \ldots, X_n\}$ (although in classical statistics θ is not a random variable but a specific value). Conditioned on the middle part of the chain, the front and back are independent.

As mentioned above, the intuition behind the sufficient statistic concept is that it contains all the information necessary for estimating θ. Therefore, if one is interested in estimating θ, it is perfectly fine to 'get rid' of the original data while keeping only the value of the sufficient statistic. The motivation connects to the formal definition through the concept of sampling a ghost sample: consider a statistician who erased the original data but kept the sufficient statistic. Since $p(x_1, \ldots, x_n \mid T(x_1, \ldots, x_n))$ is not a function of θ (which is unknown), we assume that it is a known distribution. The statistician can then sample $x_1, \ldots, x_n$ from that conditional distribution, and that ghost sample can be used in lieu of the original data that was thrown away.

The definition of a sufficient statistic is very hard to verify directly. A much easier way to find sufficient statistics is through the factorization theorem.

Definition 2. Let $X_1, \ldots, X_n$ be iid RVs whose distribution is the pdf $f_{X_i}$ or the pmf $p_{X_i}$. The likelihood function is the product of the pdfs or pmfs
\[
L(x_1, \ldots, x_n \mid \theta) =
\begin{cases}
\prod_{i=1}^n f_{X_i}(x_i) & X_i \text{ is a continuous RV} \\
\prod_{i=1}^n p_{X_i}(x_i) & X_i \text{ is a discrete RV.}
\end{cases}
\]
The likelihood function is sometimes viewed as a function of $x_1, \ldots, x_n$ (fixing θ) and sometimes as a function of θ (fixing $x_1, \ldots, x_n$). In the latter case, the likelihood is sometimes denoted $L(\theta)$.

Theorem 1 (Factorization Theorem). $T$ is a sufficient statistic for θ if and only if the likelihood factorizes into the following form
\[
L(x_1, \ldots, x_n \mid \theta) = g(\theta, T(x_1, \ldots, x_n)) \cdot h(x_1, \ldots, x_n)
\]
for some functions $g, h$.

Proof. We prove the theorem only for the discrete case (the continuous case requires different techniques). First assume the likelihood factorizes as above. Then
\[
p(x_1, \ldots, x_n \mid T(x_1, \ldots, x_n))
= \frac{p(x_1, \ldots, x_n, T(x_1, \ldots, x_n))}{p(T(x_1, \ldots, x_n))}
= \frac{p(x_1, \ldots, x_n)}{\sum_{y : T(y) = T(x)} p(y_1, \ldots, y_n)}
= \frac{h(x_1, \ldots, x_n)}{\sum_{y : T(y) = T(x)} h(y_1, \ldots, y_n)},
\]
which is not a function of θ. The last equality holds because the factor $g(\theta, T(y))$ equals $g(\theta, T(x))$ for every $y$ in the sum and therefore cancels.

Conversely, assume that $T$ is a sufficient statistic for θ. Then
\[
L(x_1, \ldots, x_n \mid \theta)
= p(x_1, \ldots, x_n \mid T(x_1, \ldots, x_n), \theta) \, p(T(x_1, \ldots, x_n) \mid \theta)
= h(x_1, \ldots, x_n) \, g(T(x_1, \ldots, x_n), \theta),
\]
where the first factor does not depend on θ by sufficiency and the second depends on the data only through $T$.
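The definition of sufficiency can be checked numerically on a small discrete case. The sketch below is only an illustration, not part of the original notes: it takes iid Ber(θ) data with $T(x) = \sum_i x_i$ and computes $p(x \mid T(x))$ by brute-force enumeration, mirroring the first equation in the proof. The function names and the sample `x` are hypothetical choices made for this example.

```python
from itertools import product
from math import comb, isclose

def bernoulli_likelihood(x, theta):
    """Joint pmf of an iid Ber(theta) sample x (a tuple of 0/1 values)."""
    s = sum(x)
    return theta ** s * (1 - theta) ** (len(x) - s)

def conditional_given_sum(x, theta):
    """p(x | T(x)) with T(x) = sum(x), computed by enumerating all samples."""
    t = sum(x)
    # p(T = t) is the total mass of all samples y with T(y) = T(x).
    p_t = sum(
        bernoulli_likelihood(y, theta)
        for y in product((0, 1), repeat=len(x))
        if sum(y) == t
    )
    return bernoulli_likelihood(x, theta) / p_t

x = (1, 0, 1, 1, 0)  # hypothetical sample with sum 3
values = [conditional_given_sum(x, theta) for theta in (0.1, 0.5, 0.9)]

# All three values agree with 1 / C(5, 3): the conditional law is free of theta.
assert all(isclose(v, 1 / comb(5, 3)) for v in values)
```

Agreement at a few values of θ is of course not a proof; it only illustrates what the theorem guarantees for every θ.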
  • 2. Example: A sufficient statistic for Ber(θ) is $\sum_i X_i$ since
\[
L(x_1, \ldots, x_n \mid \theta) = \prod_i \theta^{x_i} (1 - \theta)^{1 - x_i}
= \theta^{\sum_i x_i} (1 - \theta)^{n - \sum_i x_i}
= g\Big(\theta, \sum_i x_i\Big) \cdot 1.
\]

Example: A sufficient statistic for the uniform distribution $U([0, \theta])$ is $\max(X_1, \ldots, X_n)$ since
\[
L(x_1, \ldots, x_n \mid \theta) = \prod_i \frac{1}{\theta} \cdot 1_{\{0 \le x_i \le \theta\}}
= \theta^{-n} \cdot 1_{\{\max(x_1, \ldots, x_n) \le \theta\}} \cdot 1_{\{\min(x_1, \ldots, x_n) \ge 0\}}
= g(\theta, \max(x_1, \ldots, x_n)) \, h(x_1, \ldots, x_n).
\]

In the case that θ is a vector rather than a scalar, the sufficient statistic may be a vector as well. In this case we say that the components of the sufficient statistic vector are jointly sufficient for the parameter vector θ. The definitions and the factorization theorem carry over with little change.

Example: $T = (\sum_i X_i, \sum_i X_i^2)$ are jointly sufficient statistics for $\theta = (\mu, \sigma^2)$ for normally distributed data $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$:
\[
L(x_1, \ldots, x_n \mid \theta) = \prod_i \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x_i - \mu)^2 / (2\sigma^2)}
= (2\pi\sigma^2)^{-n/2} e^{-\sum_i (x_i - \mu)^2 / (2\sigma^2)}
\]
\[
= (2\pi\sigma^2)^{-n/2} e^{-\sum_i x_i^2 / (2\sigma^2) + 2\mu \sum_i x_i / (2\sigma^2) - n\mu^2 / (2\sigma^2)}
= g(\theta, T) \cdot 1 = g(\theta, T) \cdot h(x_1, \ldots, x_n).
\]

Clearly, sufficient statistics are not unique. From the factorization theorem it is easy to see that (i) the identity function $T(x_1, \ldots, x_n) = (x_1, \ldots, x_n)$ is a sufficient statistic vector and (ii) if $T$ is a sufficient statistic for θ then so is any 1-1 function of $T$. A function of a sufficient statistic that is not 1-1 may or may not be a sufficient statistic. This leads to the notion of a minimal sufficient statistic.

Definition 3. A statistic that is a sufficient statistic and that is a function of all other sufficient statistics is called a minimal sufficient statistic.

In a sense, a minimal sufficient statistic is the smallest sufficient statistic and therefore it represents the ultimate data reduction with respect to estimating θ. In general, it may or may not exist.

Example: Since $T = (\sum_i X_i, \sum_i X_i^2)$ are jointly sufficient statistics for $\theta = (\mu, \sigma^2)$ for normally distributed data $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$, so are $(\bar{X}, S^2)$, which are a 1-1 function of $(\sum_i X_i, \sum_i X_i^2)$.
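The normal example can be made concrete with a short sketch. It assumes that $S^2$ denotes the unbiased sample variance with divisor $n - 1$ (the notes do not state which convention is meant) and that $n$ is known. All function names and the sample `xs` are hypothetical; they are introduced only for illustration. The code shows that the log-likelihood can be evaluated from $(\sum_i x_i, \sum_i x_i^2)$ alone, and that the map to $(\bar{x}, S^2)$ can be inverted.

```python
import math

def normal_loglik(xs, mu, sigma2):
    """Log-likelihood of an iid N(mu, sigma2) sample, using the full data."""
    n = len(xs)
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - sum((x - mu) ** 2 for x in xs) / (2 * sigma2))

def normal_loglik_from_T(n, s1, s2, mu, sigma2):
    """Same log-likelihood, using only s1 = sum(x_i) and s2 = sum(x_i^2)."""
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - (s2 - 2 * mu * s1 + n * mu ** 2) / (2 * sigma2))

def to_mean_var(n, s1, s2):
    """Map (sum x_i, sum x_i^2) to (sample mean, S^2 with divisor n - 1)."""
    mean = s1 / n
    return mean, (s2 - n * mean ** 2) / (n - 1)

def from_mean_var(n, mean, var):
    """Inverse map: recover (sum x_i, sum x_i^2) from (mean, S^2)."""
    return n * mean, (n - 1) * var + n * mean ** 2

xs = [1.0, 2.0, 4.0, 7.0]  # hypothetical data
n, s1, s2 = len(xs), sum(xs), sum(x * x for x in xs)

full = normal_loglik(xs, mu=3.0, sigma2=2.0)
reduced = normal_loglik_from_T(n, s1, s2, mu=3.0, sigma2=2.0)
mean, var = to_mean_var(n, s1, s2)
s1_back, s2_back = from_mean_var(n, mean, var)

assert math.isclose(full, reduced)
assert math.isclose(s1_back, s1) and math.isclose(s2_back, s2)
```

Because the round trip recovers $(\sum_i x_i, \sum_i x_i^2)$, nothing relevant to θ is lost when the data are summarized by $(\bar{x}, S^2)$ instead.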
The following theorem provides a way of verifying that a sufficient statistic is minimal.

Theorem 2. $T$ is a minimal sufficient statistic if
\[
\frac{L(x_1, \ldots, x_n \mid \theta)}{L(y_1, \ldots, y_n \mid \theta)} \text{ is not a function of } \theta
\quad \Leftrightarrow \quad
T(x_1, \ldots, x_n) = T(y_1, \ldots, y_n).
\]

Proof. First we show that $T$ is a sufficient statistic. For each element $t$ in the range of $T$, fix a sample $y_1^t, \ldots, y_n^t$ with $T(y_1^t, \ldots, y_n^t) = t$. For arbitrary $x_1, \ldots, x_n$ denote $T(x_1, \ldots, x_n) = t$, so that
\[
L(x_1, \ldots, x_n \mid \theta)
= \frac{L(x_1, \ldots, x_n \mid \theta)}{L(y_1^t, \ldots, y_n^t \mid \theta)} \, L(y_1^t, \ldots, y_n^t \mid \theta)
= h(x_1, \ldots, x_n) \, g(T(x_1, \ldots, x_n), \theta).
\]
The ratio is not a function of θ because $T(x) = T(y^t)$, and the second factor depends on $x$ only through $t = T(x)$.

Next we show that $T$ is a function of an arbitrary sufficient statistic $T'$. Let $x, y$ be such that $T'(x_1, \ldots, x_n) = T'(y_1, \ldots, y_n)$. Since
\[
\frac{L(x_1, \ldots, x_n \mid \theta)}{L(y_1, \ldots, y_n \mid \theta)}
= \frac{g'(T'(x_1, \ldots, x_n), \theta) \, h'(x_1, \ldots, x_n)}{g'(T'(y_1, \ldots, y_n), \theta) \, h'(y_1, \ldots, y_n)}
= \frac{h'(x_1, \ldots, x_n)}{h'(y_1, \ldots, y_n)}
\]
is independent of θ, we have $T(x_1, \ldots, x_n) = T(y_1, \ldots, y_n)$, and therefore $T$ is a function of $T'$.

Example: $T = (\sum_i X_i, \sum_i X_i^2)$ is a minimal sufficient statistic for the normal distribution since the likelihood ratio is not a function of θ iff $T(x) = T(y)$:
\[
\frac{L(x_1, \ldots, x_n \mid \theta)}{L(y_1, \ldots, y_n \mid \theta)}
= e^{-\frac{1}{2\sigma^2} \sum_i \left[ (x_i - \mu)^2 - (y_i - \mu)^2 \right]}
= e^{-\frac{1}{2\sigma^2} \left( \sum_i x_i^2 - \sum_i y_i^2 \right) + \frac{\mu}{\sigma^2} \left( \sum_i x_i - \sum_i y_i \right)}.
\]
Since $(\bar{X}, S^2)$ is a 1-1 function of $T$, it is a minimal sufficient statistic as well.
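The likelihood-ratio criterion of Theorem 2 can be illustrated numerically for the normal example. The following sketch uses the closed-form log-ratio from the last equation; the function name, the three samples, and the grid of parameter values are hypothetical choices for this illustration. Samples `x` and `y` share the same $(\sum_i x_i, \sum_i x_i^2) = (6, 18)$, while `z` has the same sum but a different sum of squares.

```python
def log_likelihood_ratio(xs, ys, mu, sigma2):
    """log L(x | theta) - log L(y | theta) for equal-length iid N(mu, sigma2) samples."""
    sq_diff = sum(x * x for x in xs) - sum(y * y for y in ys)
    lin_diff = sum(xs) - sum(ys)
    return -sq_diff / (2 * sigma2) + mu * lin_diff / sigma2

# A few (mu, sigma2) pairs at which to evaluate the ratio.
thetas = [(0.0, 1.0), (2.0, 0.5), (-1.0, 4.0)]

x = (0.0, 3.0, 3.0)  # sum 6, sum of squares 18
y = (1.0, 1.0, 4.0)  # sum 6, sum of squares 18, so T(x) = T(y)
z = (2.0, 2.0, 2.0)  # sum 6, sum of squares 12, so T(x) != T(z)

same_T = [log_likelihood_ratio(x, y, mu, s2) for mu, s2 in thetas]
diff_T = [log_likelihood_ratio(x, z, mu, s2) for mu, s2 in thetas]

# With T(x) = T(y) the log-ratio is identically zero; with T(x) != T(z) it
# changes with sigma2.
assert all(v == 0 for v in same_T)
assert len(set(diff_T)) > 1
```

Evaluating at a handful of parameter values only illustrates the criterion; the equivalence itself follows from the closed form of the ratio.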