This document provides an overview of point estimation methods, including maximum likelihood estimation and the method of moments. It begins with an introduction to statistical inference and the theory of estimation. Point estimation is defined as using sample data to calculate a single value as the best estimate of an unknown population parameter. Maximum likelihood estimation maximizes the likelihood function to find the parameter values that make the observed sample data most probable. The method of moments equates sample moments to theoretical moments to derive parameter estimates. Examples are provided to illustrate how to apply each method to obtain point estimators.
TOPICS TO BE COVERED
1. Introduction to statistical inference
2. Theory of estimation
3. Methods of estimation
3.1 Method of maximum likelihood estimation
3.2 Method of moments
1. Introduction to statistical inference
[Diagram: Statistics divides into inferential statistics (estimation and testing of hypothesis) and descriptive statistics (descriptive analysis and graphical presentation).]
What do we mean by Statistical Inference?
Drawing conclusions or making decisions about a population based on information collected from the sample.
[Diagram: a representative sample is drawn from the population, and conclusions about the population are made from the sample.]
– Statistical inference is further divided into two parts: testing of hypothesis and theory of estimation.
Testing of hypothesis –
➢ The theory of testing of hypothesis was initiated by J. Neyman and E. S. Pearson.
➢ It provides the rules by which one decides whether to accept or reject the hypothesis under study.
Theory of estimation –
➢ The theory of estimation was founded by Prof. R. A. Fisher.
➢ It discusses the ways of assigning a value to a population parameter based on the values of the corresponding statistic (a function of the sample observations).
2. Theory of estimation
➢ The theory of estimation was founded by R. A. Fisher.
[Diagram: inferential statistics divides into estimation (point estimation and interval estimation) and testing of hypothesis.]
What do we mean by Estimation?
It discusses the ways of assigning values to a population parameter based on the values of the corresponding statistic (a function of the sample observations).
The statistic used to estimate a population parameter is called an estimator.
The value of the estimator is called an estimate. For example, the sample mean $\bar{X}$ is an estimator of the population mean $\mu$; its computed value for a particular sample, say 5.2, is an estimate.
Point Estimation
It involves the use of sample data to calculate a single value (known as a point estimate) which serves as a best guess or best estimate of an unknown population parameter. More formally, it is the application of a point estimator to the data to obtain a point estimate.
Interval Estimation
It is the use of sample data to calculate an interval of possible values of an unknown population parameter; this is in contrast to point estimation, which gives a single value.
The interval is formed by two quantities computed from the sample data, within which the parameter lies with very high probability.
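To make the distinction concrete, here is a minimal Python sketch (the sample values are hypothetical, purely for illustration) that computes a point estimate of a population mean and a 95% interval estimate around it:

    import numpy as np
    from scipy import stats

    # Hypothetical sample (illustrative values only).
    data = np.array([4.1, 5.3, 4.8, 5.9, 4.4, 5.1, 4.7, 5.5])
    n = len(data)

    # Point estimate: a single best guess for the population mean.
    point_estimate = data.mean()

    # Interval estimate: a range that covers the mean with high probability,
    # here a 95% confidence interval based on the t distribution.
    sem = data.std(ddof=1) / np.sqrt(n)   # standard error of the mean
    lo, hi = stats.t.interval(0.95, n - 1, loc=point_estimate, scale=sem)

    print(f"point estimate: {point_estimate:.3f}")
    print(f"95% interval  : ({lo:.3f}, {hi:.3f})")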
3. Methods of Estimation
– The following are some of the important methods for obtaining good estimators:
➢ Method of maximum likelihood estimation
➢ Method of moments
3.1 Method of maximum likelihood estimation
– It was initially formulated by C. F. Gauss.
– In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of a probability distribution by maximizing a likelihood function, so that under the assumed statistical model the observed data are most probable. The point in the parameter space that maximizes the likelihood function is called the maximum likelihood estimate.
Likelihood function
It is formed from the joint density function of the sample, i.e.,
$$L = L(\theta) = f(x_1, \theta) \cdots f(x_n, \theta) = \prod_{i=1}^{n} f(x_i, \theta),$$
where $x_1, x_2, \ldots, x_n$ is a random sample of size $n$ from a population with density function $f(x, \theta)$.
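As a quick illustration, the following sketch evaluates $L(p)$ on a grid of parameter values for a hypothetical Bernoulli sample (anticipating the worked example below); the grid point with the largest likelihood approximates the maximizer:

    import numpy as np

    # Hypothetical Bernoulli sample (1 = owns a car, 0 = does not).
    x = np.array([1, 0, 0, 1, 1, 0, 1, 1, 0, 1])

    # L(p) = prod_i p**x_i * (1 - p)**(1 - x_i), evaluated on a grid.
    p_grid = np.linspace(0.01, 0.99, 99)
    L = np.array([np.prod(p**x * (1 - p)**(1 - x)) for p in p_grid])

    print("approximate MLE:", p_grid[np.argmax(L)])   # 0.6, the sample mean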
Steps to perform in MLE
1. Define the likelihood, ensuring you are using the correct distribution for your data.
2. Take the natural logarithm and reduce the product to a sum.
3. Then compute the parameter by solving
$$\frac{\partial}{\partial\theta} \log L = 0 \quad \text{and checking that} \quad \frac{\partial^2}{\partial\theta^2} \log L < 0.$$
These equations are usually referred to as the likelihood equations for estimating the parameters.
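These steps can also be carried out symbolically; here is a sketch using SymPy, applied to the Bernoulli likelihood of the worked example that follows (with $s = \sum x_i$):

    import sympy as sp

    # Steps 1-2: log-likelihood for n Bernoulli trials with s successes.
    p, n, s = sp.symbols('p n s', positive=True)
    logL = s * sp.log(p) + (n - s) * sp.log(1 - p)

    # Step 3: solve d(logL)/dp = 0 for p ...
    p_hat = sp.solve(sp.diff(logL, p), p)[0]
    print(p_hat)                              # s/n, the sample proportion

    # ... and confirm the second derivative is negative at the solution.
    second = sp.diff(logL, p, 2).subs(p, p_hat)
    print(sp.simplify(second))                # negative for 0 < s < n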
Example
Suppose we have a random sample $X_1, X_2, \ldots, X_n$ where:
$X_i = 0$ if a randomly selected student does not own a car, and
$X_i = 1$ if a randomly selected student does own a car.
Assuming that the $X_i$ are independent Bernoulli random variables with unknown parameter $p$, find the maximum likelihood estimator of $p$, the proportion of students who own a car.
Answer
If the $X_i$ are independent Bernoulli random variables with unknown parameter $p$, then the probability mass function of each $X_i$ is
$$f(x; p) = p^x (1 - p)^{1 - x}$$
for $x = 0$ or $1$ and $0 < p < 1$. Therefore, the likelihood function $L(p)$ is, by definition:
$$L(p) = \prod_{i=1}^{n} f(x_i; p) = p^{x_1}(1 - p)^{1 - x_1} \times p^{x_2}(1 - p)^{1 - x_2} \times \cdots \times p^{x_n}(1 - p)^{1 - x_n}$$
for $0 < p < 1$. Simplifying by summing the exponents, we get:
$$L(p) = p^{\sum_{i=1}^{n} x_i} \, (1 - p)^{\,n - \sum_{i=1}^{n} x_i} \quad \ldots\ldots (1)$$
Now, in order to implement the method of maximum likelihood, we need to find the value of the unknown parameter $p$ that maximizes the likelihood $L(p)$ given in equation (1).
So, to maximize the function, we need to differentiate the likelihood function with respect to $p$.
To make the differentiation easier, we work with the logarithm of the likelihood function, since the logarithm is an increasing function of its argument: if $x_1 < x_2$, then $\log x_1 < \log x_2$. That means the value of $p$ that maximizes the natural logarithm of the likelihood function, $\log L(p)$, is also the value of $p$ that maximizes the likelihood function $L(p)$.
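This monotonicity is easy to see numerically; the sketch below (reusing the hypothetical Bernoulli sample from earlier) shows that $L(p)$ and $\log L(p)$ peak at the same grid point:

    import numpy as np

    x = np.array([1, 0, 0, 1, 1, 0, 1, 1, 0, 1])   # hypothetical sample
    s, n = x.sum(), len(x)

    p = np.linspace(0.01, 0.99, 99)
    L = p**s * (1 - p)**(n - s)                     # likelihood, eq. (1)
    logL = s * np.log(p) + (n - s) * np.log(1 - p)  # log-likelihood

    # Both curves are maximized at the same value of p.
    print(p[np.argmax(L)], p[np.argmax(logL)])      # 0.6 and 0.6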
So we take the derivative of $\log L(p)$ with respect to $p$ instead of taking the derivative of $L(p)$.
In this case, the log-likelihood function is:
$$\log L(p) = \left(\sum_{i=1}^{n} x_i\right) \log p + \left(n - \sum_{i=1}^{n} x_i\right) \log(1 - p) \quad \ldots\ldots (2)$$
Taking the derivative of $\log L(p)$ with respect to $p$ and equating it to 0, we get:
$$\frac{\partial \log L(p)}{\partial p} = 0$$
$$\Rightarrow \frac{\sum x_i}{p} - \frac{n - \sum x_i}{1 - p} = 0$$
Now, solving this for $p$, we get:
$$\hat{p} = \frac{\sum x_i}{n} \quad \ldots\ldots (3)$$
Here the hat symbol ("^") is used to represent the estimate of the parameter $p$. Although we have found the estimate of the parameter $p$, technically we should verify that it is a maximum. For that, the second derivative of $\log L(p)$ with respect to $p$ should be negative, i.e.,
$$\frac{\partial^2 \log L(p)}{\partial p^2} = -\frac{\sum x_i}{p^2} - \frac{n - \sum x_i}{(1 - p)^2} < 0 \quad \text{for } 0 < p < 1,$$
so the stationary point in (3) is indeed a maximum. Thus, $\hat{p} = \frac{\sum x_i}{n}$ is the maximum likelihood estimator of $p$.
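As a quick numerical cross-check (a sketch using the same hypothetical sample; scipy's minimize_scalar maximizes the log-likelihood by minimizing its negative):

    import numpy as np
    from scipy.optimize import minimize_scalar

    # Hypothetical Bernoulli sample: 1 = owns a car, 0 = does not.
    x = np.array([1, 0, 0, 1, 1, 0, 1, 1, 0, 1])
    n, s = len(x), x.sum()

    # Negative log-likelihood, from equation (2).
    def neg_log_L(p):
        return -(s * np.log(p) + (n - s) * np.log(1 - p))

    res = minimize_scalar(neg_log_L, bounds=(1e-6, 1 - 1e-6), method='bounded')
    print(res.x)    # numerical maximizer, approximately 0.6
    print(s / n)    # closed form from equation (3): exactly 0.6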
3.2 Method of moments
– This method was discovered and studied in detail by Karl Pearson.
– The basic idea behind this form of the method is to:
1. Equate the first sample moment about the origin, $M_1 = \frac{1}{n}\sum_{i=1}^{n} X_i = \bar{X}$, to the first theoretical moment $E(X)$.
2. Equate the second sample moment about the origin, $M_2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2$, to the second theoretical moment $E(X^2)$.
3. Continue equating sample moments about the origin, $M_k$, with the corresponding theoretical moments $E(X^k)$, $k = 3, 4, \ldots$, until you have as many equations as you have parameters.
4. Solve these equations for the parameters.
– The resulting values are called method of moments estimators. It seems reasonable that this method would provide good estimates, since the empirical distribution converges in some sense to the probability distribution; therefore, the corresponding moments should be approximately equal.
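As a one-parameter sketch of this recipe (assuming, for illustration, exponentially distributed data with rate $\lambda$, so that $E(X) = 1/\lambda$ and the single moment equation gives $\hat{\lambda} = 1/\bar{X}$):

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulate an exponential sample with a known rate, for illustration.
    true_rate = 2.0
    x = rng.exponential(scale=1 / true_rate, size=10_000)

    # One parameter, so one moment equation: E(X) = 1/lambda = sample mean.
    rate_mm = 1 / x.mean()
    print(rate_mm)   # close to the true rate, 2.0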
Another Form of the Method
– The basic idea behind this form of the method is to:
1. Equate the first sample moment about the origin, $M_1 = \frac{1}{n}\sum_{i=1}^{n} X_i = \bar{X}$, to the first theoretical moment $E(X)$.
2. Equate the second sample moment about the mean, $M_2^* = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$, to the second theoretical moment about the mean, $E[(X - \mu)^2]$.
3. Continue equating sample moments about the mean, $M_k^*$, with the corresponding theoretical moments about the mean, $E[(X - \mu)^k]$, $k = 3, 4, \ldots$, until you have as many equations as you have parameters.
4. Solve for the parameters.
– Again, the resulting values are called method of moments estimators.
Example
Let $X_1, X_2, \ldots, X_n$ be normal random variables with mean $\mu$ and variance $\sigma^2$. What are the method of moments estimators of the mean $\mu$ and the variance $\sigma^2$?
Answer
The first and second theoretical moments about the origin are:
$$E(X_i) = \mu \quad \text{and} \quad E(X_i^2) = \sigma^2 + \mu^2$$
Here we have two parameters for which we are trying to derive method of moments estimators; therefore, we need two equations. Equating the first theoretical moment about the origin with the corresponding sample moment, we get:
$$E(X_i) = \mu = \frac{1}{n}\sum_{i=1}^{n} X_i \quad \ldots\ldots (1)$$
And equating the second theoretical moment about the origin with the corresponding sample moment, we get:
$$E(X_i^2) = \sigma^2 + \mu^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 \quad \ldots\ldots (2)$$
Now from equation (1) we see that the method of moments estimator for the mean $\mu$ is the sample mean:
$$\hat{\mu}_{MM} = \frac{1}{n}\sum_{i=1}^{n} X_i = \bar{X}$$
And by substituting the sample mean as the estimator of $\mu$ in the second equation and solving for $\sigma^2$, we get that the method of moments estimator for the variance $\sigma^2$ is:
$$\hat{\sigma}^2_{MM} = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \hat{\mu}^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$$
For this example, if we cross-check, the method of moments estimators are the same as the maximum likelihood estimators.
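A short numerical sanity check of this equivalence (a sketch on simulated normal data; scipy's norm.fit returns the maximum likelihood estimates of the mean and standard deviation):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    x = rng.normal(loc=5.0, scale=2.0, size=10_000)

    # Method of moments estimators derived above.
    mu_mm = x.mean()
    var_mm = np.mean(x**2) - mu_mm**2   # equals np.mean((x - mu_mm)**2)

    # Maximum likelihood estimators; norm.fit returns (mu_hat, sigma_hat).
    mu_ml, sigma_ml = stats.norm.fit(x)

    print(mu_mm, mu_ml)                 # identical
    print(var_mm, sigma_ml**2)          # identical up to rounding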