2_Interval_Estimation.pptx

Interval Estimation and
Sample Size Decision
• Point estimation
• Interval estimation for
  - Population Mean
  - Population Proportion
  - Population Variance
• Sample size decision in estimating
  - Population Mean
  - Population Proportion
  - Population Variance
QAM – II by Gaurav Garg (IIM Lucknow)

Statistical Estimation
• We take data from a sample and say something about the
population from which the sample was drawn
• A sample statistic is used to estimate an unknown population parameter.
• There are two types of estimation:
• Point Estimation:
  - Calculation of a single value of a sample statistic
• Interval Estimation:
  - Calculation of an interval using a sample statistic
  - This interval is calculated at a desired level of confidence
    (e.g. 95% or 99% confidence; it cannot be 100%)
  - Sample-to-sample variation (standard error) is also taken
    into consideration.

Confidence Interval Estimates
• Let θ be the unknown parameter.
• Suppose T is the point estimate of θ and E(T) = θ.
• Fix the confidence level at (1-α)×100%.
• α is the probability of “error”.
• (1-α) is called the confidence coefficient.
• Thus, for a 95% confidence level, α = 0.05.
• The confidence interval estimate of θ is [T-h, T+h].
• It means that P(T-h ≤ θ ≤ T+h) = 1-α,
• where h = critical value × standard error.

• Formula for confidence interval is [T-h, T+h]
• T = Unbiased (Point) Estimate of the unknown
parameter
• h = critical value x standard error of the estimate
• The critical value is obtained using the confidence coefficient
(1-α) (will be discussed later)
• Lower Confidence Limit = T-h
• Upper Confidence Limit = T+h
[Figure: number line with the point estimate at the center, the lower and upper confidence limits at the two ends, and the width of the confidence interval marked between the limits.]

• Using the Central Limit Theorem, for a large sample,
$$Z = \frac{T - \theta}{SE(T)} \sim N(0,1)$$
• where T is the unbiased point estimate of θ and
• SE(T) is the standard error of T.
• The confidence coefficient is fixed as (1-α).
• The critical value is given by $z_{\alpha/2}$ as below:
• $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1-\alpha$, where Z ~ N(0,1).

• $Z = \dfrac{T - \theta}{SE(T)} \sim N(0,1)$
• For Z ~ N(0,1),
$$P\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) = 1-\alpha$$
• This implies
$$P\left(-z_{\alpha/2} < \frac{T - \theta}{SE(T)} < z_{\alpha/2}\right) = 1-\alpha$$
• or
$$P\left(T - z_{\alpha/2}\,SE(T) < \theta < T + z_{\alpha/2}\,SE(T)\right) = 1-\alpha$$
• Thus the (1-α)×100% confidence interval estimate of θ is
• $\left[\,T - z_{\alpha/2}\,SE(T),\; T + z_{\alpha/2}\,SE(T)\,\right]$

Confidence Interval for Population Mean μ
(σ Known)
• When
  - Population standard deviation σ is known
  - Population is normally distributed
  - If population is not normal, sample size is large
• (1-α)×100% confidence interval estimate of μ
is given by
$$\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$$
• where $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1-\alpha$, Z ~ N(0,1).

Commonly used confidence levels and corresponding
critical values (N(0,1) Distribution)
[Figure: N(0,1) density with central area 1-α = 0.95 between -z_{α/2} = -1.96 and z_{α/2} = 1.96, and area α/2 = 0.025 in each tail.]

Confidence Level   Confidence Coefficient   α       Critical Value
80%                0.80                     0.20    1.28
90%                0.90                     0.10    1.645
95%                0.95                     0.05    1.96
98%                0.98                     0.02    2.33
99%                0.99                     0.01    2.58
99.8%              0.998                    0.002   3.09
99.9%              0.999                    0.001   3.29
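The critical values for the N(0,1) distribution can be checked numerically. The following is a minimal sketch using Python's standard library; the function name `z_critical` is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Two-sided critical value z_{alpha/2} for a confidence coefficient 1 - alpha."""
    alpha = 1.0 - confidence
    # z_{alpha/2} leaves area alpha/2 in the upper tail of N(0,1)
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for conf in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{conf:.3f}  {z_critical(conf):.3f}")
```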

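Distribution of the Sample Mean
[Figure: sampling distribution of the sample mean x̄, centered at μ_x̄ = μ, with area 1-α in the middle and α/2 in each tail; below it, the values of the sample mean for different samples and the corresponding confidence intervals $\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$.]
• (1-α)×100% of intervals will
contain μ.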
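The coverage statement can be illustrated by simulation. This is only a sketch under assumed values: the population parameters (μ = 50, σ = 10), the sample size (n = 25) and the seed are hypothetical and chosen purely for illustration.

```python
import random
from statistics import NormalDist, mean

def coverage(mu=50.0, sigma=10.0, n=25, confidence=0.95, trials=2000, seed=1):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    h = z * sigma / n ** 0.5          # half-width: critical value x standard error
    hits = 0
    for _ in range(trials):
        xbar = mean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - h <= mu <= xbar + h:
            hits += 1
    return hits / trials

print(coverage())  # should be close to 0.95
```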

• Example:
• A sample of 11 circuits from a large normal population
has a mean resistance of 2.20 ohms.
• We know from past testing that the population standard
deviation is 0.35 ohms.
• Determine a 95% confidence interval for the true mean
resistance of the population.
• Ans.
$$\bar{x} \pm z_{(0.025)}\frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(0.35/\sqrt{11}) = 2.20 \pm 0.2068 = (1.9932,\ 2.4068)$$
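The circuit example can be reproduced in a few lines. This is a minimal sketch; the function name `mean_ci_known_sigma` is hypothetical, and the critical value comes from `statistics.NormalDist` rather than a printed table, so the last digits may differ slightly from hand calculation with 1.96.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_known_sigma(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    h = z * sigma / sqrt(n)           # critical value x standard error
    return xbar - h, xbar + h

# Values from the example: n = 11, sample mean 2.20 ohms, sigma = 0.35 ohms
low, high = mean_ci_known_sigma(2.20, 0.35, 11)
print(f"({low:.4f}, {high:.4f})")
```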

Confidence Interval for Population Mean μ
(σ Unknown)
• Use the sample standard deviation s1 as the estimate of σ (s1² is the unbiased estimate of σ²), given by
$$s_1 = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
• Case 1: n is small
  - The value of s1 varies from sample to sample
  - This introduces extra variability
  - The normal distribution cannot be used
  - We use the t distribution with (n-1) d.f.
• Case 2: n is large
  - When n is large, the t distribution approaches the normal distribution
  - We use the N(0,1) distribution

Case 1: σ is unknown and n is small
• Assumption: Population has normal distribution
• (1-α)×100% confidence interval estimate of μ is given
by
$$\left[\bar{x} - t_{\alpha/2}\frac{s_1}{\sqrt{n}},\; \bar{x} + t_{\alpha/2}\frac{s_1}{\sqrt{n}}\right]$$
• where $t_{\alpha/2}$ is given such that
• $P(-t_{\alpha/2} < T < t_{\alpha/2}) = 1-\alpha$, for T ~ t(n-1).


Some Critical Values of t(n-1) distribution for
given α and d.f. (n-1)
[Figure: t(n-1) density with central area 1-α between -t_{α/2} and t_{α/2}, and area α/2 in each tail.]

d.f. (n-1)   Critical Value at α = 0.05   Critical Value at α = 0.10
1            12.706                       6.314
2            4.303                        2.920
3            3.182                        2.353
4            2.776                        2.132
5            2.571                        2.015
6            2.447                        1.943
7            2.365                        1.895

• Consider the same example
• A sample of 11 circuits from a large normal population
has a mean resistance of 2.20 ohms.
• Population standard deviation is not known.
• Sample standard deviation (s1) is 0.35 ohms.
• Determine a 95% confidence interval for the true mean
resistance of the population.
• Ans. (with n-1 = 10 d.f., the critical value is $t_{(0.025)} = 2.22814$)
$$\bar{x} \pm t_{(0.025)}\frac{s_1}{\sqrt{n}} = 2.20 \pm 2.22814\,(0.35/\sqrt{11}) = 2.20 \pm 0.2351 = (1.9649,\ 2.4351)$$
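The t-based interval can be sketched the same way. Python's standard library has no t quantile function, so the sketch below obtains the critical value numerically (Simpson's rule for the CDF, then bisection). That numerical route is an assumption of this sketch, not the slides' method, which reads the value from a table. All function names are hypothetical.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x] (steps is even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Two-sided critical value t_{alpha/2}, found by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:     # widen the bracket until it contains the root
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def mean_ci_unknown_sigma(xbar, s1, n, confidence=0.95):
    """Small-sample interval for the mean using t with n - 1 d.f."""
    h = t_critical(confidence, n - 1) * s1 / sqrt(n)
    return xbar - h, xbar + h

# Values from the example: n = 11, sample mean 2.20, s1 = 0.35
low, high = mean_ci_unknown_sigma(2.20, 0.35, 11)
print(f"t = {t_critical(0.95, 10):.5f}, CI = ({low:.4f}, {high:.4f})")
```

In practice a statistics package or a printed t table would normally supply the critical value; the numerical routine here only keeps the example self-contained.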







Case 2: σ is unknown and n is large
• Population may or may not have normal distribution
• (1-α)×100% confidence interval estimate of μ is given
by
$$\left[\bar{x} - z_{\alpha/2}\frac{s_1}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{s_1}{\sqrt{n}}\right]$$
• where $z_{\alpha/2}$ is given such that,
• for Z ~ N(0,1), $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1-\alpha$.


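Confidence Interval Estimate of μ
• σ known (n small with Normal Distribution, or n large with Any Distribution):
$$\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$$
• σ unknown, n small (Normal Distribution):
$$\left[\bar{x} - t_{\alpha/2}\frac{s_1}{\sqrt{n}},\; \bar{x} + t_{\alpha/2}\frac{s_1}{\sqrt{n}}\right]$$
• σ unknown, n large (Any Distribution):
$$\left[\bar{x} - z_{\alpha/2}\frac{s_1}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{s_1}{\sqrt{n}}\right]$$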


Confidence Intervals for Population Proportion π
Case 1:
• Small Sample: out of scope
Case 2:
• Large Sample
• We know that for large n (p is the sample proportion)
$$Z = \frac{p - \pi}{\sqrt{\pi(1-\pi)/n}} \sim N(0,1)$$
• For Z ~ N(0,1), we have
$$P\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) = 1-\alpha$$
or
$$P\left(-z_{\alpha/2} < \frac{p - \pi}{\sqrt{\pi(1-\pi)/n}} < z_{\alpha/2}\right) = 1-\alpha$$
or
$$P\left(p - z_{\alpha/2}\sqrt{\frac{\pi(1-\pi)}{n}} < \pi < p + z_{\alpha/2}\sqrt{\frac{\pi(1-\pi)}{n}}\right) = 1-\alpha$$

• Thus (1-α)×100% CI estimate of π is given by
$$\left[p - z_{\alpha/2}\sqrt{\frac{\pi(1-\pi)}{n}},\; p + z_{\alpha/2}\sqrt{\frac{\pi(1-\pi)}{n}}\right]$$
• This expression itself contains π, which is
unknown.
• So, this CI estimate cannot be computed as it stands.
• We replace π in the standard error by its unbiased estimate p.
• Then, (1-α)×100% CI estimate of π is given by
$$\left[p - z_{\alpha/2}\sqrt{\frac{pq}{n}},\; p + z_{\alpha/2}\sqrt{\frac{pq}{n}}\right]$$
• where q = 1-p.
• Required assumption: large sample only.

• Example:
• A random sample of 100 people shows that 25
have opened IRA (individual retirement
arrangement) this year.
• Construct a 95% confidence interval for the true
proportion of population who have opened IRA.
• Ans.
$$p \pm z_{(0.025)}\sqrt{\frac{p(1-p)}{n}} = \frac{25}{100} \pm 1.96\sqrt{\frac{(0.25)(0.75)}{100}} = 0.25 \pm 1.96\,(0.0433) = (0.1651,\ 0.3349)$$
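The IRA example follows the large-sample formula directly. A minimal sketch, with a hypothetical function name `proportion_ci`:

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p = successes / n                  # sample proportion
    q = 1 - p
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    h = z * sqrt(p * q / n)            # critical value x estimated standard error
    return p - h, p + h

# Values from the example: 25 of 100 people have opened an IRA
low, high = proportion_ci(25, 100)
print(f"({low:.4f}, {high:.4f})")
```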








Confidence Interval for Population Variance σ²
• Variance is an inverse measure of the group’s
homogeneity.
• Variance is an important indicator of total quality in
standardized products and services.
• Managers improve processes by reducing variance.
• Variance is a measure of financial risk.
• Variance of rates of return helps managers assess
financial and capital investment alternatives.
• Variability is a reality in global markets.
• Productivity, wages, and costs of living vary between
regions and nations.

Case 1:
• Small Sample
• Parent Population is Normal
• Let us take a sample $x_1, x_2, \ldots, x_n$ from a normal population with mean μ and standard deviation σ.
• Then,
$$\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{\sigma}\right)^2 \sim \chi^2_{(n-1)}$$
• We know that
$$s_1^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$
• So,
$$\frac{(n-1)\,s_1^2}{\sigma^2} \sim \chi^2_{(n-1)}$$

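• Then, (1-α)×100% CI estimate of σ² is given by
$$\frac{(n-1)\,s_1^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)\,s_1^2}{\chi^2_{1-\alpha/2}}$$
• Or
$$\left[\frac{(n-1)\,s_1^2}{\chi^2_{\alpha/2}},\; \frac{(n-1)\,s_1^2}{\chi^2_{1-\alpha/2}}\right]$$
• Here, $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are critical values obtained
using the Chi Square distribution with (n-1) d.f., where $\chi^2_{p}$ denotes the value with area p to its right.
[Figure: Chi Square density with df = 7 and α = 0.10 (1-α = 0.90): area α/2 = 0.05 lies below 2.167 and area α/2 = 0.05 lies above 14.067.]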

• Example:
• The cholesterol concentrations in the yolks of a
sample of 18 randomly selected eggs laid by
genetically engineered chickens were found to
have a mean value of 9.38 mg/g of yolk and a
standard deviation of 1.62 mg/g.
• Use this information to construct a confidence
interval estimate of the true variance of the
cholesterol concentration in these egg yolks.
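The small-sample variance interval needs Chi Square critical values. The sketch below computes them numerically (series for the CDF, then bisection), which is an assumption of this sketch rather than the slides' table lookup. The example does not state a confidence level, so 95% is assumed here purely for illustration. Function names are hypothetical.

```python
from math import exp, lgamma, log

def chi2_cdf(x, df):
    """P(X <= x) for X ~ Chi Square(df), via the incomplete-gamma series."""
    if x <= 0:
        return 0.0
    a, y = df / 2, x / 2
    term = total = 1 / a
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= y / (a + k)
        total += term
    return total * exp(-y + a * log(y) - lgamma(a))

def chi2_quantile(p, df):
    """Value with area p to its LEFT, found by bisection."""
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < p:       # widen the bracket until it contains the root
        hi *= 2
    for _ in range(80):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def variance_ci(s1, n, confidence):
    """Interval [(n-1)s1^2 / chi2_{alpha/2}, (n-1)s1^2 / chi2_{1-alpha/2}]."""
    alpha, df = 1 - confidence, n - 1
    upper_crit = chi2_quantile(1 - alpha / 2, df)   # area alpha/2 to its right
    lower_crit = chi2_quantile(alpha / 2, df)       # area 1 - alpha/2 to its right
    return df * s1 ** 2 / upper_crit, df * s1 ** 2 / lower_crit

# Check against the df = 7, alpha = 0.10 values quoted on the slide
print(round(chi2_quantile(0.05, 7), 3), round(chi2_quantile(0.95, 7), 3))

# Egg-yolk example: n = 18, s1 = 1.62; the 95% level is an ASSUMPTION
low, high = variance_ci(1.62, 18, 0.95)
print(f"({low:.3f}, {high:.3f})")
```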

Case 2:
• Large Sample
• Parent Population may or may not be Normal
• We know that
$$E(s_1^2) = \sigma^2$$
• Also, (Proof is out of scope)
$$S.E.(s_1^2) = \sigma^2\sqrt{\frac{2}{n-1}}$$
• So, for large samples,
$$\frac{s_1^2 - \sigma^2}{\sigma^2\sqrt{2/(n-1)}} \sim N(0,1)$$
• Using this, (1-α)×100% CI estimate of σ² is given by
$$\left[\frac{s_1^2}{1 + z_{\alpha/2}\sqrt{\dfrac{2}{n-1}}},\; \frac{s_1^2}{1 - z_{\alpha/2}\sqrt{\dfrac{2}{n-1}}}\right]$$

• Example:
• A technologist is developing a new method for processing
a food material.
• For best quality, it is important to control moisture content
in the final product.
• So, as one part of determining the practicality of the new
method, the technologist must estimate the variability of
water content in the resulting product.
• He collects 50 specimens of product from the new
process, and determines the percent water in each.
• These 50 specimens give a sample mean water content of
43.24% and a sample standard deviation of 7.93%.
• Compute a 95% confidence interval estimate of the true
variance of the percentage water for this new process.
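The moisture example can be worked with the large-sample formula for σ². This is a minimal sketch under the normal approximation for s1²; the function name is hypothetical, and the guard for a non-positive denominator is an addition of this sketch.

```python
from math import sqrt
from statistics import NormalDist

def variance_ci_large_sample(s1, n, confidence=0.95):
    """Large-sample CI for sigma^2: s1^2 / (1 +/- z * sqrt(2 / (n - 1)))."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    k = z * sqrt(2 / (n - 1))
    if k >= 1:
        raise ValueError("n is too small for the large-sample formula")
    return s1 ** 2 / (1 + k), s1 ** 2 / (1 - k)

# Values from the example: n = 50 specimens, s1 = 7.93 (percent water)
low, high = variance_ci_large_sample(7.93, 50)
print(f"({low:.2f}, {high:.2f})")
```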

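Sample Size Decision (when Estimating μ)
• We have seen (for sufficiently large n) that x̄ is approximately normal with mean μ and variance σ²/n:
$$\bar{x} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right) \quad\text{or}\quad Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$$
• Error of Estimation: $e = |\bar{x} - \mu|$
• Fix the confidence level at (1-α)×100%
• Obtain the critical value $z_{\alpha/2}$ using N(0,1) such that $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1-\alpha$
• Then, we have
$$e = z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \quad\text{or}\quad n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2$$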
• Thus the sample size for estimating population mean μ
is
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2$$
• Critical value $z_{\alpha/2}$ can be taken from the table.
• Estimation Error (e) should be fixed by the researcher in
advance.
• Clearly, e ≠ 0
• Population standard deviation σ can be estimated from
some other small sample or pilot survey as
Range/6 or by sample standard deviation

• Example:
• In a pilot survey, it is observed that the smallest
observation is 6 and the largest observation is 276.
• What should be the sample size needed to estimate the
population mean within ± 5 with 90% confidence level?
• Ans.
Estimate of population standard deviation: $\hat{\sigma} = (276 - 6)/6 = 45$
Estimation Error: e = 5
For 90% confidence level, critical value $z_{(0.05)} = 1.645$
So,
$$n = \left(\frac{z_{(0.05)}\,\hat{\sigma}}{e}\right)^2 = \left(\frac{1.645 \times 45}{5}\right)^2 = 219.19 \approx 219$$
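The pilot-survey calculation can be sketched as below. The function name is hypothetical. The slide rounds 219.19 to the nearest integer (219); the sketch also prints the value rounded up, which is the more conservative choice when the error bound must be guaranteed. `NormalDist` gives z ≈ 1.6449 rather than the tabulated 1.645, so the unrounded value differs slightly in the second decimal.

```python
from math import ceil
from statistics import NormalDist

def sample_size_mean(sigma, e, confidence):
    """Unrounded sample size n = (z_{alpha/2} * sigma / e)^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return (z * sigma / e) ** 2

# Values from the example: range 6..276, error within +/- 5, 90% confidence
sigma_hat = (276 - 6) / 6             # Range/6 estimate of sigma
n_exact = sample_size_mean(sigma_hat, 5, 0.90)
print(f"{n_exact:.2f}", round(n_exact), ceil(n_exact))
```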



Sample Size Decision (when Estimating π)
• Similarly, the sample size for estimating population
proportion π is given by
$$n = \frac{(z_{\alpha/2})^2\,\pi(1-\pi)}{e^2}$$
• For fixed confidence coefficient (1-α), critical value $z_{\alpha/2}$ can
be taken from the normal table.
• Estimation Error (e = |p – π|) should be fixed by the
researcher in advance. Clearly, e ≠ 0
• Population proportion π can be estimated from some other
small sample or pilot survey.
• If no information is available, it can be decided by the
researcher using past experience or can be taken as 0.5.
• Example:
• How large a sample would be necessary to
estimate the true proportion defective in a large
population within ±3%, with 95% confidence?
• (Assume a pilot sample yields p = 0.12)
• Ans.
Estimate of population proportion: p = 0.12
Estimation Error: e = 3/100 = 0.03
For 95% confidence level, critical value $z_{(0.025)} = 1.96$
So,
$$n = \frac{(z_{(0.025)})^2\,pq}{e^2} = \frac{1.96 \times 1.96 \times 0.12 \times 0.88}{0.03 \times 0.03} = 450.75 \approx 451$$
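A minimal sketch of the same calculation follows, with a hypothetical function name. It also evaluates the π = 0.5 fallback mentioned for the case where no pilot information is available.

```python
from math import ceil
from statistics import NormalDist

def sample_size_proportion(pi_guess, e, confidence):
    """Unrounded sample size n = z_{alpha/2}^2 * pi * (1 - pi) / e^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z ** 2 * pi_guess * (1 - pi_guess) / e ** 2

# Values from the example: pilot p = 0.12, error within +/- 3%, 95% confidence
n_pilot = sample_size_proportion(0.12, 0.03, 0.95)
# Fallback when nothing is known about the proportion: pi = 0.5
n_worst = sample_size_proportion(0.5, 0.03, 0.95)
print(ceil(n_pilot), ceil(n_worst))
```

Taking π = 0.5 maximizes π(1-π), so it gives the largest (safest) sample size for a given error and confidence level.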

Sample Size Decision (when Estimating σ²)
• We know, for large samples,
$$\frac{s_1^2 - \sigma^2}{\sigma^2\sqrt{2/(n-1)}} \sim N(0,1)$$
• Similarly, the sample size for estimating population variance σ² is
given by
$$n = 1 + \frac{2\,(z_{\alpha/2})^2\,\sigma^4}{e^2}$$
• For fixed confidence coefficient (1-α), critical value $z_{\alpha/2}$ can be
taken from the normal table.
• Estimation Error $e = |s_1^2 - \sigma^2|$ should be fixed by the
researcher in advance. Clearly, e ≠ 0
• Population variance σ² can be estimated from some other small
sample or pilot survey.
• If no information is available, it can be decided by the researcher
using past experience or can be taken as the square of Range/6.
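The formula for the variance case can be sketched as follows. The slides give no worked example here, so the inputs (a pilot variance of 4 and an error of 1) are hypothetical values chosen only to show the arithmetic; the function name is also an assumption.

```python
from math import ceil
from statistics import NormalDist

def sample_size_variance(sigma2, e, confidence):
    """Unrounded sample size n = 1 + 2 * z_{alpha/2}^2 * sigma^4 / e^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 1 + 2 * z ** 2 * sigma2 ** 2 / e ** 2

# HYPOTHETICAL inputs: pilot estimate sigma^2 = 4, error e = 1, 95% confidence
n_exact = sample_size_variance(4.0, 1.0, 0.95)
print(f"{n_exact:.2f}", ceil(n_exact))
```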

Estimating Total
• In auditing, one is often more interested in estimating the
population total amount.
• The point estimate of the total is given by $N \times \bar{x}$, where N is the population size.
• The CI estimate at (1-α)×100% confidence level is given by
• $N\bar{x} \mp N z_{\alpha/2}\,\dfrac{s_1}{\sqrt{n}}$, for large samples
• $N\bar{x} \mp N z_{\alpha/2}\,\dfrac{s_1}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}$, for large samples, if $\dfrac{n}{N} \ge 0.05$.

Example: A firm has a population of 1000 accounts and
wishes to estimate the total population value.
• A sample of 80 accounts is selected with average
balance of $87.6 and standard deviation of $22.3.
• Find the 95% confidence interval estimate of the total
balance.
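• Ans:
N = 1000, n = 80, x̄ = 87.6, s1 = 22.3
n/N = 80/1000 = 0.08 ≥ 0.05, so the finite population correction is used.
$$N\bar{x} \mp N z_{(0.025)}\frac{s_1}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 1000(87.6) \mp 1000(1.96)\left(\frac{22.3}{\sqrt{80}}\right)\sqrt{\frac{1000-80}{1000-1}} = 87{,}600 \mp 4{,}689.51 = (82{,}910.49,\ 92{,}289.51)$$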
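The interval for a population total can be evaluated directly from the formula with the finite population correction. This is a minimal sketch with a hypothetical function name. It uses the normal critical value from `NormalDist` (about 1.95996), so the result agrees with a hand calculation using 1.96 only to within rounding.

```python
from math import sqrt
from statistics import NormalDist

def total_ci(N, n, xbar, s1, confidence=0.95):
    """Large-sample CI for a population total N * xbar."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = s1 / sqrt(n)
    if n / N >= 0.05:                  # apply the finite population correction
        se *= sqrt((N - n) / (N - 1))
    h = N * z * se
    return N * xbar - h, N * xbar + h

# Values from the example: N = 1000 accounts, n = 80, mean 87.6, s1 = 22.3
low, high = total_ci(1000, 80, 87.6, 22.3)
print(f"({low:.2f}, {high:.2f})")
```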

• Example:
• Econe Dresses has 1200 inventory items.
• In the past 15% items were incorrectly priced.
• A sample of 120 items was selected.
• Tagged cost of each item was compared with the
actual value.
• 15 items differ in their tagged cost and actual
cost.
• These values are as follows:

Tagged Cost   Actual Value   Di
261           240              21
87            105             -18
201           276             -75
121           110              11
315           298              17
411           356              55
249           211              38
216           305             -89
21            210            -189
140           152             -12
129           112              17
340           216             124
341           402             -61
135           97               38
228           220               8
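n = 120, N = 1200, D̄ = -0.95833, s_D = 25.24482
(D̄ and s_D are computed over all 120 sampled items; the 105 items with no discrepancy have Di = 0.)
n/N = 120/1200 = 0.1 > 0.05,
So we use fpc
95% CI is
$$N\bar{D} \mp N z_{(0.025)}\frac{s_D}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \left[1200(-0.95833) \mp 1200(1.96)\left(\frac{25.24482}{\sqrt{120}}\right)\sqrt{\frac{1200-120}{1200-1}}\right]$$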
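The quantities in this example can be recomputed from the table of differences. The sketch below assumes, as the sample mean of -0.95833 implies, that the 105 sampled items not listed have a difference of zero. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Differences Di = tagged cost - actual value for the 15 mispriced items
diffs = [21, -18, -75, 11, 17, 55, 38, -89, -189, -12, 17, 124, -61, 38, 8]
N, n = 1200, 120

# The remaining sampled items have no discrepancy, i.e. Di = 0
d = diffs + [0] * (n - len(diffs))
d_bar = mean(d)
s_d = stdev(d)                         # sample standard deviation (divisor n - 1)

z = NormalDist().inv_cdf(0.975)        # 95% confidence
fpc = sqrt((N - n) / (N - 1))          # used because n/N = 0.1 > 0.05
h = N * z * (s_d / sqrt(n)) * fpc
low, high = N * d_bar - h, N * d_bar + h
print(f"D_bar = {d_bar:.5f}, s_D = {s_d:.5f}")
print(f"95% CI for the total difference: ({low:.2f}, {high:.2f})")
```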

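SUMMARY (INTERVAL ESTIMATES)
Population Mean (μ)
• σ is known; small sample (Normal Distribution) or large sample (Any Distribution):
$$\bar{x} \mp z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
• σ is not known; small sample (Normal Distribution):
$$\bar{x} \mp t_{\alpha/2}\frac{s_1}{\sqrt{n}}$$
• σ is not known; large sample (Any Distribution):
$$\bar{x} \mp z_{\alpha/2}\frac{s_1}{\sqrt{n}}$$
Population Proportion (π)
• Small sample: OUT OF SCOPE
• Large sample (Any Distribution):
$$p \mp z_{\alpha/2}\sqrt{\frac{pq}{n}}$$
Population Variance (σ²)
• Small sample (Normal Distribution):
$$\left[\frac{(n-1)\,s_1^2}{\chi^2_{\alpha/2}},\; \frac{(n-1)\,s_1^2}{\chi^2_{1-\alpha/2}}\right]$$
• Large sample (Any Distribution):
$$\frac{s_1^2}{1 \pm z_{\alpha/2}\sqrt{\dfrac{2}{n-1}}}$$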

SUMMARY (SAMPLE SIZE DECISION)
• For estimating Population Mean (μ), large sample (Any Distribution):
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2$$
• For estimating Population Proportion (π), large sample (Any Distribution):
$$n = \frac{(z_{\alpha/2})^2\,\pi(1-\pi)}{e^2}$$
• For estimating Population Variance (σ²), large sample (Any Distribution):
$$n = 1 + \frac{2\,(z_{\alpha/2})^2\,\sigma^4}{e^2}$$