1. Interval Estimation and
Sample Size Decision
• Point estimation
• Interval estimation for
Population Mean
Population Proportion
Population Variance
• Sample size decision in estimating
Population Mean
Population Proportion
Population Variance
QAM – II by Gaurav Garg (IIM Lucknow)
2. Statistical Estimation
• We take data from a sample and say something about the
population from which the sample was drawn
• Sample statistic is used to estimate unknown parameter.
• There are two types of estimation:
• Point Estimation:
Calculation of a single value of a sample statistic
• Interval Estimation
Calculation of an interval using a sample statistic
This interval is calculated at a desired level of confidence
• E.g. 95% or 99% confidence; the confidence level cannot be 100%
Sample to sample variation (standard error) is also taken
into consideration.
3. • Let θ be the unknown parameter.
• Suppose T is the point estimate of θ and E(T) = θ.
• Fix the confidence level at (1-α)x100 %.
• α is the probability of “error”.
• (1-α) is called the confidence coefficient.
• Thus, for 95% confidence level, α = 0.05.
• Confidence interval estimate of θ is [T-h, T+h]
• It means that P(T-h ≤ θ ≤ T+h) = 1-α
• Where, h = critical value x standard error
Confidence Interval Estimates
4. • Formula for confidence interval is [T-h, T+h]
• T = Unbiased (Point) Estimate of the unknown
parameter
• h = critical value x standard error of the estimate
• Critical Value is obtained using the confidence coefficient
(1-α) (will be discussed later)
• Lower Confidence Limit = T-h
• Upper Confidence Limit = T+h
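[Figure: the confidence interval drawn on a line, running from the Lower Confidence Limit (T-h) to the Upper Confidence Limit (T+h), with the Point Estimate T at its centre; the distance between the two limits is the width of the confidence interval.]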
5. • Using the Central Limit Theorem, for a large sample,
$Z = \dfrac{T - \theta}{SE(T)} \sim N(0,1)$
• Where T is the unbiased point estimate of θ
• SE(T) is the standard error of T.
• Confidence coefficient is fixed as (1-α).
• Critical value is given by z_{α/2} as below
• P(-z_{α/2} < Z < z_{α/2}) = (1-α), where Z~N(0,1).
6. • $Z = \dfrac{T - \theta}{SE(T)} \sim N(0,1)$
• For Z~N(0,1), $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$
• This implies $P\left(-z_{\alpha/2} < \dfrac{T - \theta}{SE(T)} < z_{\alpha/2}\right) = 1 - \alpha$
• or $P\left(T - z_{\alpha/2}\,SE(T) < \theta < T + z_{\alpha/2}\,SE(T)\right) = 1 - \alpha$
• Thus (1-α)x100 % Confidence interval estimate of θ is
• [T - z_{α/2} x SE(T), T + z_{α/2} x SE(T)]
7. Confidence Interval for Population Mean μ
(σ Known)
• When
Population standard deviation σ is known, and
Population is normally distributed, or
(if the population is not normal) the sample size is large
• (1-α)x100 % Confidence interval estimate of μ
is given by
$\left[\bar{x} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right]$
• where P(-z_{α/2} < Z < z_{α/2}) = (1-α), Z~N(0,1).
9. Distribution of the Sample Mean
[Figure: sampling distribution of the sample mean x̄, N(μ, σ/√n), centred at μ_x̄ = μ, with area 1-α in the middle and α/2 in each tail. Below it, the confidence intervals [x̄ - z_{α/2} σ/√n, x̄ + z_{α/2} σ/√n] are drawn for the values of the sample mean obtained from different samples.]
• (1-α)x100% of these intervals will contain μ.
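The claim that about (1-α)x100% of the intervals contain μ can be checked by simulation. The sketch below is only an illustration: the population values (μ = 50, σ = 10), the sample size n = 25, the number of trials and the seed are all hypothetical choices, not taken from the slides.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, alpha=0.05, trials=2000, seed=0):
    """Fraction of z-intervals (sigma known) that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value z_{alpha/2}
    h = z * sigma / n ** 0.5                  # half-width of every interval
    hits = 0
    for _ in range(trials):
        # Draw one sample from a normal population and build its interval.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - h <= mu <= x_bar + h:
            hits += 1
    return hits / trials

# Hypothetical population: mu = 50, sigma = 10, samples of size 25.
rate = coverage(mu=50, sigma=10, n=25)
print(rate)  # should be close to 0.95
```

Any single interval either contains μ or does not; the 95% describes the long-run share of intervals that do.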
10. • Example:
• A sample of 11 circuits from a large normal population
has a mean resistance of 2.20 ohms.
• We know from past testing that the population standard
deviation is 0.35 ohms.
• Determine a 95% confidence interval for the true mean
resistance of the population.
• Ans.
$\bar{x} \pm z_{0.025}\dfrac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(0.35/\sqrt{11}) = 2.20 \pm 0.2068 = (1.9932,\ 2.4068)$
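The same calculation can be sketched in Python. The function name `z_interval_mean` is a hypothetical helper introduced here for illustration; the critical value comes from the standard library's `NormalDist` rather than a printed table, so it is 1.95996... instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist

def z_interval_mean(x_bar, sigma, n, confidence=0.95):
    """CI for the population mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    h = z * sigma / sqrt(n)                   # critical value x standard error
    return x_bar - h, x_bar + h

# Circuit example: n = 11, mean 2.20 ohms, sigma = 0.35 ohms, 95% confidence.
lo, hi = z_interval_mean(2.20, 0.35, 11)
print(round(lo, 4), round(hi, 4))  # 1.9932 2.4068
```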
11. Confidence Interval for Population Mean μ
(σ Unknown)
• Estimate σ by the sample standard deviation s_1, the square root of the unbiased estimate of σ², given by
$s_1 = \sqrt{\dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
• Case 1: n is small
Value of s_1 varies from sample to sample
This introduces extra variability
Normal distribution cannot be used
We use t distribution with (n-1) d.f.
• Case 2: n is large
When n is large, t distribution approaches normal distribution
We use N(0,1) distribution
12. Case 1: σ is unknown and n is small
• Assumption: Population has normal distribution
• (1-α)x100 % Confidence interval estimate of μ is given
by
$\left[\bar{x} - t_{\alpha/2}\dfrac{s_1}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}\dfrac{s_1}{\sqrt{n}}\right]$
• Where t_{α/2} is given such that
• P(-t_{α/2} < T < t_{α/2}) = (1-α), for T ~ t(n-1).
13. Some Critical Values of t(n-1) distribution for
given α and d.f. (n-1)
[Figure: t(n-1) density centred at 0, with area 1-α between -t_{α/2} and t_{α/2} and area α/2 in each tail.]
d.f. (n-1)    Critical Value at α = 0.05    Critical Value at α = 0.10
1             12.706                        6.314
2             4.303                         2.920
3             3.182                         2.353
4             2.776                         2.132
5             2.571                         2.015
6             2.447                         1.943
7             2.365                         1.895
14. • Consider the same example
• A sample of 11 circuits from a large normal population
has a mean resistance of 2.20 ohms.
• Population standard deviation is not known.
• Sample standard deviation (s1) is 0.35 ohms.
• Determine a 95% confidence interval for the true mean
resistance of the population.
• Ans.
$\bar{x} \pm t_{0.025}\dfrac{s_1}{\sqrt{n}} = 2.20 \pm 2.22814\,(0.35/\sqrt{11}) = 2.20 \pm 0.2351 = (1.9649,\ 2.4351)$
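A minimal sketch of the t-based interval follows. The Python standard library has no t quantile function, so the critical value is passed in by the caller; here it is 2.22814, the value for 10 d.f. used in the answer above. The helper name `t_interval_mean` is hypothetical.

```python
from math import sqrt

def t_interval_mean(x_bar, s1, n, t_crit):
    """CI for the mean when sigma is unknown and n is small.

    t_crit is the critical value t_{alpha/2} with (n-1) d.f.,
    looked up from a t table by the caller.
    """
    h = t_crit * s1 / sqrt(n)
    return x_bar - h, x_bar + h

# Circuit example with s1 = 0.35 and t_{0.025} for 10 d.f. = 2.22814.
lo, hi = t_interval_mean(2.20, 0.35, 11, t_crit=2.22814)
print(round(lo, 4), round(hi, 4))  # 1.9649 2.4351
```

The interval is wider than the σ-known interval (1.9932, 2.4068) because the t critical value exceeds 1.96.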
15. Case 2: σ is unknown and n is large
• Population may or may not have normal distribution
• (1-α)x100 % Confidence interval estimate of μ is given
by
$\left[\bar{x} - z_{\alpha/2}\dfrac{s_1}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\dfrac{s_1}{\sqrt{n}}\right]$
• Where z_{α/2} is given such that
• For Z~N(0,1), P(-z_{α/2} < Z < z_{α/2}) = (1-α).
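16. Confidence Interval Estimate of μ
• σ known
n small (Normal Distribution) or n large (Any Distribution):
$\left[\bar{x} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right]$
• σ unknown
n small (Normal Distribution):
$\left[\bar{x} - t_{\alpha/2}\dfrac{s_1}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}\dfrac{s_1}{\sqrt{n}}\right]$
n large (Any Distribution):
$\left[\bar{x} - z_{\alpha/2}\dfrac{s_1}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\dfrac{s_1}{\sqrt{n}}\right]$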
17. Confidence Intervals for Population Proportion π
Case 1:
• Small Sample: out of scope
Case 2:
• Large Sample
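• We know that for large n, the sample proportion p satisfies
$Z = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}} \sim N(0,1)$
• For Z~N(0,1), we have
$P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$
or $P\left(-z_{\alpha/2} < \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}} < z_{\alpha/2}\right) = 1 - \alpha$
or $P\left(p - z_{\alpha/2}\sqrt{\dfrac{\pi(1-\pi)}{n}} < \pi < p + z_{\alpha/2}\sqrt{\dfrac{\pi(1-\pi)}{n}}\right) = 1 - \alpha$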
18. • Thus (1-α)x100 % CI estimate of π is given by
$\left[p - z_{\alpha/2}\sqrt{\dfrac{\pi(1-\pi)}{n}},\ p + z_{\alpha/2}\sqrt{\dfrac{\pi(1-\pi)}{n}}\right]$
• This expression itself contains π, which is
unknown
• So, this CI estimate cannot be computed as it stands.
• We replace π by its unbiased estimate, the sample proportion p
• Then, (1-α)x100 % CI estimate of π is given by
$\left[p - z_{\alpha/2}\sqrt{\dfrac{pq}{n}},\ p + z_{\alpha/2}\sqrt{\dfrac{pq}{n}}\right]$
• Where q=1-p.
• Required Assumption: Large Sample only.
19. • Example:
• A random sample of 100 people shows that 25
have opened IRA (individual retirement
arrangement) this year.
• Construct a 95% confidence interval for the true
proportion of population who have opened IRA.
• Ans
$p \pm z_{0.025}\sqrt{p(1-p)/n} = 25/100 \pm 1.96\sqrt{(0.25)(0.75)/100} = 0.25 \pm 1.96\,(0.0433) = (0.1651,\ 0.3349)$
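As a sketch, the large-sample interval for a proportion can be computed as below. `proportion_interval` is a hypothetical helper name, and the critical value is taken from `NormalDist` instead of a table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Large-sample CI for a population proportion: p +/- z * sqrt(pq/n)."""
    p = successes / n            # sample proportion
    q = 1 - p
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    h = z * sqrt(p * q / n)
    return p - h, p + h

# IRA example: 25 of 100 sampled people, 95% confidence.
lo, hi = proportion_interval(25, 100)
print(round(lo, 4), round(hi, 4))  # 0.1651 0.3349
```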
20. Confidence Interval for Population Variance σ²
• Variance is an inverse measure of the group’s
homogeneity.
• Variance is an important indicator of total quality in
standardized products and services.
• Managers improve processes by reducing variance.
• Variance is a measure of financial risk.
• Variance of rates of return helps managers assess
financial and capital investment alternatives.
• Variability is a reality in global markets.
• Productivity, wages, and costs of living vary between
regions and nations.
21. Confidence Interval for Population Variance σ²
Case 1:
• Small Sample
• Parent Population is Normal
• Let us take a sample $x_1, x_2, \ldots, x_n$ from N(μ,σ).
• Then, $\dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sigma^2} \sim \chi^2_{(n-1)}$
• We know that $s_1^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
• So, $\dfrac{(n-1)s_1^2}{\sigma^2} \sim \chi^2_{(n-1)}$
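22. • Then, (1-α)x100 % CI estimate of σ² is given by
$\dfrac{(n-1)s_1^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \dfrac{(n-1)s_1^2}{\chi^2_{1-\alpha/2}}$
• Or
$\left[\dfrac{(n-1)s_1^2}{\chi^2_{\alpha/2}},\ \dfrac{(n-1)s_1^2}{\chi^2_{1-\alpha/2}}\right]$
• Here, $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are critical values (with upper-tail areas α/2 and 1-α/2) obtained
using Chi Square distribution with (n-1) d.f.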
25. • Example:
• The cholesterol concentrations in the yolks of a
sample of 18 randomly selected eggs laid by
genetically engineered chickens were found to
have a mean value of 9.38 mg/g of yolk and a
standard deviation of 1.62 mg/g.
• Use this information to construct a confidence
interval estimate of the true variance of the
cholesterol concentration in these egg yolks.
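The example does not state a confidence level, so the sketch below assumes 95% purely for illustration. The two chi-square critical values for 17 d.f. (30.191 for upper-tail area 0.025 and 7.564 for upper-tail area 0.975) are standard table values supplied by hand, because the Python standard library has no chi-square quantile function. The helper name `variance_interval_chi2` is hypothetical.

```python
def variance_interval_chi2(s1, n, chi2_upper, chi2_lower):
    """Small-sample CI for sigma^2 from a normal population.

    chi2_upper: critical value with upper-tail area alpha/2
    chi2_lower: critical value with upper-tail area 1 - alpha/2
    Both use (n-1) d.f. and are looked up from a table by the caller.
    """
    ss = (n - 1) * s1 ** 2          # (n-1) * s1^2
    return ss / chi2_upper, ss / chi2_lower

# Egg-yolk example: n = 18, s1 = 1.62 mg/g.
# ASSUMPTION: 95% confidence, so alpha/2 = 0.025 with 17 d.f.
lo, hi = variance_interval_chi2(1.62, 18, chi2_upper=30.191, chi2_lower=7.564)
print(round(lo, 3), round(hi, 3))
```

Under this assumed level the interval for σ² is roughly 1.48 to 5.90 (mg/g)². It is not symmetric about s_1² = 2.6244 because the chi-square distribution is skewed.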
26. Confidence Interval for Population Variance σ²
Case 2:
• Large Sample
• Parent Population may or may not be Normal
• We know that $E(s_1^2) = \sigma^2$
• Also, $S.E.(s_1^2) = \sigma^2\sqrt{\dfrac{2}{n-1}}$ (Proof is out of scope)
• So, $\dfrac{s_1^2 - \sigma^2}{\sigma^2\sqrt{2/(n-1)}} \sim N(0,1)$ for large samples.
• Using this, (1-α)x100 % CI estimate of σ² is given by
• $\left[\dfrac{s_1^2}{1 + z_{\alpha/2}\sqrt{2/(n-1)}},\ \dfrac{s_1^2}{1 - z_{\alpha/2}\sqrt{2/(n-1)}}\right]$
27. • Example:
• A technologist is developing a new method for processing
a food material.
• For best quality, it is important to control moisture content
in the final product.
• So, as one part of determining the practicality of the new
method, the technologist must estimate the variability of
water content in the resulting product.
• He collects 50 specimens of product from the new
process, and determines the percent water in each.
• These 50 specimens give a sample mean water content of
43.24% and a sample standard deviation of 7.93%.
• Compute a 95% confidence interval estimate of the true
variance of the percentage water for this new process.
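A sketch of this calculation is given below. It assumes the large-sample interval in the form s_1²/(1 ± z_{α/2}√(2/(n-1))), which follows from treating (s_1² - σ²)/(σ²√(2/(n-1))) as approximately N(0,1). The helper name `variance_interval_large` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def variance_interval_large(s1, n, confidence=0.95):
    """Large-sample CI for sigma^2: s1^2 / (1 +/- z * sqrt(2/(n-1)))."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    k = z * sqrt(2 / (n - 1))
    if k >= 1:
        # The upper limit is undefined when n is too small for this approximation.
        raise ValueError("sample too small for the large-sample interval")
    s1_sq = s1 ** 2
    return s1_sq / (1 + k), s1_sq / (1 - k)

# Moisture example: n = 50 specimens, s1 = 7.93 (percent water), 95% confidence.
lo, hi = variance_interval_large(7.93, 50)
print(round(lo, 2), round(hi, 2))
```

For these data the interval runs from about 45 to about 104 (squared percentage points), around the point estimate s_1² = 62.88.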
28. Sample Size Decision
(when Estimating μ)
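• We have seen (for sufficiently large n) that
$\bar{x} \sim N\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right)$ or $Z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$
• Error of Estimation $e = |\bar{x} - \mu|$
• Fix the confidence level at (1-α)x100 %
• Obtain the critical value z_{α/2} using N(0,1) such that P(-z_{α/2} < Z < z_{α/2}) = (1-α)
• Then, we have
$z_{\alpha/2} = \dfrac{e}{\sigma/\sqrt{n}}$ or $n = \dfrac{z_{\alpha/2}^2\,\sigma^2}{e^2}$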
29. • Thus the sample size for estimating population mean μ
is
$n = \dfrac{z_{\alpha/2}^2\,\sigma^2}{e^2}$
• Critical value z_{α/2} can be taken from the table.
• Estimation Error (e) should be fixed by the researcher in
advance.
• Clearly, e ≠ 0
• Population standard deviation σ can be estimated from
some other small sample or pilot survey as
• Range/6 or by the sample standard deviation
30. • Example:
• In a pilot survey, it is observed that the smallest
observation is 6 and the largest observation is 276.
• What should be the sample size needed to estimate the
population mean within ± 5 with 90% confidence level?
• Ans.
Estimate of population standard deviation: $\hat{\sigma} = (276 - 6)/6 = 45$
Estimation Error: e = 5
For 90% confidence level, critical value $z_{0.05} = 1.645$
So, $n = \dfrac{z_{0.05}^2\,\hat{\sigma}^2}{e^2} = \dfrac{(1.645)^2 (45)^2}{5^2} = 219.19 \approx 219$
(rounding up to 220 guarantees the stated precision)
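The sample-size formula for the mean, n = z_{α/2}² σ² / e², can be sketched as follows. `sample_size_mean` is a hypothetical helper; it returns the unrounded value so that the rounding decision stays visible. Because `NormalDist` gives z = 1.64485... rather than the table value 1.645, the raw result is slightly below the 219.19 obtained by hand.

```python
from math import ceil
from statistics import NormalDist

def sample_size_mean(sigma, e, confidence):
    """Unrounded n needed to estimate a mean within +/- e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return (z * sigma / e) ** 2

# Pilot survey: smallest observation 6, largest 276, so sigma is estimated as Range/6.
sigma_hat = (276 - 6) / 6            # 45.0
n_raw = sample_size_mean(sigma_hat, e=5, confidence=0.90)
print(round(n_raw, 2), ceil(n_raw))  # raw value a little over 219; rounded up: 220
```

Rounding up with `ceil` is the conservative choice: any n below the raw value would give a margin of error slightly larger than e.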
31. Sample Size Decision
(when Estimating π)
• Similarly, the sample size for estimating population
proportion π is given by
$n = \dfrac{(z_{\alpha/2})^2\,\pi(1-\pi)}{e^2}$
• For fixed confidence coefficient (1-α), critical value z_{α/2} can
be taken from the normal table.
• Estimation Error (e = |p – π|) should be fixed by the
researcher in advance. Clearly, e ≠ 0
• Population proportion π can be estimated from some other
small sample or pilot survey.
• If no information is available, it can be decided by the
researcher using past experience or can be taken as 0.5.
32. • Example:
• How large a sample would be necessary to
estimate the true proportion defective in a large
population within ±3%, with 95% confidence?
• (Assume a pilot sample yields p = 0.12)
• Ans.
Estimate of population proportion: p = 0.12
Estimation Error: e = 3/100 = 0.03
For 95% confidence level, critical value $z_{0.025} = 1.96$
So, $n = \dfrac{(z_{0.025})^2\,pq}{e^2} = \dfrac{(1.96)(1.96)(0.12)(0.88)}{(0.03)(0.03)} = 450.75 \approx 451$
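A minimal sketch of the sample-size calculation for a proportion, n = z_{α/2}² p(1-p) / e². The helper name `sample_size_proportion` is hypothetical; the critical value comes from `NormalDist`, so the raw value differs marginally from the hand calculation with 1.96, but the rounded-up answer is the same.

```python
from math import ceil
from statistics import NormalDist

def sample_size_proportion(p, e, confidence=0.95):
    """Unrounded n needed to estimate a proportion within +/- e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z ** 2 * p * (1 - p) / e ** 2

# Pilot estimate p = 0.12, target error 3 percentage points, 95% confidence.
n_raw = sample_size_proportion(0.12, 0.03)
print(ceil(n_raw))  # 451

# With no pilot information, p = 0.5 maximises p(1-p) and gives the largest n.
n_worst = sample_size_proportion(0.5, 0.03)
```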
33. Sample Size Decision
(when Estimating σ²)
• We know, for large samples,
$\dfrac{s_1^2 - \sigma^2}{\sigma^2\sqrt{2/(n-1)}} \sim N(0,1)$
• Similarly, the sample size for estimating population variance σ² is
given by
$n = 1 + \dfrac{2\,z_{\alpha/2}^2\,\sigma^4}{e^2}$
• For fixed confidence coefficient (1-α), critical value z_{α/2} can be
taken from the normal table.
• Estimation Error $e = |s_1^2 - \sigma^2|$ should be fixed by the
researcher in advance. Clearly, e ≠ 0
• Population variance σ² can be estimated from some other small
sample or pilot survey.
• If no information is available, it can be decided by the researcher
using past experience or can be taken as the square of Range/6.
34. Estimating Total
• In auditing, one is often more interested in estimating the
population total amount.
• The point estimate of it is given by $N \times \bar{x}$
• The CI estimate at (1-α)x100 % confidence level is given by
• $N\bar{x} \mp N z_{\alpha/2} \times \dfrac{s_1}{\sqrt{n}}$, for large samples
• $N\bar{x} \mp N z_{\alpha/2} \times \dfrac{s_1}{\sqrt{n}} \times \sqrt{\dfrac{N-n}{N-1}}$, for large samples, if $\dfrac{n}{N} \ge 0.05$.
35. Example: A firm has a population of 1000 accounts and
wishes to estimate the total population value.
• A sample of 80 accounts is selected with average
balance of $87.6 and standard deviation of $22.3.
• Find the 95% confidence interval estimate of the total
balance.
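• Ans:
N = 1000, n = 80, $\bar{x}$ = 87.6, s_1 = 22.3
n/N = 80/1000 = 0.08 ≥ 0.05, so the finite population correction is used.
$N\bar{x} \pm N z_{0.025}\dfrac{s_1}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = (1000)(87.6) \pm (1000)(1.96)\dfrac{22.3}{\sqrt{80}}\sqrt{\dfrac{1000-80}{1000-1}}$
With $z_{0.025} = 1.96$ this gives 87,600 ± 4,689.51 = (82,910.49, 92,289.51).
With the t critical value $t_{0.025}(79) \approx 1.9905$ in place of 1.96 it gives 87,600 ± 4,762.48 = (82,837.52, 92,362.48).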
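The interval for a population total can be sketched as below, using the normal critical value and applying the finite population correction when n/N ≥ 0.05. `total_interval` and its `fpc_threshold` parameter are hypothetical names introduced for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def total_interval(N, n, x_bar, s1, confidence=0.95, fpc_threshold=0.05):
    """Large-sample CI for a population total N * x_bar."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = s1 / sqrt(n)                       # standard error of the sample mean
    if n / N >= fpc_threshold:
        se *= sqrt((N - n) / (N - 1))       # finite population correction
    h = N * z * se
    return N * x_bar - h, N * x_bar + h

# Accounts example: N = 1000 accounts, n = 80 sampled, mean $87.6, s1 = $22.3.
lo, hi = total_interval(1000, 80, 87.6, 22.3)
print(round(lo, 2), round(hi, 2))
```

With the normal critical value the half-width comes out near 4,689; a half-width of 4,762.48 corresponds to using a t critical value of about 1.9905 (79 d.f.) instead.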
36. • Example:
• Econe Dresses has 1200 inventory items.
• In the past, 15% of items were incorrectly priced.
• A sample of 120 items was selected.
• Tagged cost of each item was compared with the
actual value.
• 15 items differ in their tagged cost and actual
cost.
• These values are as follows:
37. Differences D_i = Tagged Cost - Actual Value for the 15 items that differ
Tagged Cost    Actual Value    D_i
261            240             21
87             105             -18
201            276             -75
121            110             11
315            298             17
411            356             55
249            211             38
216            305             -89
21             210             -189
140            152             -12
129            112             17
340            216             124
341            402             -61
135            97              38
228            220             8
n = 120, N = 1200, $\bar{D} = -0.95833$, $s_D = 25.24482$
(computed over all 120 sampled items, with D_i = 0 for the other 105)
n/N = 120/1200 = 0.1 > 0.05,
So we use fpc
95% CI is
$N\bar{D} \pm N z_{0.025}\dfrac{s_D}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = \left[1200(-0.95833) \pm 1200(1.96)\dfrac{25.24482}{\sqrt{120}}\sqrt{\dfrac{1200-120}{1200-1}}\right]$
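The summary statistics and the interval for the total pricing error can be reproduced from the table. The sketch below treats the 105 sampled items whose tagged cost matched the actual value as differences of zero, which is what the example's description implies; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev

tagged = [261, 87, 201, 121, 315, 411, 249, 216, 21, 140, 129, 340, 341, 135, 228]
actual = [240, 105, 276, 110, 298, 356, 211, 305, 210, 152, 112, 216, 402, 97, 220]
N, n = 1200, 120

# Differences for the 15 mispriced items, plus zeros for the other 105 sampled items.
d = [t - a for t, a in zip(tagged, actual)] + [0] * (n - len(tagged))

d_bar = fmean(d)     # mean difference per item
s_d = stdev(d)       # sample standard deviation (divisor n-1)

z = NormalDist().inv_cdf(0.975)                        # 95% confidence
fpc = sqrt((N - n) / (N - 1))                          # n/N = 0.1 > 0.05
h = N * z * s_d / sqrt(n) * fpc
lo, hi = N * d_bar - h, N * d_bar + h
print(round(d_bar, 5), round(s_d, 5))
print(round(lo, 1), round(hi, 1))
```

The point estimate of the total difference is N × D̄ = -1150, and the interval (roughly -6,294 to 3,994) contains zero, so these data do not establish a net over- or under-statement of inventory value.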
38. SUMMARY (INTERVAL ESTIMATES)
• Population Mean (μ)
σ is known
Small sample (Normal Distribution) or Large sample (Any Distribution): $\bar{x} \pm z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
σ is not known
Small sample (Normal Distribution): $\bar{x} \pm t_{\alpha/2}\dfrac{s_1}{\sqrt{n}}$
Large sample (Any Distribution): $\bar{x} \pm z_{\alpha/2}\dfrac{s_1}{\sqrt{n}}$
• Population Proportion (π)
Small sample: OUT OF SCOPE
Large sample (Any Distribution): $p \pm z_{\alpha/2}\sqrt{\dfrac{pq}{n}}$
• Population Variance (σ²)
Small sample (Normal Distribution): $\left[\dfrac{(n-1)s_1^2}{\chi^2_{\alpha/2}},\ \dfrac{(n-1)s_1^2}{\chi^2_{1-\alpha/2}}\right]$
Large sample (Any Distribution): $\dfrac{s_1^2}{1 \pm z_{\alpha/2}\sqrt{2/(n-1)}}$
39. SUMMARY (SAMPLE SIZE DECISION)
• For estimating Population Mean (μ)
Large sample (Any Distribution): $n = \dfrac{z_{\alpha/2}^2\,\sigma^2}{e^2}$
• For estimating Population Proportion (π)
Large sample (Any Distribution): $n = \dfrac{(z_{\alpha/2})^2\,\pi(1-\pi)}{e^2}$
• For estimating Population Variance (σ²)
Large sample (Any Distribution): $n = 1 + \dfrac{2\,z_{\alpha/2}^2\,\sigma^4}{e^2}$