1.
Introduction to
Sampling Distributions and
Estimating Population Values
Istanbul Bilgi University
FEC 512 Financial Econometrics-I
Dr. Orhan Erdem
2.
Unbiasedness
A point estimator $\hat{\theta}$ is said to be an
unbiased estimator of the parameter θ if
the expected value, or mean, of the
sampling distribution of $\hat{\theta}$ is θ:
$$E(\hat{\theta}) = \theta$$
Examples:
The sample mean is an unbiased estimator of µ
The sample variance $s^2$ (with divisor n − 1) is an unbiased estimator of $\sigma^2$
Lecture 4- 2
FEC 512
4.
Bias
Let $\hat{\theta}$ be an estimator of θ
The bias in $\hat{\theta}$ is defined as the difference
between its mean and θ:
$$\mathrm{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta$$
The bias of an unbiased estimator is 0
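As a minimal sketch of how a bias can be computed exactly, the Python below enumerates every equally likely sample of size 2 drawn with replacement from a small population and compares two variance estimators. The population values (18, 20, 22, 24) and the helper names `mean` and `sum_sq_dev` are assumptions chosen for illustration only.

```python
from fractions import Fraction
from itertools import product

# Illustrative population (an assumption for this sketch)
population = [18, 20, 22, 24]
n = 2  # sample size


def mean(values):
    """Exact arithmetic mean as a Fraction."""
    return Fraction(sum(values), len(values))


def sum_sq_dev(values):
    """Sum of squared deviations from the mean of `values`."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


# Population variance (divisor N)
sigma2 = sum_sq_dev(population) / len(population)

# All equally likely ordered samples of size n, drawn with replacement
samples = list(product(population, repeat=n))

# Expected value of each estimator = average over all possible samples
e_div_n_minus_1 = mean([sum_sq_dev(s) / (n - 1) for s in samples])
e_div_n = mean([sum_sq_dev(s) / n for s in samples])

# Bias = expected value of the estimator minus the true parameter
bias_div_n_minus_1 = e_div_n_minus_1 - sigma2
bias_div_n = e_div_n - sigma2

print("population variance:", sigma2)
print("bias with divisor n - 1:", bias_div_n_minus_1)
print("bias with divisor n:", bias_div_n)
```

In this small case the estimator with divisor n − 1 has zero bias, while the one with divisor n underestimates the population variance on average.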
5.
Consistency
Let $\hat{\theta}$ be an estimator of θ
$\hat{\theta}$ is a consistent estimator of θ if the
difference between the expected value of $\hat{\theta}$
and θ decreases as the sample size
increases
Consistency is desired when unbiased
estimators cannot be obtained
6.
Most Efficient Estimator
Suppose there are several unbiased estimators of θ
The most efficient estimator or the minimum variance
unbiased estimator of θ is the unbiased estimator with
the smallest variance
Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators of θ, based
on the same number of sample observations. Then
$\hat{\theta}_1$ is said to be more efficient than $\hat{\theta}_2$ if
$$\mathrm{Var}(\hat{\theta}_1) < \mathrm{Var}(\hat{\theta}_2)$$
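The comparison of variances can be illustrated with a small simulation. The sketch below assumes a hypothetical standard normal population, for which both the sample mean and the sample median are unbiased estimators of the center, and estimates the variance of each across repeated samples. The seed, sample size, and number of replications are arbitrary choices.

```python
import random
import statistics

random.seed(512)  # fixed seed so the run is reproducible

MU, SIGMA = 0.0, 1.0   # hypothetical normal population (assumption)
N_OBS = 25             # observations per sample
N_REPS = 2000          # number of simulated samples

means, medians = [], []
for _ in range(N_REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N_OBS)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

# Variance of each estimator across the simulated samples
var_mean = statistics.pvariance(means)
var_median = statistics.pvariance(medians)

print(f"Var(sample mean)   = {var_mean:.4f}")
print(f"Var(sample median) = {var_median:.4f}")
```

Under these assumptions the sample mean shows the smaller variance, so it is the more efficient of the two estimators for this population.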
7.
Sampling Distribution
A sampling distribution is the
distribution of the possible values
of a statistic for samples of a given
size selected from a population
8.
Developing a Sampling Distribution
Assume there is a population …
Population size N = 4 (individuals A, B, C, D)
Random variable, X,
is age of individuals
Values of X:
18, 20, 22, 24 (years)
9.
Developing a Sampling Distribution
(continued)
Summary Measures for the Population Distribution:
$$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$
$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$$
[Figure: uniform population distribution, P(x) = .25 for each of x = 18, 20, 22, 24 (individuals A, B, C, D)]
10.
Developing a Sampling Distribution
(continued)
Now consider all possible samples of size n = 2

16 possible samples (sampling with replacement):

1st Obs | 2nd Observation
        | 18     20     22     24
18      | 18,18  18,20  18,22  18,24
20      | 20,18  20,20  20,22  20,24
22      | 22,18  22,20  22,22  22,24
24      | 24,18  24,20  24,22  24,24

16 sample means:

1st Obs | 2nd Observation
        | 18  20  22  24
18      | 18  19  20  21
20      | 19  20  21  22
22      | 20  21  22  23
24      | 21  22  23  24
11.
Developing a Sampling Distribution
(continued)
Sampling Distribution of All Sample Means

16 sample means:

1st Obs | 2nd Observation
        | 18  20  22  24
18      | 18  19  20  21
20      | 19  20  21  22
22      | 20  21  22  23
24      | 21  22  23  24

[Figure: sample means distribution, $P(\bar{X})$ against $\bar{X}$ = 18, 19, ..., 24 (no longer uniform)]
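The table of 16 sample means can be turned into a probability distribution by counting how often each mean occurs. The following sketch does this by enumeration; the variable names are illustrative, and exact fractions are used to avoid rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

ages = [18, 20, 22, 24]   # population values of X from the slides
n = 2                     # sample size

# Every ordered sample of size n, drawn with replacement (4**2 = 16 samples)
samples = list(product(ages, repeat=n))

# Count how often each sample mean occurs
counts = Counter(Fraction(sum(s), n) for s in samples)

# Each sample is equally likely, so P(mean) = count / number of samples
distribution = {
    m: Fraction(c, len(samples)) for m, c in sorted(counts.items())
}

for m, p in distribution.items():
    print(f"mean = {m}: P = {p} ({float(p):.4f})")
```

The probabilities peak at the center value 21 and fall off symmetrically toward 18 and 24, which is why the distribution is no longer uniform.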
12.
Developing a
Sampling Distribution
(continued)
Summary Measures of this Sampling Distribution:
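$$E(\bar{X}) = \frac{\sum \bar{X}_i}{N} = \frac{18 + 19 + 21 + \cdots + 24}{16} = 21 = \mu$$
$$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu)^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$$
(here N = 16 is the number of possible samples)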
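These summary measures can be checked numerically. The sketch below recomputes the population mean and standard deviation, then the mean and standard deviation of the 16 sample means, and compares the latter with σ/√n. Variable names are illustrative.

```python
from itertools import product
from math import sqrt

ages = [18, 20, 22, 24]   # population values of X from the slides
n = 2                     # sample size

# Population mean and standard deviation (divisor N)
mu = sum(ages) / len(ages)
sigma = sqrt(sum((x - mu) ** 2 for x in ages) / len(ages))

# Means of all 16 ordered samples drawn with replacement
sample_means = [sum(s) / n for s in product(ages, repeat=n)]

mean_of_means = sum(sample_means) / len(sample_means)
sd_of_means = sqrt(
    sum((m - mean_of_means) ** 2 for m in sample_means) / len(sample_means)
)

print("population:   mean =", mu, " sd =", round(sigma, 3))
print("sample means: mean =", mean_of_means, " sd =", round(sd_of_means, 2))
print("sigma / sqrt(n)    =", round(sigma / sqrt(n), 2))
```

The standard deviation of the sample means matches σ/√n = 2.236/√2 ≈ 1.58, and their mean equals the population mean of 21.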
13.
Comparing the Population with its
Sampling Distribution
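Population (N = 4): $\mu = 21$, $\sigma = 2.236$
Sample Means Distribution (n = 2): $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$
[Figure: P(X) against X = 18, 20, 22, 24 (A, B, C, D) for the population, next to $P(\bar{X})$ against $\bar{X}$ = 18, 19, ..., 24 for the sample means]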
15.
Histogram of 500 Sample Means from
Sample Size n=10
Mean of the Sample Means is 2.41 with
0.421 St.Dev., where $\sigma/\sqrt{n} = 1.507/\sqrt{10} = 0.477$
16.
Mean of the Sample Means is 2.53 with
0.376 St.Dev., where $\sigma/\sqrt{n} = 1.507/\sqrt{20} = 0.337$
17.
Properties of a Sampling Distribution
For any population,
the average value of all possible sample means
computed from all possible random samples of a given
size from the population is equal to the population mean:
$$\mu_{\bar{x}} = \mu \qquad \text{(Theorem 1)}$$
The standard deviation of the possible sample means
computed from all random samples of size n is equal to
the population standard deviation divided by the square
root of the sample size:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \qquad \text{(Theorem 2)}$$
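Theorem 2 is simple to evaluate in code. The sketch below defines a hypothetical helper, `standard_error`, and applies it to the values quoted on the two histogram slides (σ = 1.507 with n = 10 and n = 20).

```python
from math import sqrt


def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / sqrt(n)


# Values taken from the histogram slides: sigma = 1.507, n = 10 and n = 20
for n in (10, 20):
    print(f"n = {n}: sigma / sqrt(n) = {standard_error(1.507, n):.3f}")
```

Doubling the sample size from 10 to 20 shrinks the standard error by a factor of √2, not by a factor of 2.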
18.
If the Population is Normal
If a population is normal with mean µ and
standard deviation σ, the sampling distribution
of $\bar{x}$ is also normally distributed with
$$\mu_{\bar{x}} = \mu \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \qquad \text{(Theorem 3)}$$
19.
z-value for Sampling Distribution
of $\bar{x}$
Z-value for the sampling distribution of $\bar{x}$:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
where:
$\bar{x}$ = sample mean
µ = population mean
σ = population standard deviation
n = sample size
20.
Sampling Distribution Properties
The sample mean is an unbiased estimator:
$$\mu_{\bar{x}} = \mu$$
[Figure: normal population distribution centered at µ, above the normal sampling distribution of $\bar{x}$, which has the same mean $\mu_{\bar{x}}$]
21.
Sampling Distribution Properties
(continued)
The sample mean is a consistent estimator
(the value of $\bar{x}$ becomes closer to µ as n
increases):
As n increases, $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ decreases
[Figure: population distribution compared with the sampling distributions of $\bar{x}$ for a small and a larger sample size, all centered at µ]
22.
If the Population is not Normal
We can apply the Central Limit Theorem:
Even if the population is not normal,
…sample means from the population will be
approximately normal as long as the sample size
is large enough
…and the sampling distribution will have
$$\mu_{\bar{x}} = \mu \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \qquad \text{(Theorem 4)}$$
23.
Central Limit Theorem
As the sample size gets large enough (n↑),
the sampling distribution becomes almost normal
regardless of shape of population
24.
If the Population is not Normal
(continued)
Sampling distribution properties:
Central Tendency: $\mu_{\bar{x}} = \mu$
Variation: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ (Sampling with replacement)
[Figure: population distribution with mean µ, above the sampling distributions of $\bar{x}$ for a smaller and a larger sample size (becomes normal as n increases)]
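A simulation can make the central limit theorem concrete. The sketch below assumes a hypothetical Exponential(1) population (mean 1, standard deviation 1, strongly right-skewed) and compares the simulated sampling distribution of the mean for n = 2 and n = 30. The seed, replication count, and the `skewness` helper are illustrative choices, not part of the lecture.

```python
import random
import statistics
from math import sqrt

random.seed(4)  # fixed seed so the run is reproducible

N_REPS = 5000   # number of simulated samples per sample size


def sample_means(n):
    """Means of N_REPS samples of size n from an Exponential(1) population."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(N_REPS)
    ]


def skewness(values):
    """Standardized third moment; 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


results = {}
for n in (2, 30):
    means = sample_means(n)
    results[n] = (
        statistics.fmean(means),    # should be close to mu = 1
        statistics.pstdev(means),   # should be close to 1 / sqrt(n)
        skewness(means),            # shrinks toward 0 as n grows
    )
    print(
        f"n = {n:2d}: mean = {results[n][0]:.3f}, "
        f"sd = {results[n][1]:.3f} (theory {1 / sqrt(n):.3f}), "
        f"skewness = {results[n][2]:.3f}"
    )
```

The mean of the sample means stays near the population mean for both sample sizes, the spread follows σ/√n, and the skewness moves toward zero as n grows, consistent with the sampling distribution approaching a normal shape.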
25.
How Large is Large Enough?
For most distributions, n > 25 will give a
sampling distribution that is nearly normal
For fairly symmetric distributions, n > 15 is
sufficient
For normal population distributions, the
sampling distribution of the mean is always
normally distributed
26.
Example
Suppose a population has mean µ = 8 and
standard deviation σ = 3. Suppose a
random sample of size n = 36 is selected.
What is the probability that the sample mean
is between 7.8 and 8.2?
27.
Example
(continued)
Solution:
Even if the population is not normally
distributed, the central limit theorem can be
used (n > 30)
… so the sampling distribution of $\bar{x}$ is
approximately normal
… with mean $\mu_{\bar{x}} = \mu = 8$
… and standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$
28.
Example
(continued)
Solution (continued) -- find z-scores:
$$P(7.8 < \bar{x} < 8.2) = P\left(\frac{7.8 - 8}{3/\sqrt{36}} < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < \frac{8.2 - 8}{3/\sqrt{36}}\right)$$
$$= P(-0.4 < z < 0.4) = 0.3108$$
[Figure: population distribution (µ = 8) → sample → sampling distribution of $\bar{x}$ ($\mu_{\bar{x}} = 8$, area between 7.8 and 8.2) → standardize → standard normal distribution ($\mu_z = 0$, areas .1554 + .1554 between -0.4 and 0.4)]
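The same probability can be computed with the Python standard library. This sketch standardizes the two bounds and uses `statistics.NormalDist` for the standard normal CDF; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8, 3, 36
se = sigma / sqrt(n)              # standard deviation of the sample mean

# Standardize the bounds of the interval for the sample mean
z_low = (7.8 - mu) / se
z_high = (8.2 - mu) / se

# Area under the standard normal curve between the two z-values
std_normal = NormalDist()         # mean 0, standard deviation 1
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print("standard error:", se)
print("z-values:", round(z_low, 2), round(z_high, 2))
print("P(7.8 < sample mean < 8.2) =", round(prob, 4))
```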
29.
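Suppose that $Y_1, Y_2, Y_3, \ldots, Y_n$ are i.i.d., and let $\mu_Y$ and
$\sigma_Y^2$ denote the mean and the variance of $Y_i$.
$$E(\bar{Y}) = \frac{1}{n} \sum_{i=1}^{n} E(Y_i) = \mu_Y$$
$$\mathrm{Var}(\bar{Y}) = \mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} Y_i\right)$$
$$= \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(Y_i) + \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1, j \neq i}^{n} \mathrm{Cov}(Y_i, Y_j)$$
$$= \frac{\sigma_Y^2}{n}$$
The covariance terms are zero because the $Y_i$ are independent, and each $\mathrm{Var}(Y_i) = \sigma_Y^2$.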
30.
Point and Interval Estimates
A point estimate is a single number;
a confidence interval provides additional
information about variability
[Figure: the confidence interval runs from the lower confidence limit to the upper confidence limit, with the point estimate between them; the distance between the limits is the width of the confidence interval]
31.
Confidence Intervals
How much uncertainty is associated with a point
estimate of a population parameter?
An interval estimate provides more information about
a population characteristic than does a point estimate
Such interval estimates are called confidence
intervals
Never 100% sure:
“The surer we want to be, the less we have to be
sure of” -Freund and Williams(1977)-
32.
Estimation Process
[Diagram: Population (mean, µ, is unknown) → Random Sample (mean $\bar{x}$ = 50) → "I am 95% confident that µ is between 40 & 60."]
33.
General Formula
The general formula for all
confidence intervals is:
Point Estimate ± (Critical Value)(Standard Error)
34.
Confidence Level, (1-α)
(continued)
Suppose confidence level = 95%
Also written (1 - α) = .95
A relative frequency interpretation:
In the long run, 95% of all the confidence
intervals that can be constructed will contain
the unknown true parameter
A specific interval either will contain or
will not contain the true parameter
No probability involved in a specific interval
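The relative frequency interpretation can be illustrated by simulation. The sketch below assumes a hypothetical normal population with known standard deviation, builds an interval of the form point estimate ± (critical value)(standard error) around each simulated sample mean using a normal critical value, and counts how many intervals contain the true mean. All numeric settings are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(34)  # fixed seed so the run is reproducible

MU, SIGMA = 50.0, 10.0   # hypothetical population; sigma treated as known
N_OBS = 25               # observations per sample
N_REPS = 4000            # number of intervals constructed
CONF = 0.95              # confidence level, 1 - alpha

# Critical value and half-width of each interval
z = NormalDist().inv_cdf(1 - (1 - CONF) / 2)
half_width = z * SIGMA / sqrt(N_OBS)

covered = 0
for _ in range(N_REPS):
    xbar = fmean([random.gauss(MU, SIGMA) for _ in range(N_OBS)])
    # Each specific interval either contains MU or it does not
    if xbar - half_width <= MU <= xbar + half_width:
        covered += 1

coverage = covered / N_REPS
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

The share of intervals that contain the true mean should land close to 0.95, while any single interval simply does or does not contain it.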
35.
Confidence Interval for µ
(σ Known)
Assumptions
Population standard deviation σ is known
Population is normally distributed
If population is not normal, use large
sample
Confidence interval estimate:
$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
36.
Finding the Critical Value
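Consider a 95% confidence interval: $1 - \alpha = .95$, so $\alpha/2 = .025$ in each tail and $z_{\alpha/2} = \pm 1.96$
[Figure: standard normal curve with central area $1 - \alpha = .95$ and area $\alpha/2 = .025$ in each tail. z units: $-z_{.025} = -1.96$, 0, $z_{.025} = 1.96$. x units: lower confidence limit, point estimate, upper confidence limit]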
37.
Common Levels of Confidence
Commonly used confidence levels are
90%, 95%, and 99%
Confidence Level | Confidence Coefficient, 1 − α | z value, $z_{\alpha/2}$
80%              | .80                           | 1.28
90%              | .90                           | 1.645
95%              | .95                           | 1.96
98%              | .98                           | 2.33
99%              | .99                           | 2.576
99.8%            | .998                          | 3.09
99.9%            | .999                          | 3.29
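Critical values such as these can be computed directly from the inverse of the standard normal CDF instead of being read from a table. A minimal sketch, where `z_critical` is a hypothetical helper name:

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical value z_{alpha/2} for a given confidence level."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1 - confidence
    # Upper-tail area alpha/2 corresponds to cumulative area 1 - alpha/2
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%}: z = {z_critical(level):.3f}")
```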
38.
Interval and Level of Confidence
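Sampling Distribution of the Mean
[Figure: sampling distribution of $\bar{x}$ centered at $\mu_{\bar{x}} = \mu$, with central area $1 - \alpha$ and area $\alpha/2$ in each tail; below it, confidence intervals built around sample means $\bar{x}_1$, $\bar{x}_2$, ...]
Intervals extend from
$$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \quad \text{to} \quad \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
100(1-α)% of intervals constructed contain µ;
100α% do not.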
39.
Margin of Error
Margin of Error (e): the amount added and
subtracted to the point estimate to form the
confidence interval
Example: Margin of error for estimating µ, σ known:
$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}, \qquad e = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
40.
Factors Affecting Margin of Error
$$e = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
Data variation, σ: e ↓ as σ ↓
Sample size, n: e ↓ as n ↑
Level of confidence, 1 - α: e ↓ if 1 - α ↓
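The effect of each factor can be seen by evaluating the margin-of-error formula while changing one input at a time. In this sketch the helper `margin_of_error` and the baseline numbers (σ = 10, n = 25, 95% confidence) are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma: float, n: int, confidence: float) -> float:
    """e = z_{alpha/2} * sigma / sqrt(n), with sigma known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / sqrt(n)


# Hypothetical baseline, then one factor changed at a time
base = margin_of_error(sigma=10, n=25, confidence=0.95)
less_variation = margin_of_error(sigma=5, n=25, confidence=0.95)
larger_sample = margin_of_error(sigma=10, n=100, confidence=0.95)
lower_confidence = margin_of_error(sigma=10, n=25, confidence=0.90)

print(f"baseline:          e = {base:.3f}")
print(f"smaller sigma:     e = {less_variation:.3f}")
print(f"larger n:          e = {larger_sample:.3f}")
print(f"lower confidence:  e = {lower_confidence:.3f}")
```

Note that quadrupling the sample size only halves the margin of error, because n enters through its square root.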
41.
Confidence Interval for µ
(σ Unknown)
If the population standard deviation σ
is unknown, we can substitute the
sample standard deviation, s
This introduces extra uncertainty,
since s is variable from sample to
sample
So we use the t distribution instead of
the normal distribution
42.
Confidence Interval for µ
(σ Unknown)
(continued)
Assumptions
Population standard deviation is unknown
Population is normally distributed
If population is not normal, use large sample (if
you have a large sample you can still use the z distribution)
Use Student’s t Distribution
Confidence Interval Estimate:
$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$
43.
Student’s t Distribution
The t is a family of distributions
The t value depends on degrees of
freedom (d.f.)
Number of observations that are free to vary after
sample mean has been calculated
d.f. = n - 1
44.
Degrees of Freedom (df)
Idea: Number of observations that are free to vary
after sample mean has been calculated
Example: Suppose the mean of 3 numbers is 8.0
Let x1 = 7
Let x2 = 8
What is x3?
If the mean of these three values is 8.0,
then x3 must be 9 (i.e., x3 is not free to vary)
Here, n = 3, so degrees of freedom = n -1 = 3 – 1 = 2
(2 values can be any numbers, but the third is not free to vary
for a given mean)
45.
Student’s t Distribution
Note: t → z as n increases
t-distributions are bell-shaped and symmetric, but
have ‘fatter’ tails than the normal
[Figure: density curves centered at 0 for the standard normal (t with df = ∞), t (df = 13), and t (df = 5)]
46.
Student’s t Table
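Upper Tail Area

df | .25   | .10   | .05
1  | 1.000 | 3.078 | 6.314
2  | 0.817 | 1.886 | 2.920
3  | 0.765 | 1.638 | 2.353

Let: n = 3, so df = n - 1 = 2; α = .10, so α/2 = .05
The t value with upper tail area α/2 = .05 and df = 2 is 2.920
The body of the table contains t values, not probabilities
[Figure: t curve with upper tail area α/2 = .05 to the right of t = 2.920]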
47.
t distribution values
With comparison to the z value

Confidence Level | t (10 d.f.) | t (20 d.f.) | t (30 d.f.) | z
.80              | 1.372       | 1.325       | 1.310       | 1.28
.90              | 1.812       | 1.725       | 1.697       | 1.645
.95              | 2.228       | 2.086       | 2.042       | 1.96
.99              | 3.169       | 2.845       | 2.750       | 2.576

Note: t → z as n increases
Example
A random sample of n = 25 has $\bar{x} = 50$ and
s = 8. Form a 95% confidence interval for µ
d.f. = n – 1 = 24, so $t_{n-1,\alpha/2} = t_{24,.025} = 2.0639$
The confidence interval is
$$\bar{x} - t_{n-1,\alpha/2} \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{n-1,\alpha/2} \frac{s}{\sqrt{n}}$$
$$50 - (2.0639) \frac{8}{\sqrt{25}} < \mu < 50 + (2.0639) \frac{8}{\sqrt{25}}$$
$$46.698 < \mu < 53.302$$
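The arithmetic of this example can be reproduced in a few lines. The Python standard library has no t quantile function, so this sketch takes the critical value 2.0639 from the example as an input; `t_interval` is a hypothetical helper name.

```python
from math import sqrt


def t_interval(xbar: float, s: float, n: int, t_crit: float):
    """Confidence interval xbar +/- t_crit * s / sqrt(n).

    t_crit must be looked up separately (t table with n - 1 d.f.).
    """
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Values from the example: n = 25, sample mean 50, s = 8, t_{24, .025} = 2.0639
lower, upper = t_interval(xbar=50, s=8, n=25, t_crit=2.0639)
print(f"{lower:.3f} < mu < {upper:.3f}")
```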
49.
Example
A money manager wants to obtain a 95% CI
for fund inflows and outflows over the future.
He calls a random sample of 10 clients
enquiring about their planned additions to
and withdrawals from the fund. He computes
that there will be an average of 5.5m cash
inflows with 10m standard deviation. A
histogram of past data looks fairly normal.
Calculate a 95% CI for the population mean.
50.
Solution
With d.f. = n − 1 = 9:
$$\bar{x} \pm t_{0.025} \frac{s}{\sqrt{n}} = 5.5 \pm 2.262 \frac{10}{\sqrt{10}} = 5.5 \pm 7.15 \text{ (in millions)}$$
The CI for the population mean spans -1.65m to
12.65m. The manager can be confident at the 95%
level that this range includes the population mean
51.
Approximation for Large Samples
Since t approaches z as the sample size
increases, an approximation is sometimes
used when n ≥ 30:
Technically correct: $\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$
Approximation for large n: $\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$
52.
Example: Sharpe Ratio
Suppose an investment advisor takes a
random sample of stock funds and
calculates the average Sharpe
ratio (excess return / st. dev.). The sample
size is 100, and the sample has a standard deviation
of 0.30. If the average Sharpe ratio is
0.45, determine a 90% confidence
interval for the true population mean
Sharpe ratio.
53.
Solution:
(continued)
$$\bar{x} \pm z_{0.05} \frac{s}{\sqrt{n}} = 0.45 \pm 1.645 \frac{0.30}{\sqrt{100}} = 0.45 \pm 1.645 \times 0.03$$
$$= (0.401,\ 0.499)$$
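As a quick check of the Sharpe ratio interval, the sketch below uses the large-sample z approximation with the sample values from the example. A 90% confidence level leaves an upper-tail area of α/2 = 0.05, so the critical value is the 95th percentile of the standard normal. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example
n, xbar, s = 100, 0.45, 0.30
confidence = 0.90

# Critical value with upper-tail area alpha/2 = 0.05
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

half_width = z * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width

print("z =", round(z, 3))
print("90% CI for the mean Sharpe ratio:", round(lower, 3), "to", round(upper, 3))
```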