ECON 1005 INTRODUCTION TO STATISTICS
ESTIMATION
Dr. Henry Bailey
Checklist
• The parameters of a distribution
• The idea of estimating parameters of a distribution
based on a sample
• Estimators and the estimation process
• Point vs interval estimates
• Sampling and non-sampling error
• Bias
• Sampling distribution of the mean and proportion
• Standard error
• The Central Limit Theorem
• The t distribution and working with t tables
• Confidence intervals
Introduction
• Inference
– “a conclusion reached on the basis of evidence and reasoning.”
• Inferential statistics:
– Allows us to make decisions about some characteristics of a population based on sample information.
– I.e. we draw conclusions about a population based on a sample.
Introduction
• We have discussed the characteristics and properties of the probability distributions of random variables.
• These characteristics were the parameters:
– n and p for the Binomial Distribution
– λ for the Poisson Distribution
– μ and σ for the Normal Distribution
• In the real world we often don’t know the values of these parameters and will have to estimate them.
• Three key words:
– Estimation (the process)
– Estimate (the result)
– Estimator (the facilitator)
Three approaches to estimating
unknown population parameters.
1. Census
2. Guess
3. The preferred method:
– draw a random sample of appropriate size from the population,
– use the sample data,
– choose a formula (called a sample statistic) to estimate the unknown population parameter.
Definition of Estimation
Estimation is the process by which we estimate the value of an unknown population parameter by making use of the data from a random sample that was drawn from that population.
THE ESTIMATION PROCESS
1. Identify the Unknown Population Parameter
2. Decide on the Size of the Random Sample: n
3. Select the Random Sample of Size n
4. Choose an Appropriate Sample Statistic [Estimator]
5. Substitute the Sample Data into the Sample Statistic
6. Calculate the estimate and interpret
Two Types of Estimates
• Suppose we seek to estimate the mean age of Level I students on the Campus.
• We may draw a random sample of 100 Level I students from the Campus, record their ages, substitute the 100 values into the formula for the mean of a sample (also called the sample statistic or estimator), and read off the estimate.
• The resulting estimate can be
– a single value, e.g. 20, or
– an interval of values, e.g. (18 – 22).
Two Types of Estimates
• Point estimate of a population parameter
• Interval estimate of a population parameter
Estimators
• How do we use the data from our random sample
to arrive at an estimate?
• We substitute the sample data into a formula
better known as a sample statistic.
• These sample statistics are called estimators.
• A point estimator for an unknown population
parameter is a sample statistic into which the
data from the random sample is substituted, so
as to yield a point estimate of that parameter.
Commonly Used Point Estimators
Population Parameter      Sample Statistic
Mean µ                    Sample Mean, Sample Median, Sample Mode
Standard Deviation σ      Sample Standard Deviation s
Proportion p              Sample Proportion p̂
Example
• The mean and standard deviation of the teaching experience
of faculty members in a department at a University are
unknown. A random sample of 5 faculty members was
selected; their teaching experience in years was as follows:
7, 8, 14, 7, 20
1. Identify suitable point estimators for the mean teaching
experience of the entire faculty
2. Identify suitable point estimators for the standard
deviation of teaching experience of the entire faculty
3. Find a point estimate of the mean teaching experience of
the entire faculty
4. Find a point estimate of the std deviation of the teaching
experience of the entire faculty.
Solution
1. We can use any of three point estimators to estimate the population mean: sample mean, sample mode or sample median.
– On the basis of the three estimators declared in 1. above, we can compute three point estimates.
• Sample Mean = 1/5 (7 + 8 + 14 + 7 + 20) = 11.2
• Sample Mode = 7
• Sample Median = 8
2. We can use the sample standard deviation as the point estimator for the population standard deviation.
• The point estimate of the population standard deviation is the value of s.
s = √[ 1/4 (4.2² + 3.2² + 4.2² + 2.8² + 8.8²) ] = √32.7 = 5.718
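The point estimates in this solution can be checked with a short Python sketch. It uses only the standard `statistics` module; the variable names are illustrative and are not part of the lecture.

```python
import statistics

# Teaching experience (years) of the 5 sampled faculty members
experience = [7, 8, 14, 7, 20]

sample_mean = statistics.mean(experience)      # point estimate of the population mean
sample_median = statistics.median(experience)  # alternative point estimate of the mean
sample_mode = statistics.mode(experience)      # alternative point estimate of the mean
sample_sd = statistics.stdev(experience)       # divisor n - 1, point estimate of sigma

print(sample_mean, sample_median, sample_mode, round(sample_sd, 3))
```

Note that `statistics.stdev` divides by n – 1 and takes the square root, which is the calculation behind s = 5.718.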
Some Issues
• Since we must estimate population parameters from samples, it is inevitable that we will make errors.
– Different sample sizes can give rise to different point estimates when the same estimator is used
– Different estimators can give rise to different point estimates when the same sample is used
– Different estimators and different sample sizes can give rise to different point estimates
– Some estimates will agree with the true value of the population parameter; others will not.
Error in Estimation
• The difference between the point estimate and the true value of the
population parameter is known as the total error in the estimate.
• This total error between the point estimate and the true value of the
population parameter can be the result of both sampling error and non-
sampling error.
• The sampling errors occur because of chance.
• Other errors may also arise as a result of human errors, and not chance;
these tend to impair the results obtained. Such errors are called non-
sampling errors.
TOTAL ERROR IN THE ESTIMATE = SAMPLING ERROR + NON-SAMPLING ERROR
Sources of Non-Sampling Error
• There are many potential sources of non-sampling error:
– Inability to obtain all the required information from all elements of the sample
– Difficulties in defining terms
– Differences in interpretation of questions
– Errors in the data collection such as in recording or coding
– Errors made in the data tabulation activity.
Example
• Consider a statistics class of five students. Their exam scores were: 70, 78, 80, 80 and 95.
• Find the population mean.
• Suppose that a random sample of three students was drawn, i.e. 70, 80 and 95.
• Use the sample data and the sample mean to estimate the population mean.
• What is the difference due to chance?
• Now suppose that we mistakenly recorded 82 instead of 80.
• What would be the new estimate of the population mean?
• What is the new difference between the population mean and the point estimate?
Example Continued
• It is this difference of 1.73 that we call the total error in the
estimate. It is subdivided into two components:
– The sampling error of 1.07
– The non-sampling error of 0.66
• As this error grows, the sample statistic will become less
useful as an estimator of the population parameter.
• We must therefore be able to determine the impact of the
error on the inferences that we will be making by
subjecting the estimators to specific tests. These are
discussed in the next chapter.
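The split of the total error into its two components can be reproduced with a small Python sketch. The variable names are illustrative; the data are the five exam scores, the sample of three, and the mis-recorded value from the example.

```python
import statistics

population = [70, 78, 80, 80, 95]   # exam scores of all five students
sample = [70, 80, 95]               # the random sample of three scores
recorded = [70, 82, 95]             # same sample with 80 mis-recorded as 82

mu = statistics.mean(population)
estimate_correct = statistics.mean(sample)
estimate_recorded = statistics.mean(recorded)

sampling_error = estimate_correct - mu                     # due to chance
non_sampling_error = estimate_recorded - estimate_correct  # due to the recording mistake
total_error = estimate_recorded - mu

print(round(sampling_error, 2), round(non_sampling_error, 2), round(total_error, 2))
```

The non-sampling component is 2/3, which rounds to 0.67; the 0.66 quoted above is the same quantity truncated so that the two parts add to 1.73.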
What is bias?
• Bias is a tendency to lean in a certain direction, either in
favour of or against a particular thing. To be
truly biased means to lack a neutral viewpoint on a particular
topic.
• Statistical bias is a feature of a statistical technique or of its
results whereby the expected value of the results differs from
the true underlying quantitative parameter being estimated.
Unbiased Point Estimator
• A point estimator θ̂ is said to be an unbiased estimator of a population parameter θ if
E(θ̂) = θ
• If E(θ̂) ≠ θ then the point estimator is said to be biased.
• The extent of the bias will be equal to
E(θ̂) – θ
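As an illustration of E(θ̂) – θ, the sketch below compares two estimators of a population variance: one that divides by n and one that divides by n – 1. This comparison is not taken from the slides; it is an assumed example that enumerates every equally likely sample of size 2, drawn with replacement, from the five class scores used elsewhere in this lecture.

```python
from itertools import product

population = [70, 78, 80, 80, 95]          # assumed population (the class scores)
N = len(population)
mu = sum(population) / N
sigma_sq = sum((x - mu) ** 2 for x in population) / N   # population variance

def variance(sample, divisor):
    """Sum of squared deviations from the sample mean, over the given divisor."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / divisor

# All equally likely ordered samples of size 2, drawn with replacement
n = 2
samples = list(product(population, repeat=n))

# Expected value of each estimator = average over all possible samples
expected_div_n = sum(variance(s, n) for s in samples) / len(samples)
expected_div_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)

bias_div_n = expected_div_n - sigma_sq
bias_div_n_minus_1 = expected_div_n_minus_1 - sigma_sq
print(round(bias_div_n, 2), round(bias_div_n_minus_1, 2))
```

Under these assumptions the divisor n – 1 gives zero bias, while the divisor n underestimates the population variance on average.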
Unbiased Point Estimator
• The unbiased estimators that we will use in this course are:
SAMPLING DISTRIBUTION OF THE MEAN
• Return to our population of test scores for the class comprising five students A, B, C, D and E.
• A = 70, B = 78, C = 80, D = 80, E = 95
• Population Mean = 80.6   Population Std Deviation = 8.09
• We will now perform the following activities.
1. Consider all possible samples of three scores from this population; there are 10 such samples.
2. Compute the sample mean for each of the 10 samples.
3. Construct the Frequency Distribution of Sample Means.
4. Construct the Relative Frequency Distribution of Sample Means.
5. Rename Relative Frequency as Probability to create the Probability Distribution of the Sample Means
1 & 2. Generating the 10 Random
Samples of Size 3
3. The Frequency Distribution of Sample Means
Sample mean    Frequency
76.00          2
76.67          1
79.33          1
81.00          1
81.67          2
84.33          2
85.00          1
               Σf = 10
5. The Probability Distribution of Sample Means (or The Sampling Distribution of the Mean)
Sample mean    Probability
76.00          0.2
76.67          0.1
79.33          0.1
81.00          0.1
81.67          0.2
84.33          0.2
85.00          0.1
Σ              1.00
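The five activities listed for this population can be sketched in Python. The sketch enumerates the 10 samples with `itertools.combinations` and tabulates the means with `collections.Counter`; the variable names are illustrative.

```python
from collections import Counter
from itertools import combinations

scores = {"A": 70, "B": 78, "C": 80, "D": 80, "E": 95}

# Steps 1 and 2: every sample of three students and its mean (rounded to 2 dp)
sample_means = [
    round(sum(scores[student] for student in combo) / 3, 2)
    for combo in combinations(scores, 3)
]

# Step 3: frequency distribution of the sample means
frequency = Counter(sample_means)

# Steps 4 and 5: relative frequency, read as probability
probability = {m: f / len(sample_means) for m, f in sorted(frequency.items())}

for m, p in probability.items():
    print(f"{m:6.2f}  {frequency[m]}  {p:.1f}")
```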
Sampling Distributions in this Course
• In general, the probability distribution of a Sample Statistic is called its sampling distribution.
• We will focus on two sampling distributions:
– Sampling Distribution of the Mean
– Sampling Distribution of the Proportion
• In the Sampling Distribution of the Mean, the random variable is the sample mean x̄.
• In the Sampling Distribution of the Proportion, the random variable is the sample proportion p̂.
The Mean of the Sampling Distribution of the Mean
• The mean of the sampling distribution of the mean is
equal to the population mean μ.
Class Activity
• Compute the mean of the Sampling Distribution of the
Mean Score based on the ten random samples of size 3.
• Show that it is indeed equal to the population mean.
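One way to check the class activity is to average the means of all 10 possible samples and compare the result with the population mean. This is a minimal sketch with illustrative variable names.

```python
from itertools import combinations

population = [70, 78, 80, 80, 95]
mu = sum(population) / len(population)

# Mean of each of the 10 possible samples of size 3
sample_means = [sum(sample) / 3 for sample in combinations(population, 3)]

# Each sample is equally likely, so the mean of the sampling distribution
# is the plain average of the 10 sample means
mean_of_sample_means = sum(sample_means) / len(sample_means)

print(round(mu, 2), round(mean_of_sample_means, 2))
```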
The Standard Deviation of the Sampling Distribution of the Mean
• The Standard Deviation of the Sampling Distribution of the Mean is given by σ_x̄ where
• σ_x̄ = σ/√n
• σ_x̄ is also called the standard error.
• The spread of the Sampling Distribution of the Mean is smaller than the spread of the corresponding population distribution.
• The standard deviation of the Sampling Distribution of the Mean decreases as the sample size increases.
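The effect of the sample size on the standard error can be seen in a small sketch. The value σ = 12 and the sample sizes are hypothetical, chosen only so that the results are round numbers.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the mean."""
    return sigma / math.sqrt(n)

sigma = 12.0   # hypothetical population standard deviation
errors = {n: standard_error(sigma, n) for n in (4, 16, 36, 144)}

for n, se in errors.items():
    print(n, se)
```

Quadrupling the sample size halves the standard error, because n enters through its square root.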
What kind of distribution will the Sampling Distribution of the Mean have?
• If the population from which the samples are drawn is normally distributed with mean μ and standard deviation σ, then the Sampling Distribution of the Mean will also be normally distributed with mean μ and standard deviation σ_x̄ (irrespective of the sample size).
• Does the above result hold true if the population is not normally distributed?
[Figure: Parent Population, with the Sampling Distribution of x̄ for n = 2, n = 5 and n = 30]
What kind of Probability Distribution does the Sampling Distribution
of the Mean possess when the population is not Normal ?
The Central Limit Theorem assures us that:
• If the sample size is large, the Sampling Distribution of the Mean will be approximately normally distributed with mean μ and standard deviation σ_x̄ irrespective of the distribution of the population.
• ‘Large’ is taken to mean n ≥ 30
• What happens when the sample size is small i.e. n < 30?
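The Central Limit Theorem statement for large samples can be illustrated by simulation. The sketch below is an assumed example, not part of the lecture: it draws repeated samples of size 30 from a skewed exponential population and compares the mean and spread of the sample means with μ and σ/√n. The seed and the number of replications are arbitrary choices.

```python
import math
import random
import statistics

random.seed(1005)            # fixed seed so the run is repeatable

n = 30                       # 'large' sample size from the slide
replications = 5000          # number of simulated samples (an arbitrary choice)

# Exponential population with rate 1: strongly skewed, mean 1, standard deviation 1
mu, sigma = 1.0, 1.0

sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(replications)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
standard_error = sigma / math.sqrt(n)

print(round(mean_of_means, 3), round(sd_of_means, 3), round(standard_error, 3))
```

The simulated mean of the sample means should sit close to μ and their standard deviation close to σ/√n, even though the population itself is far from normal.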
What kind of Probability Distribution does the Sampling Distribution of the Mean possess when σ is unknown and the sample size is small, i.e. n < 30?
• We must look to the Student t Distribution
• The Student t Distribution is a specific type of bell-shaped distribution with a lower height and a wider spread than the Standard Normal Distribution.
• The Student t Distribution has only one parameter, i.e. the number of degrees of freedom, abbreviated df.
• The number of degrees of freedom is the number of observations that can be freely chosen.
• The mean of the Student t Distribution is 0.
• The standard deviation of the Student t Distribution is √(df/(df – 2)), which is defined for df > 2.
• As the degrees of freedom increase, the Student t Distribution approaches the Standard Normal Distribution.
• If the population from which the samples are drawn is (approximately) normally distributed with mean μ and an unknown standard deviation σ, and the sample is small, then the standardised sample mean follows the Student t Distribution with n – 1 degrees of freedom.
• The random variable of the Student t Distribution is given by t where:
t = (x̄ – μ) / s_x̄, with s_x̄ = s/√n
The Sampling Distribution of Proportion
• The probability distribution of the sample proportion is called the Sampling Distribution of the Proportion.
• The random variable of the Sampling Distribution of the Proportion is p̂.
• The mean of the Sampling Distribution of the Proportion is the population proportion p.
• The standard deviation of the Sampling Distribution of the Proportion is given by √(pq/n), where q = 1 – p.
What is the shape of the Sampling
Distribution of the Proportion?
The Central Limit Theorem assures us that:
• If the sample size is sufficiently large, the Sampling Distribution of the Proportion will be approximately normally distributed with mean p and standard deviation √(pq/n).
• Sufficiently Large means np > 5 and nq > 5.
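The mean, the standard deviation and the 'sufficiently large' rule for the sample proportion can be put together in one small function. The function name and the values of p and n are hypothetical and serve only as an illustration.

```python
import math

def proportion_sampling_distribution(p, n):
    """Mean, standard deviation and normality check for the sample proportion."""
    q = 1 - p
    return {
        "mean": p,
        "standard_deviation": math.sqrt(p * q / n),
        "approximately_normal": n * p > 5 and n * q > 5,
    }

large = proportion_sampling_distribution(p=0.30, n=50)   # np = 15, nq = 35
small = proportion_sampling_distribution(p=0.05, n=50)   # np = 2.5, so the rule fails

print(large)
print(small)
```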
Interval Estimates: Confidence
Intervals
• We were speaking all along about Unbiased Point
Estimators.
• Instead of assigning a single value to an unknown
population parameter, we can construct an interval
of values around the point estimate and make a
probabilistic statement that the interval contains the
value of the corresponding population parameter.
• Such activity is called interval estimation and interval
estimators are called Confidence Intervals.
• These estimators, when applied to the data from a random sample, define an interval that is likely to contain the true value of the population parameter being estimated.
Confidence Level and Confidence Interval
Definition
Each interval is constructed with regard to a given confidence level and is called a confidence interval. The confidence interval is given as
Point estimate ± Margin of error
The confidence level associated with a confidence interval states how much confidence we have that this interval contains the true population parameter. The confidence level is denoted by (1 – α)100%.
Prem Mann, Introductory Statistics, 8/E
Copyright © 2013 John Wiley & Sons. All rights reserved.
[Figure: Interval estimation.]
Interval Estimates
• An interval that is constructed based on the confidence level is called a
confidence interval.
• A 90% Confidence Interval means a 10% significance level i.e. α = 10%
• A 95% Confidence Interval means a 5% significance level i.e. α = 5%
• Confidence Interval Estimates in this course are as follows:
– For the population mean based on large samples with σ known
– For the population mean based on small samples with σ known
– For the population mean based on large samples with σ unknown
– For the population mean based on small samples with σ unknown
– For the population proportion
A 100(1 – α)% Confidence Interval Estimate for the Population Mean μ
• Let X ~ N(μ, σ) where σ is known. A single sample of size n was drawn and the sample mean x̄ is computed.
• On the basis of this sample mean we seek to find a 100(1 – α)% Confidence Interval Estimate for μ.
• A 100(1 – α)% interval estimate for the population mean μ is given by:
x̄ – Zα/2 σ_x̄ ≤ μ ≤ x̄ + Zα/2 σ_x̄
or
(x̄ – Zα/2 σ_x̄ , x̄ + Zα/2 σ_x̄)
where Zα/2 is the standard score that cuts off a tail area of α/2 in the Standard Normal Curve.
A 100(1 – α)% Interval Estimate for the Population Mean μ
(x̄ – Zα/2 σ_x̄ , x̄ + Zα/2 σ_x̄)
where Zα/2 is the standard score that cuts off a tail area of α/2 in the Standard Normal Curve.
E.g. Finding z for a 95% confidence level.
[Figure 8.3: Area in the tails. The central area is 1 – α = 0.95 and each tail area is α/2 = 0.025.]
Example
• Find a 100(1 – α)% Interval Estimate for the Population mean μ using the following:
§ α = 5%
§ Sample mean x̄ = 52
§ σ_x̄ = 4
CI = x̄ – Zα/2 σ_x̄ to x̄ + Zα/2 σ_x̄
Steps:
• Find x̄
• Get Z from tables (using half of alpha)
• Calculate σ_x̄
95% Confidence Interval =
x̄ – Zα/2 σ_x̄ to x̄ + Zα/2 σ_x̄
52 – (1.96 x 4) to 52 + (1.96 x 4)
52 – 7.84 to 52 + 7.84
44.16 to 59.84
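The same interval can be computed without tables. In the sketch below the function name `z_interval` is hypothetical; `statistics.NormalDist().inv_cdf` from the Python standard library plays the role of the Z table, returning the score that cuts off a tail area of α/2.

```python
from statistics import NormalDist

def z_interval(sample_mean, standard_error, alpha):
    """100(1 - alpha)% interval for the mean when the standard error is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # cuts off a tail area of alpha/2
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

lower, upper = z_interval(sample_mean=52, standard_error=4, alpha=0.05)
print(round(lower, 2), round(upper, 2))
```

The computed z is about 1.96, so the result agrees with the worked interval to two decimal places.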
A publishing company has just published a new college
textbook. Before the company decides the price at which to
sell this textbook, it wants to know the average price of all
such textbooks in the market. The research department at the
company took a sample of 25 comparable textbooks and
collected information on their prices. This information
produces a mean price of $145 for this sample. It is known
that the standard deviation of the prices of all such textbooks
is $35 and the population of such prices is normal.
(a) What is the point estimate of the mean price of all such textbooks?
(b) Construct a 90% confidence interval for the mean price of all such college textbooks.
A 100(1 – α)% Interval Estimate for the Population Mean μ
(x̄ – Zα/2 σ_x̄ , x̄ + Zα/2 σ_x̄)
Find Zα/2 for a 90% C.I.
(a) The point estimate of μ is the sample mean, x̄ = $145.
(b) Confidence level is 90% or .90. Here, the area in each tail of the normal distribution curve is α/2 = (1 – .90)/2 = .05. Hence, z = 1.65. With σ_x̄ = σ/√n = 35/√25 = 7.00:
x̄ ± z σ_x̄ = 145 ± 1.65(7.00) = 145 ± 11.55
= (145 – 11.55) to (145 + 11.55)
= $133.45 to $156.55
Example 8-2
According to Moebs Services Inc., an individual checking
account at major U.S. banks costs the banks between $350 and
$450 per year (Time, November 21, 2011). A recent random
sample of 600 such checking accounts produced a mean
annual cost of $500 to major U.S. banks. Assume that the
standard deviation of annual costs to major U.S. banks of all
such checking accounts is $40. Make a 99% confidence interval
for the current mean annual cost to major U.S. banks of all
such checking accounts.
• Confidence level 99% or .99
• The sample size is large (n ≥ 30)
§ Therefore, we use the normal distribution
§ z = 2.58
§ σ_x̄ = σ/√n = 40/√600 ≈ 1.633, so the interval is 500 ± 2.58(1.633) = 500 ± 4.21
§ Thus, we can state with 99% confidence that the current mean annual cost to major U.S. banks of all individual checking accounts is between $495.79 and $504.21
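The arithmetic of Example 8-2 can be checked with a few lines of Python. The variable names are illustrative, and the z value is computed with `statistics.NormalDist` rather than read from the table, so it is about 2.576 instead of the rounded 2.58.

```python
import math
from statistics import NormalDist

n, sample_mean, sigma = 600, 500.0, 40.0
confidence = 0.99

standard_error = sigma / math.sqrt(n)               # sigma / sqrt(n)
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # tail area of alpha/2 = .005
margin = z * standard_error
interval = (round(sample_mean - margin, 2), round(sample_mean + margin, 2))

print(interval)
```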
A 100(1 – α)% Confidence Interval Estimate for the Population Mean μ where σ is unknown
Let X ~ N(μ, σ) where σ is unknown. A single sample of size n was drawn and the sample mean x̄ was computed. On the basis of this single sample mean, find a 100(1 – α)% Confidence Interval Estimate for μ.
• Here we substitute s for the unknown σ.
• However, it matters whether n is large, i.e. (n ≥ 30), or small, i.e. (n < 30)
– If n ≥ 30 the CLT allows us to use the Normal Distribution N(μ, s/√n) as the Sampling Distribution
– If n < 30 we use the Student-t Distribution with n – 1 df as the Sampling Distribution. This rests on the population being normal, not on the CLT.
A 100(1 – α)% Confidence Interval Estimate for the Population Mean μ where σ is unknown and n ≥ 30
• A 100(1 – α)% interval estimate for the population mean μ when n ≥ 30 and σ is unknown is given by
x̄ – Zα/2 s/√n ≤ μ ≤ x̄ + Zα/2 s/√n
or
(x̄ – Zα/2 s/√n , x̄ + Zα/2 s/√n)
• where Zα/2 comes from the Std Normal Distribution and s is the sample standard deviation.
A 100(1 – α)% Confidence Interval Estimate for the Population Mean μ where σ is unknown and n < 30
• A 100(1 – α)% interval estimate for the population mean μ when n < 30 and σ is unknown is given by
x̄ – tα/2 s/√n ≤ μ ≤ x̄ + tα/2 s/√n
or
(x̄ – tα/2 s/√n , x̄ + tα/2 s/√n)
• where tα/2 comes from the Student-t Distribution with (n – 1) degrees of freedom and s is the sample standard deviation.
Example 8-4
Find the value of t for 16 degrees of freedom and .05 area in the right tail of a t distribution curve.
[Table 8.2: Determining t for 16 df and .05 Area in the Right Tail]
[Figure 8.6: The value of t for 16 df and .05 area in the right tail.]
[Figure 8.7: The value of t for 16 df and .05 area in the left tail.]
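A table lookup such as the one in Example 8-4 can be cross-checked numerically. The sketch below is not the table method used in the course: it integrates the Student t density with Simpson's rule and finds the required t by bisection, using only the standard library. The function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using symmetry about 0 and Simpson's rule on [0, |x|]."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_right_tail(area, df):
    """Value of t that leaves the given area (below 0.5) in the right tail."""
    lo, hi = 0.0, 100.0
    for _ in range(60):                      # bisection on the tail area
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t_16 = t_right_tail(0.05, 16)
print(round(t_16, 3))
```

Because the t distribution is symmetric about 0, the value for a given area in the left tail is the negative of the right-tail value.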
Find the values of t for:
• 12 df and 0.025 area in the right tail.
• 20 df and 0.01 area in the right tail.
• 20 df and 0.05 area in the right tail.
• 15 df and 0.005 area in the left tail.
• 22 df and 0.001 area in the left tail.
Confidence Interval for μ Using the t Distribution
The (1 – α)100% confidence interval for μ is
x̄ ± t s_x̄   where s_x̄ = s/√n
The value of t is obtained from the t distribution table for n – 1 degrees of freedom and the given confidence level. Here t s_x̄ is the margin of error of the estimate; that is,
E = t s_x̄
Example 8-5
According to the Kaiser Family Foundation, U.S. workers who had employer-provided health insurance coverage paid an average premium of $4129 for family health insurance coverage during 2011 (USA TODAY, October 10, 2011). A random sample of 25 workers from New York City who have employer-provided health insurance coverage paid an average premium of $6600 for family health insurance coverage with a standard deviation of $800. Make a 95% confidence interval for the current average premium paid for family health insurance coverage by all workers in New York City who have employer-provided health insurance coverage. Assume that the distribution of premiums paid for family health insurance coverage by all workers in New York City who have employer-provided health insurance coverage is normally distributed.
n = 25
x̄ = 6600
s = 800
Create a 95% confidence interval for μ
Example 8-5: Solution
• σ is not known, n < 30, and the population is normally distributed
• Use the t distribution to make a confidence interval for μ
n = 25, x̄ = $6600, s = $800, Confidence level = 95% or .95
s_x̄ = s/√n = 800/√25 = $160
• df = n – 1 = 25 – 1 = 24
• Area in each tail = .5 – (.95/2) = .5 – .4750 = .025
• The value of t in the right tail is 2.064 (from table)
x̄ ± t s_x̄ = 6600 ± 2.064(160) = 6600 ± 330.24 = $6269.76 to $6930.24
Thus, we can state with 95% confidence that the current mean premium paid for family health insurance coverage by all workers in New York City who have employer-provided health insurance coverage is between $6269.76 and $6930.24
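The steps of this solution translate directly into a few lines of Python. The sketch takes the t value 2.064 from the table, as the solution does, rather than computing it; the variable names are illustrative.

```python
import math

n, sample_mean, s = 25, 6600.0, 800.0
t_table = 2.064            # t for 24 df and .025 in each tail, as read from the table

standard_error = s / math.sqrt(n)          # estimated standard error s / sqrt(n)
margin = t_table * standard_error          # margin of error E = t * s_xbar
lower, upper = sample_mean - margin, sample_mean + margin

print(round(lower, 2), round(upper, 2))
```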
Class Exercise 1
• The standard deviation for a population is 14.8.
• A sample of 100 observations selected from this
population gave a mean of 143.72.
a. Construct a 99% confidence interval for μ.
b. Construct a 95% confidence interval for μ.
c. Construct a 90% confidence interval for μ.
d. Does the width of the confidence intervals constructed in parts a. to c. decrease as the confidence level decreases? Explain.
Answer to Class Exercise 1
• 99% CI is (139.92, 147.52)
• 95% CI is (140.82, 146.62)
• 90% CI is (141.28, 146.16)
• Notice that the width of the Confidence Interval decreases as the Confidence level decreases.
• It makes sense, right? Why?
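The pattern in these answers can be reproduced with a short loop. The sketch computes z with `statistics.NormalDist` instead of reading a rounded value from the table, so the end points may differ from the answers above by about 0.01; the variable names are illustrative.

```python
import math
from statistics import NormalDist

sigma, n, sample_mean = 14.8, 100, 143.72
standard_error = sigma / math.sqrt(n)

widths = {}
for confidence in (0.99, 0.95, 0.90):
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # tail area of alpha/2
    margin = z * standard_error
    widths[confidence] = 2 * margin
    print(f"{confidence:.0%}: {sample_mean - margin:.2f} to {sample_mean + margin:.2f}")
```

A lower confidence level needs a smaller z, so the margin of error and the width of the interval both shrink.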
Another Class Exercise
• A sample of 10 observations taken from a normally distributed population produced the following data:
44 52 31 48 46 39 47 36 41 57
a. What is the point estimate of μ?
b. Construct a 95% confidence interval for μ.
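One possible way to work this exercise is sketched below. Since σ is unknown and n < 30, it uses the t distribution with 9 degrees of freedom; the table value 2.262 (9 df, .025 in each tail) is an assumption taken from a standard t table, not from the slides.

```python
import math
import statistics

data = [44, 52, 31, 48, 46, 39, 47, 36, 41, 57]
n = len(data)

x_bar = statistics.mean(data)        # point estimate of mu
s = statistics.stdev(data)           # sample standard deviation (divisor n - 1)
t_table = 2.262                      # assumed table value: 9 df, .025 in each tail

margin = t_table * s / math.sqrt(n)  # margin of error t * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(x_bar, round(lower, 2), round(upper, 2))
```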
A 100(1 – α)% Confidence Interval Estimate for the Population Proportion p.
• A 100(1 – α)% interval estimate for the population proportion p is given by
p̂ – Zα/2 √(p̂q̂/n) ≤ p ≤ p̂ + Zα/2 √(p̂q̂/n)
or
(p̂ – Zα/2 √(p̂q̂/n) , p̂ + Zα/2 √(p̂q̂/n))
• where Zα/2 comes from the Std Normal Distribution and q̂ = 1 – p̂. Because p is unknown, the sample values p̂ and q̂ replace p and q in the standard deviation √(pq/n).
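A minimal sketch of the interval for a proportion follows. The function name and the inputs (p̂ = 0.40 from a sample of 200) are hypothetical, and the standard deviation is estimated from the sample proportion because p itself is unknown.

```python
import math
from statistics import NormalDist

def proportion_interval(p_hat, n, alpha):
    """100(1 - alpha)% interval for a population proportion (large sample)."""
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - alpha / 2)        # tail area of alpha/2
    margin = z * math.sqrt(p_hat * q_hat / n)      # estimated standard deviation
    return p_hat - margin, p_hat + margin

lower, upper = proportion_interval(p_hat=0.40, n=200, alpha=0.05)
print(round(lower, 4), round(upper, 4))
```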
IMPORTANT !!!
• Some versions of the on-line text say that when the population standard deviation is not known, the t distribution should be used for hypothesis testing.
• In this course (and in practice) we use the Z tables for hypothesis testing once the sample size is large (at least 30).
End of Lecture
• We have reviewed the Confidence Intervals that form an integral part of the 5 stages of a statistical analysis.
• Next we move on to another level of investigation with respect to sample data.
• This involves Hypothesis testing.

Vip Female Escorts Noida 9711199171 Greater Noida Escorts Serviceankitnayak356677
 
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableDipal Arora
 
Mondelez State of Snacking and Future Trends 2023
Mondelez State of Snacking and Future Trends 2023Mondelez State of Snacking and Future Trends 2023
Mondelez State of Snacking and Future Trends 2023Neil Kimberley
 
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130  Available With RoomVIP Kolkata Call Girl Howrah 👉 8250192130  Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Roomdivyansh0kumar0
 
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,noida100girls
 
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...lizamodels9
 
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...lizamodels9
 

Recently uploaded (20)

Pharma Works Profile of Karan Communications
Pharma Works Profile of Karan CommunicationsPharma Works Profile of Karan Communications
Pharma Works Profile of Karan Communications
 
rishikeshgirls.in- Rishikesh call girl.pdf
rishikeshgirls.in- Rishikesh call girl.pdfrishikeshgirls.in- Rishikesh call girl.pdf
rishikeshgirls.in- Rishikesh call girl.pdf
 
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service JamshedpurVIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
 
VIP Call Girls Pune Kirti 8617697112 Independent Escort Service Pune
VIP Call Girls Pune Kirti 8617697112 Independent Escort Service PuneVIP Call Girls Pune Kirti 8617697112 Independent Escort Service Pune
VIP Call Girls Pune Kirti 8617697112 Independent Escort Service Pune
 
M.C Lodges -- Guest House in Jhang.
M.C Lodges --  Guest House in Jhang.M.C Lodges --  Guest House in Jhang.
M.C Lodges -- Guest House in Jhang.
 
GD Birla and his contribution in management
GD Birla and his contribution in managementGD Birla and his contribution in management
GD Birla and his contribution in management
 
Insurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usageInsurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usage
 
Best Practices for Implementing an External Recruiting Partnership
Best Practices for Implementing an External Recruiting PartnershipBest Practices for Implementing an External Recruiting Partnership
Best Practices for Implementing an External Recruiting Partnership
 
Call Girls In Panjim North Goa 9971646499 Genuine Service
Call Girls In Panjim North Goa 9971646499 Genuine ServiceCall Girls In Panjim North Goa 9971646499 Genuine Service
Call Girls In Panjim North Goa 9971646499 Genuine Service
 
Catalogue ONG NUOC PPR DE NHAT .pdf
Catalogue ONG NUOC PPR DE NHAT      .pdfCatalogue ONG NUOC PPR DE NHAT      .pdf
Catalogue ONG NUOC PPR DE NHAT .pdf
 
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
 
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
 
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
 
Vip Female Escorts Noida 9711199171 Greater Noida Escorts Service
Vip Female Escorts Noida 9711199171 Greater Noida Escorts ServiceVip Female Escorts Noida 9711199171 Greater Noida Escorts Service
Vip Female Escorts Noida 9711199171 Greater Noida Escorts Service
 
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
 
Mondelez State of Snacking and Future Trends 2023
Mondelez State of Snacking and Future Trends 2023Mondelez State of Snacking and Future Trends 2023
Mondelez State of Snacking and Future Trends 2023
 
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130  Available With RoomVIP Kolkata Call Girl Howrah 👉 8250192130  Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
 
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
 
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
 
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
 

LR 9 Estimation.pdf

  • 1. ECON 1005 INTRODUCTION TO STATISTICS ESTIMATION Dr. Henry Bailey 1
  • 2. Checklist • The parameters of a distribution • The idea of estimating parameters of a distribution based on a sample • Estimators and the estimation process • Point vs interval estimates • Sampling and non-sampling error • Bias • Sampling distribution of the mean and proportion • Standard error • The Central Limit Theorem • The t distribution and working with t tables • Confidence intervals 2
  • 3. Introduction • Inference – “a conclusion reached on the basis of evidence and reasoning.” • Inferential statistics: – Allows us to make decisions about some characteristics of a population based on sample information. – I.e. we draw conclusions about a population based on a sample. 3
  • 4. Introduction • We have discussed the characteristics and properties of the probability distributions of random variables • These characteristics were the parameters: – n and p for the Binomial Distribution – λ for the Poisson Distribution – μ and σ for the Normal Distribution • In the real world we often don’t know the values of these parameters and will have to estimate them. • Three key words: – Estimation (the process) – Estimate (the result) – Estimator (the facilitator) 4
  • 5. Three approaches to estimating unknown population parameters. 1. Census 2. Guess 3. The preferred method: – draw a random sample of appropriate size from the population, – use the sample data, – choose a formula (called a sample statistic) to estimate the unknown population parameter. 5
  • 6. Definition of Estimation Estimation is the process by which we estimate the value of an unknown population parameter by making use of the data from a random sample that was drawn from that population. 6
  • 7. THE ESTIMATION PROCESS 1. Identify the Unknown Population Parameter 2. Decide on the Size of the Random Sample: n 3. Select the Random Sample of Size n 4. Choose an Appropriate Sample Statistic [Estimator] 5. Substitute the Sample Data into the Sample Statistic 6. Calculate the estimate and interpret 7
  • 8. Two Types of Estimates • Suppose we seek to estimate the mean age of Level I students on the Campus. • We may draw a random sample of 100 Level I students from the Campus, record their ages, substitute the 100 values into the formula for the mean of a sample (also called the sample statistic or estimator), and read off the estimate. • The resulting estimate can be – a single value e.g. 20 or – an interval of values (18 - 22). 8
  • 9. Two Types of Estimates • Point estimate • Interval estimate of a population parameter 9
  • 10. Estimators • How do we use the data from our random sample to arrive at an estimate? • We substitute the sample data into a formula better known as a sample statistic. • These sample statistics are called estimators. • A point estimator for an unknown population parameter is a sample statistic into which the data from the random sample is substituted, so as to yield a point estimate of that parameter. 10
  • 11. Commonly Used Point Estimators (population parameter → sample statistic) • Mean µ → Sample Mean, Sample Median, Sample Mode • Standard Deviation σ → Sample St. Dev s • Proportion p → Sample Proportion p̂ 11
  • 12. Example • The mean and standard deviation of the teaching experience of faculty members in a department at a University are unknown. A random sample of 5 faculty members was selected; their teaching experience in years was as follows: 7, 8, 14, 7, 20 1. Identify suitable point estimators for the mean teaching experience of the entire faculty 2. Identify suitable point estimators for the standard deviation of teaching experience of the entire faculty 3. Find a point estimate of the mean teaching experience of the entire faculty 4. Find a point estimate of the std deviation of the teaching experience of the entire faculty. 12
  • 13. Solution 1. We can use any of three point estimators to estimate the population mean: sample mean, sample mode or sample median. – On the basis of the three estimators declared in 1. above, we can compute three point estimates. • Sample Mean = 1/5 (7 + 8 + 14 + 7 + 20) = 11.2 • Sample Mode = 7 • Sample Median = 8 2. We can use the sample standard deviation as the point estimator for the population standard deviation. • The point estimate of the population standard deviation is the value of s. s = √[1/4 (4.2² + 3.2² + 2.8² + 4.2² + 8.8²)] = √32.7 = 5.718 13
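The point estimates on slide 13 can be checked with a few lines of Python. This is a minimal sketch using only the standard `statistics` module; the variable names are illustrative and do not come from the slides.

```python
from statistics import mean, median, mode, stdev

# Teaching experience (years) of the 5 sampled faculty members (slide 12).
experience = [7, 8, 14, 7, 20]

x_bar = mean(experience)     # sample mean
x_med = median(experience)   # sample median
x_mode = mode(experience)    # sample mode
s = stdev(experience)        # sample standard deviation (divides by n - 1)

print(f"mean = {x_bar}, median = {x_med}, mode = {x_mode}, s = {s:.3f}")
```

Note that `stdev` divides by n - 1, which matches the 1/4 factor used on the slide for a sample of 5.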
  • 14. Some Issues • Since we must estimate population parameters from samples, it is inevitable that we will make errors. – Different sample sizes can give rise to different point estimates when the same estimator is used – Different estimators can give rise to different point estimates when the same sample is used – Different estimators and different sample sizes can give rise to different point estimates – Some estimates will agree with the true value of the population parameter; others will not. 14
  • 15. Error in Estimation • The difference between the point estimate and the true value of the population parameter is known as the total error in the estimate. • This total error between the point estimate and the true value of the population parameter can be the result of both sampling error and non-sampling error. • The sampling errors occur because of chance. • Other errors may also arise as a result of human errors, and not chance; these tend to impair the results obtained. Such errors are called non-sampling errors. TOTAL ERROR IN THE ESTIMATE = SAMPLING ERROR + NON-SAMPLING ERROR 15
  • 16. Sources of Non-Sampling Error • There are many potential sources of non-sampling error: – Inability to obtain all the required information from all elements of the sample – Difficulties in defining terms – Differences in interpretation of questions – Errors in the data collection such as in recording or coding – Errors made in the data tabulation activity. 16
  • 17. Example • Consider a statistics class of five students. Their exam scores were: 70, 78, 80, 80 & 95. • Find the population mean. • Suppose that a random sample of three students was drawn i.e. 70, 80 & 95 • Use the sample data and the sample mean to estimate the population mean. • What is the difference due to chance? • Now suppose that we mistakenly recorded 82 instead of 80. • What would be the new estimate of the population mean? • What is the new difference between the population mean and the point estimate? 17
  • 18. Example Continued • It is this difference of 1.73 that we call the total error in the estimate. It is subdivided into two components: – The sampling error of 1.07 – The non-sampling error of 0.66 • As this error grows, the sample statistic will become less useful as an estimator of the population parameter. • We must therefore be able to determine the impact of the error on the inferences that we will be making by subjecting the estimators to specific tests. These are discussed in the next chapter. 18
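The split of the total error into its two components can be reproduced as follows. This is a small sketch with illustrative variable names, using the scores from slide 17.

```python
# Population of five exam scores and the sample of three (slide 17).
population = [70, 78, 80, 80, 95]
sample = [70, 80, 95]
misrecorded_sample = [70, 82, 95]   # 80 wrongly recorded as 82

def average(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

population_mean = average(population)                            # 80.6
sampling_error = average(sample) - population_mean               # due to chance
total_error = average(misrecorded_sample) - population_mean      # chance + mistake
non_sampling_error = total_error - sampling_error                # due to the mistake

print(f"sampling error     = {sampling_error:.2f}")
print(f"total error        = {total_error:.2f}")
print(f"non-sampling error = {non_sampling_error:.2f}")
```

Computed without intermediate rounding, the non-sampling error is 2/3, which prints as 0.67; the slide's 0.66 comes from subtracting the already rounded values 1.73 and 1.07.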
  • 19. What is bias ? • Bias is a tendency to lean in a certain direction, either in favour of or against a particular thing. To be truly biased means to lack a neutral viewpoint on a particular topic. • Statistical bias is a feature of a statistical technique or of its results whereby the expected value of the results differs from the true underlying quantitative parameter being estimated. 19
  • 20. Unbiased Point Estimator • A point estimator θ̂ is said to be an unbiased estimator of a population parameter θ if E(θ̂) = θ • If E(θ̂) ≠ θ then the point estimator is said to be biased. • The extent of the bias will be equal to E(θ̂) – θ 20
  • 21. Unbiased Point Estimator • The unbiased estimators that we will use in this course are: 21
  • 22. SAMPLING DISTRIBUTION OF THE MEAN • Return to our population of test scores for the class comprising five students A, B, C, D and E. • A = 70, B = 78, C = 80, D = 80, E = 95 • Population Mean = 80.6 Population Std Deviation = 8.09 • We will now perform the following activities. 1. Consider all possible samples of three scores from this population; there are 10 such samples. 2. Compute the sample mean for each of the 10 samples. 3. Construct the Frequency Distribution of Sample Means. 4. Construct the Relative Frequency Distribution of Sample Means. 5. Rename Relative Frequency as Probability to create the Probability Distribution of the Sample Means 22
  • 23. 1 & 2. Generating the 10 Random Samples of Size 3 23
  • 24. 3. The Frequency Distribution of Sample Means • Sample mean (frequency): 76.00 (2), 76.67 (1), 79.33 (1), 81.00 (1), 81.67 (2), 84.33 (2), 85.00 (1); Σf = 10 24
  • 25. 5. The Probability Distribution of Sample Means (or The Sampling Distribution of the Mean) • Sample mean (frequency, probability): 76.00 (2, 0.2), 76.67 (1, 0.1), 79.33 (1, 0.1), 81.00 (1, 0.1), 81.67 (2, 0.2), 84.33 (2, 0.2), 85.00 (1, 0.1); Σf = 10, Σ probability = 1.00 25
  • 26. Sampling Distributions in this Course • In general, the probability distribution of a Sample Statistic is called its sampling distribution. • We will focus on two sampling distributions: – Sampling Distribution of the Mean – Sampling Distribution of the Proportion • In the Sampling Distribution of the Mean, the random variable is the sample mean. • In the Sampling Distribution of the Proportion, the random variable is the sample proportion p̂ 26
  • 27. The Mean of the Sampling Distribution of the Mean • The mean of the sampling distribution of the mean is equal to the population mean μ. Class Activity • Compute the mean of the Sampling Distribution of the Mean Score based on the ten random samples of size 3. • Show that it is indeed equal to the population mean. 27
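The five activities from slide 22 and the class activity above can be sketched in Python by enumerating every sample of size 3. The code below is an illustration only; the names are not taken from the slides.

```python
from collections import Counter
from itertools import combinations

# Population of five test scores (students A to E).
scores = [70, 78, 80, 80, 95]

# Steps 1 and 2: all samples of size 3 and their sample means.
samples = list(combinations(scores, 3))
sample_means = [sum(s) / len(s) for s in samples]

# Step 3: frequency distribution of the (rounded) sample means.
frequency = Counter(round(m, 2) for m in sample_means)

# Steps 4 and 5: relative frequencies, read as probabilities.
probability = {m: f / len(samples) for m, f in sorted(frequency.items())}

# Mean of the sampling distribution versus the population mean.
mean_of_sample_means = sum(sample_means) / len(sample_means)
population_mean = sum(scores) / len(scores)

for m, p in probability.items():
    print(f"{m:6.2f}  {p:.1f}")
print(f"mean of sample means = {mean_of_sample_means:.1f}")
print(f"population mean      = {population_mean:.1f}")
```

`combinations` treats the two scores of 80 (students C and D) as distinct elements, so it produces the 10 samples described on slide 22.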
  • 28. The Standard Deviation of the Sampling Distribution of the Mean • The Standard Deviation of the Sampling Distribution of Mean is given by σx where • σx = σ/√n • σx is also called the standard error. • The spread of the Sampling Distribution of the Mean is smaller than the spread of the corresponding population distribution. • The standard deviation of the Sampling Distribution of Mean decreases as the sample size increases. 28
  • 29. What kind of distribution will the Sampling Distribution of the Mean have? • If the population from which the samples are drawn is normally distributed with mean μ and standard deviation σ, then the Sampling Distribution of the Mean will also be normally distributed with mean μ and standard deviation σx (irrespective of the sample size). • Does the above result hold true if the population is not normally distributed? 29
  • 30. [Figure: the parent population and the sampling distributions of x for n = 2, n = 5 and n = 30] 30
  • 31. What kind of Probability Distribution does the Sampling Distribution of the Mean possess when the population is not Normal ? The Central Limit Theorem assures us that: • If the sample size is large, the Sampling Distribution of the Mean will be approximately normally distributed with mean μ and standard deviation σx irrespective of the distribution of the population. • ‘Large’ is taken to mean n≥30 • What happens when the sample size is small i.e. n < 30? 31
  • 32. What kind of Probability Distribution does the Sampling Distribution of the Mean possess when the population is not Normal and sample size is small i.e. n < 30? • We must look to the Student t Distribution • The Student t Distribution is a specific type of bell-shaped distribution with a lower height and a wider spread than the Standard Normal Distribution. • The Student t Distribution has only one parameter i.e. the number of degrees of freedom abbreviated df • The number of degrees of freedom is the number of observations that can be freely chosen. • The mean of the Student t Distribution is 0 • The standard deviation of the Student t Distribution is √(df/(df – 2)), for df > 2 • As the degrees of freedom increases the Student t Distribution approaches the Standard Normal Distribution. 32
  • 33. What kind of Probability Distribution does the Sampling Distribution of the Mean possess when the population is not Normal and sample size is small i.e. n < 30? • If the population from which the samples are drawn is either of unknown distribution or not normally distributed with mean μ and standard deviation σ, then the Sampling Distribution of the Mean is specified by the Student t Distribution with n - 1 degrees of freedom. • The random variable of the Student t Distribution is given by t where: t = (x̄ – μ) / s_x̄, with s_x̄ = s/√n 33
  • 34. The Sampling Distribution of Proportion • The probability distribution of the sample proportion is called the Sampling Distribution of the Proportion. • The random variable of the Sampling Distribution of the Proportion is p̂ • The mean of the Sampling Distribution of the Proportion is the population proportion p. • The standard deviation of the Sampling Distribution of the Proportion is given by √(pq/n), where q = 1 – p. 34
  • 35. What is the shape of the Sampling Distribution of the Proportion? The Central Limit Theorem assures us that: • If the sample size is sufficiently large, the Sampling Distribution of the Proportion will be approximately normally distributed with mean p and standard deviation √(pq/n). • Sufficiently Large means np > 5 and nq > 5. 35
  • 36. Interval Estimates: Confidence Intervals • We were speaking all along about Unbiased Point Estimators. • Instead of assigning a single value to an unknown population parameter, we can construct an interval of values around the point estimate and make a probabilistic statement that the interval contains the value of the corresponding population parameter. • Such activity is called interval estimation and interval estimators are called Confidence Intervals. • These estimators, when applied to the data from a random sample, define an interval that is likely to contain the true value of the population parameter being estimated. 36
  • 37. Confidence Level and Confidence Interval Definition Each interval is constructed with regard to a given confidence level and is called a confidence interval. The confidence interval is given as Point estimate ± Margin of error The confidence level associated with a confidence interval states how much confidence we have that this interval contains the true population parameter. The confidence level is denoted by (1 – α)100%. Prem Mann, Introductory Statistics, 8/E Copyright © 2013 John Wiley & Sons. All rights reserved.
  • 38. Interval estimation.
  • 39. Interval Estimates • An interval that is constructed based on the confidence level is called a confidence interval. • A 90% Confidence Interval means a 10% significance level i.e. α = 10% • A 95% Confidence Interval means a 5% significance level i.e. α = 5% • Confidence Interval Estimates in this course are as follows: – For the population mean based on large samples – For the population mean based on small samples – For the population mean based on large samples with σ unknown – For the population mean based on small samples with σ unknown – For the population proportion 39
  • 40. A 100(1 - α)% Confidence Interval Estimate for the Population Mean μ • Let X ~ N(μ , σ) where σ is known. A single sample of size n was drawn and the sample mean X̄ is computed. • On the basis of this sample mean we seek to find a 100(1 - α)% Confidence Interval Estimate for μ. • A 100(1 – α)% interval estimate for the population mean μ is given by: X̄ – Zα/2 σx ≤ μ ≤ X̄ + Zα/2 σx or (X̄ – Zα/2 σx , X̄ + Zα/2 σx) where Zα/2 is the standard score that cuts off a tail area of α/2 in the Standard Normal Curve. 40
  • 41. A 100(1 – α)% Interval Estimate for the Population Mean μ (X̄ – Zα/2 σx , X̄ + Zα/2 σx) where Zα/2 is the standard score that cuts off a tail area of α/2 in the Standard Normal Curve. 41
  • 42. Eg Finding z for a 95% confidence level.
  • 43. Figure 8.3 Area in the tails. [Figure labels: (1 – α) = 0.95, α/2 = 0.025]
  • 44. Example • Find a 100(1 – α)% Interval Estimate for the Population mean μ using the following: § α = 5% § Sample mean = 52 § σx = 4 CI = X̄ – Zα/2 σx to X̄ + Zα/2 σx 44
  • 45. Example • Find a 100(1 – α)% Interval Estimate for the Population mean μ using the following: § α = 5% § Sample mean = 52 § σx = 4 CI = X̄ – Zα/2 σx to X̄ + Zα/2 σx • 95% Confidence Interval = X̄ – Zα/2 σx to X̄ + Zα/2 σx = 52 – (1.96 x 4) to 52 + (1.96 x 4) = 52 – 7.84 to 52 + 7.84 = 44.16 to 59.84 45
  • 46. Example • Find a 100(1 – α)% Interval Estimate for the Population mean μ using the following: § α = 5% § Sample mean = 52 § σx = 4 CI = X̄ – Zα/2 σx to X̄ + Zα/2 σx • 95% Confidence Interval = X̄ – Zα/2 σx to X̄ + Zα/2 σx = 52 – (1.96 x 4) to 52 + (1.96 x 4) = 52 – 7.84 to 52 + 7.84 = 44.16 to 59.84 • Steps: find X̄; get Z from tables (using half of alpha); calculate σx 46
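The worked example above can be reproduced in code. The sketch below is an illustration rather than part of the course material: the function name `z_interval` is hypothetical, and the z value is taken from Python's standard `statistics.NormalDist` instead of a printed table.

```python
from statistics import NormalDist

def z_interval(x_bar, sigma_x, alpha):
    """Return (lower, upper) for x_bar ± z(α/2)·σx, where σx is the standard error."""
    # z that leaves an area of alpha/2 in the right tail of the standard normal curve
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma_x
    return x_bar - margin, x_bar + margin

# Values from the example: α = 5%, sample mean = 52, σx = 4.
lower, upper = z_interval(x_bar=52, sigma_x=4, alpha=0.05)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

The computed z is about 1.96, so the result agrees with the table-based answer of 44.16 to 59.84 to two decimal places.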
  • 47. A publishing company has just published a new college textbook. Before the company decides the price at which to sell this textbook, it wants to know the average price of all such textbooks in the market. The research department at the company took a sample of 25 comparable textbooks and collected information on their prices. This information produces a mean price of $145 for this sample. It is known that the standard deviation of the prices of all such textbooks is $35 and the population of such prices is normal. (a) What is the point estimate of the mean price of all such textbooks? (b) Construct a 90% confidence interval for the mean price of all such college textbooks.
  • 48. A 100(1 – α)% Interval Estimate for the Population Mean μ (X̄ – Zα/2 σx , X̄ + Zα/2 σx) Find Zα/2 for a 90% C.I. 48
  • 49. (b) Confidence level is 90% or .90. Here, the area in each tail of the normal distribution curve is α/2 = (1 - .90)/2 = .05. Hence, z = 1.65. The standard error is σx = σ/√n = 35/√25 = 7.00, so x̄ ± zσx = 145 ± 1.65(7.00) = 145 ± 11.55 = (145 - 11.55) to (145 + 11.55) = $133.45 to $156.55
  • 50. Example 8-2 According to Moebs Services Inc., an individual checking account at major U.S. banks costs the banks between $350 and $450 per year (Time, November 21, 2011). A recent random sample of 600 such checking accounts produced a mean annual cost of $500 to major U.S. banks. Assume that the standard deviation of annual costs to major U.S. banks of all such checking accounts is $40. Make a 99% confidence interval for the current mean annual cost to major U.S. banks of all such checking accounts.
  • 51. • Confidence level 99% or .99 • The sample size is large (n ≥ 30) § Therefore, we use the normal distribution § z = 2.58 § Thus, we can state with 99% confidence that the current mean annual cost to major U.S. banks of all individual checking accounts is between $495.79 and $504.21 51
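The arithmetic behind the textbook-price and checking-account intervals (slides 47 to 51) can be laid out as below. This is a sketch under stated assumptions: the helper name `z_interval_from_table` is hypothetical, and the z values 1.65 and 2.58 are the rounded table values quoted on the slides, not exact quantiles.

```python
from math import sqrt

def z_interval_from_table(x_bar, sigma, n, z):
    """Confidence interval x_bar ± z·σ/√n using a z value read from a table."""
    sigma_x = sigma / sqrt(n)        # standard error of the mean
    margin = z * sigma_x             # margin of error
    return x_bar - margin, x_bar + margin

# Textbook prices: n = 25, sample mean $145, σ = $35, 90% level (z = 1.65).
book_low, book_high = z_interval_from_table(x_bar=145, sigma=35, n=25, z=1.65)

# Checking accounts: n = 600, sample mean $500, σ = $40, 99% level (z = 2.58).
bank_low, bank_high = z_interval_from_table(x_bar=500, sigma=40, n=600, z=2.58)

print(f"textbooks: ${book_low:.2f} to ${book_high:.2f}")
print(f"accounts:  ${bank_low:.2f} to ${bank_high:.2f}")
```

For the checking accounts the standard error is 40/√600 ≈ 1.633, giving a margin of about 4.21 and the interval $495.79 to $504.21 stated on slide 51.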
  • 52. A 100(1 - α)% Confidence Interval Estimate for the Population Mean μ where σ is unknown Let X ~ N(μ , σ) where σ is unknown. A single sample of size n was drawn and the sample mean X̄ was computed. On the basis of this single sample mean, find a 100(1 - α)% Confidence Interval Estimate for μ. • Here we substitute s for the unknown σ. • However, it matters whether n is large i.e. (n ≥ 30) or small i.e. (n < 30) – If n ≥ 30 the CLT allows us to use the Normal Distribution N(μ , s/√n) as the Sampling Distribution – If n < 30 we use the Student-t Distribution with n – 1 df as the Sampling Distribution. 52
  • 53. A 100(1 - α)% Confidence Interval Estimate for the Population Mean μ where σ is unknown and n ≥ 30 • A 100(1 – α)% interval estimate for the population mean μ when n ≥ 30 and σ is unknown is given by X̄ – Zα/2 s/√n ≤ μ ≤ X̄ + Zα/2 s/√n or (X̄ – Zα/2 s/√n , X̄ + Zα/2 s/√n) • where Zα/2 comes from the Std Normal Distribution and s is the sample standard deviation. 53
  • 54. A 100(1 - α)% Confidence Interval Estimate for the Population Mean μ where σ is unknown and n < 30 • A 100(1 – α)% interval estimate for the population mean μ when n < 30 and σ is unknown is given by X̄ – tα/2 s/√n ≤ μ ≤ X̄ + tα/2 s/√n or (X̄ – tα/2 s/√n , X̄ + tα/2 s/√n) • where tα/2 comes from the Student-t Distribution with (n – 1) degrees of freedom and s is the sample standard deviation 54
  • 55. Example 8-4 Find the value of t for 16 degrees of freedom and .05 area in the right tail of a t distribution curve.
  • 56. Table 8.2 Determining t for 16 df and .05 Area in the Right Tail
  • 57. 57
  • 58. Figure 8.6 The value of t for 16 df and .05 area in the right tail.
  • 59. Figure 8.7 The value of t for 16 df and .05 area in the left tail.
  • 60. Find the values of t for: • 12 df and 0.025 area in the right tail. • 20 df and 0.01 area in the right tail. • 20 df and 0.05 area in the right tail. • 15 df and 0.005 area in the left tail • 22 df and 0.001 area in the left tail. 60
  • 61. Confidence Interval for μ Using the t Distribution The (1 – α)100% confidence interval for μ is x̄ ± t s_x̄, where s_x̄ = s/√n. The value of t is obtained from the t distribution table for n – 1 degrees of freedom and the given confidence level. Here t s_x̄ is the margin of error of the estimate; that is, E = t s_x̄.
  • 62. Example 8-5 According to the Kaiser Family FoundaIon, U.S. workers who had employer- provided health insurance coverage paid an average premium of $4129 for family health insurance coverage during 2011 (USA TODAY, October 10, 2011). A random sample of 25 workers from New York City who have employer- provided health insurance coverage paid an average premium of $6600 for family health insurance coverage with a standard deviaIon of $800. Make a 95% confidence interval for the current average premium paid for family health insurance coverage by all workers in New York City who have employer- provided health insurance coverage. Assume that the distribuIon of premiums paid for family health insurance coverage by all workers in New York City who have employer-provided health insurance coverage is normally distributed. n = 25 ̅ 𝑥=6600 s = 800 Create a 95% confidence interval for μ Prem Mann, Introductory Statistics, 8/E Copyright © 2013 John Wiley & Sons. All rights reserved.
  • 63. Example 8-5: Solution • σ is not known, n < 30, and the popula6on is normally distributed • Use the t distribu6on to make a confidence interval for μ 𝑛 = 25, ̅ 𝑥 = $6600, 𝑠 = $800 𝐶𝑜𝑛𝑓𝑖𝑑𝑒𝑛𝑐𝑒 𝑙𝑒𝑣𝑒𝑙 = 95% 𝑜𝑟 .95 𝑠 ̅ " = 𝑠 𝑛 = 800 25 = $160 Prem Mann, Introductory Statistics, 8/E Copyright © 2013 John Wiley & Sons. All rights reserved.
  • 64. Example 8-5: Solution • df = n – 1 = 25 – 1 = 24 • Area in each tail = .5 – (.95/2) = .5 – .4750 = .025 • The value of t in the right tail is 2.064 (from table) • x̄ ± t·s_x̄ = 6600 ± 2.064(160) = 6600 ± 330.24 = $6269.76 to $6930.24 • Thus, we can state with 95% confidence that the current mean premium paid for family health insurance coverage by all workers in New York City who have employer-provided health insurance coverage is between $6269.76 and $6930.24.
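The arithmetic in Example 8-5 can be checked with a short Python sketch. The function name `t_interval` is a hypothetical helper introduced here for illustration; the critical value 2.064 is taken from the slide (t table, 24 df, .025 in the right tail) and is not computed by the code.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Return (lower, upper) for the interval xbar ± t * s / sqrt(n)."""
    std_err = s / math.sqrt(n)   # estimated standard error of the mean
    margin = t_crit * std_err    # margin of error E
    return xbar - margin, xbar + margin

# Values from Example 8-5; t_crit is read from the t table, not calculated.
lower, upper = t_interval(xbar=6600, s=800, n=25, t_crit=2.064)
print(f"95% CI: ${lower:.2f} to ${upper:.2f}")
```

Passing the table value in as an argument keeps the sketch within the standard library; a statistics package could supply the critical value instead.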
  • 65. Class Exercise 1 • The standard deviation for a population is 14.8. • A sample of 100 observations selected from this population gave a mean of 143.72. a. Construct a 99% confidence interval for μ. b. Construct a 95% confidence interval for μ. c. Construct a 90% confidence interval for μ. d. Does the width of the confidence intervals constructed in parts a. to c. decrease as the confidence level decreases? Explain.
  • 66. Answer to Class Exercise 1 • 99% CI is (139.92, 147.52) • 95% CI is (140.82, 146.62) • 90% CI is (141.28, 146.16) • Notice that the width of the confidence interval decreases as the confidence level decreases. • It makes sense, right? Why?
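Because σ is known in Class Exercise 1, the intervals use z rather than t. The sketch below is one way to reproduce them; `z_interval` is a hypothetical helper name, and the critical values come from Python's `statistics.NormalDist` rather than a printed table, so the endpoints may differ from the slide's answers in the second decimal place (tables usually round z to two decimals).

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Return (lower, upper) for xbar ± z * sigma / sqrt(n), sigma known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for a two-sided interval
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

# Values from Class Exercise 1: sigma = 14.8, n = 100, sample mean = 143.72.
intervals = {c: z_interval(143.72, 14.8, 100, c) for c in (0.99, 0.95, 0.90)}
for c, (lo, hi) in intervals.items():
    print(f"{c:.0%} CI: ({lo:.2f}, {hi:.2f}), width {hi - lo:.2f}")
```

The printed widths shrink as the confidence level falls, since a smaller z multiplies the same standard error.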
  • 67. Another Class Exercise • A sample of 10 observations taken from a normally distributed population produced the following data: 44 52 31 48 46 39 47 36 41 57 a. What is the point estimate of μ? b. Construct a 95% confidence interval for μ.
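For readers who want to check their work on this exercise, here is a minimal sketch that starts from the raw data. It assumes the t table value 2.262 for 9 df and .025 area in the right tail (an assumption not stated on the slide) and uses the sample standard deviation with divisor n – 1.

```python
import math
from statistics import mean, stdev

data = [44, 52, 31, 48, 46, 39, 47, 36, 41, 57]

n = len(data)
xbar = mean(data)          # point estimate of mu
s = stdev(data)            # sample standard deviation (divisor n - 1)
t_crit = 2.262             # assumed table value: 9 df, .025 in the right tail

margin = t_crit * s / math.sqrt(n)
lower, upper = xbar - margin, xbar + margin
print(f"point estimate: {xbar:.1f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```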
  • 68. A 100(1 – α)% Confidence Interval Estimate for the Population Proportion p. • A 100(1 – α)% interval estimate for the population proportion p is given by p̂ – Zα/2 √(pq/n) ≤ p ≤ p̂ + Zα/2 √(pq/n) or (p̂ – Zα/2 √(pq/n) , p̂ + Zα/2 √(pq/n)) • where q = 1 – p and Zα/2 comes from the Standard Normal Distribution. Since p is unknown, p and q under the square root are replaced in practice by the sample values p̂ and q̂ = 1 – p̂.
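The proportion interval can be sketched the same way. The numbers below (p̂ = 0.40 from a sample of n = 300) are hypothetical and chosen only for illustration. Because p itself is unknown, the sketch substitutes p̂ and 1 – p̂ for p and q in the standard error, which is the usual practice; `proportion_interval` is an illustrative name, not a library function.

```python
import math
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence):
    """Return (lower, upper) for p_hat ± z * sqrt(p_hat * q_hat / n)."""
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
    margin = z * math.sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

# Hypothetical sample: 120 successes out of 300, so p_hat = 0.40.
lower, upper = proportion_interval(p_hat=0.40, n=300, confidence=0.95)
print(f"95% CI for p: ({lower:.4f}, {upper:.4f})")
```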
  • 69. IMPORTANT !!! • Some versions of the on-line text say that when the population standard deviation is not known, the t distribution should be used for hypothesis testing. • In this course (and in practice) we use the Z tables for hypothesis testing once the sample size is large (at least over 30).
  • 70. End of Lecture • We have reviewed the Confidence Intervals that form an integral part of the 5 stages of a statistical analysis. • Next we move on to another level of investigation with respect to sample data. • This involves Hypothesis testing.