11/5/2018
SBE 304: Bio-Statistics
Confidence Intervals
Dr. Ayman Eldeib
Systems & Biomedical
Engineering Department
Fall 2018
SBE 304: Statistical Intervals
Estimation
• The purpose of statistics is to permit the user to make an
inference about a population based on information contained in a
sample.
• Populations are characterized by numerical descriptive measures
called parameters. The objective of many statistical
investigations is to make an inference about one or more
population parameters.
• The parameter of interest, called the target parameter, might be
the population mean, the population variance, the proportion of
occurrence of a certain phenomenon, or any other measure of interest.
• Point estimate: a single value that is intended to be close
to the true value of the target parameter.
• Interval estimate: two values that define an interval
intended to enclose the target parameter.
[Diagram: branches of statistics. Descriptive: organising, summarising & describing data. Inferential: estimation (point estimation, interval estimation) and hypothesis. Correlational: relationships.]
Confidence Interval
• A confidence interval (CI) is a particular kind of interval estimate
of a population parameter and is used to indicate the reliability of
an estimate.
• It is an observed interval (i.e. it is calculated from the
observations), in principle different from sample to sample, that
frequently includes the parameter of interest.
• How frequently the observed interval contains the parameter is
determined by the confidence level or confidence coefficient.
• The confidence level is not the probability that the true value of
the parameter lies in the particular interval computed from the data
actually obtained. It describes how often the procedure produces an
interval containing the parameter over repeated sampling.
• Statisticians use a confidence interval to describe the amount of
uncertainty associated with a sample estimate of a population
parameter.
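The repeated-sampling reading of the confidence level can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is normal with known σ, and the values mu = 250, sigma = 2.5 and n = 25 are illustrative choices; the function name `coverage` is hypothetical.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu=250.0, sigma=2.5, n=25, level=0.95, trials=2000, seed=0):
    """Fraction of simulated z-intervals (known sigma) that contain mu."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value, about 1.96 for 95%
    margin = z * sigma / n ** 0.5                  # margin of error of each interval
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample and compute its mean.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        # Count the interval as a hit when it encloses the true mean.
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

print(coverage())  # a value close to 0.95 is expected
```

Each simulated sample gives a different interval; roughly 95% of them should contain the true mean, which is what the confidence level describes.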
Confidence Interval
Example
• A confidence interval can be used to describe how reliable survey
results are.
• In a poll of election voting-intentions, the result might be that
40% of respondents intend to vote for a certain party.
• A 90% confidence interval for the proportion in the whole
population having the same intention on the survey date might be
38% to 42%.
• From the same data one may calculate a 95% confidence
interval, which might in this case be 36% to 44%.
• A major factor determining the length of a confidence interval is
the size of the sample used in the estimation procedure, for
example the number of people taking part in a survey.
Confidence Interval
A confidence interval is formed as: point estimate +/- margin of error.
The margin of error is the standard error, e.g. of the mean,
multiplied by the appropriate z-score (1.96 for 95%).
Confidence limits are the lower and upper boundary values of a
confidence interval, that is, the values that define its range.
Confidence Interval
Four steps to construct a confidence interval
• Identify a sample statistic, e.g. the mean.
• Select a confidence level, e.g. 90%, 95%, or 99%.
• Compute the margin of error:
  - Compute alpha (α): α = 1 - (confidence level / 100)
  - Find the critical probability (p*): p* = 1 - α/2
  - Find the z score (critical value) whose cumulative probability
    equals the critical probability (p*). When the population
    standard deviation is unknown or when the sample size is
    small, the t score is preferred.
  - Margin of error = critical value * standard deviation of the statistic (its standard error)
• Confidence interval = sample statistic ± margin of error
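The steps above can be sketched in a few lines of Python. This is a minimal illustration for the z-score case only; the function name `z_confidence_interval` and the numbers in the usage line (estimate 10.0, standard error 0.5) are assumptions for illustration, not values from the lecture.

```python
from statistics import NormalDist

def z_confidence_interval(estimate, std_error, confidence_pct=95.0):
    """Return (lower, upper, margin) for a z-based confidence interval."""
    alpha = 1 - confidence_pct / 100          # alpha from the confidence level
    p_star = 1 - alpha / 2                    # critical probability
    critical = NormalDist().inv_cdf(p_star)   # z score with cumulative probability p*
    margin = critical * std_error             # margin of error
    return estimate - margin, estimate + margin, margin

# Hypothetical inputs: point estimate 10.0 with standard error 0.5.
low, high, margin = z_confidence_interval(10.0, 0.5)
print(round(low, 2), round(high, 2), round(margin, 2))  # 9.02 10.98 0.98
```

For small samples with unknown population standard deviation, the critical value would come from the t distribution instead, which the standard library does not provide.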
Confidence Interval
Example
A machine fills bottles with medicine (e.g. antibiotic) powder and
is supposed to be adjusted so that the content of each bottle is
250 g. As the machine cannot fill every bottle with exactly 250 g,
the content added to individual bottles shows some variation and
is considered a random variable X. This variation is assumed to
be normally distributed around the desired average of 250 g, with
a standard deviation of 2.5 g.
To determine if the machine is adequately calibrated, a sample of
n = 25 bottles is chosen at random and the bottles are weighed.
The resulting measured masses are X1, ..., X25, a random sample from X.
The sample shows actual weights x1, ..., x25, with mean:
x̄ = (x1 + ... + x25) / 25
By standardizing, we get a random variable
Z = (X̄ − μ) / (σ / sqrt(n)) = (X̄ − μ) / 0.5
since σ / sqrt(n) = 2.5 / sqrt(25) = 0.5.
It is possible to find numbers −z and z, independent of μ, such that Z lies
between them with probability 1 − α, a measure of how confident we want to
be. We take 1 − α = 0.95. So we have:
P(−z ≤ Z ≤ z) = 1 − α = 0.95, which gives z = 1.96
Margin of error = Critical value * Standard deviation of statistic = 1.96 * 0.5 = 0.98
Our 0.95 confidence interval becomes:
x̄ − 0.98 ≤ μ ≤ x̄ + 0.98
A Large-Sample Confidence Interval for µ
• When n is large, an approximate 100(1 - α)% confidence interval for µ is
x̄ - zα/2 * σ / sqrt(n) ≤ µ ≤ x̄ + zα/2 * σ / sqrt(n)
• Generally n should be at least 40 to use this result reliably.
• If σ is unknown, replace σ by S in the Z calculation.
• The central limit theorem generally holds for n ≥ 30, but the larger
sample size is recommended here because replacing σ by S in Z
results in additional variability.
A Large-Sample Confidence Interval for µ
Nine hundred (900) high school freshmen were randomly selected
for a national survey. Among survey participants, the mean grade-
point average (GPA) was 2.7, and the standard deviation was 0.4.
What is the margin of error, assuming a 95% confidence level?
Compute alpha (α): α = 1 - (confidence level / 100) = 1 - 0.95 = 0.05
Find the critical probability (p*): p* = 1 - α/2 = 1 - 0.05/2 = 0.975
Find the critical z score. z = 1.96
Standard error of the mean = s / sqrt( n ) = 0.4 / sqrt( 900 ) = 0.4 / 30 = 0.0133
Margin of error (ME) = Critical value x Standard error = 1.96 * 0.0133 = 0.026
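The arithmetic of this example can be checked with a short script. It is a sketch that treats the sample standard deviation 0.4 as a stand-in for σ, as the large-sample result allows; the variable names are illustrative. Carrying the unrounded standard error through the calculation gives a margin of error close to 0.026.

```python
from math import sqrt
from statistics import NormalDist

n, s, confidence_pct = 900, 0.4, 95       # values from the GPA survey example

alpha = 1 - confidence_pct / 100          # 0.05
p_star = 1 - alpha / 2                    # 0.975
z = NormalDist().inv_cdf(p_star)          # critical z score
standard_error = s / sqrt(n)              # 0.4 / 30
margin_of_error = z * standard_error

print(f"{z:.2f} {standard_error:.4f} {margin_of_error:.3f}")  # 1.96 0.0133 0.026
```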
Choice of Sample Size
If x̄ is used as an estimate of µ, we can be 100(1 - α)%
confident that the error |x̄ - µ| will not exceed a specified
amount E when the sample size is
n = (zα/2 * σ / E)^2
E can be taken as (CI length)/2. If the result is not an integer, round it up.
Example
Determine how many specimens must be tested to ensure that the
95% CI on µ for A238 steel cut at 60°C has a length of at most 1.0 J,
knowing that σ = 1.
Here E = 1.0 / 2 = 0.5, so
n = (1.96 * 1 / 0.5)^2 = 15.37
and because n must be an integer, the required sample size is n = 16.
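The sample-size formula can be sketched as a small helper. The function name `sample_size_for_mean` is hypothetical, and the code assumes σ is known and that the result is always rounded up.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, max_error, confidence_pct=95.0):
    """Smallest n such that the margin of error z * sigma / sqrt(n) is at most max_error."""
    alpha = 1 - confidence_pct / 100
    z = NormalDist().inv_cdf(1 - alpha / 2)    # z_{alpha/2}
    return ceil((z * sigma / max_error) ** 2)  # round up to a whole specimen

# CI length of at most 1.0 J means E = 1.0 / 2 = 0.5, with sigma = 1.
print(sample_size_for_mean(sigma=1.0, max_error=1.0 / 2))  # 16
```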
t-Distribution
Student's t-distribution (or simply the t-distribution) is a
continuous probability distribution that arises when
estimating the mean of a normally distributed population in
situations where the sample size is small and/or the
standard deviation of the population σ is unknown. When
either of these problems occurs, statisticians rely on the
distribution of the t statistic (also known as the t score),
whose values are given by:
t = [ x̄ - μ ] / [ s / sqrt( n ) ]
where x̄ is the sample mean, μ is the population mean, s is the
standard deviation of the sample, and n is the sample size.
Degrees of Freedom
The particular form of the t distribution is determined by its
degrees of freedom. The degrees of freedom refer to the
number of independent observations in a set of data.
The number of independent observations is equal to the sample
size minus one. Hence, the distribution of the t statistic from
samples of size 8 would be described by a t distribution having
8 - 1 = 7 degrees of freedom. Similarly, a t distribution having 15
degrees of freedom would be used with a sample of size 16.
Properties of the t Distribution
• The mean of the distribution is equal to 0.
• The variance is equal to v / ( v - 2 ), where v is the degrees of
freedom and v > 2.
• The variance is always greater than 1, although it is close to 1
when there are many degrees of freedom. With infinite degrees of
freedom, the t distribution is the same as the standard normal
distribution.
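A quick numerical look at the variance formula shows how it approaches 1 as the degrees of freedom grow. This is only an illustration; the function name `t_variance` and the chosen degrees of freedom are arbitrary.

```python
def t_variance(v):
    """Variance of the t distribution with v degrees of freedom (defined for v > 2)."""
    if v <= 2:
        raise ValueError("the formula v / (v - 2) requires v > 2")
    return v / (v - 2)

# The variance stays above 1 and shrinks toward 1 as v increases.
for v in (3, 7, 15, 21, 100, 1000):
    print(v, round(t_variance(v), 4))
```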
When to Use the t-Distribution
• The t-distribution can be used with any statistic having a
bell-shaped distribution (i.e., approximately normal). The sampling
distribution of a statistic should be bell-shaped if any of the
following conditions applies:
  - The population distribution is normal.
  - The sampling distribution is symmetric, unimodal,
    without outliers, and the sample size is 15 or less.
  - The sampling distribution is moderately skewed,
    unimodal, without outliers, and the sample size is
    between 16 and 40.
  - The sample size is greater than 40, without outliers.
• The t-distribution should not be used with small samples
from populations that are not approximately normal.
Example
The load at specimen failure is as follows (in megapascals). Find a 95% CI
on μ.
19.8 10.1 14.9 7.5 15.4 15.4 15.4 18.5 7.9 12.7 11.9
11.4 11.4 14.1 17.6 16.7 15.8 19.5 8.8 13.6 11.9 11.4
The sample size is n = 22, the sample mean = 13.71, and s = 3.55
The degrees of freedom are equal to n - 1 = 22 - 1 = 21.
tα/2, n-1 = t0.025, 21 The t0.025, 21 score is equal to 2.080 from the table.
Standard error of the mean = s / sqrt( n ) = 3.55 / sqrt( 22 ) = 0.757
Margin of error (ME) = Critical value x Standard error = 2.080 * 0.757 = 1.57
13.71 - 1.57 ≤ μ ≤ 13.71 + 1.57, that is, 12.14 ≤ μ ≤ 15.28
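The figures in this example can be reproduced from the raw data with the Python standard library. This is a sketch: the critical value 2.080 is taken from the t table as quoted in the example rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Load at specimen failure (megapascals), as listed in the example.
loads = [19.8, 10.1, 14.9, 7.5, 15.4, 15.4, 15.4, 18.5, 7.9, 12.7, 11.9,
         11.4, 11.4, 14.1, 17.6, 16.7, 15.8, 19.5, 8.8, 13.6, 11.9, 11.4]

T_CRIT = 2.080                   # t(0.025, 21) read from the t table

n = len(loads)                   # sample size
xbar = mean(loads)               # sample mean
s = stdev(loads)                 # sample standard deviation (n - 1 divisor)
std_error = s / sqrt(n)          # standard error of the mean
margin = T_CRIT * std_error      # margin of error
lower, upper = xbar - margin, xbar + margin

print(f"n={n} mean={xbar:.2f} s={s:.2f} CI=({lower:.2f}, {upper:.2f})")
```

Because the script keeps full precision for s, the last digit of each limit can differ slightly from the hand calculation, which rounds s to 3.55 first.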
[Table: t-distribution critical values]
A Large-Sample Confidence Interval for a Population Proportion
If X is a binomial random variable with parameters n and p, and
P̂ = X/n is the sample proportion, then for large n
Z = (X - np) / sqrt(np(1 - p)) = (P̂ - p) / sqrt(p(1 - p)/n)
is approximately a standard normal random variable.
The approximation is good for:
np > 5 and n(1 - p) > 5
SBE 304: Statistical Intervals
Example
In a random sample of 85 automobile engine crankshaft
bearings, 10 have a surface finish that is rougher than the
specifications allow. Therefore, a point estimate of the
proportion of bearings in the population that exceed the
roughness specification is p̂ = 10/85 = 0.12
A 95% two-sided confidence interval for p can be computed
as follows:
0.12 – 1.96*sqrt(0.12(0.88)/85) ≤ p ≤ 0.12 + 1.96*sqrt(0.12(0.88)/85)
0.05 ≤ p ≤ 0.19
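The same interval can be computed programmatically. This is a minimal sketch of the large-sample (normal approximation) interval only; the function name `proportion_ci` is hypothetical, and the sample proportion is used in place of p inside the square root, as in the hand calculation.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence_pct=95.0):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n                        # point estimate of p
    alpha = 1 - confidence_pct / 100
    z = NormalDist().inv_cdf(1 - alpha / 2)      # critical z score
    margin = z * sqrt(p_hat * (1 - p_hat) / n)   # margin of error
    return p_hat - margin, p_hat + margin

# 10 rough bearings out of 85 sampled.
low, high = proportion_ci(10, 85)
print(round(low, 2), round(high, 2))  # 0.05 0.19
```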
Choice of Sample Size
n = (zα/2 / E)^2 * p * (1 - p)
Example
How large a sample is required if we want to be 95% confident
that the error in using p̂ to estimate p is less than 0.05?
Using 0.12 as an initial estimate of p, the required sample
size is:
n = (1.96 / 0.05)^2 * 0.12 * (0.88) = 162.3, which rounds up to n = 163
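The sample-size formula for a proportion can be sketched the same way. The function name `sample_size_for_proportion` is hypothetical, and the code assumes an initial guess for p is available and that the result is rounded up.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p_guess, max_error, confidence_pct=95.0):
    """Smallest n keeping the error in estimating p below max_error."""
    alpha = 1 - confidence_pct / 100
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    return ceil((z / max_error) ** 2 * p_guess * (1 - p_guess))

# Initial estimate p = 0.12, allowed error E = 0.05, 95% confidence.
print(sample_size_for_proportion(0.12, 0.05))  # 163
```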