Tests of significance

Hypothesis Tests
One Sample Means

Example 2: A government agency
has received numerous
complaints that a particular
restaurant has been selling
underweight hamburgers. The
restaurant advertises that its
patties are “a quarter pound” (4
ounces).
How can I tell if they
really are underweight?
Take a sample & find x̄.
But how do I know if this x̄ is one
that I expect to happen or is it
one that is unlikely to happen?
A hypothesis
test will help me
decide!

What are hypothesis tests?
Calculations that tell us if a value, x̄,
occurs by random chance or not – if
it is statistically significant
Is it . . .
– a random occurrence due to
natural variation?
– a biased occurrence due to some
other reason?
Statistically significant means that
it is NOT a random chance
occurrence!
Is it one of the
sample means that
are likely to occur?
Is it one that
isn’t likely to
occur?

Nature of hypothesis tests -
• First begin by supposing the
“effect” is NOT present
• Next, see if data provides
evidence against the
supposition
Example: murder trial
How does a murder trial
work?
First - assume that the
person is innocent
Then – must have sufficient
evidence to prove guilty
Hmmmmm …
Hypothesis tests use
the same process!

Steps:
1) Assumptions
2) Hypothesis statements &
define parameters
3) Calculations
4) Conclusion, in context
Notice the steps are the
same except we add
hypothesis statements –
which you will learn today

Assumptions for z-test (t-test):
• Have an SRS of context
• Distribution is (approximately)
normal
– Given
– Large sample size
– Graph data
• σ is known (unknown)
YEA –
These are the same
assumptions as
confidence intervals!!

Example 1: Bottles of a popular cola are
supposed to contain 300 mL of cola. There
is some variation from bottle to bottle. An
inspector, who suspects that the bottler is
under-filling, measures the contents of six
randomly selected bottles. Are the
assumptions met?
299.4 297.7 298.9 300.2 297 301
•Have an SRS of bottles
•Sampling distribution is approximately
normal because the boxplot is
symmetrical
• σ is unknown

Writing Hypothesis statements:
• Null hypothesis – is the statement
being tested; this is a statement of
“no effect” or “no difference”
• Alternative hypothesis – is the
statement that we suspect is true
H0:
Ha:

The form:
Null hypothesis
H0: parameter = hypothesized value
Alternative hypothesis
Ha: parameter > hypothesized value
Ha: parameter < hypothesized value
Ha: parameter ≠ hypothesized value

Example 2: A government agency has
received numerous complaints that a
particular restaurant has been selling
underweight hamburgers. The
restaurant advertises that its patties
are “a quarter pound” (4 ounces).
State the hypotheses :
Where µ is the true
mean weight of
hamburger patties
H0: µ = 4
Ha: µ < 4

Example 3: A car dealer advertises
that its new subcompact models get
47 mpg. You suspect the mileage
might be overrated.
State the hypotheses :
Where µ is the
true mean mpg
H0: µ = 47
Ha: µ < 47

Example 4: Many older homes have electrical
systems that use fuses rather than circuit
breakers. A manufacturer of 40-A fuses
wants to make sure that the mean amperage at
which its fuses burn out is in fact 40. If the
mean amperage is lower than 40, customers
will complain because the fuses require
replacement too often. If the amperage is
higher than 40, the manufacturer might be
liable for damage to an electrical system due
to fuse malfunction. State the hypotheses :
Where µ is the true
mean amperage of
the fuses
H0: µ = 40
Ha: µ ≠ 40

Facts to remember about hypotheses:
• ALWAYS refer to populations
(parameters)
• The null hypothesis for the
“difference” between populations is
usually equal to zero
• The null hypothesis for the correlation
(rho) of two events is usually equal to
zero.
H0: µx-y = 0
H0: ρ = 0

Activity: For each pair of
hypotheses, indicate which are not
legitimate & explain why
a) H0: x̄ = 15; Ha: x̄ = 15
Must use parameter (population); x̄ is a statistic (sample). Must be NOT equal!
b) H0: µ = 123; Ha: µ < 123
c) H0: π = .1; Ha: π ≠ .1
π is the population proportion!
d) H0: µ = .4; Ha: µ > .6
Must use same number as H0!
e) H0: ρ ≠ 0; Ha: ρ = 0
ρ is parameter for population correlation coefficient – but H0 MUST be “=”!

P-values -
• Assuming H0 is true, the
probability that the test
statistic would have a value as
extreme as or more extreme than what
is actually observed
In other words . . . is it
far out in the tails of the
distribution?

Level of significance -
• Is the amount of evidence
necessary before we begin to doubt
that the null hypothesis is true
• Is the probability that we will
reject the null hypothesis, assuming
that it is true
• Denoted by α
– Can be any value
– Usual values: 0.1, 0.05, 0.01
– Most common is 0.05

Statistically significant –
• The p-value is as small or smaller
than the level of significance (α)
• If p > α, “fail to reject” the null
hypothesis at the α level.
• If p ≤ α, “reject” the null
hypothesis at the α level.

Facts about p-values:
• ALWAYS make decision about the null
hypothesis!
• Large p-values show support for the
null hypothesis, but never that it is
true!
• Small p-values show support that the
null is not true.
• Double the p-value for two-tail (≠)
tests
• Never accept the null hypothesis!

Never “accept” the null hypothesis!

At an α level of .05, would you
reject or fail to reject H0 for
the given p-values?
a) .03 – Reject
b) .15 – Fail to reject
c) .45 – Fail to reject
d) .023 – Reject
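The decision rule amounts to a single comparison. Here is a minimal sketch of it; the function name `decide` is hypothetical and only for illustration.

```python
def decide(p_value, alpha=0.05):
    """Reject H0 when the p-value is as small as or smaller than alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

# The four p-values from the exercise, tested at alpha = .05
for p in (0.03, 0.15, 0.45, 0.023):
    print(p, "->", decide(p))
```

Note that the function never returns "accept H0": a large p-value only means the data did not give enough evidence against the null.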

Draw & shade a curve &
calculate the p-value:
1) right-tail test t = 1.6; n = 20
P-value = .0630
2) left-tail test z = -2.4; n = 15
P-value = .0082
3) two-tail test t = 2.3; n = 25
P-value = 2(.0152) = .0304
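These tail areas can also be checked without a calculator. The sketch below is one possible approach using only the Python standard library: the normal tail comes from `statistics.NormalDist`, and the t tail is found by numerically integrating the t density with Simpson's rule. The helper names `t_pdf` and `t_cdf` are made up for this illustration; they are not calculator or library functions.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t), integrating the density from 0 to |t| with Simpson's rule."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # P(0 <= T <= |t|)
    return 0.5 + area if t >= 0 else 0.5 - area

p_right = 1 - t_cdf(1.6, df=19)        # 1) right tail, df = n - 1 = 19
p_left = NormalDist().cdf(-2.4)        # 2) left tail of the standard normal
p_two = 2 * (1 - t_cdf(2.3, df=24))    # 3) two tails: double the one-tail area

print(round(p_right, 4), round(p_left, 4), round(p_two, 4))
```

For the z-test, n is not needed to find the p-value; for the t-tests it sets the degrees of freedom.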

Writing Conclusions:
1) A statement of the decision
being made (reject or fail to
reject H0) & why (linkage)
AND
2) A statement of the results in
context. (state in terms of Ha)

“Since the p-value < (>) α,
I reject (fail to reject)
the H0. There is (is not)
sufficient evidence to
suggest that Ha.”
Be sure to write Ha in
context (words)!

Example 5: Drinking water is considered
unsafe if the mean concentration of lead is
greater than 15 ppb (parts per billion).
Suppose a community randomly selects
25 water samples and computes a t-test
statistic of 2.1. Assume that lead
concentrations are normally distributed.
Write the hypotheses, calculate the
p-value & write the appropriate conclusion
for α = 0.05.
H0: µ = 15
Ha: µ > 15
Where µ is the true mean
concentration of lead in drinking water
t = 2.1
P-value = tcdf(2.1,10^99,24) = .0232
Since the p-value < α, I reject H0. There
is sufficient evidence to suggest that the
mean concentration of lead in drinking
water is greater than 15 ppb.

Example 6: The box for a certain type of frozen
dinner states that the dinner
contains 240 calories. A random
sample of 12 of these frozen dinners
was selected from production to see
if the caloric content was greater
than stated on the box. The t-test
statistic was calculated to be 1.9.
Assume calories vary normally. Write
the hypotheses, calculate the p-value
& write the appropriate conclusion
for α = 0.05.
H0: µ = 240
Ha: µ > 240
Where µ is the true mean caloric
content of the frozen dinners
t = 1.9
P-value = tcdf(1.9,10^99,11) = .0420
Since the p-value < α, I reject H0. There
is sufficient evidence to suggest that the
true mean caloric content of these
frozen dinners is greater than 240
calories.

Formulas:
σ known:
test statistic = (statistic - parameter) / (standard deviation of statistic)
z = (x̄ - µ) / (σ / √n)

Formulas:
σ unknown:
test statistic = (statistic - parameter) / (standard deviation of statistic)
t = (x̄ - µ) / (s / √n)

Example 7: The Fritzi Cheese Company buys milk from
several suppliers as the essential raw material for its
cheese. Fritzi suspects that some producers are
adding water to their milk to increase their profits.
Excess water can be detected by determining the
freezing point of milk. The freezing temperature of
natural milk varies normally, with a mean of -0.545
degrees and a standard deviation of 0.008. Added
water raises the freezing temperature toward 0
degrees, the freezing point of water (in Celsius). The
laboratory manager measures the freezing
temperature of five randomly selected lots of milk
from one producer with a mean of -0.538 degrees. Is
there sufficient evidence to suggest that this
producer is adding water to his milk?

Assumptions: (SRS? Normal? How do you know? Do you know σ?)
• I have an SRS of milk from one producer
• The freezing temperature of milk is normally distributed (given)
• σ is known
Hypotheses: (What are your hypothesis statements? Is there a key word?)
H0: µ = -0.545
Ha: µ > -0.545
where µ is the true mean freezing temperature of milk
Calculations: (Plug values into formula.)
z = (-.538 - (-.545)) / (.008 / √5) = 1.9566
p-value = normalcdf(1.9566,1E99) = .0252 (Use normalcdf to calculate p-value.)
α = .05

Conclusion: (Compare your p-value to α & make decision.)
Since p-value < α, I reject the null hypothesis.
(Write conclusion in context in terms of Ha.)
There is sufficient evidence to suggest that the true
mean freezing temperature is greater than -0.545. This
suggests that the producer is adding water to the milk.
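The whole z-test for Example 7 can be reproduced in a few lines. This is a minimal sketch using the standard library's `statistics.NormalDist` in place of normalcdf; the variable names are chosen for this illustration only.

```python
from math import sqrt
from statistics import NormalDist

mu0 = -0.545     # hypothesized mean freezing temperature (H0)
sigma = 0.008    # known population standard deviation
n = 5            # number of lots measured
xbar = -0.538    # sample mean freezing temperature
alpha = 0.05

# z = (statistic - parameter) / (standard deviation of statistic)
z = (xbar - mu0) / (sigma / sqrt(n))

# Ha: mu > -0.545, so the p-value is the area in the right tail
p_value = 1 - NormalDist().cdf(z)

decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"z = {z:.4f}, p-value = {p_value:.4f}, decision: {decision}")
```

The right tail is used because added water raises the freezing temperature, so the alternative is Ha: µ > -0.545.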

Example 8: The Degree of Reading Power
(DRP) is a test of the reading ability of
children. Here are DRP scores for a random
sample of 44 third-grade students in a
suburban district:
(data on note page)
At the α = .1 level, is there sufficient evidence to
suggest that this district’s third-graders’
reading ability is different from the national
mean of 34?

Assumptions: (SRS? Normal? How do you know? Do you know σ?)
• I have an SRS of third-graders
• Since the sample size is large, the sampling distribution is approximately normally distributed
OR
• Since the histogram is unimodal with no outliers, the sampling distribution is approximately normally distributed
• σ is unknown
Hypotheses: (What are your hypothesis statements? Is there a key word?)
H0: µ = 34
Ha: µ ≠ 34
where µ is the true mean reading ability of the district’s third-graders
Calculations: (Plug values into formula.)
t = (35.091 - 34) / (11.189 / √44) = .6467
p-value = tcdf(.6467,1E99,43) = .2606(2) = .5212 (Use tcdf to calculate p-value.)
α = .1

Conclusion: (Compare your p-value to α & make decision.)
Since p-value > α, I fail to reject the null hypothesis.
(Write conclusion in context in terms of Ha.)
There is not sufficient evidence to suggest that the
true mean reading ability of the district’s third-graders
is different from the national mean of 34.

Example 9: The Wall Street Journal
(January 27, 1994) reported that based
on sales in a chain of Midwestern grocery
stores, President’s Choice Chocolate Chip
Cookies were selling at a mean rate of
$1323 per week. Suppose a random sample
of 30 weeks in 1995 in the same stores
showed that the cookies were selling at
the average rate of $1208 with standard
deviation of $275. Does this indicate that
the sales of the cookies are different from
the earlier figure?

Assume:
• Have an SRS of weeks
• Sampling distribution of mean sales is approximately normal due to large sample size
• σ unknown
H0: µ = 1323
Ha: µ ≠ 1323
where µ is the true mean cookie sales per week
t = (1208 - 1323) / (275 / √30) = -2.29
p-value = .0295
Since p-value < α of 0.05, I reject the null hypothesis.
There is sufficient evidence to suggest that the sales of cookies
are different from the earlier figure.

Example 9 (continued): President’s Choice Chocolate Chip
Cookies were selling at a mean rate of $1323
per week. Suppose a random sample of 30
weeks in 1995 in the same stores showed
that the cookies were selling at the average
rate of $1208 with standard deviation of
$275. Compute a 95% confidence interval for
the mean weekly sales rate.
CI = ($1105.30, $1310.70)
Based on this interval, is the mean weekly
sales rate statistically different from the
reported $1323?

What do you notice about the decision from the
confidence interval & the hypothesis test?
What decision would you make on Example 9 if α = .01?
You would fail to reject H0 since the p-value > α.
What confidence level would be correct to use?
You should use a 99% confidence level for a two-sided hypothesis test at α = .01.
Does that confidence interval provide the same decision?
CI = ($1069.60, $1346.40) - Since $1323 is in this interval, we would fail to reject H0.
If Ha: µ < 1323, what decision would the hypothesis test give at α = .02?
Remember, your p-value = .01475. At α = .02, we would reject H0.
Now, what confidence level is appropriate for this alternative hypothesis?
The 98% CI = ($1084.40, $1331.60) - Since $1323 is in the interval, we would fail to reject H0.
Why are we getting different answers?
In a one-sided test, all of α (2%) goes into that tail (lower tail).
In a CI, the tails have equal area – so there should also be 2% in the upper tail.
That leaves 96% in the middle & that should be your confidence level.
A 96% CI = ($1100, $1316). Since $1323 is not in the interval, we would reject H0.
Tail probabilities between the significance level (α) and the confidence level MUST match!
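The link between confidence levels and test decisions can be checked numerically. The sketch below recomputes the intervals for the cookie data at several confidence levels using only the Python standard library. The helpers `t_pdf`, `t_cdf`, `t_quantile` and `t_interval` are hypothetical names written for this illustration: the t CDF is approximated with Simpson's rule and the critical value t* is found by bisection.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=1000):
    """P(T <= t), integrating the density from 0 to |t| with Simpson's rule."""
    a = abs(t)
    h = a / steps                      # steps must be even
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_quantile(prob, df):
    """Value t with P(T <= t) = prob, found by bisection on t_cdf."""
    lo, hi = -20.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(xbar, s, n, level):
    """Two-sided t confidence interval for a mean from summary statistics."""
    t_star = t_quantile(1 - (1 - level) / 2, n - 1)
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin

xbar, s, n, mu0 = 1208, 275, 30, 1323
intervals = {level: t_interval(xbar, s, n, level)
             for level in (0.95, 0.99, 0.98, 0.96)}
for level, (low, high) in intervals.items():
    inside = low <= mu0 <= high
    print(f"{level:.0%} CI = (${low:.2f}, ${high:.2f}); $1323 inside: {inside}")

# One-sided p-value for Ha: mu < 1323 (area in the lower tail)
t_stat = (xbar - mu0) / (s / math.sqrt(n))
p_lower = t_cdf(t_stat, n - 1)
```

The 96% interval is the one that agrees with the one-sided test at α = .02, because it leaves 2% in each tail.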

Matched Pairs Test
A special type of t-inference
Matched Pairs – two forms
Form 1:
• Pair individuals by certain characteristics
• Randomly select treatment for individual A
• Individual B is assigned to other treatment
• Assignment of B is dependent on assignment of A
Form 2:
• Individual persons or items receive both treatments
• Order of treatments is randomly assigned, or before & after measurements are taken
• The two measures are dependent on the individual

Is this an example of matched pairs?
1) A college wants to see if there’s a
difference in time it took last year’s
class to find a job after graduation and
the time it took the class from five years ago
to find work after graduation. Researchers
take a random sample from both classes and
measure the number of days between
graduation and first day of employment
No, there is no pairing of individuals, you
have two independent samples

2) In a taste test, a researcher asks people
in a random sample to taste a certain brand
of spring water and rate it. Another
random sample of people is asked to
taste a different brand of water and rate it.
The researcher wants to compare these
samples
No, there is no pairing of individuals, you
have two independent samples – If you
had the same people taste both brands in
random order, then it would be an example of
matched pairs.

3) A pharmaceutical company wants to test
its new weight-loss drug. Before giving the
drug to a random sample, company
researchers take a weight measurement
on each person. After a month of using
the drug, each person’s weight is
measured again.
Yes, you have two measurements that are
dependent on each individual.

A whale-watching company noticed that many
customers wanted to know whether it was
better to book an excursion in the morning or
the afternoon. To test this question, the
company collected the following data on 15
randomly selected days over the past
month. (Note: days were not
consecutive.)
Day          1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Morning      8  9  7  9 10 13 10  8  2  5  7  7  6  8  7
Afternoon    8 10  9  8  9 11  8 10  4  7  8  9  6  6  9
First, you must find
the differences for
each day.
Since you have two values for
each day, they are dependent
on the day – making this data
matched pairs
You may subtract either
way – just be careful when
writing Ha

Day          1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Morning      8  9  7  9 10 13 10  8  2  5  7  7  6  8  7
Afternoon    8 10  9  8  9 11  8 10  4  7  8  9  6  6  9
Differences  0 -1 -2  1  1  2  2 -2 -2 -2 -1 -2  0  2 -2
I subtracted: Morning – afternoon. You could subtract the other way!
Assumptions: (You need to state assumptions using the differences!)
• Have an SRS of days for whale-watching
• σ unknown
• Since the normal probability plot is approximately linear, the distribution of differences is approximately normal.
Notice the granularity in this plot; it still displays a nice linear relationship!

Differences 0 -1 -2 1 1 2 2 -2 -2 -2 -1 -2 0 2 -2
Is there sufficient evidence that more whales are
sighted in the afternoon?
Be careful writing your Ha!
Think about how you
subtracted: M-A
If afternoon is more should
the differences be + or -?
Don’t look at numbers!!!!
H0: µD = 0
Ha: µD < 0
Where µD is the true mean
difference in whale sightings
from morning minus afternoon
Notice we used µD for
differences
& it equals 0 since the null should
be that there is NO difference.
If you subtract afternoon –
morning; then Ha: µD>0

Finishing the hypothesis test:
(In your calculator, perform a t-test using the differences (L3).)
t = (x̄ - µ) / (s / √n) = (-.4 - 0) / (1.639 / √15) = -.945
df = 14
p-value = .1803
α = .05
Since p-value > α, I fail to reject H0. There is
insufficient evidence to suggest that more whales are
sighted in the afternoon than in the morning.
Notice that if you subtracted A-M, then your test statistic t = +.945, but the p-value would be the same.
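The matched pairs test on the whale data can be reproduced as a one-sample t-test on the differences. This is a minimal sketch using only the Python standard library; the helpers `t_pdf` and `t_cdf` are hypothetical names for this illustration, with the t tail area approximated by Simpson's rule rather than a calculator's tcdf.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t), integrating the density from 0 to |t| with Simpson's rule."""
    a = abs(t)
    h = a / steps                      # steps must be even
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

morning = [8, 9, 7, 9, 10, 13, 10, 8, 2, 5, 7, 7, 6, 8, 7]
afternoon = [8, 10, 9, 8, 9, 11, 8, 10, 4, 7, 8, 9, 6, 6, 9]

# Matched pairs: work with one list of differences (morning - afternoon)
diffs = [m - a for m, a in zip(morning, afternoon)]

n = len(diffs)
xbar = mean(diffs)                     # mean difference
s = stdev(diffs)                       # sample standard deviation of differences
t = (xbar - 0) / (s / math.sqrt(n))    # H0: mean difference = 0
df = n - 1

# Ha: mean difference < 0 (more whales in the afternoon), so use the left tail
p_value = t_cdf(t, df)
alpha = 0.05
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"t = {t:.3f}, df = {df}, p-value = {p_value:.4f}, decision: {decision}")
```

Subtracting afternoon - morning instead flips the sign of every difference and of t, and the p-value is then taken from the right tail, so the decision does not change.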
