12
The Chi-Square Test: Analyzing Categorical Data
Learning Objectives
After reading this chapter, you should be able to:
• Describe the conditions that fit chi-square tests.
• Calculate and interpret the goodness-of-fit test and the chi-square test of independence.
• Calculate and interpret the phi coefficient and Cramer’s V.
Chapter Outline

12.1 Examining Categorical Data
12.2 The Goodness-of-Fit (1 × k) Chi-Square
    Calculating the Test Statistic
    Interpreting the Test Statistic
    Understanding the Chi-Square Hypotheses
    Distinguishing Between Goodness-of-Fit Chi-Square Tests and t-Tests or ANOVAs
    A 1 × k (Goodness-of-Fit) Chi-Square Problem With Unequal fe Values
    A Final 1 × k Problem
12.3 The Chi-Square and Statistical Power
12.4 The Goodness-of-Fit Test in Excel
12.5 The Chi-Square Test of Independence
    Setting up the Chi-Square Test of Independence
    Interpreting the Chi-Square Test of Independence
    Phi Coefficient and Cramer's V
    A 3 × 3 Test of Independence Problem
Chapter Summary
12.1 Examining Categorical Data
The 19th-century British statesman Benjamin Disraeli is credited with saying that there are three kinds of lies: lies, damned lies, and statistics. Clearly, he had to have a place in this book, even if it is in the final chapter. But he belongs here because of another comment that is particularly relevant to the topics in this chapter. He observed that what we anticipate seldom occurs and what we least expect generally happens (Oxford, 1980). Disraeli's expressed skepticism was almost certainly tongue in cheek. Indeed, the work on regression in Chapters 9 and 10 is based on the understanding that outcomes are not unpredictable, but the statement provides an effective segue into the connection between what occurs and what might be expected to occur. That analysis is the focus of this chapter.

Part of the discussion in Chapter 2 was how data differ according to scale, and how the statistics that can be calculated also relate to scale; you learned about different types of data scales and the appropriate types of statistics for each. For example, for nominal scale data, only the mode (Mo) makes sense as a measure of central tendency. Subsequent chapters revealed that it is not only descriptive statistics that are specific to the scale of the data. The more involved statistical tests are also data-scale dependent. Recall that the dependent variable in a t-test, a z-test, and ANOVA must be data that fit a continuous (interval or ratio) scale. Both variables in the Pearson Correlation must be at least interval scale. These distinctions are very important. Along with whether the hypothesis deals with difference or association and whether the groups are independent, the scale of the data is an important guide to determining the appropriate statistical procedure.
Sometimes the best data available are not continuous. There may be no way to verify the normality of the data (perhaps because they are not normal). The groups involved in the analysis may not have equivalent variances. And perhaps the relationships between variables are not linear. When important assumptions about the quality of the data cannot be satisfied, the answer is to move to tests with more relaxed requirements.

Chapter 8 provided the first discussion of nonparametric statistical analysis. Recall that these tests set aside a number of important assumptions about the characteristics of the groups involved in an analysis. Removing some of the requirements associated with the characteristics of the data provides greater analytical flexibility.

Chi-square tests allow for the analysis of data that are exclusively categorical. This distinguishes them from all the statistical tests discussed in previous chapters. In addition, the use of categorical data, since categorical/nominal data cannot reflect the characteristics of normality, suggests that the chi-square tests are nonparametric tests, also referred to as "distribution-free" tests. No assumptions are needed about how the data are distributed, nor are there requirements regarding their scale. These tests provide flexibility, but they exact a cost as well, which will be discussed as the chapter progresses. The chi-square procedures have many applications in business analysis and decision making and are a mainstay in the manager's statistical "toolbox."

The chi-square tests were developed by Karl Pearson—the "Pearson Correlation" Pearson. Note that the Greek letter for "c" is written χ and pronounced with a hard c, so chi is pronounced "kie," rhyming with "pie." There are two chi-square procedures discussed in this chapter. Both of them have two names. The first is called the goodness-of-fit chi-square test, or alternatively the 1 × k (said "one by kay") chi-square.
12.2 The Goodness-of-Fit (1 × k) Chi-Square

Perhaps a market research specialist is trying to determine whether several local "talk radio" stations have approximately similar audiences. Among other things, the answer will influence what individual stations can charge for advertising. The market research specialist makes a random selection of people from a local telephone book and calls each number to ask residents if they listen to talk radio. For those answering in the affirmative, the question is which station they prefer.

Both names for this procedure are instructive for what they reveal about the kind of analysis involved. "Goodness-of-fit," however awkward the grammar, indicates that what is at issue is how "good" the data fit an initial hypothesis. That initial hypothesis describes the expected distribution of the data. In the talk radio example, the procedure will be to test whether listeners prefer the major talk stations in about equal proportions; it will provide an analysis of how well the data fit that assumption.

Key Terms: The goodness-of-fit, or 1 × k chi-square, is a test for significant differences in the categories of a single, nominal scale variable.
The 1 × k chi-square name is a reminder that the procedure involves just a single grouping variable (the "1" in the name), which is divided into some number (k) of categories. Note that the "k" here has the same meaning that it had in ANOVA. It refers to the number of groups in the analysis. In the preceding example:

• The categorical (grouping) variable is the preferred radio station.
• The k refers to the number of categories into which the single variable is divided, which is the number of talk radio stations that will be involved in the analysis.

Take care not to confuse the number of categories with the number of variables involved in the analysis. If there are five talk radio stations in the listening area, there are five categories or dimensions of the single variable, preferred radio station.

Recall that nominal data are also often called categorical, or "count," data, stemming from the fact that the measurement involved with these kinds of data is a matter of asking listeners which station they listen to, and then sorting them into the relevant category (preferred radio station) accordingly. This "count" becomes the dependent variable in chi-square tests. The analysis hinges on the frequency with which subjects fall into the individual categories. Note that the issue is not how much more the individual prefers station A to station B, something that would indicate ordinal scale data, and the question is not how much time the individual spends listening, which would provide ratio scale data. The only question for respondents who listen to talk radio is which is the preferred station.

If the question that drives the study is whether listeners prefer the five stations in about equal proportions, that is the hypothesis that will be tested. In that instance, the expectation is that the numbers of listeners in each of the categories that represent the different talk radio stations will be reasonably similar. If they are, the resulting chi-square value will not be statistically significant. Statistically significant results emerge in a goodness-of-fit chi-square when there is a substantial discrepancy between what is observed in the data and what is expected based on that initial hypothesis. Exactly how much of a discrepancy there must be is determined by comparing the calculated value of chi-square to a critical value that, like the other t, F, and r test statistics, is determined by the degrees of freedom for the problem and the probability level at which the test is conducted.

Referring back to the radio station problem, the market research specialist needs to either find support for the hypothesis that listeners tune in to all five stations in about equal proportions or update the expectation. Ninety-five listeners are asked about their station preferences. Sixty of the respondents name one of the five stations in the market area. The other 35 listen to subscription stations on satellite radio that are not located in the area. Since the interest is in listeners to local stations, those 35 people were excluded from the study. From the remaining 60 listeners, the results are as follows:

Station A  15
Station B  8
Station C  12
Station D  10
Station E  15

Review Question A: What is the scale of the data required by either of the chi-square tests?
The survey of the 60 respondents indicates that preferences are as follows:

Station A  15
Station B  8
Station C  12
Station D  10
Station E  15

These results provide a range of 8 listeners for the least frequently mentioned Station B to 15 listeners for the most popular stations, which are Stations A and E, so there are clearly differences in listeners' preferences. The question the chi-square procedure will answer is whether the differences in "count" across the five stations are just random differences that can be expected because of sampling error, or whether the differences are great enough that they are likely to emerge every time data are collected and the results analyzed. It is that last outcome, of course, that defines statistical significance.

Calculating the Test Statistic

The chi-square test statistic, which is neither intimidating to look at nor difficult to calculate, has this form:

Formula 12.1    χ² = Σ (fo − fe)² / fe

Where
χ² = the value of chi-square
fo = the frequency observed; how many individuals occur in the particular categories
fe = the frequency expected; how often individuals can be expected to occur in the particular category according to the initial hypothesis

The chi-square test statistic involves a good deal of repetitive subtracting and squaring. A good way to keep the calculations straight is to complete them in something like Table 12.1. The successive steps in the calculations are represented in the rows beginning with the fo row near the top of the table, and then working down the rows one at a time. Each row in the table represents a calculation step for determining the χ² value.
Table 12.1: The 1 × k chi-square

Following the steps outlined above,

Statistic      Station A  Station B  Station C  Station D  Station E
fo             15         8          12         10         15
fe             12         12         12         12         12
fo − fe        3          −4         0          −2         3
(fo − fe)²     9          16         0          4          9
(fo − fe)²/fe  .75        1.33       0          .33        .75

χ² = .75 + 1.33 + 0 + .33 + .75 = 3.16
The values in the fo row are counts of the number of individuals that occur in each category of the variable.

• These "frequency observed" values are the number of listeners from the sample of 60 who indicate that they listen to a particular radio station.
• The sum of the fo values across the multiple categories of the variable must always sum to the total sample size, n.

The second row, designated fe, indicates what is expected based on whatever hypothesis or assumption prompted the analysis.

• For the problem above, the hypothesis/assumption is that listeners are attracted to the five stations in approximately equal proportions.
• That expectation is reflected in equal values for each of the different categories.

Later there will be a problem where the expectation is that the categories will not be equal, which must also be reflected in the fe values. If, for example, station A is associated with a major network and has several nationally syndicated shows, perhaps the expectation is that station A is twice as popular as the others. In that case, the fe value for station A would be twice as high as the fe values for each of the other stations.

Here, however, the problem is simpler. The hypothesis is that the stations are equally popular, so determining what to expect is simply a matter of dividing the total number of listeners by the number of categories:

fe = n/k

Where
n = the total of all subjects in all categories
k = the number of categories

So for the radio station problem, because n = 60 and k = 5, the fe values are determined as follows:

fe = 60/5 = 12 in each fe category.
Study the formula for the test statistic for a moment, review the order of mathematical operations, and the process for calculating χ² will be straightforward. Recall from what your ninth-grade algebra teacher told you that when there are multiple operations required:

• The first step is to complete whatever needs to be done in the parentheses ("please").
• Then deal with exponents ("excuse").
• Handle multiplication and division ("my dear") next.
• Then complete any addition and subtraction last ("Aunt Sally") when they are not in parentheses.

So in terms of the rows in Table 12.1, the process is to:

1. Fill in the fo values in the first line, determined by the number of people who indicate that they listen to a particular station.
2. For this problem at least, because of the expectation that the five stations are all about equally popular among listeners, divide n by k and enter that value in each of the five fe boxes on the second line.

Now, following order of mathematical operations in columns 1 through 5:

3. On line 3 enter the value that is fo − fe for each of the five stations; for each category it is the fo value from the first line minus the fe value from the second line.
4. On line 4 deal with the exponent by squaring each difference between fo and fe.
5. On the next line, divide the result of the previous line, (fo − fe)², by the fe value from the second line.
6. On the final line, sum the (fo − fe)²/fe values across the five categories to determine the χ² value.

The result of completing these steps for the data above is that χ² = 3.16.
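For readers who want to check the arithmetic, here is a minimal Python sketch of the same calculation. Python and the variable names are illustrative choices, not part of the chapter:

    # Sketch of the 1 x k (goodness-of-fit) chi-square for the radio-station example.
    observed = [15, 8, 12, 10, 15]          # fo values for Stations A-E
    n = sum(observed)                        # 60 listeners
    k = len(observed)                        # 5 categories
    expected = [n / k] * k                   # equal-preference hypothesis: fe = 12 each

    chi_square = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    print(round(chi_square, 2))              # 3.17 (3.16 in the text, which rounds each term first)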
Interpreting the Test Statistic

The next step is to determine whether a χ² value of 3.16 is statistically significant by comparing it to the appropriate critical value of χ² from Table 12.2. As with the other tables, the critical value is determined according to the probability level at which the test is conducted (p = .05 is the default level for this test as well), and the number of degrees of freedom for the problem. The df for the goodness-of-fit chi-square are k − 1, the number of categories of the variable, minus 1. Here, df = 4.
Table 12.2: The critical values of chi-square
df    p = 0.05    p = 0.01    p = 0.001
1 3.84 6.64 10.83
2 5.99 9.21 13.82
3 7.82 11.35 16.27
4 9.49 13.28 18.47
5 11.07 15.09 20.52
6 12.59 16.81 22.46
7 14.07 18.48 24.32
8 15.51 20.09 26.13
9 16.92 21.67 27.88
10 18.31 23.21 29.59
11 19.68 24.73 31.26
12 21.03 26.22 32.91
13 22.36 27.69 34.53
14 23.69 29.14 36.12
15 25.00 30.58 37.70
16 26.30 32.00 39.25
17 27.59 33.41 40.79
18 28.87 34.81 42.31
19 30.14 36.19 43.82
20 31.41 37.57 45.32
Source: http://home.comcast.net/~sharov/PopEcol/tables/chisq.html Retrieved: 6 July, 2012.
As with the other tests conducted so far, the calculated value of chi-square is statistically significant when it is equal to, or larger than, a critical value. That value is determined by the probability level of the test and the degrees of freedom for the problem. For χ².05(4), the table indicates that the critical value is 9.49. A critical value from the table greater than the calculated value of chi-square indicates that the fo to fe difference is best explained by differences that could occur by chance in the chi-square distribution. The result is not statistically significant. This result prompts the marketing analyst to fail to reject the null hypothesis.
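The critical values in Table 12.2 can also be reproduced in software rather than read from a printed table. As an illustration only (SciPy is an assumption here, not a tool the chapter uses), the df = 4 critical value and the p-value for the calculated χ² of 3.16 look like this:

    # Reproducing a Table 12.2 critical value and the test's p-value with SciPy.
    from scipy.stats import chi2

    alpha = 0.05
    df = 4                                    # k - 1 categories for the radio problem
    critical_value = chi2.ppf(1 - alpha, df)  # value a chi-square must reach to be significant
    p_value = chi2.sf(3.16, df)               # chance of a chi-square this large or larger

    print(round(critical_value, 2), round(p_value, 3))  # 9.49 0.531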
As an aside, the critical values for chi-square are usually carried to two decimals in their tables, just as z values were in their table. With chi-square, stopping at two decimals is not just a matter of crowding more values onto a page as was the case with z. The nominal data upon which chi-square values are based are relatively crude compared to ordinal, or interval or ratio, data, and it makes less sense than with those other data to imply the level of exactness suggested by three decimals.
Understanding the Chi-Square Hypotheses

Like all statistical hypotheses, the null and alternate hypotheses in chi-square problems refer to what occurs in the populations the samples represent.

• In the language of statistics, the null hypothesis for a chi-square problem is that the frequency expected is equal to the frequency observed (Ho: fe = fo). The fact that the result in the problem just worked was not statistically significant indicates that in a population where people listen to five talk radio stations in equal proportions, it is not improbable to draw a sample in which the numbers of listeners who prefer each station range from 8 to 15 out of 60. A sample with fo values of 15, 8, 12, 10, 15 is still consistent with the null hypothesis, and it is one of the outcomes that make up the chi-square distribution.
• The alternate hypothesis is that fo values of 15, 8, 12, 10, 15 do not constitute a sample likely to occur in the chi-square distribution. Stated symbolically it is Ha: fe ≠ fo, an indication that what was expected differs significantly from what was actually observed.

The null hypothesis indicates that any variability between what is observed and what is expected is explained by what will probably occur in the chi-square distribution. In other words, there is insufficient evidence to reject the possibility that what occurred is likely to have occurred by chance. The alternate hypothesis is that there is too much difference between what is observed and what is expected to conclude that the outcome is due to chance.
Distinguishing Between Goodness-of-Fit Chi-Square Tests and t-Tests or ANOVAs

The 1 × k, or goodness-of-fit, chi-square procedure falls under the general hypothesis of difference category of procedures. In that regard it is similar to the independent samples t-test and to ANOVA. Like those procedures, the value of the chi-square statistic is a measure of difference. The primary difference is the scale of the data in the analysis. In independent samples t-tests and ANOVA, the t and the F respectively are measures of the difference between the means of the samples involved in the analysis. The independent (grouping) variable is categorical, and the dependent variable is continuous (interval or ratio scale). On the other hand, the chi-square statistic measures the difference between the frequencies of occurrence of a nominal (categorical) variable compared with what is expected. The larger the gap between the expected and observed frequency distribution, the greater the difference between fo and fe. Since the dependent variable is the "count," or frequency, of occurrence of the categorical variable of interest, it is impossible to calculate means and standard deviations, which makes t-tests and ANOVAs impossible.
A 1 × k (Goodness-of-Fit) Chi-Square Problem With Unequal fe Values

This first chi-square problem was based on the assumption that listeners preferred the five radio stations in about equal proportions, but equal fe values across all categories of the variable are not always the case. Perhaps a consumer advocate is testing the claim made by the manufacturer of an energy drink called Rush that consumers prefer its product
2-to-1 over the major competitor's product (Advantage) based on taste alone. If this is accurate, a random sample of preferences from consumers of energy drinks should indicate that twice as many prefer Rush over Advantage. Since it is highly unlikely that a random sample will yield exactly those results even if the claim is accurate, the chi-square test can be used to determine whether sample results are close enough to support that claim, or whether results are significantly different from the claim.

The consumer advocate takes a sample of 150 students and finds that 27 of them have used both Rush and Advantage and express a preference for one over the other. The other 123 students prefer either some other energy drink, or use none at all. Their responses are discarded. Of the remaining 27 students, 16 of them prefer Rush and 11 prefer Advantage. Just as with the first problem, the 11 and 16 numbers represent the fo values, and their sum equals the value of n for the problem. That is the easy part. Because the claim by the manufacturer of Rush is that consumers prefer its drink 2-to-1 over the major competitor, Advantage, the fe values must reflect the 2-to-1 expectation.
Calculating fe Values for Unequal Categories

The total of both frequencies observed and frequencies expected must sum to the total, n. This will always be the case, regardless of the particular hypothesis.

Σfo = n, and
Σfe = n

To calculate the fe values when the numbers in multiple categories are not the same will involve vindicating that ninth-grade math teacher who said that someday algebra would be helpful. To determine the fe values,

1. Let x equal fe for the number who prefer the Advantage energy drink.
2. Since the expectation in this example is that twice as many consumers will prefer Rush over Advantage, let 2x be the fe for those who prefer the Rush energy drink.
3. Because the fe categories must sum to the total, then x + 2x = n.
4. Since n = 27, the expression can be changed as follows: x + 2x = 27.
   a. If x + 2x = 27, it follows that 3x = 27.
   b. If 3x = 27, then x = 27/3, which makes x equal to 9.

As a result of these calculations, the fe value for the Rush consumers is 18 (2x = 2 × 9 = 18), and the fe value for Advantage consumers is 9 (x = 9). With those values in hand, the claim that Rush is preferred twice as often as Advantage on the basis of taste can be tested with the 1 × k chi-square. The solution is in Table 12.3.
Review Question B: What is the null hypothesis in a chi-square problem?
Table 12.3: A 1 × k chi-square for unequal fe values: Is Rush twice as popular as Advantage?

               Advantage  Rush
fo             11         16
fe             9          18
fo − fe        2          −2
(fo − fe)²     4          4
(fo − fe)²/fe  .44        .22

χ² = .44 + .22 = .66
χ² = .66
χ².05(1) = 3.84. Accept Ho.
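The algebra for the unequal fe values and the test itself translate directly into a few lines of code. This is a minimal sketch under the same 2-to-1 assumption; the dictionary layout is illustrative only:

    # Goodness-of-fit test with unequal fe values: the Rush vs. Advantage example.
    observed = {"Advantage": 11, "Rush": 16}     # fo values from the 27 usable responses
    weights  = {"Advantage": 1,  "Rush": 2}      # the manufacturer's claimed 1:2 ratio

    n = sum(observed.values())                   # 27
    unit = n / sum(weights.values())             # x = 27 / 3 = 9
    expected = {name: unit * w for name, w in weights.items()}   # {'Advantage': 9, 'Rush': 18}

    chi_square = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)
    print(round(chi_square, 2))                  # 0.67 (reported as .66 in the text after rounding)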
Interpreting the Results

Since the calculated value of chi-square is lower than the critical value from the table for p = .05 and 1 degree of freedom, the decision is to fail to reject. That part is straightforward enough, but in this problem where the claim is that Rush is twice as popular as Advantage, what does failing to reject mean? The key is the null hypothesis, which always reflects the expectation upon which the test is based. Because this problem was set up with an fe value for one outcome that is two times the value of the other, failing to reject the null hypothesis means that there is not enough evidence to reject the claim that Rush is twice as popular as Advantage. To say it another way, although the data do not reflect exactly a 2-to-1 preference for Rush (16 is not 2 × 11), the departure from that claim is not sufficient to allow the consumer advocate to reject it.

Note that the makers of Rush maintain that their product is twice as popular as Advantage based on taste. Whether it is entirely taste or not probably cannot be verified. Perhaps it is marketing ability that prompts the students to prefer Rush, or maybe the costs of the two products differ, or maybe one comes in a more convenient size than the other. The way the data were collected made the consumer's stated preference the issue, without questions about the reasons for the preference. For whatever reason, students prefer the one product to the other by a great enough margin that it could be 2-to-1.
A Final 1 × k Problem

To solidify the grasp of the goodness-of-fit procedure, here is one more problem. In this instance a consulting company is retained by a satellite provider to determine whether, in a particular region of the country, satellite TV is three times more popular than free TV, and whether cable TV is twice as popular as free TV. The consulting company is retained to check the veracity of that claim. A random sample of 93 viewers in the region is examined and found to rely on the following for television service:
Satellite  65
Cable      16
Free       12

The same approach used in the last problem to determine the fe values produces the following:

3x (satellite) + 2x (cable) + x (free TV) = 93
6x = 93
x = 15.5

That x value makes the fe for free TV = 15.5, for cable TV = 2 × 15.5 = 31, and for satellite TV = 3 × 15.5 = 46.5.

Note that the fe values sum to 93, as they must. Do not be distracted by the fact that the fe values are not whole numbers. Although asking which type of TV service people are using will make the fo values whole, the fe numbers indicating what people are expected to use can take on any value. The calculations for this problem are in Table 12.4.
Table 12.4: Another 1 × k chi-square problem

               Satellite  Cable  Free TV
fo             65         16     12
fe             46.5       31     15.5
fo − fe        18.5       −15    −3.5
(fo − fe)²     342.25     225    12.25
(fo − fe)²/fe  7.36       7.26   .79

χ² = 7.36 + 7.26 + .79 = 15.41

With a calculated χ² = 15.41 and the table value for χ² of 5.99 when testing at p = .05 with 2 degrees of freedom, the results indicate that the null hypothesis should be rejected. The way these 93 people are distributed does not fit a chi-square distribution where cable is twice as popular and satellite is three times as popular as free television.
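The same pattern extends to any number of categories and any claimed ratio. A minimal sketch assuming the 3:2:1 claim from the text (again, the names are illustrative):

    # Goodness-of-fit test for the satellite / cable / free TV claim.
    observed = {"Satellite": 65, "Cable": 16, "Free": 12}
    weights  = {"Satellite": 3,  "Cable": 2,  "Free": 1}

    n = sum(observed.values())                       # 93 viewers
    x = n / sum(weights.values())                    # 93 / 6 = 15.5
    expected = {c: x * w for c, w in weights.items()}

    chi_square = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)
    print(round(chi_square, 2))                      # 15.41, well past the 5.99 critical value (df = 2)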
Review Question C: How is the scale of the data involved in an analysis related to the power of the statistical procedure?
The difficulty with a problem such as this one is that the initial hypothesis is actually two hypotheses: Satellite is three times as popular as free television, and cable is twice as popular as free television. When the results indicate rejecting the null hypothesis, it could be because the expected ratio of satellite versus free TV customers is not supported, because the ratio of cable versus free TV customers is not supported, or because neither hypothesis is supported. Recall that this was the case with a significant F in ANOVA as well. It did not provide clear evidence for which two specific groups are significantly different. A more definitive chi-square test would require that the consultant gather new data and check those hypotheses individually.
12.3 The Chi-Square and Statistical Power

As noted at the beginning of the chapter, distribution-free tests like chi-square procedures provide great flexibility. They provide no restrictions regarding the scale of the data, there are no normality assumptions to contend with, and these procedures work quite well with small samples. In the statistical version of the "there's no such thing as a free lunch" expression, however, chi-square and most nonparametric procedures have a drawback.

Note that even though there were differences between what was observed and what was expected in the first two problems in this chapter, neither chi-square value was statistically significant. The chi-square procedures are not very sensitive to minor variations between what is seen and what is expected. Indeed, the differences must be fairly substantial to produce a significant chi-square value. Recall that power in statistical testing is the ability of a procedure to detect statistical significance. Compared to something like ANOVA, the chi-square procedures are not particularly powerful.

Much of the lack of power comes down to the scale of the data that are involved. Nominal data provide no information about what is measured except the category of the variable to which the individual belongs. For each of the energy drink consumers who prefer Rush over Advantage, all that is revealed is which energy drink is preferred. Nothing in the data indicates how much more one drink is preferred over the other. There is no ranking of preference on a scale of 1 to 5. All that is known is that, presented a choice, the consumer chose Rush.

Recall that all statistical tests are based on probabilities, and that because the outcome is therefore never a certainty, there is the constant possibility of a Type I or Type II decision error. Because the chi-square procedures are insensitive to minor variations between fo and fe, chi-square analyses are more inclined toward Type II (beta) decision errors than traditional parametric tests of significant differences. The risk is that an analysis that suggests that results are not statistically significant might be set aside if new data were gathered and the analysis run a second time.

On the bright side, Type I (alpha) decision errors are relatively uncommon. It is not likely that upon finding a result statistically significant, further testing with new data would suggest otherwise; a decision to reject the null hypothesis is unlikely to be overturned by a second analysis.
12.4 The Goodness-of-Fit Test in Excel

The procedures in the Excel Data Analysis package do not include the chi-square tests. However, because the test statistic involves a good deal of repetitive subtracting, squaring, dividing, and so on, it is not difficult to set up an Excel spreadsheet to accommodate a 1 × k problem. This can be done by organizing a spreadsheet to complete the same calculations used in Tables 12.1, 12.3, and 12.4. To illustrate, an organic vegetable grower claims that in spite of the higher price, shoppers will select organically grown spinach as often as spinach grown with the help of pesticides and chemical fertilizers. The first 30 people who buy spinach on a particular day at a grocery store are examined for whether they bought the organically grown vegetable. Results are as follows:

Organically grown: 10
Conventionally grown: 20

To test the grower's claim,

• Enter the labels organic in cell B1 and conventional in C1.
• Enter the labels fo in cell A2, fe in A3, fo − fe in A4, (fo − fe) sqd in A5, ÷fe in A6, and sum in A7.
• Enter the values 10 and 20 in cells B2 and C2 respectively.

Since the claim is that organic spinach will sell as frequently as conventionally grown spinach, the fe values are simply n ÷ 2 = 15.

• Enter 15 in both B3 and C3.
• In cell B4 enter the formula =B2-B3 and then press Enter.
• With the cursor on B4, hold the shift key down and move the cursor to C4, and then enter the command to fill right, which will be near the far right of the menu ribbon, so that the procedure in B4 is repeated in C4.
• In cell B5 enter the command =B4^2 to square the value in B4.
• Repeat the B5 procedure in C5 using the fill right command as above.
• In cell B6 enter the command =B5/B3.
• Repeat the B6 procedure in C6 using the fill right command.
• In cell B7 enter the command =SUM(B6:C6).

The last command in the above sequence produces the chi-square value for this problem, χ² = 3.33. The critical value from the table for 1 degree of freedom and p = .05 is χ² = 3.84. The results are not significantly different from the organic grower's claim; the organically grown spinach may be just as popular as the traditionally produced spinach, in spite of its higher price. Figure 12.1 is a screenshot of the spreadsheet for this problem.
Figure 12.1: Chi-square goodness-of-fit using Excel
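Outside Excel, the same goodness-of-fit test is available as a single function call in statistical software. As an illustration only (scipy.stats.chisquare is an assumption, not part of the chapter's toolkit), the spinach example could be checked like this:

    # Checking the spinach example with SciPy's goodness-of-fit test.
    from scipy.stats import chisquare

    result = chisquare(f_obs=[10, 20], f_exp=[15, 15])          # equal-preference claim
    print(round(result.statistic, 2), round(result.pvalue, 3))  # 3.33 0.068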
12.5 The Chi-Square Test of Independence

The goodness-of-fit or 1 × k chi-square procedure accommodates just one categorical (grouping) variable. In that regard, it is similar to a one-way ANOVA, which likewise involves just a single variable, although it is an interval or ratio scale rather than a nominal scale variable with the data divided into any number of categories or groups. Although basing an analysis on a single variable keeps the arithmetic simple, it consigns to error any variance that is not explained by that single variable. On the other hand, when multiple independent variables are included, as is the case in factorial ANOVA, there is less residual variance and a smaller error term. In addition, besides each variable contributing to the result, sometimes multiple variables act together, a phenomenon called a "statistical interaction" in factorial ANOVA. A similar thing can happen with chi-square procedures. Sometimes a single variable is an inadequate explanation of an outcome, and in those circumstances a second variable will act in concert with the first.

The factorial ANOVA has an approximate equivalent in one of the chi-square procedures. It is called the chi-square test of independence, or the r × k chi-square. The names for this procedure are just as informative as were those for the one-variable chi-square test. The "test of independence" alludes to the fact that what is tested is whether the two variables included in the analysis operate independently.

Key Terms: The chi-square test of independence, or r × k chi-square, is a test of the independence of two nominal scale variables.
In the language of ANOVA, the variables are analyzed to determine whether they interact. When the chi-square value is statistically significant, it indicates that the two variables do not operate independently, a result that will lead to an ancillary analysis to determine the level of their relationship.

As was the case in factorial ANOVA, both of the interacting variables in the r × k chi-square are categorical. The r × k designation refers to the way the analysis is set up. The data are organized so that the levels of one variable are indicated in the rows (r) of a table, and the other variable is represented in the columns, with each column representing a separate category (k) of that variable. Although the calculations will fit the same table that was used for the goodness-of-fit (1 × k) problems, the frequency expected (fe) values are calculated differently.
Setting up the Chi-Square Test of Independence

Key Terms: The contingency table organizes data into rows for the categories of one variable, and columns for the categories of the other in an r × k chi-square.

The rows and columns explanation just above provides the organization for a contingency table that is used in this chi-square test of independence. The way it is used can be illustrated with a problem. The human resources department for a fast food chain is considering offering an early retirement package to some of the more senior managers in an effort to reduce payroll. The department wishes to be able to predict whether a severance package with a $15,000 bonus offered to early retirees will affect retirement plans. Because of the potential costs associated with offering the bonus to dozens of senior managers, the human resources people need to have some understanding of the impact that the bonus will have on employees' decisions to retire. Among the managers, 30 senior managers are identified and randomly divided into two groups of 15 each.

• The 15 managers in the first group are asked to complete a questionnaire, to be submitted anonymously, which includes a question about whether the respondent anticipates retiring within the next three years. Among this group, two managers indicate that they intend to retire within the specified period.
• Those in the second group of 15 managers are asked whether, if a $15,000 bonus were offered to those who retire in the next three years, they would retire in that period. Of the 15, seven managers indicate that they would retire within the next three years if the bonus were offered.

Note that there are two potentially related variables involved. One is whether managers intend to retire in the coming three years. The other is whether they would retire in that time frame if a bonus were offered. These two variables can be represented in a table that looks much like what was used earlier to set up a two-way ANOVA. In addition to helping one to visualize the problem, when used in the chi-square test of independence the table helps with the task of deriving the fe values. But before worrying about the calculations, note the contingency table below. It is organized with the categories of one variable in the rows of the table and the categories of the other variable in the table columns:
Retirement Yes Retirement No
No Bonus
Bonus
Organizing the Contingency Table

This same table appears in Table 12.5 with the results of the survey filled in. There are also totals for each row, each column, and a value for all subjects together, n. This particular contingency table is a 2 × 2. Although the chi-square test of independence is limited to two variables, those two variables can each have any number of categories. There could have been five levels of the bonus, for example, offered to different groups of potential retirees: no bonus, a $5,000 bonus, a $10,000 bonus, a $15,000 bonus, and a $20,000 bonus. There might have been more than two categories of the retirement decision as well: retire in the next year, retire in two to three years, retire in four to five years, and so on. With the 2 × 2 problem, there are just four cells labeled "a" through "d." Note that the value in cell "a" is the number that represents the combination of the no-bonus group and the number in that group who indicated that they would retire. The number in cell "d" is the combination of the bonus and the number in that group who opted not to retire, and so on.

Table 12.5: The chi-square test of independence for retirement decisions and a severance bonus

A. The Contingency Table

               Will Retire  Won't Retire  Row Totals
No Bonus       a 2          b 13          15
Bonus          c 7          d 8           15
Column Totals  9            21            n = 30

B. Completing the Analysis

Statistic      a      b      c      d
fo             2      13     7      8
fe             4.5    10.5   4.5    10.5
fo − fe        −2.5   2.5    2.5    −2.5
(fo − fe)²     6.25   6.25   6.25   6.25
(fo − fe)²/fe  1.39   .60    1.39   .60

χ² = 1.39 + .60 + 1.39 + .60 = 3.98

The two sample groups each have values that sum to 15, a value reflected in the row totals on the right. The sum of the two rows must total n. Although the 2 columns together must also sum to 30, the individual columns will not necessarily each be 15, since the number opting for and the number opting against retirement from the bonus and no-bonus groups are not equal.
The fo and fe Values in the Chi-Square Test of Independence

The values in each of the four cells are the fo values that will be used to calculate the chi-square value. Note that in Part B of Table 12.5, the fo values are listed just as they were listed in the goodness-of-fit problems done earlier.

The difference between the way the chi-square values are calculated in goodness-of-fit and test of independence problems is in the way the fe values are determined. For each of the "a" through "d" cell values, fe is calculated as the total of the row in which the particular cell sits, times the total of the column of which the cell is part, divided by n. Symbolically, for each cell fe = (row ttl × col ttl)/n. This makes the fe values in the retirement problem as follows:

• Cell a: (15 × 9)/30 = 4.5
• Cell b: (15 × 21)/30 = 10.5
• Cell c: (15 × 9)/30 = 4.5
• Cell d: (15 × 21)/30 = 10.5

Once the fe values are determined, the rest of the calculations for a chi-square value are the same as they were for a goodness-of-fit test and are completed in Part B of Table 12.5. Begin by subtracting fe from fo, square the difference, and so on. The critical value of chi-square comes from the same table used for goodness-of-fit problems. The degrees of freedom for the test are determined by taking the number of rows minus 1, times the number of columns minus 1 (df = (r − 1) × (k − 1)). For this problem, df = (2 − 1) × (2 − 1) = 1.
With the calculated value χ² = 3.98 and the table value χ².05(1) = 3.84, the result is statistically significant.
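Because fe for every cell is the product of its row and column totals divided by n, the whole table of expected values is an outer product. A minimal sketch of the retirement problem (NumPy is an assumed tool here, not one used in the chapter):

    # 2 x 2 chi-square test of independence for the retirement example.
    import numpy as np

    observed = np.array([[2, 13],    # No Bonus: will retire, won't retire
                         [7, 8]])    # Bonus:    will retire, won't retire

    row_totals = observed.sum(axis=1)
    col_totals = observed.sum(axis=0)
    n = observed.sum()

    expected = np.outer(row_totals, col_totals) / n        # [[4.5, 10.5], [4.5, 10.5]]
    chi_square = ((observed - expected) ** 2 / expected).sum()
    df = (observed.shape[0] - 1) * (observed.shape[1] - 1) # 1
    print(round(chi_square, 2), df)                        # 3.97 1 (3.98 in the text after rounding)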
In the context of the r × k procedure, what does a significant outcome mean? Chi-square results are statistically significant when the fe values diverge enough from the fo values that the difference between the two is unlikely to have occurred by chance. The implication is that the factor that creates the fo versus fe difference in r × k problems is the relationship between the two variables. If the retirement decision and the availability of the retirement bonus variables are unrelated, which is to say that they operate independently, there will be no significant result. The fo = fe null hypothesis for an r × k problem has the same meaning as the null hypothesis in a Pearson Correlation problem. It means that there is no relationship between the two variables. The difference is that a Pearson Correlation cannot be calculated between two nominal variables.
The Yates Correction to 2 × 2 Problems

Earlier we said that Type I decision errors are relatively uncommon with chi-square procedures. While that is generally true, the 2 × 2 problem, where each variable has two levels like the example here, may be an exception. With those problems there can be a tendency to incorrectly find statistical significance when one or more of the fe values in the problem fall below 5.0. In what is now called the "Yates correction," Yates suggested curbing this tendency by subtracting .5 from all fo − fe cell differences in any 2 × 2 problem if at least one fe value is less than 5.0. The reduced fo − fe difference makes a significant chi-square value less likely, of course, and so reduces Type I error probability.
There is not a consensus about the use of the Yates correction, and some statisticians argue that the .5 reduction is in fact an overcorrection that makes the procedure unnecessarily conservative. The decision in this book has been to not incorporate the correction. The issue is raised so that the reader will know what it is and why some analysts recommend making the correction. For more information, Howell (1992) provides a helpful discussion of the Yates correction.
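As an illustration of how much difference the correction can make (SciPy is an assumption here; the chapter does not use it), scipy.stats.chi2_contingency applies Yates' adjustment when df = 1 through its correction flag. Applied to the retirement table, the corrected value would no longer clear the 3.84 critical value:

    # Comparing the uncorrected and Yates-corrected chi-square for the 2 x 2 example.
    import numpy as np
    from scipy.stats import chi2_contingency

    observed = np.array([[2, 13], [7, 8]])   # retirement example from Table 12.5

    stat_plain, p_plain, dof, expected = chi2_contingency(observed, correction=False)
    stat_yates, p_yates, _, _ = chi2_contingency(observed, correction=True)

    print(round(stat_plain, 2))  # 3.97 -- past the 3.84 critical value at p = .05
    print(round(stat_yates, 2))  # 2.54 -- no longer significant with the correction applied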
Interpreting the Chi-Square Test of Independence

The issue in both the chi-square goodness-of-fit (1 × k) and test of independence (r × k) is whether what is observed is consistent with an expected outcome. In the case of the test of independence, a significant result (rejecting the null hypothesis) indicates that the two variables are not functioning independently; they are correlated. In our example, rejecting Ho indicates that the intention to retire and the availability of the cash bonus are related. In a reference back to the difference between correlation and causation in Chapter 8, it is not clear that the bonus causes managers to make a retirement decision, but for whatever reason, there are significantly more intents to retire when the bonus is part of the equation.

At this point the focus turns to the nature of that relationship. Since it is clear from comparing the calculated chi-square value to the table value that there is a correlation, the question now is of the strength of the relationship between the two variables.
Phi Coefficient and Cramer's V

To this point the analysis was similar to ANOVA or t-tests and fell under the general umbrella of the hypothesis of difference. Having determined a significant difference between fo and fe, the focus now shifts to a hypothesis of association issue.

The correlation procedure for interval/ratio variables that meet normality requirements was Pearson's r. For a correlation of ordinal scale data, or for interval/ratio data that fail to satisfy normality requirements, Spearman's rho was the answer. The need here is for a correlation procedure based on nominal data, and there are several from which to choose. Pearson, who developed chi-square, also developed a correlation procedure for nominal variables called the coefficient of contingency, C. Because it produces quite a conservative correlation value, it is not as widely used as some of the alternatives. The upper bound for most correlation procedures is 1.0. The coefficient of contingency cannot reach that value.

Two of the other correlation procedures for nominal variables are the phi coefficient, φ (φ is the Greek equivalent of f), and Cramer's V, which are both explained here. The contingency coefficient, phi coefficient, and Cramer's V are all based directly on the chi-square value, which makes them easy to calculate once the chi-square value has been determined.

Key Terms: Phi coefficient and Cramer's V are both correlation procedures for nominal data used after a significant r × k chi-square result.
In a statistically significant 2 × 2, 2 × 3, or 3 × 2 chi-square problem, the phi coefficient will be the appropriate follow-up correlation procedure. The formula is the following:

Formula 12.2    φ = √(χ²/n)

Where
χ² = the value calculated in the r × k procedure
n = the total number of subjects

For the decision-to-retire and the cash bonus problem, χ² = 3.98, and n = 30. Therefore, to solve for φ:

φ = √(χ²/n) = √(3.98/30) = √.13
φ = .36

The retirement/cash bonus relationship is φ = .36. Although φ cannot have a negative value (the way χ² is calculated and the square root function in the φ formula do away with that possibility), it is interpreted like any other correlation statistic. A 0 correlation indicates no relationship; a 1.0 correlation indicates a perfect correlation. The correlation here of φ = .36 is "modest," or perhaps "low," by correlation standards.

When either of the two variables in the analysis has two levels, V = φ. This is because the formula for Cramer's V is

Formula 12.3    V = √(φ²/(fewer of rows or columns − 1))

Where
φ² = the square of the phi coefficient
rows or columns = the number of levels of the two variables

If there are just 2 levels of either variable, the divisor is 1 and √.36² = .36, so V = φ. If the fewest number of rows or columns is three, this changes, of course, and the correct correlation value to calculate is Cramer's V.

Furthermore, if the chi-square value is statistically significant, the correlation coefficient will be significant at the same level; there are no separate significance tests necessary for C, V, or φ. If the chi-square value is not statistically significant (if the decision in the initial chi-square analysis is to fail to reject), there is no point in calculating a correlation value since failing to reject Ho: fo = fe indicates that the variables are independent.
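Both follow-up statistics are simple functions of the chi-square value, so they are easy to wrap in small helpers. A minimal sketch (the function names are mine, not the book's):

    # Helper functions for Formulas 12.2 and 12.3.
    from math import sqrt

    def phi_coefficient(chi_square, n):
        # Formula 12.2: phi = sqrt(chi-square / n)
        return sqrt(chi_square / n)

    def cramers_v(chi_square, n, n_rows, n_cols):
        # Formula 12.3: V = sqrt(phi squared / (fewer of rows or columns - 1))
        phi = phi_coefficient(chi_square, n)
        return sqrt(phi ** 2 / (min(n_rows, n_cols) - 1))

    # Retirement example: chi-square = 3.98, n = 30, a 2 x 2 table, so V equals phi.
    print(round(phi_coefficient(3.98, 30), 2))   # 0.36
    print(round(cramers_v(3.98, 30, 2, 2), 2))   # 0.36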
A 3 × 3 Test of Independence Problem

A property management company in a large city manages the landscaping and rent collection at several apartment complexes. Some of the complexes are quite large, with more than 50 units; some are very small, with fewer than 10 units; and the others are classified as medium-sized. Collecting and crediting rent payments each month is very time-consuming. A bookkeeper at the management company guesses that in the smaller complexes there is a more intimate relationship between manager and tenant than in the larger complexes, and rent difficulties are correspondingly lower as a result. To test that assumption, the bookkeeper examines data from 100 apartments located in each of small, medium, and large complexes and determines the number of rent payments that are on time, that are within one week late, and that are more than one week late. The data are below:

Rent Submission

               On-time  Within 1 week  >1 week late  Row totals
Small          a 65     b 30           c 5           100
Medium         d 55     e 35           f 10          100
Large          g 45     h 25           i 30          100
Column totals  165      90             45            300
The first question is whether the two variables of apartment complex size and rent lateness are independent. If the chi-square value is statistically significant, the decision will be that they are not independent, and that will prompt a second question about the strength of their relationship. The calculations for this chi-square test of independence are in Table 12.6.

Table 12.6: A chi-square test of independence for the size of the apartment complex and the lateness of the rent

               a     b    c     d    e    f     g     h    i
fo             65    30   5     55   35   10    45    25   30
fe             55    30   15    55   30   15    55    30   15
fo − fe        10    0    −10   0    5    −5    −10   −5   15
(fo − fe)²     100   0    100   0    25   25    100   25   225
(fo − fe)²/fe  1.82  0    6.67  0    .83  1.67  1.82  .83  15

Σ = χ² = 28.64

• The calculated χ² = 28.64. Since this is a 3 × 3 problem, df = (3 − 1) × (3 − 1) = 4.
• The critical value of chi-square is χ².05(4) = 9.49.
The result is statistically significant. How promptly renters pay their monthly rent is related to the size of the complex in which they live.

Determining the strength of the correlation calls for Cramer's V, since both variables have more than two levels.

V = √(φ²/(fewer of rows or columns − 1))

But since V requires the calculation first of the phi coefficient, the answer begins there.

φ = √(χ²/n)
  = √(28.64/300)
  = .31

With a value for phi, V can be calculated.

V = √(φ²/(fewer of rows or columns − 1))
  = √(.31²/2)
  = √.05
  = .22

The relationship between the size of the rental complex and how promptly rent is paid is V = .22. The correlation is not particularly robust, but it is statistically significant, since the value of chi-square upon which it is based is significant. Based on the analysis, those at the property management company are in a position to alter procedures in some way that responds to the relationship between rent payment and complex size. Perhaps it will make a difference if rent collections can be made online. Maybe an effort to improve the social relationship between the apartment manager and the tenants, particularly in large apartment complexes, will prompt rent payments to be made in a more timely fashion.
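For completeness, here is a sketch of the whole 3 × 3 analysis in code. SciPy and NumPy are illustrative choices; the chapter works the problem by hand:

    # 3 x 3 rent-lateness problem: test of independence followed by Cramer's V.
    import numpy as np
    from scipy.stats import chi2_contingency

    observed = np.array([[65, 30,  5],    # Small:  on-time, within 1 week, >1 week late
                         [55, 35, 10],    # Medium
                         [45, 25, 30]])   # Large

    chi_square, p_value, df, expected = chi2_contingency(observed)
    n = observed.sum()
    phi = np.sqrt(chi_square / n)
    cramers_v = np.sqrt(phi ** 2 / (min(observed.shape) - 1))

    print(round(chi_square, 2), df)      # 28.64 4 -- well past the 9.49 critical value
    print(round(cramers_v, 2))           # 0.22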
Review Question D: What does phi coefficient measure?
Chapter Summary

The chi-square tests assume that Disraeli was unnecessarily skeptical. In fact, an informed expectation of an outcome should provide a fairly good indicator of what will actually occur, which is the understanding upon which both chi-square tests are based. The chi-square tests answer many of the same questions that earlier tests in this book answered. The difference is that the chi-square tests are based on nominal data. Pearson developed tests for data which indicate nothing more than the count of the number that occur in a particular category. Consequently, the analysis is based on differences between the frequency observed (fo) and the frequency expected (fe) (Objective 1).

The goodness-of-fit, or 1 × k, procedure analyzes whether the proportions occurring in the multiple categories of a single variable are consistent with what is expected based on an initial hypothesis. The chi-square test of independence, also called the r × k procedure, straddles the boundary between tests of the hypothesis of difference and those related to the hypothesis of association. The initial analysis establishes whether two variables function independently. Like the goodness-of-fit test, this part of the analysis is based on the magnitude of the fo − fe difference. A significant value of chi-square means rejecting the probability that the variables are independent (Objective 2). At that point the question is about the strength of the relationship between the two variables. That correlation can be gauged by one of several correlation procedures designed for nominal data. Those covered in this chapter include the phi coefficient and Cramer's V (Objective 3).

It should be noted here that this book represents only a brief introduction to the analysis procedures that can be useful for managers. The list of statistical procedures covered in 12 chapters is far from exhaustive, but it is a valuable beginning. The different tests explained in Chapters 1–12 are representative of those that are appropriate to many kinds of business analysis. Figure 12.2 is a flowchart-like guide to which test will answer the manager's question. It is provided here as a summary and overview of the preceding chapters. As the decision tree is followed from the top down, note the issues are:

• Is the question about differences or associations?
• Are the data involved nominal (categorical), ordinal, or interval/ratio?
• How many groups are involved?
• Are the groups independent?
Figure 12.2: Finding the appropriate test
Answering each question above in turn guides one to the test tailored to the particular problem. If the question is about differences between groups (1), the measures involved are interval or ratio scale (2), there are 4 groups involved (3) that are independent (4), one of the ANOVA tests will answer the question.

No statistics book can provide comprehensive coverage of every statistical test, and early in the development of this book a decision was made to present tests neither for significant differences nor association for ordinal data. They are presented in Figure 12.2 to round it out, but they are not found elsewhere in the book. Should the reader wish to pursue the Mann-Whitney, Kruskal-Wallis, Wilcoxon, or Friedman's ANOVA tests, Tanner (2011) is a useful source.
[Figure 12.2 is a decision tree. Reading from the top down, it branches on whether the question is about differences or associations, on the data scale (nominal, ordinal, or interval/ratio), on the number of groups (1, 2, or 2+), and on whether the groups are independent or related. The branches end at the tests named in the figure: chi-square goodness of fit, chi-square test of independence, Mann-Whitney U, Wilcoxon T, Kruskal-Wallis H, Friedman's ANOVA, independent t, before/after t, analysis of variance, phi coefficient, Spearman's rho, Pearson Correlation, point-biserial correlation, multiple correlation, and semi-partial correlation.]
When the procedures encountered in this book are used by managers with an understanding of the procedures' purpose and an appreciation for their requirements, they offer the promise that the related decisions will be more reasonable and better informed, and can equip managers with the tools they need to be most effective. Finally, if you wish to broaden your horizons in the future, virtually all of the more advanced procedures that are beyond the scope of this book are based on the concepts represented here. Your authors wish you the best of luck. The first author can be reached with comments and questions at [email protected]
Answers to Review Questions
A. The chi-square procedures require data of only nominal
scale.
B. The null hypothesis is that the frequency observed equals the
frequency expected,
fo = fe, meaning that there is not enough evidence to reject the
possibility that
what emerged in the analysis was consistent with the prediction.
C. Nominal data provide very little measurement information,
and the chi-square
procedures are a case in point. The analyses are based on
nothing more than the
frequency with which data occur in the various categories.
These data yield noth-
ing about how much of a measured quality is present, for
example. Because none
of the data nuances present with ordinal and interval/ratio data
are gauged,
differences must be substantial to prompt rejecting the null
hypothesis. These are
not particularly powerful procedures.
D. Phi coefficient measures the strength of the relationship
between two nominal
variables when at least one of them has only two categories.
Chapter Formulas
Formula 12.1: χ² = Σ (fo − fe)² / fe
This is the formula for the chi-square test statistic. The same formula is used both for the "goodness of fit" test and for the r × k chi-square test of independence.
Formula 12.2: φ = √(χ²/n)
The phi coefficient is the measure of the correlation for two nominal variables when the chi-square test of independence indicates a significant result, and when one of the variables involved has two or fewer categories.
Formula 12.3: V = √(φ² / (smaller of rows or columns − 1))
When the test of independence is significant and both variables have at least three categories, Cramer's V is calculated rather than the phi coefficient. V requires phi, however, which must be calculated first.
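For readers who want to check hand calculations, the three formulas above translate almost directly into code. The short Python sketch below is only an illustration: the category counts are hypothetical, and the p-value lookup assumes the SciPy library is available.

import math
from scipy import stats

def chi_square(observed, expected):
    # Formula 12.1: chi-square is the sum of (fo - fe)^2 / fe across the categories
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

def phi_coefficient(chi2, n):
    # Formula 12.2: phi = the square root of chi-square divided by n
    return math.sqrt(chi2 / n)

def cramers_v(chi2, n, rows, cols):
    # Formula 12.3: V = the square root of phi^2 / (smaller of rows or columns - 1)
    return math.sqrt(phi_coefficient(chi2, n) ** 2 / (min(rows, cols) - 1))

# Hypothetical 1 x k (goodness-of-fit) problem: 75 observations, equal fe values
fo = [30, 20, 25]
fe = [25, 25, 25]
x2 = chi_square(fo, fe)                      # 2.0
p_value = stats.chi2.sf(x2, df=len(fo) - 1)  # compare this to .05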
Management Application Exercises
Unless otherwise stated, use p = .05 in all your answers.
1. Three new movies, each with the potential to be a
blockbuster, are released on the
same day. Reporters from the local television station are
interested to see whether
one appears to have caught the public attention more than the
others. The reporter
goes to the local multiplex and asks those waiting to buy tickets
which movie they
intend to see. On the basis of results from 52 people, are there
significant differ-
ences in movie preferences? The data are as follows:
Fantasy Haven: 22
Night of Terror: 18
Fists of Glory: 12
2. Data from behavioral psychology indicate that administering
a tangible reward to
subjects will prompt response levels twice as frequent as from
subjects who receive
a nontangible reward. To test this notion in a business context,
two sales seminars
are compared. In one seminar, sales representatives are tossed a
piece of candy
every time they ask a relevant question or provide an insightful
comment. In the
other seminar, only verbal reinforcement is provided. At the end
of the seminars,
data are as follows:
verbal reinforcement seminar—17 questions/comments
tangible reward seminar—27 questions/comments
a. What is the fe value for each group?
b. Are the results consistent with the expectation?
3. A Department of Labor study of education and employment
found that unem-
ployed full-time students take twice as many units as students
who are full-time
employees and 1.5 times more units than students who are part-
time employees.
a. If the fe for the unemployed student is 16 units, what are the
fe values for
students who work part time and full time?
b. If the student who is unemployed takes 16 units, the student
who is em-
ployed part time takes 14 units, and the full-time employee
takes 12 units,
is the expectation supported?
4. In a management trainee program for a multinational
corporation, trainees are
expected to learn a foreign language. Besides classes at a
language training insti-
tute, tutors are available. Experience suggests that those
learning Japanese seek
the help of tutors twice as frequently as those who are learning
Spanish. Among 20
students of Japanese, 16 ask for the help of tutors. Among 30
students of Spanish, 8
ask for tutors’ help.
a. What are the fe values?
b. Are results consistent with prior experience?
c. In this instance, what does HA specify?
5. A marketing analyst is examining the relationship between
shoppers’ ethnicity
and the purchase of a certain grocery item. From ethnic group A,
2 of 12 people
purchased the item. From ethnic group B, 5 of 10 people
purchased the item. From
ethnic group C, 4 of 14 people purchased the item.
a. Are the shoppers’ ethnicities and the tendency to purchase
this item inde-
pendent?
b. If not, what is the correlation?
6. During the summer months, when electricity usage is high,
the power company
appeals to customers to reduce consumption by 10% as a public
service to avoid
blackouts. An alternative is to offer rebates to customers who
reduce usage by
10% compared to the same month the previous year. Among 50
randomly selected
customers just asked to reduce electricity use, 14 reduce their
use by 10% or more.
Among 50 randomly selected customers offered rebates, 25
reduce their electricity
use by 10%. Are the differences between the public service
appeal and the rebates
statistically significant?
7. A number of nonprofit groups use fireworks sales as the
major fundraiser in the
days before the 4th of July. Some of the nonprofits are service
groups such as the
Veterans of Foreign Wars. Others are intended for support of
groups like the cheer-
leaders from the local high school. The questions are whether
those two groups
attract different numbers of customers, and whether the gender
of the customer is
a factor. Among 20 men who bought fireworks during a
particular 2-hour period,
14 purchased from service organizations, the other 6 purchased
from non-service
groups. Of the 18 women who purchased in the same period, 8
bought from ser-
vice organizations, and 10 from non-service groups. Is the
gender of the purchaser
related to the group from which the purchase is made?
8. A corporate CEO is interested in whether 20 management
trainees’ possession of
a graduate degree (yes/no) is related to their promotion within
the first five years
(yes/no). The χ² value is 8.450.
a. What does the x2 value indicate about possession of a
graduate degree and
promotion?
b. What is the value of φ?
Key Terms
• The goodness-of-fit, or 1 × k, is a chi-square test for
significant differences between
the frequency observed and the frequency expected in the
categories of one nominal-
scale variable.
• The chi-square test of independence, or the r × k chi-square, is
a test of whether
two nominal scale variables operate independently. The test
statistic is the same as
for the goodness-of-fit chi-square. A statistically significant
result indicates that the
two are not independent and a correlation procedure follows.
• The data in a chi-square test of independence are often
arranged in a contingency
table with the rows indicating the categories of one variable and
the columns the
categories of the other. With the count of the data entered into
the resulting cells, the
table provides a visual indicator of the way the variables
operate together.
• When a chi-square test of independence is statistically
significant, it is followed by
a measure of the strength of the correlation between the
variables. This is usually a
phi coefficient if there are only two categories of one of the
variables, or Cramer’s
V when there are more than two categories.
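As a rough software counterpart to the contingency-table procedure described above, the sketch below runs SciPy's chi2_contingency on a made-up 2 × 3 table and then applies Formulas 12.2 and 12.3; the counts are hypothetical and only illustrate the mechanics.

import math
import numpy as np
from scipy import stats

# Hypothetical contingency table: rows are the two categories of one variable,
# columns are the three categories of the other
table = np.array([[14, 8, 10],
                  [6, 12, 4]])

# chi2_contingency derives the fe values from the row and column totals
chi2, p, dof, expected = stats.chi2_contingency(table, correction=False)

n = table.sum()
phi = math.sqrt(chi2 / n)                         # Formula 12.2
v = math.sqrt(phi ** 2 / (min(table.shape) - 1))  # Formula 12.3, Cramer's V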
BU
Due: 09/01/2015
CE 371
Numerical Methods in Civil Engineering
PROJECT
Fall’14
Page 1 of 1
two infinite cylindrical surfaces of radii r1 and r2, as shown in the figure below.
Initially (at
ywhere (inner circle is insulated);
The inner
The dimensionless heat conduction equation in radial coordinates is
∂²T/∂r² + (1/r) ∂T/∂r = ∂T/∂t
To reduce the situation to a characteristic value problem, we must render the boundary conditions homogeneous, that is, of the form 0 at the boundaries.
The differential equation becomes
∂²T/∂r² + (1/r) ∂T/∂r = ∂T/∂t
subject to:
ii) the boundary conditions: 0
r
a) Simulate the equation using the explicit finite difference scheme with Δt = 0.25, 0.5, 1.0 (a rough sketch of an explicit scheme appears after the note below).
b) Simulate the equation using the implicit finite difference scheme with Δt = 0.25, 0.5, 1.0.
c) Plot your findings and compare the results for a and b.
r1 = 1 m
r2 = 10 m
NOTE: Your project reports will be evaluated for both
correctness and accuracy of
your results and for clarity and neatness of your report. Be sure
to organize your
plots neatly.
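Because parts of the problem statement above did not survive extraction, the following Python sketch is only a rough illustration of what an explicit (forward-time, centered-space) finite difference simulation of the radial heat equation might look like. The grid spacing, the initial condition (T = 1 everywhere), and the boundary conditions (insulated at r1, T held at 0 at r2) are assumptions made for the sketch, not the assignment's actual conditions.

import numpy as np

# Explicit (FTCS) sketch for dT/dt = d2T/dr2 + (1/r) dT/dr on r1 <= r <= r2.
# Assumed conditions: T = 1 initially, dT/dr = 0 at r1 (insulated), T = 0 at r2.
r1, r2 = 1.0, 10.0
nr = 91
r = np.linspace(r1, r2, nr)
dr = r[1] - r[0]

def explicit_step(T, dt):
    # Advance one forward-Euler time step of the finite difference scheme
    Tn = T.copy()
    i = np.arange(1, nr - 1)
    d2T = (T[i + 1] - 2 * T[i] + T[i - 1]) / dr ** 2   # second derivative in r
    d1T = (T[i + 1] - T[i - 1]) / (2 * dr * r[i])      # (1/r) times first derivative
    Tn[i] = T[i] + dt * (d2T + d1T)
    Tn[0] = Tn[1]     # insulated inner boundary
    Tn[-1] = 0.0      # fixed outer boundary
    return Tn

T = np.ones(nr)
dt = 0.4 * dr ** 2    # the explicit scheme is only stable for dt on the order of dr^2 / 2
for _ in range(5000):
    T = explicit_step(T, dt)

With a grid this fine, the explicit scheme would generally be unstable at the Δt values listed in parts (a) and (b); the implicit scheme has no such restriction, which is presumably the point of the comparison asked for in part (c).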
11
Confidence Intervals
Learning Objectives
After reading this chapter, you should be able to:
• Distinguish between point and interval-estimates of values.
• Calculate confidence intervals for t-tests and regression
solutions.
• Explain the factors that affect the width of a confidence
interval.
iStockphoto/Thinkstock
Chapter Outline
11.1 A Confidence Interval for a One-Sample t-Test
Confidence Intervals and the Significance of the Test
The Width of the Interval
11.2 A Confidence Interval for an Independent Samples t-Test
An Independent Samples t-Test Example
Calculating the Confidence Interval of the Difference
11.3 The Confidence Interval of the Prediction
The Standard Error of the Estimate
Calculating the Confidence Interval of the Prediction
Regarding the Width of the Interval
The Excel Confidence Intervals for a Regression
Solution
Chapter Summary
Introduction
Early in the book there was a distinction drawn between the
statistics that describe sample data and the parameter values
that describe the characteristics of populations.
The relationship between statistics and parameters is important.
The value in calculat-
ing many statistics is connected to how well they represent one
of the possible values
of a related parameter. In fact a good deal of analysis and
decision making is based on
the understanding that when samples are large and randomly
selected, the statistics that
describe the sample will also provide useful indicators of what
the parameter values will
likely be. In the progression from one type of analysis to the
next in this book, there have
been a number of instances in which a statistic was calculated
and used in a procedure
because the parameter value was not available. In each
situation, the statistic that was cal-
culated from the sample data was employed as an estimate of
what the related parameter
value would likely be for the entire population. For example,
• The sample mean, M, is one of the possible values of the
population mean, μ.
• The standard deviation, s, can serve as an estimate of σ.
• The sample standard error of the mean (SEM) was used to complete the one-sample t-test in the place of the population standard error of the mean, σM, which the z-test requires.
• The standard error of the difference, SEd, in the independent samples t-test substituted for the standard deviation of the differences σM1−M2 between the
means of all possible pairs of samples in a distribution of
difference scores.
• Ordinary least squares regression produces a predicted value
of the criterion
variable (y’) when the actual value of that variable, y, is
unknown.
In each case, the statistic is a particular, discrete value that
takes the place of the related
parameter. The value of the statistic is based on the
understanding that it is the best estimate
available from whatever data are at hand for the value of the
more-difficult-to-determine
parameter. Usually a relatively large sample that is created by
randomly selecting the indi-
viduals who make it up will reflect the essential characteristics
of the population from which
the sample was drawn. But “usually” means that sometimes the
sample does not accurately
represent the population, and the difficulty is that at the time it
may not be clear whether the
sample is representative.
Many of the statistical procedures used to this point involve
calculating a value that indi-
cates the degree of difference or association between groups and
then determining whether
that value is statistically significant. The calculated values
represent what are called
point estimates of outcomes; they are discrete numbers that are
used to determine when
a difference or a relationship reflected in sample data is likely
to have occurred by chance.
The difficulty is that even when a value is based on the most
careful data collection proce-
dures, there is always a risk of sampling error. The
point estimate by itself provides no indicator of its
accuracy. It is difficult to know how much confidence
to have in results that are based on these values.
One way to address the limitations of point esti-
mates is to move away from the focus on discrete
values and rely instead on what are called interval estimates of
the relevant values. For
example, rather than asking whether the sample mean provides
an accurate estimate of
the mean of the population from which the sample was drawn,
the question becomes,
“What range of values is likely to capture the true value of the
population mean from
which a significantly different sample was drawn?” In other
words, instead of relying
on point estimates to estimate population parameters, a
confidence interval provides,
with a specified level of probability, a range of values
within which the estimated population parameter is
likely to fall. Relying on a range of values to capture
the value of interest rather than trying to pinpoint it
with a discrete value is why this approach to statis-
tical analysis makes reference to “confidence inter-
vals.” The emphasis is on the use of “interval esti-
mates” rather than on “point estimates.”
It isn’t uncommon to see both point and interval esti-
mates used in the same analysis. If a statistical test
produces a statistically significant result, the analysis that
provided a point estimate of
the value of the population parameter is often followed by a
confidence interval for that
population parameter. That way one can know (a) what the best
estimate of the value is,
and (b) how much variation should be allowed around that value
in order to have a rea-
sonable chance of capturing it. Different statistical procedures
involve different kinds of
confidence intervals.
11.1 A Confidence Interval for a One-Sample t-Test
Suppose that the market development team associated with an
electric power plant located
in Fresno County, California, wishes to estimate average
monthly electricity usage. Cal-
culating the mean (M) use of electricity among a randomly
selected sample of Fresno
Key Terms: Point estimates
are discrete values that are cal-
culated for unknown parameter
values.
Key Terms: Interval esti-
mates are calculated ranges
within which an unknown
parameter value is likely to
occur. Because the interval is
established based on a specified
level of confidence, it is also
called a confidence interval.
residents will probably provide a reasonable estimate of mean
use among all members of
the county (μ), so long as the sample is of an adequate size and
is selected in a way that
minimizes sampling error. However, what if the goal is to
compare domestic use in Fresno
County to domestic use in the country as a whole?
As an example, suppose that the mean electricity bill for
domestic users in the country
as a whole is $230.835 per month. For Fresno County, the
monthly costs for 31 ran-
domly selected residences average $245, with a standard
deviation of 37.555. Since
SEM = s/√n, SEM = 37.555/√31 = 6.745. Recall from Chapter 4 that t can be calculated as follows:
t = (M − μM)/SEM
Remembering that μM and μ have the same value,
t = (245 − 230.835)/6.745
t = 2.100
If the probability of a type I error is set at p = .05, then the critical value of t for df = 30 (n − 1) is 2.042 (Table 4.1). Average electricity use by the
people in Fresno County is sig-
nificantly different than it is for the country as a whole. For a
one-sample t-test, a statisti-
cally significant result indicates that the mean of the population
from which the sample
was drawn (Fresno) is different from the population mean to
which it was compared
(United States); to say it differently, it indicates that there are
two different populations
involved—one is the population from which the sample was
drawn (Fresno), and the
other is the population to which the sample was compared
(United States). But is
the sample mean (M) that prompts a significant t value really an
accurate estimate of
the population from which the sample was drawn? In the
electricity use problem, does
M provide a reasonably good estimate of electricity use in the
population of all electric-
ity users in Fresno County? How heavily can one afford to rely
on the “M as a good rep-
resentation of m” assumption? Rather than calculat-
ing the sample mean (M) and relying on sample size
and random selection to make the case that M 5 m,
the confidence interval of the population mean
provides a range of values, thus the reference to an
interval, within which mM, the mean of the popu-
lation to which the sample does belong, will occur
with a specified level of probability. The confidence
interval for a one-sample t is calculated with this
formula:
Key Terms: The confidence
interval of the popula-
tion mean is a range within
which the value of an unknown
population mean has a specified
probability of occurring.
Formula 11.1: C.I..95 = ±t(SEM) + M
Where
C.I..95 = a .95 confidence interval, or a range of values within which the probability is p = .95 that the true value of the population mean, μM, will be included.
t = the critical value of t for the degrees of freedom associated with the t-test. The ± symbol indicates that the critical value of t from the table is included twice, once as a positive value and a second time as a negative value.
SEM = the calculated value of the standard error of the mean.
M = the sample mean.
As is the case with any type of statistical testing, when
calculating confidence intervals the
analyst deals in probabilities rather than certainties. The range
of values that is a .95 con-
fidence interval for a one-sample t-test will capture the mean of
the population, which is
represented by the sample 19 times out of 20. The other side of
that coin for a .95 confi-
dence interval is that based on averages, 1 time in 20 the
parameter value that is sought
will occur outside the interval.
The level of probability for the confidence interval is
determined by whatever the prob-
ability level was for the original test of statistical significance.
Since most statistical testing
is conducted at p = .05, that is also the most common standard for the confidence interval.
The confidence interval calculated after a one-sample t that is statistically significant at
p = .05 must produce a range of values that will miss the true value of the population mean no more than 5 times in 100.
However, confidence intervals are stated in terms of the probability of capturing the particular value rather than the probability of missing that value. So instead of a p = .05 confidence
interval, it is a p = .95 confidence interval. If the t-test had been conducted at p = .01, a .99 confidence interval would be
the appropriate interval estimate to calculate, and so on.
To calculate the confidence interval for the one-sample t
procedure completed above, the
process is as follows:
C.I..95 = ±t(SEM) + M
C.I..95 = ±2.042(6.745) + 245
C.I..95 = ±13.773 + 245
C.I..95 = 258.773, 231.227
Review Question A:
What is the probabil-
ity that the true value
will be outside a .95
confidence interval?
The result is interpreted this way: With .95 confidence, the
mean monthly cost of electric-
ity in the population to which the Fresno County people belong
is somewhere between
$231.23 and $258.77.
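A quick way to reproduce Formula 11.1 in software is sketched below, assuming the SciPy library is available; the numbers are the Fresno figures from the example.

import math
from scipy import stats

M, s, n, mu = 245.0, 37.555, 31, 230.835   # sample mean, s, n, and the comparison mean
SEM = s / math.sqrt(n)                     # standard error of the mean, about 6.745
t_crit = stats.t.ppf(0.975, df=n - 1)      # two-tailed critical t at p = .05, about 2.042

# Formula 11.1: C.I..95 = plus or minus t(SEM) + M
lower, upper = M - t_crit * SEM, M + t_crit * SEM   # about 231.23 and 258.77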
The initial question that prompted the t-test could be framed
this way: Is monthly electric-
ity usage in Fresno County consistent with electricity usage for
the same month nation-
wide? Because the t value is statistically significant, the answer
is “no.” The statistically
significant t indicates that the sample probably belongs to a
population other than the one
to which it was compared. Maybe it is summer, and with
temperatures higher in Cali-
fornia than elsewhere, California consumers have higher
electricity usage than what is
characteristic of the country as a whole. For whatever reason,
the t-test indicates that the
sample belongs to a different population than the population
that is the country.
The sample mean used in the t-test is a point estimate of the
mean of a population. If the
t-test result is not statistically significant, the conclusion is that
the sample mean is one of
the many sample means making up the population of sample means.
However, the fact that the
t-test result is statistically significant indicates that the sample
mean is a point estimate of
the value of some population mean different from what was
indicated for the country as a
whole. What the point estimate does not indicate, however, is
how well the sample mean
estimates the value of the mean of the population to which the
sample belongs. The confi-
dence interval responds to this absence by providing a way to
calculate a range of values
within which the population mean will probably occur. The
“probably” is the reminder
that there is never a certainty of including the value. In this
case, there is a 95% probability
that the mean of the population represented by the sample is
somewhere between $231.23
and $258.77.
Note that the range of values in the confidence interval does not
include the original pop-
ulation mean ($230.835) from the initial t-test. When a sample
mean in a one-sample t-test
is significantly higher than the population mean, the sample
mean (M) is probably not
one of the possible values of the population mean, a population
with μM = 230.835 in this case. Because the sample mean was significantly higher than
μM, the mean of the population to which the sample probably does belong will have a value
higher than 230.835 in the t-test. So the values in the range that is the confidence
interval are all beyond the original value of μM. If a significant t-test involved a sample mean
less than μM, the confidence interval would include a range with values all lower than μM.
Confidence Intervals and the Significance of the Test
Confidence intervals are only calculated for significant results.
If the value of t is not sta-
tistically significant, the confidence interval will include a
range of values that includes
the original value of μM. This happens because a non-significant t-test result is interpreted
to mean that the value of μM from the test is one of the values
that could be the mean for
the population to which the sample was compared. An example
will clarify this.
Suppose sales of frozen yogurt for a franchise average $1,125
per day. For a particularly
cold week, the daily sales were as follows:
$974, $1,256, $1,170, $842, $875, $1,056, $1,145
Are sales during that week significantly different from average
daily sales of $1,125? Ver-
ify that:
M = 1045.429
s = 155.679
SEM = s/√n = 155.679/√7 = 58.841
t = (M − μM)/SEM = (1045.429 − 1125.0)/58.841 = −1.352
With t.05(6) = 2.447, the result is not significant.
The sample mean from that non-significant result, with the
critical value of t and the esti-
mated standard error of the mean, produce this .95 confidence
interval:
C.I..95 = ±t(SEM) + M = ±2.447(58.841) + 1045.429 = 1189.413, 901.445
The t-test was not significant, and the resulting confidence
interval therefore includes the
original population mean ($1,125) as one of the values in the
interval. Logically, if the point
estimate (M) is not significantly different from the population
mean (μM), then maybe the
mean of the population represented by the sample does have the
same value as the speci-
fied population mean. At least it is a possibility that cannot be
rejected.
The Width of the Interval
The narrower the confidence interval, the more precise the estimate
the interval provides for
the value of the unknown population mean. Sometimes
confidence intervals are so wide
that they seem not to be very helpful as indicators of the
unknown value, so it makes sense
to analyze the factors that affect the width of the interval. There
are several factors, includ-
ing the level of probability at which the interval is calculated.
For the electricity problem,
the .95 confidence interval stretched from $231.227 to
$258.773, around the $245 sample
mean. If the market development team wanted a greater level of
certainty about capturing
the value of the population mean to which the sample belongs, a
C.I..99 could be used, but
note what happens as a result. First, verify that the critical t
value for p = .01 and df = 30 is
2.750. Now the calculated t value of 2.100 is not statistically
significant. Second, to calculate
the .99 confidence interval:
C.I..99 = ±t(SEM) + M
C.I..99 = ±2.750(6.745) + 245
C.I..99 = ±18.549 + 245
C.I..99 = 263.549, 226.451
The price for a greater certainty of capturing the true value of
the population mean
(p = .99 rather than p = .95) represented by the sample is a
wider confidence interval and
less precision. Now the market development department is 99%
certain that the confi-
dence interval includes the population mean, but the interval
estimate is much broader
and probably less helpful. Note that the interval now includes
the population mean of
$230.835. Precision and certainty push the confidence interval
in opposite directions. One
is improved only at the expense of reducing the other.
Just as the confidence interval widened when a .99 confidence interval was used, it could
have been narrowed by calculating a .90 C.I. instead (although the particular table
in this book does not provide the critical values for significance testing at p = .1). Note
that with a C.I..90 the probability of missing the true value of
the population mean to which
the sample belongs is 1 time in 10. As the level of probability is
relaxed, the critical value
of t diminishes accordingly, which shrinks the C.I.
The other element in the width of the confidence interval is the
standard error of the mean,
SEM. Since SEM = s/√n, the value of the standard error of the
mean can be reduced by
either increasing the sample size (larger n, larger divisor,
smaller resulting value), finding
a way to reduce the variability in the scores so that the standard
deviation is smaller, or
both. As it turns out, increasing the sample size usually will
serve both purposes. Besides
the larger n value, it also usually decreases the s value. Small
samples tend to be platykur-
tic; they ordinarily have relatively large standard deviations
given their ranges. As sample
sizes grow, particularly randomly selected samples, overall
variability tends to decrease.
This is consistent with the central tendency characteristics of
normal distributions, as more
of the scores selected will tend to occur in the middle of the
distribution near the mean.
Since the standard deviation measures how much individual
values tend to vary from the
mean of the group, increasing sample size usually shrinks the
standard deviation value.
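The trade-offs described in this section can be seen by recomputing the width of the interval, 2 × t(SEM), at different confidence levels and sample sizes. The sketch below reuses the electricity example's standard deviation; the larger sample size is hypothetical.

import math
from scipy import stats

def ci_width(s, n, confidence):
    # Width of a one-sample confidence interval: 2 times the critical t times SEM
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df=n - 1)
    return 2 * t_crit * (s / math.sqrt(n))

print(ci_width(37.555, 31, 0.95))   # about 27.5: the .95 interval from the example
print(ci_width(37.555, 31, 0.99))   # about 37.1: more certainty, a wider interval
print(ci_width(37.555, 124, 0.95))  # about 13.4: quadrupling n roughly halves the width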
11.2 A Confidence Interval for an Independent Samples t-Test
A statistically significant independent samples t-test indicates
that the two samples represented by M1 and M2 probably did
not come from the same population; they
came from populations with different means. Stated
symbolically, μ1 − μ2 ≠ 0. Consistent with the lan-
guage used earlier, the two sample means in the
case of the independent t-test are point estimates
of their respective population means. The more
difference there is between M1 and M2 in a statis-
tically significant t-test, the greater the difference
there probably is between the means of the two
related populations. The confidence interval for the independent
samples t-test is called
the confidence interval of the difference. It provides a way to
estimate the difference
between the two population means represented by the sample
means in a statistically
significant independent samples t-test.
The confidence interval of the difference employs this formula:
Key Terms: The confidence
interval of the difference
is a range probably contain-
ing the difference between two
unknown population means.
Formula 11.2: C.I..95 = ±t(SEd) + (M1 − M2)
Where
C.I..95 = a range of values within which the difference between the means of the populations represented by the samples will be captured with p = .95
t = the critical value of t for n1 + n2 − 2 degrees of freedom (the same number of degrees of freedom as there were for the independent samples t)
SEd = the calculated value of the standard error of the difference
M1, M2 = the two sample means from the independent t-test
As with the one-sample t-test, a confidence interval of the
difference is only calculated if
the independent samples t-test is statistically significant.
Considering the purpose for the
confidence interval makes the reason for this evident. What is at
issue in the independent
samples t is whether the two samples likely represent
populations with the same means,
or for practical purposes, whether the two samples belong to the
same population. That
probability is rejected when the result is significant. If the
analysis is not significant and
the decision is to “fail to reject,” then the samples may come
from populations with the
same means. It makes little sense, therefore, to calculate a
confidence interval for the dif-
ference between the population means if there may be only one
population involved.
An Independent Samples t-Test Example
A pharmaceutical manufacturer has received Food and Drug
Administration (FDA)
approval to market a new drug for people with high cholesterol.
In two sales regions, pre-
scription sales have been similar. But then at some point, one of
the two representatives
receives special training on the drug’s chemistry so that the
representative can explain more knowledgeably how the drug
works and what the side effects are likely to be. The question
is whether the extra training is worth the trouble—do the doc-
tors visited by the sales rep who received the extra training
write a significantly different number of prescriptions than
the doctors the other rep visits? The numbers of prescription
orders for the drug by doctors in the two sales regions over a
seven-week period following the training are as follows:
Extra training: 13, 10, 14, 17, 16, 12, 15
No extra training: 12, 10, 9, 12, 8, 8, 9
Review Question B:
What is the relation-
ship between the level
of confidence and the
width of the confi-
dence interval?
Verify the following descriptive statistics:
                   Mean      Standard Deviation      Standard Error of the Mean
Extra Training     13.857     2.410                   .911
No Training         9.714     1.704                   .644
Recall that SEd = √(SEM1² + SEM2²) = √(.911² + .644²) = 1.116
t = (M1 − M2)/SEd = (13.857 − 9.714)/1.116 = 3.712; t.05(12) = 2.179. Reject H0.
The result is statistically significant. As the difference between
the sample means sug-
gests, the doctors visited by the sales representative with the
extra training are writing
significantly more prescriptions than the doctors in the other
sales district. In terms of
the number of prescriptions the doctors write, the two sales reps
probably now represent
two distinct populations. The sample means provide point
estimates of the means of the
two populations involved. The confidence interval of the
difference provides a different
way to contrast the two populations by indicating, with the
specified probability, what
range of values is likely to capture the magnitude of the
difference between the two
population means.
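Before turning to the confidence interval, the t-test above can be reproduced with a few lines of Python. SciPy's pooled-variance test gives the same t value here because the two groups are the same size; the data are the prescription counts from the example.

from scipy import stats

extra_training = [13, 10, 14, 17, 16, 12, 15]
no_training = [12, 10, 9, 12, 8, 8, 9]

# Pooled-variance independent samples t-test, df = n1 + n2 - 2 = 12
t_stat, p_value = stats.ttest_ind(extra_training, no_training, equal_var=True)
# t_stat comes out near 3.71 and p_value is well below .05, so H0 is rejected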
Calculating the Confidence Interval of the Difference
Formula 11.2 is:
C.I..95 = ±t(SEd) + (M1 − M2)
Note that although the final term in the formula indicates M1 − M2, it is the absolute
value of the difference between the means that the formula requires. If M2 happens to
have a greater value than M1, the result will be a difference that is negative, but whether
M1 − M2 is positive or negative isn't relevant to the confidence interval. It is the absolute
value of M1 − M2 that is entered in the formula.
All the required values are available from the t-test solution
above. Substituting them
gives the following for the confidence interval:
C.I..95 = ±2.179(1.116) + (13.857 − 9.714)
C.I..95 = 6.575, 1.711
a. 2.179 × 1.116 + (13.857 − 9.714) is the upper bound of the confidence interval, and
b. −2.179 × 1.116 + (13.857 − 9.714) is the lower bound of the confidence interval.
The significant independent samples t-test result that indicates
that the two samples prob-
ably do not represent populations with the same mean is
followed with a confidence inter-
val estimating how much difference there is between the means
of the two populations.
The calculations indicate that with p = .95 confidence, the
difference will be somewhere
from 1.711 to 6.575 prescriptions per week.
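The confidence interval of the difference can be computed the same way, as in the sketch below, which follows Formula 11.2 step by step using the prescription data; SciPy supplies the critical t value.

import math
import statistics
from scipy import stats

extra_training = [13, 10, 14, 17, 16, 12, 15]
no_training = [12, 10, 9, 12, 8, 8, 9]

sem1 = statistics.stdev(extra_training) / math.sqrt(len(extra_training))
sem2 = statistics.stdev(no_training) / math.sqrt(len(no_training))
se_d = math.sqrt(sem1 ** 2 + sem2 ** 2)    # standard error of the difference, about 1.116

diff = abs(statistics.mean(extra_training) - statistics.mean(no_training))
df = len(extra_training) + len(no_training) - 2
t_crit = stats.t.ppf(0.975, df=df)         # about 2.179

# Formula 11.2: C.I..95 = plus or minus t(SEd) + (M1 - M2)
lower, upper = diff - t_crit * se_d, diff + t_crit * se_d   # about 1.71 and 6.58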
Note that the confidence interval does not estimate the means of
the two populations.
Rather, it estimates the difference between their means. The
sample means provide esti-
mates of the population means, which is as close as we come to
identifying their values
without further data collection and analysis.
11.3 The Confidence Interval of the Prediction
The first two confidence intervals in this chapter were
calculated and interpreted in ref-
erence to population means. When t is significant in the one-
sample test, the confidence
interval is a range of values within which the mean of the
population represented by the
sample is likely to occur. When t is significant in an
independent groups test, the confi-
dence interval indicates how much difference there is likely to
be between the means of
the two populations inferred by the samples.
The confidence interval of the prediction has a different
orientation. It is a confidence
interval for a regression solution, and because the task in
regression is to predict the value
of y from x (or from multiple x predictors), what the confidence
interval produces is a
range of values within which the true value of y will probably
occur, given a specific value
for x.
Remember that the basis for regression is that when-
ever variables are significantly correlated, a number
of estimates of the value of y from x will be more
accurate than a series of random estimates. While
that represents a very important theoretical princi-
ple, managers are not usually interested in a series of
estimates. For example, given a correlation between
the price of some product and the resulting sales vol-
ume, a production manager who is trying to forecast what
production will need to be in
order to satisfy demand is often not interested in a series of
estimates of sales volume (y)
for several different prices (xs), but rather in one estimate of y
from x. The question is often
“If the price is ___, what are sales most likely to be?” The value
of the confidence interval
is that it provides a way to estimate how precise that one
prediction of sales volume is
likely to be.
Key Terms: The confidence
interval of the prediction is
a range probably contain-
ing the true value of a criterion
variable.
Small values of the standard error of the estimate indicate
relatively little error and inter-
vals from y’ − 1(SEest) to y’ + 1(SEest) that are relatively
narrow. When that occurs, there
can be increased confidence in the predicted value of y. Small
intervals indicate little error
in the prediction. On the other hand, if that range is quite wide,
it suggests there could be
a good deal of variance between the predicted value of y and its
actual value.
Capturing the true value of the criterion variable about 2⁄3 of
the time, which is what y’ ± SEest provides for, produces essentially a p = .68 confidence interval. The problem is that
the range of values that is y’ ± 1(SEest) would miss the true
value of y about
1⁄3 of the time.
The probability of not capturing the true predicted value in such
a confidence interval is
too great to be helpful in most analytical situations. To improve
the probability of captur-
ing y, the more precise C.I..95 and C.I..99 confidence intervals
of the prediction are calcu-
lated. The formula for a C.I..95 is:
Formula 11.4: C.I..95 = ±tn−2(SEest) + y’
Formula 11.3: C.I..68 = ±1(SEest) + y’
It is important to be cautious about the distinction between
specificity and precision. The
difficulty is not in predicting the value of y given x, which can
be done whenever x and
y are significantly correlated. The challenge is in making
precise predictions of the value
of y from x. Although the least squares regression equation
provides a particular value
that is the best prediction of the criterion variable from a
specific value of x, what consti-
tutes the best prediction is relative. The best prediction in some
circumstances can be very
imprecise. A number of factors conspire against accurate
predictions, most notably, weak
correlations. The problem that the regression solution does not
solve is that the predicted
value of y, (y’), yields little information about how precise the
prediction is likely to be. It
is a shortfall that the confidence interval addresses.
The Standard Error of the Estimate
Recall that in any normal distribution, plus or minus 1 standard
deviation from the mean,
accounts for about 2⁄3 of the population, 68.26% to be more
exact. Recall also from Chapter
9 that the standard error of the estimate (SEest), which
estimates prediction errors in regres-
sion problems, is by definition a standard deviation value. In fact
it is the standard deviation
of all possible error scores from an infinite number of
predictions of y from x. While that
definition has important conceptual value, in practice the
standard error of the estimate is
in fact an estimated value, as the name suggests. Recall that
SEest = sy√(1 − r²xy). Because
it is a type of standard deviation value, and since ± one standard
deviation from the mean
in any normal distribution includes 68% of that distribution, we
can say by extension that
±1(SEest) from the predicted value of y (the y’ point) should
provide a range of values within
which the true value of y will occur about 68% of the time.
Where
C.I..95 = a .95 confidence interval for the regression solution
t = the critical value of t for n − 2 degrees of freedom, where n = the number of pairs of scores
SEest = the standard error of the estimate
y’ = the predicted value for the criterion variable
Calculating the Confidence Interval of the Prediction
A produce importer recognizes a relationship between how long
it takes produce to get
from shipping docks to the retail market and how much of the
order is lost to spoilage
because of over-ripening. The data for number of days (x) and
the percentage of the order
lost to over-ripeness (y) is as follows:
Number of days: 12, 15, 11, 9, 7, 9, 12, 10, 7, 8, 9, 8
Percentage lost: 8, 10, 7, 7, 6, 8, 9, 9, 5, 6, 7, 6
With a dock workers’ strike looming and produce coming into
port, the importer wishes
to predict what the losses will be if it takes 18 days to get the
produce to the retailer.
Verify the following descriptive statistics:
                      Mean      Standard Deviation
Number of Days (x)    9.750     2.379
% Lost (y)            7.333     1.497
rxy = .868 and is statistically significant at p = .05
y’ = a + bx
b = rxy(sy/sx) = .868(1.497 ÷ 2.379) = .546
a = My − bMx = 7.333 − (.546)(9.750) = 2.010
y’ = 2.010 + .546(18) = 11.838
Based on these data, if it takes 18 days to get the produce to
retail outlets, there will be an
11.838% loss due to over-ripening.
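A sketch of the whole calculation, through the confidence interval of the prediction in Formula 11.4, is shown below. It uses the produce data from the example; statistics.correlation requires Python 3.10 or later, and the standard error of the estimate is the Chapter 9 formula, SEest = sy√(1 − r²xy).

import math
import statistics
from scipy import stats

days = [12, 15, 11, 9, 7, 9, 12, 10, 7, 8, 9, 8]
lost = [8, 10, 7, 7, 6, 8, 9, 9, 5, 6, 7, 6]

r = statistics.correlation(days, lost)                # about .868
b = r * statistics.stdev(lost) / statistics.stdev(days)
a = statistics.mean(lost) - b * statistics.mean(days)
y_pred = a + b * 18                                   # about 11.8 percent predicted loss

se_est = statistics.stdev(lost) * math.sqrt(1 - r ** 2)  # standard error of the estimate
t_crit = stats.t.ppf(0.975, df=len(days) - 2)            # critical t for n - 2 df

# Formula 11.4: C.I..95 = plus or minus t(SEest) + y'
lower, upper = y_pred - t_crit * se_est, y_pred + t_crit * se_est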
Like the values for M in the one-sample t, and M1 − M2 in the
independent t, the 11.838
value is a point estimate—a discrete number—indicating a
solution value. As with t-tests,
the confidence interval of the prediction produces an interval
estimate, or a range of num-
bers within which the true percentage of spoilage will occur.
12The Chi-Square Test Analyzing Categorical DataLea.docx
12The Chi-Square Test Analyzing Categorical DataLea.docx
12The Chi-Square Test Analyzing Categorical DataLea.docx
12The Chi-Square Test Analyzing Categorical DataLea.docx
12The Chi-Square Test Analyzing Categorical DataLea.docx
12The Chi-Square Test Analyzing Categorical DataLea.docx
12The Chi-Square Test Analyzing Categorical DataLea.docx

More Related Content

Similar to 12The Chi-Square Test Analyzing Categorical DataLea.docx

1. F A Using S P S S1 (Saq.Sav) Q Ti A
1.  F A Using  S P S S1 (Saq.Sav)   Q Ti A1.  F A Using  S P S S1 (Saq.Sav)   Q Ti A
1. F A Using S P S S1 (Saq.Sav) Q Ti A
Zoha Qureshi
 
Factor analysis using SPSS
Factor analysis using SPSSFactor analysis using SPSS
Factor analysis using SPSS
Remas Mohamed
 
t-Test Project Instructions and Rubric Project Overvi.docx
t-Test Project Instructions and Rubric  Project Overvi.docxt-Test Project Instructions and Rubric  Project Overvi.docx
t-Test Project Instructions and Rubric Project Overvi.docx
mattinsonjanel
 
Read the article Competition or Complement Six Sigma and TOC that.docx
Read the article Competition or Complement Six Sigma and TOC that.docxRead the article Competition or Complement Six Sigma and TOC that.docx
Read the article Competition or Complement Six Sigma and TOC that.docx
makdul
 
Performance Evaluation for Classifiers tutorial
Performance Evaluation for Classifiers tutorialPerformance Evaluation for Classifiers tutorial
Performance Evaluation for Classifiers tutorial
Bilkent University
 
Mb0050 research methodology
Mb0050   research methodologyMb0050   research methodology
Mb0050 research methodology
smumbahelp
 
Mb0050 research methodology
Mb0050   research methodologyMb0050   research methodology
Mb0050 research methodology
smumbahelp
 

Similar to 12The Chi-Square Test Analyzing Categorical DataLea.docx (20)

Data analysis.pptx
Data analysis.pptxData analysis.pptx
Data analysis.pptx
 
1. F A Using S P S S1 (Saq.Sav) Q Ti A
1.  F A Using  S P S S1 (Saq.Sav)   Q Ti A1.  F A Using  S P S S1 (Saq.Sav)   Q Ti A
1. F A Using S P S S1 (Saq.Sav) Q Ti A
 
Factor analysis using SPSS
Factor analysis using SPSSFactor analysis using SPSS
Factor analysis using SPSS
 
man0 ppt.pptx
man0 ppt.pptxman0 ppt.pptx
man0 ppt.pptx
 
Factor analysis using spss 2005
Factor analysis using spss 2005Factor analysis using spss 2005
Factor analysis using spss 2005
 
t-Test Project Instructions and Rubric Project Overvi.docx
t-Test Project Instructions and Rubric  Project Overvi.docxt-Test Project Instructions and Rubric  Project Overvi.docx
t-Test Project Instructions and Rubric Project Overvi.docx
 
Read the article Competition or Complement Six Sigma and TOC that.docx
Read the article Competition or Complement Six Sigma and TOC that.docxRead the article Competition or Complement Six Sigma and TOC that.docx
Read the article Competition or Complement Six Sigma and TOC that.docx
 
KIT-601 Lecture Notes-UNIT-2.pdf
KIT-601 Lecture Notes-UNIT-2.pdfKIT-601 Lecture Notes-UNIT-2.pdf
KIT-601 Lecture Notes-UNIT-2.pdf
 
Performance Evaluation for Classifiers tutorial
Performance Evaluation for Classifiers tutorialPerformance Evaluation for Classifiers tutorial
Performance Evaluation for Classifiers tutorial
 
Episode 18 : Research Methodology ( Part 8 )
Episode 18 :  Research Methodology ( Part 8 )Episode 18 :  Research Methodology ( Part 8 )
Episode 18 : Research Methodology ( Part 8 )
 
STATISTICAL TOOLS USED IN ANALYTICAL CHEMISTRY
STATISTICAL TOOLS USED IN ANALYTICAL CHEMISTRYSTATISTICAL TOOLS USED IN ANALYTICAL CHEMISTRY
STATISTICAL TOOLS USED IN ANALYTICAL CHEMISTRY
 
Episode 12 : Research Methodology ( Part 2 )
Episode 12 :  Research Methodology ( Part 2 )Episode 12 :  Research Methodology ( Part 2 )
Episode 12 : Research Methodology ( Part 2 )
 
Data analysis and Interpretation
Data analysis and InterpretationData analysis and Interpretation
Data analysis and Interpretation
 
Samle size
Samle sizeSamle size
Samle size
 
Mb0050 research methodology
Mb0050   research methodologyMb0050   research methodology
Mb0050 research methodology
 
Alex Korbonits, "AUC at what costs?" Seattle DAML June 2016
Alex Korbonits, "AUC at what costs?" Seattle DAML June 2016Alex Korbonits, "AUC at what costs?" Seattle DAML June 2016
Alex Korbonits, "AUC at what costs?" Seattle DAML June 2016
 
Mb0050 research methodology
Mb0050   research methodologyMb0050   research methodology
Mb0050 research methodology
 
Research Method for Business chapter 11-12-14
Research Method for Business chapter 11-12-14Research Method for Business chapter 11-12-14
Research Method for Business chapter 11-12-14
 
Advice On Statistical Analysis For Circulation Research
Advice On Statistical Analysis For Circulation ResearchAdvice On Statistical Analysis For Circulation Research
Advice On Statistical Analysis For Circulation Research
 
analysis plan.ppt
analysis plan.pptanalysis plan.ppt
analysis plan.ppt
 

More from hyacinthshackley2629

Your company nameYour nameInstruction Page1. O.docx
Your company nameYour nameInstruction Page1. O.docxYour company nameYour nameInstruction Page1. O.docx
Your company nameYour nameInstruction Page1. O.docx
hyacinthshackley2629
 
Your Company NameYour Company NameBudget Proposalfor[ent.docx
Your Company NameYour Company NameBudget Proposalfor[ent.docxYour Company NameYour Company NameBudget Proposalfor[ent.docx
Your Company NameYour Company NameBudget Proposalfor[ent.docx
hyacinthshackley2629
 
Your company is a security service contractor that consults with bus.docx
Your company is a security service contractor that consults with bus.docxYour company is a security service contractor that consults with bus.docx
Your company is a security service contractor that consults with bus.docx
hyacinthshackley2629
 
Your company You are a new Supply Chain Analyst with the ACME.docx
Your company   You are a new Supply Chain Analyst with the ACME.docxYour company   You are a new Supply Chain Analyst with the ACME.docx
Your company You are a new Supply Chain Analyst with the ACME.docx
hyacinthshackley2629
 
Your Communications PlanDescriptionA.What is your .docx
Your Communications PlanDescriptionA.What is your .docxYour Communications PlanDescriptionA.What is your .docx
Your Communications PlanDescriptionA.What is your .docx
hyacinthshackley2629
 
Your Communication InvestigationFor your mission after reading y.docx
Your Communication InvestigationFor your mission after reading y.docxYour Communication InvestigationFor your mission after reading y.docx
Your Communication InvestigationFor your mission after reading y.docx
hyacinthshackley2629
 
Your Communications PlanFirst step Choose a topic. Revi.docx
Your Communications PlanFirst step Choose a topic. Revi.docxYour Communications PlanFirst step Choose a topic. Revi.docx
Your Communications PlanFirst step Choose a topic. Revi.docx
hyacinthshackley2629
 
Your coffee franchise cleared for business in both countries (Mexico.docx
Your coffee franchise cleared for business in both countries (Mexico.docxYour coffee franchise cleared for business in both countries (Mexico.docx
Your coffee franchise cleared for business in both countries (Mexico.docx
hyacinthshackley2629
 

More from hyacinthshackley2629 (20)

Your company nameYour nameInstruction Page1. O.docx
Your company nameYour nameInstruction Page1. O.docxYour company nameYour nameInstruction Page1. O.docx
Your company nameYour nameInstruction Page1. O.docx
 
Your Company NameYour Company NameBudget Proposalfor[ent.docx
Your Company NameYour Company NameBudget Proposalfor[ent.docxYour Company NameYour Company NameBudget Proposalfor[ent.docx
Your Company NameYour Company NameBudget Proposalfor[ent.docx
 
Your company recently reviewed the results of a penetration test.docx
Your company recently reviewed the results of a penetration test.docxYour company recently reviewed the results of a penetration test.docx
Your company recently reviewed the results of a penetration test.docx
 
Your company wants to explore moving much of their data and info.docx
Your company wants to explore moving much of their data and info.docxYour company wants to explore moving much of their data and info.docx
Your company wants to explore moving much of their data and info.docx
 
Your company plans to establish MNE manufacturing operations in Sout.docx
Your company plans to establish MNE manufacturing operations in Sout.docxYour company plans to establish MNE manufacturing operations in Sout.docx
Your company plans to establish MNE manufacturing operations in Sout.docx
 
Your company just purchased a Dell server MD1420 DAS to use to store.docx
Your company just purchased a Dell server MD1420 DAS to use to store.docxYour company just purchased a Dell server MD1420 DAS to use to store.docx
Your company just purchased a Dell server MD1420 DAS to use to store.docx
 
your company is moving to a new HRpayroll system that is sponsored .docx
your company is moving to a new HRpayroll system that is sponsored .docxyour company is moving to a new HRpayroll system that is sponsored .docx
your company is moving to a new HRpayroll system that is sponsored .docx
 
Your company is considering the implementation of a technology s.docx
Your company is considering the implementation of a technology s.docxYour company is considering the implementation of a technology s.docx
Your company is considering the implementation of a technology s.docx
 
Your company is a security service contractor that consults with bus.docx
Your company is a security service contractor that consults with bus.docxYour company is a security service contractor that consults with bus.docx
Your company is a security service contractor that consults with bus.docx
 
Your company has just sent you to a Project Management Conference on.docx
Your company has just sent you to a Project Management Conference on.docxYour company has just sent you to a Project Management Conference on.docx
Your company has just sent you to a Project Management Conference on.docx
 
Your company has designed an information system for a library.  The .docx
Your company has designed an information system for a library.  The .docxYour company has designed an information system for a library.  The .docx
Your company has designed an information system for a library.  The .docx
 
Your company has had embedded HR generalists in business units for t.docx
Your company has had embedded HR generalists in business units for t.docxYour company has had embedded HR generalists in business units for t.docx
Your company has had embedded HR generalists in business units for t.docx
 
Your company You are a new Supply Chain Analyst with the ACME.docx
Your company   You are a new Supply Chain Analyst with the ACME.docxYour company   You are a new Supply Chain Analyst with the ACME.docx
Your company You are a new Supply Chain Analyst with the ACME.docx
 
Your company has asked that you create a survey to collect data .docx
Your company has asked that you create a survey to collect data .docxYour company has asked that you create a survey to collect data .docx
Your company has asked that you create a survey to collect data .docx
 
Your Communications PlanDescriptionA.What is your .docx
Your Communications PlanDescriptionA.What is your .docxYour Communications PlanDescriptionA.What is your .docx
Your Communications PlanDescriptionA.What is your .docx
 
Your community includes people from diverse backgrounds. Answer .docx
Your community includes people from diverse backgrounds. Answer .docxYour community includes people from diverse backgrounds. Answer .docx
Your community includes people from diverse backgrounds. Answer .docx
 
Your Communications Plan Please respond to the following.docx
Recall that nonparametric tests set aside a number of important assumptions about the characteristics of the groups involved in an analysis. Removing some of the requirements associated with the characteristics of the data provides greater analytical flexibility.

Chi-square tests allow for the analysis of data that are exclusively categorical, which distinguishes them from all the statistical tests discussed in previous chapters. Because categorical (nominal) data cannot reflect the characteristics of normality, the chi-square tests are nonparametric, also referred to as "distribution-free," tests. No assumptions are needed about how the data are distributed, nor are there requirements regarding their scale. These tests provide flexibility, but they exact a cost as well, which will be discussed as the chapter progresses. The chi-square procedures have many applications in business analysis and decision making and are a mainstay in the manager's statistical "toolbox."

The chi-square tests were developed by Karl Pearson, the same Pearson of the Pearson Correlation. Note that the Greek letter chi is written χ and is pronounced with a hard c: "kie," rhyming with "pie."
There are two chi-square procedures discussed in this chapter, and both have two names. The first is called the goodness-of-fit chi-square test, or alternatively the 1 × k (said "one by kay") chi-square.

12.2 The Goodness-of-Fit (1 × k) Chi-Square

Perhaps a market research specialist is trying to determine whether several local "talk radio" stations have approximately similar audiences. Among other things, the answer will influence what individual stations can charge for advertising. The market research specialist makes a random selection of people from a local telephone book and calls each number to ask residents whether they listen to talk radio. For those answering in the affirmative, the question is which station they prefer.

Both names for this procedure are instructive for what they reveal about the kind of analysis involved. "Goodness-of-fit," however awkward the grammar, indicates that what is at issue is how well the data fit an initial hypothesis. That initial hypothesis describes the expected distribution of the data. In the talk radio example, the procedure will test whether listeners prefer the major talk stations in about equal proportions; it will provide an analysis of how well the data fit that assumption.

Key Terms: The goodness-of-fit, or 1 × k, chi-square is a test for significant differences in the categories of a single, nominal scale variable.

The 1 × k name is a reminder that the procedure involves just a single grouping variable (the "1" in the name), which is divided into some number (k) of categories. The "k" here has the same meaning it had in ANOVA: it refers to the number of groups in the analysis. In the preceding example:

• The categorical (grouping) variable is the preferred radio station.
• The k refers to the number of categories into which the single variable is divided, which is the number of talk radio stations involved in the analysis.

Take care not to confuse the number of categories with the number of variables involved in the analysis. If there are five talk radio stations in the listening area, there are five categories, or dimensions, of the single variable, preferred radio station.

Recall that nominal data are also often called categorical, or "count," data, a name that stems from the way they are measured: listeners are asked which station they listen to and are then sorted into the relevant category (preferred radio station) accordingly.
This "count" becomes the dependent variable in chi-square tests. The analysis hinges on the frequency with which subjects fall into the individual categories. Note that the issue is not how much more the individual prefers Station A to Station B, which would indicate ordinal scale data, nor how much time the individual spends listening, which would provide ratio scale data. The only question for respondents who listen to talk radio is which station they prefer.

If the question that drives the study is whether listeners prefer the five stations in about equal proportions, that is the hypothesis that will be tested. In that instance, the expectation is that the numbers of listeners in each of the categories representing the different talk radio stations will be reasonably similar. If they are, the resulting chi-square value will not be statistically significant. Statistically significant results emerge in a goodness-of-fit chi-square when there is a substantial discrepancy between what is observed in the data and what is expected based on the initial hypothesis. Exactly how much of a discrepancy there must be is determined by comparing the calculated value of chi-square to a critical value that, like the critical values for the t, F, and r statistics, is determined by the degrees of freedom for the problem and the probability level at which the test is conducted.

Referring back to the radio station problem, the market research specialist needs either to find support for the hypothesis that listeners tune in to all five stations in about equal proportions or to update that expectation. Ninety-five listeners are asked about their station preferences. Sixty of the respondents name one of the five stations in the market area; the other 35 listen to subscription stations on satellite radio that are not located in the area. Since the interest is in listeners to local stations, those 35 people are excluded from the study. The results from the remaining 60 listeners are as follows:

Station A  15
Station B   8
Station C  12
Station D  10
Station E  15

Review Question A: What is the scale of the data required by either of the chi-square tests?
These survey results range from 8 listeners for the least frequently mentioned station (Station B) to 15 listeners for the most popular stations (A and E), so there are clearly differences in listeners' preferences. The question the chi-square procedure answers is whether the differences in "count" across the five stations are just the random differences that can be expected because of sampling error, or whether they are great enough that they would likely emerge every time data were collected and analyzed. That last outcome, of course, is what defines statistical significance.

Calculating the Test Statistic

The chi-square test statistic, which is neither intimidating to look at nor difficult to calculate, has this form:

Formula 12.1
χ² = Σ[(fo − fe)²/fe]

where
χ² = the value of chi-square
fo = the frequency observed: how many individuals actually occur in a particular category
fe = the frequency expected: how many individuals can be expected to occur in a particular category according to the initial hypothesis

The calculation involves a good deal of repetitive subtracting and squaring, and a good way to keep it straight is to complete it in something like Table 12.1. The successive steps are represented in the rows, beginning with the fo row near the top of the table and working down one row at a time; each row represents one calculation step for determining the χ² value.

Table 12.1: The 1 × k chi-square

Statistic        Station A   Station B   Station C   Station D   Station E
fo                  15           8          12          10          15
fe                  12          12          12          12          12
fo − fe              3          −4           0          −2           3
(fo − fe)²           9          16           0           4           9
(fo − fe)²/fe       .75        1.33          0          .33         .75

χ² = .75 + 1.33 + 0 + .33 + .75 = 3.16

The values in the fo row are counts of the number of individuals who occur in each category of the variable.

• These "frequency observed" values are the numbers of listeners from the sample of 60 who indicate that they listen to a particular radio station.
• The sum of the fo values across the categories of the variable must always equal the total sample size, n.

The second row, designated fe, indicates what is expected based on whatever hypothesis or assumption prompted the analysis.
• For the problem above, the hypothesis is that listeners are attracted to the five stations in approximately equal proportions.
• That expectation is reflected in equal fe values for each of the categories.

Later there will be a problem where the expectation is that the categories will not be equal, which must also be reflected in the fe values. If, for example, Station A is associated with a major network and carries several nationally syndicated shows, perhaps the expectation is that Station A is twice as popular as the others. In that case, the fe value for Station A would be twice as high as the fe value for each of the other stations. Here, however, the problem is simpler. The hypothesis is that the stations are equally popular, so determining what to expect is simply a matter of dividing the total number of listeners by the number of categories:

fe = n/k

where
n = the total number of subjects in all categories
k = the number of categories

So for the radio station problem, because n = 60 and k = 5, fe = 60/5 = 12 in each category.

Study the formula for the test statistic for a moment, review the order of mathematical operations, and the process for calculating χ² will be straightforward. Recall from ninth-grade algebra that when there are multiple operations:

• Complete whatever needs to be done in the parentheses first ("please").
• Then deal with exponents ("excuse").
• Handle multiplication and division next ("my dear").
• Complete any addition and subtraction that is not in parentheses last ("Aunt Sally").

So in terms of the rows in Table 12.1, the process is to:

1. Fill in the fo values in the first line, determined by the number of people who indicate that they listen to a particular station.
2. Because the expectation for this problem is that the five stations are about equally popular, divide n by k and enter that value in each of the five fe boxes on the second line.

Now, following the order of operations in each of columns 1 through 5:

3. On line 3, enter fo − fe for each station: the fo value from the first line minus the fe value from the second line.
4. On line 4, deal with the exponent by squaring each difference between fo and fe.
5. On the next line, divide the result of the previous line, (fo − fe)², by the fe value from the second line.
6. On the final line, sum the (fo − fe)²/fe values across the five categories to determine χ².

The result of completing these steps for the data above is χ² = 3.16.
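For readers who want to double-check the arithmetic outside a hand-worked table, the same goodness-of-fit test can be reproduced in a few lines of Python. This is a minimal sketch rather than part of the text's own procedure; it assumes the SciPy library is available, and the variable names (observed, expected) are illustrative only. Because the code works with unrounded terms, it reports 3.17 rather than the 3.16 obtained when each row of Table 12.1 is rounded to two decimals.

    from scipy.stats import chisquare

    # Observed listener counts for Stations A through E (Table 12.1)
    observed = [15, 8, 12, 10, 15]
    n = sum(observed)                 # 60 listeners
    k = len(observed)                 # 5 categories
    expected = [n / k] * k            # equal popularity: fe = 12 in each category

    # Chi-square computed "by hand," mirroring the rows of Table 12.1
    chi_sq = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    print(round(chi_sq, 2))           # 3.17 (unrounded terms)

    # The same test with SciPy, which also reports a p-value
    stat, p_value = chisquare(f_obs=observed, f_exp=expected)
    print(round(stat, 2), round(p_value, 2))   # 3.17, p of roughly .53; not significant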
Interpreting the Test Statistic

The next step is to determine whether a χ² value of 3.16 is statistically significant by comparing it to the appropriate critical value of χ² from Table 12.2. As with the other tables, the critical value is determined by the probability level at which the test is conducted (p = .05 is the default level for this test as well) and the number of degrees of freedom for the problem. The df for the goodness-of-fit chi-square are k − 1, the number of categories of the variable minus 1. Here, df = 4.

Table 12.2: The critical values of chi-square

df    p = .05   p = .01   p = .001
1      3.84      6.64     10.83
2      5.99      9.21     13.82
3      7.82     11.35     16.27
4      9.49     13.28     18.47
5     11.07     15.09     20.52
6     12.59     16.81     22.46
7     14.07     18.48     24.32
8     15.51     20.09     26.13
9     16.92     21.67     27.88
10    18.31     23.21     29.59
11    19.68     24.73     31.26
12    21.03     26.22     32.91
13    22.36     27.69     34.53
14    23.69     29.14     36.12
15    25.00     30.58     37.70
16    26.30     32.00     39.25
17    27.59     33.41     40.79
18    28.87     34.81     42.31
19    30.14     36.19     43.82
20    31.41     37.57     45.32

Source: http://home.comcast.net/~sharov/PopEcol/tables/chisq.html. Retrieved 6 July 2012.

As with the other tests conducted so far, the calculated value of chi-square is statistically significant when it is equal to or larger than the critical value set by the probability level of the test and the degrees of freedom for the problem. For χ².05(4), the table indicates that the critical value is 9.49. A critical value from the table greater than the calculated value of chi-square indicates that the fo-to-fe difference is best explained by differences that could occur by chance in the chi-square distribution. The result is not statistically significant, which prompts the marketing analyst to fail to reject the null hypothesis.

As an aside, the critical values for chi-square are usually reported to two decimals, just as z values were in their table. With chi-square, stopping at two decimals is not just a matter of fitting more values onto a page, as it was with z: the nominal data upon which chi-square values are based are relatively crude compared with ordinal, interval, or ratio data, and it makes less sense with these data to imply the level of exactness suggested by three decimals.
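Critical values can also be regenerated from the chi-square distribution itself rather than looked up. The short sketch below is offered only as a cross-check on Table 12.2; it assumes SciPy is installed and is not part of the text's own procedure.

    from scipy.stats import chi2

    # A critical value is the chi-square score that cuts off an upper tail of size alpha
    for df in (1, 2, 3, 4, 5):
        row = [round(chi2.ppf(1 - alpha, df), 2) for alpha in (0.05, 0.01, 0.001)]
        print(df, row)                     # agrees with Table 12.2 to within .01 of rounding

    # df = 4 at p = .05 reproduces the 9.49 used for the radio station problem
    print(round(chi2.ppf(0.95, 4), 2))     # 9.49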
  • 18. • In the language of statistics, the null hypothesis for a chi- square problem is that the frequency expected is equal to the frequency observed (Ho: fe 5 fo). The fact that the result in the problem just worked was not statistically sig- nificant indicates that in a population where people listen to five talk radio stations in equal proportions, it is not improbable to draw a sample in which the numbers of listeners who prefer each station range from 8 to 15 out of 60. A sample with fo values of 15, 8, 12, 10, 15 is still consistent with the null hypothesis, and it is one of the outcomes that make up the chi-square distribution. • The alternate hypothesis is that fo values of 15, 8, 12, 10, 15 is not a sam- ple likely to occur in the chi-square distribution. Stated symbolically it is was actually observed. The null hypothesis indicates that any variability between what is observed and what is expected is explained by what will probably occur in the chi- square distribution. In other words, there is insufficient evidence to reject the possibility that what occurred is likely to have occurred by chance. The alternate hypothesis is that there is too much dif- ference between what is observed and what is expected to conclude that the outcome is
  • 19. due to chance. Distinguishing Between Goodness-of-Fit Chi-Square Tests and t-Tests or ANOVAs The 1 3 k, or goodness-of-fit, chi-square procedure falls under the general hypothesis of difference category of procedures. In that regard it is similar to the independent samples t-test and to ANOVA. Like those procedures, the value of the chi-square statistic is a mea- sure of difference. The primary difference is the scale of the data in the analysis. In inde- pendent samples t-tests and ANOVA, the t and the F respectively are measures of the difference between the means of the samples involved in the analysis. The independent (grouping) variable is categorical, and the dependent variable is continuous (interval or ratio scale). On the other hand, the chi-square statistic measures the difference between the frequencies of occurrence of a nominal (categorical) variable compared with what is expected. The larger the gap between the expected and observed frequency distribution, the greater the difference between fo and fe. Since the dependent variable is the “count,” or frequency, of occurrence of the categorical variable of interest, it is impossible to calculate means and standard deviations, which makes t-tests and ANOVAs impossible. A 1 3 k (Goodness-of-Fit) Chi-Square Problem With Unequal fe Values
  • 20. This first chi-square problem was based on the assumption that listeners preferred the five radio stations in about equal proportions, but equal fe values across all categories of the variable are not always the case. Perhaps a consumer advocate is testing the claim made by the manufacturer of an energy drink called Rush that consumers prefer its product Section 12.2 The Goodness-of-Fit (1 3 k) Chi-Square tan81004_12_c12_295-322.indd 303 2/22/13 3:44 PM CHAPTER 12 2-to-1 over the major competitor’s product (Advantage) based on taste alone. If this is accu- rate, a random sample of preferences from consumers of energy drinks should indicate that twice as many prefer Rush over Advantage. Since it is highly unlikely that a random sample will yield exactly those results even if the claim is accurate, the chi-square test can be used to determine whether sample results are close enough to support that claim, or whether results are significantly different from the claim. The consumer advocate takes a sample of 150 students and finds that 27 of them have used both Rush and Advantage and express a preference for one over the other. The other 123 students prefer either some other energy drink, or use none at all. Their responses are discarded. Of the remaining 27 students, 16 of them prefer Rush
  • 21. and 11 prefer Advantage. Just as with the first problem, the 11 and 16 numbers represent the fo values, and their sum equals the value of n for the problem. That is the easy part. Because the claim by the manufacturer of Rush is that consumers prefer its drink 2-to-1 over the major competitor, Advantage, the fe values must reflect the 2-to-1 expectation. Calculating fe Values for Unequal Categories The total of both frequencies observed and frequencies expected must sum to the total, n. This will always be the case, regardless of the particular hypothesis. S fo 5 n, and S fe 5 n To calculate the fe values when the numbers in multiple cat- egories are not the same will involve vindicating that ninth- grade math teacher who said that someday algebra would be helpful. To determine the fe values, 1. Let x equal fe for the number who prefer the Advantage energy drink 2. Since the expectation in this example is that twice as many consumers will prefer Rush over Advantage, let 2x be the fe for those who prefer the Rush energy drink 3. Because the fe categories must sum to the total then x 1 2x 5
  • 22. n 4. Since, n 5 27, the expression can be changed as follows: x 1 2x 5 27 a. If x 1 2x 5 27, if follows that 3x 5 27 b. If 3x 5 27, then x 5 27/3, which makes x equal to 9 As a result of these calculations, the fe value for the Rush consumers is 18 (2x 5 2 3 9 5 18), and the fe value for Advantage consumers is 9 (x 5 9). With those values in hand, the claim that Rush is preferred twice as often as Advantage on the basis of taste can be tested with the 1 3 k chi-square. The solution is Table 12.3. Review Question B: What is the null hypothesis in a chi- square problem? Section 12.2 The Goodness-of-Fit (1 3 k) Chi-Square tan81004_12_c12_295-322.indd 304 2/22/13 3:44 PM CHAPTER 12 Table 12.3: A 1 3 k chi-square for unequal fe values: Is Rush twice as popular as Advantage? Advantage Rush fo 11 16 fe 9 18
  • 23. fo 2 fe 2 22 ( fo 2 fe) 2 4 4 ( fo 2 fe) 2/fe .44 .22 x2 5 .44 1.22 5 .66 x2 5 .66 x2.05(1) 5 3.84. Accept Ho. Interpreting the Results Since the calculated value of chi-square is lower than the critical value from the table for p 5 .05 and 1 degree of freedom, the decision is to fail to reject. That part is straight- forward enough, but in this problem where the claim is that Rush is twice as popular as Advantage, what does failing to reject mean? The key is the null hypothesis, which always reflects the expectation upon which the test is based. Because this problem was set up with an fe value for one outcome that is two times the value of the other, failing to reject the null hypothesis means that there is not enough evidence to reject the claim that Rush is twice as popular as Advantage. To say it another way, although the data do not reflect exactly a 2-to-1 preference for Rush (16 is not 2 3 11), the departure from that claim is not sufficient to allow the consumer advocate to reject it.
Note that the makers of Rush maintain that their product is twice as popular as Advantage based on taste. Whether the preference is entirely a matter of taste probably cannot be verified. Perhaps marketing prompts the students to prefer Rush, or the costs of the two products differ, or one comes in a more convenient size than the other. The way the data were collected made the consumer's stated preference the issue, without questions about the reasons for the preference. For whatever reason, students prefer the one product to the other by a great enough margin that the ratio could be 2-to-1.

A Final 1 × k Problem

To solidify a grasp of the goodness-of-fit procedure, here is one more problem. A consulting company is retained by a satellite provider to check the claim that, in a particular region of the country, satellite TV is three times more popular than free TV and cable TV is twice as popular as free TV. A random sample of 93 viewers in the region is examined and found to rely on the following for television service:

Satellite  65
Cable      16
Free       12

The same approach used in the last problem to determine the fe values produces the following:

3x (satellite) + 2x (cable) + x (free TV) = 93
6x = 93
x = 15.5

That x value makes the fe for free TV 15.5, for cable TV 2 × 15.5 = 31, and for satellite TV 3 × 15.5 = 46.5. Note that the fe values sum to 93, as they must. Do not be distracted by the fact that the fe values are not whole numbers. Although asking which type of TV service people use will make the fo values whole, the fe numbers indicating what people are expected to use can take on any value. The calculations for this problem are in Table 12.4.

Table 12.4: Another 1 × k chi-square problem

                 Satellite    Cable    Free TV
fo                  65         16        12
fe                  46.5       31        15.5
fo − fe             18.5      −15        −3.5
(fo − fe)²         342.25     225        12.25
(fo − fe)²/fe        7.36       7.26       .79

χ² = 7.36 + 7.26 + .79 = 15.41

With a calculated χ² = 15.41 and a table value of 5.99 when testing at p = .05 with 2 degrees of freedom, the results indicate that the null hypothesis should be rejected. The way these 93 people are distributed does not fit a chi-square distribution in which cable is twice as popular, and satellite three times as popular, as free television.

Review Question C: How is the scale of the data involved in an analysis related to the power of the statistical procedure?
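The satellite problem follows the same pattern in code, with 3-to-2-to-1 weights supplying the unequal fe values. As before, this is a sketch assuming SciPy rather than part of the text's own workflow.

    from scipy.stats import chisquare

    observed = [65, 16, 12]             # satellite, cable, free TV
    weights = [3, 2, 1]                 # the claimed 3-to-2-to-1 popularity
    n = sum(observed)                   # 93 viewers

    expected = [n * w / sum(weights) for w in weights]   # 46.5, 31.0, 15.5
    stat, p_value = chisquare(f_obs=observed, f_exp=expected)
    print(round(stat, 2), p_value)      # about 15.41 with p < .001, so reject Ho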
  • 27. hypothesis is actually two hypotheses: Satellite is three times as popular as free television, and cable is twice as pop- ular as free television. When the results indicate rejecting the null hypothesis, it could be because the expected ratio of satellite versus free TV customers is not supported, because the ratio of cable versus free TV customers is not supported, or because neither hypothesis is supported. Recall that this was the case with a significant F in ANOVA as well. It did not provide clear evidence for which two specific groups are significantly different. A more definitive chi-square test would require that the consultant gather new data and check those hypotheses individually. 12.3 The Chi-Square and Statistical Power As noted at the beginning of the chapter, distribution-free tests like chi-square proce-dures provide great flexibility. They provide no restrictions regarding the scale of the data, there are no normality assumptions to contend with, and these procedures work quite well with small samples. In the statistical version of the “there’s no such thing as a free lunch,” expression, however, chi-square and most nonparametric procedures have a drawback. Note that even though there were differences between what was observed and what was expected in the first two problems in this chapter, neither chi- square value was statistically significant. The chi-square procedures are not very sensitive to minor variations between
  • 28. what is seen and what is expected. Indeed, the differences must be fairly substantial to produce a significant chi-square value. Recall that power in statistical testing is the ability of a procedure to detect statistical significance. Compared to something like ANOVA, the chi-square procedures are not particularly powerful. Much of the lack of power comes down to the scale of the data that are involved. Nominal data provide no information about what is measured except the category of the variable to which the individual belongs. For each of the energy drink consumers who prefer Rush over Advantage, all that is revealed is which energy drink is preferred. Nothing in the data indicates how much more one drink is preferred over the other. There is no ranking of preference on a scale of 1 to 5. All that is known is that, presented a choice, the consumer chose Rush. Recall that all statistical tests are based on probabilities, and that because the outcome is therefore never a certainty, there is the constant possibility of a Type I or Type II decision error. Because the chi-square procedures are insensitive to minor variations between fo and fe, chi-square analyses are more inclined toward Type II (beta) decision errors than tradi- tional parametric tests of significant differences. The risk is that an analysis that suggests that results are not statistically significant might be set aside if new data were gathered and the analysis run a second time.
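The tendency toward Type II errors can be illustrated with a small simulation. The sketch below is an illustration only: it assumes NumPy and SciPy are available, and the population proportions are invented for the example. It draws repeated samples of 60 listeners from a population in which preferences genuinely differ somewhat and counts how often the goodness-of-fit test reaches significance at p < .05.

    import numpy as np
    from scipy.stats import chisquare

    rng = np.random.default_rng(seed=1)
    true_props = [0.26, 0.14, 0.20, 0.16, 0.24]   # real, but modest, differences
    n, trials, rejections = 60, 5000, 0

    for _ in range(trials):
        sample = rng.multinomial(n, true_props)   # one simulated survey of 60 listeners
        expected = [n / len(true_props)] * len(true_props)
        _, p = chisquare(f_obs=sample, f_exp=expected)
        rejections += p < 0.05

    # The rejection rate estimates the test's power in this scenario; with category
    # differences this small it typically comes out well under one-half, so a real
    # effect is missed (a Type II error) more often than it is detected.
    print(rejections / trials)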
On the bright side, Type I (alpha) decision errors are relatively uncommon. It is not likely that, upon finding a result statistically significant, further testing with new data would suggest otherwise; a decision to reject the null hypothesis is unlikely to be overturned by a second analysis.

12.4 The Goodness-of-Fit Test in Excel

The procedures in the Excel Data Analysis package do not include the chi-square tests. However, because the test statistic involves a good deal of repetitive subtracting, squaring, and dividing, it is not difficult to set up an Excel spreadsheet to accommodate a 1 × k problem by organizing it to complete the same calculations used in Tables 12.1, 12.3, and 12.4.

To illustrate, an organic vegetable grower claims that, in spite of the higher price, shoppers will select organically grown spinach as often as spinach grown with the help of pesticides and chemical fertilizers. The first 30 people who buy spinach on a particular day at a grocery store are examined for whether they bought the organically grown vegetable. Results are as follows:

Organically grown: 10
Conventionally grown: 20

To test the grower's claim:

• Enter the label organic in cell B1 and conventional in C1.
• Enter the labels fo in cell A2, fe in A3, fo − fe in A4, (fo − fe) sqd in A5, ÷fe in A6, and sum in A7.
• Enter the values 10 and 20 in cells B2 and C2, respectively. Since the claim is that organic spinach will sell as frequently as conventionally grown spinach, the fe values are simply n ÷ 2 = 15.
• Enter 15 in both B3 and C3.
• In cell B4, enter the formula =B2-B3 and press Enter.
• With the cursor on B4, hold the Shift key down, move the cursor to C4, and use the Fill Right command (near the far right of the menu ribbon) so that the procedure in B4 is repeated in C4.
• In cell B5, enter =B4^2 to square the value in B4.
• Repeat the B5 procedure in C5 using Fill Right as above.
• In cell B6, enter =B5/B3.
• Repeat the B6 procedure in C6 using Fill Right.
• In cell B7, enter =SUM(B6:C6).

The last command in the sequence produces the chi-square value for this problem, χ² = 3.33; the same arithmetic is cross-checked in the short sketch that follows.
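The cross-check can be done in any environment that computes the chi-square statistic. A minimal Python sketch, assuming SciPy is available, follows; it simply repeats the spreadsheet's arithmetic and is not part of the Excel procedure itself.

    from scipy.stats import chisquare

    observed = [10, 20]        # organic, conventional
    expected = [15, 15]        # the grower's claim: equal popularity among 30 buyers

    stat, p_value = chisquare(f_obs=observed, f_exp=expected)
    print(round(stat, 2), round(p_value, 2))   # 3.33, p near .07, so fail to reject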
  • 31. grown spinach may be just as popular as the traditionally produced spinach, in spite of its higher price. Figure 12.1 is a screenshot of the spreadsheet for this problem. tan81004_12_c12_295-322.indd 308 2/22/13 3:44 PM CHAPTER 12Section 12.5 The Chi-Square Test of Independence Figure 12.1: Chi-square goodness-of-fit using Excel 12.5 The Chi-Square Test of Independence The goodness-of-fit or 1 3 k chi-square procedure accommodates just one categorical (grouping) variable. In that regard, it is similar to a one-way ANOVA, which likewise involves just a single variable, although it is an interval or ratio scale rather than a nomi- nal scale variable with the data divided into any number of categories or groups. Although basing an analysis on a single variable keeps the arithmetic simple, it consigns to error any variance that is not explained by that single variable. On the other hand, when multiple independent variables are included, as is the case in factorial ANOVA, there is less residual variance and a smaller error term. In addition, besides each variable contributing to the result, sometimes multiple variables act together, a phenomenon called a “statistical inter- action” in factorial ANOVA. A similar thing can happen with chi-square procedures. Some- times a single variable is an inadequate explanation of an
  • 32. outcome, and in those circum- stances a second variable will act in concert with the first. The factorial ANOVA has an approximate equiva- lent in one of the chi-square procedures. It is called the chi-square test of independence, or the r 3 k chi-square. The names for this procedure are just as informative as were those for the one-variable chi-square test. The “test of independence” alludes to the fact that what is tested is whether the two vari- ables included in the analysis operate independently. Key Terms: The chi-square test of independence, or r 3 k chi-square is a test of the inde- pendence of two nominal scale variables. tan81004_12_c12_295-322.indd 309 2/22/13 3:45 PM CHAPTER 12Section 12.5 The Chi-Square Test of Independence In the language of ANOVA, the variables are analyzed to determine whether they interact. When the chi-square value is statistically significant, it indicates that the two variables do not operate independently, a result that will lead to an ancillary analysis to determine the level of their relationship. As was the case in factorial ANOVA, both of the interacting variables in the r 3 k chi- square are categorical. The r 3 k designation refers to the way
The data are organized so that the levels of one variable are indicated in the rows (r) of a table and the other variable is represented in the columns, with each column representing a separate category (k) of that variable. Although the calculations fit the same table that was used for the goodness-of-fit (1 × k) problems, the frequency expected (fe) values are calculated differently.

Setting up the Chi-Square Test of Independence

The rows-and-columns arrangement just described provides the organization for the contingency table used in the chi-square test of independence. The way it is used can be illustrated with a problem.

Key Terms: The contingency table in an r × k chi-square organizes the data into rows for the categories of one variable and columns for the categories of the other.

The human resources department for a fast food chain is considering offering an early retirement package to some of the more senior managers in an effort to reduce payroll. The department wishes to predict whether a severance package with a $15,000 bonus offered to early retirees will affect retirement plans. Because of the potential costs associated with offering the bonus to dozens of senior managers, the human resources people need some understanding of the impact the bonus will have on employees' decisions to retire. Among the managers, 30 senior managers are identified and randomly divided into two groups of 15 each.

• The 15 managers in the first group are asked to complete a questionnaire, submitted anonymously, which includes a question about whether the respondent anticipates retiring within the next three years. Among this group, two managers indicate that they intend to retire within the specified period.
• Those in the second group of 15 managers are asked whether, if a $15,000 bonus were offered to those who retire in the next three years, they would retire in that period. Of the 15, seven managers indicate that they would retire within the next three years if the bonus were offered.

Note that there are two potentially related variables involved. One is whether managers intend to retire in the coming three years. The other is whether they would retire in that time frame if a bonus were offered. These two variables can be represented in a table much like the one used earlier to set up a two-way ANOVA. In addition to helping one visualize the problem, the table helps with the task of deriving the fe values when it is used in the chi-square test of independence. Before worrying about the calculations, note the contingency table below, organized with the categories of one variable in the rows and the categories of the other variable in the columns:

              Retirement: Yes    Retirement: No
No Bonus
Bonus

Organizing the Contingency Table

The same table appears in Table 12.5 with the results of the survey filled in, along with totals for each row, each column, and a value for all subjects together, n. This particular contingency table is a 2 × 2. Although the chi-square test of independence is limited to two variables, those two variables can each have any number of categories. There could have been five levels of the bonus, for example, offered to different groups of potential retirees: no bonus, a $5,000 bonus, a $10,000 bonus, a $15,000 bonus, and a $20,000 bonus. There might have been more than two categories of the retirement decision as well: retire in the next year, retire in two to three years, retire in four to five years, and so on.

With the 2 × 2 problem there are just four cells, labeled "a" through "d." The value in cell "a" represents the combination of the no-bonus group and the number in that group who indicated that they would retire; the value in cell "d" is the combination of the bonus group and the number in that group who opted not to retire, and so on.
Table 12.5: The chi-square test of independence for retirement decisions and a severance bonus

A. The Contingency Table

                Will Retire   Won't Retire   Row Totals
No Bonus          a  2          b 13             15
Bonus             c  7          d  8             15
Column Totals        9            21           n = 30

B. Completing the Analysis

Statistic          a        b        c        d
fo                 2       13        7        8
fe                 4.5     10.5      4.5     10.5
fo − fe           −2.5      2.5      2.5     −2.5
(fo − fe)²         6.25     6.25     6.25     6.25
(fo − fe)²/fe      1.39      .60     1.39      .60

χ² = 1.39 + .60 + 1.39 + .60 = 3.98

The two sample groups each sum to 15, a value reflected in the row totals on the right, and the sum of the two rows must equal n. Although the two columns together must also sum to 30, the individual columns will not necessarily each be 15, since the numbers opting for and against retirement in the bonus and no-bonus groups are not equal.

The fo and fe Values in the Chi-Square Test of Independence

The values in each of the four cells are the fo values used to calculate chi-square, and they are listed in Part B of Table 12.5 just as they were in the goodness-of-fit problems worked earlier. The difference between the way chi-square is calculated in goodness-of-fit and test-of-independence problems lies in how the fe values are determined. For each of the "a" through "d" cells, fe is the total of the row in which the cell sits, times the total of the column in which the cell sits, divided by n. Symbolically, for each cell, fe = (row total × column total)/n. This makes the fe values in the retirement problem as follows:

• Cell a: (15 × 9)/30 = 4.5
• Cell b: (15 × 21)/30 = 10.5
• Cell c: (15 × 9)/30 = 4.5
• Cell d: (15 × 21)/30 = 10.5

Once the fe values are determined, the rest of the calculations are the same as for a goodness-of-fit test and are completed in Part B of Table 12.5: subtract fe from fo, square the difference, and so on. The critical value of chi-square comes from the same table used for goodness-of-fit problems. The degrees of freedom for the test are the number of rows minus 1, times the number of columns minus 1: df = (r − 1) × (k − 1). For this problem, df = (2 − 1) × (2 − 1) = 1. With the calculated value χ² = 3.98 and the table value χ².05(1) = 3.84, the result is statistically significant.

In the context of the r × k procedure, what does a significant outcome mean? Chi-square results are statistically significant when the fe values diverge enough from the fo values that the difference between the two is unlikely to have occurred by chance. The implication is that the factor creating the fo versus fe difference in r × k problems is the relationship between the two variables. If the retirement decision and the availability of the retirement bonus were unrelated, which is to say that they operated independently, there would be no significant result. The Ho: fo = fe null hypothesis for an r × k problem has the same meaning as the null hypothesis in a Pearson Correlation problem: it means there is no relationship between the two variables. The difference is that a Pearson Correlation cannot be calculated between two nominal variables.

The Yates Correction to 2 × 2 Problems

Earlier we said that Type I decision errors are relatively uncommon with chi-square procedures. While that is generally true, the 2 × 2 problem, where each variable has two levels as in the example here, may be an exception. With those problems there can be a tendency to incorrectly find statistical significance when one or more of the fe values falls below 5.0. In what is now called the "Yates correction," Yates suggested curbing this tendency by subtracting .5 from the absolute value of each fo − fe cell difference in any 2 × 2 problem in which at least one fe value is less than 5.0. The reduced fo − fe differences make a significant chi-square value less likely, of course, and so reduce the probability of a Type I error.
  • 40. procedure unnecessarily conservative. The decision in this book has been to not incorporate the correction. The issue is raised so that the reader will know what it is and why some analysts recommend making the correction. For more information, Howell (1992) provides a helpful discussion of the Yates correction. Interpreting the Chi-square Test of Independence The issue in both the chi-square goodness-of-fit (1 3 k) and test of independence (r 3 k) is whether what is observed is consistent with an expected outcome. In the case of the test of independence, a significant result (rejecting the null hypothesis) indicates that the two variables are not functioning independently; they are correlated. In our example, rejecting Ho indicates that the intention to retire and the availability of the cash bonus are related. In a reference back to the difference between correlation and causation in Chapter 8, it is not clear that the bonus causes managers to make a retirement decision, but for whatever reason, there are significantly more intents to retire when the bonus is part of the equation. At this point the focus turns to the nature of that relationship. Since it is clear from com- paring the calculated chi-square value to the table value that there is a correlation, the question now is of the strength of the relationship between the two variables. Phi Coefficient and Cramer’s V
  • 41. To this point the analysis was similar to ANOVA or t-tests and fell under the general umbrella of the hypothesis of difference. Having determined a significant difference between fo and fe, the focus now shifts to a hypothesis of association issue. The correlation procedure for interval/ratio variables that meet normality requirements was Pearson’s r. For a correlation of ordinal scale data, or for interval/ratio data that fail to satisfy normality requirements, Spearman’s rho was the answer. The need here is for a correlation procedure based on nominal data, and there are several from which to choose. Pearson, who developed chi-square, also developed a correlation procedure for nominal variables called coefficient of contingency, C. Because it produces quite a con- servative correlation value, it is not as widely used as some of the alternatives. The upper bound for most correlation procedures is 1.0. The coefficient of contingency can- not reach that value. Two of the other correlation procedures for nomi- nal variables are phi coefficient, f (f is the Greek equivalent of f ) , and Cramer’s V, which are both explained here. Contingency coefficient, phi coeffi- cient and Cramer’s V are all based directly on the chi-square value, which makes them easy to calcu- late once the chi-square value has been determined. Key Terms: Phi coefficient and Cramer’s V are both cor-
  • 42. relation procedures for nominal data used after a significant r 3 k chi-square result. tan81004_12_c12_295-322.indd 313 2/22/13 3:45 PM CHAPTER 12Section 12.5 The Chi-Square Test of Independence Formula 12.3 V 5 "1f2/fewer of rows or columns 2 1 2 Where x2 5 the value calculated in the r 3 k procedure n 5 the total number of subjects For the decision-to-retire and the cash bonus problem, x2 5 3.98, and n 5 30. Therefore to solve for f: f 5 "1x2/n 2 5 "13.98/30 2 5 "1.33 f 5 .36 The retirement/cash bonus relationship is f 5 .36. Although f cannot have a negative value (the ways x2 is calculated and the square root function in the f formula do away with that possibility), it is interpreted like any other correlation statistic. A 0 correlation indicates no relationship; a 1.0 correlation indicates a perfect correlation. The correlation here of f 5 .36 is “modest,” or perhaps “low,” by correlation standards.
  • 43. When either of the two variables in the analysis has two levels, V 5 f. This is because the formula for Cramer’s V is, Furthermore, if the chi-square value is statistically significant, the correlation coefficient will be significant at the same level; there are no separate significance tests necessary for C, V, or f. If the chi-square value is not statistically significant (if the decision in the initial chi-square analysis is to fail to reject), there is no point in calculating a correlation value since failing to reject Ho: fo 5 fe indicates that the variables are independent. In a statistically significant 2 3 2, 2 3 3, or 3 3 2 chi-square problem, phi coefficient will be the appropriate follow-up correlation procedure. The formula is the following: Formula 12.2 f 5 "1x2/n 2 Where f2 5 the square of the phi coefficient rows or columns 5 the number of levels of the two variables If there are just 2 levels of either variable, the divisor is 1 and ".362 5 .36, V 5 f. If the fewest number of rows or columns is three, this changes, of course, and the correct correla- tion value to calculate is Cramer’s V. tan81004_12_c12_295-322.indd 314 2/22/13 3:45 PM
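Both follow-up coefficients are simple functions of the chi-square value, so they are easy to script. The helper functions below are a sketch using only the Python standard library; the function names are illustrative, not from the text.

    from math import sqrt

    def phi_coefficient(chi_sq, n):
        """Formula 12.2: phi = sqrt(chi-square / n)."""
        return sqrt(chi_sq / n)

    def cramers_v(chi_sq, n, n_rows, n_cols):
        """Formula 12.3: V = sqrt(phi^2 / (fewer of rows or columns - 1))."""
        phi = phi_coefficient(chi_sq, n)
        return sqrt(phi ** 2 / (min(n_rows, n_cols) - 1))

    # Retirement-bonus problem: chi-square = 3.98 with n = 30 in a 2 x 2 table
    print(round(phi_coefficient(3.98, 30), 2))   # 0.36
    print(round(cramers_v(3.98, 30, 2, 2), 2))   # 0.36; with two levels, V equals phi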
  • 44. CHAPTER 12Section 12.5 The Chi-Square Test of Independence A 3 3 3 Test of Independence Problem A property management company in a large city manages the landscaping and rent col- lection at several apartment complexes. Some of the complexes are quite large, with more than 50 units; some are very small, with fewer than 10 units; and the others are classi- fied as medium-sized. Collecting and crediting rent payments each month is very time- consuming. A bookkeeper at the management company guesses that in the smaller complexes there is a more intimate relationship between manager and tenant than in the larger complexes, and rent difficulties are correspondingly lower as a result. To test that assumption, the bookkeeper examines data from 100 apartments located in each of small, medium, and large complexes and determines the number of rent payments that are on time, that are within one week late, and that are more than one week late. The data are below: Rent Submission On-time Within 1 week .1 week late Row totals Small a 65 b 30 c5 100 Medium d55 e 35 f 10 100
  • 45. Large g45 h 25 i 30 100 Column totals 165 90 45 300 The first question is whether the two variables of apartment complex size and rent late- ness are independent. If the chi-square value is statistically significant, the decision will be that they are not independent, and that will prompt a second question about the strength of their relationship. The calculations for this chi- square test of independence are in Table 12.6. Table 12.6: A chi-square test of independence for the size of the apartment complex and the lateness of the rent a b c d e f g h i fo 65 30 5 55 35 10 45 25 30 fe 55 30 15 55 30 15 55 30 15 fo 2 fe 10 0 210 0 5 25 210 25 15 ( fo 2 fe) 2 100 0 100 0 25 25 100 25 225 ( fo 2 fe) 2/fe 1.82 0 6.67 0 .83 1.67 1.82 .83 15 S 28.64 • The calculated x2 5 28.64. Since this is a 3 3 3 problem, df 5
  • 46. (3 2 1) 3 (3 2 1) 5 4. • The critical value of chi-square x2.05(4) 5 9.49. tan81004_12_c12_295-322.indd 315 2/22/13 3:45 PM CHAPTER 12Section 12.5 The Chi-Square Test of Independence The result is statistically significant. How promptly renters pay their monthly rent is related to the size of the complex in which they live. Determining the strength of the correlation calls for Cramer’s V since both variables have more than two levels. V 5"1f2/fewer of rows or columns 2 1 2 But since V requires the calculation first of phi coefficient, the answer begins there. f 5 "1x2/n 2 5 "128.64/300 2 5 .31 With a value for phi, V can be calculated. V 5"1f2/fewer of rows or columns 2 1 2 5 "1 .312/2 2
  • 47. 5 ".05 5 .22 The relationship between the size of the rental complex and how promptly rent is paid is V 5 .22. The correlation is not particularly robust, but it is statistically significant, since the value of chi-square upon which it is based is significant. Based on the analysis, those at the property management company are in a position to alter procedures in some way that responds to the relationship between rent payment and complex size. Perhaps it will make a difference if rent collections can be made online. Maybe an effort to improve the social relationship between the apartment manager and the tenants, particularly in large apartment complexes, will prompt rent payments to be made in a more timely fashion. Review Question D: What does phi coef- ficient measure? tan81004_12_c12_295-322.indd 316 2/22/13 3:45 PM CHAPTER 12Chapter Summary Chapter Summary The chi-square tests assume that Disraeli was unnecessarily skeptical. In fact, an informed expectation of an outcome should provide a fairly good indicator of what will actually
  • 48. occur, which is the understanding upon which both chi-square tests are based. The chi- square tests answer many of the same questions that earlier tests in this book answered. The difference is that the chi-square tests are based on nominal data. Pearson developed tests for data which indicate nothing more than the count of the number that occur in a particular category. Consequently, the analysis is based on differences between the fre- quency observed ( fo) and the frequency expected ( fe) (Objective 1). The goodness-of-fit, or 1 3 k procedure analyzes whether the proportions occurring in the multiple categories of a single variable are consistent with what is expected based on an initial hypothesis. The chi-square test of independence, also called the r 3 k procedure, straddles the boundary between tests of the hypothesis of difference and those related to the hypothesis of association. The initial analysis establishes whether two variables func- tion independently. Like the goodness-of-fit test, this part of the analysis is based on the magnitude of the fo 2 fe difference. A significant value of chi- square means rejecting the probability that the variables are independent (Objective 2). At that point the question is about the strength of the relationship between the two variables. That correlation can be gauged by one of several correlation procedures designed for nominal data. Those cov- ered in this chapter include the phi coefficient and Cramer’s V (Objective 3).
  • 49. It should be noted here that this book represents only a brief introduction to the analysis procedures that can be useful for managers. The list of statistical procedures covered in 12 chapters is far from exhaustive, but it is a valuable beginning. The different tests explained in Chapters 1212 are representative of those that are appropriate to many kinds of busi- ness analysis. Figure 12.2 is a flowchart-like guide to which test will answer the manager’s question. It is provided here as a summary and overview of the preceding chapters. As the decision tree is followed from the top down, note the issues are: • Is the question about differences or associations? • Are the data involved nominal (categorical), ordinal, or interval/ratio? • How many groups are involved? • Are the groups independent? tan81004_12_c12_295-322.indd 317 2/22/13 3:45 PM CHAPTER 12Chapter Summary Figure 12.2: Finding the appropriate test Answering each question above in turn guides one to the test tailored to the particular problem. If the question is about differences between groups (1), the measures involved are interval or ratio scale (2), there are 4 groups involved (3) that are independent (4), one of the ANOVA tests will answer the question.
No statistics book can provide comprehensive coverage of every statistical test, and early in the development of this book a decision was made to present tests neither for significant differences nor association for ordinal data. They are presented in Figure 12.2 to round it out, but they are not found elsewhere in the book. Should the reader wish to pursue Mann-Whitney, Kruskal-Wallis, Wilcoxon, or Friedman's ANOVA tests, Tanner (2011) is a useful source.

[Figure 12.2 is a decision tree. Its branches are organized by the type of question (differences or associations), the data scale (nominal, ordinal, or interval/ratio), the number of groups (1, 2, or 2+), and whether the groups are independent or related. The branches lead to the chi-square goodness of fit, chi-square test of independence, Mann-Whitney U, Wilcoxon T, Kruskal-Wallis H, Friedman's ANOVA, independent t, before/after t, analysis of variance, phi coefficient, Spearman's rho, Pearson Correlation, point-biserial correlation, multiple correlation, and semi-partial correlation.]

When the procedures encountered in this book are used by managers with an understanding of the procedures' purpose and an appreciation for their requirements, they offer the promise that the related decisions will be more reasonable and better informed, and can equip managers with the tools they need to be most effective. Finally, if you wish to broaden your horizons in the future, virtually all of the more advanced procedures that are beyond the scope of this book are based on the concepts represented here. Your authors wish you the best of luck. The first author can be reached with comments and questions at [email protected]

Answers to Review Questions

A. The chi-square procedures require data of only nominal scale.
B. The null hypothesis is that the frequency observed equals the frequency expected, fo = fe, meaning that there is not enough evidence to reject the possibility that what emerged in the analysis was consistent with the prediction.

C. Nominal data provide very little measurement information, and the chi-square procedures are a case in point. The analyses are based on nothing more than the frequency with which data occur in the various categories. These data yield nothing about how much of a measured quality is present, for example. Because none of the data nuances present with ordinal and interval/ratio data are gauged, differences must be substantial to prompt rejecting the null hypothesis. These are not particularly powerful procedures.

D. The phi coefficient measures the strength of the relationship between two nominal variables when at least one of them has only two categories.

Chapter Formulas

Formula 12.1
χ² = Σ(fo − fe)²/fe
is the formula for the chi-square test statistic. The same formula is used for both the goodness-of-fit test and for the r × k chi-square test of independence.

Formula 12.2
φ = √(χ²/n)
The phi coefficient is the measure of the correlation for two nominal variables when the chi-square test of independence indicates a significant result, and when one of the variables involved has two or fewer categories.

Formula 12.3
V = √(φ²/(smaller of rows or columns − 1))
When the test of independence is significant and both variables have at least three categories, Cramer's V is calculated rather than the phi coefficient. V requires phi, however, which must be calculated first.
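These formulas are straightforward to script, and a short, hypothetical sketch can be handy for checking hand calculations such as the exercises that follow. The counts below are invented; scipy.stats.chisquare applies Formula 12.1 to a 1 × k (goodness-of-fit) problem and also reports the p-value, with df = k − 1.

```python
from scipy.stats import chisquare

# Invented example: 60 customers sorted into three categories of one
# nominal variable, with a null hypothesis of equal expected frequencies.
f_obs = [28, 19, 13]
f_exp = [sum(f_obs) / len(f_obs)] * len(f_obs)   # fe = 20 in each category

# Formula 12.1: chi-square = sum of (fo - fe)^2 / fe
chi_square = sum((fo - fe) ** 2 / fe for fo, fe in zip(f_obs, f_exp))

# scipy reproduces the statistic and adds the p-value (df = k - 1 = 2 here)
result = chisquare(f_obs, f_exp)
print(f"chi-square = {chi_square:.3f}, p = {result.pvalue:.4f}")
```

For problems with unequal expected frequencies, replace f_exp with the hypothesized counts, keeping the same total as the observed frequencies.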
Management Application Exercises

Unless otherwise stated, use p = .05 in all your answers.

1. Three new movies, each with the potential to be a blockbuster, are released on the same day. Reporters from the local television station are interested to see whether one appears to have caught the public attention more than the others. The reporter goes to the local multiplex and asks those waiting to buy tickets which movie they intend to see. On the basis of results from 52 people, are there significant differences in movie preferences? The data are as follows:
Fantasy Haven: 22
Night of Terror: 18
Fists of Glory: 12

2. Data from behavioral psychology indicate that administering a tangible reward to subjects will prompt response levels twice as frequent as from subjects who receive a nontangible reward. To test this notion in a business context, two sales seminars are compared. In one seminar, sales representatives are tossed a piece of candy every time they ask a relevant question or provide an insightful comment. In the other seminar, only verbal reinforcement is provided. At the end of the seminars, data are as follows:
verbal reinforcement seminar—17 questions/comments
tangible reward seminar—27 questions/comments
a. What is the fe value for each group?
b. Are the results consistent with the expectation?

3. A Department of Labor study of education and employment found that unemployed full-time students take twice as many units as students who are full-time employees and 1.5 times more units than students who are part-time employees.
a. If the fe for the unemployed student is 16 units, what are the fe values for students who work part time and full time?
b. If the student who is unemployed takes 16 units, the student who is employed part time takes 14 units, and the full-time employee takes 12 units, is the expectation supported?

4. In a management trainee program for a multinational corporation, trainees are expected to learn a foreign language. Besides classes at a language training institute, tutors are available. Experience suggests that those learning Japanese seek the help of tutors twice as frequently as those who are learning Spanish. Among 20 students of Japanese, 16 ask for the help of tutors. Among 30 students of Spanish, 8 ask for tutors' help.
a. What are the fe values?
b. Are results consistent with prior experience?
c. In this instance, what does HA specify?

5. A marketing analyst is examining the relationship between shoppers' ethnicity and the purchase of a certain grocery item. From ethnic group A, 2 of 12 people purchased the item. From ethnic group B, 5 of 10 people purchased the item. From ethnic group C, 4 of 14 people purchased the item.
a. Are the shoppers' ethnicities and the tendency to purchase this item independent?
b. If not, what is the correlation?

6. During the summer months, when electricity usage is high, the power company appeals to customers to reduce consumption by 10% as a public service to avoid blackouts. An alternative is to offer rebates to customers who reduce usage by 10% compared to the same month the previous year. Among 50 randomly selected customers just asked to reduce electricity use, 14 reduce their use by 10% or more. Among 50 randomly selected customers offered rebates, 25 reduce their electricity use by 10%. Are the differences between the public service appeal and the rebates statistically significant?

7. A number of nonprofit groups use fireworks sales as the major fundraiser in the days before the 4th of July. Some of the nonprofits are service groups such as the Veterans of Foreign Wars. Others are intended for support of groups like the cheerleaders from the local high school. The questions are whether those two groups attract different numbers of customers, and whether the gender of the customer is a factor. Among 20 men who bought fireworks during a particular 2-hour period, 14 purchased from service organizations; the other 6 purchased from non-service groups. Of the 18 women who purchased in the same period, 8 bought from service organizations, and 10 from non-service groups. Is the gender of the purchaser related to the group from which the purchase is made?

8. A corporate CEO is interested in whether 20 management trainees' possession of a graduate degree (yes/no) is related to their promotion within the first five years (yes/no). The χ² value is 8.450.
a. What does the χ² value indicate about possession of a graduate degree and promotion?
b. What is the value of φ?

Key Terms

• The goodness-of-fit, or 1 × k, chi-square is a test for significant differences between the frequency observed and the frequency expected in the categories of one nominal-scale variable.

• The chi-square test of independence, or the r × k chi-square, is a test of whether two nominal-scale variables operate independently. The test statistic is the same as for the goodness-of-fit chi-square. A statistically significant result indicates that the two are not independent and a correlation procedure follows.

• The data in a chi-square test of independence are often arranged in a contingency table with the rows indicating the categories of one variable and the columns the categories of the other. With the count of the data entered into the resulting cells, the table provides a visual indicator of the way the variables operate together.

• When a chi-square test of independence is statistically significant, it is followed by a measure of the strength of the correlation between the variables. This is usually a phi coefficient if there are only two categories of one of the variables, or Cramer's V when there are more than two categories.
11
Confidence Intervals

Learning Objectives

After reading this chapter, you should be able to:

• Distinguish between point and interval estimates of values.
• Calculate confidence intervals for t-tests and regression solutions.
• Explain the factors that affect the width of a confidence interval.
Chapter Outline

11.1 A Confidence Interval for a One-Sample t-Test
Confidence Intervals and the Significance of the Test
The Width of the Interval

11.2 A Confidence Interval for an Independent Samples t-Test
An Independent Samples t-Test Example
Calculating the Confidence Interval of the Difference

11.3 The Confidence Interval of the Prediction
The Standard Error of the Estimate
Calculating the Confidence Interval of the Prediction
Regarding the Width of the Interval
The Excel Confidence Intervals for a Regression Solution

Chapter Summary

Introduction
Early in the book there was a distinction drawn between the statistics that describe sample data and the parameter values that describe the characteristics of populations. The relationship between statistics and parameters is important. The value in calculating many statistics is connected to how well they represent one of the possible values of a related parameter. In fact a good deal of analysis and decision making is based on the understanding that when samples are large and randomly selected, the statistics that describe the sample will also provide useful indicators of what the parameter values will likely be. In the progression from one type of analysis to the next in this book, there have been a number of instances in which a statistic was calculated and used in a procedure because the parameter value was not available. In each situation, the statistic that was calculated from the sample data was employed as an estimate of what the related parameter value would likely be for the entire population. For example,

• The sample mean, M, is one of the possible values of the population mean, μ.
• The standard deviation, s, can serve as an estimate of σ.
• The sample standard error of the mean (SEM) was used to complete the one-sample t-test in the place of the population standard error of the mean, σM, which the z-test requires.
• The standard error of the difference, SEd, in the independent samples t-test substituted for the standard deviation of the differences, σM1−M2, between the means of all possible pairs of samples in a distribution of difference scores.
• Ordinary least squares regression produces a predicted value of the criterion variable (y′) when the actual value of that variable, y, is unknown.
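The first three of these estimates can be computed directly from any sample. The short sketch below uses a small, invented data set purely for illustration; M, s, and SEM stand in for μ, σ, and σM when the parameter values are unavailable.

```python
import math
import statistics

# Invented sample data; in practice these would be randomly selected observations.
sample = [231, 258, 240, 245, 262, 229, 251, 247]

n = len(sample)
M = statistics.mean(sample)      # sample mean, the point estimate of mu
s = statistics.stdev(sample)     # sample standard deviation, the estimate of sigma
SEM = s / math.sqrt(n)           # standard error of the mean, the estimate of sigma_M

print(f"n = {n}, M = {M:.3f}, s = {s:.3f}, SEM = {SEM:.3f}")
```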
In each case, the statistic is a particular, discrete value that takes the place of the related parameter. The value of the statistic is based on the understanding that it is the best estimate available from whatever data are at hand for the value of the more-difficult-to-determine parameter. Usually a relatively large sample that is created by randomly selecting the individuals who make it up will reflect the essential characteristics of the population from which the sample was drawn. But "usually" means that sometimes the sample does not accurately represent the population, and the difficulty is that at the time it may not be clear whether the sample is representative.

Many of the statistical procedures used to this point involve calculating a value that indicates the degree of difference or association between groups and then determining whether that value is statistically significant. The calculated values represent what are called point estimates of outcomes; they are discrete numbers that are used to determine when a difference or a relationship reflected in sample data is likely to have occurred by chance. The difficulty is that even when a value is based on the most careful data collection procedures, there is always a risk of sampling error. The point estimate by itself provides no indicator of its accuracy. It is difficult to know how much confidence to have in results that are based on these values.

One way to address the limitations of point estimates is to move away from the focus on discrete values and rely instead on what are called interval estimates of the relevant values. For example, rather than asking whether the sample mean provides an accurate estimate of the mean of the population from which the sample was drawn, the question becomes, "What range of values is likely to capture the true value of the population mean from which a significantly different sample was drawn?" In other words, instead of relying on point estimates to estimate population parameters, a confidence interval provides, with a specified level of probability, a range of values within which the estimated population parameter is likely to fall. Relying on a range of values to capture the value of interest rather than trying to pinpoint it with a discrete value is why this approach to statistical analysis makes reference to "confidence intervals." The emphasis is on the use of "interval estimates" rather than on "point estimates."

It isn't uncommon to see both point and interval estimates used in the same analysis. If a statistical test produces a statistically significant result, the analysis that provided a point estimate of the value of the population parameter is often followed by a confidence interval for that population parameter. That way one can know (a) what the best estimate of the value is, and (b) how much variation should be allowed around that value in order to have a reasonable chance of capturing it. Different statistical procedures involve different kinds of confidence intervals.
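Before turning to the specific formulas, the contrast between the two kinds of estimates can be made concrete. The minimal sketch below extends the invented sample used earlier: the sample mean is the point estimate, and the t distribution is used to wrap a .95 interval estimate around it, anticipating Formula 11.1 in the next section.

```python
import math
import statistics
from scipy.stats import t

sample = [231, 258, 240, 245, 262, 229, 251, 247]   # invented data

n = len(sample)
M = statistics.mean(sample)                 # point estimate of the population mean
SEM = statistics.stdev(sample) / math.sqrt(n)
t_crit = t.ppf(0.975, df=n - 1)             # two-tailed critical value for p = .05

lower, upper = M - t_crit * SEM, M + t_crit * SEM   # interval estimate
print(f"point estimate: M = {M:.2f}")
print(f".95 interval estimate: {lower:.2f} to {upper:.2f}")
```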
Key Terms: Point estimates are discrete values that are calculated for unknown parameter values.

Key Terms: Interval estimates are calculated ranges within which an unknown parameter value is likely to occur. Because the interval is established based on a specified level of confidence, it is also called a confidence interval.

11.1 A Confidence Interval for a One-Sample t-Test

Suppose that the market development team associated with an electric power plant located in Fresno County, California, wishes to estimate average monthly electricity usage. Calculating the mean (M) use of electricity among a randomly selected sample of Fresno residents will probably provide a reasonable estimate of mean use among all members of the county (μ), so long as the sample is of an adequate size and is selected in a way that minimizes sampling error. However, what if the goal is to compare domestic use in Fresno County to domestic use in the country as a whole?

As an example, suppose that the mean electricity bill for domestic users in the country as a whole is $230.835 per month. For Fresno County, the monthly costs for 31 randomly selected residences average $245, with a standard deviation of 37.555. Since SEM = s/√n, SEM = 37.555/√31 = 6.745. Recall from Chapter 4 that t can be calculated as follows:

t = (M − μM)/SEM

Remembering that μM and μ have the same value,

t = (245 − 230.835)/6.745
t = 2.100

If the probability of a type I error is set at p = .05, then the critical value of t for df = 30 (n − 1) is 2.042 (Table 4.1). Average electricity use by the people in Fresno County is significantly different than it is for the country as a whole. For a one-sample t-test, a statistically significant result indicates that the mean of the population from which the sample was drawn (Fresno) is different from the population mean to which it was compared (United States); to say it differently, it indicates that there are two different populations involved—one is the population from which the sample was drawn (Fresno), and the other is the population to which the sample was compared (United States).

But is the sample mean (M) that prompts a significant t value really an accurate estimate of the population from which the sample was drawn? In the electricity use problem, does M provide a reasonably good estimate of electricity use in the population of all electricity users in Fresno County? How heavily can one afford to rely on the "M as a good representation of μ" assumption? Rather than calculating the sample mean (M) and relying on sample size and random selection to make the case that M = μ, the confidence interval of the population mean provides a range of values, thus the reference to an interval, within which μM, the mean of the population to which the sample does belong, will occur with a specified level of probability.

Key Terms: The confidence interval of the population mean is a range within which the value of an unknown population mean has a specified probability of occurring.

The confidence interval for a one-sample t is calculated with this formula:

Formula 11.1
C.I..95 = ±t(SEM) + M

Where
C.I..95 = a .95 confidence interval, or a range of values within which the probability is p = .95 that the true value of the population mean, μM, will be included.
t = the critical value of t for the degrees of freedom associated with the t-test. The ± symbol indicates that the critical value of t from the table is included twice, once as a positive value and a second time as a negative value.
SEM = the calculated value of the standard error of the mean.
M = the sample mean.

As is the case with any type of statistical testing, when calculating confidence intervals the analyst deals in probabilities rather than certainties. The range of values that is a .95 confidence interval for a one-sample t-test will capture the mean of the population that is represented by the sample 19 times out of 20. The other side of that coin for a .95 confidence interval is that, based on averages, 1 time in 20 the parameter value that is sought will occur outside the interval.

The level of probability for the confidence interval is determined by whatever the probability level was for the original test of statistical significance. Since most statistical testing is conducted at p = .05, that is also the most common standard for the confidence interval. The confidence interval calculated after a one-sample t that is statistically significant at p = .05 must produce a range of values that will miss the true value of the population mean no more than 5 times in 100. However, confidence intervals are stated in terms of the probability of capturing the particular value rather than the probability of missing that value. So instead of a p = .05 confidence interval, it is a p = .95 confidence interval. If the t-test had been conducted at p = .01, a .99 confidence interval would be the appropriate interval estimate to calculate, and so on.

To calculate the confidence interval for the one-sample t procedure completed above, the process is as follows:

C.I..95 = ±t(SEM) + M
C.I..95 = ±2.042(6.745) + 245
C.I..95 = ±13.773 + 245
C.I..95 = 258.773, 231.227

Review Question A: What is the probability that the true value will be outside a .95 confidence interval?
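The same arithmetic can be scripted. The sketch below is a minimal check of the Fresno County example using the summary numbers reported above (n = 31, M = 245, s = 37.555, μ = 230.835); scipy is used only to look up the critical t value, which matches the 2.042 read from Table 4.1.

```python
import math
from scipy.stats import t

n, M, s = 31, 245.0, 37.555      # sample size, sample mean, sample standard deviation
mu = 230.835                     # population mean for the country as a whole

SEM = s / math.sqrt(n)           # 6.745
t_stat = (M - mu) / SEM          # 2.100
t_crit = t.ppf(0.975, df=n - 1)  # about 2.042 for df = 30

lower, upper = M - t_crit * SEM, M + t_crit * SEM
print(f"SEM = {SEM:.3f}, t = {t_stat:.3f}, critical t = {t_crit:.3f}")
print(f"C.I.95 = {lower:.3f} to {upper:.3f}")   # about 231.227 to 258.773
```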
The result is interpreted this way: With .95 confidence, the mean monthly cost of electricity in the population to which the Fresno County people belong is somewhere between $231.23 and $258.77.

The initial question that prompted the t-test could be framed this way: Is monthly electricity usage in Fresno County consistent with electricity usage for the same month nationwide? Because the t value is statistically significant, the answer is "no." The statistically significant t indicates that the sample probably belongs to a population other than the one to which it was compared. Maybe it is summer, and with temperatures higher in California than elsewhere, California consumers have higher electricity usage than what is characteristic of the country as a whole. For whatever reason, the t-test indicates that the sample belongs to a different population than the population that is the country.

The sample mean used in the t-test is a point estimate of the mean of a population. If the t-test result is not statistically significant, the conclusion is that the sample mean is one of the many samples making up the population of sample means. However, the fact that the t-test result is statistically significant indicates that the sample mean is a point estimate of the value of some population mean different from what was indicated for the country as a whole. What the point estimate does not indicate, however, is how well the sample mean estimates the value of the mean of the population to which the sample belongs. The confidence interval responds to this absence by providing a way to calculate a range of values within which the population mean will probably occur. The "probably" is the reminder that there is never a certainty of including the value. In this case, there is a 95% probability that the mean of the population represented by the sample is somewhere between $231.23 and $258.77.

Note that the range of values in the confidence interval does not include the original population mean ($230.835) from the initial t-test. When a sample mean in a one-sample t-test is significantly higher than the population mean, the sample mean (M) is probably not one of the possible values of the population mean, a population with μM = 230.835 in this case. Because the sample mean was significantly higher than μM, the mean of the population to which the sample probably does belong will have a value higher than the 230.835 in the t-test. So the values in the range that is the confidence interval are all beyond the original value of μM. If a significant t-test involved a sample mean less than μM, the confidence interval would include a range with values all lower than μM.

Confidence Intervals and the Significance of the Test

Confidence intervals are only calculated for significant results. If the value of t is not statistically significant, the confidence interval will include a range of values that includes the original value of μM. This happens because a non-significant t-test result is interpreted to mean that the value of μM from the test is one of the values that could be the mean for the population to which the sample was compared. An example will clarify this.

Suppose sales of frozen yogurt for a franchise average $1,125 per day. For a particularly cold week, the daily sales were as follows:

$974, $1,256, $1,170, $842, $875, $1,056, $1,145
Are sales during that week significantly different from average daily sales of $1,125? Verify that:

M = 1045.429
s = 155.679
SEM = s/√n = 155.679/√7 = 58.841
t = (M − μM)/SEM = (1045.429 − 1125.0)/58.841 = −1.352

With t.05(6) = 2.447, the result is not significant. The sample mean from that non-significant result, with the critical value of t and the estimated standard error of the mean, produce this .95 confidence interval:

C.I..95 = ±t(SEM) + M = ±2.447(58.841) + 1045.429 = 1189.413, 901.445

The t-test was not significant, and the resulting confidence interval therefore includes the original population mean ($1,125) as one of the values in the interval. Logically, if the point estimate (M) is not significantly different from the population mean (μM), then maybe the mean of the population represented by the sample does have the same value as the specified population mean. At least it is a possibility that cannot be rejected.

The Width of the Interval

The narrower the confidence interval, the more precise an estimate the interval provides for the value of the unknown population mean. Sometimes confidence intervals are so wide that they seem not to be very helpful as indicators of the unknown value, so it makes sense to analyze the factors that affect the width of the interval. There are several factors, including the level of probability at which the interval is calculated.

For the electricity problem, the .95 confidence interval stretched from $231.227 to $258.773, around the $245 sample mean. If the market development team wanted a greater level of certainty about capturing the value of the population mean to which the sample belongs, a C.I..99 could be used, but note what happens as a result. First, verify that the critical t value for p = .01 and df = 30 is 2.750. Now the calculated t value of 2.100 is not statistically significant. Second, to calculate the .99 confidence interval:

C.I..99 = ±t(SEM) + M
C.I..99 = ±2.750(6.745) + 245
C.I..99 = ±18.549 + 245
C.I..99 = 263.549, 226.451

The price for a greater certainty of capturing the true value of the population mean (p = .99 rather than p = .95) represented by the sample is a wider confidence interval and less precision. Now the market development department is 99% certain that the confidence interval includes the population mean, but the interval estimate is much broader and probably less helpful. Note that the interval now includes the population mean of $230.835. Precision and certainty push the confidence interval in opposite directions. One is improved only at the expense of reducing the other.

Just as the confidence interval widened when the .99 confidence interval was used, it could have been narrowed by calculating a .90 C.I. instead (although the particular table in this book does not provide the critical values for significance testing at p = .1). Note that with a C.I..90 the probability of missing the true value of the population mean to which the sample belongs is 1 time in 10. As the level of probability is relaxed, the critical value of t diminishes accordingly, which shrinks the C.I.

The other element in the width of the confidence interval is the standard error of the mean, SEM. Since SEM = s/√n, the value of the standard error of the mean can be reduced by either increasing the sample size (larger n, larger divisor, smaller resulting value), finding a way to reduce the variability in the scores so that the standard deviation is smaller, or both. As it turns out, increasing the sample size usually will serve both purposes. Besides the larger n value, it also usually decreases the s value. Small samples tend to be platykurtic; they ordinarily have relatively large standard deviations given their ranges. As sample sizes grow, particularly randomly selected samples, overall variability tends to decrease. This is consistent with the central tendency characteristics of normal distributions, as more of the scores selected will tend to occur in the middle of the distribution near the mean. Since the standard deviation measures how much individual values tend to vary from the mean of the group, increasing sample size usually shrinks the standard deviation value.
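The trade-offs described in this subsection are easy to see numerically. The sketch below, using the Fresno figures again, recomputes the interval at several confidence levels and at a hypothetical larger sample size; holding s constant at the larger n is purely for illustration, since in practice larger samples usually shrink s as well.

```python
import math
from scipy.stats import t

M, s = 245.0, 37.555   # Fresno sample mean and standard deviation from the text

def ci(level, n):
    """Return the confidence interval of the population mean at the given level."""
    SEM = s / math.sqrt(n)
    t_crit = t.ppf(1 - (1 - level) / 2, df=n - 1)
    return M - t_crit * SEM, M + t_crit * SEM

for level in (0.90, 0.95, 0.99):
    lo, hi = ci(level, n=31)
    print(f"n = 31, {level:.0%} C.I.: {lo:.2f} to {hi:.2f} (width {hi - lo:.2f})")

# A hypothetical larger sample narrows the interval at the same confidence level.
lo, hi = ci(0.95, n=124)
print(f"n = 124, 95% C.I.: {lo:.2f} to {hi:.2f} (width {hi - lo:.2f})")
```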
11.2 A Confidence Interval for an Independent Samples t-Test

A statistically significant independent samples t-test indicates that the two samples represented by M1 and M2 probably did not come from the same population; they came from populations with different means. Stated symbolically, μ1 − μ2 ≠ 0. Consistent with the language used earlier, the two sample means in the case of the independent t-test are point estimates of their respective population means. The more difference there is between M1 and M2 in a statistically significant t-test, the greater the difference there probably is between the means of the two related populations.

The confidence interval for the independent samples t-test is called the confidence interval of the difference. It provides a way to estimate the difference between the two population means represented by the sample means in a statistically significant independent samples t-test. The confidence interval of the difference employs this formula:

Key Terms: The confidence interval of the difference is a range probably containing the difference between two unknown population means.

Formula 11.2
C.I..95 = ±t(SEd) + (M1 − M2)

Where
C.I..95 = a range of values within which the difference between the means of the populations represented by the samples will be captured with p = .95
t = the critical value of t for n1 + n2 − 2 degrees of freedom (the same number of degrees of freedom as there were for the independent samples t)
SEd = the calculated value of the standard error of the difference
M1, M2 = the two sample means from the independent t-test

As with the one-sample t-test, a confidence interval of the difference is only calculated if the independent samples t-test is statistically significant. Considering the purpose for the confidence interval makes the reason for this evident. What is at issue in the independent samples t is whether the two samples likely represent populations with the same means, or for practical purposes, whether the two samples belong to the same population. That probability is rejected when the result is significant. If the analysis is not significant and the decision is to "fail to reject," then the samples may come from populations with the same means. It makes little sense, therefore, to calculate a confidence interval for the difference between the population means if there may be only one population involved.

An Independent Samples t-Test Example

A pharmaceutical manufacturer has received Food and Drug Administration (FDA) approval to market a new drug for people with high cholesterol. In two sales regions, prescription sales have been similar. But then at some point, one of the two representatives receives special training on the drug's chemistry so that the representative can explain more knowledgeably how the drug works and what the side effects are likely to be. The question is whether the extra training is worth the trouble—do the doctors visited by the sales rep who received the extra training write a significantly different number of prescriptions than the doctors the other rep visits? The numbers of prescription orders for the drug by doctors in the two sales regions over a seven-week period following the training are as follows:

Extra training: 13, 10, 14, 17, 16, 12, 15
No extra training: 12, 10, 9, 12, 8, 8, 9

Review Question B: What is the relationship between the level of confidence and the width of the confidence interval?

Verify the following descriptive statistics:

                 Mean     Standard Deviation   Standard Error of the Mean
Extra Training   13.857   2.410                .911
No Training       9.714   1.704                .644

Recall that SEd = √(SEM1² + SEM2²) = √(.911² + .644²) = 1.116

t = (M1 − M2)/SEd = (13.857 − 9.714)/1.116 = 3.712; t.05(12) = 2.179. Reject H0.

The result is statistically significant. As the difference between the sample means suggests, the doctors visited by the sales representative with the extra training are writing significantly more prescriptions than the doctors in the other sales district. In terms of the number of prescriptions the doctors write, the two sales reps probably now represent two distinct populations. The sample means provide point estimates of the means of the two populations involved. The confidence interval of the difference provides a different way to contrast the two populations by indicating, with the specified probability, what range of values is likely to capture the magnitude of the difference between the two population means.

Calculating the Confidence Interval of the Difference

Formula 11.2 is:

C.I..95 = ±t(SEd) + (M1 − M2)

Note that although the final term in the formula indicates M1 − M2, it is the absolute value of the difference between the means that the formula requires. If M2 happens to have a greater value than M1, the result will be a difference that is negative, but whether M1 − M2 is positive or negative isn't relevant to the confidence interval. It is the absolute value of M1 − M2 that is entered in the formula.

All the required values are available from the t-test solution above. Substituting them gives the following for the confidence interval:

C.I..95 = ±2.179(1.116) + (13.857 − 9.714)
C.I..95 = 6.575, 1.711

a. 2.179 × 1.116 + (13.857 − 9.714) is the upper bound of the confidence interval, and
b. −2.179 × 1.116 + (13.857 − 9.714) is the lower bound of the confidence interval.

The significant independent samples t-test result that indicates that the two samples probably do not represent populations with the same mean is followed with a confidence interval estimating how much difference there is between the means of the two populations. The calculations indicate that with p = .95 confidence, the difference will be somewhere from 1.711 to 6.575 prescriptions per week.

Note that the confidence interval does not estimate the means of the two populations. Rather, it estimates the difference between their means. The sample means provide estimates of the population means, which is as close as we come to identifying their values without further data collection and analysis.
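The prescription example can be reproduced in a few lines; the data come from the example above, and scipy is used only for the critical t value at n1 + n2 − 2 degrees of freedom.

```python
import math
import statistics
from scipy.stats import t

extra    = [13, 10, 14, 17, 16, 12, 15]   # prescriptions, rep with extra training
no_extra = [12, 10,  9, 12,  8,  8,  9]   # prescriptions, rep without extra training

def sem(data):
    return statistics.stdev(data) / math.sqrt(len(data))

M1, M2 = statistics.mean(extra), statistics.mean(no_extra)
SEd = math.sqrt(sem(extra) ** 2 + sem(no_extra) ** 2)   # standard error of the difference
t_stat = (M1 - M2) / SEd                                # about 3.712

df = len(extra) + len(no_extra) - 2
t_crit = t.ppf(0.975, df=df)                            # about 2.179 for df = 12

diff = abs(M1 - M2)
lower, upper = diff - t_crit * SEd, diff + t_crit * SEd
print(f"t = {t_stat:.3f}, critical t = {t_crit:.3f}")
print(f"C.I.95 of the difference: {lower:.3f} to {upper:.3f}")   # about 1.711 to 6.575
```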
11.3 The Confidence Interval of the Prediction

The first two confidence intervals in this chapter were calculated and interpreted in reference to population means. When t is significant in the one-sample test, the confidence interval is a range of values within which the mean of the population represented by the sample is likely to occur. When t is significant in an independent groups test, the confidence interval indicates how much difference there is likely to be between the means of the two populations inferred by the samples.

The confidence interval of the prediction has a different orientation. It is a confidence interval for a regression solution, and because the task in regression is to predict the value of y from x (or from multiple x predictors), what the confidence interval produces is a range of values within which the true value of y will probably occur, given a specific value for x.

Key Terms: The confidence interval of the prediction is a range probably containing the true value of a criterion variable.

Remember that the basis for regression is that whenever variables are significantly correlated, a number of estimates of the value of y from x will be more accurate than a series of random estimates. While that represents a very important theoretical principle, managers are not usually interested in a series of estimates. For example, given a correlation between the price of some product and the resulting sales volume, a production manager who is trying to forecast what production will need to be in order to satisfy demand is often not interested in a series of estimates of sales volume (y) for several different prices (xs), but rather in one estimate of y from x. The question is often "If the price is ___, what are sales most likely to be?" The value of the confidence interval is that it provides a way to estimate how precise that one prediction of sales volume is likely to be.

It is important to be cautious about the distinction between specificity and precision. The difficulty is not in predicting the value of y given x, which can be done whenever x and y are significantly correlated. The challenge is in making precise predictions of the value of y from x. Although the least squares regression equation provides a particular value that is the best prediction of the criterion variable from a specific value of x, what constitutes the best prediction is relative. The best prediction in some circumstances can be very imprecise. A number of factors conspire against accurate predictions, most notably, weak correlations. The problem that the regression solution does not solve is that the predicted value of y (y′) yields little information about how precise the prediction is likely to be. It is a shortfall that the confidence interval addresses.

The Standard Error of the Estimate

Recall that in any normal distribution, plus or minus 1 standard deviation from the mean accounts for about 2⁄3 of the population, 68.26% to be more exact. Recall also from Chapter 9 that the standard error of the estimate (SEest), which estimates prediction errors in regression problems, is by definition a standard deviation value. In fact it is the standard deviation of all possible error scores from an infinite number of predictions of y from x. While that definition has important conceptual value, in practice the standard error of the estimate is in fact an estimated value, as the name suggests. Recall that SEest = sy√(1 − r²xy). Because it is a type of standard deviation value, and since ± one standard deviation from the mean in any normal distribution includes 68% of that distribution, we can say by extension that ±1(SEest) from the predicted value of y (y′) should provide a range of values within which the true value of y will occur about 68% of the time.

Small values of the standard error of the estimate indicate relatively little error and intervals from y′ − 1(SEest) to y′ + 1(SEest) that are relatively narrow. When that occurs, there can be increased confidence in the predicted value of y. Small intervals indicate little error in the prediction. On the other hand, if that range is quite wide, it suggests there could be a good deal of variance between the predicted value of y and its actual value.

Capturing the true value of the criterion variable about 2⁄3 of the time, which is what y′ ± SEest provides for, produces essentially a p = .68 confidence interval:

Formula 11.3
C.I..68 = ±1(SEest) + y′

The problem is that the range of values that is y′ ± 1(SEest) would miss the true value of y about 1⁄3 of the time. The probability of not capturing the true predicted value in such a confidence interval is too great to be helpful in most analytical situations. To improve the probability of capturing y, the more precise C.I..95 and C.I..99 confidence intervals of the prediction are calculated. The formula for a C.I..95 is:

Formula 11.4
C.I..95 = ±tn−2(SEest) + y′

Where
C.I..95 = a .95 confidence interval for the regression solution
t = the critical value of t for n − 2 degrees of freedom, where n = the number of pairs of scores
SEest = the standard error of the estimate
y′ = the predicted value for the criterion variable
Calculating the Confidence Interval of the Prediction

A produce importer recognizes a relationship between how long it takes produce to get from shipping docks to the retail market and how much of the order is lost to spoilage because of over-ripening. The data for number of days (x) and the percentage of the order lost to over-ripeness (y) are as follows:

Number of days: 12, 15, 11, 9, 7, 9, 12, 10, 7, 8, 9, 8
Percentage lost: 8, 10, 7, 7, 6, 8, 9, 9, 5, 6, 7, 6

With a dock workers' strike looming and produce coming into port, the importer wishes to predict what the losses will be if it takes 18 days to get the produce to the retailer. Verify the following descriptive statistics:

                     Mean    Standard Deviation
Number of Days (x)   9.750   2.379
% Lost (y)           7.333   1.497

rxy = .868 and is statistically significant at p = .05

y′ = a + bx
b = rxy(sy/sx) = .868(1.497 ÷ 2.379) = .546
a = My − bMx = 7.333 − (.546)(9.750) = 2.010
y′ = 2.010 + .546(18) = 11.838
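The regression arithmetic above, and the confidence interval of the prediction that follows from Formula 11.4, can be checked with a short script. The data and the 18-day prediction come from the example; scipy is used only for the critical t value at n − 2 degrees of freedom.

```python
import math
import statistics
from scipy.stats import t

days = [12, 15, 11, 9, 7, 9, 12, 10, 7, 8, 9, 8]   # x: days in transit
lost = [ 8, 10,  7, 7, 6, 8,  9,  9, 5, 6, 7, 6]   # y: percentage of the order lost

n = len(days)
mx, my = statistics.mean(days), statistics.mean(lost)
sx, sy = statistics.stdev(days), statistics.stdev(lost)

# Pearson correlation, then the least squares slope and intercept
r = sum((x - mx) * (y - my) for x, y in zip(days, lost)) / ((n - 1) * sx * sy)
b = r * (sy / sx)                    # about .546
a = my - b * mx                      # about 2.010

y_pred = a + b * 18                  # predicted loss for an 18-day transit, about 11.838
SE_est = sy * math.sqrt(1 - r ** 2)  # standard error of the estimate

# Formula 11.4: C.I.95 = +/- t(n-2)(SEest) + y'
t_crit = t.ppf(0.975, df=n - 2)
lower, upper = y_pred - t_crit * SE_est, y_pred + t_crit * SE_est

print(f"r = {r:.3f}, y' = {y_pred:.3f}")
print(f"C.I.95 of the prediction: {lower:.3f} to {upper:.3f}")
```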
Based on these data, if it takes 18 days to get the produce to retail outlets, there will be an 11.838% loss due to over-ripening.

Like the values for M in the one-sample t, and M1 − M2 in the independent t, the 11.838 value is a point estimate—a discrete number—indicating a solution value. As with t-tests, the confidence interval of the prediction produces an interval estimate, or a range of numbers within which the true percentage of spoilage will occur